Data Engineer II at a comms service provider with 10,001+ employees
Real User
Ingests data from various sources, integrates well, and offers a helpful alert mechanism
Pros and Cons
  • "Datadog agents act as an integration to different services, providing easy access and management."
  • "Ingesting data from various sources to monitor the log metrics of the system can always improve so that, if something goes wrong, the right teams are alerted."

What is our primary use case?

Ingesting data from various sources to monitor the log metrics of the system and enabling an alert mechanism to notify the teammates if something goes wrong.

More specifically, having Datadog agents act as integrations with different services provides easy access and management.

How has it helped my organization?

The solution has helped our organization by allowing us to ingest data from various sources to monitor log metrics and enabling alert mechanisms to notify teams if something goes wrong.

Datadog agents act as an integration to different services, providing easy access and management.

What is most valuable?

The solution is useful for ingesting data from various sources. This helps to monitor the log metrics of the system. It has alert mechanisms that can be enabled to notify the teams if something goes wrong.
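To make the alerting idea concrete, here is a minimal sketch of a threshold check that routes an alert to the owning team. It is not Datadog's API: the `OWNERS` mapping, the `evaluate` function, the team names, and the threshold are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical mapping from a service name to the team that owns it.
OWNERS = {"checkout": "payments-team", "search": "discovery-team"}


def evaluate(service, error_counts, threshold, default_team="on-call"):
    """Return an alert dict if the average error count exceeds the threshold, else None."""
    if not error_counts:
        return None  # no data in the window, nothing to evaluate
    average = mean(error_counts)
    if average <= threshold:
        return None
    # Route to the owning team, falling back to a default when the owner is unknown.
    return {
        "service": service,
        "team": OWNERS.get(service, default_team),
        "value": average,
    }


if __name__ == "__main__":
    print(evaluate("checkout", [12, 15, 18], threshold=10))
```

Keeping the ownership mapping separate from the threshold check is one way to make sure the right team is notified when a monitored value goes out of range.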

Datadog offers good integration to different services. It provides for easy access and management.

What needs improvement?

Ingesting data from various sources to monitor the log metrics of the system can always improve so that, if something goes wrong, the right teams are alerted. 

Buyer's Guide
Datadog
April 2024
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
768,578 professionals have used our research since 2012.

For how long have I used the solution?

We've used the solution for close to 1.5 years.

What other advice do I have?

We use a SaaS deployment.

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sr. Director of Software Engineering at a tech consulting company with 1,001-5,000 employees
Real User
Helpful support, good incident management, and helps triage faster
Pros and Cons
  • "The RUM solution has improved our ability to triage faster and hand more capabilities to our customer support."
  • "The pricing is a bit confusing."

What is our primary use case?

The RUM is implemented for customer support session replays to quickly route, triage, and troubleshoot support issues which can be sent to our engineering teams directly. 

Customer Support will log in directly after receiving a customer request and work on the issue. Engineers will use the replay along with RUM, combined with APM and infrastructure traces, to look for signals that pinpoint the direct cause of the customer impact.

Incident management will be utilized to open a Jira ticket for engineering, and it integrates with ITSM systems and on-call as needed.

How has it helped my organization?

The RUM solution has improved our ability to triage faster and hand more capabilities to our customer support.

The RUM is implemented for customer support. It can quickly route, triage, and troubleshoot support issues that are sent to our engineering teams. 

Customer support can log in and start troubleshooting after receiving a customer request. The replay and RUM help pinpoint the issue. This functionality is combined with APM and Infra trace to be able to look for the cause of the issue. Incident management is leveraged to open a Jira ticket for engineering, and it can integrate with ITSM systems and on-call as needed.

What is most valuable?

RUM with session replay combined with a future use case to support synthetics will help to identify issues earlier in our process. We have not rolled this out yet but plan for it as a future use case for our customer support process. This, combined with integrated automation for incident management, will drive down our MTTR and time spent working through tickets. Overall, we are hoping to use this to look at our data and perfection rate over time in a BI-like way to reduce our customer support headcount by saving on time spent.

What needs improvement?

I would like to see retention options longer than 30 days for session replay. I'd also like to see options for forwarding replays to custom retention solutions, and a greater ability to export events and data from the tooling to BI/DW solutions for long-term reporting and trend analysis.

For how long have I used the solution?

I've used the solution for about nine months.

What do I think about the stability of the solution?

So far, stability has been great.

What do I think about the scalability of the solution?

I'd like to see more bells and whistles added over time. Widgets are coming soon to help with RUM.

How are customer service and support?

Support is very good. They are responsive and gave us the help we needed.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We have used New Relic, although not for RUM. We went with Datadog to potentially switch the entire platform to an all-in-one solution that makes sense for a company of our size.

How was the initial setup?

We started on the beta, and the documentation was lagging behind. We also needed direct instructions and links from the customer support/account representative that were not immediately available by searching online.

What about the implementation team?

We implemented the solution ourselves.

What was our ROI?

Ideally, this will inform our strategy to not increase our customer support headcount as significantly into 2023 and beyond.

What's my experience with pricing, setup cost, and licensing?

The pricing is a bit confusing. However, the RUM session replay, in general, is very inexpensive compared to whole solutions.

Which other solutions did I evaluate?

We looked into LogRocket and New Relic.

What other advice do I have?

I'd advise other users to try it out.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Senior Software Engineer at Grata
User
Great charts and visualizations with good debugging
Pros and Cons
  • "We really like the charts and visualization."
  • "Sometimes, it takes a long time to load the dashboard if we have many charts."

What is our primary use case?

We primarily use dashboards and metrics with many tags.

How has it helped my organization?

It makes the system easier to debug. 

What is most valuable?

We really like the charts and visualization.

What needs improvement?

Sometimes, it takes a long time to load the dashboard if we have many charts.

For how long have I used the solution?

I've used the solution for five years.

Which deployment model are you using for this solution?

Private Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Technology Competency and Solution Head at LearningMate
Real User
Top 20
Good infrastructure and traffic visualizations help with capacity planning
Pros and Cons
    • "The error traceability is an area that can be improved."

    What is our primary use case?

    We use Datadog for application monitoring, to help identify errors. It is also used to monitor application performance.

    It also helps organizations understand user experience through user behaviour patterns.

    How has it helped my organization?

    It has helped to reduce production issues within a defined timeframe.

    It has helped to refine UX.

    What is most valuable?

    Datadog has a very good visualization for my complete infrastructure and network traffic, which enabled me to create a capacity plan.

    This product is great because it shows you the SQL and your application request in a single view.

    What needs improvement?

    The error traceability is an area that can be improved. This is something that helps us to pinpoint the area where a problem is occurring. It is a function stack, and it should be showing us how each function is defined.

    For how long have I used the solution?

    We have been using Datadog for the past couple of years.

    What do I think about the stability of the solution?

    I have not worked on it long enough to properly comment on stability, yet, because it has to be tested across my other platforms.

    What do I think about the scalability of the solution?

    We have not done a full evaluation yet, but given that it is cloud-based, Datadog has to be scalable.

    How are customer service and support?

    I have not needed to contact technical support.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    We were using New Relic prior to implementing Datadog. In terms of application monitoring, Datadog is not up to the level that New Relic is. New Relic is the better product, but its price is too high, which is why we switched.

    How was the initial setup?

    The initial setup is not complex. Datadog offers a certification, which associates can obtain before they are deployed to administration and configuration work.

    What about the implementation team?

    In-house. We got our admin team certified.

    What was our ROI?

    Time to resolution for production issues.

    What's my experience with pricing, setup cost, and licensing?

    The price is better than some competing products.

    Which other solutions did I evaluate?

    New Relic

    What other advice do I have?

    This is a good product and I can recommend it to others, although New Relic is still my first choice. Datadog is my second choice.

    Overall, it is a good product and my main complaint is that it needs better error traceability.

    I would rate this solution a nine out of ten.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    reviewer1486134 - PeerSpot reviewer
    Infrastructure Engineer at DATACAMP, INC
    Vendor
    Easy to set up, supported with good documentation, and the single pane of glass improves efficiency
    Pros and Cons
    • "The fact that everything is under a single pane of glass is really valuable, as developers don't have to spend their time copying correlation IDs across tools to find what they need."
    • "The incident management beta looks promising, but it is still missing the ability to automatically create incidents based on certain alerts."

    What is our primary use case?

    We use Datadog as a monitoring platform to achieve visibility into our container environments.

    Almost all of our workloads are containerized and with Datadog, we are able to get metrics, logs, alerts, and events about all the containers that we are running. Our developers also extensively use APM to find and diagnose performance issues that might appear.

    We use Terraform to automatically create all of the necessary monitors and dashboards that our developers need to make sure that our level of service is sufficient.

    How has it helped my organization?

    We implemented Datadog around the same time as the company was growing from 30 to 150 people. Before that, we didn't have a standard stack for monitoring. Each team used their own logging solutions, metrics were missing or non-existent, and it was impossible to correlate metrics collected by different teams. Datadog provided us with an out-of-the-box solution that allowed us to focus on putting in place practices and processes around monitoring, rather than focus on implementation details.

    Every squad is now confident in their ability to quickly identify and diagnose issues when they arise.

    What is most valuable?

    The fact that everything is under a single pane of glass is really valuable, as developers don't have to spend their time copying correlation IDs across tools to find what they need.

    Thanks to the unified tagging system, it's really easy to jump around the different Datadog products without losing the context. That makes debugging really easy for developers because they can go from APM to logs to metrics in a few clicks.

    Watchdog is also a great feature that helped us identify overlooked issues more than once.

    What needs improvement?

    The incident management beta looks promising, but it is still missing the ability to automatically create incidents based on certain alerts.

    SLOs are also a great way to visualize how you are doing with regard to the level of service that you are providing, but they are missing crucial components, such as:

    • The ability to visualize the remaining error budget and how it evolved during the month. An error budget burndown graph would be helpful.
    • The ability to display a different level of alert on an SLO based on how fast it is consuming the error budget. This is the slow burn versus fast burn.
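The two requests above can be illustrated with a small calculation. The following is a minimal sketch, not a Datadog feature or API: the function names are hypothetical, and the fast and slow thresholds (10.0 and 2.0) are arbitrary values picked for the example. A burn rate of 1.0 means the error budget is being consumed at exactly the pace that would use it up by the end of the SLO period.

```python
def _check_target(target: float) -> None:
    """Reject SLO targets that would make the error budget zero or meaningless."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")


def error_budget_remaining(target: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    _check_target(target)
    if total == 0:
        return 1.0  # no traffic yet, so nothing has been spent
    allowed_failures = (1.0 - target) * total
    return 1.0 - failed / allowed_failures


def burn_rate(target: float, total: int, failed: int) -> float:
    """Observed error rate divided by the error rate the SLO allows."""
    _check_target(target)
    if total == 0:
        return 0.0
    return (failed / total) / (1.0 - target)


def alert_level(rate: float, fast: float = 10.0, slow: float = 2.0) -> str:
    """Map a burn rate to an alert level; thresholds are illustrative assumptions."""
    if rate >= fast:
        return "page"    # fast burn: the budget would be gone very soon
    if rate >= slow:
        return "ticket"  # slow burn: worth investigating, not urgent
    return "ok"


if __name__ == "__main__":
    # 99.9% target, one million requests this month, 250 of them failed.
    print(error_budget_remaining(0.999, 1_000_000, 250))
    # Last hour: 10,000 requests, 200 failed.
    print(alert_level(burn_rate(0.999, 10_000, 200)))
```

Plotting `error_budget_remaining` once per day over the month would give the burndown graph described in the first bullet, while `alert_level` shows how a single SLO could raise different alerts for slow and fast burn.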

    For how long have I used the solution?

    We've been using Datadog for a bit more than two years.

    How are customer service and technical support?

    There is extensive documentation and the support is very responsive.

    Which solution did I use previously and why did I switch?

    Prior to using Datadog, each team was using their own solutions. This included a mix of custom tooling, third-party tools, and AWS tools.

    How was the initial setup?

    The initial setup is very easy. 

    Which deployment model are you using for this solution?

    Public Cloud
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    reviewer1477686 - PeerSpot reviewer
    Senior DevOps Engineer at DigitalOnUs
    Real User
    Affordably-priced and improves visibility of infrastructure, apps, and services
    Pros and Cons
    • "Having a clear view, not only of our infrastructure but our apps and services as well, has brought a great added value to our customers."
    • "The pricing model could be simplified as it feels a bit outdated, especially when you look at the billing model of compute instances vs container instances."

    What is our primary use case?

    Our primary use of Datadog includes: 

    • Keeping a close eye on our AWS resources. Monitoring our multiple RDS and ElastiCache instances plays a big role in our indicators.
    • Kubernetes. We aren't using all of the available Kubernetes integrations, but the few that work out of the box add great value to our metrics.
    • Monitoring and alerting. We wired our most relevant monitoring and alerts to services like PagerDuty, and for the rest of them, we keep our engineers up to date with constant Slack updates.

    How has it helped my organization?

    Observability is something that a lot of companies are trying to achieve. Having a clear view, not only of our infrastructure but our apps and services as well, has brought a great added value to our customers.

    For a logging solution, we used to have Papertrail. It did the trick, but having a single point that manages and indexes all the logs is a BIG improvement. Also, having the option to generate metrics from logs is a game-changer that we're trying to include in our monitoring strategy.
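As a rough illustration of what generating a metric from logs means, the sketch below turns raw log lines into a count keyed by minute and HTTP status class. It is not how Datadog implements the feature: the log format, the `LOG_LINE` pattern, and the function name are assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical access-log format: "<ISO timestamp> <HTTP status> <path>"
LOG_LINE = re.compile(r"^(?P<ts>\S+)\s+(?P<status>\d{3})\s+(?P<path>\S+)$")


def count_by_minute_and_status(lines):
    """Turn raw log lines into a count metric keyed by (minute, status class)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        minute = match.group("ts")[:16]  # e.g. "2024-04-01T12:00"
        status_class = match.group("status")[0] + "xx"  # e.g. "5xx"
        counts[(minute, status_class)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2024-04-01T12:00:05Z 200 /home",
        "2024-04-01T12:00:40Z 500 /api",
        "2024-04-01T12:01:02Z 502 /api",
    ]
    for key, value in sorted(count_by_minute_and_status(sample).items()):
        print(key, value)
```

The resulting counts behave like any other metric, so a 5xx series derived this way could be graphed or alerted on without keeping every log line indexed.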

    I would like to say the same about APM but the support for PHP seems to be somewhat lacking. It works but I think this service could provide us more information.

    What is most valuable?

    With respect to logs, we used to integrate various kinds of tools to achieve very basic tasks and it always felt like a very fragile solution. I think logs are by far the most useful feature and at the same time, the one that we could improve.

    APM - This is hit or miss; allow me to explain: we use various programming languages, mainly PHP and Ruby, and the traces generated don't always provide all of the information we want. For example, we get a great level of detail for the SQL queries that the app generates but not so much for the PHP side. It's hard to track exactly where all of the bottlenecks are, so some analysis tools for APM could make a good addition.

    What needs improvement?

    Please add PHP profiling; you already have it for other popular programming languages such as Python and Java, which is great because we have a little bit of those, but our main app is powered by PHP and we don't have profiling for this yet. I guess it's only a matter of time before this is added, so in the meantime, you can consider this review as a vote for PHP profiling support.

    The pricing model could be simplified as it feels a bit outdated, especially when you look at the billing model of compute instances vs container instances.

    For how long have I used the solution?

    We have been using Datadog for one year.

    What do I think about the stability of the solution?

    It's pretty stable for the main integrations. There was only one time where Datadog was down and that was scary since all of our monitoring is handled by Datadog. There was a lot of uncertainty while the outage was in place.

    What do I think about the scalability of the solution?

    For everyday use, it's adequate, but for very specific tasks, not so much. There was a time where I had to do a big export and as expected, the API is somewhat limited. Since it was a one-time task, it was not a big deal but if this was a regular task, I wouldn't be happy about it.

    How are customer service and technical support?

    For small tasks, I think it's great. For specialized support, it feels like you're under-staffed; having to wait days or weeks for a solution is a big NO-NO.

    Which solution did I use previously and why did I switch?

    I've used a few other products such as NewRelic and AppDynamics. The switch is usually affected by two factors: pricing and convenience.

    How was the initial setup?

    Getting APM metrics out of Kubernetes is always a painful task. We got support to take a look at this and we had to go through various iterations to get it right, and then AGAIN the next year. This was a bad experience.

    What about the implementation team?

    It was all implemented in-house. The documentation is fairly up to date, for the most part.

    What's my experience with pricing, setup cost, and licensing?

    Pricing is somewhat affordable compared to other solutions, but to really keep costs below those of other products you need to plan your resource usage very carefully; otherwise, it can get expensive very quickly.

    Which other solutions did I evaluate?

    Unfortunately, it wasn't my call to bring Datadog into this company, but I'm glad that the Lead Architect made this decision. It brought many improvements in a short span of time.

    What other advice do I have?

    Please add PHP profiling soon!

    Which deployment model are you using for this solution?

    Public Cloud
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Product Manager, Delivery Engineering at a media company with 1,001-5,000 employees
    Real User
    Top 20
    Intuitive to set up with great dashboards and APM
    Pros and Cons
    • "The tools are powerful and intuitive to set up."
    • "Billing should be more transparent."

    What is our primary use case?

    The main use case is observability and reliability as part of a platform/delivery engineering solution. We use the product to assist tenants and clients within the company to get more ramped up on SRE/DevOps.

    How has it helped my organization?

    The solution has provided us with a lot more insight into service-level metrics, which is especially useful with APM/tracing. It gives us all-up dashboards and alerts to assist with incident management.

    What is most valuable?

    The most useful aspects of the product are the dashboards and APM/tracing. The tools are powerful and intuitive to set up as well.

    What needs improvement?

    Custom-level metrics could be improved.

    Billing should be more transparent.

    For how long have I used the solution?

    I've used the solution for three to four years at multiple companies.

    What do I think about the stability of the solution?

    The solution is very stable.

    What do I think about the scalability of the solution?

    The scalability is fantastic so far.

    How are customer service and support?

    Technical support has been great so far.

    How was the initial setup?

    The initial setup is straightforward. The documentation we use is very clean and concise.

    What about the implementation team?

    We handled the initial setup in-house.

    What's my experience with pricing, setup cost, and licensing?

    I would advise others to be cautious around custom metrics and be picky when setting them up.
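One way to see why pickiness matters: if custom metrics are counted per unique combination of metric name and tag values (an assumption here; check the vendor's current billing documentation), then every extra tag multiplies the number of series. The sketch below is a back-of-the-envelope estimate only, and the function names and tag names are hypothetical.

```python
from math import prod
from typing import Iterable, Mapping, Tuple


def worst_case_series(tag_cardinalities: Mapping[str, int]) -> int:
    """Upper bound on distinct series for one metric: the product of tag cardinalities."""
    return prod(tag_cardinalities.values())


def observed_series(points: Iterable[Tuple[str, Mapping[str, str]]]) -> int:
    """Count distinct (metric name, tag set) combinations actually seen."""
    return len({(name, tuple(sorted(tags.items()))) for name, tags in points})


if __name__ == "__main__":
    tags = {"env": 3, "service": 20, "endpoint": 50}
    print(worst_case_series(tags))
    # Adding a high-cardinality tag such as a user ID multiplies the total.
    print(worst_case_series({**tags, "user_id": 10_000}))
```

Running an estimate like this before adding a tag makes it easier to spot the high-cardinality tags that drive most of the cost.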

    Which deployment model are you using for this solution?

    Hybrid Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Performance Testing Manager at a tech services company with 10,001+ employees
    Real User
    Has good troubleshooting and instrumentation features
    Pros and Cons
    • "The solution is sufficiently stable."
    • "The setup was a bit complex."

    What is our primary use case?

    I'm not sure which version we're using, although I believe it to be the latest. 

    We essentially use the solution in advance of performance testing, performance monitoring, and troubleshooting.

    How has it helped my organization?

    The solution is useful in many ways when it comes to troubleshooting applications. Should we encounter issues while troubleshooting under load, we can use Datadog usage metrics to see which method or area is having difficulty, so that we can resolve the issue.

    What is most valuable?

    The solution has valuable troubleshooting and instrumentation features. 

    What needs improvement?

    The setup was a bit complex. 

    As Datadog is a bit on the expensive side, I would recommend it for simple, uncomplicated solutions.

    For how long have I used the solution?

    I have been using Datadog for four or five years.

    What do I think about the stability of the solution?

    The solution is sufficiently stable. 

    What do I think about the scalability of the solution?

    As our company does not have many users who are making use of the solution at present, I have not encountered issues with its scalability. 

    I do not have plans to increase the usage at the moment. 

    How are customer service and support?

    I cannot comment on tech support, as I have not made use of them, although my team members have. It seems okay. 

    Which solution did I use previously and why did I switch?

    We mainly work with Datadog, although we also work with Dynatrace.

    How was the initial setup?

    The initial setup was a bit complex. 

    What about the implementation team?

    We hired a separate team to handle the implementation. 

    At present, not many staff are needed for the deployment and maintenance.

    What was our ROI?

    I am not in much of a position to address the question of return on investment, although I believe it is for the better that my organization employs the solution.

    What's my experience with pricing, setup cost, and licensing?

    I do not have knowledge of the licensing costs. 

    As Datadog is a bit on the expensive side, I would recommend it for simple, uncomplicated solutions.

    What other advice do I have?

    We are just the users of the solution. There are not many of us in our company. 

    The solution is good for complex integrations that have many downstream systems and involve multiple combined systems. This is where it is useful.

    I rate Datadog as a nine out of ten. 

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: I am a real user, and this review is based on my own experience and opinions.