reviewer1486134 - PeerSpot reviewer
Infrastructure Engineer at DATACAMP, INC
Vendor
Easy to set up, supported with good documentation, and the single pane of glass improves efficiency
Pros and Cons
  • "The fact that everything is under a single pane of glass is really valuable, as developers don't have to spend their time copying correlation IDs across tools to find what they need."
  • "The incident management beta looks promising, but it is still missing the ability to automatically create incidents based on certain alerts."

What is our primary use case?

We use Datadog as a monitoring platform to achieve visibility into our container environments.

Almost all of our workloads are containerized, and with Datadog we are able to get metrics, logs, alerts, and events about all the containers that we are running. Our developers also extensively use APM to find and diagnose performance issues that might appear.

We use Terraform to automatically create all of the necessary monitors and dashboards that our developers need to make sure that our level of service is sufficient.

How has it helped my organization?

We implemented Datadog around the same time as the company was growing from 30 to 150 people. Before that, we didn't have a standard stack for monitoring. Each team used their own logging solutions, metrics were incomplete or non-existent, and it was impossible to correlate metrics collected by different teams. Datadog provided us with an out-of-the-box solution that allowed us to focus on putting in place practices and processes around monitoring, rather than on implementation details.

Every squad is now confident in their ability to quickly identify and diagnose issues when they arise.

What is most valuable?

The fact that everything is under a single pane of glass is really valuable, as developers don't have to spend their time copying correlation IDs across tools to find what they need.

Thanks to the unified tagging system, it's really easy to jump around the different Datadog products without losing the context. That makes debugging really easy for developers because they can go from APM to logs to metrics in a few clicks.

Watchdog is also a great feature that helped us identify overlooked issues more than once.

What needs improvement?

The incident management beta looks promising, but it is still missing the ability to automatically create incidents based on certain alerts.

SLOs are also a great way to visualize how you are doing with regard to the level of service that you are providing, but they are missing crucial components, such as:

  • The ability to visualize the remaining error budget and how it evolved during the month. An error budget burndown graph would be helpful.
  • The ability to display a different level of alert on an SLO based on how fast it is consuming the error budget. This is the slow burn versus fast burn.
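The error budget and burn-rate ideas in the two points above can be made concrete with a little arithmetic. The following is a minimal sketch, not Datadog's implementation: the `SloWindow` type, the function names, and the 14.4 and 6.0 thresholds are illustrative assumptions (the thresholds are commonly cited starting points for fast-burn and slow-burn alerts, and should be tuned per service).

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """Hypothetical description of an SLO over a fixed window."""
    target: float        # e.g. 0.999 for a 99.9% success target
    total_requests: int  # requests expected over the whole window


def error_budget(slo: SloWindow) -> float:
    """Number of failed requests the SLO tolerates over the window."""
    return (1.0 - slo.target) * slo.total_requests


def remaining_budget(slo: SloWindow, failed_so_far: int) -> float:
    """Budget left after the failures seen so far (negative if overspent)."""
    return error_budget(slo) - failed_so_far


def burn_rate(slo: SloWindow, failed: int, total: int) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 means the budget is being spent at exactly the pace
    that would exhaust it at the end of the window.
    """
    if total == 0:
        return 0.0
    return (failed / total) / (1.0 - slo.target)


def alert_level(short_window_rate: float, long_window_rate: float,
                fast_threshold: float = 14.4,
                slow_threshold: float = 6.0) -> str:
    """Map burn rates to an alert level (thresholds are assumptions)."""
    if short_window_rate >= fast_threshold:
        return "page"    # fast burn: budget will be gone within hours
    if long_window_rate >= slow_threshold:
        return "ticket"  # slow burn: budget will be gone within days
    return "ok"
```

Plotting `remaining_budget` once per day over the month would give the burndown graph described above, while `alert_level` captures the fast-burn versus slow-burn distinction.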
Buyer's Guide
Datadog
June 2025
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
857,028 professionals have used our research since 2012.

For how long have I used the solution?

We've been using Datadog for a bit more than two years.

How are customer service and support?

There is extensive documentation and the support is very responsive.

Which solution did I use previously and why did I switch?

Prior to using Datadog, each team was using their own solutions. This included a mix of custom tooling, third-party tools, and AWS tools.

How was the initial setup?

The initial setup is very easy. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1480866 - PeerSpot reviewer
Director of DevOps at Digital Media Solutions Group
Real User
Provides good visibility across applications, good integration, and helpful support
Pros and Cons
  • "The most valuable features are logging, the extensive set of integrations, and easy jumpstart."
  • "In the past two years, there have been a couple of outages."

What is our primary use case?

We primarily use this product for availability and performance monitoring and for log aggregation.

How has it helped my organization?

Datadog gave us awesome visibility across all of our applications.

What is most valuable?

The most valuable features are logging, the extensive set of integrations, and easy jumpstart.

What needs improvement?

In the past two years, there have been a couple of outages.

For how long have I used the solution?

We have been using Datadog for two years.

What do I think about the stability of the solution?

The outages that we have had in the past two years were fixed in a matter of minutes.

What do I think about the scalability of the solution?

So far, we have not had any issues with scaling, and everything is working great.

How are customer service and technical support?

Support is awesome.

Which solution did I use previously and why did I switch?

We did use New Relic, but the logging feature was not as good as it is in Datadog.

How was the initial setup?

The initial setup is straightforward and everything is very well documented and easy to start using.

What about the implementation team?

We implemented it in-house.

Which other solutions did I evaluate?

We evaluated a custom ELK solution, Sumo Logic, and Logentries.

What other advice do I have?

Datadog is already covering much more than we normally need with exceptional quality. This is a great product.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1479957 - PeerSpot reviewer
Senior Director of DevOps at Housecall Pro
Real User
Good graphing and dashboards, and it improves visibility for developers
Pros and Cons
  • "Having a wealth of information has helped us investigate outages, and having historical data helps us tune our system."
  • "Datadog has a lot of documentation, but a lot of that documentation assumes you know how the service works, which can lead to confusion."

What is our primary use case?

We primarily use Datadog for the monitoring of EC2 and ECS containers running mostly Rails applications that host a SaaS product. We also monitor ElasticSearch and RDS, and we are working on adding their Application Performance Monitoring solution to monitor our applications directly.

We use Datadog to create dashboards, graphs, and alerts based on interesting metrics. Datadog is the first place we look to check the performance of our system.

We also use their logging platform and it works well. Especially useful is that the logs and metrics are tightly integrated so you can jump between them easily.

How has it helped my organization?

Developers are able to see how code is running in production, which was mostly opaque before we implemented Datadog. We are able to emit custom metrics that are specific to our business, and the built-in metrics have also proven useful. Having a wealth of information has helped us investigate outages, and having historical data helps us tune our system.

DevOps engineers are able to put sensors around our system to proactively detect problems, whereas before, our engineers heard about problems from customers. Logs are easier to find for developers.

What is most valuable?

Metric graphing and Dashboards are the most valuable features because they give us good observability into our system and work well to alert us when interesting things happen. We use this functionality daily.

We value the monitoring capability since it pushes alerts to us, rather than us having to watch graphs continually. The integrations with Slack and PagerDuty enable us to be interrupted appropriately and keep a running tab on the system without bothering us unnecessarily.

The online process monitoring has been extremely helpful, as it gives engineers the ability to see the live status of all the processes running on our systems without them having to log in.

What needs improvement?

Their logging solution is expensive for our use case. They do have the capability to rehydrate old or incomplete logs, and it works, but I would rather not have to think about that operation.

Datadog has a lot of documentation, but a lot of that documentation assumes you know how the service works, which can lead to confusion. On a positive note, they do have lots of documentation; it just needs better curation.

Their APM solution still needs some work, but they are actively developing it. I would also like to see more database-specific application monitoring.

For how long have I used the solution?

I have been using Datadog for five years across two companies.

What do I think about the stability of the solution?

Any issues are addressed and communicated very quickly. I have not had any issues with uptime.

What do I think about the scalability of the solution?

If you do not need 100% of data such as logs, APM traces, etc., this scales well. It does not scale as well if you want 100% of your logs indexed. You should understand any other usage-based bills before using any part of their service as it is very easy to run up a large bill.

The performance of the system scales very well, and host monitoring and APM are relatively cheap.

How are customer service and technical support?

Account support is excellent.

Customer support is good if you get them to go beyond pointing out the right documentation.

Which solution did I use previously and why did I switch?

Previously, I used homebuilt solutions with Nagios and Cacti but found that there was far too much work to understand them and keep them up and fed compared to the value that I got. They also did not integrate well with existing data sources without a lot of effort.

I also previously used StackDriver and found it too opinionated. I like that Datadog gives you tools to work with certain types of data and make your own graphs, monitors, etc., whereas with StackDriver, I felt like there were a limited number of ways you could accomplish goals.

How was the initial setup?

The basic setup is easy. A more advanced setup can be tricky because the documentation assumes you know how the system works already. Support is somewhat helpful, but mostly points out the documentation you should already have found.

What about the implementation team?

We implemented in-house.

What's my experience with pricing, setup cost, and licensing?

My advice is to understand what number of hosts and data you want to commit to. Beware that usage-based billing is both a blessing and a curse. It is easy to run up a large bill, so become familiar with the cost of each piece of your bill and use the metrics they supply to estimate and monitor your bill.
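One way to act on this advice is to project month-end usage from month-to-date usage and compare it with the committed level. The sketch below assumes simple linear extrapolation; the function names and the idea of a single "committed" number per product are illustrative assumptions, not Datadog's billing formula, and the usage figures would come from whatever usage metrics your account exposes.

```python
def project_month_end(usage_to_date: float, day_of_month: int,
                      days_in_month: int) -> float:
    """Linearly extrapolate month-to-date usage to the end of the month."""
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must be within the month")
    daily_average = usage_to_date / day_of_month
    return daily_average * days_in_month


def projected_overage(projected_usage: float, committed_usage: float) -> float:
    """Usage expected above the committed level (0.0 if under commit)."""
    return max(0.0, projected_usage - committed_usage)


# Hypothetical example: 300 units of indexed logs used by day 10 of a
# 30-day month, against a commit of 800 units.
projected = project_month_end(300.0, day_of_month=10, days_in_month=30)
overage = projected_overage(projected, committed_usage=800.0)
```

Running a check like this daily for each usage-based product gives early warning before the bill grows, rather than finding out on the invoice.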

I have had good luck with their support team helping us to figure out the correct commit levels. Their account support is excellent in this regard. I have heard their sales team can be aggressive, but I have not experienced it personally.

Which other solutions did I evaluate?

I originally chose Datadog because of my previous experience. We recently considered moving over to New Relic because we liked their APM solution better. However, the pricing of New Relic and our familiarity with Datadog won out. New Relic is a good product, but it didn't fit our overall needs as well as Datadog.

Which deployment model are you using for this solution?

Public Cloud

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1477686 - PeerSpot reviewer
Senior DevOps Engineer at DigitalOnUs
Real User
Affordably-priced and improves visibility of infrastructure, apps, and services
Pros and Cons
  • "Having a clear view, not only of our infrastructure but our apps and services as well, has brought a great added value to our customers."
  • "The pricing model could be simplified as it feels a bit outdated, especially when you look at the billing model of compute instances vs the containers instances."

What is our primary use case?

Our primary use of Datadog includes: 

  • Keeping a close look into our AWS resources. Monitoring our multiple RDS and ElastiCache instances plays a big role in our indicators.
  • Kubernetes. We aren't using all of the available Kubernetes integrations, but the few of them that work out of the box add great value to our metrics.
  • Monitoring and alerting. We wired our most relevant monitoring and alerts to services like PagerDuty, and for the rest of them, we keep our engineers up to date with constant Slack updates. 

How has it helped my organization?

Observability is something that a lot of companies are trying to achieve. Having a clear view, not only of our infrastructure but our apps and services as well, has brought a great added value to our customers.

For a logging solution, we used to have Papertrail. It did the trick, but having a single point that manages and indexes all the logs is a BIG improvement. Also, having the option to generate metrics from logs is a game-changer that we're trying to include in our monitoring strategy.

I would like to say the same about APM but the support for PHP seems to be somewhat lacking. It works but I think this service could provide us more information.

What is most valuable?

With respect to logs, we used to integrate various kinds of tools to achieve very basic tasks and it always felt like a very fragile solution. I think logs are by far the most useful feature and at the same time, the one that we could improve.

APM - This is hit or miss. Allow me to explain: we use various programming languages, mainly PHP and Ruby, and the traces generated don't always provide all of the information we want. For example, we get a great level of detail for the SQL queries that the app generates but not so much for the PHP side. It's hard to track exactly where all of the bottlenecks are, so some analysis tools for APM would make a good addition.

What needs improvement?

Please add PHP profiling; you already have it for other popular programming languages such as Python and Java, which is great because we have a little bit of those, but our main app is powered by PHP and we don't have profiling for this yet. I guess it's only a matter of time before this is added, so in the meantime, you can consider this review as a vote for PHP profiling support.

The pricing model could be simplified as it feels a bit outdated, especially when you look at the billing model of compute instances vs the containers instances.

For how long have I used the solution?

We have been using Datadog for one year.

What do I think about the stability of the solution?

It's pretty stable for the main integrations. There was only one time where Datadog was down and that was scary since all of our monitoring is handled by Datadog. There was a lot of uncertainty while the outage was in place.

What do I think about the scalability of the solution?

For everyday use, it's adequate, but for very specific tasks, not so much. There was a time where I had to do a big export and as expected, the API is somewhat limited. Since it was a one-time task, it was not a big deal but if this was a regular task, I wouldn't be happy about it.

How are customer service and technical support?

For small tasks, I think it's great. For specialized support, it feels like you're understaffed; having to wait days or weeks for a solution is a big NO-NO.

Which solution did I use previously and why did I switch?

I've used a few other products such as New Relic and AppDynamics. The switch is usually affected by two factors: pricing and convenience.

How was the initial setup?

Getting APM metrics out of Kubernetes is always a painful task. We got support to take a look at this and we had to go through various iterations to get it right, and then AGAIN the next year. This was a bad experience.

What about the implementation team?

It was all implemented in-house. The documentation is fairly up to date, for the most part.

What's my experience with pricing, setup cost, and licensing?

Pricing is somewhat affordable compared to other solutions, but in order to really keep the costs below those of other products, you need to plan your resource usage very carefully; otherwise, it can get expensive real quick.

Which other solutions did I evaluate?

Unfortunately, it wasn't my call to choose Datadog for this company, but I'm sure glad that the Lead Architect made this decision. It brought many improvements in a small span of time.

What other advice do I have?

Please add PHP profiling soon!

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Senior Director with 10,001+ employees
Real User
A good solution for infrastructure, but not for application-level monitoring
Pros and Cons
  • "Datadog's ability to group and visualize the servers and the data makes it relatively easy for the root cause analysis."
  • "Datadog lacks a deeper application-level insight. Their competitors had eclipsed them in offering ET functionality that was important to us. That's why we stopped using it and switched to New Relic. Datadog's price is also high."

What is our primary use case?

We used Datadog to capture the telemetry of our AWS fleet of around 1,200 servers.

What is most valuable?

Datadog's ability to group and visualize the servers and the data makes it relatively easy for the root cause analysis.

What needs improvement?

Datadog lacks a deeper application-level insight. Their competitors had eclipsed them in offering ET functionality that was important to us. That's why we stopped using it and switched to New Relic.

Datadog's price is also high.

For how long have I used the solution?

I have been using Datadog for about three years.

What do I think about the stability of the solution?

Stability really wasn't ever an issue. We didn't have any outages specific to Datadog where we couldn't get reports or insights to information. We were more concerned about the stability of our own systems and applications.

What do I think about the scalability of the solution?

There was no issue with scaling as such. It didn't scale well only from the cost perspective.

How are customer service and technical support?

Fortunately, because of the stability of the solution, we never had reasons to deal with technical support. Most of our interaction was with their product management, which was focused on the feature capability and ultimately pricing.

What's my experience with pricing, setup cost, and licensing?

It didn't scale well from the cost perspective. We had a custom package deal.

Which other solutions did I evaluate?

We switched from Datadog to New Relic because it offered ET functionality. Datadog was traditionally born out of monitoring infrastructure. Over the years, they have improved their ability to give you insights at the application layer and to be considered under APM. New Relic really started at the application layer and has worked its way down. 

Ultimately, we were able to accept New Relic because coming from an operations team, infrastructure was more important. As our application became more complex, our application developers needed better insight. Because there is a significant overlap in the Venn diagram between Datadog and New Relic, we felt that the needs of the infrastructure team and the applications team could be met with New Relic and its expansion in providing a sort of lightweight security.

What other advice do I have?

Datadog started off at the infrastructure level, and New Relic started off at the application level. Both of them were expanding not only into each other's space but also into the SIEM space.

There are a lot of options out there. For folks like me, it becomes a costly proposition because, at the end of the day, we're talking about logs and events that get pushed out. I have to push out some to Datadog and some to the security event manager. Then you start to think, why can't you just push them to one place and let a product do that? That's where these products are trying to grow. They're not quite there yet because the SIEM space is pretty mature. An enterprise like ours needs something fully focused and dedicated. Startups can live with New Relic that has a security capability or Datadog.

I would advise you to really understand the value that you're trying to go after. Make sure that you're not trying to solve all problems that you have from the observability perspective with Datadog because that will erode the value you get out of this solution.

Make sure that you are going to use Datadog for infrastructure, and it is going to be great. If you start adding other kinds of stuff to it, you'll probably start losing some of that value. Especially, if you want to go for application-level monitoring, you may be a bit disappointed.

I would rate this solution a six out of ten. I'm a very price-conscious kind of purchaser.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2000271 - PeerSpot reviewer
Software Developer at a tech vendor with 51-200 employees
Vendor
Great for logging and monitoring with useful RUM capabilities
Pros and Cons
  • "The product has offered increased visibility via logging, APM, metrics, RUM, etc."
  • "Even though it is powerful on its own, the UI-based design lacks elegance, efficiency, and complexity."

What is our primary use case?

We’re currently using logging, monitoring, metrics, APM, etc.

We've started to use e-SLOs; however, it takes a bit of time to work through those.

RUM has been very useful. I have used this in the past to debug problems in production, which has been great.

We also want to start using synthetics and tracing more. 

Our application currently runs in many different environments based on our customers' requirements. This allows us to see everything in one place and filter by environment as required, which is extremely useful.

How has it helped my organization?

The product has offered increased visibility via logging, APM, metrics, RUM, etc. We've gone from almost nothing to something that didn’t take a lot of time to set up. It has been great since we had so little time to spare.

As a startup, we have limited resources, and no one has enough time for anything. The fact that there are so many easy integrations and configurations by YAML makes everything easy to set up without needing a full-time employee. Instead, we're just configuring monitoring solutions which are very desirable.

What is most valuable?

The product is very useful for tracking down anything that’s gone wrong. I’ve been using it to make sure everything is working correctly after deployment and to make sure we don’t suffer performance degradation. We've found it great for tracking down anything that’s gone wrong in real-time.  

The logs are helpful. They are necessary for any application. Prior to this, our solution would have been to SSH into a machine and tail log files. This, however, is untenable for many reasons, and one of the first things I wanted to change.

The RUM has been great. Seeing real users interacting with our website is quite helpful.

What needs improvement?

Sometimes it’s difficult to customize certain queries to find specific things, specifically with the logging solution. I’ve used other logging platforms in the past that have extensive and mature query languages. This might not be super friendly to start out with, yet can be very powerful. 

I wish there was more of an emphasis on query languages instead of the UI-based tooling that Datadog provides. Even though it is powerful on its own, the UI-based design lacks elegance, efficiency, and complexity.

For how long have I used the solution?

I've been using the solution for six months or so.

Which solution did I use previously and why did I switch?

I was previously familiar with Splunk.

What's my experience with pricing, setup cost, and licensing?

I don’t handle pricing or licensing aspects. 

Which other solutions did I evaluate?

I did not evaluate other options. 

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2045049 - PeerSpot reviewer
Product Manager, Delivery Engineering at a media company with 1,001-5,000 employees
Real User
Intuitive to set up with great dashboards and APM
Pros and Cons
  • "The tools are powerful and intuitive to set up."
  • "Billing should be more transparent."

What is our primary use case?

The main use case is observability and reliability as part of a platform/delivery engineering solution. We use the product to assist tenants and clients within the company to get more ramped up on SRE/DevOps.

How has it helped my organization?

The solution has provided us with a lot more insight into service-level metrics, which is especially useful with APM/tracing. It gives us all-up dashboards and alerts to assist with incident management.

What is most valuable?

The most useful aspects of the product are the dashboards and APM/tracing. The tools are powerful and intuitive to set up as well.

What needs improvement?

Custom-level metrics could be improved.

Billing should be more transparent.

For how long have I used the solution?

I've used the solution for three to four years at multiple companies.

What do I think about the stability of the solution?

The solution is very stable.

What do I think about the scalability of the solution?

The scalability is fantastic so far.

How are customer service and support?

Technical support has been great so far.

How was the initial setup?

The initial setup is straightforward. The documentation we use is very clean and concise.

What about the implementation team?

We handled the initial setup in-house.

What's my experience with pricing, setup cost, and licensing?

I would advise others to be cautious around custom metrics and be picky when setting them up.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2045043 - PeerSpot reviewer
Software Engineer at a comms service provider with 5,001-10,000 employees
Real User
Great monitors and APM with helpful Terraform support
Pros and Cons
  • "APM is great and has provided low-effort out-of-the-box observability for various services."
  • "Delta traces on the Golang profiler are extremely expensive concerning memory utilization."

What is our primary use case?

We primarily use the product for tracing, metrics, and alarms in various deployment environments.

How has it helped my organization?

The product has provided our company with improved observability, which has helped make the incident response more targeted and quicker.

What is most valuable?

APM is great and has provided low-effort out-of-the-box observability for various services. 

Monitors are helpful, and definitions are simple. 

Terraform support is nice as it allows us to create homogeneous monitoring environments in various deployment environments with little additional effort. It also facilitates version control of monitor definitions, etc. 

The Golang profiler is generally good with the exception of delta profiles; it has provided helpful observability into Heap Allocations which has helped us reduce GC overhead.

What needs improvement?

Delta traces on the Golang profiler are extremely expensive concerning memory utilization. In a Kubernetes environment where we would like to set per-pod memory allocations as low as possible, the overhead of that profiler feature is prohibitive. In one case, our pods (which were provisioned to target 250 MB and max at 500 MB memory) got stuck in a crash loop due to out-of-memory, which was caused entirely by the delta profiles feature of the profiler.

Multistep Datadog synthetics lack the feature of basic arithmetic. For our use case, performing basic arithmetic on the output of previous steps to produce input for subsequent steps would be extremely useful.

For how long have I used the solution?

I've used the solution for nine months.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user