reviewer2000487 - PeerSpot reviewer
Test Engineer at a tech services company with 1,001-5,000 employees
Real User
Great for monitoring, helps with internal communication and offers improved visibility
Pros and Cons
  • "We've been able to glean from the monitors what servers are down, and can alert the team in Slack."
  • "The more tools that they can build that allow you to run AWX playbooks, or other similar fixes, would benefit clients greatly."

What is our primary use case?

We're moving toward the cloud yet still have several active data center contracts. As we move to the cloud, we want to know more about our services, and Datadog APM/logs should give us this perspective.

We currently use the infrastructure monitoring part of Datadog. Still, I've really seen the advantage of moving more data into the cloud for comparison and of having one place where we can view all related information about a possible incident or potential issue.

How has it helped my organization?

We've been able to glean from the monitors what servers are down, and can alert the team in Slack. What we would like to move toward is knowing what to do next, so seeing the power of Notebooks is key. We also have several other services that we are underutilizing (logging, error tracking, etc.) that would be better housed in Datadog, since it gives us more visibility by linking everything together into one cohesive picture.

What is most valuable?

Monitoring has been invaluable, and as we start to look to other products, bringing in logs and APM traces will create a full picture of what we need to do to resolve incidents. We're also interested in our users' perspectives and when things are slowing down for them.

Additionally, we are interested in the workflows announced today as an actionable decision tree that will allow our teams to see what is wrong and know what to do based on the connected runbook/notebook. I feel that this is the final piece that will make Datadog so much more valuable than its competition.

What needs improvement?

Vendors I have talked to mention that Datadog draws attention to an issue, but you still need to take action on your own. The more tools that they can build that allow you to run AWX playbooks, or other similar fixes, would benefit clients greatly.

Buyer's Guide
Datadog
May 2025
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: May 2025.
852,649 professionals have used our research since 2012.

For how long have I used the solution?

We've used the solution for two years.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: partner
PeerSpot user
reviewer1996524 - PeerSpot reviewer
Director Of Software Development at Major League Baseball
Real User
Good for monitoring and telemetry with helpful tracing capabilities
Pros and Cons
  • "APM and tracing are super useful."
  • "We would like to see smaller or shorter tutorials and video sessions."

What is our primary use case?

We primarily use the solution for monitoring and telemetry.

We use lots of log collections, log-based metrics, and dashboard visualization.  The logging, metrics, and APM are vital.  

How has it helped my organization?

My team focuses on the backend. Day-to-day monitoring includes watching metrics such as CPU and memory usage for when they get too high. The solution alerts us during metric collection.

What is most valuable?

APM and tracing are super useful. We use it for daily monitoring of CPU and memory. We can get alerts tailored to specific metrics.

We also find the tracing feature useful. We often run into bugs, and when a production issue happens, it is super useful to see the related services and sense where the problem is.

What needs improvement?

We would like to see smaller or shorter tutorials and video sessions. Also, the ability to provide a custom formula for monitoring is vital. Perhaps there can be more training materials on this. We often need to detect slow-running queries and slow network responses. We also focus a lot on the abuse of request limits. Having some form of rate-limit features or metrics would be useful.

Profiling could also be useful. Some services are CPU-intensive, and others are IO-intensive. It is crucial to know where the bottleneck is.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer1915611 - PeerSpot reviewer
Principal Solutions Architect at a security firm with 51-200 employees
Real User
Provides great visibility, has good replay functionality, and helps with monitoring
Pros and Cons
  • "The dashboards and the performance of the software have been great."
  • "It could probably be a little bit of a better user experience."

What is our primary use case?

One of the things we use it for is the same thing that we use FullStory for, which is to replay customer interactions with our platform. However, it also does the monitoring. It's like monitoring cloud tools. We're really mostly monitoring our own software to make sure that everything is functioning properly. We can check a bunch of things, and we can even play back customer sessions. It’s basically monitoring our application.

How has it helped my organization?

It really provides a lot of visibility in terms of how our software is working. If there are any problems, it surfaces them right away. We get alerts in Slack. It's really an essential tool for a company that provides software as a service.

What is most valuable?

I really like the replay, the ability to replay sessions, as I'm in sales engineering, so I sometimes need to know what my prospects are doing during a proof of value. I can actually see all the mouse moving and clicking on buttons and stuff like that. I can actually tell what they've been doing. There’s a lot of the other monitoring stuff as well. The development team uses it for monitoring and finds it very helpful.

It’s been kind of in the middle of many different things. The dashboards and the performance of the software have been great.

What needs improvement?

I haven't really noticed anything that they could improve upon. Maybe they could add in some features to go both ways, to maybe make some configuration changes, etc. That's a little bit outside of what Datadog does, though. It's really very full-featured, so I don't really have any complaints.

I haven't really fully looked at the documentation as I know where I need to go and look at things. It could probably be a little bit of a better user experience. There are so many functions there that sometimes navigating your way around is a little bit hard. They have a really nice menu system. However, there's so much there. It's possible that I skipped a guided tour when I started.

It’s not intuitive to everyone. There are a lot of technical features.

For how long have I used the solution?

I’ve been using the solution for the last five months. However, the company may have used it for a year and a half.

What do I think about the stability of the solution?

The solution has been stable and reliable.

What do I think about the scalability of the solution?

We haven’t had a problem with scalability. It’s been good.

We have 25 to 30 users on it currently. Our entire organization is under 60 people. Although not everyone is on it, a lot of our staff are. The sales, engineering, and customer success teams are all on it.

We may increase usage. No doubt that will come naturally with time. We’re hiring more people, and likely new hires will use it.

How are customer service and support?

I have not had occasion yet to reach out to support.

Which solution did I use previously and why did I switch?

We’ve also been using FullStory.

How was the initial setup?

I wasn’t part of the implementation. The one thing I will say is that when they added the functionality to review sessions, it made our use of another product, FullStory, almost obsolete. I'll have to see if we will continue using FullStory or if we can rely completely on Datadog.

What other advice do I have?

I am a customer and end-user.

We’re on the most recent version and keep it updated.

I’d rate it nine out of ten. The user experience could be slightly better.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer1652517 - PeerSpot reviewer
Senior Manager, Cyber Digital Transformation at a security firm with 1,001-5,000 employees
Real User
Insightful and easy to use solution that makes application performance monitoring easy
Pros and Cons
  • "The infrastructure monitoring capabilities are really valuable. You can just log on and see everything that is happening within an IT environment."
  • "Their security features could be improved. We looked at their Security Monitoring feature but it was early in its development. Datadog are just getting into the security space so I'm sure this will improve in the future."

What is our primary use case?

We have used this solution primarily for application performance monitoring. To do this, we needed to make sure we had the right data in the system so that people could be able to monitor their applications end-to-end.

What is most valuable?

The infrastructure monitoring capabilities are really valuable. You can just log on and see everything that is happening within an IT environment.

What needs improvement?

Their security features could be improved. We looked at their Security Monitoring feature but it was early in its development. Datadog are just getting into the security space so I'm sure this will improve in the future.

What do I think about the stability of the solution?

This is a stable solution. 

What do I think about the scalability of the solution?

This is a scalable solution. 

What other advice do I have?

To get started with this solution, I would recommend front-loading it with some sort of data or process and then filtering to view only the information you need.

I would rate this solution a nine out of ten. 

Disclosure: My company has a business relationship with this vendor other than being a customer: Reseller
PeerSpot user
Real User
Great dashboards, good monitoring, and easy SLAs
Pros and Cons
  • "Profiling has been made easier."
  • "Lately, chat support has a longer waiting time."

What is our primary use case?

Our primary use case would be using the dashboards and getting proper insights based on the dashboards.

The monitoring, SLO, and SLA have been better and easier since we started using the Terraform infrastructure. APM has been easier as we had to enable it through the CronJob directly.

Profiling has been made easier. We are able to get many insights into the code. Profiling provides really good insights right now. 

Logs are the most valuable and the best solution so far. Datadog can help solve any slow queries or database-related errors. 

How has it helped my organization?

Monitoring has been better and easier since we started using the Terraform infrastructure.

APM has been easier as we had to enable it through the CronJob directly.

Profiling has made it easier in terms of getting many insights into the code.

The logs are the most valuable and the best solution. Datadog can help us to solve any slow queries or database-related errors.

What is most valuable?

Profiling provides really good insights, and APM has really good tracing visibility. 

The SLA and SLO definitions and the monitoring are also really important and very valuable parts of the product and make great Datadog features. 

Datadog support is also really valuable as they provide support for the product through the chat as well. 

The Datadog premium support has helped us to provide faster outcomes for a problem. 

Also, rather than going back and forth over an email thread, it is better to get support on a call and sort out the issue, which is the kind of support we get from our Datadog CSM.

What needs improvement?

Integration should have been easier. It is very tough to go to all the services and enable Datadog integration for each AWS service. 

It would help to list the AWS services on one page and show only the services that are enabled. A similar approach should apply to any other integration.

Lately, chat support has a longer waiting time. We would love to get faster chat support. We also need additional support for sending the flare files.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2045022 - PeerSpot reviewer
Software Engineer at a financial services firm with 501-1,000 employees
Real User
Great UI and documentation but needs to offer K8s deployment monitoring in real-time
Pros and Cons
  • "The installation step is pretty straightforward."
  • "I'm not sure if Datadog can monitor K8s deployments in real-time. For instance, being able to see a deployment step by step visually. This would be helpful if there were any incidents during the deployment."

What is our primary use case?

We use Datadog to monitor our Kubernetes clusters. 

We have 3 different clusters for different parts of the SDLC. We run the Datadog agent DaemonSet as well as the Datadog cluster agent. Our services have the APM installed by default. 

To create monitors, we use Terraform. This is provided out-of-the-box for our service owner. 
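As a rough illustration of what a monitor definition carries, here is a minimal Python sketch that only builds the definition as a dictionary and makes no API call. The reviewer's team does this in Terraform; Python is used here purely for illustration. The service name `checkout`, the owner `payments-team`, the metric, and the threshold are all hypothetical, and the field names reflect a general understanding of Datadog's monitor JSON, so they should be checked against the current documentation.

```python
import json


def build_cpu_monitor(service: str, owner: str, threshold: float = 90.0) -> dict:
    """Build a metric-monitor definition as a plain dict (nothing is sent anywhere)."""
    # Query shape: <aggregation>(<window>):<metric>{<scope>} by {<group>} > <threshold>
    query = (
        f"avg(last_5m):avg:system.cpu.user{{service:{service}}} "
        f"by {{host}} > {threshold}"
    )
    return {
        "name": f"High CPU on {service}",
        "type": "metric alert",
        "query": query,
        # Hypothetical message text; the owner tag mirrors the tagging
        # practice described in the review.
        "message": f"CPU above {threshold}% on {service}. Owner: {owner}",
        "tags": [f"service:{service}", f"owner:{owner}"],
        "options": {"thresholds": {"critical": threshold}},
    }


monitor = build_cpu_monitor("checkout", "payments-team")
print(json.dumps(monitor, indent=2))
```

Keeping the owner in the tags means the same definition can be stamped out per service, which is presumably what an out-of-the-box Terraform module does for service owners.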

We run our K8s clusters on EKS; therefore, we also make use of some of the AWS monitoring capabilities that can be integrated into Datadog.

We are hugely reliant on Datadog for all aspects of our system.

How has it helped my organization?

With Datadog, we were able to gain observability in our system. 

The installation step is pretty straightforward. 

It's easy to use by non-DevOps users. For instance, our engineers do not interact with K8s often; therefore, it is hard for them to debug. However, with Datadog, they are able to view their containers and deployments with a single click. 

We also heavily use the tags to help us identify who the service owners are. This is super useful when we need to track owners for patching or pick up new features we implemented.

What is most valuable?

The APM and K8s monitoring are the most valuable aspects of the solution. The K8s monitoring allows all customers to view their infra, even if they do not use K8s daily. They can just click on a few tabs to get all of the information they need. 

It is also very easy to install on our system. APM has helped debug applications on our system as well. We were able to view why a service has suddenly shut down.

We also use Datadog for SLOs/SLAs. We check the live endpoint of services to ensure they are still up and running.

What needs improvement?

There is not much that needs to be improved. 

The UI is super user-friendly. The deployment process is easy. We enjoy using the integrations with Slack and PagerDuty. 

Customer support is awesome from our experience. There is a lot of documentation for us to be able to use if we need to. 

I'm not sure if Datadog can monitor K8s deployments in real-time. For instance, being able to see a deployment step by step visually. This would be helpful if there were any incidents during the deployment. 

In general, Datadog is a great solution.

For how long have I used the solution?

I've used Datadog since I joined my company about a year ago.

What do I think about the stability of the solution?

We haven't had issues with the stability.

What do I think about the scalability of the solution?

The scalability is really great.

How are customer service and support?

We've had no issues with the product or support. 

How was the initial setup?

The initial setup is super simple, and the documentation was helpful.

What about the implementation team?

We managed the initial setup process in-house.

What was our ROI?

We've witnessed ROI in our DevOps.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2044977 - PeerSpot reviewer
Senior Site Reliability Engineer at a tech vendor with 10,001+ employees
Real User
Good alerts and monitoring with a relatively simple setup
Pros and Cons
  • "The management of SLOs and their related burn-rate monitors has allowed us to onboard teams to on-call fast."
  • "Managing dashboards as IaC is a bit hard to work out at times."

What is our primary use case?

Datadog provides us with a solution for ingesting all of our application metrics, resource metrics, APM/tracing data, etc.

We use it for dashboards, monitoring/alerting, SLO targets, incident response, etc.

We have a lot of applications across multiple languages and frameworks, deployed in Kubernetes across multiple regions in AWS, along with underlying managed resources such as SQS, Aurora, etc.

Datadog makes understanding the state of these seamless. We are a company with millions of daily active users, and this level of detail is excellent.

How has it helped my organization?

Datadog has allowed us to rapidly spin up alerting and monitoring that helps our incident responders get alerted quickly when our SLOs are in danger and helps to quickly resolve issues. 

It is the single most important tool we have from an SRE perspective. 

It also provides us with an easy way to get information at a glance for all of our services through APM and create unified dashboards that track our underlying resources, such as databases, queues, etc., alongside application data. 

It has been invaluable to our organization.

What is most valuable?

The management of SLOs and their related burn-rate monitors has allowed us to onboard teams to on-call fast.

Management of resources using infrastructure-as-code has been a recent game-changer for us. Combining the two has allowed us to provide product teams with a total solution for getting their applications attached to user-focused alerting and monitoring within a matter of days rather than months - and has clearly impacted our ability to discover and respond to significant production incidents.
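For readers unfamiliar with burn-rate monitors, the underlying arithmetic can be sketched as follows. This is the generic SRE definition (observed error ratio divided by the error budget implied by the SLO target), not a description of how Datadog computes it internally. The 99.9% target, the 1.44% error ratio, and the 30-day window are hypothetical example values.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than 'sustainable' the error budget is being spent.

    error_ratio: fraction of bad events in the lookback window (0.0 to 1.0)
    slo_target:  SLO target as a fraction, e.g. 0.999 for 99.9%
    """
    error_budget = 1.0 - slo_target
    if error_budget <= 0:
        raise ValueError("slo_target must be below 1.0 to leave an error budget")
    return error_ratio / error_budget


def hours_to_exhaustion(rate: float, window_days: int = 30) -> float:
    """Hours until the whole budget is spent if the current burn rate persists."""
    if rate <= 0:
        return float("inf")
    return window_days * 24 / rate


# Hypothetical example: 99.9% SLO with 1.44% of requests currently failing.
rate = burn_rate(error_ratio=0.0144, slo_target=0.999)
remaining = hours_to_exhaustion(rate)
print(f"burn rate ~{rate:.1f}x, budget gone in ~{remaining:.0f} hours")
```

A burn rate of 1 means the budget lasts exactly the SLO window; alerting on a high multiple over a short lookback is what lets an on-call team hear about a fast-moving incident early.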

What needs improvement?

Managing dashboards as IaC is a bit hard to work out at times. I use custom tools to convert JSON dashboards to Terraform resources. Ideally, I'd like some sort of builder tool for this built into the app. For example, a templating system that can easily be exported to IaC would be transformative for us.
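The reviewer's custom tools are not described, so the following is only one possible sketch of such a conversion. It wraps an exported dashboard JSON document in a Terraform JSON-syntax resource. It assumes the Datadog Terraform provider exposes a `datadog_dashboard_json` resource with a `dashboard` string attribute; verify this against the provider documentation before relying on it. The resource name `service_overview` and the sample dashboard are hypothetical.

```python
import json


def dashboard_to_tf_json(resource_name: str, dashboard: dict) -> str:
    """Wrap an exported dashboard dict in a Terraform JSON (.tf.json) document."""
    dashboard_str = json.dumps(dashboard, sort_keys=True)
    # Terraform treats JSON string values as templates, so escape any
    # interpolation or directive openers that appear in the dashboard text.
    dashboard_str = dashboard_str.replace("${", "$${").replace("%{", "%%{")
    tf_doc = {
        "resource": {
            # Assumed resource type and attribute name (see note above).
            "datadog_dashboard_json": {
                resource_name: {"dashboard": dashboard_str}
            }
        }
    }
    return json.dumps(tf_doc, indent=2)


# Hypothetical minimal dashboard export.
exported = {"title": "Service overview", "layout_type": "ordered", "widgets": []}
tf_text = dashboard_to_tf_json("service_overview", exported)
print(tf_text)
```

Writing the result to a file such as `dashboards.tf.json` would let Terraform pick it up alongside hand-written configuration, without translating each widget into native resource blocks.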

There are also some aspects of the API that can be a bit verbose - especially in the area of new features like SLOs - and take some time to understand. That said, overall, they're well-documented enough to be a minor concern for us.

For how long have I used the solution?

I've been using the solution for over five years.

What do I think about the stability of the solution?

I have never seen a major outage that prevented us from using Datadog, although I can't speak for other teams or time zones.

What do I think about the scalability of the solution?

This product is massively scalable. I haven't seen any issues as we continue to onboard new technologies and teams.

How are customer service and support?

Datadog provides us with a number of direct lines to support, although I haven't personally required their assistance.

Which solution did I use previously and why did I switch?

We previously used LightStep for APM and switched to Datadog to unify all of our application data.

How was the initial setup?

Most elements are quite simple to set up. However, some types of data collection require organization-wide engineering buy-in.

What about the implementation team?

We handled the initial setup in-house.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2004192 - PeerSpot reviewer
Lead Support Engineer at a tech vendor with 11-50 employees
Real User
Good centralization of data with good integration but can be overwhelming at first
Pros and Cons
  • "The integration with AWS is key as well, as our software is currently bound to AWS."
  • "The ability to find what you are looking for when starting out could be improved."

What is our primary use case?

Our use case is mainly deploying into our applications for monitoring/logging observability. We currently have our microservices feed into an actuator that exists in each instance of our application that extends to a local and central Grafana for client and internal visibility. The application we use is Grafana.

Logging captures application and system logs that are ported to each application instance for querying.

Whenever anything occurs that is considered unhealthy from a range of health checks, we have notification rules configured internally and externally for a prompt response time.

How has it helped my organization?

We have become a more confident, knowledgeable, and capable team now that everything is being ported into a centralized format. Beforehand, knowledge was isolated to individuals. Not knowing what the information represented or where it lived led to a lack of confidence. Having everything in one place rules out that confusion and allows us to respond better to issues.

It also allows for personal growth as our team is learning the application from the ground up, and each person is enhancing their own skills.

What is most valuable?

The valuable features include the following: 

  • We are currently utilizing a decentralized distributed framework for our deployment, including our monitoring/logging observability capabilities. Centralizing them, if our company privacy guidelines allow it, would be a big help in tracking and responding to issues that come up and in understanding where they originate, using the log management tools that were demonstrated.
  • The ability to fiddle around and manipulate how logs are outputted.
  • The ability to track AWS Lambda functions, CloudFormation, and CloudWatch allows someone who is not savvy to dip their toe into understanding their own product.
  • The integration with AWS is key as well, as our software is currently bound to AWS.

What needs improvement?

The ability to find what you are looking for when starting out could be improved. It was a bit overwhelming trying to figure out what is the best solution. It led to many prototypes or time spent just perusing documentation. If we were able to select bundles or template use cases, we would hit the ground running quicker.

For how long have I used the solution?

I've used the solution for one year.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user