Principal Enterprise Systems Engineer at a healthcare company with 10,001+ employees
Real User
An out-of-the-box solution that allows you to quickly build dashboards
Pros and Cons
  • "I like that you can build out a dashboard pretty quickly. There are some things that come out of the box that you don't really need to do, which is great because they're default settings."
  • "I think access to their engineers when we have a problem could be better."

What is our primary use case?

We deploy agents on-premises to collect data from on-premises VM instances. We don't use Datadog in our cloud network, although we do have it on some cloud apps and on containers. The main Datadog software itself runs on their own cloud.

We're building out the process now and learning to use it better. We plan to use Datadog for root cause analysis of any issues we have with software: applications going down, latency issues, connection issues, etc. Eventually, we're going to use Datadog for application performance monitoring and management, so that we can be proactive around thresholds, alerts, bottlenecks, etc.

Our developers and QA teams use this solution. They use it to analyze network traffic, load, CPU load, and CPU usage, and then tracing, NPM, and API calls for their applications. There are roughly 100 users right now, maybe 200 in total, but on a given day maybe 13 people are using this solution.

How has it helped my organization?

It hasn't improved the way our organization functions yet, because there's a lot of red tape to cut through with cultural challenges and changes. I don't think it's changed the way we do things yet, but I think it will — absolutely it will. It's just going to take some time.

What is most valuable?

I like that you can build out a dashboard pretty quickly. There are some things that come out of the box that you don't really need to do, which is great because they're default settings. Once you install the agent on the machine, it picks up a lot of metrics for you that cover 70 percent or more of what you need. Out of the box, it's pretty good.

For how long have I used the solution?

I have been using Datadog every day since September 2020. I also used it at a previous company that I worked for.

Buyer's Guide
Datadog
May 2025
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: May 2025.
851,451 professionals have used our research since 2012.

What do I think about the stability of the solution?

Stability-wise, it's great.

What do I think about the scalability of the solution?

It seems like it'll scale well. We're automating it with Ansible scripts and ServiceNow so that when we build a new virtual machine, Datadog is automatically installed on that box.

How are customer service and support?

The tool itself is pretty good and the customer service is good, but I think they're a growing company. I think access to their engineers when we have a problem could be better. For example, if I ask, "Hey, how do I install it on this type of component?", we'll try to get an engineer on the phone to step us through everything, but that's a challenge because they're so busy.

Technically, everything's fine. We don't need any support; everything that I need to do, I can do right out of the box. But as far as access to their engineers' knowledge of how to configure it on the systems we have, I'd rate that maybe a six, because they're just not as available as I would've hoped.

Which solution did I use previously and why did I switch?

We were using AppDynamics. Technically, we still have it in-house because it's tightly wound into certain systems, but we'll probably pull that off slowly over time. The reason we added Datadog and eventually we'll fully switch over is due to cost. It's more cost-friendly to do it with Datadog.

Which other solutions did I evaluate?

Yes, we looked at Dynatrace, AppDynamics, and New Relic. Personally, I wouldn't have chosen Datadog for the POC if it were up to me. Datadog was a leader, but New Relic was looking really good. In the end, the people above me decided to go with Datadog — it's a big company, so they wanted to move fast, which makes sense.

What other advice do I have?

If you're interested in using Datadog, just do your homework, as we did. We're happy so far I think; time will tell as we are still rolling things out. It's a very good company. It's going to be a year before we really can tell anything. If you do your homework, you'll find that if you're really concerned with cost, it's good.

There are some strengths that AppDynamics and Dynatrace have that I don't think Datadog will have down the road, but they're not things we necessarily need; they're outliers. It would be nice to have them, but we can manage without them.

Know what you want. There is no need to pay for solutions like Dynatrace or AppDynamics that are more expensive or things that are just nice to have if you don't absolutely need to have them. That's something people need to understand. You just have to make sure you understand what it is that you need out of the tool — they are all a little different, those three. I would say to anybody that's going with Datadog: you just have to be patient at the beginning. It's a very busy company right now. They're very hot in the market.

Overall, on a scale from one to ten, I would give Datadog a rating of eight. It does what we need it to do, and it seems to be pretty user-friendly in terms of setting things up.

Features-wise, I'd give them a rating of ten out of ten. Better access to assistance from their engineers on configuring dashboards and pulling the metrics we need would bring the overall rating up a little. A ten overall would be harder, since everything would have to be perfect, but I would say they could maybe bring it to a nine.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Ravel Leite - PeerSpot reviewer
Head of DevOps at Traveltek Ltd.
User
Top 20
Proactive, provides user trends, and works harmoniously
Pros and Cons
  • "Each component complements the other, creating a cohesive system where data, logs, and metrics are seamlessly integrated."
  • "Datadog is too pricey when compared to its competitors, and this is something that it's always on my mind during the decision-making process."

What is our primary use case?

From day one, we have seamlessly integrated our new product into Datadog, a comprehensive monitoring and analytics platform. By doing so, we are continuously collecting essential data such as host information, system logs, and key performance metrics. This enables us to gain deep insights into product adoption, monitor usage patterns, and ensure optimal performance. Additionally, we use Datadog to capture and analyze errors in real-time, allowing us to troubleshoot, replay, and resolve production issues efficiently.

How has it helped my organization?

It has proven invaluable in helping us identify early issues within the product as soon as they occur, allowing us to take immediate action before they escalate into more significant problems. This proactive approach ensures that potential challenges are addressed in real-time, minimizing any impact on users. Furthermore, the system allows us to measure product adoption and usage trends effectively, providing insights into how customers are interacting with the product and identifying areas for improvement or enhancement.

What is most valuable?

There isn't any single aspect that stands out in particular; rather, everything is interconnected and works together harmoniously. Each component complements the other, creating a cohesive system where data, logs, and metrics are seamlessly integrated. This interconnectedness ensures that no part operates in isolation, allowing for a more holistic view of the product's performance and health. The way everything binds together strengthens our ability to monitor, analyze, and improve the product efficiently.

What needs improvement?

At the moment, nothing specific comes to mind. Everything seems to be functioning well, and there are no immediate concerns or issues that I can think of. 

The system is operating as expected, and any challenges we've faced so far have been successfully addressed. If anything does come up in the future, we will continue to monitor and assess it accordingly, but right now, there’s nothing that stands out requiring attention or improvement. 

Datadog is too pricey when compared to its competitors, and this is something that it's always on my mind during the decision-making process.

For how long have I used the solution?

I've used the solution for nearly two years now.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
ZJ - PeerSpot reviewer
Software Engineer at a computer software company with 201-500 employees
User
Top 20
Very good custom metrics, dashboards, and alerts
Pros and Cons
  • "The dashboards provide a comprehensive and visually intuitive way to monitor all our key data points in real-time, making it easier to spot trends and potential issues."
  • "One key improvement we would like to see in a future Datadog release is the inclusion of certain metrics that are currently unavailable. Specifically, the ability to monitor CPU and memory utilization of AWS-managed Airflow workers, schedulers, and web servers would be highly beneficial for our organization."

What is our primary use case?

Our primary use case for Datadog involves utilizing its dashboards, monitors, and alerts to monitor several key components of our infrastructure. 

We track the performance of AWS-managed Airflow pipelines, focusing on metrics like data freshness, data volume, pipeline success rates, and overall performance. 

In addition, we monitor Looker dashboard performance to ensure data is processed efficiently. Database performance is also closely tracked, allowing us to address any potential issues proactively. This setup provides comprehensive observability and ensures that our systems operate smoothly.

How has it helped my organization?

Datadog has significantly improved our organization by providing a centralized platform to monitor all our key metrics across various systems. This unified observability has streamlined our ability to oversee infrastructure, applications, and databases from a single location. 

Furthermore, the ability to set custom alerts has been invaluable, allowing us to receive real-time notifications when any system degradation occurs. This proactive monitoring has enhanced our ability to respond swiftly to issues, reducing downtime and improving overall system reliability. As a result, Datadog has contributed to increased operational efficiency and minimized potential risks to our services.

What is most valuable?

The most valuable features we’ve found in Datadog are its custom metrics, dashboards, and alerts. The ability to create custom metrics allows us to track specific performance indicators that are critical to our operations, giving us greater control and insights into system behavior. 

The dashboards provide a comprehensive and visually intuitive way to monitor all our key data points in real-time, making it easier to spot trends and potential issues. Additionally, the alerting system ensures we are promptly notified of any system anomalies or degradations, enabling us to take immediate action to prevent downtime. 

Beyond the product features, Datadog’s customer support has been incredibly timely and helpful, resolving any issues quickly and ensuring minimal disruption to our workflow. This combination of features and support has made Datadog an essential tool in our environment.
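
To illustrate what submitting a custom metric can look like, here is a minimal sketch that builds a DogStatsD-style datagram and sends it over UDP. It assumes a Datadog Agent is listening on the default DogStatsD port (8125) on localhost; the metric name `pipeline.rows_loaded` and the tags are hypothetical and are not taken from the reviewer's setup.

```python
import socket


def format_datagram(name, value, metric_type="g", tags=None):
    """Build a DogStatsD-style datagram: name:value|type|#tag1,tag2."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; assumes an Agent listens on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# Hypothetical gauge for the number of rows loaded by a pipeline run.
line = format_datagram(
    "pipeline.rows_loaded", 1250, "g", ["env:staging", "dag:daily_load"]
)
# line == "pipeline.rows_loaded:1250|g|#env:staging,dag:daily_load"
# send_metric(line)  # uncomment where an Agent is actually running
```

In practice an official client library would normally handle this; the sketch only shows the shape of the data that a dashboard or alert is later built on.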

What needs improvement?

One key improvement we would like to see in a future Datadog release is the inclusion of certain metrics that are currently unavailable. Specifically, the ability to monitor CPU and memory utilization of AWS-managed Airflow workers, schedulers, and web servers would be highly beneficial for our organization. These metrics are critical for understanding the performance and resource usage of our Airflow infrastructure, and having them directly in Datadog would provide a more comprehensive view of our system’s health. This would enable us to diagnose issues faster, optimize resource allocation, and improve overall system performance. Including these metrics in Datadog would greatly enhance its utility for teams working with AWS-managed Airflow.

For how long have I used the solution?

I've used the solution for four months.

What do I think about the stability of the solution?

The stability of Datadog has been excellent. We have not encountered any significant issues so far. 

The platform performs reliably, and we have experienced minimal disruptions or downtime. This stability has been crucial for maintaining consistent monitoring and ensuring that our observability needs are met without interruption.

What do I think about the scalability of the solution?

Datadog is generally scalable, allowing us to handle and display thousands of custom metrics efficiently. However, we’ve encountered some limitations in the table visualization view, particularly when working with around 10,000 data points. In those cases, the search functionality doesn’t always return all valid results, which can hinder detailed analysis.

How are customer service and support?

Datadog's customer support plays a crucial role in easing the initial setup process. Their team is proactive in assisting with metric configuration, providing valuable examples, and helping us navigate the setup challenges effectively. This support significantly mitigates the complexity of the initial setup.

Which solution did I use previously and why did I switch?

We used New Relic before.

How was the initial setup?

The initial setup of Datadog can be somewhat complex, primarily due to the learning curve associated with configuring each metric field correctly for optimal data visualization. It often requires careful attention to detail and a good understanding of each option to achieve the desired graphs and insights.

What about the implementation team?

We implemented the solution in-house.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2004174 - PeerSpot reviewer
Senior Software Engineer at an insurance company with 10,001+ employees
Real User
Very good RUM, synthetics, and infrastructure host maps
Pros and Cons
  • "Overall, the Data UI and the usability of customer features continue to improve."
  • "It is very difficult to make the solutions fit perfectly for large organizations, especially in terms of high cardinality objects and multi-tenancy, where the data needs to be rolled up to a summarized level while maintaining its individual data granularity and identifiers."

What is our primary use case?

I have been using Datadog products and capabilities increasingly over the last 4 years, from POC to widespread adoption. 

The capabilities we use are unique for each use case and can be combined in various ways to provide the full observability coverage needed to maintain stable operations and shift from being reactive to being proactive.

Our organization uses it for site/service reliability across the range of backend and frontend services, as well as custom monitoring and dashboards that can be dynamic and reused by multiple teams.

How has it helped my organization?

The capabilities we use are unique for each use case. They can be combined in various ways to provide the full observability coverage needed to maintain stable operations in order to become more proactive. 

Our organization uses it for site/service reliability across backend and frontend services, as well as custom monitoring and dashboards that can be dynamic and reused by multiple teams.

We continue to increase the size of our footprint as we get more and more positive experiences.

What is most valuable?

The APM, RUM, synthetics, and infrastructure host maps have been some of the most popular and commonly used features. 

Overall, the Data UI and the usability of customer features continue to improve. 

The RUM session data and replays are much more convenient and applicable than other tools I have worked with in the past. By combining multiple capabilities or features, we get full visibility across the technology stacks and can identify specific bottlenecks or areas where risks and vulnerabilities are likely to exist.

Watchdog insights take the work out of the hardest part, helping us identify the issues before our customers.

What needs improvement?

It is very difficult to make the solutions fit perfectly for large organizations, especially in terms of high cardinality objects and multi-tenancy, where the data needs to be rolled up to a summarized level while maintaining its individual data granularity and identifiers. Tagging is imperative. However, the solutions could be improved for these needs in the future.
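
The rollup problem described here can be pictured with a small sketch: tagged data points are summed to a summary level while the tag values used for grouping stay available as identifiers. This is only an illustration in plain Python, not Datadog's aggregation engine; the tag keys (`tenant`, `service`, `host`) and the values are hypothetical.

```python
from collections import defaultdict


def roll_up(points, group_by):
    """Sum point values per combination of the chosen tag keys."""
    totals = defaultdict(float)
    for point in points:
        # Missing tags fall into an explicit "unknown" bucket instead of vanishing.
        key = tuple(point["tags"].get(k, "unknown") for k in group_by)
        totals[key] += point["value"]
    return dict(totals)


points = [
    {"value": 3, "tags": {"tenant": "a", "service": "checkout", "host": "h1"}},
    {"value": 5, "tags": {"tenant": "a", "service": "checkout", "host": "h2"}},
    {"value": 2, "tags": {"tenant": "b", "service": "checkout", "host": "h3"}},
]

by_service = roll_up(points, ["service"])  # summary level
by_tenant = roll_up(points, ["tenant"])    # per-tenant identifiers preserved
```

The more tag keys are kept in `group_by`, the more distinct series result, which is the cardinality trade-off the review points to.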

For how long have I used the solution?

I've used the solution for over four years now.

What do I think about the stability of the solution?

The stability is excellent.

What do I think about the scalability of the solution?

You can work with engineering to make it work for your needs. They are excellent at supporting their customers.

How are customer service and support?

Technical support is excellent.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

I previously used New Relic, App Dynamics, Heap, Clicktale, and more. Datadog has incorporated many of the features we were looking for into a one-stop shop.

How was the initial setup?

The initial setup is simple and straightforward.

What about the implementation team?

We had an in-house team working directly with Datadog engineering support and technical enablement.

Which other solutions did I evaluate?

We looked into New Relic, App Dynamics, Heap, Clicktale, and more. Datadog has many of the features we were looking for in one place.

What other advice do I have?

We use all versions of the solution.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2004024 - PeerSpot reviewer
SRE at a financial services firm with 10,001+ employees
Real User
Excellent synthetic monitoring, APM, and alert features
Pros and Cons
  • "The monitoring functionality, in general, and tagging infrastructure are great."
  • "While the tool is robust with many different capabilities, users would greatly benefit from more examples in the documentation."

What is our primary use case?

We deploy various services for our main platform on AWS across multiple regions. We have a development environment, a staging environment, a QA environment, and a production environment. We deploy our many services across hundreds of instances. 

We have many server farms, all responsible for various services on our market intelligence platform. The deployment of each server farm, or even of individual instances, varies depending on what is being stood up. We have instances built in three different ways, with two different pipelines, and some even rely on user data scripts.

How has it helped my organization?

My team has a 24/7 on-call schedule where we need to be ready to handle and mitigate incidents with the platform at any moment. 

We have countless monitors set up on Datadog that alert directly to our queue using an email that generates a ticket. 

The actionable steps for each type of monitor and its associated incident are easily included in the alerts whenever something is triggered. We generate links to the Datadog monitors and can instantly drill down into what went wrong and for how long.

What is most valuable?

The features I have found most helpful are synthetic monitoring, APM, and alert features. The monitoring functionality, in general, and tagging infrastructure are great.

Synthetics have become bread and butter for us as we have migrated many tests over to Datadog. We have simplified and consolidated our synthetic tests while also making them more robust with the help of Datadog's tagging.

A large portion of our monitoring is based on synthetics results, and alerts integrate seamlessly with our incident queue system. We use dashboards heavily.

The metrics capabilities are extremely helpful, and we use virtually all of the widgets.

What needs improvement?

My main area for improvement for Datadog would be the documentation. While the tool is robust with many different capabilities, users would greatly benefit from more examples in the documentation.

The number of code snippets currently available in the docs is not enough, and some of the existing ones need to be updated.

One function I would add would be a button to generate a report of the performance of a synthetic test and the performance of each of the steps in the test over time.

For how long have I used the solution?

How long we've used the solution varies by platform. We have one platform completely in the cloud and one still on-premises. We've had the solution for many years on AWS.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Ramon Snir - PeerSpot reviewer
CTO at a tech vendor with 1-10 employees
Real User
Increases delivery velocity with less manual testing and good integrations
Pros and Cons
  • "Since we integrated Datadog, we have had increased confidence in the quality of our service, and we had an easier time increasing our delivery velocity."
  • "Since the Datadog platform has so many separate features, solving so many use cases, there are often inconsistencies in feature availability and interoperability between products."

What is our primary use case?

We use Datadog for three main use cases, including:

  • Infrastructure and application monitoring. This ensures that our services are available and performant at all times, and allows us to proactively address incidents and outages without customers contacting us. This includes monitoring of cloud resources (databases, load balancers, CPU usage, etc.), high-level application monitoring (response times, failure rates, etc.), and low-level application monitoring (business-oriented metrics and functional exceptions to customer experience).
  • Analyzing application behavior, especially around performance. We often use Datadog's application performance monitoring on non-production environments to evaluate the impact of newly introduced features and gain confidence in changes.
  • End-to-end regression testing for APIs and browser-based experiences. Datadog's synthetic testing periodically checks that the system behaves exactly as expected. This is often used as a canary to detect issues even before users reach them organically.

How has it helped my organization?

Since we integrated Datadog, we have had increased confidence in the quality of our service, and we had an easier time increasing our delivery velocity. 

We have seen time after time that the monitors we have carefully created based on all ingested data are detecting issues quickly and accurately. 

This means we allow ourselves to manually test things less frequently. We have also had an easier time investigating application errors and slowness using Datadog's APM and log explorer products which allow us to introspect any part of the system, in its execution context.

What is most valuable?

The most valuable features include:

  • Integrated observability data ingestion: All data that Datadog collects is connected. This allows us to easily connect logs with failed requests, and slow database queries with services and requests.
  • Broad integrations allow us to monitor our entire production environment in a single place, not just cloud resources. Since all parts stream metrics, logs, and events to Datadog, we can have unified dashboards and manage monitors and incidents all from the same page.
  • A high level of configuration. We can configure and modify many parts, from how data is collected from our applications to how Datadog parses and visualizes it. This means that we always get the best experience, and we don't need to find ten different products that do small things well or settle on one product that does everything badly.

What needs improvement?

Since the Datadog platform has so many separate features, solving so many use cases, there are often inconsistencies in feature availability and interoperability between products. 

Older, more mature products tend to be complete (many features, customization, broad integrations, etc.), while newer products will often be at a "just above minimum viable product" phase for a long time, doing what's intended yet missing valuable customizations and integrations.

For how long have I used the solution?

We've used the solution for 12 months.

What do I think about the scalability of the solution?

The solution scales very well on technical aspects, being able to ingest large quantities of data from many services. However, the pricing often doesn't scale naturally, and effort has to be put in to keep ongoing costs at a reasonable amount.

How are customer service and support?

Customer service and support are generally very high-quality. In most cases, they reply very quickly and offer well-researched and relevant responses. This is contrasted with many vendors who take a long time to reply and send links to documentation instead of understanding the problem.

However, we had cases where support took several weeks to reply to a complicated request and sometimes eventually responded that the issue cannot be resolved. These are rare edge-case occurrences.

How would you rate customer service and support?

Positive

How was the initial setup?

A large part of the initial setup was straightforward. We were able to collect about 80% of the relevant and 90% of the meaningful insights from just a couple of hours of connecting the AWS integration and the Datadog APM agent. 

Getting it to 100% and configuring and customizing things to our unique situation, took about two weeks. Datadog's documentation and support team were extremely helpful during both phases.

What about the implementation team?

We handled the setup in-house.

What was our ROI?

Based on the number of outages stopped or shortened (which lead to lost revenue from non-renewals) and the number of hours saved on investigations (which correlates to engineering salaries), I estimate the ROI on the implementation time and monthly charges to be between 10x and 20x.

What other advice do I have?

We use the solution as a SaaS deployment.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2003202 - PeerSpot reviewer
Architect at a comms service provider with 10,001+ employees
Real User
Good for monitoring and following metrics with a helpful flame graph
Pros and Cons
  • "Flame graphs are pretty useful for understanding how GraphQL resolves our federated queries when it comes to identifying slow points in our requests in our microservice environment with 170 services."
  • "I often have issues with the UI in my browser."

What is our primary use case?

We use the solution primarily for distributed tracing, service insight and observability, metrics, and monitoring. We create custom metrics from outbound service calls to trace the availability of back-office systems. 

We use the flame graph to get insights into our GraphQL implementation. It helps highlight how resolvers work. 

However, it's lacking in tracing which GraphQL queries are run, and we use custom spans for that.
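
The idea of deriving an availability metric from outbound service calls can be sketched as follows. This is a hypothetical, in-memory illustration only: the `AvailabilityTracker` class, the system name `billing`, and the stand-in functions are assumptions, and a real setup would submit these counts as custom metrics rather than keep them in a local counter.

```python
from collections import Counter


class AvailabilityTracker:
    """Counts outcomes of outbound calls per back-office system."""

    def __init__(self):
        self.outcomes = Counter()

    def call(self, system, func, *args, **kwargs):
        """Run func, record ok/error for the system, and re-raise failures."""
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.outcomes[(system, "error")] += 1
            raise
        self.outcomes[(system, "ok")] += 1
        return result

    def availability(self, system):
        """Fraction of successful calls, or None if nothing was recorded."""
        ok = self.outcomes[(system, "ok")]
        total = ok + self.outcomes[(system, "error")]
        return ok / total if total else None


def fetch_invoice():
    return {"id": 1}  # stand-in for a real outbound request


def failing_call():
    raise ConnectionError("back office unreachable")  # simulated outage


tracker = AvailabilityTracker()
for _ in range(3):
    tracker.call("billing", fetch_invoice)
try:
    tracker.call("billing", failing_call)
except ConnectionError:
    pass  # the caller still sees the failure; the tracker only observes it
```

After the four calls above, `tracker.availability("billing")` is 0.75.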

How has it helped my organization?

Prior, the team only had Instana, and few people used it. The main barriers to entry were the access (since it was not integrated into our SSO) and the user experience, which made it hard to follow. We had an on-prem version, and it wasn't the snappiest. The APM has made observability and tracing more accessible to developers.

What is most valuable?

Flame graphs are pretty useful for understanding how GraphQL resolves our federated queries when it comes to identifying slow points in our requests. In our microservice environment with 170 services, there are complex transactions over the course of a single user request, since we essentially operate as a middle layer integrating with 90 back-office systems.

What needs improvement?

I often have issues with the UI in my browser. I tend to have a lot of tabs open, yet have issues with it not responding or not showing data. A couple of times, pasting the URL into an incognito window showed the data that was there.

For how long have I used the solution?

I've used the solution for two years. 

How was the initial setup?

The initial setup was complex and required a bit of tweaking to get everything configured correctly and into our pipelines.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Site Reliability Engineer at a computer software company with 201-500 employees
Real User
They have a good ecosystem for their integrations
Pros and Cons
  • "Their interface is probably one of the easiest things to use because it lets non-developers and non-engineers quickly get access to metrics and pull business value out of them. We could put together dashboards and give it to people who are non-technical, then they can see the state of the world."
  • "We have been able to set very specific CPU and memory alerts, at the very base level, then we started to pull real business value, like 99th percentile response rates for our API calls."
  • "It has turned into an operational dashboard. If you feel something is going wrong, you can immediately open up Datadog. It has been our go-to application because we know the answer will be there."
  • "The way data is represented can be limiting. When I first tried it out a long time ago, you could graph a metric and another metric, and they'd overlay, but you couldn't take the ratio between the two."
  • "When I started using it years ago, it had stability problems. I remember, specifically, we ran everything in Docker containers. There were some problems getting it into a Docker container with very specific memory limits."

What is our primary use case?

We use it for custom metrics of our applications and monitoring of our systems.

How has it helped my organization?

My current company didn't have very good monitoring in the past. We had been using basic CPU monitoring. We have been able to set very specific CPU and memory alerts, at the very base level, then we started to pull real business value, like 99th percentile response rates for our API calls. 

It has turned into an operational dashboard. If you feel something is going wrong, you can immediately open up Datadog. It has been our go-to application because we know the answer will be there.
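
For readers unfamiliar with the 99th percentile figure mentioned above, here is a minimal sketch of one common way to compute it, the nearest-rank method. This is an illustration only and is not necessarily how Datadog computes percentiles; the latency values are made up.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical API response times in milliseconds.
latencies_ms = [120, 80, 95, 400, 110]
p99 = percentile(latencies_ms, 99)  # 400: the slow outlier dominates
p50 = percentile(latencies_ms, 50)  # 110: the median looks healthy
```

The gap between the two numbers is why a high percentile is often a better alerting signal than an average: it exposes the slow requests that an average smooths away.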

What is most valuable?

Their interface is probably one of the easiest things to use because it lets non-developers and non-engineers quickly get access to metrics and pull business value out of them. We could put together dashboards and give it to people who are non-technical, then they can see the state of the world. 

They have a very good ecosystem for their integrations. They have a lot of different integrations, and we use a lot of them. We have integrations with Amazon for ECS, RDS, and all of the subsystems of Amazon. We also have Docker and Splunk integrations. The integrations are great because they're definitely vetted and not third-party integrations. They're part of the Datadog ecosystem and seamless.

What needs improvement?

The way data is represented can be limiting. They have added their own little query language that you can use to manipulate things, so you can graph and relate two different metrics together. This is relatively new this year. When I first tried it out a long time ago, you could graph a metric and another metric, and they'd overlay, but you couldn't take the ratio between the two. However, it looks like this is the direction that they're going, and that's a good direction. I think they should continue adding things that way.

I like being able to put the formulas in myself. I don't want the average. I want a rolling average over three minutes, not five minutes. They're getting better at letting the user customize this.
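
To make the window choice concrete, here is a minimal sketch in plain Python of a trailing rolling average where the window length is a parameter. This is not Datadog's query language; the function name `rolling_average` and the sample data are hypothetical and only illustrate why three minutes and five minutes give different answers.

```python
from collections import deque


def rolling_average(samples, window_seconds):
    """Trailing average over (timestamp_seconds, value) pairs sorted by time.

    Each output point averages the samples whose timestamp falls in
    (ts - window_seconds, ts].
    """
    window = deque()
    total = 0.0
    result = []
    for ts, value in samples:
        window.append((ts, value))
        total += value
        # Drop samples that have aged out of the trailing window.
        while window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        result.append((ts, total / len(window)))
    return result


# Hypothetical metric sampled once a minute, rising steadily.
samples = [(0, 10), (60, 20), (120, 30), (180, 40), (240, 50), (300, 60)]

three_minute = rolling_average(samples, 180)
five_minute = rolling_average(samples, 300)

print(three_minute[-1])  # (300, 50.0)
print(five_minute[-1])   # (300, 40.0)
```

With the same data, the shorter window tracks the recent rise more closely, which is the kind of control I want over the formula.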

For how long have I used the solution?

Three to five years.

What do I think about the stability of the solution?

When I started using it years ago, it had stability problems. I remember, specifically, we ran everything in Docker containers. There were some problems getting it into a Docker container with very specific memory limits. We couldn't nail down exactly what limits the application needed. Once we did that, we were good. However, it was tricky to get the limit right in the first place.

What do I think about the scalability of the solution?

It has always scaled for us. Cost scales up too, but that is not necessarily a bad thing. It's reasonable for what they're providing. I haven't had any concerns about scaling.

We use between 100 and 500 servers at any given point in time.

How is customer service and technical support?

For the most part, the technical support is pretty good. Every now and again, you will get stuck with a support rep who could have better training, but in general, they are very good and responsive. They're willing to talk about new features, etc.

How was the initial setup?

The integration and configuration processes have been very smooth because everything is very well-documented. The documentation is phenomenal. 

What was our ROI?

We can see trends a lot more easily than if we didn't have the solution. Management can see the changes which are being made, whether it be in performance or in the number of hosts that went down. We recently made improvements to some of our internal APIs, so we reduced the number of servers that we needed. You could see that the load on the system went down and the number of servers went down, so it was easy to visualize.

What's my experience with pricing, setup cost, and licensing?

Pricing and licensing are reasonable for what they give you. You get the first five hosts free, which is fun to play around with. Then it's about four dollars a month per host, which is very affordable for what you get out of it. We have a lot of hosts that we put a lot of custom metrics into, and every host gives you an allowance for the number of custom metrics. We have not had a problem with it.

Which other solutions did I evaluate?

My company now is pretty good at looking at alternatives. Also, I evaluated alternative solutions at my last company. 

There are some other competitors. For example, I know one of them started doing metrics, and their licensing is very cheap because the metric size is very small and they charge per megabyte of storage. However, the interface and integrations aren't there.

The other thing is granularity. Datadog gives you one-second granularity for a year, whereas some of the competitors roll up: after about a week you don't have one second, you have five seconds. Then, after a month, you don't have five seconds, you have a minute. So you start to lose the granularity. Whether it averages or maxes the data, you start to lose the ability to see incidents historically, which is super valuable. If we have an incident that we think we've seen before and want to look back historically, we can zoom right in and see in the database where it peaked.
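
The effect of a roll-up can be sketched in a few lines of plain Python. This is only an illustration under assumed data: the helper `roll_up` and the latency samples are hypothetical, and it does not describe how any particular vendor stores metrics.

```python
def roll_up(values, factor, aggregate):
    """Collapse every `factor` consecutive samples into one using `aggregate`."""
    return [aggregate(values[i:i + factor]) for i in range(0, len(values), factor)]


def mean(chunk):
    return sum(chunk) / len(chunk)


# Ten hypothetical one-second latency samples (ms) with a single spike.
one_second = [20, 21, 19, 20, 950, 22, 20, 21, 19, 20]

# The same data after being rolled up to five-second resolution.
five_second_avg = roll_up(one_second, 5, mean)
five_second_max = roll_up(one_second, 5, max)

print(five_second_avg)  # [206.0, 20.4]
print(five_second_max)  # [950, 22]
```

Averaging flattens the 950 ms spike down to 206 ms, and taking the max keeps the peak but no longer tells you which second it happened in. Either way, the original incident is harder to recognize once the one-second data is gone.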

What other advice do I have?

Give Datadog a try. It's the leader in this space. 

I have only used the AWS version of the product.

They have a thing for the color purple, but it is all good.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Datadog Report and get advice and tips from experienced pros sharing their opinions.
Updated: May 2025