Gediminas Anza - PeerSpot reviewer
Manager, System at Visma
User
Top 20
Increases efficiency, helps with customer satisfaction, and enhances collaboration
Pros and Cons
  • "The agents feature in Datadog stands out as a valuable asset within our organization due to its robust functionality, versatility, and role in providing comprehensive monitoring and observability capabilities."
  • "Presently, the billing CSV reports provide insights into billing-related information yet are somewhat limited in functionality, typically offering reports with only three columns."

What is our primary use case?

The primary use case of Datadog within our organization encompasses providing a comprehensive and sophisticated solution that caters to the diverse needs of our internal customers. We have strategically implemented Datadog to serve as a centralized platform for monitoring, analyzing, and optimizing various aspects of our operations. With a robust suite of functionalities, Datadog empowers us to meet the dynamic requirements of over 40 internal customers efficiently.

Through Datadog, we offer a wide array of services to our internal stakeholders, allowing them to access and leverage its capabilities to enhance performance, troubleshoot issues, and make data-driven decisions. The tool's versatility enables different teams within our organization to monitor and track distinct metrics, such as application performance, infrastructure health, and logs, tailored to their specific requirements.

Moreover, Datadog serves as a pivotal component in our organizational ecosystem by streamlining processes, enhancing collaboration, and fostering a culture of data-driven decision-making. By harnessing the power of Datadog, our internal customers can proactively address issues, optimize resources, and ultimately improve operational efficiency across the board.

In essence, the primary use case of Datadog in our organization revolves around empowering our internal customers with a comprehensive and feature-rich solution that enables them to monitor, analyze, and optimize various aspects of our operations seamlessly and effectively. This strategic implementation of Datadog plays a vital role in enhancing our overall performance, fostering transparency, and driving continuous improvement within our organization.

How has it helped my organization?

Datadog has significantly contributed to enhancing the overall effectiveness and efficiency of our organization through various key improvements. One of the standout benefits has been the accelerated resolution of issues. By leveraging Datadog's monitoring and alerting capabilities, we have been able to swiftly detect, diagnose, and address issues before they escalate, resulting in minimized downtime and enhanced operational continuity.

Moreover, the implementation of Datadog has had a tangible positive impact on customer satisfaction. With improved visibility into our systems and applications, coupled with proactive monitoring and performance optimization, we have been able to deliver a more reliable and seamless experience to our customers. This has translated into higher customer satisfaction scores and strengthened relationships with our stakeholders.

Another notable improvement brought about by Datadog is the streamlining of our toolset. By identifying and removing multiple unused or redundant features and tools, Datadog has helped optimize our workflows and resources. This decluttering of unnecessary functionalities has not only increased operational efficiency but also streamlined our processes, allowing us to focus on the tools and features that truly add value to our operations.

In summary, Datadog's impact on our organization has been profound, enhancing our ability to resolve issues rapidly, improving customer satisfaction levels, and streamlining our toolset for increased efficiency and focus. These improvements have led to a more robust and resilient operational environment, enabling us to better meet the needs of our internal and external stakeholders.

What is most valuable?

Within our organization, we have found the Agents feature in Datadog to be exceptionally valuable due to its rich set of functionalities and capabilities. The Agents play a crucial role in our monitoring and data collection processes, providing a comprehensive and reliable means to gather crucial performance metrics and insights across our systems and applications.

One of the key reasons why the agents feature stands out as particularly valuable is its versatility. The Agents offer a wide range of monitoring and data collection options, allowing us to capture diverse metrics and performance data with precision. This flexibility enables us to tailor our monitoring strategy to meet the specific needs of different teams and use cases within our organization.

Moreover, the agents feature in Datadog enhances the overall observability of our infrastructure and applications. By deploying Agents strategically across our environment, we can gather real-time metrics, logs, and traces, enabling us to monitor the health, performance, and behavior of our systems comprehensively. This deep level of observability empowers us to proactively identify issues, optimize performance, and make informed decisions based on accurate and timely data.

Furthermore, the agents feature in Datadog plays a pivotal role in driving actionable insights and facilitating efficient troubleshooting. With the detailed data collected by the Agents, we can perform in-depth analysis, detect anomalies, and troubleshoot issues quickly and effectively. This proactive approach to monitoring and analysis ultimately enhances our operational efficiency and resilience.

In essence, the agents feature in Datadog stands out as a valuable asset within our organization due to its robust functionality, versatility, and role in providing comprehensive monitoring and observability capabilities. By leveraging the power of the Agents feature, we can effectively monitor, analyze, and optimize our systems and applications to ensure seamless operations and performance excellence.

What needs improvement?

In assessing areas for potential improvement, one key aspect where Datadog could enhance its service is in the realm of billing CSV reports. Presently, the billing CSV reports provide insights into billing-related information yet are somewhat limited in functionality, typically offering reports with only three columns. Expanding the capabilities of the billing CSV reports to include more detailed and customizable information would greatly benefit users by allowing them to gain a deeper understanding of their usage, costs, and billing trends within Datadog.

Additionally, in considering features for inclusion in the next release of Datadog, the development of more robust and customizable billing CSV reports could be a significant enhancement. By allowing users to tailor their billing reports to specific metrics, timeframes, and parameters of interest, Datadog could provide greater transparency and control over billing data, enabling users to make informed decisions regarding resource allocation, cost optimization, and budget planning.

Moreover, the inclusion of features such as cost forecasting, budget tracking, and customizable alerts related to billing thresholds could further empower users to manage their expenses effectively and proactively monitor and control costs within Datadog. These additions would not only enhance user experience and satisfaction but also contribute to a more holistic and actionable approach to financial management within the Datadog platform.

By refining the functionality of billing CSV reports and incorporating advanced features for cost analysis, forecasting, and monitoring, Datadog can elevate its service offering and provide users with enhanced tools for optimizing their usage, expenses, and financial oversight within the platform.

Buyer's Guide
Datadog
June 2025
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
856,873 professionals have used our research since 2012.

For how long have I used the solution?

I've used the solution for over three years.

What do I think about the scalability of the solution?

Datadog is easy to scale. However, the price scales with usage, so be sure to measure what you need and do not push all logs to the solution, or your costs will skyrocket quickly.

Which solution did I use previously and why did I switch?

We use multiple APM tools to have both price and value correlations relevant to the teams using them.

What's my experience with pricing, setup cost, and licensing?

Request a test account during the POC phase to determine if the tool is the right fit; all providers do that for free.

Which other solutions did I evaluate?

We ran POCs with more than five products. I can't name them due to the related NDA.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
reviewer9816413 - PeerSpot reviewer
Engineering Manager at Video Blocks
User
Top 20
Easy, more reliable, and transparent monitoring
Pros and Cons
  • "Monitors have also been very valuable when setting up our on-call processes. It makes it easy to set up and adjust alerting to keep our teams aware of anything going wrong."
  • "One thing to improve would be making it easier to see common patterns across traces."

What is our primary use case?

We use the solution to monitor and investigate issues with production services at work. We periodically review the service catalog view for our various applications, and I use it to identify anomalies in service metrics, changes in user behavior evident via API calls, and spikes in errors.

We use monitors to trigger alerts for on-call engineers to act upon. The monitors have set thresholds for request latency, error rates, and throughput. 
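
As a rough illustration of the threshold logic described above, the sketch below checks one window of request samples against latency, error-rate, and throughput limits. It is plain Python, not Datadog's monitor API; the `Thresholds` fields, the sample format, and the default numbers are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class Thresholds:
    # Hypothetical limits; real values depend on the service being monitored.
    p95_latency_ms: float = 500.0
    max_error_rate: float = 0.05        # fraction of failed requests
    min_requests_per_min: float = 10.0  # throughput floor


def evaluate_window(samples, window_minutes, limits=None):
    """Return the names of the checks breached in one evaluation window.

    `samples` is a list of (latency_ms, is_error) tuples, one per request.
    """
    limits = limits or Thresholds()
    breaches = []

    # Throughput: requests per minute over the window.
    if len(samples) / window_minutes < limits.min_requests_per_min:
        breaches.append("throughput")

    # Latency: quantiles(n=20) returns 19 cut points; the last is the p95.
    if len(samples) >= 2:
        latencies = [latency for latency, _ in samples]
        if quantiles(latencies, n=20)[-1] > limits.p95_latency_ms:
            breaches.append("latency")

    # Error rate: share of requests flagged as errors.
    if samples:
        errors = sum(1 for _, is_error in samples if is_error)
        if errors / len(samples) > limits.max_error_rate:
            breaches.append("error_rate")

    return breaches
```

An on-call alert would fire whenever the returned list is non-empty; keeping the three checks separate makes it clear which threshold paged the engineer.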

We also use automated rules to block bad actors based on request volume or patterns.

How has it helped my organization?

Datadog has made setting up monitors easier, more reliable, and more transparent. This has helped standardize our on-call process and set all of our on-call engineers up for success.  

It has also standardized the way we evaluate issues with our applications by encouraging all teams to use the service catalog.  

It makes it easier for our platforms and QA teams to get other engineering teams up to speed with managing their own applications' performance. 

Overall, Datadog has been very helpful for us.

What is most valuable?

The service catalog view is very helpful for periodic reviews of our application. It has also standardized the way we evaluate issues with our applications.  Having one page with an easy-to-scan view of app metrics, error patterns, package vulnerabilities, etc., is very helpful and reduces friction for our full-stack engineers.

Monitors have also been very valuable when setting up our on-call processes. It makes it easy to set up and adjust alerting to keep our teams aware of anything going wrong.

What needs improvement?

Datadog is great overall. One thing to improve would be making it easier to see common patterns across traces. I sometimes end up in a trace but have a hard time finding other errors or requests that share features with that trace. This could be easier to get to, though it may actually be an education issue.

Another thing that could be improved is that the service list page sometimes refreshes slowly, and I accidentally click the wrong environment because the sort order changes late.

For how long have I used the solution?

I've used the solution for about a year.

What do I think about the stability of the solution?

It is very stable. I have not seen any issues with Datadog.

What do I think about the scalability of the solution?

It seems very scalable.

How are customer service and support?

I've had no specific experience with technical support.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We used Honeycomb before. We switched since Datadog offered more tooling.

How was the initial setup?

Each application has been easy to instrument.

What about the implementation team?

We implemented the solution in-house.

What was our ROI?

Engineers save an unquantifiable amount of time by having one standard view for all applications and monitors.

What's my experience with pricing, setup cost, and licensing?

I am not exposed to this aspect of Datadog.

Which other solutions did I evaluate?

We did not evaluate other options. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
reviewer2561139 - PeerSpot reviewer
Director of DevSecOps at CIBT
User
Consistent, centralized service for varied cloud-based applications
Pros and Cons
  • "Our primary alerts, based on metrics and synthetic transactions, are the most used and relied upon for decreased MTTA/MTTR across all of our platforms. This is followed by deep log analysis that enables us to quickly and easily get to a preliminary root cause that someone on the infrastructure, platform or development teams can take and focus their attention on the precise target that Datadog revealed as the issue to be remediated."
  • "They could enhance the alerting functions by creating a new feature to add direct SMS notifications, on-call rotation scheduling, etc., that could replace the need to have this as an external third-party solution integration."

What is our primary use case?

The current use case for Datadog in our environment is observability. We use Datadog as the primary log ingestion and analysis point, along with consolidation of application and infrastructure metrics across cloud environments and real-time alerting on issues that arise in production.

Datadog integrates within all aspects of our infrastructure and applications to provide valuable insights into Containers, Serverless functions, Deep Logging Analysis, Virtualized Hardware and Cost Optimizations.

How has it helped my organization?

Datadog improved our observability layer by creating a consistent, centralized service for all of our varied cloud-based applications. All of our production and non-production environment applications and infrastructure send metrics directly to Datadog for analysis and determination of any issues that would need to be looked at by the Infrastructure, Platform and Development teams for quick remediation. Using Datadog as this centralized Observability platform has enabled us to become leaner without sacrificing project timelines when issues arise and require triage for efficient resolution.

What is most valuable?

All of Datadog's features have become valuable tools in our cloud environments.

Our primary alerts, based on metrics and synthetic transactions, are the most used and relied upon for decreased MTTA/MTTR across all of our platforms. This is followed by deep log analysis that enables us to quickly and easily get to a preliminary root cause that someone on the infrastructure, platform or development teams can take and focus their attention on the precise target that Datadog revealed as the issue to be remediated.

What needs improvement?

I see two areas that need improvement or where a new feature would add value. The first is building a more robust SIM that would include container scanning to rival other such products on the market, so we do not need to extend functionality to another third-party provider. The second is expanding the alerting functions with features such as direct SMS notifications and on-call rotation scheduling, which could replace the need for an external third-party integration.

For how long have I used the solution?

I've been a Datadog user for almost ten years.

What do I think about the stability of the solution?

Datadog is very stable, and we've only come across a few items that needed to be addressed quickly when there were issues.

What do I think about the scalability of the solution?

Scalability is very favorable, aside from cost/budget, which limits the scalability of this platform.

How are customer service and support?

Both customer service and support need a little work, as we have had a number of requests/issues that were not addressed as we needed them to be.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

Being an Observability SME, I have used many native and third-party solutions, including Dynatrace, New Relic, CloudWatch and Zabbix. As previously mentioned, Datadog provides a superior platform for centralizing and consolidating our Observability layer. Switching to Datadog was a no-brainer when most other solutions either didn't provide the same maturity of functions or didn't have them available at all.

How was the initial setup?

The initial setup was very straightforward, and the integrations were easily configured.

What about the implementation team?

We implemented Datadog in-house.

What was our ROI?

For the most part, Datadog's ROI is quite impressive when you consider all of the features and functions that are centralized on the platform. It doesn't require us to purchase additional third-party solutions to fill in the gaps.

What's my experience with pricing, setup cost, and licensing?

The setup was dead simple once the cloud integrations and agent components were identified and executed. Licensing falls into our normal third-party processes, so it was a familiar feeling when we started with Datadog. Cost is the only outlier when it comes to a perfect solution. Datadog is expensive, and each add-on drives that cost further into the realm of requiring justifications to finance expanding the core suite of features we would like to enable.

Which other solutions did I evaluate?

Yes, we evaluated several competing platforms that included Dynatrace, New Relic and Zabbix.

What other advice do I have?

They should provide more inclusive pricing, or an "all you can eat" tier that would include all relevant features, as opposed to individual cost increases. That would let Datadog become more valuable and replace even more third-party solutions that have a lower cost of entry.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Real User
Top 20
Improves monitoring and observability with actionable alerts
Pros and Cons
  • "The selection of monitors is a big feature I have been working with."
  • "The PagerDuty integration could be a little bit better."

What is our primary use case?

We are using Datadog to improve our monitoring and observability so we can hopefully improve our customer experience and reliability.  

I have been using Datadog to build better actionable alerts to help teams across the enterprise. By using Datadog, we also hope to improve observability into our apps, and we are taking advantage of this process to improve our tagging strategy so teams can troubleshoot incidents faster and achieve a much reduced mean time to resolve.

We have a lot of different resources we use like Kubernetes, App Gateway and Cosmos DB just to name a few.

How has it helped my organization?

As soon as we started implementing Datadog in our cloud environment, people really liked how it looked and how easy it was to navigate. We could see more data in our Kubernetes environments than we ever could before.

Some people liked how the logs were color coded so it was easy to see what kind of log you were looking at. The ease of making dashboards has also been greatly received as a benefit. 

People have commented that there is so much information that it takes time to digest, get used to what you are looking at, and find what you are looking for.

What is most valuable?

The selection of monitors is a big feature I have been working with. Previously with Azure Monitor we couldn't do a whole lot with their alerts. The log alerts can sometimes take a while to ingest. Also, we couldn't do any math with the metrics we received from logs to make better alerts from logs.  

The metric alerts are OK but are still very limited. With Datadog, we can make a wide range of different monitors that we can tweak in real time, because there is a graph of data as you are creating the alert, which is very beneficial. The ease of making dashboards has saved a lot of people a lot of time. There are no KQL queries to write to put together the information you are looking for, and the ability to pin any info you see into a dashboard is very convenient.

RUM is another feature we are looking forward to using this upcoming tax season, as we will have a front-row view into what frustrates customers or where things go wrong in their process of using our site. 

What needs improvement?

The PagerDuty integration could be a little bit better. If there was a way to format the monitors to different incident management software that would be awesome. As of right now, it takes a lot of manipulating of PagerDuty to get the monitors from Datadog to populate all the fields we want in PagerDuty.  

I love the fact you can query data without using something like KQL. However, it would also be helpful if there was a way to convert a complex KQL query into Datadog to be able to retrieve the same data - especially for very specific scenarios that some app teams may want to look for.

For how long have I used the solution?

I've used the solution for about two years.

Which solution did I use previously and why did I switch?

We previously used Azure Monitor, App Insights, and Log Analytics. We switched because it was a lot for developers and SREs to switch between three screens to try to troubleshoot, and when you add in the slow load times from Azure, it can take a while to get things done.

What's my experience with pricing, setup cost, and licensing?

I would advise taking a close look at logging costs, man-hours needed, and the amount of time it takes for people to get comfortable navigating Datadog because there is so much information that it can be overwhelming to narrow down what you need.

Which other solutions did I evaluate?

We did evaluate Dynatrace and looked into New Relic before settling on Datadog.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: H&R Block has just recently become a customer of Datadog.
Dmitri Panfilov - PeerSpot reviewer
Software Engineer at Redfin Corp
User
Top 20
Easy dashboard creation and alarm monitoring with a good ROI
Pros and Cons
  • "The ease of dashboard creation and alarm monitoring has helped us not only stay competitive but be industry leaders in performance."
  • "The product can be improved by allowing the grouping of APIs to add variables. That way, any API with a unique ID could be grouped together."

What is our primary use case?

We use the solution to monitor production service uptime/downtime, latency, and log storage. 

Our entire monitoring infrastructure runs off Datadog, so all our alarms are configured with it. We also use it for tracing API performance to find the biggest regression points.

Finally, we use it to compare our performance on SEO metrics against competitors. This is a primary use case because SEO dictates our position in Google traffic, which generates a large portion of our customer views, so it is a vital part of the business that we rely on Datadog for.

How has it helped my organization?

The product improved the organization primarily by providing consistent data with virtually zero downtime. This was a problem we had with an old provider. It also made it easy to transition an otherwise massive migration involving hundreds of alarms. 

The training provided was crucial, along with having a dedicated team that can forward our requests to and from Datadog efficiently. Without that, we may have never transitioned to Datadog in the first place since it is always hard to lead a migration for an entire company.

What is most valuable?

The API tracing has been massive for debugging latency regressions and improving the performance of our least performant APIs. Through tracing, we managed to find the slowest step of an API, improve its latency, and iterate on the process until we had our desired timings. This is important for improving our SEO, as LCP and INP are directly tied to the API timings we see in Datadog.

The ease of dashboard creation and alarm monitoring has helped us not only stay competitive but be industry leaders in performance.

What needs improvement?

The product can be improved by allowing the grouping of APIs to add variables. That way, any API with a unique ID could be grouped together. 
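
To make the grouping request concrete, here is a minimal sketch of the idea: replace ID-like path segments with a placeholder so that calls to the same endpoint fall into one group. It is plain Python and does not use any Datadog feature; the route names, the `{id}` placeholder, and the assumption that IDs are numeric or UUID segments are all hypothetical.

```python
import re
from collections import defaultdict

# Assumption: IDs appear as all-digit segments or UUIDs; adjust for your routes.
_UUID = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def normalize_path(path):
    """Replace ID-like segments of a URL path with the placeholder {id}."""
    parts = []
    for segment in path.split("/"):
        if segment.isdigit() or _UUID.match(segment):
            parts.append("{id}")
        else:
            parts.append(segment)
    return "/".join(parts)


def group_latencies(requests):
    """Group (path, latency_ms) pairs by their normalized path."""
    groups = defaultdict(list)
    for path, latency_ms in requests:
        groups[normalize_path(path)].append(latency_ms)
    return dict(groups)
```

With this normalization, `/homes/1` and `/homes/2` are reported as the single endpoint `/homes/{id}`, so their latencies can be compared and alerted on together.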

Furthermore, SEO monitoring has been crucial for us but also difficult to set up, as comparing alarms between us and competitors is a tough feat. The data is not always consistent, so we have been experimenting with removing the noise in Datadog, but it's been taking a while.

Finally, Datadog should have a feature that reports stale alarms based on activity.
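
One possible reading of "stale" is an alarm that has not triggered within some idle period. The sketch below reports such alarms from a mapping of alarm name to last trigger time. The alarm names, the 90-day default, and the data shape are assumptions for illustration, and the input would have to be exported from the monitoring tool by some other means.

```python
from datetime import datetime, timedelta


def find_stale_alarms(last_triggered, now, max_idle_days=90):
    """Return sorted names of alarms with no trigger in the idle period.

    `last_triggered` maps alarm name -> datetime of the last trigger,
    or None if the alarm has never fired.
    """
    cutoff = now - timedelta(days=max_idle_days)
    stale = [
        name
        for name, triggered_at in last_triggered.items()
        if triggered_at is None or triggered_at < cutoff
    ]
    return sorted(stale)
```

An alarm that never fires may simply be guarding a healthy service, so a report like this is best treated as a review list rather than a deletion list.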

For how long have I used the solution?

I've used the solution for six months.

What do I think about the stability of the solution?

It's very stable, and we have not experienced an issue with downtime on Datadog.

What do I think about the scalability of the solution?

Datadog works well for scalability, as growing volume has not seemed to slow it down.

How are customer service and support?

We haven't talked to the support team. 

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We switched to Datadog because our previous provider had very inconsistent logging. Our alarms would often fail to fire when our services were not working because the provider had a logging problem.

How was the initial setup?

The initial setup was somewhat complex due to the built-in monitoring with services. It is not always comprehensive and has to be studied, as opposed to other metrics platforms that just service all your endpoints, which you can then trace with Grafana.

What about the implementation team?

We implemented the solution through an in-house team.

What was our ROI?

The ROI is good.

What's my experience with pricing, setup cost, and licensing?

Users must try to understand the way Datadog alarms work off the bat so that they can minimize the requirements for expensive features like custom metrics. 

It can sometimes be tempting to use them; however, they are not always necessary as you migrate to Datadog, since it is a provider that treats alarms somewhat differently than you may be used to.

Which other solutions did I evaluate?

We have evaluated New Relic, Grafana, Splunk, and many more in our quest to find the best monitoring provider.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
reviewer1494894 - PeerSpot reviewer
Senior Manager, Site Reliability Engineering at Extra Space Storage
Real User
Top 20
Improved time to discovery and resolution but needs better consumption visibility
Pros and Cons
  • "Several critical dashboards were created years ago and are still in use today."
  • "We would love to see a % consumed and be alerted if we are over budget before getting an overage charge 20 days into the month."

What is our primary use case?

The product monitors multiple systems, from customer interactions on our web applications down to the database and all layers in between. RUM, APM, logging, and infrastructure monitoring are all surfaced into single dashboards.

We initially started with application logs and generated long-term business metrics out of critical logs. We have turned those metrics and logs into a collection of alerts integrated into our pager system. As we have evolved, we have also used APM and RUM data to trigger additional alerts.

How has it helped my organization?

The solution has surfaced how integrated our applications really are and helps us track calls from the top down, identifying slowness and errors all through the call stack.

The biggest improvement we have seen is our time to discovery and resolution. As Datadog has improved, and we add new features, the depth and clarity we get from top to bottom has been excellent. Our engineering teams have quickly adopted many features within Datadog, and are quick to build out their own dashboards and alerts. This has also led to a rapid sprawl when left unchecked.

What is most valuable?

We started with application logs and have expanded over the years to include infrastructure, APM, and now RUM. All of these tools have been incredibly valuable in their own sphere. The huge value is tying all of the data points together.

Logging was the first tool we started with years ago, replacing our ELK stack. It was the easiest to get in place, and our engineers quickly embraced the tools. Several critical dashboards were created years ago and are still in use today. Over time, we have shifted from verbose logs and matured into APM and RUM. That has helped us focus on fine-tuning the performance of our applications.

What needs improvement?

We need better visibility into our consumption rate, which is tied to our commit levels. We would love to see a % consumed and be alerted if we are over budget before getting an overage charge 20 days into the month.

The biggest complaint we hear comes from the cost of the tool. It is pretty easy to accidentally consume a lot of extra data. Unless you watch everything come in almost daily, you could be in for a big surprise. 

We utilize the Datadog estimated usage metrics to build out alerts and dashboards. The usage and cost system page still doesn't tie into our committed spending - it would be wonderful to see the monthly burn rate on any given day.
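
The kind of burn-rate view described above comes down to simple arithmetic. The sketch below computes the percentage of a commitment consumed so far and a straight-line projection to month end. The function name, the units (for example GB of ingested logs), and the linear projection are assumptions for illustration; it does not read Datadog's usage metrics.

```python
import calendar
from datetime import date


def consumption_report(used, committed, today):
    """Summarize month-to-date usage against a monthly commitment.

    `used` and `committed` share one unit (hypothetically, GB of logs).
    The projection assumes usage continues at the average daily rate so far.
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_burn = used / today.day
    projected = daily_burn * days_in_month
    return {
        "percent_consumed": round(100 * used / committed, 1),
        "percent_of_month_elapsed": round(100 * today.day / days_in_month, 1),
        "projected_month_end": round(projected, 1),
        "over_budget": projected > committed,
    }
```

For example, 600 units used by June 10 against a commitment of 1,000 is 60% consumed with a third of the month elapsed, which projects to 1,800 units and would warrant an alert well before day 20.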

For how long have I used the solution?

I've used the solution for six years.

What do I think about the stability of the solution?

There have not been as many outages in the past year. We also haven't been jumping into the new features as quickly as they come out. We may be working on more stable products.

What do I think about the scalability of the solution?

It has scaled up to meet our needs pretty well. Over the years, we have only managed to trigger internal Datadog alerts once or twice by misconfiguring a metric and spiralling out of control with costs.

How are customer service and support?

Support has been lacking. Opening a chat with the tech support rep of the day is always a gamble. We are looking into working with third-party support because it has been so rough over the years.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We used the ELK stack for logging and monitoring and AppDynamics for APM.

How was the initial setup?

The initial setup for new teams has become easier over the years. We are increasing our adoption rate as we shift our technology to more cloud-native tools. Datadog has supported easy implementation by simply adding a package to the app. 

They have really focused on a lot of out-of-the-box functionality, but the real fun happens as you dive deeper into the configuration. We have also begun adopting OpenTelemetry standards. This has kept us from going too deep into vendor-specific implementations.

What about the implementation team?

We did the initial setup via an in-house team.

What was our ROI?

As long as we stay on top of our consumption mid-month, it has been worth it. However, the few engineers we have who are dedicated to playing whack-a-mole with the growing spending could be better utilized in teaching best practices to new users. I suppose our implementation of the rapidly changing tools over the years has led to a fair amount of technical debt.

What's my experience with pricing, setup cost, and licensing?

It is quite easy to set up any specific tool, but to take advantage of the full visibility it offers, you need to instrument across the board—which can be time-consuming. Be careful about how each tool is billed, and watch your consumption like a hawk.

Which other solutions did I evaluate?

We evaluated AppDynamics and Dynatrace.

What other advice do I have?

It's a very powerful tool, with lots of new features coming, but you certainly will pay for what you get.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Head of Software at Emporia
User
Top 10
Great for web application log aggregation, performance tracing, and alerting
Pros and Cons
  • "Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most."
  • "I'd like to see an expansion of the Android and iOS apps to have a simplified CI/CD pipeline history view."

What is our primary use case?

Our primary use case is for custom and vendor-supplied web application log aggregation, performance tracing, and alerting. 

We run a mix of AWS EC2, Azure serverless, and colocated VMware servers to support higher education web applications. Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge.

Datadog agents are on each web host, and we have native integrations with GitHub, AWS, and Azure to get all of our instrumentation and error data in one place for easy analysis and monitoring.

How has it helped my organization?

By using Datadog across all of our apps, we were able to consolidate a number of alerting and error-tracking apps. Datadog ties them all together in cohesive dashboards. Whether the app is vendor-supplied or we built it ourselves, the depth of tracing, profiling, and hooking into logs is all obtainable and tunable. Both legacy .NET Framework with Windows Event Viewer and cutting-edge .NET Core with streaming logs work. The breadth of coverage for any app type or situation is really incredible. It feels like there's nothing we can't monitor.

What is most valuable?

When it comes to Datadog, several features have proven particularly valuable. 

The centralized pipeline tracking and error logging provide a comprehensive view of our development and deployment processes, making it much easier to identify and resolve issues quickly. 

Synthetic testing has been a game-changer, allowing us to catch potential problems before they impact real users. 

Real user monitoring gives us invaluable insights into actual user experiences, helping us prioritize improvements where they matter most. And the ability to create custom dashboards has been incredibly useful, allowing us to visualize key metrics and KPIs in a way that makes sense for different teams and stakeholders. 

Together, these features form a powerful toolkit that helps us maintain high performance and reliability across our applications and infrastructure, ultimately leading to better user satisfaction and more efficient operations.

What needs improvement?

I'd like to see an expansion of the Android and iOS apps to have a simplified CI/CD pipeline history view.

I like the idea of monitoring on the go, yet it seems the options are still a bit limited out of the box. 

While the documentation is very good considering all the frameworks and technology Datadog covers, there are areas - specifically .NET Profiling and Tracing of IIS hosted apps - that need a lot of focus to pick up on the key details needed. 

In some cases the screenshots don't match the text as updates are made.

For how long have I used the solution?

I've been using the solution for about three years.

What do I think about the stability of the solution?

We have been impressed with the uptime. It offers clean and light resource usage of the agents.

What do I think about the scalability of the solution?

The solution scales well and is customizable. 

How are customer service and support?

Customer support is always helpful in tuning our committed costs and alerting us when we start spending out of the on-demand budget.

Which solution did I use previously and why did I switch?

We used a mix of a custom error email system, SolarWinds, UptimeRobot, and GitHub Actions. We switched to find one platform that could give deep app visibility regardless of whether the app runs on Linux, Windows, or containers, and whether it is cloud or on-prem hosted.

How was the initial setup?

The implementation is generally simple. .NET Profiling of IIS and aligning logs to traces and profiles were a challenge.

What about the implementation team?

We implemented the setup in-house. 

What was our ROI?

We've witnessed significant time saved by the development team assessing bugs and performance issues.

What's my experience with pricing, setup cost, and licensing?

Set up live trials to assess cost scaling. Small decisions around how monitors are used can have big impacts on cost scaling.

Which other solutions did I evaluate?

New Relic was considered. LogicMonitor was chosen over Datadog for our network and campus server management use cases.

What other advice do I have?

We're excited to dig further into the new offerings around LLM and continue to grow our footprint in Datadog. 

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Product Engineering Manager at FMG Suite
User
Good logging, easy to find issues, and saves time
Pros and Cons
  • "The logging in general is one of my favorite features."
  • "I would love to have some DD guru come in and do a department training directly on our setup."

What is our primary use case?

We use the solution for APM, AWS, Lambda, logging, and infrastructure. We have many different things all over AWS, and having one place to look is great.

We have all sorts of different AWS things out there that are in C# and Node. Having a single place to log and APM into is very important to us.

Keeping track of the cloud infrastructure is also important. We have Lambda, containers, EC2, etc.

Having a super simple interface to filter searches for APM and logging is great. It is super easy to show people how to use it. This is super important to us.

How has it helped my organization?

Finding issues quickly is super important, as is being able to create dashboards and alert on issues.

Having the ability to create dashboards has really taught us how to utilize the searching part of the system. We are able to share them, and build upon them so easily. Many iterations later people are putting some solid information out there.

Alerting is also important to us. We have set up many alerts that help us spot issues in the platform before they become bigger issues. This has enabled my teams to use incidents and address the issues so they are no longer problems.

What is most valuable?

Alerting on running systems is very helpful. Finding issues is quick. We have one place for logging and searching. Being able to save searches, reference them in the future, and build upon them is valuable.

The logging in general is one of my favorite features. The search is so straightforward and easy to use. Just being able to click on a field and add it to the search has taught me so much about the interface. It might not be as useful without a shortcut like that to teach me the system. We have Cloudflare logs in there, and sometimes I have no idea how to filter on such a buried piece of JSON. That is where the interface helps me: by clicking "add to search," I get what I need.
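To give a feel for why the click-to-search shortcut matters, here is a rough sketch of what a deeply nested log looks like once each field is turned into a dotted path. This is my own illustration and not how Datadog works internally; the `flatten` helper and the sample log fields are made up, and the `@path:value` form printed at the end is only meant to resemble an attribute search.

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Turn a nested dict into a flat {dotted.path: value} mapping."""
    paths = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            paths.update(flatten(value, path))
        else:
            paths[path] = value
    return paths


# Hypothetical log entry; the field names are invented for this example.
log = {"http": {"request": {"headers": {"cf_ray": "abc123"}}, "status": 403}}

flat = flatten(log)
for path, value in flat.items():
    print(f"@{path}:{value}")
```

Typing a path like `http.request.headers.cf_ray` by hand is exactly the part I would never get right on my own, which is why clicking the field is so useful.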

What needs improvement?

The PagerDuty replacement is something we are very interested in. We only really use PagerDuty to call the team when things are down.

I would love to have some DD guru come in and do a department training directly on our setup. We would love to have someone come in and show us the things we could do better within our current setup.

Saving a bit of cash would also help if there are things we are doing that are costing us. It's a big enough tool that it is tough to have someone dedicated to managing it.

For how long have I used the solution?

I've used the solution for a bit over a year at this point.

What do I think about the stability of the solution?

The stability seems good here too.

What do I think about the scalability of the solution?

Scalability seems good to me. I have no complaints.

How are customer service and support?

I get answers from our contact, and one team member did reach out. It went well.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We used Loggly. 

We switched because we wanted an all-in-one tool.

How was the initial setup?

Some parts of our setup were tough. Some Windows container setups cost us a lot of time.

The AWS infrastructure was tough to fully turn on due to the large cost of everything being run.

What about the implementation team?

We handled the setup ourselves in-house.

What was our ROI?

This cost us more overall. ROI is hard to sell. That said, I can find issues way faster and see what is going on in my entire platform. I pay back the cost every month with productivity. 

What's my experience with pricing, setup cost, and licensing?

It is going to cost you more than you think to keep everything running. We saw value in the one-for-all solution; however, it came at a premium to what we were paying.

Which other solutions did I evaluate?

We did evaluate Dynatrace.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Updated: June 2025