Our clients use it for monitoring applications. Its deployment depends on our customer's use case.
It is 100% cloud. We have got a multi-tenant environment, so we segment it out.
It helps us be more proactive. We can help customers with their e-commerce applications on any networking issues, and we can also help them from a development standpoint, for example, in a non-production environment where they're testing various functionalities. It helps them be more successful with their deployments.
The visibility that it provides is valuable. It helps us be proactive around incident management and gives us more visibility into our customers' applications so that we can assist them at the application layer. We also provide them the infrastructure from an AWS standpoint. We are able to make sure that our customers are aware of critical analytical findings in either the network or the application, and we're able to call customers before they even know about an issue. From there, we can start putting together change management processes and help them further.
It could have a more modern pricing mechanism. We're actually working with them to figure out how to become more modular and modernize the pricing. The issue with Datadog is that you have to buy the whole suite of products and can end up utilizing only 40% of it. Most organizations today are split between application development, networking, and security, so there should be a way to break the suite down into separate modules for app dev, infosec, networking, etc. Customers have various needs across their business lines, and sometimes they're just not willing to pay for tools that they're not using 100%. AppDynamics is probably a little bit better in terms of being modular.
I have been using this solution for almost four years.
We haven't lost any customers over Datadog, so it must be stable.
As long as you're willing to pay for 100% but utilize only 40%, it can scale and do anything you want. In an organization, its users are usually the app group, the security group, and the network group.
We're certified in Datadog, and we have our own internal engineers to support the customers. We handle level-two and level-three support.
It is usually pretty complex.
It has a module-based pricing model.
I would advise others to review the overall functionality. If you're looking for an APM tool that covers all aspects of your environment and your application, from security to infrastructure, then Datadog is a good choice. If you're not, there are other tools out there that you could utilize for each one of those areas.
We do a lot of proof of concepts in helping our customers understand the micro and macro pieces of deployment. We're able to be a true advocate and value-add for our customers in utilizing the tool.
I would rate Datadog a seven out of ten. This is a very competitive space, and a lot of organizations are trying to figure out how to get better across the full life cycle of a deployment. There will be a lot of changes for different companies going forward.
We are trying to get a handle on observability. Currently, the overall health of the stack is very anecdotal. Users are reporting issues, and Kubernetes pods are going down. We need to be more scientific and be able to catch problems early and fix them faster.
Given the fact that we are a new company, our user base is relatively small, yet growing very fast. We need to predict usage growth better and identify problem implementations that could cause a bottleneck. Our relatively small size has allowed us to be somewhat complacent with performance monitoring. However, we need to have that visibility.
We are still taking baby steps with Datadog, so it's hard to come up with quantifiable information. The most immediate benefit is aggregating performance metrics together with log information. Having a better understanding of observability will help my team focus on the business problems they are trying to solve and write code that is conducive to being monitored, instead of reinventing the wheel and relying on their own logic to produce metrics that are out of context.
The most useful feature is the APM. Being able to quickly view which requests are time-consuming, and which calls have failed is invaluable. Being able to click on a UI and be pointed to the exact source of the problem is like magic.
I'm also very intrigued by log management, although I haven't quite had a chance to use it effectively. In particular, the trace and span IDs don't quite seem to work for me. However, I'm very keen on getting this to work. This will also help my developers be more diligent and considerate when creating log data.
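What I'm aiming for is roughly the following. This is only a sketch, assuming a Node.js service instrumented with dd-trace and a JSON logger like winston (the log call and values are illustrative), where the injected trace and span IDs ride along on every log line:

```ts
// 'dd-trace/init' must be the very first import so the tracer can patch the
// logger before it loads; configuration comes from environment variables such
// as DD_SERVICE, DD_ENV, and DD_LOGS_INJECTION=true.
import 'dd-trace/init';
import winston from 'winston';

// JSON output keeps the injected dd.trace_id / dd.span_id fields as structured
// attributes, so each log line links back to the APM trace it belongs to.
const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [new winston.transports.Console()],
});

logger.info('order submitted', { orderId: '12345' }); // illustrative log call
```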
As a new customer, the Datadog user interface is a bit daunting. It gets easier once one has had a chance to get acquainted with it, yet at first, it is somewhat overwhelming. Maybe having a "lite" interface with basic features would make it easier to climb the learning curve.
Maybe the feature already exists. However, I'm not sure how to keep dashboard designs and synthetic tests in source control. For example, we may replace a UI feature, and rebuild a test accordingly in a pre-production environment, yet once the code is promoted to production, the updated test would also need to be promoted.
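One approach I've been considering, purely as a sketch, is to export each dashboard's JSON definition through Datadog's public API client and commit it to the repository; the dashboard ID and output path below are placeholders:

```ts
// A sketch with placeholder IDs and paths; assumes DD_API_KEY / DD_APP_KEY are
// set so the client can authenticate.
import { client, v1 } from '@datadog/datadog-api-client';
import { writeFileSync } from 'node:fs';

const configuration = client.createConfiguration();
const dashboards = new v1.DashboardsApi(configuration);

// Pull a dashboard's full JSON definition so it can be committed to git and
// reviewed/promoted like any other change.
async function exportDashboard(dashboardId: string, outFile: string) {
  const definition = await dashboards.getDashboard({ dashboardId });
  writeFileSync(outFile, JSON.stringify(definition, null, 2));
}

exportDashboard('abc-123-xyz', 'dashboards/checkout-overview.json').catch(
  (err) => console.error(err),
);
```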
We have just started using the solution and have only used it for about two months.
We're new at this. That said, so far, there haven't been any issues to report.
I have not had the opportunity to evaluate the scalability.
Customer support is full of great folks! We're beginning our Datadog journey, so I haven't had that much experience. The little I have had has been great.
Positive
This is all new.
We used to work with New Relic. New Relic has an amazing APM solution. However, it also became cost-prohibitive.
Since we are relatively greenfield, it was relatively painless to set up the product.
Our in-house DevOps team did the implementation.
I don't know what the ROI is at this stage.
I'm not sure what the exact pricing is.
So far, it's been great!
The RUM is implemented for customer support session replays to quickly route, triage, and troubleshoot support issues, which can then be sent directly to our engineering teams.
Customer support will log in directly after receiving a customer request and work on the issue. Engineers will use the replay along with RUM to pinpoint the issue, combined with APM and infrastructure traces, to look for signals that point to the direct cause of the customer impact.
Incident management will be utilized to open a Jira ticket for engineering, and it integrates with ITSM systems and on-call as needed.
The RUM solution has improved our ability to triage faster and has put more capabilities in the hands of our customer support team.
The RUM is implemented for customer support. It can quickly route, triage, and troubleshoot support issues that are sent to our engineering teams.
Customer support can log in and start troubleshooting after receiving a customer request. The replay and RUM help pinpoint the issue. This functionality is combined with APM and Infra trace to be able to look for the cause of the issue. Incident management is leveraged to open a Jira ticket for engineering, and it can integrate with ITSM systems and on-call as needed.
RUM with session replay, combined with a future use case to support synthetics, will help us identify issues earlier in our process. We have not rolled this out yet but plan for it as a future use case for our customer support process. This, combined with integrated automation for incident management, will drive down our MTTR and the time spent working through tickets. Overall, we are hoping to use this to look at our data and deflection rate over time in a BI-like way and reduce our customer support headcount by saving on time spent.
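For context, the browser-side setup behind this workflow is fairly small. The following is only a sketch with placeholder IDs, names, and sample rates, not our actual configuration, showing RUM being initialized with Session Replay enabled:

```ts
// Placeholders throughout; not our actual IDs or sample rates.
import { datadogRum } from '@datadog/browser-rum';

datadogRum.init({
  applicationId: '<RUM_APPLICATION_ID>',
  clientToken: '<CLIENT_TOKEN>',
  site: 'datadoghq.com',
  service: 'support-portal',      // hypothetical service name
  env: 'production',
  sessionSampleRate: 100,         // collect RUM data for every session
  sessionReplaySampleRate: 20,    // record replays for a sample of sessions
  trackUserInteractions: true,    // capture clicks so replays show what users did
  defaultPrivacyLevel: 'mask-user-input',
});

// Replays are only captured once recording is explicitly started.
datadogRum.startSessionReplayRecording();
```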
I would like to see retention options greater than 30 days for session replay. I'd also like forwarding options for retaining data in custom solutions, and a greater ability to export event data from the tooling to BI/DW solutions for long-term reporting and trend analysis.
I've used the solution for about nine months.
So far, stability has been great.
I'd like to see more bells and whistles added over time. Widgets are coming soon to help with RUM.
Support is very good. They are responsive and give us the help we need.
Positive
We have utilized New Relic, though not for RUM. We went with Datadog to potentially switch the entire platform to an all-in-one solution that makes sense for a company of our size.
We started on the beta, and the documentation was lagging behind. We also needed direct instructions and links from our customer support/account representative that were not immediately available by searching online.
We implemented the solution ourselves.
Ideally, this will inform our strategy to not increase our customer support headcount as significantly into 2023 and beyond.
The pricing is a bit confusing. However, the RUM session replay, in general, is very inexpensive compared to whole solutions.
We looked into LogRocket and New Relic.
I'd advise other users to try it out.
We use it to monitor and alert on our ECS instances as well as other AWS services, including DynamoDB, API Gateway, etc.
We have it connected to PagerDuty for alerting on all our cloud applications.
We also use custom RUM monitoring and synthetic tests for both our internal and public-facing websites.
For our cloud applications, we can use Datadog to define our SLOs and SLIs and generate dashboards that are used to monitor the SLOs and report them to our senior leadership.
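To illustrate how an alert like this can be kept alongside our code, the sketch below creates a metric monitor through Datadog's API client and routes it to PagerDuty; the metric name, threshold, and notification handle are invented, and it assumes the PagerDuty integration is already configured:

```ts
// Sketch only: metric, threshold, and PagerDuty handle are invented, and
// API/app keys are expected in DD_API_KEY / DD_APP_KEY environment variables.
import { client, v1 } from '@datadog/datadog-api-client';

const configuration = client.createConfiguration();
const monitors = new v1.MonitorsApi(configuration);

async function createLatencyMonitor() {
  return monitors.createMonitor({
    body: {
      name: 'Checkout request latency is high',
      type: 'query alert',
      // Hypothetical custom metric; in practice this would be one of the AWS
      // integration metrics or an APM-generated metric.
      query: 'avg(last_5m):avg:shopfront.request.latency{env:production} > 2',
      message:
        'Average request latency has been above 2s for 5 minutes. @pagerduty-Platform-OnCall',
      tags: ['team:platform', 'service:shopfront'],
      options: { thresholds: { critical: 2 } },
    },
  });
}

createLatencyMonitor().catch((err) => console.error(err));
```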
Datadog has improved our cloud-native monitoring significantly, as CloudWatch doesn't have enough features to create robust, sustainable dashboards that present all the information in an aggregated manner, in one place, for a combination of applications, databases, and other services, including our UI applications.
RUM monitoring is also something we didn't have before Datadog. We had Splunk, which was a lot harder to set up than Datadog's custom RUM metrics and its dashboards.
I really enjoy the RUM monitoring features of Datadog. It allows us to monitor user behavior in a way we couldn't before.
It's useful to be able to obfuscate sensitive information by setting up custom RUM actions and blocking the default ones with too much data.
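As a small illustration of the knobs involved (example values only, not our configuration), automatic action collection can be switched off, replays masked, and a custom action reported with just the attributes we choose:

```ts
// Example values only; the point is which options control data capture.
import { datadogRum } from '@datadog/browser-rum';

datadogRum.init({
  applicationId: '<RUM_APPLICATION_ID>',
  clientToken: '<CLIENT_TOKEN>',
  site: 'datadoghq.com',
  trackUserInteractions: false, // block the default auto-collected actions
  defaultPrivacyLevel: 'mask',  // mask text and input content in replays
});

// Report a hand-picked custom action with only the attributes we are
// comfortable storing (no names, addresses, or payment data).
datadogRum.addAction('checkout_submitted', {
  cartSize: 3,
  paymentMethod: 'card',
});
```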
I also like being able to generate custom metrics and monitors by adding facets to existing logging. Datadog can parse logs well for that purpose. The primary method of error detection for our external website is synthetic tests. This is extremely valuable for us as we have a large user base.
At times, it can be hard to generate metrics out of logs. I've seen some of those break over time and produce flaky data.
Creating a monitor out of the metric and using it in a dashboard to generate our SLIs and SLOs has been hard, especially in cases where the data comes from nested logging facets.
I've used the solution for two years.
The stability is pretty good.
The solution is pretty scalable! However, it's hard to set up all the infrastructure (Terraform code) required to connect Datadog over PrivateLink to all of our different AWS accounts.
They offer good support, and solutions are provided by the team when needed. For example, we had to delete all our RUM metrics when we accidentally logged sensitive data, and the CTO of Datadog stepped in to help out and prioritize it at the time.
Positive
We previously used Splunk and some internal tools. We switched due to the fact that some cloud applications don't integrate well with pre-existing solutions.
The initial setup for connecting our different AWS accounts via Datadog PrivateLink wasn't great. There was a lot of duplicate Terraform that had to be written. The dashboard setup is way easier.
We installed it with the help of a vendor team.
Our return on investment is great; it is so much better than CloudWatch, and we can easily integrate with PagerDuty for alerting.
Our company set up the product for us, so the engineers didn't need to be involved with pricing.
The pricing structure isn't very clear to engineers.
We looked into Splunk and some internal tools.
One of the things we use it for is the same thing that we use FullStory for, which is to replay customer interactions with our platform. However, it also does monitoring; it's like a cloud monitoring tool. We're mostly monitoring our own software to make sure that everything is functioning properly. We can check a bunch of things, and we can even play back customer sessions. It's basically monitoring our application.
It really provides a lot of visibility in terms of how our software is working. If there are any problems, it surfaces them right away. We get alerts in Slack. It's really an essential tool for a company that provides software as a service.
I really like the replay feature, the ability to replay sessions. I'm in sales engineering, so I sometimes need to know what my prospects are doing during a proof of value. I can actually see all the mouse movement and the clicking on buttons, so I can tell what they've been doing. There's a lot of other monitoring functionality as well. The development team uses it for monitoring and finds it very helpful.
It's been central to many different things. The dashboards and the performance of the software have been great.
I haven't really noticed anything that they could improve upon. Maybe they could add in some features to go both ways, to maybe make some configuration changes, etc. That's a little bit outside of what Datadog does, though. It's really very full-featured, so I don't really have any complaints.
I haven't really fully looked at the documentation, as I know where I need to go to look at things. The user experience could probably be a little better. There are so many functions that navigating your way around is sometimes a little hard. They have a really nice menu system; however, there's so much there. It's possible that I skipped a guided tour when I started.
It’s not intuitive to everyone. There are a lot of technical features.
I’ve been using the solution for the last five months. However, the company may have used it for a year and a half.
The solution has been stable and reliable.
We haven’t had a problem with scalability. It’s been good.
We have 25 to 30 users on it currently. Our entire organization is under 60 people. Although not everyone is on it, a lot of our staff are. The sales, engineering, and customer success teams are all on it.
We may increase usage. No doubt that will come naturally with time. We’re hiring more people, and likely new hires will use it.
I have not had occasion yet to reach out to support.
We’ve also been using FullStory.
I wasn’t part of the implementation. The one thing I will say is that when they added the functionality to review sessions, it made our use of another product, FullStory, almost obsolete. I'll have to see if we will continue using FullStory or if we can rely completely on Datadog.
I am a customer and end-user.
We’re on the most recent version and keep it updated.
I’d rate it nine out of ten. The user experience could be slightly better.
We are using the solution for migrating out of the data center. Old apps need to be re-architected. We are planning on moving to multi-cloud for disaster recovery and to avoid vendor lockouts.
The migration is a mix between an MSP (Infosys) and in-house developers. The hard part is ensuring these apps run the same in the cloud as they do on-premises. Then we also need to ensure that we improve performance when possible. With deadlines approaching quickly, it's important not to cut corners, which is why we needed observability.
Using the product has caused a paradigm shift in how we deploy monitoring. Before, we had a one-to-one lookup in ServiceNow. This wouldn't scale, as teams wouldn't be able to create monitors on the fly and would have to wait on us to contact the ServiceNow team to create a custom lookup. Now, in real-time, as new instances are spun up and down, they are still guaranteed to be covered by monitoring. This used to require a change request, and now it is automatic.
For use, the most valuable features we have are infrastructure and APM metrics.
The seamless integration between Datadog and hundreds of apps makes onboarding new products and teams a breeze.
We rely heavily on the API crawlers Datadog uses for cloud integrations. These allow us to pick up and leverage the tags teams have already deployed without also having to make them add the tags at the agent level. Then we use Datadog's conditionals in the monitor to dynamically alert hundreds of teams.
With the ServiceNow integration, we can also assign tickets based on the environment. Now our top teams are using the APM/profiler to find bottlenecks and improve the speed of our apps.
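As a rough illustration of those monitor conditionals, the notification message can branch on tags; the handles and team names below are invented and assume the relevant integrations and monitor grouping are in place:

```ts
// An invented example of the `message` field of a monitor definition. The
// {{#is_match}} blocks route notifications by tag; the handles assume the
// ServiceNow, PagerDuty, and Slack integrations are configured.
export const monitorMessage = `
{{#is_match "env.name" "production"}}
  @servicenow-prod-instance @pagerduty-Platform-OnCall
{{/is_match}}
{{#is_match "team.name" "payments"}}
  @slack-payments-alerts
{{/is_match}}
CPU is above 90% on {{host.name}}.
`;
```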
The real issue with this product is cost control. For example, when logs first came out they didn't have any index cuts. This caused runaway logs and exploding costs.
It seems that admin cost control granularity is an afterthought. For example, synthetics have been out for over four years, yet there is no way to limit teams from creating tests that fire off every minute. If we could say you can't test more than once every five minutes, that would save us 5X on our bill.
I've used the solution for about three years.
The solution is very stable. There are not too many outages, and they fix them fast.
It is easy to scale. That is why we adopted it.
Before premium support, I would avoid using them as it was so bad.
Neutral
We previously used AppDynamics. It isn't built for the cloud and is hard to deploy at scale.
The initial setup was not difficult. We just had to teach teams the concept of tags.
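To give a sense of what that tagging amounts to in practice, here is an illustrative sketch, assuming a Node.js service using dd-trace, where env/service/version and a team tag are set once on the tracer:

```ts
// Illustrative only: env/service/version plus a team tag are set once on the
// tracer, so every span the service reports carries them, and monitors,
// dashboards, and ticket routing can key off those tags.
import tracer from 'dd-trace';

tracer.init({
  env: 'production',
  service: 'inventory-service',   // hypothetical service name
  version: '1.4.2',
  tags: { team: 'supply-chain' }, // hypothetical team tag
});

export default tracer;
```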
We did the implementation in-house. It was me. I am the SME for Datadog at the company.
The solution has saved months of time and reduced blindspots for all app teams.
I'd advise users to be careful with logs and the APM as those are the ones that can get expensive fast.
We looked into Dynatrace. However, we found the cost to be high.
We are using the solution from a monitoring and management perspective. We use it for alerts.
Real user monitoring and session analytics, as well as the APM, are the solution's most valuable aspects.
We could easily identify the production box, and we could handle alerts.
The initial setup is very straightforward.
It's stable.
They have Slack integrations, mail integrations, and other options as well.
We haven't done a deep analysis at this point to identify the disadvantages.
We have noticed that Session Replays are unavailable on the mobile app. We'd like mobile app integrations. They could have better log reporting. That said, almost everything that we require is there.
We've only used the solution for two months. We just wanted to understand the complexities and the benefits that we might get from this product.
The solution is stable and reliable. The performance is good. There are no bugs or glitches. It doesn't crash or freeze.
It may be scalable. However, we haven't actually tested anything.
We have six or seven people on the solution at any given time. They're there to handle fixes or do an analysis of logs or monitor the infrastructure.
I'm not sure if we plan to increase usage. We are not growing very rapidly. We're just preparing our environment for the upcoming year.
We've dealt with support. They are helpful and responsive.
Positive
We have used AWS CloudWatch and the monitoring features built into AWS; however, we thought it would be good to have a separate application to do all these kinds of monitoring and to handle the alerting mechanisms.
We were looking for the best available platform on the market, one that would also give us better flexibility while having everything come from one place. We found that Datadog is good enough for now, and we are still exploring what it can do and what our options are. We are not completely all-in with Datadog; we've been exploring it for only four or five months.
It's an easy product to set up. The integration was very smooth and they have multiple options. It was pretty good integration-wise.
The setup itself only took a day or two. It doesn't take long.
We handled the setup ourselves in-house. We did not need any outside assistance.
Some features, such as APM, RUM, and Session Replay, were quite expensive. We are using it in a very limited way, not for every microservice, and on a minimal budget. As we are at an exploratory phase, we are doing pay-as-you-go. Once we feel confident, we will likely get a yearly license.
We are customers.
We're working on the latest version; I can't recall the exact version number.
I'd rate the solution six out of ten. I need to evaluate the solution further.