Deivid Viana - PeerSpot reviewer
Enterprise Architect at Vale
Real User
Top 5
Comes with good documentation and clear dashboards
Pros and Cons
  • "Datadog has clear dashboards and good documentation."
  • "The solution needs to integrate AI tools."

What is most valuable?

Datadog has clear dashboards and good documentation. 

What needs improvement?

The solution needs to integrate AI tools. 

How are customer service and support?

I get support from our internal team. 

What other advice do I have?

I rate Datadog a nine out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Software Engineer at a financial services firm with 10,001+ employees
Real User
Helpful support, good RUM monitoring, and nice dashboards
Pros and Cons
  • "I really enjoy the RUM monitoring features of Datadog. It allows us to monitor user behavior in a way we couldn't before."
  • "At times, it can be hard to generate metrics out of logs."

What is our primary use case?

We use it to monitor and alert on our ECS instances as well as other AWS services, including DynamoDB and API Gateway. 

We have it connected to PagerDuty for alerting on all our cloud applications. 

We also use custom RUM monitoring and synthetic tests for both our internal and public-facing websites. 

For our cloud applications, we use Datadog to define our SLOs and SLIs and to generate dashboards that track the SLOs and report them to our senior leadership.
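As a rough illustration of the arithmetic behind an SLI and an SLO, here is a minimal Python sketch. It is not Datadog's API or the reviewer's setup; the function names and the sample numbers are hypothetical and chosen only to show how a ratio-based SLI compares against an SLO target.

```python
def sli(good_events: int, total_events: int) -> float:
    """Ratio-based SLI: fraction of events that met the objective.

    Returns 1.0 when there was no traffic (an assumption made for this sketch).
    """
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_remaining(sli_value: float, slo_target: float) -> float:
    """Share of the error budget still unspent.

    1.0 means nothing has been spent; a negative value means the SLO is breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target
    observed_failure = 1.0 - sli_value
    return (allowed_failure - observed_failure) / allowed_failure


# Hypothetical numbers: 999 successful requests out of 1,000 against a 99% SLO.
availability = sli(good_events=999, total_events=1000)
budget_left = error_budget_remaining(availability, slo_target=0.99)
```

With these sample numbers the SLI is 0.999 and roughly 90% of the error budget remains, which is the kind of figure a leadership-facing SLO dashboard would typically surface.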

How has it helped my organization?

Datadog has significantly improved our cloud-native monitoring. CloudWatch doesn't have enough features to create robust, sustainable dashboards that present all the information in aggregate, in one place, for a combination of applications, databases, and other services, including our UI applications. 

RUM monitoring is also something we didn't have before Datadog. We had Splunk, which was a lot harder to set up than Datadog's custom RUM metrics and its dashboards.

What is most valuable?

I really enjoy the RUM monitoring features of Datadog. It allows us to monitor user behavior in a way we couldn't before. 

It's useful to be able to obfuscate sensitive information by setting up custom RUM actions and blocking the default ones with too much data. 

I also like being able to generate custom metrics and monitors by adding facets to existing logging. Datadog can parse logs well for that purpose. The primary method of error detection for our external website is synthetic tests. This is extremely valuable for us as we have a large user base.

What needs improvement?

At times, it can be hard to generate metrics out of logs. I've seen some of those break over time and produce flaky data. 

Creating a monitor out of the metric and using it in a dashboard to generate our SLIs and SLOs has been hard, especially in cases where the data comes from nested logging facets.

For how long have I used the solution?

I've used the solution for two years.

What do I think about the stability of the solution?

The stability is pretty good.

What do I think about the scalability of the solution?

The solution is pretty scalable! However, it's hard to set up all the infra (Terraform code) required to link private links in Datadog to all of our different AWS accounts.

How are customer service and support?

They offer good support. Solutions are provided by the team when needed. For example, we had to delete all our RUM metrics when we accidentally logged sensitive data and the CTO of Datadog stepped in to help out and prioritize it at the time.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We previously used Splunk and some internal tools. We switched due to the fact that some cloud applications don't integrate well with pre-existing solutions.

How was the initial setup?

The initial setup for connecting our different AWS accounts via Datadog private link wasn't great. There was a lot of duplicate Terraform code that had to be written. The dashboard setup is way easier.

What about the implementation team?

We installed it with the help of a vendor team.

What was our ROI?

Our return on investment is great and is so much better than with CloudWatch. We can easily integrate with PagerDuty for alerting.

What's my experience with pricing, setup cost, and licensing?

Our company set up the product for us, so the engineers didn't need to be involved with pricing. 

The pricing structure isn't very clear to engineers.

Which other solutions did I evaluate?

We looked into Splunk and some internal tools.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Datadog
April 2024
Learn what your peers think about Datadog. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
768,246 professionals have used our research since 2012.
Jon Schwartz - PeerSpot reviewer
Senior Software Engineer at LeafLink
Real User
Good log stream with a useful APM and democratizes logs
Pros and Cons
  • "Datadog's log aggregation is really helpful since it lets me and every other engineer on my team log in, view, and share logs when we need to debug our application."
  • "The menu on the left is pretty dense (and I know it has to be). I never knew about the cmd+k functionality until recently. It would be helpful to offer more tips/cheat sheets to see handy shortcuts like that."

What is our primary use case?

We use Datadog to view and aggregate logs and monitor all of our services. We have a lot of running infrastructure and it is very convenient to have logs and metrics all aggregated somewhere we can view and chart them. 

I use Datadog to create dashboards and runbooks, and sharable graphs, which really help out my whole team. We mostly use logs and APM, yet have been starting to use other products. I would like to use more synthetic monitors.

How has it helped my organization?

It has democratized our logs and metrics, allowing all engineers to have insight into how our apps perform. It is also extremely helpful when debugging issues. 

It would be very difficult to debug issues without aggregated logs and APM traces. 

It has also definitely saved us some money since we can keep an eye on our running infrastructure in an easy-to-see way, rather than a less friendly CLI. It has been a very big help!

What is most valuable?

The log stream has been the most useful thing. Having so many logs on so many different running containers means it is very inconvenient to view them individually. Datadog's log aggregation is really helpful since it lets me and every other engineer on my team log in, view, and share logs when we need to debug our application. 

APM has also been extremely helpful for debugging issues and profiling and optimizing our apps. Dashboards have also been really helpful for communicating needs and priorities to engineering leadership. 

It is very easy to get buy-in with graphs to back things up.

What needs improvement?

I recently saw the educational offerings, and they are amazing. Events like DASH are extremely helpful in understanding the deep set of features. Anything that helps to educate users is a huge win here. 

The menu on the left is pretty dense (and I know it has to be). I never knew about the cmd+k functionality until recently. It would be helpful to offer more tips/cheat sheets to see handy shortcuts like that.

For how long have I used the solution?

I've used the solution for three years. 

Which solution did I use previously and why did I switch?

We previously used AWS CloudWatch Logs. It was far less friendly and less fully featured.

How was the initial setup?

The solution is pretty straightforward to set up. It helps with logs and metrics, and the AWS integration is really great.

What about the implementation team?

We handled the implementation in-house.

What other advice do I have?

It is hard to educate an entire team. There is a big learning curve.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Software Engineer at Enable Medicine
User
Good technical documentation and overall education with improved visibility
Pros and Cons
  • "We've found it most useful for managing Rstudio Workbench, which has its own logs that would not be picked up via Cloudwatch."
  • "We primarily use the log management functionality, and the only feedback I have there is better fuzzy text searching in logs (the kind that Kibana has)."

What is our primary use case?

We primarily use the solution for log monitoring across our entire cloud infra (EB, EC2, Batch, and Lambda).

This is in addition to RStudio Workbench, which has its own logs that would not be picked up via CloudWatch (https://docs.rstudio.com/ide/server-pro/server_management/logging.html#default-log-file-locations). 

We own several dozen of these servers, and we used to manage instance logs by tailing logs when incidents occurred. Datadog allows for much better visibility across our entire fleet and has saved us countless hours.

How has it helped my organization?

It is now way easier to search in one place rather than across all of CloudWatch (and needing to know log groups, etc.). 

Primarily, we run several separate deployments of RStudio Workbench, which has its own logs that would not be picked up via CloudWatch. 

We own several dozen of these servers. We used to manage instance logs manually. 

Datadog allows for much better visibility.

What is most valuable?

We've found it most useful for managing RStudio Workbench, which has its own logs that would not be picked up via CloudWatch. 

Datadog allows for much better visibility across our entire fleet and has saved us countless eng hours as a result. 

We plan on trying out offerings such as APM moving forward too.

Some things that Datadog does very well:

  • Technical documentation (the docs are clear, concise, and include realistic code samples)
  • Overall education efforts (e.g. the codelabs/workshops)

What needs improvement?

We primarily use the log management functionality, and the only feedback I have there is better fuzzy text searching in logs (the kind that Kibana has). 

I've learned about a ton of other offerings, like APM, NPM, etc., over the course of workshops. Once I try those out, I'm sure I will have additional feedback.

For how long have I used the solution?

I've used the solution for one year. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Rawat Singhsatit - PeerSpot reviewer
Solutions Consultant Manager at MFEC
Consultant
Top 10
Stable cloud monitoring solution that is easy to use and deploy and is budget friendly
Pros and Cons
  • "Datadog is easy to use and easy to deploy. It's a better solution compared to others on the market in terms of being budget friendly for our customers."
  • "Datadog could be improved if it could detect other software in a container or server."

What is our primary use case?

We use this solution for our customer's IP and to support their cloud infrastructure.

What is most valuable?

Datadog is easy to use and easy to deploy. It's a better solution compared to others on the market in terms of being budget friendly for our customers.

What needs improvement?

Datadog could be improved if it could detect other software in a container or server. Datadog is better than other APM or observability tools, but it focuses mostly on telling the customer what they need to know about the software, database or applications that land on the server. We also need to know the version before setting up an agent with the APM modeling tool.

In some instances, ownership of a particular piece of software passes to another person, and the original owner did not transfer the knowledge or data needed to manage the server. The new person needs to monitor this server, and they need to know what software, and which version of it, was installed on the server before they use the APM agent for monitoring. If Datadog could provide this insight, it would improve how we use the solution. 

In a future release, we would like to be able to complete a network traffic or network flow analysis to detect the errors or problems on the network.

For how long have I used the solution?

I have been using this solution for two years. 

What do I think about the stability of the solution?

This is a stable solution. 

How was the initial setup?

The initial setup was straightforward. We needed two engineers for the deployment.

What's my experience with pricing, setup cost, and licensing?

This solution is budget friendly.

What other advice do I have?

Overall, Datadog is a good product to use and is easy to deploy.

I would rate this solution a nine out of ten. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
SRE at a financial services firm with 10,001+ employees
Real User
Excellent synthetic monitoring, APM, and alert features
Pros and Cons
  • "The monitoring functionality, in general, and tagging infrastructure are great."
  • "While the tool is robust with many different capabilities, users would greatly benefit from more examples in the documentation."

What is our primary use case?

We deploy various services for our main platform on AWS across multiple regions. We have a development environment, a staging environment, a QA environment, and a production environment. We deploy our many services across hundreds of instances. 

We have many server farms, all responsible for various services on our market intelligence platform. The deployment of each server farm or even individual instances varies depending on what stood up. We have instances built in three different ways, with two different pipelines and some even on user data scripts.

How has it helped my organization?

My team has a 24/7 on-call schedule where we need to be ready to handle and mitigate incidents with the platform at any moment. 

We have countless monitors set up on Datadog that alert directly to our queue using an email that generates a ticket. 

The actionable steps for each type of monitor and its associated incident are easily included in the alerts whenever something is triggered. We generate links to the Datadog monitors and can instantly drill down into what went wrong and for how long.

What is most valuable?

The features I have found most helpful are synthetic monitoring, APM, and alert features. The monitoring functionality, in general, and tagging infrastructure are great.

Synthetics have become bread and butter for us as we have migrated many tests over to Datadog. We have simplified and consolidated our synthetic tests while also making them more robust with the help of your tagging. 

A large portion of our monitoring is based on synthetics results, and alerts integrate seamlessly with our incident queue system. We use dashboards heavily. 

The metrics capabilities are extremely helpful, and we use virtually all of the widgets.

What needs improvement?

My main place of improvement for Datadog would be the documentation. While the tool is robust with many different capabilities, users would greatly benefit from more examples in the documentation. 

The number of current code snippets available in the docs is not enough, and some need to be updated even today. 

One function I would add would be a button to generate a report of the performance of a synthetic test and the performance of each of the steps in the test over time.

For how long have I used the solution?

This timeline varies in terms of how long we've used the solution. We have one platform completely in the cloud and one still on-premises. We've had the solution for many years on AWS.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
API Developer at a tech services company with 501-1,000 employees
Real User
Good monitoring, logging, and alert features
Pros and Cons
  • "Thanks to the logs, we manage to make better reports through Jira and also to trace the request with more facility than we would be able to do otherwise."
  • "When the logs are too big, and Datadog splits them, the JSON format breaks and it is not so useful for us."

What is our primary use case?

We use the solution for monitoring, logging, and alerts. 

Thanks to Datadog, we report errors using the logger integrated into our services, which is crucial since we only do unit tests. The infrastructure team handles the monitoring part, so I can't give more insights about that. I am an API developer, so I use Datadog mainly for logging.

The alerts are connected to Microsoft Teams in a specific channel, and we pay a lot of attention to it, and we usually create tickets based on these alerts.

How has it helped my organization?

Thanks to the logs, we manage to make better reports through Jira and also to trace requests more easily than we would be able to do otherwise. 

Since there are many teams in my company, being able to share the trace of an error, for example, together with all the information from the log, saves us a lot of time when it comes to communication between everyone.

What is most valuable?

The most valuable feature for me so far is logging. We do not do integration tests, so we rely a lot on tracing all the requests and we report errors to different teams in the company together with logs that we take from Datadog.

Since I am an API developer, I do not use the other features much. Also, I have been in the company for only four months. I have only worked with monitors and alerts.

I value tracing the request and being able to tell other teams which component, service, or line of code has an issue.

What needs improvement?

Since I have only been in the organization for four months, I only worked with the log, alerts, and monitoring. I do not have so many insights to share about what can be improved.

I am not an expert user, and not even an intermediate user yet. Rather, I am a beginner.

That said, when the logs are too big and Datadog splits them, the JSON format breaks, and the logs are not so useful for us.
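To illustrate why splitting an oversized JSON log line tends to break parsing, here is a minimal Python sketch. It does not reproduce Datadog's actual splitting behavior or size limits, which the review does not state; the 100-character limit, the `split_by_size` helper, and the sample record are all hypothetical.

```python
import json


def split_by_size(text: str, max_chars: int) -> list[str]:
    """Cut a string into consecutive pieces of at most max_chars characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def is_valid_json(text: str) -> bool:
    """Return True only if the text parses as a complete JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical log record with a long message field.
record = {"level": "error", "service": "orders-api", "message": "x" * 200}
line = json.dumps(record)

# Splitting purely by size ignores JSON structure, so each piece is a fragment.
chunks = split_by_size(line, max_chars=100)

whole_ok = is_valid_json(line)
fragments_ok = [is_valid_json(chunk) for chunk in chunks]
```

In this sketch the full line parses, but none of the size-based fragments do, because each one is cut in the middle of a string value. One common mitigation, independent of any particular tool, is to keep individual log entries small, for example by truncating long fields before serializing.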

For how long have I used the solution?

I've used the solution for four months.

Which solution did I use previously and why did I switch?

I did not previously use a different solution.

What's my experience with pricing, setup cost, and licensing?

As an API developer, I have no idea about the costs, but I am curious and will find out more about them.

Which other solutions did I evaluate?

I did not evaluate other options previously.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
VP, Application support at a financial services firm with 10,001+ employees
Real User
Good service catalog and dashboard but the application performance monitoring module needs more functionality
Pros and Cons
  • "The service catalog helped improve our organization by giving a good view of the flow for our microservices applications."
  • "The dashboard could be improved. It would be helpful to get a view of specific things that we need to monitor for our application."

What is our primary use case?

We primarily use the solution for the service catalog.

We use this type of offering for our microservices applications, and it gives a good view of flow. It is a must when we have different developers working on different services.

Having the trace and log features is useful for locating the microservice for the on-call person.

We would like to see some more useful applications for health monitoring where we can customize the cases based on data from the database.

It needs to have the facility to monitor data inside tables and the status of the UI.

How has it helped my organization?

The service catalog helped improve our organization by giving a good view of the flow for our microservices applications. It's important when we have different developers working on different services, and having the trace and log features helps the on-call person locate the microservice.

The application performance monitoring has also been useful. This module had a few functionalities that we needed for the application health check. This needs to have some more features to consolidate the view in one tree. We may need more of a one-stop shop on top of the dashboard, and that is missing in Datadog. We'd like to be able to scrap our existing monitoring tool.

What is most valuable?

The service catalog is very useful. We use this type of offering for our microservices applications, and it gives a good view of flow. It is a must when we have different developers working on different services. Having the trace and log features has been useful for locating the microservice for the on-call person.

The dashboard is great. It is helpful to get a view of specific things that we need to monitor for our application. It has been a good way to watch specific things and add them together.

The application performance monitoring is an excellent aspect. This module had a few functionalities that we needed for the application health check. This needs to have some more features to consolidate the view into one tree, however.

What needs improvement?

The dashboard could be improved. It would be helpful to get a view of specific things that we need to monitor for our application. However, it was a good way to watch specific things and add them together.

The application performance monitoring module had very few functionalities that we needed for the application health check. This needs to have some more features to consolidate the view into one tree.

For how long have I used the solution?

I've used the solution for one month. 

Which solution did I use previously and why did I switch?

We previously used ITRS Geneos.

What other advice do I have?

We are using the latest version of the solution. 

I'd rate the solution seven out of ten. 

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Provider
PeerSpot user