We use it to store editorial content.
We started out on the on-premise version, then moved to the AWS version.
I don't have to worry about upgrades with the AWS version.
The on-premise version is very difficult to upgrade.
We run the agent in AWS.
It has empowered all our platform engineers with a very powerful and easy-to-use monitoring system. Most of our platform organization is now involved in monitoring. Previously, only a handful of platform engineers were involved, because Graphite and Sensu were so cumbersome to use.
It is incredibly easy to do common monitoring actions:
Very rarely. Maybe only once or twice that we noticed. It is very reliable.
No.
It is excellent. The web app has a real-time support chat window in which a support engineer is chatting with you within a minute. That is the "right" way to do support.
We previously ran Graphite and Sensu ourselves. By moving to Datadog, we did not need to manage our own monitoring infrastructure anymore. Graphite was somewhat complex to run.
Initial setup is easy. Install the agent and send it metrics. There are StatsD/Datadog libraries available for most languages.
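To give a sense of how little is involved in "send it metrics", here is a minimal sketch that builds a DogStatsD-style datagram by hand and sends it over UDP. It assumes an agent listening on the default local StatsD port (8125). The metric name and tags are hypothetical, made up for illustration, and in practice one of the existing client libraries would normally be used instead of raw sockets.

```python
import socket


def format_metric(name, value, metric_type="c", tags=None):
    """Build a datagram of the form name:value|type|#tag1,tag2.

    metric_type is a StatsD type code such as "c" (counter) or "g" (gauge).
    """
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload


def send_metric(payload, host="127.0.0.1", port=8125):
    """Send one datagram over UDP; no reply is expected (fire-and-forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name and tags, for illustration only.
    datagram = format_metric(
        "editorial.articles.published", 1, "c", ["env:prod", "team:platform"]
    )
    print(datagram)
```

Calling `send_metric(datagram)` would then hand the datagram to the local agent. Because UDP is connectionless, the call does not confirm that anything received it.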
Pricing seems reasonable. It depends on the size of your organization, the size of your infrastructure, and what portion of your overall business costs go toward infrastructure. It is hard to say without looking at all of this.
We looked at several competitors at the time (Summer 2016). There did not seem to be any compelling alternatives. Once we did the PoC with Datadog, we loved it and decided to move forward.
Try it out and see if you like it.
We can build dashboards as fast as we roll out new systems, which can be fast.
We use standard and custom metrics for every new system we roll out, for 360-degree visibility into our systems.
The most valuable features have been sharable dashboards, TimeBoards, the dogstatsd API, Slack integration, the event logging API, CloudTrail events, tags, alerts, and anomaly detection, as well as EBS Volume Snapshot Age, which they added upon request. We used the PagerDuty integration for a while as well.
More granular control over dashboard sharing. Timeboard sharing.
There are infrequent hiccups, which have been decreasing over the time we have used it.
No.
Customer Service:
Never seen better. Questions are usually answered almost immediately, even on weekends. An in-stream with your event stream.
Technical Support:
High.
Overall they have always had an amazing team, and quality has been maintained as the company has grown.
Complementary to other tools we used.
Setup is generally easy. They provide a large number of integrations. Some are more complex than others, which is to be expected.
In house implementation.
We didn’t calculate explicitly, but as we used the product to track down underutilized instances, it more than paid for itself in the first month.
Pricing overall in this segment has standardized in the last several years.
A few, including Zabbix and Icinga.
One of the fastest and most flexible tools we have used in this area.
We were building a real-time bidding exchange for digital out-of-home ads by providing the analytics and the infrastructure for the ecosystem. We not only facilitate the buys, but we also act as an ad server for the network and advertisers who put in their server requirements. It is very similar to the large online ad servers. In order to provide this real-time service, we had to be able to monitor and analyze a wide range of data points for the many media companies in their ad exchange.
We wanted to find an “out-of-the-box” metrics solution that would integrate easily with current systems. We found that Datadog included the integrations that we needed to get our monitoring solution up quickly. Additionally, we liked Datadog’s ability to perform customized metric monitoring, log critical events, and scale easily. I had scaled up open-source monitoring solutions once before, and it wasn’t fun. So when I had the opportunity to do it again, I said ‘No.’
On a daily basis, Datadog eliminates a lot of the back and forth with the media owners to find out what is going on. It is a good visual tool for seeing the activity from each of the content management systems that we work with, and an easy way to go in and get a feel for what is going on. Without Datadog, we would have to repeatedly spend time reaching out to each media owner directly to see if they are sending any requests that day.
As we moved toward a real-time system, it was important to understand whether our partner networks had reported all of their ads in a short period of time. We now use Datadog as a daily monitoring tool to get a feel for what networks are actively sending requests to their servers, what live campaigns are running smoothly and whether the right networks are requesting ads. I always have Datadog up on a daily basis to debug issues with integrations or just to make sure that live campaigns are running smoothly.
Each network that we work with has different requirements. In terms of network connectivity, we are always dealing with different frequencies and rates of requests each day. This wide diversity among partners has made customization key, since the monitoring needs of each partner vary drastically. There are days when we see odd trends in incoming requests. We are able to use Datadog’s custom graphs to catch anomalies or areas of concern in their systems in real time.
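As a rough illustration of what "catching anomalies" in per-network request rates can mean, the sketch below flags points that sit far from the mean of a series. This is a plain z-score check written for illustration, not Datadog's anomaly detection, and the request counts and the threshold of 2.0 are hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(counts, threshold=2.0):
    """Return indices of points more than `threshold` sample standard
    deviations away from the mean of `counts`."""
    if len(counts) < 2:
        return []  # not enough data to estimate spread
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        return []  # perfectly flat series, nothing stands out
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]


if __name__ == "__main__":
    # Hypothetical hourly ad-request counts from one partner network.
    hourly_requests = [100, 102, 98, 101, 99, 100, 500]
    print(flag_anomalies(hourly_requests))
```

A single large spike also inflates the standard deviation, so on short series a check like this is blunt. It is meant only to show the idea behind watching for unusual request patterns per network.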
Going forward, we are looking to create separate dashboards for each ad network that we work with. This will enable us to gain more detailed metrics for each individual media owner. We will be able to take these detailed metrics and use them to quickly identify potential problem areas, improving our ability to solve problems as soon as they are detected.
We are providing managed services to our customers across multiple industries. Datadog is key to delivering these services by bringing the observability, monitoring, and alerting capabilities we need to operate at scale.
We operate custom cloud native workloads as well as ISV products such as Atlassian Jira or Confluence.
Integrating Synthetics, infrastructure, and application performance monitoring, as well as piping all logs through Datadog, allows us to do more with less, with good alerting at the right time.
We are providing managed services to our customers across multiple industries.
Datadog is key to delivering these services. It brings in observability, monitoring, and alerting capabilities - all of which we need to operate at scale.
We operate custom cloud native workloads as well as ISV products such as Atlassian Jira or Confluence.
Integrating Synthetics, infrastructure, and application performance monitoring, as well as piping all logs through Datadog, helps with getting alerts in real time.
We are providing managed services to our customers across multiple industries.
Datadog delivers observability, monitoring, and alerting capabilities we need to operate at scale.
We also operate custom cloud native workloads as well as ISV products such as Atlassian Jira or Confluence. Integrating Synthetics, infrastructure, and application performance monitoring, as well as piping all logs through Datadog, allows us to operate while receiving alerts in real time.
The current way accounts are billed could be vastly improved - especially when involving multiple organizations across multiple accounts in combination with reserved commitments.
Being able to have an automatic materialized report on certain dashboards that could be exported as PDF to be shared with non-Datadog users could help a lot.
Other than that, we are more than happy with the features we use regularly.
We have been using Datadog since 2015.
We use Datadog incident management for our incident tooling. Whenever we run into an incident, we try to use it. It allows us to create a separate Slack channel for it.
Keeping track of incidents makes it possible to share learnings and follow up. It is also required for SOC2 compliance.
It is great that creating an incident is possible from Slack while having all the relevant data in Datadog. It means we do not have to think too much about organizing incidents, which are highly stressful situations.
I would like the tooling to have better integration in Slack, specifically sending out reminders to the relevant people to take breaks, do a retrospective, and specify with emojis which messages to log.
Learning about the tooling could also be improved. It is a bit clunky to use, especially in an incident. It could definitely be more streamlined.
I've used the solution for about a year.
We're moving towards the cloud yet still have several active data center contracts. As we move to the cloud, we are interested in knowing more about our services, and Datadog APM/logs should give us this perspective.
We currently use the infrastructure monitoring part of Datadog. Still, I've really seen the advantage of moving more data into the cloud for comparison and being able to have one place where we can view all related pieces of information regarding a possible incident or potential issue.
We've been able to glean from the monitors which servers are down, and can alert the team in Slack. Knowing what we need to do next is where we would like to get to, so seeing the power of Notebooks is key. We also have several other services that we are underutilizing (logging, error tracking, etc.) that would be better housed in Datadog, since it gives us more visibility by linking everything together into one cohesive picture.
Monitoring has been invaluable, and as we start to look to other products, bringing in logs and APM traces will create a full picture of what we need to do to resolve incidents. We're also interested in our users' perspectives and when things are slowing down for them.
Additionally, we are interested in the workflows announced today as an actionable decision tree that will allow our teams to see what is wrong and know what to do based on the connected runbook/notebook. I feel that this is the final piece that will make Datadog so much more valuable than its competition.
I have talked to vendors who mention that Datadog draws attention to an issue, but you still need to take action on your own. More tools that allow you to run AWX playbooks, or apply other similar fixes, would benefit clients greatly.
We've used the solution for two years.