I primarily use the solution to track and monitor business and engineering metrics in my team's production and QA environments.
We create monitors on key business metrics and observe regressions and anomalies.
Less often, I use Datadog's events feature to get notified about significant activity in my teams' deployments.
We are notified of Datadog monitor alerts through Slack, and we often attempt to create SLOs using Terraform.
We use APM for observability.
Most recently, I learned about Watchdog alerts, which I plan to explore in depth.
Datadog made it easy for me to watch, and add monitors on, any metric emitted by any team at my organization.
Datadog APM immensely improved our ability to understand the reasons behind production issues. Being able to navigate seamlessly across services and see the time spent at each critical stage of a production request is helpful. Combined with Datadog's long-standing ability to show business metrics alongside, this helped us get more powerful insights much more quickly.
Datadog's seamless integration with Slack and PagerDuty helped us receive alerts directly through the notification channels we use most (our mobile devices and Slack).
The most valuable aspects include:
- The ability to monitor any team's metric in my company (transparency)
- The ability to create/clone dashboards for myself (ease of use)
- Its integration with Slack (it is very powerful)
- The ability to add monitors on any metric emitted by any team at my organization
- (Through Datadog APM) the ability to understand the reasons behind production issues. Being able to navigate seamlessly across services and see the time spent at each critical stage of a production request is key. Combined with Datadog's long-standing ability to show business metrics alongside, this helped me get more powerful insights much more quickly.
- (Through integrations like Slack and PagerDuty) the ability to receive alerts directly through the notification channels we use most (our mobile devices and Slack), which saves a lot of time and helps us maintain focus.
I would like better navigability across pages. The UI/UX is powerful but not very intuitive. Often, I click through several buttons and pages and then forget how to get back to a particular view that was more insightful.
This matters particularly as Datadog starts offering more platform capabilities, such as APM, Watchdog, shift-left initiatives (instrumentation, continuous testing, and the intelligent test runner), and synthetic and real user monitoring. The UI can become more and more clunky, giving users a very frustrating experience.
I've used the solution for five to six years.