What is our primary use case?
My main use case for Dynatrace is log analysis and metric tracking, mostly API latency, to monitor our services and microservices as a whole.
To expand on that: first, we create custom dashboards, which are very helpful for monitoring multiple microservices, since our application relies on many services. Second, we use the error detection features to see the rate of 4xx and 5xx errors and to separate client-side problems from server-side ones. We use the built-in Dynatrace alerting for latency and error classification, and we route those alerts to channels such as Slack, email, or, most often, PagerDuty, so the on-call person receives them.
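As a rough illustration of the 4xx/5xx classification and alert routing described above, here is a minimal Python sketch. It does not use the Dynatrace API; the function names, the thresholds, and the mapping of error types to channels are all hypothetical assumptions chosen for the example.

```python
from collections import Counter

def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse error class."""
    if 400 <= status < 500:
        return "client_error"   # 4xx: problem on the caller's side
    if 500 <= status < 600:
        return "server_error"   # 5xx: problem on the service's side
    return "ok"

def error_rates(statuses):
    """Return the fraction of 4xx and 5xx responses in a list of status codes."""
    total = len(statuses)
    if total == 0:
        return {"client_error": 0.0, "server_error": 0.0}
    counts = Counter(classify_status(s) for s in statuses)
    return {kind: counts[kind] / total for kind in ("client_error", "server_error")}

def route_alert(rates, server_threshold=0.05, client_threshold=0.20):
    """Pick a notification channel; thresholds and channels are illustrative only."""
    if rates["server_error"] >= server_threshold:
        return "pagerduty"      # assumed: server-side errors page the on-call person
    if rates["client_error"] >= client_threshold:
        return "slack"          # assumed: client-side spikes go to a team channel
    return None                 # below both thresholds: no alert
```

With 100 sampled responses containing six 5xx and four 4xx codes, this sketch would route the alert to the paging channel, because the server-side rate crosses the assumed 5% threshold.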
What is most valuable?
The best features Dynatrace offers include AI-based metrics collection, which impresses me because it makes it quick to check every issue that comes up. The first valuable feature is end-to-end visibility across all services: we can see the infrastructure, hosts, containers, and VMs, and what is happening with the application. Automatic discovery and mapping also helps a lot. Distributed tracing with PurePath is very helpful. For root cause analysis, the recent Davis AI is a game changer that saves us a lot of time; it is the main feature for me right now. The remaining features are foundational, and Davis AI makes them quick to access because it builds on those existing capabilities, which I now use through it.
The root cause analysis feature saves my team time because PurePath traces every subsequent call and provides detailed timings, including calls to external APIs and caches. We work not only with our own microservices but also with multiple vendors, and sometimes the latency comes from a vendor system rather than ours. In those cases, we can see that a service is slow because it is waiting on a downstream call. These small details on database and cache timings add a lot of value to PurePath. Davis AI also detects anomalies and root causes across the stack, which is valuable. Application topology and network flows are very helpful as well. These are the features that help us most.
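The idea of reading per-call timings to tell whether the delay is in our own service or in a vendor call can be sketched as follows. This is not how PurePath is implemented and it does not use any Dynatrace API; the `Span` type, its fields, and the assumption that child calls run one after another without overlapping are all hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

@dataclass
class Span:
    """One timed call in a trace (hypothetical structure for illustration)."""
    name: str
    duration_ms: float
    external: bool = False              # True when the call goes to a vendor system
    children: List["Span"] = field(default_factory=list)

def self_time_ms(span: Span) -> float:
    """Time spent in the span itself, excluding time spent waiting on child calls.

    Assumes children run sequentially and do not overlap.
    """
    waiting = sum(child.duration_ms for child in span.children)
    return max(0.0, span.duration_ms - waiting)

def flatten(span: Span) -> Iterator[Span]:
    """Yield the span and all of its descendants, depth first."""
    yield span
    for child in span.children:
        yield from flatten(child)

def slowest_contributor(root: Span) -> Span:
    """Return the span with the largest self time in the whole trace."""
    return max(flatten(root), key=self_time_ms)
```

For a 900 ms request whose children are a 40 ms cache lookup and a 780 ms external vendor call, the slowest contributor would be the vendor call, which matches the situation where our service is slow only because it is waiting.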
Before moving on, I want to highlight the custom dashboards and visualizations in Dynatrace, which I have not said much about. The real-time charts are very helpful for monitoring application health. Heat maps are great to look at, and service flow maps are essential for understanding the system. The key performance indicators are very good. These are the core views of my application metrics. Whenever I am the on-call person in PagerDuty, these are the things I look at first. They serve as my reference book for making sure the application is in good health.
Dynatrace has positively impacted my organization by letting us act very quickly whenever there is a production issue. We can open the dashboard, see where the errors are occurring, and perform a quick analysis to find the root cause. That is a huge part. The second part is improvement: when we reach a stable state and have time to address technical debt, we can check which APIs are causing latency, ask why, and work on those issues. Sometimes what we see in the dashboards leads us to redesign a domain. Day to day, we do not always know how many services exist or why a particular service serves a given domain. Seeing all services in one place in the Dynatrace dashboard prompts us to ask why one service is talking to another. We can then classify services at the domain level and regroup the domains. In this way, it gives us a mental model and a mental map to track everything. That is a huge win for us.
What needs improvement?
I think Dynatrace can improve dashboard creation, since dashboards are the face of the product. Dynatrace has many capabilities, but new users typically do not know the full power of the product. To onboard new people, it would help to provide template or suggested dashboards. Organizations with a microservice architecture tend to follow common patterns, so suggesting a few simple, hand-picked features that are really helpful could make a difference. Quickly configurable and customizable templates would be very useful. New users could apply a template as soon as they start and see their own product reflected in it. They would immediately feel that everything is up and running and could improve from there. Dynatrace has a lot of features now, and the AI integration makes it simpler to use, but the product itself is very large. To help users understand and use it, it would be good to index the essential features.
For how long have I used the solution?
I have been using Dynatrace for three years.
What do I think about the stability of the solution?
Dynatrace is stable.
What do I think about the scalability of the solution?
Dynatrace's scalability is very good.
How are customer service and support?
The customer support for Dynatrace is good.
Which solution did I use previously and why did I switch?
Previously, we used Grafana. We switched because Dynatrace provides better overall metrics. The switch was an organization-wide decision rather than mine, so I did not have much influence over it and cannot comment on it further.
What was our ROI?
There is definitely a return on investment, mainly in time saved. Operations also no longer needs the same level of staffing, so we can reduce a few positions.
What's my experience with pricing, setup cost, and licensing?
In my experience, pricing, setup cost, and licensing are a bit high, but very much worth it. We treat the license as something we must maintain, because Dynatrace is our alerting and metrics system. The product I work on is critical, with millions of users coming and going, and Dynatrace is its health monitor. It is essential, so the spend is justified.
Which other solutions did I evaluate?
Before choosing Dynatrace, we evaluated Grafana and considered moving from it to Dynatrace because of specific benefits. We weighed the trade-offs and ultimately chose Dynatrace.
What other advice do I have?
I would advise others looking into Dynatrace to set up as many metrics as possible for the product they are monitoring and to create dashboards to track them.
Overall, Dynatrace is a very good product, and I would definitely recommend it. It is especially useful for addressing production issues and for health monitoring. I rate this product an eight out of ten.
Which deployment model are you using for this solution?
Private Cloud