

Find out in this report how the two Application Performance Monitoring (APM) and Observability solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
Previously we had thirteen contractors doing the monitoring for us, which is now reduced to only five.
Datadog has delivered more than its value through reduced downtime, faster recovery, and infrastructure optimization.
Datadog is something which I can 100% rely on, and it is very cost-effective and totally worth it.
We are seeing a return on investment from using Gremlin Reliability Management Platform because we are getting thirty percent fewer production issues, as I mentioned earlier, making it a great investment.
We do not need to look at all the day's metrics on Grafana dashboards; we run our chaos experiments in a production environment to see how reliable our product or service is.
Where we once needed ten people to run tests, with Gremlin Reliability Management Platform we can now do it with fifty percent fewer employees.
When I have additional questions, the ticket is updated with actual recommendations or suggestions pointing me in the correct direction.
Overall, the entire Datadog comprehensive experience of support, onboarding, getting everything in there, and having a good line of feedback has been exceptional.
I've had a couple instances where I reached out to Datadog's support team, and they have been really super helpful and very kind, even reaching back out after resolving my issues to check if everything's going well.
When I have questions or run into issues with Gremlin Reliability Management Platform, their support team is helpful and responsive.
The customer support for Gremlin Reliability Management Platform is good overall.
Datadog's scalability has been great as it has been able to grow with our needs.
We did, as a trial, engage the AWS integration, and immediately it found all of our AWS resources and presented them to us.
Datadog's scalability is strong; we've continued to significantly grow our software, and there are processes in place to ensure that as new servers, realms, and environments are introduced, we're able to include them all in Datadog without noticing any performance issues.
Gremlin Reliability Management Platform scales smoothly for running more chaos experiments, adding more services, or supporting a larger team.
More than scalability, I thought about availability, because it is a really important aspect of architecture tools.
The scalability of Gremlin Reliability Management Platform depends on the scalability of the underlying infrastructure that we are hosting it on.
Datadog is very stable, as there hasn't been any downtime or issues since I've been here, and it's always on time.
Datadog seems stable in my experience without any downtime or reliability issues.
Datadog seems to be more stable, and I really want to have a complete demo before making a call to decide on this.
I have not seen any downtime or issues with its behavior or performance.
It would be great to see stronger AI-driven anomaly detection and predictive analytics to help identify potential issues before they impact performance.
We want to be able to customize the cost part, and we would appreciate more granular access control.
The documentation is adequate, but team members coming into a project could benefit from more guided, interactive tutorials, ideally leveraging real-world data.
I think it would be useful to have some integration with Splunk or other log collectors, or maybe in the future, the ability to link Dynatrace or any other observability platform.
If it could be integrated with natural language, could we talk to Gremlin Reliability Management Platform and have it configure some of the basic settings, so that non-technical people can also work with tools like Gremlin Reliability Management Platform?
The user interface is great, the integration is smooth, and Gremlin Reliability Management Platform has a fantastic support team that helps us a lot in many cases.
The setup cost for Datadog is more than $100.
Everybody wants the agent installed, but we only have so many dollars to spread across, so it's been difficult for me to prioritize who will benefit from Datadog at this time.
My experience with pricing, setup cost, and licensing is that it is really expensive.
It is not so cheap, but it has very powerful features.
My role does not incur costs for us since we have an NFR for Gremlin Reliability Management Platform that we can use in our case.
Our architecture is written in several languages, and one area where Datadog particularly shines is in providing first-class support for a multitude of programming languages.
Having all that associated analytics helps me in troubleshooting by not having to bounce around to other tools, which saves me a lot of time.
Datadog was able to detect the issue and trigger alerts that notified our team promptly, before it got worse, allowing us to adjust and remediate the situation in time.
There are really two pathways: fewer incidents, because with Gremlin Reliability Management Platform we can make every part of the infrastructure more solid, and less downtime, because we can test more architectures and things like how to set up high-availability clusters.
The best feature that Gremlin Reliability Management Platform offers for me is the prebuilt reliability test; I think that is the best feature along with the automated scheduling.
One of my favorite features of Gremlin Reliability Management Platform is the built-in chaos experiments, which give you a reliability score for your service.

| Product | Mindshare (%) |
|---|---|
| Datadog | 5.2% |
| Gremlin Reliability Management Platform | 0.1% |
| Other | 94.7% |

| Company Size | Count |
|---|---|
| Small Business | 81 |
| Midsize Enterprise | 46 |
| Large Enterprise | 99 |

Datadog integrates extensive monitoring solutions with features like customizable dashboards and real-time alerting, supporting efficient system management. Its seamless integration capabilities with tools like AWS and Slack make it a critical part of cloud infrastructure monitoring.
Datadog offers centralized logging and monitoring, making troubleshooting fast and efficient. It facilitates performance tracking in cloud environments such as AWS and Azure, utilizing tools like EC2 and APM for service management. Custom metrics and alerts improve the ability to respond to issues swiftly, while real-time tools enhance system responsiveness. However, users express the need for improved query performance, a more intuitive UI, and increased integration capabilities. Concerns about the pricing model's complexity have led to calls for greater transparency and control, and additional advanced customization options are sought. Datadog's implementation requires attention to these aspects, with enhanced documentation and onboarding recommended to reduce the learning curve.
What are Datadog's Key Features?

In industries like finance and technology, Datadog is implemented for its monitoring capabilities across cloud architectures. Its ability to aggregate logs and provide a unified view enhances reliability in environments demanding high performance. By leveraging real-time insights and integration with platforms like AWS and Azure, organizations in these sectors efficiently manage their cloud infrastructures, ensuring optimal performance and proactive issue resolution.
Gremlin Reliability Management Platform empowers organizations to proactively identify and mitigate potential failures. It enhances system resilience through controlled chaos engineering, aiding tech teams in delivering reliable services.
Designed for tech-savvy users, Gremlin enables teams to implement chaos engineering effectively to ensure system reliability. It offers precise control over variables, allowing teams to simulate real-world scenarios and fortify system operations. Gremlin plays a strategic role in preventing downtime and maintaining optimal service delivery through a suite of advanced tools tailored for IT infrastructure.
What are the most important features of Gremlin?

In industries such as e-commerce, finance, and healthcare, Gremlin helps maintain service reliability by identifying vulnerabilities before they affect operations. IT teams can simulate stress tests specific to their industry, ensuring systems are resilient against potential threats, enhancing customer satisfaction, and securing business continuity.
We monitor all Application Performance Monitoring (APM) and Observability reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.