
Gremlin Reliability Management Platform vs Splunk Observability Cloud comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

ROI

Sentiment score: 7.0
Gremlin's platform improved ROI by reducing costs and downtime, streamlining error identification, and enhancing availability with efficient Chaos Engineering.
Sentiment score: 6.5
Splunk Observability Cloud boosts efficiency, reduces costs, and enhances productivity with centralized tools and improved monitoring capabilities.
We are seeing a return on investment from using Gremlin Reliability Management Platform because we are getting less production issues by thirty percent, as I mentioned earlier, making it a great investment.
DEVOPS specialist at a media company with 10,001+ employees
We do not need to look at all the day's metrics on Grafana dashboards; we run our chaos experiments in a production environment to see how reliable our product or service is.
DevOps & Mlops Engineer at a printing company with 1-10 employees
If we needed ten people to do tests once upon a time, now, using Gremlin Reliability Management Platform, we can do it with a fifty percent reduction in employees.
Senior Software Engineer at a sports company with 10,001+ employees
We have saved considerable amounts of money, reducing our expenditures from around three to four crores to approximately one to one point two crores.
Senior Manager at Agriculture Skill Council of India
We have been able to save a great deal of money, and our profits have increased by twenty percent.
Project Manager at AGRICULTURE SKILL COUNCIL OF INDIA (ASCI)
Using Splunk has saved my organization about 30% of our budget compared to using multiple different monitoring products.
Senior Manager at Bank of America
 

Customer Service

Sentiment score: 8.0
Gremlin's customer support is highly rated for responsiveness and resources, with most users giving scores of eight to ten.
Sentiment score: 7.2
Splunk Observability Cloud's customer service is highly rated for its responsiveness, support, and effective issue resolution.
When I have questions or run into issues with Gremlin Reliability Management Platform, their support team is helpful and responsive.
DevOps & Mlops Engineer at a printing company with 1-10 employees
The expert partnership model is a significant strength I can suggest for Gremlin Reliability Management Platform.
VP Global at a tech vendor with 10,001+ employees
The customer support for Gremlin Reliability Management Platform is good overall.
DEVOPS specialist at a media company with 10,001+ employees
On a scale of 1 to 10, the customer service and technical support deserve a 10.
Systems Administrator at an insurance company with 1,001-5,000 employees
They have consistently helped us resolve any issues we've encountered.
Software Engineer at UKG
The customer support system is the foundational pillar of any successful business.
Project Manager at AGRICULTURE SKILL COUNCIL OF INDIA (ASCI)
 

Scalability Issues

Sentiment score: 7.8
Gremlin's platform ensures seamless scalability and efficient workload management, enhancing DevOps across diverse cloud environments with safety mechanisms.
Sentiment score: 6.9
Splunk Observability Cloud is scalable and flexible but can incur high costs; management of custom metrics may be challenging.
Gremlin Reliability Management Platform scales smoothly for running more chaos experiments, adding more services, or supporting a larger team.
DevOps & Mlops Engineer at a printing company with 1-10 employees
Gremlin Reliability Management Platform's workload management capability is good, effectively managing large workloads seamlessly while providing safety mechanisms and governance around chaos engineering.
Documentation Engineer at a tech vendor with 1,001-5,000 employees
More than scalability, I thought about availability because it is a really important thing of the architecture tools.
Dev Ops To Development (IT) at a non-tech company with self employed
We've used the solution across more than 250 people, including engineers.
Splunk Observability Expert
As we are a growing company transitioning all our applications to the cloud, and with the increasing number of cloud-native applications, Splunk Observability Cloud will help us achieve digital resiliency and reduce our mean time to resolution.
Application Developer at UMB Financial
We have never seen any kind of downtime or crashes, as it has been absolutely very easy to scale.
Project Manager at AGRICULTURE SKILL COUNCIL OF INDIA (ASCI)
 

Stability Issues

Sentiment score: 9.3
Gremlin Reliability Management Platform is praised for its stability and reliability, consistently delivering downtime-free and dependable performance.
Sentiment score: 7.7
Splunk Observability Cloud is stable, reliable, and scalable, with minor performance issues and occasional but limited downtime.
I have not seen any downtime or issues with its behavior or performance.
Senior Software Engineer at a sports company with 10,001+ employees
When downtime occurs, it raises concerns about how we measure and receive alerts, as everything needs to be in place.
Aws Dev Ops Engineer at a consultancy with 10,001+ employees
Splunk Observability Cloud is very stable.
Software Engineer at Titans Lab
It is highly scalable because it can handle approximately up to one hundred applications at a time without any lapse or lag.
Project Manager at AGRICULTURE SKILL COUNCIL OF INDIA (ASCI)
 

Room For Improvement

Gremlin could improve through AI-driven analysis, better user onboarding, expanded service integrations, enhanced UI, and additional learning resources.
Splunk Observability Cloud needs better cost transparency, user interface, third-party integration, and improved setup, onboarding, and AI capabilities.
I think it would be useful to have some integration with Splunk or other log collectors, or maybe in the future, the ability to link Dynatrace or any other observability platform.
Dev Ops To Development (IT) at a non-tech company with self employed
If we can integrate it with natural language, could we talk to Gremlin Reliability Management Platform and have it configure some of the basic settings so that non-technical persons can also work on Gremlin Reliability Management Platform-like tools?
DEVOPS specialist at a media company with 10,001+ employees
The user interface is great, the integration is smooth, and Gremlin Reliability Management Platform has a fantastic support team that helps us a lot in many cases.
DevOps & Mlops Engineer at a printing company with 1-10 employees
The out-of-the-box customizable dashboards in Splunk Observability Cloud are very effective in showcasing IT performance to business leaders.
IT Operations Engineer at ABC Supply Co. Inc.
The next release of Splunk Observability Cloud should include a feature that makes it so that when looking at charts and dashboards, and also looking at one environment regardless of the product feature that you're in, APM, infrastructure, RUM, the environment that is chosen in the first location when you sign into Splunk Observability Cloud needs to stay persistent all the way through.
Systems Monitoring Engineer II at a government with 10,001+ employees
There should be a solution to update OTeL agents from Splunk Observability Cloud itself.
Senior Software Engineer at WorldPay US
 

Setup Cost

Enterprise users value Gremlin's platform for reliability and risk management, justifying costs despite visibility challenges in dashboards.
Enterprise users find Splunk Observability Cloud pricey compared to competitors, but negotiations can reduce costs by 10-15%.
It is not so cheap, but it has very powerful features.
Dev Ops To Development (IT) at a non-tech company with self employed
From a pricing standpoint of view regarding Gremlin Reliability Management Platform, I would say it is a bit expensive, but that expense is worth it given the kind of benefits it offers.
VP Global at a tech vendor with 10,001+ employees
My role does not incur costs for us since we have an NFR for Gremlin Reliability Management Platform that we can use in our case.
DevOps & Mlops Engineer at a printing company with 1-10 employees
Splunk is a bit expensive since it charges based on the indexing rate of data.
Senior Manager at Bank of America
It is expensive, especially when there are other vendors that offer something similar for much cheaper.
Solutions Architect at Ikusi
I can confidently say our availability improved by forty percent, and downtime was reduced by approximately seventy to eighty percent.
Splunk Engineer at Data Elicit Solutions Pvt. Ltd.
 

Valuable Features

Gremlin enhances efficiency with test suites, fault injection, risk detection, and reliability insights, boosting uptime and customer satisfaction.
Splunk Observability Cloud offers real-time monitoring, AI analytics, and easy integration, enhancing user experience and operational performance.
There are really two pathways along: fewer incidents because with Gremlin Reliability Management Platform, we can make every part of the infrastructure more solid, and less downtime because we can test more architectures and then things like how to put in high availability clusters.
Dev Ops To Development (IT) at a non-tech company with self employed
We fix failures even before they occur, which is basically proactive risk detection and risk mitigation.
VP Global at a tech vendor with 10,001+ employees
Gremlin Reliability Management Platform has positively impacted our organization by making outages less frequent and improving recovery time significantly, resulting in fewer complaints on the customer success side and overall optimization of our DevOps process.
Documentation Engineer at a tech vendor with 1,001-5,000 employees
Splunk provides advanced notifications of roadblocks in the application, which helps us to improve and avoid impacts during high-volume days.
Senior Manager at Bank of America
For troubleshooting, we can detect problems in seconds, which is particularly helpful for digital teams.
Splunk Observability Expert
It offers unified visibility for logs, metrics, and traces.
Administrator at a tech vendor with 10,001+ employees
 

Categories and Ranking

Gremlin Reliability Managem...
Ranking in Application Performance Monitoring (APM) and Observability: 25th
Ranking in IT Infrastructure Monitoring: 23rd
Average Rating: 8.6
Reviews Sentiment: 7.0
Number of Reviews: 8
Ranking in other categories: DevSecOps (8th)

Splunk Observability Cloud
Ranking in Application Performance Monitoring (APM) and Observability: 5th
Ranking in IT Infrastructure Monitoring: 5th
Average Rating: 8.2
Reviews Sentiment: 6.8
Number of Reviews: 88
Ranking in other categories: Network Monitoring Software (6th), Cloud Monitoring Software (4th), Container Management (5th), Digital Experience Monitoring (DEM) (2nd)
 

Mindshare comparison

As of July 2026, in the Application Performance Monitoring (APM) and Observability category, the mindshare of Gremlin Reliability Management Platform is 0.2%. The mindshare of Splunk Observability Cloud is 2.3%, up from 1.9% the previous year. Mindshare is calculated based on PeerSpot user engagement data.
Application Performance Monitoring (APM) and Observability Mindshare Distribution
Product | Mindshare (%)
Splunk Observability Cloud | 2.3%
Gremlin Reliability Management Platform | 0.2%
Other | 97.5%
 

Featured Reviews

VL
Senior Software Engineer at a sports company with 10,001+ employees
Chaos experiments have revealed weak points and now provide controlled cost-saving tests
The best features of Gremlin Reliability Management Platform are the safe failure injection, which is crucial as we can simulate the failures in a manner that we know these are just dumping tests and not the actual issues. Whether it is the CPU spike or the memory exhaustion, or the network latency, or the server shutdown, server shutdown is one of the most favorite features that I have in Gremlin Reliability Management Platform.
The controlled blast radius is another standout feature. The controlled blast radius feature has helped my team in that we actually wanted to target only one specific container, our Docker containers that we deployed. It helped us to conduct tests in a very specific, isolated manner instead of launching a larger test or focusing on hundreds of servers at a time, resulting in very limited impact. Since ours is a very small team, we do not want to impact other servers. This controlled blast radius helped us to only focus on our servers and not impact any other team.
Gremlin Reliability Management Platform has positively impacted my organization because before Gremlin Reliability Management Platform, we did not even know how to conduct these chaos engineering tests. We heard about it, but we had no idea of how to do something of that nature. If there are ten servers, ten systems in our architecture and if suddenly something goes down, nobody knew what would happen next. We did not even know how to simulate these types of tests. This lack of confidence has been mitigated by using Gremlin Reliability Management Platform.
Now we can confidently test and see which system is the most critical. If this goes down, what happens? How much business valuation are we going to impact? How much loss are we going to incur? All of this is now clearly visible and transparent. Since using Gremlin Reliability Management Platform, we were able to reduce the incidents by six percent after conducting our limited experiments. We were also able to increase the uptime from ninety-eight to ninety-nine, which represents a one percent increase in uptime.
PK
Project Manager at AGRICULTURE SKILL COUNCIL OF INDIA (ASCI)
Unified observability has improved real-time governance and now drives data-led decisions
Log Observer Connect is embedded here, but we are facing some delays in centralized log collection and analysis, which can be further fastened. We are collecting all the data metrics and decision-making insights, but all these data-driven decisions coming from different applications are not connected somewhere. A consolidated form or correlation of these insights is not happening between each other due to which we feel we are missing something significant.
Some generalized feedback includes that predictive alerts or alarms which can be integrated with AI-driven alarms and alerting features should be established so that there is AI-driven intelligence and anomaly detection happening with a complete systematic process in service delivery. Application dependencies are huge, and business and operational dashboards should be improved. Right now there are very interactive custom dashboards, and every now and then, the personalization of enhancements keeps happening. KPI monitoring, executive reporting, and analytics have definitely been introduced to a great extent. There are few things in cloud-native monitoring, such as integration with AWS and Azure, where we sometimes do face lags. Those things can definitely be improved upon.
I have used Datadog and Dynatrace before using Splunk Observability Cloud. Datadog was definitely recommended by most of our peers because of its very strong comprehensive observability and very strong and unique dashboard systems. Dynatrace was also very good because they have offered a lot of AI-driven analysis methods and processes, which was helping our organization a lot. Since our organization has a very strong IT ecosystem for agriculture, very different kinds of customized things are required.
902,988 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Construction Company: 13%
Printing Company: 10%
Financial Services Firm: 9%
Sports Company: 9%

Financial Services Firm: 12%
Manufacturing Company: 9%
Computer Software Company: 8%
Construction Company: 7%
 

Company Size

By reviewers
Company Size | Count
Small Business | 3
Large Enterprise | 7

By reviewers
Company Size | Count
Small Business | 32
Midsize Enterprise | 8
Large Enterprise | 55
 

Questions from the Community

What needs improvement with Gremlin Reliability Management Platform?
While I have no complaints about Gremlin Reliability Management Platform, I believe the UI can be improved to enhance the developer experience for security engineers and DevOps engineers. Additiona...
What is your primary use case for Gremlin Reliability Management Platform?
My main use case for Gremlin Reliability Management Platform is to see how our applications behave under extreme stress and how resilient our application is when a simulation of server crash alongs...
What advice do you have for others considering Gremlin Reliability Management Platform?
For others considering Gremlin Reliability Management Platform, it is an excellent tool for organizations facing downtime issues, as it allows for chaos testing without needing to check logs and me...
What needs improvement with SignalFx?
Regarding dashboard customization, while Splunk has many dashboard building options, customers sometimes need to create specific dashboards, particularly for applicative metrics such as Java and pr...
What is your primary use case for SignalFx?
The solution involves observability in general, such as Application Performance Monitoring, and generally addresses digital applications, web applications, sites, and mobile applications. I worked ...
What advice do you have for others considering SignalFx?
We're a customer and end-user. Currently, in France, we cannot use the artificial intelligence option. While this option is enabled for the United States and many countries, it's not yet available ...
 

Also Known As

Gremlin Reliability Management Platform: No data available
Splunk Observability Cloud: Splunk Infrastructure Monitoring, Splunk Real User Monitoring (RUM), Splunk Synthetic Monitoring
 

 

Sample Customers

Gremlin Reliability Management Platform: Information Not Available
Splunk Observability Cloud: Sunrun, Yelp, Onshape, Tapjoy, Symphony Commerce, Chairish, Clever, Grovo, Bazaar Voice, Zenefits, Avalara
Find out what your peers are saying about Gremlin Reliability Management Platform vs. Splunk Observability Cloud and other solutions. Updated: June 2026.