What is our primary use case?
I am Purnambica Kolavennu and I have been working for the past two to three years with the Agriculture Skills Council of India. I work at the intersection of technology and AI innovation in the agriculture skilling platform. I have been working in a similar portfolio for approximately five to seven years.
We have been using Splunk Observability Cloud for the past two years. At the intersection of agriculture and AI innovation, real-time visibility, monitoring, and analytics across the entire agriculture skilling ecosystem are critical. We help organizations, clients, training partners, assessment bodies, and government stakeholders improve their service delivery, audit compliance, and overall outcomes.
We do end-to-end monitoring of skilling platforms through Splunk Observability Cloud, a unified cloud-native SaaS platform. We troubleshoot foreign threats and cyber threats. We monitor the health and performance of training systems, assessment portals, attendance portals, and learning management systems, all end-to-end. Monitoring our training and assessment activities has become extremely easy, and the solution helps us keep our clients happy. We also monitor real-time dashboards for PMKVY and other government schemes.
What is most valuable?
Splunk Observability Cloud helps us derive data-driven insights for decision-making. These fresh insights help us make better decisions and improve scheme monitoring for PMKVY, PM Vishwakarma, and other state programs. Our daily assessment operations are monitored here, and we have been able to reduce assessment delays. We comply better with guidelines and norms, and we can detect bottlenecks, foreign threats, and cyber threats entering the system early.
Through this platform, we have started automated identification of eligible and non-eligible candidates, which has improved the system greatly. There is now more transparency, credibility, and authenticity in the entire process. We have also reduced the manual verification effort, which has greatly helped our entire team. We also use predictive analytics to identify where early intervention is needed to stay within scheme guidelines, and to enhance infrastructure performance monitoring of our various applications in a unified manner.
We have more traceability because Log Observer Connect is a very useful functionality that is available to our clients and vendors. Through it, centralized log data is visible to all partners, which allows a seamless flow of information and data. We also have more AI and automation integration, which has reduced our manual effort. We do specialized monitoring of our AI agents and infrastructure stacks.
Splunk Observability Cloud is a very strong cloud-native SaaS platform designed for monitoring and troubleshooting cyber environments. Its mean time to resolution (MTTR) functionality enables different kinds of intervention. Application performance monitoring is a very strong feature that provides deep code-level visibility into distributed applications. It also helps us manage each data flow through the system from end to end. Infrastructure monitoring is the most important feature for us because improving operational efficiency is critical for our organization. For that, we need real-time streaming analytics and automatic service discovery, both of which we get through infrastructure monitoring. Real user monitoring is also a very strong feature, capturing the complete end-user experience across both web and mobile applications. The platform proactively monitors service performance, APIs, and URLs globally before end users are impacted.
The strongest feature is unified application performance monitoring. We work on approximately twenty-plus applications at a time, fetching data insights from them, tracking their performance, and identifying any bottlenecks or cyber threats. This would be extremely difficult to do manually, as it would take many people watching performance metrics. This unified SaaS platform gives us full visibility and full business control in terms of mapping all performance metrics, deriving decision-making insights, and helping our people.
Compliance monitoring and audit readiness are very strong, which matters to our organization because we are directly governed by the Government of India. Fraud detection helps us a lot because confidential information and data flows are often threatened by suspicious login patterns, unusual duplicate registrations, unauthorized system access, and other anomalies that enter the system as cyber threats. The system monitors our applications twenty-four-seven, and as a result there is enhanced compliance and improved audit readiness, which has reduced the risk of malpractice. The system keeps us clean in terms of compliance practices, and our service delivery has been excellent. Our stakeholder service management has improved greatly: clients are happy, issues are resolved faster, and client satisfaction is growing.
Performance monitoring is a very strong feature. We can track performance metrics end-to-end because there is a single performance control center through which all dashboards and application visibility come to us in just one click. Real-time monitoring helps us a lot. Our operational productivity and efficiency have improved, and audit and regulatory compliance are better now. Our clients are happy because no queries are raised on their processes. Twenty-four-seven issue resolution metrics are incorporated, which helps us a lot. Predictive insights give us a proactive view that helps in decision-making. There is strong unified governance. We are seeing a thirty to fifty percent reduction in incident resolution time. There are higher compliance metrics, improved batch completion rates, and better visibility of all processes and systems. Strong data-driven decision-making is creating a very strong ecosystem and environment throughout.
There is a twenty-four-seven ticket resolution system and a very advanced, humanized chatbot system that works on a four-to-six-hour resolution time. For many other applications, resolution time is twenty-four to forty-eight hours, but here it is approximately four to six hours. There is a built-in rule that if an issue is not addressed within thirty minutes to one hour, it is escalated to a higher-level authority so that it is resolved quickly. The manpower we need for resolving and following up on tickets has been reduced considerably. Through the strong chatbot support and their email support, we have been able to reduce our burden greatly.
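As a rough illustration of the escalation rule I described, here is a minimal Python sketch. It is not the vendor's implementation: the function name, the parameters, and the thirty-minute default are my own assumptions, chosen only to match the "thirty minutes or one hour" window mentioned above.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; the support process escalates after
# roughly thirty minutes to one hour without a response.
ESCALATION_AFTER = timedelta(minutes=30)


def needs_escalation(opened_at, now, acknowledged, threshold=ESCALATION_AFTER):
    """Return True when an unacknowledged ticket has waited longer than the threshold."""
    return (not acknowledged) and (now - opened_at) > threshold


if __name__ == "__main__":
    opened = datetime(2024, 1, 1, 9, 0)
    # Checked 45 minutes after opening, still unacknowledged.
    print(needs_escalation(opened, datetime(2024, 1, 1, 9, 45), acknowledged=False))
```

Passing `threshold=timedelta(hours=1)` would model the one-hour end of the window instead.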
What needs improvement?
Log Observer Connect is embedded here, but we face some delays in centralized log collection and analysis, which could be made faster. We collect all the data metrics and decision-making insights, but the insights coming from different applications are not connected anywhere. They are not consolidated or correlated with each other, and because of that we feel we are missing something significant.
As general feedback, predictive alerts and alarms should be integrated with AI-driven alerting features so that AI-driven intelligence and anomaly detection become a systematic part of service delivery. Application dependencies are huge, and the business and operational dashboards should be improved. Right now there are very interactive custom dashboards, and personalization enhancements keep arriving every now and then. KPI monitoring, executive reporting, and analytics have definitely been introduced to a great extent. There are a few areas in cloud-native monitoring, such as integration with AWS and Azure, where we sometimes face lags. Those can definitely be improved.
I used Datadog and Dynatrace before using Splunk Observability Cloud. Datadog was recommended by most of our peers because of its strong, comprehensive observability and its unique dashboard systems. Dynatrace was also very good because it offered a lot of AI-driven analysis methods and processes, which helped our organization a lot. Since our organization has a very strong IT ecosystem for agriculture, we require many different kinds of customization.
What do I think about the stability of the solution?
Splunk Observability Cloud is very stable. We use approximately twenty-plus applications, and the system has the capacity to increase to up to ninety. There was a time when we had sixty to seventy applications monitored at once, and there were never any outages or downtime. We have never faced any kind of downtime or performance issue. It is highly scalable because it can handle up to approximately one hundred applications at a time without any lapse or lag.
What do I think about the scalability of the solution?
Scalability is huge for our needs. We use it in our native cloud environment, but we also have external cloud environments on our clients' servers with very different configurations. The API integration with those external client servers is so smooth that we have never seen a scalability or compatibility issue. We have never seen any downtime or crashes, and it has been very easy to scale. If I am working on twenty applications today and want to scale up to fifty tomorrow, everything can be done easily without any downtime or outages.
How are customer service and support?
Customer support is great. The turnaround time for resolution is extremely good. They are available twenty-four-seven with an advanced AI-driven chatbot system, and they resolve issues within four to eight hours, which is commendable. Customer support is the foundational pillar of any successful business, and the team has excelled at it.
Which solution did I use previously and why did I switch?
I evaluated Datadog and Dynatrace. Datadog was highly recommended by most of my peers because of its strong, comprehensive observability and unique dashboard systems. Dynatrace was also recommended because of its strong AI root cause analysis. We also checked for new solutions but could not find the best deal with them, so we ultimately switched to Splunk Observability Cloud.
What about the implementation team?
New features keep being added every now and then to meet different requirements and the changing business environment. We ask their business and tech teams to run capacity planning or capacity development sessions regularly so that training is uniform across the ecosystem. That way our new joiners, new learners, and new tech executives learn the new systems as they arrive, and there are no gaps in understanding. This enhances the user experience and gives everyone a better orientation to the systems, the processes, how the application actually functions, and what its various utilities are.
What was our ROI?
We have reduced the number of employees working in this vertical from approximately ten to twelve people to five. We have reduced our operational expense by forty percent, and we have reduced our operational burden by nearly ten percent in multitask management, which was previously done through manual intervention. We have saved a great deal of money, and our profits have increased by twenty percent. Within just one year of deployment, we were already in profit.
What's my experience with pricing, setup cost, and licensing?
The pricing and initial setup cost were a bit high for us. However, after a lot of negotiation with the business team, we have obtained a reduction of approximately ten to fifteen percent. The license is renewed annually, so we sign an SLA agreement with them every year when we renew.
What other advice do I have?
I would definitely give Splunk Observability Cloud a nine out of ten rating. Its unique strengths include strong application and infrastructure monitoring, a very comprehensive observability environment, and a very powerful native cloud environment. There are strong dashboards for real-time visibility twenty-four-seven. I would say it is best suited to large enterprises because its personalization and customization are extremely good and adapt well to their requirements. Compared with other observability cloud applications, it is very advanced, and AI root cause analysis runs throughout, so even complex IT ecosystems and complex integrations are handled easily. There is full-stack monitoring and excellent log analytics, which helps us make faster and better data-driven decisions. Splunk Observability Cloud is extremely reliable and trusted, and it has definitely gained public trust. It is a highly recommended application.
With small enhancements to integration, and with regular training and orientation to make the system more understandable for everyone, it could definitely be a ten out of ten.
There is very strong governance and security, and the policy processes are strongly governed. Government cloud ecosystems are very susceptible to all kinds of threats and attacks, system breaches are common there, and a lot is at stake. The system offers many advanced use cases, such as Google Cloud-based applications that are very secure. They host the entire program on another platform and also keep a duplicate of it, which gives us various strong, built-in audit processes. Real-time observability has been available all the time, so we have not had any such problem. All of this is very cost-effective, and it complies strongly with the established processes.
Accuracy and reliability are excellent. We deal with millions of data records every week, and the system runs continuously throughout the day on that data and brings insights to us. In the past two years, I have never seen any data mismatch or inaccuracy. There is strong trust built in: no data leaks, no misinformation, and no information going out of the system. Accuracy and reliability are very strong features. My overall review rating for Splunk Observability Cloud is nine out of ten.