It is mostly centralized logging, a whole bunch of BI metrics, and an aggregation point, which we have adulterated for some PCI data.
It does meet our use case for the most part.
It is mostly centralized logging, a whole bunch of BI metrics, and an aggregation point, which we have adulterated for some PCI data.
It does meet our use case for the most part.
We like the dashboard creation and the ease with which we can harness the APIs to create custom BI dashboards on the fly. This adds most value for us. The nature of some of our microservices that I have run on the cloud are mixed workloads, wherein with the flow of data, it can change over time. In order to adjust for this, and cater to the needs of some of our internal customers, BI dashboards need to be created, tweaked, and modified. Also, doing this by hand is next to impossible. Therefore, we have strung all of this through a programmatic pipeline, which s something which we like because it is easier for us to harness it utilizing the API.
For on-premise, it's more about optimization. With such a heavy byte scale of data that we are operating on, the search for disparate data sometimes takes about a minute. This is understandable considering the amount of data that we are pumping into it. The only optimization that I recommend is better sharding, when it comes to Splunk, so that data retrieval can be faster.
With the AWS hosted version, we have not hit this bottleneck yet, simply because we are not yet at the multiple terabyte scale. We have hit with the on-premise enterprise version. This is a problem that we run into every so often. We don't run into this problem day in and day out. Only during the month of August through October do we contend with this issue. Also, there is a fair bit of lag. We have our ways to work around it. Between those few months, we are pumping in a lot of data. It is between 8 to 10 terabytes of data easily, so it is at a massive scale. There are also limitations from the hardware perspective, which is why it is an optimizing problem.
On the cloud, we are pushing through less than half a petabyte of data. So far, it has been fairly stable because it runs on all the underlying AWS infrastructures. Therefore, we have had no issues at all. In terms of availability or outages that we've experienced, there haven't been any. We've been fairly happy with the overall landscape of how it works on AWS.
On cloud, we absolutely like it. Splunk AMIs make it easy for us to spin up a Splunk cluster or add a new node to it. For our rapid development and scale of deployments in terms of microservices and the number of microservices that we run, we have had no problems here.
On-premise requires a lot of planning, which happens on a yearly basis. We have Splunk dedicated staff onsite for on-premise to help us through this.
We have 450 people making use of Splunk in our organization, and there was a bit of knowledge transfer needed on how to write a Splunk query. So, there is a bit of a learning curve. Once you get over it, it is fairly simple to use. We also have ready-made Splunk queries to help people get started.
We do deal with technical support on an ongoing basis. They can definitely do better from a technical point of view. Their only purpose working onsite is to make sure that our massive set of Splunk clusters are online, and the clusters are tuned well enough to work well.
We would expect the technical support people onsite to be subject-matter experts of Splunk. We have seen in a few areas where we have been left wanting more, wherein some of our engineers happen to know more than them in terms of some of the query optimizations, etc. This is where we think there is a fair amount of improvement that can be done.
We wrote the automation to bootstrap everything onto AWS, which was fairly easy. As long as we had all the hooks going into AWS, and we had the SDK. So, we did not have too much trouble getting the bootstrap up and running.
Some of the insights that we have obtained as a part of using Splunk have greatly helped us in increasing our revenue in terms of selling our products.
We have seen a decent ROI. For the month of October 2018, when we had a product launch, we were able to query and generate BI dashboards on the fly. This was huge, and not possible two and a half to three years back because it was more of a manual process. Now, with APIs being available, it is very simple to tweak or write a small piece of glue code to go ahead and create a new dashboard for a business unit to make near real-time decisions to focus more on other geographies when launching the product.
I wasn't there when the evaluation was done. When I came on board, this product was handed down to me, and we have not evaluated any other solutions or products since then.
Make sure it fits your use case. Be clear about what you want to achieve, get out of the product, and how you want to integrate it. Once you tie the solution into your systems, it is not trivial or easy to walk away from. Therefore, due diligence needs to be made to understand what your requirements are before choosing a product. Some companies may not even want to host, and prefer to go the managed services route.
We have it integrated with every product that I can think of.
We use both the AWS and on-premise versions. The AWS hosted version typically caters to all the microservices that we run on AWS, so there is a clear segregation between on-premise and cloud. In terms of usability and experience, both of them have been similar. We have seen a few bottlenecks on the cloud, but that can probably be attributed more on the user side of the house in terms of the way we write our applications and the type of payloads that we sent this month. This is an optimization which is ongoing from our end. Other that, we have been fairly happy with Splunk and what we get out of it.
We use it for application log monitoring.
It is a logging product. Our application generates log files, then we upload them to Splunk. We run their agent on our EC2 instances in AWS, then we view the logs through their product, and it is all stored on their infrastructure.
We have used the alerts for a lot of things. They gave us the ability to kind of make an alert simply. So, we did one for SQL injection. We also had some services which were problematic that would fail, but we figured out what log line that we could look for, so it was easy to make an alert for that.
Its usability is the best part. It is easy for our developers to use if they want to search their logs, etc.
A problem that we had recently had was we licensed it based on how much data you upload to them every day. Something changed in one our applications, and it started generating three to four times as many logs and. So now, we are trying to assemble something with parts of the Splunk API to warn ourselves, then turn it off and throttle it back more. However it would be better if they had something systematically built into the product that if you're getting close to your license, then to shut things down. This sort of thing would help out a lot. It would help them out too, because then they wouldn't be hollering at us for going over our license.
Stability has been great. I don't think we have ever had an outage from it.
We don't do a lot of searching. If there is somewhere with problems, it will probably have to be with a lot of searches, and we don't have that. We don't have many developers searching every day. It is mostly when there is a problem, then we use it for diagnostics. So, we don't put a large search load on it. However, the reliability of it has been great. It hasn't been down for us at any point.
It seems to have worked out great. We haven't had any problems yet.
I haven't used the technical support.
Before Splunk, we used Kibana and Elasticsearch. Sometimes, with them, logs wouldn't even be there. We have received an infinite time reduction there. We couldn't use what we had before, so Splunk being there and working does a lot.
The integration and configuration with the AWS environment was easy. They had the documentation. All we had to do was get their agent running on our EC2 instance, and their documentation was good for that. It worked, which was great.
The product is also integrated with PagerDuty, Slack, and AWS. Those integrations are good and seamless.
It has made life easier for us through use, then by troubleshooting problems. It reduces the cost of the intangibles.
The pricing seems good relative to the other vendors that we have had here. However, they need to find ways to be more flexible with the licensing and be able to deal with situations where we start generating more logs. Maybe having some controls in the Splunk interface to turn it off, so we don't have to change anything in our application.
We have an existing contract with Splunk, so it makes sense to stay with them for now. Our license is for a 100 GB/logs a day.
There are a lot of vendors in the space at the conference this year. Therefore, we probably talked to six or seven different ones, and the market seems to be consolidating. The market's metrics and log monitoring all seem to be rolling up into a single provider. It looks like that is what will be happening in the next few years.
Right now, there are a ton of different smaller providers doing little pieces of this and that. All the big players, like Splunk, New Relic, and Datadog, seem to be rolling them all up into one offering.
Implement something and watch how much data you are sending to it, then have some way to shut it off without redeploying your app in case things get hairy.
We use the cloud version of the product.
The primary use case is for log analytics. Although, we have been using it as a hammer which hits all the nails. We have sort of overused it in some areas where it doesn't need to be used.
We have a one stop dashboard for health of some of our services where you can click in and it takes you to other dashboards that have custom near real-time metrics that show the application's health. From there, you can drill in to see the real deep dive example of what is happening in your environment. It has reduced our time to resolve incidents.
The most valuable feature is its centralized log analytics.
The historical data extraction needs improvement. I would like the capability of taking data and having it trend longer. Splunk is good about viewing data within the last seven or 14 days, but if you want to see a year-over-year trend, you have to do a lot of work to get to that point. If there was a better way to extract the data point and put it into a long-term viewing ability for a year-over-year analysis, then compare that to your other business metrics. That is what I am looking for, as an example, for a call center you want to see the time it takes for your customer to be handled on their need comparatively to the system performance that is happening, then overlay that data.
We put a lot of trust in it. It has been pretty rock-solid outside of a couple of changes that we made. Upgrades sometimes don't always go smoothly, but otherwise the system performs, and operates.
When we were trying to implement an enterprise solution on-premise, we had scaling issues. It was very difficult to search the data retention beyond a few days. A lot of talent was given to the ability to go into AWS and scale with our need. We still had to do some administrative things to prevent consumers from trying to search all records for all time in very inefficient searches. This could sometimes bring our core system functionality to a halt, so we had to do some user administration in it.
I don't engage with the support directly. Another member of my team does. Any time that we have needed support, he hasn't had an issue opening a ticket and receiving the help that he needs.
The integration and configuration in the AWS environment was pretty good. They have a consumption method for pretty much every service. They might be able to do a little better at advertising different patterns for best practices for different service, but overall there's a method to get everything.
We have had a reduction in the time it takes to resolve issues and correlate what has failed. This has significantly helped.
We looked at the Elk Stack, Kibana, and Sumo Logic.
We chose Splunk because their cost is better, the maintenance factor is a little higher, and the core functionality is higher than what other products provide. The core functionality is out-of-the-box. E.g., with a Toyota Scion, you can customize the parts to make it whatever you want, but it's a lot of work to get there. Where if you buy a Cadillac, you pay the Cadillac's price, but it's a Cadillac. It will work right out-of-the-box.
It works well when searching logs. If you looked to try to do things beyond this, the problem that we ran into is that we treated it as the hammer which hits all nails. That is not really feasible, and there are other tools out there that can do more specialized things.
User administration is key. Trying to prevent users from being able search records all the time is a huge problem. You need a tight approval process on dashboards, making sure the dashboards are queried in the most efficient way possible.
The on-premise version that we had was not scalable at all. It was very difficult to use. We have EC2 instances in the cloud with Splunk installed, which is more scalable and easier to use. It now works much better.
We use it mostly for log monitoring, and also for trying to raise alarms.
It has helped with troubleshooting, making it easier. Now, we have one place where we can find logs and errors. There is no need to go to the actual server to search for the log file.
It provides logs in one place, so they are easy to find. It collects the logs from multiple places, then you have just one place where you see the whole flow from the front-end to the back-end. This is the best thing.
The search could be improved. Now, it is a bit difficult to write search queries because they become quite long, then maintaining those long search queries is a quite challenging.
I have not had any issues with it, and we have the whole banking infrastructure running on it.
The scalability is okay as far as I have seen and used it. We have dozens of different environment environments using the same Splunk instruments, and it has been able to scale.
I have not used technical support.
Splunk's website is quite useful. You can find a lot of information on it. I would recommend to use it and try to figure out the product's features and what you can actually do with Splunk. You can do a lot of things with Splunk, but you need to know what to do first.
I have used both the AWS and on-premise versions, but in two different environment, so I am unable to compare the versions.
We primarily use it for SIEM.
It has a big user base, so the community is useful.
The community surrounding the product is okay, but I would like more material supplied by Splunk around some more common integration stuff. I wish there was a bigger library, because we are building stuff. Where I often feel like other people have done things before, we are reinventing the wheel. While it is not a core piece of our organization and it is not a priority, it does inform our SIEM platform. It would be nice if there was a little more cookie cutter solutioning inside of it, and that they would take a little more time to shake it out.
The first year and a half was a little wacky with its usefulness, but now it is a solid piece of our infrastructure.
We don't have any issues with it now. We had some issues in the past, but we chalked those up to user error. We didn't know what we were doing at first.
We haven't had any issues with it.
I haven't heard any complaints about the technical support.
The integration with all our tool sets felt like we were reinventing the wheel, which was a pain point for us.
It would be nice if the pricing were cheaper. However, we did purchase it.
We evaluated Alert Logic and Splunk. We still use both products heavily.
We have different use cases for the products. At first, Splunk was free, so we started to take more advantage of it.
Do your homework and make sure it fits your needs.
The product is pretty good. We are pretty satisfied with it. It does what it does.
We host the product on AWS, but we did not purchase it on the AWS Marketplace.
We use it for log analysis and alerting, and our stock analysts use it.
I have used the product for more than five years. Then, in the cloud, I have used it for probably a year. It scales better in the cloud than on-premise.
It is a place for all our logs, and everything goes in one place. The stock analysts and security people use one single dashboard (one single location) to check our logs.
Every product needs improvement. If we can get a faster product, we will take it. There are new services which are coming up. If Splunk can catch up with the speed of Amazon, and with the integration, instead of us waiting for another year or so, that would be good.
We would like more integrations with other cloud products, not just AWS, e.g., Azure.
The stability is good. We stress it at 98 percent.
The AWS scalability is pretty good. We currently have it running on three servers.
Other teams have told me that the technical support is pretty good.
For the few integrations that we have already made, these have been easy to do.
We have seen ROI.
Splunk is not free.
I would recommend trying different stuff based on your company's needs and log types.
We like the product.
It has helped us look at modern technology, as well as penetrate our legacy systems, to see where the bottlenecks are.
I have not tested the hybrid model yet. I don't know whether all its integrations and interfaces will work between the cloud and on-premise model. I also don't know if across multiple clouds all the products will perform properly.
If it could be made available as a service, this would be much better than as a product.
It is stable under production environments.
The scalability is decent. We have implemented it in our production environment, and it scales.
We have seen ROI and improvements as we have continued to use the product, but they are more reactive. We want to be proactive on an enterprise-wide scale.
We considered Oracle Enterprise Manager, but Splunk is way more powerful. Splunk is product-agnostic, as it can move across different platforms and products.
Explore Splunk. The product has a lot of depth.
It works with multiple products which are scheduling systems to ERPs to legacy, and it works perfectly fine.
I use the on-premise version. I have not had the opportunity to explore the AWS on Splunk version yet.
We use it for logging, essentially for auditing and troubleshooting errors in production and finding out what happened.
I have used the product personally for five years and at my current company for a year and a half.
I haven't had any problems with it so far.
There are a lot of plugins to integrate this. The client site login is pretty extensible and probably cost-effective. Plus, it is easy to configure.
I would like some additional AI capabilities to provide additional information about things going wrong and things going well.
It is very stable. We have not had any problems.
We had to upgrade when it was on-premise, but then we went to cloud version, which is very good.
It is pretty scalability, even though we have a lot of logs. It runs well.
I assume that the pricing is reasonable, because if it was too costly, there are other alternatives. However, with some of the other solutions, you have to spend time on them and manage them yourself. It might also take you three times to get it right. So, Splunk may be more costly upfront, but in the long run, it saves on time and man-hours.
I would consider ELK Kibana a competitor for this solution. If you have time, and you want to do it yourself, you can save a little money going with Kibana. However, Splunk is pretty good and I would recommend an enterprise to switch to Splunk.