We use it almost exclusively for flow data. We use that for a variety of things from network optimization to network capacity to security events, including DDoS protection, etc.
We're using the SaaS version.
The drill-down into detailed views of network activity helps us to quickly pinpoint locations and causes. Anecdotally, it has decreased our mean time to remediation. On a per-incident basis, it could save anywhere from five minutes to 60 minutes.
We also believe it has improved our total network uptime. We haven't done any direct before-and-after comparison, though.
Again, anecdotally, it has sped up our security team's ability to respond to attacks that did not surface as readily prior to having the flow log data.
One of the valuable features is the intuitive nature of building out reports, and then triggering actions based on specific metrics from those reports. It has a really good UI and the ability to surface data through the reporting functions is pretty good. That's helped a lot in the security space. If you get a massive, 100 Gbps attack coming through, saturating links, you can surface that really quickly and then act to engage DDoS protection or other mitigations from the IPS.
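To make that "surface it quickly, then act" flow concrete, here is a minimal Python sketch of the idea. It is only an illustration under assumptions: the query endpoint, header names, query text, response shape, and the mitigation webhook are placeholders rather than Kentik's documented API, so check the API reference in your own portal before adapting it.

```python
# Illustrative sketch only: poll a traffic metric and, if it crosses a
# threshold, call a (hypothetical) internal mitigation webhook. The endpoint,
# header names, query text, and response shape are assumptions; verify them
# against the Kentik API documentation for your account.
import requests

KENTIK_API = "https://api.kentik.com/api/v5/query/sql"   # assumed SQL-style query endpoint
HEADERS = {
    "X-CH-Auth-Email": "user@example.com",                # assumed auth header names
    "X-CH-Auth-API-Token": "YOUR_API_TOKEN",
}
ATTACK_THRESHOLD_BPS = 80e9                               # example: alert as links approach saturation
MITIGATION_WEBHOOK = "https://mitigation.example.internal/engage"  # hypothetical internal hook

def current_ingress_bps() -> float:
    """Ask for recent ingress bits per second (query text is illustrative)."""
    sql = ("SELECT sum(both_bytes) * 8 / 300 AS bps "
           "FROM all_devices WHERE ctimestamp > 300")
    resp = requests.post(KENTIK_API, json={"query": sql}, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    rows = resp.json().get("rows", [])
    return float(rows[0]["bps"]) if rows else 0.0

if __name__ == "__main__":
    bps = current_ingress_bps()
    if bps > ATTACK_THRESHOLD_BPS:
        # Hand off to whatever actually performs mitigation (DDoS scrubbing, IPS policy, etc.).
        requests.post(MITIGATION_WEBHOOK, json={"observed_bps": bps}, timeout=10)
```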
The real-time visibility across our network infrastructure is really good. One of the things that we love it for is our global backbone visualization. Being able to see that utilization in real-time is pretty critical for us.
It also proactively detects network performance degradation and things like availability issues and anomalies when used in concert with the SevOne network management system. In conjunction with that — with all of our polling and availability data coming from that NMS — the flow data provides that type of insight.
We also use Kentik's months of historical data for forensic work. We do 90 days.
I believe they're already working on this, but I would love for them to create better integrations from network flow data to application performance — tracing — so that we could overlay that data more readily. With more companies going hybrid, flow logs and flow data, whether it be VPC or on-prem, matched with application performance and trace data, is pretty important.
The other area would be supplanting SevOne and other companies that are really good in the NMS space, specifically for SNMP data.
We've had it since before I took over this space and took over Kentik, so 2017 is when the initial contract started. We're going on three years.
The stability has been very good. There was only one outage or impacting event that I can remember in the past year. It took them a couple of days to fix it, but the impact was remediated through some mitigation they did on their end to prevent it from causing us too much headache. They got it down to where it only affected some long-term reporting, which wasn't super-critical for us. It wasn't too big a deal.
So far, Kentik has scaled for what we've done with it and we haven't hit any scale issues to date. I don't know if we're a very large user compared to some of their other customers so I don't know if we're a good example to discuss scale, per se. But we haven't encountered any scale issues from our side.
We don't have plans to expand the use of Kentik, other than increasing licenses to gather flow data for more devices. We buy per license and we have 75 or 100 licenses. The size of the teams that use it is 100 people or so. They are security engineers, network engineers, network health analysts, and threat-intelligence folks.
Their tech support is phenomenal. They tell us about an issue before we even get to it.
With the incident that I mentioned in the context of the solution's stability, even before we experienced any issues relating to it, they had already reached out to us and let us know what was going on. They gave us timelines, and their ongoing communication kept us informed throughout the incident, which headed off any kerfuffle from the executive layer. That can be a giant headache in those types of situations, but they managed it perfectly, were proactive with their communication, and we didn't hear a peep from anyone about it.
I wasn't involved in the initial setup, but there is time involved for us to set up the checks for the flow data and to set up the reports. Depending on what someone is setting up, it could take five minutes or it could take a couple of days. It just depends on what they're implementing with it.
I'm sure we have data available to show ROI but I don't have it available. Where Kentik is bringing us the most value is in the security realm, in terms of attack prevention, but ROI on that is hard to measure.
There have been other folks in our company who have tested a variety of things. Prior to Kentik they went through an evaluation phase, from what I understand, and vetted out a variety of solutions. I believe that what made Kentik stand out was pricing and the intuitive user-experience.
The biggest lesson from using Kentik is that the more we use it and learn, the more valuable use cases we discover. Initially, when I came over to the team, we weren't using it to its fullest capabilities. As we started to understand those capabilities and dive into specific areas with Kentik's own customer success engineers, we learned that we needed to change our thought process a little bit: how we thought about flow logs and what they could provide insight into.
My advice would be to leverage their customer success engineers upfront and don't let them go until you've hit all your use cases. Constantly be in touch with them to understand what some of the forward-thinking ideas are and what some of the cutting-edge use cases are that their other customers might be getting into.
We don't make use of Kentik's ability to overlay multiple datasets, like orchestration, public cloud infrastructure, network paths, or threat data onto our existing data. That is something we're evaluating. We're currently talking with a couple of teams that are moving to AWS, teams that would like to use Kentik to potentially capture VPC flow logs and overlay that with their application performance data. That is something that is currently on-hold, pending some other priority work. We will probably dive back into that, with that team, around mid-2020.
For maintenance, it requires less than one full-time engineer because it's a SaaS model.
In terms of overall vendor partnership, I'd give Kentik a nine out of 10. They're right up there as one of my best partners to work with, amongst all the contracts that I own. They're very customer-centric. They're always available. There's nothing too small or too big that I can't ask them to help with, and they seem to be willing and able to jump in no matter what. That customer focus, which is a theme across the digital world right now as companies try to do more of it, is something Kentik does a really good job of embodying.
We're using Kentik for flow data, so we can do things like peering management and interconnection research, as well as capacity management. We also use it fairly heavily in our tech cost-reporting so we can see things such as how many dollars per gigabyte and how much we're using.
The deployment model is cloud, which Kentik provides.
Before using the solution, we had to do all these manual tasks, such as running all these queries manually, and building our tech cost-report used to be a two or two-and-a-half-week effort. Using Kentik, and the automation that it provides us, we've brought that down to a day or two, which is a massive time savings.
Our capacity managers would say the visibility and the dashboards have improved the way our organization functions. They can see, at a glance, which of our data centers is serving which countries, and that really helps them in their planning.
In terms of the solution's months of historical data for forensic work, we reformulated the way we calculate costs so we had to go back into the historical data and use that raw data to calculate the costs again. It wasn't necessarily forensic networking, but it was forensic cost and business work.
The queries, in general, are great, as is the ability to add tagging to the flows.
Having the API access allows us to do a great deal of automation around a lot of our reporting and management tools. I'm the manager of automation at our company, so for me, that's the big winner.
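To give a sense of what that API-driven automation can look like, here is a hedged Python sketch that pulls aggregate traffic and turns it into a simple dollars-per-gigabyte figure, similar in spirit to our tech cost-report. The endpoint, auth headers, query text, column names, and the $0.02/GB rate are all assumptions for illustration, not our actual tooling.

```python
# Sketch of API-driven cost reporting. The endpoint, header names, query text,
# column names, and the cost rate are assumptions for illustration only.
import csv
import requests

KENTIK_API = "https://api.kentik.com/api/v5/query/sql"    # assumed endpoint
HEADERS = {
    "X-CH-Auth-Email": "user@example.com",                 # assumed auth header names
    "X-CH-Auth-API-Token": "YOUR_API_TOKEN",
}
COST_PER_GB = 0.02                                         # hypothetical transit rate, USD

def monthly_bytes_by_site() -> list:
    """Fetch total bytes per site for the last 30 days (illustrative query)."""
    sql = ("SELECT i_device_site_name AS site, sum(both_bytes) AS total_bytes "
           "FROM all_devices WHERE ctimestamp > 2592000 GROUP BY site")
    resp = requests.post(KENTIK_API, json={"query": sql}, headers=HEADERS, timeout=60)
    resp.raise_for_status()
    return resp.json().get("rows", [])

if __name__ == "__main__":
    with open("tech_cost_report.csv", "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["site", "gigabytes", "estimated_cost_usd"])
        for row in monthly_bytes_by_site():
            gb = float(row["total_bytes"]) / 1e9
            writer.writerow([row["site"], round(gb, 2), round(gb * COST_PER_GB, 2)])
```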
We're using the dashboard, but that's not as high-priority a use case for us.
We're also using Kentik to ingest metrics. It's a useful feature, and its response time, whenever we're pulling back the data, is better than our on-prem solution's. That's one of the points in its favor.
There is room for improvement around the usability of the API. It's a hugely complex task to call it and you need a lot of backing to be able to do it. I should say, as someone who's not in networking, maybe it's easier for people who are in networking, but for me that one part is not very user-friendly.
It's very stable. We have not had any issues that I've heard of since it was brought online.
So far we've been able to more than double and triple our use cases for it and it hasn't really even hiccupped. The scalability seems good.
We have data centers all around the world and Kentik is in every data center. We plan to build many more, and it's going to be included into each of those builds. We're using it globally and expect to keep growing its use.
Technical support has been excellent. We've had many novel use cases which we've had to send back to them, and their response times, and the solutions that they've given us, have always been better than satisfactory. They've met our needs and then some.
We're also using a competitor, Druid, an open-source solution that we host on-prem. So we're actually running both cloud and on-prem solutions. At the moment we use each solution to verify the other.
We decided to bring in Kentik partially because of cost. Our particular instantiation of Druid is enormously expensive. The director of capacity management wanted to spin up Kentik, originally as a PoC, to see what kind of data we could get out of it for a lower price point. But then he quit, and we're still just going along with it. Eventually a decision will be made between them, but initially it was cost that pushed us toward Kentik, to see if we could get more bang for our buck.
A related difference between the two solutions is that with Kentik, and the infrastructure owned by someone else since it's a SaaS solution, the overhead on our side is much lower. We're also not necessarily on the hook for dedicated infrastructure costs. So, the price is amazing.
With Druid, every node records all of the data that goes across the routers and such; there are no limits. With Kentik, occasionally we'll drop data, or a spike in traffic will make the data unreliable. That does not happen in Druid because it's not metered the same way.
So far, we have seen ROI by going with Kentik through the time savings I already spoke about, with the ability to automate. But there is also ROI just straight up on the cost. It's considerably cheaper, and from what we've found, the data is just as rich. It's a very meaty data point in some of our planning for the future.
We have an annual contract with Kentik that we renew each year for a set number of licenses. We also have some burstable licenses which we can spin up and spin down, and those are paid as they are used. We have 20 of those licenses but we don't get charged for them unless we use them.
Rely on the customer service reps. That would be my biggest piece of advice because they've got all the good tips and tricks.
The user base of Kentik in our company is very small, about 15 people or less. That includes our interconnection managers, peering managers, and capacity managers, as well as my small team of software developers.
Deployment and maintenance of the solution require one person or less. Nobody needs to make it their full-time job, which is nice. Those responsibilities are spread across several people. One of the interconnection managers helps me, for example.
Overall, I would rate Kentik at eight out of ten. The fact that we can lose data whenever we have traffic spikes, which our business does pretty regularly, is what would keep any solution from a ten, because I can't always, for every data point, say this is accurate. Occasionally there's an asterisk next to the data.
For our purposes, where we're at today, and even in the past, to analyze flows and to pull specific data and understand where our traffic is going to — which AS path — that's primarily the value that I extrapolate from Kentik.
It's mostly on-prem. We do some stuff with GCP and AWS, but licensing is primarily based on the number of pieces of equipment we have on-prem that we actually attach it to. We have over 55 edge nodes and about 10 compute nodes.
We can actually see what we're doing now. When it comes to making an educated decision on a number of things, if you have no visibility into what you're doing, you really can't make that decision. Collecting that data and having those metrics first-hand, in real-time, allows us to make an educated decision, versus an uneducated guess.
Kentik has proactively detected network performance degradation, availability issues, and anomalies. Before, we had no visibility: when we had congestion, things would happen and it was hard to troubleshoot where they were coming from. Fixing that was one of the first things we were able to do.
A specific example is where we had a number of tenants that were created that were getting DDoS'ed. We couldn't understand how or why we were getting DDoS'ed because we had no visibility. We were guessing. Kentik opened up and showed us where the traffic was coming from and how we could go about mitigating.
It lets us understand what those attacks are, versus not actually knowing where they're coming from or how they're affecting us. It cuts down the time it takes for us to troubleshoot and actually mitigate by about 50 percent, guaranteed, if not more. But we're running a bunch of GRE IP sectionals. It's not like we have huge amounts of capacity. But for some of our large customers, it really has helped us detect what the problem is, instead of guessing.
At my previous company, it improved our total network uptime by about 20 percent. I wouldn't correlate that back to Kentik in my current company.
The most valuable feature is being able to pull traffic patterns to and from destinations. We're able to understand where our traffic is going, our top talkers from an AS set, as well as where our traffic's coming from.
The only downside to Kentik, something that I don't like, is that while it's great that it shows you where these anomalies lie, it's not actionable. Kentik is valuable, don't get me wrong, but if it had an actionable piece to it... I keep telling them, "Man, you need to find a way to make it actionable because if you could actually mitigate, it'd be huge what you guys could do."
The way things are, we have to have some sort of DDoS mitigation, like Arbor or something of that nature. Once the anomaly is detected, that's great, but then you have to mitigate. If Kentik had mitigation, or if they could acquire a solution and throw it onto their platform and have that portion available, that would be huge.
I have been using Kentik at this company for about a year and, prior to that, I used it at a previous job for about another year.
Coming into this company, I felt they were flying blind, meaning they didn't really have anything from a monitoring standpoint. They didn't understand how decisions were made. And to make educated decisions, you actually have to have the proper tools in place. Kentik was a tool that I know works really well.
Kentik has pretty good intuition, as a company, as to where the market sits and what they're into. They don't delude themselves. They really focus. They've been pretty good. I know the leadership over there and it seems like between Justin and Avi, they're good at what they do and that's why I'll continue to use them.
Anywhere I go, I'm going to use Kentik if I have the chance.
Kentik is a mature software product and is provided as SaaS. This is very valuable to me, because I don’t have to maintain a server, worry about resources, updates to software or OS, hard-drive space, or backups. I can just use Kentik to get the data in the format I want, such as reports or ad-hoc information.
Good NetFlow visualization and analysis tools are valuable to anybody who needs to understand the traffic that flows in and out of the network.
This solution helps with planning, security, and troubleshooting. Kentik’s data explorer allows easy access to all data, grouped by any variable. You can do a quick overview of the top ten users or peer performance with very little effort.
One of our Network Operations Centers has a large overview screen with a web browser that shows Kentik and the data explorer running. This provides a constant overview of live traffic sorted by the source port.
We use Kentik to monitor the network and get alerts from its alert module if there is a DDoS or other attack on our network.
Kentik is constantly improving. I have seen their alert portion grow this year to include a new Beta that allows you to use automatic mitigation with multiple platforms.
Kentik is used when customers call in to troubleshoot their internet service and to decide on new peering partners.
I would like to see more granular user and security rights. Currently, a user can be a member or an administrator. I would like to limit what a user can see, be it IP or interface. I would like to be able to give my customers access to the data explorer with just their data.
We have been using this solution for about a year.
We did not encounter any issues with stability.
We did not encounter any issues with scalability.
The technical support is very good. Emails to support@kentik.com are answered quickly and competently. Kentik keeps in touch, listens to feedback, and cares.
We used multiple NetFlow products. Kentik shines with the ease of running reports and looking at data.
I was involved in the installation. The setup was easy. Exporting NetFlow records is all that is needed. I also set up a BGP session with Kentik. This allows Kentik to see the AS path and record it with each flow record as well.
The benefit in our case outweighs the cost. We use Kentik at the core of our network, which provides us with a central and detailed view of our traffic.
We have used other options, such as the open-source NfSen, as well as SolarWinds.
I would suggest giving it a try. A demo can be set up in no time. You can see for yourself whether or not you like it.
Firstly, DDoS alarming allows us to get a feel for the bandwidth of an attack and determine if mitigation is needed to prevent collateral damage. Secondly, the flow analysis lets us look at how traffic is transiting our network. This allows us to optimize metrics to reduce cost.
Kentik answers the flow question: what are my flows, where are they going, and what can I do to better optimize my connectivity? Kentik also baselines flow behavior and can alert you when there are abnormal flows, such as a DDoS.
We now have real metrics on DDoS attack vectors and use the alerting dashboard to gather information used in CLI filters and eventually in RTBH.
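To show the kind of glue involved in going from alert data to CLI filters or RTBH, here is a small Python helper as an illustration only. The input file, the discard next-hop, and the IOS-style null-route syntax are assumptions, not our production tooling; it simply sketches turning a list of offending prefixes into trigger-router statements.

```python
# Toy helper: turn a list of offending prefixes (for example, copied from a
# Kentik DDoS alert) into static null-route statements for an RTBH trigger
# router. The input file name, next-hop, and IOS-style syntax are assumptions.
import ipaddress
import sys

def null_route_lines(prefixes, next_hop="192.0.2.1"):
    """Emit one blackhole route per validated prefix, tagged for RTBH policy."""
    lines = []
    for p in prefixes:
        net = ipaddress.ip_network(p.strip(), strict=False)   # validates the prefix
        lines.append(f"ip route {net.network_address} {net.netmask} {next_hop} tag 666")
    return lines

if __name__ == "__main__":
    # Usage: python rtbh_lines.py attack_prefixes.txt   (one prefix per line)
    with open(sys.argv[1]) as fh:
        prefixes = [line for line in fh if line.strip()]
    print("\n".join(null_route_lines(prefixes)))
```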
Firstly, my Dashlane password manager attempts to fill in the dimensions field for me, so I just turn off my password manager when that occurs.
Secondly, sometimes it's difficult to order the dimensions correctly when trying to make Sankey flow diagrams. It'd be nice if there were a knob somewhere in my user settings that allowed me to make the dimensions box a single column from top to bottom, so I don't have to spend extra time trying to drag a dimension into the correct column to get the order right.
I have used Kentik since April of 2016; usually four times a week.
We have not encountered any stability issues.
We have not encountered any scalability issues. Kentik allows us to set sampling of flows on a per device basis.
Technical support is proactive in letting us know when we accidentally stop sending them flows. Additionally, when asking for help in configuring BGP settings, they have expert level knowledge in CLI configuration of network devices.
We did trials on a few competitor solutions. They were too slow, too complex, and required lots of on-premises touches to fix their equipment. They crashed often and they had poor customer service.
Initial setup was relatively straightforward. We had to evaluate which method of flow export/ingestion to use, implement the samplicator instance and then send Kentik the flows. We also had to exchange some information for BGP and SNMP settings.
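For anyone unfamiliar with the fan-out step, the sketch below shows the idea in a few lines of Python: receive flow export datagrams on one UDP port and re-send each one to several collectors, Kentik plus any local tools. The real samplicator is far more efficient and can preserve the original source address; the ports and destination hosts here are placeholders.

```python
# Toy illustration of flow fan-out (what a samplicator instance does for us):
# listen for NetFlow/IPFIX datagrams on one UDP port and replicate each packet,
# unchanged, to several collectors. Ports and destinations are placeholders.
import socket

LISTEN_ADDR = ("0.0.0.0", 2055)                  # routers export flows to this port
DESTINATIONS = [
    ("flow-ingest.kentik.example", 9995),        # placeholder for the Kentik ingest target
    ("127.0.0.1", 9996),                         # placeholder for a local collector
]

def run():
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(LISTEN_ADDR)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        datagram, _src = rx.recvfrom(65535)      # one flow export packet
        for dest in DESTINATIONS:
            tx.sendto(datagram, dest)            # forward it as-is

if __name__ == "__main__":
    run()
```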
I've told others that they charge based on the number of devices and provide a discount for education customers. In my role, I haven't been exposed to the cost of the product.
We looked at Plixer Scrutinizer.
If they haven't already decided to use it, I typically log into my portal and show them its capabilities. Then I let them know they can get a trial for their network. If they have already decided to use the product, then I tell them they are in capable hands, because the customer support knows networks and servers very well.
The DDoS alerting was, at first, the most useful. It was able to alert the entire team of more than 20 that the issues with the website were actually network based, instead of, say, bad code. In time, we mitigated the DDoS attack surface, so the usefulness is still there. We just don't see it every day.
Now we use Kentik for more nuanced traffic insight. This is ad hoc usually, but we do email 'peering' reports daily to the lead network engineers. This gives them some view into new traffic patterns we are picking up in IXes.
I find it very useful to see when traffic destined for a prefix that we prefer ingress on in the East Coast actually ingresses or egresses on the West Coast. It shows the difference between BGP paths vs. regional expectations.
The alerting ability is greatly improved. I think there is some movement still to make this into a 'dumb mode' vs 'expert mode'. There is the SQL-like syntax, but that is expert+.
I have used Kentik for 2.5 years.
We rarely, if ever, had any stability issues.
I have not had any scalability issues.
Technical support is second to none.
We used in-house, hand-built things. All based on binary RRDs or worse.
Initial setup was very straightforward. Nothing I needed too much help with.
There is a large difference between BGP and normal nodes. I don't think this plays out to the best for the customer or Kentik. To be able to split off the BGP vs PPS requirements would be good.
We've evaluated almost everything except SiLK.
Use the technical support if you need it. They are excellent.
We have put it on half of our large monitoring screens. Sometimes it is actually easier to identify an attack in incoming traffic using Kentik than it is using our own gear.
Even when we know what the traffic is, it allows us to jump directly into the next steps of our process more quickly, since we can visually see everything in one place and on one screen through the customizable dashboards.
Instead of just total traffic in bits or packets, we can get protocol, destination port, TCP flags; everything you might want.
Kentik has been remarkable at anticipating the design requirements of their customers. They have provided everything that I might want already. After using it for over six months constantly, I am still discovering new things.
The only times I've felt "I wish I could use this to do XYZ," I've contacted support and it turned out that I could already do it; I just didn't know whether to use the existing controls or a combination of query types.
Perhaps a better explanation: I would like to be able to see how tagging is captured, and to have a method of comparing my tagged interfaces on Kentik's side. Right now, I can go in and look at all of the interfaces that they're receiving the flow for, and sort and filter them, but there is no way for me to easily compare them between my nodes. I need to add, though, that's really not a missing feature of their product; it is just a way to help troubleshoot my own (potentially broken) systems.
I add the tags to my own devices, not them. However, if we’ve made a mistake on our side, it’s a basic row-by-row comparison. I believe there is a way to use their SQL query feature to pull a better comparison but a method of using the GUI would be nice.
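For what it's worth, that row-by-row comparison can be scripted outside the GUI. The sketch below diffs an interface/tag export from Kentik against one from our own systems; the file names and column names are hypothetical, so adjust them to whatever your exports actually contain.

```python
# Minimal sketch of the row-by-row comparison: diff an interface/tag export
# pulled from Kentik against one from our own systems. File names and column
# names are hypothetical placeholders.
import csv

def load_tags(path):
    """Map (device, interface) -> tag from a CSV with those three columns."""
    with open(path, newline="") as fh:
        return {(r["device"], r["interface"]): r["tag"] for r in csv.DictReader(fh)}

if __name__ == "__main__":
    kentik = load_tags("kentik_interfaces.csv")    # hypothetical export from Kentik
    local = load_tags("local_interfaces.csv")      # hypothetical export from our side
    for key in sorted(set(kentik) | set(local)):
        if kentik.get(key) != local.get(key):
            device, iface = key
            print(f"{device} {iface}: kentik={kentik.get(key)!r} local={local.get(key)!r}")
```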
I have used this solution for about six to eight months. For the first five months, I used it as a standard user; since then, my organization has created a separate admin account for me. In total, I have used it for about eight months.
We have not experienced any stability issues. Other than the planned maintenance, which is short, it is always available and working great.
There have been a few very minor bugs; for instance, the auto-refresh was not working on the dashboards. When we notified them of it, they responded in less than an hour; they had replicated the issue and were working on a fix. A day later, it was done.
We have not scaled the product past the current level we are at. However, I don't see how that could ever be an issue. You just send them the flow from your devices.
If you’re scaling, you make sure your interfaces are sending the data and you're golden.
The level of technical support is beyond any vendor that I have ever worked with before.
The service is totally hosted by Kentik, with a web portal and API. I have not had issues with it being available; I have never tried to get to it expecting it to be available and had it not load. Occasionally we'll get an email or a pop-up notification in the web UI that planned maintenance will take Kentik down for an hour or so; these notices come a few days in advance of the planned work.
The only issue we have had of a technical nature was with their dashboards. A dashboard is a custom page you build and lay out manually with different Data Explorer queries; you then turn on auto-refresh and let it continue to build the graphs as time moves on. This auto-refresh feature stopped working after an update to the Kentik UI's look and feel. When we noticed it was not functioning, we sent them an email; they responded quickly and told us they had replicated the issue and were working on a fix. The next day they told us to try it again, and they had indeed already fixed it. I rarely get such prompt attention to an issue.
I have used SolarWinds in another company. You get a very simple, non-configurable type of view with green, yellow, red and ingress/egress numbers. It doesn’t compare to the analytical capabilities that Kentik has.
It was set up before I joined this organization.
I am not a part of the purchasing or evaluation in any way. We still use Cacti for general stuff, but Kentik has replaced it on half of our boards so far.
While I was not a part of the implementation, if you know how to set up NetFlow on your device, just point it at Kentik. They have another setup option for a sensor that lives in your network. I have only heard of it; never used it or spoken to anyone that has.
This product is easily the best network monitor that I’ve ever seen or heard about.