Paul Adams - PeerSpot reviewer
Consultant Solution Architect at a tech services company with 51-200 employees
Consultant
Straightforward implementation, highly resilient, and good support
Pros and Cons
  • "The most valuable feature of Apache Kafka is its versatility. It can solve many use cases or can be a part of many use cases. Its fundamental value is in its real-time processing capability."
  • "Managing Apache Kafka can be a challenge, but there are solutions. I used the newest release, as it seems they have removed Zookeeper, which should make it easier. Confluent provides a fully managed Kafka platform, in which the cluster does not need to be managed."

What is our primary use case?

We had an application stack consisting of Salesforce frontend and a Commander VPN position management system and used Apache Kafka to decouple the microservices. Additionally, we planned to use Kafka for stream processing and to use event sourcing to pull data from legacy systems and reference data to form a compacted topic that the microservices could consume.
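As a rough illustration of what a compacted topic gives the consuming microservices, the sketch below simulates log compaction in plain Python: for each key, only the most recent value survives. It is an in-memory model, not the Kafka client API. The `compact` function and the sample reference-data records are hypothetical, and real compaction runs in the background and keeps delete markers for a configurable period rather than dropping them immediately.

```python
from typing import Iterable, List, Optional, Tuple

# (key, value); a value of None models a delete marker (tombstone).
Record = Tuple[str, Optional[str]]


def compact(log: Iterable[Record]) -> List[Record]:
    """Keep only the latest record per key; drop keys whose latest value is a tombstone."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)
    # Surviving records keep their relative order in the log.
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_offset, value) in survivors if value is not None]


# Hypothetical reference data pulled from a legacy system.
reference_data = [
    ("account-1", "status=open"),
    ("account-2", "status=open"),
    ("account-1", "status=closed"),
    ("account-2", None),  # tombstone: account-2 was deleted
]
compacted = compact(reference_data)
```

A microservice that reads the compacted result from the beginning sees the current state per key without replaying every historical update.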

The usage of Kafka is a combination of deploying on a personal Kubernetes cluster or using a managed service such as MSK. However, most people who use Kafka are using a managed service provided by Confluent. It can be deployed on the cloud or on-premise.

What is most valuable?

The most valuable feature of Apache Kafka is its versatility. It can solve many use cases or can be a part of many use cases. Its fundamental value is in its real-time processing capability.

You need time-sensitive technology now, particularly in the analytics space. We have looked at using change data capture and Apache Kafka to modernize our analytics capabilities. Additionally, microservices can be used to capture events from legacy systems.

What needs improvement?

Managing Apache Kafka can be a challenge, but there are solutions. I used the newest release, as it seems they have removed Zookeeper, which should make it easier. Confluent provides a fully managed Kafka platform, in which the cluster does not need to be managed.

It would help if native Apache Kafka had schema registry capabilities; at present, this type of functionality is often provided by third-party tools. Additionally, there may be a need for improved manageability and additional tools to manage the cluster, including standard operational metrics and inbuilt management capabilities.

For how long have I used the solution?

I have been using Apache Kafka for approximately three years.

Buyer's Guide
Apache Kafka
April 2024
Learn what your peers think about Apache Kafka. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
769,479 professionals have used our research since 2012.

What do I think about the stability of the solution?

The solution is highly resilient.

I rate the stability of Apache Kafka a nine out of ten.

What do I think about the scalability of the solution?

Apache Kafka is scalable.

I rate the scalability of Apache Kafka a nine out of ten.

How are customer service and support?

The support from Apache Kafka is good.

How was the initial setup?

Setting up an Apache Kafka cluster is easy. I did the initial setup on my laptop and it was straightforward. I used the Confluent version, but even if you want to run native capabilities, the implementation is straightforward.

What about the implementation team?

The recent proof of concept was done on behalf of a client by a system integrator. Similarly, the previous one was mainly done in-house and it utilized Confluent, Apache Kafka, and MSK. The process involved setting up pre-built capabilities.

What's my experience with pricing, setup cost, and licensing?

The price of the solution is low.

I rate the price of Apache Kafka a nine out of ten.

What other advice do I have?

I rate Apache Kafka a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Integrator
PeerSpot user
Assistant Professor at CHAROTAR UNIVERSITY OF SCIENCE AND TECHNOLOGY
Real User
Difficult to configure, lacking automation, but has good community support
Pros and Cons
  • "The valuable features are the group community and support."
  • "The solution can improve by having automation for developers. We have done many manual calculations and it has been difficult but if it was automated it would be much better."

What is our primary use case?

We are in the early stages of testing this solution in our lab as a demo. It is in development and we are not in production at this point.

We are using this solution to relay events when they happen to multiple receivers at once to allow better functionality.

How has it helped my organization?

Apache Kafka has helped our client's online restaurant company by allowing them to take any orders and send the notifications with some other details, such as logic commands, to the different microservices.

What is most valuable?

The valuable features are the group community and support.

What needs improvement?

The solution can improve by having automation for developers. We have done many manual calculations and it has been difficult but if it was automated it would be much better.

For how long have I used the solution?

I have been using this solution for approximately three months.

What do I think about the scalability of the solution?

The solution's scalability is important for our ability to have more throughput from multiple receivers. If we need more throughput it can deliver.

Which solution did I use previously and why did I switch?

We did use other solutions previously but this solution makes things a lot easier.

How was the initial setup?

The installation is fairly easy. Additionally, there is a cloud-based version available if a use case requires it.

What about the implementation team?

We did the implementation ourselves.

What's my experience with pricing, setup cost, and licensing?

The solution is free and open-source.

What other advice do I have?

There is a lot of configuration involved in this solution. We have found many configurations that have helped us but it would be beneficial if there was automation. 

I rate Apache Kafka a five out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Software Support & Development Engineer at a computer software company with 501-1,000 employees
Real User
Scalable and free to use
Pros and Cons
  • "Apache Kafka is scalable. It is easy to add brokers."
  • "Apache Kafka can improve by making the documentation more user-friendly. It would be beneficial if we could explain to customers in more detail how the solution operates, but the documentation gets highly technical quickly. For example, a simple page that shows customers how it works, without requiring a computer science background, would help."

What is our primary use case?

Apache Kafka is used for connecting components to each other within the same application. The use is quite limited, but I was curious about its filtering capability.

How has it helped my organization?

We implemented the notification system between our components, and we found that Apache Kafka performs well in scalability. It has improved our organization because of the scalability and the comfort of a fail-safe or disaster recovery it provides.

What needs improvement?

Apache Kafka can improve by making the documentation more user-friendly. It would be beneficial if we could explain to customers in more detail how the solution operates, but the documentation gets highly technical quickly. For example, a simple page that shows customers how it works, without requiring a computer science background, would help.

For how long have I used the solution?

I have been using Apache Kafka for approximately two years.

What do I think about the scalability of the solution?

Apache Kafka is scalable. It is easy to add brokers.

We have approximately 30 people using this solution in my organization. They use the solution daily.

Which solution did I use previously and why did I switch?

I have only used Apache Kafka.

How was the initial setup?

The initial setup of Apache Kafka took some time, but after that it was easy.

I rate the initial setup of Apache Kafka a three out of five.

What about the implementation team?

We set up the solution in-house.

What's my experience with pricing, setup cost, and licensing?

This is an open-source solution and is free to use.

What other advice do I have?

We have not used the solution in production. We do not have a lot of data at the moment.

I would recommend this solution to others.

I rate Apache Kafka an eight out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Sr Technical Consultant at a tech services company with 1,001-5,000 employees
Real User
Effective stream API, useful consumer groups, and highly scalable
Pros and Cons
  • "The most valuable features are the stream API, consumer groups, and the way that the scaling takes place."
  • "I would like to see real-time event-based consumption of messages rather than the traditional way through a loop. The traditional messaging system works by listening and looping with a small wait to check what the messages are. A push system is where you have something that is ready to receive a message, and when the message comes in and hits the partition, it goes straight to the consumer rather than the consumer having to pull. I believe this consumer approach is something they are working on and may come in an upcoming release. However, that is message consumption versus message listening."

What is our primary use case?

One of our clients needed to take events out of SAP to stream them through Apache Kafka while applying data enrichment before reaching the consumers.

How has it helped my organization?

The solution can handle more speed and scales horizontally, both for messaging and, more specifically, for stream processing and data enrichment. Using this solution can reduce the number of components required in the tech stack. For example, we were taking data events out of SAP and sending them to consumers without having to go through multiple processors outside of the Kafka space. Additionally, we are using Kafka from GoldenGate to propagate database updates in real time.

What is most valuable?

The most valuable features are the stream API, consumer groups, and the way that the scaling takes place. 

What needs improvement?

I would like to see real-time event-based consumption of messages rather than the traditional way through a loop. The traditional messaging system works by listening and looping with a small wait to check what the messages are. A push system is where you have something that is ready to receive a message, and when the message comes in and hits the partition, it goes straight to the consumer rather than the consumer having to pull. I believe this consumer approach is something they are working on and may come in an upcoming release. However, that is message consumption versus message listening.
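The difference between the polling loop and a push-style handler can be sketched in plain Python. This is an in-memory model under stated assumptions: `FakeConsumer`, its `poll` method, and `run_with_handler` are hypothetical stand-ins for illustration, not a real Kafka client.

```python
from collections import deque
from typing import Callable, Deque, List


class FakeConsumer:
    """Hypothetical stand-in for a pull-based consumer reading one partition."""

    def __init__(self, messages: List[str]) -> None:
        self._pending: Deque[str] = deque(messages)

    def poll(self, max_records: int = 2) -> List[str]:
        """Return up to max_records messages; an empty list means nothing is waiting."""
        batch: List[str] = []
        while self._pending and len(batch) < max_records:
            batch.append(self._pending.popleft())
        return batch


def run_with_handler(consumer: FakeConsumer,
                     on_message: Callable[[str], None],
                     max_empty_polls: int = 1) -> int:
    """Hide the poll loop behind a callback and return the number of polls made.

    A real loop would block or wait briefly between empty polls; this sketch
    simply stops after max_empty_polls empty batches so that it terminates.
    """
    polls = 0
    empty = 0
    while empty < max_empty_polls:
        batch = consumer.poll()
        polls += 1
        if not batch:
            empty += 1
            continue
        for message in batch:
            on_message(message)
    return polls


received: List[str] = []
poll_count = run_with_handler(FakeConsumer(["m1", "m2", "m3"]), received.append)
```

From the caller's side the handler looks event-driven, but the consumer is still pulling underneath, which is the distinction the reviewer draws between message consumption and message listening.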

Confluent created the KSQL language, but they gave it to the open-source community. I would like to see KSQL be able to be used on raw data versus structured and semi-structured data.

For how long have I used the solution?

I have been using this solution for approximately one year.

What do I think about the stability of the solution?

The solution is stable.

What do I think about the scalability of the solution?

I have found Apache Kafka to be highly scalable.

How are customer service and technical support?

The project we were working on was open-source. We were using Confluent for support and they were great.

How was the initial setup?

Apache Kafka on AWS is a bit complex. There is a third-party company called Confluent, and their support makes installation much easier, especially for the on-premise deployment. If you install Apache Kafka alone, it can be a little complex compared to other queuing and messaging solutions.

The on-premise deployment takes a few days. The cloud or hybrid deployments, including all the permissions, topologies, firewalls, and networking configuration, can take weeks for all the accessibility issues to be resolved. However, the delay could have been client-related and not necessarily caused by the solution.

What about the implementation team?

We provide the implementation service.

What's my experience with pricing, setup cost, and licensing?

Apache Kafka is free. My clients were using Confluent which provides high-quality support and services, and it was relatively expensive for our client. There was a lot of back and forth on negotiating the price.

Confluent has an offering with cloud-based pricing. There are different packages, prices, and capabilities, with the highest level being the most expensive. AWS provides services to their market, for example, to have Kafka running. I do not know what the pricing is, but I am fairly confident that Azure and GCP provide similar services.

What other advice do I have?

My advice to others wanting to implement this solution is to start with data streaming projects, not simple messaging projects because while it is very good at general-purpose messaging, it is more suited and geared for when you are using it as a streaming solution.

I rate Apache Kafka an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Owner at Binarylogicworks.com.au
Real User
Good performance and resilience, but it is complex and has a learning curve
Pros and Cons
  • "The most valuable feature is the performance."
  • "Kafka is complex and there is a little bit of a learning curve."

What is our primary use case?

I am a solution architect and this is one of the products that I implement for my customers.

Kafka works well when subscribers want to stream data for specific topics.

What is most valuable?

The most valuable feature is the performance.

What needs improvement?

Kafka is complex and there is a little bit of a learning curve.

For how long have I used the solution?

I have been using Apache Kafka for between one and two years.

What do I think about the stability of the solution?

Resilience-wise, Kafka is very good.

What do I think about the scalability of the solution?

Kafka is a very scalable system. You can have multiple, scalable architectures.

How are customer service and technical support?

I have not seen any problems with technical support. There is licensed support available, which is not the case with all open-source solutions. Open-source products often have issues when it comes to getting support.

Which solution did I use previously and why did I switch?

I have customers who were using IBM MQ but they have been switching to open-source.

How was the initial setup?

The initial setup was straightforward for me. However, it is not straightforward for everyone because there are some tricky things to implement. In single-mode it is a little bit easier, but when it is set up as a distributed system then it is more complex because there are a lot of things to be considered.

What's my experience with pricing, setup cost, and licensing?

Kafka is open-source and it is cheaper than any other product.

Which other solutions did I evaluate?

There is a competing open-source solution called NATS but I see that Apache Kafka is widely used in many places.

Performance-wise, Kafka is better than any of the other products.

What other advice do I have?

This is currently the product that I am recommending to customers. Some customers want an open-source solution.

There are some newer products that are coming on to the market that are even faster than Kafka but this solution is very resilient.

In the long run, I think that open-source will dominate the space.

I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Building Event-centric Data processing Architectures at a tech services company with 51-200 employees
Real User
The product is scalable and provides good connectors, but the ability to connect the producers and consumers must be improved
Pros and Cons
  • "The connectors provided by the solution are valuable."
  • "The ability to connect the producers and consumers must be improved."

What is our primary use case?

We use the solution for analytics for streaming. We also use it for fraud detection.

What is most valuable?

The Kafka Streams library gives quite a bit of functionality. The connectors provided by the solution are valuable.

What needs improvement?

The ability to connect the producers and consumers must be improved. It's still a pain point because a lot of development goes into it.

For how long have I used the solution?

I have been using the solution for seven to eight years.

What do I think about the stability of the solution?

For what it does, the tool is very stable. It is a message broker. It receives the messages and holds them for producers and consumers. It's usually everything around Kafka that has stability problems because Kafka does exactly what it's supposed to do.

What do I think about the scalability of the solution?

Scalability is one of the main selling points of the tool. The additional nodes we add give us the additional storage capacity we need. I rate the scalability a ten out of ten. The solution is used across multiple domains in our organization. I use the product daily. It’s a continuously growing platform.

How are customer service and support?

Apache doesn't provide support. There are sites we can go to for information, but there's no support team for Apache. There are companies like Confluent and HPE that provide support for the solution.

Which solution did I use previously and why did I switch?

We also use Flink and other streaming tools. We use Apache Kafka in addition to other technologies because of the requirement and the business use cases.

How was the initial setup?

It is super easy to set up. I rate the ease of setup a ten out of ten. However, building and administration get quite difficult. It takes three months to make things production-ready.

What about the implementation team?

The deployment was done in-house. We used the tools that we have in our CI/CD pipeline. We needed three people for the deployment. The infrastructure team maintains the tool. The infrastructure team has three to ten members.

What was our ROI?

We see an ROI on the product. If we don't have a tool to buffer the amount of traffic coming in from high-traffic sites, we cannot use the data. Apache Kafka gives us a resting area where we can push as much information as we want to. It’s picked up by consumers when they need it.

It’s a huge return on investment. Otherwise, we must have a system tied to the producer waiting for the consumer to consume before we can do anything with the rest of the messages. A solution like Kafka provides us with a buffer to consume the data as we choose to.

What's my experience with pricing, setup cost, and licensing?

The price depends on who we are getting the product from. If we buy it from Confluent, we always have to try to negotiate the price. The price is always negotiable.

What other advice do I have?

Overall, I rate the product a six out of ten.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Assistant Student at a retailer with 5,001-10,000 employees
Real User
Reliable solution for processing broker messages from many clients
Pros and Cons
  • "The most valuable feature is the messaging function and reliability."
  • "Something that could be improved is having an interface to monitor the consuming rate."

What is our primary use case?

I have a lot of messages, and we need to process those messages from many clients. Each client takes those messages and processes them.

I'm using the brokerage partner. I'm not storing or maintaining the application on servers. I'm just a client for the Apache Kafka server.

The solution is deployed on-prem.

How has it helped my organization?

Apache Kafka has improved our organization because it's more reliable than Rabbit. That's the whole point for us.

What is most valuable?

The most valuable feature is the messaging function and reliability.

What needs improvement?

Something that could be improved is having an interface to monitor the consuming rate. We use something, but I'm not sure if it's from Apache Kafka, or if it's a borrowed third-party solution. So, the interface for monitoring the processes is an additional feature that could be added.
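One common way to express the consuming rate is consumer lag: per partition, the gap between the latest offset written and the offset the consumer group has committed. The sketch below computes it from two plain dictionaries. The function name and the sample offsets are hypothetical, treating a partition with no committed offset as starting from 0 is a simplifying assumption, and a real deployment would obtain these numbers from the cluster's own tooling.

```python
from typing import Dict


def consumer_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Return lag per partition as end offset minus committed offset, never below zero.

    Assumption for this sketch: a partition without a committed offset is
    treated as if the group had committed offset 0.
    """
    return {
        partition: max(end - committed.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


# Hypothetical offsets for a three-partition topic.
end_offsets = {0: 1200, 1: 950, 2: 400}
committed = {0: 1200, 1: 900}

lag = consumer_lag(end_offsets, committed)
total_lag = sum(lag.values())
```

Sampling `total_lag` at intervals shows whether consumers are keeping up: a steadily growing value means messages arrive faster than they are processed.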

For how long have I used the solution?

I have been using this solution for two years.

What do I think about the stability of the solution?

The solution is pretty stable compared to Rabbit or other brokers. 

What do I think about the scalability of the solution?

The solution is scalable. We have about 10 departments that use Kafka in various forms. Each department might have 5 or 10 people.

We use the solution all the time. We have consumers that consume messages that come every day because we have clients and customers for the main website. All of those messages go to Kafka clients. Our backend departments consume messages from the actions of the final customers.

Which solution did I use previously and why did I switch?

We used Rabbit and we switched to Kafka because it seemed like an upgrade in ability, reliability, and in the consuming process of broker messages.

How was the initial setup?

Implementation took half a year because everyone had to learn the solution. It was quite lengthy.

What other advice do I have?

I would rate this solution 9 out of 10.

My advice is to take some time in investigating how to implement the solution.

We required about half a year to implement it in our organization. Someone who needs to implement Kafka has to be prepared for quite a lengthy process. Don't expect implementation to be completed in a week. It takes longer because it's complex.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user653562 - PeerSpot reviewer
Solutions Architect at a consultancy with 1,001-5,000 employees
Consultant
Has the ability to write data at one velocity and have subscribing consumers read at different velocities.
Pros and Cons
  • "Apache Kafka is actually a distributed commit log. That is different than most messaging and queuing systems before it."
  • "The GUI tools for monitoring and support are still very basic and not very rich. There is no help in determining a shard key for performance."

How has it helped my organization?

Kafka has a guaranteed delivery mechanism that is very easy to set up. When starting out with minimal hardware, it can handle very large data volumes. When prototyping and creating a proof of concept, Kafka has helped to speed up the timeline from the prototype all the way to production volumes.

What is most valuable?

Apache Kafka is actually a distributed commit log. That is different than most messaging and queuing systems before it. I find the ability to write data at one velocity and have subscribing consumers read at different velocities to be the best feature.
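The commit-log idea, with readers moving at their own pace, can be sketched in a few lines of Python. This is an in-memory model for illustration only: the `CommitLog` class and the consumer names are hypothetical, and it leaves out partitions, replication, and retention.

```python
from typing import Dict, List


class CommitLog:
    """Append-only log in which each consumer tracks its own read position."""

    def __init__(self) -> None:
        self._records: List[str] = []
        self._offsets: Dict[str, int] = {}

    def append(self, record: str) -> int:
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, consumer: str, max_records: int) -> List[str]:
        """Return the next batch for this consumer and advance only its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = CommitLog()
for i in range(5):
    log.append(f"event-{i}")

# The fast consumer drains everything at once; the slow one takes two at a time.
fast = log.read("fast-consumer", max_records=5)
slow = log.read("slow-consumer", max_records=2)
slow_next = log.read("slow-consumer", max_records=2)
```

Because reading does not remove records, the writer never waits for the slow consumer, and the slow consumer never misses what the fast one has already read.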

What needs improvement?

The GUI tools for monitoring and support are still very basic and not very rich. There is no help in determining a shard key for performance.
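On choosing what the reviewer calls a shard key (in Kafka terms, the record key that selects a partition): records with the same key land on the same partition, so how the keys are distributed decides how evenly the load spreads. The sketch below shows the idea with `zlib.crc32` as a stand-in hash. That choice is an assumption for illustration and is not the hash Kafka's default partitioner uses; the helper names and sample keys are also hypothetical.

```python
import zlib
from collections import Counter
from typing import Iterable


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (crc32 is a stand-in only)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_counts(keys: Iterable[str], num_partitions: int) -> Counter:
    """Count how many records each partition would receive."""
    return Counter(partition_for(key, num_partitions) for key in keys)


# A skewed key: 90 of 100 records belong to one customer.
skewed_keys = ["customer-1"] * 90 + ["customer-2"] * 10
counts = partition_counts(skewed_keys, num_partitions=6)
busiest = max(counts.values())
```

With only two distinct keys, at most two of the six partitions receive any records and one of them takes at least 90, which is the kind of hot spot that a higher-cardinality key avoids.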

What do I think about the stability of the solution?

We did not have any issues with stability.

What do I think about the scalability of the solution?

We did not have any issues with scalability.

How are customer service and technical support?

  • Kafka is open source from LinkedIn and support comes from the community of users.
  • You can go with Confluent, the company that was founded by the original engineers from LinkedIn.
  • You can go with a cloud hosting service, like AWS EMR or Azure HDInsight.


    Which solution did I use previously and why did I switch?

    We used traditional message queues and file semaphores. There was a lot of overhead with asynchronous messages being put into an order and making sure nothing got dropped. It required a lot of code and maintenance.

    How was the initial setup?

    Since it is open source, you are on your own for setup. However, the tutorials from the Apache foundation and online sources have been an immense help.

    Getting started is very easy. The complexity of very large volumes of data and appropriate sharding, however, is difficult. There are fewer resources for tuning and best practices.

    What's my experience with pricing, setup cost, and licensing?

    When starting to look at a distributed message system, look for a cloud solution first. It is an easier entry point than an on-premises hardware solution. A lot of the complexity has already been taken care of. Both AWS and Azure have supported Kafka clusters that can be provisioned very easily.

    Which other solutions did I evaluate?

    We looked at RabbitMQ and Spark Streaming.

    What other advice do I have?

    Be sure to define the use cases as best as possible at first.

    Kafka is very good, but it is complex to support. It can handle any message size, whereas native cloud options have size limitations.

    Be sure to understand what messages will be sent and how many discrete topics will be needed.

    Be aware that you must code both producers and consumers.

    The bulk of the work is with the consumer.

    The Apache stack for Kafka is very open source. There are essentially no tools other than command-line options to monitor brokers and topic health. There are third-party tools that help with that, some free and some paid, but they require that you install agents on the servers hosting Kafka and open up ports for JMX MBeans in the scripts that start up the Kafka services. Additionally, you have to monitor ZooKeeper, which is very memory intensive. Cloud offerings that provide the whole modern data architecture stack, like AWS EMR and Azure HDInsight as well as Hortonworks and Cloudera, provide a console GUI as part of each of their offerings. Confluent, a company founded by the LinkedIn engineers who designed Kafka, also has a paid enterprise offering with much better tools for maintaining the Kafka cluster. But with Apache Kafka and the community, you are on your own.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user