Abdul-Samad - PeerSpot reviewer
Software Engineer at a tech services company with 201-500 employees
Real User
Top 10
It can manage a high volume of data from many sources
Pros and Cons
  • "Kafka is scalable. It can manage a high volume of data from many sources."
  • "The interface has room for improvement, and there is a steep learning curve for Hadoop integration. It was a struggle learning to send from Hadoop to Kafka. In future releases, I'd like to see improvements in ETL functionality and Hadoop integration."

What is our primary use case?

I use Kafka to send network packets from different sources to my cluster. We have around 10 users at my company.

What is most valuable?

Kafka is scalable. It can manage a high volume of data from many sources.

What needs improvement?

The interface has room for improvement, and there is a steep learning curve for Hadoop integration. It was a struggle learning to send from Hadoop to Kafka. In future releases, I'd like to see improvements in ETL functionality and Hadoop integration. 

For how long have I used the solution?

I have used Kafka for around six months.

Buyer's Guide
Apache Kafka
April 2024
Learn what your peers think about Apache Kafka. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
769,334 professionals have used our research since 2012.

What do I think about the stability of the solution?

I rate Apache Kafka seven out of 10 for stability. 

What do I think about the scalability of the solution?

I rate Kafka eight out of 10 for scalability. 

How are customer service and support?

I rate Apache support six out of 10. It was hard to find the information I needed. 

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

Before Kafka, I sent feeds directly to Hadoop.

How was the initial setup?

I initially found Kafka difficult to set up, so I would rate it about five out of 10 for ease of setup. After I learned more about the platform, I would rate it eight out of 10. It is deployed on-premises over a cluster of three or four PCs. You can deploy Kafka in a few hours with one person. 

What's my experience with pricing, setup cost, and licensing?

Kafka is open source. 

What other advice do I have?

I rate Apache Kafka eight out of 10. I would recommend it to others. 

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Guirino Ciliberti - PeerSpot reviewer
Data Governance & Lineage Product Manager at Primeur
Real User
Top 5
Impressive solution with a speedy deployment
Pros and Cons
  • "Deployment is speedy."
  • "It's not possible to substitute IBM MQ with Apache Kafka because the JMS part is not very stable."

What is our primary use case?

Our primary use case for this solution is streaming.

For how long have I used the solution?

We have been using this solution for four years.

What do I think about the stability of the solution?

The solution is stable. However, it's not possible to substitute IBM MQ with Apache Kafka because the JMS part is not very stable. It is inadequate and doesn't have the support of the MQI interface of IBM MQ.

What do I think about the scalability of the solution?

The solution is scalable. Deployment is speedy, but we don't have many installations. We have over a thousand users using this solution and will most likely increase the number of users because we have tested 100,000 messages per second. The solution is impressive.

Which solution did I use previously and why did I switch?

We previously used Mosquitto and Rabbit solutions, but we currently use Apache Kafka.

What's my experience with pricing, setup cost, and licensing?

We are licensed annually for this solution.

What other advice do I have?

I rate this solution a nine out of ten for streaming. I recommend it to other people. The solution is good, but its performance can be improved.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Principal Technology Architect at a computer software company with 5,001-10,000 employees
Real User
Events and streaming are persistent, and multiple subscribers can consume the data
Pros and Cons
  • "With Kafka, events and streaming are persistent, and multiple subscribers can consume the data. This is an advantage of Kafka compared to simple queue-based solutions."
  • "Kafka's interface could also use some work. Some of our products are in C, and we don't have any libraries to use with C. From an interface perspective, we had a library from Redis, and we are streaming some of the products we built to Redis. That is one of the requirements. It would be good to have those libraries available in a future release for our C++ clients or public libraries, so we can include them in our product and build on that."

What is our primary use case?

It's a combination of an on-premise and cloud deployment. We use AWS, and we have our offshore deployment that's on-premise for OpenShift, Red Hat, and Kafka. Red Hat provides managed services and everything. We use Kafka and a specific deployment where we deploy on our basic VMs and consume Kafka as well.

We publish or stream all our business events as well as some of the technical events. You stream it out to Kafka, and multiple consumers develop a different set of solutions. It could be reporting, analytics, or even some data persistence. Later, we used it to build a data lake solution. They all would be consuming the data or events we are streaming into Kafka.

What is most valuable?

With Kafka, events and streaming are persistent, and multiple subscribers can consume the data. This is an advantage of Kafka compared to simple queue-based solutions.
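
As a rough illustration of the point above, here is a minimal Python sketch of an append-only log where each subscriber tracks its own read position. It is a toy model, not Kafka's implementation: the `TinyLog` class, the subscriber names, and the event names are all hypothetical.

```python
from collections import defaultdict


class TinyLog:
    """Toy append-only log; a stand-in for a single topic partition."""

    def __init__(self):
        self.records = []                # records stay in the log after being read
        self.offsets = defaultdict(int)  # next offset to read, per subscriber

    def publish(self, event):
        self.records.append(event)

    def poll(self, subscriber):
        """Return every record this subscriber has not seen yet."""
        start = self.offsets[subscriber]
        batch = self.records[start:]
        self.offsets[subscriber] = len(self.records)
        return batch


log = TinyLog()
log.publish("order-created")
log.publish("order-paid")

# Two independent subscribers each receive the full stream.
reporting = log.poll("reporting")
analytics = log.poll("analytics")
```

In a simple queue, the first reader would remove each message, so the second reader would see nothing. Here both subscribers get both events because reading only advances that subscriber's offset.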

What needs improvement?

We are still on the production aspect, with our service provider or hyper-scalers providing the solutions. I would like to see some improvement on the HA and DR solutions, where everything is happening in real-time. 

Kafka's interface could also use some work. Some of our products are in C, and we don't have any libraries to use with C. From an interface perspective, we had a library from Redis, and we are streaming some of the products we built to Redis. That is one of the requirements. It would be good to have those libraries available in a future release for our C++ clients or public libraries, so we can include them in our product and build on that.

For how long have I used the solution?

We've been using Apache Kafka for the past two to three years.

What do I think about the stability of the solution?

Kafka is stable. It's a great product. 

What do I think about the scalability of the solution?

We did some benchmarking, but we are still looking to scale up the benchmarking and performance testing. So far, it meets all our business requirements. We are just developers, so everything goes to the clients, who will deploy it at their scale and use it for their end customers. So we are looking at it from a developer's perspective. Those who are developing the products are working on this.

How are customer service and support?

We haven't really contacted technical support, but some of our clients have subscribed to support from the vendors. We generally look for open-source solutions. From there, we try to figure out if there are any issues. There's a good online community where you can ask questions.

How was the initial setup?

We were able to deploy and use it with no problems for our use case. We didn't find it so complex. We work with so many applications, databases, Postgres, and so many other things, so we could manage it easily. We deployed Kafka in a few hours. We have an infrastructure team and DevOps. Those teams are pretty capable, and they've completely automated the whole deployment. It always takes time the first time you upgrade any application, not just Kafka. We might discover some issues, such as configuration, parameters, compatibility, etc. Once that becomes standard, it is stable, and then they only need to replicate it to the different environments or different developers groups. We have a sophisticated process.

What other advice do I have?

I rate Apache Kafka eight out of 10. There are so many products on the market, so my advice is to consider if Kafka suits your business requirements first. If it's suitable, the next step is to check whether all the technical requirements are met. If everything checks out, I would say that Kafka is a relatively stable, sound, and scalable product, so they can try it out. 

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Project Engineer at Wipro Limited
Real User
Top 20
Free to use, mature, and offers good scalability
Pros and Cons
  • "It's an open-source product, which means it doesn't cost us anything to use it."
  • "The UI is based on command line. It would be helpful if they could come up with a simpler user interface."

What is our primary use case?

We primarily use the solution for big data. We often get a million messages per second, and with such a high output we use Kafka to help us handle it. 

What is most valuable?

When we're working with big data, we need a throughput computing panel, which is something that Kafka provides, and something we find extremely valuable. It helps us support computing and ensures there's no loss of data. It can even do replication with some data.

The delivery of data is its most valuable aspect.

It's an easy-to-use product overall.

The solution is quite mature.

It's an open-source product, which means it doesn't cost us anything to use it.

What needs improvement?

We're still going through the solution. Right now, I can't suggest any features that might be missing. I don't see where there can be an improvement in that regard.

The speed isn't as fast as RabbitMQ, even though the solution touts itself as very quick. It could be faster. They should work to make it at least as fast as RabbitMQ.

The UI is based on command line. It would be helpful if they could come up with a simpler user interface.

They should make it easier to configure items on the solution.

The solution would benefit from the addition of better monitoring tools.

For how long have I used the solution?

I've been using the solution for six months.

What do I think about the stability of the solution?

The solution is a bit slow in comparison to RabbitMQ. It's supposed to be a very fast solution, and it has okay performance, but speed-wise, it's quite slow.

What do I think about the scalability of the solution?

The scaling of the solution is quite good.

How are customer service and technical support?

In terms of technical support, we don't get that directly from Apache Kafka. We have certain cloud data distribution so we get assistance from our cloud data support.

How was the initial setup?

We're continuously deploying the product. We're still in the process of deployment.

What's my experience with pricing, setup cost, and licensing?

It's an open-source product, so the pricing isn't an issue. It's free to use. We don't have costs associated with it.

Which other solutions did I evaluate?

I'm not the product owner, so I didn't have a say in what should be chosen. We were seeing a high throughput with Kafka which is why we ultimately chose it.

What other advice do I have?

I'd rate the solution eight out of ten. It's good at scaling, and, performance-wise, it's excellent. If they could improve the UI and allow for easier configuration, I'd rate it higher.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Technical Consultant at KPMG
Real User
It eases our current data flow and framework
Pros and Cons
  • "It eases our current data flow and framework."
  • "Kafka 2.0 has been released for over a month, and I wanted to try out the new features. However, the configuration is a little bit complicated: Kafka Broker, Kafka Manager, ZooKeeper Servers, etc."

What is our primary use case?

It's convenient and flexible for almost all kinds of data producers. We integrated it with Kafka Streams, which can perform some easy data processing, like summary, count, group, etc.
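
To make "summary, count, group" concrete, here is a small Python sketch of grouping keyed records and computing a count and a sum per key. It only mimics the idea in plain Python and does not use the Kafka Streams API (which is a Java library); the function names and the sample records are made up for illustration.

```python
from collections import Counter, defaultdict


def count_by_key(records):
    """Group (key, value) records by key and count records per key."""
    counts = Counter()
    for key, _value in records:
        counts[key] += 1
    return dict(counts)


def sum_by_key(records):
    """Group (key, value) records by key and sum the values per key."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return dict(totals)


# Hypothetical records as they might arrive from different producers.
events = [("sensor-a", 3), ("sensor-b", 5), ("sensor-a", 4)]

counts = count_by_key(events)
totals = sum_by_key(events)
```

A real stream processor would update these aggregates continuously as records arrive rather than over a finished list, but the grouping logic is the same.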

How has it helped my organization?

It eases our current data flow and framework, which digests all types of sources regardless of whether they are structured or not.

What is most valuable?

  • High availability
  • High throughput

With such a large digest, I was genuinely impressed at the process being almost real-time.

What needs improvement?

Kafka 2.0 has been released for over a month, and I wanted to try out the new features. However, the configuration is a little bit complicated: Kafka Broker, Kafka Manager, ZooKeeper Servers, etc.

For how long have I used the solution?

Less than one year.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user660591 - PeerSpot reviewer
Senior Java Consultant at a tech services company with 501-1,000 employees
Consultant
The product is a distributed system for persistent messaging

What is most valuable?

The most valuable features are performance, persistent messaging, and reliability. It allows us to persist the message for a configurable number of days, even after it has been delivered to the consumer. The message delivery is also fast.
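
The retention behavior described above can be sketched in Python as follows. This is a simplified model under stated assumptions: the seven-day window, the `prune` helper, and the sample messages are hypothetical, and Kafka itself removes data in whole log segments rather than message by message.

```python
SECONDS_PER_DAY = 86_400
RETENTION_DAYS = 7  # assumed value; retention is configurable


def prune(records, now, retention_days=RETENTION_DAYS):
    """Keep (timestamp, message) records newer than the retention cutoff.

    Whether a consumer has already read a message plays no role here:
    only the age of the message decides if it is kept.
    """
    cutoff = now - retention_days * SECONDS_PER_DAY
    return [(ts, msg) for ts, msg in records if ts >= cutoff]


now = 10 * SECONDS_PER_DAY
records = [
    (1 * SECONDS_PER_DAY, "old search term"),
    (5 * SECONDS_PER_DAY, "recent search term"),
    (9 * SECONDS_PER_DAY, "latest search term"),
]

kept = prune(records, now)
```

With a seven-day window evaluated on day 10, the message from day 1 is dropped and the messages from days 5 and 9 remain available for any consumer that wants to re-read them.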

How has it helped my organization?

We wanted to track the customer activities on our application and store those details on another system (RDBMS/Apache Hadoop). We do extensive analysis with that. This helps the company analyze customer activities, such as search terms, and improve.

What needs improvement?

It’s perfect for our requirements.

For how long have I used the solution?

I have been using Apache Kafka for two years.

What do I think about the stability of the solution?

We have had no issues with stability.

What do I think about the scalability of the solution?

We have had no issues with scalability.

How are customer service and technical support?

We use the open source one, so we did not opt for any technical support.

Which solution did I use previously and why did I switch?

We started to use Apache Kafka with our application from scratch.

How was the initial setup?

The initial setup was straightforward. We faced some issues during development in areas such as the message producer and consumer. We rectified those by tweaking the producer and consumer configurations. The documentation is very good.

What's my experience with pricing, setup cost, and licensing?

I don’t have any idea, as we use the open source version.

What other advice do I have?

It's a high-performance distributed system. If you want to track user activities or do any stream processing, then this is perfect. We have used Docker Kafka for our implementation. It's very easy to set up and test. You could also try the same.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user578787 - PeerSpot reviewer
Java Developer at a media company with 10,001+ employees
Real User
It provides safety for data in case of node failure or data center outage. Partitioning is useful for parallelizing processing.

What is most valuable?

The most valuable features to me are replication, partitioning and easy integration with Apache Spark, which we use quite a bit for distributed processing.

Replication is good for high availability. It provides additional safety for data in case of node failure or data center outage. Partitioning is a really useful feature for parallelizing processing. We use Apache Spark to process data from a Kafka queue, and Spark is able to assign one executor to each Kafka partition. The more partitions we have, the more threads we can use to process data in parallel. This helps us achieve really good throughput.
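
The link between partitions and parallelism can be sketched in Python like this. It is an illustration only, not Spark or Kafka code: records are assigned to partitions by hashing their key, and one worker thread handles each partition. The CRC32 hash is a stand-in chosen for the example (Kafka's default partitioner uses a different hash), and the partition count, keys, and `process` function are assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 4  # assumed partition count


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition; the same key always lands in the same one."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def split(records, num_partitions=NUM_PARTITIONS):
    """Distribute (key, value) records across partitions by key hash."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


def process(partition):
    """Hypothetical per-partition work: sum the values."""
    return sum(value for _key, value in partition)


records = [(f"user-{i}", i) for i in range(100)]
partitions = split(records)

# One worker per partition, so more partitions allow more parallel workers.
with ThreadPoolExecutor(max_workers=NUM_PARTITIONS) as pool:
    partial_sums = list(pool.map(process, partitions))

total = sum(partial_sums)
```

Because each partition is processed independently, the partial results only need to be combined at the end, which is what lets throughput grow with the number of partitions.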

How has it helped my organization?

It will help us build a scalable platform. This will allow the company to provide better customer service.

What needs improvement?

It’s pretty easy to use for now. I haven’t had any difficulty or problems that I can complain about. Maybe they can add a UI to configure queues and to display statistics about data stores.

For how long have I used the solution?

I have used Kafka for about a year.

What do I think about the stability of the solution?

So far, we have not encountered any stability issues.

What do I think about the scalability of the solution?

We have not had any scalability issues. The product is horizontally scalable, so adding extra hardware is all that is needed.

How are customer service and technical support?

We haven’t needed technical support with the product yet.

Which solution did I use previously and why did I switch?

I think performance-wise, the product is very good and fits our use case. We used other distributed message queues, but all products have their own use case.

How was the initial setup?

Initial setup wasn’t really complex. We use Kafka through Hortonworks Suite, which comes with many other big data tools. Ambari makes it easy to set up.

What's my experience with pricing, setup cost, and licensing?

Licensing and pricing were handled by my management, so I don’t have much knowledge there.

What other advice do I have?

Give it a try. It’s a valuable, high-performance, distributed processing tool.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Developer Infrastructure at Outbrain
Real User
Very easy to install, stable, and has good scaling options
Pros and Cons
  • "It's very easy to install and it's pretty stable."
  • "The third party is not very stable and sometimes you have problems with this component. There are some developments in newer versions and we're about to try them out, but I'm not sure if it closes the gap."

How has it helped my organization?

In my previous company, we had a proprietary implementation and we replaced it with Kafka. We changed it because Kafka had many different connectors available and it also allowed us to create a window to our products for the client. It was an on-premise product and it allowed the client to take the data out, without us developing anything.

You can connect in any language and there are a lot of connectors available, which helps a lot. It also creates visibility into the data and stability. There are several alternatives but this is one of the best options for this.

What is most valuable?

It's very easy to install and it's pretty stable.

The possibility to have connectors is very helpful. Another valuable aspect is that it's mature and open-source. 

From a scalability point of view, you just add servers and it's scalable. The whole architecture is very scalable.

What needs improvement?

There is a feature that we're currently using called MirrorMaker. We use it to combine the information from different Kafka servers into another server. It's very wide and it gives a very generic scenario. I think it would be great if the possibility would exist out of the box and not as a third party. The third party is not very stable and sometimes you have problems with this component. There are some developments in newer versions and we're about to try them out, but I'm not sure if it closes the gap.

For how long have I used the solution?

I have been using this solution for six months. I also worked with it additionally in my previous company but not so intensively. 

How are customer service and technical support?

I have never needed to use technical support. I know it's available but we haven't needed it because there's a lot of information on the internet that has helped us to solve our issues. 

What other advice do I have?

I would definitely recommend Kafka. In our current position, we use it to move a lot of data and I think it's definitely working well. I would definitely recommend it.

I would rate it an eight out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.