Kemal Duman - PeerSpot reviewer
Team Lead, Data Engineering at Nesine.com
Real User
Top 5
Achieves real-time data management with fast and fault-tolerant solutions
Pros and Cons
  • "Apache Kafka is very fast and stable."
  • "Config management can be better. We are always trying to find the best configs, which is a challenge."

What is our primary use case?

We use Apache Kafka for all of our real-time scenarios. It helps us detect anomalies and attacks on our website through machine learning models.

What is most valuable?

We are managing our data by topics. Splitting topics is more effective for us. Apache Kafka is very fast and stable. It offers scalability with ease and also integrates well with our tools. Fault tolerance is a good feature, and it also has high throughput rates.
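Splitting data across topics, as described above, usually comes down to a routing rule that picks a topic per event so consumers can subscribe narrowly. A minimal sketch of that idea; the topic naming scheme and event shape are illustrative assumptions, and the actual produce call (e.g. kafka-python's `KafkaProducer.send`) is shown only as a comment:

```python
# Route each event to a per-domain topic so downstream consumers
# (anomaly detection, attack detection, etc.) subscribe only to what they need.

def topic_for(event: dict) -> str:
    """Pick a topic name from the event's domain field."""
    domain = event.get("domain", "unknown")
    return f"site.{domain}.events"

# With kafka-python, this routing would plug into a producer roughly like:
#   producer = KafkaProducer(bootstrap_servers="localhost:9092")
#   producer.send(topic_for(event), value=json.dumps(event).encode())

print(topic_for({"domain": "clickstream", "user": 42}))  # site.clickstream.events
```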

What needs improvement?

Config management can be better. We are always trying to find the best configs, which is a challenge.

For how long have I used the solution?

I have been working with Apache Kafka for more than four years. It has been in use since our department was founded, roughly six years ago.

Buyer's Guide
Apache Kafka
June 2025
Learn what your peers think about Apache Kafka. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
856,873 professionals have used our research since 2012.

What do I think about the stability of the solution?

It is very stable and meets our needs consistently.

What do I think about the scalability of the solution?

If there is latency, our Kubernetes administrator adds Kafka nodes to scale out. Kafka provides flexibility and integrates easily with Kubernetes.

Which solution did I use previously and why did I switch?

Before Apache Airflow, I used crontab. However, Apache Airflow makes it easy to follow and manage tasks, and our data science departments can easily build their models and pipelines with it.

What other advice do I have?

I would rate Apache Kafka nine out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Madhan Potluri - PeerSpot reviewer
Head of Data at an energy/utilities company with 51-200 employees
Real User
Top 5Leaderboard
Offers a free version but needs to improve the support offered to users
Pros and Cons
  • "The most valuable features of the solution revolve around areas like the latency part, where the tool offers very little latency and the sequencing part."
  • "One complexity that I faced with the tool stems from the fact that since it is not kind of a stand-alone application, it won't integrate with native cloud, like AWS or Azure."

What is our primary use case?

I was planning to use the tool for real-time analysis in terms of data processing and real-time analytics workflows. The real-time IoT data arrives with a few challenges, and today it all flows through a single Kafka topic. I want to use multiple Kafka topics instead, where one can be fed directly into the data pipeline, another into the real-time alert system, and a third into machine learning.
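The multi-topic plan above amounts to fanning one IoT reading out to several topics, each serving a different consumer. A minimal sketch of that fan-out decision; the topic names and alert threshold are assumptions, not from the review, and the actual produce loop over the returned pairs is left to a Kafka producer:

```python
# Fan one IoT reading out to the pipeline, ML-feature, and (conditionally)
# alert topics, mirroring the three consumers described in the use case.

PIPELINE, ALERTS, ML = "iot.pipeline", "iot.alerts", "iot.ml-features"

def fan_out(reading: dict, alert_threshold: float = 90.0):
    """Return the (topic, payload) pairs one reading should be produced to."""
    out = [(PIPELINE, reading), (ML, reading)]
    if reading.get("value", 0.0) > alert_threshold:  # only hot readings alert
        out.append((ALERTS, reading))
    return out
```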

How has it helped my organization?

The most valuable features of the solution are its latency, which is very low, and its sequencing. The sequencing helps to aggregate things so that I don't need to write a separate function to order the data; I can simply write an aggregate function to find, for example, the maximum value in the last ten samples.

What needs improvement?

One complexity I faced stems from the fact that since it is not a stand-alone application, it does not integrate natively with clouds like AWS or Azure; Apache Kafka always sits behind another layer. Compare this with Grafana, which can be used as a stand-alone tool with Grafana Cloud, or mixed with AWS, without much difference between the two. I would like Apache Kafka to offer the same kind of experience.

The product's support and the cloud integration capabilities are areas of concern where improvements are required.

For how long have I used the solution?

I have been using Apache Kafka for a year.

What do I think about the stability of the solution?

Stability-wise, I rate the solution an eight out of ten.

What do I think about the scalability of the solution?

Scalability-wise, I rate the solution an eight out of ten.

Around four people in my company use the product.

How are customer service and support?

I did not interact much with the product technical support team. I did not have dedicated support that responded to all my queries since I was using the product's free version. I rate the support a seven out of ten.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

I have worked with Databricks. I use Databricks and Apache Kafka simultaneously.

How was the initial setup?

The product's deployment phase is neither complex nor straightforward. As the software has evolved a lot, users can actually keep it even simpler by opting for a plug-and-play model.

The solution is deployed on an on-premises model.

The solution can be deployed in two or three days.

What about the implementation team?

I was involved with the tool's installation process.

What was our ROI?

I cannot comment on the tool's ROI since I did not use it for production purposes.

What's my experience with pricing, setup cost, and licensing?

I was using the product's free version.

What other advice do I have?

I did not come across any scenarios involving fault tolerance, because data consistency issues, like missing or incorrect values, are really a property of the system feeding the data. That said, when it comes to missing values, I never tried the option that allows you to impute a missing value with another parameter.

As for incorporating emerging streaming trends, such as AI, into my Apache Kafka workflows: I use it as a local system. I have an EC2 server where I read the samples and run a regression model on top of them, but that is done locally and not as a managed cloud service.

I recommend the product to those who plan to use it. I like Kafka and Flink, and I want to actually create a system in AWS mainly for real-time streaming so that I don't need to worry about multiple data copies.

Considering the improvements needed in support and cloud integration, balanced against the simplicity of the installation phase, I rate the tool a seven out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Jhon Rico - PeerSpot reviewer
Senior Solutions Architect at BVC
Real User
Top 10
Is very scalable and has been beneficial in the context of financial trading
Pros and Cons
  • "The publisher-subscriber pattern and low latency are also essential features that greatly piqued my interest."
  • "Maintaining and configuring Apache Kafka can be challenging, especially when you want to fine-tune its behavior."

What is our primary use case?

I have previous professional experience using Kafka to implement a system related to gathering software events in one centralized location. 

How has it helped my organization?

One example of how Kafka has been beneficial is in the context of financial trading. When a trade is executed, it generates an event. I used Kafka to create an application that captures these events and stores them in a topic, allowing for efficient processing in real time.

What is most valuable?

Regarding the most valuable feature in Kafka, I would say it's scalability. The publisher-subscriber pattern and low latency are also essential features that greatly piqued my interest.

What needs improvement?

Maintaining and configuring Apache Kafka can be challenging, especially when you want to fine-tune its behavior. It involves configuring traffic partitioning, understanding retention times, and dealing with various variables. Monitoring and optimizing its behavior can also be difficult.
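The fine-tuning knobs mentioned above (partition counts, retention times, and so on) are ordinary topic-level configurations. A hedged sketch of what a tuned topic definition might hold; the values are illustrative, and the actual creation call (e.g. confluent-kafka's `AdminClient.create_topics` with a `NewTopic`) is shown only as a comment:

```python
# Build a topic definition with explicit partitioning and retention settings.
def topic_definition(name: str, partitions: int, retention_days: int) -> dict:
    return {
        "name": name,
        "num_partitions": partitions,
        "config": {
            # retention.ms controls how long the log keeps messages
            "retention.ms": str(retention_days * 24 * 60 * 60 * 1000),
            "cleanup.policy": "delete",  # vs. "compact" for changelog topics
        },
    }

# With confluent-kafka, this would feed AdminClient.create_topics([NewTopic(...)]).
```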

Perhaps a more straightforward approach could be using messaging queues instead of the publish-subscribe pattern. Some solutions may not require the complex features of Apache Kafka, and a messaging queue with Kafka's capabilities might provide a more complete messaging solution for events and messages.

For how long have I used the solution?

I have been using Apache Kafka for the past 10 years. 

What do I think about the stability of the solution?

The stability may improve if the configuration and management aspects become less challenging.

What do I think about the scalability of the solution?

It depends on the configuration, but scalability is one of the best features of Kafka. I would rate it nine out of ten.

How are customer service and support?

Support can vary depending on whether you're using the open-source version or a paid one. Our version, the paid console version, offers highly available support, and you can find a wealth of information and assistance from various providers online. However, when I used Amazon MSK on AWS, I encountered limited support for it.

How would you rate customer service and support?

Neutral

What was our ROI?

Despite the challenges we faced with configuration and management, I believe the return on investment is safeguarded.

What's my experience with pricing, setup cost, and licensing?

The cost can vary depending on the provider and the specific flavor or version you use. I'm not very knowledgeable about the pricing details.

What other advice do I have?

I believe that when working with Apache Kafka, it's essential to have a specialist who thoroughly understands and can optimize all the available variables within the solution to achieve the desired behavior.

I would rate it an eight out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer2075460 - PeerSpot reviewer
Group Manager at a media company with 201-500 employees
Real User
Real-time processing and reliable for data integrity
Pros and Cons
  • "Kafka can process messages in real-time, making it useful for applications that require near-instantaneous processing."
  • "Data pulling and restart ability need improving."

What is our primary use case?

We have different use cases for different clients. For example, a banking sector client wants to use Apache Kafka for fraud detection and messaging back to the client. For instance, if there's a fraudulent swipe of a credit or debit card, we stream near real-time data (usually three to five minutes old) into the platform. Then, we use a look-up model to relay the data to the messaging queue to the customers. This is just one use case.

We also have data science engineers who use Kafka to feed on the data (usually within the last five to seven minutes of transactions) to detect any fraudulent transactions for internal consumption of the bank. This is not for relaying back to the customer. This client is based in the Middle East, and they have some interesting use cases.
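The fraud scenarios above both hinge on a freshness check: only transactions a few minutes old should reach the model or the customer-messaging queue. A minimal sketch of that filter; the five-minute window is an assumption drawn from the "three to five minutes" and "five to seven minutes" figures in the review:

```python
# Decide whether a transaction is recent enough to score for fraud.
def within_window(event_ts: float, now: float, window_s: float = 300.0) -> bool:
    """True if the event's age (in seconds) falls inside the near-real-time window."""
    age = now - event_ts
    return 0 <= age <= window_s  # also rejects events timestamped in the future
```

In practice this predicate would run inside the consumer loop, dropping stale records before they are relayed to the look-up model or the messaging queue.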

How has it helped my organization?

We are still in the cluster build-out phase. Based on the use cases captured during the advisory phase, there will be roughly a 40/60 split of users: 40% internal data science and IT teams, and 60% end users. In total, we expect 25 users. Of these, around 15 business users will make decisions based on reports generated from Kafka-fed analytics data. The remaining users are internal; they analyze this data daily to identify further predictive and AI use cases for the banking domain.

Moreover, our current client is an enterprise business: a globally renowned bank that has entered Saudi Arabia.

What is most valuable?

One of the major features that we are currently exploring, which also draws on my previous experience, is a multiple pub/sub model kind of architecture. We have one data hub, the IBM DB2 system that banks use for daytime transaction tracking from OLTP systems, and we want to use that data on different platforms. So, we are trying to use Kafka in a model where it publishes onto multiple messaging queues. These different messaging queues belong to different business units, because we are segregating the data lake we are building into different domains. For example, HR data is too sensitive, so they don't want to give it to any other businesses. We are working on a common-publisher, multiple-subscriber model, which I feel is much more easily implementable using Kafka.
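The common-publisher, multiple-subscriber model described above maps directly onto Kafka consumer groups: the broker keeps one copy of the log and tracks a separate offset per group, so each domain reads its own full copy of the stream. A tiny in-memory sketch of that bookkeeping (the group names are illustrative, not from the review):

```python
# One shared topic (append-only log) with per-group read positions,
# mimicking how Kafka gives every consumer group an independent cursor.

topic_log = []  # the shared topic

def publish(msg):
    topic_log.append(msg)

offsets = {"hr-domain": 0, "finance-domain": 0}  # per-group offsets

def consume(group: str):
    """Return all messages this group has not yet seen, advancing its offset."""
    start = offsets[group]
    offsets[group] = len(topic_log)
    return topic_log[start:]
```

In real Kafka the same effect comes from giving each business unit's consumers a distinct `group.id`; each group then receives every published message independently.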

The other part that we are trying to implement, which is in its very nascent stages, is to see if we can make it future-ready. Right now, in the Middle East, there are not many cloud providers like GCP, AWS, and Azure; it is all on-premises. But they will be there within the next two or three years. So, we are trying to see if we can have these Kafka models working from a future perspective wherein, instead of dumping some of the data into a data lake, we can dump it directly into solutions like GCP BigQuery for real-time analytics. This is just for the use cases that are real-time analytics; the data will definitely also be in the data lake, as that is the intention of keeping it.

But using Kafka, we are trying to see if we can make these subscribers ready to use platforms like GCP BigQuery for real-time analytics. It's still at an early stage, but those are the use cases.

What needs improvement?

One of the major areas for improvement is the polling mechanism. Sometimes, when the data volume is huge and I have a polling period of, say, one minute, issues can arise from technical glitches, data anomalies, or platform events such as cluster restarts. Polling then stalls the messaging queues, and the restart-ability needs to be improved, especially when data volumes are very high.

If there are obstructions due to technical glitches or platform issues, we sometimes have to manually clean up or clear the queue before it eventually settles. It does restart on its own, but it takes too much time to catch up. A year ago, I could not find a solution to make it more agile in terms of catching up quickly and staying real-time after any downtime.

This was one area where I couldn't find a solution when I connected with Cloudera and Apache. One of our messaging tools was sending a couple of million records. We found it tough when there were any cluster downtimes or issues with the subscribers consuming data. 

For future releases, one feature I would like to see is a more robust solution in terms of restart ability. It should be able to handle platform issues and data issues and restart seamlessly. It should not cause a cascading effect if there is any downtime.
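The restart-ability concern above is usually addressed on the consumer side by committing offsets only after a message is fully processed, so a crash re-delivers work instead of losing it. A minimal in-memory sketch of that pattern; the dictionary stands in for Kafka's `__consumer_offsets` store, and the commit line mirrors a manual `consumer.commit()` call:

```python
# Process messages from the last committed offset, committing only on success.
committed = {"my-topic": 0}  # stand-in for Kafka's per-group committed offsets

def process_batch(messages, handler, topic="my-topic"):
    """Resume from the committed offset; commit after each successful message."""
    for i in range(committed[topic], len(messages)):
        handler(messages[i])       # may raise; the offset is then NOT committed
        committed[topic] = i + 1   # manual commit after success
```

Because the offset only advances after `handler` succeeds, restarting the loop naturally resumes at the first unprocessed message rather than skipping or duplicating a whole backlog.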

Another helpful feature would be monitoring, as they have for their other services: a UI where I can monitor the capacity of the messaging queues and the resources I need to add to be ready for future data volumes. It would be great to have analytics on the overall performance of Kafka to plan for data volumes and messaging use. Currently, we plan the cluster resources for Kafka manually based on data volumes. A UI for resource planning based on data volume would be a great addition.
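The core metric such a monitoring UI would surface is consumer lag: the log-end offset minus the group's committed offset, summed across partitions. A minimal sketch of that calculation (the offset maps are illustrative; in practice they would come from the admin API's end-offset and committed-offset lookups):

```python
# Total consumer lag for one group: how far its committed offsets
# trail the end of the log, summed over all partitions.
def total_lag(end_offsets: dict, committed_offsets: dict) -> int:
    return sum(end - committed_offsets.get(partition, 0)
               for partition, end in end_offsets.items())
```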

For how long have I used the solution?

I have been using Apache Kafka for five years. In the current project, we're setting up a cluster. We'll be doing the service installations next week. It's a private cloud-based implementation, and I'm leading the end-to-end implementation. In my previous project, we mainly used Kafka for streaming real-time SAP data into the analytics platform for a technology client.

But for the current banking sector client, we're setting up a 58-node cluster and reserving six nodes for Kafka because we have a lot of streaming use cases.

What do I think about the stability of the solution?

I would rate the stability of Apache Kafka a six out of ten due to the polling period and high data volumes; there is a catch-up problem. If there is a five-minute downtime, it can have a cascading effect. 

When it comes to data availability, that is, how available the data is on the messaging queue, I rate it a little lower because of the polling mechanism and data getting stuck when volumes are high. From the data availability perspective, I rate it between six and seven. The major reason is the sheer volume of data that comes in day in and day out: if resources are not allocated correctly to the Kafka messaging queue, it sometimes gets stuck, and once stuck, it has a cascading effect on catching up to real time. Only because of this issue do I rate it between six and seven.

However, from an overall data security perspective and ensuring that the data is consistent across the system, I would rate it around nine. If your PubSub model is written correctly, you can be assured that the data will not be lost. It will either be in the messaging queues or your landing tables or staging tables, but it will not be lost, at least if you have written it correctly.

What do I think about the scalability of the solution?

I would rate the scalability of Apache Kafka somewhere around seven out of ten. I'm not going on the higher side because a lot of manual work is involved in upgrading Kafka. You have to estimate the overall capacity, not just in data but also in other use cases running on the same cluster. In the cloud, it is easier as you don't have to worry about the turnaround time of your cluster setup. But in an on-premise setup, you need to add more nodes, RAM resources, and storage based on the increasing data volume. 

Also, it would help if you ensured that the streaming use cases running on Kafka are not impacting other use cases like batch or archival use cases. Because of this manual activity of overall estimation, I will still keep it somewhere around six to seven. But regarding scalability in terms of horizontal or vertical scalability from the data perspective, I feel more comfortable with Kafka compared to other available streaming solutions.

How are customer service and support?

I have noticed a drastic improvement in the last five to seven months. Last year, the turnaround time for certain cases was around 14 days unless you escalated; issues used to take two, three, or even four follow-ups. And the solutions provided were often just resource-upgrade recommendations, which I felt were not always the best.

However, in the last month, I have seen Cloudera delivering two-to-three-day turnaround times, even for low-severity issues. I'm unsure if that is region-specific, but I assume so, as they have region-specific teams. Still, you cannot always depend on them: the solutions provided are sometimes just hit-and-trial, and when you work for bigger enterprises you cannot always go with those, because there is a cost associated.

For example, in one of my recent cases, we implemented Ranger policies, a security setup on our cluster. We were stuck at some point and asked for technical support. They provided a solution that was just patchwork; when we did our own analysis and took it to the bank's security team for review, we found the solution was inadequate. They told us to set up role-based access control through Ranger, where AD users should be synced with Ranger and access-control policies set up on top. However, what they provided only worked at the local Ranger level, with Linux users. That is not a solution, because at the enterprise level everything is integrated with and authenticated by AD.

I would rate Cloudera's support a five to six. Still, the turnaround time has improved since last year, when we used to wait two to three days even for critical solutions. I used to work for the US region in my previous project, and the turnaround time was not as expected there either. It all depends on the licensing: if you have a premium license from Cloudera, they assign a professional-services guide to your project and you get better support. If you do not have a premium license, you have to go through the process of raising cases and waiting for their support team. Overall, it is not on par with providers like Azure and GCP.

How would you rate customer service and support?

Neutral

How was the initial setup?

The initial two months were for capacity estimation, where we worked with the client's different business teams to understand the data volumes and use cases. Then, the next four to five months went into procurement, where we had to work with infrastructure teams and vendors to understand the servers and networks required for the cluster. 

The actual cluster setup took us two months, and it was a little longer due to a shortage of expertise on the client's networking team. We had to handle everything ourselves since it was an on-premise setup with physical servers and network connections. Currently, we are in the security review phase, and once it completes, we will start implementing various use cases like task and batch processing, archival, etc.

Based on my experience with this Apache Kafka implementation and cluster setup, I rate the setup between seven and eight out of ten.

What about the implementation team?

We deployed the solution on-premises because it is a Middle East client. My admin experience over the last two years has been extensive, so even if I move to the cloud, it will be much easier, because I have seen cluster implementation from scratch and how it is done. I was involved from the very first stage: working with Cloudera on sizing, working with the infrastructure and networking teams on implementation and network structuring, and then setting up the cluster ourselves. I have a team of around seventeen people here. What we are implementing is Cloudera's private-cloud-based solution, which is a future-ready solution for the cloud.

So once cloud providers enter the region, especially Saudi Arabia, for example GCP, AWS, or Azure, our cluster will always be ready to be upgraded to the cloud, because it is built on a private-cloud-based solution; we can add cloud-native hosts and nodes to our cluster at any time. At the same time, because it's a banking client, there are restrictions on the geographies where the data can reside. The physical cluster provides a future-ready path: once GCP, Azure, and AWS set up their data centers in Saudi Arabia, we can keep some of our data nodes in the bank's data center while running other nodes or VMs in the cloud provider's data center.

What was our ROI?

I have seen that the ROI is very good when implemented correctly and used for a period of time. I have seen, from a POC perspective, data getting churned in a couple of months, and the amount of insights generated was overwhelming. 

I have also seen some critical decisions taken based on that data at an enterprise level, which earlier used to take years. Because of the time it used to take, the intention of doing those analytics used to lose its flavor. But now, people can make decisions in a few months based on this streaming analytics use case through Kafka. And they see, if those decisions had been taken earlier, they could have quickly gone on for four to five percent of their year-on-year profits. 

However, too many things are involved because of the overall use case perspective, data perspective, underlying cluster sizing, and sourcing. One has to think from a holistic point of view, not just from a business point of view.

What's my experience with pricing, setup cost, and licensing?

I have experience in private cluster implementation. When you use Apache Kafka with Cloudera, the pricing is included in your Cloudera license, which is based on the number of nodes, the storage cost, and other components; Kafka is one of the solutions offered under it. Compared with the cloud, if you don't have a good volume of data and use cases, you will not realize the benefits, as the initial cost of setting up the cluster and the license can be as much as $760k for a small cluster of ten to twenty nodes. You need at least 20-30 GBs of data and real use cases before you can profit from the Cloudera license and cluster. Kafka is just one piece of it.

When it comes to the cloud, pricing is also set at the solution level, so you can compare it at the Kafka level; however, I don't have much information on that from where I am currently implementing the solution. We opted for the solution only after we did a cost-benefit analysis. We realized that by bringing in Cloudera along with Kafka, we would be able to replace two or three existing systems, including Teradata, Oracle, Informatica, and IBM DataStage; only then could we realize the benefit for the bank. Otherwise, Cloudera would be much more expensive, especially in the short term. With distributed computing, the concept of the Delta Lake is coming in, and RDBMS systems like Teradata will coexist with distributed systems like data lakes. Not all use cases will be solved, but cloud solutions like Azure come as a package, so you need not worry about maintaining different physical systems in your enterprise. That's where the cost-benefit analysis from a data perspective becomes very important.

At the end of the day, we bring in big data systems only when the data volumes are high. When the data volumes are low, the cost-benefit analysis can easily show that systems like Oracle or Teradata can run it just fine.

What other advice do I have?

From an architecture and solution design perspective, I would say that before going for streaming solutions, we should analyze the data, which might be old, and decide if it's a streaming use case or not. Often, people think it's a streaming use case, but when they perform analytics on top of it, they realize they can't do a month-to-date or year-to-date analysis. So, it's essential to think again from the data basics perspective before going to Kafka.

Overall, from the product and solution perspective, I would rate it a nine based on my personal experience with the data.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer: Implementor
PeerSpot user
Pratul Shukla - PeerSpot reviewer
Vice President at a financial services firm with 10,001+ employees
Real User
Top 20
Open-source, stable, and scalable
Pros and Cons
  • "The use of Kafka's logging mechanism has been extremely beneficial for us, as it allows us to sequence messages, track pointers, and manage memory without having to create multiple copies."
  • "There is a lot of information available for the solution and it can be overwhelming to sort through."

What is our primary use case?

We have multiple use cases for our Kafka system. One is Kafka Connect, which is used to facilitate communication between different regions with GridGain. Another is to distribute events and projects to multiple downstream systems: we publish all the messages to Kafka, and other listeners subscribe and write them to different MQs. Lastly, Kafka Connect is used especially for inter-application communication.
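Kafka Connect pipelines like the ones above are driven by JSON connector configurations POSTed to the Connect REST API (`/connectors`). A minimal sketch using the file source connector bundled with Kafka; the connector name, file path, and topic are illustrative assumptions:

```python
import json

# A Kafka Connect connector definition: the JSON body that would be
# POSTed to http://<connect-host>:8083/connectors to start the connector.
connector = {
    "name": "demo-file-source",
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/tmp/demo.txt",   # file to tail
        "topic": "demo-lines",     # topic that receives each line as a record
    },
}

payload = json.dumps(connector)  # request body for the Connect REST API
```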

How has it helped my organization?

We had been using a lot of expensive licenses earlier, such as SOLEIL, as well as some legacy versions, which were not only costly but also caused memory issues and required highly technical personnel to manage. This posed a huge challenge in terms of resourcing and cost, and it simply wasn't worth investing more in. However, Kafka was comparatively free as it was open source, and we were able to build our own monitoring system on top of it. Kafka is an open-source platform that allows us to develop modern solutions with relative ease. Additionally, there are many resources available in the market to quickly train personnel to work with this platform. Kafka is user-friendly and does not require an extensive learning curve, unlike other tools. Furthermore, the configuration is straightforward. All in all, Kafka provides us with a great platform to build upon with minimal effort.

What is most valuable?

The use of Kafka's logging mechanism has been extremely beneficial for us, as it allows us to sequence messages, track pointers, and manage memory without having to create multiple copies. We are currently on a legacy version and have found that the latest version of Kafka has solved many of the issues we were facing, such as sequencing, memory management, and more. Additionally, the fact that it is open source is a major benefit.
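The log-plus-pointer model praised above is what removes the need for per-consumer copies: Kafka stores each message once in an append-only log, and every reader just keeps an offset into it. A tiny in-memory sketch of that structure (purely illustrative, not Kafka's actual storage format):

```python
# Append-only log with offset-based reads: one stored copy of each record,
# and "replay" is simply re-reading from an earlier offset.
class Log:
    def __init__(self):
        self.records = []

    def append(self, record) -> int:
        """Store a record and return its offset in the log."""
        self.records.append(record)
        return len(self.records) - 1

    def read_from(self, offset: int):
        """Read everything from the given offset onward."""
        return self.records[offset:]
```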

What needs improvement?

Multiple vendors have built successful commercial solutions on top of the open-source platform. Unfortunately, open source does not come with the monitoring capabilities these solutions offer, so organizations must build their own. Investing in those commercial solutions may be worthwhile for many companies that would otherwise prefer open-source options.

There is a lot of information available for the solution, and it can be overwhelming to sort through. The solution could improve by including more user-friendly documentation.

For how long have I used the solution?

I have been using the solution for four years.

What do I think about the stability of the solution?

We have not experienced any issues with the stability of the solution. We had some issues with GridGain and Kafka Connect, but we believe it was more of an issue on GridGain's side, since they informed us of a bug. Overall, we have not encountered many issues on the Kafka side.

What do I think about the scalability of the solution?

We use the solution in the distributed mode in multiple regions – the US, London, and Hong Kong. We have increased the number of nodes to ensure it is available to us at all times.

I give the scalability an eight out of ten.

We have around 600 people within my team using the solution.

How was the initial setup?

The initial setup was relatively easy for us since we already had Zookeeper and the necessary setup in place. We also had good knowledge of Kafka. Therefore, it was not a difficult challenge. In general, I believe that it is manageable. There are benefits and the setup is not overly complex.

Our company has implemented Ship, which makes our lives easier when it comes to changes or version updates. We can package everything in one place, deploy it with Ship, and then bump the version number with minimal changes.

Deployment time depends on our location and the task at hand. Initially, there is a lot of setup and configuration that must be done, but this becomes easier with experience. Nowadays, the process is not too difficult, as all the version numbers and config files are already in place. However, if it is a new task for us, it may take some time to figure out all the configurations.

One person was dedicated to deploying Kafka. This person got help from our release team, who had already set up Zookeeper and other necessary components.

What about the implementation team?

The implementation was completed in-house.

What other advice do I have?

I give the solution an eight out of ten.

Maintaining open-source Kafka can be difficult without a purchased, supported version or the right infrastructure in place. However, once the initial setup is complete, it is relatively simple to maintain. The open-source version of Kafka is not a complete package, so additional maintenance may be required.

I strongly recommend reading the documentation for any issues, because it is likely to contain the answer you are looking for. There is a lot of information provided that may not be immediately obvious, so take the time to explore thoroughly.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
AbhishekGupta - PeerSpot reviewer
Engineering Leader at Walmart
Real User
Stable, plenty of features, and useful for real-time analytics
Pros and Cons
  • "The most valuable feature of Apache Kafka is Kafka Connect."
  • "Apache Kafka could improve data loss and compatibility with Spark."

What is our primary use case?

Apache Kafka can be deployed on the cloud and on-premise.

We use Apache Kafka internally to build services on a cluster, and we use it as an intermediate persistence layer for events. Many teams leverage it as a message queue to connect their microservices.

How has it helped my organization?

Apache Kafka has helped out the organization because we leverage it for all our eCommerce real-time analytics use cases.

What is most valuable?

The most valuable feature of Apache Kafka is Kafka Connect.
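Kafka Connect moves data between Kafka and external systems through declarative connector configurations rather than custom code. The example below is an illustrative sink-connector definition; the connector name, topic, and connection URL are placeholders invented for this sketch (the JDBC sink connector class shown is from Confluent's connector, which must be installed separately).

```json
{
  "name": "orders-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "orders",
    "connection.url": "jdbc:postgresql://db:5432/analytics",
    "tasks.max": "2",
    "auto.create": "true"
  }
}
```

With a Connect worker running, a definition like this is typically registered through the worker's REST interface, for example `curl -X POST -H "Content-Type: application/json" --data @orders-sink.json http://localhost:8083/connectors`.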

What needs improvement?

Apache Kafka could improve its resilience against data loss and its compatibility with Spark.

For how long have I used the solution?

I have been using Apache Kafka for approximately five years.

What do I think about the stability of the solution?

Apache Kafka is stable.

What do I think about the scalability of the solution?

The scalability of Apache Kafka could improve.

We have approximately 10,000 users using this solution.

How are customer service and support?

The support for Apache Kafka could improve. Their engineers at times do not know what the solution can do.

Which solution did I use previously and why did I switch?

We previously used IBM MQ, TIBCO, and AMQ.

How was the initial setup?

The initial setup of Apache Kafka was complex. We were able to simplify it by doing registry-based integration of the services.

What was our ROI?

Apache Kafka has given a substantial return on investment.

What other advice do I have?

The number of people required for maintenance depends on the team. A centralized team is needed to offer Apache Kafka as a service, though each team has some knowledge of Kafka.

This solution has a lot of features and there is no other solution on the market that has similar advanced features. It is a very good solution.

I rate Apache Kafka an eight out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Reza Sadeghi - PeerSpot reviewer
Software Development Team Lead at asa com
Real User
The command line interface is powerful
Pros and Cons
  • "Kafka is an open-source tool that's easy to use in our country, and the command line interface is powerful."
  • "The user interface is one weakness. Sometimes, our data isn't as accessible as we'd like. It takes a lot of work to retrieve the data and the index."

What is our primary use case?

We use Kafka daily for our messaging queue to reduce costs because we have a lot of consumers, producers, and repeat messages. Our company has only one system built on Apache Kafka because it's based on microservices, so all of the applications can communicate using it.

What is most valuable?

Kafka is an open-source tool that's easy to use in our country, and the command line interface is powerful. 
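As an example of that power, these are a few of the standard command-line tools that ship in Kafka's `bin/` directory. The broker address and topic name are placeholders, and all of the commands assume a running cluster:

```shell
# Create a topic with 3 partitions
kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic orders --partitions 3 --replication-factor 1

# List existing topics
kafka-topics.sh --bootstrap-server localhost:9092 --list

# Produce messages interactively from stdin
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic orders

# Consume a topic from the earliest offset
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic orders --from-beginning
```

The console producer and consumer alone cover most day-to-day debugging of topics without writing any client code.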

What needs improvement?

The user interface is one weakness. Sometimes, our data isn't as accessible as we'd like. It takes a lot of work to retrieve the data and the index.

For how long have I used the solution?

I've used Kafka for about 10 months.

What do I think about the stability of the solution?

Kafka is stable.

How are customer service and support?

We can't access support because we are in Iran, and many countries prohibit business with Iran. 

Which solution did I use previously and why did I switch?

We used MSMQ on Windows, but we decided to migrate our system to Docker and wanted a Linux base, so we moved from Amazon Queue to Kafka.

Apache Kafka has one advantage that sets it apart from other providers: we need to iterate over messages again after they are delivered, and other products don't offer this. Kafka also has partitioning, which is useful, so we decided to go with it.
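The two properties mentioned above, re-reading messages and key-based partitioning, can be sketched in plain Python. This is a conceptual illustration, not the Kafka client API: Kafka actually hashes key bytes with murmur2, and the simple hash below is an assumption made just to show the idea.

```python
# Conceptual sketch: records with the same key land in the same partition,
# and "iterating again" is just re-reading from an earlier offset.

NUM_PARTITIONS = 3

def partition_for(key: str) -> int:
    # Kafka uses murmur2 over the key bytes; any deterministic hash
    # demonstrates the key -> partition mapping.
    return sum(key.encode()) % NUM_PARTITIONS

partitions = {p: [] for p in range(NUM_PARTITIONS)}

for key, value in [("user-1", "login"), ("user-2", "click"), ("user-1", "logout")]:
    partitions[partition_for(key)].append((key, value))

# All events for "user-1" sit in one partition, in production order:
p = partition_for("user-1")
user1_events = [v for k, v in partitions[p] if k == "user-1"]
print(user1_events)            # ['login', 'logout']

# Re-reading is non-destructive: consuming never removes records,
# so a consumer can rewind to offset 0 and iterate again.
replay = partitions[p][0:]
```

Because consuming never deletes records, any consumer can rewind its offset and replay a partition, which is exactly the "iterate on the messages" capability that set Kafka apart for us.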

How was the initial setup?

I rate Kafka 10 out of 10 for ease of setup. It's easy for us because we use Docker, but if you want to run it directly on another system, such as Linux, it may be a little challenging.

What's my experience with pricing, setup cost, and licensing?

Kafka is free. 

Which other solutions did I evaluate?

Redis has an open-source solution, but I'm not sure about IBM. I haven't researched it. 

What other advice do I have?

I rate Apache Kafka seven out of 10. It's a good solution. They're constantly fixing bugs and adding new features. 

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Bharath-Reddy - PeerSpot reviewer
Architect at Tekgeminus
Real User
Top 5
An open-source solution that can be used for messaging or event processing
Pros and Cons
  • "Apache Kafka is an open-source solution that can be used for messaging or event processing."
  • "Apache Kafka has performance issues that cause it to lag."

What is most valuable?

Apache Kafka is an open-source solution that can be used for messaging or event processing.

What needs improvement?

Apache Kafka has performance issues that cause it to lag.

For how long have I used the solution?

I have used Apache Kafka for more than two years, during which we did a couple of POCs for messaging and event processing.

What do I think about the stability of the solution?

I rate Apache Kafka an eight out of ten for stability.

What do I think about the scalability of the solution?

I rate Apache Kafka a seven out of ten for scalability.

How are customer service and support?

Since it's an open-source solution, there is no technical support, and users often rely on the community edition.

Which solution did I use previously and why did I switch?

I have previously worked with Confluent and Anypoint MQ. Confluent is built entirely around an event-driven architecture. Anypoint MQ is typical messaging software and cannot be used for an event-driven architecture.

How was the initial setup?

The solution's initial setup is quite straightforward. You just have to update a couple of configuration files.
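For illustration, the configuration involved is typically a handful of keys in the broker's `server.properties` file. The keys below are standard broker settings, but the values are placeholders, not any particular deployment:

```properties
# Illustrative broker settings; adjust values for your environment.
broker.id=0
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafka.example.com:9092
log.dirs=/var/lib/kafka/logs
zookeeper.connect=zookeeper:2181
num.partitions=3
log.retention.hours=168
```

Editing a file like this and restarting the broker is usually the extent of the configuration work for a basic installation.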

What's my experience with pricing, setup cost, and licensing?

Apache Kafka is an open-source solution.

What other advice do I have?

A non-enterprise business with a low message load can use an open-source solution like Apache Kafka.

I would recommend the solution to enterprise businesses depending on their use cases. Suppose an enterprise business doesn't have any integration or middleware platform and wants to do a greenfield implementation. I'll evaluate the use cases and recommend Apache Kafka if messaging is needed only for exception handling or transferring messages.

I have recommended Apache Kafka to some customers who wanted asynchronous messaging for logging purposes. Those messages were not business-critical messages as such.

I would recommend Apache Kafka to other users. Apache Kafka is more relevant when we use open-source integrations and when customers want to reduce the TCO. As an architect, I recommend the solution to customers based on their messaging needs. Apache Kafka and Anypoint MQ are the two main messaging products we evaluate today. The open-source Apache Kafka is always recommended if the customer really doesn't want to get into any of the license models.

Overall, I rate Apache Kafka an eight out of ten.

Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user