
Apache Kafka vs Cloudera DataFlow comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Dec 17, 2024

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Categories and Ranking

Apache Kafka
Ranking in Streaming Analytics: 8th
Average Rating: 8.2
Reviews Sentiment: 6.9
Number of Reviews: 88
Ranking in other categories: none

Cloudera DataFlow
Ranking in Streaming Analytics: 15th
Average Rating: 7.4
Reviews Sentiment: 6.5
Number of Reviews: 5
Ranking in other categories: none
 

Mindshare comparison

As of June 2025, Apache Kafka holds a 3.0% mindshare in the Streaming Analytics category, up from 1.9% a year earlier. Cloudera DataFlow holds 1.1%, down from 1.4% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Snehasish Das - PeerSpot reviewer
Data streaming transforms real-time data movement with impressive scalability
I worked with Apache Kafka for customers in the financial industry and OTT platforms. They use Kafka particularly for data streaming. Companies offering movies and entertainment as a service, similar to Netflix, use Kafka. Apache Kafka offers unique data streaming. It allows the use of data in…
Mohamed-Saied - PeerSpot reviewer
Efficient data integration and workflow scheduling elevate project performance
Cloudera DataFlow is used as an ETL or ELT solution within Cloudera's data pipeline. Our organization heavily relies on it for data ingestion, transformation, and warehousing. It is also used daily for operational tasks, and it integrates well within Cloudera's ecosystem for high performance and…

Quotes from Members

 

Pros

"There are numerous possibilities that can be explored. While it may be challenging to fully comprehend the potential advantages, one key aspect is the ability to establish a proper sequence of events rather than simply dealing with a jumbled group of occurrences. These events possess their own timestamps, even if they were not initially provided with one, and are arranged in a chronological order that allows for a clear understanding of the progression of the events."
"The valuable features are the group community and support."
"Kafka allows you to handle huge amounts of data and classify it into different categories. If you have huge amounts of data, Kafka is a very good solution for data classification."
"valuable features relate to microservices architecture and working on KStream and KSQL DB as a microservices event bus."
"Kafka provides us with a way to store the data used for analytics. That's the big selling point. There's very good log management."
"Apache Kafka has good integration capabilities and has plenty of adapters in its ecosystem if you want to build something. There are adapters for many platforms, such as Java, Azure, and Microsoft's ecosystem. Other solutions, such as Pulsar have fewer adapters available."
"Deployment is speedy."
"The most valuable features are the stream API, consumer groups, and the way that the scaling takes place."
"This solution is very scalable and robust."
"The most effective features are data management and analytics."
"DataFlow's performance is okay."
"The initial setup was not so difficult"
"Cloudera DataFlow is fully compatible with Cloudera's ecosystem and offers high efficiency through native connectors for various ecosystems."
 

Cons

"Some vendors don't offer extra features for monitoring."
"Kafka is complex and there is a little bit of a learning curve."
"The product is good, but it needs implementation and on-going support. The whole cloud engagement model has made the adoption of Kafka better due to PaaS (Amazon Kinesis, a fully managed service by AWS)."
"I would like to see an improvement in authentication management."
"Apache Kafka could improve data loss and compatibility with Spark."
"There have been some challenges with monitoring Apache Kafka, as there are currently only a few production-grade solutions available, which are all under enterprise license and therefore not easily accessible. The speaker has not had access to any of these solutions and has instead relied on tools, such as Dynatrace, which do not provide sufficient insight into the Apache Kafka system. While there are other tools available, they do not offer the same level of real-time data as enterprise solutions."
"Observability could be improved."
"The user interface is one weakness. Sometimes, our data isn't as accessible as we'd like. It takes a lot of work to retrieve the data and the index."
"It is not easy to use the R language. Though I don't know if it's possible, I believe it is possible, but it is not the best language for machine learning."
"It's an outdated legacy product that doesn't meet the needs of modern data analysts and scientists."
"Cloudera DataFlow's UI interface could be enhanced significantly. Memory handling can also be improved to be better than it is today."
"Although their workflow is pretty neat, it still requires a lot of transformation coding; especially when it comes to Python and other demanding programming languages."
 

Pricing and Cost Advice

"When starting to look at a distributed message system, look for a cloud solution first. It is an easier entry point than an on-premises hardware solution."
"We use the free version."
"Kafka is more reasonably priced than IBM MQ."
"The solution is open source."
"Running a Kafka cluster can be expensive, especially if you need to scale it up to handle large amounts of data."
"Apache Kafka is an open-source solution."
"I rate Apache Kafka's pricing a five on a scale of one to ten, where one is cheap and ten is expensive. There are no additional costs apart from the licensing fees for Apache Kafka."
"Licensing issues are not applicable. Apache licensing makes it simple with almost zero cost for the software itself."
"DataFlow isn't expensive, but its value for money isn't great."
859,129 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Apache Kafka
Financial Services Firm: 29%
Computer Software Company: 12%
Manufacturing Company: 7%
Retailer: 5%

Cloudera DataFlow
University: 16%
Financial Services Firm: 15%
Computer Software Company: 13%
Retailer: 7%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
No data available
 

Questions from the Community

What are the differences between Apache Kafka and IBM MQ?
Apache Kafka is open source and can be used for free. It has very good log management and has a way to store the data used for analytics. Apache Kafka is very good if you have a high number of user...
What do you like most about Apache Kafka?
Apache Kafka is an open-source solution that can be used for messaging or event processing.
What is your experience regarding pricing and costs for Apache Kafka?
Its pricing is reasonable. It's not always about cost, but about meeting specific needs.
What do you like most about Cloudera DataFlow?
The most effective features are data management and analytics.
What needs improvement with Cloudera DataFlow?
Cloudera DataFlow's UI interface could be enhanced significantly. Memory handling can also be improved to be better than it is today.
What is your primary use case for Cloudera DataFlow?
Cloudera DataFlow is used as an ETL or ELT solution within Cloudera's data pipeline. Our organization heavily relies on it for data ingestion, transformation, and warehousing. It is also used daily...
 

Also Known As

Apache Kafka: No data available
Cloudera DataFlow: CDF, Hortonworks DataFlow, HDF
 

Overview

 

Sample Customers

Apache Kafka: Uber, Netflix, Activision, Spotify, Slack, Pinterest
Cloudera DataFlow: Clearsense
Find out what your peers are saying about Apache Kafka vs. Cloudera DataFlow and other solutions. Updated: June 2025.