
Confluent vs Pentaho Data Integration and Analytics comparison

 


Executive Summary

 

Categories and Ranking

Confluent
Average Rating: 8.2
Reviews Sentiment: 6.7
Number of Reviews: 23
Ranking in other categories: Streaming Analytics (4th)

Pentaho Data Integration and Analytics
Average Rating: 8.0
Reviews Sentiment: 6.9
Number of Reviews: 53
Ranking in other categories: Data Integration (22nd)
 

Mindshare comparison

Confluent and Pentaho Data Integration and Analytics aren’t in the same category and serve different purposes. Confluent is designed for Streaming Analytics and holds a mindshare of 8.2%, down 11.2% compared to last year. Pentaho Data Integration and Analytics, on the other hand, focuses on Data Integration and holds a 1.6% mindshare, up 0.6% since last year.
 

Featured Reviews

Gustavo-Barbosa Dos Santos - PeerSpot reviewer
Has good technical support services and a valuable feature for real-time data streaming
Implementing Confluent's schema registry has significantly enhanced our organization's data quality assurance. It helps us understand the various requirements of multiple customers and validates the information for different versions. We can automate the tasks using Confluent Kafka. Thus, it guarantees us the data quality and maintains the integrity of message contracts.
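To make the schema registry point concrete, here is a minimal sketch, assuming a local broker and registry, of how a producer can be bound to a registered Avro schema so that records violating the message contract fail at serialization time and never reach the topic. The topic name, schema, and connection settings are illustrative placeholders, not details from the review.

```python
# Minimal sketch: enforce a message contract via Confluent Schema Registry.
# Broker/registry URLs, the "orders" topic, and the Order schema are assumed
# placeholders for illustration only.
from confluent_kafka import SerializingProducer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer

ORDER_SCHEMA = """
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "order_id", "type": "string"},
    {"name": "amount",   "type": "double"}
  ]
}
"""

schema_registry = SchemaRegistryClient({"url": "http://localhost:8081"})  # assumed registry
avro_serializer = AvroSerializer(schema_registry, ORDER_SCHEMA)

producer = SerializingProducer({
    "bootstrap.servers": "localhost:9092",  # assumed broker
    "value.serializer": avro_serializer,
})

# Records that do not match the registered schema raise a serialization error
# before being sent, which is the data-quality guarantee the reviewer describes.
producer.produce(topic="orders", value={"order_id": "A-1001", "amount": 42.5})
producer.flush()
```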
Ryan Ferdon - PeerSpot reviewer
Low-code makes development faster than with Python, but there were caching issues
If you're working with a larger data set, I'm not so sure it would be the best solution. The larger things got, the slower it was. It was kind of buggy sometimes. And when we ran the flow, it didn't go from a perceived start to end, node by node. Everything kicked off at once. That meant there were times when it would get ahead of itself and a job would fail. That was not because the job was wrong, but because Pentaho decided to go at everything at once, and something would process before it was supposed to. There were nodes you could add to make sure that, before this node kicks off, all these others have processed, but it was a bit tedious.

There were also caching issues, and we had to write code to clear the cache every time we opened the program, because the cache would fill up and it wouldn't run. I don't know how hard that would be for them to fix, or if it was fixed in version 10. Also, the UI is a bit outdated, but I'm more of a fan of function over how something looks.

One other thing that would have helped with Pentaho was documentation and support on the internet: how to do things, how to set up. I think there are some sites on how to install it, and Pentaho does have a help repository, but it wasn't always the most useful.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The monitoring module is impressive."
"The client APIs are the most valuable feature."
"The most valuable feature of Confluent is the wide range of features provided. They're leading the market in this category."
"I find Confluent's Kafka Connectors and Kafka Streams invaluable for my use cases because they simplify real-time data processing and ETL tasks by providing reliable, pre-packaged connectors and tools."
"Kafka Connect framework is valuable for connecting to the various source systems where code doesn't need to be written."
"Their tech support is amazing; they are very good, both on and off-site."
"The documentation process is fast with the tool."
"Implementing Confluent's schema registry has significantly enhanced our organization's data quality assurance."
"We also haven't had to create any custom Java code. Almost everywhere it's SQL, so it's done in the pipeline and the configuration. That means you can offload the work to people who, while they are not less experienced, are less technical when it comes to logic."
"One of the valuable features is the ability to use PL/SQL statements inside the data transformations and jobs."
"The fact that it enables us to leverage metadata to automate data pipeline templates and reuse them is definitely one of the features that we like the best. The metadata injection is helpful because it reduces the need to create and maintain additional ETLs. If we didn't have that feature, we would have lots of duplicated ETLs that we would have to create and maintain. The data pipeline templates have definitely been helpful when looking at productivity and costs."
"The solution offers features for data integration and migration. Pentaho Data Integration and Analytics allows the integration of multiple data sources into one. The product is user-friendly and intuitive to use for almost any business."
"It has a really friendly user interface, which is its main feature. The process of automating or combining SQL code with some databases and doing the automation is great and really convenient."
"The fact that it's a low-code solution is valuable. It's good for more junior people who may not be as experienced with programming."
"Provides a good open source option."
"We can schedule job execution in the BA Server, which is the front-end product we're using right now. That scheduling interface is nice."
 

Cons

"It would help if the knowledge based documents in the support portal could be available for public use as well."
"there is room for improvement in the visualization."
"Confluent has a good monitoring tool, but it's not customizable."
"They should remove Zookeeper because of security issues."
"The Schema Registry service could be improved. I would like a bigger knowledge base of other use cases and more technical forums. It would be good to have more flexible monitoring features added to the next release as well."
"In Confluent, there could be a few more VPN options."
"There is a limitation when it comes to seamlessly importing Microsoft documents into Confluent pages, which can be inconvenient for users who frequently work with Microsoft Office tools and need to transition their content to Confluent."
"One area we've identified that could be improved is the governance and access control to the Kafka topics. We've found some limitations, like a threshold of 10,000 rules per cluster, that make it challenging to manage access at scale if we have many different data sources."
"The testing and quality could really improve. Every time that there is a major release, we are very nervous about what is going to get broken. We have had a lot of experience with that, as even the latest one was broken. Some basic things get broken. That doesn't look good for Hitachi at all. If there is one place I would advise them to spend some money and do some effort, it is with the quality. It is not that hard to start putting in some unit tests so basic things don't get broken when they do a new release. That just looks horrible, especially for an organization like Hitachi."
"I would like to see support for some additional cloud sources. It doesn't support Azure, for example. I was trying to do a PoC with Azure the other day but it seems they don't support it."
"Should provide additional control for the data warehouse"
"In the Community edition, it would be nice to have more modules that allow you to code directly within the application. It could have R or Python completely integrated into it, but this could also be because I'm using an older version."
"While Pentaho Data Integration is very friendly, it is not very useful when there isn't a lot of data to handle."
"The performance could be improved. If they could have analytics perform well on large volumes, that would be a big deal for our products."
"​There is not a data quality or MDM solution in the Pentaho DI suite.​"
"I work with different databases. I would like to work with more connectors to new databases, e.g., DynamoDB and MariaDB, and new cloud solutions, e.g., AWS, Azure, and GCP. If they had these connectors, that would be great. They could improve by building new connectors. If you have native connections to different databases, then you can make instructions more efficient and in a more natural way. You don't have to write any scripts to use that connector."
 

Pricing and Cost Advice

"Regarding pricing, I think Confluent is a premium product, but it's hard for me to say definitively if it's overly expensive. We're still trying to understand if the features and reduced maintenance complexity justify the cost, especially as we scale our platform use."
"On a scale from one to ten, where one is low pricing and ten is high pricing, I would rate Confluent's pricing at five. I have not encountered any additional costs."
"It comes with a high cost."
"Confluent is expensive, I would prefer, Apache Kafka over Confluent because of the high cost of maintenance."
"Confluent is highly priced."
"Confluent is an expensive solution as we went for a three contract and it was very costly for us."
"You have to pay additional for one or two features."
"The solution is cheaper than other products."
"The pricing has been pretty good. I'm used to using everything open-source or freeware-based. I understand that organizations need to make sure that the solutions are secure, and that's basically where I hit a roadblock in my current organization. They needed to ensure that we had a license and we had a secure way of accessing it so that no outside parties could get access to our data, but in terms of pricing, considering how much other teams are spending on cloud solutions or even their existing solutions, its price point is pretty good. At this time, there are no additional costs. We just have the licensing fees."
"I mostly used the open-source version. I didn't work with a license."
"For most development tasks, the Enterprise edition should be sufficient. It depends on the type of support that you require for your production environment."
"The price of the regular version is not reasonable and it should be lower."
"I use it because it is free. I download from their page for free. I don't have to pay for a license. With other tools, I have to pay for the licenses. That is why I use Pentaho."
"You don't need the Enterprise Edition, you can go with the Community Edition. That way you can use it for free and, for free, it's a pretty good tool to use."
"We are using the Community Edition. We have been trying to use and sell the Enterprise version, but that hasn't been possible due to the budget required for it."
"We did a two or three-year deal the last time we did it. As compared to other solutions, at least so far in our experience, it has been very affordable. The licensing is by component. So, you need to make sure you only license the components that you really intend to use. I am not sure if we have relicensed after the Hitachi acquisition, but previously, multi-year renewals resulted in a good discount. I'm not sure if this is still the case. We've had the full suite for a lot of years, and there is just the initial cost. I am not aware of any additional costs."
 

Top Industries

By visitors reading reviews

Confluent
Financial Services Firm: 19%
Computer Software Company: 16%
Manufacturing Company: 7%
Insurance Company: 6%

Pentaho Data Integration and Analytics
Financial Services Firm: 22%
Computer Software Company: 15%
Government: 8%
Manufacturing Company: 5%
 

Company Size

By reviewers: distribution across Large Enterprise, Midsize Enterprise, and Small Business.
 

Questions from the Community

What do you like most about Confluent?
I find Confluent's Kafka Connectors and Kafka Streams invaluable for my use cases because they simplify real-time data processing and ETL tasks by providing reliable, pre-packaged connectors and to...
What is your experience regarding pricing and costs for Confluent?
They charge a lot for scaling, which makes it expensive.
What needs improvement with Confluent?
I am not very impressed by Confluent. We continuously face issues, such as Kafka being down and slow responses from the support team. The lack of easy access to the Confluent support team is also a...
Which ETL tool would you recommend to populate data from OLTP to OLAP?
Hi Rajneesh, yes, here is the feature comparison between the Community and Enterprise editions: https://www.hitachivantara.com/en-us/pdf/brochure/leverage-open-source-benefits-with-assurance-of-hita...
What do you think can be improved with Hitachi Lumada Data Integrations?
In my opinion, the reporting side of this tool needs serious improvements. In my previous company, we worked with Hitachi Lumada Data Integration and while it does a good job for what it’s worth, ...
What do you use Hitachi Lumada Data Integrations for most frequently?
My company has used this product to transform data from databases, CSV files, and flat files. It really does a good job. We were most satisfied with the results in terms of how many people could us...
 

Also Known As

No data available
Hitachi Lumada Data Integration, Kettle, Pentaho Data Integration
 


 

Sample Customers

Confluent: ING, Priceline.com, Nordea, Target, RBC, Tivo, Capital One, Chartboost
Pentaho Data Integration and Analytics: 66Controls, Providential Revenue Agency of Ro Negro, NOAA Information Systems, Swiss Real Estate Institute
Updated: April 2025.