
Confluent vs IBM InfoSphere DataStage comparison

 

Comparison Buyer's Guide

Executive Summary

 

Categories and Ranking

Confluent
Average Rating: 8.2
Reviews Sentiment: 6.3
Number of Reviews: 25
Ranking in other categories: Streaming Analytics (5th)

IBM InfoSphere DataStage
Average Rating: 7.8
Reviews Sentiment: 6.7
Number of Reviews: 43
Ranking in other categories: Data Integration (9th)
 

Mindshare comparison

Confluent and IBM InfoSphere DataStage aren’t in the same category and serve different purposes. Confluent is designed for Streaming Analytics and holds a mindshare of 6.9%, down 8.6% compared to last year.
IBM InfoSphere DataStage, on the other hand, focuses on Data Integration, holds 1.9% mindshare, down 5.4% since last year.
Streaming Analytics Mindshare Distribution
Product          Mindshare (%)
Confluent        6.9%
Apache Flink     10.9%
Databricks       9.0%
Other            73.2%
Data Integration Mindshare Distribution
Product                                                    Mindshare (%)
IBM InfoSphere DataStage                                   1.9%
SSIS                                                       3.6%
Informatica Intelligent Data Management Cloud (IDMC)       3.6%
Other                                                      90.9%
 

Featured Reviews

PavanManepalli - PeerSpot reviewer
AVP - Sr Middleware Messaging Integration Engineer at Wells Fargo
Has supported streaming use cases across data centers and simplifies fraud analytics with SQL-based processing
I recommend that Confluent should improve its solution to keep up with competitors in the market, such as Solace and other upcoming tools such as NATS. Recently, there has been a lot of buzz about Confluent charging high fees while not offering features that match those of other tools. They need to improve in that direction by not only reducing costs but also providing better solutions for the problems customers face to avoid frustrations, whether through future enhancement requests or ensuring product stability. The cost should be worked on, and they should provide better solutions for customers.
Solutions should focus on hierarchical topics; if a customer has different types of data and sources, they should be able to send them to the same place for analytics. Currently, Confluent requires everything to be sent to the same topic, which becomes very large and makes running analytics difficult. The hierarchy of topics should be improved. This part is available in MQ and other products such as Solace, but it is missing in Confluent, leading many in capital markets and trading to switch to Solace.
In terms of stability, it is not the stability itself that needs improvement but rather the delivery semantics. Other products offer exactly-once delivery out of the box, whereas Confluent states it will offer this but lacks the knobs or levers for tuning configurations effectively. Confluent has hundreds of configurations that application teams must understand, which creates a gap. Users are often unaware of what values to set for better performance or to achieve exactly-once semantics, making it difficult to navigate through them. Delivery semantics also need to be worked on.
Prasad Bodduluri - PeerSpot reviewer
Senior Data Warehouse Developer at itcinfotech
Has required complex workarounds for scripts and struggles with unstructured data processing
There is no issue with IBM InfoSphere DataStage's graphical interface for designing data flows, but I will provide feedback that we are gathering the source from the Oracle database mainly, as well as from some spreadsheets. With respect to the Oracle DB Connector, if you write any PL/SQL or SQL with the connectors, there aren't many options, such as executing procedures in the PL/SQL, executing functions, or executing packages. The Oracle connector doesn't have many features and needs improvement. Nowadays many people are writing programs in Python or in PL/SQL with respect to Oracle, so especially in IBM InfoSphere DataStage, there are no features to call programs directly instead of calling them as a script.
What I am facing, especially with parallel processing, is that a developer and admin have to sit together. They have to run the job multiple times with different combinations of parallel processing to get the best performance. Instead of that, if the job itself gave some guidance, such as running this parallel processing with this many nodes, it would help; I think that is missing.
An additional feature I would want to see in the next release is the ability to work on logs, especially machine logs or artificial logs, to pull semi-structured or unstructured data without having to write extensive code in Python and integrate it. If IBM InfoSphere DataStage provided some feature for this, it would help.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"I would rate the scalability of the solution at eight out of ten. We have 20 people who use Confluent in our organization now, and we hope to increase usage in the future."
"One of the best features of Confluent is that it's very easy to search and have a live status with Jira."
"The most valuable feature that we are using is the data replication between the data centers allowing us to configure a disaster recovery or software. However, is it's not mandatory to use and because most of the features that we use are from Apache Kafka, such as end-to-end encryption. Internally, we can develop our own kind of product or service from Apache Kafka."
"Implementing Confluent's schema registry has significantly enhanced our organization's data quality assurance."
"We primarily use Confluent for service desk and task management, and it is also good for knowledge base management."
"Our main goal is to validate whether we can build a scalable and cost-efficient way to replicate data from these various sources."
"To date, we have seen improvements in performance and scalability, so we recommend this solution."
"I find Confluent's Kafka Connectors and Kafka Streams invaluable for my use cases because they simplify real-time data processing and ETL tasks by providing reliable, pre-packaged connectors and tools."
"Finding logs is very easy on the solution."
"The best feature of IBM InfoSphere DataStage for me was that it was very much user-friendly. The solution didn't require that much raw coding because most of its features were drag and drop, plus it had a large number of functionalities."
"DataStage works better with Linux operating systems when the application services are hosted on Linux system equipment, but it's powerful on Windows too."
"The most valuable feature is the ability to transfer information via notes."
"This solution is very easy to use because you can design to compile and to run."
"When we have needed help from the IBM team, they were helpful. Our company is a premium partner so we get fast responses."
"IBM is stable and accurate to monitor. It's easy to understand to monitor the data lineage from source to target."
"It is quite useful and powerful."
 

Cons

"Areas for improvement include implementing multi-storage support to differentiate between database stores based on data age and optimizing storage costs."
"It requires some application specific connectors which are lacking. This needs to be added."
"It could be improved by including a feature that automatically creates a new topic and puts failed messages."
"It could have more integration with different platforms."
"The product should integrate tools for incorporating diagrams like Lucidchart. It also needs to improve its formatting features. We also faced issues while granting permissions."
"We continuously face issues, such as Kafka being down and slow responses from the support team."
"Confluent is expensive, I would prefer, Apache Kafka over Confluent because of the high cost of maintenance."
"It would be great if the knowledge based documents in the support portal could be available for public use as well."
"The interface needs work to be more user-friendly."
"Many companies are moving away from DataStage because it is expensive."
"There are three things that could improve - the cloud, monitoring and cloud integration. It's a solid product but not a modern one and of course it depends what you're looking for."
"It would be great if they can include some basic version of data quality checking features."
"What needs improvement in IBM InfoSphere DataStage is its pricing. The pricing for the solution is higher than its competitors, so a lot of the clients my company has worked with prefer other tools over IBM InfoSphere DataStage because of the high price tag. Another area for improvement in the solution stems from a lot of new types of databases, for example, databases in the cloud and big data have become available, and IBM InfoSphere DataStage is working on various connectors for different data sources, but that still isn't up-to-date, meaning that some connectors are missing for modern data sources. The latest version of IBM InfoSphere DataStage also has a complex architecture, so my team faced frequent outages and that should be improved as well."
"The response time from support is slow and needs to be improved."
"The solution should be more user-friendly."
"I'd like to be able to do more with the data and metadata, including copy and pasting, et cetera."
 

Pricing and Cost Advice

"Confluent is expensive, I would prefer, Apache Kafka over Confluent because of the high cost of maintenance."
"Confluent is highly priced."
"Regarding pricing, I think Confluent is a premium product, but it's hard for me to say definitively if it's overly expensive. We're still trying to understand if the features and reduced maintenance complexity justify the cost, especially as we scale our platform use."
"Confluence's pricing is quite reasonable, with a cost of around $10 per user that decreases as the number of users increases. Additionally, it's worth noting that for teams of up to 10 users, the solution is completely free."
"The solution is cheaper than other products."
"You have to pay additional for one or two features."
"It comes with a high cost."
"Confluent is an expensive solution."
"High-cost of ownership: They could take a page from open source software."
"Our internal team takes care of group licensing and cost. We don't have individual licenses. We have group licensing at the company level. Usually, IBM doesn't charge anything separately on the licensing side. For storage and everything else, we are paying around $6,000 per month, which is not very high. It includes Linux data storage, execution, and licensing. They're charging $40 for one-hour execution. Based on that, we are spending around $2,000 on the production environment and $1,000 on the lower environment for testing and development-side executions. For the mainframe, we are using the Db2 mainframe database, and we are spending around $1,000 on the Db2 mainframe database as well. All this comes out to be around $6,000. We, however, would like to have some cost reduction."
"It's quite expensive."
"Pricing varies based on use, and it is not as costly as some competing enterprise solutions."
"The price is expensive but there are no licensing fees."
"The pricing depends on the setup. However, we paid $100,000 as a one-time cost for an on-premises setup."
"The pricing is competitive but on the higher side of the pricing scale."
"The solution is cheap."
 

Top Industries

By visitors reading reviews
Confluent
Financial Services Firm: 16%
Retailer: 10%
Computer Software Company: 10%
Manufacturing Company: 5%

IBM InfoSphere DataStage
Financial Services Firm: 24%
Government: 9%
Manufacturing Company: 8%
Computer Software Company: 7%
 

Company Size

By reviewers

Confluent
Company Size           Count
Small Business         6
Midsize Enterprise     4
Large Enterprise       16

IBM InfoSphere DataStage
Company Size           Count
Small Business         23
Midsize Enterprise     4
Large Enterprise       26
 

Questions from the Community

What do you like most about Confluent?
I find Confluent's Kafka Connectors and Kafka Streams invaluable for my use cases because they simplify real-time data processing and ETL tasks by providing reliable, pre-packaged connectors and to...
What is your experience regarding pricing and costs for Confluent?
They charge a lot for scaling, which makes it expensive.
What needs improvement with Confluent?
I recommend that Confluent should improve its solution to keep up with competitors in the market, such as Solace and other upcoming tools such as NATS. Recently, there has been a lot of buzz about ...
Would you upgrade to more premium versions of IBM InfoSphere DataStage?
My company currently uses the free version of the product, and we are definitely switching to a paid one. We needed a tool that can help us not only integrate our data but use it effectively. For ...
Is IBM InfoSphere DataStage more difficult to use compared to other tools in the field?
I think the tool may cause some difficulties if you have not used other data integration solutions before. I have worked at companies that used different tools for data integration, and they work ...
Do you rely on IBM Cloud Paks for your data? Have you utilized this product, or do you use IBM InfoSphere DataStage without it?
IBM Cloud Paks makes a big difference in your data integration. My company has been using it alongside IBM InfoSphere DataStage and while the main product is good on its own, this one truly expands...
 

Sample Customers

Confluent: ING, Priceline.com, Nordea, Target, RBC, Tivo, Capital One, Chartboost
IBM InfoSphere DataStage: Dubai Statistics Center, Etisalat Egypt
Find out what your peers are saying about Confluent vs. IBM InfoSphere DataStage and other solutions. Updated: March 2026.
885,837 professionals have used our research since 2012.