
IBM InfoSphere DataStage vs StreamSets comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Dec 19, 2024

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

ROI

Sentiment score: 5.9
IBM InfoSphere DataStage increases ROI with improved performance, reduced maintenance, efficient management, and ongoing developer support despite some manual needs.
Sentiment score: 8.1
StreamSets speeds up data processing, boosts efficiency and revenue, simplifies tasks, enhances security, and reduces costs significantly.
 

Customer Service

Sentiment score: 6.2
IBM InfoSphere DataStage support is generally well-rated for availability and responsiveness, but some report regional and efficiency issues.
Sentiment score: 6.7
StreamSets support is responsive and knowledgeable, offering effective solutions, though response times and technical handling could improve.
We also have the flexibility to submit a feature request to be included as part of the wishlist, potentially becoming a product feature in subsequent releases.
Sr Product Manager at a computer software company with 501-1,000 employees
I rate their support as nine on a scale from one to ten.
Senior Data Warehouse Developer at itcinfotech
IBM tech support has allocated dedicated resources, making it satisfactory.
Senior Officer at State Bank of India
IBM technical support sometimes transfers tickets between different teams due to shift changes, which can be frustrating.
Enterprise Solutions Architect at an energy/utilities company with 1,001-5,000 employees
 

Scalability Issues

Sentiment score: 7.5
IBM InfoSphere DataStage scales well but may require hardware adjustments under heavy loads, with user ratings between 7 and 9.
Sentiment score: 7.6
StreamSets is scalable and flexible and is favored for cloud use, but it could improve auto-scaling for large data migrations.
If the job provided suggestions about running this kind of parallel processing and how many virtual nodes are required, it would help.
Senior Data Warehouse Developer at itcinfotech
 

Stability Issues

Sentiment score: 7.6
IBM InfoSphere DataStage is stable, especially on Linux, but experiences some instability on Windows due to memory issues.
Sentiment score: 7.8
StreamSets is praised for stability and reliability, despite minor memory issues, with high user ratings and market competitiveness.
 

Room For Improvement

IBM InfoSphere DataStage requires enhanced interfaces, modern integration, better support, user-friendliness, and adaptability with improved performance and cloud capabilities.
StreamSets struggles with integration, real-time processing, clarity in UI, memory issues, security, documentation, and cloud storage performance.
An additional feature I would want to see in the next release is the ability to work on logs, especially machine logs or artificial logs, to pull semi-structured or unstructured data without having to write extensive code in Python and integrate it.
Senior Data Warehouse Developer at itcinfotech
I wonder if it supports other areas, such as cloud environments with open source support, or EdgeShift.
Sr Product Manager at a computer software company with 501-1,000 employees
The solution needs improvement in connectivity with big data technologies such as Spark.
Senior Officer at State Bank of India
It would be beneficial if StreamSets addressed any potential memory leak issues to prevent unnecessary upgrades.
Enterprise Solutions Architect at an energy/utilities company with 1,001-5,000 employees
 

Setup Cost

IBM InfoSphere DataStage pricing varies widely and can be costly, particularly for small businesses, despite being cheaper than competitors.
StreamSets provides flexible pricing models, with varied user satisfaction, favoring larger enterprises over smaller companies due to cost.
Pricing for IBM InfoSphere DataStage is moderate and not very expensive.
Senior Officer at State Bank of India
 

Valuable Features

IBM InfoSphere DataStage offers robust ETL capabilities, scalability, excellent integration, user-friendly design, and strong performance for large data volumes.
StreamSets offers intuitive interface, extensive connectors, and features accessible to non-technical users for seamless data integration and manipulation.
IBM InfoSphere DataStage is very scalable, allowing us to extend it according to our processing needs.
Senior Officer at State Bank of India
It is straightforward from a design and development perspective, and also for deployment.
Sr Product Manager at a computer software company with 501-1,000 employees
I have leveraged IBM InfoSphere DataStage's integration with IBM's Information Server suite, and it is indeed beneficial.
Senior Data Warehouse Developer at itcinfotech
It allows a hybrid installation approach, rather than being completely cloud-based or on-premises.
Enterprise Solutions Architect at an energy/utilities company with 1,001-5,000 employees
 

Categories and Ranking

IBM InfoSphere DataStage
Ranking in Data Integration: 5th
Average Rating: 7.8
Reviews Sentiment: 6.7
Number of Reviews: 43
Ranking in other categories: No ranking in other categories

StreamSets
Ranking in Data Integration: 22nd
Average Rating: 8.4
Reviews Sentiment: 7.0
Number of Reviews: 21
Ranking in other categories: No ranking in other categories
 

Mindshare comparison

As of January 2026, in the Data Integration category, the mindshare of IBM InfoSphere DataStage is 2.2%, down from 5.5% the previous year. The mindshare of StreamSets is 1.2%, down from 1.6% the previous year. Mindshare is calculated based on PeerSpot user engagement data.
Data Integration Market Share Distribution
Product | Market Share (%)
IBM InfoSphere DataStage | 2.2%
StreamSets | 1.2%
Other | 96.6%
 

Featured Reviews

Prasad Bodduluri - PeerSpot reviewer
Senior Data Warehouse Developer at itcinfotech
Has required complex workarounds for scripts and struggles with unstructured data processing
There is no issue with IBM InfoSphere DataStage's graphical interface for designing data flows, but I will provide feedback that we are gathering the source from the Oracle database mainly, as well as from some spreadsheets. With respect to the Oracle DB Connector, if you write any PL/SQL or SQL with the connectors, there aren't many options, such as executing procedures in the PL/SQL, executing functions, or executing packages. The Oracle connector doesn't have many features and needs improvement. Nowadays many people are writing programs in Python or in PL/SQL with respect to Oracle, so especially in IBM InfoSphere DataStage, there are no features to call programs directly instead of calling them as a script. What I am facing, especially with parallel processing, is that a developer and admin have to sit together. They have to run the job multiple times with different combinations of parallel processing to get the best performance. Instead of that, if the job itself gave some guidance, such as running this parallel processing with this many nodes, it would help; I think that is missing. An additional feature I would want to see in the next release is the ability to work on logs, especially machine logs or artificial logs, to pull semi-structured or unstructured data without having to write extensive code in Python and integrate it. If IBM InfoSphere DataStage provided some feature for this, it would help.
Enterprise Solutions Architect at an energy/utilities company with 1,001-5,000 employees
Enables effective batch loading with visual interface and enterprise support
One issue I observed with StreamSets is that the memory runs out quickly when processing large volumes of data. Because of this memory issue, we have to upgrade our EC2 boxes in the Amazon AWS infrastructure. I had to switch to a new EC2 box, even though the processor was not fully utilized. It would be beneficial if StreamSets addressed any potential memory leak issues to prevent unnecessary upgrades. Additionally, it would be a great enhancement if StreamSets could produce a lineage graph to visualize how the data has passed through the system.
880,481 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 27%
Government: 9%
Manufacturing Company: 8%
Computer Software Company: 7%

Insurance Company: 8%
Financial Services Firm: 8%
Manufacturing Company: 8%
Computer Software Company: 7%
 

Company Size

By reviewers
Company Size | Count
Small Business | 23
Midsize Enterprise | 4
Large Enterprise | 26

By reviewers
Company Size | Count
Small Business | 9
Midsize Enterprise | 2
Large Enterprise | 11
 

Questions from the Community

Would you upgrade to more premium versions of IBM InfoSphere DataStage?
My company currently uses the free version of the product, and we are definitely switching to a paid one. We needed a tool that can help us not only integrate our data but use it effectively. For ...
Is IBM InfoSphere DataStage more difficult to use compared to other tools in the field?
I think the tool may cause some difficulties if you have not used other data integration solutions before. I have worked at companies that used different tools for data integration, and they work ...
Do you rely on IBM Cloud Paks for your data? Have you utilized this product, or do you use IBM InfoSphere DataStage without it?
IBM Cloud Paks makes a big difference in your data integration. My company has been using it alongside IBM InfoSphere DataStage and while the main product is good on its own, this one truly expands...
What do you like most about StreamSets?
The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customiz...
What needs improvement with StreamSets?
One issue I observed with StreamSets is that the memory runs out quickly when processing large volumes of data. Because of this memory issue, we have to upgrade our EC2 boxes in the Amazon AWS infr...
What is your primary use case for StreamSets?
We are using StreamSets for batch loading.
 

Overview

 

Sample Customers

Dubai Statistics Center, Etisalat Egypt
Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
Find out what your peers are saying about IBM InfoSphere DataStage vs. StreamSets and other solutions. Updated: January 2026.