
IBM Cloud Pak for Data vs IBM InfoSphere DataStage comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Jul 27, 2025

 

Categories and Ranking

IBM Cloud Pak for Data
Ranking in Data Integration: 26th
Average Rating: 7.8
Reviews Sentiment: 6.5
Number of Reviews: 13
Ranking in other categories: Data Virtualization (3rd)

IBM InfoSphere DataStage
Ranking in Data Integration: 7th
Average Rating: 7.8
Reviews Sentiment: 6.8
Number of Reviews: 42
Ranking in other categories: None
 

Mindshare comparison

As of October 2025, in the Data Integration category, the mindshare of IBM Cloud Pak for Data is 1.8%, up from 1.7% the previous year. The mindshare of IBM InfoSphere DataStage is 3.5%, down from 5.7% the previous year. Mindshare is calculated from PeerSpot user engagement data.
Data Integration Market Share Distribution
Product: Market Share (%)
IBM InfoSphere DataStage: 3.5%
IBM Cloud Pak for Data: 1.8%
Other: 94.7%
 

Featured Reviews

Michelle Leslie - PeerSpot reviewer
Starts strong with data management capabilities but needs a demo database
What I would love to see is an end-to-end, almost a training demo database of some sort, where one of the biggest problems with data management is demonstrated. There are so many components to data management, and more often than not, people understand one thing really well. They may understand DataStage and how to move data around, but they do not see the impact of moving data incorrectly. They also do not see the impact of everyone understanding a piece of data in the same way. I would love Cloud Pak to come with a demo database that illustrates the different components of data management in a logical way, so I can see the whole picture instead of just the area I'm specializing in. It would be great if Cloud Pak, from a data modeling point of view, allowed us to import our PDMs, for example. It would be ideal to import and create business terms in Cloud Pak. The PEA would be great to create the technical data. The association between the business and the technical metadata could then be automated by pulling it through from your ACE models. The data modeling component is available in Cloud Pak. Additionally, when it comes to Cloud Pak, even though it has the NextGen DataStage built into it, there is Cloud Pak for data integration as well. Currently, I do not think we have a full enough understanding of how CP4D and CP4I can enhance each other.
Swetha S - PeerSpot reviewer
The solution streamlines design, development, and deployment with effective ETL features
The support has been really good. Typically, if we have any issues, we raise a ticket with IBM, and they help us resolve the issues if required. We also have the flexibility to submit a feature request to be included as part of the wishlist, potentially becoming a product feature in subsequent releases.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"Its data preparation capabilities are highly valuable."
"DataStage allows me to connect to different data sources."
"What I found most helpful in IBM Cloud Pak for Data is containerization, which means it's easy to shift and leave in terms of moving to other clouds. That's an advantage of IBM Cloud Pak for Data."
"Cloud Pak is a very, very, very good system."
"One of Cloud Pak's best features is the Watson Knowledge Catalog, which helps you implement data governance."
"The most valuable features of IBM Cloud Pak for Data are the Watson Studio, where we can initiate more groups and write code. Additionally, Watson Machine Learning is available with many other services, such as APIs which you can plug the machine learning models."
"You can model the data there, connect the data models with the business processes and create data lineage processes."
"Cloud Pak's most valuable features are IBM MQ, IBM App Connect, IBM API Connect, and ISPF."
"The best feature of IBM InfoSphere DataStage for me was that it was very much user-friendly. The solution didn't require that much raw coding because most of its features were drag and drop, plus it had a large number of functionalities."
"Once you have Infosphere up and running properly, it is stable."
"Finding logs is very easy on the solution."
"I am impressed with the tool's ETL tracing."
"It generates highly efficient backend code to write data onto IBM systems, which I find valuable."
"As a data integration platform, it is easy to use. It is quite robust and useful for volumetric analysis when you have huge volumes of data. We have tested it for up to ten million rows, and it is robust enough to process ten million rows internally with its parallel processing. Its error logging mechanism is far simpler and easier to understand than other data integration tools. The newer version of InfoSphere has the data catalog and IDC lineage. They are helpful in the easy traceability of columns and tables."
"The performance optimization is quite good in DataStage. It provides parallelism and pipelining mechanisms"
"The product is easy to deploy."
 

Cons

"The interface could improve because sometimes it becomes slow. Sometimes there is a delay between clicks when using the software, which can make the development process slow. It can take a few seconds to complete one action, and then a few more seconds to do the next one."
"The product is trying to be more mature in terms of connectors. That, I believe, is an area where Cloud Pak can improve."
"The solution's catalog searching or map search needs to be improved."
"What I would love to see is an end-to-end, almost a training demo database of some sort, where one of the biggest problems with data management is demonstrated."
"One challenge I'm facing with IBM Cloud Pak for Data is native features have been decommissioned, such as XML input and output. Too many changes have been made, and my company has around one hundred thousand mappings, so my team has been putting more effort into alternative ways to do things. Another area for improvement in IBM Cloud Pak for Data is that it's more complicated to shift from on-premise to the cloud. Other vendors provide secure agents that easily connect with your existing setup. Still, with IBM Cloud Pak for Data, you have to perform connection migration steps, upgrade to the latest version, etc., which makes it more complicated, especially as my company has XML-based mappings. Still, the XML input and output capabilities of IBM Cloud Pak for Data have been discontinued, so I'd like IBM to bring that back."
"The tool depends on the control plane, an OpenShift container platform utilized as an orchestration layer...So, we have communicated this issue to IBM and asked if it is feasible to adapt the solution to work on a Kubernetes platform that we support."
"The technical support could be a little better."
"There is a solution that is part of IBM Cloud Pak for Data called Watson OpenScale. It is used to monitor the deployed models for the quality and fairness of the results. This is one area that needs a lot of improvement."
"The response time from support is slow and needs to be improved."
"It would be useful to provide support for Python, AR, and Java."
"There are three things that could improve - the cloud, monitoring and cloud integration. It's a solid product but not a modern one and of course it depends what you're looking for."
"Its documentation is not up to the mark. While building APIs, we had a lot of problems trying to get around it because it is not very user-friendly. We tried to get hold of API documentation, but the documentation is not very well thought out. It should be more structured and elaborate. In terms of additional features, I would like to see good reporting on performance and performance-tuning recommendations that can be based on AI. I would also like to see better data profiling information being reported on InfoSphere."
"It doesn't have any big data connections. It would be good to have them because most of the systems are moving towards big data. There should also be a user-friendly way to interact with the cloud. Its loading process is very slow. It takes a lot of time for around 5 or 6 million records, and we are not able to provide real-time data to the vendors due to this delay. Its performance needs to be improved. It is also like a legacy system. It is not updated much. In higher versions, they only do small changes. We would like to have new features and new technologies."
"The solution can be a bit more user-friendly, similar to Informatica."
"Improvements for DataStage could include better integration with modern data sources like cloud solutions and documents, along with enhancing its capability to handle non-structured data."
"They can provide better support for non-IBM tools when it comes to the target."
 

Pricing and Cost Advice

"I think that this product is too expensive for smaller companies."
"It's quite expensive."
"For the licensing of the solution, there is a yearly payment that needs to be made. Also, since it is expensive, cost-wise, I rate the solution an eight or nine out of ten."
"Cloud Pak's cost is a little high."
"The solution's pricing is competitive with that of other vendors."
"The solution is expensive."
"I don't have the exact licensing cost for IBM Cloud Pak for Data, as my company is still finalizing requirements, including monthly, yearly, and three-year licensing fees. Still, on a scale of one to five, I'd rate it a three because, compared to other vendors, it's more complicated."
"IBM Cloud Pak for Data is expensive. If we include the training time and the machine learning, it's expensive. The cost of the execution is more reasonable."
"The cost is too high."
"It is quite expensive."
"The product is expensive."
"The pricing depends on the setup. However, we paid $100,000 as a one-time cost for an on-premises setup."
"It's very expensive."
"Pricing varies based on use, and it is not as costly as some competing enterprise solutions."
"It's quite expensive."
"High-cost of ownership: They could take a page from open source software."
871,688 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 28%
Manufacturing Company: 10%
Computer Software Company: 9%
Government: 6%

Financial Services Firm: 28%
Government: 9%
Computer Software Company: 9%
Manufacturing Company: 8%
 

Company Size

By reviewers
Company Size: Count
Small Business: 7
Large Enterprise: 8

By reviewers
Company Size: Count
Small Business: 23
Midsize Enterprise: 4
Large Enterprise: 25
 

Questions from the Community

What is your experience regarding pricing and costs for IBM Cloud Pak for Data?
The setup cost is very expensive. The cost depends on the pieces of the solution I'm using, how much data I have, and whether it's on the cloud or on-prem.
What needs improvement with IBM Cloud Pak for Data?
What I would love to see is an end-to-end, almost a training demo database of some sort, where one of the biggest problems with data management is demonstrated. There are so many components to data...
What is your primary use case for IBM Cloud Pak for Data?
My primary use case for Cloud Pak is that I am the reference Data steward for the Africa regions in the banks where I work. My main objective is to capture the reference data in Caltech or Data and...
Would you upgrade to more premium versions of IBM InfoSphere DataStage?
My company currently uses the free version of the product, and we are definitely switching to a paid one. We needed a tool that can help us not only integrate our data but use it effectively. For ...
Is IBM InfoSphere DataStage more difficult to use compared to other tools in the field?
I think the tool may cause some difficulties if you have not used other data integration solutions before. I have worked at companies that used different tools for data integration, and they work ...
Do you rely on IBM Cloud Paks for your data? Have you utilized this product, or do you use IBM InfoSphere DataStage without it?
IBM Cloud Paks makes a big difference in your data integration. My company has been using it alongside IBM InfoSphere DataStage and while the main product is good on its own, this one truly expands...
 

Also Known As

Cloud Pak for Data
No data available
 


Sample Customers

Qatar Development Bank, GuideWell, Skanderborg Music Festival
Dubai Statistics Center, Etisalat Egypt
Find out what your peers are saying about IBM Cloud Pak for Data vs. IBM InfoSphere DataStage and other solutions. Updated: September 2025.