
IBM Cloud Pak for Data vs IBM InfoSphere DataStage comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Apr 20, 2025


Categories and Ranking

IBM Cloud Pak for Data
Ranking in Data Integration: 24th
Average Rating: 7.8
Reviews Sentiment: 6.5
Number of Reviews: 13
Ranking in other categories: Data Virtualization (3rd)

IBM InfoSphere DataStage
Ranking in Data Integration: 6th
Average Rating: 7.8
Reviews Sentiment: 6.8
Number of Reviews: 42
Ranking in other categories: none
 

Mindshare comparison

As of June 2025, IBM Cloud Pak for Data holds a 1.9% mindshare in the Data Integration category, up from 1.7% a year earlier. IBM InfoSphere DataStage holds 5.0%, down from 5.6% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
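
The year-over-year mindshare figures can be read either as a change in percentage points or as a relative change. The following is a minimal sketch of that arithmetic using only the four percentages quoted above; the function name `mindshare_change` is hypothetical and is not part of any PeerSpot tooling.

```python
def mindshare_change(previous_pct, current_pct):
    """Return (percentage-point change, relative change in percent), rounded to one decimal."""
    point_change = current_pct - previous_pct
    relative_change = point_change / previous_pct * 100
    return round(point_change, 1), round(relative_change, 1)


# Figures quoted in the mindshare comparison (previous year, current year).
cloud_pak = mindshare_change(1.7, 1.9)
datastage = mindshare_change(5.6, 5.0)

print("IBM Cloud Pak for Data:", cloud_pak)
print("IBM InfoSphere DataStage:", datastage)
```

Read this way, Cloud Pak for Data gained 0.2 percentage points while DataStage lost 0.6, although DataStage still holds the larger share of the two.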
 

Featured Reviews

Michelle Leslie - PeerSpot reviewer
Starts strong with data management capabilities but needs a demo database
What I would love to see is an end-to-end, almost a training demo database of some sort, where one of the biggest problems with data management is demonstrated. There are so many components to data management, and more often than not, people understand one thing really well. They may understand DataStage and how to move data around, but they do not see the impact of moving data incorrectly. They also do not see the impact of everyone understanding a piece of data in the same way. I would love Cloud Pak to come with a demo database that illustrates the different components of data management in a logical way, so I can see the whole picture instead of just the area I'm specializing in. It would be great if Cloud Pak, from a data modeling point of view, allowed us to import our PDMs, for example. It would be ideal to import and create business terms in Cloud Pak. The PEA would be great to create the technical data. The association between the business and the technical metadata could then be automated by pulling it through from your ACE models. The data modeling component is available in Cloud Pak. Additionally, when it comes to Cloud Pak, even though it has the NextGen DataStage built into it, there is Cloud Pak for Integration as well. Currently, I do not think we have a full enough understanding of how CP4D and CP4I can enhance each other.
Swetha S - PeerSpot reviewer
The solution streamlines design, development, and deployment with effective ETL features
The support has been really good. Typically, if we have any issues, we raise a ticket with IBM, and they help us resolve the issues if required. We also have the flexibility to submit a feature request to be included as part of the wishlist, potentially becoming a product feature in subsequent releases.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"It is a scalable solution, and we have had no issues with its scalability in our company. I rate the solution's scalability a nine out of ten."
"DataStage allows me to connect to different data sources."
"Cloud Pak's most valuable features are IBM MQ, IBM App Connect, IBM API Connect, and ISPF."
"Its data preparation capabilities are highly valuable."
"You can model the data there, connect the data models with the business processes and create data lineage processes."
"The most valuable features of IBM Cloud Pak for Data are the Watson Studio, where we can initiate more groups and write code. Additionally, Watson Machine Learning is available with many other services, such as APIs into which you can plug the machine learning models."
"Scalability-wise, I rate the solution a nine or ten out of ten."
"IBM Watson Catalog and data pipelines are the most valuable features of the solution."
"The most valuable feature of the solution is the ability to incorporate very complex business rules in DataStage."
"The most valuable feature for our data processing needs is IBM InfoSphere DataStage's capability to handle ETL tasks with large record volumes."
"DataStage works better with Linux operating systems when the application services are hosted on Linux system equipment, but it's powerful on Windows too."
"DataStage has also improved its connectors, such as connectivity with Kafka for real-time data integration, cloud connectors, and others like Spark and HVAC SaaS. All these processes are expected to shift to the cloud in the next five years. It's a very robust tool, much like PowerCenter."
"The performance optimization is quite good in DataStage. It provides parallelism and pipelining mechanisms."
"The most valuable feature is the data integration for data warehousing."
"The solution's scalability is really good...we are using multi-instance jobs where you can scale them easily."
"The most valuable feature is the product's versatility to inject data."
 

Cons

"The tool depends on the control plane, an OpenShift container platform utilized as an orchestration layer...So, we have communicated this issue to IBM and asked if it is feasible to adapt the solution to work on a Kubernetes platform that we support."
"The setup cost is very expensive. The cost depends on the pieces of the solution I'm using, how much data I have, and whether it's on the cloud or on-prem."
"There is a solution that is part of IBM Cloud Pak for Data called Watson OpenScale. It is used to monitor the deployed models for the quality and fairness of the results. This is one area that needs a lot of improvement."
"The product is trying to be more mature in terms of connectors. That, I believe, is an area where Cloud Pak can improve."
"One thing that bugs me is how much infrastructure Cloud Pak requires for the initial deployment. It doesn't allow you to start small. The smallest permitted deployment is too big. It's a huge problem that prevents us from implementing the solution in many scenarios."
"The solution's user experience is an area that has room for improvement."
"One challenge I'm facing with IBM Cloud Pak for Data is native features have been decommissioned, such as XML input and output. Too many changes have been made, and my company has around one hundred thousand mappings, so my team has been putting more effort into alternative ways to do things. Another area for improvement in IBM Cloud Pak for Data is that it's more complicated to shift from on-premise to the cloud. Other vendors provide secure agents that easily connect with your existing setup. Still, with IBM Cloud Pak for Data, you have to perform connection migration steps, upgrade to the latest version, etc., which makes it more complicated, especially as my company has XML-based mappings. Still, the XML input and output capabilities of IBM Cloud Pak for Data have been discontinued, so I'd like IBM to bring that back."
"The technical support could be a little better."
"Working with some of the big data components is good, but I can see improvements are needed."
"The documentation and in-application help for this solution need to be improved, especially for new features."
"It would be useful to provide support for Python, AR, and Java."
"There are three things that could improve - the cloud, monitoring and cloud integration. It's a solid product but not a modern one and of course it depends what you're looking for."
"The interface needs improvement. It is really too technical. That is the main problem."
"The troubleshooting guide is very bad."
"It would be great if they can include some basic version of data quality checking features."
"Improvements for DataStage could include better integration with modern data sources like cloud solutions and documents, along with enhancing its capability to handle non-structured data."
 

Pricing and Cost Advice

"IBM Cloud Pak for Data is expensive. If we include the training time and the machine learning, it's expensive. The cost of the execution is more reasonable."
"The solution's pricing is competitive with that of other vendors."
"Cloud Pak's cost is a little high."
"The solution is expensive."
"I don't have the exact licensing cost for IBM Cloud Pak for Data, as my company is still finalizing requirements, including monthly, yearly, and three-year licensing fees. Still, on a scale of one to five, I'd rate it a three because, compared to other vendors, it's more complicated."
"It's quite expensive."
"For the licensing of the solution, there is a yearly payment that needs to be made. Also, since it is expensive, cost-wise, I rate the solution an eight or nine out of ten."
"I think that this product is too expensive for smaller companies."
"The cost is too high."
"Our internal team takes care of group licensing and cost. We don't have individual licenses. We have group licensing at the company level. Usually, IBM doesn't charge anything separately on the licensing side. For storage and everything else, we are paying around $6,000 per month, which is not very high. It includes Linux data storage, execution, and licensing. They're charging $40 for one-hour execution. Based on that, we are spending around $2,000 on the production environment and $1,000 on the lower environment for testing and development-side executions. For the mainframe, we are using the Db2 mainframe database, and we are spending around $1,000 on the Db2 mainframe database as well. All this comes out to be around $6,000. We, however, would like to have some cost reduction."
"It is quite expensive."
"Pricing varies based on use, and it is not as costly as some competing enterprise solutions."
"The pricing depends on the setup. However, we paid $100,000 as a one-time cost for an on-premises setup."
"It's very expensive."
"The solution is cheap."
"The price is expensive but there are no licensing fees."
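
One reviewer above itemizes a monthly bill of around $6,000 and quotes a $40 hourly execution rate. The sketch below simply works through those quoted figures; the assumption that execution spend is billed purely at the hourly rate is ours, and the reviewer does not break down the part of the total that is not itemized (the quote mentions storage and licensing as included).

```python
# Figures quoted by the reviewer (USD per month unless noted).
HOURLY_EXECUTION_RATE = 40    # "$40 for one-hour execution"
QUOTED_MONTHLY_TOTAL = 6000   # "around $6,000 per month"

itemized_costs = {
    "production execution": 2000,
    "test and development execution": 1000,
    "Db2 mainframe database": 1000,
}

itemized_total = sum(itemized_costs.values())
unitemized_remainder = QUOTED_MONTHLY_TOTAL - itemized_total

# Assumption: execution spend is billed purely at the quoted hourly rate.
production_hours = itemized_costs["production execution"] / HOURLY_EXECUTION_RATE
test_dev_hours = itemized_costs["test and development execution"] / HOURLY_EXECUTION_RATE

print(f"Itemized: ${itemized_total}, not itemized: ${unitemized_remainder}")
print(f"Implied execution hours: {production_hours:.0f} production, {test_dev_hours:.0f} test/dev")
```

Under that assumption, the quoted execution spend corresponds to roughly 50 production hours and 25 test and development hours per month.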
857,688 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

IBM Cloud Pak for Data
Financial Services Firm: 30%
Manufacturing Company: 11%
Computer Software Company: 10%
Government: 6%

IBM InfoSphere DataStage
Financial Services Firm: 27%
Computer Software Company: 11%
Manufacturing Company: 9%
Government: 9%
 

Company Size

By reviewers (chart): Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about IBM Cloud Pak for Data?
DataStage allows me to connect to different data sources.
What is your experience regarding pricing and costs for IBM Cloud Pak for Data?
The setup cost is very expensive. The cost depends on the pieces of the solution I'm using, how much data I have, and whether it's on the cloud or on-prem.
What needs improvement with IBM Cloud Pak for Data?
What I would love to see is an end-to-end, almost a training demo database of some sort, where one of the biggest problems with data management is demonstrated. There are so many components to data...
Would you upgrade to more premium versions of IBM InfoSphere DataStage?
My company currently uses the free version of the product, and we are definitely switching to a paid one. We needed a tool that can help us not only integrate our data but use it effectively. For ...
Is IBM InfoSphere DataStage more difficult to use compared to other tools in the field?
I think the tool may cause some difficulties if you have not used other data integration solutions before. I have worked at companies that used different tools for data integration, and they work ...
Do you rely on IBM Cloud Paks for your data? Have you utilized this product, or do you use IBM InfoSphere DataStage without it?
IBM Cloud Paks makes a big difference in your data integration. My company has been using it alongside IBM InfoSphere DataStage and while the main product is good on its own, this one truly expands...
 

Also Known As

IBM Cloud Pak for Data: Cloud Pak for Data
IBM InfoSphere DataStage: No data available
 

Sample Customers

IBM Cloud Pak for Data: Qatar Development Bank, GuideWell, Skanderborg Music Festival
IBM InfoSphere DataStage: Dubai Statistics Center, Etisalat Egypt
Find out what your peers are saying about IBM Cloud Pak for Data vs. IBM InfoSphere DataStage and other solutions. Updated: June 2025.