IBM Cloud Pak for Data vs IBM InfoSphere DataStage comparison

Helped 885,444 peers since 2012

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

ROI

Sentiment score

5.9

IBM Cloud Pak for Data boosts ROI by improving data quality, reducing costs, and enabling efficient, compliant AI-driven decisions.

Sentiment score

5.9

IBM InfoSphere DataStage increases ROI with improved performance, reduced maintenance, efficient management, and ongoing developer support despite some manual needs.

We have been able to drive responsible, transparent, and explainable AI workflow to operationalize AI and mitigate risk and regulatory compliance easily.

Senior Data Analyst at Wipro Limited

It is easy to collect, organize, and analyze data no matter where it is, hence being able to make data-driven decisions.

Engineer at Turner Construction

Customer Service

Sentiment score

7.1

IBM Cloud Pak for Data support is generally satisfactory but needs improved responsiveness, especially for complex issues and non-English assistance.

Sentiment score

6.2

IBM InfoSphere DataStage support is generally well-rated for availability and responsiveness, but some report regional and efficiency issues.

Cloud Pak is a complicated system, and it's often difficult to find the right resource in IBM to help with specific issues.

Michelle Leslie

Data asset management engineer at a tech services company with 1-10 employees

The customer support for IBM Cloud Pak for Data is great and responsive.

Engineer at Turner Construction

I would rate IBM's support at about a seven or eight out of ten because we have good support coverage owing to our long association with IBM.

reviewer2648136

EDW Manager at a university with 1,001-5,000 employees

We also have the flexibility to submit a feature request to be included as part of the wishlist, potentially becoming a product feature in subsequent releases.

Swetha S

Sr Product Manager at a computer software company with 501-1,000 employees

I rate their support as nine on a scale from one to ten.

Senior Data Warehouse Developer at itcinfotech

IBM tech support has allocated dedicated resources, making it satisfactory.

Senior Officer at State Bank of India

Scalability Issues

Sentiment score

6.9

IBM Cloud Pak for Data is highly rated for scalability, favored by medium and large organizations for its expandable infrastructure.

Sentiment score

7.5

IBM InfoSphere DataStage scales well but may require hardware adjustments under heavy loads, with ratings between 7 and 9.

I have not noticed any downtime or lagging, especially when dealing with large data, so it is relatively very scalable.

Senior Data Analyst at Wipro Limited

IBM Cloud Pak for Data's scalability is very good; it can be used by any size of organization.

Engineer at Turner Construction

If the job provided suggestions about running this kind of parallel processing and how many virtual nodes are required, it would help.

Senior Data Warehouse Developer at itcinfotech

Stability Issues

Sentiment score

7.8

IBM Cloud Pak for Data is generally stable, rated 7-9, improving over time, despite integration-related challenges.

Sentiment score

7.6

IBM InfoSphere DataStage is stable, especially on Linux, but experiences some instability on Windows due to memory issues.

Room For Improvement

IBM Cloud Pak for Data needs better performance, integration, user support, deployment options, and more training resources to enhance usability.

IBM InfoSphere DataStage requires enhanced interfaces, modern integration, better support, user-friendliness, and adaptability with improved performance and cloud capabilities.

Setting up the hybrid and multi-cloud environments is a long job and it takes time.

Senior Data Analyst at Wipro Limited

IBM Cloud Pak for Data can be improved because processing speeds are sometimes slow.

Engineer at Turner Construction

To improve IBM Cloud Pak for Data, I suggest more out-of-the-box integration.

Eunice Romper

Senior Project Manager at EY

If the job itself gave some guidance, such as running this parallel processing with this many nodes, it would help; I think that is missing.

Senior Data Warehouse Developer at itcinfotech

I wonder if it supports other areas, such as cloud environments with open source support, or EdgeShift.

Swetha S

Sr Product Manager at a computer software company with 501-1,000 employees

The solution needs improvement in connectivity with big data technologies such as Spark.

Senior Officer at State Bank of India

Setup Cost

IBM Cloud Pak for Data offers flexible, competitive pricing, appealing to large enterprises despite high setup costs.

IBM InfoSphere DataStage pricing varies widely and can be costly, particularly for small businesses, despite being cheaper than competitors.

The setup cost is very expensive.

Michelle Leslie

Data asset management engineer at a tech services company with 1-10 employees

Regarding my experience with pricing, setup cost, and licensing, for a small organization, the price might be relatively high, but for huge enterprises such as ours, the price is relatively affordable.

Senior Data Analyst at Wipro Limited

The list price is high, but the flexibility in pricing is adequate.

Bálint Tóth

Solution Manager at Intalion

Pricing for IBM InfoSphere DataStage is moderate and not very expensive.

Senior Officer at State Bank of India

Valuable Features

IBM offers powerful AI and data tools enhancing decision-making, connectivity, and cost efficiency through automation and multi-cloud support.

IBM InfoSphere DataStage offers robust ETL capabilities, scalability, excellent integration, user-friendly design, and strong performance for large data volumes.

From there, I can work my way into a more granular level, applying all of that information on top of my actual data to understand what my data looks like, where it came from, and where it went wrong, managing it throughout the cycle.

Michelle Leslie

Data asset management engineer at a tech services company with 1-10 employees

The benefits of choosing IBM Cognos, in addition to saving on cost, include having institutional knowledge about maintaining this infrastructure and enough people who have developed on Cognos in the past, which creates comfort in its use.

reviewer2648136

EDW Manager at a university with 1,001-5,000 employees

We have been able to save approximately 80 percent of our time. We are not doing data analysis manually, so this relieves our data department of dealing with data.

Senior Data Analyst at Wipro Limited

It is straightforward from a design and development perspective, and also for deployment.

Swetha S

Sr Product Manager at a computer software company with 501-1,000 employees

As we are a financial organization, security is our main concern, so we prefer enterprise tools.

Senior Officer at State Bank of India

I have leveraged IBM InfoSphere DataStage's integration with IBM's Information Server suite, and it is indeed beneficial.

Senior Data Warehouse Developer at itcinfotech

Categories and Ranking

Metric	IBM Cloud Pak for Data	IBM InfoSphere DataStage
Ranking in Data Integration	16th	9th
Average Rating	8.2	7.8
Reviews Sentiment	6.2	6.7
Ranking in other categories	Data Virtualization (3rd)	No ranking in other categories

Mindshare comparison

As of April 2026, in the Data Integration category, the mindshare of IBM Cloud Pak for Data is 1.3%, down from 1.7% the previous year. The mindshare of IBM InfoSphere DataStage is 1.9%, down from 5.4% the previous year. Mindshare is calculated from PeerSpot user engagement data.

Data Integration Mindshare Distribution
Product	Mindshare (%)
IBM InfoSphere DataStage	1.9%
IBM Cloud Pak for Data	1.3%
Other	96.8%
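
The mindshare figures above can be checked with a few lines of arithmetic. The following is a minimal sketch: only the percentages come from the text, while the dictionary layout and the function name `relative_change` are illustrative assumptions.

```python
# Mindshare percentages quoted above: (previous year, current year).
mindshare = {
    "IBM Cloud Pak for Data": (1.7, 1.3),
    "IBM InfoSphere DataStage": (5.4, 1.9),
}


def relative_change(previous: float, current: float) -> float:
    """Return the change from previous to current as a percentage of previous."""
    return (current - previous) / previous * 100.0


# Share left over for every other product in the category.
other_share = 100.0 - sum(current for _, current in mindshare.values())

if __name__ == "__main__":
    for product, (previous, current) in mindshare.items():
        points = previous - current
        change = relative_change(previous, current)
        print(f"{product}: down {points:.1f} points ({change:.1f}% relative)")
    print(f"Other: {other_share:.1f}%")
```

A drop in percentage points and a relative drop are different measures, so the sketch reports both.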

Featured Reviews

Collaborative data platform has transformed analytics and now drives faster decisions

Senior Data Analyst at Wipro Limited

The best features IBM Cloud Pak for Data offers include robust data visualization, centralized data analytics, data reliability, and compatibility with hybrid and multi-cloud environments. The compatibility with hybrid and multi-cloud environments has helped our organization as data visualization is very simple. Migrations, reading, analysis, and data management from other sources are performed without problems of requirements. We have a team of experts in IBM Cloud Pak for Data to maintain security and correct data management easily. I find this cloud excellent for visualizing and managing data across networks and also fulfilling fastest data storage, making it less complex and completely improving productivity in my organization. Everything is managed in multiple environments without any problem. IBM Cloud Pak for Data has positively impacted my organization, and I have noticed some improvement since we started using this tool. Configuration with hybrid and multi-cloud environments has been very seamless and easy. It is a robust platform capable of working with multiple data sources where we gain insights to make data-driven decisions easily. It automates data analysis for quick and better performance. We have seen improvements in analysis and data correction from multiple sources. Our productivity in the company is growing, thanks to the data analysis team. We have also seen a robust hybrid and multi-cloud access system working seamlessly. I can share specific outcomes that show how productivity has grown and how performance has improved since the data is automated, and the analysis is done much faster, saving us a lot of time. We have been able to save approximately 80 percent of our time. We are not doing data analysis manually, so this relieves our data department of dealing with data. We have been relieved of a lot of duties, and now we are able to focus on other strategic tasks. 
Our productivity has greatly increased since we are able to make concrete and data-driven decisions easily.

Has required complex workarounds for scripts and struggles with unstructured data processing

Senior Data Warehouse Developer at itcinfotech

There is no issue with IBM InfoSphere DataStage's graphical interface for designing data flows, but I will provide feedback that we are gathering the source from the Oracle database mainly, as well as from some spreadsheets. With respect to the Oracle DB Connector, if you write any PL/SQL or SQL with the connectors, there aren't many options, such as executing procedures in the PL/SQL, executing functions, or executing packages. The Oracle connector doesn't have many features and needs improvement. Nowadays many people are writing programs in Python or in PL/SQL with respect to Oracle, so especially in IBM InfoSphere DataStage, there are no features to call programs directly instead of calling them as a script. What I am facing, especially with parallel processing, is that a developer and admin have to sit together. They have to run the job multiple times with different combinations of parallel processing to get the best performance. Instead of that, if the job itself gave some guidance, such as running this parallel processing with this many nodes, it would help; I think that is missing. An additional feature I would want to see in the next release is the ability to work on logs, especially machine logs or artificial logs, to pull semi-structured or unstructured data without having to write extensive code in Python and integrate it. If IBM InfoSphere DataStage provided some feature for this, it would help.

Top Industries

By visitors reading reviews

IBM Cloud Pak for Data: Financial Services Firm (21%), Manufacturing Company (10%), Computer Software Company, University
IBM InfoSphere DataStage: Financial Services Firm (24%), Government, Manufacturing Company, Computer Software Company

Company Size

By reviewers

IBM Cloud Pak for Data
Company Size	Count
Small Business	8
Large Enterprise	15

IBM InfoSphere DataStage
Company Size	Count
Small Business	23
Midsize Enterprise	4
Large Enterprise	26

Questions from the Community

What is your experience regarding pricing and costs for IBM Cloud Pak for Data?

Regarding the price, I know IBM is traditionally relatively expensive in the Hungarian market, but we work together with the local IBM sales team, and on a project basis they manage to negotiate th...

What needs improvement with IBM Cloud Pak for Data?

I think we are happy with IBM Cloud Pak for Data, and there is no specific idea that comes to my mind regarding room for improvement. We are following the progress and the new features, so overall ...

What is your primary use case for IBM Cloud Pak for Data?

I usually recommend IBM Cloud Pak for Data for companies in the financial sector, as we are mostly working with local insurance companies and banks within Hungary where we are located. For IBM Clou...

Would you upgrade to more premium versions of IBM InfoSphere DataStage?

My company currently uses the free version of the product, and we are definitely switching to a paid one. We needed a tool that can help us not only integrate our data but use it effectively. For ...

Is IBM InfoSphere DataStage more difficult to use compared to other tools in the field?

I think the tool may cause some difficulties if you have not used other data integration solutions before. I have worked at companies that used different tools for data integration, and they work ...

Do you rely on IBM Cloud Paks for your data? Have you utilized this product, or do you use IBM InfoSphere DataStage without it?

IBM Cloud Paks makes a big difference in your data integration. My company has been using it alongside IBM InfoSphere DataStage and while the main product is good on its own, this one truly expands...

Comparisons

IBM Cloud Pak for Data
Comparison	Compared (% of the time)
Denodo vs IBM Cloud Pak for Data	8%
Informatica Intelligent Data Management Cloud (IDMC) vs IBM Cloud Pak for Data	7%
SAP HANA vs IBM Cloud Pak for Data	5%
Import.io vs IBM Cloud Pak for Data	5%
Azure Data Factory vs IBM Cloud Pak for Data	5%

IBM InfoSphere DataStage
Comparison	Compared (% of the time)
SSIS vs IBM InfoSphere DataStage	7%
Informatica PowerCenter vs IBM InfoSphere DataStage	6%
IBM InfoSphere Information Server vs IBM InfoSphere DataStage	5%
Qlik Talend Cloud vs IBM InfoSphere DataStage	4%
Azure Data Factory vs IBM InfoSphere DataStage	4%

Also Known As

Cloud Pak for Data

No data available

Overview

IBM Cloud Pak® for Data is a fully integrated data and AI platform that modernizes how businesses collect, organize and analyze data to infuse AI throughout their organizations. Cloud-native by design, the platform unifies market-leading services spanning the entire analytics lifecycle. From data management, DataOps and governance to business analytics and automated AI, IBM Cloud Pak for Data helps eliminate the need for costly, and often competing, point solutions while providing the information architecture you need to implement AI successfully.

Building on the streamlined hybrid-cloud foundation of Red Hat® OpenShift®, IBM Cloud Pak for Data takes advantage of the underlying resource and infrastructure optimization and management. The solution fully supports multicloud environments such as Amazon Web Services (AWS), Azure, Google Cloud, IBM Cloud™ and private cloud deployments. Find out how IBM Cloud Pak for Data can lower your total cost of ownership and accelerate innovation.

IBM

IBM InfoSphere DataStage is a data integration tool for designing, developing, and running jobs that move and transform data, aimed at organizations of different sizes. The product integrates data across multiple systems through a high-performance parallel framework. It supports extended metadata management, enterprise connectivity, and integration of all types of data.

The solution is the data integration component of IBM InfoSphere Information Server, providing a graphical framework for moving data from source systems to target systems. IBM InfoSphere DataStage can deliver data to data warehouses, data marts, operational data sources, and other enterprise applications. The tool supports both extract, transform, and load (ETL) and extract, load, and transform (ELT) patterns. The platform achieves its scalability through parallel processing and enterprise connectivity.
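
The difference between the ETL and ELT patterns mentioned above is only where the transformation runs. The sketch below illustrates this with Python's built-in `sqlite3` module standing in for a target database. It does not use any DataStage API; the sample rows, the table names, and the functions `run_etl`, `run_elt`, and `load_both` are all hypothetical.

```python
import sqlite3

# Hypothetical source rows: (customer name, amount stored as text).
SOURCE_ROWS = [(" alice ", "10.50"), ("BOB", "3.25"), ("carol", "7.00")]


def run_etl(conn: sqlite3.Connection) -> None:
    """ETL: transform the rows in the pipeline, then load the cleaned result."""
    cleaned = [(name.strip().title(), float(amount)) for name, amount in SOURCE_ROWS]
    conn.execute("CREATE TABLE sales_etl (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)


def run_elt(conn: sqlite3.Connection) -> None:
    """ELT: load the raw rows first, then transform inside the target database."""
    conn.execute("CREATE TABLE staging (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", SOURCE_ROWS)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT UPPER(SUBSTR(TRIM(customer), 1, 1)) "
        "       || LOWER(SUBSTR(TRIM(customer), 2)) AS customer, "
        "       CAST(amount AS REAL) AS amount "
        "FROM staging"
    )


def load_both() -> tuple:
    """Run both patterns against one in-memory database and return their rows."""
    conn = sqlite3.connect(":memory:")
    run_etl(conn)
    run_elt(conn)
    etl = conn.execute("SELECT customer, amount FROM sales_etl ORDER BY customer").fetchall()
    elt = conn.execute("SELECT customer, amount FROM sales_elt ORDER BY customer").fetchall()
    conn.close()
    return etl, elt
```

Both functions end with the same cleaned rows. In the ELT version the raw data also stays available in the `staging` table, and the transformation work is pushed to the database engine.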

The solution has various versions, catering to different types of companies, which include the Server Edition, the Enterprise Edition, and the MVS Edition. Depending on which version a company has bought, different goals can be achieved. They include the following:

Designing data flows to extract information from multiple sources, transform the data, and deliver it to target databases or applications.
Delivery of relevant and accurate data through direct connections to enterprise applications.
Reduction of development time and improvement of consistency through prebuilt functions.
Utilization of InfoSphere Information Server tools for accelerating the project delivery cycle.

IBM InfoSphere DataStage can be deployed in various ways, including:

As a service: The tool can be accessed through a subscription model, where its capabilities are a part of IBM DataStage on IBM Cloud Pak for Data as a Service. This option offers full management on IBM Cloud.
On premises or in any cloud: The two editions - IBM DataStage Enterprise and IBM DataStage Enterprise Plus - can run workloads on premises or in any cloud when added to IBM DataStage on IBM Cloud Pak for Data as a Service.
On premises: The basic jobs of the tool can be run on premises using IBM DataStage.

IBM InfoSphere DataStage Features

The tool has various features through which users can integrate and utilize their data effectively. The components of IBM InfoSphere DataStage include:

AI services: The tool offers services such as data science, event messaging, data warehousing, and data virtualization. It accelerates processes through artificial intelligence (AI) and offers a connection with IBM Cloud Paks - the cloud-native insight platform of the solution.
Parallel engine: Through this feature, ETL performance can be optimized to process data at scale. This is achieved through the parallel engine and load balancing, which together maximize throughput.
Metadata support: This feature of the product uses the IBM Watson Knowledge Catalog to protect companies' sensitive data and monitor who can access it and at what levels.
Automated delivery pipelines: IBM InfoSphere DataStage reduces costs by automating continuous integration and delivery of pipelines.
Prebuilt connectors: The feature for prebuilt connectivity and stages allows users to move data between multiple cloud sources and data warehouses, including IBM native products.
IBM DataStage Flow Designer: This feature offers assistance through machine learning design. The product offers its clients a user-friendly interface which facilitates the work process.
IBM InfoSphere QualityStage: The tool provides a feature that automatically resolves data quality issues and increases the reliability of the delivered data.
Automated failure detection: Through this feature, companies can reduce infrastructure management efforts, relying on the automated detection that the tool offers.
Distributed data processing: Cloud runtimes can be executed remotely through this feature while maintaining data sovereignty and decreasing costs.
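
The parallel engine item in the list above describes partitioned processing only in general terms. As an illustration of the idea (not DataStage's actual engine or API), the sketch below splits rows into a configurable number of partitions, transforms each partition in its own worker, and merges the results. The names `partition_rows`, `transform_partition`, and `run_parallel` are hypothetical, and Python threads stand in for processing nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows: list, num_partitions: int) -> list:
    """Distribute the rows round-robin into num_partitions lists."""
    partitions = [[] for _ in range(num_partitions)]
    for index, row in enumerate(rows):
        partitions[index % num_partitions].append(row)
    return partitions


def transform_partition(partition: list) -> list:
    """Example transform: square every value in one partition."""
    return [value * value for value in partition]


def run_parallel(rows: list, num_partitions: int) -> list:
    """Transform each partition in its own worker, then merge and sort."""
    partitions = partition_rows(rows, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = list(pool.map(transform_partition, partitions))
    merged = [value for part in results for value in part]
    return sorted(merged)
```

Because the partition count is a plain parameter, the same job can be rerun with different degrees of parallelism and timed, which is the manual tuning loop one reviewer describes.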

IBM InfoSphere DataStage Benefits

This solution offers many benefits for the companies that utilize it for data integration. Some of these benefits include:

Increased speed of workload execution due to better balancing and a parallel engine.
Reduction of data movement costs through integrations and seamless design of jobs.
Modernization of data integration by extending the capabilities of companies' data.
Delivery of reliable data through IBM Cloud Pak for Data.
Utilization of a drag-and-drop interface which assists in the delivery of data without the need for code.
Effective data manipulation allows data to be merged before being mapped and transformed.
Easier user access to data through visual maps of the process and the delivered data.

Reviews from Real Users

A data/solution architect at a computer software company says the product is robust, easy to use, has a simple error logging mechanism, and works very well for huge volumes of data.

Tirthankar Roy Chowdhury, team leader at Tata Consultancy Services, feels the tool is user-friendly with a lot of functionalities, and doesn't require much coding because of its drag-and-drop features.

IBM

Sample Customers

Qatar Development Bank, GuideWell, Skanderborg Music Festival

Dubai Statistics Center, Etisalat Egypt
