

IBM InfoSphere DataStage and IBM Cloud Pak for Data are IBM's data management solutions. DataStage holds the advantage in scalability and metadata management, while Cloud Pak for Data stands out with its cloud-native architecture.
Features: IBM InfoSphere DataStage offers parallel processing, scalability, and robust metadata management. It is known for its dependable ETL capabilities and seamless integration with IBM systems. Meanwhile, IBM Cloud Pak for Data is praised for its containerization and integration features, including Watson Studio for advanced analytics, which supports AI workflows.
Room for Improvement: IBM InfoSphere DataStage needs better data recovery, clearer documentation, and more user-friendly interfaces. Reviewers also suggest stronger integration with big data technologies. IBM Cloud Pak for Data requires improvements in deployment infrastructure, native features, and cloud migration. Better integration with cloud service providers is also needed.
Ease of Deployment and Customer Service: DataStage supports on-premises and hybrid environments, whereas Cloud Pak for Data offers flexibility with public and hybrid cloud deployments. Both benefit from IBM's 24/7 support, though there are variations in regional support efficiency.
Pricing and ROI: IBM InfoSphere DataStage is priced at the higher end, with licensing that includes maintenance, making it better suited to large enterprises. Cloud Pak for Data follows a subscription-based licensing model and is also costly. Both offer significant ROI through performance improvements and reduced maintenance.
We have been able to drive responsible, transparent, and explainable AI workflows to operationalize AI, mitigate risk, and meet regulatory compliance requirements easily.
It is easy to collect, organize, and analyze data no matter where it is, hence being able to make data-driven decisions.
Cloud Pak is a complicated system, and it's often difficult to find the right resource in IBM to help with specific issues.
The customer support for IBM Cloud Pak for Data is great and responsive.
I would rate IBM's support at about a seven or eight out of ten because we have good support coverage owing to our long association with IBM.
We also have the flexibility to submit a feature request to be included as part of the wishlist, potentially becoming a product feature in subsequent releases.
I rate their support as nine on a scale from one to ten.
IBM tech support has allocated dedicated resources, making it satisfactory.
I have not noticed any downtime or lag, especially when dealing with large data, so it is very scalable.
IBM Cloud Pak for Data's scalability is very good; it can be used by any size of organization.
If the job provided suggestions about running this kind of parallel processing and how many virtual nodes are required, it would help.
Setting up the hybrid and multi-cloud environments is a long job and it takes time.
IBM Cloud Pak for Data can be improved because processing speeds are sometimes slow.
To improve IBM Cloud Pak for Data, I suggest more out-of-the-box integration.
If the job itself gave some guidance, such as running this parallel processing with this many nodes, it would help; I think that is missing.
I wonder if it supports other areas, such as cloud environments with open source support, or EdgeShift.
The solution needs improvement in connectivity with big data technologies such as Spark.
The setup cost is very expensive.
Regarding my experience with pricing, setup cost, and licensing, for a small organization, the price might be relatively high, but for huge enterprises such as ours, the price is relatively affordable.
The list price is high, but the flexibility in pricing is adequate.
Pricing for IBM InfoSphere DataStage is moderate and not very expensive.
From there, I can work my way into a more granular level, applying all of that information on top of my actual data to understand what my data looks like, where it came from, and where it went wrong, managing it throughout the cycle.
The benefits of choosing IBM Cognos, in addition to saving on cost, include having institutional knowledge about maintaining this infrastructure and enough people who have developed on Cognos in the past, which creates comfort in its use.
We have been able to save approximately 80 percent of our time. We are not doing data analysis manually, so this relieves our data department of dealing with data.
It is straightforward from a design and development perspective, and also for deployment.
As we are a financial organization, security is our main concern, so we prefer enterprise tools.
I have leveraged IBM InfoSphere DataStage's integration with IBM's Information Server suite, and it is indeed beneficial.

| Product | Mindshare (%) |
|---|---|
| IBM InfoSphere DataStage | 1.9 |
| IBM Cloud Pak for Data | 1.3 |
| Other | 96.8 |

| Company Size | Count |
|---|---|
| Small Business | 8 |
| Large Enterprise | 15 |

| Company Size | Count |
|---|---|
| Small Business | 23 |
| Midsize Enterprise | 4 |
| Large Enterprise | 26 |

IBM Cloud Pak® for Data is a fully integrated data and AI platform that modernizes how businesses collect, organize, and analyze data to infuse AI throughout their organizations. Cloud-native by design, the platform unifies market-leading services spanning the entire analytics lifecycle. From data management, DataOps, and governance to business analytics and automated AI, IBM Cloud Pak for Data helps eliminate the need for costly, and often competing, point solutions while providing the information architecture you need to implement AI successfully.
Building on the streamlined hybrid-cloud foundation of Red Hat® OpenShift®, IBM Cloud Pak for Data takes advantage of the underlying resource and infrastructure optimization and management. The solution fully supports multicloud environments such as Amazon Web Services (AWS), Azure, Google Cloud, IBM Cloud™ and private cloud deployments. Find out how IBM Cloud Pak for Data can lower your total cost of ownership and accelerate innovation.
IBM InfoSphere DataStage is a high-quality data integration tool for designing, developing, and running jobs that move and transform data, for organizations of different sizes. The product integrates data across multiple systems through a high-performance parallel framework. It supports extended metadata management, enterprise connectivity, and integration of all types of data.
The solution is the data integration component of IBM InfoSphere Information Server, providing a graphical framework for moving data from source systems to target systems. IBM InfoSphere DataStage can deliver data to data warehouses, data marts, operational data sources, and other enterprise applications. The tool supports both extract, transform, and load (ETL) and extract, load, and transform (ELT) patterns. The platform achieves its scalability through parallel processing and enterprise connectivity.
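The difference between the ETL and ELT patterns mentioned above can be sketched in a few lines. This is a minimal illustration only, not DataStage's API or job design: it uses Python's built-in `sqlite3` module as a stand-in target system, and `SOURCE_ROWS`, `run_etl`, `run_elt`, and the table names are all hypothetical names chosen for this example.

```python
import sqlite3

# Hypothetical raw source rows: (customer, amount as untrimmed text).
SOURCE_ROWS = [(" Alice", " 10.50 "), ("bob ", "3"), ("ALICE", "4.25")]

SUMMARY_SQL = "SELECT customer, SUM(amount) FROM {table} GROUP BY customer ORDER BY customer"


def run_etl(conn, rows):
    """ETL: transform in the integration layer, then load the cleaned rows."""
    cleaned = [(name.strip().lower(), float(amount)) for name, amount in rows]
    conn.execute("CREATE TABLE sales_etl (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)
    return conn.execute(SUMMARY_SQL.format(table="sales_etl")).fetchall()


def run_elt(conn, rows):
    """ELT: load the raw rows first, then transform inside the target with SQL."""
    conn.execute("CREATE TABLE sales_raw (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT LOWER(TRIM(customer)) AS customer, "
        "CAST(TRIM(amount) AS REAL) AS amount "
        "FROM sales_raw"
    )
    return conn.execute(SUMMARY_SQL.format(table="sales_elt")).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print("ETL result:", run_etl(conn, SOURCE_ROWS))
    print("ELT result:", run_elt(conn, SOURCE_ROWS))
```

Both functions produce the same per-customer totals. The only difference is where the cleanup runs: before the load (ETL) or inside the target after the load (ELT).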
The solution has various versions, catering to different types of companies, which include the Server Edition, the Enterprise Edition, and the MVS Edition. Depending on which version a company has bought, different goals can be achieved. They include the following:
IBM InfoSphere DataStage can be deployed in various ways, including:
IBM InfoSphere DataStage Features
The tool has various features through which users can integrate and utilize their data effectively. The components of IBM InfoSphere DataStage include:
IBM InfoSphere DataStage Benefits
This solution offers many benefits for the companies that utilize it for data integration. Some of these benefits include:
Reviews from Real Users
A data/solution architect at a computer software company says the product is robust, easy to use, has a simple error logging mechanism, and works very well for huge volumes of data.
Tirthankar Roy Chowdhury, team leader at Tata Consultancy Services, feels the tool is user-friendly with a lot of functionalities, and doesn't require much coding because of its drag-and-drop features.