
IBM InfoSphere DataStage vs erwin Data Catalog comparison

 

Comparison Buyer's Guide

Executive Summary

 

Categories and Ranking

erwin Data Catalog
Average Rating: 7.6
Reviews Sentiment: 5.1
Number of Reviews: 2
Ranking in other categories: Metadata Management (13th)

IBM InfoSphere DataStage
Average Rating: 7.8
Reviews Sentiment: 6.7
Number of Reviews: 43
Ranking in other categories: Data Integration (15th)
 

Mindshare comparison

erwin Data Catalog and IBM InfoSphere DataStage aren’t in the same category and serve different purposes. erwin Data Catalog is designed for Metadata Management and holds a mindshare of 3.2%, up 2.4% compared to last year.
IBM InfoSphere DataStage, on the other hand, focuses on Data Integration and holds a mindshare of 1.5%, down 5.1% compared to last year.
Metadata Management Mindshare Distribution
Product: Mindshare (%)
erwin Data Catalog: 3.2%
Collibra Platform: 17.1%
Informatica Intelligent Data Management Cloud (IDMC): 13.5%
Other: 66.2%
Data Integration Mindshare Distribution
Product: Mindshare (%)
IBM InfoSphere DataStage: 1.5%
SSIS: 3.7%
Informatica Intelligent Data Management Cloud (IDMC): 3.6%
Other: 91.2%
 

Featured Reviews

Andres-Martinez - PeerSpot reviewer
BI Data Analytics Engineer at Targa Research
Helps with metadata management, saves time, and allows us to do impact analysis on any changes
There are always ways to improve things. For example, we can use AI to be able to find out something. When we are typing something, if we don't know the exact term, Artificial Intelligence would be useful to find terms that are phonetically or syntactically similar. Instead of having to type in the exact name, they can provide those in the list. So, they can provide AI support for the search because when you have thousands and thousands of terms, it is hard to remember all the names. There were some issues when drawing the data models. If you have more than 500 or 600 tables, it takes a long time to display those in the right position on the screen. That can also be improved. They need some caching and some parallel pipelines working on the backend in order to divide it into sections.
Prasad Bodduluri - PeerSpot reviewer
Senior Data Warehouse Developer at itcinfotech
Has required complex workarounds for scripts and struggles with unstructured data processing
There is no issue with IBM InfoSphere DataStage's graphical interface for designing data flows, but I will provide feedback that we are gathering the source from the Oracle database mainly, as well as from some spreadsheets. With respect to the Oracle DB Connector, if you write any PL/SQL or SQL with the connectors, there aren't many options, such as executing procedures in the PL/SQL, executing functions, or executing packages. The Oracle connector doesn't have many features and needs improvement. Nowadays many people are writing programs in Python or in PL/SQL with respect to Oracle, so especially in IBM InfoSphere DataStage, there are no features to call programs directly instead of calling them as a script. What I am facing, especially with parallel processing, is that a developer and admin have to sit together. They have to run the job multiple times with different combinations of parallel processing to get the best performance. Instead of that, if the job itself gave some guidance, such as running this parallel processing with this many nodes, it would help; I think that is missing. An additional feature I would want to see in the next release is the ability to work on logs, especially machine logs or artificial logs, to pull semi-structured or unstructured data without having to write extensive code in Python and integrate it. If IBM InfoSphere DataStage provided some feature for this, it would help.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"When you combine it with data lineage, every time you need to make a change, it allows you to do impact analysis on any changes and then connect to the end-users or data stewards so that they can be aware that a change is coming. That's one of the main benefits we use it for."
"The data catalog feature is pretty good."
"Finding logs is very easy on the solution."
"The performance optimization is quite good in DataStage, and it provides parallelism and pipelining mechanisms."
"I have leveraged IBM InfoSphere DataStage's integration with IBM's Information Server suite, and it is indeed beneficial."
"DataStage has also improved its connectors, such as connectivity with Kafka for real-time data integration, cloud connectors, and others like Spark and HVAC SaaS. All these processes are expected to shift to the cloud in the next five years. It's a very robust tool, much like PowerCenter."
"The product is easy to deploy."
"DataStage works better with Linux operating systems when the application services are hosted on Linux system equipment, but it's powerful on Windows too."
"The most valuable feature is the data integration for data warehousing."
"We can view what we want to do. We can transform data and put them on tables."
 

Cons

"There were some issues when drawing the data models. If you have more than 500 or 600 tables, it takes a long time to display those in the right position on the screen."
"There are always ways to improve things. For example, we can use AI to be able to find out something. When we are typing something, if we don't know the exact term, Artificial Intelligence would be useful to find terms that are phonetically or syntactically similar. Instead of having to type in the exact name, they can provide those in the list. So, they can provide AI support for the search because when you have thousands and thousands of terms, it is hard to remember all the names."
"There is room for improvement with respect to the connector and how to connect to the structured and unstructured database."
"What needs improvement in IBM InfoSphere DataStage is its pricing. The pricing for the solution is higher than its competitors, so a lot of the clients my company has worked with prefer other tools over IBM InfoSphere DataStage because of the high price tag. Another area for improvement in the solution stems from a lot of new types of databases, for example, databases in the cloud and big data have become available, and IBM InfoSphere DataStage is working on various connectors for different data sources, but that still isn't up-to-date, meaning that some connectors are missing for modern data sources. The latest version of IBM InfoSphere DataStage also has a complex architecture, so my team faced frequent outages and that should be improved as well."
"For reading the machine log in IBM InfoSphere DataStage, I would rate it only a two because there's not much improvement in this area."
"The solution can be a bit more user-friendly, similar to Informatica."
"Reduced cost would allow more customers to choose the product. It's quite expensive in relation to the cost of other similar solutions."
"I wonder if it supports other areas, such as cloud environments with open source support, or EdgeShift."
"There are three things that could improve - the cloud, monitoring and cloud integration. It's a solid product but not a modern one and of course it depends what you're looking for."
"The response time from support is slow and needs to be improved."
"I think it's a little bit difficult to use because of the interface."
 

Pricing and Cost Advice

"erwin Data Catalog is very expensive."
"I am not very familiar with its pricing. I know it is not cheap, but it is also not super expensive. It depends on the company size. For a company making $1 million, it is very expensive. For a company making $10 million and above, it might be okay."
"Pricing varies based on use, and it is not as costly as some competing enterprise solutions."
"It's very expensive."
"Our internal team takes care of group licensing and cost. We don't have individual licenses. We have group licensing at the company level. Usually, IBM doesn't charge anything separately on the licensing side. For storage and everything else, we are paying around $6,000 per month, which is not very high. It includes Linux data storage, execution, and licensing. They're charging $40 for one-hour execution. Based on that, we are spending around $2,000 on the production environment and $1,000 on the lower environment for testing and development-side executions. For the mainframe, we are using the Db2 mainframe database, and we are spending around $1,000 on the Db2 mainframe database as well. All this comes out to be around $6,000. We, however, would like to have some cost reduction."
"The pricing depends on the setup. However, we paid $100,000 as a one-time cost for an on-premises setup."
"The price is expensive but there are no licensing fees."
"The solution is cheap."
"It's quite expensive."
"The pricing is competitive but on the higher side of the pricing scale."
895,399 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 13%
Construction Company: 10%
Government: 8%
Real Estate/Law Firm: 8%

Financial Services Firm: 24%
Manufacturing Company: 9%
Government: 7%
Computer Software Company: 7%
 

Company Size

By reviewers
No data available

By reviewers
Company Size: Count
Small Business: 23
Midsize Enterprise: 4
Large Enterprise: 26
 

Questions from the Community

Which ETL tool would you recommend to populate data from OLTP to OLAP?
There are two products I know about. TimeXtender: Microsoft-based, the transformation logic is quite good and can easily be extended with T-SQL, and it has a semantic layer that generates metadata for cu...
Would you upgrade to more premium versions of IBM InfoSphere DataStage?
My company currently uses the free version of the product, and we are definitely switching to a paid one. We needed a tool that can help us not only integrate our data but use it effectively. For ...
Is IBM InfoSphere DataStage more difficult to use compared to other tools in the field?
I think the tool may cause some difficulties if you have not used other data integration solutions before. I have worked at companies that used different tools for data integration, and they work ...
Do you rely on IBM Cloud Paks for your data? Have you utilized this product, or do you use IBM InfoSphere DataStage without it?
IBM Cloud Paks makes a big difference in your data integration. My company has been using it alongside IBM InfoSphere DataStage and while the main product is good on its own, this one truly expands...
 

 

Sample Customers

Balfour Beatty Construction, Banco de México, BFSI Canada, CenturyLink, Daktronics
Dubai Statistics Center, Etisalat Egypt
Find out what your peers are saying about Collibra, Informatica, Alation and others in Metadata Management. Updated: April 2026.