Collibra Catalog vs Pentaho Data Integration and Analytics comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Categories and Ranking

Collibra Catalog
Average Rating: 8.0
Reviews Sentiment: 5.3
Number of Reviews: 11
Ranking in other categories: Metadata Management (3rd)

Pentaho Data Integration an...
Average Rating: 8.0
Reviews Sentiment: 6.9
Number of Reviews: 53
Ranking in other categories: Data Integration (18th)
 

Mindshare comparison

Collibra Catalog and Pentaho Data Integration and Analytics aren’t in the same category and serve different purposes. Collibra Catalog is designed for Metadata Management and holds a mindshare of 10.7%, down 11.0% compared to last year.
Pentaho Data Integration and Analytics, on the other hand, focuses on Data Integration, holds 1.7% mindshare, up 1.1% since last year.
Metadata Management Market Share Distribution
Product: Market Share (%)
Collibra Catalog: 10.7%
Informatica Intelligent Data Management Cloud (IDMC): 20.0%
Alation Data Catalog: 14.9%
Other: 54.4%
Data Integration Market Share Distribution
Product: Market Share (%)
Pentaho Data Integration and Analytics: 1.7%
Informatica PowerCenter: 6.0%
SSIS: 5.7%
Other: 86.6%
 

Featured Reviews

Tejbir Singh - PeerSpot reviewer
Facilitates data quality monitoring and AI governance with a complete suite of tools
When I initially started with Collibra, it was just a data cataloging platform with governance workflows around it. Now they have acquired a lot of other tools, or they have merged or acquired different platforms. It is a complete suite of tools for managing data. We can monitor data quality and take actions on the profiling results obtained by running data quality checks. Collibra helps catalog data assets, monitor the health of data assets, and take necessary actions. If we find data quality issues, it also provides a medium to capture those issues and how to remediate them. The workflows allow the creation of custom workflows based on needs. The newest addition in their tool suite is AI governance, which allows cataloging all AI models currently deployed or even in the pre-production stage. It helps document model meanings and the risks involved, thus managing all risks related to AI deployments.
Aqeel UR Rehman - PeerSpot reviewer
Transform data efficiently with rich features but there's challenges with large datasets
Currently, I am using Pentaho Data Integration for transforming data and then loading it into different platforms. Sometimes, I use it in conjunction with AWS, particularly S3 and Redshift, to execute the copy command for data processing. Pentaho Data Integration is easy to use, especially when…

Quotes from Members

 

Pros

"The workflows allow the creation of custom workflows based on needs."
"Collibra Catalog's best feature is the data quality checker."
"Collibra Catalog has significantly enhanced data governance and compliance for our team, primarily through its valuable feature of endpoint lineage enabling visual representation of the data."
"The data lineage capability is valuable as it shows how different sources are connected and how data flows, which is crucial for projects like migrations. Moreover, data lineage visualization in Collibra Catalog aids our data governance initiatives."
"Collibra Catalog is simple to use and user-friendly for those who are not technically inclined since it is easy to find while also easy to see data lineage diagrams."
"The most valuable features of Collibra Catalog are its customizability and ease of use."
"Except for data quality, everything is perfect."
"Collibra Catalog allows us to automate metadata management, significantly saving time, effort, and finances."
"Pentaho Data Integration is easy to use, especially when transforming data."
"We're using the PDI and the repository function, and they give us the ability to easily generate reporting and output, and to access data. We also like the ability to schedule."
"It makes it pretty simple to do some fairly complicated things. Both I and some of our other BI developers have made stabs at using, for example, SQL Server Integration Services, and we found them a little bit frustrating compared to Data Integration. So, its ease of use is right up there."
"Data transformation within Pentaho is a nice feature that they have and that I value."
"I can use Python, which is open-source, and I can run other scripts, including Linux scripts. It's user-friendly for running any object-based language. That's a very important feature because we live in a world of open-source."
"One of the valuable features is the ability to use PL/SQL statements inside the data transformations and jobs."
"We also haven't had to create any custom Java code. Almost everywhere it's SQL, so it's done in the pipeline and the configuration. That means you can offload the work to people who, while they are not less experienced, are less technical when it comes to logic."
"It has improved our data integration capabilities."
 

Cons

"The tool's overall functionalities need to improve since, nowadays, many tools, from a business perspective, are easy to use."
"There is an issue with Collibra Catalog's pricing model, especially for organizations with many databases, as the initial package comes with a limited number of connectors."
"In Collibra Catalog, the main area that has room for improvement is the search functionality. It should be more natural language oriented instead of searching for exact names."
"If it can become more user-intuitive and work on integrating with communication platforms like Slack or Teams, it would significantly help business users."
"More automation and artificial intelligence involvement are necessary. Reducing required employee involvement and enhancing ease of use are vital."
"A key area for improvement in Collibra Catalog lies in its integration capabilities, particularly with a broader range of sources."
"Collibra Catalog could improve its automation to increase the efficiency of the software."
"One of the very key drawbacks is that automation for access provisioning is not available. If I discover a data set or data product in the marketplace and want to access the data, this feature doesn't exist at all."
"As far as I remember, not all connectors worked very well. They can add more connectors and more drivers to the process to integrate with more flows."
"I work with different databases. I would like to work with more connectors to new databases, e.g., DynamoDB and MariaDB, and new cloud solutions, e.g., AWS, Azure, and GCP. If they had these connectors, that would be great. They could improve by building new connectors. If you have native connections to different databases, then you can make instructions more efficient and in a more natural way. You don't have to write any scripts to use that connector."
"In terms of the flexibility to deploy in any environment, such as on-premise or in the cloud, we can do the cloud deployment only through virtual machines. We might also be able to work on different environments through Docker or Kubernetes, but we don't have an Azure app or an AWS app for easy deployment to the cloud. We can only do it through virtual machines, which is a problem, but we can manage it. We also work with Databricks because it works with Spark. We can work with clustered servers, and we can easily do the deployment in the cloud. With a right-click, we can deploy Databricks through the app on AWS or Azure cloud."
"I'm still in the very recent stage concerning Pentaho Data Integration, but it can't really handle what I describe as 'extreme data processing', i.e., when there is a huge amount of data to process. That is one area where Pentaho is still lacking."
"Parallel execution could be better in Pentaho. It's very simple but I don't think it works well."
"If you develop it on MacBook, it'll be quite a hassle."
"I would like to see improvements made for real-time data processing."
"It's not very stable, at least not in the case of the community edition. I'm working with the community edition right now and I think perhaps it is because of that it is not very stable; it causes the system to sometimes hang. I'm not sure if this is the case for paid tiers."
 

Pricing and Cost Advice

"Collibra Catalog is fairly priced - I would rate their pricing seven out of ten."
"Collibra offers a per-user licensing model."
"I think they can bring a few more features and align better with other quality products."
"The product is highly priced compared to other vendors."
"You need to go through the paid version to have Hitachi Lumada specialized support. However, if you are using the free version, then you will have only the community support. You will depend on the releases from Hitachi to solve some problem or questions that you have, such as bug fixes. You will need to wait for the newest versions or releases to solve these types of problems."
"Sometimes we provide the licenses or the customer can procure their own licenses. Previously, we had an enterprise license. Currently, we are on a community license as this is adequate for our needs."
"When we first started with it, it was much cheaper. It has gone up drastically, especially since Hitachi bought out Pentaho."
"I believe the pricing of the solution is more affordable than the competitors."
"There was a cost analysis done and Pentaho did favorably in terms of cost."
"There is a good open source option (Community Edition)."
"For most development tasks, the Enterprise edition should be sufficient. It depends on the type of support that you require for your production environment."
"I mostly used the open-source version. I didn't work with a license."
870,697 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 28%
Manufacturing Company: 9%
Government: 7%
Insurance Company: 6%

Financial Services Firm: 18%
Computer Software Company: 12%
Government: 8%
Manufacturing Company: 6%
 

Company Size

By reviewers
Company Size: Count
Small Business: 3
Large Enterprise: 9

By reviewers
Company Size: Count
Small Business: 17
Midsize Enterprise: 16
Large Enterprise: 25
 

Questions from the Community

What do you like most about Collibra Catalog?
The data lineage capability is valuable as it shows how different sources are connected and how data flows, which is crucial for projects like migrations. Moreover, data lineage visualization in C...
What is your experience regarding pricing and costs for Collibra Catalog?
Pricing is not under my purview as I am an architect. The platform team handles the licensing aspects.
What needs improvement with Collibra Catalog?
I have utilized the sophisticated search capability in Collibra Catalog, and it can be improved by implementing more natural language search capabilities. Currently, we need to enter the asset name...
Which ETL tool would you recommend to populate data from OLTP to OLAP?
Hi Rajneesh, yes here is the feature comparison between the community and enterprise edition : https://www.hitachivantara.com/en-us/pdf/brochure/leverage-open-source-benefits-with-assurance-of-hita...
What do you think can be improved with Hitachi Lumada Data Integrations?
In my opinion, the reporting side of this tool needs serious improvements. In my previous company, we worked with Hitachi Lumada Data Integration and while it does a good job for what it’s worth, ...
What do you use Hitachi Lumada Data Integrations for most frequently?
My company has used this product to transform data from databases, CSV files, and flat files. It really does a good job. We were most satisfied with the results in terms of how many people could us...
 

Also Known As

No data available
Hitachi Lumada Data Integration, Kettle, Pentaho Data Integration
 

Sample Customers

AXA XL, DNB, Adobe, PMI, Holland America Line, UC Davis Health, Cox Automotive
66Controls, Providential Revenue Agency of Río Negro, NOAA Information Systems, Swiss Real Estate Institute
Find out what your peers are saying about Informatica, Alation, Collibra and others in Metadata Management. Updated: September 2025.