
Pentaho Data Integration and Analytics vs Talend Open Studio comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Dec 19, 2024

 

Categories and Ranking

Pentaho Data Integration an...
Ranking in Data Integration: 19th
Average Rating: 8.0
Reviews Sentiment: 6.9
Number of Reviews: 53
Ranking in other categories: None

Talend Open Studio
Ranking in Data Integration: 5th
Average Rating: 8.0
Reviews Sentiment: 6.8
Number of Reviews: 50
Ranking in other categories: None
 

Mindshare comparison

As of July 2025, Pentaho Data Integration and Analytics holds a 1.8% mindshare in the Data Integration category, up from 0.8% a year earlier. Talend Open Studio holds 4.4%, down from 5.1% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Aqeel UR Rehman - PeerSpot reviewer
Transform data efficiently with rich features, but there are challenges with large datasets
Currently, I am using Pentaho Data Integration for transforming data and then loading it into different platforms. Sometimes, I use it in conjunction with AWS, particularly S3 and Redshift, to execute the copy command for data processing. Pentaho Data Integration is easy to use, especially when…
Costin Marzea - PeerSpot reviewer
Allows you to develop your own components and can be used as an OEM
Sometimes, scalability is part of planning. It depends on what you mean by scalability. People talk a lot about it, but scalability is not always about system functionality. Sometimes, it may be planning the job you're doing. If you want to split it into several jobs or servers, you don't actually have to have it built in as a functionality. You can create a job using a loop, which runs and controls several jobs in a loop that may be controlled. Scaling should not always be part of the infrastructure based on whether the engine can scale or not. I think it's your plan or project that should scale and split, and you can define these parameters. These parameters include how many servers you want to run or how many executions you want to do on different parts of the data. It's not always an issue of the engine running. Sometimes, your database should be configured to support partitioning. The product may scale very well without partitioning, but if the basic response is very slow, you didn't solve the problem. You should solve the problems at a higher level, not just at the execution level. They should be solved at the database level and communication level, and you should have firewalls. We are trying to add to the open source the ability to generate code for containers and Kubernetes that exist in the subscription version. Once you do this, Kubernetes will take care of the scaling, so there is no problem.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

Pentaho Data Integration and Analytics
"Provides a good open source option."
"The abstraction is quite good."
"We're using the PDI and the repository function, and they give us the ability to easily generate reporting and output, and to access data. We also like the ability to schedule."
"One of the valuable features is the ability to use PL/SQL statements inside the data transformations and jobs."
"It makes it pretty simple to do some fairly complicated things. Both I and some of our other BI developers have made stabs at using, for example, SQL Server Integration Services, and we found them a little bit frustrating compared to Data Integration. So, its ease of use is right up there."
"The fact that it enables us to leverage metadata to automate data pipeline templates and reuse them is definitely one of the features that we like the best. The metadata injection is helpful because it reduces the need to create and maintain additional ETLs. If we didn't have that feature, we would have lots of duplicated ETLs that we would have to create and maintain. The data pipeline templates have definitely been helpful when looking at productivity and costs."
"We also haven't had to create any custom Java code. Almost everywhere it's SQL, so it's done in the pipeline and the configuration. That means you can offload the work to people who, while they are not less experienced, are less technical when it comes to logic."
"The solution offers features for data integration and migration. Pentaho Data Integration and Analytics allows the integration of multiple data sources into one. The product is user-friendly and intuitive to use for almost any business."

Talend Open Studio
"Talend is significantly easier to use."
"The standout feature for me is the user-friendly nature of the components."
"The rapidity of integration with data may be one of the valuable features."
"The drag-and-drop feature in the interface is very good."
"The solution has a good balance between automated items and the ability for a developer to integrate and extend what he needs. Other competing tools do not offer the same grade of flexibility when you need to go beyond what is provided by the tool. Talend, on the other hand, allows you to expand very easily."
"The most valuable feature for me when it comes to this solution is that it's easy to use."
"This is a user-friendly solution that is easy to use."
"It is user-friendly and the interface is good."
 

Cons

Pentaho Data Integration and Analytics
"If you're working with a larger data set, I'm not so sure it would be the best solution. The larger things got the slower it was."
"Parallel execution could be better in Pentaho. It's very simple but I don't think it works well."
"Should provide additional control for the data warehouse."
"A big problem after deploying something that we do in Lumada is with Git. You get a binary file to do a code review. So, if you need to do a review, you have to take pictures of the screen to show each step. That is the biggest bug if you are using Git."
"The product needs more plugins."
"I would like to see improvements made for real-time data processing."
"I work with different databases. I would like to work with more connectors to new databases, e.g., DynamoDB and MariaDB, and new cloud solutions, e.g., AWS, Azure, and GCP. If they had these connectors, that would be great. They could improve by building new connectors. If you have native connections to different databases, then you can make instructions more efficient and in a more natural way. You don't have to write any scripts to use that connector."
"I would like to see more improvements with AS400 DB2."

Talend Open Studio
"The physical appearance of the product is an area of concern where improvements are required."
"Talend Open Studio's interface does not provide a good experience...I would like to see better documentation and a more enhanced graphical interface in Talend Open Studio."
"When faced with a challenge, such as the necessity to link up with an unconventional data source like the legacy Cyprus Vision database that wasn't inherently supported by Talend, I had to resort to writing Python code to establish the connection."
"It needs better installation configuration for other databases. Although the installation allows you to select another database, this doesn't mean that all connection points in the application point to the database selected. You actually need to do a search through the entire install to locate the configuration settings and change them."
"Talend Open Studio has a lot of capabilities, but there is some restriction. For example, if we want to connect to an SAP system, Open Studio cannot do it. We need to go with an enterprise version. Additionally, the monitoring features could improve."
"As for improvement or additional features, I would like to know how to use Java in Talend and also how to use Talend in the cloud or in big data. I would prefer to have storage directly on Talend."
"The server-side should be completely revamped."
"It is complicated to understand the configuration process for email components."
 

Pricing and Cost Advice

Pentaho Data Integration and Analytics
"Sometimes we provide the licenses or the customer can procure their own licenses. Previously, we had an enterprise license. Currently, we are on a community license as this is adequate for our needs."
"We are using the Community Edition. We have been trying to use and sell the Enterprise version, but that hasn't been possible due to the budget required for it."
"I mostly used the open-source version. I didn't work with a license."
"You need to go through the paid version to have Hitachi Lumada specialized support. However, if you are using the free version, then you will have only the community support. You will depend on the releases from Hitachi to solve some problem or questions that you have, such as bug fixes. You will need to wait for the newest versions or releases to solve these types of problems."
"The pricing has been pretty good. I'm used to using everything open-source or freeware-based. I understand that organizations need to make sure that the solutions are secure, and that's basically where I hit a roadblock in my current organization. They needed to ensure that we had a license and we had a secure way of accessing it so that no outside parties could get access to our data, but in terms of pricing, considering how much other teams are spending on cloud solutions or even their existing solutions, its price point is pretty good. At this time, there are no additional costs. We just have the licensing fees."
"You don't need the Enterprise Edition, you can go with the Community Edition. That way you can use it for free and, for free, it's a pretty good tool to use."
"We did a two or three-year deal the last time we did it. As compared to other solutions, at least so far in our experience, it has been very affordable. The licensing is by component. So, you need to make sure you only license the components that you really intend to use. I am not sure if we have relicensed after the Hitachi acquisition, but previously, multi-year renewals resulted in a good discount. I'm not sure if this is still the case. We've had the full suite for a lot of years, and there is just the initial cost. I am not aware of any additional costs."
"There is a good open source option (Community Edition)."

Talend Open Studio
"Right now, because we're using the open-source version, there's no cost."
"Pricing is always a challenge. It is quite an expensive model, but because the platform is so simple to use, we haven't had to purchase any additional licenses."
"I am using the open-source version of the solution, so there are no extra costs for any feature."
"Pricing and licensing are fairly straightforward. It is reasonably priced and managed."
"Talend Open Studio is priced too high."
"Open Studio has a basic license and additional costs for services, including customer support and technical assistance."
"The paid version of this solution has a very high price, but even with the limitations, the Community version works fine."
"It is an open-source tool which means it is a free solution."
 

Top Industries

By visitors reading reviews
Pentaho Data Integration and Analytics
Financial Services Firm: 18%
Computer Software Company: 12%
Government: 7%
Manufacturing Company: 6%

Talend Open Studio
Financial Services Firm: 17%
Computer Software Company: 12%
Manufacturing Company: 8%
Government: 7%
 

 

Questions from the Community

Which ETL tool would you recommend to populate data from OLTP to OLAP?
Hi Rajneesh, yes, here is the feature comparison between the Community and Enterprise editions: https://www.hitachivantara.com/en-us/pdf/brochure/leverage-open-source-benefits-with-assurance-of-hita...
What do you think can be improved with Hitachi Lumada Data Integrations?
In my opinion, the reporting side of this tool needs serious improvements. In my previous company, we worked with Hitachi Lumada Data Integration and while it does a good job for what it’s worth, ...
What do you use Hitachi Lumada Data Integrations for most frequently?
My company has used this product to transform data from databases, CSV files, and flat files. It really does a good job. We were most satisfied with the results in terms of how many people could us...
How does Talend Open Studio compare with AWS Glue?
We reviewed AWS Glue before choosing Talend Open Studio. AWS Glue is the managed ETL (extract, transform, and load) from Amazon Web Services. AWS Glue enables AWS users to create and manage jobs in...
What do you like most about Talend Open Studio?
It is easy to use and covers most of the functions needed. We can use the code without any extra effort. The open source is very good. They have the same commercials with additional connectors. The...
 

Also Known As

Pentaho Data Integration and Analytics: Hitachi Lumada Data Integration, Kettle, Pentaho Data Integration
Talend Open Studio: Open Studio
 


Sample Customers

Pentaho Data Integration and Analytics: 66Controls, Providential Revenue Agency of Río Negro, NOAA Information Systems, Swiss Real Estate Institute
Talend Open Studio: Almerys, BF&M, Findus
Updated: July 2025.