It is a very good open source ETL tool that's capable of connecting to most databases. It has a lot of functions that make transforming the data very easy. Also, because it is an open source product, it is very easy to build your own solution with it.
Pentaho Consultant at a comms service provider with 10,001+ employees
It is an open source product, so it is very easy to build your own solution against it.
What is most valuable?
How has it helped my organization?
It is also possible to build a new solution quite quickly, so the customer sees results fast.
What needs improvement?
In the community version the scheduling tool is not good, and we had to build it ourselves.
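As a rough illustration of what a home-grown scheduler can look like (this is not the reviewer's actual setup), one common approach is to let cron call Kitchen, the command-line job runner that ships with Pentaho Data Integration. The sketch below only builds the command and the crontab entry. The install directory, job file path, and schedule are hypothetical placeholders.

```python
import shlex
from pathlib import PurePosixPath


def build_kitchen_command(install_dir, job_file, log_level="Basic"):
    """Build the argument list for running one PDI job (.kjb) with Kitchen."""
    kitchen = PurePosixPath(install_dir) / "kitchen.sh"
    return [str(kitchen), f"-file={job_file}", f"-level={log_level}"]


def crontab_line(schedule, command):
    """Format a crontab entry that runs the command on the given cron schedule."""
    return f"{schedule} {shlex.join(command)}"


# Hypothetical paths and schedule: run a nightly load at 02:00.
cmd = build_kitchen_command("/opt/data-integration", "/etl/jobs/nightly_load.kjb")
print(crontab_line("0 2 * * *", cmd))
```

The printed line can be pasted into a crontab. Retries, alerting, and dependencies between jobs would still have to be built on top.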
For how long have I used the solution?
I have worked with different versions of Pentaho since 2009.
Buyer's Guide
Pentaho Data Integration and Analytics
May 2025

Learn what your peers think about Pentaho Data Integration and Analytics. Get advice and tips from experienced pros sharing their opinions. Updated: May 2025.
856,873 professionals have used our research since 2012.
What was my experience with deployment of the solution?
There are a couple of bugs in the newer versions. We were forced to wait until those bugs were fixed before we could upgrade.
What do I think about the stability of the solution?
There were no issues with its stability.
What do I think about the scalability of the solution?
There have been no issues scaling it.
How are customer service and support?
Because we use the community edition, there is no support from the vendor. When I worked with the Enterprise edition last year the technical support was quick and to the point. I was more than happy with their knowledge.
Which solution did I use previously and why did I switch?
In the past I also worked with SAP BI. The main reason we switched to Pentaho was the cost of SAP. Because of the flexibility of Pentaho, I prefer to work with it.
How was the initial setup?
When I started using Pentaho in 2009 the initial setup was quite complex, mainly because of a lack of good documentation at that time. Since then, it has dramatically improved. Also, the community on the web is quite active and there are some good blogs.
What about the implementation team?
I was hired to do the implementation. I think it is necessary to have a good understanding of the product to implement it well, so when that knowledge is not available in-house I would recommend hiring someone who has it.
What other advice do I have?
If you don't have knowledge of the product, I would recommend following some courses to speed up the learning curve. A cheap way to start with Pentaho is to use the Community Edition. You can do almost everything with it, and purchasing the Enterprise Edition is not necessary.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.

Data Developer at a tech services company with 10,001+ employees
It is possible to understand how to develop an ETL solution even when using it for the first time.
What is most valuable?
- Pentaho Kettle has a very intuitive and easy to use graphical user interface (GUI)
- It is possible to understand how to develop an ETL solution even when using it for the first time
- The Community Edition is free and very efficient
- They have versions for Windows, Linux and Mac
- Large selection of options.
How has it helped my organization?
We have developed some complex ETL processes for some clients and they are very satisfied with the results.
What needs improvement?
They could improve the logging generator. Sometimes the error description is so generic that it is not possible to detect the problem.
For how long have I used the solution?
We've used it for three years.
What was my experience with deployment of the solution?
There were no issues with the deployment.
What do I think about the stability of the solution?
There were no issues with the stability.
What do I think about the scalability of the solution?
There have been no issues scaling it.
How are customer service and technical support?
I use the Community Edition without support or customer service. I recommend the Pentaho Community Forums for technical issues.
Which solution did I use previously and why did I switch?
I have used Informatica PowerCenter, which is an excellent solution. However, it's not as easy to use as Pentaho Kettle.
How was the initial setup?
The initial setup is straightforward. All you need to do is download it, unzip the file into a folder, and execute Spoon.bat (for Windows) or spoon.sh (for Linux) to start the graphical user interface (GUI).
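The launch step can be sketched as a small helper that picks the right launcher script for the operating system. This is only an illustration: the install directories below are hypothetical, and the Linux script is assumed to be named spoon.sh in lowercase.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath


def spoon_launcher(install_dir, system):
    """Return the path of the Spoon launcher script for the given OS name."""
    if system == "Windows":
        return str(PureWindowsPath(install_dir) / "Spoon.bat")
    # Linux (and other Unix-like systems) use the shell script.
    return str(PurePosixPath(install_dir) / "spoon.sh")


# Hypothetical install location; platform.system() returns e.g. "Windows" or "Linux".
print(spoon_launcher("/opt/data-integration", platform.system()))
```

Passing the result to `subprocess.run` would then start the GUI from the unzipped folder.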
What about the implementation team?
In-house. The implementation is very simple. Data developers will not have difficulty implementing ETL solutions.
What's my experience with pricing, setup cost, and licensing?
The community edition is free. If you need a full BI solution, I would recommend the enterprise edition.
What other advice do I have?
Pentaho Kettle is an excellent solution for implementing ETL processes.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Consultant at a tech vendor with 501-1,000 employees
It's open source so there's no concern for pricing and licensing, and we've deployed it with minimal hardware.
What is most valuable?
- It has a nice GUI that anyone can learn to use in just a few days with minimal training.
- It has great support for big data technologies; Pentaho 5.3 comes with support for HBase, Pig, Oozie, and Hadoop distributions.
How has it helped my organization?
- It's an open-source tool, so you don't need to worry about licensing costs.
- We've deployed it with very minimal hardware.
- We have migrated one of our key projects from Microsoft BI to Pentaho Data Integration. This saved a lot of money and also brought a considerable improvement in performance.
What needs improvement?
The rule executor step can be improved. It has one limitation: we cannot give it a dynamic file name in this step.
For how long have I used the solution?
I've used it for 4.5 years.
What was my experience with deployment of the solution?
We've had no issues with deployment.
What do I think about the stability of the solution?
We've had no issues with stability.
What do I think about the scalability of the solution?
We've had no issues with scalability.
Which solution did I use previously and why did I switch?
We used MSBI and switched to Pentaho Data Integration, which has saved us a lot of money and time.
How was the initial setup?
For Pentaho, the initial setup is very straightforward.
What about the implementation team?
We did it in-house.
What's my experience with pricing, setup cost, and licensing?
Because it's open source, there's no issue of pricing or licensing.
What other advice do I have?
Use it for any data warehousing and migration projects. I love this tool and we can use it without spending a penny.
I would say this is the best ETL tool on the market, considering that it is open source, easy to use, and has a very nice GUI.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
CEO with 51-200 employees
Easy to use and has a nice GUI. The JSON input needs to perform better.
What is most valuable?
Ease of use, stability, graphical interface, small amount of "voodoo" and cost.
What needs improvement?
Some steps, such as the JSON input, should perform better, but because of the tool's flexibility we at inflow work around this by using scripting steps. Of course it's ideal to use the steps that come with the software, but being able to write your own step is powerful. Also, it would be nice to have the drivers for the data sources shipped with Pentaho Kettle instead of having to look for the right ones on the Internet.
What was my experience with deployment of the solution?
In every project there are issues with the deployment, but we were able to overcome them.
What do I think about the stability of the solution?
I think that Pentaho Kettle is stable software. If it weren't, we wouldn't have used it (because we don't like angry customers).
What do I think about the scalability of the solution?
Actually, Pentaho Kettle comes equipped with the option to scale out, out of the box.
And no, we didn't encounter any specific scalability problems.
How are customer service and technical support?
Customer Service:
We mainly use material that has been written over the years (Pentaho Kettle materials), the forum, Matt Casters' blog, videos, etc. We also try to solve our customers' issues inside the company before contacting customer service. We even developed a full-scale Pentaho Kettle data integration online course and built a website for it.
When we do use customer service, it's very good. There is a large community for the tool, and people gladly help one another.
Very good support.
Which solution did I use previously and why did I switch?
Before Pentaho Kettle we used stored procedures, hand-written code, and Informatica. Informatica is a very good tool, but it is not open source, so it is far more costly than Pentaho Kettle. From my perspective I don't see the difference: we can do almost everything with Pentaho Kettle, and if we need a little extra, we are tech guys and we solve it.
Of course, from the customer's perspective the cheaper the better, so a customer with a smaller budget gets more when using the open source Pentaho Kettle. That holds even with the Pentaho Kettle enterprise edition.
What's my experience with pricing, setup cost, and licensing?
Speaking from the vendor perspective: the data integration part (from data source to the warehouse/target) usually takes at least 60% of the whole initial business intelligence project. It depends on the data sources and complexity, for example big data, NoSQL, XML, web services, "weird" files, and more.
After the data integration project is "live" it will work fine until someone breaks something (network connectivity, servers, a DBA who changes the data source, or any other change to the variables that the data integration was built upon), but this is true for all data integration software.
The day-to-day costs are very low if there are no new requirements. Luckily for us (as a vendor) once the customer starts and the users get their fancy reports and dashboards there's no turning back, and the requirements are piling up. But these are new requirements, not maintenance.
What other advice do I have?
Instead of trying to decide on a specific data integration tool, pick the right vendor partner, not a biased one. They will be able to recommend the set of tools you need according to your requirements and budget.
Business intelligence projects are made up of at least three components:
- 1. Data integration tool
- 2. Data warehouse tool
- 3. Visualization tool
Several of the software vendors have them all, but not the best solution for each component. From my experience it's better to combine solutions. (Unless it is a small project.)
For example: data integration with Pentaho Kettle; for the data warehouse, an in-memory/columnar database if it's big data, or otherwise a traditional database (SQL Server, Oracle, even MySQL for smaller projects); and a BI visualization tool like Yellowfin/Tableau/Sisense/etc.
In the middle you have tens of software vendors that can be suitable for the customer needs.
Of course, if the vendor partner is biased, then suddenly Tableau/Sisense/QlikView/etc. become the best data integration tool, or "you don't need a data integration tool at all", although they don't have the right components. (They are very good tools for visualization but not for "playing" with extracting and transforming complex data.) We work with several vendors such as Sisense and Yellowfin, which are great tools for the specific solution they were made for.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
COO at a tech services company with 11-50 employees
For me, it's the best ETL tool in the world
What is most valuable?
Easy to use, support for all databases (JDBC and ODBC connections), XLS, CSV, and TXT files, SAS, R
How has it helped my organization?
Integrate all data sources in one OLTP or OLAP database
For how long have I used the solution?
4 years
What was my experience with deployment of the solution?
None
What do I think about the stability of the solution?
None
What do I think about the scalability of the solution?
None
How are customer service and technical support?
Customer Service: 5/10
Technical Support: 10/10
Which solution did I use previously and why did I switch?
Talend Studio.
How was the initial setup?
Easy
What was our ROI?
100% (PDI CE)
Which other solutions did I evaluate?
Talend Studio
Disclosure: My company has a business relationship with this vendor other than being a customer: EspriSûr Consultants
