System Engineer at a tech services company with 11-50 employees
Real User
Oct 12, 2022
Enterprise Edition pricing and reduced Community Edition functionality are making us look elsewhere
Pros and Cons
  • "We also haven't had to create any custom Java code. Almost everywhere it's SQL, so it's done in the pipeline and the configuration. That means you can offload the work to people who, while they are not less experienced, are less technical when it comes to logic."
  • "Using the solution we were able to reduce our ETL deployment time by between 10 and 20 percent, and when it comes to personnel costs, we have gained 10 percent."
  • "The support for the Enterprise Edition is okay, but what they have done in the last three or four years is move more and more things to that edition. The result is that they are breaking the Community Edition. That's what our impression is."
  • "Overall, our Hitachi solution was quite good, but over the last couple of years, we have been trying to move away from the product due to a number of things."

What is our primary use case?

We use it for two major purposes. Most of the time it is for ETL of data. And based on the loaded and converted data, we are generating reports out of it. A small part of that, the pivot tables and the like, are also on the web interface, which is the more interactive part. But about 80 percent of our developers' work is on the background processes for running and transforming and changing data.

How has it helped my organization?

Before, a lot of manual work had to be done, work that isn't done anymore. We have also given additional reports to the end-users and, based upon them, they have to take some action. Based on the feedback of the users, some of the data cleaning tasks that were done manually have been automated. It has also given us a fast response to new data that is introduced into the organization.

Using the solution we were able to reduce our ETL deployment time by between 10 and 20 percent. And when it comes to personnel costs, we have gained 10 percent.

What is most valuable?

The graphical user interface is quite okay. That's the most important feature. In addition, the different types of stores and data formats that can be accessed and transferred are an important component.

We also haven't had to create any custom Java code. Almost everywhere it's SQL, so it's done in the pipeline and the configuration. That means you can offload the work to people who, while they are not less experienced, are less technical when it comes to logic. It's more about the business logic and less about the programming logic and that's really important.

Another important feature is that you can deploy it in any environment, whether it's on-premises or cloud, because you can reuse your steps. That's key when it comes to adding to your data processing capacity dynamically: new workflows have to be tested, and being able to run the same steps in a different environment, like your production environment, is really important.

What needs improvement?

I would like to see better support from one version to the next, and all the more so if there are third-party elements that you are using. That's one of the differences between the Community Edition and the Enterprise Edition. 

In addition to better integration with third-party tools, what we have seen is that some of the tools just break from one version to the next and aren't supported anymore in the Community Edition. What is behind that is not really clear to us, but the result is that we can't migrate, or we have to migrate to other parts. That's the most inconvenient part of the tool.

We need to test to see if all our third-party plugins are still available in a new version. That's one of the reasons we decided we would move from the tool to the completely open-source version for the ETL part. That's one of the results of the migration hassle we have had every time.

The support for the Enterprise Edition is okay, but what they have done in the last three or four years is move more and more things to that edition. The result is that they are breaking the Community Edition. That's what our impression is.

The Enterprise Edition is okay, and there is a clear path for it. You will not use a lot of external plugins with it because, with every new version, a lot of the most popular plugins are transferred to the Enterprise Edition. But the Community Edition is almost not supported anymore. You shouldn't start in the Community Edition because, really early on, you will have to move to the Enterprise Edition. Before, you could live with and use the Community Edition for a longer time.

Buyer's Guide
Pentaho Data Integration and Analytics
March 2026
Learn what your peers think about Pentaho Data Integration and Analytics. Get advice and tips from experienced pros sharing their opinions. Updated: March 2026.
886,349 professionals have used our research since 2012.

For how long have I used the solution?

I have been working with Hitachi Lumada Data Integration for seven or eight years.

What do I think about the stability of the solution?

The stability is okay. The transition period, when the product moved over to Hitachi, was two years of hell, but now it's better.

What do I think about the scalability of the solution?

At the scale we are using it, the solution is sufficient. The scalability is good, but we don't have that big of a data set. We have a couple of billion data records involved in the integration. 

We have it in one location across different departments with an outside disaster recovery location. It's on a cluster of VMs and running on Linux. The backend data store is PostgreSQL.

Maybe our design wasn't quite optimal for reloading the billions of records every night, but that's probably not due to the product but to the migration. The migration should have been done in a bit of a different way.

How are customer service and support?

I had contact with their commercial side and with the technical side for the setup and demos, but not after we implemented it. That is due to the fact that the documentation and the external consultant gave us a lot of information about it.

Which solution did I use previously and why did I switch?

We came from the Microsoft environment to Hitachi, but that was 10 years back. We switched due to the licensing costs and because there wasn't really good support for the PostgreSQL database.

Now, I think the Microsoft environment isn't that bad, and there is also better support for open-source databases.

How was the initial setup?

I was involved in the initial migration from Microsoft to Hitachi. It was rather straightforward, not too complex. Granted, it was a new toolset, but that is the same with every new toolset. The learning curve wasn't too steep.

The maintenance effort is not significant. From time to time we have an error that just pops up without our having any idea where it comes from. And then, the next day, it's gone. We get that error something like three times a year. Nobody cares about it or is looking into the details of it. 

The migrations from one version to the next that we did were all rather simple. During that process, users don't have it available for a day, but they can live with that. The migration was done over a weekend and by the following Monday, everything was up and running again.

What about the implementation team?

We had some external help from someone who knows the product and had already had some experience with implementing the tool.

What was our ROI?

In terms of ROI, over the years it was a good step to make the move to Hitachi. Now, I don't think it would be. Now, it would be a different story.

What's my experience with pricing, setup cost, and licensing?

We are using the Community Edition. We have been trying to use and sell the Enterprise version, but that hasn't been possible due to the budget required for it.

Which other solutions did I evaluate?

When we made the choice, it was between Microsoft, Hitachi, and Cognos. The deciding factor in going with Hitachi was its better support for open-source databases and data stores. Also, the functionality of the Community version was what was needed by most of our customers.

What other advice do I have?

Our experience with the query performance of Lumada on large data sets is that Lumada is not what determines performance. Most of the time, the performance comes from the database or the data store underneath Lumada. Depending on how big your data set is, you have to change or optimize your data store and then you can work with large data sets.
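The point about performance living in the data store can be illustrated with a minimal sketch. It is not Lumada or Pentaho code: SQLite stands in for the real data store, and the `events` table, its columns, and both function names are assumptions made up for the illustration. The first function drags every row into the integration process and filters there; the second lets the database filter and aggregate, which is where indexing and other tuning pay off.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for the real data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)


def total_in_tool(conn, customer_id):
    """Pull every row into the ETL process and filter there."""
    total = 0.0
    for cid, amount in conn.execute("SELECT customer_id, amount FROM events"):
        if cid == customer_id:
            total += amount
    return total


def total_in_database(conn, customer_id):
    """Let the data store filter and aggregate; only one row comes back."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0.0) FROM events WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]


# An index on the filter column is the kind of tuning that is done in the
# data store itself, not in the integration tool.
conn.execute("CREATE INDEX idx_events_customer ON events (customer_id)")
```

Both functions return the same total; the difference is how many rows cross the connection, which is why the optimization work ends up on the database side.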

The fine-tuning of the database that is done outside of Lumada is okay because a tool can't provide every insight into every type of data store or dataset. If you are looking into optimization, you have to use your data store optimization tools. Hitachi isn't designed for that, and we were not expecting to have that.

I'm not really that impressed with Hitachi's ability to quickly and effectively solve issues we have brought up, but it's not that bad either. It's halfway, not that good and not that bad.

Overall, our Hitachi solution was quite good, but over the last couple of years, we have been trying to move away from the product due to a number of things. One of them is the price. It's really expensive. And the other is that more and more of what used to be part of the Community Edition functionality is moving to the Enterprise Edition. The latter is okay and its functions are okay, but then we are back to the price. Some of our customers don't have the deeper pockets that Hitachi is aiming for.

Before, it was more likely that I would recommend Hitachi Vantara to a colleague. But now, if you are starting in an environment, you should move to other solutions. If you have the money for the Enterprise Edition, then I would say my likelihood of recommending it, on a scale of one to 10, would be a seven. Otherwise, it would be a one out of 10.

If you are going with Hitachi, go for the Enterprise version or stay away from Hitachi.

It's also really important to think in great detail about your loading process at the start. Make sure that is designed correctly. That's not directly related to the tool itself, but it's more about using the tool and how the loads are transferred.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Anton Abrarov - PeerSpot reviewer
Project Leader at a mining and metals company with 10,001+ employees
Real User
Jun 8, 2022
Speeds up data flow processes and has a user-friendly interface
Pros and Cons
  • "It has a really friendly user interface, which is its main feature. The process of automating or combining SQL code with some databases and doing the automation is great and really convenient."
  • "After using this product, we could do some of the things much faster than before."
  • "As far as I remember, not all connectors worked very well. They can add more connectors and more drivers to the process to integrate with more flows."

What is our primary use case?

The company where I was working previously was using this product. We were using it for ETL process management. It was essentially data flow automation.

In terms of deployment, we were using an on-premise model because we had sensitive data, and there were some restrictions related to information security.

How has it helped my organization?

Our data flow processes became faster with this solution.

What is most valuable?

It has a really friendly user interface, which is its main feature. The process of automating or combining SQL code with some databases and doing the automation is great and really convenient.

What needs improvement?

As far as I remember, not all connectors worked very well. They can add more connectors and more drivers to the process to integrate with more flows.

The last time I saw this product, the onboarding instructions were not clear. If the process of onboarding this product is made more clear, it will take the product to the next level. There is a possibility that the onboarding process has already improved, and I haven't seen it. 

For how long have I used the solution?

I have used this solution for two or three years.

What do I think about the stability of the solution?

I would rate it an eight out of ten in terms of stability.

What do I think about the scalability of the solution?

We didn't have to scale too much. So, I can't evaluate it properly in terms of scalability.

In terms of its users, only our team was using it. There were approximately 20 users. It was not for the whole company.

How are customer service and support?

We didn't use too much customer support. We were using the open-source resources through Google Search. So, we were just using text search. There were some helpful forums where we were able to find the answers to our questions.

Which solution did I use previously and why did I switch?

I didn't use any other solution previously. This was the only one.

How was the initial setup?

I wasn't a part of its deployment. In terms of maintenance, as far as I know, it didn't require much maintenance.

What was our ROI?

We absolutely saw an ROI. It was hard to calculate, but we felt it in terms of the speed of our processes. After using this product, we could do some of the things much faster than before.

What's my experience with pricing, setup cost, and licensing?

I mostly used the open-source version. I didn't work with a license.

Which other solutions did I evaluate?

I did not evaluate other options.

What other advice do I have?

I would recommend using this product for data engineering and Extract, Transform, and Load (ETL) processes.

I would rate it an eight out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
reviewer1872000 - PeerSpot reviewer
Senior Data Analyst at a tech services company with 51-200 employees
Real User
Jun 8, 2022
We're able to query large data sets without affecting performance
Pros and Cons
  • "One of the most valuable features is the ability to create many API integrations. I'm always working with advertising agents and using Facebook and Instagram to do campaigns. We use Pentaho to get the results from these campaigns and to create dashboards to analyze the results."
  • "Before we used Pentaho, our processes were in Microsoft Excel and the updates from databases had to be done manually, but now all our routines are done automatically and we have more time to do other jobs, saving us four or five hours daily."
  • "Parallel execution could be better in Pentaho. It's very simple but I don't think it works well."

What is our primary use case?

I use it for ETL. We receive data from our clients and we join the most important information and do many segmentations to help with communication between our product and our clients.

How has it helped my organization?

Before we used Pentaho, our processes were in Microsoft Excel and the updates from databases had to be done manually. Now all our routines are done automatically and we have more time to do other jobs. It saves us four or five hours daily.

In terms of ETL development time, it depends on the complexity of the job, but if the job is simple it saves two or three hours.

What is most valuable?

One of the most valuable features is the ability to create many API integrations. I'm always working with advertising agents and using Facebook and Instagram to do campaigns. We use Pentaho to get the results from these campaigns and to create dashboards to analyze the results.

I'm working with large data sets. One of the clients I'm working with is a large credit card company and the database from this client is very large. Pentaho allows me to query large data sets without affecting its performance.

I use Pentaho with Jenkins to schedule the jobs. I'm using the jobs and transformations in Pentaho to create many links. 

I always find ways to have minimal code and create the processes with many parameters. I am able to reuse processes that I have created before. 
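As a rough illustration of that parameter-driven reuse, here is a minimal sketch in plain Python rather than Pentaho. The `LoadParams` fields, the `segment` function, and the sample rows are all hypothetical; the point is only that one step definition serves several clients when everything client-specific arrives as a parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadParams:
    """Hypothetical parameters; in practice they would come from the scheduler."""
    client: str
    min_amount: float
    output_prefix: str


def segment(rows, params):
    """One reusable step: keep a client's rows above a threshold and tag them."""
    return [
        {**row, "segment": f"{params.output_prefix}_{row['client']}"}
        for row in rows
        if row["client"] == params.client and row["amount"] >= params.min_amount
    ]


rows = [
    {"client": "acme", "amount": 120.0},
    {"client": "acme", "amount": 15.0},
    {"client": "globex", "amount": 300.0},
]

# The same step serves two clients; only the parameters change.
acme = segment(rows, LoadParams("acme", 100.0, "high"))
globex = segment(rows, LoadParams("globex", 100.0, "high"))
```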

Creating jobs and putting them into production, as well as the visibility that Pentaho gives, are both very simple.

What needs improvement?

Parallel execution could be better in Pentaho. It's very simple but I don't think it works well.

For how long have I used the solution?

I've been working with Pentaho for four or five years.

What do I think about the stability of the solution?

The stability is good. 

What do I think about the scalability of the solution?

It's scalable.

How are customer service and support?

I find help on the forums.

Which solution did I use previously and why did I switch?

I used SQL Server Integration Services, but I have much more experience with Pentaho. I have also worked with Apache NiFi, but it is more focused on single data processes, whereas I'm always working with batch processes and large data sets.

How was the initial setup?

The first deployment was very complex because we didn't have experience with the solution, but the next deployment was simpler.

We create jobs weekly in Pentaho. The development time takes, on average, one week and the deployment takes just one day or so.

We just put it on Git, pull it onto a server, and schedule the execution.

We use it on-premises while the infrastructure is Amazon and Azure.

What other advice do I have?

I always recommend Pentaho for working with automated processes and to do API integrations.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Aqeel UR Rehman - PeerSpot reviewer
BI Analyst at a computer software company with 51-200 employees
Real User
Top 5
May 19, 2022
Simple to use, supports custom transformations, and the open-source version can be used free of charge
Pros and Cons
  • "This solution allows us to create pipelines using a minimal amount of custom coding."
  • "My advice for anybody who is considering this product is if they're looking for any kind of custom transformation, or they're gleaning data from multiple sources and sending it to multiple destinations, I definitely recommend this tool."
  • "I have been facing some difficulties when working with large datasets. It seems that when there is a large amount of data, I experience memory errors."
  • "The technical support does not reply in a timely manner. The support they have in place does not work very well."

What is our primary use case?

I have used this ETL tool for working with data in projects across several different domains. My use cases include tasks such as transforming data that has been taken from an API like PayPal, extracting data from different sources such as Magento or other databases, and transforming all of the information.

Once the transformation is complete, we load the data into data warehouses such as Amazon Redshift.

How has it helped my organization?

There are a lot of different benefits we receive from using this solution. For example, we can easily accept data from an API and create JSON files. The integration is also very good.

I have created many data pipelines and after they are created, they can be reused on different levels.

What is most valuable?

The best feature is that it's simple to use. There are simple data transformation steps available, such as trimming data or performing different types of replacement.
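For readers who have not seen such steps, the following is a minimal sketch of what trimming and replacement amount to. It is plain Python, not the tool's own step implementation, and the function names and the sample record are made up for the illustration.

```python
def trim_fields(row):
    """Strip leading and trailing whitespace from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def replace_in_field(row, field, old, new):
    """Replace a substring in one field, leaving the other fields untouched."""
    return {**row, field: row[field].replace(old, new)}


# Hypothetical input record with stray whitespace and separator characters.
raw = {"name": "  Ada Lovelace ", "phone": "555-01-23", "age": 36}

# Steps compose: trim everything first, then normalize one field.
clean = replace_in_field(trim_fields(raw), "phone", "-", "")
```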

This solution allows us to create pipelines using a minimal amount of custom coding. Anyone in the company can do so, and it's just a simple step. If any coding is required then we can use JavaScript.

What needs improvement?

I have been facing some difficulties when working with large datasets. It seems that when there is a large amount of data, I experience memory errors. If there is a large amount of data then there is definitely a lag.

I would like to see a cloud-based deployment because it will allow us to easily handle a large amount of data.

For how long have I used the solution?

I have been working with Hitachi Lumada Data Integration for almost three years, across two different organizations.

What do I think about the stability of the solution?

There is definitely some lag but with a little improvement, it will be a good fit.

What do I think about the scalability of the solution?

This is a good product for an enterprise-level company.

We use this solution for all of our data integration jobs. It handles the transformation. As our business grows and the demand for data integration increases, our usage of this tool will also increase.

Between versions, they have added a lot of plugins.

How are customer service and support?

The technical support does not reply in a timely manner. I have filled out the support request form, one or two times, asking about different things, but I have not received a reply.

The support they have in place does not work very well. I would rate them one or two out of ten.

How would you rate customer service and support?

Negative

Which solution did I use previously and why did I switch?

In this business, they initially began with this product and did not use another one beforehand. I have also worked on the cloud-level integration tool. 

How was the initial setup?

The initial setup and deployment are straightforward.

I have deployed it on different servers and on average, it takes an hour to complete. I have not read any documentation regarding installation. With my experience, we were able to set everything up.

What's my experience with pricing, setup cost, and licensing?

I primarily work on the Community Version, which is available to use free of charge. I have asked for pricing information but have not yet received a response.

What other advice do I have?

We are currently using version 8.3 but version 9 is available. More features to support big data are available in the newest release.

My advice for anybody who is considering this product is if they're looking for any kind of custom transformation, or they're gleaning data from multiple sources and sending it to multiple destinations, I definitely recommend this tool.

Overall, this is a good product and I recommend it.

I would rate this solution an eight out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Data Architect at a tech services company with 1,001-5,000 employees
Reseller
May 12, 2022
Helped us to fully digitalize a national census process, eliminating door-to-door interviews
Pros and Cons
  • "One of the valuable features is the ability to use PL/SQL statements inside the data transformations and jobs."
  • "As a result of one of the projects that we did in the Middle East, we achieved the main goal of fully digitalizing their population census."
  • "I would like to see support for some additional cloud sources. It doesn't support Azure, for example. I was trying to do a PoC with Azure the other day but it seems they don't support it."
  • "The support from Hitachi is not the greatest; the fixing of bugs can take a really long time."

What is our primary use case?

We use it as an ETL tool. We take data from a source database and move it into a target database. We do some additional processing on our target databases as well, and then load the data into a data warehouse for reports. The end result is a data warehouse and the reports built on top of that.
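The flow described here (source database, target database with extra processing, then a reporting table) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: two in-memory SQLite databases stand in for the real source and target, and the table names `orders`, `staging_orders`, and `dw_sales_by_region` are hypothetical.

```python
import sqlite3

# In-memory SQLite databases stand in for the real source and target systems.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

# Hypothetical source data with untidy region values.
source.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, " north", 10.0), (2, "south ", 5.0), (3, "north", 2.5)],
)

# Extract from the source, apply a light transformation, load into staging.
target.execute("CREATE TABLE staging_orders (id INTEGER, region TEXT, amount REAL)")
extracted = source.execute("SELECT id, region, amount FROM orders").fetchall()
target.executemany(
    "INSERT INTO staging_orders VALUES (?, ?, ?)",
    [(order_id, region.strip(), amount) for order_id, region, amount in extracted],
)

# Additional processing inside the target database builds the reporting table.
target.execute(
    """
    CREATE TABLE dw_sales_by_region AS
    SELECT region, SUM(amount) AS total
    FROM staging_orders
    GROUP BY region
    """
)

report = target.execute(
    "SELECT region, total FROM dw_sales_by_region ORDER BY region"
).fetchall()
```

Reports would then read from the aggregated table rather than from the raw source.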

We are a consulting company and we implement it for clients.

How has it helped my organization?

As a result of one of the projects that we did in the Middle East, we achieved the main goal of fully digitalizing their population census. They did previous censuses doing door-to-door surveys, but for the last census, using Pentaho Data Integration, we managed to get it all running in a fully digital way, with nothing on paper forms. No one had to go door-to-door and survey the people.

What is most valuable?

One of the valuable features is the ability to use PL/SQL statements inside the data transformations and jobs.

What needs improvement?

I would like to see support for some additional cloud sources - Azure, Snowflake.

For how long have I used the solution?

I have been using Hitachi Lumada Data Integration for four years.

What do I think about the stability of the solution?

There have been some bugs and some weird things every now and then, but it is mostly fairly stable.

What do I think about the scalability of the solution?

If you work with relatively small data sets, it's all okay. But if you are going to use really huge data sets, then you might get into a bit of trouble, at least from what I have seen.

How are customer service and support?

The support from Hitachi is not the greatest; the fixing of bugs can take a really long time.

How would you rate customer service and support?

Neutral

How was the initial setup?

The initial setup is very straightforward compared to many other ETL tools. It takes about half a day.

We have about five users, altogether. There are two to three developers, one to two customer people who run the ETLs and one to two admins who take care of the environment itself. It doesn't require much maintenance. Occasionally someone has to restart the server or take a look at logs.

What was our ROI?

Because you can basically get Pentaho Data Integration for free, I would give the cost versus performance a pretty good rating.

Taking the census project that I mentioned earlier as an example, the statistical authority managed to save huge amounts of money by making the census electronic, versus the traditional version. Pentaho Data Integration was not the only contributor, though; it was just one part of the whole solution.

What's my experience with pricing, setup cost, and licensing?

You don't need the Enterprise Edition; you can go with the Community Edition. That way you can use it for free and, for free, it's a pretty good tool to use.

If you pay for licenses, the only thing that you're getting, in addition, is customer support, which is pretty much nonexistent in any case. I would recommend going with the Community Edition.

Which other solutions did I evaluate?

I have had experience with other solutions, but for the last project we did not evaluate other options. Because we had previous experience with Pentaho Data Integration, it was pretty much a no-brainer to use it.

What other advice do I have?

Hitachi Vantara's roadmap is promising. They came up with Lumada and it seems that they do have some ideas on how to make their product a bit more attractive than it currently is.

I'm fairly satisfied with using Pentaho Data Integration. It's more or less okay. When it comes to all the other parts, like Pentaho reports and Pentaho dashboards, things could be better there.

The biggest lesson I've learned from using this solution is that a cheap, open-source tool can sometimes be even more efficient than some of the high-priced enterprise ETL tools. Overall, the solution is okay, considering the low cost. It has all of the main things that you would expect it to have, from a technical perspective.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Michel Philippenko - PeerSpot reviewer
Project Manager at a computer software company with 51-200 employees
Real User
Apr 4, 2022
Forums are helpful, and creating ETL jobs is simpler than in other solutions
Pros and Cons
  • "I would fully recommend Pentaho."
  • "I was not happy with the Pentaho Report Designer because of the way it was set up. There was a zone and, under it, another zone, and under that another one, and under that another one. There were a lot of levels and places inside the report, and it was a little bit complicated. You have to search all these different places using a mouse, clicking everywhere... each report is coded in a binary file... You cannot search with a text search tool..."

What is our primary use case?

I was working with Pentaho for a client. I had to implement complicated data flows and extraction. I had to take data from several sources in a PostgreSQL database by reading many tables in several databases, as well as from Excel files. I created some complex jobs. I also had to implement business reports with the Pentaho Report Designer.

The client I was working for had Pentaho on virtual machines.

What is most valuable?

The ETL feature was the most valuable to me. I like it very much. It was very good.

What needs improvement?

I was not happy with the Pentaho Report Designer because of the way it was set up. There was a zone and, under it, another zone, and under that another one, and under that another one. There were a lot of levels and places inside the report, and it was a little bit complicated. You had to search all these different places using a mouse, clicking everywhere. The interface does not enable you to find things and manage all that. I don't know if other tools are better for end-users when it comes to the graphical interface, but this was a bit complicated. In the end, we were able to do everything with Pentaho.

And when you want to improve the appearance of your report, Pentaho Report Designer has complicated menus. It is not very user-friendly. The result is beautiful, but it takes time.

Also, each report is coded in a binary file, so you cannot read it. Maybe that's what the community or the developers want, but it is inconvenient because when you want to search for information, you need to open the graphical interface and click everywhere. You cannot search with a text search tool because the reports are coded in binary. When you have a lot of reports and you want to find where a precise part of one of your reports is, you cannot do it easily.

The way you specify parameters in Pentaho Report Designer is a little bit complex. There are two interfaces. The job creators use the PDI which provides the ETL interface, and it's okay. Creating the jobs for extract/transform/load is simpler than in other solutions. But there is another interface for the end-users of Pentaho and you have to understand how they relate to each other, so it's a little bit complex. You have to go into XML files, which is not so simple.

Also, using the solution overall is a little bit difficult. You need to be an engineer or somebody with a technical background. It's not exactly easy; it's a technical tool. I didn't immediately understand it and had to search for information and think about it.

For how long have I used the solution?

I used Hitachi Lumada Data Integration, Pentaho, for approximately two years.

What do I think about the stability of the solution?

The stability was perfect.

What do I think about the scalability of the solution?

I didn't scale the solution. I had to migrate from an old Pentaho to a new Pentaho. I had quite a big set of data, but I didn't add new data. I worked with the same volume of data all the time so I didn't test the scaling.

In the company I consulted for, there were about 15 people who input the data and worked with the technical part of Pentaho. There were a lot of end-users, who were the people interested in the reports; on the order of several thousand end-users. 

How are customer service and support?

The technical support was okay. I used the open-source version of Pentaho and I used the forum. I found what I needed. And, the one or two times when I didn't find something, I asked a question in the forum and I received an answer very quickly. I appreciated that a lot. I had an answer one or two hours later. It's very good that somebody from Pentaho Enterprise responds so rapidly.

How was the initial setup?

The initial setup was complex, but I'm an engineer and it's my job to deal with complex systems. It's not the most complex that I have dealt with, but it was still somewhat complex. The procedure was explained on the Pentaho website in the documentation. You had to understand which module does what. It was quite complex.

It took quite a long time because I had to troubleshoot, to understand what was wrong, and I had to do it several times before it worked.

What's my experience with pricing, setup cost, and licensing?

I didn't purchase Pentaho. There is a business version but I used only the open source. I was fully satisfied and very happy with it. It's a very good open-source solution. The communication channels, the updates, the patches, et cetera are all good.

What other advice do I have?

I would fully recommend Pentaho. I have already recommended it to some colleagues. It's a good product with good performance.

Overall, I was very happy with it. It was complicated, but that is part of my job. I was happy with the result and the stability. The Data Integration product is simpler than the Report Designer. I would rate the Data Integration at 10 out of 10 and the Report Designer at nine, because of the graphical interface.

Disclosure: My company has a business relationship with this vendor other than being a customer: System integrator
Dale Bloom - PeerSpot reviewer
Credit Risk Analytics Manager at MarketAxess
Real User
Jan 30, 2022
Integrates easily, significantly reduces our development time, and allows us to put as much code as we want
Pros and Cons
  • "I absolutely love Hitachi. I'm one of the forefront supporters of Hitachi for my firm. It's so easy to integrate within our environments. In terms of being able to quickly build ETL jobs, transform, and then automate them, it's really easy to integrate throughout for data analytics."
  • "When we get a question from our CEO that needs a response and that requires a little bit of legwork of pulling in from various market data, our own in-house repositories, and everything else, it allows me to arrive at the solutions much faster than having to do it through scripting in Python, coding, or anything else."
  • "In the Community edition, it would be nice to have more modules that allow you to code directly within the application. It could have R or Python completely integrated into it, but this could also be because I'm using an older version."

What is our primary use case?

The use case is for data ETL on our various data repositories. We use it to aggregate and transform data for visualization purposes for our upper management.

Currently, I am using the PDI locally on my laptop, but we are undergoing an integration to push this off. We have purchased the Enterprise edition and have licenses, and we are just working with our infrastructure to get that set up on a server. 

We haven't yet launched the Enterprise edition, so I've had very minimal touch with Lumada, but I did have an overview with one of the engineers as to how to use the customer portal in terms of learning documentation. So, the documentation and support are basically the two main areas that I've been using it for. I haven't piped any data or anything through it. I've logged in a couple of times to the customer portal, and I've pretty much been using it as support functionality. I have been submitting requests to understand more about how to get everything to be working for the Enterprise edition. So, I have been using the Lumada customer portal mostly for Pentaho Data Integration.

How has it helped my organization?

When we get a question from our CEO that needs a response and that requires a little bit of legwork of pulling in from various market data, our own in-house repositories, and everything else, it allows me to arrive at the solutions much faster than having to do it through scripting in Python, coding, or anything else. I use multiple tools within my toolkit. I'm pretty heavy on Python, but I find that I can do quite a bit of pre-transformation of the data within the PDI Spoon application itself rather than having to do everything through coding in Python.

It has significantly reduced our ETL development time. I can't really quantify the hours, but it's a no-brainer for me for just pumping in things. If I have a simple question to ascertain, I can pull up and create any type of job or transform to easily get the solution within minutes, as opposed to however many hours of coding it would take. My estimate is that per week, I would be spending about 75% of my time in coding external to the application, whereas, with the application itself, I can do things within a fraction of that. So, it has reduced my time from 75% to about 5%. In terms of the cost of full-time employee coding and everything, the savings would also roughly be the same, which is from 75% to 5% per week. There is also a broader impact on other colleagues within my team. Currently, their processes are fairly manual, such as Excel-based, so the time savings are carried over to them as well.
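To put the estimate above in concrete terms, here is a minimal sketch of the arithmetic. The 40-hour working week is an assumption for illustration only and does not come from the review, and the function name is hypothetical.

```python
def weekly_hours(share_of_week: float, hours_per_week: float = 40.0) -> float:
    """Convert a share of the working week into hours (40 h is an assumed default)."""
    return share_of_week * hours_per_week


# Shares quoted in the review: 75% of the week coding outside the tool,
# versus about 5% doing the same work inside PDI.
before = weekly_hours(0.75)
after = weekly_hours(0.05)

saved_hours = before - after
relative_reduction = (0.75 - 0.05) / 0.75  # fraction of the original effort removed

print(f"before: {before:.0f} h, after: {after:.0f} h, saved: {saved_hours:.0f} h per week")
print(f"relative reduction: {relative_reduction:.0%}")
```

Under that assumption, the drop is 70 percentage points of the week, or roughly 93% of the original coding effort.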

What is most valuable?

I'm at the early stages with Lumada, and I have been using the documentation quite a bit. The support has definitely been critical right now in terms of trying to find out more about the architectural elements that need to go in for pushing the Enterprise edition.

I absolutely love Hitachi. I'm one of the forefront supporters of Hitachi for my firm. It's so easy to integrate within our environments. In terms of being able to quickly build ETL jobs, transform, and then automate them, it's really easy to integrate throughout for data analytics. 

I also appreciate the fact that it's not one of the low-code/no-code solutions. You can put as much JavaScript or another code into it as you want, and that makes it a really powerful tool.

What needs improvement?

I haven't been able to broach all the functionality of the Enterprise edition because it hasn't been integrated into our server. We're still building out the server, app server, and repository to support it.

In the Community edition, it would be nice to have more modules that allow you to code directly within the application. It could have R or Python completely integrated into it, but this could also be because I'm using an older version.

For how long have I used the solution?

I have been using it here for about two months. 

What do I think about the stability of the solution?

I haven't had any problems with stability. Right now, for the implementation of the Enterprise edition, we're trying to make sure that it's highly available in case anything goes down, and we have proper safety nets in place, but personally, I haven't found any issues.

What do I think about the scalability of the solution?

It seems highly scalable. I've used the product in other firms, and we've managed to work pretty coherently pushing our changes for code, revisions, and everything else to Git and things like that.

In terms of users, currently, in my firm, I'm the only user, but the intention is to push it globally for all of our users to be able to use it. 

We would like to be able to support other teams and other departments within the organization. Currently, this is being used only for our credit risk team, but in general, within risk, we have many departments such as operational risk, enterprise risk, market risk, and credit risk. I'm bridging all of them right now. Other teams have also expressed an interest, so it will eventually include our settlements team and potentially even our research team and FP&A.

How are customer service and support?

So far, it's been pretty good. I would rate them an eight out of 10. 

People are fairly responsive initially, saying, "Okay, yes, we have this on our radar. Coming back." Sometimes, it might take a little bit longer for some responses, but it's still very good, and the quality is a 10 out of 10.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

At my current firm, we weren't using anything in this team. I just came in, and I knew I wanted to use this product. I had used it quite heavily at my previous firm, and it was just very easy. Even the folks who did not have prior coding experience or data ETL experience could fairly quickly learn its semantics or the ways to work with it. So, I figured that it would be a great product to push forward.

Other teams in my firm were using low-code or no-code solutions, but I just can't stand their interfaces. They're rather limited, even in terms of viewing what's on the screen and what you have. I appreciate the way you can debug very quickly within PDI.

How was the initial setup?

It was pretty straightforward for me. I had no problem with configuring it. For my personal use of the product, it took an hour of my time to get it onto my machine. For the Enterprise edition, the deployment is still going on, but it's mainly because we don't have many people on our infrastructure team to help. They have multiple ongoing projects. 

The implementation strategy for my personal use case was fairly straightforward. It involved getting the Community edition and configuring it so that I can set up the pipelines for connecting to my data sources and databases and then output to a file share drive for now. All our databases are fairly read-only on our side. In terms of the implementation strategy for the Enterprise edition, we haven't gotten to the stage of completing it, but it'll work somewhat similarly. It's just that the repositories, instead of them being folder repositories, are going to be database-driven, and any code is going to be pushed to the database repository.

What about the implementation team?

We are not using any integrator or consultant for this. For its deployment and maintenance, we're rather limited in terms of the staff. We have one infrastructure person and me. I'm going to be in charge of maintaining it for the time being until I can increase my team.

What was our ROI?

When you can get things done much faster and free up people's time, it's a no-brainer.

When I came into the firm, I was using the Community edition, which is the freeware version. Because the Enterprise edition costs something, it has actually increased our costs, but as a whole, in terms of operational ability and time savings for the rest of my team, the output from PDI and everything else has only increased the value of using this product.

What's my experience with pricing, setup cost, and licensing?

The pricing has been pretty good. I'm used to using everything open-source or freeware-based. I understand that organizations need to make sure that the solutions are secure, and that's basically where I hit a roadblock in my current organization. They needed to ensure that we had a license and we had a secure way of accessing it so that no outside parties could get access to our data, but in terms of pricing, considering how much other teams are spending on cloud solutions or even their existing solutions, its price point is pretty good.

At this time, there are no additional costs. We just have the licensing fees.

What other advice do I have?

If you don't have the comfort level for the architectural build-out, you can opt for the white-glove treatment, at an additional cost of about 50,000, to help with the integration and implementation effort. We chose not to go that route. Therefore, we're using support for any of the fine-tuning questions about making it highly available and other things.

I have not used Lumada for creating pipelines. I'm using PDI to help with our data pipelines. Similarly, I am not using its ability to develop and deploy data pipeline templates at this time, and I also haven't used it for single end-to-end data management from ingestion to insight.

The biggest lesson that I have learned from using this solution is that the order of operations is critical. Other than that, it has been an absolute treat to use.

I've been espousing this product to everybody. I would rate it a 10 out of 10.

Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
Lead, Data and BI Architect at a financial services firm with 201-500 employees
Real User
Jan 13, 2022
We can use the same tool on all our environments. The patching is buggy.
Pros and Cons
  • "Flexible deployment, in any environment, is very important to us. That is the key reason why we ended up with these tools. Because we have a very highly secure environment, we must be able to install it in multiple environments on multiple different servers. The fact that we could use the same tool in all our environments, on-prem and in the cloud, was very important to us."
  • "I love the fact that we haven't come up with a problem yet that we haven't been able to address with this tool."
  • "The testing and quality could really improve. Every time that there is a major release, we are very nervous about what is going to get broken. We have had a lot of experience with that, as even the latest one was broken. Some basic things get broken. That doesn't look good for Hitachi at all. If there is one place I would advise them to spend some money and do some effort, it is with the quality. It is not that hard to start putting in some unit tests so basic things don't get broken when they do a new release. That just looks horrible, especially for an organization like Hitachi."
  • "The stability is not great, especially when you start patching it a lot because things get broken."

What is our primary use case?

We run the payment systems for Canada. We use it as a typical ETL tool to transfer and modify data into a data warehouse. We have many different pipelines that we have built with it.

How has it helped my organization?

I love the fact that we haven't come up with a problem yet that we haven't been able to address with this tool. I really appreciate its maturity and the breadth of its capabilities.

If we did not have this tool, we would probably have to use a whole variety of different tools, and our environment would be a lot more complicated.

We develop metadata pipelines and use them.

Flexible deployment, in any environment, is very important to us. That is the key reason why we ended up with these tools. Because we have a very highly secure environment, we must be able to install it in multiple environments on multiple different servers. The fact that we could use the same tool in all our environments, on-prem and in the cloud, was very important to us. 

What is most valuable?

Because it comes from an open-source background, it has so many different plugins. It is just extremely broad in what it can do. I appreciate that it has a very broad, wide spectrum of things that it can connect to and do. It has been around for a while, so it is mature and has a lot of things built into it. That is the biggest thing. 

The visual nature of its development is a big plus. You don't need to have very strong developers to be able to work with it.

We often have to drop down to JavaScript, but that is fine. I appreciate that it has the capability built-in. When you need to, you can drop down to a scripting language. This is important to us.

What needs improvement?

The documentation is very basic.

The testing and quality could really improve. Every time that there is a major release, we are very nervous about what is going to get broken. We have had a lot of experience with that, as even the latest one was broken. Some basic things get broken. That doesn't look good for Hitachi at all. If there is one place I would advise them to spend some money and do some effort, it is with the quality. It is not that hard to start putting in some unit tests so basic things don't get broken when they do a new release. That just looks horrible, especially for an organization like Hitachi.

For how long have I used the solution?

Overall, I have been using it for about 10 years. At my current organization, I have been using it for about seven years. It was used a little bit at my previous organization as well.

What do I think about the stability of the solution?

The stability is not great, especially when you start patching it a lot because things get broken. That is not a great look. When you start patching, you are expecting things to get fixed, not new things to get broken.

With modern programming, you build a lot of automated testing around your solution, and it is specifically for that. I changed this piece of code. Well, what else got broken? Obviously they don't have a lot of unit tests built into their code. They need to start doing that because it looks horrible when they change one thing, then two other things get broken. Then, they released that as a commercial product, which is horrible. Last time, somehow they broke the ability to connect with databases. That is something incredibly basic. How could you release this product without even testing for that?
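The kind of automated check described here can be very small. The following is a sketch only: it uses Python's built-in `sqlite3` module as a stand-in for a real database driver, the names `can_connect`, `simulated_outage`, and `ConnectivitySmokeTest` are hypothetical, and nothing in it reflects how Hitachi actually tests Pentaho.

```python
import sqlite3
import unittest


def can_connect(connect) -> bool:
    """Return True if a connection opens and answers a trivial query."""
    try:
        conn = connect()
        try:
            return conn.execute("SELECT 1").fetchone() == (1,)
        finally:
            conn.close()
    except sqlite3.Error:
        return False


def simulated_outage():
    """Stand-in for a broken driver or unreachable server (assumption)."""
    raise sqlite3.OperationalError("simulated outage")


class ConnectivitySmokeTest(unittest.TestCase):
    def test_database_is_reachable(self):
        # In-memory SQLite stands in for the real database (assumption).
        self.assertTrue(can_connect(lambda: sqlite3.connect(":memory:")))

    def test_broken_connection_is_reported(self):
        # A failing connection must be reported as False, not crash the check.
        self.assertFalse(can_connect(simulated_outage))
```

Saved to a file, the tests can be run with `python -m unittest`. A check like this, run before every release, would catch a regression as basic as losing the ability to connect to a database.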

What do I think about the scalability of the solution?

We don't have a huge amount of data, so I can't really answer how we could scale up to very large solutions.

How are customer service and support?

Lumada’s ability to quickly and effectively solve issues we have brought up is not great. We have a service for the solution with Hitachi. I don't get the sense that Pentaho, and Hitachi still calls it Pentaho, is a huge center of focus for them. 

You kind of get help, but the people from whom you get help aren't necessarily super strong. It often goes around in circles forever. I eventually have to find my own solution. 

I haven't found that the Hitachi support site has a depth of understanding for the solution. They can answer simple questions, but when it gets more in-depth, they have a lot of trouble answering questions. I don't think the support people have the depth of expertise to really deal with difficult questions.

I would rate them as five out of 10. They are responsive and polite. I don't feel ignored or anything like that, just the depth of knowledge isn't there.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

It has always been here. There was no solution like it until I got to the company.

How was the initial setup?

The initial setup was complex because we had to integrate with SAML. Even though they had some direction on that, it was really a do-it-yourself kind of thing. That was pretty complicated, so if they want to keep this product fresh, I think they have to work on making it integrate more with modern technology, like single sign-on and stuff like that. Every organization has that now and Pentaho doesn't have a good story for that. However, it is the platform that they don't give a lot of love to.

It took us a long time to figure it out, something like two weeks.

What was our ROI?

This has reduced our ETL development time. If it wasn't for this solution, we would be doing custom coding. The reason why we are using the solution is because of its simplicity of development.

What's my experience with pricing, setup cost, and licensing?

These types of solutions are expensive, so we really appreciate what we get for our money, though we don't think of it as a top-of-the-line solution or anything like that.

Which other solutions did I evaluate?

Apache has a project going on called Apache Hop. Because Pentaho was open sourced, people have taken it and forked it. They are really modernizing the solution. As far as I know, Hitachi is not involved yet. I would highly advise them to get involved in that open-source project. It will be the next generation of Pentaho. If they get left behind, they're not going to have anything. It would be a very bad move to just ignore it. Hitachi should not ignore Apache Hop.

What other advice do I have?

I really like the data integration tool. However, it is part of a whole platform of tools, and it is obvious the other tools just don't get a lot of love. We are in it for Pentaho Data Integration (PDI) because that is what we want as our ETL tool. We use their reporting platform and stuff like that, but it is obvious that they just don't get a lot of love or concern.

I haven't looked at the roadmap that much. We are also a Google customer using BigQuery, etc. Hitachi is really just a very niche part of what we do. Therefore, we are not generally looking very seriously at what Hitachi is doing with their products, nor are we a big investor in what Hitachi is doing.

I would recommend this specific Hitachi product to a friend or colleague, depending on their use case and need. If they have a very similar need, I would recommend it. I wouldn't be saying, "Oh, this is the best thing since sliced bread," but I would say, "Hey, if this is what you need, this works well for us."

On a scale of one to 10 for recommending the product, I would rate it as seven out of 10. Overall, I would also rate it as seven out of 10.

We really appreciated the breadth of its capabilities. It is not the top-of-the-line solution, but you really get a lot for what you pay for.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.