We performed a comparison between Hitachi Lumada Data Integration and Informatica PowerCenter based on real PeerSpot user reviews.
Find out in this report how the two Data Integration Tools solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"It is really easy to set up and the interface is easy to use."
"It is a very powerful, modern data analytics solution, in which you can integrate a large volume of data from different sources. It integrates all of the data and you can design, create, and monitor pipelines according to your requirements. It is an all-in-one data ops solution."
"StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
"In StreamSets, everything is in one place."
"StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"I have used Data Collector, Transformer, and Control Hub products from StreamSets. What I really like about these products is that they're very user-friendly. People who are not from a technological or core development background find it easy to get started and build data pipelines and connect to the databases. They would be comfortable like any technical person within a couple of weeks."
"The fact that it enables us to leverage metadata to automate data pipeline templates and reuse them is definitely one of the features that we like the best. The metadata injection is helpful because it reduces the need to create and maintain additional ETLs. If we didn't have that feature, we would have lots of duplicated ETLs that we would have to create and maintain. The data pipeline templates have definitely been helpful when looking at productivity and costs."
"We can schedule job execution in the BA Server, which is the front-end product we're using right now. That scheduling interface is nice."
"I can use Python, which is open-source, and I can run other scripts, including Linux scripts. It's user-friendly for running any object-based language. That's a very important feature because we live in a world of open-source."
"The fact that it's a low-code solution is valuable. It's good for more junior people who may not be as experienced with programming."
"The amount of data that it loads and processes is good."
"One of the most valuable features is the ability to create many API integrations. I'm always working with advertising agents and using Facebook and Instagram to do campaigns. We use Pentaho to get the results from these campaigns and to create dashboards to analyze the results."
"I absolutely love Hitachi. I'm one of the forefront supporters of Hitachi for my firm. It's so easy to integrate within our environments. In terms of being able to quickly build ETL jobs, transform, and then automate them, it's really easy to integrate throughout for data analytics."
"We use Lumada’s ability to develop and deploy data pipeline templates once and reuse them. This is very important. When the entire pipeline is automated, we do not have any issues in respect to deployment of code or with code working in one environment but not working in another environment. We have saved a lot of time and effort from that perspective because it is easy to build ETL pipelines."
"One of the most valuable features for us is the metadata repository because it can easily understand the lineage of first target mapping. My company and I also find Informatica really easy to use—when a consultant joins our company, in just a few days to a few weeks, they can understand how to use it—so we prefer to use this ETL tool."
"The interface is very clean and clear."
"I would recommend that others considering the solution go ahead and use it for any batch and high volume loads with complex transactions."
"The support is valuable. There are also open-source ETL products, which work very well, but there is no support. When we face a production problem, being able to get support is valuable, and it brings efficiency. With an open-source solution, we can't engage anyone to resolve the problem as quickly as possible."
"It's a complete package, which is why we use this solution."
"If the systems get migrated or upgraded, the amount of resources required is minimal. We can change the connections and establish a new connection. It's very helpful."
"The most valuable feature of Informatica PowerCenter is the flow designer functionality. It is the best out of any ETL tool. Additionally, the solution is reliable and trustworthy in dealing with large data sources anytime. When we're using billions of data transactions, it's smooth."
"The technical support for Informatica PowerCenter is good."
"The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
"Sometimes, when we have large amounts of data that is very efficiently stored in Hadoop or Kafka, it is not very efficient to run it through StreamSets, due to the lack of efficiency or the resources that StreamSets is using."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
"Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingestion information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
"If you use JDBC Lookup, for example, it generally takes a long time to process data."
"I was not happy with the Pentaho Report Designer because of the way it was set up. There was a zone and, under it, another zone, and under that another one, and under that another one. There were a lot of levels and places inside the report, and it was a little bit complicated. You have to search all these different places using a mouse, clicking everywhere... each report is coded in a binary file... You cannot search with a text search tool..."
"It could be better integrated with programming languages, like Python and R. Right now, if I want to run a Python code on one of my ETLs, it is a bit difficult to do. It would be great if we have some modules where we could code directly in a Python language. We don't really have a way to run Python code natively."
"Its basic functionality doesn't need a whole lot of change. There could be some improvement in the consistency of the behavior of different transformation steps. The software did start as open-source and a lot of the fundamental, everyday transformation steps that you use when building ETL jobs were developed by different people. It is not a seamless paradigm. A table input step has a different way of thinking than a data merge step."
"If you develop it on MacBook, it'll be quite a hassle."
"Parallel execution could be better in Pentaho. It's very simple but I don't think it works well."
"As far as I remember, not all connectors worked very well. They can add more connectors and more drivers to the process to integrate with more flows."
"I would like to see improvement when it comes to integrating structured data with text data or anything that is unstructured. Sometimes we get all kinds of different files that we need to integrate into the warehouse."
"I would like to see support for some additional cloud sources. It doesn't support Azure, for example. I was trying to do a PoC with Azure the other day but it seems they don't support it."
"The performance of Informatica PowerCenter could improve."
"I would like to see it be able to import data from NoSQL."
"Its scalability can be improved. It is not scalable."
"The pricing could be improved."
"Its licensing can be improved. It should be features-wise and not bundle-wise. A bundle will definitely be costly. In addition, we might use one or two features. That's why the pricing model should be based on the features. The model should be flexible enough based on the features. Their support should also be more responsive to premium customers."
"There is some room for improvement in terms of pricing."
"In the future, I would like to see Informatica PowerCenter integrate a more powerful dashboard."
"Informatica PowerCenter could improve on the documentation for the implementation. The documents provided are not very good for a new user."
StreamSets offers an end-to-end data integration platform to build, run, monitor and manage smart data pipelines that deliver continuous data for DataOps, and power the modern data ecosystem and hybrid integration.
Only StreamSets provides a single design experience for all design patterns for 10x greater developer productivity; smart data pipelines that are resilient to change for 80% fewer breakages; and a single pane of glass for managing and monitoring all pipelines across hybrid and cloud architectures to eliminate blind spots and control gaps.
With StreamSets, you can deliver the continuous data that drives the connected enterprise.
Pentaho Data Integration prepares and blends data to create a complete picture of your business that drives actionable insights. The complete data integration platform delivers accurate, "analytics ready" data to end users from any source. With visual tools to eliminate coding and complexity, Pentaho puts big data and all data sources at the fingertips of business and IT users alike.
An enterprise data integration platform that helps organizations access, transform, and integrate data from a variety of systems.
Hitachi Lumada Data Integration is ranked 5th in Data Integration Tools with 25 reviews while Informatica PowerCenter is ranked 1st in Data Integration Tools with 35 reviews. Hitachi Lumada Data Integration is rated 8.0, while Informatica PowerCenter is rated 7.8. The top reviewer of Hitachi Lumada Data Integration writes "Saves time and makes it easy for our mixed-skilled team to support the product, but more guidance and better error messages are required in the UI". On the other hand, the top reviewer of Informatica PowerCenter writes "A stable, scalable, and mature solution for complex transformations and data integration". Hitachi Lumada Data Integration is most compared with SSIS, Talend Open Studio, Oracle Data Integrator (ODI), Azure Data Factory and Informatica Enterprise Data Catalog, whereas Informatica PowerCenter is most compared with Informatica Cloud Data Integration, Azure Data Factory, SSIS, AWS Glue and Informatica PowerExchange. See our Hitachi Lumada Data Integration vs. Informatica PowerCenter report.
We monitor all Data Integration Tools reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.