We performed a comparison between Hitachi Lumada Data Integration and Talend Open Studio based on real PeerSpot user reviews.
Find out in this report how the two Data Integration Tools solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"In StreamSets, everything is in one place."
"It is really easy to set up and the interface is easy to use."
"StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"I have used Data Collector, Transformer, and Control Hub products from StreamSets. What I really like about these products is that they're very user-friendly. People who are not from a technological or core development background find it easy to get started and build data pipelines and connect to the databases. They would be comfortable like any technical person within a couple of weeks."
"It is a very powerful, modern data analytics solution, in which you can integrate a large volume of data from different sources. It integrates all of the data and you can design, create, and monitor pipelines according to your requirements. It is an all-in-one day data ops solution."
"StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
"It's my understanding that the product can scale."
"One of the most valuable features is the ability to create many API integrations. I'm always working with advertising agents and using Facebook and Instagram to do campaigns. We use Pentaho to get the results from these campaigns and to create dashboards to analyze the results."
"Flexible deployment, in any environment, is very important to us. That is the key reason why we ended up with these tools. Because we have a very highly secure environment, we must be able to install it in multiple environments on multiple different servers. The fact that we could use the same tool in all our environments, on-prem and in the cloud, was very important to us."
"The graphical nature of the development interface is most useful because we've got people with quite mixed skills in the team. We've got some very junior, apprentice-level people, and we've got support analysts who don't have an IT background. It allows us to have quite complicated data flows and embed logic in them. Rather than having to troll through lines and lines of code and try and work out what it's doing, you get a visual representation, which makes it quite easy for people with mixed skills to support and maintain the product. That's one side of it."
"The area where Lumada has helped us is in the commercial area. There are many extractions to compose reports about our sales team performance and production steps. Since we are using Lumada to gather data from each industry in each country. We can get data from Argentina, Chile, Brazil, and Colombia at the same time. We can then concentrate and consolidate it in only one place, like our data warehouse. This improves our production performance and need for information about the industry, production data, and commercial data."
"One of the valuable features is the ability to use PL/SQL statements inside the data transformations and jobs."
"The amount of data that it loads and processes is good."
"It has a really friendly user interface, which is its main feature. The process of automating or combining SQL code with some databases and doing the automation is great and really convenient."
"I can connect with different databases such as Oracle Database or SQL Server. It allows you to extract the data from one database to another. I can structure the data by filtering and mapping the fields. It is very user-friendly. You need to know the basics of SQL development or SQL queries, and you can use this tool."
"We're able to handle large amounts of data with ease."
"Talend Open Studio is easy to create jobs. We use the basic functionality and it is very good."
"Talend lets you do everything — mapping, workflow, and orchestration — in a single place."
"The main differentiator that I have seen between Talend and other data integration tools is the ability to view the data pipeline in the form of a program."
"Talend can connect to multiple data sources, including relational data sources, ERP, CRM, and others."
"Talend is safe to use because it is very restrictive. It is easy to use when one learns how to manipulate data with SQL."
"The solution has a good balance between automated items and the ability for a developer to integrate and extend what he needs. Other competing tools do not offer the same grade of flexibility when you need to go beyond what is provided by the tool. Talend, on the other hand, allows you to expand very easily."
"If you use JDBC Lookup, for example, it generally takes a long time to process data."
"Sometimes, when we have large amounts of data that is very efficiently stored in Hadoop or Kafka, it is not very efficient to run it through StreamSets, due to the lack of efficiency or the resources that StreamSets is using."
"The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
"Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingestion information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
"Although it is a low-code solution with a graphical interface, often the error messages that you get are of the type that a developer would be happy with. You get a big stack of red text and Java errors displayed on the screen, and less technical people can get intimidated by that. It can be a bit intimidating to get a wall of red error messages displayed. Other graphical tools that are focused at the power user level provide a much more user-friendly experience in dealing with your exceptions and guiding the user into where they've made the mistake."
"I'm still in the very recent stage concerning Pentaho Data Integration, but it can't really handle what I describe as "extreme data processing" i.e. when there is a huge amount of data to process. That is one area where Pentaho is still lacking."
"I was not happy with the Pentaho Report Designer because of the way it was set up. There was a zone and, under it, another zone, and under that another one, and under that another one. There were a lot of levels and places inside the report, and it was a little bit complicated. You have to search all these different places using a mouse, clicking everywhere... each report is coded in a binary file... You cannot search with a text search tool..."
"I would like to see improvements made for real-time data processing."
"In terms of the flexibility to deploy in any environment, such as on-premise or in the cloud, we can do the cloud deployment only through virtual machines. We might also be able to work on different environments through Docker or Kubernetes, but we don't have an Azure app or an AWS app for easy deployment to the cloud. We can only do it through virtual machines, which is a problem, but we can manage it. We also work with Databricks because it works with Spark. We can work with clustered servers, and we can easily do the deployment in the cloud. With a right-click, we can deploy Databricks through the app on AWS or Azure cloud."
"If you develop it on MacBook, it'll be quite a hassle."
"If you're working with a larger data set, I'm not so sure it would be the best solution. The larger things got the slower it was."
"Lumada could have more native connectors with other vendors, such as Google BigQuery, Microsoft OneDrive, Jira systems, and Facebook or Instagram. We would like to gather data from modern platforms using Lumada, which is a better approach. As a comparison, if you open Power BI to retrieve data, then you can get data from many vendors with cloud-native connectors, such as Azure, AWS, Google BigQuery, and Athena Redshift. Lumada should have more native connectors to help us and facilitate our job in gathering information from these new modern infrastructures and tools."
"The server-side should be completely revamped."
"The pricing could be lower. They should work to make it more affordable."
"Talent consumes a lot of resources on my PC."
"In the next release, Open Studio should include cloud storage as an input."
"In terms of features, it has all the features that I need. However, it consumes a lot of resources. It is using a lot of RAM, and they need to fix the issue related to resource consumption. It currently requires more than 24 gigabytes of RAM, which is a big amount of RAM."
"The documentation is lacking within the product. They need to get better at all aspects of describing how it works and how to use it."
"In terms of what can be improved, the scheduling is not there in the sister version, while it is there in the cloud one, which is a paid version. If all kinds of scheduling could be available on the Open Studio that we generally use and practice on, that would be great. The scheduling of the data migration is currently not available in the sister version of Talend Open Studio that we are working on. It is available in the advanced version of the Talend. This is the one thing that can be improvised."
"We need more components to be more efficient. We use a lot of components, such as Salesforce, and it's not easy to use. There's are minor bugs and it's not easy to use some of the features."
StreamSets offers an end-to-end data integration platform to build, run, monitor and manage smart data pipelines that deliver continuous data for DataOps, and power the modern data ecosystem and hybrid integration.
Only StreamSets provides a single design experience for all design patterns for 10x greater developer productivity; smart data pipelines that are resilient to change for 80% less breakages; and a single pane of glass for managing and monitoring all pipelines across hybrid and cloud architectures to eliminate blind spots and control gaps.
With StreamSets, you can deliver the continuous data that drives the connected enterprise.
Pentaho Data Integration prepares and blends data to create a complete picture of your business that drives actionable insights. The complete data integration platform delivers accurate, "analytics ready" data to end users from any source. With visual tools to eliminate coding and complexity, Pentaho puts big data and all data sources at the fingertips of business and IT users alike.
Talend Open Studio is a free, open source ETL tool for data integration and Big Data. The solution enables you to extract diverse datasets and normalize and transform them into a consistent format which can be loaded into a number of third-party databases and applications.
Talend Open Studio Features
Talend Open Studio has many valuable key features. Some of the most useful ones include:
Talend Open Studio Benefits
There are several benefits to implementing Talend Open Studio. Some of the biggest advantages the solution offers include:
Reviews from Real Users
Below are some reviews and helpful feedback written by PeerSpot users currently using the Talend Open Studio solution.
Elio B., Data Integration Specialist/CTO at Asset messages, says, "The solution has a good balance between automated items and the ability for a developer to integrate and extend what he needs. Other competing tools do not offer the same grade of flexibility when you need to go beyond what is provided by the tool. Talend, on the other hand, allows you to expand very easily."
A Practice Head, Analytics at a tech services company mentions, “The data integration aspect of the solution is excellent. The product's data preparation features are very good. There's very useful data stewardship within the product. From a technical standpoint, the solution itself is pretty good. There are very good pre-built connectors in Talend, which is good for many clients or businesses, as, in most cases, companies are dealing with multiple data sources from multiple technologies. That is where a tool like Talend is extremely helpful.”
Prerna T., Senior System Executive at a tech services company, comments, “The best thing I have found with Talend Open Studio is their major support for the lookups. With Salesforce, when we want to relate our child objects to their parent object, we need to create them via IDs. Then the upsert operation, which will allow you to relate a child object to the event, will have an external ID. That is the best thing which keeps it very sorted. I like that.”
An Implementation Specialist, Individual Contributor at a computer software company, states, “I can connect with different databases such as Oracle Database or SQL Server. It allows you to extract the data from one database to another. I can structure the data by filtering and mapping the fields.” He also adds, “It is very user-friendly. You need to know the basics of SQL development or SQL queries, and you can use this tool.”
PeerSpot user Badrakh V., Information System Architect at Astvision, explains, "The most valuable features are the ETL tools."
Hitachi Lumada Data Integration is ranked 6th in Data Integration Tools with 25 reviews while Talend Open Studio is ranked 5th in Data Integration Tools with 15 reviews. Hitachi Lumada Data Integration is rated 8.0, while Talend Open Studio is rated 7.6. The top reviewer of Hitachi Lumada Data Integration writes "Saves time and makes it easy for our mixed-skilled team to support the product, but more guidance and better error messages are required in the UI". On the other hand, the top reviewer of Talend Open Studio writes "Popular open-source ETL tool in Singapore/ South East Asia; Connects to Big Data; Easy to spot errors with generated Java code". Hitachi Lumada Data Integration is most compared with SSIS, Oracle Data Integrator (ODI), Informatica PowerCenter, Azure Data Factory and Informatica Enterprise Data Catalog, whereas Talend Open Studio is most compared with SSIS, AWS Glue, Azure Data Factory, Talend Data Fabric and FME.
We monitor all Data Integration Tools reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.