We performed a comparison between Pentaho Data Integration and Analytics, SSIS, and Talend Data Management Platform based on real PeerSpot user reviews.
"I can use Python, which is open-source, and I can run other scripts, including Linux scripts. It's user-friendly for running any object-based language. That's a very important feature because we live in a world of open-source."
"The area where Lumada has helped us is in the commercial area. There are many extractions to compose reports about our sales team performance and production steps. Since we are using Lumada to gather data from each industry in each country. We can get data from Argentina, Chile, Brazil, and Colombia at the same time. We can then concentrate and consolidate it in only one place, like our data warehouse. This improves our production performance and need for information about the industry, production data, and commercial data."
"It's my understanding that the product can scale."
"It has improved our data integration capabilities."
"The abstraction is quite good."
"The amount of data that it loads and processes is good."
"One of the most valuable features is the ability to create many API integrations. I'm always working with advertising agents and using Facebook and Instagram to do campaigns. We use Pentaho to get the results from these campaigns and to create dashboards to analyze the results."
"We're using the PDI and the repository function, and they give us the ability to easily generate reporting and output, and to access data. We also like the ability to schedule."
"It has the ability to be deployed into the cloud through Data Factory, and run completely as a software as a service in the cloud."
"It is also easy to learn and user-friendly. Microsoft is also good in terms of technical support. They have built a large community all over the world."
"It's something I needed for bulk imports. I'm not a big fan of it, but I haven't seen anything better."
"The debugging capabilities are great, particularly during data flow execution. You can look into the data and see what's going on in the pipeline."
"There are many good features in this solution including the data fields, database integration, support for SQL views, and the lookups for matching information."
"SSIS is easy to use."
"It is easy to set up the solution."
"I have used most of the standard SQL features, but the ones that stand out are the Data Flows and Bulk Import."
"I like the way that you can use the context variables, and how you can work those context variables to give you values and settings for every development environment, such as PROD, TEST, and DEV."
"The most valuable feature is the data loading and scripting language"
"The basic tools are easy to pick up and understand."
"The solution can run on any machine and that is a big advantage."
"They're very competitive in terms of performance, which is a good selling point. It has very rich features. It provides a very rich feature set in the application."
"I like everything about this product, but the biggest thing is the ease of use."
"The most valuable feature is integration."
"The most valuable features of the Talend Data Management Platform are the components."
"I work with different databases. I would like to work with more connectors to new databases, e.g., DynamoDB and MariaDB, and new cloud solutions, e.g., AWS, Azure, and GCP. If they had these connectors, that would be great. They could improve by building new connectors. If you have native connections to different databases, then you can make instructions more efficient and in a more natural way. You don't have to write any scripts to use that connector."
"I could not connect to our Hadoop environment in an easy and flexible way, and it was important to scale our data warehouse."
"One thing that I don't like, just a little, is the backward compatibility."
"The performance could be improved. If they could have analytics perform well on large volumes, that would be a big deal for our products."
"Lumada could have more native connectors with other vendors, such as Google BigQuery, Microsoft OneDrive, Jira systems, and Facebook or Instagram. We would like to gather data from modern platforms using Lumada, which is a better approach. As a comparison, if you open Power BI to retrieve data, then you can get data from many vendors with cloud-native connectors, such as Azure, AWS, Google BigQuery, and Athena Redshift. Lumada should have more native connectors to help us and facilitate our job in gathering information from these new modern infrastructures and tools."
"A big problem after deploying something that we do in Lumada is with Git. You get a binary file to do a code review. So, if you need to do a review, you have to take pictures of the screen to show each step. That is the biggest bug if you are using Git."
"I have been facing some difficulties when working with large datasets. It seems that when there is a large amount of data, I experience memory errors."
"Although it is a low-code solution with a graphical interface, often the error messages that you get are of the type that a developer would be happy with. You get a big stack of red text and Java errors displayed on the screen, and less technical people can get intimidated by that. It can be a bit intimidating to get a wall of red error messages displayed. Other graphical tools that are focused at the power user level provide a much more user-friendly experience in dealing with your exceptions and guiding the user into where they've made the mistake."
"Improving the login procedure would make our reporting easier on monitoring our ETL processes."
"The debugging could be improved because when it came to solving the errors that I've experienced in the past, I've had to look at the documentation for more information."
"We In upgrading SSIS, we encountered challenges fixing SQL Server and performance issues, including problems during a failover in our data warehouse."
"There are a lot of things that Microsoft could improve in relation to SSIS. One major problem we faced was when attempting to move some Excel files to our SQL Server. The Excel provider has a limitation that prevents importing more than 255 columns from a particular Excel file to the database. This restriction posed a significant issue for us."
"Microsoft should offer an on-premises support warranty for those using that deployment. They seem to be withdrawing from on-premises options."
"We're in the process of switching to Informatica, and we need to work out data lineage and data profiling and to improve the quality of our data. SSIS, however, is not that compatible with Informatica. We managed to connect it to Informatica Metadata Manager, but we don't get good lineage, so we have to redo all our ETLs using the Informatica process in order to accept the proper data lineage."
"When I compare Talend and SSIS, Talend provides more features. With Talend, we can handle a large volume of data. Talend is usually used to treat a large volume of data, which makes it better than SSIS on the data side. Talend also has a very good Talend Management Console to schedule the jobs and do other things. It can also be easily connected to version control tools such as GitHub or SVN. The last time I used SSIS, it was connected through TSS for the Windows Console version. I am not sure it has been improved or not. If it is not improved, Microsoft should improve it. They should change the product to provide another console."
"SSIS should be made a little bit more intuitive and user-friendly because it needs an expert-level person to work on it."
"Including either XML or JSON in the next release would definitely be a good transformation. I'm not sure if Talend has that feature, but it's one of those requirements that we are working around and have to do some parsing of XML so this could make it easier."
"The product must enhance the data quality."
"We'd like to see more connectors it the future."
"The solution's memory sometimes bottlenecks and that can be challenging."
"Once you get past the basic tools, it gets pretty complicated."
"I'd be interested in seeing the running of Python programs and transformations from within the studio itself."
"I think they should drive toward AI and machine learning. They could include a machine-learning algorithm for the deduplication."
"Performance and speed could be improved."
There are two products I know about:
* TimeXtender: Microsoft-based. The transformation logic is quite good and can easily be extended with T-SQL. It has a semantic layer that generates metadata for cubes. Price is approximately $40K. Works with tables.
* Attunity (bought by Qlik): technology-agnostic, with a nice web interface, but expensive (> 100K€). Works with transaction logs.
There are many other pure ETL tools:
* ERWIN has a nice one.
Depends upon the technologies being used. If you're using Oracle for both OLTP and OLAP then you'll get a lot of value from an Oracle solution.
The other question is how up to date you want your OLAP DB to be. GoldenGate is a good answer if you're looking to minimize latency, but it can be expensive. ODI is less expensive but better suited to bulkier data sets. If an Oracle product wasn't an option, I'd probably consider something like Informatica.
Hi Rajneesh,
Yes, here is the feature comparison between the community and enterprise editions: www.hitachivantara.com
And a short description of the community edition: www.predictiveanalyticstoday.com
And the download link: community.hitachivantara.com
You can ask more from the great community: forums.pentaho.com
Regards
Károly
We usually use Talend.
Look here: community.talend.com
As someone mentioned, if you're a purely Oracle shop and staying that way, then there's value in prioritizing Oracle tools. However, let me contrast that with this caveat...
Consider expectations for tool and vendor longevity. Oracle has a long history of retiring and/or replacing tools, leaving customers in the cold with prior versions/tools (I've been burned multiple times by Oracle product retirements or replacements, including OWB, Oracle Designer2k, Oracle Express, Oracle OEDW, and their purchase of Sagent ETL, which was later abandoned).
But I would also consider these questions and relative prioritization:
What are your organization's plans for moving to other database technologies?
Where is your org going with on-prem versus cloud solutions? How important are PaaS versus IaaS solutions?
Where is your current staff's expertise?
Prioritize mature over immature tools.
How many sources do you have? What are their technologies and does the integration tool support them?
Is it just moving data from a single ERP such as Oracle EBS to OLAP? When you say OLAP, what do you mean by that? Are you talking about the Oracle OLAP product or something else? That makes a really big difference, of course: if your ETL tool doesn't support your source(s) and target(s), then it shouldn't be considered.
Given the industry's trajectory, I myself would highly prioritize PaaS solutions over others.
What is the OLAP that you are using? Hosted in Cloud or on-premise?
The target DB should have its tool to extract data.
Pentaho is a really nice tool if open source is the only option.
Please think about issues such as upgrade and disaster in the future. These operations are very easy in Pentaho.
I can only suggest one thing for replication, and that is Qlik (ex-Attunity).
Hi Karoly, thanks for your input. The community at forums.pentaho.com is not allowing new registrations. I guess they accept queries only from customers and not from anyone else. Do you know of any other forums, communities, or SME contacts who can help with queries?