Pentaho Data Integration and Analytics vs SSIS vs StreamSets comparison

Cancel
You must select at least 2 products to compare!
Hitachi Vantara Logo
3,247 views|1,075 comparisons
94% willing to recommend
Microsoft Logo
Read 69 SSIS reviews
19,105 views|15,500 comparisons
79% willing to recommend
StreamSets Logo
4,200 views|2,349 comparisons
100% willing to recommend
Comparison Buyer's Guide
Executive Summary

We performed a comparison between Pentaho Data Integration and Analytics, SSIS, and StreamSets based on real PeerSpot user reviews.

Find out what your peers are saying about Microsoft, Informatica, Oracle and others in Data Integration.
To learn more, read our detailed Data Integration Report (Updated: April 2024).
769,334 professionals have used our research since 2012.
Featured Review
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
"We also haven't had to create any custom Java code. Almost everywhere it's SQL, so it's done in the pipeline and the configuration. That means you can offload the work to people who, while they are not less experienced, are less technical when it comes to logic.""The fact that it's a low-code solution is valuable. It's good for more junior people who may not be as experienced with programming.""It has a really friendly user interface, which is its main feature. The process of automating or combining SQL code with some databases and doing the automation is great and really convenient.""I can create faster instructions than writing with SQL or code. Also, I am able to do some background control of the data process with this tool. Therefore, I use it as an ELT tool. I have a station area where I can work with all the information that I have in my production databases, then I can work with the data that I created.""I can use Python, which is open-source, and I can run other scripts, including Linux scripts. It's user-friendly for running any object-based language. That's a very important feature because we live in a world of open-source.""It has improved our data integration capabilities​.""We use Lumada’s ability to develop and deploy data pipeline templates once and reuse them. This is very important. When the entire pipeline is automated, we do not have any issues in respect to deployment of code or with code working in one environment but not working in another environment. We have saved a lot of time and effort from that perspective because it is easy to build ETL pipelines.""Provides a good open source option."

More Pentaho Data Integration and Analytics Pros →

"This solution is easy to implement, has a wide variety of connectors, has support for Visual Basic, and supports the C language.""SSIS' best feature is SFTP connectivity.""It is easily scheduled and integrates well with SQL Server and SQL Server Agent jobs.""The product's deployment phase is easy.""The performance is good.""The solution is easy to use and developer friendly.""SSIS integrates well with SQL servers and Microsoft products.""There are many good features in this solution including the data fields, database integration, support for SQL views, and the lookups for matching information."

More SSIS Pros →

"Also, the intuitive canvas for designing all the streams in the pipeline, along with the simplicity of the entire product are very big pluses for me. The software is very simple and straightforward. That is something that is needed right now.""What I love the most is that StreamSets is very light. It's a containerized application. It's easy to use with Docker. If you are a large organization, it's very easy to use Kubernetes.""The UI is user-friendly, it doesn't require any technical know-how and we can navigate to social media or use it more easily.""StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes.""The ability to have a good bifurcation rate and fewer mistakes is valuable.""The most valuable would be the GUI platform that I saw. I first saw it at a special session that StreamSets provided towards the end of the summer. I saw the way you set it up and how you have different processes going on with your data. The design experience seemed to be pretty straightforward to me in terms of how you drag and drop these nodes and connect them with arrows.""The Ease of configuration for pipes is amazing. It has a lot of connectors. Mainly, we can do everything with the data in the pipe. I really like the graphical interface too""For me, the most valuable features in StreamSets have to be the Data Collector and Control Hub, but especially the Data Collector. That feature is very elegant and seamlessly works with numerous source systems."

More StreamSets Pros →

Cons
"Although it is a low-code solution with a graphical interface, often the error messages that you get are of the type that a developer would be happy with. You get a big stack of red text and Java errors displayed on the screen, and less technical people can get intimidated by that. It can be a bit intimidating to get a wall of red error messages displayed. Other graphical tools that are focused at the power user level provide a much more user-friendly experience in dealing with your exceptions and guiding the user into where they've made the mistake.""As far as I remember, not all connectors worked very well. They can add more connectors and more drivers to the process to integrate with more flows.""​I could not connect to our Hadoop environment in an easy and flexible way, and it was important to scale our data warehouse​.""Its basic functionality doesn't need a whole lot of change. There could be some improvement in the consistency of the behavior of different transformation steps. The software did start as open-source and a lot of the fundamental, everyday transformation steps that you use when building ETL jobs were developed by different people. It is not a seamless paradigm. A table input step has a different way of thinking than a data merge step.""The product needs more plugins.""In terms of the flexibility to deploy in any environment, such as on-premise or in the cloud, we can do the cloud deployment only through virtual machines. We might also be able to work on different environments through Docker or Kubernetes, but we don't have an Azure app or an AWS app for easy deployment to the cloud. We can only do it through virtual machines, which is a problem, but we can manage it. We also work with Databricks because it works with Spark. We can work with clustered servers, and we can easily do the deployment in the cloud. With a right-click, we can deploy Databricks through the app on AWS or Azure cloud.""The support for the Enterprise Edition is okay, but what they have done in the last three or four years is move more and more things to that edition. The result is that they are breaking the Community Edition. That's what our impression is.""One thing that I don't like, just a little, is the backward compatibility."

More Pentaho Data Integration and Analytics Cons →

"Involving a data lake or data engineering aspects would be useful. While it is there, we need more features included.""I would like to see more features in terms of the integration with Azure Data Factory.""The solution could improve by having quicker release updates.""SSIS can improve in handling different data sources like Salesforce connectivity, Oracle Cloud's connectivity, etc.""Video training would be a helpful addition.""The creation of the measure in the DAC's model could be improved.""The solution could improve on integrating with other types of data sources.""SSIS doesn't have a very good user interface, but if you can work with it, it'll provide you with almost all of the functionality."

More SSIS Cons →

"One area for improvement could be the cloud storage server speed, as we have faced some latency issues here and there.""In terms of the product, I don't think there is any room for improvement because it is very good. One small area of improvement that is very much needed is on the knowledge base side. Sometimes, it is not very clear how to set up a certain process or a certain node for a person who's using the platform for the first time.""We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which was painful. Also, pipeline failures were common, and data drifting wasn't addressed, which made things worse. Licensing was another issue we encountered.""Sometimes, when we have large amounts of data that is very efficiently stored in Hadoop or Kafka, it is not very efficient to run it through StreamSets, due to the lack of efficiency or the resources that StreamSets is using.""The monitoring visualization is not that user-friendly. It should include other features to visualize things, like how many records were streamed from a source to a destination on a particular date.""The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time.""StreamSet works great for batch processing but we are looking for something that is more real-time. We need latency in numbers below milliseconds.""I would like to see it integrate with other kinds of platforms, other than Java. We're going to have a lot of applications using .NET and other languages or frameworks. StreamSets is very helpful for the old Java platform but it's hard to integrate with the other platforms and frameworks."

More StreamSets Cons →

Pricing and Cost Advice
  • "There is a good open source option (Community Edition)​."
  • "The price of the regular version is not reasonable and it should be lower."
  • "Sometimes we provide the licenses or the customer can procure their own licenses. Previously, we had an enterprise license. Currently, we are on a community license as this is adequate for our needs."
  • "It does seem a bit expensive compared to the serverless product offering. Tools, such as Server Integration Services, are "almost" free with a database engine. It is comparable to products like Alteryx, which is also very expensive."
  • "I think Lumada's price is fair compared to some of the others, like BusinessObjects, which is was the other thing that I used at my previous job. BusinessObject's price was more reasonable before SAP acquired it. They jacked the price up significantly. Oracle's OBIEE tool was also prohibitively expensive."
  • "When we first started with it, it was much cheaper. It has gone up drastically, especially since Hitachi bought out Pentaho."
  • "The cost of these types of solutions are expensive. So, we really appreciate what we get for our money. Though, we don't think of the solution as a top-of-the-line solution or anything like that."
  • "The pricing has been pretty good. I'm used to using everything open-source or freeware-based. I understand that organizations need to make sure that the solutions are secure, and that's basically where I hit a roadblock in my current organization. They needed to ensure that we had a license and we had a secure way of accessing it so that no outside parties could get access to our data, but in terms of pricing, considering how much other teams are spending on cloud solutions or even their existing solutions, its price point is pretty good. At this time, there are no additional costs. We just have the licensing fees."
  • More Pentaho Data Integration and Analytics Pricing and Cost Advice →

  • "This solution has provided an inexpensive tool, and it is easy to find experienced developers."
  • "My advice is to look at what your configuration will be because most companies have their own deals with Microsoft."
  • "This solution is included with the MSSQL server package."
  • "It would be beneficial if the solution had a less costly cloud offering."
  • "Based on my experience and understanding, Talend comes out to be a little bit expensive as compared to SSIS. The average cost of having Talend with Talend Management Console is around 72K per region, which is much higher than SSIS. SSIS works very well with Microsoft technologies, and if you have Microsoft technologies, it is not really expensive to have SSIS. If you have SQL Server, SSIS is free."
  • "We have an enterprise license for this solution."
  • "It comes bundled with other solutions, which makes it difficult to get the price on the specific product."
  • "All of my clients have this product included as part of their Microsoft license."
  • More SSIS Pricing and Cost Advice →

  • "We are running the community version right now, which can be used free of charge."
  • "StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub."
  • "It has a CPU core-based licensing, which works for us and is quite good."
  • "There are different versions of the product. One is the corporate license version, and the other one is the open-source or free version. I have been using the corporate license version, but they have recently launched a new open-source version so that anybody can create an account and use it. The licensing cost varies from customer to customer. I don't have a lot of input on that. It is taken care of by PMO, and they seem fine with its pricing model. It is being used enterprise-wide. They seem to have got a good deal for StreamSets."
  • "The pricing is good, but not the best. They have some customized plans you can opt for."
  • "We use the free version. It's great for a public, free release. Our stance is that the paid support model is too expensive to get into. They should honestly reevaluate that."
  • "The overall cost for small and mid-size organizations needs to be better."
  • "There are two editions, Professional and Enterprise, and there is a free trial. We're using the Professional edition and it is competitively priced."
  • More StreamSets Pricing and Cost Advice →

    report
    Use our free recommendation engine to learn which Data Integration solutions are best for your needs.
    769,334 professionals have used our research since 2012.
    Comparison Review
    Anonymous User
    Technology has made it easier for businesses to organize and manipulate data to get a clearer picture of what’s going on with their business. Notably, ETL tools have made managing huge amounts of data significantly easier and faster, boosting many organizations’ business intelligence operations There are many third-party vendors offering ETL solutions, but two of the most popular are PowerCenter Informatica and Microsoft SSIS (SQL Server Integration Services). Each technology has its advantages but there are also similarities on how they carry out the extract-transform-load processes and only differ in terminologies. If you’re in the process of choosing ETL tools and PowerCenter Informatica and Microsoft SSIS made it to your shortlist, here is a short comparative discussion detailing the differences between the two, as well as their benefits. Package Configuration Most enterprise data integration projects would require the capacity to develop a solution in one platform and test and deploy it in a separate environment without having to manually change the established workflow. In order to achieve this seamless movement between two environments, your ETL technology should allow the dynamic update of the project’s properties using the content or a parameter file or configuration. Both Informatica and SSIS support this functionality using different methodologies. In Informatica, every session can have more than one source and one or more destination connections. There are… Read more →
    Questions from the Community
    Top Answer:Hi Rajneesh yes here is the feature comparison between the community and enterprise edition :… more »
    Top Answer: In my opinion, the reporting side of this tool needs serious improvements. In my previous company, we worked with… more »
    Top Answer:My company has used this product to transform data from databases, CSV files, and flat files. It really does a good job… more »
    Top Answer:SSIS PowerPack is a group of drag and drop connectors for Microsoft SQL Server Integration Services, commonly called… more »
    Top Answer:The product's deployment phase is easy.
    Top Answer:If you don't want to pay a lot of money, you can go for SSIS, as its open-source version is available. When it comes to… more »
    Top Answer:The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's… more »
    Top Answer:We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys… more »
    Top Answer:StreamSets is used for data transformation rather than ETL processes. It focuses on transforming data directly from… more »
    Ranking
    15th
    out of 101 in Data Integration
    Views
    3,247
    Comparisons
    1,075
    Reviews
    10
    Average Words per Review
    1,105
    Rating
    7.5
    2nd
    out of 101 in Data Integration
    Views
    19,105
    Comparisons
    15,500
    Reviews
    35
    Average Words per Review
    474
    Rating
    7.8
    8th
    out of 101 in Data Integration
    Views
    4,200
    Comparisons
    2,349
    Reviews
    22
    Average Words per Review
    1,306
    Rating
    8.4
    Comparisons
    Also Known As
    Hitachi Lumada Data Integration, Kettle, Pentaho Data Integration
    SQL Server Integration Services
    Learn More
    Overview

    Pentaho Data Integration stands as a versatile platform designed to cater to the data integration and analytics needs of organizations, regardless of their size. This powerful solution is the go-to choice for businesses seeking to seamlessly integrate data from diverse sources, including databases, files, and applications. Pentaho Data Integration facilitates the essential tasks of cleaning and transforming data, ensuring it's primed for meaningful analysis. With a wide array of tools for data mining, machine learning, and statistical analysis, Pentaho Data Integration empowers organizations to glean valuable insights from their data. What sets Pentaho Data Integration apart is its maturity and a vibrant community of users and developers, making it a reliable and cost-effective option. Pentaho Data Integration offers a range of features, including a comprehensive ETL toolkit, data cleaning and transformation capabilities, robust data analysis tools, and seamless deployment options for data integration and analytics solutions, making it a go-to solution for organizations seeking to harness the power of their data.

    SSIS is a versatile tool for data integration tasks like ETL processes, data migration, and real-time data processing. Users appreciate its ease of use, data transformation tools, scheduling capabilities, and extensive connectivity options. It enhances productivity and efficiency within organizations by streamlining data-related processes and improving data quality and consistency.

    StreamSets is a data integration platform that enables organizations to efficiently move and process data across various systems. It offers a user-friendly interface for designing, deploying, and managing data pipelines, allowing users to easily connect to various data sources and destinations. StreamSets also provides real-time monitoring and alerting capabilities, ensuring that data is flowing smoothly and any issues are quickly addressed.

    Sample Customers
    66Controls, Providential Revenue Agency of Ro Negro, NOAA Information Systems, Swiss Real Estate Institute
    1. Amazon.com 2. Bank of America 3. Capital One 4. Coca-Cola 5. Dell 6. E*TRADE 7. FedEx 8. Ford Motor Company 9. Google 10. Home Depot 11. IBM 12. Intel 13. JPMorgan Chase 14. Kraft Foods 15. Lockheed Martin 16. McDonald's 17. Microsoft 18. Morgan Stanley 19. Nike 20. Oracle 21. PepsiCo 22. Procter & Gamble 23. Prudential Financial 24. RBC Capital Markets 25. SAP 26. Siemens 27. Sony 28. Toyota 29. UnitedHealth Group 30. Visa 31. Walmart 32. Wells Fargo
    Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
    Top Industries
    REVIEWERS
    Healthcare Company19%
    Financial Services Firm19%
    Comms Service Provider11%
    Manufacturing Company11%
    VISITORS READING REVIEWS
    Financial Services Firm19%
    Computer Software Company14%
    Comms Service Provider12%
    Government7%
    REVIEWERS
    Financial Services Firm23%
    Government8%
    Retailer8%
    Insurance Company8%
    VISITORS READING REVIEWS
    Financial Services Firm17%
    Computer Software Company12%
    Government7%
    Healthcare Company6%
    REVIEWERS
    Financial Services Firm20%
    Energy/Utilities Company20%
    Comms Service Provider13%
    Computer Software Company13%
    VISITORS READING REVIEWS
    Financial Services Firm17%
    Computer Software Company13%
    Manufacturing Company8%
    Government7%
    Company Size
    REVIEWERS
    Small Business27%
    Midsize Enterprise31%
    Large Enterprise42%
    VISITORS READING REVIEWS
    Small Business21%
    Midsize Enterprise11%
    Large Enterprise68%
    REVIEWERS
    Small Business27%
    Midsize Enterprise17%
    Large Enterprise56%
    VISITORS READING REVIEWS
    Small Business18%
    Midsize Enterprise13%
    Large Enterprise69%
    REVIEWERS
    Small Business40%
    Midsize Enterprise12%
    Large Enterprise48%
    VISITORS READING REVIEWS
    Small Business16%
    Midsize Enterprise11%
    Large Enterprise73%
    Buyer's Guide
    Data Integration
    April 2024
    Find out what your peers are saying about Microsoft, Informatica, Oracle and others in Data Integration. Updated: April 2024.
    769,334 professionals have used our research since 2012.