Try our new research platform with insights from 80,000+ expert users

SAP Data Hub vs SSIS vs StreamSets comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Mindshare comparison

Data Governance
Data Integration
Data Integration
 

Featured Reviews

VM
The solution is seamless, but the database sometimes leads to confusion
We used to have multiple different kinds of databases, which internally, had different compliance levels. Retention management is very different now. If the policy is live and the claim has been completed, I couldn't archive the claim. I needed to keep a reference integrity of that claim and understand which policy paid out the claim. With this solution, the policy came in six months ago and qualified for archiving. The claim had been paid and in every environment, the claim had been closed, including the reporting system, the claims system, etc. With the payment set gateway, I can just go and archive. But, we had a hard time during this process. I rate the overall solution a seven out of ten.
Sean Achim - PeerSpot reviewer
Building impactful organizational KPIs with ease and precision
Stability is rated at 10. One other important aspect I appreciate is that SSAS is included in the base installation of SQL Server. Obviously, it requires installation, but it is readily available, which is a major strength. It's all about setting it up, configuring it, and then using it. If there are additional costs associated with it or separating it as a second product, that would be a disadvantage. The area of improvement is really in education. Microsoft is trying to push everything as a Power BI solution or trying to get people to solve the problems which are solved with SSAS in another space in Power BI, or in Power Pivot, is not enough. There's not enough marketing, conversation, and support around that space. As a result, we end up with people not understanding that you need to build your models correctly, and then they try to model everything inside of Power BI, or another visualization tool, without first building the data model. That leads people to consider alternate solutions because SAP and others argue that their whole thing is in memory, and they disseminate misleading information. Additionally, what would be very helpful is local user group developments, so getting people around the table and teaching them how to use it. That is the biggest problem; it's not the technology itself. The challenge lies in Microsoft withdrawing a lot of the qualifications and watering down its emphasis, leading to a perception that this is supposed to be an elite product.
Ved Prakash Yadav - PeerSpot reviewer
Useful for data transformation and helps with column encryption
We use various tools and alerting systems to notify us of pipeline errors or failures. StreamSets supports data governance and compliance by allowing us to encrypt incoming data based on specified rules. We can easily encrypt columns by providing the column name and hash key. If you're considering using StreamSets for the first time, I would advise first understanding why you want to use it and how it will benefit you. If you're dealing with change tracking or handling large amounts of data, it could be cost-effective compared to services like Amazon. It's easy to schedule and manage tasks with the tool, and you can enhance your skills as an ETL developer. You can easily migrate traditional pipelines built on platforms like Informatica or Talend to StreamSets. I rate the overall solution an eight out of ten.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"Its connection to on-premise products is the most valuable. We mostly use the on-premise connection, which is seamless. This is what we prefer in this solution over other solutions. We are using it the most for the orchestration where the data is coming from different categories. Its other features are very much similar to what they are giving us in open source. Their push-down approach is the most advantageous, where they push most of the processing on to the same data source. This means that they have a serverless kind of thing, and they don't process the data inside a product such as Data Hub. They process the data from where the data is coming out. If it is coming from HANA, to capture the data or process it for analytics, orchestration, or management, they go to the HANA database and give it out. They don't process it on Data Hub. This push-down approach increases the processing speed a little bit because the data is processed where it is sitting. That's the best part and an advantage. I have used another product where they used to capture the data first and then they used to process it and give it. In Data Hub, it is in reverse. They process it first and give it, and then they put their own manipulations. They lead in terms of business functions. No other solution has business functions already implemented to perform business analysis. They have a lot of prebuilt business functions for machine learning and orchestration, which we can use directly to get an analysis out from the existing data. Most of the data is sitting as enterprise data there. That's a major advantage that they have."
"SAP is one of the most seamless ERPs that have integrated SAP archiving within Excel. I have not seen this with any other database."
"The most valuable feature is the S/4HANA 1909 On-Premise"
"It's a competent product."
"It has a drag and drop feature that makes it easy to use. It has a good user experience because it takes into account your most-used tools and they're lined up nicely so you can just drag and drop without looking too far. It also integrates nicely with Microsoft."
"The UI is very user-friendly."
"The script component is very powerful, things that you cannot normally do, is feasible through C#."
"It's saved time using visualization descriptions."
"The simplicity of the solution is great. The solution also offers excellent integration."
"The performance is better than doing it in some alternative ways. We don't have to worry about so much manual work."
"It is easy to set up the solution."
"I really appreciate the numerous ready connectors available on both the source and target sides, the support for various media file formats, and the ease of configuring and managing pipelines centrally."
"In StreamSets, everything is in one place."
"StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"One of the things I like is the data pipelines. They have a very good design. Implementing pipelines is very straightforward. It doesn't require any technical skill."
"The ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems."
"It is really easy to set up and the interface is easy to use."
"StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
"I have used Data Collector, Transformer, and Control Hub products from StreamSets. What I really like about these products is that they're very user-friendly. People who are not from a technological or core development background find it easy to get started and build data pipelines and connect to the databases. They would be comfortable like any technical person within a couple of weeks."
 

Cons

"In 2018, connecting it to outside sources, such as IoT products or IoT-enabled big data Hadoop, was a little complex. It was not smooth at the beginning. It was unstable. It took a lot of time for the initial data load. Sometimes, the connection broke, and we had to restart the process, which was a major issue, but they might have improved it now. It is very smooth with SAP HANA on-premise system, SAP Cloud Platform, and SAP Analytics Cloud. It could be because these are their own products, and they know how to integrate them. With Hadoop, they might have used open-source technologies, and that's why it was breaking at that time. They are providing less embedded integration because they want us to use their other products. For example, they don't want to go and remove SAP Analytics Cloud and put everything in Data Hub. They want us to use SAP Analytics Cloud somewhere else and not inside the Data Hub. On the integration part, it lacks real-time analytics, and it is slow. They should embed the SAP Analytics Cloud inside Data Hub or support some kind of analysis. They do provide some analysis, but it is not extensive. They are moreover open source. So, we need a lot of developers or data scientists to go in and implement Python algorithms. It would be better if they can provide their own existing algorithms and give some connections and drop-down menus to go and just configure those. It will make things really quick by increasing the embedded integrations. It will also improve the process efficiency and processing power. Its performance needs improvement. It is a little slow. It is not the best in the market, and there are other products that are much better than this. In terms of technology and performance, it is a little slow as compared to Microsoft and other data orchestration products. I haven't used other products, but I have read about those products, their settings, and the milliseconds that they do. In Azure Purview, they say that they can copy, manage, or transform the data within milliseconds. They say that they can transform 100 gigabytes of data within three to five seconds, which is something SAP cannot do. It generally takes a lot of time to process that much amount of data. However, I have never tested out Azure."
"Nowadays there are some inconsistencies in data bases, however, they upgrade and release the versions to market."
"The company has everything offshore."
"You have to write push down join & lookup SQL to the database yourself via stored procedures or use of the SQL Task to get very high performance. That said, this is a common complaint for nearly all ETL tools on the market and those that offer an alternative such as Informatica offer them at a very expensive add-on price."
"Improvement as per customer requirements."
"We have a stability problem because when something works, it works one time. The next time, it doesn't work."
"SSIS is cumbersome despite its drag-and-drop functionality. For example, let's say I have 50 tables with 30 columns. You need to set a data type for each column and table. That's around 1,500 objects. It gets unwieldy adding validation for every column. Previously, SSIS automatically detected the data type, but I think they removed this feature. It would automatically detect if it's an integer, primary key, or foreign key column. You had fewer problems building the model."
"We'd like them to develop data exploration more."
"The solution could improve on integrating with other types of data sources."
"When I compare Talend and SSIS, Talend provides more features. With Talend, we can handle a large volume of data. Talend is usually used to treat a large volume of data, which makes it better than SSIS on the data side. Talend also has a very good Talend Management Console to schedule the jobs and do other things. It can also be easily connected to version control tools such as GitHub or SVN. The last time I used SSIS, it was connected through TSS for the Windows Console version. I am not sure it has been improved or not. If it is not improved, Microsoft should improve it. They should change the product to provide another console."
"SSIS sometimes hangs, and there are some problems with servers going down after they've been patched."
"Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingestion information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
"The monitoring visualization is not that user-friendly. It should include other features to visualize things, like how many records were streamed from a source to a destination on a particular date."
"They need to improve their customer care services. Sometimes it has taken more than 48 hours to resolve an issue. That should be reduced. They are aware of small or generic issues, but not the more technical or deep issues. For those, they require some time, generally 48 to 72 hours to respond. That should be improved."
"The design experience is the bane of our existence because their documentation is not the best. Even when they update their software, they don't publish the best information on how to update and change your pipeline configuration to make it conform to current best practices. We don't pay for the added support. We use the "freeware version." The user community, as well as the documentation they provide for the standard user, are difficult, at best."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
"The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time."
"One area for improvement could be the cloud storage server speed, as we have faced some latency issues here and there."
"We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which was painful. Also, pipeline failures were common, and data drifting wasn't addressed, which made things worse. Licensing was another issue we encountered."
 

Pricing and Cost Advice

"The Cloud is very expensive, but SAP HANA previous service is okay."
"I'm not involved in licensing details, but SSIS provides value to our organization by simplifying data management tasks."
"The solution is economical. You don't have to worry about the pricing as long as you're installing both services on the same server."
"It comes bundled with other solutions, which makes it difficult to get the price on the specific product."
"We have an enterprise license for this solution."
"My advice is to look at what your configuration will be because most companies have their own deals with Microsoft."
"If you don't want to pay a lot of money, you can go for SSIS, as its open-source version is available. When it comes to licensing, SSIS can be expensive."
"SSIS' licensing is a little high, but it gives good value for money."
"t's incredibly cost effective, easy to learn the basics quickly (although like all ETL tools requires the traditional learning curve to get good at) and has an immense user base."
"There are two editions, Professional and Enterprise, and there is a free trial. We're using the Professional edition and it is competitively priced."
"There are different versions of the product. One is the corporate license version, and the other one is the open-source or free version. I have been using the corporate license version, but they have recently launched a new open-source version so that anybody can create an account and use it. The licensing cost varies from customer to customer. I don't have a lot of input on that. It is taken care of by PMO, and they seem fine with its pricing model. It is being used enterprise-wide. They seem to have got a good deal for StreamSets."
"We are running the community version right now, which can be used free of charge."
"The pricing is too fixed. It should be based on how much data you need to process. Some businesses are not so big that they process a lot of data."
"It's not so favorable for small companies."
"The licensing is expensive, and there are other costs involved too. I know from using the software that you have to buy new features whenever there are new updates, which I don't really like. But initially, it was very good."
"I believe the pricing is not equitable."
"The overall cost is very flexible so it is not a burden for our organization... However, the cost should be improved. For small and mid-size organizations it might be a challenge."
report
Use our free recommendation engine to learn which Data Governance solutions are best for your needs.
857,162 professionals have used our research since 2012.
 

Comparison Review

it_user90069 - PeerSpot reviewer
Feb 20, 2014
Informatica PowerCenter vs. Microsoft SSIS - each technology has its advantages but also have similarities
Technology has made it easier for businesses to organize and manipulate data to get a clearer picture of what’s going on with their business. Notably, ETL tools have made managing huge amounts of data significantly easier and faster, boosting many organizations’ business intelligence operations…
 

Top Industries

By visitors reading reviews
Financial Services Firm
16%
Computer Software Company
14%
Manufacturing Company
12%
Government
12%
Financial Services Firm
19%
Computer Software Company
12%
Government
8%
Manufacturing Company
6%
Financial Services Firm
13%
Computer Software Company
11%
Manufacturing Company
10%
Insurance Company
9%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
No data available
 

Questions from the Community

What do you like most about SAP Data Hub?
SAP is one of the most seamless ERPs that have integrated SAP archiving within Excel. I have not seen this with any o...
What needs improvement with SAP Data Hub?
We moved from Oracle. If you're aware of your monitoring system, the RPU market, and the managed system, you should m...
What is your primary use case for SAP Data Hub?
I technically handle the database, like cycle management projects. When transaction data comes in, we see it based on...
Which is better - SSIS or Informatica PowerCenter?
SSIS PowerPack is a group of drag and drop connectors for Microsoft SQL Server Integration Services, commonly called ...
What do you like most about SSIS?
The product's deployment phase is easy.
What is your experience regarding pricing and costs for SSIS?
Utilizing SSIS involves no extra charges beyond the SQL Server license. It's an economical choice for my clients.
What do you like most about StreamSets?
The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It...
What needs improvement with StreamSets?
One issue I observed with StreamSets is that the memory runs out quickly when processing large volumes of data. Becau...
What is your primary use case for StreamSets?
We are using StreamSets for batch loading.
 

Also Known As

No data available
SQL Server Integration Services
No data available
 

Overview

 

Sample Customers

Kaeser Kompressoren, HARTMANN
1. Amazon.com 2. Bank of America 3. Capital One 4. Coca-Cola 5. Dell 6. E*TRADE 7. FedEx 8. Ford Motor Company 9. Google 10. Home Depot 11. IBM 12. Intel 13. JPMorgan Chase 14. Kraft Foods 15. Lockheed Martin 16. McDonald's 17. Microsoft 18. Morgan Stanley 19. Nike 20. Oracle 21. PepsiCo 22. Procter & Gamble 23. Prudential Financial 24. RBC Capital Markets 25. SAP 26. Siemens 27. Sony 28. Toyota 29. UnitedHealth Group 30. Visa 31. Walmart 32. Wells Fargo
Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
Find out what your peers are saying about Microsoft, Informatica, Collibra and others in Data Governance. Updated: June 2025.
857,162 professionals have used our research since 2012.