
IBM Cloud Pak for Integration vs StreamSets comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Categories and Ranking

IBM Cloud Pak for Integration
Average Rating: 8.6
Reviews Sentiment: 7.0
Number of Reviews: 5
Ranking in other categories: API Management (26th), Cloud Data Integration (15th)

StreamSets
Average Rating: 8.4
Reviews Sentiment: 7.0
Number of Reviews: 21
Ranking in other categories: Data Integration (22nd)
 

Featured Reviews

Igor Khalitov - PeerSpot reviewer
Manages APIs and integrates microservices with redirection feature
IBM Cloud Pak for Integration includes monitoring capabilities to track the performance and health of your integrations. You can quickly roll back to a previous version if an issue arises. Additionally, it supports incremental deployments, allowing you to shift traffic to a new version of an API gradually. For example, you can start by directing 10% of traffic to the new version while the rest continue using the legacy version. If everything works as expected, you can gradually increase the traffic to the new version over time. IBM Cloud Pak for Integration has a client base that includes numerous organizations using AI and machine learning technologies. We leverage an open-source machine learning framework and integrate it with Kafka to help create and manage various products and data retrieval processes. For companies with private data, the framework first retrieves relevant data from a GitHub database, which is then combined with the final request before being sent to a language model like GPT. This ensures that the language model uses your specific data to generate responses. Kafka plays a key role by streaming real-time data from file systems and databases like Oracle and Microsoft SQL. This data is published to Kafka topics, then vectorized and used with artificial intelligence to enhance the overall process. It's like an old-fashioned approach. The best way is to redesign it with products such as Kafka. Overall, I rate the solution an eight out of ten.
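The gradual traffic shift described in this review (10% to the new API version first, then more over time) can be illustrated with a minimal Python sketch of weighted routing. This is not IBM Cloud Pak for Integration's API or configuration; the function names `choose_version` and `simulate` and the labels "new" and "legacy" are hypothetical and exist only for illustration.

```python
import random


def choose_version(new_version_share: float, rng: random.Random) -> str:
    """Pick a backend for one request.

    new_version_share is the fraction of traffic (0.0 to 1.0) that should
    reach the new API version; everything else goes to the legacy version.
    """
    if not 0.0 <= new_version_share <= 1.0:
        raise ValueError("new_version_share must be between 0.0 and 1.0")
    # rng.random() returns a float in [0.0, 1.0), so a share of 0.0 never
    # selects the new version and a share of 1.0 always does.
    return "new" if rng.random() < new_version_share else "legacy"


def simulate(new_version_share: float, requests: int, seed: int = 0) -> dict:
    """Count how many simulated requests land on each version."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    counts = {"new": 0, "legacy": 0}
    for _ in range(requests):
        counts[choose_version(new_version_share, rng)] += 1
    return counts


if __name__ == "__main__":
    # Start at 10% and raise the share in steps, as the review describes.
    for share in (0.10, 0.25, 0.50, 1.00):
        print(share, simulate(share, 10_000))
```

Rolling back in this model amounts to setting the share to 0.0, which sends every request to the legacy version again.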
Ved Prakash Yadav - PeerSpot reviewer
Useful for data transformation and helps with column encryption
We use various tools and alerting systems to notify us of pipeline errors or failures. StreamSets supports data governance and compliance by allowing us to encrypt incoming data based on specified rules. We can easily encrypt columns by providing the column name and hash key. If you're considering using StreamSets for the first time, I would advise first understanding why you want to use it and how it will benefit you. If you're dealing with change tracking or handling large amounts of data, it could be cost-effective compared to services like Amazon. It's easy to schedule and manage tasks with the tool, and you can enhance your skills as an ETL developer. You can easily migrate traditional pipelines built on platforms like Informatica or Talend to StreamSets. I rate the overall solution an eight out of ten.
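The column protection mentioned in this review (supplying a column name and a hash key) can be sketched in plain Python. This is not StreamSets code or its processor configuration; it assumes that "hash key" means a keyed one-way hash, shown here as HMAC-SHA256 from the standard library, and the function name `hash_column` and the sample field names are hypothetical.

```python
import hashlib
import hmac


def hash_column(rows, column, key: bytes):
    """Return copies of the rows with one column replaced by a keyed hash.

    Assumption: values are hashed with HMAC-SHA256. This is one-way masking,
    not reversible encryption, so the original value cannot be recovered.
    """
    masked_rows = []
    for row in rows:
        masked = dict(row)  # copy so the caller's rows are left untouched
        value = masked.get(column)
        if value is not None:
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            masked[column] = digest.hexdigest()
        masked_rows.append(masked)
    return masked_rows


if __name__ == "__main__":
    records = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]
    for record in hash_column(records, "email", b"demo-key"):
        print(record)
```

Because the same key and input always produce the same digest, hashed columns can still be used for joins and deduplication downstream.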

Quotes from Members

 

Pros

"Cloud Pak for Integration is definitely scalable. That is the most important criteria."
"It is a stable solution."
"The most preferable aspect would be the elimination of the command, which was a significant improvement. In the past, it was a challenge, but now we can proceed smoothly with the implementation of our policies and everything is managed through JCP. It's still among the positive aspects, and it's a valuable feature."
"The most valuable aspect of the Cloud Pak, in general, is the flexibility that you have to use the product."
"Redirection is a key feature. It helps in managing multiple microservices by centralizing control and access."
"The ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems."
"The ease of configuration for pipes is amazing. It has a lot of connectors. Mainly, we can do everything with the data in the pipe. I really like the graphical interface too."
"The best feature that I really like is the integration."
"For me, the most valuable features in StreamSets have to be the Data Collector and Control Hub, but especially the Data Collector. That feature is very elegant and seamlessly works with numerous source systems."
"The entire user interface is very simple and the simplicity of creating pipelines is something that I like very much about it. The design experience is very smooth."
"StreamSets is the leader in the market."
"The scheduling within the data engineering pipeline is very much appreciated, and it has a wide range of connectors for connecting to any data sources like SQL Server, AWS, Azure, etc. We have used it with Kafka, Hadoop, and Azure Data Factory Datasets. Connecting to these systems with StreamSets is very easy."
"The most valuable features are the option of integration with a variety of protocols, languages, and origins."
 

Cons

"Enterprise bots are needed to balance products like Kafka and Confluent."
"Its queuing and messaging features need improvement."
"The initial setup is not easy."
"Setting up Cloud Pak for Integration is relatively complex. It's not as easy because it has not yet been fully integrated. You still have some products that are still not containerized, so you still have to run them on a dedicated VM."
"The pricing can be improved."
"One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
"The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time."
"Visualization and monitoring need to be improved and refined."
"If you use JDBC Lookup, for example, it generally takes a long time to process data."
"We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which was painful. Also, pipeline failures were common, and data drifting wasn't addressed, which made things worse. Licensing was another issue we encountered."
"One issue I observed with StreamSets is that the memory runs out quickly when processing large volumes of data. Because of this memory issue, we have to upgrade our EC2 boxes in the Amazon AWS infrastructure."
 

Pricing and Cost Advice

"The solution's pricing model is very flexible."
"It is an expensive solution."
"It has a CPU core-based licensing, which works for us and is quite good."
"We use the free version. It's great for a public, free release. Our stance is that the paid support model is too expensive to get into. They should honestly reevaluate that."
"There are two editions, Professional and Enterprise, and there is a free trial. We're using the Professional edition and it is competitively priced."
"There are different versions of the product. One is the corporate license version, and the other one is the open-source or free version. I have been using the corporate license version, but they have recently launched a new open-source version so that anybody can create an account and use it. The licensing cost varies from customer to customer. I don't have a lot of input on that. It is taken care of by PMO, and they seem fine with its pricing model. It is being used enterprise-wide. They seem to have got a good deal for StreamSets."
"StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub."
"Its pricing is pretty much up to the mark. For smaller enterprises, it could be a big price to pay at the initial stage of operations, but the moment you have the Seed B or Seed C funding and you want to scale up your operations and aren't much worried about the funds, at that point in time, you would need a solution that could be scaled."
"We are running the community version right now, which can be used free of charge."
"StreamSets is an expensive solution."
856,874 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 21%
Computer Software Company: 11%
Government: 8%
Insurance Company: 8%

Financial Services Firm: 13%
Computer Software Company: 11%
Manufacturing Company: 10%
Insurance Company: 9%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
No data available
 

Questions from the Community

What do you like most about IBM Cloud Pak for Integration?
The most preferable aspect would be the elimination of the command, which was a significant improvement. In the past, it was a challenge, but now we can proceed smoothly with the implementation of ...
What needs improvement with IBM Cloud Pak for Integration?
Enterprise bots are needed to balance products like Kafka and Confluent.
What is your primary use case for IBM Cloud Pak for Integration?
It manages APIs and integrates microservices at the enterprise level. It offers a range of capabilities for handling APIs, microservices, and various integration needs. The platform supports thousa...
What do you like most about StreamSets?
The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customiz...
What needs improvement with StreamSets?
One issue I observed with StreamSets is that the memory runs out quickly when processing large volumes of data. Because of this memory issue, we have to upgrade our EC2 boxes in the Amazon AWS infr...
What is your primary use case for StreamSets?
We are using StreamSets for batch loading.
 

Overview

 

Sample Customers

CVS Health Corporation
Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
Updated: June 2025.