
Actian Pervasive Data Integrator [EOL] vs StreamSets comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Categories and Ranking

Actian Pervasive Data Integ...
Average Rating: 5.8
Number of Reviews: 6
Ranking in other categories: No ranking in other categories

StreamSets
Average Rating: 8.4
Reviews Sentiment: 7.0
Number of Reviews: 21
Ranking in other categories: Data Integration (22nd)
 

Featured Reviews

BB
Associate Manager at a tech services company with 10,001+ employees
It is stable and we did not have any issues with initial setup. I am not sure if there are various connectors available in the recent version to support the wide range of sources available.
I am not sure if there are various connectors available in the recent version of Pervasive DI to support the wide range of sources available (e.g., big data, cloud, EME). I have been using it for 6-7 months. There were no concerns with the stability. This product is very good from a stability…
SS
Enterprise Solutions Architect at an energy/utilities company with 1,001-5,000 employees
Enables effective batch loading with visual interface and enterprise support
One issue I observed with StreamSets is that the memory runs out quickly when processing large volumes of data. Because of this memory issue, we have to upgrade our EC2 boxes in the Amazon AWS infrastructure. I had to switch to a new EC2 box, even though the processor was not fully utilized. It would be beneficial if StreamSets addressed any potential memory leak issues to prevent unnecessary upgrades. Additionally, it would be a great enhancement if StreamSets could produce a lineage graph to visualize how the data has passed through the system.

Quotes from Members

 

Pros

"Web based interface."
"This product is extremely useful and better than building an in-house solution."
"We have been using it for ETL, it is very robust and easy to use."
"For data transfers and ETLs on Windows, this is a great product."
"There were no concerns with the stability. This product is very good from a stability perspective."
"StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"The ease of configuration for pipes is amazing. It has a lot of connectors. Mainly, we can do everything with the data in the pipe. I really like the graphical interface too."
"The best feature that I really like is the integration."
"StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
"StreamSets has definitely helped us in getting the information into our data lake very quickly, in terms of ingestion, and the most important thing is it has helped us from a resourcing point of view because you can easily upskill a BI or ETL resource without any programming knowledge to work with this, which has drastically reduced the time that we are spending on workloads by 60% to 70% as well as reducing the time spent on ingestion by 30%."
"The most valuable would be the GUI platform that I saw. I first saw it at a special session that StreamSets provided towards the end of the summer. I saw the way you set it up and how you have different processes going on with your data. The design experience seemed to be pretty straightforward to me in terms of how you drag and drop these nodes and connect them with arrows."
"The ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems."
"The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customize it to do what you need. Many other tools have started to use features similar to those introduced by StreamSets, like automated workflows that are easy to set up."
 

Cons

"The areas which I think need to be improved are the transform and load steps."
"Technical support is an area which can definitely improve."
"I am not sure if there are various connectors available in the recent version of Pervasive DI to support the wide range of sources available (e.g., big data, cloud, EME)."
"At the moment, we can't even use it for 1, since it keeps failing in our QA testing."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
"In terms of the product, I don't think there is any room for improvement because it is very good. One small area of improvement that is very much needed is on the knowledge base side. Sometimes, it is not very clear how to set up a certain process or a certain node for a person who's using the platform for the first time."
"One issue I observed with StreamSets is that the memory runs out quickly when processing large volumes of data. Because of this memory issue, we have to upgrade our EC2 boxes in the Amazon AWS infrastructure."
"I would like to see it integrate with other kinds of platforms, other than Java. We're going to have a lot of applications using .NET and other languages or frameworks. StreamSets is very helpful for the old Java platform but it's hard to integrate with the other platforms and frameworks."
"If the data processing in StreamSets takes a long time as compared to the previous solution, then we will reconsider why we use StreamSets."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
"One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing."
"The monitoring visualization is not that user-friendly. It should include other features to visualize things, like how many records were streamed from a source to a destination on a particular date."
 

Pricing and Cost Advice

Actian Pervasive Data Integrator [EOL]: Information not available
StreamSets:
"StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub."
"StreamSets is an expensive solution."
"The licensing is expensive, and there are other costs involved too. I know from using the software that you have to buy new features whenever there are new updates, which I don't really like. But initially, it was very good."
"Its pricing is pretty much up to the mark. For smaller enterprises, it could be a big price to pay at the initial stage of operations, but the moment you have the Seed B or Seed C funding and you want to scale up your operations and aren't much worried about the funds, at that point in time, you would need a solution that could be scaled."
"I believe the pricing is not equitable."
"It has a CPU core-based licensing, which works for us and is quite good."
"It's not so favorable for small companies."
"We are running the community version right now, which can be used free of charge."
900,644 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Actian Pervasive Data Integrator [EOL]: No data available
StreamSets:
Financial Services Firm: 12%
Manufacturing Company: 7%
Insurance Company: 7%
Computer Software Company: 6%
 

Company Size

By reviewers

Actian Pervasive Data Integrator [EOL] (company size: count)
Small Business: 3
Large Enterprise: 4

StreamSets (company size: count)
Small Business: 9
Midsize Enterprise: 2
Large Enterprise: 11
 

Questions from the Community

What needs improvement with StreamSets?
One issue I observed with StreamSets is that the memory runs out quickly when processing large volumes of data. Because of this memory issue, we have to upgrade our EC2 boxes in the Amazon AWS infr...
What is your primary use case for StreamSets?
We are using StreamSets for batch loading.
What advice do you have for others considering StreamSets?
If asked, I definitely recommend StreamSets to other users. My overall rating for the solution is nine.
 

Also Known As

Actian Pervasive Data Integrator [EOL]: Pervasive Data Integrator, PDI
StreamSets: No data available
 

Overview

 

Sample Customers

Actian Pervasive Data Integrator [EOL]: Pediatrix Medical Group
StreamSets: Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
Find out what your peers are saying about Informatica, Microsoft, Palantir and others in Data Integration. Updated: June 2026.