We performed a comparison between Informatica Cloud Data Integration and StreamSets based on real PeerSpot user reviews.
Find out in this report how the two Cloud Data Integration solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"Informatica Cloud Data Integration is stable."
"It has a more UI-based tool, and the scripting is good."
"Data integration is the most valuable feature. The ability to connect to any of the sources and enterprise applications makes our lives easier."
"The solution provides increased efficiency while still being user-friendly and easy to operate."
"The most valuable features of the solution are pushdown, optimization, and partitioning, which can be used for frequent data loading, especially when we have a huge data volume in our company."
"It is quite easy to use and flexible."
"I like the fact that you can find almost any product connection that you need and the list is always expanding."
"REST API: Excellent for scripting control and reporting mechanisms"
"What I love the most is that StreamSets is very light. It's a containerized application. It's easy to use with Docker. If you are a large organization, it's very easy to use Kubernetes."
"It's very easy to integrate. It integrates with Snowflake, AWS, Google Cloud, and Azure. It's very helpful for DevOps, DataOps, and data engineering because it provides a comprehensive solution, and it's not complicated."
"StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"It is a very powerful, modern data analytics solution, in which you can integrate a large volume of data from different sources. It integrates all of the data and you can design, create, and monitor pipelines according to your requirements. It is an all-in-one day data ops solution."
"One of the things I like is the data pipelines. They have a very good design. Implementing pipelines is very straightforward. It doesn't require any technical skill."
"The ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems."
"StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
"In StreamSets, everything is in one place."
"A general improvement in icons and the virtual interface would be good."
"Cost-wise, it could be better."
"The regions in which the data resides are still limited. This could be an issue in terms of the data residency laws of some of the countries. They should get more regions."
"While my company operates on the cloud, we have seen that Informatica Cloud Data Integration has some performance issues causing it to lag."
"Error reporting and debugging need improvement."
"It needs to be a little more intuitive but it’s really not bad."
"It would be helpful if there was a GenAI feature integrated into the system, especially regarding the data quality."
"There is room for improvement at the highest level in terms of useability and connectors for various types of new applications. The row processing performance could be better because you experience some latency dealing with high volumes of data. Most organizations will be dealing with multiple cloud applications, so you could see performance issues moving from one system to another."
"They need to improve their customer care services. Sometimes it has taken more than 48 hours to resolve an issue. That should be reduced. They are aware of small or generic issues, but not the more technical or deep issues. For those, they require some time, generally 48 to 72 hours to respond. That should be improved."
"In terms of the product, I don't think there is any room for improvement because it is very good. One small area of improvement that is very much needed is on the knowledge base side. Sometimes, it is not very clear how to set up a certain process or a certain node for a person who's using the platform for the first time."
"Visualization and monitoring need to be improved and refined."
"Using ETL pipelines is a bit complicated and requires some technical aid."
"The monitoring visualization is not that user-friendly. It should include other features to visualize things, like how many records were streamed from a source to a destination on a particular date."
"One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing."
"The data collector in StreamSets has to be designed properly. For example, a simple database configuration with MySQL DB requires the MySQL Connector to be installed."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
Informatica Cloud Data Integration is ranked 5th in Cloud Data Integration with 40 reviews while StreamSets is ranked 8th in Data Integration with 24 reviews. Informatica Cloud Data Integration is rated 7.8, while StreamSets is rated 8.4. The top reviewer of Informatica Cloud Data Integration writes "A stable, scalable, and user-friendly solution". On the other hand, the top reviewer of StreamSets writes "We no longer need to hire highly skilled data engineers to create and monitor data pipelines". Informatica Cloud Data Integration is most compared with Informatica PowerCenter, Azure Data Factory, AWS Glue, Fivetran and Mule Anypoint Platform, whereas StreamSets is most compared with Fivetran, Azure Data Factory, Informatica PowerCenter, SSIS and IBM InfoSphere DataStage.
We monitor all Cloud Data Integration reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.