"StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
"StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"It is really easy to set up and the interface is easy to use."
"In StreamSets, everything is in one place."
"It has built-in connectors for more than 100 sources and onboarding data from many different sources to the cloud environment."
"Data Factory's most valuable feature is Copy Activity."
"The most valuable features are data transformations."
"The overall performance is quite good."
"The most valuable feature of Azure Data Factory is that it has a good combination of flexibility, fine-tuning, automation, and good monitoring."
"The security of the agent that is installed on-premises is very good."
"I think it makes it very easy to understand what data flow is and so on. You can leverage the user interface to do the different data flows, and it's great. I like it a lot."
"The most valuable feature of this solution would be ease of use."
"WhereScape is really helpful in terms of architecture data. Everything is one of automation. Two people can do thousands of tables in one day or two. It saves a lot of time."
"The most valuable feature is the metadata generated code."
"I like the data vault implementations."
"If you use JDBC Lookup, for example, it generally takes a long time to process data."
"The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time."
"Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingestion information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
"The need to work more on developing out-of-the-box connectors for other products like Oracle, AWS, and others."
"Azure Data Factory can improve by having support in the drivers for change data capture."
"One area for improvement is documentation. At present, there isn't enough documentation on how to use Azure Data Factory in certain conditions. It would be good to have documentation on the various use cases."
"Occasionally, there are problems within Microsoft itself that impacts the Data Factory and causes it to fail."
"There is always room to improve. There should be good examples of use that, of course, customers aren't always willing to share. It is Catch-22. It would help the user base if everybody had really good examples of deployments that worked, but when you ask people to put out their good deployments, which also includes me, you usually got, "No, I'm not going to do that." They don't have enough good examples. Microsoft probably just needs to pay one of their partners to build 20 or 30 examples of functional Data Factories and then share them as a user base."
"It would be better if it had machine learning capabilities."
"Data Factory could be improved by eliminating the need for a physical data area. We have to extract data using Data Factory, then create a staging database for it with Azure SQL, which is very, very expensive. Another improvement would be lowering the licensing cost."
"Some of the optimization techniques are not scalable."
"Technical support isn't the best."
"Project-based searching of data objects in the data warehouse browser needs to be improved."
"Customization could be better."
StreamSets offers an end-to-end data integration platform to build, run, monitor and manage smart data pipelines that deliver continuous data for DataOps, and power the modern data ecosystem and hybrid integration.
Only StreamSets provides a single design experience for all design patterns for 10x greater developer productivity; smart data pipelines that are resilient to change for 80% fewer breakages; and a single pane of glass for managing and monitoring all pipelines across hybrid and cloud architectures to eliminate blind spots and control gaps.
With StreamSets, you can deliver the continuous data that drives the connected enterprise.
Create, schedule, and manage your data integration at scale with Azure Data Factory - a hybrid data integration (ETL) service. Work with data wherever it lives, in the cloud or on-premises, with enterprise-grade security.
WhereScape is data warehouse software that automates the Data Warehouse lifecycle. From implementation to maintenance, WhereScape will ensure your data warehouse projects are completed up to 5x faster than manual coding.
Azure Data Factory is ranked 2nd in Data Integration Tools with 32 reviews while WhereScape RED is ranked 22nd in Data Integration Tools with 3 reviews. Azure Data Factory is rated 7.8, while WhereScape RED is rated 8.0. The top reviewer of Azure Data Factory writes "Easy to bring in outside capabilities, flexible, and works well". On the other hand, the top reviewer of WhereScape RED writes "Quick to set up, flexible, and stable". Azure Data Factory is most compared with Informatica PowerCenter, Informatica Cloud Data Integration, Talend Open Studio, Alteryx Designer and Palantir Foundry, whereas WhereScape RED is most compared with SSIS, Informatica PowerCenter, Talend Open Studio, Matillion ETL and AWS Glue.