Azure Data Factory vs StreamSets comparison

 

Comparison Buyer's Guide

Categories and Ranking

Azure Data Factory
Ranking in Data Integration: 1st
Average Rating: 8.0
Number of Reviews: 81
Ranking in other categories: Cloud Data Warehouse (3rd)
StreamSets
Ranking in Data Integration: 8th
Average Rating: 8.4
Number of Reviews: 24
Ranking in other categories: none
 

Mindshare comparison

As of July 2024, in the Data Integration category, the mindshare of Azure Data Factory is 11.7%, down from 14.3% the previous year. The mindshare of StreamSets is 1.9%, up from 1.4% the previous year. Mindshare is calculated from PeerSpot user engagement data.
Data Integration
Unique categories:
Azure Data Factory: Cloud Data Warehouse, 11.0%
StreamSets: no other categories found
 

Featured Reviews

Rohit Sircar - PeerSpot reviewer
Nov 7, 2022
Helps to pull records and parse them quickly, but the exception handling and logging mechanisms can be improved
When we were integrating the Ports product with our internal data warehouse, we had to update all the reports in our internal data warehouse from the Ports system database. However, we were not given access to the database; instead, the company dumps some files or provides them to you. In one case, they provided files. In another case, they provided APIs that you need to call multiple times in batches of thousands of records. Azure Data Factory works very well for pulling the records, parsing them quickly, and posting them to the database and data warehouse.
Kevin Kathiem Mutunga - PeerSpot reviewer
Mar 24, 2023
Enables us to build data pipelines without knowing how to code and helped us break down data silos within our organization
Using StreamSets to create pipelines for batch, streaming, or ETL is easy and straightforward. However, for someone new to StreamSets, it may not be so simple and may require a lot of documentation for assistance. We utilize StreamSets' ability to connect to enterprise data stores, which makes it easy to begin trading instantly without needing to be technically skilled.
We use StreamSets to move data into analytics platforms. In my experience, it is initially quite easy to move data back if we have a clear understanding of data transit, importation, and exporting from external sources.
This solution enables us to build data pipelines without knowing how to code. The solution includes templates that guide us and help us customize our data easily. It is essential that StreamSets does not necessitate coding, as this saves a considerable amount of time that would otherwise be spent writing code, as well as resources that would be required to hire experts.
Transformer for Snowflake can help with both simple and complex transformation logic. For example, creating a plan to perform ETL and machine learning operations is easy and fast. However, if the same operations are performed on-site, it can be difficult to troubleshoot events due to limited visibility into the results. StreamSets' Transformer for Snowflake is important to us because it saves us a lot of time and enables us to complete a task remotely with only two or three people. It is important that Transformer for Snowflake is a serverless engine embedded within the platform.
We have the capability of creating a data operations platform, so we don't have to worry or even be aware of what we are doing at the moment. We can simply create a device and use it in the pipeline we want it to be in. The solution improved the way we work, benefiting both our customers and our development and retainer teams.
StreamSets helps us develop a platform manually, with a lot of teamwork, either remotely or on-site, depending on which option we use. This has had a significant impact on our organization in terms of how we process and transform data. I would say that it is very easy for us to update the template so that we can have real, actual data in APL claims and in the supply chain.
StreamSets' data drift resilience is very effective and can run in the data grid. The data drift resilience has reduced the time it takes us to fix data drift breakages by approximately 25 percent.
StreamSets helped us break down data silos within our organization. The ability to break down data silos helps us gain quick insights. In general, it is a great feature that ensures we have activities or processes in place. We know precisely what to prevent and what to implement.
StreamSets saved us around 30 percent of our time, meaning that a task that would take five hours to complete manually can now be done in around three and a half hours. The reusable assets are reducing workload by 35 percent by allowing different people to use a single platform or resource, regardless of whether they have a similar SKU or a different SKU. This feature can help an organization simplify, implement, and transmit more easily.
It is not only the cost of one packet that we paid for, but now we are implementing a strategy using different people within the company. It would be very expensive if we had to hire a new person to manage that task, and it would also take a lot of time. StreamSets is not only saving us money, but it is also ensuring that we complete strategies on time.
StreamSets also helped us scale our operations, which has had a significant impact on our business. We now have a better understanding of how to secure data and provide reliable security for the transmission of data from internal servers to external services, as well as meeting our client's application needs.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"The solution includes a feature that increases the number of processors used which makes it very powerful and adds to the scalability."
"The feature I found most helpful in Azure Data Factory is the pipeline feature, including being able to connect to different sources. Azure Data Factory also has built-in security, which is another valuable feature."
"Data Flow and Databricks are going to be extremely valuable services, allowing data solutions to scale as the business grows and new data sources are added."
"It is a complete ETL Solution."
"Data Factory's most valuable feature is Copy Activity."
"I enjoy the ease of use for the backend JSON generator, the deployment solution, and the template management."
"The most valuable feature of Azure Data Factory is that it has a good combination of flexibility, fine-tuning, automation, and good monitoring."
"In terms of my personal experience, it works fine."
"The entire user interface is very simple and the simplicity of creating pipelines is something that I like very much about it. The design experience is very smooth."
"The UI is user-friendly, it doesn't require any technical know-how and we can navigate to social media or use it more easily."
"Important features include that it comprises lots of functionality to connect data from various sources through connector availability, scheduling pipelines at any time, and integration with third-party and security solutions for encryption."
"The most valuable would be the GUI platform that I saw. I first saw it at a special session that StreamSets provided towards the end of the summer. I saw the way you set it up and how you have different processes going on with your data. The design experience seemed to be pretty straightforward to me in terms of how you drag and drop these nodes and connect them with arrows."
"I really appreciate the numerous ready connectors available on both the source and target sides, the support for various media file formats, and the ease of configuring and managing pipelines centrally."
"The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customize it to do what you need. Many other tools have started to use features similar to those introduced by StreamSets, like automated workflows that are easy to set up."
"The scheduling within the data engineering pipeline is very much appreciated, and it has a wide range of connectors for connecting to any data sources like SQL Server, AWS, Azure, etc. We have used it with Kafka, Hadoop, and Azure Data Factory Datasets. Connecting to these systems with StreamSets is very easy."
"It's very easy to integrate. It integrates with Snowflake, AWS, Google Cloud, and Azure. It's very helpful for DevOps, DataOps, and data engineering because it provides a comprehensive solution, and it's not complicated."
 

Cons

"The user interface could use improvement. It's not a major issue but it's something that can be improved."
"There's no Oracle connector if you want to do transformation using data flow activity, so Azure Data Factory needs more connectors for data flow transformation."
"User-friendliness and user effectiveness are unquestionably important, and it may be a good option here to improve the user experience. However, I believe that more and more sophisticated monitoring would be beneficial."
"It can improve from the perspective of active logging. It can provide active logging information."
"When working with AWS, we have noticed that the difference between ADF and AWS is that AWS is more customer-focused. They're more responsive compared to any other company. ADF is not as good as AWS, but it should be. If AWS is ten out of ten, ADF is around eight out of ten. I think AWS is easier to understand from the GUI perspective compared to ADF."
"The setup and configuration process could be simplified."
"The support and the documentation can be improved."
"The Microsoft documentation is too complicated."
"The documentation is inadequate and has room for improvement because the technical support does not regularly update their documentation or the knowledge base."
"In terms of the product, I don't think there is any room for improvement because it is very good. One small area of improvement that is very much needed is on the knowledge base side. Sometimes, it is not very clear how to set up a certain process or a certain node for a person who's using the platform for the first time."
"The software is very good overall. Areas for improvement are the error logging and the version history. I would like to see better, more detailed error logging information."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
"There aren't enough hands-on labs, and debugging is also an issue because it takes a lot of time. Logs are not that clear when you are debugging, and you can only select a single source for a pipeline."
"I would like to see further improvement in the UI. In addition, upgrades are not automatic and they should be automated. Currently, we have to manually upgrade versions."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
"Visualization and monitoring need to be improved and refined."
 

Pricing and Cost Advice

"My company is on a monthly subscription for Azure Data Factory, but it's more of a pay-as-you-go model where your monthly invoice depends on how many resources you use. On a scale of one to five, pricing for Azure Data Factory is a four. It's just the usage fees my company pays monthly."
"The pricing is pay-as-you-go or reserve instance. Of the two options, reserve instance is much cheaper."
"I would not say that this product is overly expensive."
"I would rate Data Factory's pricing nine out of ten."
"The pricing model is based on usage and is not cheap."
"While I can't specify the actual cost, I believe it is reasonably priced and comparable to similar products."
"Azure Data Factory gives better value for the price than other solutions such as Informatica."
"I am aware of the pricing of Azure Data Factory, but I prefer not to disclose specific details."
"We use the free version. It's great for a public, free release. Our stance is that the paid support model is too expensive to get into. They should honestly reevaluate that."
"It's not expensive because you pay per month, and the tasks you can perform with it are huge. It's reliable and cost-effective."
"The overall cost for small and mid-size organizations needs to be better."
"The pricing is good, but not the best. They have some customized plans you can opt for."
"StreamSets is expensive, especially for small businesses."
"There are different versions of the product. One is the corporate license version, and the other one is the open-source or free version. I have been using the corporate license version, but they have recently launched a new open-source version so that anybody can create an account and use it. The licensing cost varies from customer to customer. I don't have a lot of input on that. It is taken care of by PMO, and they seem fine with its pricing model. It is being used enterprise-wide. They seem to have got a good deal for StreamSets."
"There are two editions, Professional and Enterprise, and there is a free trial. We're using the Professional edition and it is competitively priced."
"I believe the pricing is not equitable."
793,295 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Computer Software Company: 13%
Financial Services Firm: 13%
Manufacturing Company: 9%
Healthcare Company: 7%
Financial Services Firm: 17%
Computer Software Company: 13%
Manufacturing Company: 8%
Government: 7%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

How do you select the right cloud ETL tool?
AWS Glue and Azure Data Factory are the best-performing cloud services for ELT.
How does Azure Data Factory compare with Informatica PowerCenter?
Azure Data Factory is flexible, modular, and works well. In terms of cost, it is not too pricey. It offers the stability and reliability I am looking for, good scalability, and is easy to set up an...
How does Azure Data Factory compare with Informatica Cloud Data Integration?
Azure Data Factory is a solid product offering many transformation functions. It has pre-load and post-load transformations, allowing users to apply transformations either in code by using Power Q...
What do you like most about StreamSets?
The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customiz...
What needs improvement with StreamSets?
We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which ...
What is your primary use case for StreamSets?
StreamSets is used for data transformation rather than ETL processes. It focuses on transforming data directly from sources without handling the extraction part of the process. The transformed data...
 

Sample Customers

Adobe, BMW, Coca-Cola, General Electric, Johnson & Johnson, LinkedIn, Mastercard, Nestle, Pfizer, Samsung, Siemens, Toyota, Unilever, Verizon, Walmart, Accenture, American Express, AT&T, Bank of America, Cisco, Deloitte, ExxonMobil, Ford, General Motors, IBM, JPMorgan Chase, Microsoft (Azure Data Factory is developed by Microsoft), Oracle, Procter & Gamble, Salesforce, Shell, Visa
Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
Find out what your peers are saying about Azure Data Factory vs. StreamSets and other solutions. Updated: July 2024.