
Azure Data Factory vs SAP Data Hub comparison

StreamSets: 5,196 views | 3,630 comparisons
Microsoft (Azure Data Factory): 32,083 views | 26,240 comparisons
SAP (SAP Data Hub): 3,769 views | 3,220 comparisons
Buyer's Guide
Data Integration Tools
July 2022
Find out what your peers are saying about Informatica, Microsoft, Talend and others in Data Integration Tools. Updated: July 2022.
622,645 professionals have used our research since 2012.
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
StreamSets:
  • "It is really easy to set up and the interface is easy to use."
  • "StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
  • "In StreamSets, everything is in one place."
  • "StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
  • "I have used Data Collector, Transformer, and Control Hub products from StreamSets. What I really like about these products is that they're very user-friendly. People who are not from a technological or core development background find it easy to get started and build data pipelines and connect to the databases. They would be comfortable like any technical person within a couple of weeks."

Azure Data Factory:
  • "In terms of my personal experience, it works fine."
  • "I enjoy the ease of use for the backend JSON generator, the deployment solution, and the template management."
  • "The solution is okay."
  • "The most valuable feature is the copy activity."
  • "Its integrability with the rest of the activities on Azure is most valuable."
  • "The security of the agent that is installed on-premises is very good."
  • "Allows more data between on-premises and cloud solutions"
  • "The trigger scheduling options are decently robust."

SAP Data Hub:
  • "The most valuable feature is the S/4HANA 1909 On-Premise"
  • "Its connection to on-premise products is the most valuable. We mostly use the on-premise connection, which is seamless. This is what we prefer in this solution over other solutions. We are using it the most for the orchestration where the data is coming from different categories. Its other features are very much similar to what they are giving us in open source. Their push-down approach is the most advantageous, where they push most of the processing on to the same data source. This means that they have a serverless kind of thing, and they don't process the data inside a product such as Data Hub. They process the data from where the data is coming out. If it is coming from HANA, to capture the data or process it for analytics, orchestration, or management, they go to the HANA database and give it out. They don't process it on Data Hub. This push-down approach increases the processing speed a little bit because the data is processed where it is sitting. That's the best part and an advantage. I have used another product where they used to capture the data first and then they used to process it and give it. In Data Hub, it is in reverse. They process it first and give it, and then they put their own manipulations. They lead in terms of business functions. No other solution has business functions already implemented to perform business analysis. They have a lot of prebuilt business functions for machine learning and orchestration, which we can use directly to get an analysis out from the existing data. Most of the data is sitting as enterprise data there. That's a major advantage that they have."

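The push-down approach praised in the SAP Data Hub review above can be illustrated with a small sketch. This is not SAP Data Hub code: it uses Python's built-in sqlite3 module as a stand-in for the source database, and the table, column names, and sample rows are hypothetical. It only contrasts pulling every row into the integration layer with asking the source to do the aggregation itself.

```python
import sqlite3


def make_source():
    """Create an in-memory table standing in for a source system (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10), ("south", 20), ("north", 5), ("south", 1)],
    )
    return conn


def totals_without_pushdown(conn):
    """Pull every row out of the source, then aggregate in the integration layer."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0) + amount
    return totals


def totals_with_pushdown(conn):
    """Let the source aggregate; only the small result set is transferred."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query).fetchall())


conn = make_source()
# Both strategies give the same answer; they differ in where the work happens.
assert totals_without_pushdown(conn) == totals_with_pushdown(conn)
print(totals_with_pushdown(conn))
```

With four rows the difference is invisible, but with large tables the second form moves one row per region instead of every record, which is the speed-up the reviewer attributes to processing data "where it is sitting".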

Cons
StreamSets:
  • "Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingestion information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
  • "We've seen a couple of cases where it appears to have a memory leak or a similar problem."
  • "The logging mechanism could be improved. If I am working on a pipeline, then create a job out of it and it is running, it will generate constant logs. So, the logging mechanism could be simplified. Now, it is a bit difficult to understand and filter the logs. It takes some time."
  • "If you use JDBC Lookup, for example, it generally takes a long time to process data."
  • "We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."

Azure Data Factory:
  • "It does not appear to be as rich as other ETL tools. It has very limited capabilities."
  • "My only problem is the seamless connectivity with various other databases, for example, SAP."
  • "The performance could be better. It would be better if Azure Data Factory could handle a higher load. I have heard that it can get overloaded, and it can't handle it."
  • "The one element of the solution that we have used and could be improved is the user interface."
  • "There should be a way that it can do switches, so if at any point in time I want to do some hybrid mode of making any data collections or ingestions, I can just click on a button."
  • "Data Factory's cost is too high."
  • "The need to work more on developing out-of-the-box connectors for other products like Oracle, AWS, and others."
  • "For some of the data, there were some issues with data mapping. Some of the error messages were a little bit foggy. There could be more of a quick start guide or some inline examples. The documentation could be better."

SAP Data Hub:
  • "In 2018, connecting it to outside sources, such as IoT products or IoT-enabled big data Hadoop, was a little complex. It was not smooth at the beginning. It was unstable. It took a lot of time for the initial data load. Sometimes, the connection broke, and we had to restart the process, which was a major issue, but they might have improved it now. It is very smooth with SAP HANA on-premise system, SAP Cloud Platform, and SAP Analytics Cloud. It could be because these are their own products, and they know how to integrate them. With Hadoop, they might have used open-source technologies, and that's why it was breaking at that time. They are providing less embedded integration because they want us to use their other products. For example, they don't want to go and remove SAP Analytics Cloud and put everything in Data Hub. They want us to use SAP Analytics Cloud somewhere else and not inside the Data Hub. On the integration part, it lacks real-time analytics, and it is slow. They should embed the SAP Analytics Cloud inside Data Hub or support some kind of analysis. They do provide some analysis, but it is not extensive. They are moreover open source. So, we need a lot of developers or data scientists to go in and implement Python algorithms. It would be better if they can provide their own existing algorithms and give some connections and drop-down menus to go and just configure those. It will make things really quick by increasing the embedded integrations. It will also improve the process efficiency and processing power. Its performance needs improvement. It is a little slow. It is not the best in the market, and there are other products that are much better than this. In terms of technology and performance, it is a little slow as compared to Microsoft and other data orchestration products. I haven't used other products, but I have read about those products, their settings, and the milliseconds that they do. In Azure Purview, they say that they can copy, manage, or transform the data within milliseconds. They say that they can transform 100 gigabytes of data within three to five seconds, which is something SAP cannot do. It generally takes a lot of time to process that much amount of data. However, I have never tested out Azure."
  • "Nowadays there are some inconsistencies in data bases, however, they upgrade and release the versions to market."


Pricing and Cost Advice
StreamSets:
  • "We are running the community version right now, which can be used free of charge."
  • "StreamSets Data Collector is open source. One can utilize the StreamSets Data Collector, but the Control Hub is the main repository where all the jobs are present. Everything happens in Control Hub."
  • "It has a CPU core-based licensing, which works for us and is quite good."
  • "There are different versions of the product. One is the corporate license version, and the other one is the open-source or free version. I have been using the corporate license version, but they have recently launched a new open-source version so that anybody can create an account and use it. The licensing cost varies from customer to customer. I don't have a lot of input on that. It is taken care of by PMO, and they seem fine with its pricing model. It is being used enterprise-wide. They seem to have got a good deal for StreamSets."

Azure Data Factory:
  • "The price you pay is determined by how much you use it."
  • "Understanding the pricing model for Data Factory is quite complex."
  • "I would not say that this product is overly expensive."
  • "The licensing is a pay-as-you-go model, where you pay for what you consume."
  • "Our licensing fees are approximately 15,000 ($150 USD) per month."
  • "The licensing cost is included in the Synapse."
  • "It's not particularly expensive."
  • "Product is priced at the market standard."

SAP Data Hub:
  • "The Cloud is very expensive, but SAP HANA previous service is okay."

    Questions from the Community
    Top Answer: It is really easy to set up and the interface is easy to use.
    Top Answer: We've seen a couple of cases where it appears to have a memory leak or a similar problem. It grows for a bit and then…
    Top Answer: We typically use it to transport our Oracle raw datasets up to Microsoft Azure, and then into SQL databases there.
    Top Answer: AWS Glue and Azure Data Factory for ELT best performance cloud services.
    Top Answer: Azure Data Factory is flexible, modular, and works well. In terms of cost, it is not too pricey. It offers the stability…
    Top Answer: Azure Data Factory is a solid product offering many transformation functions; it has pre-load and post-load…
    Top Answer: Its connection to on-premise products is the most valuable. We mostly use the on-premise connection, which is seamless…
    Top Answer: In 2018, connecting it to outside sources, such as IoT products or IoT-enabled big data Hadoop, was a little complex. It…
    Top Answer: It is built based on the IoT implementations that we have. We are capturing a lot of data for a warehouse and…
    Overview

    StreamSets offers an end-to-end data integration platform to build, run, monitor and manage smart data pipelines that deliver continuous data for DataOps, and power the modern data ecosystem and hybrid integration.

    Only StreamSets provides a single design experience for all design patterns for 10x greater developer productivity; smart data pipelines that are resilient to change for 80% fewer breakages; and a single pane of glass for managing and monitoring all pipelines across hybrid and cloud architectures to eliminate blind spots and control gaps.

    With StreamSets, you can deliver the continuous data that drives the connected enterprise.
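
    The "resilient to change" claim above corresponds to what reviewers call data drift handling: noticing when incoming records no longer match the expected schema and raising an alert before downstream pipelines break. The following is a minimal sketch of that idea only. It is not StreamSets code or its API; the function name, schema, and sample record are hypothetical.

```python
def detect_drift(expected, record):
    """Compare one record against an expected {field: type} schema.

    Returns the fields that were added, are missing, or changed type.
    """
    added = sorted(set(record) - set(expected))
    missing = sorted(set(expected) - set(record))
    retyped = sorted(
        name
        for name in set(expected) & set(record)
        if not isinstance(record[name], expected[name])
    )
    return {"added": added, "missing": missing, "retyped": retyped}


# Hypothetical schema the downstream models were built against.
expected_schema = {"id": int, "name": str, "amount": float}

# Hypothetical incoming record: a new column appears and "amount" arrives as text.
incoming = {"id": 7, "name": "widget", "amount": "12.50", "currency": "USD"}

report = detect_drift(expected_schema, incoming)
if any(report.values()):
    # In a real pipeline this would notify the owners of downstream jobs.
    print("drift alert:", report)
```

    A check like this lets the record still land in the data lake while flagging the change, which matches the workflow described in the StreamSets reviews.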

    Create, schedule, and manage your data integration at scale with Azure Data Factory - a hybrid data integration (ETL) service. Work with data wherever it lives, in the cloud or on-premises, with enterprise-grade security.

    The SAP® Data Hub solution enables sophisticated data operations management. It gives you the capability and flexibility to connect enterprise data and Big Data and gain a deep understanding of data and information processes across sources and systems throughout the distributed landscape. The unified solution provides visibility and control into data opportunities, integrating cloud and on-premise information and driving data agility and business value. Distributed processing power enables greater speed and efficiency.

    Sample Customers
    StreamSets: Availity, BT Group, Humana, Deluxe, GSK, RingCentral, IBM, Shell, SamTrans, State of Ohio, TalentFulfilled, TechBridge
    Azure Data Factory: Milliman, Pier 1 Imports, Rockwell Automation, Ziosk, Real Madrid
    SAP Data Hub: Kaeser Kompressoren, HARTMANN
    Top Industries
    VISITORS READING REVIEWS
    Computer Software Company: 17%
    Financial Services Firm: 14%
    Comms Service Provider: 13%
    Insurance Company: 8%
    REVIEWERS
    Computer Software Company: 33%
    Non Profit: 11%
    Insurance Company: 7%
    Manufacturing Company: 7%
    VISITORS READING REVIEWS
    Computer Software Company: 25%
    Comms Service Provider: 13%
    Financial Services Firm: 9%
    Energy/Utilities Company: 6%
    VISITORS READING REVIEWS
    Computer Software Company: 31%
    Comms Service Provider: 13%
    Manufacturing Company: 8%
    Government: 6%
    Company Size
    VISITORS READING REVIEWS
    Small Business: 15%
    Midsize Enterprise: 13%
    Large Enterprise: 72%
    REVIEWERS
    Small Business: 25%
    Midsize Enterprise: 21%
    Large Enterprise: 54%
    VISITORS READING REVIEWS
    Small Business: 16%
    Midsize Enterprise: 14%
    Large Enterprise: 70%
    VISITORS READING REVIEWS
    Small Business: 13%
    Midsize Enterprise: 12%
    Large Enterprise: 75%

    Azure Data Factory is ranked 2nd in Data Integration Tools with 34 reviews while SAP Data Hub is ranked 7th in Data Governance with 2 reviews. Azure Data Factory is rated 7.8, while SAP Data Hub is rated 8.0. The top reviewer of Azure Data Factory writes "There's the good, the bad and the ugly....unfortunately lots of ugly". On the other hand, the top reviewer of SAP Data Hub writes "Good push-down approach, on-premise connection, and integration with SAP products, but needs better performance and integration with other solutions". Azure Data Factory is most compared with Informatica PowerCenter, Informatica Cloud Data Integration, Alteryx Designer, Talend Open Studio and Microsoft Azure Synapse Analytics, whereas SAP Data Hub is most compared with SAP Data Services, Qlik Replicate, Palantir Foundry, SAP Process Orchestration and Talend Data Fabric.

    We monitor all Data Integration Tools reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.