

Amazon EMR and Azure Data Factory compete in the data management solutions category. Amazon EMR seems to have the upper hand in user satisfaction, particularly in pricing, scalability, and support, although it is more expensive than Azure Data Factory.
Features: Amazon EMR offers significant scalability, integrates seamlessly with Hadoop clusters, and provides features like Spark and Hive ecosystem integration. It is designed for processing large data sets with minimal downtime. Azure Data Factory is known for robust data transformation capabilities, customizable workflows, and extensive pre-built connectors, making it ideal for ETL pipelines.
Room for Improvement: Amazon EMR could benefit from a simplified setup for newcomers, better monitoring and automation, and improved cost management. Azure Data Factory could enhance machine learning integrations, improve documentation, and streamline its user interface. Users also seek pricing transparency and support for real-time data processing.
Ease of Deployment and Customer Service: Both products are deployed on public and private clouds. Azure Data Factory offers hybrid cloud options for flexibility. Customer service for Amazon EMR receives praise for faster responses, while Azure Data Factory support has been noted as inconsistent, indicating room for improvement.
Pricing and ROI: Amazon EMR charges based on usage without licensing fees but can become costly due to infrastructure expenses, offering strong ROI for those transitioning from on-premise systems. Azure Data Factory uses a pay-as-you-go model, potentially becoming expensive with increased usage but generally cost-effective for smaller batches. Both solutions offer cost savings with different pricing structures and value realizations.
Our stakeholders and clients have expressed satisfaction with Azure Data Factory's efficiency and cost-effectiveness.
They help with billing, cost determination, IAM properties, security compliance, and deployment and migration activities.
We get call support, screen-sharing support, and immediate support, so there are no problems.
I would rate the technical support from Amazon as ten out of ten.
The technical support from Microsoft is rated an eight out of ten.
The technical support is responsive and helpful.
The technical support for Azure Data Factory is generally acceptable.
Scalability can be provisioned using the auto-scaling feature, EC2 instances, on-demand instances, and storage locations like block storage, S3, or file storage.
Azure Data Factory is highly scalable.
Regular updates, patch installations, monitoring, logging, alerting, and disaster recovery activities are crucial for maintaining stability.
The solution has a high level of stability, roughly a nine out of ten.
The cost factor differs significantly. When you run a Spark application on EKS, you run at the pod level, so you can control the compute cost. But in Amazon EMR, when you have to run one application, you have to launch an entire EC2 instance.
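The difference this reviewer describes can be sketched with simple arithmetic. The snippet below is a rough illustration only: every number in it is hypothetical, not actual AWS pricing, and it assumes that pod-level billing amounts to a proportional share of an instance's hourly price, which is a simplification.

```python
import math


def pod_level_cost(vcpus_needed, hours, price_per_vcpu_hour):
    """Cost when you pay only for the vCPUs the application's pods request."""
    return vcpus_needed * hours * price_per_vcpu_hour


def instance_level_cost(vcpus_needed, hours, vcpus_per_instance, price_per_instance_hour):
    """Cost when whole instances must be launched to cover the requested vCPUs."""
    instances = math.ceil(vcpus_needed / vcpus_per_instance)
    return instances * hours * price_per_instance_hour


# Hypothetical workload and prices, chosen only to make the arithmetic visible.
job_vcpus = 6                   # vCPUs the Spark application needs
job_hours = 2                   # how long the application runs
vcpus_per_instance = 16         # size of one hypothetical instance
price_per_instance_hour = 0.80  # hypothetical hourly price of that instance

# Assumption: a pod is charged its proportional share of the instance price.
price_per_vcpu_hour = price_per_instance_hour / vcpus_per_instance

pod_cost = pod_level_cost(job_vcpus, job_hours, price_per_vcpu_hour)
instance_cost = instance_level_cost(
    job_vcpus, job_hours, vcpus_per_instance, price_per_instance_hour
)

print(f"pod-level cost:      {pod_cost:.2f}")
print(f"instance-level cost: {instance_cost:.2f}")
```

Under these assumptions the gap comes entirely from the unused capacity of the whole instance, so it narrows as the application's needs approach the instance size.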
There is room for improvement with respect to retries, handling the volume of data on S3 buckets, cluster provisioning, scaling, termination, security, and integration between services like S3, Glue, Lake Formation, and DynamoDB.
I have thoughts on what would be great to see in the product, such as AI/ML features or additional options.
Incorporating more dedicated API sources to specific services like HubSpot CRM or Salesforce would be beneficial.
Sometimes, the compute fails to process data if there is a heavy load suddenly, and it doesn't scale up automatically.
There is a problem with the integration with third-party solutions, particularly with SAP.
Costs are involved based on cluster resources, data volumes, EC2 instances, instance sizes, Kubernetes, Docker services, storage, and data transfers.
I would rate the price for Amazon EMR, where one is high and ten is low, as a good one.
The pricing is cost-effective.
It is considered cost-effective.
Amazon EMR helps in scalability, real-time and batch processing of data, handling efficient data sources, and managing data lakes, data stores, and data marts on file systems and in S3 buckets.
Amazon EMR provides out-of-the-box functionality because we can deploy and get Spark functionality over Hadoop.
The features at Amazon EMR that I have found most valuable are fully customizable functions.
It connects to different sources out-of-the-box, making integration much easier.
The platform excels in handling major datasets, particularly when working with Power BI for reporting purposes.
The integration capability in Azure Data Factory is excellent. It has connectors for the major sources, so we can integrate data from different data sources and also perform basic transformations along the way, which is a great feature.

| Product | Market Share (%) |
|---|---|
| Azure Data Factory | 6.1% |
| Amazon EMR | 3.4% |
| Other | 90.5% |

| Company Size | Count |
|---|---|
| Small Business | 6 |
| Midsize Enterprise | 5 |
| Large Enterprise | 12 |

| Company Size | Count |
|---|---|
| Small Business | 31 |
| Midsize Enterprise | 19 |
| Large Enterprise | 57 |
Azure Data Factory efficiently manages and integrates data from various sources, enabling seamless movement and transformation across platforms. Its valuable features include seamless integration with Azure services, handling large data volumes, flexible transformation, user-friendly interface, extensive connectors, and scalability. Users have experienced improved team performance, workflow simplification, enhanced collaboration, streamlined processes, and boosted productivity.