Developer at a tech company with 51-200 employees
A highly scalable, reliable, and cost-effective data processing and number crunching platform
What is most valuable?
- A highly scalable platform to run vast amounts of data processing with very high efficiency.
- Can be set to auto-scale up/down, as and when required.
- Highly reliable platform with almost no downtime.
- Cost-effective.
- Can be used through either a web management console or a web services API.
- The web console makes it very easy to run simple jobs.
- Can be very easily integrated with Hadoop Clusters and HDFS distributed file systems.
- Uses Amazon's own Elastic Compute Cloud (Amazon EC2) and Simple Storage Service (Amazon S3) to provide dynamic cloud compute and storage.
What needs improvement?
- Setting up jobs for log file analysis, financial analysis, etc. is harder than setting up jobs for operations like data mining, web indexing, and machine learning.
- For novice users, there is a bit of a steep learning curve, but things become much easier once you have the basics under your belt.
- Good web support is lacking. Though the web interface looks pretty decent, some basic features are missing. For example, you will find it a bit difficult to customize a particular MapReduce task when a given web indexing job needs a lot of customization, because this involves extensive use of the underlying HDFS.
What other advice do I have?
I've been using Amazon Elastic MapReduce for more than a year and have found it to be a very useful tool. I was a bit hesitant to try it when I first started, but previous experience with a similar tool helped me grasp things faster.
The bottom line is that if you have used some similar tools in the past, you are good to go. And, if you are new to the concept of a distributed task structure, it would be wise to spend a couple of minutes to get yourself acquainted with the MapReduce technology. This is my personal experience.
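For anyone new to the concept, here is a minimal sketch of the MapReduce idea in plain Python, using a word count as the example. It is only an illustration of the map, shuffle, and reduce steps on a single machine; it does not use Amazon EMR or Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical ones I made up for this sketch.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

On a real cluster, the map and reduce steps run in parallel across many nodes and the shuffle moves data between them, but the division of work is roughly the same.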
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Buyer's Guide
Download our free Amazon EMR Report and get advice and tips from experienced pros
sharing their opinions.
Updated: December 2025
Popular Comparisons
Databricks
Azure Data Factory
Teradata
Snowflake
Dremio
Apache Spark
Microsoft Azure Synapse Analytics
Amazon Redshift
Cloudera Distribution for Hadoop
AWS Lake Formation
HPE Data Fabric
Spark SQL
Qubole Data Services
Very helpful.