We performed a comparison between Amazon EMR and Apache Hadoop based on real PeerSpot user reviews.
Find out in this report how the two Cloud Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The initial setup is straightforward."
"Amazon EMR's most valuable features are processing speed and data storage capacity."
"The solution is pretty simple to set up."
"Amazon EMR is a good solution that can be used to manage big data."
"The solution is scalable."
"It allows users to access the data through a web interface."
"The solution helps us manage huge volumes of data."
"The ability to resize the cluster is what really makes it stand out over other Hadoop and big data solutions."
"The solution is easy to expand. We haven't seen any issues with it in that sense. We've added 10 servers, and we've added two nodes. We've been expanding since we started using it since we started out so small. Companies that need to scale shouldn't have a problem doing so."
"Its integration is Hadoop's best feature because that allows us to support different tools in a big data platform."
"We selected Apache Hadoop because it is not dependent on third-party vendors."
"The most important feature is its ability to handle large volumes. Some of our customers have really large volumes, and it is capable of handling their data in terms of the core volume and daily incremental volume. So, its processing power and speed are most valuable."
"High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization."
"The most valuable feature is scalability and the possibility to work with major information and open source capability."
"As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time, and is relatively fast when reading data into other platforms like R."
"The scalability of Apache Hadoop is very good."
"The product's features for storing data in static clusters could be better."
"Amazon EMR can improve by adding some features, such as megastore services and HiveServer2. Additionally, the user interface could be better, similar to what Apache service provides, cross-platform services."
"The problem for us is it starts very slow."
"There is no need to pay extra for third-party software."
"The dashboard management could be better. Right now, it's lacking a bit."
"There is room for improvement in pricing."
"The initial setup was time-consuming."
"As people are shifting from legacy solutions to other technologies, Amazon EMR needs to add more features that give more flexibility in managing user data."
"I mentioned it definitely, and this is probably the only feature we can improve a little bit because the terminal and coding screen on Hadoop is a little outdated, and it looks like the old C++ bio screen. If the UI and UX can be improved slightly, I believe it will go a long way toward increasing adoption and effectiveness."
"The load optimization capabilities of the product are an area of concern where improvements are required."
"It requires a great deal of learning curve to understand. The overall Hadoop ecosystem has a large number of sub-products. There is ZooKeeper, and there are a whole lot of other things that are connected. In many cases, their functionalities are overlapping, and for a newcomer or our clients, it is very difficult to decide which of them to buy and which of them they don't really need. They require a consulting organization for it, which is good for organizations such as ours because that's what we do, but it is not easy for the end customers to gain so much knowledge and optimally use it."
"What could be improved in Apache Hadoop is its user-friendliness. It's not that user-friendly, but maybe it's because I'm new to it. Sometimes it feels so tough to use, but it could be because of two aspects: one is my incompetency, for example, I don't know about all the features of Apache Hadoop, or maybe it's because of the limitations of the platform. For example, my team is maintaining the business glossary in Apache Atlas, but if you want to change any settings at the GUI level, an advanced level of coding or programming needs to be done in the back end, so it's not user-friendly."
"The stability of the solution needs improvement."
"The solution could use a better user interface. It needs a more effective GUI in order to create a better user environment."
"It would be good to have more advanced analytics tools."
"The integration with Apache Hadoop with lots of different techniques within your business can be a challenge."
Amazon EMR is ranked 9th in Cloud Data Warehouse with 20 reviews, while Apache Hadoop is ranked 5th in Data Warehouse with 32 reviews. Both Amazon EMR and Apache Hadoop are rated 7.8. The top reviewer of Amazon EMR writes "Provides efficient data processing features and has good scalability". On the other hand, the top reviewer of Apache Hadoop writes "A file system for data collection that contains needed information and files". Amazon EMR is most compared with Snowflake, Cloudera Distribution for Hadoop, Azure Data Factory, Amazon Redshift, and Apache Spark, whereas Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Snowflake, and Teradata.
We monitor all Cloud Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.