"One of the valuable features about this solution is that it's managed services, so it's pretty stable, and scalable as much as you wish. It has all the necessary distributions. With some additional work, it's also possible to change to a Spark version with the latest version of EMR. It also has Hudi, so we are leveraging Apache Hudi on EMR for change data capture, so then it comes out-of-the-box in EMR."
"The initial setup is pretty straightforward."
"This is the best tool for hosts and it's really flexible and scalable."
"The solution is pretty simple to set up."
"We are using applications, such as Splunk, Livy, Hadoop, and Spark. We are using all of these applications in Amazon EMR and they're helping us a lot."
"Data validation and ease of use are the most valuable features."
"It is a stable solution."
"This solution is useful to leverage within a distributed ecosystem."
"We don't have much control. If we have multiple users, if they want to scale up, the cost will go and increase and we don't know how we can restrict that price part."
"The most complicated thing is configuring to the cluster and ensure it's running correctly."
"Amazon EMR is continuously improving, but maybe something like CI/CD out-of-the-box or integration with Prometheus Grafana."
"The dashboard management could be better. Right now, it's lacking a bit."
"This solution could be improved by adding monitoring and integration for the EMR."
"Being a new user, I am not able to find out how to partition it correctly. I probably need more information or knowledge. In other database solutions, you can easily optimize all partitions. I haven't found a quicker way to do that in Spark SQL. It would be good if you don't need a partition here, and the system automatically partitions in the best way. They can also provide more educational resources for new users."
"There should be better integration with other solutions."
Amazon EMR is ranked 3rd in Hadoop with 5 reviews, while Spark SQL is ranked 6th in Hadoop with 3 reviews. Amazon EMR is rated 7.0, while Spark SQL is rated 7.4. The top reviewer of Amazon EMR writes "Stable, scalable, and has all the necessary distributions". On the other hand, the top reviewer of Spark SQL writes "Useful tool within a distributed ecosystem". Amazon EMR is most compared with Cloudera Distribution for Hadoop, Snowflake, Apache Spark, Hortonworks Data Platform and Azure Data Factory, whereas Spark SQL is most compared with Apache Spark, Informatica Big Data Parser, IBM Db2 Big SQL, AtScale Adaptive Analytics (A3) and SAP HANA.