Amazon EMR vs Spark SQL comparison

Buyer's Guide
Amazon EMR vs. Spark SQL
May 2022
Find out what your peers are saying about Amazon EMR vs. Spark SQL and other solutions. Updated: May 2022.
607,127 professionals have used our research since 2012.
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
Amazon EMR:
  • "One of the valuable features about this solution is that it's managed services, so it's pretty stable, and scalable as much as you wish. It has all the necessary distributions. With some additional work, it's also possible to change to a Spark version with the latest version of EMR. It also has Hudi, so we are leveraging Apache Hudi on EMR for change data capture, so then it comes out-of-the-box in EMR."
  • "The initial setup is pretty straightforward."
  • "This is the best tool for hosts and it's really flexible and scalable."
  • "The solution is pretty simple to set up."
  • "We are using applications, such as Splunk, Livy, Hadoop, and Spark. We are using all of these applications in Amazon EMR and they're helping us a lot."

Spark SQL:
  • "Data validation and ease of use are the most valuable features."
  • "It is a stable solution."
  • "This solution is useful to leverage within a distributed ecosystem."

Cons
Amazon EMR:
  • "We don't have much control. If we have multiple users, if they want to scale up, the cost will go and increase and we don't know how we can restrict that price part."
  • "The most complicated thing is configuring to the cluster and ensure it's running correctly."
  • "Amazon EMR is continuously improving, but maybe something like CI/CD out-of-the-box or integration with Prometheus Grafana."
  • "The dashboard management could be better. Right now, it's lacking a bit."

Spark SQL:
  • "This solution could be improved by adding monitoring and integration for the EMR."
  • "Being a new user, I am not able to find out how to partition it correctly. I probably need more information or knowledge. In other database solutions, you can easily optimize all partitions. I haven't found a quicker way to do that in Spark SQL. It would be good if you don't need a partition here, and the system automatically partitions in the best way. They can also provide more educational resources for new users."
  • "There should be better integration with other solutions."

Pricing and Cost Advice
Amazon EMR:
  • "You don't need to pay for licensing on a yearly or monthly basis, you only pay for what you use, in terms of underlying instances."
  • "The cost of Amazon EMR is very high."

Spark SQL:
  • "The solution is open-sourced and free."
  • "There is no license or subscription for this solution."

    Questions from the Community
    Top Answer: The solution is pretty simple to set up.
    Top Answer: The price can get a bit high. We're looking for ways to reduce costs. Right now it costs us between $40,000 and $50,000 a month.
    Top Answer: The cost is increasing. We are looking into how we can optimize the cost part of EMR. We're doing a comparison between Cloudera running on AWS and running AWS EMR. We don't have much control. If we…
    Top Answer: This solution is useful to leverage within a distributed ecosystem.
    Top Answer: There is no license or subscription for this solution.
    Top Answer: This solution could be improved by adding monitoring and integration for the EMR.
    Ranking
    Amazon EMR: 3rd out of 22 in Hadoop
      • Views: 1,954
      • Comparisons: 1,640
      • Reviews: 5
      • Average Words per Review: 445
      • Rating: 7.0
    Spark SQL: 6th out of 22 in Hadoop
      • Views: 886
      • Comparisons: 322
      • Reviews: 3
      • Average Words per Review: 199
      • Rating: 7.3
    Also Known As
    Amazon Elastic MapReduce
    Overview
    Amazon Elastic MapReduce (Amazon EMR) is a web service that makes it easy to quickly and cost-effectively process vast amounts of data. Amazon EMR simplifies big data processing, providing a managed Hadoop framework that makes it easy, fast, and cost-effective for you to distribute and process vast amounts of your data across dynamically scalable Amazon EC2 instances.
    Spark SQL is a Spark module for structured data processing. Unlike the basic Spark RDD API, the interfaces provided by Spark SQL provide Spark with more information about the structure of both the data and the computation being performed. There are several ways to interact with Spark SQL including SQL and the Dataset API. When computing a result the same execution engine is used, independent of which API/language you are using to express the computation. This unification means that developers can easily switch back and forth between different APIs based on which provides the most natural way to express a given transformation.
    Sample Customers
    Amazon EMR: Yelp
    Spark SQL: UC Berkeley AMPLab, Amazon, Alibaba Taobao, Kenshoo, Hitachi Solutions
    Top Industries
    Amazon EMR (visitors reading reviews):
      • Computer Software Company: 26%
      • Media Company: 13%
      • Comms Service Provider: 12%
      • Financial Services Firm: 10%
    Spark SQL (visitors reading reviews):
      • Computer Software Company: 27%
      • Comms Service Provider: 23%
      • Financial Services Firm: 8%
      • Educational Organization: 6%
    Company Size
    Amazon EMR (reviewers):
      • Small Business: 44%
      • Midsize Enterprise: 22%
      • Large Enterprise: 33%
    Amazon EMR (visitors reading reviews):
      • Small Business: 14%
      • Midsize Enterprise: 11%
      • Large Enterprise: 75%
    Spark SQL (reviewers):
      • Small Business: 29%
      • Midsize Enterprise: 43%
      • Large Enterprise: 29%
    Spark SQL (visitors reading reviews):
      • Small Business: 19%
      • Midsize Enterprise: 14%
      • Large Enterprise: 67%

    Amazon EMR is ranked 3rd in Hadoop with 5 reviews, while Spark SQL is ranked 6th in Hadoop with 3 reviews. Amazon EMR is rated 7.0, while Spark SQL is rated 7.4. The top reviewer of Amazon EMR writes "Stable, scalable, and has all the necessary distributions". On the other hand, the top reviewer of Spark SQL writes "Useful tool within a distributed ecosystem". Amazon EMR is most compared with Cloudera Distribution for Hadoop, Snowflake, Apache Spark, Hortonworks Data Platform and Azure Data Factory, whereas Spark SQL is most compared with Apache Spark, Informatica Big Data Parser, IBM Db2 Big SQL, AtScale Adaptive Analytics (A3) and SAP HANA.

    We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.