Amazon EC2 Auto Scaling vs Apache Spark comparison

Amazon EC2 Auto Scaling: 3,148 views | 2,743 comparisons | 100% willing to recommend
Apache Spark: 3,093 views | 2,345 comparisons | 89% willing to recommend
Comparison Buyer's Guide
Executive Summary

We performed a comparison between Amazon EC2 Auto Scaling and Apache Spark based on real PeerSpot user reviews.

Find out in this report how the two Compute Service solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
To learn more, read our detailed Amazon EC2 Auto Scaling vs. Apache Spark Report (Updated: March 2024).
768,578 professionals have used our research since 2012.
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
Amazon EC2 Auto Scaling
  • "The solution is highly scalable."
  • "What we have found most valuable are the purchasing of usage at the time and small storage."
  • "The product's most valuable features are high availability and persistence."
  • "The solution includes many features for configuring networks and VPCs."
  • "The feature I found most valuable was the vertical and horizontal scaling."
  • "Most of what I've deployed are CI/CD pipelines. AWS is scalable. You can always increase or adjust the resources to meet the specific requirements. I also like choosing an instance in any location, preferably the closest one. We don't have any AWS locations in South Africa, but the latency is about the same as hosting in Europe."
  • "It has the best auto-scaling features."
  • "The initial setup of Amazon EC2 Auto Scaling is easy... Since we are an enterprise-sized company and a client of Amazon, the response from the technical support team was immediate."

Apache Spark
  • "The most valuable feature of Apache Spark is its ease of use."
  • "It is highly scalable, allowing you to efficiently work with extensive datasets that might be problematic to handle using traditional tools that are memory-constrained."
  • "The scalability has been the most valuable aspect of the solution."
  • "It provides a scalable machine learning library."
  • "I feel the streaming is its best feature."
  • "One of the key features is that Apache Spark is a distributed computing framework. You can help multiple slaves and distribute the workload between them."
  • "I like that it can handle multiple tasks parallelly. I also like the automation feature. JavaScript also helps with the parallel streaming of the library."
  • "I appreciate everything about the solution, not just one or two specific features. The solution is highly stable. I rate it a perfect ten. The solution is highly scalable. I rate it a perfect ten. The initial setup was straightforward. I recommend using the solution. Overall, I rate the solution a perfect ten."

Cons
Amazon EC2 Auto Scaling
  • "The solution's pricing is expensive. You pay based on how much you use it, like paying for the time or hours you use the service. There's no need to buy hardware separately."
  • "The product's setup is complex for an intermediate user."
  • "The pricing could be reduced."
  • "There is room for improvement in the scalability."
  • "We would like to see improvement in the UI for this solution, so that it is more user-friendly."
  • "When creating a new instance there is a set of questions that have to be answered, and this is something that can be simplified."
  • "The tool must provide proper guidelines to troubleshoot connectivity issues."
  • "The licensing cost is expensive."

Apache Spark
  • "Apache Spark provides very good performance. The tuning phase is still tricky."
  • "Apache Spark should add some resource management improvements to the algorithms."
  • "The logging for the observability platform could be better."
  • "There were some problems related to the product's compatibility with a few Python libraries."
  • "If you have a Spark session in the background, sometimes it's very hard to kill these sessions because of D allocation."
  • "Apache Spark could improve the connectors that it supports. There are a lot of open-source databases in the market. For example, cloud databases, such as Redshift, Snowflake, and Synapse. Apache Spark should have connectors present to connect to these databases. There are a lot of workarounds required to connect to those databases, but it should have inbuilt connectors."
  • "Apache Spark could potentially improve in terms of user-friendliness, particularly for individuals with a SQL background. While it's suitable for those with programming knowledge, making it more accessible to those without extensive programming skills could be beneficial."
  • "Spark could be improved by adding support for other open-source storage layers than Delta Lake."

Pricing and Cost Advice
Amazon EC2 Auto Scaling
  • "Pricing could be a little bit more competitive."
  • "The pricing is not fixed and it is based on usage."
  • "The price of this product could be a little bit lower."
  • "Licensing fees are paid on a yearly basis."
  • "I have not explored the price of the solution extensively, but from what I have seen the price is alright."
  • "When we want to use more services, we need to pay more. It's a monthly subscription, rather than licensed-based. Pricing or fees for Amazon EC2 Auto Scaling could be improved."
  • "The solution's pricing varies by service region and is mid-range."
  • "Amazon EC2 Auto Scaling uses a pay-as-you-go pricing model."

Apache Spark
  • "Since we are using the Apache Spark version, not the Databricks version, it is an Apache license version, the support and resolution of the bug are actually late or delayed. The Apache license is free."
  • "Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera."
  • "We are using the free version of the solution."
  • "Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
  • "Apache Spark is an expensive solution."
  • "Spark is an open-source solution, so there are no licensing costs."
  • "On the cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
  • "It is an open-source solution, it is free of charge."

    Questions from the Community
    Top Answer: The tool must provide proper guidelines to troubleshoot connectivity issues. It must also improve AMI creation.
    Top Answer: We use Spark to process data from different data sources.
    Top Answer: In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, and do the transformation in a subsecond.
    Ranking
    Amazon EC2 Auto Scaling: 2nd out of 16 in Compute Service
      Views: 3,148 | Comparisons: 2,743 | Reviews: 31 | Average Words per Review: 324 | Rating: 9.0
    Apache Spark: 5th out of 16 in Compute Service
      Views: 3,093 | Comparisons: 2,345 | Reviews: 25 | Average Words per Review: 432 | Rating: 8.7
    Also Known As
    AWS RAM
    Overview

    Amazon EC2 Auto Scaling helps you maintain application availability and allows you to automatically add or remove EC2 instances according to conditions you define. ... Dynamic scaling responds to changing demand and predictive scaling automatically schedules the right number of EC2 instances based on predicted demand.
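The "conditions you define" for dynamic scaling can be pictured as a rule that maps an observed metric to an instance count. The following is a minimal Python sketch of one such proportional rule. It is an illustration only, not AWS's actual algorithm or API: the function name `desired_capacity`, its parameters, and the example figures are all hypothetical.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Illustrative proportional scaling rule (an assumption, not AWS's algorithm).

    current  -- number of instances currently running
    metric   -- observed average load per instance (e.g. CPU percent)
    target   -- load per instance the group should settle at
    min_size -- lower bound on the group size
    max_size -- upper bound on the group size
    """
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")

    # Scale the group in proportion to how far the metric is from the target,
    # rounding up so the group is never left under-provisioned.
    proposed = math.ceil(current * metric / target)

    # Clamp the proposal to the configured group bounds.
    return max(min_size, min(max_size, proposed))
```

For example, under these assumptions a group of 4 instances averaging 90% CPU against a 60% target would be scaled out to 6, while the same group at 30% CPU would be scaled in to 2.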

    Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines, that is maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
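The working-set idea can be sketched in plain Python. This is a single-machine analogy and does not use the Spark API: the data is parsed once into a read-only in-memory collection, and several map/reduce-style computations then reuse it without going back to disk between steps. The function names `load_working_set`, `word_counts`, and `longest_record` are hypothetical.

```python
from collections import Counter
from functools import reduce
from typing import Iterable, Tuple

Record = Tuple[str, ...]


def load_working_set(lines: Iterable[str]) -> Tuple[Record, ...]:
    """Parse the input once into an immutable, in-memory collection.

    Tuples stand in for the read-only nature of an RDD; there is no
    distribution or fault tolerance in this sketch.
    """
    return tuple(tuple(line.lower().split()) for line in lines)


def word_counts(working_set: Tuple[Record, ...]) -> Counter:
    """Map each record to a partial count, then reduce the partials."""
    partials = map(Counter, working_set)                      # map step
    return reduce(lambda a, b: a + b, partials, Counter())    # reduce step


def longest_record(working_set: Tuple[Record, ...]) -> Record:
    """A second computation that reuses the same working set (non-empty input)."""
    return max(working_set, key=len)
```

Both `word_counts` and `longest_record` read from the same parsed collection, which is the point of contrast with the linear MapReduce flow described above, where each job would read its input from disk and write its result back.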

    Sample Customers
    Amazon EC2 Auto Scaling: Expedia, Intuit, Royal Dutch Shell, Brooks Brothers
    Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
    Top Industries
    Amazon EC2 Auto Scaling
      Reviewers: Computer Software Company 44%, Financial Services Firm 16%, Comms Service Provider 8%, Media Company 4%
      Visitors reading reviews: Financial Services Firm 22%, Computer Software Company 14%, University 8%, Government 7%
    Apache Spark
      Reviewers: Computer Software Company 30%, Financial Services Firm 15%, University 9%, Marketing Services Firm 6%
      Visitors reading reviews: Financial Services Firm 24%, Computer Software Company 13%, Manufacturing Company 7%, Comms Service Provider 6%
    Company Size
    Amazon EC2 Auto Scaling
      Reviewers: Small Business 33%, Midsize Enterprise 15%, Large Enterprise 53%
      Visitors reading reviews: Small Business 25%, Midsize Enterprise 10%, Large Enterprise 65%
    Apache Spark
      Reviewers: Small Business 40%, Midsize Enterprise 19%, Large Enterprise 40%
      Visitors reading reviews: Small Business 17%, Midsize Enterprise 12%, Large Enterprise 71%

    Amazon EC2 Auto Scaling is ranked 2nd in Compute Service with 37 reviews, while Apache Spark is ranked 5th in Compute Service with 60 reviews. Amazon EC2 Auto Scaling is rated 8.8, while Apache Spark is rated 8.4. The top reviewer of Amazon EC2 Auto Scaling writes "Well-documented setup process and highly stable solution". On the other hand, the top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". Amazon EC2 Auto Scaling is most compared with AWS Fargate, AWS Lambda, AWS Batch, Amazon Elastic Inference, and Oracle Compute Cloud Service, whereas Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA, and Cloudera Distribution for Hadoop.


    We monitor all Compute Service reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.