
AWS Lambda vs Apache Spark comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on May 21, 2025

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Categories and Ranking

Apache Spark
Ranking in Compute Service: 4th
Average Rating: 8.4
Reviews Sentiment: 7.4
Number of Reviews: 66
Ranking in other categories: Hadoop (1st), Java Frameworks (2nd)

AWS Lambda
Ranking in Compute Service: 1st
Average Rating: 8.6
Reviews Sentiment: 7.3
Number of Reviews: 88
Ranking in other categories: None
 

Mindshare comparison

As of July 2025, Apache Spark holds an 11.5% mindshare in the Compute Service category, up from 11.1% a year earlier. AWS Lambda holds 20.7%, up from 19.5% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
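The year-over-year mindshare figures can be read either as percentage-point changes or as relative growth. A minimal sketch of that arithmetic, using only the four percentages quoted for the Compute Service category (the function and variable names are illustrative):

```python
def mindshare_change(previous: float, current: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    point_change = current - previous
    relative_change = point_change / previous * 100
    return round(point_change, 1), round(relative_change, 1)

# (previous year, current year) mindshare percentages from the comparison.
mindshare = {
    "Apache Spark": (11.1, 11.5),
    "AWS Lambda": (19.5, 20.7),
}

for product, (previous, current) in mindshare.items():
    points, relative = mindshare_change(previous, current)
    print(f"{product}: {points:+.1f} points ({relative:+.1f}% relative)")
```

On these numbers, AWS Lambda's gain is larger on both measures.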
 

Featured Reviews

Dunstan Matekenya - PeerSpot reviewer
Open-source solution for data processing with portability
Apache Spark is known for its ease of use. Compared to other available data processing frameworks, it is user-friendly. While many choices now exist, Spark remains easy to use, particularly with Python. You can utilize familiar programming styles similar to Pandas in Python, including object-oriented programming. Another advantage is its portability. I can prototype and perform some initial tasks on my laptop using Spark without needing to be on Databricks or any cloud platform. I can transfer it to Databricks or other platforms, such as AWS. This flexibility allows me to improve processing even on my laptop. For instance, if I'm processing large amounts of data and find my laptop becoming slow, I can quickly switch to Spark. It handles small and large datasets efficiently, making it a versatile tool for various data processing needs.
Andrew-Wong - PeerSpot reviewer
Convenience in deployment process with room for code preview improvement
Having a better preview would be helpful. Sometimes, if my Lambda code is too big, it can be inconvenient as I'm unable to see my code when it exceeds a certain size. AWS has a limit, like a three-megabyte limit, beyond which I cannot view or edit the code easily.

Quotes from Members

 

Pros

"There's a lot of functionality."
"One of Apache Spark's most valuable features is that it supports in-memory processing, the execution of jobs compared to traditional tools is very fast."
"Apache Spark provides a very high-quality implementation of distributed data processing."
"I feel the streaming is its best feature."
"AI libraries are the most valuable. They provide extensibility and usability. Spark has a lot of connectors, which is a very important and useful feature for AI. You need to connect a lot of points for AI, and you have to get data from those systems. Connectors are very wide in Spark. With a Spark cluster, you can get fast results, especially for AI."
"The main feature that we find valuable is that it is very fast."
"We use it for ETL purposes as well as for implementing the full transformation pipelines."
"The solution is very stable."
"They have the built-in IDE, so everything happens without integration issues."
"Amazon takes care of the scalability. That's the right way. It's automatic and it's fully managed. That's one benefit of Lambda."
"We use AWS Lambda because it provides a solution for our needs without requiring us to manage our infrastructure. With the tool, we only pay for the resources we use. Additionally, it is straightforward to implement and integrates with other services like API Gateway."
"The most valuable feature of AWS Lambda, from a conceptual point, is its functions. For example, it's mathematical templates into which you can write, and create your solution. You write small pieces of a solution under given parameters."
"It is easy to use."
"AWS Lambda is interlinked with CloudWatch. When we have any errors we can directly go there and check the CloudWatch logs. Additionally, we can run it very fast and we can increase the RAM size and other components."
"Lambda has improved our organization by making it possible to transform data."
"Thanks to this solution, we do not need to worry about hardware or resource utilization. It saves us time."
 

Cons

"The management tools could use improvement. Some of the debugging tools need some work as well. They need to be more descriptive."
"If you have a Spark session in the background, sometimes it's very hard to kill these sessions because of D allocation."
"The migration of data between different versions could be improved."
"I know there is always discussion about which language to write applications in and some people do love Scala. However, I don't like it."
"Spark could be improved by adding support for other open-source storage layers than Delta Lake."
"Dynamic DataFrame options are not yet available."
"The solution must improve its performance."
"The logging for the observability platform could be better."
"AWS Lambda's GUI could be improved with a twist or tweak in its look and feel to make it more impressive."
"The support team does not know how to implement and build the solution."
"Regarding layers, you need to manually zip and install them. This step needs practice, and you might need to do it three to four times to get a hang of it."
"The 60 seconds limitation with the consumption of the service is really restrictive for a service and the solution can be improved by eliminating that."
"The setup was pretty complex because there were many steps. For me, it was complex because I was somewhat new at it. It could be easier for someone who has done it a bunch of times. I just found that it was a very dense user experience. There's a lot going on during setup."
"It could be cheaper."
"We need to invest time in learning the tool's language variant. We have encountered instances of downtime as well."
"I would like the layers to have a bigger volume. I would like to be able to add more. I don't want to be limited by the layer."
 

Pricing and Cost Advice

"The tool is an open-source product. If you're using the open-source Apache Spark, no fees are involved at any time. Charges only come into play when using it with other services like Databricks."
"Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera."
"On the cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
"It is an open-source platform. We do not pay for its subscription."
"Licensing costs can vary. For instance, when purchasing a virtual machine, you're asked if you want to take advantage of the hybrid benefit or if you prefer the license costs to be included upfront by the cloud service provider, such as Azure. If you choose the hybrid benefit, it indicates you already possess a license for the operating system and wish to avoid additional charges for that specific VM in Azure. This approach allows for a reduction in licensing costs, charging only for the service and associated resources."
"Apache Spark is an open-source tool."
"They provide an open-source license for the on-premise version."
"Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
"The cost is based on runtime."
"AWS Lambda is cheap."
"AWS Lambda cost is pretty decent."
"This is a product that is pay-per-use, as opposed to a licensing fee."
"The price of the solution is reasonable and it is a pay-per-use model. It is very good for cost optimization."
"We don't need to pay for licensing to use Lambda."
"It computes by the cycle, and it's very cheap."
"AWS Lambda is a very inexpensive solution. They charge for the number of times we run it. If you run AWS Lambda for one time, they charge around 50 cents or 25 cents for the use. I don't know the exact price, but it's less than a dollar."
861,170 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 27%
Computer Software Company: 12%
Manufacturing Company: 7%
Comms Service Provider: 6%

Educational Organization: 42%
Financial Services Firm: 13%
Computer Software Company: 7%
Manufacturing Company: 6%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Apache Spark is open-source, so it doesn't incur any charges.
What needs improvement with Apache Spark?
There is complexity when it comes to understanding the whole ecosystem, especially for beginners. I find it quite complex to understand how a Spark job is initiated, the roles of driver nodes, work...
Which is better, AWS Lambda or Batch?
AWS Lambda is a serverless solution. It doesn’t require any infrastructure, which allows for cost savings. There is no setup process to deal with, as the entire solution is in the cloud. If you use...
What do you like most about AWS Lambda?
The tool scales automatically based on the number of incoming requests.
What is your experience regarding pricing and costs for AWS Lambda?
The pricing of AWS Lambda is reasonable. It's beneficial and cost-effective for users regardless of the number of instances used.
 

Overview

 

Sample Customers

Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
AWS Lambda: Netflix
Find out what your peers are saying about AWS Lambda vs. Apache Spark and other solutions. Updated: June 2025.