
AWS Lambda vs Apache Spark vs Azure Stream Analytics comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Mindshare comparison

Chart not reproduced. Categories shown: Hadoop, Compute Service, Streaming Analytics.
 

Featured Reviews

Dunstan Matekenya - PeerSpot reviewer
Open-source solution for data processing with portability
Apache Spark is known for its ease of use. Compared to other available data processing frameworks, it is user-friendly. While many choices now exist, Spark remains easy to use, particularly with Python. You can utilize familiar programming styles similar to Pandas in Python, including object-oriented programming. Another advantage is its portability. I can prototype and perform some initial tasks on my laptop using Spark without needing to be on Databricks or any cloud platform. I can transfer it to Databricks or other platforms, such as AWS. This flexibility allows me to improve processing even on my laptop. For instance, if I'm processing large amounts of data and find my laptop becoming slow, I can quickly switch to Spark. It handles small and large datasets efficiently, making it a versatile tool for various data processing needs.
Andrew-Wong - PeerSpot reviewer
Convenience in deployment process with room for code preview improvement
Having a better preview would be helpful. Sometimes, if my Lambda code is too big, it can be inconvenient as I'm unable to see my code when it exceeds a certain size. AWS has a limit, like a three-megabyte limit, beyond which I cannot view or edit the code easily.
SantiagoCordero - PeerSpot reviewer
Native connectors and integration simplify tasks but portfolio complexity needs addressing
There are too many products in the Azure landscape, which sometimes leads to overlap between them. Microsoft continuously releases new products or solutions, which can be frustrating when determining the appropriate features from one solution over another. A cost comparison between products is also not straightforward. They should simplify their portfolio. The Microsoft licensing system is confusing and not easy to understand, and this is something they should address. In the future, I may stop using Stream Analytics and move to other solutions. I discussed Palantir earlier, which is something I want to explore in depth because it allows me to accomplish more efficiently compared to solely using Azure. Additionally, the vendors should make the solution more user-friendly, incorporating low-code and no-code features. This is something I wish to explore further.

Quotes from Members

 

Pros

"The data processing framework is good."
"The most valuable feature of Apache Spark is its memory processing because it processes data over RAM rather than disk, which is much more efficient and fast."
"DataFrame: Spark SQL gives the leverage to create applications more easily and with less coding effort."
"It is useful for handling large amounts of data. It is very useful for scientific purposes."
"With Spark, we parallelize our operations, efficiently accessing both historical and real-time data."
"The most valuable feature of this solution is its capacity for processing large amounts of data."
"The solution is very stable."
"Spark helps us reduce startup time for our customers and gives a very high ROI in the medium term."
"Because AWS Lambda is serverless, server configuration is not required, and we can run it directly anywhere."
"You can spin up anything instantly without any investment."
"The basic feature that I like is that there is no server installation. It also has good support for various languages, such as Java, .NET, C#, and Python."
"We moved our users into the Amazon Cognito pool, so it helps us to standardize our security practices, approaches, etc. We can customize Lambda for authentication to integrate it with API Gateway and other services."
"It's a serverless solution which is the best feature. It helps us because it offers free aspects. From the infrastructure perspective, it helps us manage costs. There is no overhead of estimating how much infrastructure we're going to need. We can focus on building the business functionality that we want to build."
"The stability is good."
"The initial setup of AWS Lambda is very straightforward and quick."
"AWS Lambda is interlinked with CloudWatch. When we have any errors we can directly go there and check the CloudWatch logs. Additionally, we can run it very fast and we can increase the RAM size and other components."
"I appreciate this solution because it leverages open-source technologies. It allows us to utilize the latest streaming solutions and it's easy to develop."
"The most valuable aspect is the SQL option that Azure Stream Analytics provides."
"Provides deep integration with other Azure resources."
"It was easy for me to use from the beginning. I am accustomed to working with Microsoft."
"The solution's technical support is good."
"I like the IoT part. We have mostly used Azure Stream Analytics services for it."
"The way it organizes data into tables and dashboards is very helpful."
"We find the query editor feature of this solution extremely valuable for our business."
 

Cons

"The solution’s integration with other platforms should be improved."
"When you first start using this solution, it is common to run into memory errors when you are dealing with large amounts of data."
"Apache Spark is very difficult to use. It would require a data engineer. It is not available for every engineer today because they need to understand the different concepts of Spark, which is very, very difficult and it is not easy to learn."
"We've had problems using a Python process to try to access something in a large volume of data. It crashes if somebody gives me the wrong code because it cannot handle a large volume of data."
"The main concern is the overhead of Java when distributed processing is not necessary."
"Apache Spark could improve the connectors that it supports. There are a lot of open-source databases in the market. For example, cloud databases, such as Redshift, Snowflake, and Synapse. Apache Spark should have connectors present to connect to these databases. There are a lot of workarounds required to connect to those databases, but it should have inbuilt connectors."
"It should support more programming languages."
"Its UI can be better. Maintaining the history server is a little cumbersome, and it should be improved. I had issues while looking at the historical tags, which sometimes created problems. You have to separately create a history server and run it. Such things can be made easier. Instead of separately installing the history server, it can be made a part of the whole setup so that whenever you set it up, it becomes available."
"We can write anything as code, but the solution will not give proper error information."
"AWS Lambda should support additional languages."
"The solution should continue to streamline integrations with AWS services."
"It could be cheaper."
"AWS Lambda is a bit difficult to set up if someone doesn't know how to code."
"Lambda functions have cold starts that can cause some delay."
"If it is a specific ETL process or a long-term one, then AWS Lambda is not a good option."
"The first time Lambda is started up, it takes some time to spin up an instance for serving the consumer requests. AWS has been trying to solve this in a variety of ways but has not yet managed to do so."
"Its features for event imports and architecture could be enhanced."
"Sometimes when we connect Power BI, there is a delay or it throws up some errors, so we're not sure."
"The solution’s customer support could be improved."
"There are too many products in the Azure landscape, which sometimes leads to overlap between them."
"We would like to have a centralized platform altogether, since we have different kinds of options for data ingestion. Sometimes it gets difficult to manage different platforms."
"The collection and analysis of historical data could be better."
"The solution's interface could be simpler to understand for non-technical people."
"The UI should be a little bit better from a usability perspective."
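
The cold-start complaints above refer to the delay while a new execution environment initializes. One commonly used mitigation is to do expensive setup once at module load so that warm invocations reuse it. The sketch below simulates that pattern locally. `load_config` and its contents are hypothetical stand-ins for real initialization work; only the `handler(event, context)` signature follows the Python Lambda handler convention.

```python
import time


def load_config() -> dict:
    """Hypothetical stand-in for slow setup (SDK clients, connections, models)."""
    time.sleep(0.05)  # simulate slow initialization
    return {"table": "example-table", "loaded_at": time.time()}


# Runs once per execution environment (the cold start), not once per request.
CONFIG = load_config()


def handler(event, context):
    """Entry point; reuses the module-level CONFIG on every invocation."""
    return {
        "statusCode": 200,
        "table": CONFIG["table"],
        "loaded_at": CONFIG["loaded_at"],
    }


if __name__ == "__main__":
    first = handler({}, None)
    second = handler({}, None)
    # Both calls see the same initialization timestamp.
    print(first["loaded_at"] == second["loaded_at"])
```

This does not remove the first-request delay; it only keeps later requests in the same environment from paying the setup cost again.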
 

Pricing and Cost Advice

"The product is expensive, considering the setup."
"The tool is an open-source product. If you're using the open-source Apache Spark, no fees are involved at any time. Charges only come into play when using it with other services like Databricks."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
"It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project."
"They provide an open-source license for the on-premise version."
"The cloud model can be expensive as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
"I did not pay anything when using the tool on cloud services, but I had to pay on the compute side. The tool is not expensive compared with the benefits it offers. I rate the price as an eight out of ten."
"Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
"We only need to pay for the compute time our code consumes."
"The pricing varies based on the specific solution you're implementing, and in comparison to the value it provides, the overall cost is reasonable."
"AWS Lambda is a very inexpensive solution. They charge for the number of times we run it. If you run AWS Lambda for one time, they charge around 50 cents or 25 cents for the use. I don't know the exact price, but it's less than a dollar."
"The price is expensive and is based on usage. The more users you have the higher the cost."
"We don't need to pay for licensing to use Lambda."
"Price-wise, AWS Lambda is very cheap. It's not free, but it's not that expensive."
"AWS Lambda is inexpensive."
"The cost is based on runtime."
"Azure Stream Analytics is a little bit expensive."
"When scaling up, the pricing for Azure Stream Analytics can get relatively high. Considering its capabilities compared to other solutions, I would rate it a seven out of ten for cost. However, we've found ways to optimize costs using tools like Databricks for specific tasks."
"We pay approximately $500,000 a year. It's approximately $10,000 a year per license."
"The product's price is at par with the other solutions provided by the other cloud service providers in the market."
"I rate the price of Azure Stream Analytics a four out of five."
"The current price is substantial."
"There are different tiers based on retention policies. There are four tiers. The pricing varies based on streaming units and tiers. The standard pricing is $10/hour."
"The licensing for this product is payable on a 'pay as you go' basis. This means that the cost is only based on data volume, and the frequency that the solution is used."
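
Several reviewers above describe AWS Lambda's cost as driven by the number of runs and the compute time consumed. A rough sketch of that arithmetic in Python follows. The two rates are assumptions for illustration only (actual prices vary by region, architecture, and free-tier allowances), and `estimate_lambda_cost` is a hypothetical helper, not an AWS API.

```python
# Assumed rates for illustration only; check current AWS pricing before relying on them.
PRICE_PER_MILLION_REQUESTS = 0.20   # USD, assumed
PRICE_PER_GB_SECOND = 0.0000166667  # USD, assumed


def estimate_lambda_cost(invocations: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimate cost in USD as a request charge plus a compute (GB-second) charge."""
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    # Compute usage is billed as configured memory (GB) times run time (seconds).
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    return request_cost + compute_cost


if __name__ == "__main__":
    cost = estimate_lambda_cost(invocations=1_000_000, avg_duration_ms=200, memory_mb=512)
    print(f"Estimated cost: ${cost:.2f}")
```

Under these assumed rates, one million 200 ms invocations at 512 MB come to under two dollars, so a single short run would cost a small fraction of a cent.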
864,574 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Apache Spark
Financial Services Firm: 26%
Computer Software Company: 10%
Comms Service Provider: 7%
Manufacturing Company: 7%

AWS Lambda
Financial Services Firm: 20%
Computer Software Company: 12%
Manufacturing Company: 8%
University: 6%

Azure Stream Analytics
Financial Services Firm: 15%
Computer Software Company: 14%
Manufacturing Company: 9%
Retailer: 7%
 

Company Size

By reviewers (chart values not reproduced): Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Apache Spark is open-source, so it doesn't incur any charges.
What needs improvement with Apache Spark?
There is complexity when it comes to understanding the whole ecosystem, especially for beginners. I find it quite com...
Which is better, AWS Lambda or Batch?
AWS Lambda is a serverless solution. It doesn’t require any infrastructure, which allows for cost savings. There is n...
What do you like most about AWS Lambda?
The tool scales automatically based on the number of incoming requests.
What is your experience regarding pricing and costs for AWS Lambda?
The pricing of AWS Lambda is reasonable. It's beneficial and cost-effective for users regardless of the number of ins...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analyti...
What is your experience regarding pricing and costs for Azure Stream Analytics?
The solution does not need any license; it comes with your subscription.
What needs improvement with Azure Stream Analytics?
It does not always give you the right reason or the correct reason. For example, if a service is stopped, it just tel...
 

Also Known As

Apache Spark: No data available
AWS Lambda: No data available
Azure Stream Analytics: ASA
 

Overview

 

Sample Customers

Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
AWS Lambda: Netflix
Azure Stream Analytics: Rockwell Automation, Milliman, Honeywell Building Solutions, Arcoflex Automation Solutions, Real Madrid C.F., Aerocrine, Ziosk, Tacoma Public Schools, P97 Networks
Find out what your peers are saying about Apache, Cloudera, Amazon Web Services (AWS) and others in Hadoop. Updated: July 2025.