
AWS Lambda vs Apache Spark vs Azure Stream Analytics comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Mindshare comparison

[Chart: mindshare by category (Hadoop, Compute Service, Streaming Analytics)]
 

Featured Reviews

Dunstan Matekenya - PeerSpot reviewer
Open-source solution for data processing with portability
Apache Spark is known for its ease of use. Compared to other available data processing frameworks, it is user-friendly. While many choices now exist, Spark remains easy to use, particularly with Python. You can utilize familiar programming styles similar to Pandas in Python, including object-oriented programming. Another advantage is its portability. I can prototype and perform some initial tasks on my laptop using Spark without needing to be on Databricks or any cloud platform. I can transfer it to Databricks or other platforms, such as AWS. This flexibility allows me to improve processing even on my laptop. For instance, if I'm processing large amounts of data and find my laptop becoming slow, I can quickly switch to Spark. It handles small and large datasets efficiently, making it a versatile tool for various data processing needs.
Andrew-Wong - PeerSpot reviewer
Convenience in deployment process with room for code preview improvement
Having a better preview would be helpful. If my Lambda code is too big, I am unable to see it once it exceeds a certain size, which is inconvenient. AWS has a limit of around three megabytes, beyond which I cannot easily view or edit the code.
SantiagoCordero - PeerSpot reviewer
Native connectors and integration simplify tasks but portfolio complexity needs addressing
There are too many products in the Azure landscape, which sometimes leads to overlap between them. Microsoft continuously releases new products or solutions, which can be frustrating when choosing between the features of one solution and another. A cost comparison between products is also not straightforward. They should simplify their portfolio. The Microsoft licensing system is confusing and not easy to understand, and this is something they should address. In the future, I may stop using Stream Analytics and move to other solutions. I discussed Palantir earlier, which is something I want to explore in depth because it allows me to accomplish more, and more efficiently, than using Azure alone. Additionally, the vendors should make the solution more user-friendly, incorporating low-code and no-code features. This is something I wish to explore further.

Quotes from Members

 

Pros

"With Hadoop-related technologies, we can distribute the workload across multiple commodity machines."
"Spark is used for transformations on large volumes of data, and its distributed processing is useful."
"ETL and streaming capabilities."
"The distribution of tasks, like the seamless map-reduce functionality, is quite impressive."
"I found the solution stable. We haven't had any problems with it."
"The data processing framework is good."
"The product is useful for analytics."
"The most valuable feature of Apache Spark is its in-memory processing: it processes data in RAM rather than on disk, which is much faster and more efficient."
"The ability to scale up and down very quickly helps because we can maintain our system performance and business at a low cost."
"The solution is designed very well. You don't need to keep a server up. You just need some router to route your API request and Lambda provides a very well-designed feature to process the request."
"We use AWS Lambda because it provides a solution for our needs without requiring us to manage our infrastructure. With the tool, we only pay for the resources we use. Additionally, it is straightforward to implement and integrates with other services like API Gateway."
"The installation and configuration of the solution is straightforward."
"The support from AWS Lambda is very good, they are responsive."
"Lambda is the preferred compute option because of on-demand cost. We don't have to provision any hardware beforehand. We don't have to provision the capacity required for the services because it is serverless."
"AWS Lambda is itself serverless, and it is connected to the API gateway, and you can directly call the API through the API gateway and connect through AWS Lambda."
"It is my preferred product, as it provides me with source code within the solution."
"The most valuable features of Azure Stream Analytics are the ease of provisioning and an interface that is not terribly complex."
"The solution's technical support is good."
"The solution has a lot of functionality that can be pushed out to companies."
"The way it organizes data into tables and dashboards is very helpful."
"Technical support is pretty helpful."
"It's a product that can scale."
"I like the IoT part. We have mostly used Azure Stream Analytics services for it."
"It provides the capability to streamline multiple output components."
 

Cons

"In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, do the transformation in a subsecond, and all that."
"The solution needs to optimize shuffling between workers."
"The graphical user interface (UI) could be a bit more clear. It's very hard to figure out the execution logs and understand how long it takes to send everything. If an execution is lost, it's not so easy to understand why or where it went. I have to manually drill down on the data processes which takes a lot of time. Maybe there could be like a metrics monitor, or maybe the whole log analysis could be improved to make it easier to understand and navigate."
"There were some problems related to the product's compatibility with a few Python libraries."
"It needs a new interface and a better way to get some data. In terms of writing our scripts, some processes could be faster."
"Technical expertise from an engineer is required to deploy and run high-tech tools, like Informatica, on Apache Spark, making it an area where improvements are required to make the process easier for users."
"The setup I worked on was really complex."
"At times during the deployment process, the tool goes down, making it look less robust. To take care of the issues in the deployment process, users need to do manual interventions occasionally."
"They should work on the solution's stability and pricing."
"It could be cheaper."
"We can write anything as code, but the solution will not give proper error information."
"The first time Lambda is started up, it takes some time to spin up an instance for serving the consumer requests. AWS has been trying to solve this in a variety of ways but has not yet managed to do so."
"AWS Lambda should improve its compatibility with the language used to write the code."
"AWS Lambda needs to improve its stability."
"The 60 seconds limitation with the consumption of the service is really restrictive for a service and the solution can be improved by eliminating that."
"Memory limitation is one of the weaknesses of AWS Lambda and as a result, we have to use several Lambda functions instead of just one. Recently, I met with an Amazon employee, who is responsible for Lambda as a product. It appears Amazon has some plans with Lambda, so I don’t have to add something to the additional features."
"Azure Stream Analytics is challenging to customize because it's not very flexible."
"The only challenge was that the streaming analytics area in Azure Stream Analytics could not meet our company's expectations, making it a component where improvements are required."
"It is not complex, but it requires some development skills. When the data is sent from Azure Stream Analytics to Power BI, I don't have access to modify the data. I can't customize or edit the data or do some queries. All queries need to be done in the Azure Stream Analytics."
"I would like to have a contact individual at Microsoft."
"Its features for event imports and architecture could be enhanced."
"The solution offers a free trial; however, it is too short."
"The solution doesn't handle large data packets very efficiently, which could be improved upon."
"There were challenges with Azure Stream Analytics. When I initially started, the learning curve was difficult because I didn't have knowledge of the service."
 

Pricing and Cost Advice

"Apache Spark is an open-source tool."
"The tool is an open-source product. If you're using the open-source Apache Spark, no fees are involved at any time. Charges only come into play when using it with other services like Databricks."
"Apache Spark is an expensive solution."
"The cloud model can be expensive, as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
"Since we are using the Apache-licensed version of Spark rather than the Databricks version, support and bug resolution are often delayed. The Apache license is free."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
"I did not pay anything when using the tool on cloud services, but I had to pay on the compute side. The tool is not expensive compared with the benefits it offers. I rate the price as an eight out of ten."
"Considering the product version used in my company, I feel that the tool is not costly since the product is available for free."
"The pricing varies based on the specific solution you're implementing, and in comparison to the value it provides, the overall cost is reasonable."
"I would rate the tool’s pricing a nine out of ten. The solution’s pricing works on a pay-as-you-go basis."
"This is a product that is pay-per-use, as opposed to a licensing fee."
"The fees are volume-based."
"AWS Lambda is inexpensive."
"The price of AWS Lambda is very low."
"The solution is part of the AWS subscription model that is paid annually."
"AWS Lambda is not expensive for micro testing but is expensive if used for long deployment or long services."
"There are different tiers based on retention policies. There are four tiers. The pricing varies based on streaming units and tiers. The standard pricing is $10/hour."
"When scaling up, the pricing for Azure Stream Analytics can get relatively high. Considering its capabilities compared to other solutions, I would rate it a seven out of ten for cost. However, we've found ways to optimize costs using tools like Databricks for specific tasks."
"I rate the price of Azure Stream Analytics a four out of five."
"Azure Stream Analytics is a little bit expensive."
"The licensing for this product is payable on a 'pay as you go' basis. This means that the cost is only based on data volume, and the frequency that the solution is used."
"The current price is substantial."
"The product's price is at par with the other solutions provided by the other cloud service providers in the market."
"We pay approximately $500,000 a year. It's approximately $10,000 a year per license."
856,873 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Apache Spark
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 7%
Comms Service Provider: 6%

AWS Lambda
Educational Organization: 57%
Financial Services Firm: 10%
Computer Software Company: 6%
Manufacturing Company: 4%

Azure Stream Analytics
Computer Software Company: 15%
Financial Services Firm: 15%
Manufacturing Company: 9%
Retailer: 6%
 

Company Size

By reviewers
[Chart: reviewers by company size (Large Enterprise, Midsize Enterprise, Small Business)]
 

Questions from the Community

What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Apache Spark is open-source, so it doesn't incur any charges.
What needs improvement with Apache Spark?
There is complexity when it comes to understanding the whole ecosystem, especially for beginners. I find it quite com...
Which is better, AWS Lambda or Batch?
AWS Lambda is a serverless solution. It doesn’t require any infrastructure, which allows for cost savings. There is n...
What do you like most about AWS Lambda?
The tool scales automatically based on the number of incoming requests.
What is your experience regarding pricing and costs for AWS Lambda?
The pricing of AWS Lambda is reasonable. It's beneficial and cost-effective for users regardless of the number of ins...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analyti...
What is your experience regarding pricing and costs for Azure Stream Analytics?
I have no problem with pricing. We sell the data analytics value and operational value to customers, focusing on prod...
What needs improvement with Azure Stream Analytics?
There is a lack of technical support from Microsoft's local office, particularly in Taiwan. We often have to learn on...
 

Also Known As

Apache Spark: No data available
AWS Lambda: No data available
Azure Stream Analytics: ASA
 

Overview

 

Sample Customers

Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
AWS Lambda: Netflix
Azure Stream Analytics: Rockwell Automation, Milliman, Honeywell Building Solutions, Arcoflex Automation Solutions, Real Madrid C.F., Aerocrine, Ziosk, Tacoma Public Schools, P97 Networks
Find out what your peers are saying about Apache, Cloudera, Amazon Web Services (AWS) and others in Hadoop. Updated: June 2025.