Apache Spark vs Azure Stream Analytics comparison

Apache Spark: 2,498 views | 1,884 comparisons | 89% willing to recommend
Azure Stream Analytics: 9,925 views | 8,426 comparisons | 95% willing to recommend
Comparison Buyer's Guide
Executive Summary
Updated on Sep 5, 2022

We compared Apache Spark and Azure Stream Analytics based on our users’ reviews in five categories. Our conclusion, drawn from all of the collected data, appears below.

  • Ease of Deployment: Users note that both products are very straightforward and simple to set up.
  • Features: Users of both products are generally happy with their flexibility, stability, and scalability. Some Azure Stream Analytics users noted issues with stability.

    Apache Spark users note being particularly satisfied with its AI libraries and batch processing, but that there’s a learning curve to using it and that its stream processing needs to be developed more.

    Azure Stream Analytics users say they’re impressed with the solution's UI, real-time analytics, and its deep integration with other Azure products. Some users mention issues when connecting to Microsoft Power BI and would like to see clearer metrics.
  • Pricing: Apache Spark is an open-source product. You have to pay only when you use any bundled product, such as Cloudera. Azure Stream Analytics users say that the solution is fairly priced and is cheaper than its biggest competitors.
  • ROI: Apache Spark users make no mention of ROI. Azure Stream Analytics users mention being pleased with the ROI.
  • Service and Support: Because Apache Spark is open-source, it does not come with vendor support. Azure Stream Analytics users report excellent service and support.

Comparison Results: Apache Spark and Azure Stream Analytics come out about equal in this comparison. Some users are more satisfied with Apache Spark’s stability and pricing, but Azure Stream Analytics has an edge when it comes to ROI and technical support.

To learn more, read our detailed Hadoop Report (Updated: April 2024).
767,847 professionals have used our research since 2012.
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
Apache Spark:
  • "The fault tolerant feature is provided."
  • "This solution provides a clear and convenient syntax for our analytical tasks."
  • "Apache Spark provides a very high-quality implementation of distributed data processing."
  • "The data processing framework is good."
  • "The most valuable feature of Apache Spark is its ease of use."
  • "We use Spark to process data from different data sources."
  • "The most valuable feature is the Fault Tolerance and easy binding with other processes like Machine Learning, graph analytics."
  • "The solution has been very stable."

Azure Stream Analytics:
  • "The most valuable features of Azure Stream Analytics are the ease of provisioning and the interface is not terribly complex."
  • "Technical support is pretty helpful."
  • "The way it organizes data into tables and dashboards is very helpful."
  • "Provides deep integration with other Azure resources."
  • "The solution's technical support is good."
  • "I like all the connected ecosystems of Microsoft, it is really good with other BI tools that are easy to connect."
  • "It provides the capability to streamline multiple output components."
  • "It's scalable as a cloud product."

Cons
Apache Spark:
  • "It needs a new interface and a better way to get some data. In terms of writing our scripts, some processes could be faster."
  • "The logging for the observability platform could be better."
  • "The initial setup was not easy."
  • "If you have a Spark session in the background, sometimes it's very hard to kill these sessions because of D allocation."
  • "More ML based algorithms should be added to it, to make it algorithmic-rich for developers."
  • "Apache Spark could potentially improve in terms of user-friendliness, particularly for individuals with a SQL background. While it's suitable for those with programming knowledge, making it more accessible to those without extensive programming skills could be beneficial."
  • "It would be beneficial to enhance Spark's capabilities by incorporating models that utilize features not traditionally present in its framework."
  • "Apache Spark is very difficult to use. It would require a data engineer. It is not available for every engineer today because they need to understand the different concepts of Spark, which is very, very difficult and it is not easy to learn."

Azure Stream Analytics:
  • "We would like to have centralized platform altogether since we have different kind of options for data ingestion. Sometimes it gets difficult to manage different platforms."
  • "Azure Stream Analytics could improve by having clearer metrics as to the scale, more metrics around the data set size that is flowing through it, and performance tuning recommendations."
  • "The collection and analysis of historical data could be better."
  • "One area that could use improvement is the handling of data validation. Currently, there is a review process, but sometimes the validation fails even before the job is executed. This results in wasted time as we have to rerun the job to identify the failure."
  • "The solution offers a free trial, however, it is too short."
  • "The solution’s customer support could be improved."
  • "Sometimes when we connect Power BI, there is a delay or it throws up some errors, so we're not sure."
  • "I would like to have a contact individual at Microsoft."

Pricing and Cost Advice
  • "Since we are using the Apache Spark version, not the Databricks version, it is an Apache license version, so the support and resolution of bugs are actually late or delayed. The Apache license is free."
  • "Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera."
  • "We are using the free version of the solution."
  • "Apache Spark is not too cheap. You have to pay for hardware and Cloudera licenses. Of course, there is a solution with open source without Cloudera."
  • "Apache Spark is an expensive solution."
  • "Spark is an open-source solution, so there are no licensing costs."
  • "The cloud model can be expensive, as it requires substantial resources for implementation, covering on-premises hardware, memory, and licensing."
  • "It is an open-source solution, it is free of charge."

  • "The cost of this solution is less than competitors such as Amazon or Google Cloud."
  • "We pay approximately $500,000 a year. It's approximately $10,000 a year per license."
  • "I rate the price of Azure Stream Analytics a four out of five."
  • "The licensing for this product is payable on a 'pay as you go' basis. This means that the cost is only based on data volume, and the frequency that the solution is used."
  • "There are different tiers based on retention policies. There are four tiers. The pricing varies based on streaming units and tiers. The standard pricing is $10/hour."
  • "The current price is substantial."
  • "Azure Stream Analytics is a little bit expensive."
  • "The product's price is at par with the other solutions provided by the other cloud service providers in the market."

    Questions from the Community
    Top Answer: We use Spark to process data from different data sources.
    Top Answer: In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, and do the transformation in a subsecond.
    Top Answer: Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analytics teams that have to interpret data to further the business goals of their…
    Top Answer: The product's price is at par with the other solutions provided by the other cloud service providers in the market.
    Top Answer: Azure Stream Analytics was not meeting our company's expectations because it was tedious to change the job, write queries, or if I needed to change something, I needed to stop the entire stream…
    Ranking
    Apache Spark: 1st out of 22 in Hadoop | Views: 2,498 | Comparisons: 1,884 | Reviews: 25 | Average Words per Review: 432 | Rating: 8.7
    Azure Stream Analytics: 4th out of 38 in Streaming Analytics | Views: 9,925 | Comparisons: 8,426 | Reviews: 13 | Average Words per Review: 405 | Rating: 8.2
    Also Known As
    Azure Stream Analytics: ASA
    Overview

    Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines, that is maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
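
The map-then-reduce dataflow described above can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not Spark's API: `map_reduce_pass`, `add_count`, and the sample `lines` are hypothetical names, and everything runs on a single machine. The point it tries to show is that a working set held in memory (`words`) can feed several passes without being re-read from disk each time.

```python
from functools import reduce


def map_reduce_pass(records, map_fn, reduce_fn, initial):
    """One linear pass: map a function over the records, then reduce the results."""
    return reduce(reduce_fn, map(map_fn, records), initial)


# Hypothetical input standing in for data that would be read from disk.
lines = ["spark streams data", "spark caches data"]

# Working set built once and kept in memory, then reused by both passes below.
words = [word for line in lines for word in line.split()]

# Pass 1: count all words.
total_words = map_reduce_pass(words, lambda w: 1, lambda a, b: a + b, 0)


def add_count(counts, word):
    """Reducer that accumulates per-word counts in a dictionary."""
    counts[word] = counts.get(word, 0) + 1
    return counts


# Pass 2: count occurrences per word, reusing the same in-memory working set.
word_counts = map_reduce_pass(words, lambda w: w, add_count, {})
```

In a real cluster the working set would be partitioned across machines and rebuilt on failure; this sketch leaves both of those aspects out.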

    Azure Stream Analytics is a robust real-time analytics service that has been designed for critical business workloads. Users are able to build an end-to-end serverless streaming pipeline in minutes. Utilizing SQL, users are able to go from zero to production with a few clicks, all easily extensible with unique code and automatic machine learning abilities for the most advanced scenarios.

    Azure Stream Analytics can analyze and accurately process very large volumes of high-speed streaming data from numerous sources at the same time. Patterns and scenarios are quickly identified and information is gathered from various input sources, such as social media feeds, applications, clickstreams, sensors, and devices. These patterns can then be used to trigger actions and launch workflows, such as feeding data to a reporting tool, storing data for later use, or creating alerts. Azure Stream Analytics is also offered on Azure IoT Edge runtime, so the data can be processed on IoT devices.
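
As a rough illustration of how a pattern in a stream can trigger an alert, the sketch below groups sensor readings into fixed, non-overlapping time windows and flags any window whose average exceeds a threshold. It is plain Python rather than the SQL that Azure Stream Analytics uses, and the function name, the sample readings, and the threshold are all assumptions made for this example.

```python
from collections import defaultdict


def tumbling_window_alerts(events, window_seconds, threshold):
    """Return the start time of each fixed window whose average value exceeds threshold.

    events is an iterable of (timestamp_in_seconds, value) pairs.
    """
    windows = defaultdict(list)
    for timestamp, value in events:
        # Assign the event to the window that contains its timestamp.
        window_start = (timestamp // window_seconds) * window_seconds
        windows[window_start].append(value)
    return [
        start
        for start, values in sorted(windows.items())
        if sum(values) / len(values) > threshold
    ]


# Hypothetical sensor readings: (seconds since start, temperature).
readings = [(1, 20.0), (4, 22.0), (11, 35.0), (15, 37.0), (23, 21.0)]
alerts = tumbling_window_alerts(readings, window_seconds=10, threshold=30.0)
```

Here only the window starting at second 10 averages above 30.0, so it is the only one that would raise an alert. A streaming service would evaluate windows continuously as events arrive instead of over a finished list.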

    Top Benefits

    • User friendly: Azure Stream Analytics is very straightforward and easy to use. Out of the box and with a few clicks, users are able to connect to numerous sources and sinks, and easily develop an end-to-end pipeline. Stream Analytics can easily connect to Azure IoT Hub and Azure Event Hub for streaming ingestion, in addition to connecting with Azure Blob storage for historical data ingestion.

    • Flexible deployment: For low-latency analytics, Azure Stream Analytics can run on Azure Stack or IoT Edge. For large-scale analytics, the solution can run in the cloud. Azure Stream Analytics uses the same query language and tools for both the cloud and the edge, making it easier for developers to design hybrid architectures for stream processing.

    • Cost-effective: With Azure Stream Analytics, users only pay for the streaming units they consume; there are no upfront costs. Users can easily scale up or down as needed; there is no commitment or cluster provisioning.

    • Trustworthy: Azure Stream Analytics guarantees event processing to be 99.99% available at a minute level of granularity. Azure Stream Analytics has embedded recovery capabilities and checkpoints to keep things running smoothly at all times. Events are not lost, thanks to Azure Stream Analytics' at-least-once delivery of events and exactly-once event processing.
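
The pairing of at-least-once delivery with exactly-once processing mentioned in the last point can be sketched as follows. With at-least-once delivery an event may arrive more than once, so one common way to process each event a single time is to track event IDs and skip repeats. This is a generic illustration under that assumption, not a description of how Azure Stream Analytics implements it; `process_exactly_once` and the sample `deliveries` are hypothetical.

```python
def process_exactly_once(deliveries, handler):
    """Apply handler once per event ID, even if an event is delivered repeatedly.

    deliveries is an iterable of (event_id, payload) pairs.
    """
    seen_ids = set()
    results = []
    for event_id, payload in deliveries:
        if event_id in seen_ids:
            continue  # Redelivered duplicate: already processed, so skip it.
        seen_ids.add(event_id)
        results.append(handler(payload))
    return results


# Hypothetical delivery log in which event "b" arrives twice.
deliveries = [("a", 1), ("b", 2), ("b", 2), ("c", 3)]
totals = process_exactly_once(deliveries, lambda value: value * 10)
```

A production system would also need to persist the set of seen IDs (for example alongside its checkpoints) so that deduplication survives a restart; the in-memory set here does not.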

    Reviews from Real Users

    “Azure Stream Analytics is something that you can use to test out streaming scenarios very quickly in the general sense and it is useful for IoT scenarios. If I was to do a project with IoT and I needed a streaming solution, Azure Stream Analytics would be a top choice. The most valuable features of Azure Stream Analytics are the ease of provisioning and the interface is not terribly complex.” - Olubisi A., Team Lead at a tech services company.

    “It's used primarily for data and mining - everything from the telemetry data side of things. It's great for streaming and makes everything easy to handle. The streaming from the IoT hub and the messaging are aspects I like a lot.” - Sudhendra U., Technical Architect at Infosys

    Sample Customers
    Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
    Azure Stream Analytics: Rockwell Automation, Milliman, Honeywell Building Solutions, Arcoflex Automation Solutions, Real Madrid C.F., Aerocrine, Ziosk, Tacoma Public Schools, P97 Networks
    Top Industries
    Apache Spark
    REVIEWERS
    Computer Software Company: 30%
    Financial Services Firm: 15%
    University: 9%
    Marketing Services Firm: 6%
    VISITORS READING REVIEWS
    Financial Services Firm: 24%
    Computer Software Company: 13%
    Manufacturing Company: 7%
    Comms Service Provider: 6%
    Azure Stream Analytics
    REVIEWERS
    Computer Software Company: 27%
    Manufacturing Company: 18%
    Insurance Company: 9%
    Government: 9%
    VISITORS READING REVIEWS
    Computer Software Company: 15%
    Financial Services Firm: 12%
    Manufacturing Company: 8%
    Comms Service Provider: 5%
    Company Size
    Apache Spark
    REVIEWERS
    Small Business: 40%
    Midsize Enterprise: 19%
    Large Enterprise: 40%
    VISITORS READING REVIEWS
    Small Business: 17%
    Midsize Enterprise: 12%
    Large Enterprise: 71%
    Azure Stream Analytics
    REVIEWERS
    Small Business: 24%
    Midsize Enterprise: 10%
    Large Enterprise: 67%
    VISITORS READING REVIEWS
    Small Business: 20%
    Midsize Enterprise: 11%
    Large Enterprise: 69%

    Apache Spark is ranked 1st in Hadoop with 60 reviews while Azure Stream Analytics is ranked 4th in Streaming Analytics with 22 reviews. Apache Spark is rated 8.4, while Azure Stream Analytics is rated 8.2. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". On the other hand, the top reviewer of Azure Stream Analytics writes "Easy to set up and user-friendly, but could be priced better". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, SAP HANA and Apache NiFi, whereas Azure Stream Analytics is most compared with Amazon Kinesis, Databricks, Amazon MSK, Apache Flink and Apache Spark Streaming.

    We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.