
Apache NiFi vs Apache Spark vs Azure Stream Analytics comparison

 

Comparison Buyer's Guide

Executive Summary

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Mindshare comparison

Apache NiFi: Compute Service
Apache Spark: Hadoop
Azure Stream Analytics: Streaming Analytics
 

Featured Reviews

Bharghava Raghavendra Beesa - PeerSpot reviewer
The tool enables effective data transformation and integration
There are some areas for improvement, particularly with record-level tasks that take a bit of time. The quality of JSON data processing could be improved, as JSON workloads require manual conversions without a specific process. Enhancing features related to alerting would be helpful, including mobile alerts for pipeline issues. Integration with mobile devices for error alerts would simplify information delivery.
Dunstan Matekenya - PeerSpot reviewer
Open-source solution for data processing with portability
Apache Spark is known for its ease of use. Compared to other available data processing frameworks, it is user-friendly. While many choices now exist, Spark remains easy to use, particularly with Python. You can utilize familiar programming styles similar to Pandas in Python, including object-oriented programming. Another advantage is its portability. I can prototype and perform some initial tasks on my laptop using Spark without needing to be on Databricks or any cloud platform. I can transfer it to Databricks or other platforms, such as AWS. This flexibility allows me to improve processing even on my laptop. For instance, if I'm processing large amounts of data and find my laptop becoming slow, I can quickly switch to Spark. It handles small and large datasets efficiently, making it a versatile tool for various data processing needs.
SantiagoCordero - PeerSpot reviewer
Native connectors and integration simplify tasks but portfolio complexity needs addressing
There are too many products in the Azure landscape, which sometimes leads to overlap between them. Microsoft continuously releases new products or solutions, which can be frustrating when determining the appropriate features from one solution over another. A cost comparison between products is also not straightforward. They should simplify their portfolio. The Microsoft licensing system is confusing and not easy to understand, and this is something they should address. In the future, I may stop using Stream Analytics and move to other solutions. I discussed Palantir earlier, which is something I want to explore in depth because it allows me to accomplish more efficiently compared to solely using Azure. Additionally, the vendors should make the solution more user-friendly, incorporating low-code and no-code features. This is something I wish to explore further.

Quotes from Members

 

Pros

"The initial setup is very easy. I would rate my experience with the initial setup a ten out of ten, where one point is difficult, and ten points are easy."
"Visually, this is a good product."
"We can integrate the tool with other applications easily."
"The initial setup is very easy."
"The most valuable features of this solution are ease of use and implementation."
"Apache NiFi is user-friendly. Its most valuable features for handling large volumes of data include its multitude of integrated endpoints and clients and the ability to create cron jobs to run tasks at regular intervals."
"It is highly effective for handling real-time data by working with APIs for immediate and continuous data extraction."
"The most valuable feature has been the range of clients and the range of connectors that we could use."
"It is highly scalable, allowing you to efficiently work with extensive datasets that might be problematic to handle using traditional tools that are memory-constrained."
"The most valuable feature of Apache Spark is its flexibility."
"The solution has been very stable."
"Its scalability and speed are very valuable. You can scale it a lot. It is a great technology for big data. It is definitely better than a lot of earlier warehouse or pipeline solutions, such as Informatica. Spark SQL is very compliant with normal SQL that we have been using over the years. This makes it easy to code in Spark. It is just like using normal SQL. You can use the APIs of Spark or you can directly write SQL code and run it. This is something that I feel is useful in Spark."
"There's a lot of functionality."
"Spark is used for transformations from large volumes of data, and it is usefully distributed."
"The scalability has been the most valuable aspect of the solution."
"I like that it can handle multiple tasks parallelly. I also like the automation feature. JavaScript also helps with the parallel streaming of the library."
"We find the query editor feature of this solution extremely valuable for our business."
"It provides the capability to streamline multiple output components."
"The integrations for this solution are easy to use and there is flexibility in integrating the tool with Azure Stream Analytics."
"The most valuable features of Azure Stream Analytics are the ease of provisioning and the interface is not terribly complex."
"It's easy to implement and maintain pipelines with minimal complexity."
"Cloud tools and cloud services enable flexibility and lower entry barriers for Taiwanese enterprises."
"We use Azure Stream Analytics for simulation and internal activities."
"The way it organizes data into tables and dashboards is very helpful."
 

Cons

"The use case templates could be more precise to typical business needs."
"The tool should incorporate more tutorials for advanced use cases. It has tutorials for simple use cases."
"There are some claims that NiFi is cloud-native but we have tested it, and it's not."
"We run many jobs, and there are already large tables. When we do not control NiFi on time, all reports fail for the day. So it's pretty slow to control, and it has to be improved."
"The quality of JSON data processing could be improved, as JSON workloads require manual conversions without a specific process."
"The overall stability of this solution could be improved. In a future release, we would like to have access to more features that could be used in a parallel way. This would provide more freedom with processing."
"There is room for improvement in integration with SSO. For example, NiFi does not have any integration with SSO. And if I want to give some kind of rollback access control across the organization. That is not possible."
"I think the UI interface needs to be more user-friendly."
"In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, do the transformation in a subsecond, and all that."
"More ML based algorithms should be added to it, to make it algorithmic-rich for developers."
"From my perspective, the only thing that needs improvement is the interface, as it was not easily understandable."
"Include more machine learning algorithms and the ability to handle streaming of data versus micro batch processing."
"It would be beneficial to enhance Spark's capabilities by incorporating models that utilize features not traditionally present in its framework."
"Apache Spark could improve the connectors that it supports. There are a lot of open-source databases in the market. For example, cloud databases, such as Redshift, Snowflake, and Synapse. Apache Spark should have connectors present to connect to these databases. There are a lot of workarounds required to connect to those databases, but it should have inbuilt connectors."
"This solution currently cannot support or distribute neural network related models, or deep learning related algorithms. We would like this functionality to be developed."
"The setup I worked on was really complex."
"The solution could be improved by providing better graphics and including support for UI and UX testing."
"There may be some issues when connecting with Microsoft Power BI because we are providing the input and output commands, and there's a chance of it being delayed while connecting."
"The solution's interface could be simpler to understand for non-technical people."
"There are too many products in the Azure landscape, which sometimes leads to overlap between them."
"There are too many products in the Azure landscape, which sometimes leads to overlap between them."
"I would like to have a contact individual at Microsoft."
"The solution doesn't handle large data packets very efficiently, which could be improved upon."
"If something goes wrong, it's very hard to investigate what caused it and why."
 

Pricing and Cost Advice

"I used the tool's free version."
"The solution is open-source."
"We use the free version of Apache NiFi."
"It's an open-source solution."
"Spark is an open-source solution, so there are no licensing costs."
"Apache Spark is an open-source solution, and there is no cost involved in deploying the solution on-premises."
"Licensing costs can vary. For instance, when purchasing a virtual machine, you're asked if you want to take advantage of the hybrid benefit or if you prefer the license costs to be included upfront by the cloud service provider, such as Azure. If you choose the hybrid benefit, it indicates you already possess a license for the operating system and wish to avoid additional charges for that specific VM in Azure. This approach allows for a reduction in licensing costs, charging only for the service and associated resources."
"Considering the product version used in my company, I feel that the tool is not costly since the product is available for free."
"Apache Spark is an expensive solution."
"The product is expensive, considering the setup."
"It is quite expensive. In fact, it accounts for almost 50% of the cost of our entire project."
"Apache Spark is open-source. You have to pay only when you use any bundled product, such as Cloudera."
"There are different tiers based on retention policies. There are four tiers. The pricing varies based on steaming units and tiers. The standard pricing is $10/hour."
"The current price is substantial."
"When scaling up, the pricing for Azure Stream Analytics can get relatively high. Considering its capabilities compared to other solutions, I would rate it a seven out of ten for cost. However, we've found ways to optimize costs using tools like Databricks for specific tasks."
"The cost of this solution is less than competitors such as Amazon or Google Cloud."
"We pay approximately $500,000 a year. It's approximately $10,000 a year per license."
"The product's price is at par with the other solutions provided by the other cloud service providers in the market."
"I rate the price of Azure Stream Analytics a four out of five."
"The licensing for this product is payable on a 'pay as you go' basis. This means that the cost is only based on data volume, and the frequency that the solution is used."
857,028 professionals have used our research since 2012.
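
To put two of the reviewer-quoted pricing figures above in perspective, here is a rough arithmetic sketch. It takes the "$10/hour" and "$500,000 a year at $10,000 per license" figures exactly as the reviewers stated them. The always-on and 8-hours-a-day usage patterns, the 30-day month, and the helper names are assumptions made for illustration, not vendor pricing.

```python
# Back-of-the-envelope arithmetic on figures quoted by reviewers.
# Assumptions (not from any vendor price list): a 30-day billing month
# and two hypothetical usage patterns (24 h/day and 8 h/day).

HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption


def monthly_cost(hourly_rate: float,
                 hours_per_day: float = HOURS_PER_DAY,
                 days: int = DAYS_PER_MONTH) -> float:
    """Cost of a job billed per hour for a given daily run time."""
    return hourly_rate * hours_per_day * days


def implied_license_count(annual_spend: float, per_license: float) -> float:
    """Number of licenses implied by a total spend and a per-license price."""
    return annual_spend / per_license


always_on = monthly_cost(10.0)           # reviewer-quoted $10/hour, running 24x7
business_hours = monthly_cost(10.0, 8)   # same rate, 8 hours a day
licenses = implied_license_count(500_000, 10_000)

print(f"Always-on: ${always_on:,.0f}/month")
print(f"8 h/day:   ${business_hours:,.0f}/month")
print(f"Implied licenses: {licenses:.0f}")
```

Under these assumptions, the gap between an always-on job and one that runs only during business hours illustrates why several reviewers describe costs as rising quickly when scaling up.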
 

Top Industries

By visitors reading reviews
Apache NiFi
Financial Services Firm: 15%
Computer Software Company: 13%
Manufacturing Company: 10%
Retailer: 7%

Apache Spark
Financial Services Firm: 27%
Computer Software Company: 13%
Manufacturing Company: 7%
Comms Service Provider: 6%

Azure Stream Analytics
Computer Software Company: 15%
Financial Services Firm: 15%
Manufacturing Company: 9%
Retailer: 6%
 

Company Size

By reviewers
Large Enterprise
Midsize Enterprise
Small Business
 

Questions from the Community

What is your experience regarding pricing and costs for Apache NiFi?
Apache NiFi is open-source and free. Its integration with systems like Cloudera can be expensive, but Apache NiFi its...
What needs improvement with Apache NiFi?
The logging system of Apache NiFi needs improvement. It is difficult to debug compared to Airflow ( /products/apache-...
What do you like most about Apache Spark?
We use Spark to process data from different data sources.
What is your experience regarding pricing and costs for Apache Spark?
Apache Spark is open-source, so it doesn't incur any charges.
What needs improvement with Apache Spark?
There is complexity when it comes to understanding the whole ecosystem, especially for beginners. I find it quite com...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analyti...
What is your experience regarding pricing and costs for Azure Stream Analytics?
I have no problem with pricing. We sell the data analytics value and operational value to customers, focusing on prod...
What needs improvement with Azure Stream Analytics?
There is a lack of technical support from Microsoft's local office, particularly in Taiwan. We often have to learn on...
 

Also Known As

Apache NiFi: No data available
Apache Spark: No data available
Azure Stream Analytics: ASA
 

Overview

 

Sample Customers

Apache NiFi: Macquarie Telecom Group, Dovestech, Slovak Telekom, Looker, Hastings Group
Apache Spark: NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions
Azure Stream Analytics: Rockwell Automation, Milliman, Honeywell Building Solutions, Arcoflex Automation Solutions, Real Madrid C.F., Aerocrine, Ziosk, Tacoma Public Schools, P97 Networks
Find out what your peers are saying about Amazon Web Services (AWS), Apache, Zadara and others in Compute Service. Updated: June 2025.