
Apache Spark Streaming pros and cons

Vendor: Apache
4.0 out of 5

Pros & Cons summary

Prominent pros & cons

PROS

Apache Spark Streaming provides efficiency and stability, making it better than average for data processing.
Near real-time analytics and easy API development are standout features, benefiting developers in building code-streaming pipelines.
Low latency and data transformation capabilities make it the fastest in the market, significantly improving communication speed.
The open-source nature of Apache Spark Streaming means it is essentially free to use, making it a cost-effective choice for organizations.
Integration with Anaconda and Miniconda enhances interaction with databases and supports decision-making processes by utilizing machine learning capabilities effectively.

CONS

Apache Spark Streaming's user configuration could be improved to be less developer-focused and more business-user-oriented.
There is a need for Apache Spark Streaming to support arbitrary stateful functions in Python.
The service structure of Apache Spark Streaming could be enhanced, particularly in memory management and latency.
The initial setup of Apache Spark Streaming is complex and resource-intensive, even for small-scale applications.
Apache Spark Streaming requires improvement in real-time processing instead of relying on micro-batch or near real-time processing.
 

Apache Spark Streaming Pros review quotes

Venkata Phaneendra Reddy Janga - PeerSpot reviewer
Aug 18, 2025
By integrating Apache Spark Streaming, the data freshness rate and latency have significantly improved, from 24-hour batch processing to less than one minute, facilitating faster communication to downstream systems and aiding marketing campaigns.
Himansu Jena - PeerSpot reviewer
Aug 19, 2025
With Apache Spark Streaming's integration with Anaconda and Miniconda with Python, I interact with databases using data frames or datasets in micro versions and create solutions based on business expectations for decision-making, logistic regression, linear regression, or machine learning, which provides image, voice-record, and graphical data for improved accuracy.
Kuldeep Pal - PeerSpot reviewer
Aug 22, 2025
With Apache Spark Streaming, you can have multiple kinds of windows; depending on your use case, you can select a tumbling window, a sliding window, or a static window to determine how much data you want to process at a single point in time.
Find out what your peers are saying about Apache, Amazon Web Services (AWS), Microsoft and others in Streaming Analytics. Updated: August 2025.
865,985 professionals have used our research since 2012.
Shahzad Munir - PeerSpot reviewer
Aug 25, 2025
I appreciate Apache Spark Streaming's micro-batching capabilities; the watermarking functionality and related features are quite good.
DR
Jun 8, 2023
Apache Spark Streaming was straightforward in terms of maintenance. It was actively developed, and migrating from an older to a newer version was quite simple.
RK
Jun 3, 2024
Apache Spark's capabilities for machine learning are quite extensive and can be used in a low-code way.
Prashast Tripathi - PeerSpot reviewer
Jul 24, 2023
Apache Spark Streaming has features like checkpointing and Streaming API that are useful.
Oscar Estorach - PeerSpot reviewer
Aug 18, 2021
The solution is very stable and reliable.
SB
Nov 21, 2022
It's the fastest solution on the market, with low latency on data transformations.
AbhishekGupta - PeerSpot reviewer
Oct 8, 2022
Apache Spark Streaming's most valuable feature is near real-time analytics. The developers can build APIs easily for a code-streaming pipeline. The solution has an ecosystem of integration with other stock services.
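
One of the quotes above mentions choosing between tumbling and sliding windows. As a rough illustration of the difference, here is a minimal sketch in plain Python, not Spark's implementation. The helper names `tumbling_window` and `sliding_windows` are hypothetical, and timestamps are assumed to be integer seconds with windows aligned to zero. In Spark itself, this grouping is normally expressed with the `window` function in `pyspark.sql.functions`.

```python
from typing import List, Tuple

# A window is a half-open interval [start, end), in seconds.
Window = Tuple[int, int]


def tumbling_window(ts: int, size: int) -> Window:
    """Return the single fixed-size, non-overlapping window containing ts."""
    start = ts - (ts % size)
    return (start, start + size)


def sliding_windows(ts: int, size: int, slide: int) -> List[Window]:
    """Return every window of length `size`, advancing by `slide`, that contains ts."""
    windows = []
    # Latest slide-aligned window start that is <= ts.
    start = ts - (ts % slide)
    # Walk backwards while the window [start, start + size) still covers ts.
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


if __name__ == "__main__":
    # An event at t=25s falls into exactly one 10-second tumbling window...
    print(tumbling_window(25, 10))
    # ...but into two 10-second windows that slide every 5 seconds.
    print(sliding_windows(25, 10, 5))
```

A tumbling window assigns each event to exactly one bucket, while a sliding window can assign the same event to several overlapping buckets, so the amount of data processed per window grows as the slide interval shrinks relative to the window size.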
 

Apache Spark Streaming Cons review quotes

Venkata Phaneendra Reddy Janga - PeerSpot reviewer
Aug 18, 2025
One improvement I would expect is real-time processing instead of micro-batch or near real-time.
Himansu Jena - PeerSpot reviewer
Aug 19, 2025
When dealing with various data types including COBOL, Excel, JSON, video, audio, and MPG files, challenges can arise with incomplete or missing values.
Kuldeep Pal - PeerSpot reviewer
Aug 22, 2025
While it is generally reliable, Apache Spark Streaming has some issues and is not 100% reliable.
Shahzad Munir - PeerSpot reviewer
Aug 25, 2025
Monitoring is an area where they could definitely improve Apache Spark Streaming. When you have a streaming application, it generates numerous logs. After some time, the logs become meaningless because they're quite large and impossible to open.
DR
Jun 8, 2023
It was resource-intensive, even for small-scale applications.
RK
Jun 3, 2024
The debugging aspect could use some improvement.
Prashast Tripathi - PeerSpot reviewer
Jul 24, 2023
The cost and load-related optimizations are areas where the tool lacks and needs improvement.
Oscar Estorach - PeerSpot reviewer
Aug 18, 2021
The solution itself could be easier to use.
SB
Nov 21, 2022
The initial setup is quite complex.
AbhishekGupta - PeerSpot reviewer
Oct 8, 2022
The service structure of Apache Spark Streaming can improve. There are a lot of issues with memory management and latency. There is no real-time analytics. We recommend it for use cases where a five-second latency is acceptable, but not for millisecond-latency, IoT-based, or anomaly-detection use cases. Flink as a service is much better.