
Apache Spark Streaming Primary Use Case

Himansu Jena - PeerSpot reviewer
Sr Project Manager at Raj Subhatech

I use Apache Spark Streaming for GIS (Geographic Information System) work: satellite image processing, general image processing, longitude and latitude data, and predicting electricity, road, and other transformations in these areas.

I process all the information in real time, where I can receive terabytes or petabytes of data from any type of source: XML, Excel, structured data, semi-structured data, and unstructured data. I apply micro-batching, streams, and transformations to this information. Based on that, I predict and create models that can be used for regular expressions and image processing. Then, using TensorFlow, I create dynamic views. Additionally, I create models that provide accurate predictive analytics.

With Apache Spark Streaming's integration with Anaconda and Miniconda with Python, I interact with databases using data frames or data sets in micro versions. I create solutions based on what the business expects for decision-making: logistic regression, linear regression, or machine learning on image, voice, or graphical data to provide more accuracy. These features are implemented based on client requirements. We ensure we are on track using AML and real-time processing from various data sources, whether structured, unstructured, or semi-structured.

Kuldeep Pal - PeerSpot reviewer
Data Engineer at Walmart Global Tech

I am an end user of Apache Spark Streaming as a developer. I use this technology to build streaming pipelines.

Our usual use case for Apache Spark Streaming is mostly fraud detection. We use Apache Spark Streaming with Kafka to stream all incoming messages and try to detect fraud in that data as it arrives. In practice it is near real-time rather than true real-time, because Apache Spark Streaming works window by window.

The incoming messages are processed by Apache Spark Streaming in near real-time, which lets us detect fraud. If somebody is conducting a scam or fraud, we match it based on our rules.
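The window-by-window, rule-based matching described in this review can be illustrated with a minimal sketch in plain Python. It does not use Spark or Kafka, and it is not the reviewer's code: the 60-second window, the threshold of three transactions, and the names `window_start` and `flag_suspicious` are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 60        # hypothetical window length
MAX_TXNS_PER_WINDOW = 3    # hypothetical rule: more than 3 events per window is suspicious


def window_start(timestamp, window_seconds=WINDOW_SECONDS):
    """Return the start of the fixed (tumbling) window that contains `timestamp`."""
    return timestamp - (timestamp % window_seconds)


def flag_suspicious(events, window_seconds=WINDOW_SECONDS, max_txns=MAX_TXNS_PER_WINDOW):
    """Count events per (window, account) and return the pairs that break the rule.

    `events` is an iterable of (epoch_seconds, account_id) tuples.
    """
    counts = defaultdict(int)
    for timestamp, account_id in events:
        counts[(window_start(timestamp, window_seconds), account_id)] += 1
    # Keep only the (window, account) pairs whose count exceeds the threshold.
    return sorted(key for key, n in counts.items() if n > max_txns)


events = [
    (0, "acct-1"), (10, "acct-1"), (20, "acct-1"), (30, "acct-1"),  # four events in window 0
    (15, "acct-2"),
    (65, "acct-1"), (70, "acct-2"),
]
suspicious = flag_suspicious(events)
print(suspicious)
```

Because a result is only produced once a window's events have been counted, detection lags the event by up to one window length, which is why the reviewer calls it near real-time.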

Shahzad Munir - PeerSpot reviewer
Sr. Manager Data Engineer at a tech consulting company with 51-200 employees

I work for a telecommunication company where we process network data in near real-time. All of our use cases revolve around that.

Buyer's Guide
Apache Spark Streaming
September 2025
Learn what your peers think about Apache Spark Streaming. Get advice and tips from experienced pros sharing their opinions. Updated: September 2025.
867,676 professionals have used our research since 2012.
Venkata Phaneendra Reddy Janga - PeerSpot reviewer
Data Engineer III at a tech consulting company with 10,001+ employees

We have used Apache Spark Streaming for ingestion from streaming sources such as Apache Kafka. Another use case is for building a Customer 360, specifically C360 in real-time. Additionally, we used it to build a real-time feature store to build features that will be used for machine learning models and AI models.

When building this customer profile with Apache Spark Streaming, we create a micro-batch of one minute for customer profiling. For example, we track email changes, contact changes, and address changes. This approach is near-real-time due to the micro-batch.
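As a rough illustration of the one-minute micro-batch idea, the following plain-Python sketch groups change events into 60-second batches and applies each batch to a profile store. It does not use Spark, and it is not the reviewer's pipeline: the event layout and the names `micro_batches` and `apply_batch` are hypothetical.

```python
def micro_batches(changes, batch_seconds=60):
    """Group change events into consecutive fixed-length micro-batches.

    `changes` is a list of (epoch_seconds, customer_id, field, value) tuples.
    Returns a dict mapping each batch start time to the events in that batch.
    """
    batches = {}
    for change in sorted(changes):  # order by timestamp so later changes come last
        start = change[0] - (change[0] % batch_seconds)
        batches.setdefault(start, []).append(change)
    return batches


def apply_batch(profiles, batch):
    """Apply one micro-batch to the profile store; the latest event per field wins."""
    for _, customer_id, field, value in batch:
        profiles.setdefault(customer_id, {})[field] = value
    return profiles


changes = [
    (5, "c1", "email", "old@example.com"),
    (42, "c1", "email", "new@example.com"),
    (61, "c2", "address", "1 Main St"),
    (75, "c1", "address", "9 Elm Rd"),
]

profiles = {}
for start, batch in sorted(micro_batches(changes).items()):
    apply_batch(profiles, batch)  # downstream systems would be notified once per batch
print(profiles)
```

A change is visible only after its batch has been applied, so the worst-case delay is roughly the batch length, here one minute.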

By using Apache Spark Streaming, data freshness has improved and latency has decreased significantly. Earlier we ran 24-hour batches; now latency is under one minute. This lets us communicate any customer change, such as an address or email update, to downstream systems such as Adobe Experience Platform for marketing within one minute, capturing all changes in near real-time.

We have integrated Apache Spark Streaming with Google's Cloud Storage (GCS) and Google BigQuery. Additionally, we have integrated with native HDFS and Hive as well.

Ajay Hiremath - PeerSpot reviewer
Gen AI Lead/Architect at Alvaria

My use cases for Apache Spark Streaming were during my academics. During that time, I used Apache Spark Streaming to transmit data live from one source to another.

Oscar Estorach - PeerSpot reviewer
Chief Data-strategist and Director at Theworkshop.es

As a data engineer, I use Apache Spark Streaming to process real-time data for web page analytics and integrate diverse data sources into centralized data warehouses.

AbhishekGupta - PeerSpot reviewer
Engineering Leader at Walmart

We have built services around Apache Spark Streaming. We use it for real-time streaming use cases. There are many last-minute delivery use cases that we are trying to build on Apache Spark Streaming, but the latency has to be better.

AM
Head of Data Science center of excellence at Ameriabank CJSC

We use Spark Streaming in micro-batch mode. It's not a full real-time system, but it offers high performance and low latency.

DR
Chief Technology Officer at Teslon Technologies Pvt Ltd

We used Spark and Spark Streaming, as well as Spark ML, for multiple use cases, particularly streaming IoT-related data. Additionally, we applied Spark ML to run various machine learning algorithms on the streaming data, primarily in the healthcare domain.

Prashast Tripathi - PeerSpot reviewer
Data Engineer at a comms service provider with 201-500 employees

The solution has industry-related use cases, with orders flowing from the order management system. We use Apache Spark Streaming to collect and store these orders in our database.

SB
Sr Technical Analyst at Sumtotal

The primary use case of this solution is streaming data. It can stream large amounts of data in small chunks, which are used for Databricks data. I've been using the solution for personal research purposes only and not for business applications. I'm a customer of Apache.

reviewer1494531 - PeerSpot reviewer
Head of Data Science at an energy/utilities company with 10,001+ employees

We're primarily using the solution for anomaly detection.

RK
DevOps engineer at Vvolve management consultants

I've used it more for ETL. It's useful for creating data pipelines, streaming datasets, generating synthetic data, synchronizing data, and creating data lakes. Loading and unloading data is fast and easy.

In my ETL work, I often move data from multiple sources into a data lake. Apache Spark is very helpful for tracking the latest data delivery and automatically streaming it to the target database.

reviewer1516182 - PeerSpot reviewer
Chief Innovation & Technology Leader at a mining and metals company with 1,001-5,000 employees

The primary use of the solution is to implement predictive maintenance qualities. 
