
Apache Spark pros and cons

Vendor: Apache
4.2 out of 5
Ranked 1

Pros & Cons summary


Prominent pros & cons

PROS

Apache Spark delivers exceptional performance through in-memory processing, offering significant improvements over traditional tools.
The system's fault tolerance and scalability are highly valued, enabling efficient data handling for extensive datasets.
Apache Spark's support for machine learning, real-time streaming, and data processing, along with its integration with AI libraries, is outstanding.
The ease of use, combined with SQL compliance and a clear syntax, makes Apache Spark user-friendly and efficient for analytical tasks.
Apache Spark's distributed computing framework and workload distribution capabilities lead to enhanced processing speed and efficiency.

CONS

Apache Spark could benefit from expanding its support for more diverse programming languages and database connectors.
Apache Spark's stream processing capabilities and dynamic DataFrame options are areas that require development.
Apache Spark's machine learning and algorithmic features need enhancement to provide more comprehensive solutions for developers.
Apache Spark has a significant learning curve, requiring technical expertise that might hinder adoption by less technical users.
Apache Spark could improve its resource management and optimization techniques to enhance overall performance.
 

Apache Spark Pros review quotes

it_user372393 - PeerSpot reviewer
Big Data Consultant at a tech services company with 501-1,000 employees
Aug 25, 2017
The good performance. The nice graphical management console. The long list of ML algorithms.
it_user326142 - PeerSpot reviewer
Architect at a healthcare company with 51-200 employees
Sep 26, 2017
ETL and streaming capabilities.
it_user746673 - PeerSpot reviewer
Sr. Software Engineer at a tech vendor with 1-10 employees
Oct 1, 2017
The most valuable feature is the Fault Tolerance and easy binding with other processes like Machine Learning, graph analytics.
Learn what your peers think about Apache Spark. Get advice and tips from experienced pros sharing their opinions. Updated: January 2026.
879,853 professionals have used our research since 2012.
it_user746943 - PeerSpot reviewer
Big Data and Cloud Solution Consultant at a financial services firm with 10,001+ employees
Oct 2, 2017
DataFrame: Spark SQL gives the leverage to create applications more easily and with less coding effort.
it_user786777 - PeerSpot reviewer
Manager | Data Science Enthusiast | Management Consultant at a consultancy with 5,001-10,000 employees
Dec 10, 2017
With Hadoop-related technologies, we can distribute the workload with multiple commodity hardware.
reviewer894894 - PeerSpot reviewer
Solutions Architect at a computer software company with 51-200 employees
Jun 27, 2018
Features include machine learning, real time streaming, and data processing.
it_user946074 - PeerSpot reviewer
Principal Architect at a financial services firm with 1,001-5,000 employees
Jul 10, 2019
I found the solution stable. We haven't had any problems with it.
LC
Snr Security Engineer at a tech vendor with 201-500 employees
Jul 14, 2019
The scalability has been the most valuable aspect of the solution.
reviewer1046250 - PeerSpot reviewer
Senior Consultant & Training at a tech services company with 51-200 employees
Oct 13, 2019
The most valuable feature of this solution is its capacity for processing large amounts of data.
MG
Director of BigData Offer at IVIDATA
Dec 9, 2019
The solution is very stable.
 

Apache Spark Cons review quotes

it_user372393 - PeerSpot reviewer
Big Data Consultant at a tech services company with 501-1,000 employees
Aug 25, 2017
Apache Spark provides very good performance. The tuning phase is still tricky.
it_user326142 - PeerSpot reviewer
Architect at a healthcare company with 51-200 employees
Sep 26, 2017
Stability in terms of API (things were difficult, when transitioning from RDD to DataFrames, then to DataSet).
it_user746673 - PeerSpot reviewer
Sr. Software Engineer at a tech vendor with 1-10 employees
Oct 1, 2017
More ML based algorithms should be added to it, to make it algorithmic-rich for developers.
it_user746943 - PeerSpot reviewer
Big Data and Cloud Solution Consultant at a financial services firm with 10,001+ employees
Oct 2, 2017
Dynamic DataFrame options are not yet available.
it_user786777 - PeerSpot reviewer
Manager | Data Science Enthusiast | Management Consultant at a consultancy with 5,001-10,000 employees
Dec 10, 2017
Include more machine learning algorithms and the ability to handle streaming of data versus micro batch processing.
reviewer894894 - PeerSpot reviewer
Solutions Architect at a computer software company with 51-200 employees
Jun 27, 2018
It should support more programming languages.
it_user946074 - PeerSpot reviewer
Principal Architect at a financial services firm with 1,001-5,000 employees
Jul 10, 2019
It needs a new interface and a better way to get some data. In terms of writing our scripts, some processes could be faster.
LC
Snr Security Engineer at a tech vendor with 201-500 employees
Jul 14, 2019
The management tools could use improvement. Some of the debugging tools need some work as well. They need to be more descriptive.
reviewer1046250 - PeerSpot reviewer
Senior Consultant & Training at a tech services company with 51-200 employees
Oct 13, 2019
When you first start using this solution, it is common to run into memory errors when you are dealing with large amounts of data.
MG
Director of BigData Offer at IVIDATA
Dec 9, 2019
The solution needs to optimize shuffling between workers.
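
One of the reviews above contrasts true streaming with micro batch processing. As a rough illustration of that distinction, the sketch below compares the two models in plain Python. It does not use Spark at all: the function names `per_record` and `micro_batch` are hypothetical, and the batches here are cut by record count purely for simplicity, which is an assumption of this example rather than a description of how Spark schedules its batches.

```python
from typing import Iterable, Iterator, List


def per_record(events: Iterable[int]) -> Iterator[int]:
    """Emit an updated running total after every single event."""
    total = 0
    for value in events:
        total += value
        yield total


def micro_batch(events: Iterable[int], batch_size: int) -> Iterator[int]:
    """Emit an updated running total once per batch of up to batch_size events."""
    total = 0
    batch: List[int] = []
    for value in events:
        batch.append(value)
        if len(batch) == batch_size:
            total += sum(batch)
            yield total
            batch = []
    if batch:  # flush the final, possibly short, batch
        total += sum(batch)
        yield total


# Hypothetical input: five events arriving in order.
events = [1, 2, 3, 4, 5]
per_record_totals = list(per_record(events))
micro_batch_totals = list(micro_batch(events, batch_size=2))

print(per_record_totals)   # one result per event
print(micro_batch_totals)  # one result per batch
```

Both models reach the same final total. The difference is how often a result becomes visible: the per-record version updates after each event, while the micro-batch version only updates when a batch closes, which is the latency trade-off the reviewer is pointing at.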