
Apache Spark Reviews

Vendor: Apache
4.2 out of 5
Ranked 1

Apache Spark mindshare

As of June 2026, Apache Spark's mindshare in the Hadoop category stands at 13.9%, down from 17.6% a year earlier, according to calculations based on PeerSpot user engagement data.
Hadoop Mindshare Distribution

Product                            Mindshare (%)
Apache Spark                       13.9
Cloudera Distribution for Hadoop   14.7
HPE Data Fabric                    10.2
Other                              61.2

PeerResearch reports based on Apache Spark reviews

Type         Title                                               Date
Category     Hadoop                                              Jun 21, 2026
Product      Reviews, tips, and advice from real users           Jun 21, 2026
Comparison   Apache Spark vs Cloudera Distribution for Hadoop    Jun 21, 2026
Comparison   Apache Spark vs Amazon EMR                          Jun 21, 2026
Comparison   Apache Spark vs HPE Data Fabric                     Jun 21, 2026

Suggested products

Title             Rating   Mindshare   Recommending   Interviews
Spring Boot       4.2      N/A         93%            44
Spot by Flexera   4.3      N/A         100%           6
 
 

Review data by company size

By reviewers

Company Size         Count
Small Business       25
Midsize Enterprise   14
Large Enterprise     25

By visitors reading reviews

Company Size         Count
Small Business       164
Midsize Enterprise   48
Large Enterprise     226

Top industries

By visitors reading reviews

Industry                                    Share
Financial Services Firm                     22%
Manufacturing Company                       9%
Construction Company                        8%
Comms Service Provider                      7%
Marketing Services Firm                     5%
Computer Software Company                   5%
Outsourcing Company                         5%
Healthcare Company                          4%
University                                  4%
Government                                  4%
Retailer                                    4%
Insurance Company                           3%
Educational Organization                    3%
Transportation Company                      2%
Performing Arts                             2%
Media Company                               2%
Real Estate/Law Firm                        2%
Consumer Goods Company                      1%
Legal Firm                                  1%
Non Profit                                  1%
Pharma/Biotech Company                      1%
Recreational Facilities/Services Company    1%
Religious Institution                       1%
Hospitality Company                         1%
Renewables & Environment Company            1%

Apache Spark Reviews Summary

Author info: Data Architect at Devtech
Rating: 4.5
Review Summary: I’ve used Apache Spark for four years, mainly for data integration and access. Its in-memory processing and open-source flexibility suit my needs, despite some stability issues. I prefer it over commercial tools like Informatica due to cost and adaptability.

Author info: Consultant, Chief Engineer, Teamleiter at infoteam Software AG
Rating: 4.0
Review Summary: I used Apache Spark for two years in an on-prem prototype; setup was straightforward and support was good. I liked its fast database access, transformation, and reliable data exchange/integration. Licensing seemed midrange, but the customer ultimately chose another technology.

Author info: Data Engineer at a tech company with 10,001+ employees
Rating: 5.0
Review Summary: I use Apache Spark for real-time data processing and transformation across multiple sources like CRM and Siebel. It's reliable, fast, and improves our decision-making, though I see future needs for better integration with emerging cloud solutions.

Author info: Data Scientist at a financial services firm with 10,001+ employees
Rating: 4.5
Review Summary: I primarily use Apache Spark for data processing tasks involving large datasets, appreciating its ease of use and portability. While it's efficient for both small and large datasets, the lack of support for geospatial data is a limitation.

Author info: Head of Data at an energy/utilities company with 51-200 employees
Rating: 4.0
Review Summary: Apache Spark reduced operational costs by 50%. Although it supports parallel processing, it needs improvements in scalability and user-friendliness. Working with datasets isn't as straightforward as with Pandas, though it's flexible and functional.

Author info: Manager Data Analytics at an outsourcing company with 5,001-10,000 employees
Rating: 3.5
Review Summary: We use Apache Spark to handle real-time data streaming and machine learning, significantly improving efficiency and reducing costs. It offers flexibility in scaling and integrates well with other tools, though its learning curve could be challenging for non-technical users.

Author info: Senior Software Architect at USEReady
Rating: 4.0
Review Summary: I use Apache Spark for big data engineering, valuing its batch and streaming capabilities. While stable and scalable, its ecosystem is complex for beginners, and clustering setup can be tricky. I rate it 8/10.

Author info: Senior Developer at Infosys
Rating: 3.5
Review Summary: My experience with Spark for large-scale distributed data transformations is positive due to its speed and cost reduction. While setup is complex and scheduling needs external tools, I recommend it for big data processing.

Author info: Sr Manager at a transportation company with 10,001+ employees
Rating: 4.5
Review Summary: I use Apache Spark for real-time data processing and ETL tasks. It offers unparalleled features but faces limitations due to its in-memory implementation. Despite improvements in version 3.0, reducing costs and addressing memory issues would enhance it further.

Author info: Head of Data Science center of excellence at Ameriabank CJSC
Rating: 4.0
Review Summary: I use Apache Spark primarily for in-memory processing of big data, which is valuable for tasks like running ML algorithms. Although its Pandas UDF support is advantageous, the Java overhead and performance issues suggest alternatives may be preferable.
Devindra Weerasooriya
Data Architect at Devtech
Nov 20, 2025
Provides a consistent framework for building data integration and access solutions with reliable performance
Michael Lierheimer
Consultant, Chief Engineer, Teamleiter at infoteam Software AG
Feb 27, 2026
Prototype work has improved fast data access, transformation, and reliable cross-site exchange
Omar Khaled
Data Engineer at a tech company with 10,001+ employees
Aug 12, 2025
Empowering data consolidation and fast decision-making with efficient big data processing
Dunstan Matekenya
Data Scientist at a financial services firm with 10,001+ employees
Jul 10, 2024
Open-source solution for data processing with portability
Madhan Potluri
Head of Data at an energy/utilities company with 51-200 employees
Aug 5, 2024
Offers user-friendliness, clarity and flexibility
reviewer2534727
Manager Data Analytics at an outsourcing company with 5,001-10,000 employees
Aug 12, 2024
A flexible solution with real-time processing capabilities
KamleshPant
Senior Software Architect at USEReady
Apr 24, 2025
Handles both batch and streaming data efficiently for real-time processing
Bharghava Raghavendra Beesa
Senior Developer at Infosys
Jan 21, 2025
Faster data transformations achieved but scheduling dependencies require external solutions
Sachin Shukre
Sr Manager at a transportation company with 10,001+ employees
Dec 6, 2023
Offers real-time and near-real-time data processing
Aleksandr Motuzov
Head of Data Science center of excellence at Ameriabank CJSC
Sep 23, 2024
Enhanced data processing with good support and helpful integration with Pandas syntax in distributed mode