We performed a comparison between Apache Spark and SAP HANA based on real PeerSpot user reviews.
Find out in this report how the two Hadoop solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The scalability has been the most valuable aspect of the solution."
"It provides a scalable machine learning library."
"It is highly scalable, allowing you to efficiently work with extensive datasets that might be problematic to handle using traditional tools that are memory-constrained."
"With Spark, we parallelize our operations, efficiently accessing both historical and real-time data."
"The most valuable feature of Apache Spark is its flexibility."
"Now, when we're tackling sentiment analysis using NLP technologies, we deal with unstructured data—customer chats, feedback on promotions or demos, and even media like images, audio, and video files. For processing such data, we rely on PySpark. Beneath the surface, Spark functions as a compute engine with in-memory processing capabilities, enhancing performance through features like broadcasting and caching. It's become a crucial tool, widely adopted by 90% of companies for a decade or more."
"ETL and streaming capabilities."
"We use Spark to process data from different data sources."
"Technically it resembles Oracle, but as a somewhat lighter version."
"The performance in terms of processing time is unmatched due to the in-memory processing capability."
"If you want to scale with new processes and new reports, that's fairly easy."
"I like the integration process. Also, the data is trusted by our management, and we use the data from transactions for analysis."
"We have found the solution to be customizable and it is beneficial it comes as a bundled package. Additionally, it is user-friendly."
"In comparison with other DMS solutions, it offers good performance."
"What I like most are the dashboards and pervasive analytics."
"The feature I found most valuable in SAP HANA is modeling. I also like that the solution has good integration and you can integrate it with any system, even third-party systems."
"Dynamic DataFrame options are not yet available."
"They could improve the issues related to programming language for the platform."
"Stream processing needs to be developed more in Spark. I have used Flink previously. Flink is better than Spark at stream processing."
"Include more machine learning algorithms and the ability to handle streaming of data versus micro batch processing."
"The graphical user interface (UI) could be a bit more clear. It's very hard to figure out the execution logs and understand how long it takes to send everything. If an execution is lost, it's not so easy to understand why or where it went. I have to manually drill down on the data processes which takes a lot of time. Maybe there could be like a metrics monitor, or maybe the whole log analysis could be improved to make it easier to understand and navigate."
"It requires overcoming a significant learning curve due to its robust and feature-rich nature."
"If you have a Spark session in the background, sometimes it's very hard to kill these sessions because of D allocation."
"The solution must improve its performance."
"They can improve their technology for the CRM subsystem. There are other products that are better and more effective for the CRM subsystem. Its price could be better. It is expensive."
"SAP HANA's long run time to extract data could be improved upon."
"The initial setup was very, very complex, tedious, and costly and required someone with great expertise to complete it."
"The product is very demanding on memory requirements."
"SAP HANA is not strong like Oracle when it comes to finance. They are only strong with the logistic business project."
"Uses a large amount of RAM and is costly."
"The worst thing about SAP HANA is the price; it's very expensive. The licensing cost, implementation cost, hosting cost, and appliance cost are all high."
"When you do a report on a non-SAP platform, you face some compatibility problems."
Apache Spark is ranked 1st in Hadoop with 60 reviews, while SAP HANA is ranked 1st in Embedded Database with 79 reviews. Both are rated 8.4. The top reviewer of Apache Spark writes "Reliable, able to expand, and handle large amounts of data well". The top reviewer of SAP HANA writes "Excellent compatibility between modules and the control". Apache Spark is most compared with Spring Boot, AWS Batch, Spark SQL, Cloudera Distribution for Hadoop and AWS Lambda, whereas SAP HANA is most compared with Oracle Database, SQL Server, MySQL, IBM Db2 Database and SAP Adaptive Server Enterprise.
We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.