We performed a comparison between Apache Spark, Cloudera Distribution for Hadoop, and HPE Ezmeral Data Fabric based on real PeerSpot user reviews.
"The most valuable feature of this solution is its capacity for processing large amounts of data."
"The solution is scalable."
"The most crucial feature for us is the streaming capability. It serves as a fundamental aspect that allows us to exert control over our operations."
"With Hadoop-related technologies, we can distribute the workload with multiple commodity hardware."
"The most valuable feature is the Fault Tolerance and easy binding with other processes like Machine Learning, graph analytics."
"The solution has been very stable."
"The deployment of the product is easy."
"The most valuable feature of Apache Spark is its flexibility."
"The solution is reliable and stable, it fits our requirements."
"It has the best proxy, security, and support features compared to open-source products."
"The product provides better data processing features than other tools."
"The scalability of Cloudera Distribution for Hadoop is excellent."
"The solution's most valuable feature is the enterprise data platform."
"The product is completely secure."
"The search function is the most valuable aspect of the solution."
"With a cluster available, you can manage the security layer using the shared SDX - it provides flexibility."
"HPE Ezmeral Data Fabric can be accessed from any namespace globally as you would access it from a machine using an NFS."
"My customers find the product cheaper compared to other solutions. The previous solution that we used did not have unified analytics like the runtime or the analog."
"It is a stable solution...It is a scalable solution."
"The model creation was very interesting, especially with the libraries provided by the platform."
"I like the administration part."
"Spark could be improved by adding support for other open-source storage layers than Delta Lake."
"They could improve the issues related to programming language for the platform."
"Apache Spark could potentially improve in terms of user-friendliness, particularly for individuals with a SQL background. While it's suitable for those with programming knowledge, making it more accessible to those without extensive programming skills could be beneficial."
"In data analysis, you need to take real-time data from different data sources. You need to process this in a subsecond, do the transformation in a subsecond, and all that."
"When using Spark, users may need to write their own parallelization logic, which requires additional effort and expertise."
"The setup I worked on was really complex."
"One limitation is that not all machine learning libraries and models support it."
"It requires overcoming a significant learning curve due to its robust and feature-rich nature."
"The solution does not support multiple languages very well and this means users need to create work-arounds to implement some solutions."
"Currently, we are using many other tools such as Spark and Blade Job to improve the performance."
"They should focus on upgrading their technical capabilities in the market."
"While the deployed product is generally functional, there are instances where it presents difficulties."
"The price of this solution could be lowered."
"Cloudera Distribution for Hadoop is not always completely stable in some cases, which can be a concern for big data solutions."
"I would like to see an improvement in how the solution helps me to handle the whole cluster."
"It would be useful if Cloudera had more tools like SQL Engines that offer the traditional relational database. We have to do a lot of work preparing the data outside Cloudera before getting it into the platform."
"The deployment could be faster. I want more support for the data lake in the next release."
"Having the ability to extend the services provided by the platform to an API architecture, a micro-services architecture, could be very helpful."
"HPE Ezmeral Data Fabric is not compatible with third-party tools."
"The product is not user-friendly."
"Upgrading Ezmeral to a new version is a pain. They're trying to make the solution more container-friendly, so I think they're going in the right direction. The only problem we've had in the past was the upgrades. The process isn't smooth due to how the Red Hat operating system upgrades currently work."