We performed a comparison between Amazon EMR, Cloudera Distribution for Hadoop, and Hortonworks Data Platform based on real PeerSpot user reviews.
Find out what your peers are saying about Apache, Cloudera, Amazon Web Services (AWS) and others in Hadoop.
"When we grade big jobs from on-prem to the cloud, we do it in EMR with Spark."
"This is the best tool for hosts and it's really flexible and scalable."
"In Amazon EMR it is easy to rebuild anything, easy to upgrade and has good fault tolerance."
"We are using applications, such as Splunk, Livy, Hadoop, and Spark. We are using all of these applications in Amazon EMR and they're helping us a lot."
"It allows users to access the data through a web interface."
"The ability to resize the cluster is what really makes it stand out over other Hadoop and big data solutions."
"Amazon EMR's most valuable features are processing speed and data storage capacity."
"It has a variety of options and support systems."
"The solution's most valuable feature is the enterprise data platform."
"The file system is a valuable feature."
"The most valuable feature is Impala, the querying engine, which is very fast."
"Very good end-to-end security features."
"We also really like the Cloudera community. You can have any question and will have your answer within a few hours."
"The most valuable feature is Kubernetes."
"The product is completely secure."
"The data science aspect of the solution is valuable."
"The data platform is pretty neat. The workflow is also really good."
"We use it for data science activities."
"The product offers a fairly easy setup process."
"Hortonworks should not be expensive at all to those looking into using it."
"Now, using this solution, it is much cheaper to have all of the data available for searching, not in real-time, but whenever there is a pending request."
"The upgrades and patches must come from Hortonworks."
"The scalability is the key reason why we are on this platform."
"It is a scalable platform."
"The problem for us is it starts very slow."
"Modules and strategies should be better handled and notified early in advance."
"The initial setup was time-consuming."
"The most complicated thing is configuring the cluster and ensuring it's running correctly."
"The product must add some of the latest technologies to provide more flexibility to the users."
"There is no need to pay extra for third-party software."
"Amazon EMR is continuously improving, but it could add something like out-of-the-box CI/CD or integration with Prometheus and Grafana."
"There were times where they would release new versions and it seemed to end up breaking old versions, which is very strange."
"They should focus on upgrading their technical capabilities in the market."
"The dashboard could be improved."
"The Cloudera training has deteriorated significantly."
"I would like to see an improvement in how the solution helps me to handle the whole cluster."
"The tool's ability to be deployed on a cloud model is an area of concern where improvements are required."
"There are multiple bugs when we update."
"The procedure for operations could be simplified."
"The one thing that we struggled with predominately was support. Because it was relatively new, support was always a big issue and I think it's still a bit of an ongoing concern with the team currently managing it."
"Since Cloudera acquired HDP, it's been bundled with CDH and HDP. However, the biggest challenge is cloud storage integration with Azure, GCP, and AWS."
"Security and workload management need improvement."
"I would like to see more support for containers such as Docker and OpenShift."
"The cost of the solution is high and there is room for improvement."
"Hive performance. If Hive performance increased, Hadoop would replace (not everywhere) traditional databases."
"More information could be there to simplify the process of running the product."
"It's at end of life and no longer will there be improvements."
"It would also be nice if there were less coding involved."