We performed a comparison between Cloudera DataFlow and Cloudera Distribution for Hadoop based on real PeerSpot user reviews.
"DataFlow's performance is okay."
"The initial setup was not so difficult"
"This solution is very scalable and robust."
"The scalability of Cloudera Distribution for Hadoop is excellent."
"The file system is a valuable feature."
"We had a data warehouse before all the data. We can process a lot more data structures."
"With a cluster available, you can manage the security layer using the shared SDX - it provides flexibility."
"CDH has a wide variety of proprietary tools that we use, like Impala. So from that perspective, it's quite useful as opposed to something open-source. We get a lot of value from Cloudera's proprietary tools."
"The search function is the most valuable aspect of the solution."
"The most valuable feature is Impala, the querying engine, which is very fast."
"The solution is stable."
"It's an outdated legacy product that doesn't meet the needs of modern data analysts and scientists."
"It is not easy to use the R language. Though I don't know if it's possible, I believe it is possible, but it is not the best language for machine learning."
"Although their workflow is pretty neat, it still requires a lot of transformation coding; especially when it comes to Python and other demanding programming languages."
"There are multiple bugs when we update."
"Cloudera Distribution for Hadoop is not always completely stable in some cases, which can be a concern for big data solutions."
"I would like to see an improvement in how the solution helps me to handle the whole cluster."
"The areas of improvement depend on the scale of the project. For banking customers, security features and an essential budget for commercial licenses would be the top priority. Data regulation could be the most crucial for a project with extensive data or an extra use case."
"Cloudera Distribution for Hadoop has a limited feature list and a lot of costs involved."
"The dashboard could be improved."
"The procedure for operations could be simplified."
"The price of this solution could be lowered."
Cloudera DataFlow is ranked 13th in Streaming Analytics with 3 reviews, while Cloudera Distribution for Hadoop is ranked 2nd in Hadoop with 47 reviews. Cloudera DataFlow is rated 6.6, while Cloudera Distribution for Hadoop is rated 8.0. The top reviewer of Cloudera DataFlow writes "A scalable and robust platform for analyzing data". On the other hand, the top reviewer of Cloudera Distribution for Hadoop writes "Good end-to-end security features and we like that it's cloud independent". Cloudera DataFlow is most compared with Databricks, Confluent, Amazon MSK, Informatica Data Engineering Streaming and Hortonworks Data Platform, whereas Cloudera Distribution for Hadoop is most compared with Amazon EMR, HPE Ezmeral Data Fabric, Apache Spark, MongoDB and Cassandra.
We monitor all Streaming Analytics reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.