Cloudera DataFlow vs Databricks comparison

Cloudera DataFlow: 1,903 views | 1,559 comparisons
Databricks: 30,779 views | 25,469 comparisons
Find out what your peers are saying about Databricks, Amazon, Solace and others in Streaming Analytics. Updated: January 2022.
564,997 professionals have used our research since 2012.
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:

Cloudera DataFlow pros:
"This solution is very scalable and robust."

Databricks pros:
"Databricks' Lakehouse architecture has been most useful for us. The data governance has been absolutely efficient in between other kinds of solutions."
"Can cut across the entire ecosystem of open source technology to give an extra level of getting the transformatory process of the data."
"Databricks helps crunch petabytes of data in a very short period of time."
"One of the features provides nice interactive clusters, or compute instances that you don't really need to manage often."
"The built-in optimization recommendations halved the speed of queries and allowed us to reach decision points and deliver insights very quickly."
"Imageflow is a visual tool that helps make it easier for business people to understand complex workflows."
"The solution is easy to use and has a quick start-up time due to being on the cloud."
"The most valuable feature is the ability to use SQL directly with Databricks."

Cloudera DataFlow cons:
"Although their workflow is pretty neat, it still requires a lot of transformation coding; especially when it comes to Python and other demanding programming languages."

Databricks cons:
"Would be helpful to have additional licensing options."
"Pricing is one of the things that could be improved."
"The initial setup is difficult."
"Anyone who doesn't know SQL may find the product difficult to work with."
"Overall it's a good product, however, it doesn't do well against any individual best-of-breed products."
"I have seen better user interfaces, so that is something that can be improved."
"Databricks is an analytics platform. It should offer more data science. It should have more features for data scientists to work with."
"The product could be improved by offering an expansion of their visualization capabilities, which currently assists in development in their notebook environment."

Pricing and Cost Advice
Cloudera DataFlow: Information Not Available
Databricks:
  • "Licensing on site I would counsel against, as on-site hardware issues tend to really delay and slow down delivery."
  • "We find Databricks to be very expensive, although this improved when we found out how to shut it down at night."
  • "The pricing depends on the usage itself."
  • "I am based in South Africa, where it is expensive adapting to the cloud, and then there is the price for the tool itself."
  • "The price is okay. It's competitive."
  • "Databricks uses a price-per-use model, where you can use as much compute as you need."
  • "There are different versions."
  • "The solution uses a pay-per-use model with an annual subscription fee or package. Typically this solution is used on a cloud platform, such as Azure or AWS, but more people are choosing Azure because the price is more reasonable."

    Questions from the Community
    Top Answer: This solution is very scalable and robust.
    Top Answer: Although its workflow is pretty neat, it still requires a lot of transformation coding, especially when it comes to Python and other demanding programming languages. It should be more like other…
    Top Answer: We use this solution for sentiment analysis and fraud detection. We also use it to analyze product royalties.
    Top Answer: Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or Python. It offers many different cluster choices and excellent integration with…
    Top Answer: We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designed to accelerate innovation projects. It is based on Spark so it is very fast. It…
    Top Answer: Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analytics teams that have to interpret data to further the business goals of their…
    Also Known As
    Cloudera DataFlow: CDF, Hortonworks DataFlow, HDF
    Databricks: Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
    Learn More

    Cloudera DataFlow (CDF), formerly Hortonworks DataFlow (HDF), is a scalable, real-time streaming analytics platform that ingests, curates, and analyzes data for key insights and immediate actionable intelligence.

    Databricks provides a Unified Analytics Platform that accelerates innovation by unifying data science, engineering, and business. It uses Apache Spark to help clients with cloud-based big data processing, putting Spark on “autopilot” to significantly reduce operational complexity and management cost. The Databricks I/O module (DBIO) improves the read and write performance of Apache Spark in the cloud, and Databricks’ collaborative workspace is intended to increase productivity.

    Sample Customers
    Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
    Top Industries
    Computer Software Company: 29%
    Comms Service Provider: 19%
    Media Company: 8%
    Financial Services Firm: 8%
    Financial Services Firm: 14%
    Computer Software Company: 14%
    Mining And Metals Company: 14%
    Energy/Utilities Company: 7%
    Computer Software Company: 27%
    Comms Service Provider: 15%
    Financial Services Firm: 8%
    Company Size
    No Data Available
    Small Business: 14%
    Midsize Enterprise: 17%
    Large Enterprise: 69%
    Small Business: 26%
    Midsize Enterprise: 19%
    Large Enterprise: 55%

    Cloudera DataFlow is ranked 10th in Streaming Analytics with 1 review while Databricks is ranked 1st in Streaming Analytics with 24 reviews. Cloudera DataFlow is rated 8.0, while Databricks is rated 7.8. The top reviewer of Cloudera DataFlow writes "A scalable and robust platform for analyzing data". On the other hand, the top reviewer of Databricks writes "Has a good feature set but it needs samples and templates to help invite users to see results". Cloudera DataFlow is most compared with Spring Cloud Data Flow, Confluent, Hortonworks Data Platform, Amazon MSK and Amazon Kinesis, whereas Databricks is most compared with Microsoft Azure Machine Learning Studio, Amazon SageMaker, Azure Stream Analytics, Alteryx and Dataiku Data Science Studio.

    We monitor all Streaming Analytics reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.