

Find out in this report how the two Data Management Platform (DMP) solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
We saved licensing costs when we decommissioned some of our data platforms and moved to this platform.
In terms of return on investment, I see great changes in operational effectiveness measured by RTO when comparing on-premises solutions with cloud solutions.
The main positive impact of Cloudera Data Platform has been clear time savings and improved performance.
This reduction in both time and money resulted in real-time impact and significant cost savings.
For a lot of different tasks, including machine learning, it is a nice solution.
When it comes to big data processing, I prefer Databricks over other solutions.
I would rate the customer support of Cloudera Data Platform ten out of ten.
I have communicated with technical support, and they are responsive and helpful.
Cloudera support is timely and responsive, adhering to the SLAs they provide.
Whenever we reach out, they respond promptly.
As of now, we are raising issues and they are providing solutions without any problems.
I rate the technical support as fine because they offer several levels of technical support; partners in particular get really good support from Databricks on new features.
CDP allows for easy, mostly automated scalability where I can schedule job workflows, fine-tune system resource metrics, and add nodes with just a click.
They have the cloud burst feature available where if the on-premises capacity is not sufficient at a point in time, you can run that Spark job on the cloud itself.
The ability to scale processing capacity on demand for batch jobs without impacting other workloads, and support for a growing number of concurrent users and teams accessing the platform simultaneously are significant advantages.
The sky's the limit with Databricks.
The patches have sometimes caused issues leading to our jobs being paused for about six hours.
Databricks is an easily scalable platform.
Sometimes the end user is not experienced or lacks Cloudera-specific expertise, which makes the platform very difficult to manage properly.
Sometimes a node goes down, but it automatically returns to a healthy state.
Cloudera Data Platform is pretty stable in my experience; I have not seen any downtime or reliability issues.
They release patches that sometimes break our code.
Although it is too early to definitively state the platform's stability, we have not encountered any issues so far.
Databricks is definitely a very stable and reliable product.
We aim to address these issues with a Kubernetes-based platform that will simplify the task of upgrading services.
Cloudera Data Platform should include additional capabilities and features similar to those offered by other data management solutions like Azure and Databricks.
Cloudera Data Platform could be improved by making it more practical to use in the cloud; some of the components it relies on there are complex and not really convenient.
Adjusting features like worker nodes and node utilization during cluster creation could mitigate these failures.
We prefer using a small to mid-sized cluster for many jobs to keep costs low, but this sometimes doesn't support our operations properly.
We use MLflow for managing MLOps; however, further improvement would be beneficial, especially for large language models and related tools.
Initially, CDH had a straightforward pricing model based on nodes, but CDP includes factors like processors, cores, terabytes, and drives, making it difficult to calculate costs.
We find Cloudera Data Platform to be cost-effective.
So far, I would say the pricing we have received is competitive.
It is not a cheap solution.
I believe that in terms of credits for Databricks, we're spending between £15,000 and £20,000 a month.
By using the Hadoop File System for distributed storage, we have 1.5 petabytes of physical storage with 500 terabytes of effective storage due to a replication factor of three.
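The storage figures in the comment above follow from simple division: with a replication factor of three, each block is stored three times, so usable capacity is roughly raw capacity divided by three. A minimal sketch of that arithmetic, assuming decimal units (1 PB = 1000 TB) and ignoring reserved space and temporary data; the function name `usable_capacity_tb` is hypothetical and only for illustration:

```python
def usable_capacity_tb(raw_tb: float, replication_factor: int) -> float:
    """Return approximate usable capacity when each block is stored
    `replication_factor` times. Overheads are deliberately ignored."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return raw_tb / replication_factor


# Figures quoted by the reviewer: 1.5 PB of physical storage, replication factor 3.
raw_tb = 1500.0  # assumption: 1 PB = 1000 TB
effective_tb = usable_capacity_tb(raw_tb, 3)
print(effective_tb)  # 500.0
```

In practice the effective figure would be somewhat lower once non-DFS reserved space and intermediate job data are accounted for.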
The Ranger integration makes it more flexible and reliable for me by allowing control over data access, specifying who can access at what level, such as table level, masking, or data layer level.
What stands out the most in Cloudera Manager is SDX, which provides centralized control for governance, security, and data lineage across multiple sources.
Databricks' capability to process data in parallel enhances data processing speed.
The platform allows us to leverage cloud advantages effectively, enhancing our AI and ML projects.
The Unity Catalog is for data governance, and the Delta Lake is to build the lakehouse.
| Product | Mindshare (%) |
|---|---|
| Cloudera Data Platform | 8.4% |
| Databricks | 6.7% |
| Other | 84.9% |


| Company Size | Count |
|---|---|
| Small Business | 8 |
| Midsize Enterprise | 7 |
| Large Enterprise | 26 |

| Company Size | Count |
|---|---|
| Small Business | 27 |
| Midsize Enterprise | 12 |
| Large Enterprise | 56 |
Cloudera Data Platform provides efficient data management through features like Hue, Spark, and Impala. It integrates open-source solutions, supports hybrid environments, and enhances data governance while prioritizing security, scalability, and cost-effectiveness.
Cloudera Data Platform addresses data management needs by supporting large-scale analytics, data science, and ETL processes. It facilitates seamless operation with Ambari UI for deployment and monitoring. Users benefit from robust security via Ranger, open-source compatibility, and a flexible ecosystem that uses Hadoop components. While it simplifies setup and supports hybrid workloads, improvements in AI, machine learning, NameNode High Availability stability, and cost management are ongoing needs. Challenges in tool usability, governance maturity, and scalability call for continued innovation, especially in cloud adoption and staying aligned with open-source technologies.
Organizations in banking, healthcare, and hospitality leverage Cloudera Data Platform for data management, analytics, and cross-source integration. It handles complex data structures, bolsters AI workloads, and adheres to data compliance standards while integrating with tools like Spark, Kafka, and machine learning models.
Databricks offers a scalable, versatile platform that integrates seamlessly with Spark and multiple languages, supporting data engineering, machine learning, and analytics in a unified environment.
Databricks stands out for its scalability, ease of use, and powerful integration with Spark, multiple languages, and leading cloud services like Azure and AWS. It provides tools such as the Notebook for collaboration, Delta Lake for efficient data management, and Unity Catalog for data governance. While enhancing data engineering and machine learning workflows, it faces challenges in visualization and third-party integration, with pricing and user interface navigation being common concerns. Despite needing improvements in connectivity and documentation, it remains popular for tasks like real-time processing and data pipeline management.
In the tech industry, Databricks empowers teams to perform comprehensive data analytics, enabling them to conduct extensive ETL operations, run predictive modeling, and prepare data for SparkML. In retail, it supports real-time data processing and batch streaming, aiding in better decision-making. Enterprises across sectors leverage its capabilities for creating secure APIs and managing data lakes effectively.
We monitor all Data Management Platform (DMP) reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn and personal follow-up with the reviewer when necessary.