Data Architect at a financial services firm with 51-200 employees
ETL processes benefit from cost-effective offloading and could see improved deployment capabilities
Pros and Cons
- "Integration with other tools works well for us and we successfully scaled the solution after two to three years without any issues."
- "The initial setup may take several hours or days, depending on the challenges faced during installation. It's not always a smooth process due to potential complexities."
What is our primary use case?
The primary usage of Cloudera Data Platform is to offload ETL processes because it's cheaper compared to data warehouse solutions like Teradata or Oracle. Furthermore, basic reporting can be done, and some real-time processes can be managed.
What is most valuable?
The foremost benefit is offloading data from the warehouse to Cloudera Data Platform, which allows for cheaper storage. We use it to push transformations and run ETL processes, leveraging tools like Spark. Cloudera also supports various functionalities, including AI and Gen AI tools. Basic reporting and some real-time functions are manageable on the platform.
What needs improvement?
Cloudera Data Platform should include additional capabilities and features similar to those offered by other data management solutions like Azure and Databricks.
For how long have I used the solution?
I have been using Cloudera Data Platform for more than five years.
Buyer's Guide
Cloudera Data Platform
December 2025
Learn what your peers think about Cloudera Data Platform. Get advice and tips from experienced pros sharing their opinions. Updated: December 2025.
879,310 professionals have used our research since 2012.
What was my experience with deployment of the solution?
The installation of Cloudera Data Platform had some challenges, but this is common with many products. An improved deployment process would help deliver solutions more quickly.
What do I think about the stability of the solution?
I would rate the stability of Cloudera Data Platform as eight out of ten.
What do I think about the scalability of the solution?
Integration with other tools works well for us and we successfully scaled the solution after two to three years without any issues. I would rate the scalability as eight out of ten.
How are customer service and support?
I have communicated with technical support, and they are responsive and helpful. I would rate their support as seven out of ten.
How would you rate customer service and support?
Neutral
Which solution did I use previously and why did I switch?
Initially, the decision for Cloudera was driven by pricing and the support they provided.
How was the initial setup?
The initial setup may take several hours or days, depending on the challenges faced during installation. It's not always a smooth process due to potential complexities.
What about the implementation team?
The implementation involved multiple teams, including Cloudera support, with three to four people from our client's side involved.
What other advice do I have?
I recommend Cloudera Data Platform. Overall, I would rate it a seven out of ten despite the complexities in deployment.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer. Integrator
Last updated: May 5, 2025
Senior Software Engineer at a tech vendor with 501-1,000 employees
Has improved data analysis workflows and centralized sensitive information but needs faster adoption of latest technologies
Pros and Cons
- "Cloudera Data Platform has impacted my organization positively by providing cost-saving benefits, which was our North Star and the reason we shifted to it."
- "A major drawback that I see with Cloudera Data Platform is that it tends to be one or two years behind the latest open-source implementation of the Apache toolkits such as Spark, Hive, and everything else, which relates to the guardrails of following compliance as the main driving factor, so that is something that could be done better on Cloudera Data Platform's side."
What is our primary use case?
My main use case for Cloudera Data Platform is to host in-house data which is sensitive and very guard-railed for compliance.
A quick specific example of the type of sensitive data I'm hosting is related to personally identifiable information as well as data which is financial and transactional in nature, and Cloudera Data Platform helps with compliance by giving us a uniform approach to this. We have implemented the compliance-based entitlements using toolkits provided by Cloudera Data Platform and have our own implementation for each region where we are hosting the data.
What is most valuable?
The most unique thing about my setup with Cloudera Data Platform is having loads and loads of high-volume data, even for specific geographies, and using Cloudera Data Platform has helped us modernize that in a way where the tech is simpler for us.
The best features Cloudera Data Platform offers include their Kafka and Spark offerings, which we are using majorly, along with the Sqoop offering and a bit of Airflow here and there. The standout offering we have used from them is the Spark engine and the Impala engine to query our data, making the Impala cluster the best thing we have used from them.
Using the Spark and Impala engines makes my daily work easier and more efficient because Impala gives us a way to easily do analysis on the data, which simplifies the work of a business analyst as well as a PM when they are doing the initial analysis before the actual development begins for Spark, helping reduce the overall development cycle time.
Cloudera Data Platform has impacted my organization positively by providing cost-saving benefits, which was our North Star and the reason we shifted to it. We had data distributed across many platforms before starting this, and now the entire data strategy is designed around Cloudera Data Platform because it is very simple and very configurable.
What needs improvement?
A major drawback that I see with Cloudera Data Platform is that it tends to be one or two years behind the latest open-source implementation of the Apache toolkits such as Spark, Hive, and everything else, which relates to the guardrails of following compliance as the main driving factor, so that is something that could be done better on Cloudera Data Platform's side.
For how long have I used the solution?
I have been using Cloudera Data Platform for the last five years.
What do I think about the stability of the solution?
Cloudera Data Platform is stable most of the time.
What do I think about the scalability of the solution?
The scalability of Cloudera Data Platform is the best part about it, as you can just add nodes. Horizontal scaling is very fast, while vertical scaling takes a bit of time, but that works as well.
How are customer service and support?
The customer support for Cloudera Data Platform is okay; I would not say they are the best with technical abilities, however, they are able to address most of the issues.
How would you rate customer service and support?
Neutral
Which solution did I use previously and why did I switch?
Before choosing Cloudera Data Platform, we were using an IBM Netezza solution, which had been working for the last ten years, but the cost was getting quite high as we scaled.
What was our ROI?
The major improvement we have is the consolidation of all the data models that we had, as now we have everything designed for Cloudera Data Platform, which saves a lot of time when we are doing exchanges between various teams.
Which other solutions did I evaluate?
I was not part of the exercise to evaluate other options before choosing Cloudera Data Platform, so I cannot comment on that.
What other advice do I have?
I would like to add that we are also using Kafka across geographies in some cases, where compliance rules allow it, and it has helped us maintain data pipelines more easily.
While I do not have exact numbers, I can tell you that we are processing more than a million data points every day on Cloudera Data Platform.
I don't have anything else to add about needed improvements regarding support, documentation, or other features.
My advice for others looking into using Cloudera Data Platform is to start on the cloud and stay there if your domain allows, because being on the cloud gives you faster adoption of new technologies as well as a distributed technical implementation, which provides more stability and flexibility.
I would rate this product a seven out of ten.
Which deployment model are you using for this solution?
On-premises
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Other
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Last updated: Nov 4, 2025
MES Consultant at a consultancy with 10,001+ employees
Has supported multi-source data integration and enabled real-time analytics across hybrid environments
Pros and Cons
- "My main use case for Cloudera Data Platform is to support a multi-source system with different data types, handling semi-structured, structured, and unstructured data to support those kinds of workloads."
What is our primary use case?
The main use case for Cloudera Data Platform is to support a multi-source system with a multi-data structure. We have streaming services, Kafka services, RDBMS systems, and semi-structured data in the form of CSV and JSON files where we used to have everything in place and centralized.
Cloudera Data Platform also supports a hybrid data warehouse, which is similar to a relational database management system where business users can run analytic queries, such as a SELECT * query. Cloudera Data Platform also supports PySpark, where a user can create a DataFrame and then transform and load the data to get insights.
What is most valuable?
The best features of Cloudera Data Platform are that it supports hybrid types of environments, real-time streaming analytics, secure data and governance, machine learning and AI workloads, data warehousing and BI, and edge-to-edge AI use cases.
In the hybrid environment, we can have a private cloud as well as a public cloud, which helps us enable both types of workloads. We have data that keeps coming through a pipeline, and then we just ingest our data. The data engineer transforms and loads it to a data lake, which is Amazon S3. Once the data is ready, it's on the downstream, and it's available for the consumer end to consume the data.
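As a rough illustration of the ingest, transform, and load flow described above, here is a minimal sketch in plain Python. It is not the reviewer's pipeline: the record fields, the cleaning rules, the object key, and the in-memory dictionary standing in for an Amazon S3 bucket are all assumptions made for the example.

```python
import json
from typing import Dict, Iterable, List


def ingest(raw_lines: Iterable[str]) -> List[dict]:
    """Parse newline-delimited JSON records, skipping blank lines."""
    return [json.loads(line) for line in raw_lines if line.strip()]


def transform(records: List[dict]) -> List[dict]:
    """Drop records without an amount and upper-case the currency code."""
    return [
        {**r, "currency": r.get("currency", "").upper()}
        for r in records
        if r.get("amount") is not None
    ]


def load(records: List[dict], lake: Dict[str, str], key: str) -> None:
    """Write the records as one JSON-lines object into the stand-in lake."""
    lake[key] = "\n".join(json.dumps(r, sort_keys=True) for r in records)


# Hypothetical input: one valid record, one blank line, one record with no amount.
raw = ['{"id": 1, "amount": 10.0, "currency": "usd"}', "", '{"id": 2, "currency": "eur"}']

lake: Dict[str, str] = {}  # stand-in for a data lake bucket (assumption)
load(transform(ingest(raw)), lake, "curated/transactions.jsonl")
print(lake["curated/transactions.jsonl"])
```

In a real deployment each stage would typically run on an engine such as Spark and write to object storage; the sketch only shows how the stages hand data to one another before downstream consumers read it.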
The most important feature of Cloudera Data Platform is Apache Ranger, which provides granular security: you can apply column-level security and decide which columns to expose to the consumer, rather than controlling access only at the table level.
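To make the idea of column-level security concrete, here is a small conceptual sketch in plain Python. It does not use or imitate the actual policy engine's API or policy format; the role names, column names, and the policy dictionary are hypothetical and exist only to show the difference between exposing a whole table and exposing selected columns.

```python
from typing import Dict, List, Set

# Hypothetical policy: the columns each role may read from a "customers" table.
COLUMN_POLICY: Dict[str, Set[str]] = {
    "analyst": {"customer_id", "region"},
    "auditor": {"customer_id", "region", "account_balance"},
}


def apply_column_policy(rows: List[dict], role: str) -> List[dict]:
    """Return copies of the rows containing only the columns the role may see."""
    allowed = COLUMN_POLICY.get(role, set())  # unknown roles see no columns
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]


rows = [
    {"customer_id": 1, "region": "EU", "account_balance": 1200.50, "ssn": "000-00-0000"},
]

print(apply_column_policy(rows, "analyst"))  # only customer_id and region remain
```

A table-level grant would hand every column, including the sensitive ones, to any role allowed to read the table; filtering per column lets the same table serve consumers with different entitlements.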
Cloudera Data Platform has a great impact on my organization as it supports the business demand and business requirements, making me happy with the business use case. It depends on what the business demands and the business use case, which allows for an evaluation of what the business wants. Based on that, they can make a decision on where to go and where to migrate a workload.
What needs improvement?
I would definitely like to see more innovation in Cloudera Data Platform toward full-fledged AI and ML workloads. AI is supported currently, but I'm interested in having ML and LLMs supported in a full-fledged manner as well.
For how long have I used the solution?
I have been working in the current field for almost six to eight years.
What do I think about the stability of the solution?
Cloudera Data Platform is stable.
What do I think about the scalability of the solution?
Cloudera Data Platform's scalability is very nice, as you can have multiple workloads and even have multiple clusters with different CDP runtimes. You just have to define the business requirement in the configuration, and based on usage, it automatically scales up and scales down.
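The usage-based scaling described above can be pictured with a simple threshold rule. This is only a sketch of the general technique, not how Cloudera Data Platform decides to scale; the thresholds, node limits, and the function name `desired_nodes` are assumptions chosen for illustration.

```python
def desired_nodes(
    current: int,
    utilization: float,
    min_nodes: int = 2,
    max_nodes: int = 10,
    scale_up_at: float = 0.80,
    scale_down_at: float = 0.30,
) -> int:
    """Return the target node count for a cluster given its current utilization.

    Adds one node when utilization is at or above scale_up_at, removes one
    when it is at or below scale_down_at, and always stays within the
    configured minimum and maximum.
    """
    if utilization >= scale_up_at:
        target = current + 1
    elif utilization <= scale_down_at:
        target = current - 1
    else:
        target = current
    return max(min_nodes, min(max_nodes, target))


print(desired_nodes(current=4, utilization=0.90))  # busy cluster: grows by one node
print(desired_nodes(current=4, utilization=0.10))  # idle cluster: shrinks by one node
```

The minimum and maximum play the role of the limits defined in the configuration, so the cluster never shrinks below a safe baseline or grows past a cost ceiling.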
How are customer service and support?
Customer support for Cloudera Data Platform is very good.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
We had been using Cloudera's Hadoop distribution, the CDH product. CDH was available on-premises only, so we migrated from on-premises to the cloud to take advantage of cloud compute.
How was the initial setup?
The experience with pricing, setup cost, and licensing is very good. The cloud service provider has a built-in tool for analyzing which zone and region to use. Because service costs vary by location, it lets us choose the region that is most suitable and cheapest.
What was our ROI?
In terms of ROI, we definitely have seen a return on investment. Due to security, we cannot disclose the value, but we have definitely seen an ROI.
What's my experience with pricing, setup cost, and licensing?
The experience with pricing, setup cost, and licensing is very good.
Which other solutions did I evaluate?
I did not evaluate other options before choosing Cloudera Data Platform.
What other advice do I have?
I would rate Cloudera Data Platform an eight out of ten because it's excellent in terms of the product, its deliverability, its support, and its use cases. It might differ for different industries depending on what each industry wants, but overall, it has a good impression, and I'm happy with the work relationship with Cloudera technical support.
If someone is looking for a hybrid environment or a cloud environment, they can definitely consider reviewing Cloudera Data Platform. They can look at all the aspects, as the Cloudera Data Platform ecosystem provides Apache Hive, HBase, Kafka, NiFi, Solr, and Knox, which they can review based on their business use case.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Last updated: Oct 27, 2025
Sr Manager at a transportation company with 10,001+ employees
Good for secure containerization and governance capabilities
Pros and Cons
- "Distributed computing, secure containerization, and governance capabilities are the most valuable features."
- "Since Cloudera acquired HDP, it's been bundled with CDH and HDP. However, the biggest challenge is cloud storage integration with Azure, GCP, and AWS."
What is our primary use case?
We use it for multiple domains, including oil & gas, finance (Morgan Stanley), and healthcare. We process around 186 TB of data per day for analytics purposes.
Currently, we use it for the healthcare domain.
What is most valuable?
Distributed computing, secure containerization, and governance capabilities are the most valuable features.
What needs improvement?
Since Cloudera acquired HDP, it's been bundled with CDH and HDP. However, the biggest challenge is cloud storage integration with Azure, GCP, and AWS. These platforms offer competitive storage solutions like Gen2, Gen1, Bigtable, BigQuery, Lightstore, S3 buckets, etc., which pose significant competition to HDP.
For how long have I used the solution?
I have experience with this product. The short form is HDP 2.7. I have been using it since 2011.
It was on-premises and hybrid for the first three months, then we migrated it to AWS and Azure.
What do I think about the stability of the solution?
In terms of storing data in different formats, it's been somewhat unstable. But when compared to Azure Gen2 and its support and features, it's much more advanced. The suitability depends on specific use cases, but overall, HDP seems more mature than it was in the past.
What do I think about the scalability of the solution?
From my experience with both HDP and CDH, they are both scalable. Currently, most people in my company have shifted to Azure, so they are using Gen2 primarily and discarding Gen1.
How are customer service and support?
I have frequently contacted technical support for both Cloudera and Hortonworks.
We have an IT system to raise issues against their team. Issues usually get attended by someone at an L1, L2, or L3 support level. They connect with us directly.
Which solution did I use previously and why did I switch?
Previously, we used Cloudera Data Platform (CDP), which turned out to be a cloud-based Azure infrastructure, and implemented metadata solutions like Hive and others.
How was the initial setup?
The setup was very difficult on non-cloud platforms. We had to implement a version-based approach. However, it became simpler with the use of Docker. In the early days, we used to do it with HDP sandboxes and VM boxes and then create clusters. Now, on cloud platforms, it's much easier, just a matter of a few clicks. That's another approach we can take.
What's my experience with pricing, setup cost, and licensing?
I haven't done a price analysis specifically for HDP. However, when it was first introduced as Hadoop 2.0, there were a few use cases where the price was quite high.
It was particularly expensive for Cloudera and Hortonworks Data Platform. Both options were quite resource-intensive.
So, seven, or even nine or ten years ago, it was quite expensive.
What other advice do I have?
I recommend a mature decision-making model. Assess your specific needs and use cases. If HDP suits your requirements, use it. Otherwise, there are many advanced options available. Review and choose the best one for your use case.
Overall, I would rate the solution a nine out of ten.
I simply love this technology when it comes to new developments. And I've been working with it for the past twelve to thirteen years. However, with the emergence of new technologies, there might be a chance that I would reduce one point because there's room for improvement.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Vice President - Product Management at a computer software company with 1,001-5,000 employees
A good technology with an easy setup but is at end of life
Pros and Cons
- "The product offers a fairly easy setup process."
- "It's at end of life and no longer will there be improvements."
What is our primary use case?
We primarily use the solution for data storage and processing.
What is most valuable?
It is one of the better technologies in terms of Hadoop.
The product offers a fairly easy setup process.
It is quite stable.
The scalability is good.
What needs improvement?
We'd like the solution to be easier.
There used to be a free version on offer. Now, everything is paid.
It's not quite as easy to install as EMR.
It's at end of life and no longer will there be improvements.
For how long have I used the solution?
I've been using the solution for more than two years.
What do I think about the stability of the solution?
The solution has been stable for the most part. There aren't issues with bugs or glitches, and it doesn't crash or freeze. It's reliable.
What do I think about the scalability of the solution?
The product is scalable. It's not a problem if you would like to expand it.
How are customer service and support?
Customer support is now provided by Cloudera. We are in discussions about migrating to Cloudera. As we are still in the discussion phase, we have yet to work with technical support.
Which solution did I use previously and why did I switch?
We need to upgrade and are considering Cloudera. It also used to be free, and now that we have to pay for this product, we are looking to move off of it and onto potentially Cloudera.
How was the initial setup?
The solution is fairly simple to set up. It's not too complex or difficult. If you know the solution, it's easy. However, there is a learning curve. If you don't know anything about it, it can be more complex.
You can typically deploy it within a week. We have five or six people capable of handling a deployment.
What about the implementation team?
We have our own team that is capable of handling deployments. We did not need to have an integrator or consultant come in.
What's my experience with pricing, setup cost, and licensing?
The solution used to have a free tier. They have since taken that away, which is disappointing.
Which other solutions did I evaluate?
We are constantly evaluating other options. We're always curious to see what could help our business. Currently, we've looked into EMR and Cloudera.
What other advice do I have?
We're a customer and end-user.
We are not using the latest version. We have to upgrade it too, which is why we are looking at Cloudera now.
Hortonworks is at end of life. They are not adding new things to this, so everything will be Cloudera now.
I'd rate it seven out of ten.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Head of technical and projects at a tech vendor with 10,001+ employees
Helps with data management and has good scalability
Pros and Cons
- "It is a scalable platform."
- "More information could be there to simplify the process of running the product."
What is our primary use case?
We use Hortonworks Data Platform for data management, significant data ingestion, and analytics.
What needs improvement?
Hortonworks Data Platform has a limited user community. I haven't seen much discussion about user experiences. More information could be there to simplify the process of running the product.
For how long have I used the solution?
We have been using Hortonworks Data Platform for a couple of months.
What do I think about the stability of the solution?
I rate the product's stability an eight out of ten.
What do I think about the scalability of the solution?
We have five Hortonworks Data Platform users in our organization. It is a scalable platform.
How was the initial setup?
The initial setup could be more straightforward. It helps to be technically inclined in order to follow the necessary steps. There could be easier ways to set it up. It takes 45 minutes to complete and requires a team of five people to execute the process.
What about the implementation team?
We implement the product in-house.
What's my experience with pricing, setup cost, and licensing?
Currently, we are using the product in a sandbox environment, and there is no licensing. We might choose a licensing option once we get the results.
What other advice do I have?
I recommend Hortonworks Data Platform to others and rate it an eight out of ten.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Senior Cloud Storage Engineer at a comms service provider with 10,001+ employees
Upgrades and patching are addressed by the solution, and they offer a sandbox for testing
Pros and Cons
- "The upgrades and patches must come from Hortonworks."
- "The cost of the solution is high and there is room for improvement."
What is our primary use case?
There are a lot of use cases for the Hortonworks Data Platform. We use it alongside GPFS, so most of the information we use for operational analytics is primarily on the Hortonworks Data Platform.
What is most valuable?
The upgrades and patches must come from Hortonworks. Therefore, if we encounter any problems, they will be responsible for addressing them. This is one of the instances where we have to rely on them for all the upgrades.
What needs improvement?
The cost of the solution is high and there is room for improvement.
For how long have I used the solution?
I have been using the Hortonworks Data Platform for two years.
What do I think about the scalability of the solution?
Hortonworks Data Platform is scalable, but it lacks the capability for horizontal scaling. Therefore, we need to add more servers to increase its capacity.
How was the initial setup?
I am responsible for setting up the infrastructure, but I don't handle the engineering work.
What other advice do I have?
I would rate Hortonworks Data Platform an eight out of ten. The solution delivers on its promises, and Hortonworks provides a sandbox for testing before making a purchase.
The maintenance requires a lot of people, including the DRE and IRE teams.
It is not practical for most organizations that lack large amounts of resources to maintain their own data platform. The Hortonworks Data Platform makes it easier for such organizations.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Data Science and Data Engineering Leader | Senior Principal Data Scientist at a healthcare company with 10,001+ employees
A modeling and analysis data platform that is inexpensive and stable
Pros and Cons
- "The scalability is the key reason why we are on this platform."
- "Hortonworks should not be expensive at all to those looking into using it."
- "The version control of the software is also an issue."
What is our primary use case?
We use Hortonworks as a storage platform and then we create machine learning models and do the execution using Cloudera Data Science Workbench. (Cloudera and Hortonworks merged in January of 2019.)
What is most valuable?
The most valuable part of this product is what Cloudera Data Science Workbench can do as a whole for modeling and analysis.
What needs improvement?
We are happy with the platform but we are also looking at what else is out there. We are comparing what other teams are using in our company to the solution we have adopted.
The main difference, and what seems better with other tools, is the deployment. That part is not entirely clear to us. We can create models, but once we create the models we are not sure as to how to deploy them.
The version control of the software is also an issue.
Maybe these improvements are already included in the newest release but we are not aware of it because nobody on our team has had the opportunity to try it. I do not know how well this product is supported.
For how long have I used the solution?
We have been using Hortonworks Data Platform for almost a year.
What do I think about the stability of the solution?
We do not maintain it in our department. We just use it. It has been mostly available other than when they had to upgrade. The IT department did run into some upgrade issues. But as users, it has been stable for us.
What do I think about the scalability of the solution?
We expect to have a lot more data. The scalability is the key reason why we are on this platform.
How was the initial setup?
My IT team is the user group when it comes to installation and setup. They are the ones who do the product installation and management.
What's my experience with pricing, setup cost, and licensing?
I think it is priced well and it is affordable. Hadoop, which we use with the solution, is open-source. That part is free. We pay only for whatever wrappers Cloudera provides on top of the open-source product, Hadoop. I do not know about the actual pricing in total. The whole point of Hadoop is that it is open-source and they have created their own cluster. Cloudera is just the vendor that they are using.
My guess is Hortonworks should not be expensive at all to those looking into using it.
What other advice do I have?
It is important to note that the IT team has to support the product. We are not the IT team so if we have to scale it, someone has to be able to do that administrative job of adding another server and managing the distribution of the data across all the servers. To work with this, you need to have that skill set within the IT department.
On a scale from one to ten (where one is the worst and ten is the best), I would rate Hortonworks Data Platform as an eight or nine out of ten. It is meeting a need.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.