Sr Manager at a transportation company with 10,001+ employees
Real User
Top 5 Leaderboard
Good for secure containerization and governance capabilities
Pros and Cons
  • "Distributed computing, secure containerization, and governance capabilities are the most valuable features."
  • "Since Cloudera acquired HDP, it's been bundled with CDH and HDP. However, the biggest challenge is cloud storage integration with Azure, GCP, and AWS."

What is our primary use case?

We use it for multiple domains, including oil & gas, finance (Morgan Stanley), and healthcare. We process around 186 TB of data per day for analytics purposes.

Currently, we use it for the healthcare domain.
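To put the 186 TB per day figure above in perspective, here is a minimal back-of-the-envelope sketch of the sustained ingest rate it implies. It assumes decimal terabytes (10^12 bytes) and a perfectly even load across 24 hours, neither of which the review states; the function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12  # assumption: decimal terabytes, not tebibytes


def sustained_rate_gb_per_s(tb_per_day: float) -> float:
    """Average throughput in GB/s if the daily volume arrived evenly."""
    bytes_per_second = tb_per_day * BYTES_PER_TB / SECONDS_PER_DAY
    return bytes_per_second / 10**9


def sustained_rate_gbit_per_s(tb_per_day: float) -> float:
    """Same average throughput expressed in gigabits per second."""
    return sustained_rate_gb_per_s(tb_per_day) * 8


if __name__ == "__main__":
    print(f"{sustained_rate_gb_per_s(186):.2f} GB/s")
    print(f"{sustained_rate_gbit_per_s(186):.1f} Gbit/s")
```

Under those assumptions the average works out to roughly 2.15 GB/s. Real workloads are bursty, so peak rates would likely be higher.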

What is most valuable?

Distributed computing, secure containerization, and governance capabilities are the most valuable features.

What needs improvement?

Since Cloudera acquired HDP, it's been bundled with CDH and HDP. However, the biggest challenge is cloud storage integration with Azure, GCP, and AWS. These platforms offer competitive storage solutions like Gen2, Gen1, Bigtable, BigQuery, Lightstore, S3 buckets, etc., which pose significant competition to HDP.

For how long have I used the solution?

I have experience with this product, HDP 2.7 in short form. I have been using it since 2011.

It was on-premises and hybrid for the first three months, then we migrated it to AWS and Azure.

Buyer's Guide
Cloudera Data Platform
November 2025
Learn what your peers think about Cloudera Data Platform. Get advice and tips from experienced pros sharing their opinions. Updated: November 2025.
872,846 professionals have used our research since 2012.

What do I think about the stability of the solution?

In terms of storing data in different formats, it's been somewhat unstable. But when compared to Azure Gen2 and its support and features, it's much more advanced. The suitability depends on specific use cases, but overall, HDP seems more mature than it was in the past.

What do I think about the scalability of the solution?

From my experience with both HDP and CDH, they are both scalable. Currently, most people in my company have shifted to Azure, so they are using Gen2 primarily and discarding Gen1.  

How are customer service and support?

I have frequently contacted technical support for both Cloudera and Hortonworks.

We have an IT system for raising issues with their team. Issues are usually handled by someone at the L1, L2, or L3 support level, who connects with us directly.

Which solution did I use previously and why did I switch?

Previously, we used Cloudera Data Platform (CDP), which turned out to be a cloud-based Azure infrastructure, and implemented metadata solutions like Hive and others.

How was the initial setup?

The setup was very difficult on non-cloud platforms. We had to implement a version-based approach. However, it became simpler with the use of Docker. In the old days, we used to do it with HDP sandboxes and VM boxes and then create clusters. Now, on cloud platforms, it's much easier, just a matter of a few clicks. That's another approach we can take.

What's my experience with pricing, setup cost, and licensing?

I haven't done a price analysis specifically for HDP. However, when it was first introduced as Hadoop 2.0, there were a few use cases where the price was quite high.

It was particularly expensive for Cloudera and Hortonworks Data Platform. Both options were quite resource-intensive.

So, seven, or even nine or ten years ago, it was quite expensive.

What other advice do I have?

I recommend a mature decision-making model. Assess your specific needs and use cases. If HDP suits your requirements, use it. Otherwise, there are many advanced options available. Review and choose the best one for your use case.

Overall, I would rate the solution a nine out of ten. 

I simply love this technology when it comes to new developments. And I've been working with it for the past twelve to thirteen years. However, with the emergence of new technologies, there might be a chance that I would reduce one point because there's room for improvement.  

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
MES Consultant at a consultancy with 10,001+ employees
MSP
Has supported multi-source data integration and enabled real-time analytics across hybrid environments
Pros and Cons
  • "My main use case for Cloudera Data Platform is to support a multi-source system with different data types, handling semi-structured, structured, and unstructured data to support those kinds of workloads."

    What is our primary use case?

    The main use case for Cloudera Data Platform is to support a multi-source system with multiple data structures. We have streaming services, Kafka services, RDBMS systems, and semi-structured data in the form of CSV and JSON files, and we keep everything in place and centralized.

    Cloudera Data Platform also supports a hybrid data warehouse, which works like a relational database management system: business users can run analytical queries, such as a SELECT * query. It also supports PySpark, where a user can create a DataFrame and then transform and load the data to derive insights.

    What is most valuable?

    The best features of Cloudera Data Platform are that it supports hybrid types of environments, real-time streaming analytics, secure data and governance, machine learning and AI workloads, data warehousing and BI, and edge-to-edge AI use cases.

    In the hybrid environment, we can have a private cloud as well as a public cloud, which helps us enable both types of workloads. We have data that keeps coming through a pipeline, and then we just ingest our data. The data engineer transforms and loads it to a data lake, which is Amazon S3. Once the data is ready, it's on the downstream, and it's available for the consumer end to consume the data.

    The most important feature of Cloudera Data Platform is Apache Ranger, which provides granular security. It allows column-level security, so you can decide which columns to expose to the consumer, rather than controlling access only at the table level.
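The idea of column-level security described above can be illustrated with a small, self-contained sketch. This is plain Python and not Ranger's actual policy model or API; the roles, table name, column names, and the `visible_rows` helper are all hypothetical and exist only to show how exposing a subset of columns differs from granting access to a whole table.

```python
from typing import Any, Dict, List, Set

# Hypothetical policy: the columns each role may read, per table.
COLUMN_POLICY: Dict[str, Dict[str, Set[str]]] = {
    "analyst": {"patients": {"patient_id", "admit_date"}},
    "clinician": {"patients": {"patient_id", "admit_date", "diagnosis"}},
}


def visible_rows(role: str, table: str, rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return rows with only the columns the role is allowed to see.

    A role with no policy entry for the table sees no columns at all.
    """
    allowed = COLUMN_POLICY.get(role, {}).get(table, set())
    return [{col: val for col, val in row.items() if col in allowed} for row in rows]


if __name__ == "__main__":
    data = [{"patient_id": 1, "admit_date": "2024-01-05", "diagnosis": "flu"}]
    print(visible_rows("analyst", "patients", data))
    print(visible_rows("clinician", "patients", data))
```

In this sketch the analyst role receives the rows without the `diagnosis` column, while the clinician role sees all three columns of the same table.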

    Cloudera Data Platform has a great impact on my organization as it supports the business demand and business requirements, making me happy with the business use case. It depends on what the business demands and the business use case, which allows for an evaluation of what the business wants. Based on that, they can make a decision on where to go and where to migrate a workload.

    What needs improvement?

    I would definitely like to see more innovation in Cloudera Data Platform toward full-fledged AI and ML workloads. AI is supported currently, but I'm interested in having ML and LLM workloads fully supported as well.

    For how long have I used the solution?

    I have been working in the current field for almost six to eight years.

    What do I think about the stability of the solution?

    Cloudera Data Platform is stable.

    What do I think about the scalability of the solution?

    Cloudera Data Platform's scalability is very nice, as you can have multiple workloads and even have multiple clusters with different CDP runtimes. You just have to define the business requirement in the configuration, and based on usage, it automatically scales up and scales down.
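The usage-based scale-up and scale-down behaviour described above can be sketched as a simple threshold rule. This is an illustration only and does not reflect how CDP actually implements autoscaling; the `desired_nodes` function, the thresholds, and the node limits are assumed values standing in for whatever is defined in the configuration.

```python
def desired_nodes(
    current: int,
    utilization: float,
    min_nodes: int = 2,        # assumed lower bound from configuration
    max_nodes: int = 10,       # assumed upper bound from configuration
    scale_up_at: float = 0.80,   # assumed utilization threshold
    scale_down_at: float = 0.30, # assumed utilization threshold
) -> int:
    """Pick the next cluster size from current utilization (0.0 to 1.0)."""
    if utilization > scale_up_at:
        target = current + 1   # busy: add a node
    elif utilization < scale_down_at:
        target = current - 1   # idle: remove a node
    else:
        target = current       # within the comfortable band: no change
    # Never go outside the configured limits.
    return max(min_nodes, min(max_nodes, target))


if __name__ == "__main__":
    for load in (0.95, 0.50, 0.10):
        print(load, desired_nodes(current=4, utilization=load))
```

Stepping by one node at a time keeps the sketch easy to follow. A real policy might scale by larger increments or add cool-down periods.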

    How are customer service and support?

    Customer support for Cloudera Data Platform is very good.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    We had been using Cloudera's distribution for Hadoop, the CDH product. CDH was provided on-premises only, so we migrated from on-premises to the cloud to take advantage of cloud compute.

    How was the initial setup?

    The experience with pricing, setup cost, and licensing is very good. The cloud service provider has a built-in tool for analyzing which zone and region to use. Because service costs vary by region, it lets us work out which region is most suitable and cheapest.

    What was our ROI?

    In terms of ROI, we definitely have seen a return on investment. Due to security, we cannot disclose the value, but we have definitely seen an ROI.

    What's my experience with pricing, setup cost, and licensing?

    The experience with pricing, setup cost, and licensing is very good.

    Which other solutions did I evaluate?

    I did not evaluate other options before choosing Cloudera Data Platform.

    What other advice do I have?

    I would rate Cloudera Data Platform an eight out of ten because it's excellent in terms of the product, its deliverability, its support, and its use cases. It might differ for different industries depending on what each industry wants, but overall, it has a good impression, and I'm happy with the work relationship with Cloudera technical support.

    If someone is looking for a hybrid environment or a cloud environment, they can definitely consider reviewing Cloudera Data Platform. They can look at all the aspects, as the Cloudera Data Platform ecosystem provides Apache Hive, HBase, Kafka, NiFi, Solr, and Knox, which they can review based on their business use case.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    Prashant  Singh - PeerSpot reviewer
    Vice President -Product Management at Paytm
    Real User
    Top 10
    A good technology with an easy setup but is at end of life
    Pros and Cons
    • "The product offers a fairly easy setup process."
    • "It's at end of life and no longer will there be improvements."

    What is our primary use case?

    We primarily use the solution for data storage and processing.

    What is most valuable?

    It is one of the better technologies in terms of Hadoop.

    The product offers a fairly easy setup process. 

    It is quite stable.

    The scalability is good. 

    What needs improvement?

    We'd like the solution to be easier.

    There used to be a free version on offer. Now, everything is paid.

    It's not quite as easy to install as EMR.

    It's at end of life and no longer will there be improvements. 

    For how long have I used the solution?

    I've been using the solution for more than two years. 

    What do I think about the stability of the solution?

    The solution for the most part has been stable. There aren't issues with bugs or glitches, and it doesn't crash or freeze. It's reliable.

    What do I think about the scalability of the solution?

    The product is scalable. It's not a problem if you would like to expand it. 

    How are customer service and support?

    Now, the customer support is Cloudera. We are in discussions about migrating it to Cloudera. As we are still in the discussion phase, we have yet to work with technical support.

    Which solution did I use previously and why did I switch?

    We need to upgrade and are considering Cloudera. It also used to be free, and now that we have to pay for this product, we are looking to move off of it and onto potentially Cloudera. 

    How was the initial setup?

    The solution is fairly simple to set up. It's not too complex or difficult. If you know the solution, it's easy. However, there is a learning curve. If you don't know anything about it, it can be more complex.

    You can typically deploy it within a week. We have five or six people capable of handling a deployment. 

    What about the implementation team?

    We have our own team that is capable of handling deployments. We did not need to have an integrator or consultant come in.

    What's my experience with pricing, setup cost, and licensing?

    The solution used to have a free tier. They have since taken that away, which is disappointing. 

    Which other solutions did I evaluate?

    We are constantly evaluating other options. We're always curious to see what could help our business. Currently, we've looked into EMR and Cloudera. 

    What other advice do I have?

    We're a customer and end-user.

    We are not using the latest version. We have to upgrade it too, which is why we are looking at Cloudera now.

    Hortonworks is at end of life. They are not adding new things to this, so everything will be Cloudera now. 

    I'd rate it seven out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    Leslie Mavonyani - PeerSpot reviewer
    Head of technical and projects at BI Dynamics Pty Ltd
    Real User
    Helps with data management and has good scalability
    Pros and Cons
    • "It is a scalable platform."
    • "More information could be there to simplify the process of running the product."

    What is our primary use case?

    We use Hortonworks Data Platform for data management, significant data ingestion, and analytics.

    What needs improvement?

    Hortonworks Data Platform has a limited user community. I haven't seen much discussion about user experiences. More information could be there to simplify the process of running the product.

    For how long have I used the solution?

    We have been using Hortonworks Data Platform for a couple of months.

    What do I think about the stability of the solution?

    I rate the product's stability an eight out of ten.

    What do I think about the scalability of the solution?

    We have five Hortonworks Data Platform users in our organization. It is a scalable platform.

    How was the initial setup?

    The initial setup could be more straightforward. It helps if you are technically inclined enough to follow the necessary steps. There could be easier ways to set it up. It takes 45 minutes to complete and requires a team of five people to execute the process.

    What about the implementation team?

    We implement the product in-house.

    What's my experience with pricing, setup cost, and licensing?

    Currently, we are using the product in a sandbox environment, and there is no licensing. We might choose a licensing option once we get the results.

    What other advice do I have?

    I recommend Hortonworks Data Platform to others and rate it an eight out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    Senior Cloud Storage Engineer at Comcast Business
    Real User
    Upgrades and patching are addressed by the solution, and they offer a sandbox for testing
    Pros and Cons
    • "The upgrades and patches must come from Hortonworks."
    • "The cost of the solution is high and there is room for improvement."

    What is our primary use case?

    There are a lot of use cases for the Hortonworks Data Platform. We use it alongside GPFS, so most of the information we use for operational analytics is primarily on the Hortonworks Data Platform.

    What is most valuable?

    The upgrades and patches must come from Hortonworks. Therefore, if we encounter any problems, they will be responsible for addressing them. This is one of the instances where we have to rely on them for all the upgrades.

    What needs improvement?

    The cost of the solution is high and there is room for improvement.

    For how long have I used the solution?

    I have been using the Hortonworks Data Platform for two years.

    What do I think about the scalability of the solution?

    Hortonworks Data Platform is scalable, but it lacks the capability for horizontal scaling. Therefore, we need to add more servers to increase its capacity.

    How was the initial setup?

    I am responsible for setting up the infrastructure, but I don't handle the engineering work.

    What other advice do I have?

    I would rate Hortonworks Data Platform an eight out of ten. The solution delivers on its promises, and Hortonworks provides a sandbox for testing before making a purchase.

    The maintenance requires a lot of people, including the DRE and IRE teams.

    It is not practical for most organizations that lack large amounts of resources to maintain their own data platform. The Hortonworks Data Platform makes it easier for such organizations.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    reviewer1426866 - PeerSpot reviewer
    Data Science and Data Engineering Leader | Senior Principal Data Scientist at a healthcare company with 10,001+ employees
    Real User
    A modeling and analysis data platform that is inexpensive and stable
    Pros and Cons
    • "The scalability is the key reason why we are on this platform."
    • "Hortonworks should not be expensive at all to those looking into using it."
    • "The version control of the software is also an issue."

    What is our primary use case?

    We use Hortonworks as a storage platform and then we create machine learning models and do the execution using Cloudera Data Science Workbench. (Cloudera and Hortonworks merged in January of 2019.)  

    What is most valuable?

    The most valuable part of this product is what Cloudera Data Science Workbench can do as a whole for modeling and analysis.  

    What needs improvement?

    We are happy with the platform but we are also looking at what else is out there. We are comparing what other teams are using in our company to the solution we have adopted.  

    The main difference, and what seems better with other tools, is the deployment. That part is not entirely clear to us. We can create models, but once we create the models we are not sure as to how to deploy them.  

    The version control of the software is also an issue.  

    Maybe these improvements are already included in the newest release but we are not aware of it because nobody on our team has had the opportunity to try it. I do not know how well this product is supported.  

    For how long have I used the solution?

    We have been using Hortonworks Data Platform for almost a year.  

    What do I think about the stability of the solution?

    We do not maintain it in our department. We just use it. It has been mostly available other than when they had to upgrade. The IT department did run into some upgrade issues. But as users, it has been stable for us.  

    What do I think about the scalability of the solution?

    We expect to have a lot more data. The scalability is the key reason why we are on this platform.  

    How was the initial setup?

    My IT team is the user group when it comes to installation and setup. They are the ones who do the product installation and management.  

    What's my experience with pricing, setup cost, and licensing?

    I think it is priced well and it is affordable. Hadoop, which we use with the solution, is open-source. That part is free. We pay only for whatever wrappers Cloudera provides on top of the open-source product, Hadoop. I do not know about the actual pricing in total. The whole point of Hadoop is that it is open-source and they have created their own cluster. Cloudera is just the vendor that they are using.  

    My guess is Hortonworks should not be expensive at all to those looking into using it.  

    What other advice do I have?

    It is important to note that the IT team has to support the product. We are not the IT team so if we have to scale it, someone has to be able to do that administrative job of adding another server and managing the distribution of the data across all the servers. To work with this, you need to have that skill set within the IT department.  

    On a scale from one to ten (where one is the worst and ten is the best), I would rate Hortonworks Data Platform an eight or nine out of ten. It is meeting a need.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    Manager at a tech services company with 201-500 employees
    Real User
    A seamless solution with a solid workflow
    Pros and Cons
    • "The data platform is pretty neat. The workflow is also really good."
    • "It would also be nice if there were less coding involved."

    What is our primary use case?

    We use this solution for the hospitality industry. 

    How has it helped my organization?

    It was for end to end data processing and data manipulations.

    What is most valuable?

    The data platform is pretty neat. The workflow is also really good. 

    What needs improvement?

    The NiFi platform could be enhanced. This refers to the data ingestion in a workflow. 

    It would also be nice if there were less coding involved.

    For how long have I used the solution?

    I have been using this solution for six years. 

    How are customer service and support?

    The technical support is okay, but not excellent. They can take a while to respond. 

    What other advice do I have?

    If you wish to use this solution, make sure you compare it with some other solutions first to make sure it's right for your needs. 

    Overall, on a scale from one to ten, I would give this solution a rating of nine.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user
    SenioITh677 - PeerSpot reviewer
    Senior IT Officer- Head of Administration, System Administration Division for Unix and Linux Servers at a financial services firm with 10,001+ employees
    Real User
    A cost-effective alternative for managing our big data
    Pros and Cons
    • "Now, using this solution, it is much cheaper to have all of the data available for searching, not in real-time, but whenever there is a pending request."
    • "I would like to see more support for containers such as Docker and OpenShift."

    What is our primary use case?

    We use this solution to look at and manage big data. It's mostly historical data that we offload from our data warehouse, as well as from other databases in other platforms.

    We have two different installations. The first one is based on IBM POWER CPUs, and the other one is based on Intel CPUs. Our data center is on-premises. There is some thought of moving to a private cloud, or a private IBM cloud, but we have not proceeded with that as of yet.

    How has it helped my organization?

    This solution is a cheaper way for us to offload the otherwise expensive data. We can move data from outdated database versions, such as Oracle 10. It is now out of support, but still hosts some of our historical data. This solution has helped us move our data to the current version.

    Previously, we had our data on more expensive platforms. Now, using this solution, it is much cheaper to have all of the data available for searching, not in real-time, but whenever there is a pending request.

    What needs improvement?

    We have had problems with the backup and with services that require a disaster site. We are still struggling with some of these issues.

    We are having trouble with Active Directory and Hive integration.

    I would like to see more support for containers such as Docker and OpenShift.

    For how long have I used the solution?

    About a year and a half.

    What do I think about the stability of the solution?

    We have had some issues with the code, but it's mostly from the developers. From our side, we don't see any issues with stability, although it may be that we have a lot of unused CPU capacity.

    What do I think about the scalability of the solution?

    We have not acquired any additional hardware since our initial purchase. However, we expect more use cases to be added, at which point we may have performance or scalability problems.

    How was the initial setup?

    The initial setup is not very difficult. The configuration is not easy, but somebody with some experience is able to set it up. We had users for which we had to set up quotas and queues. For us, the basic installation was completed within a matter of a week.

    What about the implementation team?

    We had IBM set up both of our installations. 

    What other advice do I have?

    This is a good product, but we still have some issues with backup, and the performance monitors that we install on every system. There may be solutions, but we're struggling to integrate them.

    This is a product that I recommend. It's a solution that comes at a lower price, and it works well if you don't have expectations that it will behave like a much more expensive system.

    I would rate this solution an eight out of ten.

    Disclosure: My company does not have a business relationship with this vendor other than being a customer.
    PeerSpot user