I primarily use CDH for data storage and regular dashboard reports.
AI & Data Engineering Lead at a tech services company with 10,001+ employees
Flexible and comprehensive solution
Pros and Cons
- "The most valuable feature is that I can use CDH for almost all use cases across all industries, including the financial sector, public sector, private retailers, and so on."
- "Cloudera's support is extremely bad and cannot be relied on."
What is our primary use case?
I primarily use CDH for data storage and regular dashboard reports.
What is most valuable?
The most valuable feature is that I can use CDH for almost all use cases across all industries, including the financial sector, public sector, private retailers, and so on.
What needs improvement?
Cloudera's prices are too high and are not competitive with other solutions. They could also improve the Data Science Workbench and add some more features, like wizard activities.
What do I think about the stability of the solution?
CDH is stable.
Buyer's Guide
Cloudera Distribution for Hadoop
June 2025

Learn what your peers think about Cloudera Distribution for Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
856,873 professionals have used our research since 2012.
What do I think about the scalability of the solution?
CDH is scalable, but scaling it is expensive.
How are customer service and support?
Cloudera's support is extremely bad and cannot be relied on.
What's my experience with pricing, setup cost, and licensing?
I wouldn't recommend CDH to others because of its high cost.
What other advice do I have?
I would rate CDH as eight out of ten.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.

AD - Associate Director at a financial services firm with 10,001+ employees
Feature rich and scalable with good support, but there are performance issues and the security could be improved
Pros and Cons
- "The main advantage is the storage is less expensive."
- "Currently, we are using many other tools such as Spark and Blade Job to improve the performance."
What is our primary use case?
We are using this solution for storing Big Data in one centralized location.
How has it helped my organization?
It has been helpful in allowing data storage in one centralized location with data lakes and all of the surrounding applications.
All of the data processes are being stored into the Big Data Lake.
What is most valuable?
It allows us to store huge amounts of data, which is an advantage.
They have BI (Business Intelligence) tools. There are many AI tools.
We are able to connect and analyze the data to get reports. The reports are very good.
The main advantage is the storage is less expensive.
What needs improvement?
The performance can be improved. We have experienced some performance issues. It is not as sophisticated as Oracle Sybase.
Currently, we are using many other tools such as Spark and Blade Job to improve the performance.
The setup is complex and could be simplified.
The security needs to be improved.
For how long have I used the solution?
I have been using this solution since 2015.
What do I think about the stability of the solution?
It's a stable solution.
What do I think about the scalability of the solution?
Scalability is good. Data is replicated by default; with Big Data there is a replication factor.
Over the years we have grown. When we started we had 10 nodes, and we have since increased to a large number of nodes.
How are customer service and technical support?
Technical support is good. I have been able to learn from them. As a developer, I am learning every day.
I would rate the technical support a ten out of ten.
Which solution did I use previously and why did I switch?
Previously we were using Oracle Sybase SQL. We switched because we have now introduced Big Data.
How was the initial setup?
The initial setup was complex.
It's not as simple as Oracle Sybase.
It's a complex architecture because you have raw data and many engines.
What's my experience with pricing, setup cost, and licensing?
When comparing with Oracle Sybase and SQL, it's cheaper. It's not expensive.
What other advice do I have?
I am a part of security and software development.
We are currently considering migrating to the cloud, and planning on using Microsoft Azure, mainly for the Big Data component.
I would rate this solution a five out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
BI Manager at Discovery Health
Open-source solution for intelligent data management and analysis
Pros and Cons
- "Provides a viable open-source solution for enterprise implementations and reliable, intelligent data analysis."
- "The solution does not support multiple languages very well and this means users need to create work-arounds to implement some solutions."
What is our primary use case?
We make recommendations to clients for using different models of this solution to handle data intelligently.
How has it helped my organization?
It gives us the opportunity to offer more options to our clients and create better solution models.
What is most valuable?
We find CDSW useful and plan to use it as a one-stop application for model build and training. Currently, we use Zeppelin notebook and we want to gravitate to a single application for notebooks.
What needs improvement?
The Data Science Workbench doesn't support multiple programming languages well, and it needs to. We were trying to use Scala and Python for some solutions we wanted to deploy, but they didn't work properly. As a result, we had to come up with workarounds. If the Data Science Workbench supported multiple programming languages, our workflow would be easier and the solutions could be better.
Another aspect we would like to see improved is better opportunities for integration. For example, we would like to use H2O machine learning, which is an open-source product, and Cloudera doesn't support H2O.
If they could support H2O and also deploy multi-language support on the Cloudera Data Science that would be great. But the biggest thing that would help right now is H2O support.
Finally, one other improvement I would suggest is integrating data privacy software into Cloudera. It is not quite complete in this aspect.
For how long have I used the solution?
We have been using the solution for approximately eight weeks.
What do I think about the stability of the solution?
From a stability point of view, we know that there is a new product coming out called Unity — or that is the proposed name of the product that merges Cloudera and Hortonworks. We know that this means that some changes will be happening within the environment. We don't believe that they will be radical changes that will affect existing software that we have. It should just be added functionality of Hortonworks integration. But we know at the same time that Cloudera support will be available if we need it.
What do I think about the scalability of the solution?
While we have not yet done a lot to scale the solution, we think it is going to be quite scalable because it works on a distributed architecture.
We will probably start with 10 or 15 users once we roll the solution out into production, which will probably be at the end of this week. Afterward, the user base will be growing quite large by double digits in percentage. But that is just to start with. Over a few years, we plan to start thinking about rolling out our experiences to our international businesses as well. This would be a substantial increase in user base.
How are customer service and technical support?
At the moment, from what we have been able to experience, technical support seems to be fine. I would rate it between seven and eight out of ten.
Which solution did I use previously and why did I switch?
We did not consider other solutions.
How was the initial setup?
The initial setup was difficult and we didn't like it. That is only because we implemented it with other software solutions outside Cloudera and needed to do the integrations.
We are still battling with working out problems with some integrations after eight weeks. It's up and running, but we're optimizing, so that is why I'm saying it's probably medium to complex. But that was the situation for us and our particular needs. It may not be as complex for other businesses at all.
What about the implementation team?
We have been working through the implementation with our own team.
Which other solutions did I evaluate?
We did consider other opportunities. Although we are quite comfortable with our current solution we may look at Hortonworks again, but that is not yet confirmed. We believe, from what we have read and what has been advertised, that Hortonworks and Cloudera are going to eventually merge and become one product. According to some sources, it has already happened.
We're simply trying to get the best of both worlds.
What other advice do I have?
I would rate the product as it currently stands at eight out of ten. The score is not higher because of the workarounds we have to do for certain models that do not support using multiple programming languages. For example, a single notebook is inflexible if you want to use other programming languages.
As far as other advice for people considering this solution, I would say take a good look at your business need before you decide on this technology and which solution to choose. Make sure that you are not already able to solve for your particular, identified needs using your existing technology before even considering a change.
You want to be sure you're applying the technology to the right business case because of actual need and not just change for change's sake.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
CEO at AM-BITS LLC
Good enterprise platform with good stability
Pros and Cons
- "It has the best proxy, security, and support features compared to open-source products."
- "The areas of improvement depend on the scale of the project. For banking customers, security features and an essential budget for commercial licenses would be the top priority. Data regulation could be the most crucial for a project with extensive data or an extra use case."
What is most valuable?
It is a good enterprise platform. It is easier and more stable. Additionally, it has the best proxy, security, and support features compared to open-source products.
What needs improvement?
The areas of improvement depend on the scale of the project. For banking customers, security features and an essential budget for commercial licenses would be the top priority. Data regulation could be the most crucial for a project with extensive data or an extra use case.
For how long have I used the solution?
We have been using Cloudera Distribution for Hadoop for a few years.
What do I think about the stability of the solution?
I rate the product’s stability a ten out of ten.
What do I think about the scalability of the solution?
We have ten customers using the product. They include data engineers, performance engineers, and environment engineers.
I rate its scalability a ten out of ten.
How are customer service and support?
The product has a support subscription for one year. We use technical support only for complex use cases. We work with their team as we have direct and quick access to contact them. It helps us better understand the technical and business-related queries of the customers.
How was the initial setup?
The cloud version is easy to set up, although it is complicated to process a large amount of data in an on-premises or hybrid setup. It is not a ready-to-use solution for telecom or finance technology. It requires the deployment of robust technology relying on network infrastructure.
What's my experience with pricing, setup cost, and licensing?
The product’s price varies from project to project. It is more expensive than open-source solutions and could be cheaper. However, in some cases, it is less costly than open-source.
What other advice do I have?
It is the best solution in the world at the moment. I advise others to go for it if you have an enterprise customer. I rate it a ten out of ten.
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Great being able to manage the security layer using the shared SDX which provides flexibility
Pros and Cons
- "With a cluster available, you can manage the security layer using the shared SDX - it provides flexibility."
- "This is a very expensive solution."
What is our primary use case?
This product is a framework for edge AI; it comes with multiple ecosystems as a project. I'm a senior data architect manager and we are consultants. We offer Cloudera to our customers, but we don't have a partnership with them.
What is most valuable?
The best feature is the Shared Data Experience (SDX) layer. If you have a cluster available on-prem or in the cloud, you can manage the security layer using the shared SDX, which provides flexibility. New features are constantly being added.
What needs improvement?
The only thing that needs improvement is the cost. It's a very expensive solution, which is one of the main reasons companies are not attracted to the product.
What do I think about the stability of the solution?
This product has been around for a long time so it's very mature and stable.
What do I think about the scalability of the solution?
The scalability is very good.
How was the initial setup?
The initial setup has become easier, although you need a dedicated admin to maintain and manage the solution because it's a framework and not a single product. Deployment nowadays is much smoother with the PaaS offering in the public cloud, so you can carry out the deployment with an in-house team. The deployment only takes a day, but a company is unlikely to go with the defaults, so the solution needs fine-tuning, which can take a couple of weeks.
What's my experience with pricing, setup cost, and licensing?
For enterprise organizations that can bear the cost, it's a good solution. A smaller company wouldn't be able to afford the licensing fees. You can get a free trial for 60 days. They'll never have a community version because they're the only ones in the market offering this kind of framework.
What other advice do I have?
I rate this solution nine out of 10.
Disclosure: My company has a business relationship with this vendor other than being a customer: Consultant
Engineering Manager/Solution architect at a computer software company with 201-500 employees
Preferred solution for on-prem
Pros and Cons
- "Cloudera is a very manageable solution with good support."
- "The initial setup of Cloudera is difficult."
What is our primary use case?
We are a distributor for Hadoop. Our customers choose whether they would like to use Cloudera or another product.
Cloudera Distribution is deployed on-premise as well as on bare metal servers in AWS.
What is most valuable?
Cloudera is a very manageable solution with good support.
What needs improvement?
When you compare Cloudera with EMR, EMR has a lot of administrative features, so you don't need to manage the solution. Cloudera is not as easy, as it requires more DevOps resources than other solutions.
For how long have I used the solution?
We have been offering this solution for five years.
What do I think about the stability of the solution?
Cloudera Distribution is stable.
What do I think about the scalability of the solution?
This is a scalable solution. We have clients that have a large installation of Cloudera.
How are customer service and support?
Technical support from Cloudera is fine.
How was the initial setup?
The initial setup of Cloudera is difficult. After you have installed it once, it is not difficult to reproduce.
What about the implementation team?
For a POC deployment, we required only one DevOps engineer. For larger-scale implementations, we also require a data engineer.
What's my experience with pricing, setup cost, and licensing?
Cloudera requires a license to use.
Which other solutions did I evaluate?
We looked at EMR; however, Cloudera is better for on-prem deployments.
What other advice do I have?
Cloudera is one of the best solutions for on-prem.
I would rate this solution an 8 out of 10.
Which deployment model are you using for this solution?
Hybrid Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Associate Manager at a consultancy with 501-1,000 employees
Easy to install, good technical support, and with a single script we can run jobs within minutes
Pros and Cons
- "I don't see any performance issues."
- "It could be faster and more user-friendly."
What is our primary use case?
We use this solution to process data.
When using SQL Server, you have to build indexes and fine-tune the data.
We import the data that is in the SQL Source.
With a single script, we are able to run the jobs within minutes, which is an advantage.
We are using the Power BI model for the business convention. The performance in Power BI will be reduced if you incorporate more calculations. Those calculations are captured in the Hadoop layer and processed.
What needs improvement?
It could be faster and more user-friendly.
For how long have I used the solution?
I have been using this solution for seven months.
What do I think about the stability of the solution?
It's a stable product. I don't see any performance issues.
What do I think about the scalability of the solution?
This solution is scalable. We have 40 users for different projects in our organization.
We will continue to use this solution.
How are customer service and technical support?
Technical support is good.
Which solution did I use previously and why did I switch?
I didn't use any other product.
How was the initial setup?
The installation is straightforward.
Our clients provide us with access to use it directly.
Once we have been given access to the edge nodes, we are able to run the scripts in the Hadoop layer.
What's my experience with pricing, setup cost, and licensing?
We do not pay for licensing because our customers provide it, so there is no need to purchase a license for the project.
What other advice do I have?
I would recommend this solution.
I would rate Cloudera Distribution for Hadoop a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Chief Executive Officer at a financial services firm with 51-200 employees
Overall operational, stable but price could be better
Pros and Cons
- "The product as a whole is good."
- "There are better solutions out there that have more features than this one."
What is our primary use case?
We use the solution for data warehousing.
What is most valuable?
The product as a whole is good.
What needs improvement?
There are better solutions out there that have more features than this one.
For how long have I used the solution?
I have just started using the solution.
What do I think about the stability of the solution?
I do not know of any issues with the stability of the solution.
What about the implementation team?
I have an internal team that does maintenance for the solution.
What's my experience with pricing, setup cost, and licensing?
The product's price could be better.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.

Buyer's Guide
Download our free Cloudera Distribution for Hadoop Report and get advice and tips from experienced pros
sharing their opinions.
Updated: June 2025
Popular Comparisons
Apache Spark
HPE Ezmeral Data Fabric
Apache HBase
Neo4j Graph Database
Oracle NoSQL