Vice President - Big Data and Delivery at a computer software company with 51-200 employees
Cloudera Manager is a good tool to administer. Sometimes it gets confusing to follow a single path for installation.
What is most valuable?
- Cloudera Manager for administering the Hadoop cluster
- Cloudera specific solutions like Impala
- Extensive documentation
- Good user community
How has it helped my organization?
Implementing a Hadoop cluster has become relatively straightforward using CDH. Administering it is also less complex. As a result, the effort spent in these areas is less than anticipated.
What needs improvement?
- Some of the UI features seem confusing, e.g., the charts on the CM Services page
- Sometimes it gets confusing to follow a single path for installation because there are multiple recommended approaches, e.g., parcels vs. packages
For how long have I used the solution?
We have been using it for the last two years.
Buyer's Guide
Cloudera Distribution for Hadoop
June 2025

Learn what your peers think about Cloudera Distribution for Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
856,873 professionals have used our research since 2012.
What was my experience with deployment of the solution?
Following a single path for installation becomes confusing because there are multiple recommended approaches, e.g., parcels vs. packages.
What do I think about the stability of the solution?
Flume seems unstable and has to be restarted quite often.
What do I think about the scalability of the solution?
None as such
How are customer service and support?
We are mostly using Cloudera Express so we did not use their technical support. However, the Cloudera community is an active place and Cloudera representatives participate actively in understanding and resolving issues.
Which solution did I use previously and why did I switch?
Cloudera is a prominent player in the Hadoop space, and we did not need to adopt a different solution. However, we are also looking to work on Hadoop and MapR.
How was the initial setup?
Following a single path for installation was initially confusing because there are multiple recommended approaches, e.g., parcels vs. packages. After a while, we managed to master it. However, knowledge of Cloudera Manager and Hadoop architecture is a must.
What about the implementation team?
We have our own team of consultants who are proficient in implementing it. The high-level implementation steps remain the same; it is the environment-specific issues that are challenging.
What was our ROI?
We haven't really measured ROI.
What's my experience with pricing, setup cost, and licensing?
The per-node licensing price for Cloudera seems pretty steep (based on the inputs we have received from Cloudera).
What other advice do I have?
It is user-friendly and installation is pretty straightforward. Cloudera Manager is a good tool for administering it. However, configuration for specific requirements is sometimes pretty complex.
You should have a team that is knowledgeable in Hadoop. Keep in mind that the product is still maturing, so there is a good chance that you will come across unexpected issues now and then.
Disclosure: My company has a business relationship with this vendor other than being a customer: We're Cloudera partners and regularly install CDH

Data Consultant with 10,001+ employees
Features like Hive, Pig, Impala, Flume and Spark are valuable to us.
Valuable Features
Cloudera Manager is the most valuable feature for its ease of use, its features, and the ease of upgrading and installing components. CM can also be used to set up high availability within minutes. Other features like Hive, Pig, Impala, Flume and Spark are also valuable.
Improvements to My Organization
It has improved our storage, and the availability of analytics tools such as Hive, Pig, Impala, and Spark helps us tremendously.
Room for Improvement
I'd like to see improvements to Impala. It also needs a more integrated environment with Spark, data warehouses, storage systems, and the cloud. Additionally, I'd want more UIs for components of the ecosystem, preferably centralized in a gateway.
Use of Solution
I've used it for 3.5 years.
Deployment Issues
For experimental and production clusters alike, use Cloudera Manager right from the beginning. RPM installation is good for learning.
Stability Issues
It has compatibility issues if installed on specialized hardware such as EMC Isilon, or if node managers and data nodes are not co-located. For production, draw up a detailed plan for managing a local repo for installation and upgrades. Never install from the internet on production clusters.
Customer Service and Technical Support
Most of the clusters are for experimentation and don't require support. Production clusters are implemented through major vendors, who handle support themselves.
Initial Setup
It depends on the mode of installation. Cloudera Manager is always more straightforward and manageable. Avoid RPM installation as much as possible. Work out plans with the system admin for upgrades and for commissioning and decommissioning nodes. Investigate the impact and consequences of having HBase and Hadoop in the same cluster or in separate clusters: what are the effects on system administration, cost, upgrades, data migrations, resources, etc.?
The complexity kicks in when configuring parameters. Find out what the use cases are: are they disk I/O bound or compute bound, is there a lot of structured data or unstructured data for text analytics, etc.
Implementation Team
Both the vendor team and an in-house team, depending on the cluster size and use cases. Some customers may require a certain number of certified personnel, which is something to think about when choosing a partner.
Other Advice
Be prepared for a fast-changing landscape in how Hadoop works under the hood and how it is used. Each major release usually involves changes to the file system and data structures. Consider how these would affect data migration, and ask questions such as whether to upgrade or create a new cluster. Plan for training and skill upgrades.
Disclosure: My company has a business relationship with this vendor other than being a customer: We're a system integration partner.
Director of Data Management at a media company with 51-200 employees
It gives us improved business intelligence reporting from daily to every two hours.
Valuable Features:
Faster runtime for batch jobs.
Improvements to My Organization:
Business Intelligence reporting improved from daily to every two hours, satisfying business stakeholders who used to favour transactional systems for drafting reports because those systems had the latest data.
The issue with using transactional systems is that multiple versions of the truth arise across the enterprise. With the faster turnaround time, business stakeholders are now adopting the BI systems, which are designed to give a cohesive view of the performance metrics important to them.
Room for Improvement:
Full support for all Spark SQL features, support for SparkR, and Hive compatibility for tables saved from DataFrames.
Cloudera CDH 5.5.x does not support SparkR. SparkR, which integrates R models through an API, would be a great addition, since it would enable fast, near-real-time analytical integration of R models with data feeds.
The Spark SQL functionality for saving a DataFrame as a table in Hive produces a table that is not compatible with Hive. A workaround is to create the Hive table first and then insert into it.
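The create-first, insert-second workaround can be sketched roughly as follows. This is only an illustration in plain Python that builds the two HiveQL statements; the helper functions, the table `analytics.events`, its columns, and the staging name `events_staging` are all hypothetical, and the statements would still need to be run through whatever Hive-enabled SQL entry point the cluster provides, with the DataFrame registered under the staging name.

```python
# Hypothetical helpers: build HiveQL for the "create the Hive table first,
# then insert" workaround. All table, column, and view names are illustrative.

def hive_create_table_sql(table, columns, stored_as="PARQUET"):
    """Return a CREATE TABLE statement with an explicit Hive schema."""
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} ({cols}) STORED AS {stored_as}"


def hive_insert_sql(table, source_view, columns):
    """Return an INSERT ... SELECT that copies rows from a staging view."""
    col_list = ", ".join(f"`{name}`" for name, _ in columns)
    return f"INSERT INTO TABLE {table} SELECT {col_list} FROM {source_view}"


# Assumed example schema (not from the review).
columns = [("user_id", "BIGINT"), ("event", "STRING"), ("ts", "TIMESTAMP")]

create_stmt = hive_create_table_sql("analytics.events", columns)
insert_stmt = hive_insert_sql("analytics.events", "events_staging", columns)

print(create_stmt)
print(insert_stmt)
```

Because Hive owns the table definition in this approach, the table layout is whatever the DDL states rather than whatever the DataFrame writer would have chosen, which is the point of the workaround.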
Cloudera CDH 5.5.x is a great product. Adopting the features not currently supported would make it even better, but their absence by no means subtracts from its desirability.
Other Advice:
Do thorough research and ensure that your use cases and scale do not conflict with the system requirements, and that the features that would make a difference to you are supported.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Team Lead / Data Architect at a tech services company with 51-200 employees
The Cloudera Manager administrator webpage simplifies the administration tasks.
What is most valuable?
The Cloudera Manager administrator webpage simplifies the administration tasks and helps to maintain a global overview of the cluster performance.
How has it helped my organization?
We are moving from a standard SQL environment (Oracle DataWarehouse) to a Big Data environment, and the Hadoop cluster will be the key to our new organization. It will allow us to scale in an easy manner.
What needs improvement?
We had some difficulties importing Hive tables from another cluster.
I want to point out that we encountered many problems related to cloud storage and how resources are managed. What we learned is that, although it is quite simple to deploy single machines in the cloud, deploying clusters of machines is much more complex because many factors need to be considered: individual machines, connectivity across machines, and storage.
For how long have I used the solution?
I've used it for three months.
What do I think about the stability of the solution?
We found some issues, but they were related to the hardware provider. So far I have not detected any problems with the Cloudera software itself.
How are customer service and technical support?
Technical support is really efficient.
Which solution did I use previously and why did I switch?
We chose this product because it is considered a market standard and because of its extensive documentation on the web. I evaluated other options, but the fact that it is becoming a standard for many companies helped me choose it.
How was the initial setup?
In the cloud environment where we deployed (Azure Resource Manager), there was a ready-to-deploy template that greatly simplified the initial setup.
What about the implementation team?
We implemented with an in-house team. Our initial idea was to stop the cluster during weekends and whenever it was not in use. However, we ran into serious difficulties and were not able to start the whole cluster programmatically, so in the end we left the cluster running all the time.
These issues were mainly related to the cloud provider and how it manages the resources for the cluster machines.
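For readers considering the same stop-on-weekends idea, the scheduling side can be sketched as below. This is a minimal illustration, not what the reviewer ran: the business hours, the host `cm-host`, the cluster name, and the API version `v10` are assumptions, and the Cloudera Manager REST path should be checked against the API documentation for the installed release. It also covers only the Cloudera Manager side; the cloud provider's handling of the underlying machines, which is where the difficulties above arose, is out of scope.

```python
from datetime import datetime
from urllib.parse import quote

# Assumptions (not from the review): the cluster should run Monday-Friday,
# 07:00-19:00, and Cloudera Manager exposes start/stop commands at
#   /api/<version>/clusters/<name>/commands/{start|stop}
# Verify the path and API version against your CM release before relying on it.


def cluster_should_run(now, start_hour=7, end_hour=19):
    """True on weekdays from start_hour (inclusive) to end_hour (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Sunday=6
    return is_weekday and start_hour <= now.hour < end_hour


def planned_action(now):
    """Which command a scheduler would issue at this moment."""
    return "start" if cluster_should_run(now) else "stop"


def cm_command_url(base_url, cluster, action, api_version="v10"):
    """Build the URL a scheduler would POST to; action is 'start' or 'stop'."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action}")
    base = base_url.rstrip("/")
    return f"{base}/api/{api_version}/clusters/{quote(cluster)}/commands/{action}"


# Hypothetical host and cluster name, for illustration only.
when = datetime(2016, 3, 5, 10, 0)  # a Saturday morning
print(planned_action(when))
print(cm_command_url("http://cm-host:7180/", "Cluster 1", planned_action(when)))
```

Keeping the schedule decision separate from the URL construction makes the policy easy to test without touching a live cluster.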
What was our ROI?
From our point of view it is a long-term investment. We hope to see the ROI in the coming years.
What other advice do I have?
I am very comfortable with this product. The combination of the Cloudera Manager server, which allows management of the Hadoop cluster, and the Hue server, which simplifies its use, makes this product a current market standard. Perhaps it lacks full integration of all its components.
Disclosure: My company has a business relationship with this vendor other than being a customer: My company has a partnership relation with the vendor.
R&D Solutions Architect at a tech vendor with 10,001+ employees
It is easy to use in terms of integration with products in the Hadoop ecosystem.
Valuable Features
Enterprise resource management, ease of integration with products in the Hadoop ecosystem, and security.
Room for Improvement
Mainly they have to continuously evolve following the technology trends and replace or adapt part of their solutions accordingly.
Use of Solution
We've used it since October 2012.
Deployment Issues
No issues encountered.
Stability Issues
No issues encountered.
Scalability Issues
No issues encountered.
Customer Service and Technical Support
Pretty responsive and reactive compared to their competitors in the field.
Initial Setup
It was extremely easy and allowed less experienced personnel to get up to speed pretty fast. Any difficulties or complexities we faced were related not to the product itself but to the cluster infrastructure used.
Implementation Team
In our case it was an in-house team including data scientists and data engineers (management & QA as well). With the appropriate training and the support offered by the vendor, it is not that hard to implement a small to medium scale project solution. However, complexity and size varies significantly between projects; therefore, it really depends.
ROI
That is not easy to answer, since Huawei has several divisions using the product in different ways. Likewise, pricing/licensing depends heavily on the context and aims of the given organization: for instance, the level of support it is going to need, the type of services it is going to provide, or even the business domain it is targeting.
Other Solutions Considered
Two other providers' solutions were evaluated. However, the level of customer service and technical support from Cloudera was better than the first one's, and the second solution's licence pricing was higher than Cloudera's.
Other Advice
Cloudera is doing a great job in the field offering an enterprise ready data platform. Based on my experiences I would definitely recommend it.
Disclosure: My company has a business relationship with this vendor other than being a customer: We do have a partnership with Cloudera.
Consultant at a tech consulting company with 51-200 employees
The Cloudera Hadoop manager eased the work of orchestrating scripts.
Valuable Features
Very solid. Excellent user experience. Good documentation. The Cloudera Manager is definitely a deal breaker. Packaging for Ubuntu is great for all the components.
Improvements to My Organization
Before the introduction of Cloudera Manager (that actually works), all the orchestration was done with scripts and Chef, and inexperienced team members had difficulty participating in maintenance. The Cloudera Hadoop manager eased the work.
Room for Improvement
More customization, better documentation for the API (basically it's the same for all Cloudera Hadoop components).
Use of Solution
I've used it for two years.
Deployment Issues
No issues encountered.
Stability Issues
No issues encountered.
Scalability Issues
No issues encountered.
Customer Service and Technical Support
Didn't use dedicated service or support. The documentation is a bit of a mess, but it is decent and sufficient.
Initial Setup
Straightforward. The CDH VirtualBox image with its preconfigured environment helps for demonstration purposes.
Implementation Team
We did it in-house.
Other Solutions Considered
We also looked at Hortonworks, but chose Cloudera because of my familiarity with it.
Other Advice
Do a comparison with Hortonworks, as it's always good to compare with another major vendor.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Data/Big Data Architect at a healthcare company with 1,001-5,000 employees
We were trying AWS Impala as well, but Cloudera won as it had more functionality with HUE, Sqoop, and Solr as built-in functions. At times, heavy queries do not finish at all.
What is most valuable?
Mostly HUE, Impala, Sqoop, and Hive. The impala-shell command is number one.
How has it helped my organization?
We are working on research for genomic data, looking for specific genes and variances. Even Hive was not good enough to process it correctly; only with Impala are we getting results quicker.
What needs improvement?
Sometimes heavy queries do not finish at all. It would be good to see the progress of a heavy script in the Impala shell, or to have some other way to access it.
For how long have I used the solution?
We started to use Cloudera about one-and-a-half years ago.
What do I think about the stability of the solution?
We are having some issues with stability and are speaking to Cloudera support.
How are customer service and technical support?
Customer Service:
It's acceptable.
Technical Support:
It's acceptable.
Which solution did I use previously and why did I switch?
We were trying AWS Impala as well, but Cloudera won as it had more functionality with HUE, Sqoop, and Solr as built-in functions.
How was the initial setup?
We have struggled a bit in installing and configuring Cloudera Manager on the AWS cluster. For now, it is good.
What about the implementation team?
We did the implementation only using our team and resources. It was a hard start, but an easy landing.
What other advice do I have?
Cloudera is good for mid-size to large companies, but small ones can use AWS Impala/HUE. Go to training, or you are going to spend many hours finding short answers. The Cloudera solution is big with good documentation, but you need to know what and where to read first.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Director of Data Architecture at a financial services firm with 501-1,000 employees
It has enabled us to move BI out of our OLTP database and build a data warehouse, but although Spark is under rapid development, it needs improvement.
What is most valuable?
- Cloudera Manager
- Impala
- Sentry
How has it helped my organization?
It has enabled us to move BI out of our OLTP database and build a data warehouse.
What needs improvement?
Some areas are under rapid development, like Spark.
For how long have I used the solution?
I've used it for three years.
What was my experience with deployment of the solution?
No issues with the current version.
What do I think about the stability of the solution?
No issues with the current version.
What do I think about the scalability of the solution?
No issues with the current version.
How are customer service and technical support?
Customer Service:
It's excellent.
Technical Support:
It's excellent.
Which solution did I use previously and why did I switch?
We switched because Cloudera just works.
How was the initial setup?
Cloudera Manager greatly simplifies initial setup.
What about the implementation team?
In-house.
What other advice do I have?
Make sure you have clearly articulated, doable use cases before you start.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
