EricLin - PeerSpot reviewer
Chairman at Athemaster co.,ltd.
Real User
Top 10
Performs cost analysis tasks for our customers in the financial industry
Pros and Cons
  • "The most valuable feature is Kubernetes."
  • "The price of this solution could be lowered."

What is our primary use case?

We are a solution provider and this is one of the systems that we implement for our clients.

Our clients for this product are in the financial industry and they use it to perform cost analysis tasks.

What is most valuable?

The most valuable feature is Kubernetes.

What needs improvement?

The price of this solution could be lowered.

For how long have I used the solution?

We have been using the Cloudera Distribution for Hadoop for five years.

Buyer's Guide
Cloudera Distribution for Hadoop
June 2025
Learn what your peers think about Cloudera Distribution for Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
856,873 professionals have used our research since 2012.

What do I think about the stability of the solution?

It is a stable solution.

What do I think about the scalability of the solution?

The Cloudera Distribution for Hadoop can be scaled. Our customers are enterprise-level companies and they have about 100 users for this solution.

How are customer service and support?

We offer technical support for this solution to our customers.

Which solution did I use previously and why did I switch?

We did not use another solution prior to this one.

How was the initial setup?

The initial setup is straightforward.

What's my experience with pricing, setup cost, and licensing?

The pricing is expensive.

Which other solutions did I evaluate?

Cloudera really has no competition.

What other advice do I have?

I would rate this solution a nine out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer: reseller
PeerSpot user
Data engineer at a tech services company with 11-50 employees
Real User
Supports a wide range of tools and has a good support community
Pros and Cons
  • "We also really like the Cloudera community. You can ask any question and have your answer within a few hours."
  • "Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. Having a big data environment is not optional for us."

What is our primary use case?

Our primary use case for this solution is hosting a large amount of data on our platform, along with processing and analyzing it there.

What is most valuable?

Cloudera is always developing new tools and supports a wide range of tools. We also really like the Cloudera community. You can ask any question and have your answer within a few hours. Cloudera is better than other competitors because they acquired Hortonworks.

What needs improvement?

We're processing a huge amount of data on our system. Without the big data environment, we cannot store all of this data live. We have billions of records and terabytes of storage to be used. Having a big data environment is not optional for us. Cloudera is trying to adopt new technologies.

I think open-source tools are now dominating, so Cloudera has to decide how to deal with them. I subscribe to Cloudera to get an enterprise version, but I have found that I can get some of its features from other vendors at a lower cost than Cloudera. They should lower the price.

For how long have I used the solution?

We have been using Cloudera for a year. 

What do I think about the stability of the solution?

It's stable. I have no issue regarding the stability.

What do I think about the scalability of the solution?

It's scalable. You can add more nodes and you can expand your cluster easily.

How are customer service and technical support?

After we open a ticket, the issue can be resolved very quickly. They have a management portal. I don't contact them directly, but I haven't heard of anybody having any problems with it.

How was the initial setup?

The initial setup is complicated. We needed the vendor to install it themselves. The deployment took around three weeks. Three people were involved, but only to follow up and supervise the deployment; they did not deploy anything themselves. The vendor does it.

What other advice do I have?

My advice is to focus on what tools are available on the market. Most companies are now delivering open-source technologies and providing support for these tools. Now I have the option to purchase a license for whatever platform for $1. I can deliver it with another small company at a lower cost. If I were the decision-maker, I'd invest in open-source tools. Cloudera and all of these companies are trying to adapt to these big data technologies and open-source tools. Cloudera is trying to put them inside their platform so that we can have a compatible solution.

I would rate it an eight out of ten. 

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user1296588 - PeerSpot reviewer
Senior Software Engineer at a tech services company with 10,001+ employees
Real User
Performs well and the technical support is helpful, but the upgrade process needs to be consolidated
Pros and Cons
  • "The most valuable feature is Impala, the querying engine, which is very fast."
  • "There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon."

What is our primary use case?

We are dealing with data from the telecom industry. We were using an Oracle system but our volume has increased. We now have a lot of real-time data that needs to be transformed so that it can be made available and used.

What is most valuable?

The most valuable feature is Impala, the querying engine, which is very fast. We have been able to work with one terabyte of data in less than 20 minutes. The speed makes it easy for us to process all of the data that comes in, in time.

The support is very good.

All of the data has automatic triple replication in order to secure integrity.

What needs improvement?

There is a maximum of a one-gigabyte block size, which is an area of storage that can be improved upon.

When we are upgrading CDH, there are many things that need to be upgraded and it would be helpful if it were bundled. As it is now, we have to upgrade many different things separately.

For how long have I used the solution?

I have been working with the Cloudera Distribution for Hadoop for around two years.

What do I think about the stability of the solution?

It is a stable solution.

What do I think about the scalability of the solution?

The scalability is good and it works on commodity hardware. One of the problems we have right now is that there is a lot of data and we're moving it from our Oracle solution. This means that there is a double cost, in terms of storage, during our transition to working with big data.

We are using a data lake that is a store for all of the data in our organization. There are more than 25 projects, with between 25 and 30 people in each one, for a total of almost 1,000 people. All of them are dependent on this solution.

Most of our users are technicians who have problems to solve using the data available to them. A couple of them are data scientists and the remainder are upper management, who do the analysis.

How are customer service and technical support?

The technical support is very good. Whenever we open a ticket, we get support right away.

Which solution did I use previously and why did I switch?

We did use another solution prior to this one but it could not keep up with our increase in data.

What other advice do I have?

The suitability of this solution depends on the size of the data that you are going to be working with. If you are going to be working with a huge dataset that contains many gigabytes of data, then this is a good solution. For smaller datasets, you should also consider other technologies.

My advice for anybody who is implementing this solution is to take some time to learn it. Beyond that, be sure to contact support if you have any problems because they are very helpful.

I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user900987 - PeerSpot reviewer
Data Management at BCX
Real User
Offers big data support for analytical applications but the technical support needs improvement
Pros and Cons
  • "In terms of scalability, if you have enough hardware you can scale out. Scalability doesn't have any issues."
  • "The one thing that we struggled with predominately was support. Because it was relatively new, support was always a big issue and I think it's still a bit of an ongoing concern with the team currently managing it."

What is our primary use case?

We primarily use it only for big data support for analytical applications.

What is most valuable?

The feature that we've used quite intensively is Spark, in how it specifically can speed up some of the data to assist with processing.

What needs improvement?

The one thing that we struggled with predominately was support. Because it was relatively new, support was always a big issue and I think it's still a bit of an ongoing concern with the team currently managing it.

In the next release, I think it would be helpful if there was easier integration into all the other existing data back corners. It will be a big plus as it's a favorite capability. We had to go with a third-party application in order to achieve that.

For how long have I used the solution?

I've been using the solution since 2016.

What do I think about the stability of the solution?

The stability is problematic. We did encounter quite a lot of issues with the cluster going down quite frequently.

What do I think about the scalability of the solution?

In terms of scalability, if you have enough hardware you can scale out. Scalability doesn't have any issues. Currently, only about 10 people in total are using the solution. So we have about four business users and then four technical people. It's only limited to two environments.

How are customer service and technical support?

I think there's a lot of room for improvement on the technical support side. Mostly because we don't have a lot of local skills in South Africa that could have supported the solution. It was an issue.

Which solution did I use previously and why did I switch?

This is our first solution. We tested a bunch of other technologies, but that was our first one and we're still using it.

How was the initial setup?

The initial implementation was straightforward from an application side. There weren't any hiccups. In terms of deployment time, it's going to be difficult to say, because most of it was related to hardware problems. Software took about two months to deploy. We required four people for deployment.

What's my experience with pricing, setup cost, and licensing?

The pricing is very competitive. It's not bad.

Which other solutions did I evaluate?

We considered working with a few other companies, including IBM Bluemix.

What other advice do I have?

I would recommend the solution provided that they've proven the business case and the technology. We have found that if you don't address the right business case, you end up buying a technology that doesn't necessarily solve your business problems.

I would rate the solution seven out of ten. The main reasons for not rating it higher are that the overall support is not great and that we've found some limitations in functionality. It wasn't mature when we started. It's getting there. It's getting better.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Coordinate677 - PeerSpot reviewer
Project Coordinator at a manufacturing company with 1,001-5,000 employees
Real User
Good search functionality but the user interface needs improvement
Pros and Cons
  • "The search function is the most valuable aspect of the solution."
  • "The user infrastructure and user interface need to be improved, as well as the performance. The GUI needs to be better."

What is our primary use case?

We primarily use the solution for external storage.

What is most valuable?

The search function is the most valuable aspect of the solution.

What needs improvement?

The user infrastructure and user interface need to be improved, as well as the performance. The GUI needs to be better.

For how long have I used the solution?

I've been using the solution for 1 year.

Which solution did I use previously and why did I switch?

We did not previously use a different solution.

How was the initial setup?

The initial setup was complex, due to the user interface. We were doing a POC, so we're still doing the deployment.

What other advice do I have?

I would rate this solution seven out of 10. There's tons of room for improvement.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
reviewer1046250 - PeerSpot reviewer
Senior Consultant & Training at a tech services company with 51-200 employees
Consultant
The valuable combination of all the tools enables me to solve the use cases I'm working on
Pros and Cons
  • "We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there are a lot of things that need to improve. I believe they are working on that."
  • "We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there are a lot of things that need to improve."

What is our primary use case?

I've been working on the software installation from the beginning. We have a client in the global supply chain, so we get information from Telefonica's sales and distribution. Getting all that information into this system allows us to process it, get KPIs, and create outgoing information for business intelligence tools.

In the cloud provider enterprise we get all the information from the gamers, like delays, response, and information from the games. It allows us to see if gamers are having trouble, high latency or any other kind of issue. They test that and get information about the issues in order to solve them.

What is most valuable?

I like the combination of all the tools that allow me to provide solutions and enable me to solve the use cases I'm working on. You need tools or components to foresee everything, and they are all in our emails. Sometimes you try several of them, and sometimes one will work better than the other. So you have to test the tools to see what works for you. 

What needs improvement?

We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there are a lot of things that need to improve. I believe they are working on that. 

For how long have I used the solution?

I've been using this solution for about a year and a half now.

How was the initial setup?

It's been quite easy to install. We only had to follow the instructions and there weren't many problems. That's important for us.

What other advice do I have?

I will rate this solution a nine out of ten because nothing is ever perfect. You will always face problems, but I'm quite happy with Cloudera. 

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
it_user692565 - PeerSpot reviewer
DBA team manager at a financial services firm with 1,001-5,000 employees
Real User
Helpful to build infrastructure for advanced analytics and is easy to install
Pros and Cons
  • "The feature I find most valuable is that the solution is easy to install and to work with. It starts with the installation and from there on the management is very simple and centralized."
  • "I would like to see an improvement in how the solution helps me to handle the whole cluster."

What is our primary use case?

I'm part of the IT team at my company, and our primary use case of this solution is building infrastructure for advanced analytics, where we copy data from our data warehouse that is now our relational database. We copy it to the Cloudera Distribution for Hadoop and then analyze it with Python and machine learning. 

What is most valuable?

The feature I find most valuable is that the solution is easy to install and to work with. It starts with the installation and from there on the management is very simple and centralized.

What needs improvement?

I would like to see an improvement in how the solution helps me to handle the whole cluster. For example, when I drill down to a specific tool, like Kafka, Cloudera Manager doesn't really help me. Then I have to search Google for other Kafka knowledge and tools.

For how long have I used the solution?

I've been using this solution for about three years now.

What do I think about the stability of the solution?

It is a very stable solution.

What do I think about the scalability of the solution?

Not many people are currently using this solution at my organization, but I do believe it is scalable. I don't, however, have experience with upgrading or adding users. 

How are customer service and technical support?

My problem is that I started using Cloudera Express without technical support and then I purchased the Enterprise edition through another company. So now I don't really have access to Cloudera support, even though I hardly need to use it. 

How was the initial setup?

The initial setup was simple, but we had trouble implementing the cables in the Hadoop solution.

What other advice do I have?

I had a bad experience connecting the Cloudera Distribution for Hadoop cluster to my other resources in the company, like Active Directory or the firewall. I would like the outside environment to be easier to handle. I would rate this eight out of ten because the solution doesn't cover everything. It is a very complicated solution because it contains a lot of internal tools.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
PeerSpot user
Lead Consultant - Product Development at FIS (http://www.fisglobal.com/)
Real User
We use this solution to use big data for our analyses

What is our primary use case?

Our core product is an insurance product, and the actuarial module is quite complex. So far, SMEs have collected data from various sources into Excel sheets and run the analytics through macros, which is a very crude form of analysis. So we decided to use big data for this analysis.

How has it helped my organization?

That is still in the POC stage. As I mentioned, our analysts used to do the actuarial work on a spreadsheet, but after the Hadoop implementation they are gaining confidence that the analysis is now more appropriate and faster. We are now exploring a cloud implementation as well.

What is most valuable?

Keeping multiple copies of each file, and the MapReduce tools like Pig and Hive: thanks to their flexibility, it is easy to develop an application with little or almost no knowledge of Java and SQL. Also, the capability to handle huge data sizes.

What needs improvement?

On the product side, I don't have much to comment. But other upcoming technologies like RPA, AI, Go, etc. have ample training materials with a variety of use cases, which users can understand and align with their current requirements. By comparison, I haven't seen much training material from Cloudera.

For how long have I used the solution?

One to three years.

What do I think about the stability of the solution?

It seems quite stable; I didn't face any issues.

What do I think about the scalability of the solution?

It is very stable; I didn't face any performance issues.

Which solution did I use previously and why did I switch?

No. When we heard of Hadoop, we tried only that. I mean, we tried to migrate from spreadsheets to Hadoop.

How was the initial setup?

Very straightforward. A typical Windows-type installation: Next, Next, Next clicks.

What about the implementation team?

In-house.

What was our ROI?

Another department handles all of this, so I can't comment on that.

What's my experience with pricing, setup cost, and licensing?

 

Which other solutions did I evaluate?

Not really.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user