
Cassandra vs Cloudera Distribution for Hadoop comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Jan 7, 2025

 

Categories and Ranking

Cassandra
Ranking in NoSQL Databases: 4th
Average Rating: 8.0
Reviews Sentiment: 6.1
Number of Reviews: 24
Ranking in other categories: Vector Databases (15th)

Cloudera Distribution for H...
Ranking in NoSQL Databases: 8th
Average Rating: 8.0
Reviews Sentiment: 6.4
Number of Reviews: 50
Ranking in other categories: Hadoop (2nd)
 

Mindshare comparison

As of April 2025, in the NoSQL Databases category, the mindshare of Cassandra is 10.8%, down from 12.8% the previous year. The mindshare of Cloudera Distribution for Hadoop is 1.9%, down from 2.8% the previous year. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

Himanshu Amodwala - PeerSpot reviewer
Well-equipped to handle a massive influx of data and billions of requests
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-time updates is paramount. For instance, when a customer leaves comments or feedback on an image, they anticipate an immediate reflection of these changes on the portal. Similarly, sellers altering product attributes or updating images expect instant visibility of these modifications. Handling large data volumes with Cassandra has been an excellent experience. Despite challenges related to the influx, these were not attributed to Cassandra itself but rather to middle-layer issues. Generally, it demonstrated scalability with workloads, thanks to its horizontal scaling capabilities. We could easily add new nodes to the system as needed, ensuring the platform coped well with increasing loads. The tool's most beneficial feature for scalability is its entire architecture. The absence of a single point of failure or a leader within the ecosystem contributes to its robust scalability. This key aspect influenced our decision to opt for the Cassandra ecosystem. In terms of performance, it demonstrated the ability to handle approximately 1.6 billion requests per day. This was achieved on AWS using EC2 instances, and it was during a period about four to five years ago.
Rok Dolinsek - PeerSpot reviewer
Enables on-premise implementation with powerful data processing capabilities
This is the only solution that is possible to install on-premise. Cloudera provides a hybrid solution that combines compute on cloud or on-premises. It includes all machine learning algorithms in the Spark machine learning library. All functionalities needed for a big data platform and ETL are on the platform, eliminating the need for other tools. It is scalable, ready for vertical scaling, and very powerful, offering numerous functionalities and configurations for generative AI.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"I am getting much better performance than relational databases."
"Cassandra offers high availability and fault tolerance, making it suitable for large-scale data storage and real-time processing."
"Can achieve continuous data without a single downtime because of node to node ring architecture."
"Our primary use case for the solution is testing."
"The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-time updates is paramount."
"Its retrieval is similar to an RDBMS, so our team finds it easy to adapt."
"I am satisfied with the performance."
"The most valuable features of Cassandra are the NoSQL database, high performance, and zero-copy streaming."
"We had a data warehouse before all the data. We can process a lot more data structures."
"The scalability of Cloudera Distribution for Hadoop is excellent."
"We're now able to store large volumes of data through Cloudera Distribution for Hadoop. We're able to push large volumes of data to the platform, and that used to be a challenge, especially when storing a terabyte of information. This is the area where Cloudera Distribution for Hadoop improved the organization."
"The most valuable feature is that I can use CDH for almost all use cases across all industries, including the financial sector, public sector, private retailers, and so on."
"The most valuable feature is Kubernetes."
"The data science aspect of the solution is valuable."
"Very good end-to-end security features."
"With a cluster available, you can manage the security layer using the shared SDX - it provides flexibility."
 

Cons

"The disk space is lacking. You need to free it up as you are working."
"While Cassandra can handle NoSQL, I think there should be more flexibility for whole schema design when data is stored in wide columns. Additionally, I believe that eventual consistency should be enhanced."
"Maybe they can improve their performance in data fetching from a high volume of data sets."
"Doesn't support a solution that can give aggregation."
"The solution is not easy to use because it is a big database and you have to learn the interface. This is the case though in most of these solutions."
"The initial setup of Cassandra can be difficult in the configuration. There might be a need to have assistance. The implementation process can take six months for connecting to certain databases."
"Cassandra can improve by adding more built-in tools. For example, if you want to do some maintenance activities in the cluster, we have to depend on third-party tools. Having these tools built-in would be a benefit."
"I want Cassandra to update its open-source version more quickly. It's already feature-rich, but I'd appreciate better integration with other NoSQL databases like MariaDB or MongoDB. If I ever need to work with customers or vendors using different NoSQL databases, having native integration in Cassandra would make managing and interacting with their databases much easier."
"We experienced many issues when we started working with Hadoop 3.0 in the Cloudera 6.0 version, so there are a lot of things that need to improve."
"This is a very expensive solution."
"The one thing that we struggled with predominately was support. Because it was relatively new, support was always a big issue and I think it's still a bit of an ongoing concern with the team currently managing it."
"The user infrastructure and user interface need to be improved, as well as the performance. The GUI needs to be better."
"Currently, we are using many other tools such as Spark and Blade Job to improve the performance."
"The Cloudera training has deteriorated significantly."
"It could be faster and more user-friendly."
"The tool doesn't support reporting, and relational databases are still the major source of reporting data. Apache Iceberg will be launched soon within the Cloudera cluster for analytical purposes. The Cloudera Machine Learning aspect could be tuned and enhanced to enable us to host some predictive analytics machine learning and AI use cases."
 

Pricing and Cost Advice

"We pay for a license."
"Cassandra is a free open source solution, but there is a commercial version available called DataStax Enterprise."
"There are licensing fees that must be paid, but I'm not sure if they are paid monthly or yearly."
"We are using the open-source version of Cassandra, the solution is free."
"I don't have the specific numbers on pricing, but it was fairly priced."
"I use the tool's open-source version."
"The price is very high. The solution is expensive."
"Cloudera Distribution for Hadoop is expensive, with support costs involved."
"Cloudera requires a license to use."
"When comparing with Oracle Sybase and SQL, it's cheaper. It's not expensive."
"The pricing must be improved."
"The tool is not expensive."
"The solution is fairly expensive."
"It is an expensive product."
849,686 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews
Financial Services Firm: 22%
Computer Software Company: 15%
Comms Service Provider: 5%
University: 5%

Financial Services Firm: 25%
Computer Software Company: 15%
Educational Organization: 14%
Manufacturing Company: 6%
 

Company Size

By reviewers (chart categories: Large Enterprise, Midsize Enterprise, Small Business)
 

Questions from the Community

What do you like most about Cassandra?
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-ti...
What needs improvement with Cassandra?
While Cassandra can handle NoSQL, I think there should be more flexibility for whole schema design when data is stored in wide columns. Additionally, I believe that eventual consistency should be e...
What do you like most about Cloudera Distribution for Hadoop?
The tool can be deployed using different container technologies, which makes it very scalable.
What is your experience regarding pricing and costs for Cloudera Distribution for Hadoop?
The price for Cloudera is average, yet it is very good compared to other solutions. It can be deployed on-premises, unlike competitors' cloud-only solutions.
What needs improvement with Cloudera Distribution for Hadoop?
It is quite complicated to configure and install. Integrating the platform into an information system is always a challenge, especially when starting with on-premise implementation. Integrating wit...
 


Sample Customers

Apple, Netflix, Facebook, Instagram, Twitter, eBay, Spotify, Uber, Airbnb, Adobe, Cisco, IBM, Microsoft, Yahoo, Reddit, Pinterest, Salesforce, LinkedIn, Hulu, Walmart, Target, Sony, Intel, HP, Oracle, SAP, GE, Siemens, Volkswagen, Toyota
37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
Find out what your peers are saying about Cassandra vs. Cloudera Distribution for Hadoop and other solutions. Updated: April 2025.