
Cassandra vs Cloudera Distribution for Hadoop vs InfluxDB comparison

 

Comparison Buyer's Guide

Executive Summary

 

Mindshare comparison

As of June 2025, in the NoSQL Databases category, Cassandra's mindshare is 10.6%, down from 12.9% a year earlier. Cloudera Distribution for Hadoop's mindshare is 2.2%, down from 2.6%, and InfluxDB's mindshare is 8.8%, down from 12.8%. Mindshare is calculated from PeerSpot user engagement data.
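To put the year-over-year mindshare figures side by side, here is a minimal Python sketch that computes each product's change in percentage points and its relative change. The numbers are the ones quoted above; the function name `mindshare_change` and the choice to round to one decimal place are illustrative assumptions, not part of PeerSpot's methodology.

```python
# Mindshare figures quoted above, in percent: (previous year, June 2025).
mindshare = {
    "Cassandra": (12.9, 10.6),
    "Cloudera Distribution for Hadoop": (2.6, 2.2),
    "InfluxDB": (12.8, 8.8),
}


def mindshare_change(previous, current):
    """Return (change in percentage points, relative change in percent)."""
    points = current - previous
    relative = points / previous * 100
    # Rounding to one decimal is an illustrative choice, matching the source's precision.
    return round(points, 1), round(relative, 1)


changes = {
    name: mindshare_change(previous, current)
    for name, (previous, current) in mindshare.items()
}

for name, (points, relative) in changes.items():
    print(f"{name}: {points:+.1f} points ({relative:+.1f}% relative)")
```

The two measures answer different questions: the point change shows how much category share moved, while the relative change shows how large that move was compared with each product's starting share.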
 

Featured Reviews

Himanshu Amodwala - PeerSpot reviewer
Well-equipped to handle a massive influx of data and billions of requests
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operates 24/7, providing services to sellers and customers alike, the need for real-time updates is paramount. For instance, when a customer leaves comments or feedback on an image, they anticipate an immediate reflection of these changes on the portal. Similarly, sellers altering product attributes or updating images expect instant visibility of these modifications. Handling large data volumes with Cassandra has been an excellent experience. Despite challenges related to the influx, these were not attributed to Cassandra itself but rather to middle-layer issues. Generally, it demonstrated scalability with workloads, thanks to its horizontal scaling capabilities. We could easily add new nodes to the system as needed, ensuring the platform coped well with increasing loads. The tool's most beneficial feature for scalability is its entire architecture. The absence of a single point of failure or a leader within the ecosystem contributes to its robust scalability. This key aspect influenced our decision to opt for the Cassandra ecosystem. In terms of performance, it demonstrated the ability to handle approximately 1.6 billion requests per day. This was achieved on AWS using EC2 instances, and it was during a period about four to five years ago.
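For a sense of scale, the reviewer's figure of roughly 1.6 billion requests per day can be converted to an average per-second rate. This is only a back-of-the-envelope sketch: it assumes load spread evenly across the day, which the review does not claim, so real peak rates were presumably higher.

```python
REQUESTS_PER_DAY = 1_600_000_000  # approximate figure quoted by the reviewer
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds

# Average rate only; assumes uniform traffic, which is an assumption for illustration.
average_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY

print(f"Average load: about {average_rps:,.0f} requests per second")
```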
Rok Dolinsek - PeerSpot reviewer
Enables on-premise implementation with powerful data processing capabilities
This is the only solution that is possible to install on-premise. Cloudera provides a hybrid solution that combines compute on cloud or on-premises. It includes all machine learning algorithms in the Spark machine learning library. All functionalities needed for a big data platform and ETL are on the platform, eliminating the need for other tools. It is scalable, ready for vertical scaling, and very powerful, offering numerous functionalities and configurations for generative AI.
DeepakR - PeerSpot reviewer
An open-source database that can be used to insert data
InfluxDB is generally stable, but we've encountered issues with the configuration file in our ticket stack. For instance, a mistake in one of the metrics out of a hundred KPIs can disrupt data collection for all KPIs. This happens because the agent stops working if there's an issue with any configuration part. To address this, it is essential to ensure that all configurations are part of the agent's EXE file when provided. This makes it easier to package the agent for server installation and ensures all KPIs are available from the server. Additionally, the agent cannot encrypt and decrypt passwords for authentication, which can be problematic when monitoring URLs or requiring authentication tokens. This requires additional scripting and can prolong service restart times.

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

"I am getting much better performance than relational databases."
"The most valuable feature of Cassandra is its fast retrieval. Additionally, the solution can handle large amounts of data. It is the quickest application we use."
"Our primary use case for the solution is testing."
"Cassandra has some features that are more useful for specific use cases where you have time series where you have huge amounts of writes. That should be quick, but not specifically the reads. We needed to have quicker reads and writes and this is why we are using Cassandra right now."
"A consistent solution."
"The technical evaluation is very good."
"Can achieve continuous data without a single downtime because of node to node ring architecture."
"Some of the valued features of this solution are it has good performance and failover."
"The solution is stable."
"The tool's most interesting features are the distributed file system and unstructured data processing capability. Because we have a lot of unstructured data, like XML and social media logs, these features make it more valuable than the usual data warehousing solutions."
"The solution is reliable and stable, it fits our requirements."
"The product is completely secure."
"The most valuable feature is that I can use CDH for almost all use cases across all industries, including the financial sector, public sector, private retailers, and so on."
"The scalability of Cloudera Distribution for Hadoop is excellent."
"The data science aspect of the solution is valuable."
"In terms of scalability, if you have enough hardware you can scale out. Scalability doesn't have any issues."
"InfluxDB works as expected with excellent scalability and stability, which is critical for our application."
"In our case, it started with a necessity to fill the gap that we had in monitoring. We had very reactive monitoring without trend analysis and without some advanced features. We were able to implement them by using a time series database. We are able to have all the data from applications, logs, and systems, and we can use a simple query language to correlate all the data and make things happen, especially with monitoring. We could more proactively monitor our systems and our players' trends."
"While I would rate InfluxDB a ten on a scale of one to ten, users should be thoughtful about matching the engine to their specific needs."
"The most valuable feature of the solution is we can use InfluxDB to integrate with and plug into any other tools."
"The most valuable features of InfluxDB are the documentation and performance, and the good plugins metrics in the ecosystem."
"The platform operates very quickly. It is easy to configure, connect, and query and integrates seamlessly with Grafana."
"The user interface is well-designed and easy to use. It provides a clear overview of the data, making it simple to understand the information at hand."
 

Cons

"Fine-tuning was a bit of a challenge."
"The initial setup of Cassandra can be difficult in the configuration. There might be a need to have assistance. The implementation process can take six months for connecting to certain databases."
"We found some issues with the batch inserts when the data volume is large."
"Cassandra could be more user-friendly like MongoDB."
"Interface is not user friendly."
"It can be difficult to analyze what's going on inside of the database relative to other databases. It can also be difficult to troubleshoot sometimes."
"The solution is not easy to use because it is a big database and you have to learn the interface. This is the case though in most of these solutions."
"There were challenges with the query language and the development interface. The query language, in particular, could be improved for better optimization. These challenges were encountered while using the Java SDK."
"The solution does not support multiple languages very well and this means users need to create work-arounds to implement some solutions."
"The Cloudera training has deteriorated significantly."
"Currently, we are using many other tools such as Spark and Blade Job to improve the performance."
"The tool doesn't support reporting, and relational databases are still the major source of reporting data. Apache Iceberg will be launched soon within the Cloudera cluster for analytical purposes. The Cloudera Machine Learning aspect could be tuned and enhanced to enable us to host some predictive analytics machine learning and AI use cases."
"The competitors provide better functionalities."
"The tool's ability to be deployed on a cloud model is an area of concern where improvements are required."
"While the deployed product is generally functional, there are instances where it presents difficulties."
"There are multiple bugs when we update."
"The solution's UI can be more user-friendly."
"I've tried both on-premises and cloud-based deployments, and each has its limitations."
"InfluxDB is generally stable, but we've encountered issues with the configuration file in our ticket stack. For instance, a mistake in one of the metrics out of a hundred KPIs can disrupt data collection for all KPIs. This happens because the agent stops working if there's an issue with any configuration part. To address this, it is essential to ensure that all configurations are part of the agent's EXE file when provided. This makes it easier to package the agent for server installation and ensures all KPIs are available from the server. Additionally, the agent cannot encrypt and decrypt passwords for authentication, which can be problematic when monitoring URLs or requiring authentication tokens. This requires additional scripting and can prolong service restart times."
"In terms of features that I would like to see or have, in the community version, some features are not available. I would like to have clustering and authentication in the community version."
"InfluxDB cannot be used for high-cardinality data. It's also difficult and time-consuming to write queries, and there are some issues with bulk API."
"It is challenging to get long-running backups while running InfluxDB in a Microsoft Azure Kubernetes cluster."
"The error logging capability can be improved because the logs are not very informative."
"One area for improvement is the querying language. InfluxDB deprecated FluxQL, which was intuitive since developers are already familiar with standard querying."
 

Pricing and Cost Advice

"We are using the open-source version of Cassandra, the solution is free."
"I don't have the specific numbers on pricing, but it was fairly priced."
"We pay for a license."
"Cassandra is a free open source solution, but there is a commercial version available called DataStax Enterprise."
"There are licensing fees that must be paid, but I'm not sure if they are paid monthly or yearly."
"I use the tool's open-source version."
"The price is very high. The solution is expensive."
"The product’s price varies from project to project."
"The tool is not expensive."
"The pricing must be improved."
"The solution is expensive."
"It is an expensive product."
"I wouldn't recommend CDH to others because of its high cost."
"The price could be better for the product."
"The tool is an open-source product."
"InfluxDB is open-source, but there are additional costs for scaling."
"We are using the open-source version of InfluxDB."
"InfluxDB recently increased its price. It is very expensive now."
859,545 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Financial Services Firm: 20%
Computer Software Company: 15%
Comms Service Provider: 6%
Retailer: 6%

Financial Services Firm: 23%
Educational Organization: 15%
Computer Software Company: 14%
Energy/Utilities Company: 6%

Financial Services Firm: 14%
Computer Software Company: 12%
Manufacturing Company: 9%
Comms Service Provider: 7%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about Cassandra?
The use of Cassandra in real-time data analytics has been pivotal for our e-commerce platform. As our platform operat...
What needs improvement with Cassandra?
While Cassandra can handle NoSQL, I think there should be more flexibility for whole schema design when data is store...
What do you like most about Cloudera Distribution for Hadoop?
The tool can be deployed using different container technologies, which makes it very scalable.
What is your experience regarding pricing and costs for Cloudera Distribution for Hadoop?
The price for Cloudera is average, yet it is very good compared to other solutions. It can be deployed on-premises, u...
What needs improvement with Cloudera Distribution for Hadoop?
It is quite complicated to configure and install. Integrating the platform into an information system is always a cha...
What do you like most about InfluxDB?
InfluxDB is a database where you can insert data. However, it would be best if you had different components for alert...
What needs improvement with InfluxDB?
It is challenging to get long-running backups while running InfluxDB in a Microsoft Azure Kubernetes cluster. Replica...
What is your primary use case for InfluxDB?
InfluxDB is the main component in our large enterprise-scale streaming data application for maritime vessels. We coll...
 

Overview

 

Sample Customers

Apple, Netflix, Facebook, Instagram, Twitter, eBay, Spotify, Uber, Airbnb, Adobe, Cisco, IBM, Microsoft, Yahoo, Reddit, Pinterest, Salesforce, LinkedIn, Hulu, Walmart, Target, Sony, Intel, HP, Oracle, SAP, GE, Siemens, Volkswagen, Toyota
37signals, Adconion, adgooroo, Aggregate Knowledge, AMD, Apollo Group, Blackberry, Box, BT, CSC
eBay, AXA, Mozilla, DiDi, LeTV, Siminars, Cognito, ProcessOut, Recommend, CATS, Smarsh, Row 44, Clustree, Bleemeo
Updated: June 2025.