Cloudera Distribution for Hadoop is valued for its easy installation and management through Cloudera Manager, comprehensive security features like Sentry and encryption, and excellent data processing capabilities with Impala. It supports large-scale data management with tools like Hive, Pig, and Spark, and integrates smoothly across different environments. Users appreciate its scalability, extensive documentation, proactive support, and efficient role-based access control, which together enable effective big data management with robust analytics and machine learning capabilities.
- "Cloudera Distribution for Hadoop provides numerous features and capabilities combined into one platform, offers power processing, supports different file systems and query engines, and provides parallel processing for handling many requests."
- "Cloudera provides a hybrid solution that combines compute on cloud or on-premises."
- "This is the only solution that is possible to install on-premise."
Cloudera Distribution for Hadoop has room for improvement in stability, processing speed, and integration capabilities. Users find the licensing structure expensive and suggest enhancements to training materials and support. They report challenges with documentation, deployment complexity, and the user interface. Better cloud integration, data science support, and price adjustments are also recommended to make it more competitive. Data compatibility issues and missing features pose further difficulties for many organizations.
- "If they could support modifying the data more easily than the current implementation, it would be beneficial."
- "It is quite complicated to configure and install. Integrating the platform into an information system is always a challenge, especially when starting with on-premise implementation."