Apache Hadoop Room for Improvement
Vice President - Finance & IT at a consumer goods company with 1-10 employees
I'm not sure if I have any ideas as to how to improve the product.
Every year, the solution comes out with new features; Spark is one example. If they continue to release helpful features, it will continue to increase the value of the solution.
The solution could always improve its performance; this is a consistent requirement. Whenever you run it, there is always room for improvement in terms of performance.
The solution needs a better tutorial. Currently, only documents are available. There are a lot of YouTube videos, but we didn't have great success trying to learn that way. There needs to be better self-paced learning.
We would prefer it if users didn't just get pushed through to certification-based learning, as certifications are expensive. Maybe they could arrange for the certification to be offered at a lower cost. The certification cost is currently around $2,500 or thereabouts.
Apache Hadoop's real-time data processing is weak and is not enough to satisfy our customers, so we may have to pick other products. We are continuously researching other solutions and other vendors.
Another weak point of this solution, technically speaking, is that it's very difficult to run and difficult to implement smoothly. Preparation and integration are important.
The integration of this solution with other data-related products and solutions, along with additional functions such as API connectivity, is what I want to see in the next release.
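On the subject of API connectivity, Hadoop does already expose a WebHDFS REST API over HTTP. As a minimal sketch (the hostname, port, path, and user below are placeholder assumptions, not a real cluster), this is how a read request against that API is addressed:

```python
# Sketch: building a request URL for Hadoop's WebHDFS REST API.
# Host, port, HDFS path, and user name are placeholder assumptions.
from urllib.parse import urlencode

def webhdfs_open_url(host: str, port: int, hdfs_path: str, user: str) -> str:
    """Build the WebHDFS URL for the OPEN (file read) operation."""
    query = urlencode({"op": "OPEN", "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{hdfs_path}?{query}"

url = webhdfs_open_url("namenode.example.com", 9870, "/data/sales.csv", "analyst")
print(url)
# http://namenode.example.com:9870/webhdfs/v1/data/sales.csv?op=OPEN&user.name=analyst
```

In practice a GET on this URL is answered by the NameNode with a redirect to a DataNode that serves the file content, so any HTTP client can consume HDFS data without Hadoop client libraries.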
Founder & CTO at a tech services company with 1-10 employees
I don't have any concerns because each part of Hadoop has its use cases. To date, I haven't implemented a huge product or project using Hadoop, but on the level of POCs, it's fine.
The community of Hadoop is now a cluster; I think there is room for improvement in the ecosystem.
From the Apache perspective or the open-source community, they need to add more capabilities to make life easier from a configuration and deployment perspective.
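To illustrate the configuration burden the reviewer mentions: Hadoop is configured through several hand-edited XML files such as core-site.xml. A minimal sketch below parses one such property with the standard library; `fs.defaultFS` is a real Hadoop property name, but the hostname and port are placeholder assumptions:

```python
# Sketch: Hadoop reads cluster settings from XML files like core-site.xml.
# fs.defaultFS is a real property; the hostname/port are placeholders.
import xml.etree.ElementTree as ET

core_site = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""

def get_property(xml_text: str, name: str) -> str:
    """Return the value of a named property from a Hadoop-style config file."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    raise KeyError(name)

print(get_property(core_site, "fs.defaultFS"))
# hdfs://namenode.example.com:8020
```

Multiply this name/value XML format across core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml on every node, and the desire for simpler configuration and deployment tooling is easy to understand.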
Technical Lead at a government with 201-500 employees
We use Apache Hadoop with our visualization tools, and it is very slow.
It lacks a proper query language. We have to use Apache Linux. Even so, the query language still has limitations and only a little documentation, and many of the visualization tools do not have direct connectivity. They need something like BigQuery, which is very fast. We need those capabilities to be available in the cloud and scalable.
The solution needs to be more powerful and offer better availability for handling queries.
The solution is very expensive.
Partner at a tech services company with 11-50 employees
Hadoop's security could be better.
Co-Founder at a tech services company with 201-500 employees
It would be helpful to have more information on how to best apply this solution to smaller organizations, with less data, and grow the data lake.