We performed a comparison between Apache Hadoop and Infobright DB based on real PeerSpot user reviews.
"It's primarily open source. You can handle huge data volumes and create your own views, workflows, and tables. I can also use it for real-time data streaming."
"We selected Apache Hadoop because it is not dependent on third-party vendors."
"The most valuable feature is scalability and the possibility to work with major information and open source capability."
"The solution is easy to expand. We haven't seen any issues with it in that sense. We've added 10 servers, and we've added two nodes. We've been expanding since we started using it since we started out so small. Companies that need to scale shouldn't have a problem doing so."
"The most valuable feature is the database."
"Initially, with RDBMS alone, we had a lot of work and few servers running on-premise and on cloud for the PoC and incubation. With the use of Hadoop and ecosystem components and tools, and managing it in Amazon EC2, we have created a Big Data "lab" which helps us to centralize all our work and solutions into a single repository. This has cut down the time in terms of maintenance, development and, especially, data processing challenges."
"Data ingestion: It has rapid speed, if Apache Accumulo is used."
"The best thing about this solution is that it is very powerful and very cheap."
"It has a very amazing smart grid query feature for very fast aggregate queries across millions of rows."
"The stability of the solution needs improvement."
"The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be bypassed by processing the data in chunks."
"There is a lack of virtualization and presentation layers, so you can't take it and implement it like a radio solution."
"I think more of the solution needs to be focused around the panel processing and retrieval of data."
"It requires a great deal of learning curve to understand. The overall Hadoop ecosystem has a large number of sub-products. There is ZooKeeper, and there are a whole lot of other things that are connected. In many cases, their functionalities are overlapping, and for a newcomer or our clients, it is very difficult to decide which of them to buy and which of them they don't really need. They require a consulting organization for it, which is good for organizations such as ours because that's what we do, but it is not easy for the end customers to gain so much knowledge and optimally use it."
"The solution needs a better tutorial. There are only documents available currently. There's a lot of YouTube videos available. However, in terms of learning, we didn't have great success trying to learn that way. There needs to be better self-paced learning."
"General installation/dependency issues were there, but were not a major, complex issue. While migrating data from MySQL to Hive, things are a little challenging, but we were able to get through that with support from forums and a little trial and error."
"In the next release, I would like to see Hive more responsive for smaller queries and to reduce the latency."
"Only the data from the columns that reached 2GB will actually decrease. Other columns below 2GB in size do not leave the disk."
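One reviewer above notes that the memory limitation can be bypassed by processing the data in chunks. As a rough illustration of that general idea only, here is a minimal Python sketch. It does not use any Hadoop or Infobright API, and the function names `chunks` and `chunked_sum` are hypothetical, chosen for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunks(rows: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` items from `rows`."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def chunked_sum(rows: Iterable[int], size: int = 1000) -> int:
    """Aggregate `rows` while holding only one chunk in memory at a time."""
    total = 0
    for batch in chunks(rows, size):
        # Only the partial result is carried from one chunk to the next.
        total += sum(batch)
    return total
```

The same pattern applies to any aggregate that can be combined from partial results, such as counts, sums, minimums and maximums. Aggregates that need the whole data set at once, such as an exact median, would need a different approach.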
Apache Hadoop is ranked 5th in Data Warehouse with 33 reviews while Infobright DB is ranked 27th in Data Warehouse. Apache Hadoop is rated 7.8, while Infobright DB is rated 7.6. The top reviewer of Apache Hadoop writes "A file system for data collection that contains needed information and files". On the other hand, the top reviewer of Infobright DB writes "If you need a real big data solution, look for a distributed solution that actually has a proven track record". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Snowflake and Teradata, whereas Infobright DB is most compared with MySQL and LocalDB.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.