Apache Hadoop Valuable Features
Hadoop File System is a perfect choice if we want to use any database systems or file systems because it is open-source. It has no cost. Or else, we’ll have to use Amazon S3 or Azure database, for which we will have to pay a lot. A lot of big data processing needs a proper partition and structure. Hadoop File System is compatible with almost all the query engines. That’s another reason why people would be very comfortable working with the Hadoop ecosystem.
View full review »What I like about Apache Hadoop is that it's for big data, in particular big data analysis, and it's the easier solution. I like the data processing feature for AI/ML use cases the most because some solutions allow me to collect data from relational databases, while Hadoop provides me with more options for newer technologies.
View full review »GM
reviewer2324613
Data Architect at a computer software company with 51-200 employees
It's open-source, so it's very cost-effective. Apache Hadoop has its strengths. For example, in my previous organization, which was a small startup, we used it because it was cost-effective.
We only had to pay for the servers, and we could optimize applications and performance using our employees, which was especially cost-effective in India. So, human resources were the main investment, not software.
That was five years ago, though. In the last five years, I've mainly seen Redshift, Azure, and Oracle in the market.
View full review »Buyer's Guide
Apache Hadoop
April 2024
Learn what your peers think about Apache Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
767,995 professionals have used our research since 2012.
AM
reviewer1976262
Credit & Fraud Risk Analyst at a financial services firm with 10,001+ employees
The ability to take a lot of data and attempt to basically deliver the appropriate splices and summary chart is the most crucial function that I have discovered.
This stands in contrast to some of the other tools that are available, such as SQL and SAS, which are likely incapable of handling such a large volume of data. Even R, for instance, is unable to handle such data volumes.
Apache Hadoop can manage large amounts and volumes of data with relative ease, which is a feature that is beneficial.
View full review »The most important feature is its ability to handle large volumes. Some of our customers have really large volumes, and it is capable of handling their data in terms of the core volume and daily incremental volume. So, its processing power and speed are most valuable.
Another feature that I like is online analysis. In some cases, data requires online analysis. We like using Hadoop for that.
View full review »RC
Randy Chng
Senior Associate at a financial services firm with 10,001+ employees
Impala. As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time, and is relatively fast when reading data into other platforms like R (for further data analysis) or QlikView (for data visualisation).
Its integration is Hadoop's best feature because that allows us to support different tools in a big data platform. Hadoop can integrate all of these features in various environments and have use cases beyond all of the tools in the environment.
View full review »YM
Yevgen Manzhulyanov
CEO at AM-BITS LLC
The most valuable feature is scalability and the possibility to work with major information and open source capability.
View full review »SF
Samuel Feinberg
Analytics Platform Manager at a consultancy with 10,001+ employees
- Scalability
- Parallel processing
There are jobs that cannot be done unless you have massively parallel processing; for instance, processing call-detail records for telecom.
View full review »The Distributed File System, which is the base of Hadoop, has been the most valuable feature with its ability to store video, pictures, JSON, XML, and plain text all in the same file system.
View full review »We don't use many of the Hadoop features, like Pig, or Sqoop, but what I like most is using the Ambari feature. You have to use Ambari otherwise it is very difficult to configure.
What comes with the standard setup is what we mostly use, but Ambari is the most important.
View full review »AM
Arul Mani
CEO
HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new Big Data analytics prospect.
View full review »YT
Yogesh Thakkar
Business data analyst at RBSG Internet operations
One valuable feature is that we can download data. Another is that it is a low-cost solution. Hadoop has also made it feasible to have all the data available in one area.
View full review »JP
reviewer1384338
Vice President - Finance & IT at a consumer goods company with 1-10 employees
The data is stored in micro-partitions which makes the processes very fast compared to other RDBMS systems. Apache Spark is in the memory process, and it's much better than MapReduce.
Micro-partitions and the HDFS are both excellent features.
View full review »MS
MahalingamShanmugam
Works
The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics.
View full review »- Storage
- Processing (cost efficient)
DD
reviewer901065
Partner at a tech services company with 11-50 employees
Hadoop is extensible — it's elastic.
View full review »MB
reviewer1040328
IT Expert at a comms service provider with 1,001-5,000 employees
I liked that Apache Hadoop was powerful, had a lot of tools, and the fact that it was free and community-developed.
View full review »YM
Yevgen Manzhulyanov
CEO at AM-BITS LLC
The ability to add multiple nodes without any restriction is the solution's most valuable aspect.
View full review »MB
reviewer1040328
IT Expert at a comms service provider with 1,001-5,000 employees
The most valuable thing about this program for us is that it is very powerful and very cheap. We're using a lot of the program's modules and features because we're using software and hardware that can be difficult to integrate. For example, we're using supersets and a lot of old products from difficult systems. We love having the various options and features that allow us to work with flexibility.
SS
reviewer1433400
Technical Lead at a government with 201-500 employees
The distributed processing is excellent.
On the solution, Spark is very good.
The performance is pretty good.
View full review »CB
Chitharanjan Billa
Database/Middleware Consultant (Currently at U.S. Department of Labor) at a tech services company with 51-200 employees
- Data ingestion: It has rapid speed, if Apache Accumulo is used.
- Data security
- Inexpensive
GA
reviewer1464630
Founder & CTO at a tech services company with 1-10 employees
I actually like most of the capabilities, but I think Spark has added reposit capabilities on top of the Hadoop ecosystem. The Spark area includes the capabilities that I like the most with Hadoop.
View full review »The most valuable feature is the database.
View full review »HDFS allows you to store large data sets optimally.
View full review »The most valuable features are powerful tools for ingestion, as data is in multiple systems.
View full review »The solution is perfect for when you have big data. It's good for managing and replication.
It's good for storing historical data and handling analytics on a huge amount of data.
View full review »High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization.
View full review »Buyer's Guide
Apache Hadoop
April 2024
Learn what your peers think about Apache Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
767,995 professionals have used our research since 2012.