
Hortonworks Data Platform Overview

Hortonworks Data Platform is the #4 ranked solution among top Hadoop tools. PeerSpot users give Hortonworks Data Platform an average rating of 8 out of 10. Hortonworks Data Platform is most commonly compared to Amazon EMR: Hortonworks Data Platform vs Amazon EMR. Hortonworks Data Platform is popular among the large enterprise segment, which accounts for 73% of users researching this solution on PeerSpot. The top industry researching this solution is government, accounting for 24% of all views.
Buyer's Guide

Download the Hadoop Buyer's Guide including reviews and more. Updated: June 2022

What is Hortonworks Data Platform?
Hortonworks is a leading innovator in the industry, creating, distributing and supporting enterprise-ready open data platforms and modern data applications. Our mission is to manage the world's data. We have a single-minded focus on driving innovation in open source communities such as Apache Hadoop, NiFi, and Spark. We, along with our 1600+ partners, provide the expertise, training and services that allow our customers to unlock transformational value for their organizations across any line of business. Our connected data platforms power modern data applications that deliver actionable intelligence from all data: data-in-motion and data-at-rest. We are Powering the Future of Data.

Hortonworks Data Platform was previously known as Hortonworks, HDP.

Hortonworks Data Platform Customers
Mayo Clinic, Symantec, Progressive Insurance, Noble Energy, Cardinal Health, Rogers, Mercy, Neustar, TRUECar, T-Mobile

Archived Hortonworks Data Platform Reviews (more than two years old)

Senior IT Officer- Head of Administration, System Administration Division for Unix and Linux Servers at a financial services firm with 10,001+ employees
Real User
A cost-effective alternative for managing our big data
Pros and Cons
  • "Now, using this solution, it is much cheaper to have all of the data available for searching, not in real-time, but whenever there is a pending request."
  • "I would like to see more support for containers such as Docker and OpenShift."

What is our primary use case?

We use this solution to look at and manage big data. It's mostly historical data that we offload from our data warehouse, as well as from other databases on other platforms. We have two different installations. The first one is based on IBM POWER CPUs, and the other one is based on Intel CPUs. Our data center is on-premise. There is some thought of moving to a private cloud, or a private IBM cloud, but we have not proceeded with that as of yet.

How has it helped my organization?

This solution is a cheaper way for us to offload the otherwise expensive data. We can move data from outdated database versions, such as Oracle 10. It is now out of support, but still hosts some of our historical data. This solution has helped us move our data to the current version. Previously, we had our data on more expensive platforms. Now, using this solution, it is much cheaper to have all of the data available for searching, not in real-time, but whenever there is a pending request.

What needs improvement?

We have had problems with the backup and with services that require a disaster site. We are still struggling with some of these issues. We are having trouble with Active Directory and Hive integration. I would like to see more support for containers such as Docker and OpenShift.

For how long have I used the solution?

About a year and a half.
Buyer's Guide
Hadoop
June 2022
Find out what your peers are saying about Cloudera, IBM, Amazon and others in Hadoop. Updated: June 2022.
610,045 professionals have used our research since 2012.

What do I think about the stability of the solution?

We have had some issues with the code, but it's mostly from the developers. From our side, we don't see any issues with stability, although it may be that we have a lot of unused CPU capacity.

What do I think about the scalability of the solution?

We have not acquired any additional hardware since our initial purchase. However, we expect more use cases to be added, at which point we may have performance or scalability problems.

How was the initial setup?

The initial setup is not very difficult. The configuration is not easy, but somebody with some experience is able to set it up. We had users for which we had to set up quotas and queues. For us, the basic installation was completed within a matter of a week.

What about the implementation team?

We had IBM set up both of our installations. 

What other advice do I have?

This is a good product, but we still have some issues with backup, and the performance monitors that we install on every system. There may be solutions, but we're struggling to integrate them. This is a product that I recommend. It's a solution that comes at a lower price, and it works well if you don't have expectations that it will behave like a much more expensive system. I would rate this solution an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Oguzhan Herkiloglu - PeerSpot reviewer
Senior HPC and BigData Architect at BitNet
Real User
Provides a complete solution and just one user interface that can manage all the packages
Pros and Cons
  • "The Hortonworks solution is so stable. It is working as a production system, without any error, without any downtime. If I have downtime, it is mostly caused by the hardware of the computers."
  • "I work a lot with banking, IT and communications customers. Hortonworks must improve or must upgrade their services for these sectors."

What is our primary use case?

Hortonworks actually provides a complete solution and just one user interface that can manage all the packages. It can monitor all the requirements, all the versions and additionally all the queues and all the hardware-dependent services. What I want is a useful user interface, which is why I currently prefer to use Hortonworks.

What is most valuable?

One of the most valuable features is that you can configure the data nodes in your big data cluster however you want. Normally, for example, if you are testing websites and Hash clouds and other sites, in most of them you must manage more than three or four requirements. For example, you must install each feature, you must compile some additional things, and also you must manage more than three configuration files to enable all the nodes to work together.

In the Hortonworks solution, you just need the service, and you just want to install it once to get started on projects easily. You can just click run and it's already installed and you can create and communicate between your services.

What needs improvement?

I work a lot with banking, IT and communications customers. Hortonworks must improve or must upgrade their services for these sectors.

Each customer has different requirements. From the IT side, someone who has some experience of the cluster, computer clustering, computer networking, different fire defense, for example, it is so important that they have some additional graphics, some additional service reports added to the Hortonworks current user interface, which could provide easy images. Especially if they are using it without any experience first.

For how long have I used the solution?

I've been using the solution for three years.

What do I think about the stability of the solution?

The Hortonworks solution is very stable. It works as a production system, without any error, without any downtime. If I have downtime, it is mostly caused by the hardware of the computers.

What do I think about the scalability of the solution?

The solution is very scalable. In our procedure, there are two we are using in production without any downtime. The most important thing is the hardware from the computer cluster. I can work on more than 1,000 servers at the same time. On the communication solution, currently, 50 people use it at the same time.

How was the initial setup?

The initial setup is so easy. You can just watch a video. It's handy. If you have some knowledge of computer networking and computer clusters, it is so easy. Deployment time depends on the project and the project size. Sometimes it takes more than three hours to complete.

What's my experience with pricing, setup cost, and licensing?

The solution is comprehensible but it also depends on the customers and the customer's stability requirements. I know that Hortonworks is stable, but sometimes when you are talking with customers, they wonder how Hortonworks can be enterprise-grade if it is free. I explain that Hortonworks is open source.

What other advice do I have?

The solution is an open-source project. If you don't want to use their professional support services, you don't pay anything for Hortonworks and its solutions. When you want to call there or use its user interface, it's paid. This is why I prefer Hortonworks solutions in my projects.

I would rate this solution 10 out of 10.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Solution Architect at Teradata Corporation
Vendor
We use it for data science activities. Security and workload management need improvement.

What is our primary use case?

We use it for data science activities.

How has it helped my organization?

Data is now available.

What is most valuable?

I have no preferences towards any feature.

What needs improvement?

  • Security
  • Performance
  • Workload management

For how long have I used the solution?

Less than one year.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user742794 - PeerSpot reviewer
User at a comms service provider with 10,001+ employees
Vendor
Enabled us to implement fraud detection and improve performance at a lower cost
Pros and Cons
  • "Ranger for security; with Ranger we can manage users’ permissions/access controls very easily."
  • "Hive performance. If Hive performance increased, Hadoop would replace (not everywhere) traditional databases."

What is most valuable?

A few of them, namely: Hive/Tez, HBase, Ranger, Yarn and Ambari. Ambari helps manage the platform, and Hive is very easy to use. Ranger for security; with Ranger we can manage users’ permissions/access controls very easily.

How has it helped my organization?

We have successfully ported a Microsoft SSIS product application into Hadoop, which saved millions of dollars for the company while also delivering better performance. We also implemented fraud detection, as quickly as possible, for online orders. (Fraudulent orders became a big headache for our company. The early detection of fraud is saving the company a lot of money.)

What needs improvement?

Hive performance. If Hive performance increased, Hadoop would replace (not everywhere) traditional databases (Oracle/Teradata, etc.), which would save a lot of money for the company.

For how long have I used the solution?

I have been working on this HDP platform since Jan 2015.

What do I think about the stability of the solution?

No, our company is a satisfied customer.

What do I think about the scalability of the solution?

No, not at all.

What other advice do I have?

Product is good. Reason I gave a rating of eight is that their community is very large and relatively very quick in bug fixes.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user742737 - PeerSpot reviewer
BigData(QA & RnD) with 51-200 employees
Vendor
The user-friendly feature of the Ambari Web UI is one of its best features. On the other hand, the Ambari upgrade is difficult.

What is most valuable?

  • Ambari Web UI: user-friendly
  • Views for Hive, Tez, Pig
  • Spark and Ranger

How has it helped my organization?

It has helped our organisation cater to clients who are using Big Data for data storage and analysis combined with our security product.

What needs improvement?

Deleting any service requires a lot of clean up, unlike Cloudera.

For how long have I used the solution?

Five years.

What do I think about the stability of the solution?

Not until now.

What do I think about the scalability of the solution?

No.

How are customer service and technical support?

Very supportive, prompt responses.

Which solution did I use previously and why did I switch?

We didn't use a previous solution.

How was the initial setup?

The Ambari upgrade is not very user-friendly.

What's my experience with pricing, setup cost, and licensing?

Not applicable.

Which other solutions did I evaluate?

Cloudera and MapR.

What other advice do I have?

It's a great company with a great product employing dedicated people.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Big Data - Senior Solutions Architect at a tech vendor with 10,001+ employees
Vendor
It is open and there is no lock-in.

What is most valuable?

We evaluated Cloudera and Hortonworks. Based on our evaluation and actual experience in production of 60 nodes and development of 12 nodes, the most valuable features of Hortonworks are:

  • 100% open
  • No lock-in like Cloudera
  • Fast and accurate support instantly
  • Largest number of committers to Hadoop by any means
  • Hive is better in performance and ease of use compared to Impala

How has it helped my organization?

It helps a lot with data in motion (ingesting and managing data in real time). We are able to do 3rd-party data monetization of our data within a t+20 minute time frame to our end customers.

What needs improvement?

  • Cost
  • Reliability
  • Speed
  • Ease of use

For how long have I used the solution?

I have used it for three years.

What was my experience with deployment of the solution?

I initially encountered deployment issues, but they were very good in resolving them.

What do I think about the stability of the solution?

I have not encountered stability issues.

What do I think about the scalability of the solution?

I have not encountered any scalability issues at all. That's the key reason we picked HDP over Cloudera, as Cloudera has issues and doesn't support compression of Hive in ORC format. They push only their products (not good).

How are customer service and technical support?

Customer Service:

Customer service has been excellent from day one until now, and our Admin is comfortable with the SLA and turnaround time.

Technical Support:

Technical support is very good and proactive with SmartSense.

Which solution did I use previously and why did I switch?

We previously used a different solution. We switched from Cloudera. Initially, we went with Cloudera due to it being a popular choice in the market, etc., then realized it was a bad choice. Before we scaled from 6 nodes to 12 nodes and before we went live in production, we scrapped it due to Impala's performance and lock-in.

How was the initial setup?

Using Ambari, it was easy to set up and we even tried the AWS for a test cluster.

What about the implementation team?

An in-house team implemented it: two admins, seven developers, one data scientist, one PM and 22 business users at the customer (end-user side).

What was our ROI?

ROI is 300%.

What's my experience with pricing, setup cost, and licensing?

Hortonworks is the best, comparing all three flavors. If all is well, we might use open source alone in the next three years; others you can't due to lock-in...

Which other solutions did I evaluate?

Before choosing this product, we also evaluated Cloudera.

What other advice do I have?

It is the best in terms of product vision and actual delivery.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Solution Architect at MIMOS Berhad
Real User
Top 5 Leaderboard
It gives us semantic analysis based on the feeds from social networking data, clickstream data, etc., but it needs to support disaster recovery features such as mirroring.

What is most valuable?

  • It's the one and only complete open source big data platform
  • Ambari-managed admin configuration for HDFS, YARN, Hive, HBase, etc.
  • Customized dashboards
  • Web-based HDFS browser
  • SQL editor for Hive
  • Apache Phoenix - OLTP and operational analytics on Hadoop
  • Apache Zeppelin - A web-based notebook that enables interactive data analytics

How has it helped my organization?

  • Maintenance of our own data lake in the enterprise-level
  • Storage and analysis of server logs
  • Applying Operational Intelligence in the enterprise-level based on the analysis of various department units data
  • Semantic analysis based on the feeds from social networking data, clickstream data, etc.

What needs improvement?

  • Rolling upgrade
  • Disaster recovery features such as mirroring should be supported

For how long have I used the solution?

We've used it for one year.

What was my experience with deployment of the solution?

No issues encountered.

What do I think about the stability of the solution?

No issues encountered.

What do I think about the scalability of the solution?

No issues encountered.

How are customer service and technical support?

Customer Service:

3/10

Technical Support:

3/10

Which solution did I use previously and why did I switch?

No previous solution was in place.

How was the initial setup?

It's easy to set up.

What about the implementation team?

We did it in-house.

What's my experience with pricing, setup cost, and licensing?

Completely use the community edition along with other features that can be implemented on top.

Which other solutions did I evaluate?

No other solutions were looked at.

What other advice do I have?

Study, analyze, and compare with other big data platforms features according to your requirements before choosing the appropriate one.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
CTO at a tech services company
Real User
The setup of Hadoop was easy thanks to Ambari, but installing the security components was complex.

What is most valuable?

It has a powerful, user-friendly interface called Ambari which allowed us to administrate our cluster easily.

How has it helped my organization?

It allows us to perform a data lake implementation to handle/treat huge amounts of data, or what we call the "terrible bytes".

What needs improvement?

Integrate a complete Hive web client (Ambari views), like Hue today, in the next release.

For how long have I used the solution?

I've used it for three years.

What do I think about the stability of the solution?

HDP v2.3 is a stable release, but we have encountered some issues linked to HBase (fixed by: HBase server and region server should not be installed on the same node).
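The placement rule above can be checked mechanically. The following is a minimal Python sketch, not part of the reviewer's setup: the host names and the layout are hypothetical, the component labels are modeled on Ambari-style names, and it assumes that "HBase server" refers to the HBase Master.

```python
# Hypothetical host-to-component layout, made up for illustration only.
# Assumption: "HBase server" in the review means the HBase Master.
layout = {
    "node1.example.com": {"NAMENODE", "HBASE_MASTER"},
    "node2.example.com": {"DATANODE", "HBASE_REGIONSERVER"},
    "node3.example.com": {"DATANODE", "HBASE_MASTER", "HBASE_REGIONSERVER"},
}


def colocated_hbase_hosts(layout):
    """Return the hosts that run both an HBase Master and a RegionServer."""
    both = {"HBASE_MASTER", "HBASE_REGIONSERVER"}
    # A host violates the rule when both components are in its component set.
    return sorted(host for host, components in layout.items() if both <= components)


print(colocated_hbase_hosts(layout))
```

With this sample layout, only the third host is reported, since it is the only one carrying both components.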

How are customer service and technical support?

It has good documentation, but it's not fully complete for complex security needs (Knox/Ranger with a cluster).

Which solution did I use previously and why did I switch?

We have installed our first cluster using native Apache repositories.

How was the initial setup?

The setup of Hadoop was easy thanks to Ambari, but installing the security components was complex.

What about the implementation team?

Our cluster has been implemented in-house. We have automated the entire installation of Hadoop, set-up and configuration included.

What's my experience with pricing, setup cost, and licensing?

Hadoop/Cloudera is still much cheaper than Oracle's RDBMS if you want to handle a huge amount of data and run complex analytics.

It's 40,000€ for 10 Hadoop nodes vs. 1.7 million euros for an Oracle server with 40 cores.
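To put those two quotes side by side, here is a quick back-of-the-envelope calculation in Python. It uses only the figures quoted above; the two quotes are not sized for the same workload, so the ratio is a rough indication rather than a like-for-like comparison.

```python
# Figures quoted in the review (euros).
hadoop_cluster_cost = 40_000    # quote for 10 Hadoop nodes
hadoop_nodes = 10
oracle_server_cost = 1_700_000  # quote for one Oracle server with 40 cores
oracle_cores = 40

cost_per_hadoop_node = hadoop_cluster_cost / hadoop_nodes
cost_per_oracle_core = oracle_server_cost / oracle_cores
price_ratio = oracle_server_cost / hadoop_cluster_cost

print(f"Hadoop: {cost_per_hadoop_node:,.0f} EUR per node")
print(f"Oracle: {cost_per_oracle_core:,.0f} EUR per core")
print(f"The Oracle quote is {price_ratio:.1f}x the Hadoop quote")
```

That works out to 4,000 EUR per Hadoop node, 42,500 EUR per Oracle core, and an overall price ratio of 42.5.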

What other advice do I have?

In short, I recommend this product simply because Hortonworks is the only distribution that runs on Linux and Windows Servers.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Consultant at a tech services company with 51-200 employees
Consultant
It enables customers to perform sentiment analysis from social media data to engineering analytics. NameNode High Availability is still not stable.

Valuable Features:

Hortonworks is 100% Open Source. Hortonworks does a great job in managing all different components of Hadoop.

Improvements to My Organization:

We've done multiple implementations of it. It enables customers to perform sentiment analysis from social media data to engineering analytics.

Room for Improvement:

Security: Although they support Knox, Ranger and Kerberos, they are still missing attribute-level encryption features.

NameNode High Availability is still not stable (memory issues).

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user347793 - PeerSpot reviewer
Principal Consultant - Big Data with 501-1,000 employees
Vendor
It is improving rapidly, but like other flavors of Hadoop there is room for improvement.

What is most valuable?

  • Ambari
  • Hive
  • Sqoop
  • Flume
  • Spark

How has it helped my organization?

The Hadoop value proposition is in expanded functionality, linear scalability, and reduced software and infrastructure costs. Hadoop offers several generic frameworks for batch, real-time, and iterative processing, such as MapReduce, Spark, and Spark Streaming. Additionally, these frameworks provide libraries for predictive analytics and machine learning. This type of expanded functionality is not easily achieved on any other single platform.

What needs improvement?

The file system should provide indexed access to individual records with in-place update/delete. Security integration should also be available through a common interface for authentication, authorization, disk encryption, network encryption, data access layer, data masking, etc.

Hadoop does not provide improved performance compared to a traditional RDBMS unless it is processing batches in the TB-PB range, or the Hadoop platform has significantly more resources available.

For how long have I used the solution?

I have implemented various flavors of Hadoop over the past five years, including platform configuration and application development.

What was my experience with deployment of the solution?

Deployment is improving rapidly, but like other flavors of Hadoop there are always issues.

What do I think about the stability of the solution?

Stability is improving rapidly, but like other flavors of Hadoop there are always issues.

How are customer service and technical support?

5/10 - Responsive, but like all flavors of Hadoop, there are too many tickets to be reasonably triaged and supported.

Which solution did I use previously and why did I switch?

I have implemented various flavors of Hadoop such as Hortonworks and Cloudera over the past five years, including platform configuration and application development.

How was the initial setup?

Straightforward once you know what you’re doing.

What about the implementation team?

I work for a vendor team.

What was our ROI?

ROI is one of the main reasons organizations pursue Hadoop. Cost per TB is a compelling factor.

Which other solutions did I evaluate?

Hadoop is complex. It takes a dedicated approach from individuals with a broad range of technology skills and commitment to overcome challenges that do not normally present themselves in well-established technologies.

What other advice do I have?

Hadoop is complex. It takes a dedicated approach from individuals with a broad range of technology skills and commitment to overcome challenges that do not normally present themselves in well-established technologies.

Disclosure: My company has a business relationship with this vendor other than being a customer: We're partners.
PeerSpot user
Lead IT Consultant at a tech services company with 5,001-10,000 employees
MSP
We've integrated our current distribution of it with Tableau. We had issues upgrading to the newer versions, but these were resolved with their help.

What is most valuable?

The features I've found most valuable are:

  • Ambari UI
  • Hive
  • Pig
  • Tableau integration with this distribution

How has it helped my organization?

It's easy to deploy and we've used this distribution for some of our recommendation and trend analysis use cases.

For how long have I used the solution?

I've used it for almost one year.

What was my experience with deployment of the solution?

No issues encountered.

What do I think about the stability of the solution?

No issues encountered.

What do I think about the scalability of the solution?

We faced some issues while upgrading to newer versions with current distributions, but with their support we solved it.

How are customer service and technical support?

Customer Service:

Customer service is great.

Technical Support:

Technical support is great.

Which solution did I use previously and why did I switch?

No, we did not use a previous solution.

How was the initial setup?

Initial setup was straightforward.

What about the implementation team?

We implemented it with our in-house team.

Disclosure: My company has a business relationship with this vendor other than being a customer: We're partners.
PeerSpot user
Associate Consultant at a tech vendor with 501-1,000 employees
Vendor
The Ambari UI is valuable for cluster monitoring, but there are certain features that need tuning, such as the Hue UI.

What is most valuable?

From a product standpoint, their Ambari UI is incredibly valuable for cluster monitoring. It simplifies the deployment and maintenance of hosts, and we can provision, configure and test Hadoop services.

How has it helped my organization?

From an overall perspective, Hortonworks support is crucial to our operations.

What needs improvement?

As this is open source, there are certain features that need tuning, such as the Hue UI. More stability on this would be helpful.

For how long have I used the solution?

I've used it for one year.

What was my experience with deployment of the solution?

As this is all new technology, we face issues at every level. However, hardware support and documentation have been instrumental in helping us resolve the majority of those issues.

What do I think about the stability of the solution?

We've had some issues with stability, but hardware support and documentation have helped us resolve most of those.

What do I think about the scalability of the solution?

No issues with scalability.

How are customer service and technical support?

They have outstanding customer support. Their responses are prompt, and they resolve issues quickly.

Which solution did I use previously and why did I switch?

I have not used a solution of this nature before.

How was the initial setup?

The setup is straightforward enough, but at every level there are many parameters to be tuned. Ensuring all these parameters are set is the complex part, as poorly set parameters can cause unwanted issues.

What about the implementation team?

We have an in-house team to do implementations. I would advise that all implementations get seen through all the way to having users smoke test applications to ensure correct functionality.

What other advice do I have?

I would suggest that if you are implementing this at an enterprise level, the support is compulsory. Additionally having a high degree of patience is key, as this is open source and road bumps can be frequent when moving at a fast pace.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Big Data Architect at a tech services company with 1,001-5,000 employees
Consultant
We have faster processing times for our apps, but it needs to automate deployment on multiple nodes.

What is most valuable?

There are several features that are most valuable for us:

  • Hue
  • Hive
  • Spark
  • S3

How has it helped my organization?

With it, we have faster processing times for our apps.

What needs improvement?

It needs to be quicker and to have the ability to automate deployment on multiple nodes.

For how long have I used the solution?

I've used it for two years.

What was my experience with deployment of the solution?

Sometimes there were issues.

What do I think about the stability of the solution?

Sometimes there were issues.

What do I think about the scalability of the solution?

Sometimes there were issues.

How are customer service and technical support?

I've not had to use it.

Which solution did I use previously and why did I switch?

No solution had been used previously, but we are using it alongside AWS EMR.

How was the initial setup?

It was complex to configure.

What about the implementation team?

It was done in-house.

What other advice do I have?

We provide services for product implementation, so people looking for such products can contact me.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user346956 - PeerSpot reviewer
Cyber Security and Analytics Engineer at a government with 1,001-5,000 employees
Vendor
We can collect data from different databases, and where the data is similar, it allows for a detailed analysis from a single data store. It could improve, though, on the ability to update data.

Valuable Features

Ease of deployment and management of the Hadoop cluster are features we've found most valuable.

Improvements to My Organization

It allows our organization to collect data from databases that are different, and where the data is similar, it allows for a detailed analysis from a single data store.

Room for Improvement

The ability to update data is an area where the product could improve.

Use of Solution

I've used it for one year.

Deployment Issues

We had an issue during deployment. You have to be sure that your base image is perfect and that your infrastructure is properly configured or issues will occur.

Customer Service and Technical Support

We don't have their paid support, but I have had discussions with their engineers and they have been extremely helpful. So based on that, I would give them 8/10.

Initial Setup

The initial setup is complex. It mainly stems from small issues that typically pop up and also a lack of experience in deploying the product. I highly suggest taking the Hortonworks Training prior to deploying.

Implementation Team

We used an in-house team. Take your time and utilize the free resources provided by Hortonworks.

ROI

At this point I don't believe I could provide a ROI as we aren't fully utilizing the product.

Pricing, Setup Cost and Licensing

If possible, I would suggest paying for the professional services which would give you on-site engineers to help deploy the cluster.

Other Solutions Considered

We did look at Cloudera, but due to having literally no money to spend for the project, we chose Hortonworks due to its being completely free and open source.

Other Advice

Take your time and script as much as you can so that all base images are the same.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user344022 - PeerSpot reviewer
Data science engineer at a tech services company with 501-1,000 employees
Consultant
We are capable of processing various data science tasks, e.g. natural language processing or log processing.

What is most valuable?

  • Open-source
  • Big community

How has it helped my organization?

It is a different paradigm than standard relational databases. We can also process tasks other than just those related to the standard database world. For example, we are capable of processing various data science tasks, e.g. natural language processing or log processing.

What needs improvement?

  • Stability
  • It needs to be more mature
  • Security
  • User friendliness

For how long have I used the solution?

I've used it for three years alongside MapR and Cloudera.

What was my experience with deployment of the solution?

Almost every part of the Hadoop ecosystem has its problems and bugs.

What do I think about the stability of the solution?

Almost every part of the Hadoop ecosystem has its problems and bugs.

What do I think about the scalability of the solution?

Almost every part of the Hadoop ecosystem has its problems and bugs.

How are customer service and technical support?

The paid service is pretty good, but if you don't pay, there is documentation available in the community which is pretty good.

Which solution did I use previously and why did I switch?

I have some experience with Cloudera, which is very similar to Hortonworks, but parts of it are not open source. I work more with Hortonworks because all its parts are open source and my company has a longer partnership with Hortonworks.

How was the initial setup?

It is easy if you have good administrators. It is also easy if you want to just play with it on your laptop. For real work and stability, I definitely recommend some paid support.

What about the implementation team?

I was involved in multiple projects. Usually, it was done in-house with paid support.

What was our ROI?

Every project is different. Since Hadoop is long-term infrastructure, there is no simple ROI. Also, each customer has different expectations.

Disclosure: My company has a business relationship with this vendor other than being a customer: We have a partnership with all major Hadoop vendors.
it_user343344 - PeerSpot reviewer
Business Objects Consultant at a manufacturing company with 1,001-5,000 employees
Vendor
We can perform sentiment analysis on Twitter data, but it needs a better UI.

What is most valuable?

Its flexibility is the most valuable feature because you can leverage any Hadoop component and take full advantage of its open source capabilities.

How has it helped my organization?

We're able to perform sentiment analysis on Twitter data.

What needs improvement?

It needs a better UI.

For how long have I used the solution?

I used it for five months, one year ago.

What was my experience with deployment of the solution?

It requires too much coding work; we're not good Java and Python developers.

What do I think about the stability of the solution?

No issues.

What do I think about the scalability of the solution?

No issues.

How are customer service and technical support?

Customer Service:

They are detailed and informational.

Technical Support:

We got stuck with Java job development and were able to get assistance from tech support.

Which solution did I use previously and why did I switch?

No previous solution was used.

How was the initial setup?

It was straightforward since we have Linux resources.

What about the implementation team?

We did it in-house.

Which other solutions did I evaluate?

  • Datameer
  • Cloudera

What other advice do I have?

You need to be tech savvy.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Big Data Consultant at a tech services company with 51-200 employees
Consultant
It allows us to provide our customers with data insights that they previously were unable to obtain, but the governance initiatives are far from production ready.

What is most valuable?

Its ability to scale out seamlessly with little to no effort is very valuable to us. All the tools in the stack are built from the ground up to support massive amounts of data.

How has it helped my organization?

It allows us to provide our customers with data insights that they previously were unable to obtain.

What needs improvement?

There have been some governance initiatives, but they are far from production ready. I would like to see a big improvement in that space, as governance is critical in many regulated industries.

For how long have I used the solution?

I've been using it for one year.

What do I think about the stability of the solution?

Stability is good if configured properly, but for some tools, such as HBase, configuration is extremely hard to get right.

What do I think about the scalability of the solution?

Scalability is superb.

How are customer service and technical support?

Customer Service:

I never interacted with customer support.

Technical Support:

Cloudera and vanilla Big Data tech. We continue to use them alongside Hortonworks, depending on our clients' preferences and needs.

Which solution did I use previously and why did I switch?

Cloudera and vanilla Big Data tech. We continue to use them alongside Hortonworks, depending on our clients' preferences and needs.

How was the initial setup?

With Ambari, it is pretty straightforward, but I have no idea why they prefer FQDN over IP.
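For readers preparing hosts for an Ambari install, the following is a minimal sketch of a pre-flight hostname check. It assumes, as the Ambari installation documentation describes to the best of our knowledge, that hosts should be registered by fully qualified domain name with working forward and reverse DNS. The helper names `looks_fully_qualified` and `dns_round_trip` are hypothetical and not part of Ambari.

```python
import ipaddress
import socket


def is_ip_literal(name):
    """Return True if `name` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def looks_fully_qualified(name):
    """Syntactic check only: at least two non-empty dot-separated labels,
    and not an IP address. It does not prove the name resolves."""
    if is_ip_literal(name):
        return False
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


def dns_round_trip(name):
    """Forward-resolve `name`, then reverse-resolve the resulting address.

    Requires working DNS (or /etc/hosts entries); raises an OSError
    subclass if either lookup fails.
    """
    address = socket.gethostbyname(name)
    reverse_name = socket.gethostbyaddr(address)[0]
    return address, reverse_name


if __name__ == "__main__":
    local = socket.getfqdn()
    print(local, looks_fully_qualified(local))
```

A host passes this sketch when `looks_fully_qualified` is true and `dns_round_trip` returns the same name that was looked up. What to do on a mismatch depends on the site's DNS setup.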

What about the implementation team?

My colleagues and I are the implementation team. The general advice is to start out with a small enough scope. Try to get an MVP up and running before bringing out the big guns.

What's my experience with pricing, setup cost, and licensing?

Licensing is on a per-node basis, which encourages people to scale vertically rather than horizontally, yet the whole purpose of the tools they sell is to scale horizontally. I do like that everything is also freely available for those who do not require support.

What other advice do I have?

Make sure you understand what happens under the hood. Out-of-the-box tools are sub-par. Customisation is the way to go for now.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Infrastructure Engineer at Zirous, Inc.
Real User
It's increased the amount of data that we store from sensor data and weblogs, which gives us a greater scope of data to analyze. However, I'd like to see an increase in usability for Apache Storm.

What is most valuable?

The HDFS (Java-based file system) and Hive Utilities are proving to be most useful.

How has it helped my organization?

Hortonworks has allowed my organization to increase the amount of data that we regularly store from sensor data and weblogs, which in turn gives us a greater scope of data to analyze.

What needs improvement?

I would like to see an increase in usability for the Apache Storm engine within the data platform.

For how long have I used the solution?

I have been using it for less than a year.

What was my experience with deployment of the solution?

When initializing our cluster, we did not allocate enough space to our VAR partition and that ended up causing some issues with the networking to our onsite Tomcat server.
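A problem like this can be caught before cluster services start by checking free space on the relevant partitions. The sketch below is one way to do that with the Python standard library. The 20% threshold and the list of paths are assumptions for illustration, not Hortonworks recommendations.

```python
import os
import shutil


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of free space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def has_headroom(path, min_free_fraction=0.2):
    """True if at least `min_free_fraction` of the filesystem is free.

    The default of 0.2 is an arbitrary illustrative threshold.
    """
    return free_fraction(path) >= min_free_fraction


if __name__ == "__main__":
    # Paths are assumptions; adjust to wherever logs and data actually live.
    for path in ("/var", "/var/log"):
        if os.path.exists(path):
            status = "ok" if has_headroom(path) else "LOW"
            print(f"{path}: {free_fraction(path):.0%} free ({status})")
```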

How are customer service and technical support?

Customer Service:

It's fairly low customer service.

Technical Support:

It's fairly low technical support.

Which solution did I use previously and why did I switch?

We started off with this product.

How was the initial setup?

It was both straightforward and complex. Everything was easy to set up, but many of the behind-the-scenes configuration changes for customization could be rather time-consuming.

What about the implementation team?

We used an in-house team. My advice is to study hard and read all the documentation thoroughly before starting any implementation. It is paramount that one understands the system before implementing it.

What was our ROI?

Current ROI is none as we are still in the POC phase with most of our products.

What other advice do I have?

Be sure that the product is necessary for the situation.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
ICT Consultant (Advanced Infrastructure) at a tech services company with 1,001-5,000 employees
Consultant
The Ambari server provides the user an easy way to manage, administrate, and configure their clusters, but it needs to support having more than two HDFS namenodes.

What is most valuable?

There’s not only one; the whole Hadoop stack is valuable: the distributed file system HDFS, Spark, Kafka, HBase, etc. Hortonworks certainly has the most up-to-date version of each Hadoop component.

Compared to the other Hadoop distributions, the Ambari server gives the user an easy way to manage, administer, and configure their cluster. Ambari also provides a single view that lets you use different Hadoop components from the same web interface.

How has it helped my organization?

This product lets the organization install and configure a Hadoop cluster easily and quickly. With this cluster, the organization can store and process its data and draw out specific insights from it, for example, previously unknown common points between clients, or key factors that increase or decrease client churn.

What needs improvement?

It would be interesting to have an easy way to implement multi-tenancy for HDFS with federation. At the moment, you have to do it manually on the command line.

Also, it needs to support more than two HDFS namenodes. HDFS supports more than two namenodes, but Hortonworks doesn't.

For how long have I used the solution?

I have worked with it on different projects and POCs for two years now.

What was my experience with deployment of the solution?

The only issue that I had was when I tried to reinstall the software on every node. You have to manually clean up everything, as Hortonworks doesn’t provide the ability to perform a clean uninstall (software, libraries, logs, configuration files, etc.). In some cases, it can cause problems if the uninstall has not been done correctly.

How are customer service and technical support?

I never had to open a support case, so I don’t know. I have always found the answers to my questions on the web (forums or blogs). There’s a big community that can support you.

Which solution did I use previously and why did I switch?

I also used Cloudera, MapR, and Microsoft HDInsight.

How was the initial setup?

The first time, I didn’t know anything about Big Data and Hadoop, so yes it was difficult because I did not clearly understand what I was doing.

What about the implementation team?

The implementation was at the client's datacenter. My advice is to perform a POC on premises or in a virtual machine to learn how to use it and how to tune the configuration of each Hadoop component.

When implementing it in production, you first need a clear view of the requirements for performing the install. For example, if you are using a local repository to install the software, it has to be updated with Hortonworks sources, especially if there are security rules (firewall access, root access limitation, etc.).

My last piece of advice is that if you have a heavy load, it is really important to implement the solution on premises, not in a virtualized environment. If you do both, you will see the difference in performance.

What's my experience with pricing, setup cost, and licensing?

Hortonworks is free to use; there’s no license, but support is available if you want it. It’s up to you to see whether you need it (certainly) and maybe to negotiate it.

Which other solutions did I evaluate?

I did not really make the choice; the client made it based on their experience, the functionality of each distribution, data privacy, and the licensing/support price.

What other advice do I have?

First, perform a POC to learn and to get an idea of the load of your future applications. Then you should be able to correctly design the needed infrastructure.

Disclosure: My company has a business relationship with this vendor other than being a customer: We are partners.