Engineering Manager/Solution architect at a computer software company with 201-500 employees
Real User
2021-12-02T14:41:01Z
One of the valuable features of this solution is that it's a managed service, so it's pretty stable and scales as much as you need. It has all the necessary distributions. With some additional work, it's also possible to change the Spark version on the latest version of EMR. Hudi also comes out of the box in EMR, so we are leveraging Apache Hudi on EMR for change data capture.
Hadoop architectures refer to the various design patterns and configurations used to implement Hadoop, an open-source framework for distributed storage and processing of large datasets.
Amazon EMR is a good solution that can be used to manage big data.
It allows users to access the data through a web interface.
It has a variety of options and support systems.
Amazon EMR's most valuable features are processing speed and data storage capacity.
The project management is very streamlined.
The solution is scalable.
When we migrate big jobs from on-prem to the cloud, we do it in EMR with Spark.
The solution is pretty simple to set up.
We are using applications such as Splunk, Livy, Hadoop, and Spark, all of them in Amazon EMR, and they're helping us a lot.
This is the best tool for hosts and it's really flexible and scalable.
The initial setup is pretty straightforward.