We performed a comparison between Apache Hadoop and Teradata based on real PeerSpot user reviews.
Find out in this report how the two Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing."
"The performance is pretty good."
"The scalability of Apache Hadoop is very good."
"What comes with the standard setup is what we mostly use, but Ambari is the most important."
"Hadoop File System is compatible with almost all the query engines."
"Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."
"As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time, and is relatively fast when reading data into other platforms like R."
"The most valuable feature is the database."
"It handles large amounts of information with a linear performance increase, in relation to a HW investment."
"Auto-partitioning and indexing, and resource allocation on the fly are key features."
"The most valuable features are the Shared-nothing architecture and data protection functionality."
"The functionality of the solution is excellent."
"It is a solid database a lot of different tools to move data."
"The two types of partitioning have been very significant for us - row and columnar partitioning."
"Improved performance of ETL procedures, reporting."
"There are several features of Teradata that I like. One of the most basic is the indexes. I also like that it provides lower TCO. It also has the optimizer feature which is a good feature and isn't found in other legacy systems. Parallelism is also another feature I like in Teradata because when you are running or hosting on multiple systems, you have this shared-nothing architecture that helps. Loading and unloading in Teradata are also really helpful compared to other systems."
"It requires a great deal of learning curve to understand. The overall Hadoop ecosystem has a large number of sub-products. There is ZooKeeper, and there are a whole lot of other things that are connected. In many cases, their functionalities are overlapping, and for a newcomer or our clients, it is very difficult to decide which of them to buy and which of them they don't really need. They require a consulting organization for it, which is good for organizations such as ours because that's what we do, but it is not easy for the end customers to gain so much knowledge and optimally use it."
"The solution is very expensive."
"The key shortcoming is its inability to handle queries when there is insufficient memory. This limitation can be bypassed by processing the data in chunks."
"The load optimization capabilities of the product are an area of concern where improvements are required."
"Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our efforts and the time needed to prove to a customer that this will help them."
"We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it."
"The stability of the solution needs improvement."
"The price could be better. I think we would use it more, but the company didn't want to pay for it. Hortonworks doesn't exist anymore, and Cloudera killed the free version of Hadoop."
"Sometimes the large injestion takes days to load data, and some of our stored procedures take two to three days."
"The setup is not straightforward."
"I think the UI is not there yet. It could be improved by being more user-friendly."
"GUI of administrative tools is really outdated."
"Teradata is a bit late for the cloud."
"The following could be better: licensing, architecture openness, integration with other tools."
"Teradata needs to expand the kind of training that's available to customers. Teradata only offers training directly and doesn't delegate to any third-party companies. As a result, it's harder to find people trained on Teradata in our market relative to Oracle."
"I would like to see an improved Knowledge Base on the web."
Apache Hadoop is ranked 5th in Data Warehouse with 32 reviews while Teradata is ranked 3rd in Data Warehouse with 54 reviews. Apache Hadoop is rated 7.8, while Teradata is rated 8.2. The top reviewer of Apache Hadoop writes "A file system for data collection that contains needed information and files". On the other hand, the top reviewer of Teradata writes "Offers seamless integration capabilities and performance optimization features, including extensive indexing and advanced tuning capabilities". Apache Hadoop is most compared with Azure Data Factory, Microsoft Azure Synapse Analytics, Oracle Exadata, Snowflake and BigQuery, whereas Teradata is most compared with SQL Server, Snowflake, Oracle Exadata, MySQL and Teradata IntelliFlex. See our Apache Hadoop vs. Teradata report.
We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.