Oracle Exadata vs. HPE Vertica vs. EMC GreenPlum vs. IBM Netezza

We’re in the process of comparing Oracle Exadata vs. HPE Vertica vs. EMC GreenPlum vs. IBM Netezza. 

Which one is better?

it_user462477 - PeerSpot reviewer
User with 1,001-5,000 employees
16 Answers
it_user403353 - PeerSpot reviewer
Senior DBA and Architect at a tech services company with 501-1,000 employees
Jun 24, 2016

At the outset, I believe that comparing these technologies/products is not truly an apples-to-apples comparison. However, one can attempt to look at their core functionality and what they were designed to accomplish. All of them have one goal in common: to manage and analyze very large volumes of data, including Big Data. To address this, all of them have a cluster-based architecture.

Ultimately, one has to look at business goals, current infrastructure, any additional training required, and more.

Oracle Exadata –
The Exadata product family (referred to as Engineered Systems by Oracle) includes Exadata Database Machine, SuperCluster, etc. A key component is the Exadata Storage Server. It employs a massively parallel (MPP) architecture of multiple compute/DB nodes and storage/cell nodes, all communicating over the InfiniBand network. This results in a clustered shared-disk architecture. The Exadata software is optimally divided between the database servers and the Exadata cells. The DB kernel and the storage cells communicate through iDB (Intelligent DB Protocol), which allows a portion of the DB kernel activity to be shipped down to the storage layer. Exadata cells return only the rows and columns that satisfy the SQL query, thereby reducing network data movement as well as eliminating wasted CPU cycles on the DB nodes.
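The offload idea described above (cells returning only the rows and columns a query needs) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `StorageCell`, `smart_scan`, `full_scan` and `run_query` are hypothetical names made up for illustration, not Oracle APIs, and "values shipped" is just a rough proxy for network traffic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


@dataclass
class StorageCell:
    """Hypothetical stand-in for one storage cell holding part of a table."""
    rows: List[Row]

    def full_scan(self) -> List[Row]:
        # No offload: every row and every column is shipped to the DB node.
        return list(self.rows)

    def smart_scan(self, predicate: Callable[[Row], bool], columns: List[str]) -> List[Row]:
        # Offload: filter rows and project columns on the cell itself.
        return [{c: row[c] for c in columns} for row in self.rows if predicate(row)]


def run_query(cells, predicate, columns, offload) -> Tuple[List[Row], int]:
    """Return the result rows and the number of values sent over the network."""
    result: List[Row] = []
    values_shipped = 0
    for cell in cells:
        if offload:
            part = cell.smart_scan(predicate, columns)
            values_shipped += sum(len(row) for row in part)
            result.extend(part)
        else:
            part = cell.full_scan()
            values_shipped += sum(len(row) for row in part)
            # Filtering and projection now cost CPU on the DB node instead.
            result.extend({c: row[c] for c in columns} for row in part if predicate(row))
    return result, values_shipped


# Made-up sample data spread over two cells.
cells = [
    StorageCell([
        {"id": 1, "region": "EU", "amount": 10, "note": "a"},
        {"id": 2, "region": "US", "amount": 20, "note": "b"},
    ]),
    StorageCell([
        {"id": 3, "region": "EU", "amount": 30, "note": "c"},
        {"id": 4, "region": "APAC", "amount": 40, "note": "d"},
    ]),
]


def is_eu(row: Row) -> bool:
    return row["region"] == "EU"


offloaded, shipped_offloaded = run_query(cells, is_eu, ["id", "amount"], offload=True)
plain, shipped_plain = run_query(cells, is_eu, ["id", "amount"], offload=False)
print(offloaded == plain, shipped_offloaded, shipped_plain)
```

Both runs return the same rows; the offloaded run simply moves fewer values between the storage tier and the DB node, which is the effect the paragraph above describes.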

Oracle ASM is the file system and volume manager on Exadata. Exadata can be designed and deployed to handle both OLTP and DSS workloads, unlike the others, which are geared only for DSS. It can handle both RAC and single-instance databases and helps with database consolidation. However, the storage servers cannot communicate with one another; instead, all communication is forced via the InfiniBand network to a DB node and then back to the storage tier. Exadata supports both row and columnar formats through Exadata Hybrid Columnar Compression (EHCC) as well as an in-memory store.

Oracle Exadata comes in multiple configurations (quarter, half, full rack, etc.; in the case of SuperCluster, half and full rack; storage expansion rack) and can be scaled as and when required. Exadata also provides connectors for Big Data, Hadoop, ODBC, JDBC and more. Since the Sun acquisition, Oracle owns the entire stack, both hardware and software, on the Exadata.

EMC Greenplum –
Greenplum Database is an MPP (massively parallel processing) database server based on PostgreSQL open-source technology. MPP is inherently a shared-nothing architecture with 2 or more server nodes. Greenplum uses this architecture to distribute load and thereby derive high performance for very large data warehouses. This translates to 2 or more PostgreSQL database instances acting together as a single unified DBMS. All nodes can scan and process in parallel. To this extent, it is similar to an Oracle RAC database with 2 or more nodes. User interaction is similar to that of a PostgreSQL DBMS. It supports heap, row/column-oriented, and external tables. This is also true of Oracle Exadata.
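To make the shared-nothing idea a bit more concrete, here is a minimal Python sketch, assuming rows are spread across segments by hashing a distribution key and that each segment aggregates only its own slice before a coordinator merges the partial results. The function names (`segment_for`, `distribute`, `local_sum`, `parallel_sum`) and the sample data are invented for illustration; this is not Greenplum code, and threads stand in for separate server nodes.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def segment_for(key: str, n_segments: int) -> int:
    # Stable hash, so the same key always lands on the same segment.
    return zlib.crc32(key.encode("utf-8")) % n_segments


def distribute(rows, key, n_segments):
    """Assign every row to exactly one segment based on its distribution key."""
    segments = [[] for _ in range(n_segments)]
    for row in rows:
        segments[segment_for(str(row[key]), n_segments)].append(row)
    return segments


def local_sum(segment_rows, group_col, value_col):
    """Work done independently on one segment, using only its own rows."""
    totals = defaultdict(int)
    for row in segment_rows:
        totals[row[group_col]] += row[value_col]
    return dict(totals)


def parallel_sum(segments, group_col, value_col):
    """Run the local step on all segments at once, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        partials = list(pool.map(lambda s: local_sum(s, group_col, value_col), segments))
    combined = defaultdict(int)
    for partial in partials:
        for group, subtotal in partial.items():
            combined[group] += subtotal
    return dict(combined)


# Made-up sample data.
rows = [
    {"customer": "c1", "region": "EU", "amount": 10},
    {"customer": "c2", "region": "US", "amount": 20},
    {"customer": "c3", "region": "EU", "amount": 30},
    {"customer": "c4", "region": "APAC", "amount": 40},
    {"customer": "c1", "region": "EU", "amount": 5},
]

segments = distribute(rows, "customer", 3)
totals = parallel_sum(segments, "region", "amount")
print(totals)
```

The final totals do not depend on which segment a row landed on, which is why adding segments can spread the scan work without changing query results.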

EMC acquired Greenplum in July 2010. EMC has built an advanced analytics appliance (Greenplum DCA) to develop Business Intelligence as a Service (BIaaS). This has been designed for predictive analytics for Big Data.

HP Vertica –
Vertica is also built on an MPP architecture and can be scaled out. It is a column-oriented RDBMS that compresses its data. This enables Vertica to handle analytical workloads through a distributed (multiple nodes), compressed columnar architecture. It provides support for open ANSI SQL, ODBC and JDBC, along with built-in predictive analysis via R and Python. Vertica can be used together with Hadoop. The Vertica cluster architecture stores data in the database in two containers: the Write Optimized Store (WOS) and the Read Optimized Store (ROS).

The WOS stores data in memory without compression or indexing, whereas the ROS stores data on disk, where it is segmented, sorted, and compressed for high optimization. A Tuple Mover handles the data movement between the WOS and ROS.
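The WOS/ROS split described above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the class `ColumnStoreSketch`, its `moveout` method and the run-length encoding are chosen for illustration only, segmentation across nodes is left out, and the "disk" is just another in-memory list. It is not how Vertica is actually implemented.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a list: [a, a, b] -> [(a, 2), (b, 1)]."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(runs):
    return [value for value, count in runs for _ in range(count)]


class ColumnStoreSketch:
    """Toy model: an unsorted write buffer plus sorted, compressed containers."""

    def __init__(self, columns, sort_key):
        self.columns = columns
        self.sort_key = sort_key
        self.wos = []  # rows in arrival order, uncompressed
        self.ros = []  # containers: {column name: run-length-encoded values}

    def insert(self, row):
        # Writes are cheap: just append to the in-memory buffer.
        self.wos.append(row)

    def moveout(self):
        """Tuple-Mover-like step: sort, split into columns, compress, clear WOS."""
        if not self.wos:
            return
        ordered = sorted(self.wos, key=lambda row: row[self.sort_key])
        container = {c: rle_encode([row[c] for row in ordered]) for c in self.columns}
        self.ros.append(container)
        self.wos = []

    def scan_column(self, column):
        """Read one column from all ROS containers plus anything still in WOS."""
        values = []
        for container in self.ros:
            values.extend(rle_decode(container[column]))
        values.extend(row[column] for row in self.wos)
        return values


# Made-up sample data.
store = ColumnStoreSketch(columns=["region", "amount"], sort_key="region")
for region, amount in [("US", 5), ("EU", 7), ("US", 3), ("EU", 1)]:
    store.insert({"region": region, "amount": amount})

store.moveout()
store.insert({"region": "EU", "amount": 9})  # stays in WOS until the next moveout

print(store.ros[0]["region"])
print(store.scan_column("amount"))
```

Sorting before encoding is what makes the `region` column collapse into a couple of runs, which hints at why sorted columnar storage tends to compress well.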

Netezza –
Netezza employs an asymmetric massively parallel processing (AMPP) architecture. IBM acquired Netezza in Sep 2010 and has now integrated this technology into its PureData System appliance. This appliance is tailored for data warehouse workloads, and the architecture makes use of IBM blade servers and disk storage together with FPGAs (Field Programmable Gate Arrays). An FPGA is a special chip with a large number of gates that are programmable to implement most of the logical functions needed to manage streaming processing tasks. Like all of its competitors, it provides integration with ODBC, JDBC, ETL and BI tools as well.

it_user6336 - PeerSpot reviewer
Principal Technical Consultant at Khwarizm Consulting
Jun 16, 2016

If you already have Oracle DBs, then Exadata will be the solution, as training for the new machine will be almost nil, unless your team doesn't have RAC skills.
Adjusting an Oracle instance to work for DW is not much of a headache.
I have trained a large telco in Africa on Exadata. 50% of the attendees were DW staff working on Teradata. They accepted the idea that they could use Exadata for DW tasks in the future, alongside or instead of Teradata.

Oracle sees Teradata as the main competitor, not the others you mentioned here.

Exadata is shipped by one vendor only, with all components built to match each other.

Generally, when I look at other DB/DW systems, I find that they are multi-vendor inside, which is a drawback for me.

Why don't you also consider Teradata in the comparison?

If I were you, I would sit down with all the vendors and have them put real offers on the table.

Some have said that Exadata is very hard to adjust. This is very weird to me. The machine is very well balanced in configuration. With setup done by Oracle ACS or a certified partner, and after attending the one course, your team will have full control over the box.

Exadata is also the storage engine for the Big Data offering shipped by Oracle.

it_user88296 - PeerSpot reviewer
Sales at IBM
Jun 15, 2016

Go with HPE Vertica: extremely fast, platform independent, very well integrated with Hadoop (all distributions) and easy to use! Definitely the most economical solution to purchase and maintain.
I can give more info if I know the business case.

it_user429198 - PeerSpot reviewer
IT Specialist at a tech services company
Real User
Jun 15, 2016

Depends what you want to do. Exadata is a hardware solution Oracle offers to increase performance for high-volume data processing, and it plays very well if you are an Oracle shop. If your tech stack is IBM based (DB2, WebSphere, etc.), then I would certainly look at Netezza.

Usually I think of Greenplum for use cases involving data analysis and presentation. It seems to me they changed owners recently.

I have no experience with HPE.

Real User
Apr 10, 2023

@it_user429198 So here we are in 2023, on Oracle X9, and we did have HPE Vertica as a comparison. What is a good replacement for the X9, assuming we are replacing the infrastructure?

it_user100374 - PeerSpot reviewer
Network/Comms Mgr at Hewlett-Packard
Jun 15, 2016

They are different products, so a comparison is not easy without knowing the scope of the project. The main difference is that Oracle is an RDBMS platform while the other 3 are columnar DBs or similar. This means that HPE Vertica, EMC GreenPlum and IBM Netezza are much more efficient (5-8x less storage) and faster (3-7x query speed) DBs, but they lack some RDBMS capabilities and have some limits on concurrent access (hundreds vs. thousands of sessions). This could be a problem for a large DWH with thousands of simultaneous accesses, but if you don't need that many, it's not an issue.
The other big difference is price and cost of ownership: Oracle is much higher than the other 3.
Regarding the differences between Vertica, Greenplum and Netezza: HP Vertica and Greenplum are more of a DB than Netezza. IBM Netezza is more of an appliance to use in conjunction with RDBMS technologies. Vertica and Greenplum have a lot of DWH/OLAP features, covering specialized queries (dimensional queries), geolocation capabilities and DWH aspects (a resource manager to allocate more resources to specific queries, a query optimizer, ...).

it_user453681 - PeerSpot reviewer
BI Manager, Vertica ASE Certified DBA at a marketing services firm with 1,001-5,000 employees
Real User
Jun 15, 2016

I am strongly in favor of Vertica. Its columnar storage is fast and flexible, load times are lightning fast, and compression is extremely impressive. Their latest version allows for more personalized query hints. The included Database Designer helps you design your database for optimal performance. Training is top notch. Support can be a bit sketchy at times, but they are very friendly, knowledgeable and helpful. I've been using Vertica for three years and would not go back to using any other product.

it_user112773 - PeerSpot reviewer
EMEA HP Servers Category Manager at Hewlett Packard Enterprise
Jun 15, 2016

HPE Vertica is the best solution. Extremely low-latency query performance. Better compression capabilities. Lower TCO. Facebook and Uber run on Vertica.

EMC GreenPlum: High TCO because of long and complex implementation cycles.
Not well suited for real-time workloads requiring continuous load and query. Proprietary HW.

IBM Netezza: Inferior workload management (allocating query resources among different groups of users).
Expensive “forklift” upgrades driven by appliance sizing, not incremental customer need; these add significant downtime. Proprietary HW.

Oracle Exadata: High admin and skills costs (RAC). Complex to set up. Expensive. To match Vertica, customers will require a much larger hardware footprint, thus significantly increasing TCO. Proprietary HW.

it_user340983 - PeerSpot reviewer
Infrastructure Engineer at Zirous, Inc.
Real User
Jun 15, 2016

Netezza would have to be my answer as well. As mentioned before, GreenPlum is on its way out, as EMC has recently released multiple new platforms and is beginning to focus more along those lines. Netezza excels with structured data, but if you are looking for more of a catch-all for different data types, I might suggest an EMC Isilon solution.

Jun 15, 2016

My vote is for Netezza from this list.

Exadata - More generic, covering OLTP to OLAP/analytical use cases, but difficult to configure for OLAP use cases: there are 100s of parameters to deal with. The Oracle RDBMS optimizer is still maturing in its ability to easily and completely leverage Exadata configuration advantages such as those from full scans, storage indexes, etc.

Vertica - Very efficient for use cases where only specific or limited columns need to be processed, such as OLAP queries that aggregate a specific column value. However, it is not the most efficient for the wider use cases supported by an EDW. Good for reporting data marts, not for EDWs.

Greenplum - A cheaper platform; however, ETL vendor support is not great (for example, Informatica PDO is not fully supported). It uses PostgreSQL SQL, which is not very different from ANSI or Oracle SQL, but some differences remain.

Netezza - A pretty good DW appliance configuration, with very little overhead for the tuning, indexing or other configuration required in other DW appliances like Exadata. Good from an ease of deployment and use perspective as well as a performance perspective. Capable of better supporting the wider use cases of an EDW compared to the others on the list.
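The remark above about Vertica being efficient when a query touches only a few columns can be shown with a toy comparison of row and column layouts. This is a rough sketch under simplifying assumptions: the helper names are hypothetical, "values read" is only a proxy for I/O, and compression, encoding and caching are ignored.

```python
def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_store(rows, column):
    """Row layout: the whole row is fetched even though one column is needed."""
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)
        total += row[column]
    return total, values_read


def sum_column_store(columns, column):
    """Column layout: only the requested column is touched."""
    data = columns[column]
    return sum(data), len(data)


# Made-up sample data: 4 rows with 5 columns each.
rows = [
    {"id": 1, "region": "EU", "product": "a", "qty": 2, "amount": 10},
    {"id": 2, "region": "US", "product": "b", "qty": 1, "amount": 20},
    {"id": 3, "region": "EU", "product": "c", "qty": 5, "amount": 30},
    {"id": 4, "region": "APAC", "product": "d", "qty": 3, "amount": 40},
]
columns = to_columns(rows)

row_total, row_values_read = sum_row_store(rows, "amount")
col_total, col_values_read = sum_column_store(columns, "amount")
print(row_total == col_total, row_values_read, col_values_read)
```

In this model every extra column a query needs adds another full column read in the column layout, so the advantage shrinks as queries get wider, while the row layout pays the full row cost regardless.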

Business Unit technical Lead at a tech services company with 1,001-5,000 employees
Real User
Jun 15, 2016

For EDW I would have to say Netezza. Its ease of data loading and its query parallelism put it far above any competitor listed.

it_user120534 - PeerSpot reviewer
CTO at a consultancy with 51-200 employees
Real User
Jun 15, 2016

This is a question that is often asked because of the overlap and unique differentiation of each product. First, it should be known that I am in favor of Vertica as the overall leading EDW based on multiple use cases ranging from structured, semi-structured and unstructured data sets. There is a reason why Facebook uses Vertica.

GreenPlum was a good concept but it is too storage-centric imho. Applications are no longer written from the storage layer, but rather top-down with a solid API stack to black box infrastructure altogether and this makes GreenPlum less than desirable as a critical infrastructure component.

Netezza is solid and has performed well over the years. Works very well with structured data use cases and process-oriented applications.

However, it is becoming harder to justify the enterprise grade EDWs due to the innovation and performance of in-memory database architectures with a high performance storage backend (i.e., MapR, DDN, etc.). The integration of Hadoop (lakes or streams) makes the in-memory approach more effective, relevant and scalable.

Last note: the increasing demand for near real-time analytics is also driving more disruption in the traditional EDW/3-tier architecture approach. With the right storage backend and a high performing in-memory database, the ROI comparison begins to unfold rather quickly.

Data Engineer at Broadridge Financial Solutions
Real User
Jun 15, 2016

We’re in the process of comparing Oracle Exadata vs. HPE Vertica vs. EMC GreenPlum vs. IBM Netezza.
What would be your requirement: batch processing, split-second response, DW?

Greenplum is very good for batch processing and is a good data warehouse. Netezza is similar to Greenplum but has better support.


it_user462555 - PeerSpot reviewer
Technical Team Lead, Business Intelligence at a tech services company with 51-200 employees
Jun 15, 2016

There are a number of factors that you'll need to consider. The biggest advantage of all of them is data compression over a traditional RDBMS. I'm not an expert, but I have worked with Oracle Exadata, HPE Vertica and IBM Netezza. If you're already using Oracle products, then Exadata seems like the logical choice. Same for IBM and Netezza. Both, I believe, are appliances. I'm not sure about EMC GreenPlum, but HPE Vertica is the only pure columnar database that I've worked with, and it probably has the highest data compression. From my experience, you need to take time to figure out how to use the product. For example, IBM DB2 had materialized tables and IBM Netezza has no such function. The hints used in Oracle 11g didn't work once we switched over to Netezza. For HPE Vertica, if you're used to writing queries with a lot of columns, each additional column degrades performance a bit. All the issues raised are solvable, but it takes time to figure out the best approach. HPE Vertica was the cheapest to install and maintain. Support wasn't great, and that's something that would probably be better with Oracle and IBM. Hope this helps.

it_user433215 - PeerSpot reviewer
Senior Architect at a agriculture with 1,001-5,000 employees
Jun 15, 2016

That depends on your use case. What are you trying to do?

Jun 15, 2016

My very short answer, IBM netezza.

Exa and Ver are tooled together. GreenPlum may become history.


it_user145740 - PeerSpot reviewer
Consultant at a financial services firm with 5,001-10,000 employees
Real User
Jun 15, 2016

This is very much dependant on the size of the organisation's data, alongside what you're trying to achieve. If you're attempting to have a Kimball-style data warehouse, perhaps using ETL routines against a Hadoop data lake, you might well consider one of these perfectly capable tools.

If you're intending to set up an Inmon-style data warehouse, these may also be a good fit. In my personal experience, I would opt for Greenplum out of the list of technologies. My organisation uses it and it is a very good MPP solution. That said, if speed of query processing is something you're keen on, you may wish to extend your list to include the next generation of MPP databases, such as Exasol.

Related Questions
Content Manager at PeerSpot (formerly IT Central Station)
May 17, 2022
Why would you choose that one?
See 1 answer
Tech blogger
May 17, 2022
When I compared various data warehouse tools and solutions, I found Snowflake’s software as a service (SaaS) platform and Oracle Exadata to be the most effective data warehouse solutions currently available on the market. One of the things that I initially noticed about Snowflake’s software as a service (SaaS) platform was how it made my operations more efficient by enabling me to search for and find relevant data in a more efficient way than had previously been possible. Snowflake allows me to create a custom storage unit for all of my critical data. Part of this customization includes the ability to make any and all data searchable. As soon as I started to use it, it began to show its value. All of the data that I have stored in Snowflake’s virtual warehouse becomes easy for me to locate. Instead of spending long periods of time seeking the particular piece of data that I need, all I have to do is to type in a search term. This will immediately call up the information that I want to find. This aspect of the solution makes it immensely valuable. It enables me to save time that I can then devote to other critical tasks. A major advantage that Snowflake offers me is that it gives me the ability to perform a number of different functions with a single solution. It is a highly flexible solution and allows me to store organized processed data, centrally store raw data that has yet to be processed, process data through data engineering, examine data using data science, develop data applications, and securely share and take in real-time or shared data. It empowers me to make my data really and truly my own. I am able to make full use of my data and shape it in whatever way my needs dictate. Two aspects of the Oracle Exadata Database Machine that I really appreciate are its scalability and its ease of use. This solution makes it so that I can expand my digital warehouses virtually limitlessly. 
If I needed to, the Oracle Exadata Database Machine would enable me to scale my data warehouses to hold 31 petabytes of data. I can easily meet my data needs without having to worry about running out of space for my data. Additionally, every component of this tool, including its database servers, storage servers, and network, is ready to use straight out of the box. Everything about this solution is pre-configured, pre-tuned and pre-tested before we ever receive it from Oracle. All of the components work in perfect harmony without outside intervention. This means that I don’t have to struggle to deploy it or do very much to get all of the features to work in tandem. This solution also offers me the ability to easily move workloads from a data center to the cloud. I am able to migrate my data without having to worry about availability, scaling, or performance. My workloads lose nothing in the transfer. Additionally, the same database options that are available to me on my physical systems are available on the cloud. This allows me to continue using the solution the same way that I had been using it up to this point. Ultimately, either of these two solutions will empower you to take full control of every stage of your data and its lifecycle from its initial storage to its final use.
Director of Community at PeerSpot (formerly IT Central Station)
Sep 9, 2022
What are the relations between them? What are their use cases?
2 out of 7 answers
Consulting Practice Partner - Data, Analytics & AI at FH
Oct 10, 2021
Hi @Evgeny Belenky - great question. Here is the best answer, crafted by Talend:

Data Structure: Data Lake = Raw; Data Warehouse = Processed
Purpose of Data: Data Lake = Not Yet Determined; Data Warehouse = Currently In Use
Users: Data Lake = Data Scientists; Data Warehouse = Business Professionals
Accessibility: Data Lake = Highly accessible and quick to update; Data Warehouse = More complicated and costly to make change

Please read more here https://www.talend.com/resourc...
CEO at WInterCorp LLC
Oct 11, 2021
Many of the comparisons of data lake and data warehouse that you see (such as the one below from Talend) are based on an out-of-date or dumbed-down idea of the data warehouse.

The more advanced data warehouse engines:
- support a wide range of data types and formats
- can access external data (e.g., in object storage) that has never been ingested
- support data scientists as well as business users (e.g., with an ability to run Python, R, SAS routines and data science libraries on data in place in parallel in the data warehouse)
- support operational query on live, rapidly changing data

While also providing capabilities and services never provided on data lakes or their cloud-based equivalents.

Data warehouses, properly operated and housing data that is properly curated, are much more efficient, cost-effective and performant for data that is intensively shared and widely used. Data lakes are good repositories for data that is more lightly or locally used and does not merit the level of curation usually desired in a data warehouse.