PeerSpot user
Data Engineer at Broadridge Financial Solutions
Real User
Strong integration with Greenplum Servers.

What is most valuable?

Strong integration with Greenplum Servers.

How has it helped my organization?

Loading data to Greenplum server after batch processing.

For how long have I used the solution?

6 months

What was my experience with deployment of the solution?

Yes, we had issues with HA and with syncing with Greenplum.

Buyer's Guide
VMware Tanzu Greenplum
March 2024
Learn what your peers think about VMware Tanzu Greenplum. Get advice and tips from experienced pros sharing their opinions. Updated: March 2024.
768,857 professionals have used our research since 2012.

What do I think about the scalability of the solution?

You cannot expect a split-second response.

How are customer service and support?

Customer Service:

I would rate it 7 out of 10.

Technical Support:

I would rate it 8 out of 10.

Which solution did I use previously and why did I switch?

This has only been used during a POC period, not for regular use.

How was the initial setup?

The initial setup worked very well. We were using it only with Greenplum.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user373128 - PeerSpot reviewer
Data Architect & ETL Lead at a financial services firm with 1,001-5,000 employees
Vendor
Processing speed of queries used for ‘Reporting’ solutions is the most valuable feature.

Valuable Features:

Processing speed of queries used for ‘Reporting’ solutions is the most valuable feature.

Improvements to My Organization:

Not Applicable for the area I was responsible for, as we ended up migrating away from Greenplum.

Room for Improvement:

Stability and scalability for a large number of concurrent applications and their users. The results we got were very inconsistent, depending on the number of connections taken up by multiple applications and users.

When our application was first deployed on Greenplum, the number of users of the rack on which Greenplum was deployed was very limited. We got excellent query performance at that time. But as more applications were deployed, performance became very inconsistent. Sometimes the queries would run in under a second, and sometimes the same queries would take 10 times longer. The reason, we found, was that Greenplum limits the number of active concurrent connections. Once all connections are in use, any new query is queued, and response time suffers.
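
The queueing effect described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not a description of Greenplum internals: the function name `response_times`, the limit of 10 active slots, and the 0.5-second run time are all hypothetical, and every query is assumed to arrive at the same moment.

```python
import heapq

def response_times(num_queries, max_active, run_seconds):
    """Response time of each query when all arrive at t=0 and at most
    max_active may run at once; the rest wait in a first-come queue."""
    # Min-heap holding the time at which each active slot becomes free.
    slots = [0.0] * max_active
    heapq.heapify(slots)
    times = []
    for _ in range(num_queries):
        start = heapq.heappop(slots)   # earliest slot to free up
        finish = start + run_seconds
        heapq.heappush(slots, finish)
        times.append(finish)           # arrival time is 0, so finish == response time
    return times

# Hypothetical numbers: 10 active slots, each query needs 0.5 s of run time.
light = response_times(num_queries=5, max_active=10, run_seconds=0.5)
heavy = response_times(num_queries=100, max_active=10, run_seconds=0.5)

print("worst response, light load:", max(light))
print("worst response, heavy load:", max(heavy))
```

In this model the query itself never gets slower. The extra response time under heavy load is purely time spent waiting for a free slot, which is why the same query can look fast on a quiet system and slow on a busy one.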

The impression we got was that the EMC sales team that sold Greenplum to the organization did a great job. But later on, the ball was dropped when it came to educating us on which types of applications are suited to Greenplum and how to configure it for optimal performance. When Pivotal took over support of Greenplum, their consultant visited us to go over the issues we were having. He advised us that Greenplum was not the best environment for our application needs. We ended up migrating our application out of Greenplum, along with a few other applications.

Deployment Issues:

There was no issue with the deployment.

Stability Issues:

There were issues with the stability.

Scalability Issues:

There were issues with the scalability.

Other Advice:

Ensure that this is the right tool for your needs. For instance, Greenplum is not the best tool for cases where data has to be kept up to date in real time. Capacity planning is key to success, once you do decide it is the right tool for you.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user371898 - PeerSpot reviewer
Sr ETL Developer at a financial services firm with 1,001-5,000 employees
Vendor
It only takes minutes to process millions of records. The bug fixes come as many patches, like a startup, instead of scheduled releases with proper improvements.

What is most valuable?

  • Parallel processing
  • Takes minutes to process millions of records

How has it helped my organization?

This has improved our daily load process, reducing the run time by at least three to four hours, which led other departments within the organization to look to the Enterprise Data Warehouse for data.

What needs improvement?

It acts like a mainstream product, not a novice, anymore. There are a lot of areas that can be improved. The bug fixes come as many patches, like a startup, instead of scheduled releases with proper improvements.

For how long have I used the solution?

I've used it for five years.

What was my experience with deployment of the solution?

It took a while to take full advantage of this, as we had to come up with lots of proprietary solutions when linking to other products.

What do I think about the stability of the solution?

There is no issue with its stability.

What do I think about the scalability of the solution?

There is no issue scaling it.

How are customer service and technical support?

8/10. We had a technician come out at New Year in 2012 to fix a hardware failure.

Which solution did I use previously and why did I switch?

We were one of the pioneers in adopting this solution, so we got the best deal when compared to the competitors in several areas.

How was the initial setup?

The initial move was a bulk transfer from the old system to new Greenplum based solutions. It was all done by Greenplum contractors. But to get it working with other products was challenging.

What about the implementation team?

Greenplum employees developed and supported the initial move. Later they became remote consultants with support through phone and in-person as needed.

What was our ROI?

They gave us a really good deal, and we have been with them for five years even though the product was bought by a couple of different companies.

What other advice do I have?

You need strong DBAs and architects to support the initial transfer.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Senior Enterprise Technical Architect at a computer software company with 10,001+ employees
Real User
We experience performance of approximately 1TB per hour loading data to Greenplum without the use of specialized hardware.
Pros and Cons
  • "Scalable (Massive) Parallel Processing (MPP) – The ability to bring to bear large amounts of compute against large data sets with Greenplum and the EMC DCA has proven itself to be very effective."
  • "We would like to see Greenplum maintain a closer relationship with and parity to features implemented in PostgreSQL."

What is most valuable?

Of particular value to our environment and applications are the following Greenplum capabilities:

  1. Scalable (Massive) Parallel Processing (MPP) – The ability to bring to bear large amounts of compute against large data sets with Greenplum and the EMC DCA has proven itself to be very effective.
  2. Fast load of data into Greenplum – We experience performance of approximately 1TB per hour loading data to Greenplum without the use of specialized hardware.
  3. MADlib (madlib.net) – There are a number of statistical and analytical functions available within MADlib upon which we rely. Among these are linear regression, logistic regression, apriori, k-means, principal component analysis, etc.
  4. User Defined Functions in Python (UDFs in PL/Python) – Where MADlib does not provide a direct solution to an application problem, the ability to quickly prototype and deploy user defined functions with Python has been effective.
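
To put the load figure in item 2 into perspective, here is a rough Python calculation of the sustained throughput it implies. It is a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes), which the review does not specify, and the helper names are hypothetical.

```python
TB = 10**12               # assumption: decimal terabyte; the review does not say TB vs TiB
MB = 10**6
SECONDS_PER_HOUR = 3600

def sustained_mb_per_second(terabytes_per_hour):
    """Average throughput needed to sustain the given hourly load rate."""
    return terabytes_per_hour * TB / SECONDS_PER_HOUR / MB

def hours_to_load(dataset_terabytes, terabytes_per_hour=1.0):
    """Time to load a data set at a constant hourly rate."""
    return dataset_terabytes / terabytes_per_hour

rate = sustained_mb_per_second(1.0)
print(f"1 TB/hour is about {rate:.1f} MB/s sustained")
print(f"a 24 TB data set would take about {hours_to_load(24):.0f} hours at that rate")
```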

What needs improvement?

We would like to see Greenplum maintain a closer relationship with, and parity to, features implemented in PostgreSQL. The current version of Greenplum is based on a fork of PostgreSQL v8.2.15. This edition of PostgreSQL was declared end-of-life (EOL) by the PostgreSQL project in December 2011. The current version of PostgreSQL is v9.5.

For how long have I used the solution?

We began production use in November, 2011. Alongside Greenplum, we're also using EMC Data Computing Appliance v2.3.3 (8/10), of which we have two and a half racks in production, and one and a quarter racks in dev/tests.

What was my experience with deployment of the solution?

We had no issues with the deployment.

What do I think about the stability of the solution?

The only stability issues we’ve experienced have been the sporadic failover of primary segments to their mirrors. The environment continues to operate in this instance, but queries that were in flight at the time of the failover fail.

What do I think about the scalability of the solution?

We have had no issues with scalability whatsoever.

How are customer service and technical support?

The service and support we’ve received from both Pivotal and EMC has been exemplary. The exceptions to this would be:

  1. The EMC Request for Product Qualification (RPQ) process – EMC DCA support is contingent upon EMC approval of all third party software installed onto a DCA. There have been times that this approval has taken as long as 60 days to process.
  2. Root Cause Analysis of Greenplum Database Incidents – When Greenplum Database incidents have occurred (e.g. primary database segments failing over to their backup), and Pivotal has been called for support, the response has been near immediate (30 minutes or less). Additionally, the incident resolution provided has been equally expedient. Where this has caused some disappointment is the response to our request for a root cause of the incident. These requests tend to queue up and we don’t seem to get answers beyond the typical vendor response of “that’s been fixed in the next release”.

Which solution did I use previously and why did I switch?

The purchase of Greenplum was our first interaction with Pivotal. We have been a customer of EMC for a very long time.

What other advice do I have?

My primary reason for reducing points on this rating is that Greenplum is based on a fork of PostgreSQL v8.2.15 (declared EOL by the PostgreSQL project in December 2011). The current version of PostgreSQL is v9.5. There are a number of current PostgreSQL features of which we would like to take advantage (JSON support, materialized views, full text search, XML support, column-based permissions, row-based permissions, etc.).

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user76890 - PeerSpot reviewer
Engineer at a marketing services firm with 51-200 employees
Vendor
We use Greenplum for production but not for development and testing because it consumes too many resources

We use the scrum methodology to manage our engineering work. And while this does give us the flexibility to quickly respond to changing business needs, it also means that our databases’ schemas are not set in stone. To accommodate frequent changes to the way we structure our data, we’ve adapted Ruby on Rails database migrations for use beyond our Rails apps.

Our first challenge was to find and extract the migration’s functionality from Rails so that we could use it without needing to bring in the other features of Rails we didn’t need. Thankfully, this turned out to be relatively simple because almost all of the functionality can already be accessed by Rake tasks, so it was just a matter of building a Rake file that contains the tasks we wanted from Rails. The only thing we had to add to get started was a Rake task for creating new migrations, since Rails creates them either automatically when creating new model objects or through the `rails generate migration` command.

We quickly ran into problems, though, because we use Greenplum, a DBMS built on Postgres that adds support for features that help accommodate big data. Unlike standard Postgres, Greenplum adds features, such as table partitions, not found in most DBMSes, but Rails is designed to be DBMS agnostic so you can easily switch between, say, Postgres and MySQL and SQLite without any more trouble than changing a single configuration file. So we decided to cut around Rails’s database abstractions and instead directly write SQL in our migrations and dump SQL structure files rather than Rails schema files.

Only this didn’t work because, to complicate matters further, although we use Greenplum in production and staging environments, for local testing and development we use Postgres because Greenplum has minimum requirements that ended up consuming too much of our workstations’ resources. So we ultimately ended up developing a hybrid solution that allows us to support Greenplum and Postgres in the same migrations.

The two main aspects of the solution are selective application of options when making changes to the database and keeping two separate dumps of the database, one for Greenplum with Greenplum-only syntax and one for Postgres with Greenplum-only syntax removed. For example, we rewrote the Rails `create_table` migration function to add support for Greenplum options like partitions, data distribution, and append-only tables, but then have the function ignore those options when running the migrations against our development Postgres databases. This allows us to use a single set of migrations on all of our databases, drastically simplifying what could have otherwise been a gnarly challenge.
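
The idea of applying Greenplum options selectively can be sketched in a few lines. This is not our actual Rails code, which is Ruby and is not shown here; it is a simplified Python illustration in which the function name `create_table_sql`, its parameters, and the `dbms` switch are hypothetical, and only two of the options (distribution key and append-only storage) are covered.

```python
def create_table_sql(name, columns, dbms, distributed_by=None, append_only=False):
    """Build a CREATE TABLE statement.

    Greenplum-only clauses are emitted only when dbms == "greenplum";
    for any other target (e.g. plain Postgres) they are silently ignored,
    so one migration definition serves both databases.
    """
    cols = ", ".join(f"{col} {sql_type}" for col, sql_type in columns)
    sql = f"CREATE TABLE {name} ({cols})"
    if dbms == "greenplum":
        if append_only:
            sql += " WITH (appendonly=true)"              # Greenplum storage option
        if distributed_by:
            sql += f" DISTRIBUTED BY ({distributed_by})"  # Greenplum distribution key
    return sql + ";"

columns = [("id", "bigint"), ("payload", "text")]
gp_sql = create_table_sql("events", columns, "greenplum",
                          distributed_by="id", append_only=True)
pg_sql = create_table_sql("events", columns, "postgres",
                          distributed_by="id", append_only=True)
print(gp_sql)
print(pg_sql)
```

The same call produces the full statement for Greenplum and a plain `CREATE TABLE` for Postgres, which is the property that lets a single set of migrations run everywhere.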

By no means is our solution complete or perfect. It only supports those features of Greenplum that we use, and new features only get added in as we need them, plus there is some risk associated with keeping two separate schema dumps and the potential for them to get out of sync. Still, compared to maintaining a database with frequently changing schemas without the aid of migrations, it’s lightyears better.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user1127370 - PeerSpot reviewer
Co-Founder, Chief of Operations with 10,001+ employees
Real User
A scalable and future-proof solution for data warehousing
Pros and Cons
  • "We chose Greenplum because of the architecture in terms of clustering databases and being able to have, or at least utilize the resources that are sitting on a database."
  • "The installation is difficult and should be made easier."

What is our primary use case?

We install this solution for our clients. At the moment we are in the middle of an installation for a data warehouse that will be used by a telecommunications company that is based in Lesotho. We have not gone into production yet, but we have used it in a test environment and it works very well.

We are a technology company, so we handle software development, software implementation, data warehousing, and business intelligence.

We are using the on-premise deployment model. In Africa, there isn't much adoption of cloud services, so most of our clients are expecting on-premise implementation.

What is most valuable?

We chose Greenplum because of the architecture in terms of clustering databases and being able to have, or at least utilize the resources that are sitting on a database.

What needs improvement?

The installation is difficult and should be made easier. Maybe if the process was simpler it would have a quicker adoption by other developers. This could also be accomplished by providing training aids, such as videos to help with installation or using certain features. There are resources currently available on their website, but you have to search through a lot of documentation.

For how long have I used the solution?

We are currently implementing this solution.

What do I think about the scalability of the solution?

Our expectation is that the scalability will be good, as it is one of the main reasons that we have invested in this solution.

How are customer service and technical support?

To this point, I have referenced the material on the website but have not really interacted with technical support.

How was the initial setup?

The initial setup of this solution is not very simple. You need to properly follow the steps in terms of getting the whole architecture put together. We have a team of five people who are working on different aspects of the implementation.

Currently, we are focusing on the data layer. Next will be the ETL layer.

What about the implementation team?

We are using our in-house team to implement this solution for our client.

Which other solutions did I evaluate?

We have used Oracle and Microsoft SQL, but we haven't had much success. We found that Oracle was not as scalable and we were having some performance bottlenecks. Also, from a licensing perspective, Greenplum was a better choice. For all of these reasons, we have chosen to invest heavily in Greenplum.

What other advice do I have?

I would recommend this solution specifically for the scalability. This solution has a more futuristic technology, as opposed to the old school kind of data warehousing. If people are interested in getting something that is more future-proof, then I would recommend this solution.

So far, we're comfortable with what we've seen. What we have configured is addressing our needs at the moment.

I would rate this solution an eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Statistician at a financial services firm with 1,001-5,000 employees
Real User
We were able to analyze and produce output on large volumes of data very quickly which saved us lots of time.

What is most valuable?

Greenplum is an MPP architecture database. Data can be distributed across multiple nodes, and a well-chosen distribution allows queries to execute on all segments at once, which is very powerful. As long as we have good SQL knowledge, we can start playing in the platform. Greenplum uses Postgres and ANSI Standard SQL. Also, it supports many other procedural languages, such as Python, C++, and Perl.
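
Why the choice of distribution key matters can be shown with a small Python sketch. It is an illustration only: it uses MD5 as a stand-in hash (Greenplum's own hash function is different), and the segment count, column names, and data are made up.

```python
import hashlib
from collections import Counter

def rows_per_segment(keys, num_segments):
    """Assign each row to a segment by hashing its distribution key and
    return the row count per segment. MD5 is a stand-in hash for illustration."""
    def segment_of(key):
        digest = hashlib.md5(str(key).encode()).digest()
        return int.from_bytes(digest[:4], "big") % num_segments
    counts = Counter(segment_of(k) for k in keys)
    return [counts.get(seg, 0) for seg in range(num_segments)]

# Hypothetical table of 100,000 rows on 8 segments.
unique_ids = range(100_000)                           # high-cardinality key
country = ["CA"] * 90_000 + ["US"] * 10_000           # low-cardinality key

even = rows_per_segment(unique_ids, 8)
skewed = rows_per_segment(country, 8)
print("distributed by id:     ", even)
print("distributed by country:", skewed)
```

With the high-cardinality key every segment holds a share of the rows, so all segments can work on a query at once. With the low-cardinality key at most two segments hold any data, and the busiest one holds at least 90,000 rows, so most of the cluster sits idle.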

How has it helped my organization?

Greenplum is a high-powered, multi-node database that was chosen for its capacity to ingest and query data at extremely high speed, enabling in-database analytics and statistical output on granular levels of data that were inaccessible before its deployment. We were able to analyze and produce output on large volumes of data very quickly, which saved us lots of time (we used to wait for hours to get the same output). Management was able to get insights very quickly so that they could make informed decisions.

For how long have I used the solution?

I used Greenplum between Aug 2011 – Aug 2015. Almost all the members in the analytics team used Greenplum on a daily basis.

What do I think about the stability of the solution?

There were no issues, and it was doing what it was supposed to do.

What do I think about the scalability of the solution?

There were no issues, and it was doing what it was supposed to do.

How are customer service and technical support?

We had a pre-sales consultant who provided an end-to-end overview of the product. He also worked with our data and clearly demonstrated the advantages of Greenplum. After we purchased the product, we were provided with a full-time consultant who had extensive knowledge of the product. He was primarily responsible for providing hands-on experience on projects and also did an excellent job of teaching everyone and bringing everyone up to speed on the new platform. We also had a technical person offshore who was responsible for fixing things if something broke or any other issues came up.

Which solution did I use previously and why did I switch?

We did use other products in the company, but they weren’t MPP architecture databases. Our data was getting bigger, so we needed something with an MPP architecture to tackle big data challenges, and Greenplum was considered. It was a management decision to purchase this product (I am not sure whether other similar products were considered).

What about the implementation team?

I believe all of the setup was done by the Greenplum team.

What was our ROI?

I didn't have any visibility into the pricing and licensing. But I can say that we needed a product like Greenplum to store, manage, and analyze huge volumes of data, which can be a daunting task.

What other advice do I have?

Greenplum is an MPP (massively parallel processing) database which is extremely fast. If people are dealing with very high volumes of data, it is definitely a product to consider seriously.

Disclosure: My company has a business relationship with this vendor other than being a customer: To my knowledge, we were one of the biggest customers in Canada, and they were looking for our feedback to improve the product offerings.
PeerSpot user
Subdirector of Support for Production at a financial services firm with 51-200 employees
Real User
Top 10
A database solution that has good support but needs to provide quick access to the disc
Pros and Cons
  • "With VMware Tanzu Greenplum, one can make a huge database table and analyze the queries by adding in the SQL command. Some hint or command for the query goes over the multi-parallel execution."
  • "VMware Tanzu Greenplum needs improvement in the memory area and improved methods for quick access to the disc. So, one of the quick goals of Greenplum must work on enhancing access to the disc by adding hints in the database."

What is most valuable?

With VMware Tanzu Greenplum, one can make a huge database table and analyze the queries by adding in the SQL command. Some hint or command for the query goes over the multi-parallel execution. Also, queries can be sent quickly to other database systems.

What needs improvement?

VMware Tanzu Greenplum needs improvement in the memory area and improved methods for quick access to the disc. So, one of the quick goals of Greenplum must work on enhancing access to the disc by adding hints in the database.

For how long have I used the solution?

I have been using VMware Tanzu Greenplum since 2017. I have installed the ring through the Dell platform, the EMC platform, and the DX100.

How are customer service and support?

We use VMware Tanzu Greenplum support directly, as well as the community for the open-source Greenplum database. I rate it a seven out of ten.

How would you rate customer service and support?

Neutral

How was the initial setup?

The initial setup was complex because of the architecture we used. Ten people are needed to maintain the product, of which eight are DBAs, and two support the operating system as the administrator.

What other advice do I have?

For the support network of the product, overall installation, use of the database, and improved database and support, I rate the solution an eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free VMware Tanzu Greenplum Report and get advice and tips from experienced pros sharing their opinions.
Updated: March 2024
Product Categories
Data Warehouse