PeerSpot user
Architect at a tech services company with 51-200 employees
Consultant
Vertica allows thousands of users to run analyses at the same time. Great, aggressive compression.

At the tech company where I work, we were looking for new ways to allow end users (a couple of thousand external users) to crunch through their detailed data in real time, as well as to enable internal users and data analysts to gain the information they needed to run and optimize their business processes.

Unfortunately, our current system had become slower and slower over time due to the tremendous increase in data to be managed, so a new approach had to be taken to accomplish this goal. Our existing data warehouse/data management infrastructure simply could not handle big data.

We evaluated a variety of different solutions, such as Amazon Redshift, Infobright, and Microsoft. Vertica won out over all of them. Our dataset is several hundred million rows, and our average response-time goal was under five seconds. We are building our environment for the future, so another requirement was the ability to scale horizontally.

Redshift came close in response time but failed in concurrency, meaning multiple users running analyses at the same time. Infobright came close in response time and concurrency but didn’t provide sufficient scalability. Vertica checked all the boxes at a very competitive price point.

We found that the extreme speed, performance, and flexibility are superior to all the other solutions out there. The massive scalability on industry-standard hardware, the standard SQL interface, and the database designer and administration tools are excellent features of Vertica. I also really value the simplicity, the concurrency for hundreds or thousands of users, and the aggressive compression.

This new environment allowed us to implement applications such as clickstream and predictive analysis, which have added tremendous value for us. Currently I am managing about 500 GB – 1 TB of data, and I have found that Vertica integrates very well with a variety of Business Intelligence (BI), visualization, and ETL tools in our environment. I use Hadoop, Tableau, and Birst, and using all of these solutions with Vertica has been quite smooth overall.
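
To give a sense of the standard SQL interface those tools sit on top of, here is a minimal sketch (not our production code) using the open-source vertica-python client. The host, credentials, and the clickstream_events table are placeholders made up for illustration.

# Minimal sketch: querying Vertica over its standard SQL interface with the
# open-source vertica-python client. Connection details and table names are
# placeholders, not our real environment.
import vertica_python

conn_info = {
    "host": "vertica.example.internal",  # placeholder host
    "port": 5433,
    "user": "etl_user",
    "password": "********",
    "database": "analytics",
}

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    # Daily event counts for the last 30 days from a hypothetical clickstream table.
    cur.execute("""
        SELECT event_date, COUNT(*) AS events
        FROM clickstream_events
        WHERE event_date >= CURRENT_DATE - 30
        GROUP BY event_date
        ORDER BY event_date
    """)
    for event_date, events in cur.fetchall():
        print(event_date, events)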

Our query performance has increased by 500 – 1,000% through improvements in response time, and I am now able to compress our data by more than 50%. The simultaneous loading and querying and the aggressive compression have helped us become more efficient and productive. Furthermore, the high availability without hardware redundancy, together with the optimizer and execution engine, has saved us both time and money.

Disclosure: PeerSpot has made contact with the reviewer to validate that the person is a real user. The information in the posting is based upon a vendor-supplied case study, but the reviewer has confirmed the content's accuracy.
PeerSpot user

It seems you were mainly focused on how good Vertica is and did not run a benchmark; otherwise, it would be nice if you could publish loading and query performance comparisons between all of the above databases. 500 GB – 1 TB is not a lot of data.

it_user99918 - PeerSpot reviewer
Chief Data Scientist at a tech vendor with 10,001+ employees
Vendor
We're using Vertica just because of the performance benefits. On big queries, we're getting sub-10 second latencies.

My company recognized early, near the inception of the product, that if we were able to collect enough operational data about how our products are performing in the field, get it back home and analyze it, we'd be able to dramatically reduce support costs. Also, we can create a feedback loop that allows engineering to improve the product very quickly, according to the demands that are being placed on the product in the field.

Looking at it from that perspective, to get it right, you need to do it from the inception of the product. If you take a look at how much data we get back for every array we sell in the field, we could be receiving anywhere from 10,000 to 100,000 data points per minute from each array. Then, we bring those back home, we put them into a database, and we run a lot of intensive analytics on those data.

Once you're doing that, you realize that as soon as you do something, you have this data you're starting to leverage. You're making support recommendations and so on, but then you realize you could do a lot more with it. We can do dynamic cache sizing. We can figure out how much cache a customer needs based on an analysis of their real workloads.

We found that big data is really paying off for us. We want to continue to increase how much it's paying off for us, but to do that we need to be able to do bigger queries faster. We have a team of data scientists and we don't want them sitting here twiddling their thumbs. That’s what brought us to Vertica.

We have a very tight feedback loop. In one release we put out, we may make some changes in the way certain things happen on the back end, for example, the way NVRAM is drained. There are some very particular details around that, and we can observe very quickly how that performs under different workloads. We can make tweaks and do a lot of tuning.

Without the kind of data we have, we might have to deal with multiple cases being opened on performance in the field, escalations, looking at cores, and then simulating things in the lab.

It's a very labor-intensive, slow process with very little data to base the decision on. When you bring home operational data from all your products in the field, you're now talking about being able to figure out in near real-time the distribution of workloads in the field and how people access their storage. I think we have a better understanding of the way storage works in the real world than any other storage vendor, simply because we have the data.

I don’t remember the exact year, but it was roughly eight years ago that I became aware of Vertica. At some point, there was an announcement that Mike Stonebraker was involved in a group that was going to productize the C-Store database, which was a sort of academic experiment at MIT, to understand the benefits and capabilities of a real column store.

I was immediately interested and contacted them. I was working at another storage company at the time. I had a 20 terabyte (TB) data warehouse, which at the time was one of the largest Oracle on Linux data warehouses in the world.

They didn't want to touch that opportunity just yet, because they were just starting out in alpha mode. I hooked up with them again a few years later, when I was CTO at a different company, where we developed what's substantially an extract, transform, and load (ETL) platform.

By then, they were well along the road. They had a great product and it was solid. So we tried it out, and I have to tell you, I fell in love with Vertica because of the performance benefits that it provided.

When you start thinking about collecting as many different data points as we like to collect, you have to recognize that you’re going to end up with a couple of choices on a row store. Either you're going to have very narrow tables, and a lot of them, or else you're going to waste a lot of I/O overhead retrieving entire rows when you just need a couple of fields.
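
As a rough back-of-the-envelope illustration of that overhead (the numbers are assumptions made up for the example, not measurements from our system):

# Rough estimate of bytes scanned when a query needs only 2 of 50 columns.
# All figures are illustrative assumptions, not measurements.
rows = 500_000_000        # rows in the table
total_columns = 50        # columns in the wide row-store table
avg_col_bytes = 8         # assumed average bytes per column value
needed_columns = 2        # columns the query actually touches

row_store_bytes = rows * total_columns * avg_col_bytes      # whole rows are read
column_store_bytes = rows * needed_columns * avg_col_bytes  # only 2 columns are read

print(f"row store scans    ~{row_store_bytes / 1e9:.0f} GB")
print(f"column store scans ~{column_store_bytes / 1e9:.0f} GB")
print(f"roughly {row_store_bytes / column_store_bytes:.0f}x less I/O, before compression")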

That was what piqued my interest at first. But as I began to use it more and more, I realized that the performance benefits you could gain by using Vertica properly were another order of magnitude beyond what you would expect just with the column-store efficiency.

That's because of certain features that Vertica allows, such as something called pre-join projections. At a high level, they let you maintain the normalized logical integrity of your schema while, under the hood, keeping an optimized, denormalized physical layout on disk for query performance.
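
To make that concrete, here is a hedged sketch of what a pre-join projection definition can look like. The fact and dimension tables are hypothetical, and the exact syntax and availability of pre-join projections depend on the Vertica version, so check the documentation for your release.

# Illustrative pre-join projection DDL, held in a string you could run through
# vsql or a vertica-python cursor. Table and column names are made up; the
# join is expected to follow the declared primary-key/foreign-key constraint.
PREJOIN_PROJECTION_DDL = """
CREATE PROJECTION sales_by_customer_prejoin AS
SELECT s.sale_id,
       s.sale_date,
       s.amount,
       c.region,
       c.segment
FROM   sales s
JOIN   customers c ON s.customer_id = c.customer_id
ORDER BY c.region, s.sale_date;
"""
# cursor.execute(PREJOIN_PROJECTION_DDL)  # with a connection opened as usual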

You can be efficient with a denormalized structure on disk because Vertica allows you to apply some very efficient types of encoding to your data. All of the low-cardinality columns that would have been wasting space in a row store end up taking almost no space at all.
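
For instance (again just a sketch with made-up names), low-cardinality columns can be declared with run-length encoding so that, once the data is sorted on them, they cost almost nothing on disk:

# Illustrative table DDL with explicit column encodings; names are invented.
# With the data sorted on the low-cardinality columns, RLE stores each run of
# repeated values once instead of repeating the value per row.
ENCODED_TABLE_DDL = """
CREATE TABLE array_metrics (
    array_id     INT,
    metric_name  VARCHAR(64) ENCODING RLE,   -- low cardinality: a few hundred distinct names
    status_code  INT         ENCODING RLE,   -- low cardinality: a handful of codes
    collected_at TIMESTAMP,
    value        FLOAT
)
ORDER BY metric_name, status_code, collected_at;
"""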

It's been my impression that Vertica is the data warehouse you would have wanted to build 10 or 20 years ago, but nobody had done it yet.

Nowadays, when I'm evaluating other big data platforms, I always have to look at them from the perspective of: this is great, we can get some parallelism here, and there are certain operations we can do that might be difficult on other platforms. But I always have to compare them to Vertica, and frankly, I always find that Vertica comes out on top in terms of features, performance, and usability.

I built the environment at my current company from the ground up. When I got here, there were roughly 30 people. It's a very small company. We started with Postgres. We started with something free. We didn’t want to have a large budget dedicated to the backing infrastructure just yet. We weren’t ready to monetize it yet.

So, we started on Postgres and we've scaled up now to the point where we have about 100 TBs on Postgres. We get decent performance out of the database for the things that we absolutely need to do, which are micro-batch updates and transactional activity. We get that performance because the database lives here.

I don't know what the largest unsharded Postgres instance in the world is, but I feel like I have one of them. It's a challenge to manage and leverage. Now, we've gotten to the point where we really want to do larger queries; we want to understand the entire installed base and run analyses that extend across it.

We want to understand the lifecycle of a volume. We want to understand how it grows, how it lives, what its performance characteristics are, and then how it gradually falls into senescence when people stop using it. It turns out there is a lot of really rich information that we now have access to for understanding storage lifecycles in a way I don't think was possible before.

But to do that, we need to take our infrastructure to the next level. So we've been doing that: we've loaded a large amount of our sensor data (the numerical data I talked about) into Vertica, started to compare the queries, and then started to use Vertica more and more for all the analysis we're doing.

Internally, we're using Vertica, just because of the performance benefits. I can give you an example. We had a particular query, a particularly large query. It was to look at certain aspects of latency over a month across the entire installed base to understand a little bit about the distribution, depending on different factors, and so on.

We ran that query in Postgres, and depending on how busy the server was, it took anywhere from 12 to 24 hours to run. On Vertica, to run the same query on the same data takes anywhere from three to seven seconds.

I anticipated that because we were aware upfront of the benefits we'd be getting. I've seen it before. We knew how to structure our projections to get that kind of performance. We knew what kind of infrastructure we'd need under it. I'm really excited. We're getting exactly what we wanted and better.

This is only a three-node cluster. Look at the performance we're getting. On the smaller queries, we're getting sub-second latencies. On the big ones, we're getting sub-10-second latencies. It's absolutely amazing. It's game changing.

People can sit at their desktops now, manipulate data, come up with new ideas, and iterate without having to run a batch and go home. It's a dramatic productivity increase. Data scientists tend to be fairly impatient. They're highly paid people, and you don’t want them sitting at their desks waiting to get an answer out of the database. It's not the best use of their time.

When it comes to the cloud model for deployment, there's the ease of adding nodes without downtime and the fact that you can create a K-safe cluster. If my cluster is 16 nodes wide and I want two nodes of redundancy, it's very similar to RAID: you can specify that, and the database will take care of it for you. You don’t have to worry about the database going down and losing data as a result of a node failure or two.
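
A hedged sketch of what that looks like from the SQL side; the connection details are placeholders, and the function and system table shown are the ones I remember from the versions we ran, so verify them against your release:

# Illustrative K-safety commands via the vertica-python client.
# MARK_DESIGN_KSAFE(k) asks Vertica to keep k redundant copies of the data in
# the physical design, so up to k nodes can fail without losing data.
import vertica_python

with vertica_python.connect(host="vertica.example.internal", port=5433,
                            user="dbadmin", password="********",
                            database="telemetry") as conn:
    cur = conn.cursor()
    cur.execute("SELECT MARK_DESIGN_KSAFE(2);")  # tolerate the loss of any 2 nodes
    cur.execute("SELECT designed_fault_tolerance, current_fault_tolerance FROM system;")
    print(cur.fetchall())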

I love the fact that you don’t have to pay extra for that. If I want to put more cores or nodes on it or I want to put more redundancy into my design, I can do that without paying more for it. Wow! That’s kind of revolutionary in itself.

It's great to see a database company incented to give you great performance. They're incented to help you work better with more nodes and more cores. They don't have to worry about people not being able to pay the additional license fees to deploy more resources. In that sense, it's great.

We have our own private cloud -- that’s how I like to think of it -- at an offsite colocation facility. We do DR here. At the same time, we have a K-safe cluster. We had a hardware glitch on one of the nodes last week, and the other two nodes stayed up, served data, and everything was fine.


Those kinds of features are critical, and that ability to be flexible and expand is critical for someone who is trying to build a large cloud infrastructure, because you're never going to know in advance exactly how much you're going to need.

If you do your job right as a cloud provider, people just want more and more and more. You want to get them hooked and you want to get them enjoying the experience. Vertica lets you do that.

Disclosure: PeerSpot has made contact with the reviewer to validate that the person is a real user. The information in the posting is based upon a vendor-supplied case study, but the reviewer has confirmed the content's accuracy.
PeerSpot user
Senior Product Manager (Data Infrastructure) and Security Researcher at a tech company
Vendor
Great Platform

If you are involved with databases, you should know who Dr. Michael Stonebraker is. He is currently an adjunct professor in the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology and is considered one of the world's experts in this field. Why did I begin that way? Because Dr. Stonebraker co-founded Vertica Systems, which says a lot about the innovation behind this amazing product.

But what is Vertica?

Vertica Analytic Database is a high-performance MPP (Massively Parallel Processing) columnar engine optimized to deliver query results in the shortest possible time. I say optimized because this is a keyword inside the Vertica team: every piece of code in Vertica has a lot of research and innovation behind it, which I will discuss later. I heard about this database when I was writing a research paper for my organization about MPP systems, and I found that Vertica was one of the good players in this big data analytics game (the other good players being the Greenplum Database and Teradata’s Aster Data platform). Then HP saw the great opportunity this product represented for the big data business and acquired the company in 2011.

OK, let’s now talk about some of Vertica’s features.

  • Column-based storage:

    Vertica uses a patented architecture called FlexStore™, created based on three principles: the grouping of multiple columns in a single file, the automatic selection of disk storage format based on data load patterns, and the ability to differentiate storage media by their performance characteristics and enable intelligent placement of data based on usage patterns.

  • Advanced Data compression:

    Following the same architectural choice of grouping columns in a single file, data compression works on the same principle: Vertica organizes values of similar data types contiguously in memory and on disk, which lets it select the best compression algorithm depending on the data type. This dramatically improves query execution and parallel load times.

  • Built-in Analytics functions:

    Vertica comes with a complete package of useful functions for analytics, divided by topics like natural language processing, data mining, logistic regression, etc. These are called User-Defined Extensions; you can read more about them in this whitepaper. (A small query sketch follows after this list.)

  • Automatic High Availability:

    Vertica allows you to scale your data almost without limit, with remarkable features like automatic failover and redundancy, fast recovery, and fast query performance, executing queries 50x-1,000x faster by eliminating costly disk I/O.

  • Native integration with Hadoop, BI and ETL tools:

    Seamless integration with a robust and ever-growing ecosystem of analytics solutions.
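
As promised above, here is a small sketch of the kind of analytic query this enables. The monthly_revenue table and its columns are invented for illustration; the windowed SUM and RANK are standard SQL analytics that Vertica supports.

# Illustrative analytic query using window functions; table and columns are
# made up. The string can be run through vsql or any SQL client connected
# to Vertica.
ANALYTIC_QUERY = """
SELECT customer_id,
       sales_month,
       revenue,
       SUM(revenue) OVER (PARTITION BY customer_id ORDER BY sales_month) AS running_revenue,
       RANK() OVER (PARTITION BY sales_month ORDER BY revenue DESC) AS rank_in_month
FROM monthly_revenue;
"""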

You can read in depth about all these features here. The latest version of the platform is Vertica 6, and here you can find some of the new features and improvements in this version, or you can view this video, in which Luis Maldonado, Director of Product Management at Vertica, gives a quick overview of this version.

OK, it’s a great platform, but who is using it today?

There are a lot of companies trusting Vertica today: Twitter, Zynga (the number one company in the social gaming industry), Groupon, JPMorgan Chase, Mozilla, AT&T, Verizon, Diio, Capital IQ, Guess Inc., and many more. Read their testimonials here.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
it_user4518 - PeerSpot reviewer
Head of Databases at a retailer with 501-1,000 employees
Vendor
Great DW value for the money, but needs better workload management

Valuable Features:

• As it can load a lot of data very quickly, it is very helpful for our organization in managing our VLDB (see the loading sketch below).
• It has the lowest cost per TB compared to other DW solutions.
• Provides us clustering on our commodity hardware and a good data compression ratio compared to other DW tools I've looked into.
• It is very scalable. We didn’t face any problems in terms of installation.
• Allows concurrent user access. Our staff can process and send query requests simultaneously, and it responds to all requests very efficiently.
• It is very easy to use! It also provides good OLAP support for our database system. We can also use traditional SQL queries and tables.
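
As referenced above, here is a minimal sketch of the kind of bulk load we mean, using the open-source vertica-python client. The connection details, file, and staging table are placeholders rather than our actual objects, and the DIRECT hint reflects the versions we have used.

# Minimal bulk-load sketch with vertica-python; all names are placeholders.
import vertica_python

conn_info = {"host": "vertica.example.internal", "port": 5433,
             "user": "dbadmin", "password": "********", "database": "dw"}

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    with open("daily_sales.csv", "rb") as f:
        # COPY streams the file into the cluster; DIRECT (on the versions we
        # used) writes straight to disk storage, which suits large batches.
        cur.copy("COPY staging.daily_sales FROM STDIN DELIMITER ',' DIRECT", f)
    conn.commit()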

Room for Improvement:

• We can’t use Vertica as a transactional database because it is unable to handle a lot of transactions.
• While copying tables straight across from MySQL, we observe poor performance; queries take more time to execute.
• Our team is challenged during peak loads because of a lack of workload management options.
• It doesn’t offer us any configuration options for reducing the size of our database system and limiting our resource usage.

Other Advice:

We are using Vertica because it is a fast and reliable data warehousing tool and provides us a very good data compression ratio. It is cheap in terms of overall cost per TB and also requires fewer human resources for maintenance and tuning. I have been told that Vertica has improved its cloud deployment capabilities on any infrastructure.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
it_user267693 - PeerSpot reviewer
it_user267693 - Manager Data and Analytics at a tech consulting company
Consultant

It's a comprehensive review.

ERICK RAMIREZ - PeerSpot reviewer
Team Lead Solutions Architect at IMEXPERTS DO BRASIL
Real User
Top 10
Data analytics solution where data can be encoded and compressed, and which does not require additional infrastructure
Pros and Cons
  • "Vertica is a great product because customers can compress and code data. The infrastructure that data warehouse solutions need is a commodity server so that customers don't have to invest in infrastructure."
  • "In a future release, we would like to have artificial intelligence capabilities like neural networks. Customers are demanding this type of analytics."

What is our primary use case?

This solution is used as part of our data warehouse implementation. We have some customer indexing content in Vertica, which is the relational database we use within that implementation.

What is most valuable?

Vertica is a great product because customers can compress and encode data. The infrastructure that data warehouse solutions need is a commodity server, so customers don't have to invest in infrastructure. Vertica is a column-oriented database, so query responses are very fast.

What needs improvement?

In a future release, we would like to have artificial intelligence capabilities like neural networks. Customers are demanding this type of analytics.

For how long have I used the solution?

We have been using this solution for six years. 

What do I think about the stability of the solution?

This is a stable solution.

What do I think about the scalability of the solution?

This is a scalable solution. 

How are customer service and support?

The customer service for this solution is good. 

How would you rate customer service and support?

Neutral

How was the initial setup?

The initial setup is straightforward. 

What other advice do I have?

It is important for those considering this solution to have some experience with database management solutions because Vertica is a relational database. Knowledge of SQL and other database-related skills is very important in order to implement or use this product correctly.

I would rate this solution a nine out of ten. 

Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
MunkhsaikhanBayar - PeerSpot reviewer
Project Lead - Digital Transformation Unit at Bodi Electronics LLC
Real User
Scalable big data analytics platform that is reasonably priced compared to other solutions
Pros and Cons
  • "The hardware usage and speed has been the most valuable feature of this solution. It is very fast and has saved us a lot of money."
  • "The integration of this solution with ODI could be improved."

What is our primary use case?

Our primary use case for this solution is data analytics.

What is most valuable?

The hardware usage and speed have been the most valuable features of this solution. It is very fast and has saved us a lot of money.

What needs improvement?

The integration of this solution with ODI could be improved. 

For how long have I used the solution?

I have used this solution for 1 year.

What do I think about the stability of the solution?

This is a stable solution with a fast compression system.

What do I think about the scalability of the solution?

This is a scalable solution. 

How are customer service and support?

The technical support for this solution is really good. The support team is friendly and provides assistance quickly.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup is straightforward. 

What's my experience with pricing, setup cost, and licensing?

The pricing for this solution is very reasonable compared to other vendors.

What other advice do I have?

I would rate this solution a ten out of ten. 

Which deployment model are you using for this solution?

On-premises
Disclosure: My company has a business relationship with this vendor other than being a customer: partner