Amazon Redshift Overview

Amazon Redshift is the #5 ranked solution among top Cloud Data Warehouse tools. PeerSpot users give Amazon Redshift an average rating of 7.8 out of 10. Amazon Redshift is most commonly compared to Snowflake. Amazon Redshift is popular among the large enterprise segment, which accounts for 72% of users researching this solution on PeerSpot. The top industry researching this solution is computer software, whose professionals account for 26% of all views.
Amazon Redshift Buyer's Guide

Download the Amazon Redshift Buyer's Guide including reviews and more. Updated: August 2022

What is Amazon Redshift?

Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of most other data warehousing solutions.

Traditional data warehouses require significant time and resources to administer, especially for large datasets. In addition, the financial cost associated with building, maintaining, and growing self-managed, on-premises data warehouses is very high. Amazon Redshift not only significantly lowers the cost of a data warehouse, but also makes it easy to analyze large amounts of data very quickly.

Amazon Redshift gives you fast querying capabilities over structured data using familiar SQL-based clients and business intelligence (BI) tools using standard ODBC and JDBC connections. Queries are distributed and parallelized across multiple physical resources. You can easily scale an Amazon Redshift data warehouse up or down with a few clicks in the AWS Management Console or with a single API call. Amazon Redshift automatically patches and backs up your data warehouse, storing the backups for a user-defined retention period. Amazon Redshift uses replication and continuous backups to enhance availability and improve data durability and can automatically recover from component and node failures. In addition, Amazon Redshift supports Amazon Virtual Private Cloud (Amazon VPC), SSL, AES-256 encryption and Hardware Security Modules (HSMs) to protect your data in transit and at rest.

As with all Amazon Web Services, there are no up-front investments required, and you pay only for the resources you use. Amazon Redshift lets you pay as you go. You can even try Amazon Redshift for free.

Amazon Redshift Customers

Liberty Mutual Insurance, 4Cite Marketing, BrandVerity, DNA Plc, Sirocco Systems, Gainsight, Blue 449

Archived Amazon Redshift Reviews (more than two years old)

Thomas Dallemagne - PeerSpot reviewer
Cloud & Data - practice leader at Micropole Belgium
Real User
Top 20
Quick to deploy, easy to use, and performs well, but ingesting data in real time should be improved
Pros and Cons
  • "I like the cost-benefit ratio, meaning that it is as easy to use as it is powerful and well-performing."
  • "There are too many limitations with respect to concurrency."

What is our primary use case?

We are a service provider and we currently have five clients with active IT implementations that use Amazon Redshift. We also use it ourselves.

My clients primarily use this product for data analytics. They are mostly working with big data and using the machine learning functionality.

What is most valuable?

I like the cost-benefit ratio, meaning that it is as easy to use as it is powerful and well-performing. There are only three parameters that you need to understand, which are the distribution key, the sort key, and the compression method or encoding method. Once you understand these, you can tune the performance.
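
The three parameters named above correspond to clauses in Redshift's CREATE TABLE statement. The following is a minimal sketch, assuming a hypothetical `sales` table; the table name, column names, and the chosen encodings are illustrative and do not come from the review.

```python
def build_create_table(table, columns, dist_key, sort_keys):
    """Render a Redshift CREATE TABLE statement as a string.

    columns is a list of (name, sql_type, encoding) tuples.
    All names passed in here are hypothetical examples.
    """
    column_lines = [
        f"    {name} {sql_type} ENCODE {encoding}"
        for name, sql_type, encoding in columns
    ]
    return (
        f"CREATE TABLE {table} (\n"
        + ",\n".join(column_lines)
        + "\n)\n"
        + "DISTSTYLE KEY\n"
        + f"DISTKEY ({dist_key})\n"           # how rows are spread across nodes
        + f"SORTKEY ({', '.join(sort_keys)});"  # on-disk ordering within a node
    )


ddl = build_create_table(
    "sales",
    [
        ("customer_id", "BIGINT", "az64"),   # ENCODE sets the compression method
        ("sold_at", "TIMESTAMP", "az64"),
        ("note", "VARCHAR(256)", "zstd"),
    ],
    dist_key="customer_id",
    sort_keys=["sold_at"],
)
print(ddl)
```

Which column suits each key depends on the join and filter patterns of the workload, so the choices here are placeholders rather than recommendations.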

What needs improvement?

I would like a better way to ingest data in real time because there is a bit too much latency.

There are too many limitations with respect to concurrency. It is now possible to auto-scale it, although that is still slow.

It could offer smaller nodes that decouple storage from processing because, for the moment, the only nodes that work that way are huge and aimed at large companies.

For how long have I used the solution?

My first implementation of Redshift was three and a half years ago, in 2017.

Buyer's Guide
Amazon Redshift
August 2022
Learn what your peers think about Amazon Redshift. Get advice and tips from experienced pros sharing their opinions. Updated: August 2022.
620,319 professionals have used our research since 2012.

What do I think about the stability of the solution?

We have not had many issues with stability.

What do I think about the scalability of the solution?

Scalability can be a problem if you don't write your database queries correctly. For example, if you write a cartesian product in Redshift then you may end up consuming all of the resources. However, it does have features like workload management to prevent this from happening.

Our clients are mid-sized to very large companies.

How are customer service and support?

I have been in touch with Amazon technical support and they are very good. They are efficient and resolve problems quickly. They know what they're doing and they're very professional.

Which solution did I use previously and why did I switch?

I have also used Snowflake and its methods for ingesting real-time data are faster. It also offers a bit more functionality and a bit more flexibility. It's a bit easier to maintain and faster to scale, but more expensive as well.

To me, the big drawback with Snowflake is that the data is not stored in your own AWS account or Azure subscription. They store the data in their own account that they manage for you, which might be a problem for some companies in terms of compliance and legal requirements.

Azure Synapse and Google BigQuery are also competing solutions.

How was the initial setup?

The deployment is very straightforward and it usually takes a couple of minutes. This is one of the reasons I like it.

As long as a person understands the AWS landscape, they can deploy it on their own. Otherwise, without realizing it, they might, for example, deploy a Redshift cluster that is not properly secured. Similarly, it could cost a lot of money if they don't know what they're doing. You don't need very in-depth technical expertise, but you do need to understand how AWS works.

What about the implementation team?

I have a team that provides maintenance for our customers. It is spread between France and Belgium and I have 25 people who report to me, with another 20 who I work with indirectly.

What's my experience with pricing, setup cost, and licensing?

The cost of Redshift ranges from a few hundred dollars a month to thousands of dollars a month, according to the resources that you're going to use, the number of nodes, and the type of nodes.

My customers have implementations that cost about $500 a month for a very small one. I also have a customer with a monthly invoice of about $25,000 USD.
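
As a rough illustration of how node count and node price drive the monthly bill, here is a small sketch. It assumes simple hourly on-demand billing and an average of 730 hours per month; the $0.25 hourly rate is the entry price quoted in the product description on this page, and the node counts are hypothetical. Actual prices vary by node type and region.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month (8,760 / 12)


def monthly_cost(node_count, hourly_rate_usd):
    """Estimate a monthly on-demand bill: nodes x hourly rate x hours."""
    return node_count * hourly_rate_usd * HOURS_PER_MONTH


single_node = monthly_cost(1, 0.25)
three_nodes = monthly_cost(3, 0.25)
print(single_node)  # 182.5
print(three_nodes)  # 547.5
```

Under these assumptions a three-node cluster at the entry rate lands in the same range as the small implementations described above, though the review does not say how those clusters were actually sized.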

What other advice do I have?

With the most recent update, we should now be able to decouple storage from processing.

My advice for anybody who is implementing Redshift is to make sure that they are using it for what it is made to do. It's an analytical database, so it's not meant to process transactional data. It's the perfect tool if you use it for the right purpose.

Overall, it is a very stable and robust product. That said, there is still plenty of potential for improvement.

I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Sarfraz Nawaz - PeerSpot reviewer
Chief Executive Officer at Ampcome
Real User
Scales according to our needs, which saves a lot in terms of upfront costs
Pros and Cons
  • "The most valuable feature is the scalability, as it grows according to our needs."
  • "The OLAP slice and dice features need to be improved."

What is our primary use case?

We are a digital transformation services company, and we are using Amazon Redshift for one of our clients. They are a logistics company that has transportation and other needs.

Their first requirement is for financial reporting, where we pull financial data from their many ERP systems and can provide a corporate-level view.

There is also an operations standpoint, where they are looking for operational insights. For this, we again pull different information from their ERPs, bring it into Redshift, and then model it in such a way that they will be able to see a consolidated view in terms of operational success across lines of business.

How has it helped my organization?

I've been working with data warehouses for a long time and it has always been the case that we had to invest quite a bit on infrastructure, upfront. We are used to dealing with Teradata, and the cost of setting up the data center and getting the appropriate licenses was a big deal. Now, we are able to spin up some clusters and then start using it, allowing us to incrementally pay as we expand.

This has become a big shift in how we spend because there is no capital cost upfront. Moreover, this works with startups as well as with enterprise, and they provide an equal footing. This means that even the advanced capabilities and insights that are available with a data warehouse are no longer limited to the larger clients. Even a startup can use these features, immediately.

What is most valuable?

The most valuable feature is the scalability, as it grows according to our needs.

The part that I like best is that you only pay for what you are using.

What needs improvement?

The OLAP slice and dice features need to be improved. For example, if a business wants to bring in a general ledger from an ERP, they want to slice and dice the data. What we have found is that they have a lot of formulas that are used to calculate metrics, so what we do is use SQL Server Analysis Services. The question then becomes one of adopting a single vendor and transitioning to Azure. If Redshift had similar capabilities then it would be very good.

For how long have I used the solution?

We have been using Amazon Redshift for about five years.

What do I think about the stability of the solution?

The stability is awesome. We have been using it for quite a while and haven't faced any issues.

What do I think about the scalability of the solution?

The scalability is very good. You can start at a very low scale and just keep expanding as required. It is the type of product that fits organizations of all sizes.

How are customer service and technical support?

We have contacted support on several occasions. With our most recent customer, they are pretty large and we were directly in touch with the regional account manager, who is the head of database analytics for India. This person was directly involved in our calls and helped with the evaluation, so the support has been pretty good.

Which solution did I use previously and why did I switch?

We have worked with Teradata and more recently, have been working with Azure SQL Warehouse. Teradata is an on-premises solution and the upfront costs are high. Comparing Azure SQL Warehouse and Amazon Redshift, in terms of features I think that they are pretty much on par.

The SQL Data Warehouse does have better OLAP capabilities, and they also offer a level of serverless capability where they have split the compute and the storage. This means that they can operate at a lower cost in the development environment.

Many of our clients have begun to adopt Power BI, and once they start using it, they tend to lean towards Azure and the Azure SQL Data Warehouse. The fact that Power BI is free makes quite an impact.

How was the initial setup?

The initial setup is straightforward, once you get used to it. There is a lot of documentation available.

What about the implementation team?

We handle the implementation and deployment of Redshift for our clients.

What other advice do I have?

I am interested in seeing a split between compute and storage, which is something that they are currently working on. We plan to start leveraging it at some point in the future.

In summary, I think that Amazon Redshift is a very good data warehouse and we really like it a lot.

I would rate this solution an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Service Manager & Solution Architect at a logistics company with 10,001+ employees
Real User
Easy to use and simple to set up, but the performance is low, and there is no tool to support CDC
Pros and Cons
  • "It is quite simple to use and there are no issues with creating the tables."
  • "It takes a lot of time to ingest and update the data."

What is our primary use case?

We stored all of the data in the S3 bucket and would like to have it stored in a data warehouse, which is why we chose this database. 

It would be very easy for us as an end-user, who would like to access the data, rather than draw it post-transformation and store it at a database level.

What is most valuable?

The TP transactions for the creation of the tables does very well.

It is quite simple to use and there are no issues with creating the tables.

What needs improvement?

Performance when managing updates, deletes, and row-level changes is very low. For example, while you are doing inserts, updates, deletes, and amalgamates, the performance is very, very poor.

If you want to query the database after you have a lot of terabytes of data, the load, performance-wise, is very low.

Looking at the performance of the query, querying the database, and especially with the amalgamates when it is getting updated, it is really poor.

We like this solution and have tried all of the native services; they were working quite well. The only concern about Redshift was managing the cluster, especially the EMR cluster. Our company policy was not to use EMR clusters, especially with the nodes failing. There were many instances of downtime happening. Essentially, there was too much data traffic.

The other drawback was the CDC, as we do not have any tools that can support it.

Creating the structure is easy on the DDL side, but after you create the table and you want to transform the data to store it in a database, the performance is poor.

It takes a lot of time to ingest and update the data. After you ingest the data and someone wants to fetch it in the table, it takes a lot of time performance-wise to return the results.

For how long have I used the solution?

We have been using this solution for three months.

We are using the latest version.

What do I think about the stability of the solution?

There are issues with stability and it should be compared with Snowflake.

What do I think about the scalability of the solution?

This solution is scalable. We scale up and scale down manually when we are required to, we do not have an automatic setup.

We have three or four people using this solution.

How are customer service and technical support?

We have contacted technical support to give our opinion and recommendations or feedback and they agreed that it needs improvement.

Which solution did I use previously and why did I switch?

Previously, we tried the Snowflake database, which works really well. The expectations were really good with the performance, also the DDL, DML operations on the processing of the data.

How was the initial setup?

The initial setup is simple and we did not find it very complex at all.

The time it takes to deploy depends on how many tables you want to create, or how many tables you will merge the data with.

Which other solutions did I evaluate?

We are switching to Azure, although not because we disliked the product or the services. It's about AWS being a competitor of the logistics companies that we are working with. Also, for security reasons, we do not know how secure the data is on the cloud.

If you are competitors then you don't know if the data can be accessed by your competitor, and the team can be looking at a demographic, which could impact your sales.

What other advice do I have?

We have only just started using Redshift, but we are not really satisfied with it.

I would rate this solution a six out of ten.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
AndrewAlangaram - PeerSpot reviewer
Business Analyst at a tech services company with 201-500 employees
Real User
Good performance and integrates well with our Tableau solution
Pros and Cons
  • "The processing of data is very fast."
  • "It would be useful to have an option where all of the data can be queried at once and then have the result shown."

What is our primary use case?

We use Amazon S3 along with RedShift for storing our data. The data comes from various sources, including our client and third-parties. We get the data as an S3 file and then load it into RedShift using the ETL tools. RedShift will then act as the data source for Tableau, which is used for forecasting and other marketing activities.
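
The review does not name the ETL tools involved. One way to load S3 files into Redshift is the database's own COPY command, and the sketch below only builds such a statement as text. The table name, bucket path, and IAM role ARN are hypothetical placeholders, and the CSV-with-header layout is an assumption.

```python
def build_copy_statement(table, s3_uri, iam_role_arn):
    """Render a Redshift COPY statement that bulk-loads CSV files from S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"  # assumption: each file starts with a header row
    )


copy_sql = build_copy_statement(
    "marketing_events",                                  # hypothetical table
    "s3://example-bucket/incoming/",                     # hypothetical prefix
    "arn:aws:iam::123456789012:role/ExampleLoadRole",    # hypothetical role
)
print(copy_sql)
```

The resulting statement would be run through any SQL client connected to the cluster; Tableau would then read from the loaded table.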

What is most valuable?

The processing of data is very fast.

What needs improvement?

It would be useful to have an option where all of the data can be queried at once and then have the result shown. As it is now, when we run a query and we are looking at the results, part of the data remains to be processed at the back end. That works very well, but in some cases, we require the whole data to be queried at once and then have the results shown. We have not faced many use cases where it would have been useful, but in one or two, we used other methods to achieve this goal.

When our clients contact customer support, they don't want to speak with a machine. Instead, they want to chat with a real person who can provide a solution. Customer service bots can provide solutions but they cannot understand our problems.

For how long have I used the solution?

I have been working with Amazon RedShift for about two and a half years.

What do I think about the stability of the solution?

RedShift is a stable solution.

What do I think about the scalability of the solution?

Given that RedShift is a cloud-based solution, scalability is very good. I have not faced any issues regarding that. We have about 20 users, who are all developers.

Beyond the development stage and in terms of the users who make use of the data for Tableau, there are many people all around the world. I do not know the number, but it could be 1,000 or it could be 10,000. When it leaves us and goes into production, it is the client who takes care of it. One of the clients I am working with now has more than 10,000 personnel.

How are customer service and technical support?

I do not have enough direct contact with the solution to see issues that would require contacting technical support. I work with the data but not so much on the technical side. It is the developers who would see these issues and would do so. The level of support is based on our subscription plan.

Which solution did I use previously and why did I switch?

The company was using other solutions such as Google Cloud and the Microsoft Cloud Service, but I have not personally used these solutions.

What other advice do I have?

I am very happy with what RedShift has. So far, anything that I have required has been there and whatever use case I have faced, the functionality is available.

My advice for anybody who is implementing this solution is to look into what certifications are available and which ones are required for different roles. Depending on the job, different certifications are relevant or required. For example, as a business analyst, a coding certification would not be useful for me and it would be a waste of money. These things should all be considered before beginning with any certifications.

I would rate this solution a nine out of ten.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Reseller
Shyam Shankar - PeerSpot reviewer
Senior System Engineer at a computer software company with 10,001+ employees
Real User
Has good data updates and latency between data-refreshes
Pros and Cons
  • "In terms of valuable features, I like the columnar storage that Redshift provides. The storage is one of the key features that we're looking for. Also, the data updates and the latency between the data-refreshes."
  • "Pricing is one of the things that it could improve. It should be more competitive."

What is our primary use case?

My primary use for Amazon Redshift is for analytical purposes.

What is most valuable?

In terms of valuable features, I like the columnar storage that Redshift provides. The storage is one of the key features that we're looking for. Also, the data updates and the latency between the data-refreshes are valuable.

What needs improvement?

Pricing is one of the concerns that I have because if you compare Snowflake with Redshift, it provides some of the same services, but at a much cheaper rate. So pricing is one of the things that it could improve. It should be more competitive.

Otherwise, everything else looks good, especially the data storage and analytical processes.

For how long have I used the solution?

I just started using Amazon Redshift. I'm still learning about the product and about how the pricing and the dynamic scaling are done.

What do I think about the stability of the solution?

I understand that Amazon Redshift is quite stable and it has features like high availability. So I think it should be a good product to use.

How are customer service and technical support?

I have not been in touch with support but I will be needing their support in the near future to help me set up the environment.

How was the initial setup?

In terms of the initial setup, as I said I've still not started using it, so I pray that it's going to be pretty easy for me to set it up.

What's my experience with pricing, setup cost, and licensing?

Of course I would advise others to choose Amazon Redshift, as long as pricing is not a concern for them.

What other advice do I have?

On a scale of 1 - 10, where 1 is the worst and 10 is the best, I'd give it an 8.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Head of Data Warehouse & Business Intelligence at a comms service provider with 501-1,000 employees
Real User
The product is simple to use but it should be more flexible
Pros and Cons
  • "The product is relatively easy to use because there is no indexing and no partitions."
  • "The product could be improved by making it more flexible."

What is our primary use case?

I'm the head of Data Warehouse and Business Intelligence, and our company is an Amazon customer. Most of the company's data sources were on Amazon at the time the product was deployed so it was logical to use this database. The data warehouse is quite small, compressed it's maybe 160 GB. It's not like an autonomous data warehouse or Exadata, which was almost 40 terabytes. It's a simple method of achieving extraction and loading. There is no real incremental load on this. Of course in the future, with the company growing, this should be changed and we'll probably need some kind of incremental system instead of this approach.

What is most valuable?

The product is relatively easy to use because there is no indexing and no partitions. Referential integrity is not enforced, only declared, which is okay.

What needs improvement?

From my perspective, the product could be improved by making it more flexible. There are now more flexible products on the market that allow for expandability and dynamic expansion as the market changes with regard to data warehouses. Although the product is simple to use there can be problems. If you declare some unique key in a column and then store it, the database is going to believe this is what you have and results will be distorted. It's fine if the query is simple but if it's complex or you have too many queries per hour, it can create a bottleneck for Redshift and then you can't return and recover. It requires some fine-tuning. 
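
Because Redshift treats unique and primary key constraints as informational and does not enforce them, it can help to check a declared key for duplicates before relying on it. The sketch below uses Python's built-in SQLite as a stand-in so that it runs anywhere; the `customers` table and its rows are hypothetical, and the table is created without a constraint to mimic non-enforcement. The duplicate-finding query itself is plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No UNIQUE constraint here, mimicking a key that is declared but not enforced.
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben"), (2, "Ben (duplicate)"), (3, "Cy")],
)

# Any row returned is a key value that appears more than once.
duplicates = conn.execute(
    """
    SELECT customer_id, COUNT(*) AS occurrences
    FROM customers
    GROUP BY customer_id
    HAVING COUNT(*) > 1
    ORDER BY customer_id
    """
).fetchall()
print(duplicates)  # [(2, 2)]
```

An empty result means the column really is unique, so queries that assume uniqueness should not be distorted by it.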

For additional features, I would like to see support for partitions, which doesn't exist yet as a feature. It's quite an important issue when you're dealing with large databases. Also, I believe the product needs improvement in parallel threading to support more database users without jeopardizing performance.

For how long have I used the solution?

I've been using Redshift for about a year and a half. 

What do I think about the stability of the solution?

We sometimes have issues with stability, especially over the weekends. It performs, of course, but sometimes there are problems. 

What's my experience with pricing, setup cost, and licensing?

In terms of pricing, if you plan to use the product for a small company and you compare the cost to a product like Oracle Autonomous Data Warehouse, then you need to consider potential company growth and whether you may need to expand the system in the future, require partitions, etc. It could be that in the end, Autonomous Data Warehouse is cheaper than Amazon because if you use that product it can fit your processes and run dynamically. This is not possible on Redshift. If you have two different types of servers on Redshift then at some point you'll reach a limit and will need to move to another type of cluster which is not such an easy operation. On the other hand, with Autonomous Data Warehouse, you can add extra terabytes, but that's it. Nothing else. 

What other advice do I have?

I would rate it a five out of ten. 

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Senior Software Engineer at a tech services company with 51-200 employees
Real User
Cost effective and stable, but the performance for larger applications needs to be improved
Pros and Cons
  • "The initial setup of this solution is straightforward."
  • "Running parallel queries results in poor performance and this needs to be improved."

What is our primary use case?

The primary use case for this solution is related to statistical data.

What is most valuable?

The best part about this solution is the cost.

The performance of this solution is good if you have only a few use cases and a few queries, but for larger applications, it is not so good. 

What needs improvement?

Running parallel queries results in poor performance and this needs to be improved.

For how long have I used the solution?

I have been using this solution for one year, and it has been used within the company for several years.

What do I think about the stability of the solution?

This solution is stable, although we have had some issues with maintenance.

What do I think about the scalability of the solution?

The scalability of this solution needs to be improved.

How are customer service and technical support?

We have dealt with technical support and it is ok.

How was the initial setup?

The initial setup of this solution is straightforward.

What other advice do I have?

The suitability of this solution depends on the environment and the requirements. For some, Amazon Redshift is perfect. However, some people will need better queries. Based on my research, many of the products are pretty close. It is possible that the much higher priced solutions have more differences.

This is a good solution but the queries need to perform better.

I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Head of Analytics at a tech services company with 10,001+ employees
Real User
A solution with a straightforward setup, improved stability and good flexibility

What is our primary use case?

We primarily use the solution for analytics.

What is most valuable?

The solution's flexibility is its most valuable feature. It's also easy to scale and has relatively painless pricing. 

What needs improvement?

The speed of the solution and its portability need improvement.

For how long have I used the solution?

I've been using the solution since 2006.

What do I think about the stability of the solution?

The solution is getting more stable with each new version. It's stable now, but it can always continue to improve.

What do I think about the scalability of the solution?

The solution can scale easily.

How are customer service and technical support?

I've never been in touch with technical support.

How was the initial setup?

The initial setup was straightforward.

What other advice do I have?

We use public and hybrid cloud deployment models. We are Amazon partners.

The solution itself is very popular. Many people use it these days.

I recently went to an AWS summit in Zurich. I was very impressed by AWS and the presentation. Their solutions are very good.

I'd rate this solution eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Mediha Šiljić - PeerSpot reviewer
Chief Information Officer at Sensilab
Real User
Easy to set up and easy to connect the many tools that connect to it

What is our primary use case?

We are using the private cloud model of this solution. Our primary use case is for a data warehouse for BI.

What is most valuable?

The most valuable features are that it's easy to set up and easy to connect the many tools that connect to it.

What needs improvement?

Compatibility with other products, for example, Microsoft and Google, is a bit difficult because each one of them wants to be isolated with their solutions. That's a big problem now.

For how long have I used the solution?

I have been using Redshift for around eight to nine months.

What do I think about the stability of the solution?

It is stable.

What do I think about the scalability of the solution?

Scalability is okay. It's easily scalable. We don't have any plans to increase usage at the moment. We currently have two users directly using this solution. Indirectly we have around 50 users.

We require two staff members for maintenance and others are just consuming data from it.

How are customer service and technical support?

There hasn't been a need to contact technical support at this point. We haven't had any technical issues. 

How was the initial setup?

The initial setup was straightforward. The deployment took a few hours. 

What about the implementation team?

We integrated it ourselves. 

What was our ROI?

We have seen ROI. It's been useful.

What's my experience with pricing, setup cost, and licensing?

It's around 200 US dollars. There are some data transfer costs, but they're minimal, around $20.

What other advice do I have?

I would rate it a ten out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Nir Wasserman - PeerSpot reviewer
BI Manager at jfrog
Real User
Allows you to write complex queries and perform row-by-row processes

What is our primary use case?

We use it to build a data warehouse and a centralized location for all of our data sources, allowing for in-depth analysis by using SQL queries.

How has it helped my organization?

  • Allows for the storage of huge amounts of data. 
  • Helps users perform ad hoc analysis on a lot of sources together.

What is most valuable?

  • Window functions, such as LEAD and LAG.
  • Allows you to write complex queries and perform row-by-row processes.
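
To illustrate what LEAD and LAG do, here is a minimal sketch. It runs against an in-memory SQLite database (version 3.25 or later) purely as a local stand-in, since the window-function syntax shown is standard SQL; the `daily_sales` table and its columns are hypothetical and not taken from the review.

```python
import sqlite3

# Hypothetical daily totals; table and column names are made up for illustration.
rows = [("2022-01-01", 100), ("2022-01-02", 130), ("2022-01-03", 90)]

# LAG looks one row back and LEAD one row ahead within the ordered window.
QUERY = """
SELECT day,
       total,
       LAG(total)  OVER (ORDER BY day) AS previous_total,
       LEAD(total) OVER (ORDER BY day) AS next_total
FROM daily_sales
ORDER BY day
"""


def run_window_query(data):
    """Load the rows into an in-memory table and run the LAG/LEAD query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE daily_sales (day TEXT, total INTEGER)")
        conn.executemany("INSERT INTO daily_sales VALUES (?, ?)", data)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = run_window_query(rows)
for row in result:
    print(row)
```

The first row has no predecessor and the last row has no successor, so those positions come back as NULL (`None` in Python).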

What needs improvement?

In the next release, a pivot function would be a big help. It could save a lot of time creating a query or process to handle operations.
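
Without a built-in pivot, one common workaround is conditional aggregation: one `SUM(CASE ...)` expression per output column. The sketch below generates such a query and runs it on SQLite as a local stand-in; the `sales` table, its columns, and the helper name `pivot_query` are all hypothetical, and the helper interpolates identifiers directly, so it is only suitable for trusted names.

```python
import sqlite3


def pivot_query(table, row_key, pivot_col, value_col, pivot_values):
    """Build a SELECT that turns each value of pivot_col into its own column."""
    cases = ",\n       ".join(
        f"SUM(CASE WHEN {pivot_col} = '{v}' THEN {value_col} ELSE 0 END) AS {v}"
        for v in pivot_values
    )
    return (
        f"SELECT {row_key},\n       {cases}\n"
        f"FROM {table}\nGROUP BY {row_key}\nORDER BY {row_key}"
    )


# Hypothetical sample data: (region, quarter, amount).
data = [("east", "q1", 10), ("east", "q2", 20), ("west", "q1", 5), ("east", "q1", 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", data)

sql = pivot_query("sales", "region", "quarter", "amount", ["q1", "q2"])
pivot_rows = conn.execute(sql).fetchall()
conn.close()

print(sql)
print(pivot_rows)
```

The manual effort the reviewer describes is visible here: the list of pivot values has to be known in advance and spelled out column by column.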

For how long have I used the solution?

Three to five years.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user869871 - PeerSpot reviewer
Principal Consultant at Inawisdom
User
Easy to load and reload data. Retaining data long-term should be cheaper.

What is our primary use case?

  • Storing and querying data in near real time.
  • Loading millions of raw CSV records.
  • Running data comparisons and queries, then shutting it all down, all within a few hours, once a week.

How has it helped my organization?

Easy to load and reload data.

What is most valuable?

  • Fast load times 
  • Flexibility in column definitions
  • The ability to reload data multiple times at different times.

What needs improvement?

  • It would be nice if it was a bit cheaper to retain data long-term. 
  • Should be made available across zones, like other Multi-AZ solutions.

For how long have I used the solution?

Less than one year.

What's my experience with pricing, setup cost, and licensing?

Per hour pricing is helpful to keep the costs of a pilot down, but long-term retention is expensive.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user705738 - PeerSpot reviewer
Senior Solutions Engineer, West at a tech vendor with 5,001-10,000 employees
Vendor
It helped my customers migrate off on-premise platforms
Pros and Cons
  • "Redshift COPY command, because much of my work involved helping customers migrate large amounts of data into Redshift."
  • "Migrating data from other data sources can be challenging when you are working with multibyte character sets."

What is most valuable?

Redshift COPY command, because much of my work involved helping customers migrate large amounts of data into Redshift.

How has it helped my organization?

It helped my customers migrate off on-premise platforms such as Teradata to Redshift, at a fraction of the cost.

What needs improvement?

There are challenges with dealing with character set mismatches. Migrating data from other data sources can be challenging when you are working with multibyte character sets.
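
One frequent source of this kind of mismatch is that Redshift measures VARCHAR lengths in bytes rather than characters, so a column sized for a single-byte source can be too narrow for the same text in UTF-8. The sketch below is one way to check values before loading; the helper names and sample strings are hypothetical.

```python
def utf8_byte_length(value: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(value.encode("utf-8"))


def required_varchar_size(values):
    """Smallest byte length that fits every value (0 for an empty input)."""
    return max((utf8_byte_length(v) for v in values), default=0)


def oversized(values, declared_bytes):
    """Values that would not fit a VARCHAR(declared_bytes) column."""
    return [v for v in values if utf8_byte_length(v) > declared_bytes]


samples = ["abc", "café", "東京"]
for s in samples:
    print(s, len(s), "characters,", utf8_byte_length(s), "bytes")

print(required_varchar_size(samples))
print(oversized(samples, 4))
```

Here "café" is four characters but five bytes, and "東京" is two characters but six bytes, so both would overflow a four-byte column even though neither exceeds four characters.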

For how long have I used the solution?

Two years.

What do I think about the stability of the solution?

No, I did not encounter any stability issues.

What do I think about the scalability of the solution?

I personally haven’t hit scalability issues, but at a dinner a year ago with a few of my existing customers (all Fortune 500 companies), I was told there are scalability issues once you get to 32 nodes.

One of my previous customers told me they were migrating off Redshift because they hit the ceiling and had scalability issues. They told me the responsiveness they were getting was inferior to alternative solutions once your Redshift gets to a specific size.

How are customer service and technical support?

I never utilized AWS technical support.

Which solution did I use previously and why did I switch?

I’ve helped customers migrate off Teradata, SQL Server, Oracle Exadata, Greenplum, and ParAccel Matrix to Redshift. Some migrated for cost savings, others because the product had reached end of life.

How was the initial setup?

Setup of Redshift infrastructure is pretty straightforward. I’ve been told that setting up partitions can be tricky in order to ensure good performance.

What's my experience with pricing, setup cost, and licensing?

I have nothing to add here as I wasn’t involved in this part of the process. However, one of my customers went with Google BigQuery over Redshift because it was significantly cheaper for their project.

Which other solutions did I evaluate?

I only provided advice to my customers, but some looked at Azure SQL DW, Greenplum, Netezza, and Google BigQuery as possible alternatives.

What other advice do I have?

Be careful with vendor lock-in! You cannot move your Redshift environment to a different cloud provider or to an on-premise solution.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Manager Data Services at a logistics company with 201-500 employees
Real User
Gives us the ability to increase space requirements
Pros and Cons
  • "Easy to build out our snowflake design and load data."
  • "It would be nice if we could turn off an instance while it was retained in history, allowing us to restart without beginning from scratch."

What is most valuable?

  • Easy to build out our snowflake design and load data
  • Ability to dynamically increase space requirements
  • Good speed
  • Extremely reliable

How has it helped my organization?

We use Redshift as our primary BI data repository. It provides BI for all four of our product lines, which run on Redshift. It is low cost and highly reliable!

What needs improvement?

It would be nice if we could turn off an instance while it was retained in history, allowing us to restart without beginning from scratch.

For how long have I used the solution?

We have been using this for a little over a year.

What was my experience with deployment of the solution?

It is pretty easy to learn and use. If you've worked with SQL before, you won't have any problems. Also, it works very well with Talend, which is our ETL tool.

What do I think about the stability of the solution?

No issues of stability at this point.

What do I think about the scalability of the solution?

No issues of scalability at this point.

How are customer service and technical support?

Customer Service:

I don't know that we've ever had to contact AWS for Redshift assistance. Everything is straightforward.

Technical Support:

Not applicable.

Which solution did I use previously and why did I switch?

This was our first foray into a data warehouse and customer facing BI.

How was the initial setup?

We went into the project with a design in mind and Redshift provided a great working platform. Our ETL tool helped us out in building the metadata and Birst built out many of the necessary tables.

What about the implementation team?

The primary implementation was done in-house, with assistance from Birst during the Birst installation.

What was our ROI?

The corporate expectation is to break-even after the first year of selling BI to our customer base. We'll know this by end of Q2, 2018. On the positive side, the addition of BI has helped close multiple sales, so we're beating the competitors with our BI tools (running on Redshift).

What's my experience with pricing, setup cost, and licensing?

BI is sold to our customer base as a part of the initial sales bundle. A customer may elect to opt for a white labeled site for an up-charge.

Which other solutions did I evaluate?

We pretty much went with Redshift, as the company migrated everything to AWS. We might have looked at other database options, but we did not put much time into it.

What other advice do I have?

Plan out your DB design in advance and test your theories by running a small instance first. Use a good ETL tool, like Talend, so updates can be scheduled easily. Don't try to write these from scratch. Redshift has been a great DB for us to date. We haven't seen any slowdowns or outages!

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Nir Wasserman - PeerSpot reviewer
BI Manager at jfrog
Real User
You can copy JSON to the column and have it analyzed using simple functions
Pros and Cons
  • "You can copy JSON to the column and have it analyzed using simple functions."
  • "It lacks a few features which can be very useful, such as stored procedures"

What is most valuable?

The first feature I find valuable in Redshift is JSON format support. You can copy JSON to a column and have it analyzed using simple functions. The second is the PARALLEL ON/OFF option for unloading, where you can choose whether to unload to multiple split files or into a single file.
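
As a rough sketch of both features together, the helper below builds an UNLOAD statement whose inner query uses Redshift's `json_extract_path_text` function, with the PARALLEL option switched on or off. The helper name, the `events` table, the `payload` column, the bucket, and the IAM role ARN are all hypothetical placeholders, and the statement is only assembled as a string, not executed.

```python
def build_unload(select_sql, s3_prefix, iam_role, parallel=True):
    """Assemble a Redshift UNLOAD statement as a string.

    UNLOAD takes the query as a quoted string, so single quotes
    inside the query are doubled.
    """
    escaped = select_sql.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"PARALLEL {'ON' if parallel else 'OFF'};"
    )


# Hypothetical query: pull one field out of a JSON document stored in a column.
query = "SELECT json_extract_path_text(payload, 'user', 'id') AS user_id FROM events"

single_file = build_unload(
    query,
    "s3://example-bucket/exports/events_",
    "arn:aws:iam::123456789012:role/ExampleUnloadRole",
    parallel=False,
)
print(single_file)
```

With PARALLEL ON, which is the default, each slice writes its own file; PARALLEL OFF writes the output serially, which is convenient when a downstream consumer expects a single file.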

How has it helped my organization?

Since we have lots of data sources and high volumes, we needed a unified and organized DB that can handle these amounts and will be our single source of truth for the organization. Therefore, Redshift is the best solution.

What needs improvement?

It lacks a few features that would be very useful, such as stored procedures. Also, one needs to run VACUUM in order to maintain this DB. It would be nice not to have to worry about that and to have it managed automatically.

For how long have I used the solution?

Three years.

What do I think about the stability of the solution?

Yes. Sometimes, for some reason, Redshift is down (not due to maintenance).

What do I think about the scalability of the solution?

No issues, because we know how to use Redshift. We have a cluster of both HDD and SSD nodes, and we keep the maximum data in each, so it is scalable.

How is customer service and technical support?

Great. They are available and very helpful.

How was the initial setup?

Initial setup is very straightforward and very easy. No outside help is needed.

What's my experience with pricing, setup cost, and licensing?

If you are willing to think about the cost of every query you run, but want your nodes to be fully managed, then use BigQuery Data Analytics. If you want a fixed price and do not want to worry about every query, but are prepared to manage your nodes yourself, use Redshift.

Which other solutions did I evaluate?

I did not. We did consider using BigQuery Data Analytics, but eventually, we decided to use Redshift.

What other advice do I have?

My rating would be 8.5. This is a great product, but one still needs to know how to manage clusters and nodes in order to make the DB scalable and reliable.

Its greatest benefit is that it is built on PostgreSQL, so any data specialist who has SQL experience can handle Redshift.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user689532 - PeerSpot reviewer
Full Stack Engineer at a tech services company with 11-50 employees
Consultant
Valuable features are performance, data compression, and scalability. Query compilation time needs a lot of improvement.
Pros and Cons
  • "The valuable features are performance, data compression, and scalability."
  • "Query compilation time needs a lot of improvement for cases where you are generating queries dynamically."

What is most valuable?

The valuable features are performance, data compression, and scalability.

What needs improvement?

Query compilation time needs a lot of improvement for cases where you are generating queries dynamically. Also, it would help tremendously to have some more user-friendly, query optimization helper tools.

For how long have I used the solution?

We have been using the solution for 24 months now.

What do I think about the stability of the solution?

We have not faced any stability related issues so far.

What do I think about the scalability of the solution?

The time it takes to scale the cluster up or down is not trivial and it can take a while. In case you need to do this fast, you will need to think about other solutions.

How are customer service and technical support?

Apart from the official documentation, we haven't had the need to reach out to technical support yet. The quality of the documentation is very good. There are a lot of very useful articles from the community.

Which solution did I use previously and why did I switch?

Previously, we were using AWS RDS for our use case. We found that we had outgrown it. Our data grew in size and we still wanted to have performant queries.

How was the initial setup?

The initial setup of the cluster was pretty straightforward. The following step, setting the right table configuration, was not so straightforward, though. It required an understanding of how the product works. Sort and distribution keys are required concepts to know about.
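
To make the sort and distribution key concepts concrete, here is a small sketch that assembles a Redshift `CREATE TABLE` statement with a distribution key and a compound sort key. The table, the columns, and the helper name `build_create_table` are hypothetical; which columns make good keys depends entirely on your own join and filter patterns.

```python
def build_create_table(name, columns, distkey, sortkeys):
    """Assemble a CREATE TABLE statement with DISTKEY and SORTKEY clauses.

    columns maps column name to SQL type; distkey and sortkeys must
    refer to columns that exist in that mapping.
    """
    for key in [distkey, *sortkeys]:
        if key not in columns:
            raise ValueError(f"unknown column: {key}")
    cols = ",\n".join(f"    {col} {sql_type}" for col, sql_type in columns.items())
    return (
        f"CREATE TABLE {name} (\n{cols}\n)\n"
        "DISTSTYLE KEY\n"
        f"DISTKEY ({distkey})\n"
        f"COMPOUND SORTKEY ({', '.join(sortkeys)});"
    )


ddl = build_create_table(
    "page_views",
    {"customer_id": "BIGINT", "viewed_at": "TIMESTAMP", "url": "VARCHAR(2048)"},
    distkey="customer_id",          # rows for one customer land on the same slice
    sortkeys=["viewed_at"],         # range filters on time can skip blocks
)
print(ddl)
```

The distribution key decides which slice stores each row, which matters for joins, while the sort key decides the on-disk order, which matters for range filters.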

What's my experience with pricing, setup cost, and licensing?

Redshift is very cost effective for a cloud based solution if you need to scale it a lot. For smaller data sizes, I would think about using other products.

Which other solutions did I evaluate?

We were thinking about using a self-managed PostgreSQL. We chose Redshift because we didn't need to manage it ourselves and because it integrates with the rest of the AWS services more fluently.

We are currently evaluating Druid.

What other advice do I have?

It is very important to understand how Redshift is designed to work. The database schema design is not trivial and requires an in-depth knowledge about it, especially if your use-case requires it to perform well.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Padmanesh NC - PeerSpot reviewer
Big Data Solution Architect - Spatial Data Specialist at Sciera, Inc.
Real User
Top 5
It processes petabytes of data and supports many file formats. Restoring huge snapshots takes too long.

What is most valuable?

Scalability: The ability to load a huge number of datasets (I have experience with petabytes of data) and process them. Storage is not limited; we can increase it as much as we want.

Performance: The distributed architecture of Redshift can process the workload across the different nodes of a cluster and coordinate the results in the leader node, making the process much faster.

Flexibility: Users can increase the node size and configuration depending on their needs. There is no need to wait for hardware to be in place whenever we increase the dataset. Redshift provides the option to increase the node or cluster size whenever required.

Multi-formatted accessibility: The Redshift engine has the capability to read the following file formats: CSV, DELIMITER, FIXEDWIDTH, AVRO, JSON, BZIP2, GZIP, LZOP. The user can choose which is best for their requirements.

VPC configuration: VPC configuration secures the dataset that we keep inside the Redshift cluster. It doesn’t allow any third-party inbound or outbound traffic through the firewall.

Python UDF calls: This lets users write their own user-defined functions in Python, register them in Redshift, and use them to process the dataset.
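
As a rough sketch of what a Python UDF can look like, the code below keeps the function logic testable locally and wraps the same body in a `CREATE FUNCTION ... LANGUAGE plpythonu` statement, which is the form Redshift has used for scalar Python UDFs. The function name `f_clean_code`, the argument name, and the cleaning logic are hypothetical, and it is worth checking the current Redshift documentation for Python UDF availability before relying on this.

```python
import textwrap


def clean_code(raw_code):
    """Local copy of the UDF logic so it can be tested outside the warehouse."""
    if raw_code is None:
        return None
    return raw_code.strip().upper()


# The same logic as a UDF body; Redshift wraps this text in a function.
UDF_BODY = """\
if raw_code is None:
    return None
return raw_code.strip().upper()
"""


def build_udf_ddl(function_name, body):
    """Wrap a Python body in a CREATE FUNCTION statement for a scalar UDF."""
    indented = textwrap.indent(body.rstrip(), "  ")
    return (
        f"CREATE OR REPLACE FUNCTION {function_name} (raw_code VARCHAR)\n"
        "RETURNS VARCHAR\n"
        "IMMUTABLE\n"
        "AS $$\n"
        f"{indented}\n"
        "$$ LANGUAGE plpythonu;"
    )


ddl = build_udf_ddl("f_clean_code", UDF_BODY)
print(ddl)
print(clean_code("  ab12 "))
```

Once registered, such a function would be called from SQL like any built-in, for example `SELECT f_clean_code(code) FROM some_table`.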

How has it helped my organization?

We were using MySQL & MongoDB for our regular operations, but when we grew, we were forced to handle a huge number of datasets. It could be petabytes of data in and out on a regular basis. We struggled a lot to complete the operations in a timely manner. With Amazon Redshift, we gained a lot in terms of timing, as well as project completion.

Some of our scoring mechanisms work really well in the distributed architecture of Amazon Redshift.

What needs improvement?

Of course, every product has pluses and minuses. From that perspective, Amazon Redshift has some issues with snapshot restoring when we handle huge datasets. When our snapshot size is really huge, like 20 TB+, we are forced to wait a long time to get it restored. This is reasonable, as they need to transfer the entire dataset to the cluster.

My thought on this issue is that Amazon has its own data centers and connects each region's storage through Direct Connect, so the input and output network data transfer should not be a complex thing. For example, with a 10 Gbps network link, 1 TB could in principle be transferred in roughly 13 minutes, but that’s not happening now. Restoring 1 TB of data takes 30-40 minutes or more.
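
A back-of-the-envelope check of the transfer-time argument: the sketch below computes the ideal time to move 1 TB over a 10 Gbps link and the effective throughput implied by a 30-40 minute restore. It assumes decimal units (1 TB = 10^12 bytes), a fully saturated link, and no protocol or restore overhead, so real numbers will be worse.

```python
TB = 10**12                 # bytes, decimal terabyte (assumption)
TEN_GBPS = 10 * 10**9       # bits per second


def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal transfer time at full line rate, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


def effective_gbps(size_bytes, seconds):
    """Throughput implied by moving size_bytes in the given time."""
    return size_bytes * 8 / seconds / 10**9


ideal = transfer_seconds(TB, TEN_GBPS)
print(f"ideal: {ideal:.0f} s (~{ideal / 60:.1f} min)")
print(f"30 min restore: {effective_gbps(TB, 30 * 60):.2f} Gbps effective")
print(f"40 min restore: {effective_gbps(TB, 40 * 60):.2f} Gbps effective")
```

At line rate the transfer takes about 13 minutes, so a 30-40 minute restore corresponds to roughly a third to a half of the nominal link speed, which is consistent with the reviewer's point that restores run slower than the raw network bound.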

For how long have I used the solution?

I have used it for the last 3.5 years.

I am using Amazon Redshift for big data mapping and data aggregation.

We are using most of their products. Specifically, we are using their dedicated data-centre service (Direct Connect). We have been using Amazon products such as Amazon EC2, S3, SQS, EMR, ML, CloudWatch, Redshift, DynamoDB, etc., for 10-12 years.

What do I think about the stability of the solution?

I have encountered stability issues. A few weeks ago, I encountered an issue with hardware failure and database health status failure. When we face these kinds of issues, we can't do anything from our side until the Amazon technical team finds the issue and rectifies it. It takes time to get resolved. If we are in a rush to deliver something for a client and encounter these issues, we are really screwed.

What do I think about the scalability of the solution?

Of course. As the amount of data that we handle in the cluster grows, we need to increase the cluster or node size. A larger node or cluster apparently increases the hold time for synchronizing the data (metadata) with the node manager, so the startup time increases when we start the cluster.

How are customer service and technical support?

Customer Service:

Customer service is good, but many times I couldn't call them directly. I could reach them through their web UI rather than by making a direct call.

Technical Support:

Technical support is really great, but it’s paid support. The Basic Support plan doesn't include technical support. It only provides billing support.

Which solution did I use previously and why did I switch?

I have experience working in Hadoop as well. When I compare the two (Redshift & Hadoop), Redshift is more user friendly in terms of configuration and maintenance.

How was the initial setup?

The initial setup of Amazon Redshift is so simple and straightforward. We do not need to read or understand any of the technical documentation. Simply said, it’s a plug-and-play service or platform.

What about the implementation team?

I implemented it in-house.

What was our ROI?

I can't directly calculate ROI, because we are not using only Redshift. We are using multiple products to increase our revenue and decrease time consumption, so it's difficult to calculate the ROI of Redshift usage alone.

What's my experience with pricing, setup cost, and licensing?

Pricing and licensing are very important. In terms of pricing, it's a bit high, as they are using standard hardware. My advice to users is to start the cluster only when it is required. At the end of the workday, snapshot the clusters and shut them down, then restore those snapshots when you need them back. That way, you are charged only for usage rather than for idle time.
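
To get a feel for what this advice can save, here is a minimal cost sketch. The $0.25 per node per hour rate is the entry price quoted elsewhere on this page; the 10-node cluster, the 8-hour workday, and the 22 working days per month are hypothetical assumptions, and snapshot storage and restore time are ignored.

```python
def monthly_cost(nodes, price_per_node_hour, hours_per_day, days_per_month):
    """Compute cost for a cluster that only runs during the given hours."""
    return nodes * price_per_node_hour * hours_per_day * days_per_month


NODES = 10                   # hypothetical cluster size
RATE = 0.25                  # USD per node per hour

always_on = monthly_cost(NODES, RATE, 24, 30)
workdays_only = monthly_cost(NODES, RATE, 8, 22)
savings = always_on - workdays_only

print(f"always on:     ${always_on:,.2f}")
print(f"workdays only: ${workdays_only:,.2f}")
print(f"difference:    ${savings:,.2f}")
```

Under these assumptions the always-on cluster costs $1,800 a month against $440 for working hours only. The trade-off is the restore wait at the start of each day, which grows with snapshot size.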

Which other solutions did I evaluate?

I evaluated Hadoop and Spark, along with Redshift. I have no negative comments about those other products. Redshift is flexible in terms of configuration, maintenance and security, especially VPC configuration, which secures our data a lot.

What other advice do I have?

Use this product for huge data mapping or aggregation. Use Redshift through a VPC to keep your data very secure over the long term.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user396519 - PeerSpot reviewer
Director at a tech company with 1,001-5,000 employees
Vendor
Columnar-storage databases leverage the Massively Parallel Processing (MPP) capabilities of its data warehouse architecture.

What is most valuable?

  • Performance: Very fast query performance due to columnar-storage databases that leverage the Massively Parallel Processing (MPP) capabilities of its data warehouse architecture.
  • Petabyte-scale data warehouse, without any loss in performance and at low cost: One of our existing customers stores more than 500 terabytes of data in an AWS Redshift database and the warehouse performance was good. We want to highlight that even if the warehouse size increases to petabytes, Redshift would still work fine, without performance issues, and would also cost less.

How has it helped my organization?

The end users were able to have access to real-time analytics.

What needs improvement?

We would really like to see a few more connectors included that would enable connecting with other databases and services. We have faced some difficulties pulling data from Teradata and storing it in Redshift. There is no direct connector available between Teradata and Redshift.

For how long have I used the solution?

We have been working with this product for the past 24 months.

What do I think about the stability of the solution?

We have not faced any stability-related issues so far.

What do I think about the scalability of the solution?

We did not encounter any scalability issues in the last 24 months that we have been working with Redshift.

How are customer service and technical support?

We actually had to reach out to technical support a few times and they were really helpful and solved our problems. We would give it 4/5.

Which solution did I use previously and why did I switch?

We were using an on-premise MySQL data warehouse. To reduce the cost and improve scalability, we switched to a cloud version of data warehouse databases.

How was the initial setup?

Initial setup and configuration was pretty straightforward. First, we needed to create a Redshift cluster. Once the cluster was created, we created a database schema based on our need in the Redshift cluster.

What's my experience with pricing, setup cost, and licensing?

AWS Redshift is one of the fastest and most cost-effective cloud-based databases. They charge $3,330 per TB per year for ds2.8xlarge instances, which have 244 GB RAM, a 36-core CPU, a 10 Gbps network and 16 TB of HDD storage.

What other advice do I have?

You need to design the database structure with best sort and distribution keys, along with primary and foreign keys.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Padmanesh NC - PeerSpot reviewer
Big Data Solution Architect - Spatial Data Specialist at Sciera, Inc.
Real User
Top 5

Hey Aju Mathew,

That was a nice review of Amazon Redshift. I have also been using it for the last three years. I would like to understand the pricing a bit better. I am using their instances at $0.25 per node per hour, so if I am using a 100-node cluster, I am charged $25 per hour. But you mentioned something like $3,330 per TB per year. Can you please elaborate on that? Is it based on node size or storage size?
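
The two price figures in this exchange use different units, so a quick conversion may help. The sketch below uses only numbers quoted in the review and the comment ($3,330 per TB per year, 16 TB per ds2.8xlarge node, $0.25 per node per hour, 100 nodes) and assumes 8,760 hours in a year. That the two figures refer to different node types is an assumption, not something either poster confirms.

```python
HOURS_PER_YEAR = 24 * 365   # 8,760


def per_node_year_from_tb_price(price_per_tb_year, tb_per_node):
    """Yearly cost of one node when pricing is quoted per TB per year."""
    return price_per_tb_year * tb_per_node


def hourly_from_yearly(yearly_cost):
    """Equivalent hourly rate for a cost quoted per year."""
    return yearly_cost / HOURS_PER_YEAR


# Figure from the review: $3,330 per TB per year on a 16 TB node.
dense_storage_node_year = per_node_year_from_tb_price(3330, 16)
dense_storage_node_hour = hourly_from_yearly(dense_storage_node_year)

# Figure from the comment: $0.25 per node per hour on a 100-node cluster.
small_node_year = 0.25 * HOURS_PER_YEAR
cluster_hour = 100 * 0.25

print(f"16 TB node: ${dense_storage_node_year:,} per year, "
      f"about ${dense_storage_node_hour:.2f} per hour")
print(f"$0.25 node: ${small_node_year:,.2f} per year")
print(f"100-node cluster: ${cluster_hour:.2f} per hour")
```

Read this way, the per-TB figure works out to about $6.08 per hour for a 16 TB node, which suggests it describes a much larger node than the $0.25 per hour one, priced by the storage it carries.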

it_user576450 - PeerSpot reviewer
Data Science Lead at a tech services company with 51-200 employees
Consultant
The PostgreSQL interface is good because you can play with big data with just SQL.

What is most valuable?

The valuable features are:

  • PostgreSQL Interface
  • Scalability
  • Pricing/Maintenance/Setup

The PostgreSQL interface is good because you can play with big data with just SQL. This is one of the reasons why they made Hive.

However, Hive’s SQL is still not as standard as what Redshift provides:
http://docs.aws.amazon.com/red...

How has it helped my organization?

Redshift has been the data warehouse in at least three of my previous companies. The impact is huge to anyone who uses data in any way.

What needs improvement?

I would like to see improvements in the database integrations. Currently, Amazon does not provide real-time/near real-time integration with other products like RDS or DynamoDB out-of-the-box.

We need to either build the integrations ourselves, or rely on third-party services which are not always the best.

For how long have I used the solution?

We have been using this solution for over three years.

What do I think about the stability of the solution?

There were stability issues in the beginning. However, the product has improved quite a lot in the last two years in terms of stability.

What do I think about the scalability of the solution?

Redshift can scale up to a petabyte with a few simple clicks.

How are customer service and technical support?

Technical support is good, but similar to any other Amazon Web Service, you have to pay for a good level of technical support.

Which solution did I use previously and why did I switch?

We did not have a previous solution. Redshift worked for us the first time we tried. The pricing could not be beaten by anything else in the market at that time.

How was the initial setup?

The installation was straightforward and only required a few clicks.

What's my experience with pricing, setup cost, and licensing?

Pricing was quite a strong point of Redshift when it was first released. Nowadays, quite a number of other services are very competitive in pricing, such as BigQuery.

What other advice do I have?

Redshift, like any other big data technology, isn’t a silver bullet for everything. The most important thing is to understand your data and your requirements before you make any decision to use any technology.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user576444 - PeerSpot reviewer
Rails Developer at a recruiting/HR firm with 51-200 employees
Vendor
It's based on PostgreSQL, is a managed solution, and has low price per terabyte per year.

What is most valuable?

  • It is based on PostgreSQL.
  • It’s managed. Meaning, AWS takes care of handling infrastructure, deployments, encryption, and uptime for you.
  • It’s cheap when you consider the price per terabyte per year.
  • It’s integrated into the AWS stack.

How has it helped my organization?

At my previous company that does mobile analytics as its core product, we moved all the analytics backend from MongoDB to Redshift. Where I currently work, we use it as our main data lake/data warehouse.

What needs improvement?

While it's probably the best product in its category (managed SQL-based data warehouse at scale), it has a few shortcomings, although very few.

The main issue people complain about, and I agree with the claim, is that it's hard to load your data into it. You first need to export your data to S3 as CSV, JSON or AVRO. Then you can load it into Redshift. Even then, you have to make sure your data is properly formatted. (You can use the COPY options TRUNCATECOLUMNS, to load fields that are too big, and MAXERROR, to allow for a given number of errors while loading.) In general, ETL and data cleaning are a hurdle in data engineering, and Redshift suffers from it.
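
The load step described above can be sketched as a COPY statement with the two options mentioned. The helper below only assembles the statement as a string; the helper name, the `events` table, the bucket path, and the IAM role ARN are hypothetical placeholders, and the `'auto'` option for JSON and AVRO is just one of the mapping choices COPY accepts.

```python
def build_copy(table, s3_path, iam_role, fmt="CSV",
               truncate_columns=False, max_errors=0):
    """Assemble a Redshift COPY statement for data already exported to S3."""
    if fmt not in {"CSV", "JSON", "AVRO"}:
        raise ValueError(f"unsupported format: {fmt}")
    lines = [f"COPY {table}", f"FROM '{s3_path}'", f"IAM_ROLE '{iam_role}'"]
    # CSV takes no argument; JSON and AVRO take a mapping option such as 'auto'.
    lines.append("FORMAT AS CSV" if fmt == "CSV" else f"FORMAT AS {fmt} 'auto'")
    if truncate_columns:
        lines.append("TRUNCATECOLUMNS")   # cut oversized strings to fit the column
    if max_errors:
        lines.append(f"MAXERROR {max_errors}")  # tolerate this many bad rows
    return "\n".join(lines) + ";"


copy_sql = build_copy(
    "events",
    "s3://example-bucket/exports/events/",
    "arn:aws:iam::123456789012:role/ExampleCopyRole",
    fmt="JSON",
    truncate_columns=True,
    max_errors=10,
)
print(copy_sql)
```

Both options trade strictness for convenience: truncated or skipped rows load silently, so it is worth reviewing the load error tables afterwards.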

For how long have I used the solution?

I have used Redshift for three years.

What do I think about the stability of the solution?

I once had an issue because my data contained a Unicode NULL character in a VARCHAR field ("\u0000"). AWS support was very quick to respond and helpful. Other than that, I have had no issues whatsoever.
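
One way to avoid this class of problem is to strip NUL characters from string fields before exporting the data for loading. The sketch below is a minimal example of that cleaning step; the helper names and the sample row are hypothetical, and whether stripping is acceptable depends on your data.

```python
NUL = "\u0000"


def strip_nul(value: str) -> str:
    """Remove Unicode NULL characters from a string."""
    return value.replace(NUL, "")


def clean_row(row):
    """Strip NUL characters from every string field, leaving other types alone."""
    return [strip_nul(v) if isinstance(v, str) else v for v in row]


dirty = ["ab\u0000c", 5, None, "\u0000"]
cleaned = clean_row(dirty)
print(cleaned)
```

Non-string fields such as numbers and `None` pass through unchanged, so the function can be applied to whole rows before they are written out as CSV or JSON.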

What do I think about the scalability of the solution?

No scalability issues whatsoever.

How are customer service and technical support?

Technical support is very good.

Which solution did I use previously and why did I switch?

At my previous company, we switched from MongoDB to Redshift. The main reason was price and performance. At my current company, we started a data warehouse (greenfield project). The choice was between Google BigQuery and AWS Redshift. The main criterion was that Redshift was PostgreSQL-based and supports CTEs and window functions (PostgreSQL features).

How was the initial setup?

The big part when using Redshift is setting up the ETLs and doing the data cleaning. It was very hard when moving from MongoDB, because I had to re-discover our data schema (that had no spec). With that said, in both cases (moving from MongoDB and starting from scratch), I had a prototype up in about a day. By that I mean that I had the most important parts of my data loaded into Redshift and I could query it.

What's my experience with pricing, setup cost, and licensing?

The pricing page is explicit. Choose what suits your needs in terms of storage and performance.

Which other solutions did I evaluate?

For setting up a data warehouse, BigQuery was a serious contender. BigQuery is simpler to set up and scale. It's also more of a black box: you worry less about what's inside and how it scales, and you get charged for what you consume (which is both a pro and a con). With Redshift, you choose in advance the type of machine you want, like EC2 (resizing your cluster is easy).

What other advice do I have?

If you evaluate Redshift, chances are that you should evaluate BigQuery too. So take the time to weigh the pros and cons of each (plenty has been written online about that).

Take a look at the reserved instances pricing. It is very advantageous if you know you will stick with Redshift for some time.

Take the time to learn PostgreSQL (e.g., https://www.pgexercises.com/). Redshift, while based on PostgreSQL 8.0, supports a good number of advanced Postgres features.

Do not be afraid of joins. PostgreSQL performs very well in this regard.
If you need performance, have a look at the suggested optimizations in the official documentation (such as setting up the correct distkeys, sortkeys and compression schemes).

Understand that Redshift has no indexes.

Understand that Redshift is an analytical database with columnar storage, and that it does not enforce constraints.

Redshift plays very well with a PostgreSQL instance in RDS linked to it via DBLINK (see this guide: https://aws.amazon.com/blogs/big-data/join-amazon-redshift-and-amazon-rds-postgresql-with-dblink/). I've used this in production at my current company, and this is tremendously useful. You can have your raw data in Redshift and aggregate it directly into RDS. To do this, insert into RDS what you select from Redshift through the dblink.
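
The "insert into RDS what you select from Redshift" pattern can be sketched as a single statement run on the RDS PostgreSQL side. The helper below only builds that statement as a string; the connection name `redshift_server`, the tables, and the columns are hypothetical, and it assumes the dblink extension and a foreign server are already set up as the linked guide describes.

```python
def build_dblink_insert(target_table, columns, connection_name, redshift_sql):
    """Build an INSERT ... SELECT that pulls rows from Redshift through dblink.

    columns is a list of (name, postgres_type) pairs; dblink returns generic
    records, so the column definition list after AS is required.
    """
    names = ", ".join(name for name, _ in columns)
    typed = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"INSERT INTO {target_table} ({names})\n"
        f"SELECT {names}\n"
        f"FROM dblink('{connection_name}', $REDSHIFT$\n"
        f"    {redshift_sql}\n"
        f"$REDSHIFT$) AS t1 ({typed});"
    )


statement = build_dblink_insert(
    "daily_totals",
    [("day", "date"), ("total", "bigint")],
    "redshift_server",
    "SELECT day, SUM(amount) AS total FROM raw_events GROUP BY day",
)
print(statement)
```

The dollar-quoted block keeps the Redshift query free of quote escaping, and the aggregation runs on Redshift so that only the summarized rows travel to RDS.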

Disclosure: I am a real user, and this review is based on my own experience and opinions.
it_user576456 - PeerSpot reviewer
Manager BI Development at a comms service provider with 1,001-5,000 employees
Vendor
The fact that it stores data using a columnar approach allows us to use columns in join conditions.

What is most valuable?

Redshift gives extremely fast responses for queries involving large tables. This is the most important feature I look for in data warehouse solutions. You often come across use cases where it is not possible to distribute data on a certain column, yet you need this column in join conditions. Redshift stores data using a columnar approach, which is useful for data aggregation.

All this at an extremely low price makes it possible for small to medium sized organizations to use Redshift’s power to get business insights.

How has it helped my organization?

One of my clients required large amounts of data but had a low budget. Amazon Redshift was the perfect choice for my client. We joined two tables containing billions of rows each and got results back in 27 seconds with a relatively small cluster of nodes.

What needs improvement?

Amazon should bring more SQL functions that are required in data warehouse implementations. It lacks SQL functions for complex data processing. A very small example is recursive queries. However, Amazon is developing the product at a fast pace and bringing new features with every release.

For how long have I used the solution?

I’ve been using Redshift for more than two years. I created one traditional data warehouse with 3-tier architecture and one big data solution.

What do I think about the stability of the solution?

We have not really had stability problems. The product is mature and can be utilized for production systems.

What do I think about the scalability of the solution?

Since Redshift is on the AWS cloud, scalability is not an issue. With a few clicks, cluster size can be increased or reduced. This is especially useful when you expect a large amount of data processing temporarily. For example, on Black Friday retail organizations expect large amounts of data flow/processing. Redshift can be scaled up for a few days to accommodate the surge of data and then scaled back to normal cluster size to save OPEX.
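
A temporary scale-up like the Black Friday case could be scripted instead of clicked. The following is a minimal sketch, assuming `client` is a Redshift client such as the one returned by `boto3.client("redshift")`. The dates, node counts, and the cluster identifier are hypothetical.

```python
from datetime import date


def target_node_count(today, surge_start, surge_end, normal_nodes, surge_nodes):
    """Pick the cluster size for a given day: larger inside the surge window."""
    return surge_nodes if surge_start <= today <= surge_end else normal_nodes


def resize_if_needed(client, cluster_id, current_nodes, wanted_nodes):
    """Call modify_cluster only when the node count actually has to change.

    Returns True if a resize was requested, False otherwise.
    """
    if current_nodes == wanted_nodes:
        return False
    client.modify_cluster(ClusterIdentifier=cluster_id, NumberOfNodes=wanted_nodes)
    return True


# Hypothetical surge window around Black Friday 2022.
wanted = target_node_count(
    today=date(2022, 11, 25),
    surge_start=date(2022, 11, 24),
    surge_end=date(2022, 11, 28),
    normal_nodes=4,
    surge_nodes=12,
)
print(wanted)
```

Running the same check daily from a scheduler would scale the cluster up at the start of the window and back down after it.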

How are customer service and technical support?

The AWS team gives special focus to customer support. This is a very big benefit of going to the cloud. You get a reply from AWS within a short time frame.

Which solution did I use previously and why did I switch?

I worked on Teradata and IBM solutions. Redshift gives performance similar to these solutions and costs a fraction of the amount.

How was the initial setup?

Your Redshift cluster can be up and running with a few clicks and in less than 5 minutes. This is a big benefit of shifting to the cloud.

Which other solutions did I evaluate?

We analyzed Microsoft, Oracle, AWS RDS and MongoDB for our requirements.

What other advice do I have?

Redshift is based on PostgreSQL and adds MPP/columnar features to make it a data warehouse product. It is very easy for developers to adopt this solution. Your existing team can easily work on Redshift with no extra cost of learning.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user572622 - PeerSpot reviewer
BI Architect & Developer (contract) at a retailer with 501-1,000 employees
Vendor
You can configure tables to live in the memory of all of the available cores.

What is most valuable?

Column store and distributed processing are optimized for read access. We grew to 3000+ users with no impact.

Column store is a storage layout for relational data that compresses very well. I’m using it now in SQL Server 2016. We configured a 16-core VM for handling requests on the DB. The recommendation was to separate inbound data packets into related chunks, which were 1/16th of the size.

This way, the import process could make full use of parallelization, and it worked. We imported 20 million rows of sales facts in less than 15 seconds, and the content was query-able immediately. I’ve never seen that before. This was impressive. This meant that we could completely rebuild the data warehouse to “current” from "scratch" within minutes, assuming that the data was in S3 already.
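
Splitting a load into 16 equal parts, as described above, can be sketched as follows. The S3 key prefix and file naming are hypothetical. A Redshift `COPY` pointed at a common key prefix loads all matching files, which is what lets the parts load in parallel.

```python
def split_into_chunks(rows, n_chunks):
    """Split rows into n_chunks contiguous parts whose sizes differ by at most one."""
    base, extra = divmod(len(rows), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first parts
        chunks.append(rows[start:start + size])
        start += size
    return chunks


def chunk_keys(prefix, n_chunks):
    """Name one object per chunk under a shared prefix (naming is illustrative)."""
    return [f"{prefix}part-{i:02d}.csv" for i in range(n_chunks)]


rows = list(range(100))  # stand-in for sales fact rows
chunks = split_into_chunks(rows, 16)
keys = chunk_keys("sales/2016-01-31/", 16)
for key, chunk in zip(keys, chunks):
    print(key, len(chunk))
```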

Tables that would typically be 2GB in size are now about 250MB. This means more data in memory. You can also configure the tables to live in the memory of all of the available cores. This is good for small dimension tables. You can also fragment them across all cores, for the larger fact tables. This allows for distributed query processing. Once you set it up, it just worked. It was all specified in the PG-SQL table statements.
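
The two placements described above seem to correspond to Redshift's `DISTSTYLE ALL` (a full copy of the table on every node) for small dimension tables and key distribution for large fact tables. Below is a minimal sketch of the table statements, with hypothetical table and column names. It only builds the DDL text.

```python
def create_table_sql(name, columns, diststyle, distkey=None, sortkey=None):
    """Build a Redshift CREATE TABLE statement with distribution settings.

    columns is a list of (name, type) pairs. diststyle is e.g. "ALL" or "KEY".
    """
    column_defs = ", ".join(f"{col} {sql_type}" for col, sql_type in columns)
    parts = [f"CREATE TABLE {name} ({column_defs})", f"DISTSTYLE {diststyle}"]
    if distkey:
        parts.append(f"DISTKEY ({distkey})")
    if sortkey:
        parts.append(f"SORTKEY ({sortkey})")
    return "\n".join(parts) + ";"


# Small dimension table: replicated to every node.
dim_sql = create_table_sql(
    "dim_store",
    [("store_id", "INTEGER"), ("store_name", "VARCHAR(100)")],
    diststyle="ALL",
)

# Large fact table: rows spread across the cluster by store_id.
fact_sql = create_table_sql(
    "fact_sales",
    [("store_id", "INTEGER"), ("sale_date", "DATE"), ("amount", "DECIMAL(12,2)")],
    diststyle="KEY",
    distkey="store_id",
    sortkey="sale_date",
)
print(dim_sql)
print(fact_sql)
```

Distributing the fact table on the same column it is joined on keeps matching rows on the same node, which is what allows the join to be processed in a distributed way.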

There were two data centers in Sydney that were guaranteeing us a distributed solution. We really didn’t notice this. It was more of a check box situation. At one point, there was an outage at AWS, but it didn’t impact our operations directly.

How has it helped my organization?

This has given us the ability to provide metrics to the large number of company staff on their performance without impacting core systems.

What needs improvement?

I’d like to see these Redshift features arrive in other languages, such as SQL's ColumnStore index.

For how long have I used the solution?

I have used this solution for three years.

What do I think about the stability of the solution?

There have been no stability issues.

How are customer service and technical support?

Technical support always met my expectations.

Which solution did I use previously and why did I switch?

I was on a team that was using AWS tools for Dick Smith Electronics (now liquidated). The tools ceased use in February of 2016.

Prior to that, we were using them fully for about 3 years. We loaded data to Redshift according to the best practices included in the online docs and through consultation with the AWS staff. The combination of S3 and Redshift for this purpose was very high in performance. Redshift was used to provide the data model to an instance of MicroStrategy for BI reporting.

We were using MicroStrategy, which generated all the SQL that our reporting services needed.

As such, I could only comment on the data engineering phase. Technically, this was so impressive that I don’t know what to add. I don’t recall feeling that it missed anything. If anything, I was not using all the available features. AWS documentation is great in this regard. You can tell they have put a lot of thought into it.

A lot of the future direction in database technology has to do with memory optimization and concurrency (VoltDB). This is more targeted towards transactional processing, and not data warehousing.

Memory-only data warehousing solves a lot of access issues without having to think too hard about the problem from the consumers' point of view. I am sure that you can already configure this.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user583371 - PeerSpot reviewer
BI Architect at a comms service provider with 5,001-10,000 employees
Vendor
Columnar storage technology is valuable.

What is most valuable?

Columnar storage technology is the most valuable feature of this solution.

How has it helped my organization?

We can meet the SLS/SLAs in our daily processes.

What needs improvement?

Some improvements can be brought about in:

Restore table:

I would like to use this option to move data across different clusters. Right now, the feature only restores a table into the same cluster, based on the snapshot taken. To move data to another cluster, I have to UNLOAD from cluster A and then COPY into cluster B. I would like to use the existing snapshots to bring the data into whichever cluster I need.

Maybe the current design cannot support this, because it is based on nodes and data distribution.

But our real scenario is this: if we lose the data and need to recover it in another cluster, we have to:

1) Restore the table in the current cluster under a different name

2) Unload the data to S3

3) Copy the data to a new cluster. With billions of records, this is complex to do.
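
Steps 2 and 3 above can be sketched as a pair of SQL statements, one run on the source cluster and one on the target. The table names, S3 prefix, and IAM role ARN below are hypothetical placeholders, and the code only builds the statement text.

```python
def build_unload(table, s3_prefix, iam_role):
    """Statement for the source cluster: export the restored table to S3."""
    return (
        f"UNLOAD ('SELECT * FROM {table}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"GZIP;"
    )


def build_copy(table, s3_prefix, iam_role):
    """Statement for the target cluster: load the exported files from S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"GZIP;"
    )


# Hypothetical names: the restored copy is exported, then loaded elsewhere.
role = "arn:aws:iam::123456789012:role/example-redshift-role"
prefix = "s3://example-bucket/recovery/sales_"
unload_sql = build_unload("sales_restored", prefix, role)
copy_sql = build_copy("sales", prefix, role)
print(unload_sql)
print(copy_sql)
```

Both statements rely on the same default delimiter and the same GZIP setting, so the files written by UNLOAD can be read back by COPY without extra format options.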

Vacuum process: The vacuum needs to be segmented. For example, after 24 hours of execution, I had to cancel the process and 0% was sorted (big table).


For big tables (billions of records), if the table is 100% unsorted, the vacuum can take more than 24 hours. If we don't have this time frame, we have to work around it by moving the data out to additional tables and running the vacuum in batches on the main table.

This is because if I run the vacuum directly on the main table and stop it after 5 hours, 0 records will be sorted. I would like to run the vacuum on the main table, stop it when I need to, and still have some records vacuumed, like an incremental process.

For how long have I used the solution?

I have used this solution for around three years.

What do I think about the stability of the solution?

We did encounter stability issues: if you are using more than 25 nodes (ds2.xlarge), the cluster is totally unstable.

What do I think about the scalability of the solution?

I have not experienced any scalability issues.

How are customer service and technical support?

I would rate the technical support a 9/10 for normal issues.

However, for advanced issues, I would give it a 5/10, since I had to go directly to AWS engineering support.

Which solution did I use previously and why did I switch?

Initially, we were using the Microsoft SQL solution. We decided to move over to this product due to the DWH volume and performance.

How was the initial setup?

In my opinion, the setup was normal.

What's my experience with pricing, setup cost, and licensing?

Based on the quality of the product and its price, it is one of the best options available in the market now.

Which other solutions did I evaluate?

We also looked at the Oracle solution.

What other advice do I have?

You need to make sure that the space used in the DWH is at most 50% of the total space.

You must create processes to vacuum and analyze tables frequently. Also, before creating the tables, you should choose the right encoding, DISTKEY and sort keys.
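
A recurring maintenance job of this kind could start from something like the sketch below, which only generates the statements to run. The table names and the 95 percent sort threshold are hypothetical.

```python
def maintenance_statements(tables, sort_threshold_percent=None):
    """Return VACUUM and ANALYZE statements for each table, in order.

    If sort_threshold_percent is given, VACUUM skips the sort phase for
    tables that are already sorted at least to that percentage.
    """
    statements = []
    for table in tables:
        vacuum = f"VACUUM FULL {table}"
        if sort_threshold_percent is not None:
            vacuum += f" TO {sort_threshold_percent} PERCENT"
        statements.append(vacuum + ";")
        statements.append(f"ANALYZE {table};")
    return statements


for statement in maintenance_statements(["fact_sales", "dim_store"], 95):
    print(statement)
```

Note that VACUUM cannot run inside a transaction block, so whatever client executes these statements needs autocommit enabled.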

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user576441 - PeerSpot reviewer
Senior Software Engineer [Redshift Programmer] at a tech services company with 1,001-5,000 employees
Consultant
It supports SCD1 and SCD2, and the star schema. Improvement is needed in the scope of data types and complex RDBMS functionalities.

What is most valuable?

The most valuable features of this product are:

  • Processing huge data in petabytes
  • Massively Parallel Processing (MPP)
  • Concept of data compression
  • The way it stores the data in drives especially with the distribution key
  • Supports BI tools like MicroStrategy (MSTR) and Tableau
  • Supports all the data warehouse core features such as SCD1 and SCD2, and different schemas like the star schema

How has it helped my organization?

It has helped us to understand the response and interest of the customers and the user conversion rate in this competitive world. Thus, it has helped us in the decision-making process.

What needs improvement?

In most scenarios, the data source for Redshift will be a traditional RDBMS like MySQL, PostgreSQL, SQL Server, etc. After migrating to Redshift, we find a few disconnects with respect to data types, stored procedures and other complex functionalities. There is a need for improvement in these aspects, mainly in the scope of data types and some complex functionalities which we can perform in an RDBMS.

For how long have I used the solution?

I have used this solution for more than a year.

What do I think about the stability of the solution?

I have not encountered any issues with stability. In terms of performance, Redshift is highly stable.

What do I think about the scalability of the solution?

I have not encountered any issues with scalability. We can easily scale the nodes in AWS with only a few clicks.

How are customer service and technical support?

I would give the technical support a 6 out of 10 rating.

Which solution did I use previously and why did I switch?

We have not used any other solution.

How was the initial setup?

The setup was straightforward for those who know AWS.

What's my experience with pricing, setup cost, and licensing?

The Redshift pricing policy is easy to understand.

Which other solutions did I evaluate?

We did not evaluate other options prior to selecting this solution.

What other advice do I have?

As of now, Redshift is far better than the other products in the market.

Lastly, I would like to mention that Redshift is more about scaling and stabilizing your data. One should also focus on data modeling from time to time.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
PeerSpot user
Senior Engineer, Big-Data/Data-Warehousing at a manufacturing company with 501-1,000 employees
Vendor
We create different-sized clusters and orchestrate them using the SDK.

What is most valuable?

The most valuable features to us are: speed, DML, the fact that it is cloud-based, the management console, and Boto3.

Because we are dealing with a lot of data, speed is always important. Redshift is blisteringly fast when doing "deep" copies and inserts. Conceptually, my data-transformation pipelines are a series of proprietary "waves" that leverage Redshift's DML/"deep" copy/insert strengths. Doing all this in the cloud allows us to easily test alternatives. We create different-sized Redshift clusters and orchestrate them by using the SDK (Python Boto3). We go beyond the traditional DWH to "infrastructure-as-software".
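
Creating clusters of different sizes through the SDK might look roughly like this sketch. It assumes `client` is a Boto3 Redshift client, for example `boto3.client("redshift")`. The identifiers, node type, and credentials are hypothetical placeholders, and this is not the reviewer's actual pipeline.

```python
def cluster_request(identifier, node_type, number_of_nodes, username, password):
    """Build the keyword arguments for create_cluster for a given size."""
    request = {
        "ClusterIdentifier": identifier,
        "NodeType": node_type,
        "MasterUsername": username,
        "MasterUserPassword": password,
    }
    if number_of_nodes == 1:
        request["ClusterType"] = "single-node"
    else:
        # NumberOfNodes is only passed for multi-node clusters.
        request["ClusterType"] = "multi-node"
        request["NumberOfNodes"] = number_of_nodes
    return request


def launch_clusters(client, requests):
    """Submit one create_cluster call per request and collect the responses."""
    return [client.create_cluster(**request) for request in requests]


# Hypothetical test matrix: the same workload on three cluster sizes.
requests = [
    cluster_request(f"trial-{n}-nodes", "dc2.large", n, "admin_user", "CHANGE_ME")
    for n in (1, 2, 4)
]
for request in requests:
    print(request["ClusterIdentifier"], request["ClusterType"])
```

In a real script the password would come from a secret store rather than a literal, and the trial clusters would be deleted once the comparison is done.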

How has it helped my organization?

Redshift has helped to transform Makerbot into a data-driven company.

What needs improvement?

Integrating database security/access rights with AWS IAM would be great. I would also like to see more DML features that might aid in processing unstructured or log-file data. This would allow us to avoid having to use EMR/Hadoop.

For how long have I used the solution?

We’ve used Amazon Redshift for 3 years.

What was my experience with deployment of the solution?

We did not encounter any deployment issues.

What do I think about the stability of the solution?

We did not encounter any issues with stability.

What do I think about the scalability of the solution?

We did not encounter any issues with scalability.

How are customer service and technical support?

Customer Service:

I think the customer service is adequate.

Technical Support:

The level of technical support is good.

Which solution did I use previously and why did I switch?

We tried prior solutions, but they had limited or no scalability/agility.

How was the initial setup?

The initial setup was straightforward.

What was our ROI?

It took less than a year for the product to pay for itself.

What's my experience with pricing, setup cost, and licensing?

Regarding pricing and licensing, I advise starting small and having your developers/DBA use table compression and partitioning from the start.

Which other solutions did I evaluate?

We have used different options over the last 20 years. We found AWS Redshift to be the leader in capability, and it provides an ecosystem of related services from AWS, many of which are free.

What other advice do I have?

My advice to others is to prototype, prototype, prototype! Everything depends on your data and what you need to do to it. No two projects are the same.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Amazon Redshift Report and get advice and tips from experienced pros sharing their opinions.
Updated: August 2022