Amazon Redshift Room for Improvement

LS
IoT Consultant at a computer software company with 1,001-5,000 employees

Pricing sometimes depends on the setup (key, etc.) which makes it hard for somebody new to AWS. Detailed research has to be conducted to end up with a competitive solution in terms of pricing and performance.

BR
Senior Director Data Architecture at Managed Markets Insight & Technology, LLC

As our scalability requirements and data growth exceeded expectations, Redshift didn't scale up to meet our business needs. So, at that point, we made a switch to Snowflake, which provided the scalability we needed. So, scalability is one area of improvement.

One area where Amazon Redshift could improve is in adopting an architecture that separates compute from storage, which Delta, Snowflake, and a few others in the cloud data warehouse space are adopting. Although Redshift introduced AQUA to achieve some level of scalability, I still feel that when it comes to scaling, whether vertical or horizontal, there is a noticeable amount of downtime for end consumers. If I need to switch from DC1 to DC2, or from one compute- or storage-optimized node type to another, I have to bring down the entire cluster and then bring it back up. That's a pain point.

Beverly R. Jamison - PeerSpot reviewer
IT Solutions Architect / Computer Scientist at Practical Semantics

The solution could improve in handling more data formats and more native support for RDF.

Buyer's Guide
Amazon Redshift
April 2024
Learn what your peers think about Amazon Redshift. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
767,995 professionals have used our research since 2012.
Tamás Srancsik - PeerSpot reviewer
Data Analyst Lead at Vectornator

It would be good to see Redshift as a serverless offering. The proposition may be unclear, but at the time, there were certain limitations with the pay-as-you-go offering. A serverless offering would provide more flexible, on-demand pricing. Redshift is not expensive, but I always have to buy a new server if I need more computing than I have. Setting up a new server is an easy task, but it would be better if I could scale my Redshift cluster up or down as needed; as it stands, manual control is still required. For example, my analyst team may be working on a job that requires a lot of computing but is only needed for this month, this week, or even just today. The job should scale up and down automatically, but that capability is not yet fully developed.

MiodragMilojevic - PeerSpot reviewer
Senior Data Architect at Telenor

From my perspective, the product could be improved by making it more flexible. There are now more flexible products on the market that allow for expandability and dynamic expansion as the market changes with regard to data warehouses. Although the product is simple to use, there can be problems. If you declare a unique key on a column and then store data that violates it, the database will assume the declaration holds, and results will be distorted. It's fine if the query is simple, but if it's complex or you have too many queries per hour, it can create a bottleneck for Redshift from which you can't recover. It requires some fine-tuning.

For additional features, I would like to see support for partitions, which doesn't yet exist as a feature. It's quite an important issue when you're dealing with large databases. Also, I believe the product needs improvement in parallel threading to support more database users without jeopardizing performance.

SP
Senior Economics Analyst at a manufacturing company with 51-200 employees

The initial deployment was complex.

Ansari Rehman - PeerSpot reviewer
Cloud Data Architect (AWS-Snowflake-Teradata-Oracle) at a consultancy with 10,001+ employees

During our last office project, Redshift couldn't perform well even for a data size of 6 TB. Thus, compared to Teradata and Snowflake, the solution needs to work faster. They should extend the plan by including better optimization and readability, as we get with Teradata. Also, they should provide zero-copy cloning and sharing facilities.

Jayanta Datta - PeerSpot reviewer
Executive Director at Morgan Stanley

Infinite storage is available in Snowflake and is not available in Redshift. Analytical tools for integration would be helpful in the future.

Coby Jefferson Gardner - PeerSpot reviewer
Data Consultant at Align BI

Amazon Redshift is a little more expensive than other products.

TT
Solution Architect at a tech services company with 51-200 employees

When compared to Snowflake, Amazon Redshift does not have the capability to dynamically increase compute size. Snowflake provides a virtual warehouse ('VW') that allows you to increase the size of the warehouse to run faster on a monthly basis without changing anything. This feature is not available in Redshift, so it's a limitation of Redshift.

It's not possible to immediately increase the warehouse size in Amazon Redshift the way we can increase the virtual warehouse size in Snowflake.

Liana Iuhas - PeerSpot reviewer
CEO at Quark Technologies SRL

Snowflake has a very good feature for cloning databases. It makes it easy to clone a data warehouse, which is useful. I would like to see this feature in Redshift.

Mikalai Surta - PeerSpot reviewer
Head of Big Data Department at IBA Group

The product must become a bit more serverless. Users should have to pay only for the resources they consume.

NJ
Sr BI and Data Engineer at Datacult

Redshift's serverless technology needs to improve because not everyone is technically inclined. Organizations want to quickly access and import data into their data warehouse without hassle.

Redshift's ETL tool, Glue, is not seamlessly integrated with Redshift. I've encountered many instances where it couldn't fetch the perfect data type from the source, which should be intuitive. Snowflake's ETL tool, on the other hand, is more intuitive and seamless.

Martin Gregor - PeerSpot reviewer
DWH, BI & Big Data consultant / developer /modeler - independent contractor at Freelancer

The explain panel in the Redshift database could be better.

MS
Service Manager & Solution Architect at a logistics company with 10,001+ employees

Performance when managing updates, deletes, and row-level changes is very low. For example, while you are doing inserts, updates, deletes, and merges, the performance is very, very poor.

If you want to query the database after you have a lot of terabytes of data, the load, performance-wise, is very low.

Looking at the performance of the query, querying the database, and especially with the merges when it is getting updated, it is really poor.

We like this solution and have tried all of the native services; they were working quite well. The only concern about Redshift was managing the cluster, especially the EMR cluster. Our company policy was not to use EMR clusters, especially with the nodes failing. There were many instances of downtime happening. Essentially, there was too much data traffic.

The other drawback was the CDC, as we do not have any tools that can support it.

Creating the structure is easy on the DDL side, but after you create the table and you want to transform the data to store it in a database, the performance is poor.

It takes a lot of time to ingest and update the data. After you ingest the data and someone wants to fetch it in the table, it takes a lot of time performance-wise to return the results.

Ved Prakash Yadav - PeerSpot reviewer
Senior Data Platform Manager at a manufacturing company with 10,001+ employees

In terms of improvement, I believe Amazon Redshift could work on reducing its costs, as they tend to increase significantly. Additionally, there are occasional issues with nodes going down, which can be problematic.

We often encounter issues like someone dropping a column or changing the order of columns, which can cause synchronization problems when pushing data through our pipeline. It's a minor issue, but it can be annoying.

MandarGarge - PeerSpot reviewer
V.P. Digital Transformation at e-Zest Solutions

I would like to see improvement in the pricing and the simplicity of using this solution.

AishwaryaKumar - PeerSpot reviewer
Solution Architect at Capgemini

We are using third-party tools to integrate with Amazon Redshift. AWS should create its own interface so that it can be connected easily within AWS itself.

AR
VP, Data and Insights at a tech company with 201-500 employees

There are physically too many pipelines for a company of this size to maintain. For a data scientist, it's very difficult to learn the data in all of these different environments. It's easier to train people in just one environment to start with, like Snowflake or Databricks. It's difficult to have so many technologies that are very comparable, and each comes with a price tag.

LourensWalters - PeerSpot reviewer
Senior Data Scientist at a tech services company with 51-200 employees

They should provide a structured way to work with interim data rather than storing it in Parquet files locally. Also, Redshift is unwieldy. There should be better integration between Python and Redshift. It could be more accessible without writing so much SQL.

They should make writing data frames to Redshift and reading them back easier. The performance could be better. I have used Redshift for extensive queries. For large tables, it's easier to unload the data from Redshift, since complex subqueries on those tables are slow. I have to use the UNLOAD command to unload the whole table, then read the table into a server with extensive memory and process the data further in Python. It's not optimal.

Syed Zakaulla - PeerSpot reviewer
Project Manager at Softway

I would like Amazon Redshift to improve its performance, analytics, scalability, and stability. Other than these points, I am not aware of any other areas to address since Amazon provides a variety of independent services for their customers to choose from, and if one were to express dissatisfaction with Amazon Redshift, Amazon would likely suggest AWS Glue as an alternative. Similarly, if another issue arose, Amazon might recommend Amazon RDS. There are a lot of things they try to upsell to you, each with its own pros and cons and in different packages offering different perks. So, it all depends on your business needs and what you choose for your business. I wouldn't criticize Amazon for this because they have created packages tailored to their customer's needs, which helps to prevent customers from looking elsewhere.

Kundan Amin - PeerSpot reviewer
Senior Consultant at Dynamic Elements AS

Redshift's GUI could be more user-friendly. It's easier to perform queries and all that stuff in Azure Synapse Analytics.

RR
Data Engineer at a tech services company with 51-200 employees

We recently moved from the DC2 cluster to the RA3 cluster, which is a different node type and we are finding some issues with the RA3 cluster regarding connection and processing. There is room for improvement in this area. We are in talks with AWS regarding the connection issues.

In an upcoming release, I would like to have a Snowflake-like feature where we can create another cluster in the same data warehouse, with the same data. You can create a different cluster and compute nodes for each of your use cases, for retail, and for your data analyst all while keeping your underlying data safe.

Additionally, the cluster resize process takes down the cluster for too long, approximately 15 minutes. There are also limitations on the size: you can resize only by a multiple of two. For example, if you have four nodes, you can either go up to eight nodes or come down to two nodes. There should be fewer limitations.
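
The doubling-or-halving constraint described in this review can be sketched in a few lines of Python. This is only an illustration of the rule as the reviewer states it, not a call into any Redshift API; the function name `allowed_resize_targets` is hypothetical.

```python
def allowed_resize_targets(current_nodes: int) -> list[int]:
    """Return the node counts reachable in one resize under the rule
    stated in the review: the cluster can only double or halve."""
    if current_nodes < 1:
        raise ValueError("current_nodes must be a positive integer")
    targets = [current_nodes * 2]
    # Halving is only possible when the node count divides evenly by two.
    if current_nodes % 2 == 0:
        targets.insert(0, current_nodes // 2)
    return targets


print(allowed_resize_targets(4))  # [2, 8]
```

Under this rule, a four-node cluster that really needs six nodes has to jump to eight, which is the inflexibility the reviewer is pointing at.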

PS
Business Analyst at Insphere Solutiona

What would make Amazon Redshift better is improving the pricing structure. For example, Acronis provides backups in cybersecurity, yet its pricing is a bit lower than Amazon Redshift's. I was comparing AWS, Amazon Redshift, Azure, GCP, and Acronis, and I found that Acronis has a lower price than the other solutions in the archiving space.

HS
Consultant at ANWB

The customer support could be more responsive. 

it_user396519 - PeerSpot reviewer
Director at a tech company with 1,001-5,000 employees

We would really like to see a few more connectors included that would enable connecting with other databases and services. We have faced some difficulties pulling data from Teradata and storing it in Redshift. There is no direct connector available between Teradata and Redshift.

AJ
Senior Solutions Architect at a retailer with 10,001+ employees

Primary key enforcement should be improved, but there is probably a reason that they don't follow the reference architecture: it would mean creating clones of the data across shards. Cost control measures could be improved, along with added transparency.

AG
Engineering Manager/Solution architect at a computer software company with 201-500 employees

This solution could be cheaper, but Amazon could constantly add new instances and decrease the size from previous ones, so it could be an AWS work model.

It would also be better if it can be integrated with non-AWS sources, e.g. additional open source connectors.

AA
Financial Performance Manager at a retailer with 1,001-5,000 employees

We need more AWS applications. They have some solutions from Amazon that can manage performance, however, we will need something that can also manage financial reporting, visualization, and analytics - instead of having to go to other solutions like Tableau or Power BI. If they can offer comparable options, we'd like to be able to choose all AWS solutions instead of other platforms. 

The refreshment rate of data reaching Redshift from other sources should be faster.

It could be more efficient during tasks such as data refreshing or uploading. 

MJ
Solutions Architect at a computer software company with 1,001-5,000 employees

Improvement could be made in the area of streaming data. The capability can definitely be improved. There are other products, like Kinesis, which is a separate service we use for streaming data ingestion. For whatever features are missing in Redshift, AWS has separate services, but if it were feasible to ingest real-time streaming data directly into Redshift, that would be very useful.

AA
Data Analyst at a tech vendor with 51-200 employees

I cannot state which features of the solution are in need of improvement, since those which we make use of have not changed.

The solution has four maintenance windows so, when it comes to stability, I think it would be better to decrease their number.

I rate the solution as an eight out of ten because it is not 100 percent, as there is much service-related maintenance required. It would be nice to see support for the usual OLTP features, such as those involving transactional processing.

MS
CEO at Screenit Labs Pvt Ltd

We have had some challenges with respect to the high-availability architecture for production. We don't find many issues now, but initially, we had some challenges.

This is an older product, so when it comes to usability, it requires a technical person to work with it. It requires a specialist and a good business case to work on it. It has to be a little more user-friendly than what it is today.

In our experiments, the handling of unstructured data was not very smooth.

it_user653898 - PeerSpot reviewer
Manager Data Services at a logistics company with 201-500 employees

It would be nice if we could turn off an instance. However, it would retain the instance in history, thus allowing us to restart without beginning from scratch.

EM
DBA at Kimetrics

Redshift is a multi-tier engine that works like a calculator. There is some missing functionality, and sometimes it's very difficult to work in. We need to convert these functionalities using VACUUM inside Amazon Redshift, and that causes some complexity. Sometimes I'd like for them to support some special features or some special installations because we need automatic populations. I would like to see more programming outside of the cloud. I would like to see more functionality for JSON files. The only functionality that they have now with JSON is reports. I would also like to see other data sources, like MongoDB.

MP
Data Scientist at a tech vendor with 51-200 employees

Specifically, with Redshift Spectrum, some SQL commands have to run on Redshift instead of Redshift Spectrum, which slows down a few things in the tool. In the solution, user-based access is quite hard. In general, certain permissions are difficult to manage. The solution is expensive compared to Snowflake. I have used Snowflake because they're cheaper than Amazon Redshift. Some of the commands are run on Redshift itself, and some of the commands are completed by Spectrum, which is problematic. However, it is faster when computed by Spectrum.

AN
Manager at Protiviti

The technical support should be better in terms of their knowledge, and they should be more customer-friendly.

TD
Cloud & Data - practice leader at Micropole Belgium

I would like a better way to ingest data in real time because there is a bit too much latency.

There are too many limitations with respect to concurrency. It is now possible to auto-scale it, although that is still slow.

It could offer smaller nodes with decoupling of storage and processing because for the moment, the only nodes available to work that way are huge, and for large companies.

MS
Chief Information Officer at Sensilab

Compatibility with other products, for example, Microsoft and Google, is a bit difficult because each one of them wants to be isolated with their solutions. That's a big problem now.

SN
Chief Executive Officer at Ampcome

The OLAP slice-and-dice features need to be improved. For example, if a business wants to bring in a general ledger from an ERP, they want to slice and dice the data. What we have found is that they have a lot of formulas that are used to calculate metrics, so what we do is use SQL Server Analysis Services. The question then becomes one of adopting a single vendor and transitioning to Azure. If Redshift had similar capabilities then it would be very good.

Padmanesh NC - PeerSpot reviewer
Big Data Solution Architect - Spatial Data Specialist at SCIERA, INC

Of course, every product has pluses and minuses. From that perspective, Amazon Redshift has some issues with snapshot restoring when we handle huge datasets. When our snapshot size is really huge, like 20 TB+, we are forced to wait a long time to get it restored. This is reasonable, as they need to transfer the entire dataset to the cluster.

My thought on this issue is that Amazon has their own data centers and they are connecting each region of storage through Direct Connect. The input and output network data transfer might not be a complex thing. For example, over a 10 Gbps network link, 1 TB can in theory be transferred in roughly 13 minutes, but that's not happening now. To restore 1 TB of data, it takes more than 30-40 minutes.
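
As a back-of-the-envelope check on the transfer-time figure, the ideal time is simply the number of bits divided by the link rate. The sketch below assumes a decimal terabyte (10^12 bytes), a single fully utilized 10 Gbps link, and no protocol or storage overhead; the helper name `transfer_seconds` is made up for this illustration.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time: total bits divided by the link rate (no overhead)."""
    return size_bytes * 8 / link_bits_per_second


ONE_TB = 1e12      # decimal terabyte, in bytes (assumption)
TEN_GBPS = 10e9    # 10 gigabits per second

seconds = transfer_seconds(ONE_TB, TEN_GBPS)
print(f"{seconds:.0f} s, about {seconds / 60:.1f} min")  # 800 s, about 13.3 min
```

Even under these ideal assumptions, 1 TB needs about 13 minutes on one 10 Gbps link, so the 30-40 minute restore the reviewer reports is roughly two to three times the theoretical minimum rather than an order of magnitude slower.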

it_user576444 - PeerSpot reviewer
Rails Developer at a recruiting/HR firm with 51-200 employees

While it's probably the best product in its category (managed SQL-based data warehouse at scale), it has a few shortcomings, although very few.

The main issue people complain about, and I agree with the claim, is that it's hard to load your data into it. You first need to export your data to S3 as CSV, JSON, or Avro. Then you can load it into Redshift. Even then, you have to make sure your data is properly formatted. (You can use the COPY options TRUNCATECOLUMNS, to load fields that are too big, and MAXERROR, to allow for a given number of errors while loading.) In general, ETL and data cleaning are a hurdle in data engineering, and Redshift suffers from it.
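
To make the two COPY options mentioned above concrete, here is a minimal Python sketch that only assembles the SQL text for a CSV load; it does not connect to or execute anything against Redshift. The helper name `build_copy_statement`, the table name, the bucket path, and the IAM role ARN are all placeholders invented for the example, and the values are interpolated without any sanitizing.

```python
def build_copy_statement(table: str, s3_uri: str, iam_role_arn: str,
                         truncate_columns: bool = False,
                         max_errors: int = 0) -> str:
    """Assemble a Redshift COPY statement for CSV files staged in S3."""
    clauses = [
        f"COPY {table}",
        f"FROM '{s3_uri}'",
        f"IAM_ROLE '{iam_role_arn}'",
        "FORMAT AS CSV",
    ]
    if truncate_columns:
        # Truncate over-long character data to fit the column instead of failing.
        clauses.append("TRUNCATECOLUMNS")
    if max_errors > 0:
        # Tolerate up to this many bad rows before the load is aborted.
        clauses.append(f"MAXERROR {max_errors}")
    return "\n".join(clauses) + ";"


statement = build_copy_statement(
    table="events",                                   # hypothetical table
    s3_uri="s3://example-bucket/exports/events/",     # hypothetical path
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    truncate_columns=True,
    max_errors=10,
)
print(statement)
```

Both options trade strictness for convenience: truncated fields and skipped rows load without complaint, so they are best paired with a check of the load error log afterwards.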

it_user583371 - PeerSpot reviewer
BI Architect at a comms service provider with 5,001-10,000 employees

Some improvements can be made in the following areas:

Restore table:

Right now, the feature only permits bringing a table back into the same cluster, based on the snapshot taken. I would like a similar option to move data across different clusters, using the snapshots to bring the data into whichever cluster I need. At present, I have to UNLOAD from cluster A and then COPY into cluster B. Maybe the current design cannot support this, because it is based on nodes and data distribution.

But our real scenario is this: if we lose the data and need to recover it in another cluster, we have to:

1) Restore the table in the current cluster under a different name

2) Unload the data to S3

3) Copy the data to the new cluster. When we are talking about billions of records, this is complex to do.
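
Steps 2 and 3 of this workaround can be sketched as a pair of SQL statements, built here as Python strings without executing them. Step 1 (the snapshot-based table restore) is done through the cluster's restore feature and is not shown. The helper names, table names, S3 prefix, and IAM role ARN are hypothetical, and Parquet is just one possible interchange format chosen for the example.

```python
def build_unload_statement(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """UNLOAD wraps the query in single quotes, so quotes inside the
    query have to be doubled."""
    escaped_query = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


def build_copy_statement(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """COPY the unloaded Parquet files into a table on the target cluster."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


ROLE = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"  # placeholder
PREFIX = "s3://example-bucket/recovery/orders_"              # placeholder

# Run on cluster A, against the table restored under a different name.
unload_sql = build_unload_statement(
    "SELECT * FROM orders_restored WHERE region = 'EU'", PREFIX, ROLE)

# Run on cluster B, where the target table must already exist.
copy_sql = build_copy_statement("orders", PREFIX, ROLE)

print(unload_sql)
print(copy_sql)
```

The round trip through S3 is what makes the procedure slow and operationally heavy at billions of rows, which is the reviewer's complaint.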

Vacuum process:

The vacuum needs to be segmented. For example, after 24 hours of execution, I had to cancel the process and 0% was sorted (big table).

For big tables (billions of records), if the table is 100% unsorted, the vacuum can take more than 24 hours. If we don't have this timeframe, we have to work around it by moving the data out to additional tables and running the vacuum in batches on the main table.

The reason is that if I run the vacuum directly on the main table and stop it after 5 hours, 0 records will be sorted. I would like to run the vacuum on the main table, stop it when I need to, and still keep the records vacuumed so far, like an incremental process.

it_user576441 - PeerSpot reviewer
Senior Software Engineer [Redshift Programmer] at a tech services company with 1,001-5,000 employees

In most scenarios, the data source for Redshift will be a traditional RDBMS like MySQL, PostgreSQL, SQL Server, etc. After migrating to Redshift, we find a few disconnects with respect to data types, stored procedures, and other complex functionality. There is a need for improvement in these aspects, mainly in the scope of data types and some complex functionality that we can perform in an RDBMS.

AJ
Consultant at a tech services company with 51-200 employees

Amazon Redshift could improve the user interface support.

it_user576456 - PeerSpot reviewer
Manager BI Development at a comms service provider with 1,001-5,000 employees

Amazon should bring more SQL functions that are required in data warehouse implementations. It lacks SQL functions for complex data processing. A very small example is recursive queries. However, Amazon is developing the product at a fast pace and bringing new features with every release.
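
For readers unfamiliar with the term, a recursive query is one that repeatedly joins a result set back to itself, for example to walk a hierarchy. The sketch below shows the idea using Python's built-in SQLite module as a stand-in engine; it is not Redshift, the table and data are invented, and Redshift's own support and syntax should be checked against its current documentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employee VALUES
        (1, 'Ana', NULL),
        (2, 'Ben', 1),
        (3, 'Cy', 2),
        (4, 'Dee', 1);
""")

# Start from the root (no manager), then repeatedly add direct reports.
rows = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM employee WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, c.depth + 1
        FROM employee AS e
        JOIN chain AS c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, id
""").fetchall()

print(rows)  # [('Ana', 0), ('Ben', 1), ('Dee', 1), ('Cy', 2)]
```

Without this construct, the same result requires one self-join per hierarchy level or a loop in application code, which is the gap the reviewer is describing.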

it_user705738 - PeerSpot reviewer
Senior Solutions Engineer, West at a tech vendor with 5,001-10,000 employees

There are challenges with dealing with character set mismatches. Migrating data from other data sources can be challenging when you are working with multibyte character sets.

NW
BI Manager at jfrog

It lacks a few features that would be very useful, such as stored procedures. Also, one needs to run VACUUM in order to manage this DB. It would be nice not to have to worry about that and have it managed automatically.

it_user572622 - PeerSpot reviewer
BI Architect & Developer (contract) at a retailer with 501-1,000 employees

I'd like to see features from other platforms, such as SQL Server's columnstore index, arrive in Redshift.

it_user1256502 - PeerSpot reviewer
Senior System Engineer at a computer software company with 10,001+ employees

Pricing is one of the concerns that I have because, if you compare Snowflake with Redshift, Snowflake provides some of the same services at a much cheaper rate. So pricing is one of the things that Redshift could improve. It should be more competitive.

Otherwise, everything else looks good, especially the data storage and analytical processes.

it_user149223 - PeerSpot reviewer
Senior Engineer, Big-Data/Data-Warehousing at a manufacturing company with 501-1,000 employees

Integrating database security/access rights with AWS IAM would be great. I would also like to see more DML features that might aid in processing unstructured or log-file data. This would allow us to avoid having to use EMR/Hadoop.

it_user869871 - PeerSpot reviewer
Principal Consultant at Inawisdom
  • It would be nice if it was a bit cheaper to retain data long-term. 
  • Should be made available across zones, like other Multi-AZ solutions.
it_user576450 - PeerSpot reviewer
Data Science Lead at a tech services company with 51-200 employees

I would like to see improvements in the database integrations. Currently, Amazon does not provide real-time/near real-time integration with other products like RDS or DynamoDB out-of-the-box.

We need to either build the integrations ourselves, or rely on third-party services which are not always the best.

it_user1135503 - PeerSpot reviewer
Business Analyst at a tech services company with 201-500 employees

It would be useful to have an option where all of the data can be queried at once and then have the result shown. As it is now, when we run a query and we are looking at the results, part of the data remains to be processed at the back end. That works very well, but in some cases, we require the whole data to be queried at once and then have the results shown. We have not faced many use cases where it would have been useful, but in one or two, we used other methods to achieve this goal.

When our clients contact customer support, they don't want to speak with a machine. Instead, they want to chat with a real person who can provide a solution. Customer service bots can provide solutions but they cannot understand our problems.

it_user689532 - PeerSpot reviewer
Full Stack Engineer at a tech services company with 11-50 employees

Query compilation time needs a lot of improvement for cases where you are generating queries dynamically. Also, it would help tremendously to have some more user-friendly, query optimization helper tools.

NW
BI Manager at jfrog

In the next release, a pivot function would be a big help. It could save a lot of time creating a query or process to handle operations.
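
Without a pivot function, turning row values into columns is typically done by hand with conditional aggregation, one `CASE` expression per output column. The sketch below illustrates that manual pattern using Python's built-in SQLite module as a stand-in engine; the table and data are invented, and this is not Redshift-specific syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('east', 'Q1', 100), ('east', 'Q2', 150),
        ('west', 'Q1', 80),  ('west', 'Q2', 120);
""")

# One CASE expression per pivoted column; every new quarter means editing the query.
rows = conn.execute("""
    SELECT region,
           SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
           SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2
    FROM sales
    GROUP BY region
    ORDER BY region
""").fetchall()

print(rows)  # [('east', 100, 150), ('west', 80, 120)]
```

The query grows with every distinct value that has to become a column, which is the time cost the reviewer hopes a built-in pivot would remove.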

AK
Consultant at kulki data management & consultants

Amazon should provide more cloud-native tools that can integrate with Redshift like Microsoft's development tools for Azure. 

AD
Senior Software Engineer at a tech services company with 51-200 employees

Running parallel queries results in poor performance and this needs to be improved.

SV
Head of Analytics at a tech services company with 10,001+ employees

The speed of the solution and its portability need improvement.
