We are primarily using it for batch processing and transformations.
Data solution architect at a pharma/biotech company with 5,001-10,000 employees
Excellent scalability, with valuable features, and profitable return on investment
Pros and Cons
- "The most valuable features currently are Glue Studio, jobs, and triggers."
- "I would like to see stable libraries; at the moment, they are not there."
What is our primary use case?
How has it helped my organization?
We have a large set of data on which we perform transformations and identification. We clean and transform the data, and then we put it into the destination table. It is very comfortable to work with.
What is most valuable?
The most valuable features currently are Glue Studio, jobs, and triggers.
What needs improvement?
I would like to see stable libraries; at the moment, they are not there.
Buyer's Guide
AWS Glue
May 2025

Learn what your peers think about AWS Glue. Get advice and tips from experienced pros sharing their opinions. Updated: May 2025.
852,764 professionals have used our research since 2012.
For how long have I used the solution?
I have been using AWS Glue for the past five years.
What do I think about the stability of the solution?
In terms of stability, I would consider it to be an extension of Apache Spark.
What do I think about the scalability of the solution?
The scalability is good and we have three hundred projects we are working with.
Which solution did I use previously and why did I switch?
Previously, we used EMR, Informatica, Data Pipeline, and Azure Data Factory.
How was the initial setup?
The initial setup is straightforward.
What about the implementation team?
We did our deployment in-house with the CI/CD integrations like GitHub and deployed the code on Glue.
What was our ROI?
We are seeing a very good return on our investment.
What's my experience with pricing, setup cost, and licensing?
The current cost is around forty to fifty thousand a month.
What other advice do I have?
I would definitely recommend using AWS Glue for batching procedures. I would rate AWS Glue an eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.

Sr. Data Engineer at a tech services company with 5,001-10,000 employees
An event-driven, serverless computing platform that is flexible, powerful, and customizable
Pros and Cons
- "I like that it's flexible, powerful, and allows you to write your own queries and scripts to get the needed transformations."
- "It would be better if it were more user-friendly. The interesting thing we found is that it was a little strange at the beginning. The way Glue works is not very straightforward. After trying different things, for example, we used just the console to create jobs. Then we realized that things were not working as expected. After researching and learning more, we realized that even though the console creates the script for the ETL processes, you need to modify or write your own script in Spark to do everything you want it to do. For example, we are pulling data from our source database and our application database, which is in Aurora. From there, we are doing the ETL to transform the data and write the results into Redshift. But what was surprising is that it's almost like whatever you want to do, you can do it with Glue because you have the option to put together your own script. Even though there are many functionalities and many connections, you have the opportunity to write your own queries to do whatever transformations you need to do. It's a little deceiving that some options are supposed to work in a certain way when you set them up in the console, but then they are not exactly working the right way or not as expected. It would be better if they provided more examples and more documentation on options."
What is our primary use case?
We used AWS Glue to build our data warehouse. We built prototypes that go all the way across the warehouse platform: from AWS Glue to Spreadsheets and then QuickSight. That's how we're building the warehouse.
What is most valuable?
I like that it's flexible, powerful, and allows you to write your own queries and scripts to get the needed transformations.
What needs improvement?
It would be better if it were more user-friendly. The interesting thing we found is that it was a little strange at the beginning. The way Glue works is not very straightforward. After trying different things, for example, we used just the console to create jobs. Then we realized that things were not working as expected. After researching and learning more, we realized that even though the console creates the script for the ETL processes, you need to modify or write your own script in Spark to do everything you want it to do.
For example, we are pulling data from our source database and our application database, which is in Aurora. From there, we are doing the ETL to transform the data and write the results into Redshift. But what was surprising is that it's almost like whatever you want to do, you can do it with Glue because you have the option to put together your own script. Even though there are many functionalities and many connections, you have the opportunity to write your own queries to do whatever transformations you need to do.
It's a little deceiving that some options are supposed to work in a certain way when you set them up in the console, but then they are not exactly working the right way or not as expected. It would be better if they provided more examples and more documentation on options.
For how long have I used the solution?
I have been using AWS Glue since last year.
What other advice do I have?
On a scale from one to ten, I would give AWS Glue a nine.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data engineer at nust
Better than other tools for ETL jobs, but needs better documentation
Pros and Cons
- "AWS Glue is considerably better than other tools, but you have to learn it properly before you start using it."
- "While working on AWS Glue, I could not find any training material for it."
What is our primary use case?
I constructed a straightforward ETL job using AWS Glue, wherein I had to load a couple of files in the Teradata database.
What is most valuable?
AWS Glue is considerably better than other tools, but you have to learn it properly before you start using it.
What needs improvement?
While working on AWS Glue, I could not find any training material for it. Although it's not a problem with the product, the solution could include better documentation.
For how long have I used the solution?
I have been using AWS Glue for about two months.
What do I think about the stability of the solution?
AWS Glue is a stable solution.
How was the initial setup?
AWS Glue's initial setup is quite straightforward.
What other advice do I have?
Overall, I rate AWS Glue a seven out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Engineer at a consultancy with self employed
Easy to use, simple configurations, and good documentation
Pros and Cons
- "The most valuable feature of AWS Glue is its ease of use and good documentation. Additionally, we can do all the transformations that we need."
- "The price of the solution could improve."
What is our primary use case?
We are using AWS Glue to transform firewall logs synced to the data lake in the bronze zone. The ETL uses the solution to transform fields in the silver layer, and later we will produce the gold zone. We are using the Delta Lake architecture.
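As an illustration of the bronze, silver, and gold zones mentioned above, here is a minimal sketch in plain Python. It is not the reviewer's actual job and uses no Glue or Delta Lake APIs. The firewall event fields (`ts`, `src`, `action`, `bytes`) and the function names `to_silver` and `to_gold` are assumptions made up for the example.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical raw firewall events as they might land in the bronze zone.
BRONZE = [
    {"ts": "2024-01-05T10:00:00Z", "src": " 10.0.0.1 ", "action": "DENY", "bytes": "120"},
    {"ts": "2024-01-05T10:00:05Z", "src": "10.0.0.2", "action": "allow", "bytes": "300"},
    {"ts": "2024-01-05T10:00:09Z", "src": "10.0.0.1", "action": "deny", "bytes": ""},
    {"ts": "not-a-timestamp", "src": "10.0.0.3", "action": "deny", "bytes": "50"},
]


def to_silver(rows):
    """Clean and type bronze rows; rows with unparseable timestamps are dropped."""
    silver = []
    for row in rows:
        try:
            ts = datetime.strptime(row["ts"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # skip rows that cannot be typed
        silver.append({
            "ts": ts,
            "src": row["src"].strip(),
            "action": row["action"].lower(),
            "bytes": int(row["bytes"]) if row["bytes"] else 0,
        })
    return silver


def to_gold(rows):
    """Aggregate silver rows into denied-connection counts per source address."""
    return dict(Counter(r["src"] for r in rows if r["action"] == "deny"))


silver = to_silver(BRONZE)
gold = to_gold(silver)
print(gold)
```

In a real pipeline each zone would typically be a separate table or storage prefix, with one job per hop. The sketch only shows how the data changes shape from one layer to the next.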
What is most valuable?
The most valuable feature of AWS Glue is its ease of use and good documentation. Additionally, we can do all the transformations that we need.
What needs improvement?
The price of the solution could improve.
For how long have I used the solution?
I have been using AWS Glue for approximately one month.
What do I think about the stability of the solution?
The stability of AWS Glue is good.
What do I think about the scalability of the solution?
AWS Glue is highly scalable.
There are dozens of customers using this solution.
How are customer service and support?
I have not used the support from AWS Glue but I know their support is good.
Which solution did I use previously and why did I switch?
I have previously used Azure and Spark for testing.
How was the initial setup?
The initial setup of AWS Glue is simple. In other solutions, such as Spark, the configuration would take a lot longer.
What about the implementation team?
I did the deployment of AWS Glue myself with the AWS console. I am a data engineer.
What's my experience with pricing, setup cost, and licensing?
The overall cost of AWS Glue could be better. It costs approximately $1,000 a month. There is paid support available from AWS Glue.
If AWS Glue cost 50 percent less, we would not move to another solution.
What other advice do I have?
I am moving to EMR Serverless or a GCP solution.
I rate AWS Glue a nine out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
ECM CONSULTANT/ARCHITECT/SOFTWARE DEVELOPER, DELUXE MN at a tech services company with 5,001-10,000 employees
Easy to perform ETL on multiple data sources, and easy to use after you learn it
Pros and Cons
- "Glue is a NoSQL-based data ETL tool that has some advantages over IIS and ISAs."
- "There is a learning curve to this tool."
What is our primary use case?
Glue is a NoSQL-based data ETL tool that has some advantages over IIS and ISAs. It is tailored and customized to use with SQL Server, which works very well in that platform.
If you want to use other data sources, the NoSQL concept makes it very easy, because missing data can be inserted as a new column or with null values.
That is not the case with many other tools. If you have on-premises tools, such as IIS, they don't manage missing data well.
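The idea of tolerating missing data can be sketched in plain Python. This is an illustration only, with hypothetical records and a made-up helper name, `unify`; it is not Glue's API. A field seen for the first time becomes a new column, and records that lack a field get a null (`None`) in that position.

```python
def unify(records):
    """Return (columns, rows) where every row has every column; gaps become None."""
    columns = []
    for rec in records:
        for key in rec:
            if key not in columns:
                columns.append(key)  # a field seen for the first time becomes a new column
    rows = [{col: rec.get(col) for col in columns} for rec in records]
    return columns, rows


# Hypothetical source records with inconsistent fields.
records = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta", "region": "eu"},
    {"id": 3},
]
columns, rows = unify(records)
for row in rows:
    print(row)
```

A tool with a fixed schema would reject or truncate the second and third records, while this approach widens the schema and fills the gaps.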
What is most valuable?
If you want extremely high-performance functionality, you have to use either AWS Glue or Data Lake to store it in some temporary table. First, you will have to do some cleaning of the data, then if you need performance and speed, you have to use IIS with an IBM tool.
You have to use the right tool in the right places. For example, if you're using Oracle, you have got to use the Oracle tools. If you are using SQL, you have to use the SQL tools. There is no other tool that provides the performance.
It's context-based and project-based. In the projects that I have used, it has worked well.
What needs improvement?
There is a learning curve to this tool.
For how long have I used the solution?
I have been working with AWS Glue for four years.
Everything runs on AWS, even if it belongs to a third party. For example, if you have a Netflix subscription, it runs on AWS. We have other products or vendor subscriptions that run on AWS.
What do I think about the stability of the solution?
Undoubtedly, the cloud is built to handle failure. If you have your devices, and your resources configured correctly, you won't have any issues. I haven't seen a problem.
How are customer service and support?
You have to pay for their technical support, and depending on which level of subscription, you will receive a call within an hour; otherwise, you will have to wait for days.
Which solution did I use previously and why did I switch?
We also use Azure's Data Lake, and I worked with TIBCO in the past, though it's been a few years since we used it.
You should select the best tool for the job or the projects that are currently being worked on. TIBCO was heavily used in the previous project we worked on.
How was the initial setup?
It takes some time to learn, but once you get the hang of it, you'll be fine. It's like any other IT tool, where nobody is an expert or isn't an expert, it is just the way you are exposed to a tool.
You've chosen the right tool if you understand how the data works and what it needs to do. It's like going to Home Depot to get the right tool. You can purchase a set of tools, and it will work for you, but you will still need to purchase something else.
It's one of those tools in which someone must be an expert. After that, all tools and platforms become secondary.
What's my experience with pricing, setup cost, and licensing?
With AWS Glue, you pay more, but if you want to process the data, with speed and performance, you need the correct EC2 instances.
There is a price to pay. It doesn't come free.
Technical support is a paid service, and which subscription you have is dependent on that. You must pay one of them, and it ranges from $15,000 to $25,000 per year.
You sign up for a level of service, and it does not come for free. As previously stated, everything is based on performance, ELAs.
It was very expensive, at that time. If a company wants to pay the money, it makes my job easier. However, if the company or enterprise does not have the funds to pay for it, then it is a hassle.
What other advice do I have?
In that environment, there is a lot going on. There are some things that you can get for free, and there are some add-ons that you can develop or use that have been tested. It's all about convenience and service. You will get what you pay for if you pay for what you want.
I'm not a fan of any tools; it all depends on the organization I work for, where their data is, what they want to do with it, how quickly they want to get there, and what their budget is, and you work around that. For me, I would not choose one over the other, unless I know the details of the project.
I would rate AWS Glue a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Engineer at GISbiz
It efficiently collects and catalogs the data but needs to improve performance
Pros and Cons
- "It is a stable and scalable solution."
- "It fails to handle massive databases acquired from various sources."
What is our primary use case?
We use the solution to collect customers' data containing multiple files and convert it into a common database. Later, we send the database for SQL ingestion.
What is most valuable?
The solution's most valuable feature is its ability to efficiently collect and catalog the data in the warehouse.
What needs improvement?
They should improve the solution's performance with large amounts of data. Currently, AWS fails to handle massive databases acquired from various sources. Also, it is challenging to queue the data or use a standard code in the AWS environment. We need to install a third-party tool to tackle the issue. We need to use another tool to convert the data as well. Thus, we are using multiple tools to handle the database. They should work on this particular area.
For how long have I used the solution?
We have been using the solution for one year.
What do I think about the stability of the solution?
It is a stable solution. I rate its stability as an eight.
What do I think about the scalability of the solution?
I rate the solution's scalability as a six.
How was the initial setup?
The initial setup is a bit complex, and I rate the process as a six. We have to install multiple third-party tools whenever we update the security patches or renew the solution. Thus, the deployment process is complicated.
What other advice do I have?
If you already have an AWS environment, you can opt for AWS Glue for its ETL operations feature. If you want to process multiple operations, such as creating a table or catalog, or for machine learning purposes, it is better to go for other database tools.
I rate the solution as a seven.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer:
Principal System Architect at a transportation company with 1,001-5,000 employees
Used for data engineering ETL jobs to extract, transform, and load data
Pros and Cons
- "The solution’s most valuable feature is the ETL job."
- "The solution’s technical support could be improved."
What is our primary use case?
AWS Glue is essentially used for data engineering ETL jobs to extract, transform, and load data. We use it to clean data. You have multiple data sources from your application that are not so clean. You have this data and may want to delete certain columns or fill in certain data in an Excel sheet. That's where the extract part comes in. Then, you transform, drop, or make the data uniform and load it to your destination like a data warehouse.
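The extract, transform, and load steps described above can be sketched with the Python standard library. This is a minimal illustration under assumed inputs, not a Glue job. The sample CSV, the column names, the `DROP` and `DEFAULTS` settings, and the in-memory `warehouse` list standing in for a data warehouse are all hypothetical.

```python
import csv
import io

# Hypothetical, not-so-clean source data.
RAW = """id,name,country,internal_note
1, Alice ,us,x
2,Bob,,y
3,carol,DE,z
"""

DROP = {"internal_note"}            # columns to delete
DEFAULTS = {"country": "UNKNOWN"}   # values used to fill in empty fields


def extract(text):
    """Read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop unwanted columns, fill empty values, and make formatting uniform."""
    out = []
    for row in rows:
        clean = {k: v.strip() for k, v in row.items() if k not in DROP}
        for col, default in DEFAULTS.items():
            if not clean[col]:
                clean[col] = default
        clean["name"] = clean["name"].title()
        clean["country"] = clean["country"].upper()
        out.append(clean)
    return out


def load(rows, destination):
    """Append rows to the destination and return how many were loaded."""
    destination.extend(rows)
    return len(rows)


warehouse = []  # stand-in for the destination table
loaded = load(transform(extract(RAW)), warehouse)
print(loaded, warehouse)
```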
What is most valuable?
The solution’s most valuable feature is the ETL job. AWS Glue is an easy-to-use solution. AWS Glue integrates seamlessly with other AWS services like Athena, Redshift, and S3.
What needs improvement?
The solution’s technical support could be improved.
For how long have I used the solution?
I have been using AWS Glue for a few months.
What do I think about the stability of the solution?
AWS Glue is a stable solution.
I rate the solution’s stability eight and a half out of ten.
What do I think about the scalability of the solution?
In the future, our data sets are going to increase. For now, the solution's scalability is fine.
Which solution did I use previously and why did I switch?
I previously used Data Pipeline, and I tried using Lambda.
How was the initial setup?
The solution’s initial setup is easy.
What other advice do I have?
AWS Glue is built for large datasets, and it does the job perfectly. I would recommend the solution to other users.
Overall, I rate the solution eight and a half out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Sep 12, 2024
Consultant - Business Operations at a computer software company with 10,001+ employees
Transformations are valuable for modifying complex data but rely too heavily on code
Pros and Cons
- "Transformations are valuable because you can modify or override complex data logic from an open source or Spark to solve issues."
- "The setup and installation are a bit complex without advanced knowledge or training."
What is our primary use case?
Our company uses the solution for ETL data movement for our customers such as on-premises to cloud, cloud to cloud, and cloud to Snowflake. We also data catalog and schedule ETL jobs. We are able to monitor all jobs through AWS services.
What is most valuable?
Transformations are valuable because you can modify or override complex data logic from an open source or Spark to solve issues.
For example, it is easy to solve issues where volume is good but performance is degrading because you can split jobs into small chunks to more quickly handle data loads.
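Splitting one large load into smaller chunks might look like the following sketch. It is plain Python, and `run_job` is a hypothetical stand-in for a single small job run, not a Glue call. The chunk size of 4 is arbitrary.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run_job(batch):
    # Hypothetical stand-in for one small job run; here it just sums the batch.
    return sum(batch)


# Process ten records as three small jobs instead of one large one.
results = [run_job(batch) for batch in chunked(range(10), 4)]
print(results)
```

Each chunk can then be scheduled, retried, or run in parallel on its own, which is where the performance benefit would come from in practice.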
What needs improvement?
The setup and installation are a bit complex without advanced knowledge or training. It would be easier for an AWS expert or someone in DevOps.
Transformations need improvement to be more user-friendly and rely less on coding, as Matillion does.
For how long have I used the solution?
I have been using the solution for three years.
What do I think about the stability of the solution?
The solution's stability is decent and rates higher than other products. It works well with Snowflake, Azure, GCP, and AWS-supported products.
A hybrid situation may cause delays in performance.
What do I think about the scalability of the solution?
The solution is scalable.
How are customer service and support?
One of our customers used technical support and found them to be helpful.
How was the initial setup?
The setup and installation are a bit complex. Training or advanced knowledge is required. Someone with AWS experience or a DevOps perspective would have fewer issues.
What about the implementation team?
We install the solution for customers and the timeline depends on the job.
A complete project will take a few days to a week for deployment. The number of jobs and components determines how many technicians are required for setup, installation, and deployment. Technician requirements can range from two to fifteen.
Deployment will take a couple of hours for a few announcement jobs that deploy from the CI/CD pipeline.
Which other solutions did I evaluate?
The solution is my second choice because I prefer Snowflake's capabilities.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
