Senior Software Developer at a computer software company with 10,001+ employees
Provides serverless mechanism, easy data transformation and automated infrastructure management
Pros and Cons
- "We no longer had to worry much about infrastructure management because AWS Glue is serverless, and Amazon takes care of the underlying infrastructure."
- "In terms of performance, if they can further optimize the execution time for serverless jobs, it would be a welcome improvement."
What is our primary use case?
I had source data that was unstructured and non-fixable, and my responsibility was to convert it into structured data. I wrote the conversion in Python with PySpark, building a data frame inside a Glue job. Because Glue jobs are serverless, I simply deployed my code to the Glue job, and that's how I got the job done.
How has it helped my organization?
We no longer had to worry much about infrastructure management because AWS Glue is serverless, and Amazon takes care of the underlying infrastructure. This allowed us to focus on the code and application logic without concerns about scaling, CPU management, or handling fluctuations in flow. The serverless nature of Glue jobs relieved us from these infrastructure-related worries.
What is most valuable?
The most valuable feature of AWS Glue for me is the ability to write PySpark code and execute it within Glue. I use Glue functions to publish parameters and interact with Glue.
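To make the workflow concrete, here is a minimal sketch of the kind of unstructured-to-structured conversion this review describes. The raw line format, the field names, and the `parse_line` and `to_records` helpers are assumptions made for illustration; they are not the reviewer's data or code. The parsing step is plain Python so it runs anywhere, and the closing comment shows where a Glue job would typically hand the rows to Spark.

```python
import re
from typing import Optional

# Hypothetical raw format: "<date> <LEVEL> <free-text message>"
LINE_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(INFO|WARN|ERROR)\s+(.*)$")


def parse_line(line: str) -> Optional[dict]:
    """Turn one raw line into a structured record, or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    date, level, message = match.groups()
    return {"date": date, "level": level, "message": message}


def to_records(lines):
    """Parse every line and keep only the ones that matched the pattern."""
    return [rec for rec in map(parse_line, lines) if rec is not None]


raw_lines = [
    "2024-01-05 ERROR disk quota exceeded",
    "garbage that cannot be parsed",
    "2024-01-06 INFO nightly load finished",
]
records = to_records(raw_lines)

# Inside a Glue job (assumption: a SparkSession named `spark` is available),
# the structured rows could then become a data frame:
#   df = spark.createDataFrame(records)
```

Keeping the parsing logic in an ordinary function like this also makes it testable outside Glue before the job is deployed.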
What needs improvement?
In terms of performance, if they can further optimize the execution time for serverless jobs, it would be a welcome improvement. Faster code execution would be beneficial. If AWS could enhance the serverless execution capabilities, like increasing CPU, RAM, and processing speed, that would be great.
Buyer's Guide
AWS Glue
August 2025

Learn what your peers think about AWS Glue. Get advice and tips from experienced pros sharing their opinions. Updated: August 2025.
865,295 professionals have used our research since 2012.
For how long have I used the solution?
I have been using AWS Glue for a year.
What do I think about the stability of the solution?
In terms of stability, I haven't experienced any issues with AWS Glue. It has been quite reliable in my usage, and I would rate it around nine on a scale of ten.
What do I think about the scalability of the solution?
AWS Glue is highly scalable and reliable. The serverless nature ensures that scaling is automatically managed by AWS. It's one of the strengths of Glue, and there is no doubt about its scalability.
How are customer service and support?
The customer service and support were responsive and helpful in resolving the issues I had at that time. The response time was quick. So, overall, the support was okay.
How would you rate customer service and support?
Positive
What about the implementation team?
Our admin team was responsible for setting up the solution.
What's my experience with pricing, setup cost, and licensing?
The pricing is a bit higher than that of EMR, which is a managed Hadoop and Spark platform.
Which other solutions did I evaluate?
I found that AWS Lambda could be a possible alternative, but since AWS Lambda has a 15-minute execution limit, we opted for AWS Glue, which is not bound by that limit. I don't see any other service that can do what AWS Glue does.
Other services that could potentially do the same job as AWS Glue would require your application to be deployed in Elastic Beanstalk, EC2, or Container Services, which is another task altogether. Since AWS Glue is already doing its job perfectly, I don't think any other service can do the same job as AWS Glue.
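The Lambda-versus-Glue reasoning above can be sketched as a simple check. This is only an illustration: the 15-minute figure is Lambda's maximum timeout, while the `pick_service` helper and its `safety_factor` padding are hypothetical and not part of any AWS API.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # Lambda's maximum timeout (900 seconds)


def pick_service(expected_runtime_seconds: float, safety_factor: float = 1.5) -> str:
    """Suggest 'lambda' only if the padded runtime fits inside Lambda's limit.

    `safety_factor` is an assumed margin for slow runs, not an AWS setting.
    """
    padded = expected_runtime_seconds * safety_factor
    return "lambda" if padded <= LAMBDA_MAX_SECONDS else "glue"


short_job = pick_service(120)    # a two-minute task
long_job = pick_service(1200)    # a twenty-minute task
```

A real decision would also weigh memory, data volume, and cost, which this sketch ignores.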
What other advice do I have?
I would recommend that new users refer to the AWS documentation. The documentation is very well-written and easy to understand. Even new users with no prior experience with AWS should be able to get up and running quickly. I would also recommend that new users learn Python.
Overall, I would rate the solution a nine out of ten.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.

Data Engineer at Tata Consultancy
Good integration with other AWS services but 30-minute runtime is limiting for large data sets
Pros and Cons
- "The solution integrates well with other AWS products or services."
- "On occasion, the solution's dashboard reports that a project failed due to runtime but it actually succeeded."
What is our primary use case?
Our company has five data engineers who use the solution for metadata catalogs and ETL pipelines that are built in S3 or EC2.
What is most valuable?
The solution integrates well with other AWS products or services.
What needs improvement?
The solution needs to extend its 30-minute query runtime limit. Jobs sometimes fail against certain sources, such as Athena, because of the limited runtime. Some large data sets run over the limit during busy hours, so we try to avoid failures by running them at idle times or at night.
On occasion, the solution's dashboard reports that a project failed due to runtime but it actually succeeded. This can be quite confusing.
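Besides the off-peak scheduling mentioned above, one general way to stay under a fixed runtime ceiling is to split a large load into smaller slices and run them one at a time. The sketch below shows that idea for a date-partitioned data set. It is a generic technique rather than something the reviewer describes doing, and `split_date_range` is a hypothetical helper.

```python
from datetime import date, timedelta


def split_date_range(start: date, end: date, days_per_batch: int):
    """Yield (batch_start, batch_end) pairs that cover start..end inclusive."""
    if days_per_batch < 1:
        raise ValueError("days_per_batch must be at least 1")
    current = start
    while current <= end:
        # Clamp the last batch so it never runs past the requested end date.
        batch_end = min(current + timedelta(days=days_per_batch - 1), end)
        yield current, batch_end
        current = batch_end + timedelta(days=1)


# Ten days of data processed in slices of at most four days each.
batches = list(split_date_range(date(2024, 1, 1), date(2024, 1, 10), 4))
```

Each pair could then be passed to a separate query or job run, so that no single run has to process the whole range.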
For how long have I used the solution?
I have been using the solution for one year.
What do I think about the stability of the solution?
The solution is stable 99% of the time but is somewhat dependent on workloads. Heavy workloads or bigger data sets sometimes fail. Improvements are needed for stability to be consistent.
How are customer service and support?
I have not needed technical support but my colleagues have used it. I do not have relevant feedback to report.
How was the initial setup?
The setup requires some knowledge and understanding of how the solution works.
What about the implementation team?
We implemented the solution in-house.
What's my experience with pricing, setup cost, and licensing?
I rate pricing an eight out of ten.
Which other solutions did I evaluate?
There are other options that offer equivalent service such as SRA and GCP but the solution is fully integrated with AWS where our data resides so we take full advantage of that.
What other advice do I have?
I rate the solution a five out of ten.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
Associate Director - Delivery (Technology DWH & Data Engineer) at MOBIUS KNOWLEDGE SERVICES PRIVATE LIMITED
Efficiently integrates and transforms data but lacks in scalability
Pros and Cons
- "I like its integration and ability to handle all data-related tasks."
- "We face performance issues when using AWS Glue for data transformation and integration."
What is our primary use case?
Our primary use cases include pulling data from multiple sources and loading it into the central capacity for data transformation, integration, and processing.
What is most valuable?
I like its performance, integration, and its ability to handle all data-related tasks.
What needs improvement?
We face performance issues when using AWS Glue for data transformation and integration. It takes almost three to four hours to execute single transformations, which is a lot. We want to improve the performance to meet customer requirements.
Mainly, I am focused on improving the performance aspect because the customer is keen on this improvement.
For how long have I used the solution?
I have been using AWS Glue for five years. I am using the latest version.
What do I think about the stability of the solution?
I would rate the stability of AWS Glue a seven out of ten. There are some performance issues.
What do I think about the scalability of the solution?
I would rate the scalability of AWS Glue a six out of ten. There are over 25 users in my company.
How are customer service and support?
The customer service and support team is okay.
How would you rate customer service and support?
Neutral
How was the initial setup?
It is a cloud-based solution, so no installation procedure is required. One DevOps engineer is needed for the maintenance of AWS Glue.
What other advice do I have?
Based on the customer scenario, I have previously recommended AWS Glue. Sometimes, customers directly request either Azure RapidAPI or AWS Glue. It depends on the specific business use case. Both tools have limitations, so it's hard to say which is best. If a customer already uses Microsoft products, I suggest going with Azure.
Overall, I would rate AWS Glue a seven out of ten. That rating is less about performance and more about how the tool is used.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
CEO - Founder / Principal Data Scientist / Principal AI Architect at Kanayma LLC
Straightforward, easy to set up, and needs little to no training
Pros and Cons
- "It's fairly straightforward as a product; it's not very complicated."
- "The mapping area and the use of the data catalog from Glue could be better."
What is our primary use case?
We use the solution to do the usual type of transformations that before required ETL. It's mostly transformation-type purposes that we have, including transforming data from source to target. Also, we are replacing the usual ETLs with Glue, for example.
How has it helped my organization?
The solution has helped the organization mostly with migration. When we migrate from mostly on-premises to the cloud, Glue replaces a more expensive item that's also available in the cloud. Lots of programmers can understand Glue because they can write scripts in Python and PySpark, and there are quite a few programmers that know how to program in those languages. With the previous components that did similar things on-premise, you needed specialized knowledge.
What is most valuable?
The fact that we can use PySpark to program them is great.
It's fairly straightforward as a product; it's not very complicated.
That Amazon offers it, that it's a quick way of getting things done, and that it doesn't require a lot of training are all very positive attributes.
The initial setup is straightforward.
It can scale.
What needs improvement?
The mapping area and the use of the data catalog from Glue could be better. I would say those two are the main things we'd like to see improvements on.
The solution needs support for big data.
As I understand it, Glue is based on Lambdas and Lambdas have some limitations as far as running them continuously. Sometimes they get dropped, and they have to be reinitialized.
For how long have I used the solution?
I've been using the solution for about two years.
What do I think about the stability of the solution?
So far, the solution has been stable. For everything that we've done, we didn't run into any problems, so it was fairly stable for us.
What do I think about the scalability of the solution?
The solution can scale. I don't know exactly at what scale. I was using it for a medium-scale type solution. We never tested it with extremely large scales, so I don't know how expansive it can be. That said, since it's in the cloud, if you need more scale, it would be easy enough to add additional Glue components. At least, in theory, it should be fairly scalable.
Indirectly, we have hundreds of people on the solution.
At this time, I'm not sure if any clients plan to increase usage.
How are customer service and support?
Since the solution is so easy to use, we have not needed the help of technical support.
Which solution did I use previously and why did I switch?
We've previously used many different solutions, including Informatica, DataStage from IBM, and SSIS.
How was the initial setup?
The solution is easy to set up. It's not overly complex.
At a minimum, a company would need one to two people to handle the deployment. If it is a larger scale of deployment, they might need more personnel.
What about the implementation team?
As a consultant, I assist with the implementation.
What was our ROI?
Likely, a company would receive an ROI as it's cheaper than the alternatives.
What's my experience with pricing, setup cost, and licensing?
I was not involved in the cost negotiation process.
Which other solutions did I evaluate?
Our clients typically have chosen Glue from the start. I was not involved in the evaluation process.
What other advice do I have?
We are using one of the latest versions of the solution. It's about two years old.
Depending on the number of data sources, the variety of data sources, and the variety of targets they will have, I might recommend the solution. What they have and plan to do will dictate whether Glue is a good solution or whether they would require something more sophisticated - such as Databricks. For example, if you have big data, then Databricks is probably a better solution to do ETL.
I'd rate the solution seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Owner at a tech services company with 51-200 employees
Capable of handling real-time but ETL interface could be more user-friendly
Pros and Cons
- "I also like that you can add custom libraries like JAR files and use them. So, the ability to use a fast processing engine and embed basic jobs easily are significant advantages."
- "One area that could be improved is the ETL view. The drag-and-drop interface is not as user-friendly as some other ETL tools."
What is our primary use case?
One common use case is migrating data from one system to another. So it's mostly data migration and data engineering: getting real-time or near-real-time data using Lambda functions, and migrating historical big data from on-prem to the cloud before starting a project.
What is most valuable?
If you have the Fund Manager, you could use a fast processing engine, which is crucial for performance.
I also like that you can add custom libraries like JAR files and use them. So, the ability to use a fast processing engine and embed basic jobs easily are significant advantages.
What needs improvement?
One area that could be improved is the ETL view. The drag-and-drop interface is not as user-friendly as some other ETL tools.
Additionally, AWS Glue can sometimes be a bit slow, especially when processing large datasets. Also, I couldn't directly use bucketed data. With Glue, you had to convert your data frames into the correct format before connecting them using the drag-and-drop interface. That's something I didn't like, because the conversion process wasn't straightforward.
In future releases, I would like to see a feature that could trigger a Glue pipeline through an API or something similar.
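For individual jobs, part of this request may already be covered: to the best of the editor's knowledge, the AWS SDK for Python (boto3) exposes a `start_job_run` call on its Glue client. The sketch below wraps that call. The `trigger_glue_job` helper, the job name, and the argument key and bucket in the usage comment are hypothetical, and the client is passed in so the function can be exercised without AWS credentials.

```python
from typing import Optional


def trigger_glue_job(glue_client, job_name: str, arguments: Optional[dict] = None) -> str:
    """Start a run of an existing Glue job and return the run ID.

    `glue_client` is expected to behave like boto3.client("glue").
    """
    response = glue_client.start_job_run(
        JobName=job_name,
        Arguments=arguments or {},
    )
    return response["JobRunId"]


# Usage (assumes boto3 is installed and AWS credentials are configured;
# the job name, argument key, and bucket are placeholders):
#   import boto3
#   run_id = trigger_glue_job(
#       boto3.client("glue"),
#       "my-etl-job",
#       {"--input_path": "s3://example-bucket/raw/"},
#   )
```

Whether this meets the reviewer's need depends on what "pipeline" means here; a multi-step workflow may need more than a single job run.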
For how long have I used the solution?
I have experience with AWS Glue. I have about one year of experience in a professional setting, but I have also done some personal work with this solution.
How are customer service and support?
Support was good, but I was working with a big client, so that might have influenced the experience. The response time was fast; we heard back from them within a day.
How would you rate customer service and support?
Positive
How was the initial setup?
I would rate my experience with the initial setup an eight out of ten, where one is difficult and ten is easy.
The initial setup is not very complex. You can customize parameters like minimum and maximum for your needs. For me, it wasn't complex to deploy the solution.
What other advice do I have?
I'd rate it around six out of ten compared to other tools like Databricks.
Which deployment model are you using for this solution?
On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Sr Associate at Cognizant
A stable and easy-to-use solution that can be used for data analytics
Pros and Cons
- "AWS Glue is a stable and easy-to-use solution."
- "The solution’s stability could be improved."
What is our primary use case?
We use AWS Glue for data analytics.
What is most valuable?
AWS Glue is a stable and easy-to-use solution.
What needs improvement?
The solution’s stability could be improved.
For how long have I used the solution?
I have been using AWS Glue for the last three years.
What do I think about the stability of the solution?
I rate AWS Glue a seven out of ten for stability.
What do I think about the scalability of the solution?
AWS Glue is a very scalable solution, and you can connect multiple databases.
How are customer service and support?
AWS Glue's technical support is very good.
What's my experience with pricing, setup cost, and licensing?
AWS Glue is not a licensed solution. It follows a pay-as-you-go model, in which you are billed monthly for what you use.
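As a rough illustration of pay-as-you-go billing, the sketch below estimates the cost of a job from the capacity it uses and how long it runs. It assumes billing is metered in DPU-hours, and the default rate of 0.44 USD per DPU-hour is a placeholder assumption that should be checked against current AWS pricing for the region in question. `estimate_job_cost` is a hypothetical helper, and it ignores details such as minimum billing durations.

```python
def estimate_job_cost(dpus: int, runtime_minutes: float,
                      rate_per_dpu_hour: float = 0.44) -> float:
    """Estimate one job run's cost as DPUs x hours x hourly rate.

    The default rate is an assumed placeholder, not an authoritative price.
    """
    dpu_hours = dpus * (runtime_minutes / 60)
    return dpu_hours * rate_per_dpu_hour


# One run on 10 DPUs for two hours, then the same job run daily for 30 days.
single_run = estimate_job_cost(dpus=10, runtime_minutes=120)
monthly_total = 30 * single_run
```

Because the cost scales with both capacity and runtime, adding workers only pays off if it shortens the run proportionally.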
What other advice do I have?
Currently, there are many ETL tools in the marketplace. Compared to other ETL tools, AWS Glue is a low-cost and serverless solution.
Overall, I rate AWS Glue a nine out of ten.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Associate Director at a consultancy with 10,001+ employees
A scalable tool to build data engineering pipelines
Pros and Cons
- "The solution's technical support is good. Whenever we raise a use case where we face an issue in our company, we get a response from the solution's technical team."
- "Only people who can code, either in Java or Python, can use the product freely. Those who don't know Java or Python might find using AWS Glue difficult."
What is our primary use case?
In my company, we use AWS Glue to build data engineering pipelines, so we ingest data from either S3 or other sources and put it back into Redshift, where we have a data lake or data warehouse.
What is most valuable?
With AWS Glue, there have been a lot of updates over the past couple of years. The feature I like the most is that AWS Glue now integrates Jupyter Notebook, which makes AWS Glue interactive and opens up new use cases. The fact that we don't have to worry about the resources required for any particular job, since AWS Glue takes care of that, adds to the valuable feature set of the product.
What needs improvement?
AWS Glue Studio has undergone a lot of enhancements in the last couple of months. The user interface could still become more user-friendly, for example with drag-and-drop features for building transformations. It would also be a good improvement if the product itself supported more kinds of transformations, so that the pipelines we want to create could be built easily; right now, we have to write code to do so in our company. Only people who can code, either in Java or Python, can use the product freely. Those who don't know Java or Python might find using AWS Glue difficult.
AWS has pricing for spot instances that reduces the cost substantially, but that is not available for AWS Glue. Spot pricing is offered for products like EC2, and if the same were introduced for AWS Glue, the cost could be reduced substantially.
For how long have I used the solution?
I have been using AWS Glue for five years. I am an end user of the product.
What do I think about the stability of the solution?
Stability-wise, I rate the solution an eight out of ten.
What do I think about the scalability of the solution?
It is a very scalable solution. In our company, we don't have to bother about the latest features implemented by the solution since the solution itself allocates resources, causing the product to scale up or down as the job progresses.
How are customer service and support?
The solution's technical support is good. Whenever we raise a use case where we face an issue in our company, we get a response from the solution's technical team.
How was the initial setup?
AWS Glue is a platform as a service from AWS, so there is no setup required. It's very simple since you can just go into the UI and start using it.
The solution is deployed on a private cloud.
I am a part of the development team, which doesn't have access to all the different sets of features and has a different backup. There is a team that does the initial setup, user access, and all those things. Once the product is ready, my team takes it over, so the setup is done by someone else.
What's my experience with pricing, setup cost, and licensing?
I rate the product's pricing a five on a scale of one to ten, where one is a high price, and ten is a low price.
Which other solutions did I evaluate?
AWS Glue can be compared to the database provided by Azure which has a lot of extra features or capabilities compared to AWS Glue which is only useful for processing.
What other advice do I have?
I rate the overall product an eight out of ten.
Which deployment model are you using for this solution?
Private Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
Project Manager at Softway
It has a real-time backup feature and records and backs up information every single moment, but its cost is high, and setting it up is complex
Pros and Cons
- "What I like best about AWS Glue is its real-time data backup feature. Last week, there was a production push, and what used to take almost ten days to send out around fifty-six thousand emails now takes only two hours."
- "Cost-wise, AWS Glue is expensive, so that's an area for improvement. The process for setting up the solution was also complex, which is another area for improvement."
What is our primary use case?
We're using GPU 0.2 in ten verticals and wanted to use AWS Glue only for one purpose: to optimize Amazon Redshift.
We have millions of records that we have to back up. Previously, we did it once every six months, but the client's data has become very interactive, and we need spontaneous back-and-forth data communication in real time. Almost one million records come and go every second. The client wanted to keep all data because they're using it for analytics and wanted to back up the data every second without delay. We tried to optimize Amazon Redshift and found out about AWS Glue, which comes with massive costs, but the client is willing to pay.
What is most valuable?
What I like best about AWS Glue is its real-time data backup feature. Last week, there was a production push, and what used to take almost ten days to send out around fifty-six thousand emails now takes only two hours.
I also like that the data backup in AWS Glue is spontaneous, and data is recorded and backed up every single moment.
What needs improvement?
AWS Glue had some issues, which required optimization, particularly in terms of the number of workers you deploy, and that's where costing comes in. Cost-wise, AWS Glue is expensive, so that's an area for improvement. My company did some modifications, which turned out to be successful, so overall, the solution works fine.
Even though there is a backup, you need to know what's happening. You need to understand why there's a failure. AWS Glue doesn't provide the information, so my company uses its logs. The development team also doesn't have specific answers because the team is still playing around with the process, which means the company is still trying to figure out other areas for improvement in AWS Glue.
The process for setting up the solution was also complex, which is another area for improvement.
AWS should provide help during migration and assist its users. Otherwise, it's a nightmare.
For how long have I used the solution?
I've been using AWS Glue for one and a half months.
What do I think about the stability of the solution?
AWS Glue is stable, but stability depends on how many workers you deploy and the work that you do.
What do I think about the scalability of the solution?
AWS Glue is highly scalable. It can scale to almost one billion data per second.
How are customer service and support?
We did make some good friends in AWS, so they gave us technical support for AWS Glue for free. They were also new and were trying to evolve, so they provided us with free support, but they'll be charging other clients for the support moving forward.
How was the initial setup?
The setup for AWS Glue is highly complex. The company started with R&D four months ago and only completed the deployment last week.
My company used one and a half FTE resources for the deployment.
The deployment process for AWS Glue was normal and involved CI/CD, but it was mainly the backend dev ops engineers who did it. I'm more of a project manager, so I'm not involved in technical items. It's more of me helping the engineers with the R&D.
What's my experience with pricing, setup cost, and licensing?
AWS Glue is a high-priced solution that bills the client $150,000 to $250,000 annually. That's just the starting price because it's a small data sample, but if it hits over three hundred million users, the cost will probably go up almost thirty times more.
What other advice do I have?
I'm using the latest version of AWS Glue.
I'm not the end-user, as I work for a company that implements AWS Glue for clients.
My company has one client using AWS Glue, but that client has three hundred million users.
I recommend AWS Glue to others because it's an excellent solution. However, it lacks documentation. There's only a little documentation available. Even certified AWS practitioners struggle with the lack of documentation for AWS Glue. You'll find complicated processes or features, such as time series tables. Even where there's documentation, implementing the solution requires a lot of trial and error, and revamping becomes a nightmare if you're using the old infrastructure.
My rating for AWS Glue is seven out of ten because of the complexity of the deployment and the lack of information and documentation, which meant my company had to do its own R&D. If AWS had complete documentation, or had sent more than one person to assist my company, it could have saved us more time.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer. Implementer

Buyer's Guide
Download our free AWS Glue Report and get advice and tips from experienced pros
sharing their opinions.
Updated: August 2025
Product Categories
Cloud Data Integration
Popular Comparisons
Informatica Intelligent Data Management Cloud (IDMC)
MuleSoft Anypoint Platform
webMethods.io
Palantir Foundry
Talend Open Studio
AWS Database Migration Service
Elastic Search
Denodo
Fivetran
Matillion Data Productivity Cloud
SnapLogic
Zapier
IBM Cloud Pak for Integration
Jitterbit Harmony
Talend Data Integration