reviewer2290962 - PeerSpot reviewer
VP- Cloud Data/ Solution Architect at a financial services firm with 10,001+ employees
Real User
Top 5
Offers metadata management, logging, and ETL processing capabilities
Pros and Cons
  • "AWS Glue is fast and managed by AWS. Hence, you don't have to worry about capacity and the performance of Glue jobs. It has integrations with other data stores of AWS. The product offers metadata management, logging, and ETL processing capabilities. It comes with a powerful feature, Glue Studio, which helps to do queries interactively within the community. It is a managed service and very secure. Another popular and mature service is S3."
  • "I have encountered challenges with multi-region support."

What is most valuable?

AWS Glue is fast and managed by AWS. Hence, you don't have to worry about capacity and the performance of Glue jobs. It has integrations with other data stores of AWS. The product offers metadata management, logging, and ETL processing capabilities. It comes with a powerful feature, Glue Studio, which helps to do queries interactively within the community. It is a managed service and very secure. Another popular and mature service is S3. 

What needs improvement?

I have encountered challenges with multi-region support. 

How are customer service and support?

AWS Glue has standard procedures for tech support. 

How would you rate customer service and support?

Positive

Buyer's Guide
AWS Glue
June 2025
Learn what your peers think about AWS Glue. Get advice and tips from experienced pros sharing their opinions. Updated: June 2025.
856,873 professionals have used our research since 2012.

How was the initial setup?

AWS Glue is very user-friendly. However, if you are not a UI person, you can use infrastructure as code (IaC). AWS provides a cloud automation service and a service catalog.
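For readers who prefer a scripted setup over the console, one possible shape is a CloudFormation template generated from code. This is only a sketch, assuming the "cloud automation service" mentioned above refers to AWS CloudFormation; the job name, role ARN, bucket path, and worker count are hypothetical placeholders, not values from the review.

```python
import json


def glue_job_template(job_name, role_arn, script_s3_path, workers=2):
    """Build a CloudFormation template (as a dict) declaring one Glue ETL job."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "EtlJob": {
                "Type": "AWS::Glue::Job",
                "Properties": {
                    "Name": job_name,
                    "Role": role_arn,
                    "GlueVersion": "4.0",
                    "WorkerType": "G.1X",
                    "NumberOfWorkers": workers,
                    "Command": {
                        "Name": "glueetl",  # Spark ETL job type
                        "ScriptLocation": script_s3_path,
                        "PythonVersion": "3",
                    },
                },
            }
        },
    }


# Hypothetical values, for illustration only.
template = glue_job_template(
    job_name="example-etl-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    script_s3_path="s3://example-bucket/scripts/job.py",
)
template_json = json.dumps(template, indent=2)
print(template_json)
```

The printed JSON could then be deployed with whatever CloudFormation workflow a team already uses; nothing here calls AWS directly.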

What's my experience with pricing, setup cost, and licensing?

I rate the tool's pricing a four out of ten. 

What other advice do I have?

I rate AWS Glue an eight out of ten. 

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Neelabh Sharma - PeerSpot reviewer
Data Engineer at Scania
Real User
Top 10
Provides good scalability and has an easy setup process
Pros and Cons
  • "The product has a valuable feature for data catalog."
  • "The product is expensive for data streaming. This area needs improvement."

What is our primary use case?

We use AWS Glue for ETL batch processing purposes.

What is most valuable?

The product has a valuable feature for data catalog.

What needs improvement?

The product is expensive for data streaming compared to EMR. This area needs improvement.

For how long have I used the solution?

We have been using AWS Glue for one and a half years.

What do I think about the stability of the solution?

I rate the product's stability a ten out of ten.

What do I think about the scalability of the solution?

We have five to six AWS Glue users. I rate its scalability a nine out of ten.

Which solution did I use previously and why did I switch?

We have used Cloudera before. We switched to AWS Glue for better pricing, scalability, and innovation.

How was the initial setup?

The initial setup is easy. I rate the process an eight or nine out of ten. It could be deployed on-premises and on the cloud as well. We have a team of five executives to carry out the implementation.

What's my experience with pricing, setup cost, and licensing?

It is an expensive product. I rate its pricing a nine out of ten.

What other advice do I have?

I rate AWS Glue a nine out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Data Engineer at Tata Consultancy
Real User
Good integration with other AWS services but 30-minute runtime is limiting for large data sets
Pros and Cons
  • "The solution integrates well with other AWS products or services."
  • "On occasion, the solution's dashboard reports that a project failed due to runtime but it actually succeeded."

What is our primary use case?

Our company has five data engineers who use the solution for metadata catalogs and ETL pipelines that are built in S3 or EC2.  

What is most valuable?

The solution integrates well with other AWS products or services. 

What needs improvement?

The solution needs to extend its 30-minute query runtime limit. Queries against certain sources, such as Athena, sometimes fail because of the limited runtime. Some large data sets overrun during busy hours, so we try to avoid failures by running jobs at idle times or at night.
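The workaround of running heavy loads at idle times or at night could be expressed as a scheduled trigger. The following is a minimal sketch that only builds the keyword arguments in the shape expected by Glue's CreateTrigger API, without calling AWS; the trigger name, job name, and the 02:00 UTC hour are hypothetical choices, not the reviewer's configuration.

```python
def nightly_trigger_request(trigger_name, job_name, hour_utc=2):
    """Return keyword arguments shaped for the Glue CreateTrigger API."""
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be between 0 and 23")
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        # AWS cron fields: minute hour day-of-month month day-of-week year
        "Schedule": f"cron(0 {hour_utc} * * ? *)",
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


# Hypothetical names, for illustration only.
request = nightly_trigger_request("nightly-load", "example-etl-job")
print(request["Schedule"])  # cron(0 2 * * ? *)
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `glue_client.create_trigger(**request)`.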

On occasion, the solution's dashboard reports that a project failed due to runtime but it actually succeeded. This can be quite confusing. 

For how long have I used the solution?

I have been using the solution for one year. 

What do I think about the stability of the solution?

The solution is stable 99% of the time but is somewhat dependent on workloads. Heavy workloads or bigger data sets sometimes fail. Improvements are needed for stability to be consistent. 

How are customer service and support?

I have not needed technical support but my colleagues have used it. I do not have relevant feedback to report. 

How was the initial setup?

The setup requires some knowledge and understanding of how the solution works. 

What about the implementation team?

We implemented the solution in-house. 

What's my experience with pricing, setup cost, and licensing?

I rate pricing an eight out of ten. 

Which other solutions did I evaluate?

There are other options that offer equivalent services, such as Azure and GCP, but the solution is fully integrated with AWS, where our data resides, so we take full advantage of that.

What other advice do I have?

I rate the solution a five out of ten. 

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Vimalathithan M - PeerSpot reviewer
Associate Director - Delivery (Technology DWH & Data Engineer) at MOBIUS KNOWLEDGE SERVICES PRIVATE LIMITED
Real User
Top 10
Efficiently integrates and transforms data but lacks in scalability
Pros and Cons
  • "I like its integration and ability to handle all data-related tasks."
  • "We face performance issues when using AWS Glue for data transformation and integration."

What is our primary use case?

Our primary use cases include pulling data from multiple sources and loading it into the central capacity for data transformation, integration, and processing.

What is most valuable?

I like its performance, integration, and its ability to handle all data-related tasks.

What needs improvement?

We face performance issues when using AWS Glue for data transformation and integration. It takes almost three to four hours to execute single transformations, which is a lot. We want to improve the performance to meet customer requirements.

Mainly, I am focused on improving the performance aspect because the customer is keen on this improvement.

For how long have I used the solution?

I have been using AWS Glue for five years. I am using the latest version. 

What do I think about the stability of the solution?

I would rate the stability of AWS Glue a seven out of ten. There are some performance issues. 

What do I think about the scalability of the solution?

I would rate the scalability of AWS Glue a six out of ten. There are over 25 users in my company. 

How are customer service and support?

The customer service and support team is okay. 

How would you rate customer service and support?

Neutral

How was the initial setup?

It is a cloud-based solution; there is no such installation procedure required. One DevOps is required for the maintenance of AWS Glue.  

What other advice do I have?

Based on the customer scenario, I have previously recommended AWS Glue. Sometimes, customers directly request either Azure RapidAPI or AWS Glue. It depends on the specific business use case. Both tools have limitations, so it's hard to say which is best. If a customer already uses Microsoft products, I suggest going with Azure. As for a general rating, I would give AWS Glue a seven out of ten.

Overall, I would rate AWS Glue a seven out of ten. The rating is less about performance and more about how the tool is used.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Sunil Morya - PeerSpot reviewer
Consultant at a tech vendor with 10,001+ employees
Real User
Easy to set up, useful for batch processing, and is free to try
Pros and Cons
  • "The solution helps organizations gain flexibility in defining the structure of the data."
  • "I haven't looked into Glue in terms of seeking out flaws. I've not come across missing features."

What is our primary use case?

When you receive data and do not know its structure, Glue is very helpful for inferring it. One component, the Glue Crawler, is quite useful for this task. It goes through segments of your data and tries to guess their structure. It then outputs the structure, and you can modify it as needed.
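To illustrate the general idea of guessing a structure from sample data, here is a small pure-Python sketch. It is not the Glue Crawler's actual algorithm, only an assumed simplification; the function names, the type labels, and the sample records are all hypothetical.

```python
def infer_type(value):
    """Map a Python value to a coarse column type label."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    return "string"


def infer_schema(records):
    """Guess one type per column by scanning sample records."""
    schema = {}
    for record in records:
        for column, value in record.items():
            if value is None:
                continue  # missing values carry no type information
            seen = infer_type(value)
            previous = schema.get(column)
            if previous is None or previous == seen:
                schema[column] = seen
            elif {previous, seen} == {"bigint", "double"}:
                schema[column] = "double"  # widen integers to floats
            else:
                schema[column] = "string"  # fall back on conflicting types
    return schema


sample = [
    {"id": 1, "amount": 10, "note": "ok"},
    {"id": 2, "amount": 12.5, "note": None},
]
schema = infer_schema(sample)
print(schema)  # {'id': 'bigint', 'amount': 'double', 'note': 'string'}
```

As in the review, the guessed schema is a starting point that a person would then adjust by hand.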

Glue works well for ETL when your files are stored in an S3 bucket. It supports other external sources as well. That said, most of the time we have proposed it to clients whose data is already available in S3.

How has it helped my organization?

The solution helps organizations gain flexibility in defining the structure of the data.

You can define and then include the original data structure and decide which fields are required and which ones you can omit. You can also perform certain processing tasks on the fly with Glue, such as applying a multiplying factor or cleaning up the data.
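The kind of on-the-fly processing described above (keeping required fields, dropping the rest, applying a multiplying factor, basic cleanup) can be sketched in plain Python. This is an illustration under assumed field names and an assumed factor, not Glue code; in Glue itself the equivalent steps would be written against its own transform functions or PySpark.

```python
def transform(record, required_fields, scale_field, factor):
    """Keep only required fields, trim strings, and scale one numeric field."""
    cleaned = {}
    for field in required_fields:
        value = record.get(field)
        if isinstance(value, str):
            value = value.strip()  # basic cleanup of stray whitespace
        cleaned[field] = value
    if cleaned.get(scale_field) is not None:
        cleaned[scale_field] = cleaned[scale_field] * factor
    return cleaned


# Hypothetical input rows; "debug" is a field we choose to omit.
rows = [{"id": 1, "name": "  Ada ", "price": 2.0, "debug": "x"}]
result = [transform(r, ["id", "name", "price"], "price", 100) for r in rows]
print(result)  # [{'id': 1, 'name': 'Ada', 'price': 200.0}]
```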

What is most valuable?

The Glue Crawler can have a set of connectors, so you can utilize those connectors to connect with the external databases, which may be on-premise in different networks or maybe locally on AWS. Basically, you can use the connector to fetch the data. 

Once you have a data schema, you can start streaming or fetching the data and converting it to a particular format. For example, if your data is in a text file, a Word document, or a SQL database, you can use the connector to convert it on the fly.

For batch processing and batch analytics, it is helpful for the ETL process.

The setup is easy. 

The solution offers a free trial. 

The solution can scale. 

It's stable.

Users only pay for what they use once they have a license. 

What needs improvement?

I haven't looked into Glue in terms of seeking out flaws. I've not come across missing features. 

For how long have I used the solution?

I've been dealing with the solution for two or three years. I have given a lot of proposals based on customer demand.

What do I think about the stability of the solution?

The solution is quite stable and reliable. There are no bugs or glitches. It doesn't crash or freeze. It is reliable. 

What do I think about the scalability of the solution?

Typically, data analytics individuals use the solution. It's not for an entire organization.

It's a scalable solution. I'd rate it ten out of ten. 

We do have plans to increase usage. We are in the process of moving many things to the cloud, and if they move onto AWS, they'll need Glue.

How are customer service and support?

I've never been in touch with technical support. I can't speak to how helpful or responsive they are. 

How was the initial setup?

The solution is very straightforward to set up and implement. 

I'd rate the ease of deployment at an eight or nine out of ten. However, it all depends on the circumstances. 

The deployment only takes two to three minutes. It's very fast.

Using the console, you have different sections of AWS Glue. You can go and specify the input data source and the output target location. Then you need to specify the transformation. If you want to do filtering, et cetera, you have to specify it. A blueprint of transformation functions is also available, so you can select from there and then just run it.

What about the implementation team?

I've only just explored the solution. It has not been deployed yet. 

What's my experience with pricing, setup cost, and licensing?

When you are just learning and testing the solution, it is free. I cannot speak to the full cost beyond that, as I am just experimenting with the product. They do offer it to users for a limited time to try for free, however.

My understanding is you only pay for what you use, so pricing would vary based on that. You don't need to maintain a cluster and it is serverless. 

There are no extra costs beyond a standard license fee.

What other advice do I have?

We are using the latest version of the solution. The solution runs on the cloud and is serverless. 

It's a good solution to use when people are not experts in analytics. If you want to perform some ETL on your data and the data is complex, then you should go for Glue. It is easy to set up.

I'd rate the solution ten out of ten. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer:
PeerSpot user
Joaquin Marques - PeerSpot reviewer
CEO - Founder / Principal Data Scientist / Principal AI Architect at Kanayma LLC
Real User
Straightforward, easy to set up, and needs little to no training
Pros and Cons
  • "It's fairly straightforward as a product; it's not very complicated."
  • "The mapping area and the use of the data catalog from Glue could be better."

What is our primary use case?

We use the solution for the usual types of transformations that previously required a dedicated ETL tool. Our purposes are mostly transformation-related, including transforming data from source to target. We are also replacing our usual ETL tools with Glue.

How has it helped my organization?

The solution has helped the organization mostly with migration. When we migrate from mostly on-premises to the cloud, Glue replaces a more expensive item that's also available in the cloud. Lots of programmers can understand Glue because they can write scripts in Python and PySpark, and there are quite a few programmers that know how to program in those languages. With the previous components that did similar things on-premise, you needed specialized knowledge.

What is most valuable?

The fact that we can use PySpark to program them is great. 

It's fairly straightforward as a product; it's not very complicated. 

The fact that Amazon offers it, that it's a quick way of getting things done, and that it doesn't require a lot of training are very positive attributes.

The initial setup is straightforward.  

It can scale.

What needs improvement?

The mapping area and the use of the data catalog from Glue could be better. I would say those two are the main things we'd like to see improvements on.

The solution needs support for big data.

As I understand it, Glue is based on Lambdas and Lambdas have some limitations as far as running them continuously. Sometimes they get dropped, and they have to be reinitialized.

For how long have I used the solution?

I've been using the solution for about two years. 

What do I think about the stability of the solution?

So far, the solution has been stable. For everything that we've done, we didn't run into any problems, so it was fairly stable for us.

What do I think about the scalability of the solution?

The solution can scale. I don't know exactly at what scale. I was using it for a medium-scale type solution. We never tested it with extremely large scales, so I don't know how expansive it can be. That said, since it's in the cloud, if you need more scale, it would be easy enough to add additional Glue components. At least, in theory, it should be fairly scalable.

Indirectly, we have hundreds of people on the solution. 

At this time, I'm not sure if any clients plan to increase usage. 

How are customer service and support?

Since the solution is so easy to use, we have not needed the help of technical support. 

Which solution did I use previously and why did I switch?

We've previously used many different solutions, including Informatica, DataStage from IBM, and SSIS.

How was the initial setup?

The solution is easy to set up. It's not overly complex.

At a minimum, a company would need one to two people to handle the deployment. If it is a larger scale of deployment, they might need more personnel. 

What about the implementation team?

As a consultant, I assist with the implementation. 

What was our ROI?

Likely, a company would receive an ROI as it's cheaper than the alternatives.

What's my experience with pricing, setup cost, and licensing?

I was not involved in the cost negotiation process. 

Which other solutions did I evaluate?

Our clients typically have chosen Glue from the start. I was not involved in the evaluation process. 

What other advice do I have?

We are using one of the latest versions of the solution. It's about two years old. 

Depending on the number of data sources, the variety of data sources, and the variety of targets they will have, I might recommend the solution. What they have and plan to do will dictate whether Glue is a good solution or whether they would require something more sophisticated - such as Databricks. For example, if you have big data, then Databricks is probably a better solution to do ETL.

I'd rate the solution seven out of ten. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Diego Henrique Da SilvaBastos - PeerSpot reviewer
Data Engineer at a tech services company with 501-1,000 employees
Real User
Top 20
Offers good documentation, stability but error handling is difficult
Pros and Cons
  • "It's very good to manage."
  • "AWS Glue's error handling is difficult."

What is our primary use case?

I use AWS Glue for data processing. Some of my colleagues have data for software, and I use AWS Glue to transform and inspect this data.

What is most valuable?

It's very good to manage. It is easy to integrate other products with AWS. 

Glue integrates with other AWS processes and networks. So, it's quite easy to integrate.

I've worked with AI integration but I haven't gone into much depth on that topic.

What needs improvement?

AWS Glue's error handling is difficult. 

The errors in AWS are very hard to handle. The screen is very hard to understand. 

I have to use CloudWatch to find out what the error was, including any new ones, and I often need to go through it with someone else. It's not so easy for me, and there are more issues related to this.
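One way to make log digging less painful is to filter the events down to the lines that look like errors. The sketch below works on a local list of events shaped like CloudWatch Logs results (each with a `timestamp` and a `message`); the marker strings and the sample events are assumptions for illustration, not output from a real Glue job.

```python
# Assumed substrings that suggest a failure; adjust to the job's real logs.
ERROR_MARKERS = ("ERROR", "Exception", "Traceback")


def extract_errors(events):
    """Return error-looking messages, oldest first."""
    hits = [e for e in events if any(m in e["message"] for m in ERROR_MARKERS)]
    return [e["message"] for e in sorted(hits, key=lambda e: e["timestamp"])]


# Hypothetical events, for illustration only.
events = [
    {"timestamp": 3, "message": "ERROR: column 'id' not found"},
    {"timestamp": 1, "message": "INFO: job started"},
    {"timestamp": 2, "message": "Traceback (most recent call last):"},
]
errors = extract_errors(events)
for line in errors:
    print(line)
```

In practice the events would come from the CloudWatch Logs API or console export rather than a hard-coded list.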

For how long have I used the solution?

I have been using it for a year and a half. 

What do I think about the stability of the solution?

I would rate the stability a nine out of ten. 

What do I think about the scalability of the solution?

I would rate the scalability a seven out of ten. 

My data is small, so we need to consider more days. We need to deal with what we have, but I understand the documentation. 

Some people find it hard, but I rated it a seven. In my company, TechOps uses AWS with about 1,200 users.

Which solution did I use previously and why did I switch?

I worked with Databricks. In my opinion, Databricks is improving and is easier to use. It's more user-friendly, and I think it's better overall.

How was the initial setup?

I work at a big company, and most of the setup is already done, much like using a blueprint. The configuration is already working elsewhere. The only thing I have to do with the cloud is the remote configuration.

What's my experience with pricing, setup cost, and licensing?

AWS can be expensive.

What other advice do I have?

Overall, I would rate it a seven out of ten. I would recommend it.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Mbaye Babacar Gueye - PeerSpot reviewer
Owner at a tech services company with 51-200 employees
Real User
Top 5
Capable of handling real-time but ETL interface could be more user-friendly
Pros and Cons
  • "I also like that you can add custom libraries like JAR files and use them. So, the ability to use a fast processing engine and embed basic jobs easily are significant advantages."
  • "One area that could be improved is the ETL view. The drag-and-drop interface is not as user-friendly as some other ETL tools."

What is our primary use case?

One common use case is migrating data from one system to another. So it is mostly data migration and data engineering: getting real-time or near-real-time data using Lambda functions, and migrating large volumes of historical data from on-premises to the cloud before starting a project.

What is most valuable?

If you have the Fund Manager, you could use a fast processing engine, which is crucial for performance. 

I also like that you can add custom libraries like JAR files and use them. So, the ability to use a fast processing engine and embed basic jobs easily are significant advantages.

What needs improvement?

One area that could be improved is the ETL view. The drag-and-drop interface is not as user-friendly as some other ETL tools. 

Additionally, AWS Glue can sometimes be slow, especially when processing large datasets. Also, I couldn't directly use bucketed data. With Elastic Glue, you had to convert your data frames into the correct format before connecting them using the drag-and-drop interface. I didn't like that because the conversion process wasn't straightforward.

In future releases, I would like to see a feature that could trigger a Glue pipeline through an API or a similar mechanism.

For how long have I used the solution?

I have experience with AWS Glue. I have about one year of experience in a professional setting, but I have also done some personal work with this solution.

How are customer service and support?

Support was good, but I was working with a big client, so that might have influenced the experience. The response time was fast, we heard back from them within a day. 

How would you rate customer service and support?

Positive

How was the initial setup?

I would rate my experience with the initial setup an eight out of ten, where one is difficult and ten is easy. 

The initial setup is not very complex. You can customize parameters like minimum and maximum for your needs. For me, it wasn't complex to deploy the solution. 

What other advice do I have?

I'd rate it around six out of ten compared to other tools like Databricks.  

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user