Joaquin Marques - PeerSpot reviewer
CEO - Founder / Principal Data Scientist / Principal AI Architect at Kanayma LLC
Real User
Top 5 Leaderboard
Straightforward, easy to set up, and needs little to no training
Pros and Cons
  • "It's fairly straightforward as a product; it's not very complicated."
  • "The mapping area and the use of the data catalog from Glue could be better."

What is our primary use case?

We use the solution for the usual transformations that previously required a dedicated ETL tool, mainly transforming data from source to target. In effect, we are replacing the usual ETLs with Glue.

How has it helped my organization?

The solution has helped the organization mostly with migration. When we migrate from mostly on-premises to the cloud, Glue replaces a more expensive item that's also available in the cloud. Lots of programmers can understand Glue because they can write scripts in Python and PySpark, and there are quite a few programmers that know how to program in those languages. With the previous components that did similar things on-premise, you needed specialized knowledge.

What is most valuable?

The fact that we can use PySpark to program them is great. 

It's fairly straightforward as a product; it's not very complicated. 

The fact that Amazon offers it, that it's a quick way of getting things done, and that it doesn't require a lot of training are very positive attributes.

The initial setup is straightforward.  

It can scale.

What needs improvement?

The mapping area and the use of the data catalog from Glue could be better. I would say those two are the main things we'd like to see improvements on.

The solution needs support for big data.

As I understand it, Glue is based on Lambdas and Lambdas have some limitations as far as running them continuously. Sometimes they get dropped, and they have to be reinitialized.

Buyer's Guide
AWS Glue
March 2024
Learn what your peers think about AWS Glue. Get advice and tips from experienced pros sharing their opinions. Updated: March 2024.
765,386 professionals have used our research since 2012.

For how long have I used the solution?

I've been using the solution for about two years. 

What do I think about the stability of the solution?

So far, the solution has been stable. For everything that we've done, we didn't run into any problems, so it was fairly stable for us.

What do I think about the scalability of the solution?

The solution can scale. I don't know exactly at what scale. I was using it for a medium-scale type solution. We never tested it with extremely large scales, so I don't know how expansive it can be. That said, since it's in the cloud, if you need more scale, it would be easy enough to add additional Glue components. At least, in theory, it should be fairly scalable.

Indirectly, we have hundreds of people on the solution. 

At this time, I'm not sure if any clients plan to increase usage. 

How are customer service and support?

Since the solution is so easy to use, we have not needed the help of technical support. 

Which solution did I use previously and why did I switch?

We've previously used many different solutions, including Informatica, DataStage from IBM, and SSIS.

How was the initial setup?

The solution is easy to set up. It's not overly complex.

At a minimum, a company would need one to two people to handle the deployment. If it is a larger scale of deployment, they might need more personnel. 

What about the implementation team?

As a consultant, I assist with the implementation. 

What was our ROI?

Likely, a company would receive an ROI as it's cheaper than the alternatives.

What's my experience with pricing, setup cost, and licensing?

I was not involved in the cost negotiation process. 

Which other solutions did I evaluate?

Our clients typically have chosen Glue from the start. I was not involved in the evaluation process. 

What other advice do I have?

We are using one of the latest versions of the solution. It's about two years old. 

Depending on the number of data sources, the variety of data sources, and the variety of targets they will have, I might recommend the solution. What they have and plan to do will dictate whether Glue is a good solution or whether they would require something more sophisticated - such as Databricks. For example, if you have big data, then Databricks is probably a better solution to do ETL.

I'd rate the solution seven out of ten. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Neelabh Sharma - PeerSpot reviewer
Data Engineer at Scania
Real User
Top 10
Provides good scalability and has an easy setup process
Pros and Cons
  • "The product has a valuable feature for data catalog."
  • "The product is expensive for data streaming. This area needs improvement."

What is our primary use case?

We use AWS Glue for ETL batch processing purposes.

What is most valuable?

The product has a valuable feature for data catalog.

What needs improvement?

The product is expensive for data streaming compared to EMR. This area needs improvement.

For how long have I used the solution?

We have been using AWS Glue for one and a half years.

What do I think about the stability of the solution?

I rate the product's stability a ten out of ten.

What do I think about the scalability of the solution?

We have five to six AWS Glue users. I rate its scalability a nine out of ten.

Which solution did I use previously and why did I switch?

We have used Cloudera before. We switched to AWS Glue for better pricing, scalability, and innovation.

How was the initial setup?

The initial setup is easy. I rate the process an eight or nine out of ten. It could be deployed on-premises and on the cloud as well. We have a team of five executives to carry out the implementation.

What's my experience with pricing, setup cost, and licensing?

It is an expensive product. I rate its pricing a nine out of ten.

What other advice do I have?

I rate AWS Glue a nine out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Ankit Shukla - PeerSpot reviewer
Data Engineer at YASH Technologies
Real User
Top 5
Cheap, reliable, and able to expand as needed
Pros and Cons
  • "The solution is stable and reliable."
  • "The monitoring is not that good."

What is most valuable?

The best feature is the price point. It's pretty cheap as compared to other tools like Informatica, et cetera. That's why major companies are moving to the cloud and using Glue. At least, that's what I found.

The solution is stable and reliable.

You can scale the product if you need to. 

What needs improvement?

The monitoring is not that good. We'd like to see job progress be more clear. Right now, how we can view that is not that good. The issue is that it is mostly Python or Scala code based. The UX is lacking.

There is a bit of a learning curve, particularly during the setup process. 

More connectors should be included.

For how long have I used the solution?

I've been using the solution for three years. 

What do I think about the stability of the solution?

The solution is very reliable. It's stable. There are no bugs or glitches. It works just fine.

What do I think about the scalability of the solution?

The solution can scale very well. It's not a problem.

How are customer service and support?

Technical support is okay. We tend to go to the partner if we have issues, and they'll go to AWS if they need to.

Which solution did I use previously and why did I switch?

I'm also familiar with Informatica. However, Glue is less expensive. 

How was the initial setup?

In terms of the initial setup, the learning curve was a little bit steep. After that, it is okay. We didn't have any issues once we understood the process.

What about the implementation team?

We didn't require any outside assistance such as integrators or consultants. We were able to handle it ourselves. 

What's my experience with pricing, setup cost, and licensing?

The price is very good. It's enticing people to move to the cloud. 

That said, I do not have exact information on pricing. 

What other advice do I have?

I'm an AWS engineer. My company is a gold partner.

I'd rate the product eight out of ten. So far, it's quite good. I don't have any complaints.  

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
UjjwalGupta - PeerSpot reviewer
Module Lead at Mphasis
Real User
Top 10
Provides inbuilt data quality and cataloging features, but it is costly compared to other tools
Pros and Cons
  • "The most valuable feature of AWS Glue is that it provides a GUI format with a drag-and-drop feature."
  • "AWS Glue is more costly compared to other tools like Airflow."

What is our primary use case?

We use AWS Glue for building ETL pipelines.

What is most valuable?

The most valuable feature of AWS Glue is that it provides a GUI format with a drag-and-drop feature. The solution provides a codeless feature or no code feature, where you can write a pipeline without adding code in AWS Glue.

If you are working on AWS services for your pipeline, AWS Glue better interacts with the AWS services than other third-party tools. The solution provides inbuilt data quality and cataloging features. AWS Glue provides a complete package, and you can do all things in one place.

What needs improvement?

AWS Glue is more costly compared to other tools like Airflow. It would be better if the solution's pricing could be reduced. The default scheduling that AWS Glue provides is not as good as Airflow. The scheduler of AWS Glue could be improved because you cannot customize it.

For how long have I used the solution?

I have been using AWS Glue for more than three years.

What do I think about the stability of the solution?

AWS Glue is a stable product because it's an AWS-managed service. We can directly contact AWS for any issues we face. If there are any glitches in the version, AWS solves the issue by creating patches or version upgrades to the solution.

What do I think about the scalability of the solution?

Around 100 to 200 people are using the solution in our organization.

How was the initial setup?

As AWS Glue is a SaaS product, you don't have to set up anything. You can just create a new job, write your script, and run your job. You have to select the particular cluster or nodes you want to run on and write the code. Not much administration is required for the solution's setup.

What's my experience with pricing, setup cost, and licensing?

AWS Glue is a paid service that doesn't come under the free trial of AWS. You have to pay a charge for using the solution.

What other advice do I have?

If you run a job once or twice a day, AWS Glue will not cost you much. If you run jobs four to five times a day, or hourly, it will be costlier than other tools, and those tools would be the better option. Choose AWS Glue if you are running jobs only one or two times a day.
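
The frequency trade-off above comes down to simple arithmetic. The sketch below is a rough estimate and not AWS's billing logic: the rate of 0.44 USD per DPU-hour, the 10-DPU job size, and the 15-minute runtime are all assumed figures for illustration, and the function `monthly_glue_cost` is hypothetical. It also ignores any minimum billing duration, so check current pricing for your region before relying on it.

```python
def monthly_glue_cost(runs_per_day, minutes_per_run, dpus,
                      usd_per_dpu_hour=0.44, days=30):
    """Rough monthly cost estimate for one scheduled job.

    Every input is an assumption supplied by the caller; the default
    rate is illustrative and varies by region and job type.
    """
    dpu_hours_per_run = dpus * (minutes_per_run / 60)
    return runs_per_day * days * dpu_hours_per_run * usd_per_dpu_hour


# Hypothetical job: 10 DPUs for 15 minutes per run.
for runs in (1, 2, 6, 24):
    cost = monthly_glue_cost(runs, minutes_per_run=15, dpus=10)
    print(f"{runs:>2} runs/day: about ${cost:,.2f} per month")
```

Because the estimate is linear in the number of runs, an hourly schedule costs roughly 24 times what a daily schedule does for the same job, which matches the reviewer's advice to reserve Glue for infrequent jobs.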

Our company decided to go with AWS Glue because the tools we were using in the pipeline were AWS services only. AWS Glue easily interacts with AWS services. The jobs we were running were also not frequent.

You can use AWS Glue for learning purposes. AWS Glue is a paid service that doesn't come under the free trial of AWS. You have to pay a charge for using the solution. You can learn the code by directly testing the basic Spark code on any local system. Once you are comfortable that your code is working fine, then you can run your code in AWS Glue jobs. You should test the code on the local system first and then run it in AWS Glue. Testing on AWS Glue will be costly.

If a person is familiar with Spark jobs or Python jobs, they can easily learn AWS Glue. A new user will take the same amount of time to learn AWS Glue as they take to get comfortable with Spark. Since it provides the GUI and a no-code option, users can directly start using AWS Glue without having to learn any code. It's much easier to learn to use the solution.

Overall, I rate the solution a seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
VP- Cloud Data/ Solution Architect at a financial services firm with 10,001+ employees
Real User
Top 5
Offers metadata management, logging, and ETL processing capabilities
Pros and Cons
  • "AWS Glue is fast and managed by AWS. Hence, you don't have to worry about capacity and the performance of Glue jobs. It has integrations with other data stores of AWS. The product offers metadata management, logging, and ETL processing capabilities. It comes with a powerful feature, Glue Studio, which helps to do queries interactively within the community. It is a managed service and very secure. Another popular and mature service is S3."
  • "I have encountered challenges with multi-region support."

What is most valuable?

AWS Glue is fast and managed by AWS. Hence, you don't have to worry about capacity and the performance of Glue jobs. It has integrations with other data stores of AWS. The product offers metadata management, logging, and ETL processing capabilities. It comes with a powerful feature, Glue Studio, which helps to do queries interactively within the community. It is a managed service and very secure. Another popular and mature service is S3. 

What needs improvement?

I have encountered challenges with multi-region support. 

How are customer service and support?

AWS Glue has standard procedures for tech support. 

How would you rate customer service and support?

Positive

How was the initial setup?

AWS Glue is very user-friendly. However, if you are not a UI person, you can do IaC. AWS provides a cloud automation service and a service catalog.

What's my experience with pricing, setup cost, and licensing?

I rate the tool's pricing a four out of ten. 

What other advice do I have?

I rate AWS Glue an eight out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Sunil Morya - PeerSpot reviewer
Consultant at a tech vendor with 10,001+ employees
Real User
Top 5
Easy to set up, useful for batch processing, and is free to try
Pros and Cons
  • "The solution helps organizations gain flexibility in defining the structure of the data."
  • "I haven't looked into Glue in terms of seeking out flaws. I've not come across missing features."

What is our primary use case?

When you receive data and don't know its structure, Glue is very helpful for estimating that structure, and it will identify everything for you. It has one component, called Glue Crawler, that is quite useful for this task. It will go through segments of your data and try to guess their structure. It outputs the structure, and you can modify it according to your convenience.
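
To make the idea of guessing structure concrete, here is a minimal sketch that infers column types from a sample of CSV rows. It is not how the Glue Crawler works internally. The function names `guess_type` and `infer_schema`, the sample rows, and the three-type rule set are all hypothetical and deliberately simplistic.

```python
import csv
import io


def guess_type(value):
    """Return the narrowest type name that fits a raw string value."""
    for caster, name in ((int, "int"), (float, "double")):
        try:
            caster(value)
            return name
        except (TypeError, ValueError):
            pass
    return "string"


def infer_schema(csv_text, sample_size=100):
    """Guess a column -> type mapping from the first rows of CSV text."""
    # Widening order: a column that mixes types takes the widest one seen.
    rank = {"int": 0, "double": 1, "string": 2}
    schema = {}
    reader = csv.DictReader(io.StringIO(csv_text))
    for i, row in enumerate(reader):
        if i >= sample_size:
            break
        for column, value in row.items():
            seen = guess_type(value)
            current = schema.get(column, "int")
            schema[column] = max(current, seen, key=rank.get)
    return schema


sample = "id,price,city\n1,9.99,Lima\n2,12,Quito\n"
print(infer_schema(sample))
```

As with the crawler's output, the inferred mapping is only a starting point that you would review and adjust by hand.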

It is good to basically perform the ETL when your files are stored in the S3 bucket. Glue supports other external sources also. That said, most of the time, we have basically given our proposal to clients if the data is available in S3.

How has it helped my organization?

The solution helps organizations gain flexibility in defining the structure of the data.

You can define and then include the original data structure and decide which fields are required and which ones you can omit. You can also perform certain processing tasks: you can apply a multiplying factor, do cleanup, et cetera, on the fly with Glue.
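
The kind of on-the-fly shaping described here (keeping required fields, omitting others, applying a multiplying factor, cleaning values) can be sketched in plain Python. This is an illustration only and not Glue's API; `MAPPING`, `transform_record`, and the field names are hypothetical.

```python
# Hypothetical mapping: source field -> (target field, conversion function).
# Source fields that are not listed here are omitted from the output.
MAPPING = {
    "cust_name": ("customer", str.strip),                  # cleanup
    "weight_kg": ("weight_g", lambda kg: float(kg) * 1000),  # multiplying factor
}


def transform_record(record, mapping=MAPPING):
    """Keep only mapped fields, rename them, and convert their values."""
    output = {}
    for source_field, (target_field, convert) in mapping.items():
        if source_field in record:
            output[target_field] = convert(record[source_field])
    return output


raw = {"cust_name": "  Ada  ", "weight_kg": "1.5", "internal_id": "x9"}
print(transform_record(raw))
```

Keeping the mapping as data and the transformation as a pure function also makes it easy to test the logic locally before running anything as a paid job.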

What is most valuable?

The Glue Crawler can have a set of connectors, so you can utilize those connectors to connect with the external databases, which may be on-premise in different networks or maybe locally on AWS. Basically, you can use the connector to fetch the data. 

Once you have a data schema, you can start streaming or fetching the data in the particular format conversion. For example, suppose you have the text file, and you have Word in place or maybe in SQL, and you can use the connector on the fly to convert the database.

For batch processing and batch analytics, it is helpful for the ETL process.

The setup is easy. 

The solution offers a free trial. 

The solution can scale. 

It's stable.

Users only pay for what they use once they have a license. 

What needs improvement?

I haven't looked into Glue in terms of seeking out flaws. I've not come across missing features. 

For how long have I used the solution?

I've been dealing with the solution for two or three years. I have given a lot of proposals based on customer demand.

What do I think about the stability of the solution?

The solution is quite stable and reliable. There are no bugs or glitches. It doesn't crash or freeze. It is reliable. 

What do I think about the scalability of the solution?

Typically, data analytics individuals use the solution. It's not for an entire organization.

It's a scalable solution. I'd rate it ten out of ten. 

We do have plans to increase usage. We are in the process of moving many things to the cloud, and if they move onto AWS, they'll need Glue.

How are customer service and support?

I've never been in touch with technical support. I can't speak to how helpful or responsive they are. 

How was the initial setup?

The solution is very straightforward to set up and implement. 

I'd rate the ease of deployment at an eight or nine out of ten. However, it all depends on the circumstances. 

The deployment only takes two to three minutes. It's very fast.

Using the console, you have different sections of AWS Glue. You can go and specify the input data source and output target data place. Then you need to specify the transformation. If you want to do the filtering, et cetera, you have to specify. You have the blueprint of transformation functions available also, and you can select from there and then just run it.

What about the implementation team?

I've only just explored the solution. It has not been deployed yet. 

What's my experience with pricing, setup cost, and licensing?

When you are just learning and testing the solution, it is free. I cannot speak to the full cost beyond that, as I am just experimenting with the product. They do offer it to users for a limited time to try for free, however.

My understanding is you only pay for what you use, so pricing would vary based on that. You don't need to maintain a cluster and it is serverless. 

There are no extra costs beyond a standard license fee.

What other advice do I have?

We are using the latest version of the solution. The solution runs on the cloud and is serverless. 

It's a good solution to use when people are not exporting analytics. If you want to perform some ETL on your data and the data is complex, then you should go for Glue. It is easy to set up.

I'd rate the solution ten out of ten. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer:
PeerSpot user
Liana Iuhas - PeerSpot reviewer
CEO at Quark Technologies SRL
Real User
Top 5
Highly scalable, reliable, and beneficial pay-as-you-go pricing model
Pros and Cons
  • "AWS Glue is a good solution for developers; they have the ability to write code in different languages and other software."
  • "The interface for AWS Glue could improve; it does not show a lot of detail. You can write the code in PySpark or in Scala, which is a big advantage, but it is only easy to use for a developer. It will be difficult for new users to enter the cloud environment."

What is our primary use case?

My colleagues work with Spark, PySpark, and Scala as programming languages for writing complex aggregations. They have a repository in order to have a general view of all the sources and jobs on the platform and AWS Glue is very helpful.

What is most valuable?

AWS Glue is a good solution for developers; they have the ability to write code in different languages and other software.

What needs improvement?

The interface for AWS Glue could improve; it does not show a lot of detail. You can write the code in PySpark or in Scala, which is a big advantage, but it is only easy to use for a developer. It will be difficult for new users to enter the cloud environment.

If business users want to run their own graphs they will not have the opportunity to use such features, such as running code inside AWS Glue in Spark, which will be complex for them.

For how long have I used the solution?

I have been using AWS Glue for approximately four years.

What do I think about the stability of the solution?

AWS Glue is a highly stable solution. We didn't have bugs in production. 

The solution works well with Spark, which is a good framework for large volumes of data. It operates very well.

I rate the stability of AWS Glue a ten out of ten.

What do I think about the scalability of the solution?

The scalability of AWS Glue is great. It was used for enterprise customers. We worked a lot with AWS Glue for international companies.

We have approximately 10 people using AWS Glue in my company.

How are customer service and support?

I have to use the support from AWS Glue. The response time could improve.

I rate the support from AWS Glue a nine out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup of AWS Glue is very simple.

What's my experience with pricing, setup cost, and licensing?

AWS Glue uses a pay-as-you-go approach which is helpful. The price of the overall solution is low and is a great advantage.

Which other solutions did I evaluate?

If I can compare AWS Glue to other solutions, it has the advantage of the cloud, which assures availability and scalability, and the pay-as-you-go is beneficial. This is why many companies are moving from their traditional ETL tools to the cloud because the costs will be reduced dramatically.

What other advice do I have?

I would recommend this solution to others.

I rate AWS Glue a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Data Engineer at Tata Consultancy
Real User
Good integration with other AWS services but 30-minute runtime is limiting for large data sets
Pros and Cons
  • "The solution integrates well with other AWS products or services."
  • "On occasion, the solution's dashboard reports that a project failed due to runtime but it actually succeeded."

What is our primary use case?

Our company has five data engineers who use the solution for metadata catalogs and ETL pipelines that are built in S3 or EC2.  

What is most valuable?

The solution integrates well with other AWS products or services. 

What needs improvement?

The solution needs to expand its 30-minute query or runtime. Sometimes it fails with certain data types such as Athena due to the limited runtime. Some large data sets run overtime during busy hours so we try to avoid failures by running data at idle times or at night. 

On occasion, the solution's dashboard reports that a project failed due to runtime but it actually succeeded. This can be quite confusing. 

For how long have I used the solution?

I have been using the solution for one year. 

What do I think about the stability of the solution?

The solution is stable 99% of the time but is somewhat dependent on workloads. Heavy workloads or bigger data sets sometimes fail. Improvements are needed for stability to be consistent. 

How are customer service and support?

I have not needed technical support but my colleagues have used it. I do not have relevant feedback to report. 

How was the initial setup?

The setup requires some knowledge and understanding of how the solution works. 

What about the implementation team?

We implemented the solution in-house. 

What's my experience with pricing, setup cost, and licensing?

I rate pricing an eight out of ten. 

Which other solutions did I evaluate?

There are other options that offer equivalent service such as SRA and GCP but the solution is fully integrated with AWS where our data resides so we take full advantage of that. 

What other advice do I have?

I rate the solution a five out of ten. 

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user