Bernd Stroehle. - PeerSpot reviewer
Enterprise Architect at kosakya
Real User
Top 5 Leaderboard
An open-source solution that has limitations when processing very large numbers of jobs
Pros and Cons
  • "I worked on a project at a leading German bank for two years, successfully migrating large applications with hundreds of jobs."
  • "Apache Airflow improved workflow efficiency, but we had to find solutions for large workflows. For instance, a monthly workflow with 1200 jobs had to be split into three to four pieces as it struggled with large job numbers. Loading a workflow with 500 jobs could take 10 minutes, which wasn't acceptable."

What needs improvement?

Apache Airflow improved workflow efficiency, but we had to find solutions for large workflows. For instance, a monthly workflow with 1200 jobs had to be split into three to four pieces as it struggled with large job numbers. Loading a workflow with 500 jobs could take 10 minutes, which wasn't acceptable.

The most important feature Apache Airflow lacks is support for external configuration files. All classical schedulers like Control-M or Automic allow you to load workflow definitions from YAML, XML, or JSON files, but the tool requires you to write Python programs. Airflow only supports external configuration for variables, not for workflows. To address this, I created a YAML configuration file that I converted into Python programs, but this functionality is missing from Apache Airflow itself.

All of its competitors have this feature. In Control-M, Automic, and IBM's scheduler, you can load workflows from XML, JSON, or YAML files.
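The YAML-to-Python conversion described above can be sketched in a few lines. This is an illustrative reconstruction, not the reviewer's actual tool: `config` stands in for the result of `yaml.safe_load()` on a workflow definition file, and the task names are made up.

```python
# Sketch: driving DAG construction from an external config file, the pattern
# the reviewer implemented by hand. In practice `config` would come from
# yaml.safe_load(open("workflow.yml")); it is inlined here for clarity.
config = {
    "dag_id": "monthly_close",
    "tasks": [
        {"name": "extract", "command": "extract.sh", "depends_on": []},
        {"name": "transform", "command": "transform.sh", "depends_on": ["extract"]},
        {"name": "load", "command": "load.sh", "depends_on": ["transform"]},
    ],
}

def build_order(tasks):
    """Return task names in dependency order (simple Kahn's algorithm)."""
    remaining = {t["name"]: set(t["depends_on"]) for t in tasks}
    order = []
    while remaining:
        ready = sorted(n for n, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("cycle in task dependencies")
        for n in ready:
            order.append(n)
            del remaining[n]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order

# In a real DAG file, each entry would become e.g. a BashOperator, and the
# depends_on lists would be turned into upstream >> downstream edges.
print(build_order(config["tasks"]))
```

The point is that the workflow definition lives in a declarative file that non-developers can edit, while a single generic Python module turns it into Airflow tasks.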

For how long have I used the solution?

I've been familiar with Apache Airflow for about three to four years. I worked on a project at a leading German bank for two years, successfully migrating large applications with hundreds of jobs. However, the bank has paused its migration strategy due to issues with the team in India; they're likely waiting for version 3, which is expected next year.

What do I think about the stability of the solution?

I rate the tool's stability a nine out of ten. 

What do I think about the scalability of the solution?

I rate the product's scalability a seven out of ten. 

Buyer's Guide
Apache Airflow
August 2025
Learn what your peers think about Apache Airflow. Get advice and tips from experienced pros sharing their opinions. Updated: August 2025.
865,164 professionals have used our research since 2012.

How are customer service and support?

Apache Airflow doesn't have its own technical support.

How was the initial setup?

I've been involved in all aspects of Airflow deployment, including building infrastructure using Kubernetes and containers. We faced challenges migrating from enterprise schedulers like Control-M and IBM's scheduler to Airflow, as it lacked some functionality. I had to implement extra features and extensions to support things like individual calendars.

What's my experience with pricing, setup cost, and licensing?

Apache Airflow is open-source and free. Hyperscalers like Google (with Composer), Azure, and AWS offer managed Airflow services.

What other advice do I have?

I recommend Apache Airflow because it's open-source, but you must accept its limitations. However, I wouldn't recommend it to companies in biomedical, chemistry, or oil and gas industries with large workflows and thousands of jobs. For example, genomic analysis at an American multinational pharmaceutical and biotechnology corporation involved workflows with around twenty thousand jobs, which Airflow can't handle. Special schedulers are needed for such cases, as even classical schedulers like Control-M and Automic aren't suitable.

I rate the overall solution a seven out of ten. 

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
ManojKumar43 - PeerSpot reviewer
Big Data Engineer at BigTapp Analytics
Real User
Top 10
A solution for orchestrating EMR clusters with plug-and-play UI
Pros and Cons
  • "Apache Airflow is easy to use and can monitor task execution easily. For instance, when performing setup tasks, you can conveniently view the logs without delving into the job details."
  • "Airflow should support dynamic DAG creation."

What is our primary use case?

I have used Apache Airflow for various purposes, such as orchestrating Spark jobs, EMR clusters, and Glue jobs, and submitting jobs within the DCP data flow on Azure Databricks, including ad hoc queries, for instance when there's a need to execute queries on Redshift or other databases.

How has it helped my organization?

If you are working with APIs or databases, you must write SQL queries and formulate the right statements to retrieve everything. But with the UI, it's more like plug-and-play. You go there, select the task you want to see, like logs, and click on it. It will promptly display the details of the logs, automatically showing the returned logs. However, if you're accessing logs manually from the web server, you must write commands and perform additional tasks. These overheads can be efficiently managed using the UI.

What is most valuable?

Apache Airflow is easy to use and can monitor task execution easily. For instance, when performing setup tasks, you can conveniently view the logs without delving into the job details. All logs are readily accessible within the interface itself. Examining the logs lets you discern which steps and processes are being executed.

You don't have to configure SMTP for everything. You need to configure email settings, such as email on error, failure, or alert access. With Apache Airflow, you can send emails with just a few lines of code. You don't have to write extensive code to configure SMTP; all those configurations can be accomplished within a few lines of code.
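The few lines the reviewer mentions look roughly like this. This is a sketch assuming Airflow 2.x with the `[smtp]` section of `airflow.cfg` already filled in; the DAG id, task, and address are illustrative.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Alerting is declared per-DAG; Airflow sends the mail itself once SMTP is
# configured in airflow.cfg, so no custom SMTP code is needed.
default_args = {
    "email": ["data-team@example.com"],  # illustrative address
    "email_on_failure": True,
    "email_on_retry": False,
    "retries": 1,
}

with DAG(
    dag_id="nightly_load",               # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                   # `schedule` assumes Airflow 2.4+
    catchup=False,
    default_args=default_args,
) as dag:
    load = BashOperator(task_id="load", bash_command="run_load.sh ")
```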

I managed a complex workflow for a finance application project. They use Apache Airflow to orchestrate processes such as retrieving data from SFTP and landing it in S3. From S3, they trigger Glue jobs based on certain conditions. They also use the Glue catalog for data management, all orchestrated with Airflow. Furthermore, logic is written in Airflow DAGs to handle scenarios like security mismatches; for instance, if a security is missing, files are sent accordingly.

Apache Airflow triggers sets of tasks based on DAGs. If you have multiple DAGs, such as raw, transform, and ready layers, you can chain them so that triggering one automatically triggers the others, instead of manually triggering each DAG. You can also attach conditions.
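Chaining layer DAGs like this can be sketched with the TriggerDagRunOperator. A minimal sketch assuming Airflow 2.x; the DAG ids are illustrative.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.trigger_dagrun import TriggerDagRunOperator

# The raw-layer DAG ends by kicking off the transform layer, which can in
# turn trigger the ready layer, so no DAG has to be triggered manually.
with DAG(
    dag_id="raw_layer",                    # illustrative
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    trigger_transform = TriggerDagRunOperator(
        task_id="trigger_transform",
        trigger_dag_id="transform_layer",  # illustrative downstream DAG
        wait_for_completion=False,
    )
```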

What needs improvement?

Airflow should support dynamic DAG creation.

For how long have I used the solution?

I have been using Apache Airflow for over 8 years.

What do I think about the stability of the solution?

The solution is stable. 

I rate the solution's stability a 9.5 out of ten.

What do I think about the scalability of the solution?

We were using Apache Airflow on Kubernetes. As more requests came in, it scaled dynamically based on the available pods. Almost 15 data engineers are using Apache Airflow.

I rate the solution's scalability a nine out of ten.

How was the initial setup?

The initial setup is straightforward, though it can be tricky if you go with a custom executor or the Kubernetes operator.

If you're into plug-and-play convenience, Apache Airflow supports various deployment methods like Docker, Helm, or Kubernetes. Spinning up Airflow takes no more than 10-15 minutes. However, if you're making customizations or prefer not to use existing databases, the setup time could be extended by those customization requests.

What other advice do I have?

You use Apache Airflow to automate your data pipelines. When you have a data pipeline, such as a Spark job or any other job, and want to automate it, triggering the job manually is not always necessary. You need to configure these DAGs accordingly. For instance, Airflow can initiate the job when the data becomes available. We don't need to keep the cluster running all the time, 24/7. We start the cluster using Airflow when we need to submit the job. Once the job is completed, we terminate the cluster.
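The start-cluster, submit-job, terminate-cluster pattern described above can be sketched with plain PythonOperators. This is an illustrative sketch, not the reviewer's actual code; the callables are placeholders for real provider calls (e.g. the EMR operators), and the `trigger_rule` on the last task ensures the cluster is torn down even if the job fails.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def start_cluster():
    print("starting cluster")       # placeholder for a real EMR/cluster call

def submit_job():
    print("submitting Spark job")   # placeholder for the actual job submission

def terminate_cluster():
    print("terminating cluster")    # placeholder for the teardown call

with DAG(
    dag_id="on_demand_cluster",      # illustrative
    start_date=datetime(2024, 1, 1),
    schedule=None,                   # triggered when data arrives, not 24/7
    catchup=False,
) as dag:
    start = PythonOperator(task_id="start_cluster", python_callable=start_cluster)
    submit = PythonOperator(task_id="submit_job", python_callable=submit_job)
    terminate = PythonOperator(
        task_id="terminate_cluster",
        python_callable=terminate_cluster,
        trigger_rule="all_done",     # tear down even when upstream fails
    )
    start >> submit >> terminate
```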

I recommend the solution.

Overall, I rate the solution a nine out of ten.

Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Miodrag Milojevic - PeerSpot reviewer
Senior Data Architect at Yettel
Real User
Top 10
Streamlines complex data workflows with its user-friendly interface, robust scheduling, and monitoring capabilities, offering scalability and efficient orchestration of diverse sources
Pros and Cons
  • "Its user-friendly interface makes it straightforward to operate, offering a plethora of features for data preparation, buffering, and format conversion."
  • "It would be beneficial to improve the pricing structure."

What is our primary use case?

It serves as a versatile tool for data ingestion, enabling various tasks including data transformation from one type or format to another. It facilitates seamless preparation and processing of data, supporting diverse operations such as format conversion, type transformation, and other related functions.

How has it helped my organization?

We leverage Apache Airflow to orchestrate our data pipelines, primarily due to the multitude of data sources we manage. These sources vary in nature, with some delivering streaming data, while others follow different protocols such as FTP or utilize landing areas. We utilize Airflow for orchestrating tasks such as data ingestion, transformation, and preparation, ensuring that the data is formatted appropriately for further processing. Typically, this involves tasks like normalization, enrichment, and structuring the data for consumption by tools like Spark or other similar platforms in our ecosystem.

The scheduling and monitoring functionalities enhance our data processing workflows. While the interface could be more user-friendly, proficiency in scheduling and monitoring can be attained through practice and skill development.

The scalability of Apache Airflow effectively accommodates our increasing data processing demands without issue. While occasional server problems may arise, particularly in this aspect, overall, the product remains reliably stable.

It offers a straightforward means to orchestrate numerous data sources efficiently, thanks to its user-friendly interface. Learning to use it is relatively quick and straightforward, although some experimentation, practice, and training may be required to master certain aspects.

What is most valuable?

Our data workflow management is greatly streamlined by the use of Apache Airflow, which proves highly beneficial. Its user-friendly interface makes it straightforward to operate, offering a plethora of features for data preparation, buffering, and format conversion. With its extensive capabilities, Airflow serves as a comprehensive tool for managing our data workflows effectively.

What needs improvement?

The current pricing of Apache Airflow is considerably higher than anticipated, catching us off guard as it has evolved from its initial pricing structure. It would be beneficial to improve the pricing structure, and further enhancements to the interface would also be welcome.

For how long have I used the solution?

We have been using it for approximately two years.

What do I think about the stability of the solution?

While the stability of the system is satisfactory, maintaining stability requires vigilance and attention to various factors. During usage, occasional issues may arise, particularly when operating on-premises configurations. For instance, a single hard disk failure on a physical node can pose a challenge, necessitating the node's shutdown for disk replacement. However, the process of switching off and on the node is intricate and requires careful handling.

What do I think about the scalability of the solution?

Scalability is achievable, but it comes with its challenges, particularly in terms of temporary downsizing due to failures or other unforeseen circumstances. While scaling up is feasible, each additional node introduced into the cluster adds complexity and raises the likelihood of potential failures. Dealing with failures involves following standard procedures, yet reinstating the cluster to its fully operational state can be a demanding task.

Approximately ten technical staff members and an equivalent number of data scientists utilize the platform. Additionally, a segment of the network team employs it for network quality analysis, leveraging reporting tools built on top of Impala, which is integrated into the cluster.

How was the initial setup?

The setup process is notably intricate, particularly considering our cluster configuration consisting of twelve data nodes and various additional components. Furthermore, unforeseen issues may arise, such as disk space constraints for Airflow or similar challenges, necessitating vigilance and attention to detail to avoid complications.

What about the implementation team?

The initial phase of the deployment process involves creating a comprehensive plan outlining the setup of our cluster, considering all nodes involved. Since we're deploying on-premises, we need to determine which components will reside on physical machines and which can be accommodated on virtual machines or clusters. This assessment will guide the allocation of resources to each server, ensuring an optimal configuration. Following this, the configuration phase begins, taking into account the specific requirements of our organization and stringent security measures. Access to the clusters must be carefully managed, categorized, and restricted as per security protocols. It's imperative to prepare everything meticulously prior to deployment to ensure a smooth and successful implementation. We've undertaken the deployment process partially in-house and with the assistance of the system integrator dedicated to this project. For maintenance and deployment tasks, we rely on a team of ten technical personnel. Typically, only two or three individuals are needed to monitor operations and address issues as they arise. Moreover, we have the backing of our system integrator for additional support if necessary.

What's my experience with pricing, setup cost, and licensing?

The pricing is on the higher side.

What other advice do I have?

I would confidently recommend Apache Airflow to others, assuring them of its benefits. In my opinion, it's a mature and efficient product that delivers reliable performance. Overall, I would rate it nine out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
SUDHIR KUMAR RATHLAVATH - PeerSpot reviewer
Student at University of South Florida
Real User
Enables seamless integration with various connectivity and integrated services, including BigQuery and Python operators
Pros and Cons
  • "Every feature in Apache Airflow is valuable. The number of operators and features I've used are mainly related to connectivity services and integrated services because I primarily work with GCP."
  • "For admins, there should be improved logging capabilities because Apache Airflow does have logging, but it's limited to some database data."

What is our primary use case?

Apache Airflow is like a freeway. Just as a freeway allows cars to travel quickly and efficiently from one point to another, Apache Airflow allows data engineers to orchestrate their workflows in a similarly efficient way.

There are a lot of scheduling tools in the market, but Apache Airflow has taken over everything. With the help of airflow operators, any task required for day-to-day data engineering work becomes possible. It manages the entire lifecycle of data engineering workflows.

How has it helped my organization?

So, for example, let's say you want to connect to multiple sources, extract data, run your pipeline, and trigger your pipeline through other integrated services. You also want to do this at a specific time.

You have a number of operators that can help you with this. For example, you can use the External Sensor operator to take the dependency of workflows. This means that you can wait for one workflow to complete before triggering another workflow. There are also good operators like the Python operator and the Bash operator. These operators allow you to run your scripts without having to change your code. This is great for traditional projects that are already running on batch.
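The workflow dependency described above can be sketched with the ExternalTaskSensor. A minimal sketch assuming Airflow 2.x; the DAG and task ids are illustrative.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.sensors.external_task import ExternalTaskSensor

with DAG(
    dag_id="reporting",                  # illustrative downstream DAG
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Wait for the upstream ingestion DAG's final task to succeed for the
    # same logical date before running anything here.
    wait_for_ingest = ExternalTaskSensor(
        task_id="wait_for_ingest",
        external_dag_id="ingestion",     # illustrative upstream DAG
        external_task_id="load_done",    # illustrative upstream task
        mode="reschedule",               # free the worker slot while waiting
    )
    build_report = BashOperator(task_id="build_report", bash_command="report.sh ")
    wait_for_ingest >> build_report
```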

So, let's say you have jobs scheduled in a tool like Informatica. You can use Airflow to trigger your VTech scripts instead of going through the Informatica jobs, so you don't need Informatica as a scheduling engine. That's a good way to decouple your data pipelines from Informatica. Airflow is a powerful tool that can help you automate your workflows and improve your efficiency.

What is most valuable?

Every feature in Apache Airflow is valuable. The number of operators and features I've used are mainly related to connectivity services and integrated services because I primarily work with GCP. Specifically, I've utilized the BigQuery connectors and operators, as well as Python operators and other runnable operators like Bash. These common operators have been quite useful in my work.
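A typical use of the BigQuery operators mentioned here looks like the following. This is a sketch assuming the Google provider package is installed and a GCP connection is configured; the DAG id, table, and query are illustrative.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

with DAG(
    dag_id="bq_daily_summary",           # illustrative
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Submit a standard-SQL query job to BigQuery; illustrative table name.
    summarize = BigQueryInsertJobOperator(
        task_id="summarize",
        configuration={
            "query": {
                "query": "SELECT COUNT(*) FROM `project.dataset.events`",
                "useLegacySql": False,
            }
        },
    )
```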

Another thing that stands out is its ease of use for developers familiar with Python. They can simply write their code and set up their environment to run the code through the scheduling engine. It's quite convenient, especially for those in the data engineering field who are already well-versed in Python. They don't need to install any additional tools or perform complex environment setups. It's straightforward for them.

The graphical interface is good because it runs on a DAG (Directed Acyclic Graph).

What needs improvement?

One improvement could be the inclusion of a plugin with a drag-and-drop feature. This graphical feature would be beneficial when dealing with connectivity and integration services like connecting to BigQuery or other systems. As a first-time user, although the documentation is available, it would be more user-friendly to have a drag-and-drop interface within the portal. Users could simply drag and drop components to create a pseudo-code, making it more flexible and intuitive.

Therefore, I suggest having a drag-and-drop feature for a more user-friendly experience and better code management.  

Moreover, for admins, there should be improved logging capabilities. Apache Airflow does have logging, but it's limited to some database data; it would be better if everything were written to the server where it's hosted and surfaced at the interface level for developers.

For how long have I used the solution?

I have been using Apache Airflow since 2014, so it's been over eight years. We currently use the latest version, 2.4.0.

What do I think about the stability of the solution?

Performance-wise, it's good; I've been using two versions, 2.0 and 2.4.

So, it's stable. The version we're using now is the more stable of the two.

What do I think about the scalability of the solution?

It's pretty scalable.

How was the initial setup?

The initial setup is easy if it's on the cloud; you get scalability and usability out of the box, so you don't need to worry about storage.

What was our ROI?

The ROI is very high. Most companies are adopting Apache Airflow, and it can be used for a wide variety of tasks, including pulling data, summarizing tables, and generating reports. Everything can be done in Python and integrated within Apache Airflow. The efficiency and ease of use it offers contribute to its high ROI.

Which other solutions did I evaluate?

I've been using Apache Airflow, but I haven't directly compared it with other scheduling tools available in the market. This is because each cloud platform has its own built-in scheduling tool. For instance, if we consider Azure, it has a service called Azure Data Factory, which serves as a scheduling engine.

When you compare Apache Airflow with services like Azure Data Factory and AWS, you'll find that Airflow excels in various aspects. However, one area that could be improved is the integration with hosting services. Currently, Airflow can be hosted on different platforms and machines, which offers flexibility but may require some enhancements to streamline integration with certain hosting services.

What other advice do I have?

Since I have been using Apache Airflow for six to seven years, I would confidently rate the solution a solid ten. We help customers re-design and implement projects using Apache Airflow, and approximately 90% of our work revolves around this powerful tool. So, I rate this product a perfect ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
IT Professional at Freelance
Real User
Top 20
Equips users with a comprehensive feature set for managing complex workflows and has a responsive technical support team
Pros and Cons
  • "Airflow integrates well with Cloudera and effectively supports complex operations."
  • "One area for improvement would be to address specific functionalities removed in recent updates that were previously useful for our operations."

What is our primary use case?

We use the product for scheduling and defining workflows. It helps us extensively to manage complex workflows within Cloudera's ecosystem, particularly for handling and processing data.

How has it helped my organization?

The solution has been beneficial in automating and managing our data workflows efficiently. It has integrated well with our Cloudera environment, enabling us to handle complex workflows with greater ease and reliability.

What is most valuable?

The solution's most valuable feature is its ability to run workflows without saving changes. It allows us to execute tasks without permanently altering our configurations, which is useful for temporary adjustments and testing.

What needs improvement?

One area for improvement would be to address specific functionalities removed in recent updates that were previously useful for our operations.

Additional features that could enhance the product include more flexibility in parameterization and improved tools for managing and debugging workflows.

For how long have I used the solution?

I have been working with Airflow for approximately a year and a half, focusing on the current version for the past eight months.

What do I think about the stability of the solution?

The product has been stable in our environment.

What do I think about the scalability of the solution?

The product is scalable. 

How are customer service and support?

The technical support team has been responsive and helpful. They addressed issues related to removed functionalities and ensured critical features were restored in subsequent updates.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We previously used Hortonworks but switched to Cloudera CDP. We also used other Cloudera tools but found Airflow to be a better fit for our current needs due to its capabilities in workflow management.

How was the initial setup?

The initial setup was complex due to the integration with various data sources and configuration requirements, but once properly set up, it has proven effective.

What about the implementation team?

The implementation was carried out with guidance from Cloudera's support team, who provided valuable assistance in configuring the solution to meet our requirements.

Which other solutions did I evaluate?

We evaluated other data workflow solutions but found Airflow the most suitable due to its integration with Cloudera and comprehensive feature set for managing complex workflows.

What other advice do I have?

Airflow integrates well with Cloudera and effectively supports complex operations. However, users should be aware of changes in functionality between versions and plan accordingly.

Overall, I rate it a nine out of ten. 

Disclosure: My company has a business relationship with this vendor other than being a customer. Partner
PeerSpot user
UjjwalGupta - PeerSpot reviewer
Module Lead at Mphasis
Real User
Top 5
User-friendly, provides a graphical representation of the whole flow, and the user interface is pretty good
Pros and Cons
  • "The tool is user-friendly."
  • "We cannot run real-time jobs in the solution."

What is our primary use case?

The main use case is orchestration. We use it to schedule our jobs.

What is most valuable?

The best thing about the product is its UI. The tool is user-friendly. We can divide our work into different tasks and groups, and it gives a graphical representation of the whole flow, creating a graph of the complete pipeline. Whenever there is a failure, we can see it at the backend and retry from the point where the failure happened; we do not have to redo the whole flow. The UI provides details about the jobs and offers monitoring features, so we can see the metrics and the history of runs. The administration features are good, and we can manage the users.

What needs improvement?

The solution lacks certain features. We cannot run real-time jobs in the solution. It supports only batch jobs. If we are using ETL pipelines, it can either be a batch job or a real-time job. Real-time jobs run continuously. They are not scheduled. Apache Airflow is for scheduled jobs, not real-time jobs. It would be a good improvement if the solution could run real-time jobs. Many connectors are available in the product, but some are still missing. We have to build a custom connector if it is not available. The solution must have more in-built connectors for improved functionality.
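A missing connector is usually filled in with a small custom operator. Here is a minimal sketch assuming Airflow 2.x; the internal API and class name are hypothetical, not a real library.

```python
from airflow.models.baseoperator import BaseOperator

class InHouseApiToS3Operator(BaseOperator):
    """Hypothetical custom connector: pull from an internal API, land the
    payload in storage. A real implementation would go through an Airflow
    Hook so connections and credentials are managed centrally."""

    def __init__(self, endpoint: str, target_key: str, **kwargs):
        super().__init__(**kwargs)
        self.endpoint = endpoint
        self.target_key = target_key

    def execute(self, context):
        # self.log is the standard per-task logger Airflow provides.
        self.log.info("Fetching %s into %s", self.endpoint, self.target_key)
        # ... call the internal API and write the result to storage here ...
```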

For how long have I used the solution?

I have been using the solution for four to five years.

What do I think about the stability of the solution?

The tool has stability issues that are present in open-source products. It has some failures or bugs sometimes. It is difficult to troubleshoot because we do not have any support for it. We have to search the community to get answers. It would be good if there were a support team for the tool.

What do I think about the scalability of the solution?

We have 5000 to 10,000 users in our organization.

How was the initial setup?

The installation is relatively easy; it doesn't require much configuration and is straightforward. Some companies provide custom installations, which is easier, but that is a costly paid service. We generally use the core product. There is also Amazon's managed Airflow offering, which is a better option if we do not want to do the configuration ourselves.

What other advice do I have?

Apache Airflow is a better option for batch jobs. My advice depends on the tools people use and the jobs they schedule. Databricks has its own scheduler. If someone is using Databricks, a separate tool for scheduling would be useless. They can schedule the jobs through Databricks.

Apache Airflow is a good option if someone is not using third-party tools to run the jobs. When we use APIs to get data or get data from RDBMS systems, we can use Apache Airflow. If we use third-party vendors, using the in-built scheduler is better than Apache Airflow. Overall, I rate the solution a nine out of ten.

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Punit_Shah - PeerSpot reviewer
Director at Smart Analytica
Reseller
Top 10
Excels in orchestrating complex workflows, offering extensibility, a graphical user interface for clear pipeline monitoring and affordability
Pros and Cons
  • "One of its most valuable features is the graphical user interface, providing a visual representation of the pipeline status, successes, failures, and informative developer messages."
  • "Enhancements become necessary when scaling it up from a few thousand workflows to a more extensive scale of five thousand or ten thousand workflows."

What is our primary use case?

We utilize Apache Airflow for two primary purposes. Firstly, it serves as the tool for ingesting data from the source system application into our data warehouse. Secondly, it plays a crucial role in our ETL pipeline. After extracting data, it facilitates the transformation process and subsequently loads the transformed data into the designated target tables.

What is most valuable?

One of its most valuable features is the graphical user interface, providing a visual representation of the pipeline status, successes, failures, and informative developer messages. This graphical interface greatly enhances the user experience by offering clear insights into the pipeline's status.

What needs improvement?

Enhancements become necessary when scaling it up from a few thousand workflows to a more extensive scale of five thousand or ten thousand workflows. At that point, resource management and threading become critical aspects, which involves optimizing the utilization of resources and threads within the Kubernetes VM ecosystem.

For how long have I used the solution?

I have been working with it for five years.

What do I think about the stability of the solution?

I would rate its stability capabilities nine out of ten.

What do I think about the scalability of the solution?

While it operates smoothly with up to fifteen hundred pipelines, scaling beyond that becomes challenging. The performance tends to drop when dealing with five thousand pipelines or more, leading to the rating of five out of ten.

How are customer service and support?

I would rate the customer service and support nine out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup is straightforward. I would rate it nine out of ten.

What about the implementation team?

The deployment process requires approximately four hours, and the level of involvement from individuals depends on the quantity of pipelines intended for deployment.

What's my experience with pricing, setup cost, and licensing?

The cost is quite affordable. I would rate it two out of ten.

What other advice do I have?

If you have around two thousand pipelines to execute daily within an eight to nine-hour window, Apache Airflow proves to be an excellent solution. I would rate it nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other
Disclosure: My company has a business relationship with this vendor other than being a customer. Service provider
PeerSpot user
reviewer1619292 - PeerSpot reviewer
Director of Business Intelligence at a consultancy (self-employed)
Real User
Top 20
Helps to schedule data pipelines but improvement is needed in workflow integration across the servers
Pros and Cons
  • "To increase efficiency, it's quite simple to add dbt tasks to an Apache Airflow pipeline or orchestration file. With the tool, you can specify dependencies."
  • "I would like to see workflow integration across the servers."

What is our primary use case?

We use the tool to schedule data pipelines. We also use Apache Airflow to orchestrate dbt, another data processing tool. Airflow helps manage dbt processes, which, in our case, load data from our data lake.

What is most valuable?

To increase efficiency, it's quite simple to add dbt tasks to an Apache Airflow pipeline or orchestration file. With the tool, you can specify dependencies.
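Adding dbt tasks to an orchestration file typically looks like the following. This is a sketch shelling out to the dbt CLI with the BashOperator; the DAG id and commands are illustrative, and dependencies are declared exactly as the reviewer describes.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="dbt_pipeline",               # illustrative
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Each dbt command becomes one Airflow task.
    dbt_run = BashOperator(task_id="dbt_run", bash_command="dbt run ")
    dbt_test = BashOperator(task_id="dbt_test", bash_command="dbt test ")

    # The dependency is specified with the >> operator: tests only run
    # after the models have been built.
    dbt_run >> dbt_test
```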

What needs improvement?

I would like to see workflow integration across the servers. 

For how long have I used the solution?

I have been using the product for two years. 

What do I think about the stability of the solution?

The solution is stable, but we do have occasional issues: the Apache Airflow cluster sometimes crashes when too many tasks run simultaneously.

What do I think about the scalability of the solution?

My team has around 11 people using the tool. Each team has a separate server, so we have about 10-20 different Apache Airflow servers. Altogether, I would estimate that around 200 people in our organization use it.

How are customer service and support?

I haven't contacted the support team directly. Our system team does it. 

How was the initial setup?

Apache Airflow provides templates for deployment, which makes it easy. When deploying the tool or using dbt, we usually use Kubernetes. We configure a Dockerfile that sets up the Kubernetes servers for us, which means that when we deploy, it automatically goes to production. The whole process can be completed in seven weeks.

What's my experience with pricing, setup cost, and licensing?

I use the tool's open-source version. 

What other advice do I have?

The solution's maintenance involves upgrades. Our system team handles maintenance for us. Their main tasks are upgrading versions and addressing vulnerabilities. It's hard work, but they manage it well. Maintenance takes about two weeks per year for our system team.

I rate the product a seven out of ten and I recommend it to others. 

Which deployment model are you using for this solution?

On-premises
Disclosure: My company does not have a business relationship with this vendor other than being a customer.
PeerSpot user
Buyer's Guide
Download our free Apache Airflow Report and get advice and tips from experienced pros sharing their opinions.
Updated: August 2025