What is our primary use case?
Apache Airflow is like a freeway. Just as a freeway allows cars to travel quickly and efficiently from one point to another, Apache Airflow allows data engineers to orchestrate their workflows in a similarly efficient way.
There are a lot of scheduling tools on the market, but Apache Airflow has become the dominant one. With the help of Airflow operators, almost any task required for day-to-day data engineering work becomes possible, and it manages the entire lifecycle of data engineering workflows.
How has it helped my organization?
So, for example, let's say you want to connect to multiple sources, extract data, run your pipeline, and trigger your pipeline through other integrated services. You also want to do this at a specific time.
You have a number of operators that can help you with this. For example, you can use the ExternalTaskSensor to set dependencies between workflows, so one workflow waits for another to complete before it is triggered. There are also good operators like the PythonOperator and the BashOperator, which let you run your existing scripts without having to change your code. This is great for traditional projects that are already running on batch.
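As a rough sketch of that pattern (the DAG, task, and function names here are invented for illustration), a downstream DAG can use an ExternalTaskSensor to wait on an upstream DAG, then run existing Python and shell logic unchanged:

```python
# Hypothetical downstream DAG: waits for an upstream DAG's task,
# then runs a Python function and a shell command.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator
from airflow.sensors.external_task import ExternalTaskSensor


def transform():
    # Placeholder for existing Python logic that needs no code changes.
    print("running transformation")


with DAG(
    dag_id="downstream_pipeline",              # made-up DAG id
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Block until the named task in the other DAG has completed
    # for the same logical date.
    wait_for_upstream = ExternalTaskSensor(
        task_id="wait_for_upstream",
        external_dag_id="upstream_pipeline",   # made-up upstream DAG id
        external_task_id="load_data",          # made-up upstream task id
    )

    run_python = PythonOperator(task_id="run_python", python_callable=transform)

    run_script = BashOperator(
        task_id="run_script",
        bash_command="echo 'existing batch script would run here'",
    )

    wait_for_upstream >> run_python >> run_script
```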
So, let's say you have jobs scheduled in Informatica. You can use Airflow to trigger those Informatica jobs through their scripts instead, so you don't need to use Informatica as a scheduling engine. That's a good way to decouple your data pipelines from Informatica's scheduler. Airflow is a powerful tool that can help you automate your workflows and improve your efficiency.
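One way to wire that up, as a sketch only (the service, domain, folder, and workflow names are hypothetical, and the pmcmd flags should be checked against your Informatica version's documentation), is a BashOperator that invokes Informatica's pmcmd command-line tool, so Airflow owns the schedule while Informatica just executes:

```python
# Sketch: Airflow as the scheduler, Informatica as the executor.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="trigger_informatica",   # made-up DAG id
    start_date=datetime(2023, 1, 1),
    schedule="0 2 * * *",           # the "specific time" now lives in Airflow
    catchup=False,
) as dag:
    # pmcmd is Informatica PowerCenter's CLI; verify flags for your version.
    start_workflow = BashOperator(
        task_id="start_workflow",
        bash_command=(
            "pmcmd startworkflow -sv INT_SVC -d DOMAIN "
            "-u $INFA_USER -p $INFA_PASS -f MY_FOLDER -wait wf_daily_load"
        ),
    )
```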
What is most valuable?
Every feature in Apache Airflow is valuable. The operators and features I've used mainly relate to connectivity and integration services because I primarily work with GCP. Specifically, I've utilized the BigQuery connectors and operators, as well as the PythonOperator and other runnable operators like Bash. These common operators have been quite useful in my work.
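For instance (the project, dataset, and table names below are placeholders), the Google provider's BigQueryInsertJobOperator can run a SQL job straight from a DAG:

```python
# Hypothetical example: build a summary table with the BigQuery provider.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

with DAG(
    dag_id="bq_daily_summary",      # made-up DAG id
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    summarize = BigQueryInsertJobOperator(
        task_id="summarize",
        configuration={
            "query": {
                # Project, dataset, and table names are placeholders.
                "query": (
                    "CREATE OR REPLACE TABLE my_project.reporting.daily_summary AS "
                    "SELECT event_date, COUNT(*) AS events "
                    "FROM my_project.raw.events GROUP BY event_date"
                ),
                "useLegacySql": False,
            }
        },
    )
```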
Another thing that stands out is its ease of use for developers familiar with Python. They can simply write their code and set up their environment to run the code through the scheduling engine. It's quite convenient, especially for those in the data engineering field who are already well-versed in Python. They don't need to install any additional tools or perform complex environment setups. It's straightforward for them.
The graphical interface is good because every workflow is modeled as a DAG (Directed Acyclic Graph), which the UI renders so you can see task dependencies at a glance.
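To make that concrete, the graph the UI draws comes directly from the dependencies you declare in code; a small fan-out/fan-in like the following (task names invented) shows up as a diamond in the Graph view:

```python
# The >> operator declares the DAG edges that the UI visualizes.
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

with DAG(
    dag_id="graph_demo",            # made-up DAG id
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    extract = EmptyOperator(task_id="extract")
    clean = EmptyOperator(task_id="clean")
    validate = EmptyOperator(task_id="validate")
    publish = EmptyOperator(task_id="publish")

    # Fan out after extract, fan back in before publish.
    extract >> [clean, validate] >> publish
```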
What needs improvement?
One improvement could be the inclusion of a plugin with a drag-and-drop feature. This graphical feature would be beneficial when dealing with connectivity and integration services like connecting to BigQuery or other systems. As a first-time user, although the documentation is available, it would be more user-friendly to have a drag-and-drop interface within the portal. Users could simply drag and drop components to generate the underlying code, making it more flexible and intuitive.
Therefore, I suggest having a drag-and-drop feature for a more user-friendly experience and better code management.
Moreover, for admins, there should be improved logging capabilities. Apache Airflow does have logging, but it's limited to some database data. It would be better if all logs went to the server where Airflow is hosted and were surfaced at the interface level, so developers can quickly see when something goes wrong.
For how long have I used the solution?
I have been using Apache Airflow since 2014, so it's been over eight years. We currently use the latest version, 2.4.0.
What do I think about the stability of the solution?
Performance-wise, it's good. I've been using two versions, 2.0 and 2.4, and both are stable; the version we're on now is even more stable.
What do I think about the scalability of the solution?
It's pretty scalable. On the cloud, you get scalability and usability out of the box, so you don't need to worry about storage.
How was the initial setup?
The initial setup is easy if it's on the cloud, because everything is provisioned for you.
What was our ROI?
The ROI is very high. Most companies are adopting Apache Airflow, and it can be used for a wide variety of tasks, including pulling data, summarizing tables, and generating reports. Everything can be done in Python and integrated within Apache Airflow. The efficiency and ease of use it offers contribute to its high ROI.
Which other solutions did I evaluate?
I've been using Apache Airflow, but I haven't directly compared it with other scheduling tools available in the market. This is because each cloud platform has its own built-in scheduling tool. For instance, if we consider Azure, it has a service called Azure Data Factory, which serves as a scheduling engine.
When you compare Apache Airflow with services like Azure Data Factory and the equivalent offerings on AWS, you'll find that Airflow excels in various aspects. However, one area that could be improved is integration with hosting services. Airflow can currently be hosted on different platforms and machines, which offers flexibility but could be better streamlined for certain hosting services.
What other advice do I have?
Since I have been using Apache Airflow for six to seven years, I would confidently rate it a solid ten. We help customers re-design and implement projects using Apache Airflow, and approximately 90% of our work revolves around this powerful tool.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.