Tool

Airflow

What is Airflow?

Apache Airflow is a workflow management tool: it lets you plan, monitor, and manage workflows, and it is commonly used as a service orchestrator.

How does it work?

Airflow is used to automate jobs programmatically by breaking them into subtasks. It allows planning and monitoring from a centralized tool. The most common use cases are automating data ingestion, periodic maintenance actions, and administration tasks.

To do this, it lets you schedule jobs on a recurring basis, much as cron does, and also run them on demand. Jobs are defined as DAGs (Directed Acyclic Graphs): collections of tasks connected by relationships and dependencies that determine the order in which they run.
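To make the DAG idea concrete, here is a minimal sketch that uses only the Python standard library, not Airflow's own API. The task names and the run_workflow helper are hypothetical and exist only for illustration. It shows how declaring dependencies is enough to derive a valid execution order, and why the graph has to be acyclic.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical ingestion workflow: each key is a task, and each value is the
# set of tasks that must finish before that task can start.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"extract"},
    "load": {"transform", "validate"},
}


def run_workflow(deps, run_task):
    """Run every task once, respecting dependencies; return the order used."""
    order = []
    # static_order() yields tasks so that each one comes after everything it
    # depends on. It raises CycleError if the graph contains a cycle.
    for task in TopologicalSorter(deps).static_order():
        run_task(task)
        order.append(task)
    return order


executed = run_workflow(dependencies, lambda name: print(f"running {name}"))
```

In this sketch "extract" always runs first and "load" always runs last, while "transform" and "validate" may run in either order because neither depends on the other. A graph in which two tasks depend on each other has no valid order, so run_workflow raises CycleError for it.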

Use Cases

We must understand Airflow as a tool for coordinating work carried out by other services. It is very useful for managing workflows in Data Warehouses and Machine Learning pipelines.

The main focus of Airflow is batch processing: a finite series of tasks executed at set intervals or in response to triggers. Orchestrators for streaming jobs do exist, but Airflow is not the right tool for that kind of workload.
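As a rough illustration of what interval-based batch execution means, the sketch below lists the batch windows that have fully elapsed by a given moment. It is not Airflow's scheduler; the due_runs function and the dates are assumptions made up for this example.

```python
from datetime import datetime, timedelta


def due_runs(start, interval, now):
    """Return the start time of every batch window that has fully elapsed by `now`."""
    runs = []
    window_start = start
    # A window is only due once it has ended, so each run works on a
    # finite, already complete slice of data.
    while window_start + interval <= now:
        runs.append(window_start)
        window_start += interval
    return runs


# Hypothetical daily job that started on 1 January 2024, checked at noon on 4 January.
runs = due_runs(datetime(2024, 1, 1), timedelta(days=1), datetime(2024, 1, 4, 12))
for run in runs:
    print(run.date())
```

Each run covers a bounded window and then finishes, which is the opposite of a streaming job that processes events continuously with no defined end.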

Advantages

Apache Airflow allows us to define our own workflows to orchestrate services and maintain centralized control and monitoring.

In need of new tools?

Tekne provides Data Consulting, where we define a technological roadmap with you and guide you through it, aligning your company’s strategy with its objectives and its use of tools.