akashmittal18/Airflow


Airflow

Airflow is an open-source platform designed to programmatically author, schedule, and monitor workflows. It allows you to create and execute complex data pipelines and workflows that can involve multiple steps, dependencies, and sources of data.

At its core, Airflow provides a way to define a DAG (Directed Acyclic Graph) of tasks, which can be orchestrated and executed on a schedule or triggered manually. Each task in the DAG represents a specific operation or step in the workflow, and tasks can depend on one another, allowing you to create complex dependencies between tasks.

Airflow Task Life Cycle

image

A happy workflow execution process

image

Install Airflow on Linux

  1. Install Airflow using pip:
     pip install apache-airflow
  2. Initialize the Airflow database. After installing Airflow, you need to initialize its metadata database. On Airflow 2.7 and later this command is deprecated in favor of airflow db migrate:
     airflow db init
  3. Start the Airflow webserver and scheduler. Once the database is initialized, run each of the following commands in its own terminal, since both stay in the foreground:
     airflow webserver --port 8080
     airflow scheduler
The first command starts the webserver on port 8080 and the second starts the scheduler. Once both are running, you can access the Airflow web UI by opening a web browser and navigating to http://localhost:8080.
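To confirm that both processes came up, you can poll the webserver's health endpoint. The sketch below assumes an Airflow 2.x webserver on port 8080 that serves a JSON status document at /health; the helper names summarize_health and check_health are hypothetical, and the path may differ on other Airflow versions.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen

# Assumption: Airflow 2.x webserver started with --port 8080.
HEALTH_URL = "http://localhost:8080/health"


def summarize_health(payload):
    """Map each component in a health payload to its reported status."""
    return {
        name: info.get("status")
        for name, info in payload.items()
        if isinstance(info, dict)
    }


def check_health(url=HEALTH_URL, timeout=5):
    """Fetch the health document, or return an error entry if unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return summarize_health(json.load(response))
    except (URLError, OSError, ValueError) as exc:
        # Connection refused, timeout, or a non-JSON response body.
        return {"error": str(exc)}


if __name__ == "__main__":
    print(check_health())
```

If the scheduler is not running, its entry in the result should report a status other than healthy, which is a quick hint that the second command was never started.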

What is a DAG?

A DAG (Directed Acyclic Graph) is a collection of tasks that you want to execute, organized in a way that reflects their dependencies and relationships. A DAG is a fundamental concept in Airflow, as it represents the workflow that you want to automate or orchestrate.

A DAG consists of nodes and edges: nodes represent the tasks that need to be executed, and edges represent the dependencies between tasks. Each edge points from an upstream task to a downstream task, indicating that the downstream task depends on the successful completion of its upstream tasks. Because the graph is acyclic, no task can depend on itself, directly or through other tasks.
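The node and edge idea can be sketched in plain Python, without Airflow itself. This is only an illustration using the standard library's graphlib module (Python 3.9+); the task names and the execution_order helper are hypothetical, and Airflow's own scheduler is more involved than this.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key is a task (node), each value is the set of
# upstream tasks it depends on (the edges pointing into that node).
dependencies = {
    "extract": set(),
    "validate": {"extract"},
    "transform": {"extract"},
    "load": {"validate", "transform"},
}


def execution_order(deps):
    """Return one valid order in which every task follows its upstream tasks."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(execution_order(dependencies))
```

Here "validate" and "transform" do not depend on each other, so either may come first. Any order that places "extract" before both and "load" after both respects the edges.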

An example of a DAG.

image

An example of a graph that is not a DAG.

image
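What separates the two cases is whether the graph contains a cycle. As a rough sketch, the standard library's graphlib module (Python 3.9+) can detect this; the two graphs and the is_dag helper below are hypothetical and are not taken from the images above.

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(deps):
    """Return True if the task -> upstream-tasks mapping contains no cycle."""
    try:
        # prepare() raises CycleError when the graph has a cycle.
        TopologicalSorter(deps).prepare()
    except CycleError:
        return False
    return True


# Hypothetical graphs: "b" depends on "a", and "c" depends on "b".
acyclic = {"a": set(), "b": {"a"}, "c": {"b"}}
# Same tasks, but "a" now also depends on "c", closing the loop a -> b -> c -> a.
cyclic = {"a": {"c"}, "b": {"a"}, "c": {"b"}}

if __name__ == "__main__":
    print(is_dag(acyclic), is_dag(cyclic))  # prints: True False
```

In the cyclic graph no task can start first, because each one waits on another, which is why such a graph cannot be scheduled as a workflow.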

About

This repo contains the concepts of Apache Airflow and the practical implementation I'll be doing while learning.
