Apache Airflow 50
Anil Patel
Architect-Data Engineering & Analytics
Career Transition Coach
Airflow DAG
DAGs (Directed Acyclic Graphs) - a sequence of tasks without loops, allowing users to define complex
workflows, including scheduling and logic.
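The "without loops" property can be illustrated outside Airflow with a minimal sketch using Python's standard-library graphlib module (Python 3.9+). The task names and the helpers run_order and has_cycle are hypothetical and chosen only for illustration; this is not how Airflow itself stores or validates a DAG.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical ETL tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

def run_order(deps):
    """Return one valid execution order for the tasks in deps."""
    return list(TopologicalSorter(deps).static_order())

def has_cycle(deps):
    """Return True if the dependency graph contains a loop."""
    try:
        run_order(deps)
    except CycleError:
        return True
    return False

order = run_order(dependencies)
print(order)

# Making "extract" wait on "report" closes a loop, so this is no longer a DAG.
looped = {**dependencies, "extract": {"report"}}
print(has_cycle(looped))
```

Because every task has all of its prerequisites earlier in the order, a scheduler can always decide what to run next; once a loop is introduced, no such order exists.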
Python program DAG (Part 1/2 and Part 2/2)
Microsoft Azure -
Astronomer -
A managed Airflow service that allows data
teams to build, run, and manage data
pipelines as code. Astronomer can run Airflow
on AWS, GCP, Azure, or on-premises.
Introduction to
Apache Airflow
INTRODUCTION TO APACHE AIRFLOW IN PYTHON
Data Engineer
What is data engineering?
Data engineering is:
Taking any action involving data and turning it into a reliable, repeatable, and maintainable
process.
What is a workflow?
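A workflow is:
A set of steps to accomplish a given data engineering task.
What is Airflow?
Airflow is a platform to program workflows, including:
Creation
Scheduling
Monitoring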
Airflow continued...
Can implement programs from any
language, but workflows are written in
Python
Implements workflows as DAGs: Directed
Acyclic Graphs
Accessed via code, command-line, or via
web interface / REST API
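The idea that the workflow is written in Python while individual steps may be programs in any language can be sketched with the standard library alone. This is an illustration under stated assumptions, not Airflow's API: the functions extract, transform, load and run_workflow are hypothetical, and sys.executable is used only as a portable stand-in for an arbitrary external binary. In Airflow, operators such as BashOperator play the role of the subprocess call.

```python
import subprocess
import sys

def extract():
    """Run an external program and capture what it prints."""
    # sys.executable is only a portable stand-in for any external binary.
    result = subprocess.run(
        [sys.executable, "-c", "print('3,5,8')"],
        capture_output=True,
        text=True,
        check=True,  # raise if the external program exits with an error
    )
    return result.stdout.strip()

def transform(raw):
    """Parse the comma-separated output into integers."""
    return [int(x) for x in raw.split(",")]

def load(values):
    """Stand-in for writing to a database: just aggregate the values."""
    return sum(values)

def run_workflow():
    """The workflow itself is plain Python that wires the steps together."""
    return load(transform(extract()))

total = run_workflow()
print(total)
```

The orchestration logic stays in one language, while each step is free to call whatever tool is best suited to the job.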
1 https://airflow.apache.org/docs/stable/
Other workflow tools
Other tools:
ADF
SSIS
Dagster
Prefect
Mage
Apache Oozie
Informatica
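from datetime import datetime
from airflow import DAG

# Define the DAG; start_date is passed to its tasks through default_args
etl_dag = DAG(
    dag_id='etl_pipeline',
    default_args={"start_date": datetime(2024, 1, 8)},
)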
Airflow DAGs are written in Python (but can use components written in other languages).
Mike Metzger
Data Engineer
DAGs view