Apache Airflow Workflow

Uploaded by

shubhammagar785

What is Apache Airflow?

• Open-source platform for orchestrating workflows and scheduling tasks.
• Built on Python; uses Directed Acyclic Graphs (DAGs) to define workflows.
• Key Features:
  - Scalable and extensible.
  - Supports task dependencies and parallel execution.
  - User-friendly UI for monitoring and managing workflows.
How Apache Airflow Works

• Core Components:
  - DAG (Directed Acyclic Graph): Blueprint of the workflow.
  - Operators: Define actions (e.g., PythonOperator, BashOperator).
  - Scheduler: Manages task execution based on intervals or triggers.
  - Executor: Executes tasks (e.g., LocalExecutor, CeleryExecutor).
  - Metadata Database: Stores DAG and task state.
Fetching Data from Different Sources

• Supported Data Sources:
  - Databases (MySQL, PostgreSQL, etc.).
  - Cloud storage (AWS S3, GCS, Azure Blob Storage).
  - APIs and web services.
  - Local/remote file systems.

• Implementation Steps:
  1. Use the appropriate Operator or Hook for each source (e.g., a SQL operator for databases, S3Hook for object storage).
Example Workflow

• Scenario: Fetch data from an API, process it, and store it in a database.

• DAG Overview:
  1. Task 1: Fetch data from the API using HttpOperator.
  2. Task 2: Transform the data with Python using PythonOperator.
  3. Task 3: Load the data into a database with PostgresOperator.
