
Airflow Chapter4

The document provides an introduction to using templates in Apache Airflow. It explains that templates allow substituting information during DAG runs and provide flexibility when defining tasks. It provides examples of templated BashOperator tasks where the filename parameter is templated, allowing the task to operate on different files. It also demonstrates using Jinja templating syntax like loops to iterate over lists. The document concludes with an overview of concepts taught in the Airflow introduction course.


Working with templates
INTRODUCTION TO AIRFLOW IN PYTHON

Mike Metzger
Data Engineer
What are templates?
Templates:

Allow substituting information during a DAG run

Provide added flexibility when defining tasks

Are created using the Jinja templating language

Non-Templated BashOperator example
Create a task to echo a list of files:

from airflow.operators.bash_operator import BashOperator

# One hard-coded task per file
t1 = BashOperator(
    task_id='first_task',
    bash_command='echo "Reading file1.txt"',
    dag=dag)
t2 = BashOperator(
    task_id='second_task',
    bash_command='echo "Reading file2.txt"',
    dag=dag)

Templated BashOperator example
templated_command="""
echo "Reading {{ params.filename }}"
"""
# params values are substituted into the template when the task runs
t1 = BashOperator(task_id='template_task',
                  bash_command=templated_command,
                  params={'filename': 'file1.txt'},
                  dag=example_dag)

Output:

Reading file1.txt

Templated BashOperator example (continued)
templated_command="""
echo "Reading {{ params.filename }}"
"""
t1 = BashOperator(task_id='template_task',
                  bash_command=templated_command,
                  params={'filename': 'file1.txt'},
                  dag=example_dag)
# Same template, different params; task ids must be unique within a DAG
t2 = BashOperator(task_id='template_task_2',
                  bash_command=templated_command,
                  params={'filename': 'file2.txt'},
                  dag=example_dag)

More templates
Quick task reminder
Take a list of filenames

Print "Reading <filename>" to the log / output

Templated version:

templated_command="""
echo "Reading {{ params.filename }}"
"""
t1 = BashOperator(task_id='template_task',
                  bash_command=templated_command,
                  params={'filename': 'file1.txt'},
                  dag=example_dag)

More advanced template
templated_command="""
{% for filename in params.filenames %}
echo "Reading {{ filename }}"
{% endfor %}
"""
# A single task now handles every file in the list
t1 = BashOperator(task_id='template_task',
                  bash_command=templated_command,
                  params={'filenames': ['file1.txt', 'file2.txt']},
                  dag=example_dag)

Reading file1.txt
Reading file2.txt
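
To see what the template engine does with the loop above, the same string can be rendered with the jinja2 package directly, outside Airflow. This is only a minimal sketch: it assumes jinja2 is installed (Airflow depends on it) and passes params by hand, whereas Airflow supplies params and the other context variables itself when the task runs.

```python
from jinja2 import Template

# Same template string as in the slide above.
templated_command = """
{% for filename in params.filenames %}
echo "Reading {{ filename }}"
{% endfor %}
"""

# Airflow would supply `params` at run time; here it is passed by hand.
rendered = Template(templated_command).render(
    params={"filenames": ["file1.txt", "file2.txt"]}
)

# Each loop iteration leaves one echo command in the rendered script;
# the blank lines left behind by the loop tags are dropped here.
commands = [line.strip() for line in rendered.splitlines() if line.strip()]
print(commands)
```

The rendered script holds one echo command per filename, which is why a single task prints both lines.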

Variables
Airflow built-in runtime variables

Provide assorted information about DAG runs, tasks, and even the system configuration.

Examples include:

Execution Date: {{ ds }} # YYYY-MM-DD
Execution Date, no dashes: {{ ds_nodash }} # YYYYMMDD
Previous Execution Date: {{ prev_ds }} # YYYY-MM-DD
Previous Execution Date, no dashes: {{ prev_ds_nodash }} # YYYYMMDD
DAG object: {{ dag }}
Airflow config object: {{ conf }}

1 https://airflow.apache.org/docs/stable/macros-ref.html
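
The date variables above are plain strings derived from the run's execution date. As a rough illustration only, the sketch below reproduces the four formats with the standard library. The names execution_date and interval are made-up sample values and a daily schedule is assumed; Airflow computes the real values from the DAG's own schedule.

```python
from datetime import datetime, timedelta

# Hypothetical sample values; Airflow derives these from the DAG run.
execution_date = datetime(2020, 4, 15)
interval = timedelta(days=1)  # assumption: the DAG runs daily

previous_date = execution_date - interval

ds = execution_date.strftime("%Y-%m-%d")            # format of {{ ds }}
ds_nodash = execution_date.strftime("%Y%m%d")       # format of {{ ds_nodash }}
prev_ds = previous_date.strftime("%Y-%m-%d")        # format of {{ prev_ds }}
prev_ds_nodash = previous_date.strftime("%Y%m%d")   # format of {{ prev_ds_nodash }}

print(ds, ds_nodash, prev_ds, prev_ds_nodash)
```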

Macros
In addition to others, there is also a {{ macros }} variable.

This is a reference to the Airflow macros package which provides various useful objects /
methods for Airflow templates.

{{ macros.datetime }} : The datetime.datetime object

{{ macros.timedelta }} : The timedelta object

{{ macros.uuid }} : Python's uuid object

{{ macros.ds_add('2020-04-15', 5) }} : Add days to a date (a negative number subtracts); this example returns 2020-04-20
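
The date arithmetic behind ds_add can be sketched with the standard library. The function ds_add_sketch below is a hypothetical stand-in written for illustration, not Airflow's own code; inside a template you would keep calling macros.ds_add.

```python
from datetime import datetime, timedelta


def ds_add_sketch(ds: str, days: int) -> str:
    """Stand-in for macros.ds_add: shift a YYYY-MM-DD string by `days` days."""
    shifted = datetime.strptime(ds, "%Y-%m-%d") + timedelta(days=days)
    return shifted.strftime("%Y-%m-%d")


print(ds_add_sketch("2020-04-15", 5))   # 2020-04-20
print(ds_add_sketch("2020-04-15", -5))  # 2020-04-10
```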

Branching
Branching in Airflow:

Provides conditional logic

Using BranchPythonOperator

from airflow.operators.python_operator import BranchPythonOperator

Takes a python_callable to return the next task id (or list of ids) to follow

Branching example
def branch_test(**kwargs):
    if int(kwargs['ds_nodash']) % 2 == 0:
        return 'even_day_task'
    else:
        return 'odd_day_task'

INTRODUCTION TO AIRFLOW IN PYTHON


Branching example
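def branch_test(**kwargs):
    # ds_nodash is the execution date as a YYYYMMDD string
    if int(kwargs['ds_nodash']) % 2 == 0:
        return 'even_day_task'
    else:
        return 'odd_day_task'

# provide_context=True passes the runtime context to the callable as kwargs
branch_task = BranchPythonOperator(task_id='branch_task', dag=dag,
                                   provide_context=True,
                                   python_callable=branch_test)

# Each branch is its own chain downstream of branch_task
start_task >> branch_task >> even_day_task >> even_day_task2
branch_task >> odd_day_task >> odd_day_task2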
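
Because the callable is an ordinary Python function, it can be checked without a scheduler by passing it a hand-built context. A small sketch, where the dates are arbitrary sample values and only the ds_nodash key is supplied (Airflow passes many more keys at run time):

```python
def branch_test(**kwargs):
    # ds_nodash is a YYYYMMDD string, so its parity follows the last digit of the day.
    if int(kwargs["ds_nodash"]) % 2 == 0:
        return "even_day_task"
    return "odd_day_task"


# Hand-built contexts for two sample dates (hypothetical values).
print(branch_test(ds_nodash="20200414"))  # even_day_task
print(branch_test(ds_nodash="20200415"))  # odd_day_task
```

The returned string must match the task_id of a task directly downstream of the branch task, so a check like this also catches misspelled ids early.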

Branching graph view

Branching even days

Branching odd days

Creating a production pipeline
Running DAGs & Tasks
To run a specific task from the command line:

airflow run <dag_id> <task_id> <date>

To run a full DAG:

airflow trigger_dag -e <date> <dag_id>

Operators reminder
BashOperator - expects a bash_command

PythonOperator - expects a python_callable

BranchPythonOperator - requires a python_callable and provide_context=True. The callable must accept **kwargs.

FileSensor - requires a filepath argument and might need the mode or poke_interval attributes

Template reminders
Many objects in Airflow can use templates

Certain fields may use templated strings, while others do not

One way to check is to use built-in documentation:

1. Open a python3 interpreter

2. Import the necessary libraries (e.g., from airflow.operators.bash_operator import BashOperator)

3. At the prompt, run help(<Airflow object>), e.g., help(BashOperator)

4. Look for a line that references template_fields. It lists the arguments that can use templates.
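
The same check can be made in code by reading the template_fields attribute from the class. The sketch below is illustrative only: templated_arguments is a hypothetical helper, and StandInOperator is a made-up class (with assumed field names) used so the snippet runs without Airflow installed.

```python
def templated_arguments(operator_class):
    """Return the argument names an operator class declares as templated."""
    return tuple(getattr(operator_class, "template_fields", ()))


# Hypothetical stand-in so the sketch runs without Airflow installed;
# the field names listed here are assumptions for illustration.
class StandInOperator:
    template_fields = ("bash_command", "env")


print(templated_arguments(StandInOperator))
# With Airflow available, the equivalent call is templated_arguments(BashOperator).
```

A class that declares no template_fields yields an empty tuple, meaning none of its arguments are rendered as templates.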

Template documentation example

Congratulations!
What we've learned
Workflows / DAGs
Operators (BashOperator, PythonOperator, EmailOperator)
Tasks
Dependencies / Bitshift operators
Sensors
Scheduling
SLAs / Alerting
Templates
Branching
Airflow command line / UI
Airflow executors
Debugging / Troubleshooting

Next steps
Set up your own environment for practice

Look into other operators / sensors

Experiment with dependencies

Look into parts of Airflow we didn't cover:

XCom

Connections

Refer to the docs for more

Keep building workflows!

Thank you!