Apache Airflow Fundamentals Study Guide
Airflow Fundamentals
Introduction
Welcome to the Airflow Fundamentals Certification Exam study guide! We are very excited you have
decided to get certified with us. This guide will give you an overview of the certification exam to help you
determine how to study and when to take it. This guide covers the following sections:
• Preparation Expectations
• Exam Details
• Exam Topics
• Sample Questions
This guide covers the latest version of the exam, which was last updated on September 1, 2023. If you have
any questions or want to talk to our team, reach out to us at: [email protected].
Preparation Expectations
To pass the Airflow Fundamentals Certification, you must demonstrate an understanding of Apache Airflow's
core concepts, such as architecture, DAGs, the task lifecycle, and the scheduling process. You should be
comfortable recommending use cases, architectural needs, settings, and design choices for data pipelines.
You should be able to trigger, debug, and retry DAGs (and their associated tasks) and use the correct views
Exam Details
Cost: $150 US
Language: English
Important Notes:
◦ Once enrolled in the exam, you will have 30 days to complete it. After 30 days, the exam expires
and must be purchased again.
◦ Once the exam is completed (both the exam itself and any other modules), our badge vendor,
Credly, will issue your digital badge to the email associated with your Astronomer Academy
account.
Astronomer Academy: A large catalog of free Airflow courses taught by the Astronomer experts behind
the project.
◦ Airflow 101 Learning Path: A curated learning path that guides you through the foundational skills
and knowledge you need to start with Apache Airflow.
Exam Topics
The Airflow Fundamentals Certification Exam covers a variety of topics about Airflow. The exam randomizes
questions from a pool of more than 90 questions that are categorized by topic. This means that you may
encounter a different set of questions each time you take the exam, but all topics will be covered. Use the
learning outcomes below to guide your study and prepare for the exam.
Topic 1: Airflow Use Cases
• Given a specific scenario, identify if Airflow is an applicable solution.
Topic 2: Airflow Concepts
• Identify which folder the Airflow Scheduler parses when searching for new DAG files
• Identify what an Airflow provider is
• Identify what a DAG run is
• Identify the role of a worker in Airflow
• Identify which programming language Airflow primarily uses.
• Identify the purpose of an XCom
• Identify the purpose of a DAG
• Identify the purpose of the `default_args` DAG parameter
• Identify the default time zone of an Airflow instance
• Identify the role of an executor in Airflow
• Identify the core architectural components of Airflow
• Identify the typical journey of a task
• Identify what happens when two DAGs share the same `dag_id`
• Identify optional and non-optional DAG parameters.
• Identify what each of the task lifecycle stages does
• Identify valid ways to define a DAG in Airflow.
Topic 3: Dependencies
• Identify the purpose of task-level dependencies in Airflow
• Identify where DAG dependencies are set up in Airflow
• Compare and contrast DAG task dependency relationships for equivalency
• Match DAG task dependency graphs to their equivalent DAG dependency code.
Topic 4: Airflow CLI
• Identify the purpose of specific Airflow CLI commands:
◦ `airflow tasks test`
◦ `airflow db init`
◦ `airflow info`
◦ `airflow config list`
◦ `airflow cheat-sheet`
◦ `airflow variables`
◦ `airflow users`
◦ `airflow standalone`
◦ `airflow version`
• Identify the impact of using the `airflow tasks test` Airflow CLI command with a DAG that has an XCom.
Topic 5: Airflow UI
• Identify the most helpful Airflow UI view to use for real-world scenarios.
◦ Grid view
◦ Graph view
◦ Gantt view
◦ DAGs view
◦ Landing times view
◦ Tree view
◦ Calendar view
• Identify the default time for DAGs to appear in the Airflow UI
• Identify the result of deleting a DAG using the Airflow UI
• Identify the purpose of core Airflow UI components (e.g., the Last Run column)
• Given a specific Airflow UI, identify solutions to common issues.
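For the Topic 3 outcomes on dependency equivalency, it can help to see that different spellings of a dependency produce the same edges. The sketch below uses a hypothetical stand-in class, `Task`, that mimics only the `>>`, `<<`, `set_downstream`, and `set_upstream` behavior of Airflow operators. It is not Airflow's own implementation, the task names are made up, and it runs without Airflow installed.

```python
class Task:
    """Hypothetical stand-in for an Airflow operator (illustration only)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = set()  # task_ids that run after this task

    def set_downstream(self, other):
        self.downstream.add(other.task_id)

    def set_upstream(self, other):
        other.downstream.add(self.task_id)

    def __rshift__(self, other):  # self >> other
        self.set_downstream(other)
        return other  # returning the right-hand side allows a >> b >> c

    def __lshift__(self, other):  # self << other
        self.set_upstream(other)
        return other


def edges(tasks):
    """Return the dependency graph as sorted (upstream, downstream) pairs."""
    return sorted((t.task_id, d) for t in tasks for d in t.downstream)


def build(style):
    """Wire the same three tasks using one of three equivalent spellings."""
    a, b, c = Task("extract"), Task("transform"), Task("load")
    if style == "bitshift":
        a >> b >> c
    elif style == "reverse":
        c << b << a
    elif style == "methods":
        a.set_downstream(b)
        c.set_upstream(b)
    return edges([a, b, c])


# All three spellings describe the same graph.
print(build("bitshift") == build("reverse") == build("methods"))
```

When matching a dependency graph to code, reducing each candidate to its set of edges, as `edges` does here, is a quick way to check whether two snippets are equivalent.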
Topic 6: DAG Scheduling
• Identify the purpose of each DAG scheduling parameter:
◦ `catchup`
◦ `start_date`
◦ `end_date`
◦ `schedule_interval`
• Identify which tools/commands make it possible to backfill DAGs when the `catchup` parameter is set to `false`
• Identify the default value for the `start_date` parameter
• Identify the valid values a DAG can accept for its `schedule_interval` parameter
• Given a specific DAG scheduling goal (e.g., Schedule a DAG every day at 2 PM), identify the correct DAG scheduling parameter values to accomplish the goal
• Given specific DAG scheduling parameter values, identify if a specific scheduling goal will be accomplished
Topic 7: DAG Runs
• Given a specific DAG scheduling scenario or DAG code, identify the number of DAG runs that will occur.
Topic 8: Debugging
• Given a specific Airflow issue or completed DAG code, identify the cause of it.
Topic 9: XComs
• Identify the purpose of each XCom method:
◦ `xcom push`
◦ `xcom pull`
• Identify the limitations of using XComs
Topic 10: Operators
• Identify what a Transfer Operator does
• Identify what a Sensor Operator does
• Identify the purpose of the Airflow `PythonOperator`
Topic 11: Best Practices
• Given specific DAG code, identify ways to improve the code using Airflow best practices.
Topic 12: Connections
• Identify the different ways to create an Airflow connection
• Identify the correct way to create an Airflow connection in a `.env` file
• Given a specific Airflow connection string, identify the connection ID
Topic 13: Tasks
• Given DAG code, identify the number of tasks that will run when the DAG is scheduled.
Topic 14: Sensors
• Identify the default timeout value of a sensor
• Identify the mode to use in a DAG when the DAG’s `poke_interval` parameter value is set to specific durations
• Given the code for a sensor in a DAG, identify if the sensor is properly configured to accomplish a specific goal.
Topic 15: Variables
• Identify the purpose of an Airflow variable
• Identify the types of data that can be stored in a variable
• Identify the correct way to define a variable with a specific value in Airflow
• Identify how to fetch the value of an Airflow variable in specific formats (e.g., JSON)
• Given a specific variable name, identify if a variable will be visible in the Airflow UI
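The Topic 6 and Topic 7 outcomes on `start_date`, `catchup`, and counting DAG runs come down to one piece of arithmetic: a scheduled run is created only after its data interval has ended. The sketch below works through that arithmetic under stated assumptions: a `@daily` schedule, a `start_date` at midnight, no `end_date`, and a DAG that is unpaused for the first time at `now`. The helper `count_daily_runs` is hypothetical and is not part of Airflow.

```python
from datetime import datetime, timedelta


def count_daily_runs(start_date, now, catchup):
    """Estimate scheduled runs for a '@daily' DAG first unpaused at `now`."""
    one_day = timedelta(days=1)
    completed_intervals = 0
    interval_start = start_date
    # A daily interval counts only once its end has passed.
    while interval_start + one_day <= now:
        completed_intervals += 1
        interval_start += one_day
    if catchup:
        # catchup=True: one run per completed interval since start_date.
        return completed_intervals
    # catchup=False: only the most recent completed interval is run.
    return min(completed_intervals, 1)


start = datetime(2023, 9, 1)
now = datetime(2023, 9, 4, 8, 0)
print(count_daily_runs(start, now, catchup=True))   # intervals Sep 1-2, 2-3, 3-4
print(count_daily_runs(start, now, catchup=False))  # latest interval only
```

Note that no run exists until the first interval has fully elapsed, so a DAG with a `start_date` of today at midnight has zero scheduled runs until tomorrow.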
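For the Topic 12 outcomes, recall that Airflow can read a connection from an environment variable named `AIRFLOW_CONN_<CONN_ID>` whose value is a connection URI, which is the form a `.env` entry takes. The sketch below uses only the Python standard library to show how the connection ID and the URI parts can be read off such a line. The helper `parse_env_connection`, the connection name, and the credentials and host are all made up for illustration, and this is not how Airflow itself parses connections.

```python
from urllib.parse import urlsplit

PREFIX = "AIRFLOW_CONN_"


def parse_env_connection(line):
    """Split one `.env` line into a connection ID and its URI parts."""
    name, _, uri = line.strip().partition("=")
    if not name.startswith(PREFIX):
        raise ValueError("not an Airflow connection variable")
    parts = urlsplit(uri)
    return {
        # By convention the variable name is the upper-cased connection ID.
        "conn_id": name[len(PREFIX):].lower(),
        "conn_type": parts.scheme,
        "login": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "schema": parts.path.lstrip("/"),
    }


# Hypothetical `.env` entry; none of these values come from the study guide.
example = parse_env_connection(
    "AIRFLOW_CONN_MY_POSTGRES=postgres://user:pass@db.example.com:5432/analytics"
)
print(example["conn_id"], example["conn_type"], example["host"], example["port"])
```

Reading the ID from the variable name rather than from the URI is the key point: the URI carries the connection type, credentials, host, port, and schema, but not the connection ID.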
Sample Questions
To give you a sense of the types of questions on the exam, we are providing five sample questions. These
are modified versions of questions you might find on the exam.
The answer key can be found at the end of this section.
Sample Exam Question 4 - Debugging
Examine the following DAG:
with DAG(
    'example_dag',
    schedule_interval='@daily',
    catchup=False
):
    task_1 = PythonOperator(
    task_2 = PythonOperator(
Which of the following are issues with this DAG? (select all that apply)
a. The DAG is missing a `start_date` parameter
b. Both tasks are missing a `task_id`
c. The DAG has a cycle
d. The tasks are not assigned to the DAG
Answer Key
1. A
2. B
3. D
4. A & B
5. B