Apache Airflow Fundamentals Study Guide
Introduction
Welcome to the Apache Airflow Fundamentals Certification Exam study guide! We are very excited you have
decided to get certified with us. This guide gives you an overview of the certification exam.
This guide covers the latest version of the exam, which was last updated on September 1, 2023. If you have
any questions or want to talk to our team, reach out to us at: [email protected].
Preparation Expectations
This exam is designed for data practitioners who want to establish a strong foundation in Airflow. Whether
you're looking to enhance your skills, validate your expertise, or advance your career, this exam is tailored to you.
To attain this certification, you must demonstrate an understanding of Apache Airflow's core concepts, such
as architecture, DAGs, the task lifecycle, and the scheduling process. You should be comfortable
recommending use cases, architectural needs, settings, and design choices for data pipelines. You should be
able to trigger, debug, and retry DAGs (and their associated tasks) and use the correct views in the UI to
monitor them.
Leverage the additional resources mentioned in this guide to learn more about each topic.
Additionally, most individuals who successfully pass the exam have at least 3 months of Airflow experience.
Exam Details
Language: English
Important Notes:
◦ Once enrolled in the exam, you will have 30 days to complete it. After 30 days, the exam expires
and must be purchased again.
◦ Once the exam is completed (both the exam itself and any other modules), our badge vendor,
Credly, will issue your digital badge to the email associated with your Astronomer Academy
account.
Additional Resources
This guide is intended to be only one of the many resources you can use to prepare for the certification
exam. We also recommend leveraging the following resources:
Astronomer Academy: A large catalog of free Airflow courses taught by the Astronomer experts behind
the project.
◦ Airflow 101 Learning Path: A curated learning path that guides you through the foundational skills
and knowledge you need to start with Apache Airflow.
Astronomer Docs: The official place to learn everything you need to know about Astro and Apache
Airflow.
◦ Airflow Concepts: Learn about the fundamentals of Apache Airflow.
◦ Airflow Tutorials: Step-by-step guides for writing DAGs and running Airflow.
Airflow Community Resources: A collection of resources for the Airflow community, including a newsletter
and Slack.
Exam Topics
This exam covers a variety of topics about Apache Airflow. The exam randomizes questions from a pool of
more than 90 questions that are categorized by topic. This means that you may encounter a different set of
questions each time you take the exam, but all topics will be covered. Use the learning outcomes below to
guide your study and prepare for the exam.
Topic 6: Debugging
Given a specific Airflow issue or completed DAG code, identify the cause of it
Topic 7: XComs
Identify the purpose of each XCom method:
◦ xcom_push
◦ xcom_pull
Identify the limitations of using XComs
Topic 8: Airflow CLI
Identify the purpose of specific Airflow CLI commands:
◦ airflow tasks test
◦ airflow db init
◦ airflow info
◦ airflow config list
◦ airflow cheat-sheet
◦ airflow variables
◦ airflow users
◦ airflow standalone
◦ airflow version
Topic 9: Airflow Concepts
Identify which folder the Airflow Scheduler parses when searching for new DAG files
Identify what an Airflow provider is
Identify what a DAG run is
Identify the role of a worker in Airflow
Identify which programming language Airflow primarily uses
Identify the purpose of an XCom
Identify the purpose of a DAG
Identify the purpose of the default_args DAG parameter
Identify the default time zone of an Airflow instance
Identify the role of an executor in Airflow
Identify the core architectural components of Airflow
Identify the typical journey of a task
Identify what happens when two DAGs share the same dag_id value
Identify optional and non-optional DAG parameters
Identify what each of the task lifecycle stages does
Identify valid ways to define a DAG in Airflow
Topic 10: Best Practices
Given specific DAG code, identify ways to improve the code using Airflow best practices
Topic 11: Connections
Identify the different ways to create an Airflow connection
Identify the correct way to create an Airflow connection in a .env file
Given a specific Airflow connection string, identify the connection ID
Topic 12: Tasks
Given DAG code, identify the number of tasks that will run when the DAG is scheduled
Topic 13: Sensors
Identify the default timeout value of a sensor
Identify the mode to use in a DAG when the DAG’s poke_interval parameter value is set to a specific duration
Given the code for a sensor in a DAG, identify if the sensor is properly configured to accomplish a specific goal
Topic 14: Variables
Identify the purpose of an Airflow variable
Identify the types of data that can be stored in a variable
Identify the correct way to define a variable with a specific value in Airflow
Identify how to fetch the value of an Airflow variable in specific formats (e.g., JSON)
Given a specific variable name, identify if a variable will be visible in the Airflow UI
Topic 15: Airflow UI
Identify the most helpful Airflow UI view to use for real-world scenarios
Identify the default time for DAGs to appear in the Airflow UI
Identify the result of deleting a DAG using the Airflow UI
Identify the purpose of core Airflow UI components (e.g., the Last Run column)
Given a specific Airflow UI, identify solutions to common issues
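For the Connections and Variables topics, it can help to see how environment variable names relate to connection IDs and variable keys. Airflow checks environment variables named AIRFLOW_CONN_<CONN_ID> for connections and AIRFLOW_VAR_<KEY> for variables, with the ID or key in upper case. The sketch below is plain Python, not Airflow code. The helper names, the my_postgres ID, the api_key key, and the sample URI are hypothetical and only illustrate the naming rule.

```python
# Study sketch only: these helpers are hypothetical and are not part of Airflow.

CONN_PREFIX = "AIRFLOW_CONN_"
VAR_PREFIX = "AIRFLOW_VAR_"


def connection_env_var(conn_id: str) -> str:
    """Name of the environment variable Airflow checks for a connection ID."""
    return f"{CONN_PREFIX}{conn_id.upper()}"


def variable_env_var(key: str) -> str:
    """Name of the environment variable Airflow checks for a variable key."""
    return f"{VAR_PREFIX}{key.upper()}"


def conn_id_from_env_line(line: str) -> str:
    """Read the connection ID from one NAME=VALUE line of a .env file.

    Airflow upper-cases the requested ID when it looks up the variable, so the
    lower-case form returned here is a convention, not the only ID that matches.
    """
    name, _, _value = line.partition("=")
    name = name.strip()
    if not name.startswith(CONN_PREFIX):
        raise ValueError(f"not a connection definition: {name!r}")
    return name[len(CONN_PREFIX):].lower()


if __name__ == "__main__":
    # Hypothetical .env entry: the URI after '=' holds type, login, host, and so on.
    env_line = "AIRFLOW_CONN_MY_POSTGRES=postgres://user:pw@host:5432/db"
    print(conn_id_from_env_line(env_line))
    print(connection_env_var("my_postgres"))
    print(variable_env_var("api_key"))
```

Connections and variables defined this way are read from the environment rather than from the metadata database, so they are not listed in the Airflow UI.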
Sample Questions
To give you a sense of the types of questions that will be asked on the exam, we are providing five
sample questions. These questions are modified versions of similar questions you might find on the exam.
What would be the Cron value of the schedule_interval parameter of a DAG if it needed to be triggered
a. 0 */2 * * 0,6
b. 0 2 * * 6,7
c. 0 */2 * * 6,0
d. 0 0/2 * * 6,7
a. To execute tasks.
b. To both trigger scheduled workflows and submit tasks to the executor to run.
b. a >> bc >> d
c. a >> [b,c,d]
Sample Exam Question 4 - Debugging
Examine the following DAG:
with DAG(
    'example_dag',
    schedule_interval='@daily',
    catchup=False
):
    task_1 = PythonOperator(
    task_2 = PythonOperator(
    task_2 << task_1
Which of the following are issues with this DAG? (select all that apply)
a. The DAG is missing a start_date parameter
b. Both tasks are missing a task_id
c. The DAG has a cycle
d. The tasks are not assigned to the DAG
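Option c refers to a cycle, meaning a set of dependencies that loops back on itself so that no valid run order exists. As a rough illustration outside Airflow, the sketch below models each `upstream >> downstream` dependency as a pair of task IDs and uses Python's standard graphlib module to check whether an ordering exists. The has_cycle helper and the task names are hypothetical, and this is not how Airflow itself performs the check.

```python
from graphlib import CycleError, TopologicalSorter


def has_cycle(edges):
    """Return True if the (upstream, downstream) pairs contain a cycle.

    Each pair stands for a dependency written as `upstream >> downstream`.
    """
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {}
    for upstream, downstream in edges:
        graph.setdefault(upstream, set())
        graph.setdefault(downstream, set()).add(upstream)
    try:
        # static_order() is lazy, so consume it to force the cycle check.
        list(TopologicalSorter(graph).static_order())
    except CycleError:
        return True
    return False


if __name__ == "__main__":
    # One edge: a valid order exists.
    print(has_cycle([("task_1", "task_2")]))
    # Two edges pointing at each other: no valid order exists.
    print(has_cycle([("task_1", "task_2"), ("task_2", "task_1")]))
```

A graph with a cycle cannot be a DAG, because "acyclic" is part of the definition of a directed acyclic graph.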
a. 0
b. 1
c. 4
d. 5
Answer Key
1. A
2. B
3. D
4. B & C
5. B
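A cron expression such as answer A in sample question 1 has five space-separated fields: minute, hour, day of month, month, and day of week, where day of week 0 is Sunday and 6 is Saturday. The sketch below expands a small subset of cron syntax (`*`, `*/n`, single values, and comma-separated lists) so you can see which values each field matches. It is a study aid with hypothetical helper names, not the parser Airflow uses, and it does not handle ranges or month and weekday names.

```python
# Field name plus the lowest and highest value allowed in that field.
CRON_FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 6),  # 0 = Sunday, 6 = Saturday
]


def expand_field(field: str, low: int, high: int) -> list[int]:
    """Expand one cron field into the sorted list of values it matches."""
    values = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(low, high + 1))
        elif part.startswith("*/"):
            step = int(part[2:])
            values.update(range(low, high + 1, step))
        else:
            values.add(int(part))
    return sorted(values)


def describe_cron(expression: str) -> dict[str, list[int]]:
    """Map each of the five cron fields to the values it matches."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected five space-separated fields")
    return {
        name: expand_field(part, low, high)
        for part, (name, low, high) in zip(parts, CRON_FIELDS)
    }


if __name__ == "__main__":
    for name, values in describe_cron("0 */2 * * 0,6").items():
        print(name, values)
```

For `0 */2 * * 0,6`, the minute field matches only 0, the hour field expands to every even hour from 0 to 22, and the day-of-week field matches 0 and 6.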