
Best Practices of Apache Airflow

For each area below, the things to be considered and the corresponding best practices are listed.

Configuration Management

• Pay attention to how built-ins such as Connections and Variables are defined (see the sketch after this list).

• There are also non-Python tools present in Airflow; do not forget about their usability either.

• Aim for a single source of configuration.
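
A minimal sketch of the first and third points, assuming Airflow 2.x: Connections and Variables are defined once as environment variables (one source of configuration) and read from inside the DAG code. The connection id my_postgres and the variable reporting_schema are hypothetical names.

    # Defined once in the deployment (e.g. a manifest or .env file), so the
    # scheduler, workers and webserver share a single source of configuration:
    #   AIRFLOW_CONN_MY_POSTGRES=postgresql://user:pass@db-host:5432/analytics
    #   AIRFLOW_VAR_REPORTING_SCHEMA=reporting

    from airflow.hooks.base import BaseHook
    from airflow.models import Variable

    # Read the built-ins inside a task or DAG file.
    conn = BaseHook.get_connection("my_postgres")   # resolves AIRFLOW_CONN_MY_POSTGRES
    schema = Variable.get("reporting_schema")       # resolves AIRFLOW_VAR_REPORTING_SCHEMA
    print(conn.host, schema)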

Fabricating and Cutting the Directed Acyclic Graph

• There should be one DAG per data source, one DAG per project, and one DAG per data sink.

• The code (e.g. query statements) should be kept in template files.

• For Hive, use Hive (HQL) template files.

• To locate templates, the template search path (template_searchpath) is used, as shown in the sketch after this list.

• The template files are kept “Airflow agnostic.”
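
A minimal sketch of these points, assuming a recent Airflow 2.x release with the apache-hive provider installed; the DAG id, the template directory and aggregate_sales.hql are hypothetical.

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.apache.hive.operators.hive import HiveOperator

    with DAG(
        dag_id="daily_sales_aggregation",            # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        # Where Airflow looks for template files, so the HQL can live
        # outside the DAG file and stay "Airflow agnostic".
        template_searchpath="/opt/airflow/templates",
        catchup=False,
    ) as dag:
        aggregate = HiveOperator(
            task_id="aggregate_sales",
            # aggregate_sales.hql is found via template_searchpath and
            # rendered with Jinja (e.g. {{ ds }}) before it runs.
            hql="aggregate_sales.hql",
        )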

Generating Extensions and Plugins


• Writing plugins and extensions is easy, and it is often needed.

• Extension points to consider are operators, hooks, executors, macros, and UI adaptations (views, links).

• Writing plugins and extensions should start from existing classes, which are then adapted; see the sketch below.
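
A minimal sketch of starting from an existing class, assuming Airflow 2.x; HttpToHdfsOperator, its parameters and the logged message are hypothetical.

    from airflow.models.baseoperator import BaseOperator

    class HttpToHdfsOperator(BaseOperator):
        """Custom operator built by extending BaseOperator and adapting it."""

        # Fields rendered with Jinja before execute() runs.
        template_fields = ("endpoint",)

        def __init__(self, endpoint: str, hdfs_path: str, **kwargs):
            super().__init__(**kwargs)
            self.endpoint = endpoint
            self.hdfs_path = hdfs_path

        def execute(self, context):
            # Real transfer logic (fetch from HTTP, write to HDFS) would go
            # here; this sketch only logs what it would do.
            self.log.info("Would copy %s to %s", self.endpoint, self.hdfs_path)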

Generating and Expanding Workflows

• For this point, workflow development should be considered at three levels: the personal level, the integration level, and the productive (production) level.

• The personal level is handled by data engineers or data scientists, and at this level, testing should be done with “airflow test” (see the example after this list).

• At the integration level, performance testing and integration testing are considered.

• At the productive level, monitoring is handled.
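
At the personal level, a single task run can be exercised from the command line (airflow test in Airflow 1.x, airflow tasks test <dag_id> <task_id> <date> in Airflow 2.x). A minimal sketch of an automated check that fits the integration level, assuming pytest and a dags/ folder; the folder path is hypothetical.

    # DAG-integrity test: fails if any DAG file in the folder cannot be imported.
    from airflow.models import DagBag

    def test_dagbag_imports_cleanly():
        dag_bag = DagBag(dag_folder="dags/", include_examples=False)  # hypothetical path
        assert dag_bag.import_errors == {}, f"Import errors: {dag_bag.import_errors}"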

Integrating with the Enterprise

• The existing workflow tools should be considered for scheduling.

• Airflow already contains tools for such integration; considering them is a good practice (a sketch follows).
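
One common way to let an existing enterprise scheduler hand work over to Airflow is a file handshake using a sensor that ships with Airflow. A minimal sketch, assuming a recent Airflow 2.x release; the DAG id and the flag-file path are hypothetical.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.sensors.filesystem import FileSensor

    with DAG(
        dag_id="enterprise_handoff",              # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule=None,                            # the external tool decides when work arrives
        catchup=False,
    ) as dag:
        # Wait for the flag file the external workflow tool drops when its
        # part of the chain is finished.
        wait_for_upstream = FileSensor(
            task_id="wait_for_upstream",
            filepath="/data/handoff/ready.flag",  # hypothetical path
            poke_interval=60,
        )

        process = BashOperator(
            task_id="process",
            bash_command="echo 'upstream finished, Airflow takes over'",
        )

        wait_for_upstream >> process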
