DBT Developer Guide
Table of Contents:
DBT Cloud UI Access
GIT Repository Setup
1. New Repository creation
1.1. Service API
1.2. GITHUB Cisco Portal
2. SSH Key Setup
DBT Cloud UI Setup
1. Access to DBT Projects
2. Configure Snowflake for DBT Projects
3. DBT Environment Variables Setup
4. dbt_project.yml setup
5. packages.yml setup
6. sources.yml setup
Develop DBT Models
1. {{config ()}} Jinja function
2. Sources
2.1. Sources from .yml file
2.2. ref () function
2.3. this
3. Multi-Column unique key in Incremental Models
4. Usage of dummy updates in pre/post hooks
5. Common Table Expression (CTE)
6. Tags
Macros using jinja
Control Table Setup
DBT Statements
Type Cast Column types
COMMIT and CHECK IN Code to GIT
DBT Docs
DBT Jobs
CONTROL M Setup
1. Steps to Install Python3.7
2. Steps to Activate Python3.7 in virtual environment
3. Command to execute DBT job from Control-M
4. Entry of DBT job information in “_JOBS” table
References
Ex: Below is the screenshot of the deploy key for the Asset View project, with project
number 25 and repository ID 34.
For Ex: Deploy key setup for the Asset View project in the GIT repository.
Note: The deploy keys must match in the DBT UI and GIT. This key is used for all merge
operations.
For each project, the developer needs to configure the Snowflake database connection and
connect to the Snowflake account using SSO on the Profile tab.
4. dbt_project.yml setup
The dbt_project.yml file holds all the configurations required for a dbt project.
We can define environment variables in dbt_project.yml that act as global variables
and can be referenced by any model.
For Ex: Defining a global environment variable to use the TS3 environment. This variable
can be referenced by all models under the dbt project.
Ex: Configuring models globally as materialized = table and transient = false in
dbt_project.yml. These settings become the default for all the models under the dbt project.
These configurations can be overridden in a model using the config () Jinja macro.
5. packages.yml setup
The DBT Platform team maintains all the source tables (SS, BR and BR_VIEWS) that
are available in Snowflake. Use the package below to download the source tables. The
source tables are downloaded to the dbt_packages folder.
6. sources.yml setup
Source tables required for a dbt project can be defined in sources.yml files.
In the models, source tables are referenced using the {{source ()}} Jinja function.
Ex 1: Reference database sources.yml files that are maintained as a local copy.
Ex 2: Work Interim tables that need to be sourced in the models, and that are not created
by any dbt model, can be added to sources.yml files.
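As a minimal sketch of the {{source ()}} reference described above, assume sources.yml declares a source named wi_interim with a table named order_staging. Both names, and the column names, are hypothetical and not taken from this guide. A model could then select from that source like this:

```sql
-- models/orders_from_interim.sql (hypothetical model name)
-- Assumes sources.yml declares source 'wi_interim' with table 'order_staging'.
select
    order_id,
    order_status,
    updated_at
from {{ source('wi_interim', 'order_staging') }}
```

dbt resolves the source () call to the fully qualified table name at compile time and records the dependency in the lineage graph.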
Ex 1: Below is the config () setup to change the model’s materialized property from
table to incremental, and its schema from WI to W.
Note: The configurations that are set up in a model apply to that model only.
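A config () block of the kind described above might look like the following sketch. The source and column names are hypothetical; only the materialized and schema settings come from the text.

```sql
-- Overrides the project-level defaults for this model only.
-- Whether schema='W' resolves to W or to WI_W depends on the project's
-- custom schema macro.
{{ config(
    materialized='incremental',
    schema='W'
) }}

select
    order_id,
    order_status,
    updated_at
from {{ source('wi_interim', 'order_staging') }}
```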
2. Sources
In a model, we can source tables from
- Actual sources from .yml file.
- ref () function
- this
2.3. this
this is a relation that represents the current model. It is equivalent to
ref ('current model'), but using this avoids a circular dependency in the model.
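A common place to use this is the filter of an incremental model, where the model needs to read its own existing rows. The sketch below assumes a hypothetical source and an updated_at column:

```sql
{{ config(materialized='incremental', unique_key='order_id') }}

select
    order_id,
    order_status,
    updated_at
from {{ source('wi_interim', 'order_staging') }}

{% if is_incremental() %}
-- On incremental runs, load only rows newer than what the model already holds.
-- The model reads its own table through this, so no ref () to itself is needed.
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```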
Since a CTE is a temporary data set, we cannot use time travel on CTEs.
Ex: Defining a CTE with a temporary data set and using that CTE in the select query of a
model
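As an illustration of a CTE used inside a model, the following sketch defines one temporary data set and selects from it. The referenced model stg_orders and its columns are hypothetical.

```sql
-- recent_orders exists only for the duration of this query.
with recent_orders as (
    select
        customer_id,
        order_id,
        order_date
    from {{ ref('stg_orders') }}
    where order_date >= dateadd(day, -30, current_date)
)

select
    customer_id,
    count(order_id) as order_count
from recent_orders
group by customer_id
```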
6. Tags
Tags can be defined in dbt_project.yml as well as in the model config. A tag is used to run
the models that are tagged with it.
To run the dependent data lineage models defined under one tag, use the
command dbt run --select tag:my_tag
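A tag set at model level could look like the sketch below. The tag name my_tag matches the command above; the referenced model and its columns are hypothetical.

```sql
-- This model is picked up by: dbt run --select tag:my_tag
{{ config(
    materialized='table',
    tags=['my_tag']
) }}

select
    customer_id,
    order_id
from {{ ref('stg_orders') }}
```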
Reference of the above macro in a model: ‘1’ represents the number of columns you
want to group by.
DBT Macro:
The call procedure takes the input parameters (database name, schema, table name and
stage (pre/post)).
The above macro can be referenced in the model as a pre/post hook as shown below.
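The hook reference could be sketched as follows. The macro name call_audit_proc is hypothetical and stands in for the project's own macro, which is not reproduced here. The four arguments follow the parameter list described above; the select is a placeholder.

```sql
-- call_audit_proc is a hypothetical macro name; replace it with the project's macro.
-- this.database, this.schema and this.identifier describe the current model.
{{ config(
    pre_hook="{{ call_audit_proc(this.database, this.schema, this.identifier, 'pre') }}",
    post_hook="{{ call_audit_proc(this.database, this.schema, this.identifier, 'post') }}"
) }}

select
    order_id,
    order_status
from {{ ref('stg_orders') }}
```

Hooks are one of the few places where a Jinja expression is nested inside a quoted string, because the hook text is rendered again when the hook runs.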
The default schema used is ‘WI’. If you want to use a different schema, such as ‘W’,
within the same database, the custom schema macro should be overwritten with the
code below, because we do not want the new schema to be ‘WI_W’ instead of just
‘W’.
Custom code:
Modified Code:
Setting the schema name in the config of the model overrides the default
schema of the DBT project.
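For orientation, an override of dbt's built-in generate_schema_name macro that produces ‘W’ instead of ‘WI_W’ typically looks like the sketch below. It follows the pattern in the dbt documentation and is not necessarily identical to the code used in this project.

```sql
{# macros/generate_schema_name.sql #}
{# The dbt default returns <target schema>_<custom schema>, for example WI_W. #}
{# This override returns the custom schema on its own, for example W. #}
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
```

Models without a schema setting still land in the target schema, because the first branch is unchanged.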
The returned result, which is a matrix, can be read and assigned to a parameter as
shown below.
The code below shows how to refer to the parameter in a model.
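One way to read a control-table value and use it in a model is sketched below with dbt's run_query. The control table WI.dbt_control, its columns, and the source and model names are all hypothetical.

```sql
{# Hypothetical control table and column names; adjust to the project. #}
{% set control_query %}
    select max(last_extract_ts)
    from {{ target.database }}.WI.dbt_control
    where model_name = 'orders_incremental'
{% endset %}

{% set results = run_query(control_query) %}

{% if execute %}
    {# run_query returns a table; take the first value of the first column. #}
    {% set last_extract_ts = results.columns[0].values()[0] %}
{% else %}
    {# During parsing no query is run, so there is no value yet. #}
    {% set last_extract_ts = none %}
{% endif %}

select
    order_id,
    order_status,
    updated_at
from {{ source('wi_interim', 'order_staging') }}
{% if last_extract_ts is not none %}
where updated_at > '{{ last_extract_ts }}'
{% endif %}
```

The execute guard matters because dbt renders the model once at parse time, when run_query does not return a result.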
Update the control table for the next run in a post hook of the same model or in a pre hook
of any subsequent model.
Please see the example below for typecasting column data types in the model's select query.
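A typecasting select might look like the following sketch, using Snowflake data types. The column and source names are hypothetical.

```sql
select
    cast(order_id as number(38, 0))     as order_id,
    cast(order_amount as number(18, 2)) as order_amount,
    cast(order_date as date)            as order_date,
    updated_at::timestamp_ntz           as updated_at  -- Snowflake shorthand for cast
from {{ source('wi_interim', 'order_staging') }}
```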
The DBT team provided another solution for incremental models with optional data type
changes. Here is a video that walks you through the workflow:
https://www.loom.com/share/9f71e6ae396f488fb82a094247730bec
Code: https://www-github.cisco.com/CLOUD-ARCH/platform_reference/blob/master/macros/frozen_dtypes_incremental.sql
After you finish developing the models, commit and check in the code to GIT so
that your repository has the latest code.
Developers working in parallel on the same code can encounter merge conflicts.
These conflicts can be identified by the conflict markers in the code. The user needs to fix
the conflicts manually and commit the changes.
For Example:
Once the code is committed successfully, the user can raise a pull request to merge the
changes into branches like stage/prod/master.
The reviewer/owner of the code can review the changes and then merge them into
stage/prod/master respectively.
DBT Docs
To generate documentation in the IDE for the models in the project, execute the
command dbt docs generate.
It documents the project models, sources.yml, databases, model code, dependencies,
referred-by models, nodes, columns and compiled model SQL.
Ex: View docs is enabled after generating the dbt docs. Please see the screenshot
below.
Select the hamburger icon in the DBT IDE and click on Jobs to create a DBT job.
Specify a meaningful name for the job and select the environment for the project.
Different environments can link to different code branches and Snowflake accounts.
Add commands to execute a model; multiple commands can be added to a job.
Environment setup:
Select the hamburger icon in the DBT IDE and click on Environment to create and configure
an environment for the project.
Multiple environments can be created for a project; by default, an environment points
to the master branch.
To point an environment to a branch other than master, select the “CUSTOM BRANCH”
checkbox and specify a branch created for the project.
CONTROL M Setup
1) Identify a Unix box and a sudo account in which to build/place the DBT polling scripts, in
order to invoke a DBT job from Control-M.
2) Place the DBT polling scripts on the Unix box under the sudo account.
3) Python 3.7 or a higher version is required to execute the scripts successfully. In case of
a lower version, please upgrade and activate Python 3.7 in a virtual environment.
4) requirements.txt lists all the libraries required to execute the DBT polling scripts
successfully.
5) Modify the properties.ini and snowflake.ini files as per your individual project details and
Snowflake connections respectively.
6) queries.py lists all the tables used to hold the DBT job level details and to track the DBT
execution log.
7) Please make sure to add an entry for the DBT job in the “_JOBS” table so that the job
executes successfully through Control-M.
1. Steps to Install Python3.7
Log in to the Unix server and connect to the sudo account. Execute the commands below one
by one, in sequence, to install Python 3.7.9 on the server. Replace “YOUR_SUDO_NAME” with the
sudo used for the individual project.
mkdir -p /users/YOUR_SUDO_NAME/data/software/python3.7
cd /users/YOUR_SUDO_NAME/data/software/python3.7
wget https://www.python.org/ftp/python/3.7.9/Python-3.7.9.tgz
tar -zxvf Python-3.7.9.tgz
cd Python-3.7.9
./configure --prefix /users/YOUR_SUDO_NAME/data/software/python3.7 --with-ssl
make
make install
/users/YOUR_SUDO_NAME/data/software/python3.7/bin/pip3 install --upgrade pip
/users/YOUR_SUDO_NAME/data/software/python3.7/bin/pip3 install --upgrade pip-tools
/users/YOUR_SUDO_NAME/data/software/python3.7/bin/pip3 install -r requirements.txt
AAV_DBT_SCRIPT - ./run_job_script.sh
DBT_JOB_NAME – JOB_CREATED_IN_DBT
Rerunning the jobs from the point of failure: Airflow and dbt Cloud FAQs | dbt Docs
(getdbt.com)