
Full Catalogue of Data Vault content

Copyright © 2022 by Patrick Cuba


Blog Catalogue
Category Article Keywords URL
Category: Practice
Article: Advantage Data Vault 2.0
Description: Highlighting what DV2.0 gives you over any other interpretation of DV.
Keywords: Data Vault 2.0
URL: bit.ly/2ZZlLcv, bit.ly/2II5fVt

Category: Practice
Article: Learning Data Vault is Like Learning How to Make Beer!
Description: All it takes are three or four “things” to start any learning journey, a number that fits within your cognitive load. At the time of writing the article I was learning how to make beer!
Keywords: Beer, Hubs, Links, Satellites, Modelling, Business Process
URL: bit.ly/2ZYGpJP, bit.ly/3GwZ12r

Category: Practice
Article: Data Vault or: how I learnt to stop worrying and love Data Governance
Description: Following an Atomic Space Age theme, a glimpse into Data Vault with DataOps.
Keywords: DataOps, GDPR, CCPA, Data Governance, Satellite Splitting, Auditability
URL: bit.ly/3vhEDxJ, bit.ly/35xhVGL

Category: Practice & Modelling
Article: Time to upgrade your thinking on Data Vault
Description: Data Vault is more than just a data modelling methodology; it is designed to change and flex as the business evolves and matures around core business capabilities.
Keywords: Hoshin Kanri, Business Architecture, Business Agility, Data Platform, Information Mapping, OKR, Balanced Scorecard, KPI, Measure, Metrics
URL: bit.ly/3fUL7fN, bit.ly/335dqW1

Category: Practice
Article: Data Vault Recipes
Description: A holistic look at what it means to adopt Data Vault 2.0 methodology, inspired by baking of course!
Keywords: Enterprise Architecture, Solution Architecture, Business Architecture, Business Process, Capability Map, Raw Vault, Business Objects, Business View, Technical Debt, Business Vault, Information Mart, Data Delivery, Data Quality
URL: bit.ly/2WEWCSw, bit.ly/3gM1k7H

Category: Practice
Article: a DATA VAULT ANALOGY
Description: Data Vault in the industry has two standards, one following the Hans Hultgren method (Ensemble Modelling) and the other following Dan Linstedt (Data Vault 2.0). Sometimes the terms of the two are confused, and to the untrained eye it is difficult to tell who is following which method, which of course adds to the confusion of learning about Data Vault. Ensemble tends to lean towards replacing Kimball; Data Vault 2.0 does not. Instead, DV2.0 keeps the patterns simple and repeatable. The art was inspired by the proverb “in the land of the blind the one-eyed man is king”.
Keywords: Ensemble Modelling, Data Vault 2.0, Hans Hultgren, Dan Linstedt
URL: bit.ly/39o9qPX, bit.ly/3so8qo3

Category: Practice
Article: Data Vault Elevator Pitch
Description: One’s point of view is usually biased toward their own interests, and it is the same when you pitch a Data Vault to different professions within a business.
Keywords: Agility, Agnostic, Automation, Auditability
URL: bit.ly/2RyoRjv, bit.ly/3qSmjIz

Category: Practice
Article: Data Vault Dream Team
Description: Ideas on how to get a Data Vault project started and build momentum.
Keywords: Team, Community of Practice, DVBoK, Consulting
URL: bit.ly/3uHne3N, bit.ly/33bU2qn

Category: Practice & Modelling
Article: Building Data Vault modelling capability through the Mob
Description: How to go about modelling your Data Vault through collaboration and having the right people in the right place. Inspired by work done at a major customer and extreme programming principles.
Keywords: Modelling
URL: bit.ly/3zgP7OP, bit.ly/2NiceTN

Category: Practice
Article: Bring out your Dead… Data
Description: The first DV article to consider what to do with defunct data! Inspired by Pet Sematary, Poltergeist and the Sixth Sense!
Keywords: deprecated data, GDPR, freshness, observability
URL: bit.ly/3o8To3m, bit.ly/3Lh6pCP

Category: Modelling
Article: Data Vault Industry Verticals
Description: An outcome of a Data Vault model review, this article explains some of the pitfalls of attempting to conform a data vault to an industry model. Art inspired by Sim City.
Keywords: Dependent-Child Key, Standards, Business Key, Industry Model
URL: bit.ly/3tIckuf, bit.ly/34hHezy

Category: Modelling
Article: Data Vault Loader Traps
Description: Articulating some of the pitfalls of not doing a Data Vault properly!
Keywords: Satellites, Multi-table insert, Semi-structure function, Hashing
URL: bit.ly/3tqwAQU, bit.ly/34Duine

Category: Modelling & Integration
Article: Decided to build your own Data Vault automation tool?
Description: Based on experience building a home-grown Data Vault automation tool, this post covers most of the patterns you will encounter in a Data Vault 2.0 model, with examples!
Keywords: Mapping, modelling
URL: bit.ly/3bRlV7U, bit.ly/35XZ0YX

Category: Snowflake
Article: Data Vault 2.0 on Snowflake…To hash or not to hash… that is the question
Description: To hash or not to hash on Snowflake…? An article justifying why you should, and how Snowflake’s MPP interpretation can still be used to deliver a Data Vault. Any guess as to who that is on the title page?
Keywords: PIT, Point in Time, OLAP, Snowflake, MPP, Massively Parallel Platform, hashing, zone maps
URL: bit.ly/3dn83n8, bit.ly/3rH7gS5

Category: Modelling & Snowflake
Article: Why EQUIJOINS Matter!
Description: Evidence on how PIT tables (when designed right) take advantage of inherent OLAP capabilities for querying facts and dimensions. Inspired by 12 Angry Men and Juror #8.
Keywords: Raw Vault, Query Assistance, Hashing, PIT, Point in Time, Sequence Number PIT, equijoin, right-deep-join-tree, sequence key, OLAP, Information Mart, Business Vault, Clustering
URL: bit.ly/3dBxOQK, bit.ly/3Bci9BV

Category: Modelling & Snowflake
Article: Data Vault Test Automation
Description: Reconciliation between staged and target and between target tables is a must. This test framework is designed to keep the data vault implementation honest, and it is insert-only as well.
Keywords: Reconciliation, Test Automation, Hubs, links and satellites, staging, business keys, unit of work
URL: bit.ly/3dUHPIS, bit.ly/3shb5jq

Category: Snowflake
Article: Data Vault Dashboard Monitoring
Description: How to set up and track Data Vault dashboard reporting based out of Snowsight and the same INSERT-ONLY paradigm of DV2.0.
Keywords: Snowsight, Test Framework, Automated Testing, Auditability, Dashboards
URL: bit.ly/3BjSg1F, bit.ly/3ryMdEM

Category: Modelling & Snowflake
Article: Data Vault PIT Flow Manifold
Description: A little bit of Snowflake engineering in Conditional Multi-Table INSERTS and Point in Time (PIT) tables.
Keywords: Point in time, PIT, as_of, Conditional Multi-Table Insert, Snowflake
URL: bit.ly/3iEkBJC, bit.ly/3uNJVUn

Category: Snowflake
Article: Data Vault’s XTS pattern on Snowflake
Description: Solving Time Crime in Data Vault, using Snowflake. How does the timeline correction pattern perform on Snowflake?
Keywords: Snowflake, Extended Record Tracking Satellite, XTS
URL: bit.ly/3aCCRhQ, bit.ly/3rvZduO

Category: Modelling & Snowflake
Article: Data Vault Agility on Snowflake
Description: Partly inspired by Tron! Some practical considerations for deploying a Data Vault on Snowflake and taking advantage of some little-known nuances of the platform.
Keywords: Testing, Data Quality, Business Architecture, Source-System Data Vault, Passive Integration, Multi-tenancy
URL: bit.ly/337Jhp3, bit.ly/3sxzmSe

Category: Modelling & Integration
Article: You might be doing #datavault Wrong!
Description: A long list of considerations when building your Data Vault, what to do, and not to do!
Keywords: business architecture, enterprise architecture, business objects, raw vault, business processes, automation, passive integration
URL: bit.ly/2V32eFu, bit.ly/3GwJX51

Category: Modelling & Practice
Article: Seven Deadly Sins of Fake Vault
Description: Born out of observing Data Vault implementations in the wild that do not follow the standards. DV2 practitioners have seen various unguided interpretations; these are the main sins we see in the industry. Theme and images inspired by Seven and Milton.
Keywords: Business Vault, Unit of Work, Auditability, Dependent-Child Keys, Weak Hubs, Source System Data Vault, Business Key Collision Code, Staggered Load, Sequence Key, Link Satellite, Satellite Splitting, Refactor, Schema Evolution
URL: bit.ly/3n9zIxB, bit.ly/3lGvevA

Category: Modelling
Article: Data Vault Mysteries... Business Vault
Description: Just what is a Business Vault, and why is its creation a mystery? It really shouldn’t be if you follow the standards! Theme based on 1950s culture and storytelling.
Keywords: Business Vault, Raw Vault, Business Key, Unit of Work, Business Process, Change Record, Derived Content, PITs and Bridges, Point in Time, PIT, Bridge, Information Mart
URL: bit.ly/3BUt81s, bit.ly/3rfV7V3

Category: Modelling
Article: Is it Business Vault or is it not?
Description: An often-foggy area of Data Vault is how to define a Business Vault; here is some guidance.
Keywords: Business Vault, Raw Vault, Business Process, Business Rules, PITs and Bridges, Information Marts, Auditability
URL: bit.ly/3L5qLP9, bit.ly/3GsWZAF

Category: Modelling
Article: Apache Spark GraphX and the Seven Bridges of Königsberg
Description: An example of building a Business Vault Link but using Big Data (Spark + Parquet) to get there. Theme inspired by the story of Euler and the origins of Graph theory.
Keywords: Business Vault, Spark, Big Data, Graph, Link, Graphx, Pregel, Euler
URL: bit.ly/3EQ9wO3, bit.ly/3J7JR5y, bit.ly/3ezZ6Wh

Category: Modelling
Article: Data Vault Mysteries… Effectivity Satellite and Driver Key
Description: Just what does the Effectivity Satellite solve? And why do you need to define a driving key for it?
Keywords: Effectivity Satellite, Driving Key, relationship, link, unit of work
URL: bit.ly/3oS4k70, bit.ly/3lQosU9

Category: Modelling
Article: Data Vault Mysteries... Zero Keys & Ghost Records
Description: DV2.0 has a few esoteric concepts; this article describes the difference between default keys, ghost records and zero keys.
Keywords: ghost records, point in time, pit, zero keys, nulls, default keys
URL: bit.ly/3vjTXdg, bit.ly/3ccwMcq

Category: Modelling
Article: Say NO to Refactoring Data Models!
Description: A problem every data platform faces is the challenge of making changes without regression testing and escalating costs. Sticking to the Data Vault 2.0 patterns rises to that challenge by promoting data agility.
Keywords: Refactor, Business Key, Kimball, Inmon, Patterns, Schema Evolution, Schema Drift, Extended Record Tracking Satellite, XTS, Time Crime
URL: bit.ly/3iEiHZB, bit.ly/3gssD6C

Category: Modelling
Article: A Rose by any other name… Wait.. is it still the same Rose?
Description: Initially released on Valentine’s Day, this article delves into Passive Integration and Business Key Collision Codes by way of an example.
Keywords: Business Key Collision Code, BKCC, Passive Integration, Business Key, Hub
URL: bit.ly/3xlFK0s, bit.ly/3BIziSh

Category: Modelling & Practice
Article: The Data Vault Guru: a pragmatic guide on building a data vault
Description: A summary of what is in the book.
Keywords: Book
URL: bit.ly/3bcayaO, bit.ly/3tXoyNK

Category: Integration
Article: Data Vault has a new Hero
Description: Originally titled “Solving Time Crime in Data Vault 2.0”, this article delves into how to deal with batch data that arrives out of sequence. It covers an authorised extension of the DV2.0 standards called the eXtended Record Tracking Satellite (XTS), a data-driven approach to dynamically enable the DV model to self-heal.
Keywords: XTS, Extended Record Tracking Satellite, Timeline Correction, Self Heal, Applied Date
URL: bit.ly/3y4mUdV, bit.ly/39jjf19

Category: Integration
Article: How I can get away without paying the Pied Piper... in Data Vault 2.0
Description: What you learn in DV2.0 training is that a Data Vault model is not easy to query. To make it easier and to support your Information Models you build Point-in-Time and/or Bridge tables, but the expense of querying the data vault is then pushed to the creation of the PIT tables themselves. But what if you don’t have to?
Keywords: PIT, Point in Time, Ghost Record
URL: bit.ly/3mNxuD9, bit.ly/3iBfAzh

Category: Modelling & Integration
Article: Business Key Treatments
Description: What do you do when a source provides business keys that don’t quite follow the standard business key assignment best practices? An approach to ensure passive integration without sacrificing automation.
Keywords: Business Key, Business Key Treatment, Passive Integration, Hub, Hashing
URL: bit.ly/2YnlSh0, bit.ly/3pgNJaK

Category: Modelling & Integration
Article: What does dbt give you?
Description: A gloss over dbt and its power of transformation.
Keywords: dbt, integration
URL: bit.ly/3kyExhZ, bit.ly/3ok9PsF

Category: Integration & Modelling
Article: Passive integration explained…
Description: Another take on explaining passive integration.
Keywords: Passive Integration
URL: bit.ly/3pTWCXP



Advent Calendar
1. Business keys (… and to learn Data Vault), https://bit.ly/3D7B0xf
2. Surrogate keys (… and why Data Vault 2.0), https://bit.ly/3xZtxzH
3. Satellites (… and a Data Vault pitch), https://bit.ly/31pVODx
4. Links (Unit of Work) (… and defining a Business Vault Link), https://bit.ly/32ZWEHy
5. Multi-Active Satellites (… and the Data Vault Guru), https://bit.ly/31A85or
6. Driving Key (… and Effectivity Satellites), https://bit.ly/3si25My
7. Raw Vault (… and say NO to refactoring data models), https://bit.ly/3mffrVQ
8. Business Vault (… and how to build it), https://bit.ly/3sh4eIy
9. Hierarchy Link (… and Seven Deadly Sins of Fake Vault), https://bit.ly/3e5ez1C
10. Zero-key (… and why it is different to Ghost Records), https://bit.ly/3F8Sw6e
11. Same-as link (… and what to do with Dead Data), https://bit.ly/3yD1vKu
12. Status Tracking Satellite (STS… and #DataOps), https://bit.ly/327LIYf
13. Insert-only architecture (… and Data Vault on #Snowflake), https://bit.ly/3Ff1AXq
14. Record Tracking Satellite (RTS… and Building a Data Vault through the Mob), https://bit.ly/3J0a8mX
15. Business Key Collision Codes, https://bit.ly/33GK28z
16. Business Key Treatments, https://bit.ly/3e14EtV
17. Extended Record Tracking Satellite (XTS), https://bit.ly/30FK2Ve
18. Multi-tenancy (… and XTS on Snowflake), https://bit.ly/3q90wh7
19. Non-historised Link (… and an article on #dbt), https://bit.ly/3p73MKP
20. Record Source (… and Why Equi-Joins Matter), https://bit.ly/3q4P324
21. Load Date (… and Automated Data Vault Testing and #Snowflake’s #snowsight dashboard), https://bit.ly/3FiA7UH
22. Applied Date (… and why you might be doing Data Vault Wrong), https://bit.ly/3qen689
23. Ghost Records (… and how it differs to Zero Keys), https://bit.ly/3GXKPjQ
24. PITs & Bridges (… and a PIT Flow Manifold using Multi-Table Insert), https://bit.ly/3Ep8GHs
25. Dependent-child keys (… and Data Vault Recipes), https://bit.ly/30XPeDX



Book
Title: the data vault guru: a pragmatic guide on building a data vault

Description: The data vault methodology presents a unique opportunity to model the enterprise data warehouse using the same automation principles applicable in today’s software delivery, continuous integration, continuous delivery and continuous deployment while still maintaining the standards expected for governing a corporation’s most valuable asset: data. This book first provides the landscape of a modern architecture and then serves as a thorough guide on how to deliver a data model that flexes as the enterprise flexes: the data vault. Whether the data is structured, semi-structured or even unstructured, one thing is clear: there is always a model, either applied early (schema-on-write) or applied late (schema-on-read). Today’s focus on data governance requires that we know what we retain about our customers; the data vault provides that focus by delivering a methodology focused on all aspects about the customer and provides some of the best practices for modern day data compliance.
The book will delve into every data vault modelling artefact, its automation with sample code, raw vault, business vault, testing framework, a build framework, sample data vault models, how to build automation patterns on top of a data vault and even offer an extension of data vault that provides automated timeline correction, not to mention variations of data vault designed to provide audit trails, metadata control and integration with agile delivery tools.

URL:
US: https://amzn.to/3d7LsJV
UK: https://amzn.to/3nsqTfR
AU: https://amzn.to/30IxOYF
DE: https://amzn.to/2TiAsAb
FR: https://amzn.to/37yfnKl
ES: https://amzn.to/3jl5tOr
IT: https://amzn.to/37Awag6
NL: https://amzn.to/35sCpjc
JP: https://amzn.to/3dNJgYq
BR: https://amzn.to/3dRvIek
CA: https://amzn.to/3jl5LVx
MX: https://amzn.to/35pkslI
IN: https://amzn.to/3jl65DJ

Other
• Github - https://github.com/PatrickCuba/thedatamustflow
• Data Vault UK Interview - bit.ly/3baadp9
• Data Vault UK Presentation - youtube.com/watch?v=7lUn3eBiuyU
• Data Vault Munich Presentation - youtube.com/watch?v=tRPgijauH2w
• Meet the Expert: Data Vault - bit.ly/3t1hBe1
• Snowflake Data Vault User Group: DataOps - bit.ly/3qbnm7P
• DataVault Interview - data-vault.co.uk/patrick-cuba-interview/
• Integrating SAS and Data Vault - bit.ly/2YUw1xT
• Data Mapping - bit.ly/3s0kcEj, bit.ly/32FnFQI
• 3 Ways to load data into SQL Server MDS - bit.ly/3mirsbs
• My Hash of Hashes - bit.ly/2MGKE5L
• SAS indexing tricks - bit.ly/2L9gsiW
• SAS Parallelism - bit.ly/3oiQubn
• SAS SQL vs Data Step part 3 - bit.ly/3s0Hbie
• SAS SQL Join vs Data Step Merge part 2 - bit.ly/3nnfNaF
• SAS Hash Tables - bit.ly/3hPSwxg
• SAS Data Step Merge vs SQL Joins - bit.ly/3be5jIf
