Module 3 - Creating a Data Model
WHAT’S A “DATA MODEL”?
TIP: In a normalized database, each table should serve a distinct and specific purpose (e.g. product
information, dates, transaction records, customer attributes, etc.)
This Calendar Lookup table provides additional attributes about each date (month, year, weekday, quarter, etc.)
This Product Lookup table provides additional attributes about each product (brand, product name, sku, price, etc.)
These columns are foreign keys; they contain multiple instances of each value, and are used to match the primary keys in related lookup tables
These columns are primary keys; they uniquely identify each row of a table, and match the foreign keys in related data tables
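The key match described above can be sketched in a few lines of Python. This is only an illustration, not how Power BI works internally; the product names, brand, and Quantity column are invented for the example.

```python
# Hypothetical lookup table, keyed by its primary key (one row per product).
product_lookup = {
    1: {"ProductName": "Helmet", "Brand": "Acme"},
    2: {"ProductName": "Gloves", "Brand": "Acme"},
}

# Hypothetical data table: the ProductKey foreign key repeats across rows.
sales_data = [
    {"OrderNumber": "A1", "ProductKey": 1, "Quantity": 2},
    {"OrderNumber": "A2", "ProductKey": 1, "Quantity": 1},
    {"OrderNumber": "A3", "ProductKey": 2, "Quantity": 4},
]

def quantity_by_product_name(sales, lookup):
    """Total Quantity per ProductName, resolved through the key match."""
    totals = {}
    for sale in sales:
        # The foreign key in the data row finds exactly one lookup row.
        name = lookup[sale["ProductKey"]]["ProductName"]
        totals[name] = totals.get(name, 0) + sale["Quantity"]
    return totals

print(quantity_by_product_name(sales_data, product_lookup))
# {'Helmet': 3, 'Gloves': 4}
```

The data table never stores the product name; it is reached only through ProductKey.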
RELATIONSHIPS VS. MERGED TABLES
Can’t I just merge queries or use LOOKUP or RELATED functions to pull those
attributes into the fact table itself, so that I have everything in one place??
-Anonymous confused man
Original Fact Table fields | Attributes from Calendar Lookup table | Attributes from Product Lookup table
The Sales_Data table can connect to Products using the ProductKey field,
but cannot connect directly to the Subcategories or Categories tables
PRO TIP:
Models with chains of dimension tables are often called
“snowflake” schemas (whereas “star” schemas usually have
individual lookup tables surrounding a central data table)
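A chain of lookup tables like this can be sketched in Python as a series of key lookups. The SubcategoryKey and CategoryKey field names and all row values are assumptions made for the example; only ProductKey is named in the text above.

```python
# Hypothetical rows; only the chain of keys matters here.
categories = {1: {"CategoryName": "Accessories"}}
subcategories = {10: {"SubcategoryName": "Helmets", "CategoryKey": 1}}
products = {100: {"ProductName": "Sport Helmet", "SubcategoryKey": 10}}
sales_data = [{"OrderNumber": "A1", "ProductKey": 100}]

def category_for_sale(sale):
    """Walk Sales_Data -> Products -> Subcategories -> Categories."""
    product = products[sale["ProductKey"]]
    subcategory = subcategories[product["SubcategoryKey"]]
    return categories[subcategory["CategoryKey"]]["CategoryName"]

print(category_for_sale(sales_data[0]))  # Accessories
```

The sale row has no CategoryKey of its own, so the category can only be reached by passing through Products and Subcategories.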
MANAGING & EDITING RELATIONSHIPS
The “Manage Relationships” dialog box allows you to add, edit, or delete table relationships
Editing tools allow you to activate/deactivate relationships, view cardinality, and modify the cross filter direction (stay tuned!)
ACTIVE VS. INACTIVE RELATIONSHIPS
The Sales_Data table contains two date fields (OrderDate & StockDate), but
there can only be one active relationship to the Date field in the Calendar table
Double-click the relationship line, and check the “Make this relationship active” box to toggle
(note that you have to deactivate one in order to activate another)
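As a loose analogy in Python (not Power BI's actual mechanism), the choice of active relationship decides which date field the totals are grouped by. The dates and the Quantity column below are invented for the example.

```python
from datetime import date

# Hypothetical Sales_Data rows with the two date fields named above.
sales_data = [
    {"OrderDate": date(2020, 1, 5), "StockDate": date(2019, 12, 1), "Quantity": 2},
    {"OrderDate": date(2020, 1, 5), "StockDate": date(2019, 12, 15), "Quantity": 1},
    {"OrderDate": date(2020, 2, 1), "StockDate": date(2019, 12, 15), "Quantity": 3},
]

def quantity_by_date(sales, active_field):
    """Group Quantity by the one date field treated as active."""
    totals = {}
    for row in sales:
        key = row[active_field]
        totals[key] = totals.get(key, 0) + row["Quantity"]
    return totals

by_order = quantity_by_date(sales_data, "OrderDate")
by_stock = quantity_by_date(sales_data, "StockDate")
print(by_order)
print(by_stock)
```

The same three rows produce different totals per date depending on which field is used, which is why only one of the two relationships can be active at a time.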
RELATIONSHIP CARDINALITY
In this case, there is only ONE instance of each ProductKey in the Products
table (noted by the “1”), since each row contains attributes of a single product
(Name, SKU, Description, Retail Price, etc.)
There are MANY instances of each ProductKey in the Sales_Data table (noted
by the asterisk *), since there are multiple sales associated with each product
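The “1” and “*” markers can be reproduced with a small Python check that counts how often each key value appears on each side. The key values below are made up, and the function names are hypothetical helpers, not Power BI features.

```python
from collections import Counter

def side(values):
    """Return '1' if every key value is unique on this side, otherwise '*'."""
    return "1" if all(n == 1 for n in Counter(values).values()) else "*"

def cardinality(lookup_keys, data_keys):
    """Describe the relationship as '<lookup side> to <data side>'."""
    return f"{side(lookup_keys)} to {side(data_keys)}"

# Hypothetical ProductKey columns from the two tables.
products_keys = [1, 2, 3]           # one row per product
sales_keys = [1, 1, 2, 3, 3, 3]     # many sales per product

print(cardinality(products_keys, sales_keys))  # 1 to *
```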
CARDINALITY CASE STUDY: MANY-TO-MANY
• Connecting the two tables above using the product_id field creates a one-to-one relationship,
since each ID only appears once in each table
• Unlike many-to-many, there is nothing illegal about this relationship; it’s just inefficient
NOTE: this still respects the laws of normalization, since all rows
are unique and capture attributes related to the primary key
CONNECTING MULTIPLE DATA TABLES
PRO TIP:
Arrange your lookup tables above your data tables in your model as a visual reminder that filters flow “downstream”
*In some cases filters may default to “two-way” depending on your Power BI Desktop settings
FILTER FLOW (CONT.)
In this case, the only valid way to filter both Sales and Returns data by
Territory is to use the TerritoryKey field from the Territory_Lookup
table, which is upstream and related to both data tables
• Filtering using TerritoryKey from the Sales table yields incorrect
Returns values, since the filter context cannot flow upstream to
either one of the other tables
• Note that we still see incorrect values when filtering using TerritoryKey from
the Returns table, since the filter context is isolated to that single table
• While the values appear to be correct when filtering using TerritoryKey from
the Returns table, we’re missing sales data from any territories that didn’t
register returns (specifically Territories 2 & 3)
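The missing-territory problem can be sketched in Python by building the same report twice: once from the lookup table's keys and once from the keys found in the Returns table. All quantities and the OrderQuantity and ReturnQuantity column names are invented; the only detail taken from the text is that Territories 2 and 3 have no returns.

```python
# Hypothetical TerritoryKey values in Territory_Lookup.
territory_lookup = [1, 2, 3, 4]

sales_data = [
    {"TerritoryKey": 1, "OrderQuantity": 5},
    {"TerritoryKey": 2, "OrderQuantity": 3},
    {"TerritoryKey": 3, "OrderQuantity": 7},
    {"TerritoryKey": 4, "OrderQuantity": 2},
]

# Territories 2 and 3 registered no returns.
returns_data = [
    {"TerritoryKey": 1, "ReturnQuantity": 1},
    {"TerritoryKey": 4, "ReturnQuantity": 1},
]

def total(rows, key, column):
    """Sum a column over the rows that belong to one territory."""
    return sum(r[column] for r in rows if r["TerritoryKey"] == key)

def report(territory_keys):
    """Map each territory key to (sales total, returns total)."""
    return {
        k: (total(sales_data, k, "OrderQuantity"),
            total(returns_data, k, "ReturnQuantity"))
        for k in territory_keys
    }

from_lookup = report(territory_lookup)
from_returns = report(sorted({r["TerritoryKey"] for r in returns_data}))

print(from_lookup)   # {1: (5, 1), 2: (3, 0), 3: (7, 0), 4: (2, 1)}
print(from_returns)  # {1: (5, 1), 4: (2, 1)}
```

Every row in the second report is correct, but the sales for Territories 2 and 3 never appear because those keys do not exist in the Returns table.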
In this model, filter context from the Product_Lookup table can pass down to
Returns_Data and up to Territory_Lookup, which would filter accordingly based on the
TerritoryKey values passed from the Returns table
*Two-way filters are not recommended for models with multiple data tables, but may be used when you need to filter a lookup using a data table, or connect two “many” tables via a shared lookup (not covered in this course)
HIDING FIELDS FROM REPORT VIEW
PRO TIP:
Hide the foreign key columns in your data tables to force
users to filter using the primary keys in the lookup tables
BEST PRACTICES: DATA MODELING
Focus on building a normalized model from the start
• Make sure that each table in your model serves a single, distinct purpose
• Use relationships vs. merged tables; long & narrow tables are better than short & wide
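A rough way to see why long and narrow beats short and wide is to count stored values under each design. The Python sketch below uses invented rows and a hypothetical cell_count helper; it compares table shape only and says nothing about how Power BI compresses data.

```python
# Hypothetical lookup table: product attributes are stored once per product.
product_lookup = {1: {"Brand": "Acme", "ProductName": "Helmet", "SKU": "HL-1"}}

# Hypothetical data table: narrow rows that carry only the foreign key.
sales_data = [{"OrderNumber": f"A{i}", "ProductKey": 1} for i in range(1, 4)]

# Merged alternative: every sale row repeats all product attributes.
merged = [{**sale, **product_lookup[sale["ProductKey"]]} for sale in sales_data]

def cell_count(rows):
    """Number of stored values across all rows."""
    return sum(len(row) for row in rows)

related_cells = cell_count(sales_data) + cell_count(product_lookup.values())
merged_cells = cell_count(merged)

print(related_cells)  # 9
print(merged_cells)   # 15
```

With only three sales the gap is small, but the merged table grows by three extra values for every new sale while the related design grows by none.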