
Module 3 - Creating A Data Model

The document explains the concept of data modeling, emphasizing the importance of creating relationships between tables to ensure data integrity and efficiency. It covers topics such as database normalization, the distinction between data and lookup tables, and best practices for managing relationships and filter flows. Additionally, it highlights the significance of avoiding complex cross-filtering and the necessity of hiding irrelevant fields in report views.

Uploaded by

Vishal Kapoor


CREATING A DATA MODEL

WHAT’S A “DATA MODEL”?

This IS NOT a data model

• This is a collection of independent tables,
which share no connections or relationships
• If you tried to visualize Orders and Returns
by Product, this is what you’d get
WHAT’S A “DATA MODEL”?

This IS a data model!

• The tables are connected via relationships,
based on the common ProductKey field
• Now the Sales and Returns tables know how
to filter using fields from the Product table!
DATABASE NORMALIZATION
Normalization is the process of organizing the tables and columns in a relational
database to reduce redundancy and preserve data integrity. It’s commonly used to:
• Eliminate redundant data to decrease table sizes and improve processing speed & efficiency
• Minimize errors and anomalies from data modifications (inserting, updating or deleting records)
• Simplify queries and structure the database for meaningful analysis

TIP: In a normalized database, each table should serve a distinct and specific purpose (i.e. product
information, dates, transaction records, customer attributes, etc.)

When you don’t normalize, you end up with tables like this; all of the rows with duplicate product info could be
eliminated with a lookup table based on product_id

This may not seem critical now, but minor inefficiencies can become major problems as databases scale in size!
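
To make the idea concrete, here is a minimal Python sketch of splitting a denormalized table into a data table and a product lookup. The rows, the brand name and the column names other than product_id are hypothetical and only illustrate the principle; this is not how Power BI itself stores data.

```python
# Hypothetical denormalized rows: product attributes repeat on every sale.
denormalized = [
    {"date": "2024-01-01", "product_id": 1, "product_name": "Cream Soda",
     "brand": "FizzCo", "quantity": 3},
    {"date": "2024-01-02", "product_id": 1, "product_name": "Cream Soda",
     "brand": "FizzCo", "quantity": 5},
    {"date": "2024-01-02", "product_id": 2, "product_name": "Root Beer",
     "brand": "FizzCo", "quantity": 2},
]


def normalize(rows):
    """Split rows into a data table and a product lookup keyed on product_id."""
    product_lookup = {}
    data_table = []
    for row in rows:
        # Each product's attributes are stored once, however often it sells.
        product_lookup[row["product_id"]] = {
            "product_name": row["product_name"],
            "brand": row["brand"],
        }
        # The data table keeps only the key columns and the numeric value.
        data_table.append({
            "date": row["date"],
            "product_id": row["product_id"],
            "quantity": row["quantity"],
        })
    return data_table, product_lookup


data_table, product_lookup = normalize(denormalized)
print(len(data_table), "data rows;", len(product_lookup), "lookup rows")
```

The data table keeps one row per sale, while the repeated product text collapses to one row per product_id.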
DATA TABLES VS. LOOKUP TABLES
Models generally contain two types of tables: data (or “fact”) tables, and lookup (or “dimension”) tables
• Data tables contain numbers or values, typically at a granular level, with ID or “key” columns that can be used to
create table relationships
• Lookup tables provide descriptive, often text-based attributes about each dimension in a table

This Calendar Lookup table provides additional attributes about each date (month, year, weekday, quarter, etc.)

This Product Lookup table provides additional attributes about each product (brand, product name, sku, price, etc.)

This Data Table contains “quantity” values, and connects to lookup tables via the “date” and “product_id” columns
PRIMARY VS. FOREIGN KEYS

These columns are foreign keys; they contain multiple instances of each value, and are used to match the
primary keys in related lookup tables

These columns are primary keys; they uniquely identify each row of a table, and match the foreign keys in
related data tables
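
The two roles can be checked mechanically. The sketch below uses hypothetical key values and two hypothetical helper functions: a primary key column must be unique, while a foreign key column may repeat but should only contain values that exist in the lookup.

```python
def is_primary_key(values):
    """A primary key column has no duplicates and no missing values."""
    return None not in values and len(set(values)) == len(values)


def orphaned_foreign_keys(foreign_values, primary_values):
    """Return foreign key values that have no matching primary key."""
    return sorted(set(foreign_values) - set(primary_values))


product_lookup_ids = [1, 2, 3]          # hypothetical lookup table keys
sales_product_ids = [1, 1, 2, 3, 3, 4]  # hypothetical data table keys

print(is_primary_key(product_lookup_ids))   # True
print(is_primary_key(sales_product_ids))    # False
print(orphaned_foreign_keys(sales_product_ids, product_lookup_ids))  # [4]
```

An orphaned value such as 4 is a sale whose product attributes cannot be looked up, which is worth catching before building relationships.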
RELATIONSHIPS VS. MERGED TABLES
Can’t I just merge queries or use LOOKUP or RELATED functions to pull those
attributes into the fact table itself, so that I have everything in one place??
-Anonymous confused man

[Merged table columns: Original Fact Table fields | Attributes from Calendar Lookup table | Attributes from Product Lookup table]

Sure you can, but it’s inefficient!

• Merging data in this way creates redundant data and utilizes significantly more memory and
processing power than creating relationships between multiple small tables
CREATING TABLE RELATIONSHIPS
Option 1: Click and drag to connect primary and foreign keys within the Relationships pane
Option 2: Add or detect relationships using the “Manage Relationships” dialog box
CREATING “SNOWFLAKE” SCHEMAS

The Sales_Data table can connect to Products using the ProductKey field,
but cannot connect directly to the Subcategories or Categories tables

By creating relationships from Products to Subcategories (using
ProductSubcategoryKey) and Subcategories to Categories (using
ProductCategoryKey), we have essentially connected Sales_Data to each
lookup table; filter context will now flow all the way down the chain

PRO TIP:
Models with chains of dimension tables are often called
“snowflake” schemas (whereas “star” schemas usually have
individual lookup tables surrounding a central data table)
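
As a rough illustration of the chain, the Python sketch below resolves a category for each sales row by following Products to Subcategories to Categories. The key column names come from the slide; the rows, the name columns and the OrderQuantity field are hypothetical.

```python
# Hypothetical lookup tables, each keyed on its primary key.
categories = {1: {"CategoryName": "Bikes"}}
subcategories = {10: {"SubcategoryName": "Road Bikes", "ProductCategoryKey": 1}}
products = {100: {"ProductName": "Road-150", "ProductSubcategoryKey": 10}}

sales_data = [
    {"ProductKey": 100, "OrderQuantity": 2},
    {"ProductKey": 100, "OrderQuantity": 1},
]


def category_for(product_key):
    """Follow the chain Products -> Subcategories -> Categories."""
    subcategory_key = products[product_key]["ProductSubcategoryKey"]
    category_key = subcategories[subcategory_key]["ProductCategoryKey"]
    return categories[category_key]["CategoryName"]


quantity_by_category = {}
for row in sales_data:
    name = category_for(row["ProductKey"])
    quantity_by_category[name] = (
        quantity_by_category.get(name, 0) + row["OrderQuantity"]
    )

print(quantity_by_category)  # {'Bikes': 3}
```

Sales_Data never stores a category key; it reaches the category only through the intermediate lookups, which is what the relationship chain provides in the model.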
MANAGING & EDITING RELATIONSHIPS

The “Manage Relationships” dialog box allows you to add, edit, or delete table relationships

Editing tools allow you to activate/deactivate relationships, view cardinality, and modify the cross filter
direction (stay tuned!)
ACTIVE VS. INACTIVE RELATIONSHIPS

The Sales_Data table contains two date fields (OrderDate & StockDate), but
there can only be one active relationship to the Date field in the Calendar table

Double-click the relationship line, and check the “Make this relationship active”
box to toggle (note that you have to deactivate one in order to activate another)
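
The effect of switching the active relationship can be sketched in Python. The rows and the Quantity column are hypothetical, and the `activate` helper only mimics the "one active relationship at a time" rule described above.

```python
from collections import Counter

# Hypothetical Sales_Data rows with two date fields.
sales_data = [
    {"OrderDate": "2024-01-05", "StockDate": "2023-12-20", "Quantity": 2},
    {"OrderDate": "2024-01-05", "StockDate": "2023-12-28", "Quantity": 1},
    {"OrderDate": "2024-01-06", "StockDate": "2023-12-28", "Quantity": 4},
]

# Only one relationship to the calendar is active at a time.
relationships = {"OrderDate": True, "StockDate": False}


def activate(relationships, field):
    """Make `field` the single active relationship, deactivating the others."""
    return {name: (name == field) for name in relationships}


def quantity_by_date(rows, relationships):
    """Total Quantity per calendar date, using whichever relationship is active."""
    active = [name for name, is_active in relationships.items() if is_active]
    if len(active) != 1:
        raise ValueError("exactly one relationship must be active")
    totals = Counter()
    for row in rows:
        totals[row[active[0]]] += row["Quantity"]
    return dict(totals)


by_order_date = quantity_by_date(sales_data, relationships)
by_stock_date = quantity_by_date(sales_data, activate(relationships, "StockDate"))
print(by_order_date)  # {'2024-01-05': 3, '2024-01-06': 4}
print(by_stock_date)  # {'2023-12-20': 2, '2023-12-28': 5}
```

The same sales rows produce different totals per date depending on which date field drives the relationship.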
RELATIONSHIP CARDINALITY

Cardinality refers to the uniqueness of values in a column

• For our purposes, all relationships in the data model should
follow a “one-to-many” cardinality; one instance of each
primary key, but potentially many instances of each foreign key

In this case, there is only ONE instance of each ProductKey in the Products
table (noted by the “1”), since each row contains attributes of a single product
(Name, SKU, Description, Retail Price, etc.)
There are MANY instances of each ProductKey in the Sales_Data table (noted
by the asterisk *), since there are multiple sales associated with each product
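
Cardinality follows directly from whether the key values on each side are unique. A minimal sketch, using hypothetical key values and a hypothetical `cardinality` helper:

```python
def cardinality(left_keys, right_keys):
    """Classify a relationship by whether each side's key values are unique."""
    left_unique = len(set(left_keys)) == len(left_keys)
    right_unique = len(set(right_keys)) == len(right_keys)
    if left_unique and right_unique:
        return "one-to-one"
    if left_unique:
        return "one-to-many"
    if right_unique:
        return "many-to-one"
    return "many-to-many"


products_keys = [1, 2, 3]        # hypothetical Products[ProductKey] values
sales_keys = [1, 1, 2, 3, 3, 3]  # hypothetical Sales_Data[ProductKey] values

print(cardinality(products_keys, sales_keys))     # one-to-many
print(cardinality(sales_keys, sales_keys))        # many-to-many
print(cardinality(products_keys, products_keys))  # one-to-one
```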
CARDINALITY CASE STUDY: MANY-TO-MANY

• If we try to connect these tables using product_id,
we’ll get a “many-to-many relationship” error since
there are multiple instances of each ID in both tables
• Even if we could create this relationship, how would
you know which product was actually sold on each
date – Cream Soda or Diet Cream Soda?
CARDINALITY CASE STUDY: ONE-TO-ONE

• Connecting the two tables above using the product_id field creates a one-to-one relationship,
since each ID only appears once in each table
• Unlike many-to-many, there is nothing illegal about this relationship; it’s just inefficient

To eliminate the inefficiency, you could simply merge the two tables into a single, valid lookup

NOTE: this still respects the laws of normalization, since all rows
are unique and capture attributes related to the primary key
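
A short sketch of that merge in Python, assuming two hypothetical lookup tables keyed on the same product_id values (the prices are invented for illustration):

```python
# Two hypothetical lookup tables that share the same primary key.
product_names = {
    1: {"product_name": "Cream Soda"},
    2: {"product_name": "Diet Cream Soda"},
}
product_prices = {1: {"price": 1.50}, 2: {"price": 1.75}}


def merge_one_to_one(left, right):
    """Combine two lookups keyed on the same IDs into one row per ID."""
    if left.keys() != right.keys():
        raise ValueError("tables do not share the same set of keys")
    return {key: {**left[key], **right[key]} for key in left}


product_lookup = merge_one_to_one(product_names, product_prices)
print(product_lookup[2])  # {'product_name': 'Diet Cream Soda', 'price': 1.75}
```

The merged table still has exactly one row per product_id, so it remains a valid lookup.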
CONNECTING MULTIPLE DATA TABLES

This model contains two data tables: Sales_Data and Returns_Data
• Note that the Returns table connects to
Calendar and Product_Lookup just like the
Sales table, but without a CustomerKey field
it cannot be joined to Customer_Lookup
• This allows us to analyze sales and returns
within the same view, but only if we filter or
segment the data using shared lookups
• In other words, we know which product was
returned and on which date, but nothing
about which customer made the return

HEY THIS IS IMPORTANT!

In general, never create direct relationships between data tables; instead, connect them through shared lookups
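
One way to picture "connect them through shared lookups" is to build the report from the lookup and let each data table contribute its own totals. The rows and the Quantity column below are hypothetical.

```python
from collections import defaultdict

product_lookup = {1: "Cream Soda", 2: "Diet Cream Soda"}  # hypothetical names
sales_data = [{"ProductKey": 1, "Quantity": 5}, {"ProductKey": 2, "Quantity": 3}]
returns_data = [{"ProductKey": 1, "Quantity": 1}]


def totals_by_product(rows):
    """Sum Quantity per ProductKey for one data table."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["ProductKey"]] += row["Quantity"]
    return totals


sales_totals = totals_by_product(sales_data)
returns_totals = totals_by_product(returns_data)

# Drive the report from the shared lookup so every product appears once,
# with each data table contributing its own column.
report = {
    name: {"sold": sales_totals.get(key, 0), "returned": returns_totals.get(key, 0)}
    for key, name in product_lookup.items()
}
print(report)
```

Sales and returns never reference each other directly; the lookup is the only thing they have in common, so it is the only safe field to segment both by.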
FILTER FLOW

Here we have two data tables (Sales_Data and Returns_Data), connected to Territory_Lookup

Note the filter directions (shown as arrows) in each relationship; by default, these will point from the “one”
side of the relationship (lookups) to the “many” side (data)
• When you filter a table, that filter context is passed along to all
related “downstream” tables (following the direction of the arrow)
• Filters cannot flow “upstream” (against the direction of the arrow)

PRO TIP:
Arrange your lookup tables above your data tables in your model as a visual reminder that filters flow “downstream”

*In some cases filters may default to “two-way” depending on your Power BI Desktop settings
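
Filter flow can be modeled as a walk over a directed graph. The sketch below is a simplified illustration, not Power BI's actual engine: each one-way relationship is an arrow from the lookup to the data table, and `tables_reached` is a hypothetical helper.

```python
# Each relationship is an arrow from the "one" side (lookup)
# to the "many" side (data).
relationships = [
    ("Territory_Lookup", "Sales_Data"),
    ("Territory_Lookup", "Returns_Data"),
]


def tables_reached(start, relationships):
    """Return every table a filter on `start` reaches, following arrows downstream."""
    reached = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for source, target in relationships:
            if source == current and target not in reached:
                reached.add(target)
                frontier.append(target)
    return reached


print(sorted(tables_reached("Territory_Lookup", relationships)))
# ['Returns_Data', 'Sales_Data', 'Territory_Lookup']
print(sorted(tables_reached("Sales_Data", relationships)))
# ['Sales_Data']
```

A filter placed on the lookup reaches both data tables, while a filter placed on Sales_Data stays in Sales_Data because no arrow leaves it.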
FILTER FLOW (CONT.)
In this case, the only valid way to filter both Sales and Returns data by
Territory is to use the TerritoryKey field from the Territory_Lookup
table, which is upstream and related to both data tables
• Filtering using TerritoryKey from the Sales table yields incorrect
Returns values, since the filter context cannot flow upstream to
either one of the other tables
• Similarly, filtering using TerritoryKey from the Returns table yields
incorrect Sales data; in addition, only territories that registered
returns are visible in the table (even though the other territories also registered sales)

1) Filtering using TerritoryKey from the Territory_Lookup table
2) Filtering using TerritoryKey from the Sales_Data table
3) Filtering using TerritoryKey from the Returns_Data table
TWO-WAY FILTERS

Updating the filter direction between Sales and Territory from “Single” to “Both” allows filter context to flow both ways
• This means that filters applied to the Sales_Data table will pass to
the lookup, and then down to the Returns_Data table
NOTE: The “Apply security filter in both directions” option relates to row-level security (RLS)
settings, which are not covered in this course
TWO-WAY FILTERS (CONT.)
With two-way cross-filtering enabled between the Sales and Territory
tables, we now see correct values using TerritoryKey from either table
• The filter context for Sales_Data[TerritoryKey] now passes up to the
Territory_Lookup, and then down to the Returns_Data table

• Note that we still see incorrect values when filtering using TerritoryKey from
the Returns table, since the filter context is isolated to that single table

1) Filtering using TerritoryKey from the Territory_Lookup table
2) Filtering using TerritoryKey from the Sales_Data table
3) Filtering using TerritoryKey from the Returns_Data table
TWO-WAY FILTERS (CONT.)
In this case, we’ve enabled two-way cross-filtering between the
Returns and Territory tables
• As expected, we now see incorrect values when filtering using TerritoryKey
from the Sales table, since the filter context is isolated to that single table

• While the values appear to be correct when filtering using TerritoryKey from
the Returns table, we’re missing sales data from any territories that didn’t
register returns (specifically Territories 2 & 3)

Since no information about Territory 2 or 3 is passed from the Returns_Data table to
Territory_Lookup, they get filtered out of the lookup, and subsequently
filtered out of the Sales_Data table

1) Filtering using TerritoryKey from the Territory_Lookup table
2) Filtering using TerritoryKey from the Sales_Data table
3) Filtering using TerritoryKey from the Returns_Data table
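
The missing territories can be reproduced with a small simulation. The quantities are hypothetical; only the fact that Territories 2 and 3 have no returns mirrors the slide. The sketch approximates the two-way filter by restricting the report rows to keys present in Returns_Data.

```python
sales_data = [
    {"TerritoryKey": 1, "Quantity": 10},
    {"TerritoryKey": 2, "Quantity": 7},
    {"TerritoryKey": 3, "Quantity": 4},
    {"TerritoryKey": 4, "Quantity": 6},
]
returns_data = [
    {"TerritoryKey": 1, "Quantity": 1},
    {"TerritoryKey": 4, "Quantity": 2},
]
territory_lookup = [1, 2, 3, 4]


def total(rows, territory):
    """Sum Quantity for the rows that belong to one territory."""
    return sum(row["Quantity"] for row in rows if row["TerritoryKey"] == territory)


def report(row_keys):
    """(sales, returns) per territory for the keys shown on the report rows."""
    return {key: (total(sales_data, key), total(returns_data, key)) for key in row_keys}


# Rows taken from the lookup: every territory is listed.
from_lookup = report(territory_lookup)

# Rows taken from Returns_Data with a two-way filter to the lookup: only keys
# present in Returns_Data survive, so the other territories' sales drop out.
returns_keys = sorted({row["TerritoryKey"] for row in returns_data})
from_returns = report(returns_keys)

print(sum(sales for sales, _ in from_lookup.values()))   # 27
print(sum(sales for sales, _ in from_returns.values()))  # 16
```

Each visible row looks correct on its own, but the sales total is understated because Territories 2 and 3 never appear.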
TWO-WAY FILTERS: A WORD OF WARNING

Use two-way filters carefully, and only when necessary*

• If you try to use multiple two-way filters in a more complex model,
you run the risk of creating “ambiguous relationships” by introducing
multiple filter paths between tables:

In this model, filter context from the Product_Lookup table can pass down to
Returns_Data and up to Territory_Lookup, which would filter accordingly based on the
TerritoryKey values passed from the Returns table

If we were able to activate the relationship between Product_Lookup and Sales_Data as
well, filters could pass from the Product_Lookup table through EITHER the Sales or
Returns table to reach the Territory_Lookup, which could yield conflicting filter context

PRO TIP:
Design your models with one-way filters and 1-to-Many cardinality, unless more
complex relationships are necessary

*Two-way filters are not recommended for models with multiple data tables, but may be used when you need to filter a lookup using a data table, or connect two “many” tables via a shared lookup (not covered in this course)
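
Ambiguity can be framed as "more than one filter path between two tables". The sketch below counts paths in a directed graph, assuming both data tables have two-way relationships with Territory_Lookup as the example implies. The `simple_paths` helper is hypothetical and is not how Power BI detects ambiguity internally.

```python
def simple_paths(edges, start, end, visited=None):
    """List every filter path from start to end that never revisits a table."""
    visited = (visited or []) + [start]
    if start == end:
        return [visited]
    paths = []
    for source, target in edges:
        if source == start and target not in visited:
            paths.extend(simple_paths(edges, target, end, visited))
    return paths


# Directed filter flow; a two-way relationship contributes an edge each way.
edges = [
    ("Product_Lookup", "Returns_Data"),
    ("Territory_Lookup", "Returns_Data"),
    ("Returns_Data", "Territory_Lookup"),  # two-way
    ("Territory_Lookup", "Sales_Data"),
    ("Sales_Data", "Territory_Lookup"),    # two-way
]
paths_before = simple_paths(edges, "Product_Lookup", "Territory_Lookup")
print(len(paths_before))  # 1

# Activating Product_Lookup -> Sales_Data adds a second route: ambiguous.
edges_with_sales = edges + [("Product_Lookup", "Sales_Data")]
paths_after = simple_paths(edges_with_sales, "Product_Lookup", "Territory_Lookup")
print(len(paths_after))  # 2
```

With a single path the filter context is well defined; with two paths the lookup could receive different TerritoryKey values from each route.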
HIDING FIELDS FROM REPORT VIEW

Hiding fields from Report View makes them inaccessible from the Report tab (although they can still be accessed
within the Data or Relationships views)
This is commonly used to prevent users from filtering
using invalid fields, or to hide irrelevant metrics from view

PRO TIP:
Hide the foreign key columns in your data tables to force
users to filter using the primary keys in the lookup tables
BEST PRACTICES: DATA MODELING
Focus on building a normalized model from the start
• Make sure that each table in your model serves a single, distinct purpose
• Use relationships rather than merged tables; long & narrow tables are better than short & wide ones

Organize lookup tables above data tables in the diagram view

• This serves as a visual reminder that filters flow “downstream”

Avoid complex cross-filtering unless absolutely necessary

• Don’t use two-way filters when 1-way filters will get the job done

Hide fields from report view to prevent invalid filter context

• Recommend hiding foreign keys from data tables, so that users can only access valid fields
Reference sources:

- Microsoft PowerBI website

- PowerBI resources on Coursera, Udemy
Disclaimer
The information in this document is highly confidential and may be legally privileged. It
is intended solely for the addressee. Access to this presentation by anyone else is
unauthorized. If you are not the intended recipient, any disclosure, copying, distribution
or any action taken or omitted to be taken in reliance on it, is prohibited and may be
unlawful. The sample screens shown in this presentation are CONVZ FZE’s IP and
cannot be used or distributed without their prior consent. This presentation is
considered approved for submission to the Client by the Above-Authorized signatory.
