L03B-Dimensional Modeling II
Dimensional Modeling II
Samuel I. G. Situmeang
Lecture Objectives
• Rules of Fact Table Design
• Rules of Dimension Table Design
• Dimension Cases in Detail
• Conformed Dimensions
• Date and Time Dimensions
• Degenerate Dimensions
• Slowly Changing Dimensions
• Role-Playing Dimensions
• Junk Dimensions
• Snowflake & Outrigger Dimensions
• Fact Table Cases in Detail
• Facts of Different Granularity
• Multiple Currencies / Units of Measure
• Factless Fact Tables
• Consolidated Fact Tables
• Do's and Don'ts of DM
Dimensional Modeling II Data Warehouse and Business Intelligence
Rules of Fact Table Design
• The primary key of the fact table uses the minimum number of columns possible and no surrogate keys. It should be made up of foreign keys (FKs) and degenerate dimensions.
• Referential integrity is a must. Every foreign key in the fact table must have a value.
• Avoid NULLs in a foreign key by pointing it at a special flag row in the dimension instead.
  • Ex. a “No Shopper Card” row in the Customer dimension
• The granularity of the fact table should be the lowest, most detailed atomic grain captured by the business process (discussed last time).
• Each fact should be additive, or redesigned to be as additive as possible.
• Every fact in the table must be of the same granularity.
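The key and referential-integrity rules above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a reference design: the table and column names (dim_customer, fact_sales, receipt_no and so on) and the sample values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

# Hypothetical schema: the fact PK is built from FKs plus a degenerate dimension.
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE fact_sales (
    date_key      INTEGER NOT NULL,
    product_key   INTEGER NOT NULL,
    customer_key  INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    receipt_no    TEXT    NOT NULL,   -- degenerate dimension
    sales_amount  REAL    NOT NULL,   -- additive fact
    PRIMARY KEY (date_key, product_key, customer_key, receipt_no)
);
""")

# Key 0 is the special flag row used instead of a NULL foreign key.
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?)",
    [(0, "No Shopper Card"), (1, "Alice")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
    [(20240101, 7, 1, "R-1", 10.0), (20240101, 7, 0, "R-2", 4.5)],
)

# A NULL key and a key with no dimension row are both rejected.
rejected = 0
for bad_key in (None, 99):
    try:
        conn.execute(
            "INSERT INTO fact_sales VALUES (20240101, 7, ?, 'R-3', 1.0)",
            (bad_key,),
        )
    except sqlite3.IntegrityError:
        rejected += 1

rows = conn.execute(
    "SELECT c.customer_name, SUM(f.sales_amount) "
    "FROM fact_sales f JOIN dim_customer c ON c.customer_key = f.customer_key "
    "GROUP BY c.customer_name ORDER BY c.customer_name"
).fetchall()
```

Because the anonymous sale points at the flag row, an inner join to the dimension still returns it, grouped under “No Shopper Card”.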
Stat ID (PK)  Player ID  Game ID  Shot Attempts  Shots Made  Points  Pts Per Shot  Shooting Pct
1             Jordan     1        3              2           5       1.667         0.667
2             Jordan     2        7              6           12      1.714         0.583
3             Miller     1        2              0           0       0.000         0.000
4             Miller     2        5              3           9       1.800         0.600
5             Miller     1        2              0           0       0.000         0.000
Can you find the 3 things wrong with the implementation of this fact table?
Poor choice of FK (or PK)
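One issue the additivity rule points at: Pts Per Shot is a ratio, so it cannot be summed or averaged across rows. A small sketch using Jordan's two rows from the table above (the variable names are made up for illustration) shows the difference between averaging the stored ratio and recomputing it from the additive columns.

```python
# Jordan's two rows from the table: (shot_attempts, points, pts_per_shot)
jordan_rows = [(3, 5, 1.667), (7, 12, 1.714)]

# Summing or averaging the stored ratio gives a number with no clear meaning.
summed_ratio = sum(r[2] for r in jordan_rows)
averaged_ratio = summed_ratio / len(jordan_rows)   # treats both games equally

# Sum the additive facts first, then derive the ratio at query time.
total_attempts = sum(r[0] for r in jordan_rows)
total_points = sum(r[1] for r in jordan_rows)
pts_per_shot = total_points / total_attempts       # weighted by shot volume
```

Storing only the additive components (attempts, shots made, points) and deriving ratios in the query keeps the fact table safe to aggregate at any level.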
Prod Id Prod Name Prod Cat Prod Price Prod Region Code
Prod Id   Prod Name   Prod Cat   Prod Price   Prod Reg Code
Issues flagged: Not Discretely Valued; Poor Data Quality; Incomplete
• Surrogate keys (identities, sequences, e.g. 1, 2, 3, …) are used for the primary key constraint.
• They yield the best performance for the star schema:
  • most efficient joins,
  • smaller indexes in the fact table,
  • more rows per block in the fact table.
• They have no dependency on the primary keys in the operational source data.
  • This makes it easier to deal with changes to the source data.
• Each dimension table also requires a natural key (business key) to identify a unique row.
  • Ex: Customer’s email address, Employee’s ID number.
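A minimal sketch of a dimension that carries both kinds of key, using Python's sqlite3 module. The table layout and the helper get_or_create_customer_key are hypothetical names for this example, and the UNIQUE constraint on the natural key assumes no row versioning (it would have to be relaxed for a Type 2 slowly changing dimension).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    email        TEXT NOT NULL UNIQUE,               -- natural (business) key
    full_name    TEXT NOT NULL
)""")

def get_or_create_customer_key(email, full_name):
    """Map a natural key to its surrogate key, inserting the row if needed."""
    row = conn.execute(
        "SELECT customer_key FROM dim_customer WHERE email = ?", (email,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO dim_customer (email, full_name) VALUES (?, ?)",
        (email, full_name),
    )
    return cur.lastrowid  # the surrogate key SQLite just generated

k_ann = get_or_create_customer_key("ann@example.com", "Ann")
k_bob = get_or_create_customer_key("bob@example.com", "Bob")
k_ann_again = get_or_create_customer_key("ann@example.com", "Ann")
```

The fact table stores only the small integer customer_key; the email address stays in the dimension, where the load process uses it to find the right row.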
Conformed Dimensions
Subset
Degenerate Dimensions
Slowly Changing Dimensions
• Dimension data changes infrequently, but when it does you need a strategy for handling the change.
  • Ex: What happens when a customer has a new address, or an employee changes their name?
4 popular strategies:
Type 1: Overwrite the existing attribute
Type 2: Add a new dimension row
Type 3: Add a new dimension attribute
Mini-Dimension: Add a new dimension
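To make the difference between Type 2 and Type 3 concrete, here is a small in-memory sketch. The validity columns (valid_from, valid_to, is_current), the previous_city attribute, and the sample customer are assumptions for illustration; the mini-dimension strategy is not shown.

```python
from datetime import date

# Type 2: add a new dimension row (with a new surrogate key) and keep the old one.
customer_dim = [
    {"customer_key": 1, "customer_id": "C100", "city": "Medan",
     "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True},
]

def apply_type2(dim, customer_id, new_city, change_date):
    """Expire the current row for customer_id and append a new current row."""
    current = next(
        r for r in dim if r["customer_id"] == customer_id and r["is_current"]
    )
    current["valid_to"] = change_date
    current["is_current"] = False
    new_row = {
        **current,
        "customer_key": max(r["customer_key"] for r in dim) + 1,
        "city": new_city,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    }
    dim.append(new_row)
    return new_row["customer_key"]

# Type 3: add a new attribute that holds the previous value on the same row.
def apply_type3(row, new_city):
    row["previous_city"] = row["city"]
    row["city"] = new_city

new_key = apply_type2(customer_dim, "C100", "Jakarta", date(2024, 6, 1))

type3_row = {"customer_key": 1, "customer_id": "C100", "city": "Medan"}
apply_type3(type3_row, "Jakarta")
```

With Type 2, old facts keep pointing at key 1 (Medan) and new facts use key 2 (Jakarta), so history is preserved. With Type 3 there is still one row, and only the single most recent prior value survives.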
Type 1: Overwrite
• Appropriate for:
  • correcting mistakes or errors in data
  • changes where historical associations do not matter
  • cases where the old value has no significance
• If the previous value matters, don’t use this strategy: you would be rewriting history.
• Problems will occur with data already aggregated on the old values.
• Ex. Employee name changes, corrections, natural key edits.
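The "rewriting history" effect is easy to see in a small sketch. The employee, department values, and sales rows below are invented for illustration.

```python
# Hypothetical dimension (keyed by surrogate key) and fact rows.
employee_dim = {1: {"employee_id": "E7", "name": "A. Smith", "department": "Sales"}}
fact_sales = [
    {"employee_key": 1, "year": 2023, "amount": 100},
    {"employee_key": 1, "year": 2024, "amount": 50},
]

def totals_by_department(dim, facts):
    """Aggregate fact amounts by the department currently stored in the dimension."""
    totals = {}
    for f in facts:
        dept = dim[f["employee_key"]]["department"]
        totals[dept] = totals.get(dept, 0) + f["amount"]
    return totals

before = totals_by_department(employee_dim, fact_sales)

# Type 1: overwrite the attribute in place; no new row, no new key.
employee_dim[1]["department"] = "Marketing"

after = totals_by_department(employee_dim, fact_sales)
```

After the overwrite, the 2023 sales are reported under Marketing even though they were made while the employee was in Sales. That is acceptable for a correction, but misleading when the old value mattered.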
Role-Playing Dimensions
Junk Dimensions
• Miscellaneous flags and text attributes which do not fit within any other dimension.
• Do not make a dimension for each one.
• Instead, place them in their own “Junk” dimension.
Invoice Indicator Id   Payment Terms   Order Mode   Ship Mode
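One common way to populate a junk dimension is to generate every combination of the flag values up front and assign each a surrogate key. The sketch below follows the column names of the table above; the flag values themselves (Cash, Credit, Web and so on) are assumptions for illustration.

```python
from itertools import product

# Hypothetical domains for the three low-cardinality attributes.
payment_terms = ["Cash", "Credit"]
order_modes = ["Web", "Phone"]
ship_modes = ["Air", "Truck"]

# One junk-dimension row per combination, keyed by a surrogate id.
junk_dim = [
    {"invoice_indicator_id": key, "payment_terms": p, "order_mode": o, "ship_mode": s}
    for key, (p, o, s) in enumerate(
        product(payment_terms, order_modes, ship_modes), start=1
    )
]

# During the fact load, translate the three source flags into one foreign key.
lookup = {
    (r["payment_terms"], r["order_mode"], r["ship_mode"]): r["invoice_indicator_id"]
    for r in junk_dim
}
fk_for_invoice = lookup[("Credit", "Web", "Truck")]
```

The fact table then carries a single foreign key instead of three separate flag columns or three tiny dimensions. If the number of combinations is large, rows can instead be added only as combinations are actually observed.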
Hierarchies in Dimensions
Multi-Valued Dimensions
1. Events or Transactions: Transaction (single event)
2. Workflows, a.k.a. Accumulating Snapshots: Accumulating Snapshot (events over time)
3. Points in time, a.k.a. Periodic Snapshots: Periodic Snapshot (point in time)
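The three grains can be contrasted with a small sketch. The account movements, the order workflow, and all field names below are hypothetical.

```python
from collections import defaultdict

# 1. Transaction grain: one row per event.
transactions = [
    {"day": 1, "account": "A", "amount": 100},
    {"day": 1, "account": "A", "amount": -30},
    {"day": 3, "account": "A", "amount": 50},
]

# 3. Periodic snapshot grain: one row per account per day, even on quiet days.
def daily_balance_snapshot(txns, account, first_day, last_day):
    movement = defaultdict(int)
    for t in txns:
        if t["account"] == account:
            movement[t["day"]] += t["amount"]
    snapshot, balance = [], 0
    for day in range(first_day, last_day + 1):
        balance += movement[day]
        snapshot.append({"day": day, "account": account, "balance": balance})
    return snapshot

snapshot = daily_balance_snapshot(transactions, "A", 1, 3)

# 2. Accumulating snapshot grain: one row per workflow instance,
#    updated in place as each milestone is reached.
order_pipeline = {"order_no": "SO-1", "ordered_day": 1,
                  "shipped_day": None, "delivered_day": None}
order_pipeline["shipped_day"] = 2
order_pipeline["delivered_day"] = 5
days_to_deliver = order_pipeline["delivered_day"] - order_pipeline["ordered_day"]
```

Note that the snapshot has a row for day 2 although nothing happened that day, while the transaction table has none. The accumulating snapshot is the only one of the three whose rows are revisited after insertion.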
Facts of Different Granularity
• A single fact table cannot hold facts at different levels of granularity.
• All measurements must be at the same level of detail.
• Example:
  • Measurements are captured for each order line, except for the shipping charge, which applies to the entire order.
• Solutions:
  • Allocate the higher-level fact down to the lower granularity (split the shipping charge among the order lines).
  • Create two separate fact tables (an Orders fact and an Order Line fact).
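The allocation option can be sketched as follows. Splitting in proportion to each line's amount is an assumption made for this example (the business might prefer weight, quantity, or an even split), and the function name is hypothetical. Amounts are integer cents to avoid floating-point drift.

```python
def allocate_shipping(line_amounts_cents, shipping_cents):
    """Split an order-level charge across lines, proportional to line amount.

    Assumes at least one line and a positive order total.
    """
    total = sum(line_amounts_cents)
    shares = [shipping_cents * amt // total for amt in line_amounts_cents]
    # Integer division rounds down; hand the lost cents to the first lines
    # so the allocated shares still add up to the original charge.
    leftover = shipping_cents - sum(shares)
    for i in range(leftover):
        shares[i] += 1
    return shares

even_split = allocate_shipping([5000, 3000, 2000], 1000)
with_rounding = allocate_shipping([100, 100, 100], 100)
```

Whatever rule is chosen, the allocated values must sum back to the order-level charge; otherwise totals computed from the line-level fact table will not reconcile with the source system.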
Resources
• Reading:
  • R. Kimball and M. Ross. (2002). The Data Warehouse Toolkit (2nd Edition), Wiley & Sons.
  • R. Kimball and M. Ross. (2013). The Data Warehouse Toolkit (3rd Edition), Wiley & Sons.