4.0 - Lesson 3 - Data Modeling
COURSE
Chapter 3
DATA MODELING
fit@hcmus
Learning Objectives
What products were sold
Where those products were bought
Surrogate keys
• Often generated by the database system
• An integer whose value carries no business meaning
• Provides an identifier that is consistent and unique across source systems and time, and independent of business systems
• An integer is an efficient data type to index and join on in a relational model
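A minimal sketch of a database-generated surrogate key, using Python's built-in sqlite3 module. The table name Dim_Customers follows the example used in this chapter; the column names CustomerKey and CustomerName and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Dim_Customers (
        CustomerKey  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key (assumed name)
        CustomerName TEXT NOT NULL                        -- hypothetical attribute
    )
    """
)

# The load process never supplies CustomerKey; the database generates it.
conn.executemany(
    "INSERT INTO Dim_Customers (CustomerName) VALUES (?)",
    [("Alice",), ("Bob",)],
)

surrogate_keys = [
    row[0]
    for row in conn.execute("SELECT CustomerKey FROM Dim_Customers ORDER BY CustomerKey")
]
print(surrogate_keys)
```

The generated integers say nothing about the customer, which is the point: they only identify the row.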
Dimension keys
Best practice:
Keep the source system's primary key as an alternate key in the dimension (also called the source system's natural key)
Ex: Dim_Customers
CustomerNK is the natural key in the dimension and the primary key in the source system
SOR_NK:
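The natural-key practice for Dim_Customers could be sketched as follows. CustomerNK comes from the example above; CustomerKey, CustomerName, the helper lookup_or_insert and the sample key values are hypothetical names introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Dim_Customers (
        CustomerKey  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        CustomerNK   TEXT NOT NULL UNIQUE,               -- source system's primary key
        CustomerName TEXT NOT NULL
    )
    """
)
# UNIQUE assumes one dimension row per source record (no row-versioned history).


def lookup_or_insert(conn, customer_nk, name):
    """Return the surrogate key for a natural key, inserting the row if it is new."""
    row = conn.execute(
        "SELECT CustomerKey FROM Dim_Customers WHERE CustomerNK = ?",
        (customer_nk,),
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(
        "INSERT INTO Dim_Customers (CustomerNK, CustomerName) VALUES (?, ?)",
        (customer_nk, name),
    )
    return cur.lastrowid


first = lookup_or_insert(conn, "CUST-001", "Alice")
again = lookup_or_insert(conn, "CUST-001", "Alice")  # same source record
other = lookup_or_insert(conn, "CUST-002", "Bob")
```

Keeping CustomerNK in the dimension lets the load process match incoming source rows to existing dimension rows, while fact tables join only on CustomerKey.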
Dimension hierarchy
Groups things together in the ways an enterprise would measure itself
These hierarchies represent many-to-one relationships
Ex:
DimProduct: Product → Product subcategory → Product category
DimDate: Day → Month → Quarter → Year
DimGeography: City → State → Country
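A small sketch of how a many-to-one hierarchy supports roll-up, using the DimProduct levels as the example. The product names, the sales amounts and the helper roll_up are made up for illustration.

```python
from collections import defaultdict

# Each child maps to exactly one parent (many-to-one). Sample members are hypothetical.
product_to_subcategory = {
    "Road-150": "Road Bikes",
    "Mountain-100": "Mountain Bikes",
    "Sport Helmet": "Helmets",
}
subcategory_to_category = {
    "Road Bikes": "Bikes",
    "Mountain Bikes": "Bikes",
    "Helmets": "Accessories",
}

# (product, sales amount) pairs at the lowest level of the hierarchy.
sales = [
    ("Road-150", 300.0),
    ("Mountain-100", 200.0),
    ("Sport Helmet", 50.0),
    ("Road-150", 100.0),
]


def roll_up(sales, *levels):
    """Sum amounts after walking each member up through the given level mappings."""
    totals = defaultdict(float)
    for member, amount in sales:
        for level in levels:
            member = level[member]
        totals[member] += amount
    return dict(totals)


by_subcategory = roll_up(sales, product_to_subcategory)
by_category = roll_up(sales, product_to_subcategory, subcategory_to_category)
```

Because every child has exactly one parent, each sale is counted once at every level and totals stay consistent as the analysis moves up the hierarchy.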
Dimension hierarchy
Important
A dimension table can hold several hierarchy levels
Record every time level needed for the analysis, in accordance with the content being analyzed
Example:
Analyzing the financial data of an enterprise: day, week, month, quarter, year, holiday…
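One way to record these time levels is to generate one date dimension row per day. The sketch below assumes column names (DateKey, Day, Week, ...) and a YYYYMMDD-style DateKey, neither of which comes from the slides. Holidays are left out because they need an external, country-specific list.

```python
from datetime import date, timedelta


def build_dim_date(start, end):
    """Return one dictionary per calendar day between start and end, inclusive."""
    rows = []
    current = start
    while current <= end:
        # ISO week number; days near a year boundary can belong to the adjacent ISO year.
        _, iso_week, _ = current.isocalendar()
        rows.append(
            {
                "DateKey": int(current.strftime("%Y%m%d")),  # assumed key format
                "Date": current.isoformat(),
                "Day": current.day,
                "Week": iso_week,
                "Month": current.month,
                "Quarter": (current.month - 1) // 3 + 1,
                "Year": current.year,
                "IsWeekend": current.weekday() >= 5,  # Saturday or Sunday
            }
        )
        current += timedelta(days=1)
    return rows


dim_date = build_dim_date(date(2024, 1, 1), date(2024, 12, 31))
```

Storing every level as its own column means a query can group by week, month, quarter or year without computing anything at query time.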
Modeling the calendar
Gregorian calendar
Modeling the calendar
Fiscal Calendar
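A fiscal calendar usually shifts the year and quarter boundaries relative to the Gregorian calendar. The sketch below assumes a fiscal year that starts on 1 July and is named after the calendar year in which it ends. Both conventions vary between organizations, and the function and column names are hypothetical.

```python
from datetime import date

FISCAL_START_MONTH = 7  # assumption: the fiscal year starts on 1 July


def fiscal_period(d, start_month=FISCAL_START_MONTH):
    """Map a calendar date to fiscal year, quarter and month under the stated assumptions."""
    # Months elapsed since the start of the fiscal year, in the range 0..11.
    offset = (d.month - start_month) % 12
    # Name the fiscal year after the calendar year in which it ends.
    if start_month > 1 and d.month >= start_month:
        fiscal_year = d.year + 1
    else:
        fiscal_year = d.year
    return {
        "FiscalYear": fiscal_year,
        "FiscalQuarter": offset // 3 + 1,
        "FiscalMonth": offset + 1,
    }


first_day = fiscal_period(date(2024, 7, 1))
last_day = fiscal_period(date(2024, 6, 30))
mid_year = fiscal_period(date(2025, 1, 15))
```

In a date dimension these values would normally be stored as extra columns next to the Gregorian ones, so both calendars can be used in the same query.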
Slowly Changing Dimension (SCD)
SCD: a technique for handling changes to dimension attribute values and, where needed, storing their historical values
Three types of SCD:
Overwrite the old values with the new ones (SCD type 1)
Preserve the old values:
store the old values as rows (SCD type 2)
store the old values as columns (SCD type 3)
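The three SCD types can be contrasted on a single attribute change. This is a sketch on in-memory rows, not a full load process; the column names (City, PreviousCity, ValidFrom, ValidTo, IsCurrent), the function names and the sample values are assumptions for illustration.

```python
from datetime import date


def initial_rows():
    """One customer row as it exists before the change."""
    return [
        {
            "CustomerKey": 1,
            "CustomerNK": "CUST-001",
            "City": "Hanoi",
            "PreviousCity": None,
            "ValidFrom": date(2020, 1, 1),
            "ValidTo": None,
            "IsCurrent": True,
        }
    ]


def apply_scd1(rows, customer_nk, new_city):
    """Type 1: overwrite the old value; no history is kept."""
    for row in rows:
        if row["CustomerNK"] == customer_nk:
            row["City"] = new_city


def apply_scd2(rows, customer_nk, new_city, change_date):
    """Type 2: close the current row and add a new row with a new surrogate key."""
    current = next(r for r in rows if r["CustomerNK"] == customer_nk and r["IsCurrent"])
    new_row = dict(
        current,
        CustomerKey=max(r["CustomerKey"] for r in rows) + 1,
        City=new_city,
        ValidFrom=change_date,
        ValidTo=None,
        IsCurrent=True,
    )
    current["IsCurrent"] = False
    current["ValidTo"] = change_date
    rows.append(new_row)


def apply_scd3(rows, customer_nk, new_city):
    """Type 3: keep the prior value in a separate column of the same row."""
    for row in rows:
        if row["CustomerNK"] == customer_nk:
            row["PreviousCity"] = row["City"]
            row["City"] = new_city


type1 = initial_rows()
apply_scd1(type1, "CUST-001", "Da Nang")

type2 = initial_rows()
apply_scd2(type2, "CUST-001", "Da Nang", date(2024, 6, 1))

type3 = initial_rows()
apply_scd3(type3, "CUST-001", "Da Nang")
```

Type 2 is the only variant here that adds a row, so the same natural key can appear under several surrogate keys. Type 3 keeps just one previous value per attribute.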
FACT TABLE KEYS
Key columns: a group of foreign keys (FK) that point to the primary keys of the dimension tables associated with this fact table, enabling business analysis.
The primary key of a fact table is typically a multipart key: the combination of foreign keys that uniquely identifies a fact table row.
Alternatives:
the primary key is a surrogate key
the primary key includes degenerate dimensions
FACT TABLE KEYS
The relationships between fact tables and the dimensions are one-to-many
If combining a subset of foreign keys creates uniqueness, then this multipart key will become the primary key
Fact keys: (DateKey, StoreKey, ProductKey, CustomerKey)
Fact table: primary key is a surrogate key
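A sketch of the multipart key option using sqlite3. The four key columns come from the slide; the measure SalesAmount and the sample values are hypothetical, and the foreign key constraints to the dimension tables are omitted to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE FactSales (
        DateKey     INTEGER NOT NULL,
        StoreKey    INTEGER NOT NULL,
        ProductKey  INTEGER NOT NULL,
        CustomerKey INTEGER NOT NULL,
        SalesAmount REAL NOT NULL,  -- hypothetical measure
        PRIMARY KEY (DateKey, StoreKey, ProductKey, CustomerKey)
    )
    """
)

row = (20240101, 1, 10, 100, 25.0)
conn.execute("INSERT INTO FactSales VALUES (?, ?, ?, ?, ?)", row)

# A second row with the same combination of dimension keys violates the primary key.
try:
    conn.execute("INSERT INTO FactSales VALUES (?, ?, ?, ?, ?)", row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM FactSales").fetchone()[0]
```

If the same customer can buy the same product in the same store twice on one day, this combination is no longer unique. That is the situation where a surrogate key, or a degenerate dimension such as an order number, becomes part of the primary key instead.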
[Diagram: FactSales with its Date and Product dimensions]
Date: DateKey, Year, Quarter, Month, Day
FactSales: DateKey, ProductKey, SalePrices
Product: ProductKey, ProName, Category, SubCategory
http://sqlserver-qa.net/2015/06/25/different-types-of-facts-and-fact-tables-in-data-warehouse-design/
FACT MEASURES - Semi-Additive Facts
[Diagram: FactSales with its Date and Customer dimensions]
Date: DateKey, Year, Quarter, Month, Day
FactSales: DateKey, ProductKey, TotalSalePrices, NetProfitMargin
Customer: CusKey, CusName
http://sqlserver-qa.net/2015/06/25/different-types-of-facts-and-fact-tables-in-data-warehouse-design/
SCHEMA
Which schema should you use when building the dimensional model?
What kind of analysis are you trying to perform on the data, and how complex is it?
What are the analytical requirements and restrictions?
How consistent is the data you want to query and analyze?
What BI tool do you plan to use?
Star Schema