Week 3

The document provides an overview of data warehousing concepts, focusing on data warehouse architecture, data marts, and the elements of dimensional data models including facts, dimensions, and attributes. It explains types of dimensions such as slowly changing dimensions, conformed dimensions, junk dimensions, degenerate dimensions, and role-playing dimensions. Additionally, it discusses data warehouse schemas, particularly the star schema and its structure involving fact and dimension tables.

Data Warehousing

Week # 03

Data warehouse architecture

Data Mart

Reasons for creating Data Marts



From Data warehouse to Data Marts



Characteristics of Departmental Data Mart



Elements of Dimensional Data Model

• Fact
• Dimension
• Attributes
• Fact Table
• Dimension Table

Fact
• Facts are the measurements or metrics produced by a
business process. For a Sales business process, a
fact would be the quarterly sales number.

Dimension
• A dimension provides the context surrounding a business
process event. In simple terms, dimensions give the who, what,
and where of a fact. In the Sales business process, for the fact
quarterly sales number, the dimensions would be:
• Who – Customer
• Where – Location
• What – Product

Facts & Dimensions

“Facts and dimensions are the fundamental elements that
define a data warehouse. They record relevant events of a
subject or functional area (facts) and the characteristics
that define them (dimensions).”

Attributes
• Attributes are the various characteristics of a dimension
in dimensional data modeling.
• In the Location dimension, the attributes can be:
• State
• Country
• Zip code, etc.
• Attributes are used to search, filter, or classify facts.
• Dimension tables contain attributes.

Fact Table:
• A fact table contains records that combine the measures of a
business process with keys that reference the different
dimension tables.
• These records allow users to analyze different
aspects of their business, which can aid in
decision-making and improving the business.

Dimension table:

• It provides the context and background
information for the measures recorded in the fact
table. One of the main differences between fact
tables and dimension tables is that dimension
tables contain the descriptive attributes by which the
measures in the fact table are filtered and grouped.
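The relationship between a fact table and its dimension tables can be sketched in a few lines of Python. This is only an illustration: the table names, keys, and values below are hypothetical and do not come from the slides. The fact rows carry keys and a numeric measure, while the dimension rows carry the attributes used to filter and group.

```python
# Hypothetical in-memory tables; all names and values are illustrative.
# Dimension tables: one row per member, looked up by a key.
customer_dim = {
    1: {"name": "Alice", "segment": "Retail"},
    2: {"name": "Bob", "segment": "Wholesale"},
}
location_dim = {
    10: {"state": "NY", "country": "USA", "zipcode": "10001"},
    20: {"state": "MA", "country": "USA", "zipcode": "02108"},
}

# Fact table: keys into the dimensions plus a numeric measure.
sales_fact = [
    {"customer_key": 1, "location_key": 10, "sales_amount": 120.0},
    {"customer_key": 2, "location_key": 10, "sales_amount": 80.0},
    {"customer_key": 1, "location_key": 20, "sales_amount": 50.0},
]


def sales_by_state(facts, locations):
    """Classify facts: sum the measure grouped by the 'state' attribute."""
    totals = {}
    for row in facts:
        state = locations[row["location_key"]]["state"]
        totals[state] = totals.get(state, 0.0) + row["sales_amount"]
    return totals


def sales_for_segment(facts, customers, segment):
    """Filter facts: sum the measure for one customer segment."""
    return sum(
        row["sales_amount"]
        for row in facts
        if customers[row["customer_key"]]["segment"] == segment
    )


print(sales_by_state(sales_fact, location_dim))
print(sales_for_segment(sales_fact, customer_dim, "Retail"))
```

Note that the fact rows never repeat the state or segment text. They hold only keys, and the attributes are looked up in the dimension tables when a question is asked.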

Facts & Dimensions



Facts

Dimensions

Types of Dimensions in Data Warehouse


1. Slowly Changing Dimension
2. Conformed Dimension
3. Degenerate Dimension
4. Junk Dimension
5. Role-playing Dimension

Slowly Changing Dimension


• These are dimension tables whose attribute values change
slowly over time.
• There are three main types of Slowly Changing
Dimension tables:
• SCD Type 1
• SCD Type 2
• SCD Type 3

SCD Type 1
• In this type, no history is stored. Taking an Address dimension as an
example, we do not store any address history for an
individual.

• The dimension table holds only the latest address of the
individual.

• This approach is easy to design and saves storage.
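A Type 1 change can be sketched as a plain overwrite. The dictionary layout, the customer id, and the helper name `scd1_update` are assumptions made for illustration only.

```python
# Hypothetical address dimension keyed by customer id (illustrative data).
address_dim = {101: {"customer": "John", "city": "New York"}}


def scd1_update(dim, customer_id, new_city):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    dim[customer_id]["city"] = new_city


scd1_update(address_dim, 101, "Boston")
print(address_dim[101]["city"])  # prints "Boston"
```

After the update there is still exactly one row for the customer, and nothing in the table records that the earlier city ever existed.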



SCD Type 2
• This is the most commonly used of the three types. In this
approach, complete history is maintained, along with dates that
identify the period during which each record is valid.

• These dates are usually called start_date and end_date. If the end date is
empty, the record is the active one.

• In the Address example, if John moves from New York to Boston, the
Address dimension would store from when to when John lived in
New York and from when he started living in Boston.

• This approach takes more storage, because every change adds a new row.
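The Type 2 behaviour described above can be sketched as follows. The column names `start_date` and `end_date` come from the slide; everything else (the `address_key` column, the dates, the helper names) is a hypothetical choice for this example. An empty end date is represented as `None`.

```python
from datetime import date

# Hypothetical address dimension: one row per version of a customer's address.
# end_date is None for the active record.
address_dim = [
    {"address_key": 1, "customer_id": 101, "city": "New York",
     "start_date": date(2015, 1, 1), "end_date": None},
]


def scd2_update(dim, customer_id, new_city, change_date):
    """Type 2: close the active row, then append a new active row."""
    for row in dim:
        if row["customer_id"] == customer_id and row["end_date"] is None:
            row["end_date"] = change_date
    dim.append({
        "address_key": max((r["address_key"] for r in dim), default=0) + 1,
        "customer_id": customer_id,
        "city": new_city,
        "start_date": change_date,
        "end_date": None,
    })


def city_on(dim, customer_id, day):
    """Return the city that was valid for the customer on a given day."""
    for row in dim:
        if (row["customer_id"] == customer_id
                and row["start_date"] <= day
                and (row["end_date"] is None or day < row["end_date"])):
            return row["city"]
    return None


scd2_update(address_dim, 101, "Boston", date(2020, 6, 1))
print(city_on(address_dim, 101, date(2018, 3, 1)))  # prints "New York"
print(city_on(address_dim, 101, date(2021, 1, 1)))  # prints "Boston"
```

The sketch treats a row as valid from its start date up to, but not including, its end date. That boundary convention is an assumption; other conventions are equally possible.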



SCD Type 3
• In this type, only a limited history is stored.
• For example, a business may want to keep only the current address and
the previous address of each individual, without storing all
of the earlier addresses.
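One way to sketch Type 3 is to keep the current and previous values in two columns of the same row. The column names `current_city` and `previous_city` and the helper `scd3_update` are assumptions for illustration.

```python
# Hypothetical address dimension with one "previous" column per customer.
address_dim = {
    101: {"customer": "John", "current_city": "New York", "previous_city": None},
}


def scd3_update(dim, customer_id, new_city):
    """Type 3: shift the current value into 'previous', then overwrite it."""
    row = dim[customer_id]
    row["previous_city"] = row["current_city"]
    row["current_city"] = new_city


scd3_update(address_dim, 101, "Boston")
scd3_update(address_dim, 101, "Chicago")
print(address_dim[101])
```

After the second move, the row holds Chicago as the current city and Boston as the previous one. New York is no longer stored, which is the limited history the slide describes.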

Conformed Dimension
• A dimension that can be used by multiple fact tables
and has the same meaning across the model is
called a Conformed Dimension.
• For example, a dimension that holds a list of
places can be used across multiple fact
tables.
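A small sketch can show why a shared dimension is useful. Here one hypothetical location dimension is referenced by two hypothetical fact tables (sales and shipments); the names and numbers are illustrative only.

```python
# One shared (conformed) dimension, used by both fact tables below.
location_dim = {1: {"city": "New York"}, 2: {"city": "Boston"}}

sales_fact = [
    {"location_key": 1, "sales_amount": 100.0},
    {"location_key": 2, "sales_amount": 40.0},
    {"location_key": 1, "sales_amount": 60.0},
]
shipment_fact = [
    {"location_key": 1, "units_shipped": 5},
    {"location_key": 2, "units_shipped": 3},
]


def total_by_city(facts, measure, locations):
    """Sum one measure of any fact table, grouped by the shared city attribute."""
    totals = {}
    for row in facts:
        city = locations[row["location_key"]]["city"]
        totals[city] = totals.get(city, 0) + row[measure]
    return totals


sales = total_by_city(sales_fact, "sales_amount", location_dim)
shipped = total_by_city(shipment_fact, "units_shipped", location_dim)

# Both results are keyed by the same attribute, so they line up directly.
report = {city: (sales.get(city, 0), shipped.get(city, 0)) for city in sales}
print(report)
```

Because both fact tables use the same keys and the same city values, the two totals can be placed side by side without any translation step.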

Conformed Dimension Example



Junk Dimension
• A junk dimension combines several low-cardinality
flags and attributes into a single dimension
table rather than modeling each of them as a separate dimension
table.

• Cardinality:
• It is the number of distinct values in a table column relative to the number of rows in the table.
Repeated values in the column don’t count.
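As a rough sketch, a junk dimension can be built by enumerating every combination of a few low-cardinality flags. The flag names and values below are hypothetical, and pre-building all combinations is just one possible approach.

```python
from itertools import product

# Hypothetical low-cardinality flags on an order (illustrative values).
flag_values = {
    "payment_type": ["cash", "card"],
    "is_gift": [False, True],
    "is_express": [False, True],
}

# Build the junk dimension: one row per combination of flag values.
names = list(flag_values)
junk_dim = {}
for key, combo in enumerate(product(*flag_values.values()), start=1):
    junk_dim[key] = dict(zip(names, combo))

# Reverse lookup from a combination of flag values to its key.
junk_key = {tuple(row[n] for n in names): key for key, row in junk_dim.items()}

# A fact row stores a single junk key instead of three separate flags.
order_fact = {
    "order_id": 5001,
    "junk_key": junk_key[("card", True, False)],
    "amount": 25.0,
}


def cardinality(rows, column):
    """Number of distinct values in a column; repeats are not counted."""
    return len({row[column] for row in rows})


print(len(junk_dim))                                   # prints 8
print(cardinality(junk_dim.values(), "payment_type"))  # prints 2
```

Three flags with two values each give 2 x 2 x 2 = 8 rows in the junk dimension, while each individual column still has only two distinct values.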

Degenerate Dimension

• A Degenerate Dimension is a key, such as a transaction
number, invoice number, ticket number, or bill-of-lading
number, that has no attributes and hence does not join to
an actual dimension table.
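In the sketch below, a hypothetical `invoice_number` column sits directly in the fact rows and has no dimension table behind it. It is still useful for grouping the line items that belong to the same invoice. All names and values are illustrative.

```python
# Hypothetical sales fact rows: invoice_number is a degenerate dimension,
# stored in the fact row itself with no dimension table to look up.
sales_fact = [
    {"invoice_number": "INV-001", "product_key": 1, "sales_amount": 30.0},
    {"invoice_number": "INV-001", "product_key": 2, "sales_amount": 20.0},
    {"invoice_number": "INV-002", "product_key": 1, "sales_amount": 15.0},
]


def total_per_invoice(facts):
    """Group fact rows by the degenerate key and sum the measure."""
    totals = {}
    for row in facts:
        number = row["invoice_number"]
        totals[number] = totals.get(number, 0.0) + row["sales_amount"]
    return totals


print(total_per_invoice(sales_fact))
```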

Role-playing Dimension

• A single physical dimension can be referenced multiple times in
a fact table, with each reference linking to a logically distinct
role for the dimension.
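A common illustration, assumed here and not taken from the slides, is a single date dimension that an order fact references twice: once as the order date and once as the ship date. The key format and column names are hypothetical.

```python
# One physical date dimension (illustrative rows).
date_dim = {
    20240105: {"date": "2024-01-05", "month": "January"},
    20240202: {"date": "2024-02-02", "month": "February"},
}

# Each fact row references date_dim twice, in two different roles.
order_fact = [
    {"order_id": 1, "order_date_key": 20240105,
     "ship_date_key": 20240105, "amount": 10.0},
    {"order_id": 2, "order_date_key": 20240105,
     "ship_date_key": 20240202, "amount": 20.0},
]


def amount_by_month(facts, role_key, dates):
    """Sum the amount by month, using the chosen role of the date dimension."""
    totals = {}
    for row in facts:
        month = dates[row[role_key]]["month"]
        totals[month] = totals.get(month, 0.0) + row["amount"]
    return totals


print(amount_by_month(order_fact, "order_date_key", date_dim))
print(amount_by_month(order_fact, "ship_date_key", date_dim))
```

The same dimension rows answer two different questions, "how much was ordered per month" and "how much was shipped per month", depending on which key column is followed.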
DWH-Ahsan Abdullah 35

Normalization

De-Normalization

Data warehouse Schema


• A schema is a logical description of the entire
database. It includes the name and description of
records of all record types, including all associated
data items and aggregates.
• Common data warehouse schemas are:
• Star Schema
• Snowflake Schema
• Fact Constellation Schema

Star Schema
• Each dimension in a star schema is represented
by only one dimension table.
• This dimension table contains the set of attributes.
• The example star schema holds the sales data of a
company with respect to four dimensions,
namely time, item, branch, and location.
• There is a fact table at the center. It contains the
keys to each of the four dimensions.
• The fact table also contains the measures, namely
dollars sold and units sold.
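The sales star schema described above could be sketched with Python's built-in sqlite3 module. The four dimensions and the two measures come from the slide; the table names, column names, and sample rows are assumptions made for this example.

```python
import sqlite3

# In-memory database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim     (time_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE item_dim     (item_key INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE branch_dim   (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT);

-- Fact table at the center: one key per dimension plus the two measures.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    branch_key   INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    dollars_sold REAL,
    units_sold   INTEGER
);

INSERT INTO time_dim VALUES (1, 2024, 'Q1'), (2, 2024, 'Q2');
INSERT INTO item_dim VALUES (1, 'Laptop');
INSERT INTO branch_dim VALUES (1, 'Main');
INSERT INTO location_dim VALUES (1, 'Boston');
INSERT INTO sales_fact VALUES
    (1, 1, 1, 1, 1000.0, 2),
    (2, 1, 1, 1, 500.0, 1),
    (1, 1, 1, 1, 250.0, 1);
""")

# Join the fact table to one dimension and aggregate the measures.
rows = conn.execute("""
    SELECT t.quarter, SUM(f.dollars_sold), SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON t.time_key = f.time_key
    GROUP BY t.quarter
    ORDER BY t.quarter
""").fetchall()

print(rows)
```

Every query against this layout has the same shape: start from the fact table, join only the dimension tables whose attributes are needed, and aggregate the measures.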