lecture 4

This document covers dimensional modeling in data warehousing, focusing on the life cycle of data models, fact granularity, and types of fact tables including transaction, periodic, and accumulating facts. It also discusses various types of measures such as additive, semi-additive, non-additive, derived, and textual facts, as well as different types of dimension tables. The content is based on the book 'Business Intelligence, Analytics, and Data Science: A Managerial Perspective' by Sharda et al.


Data warehouse

(IS 422)
Lecture 4
Dimensional modeling
Dr. Wael Abbas
2023 - 2024

Slides in this file are from the following book: Sharda, Ramesh, Dursun Delen, and Efraim Turban. Business Intelligence,
Analytics, and Data Science: A Managerial Perspective. Pearson, 2018.
Data modeling
• The Elements of Data Model
Dimensional model life cycle:
1. Gathering Requirements (Source Driven, Business/User Driven).

2. Identify granularity of the facts

3. Identify the dimensions

4. Identify the facts


Fact granularity
• The grain is the definition of what a single row in the fact table
represents or contains.

• The grain describes the physical event that needs to be measured.

• The grain controls which dimensions are available in the fact.

• The grain represents the level of information we need to represent. It is not
always time; it could be the physical business measurement level.

• Design from the lowest possible grain.


Fact granularity
• Transaction: the most detailed grain is stored in a transaction table, where you
get a row for every business event related to that fact.

• Periodic: less detail, with one row covering an entire time period. For
example, rather than showing every transaction in your bank account, it
shows one row with the end-of-month or end-of-period balance.

• Accumulating: the least detailed grain, with the fewest rows. For example, when
you place an order, all the dates related to the order (requested, filled,
shipped, etc.) are placed on the same record. That record is updated in place.
Fact Types
• There are three types of fact tables in dimensional modeling:

1. Transaction fact

2. Periodic fact

3. Accumulating fact
Fact Types
1. Transaction fact tables are the most common in dimensional modeling.
They record a business event or transaction, one record at a time, with all
the data associated with the event. An example is a sales transaction, such
as the purchase of a book on Amazon. The purchase is recorded in the
transaction fact table for sales.

Characteristics of a transaction fact:

– The fact grain is set at a single transaction.
– It has one row per transaction.
– For each transaction, we add a single new record.
– The transaction fact table is known to grow very fast as the number of
transactions increases.
Fact Types
1. Transaction fact

Transaction fact example
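
As a minimal sketch of these characteristics, assuming a hypothetical sales fact whose column names and values are made up for illustration (they do not come from the slides), each business event appends exactly one new row and existing rows are never changed:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SalesTransaction:
    """One row of a hypothetical transaction fact table (grain: one sale)."""
    date_key: date
    customer_id: int
    product_id: int
    quantity: int
    sales_amount: float


# The fact table is modeled as an append-only list of rows.
sales_fact: list[SalesTransaction] = []


def record_sale(row: SalesTransaction) -> None:
    # A new transaction always adds a new row; nothing is updated in place.
    sales_fact.append(row)


record_sale(SalesTransaction(date(2023, 10, 30), 2010, 111, 1, 25.0))
record_sale(SalesTransaction(date(2023, 10, 31), 2010, 222, 2, 40.0))
record_sale(SalesTransaction(date(2023, 10, 31), 2011, 111, 1, 25.0))

print(len(sales_fact))  # one row per transaction
```

The row count equals the number of transactions, which is why this kind of table grows quickly.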


Fact Types
2. Periodic fact tables record snapshots of data at a specific time, such as
inventory levels at the end of a quarter or account balances at the end of
the month. Each row represents a fact at a specific point in time. A periodic
fact table contains one row for a group of transactions over a period.
Periods roll up from finer to coarser granularity: hourly, daily,
monthly, quarterly, then yearly.

cust_id   month_id   Total_incoming   Total_outgoing   Total_international
2010      20231031   2000             3000             200

Periodic fact example
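
One way such a periodic row could be produced is by rolling up the underlying transactions. The sketch below is illustrative only: the individual call records, the `calls` list, and the rule that `month_id` is the last day of the month are assumptions chosen so that the totals match the example row.

```python
import calendar
from collections import defaultdict

# Hypothetical call transactions: (cust_id, date key YYYYMMDD, call type, units).
calls = [
    (2010, 20231003, "incoming", 1200),
    (2010, 20231015, "incoming", 800),
    (2010, 20231007, "outgoing", 3000),
    (2010, 20231020, "international", 200),
]


def month_end_key(date_key: int) -> int:
    """Map a YYYYMMDD key to the key of the last day of the same month."""
    year, month = date_key // 10000, (date_key // 100) % 100
    last_day = calendar.monthrange(year, month)[1]
    return year * 10000 + month * 100 + last_day


# One snapshot row per (customer, month), holding the period totals.
snapshot = defaultdict(
    lambda: {"Total_incoming": 0, "Total_outgoing": 0, "Total_international": 0}
)
for cust_id, date_key, call_type, units in calls:
    snapshot[(cust_id, month_end_key(date_key))]["Total_" + call_type] += units

for (cust_id, month_id), totals in snapshot.items():
    print(cust_id, month_id, totals)
```

Four transaction rows collapse into a single periodic row for customer 2010 in October 2023.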


Fact Types
3. Accumulating fact tables store a record for the entire lifetime of the
event, showing the activity as it progresses.

An example of this, using an internet sales order, is where you record the
first event (the order being placed), then record subsequent dates and events
on the same row: when the credit card transaction was processed, when the order
was shipped, the various states during shipping, and finally when it was
delivered to the customer.

These dates are modeled with role-playing date dimensions: the same date
dimension is referred to in our example as the order date, ship date, and
delivered date.
Fact Types
3. Accumulating fact tables
Characteristics of an accumulating fact:
1. An accumulating fact table stores one row for the entire process.
2. An accumulating fact table does not accumulate time; it accumulates the steps of a
business process.
3. A row in an accumulating snapshot fact table summarizes the measurement events
occurring at predictable steps between the beginning and the end of a process.
4. Accumulating fact tables are used to show the progress of activity through a well-
defined process and are most often used to research the time between milestones.
5. These fact tables are updated as the business process unfolds and each milestone is
completed.
Fact Types
3. Accumulating fact tables

Use cases of an accumulating fact:

1. An order life cycle.
2. Insurance processing.
3. Hiring process.
Fact Types
3. Accumulating fact tables
Example of Accumulated Snapshot: Telecom company

• The fact table is named fact_claim_processing.

• This fact represents the claim life cycle inside the company.

• It contains the details related to a claim.

• This fact is updated after each stage finishes.

• The requirement is to report the number of days between each stage (milestone)
and the claim date (the start).
Fact Types
3. Accumulating fact tables
Example of Accumulated Snapshot: telecom company
Fact_claim        Fact_claim_accumulated
Claim_key         Claim_key
Customer_key      Customer_key
Claim_date        Claim_date
inspect_date      inspect_date
Review_date       Day_to_inspect
Decision_date     Review_date
                  Day_to_review
                  Decision_date
                  Day_to_decision
                  Process_complete_flag
Fact Types
3. Accumulating fact tables
Example of Accumulated Snapshot: telecom company

Claim_key               21000
Customer_key            2001
Claim_date              2023-10-31
inspect_date            2023-11-01
Day_to_inspect          1
Review_date             2023-11-04
Day_to_review           4
Decision_date           2023-11-06
Day_to_decision         6
Process_complete_flag   0
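
The record above can be reproduced with a small sketch, assuming each Day_to_* measure is the number of days elapsed since Claim_date (this matches the values 1, 4, and 6 in the example). The helper `record_milestone` is a hypothetical name, and the rule for setting Process_complete_flag is not given in the slides, so the flag is left at 0 as in the example.

```python
from datetime import date

# One row for the whole claim process; milestone columns start empty.
claim = {
    "Claim_key": 21000,
    "Customer_key": 2001,
    "Claim_date": date(2023, 10, 31),
    "inspect_date": None, "Day_to_inspect": None,
    "Review_date": None, "Day_to_review": None,
    "Decision_date": None, "Day_to_decision": None,
    "Process_complete_flag": 0,  # kept at 0, as in the example record
}


def record_milestone(row: dict, date_col: str, days_col: str, when: date) -> None:
    # The existing row is updated in place; no new row is added.
    row[date_col] = when
    row[days_col] = (when - row["Claim_date"]).days


record_milestone(claim, "inspect_date", "Day_to_inspect", date(2023, 11, 1))
record_milestone(claim, "Review_date", "Day_to_review", date(2023, 11, 4))
record_milestone(claim, "Decision_date", "Day_to_decision", date(2023, 11, 6))

print(claim["Day_to_inspect"], claim["Day_to_review"], claim["Day_to_decision"])
```

Storing the day counts on the row means a report on time between milestones needs no date arithmetic at query time.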
Measures types
• There are five types of measures: additive, semi-additive, non-additive, derived,
and textual facts.
1. Additive Facts
An additive fact is the easiest to define and manage. It is simply a measure in
the fact table that can be added across all dimensions. The simplest example
of an additive fact is the quantity of items you bought in an online store, such
as the number of books. Additive measures enable the fact to be aggregated by
all applicable dimensions, which in our example are customer, store, product,
and date.

Sales_fact
Date_key
Customer_id
Store_id
Product_id
Sales_amount
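
A minimal sketch of what "added across all dimensions" means, using the Sales_fact columns above with made-up rows: grouping Sales_amount by any one dimension and summing the groups always gives the same grand total. The helper `total_by` is a hypothetical name.

```python
from collections import defaultdict

# Hypothetical rows for the Sales_fact table shown above.
sales_fact = [
    {"Date_key": 20231030, "Customer_id": 1, "Store_id": 10, "Product_id": 111, "Sales_amount": 20.0},
    {"Date_key": 20231030, "Customer_id": 2, "Store_id": 10, "Product_id": 222, "Sales_amount": 35.0},
    {"Date_key": 20231031, "Customer_id": 1, "Store_id": 20, "Product_id": 111, "Sales_amount": 15.0},
    {"Date_key": 20231031, "Customer_id": 2, "Store_id": 20, "Product_id": 222, "Sales_amount": 30.0},
]


def total_by(rows, dimension):
    """Sum Sales_amount for each value of the given dimension column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["Sales_amount"]
    return dict(totals)


grand_total = sum(row["Sales_amount"] for row in sales_fact)
for dimension in ("Date_key", "Customer_id", "Store_id", "Product_id"):
    by_dim = total_by(sales_fact, dimension)
    # Every grouping adds back up to the same grand total.
    print(dimension, by_dim, sum(by_dim.values()) == grand_total)
```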
Measures types
2. Semi-additive Facts
Semi-additive facts are measurements in the fact table that can be added across
some dimensions but not others. For example, summing the daily balance of an
account across days does not give us any useful information, whereas summing the
balances of all accounts on a given day does.

Sales_fact
Date_key
account
Account_balance
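
The difference can be sketched with a few made-up balance rows. The helpers `total_on` and `closing_balance` are hypothetical names; taking the latest balance is only one common way to summarize a balance over time (an average is another).

```python
# Hypothetical daily balance rows using the columns shown above.
balances = [
    {"Date_key": 20231030, "account": "A", "Account_balance": 100.0},
    {"Date_key": 20231031, "account": "A", "Account_balance": 120.0},
    {"Date_key": 20231030, "account": "B", "Account_balance": 50.0},
    {"Date_key": 20231031, "account": "B", "Account_balance": 40.0},
]


def total_on(rows, date_key):
    """Adding across accounts for one day is meaningful: total money held."""
    return sum(r["Account_balance"] for r in rows if r["Date_key"] == date_key)


def closing_balance(rows, account):
    """Across time, take the latest balance instead of summing the days."""
    account_rows = [r for r in rows if r["account"] == account]
    return max(account_rows, key=lambda r: r["Date_key"])["Account_balance"]


# Summing one account over days produces a number that is not a balance.
naive_sum_a = sum(r["Account_balance"] for r in balances if r["account"] == "A")

print(total_on(balances, 20231031), closing_balance(balances, "A"), naive_sum_a)
```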
Measures types
3. Non-additive Facts
Non-additive facts are measures in fact tables that cannot be added across any
dimension. Non-additive facts are usually the result of ratios (percentages) or
other mathematical calculations. Even though they are numbers, they are not
supposed to be added.

Sales_fact
Date_key
account
Profit_Margin
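
To illustrate why a ratio should not be summed, the sketch below assumes Profit_Margin is profit divided by revenue and adds hypothetical `revenue` and `profit` columns (not in the slides). The overall margin is recomputed from the additive components rather than added or averaged from the per-row margins.

```python
# Hypothetical rows; revenue and profit are assumed additive components.
rows = [
    {"account": "A", "revenue": 100.0, "profit": 20.0},
    {"account": "B", "revenue": 300.0, "profit": 30.0},
]

# Per-row ratio (the non-additive fact).
margins = [r["profit"] / r["revenue"] for r in rows]

# Adding or averaging the ratios does not give the combined margin.
summed_margins = sum(margins)
averaged_margins = summed_margins / len(margins)

# Correct approach: aggregate the additive parts first, then take the ratio.
overall_margin = sum(r["profit"] for r in rows) / sum(r["revenue"] for r in rows)

print(margins, summed_margins, averaged_margins, overall_margin)
```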
Measures types
4. Derived Facts
Derived facts are created by performing a mathematical calculation
on a number of other facts, and are sometimes referred to as calculated facts.
Derived facts may or may not be stored inside the fact table.

Total_sales = Qty_Sold * ( Unit_price - Discount)


orders
Order_date
Order_id
discount
Unit_price
Qty_Sold
Total_sales
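
The formula above can be sketched as a small function. The order values are made up, and the discount is assumed to be a per-unit amount, as the formula implies.

```python
def total_sales(qty_sold: int, unit_price: float, discount: float) -> float:
    """Derived fact: Total_sales = Qty_Sold * (Unit_price - Discount)."""
    return qty_sold * (unit_price - discount)


# Hypothetical row of the orders table shown above.
order = {"Order_id": 1, "Qty_Sold": 3, "Unit_price": 10.0, "discount": 1.5}

# The derived value may be stored on the row or computed when queried.
order["Total_sales"] = total_sales(order["Qty_Sold"], order["Unit_price"], order["discount"])

print(order["Total_sales"])
```

Storing the result trades extra space for cheaper queries; computing it on demand keeps the table smaller.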
Measures types
5. Textual Facts
A textual fact consists of one or more characters, such as a flag.
Fact_claim_accumulated
Claim_key
Customer_key
Claim_date
inspect_date
Day_to_inspect
Review_date
Day_to_review
Decision_date
Day_to_decision
Process_complete_flag
Dimensions types
• Types of dimension tables:
1. Conformed Dimension.
2. Degenerate Dimension.
3. Junk Dimension (Garbage Dimension).
4. Role-Playing Dimension.
5. Outrigger Dimension.
6. Snowflake Dimension.
7. Shrunken Rollup Dimension.
8. Swappable Dimension.
9. Slowly Changing Dimension.
10. Fast Changing Dimension (Mini Dimension).
11. Heterogeneous Dimension.
12. Multi-valued Dimension.
Dimensions types
Conformed Dimension.
• A conformed dimension can be associated with different fact tables,
maintaining the same meaning with all of them.
• A typical conformed dimension is the date. Its meaning does not vary by
fact table. For this reason, most data warehouses have a single date
dimension shared by all fact tables.
Dimensions types
Degenerate Dimension

• A dimension key without a corresponding dimension table.

• It is stored in the fact table (it does not have a dimension table) because all the
interesting attributes have been placed in analytic dimensions.
• It is used to provide a grouping for business cases.

OrderID   OrderDate   ProductID   Quantity   Amount
123       123456789   111         2          120.45
123       123456789   222         5          10.45
431       98765122    333         1          15.45
431       98765122    555         6          4.45
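
As a sketch of the grouping role, the rows of the table above can be grouped on OrderID directly, with no dimension table to join to. The `order_lines` and `order_totals` names are illustrative, and `Decimal` is used only to keep the currency arithmetic exact.

```python
from collections import defaultdict
from decimal import Decimal

# Fact rows from the table above: (OrderID, OrderDate, ProductID, Quantity, Amount).
order_lines = [
    (123, 123456789, 111, 2, Decimal("120.45")),
    (123, 123456789, 222, 5, Decimal("10.45")),
    (431, 98765122, 333, 1, Decimal("15.45")),
    (431, 98765122, 555, 6, Decimal("4.45")),
]

# OrderID lives only in the fact table and serves as the grouping key.
order_totals = defaultdict(lambda: {"lines": 0, "Quantity": 0, "Amount": Decimal("0")})
for order_id, _order_date, _product_id, quantity, amount in order_lines:
    totals = order_totals[order_id]
    totals["lines"] += 1
    totals["Quantity"] += quantity
    totals["Amount"] += amount

for order_id, totals in order_totals.items():
    print(order_id, totals)
```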
