C - High Level Dimensional Modeling
Assignment C:
High Level Dimensional Modeling
Part 1: Overview
This assignment will introduce you to the High Level Dimensional Modeling process. The goal of this
process is to turn functional business requirements into dimensional data warehouse (DDS)
specifications based on the Kimball technical architecture. Upon completing this lab activity you will learn:
Additional techniques for profiling data using the SQL query language to identify master data and
business processes.
The process of high-level dimensional modeling, including:
o Create a high-level dimensional model diagram (Kimball: fig. 7-3, p. 304).
o Create an attribute and metrics list (Kimball: fig. 7-2, p. 294).
o Keep track of issues.
Goals
Specifically the goals of this assignment are to:
Understand the goals of the high-level dimensional modeling process and practice its steps.
Master the act of profiling data and transforming functional requirements into technical
specifications for a Kimball (DDS) data warehouse architecture.
Understand the value of the high level modeling worksheet as a technical documentation tool,
which can be later used to determine how to properly build tables in our DDS.
Technical Requirements
To complete this assignment you will need the following:
Access to the course externalworld.cent-su.org SQL Server, and specifically the Northwind
Traders database. You should connect to this server before starting the assignment.
The Dimensional modeling Excel Workbook, available in the same place where you got this
assignment.
Microsoft Excel for editing the workbook
IST722 – Data Warehouse Homework Assignment C
Michael A. Fudge, Jr. High Level Dimensional Modeling
1. Sales Reporting. Senior management would like to be able to track sales by customer, employee,
product, and supplier, with the goal of establishing which products are the top sellers, which
employees place the most orders, and who the best suppliers are.
2. Order Fulfillment and Delivery. There is a need to analyze the order fulfillment process to see if
the time between when the order is placed and when it is shipped can be improved.
3. Product Inventory Analysis. Management requires a means to track Inventory, On Order, and
Re-Order levels of products by supplier or category. Inventory levels should be snapshotted daily
and recorded into the warehouse for analysis.
4. Sales Coverage Analysis. An analysis of the employees and the sales territories they cover.
Getting Started
Connect to your SQL Server using Azure Data Studio and open a query window for the
Northwind database.
Open the High-Level-Dimensional-Modeling Excel Workbook to the Bus Matrix page. You can
find this workbook in the same place you got this assignment.
can determine that each row represents the sale of a product, or a line item on an order. This is a
transaction type fact, and so we update as follows:
To really know the answer to this question you’ll need to query the data and understand the processes
by which the data arrives in the table. This is where data profiling comes into play. Let’s take a look.
NOTE: In real life you won’t strike gold so easily. You’ll have to look at several tables before you can get
a clear picture of your fact table grain.
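One way to test a candidate grain is to compare the total row count with the count of distinct key combinations. The following is only a minimal sketch: it uses Python's built-in sqlite3 module with a handful of invented rows as a stand-in for the Northwind Order Details table, and the table name OrderDetails is an assumption made for illustration. Against the real server you would run the equivalent SQL in your query window.

```python
import sqlite3

# In-memory stand-in for the Order Details table (invented sample rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderDetails ("
    "OrderID INTEGER, ProductID INTEGER, UnitPrice REAL, Quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO OrderDetails VALUES (?, ?, ?, ?)",
    [
        (100, 1, 10.0, 5),
        (100, 2, 20.0, 3),
        (101, 1, 10.0, 7),
        (102, 1, 10.0, 2),
    ],
)

# Candidate grain: one row per product per order.
# If that holds, every (OrderID, ProductID) pair appears exactly once.
total_rows = conn.execute("SELECT COUNT(*) FROM OrderDetails").fetchone()[0]
distinct_pairs = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT OrderID, ProductID FROM OrderDetails)"
).fetchone()[0]

grain_holds = total_rows == distinct_pairs
print(total_rows, distinct_pairs, grain_holds)
```

If the two counts differ, some order and product combination repeats, and the grain statement needs another look.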
For example, if you review the database diagram on page 2 of the lab you’ll see that the Order Details
table connects directly to the Products table via a foreign key in a many-to-one relationship. Because a
product appears on multiple orders, Product is a good candidate for a dimension. Once again we can verify
that this dimension works for us and “rolls up” a couple of our known facts by writing some SQL.
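A roll-up check along these lines might look like the sketch below. It again uses sqlite3 and invented rows in place of the real Northwind tables, so the table names, product names, and numbers are assumptions rather than real data. The point is the shape of the query: join the business process table to the candidate dimension and aggregate a known fact by it.

```python
import sqlite3

# Invented sample rows standing in for Products and Order Details.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER);
INSERT INTO Products VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO OrderDetails VALUES (100, 1, 5), (100, 2, 3), (101, 1, 7), (102, 1, 2);
""")

# Roll the Quantity fact up by product: if this produces sensible totals,
# Product behaves like a dimension for this business process.
rollup = conn.execute("""
    SELECT p.ProductName,
           COUNT(DISTINCT od.OrderID) AS OrderCount,
           SUM(od.Quantity)           AS TotalQuantity
    FROM OrderDetails AS od
    JOIN Products AS p ON p.ProductID = od.ProductID
    GROUP BY p.ProductName
    ORDER BY TotalQuantity DESC
""").fetchall()

for row in rollup:
    print(row)
```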
Important Tip: You should always exercise caution when profiling live systems. Executing SQL queries
against production data is usually not a wise decision, as you may impact performance negatively. It is
important to seek the advice of a Database Administrator prior to embarking on your data profiling
adventure!
In this case, if we determine the hierarchy is useful we can consolidate the attributes we need from it
into the product dimension. This makes more sense than including a separate dimension for Category.
There are cases where some other business process might need Suppliers or Categories on their own, and
then it would make sense to keep them as separate tables linked to Product rather than folding them
into it. This is the fundamental idea behind snowflaking.
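To illustrate consolidating a hierarchy into the product dimension, here is a small sketch that flattens Category into each Product row. The tables and rows are invented stand-ins built with sqlite3, not the real Northwind data, and the column selection is an assumption about which attributes you would keep.

```python
import sqlite3

# Invented sample rows standing in for Categories and Products.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Categories (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT, CategoryID INTEGER);
INSERT INTO Categories VALUES (1, 'Hardware'), (2, 'Toys');
INSERT INTO Products VALUES (1, 'Widget', 1), (2, 'Gadget', 1), (3, 'Puzzle', 2);
""")

# Flatten the Product -> Category hierarchy: each product row carries its
# category name, so no separate Category dimension is needed.
# LEFT JOIN keeps products even if their category is missing.
product_dim = conn.execute("""
    SELECT p.ProductID, p.ProductName, c.CategoryName
    FROM Products AS p
    LEFT JOIN Categories AS c ON c.CategoryID = p.CategoryID
    ORDER BY p.ProductID
""").fetchall()

for row in product_dim:
    print(row)
```

The category name repeats on every product in that category. That redundancy is the price of a flat dimension, and it is usually an acceptable one.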
Once you’ve identified a useful dimension, it’s time to add it to our Bus Matrix like so. In this example
we’ve added the Product dimension.
Important Tip: There should always be a many-to-one relationship between the business process table
and the master data which makes up your dimension. One row in the dimension should appear many
times in the business process. For example, one product appears many times on different orders.
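The many-to-one rule can be profiled directly. The sketch below is one possible check under stated assumptions: sqlite3 and invented rows stand in for Northwind, and the Products table is deliberately created without a primary key constraint so that the duplicate-key test is meaningful.

```python
import sqlite3

# Invented sample rows; no PRIMARY KEY on Products so duplicates are possible.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductID INTEGER, ProductName TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER);
INSERT INTO Products VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO OrderDetails VALUES (100, 1), (100, 2), (101, 1), (102, 1);
""")

# Check 1: every business process row points at an existing dimension row.
orphans = conn.execute("""
    SELECT COUNT(*)
    FROM OrderDetails AS od
    LEFT JOIN Products AS p ON p.ProductID = od.ProductID
    WHERE p.ProductID IS NULL
""").fetchone()[0]

# Check 2: the dimension key is unique, so each row matches at most one product.
duplicate_keys = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT ProductID FROM Products GROUP BY ProductID HAVING COUNT(*) > 1)
""").fetchone()[0]

many_to_one = orphans == 0 and duplicate_keys == 0
print(orphans, duplicate_keys, many_to_one)
```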
Fast-forward through some more data profiling and here’s a screenshot of the dimensions I’ve
discovered so far:
Important Tip: The X at the intersection of dimension and business process indicates there will be a
foreign key in our DDS connecting the business process to the dimension table. Our goal is to re-use
dimensions like Product, Customer, etc. across other business processes. This is called conforming
dimensions.
Important Tip: When you build the fact table, each date will be a foreign key (FK) back to the date dimension.
How many of a specific product category were sold? Category is the attribute of the Product
dimension and how many is the measurement, and therefore the fact.
Which customers have ordered the most? Customer is the dimension and Sold Amount is the
measurement (fact).
From merely identifying the fact grain of the model you probably already have a few facts in mind (they
can be found in the business process table), but now’s the time to really nail down the facts you need in
your model. Like everything else in this step a lot will depend on your requirements.
One important thing to recognize is that not all facts appear among your source data. Some of the facts
you’ll need are derived facts, where we do a little math on some of the source data values. We include the
facts we want in the Bus Matrix but explain how they are derived in the Attributes and Metrics worksheet.
For now, we’ll add the following facts to our Bus Matrix and complete it.
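As a sketch of what a derived fact looks like, the example below computes a sold amount per line item. The formula Quantity * UnitPrice * (1 - Discount) is an assumption for illustration, not something taken from the workbook, and the rows are invented stand-ins loaded into sqlite3. Use whatever definition your requirements call for and record it in the Attributes and Metrics worksheet.

```python
import sqlite3

# Invented sample rows standing in for the Order Details source columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderDetails ("
    "OrderID INTEGER, ProductID INTEGER, UnitPrice REAL, Quantity INTEGER, Discount REAL)"
)
conn.executemany(
    "INSERT INTO OrderDetails VALUES (?, ?, ?, ?, ?)",
    [
        (100, 1, 10.0, 5, 0.0),
        (100, 2, 20.0, 3, 0.25),
    ],
)

# Derived fact (assumed formula): the amount is not stored in the source,
# so we calculate it from three source columns.
sold_amounts = conn.execute("""
    SELECT OrderID, ProductID,
           Quantity * UnitPrice * (1 - Discount) AS SoldAmount
    FROM OrderDetails
    ORDER BY OrderID, ProductID
""").fetchall()

for row in sold_amounts:
    print(row)
```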
This is not a technical problem but a data governance problem. The process is not trivial: the
organization must agree on how to represent the values, if at all.
The idea behind the Attributes and Metrics worksheet is to define your facts and outline the important
attributes in your dimensions.
1. Sales Reporting. Senior management would like to be able to track sales by customer, employee,
product, and supplier, with the goal of establishing which products are the top sellers, which
employees place the most orders, and who the best suppliers are.
2. Order Fulfillment and Delivery. There is a need to analyze the order fulfillment process to see if
the time between when the order is placed and when it is shipped can be improved.
3. Product Inventory Analysis. Management requires a means to track Inventory, On Order, and
Re-Order levels of products by supplier or category. Inventory levels should be snapshotted daily
and recorded into the warehouse for analysis.
4. Sales Coverage Analysis. An analysis of the employees and the sales territories they cover.
In this part, you will repeat the process outlined in Part 2 of the assignment for the remaining three
business processes.
When you are finished you should have the following in your Dimensional Modeling Workbook:
1. A completed Bus Matrix with all 4 business processes in it. No dimensions should repeat. To re-
use a dimension for another business process, include an X at its intersection.
IMPORTANT TIP: Keep in mind you can only model the data you have. If it’s not in your external world
source data (in this case, it’s Northwind Traders) then you cannot include it in your data warehouse!
Turning it in:
Please turn in the submission template with your completed reflection and with your name, NetID, and date
at the top.
Please turn in your completed High Level Dimensional Modeling worksheet and make sure your name,
NetID, and date appear somewhere at the top of the Bus Matrix page.