DWDesign Practice Mini Case
This mini case study contains two data sources with sample data along with a statement
of business needs. Using the data sources and business needs, you will specify a dimensional
model with dimensions, measures, and grain, create a schema design for the data warehouse that
integrates the data sources, identify summarizability problems in the design, and populate the data warehouse tables.
Data Sources
The case study involves two data sources for a retail firm. The Purchase database
with the purchase number, date, payment method, delivery date, and supplier. A purchase
contains a collection of products with the quantity and unit cost recorded on a purchase line
along with links to the product and purchase heading. Each product has one preferred supplier.
Individual stores of the retail firm also maintain an inventory of custom products ordered
from local suppliers. These products are ordered through the purchase spreadsheets for custom
products. Inventory practices for custom products are informal. New products are typically
purchased when the manager senses new demand for local items.
The ERD in Figure 1 supports the purchase database. Tables 1 to 4 show sample data for
the tables in the purchase database. The supply purchase spreadsheet (Table 5) contains a
sample of purchases of custom products from local suppliers. The Stock column in the
Practice Mini Case for Data Warehouse Design
Business Needs
The main purpose of the data warehouse is to track inventory balances over time.
Inventory balances are a type of snapshot. Snapshots are typical in applications in which
balances are involved, such as account balances in financial services, enrollment in courses,
reservations in hospitality and travel, and head count in personnel management. Snapshots
cannot be aggregated over time correctly. Summing quantities and values over time is not
meaningful.
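
The point about aggregating snapshots can be illustrated with a small sketch. The periods, product codes, and quantities below are invented for illustration and do not come from the case tables, and the function names are hypothetical. The sketch sums quantity on hand across products within a period, which is meaningful, and then contrasts a naive sum over periods with two alternatives often used for balances, the closing balance and the average balance.

```python
from collections import defaultdict

# Hypothetical month-end snapshots: (period, product, quantity on hand).
# These values are illustrative only and are not taken from the case data.
snapshots = [
    ("2024-01", "P1", 10),
    ("2024-01", "P2", 5),
    ("2024-02", "P1", 8),
    ("2024-02", "P2", 7),
    ("2024-03", "P1", 12),
    ("2024-03", "P2", 4),
]


def total_by_period(rows):
    """Sum quantity across products within each period (valid for snapshots)."""
    totals = defaultdict(int)
    for period, _product, qty in rows:
        totals[period] += qty
    return dict(totals)


def closing_balance(rows):
    """Total balance in the latest period, one alternative to summing over time."""
    totals = total_by_period(rows)
    return totals[max(totals)]


def average_balance(rows):
    """Average of the per-period totals, another alternative to summing over time."""
    totals = total_by_period(rows)
    return sum(totals.values()) / len(totals)


per_period = total_by_period(snapshots)

# Adding the period totals together counts the same stock several times,
# so this figure does not describe any real inventory level.
naive_sum = sum(per_period.values())

print(per_period)
print("naive sum over time:", naive_sum)
print("closing balance:", closing_balance(snapshots))
print("average balance:", average_balance(snapshots))
```

With this sample data the naive sum is roughly three times the size of any single period's balance, while the closing and average balances stay on the same scale as the period totals.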
The basic values for inventory tracking are quantity on hand and inventory value.
Inventory valuation can be complex as many accounting methods exist to value inventory. For
this case, the purchase price or unit cost of the inventory can be used for valuation. The data
warehouse should support detailed tracking of inventory to the individual product, purchased by
Here are typical computations for analyzing and tracking inventory balances using the
The change in inventory levels between consecutive periods and parallel periods
The relative contribution of the stocked item to the overall stock value
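
The two computations listed above can be sketched as follows, assuming inventory value is quantity on hand multiplied by unit cost as the case permits. The periods, products, quantities, and costs are invented for illustration, and the function names are hypothetical. The same change function serves both comparisons: consecutive periods (one month to the next) and parallel periods (the same month in the prior year).

```python
# Hypothetical snapshots: period -> {product: (quantity on hand, unit cost)}.
# These values are illustrative only and are not taken from the case data.
balances = {
    "2023-02": {"P1": (6, 2.00), "P2": (10, 1.00)},
    "2024-01": {"P1": (10, 2.00), "P2": (5, 1.00)},
    "2024-02": {"P1": (8, 2.50), "P2": (10, 1.00)},
}


def stock_value(period):
    """Inventory value per product in a period, valued at unit cost."""
    return {p: qty * cost for p, (qty, cost) in balances[period].items()}


def quantity_change(product, earlier, later):
    """Change in quantity on hand for a product between two periods."""
    return balances[later][product][0] - balances[earlier][product][0]


def contribution(period):
    """Each product's share of the total stock value in a period."""
    values = stock_value(period)
    total = sum(values.values())
    return {p: v / total for p, v in values.items()}


# Consecutive periods: January 2024 to February 2024.
print(quantity_change("P1", "2024-01", "2024-02"))
# Parallel periods: February 2023 to February 2024.
print(quantity_change("P1", "2023-02", "2024-02"))
# Relative contribution of each product to the February 2024 stock value.
print(contribution("2024-02"))
```

Note that the contribution is computed within a single period, so it does not add balances across time.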
Problems
1. You should identify dimensions, map dimensions to data sources, and specify dimension
hierarchies. For each dimension, you should identify its data sources and attributes in each
data source. For hierarchical dimensions, you should indicate the levels from broad to
narrow.
2. You should specify measures, related data sources, and measure aggregation properties.
3. Identify the grain in your dimensional design using the business needs as a guideline. You
should then indicate relative storage requirements for the grain using the statistics for the data
sources. Using the cardinality estimates provided, you should determine either the fact table
size or sparsity and then compute the unknown grain size variable. For example, you should
4. Extend your analysis to design a star schema (or variation) to support inventory analysis. For
each table, you should define the table name, primary key, and columns. You do not need to
5. Identify summarizability problems in your star schema and indicate preferred resolutions of
the summarizability problems. For incomplete dimension-fact relationships, you should also
6. You should populate your data warehouse tables based on the data in the sample tables and
spreadsheet. You do not need to write SQL INSERT statements or insert the data into your
tables. You can just show table listings in your solution. You should indicate mappings from
data sources into tables. For example, a mapping may involve generating new primary key
values for a data warehouse table or using a default value for a missing value.
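
The two kinds of mapping mentioned in problem 6 can be sketched as follows. This is only an illustration of the idea, not a solution to the case: the source rows, column names, the "Unknown" default, and the function name are all assumptions and do not come from the case tables or spreadsheet. The sketch generates a new primary key for each warehouse row and substitutes a default value where the source value is missing.

```python
from itertools import count

# Hypothetical source rows; a missing supplier is represented as None.
# These rows are illustrative only and are not taken from the case data.
source_products = [
    {"prod_no": "P100", "name": "Widget", "supplier": "Acme"},
    {"prod_no": "C7", "name": "Local honey", "supplier": None},
]

DEFAULT_SUPPLIER = "Unknown"  # assumed default for a missing value


def load_product_dim(rows, start_key=1):
    """Map source rows to dimension rows with generated primary key values."""
    next_key = count(start_key)
    dim = []
    for row in rows:
        dim.append({
            "product_key": next(next_key),     # generated warehouse primary key
            "source_prod_no": row["prod_no"],  # source identifier kept for traceability
            "name": row["name"],
            "supplier": row["supplier"] or DEFAULT_SUPPLIER,  # default for missing value
        })
    return dim


product_dim = load_product_dim(source_products)
for dim_row in product_dim:
    print(dim_row)
```

Keeping the source identifier next to the generated key makes each mapping easy to show in a table listing.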