Unit 2 - Data Warehouse Logical Design
Logical Design
Age of data: Current & time series | Current & real time
Figure: Sample E-R Diagram (attributes shown include Supplier_Name and Supplier_Address)
Prepared By: Mukesh Prasad Chaudhary.
• A data warehouse is based on a multidimensional data
model which views data in the form of a data cube
• A data cube, such as sales, allows data to be modeled
and viewed in multiple dimensions
– Dimension tables, such as item (item_name, brand,
type), or time(day, week, month, quarter, year)
– Fact table contains measures (such as dollars_sold)
and keys to each of the related dimension tables
• The lattice of cuboids forms a data cube.
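Figure: Lattice of cuboids for the dimensions time, item, location, supplier
0-D (apex) cuboid: all
3-D cuboids: (time, item, location), (time, item, supplier), (time, location, supplier), (item, location, supplier)
4-D (base) cuboid: (time, item, location, supplier)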
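The lattice can be made concrete with a short sketch. The Python below enumerates every cuboid for the four dimensions above and aggregates a measure for two of them. The fact rows and their values are invented purely for illustration, and the function names `lattice` and `aggregate` are hypothetical helpers, not part of any warehouse product.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("time", "item", "location", "supplier")

# Hypothetical fact rows: one value per dimension plus the dollars_sold measure.
FACTS = [
    {"time": "Q1", "item": "TV", "location": "North", "supplier": "S1", "dollars_sold": 100},
    {"time": "Q1", "item": "TV", "location": "South", "supplier": "S2", "dollars_sold": 50},
    {"time": "Q2", "item": "PC", "location": "North", "supplier": "S1", "dollars_sold": 70},
]


def lattice(dimensions):
    """Return every cuboid (subset of the dimensions), keyed by its dimensionality."""
    return {k: list(combinations(dimensions, k)) for k in range(len(dimensions) + 1)}


def aggregate(facts, cuboid, measure="dollars_sold"):
    """Sum the measure for each combination of values of the cuboid's dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in cuboid)
        totals[key] += row[measure]
    return dict(totals)


levels = lattice(DIMENSIONS)
cuboid_count = sum(len(cuboids) for cuboids in levels.values())

apex = aggregate(FACTS, ())            # 0-D cuboid: a single grand total
by_time = aggregate(FACTS, ("time",))  # one of the 1-D cuboids

print(cuboid_count, apex, by_time)
```

With n dimensions and no concept hierarchies, the lattice holds 2^n cuboids, so the four dimensions here give 16. The apex cuboid groups by nothing and the base cuboid groups by all four dimensions.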
Figure: Dimensional Model (dimensions: Region, Product, Time)
A Data Mining Query Language, DMQL:
Language Primitives
Cube Definition (Fact Table)
define cube <cube_name> [<dimension_list>]: <measure_list>
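To show how the placeholders in this template are filled in, here is a minimal Python sketch that renders a `define cube` statement from its parts. The helper `define_cube`, the cube name `sales`, and the source column `sales_in_dollars` are assumptions chosen for illustration. The sketch only builds the statement text and does not execute DMQL.

```python
def define_cube(cube_name, dimensions, measures):
    """Render a DMQL 'define cube' statement from a cube name, dimensions and measures."""
    dimension_list = ", ".join(dimensions)
    measure_list = ", ".join(measures)
    return f"define cube {cube_name} [{dimension_list}]: {measure_list}"


# Hypothetical cube: the measure expression names an assumed source column.
statement = define_cube(
    "sales",
    ["time", "item", "location", "supplier"],
    ["dollars_sold = sum(sales_in_dollars)"],
)
print(statement)
```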
Characteristics of the snowflake schema:
- Dimension tables are normalized
- Each hierarchical level has its own table
- Less storage space is required
- Queries can require many joins when they involve attributes in secondary
dimension tables
Product Dimension: Product Key, Product Desc
City Dimension: City Key, City, State, Region
Store Dimension: Store Key, Store Name, City, State, Region
Any Solution???
Figure: Star schema
Figure: Snowflake schema
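The trade-off between the two schemas can be sketched with Python's built-in `sqlite3` module. The table names (`store_star`, `store_snow`, `city_dim`, `sales_fact`) and all rows are hypothetical. The sketch assumes a store dimension that either carries City, State and Region itself (star) or points to a separate city table (snowflake).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: one denormalized store dimension; city attributes repeat per store.
CREATE TABLE store_star (
    store_key INTEGER PRIMARY KEY, store_name TEXT,
    city TEXT, state TEXT, region TEXT);

-- Snowflake: city attributes are normalized into their own table.
CREATE TABLE city_dim (
    city_key INTEGER PRIMARY KEY, city TEXT, state TEXT, region TEXT);
CREATE TABLE store_snow (
    store_key INTEGER PRIMARY KEY, store_name TEXT,
    city_key INTEGER REFERENCES city_dim(city_key));

-- Fact table shared by both variants.
CREATE TABLE sales_fact (store_key INTEGER, dollars_sold REAL);
""")

conn.executemany("INSERT INTO store_star VALUES (?, ?, ?, ?, ?)", [
    (1, "Store A", "City X", "State P", "East"),
    (2, "Store B", "City X", "State P", "East"),
    (3, "Store C", "City Y", "State Q", "West"),
])
conn.executemany("INSERT INTO city_dim VALUES (?, ?, ?, ?)", [
    (1, "City X", "State P", "East"),
    (2, "City Y", "State Q", "West"),
])
conn.executemany("INSERT INTO store_snow VALUES (?, ?, ?)", [
    (1, "Store A", 1), (2, "Store B", 1), (3, "Store C", 2),
])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?)", [
    (1, 100.0), (2, 50.0), (3, 70.0),
])

# Star schema: a single join reaches the region attribute.
star_result = conn.execute("""
    SELECT s.region, SUM(f.dollars_sold)
    FROM sales_fact f JOIN store_star s ON s.store_key = f.store_key
    GROUP BY s.region ORDER BY s.region
""").fetchall()

# Snowflake schema: a second join is needed to reach the same attribute.
snow_result = conn.execute("""
    SELECT c.region, SUM(f.dollars_sold)
    FROM sales_fact f
    JOIN store_snow s ON s.store_key = f.store_key
    JOIN city_dim c ON c.city_key = s.city_key
    GROUP BY c.region ORDER BY c.region
""").fetchall()

print(star_result, snow_result)
```

Both queries return the same totals per region. The star version stores the city, state and region values once per store, while the snowflake version stores them once per city at the cost of the extra join.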
Class Assignment (5)
An online wine retailer needs a data warehouse to record the quantity
and sales of the wines it sells to its customers. Part of the original
database consists of the following tables:
CUSTOMER (Code, Name, Address, Phone, BDay, Gender)
WINE (Code, Name, Type, Vintage, BottlePrice, CasePrice, Class)
CLASS (Code, Name, Region)
TIME (TimeStamp, Date, Year)
ORDER (Customer, Wine, Time, nrBottles, nrCases)
Note that the tables represent the main entities of the ER schema,
thus it is necessary to derive the significant relationships among
them in order to correctly design the data warehouse.
FACT Sales
MEASURES Quantity, Cost
DIMENSIONS Customer, Area, Time, Wine Class
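One possible way to derive the two measures from the source tables is sketched below in Python. It is not the expected answer to the assignment. It assumes that Cost is the bottle and case counts multiplied by the corresponding WINE prices, and that Quantity is counted in bottles with a hypothetical case size of 12, which the assignment does not state. The Area and Class dimension keys are omitted for brevity, and all sample values are invented.

```python
from dataclasses import dataclass

BOTTLES_PER_CASE = 12  # assumption: the case size is not given in the assignment


@dataclass(frozen=True)
class Wine:
    code: str
    name: str
    bottle_price: float
    case_price: float


@dataclass(frozen=True)
class Order:
    customer: str  # CUSTOMER.Code
    wine: str      # WINE.Code
    time: str      # TIME.TimeStamp
    nr_bottles: int
    nr_cases: int


@dataclass(frozen=True)
class SalesFact:
    customer_key: str
    wine_key: str
    time_key: str
    quantity: int  # measure: total bottles sold (assumed definition)
    cost: float    # measure: total price of the order line (assumed definition)


def to_fact(order, wines):
    """Turn one ORDER row into one row of the Sales fact table."""
    wine = wines[order.wine]
    quantity = order.nr_bottles + order.nr_cases * BOTTLES_PER_CASE
    cost = order.nr_bottles * wine.bottle_price + order.nr_cases * wine.case_price
    return SalesFact(order.customer, order.wine, order.time, quantity, cost)


# Hypothetical sample data.
wines = {"W1": Wine("W1", "Sample Red", bottle_price=10.0, case_price=100.0)}
fact = to_fact(Order("CU1", "W1", "2020-01-01T10:00", nr_bottles=3, nr_cases=2), wines)
print(fact)
```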