CT075-3-2-DTM-Topic 7 - Data Warehouse
CT075-3-2-DTM
– OLTP
– OLAP
– Data Cube
– Star schema
– Snowflake schema
– Fact constellations
– Information processing
– Analytical processing
Database Architecture
What is a Data Warehouse?
Data Warehouse—Integrated
Data Warehouse—Time Variant
Data Warehouse—Non-Volatile
How are organizations using the
information from data warehouses?
• Many organizations use this information to support
business decision-making activities.
Why a Separate Data Warehouse?
• High performance for both systems
– DBMS: tuned for OLTP (access methods, indexing,
concurrency control, recovery)
– Warehouse: tuned for OLAP (complex OLAP queries,
multidimensional view, consolidation)
• Different functions and different data:
– Missing data: decision support (DS) requires historical data,
which operational DBs do not typically maintain
– Data consolidation: DS requires consolidation
(aggregation, summarization) of data from
heterogeneous sources
– Data quality: different sources typically use inconsistent
data representations, codes, and formats, which have to
be reconciled
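The consolidation and data-quality points can be illustrated with a minimal sketch. It assumes two hypothetical operational sources that record the same kind of sale with different country names and different amount units; the field names, the `COUNTRY_CODES` mapping, and all values are made up for illustration and do not come from any real system.

```python
from collections import defaultdict

# Hypothetical extracts from two operational sources that encode the same
# facts differently (country names, amount units); all values are made up.
source_a = [
    {"country": "USA", "amount_usd": 120.0},
    {"country": "CAN", "amount_usd": 80.0},
]
source_b = [
    {"country": "United States", "amount_cents": 5000},
    {"country": "Canada", "amount_cents": 2500},
]

# Assumed mapping used to reconcile inconsistent country representations.
COUNTRY_CODES = {"USA": "US", "United States": "US", "CAN": "CA", "Canada": "CA"}


def reconcile(row):
    """Return (country_code, amount_in_dollars) in one common format."""
    code = COUNTRY_CODES[row["country"]]
    if "amount_usd" in row:
        amount = row["amount_usd"]
    else:
        amount = row["amount_cents"] / 100
    return code, amount


def consolidate(*sources):
    """Aggregate reconciled rows from all sources into totals per country."""
    totals = defaultdict(float)
    for source in sources:
        for row in source:
            code, amount = reconcile(row)
            totals[code] += amount
    return dict(totals)


totals = consolidate(source_a, source_b)
print(totals)
```

Reconciliation happens before aggregation, so the warehouse holds one representation per fact regardless of which source supplied it.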
Data Warehousing and OLAP
A 2-D view of sales data for
AllElectronics
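Lattice of cuboids for the dimensions time, item, location, supplier:
– 0-D (apex) cuboid: all
– 1-D cuboids: time; item; location; supplier
– 2-D cuboids: (time, item), (time, location), (time, supplier), (item, location), (item, supplier), (location, supplier)
– 3-D cuboids: (time, item, location), (time, item, supplier), (time, location, supplier), (item, location, supplier)
– 4-D (base) cuboid: (time, item, location, supplier)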
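The lattice can be enumerated mechanically: each cuboid corresponds to one subset of the dimensions. The sketch below assumes no concept hierarchies within a dimension, so n dimensions give 2^n cuboids; the helper name `cuboid_lattice` is illustrative only.

```python
from itertools import combinations

dimensions = ("time", "item", "location", "supplier")


def cuboid_lattice(dims):
    """Map each level k to the list of k-D cuboids (subsets of dims of size k)."""
    return {k: list(combinations(dims, k)) for k in range(len(dims) + 1)}


lattice = cuboid_lattice(dimensions)
for k, cuboids in lattice.items():
    print(f"{k}-D cuboids: {cuboids}")

# Total number of cuboids across all levels of the lattice.
total = sum(len(cuboids) for cuboids in lattice.values())
print(total)
```

The empty tuple at level 0 plays the role of the apex cuboid "all", and the single tuple at level 4 is the base cuboid.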
Conceptual Modeling of
Data Warehouses
• Modeling data warehouses: dimensions & measures
– Star schema: A fact table in the middle connected to a
set of dimension tables
– Snowflake schema: A refinement of the star schema
in which some dimensional hierarchies are normalized into a
set of smaller dimension tables, forming a shape
similar to a snowflake
– Fact constellations: Multiple fact tables share
dimension tables; this can be viewed as a collection of stars
and is therefore called a galaxy schema or fact constellation
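As a rough illustration of the star schema, the sketch below builds one fact table and two dimension tables in an in-memory SQLite database and answers a query by joining the fact table to its dimensions. The table names `time_dim`, `item_dim`, and `sales_fact`, and every inserted value, are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
-- The fact table holds foreign keys to each dimension plus the measures.
CREATE TABLE sales_fact (
    time_key INTEGER REFERENCES time_dim(time_key),
    item_key INTEGER REFERENCES item_dim(item_key),
    units_sold INTEGER,
    dollars_sold REAL
);
INSERT INTO time_dim VALUES (1, 'Q1', 2024), (2, 'Q2', 2024);
INSERT INTO item_dim VALUES (10, 'TV', 'BrandA'), (11, 'PC', 'BrandB');
INSERT INTO sales_fact VALUES
    (1, 10, 3, 1500.0), (1, 11, 2, 2000.0), (2, 10, 5, 2500.0);
""")

# Dollars sold per item and quarter: one join per dimension used.
rows = conn.execute("""
    SELECT i.item_name, t.quarter, SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON t.time_key = f.time_key
    JOIN item_dim AS i ON i.item_key = f.item_key
    GROUP BY i.item_name, t.quarter
    ORDER BY i.item_name, t.quarter
""").fetchall()
print(rows)
```

Each dimension is a single table here, so any dimension attribute is one join away from the fact table.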
Star schema
• The most common modeling paradigm is the
star schema, in which the data warehouse
contains:
• (1) a large central table (fact table) containing
the bulk of the data, with no redundancy, and
• (2) a set of smaller dimension tables, one for each dimension
Example of Snowflake Schema
– time: time_key, day, day_of_the_week, month, quarter, year
– item: item_key, item_name, brand, type, supplier_key
– supplier: supplier_key, supplier_type
– branch: branch_key, branch_name, branch_type
– location: location_key, street, city_key
– city: city_key, city, province_or_street, country
– Sales Fact Table: time_key, item_key, branch_key, location_key
– Measures: units_sold, dollars_sold, avg_sales
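The effect of normalizing a dimension can be seen in a small sketch. It assumes a reduced version of the schema above, keeping only the location and city tables and a fact table with a single measure; the rows are invented for illustration. Because the country attribute lives in the city table, rolling sales up to country level needs two joins instead of one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (city_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
-- location is normalized: it refers to city instead of repeating its columns.
CREATE TABLE location (
    location_key INTEGER PRIMARY KEY,
    street TEXT,
    city_key INTEGER REFERENCES city(city_key)
);
CREATE TABLE sales_fact (
    location_key INTEGER REFERENCES location(location_key),
    units_sold INTEGER
);
INSERT INTO city VALUES (1, 'Vancouver', 'Canada'), (2, 'Chicago', 'USA');
INSERT INTO location VALUES
    (100, 'Main St', 1), (101, 'Oak Ave', 1), (102, 'Lake Rd', 2);
INSERT INTO sales_fact VALUES (100, 4), (101, 6), (102, 5);
""")

# Units sold per country: fact -> location -> city.
units_by_country = dict(conn.execute("""
    SELECT c.country, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN location AS l ON l.location_key = f.location_key
    JOIN city AS c ON c.city_key = l.city_key
    GROUP BY c.country
""").fetchall())
print(units_by_country)
```

The normalized form stores each city once, at the cost of the extra join in queries that need city-level attributes.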
Example of Fact Constellation
– time: time_key, day, day_of_the_week, month, quarter, year
– item: item_key, item_name, brand, type, supplier_type
– Sales Fact Table: time_key, item_key, branch_key
– Shipping Fact Table: time_key, item_key, shipper_key, from_location
View of Warehouses and Hierarchies
Multidimensional Data
[Figure: multidimensional data; labels: Office, Day, Month]
A Sample Data Cube
[Figure: data cube of sales]
– Date: 1Qtr, 2Qtr, 3Qtr, 4Qtr, sum
– Product: TV, PC, VCR, sum
– Country: U.S.A, Canada, Mexico, sum
– Highlighted cell: total annual sales of TV in U.S.A.
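A cube with "sum" margins like the one in the figure can be sketched as follows. The base cell values are invented, and `build_cube` is a hypothetical helper: for every base cell it adds the value to each aggregate cell obtained by replacing any subset of its coordinates with "sum".

```python
from collections import defaultdict
from itertools import product

# Made-up base cells: (product, quarter, country) -> sales.
base_cells = {
    ("TV", "1Qtr", "U.S.A"): 100,
    ("TV", "2Qtr", "U.S.A"): 150,
    ("TV", "1Qtr", "Canada"): 40,
    ("PC", "1Qtr", "U.S.A"): 200,
    ("VCR", "3Qtr", "Mexico"): 30,
}


def build_cube(cells):
    """Add every 'sum' aggregate: each coordinate is kept or replaced by 'sum'."""
    cube = defaultdict(int)
    for key, value in cells.items():
        # For each dimension choose either the concrete member or 'sum'.
        for masked in product(*[(member, "sum") for member in key]):
            cube[masked] += value
    return dict(cube)


cube = build_cube(base_cells)
print(cube[("TV", "sum", "U.S.A")])  # total annual sales of TV in U.S.A.
print(cube[("sum", "sum", "sum")])   # grand total (apex cell)
```

The cell `("TV", "sum", "U.S.A")` corresponds to the highlighted cell of the figure, and `("sum", "sum", "sum")` is the single cell of the apex cuboid.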
Cuboids Corresponding to the
Cube
– 0-D (apex) cuboid: all
– 1-D cuboids: product; date; country
– 2-D cuboids: (product, date), (product, country), (date, country)
– 3-D (base) cuboid: (product, date, country)
Typical OLAP Operations
Data Warehousing and OLAP
Data Warehouse Design Process
Multi-Tiered Architecture
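[Figure: multi-tiered data warehouse architecture]
– Data sources: operational DBs, other sources
– Extract, Transform, Load, Refresh, supported by the Monitor & Integrator and Metadata
– Data storage: data warehouse, data marts
– OLAP server: serves the front-end tools
– Front-end tools: analysis, query, reports, data mining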
• Data Mart
– a subset of corporate-wide data that is of value to a
specific group of users. Its scope is confined to
specific, selected groups, such as a marketing data mart
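One simple way to picture a data mart is as a filtered subset of warehouse data exposed to one user group. The sketch below models that with a SQLite view; the `warehouse_facts` table, its columns, and its rows are assumptions for illustration, and a real data mart is often a separately stored and loaded database, not a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL);
INSERT INTO warehouse_facts VALUES
    ('marketing', 'campaign_cost', 1200.0),
    ('marketing', 'leads', 340.0),
    ('finance', 'invoices', 75.0);
-- The mart exposes only the rows relevant to one user group.
CREATE VIEW marketing_mart AS
    SELECT metric, value FROM warehouse_facts WHERE department = 'marketing';
""")

mart_rows = conn.execute(
    "SELECT metric, value FROM marketing_mart ORDER BY metric"
).fetchall()
print(mart_rows)
```

Users of `marketing_mart` see only marketing rows, which mirrors the confined scope described in the bullet.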
Data Warehouse Development:
A Recommended Approach
– Multi-Tier Data Warehouse
– Distributed Data Marts
Question & Answer Session
Q&A