An Overview of Data Warehousing and OLAP Technology What Is Decision Support?
Presenter: Pooyan Fazli Discussion by Nguyet Department of Computer Science University of British Columbia
Decision support systems usually require consolidating data from many heterogeneous sources, which may include external sources such as stock market feeds.
Users: clerk, IT professional
Function: day-to-day operations
DB design: application-oriented
Data: current, up-to-date, detailed
Usage: repetitive
Access: read/write
Unit of work: short, simple transaction
# records accessed: tens
# users: thousands
DB size: 100 MB-GB
Metric: transaction throughput
Data cleaning
detect errors in the data and rectify them when possible
Data transformation
Load
sort, summarize, consolidate, compute views, check integrity, and build indices and partitions
Refresh
propagate the updates from the data sources to the warehouse
[Figure: data warehouse architecture. Data Sources (Operational Databases, External Sources) -> Extract, Transform, Load, Refresh -> Data Storage (Data Warehouse, Data Marts) -> Serve]
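The cleaning, load, and refresh steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`id`, `country`, `amount`), the in-memory `warehouse` dictionary, and the function names are all hypothetical and do not come from the slides.

```python
from typing import Optional


def clean(record: dict) -> Optional[dict]:
    """Data cleaning: rectify what can be fixed, reject what cannot."""
    country = str(record.get("country", "")).strip().title()
    try:
        amount = float(record["amount"])
    except (KeyError, TypeError, ValueError):
        return None  # unrecoverable error: drop the record
    if not country or amount < 0:
        return None
    return {"id": record["id"], "country": country, "amount": amount}


def summarize(warehouse: dict) -> None:
    """Recompute the per-country summary from the detailed rows."""
    totals = {}
    for rec in warehouse["rows"].values():
        totals[rec["country"]] = totals.get(rec["country"], 0.0) + rec["amount"]
    warehouse["sales_by_country"] = totals


def load(records: list) -> dict:
    """Load: sort the cleaned records, store them, and build the summary."""
    warehouse = {"rows": {}, "sales_by_country": {}}
    for rec in sorted(records, key=lambda r: r["id"]):
        warehouse["rows"][rec["id"]] = rec
    summarize(warehouse)
    return warehouse


def refresh(warehouse: dict, updates: list) -> None:
    """Refresh: propagate updates from the sources to the warehouse."""
    for raw in updates:
        rec = clean(raw)
        if rec is not None:
            warehouse["rows"][rec["id"]] = rec
    summarize(warehouse)


# Hypothetical source records; the third one has an unusable amount.
source = [
    {"id": 2, "country": " canada ", "amount": "10.5"},
    {"id": 1, "country": "Canada", "amount": 4.5},
    {"id": 3, "country": "mexico", "amount": "n/a"},
]
cleaned = [r for r in (clean(x) for x in source) if r is not None]
warehouse = load(cleaned)
refresh(warehouse, [{"id": 3, "country": "mexico", "amount": "7"}])
```

For simplicity the sketch recomputes the whole summary on every refresh; a real warehouse would normally update summaries incrementally.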
The dimensions together are assumed to uniquely determine the measure. Each dimension is described by a set of attributes. The attributes of a dimension may be related via a hierarchy of relationships.
[Figure: data cube with dimensions Product, Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), and Country]
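To make the multidimensional model concrete, the sketch below stores a tiny cube as a Python dictionary: the tuple of dimension values is the key, so the dimensions uniquely determine the measure. The sample values, the `roll_up` helper, and the quarter-to-half-year hierarchy (`HALF_OF`) are assumptions made for illustration only.

```python
from collections import defaultdict

DIMS = ("product", "quarter", "country")

# (product, quarter, country) -> sales measure; the key is unique per cell.
cube = {
    ("TV", "1Qtr", "Canada"): 100,
    ("TV", "2Qtr", "Canada"): 120,
    ("TV", "1Qtr", "Mexico"): 80,
    ("PC", "1Qtr", "Canada"): 200,
}


def roll_up(cells: dict, keep: list) -> dict:
    """Sum the measure over every dimension that is not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, measure in cells.items():
        out[tuple(key[i] for i in idx)] += measure
    return dict(out)


# A hypothetical attribute hierarchy on the Date dimension: quarter -> half-year.
HALF_OF = {"1Qtr": "H1", "2Qtr": "H1", "3Qtr": "H2", "4Qtr": "H2"}


def climb_quarter_to_half(cells: dict) -> dict:
    """Move one level up the Date hierarchy, aggregating quarters into halves."""
    out = defaultdict(int)
    for (product, quarter, country), measure in cells.items():
        out[(product, HALF_OF[quarter], country)] += measure
    return dict(out)


by_product = roll_up(cube, ["product"])
grand_total = roll_up(cube, [])
by_half = climb_quarter_to_half(cube)
```

Rolling up over all dimensions (`keep=[]`) yields the single `sum` cell that the cube figure shows along each axis.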
Star Schema
Sales Fact Table: Time_key, Item_key
item: I_key, I_name, I_brand, I_type, I_supplier_type
Snowflake Schema
A refinement of the star schema in which a dimension hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake
Star Schema
Branch: B_key, B_name, B_type
location: location_key, street, city, province, country
Time: T_key, T_day, T_day_week, T_month, T_quarter, T_year
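A star schema of this kind can be tried out with Python's built-in `sqlite3` module. The sketch below is a minimal illustration, not the schema from the slides: it keeps only a few of the listed columns, names the time dimension `time_dim`, and adds a hypothetical measure column `units_sold`, since the slides do not show the fact table's measures.

```python
import sqlite3

star = sqlite3.connect(":memory:")
star.executescript("""
CREATE TABLE time_dim (T_key INTEGER PRIMARY KEY, T_quarter TEXT, T_year INTEGER);
CREATE TABLE item (I_key INTEGER PRIMARY KEY, I_name TEXT, I_brand TEXT);
-- Fact table: one foreign key per dimension plus the measure.
CREATE TABLE sales (
    Time_key INTEGER REFERENCES time_dim(T_key),
    Item_key INTEGER REFERENCES item(I_key),
    units_sold INTEGER  -- hypothetical measure, not named in the slides
);
INSERT INTO time_dim VALUES (1, 'Q1', 2020), (2, 'Q2', 2020);
INSERT INTO item VALUES (10, 'Laptop', 'Acme'), (11, 'Phone', 'Acme');
INSERT INTO sales VALUES (1, 10, 5), (1, 11, 3), (2, 10, 7);
""")

# A typical star join: the fact table joined directly to each dimension table.
star_rows = star.execute("""
SELECT t.T_quarter, i.I_name, SUM(s.units_sold)
FROM sales AS s
JOIN time_dim AS t ON s.Time_key = t.T_key
JOIN item AS i ON s.Item_key = i.I_key
GROUP BY t.T_quarter, i.I_name
ORDER BY t.T_quarter, i.I_name
""").fetchall()
```

Every dimension is one join away from the fact table, which is what gives the schema its star shape.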
identify the views to materialize
exploit the materialized views to answer queries
efficiently update the materialized views during load and refresh
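The three tasks above can be illustrated with a toy materialized view that stores total sales per product. This is a hedged sketch only: the class `MaterializedSum`, its methods, and the `(product, amount)` row shape are hypothetical, and it handles insertions but not deletions or updates.

```python
class MaterializedSum:
    """A precomputed 'total amount per product' summary kept alongside the base data."""

    def __init__(self, base_rows):
        # Materialize: compute the summary once from the base rows.
        self.totals = {}
        for row in base_rows:
            self.apply_insert(row)

    def apply_insert(self, row):
        # Incremental maintenance: adjust only the affected group
        # instead of recomputing the whole view during refresh.
        product, amount = row
        self.totals[product] = self.totals.get(product, 0) + amount

    def query(self, product):
        # Answer the query from the view without scanning the base rows.
        return self.totals.get(product, 0)


view = MaterializedSum([("TV", 100), ("PC", 200), ("TV", 50)])
before = view.query("TV")
view.apply_insert(("TV", 25))  # a new base row arrives during refresh
after = view.query("TV")
```

Sums are easy to maintain incrementally; aggregates such as a minimum or a median generally need more bookkeeping when rows are deleted.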
Snowflake Schema
Branch: B_key, B_name, B_type
Location: location_key, street, city
City: C_key, C_city, C_province, C_country
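The effect of normalizing the location hierarchy can be seen in a small `sqlite3` sketch. It assumes that the `city` attribute of Location is a foreign key into the City table; the column name `city_key` and the sample rows are hypothetical.

```python
import sqlite3

snow = sqlite3.connect(":memory:")
snow.executescript("""
CREATE TABLE city (
    C_key INTEGER PRIMARY KEY, C_city TEXT, C_province TEXT, C_country TEXT
);
-- The location table no longer stores province and country itself.
CREATE TABLE location (
    location_key INTEGER PRIMARY KEY,
    street TEXT,
    city_key INTEGER REFERENCES city(C_key)  -- assumed foreign key
);
INSERT INTO city VALUES (1, 'Vancouver', 'British Columbia', 'Canada');
INSERT INTO location VALUES (100, 'Main Mall', 1), (101, 'West Mall', 1);
""")

# Recovering the country now costs one extra join compared with the star schema.
locations = snow.execute("""
SELECT l.location_key, l.street, c.C_city, c.C_country
FROM location AS l
JOIN city AS c ON l.city_key = c.C_key
ORDER BY l.location_key
""").fetchall()
```

The city, province, and country values are stored once rather than once per street, at the price of an additional join at query time.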
Metadata Requirements
Administrative metadata
Source databases and their contents
Back-end and front-end tools
Definitions of the warehouse schema
Pre-defined queries and reports
Data mart locations and contents
Data refresh and purging policies
User profiles and user access control policies
Business metadata
Business terms and definitions
Ownership of data
Charging policies
Operational metadata
Data lineage: history of migrated data and sequence of transformations applied
Currency of data: active, archived, purged
Monitoring information: warehouse usage statistics, error reports, audit trails