Top Tier Front-End Processing-: Star Schema Design OLAP Implementation

The document describes the architecture of a three-tier data warehouse. It consists of three tiers - a top tier for front-end processing using OLAP tools, a middle tier consisting of an OLAP server, and a bottom tier data warehouse server housing the data. The data is extracted from various source databases and loaded into dimensional models consisting of fact and dimension tables in a star schema design for analysis.

Uploaded by

puneetha89

Architecture of Three Tier Data Warehouse

[Figure: three-tier data warehouse architecture]
Top tier (front-end processing): users submit SQL queries against relational views with OLAP, or OLAP commands.
Middle tier (OLAP server): OLAP implementation (ROLAP, MOLAP, or HOLAP) over data storage in a star schema design, with a fact table joined to dimension tables 1 to n.
Bottom tier (data warehouse server): data extraction from source databases 1 to m.
2008/2/4

Data Warehouse for Decision Support

A database is a collection of data organized by a database management system.
A data warehouse is a read-only analytical database used to operate a decision support system.
A data warehouse for decision support often takes its source data from various platforms, databases, and files.
The use of advanced tools and specialized technologies may be necessary in the development of decision support systems, which affects tasks, deliverables, training, and project timelines.
2008/1/29

Data Warehouse for end users

A data warehouse is made user-friendly by the analyst for end users, even those who are not familiar with the database structure.
A data warehouse is a collection of integrated, denormalized databases designed for fast response.
In general, data warehouse storage is planned for at least 5 years of long-term capacity growth.

Phases of the Decision Support Life Cycle

1. Planning
2. Gathering Data Requirements and Modeling
3. Physical Database Design and Development
4. Data Mapping and Transformation
5. Data Extraction and Load
6. Automating the Data Management Process
7. Application Development: Creating the Starter Set of Reports
8. Data Validation and Testing
9. Training
10. Rollout

Phase 1: Planning

Planning for a data warehouse is concerned with:

Defining the project scope
Creating the project plan
Defining the necessary resources, both internal and
external
Defining the tasks and deliverables
Defining timelines
Defining the final project deliverables

Capacity Planning
Calculate the record size for each table.
Estimate the number of initial records for each table.
Review the data warehouse access requirements to predict index requirements.
Determine the growth factor for each table.
Identify the largest target table expected over the selected period of time and add approximately 25-30% overhead to the table size to determine temporary storage size.
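The capacity planning checklist can be turned into a rough sizing calculation. The following is a minimal sketch under stated assumptions, not a sizing standard: the names `TableEstimate` and `temp_storage_bytes`, the sample table figures, compound yearly growth, and the 30% overhead (the upper end of the 25-30% range) are all illustrative. Index space is not modeled.

```python
from dataclasses import dataclass


@dataclass
class TableEstimate:
    name: str
    record_bytes: int      # calculated record size
    initial_records: int   # estimated initial row count
    yearly_growth: float   # assumed growth factor per year, e.g. 0.20 for 20%

    def size_after(self, years: int) -> float:
        """Projected table size in bytes, assuming compound yearly growth."""
        rows = self.initial_records * (1 + self.yearly_growth) ** years
        return rows * self.record_bytes


def temp_storage_bytes(tables, years, overhead=0.30):
    """Temporary storage = largest projected table plus overhead."""
    largest_size = max(t.size_after(years) for t in tables)
    return largest_size * (1 + overhead)


# Hypothetical tables, for illustration only.
tables = [
    TableEstimate("sales_fact", 48, 1_000_000, 0.20),
    TableEstimate("product_dim", 200, 10_000, 0.05),
]
largest = max(tables, key=lambda t: t.size_after(5))
temp = temp_storage_bytes(tables, years=5)
print(largest.name, round(temp / 1e6, 1), "MB")
```

With these sample figures the fact table dominates, which matches the usual expectation that dimension tables contribute little to the total.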

Phase 2: Gathering data requirements and Modeling

Gathering Data Requirements:
How does the user do business?
How is the user's performance measured?
What attributes does the user need?
What are the business hierarchies?
What data do users use now and what would they like to have?
What levels of detail or summary do the users need?

Data Modeling
A logical data model covering the scope of the
development project including relationships,
cardinality, attributes, and candidate keys.
or
A Dimensional Business Model that diagrams the
facts, dimensions, hierarchies, relationships and
candidate keys for the scope of the development
project

Phase 3: Physical Database Design and Development

Designing the database, including fact tables, relationship tables, and description (lookup) tables.
Denormalizing the data.
Identifying keys.
Creating indexing strategies.
Creating appropriate database objects.

Phase 4: Data Mapping and Transformation

Defining the source systems.
Determining file layouts.
Developing written transformation specifications for sophisticated transformations.
Mapping source to target data.
Reviewing capacity plans.

Phase 5: Populating the Data Warehouse

Developing procedures to extract and move the data.
Developing procedures to load the data into the warehouse.
Developing programs or using data transformation tools to transform and integrate data.
Testing extract, transformation, and load procedures.
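The extract, transform, and load procedures can be sketched in a few lines. This is a minimal illustration, not a production design: the source records, the `sales` table, and the function names `extract`, `transform`, and `load` are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source rows, standing in for an extract from an operational system.
source_rows = [
    {"sku": " a-100 ", "sold_on": "2001-04-03", "qty": "3", "price": "9.50"},
    {"sku": "B-200", "sold_on": "2001-04-03", "qty": "1", "price": "20.00"},
    {"sku": "a-100", "sold_on": "2001-04-04", "qty": "x", "price": "9.50"},  # bad quantity
]


def extract():
    """Extract: yield raw records from the source."""
    yield from source_rows


def transform(row):
    """Transform: clean keys and convert types; return None for rejected rows."""
    try:
        qty = int(row["qty"])
        price = float(row["price"])
    except ValueError:
        return None
    return (row["sku"].strip().upper(), row["sold_on"], qty, qty * price)


def load(conn, rows):
    """Load: insert transformed rows into the warehouse table."""
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sku TEXT, sold_on TEXT, units INTEGER, dollars REAL)")
transformed = [transform(r) for r in extract()]
good = [r for r in transformed if r is not None]
rejected = len(transformed) - len(good)
load(conn, good)
loaded = conn.execute("SELECT COUNT(*), SUM(dollars) FROM sales").fetchone()
print(loaded, rejected)
```

Counting loaded and rejected rows after each run is one simple way to test the procedures, as the last item of this phase asks.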

Phase 6: Automating Data Management Procedures

Automating and scheduling the data load process.
Creating backup and recovery procedures.
Conducting a full test of all of the automated procedures.

Phase 7: Application Development: Creating the Starter Set of Reports

Creating the starter set of predefined reports.
Developing core reports.
Testing reports.
Documenting applications.
Developing navigation paths.

Phase 8: Data Validation and Testing

Validating data using the starter set of reports.
Validating data using standard processes.
Iteratively changing the data.

Phase 9: Training
To gain real business value from your warehouse
development, users of all levels will need to be
trained in:
The scope of the data in the warehouse.
The front end access tool and how it works.
The DSS application or starter set of reports - the
capabilities and navigation paths.
Ongoing training/user assistance as the system
evolves

Phase 10: Rollout

Installing the physical infrastructures for all users.
Developing the DSS application.
Creating procedures for adding new reports and
expanding the DSS application.
Setting up procedures to back up the DSS application, not just the data warehouse.
Creating procedures for investigating and resolving data integrity related issues.

Star Schema Database Design

The goals of a decision support database are often
achieved by a database design called a star schema.
A star schema design is a simple structure with
relatively few tables and well-defined join paths.
This database design, in contrast to the normalized
structure used for transaction-processing databases,
provides fast query response time and a simple
schema that is readily understood by the analysts
and end users.

Understanding Star Schema Design - Facts and Dimensions

A star schema contains two types of tables, fact tables and
dimension tables. Fact tables contain the quantitative or
factual data about a business - the information being
queried. This information is often numerical measurements
and can consist of many columns and millions of rows.
Dimension tables are smaller and hold descriptive data that
reflect the dimensions of a business. SQL queries then use
predefined and user-defined join paths between fact and
dimension tables to return selected information.
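A query over a star schema typically joins the fact table to its dimension tables, filters and groups on descriptive columns, and aggregates the measures. The sketch below illustrates this with Python's built-in `sqlite3` module; the table names, columns, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE period (period_id INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE sales (period_id INTEGER REFERENCES period(period_id),
                    product_id INTEGER REFERENCES product(product_id),
                    units INTEGER, dollars REAL);
INSERT INTO period VALUES (1, 'Q1', 2001), (2, 'Q2', 2001);
INSERT INTO product VALUES (10, 'Acme'), (20, 'Zenith');
INSERT INTO sales VALUES (1, 10, 5, 50.0), (1, 20, 2, 40.0), (2, 10, 7, 70.0);
""")

# Join the fact table to its dimensions, group on descriptive
# columns, and aggregate the numeric measures.
query = """
SELECT p.brand, t.quarter, SUM(s.units), SUM(s.dollars)
FROM sales s
JOIN product p ON p.product_id = s.product_id
JOIN period t ON t.period_id = s.period_id
WHERE t.year = 2001
GROUP BY p.brand, t.quarter
ORDER BY p.brand, t.quarter
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```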

Identifying Facts and Dimensions

Look for the elemental transactions within the business process. This identifies entities that are candidates to be fact tables.
Determine the key dimensions that apply to each fact. This identifies entities that are candidates to be dimension tables.
Check that a candidate fact is not actually a dimension with embedded facts.
Check that a candidate dimension is not actually a fact table within the context of the decision support requirement.

Step 1 Look for the elemental transactions within the business process

The first step in identifying fact tables is to examine the business and identify the transactions that may be of interest. They tend to be transactions that describe events fundamental to the business.

Step 2 Determine the key dimensions that apply to each fact

The next step is to identify the main dimensions for each candidate fact table. This can be achieved by looking at the logical model and finding out which entities are associated with the entity representing the fact table. The challenge here is to focus on the key dimension entities.

Step 3 Check that a candidate fact is not actually a dimension table with denormalized facts

Look for denormalized dimensions within candidate fact tables. It may be the case that the candidate fact table is a dimension containing repeating groups of factual attributes.

Step 4 Check that a candidate dimension is not a fact table

If the business requirement is geared toward analysis of the entity that is currently a candidate dimension, it is probably more appropriate to make it a fact table.

Simple Star Schemas

Each table must have a primary key, which is a
column or group of columns whose contents
uniquely identify each row. In a simple star schema,
the primary key for the fact table is composed of
one or more foreign keys. When a database is
created, the SQL statements used to create the
tables will designate the columns that are to form
the primary and foreign keys.


A sales database with a simple star schema

Sales Table (fact table): Period_Id, Product_Id, Market_Id, Units, Dollars, Discount%
Period Table (dimension table): Period_Id, Period_Desc, Quarter, Year
Product Table (dimension table): Product_Id, Prod_Desc, Brand, Size
Market Table (dimension table): Market_Id, Market_Desc, District, Region
Multiple Fact Tables

A star schema can contain multiple fact tables. Multiple fact tables exist because they contain unrelated facts or because the periodicity of their load times differs. In other cases, multiple fact tables exist because they improve performance. Creating different tables for different levels of aggregation is a common design technique for a data warehouse database, so that any single request runs against a table of reasonable size.
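The aggregation technique can be illustrated by deriving a second, smaller fact table from a detailed one. This is a minimal sketch using Python's built-in `sqlite3` module; the table and column names and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, group_id INTEGER);
CREATE TABLE sales (period_id INTEGER, product_id INTEGER, units INTEGER, dollars REAL);
INSERT INTO product VALUES (1, 100), (2, 100), (3, 200);
INSERT INTO sales VALUES (1, 1, 4, 40.0), (1, 2, 6, 60.0), (1, 3, 1, 15.0), (2, 1, 2, 20.0);

-- Second fact table: the same measures summarized at product-group level.
CREATE TABLE product_group_sales AS
SELECT s.period_id, p.group_id, SUM(s.units) AS units, SUM(s.dollars) AS dollars
FROM sales s JOIN product p ON p.product_id = s.product_id
GROUP BY s.period_id, p.group_id;
""")

base_rows = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
summary = conn.execute(
    "SELECT period_id, group_id, units, dollars FROM product_group_sales "
    "ORDER BY period_id, group_id"
).fetchall()
print(base_rows, summary)
```

Group-level requests can then read the summary table instead of scanning and re-aggregating the detailed rows.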

Sales Table (fact table): Period_Id, Product_Id, Market_Id, Units, Dollars, Discount%
Product_Group Table (fact table): Period_Id, Group_Id
Period Table (dimension table): Period_Id, Period_Desc, Quarter, Year
Product Table (dimension table): Product_Id, Prod_Desc, Brand, Size
Market Table (dimension table): Market_Id, Market_Desc, District, Region
Group Table: Group_Id, Group_Desc

Outboard Tables
Dimension tables can also contain a foreign
key that references the primary key in
another dimension table. The referenced
dimension tables are sometimes referred to
as outboard, outrigger, or secondary
dimension tables.


Sales Table (fact table): Period_Id, Product_Id, Market_Id, Units, Dollars, Discount%
Period Table (dimension table): Period_Id, Period_Desc, Quarter, Year
Product Table (dimension table): Product_Id, Prod_Desc, Brand, Size
Market Table (dimension table): Market_Id, Market_Desc, District, Region
District Table: District_Id, District_Desc
Region Table: Region_Id, Region_Desc

Multi-Star Schema
In some applications the concatenated foreign keys might not provide a unique identifier for each row in the fact table. These applications require a multi-star schema.
In a multi-star schema, the fact table has both a set of foreign keys, which reference dimension tables, and a primary key, which is composed of one or more columns that provide a unique identifier for each row.
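A short sketch can show why a separate primary key is needed. It uses Python's built-in `sqlite3` module, and the table and column names are hypothetical, loosely modeled on a retail receipt. Two fact rows share the same store, item, and date, so the foreign keys alone cannot identify a row, while the receipt number and line item can.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store (store_id INTEGER PRIMARY KEY, store_name TEXT);
CREATE TABLE sku   (sku_id   INTEGER PRIMARY KEY, item TEXT);
CREATE TABLE sales_transaction (
    store_id INTEGER REFERENCES store(store_id),
    sku_id   INTEGER REFERENCES sku(sku_id),
    sale_date TEXT,
    receipt_nbr INTEGER,
    receipt_line_item INTEGER,
    units INTEGER, amount REAL,
    PRIMARY KEY (receipt_nbr, receipt_line_item)  -- separate from the foreign keys
);
INSERT INTO store VALUES (1, 'Downtown');
INSERT INTO sku   VALUES (7, 'Olive oil');
""")

# Same store, item, and date on two different receipts: both rows are valid.
rows = [
    (1, 7, "2001-04-03", 5001, 1, 1, 8.0),
    (1, 7, "2001-04-03", 5002, 1, 2, 16.0),
]
conn.executemany("INSERT INTO sales_transaction VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
count = conn.execute("SELECT COUNT(*) FROM sales_transaction").fetchone()[0]

# Repeating an existing receipt number and line item is rejected.
try:
    conn.execute("INSERT INTO sales_transaction VALUES (1, 7, '2001-04-03', 5001, 1, 1, 8.0)")
    repeat_accepted = True
except sqlite3.IntegrityError:
    repeat_accepted = False
print(count, repeat_accepted)
```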

Retail sales database designed as a multi-star schema with two secondary dimension tables

Transaction Table: Store_Id, SKU_Id, Date, Receipt_Nbr, Receipt_Line_Item, Units, Price, Amount
Store Table: Store_Id, Store_Name, Region, Manager
SKU Table: SKU_Id, Class_Id, Dept_Id, Item
Class Table: Class_Id, Class_Desc
Dept Table: Dept_Id, Dept_Desc

Snowflake Schema
A snowflake schema is a star schema that stores all dimensional information in third normal form, while keeping the fact table structures the same.

Example of Snowflake Schema

Sales Fact Table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
time: time_key, day, day_of_the_week, month, quarter, year
item: item_key, item_name, brand, type, supplier_key
supplier: supplier_key, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city_key
city: city_key, city, province_or_state, country
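In a snowflake schema, a query that groups by an attribute held in a normalized table must follow the chain of keys from the fact table outward. The sketch below shows this for a reduced version of the sales, location, and city tables, using Python's built-in `sqlite3` module; only a few columns are kept and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (city_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE location (location_key INTEGER PRIMARY KEY, street TEXT,
                       city_key INTEGER REFERENCES city(city_key));
CREATE TABLE sales_fact (location_key INTEGER REFERENCES location(location_key),
                         units_sold INTEGER, dollars_sold REAL);
INSERT INTO city VALUES (1, 'Vancouver', 'Canada'), (2, 'Chicago', 'USA');
INSERT INTO location VALUES (10, 'Main St', 1), (11, 'Oak Ave', 1), (12, 'Lake Dr', 2);
INSERT INTO sales_fact VALUES (10, 3, 30.0), (11, 2, 20.0), (12, 5, 55.0);
""")

# Two joins are needed to reach the city name: fact -> location -> city.
by_city = conn.execute("""
SELECT c.city, SUM(f.units_sold), SUM(f.dollars_sold)
FROM sales_fact f
JOIN location l ON l.location_key = f.location_key
JOIN city c ON c.city_key = l.city_key
GROUP BY c.city
ORDER BY c.city
""").fetchall()
print(by_city)
```

In a plain star schema the city attributes would sit directly in the location dimension, so the same question would need one join fewer.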

Data Warehouse architectures

[Figure: several sources feed Data Transformation & Integration, which loads the Data Warehouse; several users access the Data Warehouse.]

Case study of building a data warehouse

Step 1 Planning


Capacity planning

Given:
Time dimension: 2 years x 365 days
Product dimension: average of 5 products per transaction
Promotion dimension: 1 promotion type per transaction
Store dimension: 10 local country stores
Customer dimension: 1 customer per transaction
Number of sales transactions: 200 per day for major customers

As a result, the number of base fact records = 2 x 365 x 5 x 1 x 10 x 1 x 200 = 7.3 million records.
Assume the number of key fields = 5 and the number of fact fields = 7, which implies total fields = 12.
Thus, the base fact table size = 7.3 million x 12 x 4 bytes per field = 350 MB (the size of the dimension tables is negligible).
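The arithmetic can be checked with a few lines of Python. This sketch assumes that the 200 transactions per day apply to each of the 10 stores, which is the reading that yields 7.3 million records; the variable names are illustrative.

```python
years, days_per_year = 2, 365
products_per_transaction = 5
promotions_per_transaction = 1
stores = 10                      # assumed to multiply the daily transaction count
customers_per_transaction = 1
transactions_per_day = 200

base_fact_records = (
    years * days_per_year
    * products_per_transaction
    * promotions_per_transaction
    * stores
    * customers_per_transaction
    * transactions_per_day
)

key_fields, fact_fields, bytes_per_field = 5, 7, 4
table_bytes = base_fact_records * (key_fields + fact_fields) * bytes_per_field

print(base_fact_records)   # 7300000
print(table_bytes / 1e6)   # 350.4 (MB)
```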

Step 2 Data Requirements and Modeling

[Figure: dimensional business model. Facts: Store Sales. Dimension and hierarchy labels: Time, Deal, Product, Distribution Center, Store, Promotion, Customer, Brand, Company.]

Step 3 Physical database design and development

Example: Design a Simple Star Schema from a relational schema

Identify measurable fields in a Fact table.

Identify selection criteria of the measurement as
keys in a Fact table.
Construct the dimension tables derived from the
keys in the Fact table.
Validate the Simple Star Schema as SR1 type
relation.

Example
Given
Relation A (a1, a2, a3)
Relation B (b1, b2, b3)
Relation C (*a1, *b1, m1, m2)
Derived Simple Star Schema
FACT TABLE: a1, b1, m1, m2
DIMENSION TABLE A: a1, a2, a3
DIMENSION TABLE B: b1, b2, b3
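The derivation in this example can be sketched in Python. This is an illustration under simplifying assumptions, not a general algorithm: the function name `derive_star` is hypothetical, a leading "*" marks a foreign key as in the relation notation of the example, the first column of each relation is assumed to be its primary key, and only one relation carries foreign keys.

```python
# Relations from the example; a leading "*" marks a foreign key.
relations = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3"],
    "C": ["*a1", "*b1", "m1", "m2"],
}


def derive_star(relations):
    """Treat the relation with foreign keys as the fact table and the
    relations those keys point to as dimension tables."""
    star = {"fact": None, "dimensions": {}}
    for name, cols in relations.items():
        keys = [c[1:] for c in cols if c.startswith("*")]
        if keys:
            measures = [c for c in cols if not c.startswith("*")]
            star["fact"] = {"name": name, "keys": keys, "measures": measures}
    for name, cols in relations.items():
        # Assumption: the first column of a relation is its primary key.
        if star["fact"] and cols[0] in star["fact"]["keys"]:
            star["dimensions"][name] = cols
    return star


star = derive_star(relations)
print(star["fact"])
print(star["dimensions"])
```

The measurable fields m1 and m2 end up in the fact table, and the keys a1 and b1 lead to dimension tables A and B, matching the derived schema.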

Step 4 Map Corporate model into a data warehouse

Data Mapping and Transformation


Step 5 Data Extraction and Load

Technical infrastructure should be in place to assist with these middle phases of data mapping, transformation, extraction, and loading, including:
1. Database administration expertise
2. Data transformation tool training / expertise
3. Update / refresh strategies
4. Load strategies
5. Operations / job scheduling
6. Quality assurance procedures
7. Capacity planning expertise

Step 6 Automating Data Management Process

A data warehouse has very bimodal usage.

Most data warehouses are online 16 to 22
hours per day in a read-only mode. The data
warehouse goes off-line for 2 to 8 hours in
the wee hours of the morning for data
loading, data indexing, data quality
assurance, and data release.

Step 7 Application Development: Creating the Starter Set of Reports

Reports for Executive Information Systems answer questions such as:

Is it worthwhile to stock so many individual sizes of certain products?
Which items are cannibalized when I promote a particular product like Absolut Vodka?
What are the top 10 items my competitors are selling that I don't sell at all?
Which season sold the most Cognac last year?
Which product item was the most profitable in 2001 in Macau?
Which customer or outlet bought the most in terms of case sales in 2001?
What is the total gross profit in April this year?
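One of these questions, the most profitable item in Macau in 2001, can be expressed as a star schema query. The sketch below uses Python's built-in `sqlite3` module. The schema, the definition of gross profit as revenue minus cost, and the sample rows are assumptions made for illustration; they are not the case study's actual design or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, item TEXT);
CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE sales (product_id INTEGER, store_id INTEGER, year INTEGER,
                    revenue REAL, cost REAL);
INSERT INTO product VALUES (1, 'Cognac'), (2, 'Vodka');
INSERT INTO store VALUES (1, 'Macau'), (2, 'Hong Kong');
INSERT INTO sales VALUES (1, 1, 2001, 900.0, 600.0), (2, 1, 2001, 500.0, 350.0),
                         (2, 2, 2001, 800.0, 300.0), (1, 1, 2000, 700.0, 300.0);
""")

# Filter on dimension attributes, aggregate profit per item, keep the top row.
top = conn.execute("""
SELECT p.item, SUM(s.revenue - s.cost) AS gross_profit
FROM sales s
JOIN product p ON p.product_id = s.product_id
JOIN store st ON st.store_id = s.store_id
WHERE s.year = 2001 AND st.city = 'Macau'
GROUP BY p.item
ORDER BY gross_profit DESC
LIMIT 1
""").fetchone()
print(top)
```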

Reading assignment
Data Mining: Concepts and Techniques, by
Jiawei Han and Micheline Kamber, Morgan
Kaufmann Publishers, 2nd edition, 2007,
Chapter 3 Data Warehouse and OLAP
Technology, pp.105-134


Lecture review question 4

Compare a database with a data warehouse in terms of performance, user friendliness, capacity planning, and data manipulation language operations.

Tutorial Question 4
You are to design a data warehouse to track the sales of salad dressing products in supermarkets at weekly intervals over a four-year period. It is a typical consumer-goods marketing database. The salad dressing product category contains 14000 items at the universal product code (UPC) level. Data are summarized for each of 120 geographic areas (markets) in the United States, and are also summarized for each of 208 weekly time periods spanning four years. The tables are as follows:
Product Table (Product_id, Prod_Desc, Brand, Manufacturer, Pack, Class, Flavor, Size)
Sales Table (*Period_id, *Product_id, *Market_id, Units, Dollars, Discount, Selling_Price,
Large_Ads, Medium_Ads, Small_Ads)
Period Table (Period_id, Period_Desc, Quarter, Fiscal_Year, Calendar_Year, Agg_Level)
Market Table (Market_id, Market_Desc, District, Region)

Show a simple star schema design for the application.
