Chapter Nine

A star schema organizes data into a central fact table linked to multiple dimension tables. A snowflake schema expands on this by normalizing dimension tables into multiple tables linked through foreign keys. This reduces data redundancy but increases the number of tables and joins needed for queries. Both schemas improve query performance over traditional OLTP schemas by separating facts and dimensions for faster retrieval and updates.

Uploaded by

ambroseoryem1

Brian Kathabasya BIT/pgdIT

What is Star Schema?

A star schema is the simplest form of a dimensional model, in which data are organized into facts and dimensions. A fact is an event that is counted or measured, such as a sale or a login. A dimension holds reference data about the fact, such as date, item, or customer.

A star schema is a relational schema whose design represents a multidimensional data model. It is the most explicit data warehouse schema. It is known as a star schema because its entity-relationship diagram resembles a star, with points diverging from a central table. The center of the schema consists of a large fact table, and the points of the star are the dimension tables.
Fact Tables

A fact table is a table in a star schema that contains facts and is connected to dimensions. It has two types of columns: those that hold facts and those that are foreign keys to the dimension tables. The primary key of a fact table is generally a composite key made up of all of its foreign keys.

A fact table might contain either detail-level facts or facts that have been aggregated (fact tables that contain aggregated facts are often called summary tables instead). A fact table generally contains facts at the same level of aggregation.
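The two kinds of fact table described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names sales_fact and sales_by_item, the column names, and the sample rows are assumptions, not part of the original notes.

```python
import sqlite3


def build_fact_tables() -> sqlite3.Connection:
    """Create a detail-level fact table and a summary table derived from it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Detail-level fact table: foreign-key columns plus one fact column.
        -- The primary key is the composite of all the foreign keys.
        CREATE TABLE sales_fact (
            time_key   INTEGER NOT NULL,
            item_key   INTEGER NOT NULL,
            branch_key INTEGER NOT NULL,
            units_sold INTEGER NOT NULL,
            PRIMARY KEY (time_key, item_key, branch_key)
        );
        INSERT INTO sales_fact VALUES
            (1, 10, 100, 5),
            (1, 11, 100, 2),
            (2, 10, 100, 7),
            (2, 10, 101, 1);

        -- Summary table: the same facts aggregated to the item level.
        CREATE TABLE sales_by_item AS
            SELECT item_key, SUM(units_sold) AS units_sold
            FROM sales_fact
            GROUP BY item_key;
        """
    )
    return conn


conn = build_fact_tables()
summary = conn.execute(
    "SELECT item_key, units_sold FROM sales_by_item ORDER BY item_key"
).fetchall()
print(summary)
```

Each table holds facts at a single level of aggregation: sales_fact at the detail level, sales_by_item at the item level.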

Dimension Tables

A dimension is a structure usually composed of one or more hierarchies that categorize data. If a dimension has no hierarchies or levels, it is called a flat dimension or list. The primary key of each dimension table is part of the composite primary key of the fact table. Dimensional attributes help to describe the dimensional value. They are generally descriptive, textual values. Dimension tables are usually smaller than fact tables.

For example, fact tables store data about sales, while dimension tables store data about geographic regions (markets, cities), clients, products, times, and channels.

Characteristics of Star Schema

The star schema is well suited to data warehouse database design because of the following features:

o It creates a denormalized database that can quickly provide query responses.
o It provides a flexible design that can be changed easily or added to throughout the development cycle, and as the database grows.
o It parallels in design how end-users typically think of and use the data.
o It reduces the complexity of metadata for both developers and end-users.
Advantages of Star Schema

Star schemas are easy for end-users and applications to understand and navigate. With a well-designed schema, users can quickly analyze large, multidimensional data sets.

The main advantages of star schemas in a decision-support environment are:

Query Performance

Because a star schema database has a limited number of tables and clear join paths, queries run faster than they do against OLTP systems. Small single-table queries, frequently of a dimension table, are almost instantaneous. Large join queries that involve multiple tables take only seconds or minutes to run.

In a star schema database design, the dimensions are connected only through the central fact table. When two dimension tables are used in a query, only one join path, passing through the fact table, exists between those two tables. This design feature enforces accurate and consistent query results.

Load performance and administration

Structural simplicity also decreases the time required to load large batches of records into a star schema database. By defining facts and dimensions and separating them into different tables, the impact of a load operation is reduced. Dimension tables can be populated once and occasionally refreshed. We can add new facts regularly and selectively by appending records to a fact table.

Built-in referential integrity

A star schema has referential integrity built in when information is loaded. Referential integrity is enforced because each record in a dimension table has a unique primary key, and all keys in the fact table are legitimate foreign keys drawn from the dimension tables. A record in the fact table that is not related correctly to a dimension cannot be given the correct key value to be retrieved.
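As a rough illustration of this point, the sketch below uses Python's built-in sqlite3 module to show a database rejecting a fact row whose key does not exist in the dimension table. The tables dim_item and sales_fact and the helper try_insert are hypothetical names chosen for the example. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other database systems behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE dim_item (
        item_key  INTEGER PRIMARY KEY,
        item_name TEXT NOT NULL
    );
    CREATE TABLE sales_fact (
        item_key   INTEGER NOT NULL REFERENCES dim_item(item_key),
        units_sold INTEGER NOT NULL
    );
    INSERT INTO dim_item VALUES (1, 'pen');
    """
)


def try_insert(item_key: int, units_sold: int) -> bool:
    """Try to load one fact row; return False if the foreign key is invalid."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sales_fact VALUES (?, ?)", (item_key, units_sold)
            )
        return True
    except sqlite3.IntegrityError:
        return False


valid_loaded = try_insert(1, 3)    # item 1 exists in dim_item
orphan_loaded = try_insert(99, 3)  # item 99 does not exist
print(valid_loaded, orphan_loaded)
```

Only the row that points at an existing dimension record is stored; the orphan row is refused at load time.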

Easily Understood

A star schema is simple to understand and navigate, with dimensions joined only through the fact table. These joins are more meaningful to the end-user because they represent the fundamental relationships between parts of the underlying business. Users can also browse dimension table attributes before constructing a query.
Disadvantage of Star Schema

There are some conditions that a star schema cannot meet. For example, the relationship between a user and a bank account cannot be described as a star schema, because the relationship between them is many to many.

Example: Suppose a star schema is composed of a fact table, SALES, and several dimension
tables connected to it for time, branch, item, and geographic locations.

The TIME table has columns for day, month, quarter, and year. The ITEM table has columns for item_key, item_name, brand, type, and supplier_type. The BRANCH table has columns for branch_key, branch_name, and branch_type. The LOCATION table has columns of geographic data, including street, city, state, and country.
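The example above can be sketched with Python's built-in sqlite3 module. The column lists follow the description, but the table names (dim_time, dim_item, dim_branch, dim_location, sales_fact), the units_sold measure, the surrogate key columns time_key and location_key, and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one denormalized table per dimension.
    CREATE TABLE dim_time (
        time_key INTEGER PRIMARY KEY,
        day INTEGER, month INTEGER, quarter INTEGER, year INTEGER
    );
    CREATE TABLE dim_item (
        item_key INTEGER PRIMARY KEY,
        item_name TEXT, brand TEXT, type TEXT, supplier_type TEXT
    );
    CREATE TABLE dim_branch (
        branch_key INTEGER PRIMARY KEY,
        branch_name TEXT, branch_type TEXT
    );
    CREATE TABLE dim_location (
        location_key INTEGER PRIMARY KEY,
        street TEXT, city TEXT, state TEXT, country TEXT
    );

    -- Fact table: one key per dimension plus the measure.
    CREATE TABLE sales_fact (
        time_key     INTEGER REFERENCES dim_time(time_key),
        item_key     INTEGER REFERENCES dim_item(item_key),
        branch_key   INTEGER REFERENCES dim_branch(branch_key),
        location_key INTEGER REFERENCES dim_location(location_key),
        units_sold   INTEGER,
        PRIMARY KEY (time_key, item_key, branch_key, location_key)
    );

    INSERT INTO dim_time VALUES (1, 15, 1, 1, 2023), (2, 20, 4, 2, 2023);
    INSERT INTO dim_item VALUES
        (1, 'pen', 'Acme', 'stationery', 'local'),
        (2, 'ink', 'Bolt', 'stationery', 'import');
    INSERT INTO dim_branch VALUES (1, 'Main', 'retail');
    INSERT INTO dim_location VALUES
        (1, '1 High St', 'Springfield', 'StateA', 'CountryA'),
        (2, '2 Low St', 'Riverton', 'StateB', 'CountryA');
    INSERT INTO sales_fact VALUES
        (1, 1, 1, 1, 10), (1, 2, 1, 1, 4),
        (2, 1, 1, 2, 6),  (2, 2, 1, 2, 3);
    """
)

# Two dimensions meet only through the fact table: a single join path.
units_by_city_and_brand = conn.execute(
    """
    SELECT l.city, i.brand, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN dim_item     AS i ON i.item_key = f.item_key
    JOIN dim_location AS l ON l.location_key = f.location_key
    GROUP BY l.city, i.brand
    ORDER BY l.city, i.brand
    """
).fetchall()
print(units_by_city_and_brand)
```

Every dimension is one join away from the fact table, so a query that combines city and brand needs exactly two joins.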
What is Snowflake Schema?

A snowflake schema is a variant of the star schema. "A schema is known as a snowflake if one or more dimension tables do not connect directly to the fact table but must join through other dimension tables."

The snowflake schema is an expansion of the star schema where each point of the star explodes into more points. It is called a snowflake schema because its diagram resembles a snowflake. Snowflaking is a method of normalizing the dimension tables in a star schema. When we normalize all the dimension tables entirely, the resultant structure resembles a snowflake with the fact table in the middle.

Snowflaking is used to improve the performance of specific queries. The schema is diagrammed with each fact surrounded by its associated dimensions, and those dimensions are related to other dimensions, branching out into a snowflake pattern.

The snowflake schema consists of one fact table linked to many dimension tables, which can in turn be linked to other dimension tables through many-to-one relationships. Tables in a snowflake schema are generally normalized to the third normal form. Each dimension table represents exactly one level in a hierarchy.

The following diagram shows a snowflake schema with two dimensions, each having three levels. A snowflake schema can have any number of dimensions, and each dimension can have any number of levels.

Example: The figure shows a snowflake schema with a Sales fact table and Store, Location, Time, Product, Line, and Family dimension tables. The Market dimension has two dimension tables, with Store as the primary dimension table and Location as the outrigger dimension table. The Product dimension has three dimension tables, with Product as the primary dimension table and Line and Family as the outrigger dimension tables.
A star schema stores all attributes for a dimension in one denormalized table. This needs more disk space than a more normalized snowflake schema. Snowflaking normalizes the dimension by moving attributes with low cardinality into separate dimension tables that relate to the core dimension table through foreign keys. Snowflaking for the sole purpose of minimizing disk space is not recommended, because it can adversely impact query performance.
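To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. It snowflakes the Product dimension from the example into Product, Line, and Family tables. The column names, the units_sold measure, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Outrigger tables: each holds one level of the product hierarchy.
    CREATE TABLE family (
        family_key INTEGER PRIMARY KEY,
        family_name TEXT
    );
    CREATE TABLE line (
        line_key INTEGER PRIMARY KEY,
        line_name TEXT,
        family_key INTEGER REFERENCES family(family_key)
    );
    -- Primary dimension table: the only one the fact table points to.
    CREATE TABLE product (
        product_key INTEGER PRIMARY KEY,
        product_name TEXT,
        line_key INTEGER REFERENCES line(line_key)
    );
    CREATE TABLE sales_fact (
        product_key INTEGER REFERENCES product(product_key),
        units_sold INTEGER
    );

    INSERT INTO family VALUES (1, 'Office'), (2, 'Home');
    INSERT INTO line VALUES (1, 'Pens', 1), (2, 'Paper', 1), (3, 'Lamps', 2);
    INSERT INTO product VALUES
        (1, 'Blue pen', 1), (2, 'A4 ream', 2), (3, 'Desk lamp', 3);
    INSERT INTO sales_fact VALUES (1, 10), (2, 5), (3, 2), (1, 1);
    """
)

# Reaching the family level takes three joins; in a star schema the
# family name would sit in the product table and one join would do.
units_by_family = conn.execute(
    """
    SELECT fa.family_name, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN product AS p  ON p.product_key = f.product_key
    JOIN line    AS l  ON l.line_key = p.line_key
    JOIN family  AS fa ON fa.family_key = l.family_key
    GROUP BY fa.family_name
    ORDER BY fa.family_name
    """
).fetchall()
print(units_by_family)
```

Each family name is stored once rather than repeated on every product row, at the cost of two extra joins per query.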

In a snowflake schema, tables are normalized to remove redundancy: the dimension tables are broken down into multiple dimension tables.

The figure shows a simple star schema for sales in a manufacturing company. The sales fact table includes quantity, price, and other relevant metrics. SALESREP, CUSTOMER, PRODUCT, and TIME are the dimension tables.

The star schema for sales, as shown above, contains only five tables, whereas the normalized version extends to eleven tables. Notice that in the snowflake schema, the attributes with low cardinality in each original dimension table are removed to form separate tables. These new tables are connected back to the original dimension table through artificial keys.

A snowflake schema is designed for flexible querying across more complex dimensions and relationships. It is suitable for many-to-many and one-to-many relationships between dimension levels.

Advantage of Snowflake Schema

1. The primary advantage of the snowflake schema is the potential improvement in query performance for some queries, due to minimized disk storage requirements and joins against smaller lookup tables.
2. It provides greater scalability in the interrelationship between dimension levels and components.
3. No redundancy, so it is easier to maintain.
Disadvantage of Snowflake Schema

o The primary disadvantage of the snowflake schema is the additional maintenance effort required due to the increasing number of lookup tables.
o Queries are more complex and hence more difficult to understand.
o More tables mean more joins, and so more query execution time.
In this scenario, the SALES table contains only four columns with IDs from the dimension tables,
TIME, ITEM, BRANCH, and LOCATION, instead of four columns for time data, four columns for
ITEM data, three columns for BRANCH data, and four columns for LOCATION data. Thus, the size
of the fact table is significantly reduced. When we need to change an item, we need only make a
single change in the dimension table, instead of making many changes in the fact table.

We can create even more complex star schemas by normalizing a dimension table into several tables.
The normalized dimension table is called a Snowflake.

Difference between Star and Snowflake Schemas

Star Schema

o In a star schema, the fact table is at the center and is connected to the dimension tables.
o The tables are completely denormalized in structure.
o SQL query performance is good, as fewer joins are involved.
o Data redundancy is high, and the schema occupies more disk space.

Snowflake Schema

o A snowflake schema is an extension of the star schema in which dimension tables are connected to one or more other dimension tables.
o The tables are partially normalized in structure.
o SQL query performance is somewhat lower than with a star schema, as more joins are involved.
o Data redundancy is low, and the schema occupies less disk space than a star schema.

Let's look at the differences between the star and snowflake schemas.
Basis for Comparison | Star Schema | Snowflake Schema
Ease of maintenance/change | It has redundant data and hence is less easy to maintain/change | No redundancy, and therefore easier to maintain and change
Ease of use | Less complex queries that are simple to understand | More complex queries that are therefore less easy to understand
Parent table | A dimension table will not have any parent table | A dimension table will have one or more parent tables
Query performance | Fewer foreign keys and hence shorter query execution time | More foreign keys and thus longer query execution time
Normalization | It has denormalized tables | It has normalized tables
Type of data warehouse | Good for data marts with simple relationships (one to one or one to many) | Good for the data warehouse core, to simplify complex relationships (many to many)
Joins | Fewer joins | Higher number of joins
Dimension table | It contains only a single dimension table for each dimension | It may have more than one dimension table for each dimension
Hierarchies | Hierarchies for the dimension are stored in the dimension table itself | Hierarchies are broken into separate tables. These hierarchies help to drill down from the topmost to the lowermost levels.
When to use | When the dimension tables contain fewer rows, we can go for a star schema. | When dimension tables store a huge number of rows with redundant information and space is an issue, we can choose a snowflake schema to save space.
Data warehouse system | Works best in any data warehouse/data mart | Better for small data warehouse/data mart
What is Fact Constellation Schema?

A fact constellation consists of two or more fact tables sharing one or more dimensions. It is also called a galaxy schema.

A fact constellation schema describes a logical structure of a data warehouse or data mart. It can be designed with a collection of denormalized fact tables and shared, conformed dimension tables.

A fact constellation schema is a sophisticated database design in which it is difficult to summarize information. It can be implemented between aggregate fact tables, or by decomposing a complex fact table into independent, simpler fact tables.

Example: A fact constellation schema is shown in the figure below.
This schema defines two fact tables, sales and shipping. Sales are analyzed along four dimensions, namely time, item, branch, and location. The schema contains a fact table for sales that includes keys to each of the four dimensions, along with two measures: Rupee_sold and units_sold. The shipping table has five dimensions, or keys: item_key, time_key, shipper_key, from_location, and to_location, and two measures: Rupee_cost and units_shipped.
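A minimal sketch of this constellation, using Python's built-in sqlite3 module, is given below. The key and measure names come from the description above. The table names item and location, their columns, and the sample rows are assumptions, and the time, branch, and shipper dimension tables are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Shared dimension tables, used by both fact tables.
    CREATE TABLE item (item_key INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE location (location_key INTEGER PRIMARY KEY, city TEXT);

    CREATE TABLE sales (
        time_key     INTEGER,
        item_key     INTEGER REFERENCES item(item_key),
        branch_key   INTEGER,
        location_key INTEGER REFERENCES location(location_key),
        Rupee_sold   REAL,
        units_sold   INTEGER
    );
    CREATE TABLE shipping (
        item_key      INTEGER REFERENCES item(item_key),
        time_key      INTEGER,
        shipper_key   INTEGER,
        from_location INTEGER REFERENCES location(location_key),
        to_location   INTEGER REFERENCES location(location_key),
        Rupee_cost    REAL,
        units_shipped INTEGER
    );

    INSERT INTO item VALUES (1, 'pen'), (2, 'ink');
    INSERT INTO location VALUES (1, 'Springfield'), (2, 'Riverton');
    INSERT INTO sales VALUES
        (1, 1, 1, 1, 100.0, 10),
        (2, 1, 1, 2, 60.0, 6),
        (1, 2, 1, 1, 40.0, 4);
    INSERT INTO shipping VALUES
        (1, 1, 1, 1, 2, 15.0, 12),
        (1, 2, 1, 2, 1, 8.0, 5),
        (2, 1, 1, 1, 2, 5.0, 4);
    """
)

# Aggregate each fact table on its own, then combine the results through
# the shared item dimension. Joining the two fact tables row by row
# would multiply rows and inflate the sums.
sold_vs_shipped = conn.execute(
    """
    SELECT i.item_name, s.units_sold, sh.units_shipped
    FROM item AS i
    JOIN (SELECT item_key, SUM(units_sold) AS units_sold
          FROM sales GROUP BY item_key) AS s
      ON s.item_key = i.item_key
    JOIN (SELECT item_key, SUM(units_shipped) AS units_shipped
          FROM shipping GROUP BY item_key) AS sh
      ON sh.item_key = i.item_key
    ORDER BY i.item_name
    """
).fetchall()
print(sold_vs_shipped)
```

Because both fact tables reference the same item and location tables, measures from sales and shipping can be compared on a common dimension.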

The primary disadvantage of the fact constellation schema is that it is a more challenging
design because many variants for specific kinds of aggregation must be considered and selected.

Thanks for listening
