Star and Snowflake Schema - SQL
Uploaded by
Nouman shah
Star and snowflake schema
DATABASE DESIGN (datacamp)

Star schema

Dimensional modeling: star schema

Fact tables
* Hold records of a metric
* Change regularly
* Connect to dimensions via foreign keys

Dimension tables
* Hold descriptions of attributes
* Do not change as often

Example:
* Supply books to stores in USA and Canada
* Keep track of book sales

Star schema example

dim_book_star
    book_id        int PK
    title          varchar(256)
    author         varchar(256)
    publisher      varchar(256)
    genre          varchar(128)

dim_store_star
    store_id       int PK
    store_address  varchar(256)
    city           varchar(128)
    state          varchar(128)
    country        varchar(128)

dim_time_star
    time_id        int PK
    day            int
    month          int
    quarter        int
    year           int

fact_booksales
    (primary key column, name illegible in the source)
    book_id        int FK
    time_id        int FK
    store_id       int FK
    sales_amount   float
    quantity       (type illegible in the source)

Snowflake schema (an extension)

[Diagram: snowflake schema; not recoverable from the source]

Same fact table, different dimensions
* Star schemas: dimensions extend one level from the fact table
* Snowflake schemas: dimensions extend more than one level, because the dimension tables are normalized

What is normalization?
* Database design technique
* Divides tables into smaller tables and connects them via relationships
* Goal: reduce redundancy and increase data integrity
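The star schema from the book-sales example above could be declared roughly as follows. This is only a sketch, run through Python's built-in sqlite3 module so it is self-contained. The fact table's primary key name (sales_id) is an assumption because it is illegible on the slide, and SQLite's TEXT, INTEGER and REAL stand in for the varchar, int and float types shown there.

```python
import sqlite3

# Illustrative star schema: one fact table, three dimension tables.
# Column names follow the slides where legible; sales_id is an assumed name.
STAR_DDL = """
CREATE TABLE dim_book_star (
    book_id   INTEGER PRIMARY KEY,
    title     TEXT,
    author    TEXT,
    publisher TEXT,
    genre     TEXT
);
CREATE TABLE dim_store_star (
    store_id      INTEGER PRIMARY KEY,
    store_address TEXT,
    city          TEXT,
    state         TEXT,
    country       TEXT
);
CREATE TABLE dim_time_star (
    time_id INTEGER PRIMARY KEY,
    day     INTEGER,
    month   INTEGER,
    quarter INTEGER,
    year    INTEGER
);
CREATE TABLE fact_booksales (
    sales_id     INTEGER PRIMARY KEY,  -- assumed name
    book_id      INTEGER REFERENCES dim_book_star(book_id),
    time_id      INTEGER REFERENCES dim_time_star(time_id),
    store_id     INTEGER REFERENCES dim_store_star(store_id),
    sales_amount REAL,
    quantity     INTEGER
);
"""


def build_star_schema():
    """Create the star schema in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(STAR_DDL)
    return conn


def fact_foreign_keys(conn):
    """Return sorted (fact column, dimension table) pairs for the fact table."""
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    rows = conn.execute("PRAGMA foreign_key_list(fact_booksales)").fetchall()
    return sorted((row[3], row[2]) for row in rows)


conn = build_star_schema()
for column, dimension in fact_foreign_keys(conn):
    print(f"fact_booksales.{column} -> {dimension}")
```

Every foreign key of the fact table points directly at a dimension table, which is the one-level shape that gives the star schema its name.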
Identify repeating groups of data and create new tables for them.

Book dimension of the star schema

dim_book_star
    book_id        int PK
    title          varchar(256)
    author         varchar(256)
    publisher      varchar(256)
    genre          varchar(128)

Most likely to have repeating values:
* Author
* Publisher
* Genre

Book dimension of the snowflake schema

dim_book_sf
    book_id        int PK
    title          varchar(256)
    author_id      int FK
    publisher_id   int FK
    genre_id       int FK

dim_author_sf
    author_id      int PK
    author         varchar(256)

dim_publisher_sf
    publisher_id   int PK
    publisher      varchar(256)

dim_genre_sf
    genre_id       int PK
    genre          varchar(128)

Store dimension of the star schema

dim_store_star
    store_id       int PK
    store_address  varchar(256)
    city           varchar(128)
    state          varchar(128)
    country        varchar(128)

Most likely to have repeating values:
* City
* State
* Country

Store dimension of the snowflake schema

dim_store_sf
    store_id       int PK
    store_address  varchar(256)
    city_id        int FK

dim_city_sf
    city_id        int PK
    city           varchar(128)
    state_id       int FK

dim_state_sf
    state_id       int PK
    state          varchar(128)
    country_id     int FK

dim_country_sf
    country_id     int PK
    country        varchar(128)

Time dimension

dim_time_star
    time_id        int PK
    day            int
    month          int
    quarter        int
    year           int

dim_time_sf
    time_id        int PK
    day            int
    month_id       int FK

dim_month_sf
    month_id       int PK
    month          int
    quarter_id     int FK

dim_quarter_sf
    quarter_id     int PK
    quarter        int
    year_id        int FK

dim_year_sf
    year_id        int PK
    year           int

[Diagram: complete snowflake schema; not recoverable from the source]

Let's practice!

Normalized and denormalized databases

Back to our book store example
* Denormalized: star schema
* Normalized: snowflake schema

Denormalized query

Goal: get quantity of all Octavia E. Butler books sold in Vancouver in Q4 of 2018

    SELECT SUM(quantity)
    FROM fact_booksales
    -- Join to get city
    INNER JOIN dim_store_star
        ON fact_booksales.store_id = dim_store_star.store_id
    -- Join to get author
    INNER JOIN dim_book_star
        ON fact_booksales.book_id = dim_book_star.book_id
    -- Join to get year and quarter
    INNER JOIN dim_time_star
        ON fact_booksales.time_id = dim_time_star.time_id
    WHERE dim_store_star.city = 'Vancouver'
      AND dim_book_star.author = 'Octavia E. Butler'
      AND dim_time_star.year = 2018
      AND dim_time_star.quarter = 4;

Total of 3 joins

Normalized query

    SELECT SUM(fact_booksales.quantity)
    FROM fact_booksales
    -- Join to get city
    INNER JOIN dim_store_sf
        ON fact_booksales.store_id = dim_store_sf.store_id
    INNER JOIN dim_city_sf
        ON dim_store_sf.city_id = dim_city_sf.city_id
    -- Join to get author
    INNER JOIN dim_book_sf
        ON fact_booksales.book_id = dim_book_sf.book_id
    INNER JOIN dim_author_sf
        ON dim_book_sf.author_id = dim_author_sf.author_id
    -- Join to get year and quarter
    INNER JOIN dim_time_sf
        ON fact_booksales.time_id = dim_time_sf.time_id
    INNER JOIN dim_month_sf
        ON dim_time_sf.month_id = dim_month_sf.month_id
    INNER JOIN dim_quarter_sf
        ON dim_month_sf.quarter_id = dim_quarter_sf.quarter_id
    INNER JOIN dim_year_sf
        ON dim_quarter_sf.year_id = dim_year_sf.year_id
    WHERE dim_city_sf.city = 'Vancouver'
      AND dim_author_sf.author = 'Octavia E. Butler'
      AND dim_year_sf.year = 2018
      AND dim_quarter_sf.quarter = 4;

Total of 8 joins

So, why would we want to normalize a database?

Normalization saves space

dim_store_star
id | store_address   | city          | state      | country
1  | 67 First St     | Brooklyn      | New York   | USA
2  | 12 Jefferson Rd | San Francisco | California | USA
3  | 90 Coolidge St  | Los Angeles   | California | USA
4  | 85 Main Ave     | Brooklyn      | New York   | USA
5  | 123 Bedford St  | Brooklyn      | New York   | USA

Denormalized databases enable data redundancy

Normalization saves space

dim_store_sf
id | store_address   | city_id
1  | 67 First St     | 2
2  | 12 Jefferson Rd | 3
3  | 90 Coolidge St  | 4
4  | 85 Main Ave     | 2
5  | 123 Bedford St  | 2

dim_city_sf
city_id | city          | state_id
2       | Brooklyn      | 43
3       | San Francisco | 36
4       | Los Angeles   | 36

dim_state_sf
state_id | state      | country_id
43       | New York   | 121
36       | California | 121

Normalization eliminates data redundancy

Normalization ensures better data integrity

1. Enforces data consistency
   Must respect naming conventions because of referential integrity, e.g., 'California', not 'CA' or 'california'
2. Safer updating, removing, and inserting
   Less data redundancy = fewer records to alter
3. Easier to redesign by extending
   Smaller tables are easier to extend than larger tables

Database normalization

Advantages
* Normalization eliminates data redundancy: saves on storage
* Better data integrity: accurate and consistent data

Disadvantages
* Complex queries require more CPU

Remember OLTP and OLAP?

OLTP                                              | OLAP
e.g., operational databases                       | e.g., data warehouses
Typically highly normalized                       | Typically less normalized
Write-intensive                                   | Read-intensive
Prioritize quicker and safer insertion of data    | Prioritize quicker queries for analytics

Let's practice!
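The space and integrity points above can be illustrated with a small sketch of the normalized store dimension, again using Python's built-in sqlite3 module. The table and column names and the first five stores come from the slides. The country table is left out for brevity, and the two extra stores (ids 6 and 7) and the helper try_insert_store are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE dim_state_sf (
    state_id INTEGER PRIMARY KEY,
    state    TEXT NOT NULL UNIQUE
);
CREATE TABLE dim_city_sf (
    city_id  INTEGER PRIMARY KEY,
    city     TEXT NOT NULL,
    state_id INTEGER NOT NULL REFERENCES dim_state_sf(state_id)
);
CREATE TABLE dim_store_sf (
    store_id      INTEGER PRIMARY KEY,
    store_address TEXT NOT NULL,
    city_id       INTEGER NOT NULL REFERENCES dim_city_sf(city_id)
);
INSERT INTO dim_state_sf VALUES (43, 'New York'), (36, 'California');
INSERT INTO dim_city_sf VALUES
    (2, 'Brooklyn', 43), (3, 'San Francisco', 36), (4, 'Los Angeles', 36);
INSERT INTO dim_store_sf VALUES
    (1, '67 First St', 2), (2, '12 Jefferson Rd', 3), (3, '90 Coolidge St', 4),
    (4, '85 Main Ave', 2), (5, '123 Bedford St', 2);
""")


def try_insert_store(store_id, address, city_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO dim_store_sf VALUES (?, ?, ?)",
                (store_id, address, city_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_store(6, "1 Example Ave", 3)   # city_id 3 exists
rejected = try_insert_store(7, "2 Example Ave", 99)  # no such city_id

# The string 'Brooklyn' is stored once, however many stores are located there.
brooklyn_rows = conn.execute(
    "SELECT COUNT(*) FROM dim_city_sf WHERE city = 'Brooklyn'"
).fetchone()[0]
brooklyn_stores = conn.execute(
    "SELECT COUNT(*) FROM dim_store_sf "
    "JOIN dim_city_sf ON dim_store_sf.city_id = dim_city_sf.city_id "
    "WHERE dim_city_sf.city = 'Brooklyn'"
).fetchone()[0]

print(accepted, rejected, brooklyn_rows, brooklyn_stores)
```

Because a store can only reference an existing city_id, a variant spelling such as 'CA' for 'California' cannot slip in through the store table. The cost is the extra join needed to read a store's city.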
Normal forms

Normalization

Identify repeating groups of data and create new tables for them.

A more formal definition:

The goals of normalization are to:
* Be able to characterize the level of redundancy in a relational schema
* Provide mechanisms for transforming schemas in order to remove redundancy

(Database Design, 2nd Edition by Adrienne Watt)

Normal forms (NF)

Ordered from least to most normalized:
* First normal form (1NF)
* Second normal form (2NF)
* Third normal form (3NF)
* Elementary key normal form (EKNF)
* Boyce-Codd normal form (BCNF)
* Fourth normal form (4NF)
* Essential tuple normal form (ETNF)
* Fifth normal form (5NF)
* Domain-key normal form (DKNF)
* Sixth normal form (6NF)

https://en.wikipedia.org/wiki/Database_normalization

1NF rules
* Each record must be unique: no duplicate rows
* Each cell must hold one value

Initial data
[Table not legible in the source]

In 1NF form
[Table not legible in the source]

2NF
* Must satisfy 1NF AND
  * If the primary key is one column, then it automatically satisfies 2NF
  * If there is a composite primary key, then each non-key column must be dependent on all the keys

Initial data
[Table not legible in the source]

In 2NF form
[Table not legible in the source]

3NF
* Satisfies 2NF
* No transitive dependencies: non-key columns can't depend on other non-key columns

Initial data
[Table with columns Course_id (PK), Instructor_id, Instructor, Tech; rows not legible in the source]

In 3NF form
[Two tables, one of which has columns Instructor_id, Instructor; rows not legible in the source]

Data anomalies

What is risked if we don't normalize enough?
1. Update anomaly
2. Insertion anomaly
3. Deletion anomaly

Update anomaly

Data inconsistency caused by data redundancy when updating

[Table of student enrollments; not legible in the source. The same table is used for all three anomalies.]

To update student 520's email:
* Need to update more than one record; otherwise, there will be inconsistency
* The user updating needs to know about the redundancy

Insertion anomaly

Unable to add a record due to missing attributes

Unable to insert a student who has signed up but not enrolled in any courses

Deletion anomaly

Deletion of record(s) causes unintentional loss of data

If we delete student 230, what happens to the data on Cleaning Data in R?

Data anomalies

The more normalized the database, the less prone it will be to data anomalies

Don't forget the downsides of normalization from the last video

Let's practice!
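The update and insertion anomalies described above can be reproduced in a few lines. The sketch below uses Python's built-in sqlite3 module. Apart from student id 520, which the slides mention, all table names, emails, courses and the student id 999 are hypothetical, because the slide's own table is not legible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized: the student's email is repeated on every enrollment row.
CREATE TABLE enrollments_flat (
    student_id    INTEGER NOT NULL,
    student_email TEXT NOT NULL,
    course        TEXT NOT NULL
);
INSERT INTO enrollments_flat VALUES
    (520, 'old@example.com', 'Introduction to Python'),
    (520, 'old@example.com', 'Intermediate Python');

-- Normalized: the email lives in exactly one row.
CREATE TABLE students (
    student_id    INTEGER PRIMARY KEY,
    student_email TEXT NOT NULL
);
CREATE TABLE enrollments (
    student_id INTEGER NOT NULL REFERENCES students(student_id),
    course     TEXT NOT NULL
);
INSERT INTO students VALUES (520, 'old@example.com');
INSERT INTO enrollments VALUES
    (520, 'Introduction to Python'), (520, 'Intermediate Python');
""")

# Update anomaly: the update reaches only one of the redundant rows.
conn.execute(
    "UPDATE enrollments_flat SET student_email = 'new@example.com' "
    "WHERE student_id = 520 AND course = 'Introduction to Python'"
)
flat_emails = {
    row[0]
    for row in conn.execute(
        "SELECT student_email FROM enrollments_flat WHERE student_id = 520"
    )
}

# In the normalized design there is only one row to change.
conn.execute(
    "UPDATE students SET student_email = 'new@example.com' WHERE student_id = 520"
)
normalized_emails = {
    row[0]
    for row in conn.execute(
        "SELECT s.student_email FROM enrollments e "
        "JOIN students s ON s.student_id = e.student_id "
        "WHERE e.student_id = 520"
    )
}

# Insertion anomaly avoided: a student with no courses can still be stored.
conn.execute("INSERT INTO students VALUES (999, 'signup@example.com')")
signup_courses = conn.execute(
    "SELECT COUNT(*) FROM enrollments WHERE student_id = 999"
).fetchone()[0]

print(sorted(flat_emails), sorted(normalized_emails), signup_courses)
```

The flat table ends up with two different emails for the same student, while the normalized pair of tables stays consistent. The flat table's NOT NULL course column is also what would block a student who has signed up without enrolling.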