4 Normal Forms Post 3

The document outlines important deadlines and information for a course, including due dates for assignments and exams, specifically highlighting Midterm 1 scheduled for February 11. It also discusses normalization concepts in database design, including 1NF, 2NF, BCNF, and 3NF, along with functional dependency preservation. Additionally, it emphasizes the importance of minimal covers for functional dependencies in ensuring efficient database schema design.

Uploaded by

Roy Chen
Copyright
© © All Rights Reserved

Reminders and Important Dates

Feb 06, 2020


Upcoming deadlines
• February 3 to 7: Tutorial 2 is due
• February 7 @ 11:59PM: Milestone 2- ER Diagram and Schema
• February 11 @ 7PM: Midterm 1
• February 14: Last day to withdraw with a W

Please check piazza post @7 for Important Dates and Deadlines

Normalization 1
Reminders and Important Dates
Feb 06, 2020
Midterm 1 information
• Tuesday February 11 @ 7PM (next week)
• Duration : 50 minutes
• Covers up until BCNF (3NF will NOT be on midterm 1)
• Check the Piazza post which contains additional information
• Lots of practice material and previous midterm exam samples are available on Canvas

• Both sections will have an open office hour on Tuesday (lecture time). TAs and I will be here to answer your questions (5:00-6:30 pm). No office hour from 3:00-4:30 pm on Tuesday (Feb 11)
• Midterm 1 makeup exam: Wednesday February 12 @ 5PM in ICCS 202 (if you submitted a midterm conflict, you should have heard from us; if not, email us)

Please check piazza post @7 for Important Dates and Deadlines


Normalization 2
Outline
1. Recap
2. FD Preservation
3. 3NF
4. Revisit learning objectives

Normalization 3
Previously…
• We started looking at normalization
• 1NF (only one value per attribute)
• 2NF (All nonkey attributes in the table must be functionally
dependent on the entire primary key)

Normalization 4
Previously…2NF
City, Street, HouseNumber → HouseColor
City → CityPopulation

House#  Street       City       HouseColor  CityPopulation
12      Burrad St    Vancouver  White       600,218
21      Burrad St    Vancouver  Red         600,218
23      Hamilton St  Richmond   Blue        216,288
12      Burrad St    Richmond   White       216,288

Decomposed into two tables:

City, Street, HouseNumber → HouseColor
House#  Street       City       HouseColor
12      Burrad St    Vancouver  White
21      Burrad St    Vancouver  Red
23      Hamilton St  Richmond   Blue
12      Burrad St    Richmond   White

City → CityPopulation
City       CityPopulation
Vancouver  600,218
Richmond   216,288

Normalization 5
Previously…
• We started looking at normalization
• 1NF (only one value per attribute)
• 2NF (All nonkey attributes in the table must be functionally
dependent on the entire primary key)
• BCNF (if X → b is a non-trivial FD, then X must be a superkey)

Normalization 6
Try!
PID Name Age canDrive Phone#
P1 John 14 No 604 111 1111
P2 Raj 28 Yes 604 111 1111
P3 John 14 No 604 333 3333

PID → Name
Age → canDrive
Is the relation in BCNF? If not, decompose this relation into BCNF

Normalization 7
Sol
FDs: PID → Name, Age → canDrive

PID  Name  Age  canDrive  Phone#
P1   John  14   No        604 111 1111
P2   Raj   28   Yes       604 111 1111
P3   John  14   No        604 333 3333

Step 1: decompose on PID → Name

PID  Name
P1   John
P2   Raj
P3   John

PID  Age  canDrive  Phone#
P1   14   No        604 111 1111
P2   28   Yes       604 111 1111
P3   14   No        604 333 3333

Step 2: decompose the second table on Age → canDrive

Age  canDrive
14   No
28   Yes

PID  Age  Phone#
P1   14   604 111 1111
P2   28   604 111 1111
P3   14   604 333 3333

The tables (PID, Name), (Age, canDrive) and (PID, Age, Phone#) are in BCNF.
Normalization 8
Previously… Lossless-Join
Decompositions
• We should be able to construct the instance of the original table from the instances of the tables in the decomposition

SName  TutorialNum  TutorialGroup  Marks
John   T1           G1             2
John   T2           G3             1
Linda  T1           G2             1

Decompose in two tables:

SName  TutorialGroup  Marks
John   G1             2
John   G3             1
Linda  G2             1

TutorialNum  Marks
T1           2
T2           1
T1           1

After join:

SName  TutorialNum  TutorialGroup  Marks
John   T1           G1             2
John   T2           G3             1
John   T1           G3             1
Linda  T2           G2             1
Linda  T1           G2             1
Normalization 10
Previously… Lossless-Join
Decompositions
• A decomposition {R1, R2} of R is lossless if and only if the common attributes of R1 and R2 form a superkey for either schema, that is
• R1 ∩ R2 → R1 or R1 ∩ R2 → R2

In the previous example we had

R = {SName, TutorialNum, TGroup, Mark} and
F = {SName, TutorialNum → TGroup, Mark}
R1 = {SName, TGroup, Mark}
R2 = {TutorialNum, Mark}

R1:
SName  TutorialGroup  Marks
John   G1             2
John   G3             1
Linda  G2             1

R2:
TutorialNum  Marks
T1           2
T2           1
T1           1

R1 ∩ R2 = {Mark} and it is not a key of either R1 or R2

Therefore, decomposition {R1, R2} is lossy
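
The lossless-join test above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `closure` and `is_lossless` are made up here, and the second decomposition at the bottom is a hypothetical alternative that does not appear on the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary test: R1 ∩ R2 must functionally determine all of R1 or all of R2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


fds = [({"SName", "TutorialNum"}, {"TGroup", "Mark"})]

# Decomposition from the slide: the common attribute is only Mark.
r1 = {"SName", "TGroup", "Mark"}
r2 = {"TutorialNum", "Mark"}
slide_result = is_lossless(r1, r2, fds)

# Hypothetical alternative (not from the slides): both tables keep the key.
alt1 = {"SName", "TutorialNum", "TGroup"}
alt2 = {"SName", "TutorialNum", "Mark"}
alt_result = is_lossless(alt1, alt2, fds)

print(slide_result, alt_result)
```

The first call reports the slide's decomposition as lossy because the closure of {Mark} is just {Mark}; the alternative passes because the shared attributes contain the key.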
Normalization 11
Outline
1. Recap
2. FD Preservation
3. 3NF
4. Revisit learning objectives

Normalization 12
FD Preservation
• Given a relation R and a set of FDs F, decompose R into
R1 and R2

• Suppose
– R1 has a set of FDs F1
– R2 has a set of FDs F2
– F1 and F2 are computed from F.

• The decomposition is dependency preserving if by enforcing F1 over R1 and F2 over R2, we can enforce F over R

Normalization 13
Example

Normalization 14
Example : FD Preservation
FDs: PID → Name, Age → canDrive

PID  Name  Age  canDrive  Phone#
P1   John  14   No        604 111 1111
P2   Raj   28   Yes       604 111 1111
P3   John  14   No        604 333 3333

Decomposition:

PID → Name
PID  Name
P1   John
P2   Raj
P3   John

Age → canDrive
Age  canDrive
14   No
28   Yes

PID  Age  Phone#
P1   14   604 111 1111
P2   28   604 111 1111
P3   14   604 333 3333

Each FD can be enforced within a single table of the decomposition, so both FDs are preserved.

Normalization 15
After you decompose, how do you
know which FDs apply?
• For an FD X → b,
• if the decomposed relation S contains {X ∪ b}, and
• b ∈ X+ (the closure of X, computed using all FDs of the original relation),
• then the FD holds for S.

• For example, consider relation R(A,B,C,D,E) with functional dependencies AB → C, BC → D, CD → E, DE → A, and AE → B. Project these FDs onto the relation S(A,B,C,D).

• Does AB → D hold?
• First check whether A, B and D are all in S. They are.
• Find AB+ = ABCDE, which contains D.
• So yes, AB → D does hold in S.

• Does CD → E hold?
• No: E is not an attribute of S.
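
The two-part check above (attributes present in S, then closure membership) could be sketched in Python as follows. The function names are illustrative, and the sketch assumes single-letter attribute names so that a string such as "AB" stands for the set {A, B}.

```python
def closure(attrs, fds):
    """Attribute closure; attrs and each FD side are strings of single-letter attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def holds_in_projection(lhs, rhs, s, fds):
    """X -> b holds in S if S contains X and b, and b is in X+ under ALL original FDs."""
    if not (set(lhs) | set(rhs)) <= set(s):
        return False
    return set(rhs) <= closure(lhs, fds)


fds = [("AB", "C"), ("BC", "D"), ("CD", "E"), ("DE", "A"), ("AE", "B")]
s = "ABCD"

ab_d = holds_in_projection("AB", "D", s, fds)   # all attributes in S, D in AB+
cd_e = holds_in_projection("CD", "E", s, fds)   # E is not an attribute of S
print(ab_d, cd_e)
```

Note that the closure is always computed with the full FD set of R, even for FDs that mention attributes missing from S.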
Normalization 16
Clicker Question
• Consider relation R(A,B,C,D,E) with functional dependencies
AB → C, BC → D, CD → E, DE → A, and AE → B. Project
these FD's onto the relation S(A,B,C,D).
• Which of the following hold in S?
A. A→B
B. AB→E
C. AE→B
D. BCD→A
E. None of the above

Normalization 17
Clicker Question
• Consider relation R(A,B,C,D,E) with functional dependencies AB → C, BC → D, CD → E, DE → A, and AE → B. Project these FD's onto the relation S(A,B,C,D).

• Which of the following hold in S?
A. A→B
B. AB→E
C. AE→B
D. BCD→A
E. None of the above

X → b    S contains {X U b}?   b ∈ X+?              Holds in S?
A→B      Yes                   No (A+ = A)          No
AB→E     No                                         No
AE→B     No                                         No
BCD→A    Yes                   Yes (BCD+ = ABCDE)   Yes

Note that we use all FDs for finding closures, so for option (D) we use CD→E even though E is not present in S.

Normalization 18
Midterm cut-off
• Closed book exam
• Comprehensive covering everything up to FD
preservation
• Introduction to DB
• ERD
• RS
• Normalization (up to BCNF – FD, Anomalies, 1NF, 2NF,
BCNF, lossless join decomposition, FD preservation)

Normalization 19
Outline
1. Recap
2. FD Preservation
3. 3NF
4. Revisit learning objectives

Normalization 20
What is offered by 3NF decomposition?

• Lossless Join (Yes)

• Dependency Preservation (Yes)

Neither BCNF nor 3NF can guarantee all three properties (no redundancy, lossless join, dependency preservation)!

Decompose too far → can’t enforce all FDs.

Not far enough → can have redundancy.

A schema is considered “good” if it is in either BCNF or 3NF.

Normalization 21
3NF comes to the rescue!

Normalization 22
3NF to the rescue!

A relation R is in 3NF if:
If X → b is a non-trivial dependency in R,
then X is a superkey for R (this is the BCNF condition)
or b is part of a minimal key.
(must be true for every such functional dependency)

Note: b must be part of a key, not just part of a superkey (if a key exists, all attributes are part of a superkey!)

Normalization 23
3NF to the rescue!

Example: R(Unit, Company, Product)

Keys: {Company, Product}, {Unit, Product}

FDs:
• Unit → Company
• Company, Product → Unit

Normalization 24
3NF to the rescue!

Example: R(Unit, Company, Product)

Keys: {Company, Product}, {Unit, Product}

FDs:
• Unit → Company: not in BCNF (Unit is not a superkey). Company is part of a key, so 3NF.
• Company, Product → Unit: {Company, Product} is a superkey, so this FD satisfies BOTH BCNF and 3NF.
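
As a rough sketch of how this check could be automated, the Python below finds candidate keys by brute force and then tests each given FD against the BCNF and 3NF conditions. The function names are invented for illustration, and the brute-force key search is only practical for small schemas like the ones in these slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute force: try subsets by increasing size, skipping supersets of known keys."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys


def normal_form_violations(attrs, fds):
    """Return (BCNF violations, 3NF violations) among the given FDs."""
    attrs = set(attrs)
    prime = set().union(*candidate_keys(attrs, fds))  # attributes in some key
    bcnf_bad, third_nf_bad = [], []
    for lhs, rhs in fds:
        if closure(lhs, fds) == attrs:
            continue  # LHS is a superkey: fine for both normal forms
        for b in sorted(set(rhs) - set(lhs)):  # skip trivial parts
            bcnf_bad.append((tuple(sorted(lhs)), b))
            if b not in prime:
                third_nf_bad.append((tuple(sorted(lhs)), b))
    return bcnf_bad, third_nf_bad


attrs = {"Unit", "Company", "Product"}
fds = [({"Unit"}, {"Company"}), ({"Company", "Product"}, {"Unit"})]
keys = candidate_keys(attrs, fds)
bcnf_bad, third_nf_bad = normal_form_violations(attrs, fds)
print(keys, bcnf_bad, third_nf_bad)
```

For this example the sketch reports Unit → Company as the only BCNF violation and no 3NF violations, matching the annotations above.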

To decompose into 3NF we rely on the minimal cover

Normalization 25
Minimal Cover for a Set of FDs
• Sets of functional dependencies may have redundant dependencies that can be inferred from the others.
• E.g., A → C is redundant in: {A → B, B → C, A → C}

• Goal: Transform the set of FDs to be as small as possible

Normalization 26
Minimal Cover for a Set of FDs
• A minimal cover of a set of dependencies, F, is a set of
dependencies, U, such that:
• U is equivalent to F (F+ = U+)

• All FDs in U have the form X → A where A is a single attribute

• It is not possible to make U smaller (while preserving equivalence) by
• Deleting an FD
• Deleting an attribute from an FD (either from LHS or RHS)

• FDs and attributes that can be deleted in this way are called redundant

Normalization 27
Minimal Cover
• Intuitively, every FD in U is needed, and is "as small as possible" in order to get the same closure as F

F            U (minimal cover)
A→B          A→B
ABCD→E       ACD→E
EF→GH        EF→G
ACDF→EG      EF→H

Let's see how to find the minimal cover

Normalization 28
Finding minimal covers of FDs
1. Put FDs in standard form (have only one attribute on RHS)
2. Minimize LHS of each FD
3. Delete Redundant FDs

Normalization 29
Finding minimal covers of FDs
1. Put FDs in standard form (have only one attribute on RHS)
2. Minimize LHS of each FD
3. Delete Redundant FDs

Example:
A→B
ABCD→E
EF→G
EF→H
ACDF→EG

Replace ACDF→EG with
• ACDF → E
• ACDF → G

Normalization 30
Finding minimal covers of FDs
1. Put FDs in standard form (have only one attribute on RHS)
2. Minimize LHS of each FD
3. Delete Redundant FDs

Step 2: Eliminate redundant attributes from LHS.

Algorithm:
Remove B from the left-hand side of X → A in F if A is in (X − {B})+, with the closure computed using F.
Example:
ABCD→E
We check all subsets obtained by eliminating one attribute at a time:
BCD (A removed): can we get E now?
ACD (B removed): can we get E now?
ABD (C removed): can we get E now?
ABC (D removed): can we get E now?
Normalization 31
Finding minimal covers of FDs
1. Put FDs in standard form (have only one attribute on RHS)
2. Minimize LHS of each FD
3. Delete Redundant FDs

Reduce LHS     Final FDs
A→B            A→B
ABCD→E         ACD→E     (B removed: ACD+ contains E)
EF→G           EF→G
EF→H           EF→H
ACDF→E         ACD→E     (F removed: ACD+ contains E)
ACDF→G         ACDF→G

Normalization 32
Finding minimal covers of FDs
1. Put FDs in standard form (have only one attribute on RHS)
2. Minimize LHS of each FD
3. Delete Redundant FDs

FDs after minimizing LHS:
A→B
ACD→E
EF→G
EF→H
ACD→E
ACDF→G

Normalization 33
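Finding minimal covers of FDs

FDs: A→B, ACD→E, EF→G, EF→H, ACDF→G

FD        Closure when given FD   Closure when given FD   Decision
          is considered           is not considered
A→B       A+ = AB                 A+ = A                  Keep it
ACD→E     ACD+ = ABCDE            ACD+ = ABCD             Keep it
EF→G      EF+ = EFGH              EF+ = EFH               Keep it
EF→H      EF+ = EFGH              EF+ = EFG               Keep it
ACDF→G    ACDF+ = ABCDEFGH        ACDF+ = ABCDEFGH        Discard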

Normalization 34
Final answer

F            Minimal Cover
A→B          A→B
ABCD→E       ACD→E
EF→G         EF→G
EF→H         EF→H
ACDF→EG
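
The three steps can be put together in a short Python sketch. It assumes single-letter attribute names (so "ACD" means {A, C, D}); the function names `minimal_cover` and `fmt` are illustrative only. A minimal cover is not unique in general, so the output depends on the order in which attributes and FDs are tried.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: standard form, one attribute on each RHS.
    cover = [(frozenset(lhs), frozenset(b)) for lhs, rhs in fds for b in rhs]

    # Step 2: drop an LHS attribute when the RHS is still in the closure without it.
    reduced = []
    for lhs, rhs in cover:
        for attr in sorted(lhs):
            smaller = lhs - {attr}
            if smaller and rhs <= closure(smaller, cover):
                lhs = smaller
        reduced.append((lhs, rhs))
    reduced = list(dict.fromkeys(reduced))  # remove duplicates, keep order

    # Step 3: drop an FD when the remaining FDs still imply it.
    result = list(reduced)
    for fd in reduced:
        others = [g for g in result if g != fd]
        if fd[1] <= closure(fd[0], others):
            result = others
    return result


def fmt(fds):
    """Render FDs as strings such as 'ACD->E'."""
    return ["".join(sorted(l)) + "->" + "".join(sorted(r)) for l, r in fds]


f = [("A", "B"), ("ABCD", "E"), ("EF", "GH"), ("ACDF", "EG")]
cover = fmt(minimal_cover(f))
print(cover)
```

On the F used in this example the sketch ends with A→B, ACD→E, EF→G, EF→H, the same four FDs as the final answer above.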

Normalization 35
Find the minimal cover - Try!
• R(ABCD)
• F = {A → BC, B → C, AB → D}

Normalization 36
Find the minimal cover
• R(ABCD)
• F = {A → BC, B → C, AB → D}
1. Reduce RHS

Given FDs    FDs after reducing RHS
A→BC         A→B
             A→C
B→C          B→C
AB→D         AB→D

2. Reduce LHS

Given FDs    Test for the reduction of LHS    FDs after reducing LHS
A→B          We can't reduce LHS              A→B
A→C          We can't reduce LHS              A→C
B→C          We can't reduce LHS              B→C
AB→D         A+ = ABCD, B+ = BC               A→D

A+ includes D, so B is extraneous, i.e., we can identify D without B on the LHS.

Normalization 37
Find the minimal cover
3. Remove Redundant FD

F = {A→B, A→C, B→C, A→D}

Reduced FDs   Consider FD while   Don't consider FD while   Decision (if both closures are the
              finding closure     finding closure           same, then discard that FD)
A→B           A+ = ABCD           A+ = ACD                  Keep it
A→C           A+ = ABCD           A+ = ABCD                 Discard (now don't consider this FD further)
B→C           B+ = BC             B+ = B                    Keep it
A→D           A+ = ABCD           A+ = ABC                  Keep it

Final Minimal Cover
A→B
B→C
A→D

Normalization 38
Now we’re ready to decompose into
3NF
• We'll cover two methods.

• Both methods
• Result in relations that do not violate 3NF
• Are lossless (you don't get any additional tuples)
• Preserve all functional dependencies

• The first one starts by ensuring that the decomposition is lossless and then preserves all functional dependencies

• The second one starts by preserving all functional dependencies and then ensures that the decomposition is lossless

Normalization 39
Decomposition into 3NF using
Minimal Cover
• Decomposition into 3NF:
• Given the FDs F, compute F': the minimal cover for F
• If a relation violates 3NF, decompose it using F', similar to how it was done for BCNF
• After each decomposition, identify the set of dependencies N in F' that are not preserved by the decomposition.
• For each X → a in N, create a relation Rn(X ∪ a) and add it to the decomposition

Intuition: first remove redundancy using lossless joins to ensure all results are valid. Then ensure that we maintain all FDs.

Normalization 40
3NF Example
Relation: R(ABCDE)
• FD: AB→C, C→D

Normalization 41
Your turn!
Let R(CSJDPQV) be a relation with the following FDs
SD→P
JP→C
J→S
• Is this in 3NF? If not, decompose R into 3NF

Normalization 42
Your turn!
Let R(CSJDPQV) be a relation with the following FDs
• SD→P
• JP→C
• J→S
• Is this in 3NF? If not, decompose R into 3NF .
• Already in minimal cover
SD+=SDP
JP+= JPSC
J+=JS
JDQV+ = CSJDPQV Key

Normalization 43
Your turn!
• Let R(CSJDPQV) be a relation with FDs SD→P, JP→C, J→S

R(CSJDPQV)
  SD → P violates 3NF: decompose into R1(SDP) and R2(CSJDQV)
  In R2, J → S violates 3NF: decompose into R3(JS) and R4(CJDQV)

• No more violations!
• Are all FDs preserved? JP→C? No, so add R5(CJP)
• Final answer: R1(SDP), R3(JS), R4(CJDQV), R5(CJP)

Normalization 44
3NF Synthesis
• Conceptually simpler.
• Given a set of FDs 𝐹, obtain a minimal cover 𝐹’
• ∀FD 𝑋 → 𝐴 ∈ 𝐹 ′ , create a scheme 𝑋𝐴.
• Resulting decomposition is guaranteed to preserve all FDs (trivially) and each scheme is in 3NF. But there is no guarantee of a lossless join (LLJ)!
• Easy fix: if no scheme contains a key of the original relation scheme 𝑅, add a scheme that contains one.

• Revisit previous example:


• R(ABCDE) FD: AB→C, C→D.

Normalization 45
Example: Decomposition into 3NF Using
a Minimal Cover and 3NF Synthesis
• Suppose we have R(A,B,C) with FDs: A → B, C → B.
1. Find a minimal cover F'. Already done.
2. For each FD X→b, add relation Xb to
the decomposition for R.
• Result: R1(A,B) and R2(B,C). Are we done? No.
3. Does it contain a key? What are the keys of R? AC.
Add R3(A,C).

• Let’s see why we need step #3
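
The synthesis steps can be sketched in Python as below. The sketch assumes the FDs passed in are already a minimal cover and that attributes are single letters; `find_key` and `synthesize_3nf` are illustrative names, and `find_key` returns just one candidate key (whichever survives the shrinking order).

```python
def closure(attrs, fds):
    """Attribute closure; attrs and FD sides are strings or sets of single-letter attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_key(attrs, fds):
    """Shrink the full attribute set to one candidate key."""
    key = set(attrs)
    for a in sorted(attrs):
        if closure(key - {a}, fds) == set(attrs):
            key.discard(a)
    return key


def synthesize_3nf(attrs, min_cover):
    """3NF synthesis; min_cover is ASSUMED to already be a minimal cover."""
    schemas = []
    for lhs, rhs in min_cover:          # one scheme XA per FD X -> A
        schema = set(lhs) | set(rhs)
        if schema not in schemas:
            schemas.append(schema)
    # Drop schemes strictly contained in another scheme.
    schemas = [s for s in schemas if not any(s < t for t in schemas)]
    # Lossless-join fix: make sure some scheme contains a key of R.
    if not any(closure(s, min_cover) == set(attrs) for s in schemas):
        schemas.append(find_key(attrs, min_cover))
    return schemas


result_abc = synthesize_3nf("ABC", [("A", "B"), ("C", "B")])
result_abcde = synthesize_3nf("ABCDE", [("AB", "C"), ("C", "D")])
print(result_abc, result_abcde)
```

For R(A,B,C) with A → B and C → B this produces the schemes AB, BC and the added key scheme AC, as in the steps above.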

Normalization 46
Consider the following set of tuples
with the previous relation and FDs

FDs: A→B, C→B

R(A, B, C)
A  B  C
1  2  3
1  2  4
2  2  4

Decompose:

R1(A, B): maintains A→B
A  B
1  2
1  2
2  2

R2(B, C): maintains C→B
B  C
2  3
2  4
2  4

Natural join of R1 and R2: maintains C→B and A→B
A  B  C
1  2  3
1  2  4
2  2  4
2  2  3

Not lossless!
Normalization 47
Take two!
FDs: A→B, C→B

R(A, B, C)
A  B  C
1  2  3
1  2  4
2  2  4

Decompose:

R1(A, B): maintains A→B
A  B
1  2
1  2
2  2

R2(B, C): maintains C→B
B  C
2  3
2  4
2  4

R3(A, C): the key!
A  C
1  3
1  4
2  4

Natural join of R1, R2 and R3: maintains C→B and A→B
A  B  C
1  2  3
1  2  4
2  2  4

Lossless! Can't get (2, 2, 3) because there is no (2, 3) tuple in R3(A,C)
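
The same experiment can be reproduced with Python sets of tuples. This is just a small sketch of the projections and joins shown above; the variable names are illustrative.

```python
# Instance of R(A, B, C) from the slide.
rows = {(1, 2, 3), (1, 2, 4), (2, 2, 4)}

# Projections (sets remove duplicate rows automatically).
r1 = {(a, b) for a, b, c in rows}   # R1(A, B)
r2 = {(b, c) for a, b, c in rows}   # R2(B, C)
r3 = {(a, c) for a, b, c in rows}   # R3(A, C), contains the key AC

# Natural join of R1 and R2 on B.
join_two = {(a, b, c) for a, b in r1 for b2, c in r2 if b == b2}

# Joining with R3 as well keeps only tuples whose (A, C) pair exists in R3.
join_three = {(a, b, c) for a, b, c in join_two if (a, c) in r3}

spurious = join_two - rows
print(spurious)             # tuples that were not in the original instance
print(join_three == rows)   # whether the three-way join recovers R exactly
```

Joining only R1 and R2 produces the extra tuple (2, 2, 3); adding R3 filters it out, which is why the key scheme is needed.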
In decompositions, you can often make some
adjustments to make a "better" decomposition

• For example, the decomposition
R1(ABC)
R2(CD)
R3(EFG)
R4(EF)
R5(ABEG)
has a redundant relation: you don't need R4 because all of its information is contained in R3

• You can make the same optimizations in BCNF

Normalization 50
3NF Synthesis – Try!
Relation: R(ABCDE)
• FD: AB→C, C→D

Normalization 51
3NF Synthesis – Try!
Relation: R(ABCDE)
• FD: AB→C, C→D

Cover is already minimal
AB+ = ABCD
C+ = CD
ABE+ = ABCDE (only key)

R1 (ABC)
R2 (CD)
R3 (ABE)

Normalization 52
Question: BCNF and 3NF
Consider the following relation and functional dependencies:
R(ABCD)

FD's: ACD → B ; AC → D ; D → C ; AC → B

Which of the following is true:


A. R is in neither BCNF nor 3NF
B. R is in BCNF but not 3NF
C. R is in 3NF but not in BCNF
D. R is in both BCNF and 3NF

Normalization 53
BCNF and 3NF
Consider the following relation and functional dependencies:
R(ABCD)
FD's: ACD → B ; AC → D ; D → C ; AC → B

Which of the following is true:

A. R is in neither BCNF nor 3NF
B. R is in BCNF but not 3NF
C. R is in 3NF but not in BCNF (correct)
D. R is in both BCNF and 3NF

ACD+ = ABCD
AC+ = ABCD
D+ = CD
AD+ = ABCD
Keys: AC, AD

D→C (D is not a superkey, so R is NOT in BCNF)

C is part of a minimal key, so R is in 3NF

Normalization 54
Comparing BCNF & 3NF
• BCNF guarantees removal of all anomalies caused by functional dependencies
• 3NF may leave some anomalies, but a 3NF decomposition can preserve all dependencies
• If a relation R is in BCNF it is in 3NF.

BCNF ⊂ 3NF ⊂ 2NF ⊂ 1NF

Normalization 55
Other normal forms
• Further normal forms exist which deal with issues not
covered by functional dependencies
• Fourth Normal Form deals with multi-valued dependencies
• Fifth Normal Form addresses more complex (and rarer) situations
where 4NF is not sufficient

Normalization 56
Normalization and Design

• Most organizations go to 3NF or better
• If a relation has only 2 attributes, it is automatically in 3NF and BCNF
• Our goal is to use lossless-join for all decompositions and preserve dependencies
• BCNF decomposition is always lossless, but may not preserve dependencies
• Good heuristic:
• Try to ensure that all relations are in at least 3NF
• Check for dependency preservation

Normalization 57
Normalization Drawbacks
• By limiting redundancy, normalization helps maintain consistency and saves space

• But performance of querying can suffer because related information that was stored in a single relation is now distributed among several
Transcript(StudId, CrsCode, Semester, Grade, Year)
Student(Id, Name, Address)
Class(CrsCode, Semester, Time, Textbook)

• Example: A join is required to get the names and grades of all students taking CPSC304 in S2019.
SELECT S.Name, T.Grade
FROM Student S, Transcript T
WHERE S.Id = T.StudId AND
T.CrsCode = 'CPSC304' AND T.Semester = 'S2019'
Normalization 58
On the other hand…
Denormalization
• Tradeoff: Cautiously introduce redundancy to improve performance of certain queries

• Example: Add attribute Name to Transcript, giving Transcript'

SELECT T.Name, T.Grade
FROM Transcript' T
WHERE T.CrsCode = 'CPSC304' AND T.Semester = 'S2019'
• Join is avoided
• If queries are asked more frequently than Transcript is modified, the added redundancy might improve average performance.

• But Transcript' is no longer in BCNF, since the key is (StudId, CrsCode, Semester) and StudId → Name

Normalization 59
Learning Goals Revisited
1. Debate the pros and cons of redundancy in a database.
2. Provide examples of update, insertion, and deletion
anomalies.
3. Given a set of tables and a set of functional dependencies
over them, determine all the keys for the tables.
4. Show that a table is/isn’t in 3NF or BCNF.
5. Prove/disprove that a given table decomposition is a
lossless join decomposition. Justify why lossless join
decompositions are preferred decompositions.
6. Decompose a table into a set of tables that are in 3NF, or
BCNF.

Normalization 60
Normalization
Consider the following database schema (R) . The attributes are ABCDEF. The
FDs are:
• ABF → C
• CF → B
• CD → A
• BD → AE
• C→F

(a) Which of the above 5 FDs, if any, violate BCNF? No explanation needed.

b) Is R in BCNF? If not, decompose this relation into BCNF.

Normalization 61
Normalization
Consider the following database schema (R) . The attributes are ABCDEF. The
FDs are:
• ABF → C
• CF → B
• CD → A
• BD → AE
• C→F

(a) Which of the above 5 FDs, if any, violate BCNF? No explanation needed.
All violate BCNF except CD → A (CD is a key of R: CD+ = ABCDEF)

b) Is R in BCNF? If not, decompose this relation into BCNF.

Normalization 62
(b)

Normalization 63
