Unit 3 - Normalization
Normalization
Goonjan Jain
Department of Applied Mathematics
Delhi Technological University
Overview
• DB design and normalization
• Pitfalls of bad design
• Redundancy – wasted space, inconsistencies, update anomalies
• Decomposition
• lossless-join decomposition
• dependency preserving
• Normal forms
Decompositions
There are “bad” decompositions. Good ones are:
• lossless and
• dependency preserving
Decompositions – Lossy:
• R1(roll_no, grade, name, address) R2(course_id, grade)
• An easier criterion?
Decomposition - Lossless
• Theorem: a decomposition of R into R1 and R2 is a lossless join if the joining attributes (R1 ∩ R2) form a superkey of at least one of the new tables
• Formally, at least one of these FDs must hold:
R1 ∩ R2 → R1 or
R1 ∩ R2 → R2
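The criterion above can be checked mechanically with an attribute closure. The following is a minimal sketch, not a definitive implementation: the helper names `closure` and `is_lossless_binary` are illustrative, and the FDs are assumed to be those of the student relation used in the 2NF example of this unit (roll_no, course_id → grade and roll_no → name, address).

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, the right side is too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Check whether R1 ∩ R2 → R1 or R1 ∩ R2 → R2 holds."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


# Assumed FDs for the student relation (taken from the 2NF example).
fds = [
    ({"roll_no", "course_id"}, {"grade"}),
    ({"roll_no"}, {"name", "address"}),
]

# The lossy split from the earlier slide: the tables share only `grade`.
lossy = is_lossless_binary(
    {"roll_no", "grade", "name", "address"}, {"course_id", "grade"}, fds
)
# The split on roll_no: roll_no determines every attribute of the second table.
lossless = is_lossless_binary(
    {"roll_no", "course_id", "grade"}, {"roll_no", "name", "address"}, fds
)
print(lossy, lossless)
```

The first call fails because `grade` alone determines nothing else, while in the second call the shared attribute `roll_no` is a key of the second table.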
Decompositions – Lossless:
R1
Roll_no      Course_id   Grade
2017/MC/24   MC302       A
2017/MC/24   MC304       A+
2017/MC/78   MC306       A
R2
Roll_no      Name    Address
2017/MC/24   Aman    Prime
2017/MC/78   Rohit   Main
A B C D E F
R1 x x
R2 x x x x x
No change in the matrix after applying the FDs, so no row becomes all x. Thus the test fails and the decomposition is lossy.
Example 2
R {A, B, C, D, E, F}
F = {A → B, C → DE, AC → F}
R1{A, B} R2{C, D, E} R3{A, C, F}
Initial matrix:
      A   B   C   D   E   F
R1    x   x
R2            x   x   x
R3    x       x           x
Example 2 (continued)
Applying A → B (R1 and R3 agree on A) and C → DE (R2 and R3 agree on C):
      A   B   C   D   E   F
R1    x   x
R2            x   x   x
R3    x   x   x   x   x   x
Row R3 is all x, so the test succeeds and the decomposition is lossless.
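The matrix test used in Example 2 can be sketched in code as follows. This is an illustrative version that assumes single-letter attribute names; the function name `chase_is_lossless` and the second (lossy) decomposition are hypothetical additions, not part of the slides.

```python
def chase_is_lossless(attributes, decomposition, fds):
    """Matrix (chase) test: True if some row becomes all 'x'."""
    # Row i holds 'x' for attributes of Ri and a unique placeholder otherwise.
    rows = [
        {a: "x" if a in rel else f"b{i}_{a}" for a in attributes}
        for i, rel in enumerate(decomposition)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r in rows:
                for s in rows:
                    if r is s or any(r[a] != s[a] for a in lhs):
                        continue
                    # Rows agree on the left side: make them agree on the right.
                    for b in rhs:
                        if r[b] != s[b]:
                            keep = "x" if "x" in (r[b], s[b]) else min(r[b], s[b])
                            drop = s[b] if keep == r[b] else r[b]
                            for row in rows:
                                if row[b] == drop:
                                    row[b] = keep
                            changed = True
    return any(all(row[a] == "x" for a in attributes) for row in rows)


# Example 2 from the slides; attributes are single letters, so strings act as sets.
fds = [("A", "B"), ("C", "DE"), ("AC", "F")]
example2 = chase_is_lossless("ABCDEF", ["AB", "CDE", "ACF"], fds)

# Hypothetical variant: R3 loses attribute A, so A → B can never fire across rows.
variant = chase_is_lossless("ABCDEF", ["AB", "CDE", "CF"], fds)
print(example2, variant)
```

For Example 2 the row for R3 ends up all x, matching the final matrix; in the hypothetical variant no row is ever completed.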
• Objectives of Normalization
• Free relations from undesirable insertion, update and deletion dependencies.
• Increase life span of application programs
Levels of Normalization
• In total, six Normal Forms are defined but Third Normal Form (3NF) is considered sufficient for
most business database design purposes.
Insertion Anomaly
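• Occurs when certain attributes cannot be inserted into the database without the presence of other attributes.
• E.g. Student (roll_no, courseid, name, address, course_title): with roll_no in the primary key, a new course_title cannot be stored until some student takes the course.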
• Steps (1NF to 2NF):
• Identify the primary key for the 1NF relation.
• Identify the functional dependencies in the relation.
• If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.
2NF
Counter-example:
Table1(roll_no, course_id, grade, name, address)
roll_no, course_id → grade
roll_no → name, address
[Figure: dependency diagram for Table1 (roll_no, course_id, grade, name, address)]
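The steps for removing partial dependencies can be sketched on Table1. This is a simplified illustration: the function name `to_2nf` is hypothetical, and it assumes every partial dependency is listed explicitly in the FD set (it does not derive implied FDs).

```python
def to_2nf(attributes, primary_key, fds):
    """Move attributes that depend on part of the key into their own relations."""
    relations = []
    moved = set()
    for lhs, rhs in fds:
        # Partial dependency: the determinant is a proper subset of the key.
        if lhs < primary_key:
            dependents = rhs - primary_key
            # New relation = copy of the determinant + the dependent attributes.
            relations.append(lhs | dependents)
            moved |= dependents
    # The original relation keeps the key and the fully dependent attributes.
    relations.insert(0, attributes - moved)
    return relations


attributes = {"roll_no", "course_id", "grade", "name", "address"}
primary_key = {"roll_no", "course_id"}
fds = [
    ({"roll_no", "course_id"}, {"grade"}),
    ({"roll_no"}, {"name", "address"}),
]

relations = to_2nf(attributes, primary_key, fds)
for rel in relations:
    print(sorted(rel))
```

Here name and address move to a relation keyed by roll_no, while grade stays with the full key (roll_no, course_id).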
2NF
Roll_no      Course_id   Grade
2017/MC/24   MC302       A
2017/MC/24   MC304       A+
2017/MC/78   MC306       A

Roll_no      Name    Address
2017/MC/24   Aman    Prime
2017/MC/78   Rohit   Main
Emp #   Project #   Project Loc
E101    P101        India
E102    P102        Australia
E103    P101        India
Project Loc is partially dependent on the primary key {Emp #, Project #}.
[Figure: dependency diagram over roll_no, course_id, grade, name, address]
BCNF
R1
Roll_no      Course_id   Grade
2017/MC/24   MC302       A
2017/MC/24   MC304       A+
2017/MC/78   MC306       A
R2
Roll_no      Name    Address
2017/MC/24   Aman    Prime
2017/MC/24   Aman    Prime
2017/MC/78   Rohit   Main
[Figure: two dependency diagrams over roll_no, aadhar_no, name, address]
BCNF
• But not
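3NF vs BCNF
If R is in BCNF, it is always in 3NF (but not the reverse).
For every nontrivial FD A → B:
• 3NF:
A must be a superkey, or B must be a prime attribute (part of some candidate key). So A → B is allowed even when A is not a candidate key, provided B is prime.
• BCNF:
A must be a superkey (it must contain a candidate key).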
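The two conditions can be compared in code. The sketch below uses the STJ relation from the example that follows (T → J and S, J → T); the helper names are illustrative, and candidate keys are found by brute force, which is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= attributes:
                keys.append(candidate)
    return keys


def is_bcnf(attributes, fds):
    """Every nontrivial FD must have a superkey on its left side."""
    return all(rhs <= lhs or closure(lhs, fds) >= attributes for lhs, rhs in fds)


def is_3nf(attributes, fds):
    """Like BCNF, but an FD is also allowed if its right side is prime."""
    prime = set().union(*candidate_keys(attributes, fds))
    return all(
        closure(lhs, fds) >= attributes or (rhs - lhs) <= prime
        for lhs, rhs in fds
    )


stj = {"S", "T", "J"}
fds = [({"T"}, {"J"}), ({"S", "J"}, {"T"})]

keys = candidate_keys(stj, fds)
print(keys, is_3nf(stj, fds), is_bcnf(stj, fds))
```

STJ has candidate keys {S, J} and {S, T}, so every attribute is prime: T → J passes the 3NF test but fails the BCNF test because T is not a superkey.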
3NF vs BCNF
STJ(Student, Teacher, subJect)
T → J
S, J → T
1) R1(T,J) R2(S,J)
(BCNF? Y - lossless? N - dep. pres.? N)
2) R1(T,J) R2(S,T)
(BCNF? Y - lossless? Y - dep. pres.? N)
In 1) the common attribute J is a key of neither table, so the join is lossy. In 2) the common attribute T is a key of R1 (T → J), so the join is lossless. In both cases S, J → T spans two tables and is not preserved.
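The "dep. pres." answers can be checked with the standard closure-based test, which avoids computing the projected FD sets explicitly. This is a minimal sketch; the function names `preserves` and `is_dependency_preserving` are illustrative.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves(fd, decomposition, fds):
    """Check whether one FD can be enforced using only the decomposed tables."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for rel in decomposition:
            # Attributes derivable inside this table alone.
            gained = closure(result & rel, fds) & rel
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result


def is_dependency_preserving(decomposition, fds):
    return all(preserves(fd, decomposition, fds) for fd in fds)


fds = [({"T"}, {"J"}), ({"S", "J"}, {"T"})]
option1 = [{"T", "J"}, {"S", "J"}]
option2 = [{"T", "J"}, {"S", "T"}]

print(is_dependency_preserving(option1, fds))
print(is_dependency_preserving(option2, fds))
print(preserves(fds[0], option1, fds))
```

Both decompositions keep T → J inside R1, but neither has a single table containing S, J and T, and the FD S, J → T cannot be rebuilt from the tables separately.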
A1      Thick Crust     S1
A1      Thick Crust     S2
A1      Thick Crust     S3
A1      Stuffed Crust   S1
Elite   Thin Crust      S3
Elite   Stuffed Crust   S3
Elite   Thin Crust      S3
1. Candidate keys?
2. All MVDs?
3. Does the pizza variety offered affect the delivery area? Discuss for both ‘yes’ and ‘no’.
4. Is the relation in 4NF?
5. If not, transform.
Take Away – Normalize till BCNF
• Relation – Lots(Property_id#, country_name, lot#, area, price,
tax_rate)
• FD1: P → CLAPrT
• FD2: CL → PAPrT
• FD3: C → T
• FD4: A → Pr
• FD5: A → C
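One way to approach this exercise is the usual BCNF decomposition loop: find an FD whose left side is not a superkey, split on it, and repeat. The sketch below is one possible version under stated assumptions: the letters are read as P = Property_id#, C = country_name, L = lot#, A = area, Pr = price, T = tax_rate, and the function name `bcnf_decompose` is illustrative. The result depends on the order in which violations are picked, and dependency preservation is not guaranteed.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_decompose(attrs, fds):
    """Recursively split a relation on the first BCNF violation found."""
    attrs = frozenset(attrs)
    for size in range(1, len(attrs)):
        for combo in combinations(sorted(attrs), size):
            lhs = frozenset(combo)
            determined = closure(lhs, fds) & attrs
            # Violation: lhs determines something new but not the whole relation.
            if determined != lhs and determined != attrs:
                left = determined
                right = lhs | (attrs - determined)
                return bcnf_decompose(left, fds) + bcnf_decompose(right, fds)
    return [attrs]


lots = {"P", "C", "L", "A", "Pr", "T"}
fds = [
    ({"P"}, {"C", "L", "A", "Pr", "T"}),   # FD1
    ({"C", "L"}, {"P", "A", "Pr", "T"}),   # FD2
    ({"C"}, {"T"}),                        # FD3
    ({"A"}, {"Pr"}),                       # FD4
    ({"A"}, {"C"}),                        # FD5
]

relations = bcnf_decompose(lots, fds)
for rel in relations:
    print(sorted(rel))
```

Each split shares the violating left side between the two new tables, so every step is a lossless join by the superkey criterion.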
Fifth Normal Form
• Also called Project-Join Normal Form (PJNF)
• A relation is in 5NF if it
• is in 4NF, and
• has no lossless decomposition into smaller tables other than those implied by its candidate keys, i.e. it cannot be usefully decomposed further.
• A table T has a Join Dependency (JD) if
• T can be recreated by joining multiple tables, and
• each of these tables has a subset of the attributes of T
• Trivial JD – one of the tables has all the attributes of T
• Relationships in a JD are independent of each other
Example
Agent Company Product
Smith Ford Car
Smith Ford Truck
Smith GM Car
Smith GM Truck
Jones Ford Car
Jones Ford Truck
Brown Ford Car
Brown GM Car
Brown Toyota Car
Brown Toyota Bus
• Jones sells cars and GM makes cars, but Jones does not represent GM
• Brown represents Ford and Ford makes trucks, but Brown does not sell trucks
• Brown represents Ford and Brown sells buses, but Ford does not make buses
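The join dependency in this example can be checked directly by projecting the table onto its three attribute pairs and joining the projections back. The sketch below uses plain Python sets; the variable names are illustrative.

```python
rows = {
    ("Smith", "Ford", "Car"), ("Smith", "Ford", "Truck"),
    ("Smith", "GM", "Car"), ("Smith", "GM", "Truck"),
    ("Jones", "Ford", "Car"), ("Jones", "Ford", "Truck"),
    ("Brown", "Ford", "Car"), ("Brown", "GM", "Car"),
    ("Brown", "Toyota", "Car"), ("Brown", "Toyota", "Bus"),
}

# The three binary projections.
agent_company = {(a, c) for a, c, p in rows}
company_product = {(c, p) for a, c, p in rows}
agent_product = {(a, p) for a, c, p in rows}

# Joining all three projections.
three_way = {
    (a, c, p)
    for a, c in agent_company
    for c2, p in company_product
    if c == c2 and (a, p) in agent_product
}

# Joining only two projections at a time.
ac_cp = {(a, c, p) for a, c in agent_company for c2, p in company_product if c == c2}
ac_ap = {(a, c, p) for a, c in agent_company for a2, p in agent_product if a == a2}
cp_ap = {(a, c, p) for c, p in company_product for a, p2 in agent_product if p == p2}

print(three_way == rows)
print(sorted(ac_cp - rows))  # spurious rows from Agent-Company with Company-Product
print(sorted(ac_ap - rows))  # spurious rows from Agent-Company with Agent-Product
print(sorted(cp_ap - rows))  # spurious rows from Company-Product with Agent-Product
```

The three-way join returns exactly the original ten rows, while every two-way join produces spurious rows such as (Brown, Ford, Truck), (Brown, Ford, Bus) and (Jones, GM, Car), which are the three cases listed in the bullets above.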
Conclusion