Normalization
UNIT III
Normal Forms Based on Primary Keys
Normalization of Relations
Practical Use of Normal Forms
Definitions of Keys and Attributes Participating in Keys
First Normal Form
Second Normal Form
Third Normal Form
Normalization of Relations
Proposed by Codd (1972): three normal forms (1NF, 2NF, 3NF)
Boyce and Codd later proposed a stronger form, BCNF
Normalization:
◼ The process of decomposing unsatisfactory
"bad" relations by breaking up their
attributes into smaller relations
Normal form:
◼ Condition using keys and FDs of a relation to
certify whether a relation schema is in a
particular normal form
2NF, 3NF, BCNF
◼ based on keys and FDs of a relation schema
4NF
◼ based on keys and multivalued dependencies (MVDs)
5NF
◼ based on keys and join dependencies (JDs)
Additional properties may be needed to
ensure a good relational design
◼ lossless join
◼ dependency preservation
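The lossless join property can be checked mechanically when a relation is split into exactly two parts. The sketch below uses the standard binary test: the split is lossless if the common attributes functionally determine all of one side. The attribute names and FDs are illustrative assumptions, and the helper names (closure, is_lossless_binary) are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its whole left side is already in the closure.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Binary lossless-join test: (r1 & r2) must determine all of r1 or all of r2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


# Assumed FDs for an illustrative employee/department relation.
FDS = [
    ({"SSN"}, {"ENAME"}),
    ({"SSN"}, {"DNUMBER"}),
    ({"DNUMBER"}, {"DMGRSSN"}),
]

# Splitting on DNUMBER keeps the join lossless.
good = is_lossless_binary({"SSN", "ENAME", "DNUMBER"}, {"DNUMBER", "DMGRSSN"}, FDS)
# With no shared attribute, the join is lossy.
bad = is_lossless_binary({"SSN", "ENAME"}, {"DNUMBER", "DMGRSSN"}, FDS)
print(good, bad)
```

This test only covers two-way decompositions; decompositions into more relations need a more general algorithm.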
Practical Use of Normal Forms
Normalization is carried out in practice so
that the resulting designs are of high quality
and meet the desirable properties
The practical utility of these normal forms
becomes questionable when the constraints on
which they are based are hard to understand or
to detect
Database designers need not normalize to
the highest possible normal form
◼ usually up to 3NF, BCNF or 4NF
Denormalization:
◼ The process of storing the join of higher normal form
relations as a base relation, which is in a lower
normal form
First Normal Form
Disallows
◼ composite attributes
◼ multivalued attributes
◼ nested relations; attributes whose values
for an individual tuple are non-atomic
Examples of transitive functional dependencies:
◼ SSN -> DMGRSSN is a transitive FD,
since SSN -> DNUMBER and DNUMBER ->
DMGRSSN hold
◼ SSN -> ENAME is non-transitive,
since there is no set of attributes X where SSN
-> X and X -> ENAME
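The two examples above can be reproduced with a small attribute-closure routine. This is a minimal sketch, not a full test for transitivity: it only looks for single-attribute intermediates, and it requires that the intermediate does not determine the starting attributes. The function names are hypothetical; the FDs are the ones named in the examples.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def intermediates(x, z, attributes, fds):
    """Single attributes Y with x -> Y and Y -> z, where Y does not determine x."""
    x_closure = closure(x, fds)
    found = []
    for y in sorted(attributes - x - {z}):
        y_closure = closure({y}, fds)
        if y in x_closure and z in y_closure and not x <= y_closure:
            found.append(y)
    return found


ATTRIBUTES = {"SSN", "ENAME", "DNUMBER", "DMGRSSN"}
FDS = [
    ({"SSN"}, {"ENAME"}),
    ({"SSN"}, {"DNUMBER"}),
    ({"DNUMBER"}, {"DMGRSSN"}),
]

print(intermediates({"SSN"}, "DMGRSSN", ATTRIBUTES, FDS))  # ['DNUMBER']
print(intermediates({"SSN"}, "ENAME", ATTRIBUTES, FDS))    # []
```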
Third Normal Form
A relation schema R is in third normal form
(3NF) if it is in 2NF and no non-prime attribute
A in R is transitively dependent on the primary
key
R can be decomposed into 3NF relations via the
process of 3NF normalization
NOTE:
◼ In X -> Y and Y -> Z, with X as the primary key, we
consider this a problem only if Y is not a candidate
key.
◼ When Y is a candidate key, there is no problem with
the transitive dependency.
◼ E.g., consider EMP (SSN, Emp#, Salary).
Here, SSN -> Emp# -> Salary and Emp# is a
candidate key.
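The NOTE can be checked with a short sketch that uses the general 3NF condition: every listed FD X -> A must have X as a superkey or A as a prime attribute. The FD Emp# -> SSN is an assumption added here to express "Emp# is a candidate key". The second relation (SSN, DNUMBER, DMGRSSN) is included only for contrast, and all function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets whose closure is everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


def violations_3nf(attributes, fds):
    """Listed FDs X -> A where X is not a superkey and A is non-prime."""
    prime = set().union(*candidate_keys(attributes, fds))
    found = []
    for lhs, rhs in fds:
        for attr in sorted(rhs - lhs):
            if closure(lhs, fds) != attributes and attr not in prime:
                found.append((tuple(sorted(lhs)), attr))
    return found


EMP = {"SSN", "Emp#", "Salary"}
EMP_FDS = [({"SSN"}, {"Emp#"}), ({"Emp#"}, {"SSN"}), ({"Emp#"}, {"Salary"})]

EMP_DEPT = {"SSN", "DNUMBER", "DMGRSSN"}
EMP_DEPT_FDS = [({"SSN"}, {"DNUMBER"}), ({"DNUMBER"}, {"DMGRSSN"})]

print(violations_3nf(EMP, EMP_FDS))            # []
print(violations_3nf(EMP_DEPT, EMP_DEPT_FDS))  # [(('DNUMBER',), 'DMGRSSN')]
```

EMP passes because the intermediate attribute Emp# is itself a key; the contrast relation fails because DNUMBER is not.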
Normal Forms Defined Informally
1st normal form
◼ All attributes depend on the key
2nd normal form
◼ All attributes depend on the whole key
3rd normal form
◼ All attributes depend on nothing but the
key
Figure: Successive normalization of LOTS into 2NF and 3NF
Table: Summary of normal forms based on primary keys
BCNF (Boyce-Codd Normal Form)
A relation schema R is in Boyce-Codd Normal
Form (BCNF) if whenever an FD X -> A holds
in R, then X is a superkey of R
Each normal form is strictly stronger than the
previous one
◼ Every 2NF relation is in 1NF
◼ Every 3NF relation is in 2NF
◼ Every BCNF relation is in 3NF
There exist relations that are in 3NF but not in
BCNF
The goal is to have each relation in BCNF (or
3NF)
Figure: A relation TEACH that is in 3NF but not in BCNF
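The BCNF condition (every FD X -> A has X as a superkey) can be sketched as a closure check. The TEACH figure is not reproduced here, so the schema and FDs below are assumptions taken from the usual textbook version of this example: {Student, Course} -> Instructor and Instructor -> Course. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial listed FDs whose left side is not a superkey."""
    found = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FD, never a violation
        if closure(lhs, fds) != attributes:
            found.append((tuple(sorted(lhs)), tuple(sorted(rhs))))
    return found


def split_on(attributes, lhs, rhs):
    """One decomposition step on a violating FD: (X union A) and (R minus A)."""
    return lhs | rhs, attributes - (rhs - lhs)


# Assumed schema and FDs for TEACH.
TEACH = {"Student", "Course", "Instructor"}
TEACH_FDS = [
    ({"Student", "Course"}, {"Instructor"}),
    ({"Instructor"}, {"Course"}),
]

print(bcnf_violations(TEACH, TEACH_FDS))  # [(('Instructor',), ('Course',))]
r1, r2 = split_on(TEACH, {"Instructor"}, {"Course"})
print(sorted(r1), sorted(r2))  # ['Course', 'Instructor'] ['Instructor', 'Student']
```

Under these assumed FDs, the split leaves {Student, Course} -> Instructor unenforceable within a single relation, so this decomposition trades dependency preservation for BCNF.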
Achieving BCNF by Decomposition
Table fragments recovered from the slide (column headers not preserved):
◼ 101 Alex
◼ Professor Table: Alex Java; Amit C++
Example 1: Determine NF
BOOK
ISBN → Title
ISBN → Publisher
Publisher → Address
All attributes are directly or indirectly determined
by the primary key; therefore, the relation is
at least in 1NF
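The phrase "directly or indirectly determined by the primary key" can be made concrete with a closure computation over the three FDs of Example 1. This is a sketch that assumes ISBN is the primary key of BOOK; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def direct_and_indirect(key, fds):
    """Split the attributes determined by key into direct and indirect ones."""
    direct = set()
    for lhs, rhs in fds:
        if lhs <= key:
            direct |= rhs  # right side of an FD that starts from the key
    direct -= key
    indirect = closure(key, fds) - key - direct
    return direct, indirect


BOOK = {"ISBN", "Title", "Publisher", "Address"}
BOOK_FDS = [
    ({"ISBN"}, {"Title"}),
    ({"ISBN"}, {"Publisher"}),
    ({"Publisher"}, {"Address"}),
]

direct, indirect = direct_and_indirect({"ISBN"}, BOOK_FDS)
print(sorted(direct))    # ['Publisher', 'Title']
print(sorted(indirect))  # ['Address']
print(closure({"ISBN"}, BOOK_FDS) == BOOK)  # True
```

Address is reached only through Publisher, which is the kind of transitive dependency that the 3NF definition rules out.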
ORDER
Product_ID → Description
In your solution you will write the
following justification:
1) No multivalued attributes, therefore at least 1NF
2) There is a partial dependency
(Product_ID → Description), therefore not
in 2NF
Conclusion: The relation is in 1NF
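Step 2 of the justification can be sketched as a search for proper subsets of the key that determine a non-key attribute. The ORDER schema is not shown in the text, so the composite key (Order_ID, Product_ID) and the Quantity attribute are hypothetical; only Product_ID → Description comes from the example. The sketch also assumes the relation has a single candidate key.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """Pairs (part_of_key, attr) where a proper subset of the key determines attr."""
    non_key = attributes - key
    found = []
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for part in combinations(sorted(key), size):
            for attr in sorted(closure(set(part), fds) & non_key):
                found.append((part, attr))
    return found


# Hypothetical schema; only Product_ID -> Description is given in the example.
ORDER = {"Order_ID", "Product_ID", "Description", "Quantity"}
ORDER_KEY = {"Order_ID", "Product_ID"}
ORDER_FDS = [
    ({"Product_ID"}, {"Description"}),
    ({"Order_ID", "Product_ID"}, {"Quantity"}),
]

print(partial_dependencies(ORDER_KEY, ORDER, ORDER_FDS))
# [(('Product_ID',), 'Description')]
```

An empty result would mean no partial dependency was found, so the relation would be at least in 2NF under the stated assumptions.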
PART
STUDENT
Stud_ID  Name
101      Lennon
125      Jonson
STUDENT_COURSE
Composite primary key
STUDENT COURSE
Transitive dependency
EMPLOYEE
DEPARTMENT
Dept_ID  Dept_Name
1        Acct
2        Mktg