Normalization
Motivation
When designing a relational database, we have choices about how to
design relations. A poor choice can lead to problems such as being
unable to record facts or easily update information, or the inadvertent
loss of information.
For example, suppose we choose to record information about students,
the courses that they are enrolled on, and their tutors for these courses
in a single relation, StudentTutorCourse.
(i) Suppose a new tutor, Meera, has been appointed on course C2 and given
the identifier T4, but has not yet been allocated any students. Can this
information be recorded in a new relation with the same heading as
StudentTutorCourse, whose body extends that of StudentTutorCourse to
include the new information?
(ii) Tutor T1, Ann, has decided that henceforth she wants to be known as
Albert. What problems might this pose for database maintenance?
(iii) Student S2, Belinda, has decided to withdraw from the university.
What problems might this pose?
These problems are commonly referred to as insertion, amendment and
deletion anomalies.
You may have noticed that this relational table contains redundant
information, which is what causes the amendment anomaly.
Functional dependency
Informally, an attribute r of a relation R is functionally dependent on a set
of attributes S = {A1, ..., An} of R if each value (a1, ..., an) determines a
single value of r, where a1 is a value of A1, a2 a value of A2, and so on.
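The definition above can be tested against sample data. The following is a minimal Python sketch, not part of the original slides: the function name holds and the attribute names StudentId, CourseCode and TutorId are hypothetical, and the rows are invented for illustration. Note that a sample can only refute a functional dependency; it cannot prove that the dependency holds for every possible body of the relation.

```python
def holds(rows, determinant, dependent):
    """Return True if, in these rows, each combination of determinant
    values is paired with a single value of the dependent attribute."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (assumed data, for illustration only)
rows = [
    {"StudentId": "S1", "CourseCode": "C1", "TutorId": "T1"},
    {"StudentId": "S1", "CourseCode": "C2", "TutorId": "T2"},
    {"StudentId": "S2", "CourseCode": "C1", "TutorId": "T1"},
]

print(holds(rows, ["StudentId", "CourseCode"], "TutorId"))  # True
print(holds(rows, ["StudentId"], "TutorId"))                # False
```

In the second call, student S1 appears with two different tutors, so StudentId alone does not determine TutorId in this sample.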
Write down the functional dependency that corresponds to each
single-valued fact (SVF).
Properties of functional dependencies
Property 1: Combining functional dependencies
If A->B and A->C, then A->{B, C}.
Property 3: Transitivity
If A->B and B->C, then A->C.
Property 4: Augmentation
This seemingly trivial property can be very useful in identifying new FDs,
and hence in elucidating more of the dependency structure of the data.
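One common way to apply such properties mechanically is to compute the set of attributes determined by a given set of attributes, often called its closure. The sketch below is an illustration only and does not come from the slides: the function name closure and the attributes A, B, C and D are hypothetical. Each pass of the loop applies a known FD whose left-hand side is already determined, which amounts to repeated use of combining and transitivity.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under `fds`,
    where `fds` is a list of (left-hand set, right-hand set) pairs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left-hand side is already determined,
            # the right-hand side is determined too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical FDs: A->B, B->C, A->D
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"D"})]

print(sorted(closure({"A"}, fds)))  # ['A', 'B', 'C', 'D']
print(sorted(closure({"B"}, fds)))  # ['B', 'C']
```

Here C appears in the closure of {A} only because A->B and B->C together give A->C.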
First Normal Form (1NF)
A relation is in first normal form if it does not contain any composite
or multi-valued attribute, that is, if every attribute in the relation is
single-valued. A relation that contains a composite or multi-valued
attribute violates first normal form.
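As a rough illustration of the single-valued requirement, the Python sketch below flags attributes whose values are collections and then flattens them into one row per value. It is a sketch under assumptions: the function name non_atomic_attributes, the PHONES attribute and the sample rows are all hypothetical and do not appear in the slides.

```python
def non_atomic_attributes(rows):
    """Return the names of attributes that hold a collection in any row."""
    offending = set()
    for row in rows:
        for name, value in row.items():
            if isinstance(value, (list, tuple, set, frozenset, dict)):
                offending.add(name)
    return offending


# Hypothetical rows with a multi-valued PHONES attribute (violates 1NF)
rows = [
    {"STUD_NO": 1, "PHONES": ["555-0100", "555-0101"]},
    {"STUD_NO": 2, "PHONES": ["555-0102"]},
]
print(non_atomic_attributes(rows))  # {'PHONES'}

# One row per (student, phone) pair makes every attribute single-valued
flat = [{"STUD_NO": r["STUD_NO"], "PHONE": p} for r in rows for p in r["PHONES"]]
print(non_atomic_attributes(flat))  # set()
```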
Second Normal Form (2NF)
A relation is in second normal form if it is in first normal form and
contains no partial dependency, i.e., no non-prime attribute (an
attribute that is not part of any candidate key) is dependent on a
proper subset of any candidate key of the table.
Partial dependency: a proper subset of a candidate key determines a
non-prime attribute.
STUD_NO   COURSE_NO   COURSE_FEE
1         C1          1000
2         C2          1500
1         C4          2000
4         C3          1000
4         C1          1000
2         C5          2000
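The partial dependency in this table can be checked directly against its rows. The Python sketch below is an illustration rather than part of the slides: the helper holds is hypothetical, and it assumes that the candidate key is {STUD_NO, COURSE_NO}, which the sample rows are consistent with but which the slides do not state.

```python
HEADING = ("STUD_NO", "COURSE_NO", "COURSE_FEE")
ROWS = [
    (1, "C1", 1000),
    (2, "C2", 1500),
    (1, "C4", 2000),
    (4, "C3", 1000),
    (4, "C1", 1000),
    (2, "C5", 2000),
]


def holds(rows, heading, determinant, dependent):
    """Return True if the determinant attributes fix a single value of
    the dependent attribute in every row of this sample."""
    det_idx = [heading.index(a) for a in determinant]
    dep_idx = heading.index(dependent)
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in det_idx)
        if seen.setdefault(key, row[dep_idx]) != row[dep_idx]:
            return False
    return True


# COURSE_NO alone determines COURSE_FEE: a partial dependency
# under the assumed key {STUD_NO, COURSE_NO}
print(holds(ROWS, HEADING, ["COURSE_NO"], "COURSE_FEE"))  # True
# STUD_NO alone determines neither of the other attributes
print(holds(ROWS, HEADING, ["STUD_NO"], "COURSE_FEE"))    # False
print(holds(ROWS, HEADING, ["STUD_NO"], "COURSE_NO"))     # False
```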
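In the relation above, COURSE_FEE depends on COURSE_NO alone, which is a
proper subset of the candidate key {STUD_NO, COURSE_NO}. To convert the
relation to 2NF, decompose it into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE

Table 1
STUD_NO   COURSE_NO
1         C1
2         C2
1         C4
4         C3
4         C1
2         C5

Table 2
COURSE_NO   COURSE_FEE
C1          1000
C2          1500
C3          1000
C4          2000
C5          2000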
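The decomposition can be sketched in Python as two projections followed by a join on COURSE_NO, to confirm that no information is lost. This is an illustrative sketch only: the variable names are hypothetical, and the rows are those of the original STUD_NO, COURSE_NO, COURSE_FEE table.

```python
# Rows of the original relation: (STUD_NO, COURSE_NO, COURSE_FEE)
original = {
    (1, "C1", 1000),
    (2, "C2", 1500),
    (1, "C4", 2000),
    (4, "C3", 1000),
    (4, "C1", 1000),
    (2, "C5", 2000),
}

# Projections; using sets removes duplicate rows automatically
table1 = {(stud, course) for stud, course, _ in original}
table2 = {(course, fee) for _, course, fee in original}

# Join the two tables back together on COURSE_NO
rejoined = {
    (stud, course, fee)
    for stud, course in table1
    for course2, fee in table2
    if course == course2
}

print(len(table1), len(table2))  # 6 5
print(rejoined == original)      # True
```

The fee for course C1 is now stored once in Table 2 rather than once per enrolled student, and the join reproduces the original rows exactly.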
Third Normal Form (3NF)
A relation is in third normal form if it is in second normal form and
no non-prime attribute is transitively dependent on a candidate key.
Transitive dependency: if A->B and B->C are two FDs, then A->C is
called a transitive dependency.
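A simplified check for transitive dependencies can be sketched in Python as follows. This is an assumption-laden illustration, not the method of the slides: the function names, the attributes A, B and C, and the FDs are hypothetical; the relation is assumed to have a single candidate key and to be in 2NF already. The check flags any FD whose left-hand side does not determine every attribute (so it is not a superkey) and whose right-hand side contains a non-prime attribute.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def transitive_dependencies(all_attributes, key, fds):
    """List (determinant, attribute) pairs in which a non-prime attribute
    depends on a determinant that is not a superkey."""
    found = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= all_attributes:
            continue  # lhs is a superkey, so this FD is acceptable
        for attribute in sorted(rhs - lhs - key):
            found.append((sorted(lhs), attribute))
    return found


# Hypothetical relation R(A, B, C) with key {A} and FDs A->B, B->C
attributes = {"A", "B", "C"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

print(transitive_dependencies(attributes, {"A"}, fds))  # [(['B'], 'C')]
```

The result indicates that C depends on the key A only through B, so under these assumptions the relation would be split into (A, B) and (B, C) to reach 3NF.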
Example 1 – In relation STUDENT given in Table 4, FD set: