Lecture #4-1. Normalization
Database design approaches
Top-down design Bottom-up design
Normalization definition
The process of structuring relations and removing redundant
data from them in order to improve storage efficiency
and data integrity.
The reason to use normalization
1. Minimizing data redundancy
Book
Name Author name Author email
There are two books written by the same author, Author1, so the author’s email
address is stored in the relation twice. This is redundant.
The reason to use normalization
2. Eliminating modification anomalies:
- Update anomaly
- Delete anomaly
- Insert anomaly
Update anomaly
Book
Delete anomaly
Book
Insert anomaly
Book
Normal forms
Codd introduced the concept of normalization and what is now known as the first normal form (1NF) in 1970.
Codd went on to define the second normal form (2NF) and third normal form (3NF) in 1971.
Codd and Raymond F. Boyce defined the Boyce-Codd normal form (BCNF) in 1974.
A relational database relation is often described as "normalized" if it meets third normal form.
Most 3NF relations are free of insertion, update, and deletion anomalies.
Normal forms
Normalization is a database design technique used to bring a relational database table up to a higher
normal form (NF).
The process is progressive, and a higher level of database normalization cannot be achieved unless the
previous levels have been satisfied.
1NF → 2NF → 3NF → BCNF → 4NF → 5NF → 6NF
Used in practice
Normal forms - Functional dependency
If one set of attributes in a relation (A) determines another set of attributes in the relation (B), then the second set of
attributes (B) is said to be functionally dependent (FD) on the first set of attributes (A).
Notation: A→B, where A is the determinant.
Examples:
Student ID → Student Name
City→Country
Username→Profile
Profile→Username
Email→First Name, Last Name, Age
Normal forms - Functional dependency
The following is a complete system of rules for functional dependencies:
That is, a given X-value must always occur with the same Y-value.
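It is like a 1:1 relationship, but only from left to right: each X-value has exactly one Y-value, while the same Y-value
may occur with several X-values (sometimes the dependency also holds in the opposite direction).
When X is a key, all fields are by definition functionally dependent on X, since there
cannot be two records with the same X value.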
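The statement "a given X-value must always occur with the same Y-value" can be tested against sample data. The following is a minimal sketch: the function name fd_holds and the book rows are hypothetical and not taken from the slides. Note that a functional dependency is a rule about all possible rows, so sample data can only disprove a dependency, never prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y for a new x and returns the stored value otherwise
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample rows (illustration only)
books = [
    {"Name": "Book1", "Author name": "Author1", "Author email": "author1@example.org"},
    {"Name": "Book2", "Author name": "Author1", "Author email": "author1@example.org"},
    {"Name": "Book3", "Author name": "Author2", "Author email": "author2@example.org"},
]

author_determines_email = fd_holds(books, ["Author name"], ["Author email"])  # True
author_determines_book = fd_holds(books, ["Author name"], ["Name"])  # False
name_determines_rest = fd_holds(books, ["Name"], ["Author name", "Author email"])  # True
```

The last check illustrates the remark about keys: Name is unique in the sample, so every other field depends on it.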
Normal forms - Functional dependency
Examples:
Functional dependencies:
ID→Name, ISBN, Author Name, Author Email, Pages (By definition of FD and PK + CK)
ISBN→Name, ID, Author Name, Author Email, Pages (By definition of FD and PK + CK)
ISBN→Name
Author Name→Author Email
Author Email→Author Name (A→B does not always imply B→A!)
ID→Author Name, Author Email and because of F5 the next two:
ID→Author Name
ID→Author Email
... and others.
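Dependencies such as ID→Author Email can be derived mechanically from a few given ones. The sketch below uses the standard attribute-closure algorithm, which the slides do not name; the helpers closure and follows are hypothetical, and the starting set of dependencies is a hand-picked subset of the list above.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know all of rhs
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def follows(lhs, rhs, fds):
    """Return True if lhs -> rhs can be derived from fds."""
    return set(rhs) <= closure(lhs, fds)


# A subset of the dependencies listed for the book relation
fds = [
    (("ID",), ("ISBN",)),
    (("ID",), ("Author Name",)),
    (("ISBN",), ("Name",)),
    (("Author Name",), ("Author Email",)),
]

id_closure = closure({"ID"}, fds)
id_to_email = follows(("ID",), ("Author Email",), fds)  # True, via Author Name
author_to_name = follows(("Author Name",), ("Name",), fds)  # False
```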
Normal forms - Functional dependency
Exercise:
Normal forms - Functional dependency
Exercise:
Alex Mary
2 Normal form (2NF)
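A relation is said to be in 2NF if both the following conditions hold:
1. It is in 1NF.
2. No non-prime attribute is functionally dependent on only a part of a candidate key (no partial dependencies).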
2NF is only an issue for relations with a composite primary key, i.e. a primary key
consisting of two or more attributes.
If a relation's primary key consists of a single attribute, the relation satisfies 2NF automatically.
2 Normal form (2NF). Example
Parts in warehouses
Parts  Warehouse  Quantity  Warehouse City
P1     W1         10        Kyiv
P2     W1         7         Kyiv
P1     W2         44        Lviv
Primary key:
(Parts, Warehouse)
Functional dependencies:
Parts, Warehouse→Quantity, Warehouse City (OK by Primary Key definition)
Parts, Warehouse→Quantity (OK)
Parts, Warehouse→Warehouse City (OK)
Warehouse→Warehouse City (Fails 2NF because Warehouse is just a part of the Primary Key)
Warehouse City→Warehouse (OK)
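The 2NF check for this example can be sketched in code: try every proper part of the composite primary key and see whether it alone determines a non-key attribute in the sample rows. The function names are hypothetical, and the sketch assumes (Parts, Warehouse) is the only candidate key, so that every other column is non-prime.

```python
from itertools import combinations

rows = [
    {"Parts": "P1", "Warehouse": "W1", "Quantity": 10, "Warehouse City": "Kyiv"},
    {"Parts": "P2", "Warehouse": "W1", "Quantity": 7, "Warehouse City": "Kyiv"},
    {"Parts": "P1", "Warehouse": "W2", "Quantity": 44, "Warehouse City": "Lviv"},
]
primary_key = ("Parts", "Warehouse")


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def partial_dependencies(rows, key):
    """List (key part, attribute) pairs where a proper part of the key
    determines a non-key attribute in the sample rows."""
    non_key = [a for a in rows[0] if a not in key]
    found = []
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for part in combinations(key, size):
            for attr in non_key:
                if fd_holds(rows, part, (attr,)):
                    found.append((part, attr))
    return found


violations = partial_dependencies(rows, primary_key)
# [(('Warehouse',), 'Warehouse City')]
```

Only Warehouse→Warehouse City is reported, which matches the dependency marked as failing 2NF above.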
2 Normal form (2NF). Example
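Parts in warehouses
Parts  Warehouse  Quantity  Warehouse City
P1     W1         10        Kyiv
P2     W1         7         Kyiv
P1     W2         44        Lviv
Decomposed into two relations:
Parts  Warehouse  Quantity
P1     W1         10
P2     W1         7
P1     W2         44
Warehouse  Warehouse City
W1         Kyiv
W2         Lviv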
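The split into two relations can be reproduced by projecting the original rows onto each set of columns, and a natural join on Warehouse shows that no information is lost. This is a minimal sketch; the names project, natural_join, stock and warehouses are hypothetical.

```python
rows = [
    {"Parts": "P1", "Warehouse": "W1", "Quantity": 10, "Warehouse City": "Kyiv"},
    {"Parts": "P2", "Warehouse": "W1", "Quantity": 7, "Warehouse City": "Kyiv"},
    {"Parts": "P1", "Warehouse": "W2", "Quantity": 44, "Warehouse City": "Lviv"},
]


def project(rows, attrs):
    """Keep only attrs and drop duplicate rows, preserving first-seen order."""
    seen = set()
    out = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


def natural_join(left, right):
    """Join two non-empty relations on the attributes they share."""
    common = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]


stock = project(rows, ("Parts", "Warehouse", "Quantity"))
warehouses = project(rows, ("Warehouse", "Warehouse City"))

# The city of W1 is now stored once instead of twice
rejoined = natural_join(stock, warehouses)
lossless = rejoined == rows  # True
```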
2 Normal form (2NF). Exercise
TV programmes
The task:
1) What are functional dependencies?
2) Does the relation satisfy 2NF? Why?
2 Normal form (2NF). Exercise
TV programmes
Functional dependencies:
TV Channel, Program→TV Channel Genre, Program Duration
Program→Program Duration (2NF violation!)
TV Channel→TV Channel Genre (2NF violation!)
2 Normal form (2NF). Exercise
TV Channel/Program
TV Channel  Program
TV1         Football Match
TV1         Skiing
TV2         Terminator 6
Functional dependencies:
Only trivial (why?)
Channel
TV Channel  TV Channel Genre
TV1         Sport
TV2         Movie
Functional dependencies:
TV Channel→TV Channel Genre
Program
Program  Program Duration
3 Normal form (3NF)
A non-prime attribute of R is an attribute that does not belong to any candidate key of R.
Every non-prime attribute of R must be functionally dependent directly on a primary/candidate key, i.e. there can be
no interdependencies between non-prime attributes.
3 Normal form (3NF). Example
Student University University Location
Functional dependencies:
Student→University, University Location
Student→University
Student→University Location
University→University Location
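Student→University→University Location is a transitive dependency through the non-key attribute University (3NF violation).
After decomposition into (Student, University) and (University, University Location): no transitive dependencies!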
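The 3NF reasoning for this example can be sketched as a check over the listed dependencies: a dependency X→A is a problem when X does not contain a candidate key and A is a non-prime attribute. The function name third_nf_violations is hypothetical, and the candidate keys are supplied by hand rather than computed.

```python
def third_nf_violations(fds, candidate_keys):
    """Return (determinant, attribute) pairs that break 3NF."""
    prime = set().union(*map(set, candidate_keys))
    violations = []
    for lhs, rhs in fds:
        is_superkey = any(set(key) <= set(lhs) for key in candidate_keys)
        for attr in rhs:
            if attr in lhs:
                continue  # trivial dependency, always allowed
            if not is_superkey and attr not in prime:
                violations.append((lhs, attr))
    return violations


# Original relation: Student is the only candidate key
student_fds = [
    (("Student",), ("University", "University Location")),
    (("University",), ("University Location",)),
]
before = third_nf_violations(student_fds, [("Student",)])
# [(('University',), 'University Location')]

# After splitting into (Student, University) and (University, University Location)
after_students = third_nf_violations(
    [(("Student",), ("University",))], [("Student",)]
)
after_universities = third_nf_violations(
    [(("University",), ("University Location",))], [("University",)]
)
```

Both relations after the split report no violations, because each remaining determinant is the key of its own relation.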
3 Normal form (3NF). Exercise
Books
B1 A1 [email protected] Criminal
B2 A1 [email protected] Romance
B3 A2 [email protected] Technology
Assumption:
One book is written by one author
The task:
1) What are functional dependencies?
2) Does the relation satisfy 3NF? Why?
3 Normal form (3NF). Exercise
Books
B1 A1 [email protected] Criminal
B2 A1 [email protected] Romance
B3 A2 [email protected] Technology
B3 [email protected] Technology
Is it possible to use ‘Author’ as the PK?
No transitive dependencies!
3 Normal form (3NF). Exercise
User
ID Username Email
1 a1 [email protected]
2 a2 [email protected]
3 Normal form (3NF). Exercise
User
ID Username Email
1 a1 [email protected]
2 a2 [email protected]
Functional dependencies:
ID→Username, Email
ID→Username
ID→Email
Username→Email
Email→Username
ID→Username→Email (transitive, but Username is a candidate key, not a non-key attribute. No problem)
ID→Email→Username (transitive, but Email is a candidate key, not a non-key attribute. No problem)
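Why these transitive chains are harmless becomes visible once the candidate keys are computed. The sketch below enumerates candidate keys by brute force using attribute closure. The helper names are hypothetical, and the dependencies Username→ID and Email→ID are an assumption (usernames and emails are unique) that the slide implies but does not list.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(combo)
    return keys


user_attrs = ("ID", "Username", "Email")
user_fds = [
    (("ID",), ("Username", "Email")),
    (("Username",), ("Email",)),
    (("Email",), ("Username",)),
    (("Username",), ("ID",)),  # assumption: usernames are unique
    (("Email",), ("ID",)),     # assumption: emails are unique
]

user_keys = candidate_keys(user_attrs, user_fds)
# [('ID',), ('Username',), ('Email',)]
non_prime = set(user_attrs) - set().union(*map(set, user_keys))
# set(): every attribute is prime, so 3NF cannot be violated

# For comparison: the Student relation from the earlier 3NF example
student_keys = candidate_keys(
    ("Student", "University", "University Location"),
    [(("Student",), ("University",)),
     (("University",), ("University Location",))],
)
# [('Student',)]
```

In the Student relation, University and University Location stay non-prime, which is why the same kind of chain is a violation there.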
Homework
Student  Faculty  Dean  Student Age  Dean Age  Student Photo