Lecture 11
Normalization
Lecture 10
In this lecture you will learn
• Mathematical notions behind relational model
• Normalization
Introduction
• Relations derived from an ER model may be ‘faulty’
– Deriving them is a subjective process.
– Faulty relations may cause data redundancy and insert/delete/update anomalies.
Insertion Anomaly…
1. Inserting a new lecturer into the LECTURER table
- Department information is repeated for every lecturer, so each insert must ensure that the correct department information is entered.
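The insertion anomaly can be illustrated with a small sketch. The column names and values below are invented for illustration; the slide only names the LECTURER table. The helper simply reports departments whose repeated details disagree after an insert.

```python
# Flat LECTURER table: department details repeat on every lecturer row.
# Column names and values are invented for illustration.
lecturer = [
    {"lecturer_id": 1, "name": "Perera", "dept_no": "D1", "dept_name": "Computing"},
    {"lecturer_id": 2, "name": "Silva", "dept_no": "D1", "dept_name": "Computing"},
]

def inconsistent_departments(rows):
    """Return dept_no values that appear with more than one dept_name."""
    names = {}
    for row in rows:
        names.setdefault(row["dept_no"], set()).add(row["dept_name"])
    return {dept for dept, seen in names.items() if len(seen) > 1}

# A new lecturer is inserted with a mistyped department name.
lecturer.append(
    {"lecturer_id": 3, "name": "Fernando", "dept_no": "D1", "dept_name": "Computng"}
)
print(inconsistent_departments(lecturer))
```

Nothing in the flat table prevents the mistyped name, which is why the repeated department information has to be checked on every insert.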
Deletion Anomalies…
Updating Anomalies…
S                R1           R2
S   P   D        S   P        P   D
S1  P1  D1       S1  P1       P1  D1
S2  P2  D2       S2  P2       P2  D2
S3  P1  D3       S3  P1       P1  D3
Loss-less join property (contd.)
Joining them together, we get spurious tuples…
R1 joined with R2
S   P   D
S1  P1  D1
S1  P1  D3
S2  P2  D2
S3  P1  D1
S3  P1  D3
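The spurious tuples can be reproduced with a few lines of Python. This is a minimal sketch that models each relation as a set of tuples; the function name `natural_join` is an illustrative choice, not something from the slides.

```python
# Relation S from the slide, as a set of (S, P, D) tuples.
S = {("S1", "P1", "D1"), ("S2", "P2", "D2"), ("S3", "P1", "D3")}

# Project S onto (S, P) and (P, D) to obtain R1 and R2.
R1 = {(s, p) for (s, p, d) in S}
R2 = {(p, d) for (s, p, d) in S}

def natural_join(r1, r2):
    """Join R1(S, P) with R2(P, D) on the shared attribute P."""
    return {(s, p, d) for (s, p) in r1 for (p2, d) in r2 if p == p2}

joined = natural_join(R1, R2)

# Tuples that appear in the join but were never in S are spurious.
spurious = joined - S

print(sorted(joined))
print(sorted(spurious))
```

The join returns five tuples, two of which, (S1, P1, D3) and (S3, P1, D1), were not in S. The decomposition into R1 and R2 is therefore not loss-less.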
The Process of Normalization
• Formal technique for analyzing a relation based on its primary key and
functional dependencies between its attributes.
TEACHER COURSE
Functional Dependency
• Diagrammatic representation:
Sally Singer 123 Broadway New York, NY, 11234 (111) 222-3345
Jason Jumper 456 Jolly Jumper St. Trenton NJ, 11547 (222) 334-5566
Another example of UNF
Example 2 – repeating columns for each client & composite name field
TS-89 | Gilroy Gladstone | US Corp. | 14 hrs | Taggarts | 26 hrs | Kilroy Inc. | 9 hrs
Or by
– placing repeating data along with copy of the original key
attribute(s) into a separate relation.
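As a rough sketch of this second approach, the repeating client columns from Example 2 can be moved into a separate relation that carries a copy of the key. The field names (`id`, `name`, `clients`) are assumptions made for illustration, since the slide shows values only, and the composite name field is left untouched here.

```python
# One unnormalised row: the key, a name, then repeating (client, hours) columns.
# Field names are illustrative assumptions; the slide shows values only.
unf_row = {
    "id": "TS-89",
    "name": "Gilroy Gladstone",
    "clients": [("US Corp.", 14), ("Taggarts", 26), ("Kilroy Inc.", 9)],
}

def to_1nf(row):
    """Split a row into a parent tuple and child tuples that carry a copy of the key."""
    parent = (row["id"], row["name"])
    children = [(row["id"], client, hours) for client, hours in row["clients"]]
    return parent, children

parent, client_hours = to_1nf(unf_row)
print(parent)
for child in client_hours:
    print(child)
```

Each child tuple repeats the key value TS-89, so the two relations can be joined back together on that key.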
Normalization (contd.)
For example…
DEPARTMENT (Dname, Dnumber, DMGRSSN, DLocation)
TEACH
STUDENT   COURSE              TEACHER    CAMPUS
Narayan   Database            ABC        Metro
Smith     Database            XYZ        Malabe
Nalin     Operating Systems   Samantha   Metro
Kamal     Operating Systems   ABC        Malabe
Janith    Database            ABC        Metro
Ranil     Operating Systems   Samantha   Metro
Saman     Operating Systems   ABC        Malabe
Ruwan     Database            XYZ        Malabe
{Teacher, Campus} → Course
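A functional dependency can be tested against the sample rows of TEACH. The sketch below uses a hypothetical helper, `fd_holds`, that returns False as soon as two rows agree on the left-hand side but differ on the right-hand side. A sample can only refute a dependency; a True result means the rows shown are consistent with it.

```python
COLUMNS = ("Student", "Course", "Teacher", "Campus")
TEACH = [
    ("Narayan", "Database", "ABC", "Metro"),
    ("Smith", "Database", "XYZ", "Malabe"),
    ("Nalin", "Operating Systems", "Samantha", "Metro"),
    ("Kamal", "Operating Systems", "ABC", "Malabe"),
    ("Janith", "Database", "ABC", "Metro"),
    ("Ranil", "Operating Systems", "Samantha", "Metro"),
    ("Saman", "Operating Systems", "ABC", "Malabe"),
    ("Ruwan", "Database", "XYZ", "Malabe"),
]

def fd_holds(rows, columns, lhs, rhs):
    """Return True if lhs -> rhs is not violated by any pair of rows."""
    index = {name: i for i, name in enumerate(columns)}
    seen = {}
    for row in rows:
        left = tuple(row[index[a]] for a in lhs)
        right = tuple(row[index[a]] for a in rhs)
        # Remember the first right-hand value seen for each left-hand value.
        if seen.setdefault(left, right) != right:
            return False
    return True

print(fd_holds(TEACH, COLUMNS, ["Teacher", "Campus"], ["Course"]))
print(fd_holds(TEACH, COLUMNS, ["Teacher"], ["Course"]))
```

The first check succeeds. The second fails because teacher ABC appears with both Database and Operating Systems, so Teacher alone does not determine Course.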
Normalization (contd.)
2nd Normal Form:
• A relation R is in second normal form (2NF) if every
nonprime attribute A in R is not partially dependent on any
key of R.
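One way to look for 2NF violations is to compute the attribute closure of every proper subset of the key and see which nonprime attributes it reaches. The sketch below applies this to the course and student example used later in the lecture. The functional dependencies listed in `fds` are assumptions made for illustration, and the code assumes that (CourseCode, StudentCode) is the only candidate key.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, fds):
    """Map each proper, non-empty subset of key to the nonprime attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) - set(key)
            if determined:
                found[subset] = determined
    return found

# Assumed dependencies; they are not spelled out on the slides.
fds = [
    (frozenset({"CourseCode"}), frozenset({"CourseTitle"})),
    (frozenset({"StudentCode"}),
     frozenset({"StudentName", "DateOfBirth", "TutorCode"})),
    (frozenset({"TutorCode"}), frozenset({"TutorName"})),
    (frozenset({"CourseCode", "StudentCode"}), frozenset({"Grade", "Result"})),
]
key = {"CourseCode", "StudentCode"}

violations = partial_dependencies(key, fds)
for subset, attrs in violations.items():
    print(subset, "->", sorted(attrs))
```

Under these assumptions the relation is not in 2NF: CourseTitle depends on CourseCode alone, and the student and tutor attributes depend on StudentCode alone. Grade and Result depend on the whole key and raise no violation.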
Theorem:
CAMPUS    ADDRESS
Metro     BoC Merchant Tower
Malabe    Malabe Campus
Another Example
EMP_PROJ (diagram with functional dependencies FD1, FD2 and FD3)
Normalization (contd.)
3rd Normal Form:
• A relation R is in third normal form (3NF) if
– R is in 2NF, and
– No nonprime attribute is transitively dependent on any key.
Transitive dependency
• A nonprime attribute is dependent on another attribute that is not part of the primary key.
• Removing it requires the decomposition of the table containing the transitive dependency.
EMP_DEPT (diagram with functional dependencies FD1, FD2, FD3 and FD4)
UNF
Remove repeating groups
1NF
Remove partial dependencies
2NF
Remove transitive dependencies
3NF
• Normalization complete
• COURSE
• STUDENT
• TUTOR
• GRADE?
Relational Data Analysis
Unnormalised Form
UNF             LEVEL
Course Code     1
Course Title    1
Student Code    2
Student Name    2
Date of Birth   2
Tutor Code      2
Tutor Name      2
Grade           2
Result          2
Relational Data Analysis
Enter values of non-repeating attribute: SYA, COB, PAS
Relational Data Analysis
COB COBOL
PAS Pascal
Relational Data Analysis
Course Code | Student Code | Student Name | Date of Birth | Tutor Code | Tutor Name | Grade | Result
Normalisation Table
UNF             LEVEL   1NF             2NF
Course Code     1       Course Code
Course Title    1       Course Title
Student Code    2
Student Name    2       Course Code
Date of Birth   2       Student Code
Tutor Code      2       Student Name
Tutor Name      2       Date of Birth
Grade           2       Tutor Code
Result          2       Tutor Name
                        Grade
                        Result
Relational Data Analysis
• Course Code
• Student Code
• Course Code + Student Code
Relational Data Analysis
Student Code
Tutor Code
Course Code
Relational Data Analysis
Student Name
Date of Birth
Course Code Tutor Code
Student Code Tutor Name
Grade
Result
Relational Data Analysis
Summary:
• choose a suitable key from a table of raw data
• identify repeating groups
• write the data in unnormalised form
• convert unnormalised data to first normal form
• convert first normal form to second normal form
• convert second normal form to third normal form
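The steps in the summary can be sketched end to end on the course data. Only the course codes and titles (COB, PAS) come from the slides. The student, tutor, and grade values are invented, and the dependencies used for the 2NF and 3NF splits are assumptions. The helper `project` is an illustrative name.

```python
# Unnormalised data: each course (level 1) holds a repeating group of students (level 2).
# Course codes and titles come from the slides; the student and tutor rows are invented.
unf = [
    {"course_code": "COB", "course_title": "COBOL", "students": [
        ("S01", "Ann", "2000-01-15", "T1", "Lee", "A", "Pass"),
        ("S02", "Raj", "1999-07-30", "T2", "Kim", "D", "Fail"),
    ]},
    {"course_code": "PAS", "course_title": "Pascal", "students": [
        ("S01", "Ann", "2000-01-15", "T1", "Lee", "B", "Pass"),
    ]},
]

def project(rows, *columns):
    """Project dict rows onto columns, dropping duplicate tuples while keeping order."""
    seen = []
    for row in rows:
        projected = tuple(row[c] for c in columns)
        if projected not in seen:
            seen.append(projected)
    return seen

# 1NF: flatten the repeating group, copying the course key onto every student row.
fields = ("student_code", "student_name", "date_of_birth",
          "tutor_code", "tutor_name", "grade", "result")
enrolment_1nf = [
    {"course_code": c["course_code"], **dict(zip(fields, s))}
    for c in unf for s in c["students"]
]
course = project(unf, "course_code", "course_title")

# 2NF: move attributes that depend on only part of (course_code, student_code).
student_2nf = project(enrolment_1nf, "student_code", "student_name",
                      "date_of_birth", "tutor_code", "tutor_name")
grades = project(enrolment_1nf, "course_code", "student_code", "grade", "result")

# 3NF: tutor_name depends on tutor_code, not directly on student_code.
student = [row[:4] for row in student_2nf]
tutor = project(enrolment_1nf, "tutor_code", "tutor_name")

for name, rel in [("COURSE", course), ("STUDENT", student),
                  ("TUTOR", tutor), ("GRADE", grades)]:
    print(name, rel)
```

With these assumptions the result is four relations, one each for courses, students, tutors, and grades, and every non-key value is stored once.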
Another Example
• The following report is a User View
Figure 4.3
1NF Summarised
• All key attributes defined
Types of Dependencies
2NF Conversion Results
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
A table is in 3NF when it:
• Is in 2NF
• Contains no transitive dependencies
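The EMPLOYEE relation listed above keeps CHG_HOUR next to JOB_CLASS. Assuming the hourly charge is determined by the job class (JOB_CLASS → CHG_HOUR, an assumption not stated on the slide), the transitive dependency can be removed by splitting out a JOB relation. The rows below are invented sample data.

```python
# Sample EMPLOYEE rows (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR); the values are invented.
employee_2nf = [
    (101, "Ann", "Analyst", 90.0),
    (102, "Raj", "Programmer", 60.0),
    (103, "Mei", "Analyst", 90.0),
]

# Assumed transitive dependency: EMP_NUM -> JOB_CLASS -> CHG_HOUR.
# EMPLOYEE keeps JOB_CLASS as a reference; JOB stores each rate once.
employee = [(num, name, job_class) for num, name, job_class, _ in employee_2nf]
job = sorted({(job_class, rate) for _, _, job_class, rate in employee_2nf})

# Rejoining on JOB_CLASS reproduces the original rows, so no information is lost.
rates = dict(job)
rejoined = [(num, name, jc, rates[jc]) for num, name, jc in employee]

print(employee)
print(job)
print(rejoined == employee_2nf)
```

After the split, changing the charge for a job class touches one JOB row instead of every employee in that class.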
Denormalisation
• Normalisation is one of many database design goals
• Normalised tables come at a cost
– Additional processing
– Loss of system speed
• Normalisation purity is difficult to sustain due to conflicts between:
– Design efficiency
– Information requirements
– Processing
Unnormalised Table Defects