Normalization
Chapter Objectives
The purpose of normalization
Data redundancy and update anomalies
Functional dependencies
The process of normalization
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Update Anomalies
Relations that have redundant data may have problems called update anomalies, which are classified as insertion anomalies, deletion anomalies, and modification anomalies.
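Examples in the StaffBranch relation:
Insertion: to insert a new member of staff with branchNo B007, the branch address must be entered again and must agree with the existing tuples for B007.
Deletion: to delete the tuple that represents the last member of staff located at branch B007 also deletes the details of that branch.
Modification: to change the address of branch B003, every tuple for staff located at B003 must be updated.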
StaffBranch
staffNo  sName        position    salary  branchNo  bAddress
SL21     John White   Manager     30000   B005      22 Deer Rd, London
SG37     Ann Beech    Assistant   12000   B003      163 Main St, Glasgow
SG14     David Ford   Supervisor  18000   B003      163 Main St, Glasgow
SA9      Mary Howe    Assistant   9000    B007      16 Argyll St, Aberdeen
SG5      Susan Brand  Manager     24000   B003      163 Main St, Glasgow
SL41     Julie Lee    Assistant   9000    B005      22 Deer Rd, London
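The modification anomaly can be illustrated with a minimal Python sketch. It assumes the StaffBranch tuples are held as a list of dictionaries; the helper name change_branch_address and the replacement address are made up for illustration and are not part of the chapter. Changing the address of B003 touches every staff tuple at that branch, whereas a separate Branch relation stores the address once.

```python
COLUMNS = ("staffNo", "sName", "position", "salary", "branchNo", "bAddress")
ROWS = [
    ("SL21", "John White", "Manager", 30000, "B005", "22 Deer Rd, London"),
    ("SG37", "Ann Beech", "Assistant", 12000, "B003", "163 Main St, Glasgow"),
    ("SG14", "David Ford", "Supervisor", 18000, "B003", "163 Main St, Glasgow"),
    ("SA9", "Mary Howe", "Assistant", 9000, "B007", "16 Argyll St, Aberdeen"),
    ("SG5", "Susan Brand", "Manager", 24000, "B003", "163 Main St, Glasgow"),
    # Address for SL41 is taken from the other B005 tuple.
    ("SL41", "Julie Lee", "Assistant", 9000, "B005", "22 Deer Rd, London"),
]
staff_branch = [dict(zip(COLUMNS, row)) for row in ROWS]


def change_branch_address(relation, branch_no, new_address):
    """Update bAddress in every tuple for branch_no; return how many tuples changed."""
    changed = 0
    for row in relation:
        if row["branchNo"] == branch_no:
            row["bAddress"] = new_address
            changed += 1
    return changed


# Decomposed design: one entry per branch, so each address is stored once.
branch = {row["branchNo"]: row["bAddress"] for row in staff_branch}

# "1 High St, Glasgow" is a made-up address used only for this example.
updated_in_staff_branch = change_branch_address(staff_branch, "B003", "1 High St, Glasgow")
branch["B003"] = "1 High St, Glasgow"  # a single update in Branch

print(updated_in_staff_branch)  # number of StaffBranch tuples that had to change
```

If any one of the B003 tuples were missed, StaffBranch would hold two different addresses for the same branch, which is the inconsistency the decomposition avoids.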
Staff
staffNo  sName        position    salary  branchNo
SL21     John White   Manager     30000   B005
SG37     Ann Beech    Assistant   12000   B003
SG14     David Ford   Supervisor  18000   B003
SA9      Mary Howe    Assistant   9000    B007
SG5      Susan Brand  Manager     24000   B003
SL41     Julie Lee    Assistant   9000    B005
Branch
branchNo  bAddress
B005      22 Deer Rd, London
B007      16 Argyll St, Aberdeen
B003      163 Main St, Glasgow
Functional Dependencies
Functional dependency describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A is associated with exactly one value of B. (A and B may each consist of one or more attributes.)
A → B   (B is functionally dependent on A)
Determinant: the attribute or group of attributes on the left-hand side of the arrow of a functional dependency. In A → B, A is the determinant.
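The definition can be checked against sample data with a short Python sketch. The function name fd_holds and the dictionary representation are assumptions for illustration; the rows are a subset of the StaffBranch columns. A sample can only refute a dependency: a dependency that holds in this data is not guaranteed to hold for every valid state of the relation.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in this relation instance."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in determinant)
        right = tuple(row[a] for a in dependent)
        # Each determinant value must always map to the same dependent value.
        if seen.setdefault(left, right) != right:
            return False
    return True


rows = [
    {"staffNo": "SL21", "position": "Manager", "branchNo": "B005"},
    {"staffNo": "SG37", "position": "Assistant", "branchNo": "B003"},
    {"staffNo": "SG14", "position": "Supervisor", "branchNo": "B003"},
    {"staffNo": "SA9", "position": "Assistant", "branchNo": "B007"},
]

print(fd_holds(rows, ["staffNo"], ["branchNo"]))  # True
print(fd_holds(rows, ["branchNo"], ["staffNo"]))  # False: B003 has two staff
```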
Functional Dependencies (3)
Main characteristics of functional dependencies used in normalization:
Have a one-to-one relationship between the attribute(s) on the left- and right-hand side of a dependency.
Inference Rules
The set of all functional dependencies that are implied by a given set of functional dependencies X is called the closure of X, written X+. A set of inference rules is needed to compute X+ from X.
Armstrong's axioms
1. Reflexivity: If B is a subset of A, then A → B
2. Augmentation: If A → B, then A,C → B,C
3. Transitivity: If A → B and B → C, then A → C
4. Self-determination: A → A
5. Decomposition: If A → B,C then A → B and A → C
6. Union: If A → B and A → C, then A → B,C
7. Composition: If A → B and C → D, then A,C → B,D
Rules 1 to 3 are Armstrong's axioms; rules 4 to 7 can be derived from them.
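In practice, testing whether one particular dependency belongs to X+ is usually done through the closure of an attribute set rather than by enumerating X+. The Python sketch below shows that related technique; the function names attribute_closure and implies are assumptions for illustration, and each dependency is written as a (left, right) pair of sets.

```python
def attribute_closure(attributes, fds):
    """Return every attribute functionally determined by `attributes` under `fds`."""
    closure = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            # If the determinant is already in the closure, add what it determines.
            if left <= closure and not right <= closure:
                closure |= right
                changed = True
    return closure


def implies(fds, left, right):
    """Return True if left -> right follows from fds."""
    return set(right) <= attribute_closure(left, fds)


fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(sorted(attribute_closure({"A"}, fds)))  # ['A', 'B', 'C']
print(implies(fds, {"A"}, {"C"}))  # True, by transitivity
print(implies(fds, {"C"}, {"A"}))  # False
```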
Unnormalized table (repeating group of property rentals for each client)
clientNo  cName          propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      John Kay       PG4         6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
                         PG16        5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
CR56      Aline Stewart  PG4         6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
                         PG36        2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
                         PG16        5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw
Definition of 1NF
A relation is in First Normal Form (1NF) if the intersection of each row and column contains one and only one value.
There are two approaches to removing repeating groups from unnormalized tables:
1. Remove the repeating groups by entering appropriate data in the empty columns of rows containing the repeating data.
2. Remove the repeating group by placing the repeating data, along with a copy of the original key attribute(s), in a separate relation. A primary key is identified for the new relation.
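Both approaches can be sketched in Python. The sketch assumes the unnormalized table is held as nested dictionaries with the repeating group under a "rentals" key; that layout, the function names flatten and split, and the reduced set of columns are assumptions made for illustration.

```python
unnormalized = [
    {"clientNo": "CR76", "cName": "John Kay", "rentals": [
        {"propertyNo": "PG4", "rentStart": "1-Jul-00"},
        {"propertyNo": "PG16", "rentStart": "1-Sep-02"},
    ]},
    {"clientNo": "CR56", "cName": "Aline Stewart", "rentals": [
        {"propertyNo": "PG4", "rentStart": "1-Sep-99"},
        {"propertyNo": "PG36", "rentStart": "10-Oct-00"},
        {"propertyNo": "PG16", "rentStart": "1-Nov-02"},
    ]},
]


def flatten(rows, group_key):
    """Approach 1: repeat the non-repeating attributes on every row of the group."""
    flat = []
    for row in rows:
        fixed = {k: v for k, v in row.items() if k != group_key}
        for item in row[group_key]:
            flat.append({**fixed, **item})
    return flat


def split(rows, key, group_key):
    """Approach 2: move the repeating group, with a copy of the key, to a new relation."""
    parent = [{k: v for k, v in row.items() if k != group_key} for row in rows]
    child = [{key: row[key], **item} for row in rows for item in row[group_key]]
    return parent, child


flat = flatten(unnormalized, "rentals")
clients, rentals = split(unnormalized, "clientNo", "rentals")
print(len(flat), len(clients), len(rentals))  # 5 2 5
```

The first approach keeps one relation but repeats cName on every row; the second avoids that repetition at the cost of a second relation keyed on (clientNo, propertyNo).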
1NF relation (repeating groups removed by entering the appropriate data in the empty columns)
clientNo  cName          propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      John Kay       PG4         6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
CR76      John Kay       PG16        5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
CR56      Aline Stewart  PG4         6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
CR56      Aline Stewart  PG36        2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
CR56      Aline Stewart  PG16        5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw
With the second approach, the repeating group is placed, along with a copy of the original key attribute (clientNo), in a separate relation.
Client
clientNo  cName
CR76      John Kay
CR56      Aline Stewart

Rental
clientNo  propertyNo  rentStart  rentFinish
CR76      PG4         1-Jul-00   31-Aug-01
CR76      PG16        1-Sep-02   1-Sep-02
CR56      PG4         1-Sep-99   10-Jun-00
CR56      PG36        10-Oct-00  1-Dec-01
CR56      PG16        1-Nov-02   1-Aug-03

PropertyOwner
propertyNo  pAddress                rent  ownerNo  oName
PG4         6 Lawrence St, Glasgow  350   CO40     Tina Murphy
PG16        5 Novar Dr, Glasgow     450   CO93     Tony Shaw
PG36        2 Manor Rd, Glasgow     370   CO93     Tony Shaw
Third normal form (3NF)
A relation that is in first and second normal form, and in which no non-primary-key attribute is transitively dependent on the primary key.
The normalization of 2NF relations to 3NF involves the removal of transitive dependencies by placing the attribute(s) in a new relation along with a copy of the determinant.
Rental
fd1  clientNo, propertyNo → rentStart, rentFinish   (Primary key)
fd5  clientNo, rentStart → propertyNo, rentFinish   (Candidate key)
fd6  propertyNo, rentStart → clientNo, rentFinish   (Candidate key)
PropertyOwner
fd3  propertyNo → pAddress, rent, ownerNo, oName   (Primary key)
fd4  ownerNo → oName   (Transitive dependency)
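Removing the transitive dependency fd4 can be sketched in Python as two projections of PropertyOwner, followed by a join to confirm that no information is lost. The helper names project and natural_join and the variable names are assumptions for illustration, not part of the chapter.

```python
property_owner = [
    {"propertyNo": "PG4", "pAddress": "6 Lawrence St, Glasgow", "rent": 350,
     "ownerNo": "CO40", "oName": "Tina Murphy"},
    {"propertyNo": "PG16", "pAddress": "5 Novar Dr, Glasgow", "rent": 450,
     "ownerNo": "CO93", "oName": "Tony Shaw"},
    {"propertyNo": "PG36", "pAddress": "2 Manor Rd, Glasgow", "rent": 370,
     "ownerNo": "CO93", "oName": "Tony Shaw"},
]


def project(rows, attributes):
    """Project onto `attributes`, dropping duplicate tuples (order preserved)."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right, on):
    """Join two relations on a single shared attribute."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


# The dependent attribute oName moves to a new relation with a copy of its determinant.
properties = project(property_owner, ["propertyNo", "pAddress", "rent", "ownerNo"])
owners = project(property_owner, ["ownerNo", "oName"])

rejoined = natural_join(properties, owners, "ownerNo")
print(len(properties), len(owners))  # 3 2
print(rejoined == property_owner)    # True: the decomposition is lossless here
```

Tony Shaw's name is now stored once in the owner relation instead of once per property.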
Client
clientNo  cName
CR76      John Kay
CR56      Aline Stewart

Rental
clientNo  propertyNo  rentStart  rentFinish
CR76      PG4         1-Jul-00   31-Aug-01
CR76      PG16        1-Sep-02   1-Sep-02
CR56      PG4         1-Sep-99   10-Jun-00
CR56      PG36        10-Oct-00  1-Dec-01
CR56      PG16        1-Nov-02   1-Aug-03

PropertyOwner
propertyNo  pAddress                rent  ownerNo
PG4         6 Lawrence St, Glasgow  350   CO40
PG16        5 Novar Dr, Glasgow     450   CO93
PG36        2 Manor Rd, Glasgow     370   CO93

Owner
ownerNo  oName
CO40     Tina Murphy
CO93     Tony Shaw
Example of BCNF
fd1  clientNo, interviewDate → interviewTime, staffNo, roomNo   (Primary key)
fd2  staffNo, interviewDate, interviewTime → clientNo   (Candidate key)
fd3  roomNo, interviewDate, interviewTime → clientNo, staffNo   (Candidate key)
fd4  staffNo, interviewDate → roomNo   (not a candidate key)
As a consequence, the ClientInterview relation may suffer from update anomalies. For example, two tuples have to be updated if the roomNo needs to be changed for staffNo SG5 on 13-May-02.
ClientInterview
clientNo  interviewDate  interviewTime  staffNo  roomNo
CR76      13-May-02      10.30          SG5      G101
CR76      13-May-02      12.00          SG5      G101
CR74      13-May-02      12.00          SG37     G102
CR56      1-Jul-02       10.30          SG5      G102
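The dependency that breaks BCNF can be found mechanically: a relation is in BCNF only if the determinant of every nontrivial dependency is a superkey, that is, its attribute closure covers the whole relation. The Python sketch below applies that test to fd1 to fd4 of ClientInterview; the function names attribute_closure and bcnf_violations are assumptions for illustration.

```python
def attribute_closure(attributes, fds):
    """Return every attribute functionally determined by `attributes` under `fds`."""
    closure = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= closure and not right <= closure:
                closure |= right
                changed = True
    return closure


def bcnf_violations(all_attributes, named_fds):
    """Return the names of dependencies whose determinant is not a superkey."""
    pairs = [(left, right) for _, left, right in named_fds]
    violations = []
    for name, left, _ in named_fds:
        if attribute_closure(left, pairs) != set(all_attributes):
            violations.append(name)
    return violations


R = {"clientNo", "interviewDate", "interviewTime", "staffNo", "roomNo"}
FDS = [
    ("fd1", {"clientNo", "interviewDate"}, {"interviewTime", "staffNo", "roomNo"}),
    ("fd2", {"staffNo", "interviewDate", "interviewTime"}, {"clientNo"}),
    ("fd3", {"roomNo", "interviewDate", "interviewTime"}, {"clientNo", "staffNo"}),
    ("fd4", {"staffNo", "interviewDate"}, {"roomNo"}),
]

print(bcnf_violations(R, FDS))  # ['fd4']
```

Only fd4 is reported, because (staffNo, interviewDate) determines roomNo but nothing else.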
Example of BCNF (2)
To transform the ClientInterview relation to BCNF, we must remove the violating functional dependency by creating two new relations called Interview and StaffRoom, as shown below:
Interview (clientNo, interviewDate, interviewTime, staffNo)
StaffRoom (staffNo, interviewDate, roomNo)
Interview
clientNo  interviewDate  interviewTime  staffNo
CR76      13-May-02      10.30          SG5
CR76      13-May-02      12.00          SG5
CR74      13-May-02      12.00          SG37
CR56      1-Jul-02       10.30          SG5

StaffRoom
staffNo  interviewDate  roomNo
SG5      13-May-02      G101
SG37     13-May-02      G102
SG5      1-Jul-02       G102
A multi-valued dependency (MVD) can be further defined as being trivial or nontrivial.
An MVD A →→ B in relation R is trivial if (a) B is a subset of A, or (b) A ∪ B = R.
An MVD is nontrivial if neither of the above two conditions is satisfied.
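The two conditions translate directly into set operations. In the Python sketch below, the function name is_trivial_mvd and the attribute names in R are hypothetical and chosen only for illustration.

```python
def is_trivial_mvd(a, b, relation_attributes):
    """An MVD A ->> B on R is trivial if B is a subset of A or A union B equals R."""
    a, b, r = set(a), set(b), set(relation_attributes)
    return b <= a or (a | b) == r


# Hypothetical relation with three attributes.
R = {"branchNo", "sName", "oName"}

print(is_trivial_mvd({"branchNo"}, {"sName"}, R))           # False: nontrivial
print(is_trivial_mvd({"branchNo", "sName"}, {"sName"}, R))  # True: B is a subset of A
print(is_trivial_mvd({"branchNo"}, {"sName", "oName"}, R))  # True: A union B = R
```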