
4 - Normalization

Normalization is a process of structuring data in a database to minimize redundancy and dependency. It involves decomposing tables to eliminate anomalies that occur from inserting, modifying, and deleting data. The goals of normalization are to conserve storage space, ensure data dependencies make logical sense, and represent real world entities in a simplified and accurate way within the database.

NORMALIZATION

Definitions
• A process of analyzing a relation to ensure that it is well formed

• Normalization involves decomposing relations to produce smaller, well-structured relations

• A formal process for deciding which attributes should be grouped together in a relation so that all anomalies are removed

• More specifically, if a relation is normalized (well-formed), rows can be inserted, deleted, or modified without creating anomalies
Goals

• Minimize data redundancy, thereby conserving space and avoiding anomalies

• Make it easier to maintain data

• Provide a better design that is an improved representation of the real world

Normalization is a logical data-modelling technique used to ensure that data are well structured from an organization-wide view
Student
Stud_ID Stud_Name No_Subject College Build_No Room_No
101 Snowhite 4 COC B2 R101
102 Belle 5 CCIS B1 R105
103 Mulan 3 COE B3 R108
104 Elsa 6 CCIS B1 R105
105 Diana 4 COC B2 R101
Modification Anomalies
Tables that are not normalized are susceptible to modification anomalies

◦ Insertion Anomaly - occurs when certain attributes cannot be inserted into the database without the presence of other attributes

◦ Update Anomaly - exists when one or more instances of duplicated data are updated, but not all

◦ Deletion Anomaly - exists when certain attributes are lost because of the deletion of other attributes
Student (rows annotated with the anomaly they illustrate)
Stud_ID  Stud_Name  No_Subject  College  Build_No  Room_No
101      Snowhite   4           COC      B2        R101      Update Anomaly
102      Belle      5           CCIS     B1        R105
103      Mulan      3           COE      B3        R108
104      Elsa       6           CCIS     B1        R105      Deletion Anomaly
105      Diana      4           COC      B2        R101
-        -          -           MT       B1        R103      Insertion Anomaly
Insertion Anomaly
customerId  customerName     carId  year  make        model
1           Mischka Sophia   102    2003  Volkswagen  Golf
2           Kenra Elienette  101    2014  BMW         550i
3           Thea Pauline     103    2010  Nissan      Sentra

Most modification problems are solved by breaking an existing table into two or more tables through a process known as normalization
Update Anomaly
customerId  Customer Name    carId  year  make        model   price
1           Mischka Sophia   102    2014  Mitsubishi  Mirage  900K
2           Kenra Elienette  101    2014  Toyota      Prius   2M
3           Thea Pauline     103    2010  Nissan      Sentra  900K
4           Ruel Paolo       102    2014  Mitsubishi  Mirage  900K
5           James Aaron      104    2016  Honda       Civic   1.2M
6           Eunice Czarina   102    2014  Mitsubishi  Mirage  900K
Deletion Anomaly
EmpID  EmpName           Address  DeptID  DeptName    DeptMgr
1      Hanae Yoshimori   Osaka    D01     Accounting  Valdez
2      Atsuko Imai       Tokyo    D02     Operations  Deverala
3      Chigusa Sazaki    Nagoya   D03     Services    Talavera
4      Reiko Kawai       Kyoto    D04     Training    Antonio
5      Kalpint Patel     Chennai  D02     Operations  Deverala
6      Donald McElhenny  Mumbai   D01     Accounting  Valdez

In this table, employees 3 and 4 are the only rows for departments D03 and D04, so deleting either employee also deletes the only record of that department and its manager.
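The deletion anomaly can be reproduced with a few lines of Python. This is only a sketch over the sample employee rows; the helper names `departments` and `delete_employee` are made up for illustration and are not part of the slides.

```python
# Rows of the unnormalized table:
# (EmpID, EmpName, Address, DeptID, DeptName, DeptMgr)
employees = [
    (1, "Hanae Yoshimori", "Osaka", "D01", "Accounting", "Valdez"),
    (2, "Atsuko Imai", "Tokyo", "D02", "Operations", "Deverala"),
    (3, "Chigusa Sazaki", "Nagoya", "D03", "Services", "Talavera"),
    (4, "Reiko Kawai", "Kyoto", "D04", "Training", "Antonio"),
    (5, "Kalpint Patel", "Chennai", "D02", "Operations", "Deverala"),
    (6, "Donald McElhenny", "Mumbai", "D01", "Accounting", "Valdez"),
]


def departments(rows):
    """Department facts that can still be read from the rows."""
    return {(dept_id, name, mgr) for (_, _, _, dept_id, name, mgr) in rows}


def delete_employee(rows, emp_id):
    """Return the rows that remain after removing one employee."""
    return [row for row in rows if row[0] != emp_id]


before = departments(employees)

# Employee 4 is the only row that mentions department D04.
lost = before - departments(delete_employee(employees, 4))
print(lost)  # {('D04', 'Training', 'Antonio')}

# Employee 1 shares department D01 with employee 6, so nothing is lost.
print(before - departments(delete_employee(employees, 1)))  # set()
```

Storing departments in their own table would let an employee row be deleted without touching any department facts.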
Definitions

Functional dependency
• A constraint between two attributes in which the value of one attribute is determined by the value of another attribute
(Constraint – certain properties that data must comply with)

• Describes the relationship between attributes in a relation.

• It is a relationship that exists when one attribute uniquely determines another attribute

Examples:
ISBN -> Title, FirstAuthorName, Publisher
ISBN 978-981-4586-96-2 -> Modern Database Management, 11th Ed.; Hoffer, Prescott, Topi; Pearson Education

You can also describe this as a relationship where knowing the value of one attribute (or a set of attributes) is enough to tell you the value of another attribute (or set of attributes) in the same table.

Examples:
SSS No. -> Employee’s Name
Bank Account No. -> Customer Name
Functional dependency
Illustration:

determinant → dependent
FD: attribute1 → attribute2
FD: x → y
    y is functionally dependent on x
    x determines y

x  y
1  1
2  1
3  2
4  6
2  9    Not FD (x = 2 already maps to y = 1)
3  2    FD (same determinant, same dependent)

FD x → y, with reference to tuples having the same determinant:
If t1.x == t2.x
Then t1.y == t2.y

Student
Stud_ID  StudName  Grade  College  Code
1        a         78     CCIS     C1
2        b         80     COE      C1
3        a         78     CCIS     C2
4        b         60     COE      C2
5        c         80     COC      C2
6        d         80     MT       C2

Dependencies to check against the Student table:
Stud_ID → StudName
StudName → Stud_ID
Stud_ID → Grade
College → Code
Student
Stud_ID  StudName  Grade  College  Code
1        a         78     CCIS     C1
2        b         80     COE      C1
3        a         78     CCIS     C2
4        b         80     COE      C2
5        c         60     COC      C2
6        d         80     MT       C2

FD x → y
If t1.x == t2.x
Then t1.y == t2.y

Dependencies to check against this table:
Stud_ID, StudName → Grade
StudName → Grade
StudName, Grade → College
Grade → College
StudName, Grade → College, Code

Student
Stud_ID  StudName  Grade  College  Code
1        a         78     CCIS     C1
2        b         80     COE      C1
3        a         78     CCIS     C2
4        b         60     COE      C2
5        c         80     COC      C2
6        d         80     MT       C2

FD x → y
If t1.x == t2.x
Then t1.y == t2.y

Dependencies to check against this table:
Stud_ID → StudName, Grade
College, Code → StudName
Stud_ID, Grade → College
StudName → College
StudName, Grade, College → Stud_ID
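The rule "if t1.x == t2.x then t1.y == t2.y" can be checked mechanically. The sketch below assumes the six-row Student table above (with row 4 = b, 60 and row 5 = c, 80); the function name `holds` is an illustrative choice. Note that such a check only tells whether a dependency holds in the sample data, not whether it is a rule of the real-world domain.

```python
COLUMNS = ("Stud_ID", "StudName", "Grade", "College", "Code")
student = [dict(zip(COLUMNS, row)) for row in [
    (1, "a", 78, "CCIS", "C1"),
    (2, "b", 80, "COE", "C1"),
    (3, "a", 78, "CCIS", "C2"),
    (4, "b", 60, "COE", "C2"),
    (5, "c", 80, "COC", "C2"),
    (6, "d", 80, "MT", "C2"),
]]


def holds(rows, lhs, rhs):
    """True if lhs -> rhs holds: equal lhs values always imply equal rhs values."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first y seen for each x; any later mismatch breaks the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True


print(holds(student, ["Stud_ID"], ["StudName", "Grade"]))              # True
print(holds(student, ["StudName"], ["Stud_ID"]))                       # False: a -> 1 and 3
print(holds(student, ["College"], ["Code"]))                           # False: CCIS -> C1 and C2
print(holds(student, ["College", "Code"], ["StudName"]))               # True
print(holds(student, ["StudName"], ["College"]))                       # True
print(holds(student, ["StudName", "Grade", "College"], ["Stud_ID"]))   # False
```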


Primary Key – an attribute (or combination of attributes) that uniquely identifies each row in a relation

When a primary key is composed of two or more attributes, it is called a composite key.

Foreign Key – an attribute (or combination of attributes) that provides a link between two tables. It acts as a cross-reference between tables because it references the primary key of another table, thereby establishing the link between them.
Table uses Flight Number and Date as composite primary key

Flight Number  Date       From         To                 No. of Passengers
UA 36          3-Sep-16   1410 Manila  1750 Kuala Lumpur  200
AA 704         31-Oct-16  0120 Manila  0455 Kuala Lumpur  150
UA 36          4-Sep-16   1410 Manila  1750 Kuala Lumpur  190
BA 9           31-Oct-16  0100 Manila  0435 Kuala Lumpur  100
Types of Functional Dependency

Examples use the Student table (Stud_ID, StudName, Grade, College, Code) shown earlier.

Trivial: x → y if y ⊆ x (including x → x)
A trivial functional dependency is one whose dependent attributes are already included in the determinant.
    Student_ID → Student_ID
    Student_ID, StudName → StudName
    Student_ID, StudName → Student_ID

Non-Trivial: x → y if y ⊄ x
A functional dependency A → B is non-trivial when B is not a subset of A.
    Student_ID → StudName

Semi-Trivial: x → y if y ⊄ x but x and y share at least one attribute
    Student_ID, StudName → StudName, Grade

Multivalued: x →→ y, x →→ z
    Ename →→ Project, Ename →→ Dept
    (also written Ename →→ Project, Dept)

    Employee
    Ename  Project  Dept
    Joe    DBMS     Accounting
    Joe    Java     Payroll
    Joe    DBMS     Payroll
    Joe    Java     Accounting

Transitive
Armstrong’s Axioms/ Inference Rules

• Armstrong's Axioms are a set of inference rules.
• They provide a simple technique for reasoning about functional dependencies.
• They were developed by William W. Armstrong in 1974.
• They are used to infer all the functional dependencies on a relational database.
Reflexive rule: If alpha is a set of attributes and beta is a subset of alpha, then alpha determines beta.

Reflexivity    x → x
               x → y if y ⊆ x

Transitivity rule: Same as the transitive rule in algebra. If a → b holds and b → c holds, then a → c also holds (a → b is read "a functionally determines b"). It applies when an indirect relationship causes a functional dependency.

Transitivity   if (x → y & y → z)
               then x → z

               StudName → Grade & Grade → College
               then StudName → College
Augmentation rule: When x → y holds and a is a set of attributes, then xa → ya also holds. That is, adding the same attributes to both sides does not change the basic dependency.

Augmentation   if x → y
               then xa → ya

               Stud_ID → StudName
               then Stud_ID, Grade → StudName, Grade

Union rule: If x determines y and x determines z, then x must also determine y and z together.

Union          if (x → y & x → z)
               then x → yz

               Stud_ID → StudName & Stud_ID → Grade
               then Stud_ID → StudName, Grade
Decomposition rule: If A determines BC, then A determines B and A determines C.

Decomposition/Splitting   if (x → yz)
                          then x → y & x → z

                          StudName, Grade → College, Code
                          then StudName, Grade → College & StudName, Grade → Code

Note: only the right-hand side (the dependent attributes) can be split.

Pseudo-transitivity rule: If A determines B and BC determines D, then AC determines D.

Pseudo-Transitivity       if (x → y & yz → a)
                          then xz → a

                          Stud_ID → StudName & StudName, Grade → College
                          then Stud_ID, Grade → College
Composition    if (x → y & a → b)
               then xa → yb

               Stud_ID → StudName & Grade → College
               then Stud_ID, Grade → StudName, College
Attribute Closure/Closure of Attribute Set
The set of all attributes that can be functionally determined from an attribute set is known as the closure of that attribute set.
Steps to find closure of an attribute set
1. Add the attributes already present in the attribute set (the one whose closure is being computed) to the result set.
2. Repeatedly add attributes that can be functionally determined from the attributes already present in the result set.

Examples:
R(A, B, C)
A → B    B → C
Compute closure of A.
{A}+ = {A}
     = {A, B}
     = {A, B, C}

R(A, B, C, D, E, F, G)
A → BC    BC → DE    D → F    CF → G
Compute closure of A, D, BC.
{D}+ = {D}
     = {D, F}
{A}+ = {A}
     = {A, B, C}
     = {A, B, C, D, E}
     = {A, B, C, D, E, F, G}
{BC}+ = {B, C}
      = {B, C, D, E}
      = {B, C, D, E, F}
      = {B, C, D, E, F, G}

R(A, B, C, D, E, F)
A → B    C → DE    AC → F    D → AF    E → CF
{D}+ = {D}
     = {D, A, F}
     = {A, B, D, F}
{DE}+ = {D, E}
      = {A, D, E, F}
      = {A, B, D, E, F}
      = {A, B, C, D, E, F}
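The two closure steps translate directly into a fixed-point loop. The following is a minimal sketch, assuming single-letter attribute names so that a string such as "BC" can stand for the set {B, C}; the function name `closure` and the FD encoding as (left, right) pairs are illustrative choices.

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) dependencies."""
    result = set(attrs)              # step 1: start with the given attributes
    changed = True
    while changed:                   # step 2: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# R(A,B,C,D,E,F,G) with A -> BC, BC -> DE, D -> F, CF -> G
fds_g = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]
print(sorted(closure("D", fds_g)))   # ['D', 'F']
print(sorted(closure("A", fds_g)))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(sorted(closure("BC", fds_g)))  # ['B', 'C', 'D', 'E', 'F', 'G']

# R(A,B,C,D,E,F) with A -> B, C -> DE, AC -> F, D -> AF, E -> CF
fds_f = [("A", "B"), ("C", "DE"), ("AC", "F"), ("D", "AF"), ("E", "CF")]
print(sorted(closure("D", fds_f)))   # ['A', 'B', 'D', 'F']
print(sorted(closure("DE", fds_f)))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

The loop visits the dependencies in a fixed order, so its intermediate sets may differ from the hand-worked steps, but the final closure is the same.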
Full functional dependency
A functional dependency x → y is a full functional dependency if removing any attribute from x means that the dependency no longer holds.
1. y is functionally dependent on x, and
2. y is not functionally dependent on any proper subset of x

    ABC → D
    Name, Course, Subject → Grade

Partial functional dependency
A functional dependency x → y is a partial dependency if some attributes can be removed from the determinant x and the dependency still holds (attribute closure can be used to check this).
1. y is functionally dependent on x, and
2. y can be determined by some proper subset of x

    AC → P, given A → D and D → P:
    {A}+ = {A}
         = {A, D}
         = {A, D, P}
    so A alone determines P

    {Name, ID} → Course, given {ID} → Course
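Partial functional dependency – when a nonkey attribute is functionally dependent on part (but not all) of the primary key

Flight Number  Date       From         To                 No. of Passengers
UA 36          3-Sep-16   1410 Manila  1750 Kuala Lumpur  200
AA 704         31-Oct-16  0120 Manila  0455 Kuala Lumpur  150
UA 36          4-Sep-16   1410 Manila  1750 Kuala Lumpur  190
BA 9           31-Oct-16  0100 Manila  0435 Kuala Lumpur  100

In this sample data the primary key is Flight Number + Date. From and To repeat for flight UA 36 on both dates, so they depend on Flight Number alone, while No. of Passengers depends on the whole key.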
Transitive dependency – a functional dependency between the primary key and one or more nonkey attributes that are dependent on the primary key via another nonkey attribute

This means that if A is the primary key and B and C are nonkey attributes, where B depends directly on A and C depends on B, then C is transitively dependent on A.

A → B → C
Steps in Normalization
Normalization can be accomplished in stages, each of which corresponds to a normal form

1. First Normal Form – Any multivalued attributes (also called repeating groups) have been removed

2. Second Normal Form – Any partial functional dependencies have been removed

3. Third Normal Form – Any transitive dependencies have been removed
Order ID : 1006 Order Date : Oct 24, 2019

Customer ID : 22
Customer Name : Dan’s Furniture
Customer Address : Fullerton, California

Product ID Product Description Product Finish Unit Price Ordered Quantity


7 Tea Table Walnut $450 2
5 TV Stand Oak $300 6
4 Porch Swing Pine $800 5
Order ID : 1007 Order Date : Oct 31, 2019

Customer ID : 65
Customer Name : Furniture Barn
Customer Address : Fort Collins, Colorado

Product ID Product Description Product Finish Unit Price Ordered Quantity


11 Table Cherry $1150 3
4 Porch Swing Pine $800 7
First Normal Form (1NF)
• Any multivalued attributes (also called repeating groups) have been removed
• A relation is in 1NF if every attribute contains only atomic values
• Values stored in a column should be of the same domain
• All the attributes in a table should have unique names
Table with multivalued Attributes:
Not in 1st Normal Form

Order ID  Order Date   Cust ID  Cust Name        Cust Address     Prod ID  Product Description  Prod Finish  Unit Price  Ordered Qty
1006      24 Oct 2019  22       Dan’s Furniture  Fullerton, CA    7        Tea Table            Walnut       $450        2
                                                                  5        TV Stand             Oak          $300        6
                                                                  4        Porch Swing          Pine         $800        5
1007      31 Oct 2019  65       Furniture Barn   Ft. Collins, CO  11       Table                Cherry       $1150       3
                                                                  4        Porch Swing          Pine         $800        7
Same data with the repeating groups removed (one product per row, every cell holds a single value):

Order ID  Order Date   Cust ID  Customer Name    Customer Address  Prod ID  Product Description  Prod Finish  Unit Price  Ordered Qty
1006      24 Oct 2019  22       Dan’s Furniture  Fullerton, CA     7        Tea Table            Walnut       $450        2
1006      24 Oct 2019  22       Dan’s Furniture  Fullerton, CA     5        TV Stand             Oak          $300        6
1006      24 Oct 2019  22       Dan’s Furniture  Fullerton, CA     4        Porch Swing          Pine         $800        5
1007      31 Oct 2019  65       Furniture Barn   Ft. Collins, CO   11       Table                Cherry       $1150       3
1007      31 Oct 2019  65       Furniture Barn   Ft. Collins, CO   4        Porch Swing          Pine         $800        7
Table with multivalued Attributes:
Not in 1st Normal Form

PatientID  PatientName  PatientAdd   PatientCPNo
P101       Bea          Makati       091777
                                     092677
P102       Liza         Manila       091933
                                     092433
P103       Kim          Mandaluyong  091801
                                     097702
P104       Angel        Pasig        0926413
                                     0926518

The same data with one phone number per row:

PatientID  PatientName  PatientAdd   PatientCPNo
P101       Bea          Makati       091777
P101       Bea          Makati       092677
P102       Liza         Manila       091933
P102       Liza         Manila       092433
P103       Kim          Mandaluyong  091801
P103       Kim          Mandaluyong  097702
P104       Angel        Pasig        0926413
P104       Angel        Pasig        0926518


Table with multivalued Attributes: (Having Primary Key)
Not in 1st Normal Form

1. Remove nested relation attributes into a new relation
2. Propagate the primary key into it
3. Unnest relation into a set of 1NF relations

Emp_Project_Details
Emp_ID  EmpName  Proj_No  WorkingHrs
E101    Bea      1        32
                 2        8
E102    Liza     3        40
E103    Kim      1        20
                 2        20
E104    Angel    2        10
                 3        10
                 6        10
                 8        10

Decomposed Relations

Employee
Emp_ID  EmpName
E101    Bea
E102    Liza
E103    Kim
E104    Angel

Project
Emp_ID  Proj_No  WorkingHrs
E101    1        32
E101    2        8
E102    3        40
E103    1        20
E103    2        20
E104    2        10
E104    3        10
E104    6        10
E104    8        10
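The three unnesting steps can be sketched in Python, with the nested project group held as a list inside each employee row. The data comes from Emp_Project_Details; the variable and function names (`nested`, `to_1nf`) are illustrative assumptions.

```python
# Each employee row carries a repeating group of (Proj_No, WorkingHrs) pairs.
nested = [
    ("E101", "Bea", [(1, 32), (2, 8)]),
    ("E102", "Liza", [(3, 40)]),
    ("E103", "Kim", [(1, 20), (2, 20)]),
    ("E104", "Angel", [(2, 10), (3, 10), (6, 10), (8, 10)]),
]


def to_1nf(rows):
    """Split the nested relation into two flat (1NF) relations."""
    # Step 1: the non-repeating attributes stay in the Employee relation.
    employee = [(emp_id, name) for emp_id, name, _ in rows]
    # Steps 2 and 3: copy the primary key into every unnested project row.
    project = [
        (emp_id, proj_no, hours)
        for emp_id, _, projects in rows
        for proj_no, hours in projects
    ]
    return employee, project


employee, project = to_1nf(nested)
print(employee[0])   # ('E101', 'Bea')
print(project[:2])   # [('E101', 1, 32), ('E101', 2, 8)]
print(len(project))  # 9
```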
Second Normal Form (2NF)
• Any partial functional dependencies have been removed
• A relation should be in 1NF

Point to Remember – if a proper subset of a candidate key determines a non-prime attribute, it is called a partial functional dependency
example 1

R(P, Q, R, S, T)
PQ → R    partial FD
S → T     partial FD

step 1. Find the candidate key
PQS is the candidate key

step 2. Find closure for the identified candidate key
PQS+ = PQS
     = PQSR
     = PQSRT or PQRST = {R}

step 3. Identify the prime and non-prime attributes
prime is PQS
non-prime is RT

DECOMPOSE relation R
R1(P, Q, S)
R2(P, Q, R)
R3(S, T)

example 2

R(P, Q, R, S, T)
P → Q    partial FD
Q → T
R → S    partial FD

step 1. Find the candidate key
PR is the candidate key

step 2. Find closure for the identified candidate key
PR+ = PR
    = PQR
    = PQRST = {R}

step 3. Identify the prime and non-prime attributes
prime is PR
non-prime is QST

DECOMPOSE relation R
R1(P, R)
R2(P, Q, T)
R3(R, S)

example 3

R(A, B, C, D, E, F, G, H, I, J)
AB → C     partial FD
AD → GH    partial FD
BD → EF    partial FD
A → I      partial FD
H → J

step 1. Find the candidate key
ABD is the candidate key

step 2. Find closure for the identified candidate key
ABD+ = ABD
     = ABCDGH
     = ABCDEFGHIJ = {R}

step 3. Identify the prime and non-prime attributes
prime is ABD
non-prime is CEFGHIJ

DECOMPOSE relation R
R1(A, B, D)
R2(A, B, C)
R3(A, I)
R4(A, D, G, H, J)
R5(B, D, E, F)
example 4

Student
Sid  Sname  Tid  Tname   Grade
1    Bea    3    Nayre   5
2    Angel  2    Dastas  4
3    Ivana  1    Cruz    6

Tid → Tname         partial FD
Sid → Sname         partial FD
Sid, Tid → Grade    FD

step 1. Find the candidate key
(Sid, Tid) is the candidate key

step 2. Find closure for the identified candidate key
(Sid, Tid)+ = Sid, Tid
            = Sid, Sname, Tid, Tname, Grade = {R}

step 3. Identify the prime and non-prime attributes
prime is Sid, Tid
non-prime is Sname, Tname, Grade

DECOMPOSE relation R

R1                   R2             R3
Sid  Tid  Grade      Tid  Tname     Sid  Sname
1    3    5          3    Nayre     1    Bea
2    2    4          2    Dastas    2    Angel
3    1    6          1    Cruz      3    Ivana
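The three steps used in the 2NF examples (find the candidate key, confirm it by closure, flag the partial dependencies) can be sketched as a brute-force search. This assumes single-letter attribute names and small relations; `candidate_keys` and `partial_dependencies` are hypothetical helper names, and the exhaustive subset search is only practical for classroom-sized schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(combo)
    return keys


def partial_dependencies(fds, keys):
    """FDs where a proper subset of a key determines a non-prime attribute."""
    prime = set().union(*keys)
    return [
        (lhs, rhs) for lhs, rhs in fds
        if (set(rhs) - prime) and any(set(lhs) < set(k) for k in keys)
    ]


# example 1: R(P, Q, R, S, T) with PQ -> R, S -> T
fds1 = [("PQ", "R"), ("S", "T")]
keys1 = candidate_keys("PQRST", fds1)
print(keys1)                              # [('P', 'Q', 'S')]
print(partial_dependencies(fds1, keys1))  # [('PQ', 'R'), ('S', 'T')]

# example 3: R(A..J) with AB -> C, AD -> GH, BD -> EF, A -> I, H -> J
fds3 = [("AB", "C"), ("AD", "GH"), ("BD", "EF"), ("A", "I"), ("H", "J")]
keys3 = candidate_keys("ABCDEFGHIJ", fds3)
print(keys3)                              # [('A', 'B', 'D')]
print(partial_dependencies(fds3, keys3))  # every FD except H -> J
```

In example 3 the dependency H → J is not reported because H is not part of the key; it is a transitive dependency, which matters for 3NF rather than 2NF.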


Going to the 2nd Normal Form (2NF)

Dependency diagram for (Order ID, Order Date, Cust ID, Customer Name, Customer Address, Prod ID, Product Description, Product Finish, Unit Price, Ordered Qty):

Full Dependencies
Order_ID, Product_ID -> Order_Quantity

Partial Dependencies
Order_ID -> Order_Date, Customer_ID, Customer_Name, Customer_Address
Product_ID -> Product_Description, Product_Finish, Unit_Price

Second Normal Form
1NF PLUS No partial dependencies

(Order ID, Order Date, Customer ID, Customer Name, Customer Address)
(Product ID, Product Description, Product Finish, Unit Price)
(Order ID, Product ID, Ordered Quantity)
Third Normal Form (3NF)
• A relation should be in 2NF
• Any transitive dependencies have been removed

Transitive dependency – a functional dependency between the primary key and one or more nonkey attributes that are dependent on the primary key via another nonkey attribute

This means that if A is the primary key and B and C are nonkey attributes, where B depends directly on A and C depends on B, then C is transitively dependent on A.

Point to Remember: for a dependency x → y

• Partial Dependency (part of key → non-prime): the left-hand side x is part of a candidate key (violates 2NF)

• Transitive Dependency (non-prime → non-prime): the left-hand side x is a non-prime attribute (violates 3NF)
example 1

R(A, B, C, D, E)
A → B    partial FD
B → E    transitive dependency
C → D    partial FD

step 1. Find the candidate key
AC is the candidate key

step 2. Find closure for the identified candidate key
AC+ = AC
    = ABCDE = {R}

step 3. Identify the prime and non-prime attributes
prime are A, C
non-prime are B, D, E

DECOMPOSE relation R
R1(A, C)
R2(A, B)
R3(B, E)
R4(C, D)
example 2

Student(Sname, Major, Dept)
Sname → Major    FD on the whole key
Sname → Dept     FD on the whole key
Major → Dept     transitive dependency

step 1. Find the candidate key
Sname is the candidate key

step 2. Find closure for the identified candidate key
Sname+ = Sname
       = Sname, Major, Dept = {Student}

step 3. Identify the prime and non-prime attributes
prime is Sname
non-prime are Major, Dept

Because the key is a single attribute, no partial dependency is possible; only the transitive dependency Sname → Major → Dept has to be removed.

DECOMPOSE relation R
Student_Major(Sname, Major)
Major_Dept(Major, Dept)


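Going to the 3rd Normal Form

Transitive dependencies remaining in the 2NF relations: Customer Name and Customer Address depend on Order ID only through Customer ID.

(Order ID, Order Date, Customer ID, Customer Name, Customer Address)
(Product ID, Product Description, Product Finish, Unit Price)
(Order ID, Product ID, Ordered Quantity)

Third Normal Form
2NF PLUS No transitive dependencies

(Order ID, Order Date, Customer ID)
(Customer ID, Customer Name, Customer Address)
(Product ID, Product Description, Product Finish, Unit Price)
(Order ID, Product ID, Ordered Quantity)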
Dependency Diagram

Attributes: Order ID, Order Date, Cust ID, Customer Name, Customer Address, Prod ID, Product Description, Product Finish, Unit Price, Ordered Qty

Full Dependency
Order_ID, Product_ID -> Order_Quantity

Partial Dependencies
Order_ID -> Order_Date, Customer_ID
Product_ID -> Product_Description, Product_Finish, Unit_Price

Transitive Dependencies
Customer_ID -> Customer_Name, Customer_Address
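Each dependency above becomes one relation, obtained by projecting the flat 1NF rows onto the determinant plus its dependents. The sketch below uses the invoice data from the earlier slides; the column names, the `project` helper, and the rejoin check are illustrative assumptions, not part of the source.

```python
COLUMNS = ("order_id", "order_date", "cust_id", "cust_name", "cust_address",
           "prod_id", "prod_desc", "prod_finish", "unit_price", "qty")
rows = [
    (1006, "24 Oct 2019", 22, "Dan's Furniture", "Fullerton, CA", 7, "Tea Table", "Walnut", 450, 2),
    (1006, "24 Oct 2019", 22, "Dan's Furniture", "Fullerton, CA", 5, "TV Stand", "Oak", 300, 6),
    (1006, "24 Oct 2019", 22, "Dan's Furniture", "Fullerton, CA", 4, "Porch Swing", "Pine", 800, 5),
    (1007, "31 Oct 2019", 65, "Furniture Barn", "Ft. Collins, CO", 11, "Table", "Cherry", 1150, 3),
    (1007, "31 Oct 2019", 65, "Furniture Barn", "Ft. Collins, CO", 4, "Porch Swing", "Pine", 800, 7),
]


def project(rows, columns, keep):
    """Relational projection: keep some columns and drop duplicate rows."""
    idx = [columns.index(c) for c in keep]
    return sorted({tuple(r[i] for i in idx) for r in rows})


orders = project(rows, COLUMNS, ("order_id", "order_date", "cust_id"))
customers = project(rows, COLUMNS, ("cust_id", "cust_name", "cust_address"))
products = project(rows, COLUMNS, ("prod_id", "prod_desc", "prod_finish", "unit_price"))
order_lines = project(rows, COLUMNS, ("order_id", "prod_id", "qty"))

print(len(orders), len(customers), len(products), len(order_lines))  # 2 2 4 5

# Joining the four relations on their keys rebuilds the original rows,
# so no information was lost by the decomposition.
rebuilt = sorted(
    (o_id, o_date, c_id, c_name, c_addr, p_id, p_desc, p_fin, price, qty)
    for (o_id, p_id, qty) in order_lines
    for (o_id2, o_date, c_id) in orders if o_id2 == o_id
    for (c_id2, c_name, c_addr) in customers if c_id2 == c_id
    for (p_id2, p_desc, p_fin, price) in products if p_id2 == p_id
)
print(rebuilt == sorted(rows))  # True
```

After the split, the Porch Swing description and price are stored once instead of once per order line.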
Foreign Key – an attribute (or combination of attributes) that provides a link between two tables. It acts as a cross-reference between tables because it references the primary key of another table, thereby establishing the link between them.

In this example, CUST_ID in the ORDER table serves as the foreign key, which links to CUST_ID in the CUSTOMER table.
Surrogate key – used as the primary key to simplify key structures, for example when a composite key is too long, or when the natural primary key is inefficient, i.e. it is too long or cannot be guaranteed to be unique over time (e.g. a name)

Ex.

Flight Number + Flight Date (the flight table example on an earlier slide) could be assigned a Flight Code, which can serve as the primary key instead
Source

Modern Database Management, 11th Ed., Hoffer, Prescott, Topi

Some definitions (anomalies) and examples (functional dependencies) are taken from the net
(a) The table shown in Figure 1 is susceptible to update anomalies. Provide examples of insertion, deletion, and
modification anomalies.
Answers:
This table is not well structured: it is unnormalized and contains redundant data. Using a bottom-up approach, we analyze the given table for anomalies. First, we see multiple values in the appointment column, which violates 1NF. Assuming staffNo and patientNo as candidate keys, many anomalies exist.
Insertion anomalies:
To insert a new patient who makes an appointment with a designated dentist, we need to enter the correct details for the staff member. For example, to insert the details of a new patient (patientNo, patientName and an appointment), we must enter the correct details of the dentist (staffNo, dentistName) so that the patient details are consistent with the values for the designated dentist, for example S1011.
To enter data for a new patient who has no dentist assigned, we would need NULL values in the primary key, which is not allowed.
Deletion anomalies:
If we want to delete a patient, for example Ian MacKay, two records need to be deleted (rows 3 and 4). The anomaly is also obvious when we want to delete a dentistName: multiple records need to be deleted to maintain data integrity. When we delete a dentist record, for example Tony Smith, the details of his patients are also lost from the database.
Modification anomalies:
With redundant data, when we want to change the value of one column for a particular dentist, for example the dentistName, we must update every record in which that dentist is assigned to a patient; otherwise the database becomes inconsistent. We also need to modify the appointment schedules, because different dentists have different schedules.
(b) Describe and illustrate the process of normalizing the table shown in Figure 1 to 3NF. State any assumptions you make
about the data shown in this table.

Assumptions made include that a patient is registered at only one surgery and that he/she may have more than one appointment on a given day. All the schedules are fixed for whole days and the whole week.
For 1NF we remove all the repeating groups (appointment), adding new columns (apptDate and apptTime) and assigning primary keys (candidate keys). Then we work out the functional dependencies (FDs) and represent the table with a dependency diagram, as shown below. (NF stands for Normal Form.)
Note: How to find the FDs is subjective. However, the rule is that they must reflect the real-world situation.
staffNo apptDate apptTime dentistName patientNo patientName surgeryNo

FD1 is already in 2NF. In this case, we can see that FD2 (which depends only on staffNo) and FD4 (which depends only on staffNo and apptDate) violate 2NF. These two FDs depend on part of the candidate key, not the whole key. FD2 can stand on its own by depending on staffNo, and FD4 can stand on its own by depending on staffNo and apptDate.
FD3 violates 3NF: it shows a transitive dependency where surgeryNo and patientName depend on patientNo while patientNo depends on staffNo; that is, a non-key attribute depends on another non-key attribute.
staffNo apptDate apptTime dentistName patientNo patientName surgeryNo

staffNo apptDate apptTime patientNo patientName

staffNo apptDate surgeryNo

staffNo dentistName

For 2NF, the table must already be in 1NF and have no partial dependencies. So we remove FD2 and FD4 by splitting them into new tables, creating foreign keys at the same time.
staffNo apptDate apptTime dentistName patientNo patientName surgeryNo

FK

staffNo apptDate apptTime patientNo

FK
staffNo apptDate surgeryNo

staffNo dentistName

patientNo patientName

Finally, for 3NF we must remove the transitive dependency. In this case we remove FD3 by splitting it into a new table. The transitive dependency that is left is patientName depending on patientNo, so we split this into a new table while creating a foreign key.
staffNo dentistName

Dentist(staffNo, dentistName)

FK
staffNo apptDate surgeryNo

Surgery(staffNo, apptDate, surgeryNo)

patientNo patientName
Patient(patientNo, patientName)

FK
staffNo apptDate apptTime patientNo

Appointment(staffNo, apptDate, apptTime, patientNo)

https://www.javaguicodexample.com/normalizationexerciseanswer.pdf
