Chapter 5


Modern Database Management

Normalization

Two Approaches in Relational Database Design

-- From data modeling (e.g. an ER model) to a relational logical model
for implementation: TOP-DOWN DESIGN

-- Normalization of relations: BOTTOM-UP DESIGN

Today’s lecture looks at bottom-up design.


Normalization of ‘User View’ Relations (Bottom-Up Design)

The Steps

-- Represent all user views (e.g. forms, reports, etc.) as a collection
of relations.

-- Normalize these relations, user view by user view.

-- Combine relations that have exactly the same primary key(s).


Functional Dependencies
A relational table is one way of describing how several attributes
interact, or depend on each other. The simplest kind of dependency is
called a functional dependency (FD).
For example,
LecturerID → LecturerName
is a valid FD because:
• For each LecturerID there is at most one LecturerName, or
• LecturerName is determined by LecturerID, or
• LecturerName is uniquely determined by LecturerID, or
• LecturerName depends on LecturerID.

The above statements are all equivalent.
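
The "at most one LecturerName for each LecturerID" reading of an FD can be checked against sample rows. The sketch below is illustrative only: the helper name fd_holds and the sample data are assumptions, not part of the lecture. Note that data can only refute an FD; an FD is really a rule about the real world, not about one table instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first value seen for this key; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (values are made up for illustration).
lecturers = [
    {"LecturerID": "L1", "LecturerName": "Smith", "SubjectCode": "CS101"},
    {"LecturerID": "L1", "LecturerName": "Smith", "SubjectCode": "CS102"},
    {"LecturerID": "L2", "LecturerName": "Jones", "SubjectCode": "CS103"},
]

name_fd = fd_holds(lecturers, ["LecturerID"], ["LecturerName"])
subject_fd = fd_holds(lecturers, ["LecturerID"], ["SubjectCode"])
print(name_fd)     # True: each LecturerID has one LecturerName
print(subject_fd)  # False: L1 appears with two different subjects
```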


The FD X → Y is a full dependency if no attribute can be removed
from X without the dependency ceasing to hold.
The FD X → Y is a partial dependency if an attribute can be removed
from X and the dependency still holds.
LecturerID, SubjectCode → LecturerName is partial, that is,
LecturerName is partially dependent on LecturerID and
SubjectCode (LecturerID alone already determines LecturerName).
LabDate, SubjectCode → Tutor is full, that is, Tutor is fully
dependent on both LabDate and SubjectCode.
Dependencies can be transitive.
For example, if each subject is taught by one lecturer and each lecturer
works with only one tutor, then we might have the dependencies
SubjectCode → LecturerID
LecturerID → Tutor
and, transitively, SubjectCode → Tutor.
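
Full, partial and transitive dependencies can all be tested with one standard tool, the attribute closure (the set of attributes determined by a given set under a list of FDs). The following is a minimal sketch; the function names and the way FDs are encoded as (left side, right side) tuples are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_partial(lhs, rhs, fds):
    """X -> Y is partial if X minus one attribute still determines Y."""
    return any(set(rhs) <= closure(set(lhs) - {a}, fds) for a in lhs)


# Transitive: SubjectCode -> LecturerID and LecturerID -> Tutor.
subject_fds = [(("SubjectCode",), ("LecturerID",)),
               (("LecturerID",), ("Tutor",))]
tutor_reachable = "Tutor" in closure({"SubjectCode"}, subject_fds)
print(tutor_reachable)  # True, so SubjectCode -> Tutor follows

# Partial: LecturerID alone determines LecturerName.
lecturer_fds = [(("LecturerID",), ("LecturerName",))]
partial = is_partial(("LecturerID", "SubjectCode"), ("LecturerName",), lecturer_fds)
print(partial)  # True

# Full: neither LabDate nor SubjectCode alone determines Tutor.
lab_fds = [(("LabDate", "SubjectCode"), ("Tutor",))]
full = not is_partial(("LabDate", "SubjectCode"), ("Tutor",), lab_fds)
print(full)  # True
```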

Functional dependencies can be used to decide whether a schema is
well designed.
For example, in the following relation:
LecturerSubject (LecturerID, LecturerName, SubjectCode, SubjectName)

Anomalies?:
o If there is a new subject which has not been allocated a
lecturer, can you record the details of this subject in the above
table? (Insert Anomaly)
o If an existing subject changes its title, can you make the change
in one place only? (Update Anomaly)
o If a lecturer resigns and the details are to be deleted, is
there a chance that some subjects will be removed
permanently, leaving no record of those
subjects? (Delete Anomaly)
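
The update and delete anomalies above can be seen on a few rows. This is only a sketch: the rows are hypothetical, and the tuple layout simply follows the attribute order of LecturerSubject.

```python
# Hypothetical rows: (LecturerID, LecturerName, SubjectCode, SubjectName)
lecturer_subject = [
    ("L1", "Smith", "CS101", "Databases"),
    ("L2", "Jones", "CS101", "Databases"),
    ("L2", "Jones", "CS102", "Networks"),
]


def subjects(rows):
    """Distinct (SubjectCode, SubjectName) pairs recorded in the table."""
    return {(code, name) for _, _, code, name in rows}


# Update anomaly: the title of CS101 is stored once per lecturer,
# so renaming the subject means touching every matching row.
rows_to_update = [r for r in lecturer_subject if r[2] == "CS101"]
print(len(rows_to_update))  # 2

# Delete anomaly: removing lecturer L2 also removes the only record of CS102.
after_delete = [r for r in lecturer_subject if r[0] != "L2"]
lost = subjects(lecturer_subject) - subjects(after_delete)
print(lost)  # {('CS102', 'Networks')}
```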


o Design errors in relations, such as the potential for certain
kinds of anomalies, can be categorised.
o These categories of error can be successively eliminated by
decomposing relations into normal forms.
o The main normal forms are first (1NF), second (2NF),
third (3NF), and Boyce-Codd (BCNF). Higher normal forms
include fourth (4NF) and fifth (5NF). Because problems with 4NF
and 5NF rarely occur, and database designers in industry normally
do not need the highest possible normal form for practical reasons,
in this subject we will focus on satisfying 3NF and BCNF.
o These forms are increasingly strict, that is, each one rules out
more kinds of anomaly than the one before. Advanced normal forms
are based on more complex kinds of dependency.


NORMAL FORMS

• First Normal Form (1NF)

-- A relation is in 1NF if:

- there are no repeating groups.
- a unique key has been identified for each relation.
- all attributes are functionally dependent on all or part of the key.

• Second Normal Form (2NF)

-- A relation is in 2NF if:

- the relation is in 1NF
- all non-key attributes are fully functionally dependent on the
entire key (partial dependencies have been removed).


NORMAL FORMS (ctd)

• Third Normal Form (3NF)

-- A relation is in 3NF if:

- the relation is in 2NF
- all transitive dependencies have been removed.
Transitive dependency: a non-key attribute dependent on another
non-key attribute.

• Boyce-Codd Normal Form (BCNF)

-- A relation is in BCNF if:

- the relation is in 3NF
- any remaining anomalies that result from functional dependencies
have been removed (every determinant is a candidate key).


NORMAL FORMS (ctd)

• Fourth Normal Form (4NF)

-- A relation is in 4NF if:

- the relation is in BCNF
- any multivalued dependencies have been removed.

• Fifth Normal Form (5NF)

-- A relation is in 5NF if:

- the relation is in 4NF
- any remaining anomalies that result from join dependencies
have been removed.


FIRST NORMAL FORM

In the standard definition of a relation, each attribute value is
atomic: an attribute value cannot be a set or other compound structure.
Such relations are said to be in first normal form (1NF).
Therefore, any relation that contains non-atomic attribute values
(repeating groups) is said to be in un-normalized form (UNF).

Example
Transform the ORDER form below into BCNF relations.

ORDER FORM
Order #: 5258
Customer # : 32
Customer Name : Computer Training Center
Customer Address: 1, Plenty Road
City-State-Post Code: Bundoora, VICTORIA 3083
Order Date: 20/2/2009

Product #   Description   Quantity   Unit Price
P123        Book Case     4          200
P234        Cabinet       2          150
P345        Table         1          500

Solution
From the ORDER FORM (user view) we can derive the ORDER relation:

-- Currently in UNF (Un-normalized Form)

ORDER (Order#, Customer#, Customer Name, Customer Address,
CityStatePostCode, Order Date, (Product#, Description, Quantity, Unit Price))

The ORDER relation is not in 1NF because there is a repeating group
(Product#, Description, …).

To convert the above relation into 1NF, the repeating group must be removed
by creating a new relation based on the repeating group along with the primary
key of the main relation:

Solution (ctd)
-- 1NF:
ORDER (Order#, Customer#, Customer Name, Customer Address,
CityStatePostCode, Order Date)
ORDER_PRODUCT (Order#, Product#, Description, Quantity, Unit Price)

Anomalies:

Insertion Anomalies: a new product cannot be inserted until there is an order
for that product.

Deletion Anomalies: if an order is deleted, the details of its products may
also be lost (when no other order contains them).

Update Anomalies: if the details of a particular product need to be updated,
every order that contains that product has to be updated.
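
The UNF-to-1NF step for the order form can be sketched as follows. This is an illustration under assumptions: the order form is modelled as a Python dict, and the key name "items" for the repeating group is hypothetical.

```python
# The order form as one nested (un-normalized) record.
order_form = {
    "Order#": 5258,
    "Customer#": 32,
    "CustomerName": "Computer Training Center",
    "CustomerAddress": "1, Plenty Road",
    "CityStatePostCode": "Bundoora, VICTORIA 3083",
    "OrderDate": "20/2/2009",
    "items": [  # the repeating group
        {"Product#": "P123", "Description": "Book Case", "Quantity": 4, "UnitPrice": 200},
        {"Product#": "P234", "Description": "Cabinet", "Quantity": 2, "UnitPrice": 150},
        {"Product#": "P345", "Description": "Table", "Quantity": 1, "UnitPrice": 500},
    ],
}


def to_1nf(form):
    """Split the repeating group into its own relation, carrying Order# along."""
    order = {k: v for k, v in form.items() if k != "items"}
    order_product = [{"Order#": form["Order#"], **item} for item in form["items"]]
    return order, order_product


order, order_product = to_1nf(order_form)
print(sorted(order))        # attribute names of ORDER
print(len(order_product))   # 3 rows, keyed by (Order#, Product#)
```

Every value in both results is now atomic, but Description and UnitPrice still travel with every order line, which is what the anomalies above describe.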

Solution (ctd)
-- 2NF

The ORDER_PRODUCT relation is not in 2NF because not all non-key
attributes are fully dependent on the entire key: Description and
Unit Price depend on Product# alone.

To convert the ORDER_PRODUCT relation into 2NF, a new relation must
be created which consists of the part of the key that acts as a
determinant (it becomes the primary key of the new relation) and all
non-key attributes that depend on that part of the key.

ORDER_PRODUCT (Order#, Product#, Quantity)
PRODUCT (Product#, Description, Unit Price)

The ORDER relation is already in 2NF as there are no non-key attributes
that depend on only part of the key (ORDER has a single-attribute key).
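
The 2NF split is a pair of projections of the 1NF relation. A minimal sketch, assuming rows are stored as tuples in the attribute order shown above; the second order (5259) is hypothetical and is there only to show the duplication disappearing.

```python
# ORDER_PRODUCT in 1NF: (Order#, Product#, Description, Quantity, UnitPrice)
order_product_1nf = [
    (5258, "P123", "Book Case", 4, 200),
    (5258, "P234", "Cabinet", 2, 150),
    (5258, "P345", "Table", 1, 500),
    (5259, "P123", "Book Case", 1, 200),  # hypothetical second order
]

# Keep only what depends on the whole key (Order#, Product#).
order_product = sorted({(o, p, q) for o, p, _, q, _ in order_product_1nf})

# Move what depends on Product# alone into its own relation.
product = sorted({(p, d, u) for _, p, d, _, u in order_product_1nf})

print(len(order_product))  # 4
print(product)  # one row per product, however many orders contain it
```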

Solution (ctd)
-- 2NF

Anomalies (in the ORDER relation)

Insert Anomalies: a new customer cannot be inserted until he/she has an order.

Delete Anomalies: if an order is deleted, the customer's information may
also be lost (when it is the customer's only order).

Update Anomalies: if a customer's details are to be updated, all orders for
that customer need to be updated (the change cannot be made in one place).

-- 3NF

The ORDER relation is not in 3NF because there is a transitive dependency (a
non-key attribute dependent on another non-key attribute): the customer
details depend on Customer#.

To convert the relation into 3NF, a new relation must be created for the non-key
attributes that are dependent on another non-key attribute.

ORDER (Order#, Customer#, Order Date)
CUSTOMER (Customer#, Customer Name, Customer Address,
CityStatePostCode)

Both the ORDER_PRODUCT and the PRODUCT relations are already in 3NF.

ORDER_PRODUCT (Order#, Product#, Quantity)
PRODUCT (Product#, Description, Unit Price)
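
The 3NF step can be sketched by looking for FDs whose determinant is a non-key attribute, following the informal definition used in these slides. The FDs listed below are assumptions read off the order form, and split_transitive is a hypothetical helper, not a general 3NF synthesis algorithm.

```python
order_attrs = ["Order#", "Customer#", "CustomerName", "CustomerAddress",
               "CityStatePostCode", "OrderDate"]
key = {"Order#"}

# Assumed FDs for the ORDER relation.
fds = [
    (("Order#",), ("Customer#", "OrderDate")),
    (("Customer#",), ("CustomerName", "CustomerAddress", "CityStatePostCode")),
]


def split_transitive(attrs, key, fds):
    """Move attributes determined by a non-key attribute into new relations."""
    remaining = list(attrs)
    new_relations = []
    for lhs, rhs in fds:
        if set(lhs).isdisjoint(key):  # the determinant is a non-key attribute
            new_relations.append(list(lhs) + list(rhs))
            remaining = [a for a in remaining if a not in rhs]
    return remaining, new_relations


order, extracted = split_transitive(order_attrs, key, fds)
print(order)      # ['Order#', 'Customer#', 'OrderDate']
print(extracted)  # [['Customer#', 'CustomerName', 'CustomerAddress', 'CityStatePostCode']]
```

Customer# stays behind in ORDER as the link to the new CUSTOMER relation.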

Final Relations in 3NF and BCNF:

ORDER (Order#, Customer#, OrderDate)
CUSTOMER (Customer#, CustomerName,
CustomerAddress, CityStatePostCode)
ORDER_PRODUCT (Order#, Product#, Quantity)
PRODUCT (Product#, Description, UnitPrice)

Note that a relation in 3NF is only rarely not also in 4NF or 5NF.
Furthermore, 4NF and 5NF violations are harder to identify.
Hence current database design practice often ignores them.
Most relations that are in 3NF are also in BCNF.

The above relations are all in BCNF already.
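
One way to try out the final relations is with Python's built-in sqlite3 module. This is a sketch under assumptions: the table and column names are adapted (ORDER is a reserved word in SQL, so the table is called orders here), the column types are guesses, the date is stored in ISO format, and the product P456 is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_no         INTEGER PRIMARY KEY,
    customer_name       TEXT NOT NULL,
    customer_address    TEXT,
    city_state_postcode TEXT
);
CREATE TABLE orders (
    order_no    INTEGER PRIMARY KEY,
    customer_no INTEGER NOT NULL REFERENCES customer(customer_no),
    order_date  TEXT
);
CREATE TABLE product (
    product_no  TEXT PRIMARY KEY,
    description TEXT,
    unit_price  INTEGER
);
CREATE TABLE order_product (
    order_no   INTEGER REFERENCES orders(order_no),
    product_no TEXT REFERENCES product(product_no),
    quantity   INTEGER,
    PRIMARY KEY (order_no, product_no)
);
""")

conn.execute("INSERT INTO customer VALUES (32, 'Computer Training Center', "
             "'1, Plenty Road', 'Bundoora, VICTORIA 3083')")
conn.execute("INSERT INTO orders VALUES (5258, 32, '2009-02-20')")
conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [("P123", "Book Case", 200), ("P234", "Cabinet", 150),
                  ("P345", "Table", 500)])
conn.executemany("INSERT INTO order_product VALUES (?, ?, ?)",
                 [(5258, "P123", 4), (5258, "P234", 2), (5258, "P345", 1)])

# No insertion anomaly: a product can exist before anyone orders it.
conn.execute("INSERT INTO product VALUES ('P456', 'Chair', 80)")  # hypothetical

# Rebuild the line items of the original order form with a join.
rows = conn.execute("""
    SELECT p.product_no, p.description, op.quantity, p.unit_price
    FROM order_product AS op
    JOIN product AS p ON op.product_no = p.product_no
    WHERE op.order_no = 5258
    ORDER BY p.product_no
""").fetchall()
print(rows)
```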

In the following slides, we will go through some examples where
3NF relations may not be in BCNF.
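BCNF (Boyce-Codd Normal Form) – an example

[ER diagram: CUSTOMER (CustomerNo) and BRANCH (BranchNo) each participate
M (1:N) in the relationship Branch-Customer, which carries the attributes
VisitingFrequency and DateRelationshipEstablished. SALESPERSON
(SalespersonNo) participates in the same relationship as 1 (0:N).]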
BCNF (Boyce Codd Normal Form)

BRANCH-CUSTOMER
(CustomerNo, BranchNo, SalespersonNo, VisitingFrequency,
DateRelationshipEstablished)

1. The table enforces the rule that each branch will serve a
customer through only one salesperson, as there is only one
SalespersonNo for each combination of CustomerNo and
BranchNo.
2. The table is in 3NF (there are no repeating groups and no partial
or transitive dependencies). However, if each salesperson works for
one branch only, the table still has some normalization
problems. The fact that a particular salesperson belongs to a
particular branch can appear in more than one row. In fact, it
will appear in every row for that salesperson.

The underlying reason for the normalization problems is that there is a
dependency between SalespersonNo and BranchNo (SalespersonNo
is a determinant of BranchNo). In other words, a non-key attribute
determines part of the key (violating the BCNF rule).
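
The BCNF rule (every determinant must be a candidate key, tested here as "every determinant is a superkey") can be checked mechanically with an attribute closure. A minimal sketch: the helper names are hypothetical, and the two FDs are the ones stated in the text above.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the non-trivial FDs whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)                    # skip trivial FDs
            and not set(relation) <= closure(lhs, fds)]    # lhs is not a superkey


branch_customer = ["CustomerNo", "BranchNo", "SalespersonNo",
                   "VisitingFrequency", "DateRelationshipEstablished"]
fds = [
    (("CustomerNo", "BranchNo"),
     ("SalespersonNo", "VisitingFrequency", "DateRelationshipEstablished")),
    (("SalespersonNo",), ("BranchNo",)),
]

violations = bcnf_violations(branch_customer, fds)
print(violations)  # [(('SalespersonNo',), ('BranchNo',))]
```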
BCNF (Boyce Codd Normal Form)

Problem 1:
SALESPERSON (SalespersonNo, BranchNo)

Although we can now record which branch a salesperson belongs
to, we cannot take anything out of the original table. We would
like to remove BranchNo, but that would mean destroying the
key.

Problem 2:
CUSTOMER-SALESPERSON (CustomerNo, SalespersonNo,
VisitingFrequency, DateRelationshipEstablished, BranchNo)

This table is not even in 2NF: BranchNo depends on SalespersonNo
alone, which is only part of the key.

Solution:
CUSTOMER-SALESPERSON (CustomerNo, SalespersonNo,
VisitingFrequency, DateRelationshipEstablished)
SALESPERSON (SalespersonNo, BranchNo)
BCNF (Boyce Codd Normal Form)

BCNF:
o In the original table we enforced the rule that a given customer was
only served by one salesperson from each branch.

o Our new model no longer enforces that rule. It is now possible for a
customer to be supported by several salespersons from the same
branch.

o Generic Format:
3NF but not BCNF: R1 (A, B, C) where C may determine B
Converted to BCNF: R11 (A, C) and R12 (C, B)
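
The generic decomposition can be tried on sample data. In this sketch A, B and C stand for CustomerNo, BranchNo and SalespersonNo; the values are hypothetical. It shows that joining R11 and R12 gives back R1, and that the new model no longer stops a customer from having two salespersons at the same branch.

```python
# R1(A, B, C) with C -> B holding in the sample data.
r1 = {
    ("c1", "b1", "s1"),
    ("c1", "b2", "s3"),
    ("c2", "b1", "s2"),
}

r11 = {(a, c) for a, _, c in r1}  # R11(A, C)
r12 = {(c, b) for _, b, c in r1}  # R12(C, B)


def join(r11, r12):
    """Natural join of R11 and R12 on C, returned in (A, B, C) order."""
    return {(a, b, c) for a, c in r11 for c2, b in r12 if c == c2}


lossless = join(r11, r12) == r1
print(lossless)  # True: no information was lost by the split

# Nothing in R11 or R12 stops c1 from also being served by s2,
# who works at the same branch (b1) as s1.
rejoined = join(r11 | {("c1", "s2")}, r12)
pairs = [(a, b) for a, b, _ in rejoined]
rule_broken = len(pairs) != len(set(pairs))
print(rule_broken)  # True: (c1, b1) now has two salespersons
```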
BCNF (Boyce Codd Normal Form)

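[ER diagram of the revised model: entities CUSTOMER (CustomerNo), BRANCH
(BranchNo) and SALESPERSON (SalespersonNo). The relationship
Customer-Salesperson carries VisitingFrequency and
DateRelationshipEstablished; the relationship "employ" links BRANCH and
SALESPERSON. Cardinality labels in the original: M (1:N), 1 (1:N),
M (1:N), M (1:1).]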
Next Lecture …

Data Manipulation using Relational Algebra

Reading: Chapter 6, Elmasri & Navathe
