Normalization
Objective
Normalization presents a set of rules that tables
and databases must follow to be well structured.
Historically presented as a sequence of normal
forms
First Normal Form
A table is in the first normal form iff
The domain of each attribute contains only
atomic values, and
The value of each attribute contains only a
single value from that domain.
Flight Weekdays
UA59 Mo We Fr
UA73 Mo Tu We Th Fr
1NF Solution
Flight Weekday
UA59 Mo
UA59 We
UA59 Fr
UA73 Mo
UA73 We
… …
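The 1NF conversion above can be sketched in a few lines of Python. This is only an illustration: the `flights` list and the function name `to_first_normal_form` are assumptions, and the weekday list is assumed to be space-separated as in the table.

```python
# Hypothetical rows mirroring the Flight / Weekdays table above.
flights = [
    ("UA59", "Mo We Fr"),
    ("UA73", "Mo Tu We Th Fr"),
]

def to_first_normal_form(rows):
    """Split each space-separated weekday list into one (flight, weekday) row."""
    return [(flight, day) for flight, days in rows for day in days.split()]

flat = to_first_normal_form(flights)
for flight, day in flat:
    print(flight, day)
```

Each output row now holds a single atomic weekday, at the cost of repeating the flight number.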
Implication for the ER model
Watch for entities that can have multiple values
for the same attribute
Phone numbers, …
What about course schedules?
MW 5:30-7:00pm
Can treat them as atomic time slots
Functional dependency
Let X and Y be sets of attributes in a table T
Y is functionally dependent on X in T iff for
each set x ∈ T.X there is precisely one
corresponding set y ∈ T.Y
Y is fully functionally dependent on X in T if Y is
functionally dependent on X and Y is not
functionally dependent on any proper subset of X
Example
Book table
BookNo Title Author Year
B1 Moby Dick H. Melville 1851
B2 Lincoln G. Vidal 1984
Address attribute is
functionally dependent on the pair
{ BookNo, Patron}
fully functionally dependent on Patron
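A minimal sketch of how the two definitions could be tested against the rows of a table. The `loans` rows and the helper names `holds` and `fully_dependent` are hypothetical. Note that a check on one table instance can only refute a dependency; it cannot prove that the dependency holds for every legal instance.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by any pair of rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True

def fully_dependent(rows, lhs, rhs):
    """Return True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(len(lhs)):  # proper subsets only, including the empty set
        for subset in combinations(lhs, size):
            if holds(rows, list(subset), rhs):
                return False
    return True

# Hypothetical loan records (not taken from the slides).
loans = [
    {"BookNo": "B1", "Patron": "J. Fisher", "Address": "101 Main Street"},
    {"BookNo": "B2", "Patron": "J. Fisher", "Address": "101 Main Street"},
    {"BookNo": "B3", "Patron": "L. Perez", "Address": "202 Market Street"},
]

print(holds(loans, ["BookNo", "Patron"], ["Address"]))            # True
print(fully_dependent(loans, ["BookNo", "Patron"], ["Address"]))  # False
print(fully_dependent(loans, ["Patron"], ["Address"]))            # True
```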
Problems
Cannot insert new patrons in the system until they
have borrowed books
Insertion anomaly
Must update all rows involving a given patron if he or
she moves.
Update anomaly
Will lose information about patrons that have returned
all the books they have borrowed
Deletion anomaly
Armstrong inference rules (1974)
Axioms:
Reflexivity: if Y ⊆ X, then X→Y
Augmentation: if X→Y, then WX→WY
Transitivity: if X→Y and Y→Z, then X→Z
Derived Rules:
Union: if X→Y and X→Z, then X→YZ
Decomposition: if X→YZ, then X→Y and X→Z
Pseudotransitivity: if X→Y and WY→Z, then XW→Z
Armstrong inference rules (1974)
Axioms are both
Sound:
when applied to a set of functional
dependencies they only produce dependencies
that belong to the closure of
that set
Complete:
can produce all dependencies that belong
to the closure of the set
Armstrong inference rules (1974)
The last three rules can be derived from the first
three (the axioms)
Let us look at the union rule:
if X→Y and X→Z, then X→YZ
Using the first three axioms, we have:
if X→Y, then XX→XY same as X→XY (2nd)
if X→Z, then YX→YZ same as XY→YZ (2nd)
if X→XY and XY→YZ, then X→YZ (3rd)
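In practice, whether a dependency X→Y follows from a given set is usually checked by computing the attribute closure of X, which applies the axioms repeatedly until nothing new is added. The sketch below is one common way to do this; the function name `closure` and the representation of dependencies as pairs of Python sets are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies in fds.

    fds is a list of (lhs, rhs) pairs, each a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have all of lhs, transitivity lets us add rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Union rule: from X->Y and X->Z we expect X->YZ.
union_fds = [({"X"}, {"Y"}), ({"X"}, {"Z"})]
print({"Y", "Z"} <= closure({"X"}, union_fds))  # True

# Pseudotransitivity: from X->Y and WY->Z we expect XW->Z.
pseudo_fds = [({"X"}, {"Y"}), ({"W", "Y"}, {"Z"})]
print("Z" in closure({"X", "W"}, pseudo_fds))   # True
```

X→Y follows from the set exactly when Y is contained in the closure of X.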
Second Normal Form
A table is in 2NF iff
It is in 1NF and
no non-prime attribute is functionally dependent on any
proper subset of any candidate key of the table.
A non-prime attribute of a table is an attribute that
is not a part of any candidate key of the table
A candidate key is a minimal superkey
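The definitions above can be turned into a brute-force check. The following is a sketch for small schemas only, since it enumerates every attribute subset; the function names and the sample schema (BookNo, Patron, Address with Patron→Address) are assumptions for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            # A superset of a key already found is a superkey but not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

def partial_dependencies(attributes, fds):
    """List (subset, non_prime_attrs) pairs that violate 2NF."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    violations = []
    for key in keys:
        for size in range(1, len(key)):  # proper, non-empty subsets of the key
            for combo in combinations(sorted(key), size):
                subset = set(combo)
                dependent = closure(subset, fds) - subset - prime
                if dependent:
                    violations.append((subset, dependent))
    return violations

# Hypothetical schema: a loan table where each patron has one address.
attributes = {"BookNo", "Patron", "Address"}
fds = [({"Patron"}, {"Address"})]

print(candidate_keys(attributes, fds))
print(partial_dependencies(attributes, fds))
```

An empty list from `partial_dependencies` would mean that a 1NF table with these dependencies is also in 2NF.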
Example
Library allows patrons to request books that are
currently out
Update anomalies
Deletion anomalies
2NF Solution
Put telephone number in separate Patron table
Patron Address
J. Fisher 101 Main Street
L. Perez 202 Market Street
Another example
Tournament winners
We can assume
Manager → Branch
{Project, Branch} → Manager
Example
Manager Project Branch
Alice Alpha Austin
Bob Delta Houston
Carol Alpha Houston
Alice Delta Austin
Restaurant Pizza
Pizza Milano Thin crust
Pizza Milano Thick crust
Pizza Firenze Thin crust
Pizza Firenze Thick crust
Join dependency
A table T is subject to a join dependency if it
can always be recreated by joining multiple
tables each having a subset of the attributes of T
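As a rough illustration, the sketch below checks one table instance against a three-way join. The Product column and all rows are invented for the example, and a single instance can only show that the join dependency is not violated, not that it always holds.

```python
# Hypothetical instance of T(Store, Brand, Product); the products are invented.
T = {
    ("Circuit City", "Apple", "Laptop"),
    ("Circuit City", "Apple", "Tablet"),
    ("Circuit City", "Toshiba", "Laptop"),
    ("CompUSA", "Apple", "Laptop"),
}

# The three binary projections of T.
store_brand = {(s, b) for s, b, p in T}
brand_product = {(b, p) for s, b, p in T}
store_product = {(s, p) for s, b, p in T}

# Joining only two projections on Brand produces a spurious row.
two_way = {
    (s, b, p)
    for s, b in store_brand
    for b2, p in brand_product
    if b == b2
}

# Joining the third projection keeps a row only if its (Store, Product) pair exists.
three_way = {row for row in two_way if (row[0], row[2]) in store_product}

print(two_way == T)    # False
print(three_way == T)  # True
```

Here no pair of projections is enough to rebuild T, but all three together are.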
Store Brand
Circuit City Apple
Circuit City Toshiba
CompUSA Apple
Conclusion
The first "big" table was 5NF
The second table was decomposable
Lossless Decomposition
General Concept
If R(A, B, C) satisfies A→B
We can project it on A,B and A,C
without losing information
Lossless decomposition
R = π_AB(R) ⋈ π_AC(R)
π_AB(R) is the projection of R on AB
⋈ is the natural join operator
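The claim can be tried out on small instances. The sketch below represents R as a set of (a, b, c) tuples; the function name `is_lossless` and both sample relations are assumptions for illustration.

```python
def is_lossless(r):
    """Check whether r equals the natural join of its projections on AB and AC."""
    ab = {(a, b) for a, b, c in r}
    ac = {(a, c) for a, b, c in r}
    # Natural join on the shared attribute A.
    joined = {(a, b, c) for a, b in ab for a2, c in ac if a == a2}
    return joined == r

# A -> B holds: each value of A has a single value of B.
with_fd = {(1, "x", 10), (1, "x", 20), (2, "y", 10)}

# A -> B is violated: A = 1 appears with both "x" and "y".
without_fd = {(1, "x", 10), (1, "y", 20)}

print(is_lossless(with_fd))     # True
print(is_lossless(without_fd))  # False
```

In the second case the join produces spurious tuples such as (1, "x", 20), so the original relation cannot be recovered.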
Example
π_{Course, Text}(R)
Course Text
4330 none
3330 Patterson & Hennessy
π_{Course, Instructor}(R)
Course Instructor
4330 Paris
4330 Cheng
3330 Hillford
A different case
π_{Course, Instructor}(R)
Course Instructor
4330 Paris
4330 Cheng
3330 Hillford
An Example
Normalisation Example
We have a table representing orders in
an online store
Each row represents an item on a
particular order
Primary key is {Order, Product}
Columns
Order
Product
Quantity
UnitPrice
Customer
Address
Functional Dependencies
Each order is for a single customer:
Order → Customer
Each customer has a single address
Customer → Address
Each product has a single price
Product → UnitPrice
As Order → Customer and Customer → Address
Order → Address
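These dependencies can be checked with an attribute-closure computation. The sketch below is illustrative only: the `closure` helper is an assumption, and the dependency {Order, Product} → Quantity is not listed above but is implied by {Order, Product} being the primary key.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [
    ({"Order"}, {"Customer"}),
    ({"Customer"}, {"Address"}),
    ({"Product"}, {"UnitPrice"}),
    ({"Order", "Product"}, {"Quantity"}),  # implied by the primary key
]

# Order determines Address through Customer (transitivity).
print(sorted(closure({"Order"}, fds)))
# → ['Address', 'Customer', 'Order']

# {Order, Product} determines every column, so it is a superkey.
print(sorted(closure({"Order", "Product"}, fds)))
# → ['Address', 'Customer', 'Order', 'Product', 'Quantity', 'UnitPrice']
```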
2NF Solution (I)
First decomposition
First table
Second table
Order Customer Address
2NF Solution (II)
Second decomposition
First table
Customer Address
Split second table into
Order Customer
Customer Address
Normalisation to 2NF
Second normal form means no partial
dependencies on candidate keys
{Order} → {Customer, Address}
{Product} → {UnitPrice}
To remove the first FD we project over
{Order, Customer, Address} (R1)
and
{Order, Product, Quantity, UnitPrice} (R2)
Normalisation to 2NF
R1 is now in 2NF, but there is still a
partial FD in R2
{Product} → {UnitPrice}
To remove this we project over
{Product, UnitPrice} (R3)
and
{Order, Product, Quantity} (R4)
Normalisation to 3NF
R has now been split into 3 relations: R1, R3, and R4
R3 and R4 are in 3NF
R1 has a transitive FD on its key
{Order} → {Customer} → {Address}
To remove it we project R1 over
{Order, Customer}
{Customer, Address}
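To see that nothing is lost along the way, the sketch below projects a small sample of the order table onto the four resulting relations and joins them back together. The sample rows and the helper names `project`, `natural_join`, and `as_set` are hypothetical.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, removing duplicates."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]

def natural_join(left, right):
    """Join two lists of dict rows on every attribute they share."""
    joined = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[a] == r[a] for a in shared):
                joined.append({**l, **r})
    return joined

def as_set(rows):
    """Order-independent form of a relation, used for comparison."""
    return {frozenset(row.items()) for row in rows}

# Hypothetical order lines (not taken from the slides).
orders = [
    {"Order": 1, "Product": "Pen", "Quantity": 2, "UnitPrice": 1.5,
     "Customer": "Ann", "Address": "1 Elm St"},
    {"Order": 1, "Product": "Ink", "Quantity": 1, "UnitPrice": 4.0,
     "Customer": "Ann", "Address": "1 Elm St"},
    {"Order": 2, "Product": "Pen", "Quantity": 5, "UnitPrice": 1.5,
     "Customer": "Bob", "Address": "9 Oak Ave"},
]

order_customer = project(orders, ["Order", "Customer"])
customer_address = project(orders, ["Customer", "Address"])
r3 = project(orders, ["Product", "UnitPrice"])
r4 = project(orders, ["Order", "Product", "Quantity"])

# Rebuild the original table by joining the four projections.
rebuilt = natural_join(natural_join(natural_join(r4, r3), order_customer),
                       customer_address)
print(as_set(rebuilt) == as_set(orders))  # True
```

The address and unit price are now stored once per customer and once per product, which is what removes the update anomalies.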
Normalisation
1NF:
{Order, Product, Customer, Address, Quantity,
UnitPrice}
2NF:
{Order, Customer, Address}, {Product, UnitPrice},
and {Order, Product, Quantity}
3NF:
{Product, UnitPrice}, {Order, Product, Quantity},