DB Design - Normal Forms

This document discusses normalizing a database schema to reduce redundancy. It begins by providing an example of an address table with redundancy from a functional dependency between postal code and city/province. This redundancy can cause update, insertion and deletion anomalies. The document then suggests splitting the table into two tables to remove redundancy and avoid these anomalies. The new schema separates address information from the mapping of postal codes to locations.

Uploaded by

John Gong


Unit 4: Schema Refinement and Normal Forms

Readings:
3rd edition: Chapter 19, sections 19.1-19.6 (except 19.5.2), or
2nd edition: Chapter 15, sections 15.1-15.7

In Databases so far
What's great about databases?
How to create a conceptual design using ER diagrams
How to create a logical design by turning the ER diagrams into a relational schema, including minimizing the data and relations created
Now showing
Are we done (with the logical design)?
How to refine that schema to reduce duplication of information

Learning Goals
Discuss pros and cons of redundancy in a database.
Provide examples of update, insertion, and deletion anomalies.
Given a set of tables and a set of functional dependencies over
them, determine all the keys for the tables.
Show that a table is/isn't in 3NF or BCNF.
Prove/disprove that a given table decomposition is a lossless
join decomposition. Justify why lossless join decompositions
are preferred decompositions.
Decompose a table into a set of tables that are in 3NF, or
BCNF.
Explain FD-preserving decompositions and why they are
desirable.

Don't forget about your project milestones & deadlines!
http://www.cs.ubc.ca/~laks/cpsc304/project.html
You should have handed in your proposal (milestone #1) and are hopefully working on turning your specification into an ER diagram and then into tables (milestone #2).
Initially, you should be able to identify primary keys and foreign keys.
See if you can identify additional constraints: FDs and (candidate) keys from those FDs.
Normalize the tables into 3NF or BCNF.
Formal spec of your application (milestone #3).
Completed project (milestone #4).
Sign up for your demo once your TA notifies you about demo scheduling.

Consider the following entity set for mailing addresses at UBC:

[ER diagram: entity set Mailing address with attributes Name, Department, Address]

Meets all the criteria that we have for an entity
There is nothing wrong with this entity

What would an instance look like?

Name                   Department         Mailing Location
Ed Knorr               Computer Science   201-2366 Main Mall
Raymond Ng             Computer Science   201-2366 Main Mall
Laks V.S. Lakshmanan   Computer Science   201-2366 Main Mall
Meghan Allan           Computer Science   201-2366 Main Mall
Joel Friedman          Computer Science   201-2366 Main Mall
Joel Friedman          Math               121-1984 Mathematics Rd
Brian Marcus           Math               121-1984 Mathematics Rd

Problems?
1. space
2. typos
3. changes (e.g., departments move, or change names)

Okay, that's bad. But how do I know if a department has just one address?
Databases allow you to say that one attribute determines another through a functional dependency (FD).
So if Department determines MailingLocation but not Name, we say that there's a functional dependency from Department to MailingLocation. But Department is NOT a key.
We write Department → MailingLocation to say each dept has at most one mailing location.
Such statements are integrity constraints (ICs) called functional dependencies (FDs).

Another example
Address(House#, Street, City, Province, PostalCode).
PostalCode determines City and Province, but is NOT a key either.
That is, PostalCode → {City, Province}.
Notational note: we normally omit the set brackets and write instead PostalCode → City, Province.

[ER diagram: entity set Address with attributes House #, Street, City, Province, Postal code]

Functional Dependencies (FDs), technically speaking
A functional dependency X → Y (where X & Y are sets of attributes) holds if, for every legal instance, for all tuples t1, t2:
if t1.X = t2.X then t1.Y = t2.Y

Example:
PostalCode → City, Province holds provided:
for each possible t1, t2,
if t1.PostalCode = t2.PostalCode then
(t1.{City,Province} = t2.{City,Province})

What are functional dependencies saying, exactly?
An FD X → Y holds for a relation r provided that, given any two tuples in r, if the X values agree, then the Y values also agree.
It can also be read as "X determines Y", "X arrow Y", "X implies Y", or "X controls Y". Pick a version that sounds intuitive to you.

FDs made precise

You've already seen a special case of FDs: key constraints!
The FD Department → MailingLocation is supposed to hold for the relation mailingAddress(Name, Department, MailingLocation).
In Datalog-like notation, this means

mailingAddress(N, D, M),
mailingAddress(N', D, M')
→ M = M'.

Better yet, in simplified Datalog-like notation

mailingAddress(_, D, M),
mailingAddress(_, D, M')
→ M = M'.

IMPORTANT: Read the two don't-cares (i.e., _) as "whatever"; they need not be equal; we don't constrain them or even think about them!

One more example

The FD PostalCode → Province, City, which is supposed to hold for
address(House#, Street, City, Province, PostalCode), says:

address(H, S, C, P, PC),
address(H', S', C', P', PC)
→ C = C'.

address(H, S, C, P, PC),
address(H', S', C', P', PC)
→ P = P'.

And its simplified version

address(_, _, C, _, PC),
address(_, _, C', _, PC)
→ C = C'.

address(_, _, _, P, PC),
address(_, _, _, P', PC)
→ P = P'.

Let's see some more instances

House #   Street          City        Province   Postal Code
101       Main Street     Vancouver   BC         V6A 2S5
103       Main Street     Vancouver   BC         V6A 2S5
101       Cambie Street   Vancouver   BC         V6B 4R3
103       Cambie Street   Vancouver   BC         V6B 4R3
101       Main Street     Delta       BC         V4C 2N1
103       Main Street     Delta       BC         V4C 2N1

Note: House#, Street, PostalCode is a key.
FD: It looks like maybe City → Province, but there's a Victoria in BC, Newfoundland, and Ontario & a Delta in Ontario.
Moral: can't tell from instances alone whether an FD necessarily holds.

Which functional dependencies, again?
An FD is a statement about all allowable instances.
It must be identified from application semantics, at design time.
PostalCode → Street? Department → MailingLocation?
Given an instance r of R, we can check if r violates some FD f, but we cannot tell if f holds over R, i.e., whether f holds for all allowable instances of R!
Note: r denotes an instance and R denotes a schema.
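
The "check if r violates some FD" step can be sketched in a few lines of Python. This is only an illustration: the function name `find_fd_violation`, the lowercase column names, and the "BC" province values are assumptions made for the example, not part of the slides.

```python
def find_fd_violation(rows, lhs, rhs):
    """Return two rows that agree on lhs but differ on rhs, or None if there are none."""
    first_seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if x not in first_seen:
            first_seen[x] = (y, row)
        elif first_seen[x][0] != y:
            return first_seen[x][1], row
    return None

# The Address instance from the slides (province values assumed to be BC).
COLUMNS = ("house", "street", "city", "province", "postal")
address = [dict(zip(COLUMNS, t)) for t in [
    (101, "Main Street", "Vancouver", "BC", "V6A 2S5"),
    (103, "Main Street", "Vancouver", "BC", "V6A 2S5"),
    (101, "Cambie Street", "Vancouver", "BC", "V6B 4R3"),
    (103, "Cambie Street", "Vancouver", "BC", "V6B 4R3"),
    (101, "Main Street", "Delta", "BC", "V4C 2N1"),
    (103, "Main Street", "Delta", "BC", "V4C 2N1"),
]]

# This instance does not violate postal -> city, province.
print(find_fd_violation(address, ["postal"], ["city", "province"]))
# It does violate street -> city (Main Street exists in two cities).
print(find_fd_violation(address, ["street"], ["city"]))
# It does not violate city -> province either, yet that FD is not one we want to
# assert for the schema: an instance can refute an FD, never establish it.
print(find_fd_violation(address, ["city"], ["province"]))
```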

Which functional dependencies, again?
We'll concentrate on cases where there's a single attribute on the RHS (e.g., PostalCode → Province).
There are boring, trivial cases:
e.g., PostalCode, House# → PostalCode
more generally, X → Y where Y is a subset of X
Our focus: the non-boring ones, since trivial FDs hold in every relation and don't tell us anything worthwhile.

Naming the Evils of Redundancy

Let's consider PostalCode → City, Province

House #   Street          City        Province   Postal Code
101       Main Street     Vancouver   BC         V6A 2S5
103       Main Street     Vancouver   BC         V6A 2S5
101       Cambie Street   Vancouver   BC         V6B 4R3
103       Cambie Street   Vancouver   BC         V6B 4R3
101       Main Street     Delta       BC         V4C 2N1
103       Main Street     Delta       BC         V4C 2N1

Redundancy: the city and province info for each postal code is repeated once for each house!

Naming the Evils of Redundancy

Let's consider PostalCode → City, Province in the same Address instance.
Update anomaly: How easily can we change Delta's province?
We must change it once for each house in Delta!

Naming the Evils of Redundancy

Let's consider PostalCode → City, Province in the same Address instance.
Insertion anomaly: What if we want to insert that V6T 1Z4 is in Vancouver, BC?
We can't do that now without a street and a house#!

Naming the Evils of Redundancy

Let's consider PostalCode → City, Province

House #   Street          City        Province   Postal Code
101       Main Street     Vancouver   BC         V6A 2S5
103       Main Street     Vancouver   BC         V6A 2S5
101       Cambie Street   Vancouver   BC         V6B 4R3
103       Cambie Street   Vancouver   BC         V6B 4R3
101       Main Street     Delta       BC         V4C 2N1
103       Main Street     Delta       BC         V4C 2N1
NULL      NULL            Vancouver   BC         V6T 1Z4

We can force-fit using NULL values.
But NULL values are problematic. We will discuss those problems when discussing SQL. We want to minimize the occurrence of NULLs in a database.

Naming the Evils of Redundancy

Let's consider PostalCode → City, Province in the same Address instance (without the NULL row).
Deletion anomaly: If we delete all addresses with postal code V6A 2S5, we lose the info that V6A 2S5 is in Vancouver!
Can we do better?
E.g., what about splitting the relation?

Maybe we should split up our relation?

House #

Street

City

Province Postal
Code

101

Main Street

Vancouver

103

Main Street

Vancouver

V6A 2S5

101

Cambie Street

Vancouver

V6B 4R3

103

Cambie Street

Vancouver

Delta

V4C 2N1

101

Main Street

Delta

103

Main Street

Delta

Is this DB equivalent to what we started with?
How can you tell?
Did we gain / lose info?
That didn't work so well!

One more try

What if we tried

House #   Street          Postal Code
101       Main Street     V6A 2S5
103       Main Street     V6A 2S5
101       Cambie Street   V6B 4R3
103       Cambie Street   V6B 4R3
101       Main Street     V4C 2N1
103       Main Street     V4C 2N1

City        Province   Postal Code
Vancouver   BC         V6A 2S5
Vancouver   BC         V6B 4R3
Delta       BC         V4C 2N1

Did we lose anything?
Are our problems fixed?
Okay, that worked pretty well.
It would be nice to understand why it worked!
It would be even better to understand when it would work.

What do we need to know to split apart addresses without losing information?
FDs tell us when we're storing redundant information
Reducing redundancy helps eliminate anomalies and save storage space
We'd like to split apart tables without losing information

Suppose a schema R(A,B,C,D) is not known to satisfy any FDs. Can we split R in a lossless way?
<in-class exercise.>

Suppose a schema R(A,B,C,D) does satisfy some FDs. Will any split of R be a lossless split?
<in-class exercise.>

What do we need to know to split apart addresses without losing information?
But first, we need to know:
what FDs are explicit (given) and
what FDs are implicit (can be derived)
Among other things, this can help us derive additional keys from the given FDs (spare keys are handy in databases, just as in real life; we'll see why shortly)

What happened so far?

Redundancy is bad and leads to several problems.
To minimize redundancy, split a table.
Some splits may cause us to lose info!
We're hoping FDs will guide us to good splits w/o losing info.
Before we can do that, we need to know how to derive new FDs from given FDs.

<Clicker Break 1>

The key's the key!

As a reminder, a key is a minimal set of attributes that uniquely identifies tuples in a relation
i.e., a key is a minimal set of attributes that functionally determines all the attributes
e.g., House#, Street, PostalCode is a key
A superkey for a relation uniquely identifies tuples in the relation, but does not have to be minimal
i.e., key ⊆ superkey (every superkey contains a key)
E.g.:
House#, Street, PostalCode is a key and a superkey
House#, Street, PostalCode, Province is a superkey, but not a key

Notational (Review) Clinic

We write sets of attributes {A, B, C} as ABC.
In the case of real attributes we write House#, Street instead of {House#, Street}.
Instead of X ∪ Y, we simply write XY, where X and Y are sets of attributes.

Deriving Additional FDs: the basics

William W. Armstrong.
Canadian, eh?

Given some FDs, we can often infer additional FDs:
e.g., {sid → phone, phone → acode} implies sid → acode.
An FD f is implied by a set of FDs F if f holds whenever all FDs in F hold.
(Consequence) closure of F: the set of all FDs implied by F.

Armstrong's Axioms (X, Y, Z are sets of attributes):
Reflexivity: If Y ⊆ X, then X → Y
e.g., city, major → city

Augmentation: If X → Y, then XZ → YZ for any Z
e.g., if sid → city, then sid, major → city, major

Transitivity: If X → Y and Y → Z, then X → Z
e.g., sid → phone, phone → acode implies sid → acode
These three are sound and complete inference rules for FDs.

Why do we care? Greatly simplifies analysis!

Deriving Additional FDs

A couple of additional rules (that follow from the axioms):
Union: If X → Y and X → Z, then X → YZ
e.g., if sid → acode and sid → city, then sid → acode, city
Decomposition: If X → YZ, then X → Y and X → Z
e.g., if sid → acode, city then sid → acode and sid → city

Examples:
Derive the Union rule from the axioms (Augmentation and Transitivity)
Derive the Decomposition rule from Reflexivity and Transitivity.

Corollary: Given any set of FDs F, we can convert F into an equivalent set of FDs F′ s.t. every FD in F′ is of the form X → A, where X is a set of attributes and A is a single attribute.

Example: Supplier-Part DB
Suppliers supply parts to projects.
supplier attributes: sname, city, status
part attributes: p#, pname
supplier-part attributes: qty
SupplierPart(sname, city, status, p#, pname, qty)
Functional dependencies:
fd1: sname → city
fd2: city → status
fd3: p# → pname
fd4: sname, p# → qty

Supplier-Part Key: Part 1: Determining all attributes
Exercise: Show that (sname, p#) is a key of
SupplierPart(sname, city, status, p#, pname, qty)

fd1: sname → city
fd2: city → status
fd3: p# → pname
fd4: sname, p# → qty

Supplier-Part Key: Part 1: Determining all attributes
Exercise: Show that (sname, p#) is a key of
SupplierPart(sname, city, status, p#, pname, qty)

fd1: sname → city
fd2: city → status
fd3: p# → pname
fd4: sname, p# → qty

Proof has two parts:
a. Show: sname, p# is a (super)key
 1. sname, p# → sname, p#                               reflex
 2. sname → city                                        fd1
 3. sname → status                                      2, fd2, trans
 4. sname, p# → city, p#                                2, aug
 5. sname, p# → status, p#                              3, aug
 6. sname, p# → sname, p#, status                       1, 5, union
 7. sname, p# → sname, p#, status, city                 4, 6, union
 8. sname, p# → sname, p#, status, city, qty            7, fd4, union
 9. sname, p# → sname, pname                            fd3, aug
10. sname, p# → sname, p#, status, city, qty, pname     8, 9, union

Supplier-Part Key: Part 2: Minimality
b. Show: (sname, p#) is a minimal superkey of
SupplierPart(sname, city, status, p#, pname, qty)

fd1: sname → city
fd2: city → status
fd3: p# → pname
fd4: sname, p# → qty

1. p# does not appear on the RHS of any FD
2. therefore, except for p# itself, nothing determines p#
3. specifically, sname → p# does not hold
4. therefore, sname is not a key
5. similarly, p# is not a key

Do you, by any chance, have anything less painful?
Scared you're going to mess up? There is a closure method for checking FDs that is intuitive and easy to use.
We denote the closure of a set of attributes X as X+.
Fact: Attribute A belongs to X+ iff X → A holds. In particular, X+ = R iff X is a superkey of R.

Algorithm for computing the closure of an attribute set X:

Closure = X;
repeat {
  if (A1 A2 … Ak → B is an FD) &
     (A1 A2 … Ak ⊆ Closure)
    add B to Closure }
until Closure does not change.
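
The closure loop above translates almost line for line into Python. The sketch below is one possible rendering, assuming FDs are given as (LHS, RHS) pairs of attribute sets; the names `closure` and `FDS` are chosen for the example.

```python
def closure(attrs, fds):
    """Attribute closure X+ of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:  # repeat ... until Closure does not change
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FDs of SupplierPart(sname, city, status, p#, pname, qty)
FDS = [
    ({"sname"}, {"city"}),        # fd1
    ({"city"}, {"status"}),       # fd2
    ({"p#"}, {"pname"}),          # fd3
    ({"sname", "p#"}, {"qty"}),   # fd4
]

print(sorted(closure({"sname", "p#"}, FDS)))
print(sorted(closure({"sname"}, FDS)))
print(sorted(closure({"p#"}, FDS)))
```

Only {sname, p#} reaches all six attributes, which is the superkey half of the earlier ten-step proof done mechanically.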

Example of closure computation

SupplierPart(sname, city, status, p#, pname, qty)

fd1: sname → city
fd2: city → status
fd3: p# → pname
fd4: sname, p# → qty

Let us compute the following closures:
{sname, p#}+ =
{sname}+ =
{p#}+ =

Note on notation: when convenient, we write A+ instead of {A}+ and (AB)+ instead of {A, B}+.

Here's a painless method

Revisit previous supplier example.

Here's what happened in the last little while
Deriving new FDs from given FDs:
painful method: directly use inference rules.
painless method: use the closure method.
Finding (candidate) keys: use the closure method to find superkeys and some additional reasoning to find minimal superkeys (aka candidate keys).
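
For small schemas, the "additional reasoning" can be replaced by brute force: try attribute sets in order of increasing size and keep every superkey that does not contain a key already found. This is a sketch under that assumption (exponential in the number of attributes, so only for classroom-sized examples); `candidate_keys` is a name chosen for the example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal superkeys of `schema`, found by trying subsets smallest first."""
    schema = frozenset(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = frozenset(combo)
            if any(k <= x for k in keys):
                continue  # contains a smaller key, so it cannot be minimal
            if closure(x, fds) >= schema:
                keys.append(x)
    return keys

supplier_part = {"sname", "city", "status", "p#", "pname", "qty"}
supplier_fds = [
    ({"sname"}, {"city"}),
    ({"city"}, {"status"}),
    ({"p#"}, {"pname"}),
    ({"sname", "p#"}, {"qty"}),
]
print(candidate_keys(supplier_part, supplier_fds))
```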

<Clicker Break 2>

Flashback: our original question was "Is this a good design?"

Name                   Department         Mailing Location
Ed Knorr               Computer Science   201-2366 Main Mall
Raymond Ng             Computer Science   201-2366 Main Mall
Laks V.S. Lakshmanan   Computer Science   201-2366 Main Mall
Meghan Allan           Computer Science   201-2366 Main Mall
Joel Friedman          Computer Science   201-2366 Main Mall
Joel Friedman          Math               121-1984 Mathematics Rd
Brian Marcus           Math               121-1984 Mathematics Rd

Is there a rule that says whether the amount of redundancy that we have is good?
If this design isn't good, how do we split the table in a good way?

Functional dependencies & keys

In a functional dependency, a set of attributes determines other attributes, e.g., AB → C means A and B together determine C
A trivial FD determines what you already have, e.g., AB → B
A key is a minimal set of attributes determining the rest of the attributes of a relation, e.g., {Name, Department} in R(Name, Department, MailingLocation).
A superkey is a set of attributes determining the rest of the attributes in the relation, but does not HAVE to be minimal (e.g., the key {sname, p#} of relation supplierPart, or that key with other attributes like city, status, … added in)

Functional dependencies & keys

Given a set of (explicit) functional dependencies, we can derive others. We've covered how to do so using Armstrong's axioms
Theorem: Suppose R satisfies the FDs F and is decomposed into R1 and R2. The decomposition is lossless-join (LLJ) iff one of these FDs is implied by F:
R1 ∩ R2 → R1, OR
R1 ∩ R2 → R2.
Note the key connection! :-)
<Clicker Break 3>
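
The theorem gives a mechanical test for a two-way split: take the closure of the shared attributes and see whether it covers one side. A minimal sketch, with `is_lossless_binary` and the lowercase attribute names as assumed names for the example:

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True iff (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 is implied by `fds`."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

address_fds = [({"postal"}, {"city", "province"})]

# The split that worked: shared attribute postal determines all of the second table.
good = is_lossless_binary({"house", "street", "postal"},
                          {"city", "province", "postal"}, address_fds)

# Splitting R(A, B, C) into AB and BC with no FDs at all: B determines neither side.
bad = is_lossless_binary({"A", "B"}, {"B", "C"}, [])

print(good, bad)
```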

Time we achieved some normalcy!

Role of FDs in detecting redundancy:
Consider a relation schema R with 3 attributes, ABC.
No FDs hold: There is no redundancy here.
Given A → B: Several tuples could have the same A value, and if so, they'll all have the same B value!
Normalization: the process of removing redundancy from data

Normal Forms: Why have one rule

when you can have four?
Provide guidance for table refinement/reducing redundancy.
Four important normal forms:
First normal form(1NF)
Second normal form (2NF)
Third normal form (3NF)
Boyce-Codd Normal Form (BCNF)
If a relation is in a certain normal form, certain problems
(aka anomalies!) are avoided/minimized.
Normal forms can help decide whether decomposition (i.e.,
splitting tables) will help.

1NF
Each attribute has only one value
E.g., for postal code you can't have both V6T 1Z4 and V6S 1W6 in the same tuple!
Codd's original vision of the relational model allowed multi-valued attributes.

Recall trivial FDs

An FD X → A is trivial if A belongs to X.
More generally, an FD X → Y (where X and Y are sets of attributes) is trivial if Y is a subset of X.
e.g., City, Province → City is a trivial FD.
We say an FD is non-trivial if it is not trivial.

3NF
A relation R is in 3NF if:
If X → A is a non-trivial dependency in R, then either X is a superkey for R or A is part of a key.
i.e., whenever X determines a non-key attr, X better be a superkey.
Note: Being part of a superkey doesn't count! Why? A superkey could contain junk.

Example: address(Street, City, PostalCode), abbreviated to address(S,C,P).
FDs: SC → P, P → C.
Keys: SC, SP.
Does it satisfy 3NF? What about 2NF?
We will return to 3NF a little later.

Raymond Boyce & Ted Codd

Boyce-Codd Normal Form (BCNF)

A relation R is in BCNF if:
If X → A is a non-trivial FD in R, then X is a superkey for R
(Must be true for every such FD)

In English:
Only (super)keys should determine other attributes.
Ex: Address(House#, Street, City, Province, PostalCode)
FD: PostalCode → City
Is it in BCNF? Why (not)?
Recall applicable FDs for Address: PostalCode → City, PostalCode → Province.

What do we want?
Guaranteed freedom from redundancy!
How do we get there?
A relation may be in BCNF already!
Interesting fact: all two-attribute relations are in BCNF!
Hint: What are the only possible non-trivial FDs in a 2-attribute relation schema?
If not, decomposition is the answer!

Decomposing a Relation
A decomposition of R replaces R by two
or more relations s.t.:
Each new relation contains a subset of
the attributes of R (and no attributes
not appearing in R), and
Every attribute of R appears in at least one new relation.
Intuitively, decomposing R means storing instances of the
relations produced by the decomposition, instead of
instances of R.
E.g., Address(House#,Street,City,Province,Postal Code)
How can we decompose without losing information?

How can we decompose a relation w/o losing information?
Address(House#, Street, City, Province, PostalCode)
is decomposed into:
Address(House#, Street, PostalCode)
PC(City, Province, PostalCode)

Does the above decomposition lose information?
What does it mean to lose information?
How can we tell if we lose?
For this purpose, we need to know how the JOIN operation in Relational Algebra works.

Lossless-Join Decompositions:
Definition
Informally: If we break a relation, r, into pieces, when we put the
pieces back together, we should get exactly r back again
Formally: Decomposition of R into X and Y is lossless-join w.r.t. a
set of FDs F if, for every instance r that satisfies F:
If we JOIN the X-part of r with the Y-part of r the result is
exactly r
REMARKS:
1. It is always true that r is a subset of the JOIN of its X-part and Y-part.
2. In general, the other direction does not hold! If it does, the decomposition is lossless-join.
All decompositions used to resolve redundancy must be lossless!

Example Lossy-Join Decomposition

Original r(A, B, C):

A   B   C
1   2   3
4   5   6
7   2   8

Decompose into the projections on AB and BC:

A   B        B   C
1   2        2   3
4   5        5   6
7   2        2   8

Join the two projections:

A   B   C
1   2   3
4   5   6
7   2   8
1   2   8
7   2   3

So what did we lose?
Note: tuples (1 2 8), (7 2 3) not present in original.
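
The same project-then-join experiment can be replayed with Python sets. The variable names are chosen for the example, and the join is written out by hand for this one schema rather than as a general natural join.

```python
# The instance r(A, B, C) from the slide.
r = {(1, 2, 3), (4, 5, 6), (7, 2, 8)}

# Project onto AB and BC (sets remove duplicate tuples, as in relational algebra).
r_ab = {(a, b) for a, b, c in r}
r_bc = {(b, c) for a, b, c in r}

# Natural join of the two projections on the shared attribute B.
joined = {(a, b, c) for a, b in r_ab for b2, c in r_bc if b == b2}

# Tuples the join invents: "losing information" shows up as extra tuples.
spurious = joined - r

print(sorted(joined))
print(sorted(spurious))
```

Every original tuple survives the round trip (r is a subset of the join), so what is lost is the ability to tell which of the five tuples were in r.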

How do we decompose into BCNF losslessly?
Let r be a relation with attributes R, and F be a set of FDs on R s.t. all FDs have a single attribute on the RHS.
Pick any FD f ∈ F of the form X → A that violates BCNF
Decompose R into two relations: R1(R − A) & R2(XA)
Recurse on R1 and R2 using the FDs that apply to each
[Figure: R drawn as X, A, and Others; R2 covers X and A, R1 covers X and Others]

Note: the answer may vary depending on the order you choose.
That's okay: all final answers are guaranteed to be in BCNF.
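
A recursive sketch of this decomposition in Python follows. It deviates from the slide in two labeled ways: it moves out everything X determines within the current schema (X+ restricted to the schema) instead of one attribute A at a time, and it finds the FDs that apply to a sub-schema by taking closures of its attribute subsets under the original FDs, which is exponential and only meant for small examples. The names `find_violation` and `bcnf_decompose` are assumptions for the example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def find_violation(schema, fds):
    """Return (X, Y) where X -> Y is non-trivial on `schema` and X is not a superkey of it."""
    schema = frozenset(schema)
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            x = frozenset(combo)
            determined = closure(x, fds) & schema
            if determined != x and determined != schema:
                return x, determined - x
    return None

def bcnf_decompose(schema, fds):
    """Split `schema` until no piece has a BCNF violation; returns a list of attribute sets."""
    schema = frozenset(schema)
    violation = find_violation(schema, fds)
    if violation is None:
        return [schema]
    x, y = violation
    # X ∪ Y plays the role of R2(XA); schema − Y plays the role of R1(R − A).
    return bcnf_decompose(x | y, fds) + bcnf_decompose(schema - y, fds)

fds = [({"B"}, {"C"}), ({"D"}, {"A"})]
print(bcnf_decompose({"A", "B", "C", "D"}, fds))
```

Each split shares X between the two pieces and X determines all of the first piece, so by the lossless-join theorem every step is lossless.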

BCNF Example
Recall def. of BCNF: For all non-trivial FDs X → A, X must be a superkey.
E.g.: Relation: R(ABCD). FDs: B → C, D → A
Keys?
A+ = A; B+ = BC; C+ = C; D+ = AD;
(BD)+ = BDCA; BD is the only key
Process R(ABCD).
Look at FD B → C. Is B a superkey?
No. Decompose R into R1(B,C), R2(A,B,D)
B → C is the only FD that applies to R1.
R1 is in BCNF.
Process R2(ABD).

BCNF Example (contd.)

This is how far we got: ABCD has been split into BC and ABD.
We know all is well with BC, i.e., it is in BCNF.
Now, look at FD D → A. Is D a superkey for R2?
No. Decompose R2 into R3(D,A), R4(D,B).
Decomposition tree: ABCD splits into BC and ABD; ABD splits into AD and BD.

Final answer: R1(B,C), R3(D,A), R4(D,B)
{R1, R3, R4} is a LLJ decomposition of R.
R1, R3, R4 are each in BCNF.

Another BCNF Example

R(ABCDE)
FD: ABC, DE.
Generate the BCNF (lossless-join) decomposition of R.
IOW, split up R into smaller relation schemas s.t. each of
them is in BCNF and together they are LLJ.

After you decompose, how do you

know which FDs apply?

Yes. Closure

Yet Another BCNF Example:

R(A,B,C,D,E,F)
FDs: A → B, DE → F, B → C
Is it in BCNF? If so, why? If not, decompose into BCNF.

This BCNF stuff is great and easy!

Guaranteed that there will be no redundancy of data
Easy to understand (just look for superkeys)
Easy to do.
So why are there more normal forms?
For one thing, BCNF may not preserve all
dependencies
What does that mean?

An illustrative BCNF example

R(Unit, Company, Product)
FDs: Unit → Company
     Company, Product → Unit
Key(s)? {Company, Product} and {Unit, Product}

BCNF decomposition:
R1(Unit, Company) with FD Unit → Company
R2(Unit, Product) with no non-trivial FDs
We lose the FD: Company, Product → Unit!!

So What's the Problem?

Unit        Company        Unit        Product
SKYWill     UBC            SKYWill     Databases
Team Meat   UBC            Team Meat   Databases

Unit → Company
No problem so far. All local FDs are satisfied.
Let's put all the data back into a single table again:

Unit        Company   Product
SKYWill     UBC       Databases
Team Meat   UBC       Databases

Violates the FD: Company, Product → Unit
How could the DBMS check if an update would violate the FD Company, Product → Unit?

3NF to the rescue!

Recall: A relation R is in 3NF if:
If X → A is a non-trivial FD in R, then either X is a superkey for R (the BCNF condition) or A is part of a key.
(must be true for every such FD)

Note: A must be part of a key, not just a superkey (if a key exists, all attributes are part of a superkey!)
Example: R(Unit, Company, Product)
FDs: Unit → Company (BCNF? No. But Company is part of a key, so 3NF is fine.)
     Company, Product → Unit (Company, Product is a superkey.)
Keys: {Company, Product}, {Unit, Product}
Is it in BCNF? 3NF?
To decompose into 3NF we rely on the minimal cover
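
Both definitions can be checked mechanically for a whole relation. The sketch below assumes it is enough to test the given FDs (rather than every implied FD) when checking the original relation against its own FD set, and it finds keys by brute force; `is_bcnf`, `is_3nf` and `candidate_keys` are names chosen for the example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal superkeys, trying attribute subsets smallest first."""
    schema = frozenset(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = frozenset(combo)
            if not any(k <= x for k in keys) and closure(x, fds) >= schema:
                keys.append(x)
    return keys

def is_bcnf(schema, fds):
    """Every given non-trivial FD must have a superkey on its left-hand side."""
    schema = set(schema)
    return all(set(rhs) <= set(lhs) or closure(lhs, fds) >= schema
               for lhs, rhs in fds)

def is_3nf(schema, fds):
    """As BCNF, except an FD may also pass if every new RHS attribute is part of some key."""
    schema = set(schema)
    prime = set().union(*candidate_keys(schema, fds))
    return all(closure(lhs, fds) >= schema or (set(rhs) - set(lhs)) <= prime
               for lhs, rhs in fds)

ucp = {"Unit", "Company", "Product"}
ucp_fds = [({"Unit"}, {"Company"}), ({"Company", "Product"}, {"Unit"})]
print(is_bcnf(ucp, ucp_fds), is_3nf(ucp, ucp_fds))
```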

Minimal Cover for a Set of FDs

Goal: Transform FDs to be as compact as possible
Minimal cover G for a set of FDs F:
Closure of F = closure of G (i.e., they imply the same FDs)
RHS of each FD in G is a single attribute
If we delete an FD in G or delete attributes from an FD in G, the closure changes
Intuitively, every FD in G is needed, and is as slim as possible in order to get the same closure as F
e.g., A → B, ABCD → E, EF → GH, ACDF → EG has the following minimal cover:
A → B, ACD → E, EF → G and EF → H
We'll see how to derive this on the next slide

Finding minimal covers of FDs

1. Put FDs in standard form (have only one attribute on RHS)
2. Minimize LHS of each FD
3. Delete redundant FDs

Example: A → B, ABCD → E, EF → GH, ACDF → EG
1. Standard form: A → B, ABCD → E, EF → G, EF → H, ACDF → E, ACDF → G. Do we need ACDF → E, ACDF → G?
2. ABCD → E goes to ACD → E (closure)
3. Redundant: ACDF → E, ACDF → G
(take closure of ACDF w/o rule ACDF → E)
In the end: A → B, ACD → E, EF → G, EF → H
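
The three steps can be sketched in Python as follows. FDs are written as pairs of strings of single-letter attributes, which is an assumption made to keep the example short; `minimal_cover` is a name chosen here. Different processing orders can yield different, equally valid minimal covers.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: standard form, one attribute on each right-hand side.
    g = [(frozenset(lhs), frozenset({a})) for lhs, rhs in fds for a in sorted(rhs)]
    # Step 2: drop an LHS attribute whenever the remaining ones still determine the RHS.
    for i in range(len(g)):
        lhs, rhs = g[i]
        for b in sorted(lhs):
            smaller = lhs - {b}
            if smaller and rhs <= closure(smaller, g):
                lhs = smaller
                g[i] = (lhs, rhs)
    # Step 3: drop duplicates, then drop any FD implied by the remaining ones.
    result = list(dict.fromkeys(g))
    for fd in list(result):
        others = [f for f in result if f != fd]
        if fd[1] <= closure(fd[0], others):
            result = others
    return result

F = [("A", "B"), ("ABCD", "E"), ("EF", "GH"), ("ACDF", "EG")]
for lhs, rhs in minimal_cover(F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```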

Another minimal cover example

Consider the relation R(CSJDPQV) with FDs
C → SJDPQV, JP → C, SD → P, J → S
Find a minimal cover

Decomposition into 3NF

using Minimal Cover

Synthesis of 3NF from scratch

Another 3NF example

R(ABCDE)

FDs: ACDE,

CEA

Yet another checkpoint

3NF and BCNF are the most popular normal forms, i.e., designs largely free from anomalies.
BCNF is stronger than 3NF.
BCNF decomposition guarantees a lossless join (LLJ) decomposition, i.e., good splits.
3NF decomposition also guarantees LLJ.
It can additionally be made to preserve FDs, which BCNF is not always guaranteed to do.
There is a simple synthesis algorithm for obtaining a 3NF design too; to use it, you need to be able to find a minimal cover for the given set of FDs.

Comparing BCNF & 3NF

BCNF guarantees removal of all anomalies caused by FDs
3NF may leave some anomalies, but a 3NF decomposition can preserve all dependencies
If a relation R is in BCNF it is in 3NF.
A 3NF relation R may not be in BCNF if all 3 of the following conditions are true:
a. R has multiple keys
b. Keys are composite (i.e., not single-attributed)
c. These keys overlap
BCNF ⊂ 3NF ⊂ 2NF ⊂ 1NF

On the one hand

Normalization and Design
Most organizations go to 3NF or better
If a relation has only 2 attributes, it is automatically in 3NF
and BCNF
Our goal is to use lossless-join for all decompositions and
preserve dependencies
BCNF decomposition is always lossless, but may not
preserve dependencies
Good heuristic :
Try to ensure that all relations are in at least 3NF
Check for dependency preservation

On the other hand

Denormalization
Process of intentionally violating a normal form to gain
performance improvements
Performance improvements:
Fewer joins
Reduces number of foreign keys
Since foreign keys are often indexed, the number of indexes may be reduced
Useful if certain queries often require (joined) results, and
the queries are frequent enough

Learning Goals Revisited

Debate the pros and cons of redundancy in a database.
Provide examples of update, insertion, and deletion
anomalies.
Given a set of tables and a set of functional dependencies
over them, determine all the keys for the tables.
Show that a table is/isn't in 3NF or BCNF.
Justify why lossless join decompositions are preferred
decompositions.
Decompose a table into a set of tables that are in 3NF, or
BCNF.
Additionally

Learning Goals Revisited

Given a set of FDs, find all keys of a relation scheme and
prove that we have found them all.
Find a minimal cover for a set of FDs.
Test if a decomp. is LLJ.
Test if a decomp. is dependency preserving, i.e.,
preserves all FDs.

34 pages
Chapter 14
No ratings yet
Chapter 14
54 pages
Unit 2
No ratings yet
Unit 2
146 pages
6 Normalization
No ratings yet
6 Normalization
72 pages
Unit 4 - Database Management System - WWW - Rgpvnotes.in
No ratings yet
Unit 4 - Database Management System - WWW - Rgpvnotes.in
12 pages