CAS CS 460/660 Introduction To Database Systems Functional Dependencies and Normal Forms
1.1
Review: Database Design
Requirements Analysis
user needs; what must database do?
Conceptual Design
high-level description (often done with the ER model)
Logical Design
translate ER into DBMS data model
Schema Refinement
consistency, normalization
Physical Design - indexes, disk layout
Security Design - who accesses what
1.2
Keys (review)
1.5
FD’s Continued
1.7
Redundancy Problems Due to R → W
Hourly_Emps
1.8
Decomposing a Relation
Redundancy can be removed by “chopping” the relation into
pieces.
FD’s are used to drive this process.
R → W is causing the problems, so decompose SNLRWH into what
relations?
Wages
Hourly_Emps2
1.11
Reasoning About FDs
Given some FDs, we can usually infer additional FDs:
title → studio, star implies title → studio and title → star
title → studio and title → star implies title → studio, star
title → studio, studio → star implies title → star
But,
title, star → studio does NOT necessarily imply that title → studio or that star → studio
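The inferences above can be checked mechanically: X → Y follows from a set of FDs exactly when Y is contained in the attribute closure of X. The sketch below is only an illustration under that assumption; the helper names `fd`, `closure`, and `implies` are hypothetical and not part of the slides.

```python
def fd(lhs, rhs):
    """Build an FD as a pair of frozensets (hypothetical helper)."""
    return (frozenset(lhs), frozenset(rhs))


def closure(attrs, fds):
    """Attribute closure: keep firing FDs whose left side is already covered."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= closure(lhs, fds)


# title -> studio, studio -> star implies title -> star
chain = [fd({"title"}, {"studio"}), fd({"studio"}, {"star"})]
print(implies(chain, {"title"}, {"star"}))     # True

# title, star -> studio does NOT imply title -> studio or star -> studio
joint = [fd({"title", "star"}, {"studio"})]
print(implies(joint, {"title"}, {"studio"}))   # False
print(implies(joint, {"star"}, {"studio"}))    # False
```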
1.12
Rules of Inference
1.13
Example
Contracts(cid,sid,jid,did,pid,qty,value), and:
C is the key: C → CSJDPQV
Job purchases each part using single contract: JP → C
Dept purchases at most 1 part from a supplier: SD → P
Problem: Prove that SDJ is a key for Contracts
• JP → C, C → CSJDPQV imply JP → CSJDPQV (by transitivity) (shows that JP is a key)
• SD → P implies SDJ → JP (by augmentation)
• SDJ → JP, JP → CSJDPQV imply SDJ → CSJDPQV (by transitivity); thus SDJ is a key.
1.15
Attribute Closure (example)
R = {A, B, C, D, E}
F = { B → CD, D → E, B → A, E → C, AD → B }
• Is B → E in F+ ?
B+ = B
B+ = BCD
B+ = BCDA
B+ = BCDAE … Yes! B is a key for R too!
• Is D a key for R?
D+ = D
D+ = DE
D+ = DEC, which does not cover R, so D is not a key.
• Is AD a key for R?
AD+ = AD
AD+ = ABD and B is a key, so Yes!
• Is AD a candidate key for R?
A+ = A
D+ = DEC
A not a key, nor is D so Yes!
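The closure computations in this example can be reproduced with a few lines of Python. This is a minimal sketch, assuming one letter per attribute; `parse`, `closure`, `is_superkey`, and `is_candidate_key` are illustrative names, not from the slides.

```python
def parse(text):
    """Parse 'AD->B' into (frozenset('AD'), frozenset('B'))."""
    lhs, rhs = text.split("->")
    return frozenset(lhs.strip()), frozenset(rhs.strip())


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, relation, fds):
    return closure(attrs, fds) >= set(relation)


def is_candidate_key(attrs, relation, fds):
    """A superkey from which no single attribute can be dropped."""
    attrs = set(attrs)
    if not is_superkey(attrs, relation, fds):
        return False
    return not any(is_superkey(attrs - {a}, relation, fds) for a in attrs)


R = "ABCDE"
F = [parse(s) for s in ["B->CD", "D->E", "B->A", "E->C", "AD->B"]]

print("".join(sorted(closure("B", F))))   # ABCDE, so B -> E holds and B is a key
print("".join(sorted(closure("D", F))))   # CDE, so D is not a key
print(is_superkey("AD", R, F))            # True
print(is_candidate_key("AD", R, F))       # True
```

Dropping one attribute at a time is enough for the minimality test, because any superset of a superkey is itself a superkey.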
1st Normal Form – all attributes are atomic (i.e., “flat tables”)
1st ⊃ 2nd (of historical interest) ⊃ 3rd ⊃ Boyce-Codd ⊃ …
1.17
Normal Forms
1.18
Boyce-Codd Normal Form (BCNF)
Reln R with FDs F is in BCNF if, for all X → A in F+
A ∈ X (called a trivial FD), or
X is a superkey for R.
In other words: “R is in BCNF if the only non-trivial FDs over R are key
constraints.”
If R in BCNF, then every field of every tuple records information that cannot
be inferred using FDs alone.
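As a rough illustration of the BCNF test, the sketch below checks each given FD for being trivial or having a superkey on its left side. It looks only at the FDs listed in F, which is enough to test the whole relation R against F but not a projection of R. The function name `bcnf_violations` is hypothetical; the data is the Contracts schema CSJDPQV with C → CSJDPQV, JP → C, SD → P.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the FDs that are non-trivial and whose left side is not a superkey."""
    rel = set(relation)
    bad = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) >= rel
        if not trivial and not superkey:
            bad.append((lhs, rhs))
    return bad


F = [
    (frozenset("C"), frozenset("CSJDPQV")),
    (frozenset("JP"), frozenset("C")),
    (frozenset("SD"), frozenset("P")),
]

for lhs, rhs in bcnf_violations("CSJDPQV", F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))   # DS -> P
```

Only SD → P is reported: C and JP are superkeys, while the closure of SD is just SDP.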
Say we are told that FD X → A holds for this example relation:
1.20
Decomposition of a Relation Scheme
If a relation is not in a desired normal form, it can be decomposed into
multiple relations that each are in that normal form.
1.21
Example
Hourly_Emps
Wages
Hourly_Emps2
• Q: Are both of these relations now in BCNF?
• Decompositions should be used only when needed.
– Q: potential problems of decomposition?
1.23
Refining an ER Diagram
1st diagram becomes:
Workers(S,N,L,D,Si)
Departments(D,M,B)
Lots associated with workers.
Suppose all workers in a dept are assigned the same lot: D → L
Redundancy; fixed by:
Workers2(S,N,D,Si)
Dept_Lots(D,L)
Departments(D,M,B)
Can fine-tune this:
Workers2(S,N,D,Si)
Departments(D,M,B,L)
[ER diagram, Before: Employees (ssn, name, lot), Works_In (since), Departments (did, dname, budget)]
[ER diagram, After: Employees (ssn, name), Works_In (since), Departments (did, dname, budget, lot)]
1.24
Decomposing a Relation
Easiest fix is to create a relation RW to store these associations,
and to remove W from the main schema:
Wages
Hourly_Emps2
• Q: Are both of these relations now in BCNF?
• Decompositions should be used only when needed.
– Q: potential problems of decomposition?
1.25
Problems with Decompositions
There are three potential problems to consider:
1) May be impossible to reconstruct the original relation! (Lossiness)
Fortunately, not in the SNLRWH example.
2) Dependency checking may require joins.
Fortunately, not in the SNLRWH example.
3) Some queries become more expensive.
e.g., How much does Guldu earn?
1.26
Lossless Decomposition (example)
1.29
Lossy Decomposition (example)
A → B; C → B
1.30
Lossless Decomposition
Decomposition of R into X and Y is lossless-join w.r.t. a set of FDs F if,
for every instance r that satisfies F:
π_X(r) ⋈ π_Y(r) = r
The decomposition of R into X and Y is lossless with
respect to F if and only if F+ contains:
X ∩ Y → X, or
X ∩ Y → Y
In the previous example, decomposing ABC into AB and BC is lossy, because the
intersection (i.e., “B”) is not a key of either resulting relation.
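The lossless-join test can be sketched directly from the condition above: compute the closure of X ∩ Y and see whether it covers X or Y. The function name `is_lossless` is illustrative. The FDs A → B and C → B and the split of ABC into AB and BC come from the example; the split into AC and BC is added here only as an assumed contrast case.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(x, y, fds):
    """Binary decomposition test: (X ∩ Y)+ must contain X or Y."""
    x, y = set(x), set(y)
    common = closure(x & y, fds)
    return x <= common or y <= common


F = [(frozenset("A"), frozenset("B")), (frozenset("C"), frozenset("B"))]

print(is_lossless("AB", "BC", F))   # False: B+ = B covers neither AB nor BC
print(is_lossless("AC", "BC", F))   # True:  C+ = BC covers BC
```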
1.31
Lossless Decomposition (example)
A → B; C → B
But, now we can’t check A → B without doing a join!
1.32
Dependency Preserving
Decomposition
Dependency preserving decomposition (Intuitive):
1.33
Dependency Preserving Decompositions
(Contd.)
So, (F_AB ∪ F_BC)+ contains C → A
1.34
Decomposition into BCNF
Consider relation R with FDs F.
If X → Y violates BCNF, decompose R into R - Y and XY
(guaranteed to be lossless).
Repeated application of this idea will give us a collection of relations that are in
BCNF; the decomposition is lossless-join, and the process is guaranteed to terminate.
e.g., CSJDPQV, key C, JP → C, SD → P, J → S
{contractid, supplierid, projectid, deptid, partid, qty, value}
To deal with SD → P, decompose into SDP, CSJDQV.
To deal with J → S, decompose CSJDQV into JS and CJDQV
So we end up with: SDP, JS, and CJDQV
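The two decomposition steps above can be replayed in code. This is a sketch, not a full algorithm: `split` performs one step (R - Y and XY) for an FD chosen by hand, and `find_violation` is a hypothetical brute-force check that tries every attribute subset of a fragment, so it is exponential and only suitable for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def split(relation, lhs, rhs):
    """One BCNF step for a violating FD X -> Y: return (R - Y, XY)."""
    r, x, y = set(relation), set(lhs), set(rhs)
    return r - (y - x), x | y


def find_violation(relation, fds):
    """Return some (X, Y) violating BCNF inside relation, or None."""
    rel = set(relation)
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            reach = closure(x, fds) & rel   # what X determines inside this fragment
            if reach != x and reach != rel:
                return x, reach - x
    return None


F = [
    (frozenset("C"), frozenset("CSJDPQV")),
    (frozenset("JP"), frozenset("C")),
    (frozenset("SD"), frozenset("P")),
    (frozenset("J"), frozenset("S")),
]

rest, sdp = split("CSJDPQV", "SD", "P")   # CSJDQV and SDP
cjdqv, js = split(rest, "J", "S")         # CJDQV and JS

for frag in (sdp, js, cjdqv):
    print("".join(sorted(frag)), find_violation(frag, F))
# DPS None
# JS None
# CDJQV None
```

Choosing the violating FDs in a different order can produce a different, equally valid BCNF decomposition, which is why the steps here follow the order used in the example.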
1.35
BCNF and Dependency Preservation
1.36
Third Normal Form (3NF)
Reln R with FDs F is in 3NF if, for all X → A in F+
A ∈ X (called a trivial FD), or
X is a superkey of R, or
A is part of some candidate key (not superkey!) for R. (sometimes stated as “A
is prime”)
Minimality of a key is crucial in third condition above!
If R is in BCNF, obviously in 3NF.
If R is in 3NF, some redundancy is possible. It is a compromise, used when
BCNF not achievable (e.g., no “good” decomp, or performance
considerations).
Lossless-join, dependency-preserving decomposition of R into a collection of
3NF relations always possible.
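The third condition depends on candidate keys, so a 3NF check has to find them first. The sketch below enumerates attribute subsets by increasing size (exponential, fine for small schemas) and then applies the three conditions to the FDs listed in F. The names `candidate_keys` and `is_3nf` are illustrative; the data is the Contracts example, with J → S added in the second call.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """All minimal superkeys, found smallest first."""
    rel = set(relation)
    keys = []
    for size in range(1, len(rel) + 1):
        for combo in combinations(sorted(rel), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue                      # contains a smaller key: not minimal
            if closure(s, fds) >= rel:
                keys.append(s)
    return keys


def is_3nf(relation, fds):
    rel = set(relation)
    prime = set().union(*candidate_keys(relation, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= rel:
            continue                          # left side is a superkey
        if any(a not in prime for a in rhs - lhs):
            return False                      # non-trivial, non-prime right side
    return True


F = [
    (frozenset("C"), frozenset("CSJDPQV")),
    (frozenset("JP"), frozenset("C")),
    (frozenset("SD"), frozenset("P")),
]
print(is_3nf("CSJDPQV", F))                                        # True: P is part of key JP
print(is_3nf("CSJDPQV", F + [(frozenset("J"), frozenset("S"))]))   # False: S is in no key
```

With J → S added, DJ becomes a key and SDJ stops being minimal, so S is no longer prime and J → S violates 3NF.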
1.37
Decomposition into 3NF
Obviously, the algorithm for lossless join decomp into BCNF can be used to
obtain a lossless join decomp into 3NF (typically, can stop earlier) but does
not ensure dependency preservation.
To ensure dependency preservation, one idea:
If X → Y is not preserved, add relation XY.
Problem is that XY may violate 3NF! e.g., consider the addition of CJP to
‘preserve’ JP → C. What if we also have J → C ?
Refinement: Instead of the given set of FDs F, use a minimal cover for F.
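One common way to compute a minimal cover is sketched below, under the usual three-step recipe: split right sides into single attributes, drop extraneous left-side attributes, then drop FDs implied by the rest. The function name `minimal_cover` is hypothetical, and the result is one possible minimal cover, since a set of FDs can have several.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: one attribute on each right side.
    work = [(frozenset(l), frozenset({a})) for l, r in fds for a in sorted(r)]

    # Step 2: drop left-side attributes that are not needed to derive the right side.
    reduced = []
    for lhs, rhs in work:
        lhs = set(lhs)
        for b in sorted(lhs):
            if len(lhs) > 1 and rhs <= closure(lhs - {b}, work):
                lhs.discard(b)
        reduced.append((frozenset(lhs), rhs))
    reduced = list(dict.fromkeys(reduced))    # remove duplicates, keep order

    # Step 3: drop FDs that already follow from the remaining ones.
    cover = list(reduced)
    for f in reduced:
        rest = [g for g in cover if g != f]
        if f[1] <= closure(f[0], rest):
            cover = rest
    return cover


# JP -> C together with J -> C: P is extraneous, so only J -> C survives.
F = [(frozenset("JP"), frozenset("C")), (frozenset("J"), frozenset("C"))]
for lhs, rhs in minimal_cover(F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))   # J -> C
```

This is the situation raised above: with J → C present, the minimal cover no longer contains JP → C, so relation CJP is not added to preserve it.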
1.38
Minimal Cover for a Set of FDs
1.39
Assertions
ASSERTIONS:
Example:
1.40
Assertions
1.41
Another example
1.42