Lec 06
www.cl.cam.ac.uk/Teaching/current/Databases/
1
Recall: Database design lifecycle
• Requirements analysis
– User needs; what must database do?
• Conceptual design
– High-level description; often using E/R model
• Logical design
– Translate E/R model into relational schema
• Schema refinement (next two lectures)
– Check schema for redundancies and anomalies
• Physical design/tuning
– Consider typical workloads, and further optimise
2
Today’s lecture
• Why are some designs bad?
• What’s a functional dependency?
• What’s the theory of functional dependencies?
3
Not all designs are equally good
• Why is this design bad?
Data(sid,sname,address,cid,cname,grade)
Student(sid,sname,address)
Course(cid,cname)
Enrolled(sid,cid,grade)
4
An instance of our bad design
5
Evils of redundancy
• Redundancy is the root of many problems associated
with relational schemas
– Redundant storage
– Update anomalies
– Insertion anomalies
– Deletion anomalies
– LOW TRANSACTION THROUGHPUT
• In general, with higher redundancy, correct transactions
(those that cause no anomalies) have to lock more objects,
causing greater contention and lower throughput
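The update anomaly can be made concrete with a small sketch. The rows below are hypothetical sample data, invented for illustration (they are not the instance shown in the lecture). The point is only that a student's address is stored once per enrolment, so a partial update leaves the table inconsistent.

```python
# Hypothetical rows for Data(sid, sname, address, cid, cname, grade).
# All values are invented for illustration only.
data = [
    {"sid": 1, "sname": "Ann", "address": "1 High St",
     "cid": "DB", "cname": "Databases", "grade": "A"},
    {"sid": 1, "sname": "Ann", "address": "1 High St",
     "cid": "OS", "cname": "Operating Systems", "grade": "B"},
    {"sid": 2, "sname": "Bob", "address": "2 Low Rd",
     "cid": "DB", "cname": "Databases", "grade": "C"},
]


def addresses_of(rows, sid):
    """Return the set of distinct addresses stored for one student."""
    return {row["address"] for row in rows if row["sid"] == sid}


# Redundant storage: Ann's address is held in two separate rows.
ann_rows = [row for row in data if row["sid"] == 1]

# Update anomaly: changing the address in only one of those rows
# leaves two conflicting addresses for the same student.
ann_rows[0]["address"] = "9 New Ave"
conflicting = addresses_of(data, 1)
print(conflicting)  # two different addresses for sid 1
```

In the decomposed design the address would live in exactly one Student row, so this kind of partial update could not arise.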
6
Decomposition
• We remove anomalies by replacing the schema
Data(sid,sname,address,cid,cname,grade)
with
Student(sid,sname,address)
Course(cid,cname)
Enrolled(sid,cid,grade)
• Note the implicit extra cost here: queries over the original
data must now join the three tables
• Two immediate questions:
1. Do we need to decompose a relation?
2. What problems might result from a decomposition?
7
Functional dependencies
• Recall:
– A key is a set of fields where if a pair of tuples
agree on a key, they agree everywhere
• In our bad design, if two tuples agree on
sid, then they also agree on address,
even though they may disagree on the
remaining fields
8
Functional dependencies cont.
• We can say that sid determines address
– We’ll write this
sid → address
10
Formalities
• Given a relation R = R(A1:τ1, …, An:τn), and X,
Y ⊆ {A1, …, An}, an instance r of R satisfies
X → Y if
– For any two tuples t1, t2 in r, if t1.X=t2.X then
t1.Y=t2.Y
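The definition translates almost directly into a check over an instance. The following is a minimal sketch, assuming tuples are represented as Python dicts from attribute name to value; the function name `satisfies_fd` and the sample instance are hypothetical, introduced only for illustration.

```python
def satisfies_fd(instance, lhs, rhs):
    """Return True if every pair of tuples agreeing on lhs also agrees on rhs.

    instance: list of dicts mapping attribute name -> value
    lhs, rhs: iterables of attribute names (the sets X and Y)
    """
    lhs, rhs = tuple(lhs), tuple(rhs)
    seen = {}  # X-value -> the Y-value first observed with it
    for t in instance:
        x_val = tuple(t[a] for a in lhs)
        y_val = tuple(t[a] for a in rhs)
        if seen.setdefault(x_val, y_val) != y_val:
            return False  # two tuples agree on X but differ on Y
    return True


# Hypothetical instance, invented for illustration.
r = [
    {"sid": 1, "address": "1 High St", "cid": "DB"},
    {"sid": 1, "address": "1 High St", "cid": "OS"},
    {"sid": 2, "address": "2 Low Rd", "cid": "DB"},
]
print(satisfies_fd(r, ["sid"], ["address"]))  # True
print(satisfies_fd(r, ["sid"], ["cid"]))      # False
```

Note that such a check can only refute a dependency on a given instance. A functional dependency is a constraint on all legal instances, so one instance satisfying it does not establish that it holds in general.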
12
Closure of a set of FDs
• Which of the following are in the closure of
our Student FDs?
– address → address
– cid → cname
– cid → cname,sname
– cid,sid → cname,sname
13
Candidate keys and FDs
• If R = R(A1:τ1, …, An:τn) with FDs F and
X ⊆ {A1, …, An}, then X is a candidate key
for R if
– X → A1, …, An ∈ F+
– For no proper subset Y ⊂ X is
Y → A1, …, An ∈ F+
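Both conditions can be tested mechanically. The sketch below uses the standard attribute-closure technique (the set of attributes determined by X under F) rather than enumerating F+ directly. The function names are hypothetical, and the FD set used in the example is an assumption made for illustration: sid → sname,address, cid → cname and sid,cid → grade.

```python
def attribute_closure(attrs, fds):
    """Return all attributes determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each a set of attribute names.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD once its left-hand side is fully covered.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def is_candidate_key(x, all_attrs, fds):
    """X determines every attribute, and no proper subset of X does."""
    x, all_attrs = set(x), set(all_attrs)
    if attribute_closure(x, fds) != all_attrs:
        return False
    # Closure is monotone, so it is enough to try removing one
    # attribute at a time instead of testing every proper subset.
    return all(attribute_closure(x - {a}, fds) != all_attrs for a in x)


# Assumed FDs for the Data schema (an illustration, not given here).
attrs = {"sid", "sname", "address", "cid", "cname", "grade"}
fds = [
    ({"sid"}, {"sname", "address"}),
    ({"cid"}, {"cname"}),
    ({"sid", "cid"}, {"grade"}),
]
print(is_candidate_key({"sid", "cid"}, attrs, fds))           # True
print(is_candidate_key({"sid"}, attrs, fds))                  # False
print(is_candidate_key({"sid", "cid", "grade"}, attrs, fds))  # False
```

The last call fails the second condition: sid,cid,grade determines everything, but so does its proper subset sid,cid.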
14
Armstrong’s axioms
• Reflexivity: If Y ⊆ X then F ⊢ X → Y
– (This is called a trivial dependency)
– Example: sname,address → address
• Augmentation: If F ⊢ X → Y then
F ⊢ X,W → Y,W
– Example: As cid → cname then
cid,sid → cname,sid
• Transitivity: If F ⊢ X → Y and F ⊢ Y → Z then
F ⊢ X → Z
– Example: As sid,cid → cid and
cid → cname, then sid,cid → cname
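The three rules can be read as operations that build new dependencies from old ones. The following is a minimal sketch, assuming an FD is represented as a pair of frozensets (left-hand side, right-hand side); the function names are hypothetical. It replays the three examples above.

```python
def reflexivity(x, y):
    """If Y is a subset of X, return the trivial dependency X -> Y."""
    x, y = frozenset(x), frozenset(y)
    if not y <= x:
        raise ValueError("reflexivity needs Y to be a subset of X")
    return (x, y)


def augmentation(fd, w):
    """From X -> Y derive X,W -> Y,W."""
    lhs, rhs = fd
    w = frozenset(w)
    return (frozenset(lhs) | w, frozenset(rhs) | w)


def transitivity(fd1, fd2):
    """From X -> Y and Y -> Z derive X -> Z."""
    (x, y1), (y2, z) = fd1, fd2
    if frozenset(y1) != frozenset(y2):
        raise ValueError("right side of fd1 must equal left side of fd2")
    return (frozenset(x), frozenset(z))


cid_cname = (frozenset({"cid"}), frozenset({"cname"}))

# Reflexivity example: sname,address -> address
trivial = reflexivity({"sname", "address"}, {"address"})

# Augmentation example: cid,sid -> cname,sid
augmented = augmentation(cid_cname, {"sid"})

# Transitivity example: sid,cid -> cid and cid -> cname
# together give sid,cid -> cname
step = reflexivity({"sid", "cid"}, {"cid"})
derived = transitivity(step, cid_cname)
print(derived)
```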
15
Consequences of Armstrong’s axioms
• Union: If F ⊢ X → Y and F ⊢ X → Z then
F ⊢ X → Y,Z
• Pseudo-transitivity: If F ⊢ X → Y and
F ⊢ W,Y → Z then F ⊢ X,W → Z
• Decomposition: If F ⊢ X → Y and Z ⊆ Y then
F ⊢ X → Z
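As a worked example of how such a consequence follows from the axioms alone, here is one possible derivation of the Union rule, writing F ⊢ X → Y for "X → Y can be deduced from F". It is a sketch of one derivation, not necessarily the one intended in the lecture, and the `align*` environment assumes the amsmath package.

```latex
\begin{align*}
1.\;& F \vdash X \to Y     && \text{(given)} \\
2.\;& F \vdash X \to X,Y   && \text{(augment 1 with } X\text{)} \\
3.\;& F \vdash X \to Z     && \text{(given)} \\
4.\;& F \vdash X,Y \to Y,Z && \text{(augment 3 with } Y\text{)} \\
5.\;& F \vdash X \to Y,Z   && \text{(transitivity on 2 and 4)}
\end{align*}
```

Step 2 uses the fact that attribute lists are sets, so X,X is just X.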
Heath’s Rule: if A → B holds in R(A, B, C), then
R = π_{A,B}(R) ⋈_A π_{A,C}(R)
18
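Proof of Heath’s Rule
First show that R ⊆ π_{A,B}(R) ⋈_A π_{A,C}(R)
Suppose (a, b, c) ∈ R
then (a, b) ∈ π_{A,B}(R)
and (a, c) ∈ π_{A,C}(R)
Since these two records agree on A
we have (a, b, c) ∈ π_{A,B}(R) ⋈_A π_{A,C}(R)
19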
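Proof of Heath’s Rule (cont.)
In the other direction, we must show that
π_{A,B}(R) ⋈_A π_{A,C}(R) ⊆ R
Suppose (a, b, c) ∈ π_{A,B}(R) ⋈_A π_{A,C}(R). Then there must exist records
(a, b, c′) ∈ R and (a, b′, c) ∈ R
Since A → B, b = b′, and so (a, b, c) ∈ R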
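Heath's Rule can be checked on small examples. The sketch below assumes a relation R(A, B, C) is represented as a Python set of (a, b, c) triples; the function name and both sample relations are hypothetical. The first relation satisfies A → B and is recovered exactly by the join; the second violates A → B and the join produces spurious tuples.

```python
def decompose_and_join(r):
    """Project R(A,B,C) onto (A,B) and (A,C), then join the results on A."""
    ab = {(a, b) for a, b, c in r}
    ac = {(a, c) for a, b, c in r}
    return {(a, b, c) for a, b in ab for a2, c in ac if a == a2}


# A -> B holds: each value of A appears with a single value of B.
good = {(1, "x", 10), (1, "x", 20), (2, "y", 10)}

# A -> B fails: a = 1 appears with both "x" and "z".
bad = {(1, "x", 10), (1, "z", 20)}

print(decompose_and_join(good) == good)  # True
spurious = decompose_and_join(bad) - bad
print(spurious)  # tuples that were never in the original relation
```

The first direction of the proof shows up in both cases: the rejoined relation always contains the original. Only the second direction needs the dependency A → B.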
23
Why Armstrong’s axioms?
• Soundness
– If F ⊢ X → Y is deduced using the rules, then
X → Y is true in any relation in which the
dependencies of F are true
• Completeness
– If X → Y is true in any relation in which the
dependencies of F are true, then F ⊢ X → Y can
be deduced using the rules
24
Soundness
• Consider the Augmentation rule:
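– We have X → Y, i.e. if t1.X=t2.X then t1.Y=t2.Y
– Take two tuples s1 and s2 such that s1.X=s2.X
and s1.W=s2.W; then s1.Y=s2.Y, and so s1 and s2
agree on Y,W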
25
Soundness cont.
• Consider the Transitivity rule:
– We have X → Y, i.e. if t1.X=t2.X then t1.Y=t2.Y (*)
– We have Y → Z, i.e. if t1.Y=t2.Y then t1.Z=t2.Z (**)
– Take two tuples s1 and s2 such that s1.X=s2.X
then from (*) s1.Y=s2.Y and then from (**)
s1.Z=s2.Z
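The argument above is the proof. As a sanity check, the claim can also be tested exhaustively on a tiny universe. This sketch assumes single attributes named X, Y and Z with values 0 or 1, enumerates all 256 relations over them, and looks for one that satisfies X → Y and Y → Z but not X → Z. The names are hypothetical, and a search over one small domain is evidence only, not a substitute for the proof.

```python
from itertools import combinations, product


def satisfies(r, lhs, rhs):
    """Check an FD pairwise, exactly as in the definition."""
    return all(
        any(t1[a] != t2[a] for a in lhs) or all(t1[a] == t2[a] for a in rhs)
        for t1, t2 in combinations(r, 2)
    )


attrs = ("X", "Y", "Z")
# All 8 possible tuples over the domain {0, 1}.
tuples = [dict(zip(attrs, vals)) for vals in product((0, 1), repeat=3)]

checked = 0
counterexamples = 0
# Every subset of the 8 tuples is one relation instance.
for size in range(len(tuples) + 1):
    for r in combinations(tuples, size):
        if satisfies(r, ["X"], ["Y"]) and satisfies(r, ["Y"], ["Z"]):
            checked += 1
            if not satisfies(r, ["X"], ["Z"]):
                counterexamples += 1

print(counterexamples)  # 0
```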
26
Completeness
• Exercise
– (You may need the fact from slide 23)
27
Attribute closure