CS2202 Design
• Let’s consider the following inst_dept relation
What about smaller schemas?
• How would we know to split it up (decompose it) into instructor and
department?
• In inst_dept, because dept_name is not a candidate key, the building and
budget of a department may have to be repeated
– This indicates the need to decompose inst_dept
• However, not all decompositions are good
• Suppose we decompose
employee(ID, name, street, city, salary) into
employee1 (ID, name)
employee2 (name, street, city, salary)
Lossy decomposition
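A minimal Python sketch of why splitting employee into employee1 and employee2 loses information. The two sample tuples are hypothetical (they do not come from the slides); they were chosen so that two different employees share a name, which is the case where joining on name goes wrong.

```python
# Hypothetical sample data: two different employees who share the name "Kim".
employee = {
    (57766, "Kim", "Main", "Perryridge", 75000),
    (98776, "Kim", "North", "Hampton", 67000),
}

# Project onto the two smaller schemas.
employee1 = {(emp_id, name) for (emp_id, name, _, _, _) in employee}
employee2 = {(name, street, city, salary)
             for (_, name, street, city, salary) in employee}

# Natural join of employee1 and employee2 on their only common attribute, name.
rejoined = {
    (emp_id, name1, street, city, salary)
    for (emp_id, name1) in employee1
    for (name2, street, city, salary) in employee2
    if name1 == name2
}

# Tuples that appear after the join but were never in the original relation.
spurious = rejoined - employee
print(len(employee), len(rejoined), len(spurious))  # 2 4 2
```

The join returns more tuples than the original relation, yet we can no longer tell which address belongs to which ID, so information has been lost.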
Example of lossless join decomposition
• Decomposition of R = (A, B, C) into R1 = (A, B) and R2 = (B, C)
r:
A  B  C
α  1  A
β  2  B
ΠA,B(r):
A  B
α  1
β  2
ΠB,C(r):
B  C
1  A
2  B
ΠA,B(r) ⋈ ΠB,C(r):
A  B  C
α  1  A
β  2  B
1st Normal Form
• 1NF is mainly used to disallow multivalued attributes, composite attributes, and
their combinations
• A domain is atomic if its elements are considered to be indivisible units
– Examples of non-atomic domains:
• Sets of names, composite attributes
Goal: devise a theory for the following
• Decide whether a particular relation r is in “good” form.
• In the case that a relation r is not in “good” form, decompose it into a set of
relations {r1, r2, ..., rn} such that
– each relation is in good form
– the decomposition is a lossless-join decomposition
• The theory is based on:
– functional dependencies
– multivalued dependencies
Functional dependency
• A functional dependency requires that the value of a certain set of attributes
uniquely determines the value of another set of attributes
• A functional dependency is a generalization of the notion of a key
Functional dependency
• Let R be a relation schema, with α ⊆ R and β ⊆ R
• The functional dependency α → β
holds on R if and only if for any legal relation r(R), whenever any two
tuples t1 and t2 of r agree on the attributes α, they also agree on the
attributes β. That is,
∀ t1, t2 ∈ r: t1[α] = t2[α] ⇒ t1[β] = t2[β]
Example
• Consider r(A,B ) with the following instance of r.
A B
1 4
1 5
3 7
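The definition can be checked directly on this instance. The sketch below is a minimal illustration: the helper name `satisfies` is hypothetical, and it simply compares every pair of tuples as the definition says.

```python
from itertools import combinations

def satisfies(rows, lhs, rhs):
    """Return True if the instance `rows` satisfies the FD lhs -> rhs."""
    for t1, t2 in combinations(rows, 2):
        agree_lhs = all(t1[a] == t2[a] for a in lhs)
        agree_rhs = all(t1[b] == t2[b] for b in rhs)
        if agree_lhs and not agree_rhs:
            return False  # two tuples agree on lhs but differ on rhs
    return True

# The instance of r(A, B) from the slide.
r = [{"A": 1, "B": 4}, {"A": 1, "B": 5}, {"A": 3, "B": 7}]

print(satisfies(r, ["A"], ["B"]))  # False: A = 1 appears with B = 4 and B = 5
print(satisfies(r, ["B"], ["A"]))  # True: every B value is unique
```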
Functional dependency
• K is a superkey for relation schema R if and only if K → R
• K is a candidate key for R if and only if
– K → R, and
– for no α ⊂ K, α → R
• Functional dependencies allow us to express constraints that cannot be expressed
using superkeys
• Consider the schema:
inst_dept (ID, name, salary, dept_name, building, budget ).
We expect these functional dependencies to hold:
dept_name → building
and ID → building
but would not expect the following to hold:
dept_name → salary
Use of functional dependency
• We use functional dependencies to:
– test relations to see if they are legal under a given set of functional
dependencies
• If a relation r is legal under a set F of functional dependencies,
we say that r satisfies F
– specify constraints on the set of legal relations
• We say that F holds on R if all legal relations on R satisfy the set of
functional dependencies F
• Note: A specific instance of a relation schema may satisfy a functional
dependency even if the functional dependency does not hold on all legal
instances
– For example, a specific instance of instructor may, by chance, satisfy
name → ID
Functional dependency (contd)
• A functional dependency is trivial if it is satisfied by all instances of a relation
– Example:
• ID, name → ID
• name → name
– In general, α → β is trivial if β ⊆ α
Closure of a set of functional dependencies
• Given a set F of functional dependencies, there are certain other functional
dependencies that are logically implied by F.
– For example: If A → B and B → C, then we can infer that A → C
• The set of all functional dependencies logically implied by F is the closure of F.
• We denote the closure of F by F+.
• F+ is a superset of F.
Closure of a set of functional dependencies
• We can find F+, the closure of F, by repeatedly applying
Armstrong’s Axioms:
– if β ⊆ α, then α → β (reflexivity)
– if α → β, then γα → γβ (augmentation)
– if α → β and β → γ, then α → γ (transitivity)
• These rules are
– sound (generate only functional dependencies that actually hold), and
– complete (generate all functional dependencies that hold).
Example
• R = (A, B, C, G, H, I)
• F = { A → B, A → C, CG → H, CG → I, B → H }
• Some members of F+
– A → H
• by transitivity from A → B and B → H
– AG → I
• by augmenting A → C with G, to get AG → CG, and then transitivity with CG → I
– CG → HI
• by augmenting CG → I to infer CG → CGI, and augmenting CG → H to infer
CGI → HI, and then transitivity
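These three members of F+ can be double-checked mechanically. The sketch below does not replay the axioms step by step. Instead it uses a standard shortcut, stated here as an assumption about how one would usually test membership: α → β is in F+ exactly when β is contained in the attribute closure of α. The helper names `closure` and `implies` are hypothetical, and attributes are single letters so that a string can stand for a set.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have all of lhs, we also get all of rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs is logically implied by fds."""
    return set(rhs) <= closure(lhs, fds)

F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

for lhs, rhs in [("A", "H"), ("AG", "I"), ("CG", "HI")]:
    print(f"{lhs} -> {rhs}:", implies(F, lhs, rhs))  # True for all three
```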
Procedure for computing F+
• To compute the closure of a set of functional dependencies F:
    F+ = F
    repeat
        for each functional dependency f in F+
            apply reflexivity and augmentation rules on f
            add the resulting functional dependencies to F+
        for each pair of functional dependencies f1 and f2 in F+
            if f1 and f2 can be combined using transitivity
                then add the resulting functional dependency to F+
    until F+ does not change any further
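Applying the axioms literally is slow, because F+ grows exponentially with the number of attributes. For a small schema, a common alternative (swapped in here; it is not the procedure above) is to compute the attribute closure of every non-empty α ⊆ R and record α → β for every non-empty β ⊆ α+. The sketch below assumes a hypothetical three-attribute schema R = (A, B, C) with F = {A → B, B → C}; all function names are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def nonempty_subsets(attrs):
    """Yield every non-empty subset of `attrs` as a frozenset."""
    items = sorted(attrs)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def fd_closure(schema, fds):
    """All FDs alpha -> beta (both non-empty) implied by `fds` on `schema`."""
    f_plus = set()
    for alpha in nonempty_subsets(schema):
        for beta in nonempty_subsets(closure(alpha, fds)):
            f_plus.add((alpha, beta))
    return f_plus

F = [("A", "B"), ("B", "C")]
f_plus = fd_closure("ABC", F)
print(len(f_plus))                                 # 35
print((frozenset("A"), frozenset("C")) in f_plus)  # True: A -> C by transitivity
```

Even this tiny example yields 35 dependencies, most of them trivial, which is why attribute closures are usually preferred over materialising F+.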
Closure of FDs
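Additional rules which can be inferred from Armstrong’s axioms:
– if α → β and α → γ, then α → βγ (union)
– if α → βγ, then α → β and α → γ (decomposition)
– if α → β and γβ → δ, then αγ → δ (pseudotransitivity)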
Functional Dependency Example
• Flight <flight_no, c_arr, c_dept, fl_type>
• Seats_free <flight_no, date, seats_avl>
• What are some possible valid FDs?
– flight_no → c_arr
– flight_no → c_dept
– flight_no → fl_type
– flight_no, date → seats_avl
Functional Dependency Example
• Stud_addr <name, address>
• Stud_grade <name, subject, grade>
• Some possible FDs that hold are
– name → address
– name, subject → grade
• Which FDs hold here?
X   Y   Z   W
x1  y1  z1  w1
x1  y2  z1  w2
x2  y2  z2  w2
x2  y3  z2  w3
x3  y3  z2  w4
– X → Z?
– X → W?
– XY → W?
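One way to answer the three questions is to group the rows by their left-hand-side values and see whether each group has a single right-hand-side value. The sketch below is illustrative only; the helper name `holds` is hypothetical, and attributes are single letters so that "XY" stands for {X, Y}.

```python
def holds(rows, header, lhs, rhs):
    """Check lhs -> rhs on an instance by grouping rows on their lhs values."""
    idx = {name: i for i, name in enumerate(header)}
    seen = {}
    for row in rows:
        key = tuple(row[idx[a]] for a in lhs)
        val = tuple(row[idx[b]] for b in rhs)
        # The first row with this key fixes the value; later rows must match it.
        if seen.setdefault(key, val) != val:
            return False
    return True

header = ("X", "Y", "Z", "W")
rows = [
    ("x1", "y1", "z1", "w1"),
    ("x1", "y2", "z1", "w2"),
    ("x2", "y2", "z2", "w2"),
    ("x2", "y3", "z2", "w3"),
    ("x3", "y3", "z2", "w4"),
]

print(holds(rows, header, "X", "Z"))   # True
print(holds(rows, header, "X", "W"))   # False: x1 appears with w1 and w2
print(holds(rows, header, "XY", "W"))  # True: every (X, Y) pair is unique
```

A result of True only says that this particular instance satisfies the FD; it does not show that the FD holds on the schema.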
Full functional dependency
• A functional dependency is full when it is ‘minimal’ in size (i.e., its left-hand
side contains no redundant attributes)
• An FD X → A is full if there is no proper subset Y of X such that Y → A (A is then
said to be fully functionally dependent on X)
Closure of attribute sets
• The set of all attributes functionally determined by α under a set F of FDs
• It is denoted by α+
• Let’s consider a relation r with the following FDs
– A → BC
– AC → D
– D→B
– AB → D
• So what are A+, B+, C+, and D+?
– A+={A,B,C,D}, B+={B}, …
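A minimal sketch of the usual fixed-point computation of an attribute closure, applied to the four FDs above. The function name `closure` is hypothetical, and attributes are single letters so that a string such as "AC" can stand for the set {A, C}.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its whole left-hand side is already in the result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("A", "BC"), ("AC", "D"), ("D", "B"), ("AB", "D")]

# Print the closure of each single attribute.
for attribute in "ABCD":
    print(attribute + "+ =", sorted(closure(attribute, F)))
```

Starting from A, the first FD adds B and C, after which AC → D adds D, so A alone determines every attribute; B fires no FD at all.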
Cover of a set of FDs
Let f and g be two sets of FDs on a relation schema R.
Then f is a cover of g if f+ = g+
In this case f is also said to be equivalent to g
f          g
A → BC     A → BC
B → C      B → C
A → B      AB → C
AB → C
Here f+ = g+
So g covers f
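Comparing f+ and g+ directly is expensive, so equivalence is usually tested in both directions with attribute closures: every FD of g must follow from f, and every FD of f must follow from g. The sketch below is illustrative; `closure`, `covers`, and `equivalent` are hypothetical helper names, and the set h is a made-up counterexample.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g is logically implied by f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """True if f and g imply each other, i.e. f+ = g+."""
    return covers(f, g) and covers(g, f)

f = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
g = [("A", "BC"), ("B", "C"), ("AB", "C")]
h = [("A", "B")]  # hypothetical set that is too weak: it cannot derive B -> C

print(equivalent(f, g))  # True
print(equivalent(f, h))  # False
```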
Minimal cover or canonical cover
• A cover is said to be minimal if it has no redundant terms
• Denoted by Fc
• Example:
F          Fc
A → BC     A → CD
AC → D     D → B
D → B
AB → D
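A sketch of one common way to derive Fc: merge FDs with the same left-hand side, remove one extraneous attribute, and repeat until nothing changes. This is an illustrative implementation under the assumption that attributes are single letters; the function names are hypothetical, and in general the result can depend on the order in which attributes are examined.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def merge(fds):
    """Union rule: combine FDs that share a left-hand side; drop empty ones."""
    merged = {}
    for lhs, rhs in fds:
        merged.setdefault(frozenset(lhs), set()).update(rhs)
    return [(lhs, frozenset(rhs)) for lhs, rhs in merged.items() if rhs]

def drop_one_extraneous(fds):
    """Return fds with one extraneous attribute removed, or None if none exists."""
    for i, (lhs, rhs) in enumerate(fds):
        others = fds[:i] + fds[i + 1:]
        for a in sorted(lhs):
            # a is extraneous on the left if (lhs - a)+ under F contains rhs.
            if len(lhs) > 1 and rhs <= closure(lhs - {a}, fds):
                return others + [(lhs - {a}, rhs)]
        for a in sorted(rhs):
            # a is extraneous on the right if lhs+ under F' still contains a.
            reduced = others + [(lhs, rhs - {a})]
            if a in closure(lhs, reduced):
                return reduced
    return None

def canonical_cover(fds):
    current = merge(fds)
    while True:
        reduced = drop_one_extraneous(current)
        if reduced is None:
            return current
        current = merge(reduced)

F = [("A", "BC"), ("AC", "D"), ("D", "B"), ("AB", "D")]
for lhs, rhs in canonical_cover(F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

On this F the sketch ends with the two dependencies A → CD and D → B, which agrees with the Fc column above.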
Extraneous Attribute
• An attribute of an FD is said to be extraneous if we can remove it without changing
the closure of the set of FDs
• Formally,
• Consider a set F of FDs and α → β in F
– Attribute A is extraneous in α if A ∈ α, and F logically implies
(F − {α → β}) ∪ {(α − A) → β}
– Attribute A is extraneous in β if A ∈ β, and the set of functional dependencies
(F − {α → β}) ∪ {α → (β − A)} logically implies F
Extraneous attribute test
• Let R be the relation schema, and let F be the given set of functional dependencies
that hold on R. Consider an attribute A in a dependency α→β .
• If A ∈ α , to check if A is extraneous, let γ = α − {A}, and check if γ → β can be
inferred from F. To do so, compute γ + (the closure of γ ) under F; if γ + includes all
attributes in β, then A is extraneous in α.
• E.g., given F = {AB → C, A → C}, is B extraneous in AB → C?
– Find A+ and check whether it includes C
– A+ = {A, C}
– Thus, B is extraneous in AB → C
Extraneous attribute test
• Let R be the relation schema, and let F be the given set of functional dependencies
that hold on R. Consider an attribute A in a dependency α→β .
• If A ∈ β , to check if A is extraneous, consider the set F’ =(F-{α → β}) U{α → (β -
A)} and check if α → A can be inferred from F’ . To do so, compute α+ (the closure
of α) under F’ ; if α+ includes A, then A is extraneous in β.
• E.g., given F = {AB → CD, A → E, E → C}, is C extraneous in AB → CD?
– Find F’ and then check whether (AB)+ includes C or not
– F’ = {AB → D, A → E, E → C}
– (AB)+ = {A, B, C, D, E}
– Thus, C is extraneous in AB → CD
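Both tests reduce to one attribute-closure computation each. The sketch below reproduces the two worked examples; the function names are hypothetical, and attributes are single letters so that a string can stand for a set.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def extraneous_in_lhs(fds, fd, a):
    """Left-hand-side test: does (lhs - a)+ under F already contain rhs?"""
    lhs, rhs = fd
    gamma = set(lhs) - {a}
    return set(rhs) <= closure(gamma, fds)

def extraneous_in_rhs(fds, fd, a):
    """Right-hand-side test: does lhs+ under F' still contain a?"""
    lhs, rhs = fd
    f_prime = [g for g in fds if g != fd] + [(lhs, set(rhs) - {a})]
    return a in closure(lhs, f_prime)

F1 = [("AB", "C"), ("A", "C")]
print(extraneous_in_lhs(F1, ("AB", "C"), "B"))   # True, since A+ = {A, C}

F2 = [("AB", "CD"), ("A", "E"), ("E", "C")]
print(extraneous_in_rhs(F2, ("AB", "CD"), "C"))  # True, C is still reachable via E
print(extraneous_in_rhs(F2, ("AB", "CD"), "D"))  # False, nothing else yields D
```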
Normal Forms
[Figure not recovered; surviving callout: "Included in the definition of relation"]
Example
• Let’s consider the following supplier-parts database system
• first <sno, status, city, pno, qty>
• Here the only possible candidate key is (sno, pno)
• FDs for relation first
– (sno, pno) → qty
– sno → city
– sno → status
– city → status
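The claim that (sno, pno) is the only candidate key can be checked by brute force. The sketch below assumes the FDs of relation first are (sno, pno) → qty, sno → city, sno → status, and city → status; the helper names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            k = set(combo)
            if any(key <= k for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(k, fds) == schema:
                keys.append(k)
    return keys

schema = ["sno", "status", "city", "pno", "qty"]
# Assumed FDs for relation first (see the lead-in).
F = [
    (["sno", "pno"], ["qty"]),
    (["sno"], ["city"]),
    (["sno"], ["status"]),
    (["city"], ["status"]),
]

print([sorted(k) for k in candidate_keys(schema, F)])  # [['pno', 'sno']]
```

Enumerating subsets is exponential in the number of attributes, which is acceptable for a five-attribute teaching example but not for wide schemas.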
• Instance of relation first
[Figure not recovered; surviving labels: sno, pno, qty, city, status]
Example of 2NF relations
second
sno  status  city
s1   20      Morrison
s2   10      Centennial
s3   10      Centennial
s4   20      Morrison
s5   30      Denver
sp
sno  pno  qty
s1   p1   300
s1   p2   200
s1   p3   400
s1   p4   200
s1   p5   100
s1   p6   700
s2   p1   200
s2   p2   120
s3   p2   340
s4   p2   230
s4   p4   432
s4   p5   120
• Thus, in r(A, B, C, D), suppose (A, B) is a candidate key and A → D holds
• Then, to reach 2NF, r can be replaced by r1 and r2 as follows
– r1(A, D) with candidate key {A}
– r2(A, B, C) with candidate key {A, B} and foreign key A referencing r1(A)
3rd Normal Form (3NF)
• A relation schema R is in 3NF if, whenever a non-trivial functional dependency
X→A holds in R, either
– X is a superkey of R or
– A is a prime attribute of R
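A sketch of how the 3NF condition can be tested. It inspects only the FDs listed in F, splits each right-hand side into single attributes, and reports those X → A where X is not a superkey and A is not prime. The example schema second(sno, status, city) and its FDs are assumptions taken from the supplier-parts example; the helper names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            k = set(combo)
            if not any(key <= k for key in keys) and closure(k, fds) == schema:
                keys.append(k)
    return keys

def violates_3nf(schema, fds):
    """Return the FDs X -> A (single attribute A) from `fds` that break 3NF."""
    schema = set(schema)
    prime = set().union(*candidate_keys(schema, fds))  # attributes of any key
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == schema
        for a in sorted(set(rhs) - set(lhs)):          # skip trivial parts
            if not is_superkey and a not in prime:
                bad.append((tuple(lhs), a))
    return bad

# Assumed FDs for relation second.
F = [(["sno"], ["city"]), (["sno"], ["status"]), (["city"], ["status"])]
print(violates_3nf(["sno", "status", "city"], F))  # [(('city',), 'status')]
print(violates_3nf(["city", "status"], [(["city"], ["status"])]))  # []
```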
• 3NF addresses two kinds of violation:
– A proper subset of a key of R functionally determines a non-prime attribute
– A non-prime attribute determines another non-prime attribute. This is the same
as addressing the transitive dependency
• Now consider relation second, with the FDs
– sno → city
– sno → status
– city → status
Anomalies
• Insert:
– A particular city has a particular status
– Ex.: any supplier in the city Kersey has status 10
– This fact cannot be inserted until there is actually a supplier located in that city
• Delete:
– If we delete s5, then we lose the information that Denver has status 30
• Update:
– The status of a given city appears in many places
– So updating the status value may be problematic
• Now decompose the relation second into two relations that satisfy 3NF
• sc <sno, city>
• cs <city, status>
• The FDs of the above relations are
– sc: sno → city
– cs: city → status
• Suppose r(A, B, C) has candidate key A and B → C holds
• Then, to reach 3NF, r can be replaced by
– r1(B, C), where B is a candidate key
– r2(A, B), where A is a candidate key and foreign key B references r1(B)
Properties of decomposition
• Decomposition1: Relation second is decomposed into
– sc <sno, city>
– cs <city,status>
• Decomposition2: Relation second is decomposed into
– sc <sno,city>
– ss <sno,status>
• Which of the above decompositions is lossless and dependency preserving?
Desirable properties of decomposition
• Lossless join
– When decomposing a relation into a number of smaller ones, it is crucial
that the decomposition be lossless
• Dependency preservation
– It must remain possible to enforce all the given functional dependencies
on the decomposed relations
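A sketch of a standard test for dependency preservation: for each FD α → β, grow a result set from α using only what each decomposed schema can see, and check whether β is reached. It is applied to the two decompositions of relation second, assuming the FDs sno → city, sno → status, and city → status; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(fd, decomposition, fds):
    """True if `fd` can be enforced using the decomposed schemas alone."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            # What schema ri can derive from the part of result it contains.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result

# Assumed FDs for relation second.
F = [(["sno"], ["city"]), (["sno"], ["status"]), (["city"], ["status"])]
d1 = [["sno", "city"], ["city", "status"]]   # sc and cs
d2 = [["sno", "city"], ["sno", "status"]]    # sc and ss

for name, d in [("Decomposition 1", d1), ("Decomposition 2", d2)]:
    lost = [fd for fd in F if not preserved(fd, d, F)]
    print(name, "loses", lost)
```

Under these assumptions Decomposition 1 loses nothing, while Decomposition 2 loses city → status, because city and status never appear together in one relation.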
Lossless join
• Let R be a relation schema and F be a set of functional dependencies
• Let R1 and R2 form a decomposition of R
• The decomposition is lossless if at least one of the following functional
dependencies is in F+
– R1 ∩ R2 → R1
– R1 ∩ R2 → R2
• In other words, R1 ∩ R2 forms a superkey of either R1 or R2
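The test translates into one attribute-closure computation. The sketch below applies it to both decompositions of relation second and to the earlier employee decomposition. The FDs used are assumptions: sno → city, sno → status, city → status for second, and ID as the key of employee. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def lossless(r1, r2, fds):
    """Binary test: does R1 ∩ R2 determine all of R1 or all of R2?"""
    r1, r2 = set(r1), set(r2)
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure

# Assumed FDs for relation second.
F_second = [(["sno"], ["city"]), (["sno"], ["status"]), (["city"], ["status"])]
print(lossless(["sno", "city"], ["city", "status"], F_second))  # True (city -> cs)
print(lossless(["sno", "city"], ["sno", "status"], F_second))   # True (sno is a key)

# Assumed FD for employee: ID determines every other attribute.
F_emp = [(["ID"], ["name", "street", "city", "salary"])]
print(lossless(["ID", "name"],
               ["name", "street", "city", "salary"], F_emp))    # False
```

A result of False means only that this test does not establish losslessness. For the employee split, name alone determines nothing, which matches the lossy behaviour seen earlier.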