DBMS R19 UNIT III-Part2
Combine Schemas?
A Lossy Decomposition
Example of Lossless-Join Decomposition
Functional Dependencies
α → β is trivial (i.e., β ⊆ α)
α is a superkey for R
Goals of Normalization
Functional-Dependency Theory
We can find F+, the closure of F, by repeatedly applying Armstrong’s Axioms:
if β ⊆ α, then α → β (reflexivity)
if α → β, then γ α → γ β (augmentation)
if α → β, and β → γ, then α → γ (transitivity)
These rules are
sound (generate only functional dependencies that actually hold),
and
complete (generate all functional dependencies that hold).
Example
R = (A, B, C, G, H, I)
F = {A → B
     A → C
     CG → H
     CG → I
     B → H}
Some members of F+:
A→H
by transitivity from A → B and B → H
AG → I
by augmenting A → C with G, to get AG → CG
and then transitivity with CG → I
CG → HI
by augmenting CG → I with CG to infer CG → CGI,
augmenting CG → H with I to infer CGI → HI,
and then transitivity
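The derivations above can be replayed mechanically. The following is a minimal Python sketch, not part of the original notes: the helper names `fd`, `augment`, and `transitive` are hypothetical, and attribute sets are modelled as frozensets of single-letter attribute names.

```python
def fd(lhs, rhs):
    """Build a functional dependency lhs -> rhs from strings of single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))


def augment(dep, gamma):
    """Augmentation: from alpha -> beta infer (gamma alpha) -> (gamma beta)."""
    lhs, rhs = dep
    g = frozenset(gamma)
    return (lhs | g, rhs | g)


def transitive(d1, d2):
    """Transitivity: from alpha -> beta and beta -> gamma infer alpha -> gamma."""
    lhs1, rhs1 = d1
    lhs2, rhs2 = d2
    if rhs1 != lhs2:
        raise ValueError("right side of the first FD must equal left side of the second")
    return (lhs1, rhs2)


# The given set F from the example.
a_b, a_c = fd("A", "B"), fd("A", "C")
cg_h, cg_i, b_h = fd("CG", "H"), fd("CG", "I"), fd("B", "H")

a_h = transitive(a_b, b_h)              # A -> H
ag_cg = augment(a_c, "G")               # AG -> CG
ag_i = transitive(ag_cg, cg_i)          # AG -> I
cg_cgi = augment(cg_i, "CG")            # CG -> CGI
cgi_hi = augment(cg_h, "I")             # CGI -> HI
cg_hi = transitive(cg_cgi, cgi_hi)      # CG -> HI
```

Each derived dependency is produced by exactly the rule named in the corresponding step of the example.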
Procedure for Computing F+
Additional rules:
If α → β holds and α → γ holds, then α → β γ holds (union)
If α → β γ holds, then α → β holds and α → γ holds (decomposition)
If α → β holds and γ β → δ holds, then α γ → δ holds (pseudotransitivity)
The above rules can be inferred from Armstrong’s axioms.
Closure of Attribute Sets
Given a set of attributes α, define the closure of α under F (denoted by α+)
as the set of attributes that are functionally determined by α under F.
Algorithm to compute α+, the closure of α under F:
result := α;
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ
    end
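The pseudocode above translates almost line for line into Python. This is a sketch under stated assumptions: the function name `closure` and the representation of F as a list of (left side, right side) pairs are choices made here, and writing attribute sets as strings only works for single-letter attribute names.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds.

    fds is a list of (lhs, rhs) pairs; each side is an iterable of
    attribute names (a string of single letters works).
    """
    result = set(attrs)
    changed = True
    while changed:                      # "while (changes to result)"
        changed = False
        for lhs, rhs in fds:            # "for each beta -> gamma in F"
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)      # "result := result U gamma"
                changed = True
    return result


# Small illustrative input (hypothetical): A -> B, B -> C
fds = [("A", "B"), ("B", "C")]
a_plus = closure("A", fds)   # A determines B, and B determines C
c_plus = closure("C", fds)   # C determines nothing beyond itself
```

The loop stops as soon as a full pass over F adds no new attribute, so it runs at most once per attribute of the schema plus one final pass.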
Example of Attribute Set Closure
R = (A, B, C, G, H, I)
F = {A → B
     A → C
     CG → H
     CG → I
     B → H}
(AG)+
1. result = AG
2. result = ABCG (A → C and A → B)
3. result = ABCGH (CG → H and CG ⊆ AGBC)
4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
Is AG a candidate key?
1. Is AG a superkey?
   1. Does AG → R? == Is (AG)+ ⊇ R
2. Is any subset of AG a superkey?
   1. Does A → R? == Is (A)+ ⊇ R
   2. Does G → R? == Is (G)+ ⊇ R
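The candidate-key questions above reduce to three closure computations. A minimal Python sketch, assuming the hypothetical helper names `closure`, `is_superkey`, and `is_candidate_key` and single-letter attribute names:

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey when its closure contains every attribute of the schema."""
    return closure(attrs, fds) >= set(schema)


def is_candidate_key(attrs, schema, fds):
    """A candidate key is a superkey with no proper subset that is a superkey."""
    attrs = set(attrs)
    if not is_superkey(attrs, schema, fds):
        return False
    # Any superset of a superkey is a superkey, so it is enough to check
    # that dropping each single attribute destroys the superkey property.
    return not any(is_superkey(attrs - {a}, schema, fds) for a in attrs)


R = "ABCGHI"
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

ag_plus = closure("AG", F)   # all of R, so AG is a superkey
a_plus = closure("A", F)     # A alone does not reach G or I
g_plus = closure("G", F)     # G alone determines nothing else
```

Since neither A nor G alone is a superkey, AG is a candidate key for this F.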
Uses of Attribute Closure
Extraneous Attributes
R = (A, B, C)
F = {A → BC
     B → C
     A → B
     AB → C}
Combine A → BC and A → B into A → BC
Set is now {A → BC, B → C, AB → C}
A is extraneous in AB → C
Check if the result of deleting A from AB → C is implied by
the other dependencies
Yes: in fact, B → C is already present!
Set is now {A → BC, B → C}
C is extraneous in A → BC
Check if A → C is logically implied by A → B and the other
dependencies
Yes: using transitivity on A → B and B → C.
– Can use attribute closure of A in more complex cases
The canonical cover is:
A → B
B → C
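The steps of this example (union rule, then repeated removal of extraneous attributes) can be sketched in Python. This is one possible arrangement rather than a definitive implementation: the function names are hypothetical, dependencies are pairs of frozensets, and extraneous attributes are removed one at a time in a fixed order, so other valid canonical covers may exist for other inputs.

```python
def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))


def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def merge(fds):
    """Union rule: combine dependencies that share a left-hand side."""
    merged = {}
    for lhs, rhs in fds:
        merged[lhs] = merged.get(lhs, frozenset()) | rhs
    return [(lhs, rhs) for lhs, rhs in merged.items() if rhs]


def remove_one_extraneous(fds):
    """Return fds with one extraneous attribute removed, or None if none is found."""
    for i, (lhs, rhs) in enumerate(fds):
        rest = fds[:i] + fds[i + 1:]
        for a in sorted(lhs):
            # a is extraneous on the left if (lhs - a) -> rhs already follows from F.
            if len(lhs) > 1 and rhs <= closure(lhs - {a}, fds):
                return rest + [(lhs - {a}, rhs)]
        for a in sorted(rhs):
            # a is extraneous on the right if lhs -> a still follows once a is dropped here.
            reduced = rest + [(lhs, rhs - {a})]
            if a in closure(lhs, reduced):
                return reduced
    return None


def canonical_cover(fds):
    cover = merge(fds)
    while True:
        step = remove_one_extraneous(cover)
        if step is None:
            return cover
        cover = merge(step)


F = [fd("A", "BC"), fd("B", "C"), fd("A", "B"), fd("AB", "C")]
Fc = canonical_cover(F)
```

For the example input the result contains exactly A → B and B → C, matching the canonical cover stated above.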
Lossless-join Decomposition
Let F_i be the set of dependencies in F+ that include only attributes in R_i.
A decomposition is dependency preserving, if
(F_1 ∪ F_2 ∪ … ∪ F_n)+ = F+
If it is not, then checking updates for violation of functional
dependencies may require computing joins, which is
expensive.
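Computing F+ directly is exponential, so dependency preservation is usually tested one dependency at a time with attribute closures restricted to each R_i. The sketch below follows that standard textbook test; the function name `preserves` and the sample decompositions of R = (A, B, C) with F = {A → B, B → C} are assumptions made for illustration.

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves(fds, decomposition):
    """True if every alpha -> beta in fds can be checked using the R_i alone."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                ri = set(ri)
                # Attributes of R_i determined by the part of result inside R_i.
                gained = closure(result & ri, fds) & ri
                if not gained <= result:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            return False
    return True


F = [("A", "B"), ("B", "C")]
good = preserves(F, ["AB", "BC"])   # both dependencies stay checkable locally
bad = preserves(F, ["AB", "AC"])    # B -> C spans two relations
```

In the second decomposition, enforcing B → C would require joining (A, B) with (A, C), which is the expensive case described above.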
Example
R = (A, B, C )
F = {A → B
B → C}
Key = {A}
R is not in BCNF
Decomposition R_1 = (A, B), R_2 = (B, C)
R_1 and R_2 in BCNF
Lossless-join decomposition
Dependency preserving
Testing for BCNF
To check if a relation R_i in a decomposition of R is in BCNF,
Either test R_i for BCNF with respect to the restriction of F to R_i
(that is, all FDs in F+ that contain only attributes from R_i)
or use the original set of dependencies F that hold on R, but with
the following test:
– for every set of attributes α ⊆ R_i, check that α+ (the
attribute closure of α) either includes no attribute of
R_i - α, or includes all attributes of R_i.
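The second test can be written directly as a loop over the subsets of R_i. A minimal sketch, assuming the hypothetical function name `bcnf_violation` and, as sample input, R = (A, B, C) with F = {A → B, B → C}; it enumerates every non-empty subset, so it is only practical for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violation(ri, fds):
    """Return a subset alpha of ri that witnesses a BCNF violation, or None."""
    ri = set(ri)
    for size in range(1, len(ri) + 1):
        for combo in combinations(sorted(ri), size):
            alpha = set(combo)
            # Attributes of R_i - alpha that alpha determines under the original F.
            gained = (closure(alpha, fds) & ri) - alpha
            # Allowed cases: nothing gained, or everything in R_i - alpha gained.
            if gained and gained != ri - alpha:
                return alpha
    return None


F = [("A", "B"), ("B", "C")]
whole = bcnf_violation("ABC", F)   # B determines C but not A
r1 = bcnf_violation("AB", F)
r2 = bcnf_violation("BC", F)
```

A returned set α means that α → (α+ ∩ R_i) − α holds on R_i while α is not a superkey of R_i.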
R = (A, B, C )
F = {A → B
B → C}
Key = {A}
Functional dependencies:
course_id → title, dept_name, credits
building, room_number → capacity
course_id, sec_id, semester, year → building, room_number, time_slot_id
BCNF Decomposition:
course is in BCNF
How do we know this?
building, room_number → capacity holds on class-1
but {building, room_number} is not a superkey for class-1.
We replace class-1 by:
classroom (building, room_number, capacity)
section (course_id, sec_id, semester, year,
building, room_number, time_slot_id)
classroom and section are in BCNF.
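The replacement step above is the usual BCNF split on a violating dependency α → β: one schema gets α ∪ β, the other gets R − (β − α). A small Python sketch, where the function name `decompose_on` is hypothetical and the attribute list of class-1 is an assumption inferred here as the union of classroom and section:

```python
def decompose_on(schema, alpha, beta):
    """Split schema on the violating dependency alpha -> beta."""
    schema, alpha, beta = set(schema), set(alpha), set(beta)
    return alpha | beta, schema - (beta - alpha)


# Assumed attributes of class-1 (union of the two result schemas in the text).
class_1 = {
    "course_id", "sec_id", "semester", "year",
    "building", "room_number", "capacity", "time_slot_id",
}

classroom, section = decompose_on(
    class_1, {"building", "room_number"}, {"capacity"}
)
```

The shared attributes building and room_number form a key of classroom, which is what makes the split lossless.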
Relation dept_advisor:
dept_advisor (s_ID, i_ID, dept_name)
F = {s_ID, dept_name → i_ID, i_ID → dept_name}
Two candidate keys: {s_ID, dept_name} and {i_ID, s_ID}
R is in 3NF
s_ID, dept_name → i_ID
– s_ID, dept_name is a superkey
i_ID → dept_name
– dept_name is contained in a candidate key
Redundancy in 3NF
Optimization: Need to check only FDs in F, need not check all FDs in F+.
Use attribute closure to check, for each dependency α → β, if α is a
superkey.
If α is not a superkey, we have to verify if each attribute in β is contained
in a candidate key of R
this test is rather more expensive, since it involves finding candidate
keys
testing for 3NF has been shown to be NP-hard
Interestingly, decomposition into third normal form (described
shortly) can be done in polynomial time
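The test described above can be sketched in Python for small schemas. The names `candidate_keys` and `is_3nf` are hypothetical, and candidate keys are found by brute-force enumeration of attribute subsets, which is exactly the expensive part the notes warn about. The dept_advisor schema with F = {s_ID, dept_name → i_ID; i_ID → dept_name} serves as sample input.

```python
from itertools import combinations


def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal superkeys, found by trying subsets in order of size."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            # Skip supersets of keys already found: they are not minimal.
            if closure(attrs, fds) >= schema and not any(k <= attrs for k in keys):
                keys.append(attrs)
    return keys


def is_3nf(schema, fds):
    """Check only the dependencies in F, as the optimization allows."""
    schema = set(schema)
    keys = candidate_keys(schema, fds)
    for lhs, rhs in fds:
        if closure(lhs, fds) >= schema:
            continue                      # alpha is a superkey
        for attr in set(rhs) - set(lhs):  # attributes in lhs make that part trivial
            if not any(attr in k for k in keys):
                return False
    return True


dept_advisor = {"s_ID", "i_ID", "dept_name"}
F = [({"s_ID", "dept_name"}, {"i_ID"}), ({"i_ID"}, {"dept_name"})]
keys = candidate_keys(dept_advisor, F)
```

For contrast, R = (A, B, C) with A → B and B → C fails the test because C is not part of the only candidate key, A.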
3NF Decomposition Algorithm
Relation schema:
cust_banker_branch = (customer_id, employee_id, branch_name, type )
The functional dependencies for this relation schema are:
1. customer_id, employee_id → branch_name, type
2. employee_id → branch_name
3. customer_id, branch_name → employee_id
We first compute a canonical cover
1. branch_name is extraneous in the r.h.s. of the 1st dependency
2. No other attribute is extraneous, so we get F_c =
customer_id, employee_id → type
employee_id → branch_name
customer_id, branch_name → employee_id
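The claim that branch_name is extraneous can be checked with one attribute closure: drop it from the right-hand side of the first dependency and see whether the left-hand side still determines it. A minimal Python sketch; the function name `rhs_attr_is_extraneous` is hypothetical, and only right-hand-side attributes are tested here.

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def rhs_attr_is_extraneous(attr, dep_index, fds):
    """True if attr can be dropped from the right side of fds[dep_index]."""
    lhs, rhs = fds[dep_index]
    reduced = list(fds)
    reduced[dep_index] = (lhs, set(rhs) - {attr})
    # attr is extraneous if lhs still determines it under the reduced set.
    return attr in closure(lhs, reduced)


F = [
    ({"customer_id", "employee_id"}, {"branch_name", "type"}),
    ({"employee_id"}, {"branch_name"}),
    ({"customer_id", "branch_name"}, {"employee_id"}),
]

branch_extraneous = rhs_attr_is_extraneous("branch_name", 0, F)
type_extraneous = rhs_attr_is_extraneous("type", 0, F)
```

branch_name is still reachable through employee_id → branch_name, while type is determined by no other dependency.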
3NF Decomposition Example (Cont.)
result will not depend on the order in which FDs are considered
inst_child(ID, child_name)
inst_phone(ID, phone_number)
MVD (Cont.)
Tabular representation of α →→ β
Example
In our example:
ID →→ child_name
ID →→ phone_number
The above formal definition is supposed to formalize the notion that given
a particular value of Y (ID) it has associated with it a set of values of Z
(child_name) and a set of values of W (phone_number), and these two sets
are in some sense independent of each other.
Note:
If Y → Z then Y →→ Z
Indeed we have (in above notation) Z_1 = Z_2. The claim follows.
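The independence idea behind α →→ β can be tested on a concrete relation instance: whenever two tuples agree on α, the tuple that takes its β values from the first and its remaining values from the second must also be present. The Python sketch below assumes α and β are disjoint; the function name `satisfies_mvd`, the column layout, and the sample rows are all made up for illustration.

```python
def satisfies_mvd(rows, alpha, beta):
    """Check alpha ->> beta on a relation given as a list of dicts."""
    if not rows:
        return True
    rest = set(rows[0]) - set(alpha) - set(beta)

    def proj(row, names):
        return tuple(row[n] for n in sorted(names))

    present = {(proj(r, alpha), proj(r, beta), proj(r, rest)) for r in rows}
    for a1, b1, _ in present:
        for a2, _, r2 in present:
            # Tuples agreeing on alpha must allow beta and rest to be swapped.
            if a1 == a2 and (a1, b1, r2) not in present:
                return False
    return True


# Hypothetical instance: one ID with two children and two phone numbers.
rows = [
    {"ID": "99999", "child_name": "David", "phone_number": "512-555-1234"},
    {"ID": "99999", "child_name": "David", "phone_number": "512-555-4321"},
    {"ID": "99999", "child_name": "William", "phone_number": "512-555-1234"},
    {"ID": "99999", "child_name": "William", "phone_number": "512-555-4321"},
]

holds = satisfies_mvd(rows, ["ID"], ["child_name"])
broken = satisfies_mvd(rows[:3], ["ID"], ["child_name"])
```

Removing a single combination breaks the dependency, which shows why such a relation must store every child paired with every phone number.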
Use of Multivalued Dependencies
The restriction of D to R_i is the set D_i consisting of
– All functional dependencies in D+ that include only attributes of R_i
– All multivalued dependencies of the form
α →→ (β ∩ R_i)
where α ⊆ R_i and α →→ β is in D+
4NF Decomposition Algorithm
result := {R};
done := false;
compute D+;
Let D_i denote the restriction of D+ to R_i
while (not done)
    if (there is a schema R_i in result that is not in 4NF) then
    begin
        let α →→ β be a nontrivial multivalued dependency that holds
            on R_i such that α → R_i is not in D_i, and α ∩ β = ∅;
        result := (result - R_i) ∪ (R_i - β) ∪ (α, β);
    end
    else done := true;
Note: each R_i is in 4NF, and decomposition is lossless-join
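The core of the loop is the replacement of R_i by (R_i − β) and (α, β). The sketch below isolates only that step; choosing a violating dependency and computing D+ are left out. The function name `split_on_mvd` is hypothetical, and the sample input assumes a schema R = (A, B, C, G, H, I) on which A →→ B and CG →→ H hold.

```python
def split_on_mvd(schema, alpha, beta):
    """One 4NF decomposition step on alpha ->> beta, with alpha and beta disjoint.

    Returns the pair ((alpha, beta), schema - beta).
    """
    schema, alpha, beta = set(schema), set(alpha), set(beta)
    if alpha & beta:
        raise ValueError("the algorithm picks a dependency with disjoint sides")
    return alpha | beta, schema - beta


# First split on A ->> B, then split the remainder on CG ->> H.
r1, r2 = split_on_mvd("ABCGHI", "A", "B")
r3, r4 = split_on_mvd(r2, "CG", "H")
```

Each step keeps α in both resulting schemas, which is what allows the original relation to be rebuilt by a join.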
Example
R = (A, B, C, G, H, I)
F = {A →→ B
     B →→ HI
     CG →→ H}
R is not in 4NF since A →→ B and A is not a superkey for R
Decomposition
a) R_1 = (A, B) (R_1 is in 4NF)
b) R_2 = (A, C, G, H, I) (R_2 is not in 4NF, decompose into R_3 and R_4)
c) R_3 = (C, G, H) (R_3 is in 4NF)
d) R_4 = (A, C, G, I) (R_4 is not in 4NF, decompose into R_5 and R_6)