Dbms Unit-4
UNIT-IV
Schema Refinement (Normalization)
• Purpose of normalization (schema refinement):
Normalization is the process of organizing the data
in a database to avoid data redundancy and the insertion,
update and deletion anomalies that redundancy causes.
• Redundancy means having multiple copies of the
same data in the database.
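Closure of attribute A-
Example, using the functional dependencies A → BC, BC → DE, D → F and CF → G:
A+ = { A }
= { A , B , C } ( Using A → BC )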
= { A , B , C , D , E} ( Using BC → DE )
= { A , B , C , D , E , F} ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G}
Closure of attribute D-
D+ = { D }
= { D , F } ( Using D → F )
We can not determine any other attribute using attributes D and F contained in the result set.
Thus,
D+ = { D , F }
Closure of attribute set { B , C }-
{ B , C }+ = { B , C }
= { B , C , D , E } ( Using BC → DE )
= { B , C , D , E , F} ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
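The closure computations above can be reproduced with a short Python sketch. It assumes the functional dependencies are exactly the four used in the steps (A → BC, BC → DE, D → F, CF → G) and that attributes are single letters; the function name `closure` is illustrative.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its whole left side is already in the result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FDs taken from the worked steps above (assumed to be the complete set).
FDS = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(sorted(closure("A", FDS)))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(sorted(closure("D", FDS)))   # ['D', 'F']
print(sorted(closure("BC", FDS)))  # ['B', 'C', 'D', 'E', 'F', 'G']
```

The loop repeats until a full pass adds nothing new, which mirrors the "we can not determine any other attribute" stopping condition in the notes.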
Finding the Keys Using Closure:
• Super Key-
If the closure of an attribute set contains all the attributes of the
relation, then that attribute set is called a super key of that relation.
Thus, we can say-
“The closure of a super key is the entire relation schema.”
Example-
In the above example, the closure of attribute A is the entire relation
schema. Thus, attribute A is a super key for that relation.
• Candidate Key-
If an attribute set is a super key and no proper subset of it has a closure
that contains all the attributes of the relation, then that attribute set is
called a candidate key of that relation.
Example-
In the above example, A is a super key and has no non-empty proper subset.
Thus, attribute A is also a candidate key for that relation.
A candidate key can be seen as a minimal super key, in terms of
attributes.
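As a rough sketch, candidate keys can be enumerated by testing attribute sets from smallest to largest and keeping every super key that does not contain an already found key. The relation R(A, ..., G) and its four dependencies are assumed from the closure example; `candidate_keys` and `is_super_key` are illustrative names, and the brute-force search is only practical for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_super_key(attrs, all_attrs, fds):
    # A super key's closure is the entire relation schema.
    return closure(attrs, fds) == set(all_attrs)

def candidate_keys(all_attrs, fds):
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            # Skip sets that contain a smaller key: they are not minimal.
            if any(key <= candidate for key in keys):
                continue
            if is_super_key(candidate, all_attrs, fds):
                keys.append(candidate)
    return keys

FDS = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]
print(is_super_key("AD", "ABCDEFG", FDS))  # True: a super key, but not minimal
print(candidate_keys("ABCDEFG", FDS))      # [{'A'}]
```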
Problem-
•Example-3: Let a relation R have attributes {a1, a2, a3, …, an} and let the
only candidate key be “a1 a2 a3”. How many super keys are possible?
Following the previous formula, we have 3 key attributes instead of
one. Each of the remaining n – 3 attributes may or may not be added to the
key, so the number of possible super keys is 2^(n – 3).
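Example-4: Let a relation R have attributes {a1, a2,
a3, …, an} and let the candidate keys be “a1” and “a2”.
How many super keys are possible?
This problem is slightly different, since we now
have two candidate keys instead of only one.
Count the super keys that contain a1, add those that
contain a2, and subtract those that contain both,
because they were counted twice:
Super keys of (a1) + Super keys of (a2) – Super keys of (a1 a2)
= 2^(n – 1) + 2^(n – 1) – 2^(n – 2)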
Let a relation R have attributes {a1, a2, a3, …, an} and let the candidate
keys be “a1 a2” and “a3 a4”. How many super keys are possible?
Super keys of (a1 a2) + Super keys of (a3 a4) – Super keys of (a1 a2 a3 a4)
= 2^(n – 2) + 2^(n – 2) – 2^(n – 4)
Let a relation R have attributes {a1, a2, a3, …, an} and let the candidate
keys be “a1 a2” and “a1 a3”. How many super keys are possible?
Super keys of (a1 a2) + Super keys of (a1 a3) – Super keys of (a1 a2 a3)
= 2^(n – 2) + 2^(n – 2) – 2^(n – 3)
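The counting rule used in these problems is inclusion-exclusion over the candidate keys. The sketch below, with illustrative function names and attributes written as the integers 1..n, computes the formula and cross-checks it by brute force for a small n.

```python
from itertools import combinations

def count_super_keys(n, keys):
    """Inclusion-exclusion count; `keys` is a list of sets of attribute indices."""
    total = 0
    for r in range(1, len(keys) + 1):
        for group in combinations(keys, r):
            union = set().union(*group)
            # Attributes outside the union are free: 2^(n - |union|) choices.
            total += (-1) ** (r + 1) * 2 ** (n - len(union))
    return total

def count_super_keys_brute(n, keys):
    """Enumerate every attribute subset and count those containing some key."""
    count = 0
    for mask in range(1 << n):
        chosen = {i + 1 for i in range(n) if (mask >> i) & 1}
        if any(key <= chosen for key in keys):
            count += 1
    return count

n = 6
print(count_super_keys(n, [{1, 2, 3}]))       # 2^(6-3) = 8
print(count_super_keys(n, [{1, 2}, {3, 4}]))  # 16 + 16 - 4 = 28
print(count_super_keys(n, [{1, 2}, {1, 3}]))  # 16 + 16 - 8 = 24
```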
Armstrong’s Axioms
Canonical Cover OR Irreducible set
In DBMS,
•A canonical cover is a simplified and reduced version of
the given set of functional dependencies.
•Since it is a reduced version, it is also called
an irreducible set.
Characteristics-
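Example-
The checks below start from the following set of functional dependencies,
each with a single attribute on its right side:
X → W
WZ → X
WZ → Y
Y → W
Y → X
Y → Z
For X → W: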
•Considering X → W, (X)+ = { X , W }
•Ignoring X → W, (X)+ = { X }
Now,
•Clearly, the two results are different.
•Thus, we conclude that X → W is essential and cannot be
eliminated.
For WZ → X:
•Considering WZ → X, (WZ)+ = { W , X , Y , Z }
•Ignoring WZ → X, (WZ)+ = { W , X , Y , Z }
Now,
•Clearly, the two results are the same.
•Thus, we conclude that WZ → X is non-essential and can be
eliminated.
•Eliminating WZ → X, our set of functional dependencies
reduces to-
X→W
WZ → Y
Y→W
Y→X
Y→Z
Now, we will consider this reduced set in further checks.
For WZ → Y:
•Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
•Ignoring WZ → Y, (WZ)+ = { W , Z }
Now,
•Clearly, the two results are different.
•Thus, we conclude that WZ → Y is essential and cannot be eliminated.
For Y → W:
•Considering Y → W, (Y)+ = { W , X , Y , Z }
•Ignoring Y → W, (Y)+ = { W , X , Y , Z }
Now,
•Clearly, the two results are the same.
•Thus, we conclude that Y → W is non-essential and can be eliminated.
•Eliminating Y → W, our set of functional dependencies reduces to-
X→W
WZ → Y
Y→X
Y→Z
For Y → X:
•Considering Y → X, (Y)+ = { W , X , Y , Z }
•Ignoring Y → X, (Y)+ = { Y , Z }
Now,
•Clearly, the two results are different.
•Thus, we conclude that Y → X is essential and cannot be
eliminated.
For Y → Z:
•Considering Y → Z, (Y)+ = { W , X , Y , Z }
•Ignoring Y → Z, (Y)+ = { W , X , Y }
Now,
•Clearly, the two results are different.
•Thus, we conclude that Y → Z is essential and cannot be
eliminated.
From here, our essential functional dependencies are-
X→W
WZ → Y
Y→X
Y→Z
Step-03:
•Consider the functional dependencies having more than one attribute
on their left side.
•Check if their left side can be reduced.
In our set,
•Only WZ → Y contains more than one attribute on its left side.
•Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
Now,
•Consider all the possible subsets of WZ.
•Check if the closure result of any subset matches to the closure result
of WZ.
(W)+ = { W }
(Z)+ = { Z }
Clearly,
•None of the proper subsets has the same closure
as the entire left side.
•Thus, we conclude that we cannot replace
WZ → Y with W → Y or Z → Y.
Finally, the canonical cover is-
X→W
WZ → Y
Y→X
Y→Z
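The walkthrough above can be sketched in Python. The sketch follows the order used in these notes (split right sides, drop non-essential dependencies, then try to shrink left sides), and assumes the starting set is the reduced set plus the eliminated WZ → X. For the left-side step it uses the usual test "the right side is still in the closure of the smaller left side" rather than comparing whole closures; `canonical_cover` is an illustrative name.

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def canonical_cover(fds):
    # Step 1: one attribute per right side, duplicates removed, order kept.
    fds = list(dict.fromkeys((frozenset(l), a) for l, r in fds for a in r))
    # Step 2: an FD is non-essential if the remaining FDs still derive it.
    for fd in list(fds):
        rest = [f for f in fds if f != fd]
        if fd[1] in closure(fd[0], rest):
            fds = rest
    # Step 3: drop a left-side attribute if the right side is still derivable.
    for i, (lhs, rhs) in enumerate(fds):
        for attr in sorted(lhs):
            smaller = lhs - {attr}
            if smaller and rhs in closure(smaller, fds):
                lhs = smaller
                fds[i] = (lhs, rhs)
    return ["".join(sorted(l)) + " → " + r for l, r in fds]

START = [("X", "W"), ("WZ", "X"), ("WZ", "Y"), ("Y", "W"), ("Y", "X"), ("Y", "Z")]
print(canonical_cover(START))  # ['X → W', 'WZ → Y', 'Y → X', 'Y → Z']
```

A canonical cover is not unique in general; checking the dependencies in a different order can produce a different but equivalent set.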
Exercise Problems:
Check whether it is in 2NF:
• A table has fields F1, F2, F3, F4, and F5, with the
following functional dependencies:
F1->F3
F2->F4
(F1,F2)->F5
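One way to check an exercise like this mechanically is to find the candidate keys and then look for non-prime attributes that depend on only part of a key. The sketch below assumes the three dependencies listed are the complete set; the helper names are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """`fds` is a list of (lhs, rhs) pairs of attribute-name sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(key <= set(combo) for key in keys):
                continue  # contains a smaller key, so not minimal
            if closure(combo, fds) == all_attrs:
                keys.append(set(combo))
    return keys

def partial_dependencies(all_attrs, fds):
    keys = candidate_keys(all_attrs, fds)
    prime = set().union(*keys)  # attributes that belong to some candidate key
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                # Non-prime attributes determined by a proper part of a key.
                for attr in sorted(closure(part, fds) - set(part) - prime):
                    found.append((part, attr))
    return found

ATTRS = {"F1", "F2", "F3", "F4", "F5"}
FDS = [({"F1"}, {"F3"}), ({"F2"}, {"F4"}), ({"F1", "F2"}, {"F5"})]
violations = partial_dependencies(ATTRS, FDS)
print(violations)      # [(('F1',), 'F3'), (('F2',), 'F4')]
print(not violations)  # False: the table is not in 2NF
```

Here the only candidate key is (F1, F2), and F3 and F4 each depend on just one of its attributes.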
Check whether it is in 2NF: [tables Employee and Bank_Info]
• Consider relation R(A, B, C, D, E)
A -> BC,
CD -> E,
B -> D,
E -> A
t1[a] = t2[a]
Then there exists t3 and t4 in r such that.
1 Science 1 Cricket
1 Maths 1 Hockey
2 C# 2 Chess
2 Php 2 Hockey
Decomposition
The process of breaking up of a relation into smaller
sub-relations is called Decomposition.
Decomposition is required in DBMS to convert a
relation into a specific normal form, which
reduces redundancy, anomalies and inconsistency
in the relation.
There are mainly two types of decompositions in
DBMS-
1.Lossless Decomposition
2.Lossy Decomposition
1. Lossless Decomposition
• The decomposition R1, R2, R3, …, Rn of a relation
schema R is said to be lossless if the natural
join of the sub-relations yields exactly the original relation R.
R (A, B, C)     R1 (A, B)     R2 (B, C)
a1 b1 c1        a1 b1         b1 c1
a2 b1 c1        a2 b1         b2 c2
a1 b2 c2        a1 b2
If R1 ⋈ R2 = R, the decomposition is lossless. Here R1 ⋈ R2 is
A B C
a1 b1 c1
a2 b1 c1
a1 b2 c2
As R1 ⋈ R2 = R,
this decomposition is lossless.
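The join check in this example can be replayed on the three tuples of R with a small Python sketch. `project` and `natural_join` are illustrative helpers written for this example, not a library API.

```python
def project(rows, attrs, keep):
    """Project a set of tuples onto the attributes in `keep`."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(rows1, attrs1, rows2, attrs2):
    """Natural join of two relations given as sets of tuples."""
    common = [a for a in attrs1 if a in attrs2]
    out_attrs = attrs1 + [a for a in attrs2 if a not in attrs1]
    joined = set()
    for row1 in rows1:
        t1 = dict(zip(attrs1, row1))
        for row2 in rows2:
            t2 = dict(zip(attrs2, row2))
            # Keep the pair only if it agrees on every shared attribute.
            if all(t1[a] == t2[a] for a in common):
                merged = {**t1, **t2}
                joined.add(tuple(merged[a] for a in out_attrs))
    return joined

ABC = ["A", "B", "C"]
R = {("a1", "b1", "c1"), ("a2", "b1", "c1"), ("a1", "b2", "c2")}

# Split on B: R1(A, B) and R2(B, C).
split_on_b = natural_join(project(R, ABC, ["A", "B"]), ["A", "B"],
                          project(R, ABC, ["B", "C"]), ["B", "C"])
# Split on A instead: R1(A, B) and R2(A, C).
split_on_a = natural_join(project(R, ABC, ["A", "B"]), ["A", "B"],
                          project(R, ABC, ["A", "C"]), ["A", "C"])

print(split_on_b == R)                     # True: the join gives back R
print(len(split_on_a), R < split_on_a)     # 5 True: extra tuples appear
```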
Determining Whether Decomposition
Is Lossless Or Lossy-
Consider a relation R that is decomposed into two sub
relations R1 and R2.
Then,
•If all the following conditions are satisfied, then the
decomposition is lossless.
•If any of these conditions fails, then the
decomposition is lossy.
Condition-01:
Union of both the sub relations must contain
all the attributes that are present in the
original relation R.
Thus, R1 ∪ R2 = R
Condition-02:
Intersection of both the sub relations must
not be empty.
In other words, there must be some common
attribute which is present in both the sub
relations.
Thus, R1 ∩ R2 ≠ ∅
Condition-03:
Intersection of both the sub relations must be
a super key of either R1 or R2 or both.
Thus, R1 ∩ R2 = Super key of R1 or R2
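The three conditions can be sketched as a single function. The example relation R(A, B, C) and the dependency B → C are assumptions made only for illustration; the notes do not give functional dependencies for the table example.

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r, r1, r2, fds):
    r, r1, r2 = set(r), set(r1), set(r2)
    # Condition 1: the union must cover every attribute of R.
    if r1 | r2 != r:
        return False
    # Condition 2: the sub-relations must share at least one attribute.
    common = r1 & r2
    if not common:
        return False
    # Condition 3: the shared attributes must determine all of R1 or all of R2.
    reach = closure(common, fds)
    return r1 <= reach or r2 <= reach

FDS = [("B", "C")]  # assumed FD, for illustration only
print(is_lossless_binary("ABC", "AB", "BC", FDS))  # True: B is a key of R2(B, C)
print(is_lossless_binary("ABC", "AB", "AC", FDS))  # False: A determines nothing
```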
2. Lossy Decomposition
The decomposition R1, R2, R3, …, Rn of a relation
schema R is said to be lossy if the natural join of the
sub-relations adds extraneous tuples to the
original relation R.
Check yourself?
R (A, B, C)     R1 (A, B)     R2 (A, C)
a1 b1 c1        a1 b1         a1 c1
a2 b1 c1        a2 b1         a2 c1
a1 b2 c2        a1 b2         a1 c2
R1 ⋈ R2
A B C
a1 b1 c1
a1 b1 c2
a2 b1 c1
a1 b2 c1
a1 b2 c2
Here R ⊂ R1 ⋈ R2, so it is lossy.
Dependency Preservation
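A decomposition D = { R1, R2, R3, …, Rn } of R is dependency
preserving with respect to a set F of functional dependencies if
(F1 ∪ F2 ∪ … ∪ Fn)+ = F+
where Fi is the set of dependencies in F+ that involve only attributes of Ri.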
Sol:
a) First check whether the given decomposition of R is lossless or lossy:
Condition 1: R1 ∪ R2 = {A, B, C, D} = R, so it holds.
Condition 2: R1 ∩ R2 = {}. Is it an empty set? Yes. Since the intersection is empty,
there is no need to check the third condition; the decomposition is lossy.
Sol:
a) First check whether the given decomposition of R is lossless or lossy:
Condition 1: R1 ∪ R2 ∪ R3 = {A, B, C, D, E} = R, so it holds.
Condition 2 & 3:
With three sub-relations, take any two that share a common attribute and
decide whether that pair is lossy or lossless. If it is lossless, join the two
sub-relations and check the result against the third sub-relation.
R2 ∩ R3 = {D}. Is it an empty set? No. So check whether it is a super key of R2 or R3.
Yes, D is the candidate key of R2. So this pair is lossless. Joining the sub-relations
R2 and R3, we get R23 {A, C, D, E}.
Now check whether R1 and R23 are lossy or lossless.
R1 ∩ R23 = {A, C}. Is it an empty set? No. So check whether it is a super key of R1 or
R23. Yes, it is a super key of R1. So the given decomposition of R is lossless.
b) Now check whether it is dependency preserving or not.
Divide the FDs among the sub-relations:
R1(A, B, C)      R2(D, E)      R3(A, C, D)
{ A → B,         { D → E }     { D → AC }
  AB → C }
F1 ∪ F2 ∪ F3 should be equivalent to F (where F1, F2 and F3 are the
functional dependencies in R1, R2 and R3 respectively).
So F1 ∪ F2 ∪ F3 = { A → B, AB → C, D → AC, D → E }
Now check whether it is equivalent to F, by checking that the closure of each
left side in F gives the same result under both sets. In the table below all
values are equal, so the decomposition is dependency preserving.
        F1 ∪ F2 ∪ F3      F
A+      {A,B,C}           {A,B,C}
AB+     {A,B,C}           {A,B,C}
D+      {A,B,C,D,E}       {A,B,C,D,E}
Q4) Consider a relation schema R ( A , B , C , D ) with the following functional
dependencies-
A→B
B→C
C→D
D→B
Determine whether the decomposition of R into R1 ( A , B ) , R2 ( B , C ) and R3 ( B , D ) is
lossless or lossy, and also check whether it is dependency preserving.
Sol:
a)
First check whether the given decomposition of R is lossless or lossy:
Condition 1: R1 ∪ R2 ∪ R3 = {A, B, C, D} = R, so it holds.
Condition 2 & 3:
With three sub-relations, take any two that share a common attribute and
decide whether that pair is lossy or lossless. If it is lossless, join the two
sub-relations and check the result against the third sub-relation.
R1 ∩ R2 = {B}. Is it an empty set? No. So check whether it is a super key of R1 or R2.
Yes, B is a candidate key of R2. So this pair is lossless. Joining the sub-relations R1
and R2, we get R12 {A, B, C}.
Now check whether R12 and R3 are lossy or lossless.
R12 ∩ R3 = {B}. Is it an empty set? No. So check whether it is a super key of R12 or R3.
Yes, B is a candidate key of R3. So the given decomposition of R is lossless.
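The pairwise procedure used in part a) can be sketched as a loop that keeps merging two sub-relations whenever their common attributes form a super key of one of them. This is a sketch of the method in these notes, with an illustrative function name. A True result confirms that the decomposition is lossless; a False result only means this procedure could not confirm it, and a complete test for more than two sub-relations (such as the chase test) is outside this sketch.

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def confirms_lossless(r, parts, fds):
    parts = [set(p) for p in parts]
    # Condition 1: the sub-relations together must cover R.
    if set().union(*parts) != set(r):
        return False
    merged = True
    while len(parts) > 1 and merged:
        merged = False
        for i in range(len(parts)):
            for j in range(i + 1, len(parts)):
                common = parts[i] & parts[j]
                if not common:
                    continue  # Condition 2 fails for this pair
                reach = closure(common, fds)
                # Condition 3: common attributes are a super key of one side.
                if parts[i] <= reach or parts[j] <= reach:
                    combined = parts[i] | parts[j]
                    parts = [p for k, p in enumerate(parts) if k not in (i, j)]
                    parts.append(combined)
                    merged = True
                    break
            if merged:
                break
    return len(parts) == 1

FDS = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
print(confirms_lossless("ABCD", ["AB", "BC", "BD"], FDS))  # True
```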
b) Now check whether it is dependency preserving or not.
Divide the FDs among the sub-relations. For each sub-relation, compute the
closure of its attributes using the given FDs and add every dependency that
holds between attributes of that sub-relation. Here the closure of C is
C+ = {C, D, B}, so we add C → B to R2. Similarly B+ = {B, C, D}, so we add
B → D to R3.
R1(A, B)      R2(B, C)      R3(B, D)
{ A → B }     { B → C,      { B → D,
                C → B }       D → B }
F1 ∪ F2 ∪ F3 should be equivalent to F (where F1, F2 and F3 are the
functional dependencies in R1, R2 and R3 respectively).
So F1 ∪ F2 ∪ F3 = { A → B, B → C, C → B, B → D, D → B }.
Now check whether it is equivalent to F, by checking that the closure of each
attribute gives the same result under both sets. In the table below all values
are equal, so the decomposition is dependency preserving.
        F1 ∪ F2 ∪ F3      F
A+      {A,B,C,D}         {A,B,C,D}
B+      {B,C,D}           {B,C,D}
C+      {C,B,D}           {C,B,D}
D+      {D,B,C}           {D,B,C}
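Part b) can be sketched the same way: project F onto each sub-relation by taking closures of its attribute subsets, then check that every dependency in F follows from the union of the projections. The function names are illustrative, and the subset enumeration is only practical for small sub-relations.

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def project_fds(sub, fds):
    """Non-trivial FDs implied by `fds` that use only attributes of `sub`."""
    sub = set(sub)
    projected = []
    for size in range(1, len(sub) + 1):
        for lhs in combinations(sorted(sub), size):
            rhs = (closure(lhs, fds) & sub) - set(lhs)
            if rhs:
                projected.append(("".join(lhs), "".join(sorted(rhs))))
    return projected

def is_dependency_preserving(parts, fds):
    union = [fd for part in parts for fd in project_fds(part, fds)]
    # Every original FD must be derivable from the projected FDs alone.
    return all(set(rhs) <= closure(lhs, union) for lhs, rhs in fds)

FDS = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
PARTS = ["AB", "BC", "BD"]
print([fd for p in PARTS for fd in project_fds(p, FDS)])
# [('A', 'B'), ('B', 'C'), ('C', 'B'), ('B', 'D'), ('D', 'B')]
print(is_dependency_preserving(PARTS, FDS))  # True
```

C → D is not stored directly in any sub-relation, but it follows from C → B and B → D, which is why the check compares closures instead of looking for each dependency verbatim.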