Unit-4 DBMS
UNIT-4
Normalization:
Introduction to Normalization, concepts of anomalies and their types, closure set of dependencies and attributes,
Various Normal Forms: 1NF, 2NF, 3NF, BCNF, Functional Dependency, Decomposition, Dependency Preservation,
Lossless & Lossy Join, Definition of Dangling Tuple, and Multivalued Dependencies.
• Normalization divides the larger table into smaller ones and links them using
relationships.
• The normal form is used to reduce redundancy from the database table.
In a database, we use normalization to remove (or minimize) the following three problems.
(i) Insertion Anomalies: Insertion anomaly refers to the situation where one cannot insert a new tuple
into a relation due to a lack of data. If a new row inserted in the table creates
inconsistency in the table, then it is called the insertion anomaly.
(ii) Deletion Anomalies: The delete anomaly refers to the situation where the deletion of
data results in the unintended loss of some other important data. If we delete some
rows from the table and if any other information or data that is required is also deleted
from the database, this is called the deletion anomaly in the database.
(iii) Updation Anomalies: The update anomaly is when an update of a single data value
requires multiple rows of data to be updated. When we update some rows in the table,
and if it leads to the inconsistency of the table then this anomaly occurs. This type of
anomaly is known as an updation anomaly.
Example:
In the above table, we have listed students with their name, id, branch and their respective clubs.
a) Updation / Update Anomaly: In the above table, if Shivani changes her branch from Computer
Science to Electronics, then we will have to update all the rows. If we miss any row, then
Shivani will have more than one branch, which will create the update anomaly in the table.
b) Insertion Anomaly: If we add a new row for student Ankit who is not a part of any club, we
cannot insert the row into the table as we cannot insert null in the column of stu_club. This is
called insertion anomaly.
c) Deletion Anomaly: If we remove the photography club from the college, then we will have to
delete its row from the table. But it will also delete the row of Gopal and his details. So, this is
called deletion anomaly: information about Gopal is lost unintentionally.
• A functional dependency (FD) is a relationship between two attributes, typically between the
primary key (PK) and other non-key attributes within a table. For any relation R, attribute Y is
functionally dependent on attribute X (usually the PK), if for every valid instance of X, that value
of X uniquely determines the value of Y.
X →Y
X is the determinant, Y is the dependent
• “X determines Y”
• “Y is dependent on X”
Mathematically, let
• α ⊆ R
• β ⊆ R
Then, the functional dependency fd : α → β holds on R if, for every pair of tuples t1 and t2,
t1[α] = t2[α] ⇒ t1[β] = t2[β]
A B C
a1 b1 c1
a1 b1 c2
a2 b1 c1
a2 b1 c3
A→B
C→B
CA → B
A key attribute is an attribute whose value can identify (determine) the values of the other attributes.
Relation R
A B C
1 d c
2 a b
3 r p
2 a b
A → BC
if t₁ (A) = t₂ (A)
then
t1(BC) = t2 (BC)
That means if we have two same values for A then it must also have the same value for BC for the
corresponding tuples.
• A → B {A determines B}
• X→Y
(i) Trivial Functional Dependencies: A dependency is trivial if its dependent (RHS) is a subset of
its determinant (LHS).
FD X → Y is a trivial FD if Y ⊆ X
Example: If B is a subset of A
Then A → B is trivial.
The following dependencies are also trivial: A → A & B → B.
Example: A → A
AB → A
AB → B
AB → AB
(ii) Non-trivial Functional Dependencies: FD X → Y is non-trivial if Y ⊈ X
Example:
• AB → BC
• AB → CD
a. Semi Non-trivial: FD X → Y is semi non-trivial if Y ⊈ X and X ∩ Y ≠ ∅
XY → YZ, AB → BC
b. Fully Non-trivial: FD X → Y is fully non-trivial if Y ⊈ X and X ∩ Y = ∅
A → B, A → BC, AB → CD
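The three definitions can be checked mechanically with set operations. The following is a minimal sketch; the function name `classify_fd` and the convention of writing each attribute as a single letter in a string are assumptions made only for this illustration.

```python
def classify_fd(lhs, rhs):
    """Classify the FD lhs -> rhs; attributes are single letters (assumed convention)."""
    x, y = set(lhs), set(rhs)
    if y <= x:                 # Y is a subset of X
        return "trivial"
    if x & y:                  # Y is not a subset of X, but X and Y share attributes
        return "semi non-trivial"
    return "fully non-trivial"  # X and Y share no attribute

for lhs, rhs in [("AB", "A"), ("AB", "BC"), ("AB", "CD")]:
    print(f"{lhs} -> {rhs}: {classify_fd(lhs, rhs)}")
```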
Rule-01:
A functional dependency X → Y will always hold if all the values of X are unique (different)
irrespective of the values of Y.
Example-
A B C D E
5 4 3 2 2
8 5 3 2 1
1 9 3 3 5
4 7 3 3 8
The following functional dependencies, among others, will always hold since all the values of
attribute ‘A’ are unique-
• A → B
• A → C
• A → D
• A → E
• A → BCDE
Rule-02:
A functional dependency X → Y will always hold if all the values of Y are the same
irrespective of the values of X.
Example-
A B C D E
5 4 3 2 2
8 5 3 2 1
1 9 3 3 5
4 7 3 3 8
The following functional dependencies will always hold since all the values of attribute ‘C’
are the same-
• A→C
• AB → C
• ABDE → C
• DE → C
• AE → C
Rule-03:
For a functional dependency X → Y to hold, if two tuples in the table agree on the value of
attribute X, then they must also agree on the value of attribute Y.
Rule-04:
For a functional dependency X → Y, the violation will occur only when for two or more same
values of X, the corresponding Y values are different.
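Rule-03 and Rule-04 translate directly into a check on a table instance. The sketch below is only an illustration: the helper name `fd_holds` and the list-of-dicts representation are assumptions, and the sample rows are the A, B, C table from Example 3 of these notes. Note that an instance can only show that an FD is violated; it cannot prove that the FD holds for every valid instance of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this instance (rows: list of dicts)."""
    seen = {}  # maps each X value to the first Y value seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same X value, different Y values: violation
    return True

# Instance taken from Example 3 (columns A, B, C).
rows = [dict(zip("ABC", t)) for t in
        [(1, 1, 4), (1, 2, 4), (2, 1, 3), (2, 2, 3), (2, 4, 3)]]

for lhs, rhs in [("A", "B"), ("B", "C"), ("C", "A"), ("A", "C")]:
    print(f"{lhs} -> {rhs}:", fd_holds(rows, lhs, rhs))
```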
Example 1:
Eid Ename
1 a
2 b
3 b
• Eid → Ename (√)
• Ename → Eid (X)
Example 2:
A B
1 1
1 2
2 2
• A → B (X)
• B → A (X)
Example 3:
A B C
1 1 4
1 2 4
2 1 3
2 2 3
2 4 3
• A → B (X)
• B → C (X)
• B → A (X)
• C → B (X)
• C → A (√)
• A → C (√)
Example 5:
A B C
1 2 3
4 2 3
5 3 3
• A → B (√)
• BC → A (X)
• B → C (√)
Example 6:
X Y Z
1 4 3
1 5 3
4 6 3
3 2 2
• XZ → X (√) {Trivial FD}
• XY → Z (√)
• Z → Y (X)
• Y → Z (√)
• XZ → Y (X)
Example 8: (GATE 2002): From the following instance of a relation schema R(A, B, C), we can
conclude that:
A B C
1 1 1
1 1 0
2 3 2
2 3 2
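For A → B: if two tuples t1 and t2 agree on A (A has the same value), then they must also agree
on B (B must have the same value). If they do not agree on A, their B values may or may not be the same.
Example:
A B
1 a
1 a
1 a
3 b
The first three tuples have the same A value, so their B values must be the same. The last tuple has a
different A value, so its B value may or may not be the same.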
• We mainly use the closure set of attributes method to solve the questions.
Armstrong’s axioms: Armstrong’s axioms are a set of inference rules used to infer all the
functional dependencies on a relational database. They were developed by William W. Armstrong
in 1974. There are 8 inference rules in total (the three primary axioms, reflexivity, augmentation and
transitivity, plus rules derived from them), from which all possible functional dependencies may be derived.
Inference Rules:
• The set of all those attributes which can be functionally determined from an attribute set
is called as a closure of that attribute set.
• Closure of attribute set {X} is denoted as {X}+.
The following steps are followed to find the closure of an attribute set-
Step-01:
Add the attributes contained in the attribute set for which closure is being calculated to the result set.
Step-02:
Recursively add the attributes to the result set which can be functionally determined from the
attributes already contained in the result set.
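The two steps above can be sketched as a small fixed-point loop. This is a minimal illustration, not a definitive implementation: the function name `closure`, the single-letter attribute convention, and the FD list (read off the "Using ..." annotations in the example that follows) are assumptions.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) strings."""
    result = set(attrs)              # Step-01: start with the given attributes
    changed = True
    while changed:                   # Step-02: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FD set assumed from the worked example: A -> BC, BC -> DE, D -> F, CF -> G
fds = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]
print(sorted(closure("A", fds)))
print(sorted(closure("D", fds)))
print(sorted(closure("BC", fds)))
```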
Example-
Closure of attribute A-
A+ = { A }
={A,B,C} ( Using A → BC )
={A,B,C,D,E} ( Using BC → DE )
={A,B,C,D,E,F} ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G }
Closure of attribute D-
D+ = { D }
= { D , F } ( Using D → F )
We can not determine any other attribute using attributes D and F contained in the result set.
Thus,
D+ = { D , F }
Closure of attribute set {B, C}-
{ B , C }+ = { B , C }
={B,C,D,E} ( Using BC → DE )
={B,C,D,E,F} ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
Solution:
(CF)+ = {C, F, G, E, A, D} = {A, C, D, E, F, G}
(BG)+ = {B, G, A, C, D} = {A, B, C, D, G}
(AF)+ = {A, F, E, D}
(AB)+ = {A, B, C, D, G}
Determining Candidate Keys: Using the closure property of attributes, we can find the candidate
keys.
• A candidate key is a minimal combination of one or more attributes by which we can determine all the
other attributes.
• Find the attribute closure of attributes and combinations of attributes. If an attribute closure
contains all the attributes of the relation, then that attribute or set of attributes is a super key;
it is a candidate key if no proper subset of it has the same property.
• If we have n attributes in a relation R (A1, A2, A3, …….. An), the
maximum number of attribute sets that may have to be checked as keys = all possible subsets of the
attributes except ∅.
Max. no. of possible keys = 2^n − 1
• Example: R (A, B, C), max possible keys = 2^3 − 1 = 7
Possible keys = {A, B, C, AB, BC, AC, ABC}
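The search over attribute subsets can be sketched as follows. This is a brute-force illustration under stated assumptions: the names `closure` and `candidate_keys` are hypothetical, attributes are single letters, and subsets are tried from smallest to largest so that any superset of an already found key is skipped (which keeps only minimal keys). The two FD sets are the ones from Example 1 and the GATE 13 example in these notes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) strings."""
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Return all candidate keys of relation attrs as sorted strings."""
    attrs = sorted(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):       # smallest subsets first
        for combo in combinations(attrs, size):
            subset = set(combo)
            if any(key <= subset for key in keys):
                continue                        # contains a key: not minimal
            if closure(subset, fds) == set(attrs):
                keys.append(subset)
    return ["".join(sorted(key)) for key in keys]

print(candidate_keys("ABCD", [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]))
print(candidate_keys("ABCDEFGH",
                     [("CH", "G"), ("A", "BC"), ("B", "CFH"), ("E", "A"), ("F", "EG")]))
```

The search is exponential in the number of attributes, so it is only practical for small textbook relations.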
Example:
A→B
C→D
D→E
Here, the attributes which are not present on RHS of any functional dependency are A, C
and F.
Example 1: Consider a Relation R (A, B, C, D). The following functional dependencies are given:
FDs: { A → B, B → C, C → D, D → A}. Find the following:
(i) Max possible keys for relation R
(ii) All the CKs for relation R
Solution:
(i)
S.No. Attribute sets Count
1 (ABCD) 1
2 (ABC, ACD, ABD, BCD) 4
3 (AB, BC, CD, DA, AC, BD) 6
4 (A, B, C, D) 4
Total possible keys = 15
(ii) A+ = B+ = C+ = D+ = {A, B, C, D}, so the CKs are A, B, C and D.
Example 1 (GATE 96): Consider a Relation R (A, B, C, D, E, F). The following functional dependencies are
given: FDs: {C → F, E → A, EC → D, A → B}, Find key for the given relation?
a) CD
b) EC
c) AE
d) AC
Solution: C → F, E → A, EC → D, A → B
(CD)+ = {C, D, F}
(EC)+ = {E, C, A, B, D, F}
(AE)+ = {A, E, B}
(AC)+ = {A, C, F, B}
CK = EC
Example 2 (GATE 14): Consider the Relation schema R (E, F, G, H, I, J, K, L, M, N) and the set of
functional dependencies FDs: {{E, F} → {G}, {F} → {I, J}, {E, H} → {K, L} {K} → {M}, {L} → {N}} on R. What is
the key for R?
a) {E, F}
b) {E, F, H}
c) {E, F, H, K, L}
d) {E}
Solution: {{E, F} → {G}, {F} → {I, J}, {E, H} → {K, L} {K} → {M}, {L} → {N}}
(E, F)+ = {E, F, G, I, J}
(E, F, H)+ = {E, F, H, G, I, J, K, L, M, N}
(E, F, H, K, L)+ = {E, F, H, K, L, G, I, J, M, N}
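(E)+ = {E}
Key = {E, F, H} (option b). {E, F, H, K, L} also determines every attribute, but it is not minimal.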
Note: An attribute which is not present in the RHS of any FD will always be
a part of every candidate key.
Solution: R = (A, B, C, D, E, H)
FDs: {A → B, BC →D, E→ C, D → A}
Since E and H are not present in the RHS of any FD, so E and H must be the part of CKs.
(EH)+ = {E, H, C}
(AEH)+ = {A, E, H, B, C, D} → CK
(BEH)+ = {B, E, H, C, D, A} → CK
(CEH)+ = {C, E, H} → Not a CK
(DEH)+ = {D, E, H, A, B, C} → CK
Example (GATE 13): Consider the Relation schema R has eight attributes (A, B, C, D, E, F, G, H). Fields of
R contains only atomic values. The set of functional dependencies F = {CH →G, A → BC, B →CFH, E → A,
F →EG} on R. How many candidate keys does the relation R have?
a) 3
b) 4
c) 5
d) 6
Solution: R = (A, B, C, D, E, F, G, H)
FDs: F = {CH →G, A → BC, B →CFH, E → A, F →EG}
Since D is not present in the RHS of any FD, so D must be the part of CKs.
(AD)+ = {A, D, B, C, F, H, G, E} → CK
(BD)+ = {B, D, C, F, H, E, G, A} → CK
(CD)+ = {C, D} → Not a CK
(ED)+ = {E, D, A, B, C, F, H, G} → CK
(FD)+ = {F, D, E, G, A, B, C, H} → CK
(GD)+ = {G, D} → Not a CK
(HD)+ = {H, D} → Not a CK
So, the relation R has 4 candidate keys: AD, BD, ED and FD (option b).
Example 1:
Consider a relation R (A, B, C, D, E, F) with the set of functional dependencies
F = {AB →C, B → D, AD → F, C → D, D → E, E → F, E → D} on R
What are the candidate keys of sub-relation R1 (D, E, F) ?
R1 (D, E, F) → {D → E, E → F, E → D}
D+ = {D, E, F} → CK for R1
E+ = {E, F, D} → CK for R1
F+ = {F} → Not a CK for R1
Example 2:
Consider a relation R (A, B, C, D, E) with the set of functional dependencies
F = {A → BC, CD → E, B → D, E → A} on R
What are the candidate keys of sub-relation R1 (A, B, C, E) ?
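The closures are computed with the original set F and then restricted to the attributes of R1:
A+ = {A, B, C, D, E} → CK for R1
B+ = {B, D} → Not a CK for R1
C+ = {C} → Not a CK for R1
E+ = {E, A, B, C, D} → CK for R1
BC+ = {B, C, D, E, A} → CK for R1
So, the FDs that hold on R1 (A, B, C, E) include A → BCE, E → ABC and BC → AE, and the candidate
keys of R1 are A, E and BC.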
R (A, B)
FD: X → Y, where X and Y can each be A, B or AB. The possible FDs are:
A → A, A → B, A → AB
B → A, B → B, B → AB
AB → A, AB → B, AB → AB
• Trivial FDs are always true for any given relation. So, we do not consider the trivial FDs.
A → AB, B → AB
A → AB
FDs: {A → B, A → AB}
Example 4 (GATE 2005): In a schema R with attributes A, B, C, D, and E, the following set of
functional dependencies FDs are given {A → B, A → C, CD → E, B → D, E → A}. Which of the following
functional dependencies is not implied by the above set?
a) CD → AC
b) BD → CD
c) BC → CD
d) AC → BC
Solution: To check whether an FD is implied by the given set, take the closure of its LHS; if the RHS
is contained in the closure, then the FD is implied.
CD+ = {C, D, E, A, B} → contains A and C, so CD → AC is implied.
BD+ = {B, D} → does not contain C, so BD → CD is not implied.
BC+ = {B, C, D, E, A} → contains C and D, so BC → CD is implied.
AC+ = {A, C, B, D, E} → contains B and C, so AC → BC is implied.
So, the answer is option (b).
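The closure test used in this solution can be sketched in a few lines. The helper names `closure` and `implies` and the single-letter attribute convention are assumptions for illustration; the FD set is the one given in the GATE 2005 question above.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) strings."""
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds (RHS lies inside the closure of LHS)."""
    return set(rhs) <= closure(lhs, fds)

fds = [("A", "B"), ("A", "C"), ("CD", "E"), ("B", "D"), ("E", "A")]
for lhs, rhs in [("CD", "AC"), ("BD", "CD"), ("BC", "CD"), ("AC", "BC")]:
    print(f"{lhs} -> {rhs}:", implies(fds, lhs, rhs))
```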
In DBMS,
• Two different sets of functional dependencies for a given relation may or may not be
equivalent.
• If F and G are the two sets of functional dependencies, then the following 3 cases are
possible-
Case-01: F covers G (F ⊇ G)
Case-02: G covers F (G ⊇ F)
Case-03: Both F and G cover each other (F = G)
Solution:
In F:
A+ = {A, C, D} , A → CD holds true in F
E+ = {E, A, D, H, C} , E → AH holds true in F
So, G ⊆ F (F covers G)
In G:
A+ = {A, C, D} , A → C holds true in G
AC+ = {A, C, D} , AC → D holds true in G
E+ = {E, A, H, C, D} , E → AD and E → H holds true in G.
So, F ⊆ G (G covers F)
Then F = G (F and G are equivalent).
Solution:
In F:
A+ = {A, B, C, D}, A → BC covered in F
C+ = {C, D}, C → D covered in F
So, G ⊆ F (F covers G)
In G:
A+ = {A, B, C, D}, A → B covered in G
B+ = {B}, B → C not covered in G
So, F ⊈ G (G does not cover F)
Then F ≠ G (F and G are not equivalent).
Solution:
In F:
A+ = {A, B, C}, A → BC covered in F
D+ = {D, A, B}, D → AB covered in F
So, G ⊆ F (F covers G)
In G:
A+ = {A, B, C}, A → B covered in G
AB+ = {A, B, C}, AB → C covered in G
D+ = {D, A, B, C}, D → AC covered in G, but D → E not covered in G
So, F ⊈ G (G does not cover F)
Then F ≠ G (F and G are not equivalent).
Solution:
In F:
A+ = {A, B, C} , A → BC covered in F
B+ = {B, C, A} , B → A covered in F
C+ = {C, A} , C → A covered in F
So, G ⊆ F (F covers G)
In G:
A+ = {A, B, C} , A → B covered in G
Practice Problem-
Set F-
A → C, AC → D, E → AD, E → H
Set G-
A → CD, E → AH
Which of the following holds true?
(A) G ⊇ F
(B) F ⊇ G
(C) F = G
(D) All of the above
Solution-
• Functional dependencies of set F can determine all the attributes that have
been determined by the functional dependencies of set G.
• Thus, we conclude F covers G i.e. F ⊇ G.
• Functional dependencies of set G can determine all the attributes which have
been determined by the functional dependencies of set F.
• Thus, we conclude G covers F i.e. G ⊇ F.
• Since F covers G and G covers F, F = G. Thus, Option (D) is correct.
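The covering test can be sketched with attribute closures: F covers G when every FD of G follows from F. The names `closure`, `covers` and `equivalent` are hypothetical helpers, and the two sets are the ones from the practice problem above.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) strings."""
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """True if f covers g and g covers f."""
    return covers(f, g) and covers(g, f)

F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]
print(covers(F, G), covers(G, F), equivalent(F, G))
```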
In DBMS,
If F is any set of FDs and we can minimize it to a set G such that F covers G and G covers F (F and G are
equivalent), then G is the minimal cover (canonical cover) of F.
Characteristics-
Need:
• Working with the set containing extraneous functional dependencies increases the
computation time.
(1) Split the FDs such that the RHS part contains single attributes only.
Write the given set of functional dependencies in such a way that each functional dependency
contains exactly one attribute on its right side.
• For Example: A → BC can be split into A → B and A → C
• The functional dependency X → YZ will be written as- X → Y and X → Z
(2) Find the redundant FDs and delete them from the set.
• Consider each functional dependency one by one from the set obtained in
Step-01.
• Determine whether it is essential or non-essential.
• To determine whether a functional dependency is essential or not, compute
the closure of its left side twice-
• Once by considering that the particular functional dependency is present in
the set.
• Once by considering that the particular functional dependency is not present
in the set.
• If the two closures are the same, the presence or absence of that functional dependency does not
create any difference. Thus, it is non-essential and can be eliminated.
• If the two closures are different, the functional dependency is essential and cannot be eliminated.
NOTE-
• Eliminate the non-essential functional dependency from the set as soon as it is
discovered.
• Do not consider it while checking the essentiality of other functional
dependencies.
(3) Check whether any functional dependency contains more than one attribute on its left side.
Case-01: No-
• There exists no functional dependency containing more than one attribute on its
left side.
• In this case, the set obtained in Step-02 is the canonical cover.
Case-02: Yes-
• There exists at least one functional dependency containing more than one
attribute on its left side.
• In this case, consider all such functional dependencies one by one.
• Check if their left side can be reduced.
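The three steps can be sketched as below, following the order used in these notes (split the RHS, drop non-essential FDs, then reduce left sides). This is an illustrative sketch only: the names `closure` and `canonical_cover` are assumptions, attributes are single letters, and the result can depend on the order in which FDs are examined. Some textbooks reduce left sides before removing redundant FDs; that variant is not shown here.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def canonical_cover(fds):
    # Step 1: one attribute on every right side (duplicates dropped, order kept).
    work = list(dict.fromkeys((frozenset(l), r) for l, rhs in fds for r in rhs))
    # Step 2: remove an FD as soon as it is found to be non-essential.
    i = 0
    while i < len(work):
        lhs, r = work[i]
        rest = work[:i] + work[i + 1:]
        if r in closure(lhs, rest):
            work = rest              # still derivable without it
        else:
            i += 1
    # Step 3: drop left-side attributes that are not needed.
    for i, (lhs, r) in enumerate(work):
        for a in sorted(lhs):
            reduced = lhs - {a}
            if reduced and r in closure(reduced, work):
                lhs = reduced
                work[i] = (lhs, r)
    return [("".join(sorted(l)), r) for l, r in work]

# X -> W, WZ -> XY, Y -> WXZ on R(W, X, Y, Z)
print(canonical_cover([("X", "W"), ("WZ", "XY"), ("Y", "WXZ")]))
```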
Solution:
(1) A → C, AC → D, E → A, E → D, E → H
(2) A+ = {A}, A → C Not redundant in F
AC+ = {A, C}, AC → D Not redundant in F
E+ = {E, D, H}, E → A Not redundant in F
E+ = {E, A, C, D, H}, E → D Redundant in F
E+ = {E, A, C, D}, E → H Not redundant in F
F = {A → C, AC → D, E → A, E → H}
(3) For AC → D
C+ = {C}, A+ = {A, C}
Since A alone determines C, AC → D can be minimized to A → D.
Minimal cover = {A → C, A → D, E → A, E → H}
Solution:
(1) A → B, C → B, D → A, D → B, D → C, AC → D
(2) A+ = {A}, A → B Not redundant in F
C+ = {C}, C → B Not redundant in F
D+ = {D, B, C}, D → A Not redundant in F
D+ = {D, A, B, C}, D → B Redundant in F
D+ = {D, A, B}, D → C Not redundant in F
AC+ = {A, C, B}, AC → D Not Redundant in F
F = {A → B, C → B, D → A, D → C, AC → D}
(3) For AC → D
A+ = {A, B}, C+ = {C, B}
So, AC → D cannot be minimized
Solution:
(1) AB → C, D → E, AB → E, E → C
(2) AB+ = {A, B, E, C}, AB → C is redundant
D+ = {D}, D → E is Not redundant
AB+ = {A, B}, AB → E is Not redundant
E+ = {E}, E → C is Not redundant
F = {D → E, AB → E, E → C}
(3) For AB → E
A+ = {A}, B+ = {B}
So, AB → E cannot be minimized
Practice Problem 1:
The following functional dependencies hold true for the relational scheme R ( W, X, Y, Z):
X → W, WZ → XY, Y → WXZ
Solution-
Step-01:
Write all the functional dependencies such that each contains exactly one attribute on its right
side-
X → W
WZ → X
WZ → Y
Y → W
Y → X
Y → Z
Step-02:
Check the essentiality of each functional dependency one by one.
For X → W:
• Considering X → W, (X)+ = { X , W }
• Ignoring X → W, (X)+ = { X }
Now
• Clearly, the two results are different.
• Thus, we conclude that X → W is essential and can not be eliminated.
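For WZ → X:
• Considering WZ → X, (WZ)+ = { W , X , Y , Z }
• Ignoring WZ → X, (WZ)+ = { W , X , Y , Z }
Now,
• Clearly, the two results are the same.
• Thus, we conclude that WZ → X is non-essential and can be eliminated.
Eliminating WZ → X, our set of functional dependencies reduces to-
X → W
WZ → Y
Y → W
Y → X
Y → Z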
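For WZ → Y:
• Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
• Ignoring WZ → Y, (WZ)+ = { W , Z }
Now,
• Clearly, the two results are different.
• Thus, we conclude that WZ → Y is essential and can not be eliminated.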
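For Y → W:
• Considering Y → W, (Y)+ = { W , X , Y , Z }
• Ignoring Y → W, (Y)+ = { W , X , Y , Z }
Now,
• Clearly, the two results are the same.
• Thus, we conclude that Y → W is non-essential and can be eliminated.
Eliminating Y → W, our set of functional dependencies reduces to-
X → W
WZ → Y
Y → X
Y → Z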
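For Y → X:
• Considering Y → X, (Y)+ = { W , X , Y , Z }
• Ignoring Y → X, (Y)+ = { Y , Z }
Now,
• Clearly, the two results are different.
• Thus, we conclude that Y → X is essential and can not be eliminated.
For Y → Z:
• Considering Y → Z, (Y)+ = { W , X , Y , Z }
• Ignoring Y → Z, (Y)+ = { W , X , Y }
Now,
• Clearly, the two results are different.
• Thus, we conclude that Y → Z is essential and can not be eliminated.
From here, our essential functional dependencies are-
X → W
WZ → Y
Y → X
Y → Z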
Step-03:
• Consider the functional dependencies having more than one attribute on their left
side.
• Check if their left side can be reduced.
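In our set,
• Only WZ → Y contains more than one attribute on its left side.
• Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
Now,
(W)+ = { W }
(Z)+ = { Z }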
Clearly,
• None of the subsets have the same closure result same as that of the entire left side.
• Thus, we conclude that we can not write WZ → Y as W → Y or Z → Y.
• Thus, set of functional dependencies obtained in step-02 is the canonical cover.
Finally, the canonical cover is-
X → W, WZ → Y, Y → X, Y → Z
The process of breaking up or dividing a single relation into two or more sub-relations is called as
decomposition of a relation.
Properties of Decomposition-
The following two properties must be followed when decomposing a given relation-
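1. Lossless decomposition:
Lossless decomposition ensures-
• No information is lost from the original relation during decomposition.
• When the sub relations are joined back, the same relation is obtained that was decomposed.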
2. Dependency Preservation:
Dependency preservation ensures-
• None of the functional dependencies that hold on the original relation are lost.
• The sub-relations still hold or satisfy the functional dependencies of the original
relation.
Types of Decomposition:
• For combining the two tables there must be one common attribute between them. Any
decomposition must be lossless.
A B C
a1 b1 c1
a2 b1 c1
a1 b2 c2
The relation is decomposed into two tables with no common attribute:
A
a1
a2
B C
b1 c1
b2 c2
When merged again using cross product (No common attributes to perform natural join
operation):
A B C
a1 b1 c1
a1 b2 c2
a2 b1 c1
a2 b2 c2 (extra tuple)
Extra tuples are generated so there is a loss of information here. So, the given decomposition
is lossy decomposition.
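Similarly, if the same relation is decomposed into R1 (A, B) and R2 (A, C) and joined on the common
attribute A, we get:
A B C
a1 b1 c1
a1 b1 c2 (extra tuple)
a2 b1 c1
a1 b2 c1 (extra tuple)
a1 b2 c2
Extra tuples are generated so there is loss of information here. So, the given decomposition is
lossy decomposition.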
Note: Lossless decomposition is possible whenever we split the table into two parts and the common
attribute is a candidate key (or key) in any one of the sub relations. Then we can say that the
decomposition is lossless.
• Consider there is a relation R which is decomposed into sub relations R1, R2, …., Rn.
• This decomposition is called lossless join decomposition when the join of the sub
relations results in the same relation R that was decomposed.
• For lossless join decomposition, we always have-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R
A B C
1 2 1
2 5 3
3 3 3
R (A, B, C)
Consider this relation is decomposed into two sub relations R1(A, B) and R2(B, C)-
A B
1 2
2 5
3 3
R1 (A, B)
B C
2 1
5 3
3 3
R2 (B, C)
R1 ⋈ R2 = R
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 , we get-
A B C
1 2 1
2 5 3
3 3 3
NOTE-
For lossy join decomposition, we always have-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R
Example-
Consider the following relation R(A, B, C)-
A B C
1 2 1
2 5 3
3 3 3
R(A, B, C)
Consider this relation is decomposed into two sub relations as R1(A, C) and R2(B, C)-
A C
1 1
2 3
3 3
R1(A, C)
B C
2 1
5 3
3 3
R2(B, C)
R1 ⋈ R2 ⊃ R
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 we get-
A B C
1 2 1
2 5 3
2 3 3
3 5 3
3 3 3
This relation is not the same as the original relation R and contains some extraneous tuples.
Clearly, R1 ⋈ R2 ⊃ R.
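The two decompositions of R(A, B, C) discussed above can be reproduced with a small projection and natural join sketch. The helper names `project` and `natural_join` and the list-of-dicts representation are assumptions made for this illustration only.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    out = []
    for row in rows:
        p = {a: row[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out

def natural_join(r1, r2):
    """Join tuples that agree on every attribute the two relations share."""
    out = []
    for t1 in r1:
        for t2 in r2:
            if all(t1[a] == t2[a] for a in t1.keys() & t2.keys()):
                merged = {**t1, **t2}
                if merged not in out:
                    out.append(merged)
    return out

R = [dict(zip("ABC", t)) for t in [(1, 2, 1), (2, 5, 3), (3, 3, 3)]]

joined_ab_bc = natural_join(project(R, "AB"), project(R, "BC"))  # lossless
joined_ac_bc = natural_join(project(R, "AC"), project(R, "BC"))  # lossy

print(len(R), len(joined_ab_bc), len(joined_ac_bc))
```

The second join returns more tuples than R, and the additional tuples are the extraneous ones listed in the table above.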
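NOTE-
Let R (A1, A2, A3, ……, An) be a relation with FD set F, whose closure is F+. When R is decomposed into
R1 and R2, the FDs that hold on them are F1 ⊆ F+ and F2 ⊆ F+. The decomposition is dependency
preserving if (F1 ∪ F2)+ = F+.
For example, R (A, B, C, D) with F → F+ may be decomposed into R1 (A, B) with F1 ⊆ F+ and
R2 (C, D) with F2 ⊆ F+.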
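Solution:
F = {A → B, B → C, C → A}
F+ = {A → B, B → C, C → A, A → C, B → A, C → B}
A+ = {A, B, C}, B+ = {B, C, A}, C+ = {C, A, B}
R1 (A, B): F1 = {A → B, B → A}
R2 (B, C): F2 = {B → C, C → B}
A → B and B → C are directly present in F1 and F2. From C → B (in F2) and B → A (in F1), we get C → A.
So, C → A is preserved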
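Dependency preservation can be tested without listing F1 and F2 explicitly. The sketch below grows the left side of an FD using only what each sub relation can derive locally, and then checks whether the right side is reached. The names `closure` and `is_preserved` are hypothetical, and the second FD set (AB → C, C → B split into (A, C) and (B, C)) is an assumed example added for contrast; it is not taken from these notes.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) strings."""
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_preserved(lhs, rhs, fds, parts):
    """True if lhs -> rhs can be derived from the FDs projected onto parts."""
    result, changed = set(lhs), True
    while changed:
        changed = False
        for part in parts:
            part = set(part)
            # attributes derivable inside this sub relation only
            gained = closure(result & part, fds) & part
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result

F = [("A", "B"), ("B", "C"), ("C", "A")]
print([is_preserved(l, r, F, ["AB", "BC"]) for l, r in F])

G = [("AB", "C"), ("C", "B")]          # assumed example for contrast
print([is_preserved(l, r, G, ["AC", "BC"]) for l, r in G])
```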
Solution:
F = {A → B, B → C, C → D, D → A}
D → A is preserved
(F1 ∪ F2 ∪ F3)+ = F+
So, all the FDs are preserved. Relations R1 and R2 have one Common attribute B, and B is a
candidate key in R1 relation. Similarly, relations R2 and R3 have one Common attribute C, and C is a
candidate key in R2 relation. So, the decomposition is lossless and dependency preserving.
Then,
• If all the following conditions satisfy, then the decomposition is lossless.
• If any of these conditions fail, then the decomposition is lossy.
Condition-01:
The Union of both the sub relations must contain all the attributes that are present in the original
relation R.
Thus,
R1 ∪ R2 = R
Condition-02:
The Intersection of both the sub relations must not be null.
In other words, there must be some common attribute which is present in both the sub relations.
Thus,
R1 ∩ R2 ≠ ∅
Condition-03:
Intersection of both the sub relations must be a super key of either R1 or R2 or both.
Thus,
R1 ∩ R2 = Super key of R1 or R2
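The three conditions can be sketched as one function for a decomposition into two sub relations. The names `closure` and `is_lossless` are assumptions for illustration, attributes are single letters, and only functional dependencies are considered. The two calls use Problem-01 and the first split of Problem-02 from these notes.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) strings."""
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r, r1, r2, fds):
    """Check the three conditions for splitting r into r1 and r2."""
    r, r1, r2 = set(r), set(r1), set(r2)
    if r1 | r2 != r:          # Condition-01: union must give back R
        return False
    common = r1 & r2
    if not common:            # Condition-02: intersection must not be empty
        return False
    reach = closure(common, fds)
    return r1 <= reach or r2 <= reach   # Condition-03: super key of R1 or R2

print(is_lossless("ABCD", "AB", "CD", [("A", "B"), ("C", "D")]))
print(is_lossless("ABCD", "ABC", "BD",
                  [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]))
```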
Problem-01:
Consider a relation schema R ( A , B , C , D ) with the functional dependencies A → B and C → D.
Determine whether the decomposition of R into R1 ( A , B ) and R2 ( C , D ) is lossless or lossy.
Solution-
Condition-01:
According to condition-01, union of both the sub relations must contain all the attributes of relation R.
So, we have-
R1 ( A , B ) ∪ R2 ( C , D ) = R ( A , B , C , D )
Clearly, union of the sub relations contain all the attributes of relation R.
Condition-02:
According to condition-02, intersection of both the sub relations must not be null.
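R1 ( A , B ) ∩ R2 ( C , D ) = Φ
Clearly, intersection of the sub relations is null.
So, condition-02 fails.
Thus, we conclude that the decomposition is lossy.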
Problem-02:
Consider a relation schema R ( A , B , C , D ) with the following functional dependencies-
A → B, B → C, C → D, D → B
Solution-
Consider the original relation R was decomposed into the given sub relations as shown-
Condition-01:
According to condition-01, union of both the sub relations must contain all the attributes of relation R.
So, we have-
R‘ ( A , B , C ) ∪ R3 ( B , D ) = R ( A , B , C , D )
Clearly, union of the sub relations contain all the attributes of relation R.
Condition-02:
According to condition-02, intersection of both the sub relations must not be null.
So, we have-
R‘ ( A , B , C ) ∩ R3 ( B , D ) = B
Condition-03:
According to condition-03, intersection of both the sub relations must be the super key of one of the
two sub relations or both.
So, we have-
R‘ ( A , B , C ) ∩ R3 ( B , D ) = B
B+ = { B , C , D }
Now, we see-
• Attribute ‘B’ can not determine attribute ‘A’ of sub relation R’.
• Thus, it is not a super key of the sub relation R’.
• Attribute ‘B’ can determine all the attributes of sub relation R3.
• Thus, it is a super key of the sub relation R3.
Clearly, intersection of the sub relations is a super key of one of the sub relations.
Condition-01:
According to condition-01, union of both the sub relations must contain all the attributes of relation R’.
So, we have-
R1 ( A , B ) ∪ R2 ( B , C ) = R’ ( A , B , C )
Clearly, union of the sub relations contain all the attributes of relation R’.
Condition-02:
According to condition-02, intersection of both the sub relations must not be null.
So, we have-
R1 ( A , B ) ∩ R2 ( B , C ) = B
Condition-03:
According to condition-03, intersection of both the sub relations must be the super key of one of the
two sub relations or both.
So, we have-
R1 ( A , B ) ∩ R2 ( B , C ) = B
B+ = { B , C , D }
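Now, we see-
• Attribute ‘B’ can determine all the attributes of sub relation R2 ( B , C ).
• Thus, it is a super key of the sub relation R2.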
Clearly, intersection of the sub relations is a super key of one of the sub relations.
Example 3: Consider a relation R (A, B, C, D) with set of FDs: {AB → CD, D → A} is decomposed in to R1
(A, D) and R2 (B, C, D).
Normal Forms: Normalization is the process of minimizing redundancies and anomalies in the
database. The process of decomposing a bigger relation into smaller sub relations is known as normalization.
To achieve normalization, we follow the various normal forms.
There exist several other normal forms even after BCNF but generally, we normalize till BCNF only.
A given relation is called in First Normal Form (1NF) if each cell of the table contains only an
atomic value.
OR
A given relation is called in the First Normal Form (1NF) if the attribute of every tuple is either
single-valued or a null value.
Example:
Student
S-No FName LName
However,
• This relation can be brought into 1NF.
• This can be done by rewriting the relation such that each cell of the table contains only
one value.
NOTE-
So, for 2NF: “All non-prime attributes must be fully functionally dependent on every candidate key”.
A given relation is called in Second Normal Form (2NF) if and only if-
1. Relation already exists in 1NF.
2. No partial dependency exists in the relation.
Partial Dependency
A partial dependency is a dependency where a proper subset of the attributes of a candidate key
determines non-prime attribute(s).
OR
A partial dependency is a dependency where a portion of the candidate key or
incomplete candidate key determines non-prime attribute(s).
In other words,
A → B is called a partial dependency if and only if-
1. A is a proper subset of some candidate key
2. B is a non-prime attribute.
If any one condition fails, then it will not be a partial dependency.
NOTE-
• To avoid partial dependency, incomplete candidate key must not determine any
non-prime attribute.
• However, incomplete candidate key can determine prime attributes.
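The definition of a partial dependency can be sketched as a search over the proper subsets of each candidate key. This is a minimal illustration: the names `closure` and `partial_dependencies` are hypothetical, attributes are single letters, and the candidate keys are passed in by the caller rather than computed. The sample call assumes a relation R (A, B, C) with FDs AB → C and B → C and candidate key AB.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) strings."""
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(keys, fds):
    """List (part_of_key, non_prime_attrs) pairs that violate 2NF."""
    prime = set().union(*map(set, keys))
    found = []
    for key in keys:
        for size in range(1, len(key)):          # proper subsets only
            for part in combinations(sorted(key), size):
                extra = closure(part, fds) - set(part) - prime
                item = ("".join(part), "".join(sorted(extra)))
                if extra and item not in found:
                    found.append(item)
    return found

print(partial_dependencies(["AB"], [("AB", "C"), ("B", "C")]))
```

An empty result means that no part of any candidate key determines a non-prime attribute, so the relation is in 2NF.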
Example-
Solution:
First of all, find the candidate keys:
A+ = {A}
B+ = {B, C}
C+ = {C}
AB+ = {A, B, C}
BC+ = {B, C}
AC+ = {A, C}
So, the CK = AB
Prime attributes = A, B
Non-prime attributes = C
So, according to 2 NF, non-prime attributes should be fully functionally dependent on the CKs.
AB → C (Satisfies 2NF)
B → C (Partial Dependency)
So not in 2NF.
Now to convert the given relation into 2NF we try to eliminate the partial dependency B → C from the
relation.
B+ = {B, C}
R (A, B, C)
R1 (A, B) R2 (B, C)
R2 holds B → C.
AB+ = {A, B, C}, so AB → C is also preserved. The decomposition is lossless and
dependency preserving.
AB+ = {A, B, C, D}
BC+ = {B, C, D}
BD+ = {B, D}
AC+ = {A, C}
AD+ = {A, D}
CD+ = {C, D}
BCD+ = {B, C, D}
ACD+ = {A, C, D}
So, the CK = AB
Prime attributes = A, B
Non-prime attributes = C, D
1 NF: Every relation is by default in 1 NF.
2 NF: According to 2 NF, non-prime attributes should be fully functionally dependent on the CKs.
B → D (Partial Dependency)
So not in 2NF.
Now to convert the given relation into 2NF we try to eliminate the partial dependency B → D from the
relation.
B+ = {B, D}
R (A, B, C, D)
R1 (A, B, C) R2 (B, D)
AB → C B→D
Lossless and dependency-preserving decomposition
Example 4: Consider a relation R (A, B, C, D, E) with FD set F: {A → B, B → E, C → D}. Find the highest
normal form for the given relation R.
A given relation is called in Third Normal Form (3NF) if and only if-
1. Relation already exists in 2NF.
2. No transitive dependency exists for non-prime attributes.
Transitive Dependency
NOTE-
Example-
From here,
Now,
• It is clear that there are no non-prime attributes in the relation.
• In other words, all the attributes of relation are prime attributes.
• Thus, all the attributes on RHS of each functional dependency are prime attributes.
Example 1: Consider a relation R (A, B, C) with FD set F: {A → B, B → C}. Find the highest normal form
for the given relation R.
Solution:
Find the candidate keys:
A+ = {A, B, C}
So, the CK = A
Prime attributes = A
Non-prime attributes = B, C
1 NF: Every relation is by default in 1 NF.
2 NF: According to 2 NF, non-prime attributes should be fully functionally dependent on the CKs.
No partial dependencies, So R is in 2NF.
3 NF: C (non-prime attribute) is transitively dependent on CK. ‘A’.
B → C is transitive dependency, not in 3 NF.
Now to convert the given relation into 3NF we try to eliminate the transitive dependency B → C from
the relation.
B+ = {B, C}
Create a new table with attributes B, C and remaining A, B will be in another table.
R (A, B, C)
R1 (A, B) R2 (B, C)
A→B B→C
Lossless and dependency preserving decomposition
Example 2: Consider a relation R (A, B, C, D, E) with FD set F: {AB → C, B → D, D → E}. Find the highest
normal form for the given relation R.
Solution:
Find the candidate keys:
So, the CK = AB
Prime attributes = A, B
Non-prime attributes = C, D, E
1 NF: Every relation is by default in 1 NF.
2 NF: According to 2 NF, non-prime attributes should be fully functionally dependent on the CKs.
B → D is partial dependencies, So R is not in 2NF.
Now to convert the given relation into 2NF we try to eliminate the partial dependency B → D from the
relation.
B+ = {B, D, E}
R (A, B, C, D, E)
R1 (A, B, C) R2 (B, D, E)
AB → C B → D, D → E
Now both the decomposed relations are in 2 NF
Lossless and dependency preserving decomposition
3 NF:
R1 (A, B, C) with AB → C is also in 3 NF.
R2 (B, D, E) with B → D, D → E: here, D → E is a transitive dependency. So, R2 is not in 3 NF.
Now to convert the given relation into 3NF we try to eliminate the transitive dependency
D → E from the relation.
D+ = {D, E}
Create a new table with attributes D, E and remaining B, D will be in another table.
R2 (B, D, E) is decomposed into (B, D) with B → D and (D, E) with D → E.
Example-
Now, we can observe that the LHS of each given functional dependency is a candidate key.
Thus, we conclude that the given relation is in BCNF.
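The BCNF condition (the left side of every non-trivial FD must be a super key) can be sketched as below. The names `closure` and `bcnf_violations` are hypothetical and attributes are single letters. The first call uses R (A, B, C, D) with FDs AB → CD and D → A from these notes; the second FD set is an assumed example of a relation that is in BCNF.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) strings."""
    result, changed = set(attrs), True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Return the given FDs whose left side is not a super key of attrs."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and closure(lhs, fds) != set(attrs):
            violations.append((lhs, rhs))
    return violations

print(bcnf_violations("ABCD", [("AB", "CD"), ("D", "A")]))
print(bcnf_violations("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))  # assumed example
```

This check is meant for the original relation with its given FD set; for a sub relation produced by decomposition, the FDs that hold on it would first have to be derived by projection.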
(C)+ = {C, B}
R (A, B, C)
R1 (A, C) R2 (C, B)
C→B
This decomposition is lossless but not dependency-preserving.
Notes:
• In a relational database, a relation is always in First Normal Form (1NF) at least.
• Singleton keys are those that consist of only a single attribute.
• If all the candidate keys of a relation are singleton candidate keys, then it will always be in
2NF at least.
Note:
(1) A decomposition into 3NF that is both lossless and dependency preserving is always guaranteed to exist.
(2) A decomposition into BCNF can always be made lossless but may or may not be
dependency preserving.
(B)+ = {B, D}
R (A, B, C, D, E)
R1 (A, B, C, E) R2 (B, D)
A→ BC, E → A B→D
The decomposition is in BCNF.
Decomposition is lossless but not Dependency Preserving.
CKs = AB & BD
Prime Attributes = A, B, D
Non-Prime Attributes = C
1 NF: Every relation is by default in 1 NF.
2 NF: No partial dependency for non-prime attributes. So, Relation is in 2 NF.
3 NF: No transitive dependency for non-prime attributes. So, relation is in 3 NF.
BCNF: Relation is in 3NF, but not in BCNF because of D → A
In D → A, D is not a super key
(D)+ = {D, A}
R (A, B, C, D)
R1 (B, C, D) R2 (A, D)
D→A
The decomposition is in BCNF, but not Dependency Preserving.
(C)+ = {C, D}
R (A, B, C, D)
R1 (C, D) R2 (A, B, C)
C→D A → B, B → A, B → C
The decomposition is in BCNF, Dependency Preserving and Lossless.
Conversion in to BCNF:
C+ = {C, D, E, A}, D+ = {D, E, A}, E+ = {E, A}
R1 (C, D, E, A) R2 (B, C)
C→D
D→E
E→A
Not in BCNF
D+ = {D, E, A}
R (A, B, C, D, E)