Unit-4 DBMS

Department of Computer Science and Engineering

Database Management System

UNIT-4

Functional Dependency and Normalization

Normalization:
Introduction to Normalization, concepts of anomalies and their types, closure set of dependencies and attributes,
Various Normal Forms: 1NF, 2NF, 3NF, BCNF, Functional Dependency, Decomposition, Dependency Preservation,
Lossless & Lossy Join, Definition of Dangling Tuple, and Multi-valued Dependencies.

Introduction to Normalization: Normalization is used to minimize the redundancy in a relation or
set of relations. It is also used to eliminate undesirable characteristics like insertion, update, and
deletion anomalies.

• Normalization divides the larger table into smaller ones and links them using
relationships.
• The normal form is used to reduce redundancy from the database table.

In a database, we use normalization to remove (or minimize) the following two problems.

1) Redundancy: Repetition of the same data in the relational database


2) Anomalies: Inconsistency in the relational database. Three types of anomalies are there:

(i) Insertion Anomalies: An insertion anomaly occurs when one cannot insert a new tuple
into a relation due to a lack of data. If a new row is inserted in the table and it
creates inconsistency in the table, then it is called the insertion anomaly.

(ii) Deletion Anomalies: The delete anomaly refers to the situation where the deletion of
data results in the unintended loss of some other important data. If we delete some
rows from the table and if any other information or data that is required is also deleted
from the database, this is called the deletion anomaly in the database.

(iii) Updation Anomalies: The update anomaly is when an update of a single data value
requires multiple rows of data to be updated. When we update some rows in the table,
and if it leads to the inconsistency of the table then this anomaly occurs. This type of
anomaly is known as an updation anomaly.

1 DBMS Unit 4: Functional Dependency and Normalization {Dr. Kuldeep N. Tripathi}


• The solution is to divide the table into tables that are as small as possible. However, this increases
the query processing time.
• The procedure of dividing the tables to reduce the redundancy and anomalies is called
normalization.

Example:

Stu_id     Stu_name   Stu_branch         Stu_club
2018nk01   Shivani    Computer science   literature
2018nk01   Shivani    Computer science   dancing
2018nk02   Ayush      Electronics        Videography
2018nk03   Mansi      Electrical         dancing
2018nk03   Mansi      Electrical         singing
2018nk04   Gopal      Mechanical         Photography

In the above table, we have listed students with their name, id, branch and their respective clubs.

a) Updation / Update Anomaly: In the above table, if Shivani changes her branch from Computer
Science to Electronics, then we will have to update all the rows. If we miss any row, then
Shivani will have more than one branch, which will create the update anomaly in the table.
b) Insertion Anomaly: If we add a new row for student Ankit who is not a part of any club, we
cannot insert the row into the table as we cannot insert null in the column of stu_club. This is
called insertion anomaly.
c) Deletion Anomaly: If we remove the photography club from the college, then we will have to
delete its row from the table. But this will also delete the record of Gopal and his details. So, this is
called deletion anomaly and it will make the database inconsistent.
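
As a minimal sketch of how dividing the table avoids these three anomalies, the rows above can be split into a student relation and a club-membership relation. The names `students` and `memberships`, the id 2018nk05, and Ankit's branch are assumptions made only for illustration.

```python
# Rows of the original table: (Stu_id, Stu_name, Stu_branch, Stu_club)
rows = [
    ("2018nk01", "Shivani", "Computer science", "literature"),
    ("2018nk01", "Shivani", "Computer science", "dancing"),
    ("2018nk02", "Ayush", "Electronics", "Videography"),
    ("2018nk03", "Mansi", "Electrical", "dancing"),
    ("2018nk03", "Mansi", "Electrical", "singing"),
    ("2018nk04", "Gopal", "Mechanical", "Photography"),
]

# Decompose so that each fact is stored in exactly one place.
students = {sid: (name, branch) for sid, name, branch, _ in rows}
memberships = {(sid, club) for sid, _, _, club in rows}

# Update: Shivani's branch is stored once, so one assignment is enough.
students["2018nk01"] = ("Shivani", "Electronics")

# Insertion: Ankit has no club (hypothetical id and branch).
students["2018nk05"] = ("Ankit", "Civil")

# Deletion: removing the photography club keeps Gopal's details.
memberships = {(sid, club) for sid, club in memberships if club != "Photography"}

print(students["2018nk04"])  # Gopal's record is still present
```

The price of this split is the one already mentioned: answering "which clubs is Shivani in, and what is her branch" now needs both relations to be combined.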



Functional Dependencies (FDs): -

• In any relation, a functional dependency α → β holds if any two tuples having the same value of
attribute α also have the same value for attribute β.

• A functional dependency (FD) is a relationship between two attributes, typically between the
primary key (PK) and other non-key attributes within a table. For any relation R, attribute Y is
functionally dependent on attribute X (usually the PK), if for every valid instance of X, that value
of X uniquely determines the value of Y.

X →Y

X is determinant, Y is dependent

• “X determine Y”
• “Y dependent on X”

Mathematically,

If α and β are the two sets of attributes in a relational table R where-

• α ⊆ R
• β ⊆ R

Then, for a functional dependency to exist from α to β,

If t1[α] = t2[α], then t1[β] = t2[β]

α        β
t1[α]    t1[β]
t2[α]    t2[β]
…….      …….

fd : α → β



R(A, B, C). Find FD?

A B C
a1 b1 c1
a1 b1 c2
a2 b1 c1
a2 b1 c3
A→B
C→B
CA → B
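
The tuple-pair definition can be checked mechanically on a relation instance. The sketch below is only an illustration (the function name `fd_holds` and the dictionary representation of rows are assumptions, not part of the notes); it tests whether a dependency is satisfied by the four tuples of R(A, B, C) shown above.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by the given rows.

    rows: list of dicts mapping attribute name to value.
    lhs, rhs: iterables of attribute names.
    """
    seen = {}  # value of lhs -> first value of rhs seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time; later rows must match it.
        if seen.setdefault(key, val) != val:
            return False
    return True


rows = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a1", "B": "b1", "C": "c2"},
    {"A": "a2", "B": "b1", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c3"},
]

for lhs, rhs in [("A", "B"), ("C", "B"), ("CA", "B"), ("A", "C")]:
    print(lhs, "->", rhs, fd_holds(rows, lhs, rhs))
```

Note that such a check only shows that an instance satisfies (or violates) a dependency; whether the dependency holds for the schema in general depends on the meaning of the data.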

A key attribute is an attribute whose value can be used to identify the values of the other attributes.

Relation R
A B C
1 d c
2 a b
3 r p
2 a b

A → BC

If t1[A] = t2[A], then t1[BC] = t2[BC]

That means if we have two same values for A then it must also have the same value for BC for the
corresponding tuples.

• A → B {A determines B}
• X→Y



Types Of Functional Dependencies-

There are two types of functional dependencies-

1. Trivial Functional Dependencies


2. Non-trivial Functional Dependencies

Types of FDs: {FD (X → Y)}

(i) Trivial Functional Dependencies: A dependency is trivial if its dependent (RHS) is a subset of
determinants (LHS).
FD X → Y is a trivial FD if Y ⊆ X

• A functional dependency X → Y is said to be trivial if and only if Y ⊆ X.


• Thus, if the RHS of a functional dependency is a subset of LHS, then it is called
as a trivial functional dependency.

Example: If B is a subset of A
Then A → B is trivial.
The following dependencies are also trivial: A → A & B → B.

Example: A → A
AB → A
AB → B
AB → AB

(ii) Non-Trivial Functional Dependencies:


• A functional dependency X → Y is said to be non-trivial if and only if Y ⊈ X.
• Thus, if there exists at least one attribute in the RHS of a functional dependency that
is not a part of LHS, then it is called as a non-trivial functional dependency.



Examples-

The examples of non-trivial functional dependencies are-

• AB → BC
• AB → CD

• FD X → Y is non-trivial if Y ⊈ X
a. Semi Non-trivial: FD X → Y is semi non-trivial if Y ⊈ X and X ∩ Y ≠ ∅
XY → YZ, AB → BC
b. Fully Non-trivial: FD X → Y is fully non-trivial if Y ⊈ X and X ∩ Y = ∅
A → B, A → BC, AB → CD
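
The subset and intersection conditions in these definitions translate directly into set operations. The following is a small sketch under the assumption that every attribute is named by a single character; the function name `classify_fd` is hypothetical.

```python
def classify_fd(lhs, rhs):
    """Classify X -> Y using the definitions above.

    lhs, rhs: strings of single-character attribute names, e.g. "AB".
    """
    x, y = set(lhs), set(rhs)
    if y <= x:        # Y is a subset of X
        return "trivial"
    if x & y:         # Y not a subset of X, but X and Y overlap
        return "semi non-trivial"
    return "fully non-trivial"  # X and Y are disjoint


for lhs, rhs in [("AB", "A"), ("AB", "BC"), ("AB", "CD")]:
    print(f"{lhs} -> {rhs}: {classify_fd(lhs, rhs)}")
```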

Rules for Functional Dependency:

Rule-01:

A functional dependency X → Y will always hold if all the values of X are unique (different)
irrespective of the values of Y.

Example-

Consider the following table-

A B C D E

5 4 3 2 2

8 5 3 2 1

1 9 3 3 5

4 7 3 3 8

The following functional dependencies will always hold since all the values of attribute ‘A’
are unique-



• A→B
• A → BC
• A → CD
• A → BCD
• A → DE
• A → BCDE
In general, we can say the following functional dependency will always hold-

A → Any combination of attributes A, B, C, D, E

Similar will be the case for attributes B and E.

Rule-02:

A functional dependency X → Y will always hold if all the values of Y are the same
irrespective of the values of X.

Example-

Consider the following table-

A B C D E

5 4 3 2 2

8 5 3 2 1

1 9 3 3 5

4 7 3 3 8

The following functional dependencies will always hold since all the values of attribute ‘C’
are the same-

• A→C
• AB → C
• ABDE → C
• DE → C
• AE → C



In general, we can say the following functional dependency will always hold true-

Any combination of attributes A, B, C, D, E → C

Combining Rule-01 and Rule-02 we can say-

In general, a functional dependency α → β always holds if either all values of α are unique, or all
values of β are the same, or both.

Rule-03:

For a functional dependency X → Y to hold, if two tuples in the table agree on the value of
attribute X, then they must also agree on the value of attribute Y.

Rule-04:

For a functional dependency X → Y, the violation will occur only when for two or more same
values of X, the corresponding Y values are different.
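
Rule-03 and Rule-04 can be sketched as a search for violating pairs of tuples. The code below is an illustrative assumption (the name `violations` is not from the notes) and reuses the five-column table from Rule-01 and Rule-02.

```python
from itertools import combinations


def violations(rows, lhs, rhs):
    """Return the pairs of rows that agree on lhs but differ on rhs."""
    bad = []
    for r1, r2 in combinations(rows, 2):
        same_lhs = all(r1[a] == r2[a] for a in lhs)
        diff_rhs = any(r1[b] != r2[b] for b in rhs)
        if same_lhs and diff_rhs:
            bad.append((r1, r2))
    return bad


header = "ABCDE"
data = [(5, 4, 3, 2, 2), (8, 5, 3, 2, 1), (1, 9, 3, 3, 5), (4, 7, 3, 3, 8)]
rows = [dict(zip(header, t)) for t in data]

print(violations(rows, "A", "B"))       # A is unique, so no pair can violate
print(violations(rows, "AB", "C"))      # C is constant, so no pair can violate
print(len(violations(rows, "D", "E")))  # pairs sharing D but differing on E
```

An empty result means the dependency is satisfied by this instance; any returned pair is a concrete witness of the violation described in Rule-04.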

Rule out the functional Dependencies based on the tables:


Identify the F.D. for the given table

Example 1:
Eid Ename
1 a
2 b
3 b

• Eid → Ename (√)


• Ename → Eid (X)



Example 2:

A B
1 1
1 2
2 2

• A → B (X)
• B → A (X)

Example 3:
A B C
1 1 4
1 2 4
2 1 3
2 2 3
2 4 3
• A → B (X)
• B → C (X)
• B → A (X)
• C → B (X)
• C → A (√)
• A → C (√)

Example 4: (GATE 2000)


A B C
1 1 1
1 1 0
2 3 2
2 3 2
• A → B (√)
• B → C (X)

Example 5:
A B C
1 2 3
4 2 3
5 3 3
• A → B (√)
• BC → A (X)
• B → C (√)
• AC → B (√)

Example 6:
X Y Z
1 4 3
1 5 3
4 6 3
3 2 2
• XZ → X (√) {Trivial FD}
• XY → Z (√)
• Z → Y (X)
• Y → Z (√)
• XZ → Y (X)

Example 7: (GATE 2000): Given the following relation instance:


X Y Z
1 4 2
1 5 3
1 6 3
3 2 2
Which of the following functional dependency are satisfied by the given instance:
(i) XY → Z and Z → Y
(ii) YZ → X and Y → Z
(iii) YZ → X and X → Z
(iv) XZ → Y and Y → X

Solution:
(i)   XY → Z (√) and Z → Y (X)
(ii)  YZ → X (√) and Y → Z (√)
(iii) YZ → X (√) and X → Z (X)
(iv)  XZ → Y (X) and Y → X (√)

Only option (ii) is satisfied by the given instance.

Example 8: (GATE 2002): From the following instance of a relation schema R(A, B, C), we can
conclude that:
A B C
1 1 1
1 1 0
2 3 2
2 3 2



a) A functionally determines B and B functionally determines C.
b) A functionally determines B and B does not functionally determine C.
c) B does not functionally determine C.
d) A does not functionally determine B and B does not functionally determine C.

Formal Definition of FD:

A → B

• If tuples t1 and t2 agree on A (A has the same values), then they must agree on B
(B must have the same values).
• If t1 and t2 disagree on A (A has different values), they may agree or disagree on B.

Example:
A    B
1    a    (same A, so B must be the same)
1    a
1    a
3    b    (A is not the same, so B may or may not be the same)



Various uses of FDs:
(i) Identify Additional FDs
(ii) Identifying Keys
(iii) Identifying the Equivalence of FDs
(iv) Finding minimal FD Set

For Performing all the above activities, we have two methods:


(i) Inference Rules
(ii) Closure set of attributes

• We mainly use the closure set of attributes method to solve the questions.

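Armstrong’s axioms: Armstrong’s axioms are a set of inference rules used to infer all the
functional dependencies on a relational database. They were developed by William W. Armstrong
in 1974. The three primary axioms (reflexivity, augmentation, and transitivity) are sufficient to derive
every functional dependency implied by a given set; the other rules listed below (decomposition,
union, and composition) follow from these three.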

Inference Rules:

1) Reflexive: If B is a subset of A, then A → B always holds.


If B ⊆ A then A → B
2) Transitive: If A → B and B → C, then A → C always holds.
If A → B and B → C then A → C
3) Decomposition: If A → BC, then A → B and A → C always holds.
If A → BC then A → B and A → C
4) Augmentation: If A → B, then AC → BC always holds.
If A → B then AC → BC
5) Union/Additive: If A → B and A → C, then A → BC always holds.
If A → B and A → C then A → BC
6) Composition: If A → B and C → D, then AC → BD always holds.
If A → B and C → D then AC → BD
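
As a short worked derivation, the union rule (rule 5) can be obtained from augmentation and transitivity alone. This is one possible derivation, written out step by step:

```latex
\begin{align*}
&1.\ A \to B   && \text{(given)}\\
&2.\ A \to C   && \text{(given)}\\
&3.\ A \to AB  && \text{(augment 1 with } A\text{; note } AA = A\text{)}\\
&4.\ AB \to BC && \text{(augment 2 with } B\text{)}\\
&5.\ A \to BC  && \text{(transitivity on 3 and 4)}
\end{align*}
```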



Closure Set of Attributes:

• The set of all those attributes which can be functionally determined from an attribute set
is called as a closure of that attribute set.
• Closure of attribute set {X} is denoted as {X}+.

Steps to Find Closure of an Attribute Set-

The following steps are followed to find the closure of an attribute set-

Step-01:

Add the attributes contained in the attribute set for which closure is being calculated to the result set.

Step-02:

Recursively add the attributes to the result set which can be functionally determined from the
attributes already contained in the result set.

Example-

Consider a relation R ( A , B , C , D , E , F , G ) with the functional dependencies-


A → BC, BC → DE, D → F, CF → G
Now, let us find the closure of some attributes and attribute sets-

Closure of attribute A-

A+ = { A }

={A,B,C} ( Using A → BC )

={A,B,C,D,E} ( Using BC → DE )

={A,B,C,D,E,F} ( Using D → F )

= { A , B , C , D , E , F , G } ( Using CF → G )

Thus,

A+ = { A , B , C , D , E , F , G }

Closure of attribute D-

D+ = { D }

= { D , F } ( Using D → F )

We can not determine any other attribute using attributes D and F contained in the result set.
Thus,
D+ = { D , F }

Closure of attribute set {B, C}-


{ B , C }+= { B , C }

={B,C,D,E} ( Using BC → DE )

={B,C,D,E,F} ( Using D → F )

= { B , C , D , E , F , G } ( Using CF → G )

Thus,

{ B , C }+ = { B , C , D , E , F , G }
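
The two steps above amount to a fixed-point loop. Here is a minimal sketch, assuming single-character attribute names and FDs written as (LHS, RHS) string pairs; the function name `closure` is chosen for illustration.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds.

    attrs: iterable of attribute names.
    fds: list of (lhs, rhs) pairs, each an iterable of attribute names.
    """
    result = set(attrs)              # Step-01: start with the set itself
    changed = True
    while changed:                   # Step-02: repeat until nothing is added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]
print(sorted(closure("A", fds)))
print(sorted(closure("D", fds)))
print(sorted(closure("BC", fds)))
```

The loop stops as soon as one full pass over the FDs adds no new attribute, which is why the order in which the FDs are listed does not affect the result.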

Example: Relation R(A, B, C, D, E) with set of FDs { A → B, B → D, C→ DE, CD → AB}


Closure set of A = A+ = {A, B, D}
The meaning of A+ is that, in any table which has the attributes A, B, and D, for any given value of A we can
find out the values of B and D. Similarly:
B+ = {B, D}
C+ = {C, D, E, A, B} = {A, B, C, D, E}
D+ = {D}
E+ = {E}

• Closure set can also be calculated for the combination of attributes.
(CD)+ = {C, D, E, A, B}
(AD)+ = {A, D, B}
• If C+ derives all the attributes, then (CD)+ also derives all the attributes.
• The closure set has multiple applications.



Question 1: (GATE 06): The following functional dependencies are given:
AB → CD, AF → D, DE → F, C → G, F → E, G → A
Which one of the following options is false?
a) (CF)+ = {A, C, D, E, F, G}
b) (BG)+ = {A, B, C, D, G}
c) (AF)+ = {A, C, D, E, F, G}
d) (AB)+ = {A, B, C, D, G}

Solution:
(CF)+ = {C, F, G, E, A, D} = {A, C, D, E, F, G}
(BG)+ = {B, G, A, C, D} = {A, B, C, D, G}
(AF)+ = {A, F, E, D}
(AB)+ = {A, B, C, D, G}
(AF)+ does not contain C or G, so option (c) is false.

Determining Candidate Keys: Using the closure property of attributes, we can find out the candidate
keys.

• Candidate key is a minimal combination of one or more attributes by which we can determine all the
other attributes.
• Find the attribute closure of attributes and combination of attributes. If an attribute closure
contains all the attributes of the relation, and no proper subset of that attribute set has the same
property, then we can say that the given attribute or set of attributes is a candidate key.
• If we have n attributes in any relation R (A1, A2, A3, …….. An), the attribute sets that could
possibly be candidate keys are all possible subsets of the attributes except ∅.
Max. no. of possible CKs = 2^n − 1
• Example: R (A, B, C), max possible CKs = 2^3 − 1 = 7
Possible CKs = {A, B, C, AB, BC, AC, ABC}



Steps for Finding Candidate Keys-
We can determine the candidate keys of a given relation using the following steps-
• Determine all essential attributes of the given relation. Essential attributes are those attributes
which are not present on RHS of any functional dependency.
• Essential attributes are always a part of every candidate key. This is because they cannot be
determined by other attributes.
• Check the closure of the combination of all the essential attributes. If the closure contains all
the attributes of the given relation, then that will be the only candidate key of the given
relation.
• Otherwise check the closure of the various combinations of the essential and non-essential
attributes.

Example:

Let R(A, B, C, D, E, F) be a relation scheme with the following functional dependencies-

A→B

C→D

D→E

Here, the attributes which are not present on RHS of any functional dependency are A, C
and F.

So, essential attributes are- A, C and F.
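
The steps for finding candidate keys can be sketched as follows. This is an illustrative brute-force version, assuming single-character attribute names; `closure` and `candidate_keys` are hypothetical helper names. It starts from the essential attributes and adds non-essential ones in increasing number, so every set it reports is minimal.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    attrs = set(attrs)
    on_rhs = set().union(*(set(rhs) for _, rhs in fds))
    essential = attrs - on_rhs          # never on any RHS: in every key
    others = sorted(attrs - essential)
    keys = []
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            cand = essential | set(extra)
            if any(k <= cand for k in keys):
                continue                # contains a smaller key: not minimal
            if closure(cand, fds) == attrs:
                keys.append(cand)
    return keys


fds = [("A", "B"), ("C", "D"), ("D", "E")]
print(["".join(sorted(k)) for k in candidate_keys("ABCDEF", fds)])
```

For the relation above the essential attributes A, C and F already determine every attribute, so the search ends with a single key. The enumeration is exponential in the number of non-essential attributes, which is acceptable for textbook-sized schemas only.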

Example 1: Consider a Relation R (A, B, C, D). The following functional dependencies are given:
FDs: { A → B, B → C, C → D, D → A}. Find the followings:
(i) Max possible CKs for relations R
(ii) Find all the CKs for relation R

Solution: For the given relation Max. Possible CKs = 2^4 − 1 = 15

Possible CKs for above given relation:

S.No.   Attribute sets                Count
1       (ABCD)                        1
2       (ABC, ACD, ABD, BCD)          4
3       (AB, BC, CD, DA, AC, BD)      6
4       (A, B, C, D)                  4
Total possible keys                   15



Identification of CKs:
A+ = {A, B, C, D}
B+ = {B, C, D, A}
C+ = {C, D, A, B}
D+ = {D, A, B, C}
CKs: A, B, C, and D (each single attribute is a candidate key)

Example 1 (GATE 96): Consider a Relation R (A, B, C, D, E, F). The following functional dependencies are
given: FDs: {C → F, E → A, EC → D, A → B}, Find key for the given relation?
a) CD
b) EC
c) AE
d) AC

Solution: C → F, E → A, EC → D, A → B
(CD)+ = {C, D, F}
(EC)+ = {E, C, A, B, D, F}
(AE)+ = {A, E, B}
(AC)+ = {A, C, F, B}

CK = EC

Example 2 (GATE 14): Consider the Relation schema R (E, F, G, H, I, J, K, L, M, N) and the set of
functional dependencies FDs: {{E, F} → {G}, {F} → {I, J}, {E, H} → {K, L} {K} → {M}, {L} → {N}} on R. What is
the key for R?
a) {E, F}
b) {E, F, H}
c) {E, F, H, K, L}
d) {E}

Solution: {{E, F} → {G}, {F} → {I, J}, {E, H} → {K, L} {K} → {M}, {L} → {N}}
(E, F)+ = {E, F, G, I, J}
(E, F, H)+ = {E, F, H, G, I, J, K, L, M, N}
(E, F, H, K, L)+ = {E, F, H, K, L, G, I, J, M, N}
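(E)+ = {E}

{E, F, H, K, L} also determines every attribute, but it is not minimal because {E, F, H} already does.
So, the key is {E, F, H}, which is option (b).

Note: An attribute that is not present in the RHS of any FD will always be
a part of every candidate key.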



Example (GATE 05): Consider the Relation schema R = (A, B, C, D, E, H) on which the following set of
functional dependencies hold FDs: {A → B, BC →D, E→ C, D → A} on R. What are the candidate keys for
R?
a) AE, BE
b) AE, BE, DE
c) AEH, BEH, BCH
d) AEH, BEH, DEH

Solution: R = (A, B, C, D, E, H)
FDs: {A → B, BC →D, E→ C, D → A}
Since E and H are not present in the RHS of any FD, E and H must be part of every CK.
(EH)+ = {E, H, C}
(AEH)+ = {A, E, H, B, C, D} → CK
(BEH)+ = {B, E, H, C, D, A} → CK
(CEH)+ = {C, E, H} → Not a CK
(DEH)+ = {D, E, H, A, B, C} → CK
So, the candidate keys are AEH, BEH, and DEH, which is option (d).

Example (GATE 13): Consider the Relation schema R has eight attributes (A, B, C, D, E, F, G, H). Fields of
R contains only atomic values. The set of functional dependencies F = {CH →G, A → BC, B →CFH, E → A,
F →EG} on R. How many candidate keys does the relation R have?
a) 3
b) 4
c) 5
d) 6

Solution: R = (A, B, C, D, E, F, G, H)
FDs: F = {CH →G, A → BC, B →CFH, E → A, F →EG}
Since D is not present in the RHS of any FD, D must be part of every CK.
(AD)+ = {A, D, B, C, F, H, G, E} → CK
(BD)+ = {B, D, C, F, H, E, G, A} → CK
(CD)+ = {C, D} → Not a CK
(ED)+ = {E, D, A, B, C, F, H, G} → CK
(FD)+ = {F, D, E, G, A, B, C, H} → CK
(GD)+ = {G, D} → Not a CK
(HD)+ = {H, D} → Not a CK

(CDG)+ = {C, D, G} → Not a CK
(CDH)+ = {C, D, H, G} → Not a CK
(DGH)+ = {D, G, H} → Not a CK
(CDGH)+ = {C, D, G, H} → Not a CK

So, R has 4 candidate keys (AD, BD, ED, and FD), which is option (b).



Candidate Key for Sub-relation: Using the closure property of attributes we can find out the candidate
keys for a sub-relation.

Example 1:
Consider a relation R (A, B, C, D, E, F) with the set of functional dependencies
F = {AB →C, B → D, AD → F, C → D, D → E, E → F, E → D} on R
What are the candidate keys of sub-relation R1 (D, E, F) ?

R1 (D, E, F) → {D → E, E → F, E → D}
D+ = {D, E, F} → CK for R1
E+ = {E, F, D} → CK for R1
F+ = {F} → Not a CK for R1
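
A sketch of this check in code is given below, assuming single-character attribute names; `closure` and `sub_relation_keys` are hypothetical helper names. Each closure is computed under the full FD set F of R and then compared against the attributes of the sub-relation, so dependencies that pass through attributes outside R1 are not missed.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def sub_relation_keys(sub_attrs, fds):
    """Candidate keys of the sub-relation with attributes sub_attrs."""
    sub = set(sub_attrs)
    keys = []
    for size in range(1, len(sub) + 1):
        for combo in combinations(sorted(sub), size):
            cand = set(combo)
            if any(k <= cand for k in keys):
                continue                     # not minimal
            if sub <= closure(cand, fds):    # determines all of R1
                keys.append(cand)
    return keys


F = [("AB", "C"), ("B", "D"), ("AD", "F"), ("C", "D"),
     ("D", "E"), ("E", "F"), ("E", "D")]
print(["".join(sorted(k)) for k in sub_relation_keys("DEF", F)])
```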

Example 2:
Consider a relation R (A, B, C, D, E) with the set of functional dependencies
F = {A → BC, CD → E, B → D, E → A} on R
What are the candidate keys of sub-relation R1 (A, B, C, E) ?

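The FDs that hold on R1 (A, B, C, E) are found by computing each closure under the full set F and
keeping only the attributes of R1. Using only A → BC and E → A would miss dependencies that pass
through D, which is outside R1.
A+ = {A, B, C, D, E}, restricted to R1: {A, B, C, E} → CK for R1
B+ = {B, D}, restricted to R1: {B} → Not a CK for R1
C+ = {C} → Not a CK for R1
E+ = {E, A, B, C, D}, restricted to R1: {E, A, B, C} → CK for R1
BC+ = {B, C, D, E, A}, restricted to R1: {B, C, E, A} → CK for R1
So, the candidate keys of R1 are A, E, and BC.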



Checking Additional FDs: Using the inference rules and closure property of attributes we can find out
the additional FDs for any given relation.

R (A, B)
FD: X → Y, where X and Y can each be A, B, or AB:

X     Y
A     A
B     B
AB    AB

So, in total 3 × 3 = 9 FDs are possible for the given relation:

A → A, A→ B, A → AB

B → A, B→ B, B → AB

AB → A, AB→ B, AB → AB

• Trivial FDs are always true for any given relation. So, we do not consider the trivial FDs.

Example 1: Consider a relation R (A, B) with the set of functional dependencies


FD = {A → B, B → A} on R

Additional FDs are:


A+ = {A, B}
A → A (trivial), A → B (already given), A → AB (Additional FD)
B+ = {B, A}
B → B (trivial), B → A (already given), B → AB (Additional FD)
AB+ = {A, B}
AB → A (trivial), AB → B (trivial), AB → AB (trivial)

So, the additional FDs (excluding trivial FDs) are:

A → AB, B → AB

FDs = {A → B, B → A, A → AB, B → AB}



Example 2: Consider a relation R (A, B) with the set of functional dependencies
FD = {A → B} on R

Additional FDs are:


A+ = {A, B}
A → A (trivial), A → B (already given), A → AB (Additional FD)
B+ = {B}
B → B (trivial)
AB+ = {A, B}
AB → A (trivial), AB → B (trivial), AB → AB (trivial)

So, the additional FDs (excluding trivial FDs) are:

A → AB

FDs: {A → B, A → AB}

Example 3: Consider a relation R (A, B, C) with the set of functional dependencies FD = {A → B, B → C}
on R. What is the set of FDs that we can derive from the given FDs?

Example 4 (GATE 2005): In a schema R with attributes A, B, C, D, and E, the following set of
functional dependencies FDs are given {A → B, A → C, CD → E, B → D, E → A}. Which of the following
functional dependencies is not implied by the above set?
a) CD → AC
b) BD → CD
c) BC → CD
d) AC → BC

Solution: To check whether an additional FD is implied, take the closure of its LHS; if the RHS is contained
in the closure set, then that FD is implied.
CD+ = {C, D, E, A, B} → contains A and C, so CD → AC is implied.
BD+ = {B, D} → does not contain C, so BD → CD is not implied.
BC+ = {B, C, D, E, A} → contains C and D, so BC → CD is implied.
AC+ = {A, C, B, D, E} → contains B and C, so AC → BC is implied.
So, option (b) is the answer.
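
The closure test used in this solution can be sketched in a few lines. The helper names `closure` and `implies` are assumptions for illustration, and attributes are taken to be single characters.

```python
def closure(attrs, fds):
    """Closure of attrs under fds (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds, i.e. rhs is inside lhs+."""
    return set(rhs) <= closure(lhs, fds)


fds = [("A", "B"), ("A", "C"), ("CD", "E"), ("B", "D"), ("E", "A")]
for lhs, rhs in [("CD", "AC"), ("BD", "CD"), ("BC", "CD"), ("AC", "BC")]:
    print(f"{lhs} -> {rhs}:", implies(fds, lhs, rhs))
```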



Equivalence of FDs: Suppose F and G are two sets of FDs, so F and G are equivalent if F covers G and G
covers F.
If G ⊆ F → (G covered by F or F Covers G)
and F ⊆ G → (F covered by G or G Covers F)
Then F = G

Steps for Equivalence of Two Sets of Functional Dependencies-

In DBMS,
• Two different sets of functional dependencies for a given relation may or may not be
equivalent.
• If F and G are the two sets of functional dependencies, then the following 3 cases are
possible-

Case-01: F covers G (F ⊇ G)

Case-02: G covers F (G ⊇ F)

Case-03: Both F and G cover each other (F = G)

Case-01: Determining Whether F Covers G-

The following steps are followed to determine whether F covers G or not-

• consider the functional dependencies of set G.


• For each functional dependency X → Y, find the closure of X using the
functional dependencies of set F.
• Compare the results of Step-01 and Step-02.
• If the functional dependencies of set F have determined all those attributes
that were determined by the functional dependencies of set G, then it means F
covers G.
• Thus, we conclude F covers G (F ⊇ G) otherwise not.



Case-02: Determining Whether G Covers F-

The following steps are followed to determine whether G covers F or not-

• Take the functional dependencies of set F into consideration.


• For each functional dependency X → Y, find the closure of X using the
functional dependencies of set G.
• Compare the results of Step-01 and Step-02.
• If the functional dependencies of set G has determined all those attributes that
were determined by the functional dependencies of set F, then it means G
covers F.
• Thus, we conclude G covers F (G ⊇ F) otherwise not.

Case-03: Determining Whether Both F and G Cover Each Other-

• If F covers G and G covers F, then both F and G cover each other.


• Thus, if both the above cases hold true, we conclude both F and G cover each
other (F = G).

Example 1: Consider F = {A → C, AC → D, E → AD, E → H} and G = {A → CD, E → AH}

Solution:
In F:
A+ = {A, C, D} , A → CD holds true in F
E+ = {E, A, D, H, C} , E → AH holds true in F
So, G ⊆ F (F covers G)
In G:
A+ = {A, C, D} , A → C holds true in G
AC+ = {A, C, D} , AC → D holds true in G
E+ = {E, A, H, C, D} , E → AD and E → H holds true in G.

So, F ⊆ G (G covers F)
Then F = G (F and G are equivalent).
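
The cover test from Case-01 to Case-03 can be sketched as below, again assuming single-character attribute names; `closure`, `covers`, and `equivalent` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Closure of attrs under fds (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if F covers G: every FD of G is implied by F."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """True if F and G cover each other."""
    return covers(f, g) and covers(g, f)


F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]
print(covers(F, G), covers(G, F), equivalent(F, G))
```

Note that `covers(f, g)` takes the left sides from G but computes their closures with the FDs of F, which is exactly the asymmetry in the written procedure.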



Example 2: Consider F = {A → B, B → C, C → D} and G = {A → BC, C → D}

Solution:
In F:
A+ = {A, B, C, D}, A → BC covered in F
C+ = {C, D}, C → D covered in F
So, G ⊆ F (F covers G)
In G:
A+ = {A, B, C, D}, A → B covered in G
B+ = {B}, B → C not covered in G
So, F ⊈ G (G not covers F)
Then F ≠ G (F and G are not equivalent).

Example 3: Consider F = {A → B, AB → C, D → AC, D → E} and G = {A → BC, D → AB}

Solution:
In F:
A+ = {A, B, C}, A → BC covered in F
D+ = {D, A, C, E, B}, D → AB covered in F
So, G ⊆ F (F covers G)
In G:
A+ = {A, B, C}, A → B covered in G
AB+ = {A, B, C}, AB → C covered in G
D+ = {D, A, B, C}, D → AC covered in G, but D → E not covered in G
So, F ⊈ G (G not covers F)
Then F ≠ G (F and G are not equivalent).

Example 4: Consider F = {A → B, B → C, C → A} and G = {A → BC, B → A, C → A}

Solution:
In F:
A+ = {A, B, C} , A → BC covered in F
B+ = {B, C, A} , B → A covered in F
C+ = {C, A, B} , C → A covered in F

So, G ⊆ F (F covers G)
In G:
A+ = {A, B, C} , A → B covered in G



B+ = {B, A, C} , B → C covered in G
C → A is covered in G
So, F ⊆ G (G covers F)
Then F = G (F and G are equivalent).

Practice Problem-

A relation R (A, C, D, E, H) is having two functional dependency sets F and G as shown-

Set F-

A → C, AC → D, E → AD, E → H

Set G-

A → CD, E → AH
Which of the following holds true?
(A) G ⊇ F
(B) F ⊇ G
(C) F = G
(D) All of the above
Solution-

Determining whether F covers G-

• (A)+ = { A , C , D } // closure of left side of A → CD using set F


• (E)+ = { A , C , D , E , H } // closure of left side of E → AH using set F

Comparing the results, we find-

• Functional dependencies of set F can determine all the attributes that have
been determined by the functional dependencies of set G.
• Thus, we conclude F covers G i.e. F ⊇ G.

Determining whether G covers F-



• (A)+ = { A , C , D } // closure of left side of A → C using set G
• (AC)+ = { A , C , D } // closure of left side of AC → D using set G
• (E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set G

Comparing the results we find-

• Functional dependencies of set G can determine all the attributes which have
been determined by the functional dependencies of set F.
• Thus, we conclude G covers F i.e. G ⊇ F.

Determining whether both F and G cover each other-

• From Step-01, we conclude F covers G.


• From Step-02, we conclude G covers F.
• Thus, we conclude both F and G cover each other i.e. F = G.

Thus, Option (D) is correct.



Minimal Cover/ Canonical Cover of FDs:

In DBMS,

• A canonical cover is a simplified and reduced version of the given set of
functional dependencies.
• Since it is a reduced version, it is also called an irreducible set.

If F is any set of FDs and we can minimize it to a Set G such that F ⊆ G and G ⊆ F then G is the minimal
cover of F.

Characteristics-

• Canonical cover is free from all the extraneous functional dependencies.


• The closure of canonical cover is the same as that of the given set of functional
dependencies.
• Canonical cover is not unique and may be more than one for a given set of
functional dependencies.

Need:

• Working with the set containing extraneous functional dependencies increases the
computation time.

• Therefore, the given set is reduced by eliminating the useless functional
dependencies.
• This reduces the computation time and working with the irreducible set becomes
easier.



Steps (Procedure) to find Minimal/Canonical Cover of FD set:

(1) Split the FDs such that the RHS part contains single attributes only.
Write the given set of functional dependencies in such a way that each functional dependency
contains exactly one attribute on its right side.
• For Example: A → BC can be split into A → B and A → C
• The functional dependency X → YZ will be written as- X → Y and X → Z

(2) Find the redundant FDs and delete them from the set.

For Ex.: if FD set F: {A → B, B → C, A → C} here FD A → C is redundant so we can delete it


Now F: {A → B, B → C}

• Consider each functional dependency one by one from the set obtained in
Step-01.
• Determine whether it is essential or non-essential.
• To determine whether a functional dependency is essential or not, compute
the closure of its left side-
• Once by considering that the particular functional dependency is present in
the set.
• Once by considering that the particular functional dependency is not present
in the set.

Then the following two cases are possible-

Case-01: Results Come Out to be Same-

If the results come out to be the same,

• It means that the presence or absence of that functional dependency does not
create any difference.

• Thus, it is non-essential.

• Eliminate that functional dependency from the set.

Case-02: Results Come Out to be Different-

If the results come out to be different,

• It means that the presence or absence of that functional dependency creates a
difference.
• Thus, it is essential.
• Do not eliminate that functional dependency from the set.
• Mark that functional dependency as essential.

NOTE-
• Eliminate the non-essential functional dependency from the set as soon as it is
discovered.
• Do not consider it while checking the essentiality of other functional
dependencies.

(3) Find the redundant/extraneous attributes on LHS and delete them.


For Ex.: AB → C, here A can be deleted if B+ contains ‘A’.

• Consider the newly obtained set of functional dependencies after performing
Step-02.
• Check if there is any functional dependency that contains more than one
attribute on its left side.

The following two cases are possible-

Case-01: No-

• There exists no functional dependency containing more than one attribute on its
left side.
• In this case, the set obtained in Step-02 is the canonical cover.

Case-02: Yes-

• There exists at least one functional dependency containing more than one
attribute on its left side.
• In this case, consider all such functional dependencies one by one.
• Check if their left side can be reduced.

Use the following steps to perform a check-

• Consider a functional dependency.


• Compute the closure of all the possible subsets of the left side of that
functional dependency.
• If any of the subsets produce the same closure result as produced by the
entire left side, then replace the left side with that subset.

Example 1: Minimize F = {A → C, AC → D, E → AD, E → H}.

Solution:
(1) A → C, AC → D, E → A, E → D, E → H
(2) Each closure below is computed while ignoring the FD being tested.
 A+ = {A}, A → C Not redundant in F
 AC+ = {A, C}, AC → D Not redundant in F
 E+ = {E, D, H}, E → A Not redundant in F
 E+ = {E, A, C, D, H}, E → D Redundant in F
 E+ = {E, A, C, D}, E → H Not redundant in F

 F = {A → C, AC → D, E → A, E → H}
(3) For AC → D
 C+ = {C}, A+ = {A, C}
 Since A+ already contains C, attribute C is extraneous on the left side.
 So, AC → D can be minimized to A → D.

So, the final FD set:

 F = {A → C, A → D, E → A, E → H}
 Or F = {A → CD, E → AH}

Example 2: Minimize F = {A → B, C → B, D → ABC, AC → D}.

Solution:
(1) A → B, C → B, D → A, D → B, D → C, AC → D
(2) A+ = {A}, A → B Not redundant in F
C+ = {C}, C → B Not redundant in F
D+ = {D, B, C}, D → A Not redundant in F
D+ = {D, A, B, C}, D → B Redundant in F
D+ = {D, A, B}, D → C Not redundant in F
AC+ = {A, C, B}, AC → D Not Redundant in F

F = {A → B, C → B, D → A, D → C, AC → D}
(3) For AC → D
A+ = {A, B}, C+ = {C, B}
So, AC → D cannot be minimized

So the final FD set:
F = {A → B, C → B, D → A, D → C, AC → D}

Example 3: Is {AB → C, D → E, E → C} the minimal cover of {AB → C, D → E, AB → E, E → C}?

Solution:
(1) AB → C, D → E, AB → E, E → C
(2) AB+ = {A, B, E, C}, AB → C is redundant
 D+ = {D}, D → E is not redundant
 AB+ = {A, B}, AB → E is not redundant (AB → C has already been removed)
 E+ = {E}, E → C is not redundant

 F = {D → E, AB → E, E → C}
(3) For AB → E
 A+ = {A}, B+ = {B}
 So, AB → E cannot be minimized

So the final FD set:

 F = {D → E, AB → E, E → C}

So the answer is no: the given set keeps the redundant AB → C and drops the essential AB → E.
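
A question of the form "is G a cover of F?" can also be checked mechanically: two FD sets are equivalent when each FD of one set follows from the other set. The sketch below is an illustration only; the helper names `closure`, `covers` and `equivalent`, and the string-pair encoding of FDs, are assumptions made for this example.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs such as ("AB", "C")."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True when every FD of g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """Two FD sets are equivalent when each one covers the other."""
    return covers(f, g) and covers(g, f)


F = [("AB", "C"), ("D", "E"), ("AB", "E"), ("E", "C")]
proposed = [("AB", "C"), ("D", "E"), ("E", "C")]
minimal = [("D", "E"), ("AB", "E"), ("E", "C")]

print(equivalent(F, proposed))  # False: AB -> E cannot be derived from the proposed set
print(equivalent(F, minimal))   # True
```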

Practice Problem 1:

The following functional dependencies hold true for the relational scheme R ( W, X, Y, Z):

X → W, WZ → XY, Y → WXZ

Write the irreducible equivalent for this set of functional dependencies.

Solution-

Step-01:

Write all the functional dependencies such that each contains exactly one attribute on its right
side-

X→W

WZ → X

WZ → Y

Y→W

Y→X

Y→Z

Step-02:
Check the essentiality of each functional dependency one by one.

For X → W:

• Considering X → W, (X)+ = { X , W }
• Ignoring X → W, (X)+ = { X }
Now
• Clearly, the two results are different.
• Thus, we conclude that X → W is essential and can not be eliminated.

For WZ → X:

• Considering WZ → X, (WZ)+ = { W , X , Y , Z }
• Ignoring WZ → X, (WZ)+ = { W , X , Y , Z }

Now,

• Clearly, the two results are same.


• Thus, we conclude that WZ → X is non-essential and can be eliminated.

Eliminating WZ → X, our set of functional dependencies reduces to-

X→W

WZ → Y

Y→W

Y→X

Y→Z

Now, we will consider this reduced set in further checks.

For WZ → Y:

• Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
• Ignoring WZ → Y, (WZ)+ = { W , Z }
Now,

• Clearly, the two results are different.


• Thus, we conclude that WZ → Y is essential and can not be eliminated.

For Y → W:

• Considering Y → W, (Y)+ = { W , X , Y , Z }
• Ignoring Y → W, (Y)+ = { W , X , Y , Z }
Now,

• Clearly, the two results are same.


• Thus, we conclude that Y → W is non-essential and can be eliminated.
• Eliminating Y → W, our set of functional dependencies reduces to-

X→W

WZ → Y

Y→X

Y→Z

For Y → X:

• Considering Y → X, (Y)+ = { W , X , Y , Z }
• Ignoring Y → X, (Y)+ = { Y , Z }

Now,

• Clearly, the two results are different.


• Thus, we conclude that Y → X is essential and can not be eliminated.

For Y → Z:

• Considering Y → Z, (Y)+ = { W , X , Y , Z }
• Ignoring Y → Z, (Y)+ = { W , X , Y }
Now,

• Clearly, the two results are different.


• Thus, we conclude that Y → Z is essential and cannot be eliminated.
From here, our essential functional dependencies are-

X→W

WZ → Y

Y→X

Y→Z

Step-03:
• Consider the functional dependencies having more than one attribute on their left
side.
• Check if their left side can be reduced.
In our set,

• Only WZ → Y contains more than one attribute on its left side.


• Considering WZ → Y, (WZ)+ = { W , X , Y , Z }

Now,

• Consider all the possible subsets of WZ.


• Check if the closure result of any subset matches to the closure result of WZ.
(W)+ = { W }

(Z)+ = { Z }

Clearly,

• None of the subsets has the same closure as the entire left side.
• Thus, we conclude that we cannot write WZ → Y as W → Y or Z → Y.
• Thus, the set of functional dependencies obtained in Step-02 is the canonical cover.
Finally, the canonical cover is-

X → W, WZ → Y, Y → X, Y → Z

Decomposition of a Relation:

The process of breaking up or dividing a single relation into two or more sub-relations is called
decomposition of a relation.

Properties of Decomposition-

The following two properties must be followed when decomposing a given relation-

1. Lossless decomposition:
Lossless decomposition ensures-

• No information is lost from the original relation during decomposition.


• When the sub-relations are joined back, the same relation is obtained that was
decomposed.
Every decomposition must always be lossless.

2. Dependency Preservation:
Dependency preservation ensures-

• None of the functional dependencies that hold on the original relation are lost.
• The sub-relations still hold or satisfy the functional dependencies of the original
relation.

Types of Decomposition:

The decomposition of a relation can be completed in the following two ways-

Lossless Decomposition: To convert any relation into normalized form (Normal form) we have to
divide the bigger relation into smaller sub-relations. But during the time of query processing, we have
to combine the sub-relation to form the original relation. During this process, there must be no loss of
information from the relation.

• For combining the two tables there must be one common attribute between them. Any
decomposition must be lossless.

A B C
𝑎1 𝑏1 𝑐1
𝑎2 𝑏1 𝑐1
𝑎1 𝑏2 𝑐2

The above relation can be decomposed into several types of sub-relations:


(i) First Decomposition:

R1 (A):
A
𝑎1
𝑎2

R2 (B, C):
B C
𝑏1 𝑐1
𝑏2 𝑐2

When merged again using cross product (No common attributes to perform natural join
operation):
A B C
𝑎1 𝑏1 𝑐1
𝑎1 𝑏2 𝑐2
𝑎2 𝑏1 𝑐1
𝒂𝟐 𝒃𝟐 𝒄𝟐 (extra tuple)
An extra tuple is generated, so the original relation cannot be recovered and there is a loss of
information here. So, the given decomposition is lossy decomposition.

(ii) Second Decomposition:


R1 (A, B):
A B
𝑎1 𝑏1
𝑎2 𝑏1
𝑎1 𝑏2

R2 (B, C):
B C
𝑏1 𝑐1
𝑏2 𝑐2

When merged again using Natural Join:


A B C
𝑎1 𝑏1 𝑐1
𝑎2 𝑏1 𝑐1
𝑎1 𝑏2 𝑐2

No extra tuples are generated so no loss of information here. So, the given decomposition is
lossless decomposition.

(iii) Third Decomposition:


R1 (A, B):
A B
𝑎1 𝑏1
𝑎2 𝑏1
𝑎1 𝑏2

R2 (A, C):
A C
𝑎1 𝑐1
𝑎2 𝑐1
𝑎1 𝑐2

When merged again using Natural Join:

A B C
𝑎1 𝑏1 𝑐1
𝒂𝟏 𝒃𝟏 𝒄𝟐
𝑎2 𝑏1 𝑐1
𝒂𝟏 𝒃𝟐 𝒄𝟏
𝑎1 𝑏2 𝑐2
Extra tuples are generated so there is loss of information here. So, the given decomposition is
lossy decomposition.

Note: Lossless decomposition is possible whenever we split the table into two parts and the common
attribute is a candidate key (or key) in any one of the sub relations. Then we can say that the
decomposition is lossless.
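
The three decompositions above can be replayed on the sample instance. The following is a small sketch, assuming a relation is stored as a Python set of tuples and that attribute names are single letters; the helper names `relation`, `project` and `natural_join` are hypothetical and used only for this illustration.

```python
def relation(attrs, rows):
    """Build a relation as a set of tuples, each tuple a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}


def project(rel, attrs):
    """Projection onto attrs; using a set removes duplicate tuples automatically."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}


def natural_join(r1, r2):
    """Join tuples that agree on all shared attributes (a cross product if none are shared)."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out


R = relation("ABC", [("a1", "b1", "c1"), ("a2", "b1", "c1"), ("a1", "b2", "c2")])

first = natural_join(project(R, "A"), project(R, "BC"))
second = natural_join(project(R, "AB"), project(R, "BC"))
third = natural_join(project(R, "AB"), project(R, "AC"))

for name, joined in [("first", first), ("second", second), ("third", third)]:
    print(name, "lossless" if joined == R else f"lossy, {len(joined - R)} extra tuple(s)")
```

Note that a check on one instance can only show that a decomposition is lossy for that data; whether a decomposition is lossless for every valid instance is decided from the functional dependencies.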

1. Lossless Join Decomposition-

• Consider there is a relation R which is decomposed into sub relations R1, R2, …., Rn.

• This decomposition is called lossless join decomposition when the join of the sub
relations results in the same relation R that was decomposed.
• For lossless join decomposition, we always have-

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R

where ⋈ is a natural join operator

Example-

Consider the following relation R (A, B, C)-

A B C

1 2 1

2 5 3

3 3 3

R (A, B, C)

Consider this relation is decomposed into two sub relations R1(A, B) and R2(B, C)-

The two sub-relations are-

A B

1 2

2 5

3 3

R1 (A, B)

B C

2 1

5 3

3 3

R2 (B, C)

Now, let us check whether this decomposition is lossless or not.

For lossless decomposition, we must have-

R1 ⋈ R2 = R

Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 , we get-

A B C

1 2 1

2 5 3

3 3 3

This relation is same as the original relation R.

Thus, we conclude that the above decomposition is lossless join decomposition.

NOTE-

• Lossless join decomposition is also known as non-additive join decomposition.


• This is because the resultant relation after joining the sub relations is same as the
decomposed relation.
• No extraneous tuples appear after joining of the sub-relations.

2. Lossy Join Decomposition-

• Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.


• This decomposition is called lossy join decomposition when the join of the sub
relations does not result in the same relation R that was decomposed.
• The natural join of the sub relations is always found to have some extraneous tuples.
• For lossy join decomposition, we always have-

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R

where ⋈ is a natural join operator

Example-
Consider the following relation R(A, B, C)-

A B C

1 2 1

2 5 3

3 3 3

R(A, B, C)

Consider this relation is decomposed into two sub relations as R1(A, C) and R2(B, C)-

The two sub-relations are-

A C

1 1

2 3

3 3

R1(A, C)

B C

2 1

5 3

3 3

R2(B, C)

Now, let us check whether this decomposition is lossy or not.

For lossy decomposition, we must have-

R1 ⋈ R2 ⊃ R

Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 we get-

A B C

1 2 1

2 5 3

2 3 3

3 5 3

3 3 3

This relation is not the same as the original relation R and contains some extraneous tuples.

Clearly, R1 ⋈ R2 ⊃ R.

Thus, we conclude that the above decomposition is a lossy join decomposition.

NOTE-

• Lossy join decomposition is also known as careless decomposition.


• This is because extraneous tuples get introduced in the natural join of the sub-
relations.
• Extraneous tuples make the identification of the original tuples difficult.

FD Preserving Decomposition: If all the FDs are preserved after decomposition, then we can say that
the decomposition is FD preserving.

• Suppose R (𝑨𝟏 , 𝑨𝟐 , 𝑨𝟑 , … , 𝑨𝒏 ) with set of functional dependencies F.
• Identify the closure of F, i.e., F+.
• Suppose R is divided into R1 and R2.
• Identify F1 for R1 (F1 ⊆ F+) and F2 for R2 (F2 ⊆ F+), where F1 and F2 contain the FDs of F+ whose attributes all lie in R1 and in R2 respectively.
• If (F1 ∪ F2)+ = F+ then we can say that the decomposition is dependency preserving.

[Figure: R (𝑨𝟏 , 𝑨𝟐 , … , 𝑨𝒏 ) with F+ is split into R1 with F1 ⊆ F+ and R2 with F2 ⊆ F+; for example, R (A, B, C, D) is split into R1 (A, B) and R2 (C, D).]

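Example 1: Consider a relation R (A, B, C) with set of FDs: {A → B, B → C, C → A} that is decomposed into
R1 (A, B) and R2 (B, C).

Solution:
F = {A → B, B → C, C → A}
F+ = {A → B, B → C, C → A, A → C, B → A, C → B}

R1 (A, B): F1 = {A → B, B → A}, since A+ = {A, B, C} and B+ = {B, C, A}
R2 (B, C): F2 = {B → C, C → B}, since B+ = {B, C, A} and C+ = {C, A, B}

A → B and B → C appear directly in F1 and F2. Using only F1 ∪ F2, C → B (from F2) and B → A (from F1)
give C+ = {C, B, A}.

So, C → A is preserved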

So, all the FDs are preserved. Both the relations R1 and R2 have one Common attribute B. And B is
a candidate key in both the relations. So the decomposition is lossless and dependency
preserving.

Example 2: Consider a relation R (A, B, C, D) with set of FDs: {A → B, B → C, C → D, D → A} that is
decomposed into R1 (A, B), R2 (B, C), and R3 (C, D).

Solution:
F = {A → B, B → C, C → D, D → A}

B+ = {B, C, D, A}, C+ = {C, D, A, B}, D+ = {D, A, B, C}

R1 (A, B): F1 = {A → B, B → A}
R2 (B, C): F2 = {B → C, C → B}
R3 (C, D): F3 = {C → D, D → C}

A → B, B → C and C → D appear directly in F1, F2 and F3. Using only F1 ∪ F2 ∪ F3, D → C, C → B and
B → A give D+ = {D, C, B, A}, so D → A is preserved.

(𝑭𝟏 ∪ 𝑭𝟐 ∪ 𝑭𝟑)+ = 𝑭+

So, all the FDs are preserved. Relations R1 and R2 have one Common attribute B, and B is a
candidate key in R1 relation. Similarly, relations R2 and R3 have one Common attribute C, and C is a
candidate key in R2 relation. So, the decomposition is lossless and dependency preserving.
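
Computing F+ in full grows quickly, so dependency preservation is usually tested one FD at a time. The sketch below uses the common textbook shortcut of repeatedly taking closures restricted to each sub-relation; it is offered as an illustration, and the names `closure` and `preserves_dependencies` as well as the string encoding of schemas and FDs are assumptions, not notation from these notes.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(fds, parts):
    """True when every FD of fds can be derived from the FDs that live inside the parts."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for part in parts:
                part = set(part)
                # What the attributes we already have determine inside this sub-relation.
                gained = closure(result & part, fds) & part
                if not gained <= result:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            return False
    return True


print(preserves_dependencies([("A", "B"), ("B", "C"), ("C", "A")], ["AB", "BC"]))
print(preserves_dependencies([("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")], ["AB", "BC", "CD"]))
print(preserves_dependencies([("AB", "C"), ("C", "B")], ["AC", "CB"]))
```

The first two calls correspond to Example 1 and Example 2 and report True; the third splits R (A, B, C) with {AB → C, C → B} into (A, C) and (C, B) and reports False, because AB → C cannot be checked inside either sub-relation.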

Determining Whether Decomposition Is Lossless Or Lossy-

Consider a relation R is decomposed into two sub relations R1 and R2.

Then,
• If all the following conditions are satisfied, then the decomposition is lossless.
• If any of these conditions fails, then the decomposition is lossy.

Condition-01:

The Union of both the sub relations must contain all the attributes that are present in the original
relation R.

Thus,

R1 ∪ R2 = R

Condition-02:
The intersection of both the sub relations must not be empty.
In other words, there must be some common attribute which is present in both the sub relations.
Thus,

R1 ∩ R2 ≠ ∅

Condition-03:
Intersection of both the sub relations must be a super key of either R1 or R2 or both.

Thus,

R1 ∩ R2 = Super key of R1 or R2
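
The three conditions translate almost line by line into code. The following is a minimal sketch for a decomposition into exactly two sub-relations; the function names `closure` and `is_lossless` and the use of strings for schemas and FDs are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r, r1, r2, fds):
    r, r1, r2 = set(r), set(r1), set(r2)
    if r1 | r2 != r:                 # Condition-01: union must give back all attributes
        return False
    common = r1 & r2
    if not common:                   # Condition-02: intersection must not be empty
        return False
    reach = closure(common, fds)     # Condition-03: common attributes form a super key
    return r1 <= reach or r2 <= reach  # of R1 or of R2


print(is_lossless("ABCD", "AB", "CD", [("A", "B"), ("C", "D")]))   # False
fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
print(is_lossless("ABCD", "ABC", "BD", fds))                        # True
print(is_lossless("ABC", "AB", "BC", fds))                          # True
```

A decomposition into more than two sub-relations can be checked by applying this test to one binary split at a time.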

Problem-01:
Consider a relation schema R ( A , B , C , D ) with the functional dependencies A → B and C → D.
Determine whether the decomposition of R into R1 ( A , B ) and R2 ( C , D ) is lossless or lossy.

Solution-

To determine whether the decomposition is lossless or lossy,


• We will check all the conditions one by one.
• If any of the conditions fail, then the decomposition is lossy otherwise lossless.

Condition-01:
According to condition-01, union of both the sub relations must contain all the attributes of relation R.

So, we have-

R1 ( A , B ) ∪ R2 ( C , D ) = R ( A , B , C , D )
Clearly, the union of the sub relations contains all the attributes of relation R.

Thus, condition-01 is satisfied.

Condition-02:
According to condition-02, intersection of both the sub relations must not be null.

So, we have-

R1 ( A , B ) ∩ R2 ( C , D ) = ∅

Clearly, the intersection of the sub relations is empty.

So, condition-02 fails. Thus, we conclude that the decomposition is lossy.

Problem-02:
Consider a relation schema R ( A , B , C , D ) with the following functional dependencies-

A → B, B → C, C → D, D → B

Determine whether the decomposition of R into R1 ( A , B ) , R2 ( B , C ) and R3 ( B , D ) is lossless or
lossy.

Solution-

Consider that the original relation R was decomposed in two steps: first R (A, B, C, D) into
R' (A, B, C) and R3 (B, D), and then R' (A, B, C) into R1 (A, B) and R2 (B, C). Each step is checked
separately.

Decomposition of R(A, B, C, D) into R'(A, B, C) and R3(B, D)-


To determine whether the decomposition is lossless or lossy,

• We will check all the conditions one by one.


• If any of the conditions fail, then the decomposition is lossy otherwise lossless.

Condition-01:
According to condition-01, the union of both the sub relations must contain all the attributes of relation R.
So, we have-
R' ( A , B , C ) ∪ R3 ( B , D ) = R ( A , B , C , D )
Clearly, the union of the sub relations contains all the attributes of relation R.

Thus, condition-01 is satisfied.

Condition-02:
According to condition-02, intersection of both the sub relations must not be null.

So, we have-

R' ( A , B , C ) ∩ R3 ( B , D ) = B

Clearly, the intersection of the sub relations is not empty.

Thus, condition-02 is satisfied.

Condition-03:
According to condition-03, intersection of both the sub relations must be the super key of one of the
two sub relations or both.

So, we have-

R' ( A , B , C ) ∩ R3 ( B , D ) = B

Now, the closure of attribute B is-

B+ = { B , C , D }
Now, we see-

• Attribute ‘B’ can not determine attribute ‘A’ of sub relation R’.
• Thus, it is not a super key of the sub relation R’.
• Attribute ‘B’ can determine all the attributes of sub relation R3.
• Thus, it is a super key of the sub relation R3.

Clearly, the intersection of the sub relations is a super key of one of the sub relations.

So, condition-03 is satisfied.

Thus, we conclude that the decomposition is lossless.

Decomposition of R'(A, B, C) into R1(A, B) and R2(B, C)-

To determine whether the decomposition is lossless or lossy,

We will check all the conditions one by one.

If any of the conditions fail, then the decomposition is lossy otherwise lossless.

Condition-01:

According to condition-01, union of both the sub relations must contain all the attributes of relation R’.

So, we have-

R1 ( A , B ) ∪ R2 ( B , C ) = R’ ( A , B , C )

Clearly, the union of the sub relations contains all the attributes of relation R’.

Thus, condition-01 is satisfied.

Condition-02:

According to condition-02, intersection of both the sub relations must not be null.

So, we have-

R1 ( A , B ) ∩ R2 ( B , C ) = B

Clearly, the intersection of the sub relations is not empty.

Thus, condition-02 is satisfied.

Condition-03:

According to condition-03, intersection of both the sub relations must be the super key of one of the
two sub relations or both.

So, we have-

R1 ( A , B ) ∩ R2 ( B , C ) = B

Now, the closure of attribute B is-

B+ = { B , C , D }

Now, we see-

• Attribute ‘B’ can not determine attribute ‘A’ of sub relation R1.
• Thus, it is not a super key of the sub relation R1.
• Attribute ‘B’ can determine all the attributes of sub relation R2.
• Thus, it is a super key of the sub relation R2.

Clearly, the intersection of the sub relations is a super key of one of the sub relations.

So, condition-03 is satisfied.

Thus, we conclude that the decomposition is lossless.

Example 3: Consider a relation R (A, B, C, D) with set of FDs: {AB → CD, D → A} that is decomposed into
R1 (A, D) and R2 (B, C, D).

Example 4: Consider a relation R (A, B, C, D, E, G) with set of FDs: {AB → C, AC → B, AD → E, B → D,
BC → A, E → G} that is decomposed into R1 (A, B, C), R2 (A, B, D, E), and R3 (E, G).

Example 5: Consider a relation R (A, B, C, D, E) with set of FDs: {A → BC, C → DE, D → E} that is
decomposed into R1 (A, B, C, D) and R2 (D, E).

Example 6: Consider a relation R (A, B, C, D, E, G) with set of FDs: {AB → C, AC → B, AD → E, B → D,
BC → A, E → G} that is decomposed into R1 (A, B), R2 (B, C), R3 (A, B, D, E) and R4 (E, G).

Normalization in DBMS:

In DBMS, database normalization is a process of making the database consistent by


• Reducing the redundancies
• Ensuring the integrity of data through lossless decomposition
Normalization is done through normal forms.

Normal Forms: Normalization is the process to minimize redundancies and anomalies from the data
base. Process of decomposition of bigger relation into smaller sub relation is known as normalization.
To achieve the normalization, we follow the various normal forms.

[Figure: a relation (A, B, C, D, E, F) decomposed into two sub tables (A, B, C) and (A, D, E, F).]

The standard normal forms used are-

1. First Normal Form (1 NF)
2. Second Normal Form (2 NF)
3. Third Normal Form (3 NF)
4. Boyce-Codd Normal Form (BCNF)

There exist several other normal forms even after BCNF but generally, we normalize till BCNF only.

First Normal Form (1 NF): A relation must have atomic values in each cell of a relation. In 1 NF
multivalued attributes and composite attributes are not allowed. By default, every relation is in 1NF.

A given relation is called in First Normal Form (1NF) if each cell of the table contains only an
atomic value.
OR

A given relation is called in the First Normal Form (1NF) if the attribute of every tuple is either
single-valued or a null value.

Example:

Student (S-No, FName, LName)

Relation for Mobile_Number: (S-No, Mobile_Number)

Example-

The following relation is not in 1NF-

Student_id Name Subjects

100 Akshay Computer Networks, Designing

101 Aman Database Management System

102 Anjali Automata, Compiler Design

Relation is not in 1NF.

However,
• This relation can be brought into 1NF.
• This can be done by rewriting the relation such that each cell of the table contains only
one value.

Student_id Name Subjects

100 Akshay Computer Networks

100 Akshay Designing

101 Aman Database Management System

102 Anjali Automata

102 Anjali Compiler Design

This relation is in First Normal Form (1NF).
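
The rewrite into 1NF shown above amounts to emitting one row per value of the multivalued attribute. A minimal sketch, assuming the un-normalized rows are held as Python tuples whose last element is a list of subjects (the names `students` and `to_1nf` are illustrative only):

```python
students = [
    (100, "Akshay", ["Computer Networks", "Designing"]),
    (101, "Aman", ["Database Management System"]),
    (102, "Anjali", ["Automata", "Compiler Design"]),
]


def to_1nf(rows):
    """Emit one row per (student, subject) pair so every cell holds a single value."""
    return [(sid, name, subject) for sid, name, subjects in rows for subject in subjects]


for row in to_1nf(students):
    print(row)
```

The result has five rows, and Student_id alone no longer identifies a row; the pair (Student_id, Subjects) does.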

NOTE-

• By default, every relation is in 1NF.


• This is because formal definition of a relation states that value of all the attributes
must be atomic.

Second Normal Form (2 NF): Second normal form is based on full functional dependency. No partial
dependency is allowed in case of non-prime attributes.

So, for 2NF: “All non-prime attributes must be fully functionally dependent on every candidate key”.

A given relation is called in Second Normal Form (2NF) if and only if-
1. Relation already exists in 1NF.
2. No partial dependency exists in the relation.

Partial Dependency

A partial dependency is a dependency where only some attributes of a candidate key determine
non-prime attribute(s).
OR
A partial dependency is a dependency where a portion of the candidate key or
incomplete candidate key determines non-prime attribute(s).

In other words,
A → B is called a partial dependency if and only if-
1. A is a proper subset of some candidate key
2. B is a non-prime attribute.
If any one condition fails, then it will not be a partial dependency.

NOTE-

• To avoid partial dependency, incomplete candidate key must not determine any
non-prime attribute.
• However, incomplete candidate key can determine prime attributes.

Example-

Consider a relation- R ( V , W , X , Y , Z ) with functional dependencies-


VW → XY
Y→V
WX → YZ

The possible candidate keys for this relation are-


VW , WX , WY

From here,
• Prime attributes = { V , W , X , Y }
• Non-prime attributes = { Z }

Now, if we observe the given dependencies-


• There is no partial dependency.
• This is because there exists no dependency where incomplete candidate key determines
any non-prime attribute.

Thus, we conclude that the given relation is in 2NF.
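
Candidate keys and prime attributes for a small schema like this one can be found by brute force: try attribute sets in increasing size and keep those whose closure covers the whole relation. The sketch below is only suitable for small examples (the search is exponential in the number of attributes); the names `closure` and `candidate_keys` and the string encoding are assumptions for this illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Smallest attribute sets whose closure is the whole schema."""
    attrs = sorted(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is only a super key
            if closure(candidate, fds) >= set(attrs):
                keys.append(candidate)
    return keys


fds = [("VW", "XY"), ("Y", "V"), ("WX", "YZ")]
keys = candidate_keys("VWXYZ", fds)
prime = set().union(*keys)

print(["".join(sorted(k)) for k in keys])   # ['VW', 'WX', 'WY']
print(sorted(set("VWXYZ") - prime))         # ['Z']
```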

Example 1: Consider a relation R (A, B, C) with FD set F: {AB → C, B → C}


A B C

Solution:
First of all, find the candidate keys:
A+ = {A}
B+ = {B, C}
C+ = {C}
AB+ = {A, B, C}
BC+ = {B, C}
AC+ = {A, C}
So, the CK = AB
Prime attributes = A, B
Non-prime attributes = C
So, according to 2NF, non-prime attributes should be fully functionally dependent on the CKs.
AB → C (Satisfies 2NF)
B → C (Partial Dependency)
So not in 2NF.

Now to convert the given relation into 2NF we try to eliminate the partial dependency B → C from the
relation.
B+ = {B, C}

R (A, B, C) is decomposed into:
R1 (A, B)
R2 (B, C) with B → C
AB+ = {A, B, C}, so AB → C is also preserved. The decomposition is lossless and
dependency preserving.

Example 2: Consider a relation R (A, B, C, D) with FD set F: {AB → C, B → D}. Find the highest normal
form for the given relation R.
Solution:
Find the candidate keys:
A+ = {A}
B+ = {B, D}
C+ = {C}
D+ = {D}

AB+ = {A, B, C, D}
BC+ = {B, C, D}
BD+ = {B, D}
AC+ = {A, C}
AD+ = {A, D}
CD+ = {C, D}
BCD+ = {B, C, D}
ACD+ = {A, C, D}
So, the CK = AB
Prime attributes = A, B
Non-prime attributes = C, D
1 NF: Every relation is by default in 1 NF.
2 NF: According to 2NF, non-prime attributes should be fully functionally dependent on the CKs.
B → D (Partial Dependency)
So not in 2NF. The highest normal form of R is therefore 1NF.

Now to convert the given relation into 2NF we try to eliminate the partial dependency B → D from the
relation.
B+ = {B, D}

R (A, B, C, D) is decomposed into:
R1 (A, B, C) with AB → C
R2 (B, D) with B → D
Lossless and dependency-preserving decomposition
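
The 2NF test used in these examples can be written as a search for partial dependencies: for every proper subset of a candidate key, check whether its closure reaches a non-prime attribute. This is a sketch under assumptions: the candidate keys are supplied by the caller, and the names `closure` and `partial_dependencies` are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(keys, attrs, fds):
    """Pairs (subset, attribute) where a proper subset of a candidate key
    determines a non-prime attribute. An empty list means the relation is in 2NF."""
    prime = set().union(*map(set, keys))
    non_prime = set(attrs) - prime
    found = set()
    for key in keys:
        for size in range(1, len(key)):          # proper, non-empty subsets only
            for subset in combinations(sorted(key), size):
                for attr in closure(subset, fds) & non_prime:
                    found.add(("".join(subset), attr))
    return sorted(found)


print(partial_dependencies(["AB"], "ABCD", [("AB", "C"), ("B", "D")]))   # [('B', 'D')]
print(partial_dependencies(["AB"], "ABC", [("AB", "C"), ("B", "C")]))    # [('B', 'C')]
```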

Example 3: Consider a relation R (A, B, C, D, E, F, G, H, I, J) with FD set F: {AB → C, BD → EF, AD → GH,
A → I, H → J}. Find the highest normal form for the given relation R.

Example 4: Consider a relation R (A, B, C, D, E) with FD set F: {A → B, B → E, C → D}. Find the highest
normal form for the given relation R.

Third Normal Form (3 NF): In 3 NF transitive dependencies are not allowed for the non-prime
attributes (non-prime attributes transitively depending on the Keys).
{Transitive rule: A → B, B → C then A → C}

A given relation is called in Third Normal Form (3NF) if and only if-
1. Relation already exists in 2NF.
2. No transitive dependency exists for non-prime attributes.

Transitive Dependency

A → B is called a transitive dependency if and only if-


1. A is not a super key.
2. B is a non-prime attribute.
If any one condition fails, then it is not a transitive dependency.

NOTE-

• Transitive dependency must not exist for non-prime attributes.


• However, transitive dependency can exist for prime attributes.

Formal Definition of 3 NF:


A relational Schema ‘R’ is called in Third Normal Form (3 NF) if and only if for every non-trivial FD X → Y
either
(1) X is a candidate key (or super key) or
(2) Y is a prime attribute
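
The formal definition can be checked FD by FD. The sketch below is an illustration that assumes the prime attributes are already known and passed in by the caller; the names `closure` and `violates_3nf` are hypothetical. It tests each given FD one right-side attribute at a time.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_3nf(attrs, prime, fds):
    """Return the pairs (X, A) for which X -> A breaks the 3NF condition."""
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= set(attrs)
        for a in rhs:
            trivial = a in lhs
            if not (trivial or is_superkey or a in prime):
                bad.append((lhs, a))
    return bad


# R (A, B, C) with {A -> B, B -> C}; the only candidate key is A.
print(violates_3nf("ABC", "A", [("A", "B"), ("B", "C")]))            # [('B', 'C')]

# R (A, B, C, D, E) with {A -> BC, CD -> E, B -> D, E -> A}; every attribute is prime.
print(violates_3nf("ABCDE", "ABCDE",
                   [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]))  # []
```

An empty list means no given FD violates the definition, so the relation is in 3NF.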

Example-

Consider a relation- R ( A , B , C , D , E ) with functional dependencies-


A → BC
CD → E
B→D
E→A

The possible candidate keys for this relation are-


A , E , CD , BC

From here,

• Prime attributes = { A , B , C , D , E }
• There are no non-prime attributes

Now,
• It is clear that there are no non-prime attributes in the relation.
• In other words, all the attributes of relation are prime attributes.
• Thus, all the attributes on RHS of each functional dependency are prime attributes.

Thus, we conclude that the given relation is in 3NF.

Example 1: Consider a relation R (A, B, C) with FD set F: {A → B, B → C}. Find the highest normal form
for the given relation R.
Solution:
Find the candidate keys:
A+ = {A, B, C}

So, the CK = A
Prime attributes = A
Non-prime attributes = B, C
1 NF: Every relation is by default in 1 NF.
2 NF: According to 2NF, non-prime attributes should be fully functionally dependent on the CKs.
 No partial dependencies, so R is in 2NF.
3 NF: C (non-prime attribute) is transitively dependent on the CK ‘A’.
 B → C is a transitive dependency, so R is not in 3NF. The highest normal form of R is 2NF.

Now to convert the given relation into 3NF we try to eliminate the transitive dependency B → C from
the relation.
B+ = {B, C}
Create a new table with attributes B, C and remaining A, B will be in another table.

R (A, B, C)
R1 (A, B) R2 (B, C)
A→B B→C
Lossless and dependency preserving decomposition

These two tables are now in 3 NF.

Example 2: Consider a relation R (A, B, C, D, E) with FD set F: {AB → C, B → D, D → E}. Find the highest
normal form for the given relation R.
Solution:
Find the candidate keys:

AB+ = {A, B, C, D, E}

So, the CK = AB
Prime attributes = A, B
Non-prime attributes = C, D, E
1 NF: Every relation is by default in 1 NF.
2 NF: According to 2NF, non-prime attributes should be fully functionally dependent on the CKs.
 B → D is a partial dependency, so R is not in 2NF. The highest normal form of R is therefore 1NF.
Now to convert the given relation into 2NF we try to eliminate the partial dependency B → D from the
relation.
B+ = {B, D, E}
R (A, B, C, D, E) is decomposed into:
R1 (A, B, C) with AB → C
R2 (B, D, E) with B → D, D → E
Now both the decomposed relations are in 2 NF
Lossless and dependency preserving decomposition

3 NF:

R1 (A, B, C) with AB → C has AB as its key, so it is also in 3 NF.
R2 (B, D, E) with B → D, D → E has B as its key. Here, D → E is a transitive dependency (D is not a
super key and E is a non-prime attribute). So, R2 is not in 3 NF.
Now to convert R2 into 3NF we try to eliminate the transitive dependency D → E from it.
D+ = {D, E}
Create a new table with attributes D, E; the remaining attributes B, D stay in R2.

R1 (A, B, C) with AB → C (In 3 NF)
R21 (D, E) with D → E (In 3 NF)
R2 (B, D) with B → D (In 3 NF)
Decomposition is lossless and dependency preserving

These three tables are now in 3 NF.

BCNF (Boyce Codd Normal Form): A relational schema ‘R’ is in BCNF if whenever a nontrivial FD X → Y
holds in R, then X is a super key of R i.e., determinant of all FDs must be super key.

A given relation is called in BCNF if and only if-


1. Relation already exists in 3NF.
2. For each non-trivial functional dependency A → B, A is a super key of the relation.

Example-

Consider a relation- R ( A , B , C ) with the functional dependencies-


A→B
B→C
C→A

The possible candidate keys for this relation are-


A,B,C

Now, we can observe that the LHS of each given functional dependency is a candidate key, and hence a super key.
Thus, we conclude that the given relation is in BCNF.

Example 1: Consider Relation R(A, B, C), FDs : {A → B, B → C, C → A}


Solution:
(A)+ = {A, B, C}
(B)+ = {A, B, C}
(C)+ = {A, B, C}
CKs = A, B, and C
Prime Attributes = A, B, C
Non-Prime Attributes = 𝛟
Relation is in BCNF.

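• BCNF removes all redundancy that arises from functional dependencies; redundancy caused by other constraints, such as multivalued dependencies, can still remain.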

Example 2: Consider Relation R(A, B, C), FDs : {AB → C, C → B}


Solution:
(AB)+ = {A, B, C}
(AC)+ = {A, B, C}
CKs = AB, and AC
Prime Attributes = A, B, C

Non-Prime Attributes = 𝛟
Relation is in 3NF but not in BCNF because of C → B.
In C → B, C is not super key.

(C)+ = {C, B}
R (A, B, C)
R1 (A, C) R2 (C, B)
C→B
This decomposition is lossless but not dependency-preserving.
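
The split used in this example (take the violating FD X → Y, put X+ in one relation and keep X together with the remaining attributes in the other) can be sketched as follows. This is a single-step illustration, not a complete BCNF algorithm: it only inspects the FDs as given, and the sub-relations it returns would have to be re-checked against their own projected FDs. The names `closure` and `bcnf_split` are assumptions.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_split(attrs, fds):
    """Return None if no given FD violates BCNF, else the two schemas from one split."""
    attrs = set(attrs)
    for lhs, rhs in fds:
        x_plus = closure(lhs, fds) & attrs
        nontrivial = not set(rhs) <= set(lhs)
        if nontrivial and x_plus != attrs:        # lhs is not a super key
            r2 = x_plus                           # lhs together with everything it determines
            r1 = (attrs - x_plus) | set(lhs)      # the rest, keeping lhs as the join attribute
            return "".join(sorted(r1)), "".join(sorted(r2))
    return None


print(bcnf_split("ABC", [("AB", "C"), ("C", "B")]))               # ('AC', 'BC')
print(bcnf_split("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))    # None
```

Because the common attributes of the two parts are exactly the left side X, which is a key of the part holding X+, each such split is lossless.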

Remember the following diagram which implies-


• A relation in BCNF will surely be in all other normal forms.
• A relation in 3NF will surely be in 2NF and 1NF.
• A relation in 2NF will surely be in 1NF.

The above diagram also implies-


• BCNF is stricter than 3NF.
• 3NF is stricter than 2NF.
• 2NF is stricter than 1NF.

Notes:
• In a relational database, a relation is always in First Normal Form (1NF) at least.
• Singleton keys are those that consist of only a single attribute.
• If all the candidate keys of a relation are singleton candidate keys, then it will always be in
2NF at least.

• This is because no partial dependency can exist.
• A singleton candidate key has no non-empty proper subset.
• Thus, an incomplete candidate key will never determine a non-prime attribute.
• If all the attributes of a relation are prime attributes, then it will always be in 2NF at least.
• This is because no partial dependency can exist.
• Since there are no non-prime attributes, there will be no functional dependency which
determines a non-prime attribute.
• If all the attributes of a relation are prime attributes, then it will always be in 3NF at least.
• This is because no transitive dependency can exist for non-prime
attributes.
• Third Normal Form (3NF) is considered adequate for normal relational database design.
• BCNF is free from redundancies arising out of functional dependencies (zero redundancy due to FDs).

Note:
(1) A decomposition into 3NF that is both lossless and dependency preserving is always possible.
(2) A decomposition into BCNF can always be made lossless, but it may or may not be
dependency preserving.

Example: Consider Relation R(A, B, C, D, E), FDs : {A → BC, CD → E, B → D, E → A}


Solution:
(A)+ = {A, B, C, D, E}
(E)+ = {A, B, C, D, E}
(CD)+ = {A, B, C, D, E}
(BC)+ = {A, B, C, D, E}
CKs = A, E, CD, BC
Prime Attributes = A, B, C, D, E
Non-Prime Attributes = ∅
The relation is in 3NF (every attribute is prime), but not in BCNF because of B → D.
In B → D, B is not a super key.
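
The closures and candidate keys listed above can be cross-checked mechanically. The following is a minimal Python sketch, assuming single-letter attribute names and FDs written as (left, right) string pairs; the helper names `closure` and `candidate_keys` are illustrative and do not come from these notes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is already covered.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys

R = "ABCDE"
F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]

keys = candidate_keys(R, F)
print(keys)                     # ['A', 'E', 'BC', 'CD']
print(sorted(closure("B", F)))  # ['B', 'D'], so B is not a super key
```

The key search tries every attribute subset, so it is only practical for small textbook relations like this one.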

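(B)+ = {B, D}, so R(A, B, C, D, E) is split on B → D into:

R1(A, B, C, E) with FDs A → BC, E → A
R2(B, D) with FD B → D

The decomposition is in BCNF.
It is lossless but not dependency-preserving, because CD → E cannot be derived from the dependencies that hold in R1 and R2.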

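Adding a third relation for the lost dependency CD → E gives:

R(A, B, C, D, E)

R1(A, B, C, E) with FDs A → BC, E → A
R2(B, D) with FD B → D
R3(C, D, E) with FD CD → E

In R3, E → CD also holds (because (E)+ contains every attribute), so both CD and E are keys of R3 and R3 is in BCNF.
Now the decomposition is in BCNF, lossless, and dependency-preserving.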
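
Dependency preservation for both decompositions of this example can be tested without listing every projected FD. The sketch below is one possible way to do it in Python, assuming single-letter attributes and FDs as (left, right) string pairs; `is_preserved` is a hypothetical helper name. For an FD X → Y it repeatedly grows X using only what each sub-relation can see, and the FD is preserved if Y ends up inside the result.

```python
def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_preserved(lhs, rhs, schemas, fds):
    """Check whether lhs -> rhs follows from the FDs projected onto `schemas`."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            s = set(schema)
            # Attributes derivable while staying inside this one sub-relation.
            gained = (closure(result & s, fds) & s) - result
            if gained:
                result |= gained
                changed = True
    return set(rhs) <= result

F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
two_way = ["ABCE", "BD"]
three_way = ["ABCE", "BD", "CDE"]

lost_two = [fd for fd in F if not is_preserved(fd[0], fd[1], two_way, F)]
lost_three = [fd for fd in F if not is_preserved(fd[0], fd[1], three_way, F)]
print(lost_two)    # [('CD', 'E')]
print(lost_three)  # []
```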

Example: Consider Relation R(A, B, C, D), FDs : {AB → CD, D → A}


Solution:
(AB)+ = {A, B, C, D}
(BD)+ = {A, B, D, C}

CKs = AB & BD
Prime Attributes = A, B, D
Non-Prime Attributes = C
1 NF: Every relation is by default in 1 NF.
2 NF: No partial dependency for non-prime attributes. So, Relation is in 2 NF.
3 NF: No transitive dependency for non-prime attributes. So, relation is in 3 NF.
BCNF: Relation is in 3NF, but not in BCNF because of D → A
In D → A, D is not a super key

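(D)+ = {D, A}, so R(A, B, C, D) is split on D → A into:

R1(B, C, D) with FD BD → C
R2(A, D) with FD D → A

The decomposition is in BCNF and lossless, but not dependency-preserving, because AB → CD cannot be derived from the dependencies that hold in R1 and R2.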
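
Whether this two-way split is lossless can be checked with the usual binary test: the common attributes must determine all of R1 or all of R2. The following Python sketch assumes single-letter attributes and FDs as (left, right) string pairs; `is_lossless_binary` is an illustrative name, and the second split (ABC, AD) is a made-up alternative used only for contrast.

```python
def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """A two-way split is lossless if (R1 ∩ R2)+ covers R1 or covers R2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

F = [("AB", "CD"), ("D", "A")]

print(is_lossless_binary("BCD", "AD", F))  # True: common attribute D determines R2 = {A, D}
print(is_lossless_binary("ABC", "AD", F))  # False: common attribute A determines nothing else
```

This test applies only to decompositions into exactly two relations; splits into more relations need a different check.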

Example: Consider Relation R(A, B, C, D), FDs : {A → B, B → A, B → C, C → D}


Solution: (A)+ = {A, B, C, D}
(B)+ = {B, C, A, D}
(C)+ = {C, D}
(D)+ = {D}
CKs = A & B
Prime Attributes = A, B

Non-Prime Attributes = C, D
1 NF: every relation is by default in 1 NF.
2 NF: Both candidate keys are single attributes, so no partial dependency is possible. So, the relation is in 2 NF.
3 NF: Not in 3NF because of C → D (C is not a super key and D is a non-prime attribute).

(C)+ = {C, D}, so R(A, B, C, D) is split on C → D into:

R1(C, D) with FD C → D
R2(A, B, C) with FDs A → B, B → A, B → C

The decomposition is in BCNF, dependency-preserving, and lossless.
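
The 3NF test used in this example (for every non-trivial X → A, either X is a super key or A is prime) can be sketched in Python as follows. This assumes single-letter attributes and FDs as (left, right) string pairs; the function names are illustrative only. Checking just the given FDs is sufficient here because each check runs on a relation together with its own FD set.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys

def third_nf_violations(relation, fds):
    """Return (X, A) pairs where X is not a super key and A is non-prime."""
    prime = set("".join(candidate_keys(relation, fds)))
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(relation)
        for attr in sorted(set(rhs) - set(lhs)):  # skip the trivial part
            if not is_superkey and attr not in prime:
                violations.append((lhs, attr))
    return violations

R = "ABCD"
F = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D")]

print(candidate_keys(R, F))       # ['A', 'B']
print(third_nf_violations(R, F))  # [('C', 'D')]
# The two relations produced by the split show no violations:
print(third_nf_violations("CD", [("C", "D")]))                           # []
print(third_nf_violations("ABC", [("A", "B"), ("B", "A"), ("B", "C")]))  # []
```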

Example: Consider Relation R (A, B, C, D, E), FDs: {AB → C, C → D, D → E, E → A}


Solution:
(AB)+ = {A, B, C, D, E}
(BC)+ = {B, C, D, E, A}
(BD)+ = {B, D, E, A, C}
(BE)+ = {E, A, B, C, D}
CKs = AB, BC, BD, BE
Prime Attributes = A, B, C, D, E
Relation R is in 3NF (because there are no non-prime attributes),
but not in BCNF, because:
C → D, C is not a super key.
D → E, D is not a super key.
E → A, E is not a super key.

Conversion into BCNF:
(C)+ = {C, D, E, A}, (D)+ = {D, E, A}, (E)+ = {E, A}

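R(A, B, C, D, E) is split using (C)+ into:

R1(C, D, E, A) with FDs C → D, D → E, E → A
R2(B, C)

R1 is not in BCNF. Using (D)+ = {D, E, A}, R1 is split into:

R11(D, E, A) with FDs D → E, E → A
R12(C, D) with FD C → D

R11 is not in BCNF. Using (E)+ = {E, A}, R11 is split into:

R111(E, A)
R112(D, E)

The decomposition {R111, R112, R12, R2} is in BCNF and lossless, but not dependency-preserving.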
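
The repeated splitting above follows a general recipe: find some X whose closure within the current relation is neither X itself nor the whole relation, split into X+ and X together with the remaining attributes, and repeat on both parts. A minimal Python sketch of that recipe is given below, assuming single-letter attributes and FDs as (left, right) string pairs; `bcnf_decompose` is a hypothetical name, and the result depends on which violation is picked first (here, the smallest left side in alphabetical order).

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_decompose(schema, fds):
    """Split `schema` on the first BCNF violation found, then recurse."""
    schema = set(schema)
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            x_plus = closure(x, fds) & schema  # closure restricted to this schema
            # Violation: X determines something new but is not a super key here.
            if x_plus != x and x_plus != schema:
                left = x_plus                  # X together with all it determines
                right = x | (schema - x_plus)  # X together with the rest
                return bcnf_decompose(left, fds) + bcnf_decompose(right, fds)
    return ["".join(sorted(schema))]

F = [("AB", "C"), ("C", "D"), ("D", "E"), ("E", "A")]
result = bcnf_decompose("ABCDE", F)
print(result)  # ['AE', 'DE', 'CD', 'BC']
```

The search over attribute subsets is exponential, which is acceptable for small classroom examples but not for wide schemas.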

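To preserve AB → C, the relation on (B, C) is replaced by a relation on (A, B, C):

R(A, B, C, D, E)

R1(E, A) with FD E → A
R2(D, E) with FD D → E
R3(C, D) with FD C → D
R4(A, B, C) with FD AB → C

Now the decomposition is lossless and dependency-preserving. Note that R4 is only in 3NF, not in BCNF: C → A holds in R4 (from C → D, D → E, E → A), and C is not a super key of R4.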
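
The BCNF status of each relation in this final decomposition can be checked by projecting the dependencies onto it. The Python sketch below does this by brute force, assuming single-letter attributes and FDs as (left, right) string pairs; `bcnf_violations` is an illustrative helper name. It reports a projected dependency C → A on (A, B, C) whose left side is not a super key of that relation.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """List projected FDs X -> Y on `schema` where X is not a super key of it."""
    schema = set(schema)
    found = []
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            x_plus = closure(x, fds) & schema  # what X determines inside schema
            if x_plus != x and x_plus != schema:
                found.append(("".join(combo), "".join(sorted(x_plus - x))))
    return found

F = [("AB", "C"), ("C", "D"), ("D", "E"), ("E", "A")]
report = {s: bcnf_violations(s, F) for s in ["AE", "DE", "CD", "ABC"]}
for schema, violations in report.items():
    print(schema, violations)
# AE []
# DE []
# CD []
# ABC [('C', 'A')]
```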
