Ch-08 Relational Database Design
On
Chapter-8
Relational Database Design
Prepared By:
Kunal Anand, Asst. Professor
SCE, KIIT, DU, Bhubaneswar-24
Lecture Outcome:
• After the completion of this chapter, the students
will be able to:
– Identify the problems associated with bad database
design
– Explain Functional Dependency
– List different types of dependencies
– Explain Armstrong Inference Axioms
– Determine closure of attribute / attribute set
– Determine closure of FDs
– Determine equivalence of FDs
– Identify redundant FDs
– Determine Canonical or Minimal Cover
Organization of this Lecture:
Introduction to Database Design
• The goal of relational database design is to generate a set of
relational schemas that allow us to store information without
unnecessary redundancy, yet also allow us to retrieve
information easily.
• First, fully characterize the data requirements of the
prospective database users, which usually involves textual
descriptions.
• Next, choose the ER model to translate these requirements into a
conceptual schema of the database.
• In the logical design phase, map the high level conceptual
schema onto the implementation data model of the database
system that will be used. The implementation data model is
typically the Relational data model.
contd..
• Finally, use the resulting system-specific database schema in
the subsequent physical design phase, in which the physical
features of the database are specified.
• Database design may be performed using the following two
approaches:
– Bottom-up approach: It is also known as design by synthesis. It
considers the basic relationships among individual attributes to
construct the schemas. Its drawback is that it needs a large
number of binary relationships among the attributes as the starting
point, which is extremely difficult to obtain.
– Top-down approach: It is also known as design by analysis. It
starts with a number of groupings of attributes into relations that
exist naturally, e.g. an invoice, a form, or a report. The relations are
then analyzed individually and collectively, leading to further
decomposition until all desirable properties are met.
Bad Database Design/Concepts of Anomalies
• Anomalies are problems that can occur in poorly planned, un-normalized
databases where all the data is stored in one table.
• These anomalies affect the process of inserting, deleting and
updating data in the relations. The intention of relational
database theory is to prevent anomalies from occurring in a
database. Consider the Student database.
Name     Course  Phone_no  Major        Prof    Grade
Mahesh   353     1234      Comp sc      Alok    A
Nitish   329     2435      Chemistry    Pratap  B
Mahesh   328     1234      Comp sc      Samuel  B
Harish   456     4665      Physics      James   A
Pranshu  293     4437      Decision sc  Sachin  C
Prateek  491     8788      Math         Saurav  B
Prateek  356     8788      Math         Sunil   In prog
Mahesh   492     1234      Comp sc      Paresh  In prog
Sumit    379     4575      English      Rakesh  C
contd..
• Insertion Anomaly: It is the anomaly in which the user cannot
insert information about one entity until he/she has additional
information about another entity.
– Ex: Suppose a new professor has joined but has not yet been assigned a
course. We cannot record the new professor's details in our
database because no course has been assigned to him/her.
Functional Dependency (FD)
• Functional Dependency is the building block of normalization
principles.
• A set of attributes A in a relation schema R functionally determines
a set of attributes B in R if each value a1 of A is associated with
exactly one value b1 of B, i.e. if t1 and t2 are any two tuples in
the relation R where t1(A) = t2(A), then we must have t1(B) =
t2(B).
• The symbolic expression of this FD is:
– A→B;
– where A(LHS of FD) is known as the determinant
– B(RHS of FD) is known as the dependent
• A functional dependency is a relationship between two sets of
attributes of a relation. It typically exists between the primary
key and the non-key attributes within a table.
contd..
• From Student schema, we can infer that Name→Phone_no
because all tuples of Student with a given Name value also
have the same Phone_no value.
contd..
•FD is a constraint between two sets of attributes in a relation
from a database.
Example-1
Testing of Functional Dependency (does the FD hold or not?)
Q. Which of the following FDs hold in the relation instance below?
1. A -> BC (holds)
2. DE -> C (holds)
3. C -> DE (does not hold)
4. BC -> A (holds)
A B C D E
a 2 3 4 5
2 a 3 4 5
a 2 3 6 5
a 2 3 6 6
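The test used in this example can be sketched in code: an FD holds in an instance when no two tuples agree on the left side but differ on the right side. This is only a minimal sketch; the function name `fd_holds` and the row representation (one dict per tuple, single-letter attribute names) are assumptions made for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in the given relation instance.

    rows: list of dicts mapping attribute name -> value
    lhs, rhs: iterables of attribute names (a string of single letters works)
    """
    seen = {}  # LHS value combination -> RHS value combination first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


# The instance of Example-1, one string per tuple (columns A B C D E).
example1 = [dict(zip("ABCDE", line.split())) for line in [
    "a 2 3 4 5",
    "2 a 3 4 5",
    "a 2 3 6 5",
    "a 2 3 6 6",
]]

for lhs, rhs in [("A", "BC"), ("DE", "C"), ("C", "DE"), ("BC", "A")]:
    print(lhs, "->", rhs, "holds" if fd_holds(example1, lhs, rhs) else "does not hold")
```

Note that such a check only shows whether an FD is violated by one particular instance; it cannot prove that the FD is a constraint of the schema.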
Example-2
Testing of Functional Dependency (does the FD hold or not?)
Q. In which of the following options do both FDs hold in the relation instance below?
1. XY -> Z and Z -> Y
2. YZ -> X and Y -> Z (both hold; correct option)
3. YZ -> X and X -> Z
4. XZ -> Y and Y -> Z
X Y Z
1 4 2
1 5 3
1 6 3
3 2 2
Example-3
Testing of Functional Dependency (does the FD hold or not?)
Q. In which of the following options do both FDs hold in the relation instance below?
1. A -> B and BC -> A
2. C -> B and CA -> B
3. B -> C and AB -> C (both hold; correct option)
4. A -> C and BC -> A
A B C
1 2 4
3 5 4
3 7 2
Trivial vs Non-Trivial FD
• Trivial FDs:
– A functional dependency X→Y is a trivial functional dependency if
Y is a subset of X (Y ⊆ X).
– For example, {Name, Course}→Course. If two records have the
same values on both the Name and Course attributes, then they
obviously have the same Course. Again, A→A is a trivial FD as
here the dependent is equal to determinant.
– Trivial dependencies hold for all relation instances
• Non-Trivial FDs:
– A functional dependency X→Y is called non-trivial if Y is not a
subset of X. If, in addition, Y∩X = Φ, it is called completely non-trivial.
– For example, Prof→Grade, as Grade is not a subset of Prof.
– Non-trivial FDs are given implicitly in the form of constraints
when designing a database.
Multivalued Dependency
▪ A multivalued dependency occurs when there is more than one independent
multivalued attribute in a table.
▪ Example: Consider a bike manufacturing company which produces two
colors (Black and Red) of each model every year.
bike_model mfg_year color
M1001 2007 Black
M1001 2007 Red
M2012 2008 Black
M2012 2008 Red
M2222 2009 Black
M2222 2009 Red
• Here columns mfg_year and color are independent of each other and
dependent on bike_model.
• In this case these two columns are said to be multivalued dependent on
bike_model. These dependencies can be represented like this:
bike_model ->> mfg_year; bike_model ->> color
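A multivalued dependency can also be tested on an instance. The sketch below uses the usual instance-level condition: X ->> Y holds when, for every X value, the (Y, rest) combinations that occur are exactly all pairings of the Y values with the values of the remaining attributes. The name `mvd_holds` and the `broken` relation are hypothetical and only serve the illustration.

```python
from itertools import product


def mvd_holds(rows, attrs, x, y):
    """Check X ->> Y on a relation instance given as a list of dicts."""
    z = [a for a in attrs if a not in x and a not in y]  # the remaining attributes
    groups = {}
    for row in rows:
        xv = tuple(row[a] for a in x)
        yv = tuple(row[a] for a in y)
        zv = tuple(row[a] for a in z)
        groups.setdefault(xv, set()).add((yv, zv))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # Y and Z are independent only if every Y value occurs with every Z value.
        if pairs != set(product(ys, zs)):
            return False
    return True


ATTRS = ["bike_model", "mfg_year", "color"]
bikes = [dict(zip(ATTRS, r)) for r in [
    ("M1001", 2007, "Black"), ("M1001", 2007, "Red"),
    ("M2012", 2008, "Black"), ("M2012", 2008, "Red"),
    ("M2222", 2009, "Black"), ("M2222", 2009, "Red"),
]]
# Hypothetical counter-example: two years and two colors, but only two of the four pairings.
broken = [dict(zip(ATTRS, r)) for r in [
    ("M1", 2007, "Black"), ("M1", 2008, "Red"),
]]

print(mvd_holds(bikes, ATTRS, ["bike_model"], ["mfg_year"]))
print(mvd_holds(broken, ATTRS, ["bike_model"], ["mfg_year"]))
```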
Transitive Dependency
A functional dependency X→Z is said to be transitive if it is
indirectly formed by two functional dependencies X→Y and Y→Z.
Armstrong’s Inference Axioms
The inference axioms or rules allow users to infer the FDs that
are satisfied by a relation.
✔ Transitivity Rule:
✔ If X→Y and Y→Z, then X→Z.
✔ Ex: Course→Name and Name→Phone_no functional
dependencies are present, therefore
Course→Phone_no
Contd..
✔ Decomposition Rule:
✔ If X→{Y, Z}, then X→Y and X→Z.
✔ Ex: if Prof→{Grade, Course}, then this FD can
be decomposed as Prof→Grade and
Prof→Course
contd..
✔ Composition Rule:
✔ If X→Y and Z→W, then {X, Z}→{Y,W}.
✔ Ex: if Prof→Grade and Name→Phone_no, then
the FDs can be composed as {Prof, Name}→
{Grade, Phone_no}
✔ Pseudotransitivity Rule:
✔ If X→Y and {Y, W}→Z, then {X,W}→Z.
✔ Ex: if Prof→Grade and {Grade, Major}→Course,
then the FD {Prof, Major}→Course is valid
An Example
Example: Consider the relation E = (P, Q, R, S, T, U) having the following set of
Functional Dependencies (FDs):
P → Q    P → R
QR → S   Q → T
QR → U   PR → U
Some FDs that can be inferred from this set using the axioms are:
1. P → T (Transitivity Rule on P → Q and Q → T)
2. PR → S (Pseudotransitivity Rule on P → Q and QR → S)
3. QR → SU (Union Rule on QR → S and QR → U)
4. PR → SU (Pseudotransitivity and Union Rules)
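One possible derivation of item 4, written out step by step (other derivations exist, for instance using the given PR → U directly in the last step):

```latex
\begin{align*}
P \to Q,\ QR \to S &\;\Rightarrow\; PR \to S  && \text{(pseudotransitivity)}\\
P \to Q,\ QR \to U &\;\Rightarrow\; PR \to U  && \text{(pseudotransitivity)}\\
PR \to S,\ PR \to U &\;\Rightarrow\; PR \to SU && \text{(union)}
\end{align*}
```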
Logical Implication
Given a relation schema R and a set of functional
dependencies F, let X→Y be an FD that is not in F.
F is said to logically imply X→Y if every
relation r on the relation schema R that satisfies the
FDs in F also satisfies X→Y.
"F logically implies X→Y" is written as: F |= X→Y
Ex: Let R = (A, B, C, D) and F = {A→B, A→C, BC→D}. Then F |= A→D,
because A→B and A→C give A→BC (union), and A→BC with BC→D
gives A→D (transitivity).
Exercise: Given F = {A→B, C→D} with C ⊆ B, show that F |= A→D.
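For the case F = {A→B, C→D} with C ⊆ B, one way to carry out the proof is sketched below. It relies on the reflexivity axiom (if Y ⊆ X then X→Y), which is a standard Armstrong axiom.

```latex
\begin{align*}
C \subseteq B &\;\Rightarrow\; B \to C && \text{(reflexivity)}\\
A \to B,\ B \to C &\;\Rightarrow\; A \to C && \text{(transitivity)}\\
A \to C,\ C \to D &\;\Rightarrow\; A \to D && \text{(transitivity)}
\end{align*}
```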
Closure of Attribute Set
The attribute closure of an attribute set 'A' is the set
of attributes that can be functionally determined from it.
It is denoted by A+.
(A)+ = A
= AB (using A->B)
contd..
• Ex-2: R(ABCDEF) and the dependencies are {A->B, C->DE,
AC->F, D->AF, E->CF}. Find (D)+ and (DE)+.
• Ans:
(D)+ = D (by reflexivity)
= ADF (using D->AF)
= ABDF (using A->B)
(DE)+ = DE (by reflexivity)
= ADEF (using D->AF)
= ACDEF (using E->CF)
= ABCDEF (using A->B)
Exercise
• Consider the relation R(A,B,C,D,E,F) and set of FDs
S={AB->C, BC->AD, D->E, CF->B}. Find (AB)+.
– Ans: (A,B,C,D,E)
• Consider the relation R(A,B,C,D,E,F,G) and set of FDs
S={A->B, BC->DE, AEG->G}. Find (AC)+.
– Ans: (A,C,B,D,E)
• Consider the relation R(A,B,C,D,E) and set of FDs
S={A->BC, CD->E, B->D, E->A}. Find (B)+.
– Ans: (B,D)
• Consider the relation R(A,B,C,D,E,F,G,H) and set of FDs
S={A->BC, CD->E, E->C, D->AEH, ABH->BD, DH->BC,
AEG->G}. Is BCD->H a valid FD?
– Ans: Yes, BCD->H is a valid FD.
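The closure computation used in these exercises can be sketched as a fixed-point loop: keep adding the right side of every FD whose left side is already contained in the result, until nothing changes. The names `closure` and `parse` are illustrative helpers, and the sketch assumes single-letter attribute names as in the exercises.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`.

    fds: list of (lhs, rhs) pairs, each an iterable of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply the FD if its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def parse(text):
    """Turn 'AB->C, BC->AD' into [('AB', 'C'), ('BC', 'AD')]."""
    return [tuple(part.strip() for part in fd.split("->")) for fd in text.split(",")]


print(sorted(closure("AB", parse("AB->C, BC->AD, D->E, CF->B"))))
print(sorted(closure("AC", parse("A->B, BC->DE, AEG->G"))))
print(sorted(closure("B", parse("A->BC, CD->E, B->D, E->A"))))

# BCD->H is valid exactly when H is in (BCD)+.
s4 = parse("A->BC, CD->E, E->C, D->AEH, ABH->BD, DH->BC, AEG->G")
print("H" in closure("BCD", s4))
```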
Closure of set of Attributes
Given a set of attributes X and a set of functional dependencies
F, the closure of X under F, denoted
X+, is the set of attributes A that can be derived from X by
applying Armstrong's Inference Axioms to the functional
dependencies of F.
The closure of a non-empty X is always a non-empty set, because X ⊆ X+.
Consider the relation R = (A, B, C, D) and the dependencies
F = {A→C, B→D}, then the closure of set of attributes is
{A}+ = {A, C},
{B}+ = {B, D},
{C}+={C},
{D}+={D},
contd..
{A, B}+ = {A, B, C, D},
{A, C}+ = {A, C},
{A, D}+ = {A, C, D},
{B, C}+ = {B, C, D},
{B, D}+ = {B, D},
{C, D}+ = {C, D},
{A, B, C}+ = {A, B, C, D},
{A, B, D}+ = {A, B, C, D},
{A, C, D}+ = {A, C, D},
{B, C, D}+ = {B, C, D},
{A, B, C, D}+ = {A, B, C, D}
contd..
Uses of Attribute Closure:
✔ Testing for a key:
✔ To test whether X is a key or not, X+ is computed.
✔ X is a super key iff X+ contains all the
attributes of R.
✔ X is a candidate key if it is a super key and none of its
proper subsets is a super key.
✔ Testing functional dependencies: To check whether a
functional dependency X→Y holds or not, just check whether
Y ⊆ X+.
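Both uses can be sketched with a few lines of code. The sketch below tests super keys with the closure and finds candidate keys by trying attribute subsets in order of increasing size; the function names are illustrative, and the brute-force search is only practical for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """X is a super key iff X+ contains every attribute of the schema."""
    return closure(attrs, fds) >= set(schema)


def candidate_keys(schema, fds):
    """All minimal super keys, found by trying subsets from small to large."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it cannot be minimal
            if is_superkey(candidate, schema, fds):
                keys.append(candidate)
    return keys


# R = (A, B, C, D) with F = {A->C, B->D}
fds = [("A", "C"), ("B", "D")]
print(is_superkey("ABC", "ABCD", fds))
print(candidate_keys("ABCD", fds))
```

For this F the only candidate key is {A, B}, which agrees with {A, B}+ = {A, B, C, D} being the smallest closure that covers R.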
To generate all FDs that can be derived from F (the closure F+ of F), the steps are:
✔ First, apply the inference axioms to all single attributes
and use the FDs of F whenever it is applicable.
✔ Second, apply the inference axioms to all combinations of
two attributes and use the functional dependencies of F
whenever it is applicable.
✔ Next apply the inference axioms to all combinations of
three attributes and use the FDs of F when necessary.
✔ Proceed in this manner for as many different attributes as
there are in F.
An Example
Let R=(A, B, C) and F={A→B, B→C}. Find the closure of
F. Also, find the SK and CK.
Ans:
S-1: When none of the three attributes is included, we
have only Φ->Φ, i.e. 1 FD.
S-2: When only A is included, (A)+ = (A,B,C), i.e. 2^3 = 8
FDs exist, as listed below:
A->A; A->B; A->C
A->AB; A->BC; A->AC
A->ABC; A->Φ
A is a super key as it determines all attributes of R.
S-3: When only B is included, (B)+ = (B,C), i.e. 2^2 = 4
FDs exist, as listed below:
B->B; B->BC
B->C; B->Φ
S-4: When only C is included, (C)+ = (C), i.e. 2^1 = 2 FDs
exist, as listed below:
C->C; C->Φ
S-5: When only (AB) is included, (AB)+ = (ABC), i.e.
2^3 = 8 FDs exist, as listed below:
AB->A; AB->B; AB->C
AB->AB; AB->BC; AB->AC
AB->ABC; AB->Φ
AB is a super key as it determines all attributes of R.
S-6: When only (BC) is included, (BC)+ = (BC), i.e. 2^2 = 4
FDs exist, as listed below:
BC->B; BC->C
BC->BC; BC->Φ
S-7: When only (AC) is included, (AC)+ = (ABC), i.e.
2^3 = 8 FDs exist, as listed below:
AC->A; AC->B; AC->C
AC->AB; AC->BC; AC->AC
AC->ABC; AC->Φ
AC is a super key as it determines all attributes of R.
S-8: When (ABC) is included, (ABC)+ = (ABC), i.e.
2^3 = 8 FDs exist, as listed below:
ABC->A; ABC->B; ABC->C
ABC->AB; ABC->BC; ABC->AC
ABC->ABC; ABC->Φ
ABC is a super key as it determines all attributes of R.
S-9: F+ contains 1 + 8 + 4 + 2 + 8 + 4 + 8 + 8 = 43 FDs.
SK = {A, AB, AC, ABC}; CK = {A}
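The count can be cross-checked mechanically: for every subset X of the schema, every subset Y of X+ gives one FD X->Y of F+. The following is a small sketch under the same convention as the example (the empty set Φ counts as a valid left and right side); `subsets` and `fd_closure` are illustrative names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def subsets(attrs):
    """All subsets of `attrs`, including the empty set, as frozensets."""
    attrs = sorted(attrs)
    return [frozenset(c) for r in range(len(attrs) + 1) for c in combinations(attrs, r)]


def fd_closure(schema, fds):
    """Every FD X -> Y with X a subset of the schema and Y a subset of X+."""
    result = set()
    for x in subsets(schema):
        for y in subsets(closure(x, fds)):
            result.add((x, y))
    return result


f_plus = fd_closure("ABC", [("A", "B"), ("B", "C")])
print(len(f_plus))  # 43, matching S-9
```

Because the number of FDs grows exponentially with the number of attributes, F+ is rarely enumerated in practice; attribute closures are normally used instead.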
Equivalence of Functional Dependency
• For a given relation R, if F and G are two sets of FDs and
F+ = G+, then the FD sets F and G are equivalent.
• This can also be checked using the closure of attribute sets.
• Ex: Consider the relation R(A,C,D,E,H) and the sets of FDs
F={A->C, AC->D, E->AD, E->H} and G={A->CD,
E->AH}. Determine if F and G are equivalent.
Ans:
S-1: Find the closure of the LHS of each FD in F using the FDs of G
(A)+ = (ACD); (AC)+ = (ACD); (E)+ = (EAHCD)
Each closure contains the RHS of the corresponding FD of F.
Hence, F ⊆ G+ (G covers F).
S-2: Find the closure of the LHS of each FD in G using the FDs of F
(A)+ = (ACD); (E)+ = (EADHC)
Each closure contains the RHS of the corresponding FD of G.
Hence, G ⊆ F+ (F covers G).
Hence, the sets of FDs F and G are equivalent.
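The two-way check in this example can be sketched as follows: one set covers the other when, for every FD of the other set, the right side lies in the closure of the left side. The function names `covers` and `equivalent` are assumptions made for the illustration.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD of g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """F and G are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)


F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]
print(equivalent(F, G))
# Dropping E->AH from G breaks the equivalence: E->AD and E->H are no longer derivable.
print(equivalent(F, [("A", "CD")]))
```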
Redundant Functional Dependencies
A FD in the set F is redundant if it can be derived from the
other FDs in the set. A redundant FD can be detected using the
following steps:
• Step 1: Start with a set S of FDs.
• Step 2: Remove an FD f and create a set of FDs S' = S - f.
• Step 3: Test whether f can be derived from the FDs in S' by
using Armstrong's axioms and the derived rules.
• Step 4: If f can be so derived, it is redundant, and hence set S
= S'. Otherwise put f back into S', so that S' = S' + f = S.
• Step 5: Repeat steps 2 to 4 for all FDs in S.
An Example
For example, suppose the following set of FDs is given as input to the
algorithm:
Z -> A; B -> X; AX -> Y; ZB -> Y
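Running the redundancy test on this set can be sketched in code. Instead of applying the axioms by hand, the sketch uses the attribute closure for the derivability test in Step 3: f = X->Y is derivable from the remaining FDs exactly when Y lies in X+ computed without f. The function names are illustrative.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def remove_redundant(fds):
    """Drop every FD that can be derived from the FDs that remain."""
    kept = list(fds)
    for fd in list(fds):
        rest = [g for g in kept if g != fd]   # S' = S - f
        lhs, rhs = fd
        if set(rhs) <= closure(lhs, rest):    # f is derivable from S'
            kept = rest                       # f is redundant: S = S'
    return kept


S = [("Z", "A"), ("B", "X"), ("AX", "Y"), ("ZB", "Y")]
print(remove_redundant(S))
```

Here ZB -> Y turns out to be redundant: Z -> A and B -> X give ZB -> AX, and AX -> Y then gives ZB -> Y. The other three FDs cannot be derived from the rest and are kept.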
contd..
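Sometimes an FD set cannot be reduced any further. This is the case when the
set has the following properties:
• Every FD has exactly one attribute on its right side.
• No FD in the set is redundant.
• The left side of no FD can be reduced.
A set of FDs with the above three properties is also called
Canonical or Minimal.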
How to find Canonical Cover?
• Step-01:
– Write the given set of FDs in such a way that each FD
contains exactly one attribute on its right side.
• Step-02:
– Consider each FD one by one from the set obtained in
Step-01. Determine whether it is essential or non-essential.
– To determine whether a FD is essential or not, compute the
closure of its left side:
• Once by considering that the particular FD is present in
the set.
• Once by considering that the particular FD is not
present in the set.
• Step-2 (contd..)
– Case-1:
• If the results come out to be the same, it means that the presence
or absence of that FD does not make any difference.
• Thus, it is non-essential. Eliminate that FD from the set.
• Do not consider it while checking the essentiality of
other FDs.
– Case-2:
• If the results come out to be different, it means that the
presence or absence of that FD makes a difference.
• Thus, it is essential. Do not eliminate that FD from the
set.
• Mark that FD as essential.
contd..
Step-03:
– Consider the newly obtained set of FDs after performing
Step-02 and check if there is any FD that contains more
than one attribute on its left side.
– Then following two cases are possible-
• Case-01: No
– There exists no FD containing more than one
attribute on its left side.
– In this case, the set obtained in Step-02 is the
canonical cover.
• Case-02: Yes
– There exists at least one FD containing more than
one attribute on its left side.
– In this case, consider all such FD one by one.
– Check if their left side can be reduced.
– Use the following steps to perform a check-
» Consider an FD and compute the closure of all the
possible proper subsets of the left side of that FD.
» If any of the subsets produces the same closure
as the entire left side, then
replace the left side with that subset.
» After this step is complete, the set obtained is the
canonical cover.
An Example
Consider a relation R(WXYZ) and a set of FDs X-> W,
WZ ->XY, Y->WXZ. Find the canonical Cover or Minimal
cover.
Ans:
Step-1: The right side should contain single attribute.
X->W; WZ->X; WZ->Y; Y->W; Y->X; Y->Z
Step-2 : Find out Closure
X->W
X+ = XW (When X->W is considered)
X + = X (When X->W is not considered)
Clearly, the two results are different. Thus, we conclude
that X → W is essential and can not be eliminated.
contd..
Step-2 (contd..)
• For WZ → X:
Considering WZ → X, (WZ)+ = { W , X , Y , Z }
Ignoring WZ → X, (WZ)+ = { W , X , Y , Z }
• Clearly, the two results are the same. Thus, we conclude
that WZ → X is non-essential and can be eliminated.
• Eliminating WZ → X, our set of functional dependencies
reduces to:
X → W; WZ → Y; Y → W; Y → X; Y → Z
Step-2 (contd..)
• For WZ → Y:
Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
Ignoring WZ → Y, (WZ)+ = { W , Z }
Clearly, the two results are different. Thus, we
conclude that WZ → Y is essential and cannot be
eliminated.
• For Y → W:
Considering Y → W, (Y)+ = { W , X , Y , Z }
Ignoring Y → W, (Y)+ = { W , X , Y , Z }
Clearly, the two results are the same. Thus, the FD
Y → W is non-essential and can be eliminated.
Eliminating Y → W, our set of functional dependencies
reduces to:
X → W; WZ → Y; Y → X; Y → Z
contd..
Step-2 (contd..)
• For Y → X:
Considering Y → X, (Y)+ = { W , X , Y , Z }
Ignoring Y → X, (Y)+ = { Y , Z }
Clearly, the two results are different. Thus, we
conclude that Y → X is essential and cannot be eliminated.
• For Y → Z:
Considering Y → Z, (Y)+ = { W , X , Y , Z }
Ignoring Y → Z, (Y)+ = { W , X , Y }
Clearly, the two results are different. Thus, we
conclude that Y → Z is essential and cannot be eliminated.
contd..
Step-03: Consider the FDs having more than one attribute on their
left side. Check if their left side can be reduced.
• In our set,
– Only WZ → Y contains more than one attribute on its left
side.
– Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
– Now, consider all the possible proper subsets of WZ and check if
the closure of any subset matches the closure
of WZ.
• (W)+ = { W }
• (Z)+ = { Z }
contd..
Step-03 (contd..):
– Clearly, none of the subsets has the same closure
as the entire left side.
– Thus, we conclude that we cannot write WZ → Y as W →
Y or Z → Y.
– Thus, the set of FDs obtained in Step-02 is the canonical cover:
X → W; WZ → Y; Y → X; Y → Z
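The whole procedure can be sketched in code and checked against this example. The sketch follows the step order used in this chapter (split right sides, drop non-essential FDs, then try to shrink left sides); many textbooks shrink the left sides before dropping redundant FDs, so treat this as an illustration of the steps above and not as a general-purpose implementation. The function names are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def canonical_cover(fds):
    # Step-01: exactly one attribute on every right side.
    work = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    # Step-02: drop an FD if the closure of its left side is the same without it.
    for fd in list(work):
        rest = [g for g in work if g != fd]
        if closure(fd[0], rest) == closure(fd[0], work):
            work = rest
    # Step-03: replace a left side by a smaller subset with the same closure.
    result = []
    for lhs, a in work:
        full = closure(lhs, work)
        for size in range(1, len(lhs)):
            smaller = [frozenset(c) for c in combinations(sorted(lhs), size)
                       if closure(c, work) == full]
            if smaller:
                lhs = smaller[0]
                break
        result.append((lhs, a))
    return result


cover = canonical_cover([("X", "W"), ("WZ", "XY"), ("Y", "WXZ")])
for lhs, a in cover:
    print("".join(sorted(lhs)), "->", a)
```

For the FD set of this example the sketch prints X -> W, WZ -> Y, Y -> X and Y -> Z, the same four FDs that remain after Step-02 and Step-03 above.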