PDD 4
Asst.Professor, CSE
CE, Cherthala
Module 4
Relational Database Design: Different anomalies in designing a database,
normalization, functional dependency (FD), Armstrong’s Axioms, closures,
Equivalence of FDs, minimal Cover (proofs not required). Normalization using
functional dependencies, 1NF, 2NF, 3NF and BCNF, lossless and dependency
preserving decompositions
● Define relations/attributes
● Define primary keys
● Define relationships
● Normalization
Insertion anomalies: The user is unable to insert a new record of data. If a tuple is inserted into
the referencing relation and its referencing attribute value is not present in the referenced
attribute, the insertion into the referencing relation is not allowed. For example, if we try to
insert a record into STUDENT_COURSE with STUD_NO = 7 while no such STUD_NO exists in the
referenced relation, it will not be allowed.
Deletion Anomalies: These happen when the deletion of unwanted information causes desired
information to be deleted as well. We can’t delete a row from the REFERENCED RELATION if the
value of the REFERENCED ATTRIBUTE is used as a value of the REFERENCING ATTRIBUTE. This can
be handled by DELETE CASCADE: it will delete the tuples from the REFERENCING RELATION
if the value used by the REFERENCING ATTRIBUTE is deleted from the REFERENCED RELATION.
Updation Anomalies: These happen when a record is updated but the other occurrences of the same
data are not updated. We can’t update a row of the REFERENCED RELATION if the value of the
REFERENCED ATTRIBUTE is used as a value of the REFERENCING ATTRIBUTE. UPDATE CASCADE: it will
update the REFERENCING ATTRIBUTE in the REFERENCING RELATION if the attribute value
used by the REFERENCING ATTRIBUTE is updated in the REFERENCED RELATION.
Assume a relation S with the attributes Student_name, Roll No and class. The attribute
Student_name is fully functionally dependent on the attributes (Roll No, class). This means we
need both Roll No and class to get the value of Student_name.
E.g. if {A,B} → {C} but also {A} → {C}, then {C} is partially functionally dependent on {A,B}.
Multivalued dependency: Multivalued dependency occurs when there is more than
one independent multivalued attribute in a table. A multivalued dependency consists of at
least two attributes that are dependent on a third attribute, which is why it always requires at
least three attributes. Multivalued dependency occurs when two attributes in a table are
independent of each other but both depend on a third attribute.
The problem here is that both Ravi and Beth play multiple sports. It is necessary to add a
new row for every additional sport.
This table has introduced a multivalued dependency because the major and the sport are
independent of one another but both depend on the student. This is a simple example and
easily identifiable, but a multivalued dependency could become a problem in a large, complex
database.
Trivial functional dependency: If a functional dependency (FD) X → Y holds, where Y
is a subset of X, then it is called a trivial FD.
Non-trivial functional dependency: If a functional dependency X->Y holds true where
Y is not a subset of X then this dependency is called non trivial Functional dependency.
For example: An employee table with three attributes: emp_id, emp_name, emp_address.
P-> Q
Q->R
{Book} -> {Author} (if we know the book, we know the author name)
Therefore, as per the rule of transitive dependency, {Book} -> {Author_age} should hold.
4.2.2 Armstrong’s Axioms
Armstrong’s Axioms are a set of inference rules used to derive all the functional dependencies
that hold in a database. Conceived by William W. Armstrong, they can be applied to the set of
functional dependencies F of any relational database. The set of all functional dependencies
that can be inferred from F using these rules is called the closure of F and is denoted by the
symbol F+.
Rule 1 Reflexivity
If A is a set of attributes and B is a subset of A, then A holds B. {A → B}
Rule 2 Augmentation
If A holds B and C is a set of attributes, then AC holds BC. {AC → BC}
It means that adding the same attributes to both sides of a dependency does not change the
basic dependency.
Rule 3 Transitivity
If A holds B and B holds C, then A holds C.
If {A → B} and {B → C}, then {A → C}
A holds B {A → B} means that A functionally determines B.
B. Secondary Rules
Rule 1 Union
If A holds B and A holds C, then A holds BC.
If {A → B} and {A → C}, then {A → BC}
Rule 2 Decomposition
If A holds BC, then A holds B and A holds C.
If {A → BC}, then {A → B} and {A → C}
Example: Given the functional dependencies
P → Q, P → R, QR → S, Q → T, QR → U, PR → U
check whether the following dependencies hold:
1. P → T
2. PR → S
3. QR → SU
4. PR → SU
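Solution:
1. P → T
As P → Q and Q → T,
So, Using Transitivity Rule: If {A → B} and {B → C}, then {A → C}
∴ If P → Q and Q → T, then P → T.
P → T
2. PR → S
As P → Q and QR → S,
So, Using Pseudo Transitivity Rule: If {A → B} and {BC → D}, then {AC → D}
∴ If P → Q and QR → S, then PR → S.
PR → S
3. QR → SU
As QR → S and QR → U,
So, Using Union Rule: If {A → B} and {A → C}, then {A → BC}
∴ If QR → S and QR → U, then QR → SU.
QR → SU
4. PR → SU
As PR → S (from 2) and PR → U,
So, Using Union Rule: If {A → B} and {A → C}, then {A → BC}
∴ If PR → S and PR → U, then PR → SU.
PR → SU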
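The four results of this example can also be cross-checked mechanically. The following is a minimal Python sketch, not part of the original notes: it assumes single-letter attribute names, and the helper names `closure` and `follows` are hypothetical. It tests whether X → Y is implied by the given FDs by checking that Y lies inside the attribute closure of X.

```python
def closure(attrs, fds):
    """Attributes derivable from `attrs` using the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is derivable, the right side is too
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def follows(fds, lhs, rhs):
    """True if lhs -> rhs can be inferred from `fds`."""
    return set(rhs) <= closure(lhs, fds)


# Given FDs of the example, written as (LHS, RHS) strings
given = [("P", "Q"), ("P", "R"), ("QR", "S"),
         ("Q", "T"), ("QR", "U"), ("PR", "U")]

for lhs, rhs in [("P", "T"), ("PR", "S"), ("QR", "SU"), ("PR", "SU")]:
    print(f"{lhs} -> {rhs}:", follows(given, lhs, rhs))
```

A dependency that cannot be derived, such as Q → S, makes `follows` return False, because the closure of Q under these FDs is only {Q, T}.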
The closure of a set of attributes X, written X+, is the complete set of all attributes that
can be functionally derived from X using the given functional dependencies.
Step-1 : Add the attributes which are present on Left Hand Side in the original functional
dependency.
Step-2 : Now, add the attributes present on the Right Hand Side of the functional
dependency.
Step-3 : With the help of attributes present on Right Hand Side, check the other attributes
that can be derived from the other given functional dependencies. Repeat this process until
all the possible attributes which can be derived are added in the closure.
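The three steps above can be sketched in Python as follows. This is an illustrative sketch rather than a definitive implementation: it assumes single-letter attribute names, represents each FD as a (LHS, RHS) pair of strings, and uses the FD set A → BC, C → B, D → E, E → D as sample input.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under `fds`."""
    result = set(attrs)              # Step 1: start with the given attributes
    changed = True
    while changed:                   # Step 3: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            # Step 2: when the LHS is already in the closure, add the RHS
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Sample FDs: A -> BC, C -> B, D -> E, E -> D
fds = [("A", "BC"), ("C", "B"), ("D", "E"), ("E", "D")]

for attr in "ABCDE":
    print("{" + attr + "}+ =", sorted(closure(attr, fds)))
```

The loop stops as soon as one full pass over the FDs adds no new attribute, which is the "repeat until all possible attributes are added" condition of Step-3.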
Similarly, we can calculate the closure for other attributes too, e.g. “Name”.
Step-3 : Since we don’t have any functional dependency where “Marks” or “Location” appears on
the Left Hand Side, no further attributes can be added. So
{Name}+ = {Name, Marks, Location}
{Marks}+ = {Marks}
and
{Location}+ = { Location}
FD1 : A -> BC
FD2 : C -> B
FD3 : D -> E
FD4 : E -> D
Now, calculate the closure of attributes of the relation R. The closures will be:
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {B, C}
{D}+ = {D, E}
{E}+ = {E, D}
A Candidate Key of a relation is a minimal attribute or set of attributes that can determine the
whole relation, i.e. its closure contains all the attributes of the relation and no proper subset
of it has this property.
FD1 : A-> B
FD2 : B ->C
{A}+ = {A, B, C}
{B}+ = {B, C}
{C}+ = {C}
Clearly, “A” is the candidate key, as its closure contains all the attributes present in the
relation “R”.
FD1 : A-> BC
FD2 : C-> B
FD3 : D ->E
FD4 : E ->D
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {C, B}
{D}+ = {E, D}
{E}+ = {E, D}
In this case, no single attribute is able to determine all the attributes on its own as in the
previous example. Here, we need to combine two or more attributes to determine the candidate
keys.
Hence, "AD" and "AE" are the two possible candidate keys of the given relation “R”. Any other
combination that determines all the attributes contains one of these two, so its additional
attributes are extraneous.
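The search for candidate keys can be sketched as a brute-force loop over attribute combinations, smallest first. This is a minimal sketch under assumptions: the relation is taken to be R(A, B, C, D, E) (the notes do not list the attributes explicitly), attribute names are single letters, and the function names are hypothetical. The approach is exponential in the number of attributes, so it is only suitable for small classroom examples.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes derivable from `attrs` using the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            # A superset of an already found key is not minimal
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) >= attributes:
                keys.append(candidate)
    return keys


# FDs of the example: A -> BC, C -> B, D -> E, E -> D
fds = [("A", "BC"), ("C", "B"), ("D", "E"), ("E", "D")]
keys = candidate_keys("ABCDE", fds)
prime = set().union(*keys)           # attributes that occur in some key

print("candidate keys:", ["".join(sorted(k)) for k in keys])
print("prime attributes:", sorted(prime))
```

Because combinations are tried in increasing size and supersets of known keys are skipped, every set that is returned is minimal.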
1. Prime Attributes : Attributes which are part of at least one candidate key. For
example : “A, D, E” attributes are prime attributes in above example-2.
2. Non-Prime Attributes : Attributes other than prime attributes, which do not take part
in the formation of any candidate key. For example : “B, C” attributes are non-prime attributes
in above example-2.
3. Extraneous Attributes : Attributes whose removal from a key has no effect, i.e. the
remaining attributes still determine the whole relation.
FD1 : A-> BC
FD2 : B ->C
FD3 : D ->C
Prime Attributes : A, D.
Non-Prime Attributes : B, C
Extraneous Attributes : B, C (if we add any of them to the candidate key, its closure remains
unaffected). These are the attributes which, if removed, do not affect the closure of that set.
4.3 Equivalence of Functional Dependencies
Two different sets of functional dependencies for a given relation may or may not be equivalent. If
FD1 and FD2 are two sets of functional dependencies, the following three cases are possible:
● If FD1 can be derived from FD2, we can say that FD2 ⊃ FD1.
● If FD2 can be derived from FD1, we can say that FD1 ⊃ FD2.
● If both the above cases are true, FD1 = FD2, i.e. the two sets are equivalent.
Q. Let us take an example to show the relationship between two FD sets. A relation
R(A,B,C,D) having two FD sets FD1 = {A->B, B->C, AB->D} and FD2 = {A->B, B->C,
A->C, A->D}
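Step 1. Checking whether all FDs of FD1 are present in FD2
● A->B in set FD1 is present in set FD2.
● B->C in set FD1 is also present in set FD2.
● AB->D is present in FD1 but not directly in FD2 but we will check whether we
can derive it or not. For set FD2, (AB)+ = {A,B,C,D}. It means that AB can
functionally determine A, B, C and D. So AB->D will also hold in set FD2.
As all FDs in set FD1 also hold in set FD2, FD2 ⊃ FD1 is true.
Step 2. Checking whether all FDs of FD2 are present in FD1
● A->B in set FD2 is present in set FD1.
● B->C in set FD2 is also present in set FD1.
● A->C is present in FD2 but not directly in FD1 but we will check whether we can
derive it or not. For set FD1, (A)+ = {A,B,C,D}. It means that A can functionally
determine A, B, C and D. So A->C will also hold in set FD1.
● A->D is present in FD2 but not directly in FD1 but we will check whether we can
derive it or not. For set FD1, (A)+ = {A,B,C,D}. It means that A can functionally
determine A, B, C and D. So A->D will also hold in set FD1.
As all FDs in set FD2 also hold in set FD1, FD1 ⊃ FD2 is true.
Step 3. As FD2 ⊃ FD1 and FD1 ⊃ FD2 both are true, FD2 = FD1 is true. These two FD sets
are semantically equivalent.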
Q. Let us take another example to show the relationship between two FD sets. A
relation R2(A,B,C,D) having two FD sets FD1 = {A->B, B->C,A->C} and FD2 = {A->B,
B->C, A->D}
Step 1. Checking whether all FDs of FD1 are present in FD2
● A->B in set FD1 is present in set FD2.
● B->C in set FD1 is also present in set FD2.
● A->C is present in FD1 but not directly in FD2 but we will check whether we can
derive it or not. For set FD2, (A)+ = {A,B,C,D}. It means that A can functionally
determine A, B, C and D. So A->C will also hold in set FD2.
As all FDs in set FD1 also hold in set FD2, FD2 ⊃ FD1 is true.
Step 2. Checking whether all FDs of FD2 are present in FD1
● A->B in set FD2 is present in set FD1.
● B->C in set FD2 is also present in set FD1.
● A->D is present in FD2 but not directly in FD1 but we will check whether we can
derive it or not. For set FD1, (A)+ = {A,B,C}. It means that A can’t functionally
determine D. So A->D will not hold in FD1.
As not all FDs in set FD2 hold in set FD1, FD1 ⊅ FD2.
Step 3. In this case, FD2 ⊃ FD1 but FD1 ⊅ FD2, so these two FD sets are not semantically
equivalent.
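Both worked examples follow the same procedure, which can be sketched in Python. This is a minimal sketch assuming single-letter attributes; the helper names `covers` and `equivalent` are hypothetical. A set F covers G (F ⊃ G in the notation used here) when, for every X → Y in G, Y is contained in the closure of X computed under F.

```python
def closure(attrs, fds):
    """Attributes derivable from `attrs` using the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD of g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """Two FD sets are equivalent when each one covers the other."""
    return covers(f, g) and covers(g, f)


# First example: R(A,B,C,D)
fd1 = [("A", "B"), ("B", "C"), ("AB", "D")]
fd2 = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]
print(equivalent(fd1, fd2))

# Second example: R2(A,B,C,D)
fd1_b = [("A", "B"), ("B", "C"), ("A", "C")]
fd2_b = [("A", "B"), ("B", "C"), ("A", "D")]
print(covers(fd2_b, fd1_b), covers(fd1_b, fd2_b))
```

In the second example only one direction holds, because A → D cannot be derived from the first set.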
1. Break down the RHS of each functional dependency into a single attribute.
2. Minimize the LHS of each functional dependency by removing extraneous attributes.
3. Find and remove redundant FDs.
4. Group the functional dependencies that have a common LHS together into a single FD.
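A minimal Python sketch of these four steps is given below. It is an illustration under assumptions, not a definitive implementation: attributes are single letters, the function names are hypothetical, and the input FD set {AB → C, C → B, A → B} is a made-up example that is not taken from these notes. The sketch minimises each LHS before it looks for redundant FDs, which is the order of the standard textbook algorithm, because removing an extraneous attribute can make another FD redundant.

```python
def closure(attrs, fds):
    """Attributes derivable from `attrs` using the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Split every RHS into single attributes (duplicates dropped, order kept)
    work = [(frozenset(lhs), frozenset(a)) for lhs, rhs in fds for a in rhs]
    work = list(dict.fromkeys(work))

    # Minimise each LHS: drop an attribute if the rest still derives the RHS
    # (an empty LHS is never produced in this sketch)
    for i, (lhs, rhs) in enumerate(work):
        for attr in sorted(lhs):
            smaller = lhs - {attr}
            if smaller and rhs <= closure(smaller, work):
                lhs = smaller
                work[i] = (lhs, rhs)
    work = list(dict.fromkeys(work))

    # Remove every FD that follows from the remaining ones
    i = 0
    while i < len(work):
        lhs, rhs = work[i]
        rest = work[:i] + work[i + 1:]
        if rhs <= closure(lhs, rest):
            work = rest
        else:
            i += 1

    # Group FDs with a common LHS into a single FD
    grouped = {}
    for lhs, rhs in work:
        grouped.setdefault("".join(sorted(lhs)), set()).update(rhs)
    return {lhs: "".join(sorted(rhs)) for lhs, rhs in grouped.items()}


print(minimal_cover([("AB", "C"), ("C", "B"), ("A", "B")]))
```

For the made-up input, B is extraneous in AB → C (since A → B), which leaves A → C; after that, A → B becomes redundant because it follows from A → C and C → B.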
4.5 Normalization
Normalization is the process of organizing the data in the database. Normalization is used to
minimize the redundancy in a relation or set of relations. It is also used to eliminate
undesirable characteristics like Insertion, Update and Deletion Anomalies. Normalization
divides a larger table into smaller tables and links them using relationships.
2NF A relation will be in 2NF if it is in 1NF and all non-key attributes are fully functionally
dependent on the primary key.
4NF A relation will be in 4NF if it is in Boyce Codd normal form and has no multivalued
dependency.
5NF A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining
is lossless.
As per the rule of first normal form, an attribute (column) of a table cannot hold multiple
values. It should hold only atomic values.
ID NAME MOBILE
101 A 9446007329, 8281401703
102 B 9995398726
103 C 9786219187
This table is not in 1NF because the MOBILE value for employee A violates that rule. To
make the table comply with 1NF we should have the data like this:
ID NAME MOBILE
101 A 9446007329
101 A 8281401703
102 B 9995398726
103 C 9786219187
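The conversion to 1NF can be sketched in Python as a simple flattening step. This is an illustrative sketch only: the list-of-tuples representation and the function name `to_1nf` are assumptions, and the multi-valued MOBILE column is modelled as a list.

```python
# Rows of the unnormalized table: MOBILE may hold several values
employees = [
    (101, "A", ["9446007329", "8281401703"]),
    (102, "B", ["9995398726"]),
    (103, "C", ["9786219187"]),
]


def to_1nf(rows):
    """Emit one row per single MOBILE value so that every column is atomic."""
    return [
        (emp_id, name, mobile)
        for emp_id, name, mobiles in rows
        for mobile in mobiles
    ]


for row in to_1nf(employees):
    print(*row)
```

Note that after flattening, ID alone no longer identifies a row, which is why the 1NF table repeats ID 101.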
ID Subject Age
101 maths 40
102 DS 38
101 Dbms 40
103 Cp 35
This table is in 1NF but not in 2NF because a partial dependency is there: using ID alone we
can find Age, although the candidate key is {ID, Subject}. To make the table comply with 2NF
we can break it into two tables like this:
ID Subject
101 maths
102 DS
101 Dbms
103 Cp
ID Age
101 40
102 38
103 35
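The partial dependency and the resulting split can be illustrated with a short Python sketch. The helper names `fd_holds` and `project` are hypothetical, and the check only shows that a dependency is consistent with these sample rows; whether it really holds is a property of the schema, not of one data set.

```python
rows = [
    {"ID": 101, "Subject": "maths", "Age": 40},
    {"ID": 102, "Subject": "DS", "Age": 38},
    {"ID": 101, "Subject": "Dbms", "Age": 40},
    {"ID": 103, "Subject": "Cp", "Age": 35},
]


def fd_holds(rows, lhs, rhs):
    """True if equal LHS values always come with equal RHS values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Projection on `attrs` with duplicate rows removed (order kept)."""
    return list(dict.fromkeys(tuple(row[a] for a in attrs) for row in rows))


print(fd_holds(rows, ["ID"], ["Age"]))       # ID alone fixes Age
print(fd_holds(rows, ["ID"], ["Subject"]))   # ID alone does not fix Subject
print(project(rows, ["ID", "Subject"]))
print(project(rows, ["ID", "Age"]))
```

The second projection stores the age of ID 101 only once, which is the redundancy that the 2NF decomposition removes.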
In other words 3NF can be explained like this: A table is in 3NF if it is in 2NF and for each
functional dependency X-> Y at least one of the following conditions hold:
● X is a super key of the table
● Y is a prime attribute of the table
Candidate Keys:ID
Non-prime attributes: all attributes except ID are non-prime as they are not part of any
candidate keys.
To make this table comply with 3NF we have to break the table into two tables to remove
the transitive dependency:
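A relation is in BCNF if it is in 3NF and, for every non-trivial functional dependency X-> Y,
X is a super key of the relation.
Every relation in BCNF is in 3NF, but not every relation in 3NF is in BCNF.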
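The difference between 3NF and BCNF can be illustrated with a small Python sketch. It rests on assumptions: attribute names are single letters, the function names are hypothetical, and the relation R(A, B, C) with AB → C and C → B is a commonly used textbook illustration that does not come from these notes. The sketch checks the given FDs of a single relation only.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes derivable from `attrs` using the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) >= attributes:
                keys.append(candidate)
    return keys


def is_bcnf(attributes, fds):
    """Every non-trivial FD must have a super key on its left side."""
    attributes = set(attributes)
    return all(
        set(rhs) <= set(lhs) or closure(lhs, fds) >= attributes
        for lhs, rhs in fds
    )


def is_3nf(attributes, fds):
    """Like BCNF, but an FD is also allowed if its RHS is prime."""
    attributes = set(attributes)
    prime = set().union(*candidate_keys(attributes, fds))
    return all(
        set(rhs) <= set(lhs)
        or closure(lhs, fds) >= attributes
        or (set(rhs) - set(lhs)) <= prime
        for lhs, rhs in fds
    )


fds = [("AB", "C"), ("C", "B")]
print(is_3nf("ABC", fds), is_bcnf("ABC", fds))
```

Here the candidate keys are AB and AC, so B is prime and C → B is acceptable for 3NF, but C is not a super key, so the relation is not in BCNF.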
Decomposition helps in eliminating some of the problems of bad design such as redundancy,
inconsistencies and anomalies.
There are two types of decomposition :
● Lossy Decomposition
● Lossless Join Decomposition
Lossy Decomposition :
"The decomposition of relation R into R1 and R2 is lossy when the join of R1 and R2 does
not yield the same relation as in R."
Consider that we have a table STUDENT with three attributes roll_no, sname and department.
STUDENT
roll_no sname department
11 A CS
22 A EC
This relation is decomposed into two relations no_name and name_dept
no_name
roll_no sname
11 A
22 A
name_dept
sname department
A CS
A EC
In lossy decomposition, spurious tuples are generated when a natural join is applied to the
relations in the decomposition. When two relations are natural joined and the resulting relation
has more tuples than the original relation, the extra tuples are called spurious tuples (the
join column is not a proper primary key).
11 A CS
11 A EC
22 A CS
22 A EC
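The spurious tuples can be reproduced with a few lines of Python. This is a minimal sketch: relations are modelled as lists of dictionaries, and `natural_join` is a hypothetical helper written for this illustration (it assumes both relations are non-empty).

```python
def natural_join(r1, r2):
    """Natural join of two non-empty relations given as lists of dicts."""
    common = set(r1[0]) & set(r2[0])      # attributes shared by both relations
    joined = []
    for t1 in r1:
        for t2 in r2:
            if all(t1[a] == t2[a] for a in common):
                row = {**t1, **t2}
                if row not in joined:     # a relation has no duplicate tuples
                    joined.append(row)
    return joined


student = [
    {"roll_no": 11, "sname": "A", "department": "CS"},
    {"roll_no": 22, "sname": "A", "department": "EC"},
]
no_name = [{"roll_no": 11, "sname": "A"}, {"roll_no": 22, "sname": "A"}]
name_dept = [{"sname": "A", "department": "CS"},
             {"sname": "A", "department": "EC"}]

joined = natural_join(no_name, name_dept)
spurious = [row for row in joined if row not in student]

print(len(joined), "tuples after the join")
print("spurious:", spurious)
```

The join produces four tuples from an original relation of two, because the common column sname has the same value in every row and therefore matches everything.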
"The decomposition of relation R into R1 and R2 is lossless when the join of R1 and R2
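yields the same relation as in R." This is also referred to as non-additive decomposition.
A decomposition is lossless if the common attributes of R1 and R2 functionally determine all
the attributes of R1 or all the attributes of R2:
R1 ∩ R2 → R1 OR R1 ∩ R2 → R2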
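Consider that we have a table STUDENT with three attributes roll_no, sname and department.
STUDENT
roll_no sname department
11 A CS
22 A EC
This relation is decomposed into two relations, one with (roll_no, sname) and one with
(roll_no, department):
roll_no sname
11 A
22 A
roll_no department
11 CS
22 EC
Now, when these two relations are joined on the common column 'roll_no', the resultant
relation will look the same as the original.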
11 A CS
22 A EC
In lossless decomposition, no spurious tuples are generated when a natural join is applied
to the relations in the decomposition.
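The condition R1 ∩ R2 → R1 OR R1 ∩ R2 → R2 can be tested with the attribute closure. The sketch below is illustrative and rests on an assumption that the notes do not state explicitly: roll_no is taken to be the key of STUDENT, i.e. roll_no → sname and roll_no → department. The function name `is_lossless` is hypothetical.

```python
def closure(attrs, fds):
    """Attributes derivable from `attrs`; each FD is a (set, set) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary test: the common attributes must determine all of R1 or all of R2."""
    reach = closure(set(r1) & set(r2), fds)
    return set(r1) <= reach or set(r2) <= reach


# Assumed FDs for STUDENT: roll_no is the key
fds = [({"roll_no"}, {"sname"}), ({"roll_no"}, {"department"})]

# Decomposition on the common column roll_no
print(is_lossless({"roll_no", "sname"}, {"roll_no", "department"}, fds))

# Decomposition on the common column sname
print(is_lossless({"roll_no", "sname"}, {"sname", "department"}, fds))
```

With these assumed FDs the split on roll_no passes the test, while the split on sname fails it, since sname determines nothing but itself.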
If R is decomposed or divided into R1 with FDs { f1 } and R2 with FDs { f2 }, then there can be
three cases:
Q1. Let a relation R (A, B, C, D ) and functional dependency {AB –> C, C –> D, D –> A}.
Relation R is decomposed into R1( A, B, C) and R2(C, D). Check whether decomposition is
dependency preserving or not.
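closure(A) = { A } // Trivial
closure(B) = { B } // Trivial
closure(C) = {C, D, A}
= {C, A} // D is dropped as it is not present in R1
C --> A // Removing C from right side as it is a trivial attribute
closure(AB) = {A, B, C, D}
= {A, B, C} // D is dropped as it is not present in R1
AB --> C // Removing AB from right side as these are trivial attributes
closure(BC) = {B, C, D, A}
= {A, B, C} // D is dropped as it is not present in R1
BC --> A // Removing BC from right side as these are trivial attributes
closure(AC) = {A, C, D}
= {A, C} // D is dropped as it is not present in R1, only trivial attributes remain
F1 { C--> A, AB --> C, BC --> A }
Similarly F2 { C--> D }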
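The check for Q1 can also be run mechanically. The following Python sketch is an illustration under assumptions (single-letter attributes, hypothetical function names). Instead of listing F1 and F2 explicitly, it uses the standard shortcut of growing the LHS of each FD using only closures restricted to one sub-relation at a time; an FD X → Y is preserved if Y is reached in this way.

```python
def closure(attrs, fds):
    """Attributes derivable from `attrs` using the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserved(fd, parts, fds):
    """True if `fd` follows from the FDs projected onto the sub-relations."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            part = set(part)
            # Closure of the attributes visible in this sub-relation,
            # cut back to the attributes of that sub-relation
            gained = closure(result & part, fds) & part
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


F = [("AB", "C"), ("C", "D"), ("D", "A")]
parts = ["ABC", "CD"]                     # R1(A, B, C) and R2(C, D)

for fd in F:
    print(fd[0], "-->", fd[1], "preserved:", preserved(fd, parts, F))
```

For this decomposition AB --> C and C --> D are preserved, while D --> A is not, because D appears only in R2 and R2 does not contain A.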