Module3 Dbms

The document discusses normalization in databases. Normalization is the process of organizing data to minimize redundancy and dependency. It has two main goals - eliminating redundant data storage and ensuring data dependencies make logical sense. Normalization improves storage efficiency, data integrity, and scalability. Some key points include decomposing tables to eliminate anomalies like update, deletion, and insert anomalies. Normalization follows specific rules to break tables into smaller, more logical tables and relationships.

Uploaded by

Vaidehi Verma
Copyright
© © All Rights Reserved
Module 3

NORMALIZATION
• Normalization is the process of removing redundant
data from your tables in order to improve storage
efficiency, data integrity and scalability.
• This improvement is balanced against an increase in
complexity and potential performance losses from the
joining of the normalized tables at query-time.
• There are two goals of the normalization process:
eliminating redundant data (for example, storing the
same data in more than one table) and ensuring data
dependencies make sense (only storing related data in
a table).
• Both of these are worthy goals as they reduce the
amount of space a database consumes and ensure
that data is logically stored.
Why Do We Need Normalization?
• Normalization is a goal of a well-designed relational database
management system (RDBMS). It is a step-by-step set of rules
by which data is put into its simplest form.
• We normalize a relational database for the following reasons:
• To minimize data redundancy, i.e. no unnecessary duplication
of data.
• To make the database structure flexible, i.e. it should be possible
to add new data values and rows without reorganizing the
database structure.
• To keep data consistent throughout the database, i.e.
it should not suffer from the following anomalies:
• Insert Anomaly – Caused by missing data: a row cannot be
inserted until all of its key values are known, because null
values in keys must be avoided. This kind of anomaly can
seriously damage a database.
• Update Anomaly – Caused by data redundancy, i.e.
multiple occurrences of the same value in a column. Every
copy has to be updated, which is inefficient, and a missed
copy leaves the data inconsistent.
• Deletion Anomaly – Deleting a row also removes facts
that are not stored elsewhere. It could result in loss of
vital data.
• To make complex queries required by the user easy to
handle.
• When normalization decomposes a relation into smaller
relations with fewer attributes, joining the resulting
relations must give back the original relation without
any extra rows. The join operations can be performed
in any order. This is known as lossless-join decomposition.
• The relations (tables) obtained by normalization should
have the following properties: each row is identified by
a unique key, there are no repeating groups, columns are
homogeneous, each column has a unique name, etc.
Disadvantages of Normalization
• You cannot start building the database before
you know what the user needs.
• Normalizing relations to higher normal forms
(e.g. 4NF, 5NF) can degrade performance, because
queries have to join more tables.
• Normalizing relations of higher degree is a very
time-consuming and difficult process.
• Careless decomposition may lead to a bad
database design, which may lead to serious
problems.
• What do we mean when we say a table is not
in normalized form?
• Consider the STUDENT table shown next,
where one or more students may be assigned
a common course. Notice that for each Course
Name every “row” of the table has more than
one value under the columns Rollno, Name,
System Used, Hourly Rate, and Total_Hrs.
• Table entries that have more than one value are
called multi-value entries. Tables with multi-value
entries are called unnormalized tables.
• Within an unnormalized table, a repeating group is
an attribute or group of attributes that may have
multi-value entries for single occurrences of the
table identifier.
• The term table identifier refers to the attribute that
allows us to distinguish the different rows of the
unnormalized table.
• Using this terminology we can describe the STUDENT
table shown above as an unnormalized table where
attributes Rollno, Name, System Used, Hourly Rate,
and Total_Hrs are repeating groups.
• This type of table cannot be considered a relation
because there are entries with more than one value.
To be able to represent this table as a relation and to
implement it in an RDBMS it is necessary to normalize
the table.
First Approach: Flattening the table
• The first approach, known as “flattening the table”, removes
repeating groups by filling in the “missing” entries of each
“incomplete row” of the table with copies of their corresponding
non-repeating attributes.
• The following example illustrates this.
• In the STUDENT table, for each individual course, there is more
than one value per entry under the Rollno, Name, System Used,
Hourly Rate, and Total_Hrs attributes. To normalize this
table, we just fill in the remaining entries by copying the
corresponding information from the non-repeating attributes.
• For instance, for the row that contains the course Visual Basic, we
fill in the remaining “blank” entries by copying the values of the
Course_Code, Course_Name and Teacher_Name columns. Each
row now has a single value in each of its entries. We repeat
the same process for the students of the remaining two courses.
• The normalized representation of the
STUDENT table is:
Second Approach: Decomposition of the table
• The second approach for normalizing a table
requires that the table be decomposed into two
new tables that will replace the original table.
• Decomposition of a relation involves separating
the attributes of the relation to create the
schemes of two new relations. However, before
decomposing the original table it is necessary to
identify an attribute or a set of attributes that
can be used as the table identifier.
Rule of decomposition
• One of the two tables contains the table
identifier of the original table and all the
non-repeating attributes.
• The other table contains a copy of the table
identifier and all the repeating attributes.
• To transform these tables into relations, it
may be necessary to identify a primary key for
each table. The tuples of the new relations are
the projections of the original relation onto
their respective schemes.
• To normalize the STUDENT table we need to
replace it by two new tables.
• The first table, COURSE, contains the table
identifier and the non-repeating attributes. These
attributes are Course_Code (the table
identifier), Course_Name, and Teacher_Name.
• The second table contains the table identifier
and all the repeating groups. Therefore, the
attributes of the COURSE_STUDENT table are
Course_Code, Rollno, Name, System Used,
Hourly Rate and Total_Hrs.
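The decomposition rule can be sketched in Python as follows. This is only an illustration: every data value below is made up, and the nested "students" list is an assumed way of representing the repeating group; only the column names come from the text.

```python
# Unnormalized STUDENT table: each course row carries a repeating group of students.
# All data values are hypothetical; only the column names come from the text.
student = [
    {
        "Course_Code": "C1",
        "Course_Name": "Visual Basic",
        "Teacher_Name": "T1",
        "students": [
            {"Rollno": 1, "Name": "S1", "System_Used": "P1", "Hourly_Rate": 20, "Total_Hrs": 7},
            {"Rollno": 2, "Name": "S2", "System_Used": "P2", "Hourly_Rate": 30, "Total_Hrs": 3},
        ],
    },
    {
        "Course_Code": "C2",
        "Course_Name": "Oracle",
        "Teacher_Name": "T2",
        "students": [
            {"Rollno": 3, "Name": "S3", "System_Used": "P1", "Hourly_Rate": 20, "Total_Hrs": 5},
        ],
    },
]


def decompose(table, identifier, repeating_key):
    """Split an unnormalized table into (non-repeating part, repeating part)."""
    parent, child = [], []
    for row in table:
        # Table identifier plus all non-repeating attributes.
        parent.append({k: v for k, v in row.items() if k != repeating_key})
        # A copy of the table identifier plus each entry of the repeating group.
        for entry in row[repeating_key]:
            child.append({identifier: row[identifier], **entry})
    return parent, child


course, course_student = decompose(student, "Course_Code", "students")
print(course)
print(course_student)
```

Each row of `course_student` now holds a single value per column, and `Course_Code` links it back to its row in `course`.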
Functional Dependency
• A functional dependency is a relationship
that exists between two attributes (or sets of attributes).
• It typically exists between the primary key
and a non-key attribute within a table.
• X → Y
• The left side of the FD is known as the determinant,
the right side of the FD is known as the
dependent.
• Assume we have an employee table with
attributes: Emp_Id, Emp_Name,
Emp_Address.
• Here the Emp_Id attribute can uniquely identify
the Emp_Name attribute of the employee table
because if we know the Emp_Id, we can tell
the employee name associated with it.
• The functional dependency can be written as:
⮚ Emp_Id → Emp_Name
• We can say that Emp_Name is functionally
dependent on Emp_Id.
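As a rough illustration, the following Python sketch checks whether a functional dependency holds in a given set of rows. The helper name `holds` and the sample rows are hypothetical. Note that sample data can only refute a dependency; whether it holds in general is a design decision about the data.

```python
def holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs is not violated by the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


employees = [  # hypothetical sample rows
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Delhi"},
]

print(holds(employees, ["Emp_Id"], ["Emp_Name"]))  # True
print(holds(employees, ["Emp_Name"], ["Emp_Id"]))  # False: "Asha" maps to two ids
```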
Types of Dependencies
• A dependency in a DBMS is a relationship between
two or more attributes. The following types
are discussed here:
• Multivalued dependency
• Trivial functional dependency
• Non-trivial functional dependency
• Transitive dependency
Multivalued dependency
• A multivalued dependency occurs when there are
multiple independent multivalued attributes in a single
table. A multivalued dependency is a full
constraint between two sets of attributes in a
relation. It requires that certain tuples be
present in a relation.
Trivial Functional dependency
• A functional dependency is called trivial if the
attributes on its right side are all included in its
left side.
• So, X -> Y is a trivial functional dependency if Y
is a subset of X.
Non trivial functional dependency
• A functional dependency A->B is non-trivial
when B is not a subset of A.
Transitive dependency
• A transitive dependency is a functional dependency
that is formed indirectly by two other functional
dependencies: if X -> Y and Y -> Z hold, then
X -> Z is a transitive dependency.
Inference Rule (IR):
• Armstrong's axioms are the basic
inference rules.
• Armstrong's axioms are used to infer
functional dependencies on a relational
database.
• An inference rule is a type of assertion. It
can be applied to a set of FDs (functional
dependencies) to derive other FDs.
• Using the inference rules, we can derive
additional functional dependencies from the
initial set.
1. Reflexive Rule (IR1)
• In the reflexive rule, if Y is a subset of X, then X
determines Y.
• If X ⊇ Y then X → Y
Example:
• X = {a, b, c, d, e}
• Y = {a, b, c}
• Since Y is a subset of X, X → Y holds.
2. Augmentation Rule (IR2)
• In augmentation, if X determines
Y, then XZ determines YZ for any Z.
• If X → Y then XZ → YZ
• Example:
• For R(ABCD), if A → B then AC → BC
3. Transitive Rule (IR3)
• In the transitive rule, if X determines Y and Y
determines Z, then X must also determine Z.
• If X → Y and Y → Z then X → Z
4. Union Rule (IR4)
• The union rule says, if X determines Y and
X determines Z, then X must also determine Y and Z.
• If X → Y and X → Z then X → YZ
Proof:
1. X → Y (given)
2. X → Z (given)
3. X → XY (using IR2 on 1 by augmenting with X, where XX = X)
4. XY → YZ (using IR2 on 2 by augmenting with Y)
5. X → YZ (using IR3 (transitive rule) on 3 and 4)
5. Decomposition Rule (IR5)
• The decomposition rule is also known as the projective rule.
It is the reverse of the union rule.
• This rule says,
if X determines Y and Z, then X determines Y and X
determines Z separately.
• If X → YZ then X → Y and X → Z
• Proof:
1. X → YZ (given)
2. YZ → Y and YZ → Z (using IR1 (reflexive rule))
3. X → Y and X → Z (using IR3 (transitive rule) on 1
and 2)
6. Pseudo transitive Rule (IR6)
• In Pseudo transitive Rule, if X determines Y
and YZ determines W, then XZ determines W.
• If X   →   Y and YZ   →   W then XZ   →   W   
• Proof:
1. X → Y (given)
2. YZ → W (given)
3. XZ → YZ (using IR2 on 1 by augmenting with Z)
4. XZ→ W (using IR3 on 3 and 2)
Fully Functional Dependency
• An attribute is fully functionally dependent on
a set of attributes if it is functionally
dependent on that set and not on any of
its proper subsets.
Partial Dependency
Closure of an Attribute Set-
• The set of all attributes that can be
functionally determined from an attribute set
is called the closure of that attribute set.
• The closure of attribute set {X} is denoted as {X}+.
Steps to Find Closure of an Attribute Set-
• Step-01:
• Add the attributes contained in the attribute set for
which the closure is being calculated to the result set.
• Step-02:
• Repeatedly add to the result set the attributes that
can be functionally determined from the attributes
already contained in the result set, until no more
attributes can be added.
Example
Consider a relation R ( A , B , C , D , E , F , G ) with
the functional dependencies-
• A → BC
• BC → DE
• D→F
• CF → G
Now, let us find the closure of some attributes
and attribute sets-
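The two steps can be sketched in Python as below, applied to the relation and dependencies of this example. The function name `closure` and the choice of attribute sets A, D and BC are illustrative assumptions; the sketch also assumes single-letter attribute names so that a string like "BC" stands for the set {B, C}.

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) dependencies."""
    result = set(attrs)              # Step-01: start with the given attributes
    changed = True
    while changed:                   # Step-02: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# A -> BC, BC -> DE, D -> F, CF -> G
fds = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(sorted(closure("A", fds)))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(sorted(closure("D", fds)))   # ['D', 'F']
print(sorted(closure("BC", fds)))  # ['B', 'C', 'D', 'E', 'F', 'G']
```

Since the closure of A contains every attribute of R, A is a super key of R under these dependencies.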
• So, number of super keys possible = 2 x 2 x 2 x 2 = 16.
Total Number of Super Keys-
• There are 5 attributes in total in the given
relation, of which-
• 2 are essential attributes: A and B.
• The remaining 3 attributes are non-essential.
• Every super key contains the essential attributes, and each
non-essential attribute may or may not be included.
• So, number of super keys possible = 2 x 2 x 2 = 8.
Total Number of Super Keys-
• There are 10 attributes in total in the given relation,
of which-
• 3 are essential attributes: E, F and H.
• The remaining 7 attributes are non-essential.
• So, number of super keys possible
= 2 x 2 x 2 x 2 x 2 x 2 x 2 = 128.
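The counting argument can be cross-checked by brute force. The sketch below is an illustration only: the relation R(A, B, C, D, E) and its dependencies are hypothetical, chosen so that A and B are the essential attributes and together form the only candidate key, which is the situation the 2 x 2 x 2 count assumes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def count_super_keys(attributes, fds):
    """Count the attribute subsets whose closure is the whole relation."""
    count = 0
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            if closure(subset, fds) == set(attributes):
                count += 1
    return count


attributes = "ABCDE"
fds = [("AB", "C"), ("C", "D"), ("D", "E")]  # hypothetical dependencies

print(count_super_keys(attributes, fds))  # 8
print(2 ** (len(attributes) - 2))         # 8, i.e. 2 x 2 x 2 for the 3 non-essential attributes
```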
Canonical Cover
Need-
• Working with a set containing extraneous
functional dependencies increases the
computation time.
• Therefore, the given set is reduced by eliminating
the useless functional dependencies.
• This reduces the computation time, and working
with the irreducible set becomes easier.
Steps To Find Canonical Cover-

Step-01:
Write the given set of functional dependencies in such a way that each
functional dependency contains exactly one attribute on its right side.

Step-02:
Consider each functional dependency one by one from the set obtained in
Step-01.
Determine whether it is essential or non-essential.
To determine whether a functional dependency is essential or not, compute
the closure of its left side-
• Once considering that the particular functional dependency is present
in the set
• Once considering that the particular functional dependency is not
present in the set

Case-01: Results Come Out to be Same-
• If the results come out to be the same,
• It means that the presence or absence of that functional dependency does
not create any difference.
• Thus, it is non-essential.
• Eliminate that functional dependency from the set before checking the next one.

Case-02: Results Come Out to be Different-
• If the results come out to be different,
• It means that the presence or absence of that functional dependency
creates a difference.
• Thus, it is essential.
• Do not eliminate that functional dependency from the set.
• Mark that functional dependency as essential.

Step-03:
Consider the newly obtained set of functional dependencies after performing Step-02.
Check if there is any functional dependency that contains more than one attribute on its left side.

Case-01: No-
There exists no functional dependency containing more than one attribute on its left side.
In this case, the set obtained in Step-02 is the canonical cover.

Case-02: Yes-
• There exists at least one functional dependency containing more than one attribute on its left side.
• In this case, consider all such functional dependencies one by one.
• Check if their left side can be reduced.

Use the following steps to perform the check-
1. Consider a functional dependency.
2. Compute the closure of all the possible subsets of the left side of that functional dependency.
3. If any of the subsets produces the same closure as the entire left side, then
replace the left side with that subset.
4. Once every such functional dependency has been checked, the set obtained is the canonical cover.
Considering WZ → X
(WZ)+ = {W,Z}
      = {W,Z,X,Y}
Ignoring WZ → X
(WZ)+ = {W,Z}
      = {W,Z,Y}
      = {W,Z,Y,X}

The results are the same. Hence it can be eliminated.

Considering WZ → Y
(WZ)+ = {W,Z}
      = {W,Z,Y}
      = {W,Z,Y,X}
Ignoring WZ → Y (WZ → X is already eliminated)
(WZ)+ = {W,Z}

The results are different. It is an essential FD.

Considering Y → W
(Y)+ = {Y}
     = {Y,W}
     = {Y,W,X,Z}
Ignoring Y → W
(Y)+ = {Y}
     = {Y,X,Z}
     = {Y,X,Z,W}

The results are the same. It can be eliminated.

Considering Y → X
(Y)+ = {Y}
     = {Y,X,Z}
     = {Y,X,Z,W}
Ignoring Y → X (WZ → X and Y → W are already eliminated)
(Y)+ = {Y}
     = {Y,Z}

The results are different. It can't be eliminated.

Considering Y → Z
(Y)+ = {Y}
     = {Y,X,Z}
     = {Y,X,Z,W}
Ignoring Y → Z
(Y)+ = {Y}
     = {Y,X}
     = {Y,X,W}

The results are different. It can't be eliminated.

• Reduced functional dependencies are
X → W
WZ → Y
Y → X
Y → Z
• The left side of WZ → Y cannot be reduced, since (W)+ = {W} and (Z)+ = {Z}.
Canonical cover:

X → W
WZ → Y
Y → X
Y → Z
Problem: Find the minimal cover of the given set of functional dependencies
{A → BC, B → C, AB → D}

• Solution:
A → BC is decomposed into A → B and A → C.
So the FDs are
1: A → B
2: A → C
3: B → C
4: AB → D

1. A → B
Considering FD A → B (FDs 1, 2, 3, 4)
{A}+ = {A}
     = {A,B} -------- (A → B)
     = {A,B,D} ------ (AB → D)
     = {A,B,C,D} ---- (B → C)
Therefore {A}+ = {A,B,C,D}
Ignoring FD A → B (FDs 2, 3, 4)
{A}+ = {A}
     = {A,C} -------- (A → C)
The result is different. Hence it is essential.

2. A → C
Considering FD A → C (FDs 1, 2, 3, 4)
{A}+ = {A}
     = {A,C} -------- (A → C)
     = {A,B,C} ------ (A → B)
     = {A,B,C,D} ---- (AB → D)
Therefore {A}+ = {A,B,C,D}
Ignoring FD A → C (FDs 1, 3, 4)
{A}+ = {A}
     = {A,B} -------- (A → B)
     = {A,B,C} ------ (B → C)
     = {A,B,C,D} ---- (AB → D)
The result is the same. Hence it is non-essential and is eliminated.

3. B → C
Considering FD B → C (FDs 1, 3, 4)
{B}+ = {B}
     = {B,C} -------- (B → C)
Therefore {B}+ = {B,C}
Ignoring FD B → C (FDs 1, 4)
{B}+ = {B}
The result is different. Hence it is essential.

4. AB → D
Considering FD AB → D (FDs 1, 3, 4)
{AB}+ = {A,B}
      = {A,B,D} ------ (AB → D)
      = {A,B,C,D} ---- (B → C)
Therefore {AB}+ = {A,B,C,D}
Ignoring FD AB → D (FDs 1, 3)
{AB}+ = {A,B}
      = {A,B,C} ------ (B → C)
The result is different. Hence it is essential.

Check whether the left side of AB → D can be reduced
• AB → D
The closure of AB is {A,B,C,D}

The closure of A is
A+ = {A,B,C,D}

The closure of B is
B+ = {B,C}
Here the closure of the subset A is equal to the
closure of AB.
Hence AB → D can be replaced by A → D.
• Hence the minimal cover of F is

1. A → B
2. B → C
3. A → D
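The minimal cover of {A → BC, B → C, AB → D} can be cross-checked with a short Python sketch. It is a sketch under assumptions, not the procedure of the slides: it reduces left sides before dropping redundant dependencies (a common ordering), the names `closure` and `minimal_cover` are made up for illustration, and attributes are assumed to be single letters.

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Split right sides so that each dependency has exactly one attribute on the right.
    work = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]

    # Reduce left sides: drop an attribute if the rest still determines the right side.
    reduced = []
    for lhs, a in work:
        lhs = set(lhs)
        for b in sorted(lhs):
            if len(lhs) > 1 and a in closure(lhs - {b}, work):
                lhs.discard(b)
        reduced.append((frozenset(lhs), a))
    work = list(dict.fromkeys(reduced))  # remove duplicates, keep order

    # Drop every dependency that the remaining ones already imply.
    for fd in list(work):
        rest = [g for g in work if g != fd]
        if fd[1] in closure(fd[0], rest):
            work = rest
    return work


cover = minimal_cover([("A", "BC"), ("B", "C"), ("AB", "D")])
for lhs, rhs in cover:
    print("".join(sorted(lhs)), "->", rhs)
# A -> B
# B -> C
# A -> D
```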
Normal Forms
1NF Example
NOTE-
 
By default, every relation is in 1NF.
This is because formal definition of a relation states that value of all the
attributes must be atomic.
2NF
Example 2:
Example 3
• Candidate key:
{student_id, programming_language}
• Non-prime attribute: student_age
• The above relation is in 1NF because each attribute
contains atomic values. However, it is not in 2NF
because the non-prime attribute student_age
is dependent on student_id, which is a proper subset
of the candidate key.
This violates the rule for second normal form, which
says “no non-prime attribute should be dependent on
a part of a candidate key of the relation”.
Third Normal Form (3NF):
• A relation is in third normal form if it is in second
normal form and there is no transitive dependency
for non-prime attributes.
• A relation is in 3NF if at least one of the following
conditions holds for every non-trivial functional
dependency X –> Y:
• X is a super key.
• Y is a prime attribute (each element of Y is part of
some candidate key).
Here, emp_state, emp_city & emp_district are dependent on emp_zip,
and emp_zip is dependent on emp_id. That makes the non-prime attributes
(emp_state, emp_city & emp_district) transitively dependent on the super key
(emp_id). This violates the rule of 3NF.
Example 2 – Third Normal Form
Example 3
• Consider the relation PLAYER with relational
schema PLAYER (Player-no, Player-name, Team,
Team-color, Coach-no, Coach-name, Player-position,
Team-captain) and the following set of functional
dependencies:
• F = {Player-no → Player-name,
• Player-no → Player-position,
• Player-no → Team,
• Coach-no → Coach-name,
• Team → Team-color,
• Team → Coach-no,
• Team → Team-captain}
• Answer the questions given below:
• a) Is PLAYER in 2NF? If not, convert into 2NF.
• b) Is PLAYER in 3NF? If not, convert into 3NF.
a) Is PLAYER in 2NF?
Let us find the closure of the left-hand side of each FD in F.
• (Player-no)+ = Player-no, Player-name, Player-position, Team, Team-color,
Coach-no, Team-captain, Coach-name
• (Team)+ = Team, Team-color, Team-captain, Coach-no, Coach-name
• (Coach-no)+ = Coach-no, Coach-name
• From these closures, only Player-no can uniquely determine all the
attributes of PLAYER. Hence, Player-no is the only candidate key.
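These closures can be reproduced with a small Python sketch. The function name `closure` and the representation of F as pairs of sets are illustrative assumptions; the attributes and dependencies are those of the PLAYER example.

```python
def closure(attrs, fds):
    """Closure of a set of attribute names under (lhs_set, rhs_set) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


PLAYER = {"Player-no", "Player-name", "Team", "Team-color",
          "Coach-no", "Coach-name", "Player-position", "Team-captain"}

F = [
    ({"Player-no"}, {"Player-name"}),
    ({"Player-no"}, {"Player-position"}),
    ({"Player-no"}, {"Team"}),
    ({"Coach-no"}, {"Coach-name"}),
    ({"Team"}, {"Team-color"}),
    ({"Team"}, {"Coach-no"}),
    ({"Team"}, {"Team-captain"}),
]

for determinant in ("Player-no", "Team", "Coach-no"):
    result = closure({determinant}, F)
    verdict = "determines every attribute" if result == PLAYER else "is not a key"
    print(determinant, verdict, sorted(result))
```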
• As the key (Player-no) is a single, simple
attribute, there is no possibility of partial-key
dependencies. Hence, PLAYER is in 2NF.
b) Is PLAYER in 3NF?
• To answer this question, we need to check for non-key
dependencies or transitive dependencies. That is, we have
to look for dependencies of the form
Non-key attribute(s) → Non-key attribute(s)
• From the given set of functional dependencies F, we can
derive the following non-key dependencies:
• Team → Team-color, Team-captain, Coach-no, Coach-name
• Coach-no → Coach-name
• Hence, PLAYER is not in 3NF.
• Solution:
• Decompose PLAYER into more tables based on
the non-key dependencies. Then we get
the following tables:
• PLAYER (Player-no, Player-name, Player-position, Team)
• TEAM (Team, Team-color, Team-captain,
Coach-no, Coach-name)
• The key for PLAYER is Player-no, and all the
others are non-key attributes.
Hence, PLAYER is in 2NF (no partial
dependencies) and 3NF (no transitive
dependencies).
• The key for TEAM is Team. All the other attributes are non-key attributes and
depend on Team. Hence, TEAM is in 2NF.
• TEAM has the following transitive dependency:
Team → Coach-no → Coach-name.
• Hence, TEAM is not in 3NF. To convert, decompose TEAM as follows:
• TEAM (Team, Team-color, Team-captain, Coach-no)
• COACH (Coach-no, Coach-name)
• Now, TEAM and COACH are both in 2NF and 3NF.
The final set of decomposed tables that are in 3NF is
• PLAYER (Player-no, Player-name, Player-position, Team)
• TEAM (Team, Team-color, Team-captain,
Coach-no)
• COACH (Coach-no, Coach-name)
Boyce-Codd Normal Form
• For a table to satisfy the Boyce-Codd Normal
Form, it should satisfy the following two
conditions:
1.It should be in the Third Normal Form.
2.And, for any dependency A → B, A should be
a super key.
• The second point sounds a bit tricky, right? In
simple words, it means, that for a dependency A
→ B, A cannot be a non-prime attribute, if B is
a prime attribute
Boyce-Codd Normal Form
BCNF
• In the table above, student_id and subject together form the primary key, because
using student_id and subject, we can find all the columns of the table.
• One more important point to note here is that one professor teaches only one subject,
but one subject may have two different professors.
• Hence, there is a dependency between subject and professor here,
where subject depends on the professor name.
• This table satisfies the 1st Normal Form because all the values are atomic, column
names are unique and all the values stored in a particular column are of the same
domain.
• This table also satisfies the 2nd Normal Form as there is no partial dependency.
• And there is no transitive dependency, hence the table also satisfies the 3rd
Normal Form.
• But this table is not in Boyce-Codd Normal Form.
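The reason can be made concrete with a small Python sketch that lists every non-trivial dependency whose left side is not a super key. The column name `professor` and the helper `bcnf_violations` are assumptions for illustration; the two dependencies are the ones described in the bullets above.

```python
def closure(attrs, fds):
    """Closure of a set of attribute names under (lhs_set, rhs_set) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial dependencies whose left side is not a super key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != attributes]


R = {"student_id", "subject", "professor"}   # 'professor' is an assumed column name
F = [
    ({"student_id", "subject"}, {"professor"}),  # the primary key determines the professor
    ({"professor"}, {"subject"}),                # one professor teaches only one subject
]

for lhs, rhs in bcnf_violations(R, F):
    print(sorted(lhs), "->", sorted(rhs), "violates BCNF")
# ['professor'] -> ['subject'] violates BCNF
```

Here professor is not a super key, yet it determines subject, which is part of the key.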
Decomposition
• Decomposition is the process of dividing a single
relation into two or more sub-relations.
• A decomposition of a relation can be of
the following two kinds-
Lossless join decomposition
• Lossless join decomposition is also known
as non-additive join decomposition.
• This is because the relation obtained by
joining the sub-relations is the same as the
original relation.
• No extraneous tuples appear after joining
the sub-relations.
Lossy Join Decomposition
• Here, extraneous tuples are introduced by the
natural join of the sub-relations.
• Extraneous tuples make the identification of
the original tuples difficult.
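Problem-01:
• Consider a relation schema R ( A , B , C , D )
with the functional dependencies
A → B and C → D.
Determine whether the decomposition of R into
R1 ( A , B ) and R2 ( C , D ) is lossless or lossy.
• Solution:
• R1 ∩ R2 = { }, so R1 and R2 have no common attribute to join on.
• Hence the decomposition is lossy.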
Problem 2:
R(ABC) is decomposed into R1(AB) and R2(BC).
Check whether it is a lossy or lossless
decomposition, given the FD of R: {A → B}
• Solution:
• R1 U R2 = {A,B,C} = R
• R1 ∩ R2 = {B}
• B is not a key of either relation, since the only FD is A → B
• Hence the decomposition is lossy

Problem 3:
• Given R(ABC),
FD: {A → B}. Is the decomposition into R1(AB) and R2(AC)
lossy or lossless?
Solution:
• R1 U R2 = {A,B,C} = R
• R1 ∩ R2 = {A}
• A is a key of relation R1 since the FD is A → B
• Hence the decomposition is lossless

Problem 4:
Given R(ABCD), F: {A → B, A → C, C → D}.
Is the decomposition into R1(ABC) and R2(CD) lossless or lossy?
Solution:
• R1 U R2 = {A,B,C,D} = R
• R1 ∩ R2 = {C}
• C is a key of relation R2 since the FD is C → D
• Hence the decomposition is lossless

The same result using a table with one row per sub-relation:

      A    B    C    D
R1    α    α    α    -
R2    -    -    α    α

From FD: C → D
The C values of both rows are α.
Hence the D value of R1 can be changed to α.

      A    B    C    D
R1    α    α    α    α
R2    -    -    α    α

Here all the values in the row of R1 are α.

Hence the decomposed relation is lossless
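The test used in Problems 2 to 4 (the common attributes must be a key of R1 or of R2) can be sketched in Python. The function names are illustrative, attributes are assumed to be single letters, and the sketch takes for granted that R1 U R2 = R.

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary decomposition test: (R1 ∩ R2)+ must cover R1 or R2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


print(is_lossless("AB", "CD", [("A", "B"), ("C", "D")]))               # Problem-01: False
print(is_lossless("AB", "BC", [("A", "B")]))                           # Problem 2: False
print(is_lossless("AB", "AC", [("A", "B")]))                           # Problem 3: True
print(is_lossless("ABC", "CD", [("A", "B"), ("A", "C"), ("C", "D")]))  # Problem 4: True
```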


Problem 5:
• Given R(ABCDE),
F: {AB → CD, A → E, C → D}.
Is the decomposition into R1(ABC), R2(BCD) and R3(CDE)
lossless or lossy?

      A    B    C    D    E
R1    α    α    α    -    -
R2    -    α    α    α    -
R3    -    -    α    α    α

All values of the C attribute are α (FD: C → D).
Hence we can fill α in the D column.

      A    B    C    D    E
R1    α    α    α    α    -
R2    -    α    α    α    -
R3    -    -    α    α    α

AB → CD and A → E cannot be applied, because no two rows
agree on A. Here none of the rows R1, R2 or R3 consists
entirely of α.

Hence the decomposed relation is lossy
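The table method extends to any number of sub-relations. The Python sketch below is one possible way to mechanize it; the function name `chase_lossless`, the use of "a" for α, and the use of (row, attribute) pairs for the blank cells are assumptions made for illustration.

```python
def chase_lossless(attributes, fds, parts):
    """Table test for a lossless join; attributes are single letters."""
    # One row per sub-relation: "a" stands for alpha, every blank cell gets a unique symbol.
    rows = [{x: "a" if x in part else (i, x) for x in attributes}
            for i, part in enumerate(parts)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if r1 is r2 or any(r1[x] != r2[x] for x in lhs):
                        continue
                    # r1 and r2 agree on the left side, so make them agree on the right side.
                    for y in rhs:
                        if r1[y] != r2[y]:
                            keep = "a" if "a" in (r1[y], r2[y]) else r1[y]
                            drop = r2[y] if keep == r1[y] else r1[y]
                            for r in rows:
                                if r[y] == drop:
                                    r[y] = keep
                            changed = True
    # Lossless if some row consists entirely of alpha.
    return any(all(r[x] == "a" for x in attributes) for r in rows)


# Problem 4: lossless
print(chase_lossless("ABCD", [("A", "B"), ("A", "C"), ("C", "D")], ["ABC", "CD"]))          # True
# Problem 5: lossy
print(chase_lossless("ABCDE", [("AB", "CD"), ("A", "E"), ("C", "D")], ["ABC", "BCD", "CDE"]))  # False
```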


Dependency Preservation
Problem 1: Let R(A, B, C, D) be a relation with the
functional dependencies
{AB → C, C → D, D → A}.
Relation R is decomposed into
R1(A, B, C) and R2(C, D).
Check whether the decomposition is dependency
preserving or not.
Problem 1 Solution:
Consider R1(A,B,C)
1. Closure(A) = {A}
2. Closure(B) = {B}
3. Closure(C) = {C,D,A}
Since D is not in R1, remove D
Closure(C) = {C,A}
Hence C → A
4. Closure(AB) = {A,B,C,D}
Since D is not in R1, remove D
Closure(AB) = {A,B,C}
Hence AB → C
5. Closure(BC) = {B,C,D,A}
Since D is not in R1, remove D
Closure(BC) = {B,C,A}
Hence BC → A
6. Closure(AC) = {A,C,D}
Since D is not in R1, remove D
Closure(AC) = {A,C}

F1 = {C → A, AB → C, BC → A}

Consider R2(C,D)
1. Closure(C) = {C,D,A}
Since A is not in R2, remove A
Closure(C) = {C,D}
Hence C → D
2. Closure(D) = {D,A}
Since A is not in R2, remove A
Closure(D) = {D}
3. Closure(CD) = {C,D,A}
Since A is not in R2, remove A
Closure(CD) = {C,D}

Hence F2 = {C → D}
• The dependencies of the original relation are
{AB → C, C → D, D → A}

AB → C is present in F1
C → D is present in F2
D → A is not preserved: it cannot be derived from F1 U F2.

F1 U F2 is therefore not equivalent to F.
Hence the given decomposition is not dependency
preserving.
Problem 2: Dependency Preserving
Consider R1(A,B,C)
1. Closure(A) = {A,B,D}
= {A,B,D,E}
Restricted to R1: {A,B}
Hence A → B
2. Closure(B) = {B,E}
Restricted to R1: {B}
3. Closure(C) = {C}
4. Closure(AB) = {A,B}
= {A,B,D,E}
Restricted to R1: {A,B}
5. Closure(AC) = {A,C}
= {A,B,C,D,E}
Restricted to R1: {A,B,C}
Hence AC → B
6. Closure(BC) = {B,C}
= {B,C,E}
Restricted to R1: {B,C}
Consider R2(A,D)
1. Closure(A) = {A,B,D,E}
Restricted to R2: {A,D}
Hence A → D
2. Closure(D) = {D}
3. Closure(AD) = {A,B,D,E}
Restricted to R2: {A,D}
Consider R3(B,D,E)
1. Closure(B) = {B,E}
Hence B → E
2. Closure(D) = {D}
3. Closure(E) = {E}
4. Closure(BD) = {B,D}
= {B,D,E}
Hence BD → E
5. Closure(DE) = {D,E}
6. Closure(BE) = {B,E}
• F1 = {A → B, AC → B}
• F2 = {A → D}
• F3 = {B → E, BD → E}
Consider the FDs of the original relation
1. A → BD
2. B → E
1. A → BD follows from F1 (A → B) U F2 (A → D)
2. B → E is in F3
Hence the given decomposition is dependency
preserving
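Both problems can be cross-checked with a Python sketch. It avoids listing F1, F2, ... explicitly: for each original dependency it grows the left side using only what each sub-relation can derive on its own attributes. This is a different but equivalent route to the same answer, and the function name `lost_dependencies` is an assumption for illustration; attributes are assumed to be single letters.

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lost_dependencies(fds, parts):
    """Return the dependencies that the decomposition does not preserve."""
    lost = []
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for part in parts:
                part = set(part)
                # Attributes derivable inside this sub-relation alone.
                gained = closure(result & part, fds) & part
                if not gained <= result:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            lost.append((lhs, rhs))
    return lost


# Problem 1: D -> A is lost
print(lost_dependencies([("AB", "C"), ("C", "D"), ("D", "A")], ["ABC", "CD"]))  # [('D', 'A')]
# Problem 2: nothing is lost
print(lost_dependencies([("A", "BD"), ("B", "E")], ["ABC", "AD", "BDE"]))       # []
```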
Multi Valued Dependency
• A table is said to have a multi-valued dependency if the
following conditions are true:
1. For a dependency A → B, if multiple values of B exist
for a single value of A, then the table may have a
multi-valued dependency.
2. Also, a table should have at least 3 columns for it to
have a multi-valued dependency.
3. And, for a relation R(A,B,C), if there is a multi-valued
dependency between A and B, then B and C should be
independent of each other.
• If all these conditions are true for a relation (table), it
is said to have a multi-valued dependency.
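The independence condition can be tested on sample rows. The Python sketch below uses the standard tuple-swapping definition: whenever two rows agree on A, the row that takes B from the first and the remaining columns from the second must also be present. The helper name `mvd_holds`, the column names and all data values are illustrative assumptions.

```python
def mvd_holds(rows, attributes, lhs, rhs):
    """Check the multi-valued dependency lhs ->> rhs on the given rows."""
    rest = [x for x in attributes if x not in lhs and x not in rhs]
    present = {tuple(r[x] for x in attributes) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[x] == t2[x] for x in lhs):
                # Take lhs and rhs values from t1 and the remaining values from t2.
                mixed = {**{x: t1[x] for x in lhs + rhs}, **{x: t2[x] for x in rest}}
                if tuple(mixed[x] for x in attributes) not in present:
                    return False
    return True


columns = ["Course_code", "Professor", "Reference_book"]
full = [  # hypothetical rows: every professor of C-100 paired with every book
    {"Course_code": "C-100", "Professor": "Mr. X", "Reference_book": "B1"},
    {"Course_code": "C-100", "Professor": "Mr. X", "Reference_book": "B2"},
    {"Course_code": "C-100", "Professor": "Mr. Y", "Reference_book": "B1"},
    {"Course_code": "C-100", "Professor": "Mr. Y", "Reference_book": "B2"},
]

print(mvd_holds(full, columns, ["Course_code"], ["Professor"]))       # True
print(mvd_holds(full[:3], columns, ["Course_code"], ["Professor"]))   # False: (Mr. Y, B2) is missing
```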
4NF
Rules for 4th Normal Form
For a table to satisfy the Fourth Normal Form, it
should satisfy the following two conditions:
1.It should be in the Boyce-Codd Normal Form.
2.And, the table should not have any Multi-
valued Dependency.
4NF
4NF
Join Dependency
• Join dependency is a further generalization of multivalued
dependency.
• If the join of R1 and R2 over C is equal to relation R, then we
can say that a join dependency (JD) exists,
• where R1 and R2 are the decompositions R1(A, B, C) and
R2(C, D) of a given relation R(A, B, C, D).
• Equivalently, R1 and R2 are a lossless decomposition of R.
• A JD ⋈ {R1, R2, ..., Rn} is said to hold over a relation R if R1,
R2, ..., Rn is a lossless-join decomposition of R.
• *((A, B, C), (C, D)) is a JD of R if the join of the two
projections over their common attribute C is equal to the relation R.
• Here, *(R1, R2, R3, ...) indicates that the relations R1, R2,
R3 and so on are a JD of R.
5NF
• Fifth normal form (5NF) is also known as project-join
normal form (PJ/NF). It is designed to
minimize redundancy in relational databases that
record multi-valued facts by isolating semantically
related multiple relationships.
• A relation R is in 5NF if and only if the following
conditions are satisfied:
• The relation R is already in 4NF.
• The relation R cannot be decomposed any further
without loss (join dependency).
• In the above table, John takes both Computer and
Math classes for Semester 1 but he doesn't take the Math
class for Semester 2. In this case, the combination of all
these fields is required to identify valid data.
• Suppose we add a new semester, Semester 3, but do
not yet know the subject or who will be taking
that subject, so we leave Lecturer and Subject as NULL.
But all three columns together act as the primary key,
so we can't leave the other two columns blank.
• So to bring the above table into 5NF, we can
decompose it into three relations P1, P2 & P3:
First Normal Form-1NF
1NF
2NF
• A relation is in 2NF if it is in 1NF and every
non-key attribute is fully dependent on each candidate
key of the relation.
• That means each table in second normal form
describes a single entity, and every non-key
attribute depends on the whole key of that entity.
• Put another way, if the primary key is made up of
more than one attribute, the table has to be checked
for partial dependencies and, if there are any,
converted into second normal form.
2NF
• Example -
The "Office" table shown in First Normal
Form needs to be converted into Second Normal
Form.
• Functional Dependency in "Office" Table
(Department_id, Employee_id) →
(Department_name, Employee_name, Salary)
• Partial Dependency in "Office" Table
Department_id → Department_name
Employee_id → (Employee_name, Salary)
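Partial dependencies like these can be picked out mechanically: they are the dependencies whose left side is a proper subset of the key. A minimal Python sketch, assuming a single candidate key and that the dependencies are listed explicitly (the helper name `partial_dependencies` is made up for illustration):

```python
def partial_dependencies(key, fds):
    """Dependencies of non-key attributes on a proper subset of the key."""
    return [(lhs, rhs - key) for lhs, rhs in fds
            if lhs < key and rhs - key]   # lhs < key means "proper subset"


key = {"Department_id", "Employee_id"}
fds = [
    (key, {"Department_name", "Employee_name", "Salary"}),
    ({"Department_id"}, {"Department_name"}),
    ({"Employee_id"}, {"Employee_name", "Salary"}),
]

for lhs, rhs in partial_dependencies(key, fds):
    print(sorted(lhs), "->", sorted(rhs))
# ['Department_id'] -> ['Department_name']
# ['Employee_id'] -> ['Employee_name', 'Salary']
```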

• After conversion to 2NF the "Office" table is divided into two
tables, which are:
2NF
3NF
• A relation is in third normal form if it is in 2NF and every
non-key attribute of the relation is non-transitively dependent on
each candidate key of the relation.
• Example -
Library(Book_id, Book_name, Author_name,
Bookshelf_number, Book_category)

• Functional Dependency
Book_id → (Book_name, Author_name, Bookshelf_number,
Book_category)

• Transitive Dependency
Bookshelf_number → Book_category
3NF
3NF
3NF
BCNF
• A table is in BCNF when every determinant in
the table is a candidate key.
BCNF
BCNF
4NF
• Fourth Normal Form is related to multi-value dependency. Under
fourth normal form, a record type should not contain two or more
independent multi-value facts about an entity. In addition, the
record must satisfy third normal form.
A multi-value dependency exists when
• There are at least three attributes A, B and C in a relation.
• For each value of A there is a well-defined set of values for B, and
a well-defined set of values for C.
• The set of values of B is independent of the set of values of C.

If a table is in 4NF then -
• All attributes must be dependent on the primary key, but they must
be independent of each other.
• No row may contain two or more multivalued facts about an entity.
4NF
The table reflects the following conditions:
• A course can be taught by one or many professors, but each
professor can teach only one course. For example, course
"C-100" is taught by the professors "Mr. X" and "Mr. Y".
(So, Course_code → → Professor)
• A professor can refer to one or many textbooks for a particular
course.
• A textbook can be referred to by one or many professors allocated to a
particular course.
• Textbooks referred to for a particular course cannot be referred to for
another course. (So, Course_code → → Reference_book)
4NF
5NF

• A table is in Fifth Normal Form (5NF) or
Project-Join Normal Form (PJNF) if it is in 4NF
and it cannot have a lossless decomposition
into any number of smaller tables.
The fifth normal form deals with join
dependencies, which are a generalisation of the
multi-value dependency.
5NF
