Chapter 4
A functional dependency is a relationship between two attributes (or sets of attributes) of a
relation. It typically exists between the primary key and a non-key attribute within a table.
X → Y
The left side of an FD is known as the determinant, and the right side is known as the
dependent. For example, assume we have an employee table with the attributes Emp_Id,
Emp_Name and Emp_Address. Here the Emp_Id attribute uniquely identifies the Emp_Name attribute
of the employee table, because if we know the Emp_Id we can tell the employee name associated
with it.
Functional dependency can be written as: Emp_Id → Emp_Name. We can say that Emp_Name is
functionally dependent on Emp_Id.
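The definition above can be tested against sample data. The following is a minimal sketch, not a definitive implementation: the function name fd_holds and the sample rows are hypothetical and do not come from the text. Note that such a test can only show that a dependency is violated by the data at hand; whether a dependency holds for the schema in general is a design decision.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs is satisfied by every row.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (not taken from the text).
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Delhi"},
]

print(fd_holds(employees, ["Emp_Id"], ["Emp_Name"]))  # True
print(fd_holds(employees, ["Emp_Name"], ["Emp_Id"]))  # False: two employees share a name
```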
Functional Dependency
Fully-Functional Dependency
Transitive Dependency
Multivalued Dependency
Partial Dependency
P -> Q
Employee table
E01 Amit 28
E02 Rohit 31
Supply table
supplier_id   item_id   price
1             1         540
2             1         545
1             2         200
2             2         201
1             1         540
2             2         201
3             1         542
From the table, we can clearly see that neither supplier_id nor item_id alone can uniquely
determine the price, but supplier_id and item_id together can do so. So we can say
that price is fully functionally dependent on { supplier_id, item_id }. The fully
functional dependency is therefore { supplier_id, item_id } → price.
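The claim about the Supply table can be verified mechanically. The sketch below uses the rows exactly as listed in the table; the helper names fd_holds and is_fully_functional are illustrative assumptions, not part of the text.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by every row (rows are dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_fully_functional(rows, lhs, rhs):
    """lhs -> rhs holds, and no proper non-empty subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, list(subset), rhs):
                return False
    return True


# Rows of the Supply table as given above.
columns = ("supplier_id", "item_id", "price")
data = [(1, 1, 540), (2, 1, 545), (1, 2, 200), (2, 2, 201),
        (1, 1, 540), (2, 2, 201), (3, 1, 542)]
supply = [dict(zip(columns, values)) for values in data]

print(fd_holds(supply, ["supplier_id"], ["price"]))  # False
print(fd_holds(supply, ["item_id"], ["price"]))      # False
print(is_fully_functional(supply, ["supplier_id", "item_id"], ["price"]))  # True
```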
Transitive Dependency
When a functional dependency arises indirectly, through another attribute, it is called a
transitive dependency: if X->Y and Y->Z hold, then X->Z is a transitive dependency.
A functional dependency X->Y is a partial dependency if some attribute can be removed
from X and the dependency still holds.
The table above shows a partial functional dependency.
Multivalued Dependency
When the existence of one or more rows in a table implies the existence of one or more other
rows in the same table, a multivalued dependency occurs.
P->->Q
Q->->R
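A multivalued dependency can also be checked against sample data. The sketch below tests P->->Q using the standard definition: within each group of rows sharing the same P value, every Q value must appear together with every value of the remaining attributes. The function name mvd_holds and the sample rows are hypothetical.

```python
from itertools import product


def mvd_holds(rows, attrs, lhs, rhs):
    """Return True if the multivalued dependency lhs ->-> rhs holds in rows.

    attrs lists all attributes of the relation; rows are dicts.
    """
    rest = [a for a in attrs if a not in lhs and a not in rhs]
    groups = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        z = tuple(row[a] for a in rest)
        groups.setdefault(x, set()).add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        # Every combination of a Y value and a Z value must be present.
        if pairs != set(product(ys, zs)):
            return False
    return True


# Hypothetical relation over attributes P, Q, R.
attrs = ["P", "Q", "R"]
rows = [
    {"P": "p1", "Q": "q1", "R": "r1"},
    {"P": "p1", "Q": "q1", "R": "r2"},
    {"P": "p1", "Q": "q2", "R": "r1"},
    {"P": "p1", "Q": "q2", "R": "r2"},
]

print(mvd_holds(rows, attrs, ["P"], ["Q"]))       # True
print(mvd_holds(rows[:-1], attrs, ["P"], ["Q"]))  # False: (q2, r2) is missing
```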
A large database defined as a single relation may result in data duplication. This repetition of data
may result in:
To handle these problems, we should analyze and decompose the relations with redundant data
into smaller, simpler, and well-structured relations that satisfy desirable properties.
Normalization is a process of decomposing the relations into relations with fewer attributes.
What is Normalization?
The main reason for normalizing relations is to remove these anomalies. Failure to eliminate
anomalies leads to data redundancy and can cause data integrity and other problems as the database
grows. Normalization consists of a series of guidelines that help you create a good
database structure.
Insertion Anomaly: An insertion anomaly occurs when a new tuple cannot be inserted into a
relation because some of its data is missing.
Deletion Anomaly: A deletion anomaly occurs when deleting data
results in the unintended loss of some other important data.
Update Anomaly: An update anomaly occurs when changing a single data value requires
multiple rows of data to be updated.
Normalization works through a series of stages called normal forms. The normal forms apply to
individual relations. A relation is said to be in a particular normal form if it satisfies the
constraints of that form.
Normal Form   Description
2NF           A relation is in 2NF if it is in 1NF and all non-key attributes are fully
              functionally dependent on the primary key.
14 John 7272826385, 9064738238 UP
The decomposition of the EMPLOYEE table into 1NF has been shown below:
14 John 7272826385 UP
14 John 9064738238 UP
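The step from the multi-valued row to the two atomic rows above can be sketched in code. The column names EMP_ID, EMP_NAME, EMP_PHONE and EMP_STATE are assumptions, since the table header is not shown here, and to_1nf is a hypothetical helper.

```python
def to_1nf(rows, multi_attr):
    """Emit one row per value of the multi-valued attribute multi_attr."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = dict(row)           # copy the other attributes unchanged
            new_row[multi_attr] = value   # keep exactly one atomic value
            flat.append(new_row)
    return flat


# The EMPLOYEE row from the text, with assumed column names.
employee = [
    {"EMP_ID": 14, "EMP_NAME": "John",
     "EMP_PHONE": ["7272826385", "9064738238"], "EMP_STATE": "UP"},
]

for row in to_1nf(employee, "EMP_PHONE"):
    print(row["EMP_ID"], row["EMP_NAME"], row["EMP_PHONE"], row["EMP_STATE"])
# 14 John 7272826385 UP
# 14 John 9064738238 UP
```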
StudentProject
The table above has partial dependencies. The prime (key) attributes
are StudentID and ProjectID.
A non-prime attribute is partially dependent when it is functionally dependent on only part
of a candidate key. Here the non-prime attributes are StudentName and ProjectName.
StudentName can be determined by StudentID alone, which is a partial dependency.
ProjectName can be determined by ProjectID alone, which is also a partial dependency.
Therefore, the StudentProject relation violates 2NF and is considered a bad
database design.
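The reasoning above can be expressed as a small check. This is a simplified sketch: it assumes a single candidate key and only inspects the dependencies that are listed explicitly. The function name partial_dependencies is hypothetical; the attribute names come from the StudentProject example.

```python
def partial_dependencies(key, fds):
    """List (determinant, attribute) pairs where a non-prime attribute
    depends on a proper subset of the key.

    key is a set of attribute names; fds is a list of (lhs, rhs) set pairs.
    """
    key = frozenset(key)
    found = []
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        if lhs < key:  # the determinant is only part of the key
            for attr in sorted(rhs):
                if attr not in key:  # the dependent is a non-prime attribute
                    found.append((sorted(lhs), attr))
    return found


key = {"StudentID", "ProjectID"}
fds = [
    ({"StudentID"}, {"StudentName"}),
    ({"ProjectID"}, {"ProjectName"}),
]

for determinant, attribute in partial_dependencies(key, fds):
    print(determinant, "->", attribute)
# ['StudentID'] -> StudentName
# ['ProjectID'] -> ProjectName
```

A non-empty result means the relation is not in 2NF under the stated assumptions.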
To remove the partial dependencies and the 2NF violation, decompose the table above.
StudentInfo
ProjectID ProjectName
P09 Geo Location
P07 Cluster Exploration
P03 IoT Devices
P05 Cloud Deployment
A relation is in third normal form if it holds at least one of the following conditions for every
non-trivial functional dependency X → Y.
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
Put another way, the table or relation should not contain any transitive dependency of a
non-prime attribute on a candidate key.
Example of Third Normal Form
Let us consider the below table ‘TEACHER_DETAILS’ to understand the Third Normal Form
better.
The candidate key in the above table is ID. The functional dependency set can be
defined as ID->NAME, ID->SUBJECT, ID->STATE, STATE->COUNTRY.
If A->B and B->C are two functional dependencies, then A->C is called a
transitive dependency. For the above relation, ID->STATE and STATE->COUNTRY
hold, so COUNTRY is transitively dependent on ID. This does not
satisfy the conditions of the Third Normal Form. To transform the relation into Third
Normal Form, we split it into two tables: TEACHER_DETAILS keeps ID, NAME, SUBJECT and
STATE, and a new table holds STATE and COUNTRY with STATE as the primary key.
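The 3NF conditions can be checked with an attribute-closure computation. The sketch below uses the functional dependencies stated for TEACHER_DETAILS; the function names closure and violates_3nf are hypothetical, and the candidate keys are passed in by hand rather than derived.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_3nf(all_attrs, candidate_keys, fds):
    """Return the FDs X -> Y where X is not a super key and Y is not prime."""
    prime = set().union(*candidate_keys)
    bad = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency
        is_super_key = closure(lhs, fds) >= set(all_attrs)
        if not is_super_key and not (set(rhs) - set(lhs)) <= prime:
            bad.append((lhs, rhs))
    return bad


attrs = ("ID", "NAME", "SUBJECT", "STATE", "COUNTRY")
fds = [(("ID",), ("NAME",)), (("ID",), ("SUBJECT",)),
       (("ID",), ("STATE",)), (("STATE",), ("COUNTRY",))]

print(sorted(closure(("ID",), fds)))       # COUNTRY is reached through STATE
print(violates_3nf(attrs, [("ID",)], fds))  # [(('STATE',), ('COUNTRY',))]

# After the decomposition, neither table has a violating dependency.
print(violates_3nf(("ID", "NAME", "SUBJECT", "STATE"), [("ID",)], fds[:3]))  # []
print(violates_3nf(("STATE", "COUNTRY"), [("STATE",)], fds[3:]))             # []
```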
Below are the tables after normalization to the Third Normal Form.
TEACHER_DETAILS:
STATE_COUNTRY:
STATE COUNTRY
Gujarat INDIA
Punjab INDIA
Maharashtra INDIA
Bihar INDIA
4.2.4. Boyce-Codd Normal Form (BCNF)
BCNF is an advanced version of 3NF. It is stricter than 3NF.
A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key
of the table.
For BCNF, the table should be in 3NF, and for every FD the LHS must be a super key.
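The BCNF condition reduces to one closure test per dependency. The sketch below applies it to a hypothetical relation that is not taken from the text: R(Student, Course, Teacher) with {Student, Course} -> Teacher and Teacher -> Course. This relation is in 3NF (Course is a prime attribute) but not in BCNF, which illustrates why BCNF is stricter. The function names are also assumptions.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose left-hand side is not a super key."""
    bad = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency
        if not closure(lhs, fds) >= set(all_attrs):
            bad.append((lhs, rhs))
    return bad


# Hypothetical example relation and dependencies.
attrs = ("Student", "Course", "Teacher")
fds = [
    (("Student", "Course"), ("Teacher",)),
    (("Teacher",), ("Course",)),
]

print(bcnf_violations(attrs, fds))  # [(('Teacher',), ('Course',))]
```

An empty result would mean every determinant is a super key, so the relation is in BCNF with respect to the listed dependencies.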
Advantages of Normalization
Disadvantages of Normalization
You cannot start building the database before knowing what the user needs.
Performance may degrade when relations are normalized to higher normal forms such as
4NF and 5NF, because queries then need more joins.
It is very time-consuming and difficult to normalize relations of a higher degree.
Careless decomposition may lead to a bad database design, leading to serious problems.