DS Unit 2
Module-III
Functional Dependency
A functional dependency (FD) is a relationship between two attributes (or sets of
attributes) of a table, in which the value of one determines the value of the other.
It typically exists between the primary key and a non-key attribute within a table.
X → Y
The left side of an FD is known as the determinant; the right side is known as the
dependent.
For example:
Assume we have an employee table with attributes: Emp_Id, Emp_Name,
Emp_Address.
Here, the Emp_Id attribute uniquely identifies the Emp_Name attribute of the employee
table: if we know the Emp_Id, we can tell the employee name associated with it.
The functional dependency can be written as:
Emp_Id → Emp_Name
We can say that Emp_Name is functionally dependent on Emp_Id.
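As a rough illustration of what "functionally dependent" means, the sketch below checks whether a dependency holds in a set of rows: X → Y holds if no two rows agree on X but differ on Y. The function name `holds` and the sample rows are hypothetical and are not part of the original notes.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The first value seen for a key is remembered; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for the employee table (illustration only).
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Mumbai"},
]

fd_id_to_name = holds(employees, ["Emp_Id"], ["Emp_Name"])
fd_name_to_id = holds(employees, ["Emp_Name"], ["Emp_Id"])
```

With these rows, Emp_Id → Emp_Name holds, while Emp_Name → Emp_Id does not, because two employees share a name. Note that a check on sample data can only disprove a dependency; whether an FD truly holds is a property of the schema's intended meaning.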
The above table is not normalized. We will see the problems that we face when the
table is not normalized.
Departments Table:
deptNum deptName deptCity deptCountry
3 HR Berlin Germany
EmpDept Table:
(row ID) empNum deptNum
1 1001 1
2 1002 2
3 1009 3
4 1007 4
5 1007 3
Here, we can observe that the table in 1NF has been split into three different
tables. The Employees table is an entity that represents all the employees of a
company, and its attributes describe the properties of each employee. The primary key
for this table is empNum.
Similarly, the Departments table is an entity about all the departments in a company
and its attributes describe the properties of each department. The primary key for
this table is the deptNum.
In the third table, we have combined the primary keys of both tables. The primary
keys of the Employees and Departments tables act as foreign keys in this third table.
If the user wants output similar to what we had in 1NF, the user has to join all
three tables using these keys.
A sample query would look as shown below:
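-- Columns are qualified because empNum and deptNum each exist in two tables.
-- WITH UR (uncommitted read) is a DB2 isolation clause; omit it on other databases.
SELECT A.empNum, A.lastName, A.firstName, B.deptNum, B.deptName, B.deptCity, B.deptCountry
FROM Employees A
JOIN EmpDept C ON A.empNum = C.empNum
JOIN Departments B ON B.deptNum = C.deptNum
WITH UR;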
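The three-table join can be tried out with a minimal sketch using Python's built-in sqlite3 module. The schema below is an assumption based on the column names in the query. The employee names are placeholders rather than data from the original tables, and the DB2-specific WITH UR clause is dropped because SQLite does not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (empNum INTEGER PRIMARY KEY, lastName TEXT, firstName TEXT);
CREATE TABLE Departments (deptNum INTEGER PRIMARY KEY, deptName TEXT,
                          deptCity TEXT, deptCountry TEXT);
-- Mapping table: its columns are foreign keys to the two tables above.
CREATE TABLE EmpDept (
    empNum  INTEGER REFERENCES Employees(empNum),
    deptNum INTEGER REFERENCES Departments(deptNum),
    PRIMARY KEY (empNum, deptNum)
);
""")

# Placeholder employee names (assumed for illustration only).
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)",
                 [(1009, "LastA", "FirstA"), (1007, "LastB", "FirstB")])
conn.execute("INSERT INTO Departments VALUES (3, 'HR', 'Berlin', 'Germany')")
conn.executemany("INSERT INTO EmpDept VALUES (?, ?)", [(1009, 3), (1007, 3)])

rows = conn.execute("""
SELECT A.empNum, A.lastName, A.firstName,
       B.deptNum, B.deptName, B.deptCity, B.deptCountry
FROM Employees A
JOIN EmpDept C ON A.empNum = C.empNum
JOIN Departments B ON B.deptNum = C.deptNum
ORDER BY A.empNum
""").fetchall()

for row in rows:
    print(row)
```

Each result row combines one employee with one of that employee's departments, which is the shape the single 1NF table had before it was split.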
#3) 3NF (Third Normal Form)
By definition, a table is considered to be in third normal form if the table/entity
is already in second normal form and every non-key column of the table/entity is
non-transitively dependent on the primary key, that is, it depends on the primary key
directly and not through another non-key column.
Let's understand non-transitive dependency with the help of the following example.
In the above example, employees with empNum 1001 and 1007 work in two different
departments. Each department has a department head, and a department can have more
than one head. For example, Raymond and Samara are both heads of the Accounts
department.
In this case, the combination of empNum and deptName is a super key, which implies
that deptName is a prime attribute. Based on these two columns, we can identify every
single row uniquely.
Also, deptName depends on deptHead (deptHead → deptName), and deptHead is a non-prime
attribute that is not a super key. BCNF requires the determinant of every non-trivial
functional dependency to be a super key, so this dependency disqualifies the table
from being in BCNF.
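The BCNF test can be sketched with an attribute-closure routine: a dependency X → Y violates BCNF when it is non-trivial and the closure of X does not cover every attribute. The sketch reduces the table to three attributes and assumes that (empNum, deptName) determines deptHead, following the statement that these two columns identify every row. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List the non-trivial FDs whose determinant is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


# Assumed, simplified schema and dependencies for the example above.
attrs = {"empNum", "deptName", "deptHead"}
fds = [
    (("empNum", "deptName"), ("deptHead",)),  # the key determines the remaining column
    (("deptHead",), ("deptName",)),           # each head belongs to one department
]

violations = bcnf_violations(attrs, fds)
print(violations)
```

Only deptHead → deptName is reported, since the closure of deptHead does not include empNum, which is why the table has to be decomposed.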
To solve this we will break the table into three different tables as mentioned below:
Employees Table:
empNum firstName empCity deptNum
Departments Table:
deptNum deptName deptHead
D1 Accounts Raymond
D2 Technology Donald
D1 Accounts Samara
D3 HR Elizabeth
D4 Infrastructure Tom
#5) 4NF (Fourth Normal Form)
By definition, a table is in Fourth Normal Form if it is in BCNF and does not contain
two or more independent multi-valued facts (multi-valued dependencies) about the
entity it describes.
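A small sketch may make "independent multi-valued facts" concrete. The rows below are hypothetical: one employee has two skills and two languages that have nothing to do with each other, so a single table must store every combination. Splitting the table into two and joining them again on the employee reproduces the original rows, which suggests the split is safe.

```python
# Hypothetical rows (employee, skill, language): skills and languages are independent.
emp_skill_lang = {
    ("E1", "SQL", "English"),
    ("E1", "SQL", "German"),
    ("E1", "Java", "English"),
    ("E1", "Java", "German"),
}

# Decompose into two tables, one per independent fact.
emp_skill = {(emp, skill) for emp, skill, _ in emp_skill_lang}
emp_lang = {(emp, lang) for emp, _, lang in emp_skill_lang}

# Natural join of the two tables on the employee column.
rejoined = {
    (emp, skill, lang)
    for emp, skill in emp_skill
    for emp2, lang in emp_lang
    if emp == emp2
}

print(len(emp_skill_lang), len(emp_skill) + len(emp_lang))
print(rejoined == emp_skill_lang)
```

The two smaller tables together hold as many rows as the original in this tiny case, but they grow additively rather than multiplicatively as more skills or languages are added.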
#6) 5NF (Fifth Normal Form)
A table can be considered to be in Fifth Normal Form only if it satisfies the
conditions for Fourth Normal Form and cannot be broken down further into smaller
tables without loss of data, other than by decompositions that follow from its
candidate keys.
Questions And Answers
Q #1) What is Normalization in a Database?
Answer: Database Normalization is a design technique. Using it, we can design or
re-design schemas in the database to reduce redundant data and undesirable
dependencies by breaking the data into smaller and more relevant tables.
Q #2) What are the different types of Normalization?
Answer: Following are the different types of normalization techniques that can be
employed to design database schemas:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF, also known as 3.5NF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
Q #3) What is the Purpose of Normalization?
Answer: The primary purpose of normalization is to reduce data redundancy, i.e.,
each piece of data should be stored only once. This avoids the data anomalies that
can arise when the same data is stored in two different places but a change is
applied to only one of them.
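The anomaly described in this answer can be shown with a short sketch. The rows are hypothetical: in the redundant layout the department city is repeated on every employee row, so an update that touches only one copy leaves the data inconsistent, while the normalized layout stores the city once.

```python
# Redundant layout (hypothetical rows): deptCity is repeated per employee.
redundant = [
    {"empNum": 1001, "deptName": "HR", "deptCity": "Berlin"},
    {"empNum": 1007, "deptName": "HR", "deptCity": "Berlin"},
]
redundant[0]["deptCity"] = "Munich"  # only one copy is updated
cities_for_hr = {r["deptCity"] for r in redundant if r["deptName"] == "HR"}
inconsistent = len(cities_for_hr) > 1  # two different cities for the same department

# Normalized layout: the city is stored once and referenced by department name.
departments = {"HR": {"deptCity": "Berlin"}}
employees = [
    {"empNum": 1001, "deptName": "HR"},
    {"empNum": 1007, "deptName": "HR"},
]
departments["HR"]["deptCity"] = "Munich"  # a single update
cities_after = {departments[e["deptName"]]["deptCity"] for e in employees}

print(inconsistent, cities_after)
```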
Q #4) What is Denormalization?
Answer: Denormalization is a technique used to improve the read performance of a
database. It deliberately adds redundant data to the database, in contrast to
normalization, which removes redundancy.
This is done in large databases where executing a JOIN to get data from multiple
tables is expensive. Redundant data are therefore stored in multiple tables to
avoid JOIN operations.