Functional Dependencies and Normalization For Relational Databases
A relation schema defines the design and structure of a relation: it consists of
the relation name and a set of attributes (also called field names or column
names). Every attribute has an associated domain.
Whenever we form a relation schema, its attributes should have some meaning
with respect to one another. This meaning is called the semantics of the
relation; it specifies how one attribute relates to another.
Example:
Design guideline 1:
Design a relation schema so that its meaning is easy to understand and explain.
To achieve this, do not combine attributes drawn from different entity types
and relationship types into a single relation.
2. Reducing redundant values in tuples.
When student and department attributes are stored in the same relation, there
may be N students in one department, so the Dept No and Dept Name values are
repeated N times, which leads to data redundancy.
Another problem is update anomalies. An insertion anomaly occurs when we try to
insert a new department that has no students yet: there is no student tuple in
which to store it.
A deletion anomaly occurs when we delete the last student of a department: all
information about that department is deleted with it.
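The redundancy and the deletion anomaly described above can be seen in a small
sketch. The relation, attribute names and values below are hypothetical; they
only illustrate a student relation that also carries department attributes.

```python
# Hypothetical combined relation: one row per student, department data repeated.
student_dept = [
    {"StudentId": 1, "StudentName": "Asha", "DeptNo": 10, "DeptName": "Physics"},
    {"StudentId": 2, "StudentName": "Ravi", "DeptNo": 10, "DeptName": "Physics"},
    {"StudentId": 3, "StudentName": "Meena", "DeptNo": 20, "DeptName": "History"},
]


def departments(rows):
    """Return the set of (DeptNo, DeptName) pairs still recorded in the relation."""
    return {(r["DeptNo"], r["DeptName"]) for r in rows}


# Redundancy: the Physics pair is stored once per student in that department.
physics_copies = sum(1 for r in student_dept if r["DeptNo"] == 10)

# Deletion anomaly: removing the last History student also removes the department.
after_delete = [r for r in student_dept if r["StudentId"] != 3]
lost = departments(student_dept) - departments(after_delete)

print(physics_copies)
print(lost)
```

Keeping departments in a relation of their own would store each Dept Name once
and let a department exist without any students.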
GUIDELINE
As far as possible, avoid placing attributes in a base relation whose values
are frequently null. If nulls are unavoidable, make sure they apply only in
exceptional cases and not to the majority of tuples.
Functional Dependency
A functional dependency is a relationship that exists when one attribute
uniquely determines another attribute.
Example
DeptId = Department ID
DeptName = Department Name
DeptId DeptName
001 Finance
002 Marketing
003 HR
Therefore, the functional dependency between DeptId and DeptName is written
DeptId -> DeptName: DeptName is functionally dependent on DeptId, because each
DeptId value determines exactly one DeptName.
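As a minimal sketch, the check "one attribute uniquely determines another" can
be run against sample rows. The function name fd_holds is an assumption made
for this illustration. A sample can only refute a functional dependency, never
prove that it holds for every possible state of the relation.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


dept = [
    {"DeptId": "001", "DeptName": "Finance"},
    {"DeptId": "002", "DeptName": "Marketing"},
    {"DeptId": "003", "DeptName": "HR"},
]

holds = fd_holds(dept, ["DeptId"], ["DeptName"])

# A hypothetical extra row that reuses DeptId 001 with a different name
# breaks the dependency.
broken = dept + [{"DeptId": "001", "DeptName": "Accounts"}]
still_holds = fd_holds(broken, ["DeptId"], ["DeptName"])
```

Here holds is True for the three rows of the example, and still_holds is False
once the same DeptId appears with two different names.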
Normalization
Defining Normalization:
Normalization is the process of analysing a given set of relation schemas, based
on their functional dependencies and primary keys, to achieve desirable
properties such as:
1. Minimizing redundancy
2. Minimizing insertion, deletion and updating anomalies
A relation is in first normal form (1NF) if every attribute is atomic, i.e., it
holds a single value and the relation contains no multi-valued or composite
attribute. A relation with a composite or multi-valued attribute violates 1NF.
To solve this, we can create a new row for each value of the multi-valued
attribute.
Example:
Consider a relational table <EmployeeDetail> that contains the details of the
employees of a company.
Here, the Employee Phone Number is a multi-valued attribute. So, this relation is
not in 1NF.
Solution:
To convert this table into 1NF, we store each Employee Phone Number in a
separate row, repeating the other attributes of that employee.
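The conversion to 1NF can be sketched as follows. The rows and the attribute
names other than the phone number are hypothetical placeholders, not the data
of the original table.

```python
# Hypothetical un-normalized rows: several phone numbers stored in one field.
employee_detail = [
    {"EmployeeCode": 101, "EmployeeName": "John",
     "EmployeePhoneNumbers": ["98765623", "99884433"]},
    {"EmployeeCode": 102, "EmployeeName": "Ryan",
     "EmployeePhoneNumbers": ["88776655"]},
]


def to_1nf(rows):
    """Emit one row per phone number so that every attribute value is atomic."""
    flat = []
    for row in rows:
        for phone in row["EmployeePhoneNumbers"]:
            flat.append({
                "EmployeeCode": row["EmployeeCode"],
                "EmployeeName": row["EmployeeName"],
                "EmployeePhoneNumber": phone,
            })
    return flat


employee_detail_1nf = to_1nf(employee_detail)
for row in employee_detail_1nf:
    print(row)
```

The employee with two phone numbers now appears in two rows, each holding a
single phone number.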
A relation is in second normal form (2NF) if it is in 1NF and every non-prime
attribute is fully dependent on the primary key, i.e., it has no partial
dependency. If a partial dependency exists, we can divide the table: the
partially dependent attributes are removed and moved to another table together
with the part of the key that determines them.
In the <EmployeeProjectDetail> table, the prime attributes are Employee Code and
Project ID. The table has partial dependencies because Employee Name is
determined by Employee Code alone and Project Name is determined by Project ID
alone. Thus, the table violates the rule of 2NF.
Solution
To remove partial dependencies from this table and normalize it into second normal
form, we can decompose the <EmployeeProjectDetail> table into the following three
tables:
Thus, we've converted the <EmployeeProjectDetail> table into 2NF by decomposing
it into the <EmployeeDetail>, <ProjectDetail> and <EmployeeProject> tables. These
tables satisfy both rules of 2NF: they are in 1NF, and every non-prime attribute
is fully dependent on the primary key.
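A minimal sketch of this decomposition, assuming <EmployeeProjectDetail> has the
four attributes named in the text (Employee Code, Project ID, Employee Name,
Project Name). The sample rows are made up for illustration.

```python
# Hypothetical rows: (Employee Code, Project ID, Employee Name, Project Name)
employee_project_detail = [
    (101, "P03", "John", "Project103"),
    (101, "P01", "John", "Project101"),
    (102, "P04", "Ryan", "Project104"),
    (103, "P01", "Stephanie", "Project101"),
]

# Each projection keeps one determinant together with what it determines.
# Building a set removes the duplicates that the partial dependencies caused.
employee_detail = sorted({(code, name) for code, _, name, _ in employee_project_detail})
project_detail = sorted({(pid, pname) for _, pid, _, pname in employee_project_detail})
employee_project = sorted({(code, pid) for code, pid, _, _ in employee_project_detail})

print(employee_detail)
print(project_detail)
print(employee_project)
```

Each employee name and each project name is now stored once, while
employee_project still records which employee works on which project.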
A transitive dependency X -> Z exists when the following three conditions hold:
X -> Y
Y does not -> X
Y -> Z
For a relational table to be in third normal form, it must satisfy the following rules:
1. The table must be in 2NF.
2. No non-prime attribute may be transitively dependent on the primary key.
If a transitive dependency exists, we can divide the table to remove the transitively
dependent attributes and place them in a new table along with a copy of the
determinant.
The <EmployeeDetail> table is not in 3NF because it has the transitive dependency
Employee Code -> Employee City, which arises from:
Employee Code -> Employee Zipcode
Employee Zipcode -> Employee City
Also, Employee Zipcode is not a super key and Employee City is not a prime attribute.
To remove transitive dependency from this table and normalize it into the third normal
form, we can decompose the <EmployeeDetail> table into the following two tables:
Thus, we've converted the <EmployeeDetail> table into 3NF by decomposing it into
the <EmployeeDetail> and <EmployeeLocation> tables: both are in 2NF and neither
has a transitive dependency.
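The decomposition can be sketched as below, under the assumption that
<EmployeeLocation> holds the zipcode and the city and that the zipcode stays in
<EmployeeDetail> as the link between the two tables. Attribute names and rows
are hypothetical. The last step joins the two tables again to show that no
information was lost.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates and keeping first-seen order."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


# Hypothetical original table with Employee Code -> Zipcode -> City.
original = [
    {"EmployeeCode": 101, "EmployeeName": "John", "EmployeeZipcode": "110033", "EmployeeCity": "CityA"},
    {"EmployeeCode": 102, "EmployeeName": "Ryan", "EmployeeZipcode": "110033", "EmployeeCity": "CityA"},
    {"EmployeeCode": 103, "EmployeeName": "Stephanie", "EmployeeZipcode": "110044", "EmployeeCity": "CityB"},
]

employee_detail = project(original, ["EmployeeCode", "EmployeeName", "EmployeeZipcode"])
employee_location = project(original, ["EmployeeZipcode", "EmployeeCity"])

# Join on the shared zipcode (the copied determinant) to rebuild the original rows.
city_of = {loc["EmployeeZipcode"]: loc["EmployeeCity"] for loc in employee_location}
rejoined = [{**emp, "EmployeeCity": city_of[emp["EmployeeZipcode"]]} for emp in employee_detail]
```

Each city is stored once per zipcode in employee_location, and rejoined
contains the same rows as original.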
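For a relational table to be in Boyce-Codd normal form (BCNF), it must satisfy the
following rules:
1. The table must be in 3NF.
2. For every non-trivial functional dependency X -> Y, X must be a superkey of the
table.
A superkey is a set of one or more attributes that can uniquely identify a row in a
database table.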
Let us take an example of the following <EmployeeProjectLead> table to understand
how to normalize the table to the BCNF:
The <EmployeeProjectLead> table satisfies all the normal forms up to 3NF, but it
violates the rule of BCNF. The candidate key of the table is
{Employee Code, Project ID}. In the non-trivial functional dependency
Project Leader -> Project ID, Project ID is a prime attribute, but the
determinant Project Leader is a non-prime attribute and therefore not a
superkey. This is not allowed in BCNF.
To convert the given table into BCNF, we decompose it into two tables:
Thus, we’ve converted the <EmployeeProjectLead> table into BCNF by
decomposing it into <EmployeeProject> and <ProjectLead> tables.
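The BCNF condition can be checked mechanically with an attribute closure. The
sketch below is one possible way to do it, not a definitive implementation. It
assumes that <EmployeeProjectLead> has exactly the three attributes named in
the text, that the key dependency {Employee Code, Project ID} -> Project Leader
holds, and that <ProjectLead> contains Project Leader and Project ID. The
function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """List the non-trivial FDs whose left-hand side is not a superkey of schema."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not schema <= closure(lhs, fds)
    ]


schema = frozenset({"EmployeeCode", "ProjectID", "ProjectLeader"})
fds = [
    # Assumed from the candidate key stated in the text.
    (frozenset({"EmployeeCode", "ProjectID"}), frozenset({"ProjectLeader"})),
    # The dependency discussed in the text.
    (frozenset({"ProjectLeader"}), frozenset({"ProjectID"})),
]
violations = bcnf_violations(schema, fds)

# Assumed schema of <ProjectLead>; Project Leader is its key.
project_lead = frozenset({"ProjectLeader", "ProjectID"})
project_lead_violations = bcnf_violations(project_lead, [fds[1]])
```

Only Project Leader -> Project ID is reported for the original table, because
the closure of {Project Leader} does not cover Employee Code. In the assumed
<ProjectLead> table the same dependency has a superkey on its left-hand side,
so no violation remains.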