Chapter 4

The document discusses normalization and functional dependencies. It defines functional dependency as a relationship between attributes where values of one attribute determine values of another. The document then covers different types of functional dependencies and normal forms including 1NF, 2NF, 3NF and BCNF. Normalization is defined as a process to reduce data redundancy and anomalies by decomposing tables into smaller, linked tables in multiple normal forms.

Uploaded by

QALI IBRAHIM
Copyright
© All Rights Reserved

Chapter 4: Functional Dependency and Normalization

4.1. Functional Dependency

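A functional dependency (FD) is a relationship between two attributes, or two sets of attributes,
of a relation in which the values of one determine the values of the other. It typically exists
between the primary key and the non-key attributes of a table, and is written as: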

X → Y

The left side of an FD is known as the determinant, and the right side is known as the
dependent. For example, assume we have an employee table with the attributes Emp_Id,
Emp_Name, and Emp_Address. Here the Emp_Id attribute uniquely identifies the Emp_Name attribute
of the employee table, because if we know the Emp_Id, we can tell the employee name associated
with it.

Functional dependency can be written as: Emp_Id → Emp_Name. We can say that Emp_Name is
functionally dependent on Emp_Id.

Types of Functional Dependency

• Functional Dependency
• Fully Functional Dependency
• Transitive Dependency
• Multivalued Dependency
• Partial Dependency
• Trivial Functional Dependency


Functional Dependency
If the value of one attribute (or set of attributes) in a table uniquely determines the value of
another attribute in the same table, the relationship is called a functional dependency. Consider
it as an association between two attributes of the same relation.
If P functionally determines Q, then

P -> Q

Let us see an example:

Employee table

EmpID   EmpName   EmpAge
E01     Amit      28
E02     Rohit     31

Here EmpName is functionally dependent on EmpID, because each EmpID value is associated with
exactly one EmpName:

EmpID -> EmpName
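The definition above can be checked mechanically against sample rows. The following is a minimal sketch, not a definitive implementation; the function name fd_holds and the dictionary-based row layout are assumptions made for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows of the Employee table from the text.
employee = [
    {"EmpID": "E01", "EmpName": "Amit", "EmpAge": 28},
    {"EmpID": "E02", "EmpName": "Rohit", "EmpAge": 31},
]

print(fd_holds(employee, ["EmpID"], ["EmpName"]))  # True
```

A check like this can only refute a dependency. A functional dependency is a rule about every legal state of the table, so sample rows that happen to satisfy it do not prove that it holds in general.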

Fully Functional Dependency

If X and Y are attribute sets of a relation, Y is fully functionally dependent on X if Y is
functionally dependent on X but not on any proper subset of X.
Example
In the dependency ABC -> D, attribute D is fully functionally dependent on ABC if it is not
dependent on any proper subset of ABC. That means that subsets of ABC such as AB, BC, A, or B
cannot determine D. Let us take another example:

Supply table

supplier_id   item_id   price
1             1         540
2             1         545
1             2         200
2             2         201
1             1         540
2             2         201
3             1         542

From the table, we can see that neither supplier_id nor item_id alone uniquely
determines the price, but supplier_id and item_id together do. So we can say
that price is fully functionally dependent on { supplier_id, item_id }. This gives
the fully functional dependency:

{ supplier_id, item_id } -> price
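The reasoning about the Supply table can be reproduced with a short check. This is a sketch under stated assumptions: the helper names fd_holds and fully_dependent are illustrative, and only non-empty proper subsets of the determinant are tested.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def fully_dependent(rows, lhs, rhs):
    """True if lhs -> rhs holds and no non-empty proper subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return False
    return True


# Rows of the Supply table from the text, including its repeated rows.
columns = ("supplier_id", "item_id", "price")
data = [(1, 1, 540), (2, 1, 545), (1, 2, 200), (2, 2, 201),
        (1, 1, 540), (2, 2, 201), (3, 1, 542)]
supply = [dict(zip(columns, r)) for r in data]

print(fd_holds(supply, ["supplier_id"], ["price"]))                    # False
print(fd_holds(supply, ["item_id"], ["price"]))                        # False
print(fully_dependent(supply, ["supplier_id", "item_id"], ["price"]))  # True
```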

Transitive Dependency
When a functional dependency arises indirectly, through an intermediate attribute, it is called a
transitive dependency.

If P -> Q and Q -> R hold, then P -> R is a transitive dependency.

Partial Functional Dependency

A functional dependency X -> Y is a partial dependency if some attribute can be
removed from X and the dependency still holds.
For example, in the StudentProject table of Section 4.2.2, { StudentID, ProjectID } -> StudentName
is a partial dependency, because StudentID -> StudentName holds on its own.

Multivalued Dependency
A multivalued dependency occurs when the existence of one or more rows in a table implies the
existence of one or more other rows in the same table.

If a table has attributes P, Q and R, and Q and R are independent multi-valued facts of P, then
Q and R are each multivalued dependent on P.

It is represented by a double arrow, ->->.

For our example:

P ->-> Q
P ->-> R
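One way to read this definition: for each value of P, every Q value must appear together with every R value. The sketch below tests that property on a three-attribute table. The function name mvd_holds and the sample rows are hypothetical and are not taken from the chapter.

```python
from collections import defaultdict
from itertools import product


def mvd_holds(rows, p, q, r):
    """Check P ->-> Q in a table whose only other attribute is R."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[p]].add((row[q], row[r]))
    for pairs in groups.values():
        q_values = {qv for qv, _ in pairs}
        r_values = {rv for _, rv in pairs}
        # Q and R are independent only if every combination is present.
        if pairs != set(product(q_values, r_values)):
            return False
    return True


# Hypothetical rows: p1 has two Q values and two R values, in all combinations.
complete = [
    {"P": "p1", "Q": "q1", "R": "r1"},
    {"P": "p1", "Q": "q1", "R": "r2"},
    {"P": "p1", "Q": "q2", "R": "r1"},
    {"P": "p1", "Q": "q2", "R": "r2"},
]
incomplete = complete[:3]  # the (q2, r2) combination is missing

print(mvd_holds(complete, "P", "Q", "R"))    # True
print(mvd_holds(incomplete, "P", "Q", "R"))  # False
```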

4.2. Normal Forms

A large database defined as a single relation may result in data duplication. This repetition of data
may result in:

• Very large relations.
• Difficulty in maintaining and updating data, since an update may involve searching many
records in the relation.
• Wastage and poor utilization of disk space and resources.
• An increased likelihood of errors and inconsistencies.

To handle these problems, we should analyze and decompose the relations with redundant data
into smaller, simpler, and well-structured relations that satisfy desirable properties.
Normalization is the process of decomposing relations into relations with fewer attributes.

What is Normalization?

• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize redundancy in a relation or set of relations. It is
also used to eliminate undesirable characteristics such as insertion, update, and deletion
anomalies.
• Normalization divides a larger table into smaller tables and links them using relationships.
• Normal forms are used to reduce redundancy in database tables.

Why do we need Normalization?

The main reason for normalizing relations is to remove these anomalies. Failure to eliminate
anomalies leads to data redundancy and can cause data integrity and other problems as the database
grows. Normalization consists of a series of guidelines that help you create a good
database structure.

Data modification anomalies can be categorized into three types:

• Insertion Anomaly: An insertion anomaly occurs when a new tuple cannot be inserted into a
relation because some other data is missing.
• Deletion Anomaly: A deletion anomaly occurs when the deletion of data
results in the unintended loss of some other important data.
• Update Anomaly: An update anomaly occurs when an update of a single data value requires
multiple rows of data to be updated.

Types of Normal Forms:

Normalization works through a series of stages called normal forms. The normal forms apply to
individual relations. A relation is said to be in a particular normal form if it satisfies the
constraints of that form.

Following are the various types of normal forms:

Normal Form   Description
1NF           A relation is in 1NF if every attribute contains only atomic values.
2NF           A relation is in 2NF if it is in 1NF and all non-key attributes are fully
              functionally dependent on the primary key.
3NF           A relation is in 3NF if it is in 2NF and no transitive dependency exists.
BCNF          A stronger definition of 3NF, known as Boyce-Codd normal form.

4.2.1. First Normal Form (1NF)

• A relation is in 1NF if every attribute contains only atomic values.
• An attribute of a table cannot hold multiple values; it must hold a single value in each row.
• First normal form disallows multi-valued attributes, composite attributes, and their
combinations.

Example: Relation EMPLOYEE is not in 1NF because of the multi-valued attribute EMP_PHONE.

EMPLOYEE table:

EMP_ID   EMP_NAME   EMP_PHONE                EMP_STATE
14       John       7272826385, 9064738238   UP
20       Harry      8574783832               Bihar
12       Sam        7390372389, 8589830302   Punjab

The EMPLOYEE table converted to 1NF is shown below:

EMP_ID   EMP_NAME   EMP_PHONE    EMP_STATE
14       John       7272826385   UP
14       John       9064738238   UP
20       Harry      8574783832   Bihar
12       Sam        7390372389   Punjab
12       Sam        8589830302   Punjab
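The conversion above amounts to emitting one row per phone number. The following is a minimal sketch, assuming the multi-valued attribute is held as a Python list; the function name to_1nf is illustrative.

```python
# The EMPLOYEE table before 1NF, with EMP_PHONE modeled as a list of values.
employee = [
    {"EMP_ID": 14, "EMP_NAME": "John",
     "EMP_PHONE": ["7272826385", "9064738238"], "EMP_STATE": "UP"},
    {"EMP_ID": 20, "EMP_NAME": "Harry",
     "EMP_PHONE": ["8574783832"], "EMP_STATE": "Bihar"},
    {"EMP_ID": 12, "EMP_NAME": "Sam",
     "EMP_PHONE": ["7390372389", "8589830302"], "EMP_STATE": "Punjab"},
]


def to_1nf(rows, attr):
    """Emit one row per value of the multi-valued attribute attr."""
    flat = []
    for row in rows:
        for value in row[attr]:
            # Copy the row, replacing the list with a single atomic value.
            flat.append({**row, attr: value})
    return flat


employee_1nf = to_1nf(employee, "EMP_PHONE")
for row in employee_1nf:
    print(row["EMP_ID"], row["EMP_NAME"], row["EMP_PHONE"], row["EMP_STATE"])
print(len(employee_1nf))  # 5
```

In the flattened table EMP_ID no longer identifies a row on its own; the pair (EMP_ID, EMP_PHONE) does.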

4.2.2. Second Normal Form (2NF)

• To be in 2NF, a relation must be in 1NF.
• In second normal form, all non-key attributes are fully functionally dependent on the
primary key.

Example (table that violates 2NF)

StudentProject

StudentID   ProjectID   StudentName   ProjectName
S89         P09         Olivia        Geo Location
S76         P07         Jacob         Cluster Exploration
S56         P03         Ava           IoT Devices
S92         P05         Alexandra     Cloud Deployment

The above table contains partial dependencies. The prime (key) attributes are StudentID and
ProjectID, which together form the candidate key.

A non-prime attribute is partially dependent if it is functionally dependent on only part of a
candidate key. Here the non-prime attributes are StudentName and ProjectName. StudentName can be
determined by StudentID alone, which is a partial dependency.

ProjectName can be determined by ProjectID alone, which is also a partial dependency.
Therefore, the StudentProject relation violates 2NF and is considered a bad
database design.

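Example (table converted to 2NF)

To remove the partial dependencies and the 2NF violation, decompose the above table.

StudentInfo

StudentID   ProjectID   StudentName
S89         P09         Olivia
S76         P07         Jacob
S56         P03         Ava
S92         P05         Alexandra

ProjectInfo

ProjectID   ProjectName
P09         Geo Location
P07         Cluster Exploration
P03         IoT Devices
P05         Cloud Deployment

ProjectName now depends on the whole key of ProjectInfo. StudentInfo is in 2NF only if StudentID
alone is its key, that is, if each student works on a single project as in the sample data. If a
student can work on several projects, StudentName must also be moved into a separate table keyed
by StudentID.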
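A decomposition is only useful if the original table can be rebuilt from its parts. The sketch below projects StudentProject onto the two tables shown above and joins them back on ProjectID. The helper names project, natural_join, and as_set are assumptions made for illustration.

```python
student_project = [
    {"StudentID": "S89", "ProjectID": "P09", "StudentName": "Olivia", "ProjectName": "Geo Location"},
    {"StudentID": "S76", "ProjectID": "P07", "StudentName": "Jacob", "ProjectName": "Cluster Exploration"},
    {"StudentID": "S56", "ProjectID": "P03", "StudentName": "Ava", "ProjectName": "IoT Devices"},
    {"StudentID": "S92", "ProjectID": "P05", "StudentName": "Alexandra", "ProjectName": "Cloud Deployment"},
]


def project(rows, attrs):
    """Keep only the columns in attrs and drop duplicate rows."""
    result = []
    for row in rows:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """Join two non-empty tables on the columns they share."""
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]


def as_set(rows):
    """Order-independent representation of a table, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


student_info = project(student_project, ["StudentID", "ProjectID", "StudentName"])
project_info = project(student_project, ["ProjectID", "ProjectName"])

rebuilt = natural_join(student_info, project_info)
print(as_set(rebuilt) == as_set(student_project))  # True
```

The join returns exactly the original rows, so no information is lost by storing the two smaller tables instead of StudentProject.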

4.2.3. Third Normal Form (3NF)

• A relation is in 3NF if it is in 2NF and does not contain any transitive dependency of a
non-prime attribute on the primary key.
• 3NF is used to reduce data duplication. It is also used to achieve data integrity.
• If a relation in 2NF has no transitive dependency for non-prime attributes, then the relation
is in third normal form.
• A transitive dependency arises in a table or relation when a non-prime attribute depends on
another non-prime attribute instead of depending directly on the primary key.
• Removing the transitive dependency therefore ensures data integrity as well as less
duplication of data.

A relation is in third normal form if at least one of the following conditions holds for every
non-trivial functional dependency X → Y:

1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.

The conditions for achieving Third Normal Form are as follows:

1. The table or relation should be in Second Normal Form.
2. The table or relation should not contain any transitive dependency.

Example of Third Normal Form
Let us consider the table TEACHER_DETAILS below to understand the Third Normal Form better.

ID   NAME     SUBJECT       STATE         COUNTRY
29   Lalita   English       Gujrat        INDIA
33   Ramesh   Geography     Punjab        INDIA
49   Sarita   Mathematics   Maharashtra   INDIA
78   Zayed    History       Bihar         INDIA

The candidate key in the above table is ID. The functional dependency set can be
defined as ID->NAME, ID->SUBJECT, ID->STATE, STATE->COUNTRY.

If A->B and B->C are two functional dependencies, then A->C is called a
transitive dependency. For the above relation, ID->STATE and STATE->COUNTRY both
hold, so COUNTRY is transitively dependent upon ID. This does not
satisfy the conditions of the Third Normal Form. To transform the relation into Third
Normal Form, we need to break it into two tables: the original table without COUNTRY, and
a new table for STATE and COUNTRY with STATE as the primary key.
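The same conclusion can be reached by computing attribute closures over the dependency set listed above. This is a sketch, not a complete 3NF test: it checks only the listed dependencies, takes ID as the sole candidate key as stated in the text, and uses illustrative names (closure, violates_3nf).

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


ALL_ATTRS = {"ID", "NAME", "SUBJECT", "STATE", "COUNTRY"}
FDS = [
    (frozenset({"ID"}), frozenset({"NAME"})),
    (frozenset({"ID"}), frozenset({"SUBJECT"})),
    (frozenset({"ID"}), frozenset({"STATE"})),
    (frozenset({"STATE"}), frozenset({"COUNTRY"})),
]
PRIME = {"ID"}  # attributes that belong to the candidate key


def violates_3nf(lhs, rhs):
    """X -> Y violates 3NF if X is not a super key and Y is not made of prime attributes."""
    if rhs <= lhs:  # trivial dependency, never a violation
        return False
    is_super_key = closure(lhs, FDS) == ALL_ATTRS
    return not is_super_key and not (rhs - lhs) <= PRIME


print(sorted(closure({"ID"}, FDS)))
# ['COUNTRY', 'ID', 'NAME', 'STATE', 'SUBJECT']

violations = [(lhs, rhs) for lhs, rhs in FDS if violates_3nf(lhs, rhs)]
for lhs, rhs in violations:
    print(", ".join(sorted(lhs)), "->", ", ".join(sorted(rhs)))
# STATE -> COUNTRY
```

The closure of ID contains COUNTRY even though no listed dependency has ID on the left and COUNTRY on the right, which is the transitive dependency described in the text.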

Below are the tables after normalization to the Third Normal Form.

TEACHER_DETAILS:

ID   NAME     SUBJECT       STATE
29   Lalita   English       Gujrat
33   Ramesh   Geography     Punjab
49   Sarita   Mathematics   Maharashtra
78   Zayed    History       Bihar

STATE_COUNTRY:

STATE         COUNTRY
Gujrat        INDIA
Punjab        INDIA
Maharashtra   INDIA
Bihar         INDIA

4.2.4. Boyce-Codd Normal Form (BCNF)

• BCNF is an advanced version of 3NF. It is stricter than 3NF.
• A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key
of the table.
• For BCNF, the table should be in 3NF, and for every FD, the LHS must be a super key.
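To see why BCNF is stricter than 3NF, consider a relation that passes the 3NF test but fails the BCNF test. The relation R(A, B, C) and its dependencies AB -> C and C -> B below are a hypothetical example that does not come from this chapter, and the function names are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List the non-trivial dependencies whose left side is not a super key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != all_attrs]


# Hypothetical relation R(A, B, C) with candidate keys AB and AC.
R = {"A", "B", "C"}
FDS = [
    (frozenset("AB"), frozenset("C")),
    (frozenset("C"), frozenset("B")),
]

for lhs, rhs in bcnf_violations(R, FDS):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
# C -> B
```

C -> B is allowed in 3NF because B is a prime attribute (it is part of the candidate key AB), but it violates BCNF because C is not a super key.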

Advantages of Normalization

• Normalization helps to minimize data redundancy.
• Greater overall database organization.
• Data consistency within the database.
• Much more flexible database design.
• Enforces the concept of relational integrity.

Disadvantages of Normalization

• You cannot start building the database before knowing what the user needs.
• Performance can degrade when relations are normalized to higher normal forms, e.g.,
4NF and 5NF, because queries then need more joins.
• It is very time-consuming and difficult to normalize relations of a higher degree.
• Careless decomposition may lead to a bad database design, leading to serious problems.
