Lecture#09

Normalization is the process of organizing database data to minimize redundancy and eliminate anomalies such as insertion, update, and deletion issues. It involves decomposing larger tables into smaller, well-structured relations based on functional dependencies, which help identify primary keys and ensure data integrity. The lecture outlines the purpose of normalization, the problems associated with redundant data, and the characteristics of various normal forms, including 1NF, 2NF, and 3NF.

Uploaded by

wajabaloch7860
Copyright
© © All Rights Reserved

Lecture-9

Normalization
Normalization- Outlines

• The purpose of normalization.
• How normalization can be used when designing a relational database.
• The potential problems associated with redundant data in base relations.
• The concept of functional dependency, which describes the relationship between attributes.
• The characteristics of functional dependencies used in normalization.
• How to identify functional dependencies for a given relation.
• How functional dependencies identify the primary key for a relation.
• How to undertake the process of normalization.
• How normalization uses functional dependencies to group attributes into relations that are in a known normal form.
• How to identify the most commonly used normal forms, namely First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF).
• The problems associated with relations that break the rules of 1NF, 2NF, or 3NF.
• How to represent attributes shown on a form as 3NF relations using normalization.
Today’s Lecture Outlines

• The purpose of normalization.
• How normalization can be used when designing a relational database.
• The potential problems associated with redundant data in base relations.
Purpose of Normalization(1)
Normalization:
• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize the redundancy in a relation or set of relations.
• It is also used to eliminate undesirable characteristics such as insertion, update, and deletion anomalies.
• Normalization divides a larger table into smaller tables and links them using relationships.
• Normal forms are used to reduce redundancy in database tables.
A large database defined as a single relation may result in data duplication.
This repetition of data may result in:
• Very large relations.
• Data that is hard to maintain and update, because a change involves searching many records in the relation.
• Wasted and poorly utilized disk space and resources.
• A greater likelihood of errors and inconsistencies.
To handle these problems, we should analyze and decompose relations with redundant data into smaller, simpler, well-structured relations that satisfy desirable properties. Normalization is the process of decomposing relations into relations with fewer attributes.
Why do we need Normalization?
The main reason for normalizing relations is to remove insertion, update, and deletion anomalies.

Failure to eliminate anomalies leads to data redundancy and can cause data integrity and other problems as the database grows.

Normalization consists of a series of guidelines that help you create a good database structure.
Data Redundancy and Update Anomalies
A major aim of relational database design is to group attributes into relations to minimize data redundancy.

If this aim is achieved, the potential benefits for the implemented database include the following:

• Updates to the data stored in the database are achieved with a minimal number of operations, thus reducing the opportunities for data inconsistencies occurring in the database.
• Reduction in the file storage space required by the base relations, thus minimizing costs.
Relational databases also rely on the existence of a certain amount of data redundancy.

This redundancy is in the form of copies of primary keys (or candidate keys) acting as foreign keys in related relations to enable the modeling of relationships between data.

[Figure: Staff and Branch relations]

Here we illustrate the problems associated with unwanted data redundancy by comparing the Staff and Branch relations shown in the figure with the StaffBranch relation.

The StaffBranch relation is an alternative format of the Staff and Branch relations. The relations have the form:
Staff (staffNo, sName, position, salary, branchNo)
Branch (branchNo, bAddress)
StaffBranch (staffNo, sName, position, salary, branchNo, bAddress)

Note that the primary key for each relation is underlined in the original slides: staffNo for Staff and StaffBranch, and branchNo for Branch.

In the StaffBranch relation there is redundant data; the details of a branch are repeated for every member of staff located at that branch. In contrast, the branch details appear only once for each branch in the Branch relation, and only the branch number (branchNo) is repeated in the Staff relation to represent where each member of staff is located. Relations that have redundant data may have problems called update anomalies, which are classified as:
• Insertion,
• Deletion, or
• Modification anomalies.
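
The redundancy described above can be made concrete with a small sketch. The sample rows are illustrative assumptions (the figure with the actual data is not reproduced here), and the helper names `address_copies` and `decompose` are hypothetical, not part of the lecture.

```python
from collections import Counter

# Illustrative StaffBranch rows (assumed sample data):
# (staffNo, sName, position, salary, branchNo, bAddress)
STAFF_BRANCH = [
    ("SG37", "Ann Beech", "Assistant", 12000, "B003", "163 Main St, Glasgow"),
    ("SG14", "David Ford", "Supervisor", 18000, "B003", "163 Main St, Glasgow"),
    ("SG5", "Susan Brand", "Manager", 24000, "B003", "163 Main St, Glasgow"),
    ("SA9", "Mary Howe", "Assistant", 9000, "B007", "16 Argyll St, Aberdeen"),
]


def address_copies(rows):
    """Count how many times each branch's address is stored in StaffBranch."""
    return Counter(row[4] for row in rows)


def decompose(rows):
    """Split StaffBranch rows into Staff rows and de-duplicated Branch rows."""
    staff = [row[:5] for row in rows]                      # drop bAddress
    branch = sorted({(row[4], row[5]) for row in rows})    # one row per branch
    return staff, branch


staff, branch = decompose(STAFF_BRANCH)
print(address_copies(STAFF_BRANCH))  # the B003 address is stored once per staff member
print(branch)                        # each branch address now appears exactly once
```

After decomposition, only branchNo is repeated in the Staff rows, which is the deliberate foreign-key redundancy mentioned earlier.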
Insert Anomalies
There are two main types of insertion anomaly, which we illustrate using the StaffBranch relation introduced earlier.

First, to insert the details of new members of staff into the StaffBranch relation, we must include the details of the branch at which the staff are to be located.

For example, to insert the details of new staff located at branch number B007, we must enter the correct details of branch number B007 so that the branch details are consistent with values for branch B007 in other tuples of the StaffBranch relation.

The separate Staff and Branch relations do not suffer from this potential inconsistency because we enter only the appropriate branch number for each staff member in the Staff relation.

Instead, the details of branch number B007 are recorded in the database as a single tuple in the Branch relation.

Second, to insert details of a new branch that currently has no members of staff into the StaffBranch relation, it is necessary to enter nulls into the attributes for staff, such as staffNo.

However, as staffNo is the primary key for the StaffBranch relation, attempting to enter nulls for staffNo violates entity integrity and is not allowed.

We therefore cannot enter a tuple for a new branch into the StaffBranch relation with a null for the staffNo.

The design with separate Staff and Branch relations avoids this problem because branch details are entered in the Branch relation separately from the staff details.

The details of staff ultimately located at that branch are entered at a later date into the Staff relation.
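
The second insertion anomaly can be reproduced with a minimal sketch using Python's built-in `sqlite3` module. The branch number B010, its address, and the helper `try_insert` are hypothetical values introduced only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is stated explicitly: SQLite would otherwise accept NULL in a
# non-INTEGER primary key, which would hide the entity-integrity violation.
conn.executescript("""
    CREATE TABLE StaffBranch (
        staffNo  TEXT PRIMARY KEY NOT NULL,
        sName    TEXT,
        position TEXT,
        salary   INTEGER,
        branchNo TEXT,
        bAddress TEXT
    );
    CREATE TABLE Branch (
        branchNo TEXT PRIMARY KEY NOT NULL,
        bAddress TEXT
    );
""")


def try_insert(sql, params):
    """Run one INSERT; return True on success, False on an integrity error."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# A new branch with no staff yet (hypothetical branch B010).
single_ok = try_insert(
    "INSERT INTO StaffBranch VALUES (?, ?, ?, ?, ?, ?)",
    (None, None, None, None, "B010", "1 Example Rd"),
)
split_ok = try_insert(
    "INSERT INTO Branch VALUES (?, ?)",
    ("B010", "1 Example Rd"),
)
print(single_ok, split_ok)  # False True
```

The insert into StaffBranch is rejected because staffNo would be null, while the same branch is accepted by the separate Branch table.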
Delete Anomalies
If we delete a tuple from the StaffBranch relation that represents the last member of staff located at a branch, the details about that branch are also lost from the database.

For example, if we delete the tuple for staff number SA9 (Mary Howe) from the StaffBranch relation, the details relating to branch number B007 are lost from the database.

The design with separate Staff and Branch relations avoids this problem, because branch tuples are stored separately from staff tuples and only the attribute branchNo relates the two relations.

If we delete the tuple for staff number SA9 from the Staff relation, the details on branch number B007 remain unaffected in the Branch relation.
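
A minimal sketch of the deletion anomaly follows, using plain Python tuples. Only SA9 (Mary Howe) and B007 come from the text; the other row, the addresses, and the helper names are illustrative assumptions.

```python
# Illustrative rows (assumed sample data): (staffNo, sName, branchNo, bAddress)
STAFF_BRANCH = [
    ("SG37", "Ann Beech", "B003", "163 Main St, Glasgow"),
    ("SA9", "Mary Howe", "B007", "16 Argyll St, Aberdeen"),
]

# The decomposed design: Staff keeps only branchNo, Branch keeps the address.
STAFF = [(staff_no, name, branch_no) for staff_no, name, branch_no, _ in STAFF_BRANCH]
BRANCH = {branch_no: address for _, _, branch_no, address in STAFF_BRANCH}


def delete_staff(rows, staff_no):
    """Return the rows that remain after deleting the given staff member."""
    return [row for row in rows if row[0] != staff_no]


def known_branches(rows):
    """Branch numbers that still appear in the given rows (branchNo is column 2)."""
    return {row[2] for row in rows}


after_single = delete_staff(STAFF_BRANCH, "SA9")
after_split = delete_staff(STAFF, "SA9")

print("B007" in known_branches(after_single))  # False: branch details were lost
print("B007" in BRANCH)                        # True: Branch is unaffected
```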
Modification Anomalies
If we want to change the value of one of the attributes of a particular branch in the StaffBranch relation, for example the address for branch number B003, we must update the tuples of all staff located at that branch.

If this modification is not carried out on all the appropriate tuples of the StaffBranch relation, the database will become inconsistent.

In this example, branch number B003 may appear to have different addresses in different staff tuples.
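
The inconsistency can be sketched as follows. The sample rows, the new address, and the `limit` parameter (used here only to simulate an update that misses some tuples) are assumptions made for illustration.

```python
# Illustrative rows (assumed sample data): (staffNo, branchNo, bAddress)
STAFF_BRANCH = [
    ("SG37", "B003", "163 Main St, Glasgow"),
    ("SG14", "B003", "163 Main St, Glasgow"),
    ("SG5", "B003", "163 Main St, Glasgow"),
]


def update_address(rows, branch_no, new_address, limit=None):
    """Return rows with the branch address changed.

    `limit` caps how many matching tuples are updated, which simulates a
    modification that is not carried out on all the appropriate tuples.
    """
    updated, changed = [], 0
    for staff_no, b_no, address in rows:
        if b_no == branch_no and (limit is None or changed < limit):
            address = new_address
            changed += 1
        updated.append((staff_no, b_no, address))
    return updated


def addresses_for(rows, branch_no):
    """Distinct addresses recorded for one branch."""
    return {address for _, b_no, address in rows if b_no == branch_no}


partial = update_address(STAFF_BRANCH, "B003", "1 New St, Glasgow", limit=1)
complete = update_address(STAFF_BRANCH, "B003", "1 New St, Glasgow")

print(len(addresses_for(partial, "B003")))   # 2: the branch now has two addresses
print(len(addresses_for(complete, "B003")))  # 1: consistent, but three tuples were touched
```

With a separate Branch relation the address is stored in a single tuple, so one update is enough and a partial update cannot occur.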

The above examples illustrate that the separate Staff and Branch relations have more desirable properties than the single StaffBranch relation.

This demonstrates that while the StaffBranch relation is subject to update anomalies, we can avoid these anomalies by decomposing the original relation into the Staff and Branch relations.

There are two important properties associated with decomposition of a larger relation into smaller relations:

• The lossless-join property ensures that any instance of the original relation can be identified from corresponding instances in the smaller relations.
• The dependency preservation property ensures that a constraint on the original relation can be maintained by simply enforcing some constraint on each of the smaller relations. In other words, we do not need to perform joins on the smaller relations to check whether a constraint on the original relation is violated.
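
The lossless-join property can be checked on a small instance, as in the sketch below. The sample rows are assumed, and the second decomposition (splitting on position) is a hypothetical contrast that is not part of the lecture; it is included only to show what a lossy split looks like.

```python
# Illustrative StaffBranch instance (assumed sample data):
# (staffNo, sName, position, salary, branchNo, bAddress)
ROWS = {
    ("SG37", "Ann Beech", "Assistant", 12000, "B003", "163 Main St, Glasgow"),
    ("SG14", "David Ford", "Supervisor", 18000, "B003", "163 Main St, Glasgow"),
    ("SA9", "Mary Howe", "Assistant", 9000, "B007", "16 Argyll St, Aberdeen"),
}


def project(rows, cols):
    """Relational projection: keep the given columns and drop duplicate tuples."""
    return {tuple(row[c] for c in cols) for row in rows}


# Decomposition from the text: Staff and Branch share branchNo.
staff = project(ROWS, (0, 1, 2, 3, 4))
branch = project(ROWS, (4, 5))
rejoined = {s + (b[1],) for s in staff for b in branch if s[4] == b[0]}

# Hypothetical contrast: a split that shares only `position`.
left = project(ROWS, (0, 1, 2))
right = project(ROWS, (2, 3, 4, 5))
lossy = {l + r[1:] for l in left for r in right if l[2] == r[0]}

print(rejoined == ROWS)     # True: the natural join recovers the original instance
print(len(lossy - ROWS))    # 2: spurious tuples that were never in the original
```

Joining on branchNo recovers exactly the original tuples, whereas joining on position pairs each Assistant with every Assistant's salary and branch, so the original instance can no longer be identified.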

Later in this chapter, we discuss how the process of normalization can be used to derive well-formed relations.

However, we first introduce functional dependencies, which are fundamental to the process of normalization.
