Normalization 2021 New

The document discusses normalization in database design. Normalization is the process of organizing data to eliminate redundancy and inconsistencies. It involves decomposing tables to reduce anomalies like insertion, update and deletion anomalies. The document describes various normal forms like 1NF, 2NF, 3NF and BCNF that are used to achieve normalization. It also provides examples to explain anomalies and outlines the purpose and benefits of normalization in database design.

Uploaded by

flamezodiark
NORMALIZATION

DFC20083
DATABASE DESIGN

SLIDESMANIA.COM
Course Outline - Normalization
3.1.1 Define normalization concept
3.1.2 Describe the purpose of normalization in database model
3.1.3 Explain the importance of normalization in database
3.1.4 Define functional dependencies (FD)
3.1.5 Define transitive dependencies
3.1.6 Describe the various types of normal forms
3.1.7 Identify the steps in Normalization process
3.1.8 Apply the rule of First, Second, and Third Normal Form to resolve a violation in the model
3.1.9 Define Boyce Codd Normal Forms (BCNF)
3.1.10 Apply the concept of normalization using an appropriate desktop database by creating tables and relationship
Normalization in DBMS
● Normalization is the process of organizing the data in a database to avoid data redundancy and the insertion, update and deletion anomalies.
● Normalization is a technique for producing a set of suitable relations that support the data requirements of an enterprise.
● It works by decomposing unsatisfactory ("bad") relations, breaking up their attributes into smaller relations.
Purpose of normalization
● Characteristics of a suitable set of relations:

○ The minimal number of attributes necessary to support the data requirements of the enterprise.
○ Attributes with a close logical relationship are found in the same relation.
○ Minimal redundancy, with each attribute represented only once, with the important exception of attributes that form all or part of foreign keys.
Purpose of normalization
● The benefit of using a database that has a suitable set of relations is that the database will:

○ Be easier for the user to access and maintain.
○ Take up minimal storage space on the computer.
○ Minimize data redundancy.
○ Reduce the complexity of the data.
○ Preserve the relationships between tables as well as the data in the tables.
○ Ensure that data dependencies make sense and that data is logically stored.


Purpose of normalization
● The benefits of implementing normalization in the database:

○ Updates to the data stored in the database are achieved with a minimal number of operations, which reduces the opportunities for data inconsistencies.
○ The file storage required by the base relations is reduced, which minimizes cost.
Anomalies in DBMS
● There are three types of anomalies that occur when the database is not normalized.
● The anomalies are:

○ Insertion anomalies
○ Update anomalies
○ Deletion anomalies
Anomalies in DBMS
● Example:

● Suppose a manufacturing company stores employee details in a table named employee that has four attributes: emp_id for the employee's id, emp_name for the employee's name, emp_address for the employee's address and emp_dept for the department in which the employee works. At some point in time the table looks like this:
Anomalies in DBMS
The table is not normalized:

emp_id  emp_name  emp_address  emp_dept
101     Rick      Delhi        D001
101     Rick      Delhi        D002
123     Maggie    Agra         D890
166     Glenn     Chennai      D900
166     Glenn     Chennai      D004
Anomalies in DBMS
● Update anomaly – occurs when data redundancy combined with a partial update leaves the data inconsistent.
● In the previous table there are two rows for employee Rick, as he belongs to two departments of the company. To update Rick's address we have to update it in both rows, or the data will become inconsistent. If the new address is written to one department's row but not the other, then according to the database Rick has two different addresses, which is incorrect.
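The update anomaly can be reproduced with a few lines of Python and an in-memory SQLite database. This is only a sketch: the table and rows follow the employee example, while the new address "Mumbai" is a made-up value used for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee "
    "(emp_id INTEGER, emp_name TEXT, emp_address TEXT, emp_dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (101, "Rick", "Delhi", "D001"),
        (101, "Rick", "Delhi", "D002"),
        (123, "Maggie", "Agra", "D890"),
        (166, "Glenn", "Chennai", "D900"),
        (166, "Glenn", "Chennai", "D004"),
    ],
)

# Partial update: only Rick's D001 row is changed ("Mumbai" is hypothetical).
conn.execute(
    "UPDATE employee SET emp_address = 'Mumbai' "
    "WHERE emp_id = 101 AND emp_dept = 'D001'"
)

# Employees that now have more than one distinct address on record.
inconsistent = conn.execute(
    "SELECT emp_id, COUNT(DISTINCT emp_address) FROM employee "
    "GROUP BY emp_id HAVING COUNT(DISTINCT emp_address) > 1"
).fetchall()
print(inconsistent)  # [(101, 2)]
```

Employee 101 ends up with two addresses because the same fact was stored in two rows and only one of them was updated.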


Anomalies in DBMS
● Insert anomaly – occurs when a row cannot be inserted unless a value is also supplied for some other attribute.
● Suppose a new employee joins the company, is under training and is not yet assigned to any department. We would not be able to insert the employee's data into the table if the emp_dept field does not allow nulls.
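A short Python sketch of the insert anomaly, assuming the employee table declares emp_dept as NOT NULL. The trainee's id, name and address are hypothetical values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER NOT NULL, emp_name TEXT NOT NULL, "
    "emp_address TEXT NOT NULL, emp_dept TEXT NOT NULL)"
)

# A trainee without a department: emp_dept is unknown, so we pass None (NULL).
try:
    conn.execute(
        "INSERT INTO employee VALUES (?, ?, ?, ?)", (200, "Carl", "Pune", None)
    )
    inserted = True
except sqlite3.IntegrityError as exc:
    inserted = False
    print("Insert rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(inserted, row_count)  # False 0
```

The employee exists in the real world but cannot be recorded until a department is assigned.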
Anomalies in DBMS
● Delete anomaly – occurs when deleting some data unintentionally removes other data stored in the same rows.
● Suppose that at some point the company closes department D890. Deleting the rows that have emp_dept D890 would also delete the information about employee Maggie, since she is assigned only to this department.

Therefore, to overcome update, insert and delete anomalies, the data needs to be normalized.
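The delete anomaly can be sketched the same way in Python with SQLite, using the rows of the employee example. Closing department D890 is modelled here as a plain DELETE on emp_dept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee "
    "(emp_id INTEGER, emp_name TEXT, emp_address TEXT, emp_dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (101, "Rick", "Delhi", "D001"),
        (101, "Rick", "Delhi", "D002"),
        (123, "Maggie", "Agra", "D890"),
        (166, "Glenn", "Chennai", "D900"),
        (166, "Glenn", "Chennai", "D004"),
    ],
)

# The company closes department D890.
conn.execute("DELETE FROM employee WHERE emp_dept = 'D890'")

# Maggie (emp_id 123) has disappeared together with the department.
remaining_ids = sorted({row[0] for row in conn.execute("SELECT emp_id FROM employee")})
print(remaining_ids)  # [101, 166]
```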
Normalization
● The most commonly used normal forms are:

○ First normal form (1NF)
○ Second normal form (2NF)
○ Third normal form (3NF)
○ Boyce Codd normal form (BCNF)
First normal form (1NF)
● As per the rule of first normal form, an attribute (column) of a table cannot hold multiple values.
● It should hold only atomic values.

Two employees (Jon and Lester) each have two mobile numbers, so the company stores both numbers in the same field:

emp_id  emp_name  emp_address  emp_mobile
101     Herschel  New Delhi    8912312390
102     Jon       Kanpur       8812121212, 9900012222
103     Ron       Chennai      7778881212
104     Lester    Bangalore    9990000123, 8123450987

This table is not in 1NF, as the rule states that "each attribute of a table must have atomic (single) values".
First normal form (1NF)
● To make the table comply with 1NF, the data should be stored like this:

emp_id  emp_name  emp_address  emp_mobile
101     Herschel  New Delhi    8912312390
102     Jon       Kanpur       8812121212
102     Jon       Kanpur       9900012222
103     Ron       Chennai      7778881212
104     Lester    Bangalore    9990000123
104     Lester    Bangalore    8123450987

This table is in 1NF, as each attribute of the table now holds atomic (single) values.
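The conversion to 1NF amounts to emitting one row per mobile number. A minimal Python sketch, assuming the multi-valued emp_mobile field is modelled as a list; the function name to_1nf is illustrative only.

```python
# Un-normalized rows: emp_mobile holds a list of numbers.
unnormalized = [
    (101, "Herschel", "New Delhi", ["8912312390"]),
    (102, "Jon", "Kanpur", ["8812121212", "9900012222"]),
    (103, "Ron", "Chennai", ["7778881212"]),
    (104, "Lester", "Bangalore", ["9990000123", "8123450987"]),
]


def to_1nf(rows):
    """Emit one row per (employee, mobile number) so every field is atomic."""
    return [
        (emp_id, name, address, mobile)
        for emp_id, name, address, mobiles in rows
        for mobile in mobiles
    ]


flat = to_1nf(unnormalized)
for row in flat:
    print(row)
```

The four original rows become six, and every emp_mobile value is a single number.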


Second normal form (2NF)
● A table is said to be in 2NF if both of the following conditions are met:

○ A) The table is in 1NF (first normal form).
○ B) No non-prime attribute is dependent on a proper subset of any candidate key of the table.

● An attribute that is not part of any candidate key is known as a non-prime attribute.
Second normal form (2NF)
● Example: Suppose a school wants to store the data of teachers and the subjects they teach. They create a table that looks like this. Since a teacher can teach more than one subject, the table can have multiple rows for the same teacher.

teacher_id  subject    teacher_age
111         Maths      38
111         Physics    38
222         Biology    38
333         Physics    40
333         Chemistry  40

Candidate key: {teacher_id, subject}
Non-prime attribute: teacher_age
Second normal form (2NF)
● The table is in 1NF because each attribute has atomic values. However, it is not in 2NF because the non-prime attribute teacher_age is dependent on teacher_id alone, which is a proper subset of the candidate key.

teacher_id  subject    teacher_age
111         Maths      38
111         Physics    38
222         Biology    38
333         Physics    40
333         Chemistry  40

To make the table comply with 2NF, we can break it into two tables, as in the next slides.
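Whether teacher_age depends on teacher_id alone can be checked against the sample rows. The Python sketch below uses a hypothetical helper, fd_holds, that tests a functional dependency on a set of rows. Note that sample data can only refute a dependency; it cannot prove that the dependency holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """True if equal lhs values always come with equal rhs values in these rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs seen for this lhs; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


teacher = [
    {"teacher_id": 111, "subject": "Maths", "teacher_age": 38},
    {"teacher_id": 111, "subject": "Physics", "teacher_age": 38},
    {"teacher_id": 222, "subject": "Biology", "teacher_age": 38},
    {"teacher_id": 333, "subject": "Physics", "teacher_age": 40},
    {"teacher_id": 333, "subject": "Chemistry", "teacher_age": 40},
]

# teacher_id alone (part of the candidate key) determines teacher_age.
partial = fd_holds(teacher, ["teacher_id"], ["teacher_age"])
# subject alone does not: Physics appears with ages 38 and 40.
by_subject = fd_holds(teacher, ["subject"], ["teacher_age"])
print(partial, by_subject)  # True False
```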
Second normal form (2NF)
● Teacher_details table:

teacher_id  teacher_age
111         38
222         38
333         40
Second normal form (2NF)
● Teacher_subject table:

teacher_id  subject
111         Maths
111         Physics
222         Biology
333         Physics
333         Chemistry

Now the tables comply with second normal form (2NF).
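The split into Teacher_details and Teacher_subject loses no information: joining the two tables on teacher_id gives back the original rows. A small Python sketch of that check, with the tables modelled as lists of tuples:

```python
teacher = [
    (111, "Maths", 38),
    (111, "Physics", 38),
    (222, "Biology", 38),
    (333, "Physics", 40),
    (333, "Chemistry", 40),
]

# Project onto the two new schemas, dropping duplicate rows.
teacher_details = sorted({(tid, age) for tid, _, age in teacher})
teacher_subject = sorted({(tid, subject) for tid, subject, _ in teacher})

# Natural join on teacher_id.
age_of = dict(teacher_details)
rejoined = sorted((tid, subject, age_of[tid]) for tid, subject in teacher_subject)

print(teacher_details)              # [(111, 38), (222, 38), (333, 40)]
print(rejoined == sorted(teacher))  # True
```

Each teacher's age is now stored once, so it can be updated in a single row.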
Third normal form (3NF)
● A table design is said to be in 3NF if both of the following conditions hold:

○ A) The table must be in 2NF.
○ B) There is no transitive functional dependency of a non-prime attribute on any super key.

● An attribute that is not part of any candidate key is known as a non-prime attribute.
Third normal form (3NF)
● In other words, 3NF can be explained like this:

○ A table is in 3NF if it is in 2NF and, for each functional dependency X->Y, at least one of the following conditions holds:

■ X is a super key of the table
■ Y is a prime attribute of the table

○ An attribute that is part of one of the candidate keys is known as a prime attribute.
Third normal form (3NF)
● Example: Suppose a company wants to store the complete address of each employee. They create a table named employee details that looks like this:

emp_id  emp_name  emp_zip  emp_state  emp_city  emp_district
1001    John      282005   UP         Agra      Dayal Bagh
1002    Ajeet     222008   TN         Chennai   M-City
1006    Lora      282007   TN         Chennai   Urrapakkam
1101    Lilly     292008   UK         Pauri     Bhagwan
1201    Steve     222999   MP         Gwalior   Ratan

Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, emp_zip} and so on
Candidate key: {emp_id}
Non-prime attributes: all attributes except emp_id are non-prime, as they are not part of any candidate key.
Third normal form (3NF)
● In the previous slide, emp_state, emp_city and emp_district are dependent on emp_zip.
● emp_zip is in turn dependent on emp_id, which makes the non-prime attributes (emp_state, emp_city and emp_district) transitively dependent on the super key (emp_id).
● This violates the rule of 3NF.
● To make this table comply with 3NF, we have to break it into two tables to remove the transitive dependency.
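The 3NF rule (for each X->Y, either X is a super key or Y is a prime attribute) can be tested mechanically with attribute closures. The Python sketch below assumes the functional dependencies emp_id -> {emp_name, emp_zip} and emp_zip -> {emp_state, emp_city, emp_district}; the helper names closure, candidate_keys and violates_3nf are illustrative and not part of the slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole relation (brute force)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys


def violates_3nf(all_attrs, fds):
    """List (X, attribute) pairs where X is not a super key and attribute is non-prime."""
    prime = {a for key in candidate_keys(all_attrs, fds) for a in key}
    bad = []
    for lhs, rhs in fds:
        is_super_key = closure(lhs, fds) == set(all_attrs)
        for attr in sorted(set(rhs) - set(lhs)):
            if not is_super_key and attr not in prime:
                bad.append((tuple(lhs), attr))
    return bad


attrs = {"emp_id", "emp_name", "emp_zip", "emp_state", "emp_city", "emp_district"}
fds = [
    (("emp_id",), ("emp_name", "emp_zip")),
    (("emp_zip",), ("emp_state", "emp_city", "emp_district")),
]

keys = candidate_keys(attrs, fds)
violations = violates_3nf(attrs, fds)
print(keys)        # [('emp_id',)]
print(violations)  # the three attributes determined by emp_zip
```

Only the dependency on emp_zip is reported, which matches the transitive dependency described for this table.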
Third normal form (3NF)
● employee table:

emp_id  emp_name  emp_zip
1001    John      282005
1002    Ajeet     222008
1006    Lora      282007
1101    Lilly     292008
1201    Steve     222999
Third normal form (3NF)
● Employee_zip table:

emp_zip  emp_state  emp_city  emp_district
282005   UP         Agra      Dayal Bagh
222008   TN         Chennai   M-City
282007   TN         Chennai   Urrapakkam
292008   UK         Pauri     Bhagwan
222999   MP         Gwalior   Ratan
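After the split, the complete address of an employee is recovered with a join on emp_zip. A minimal Python sketch with an in-memory SQLite database; the column types and the REFERENCES clause are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee_zip (
        emp_zip TEXT NOT NULL PRIMARY KEY,
        emp_state TEXT, emp_city TEXT, emp_district TEXT
    );
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        emp_name TEXT,
        emp_zip TEXT REFERENCES employee_zip(emp_zip)
    );
    """
)
conn.executemany(
    "INSERT INTO employee_zip VALUES (?, ?, ?, ?)",
    [
        ("282005", "UP", "Agra", "Dayal Bagh"),
        ("222008", "TN", "Chennai", "M-City"),
        ("282007", "TN", "Chennai", "Urrapakkam"),
        ("292008", "UK", "Pauri", "Bhagwan"),
        ("222999", "MP", "Gwalior", "Ratan"),
    ],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1001, "John", "282005"),
        (1002, "Ajeet", "222008"),
        (1006, "Lora", "282007"),
        (1101, "Lilly", "292008"),
        (1201, "Steve", "222999"),
    ],
)

# Rebuild the full address of employee 1001 from the two tables.
row = conn.execute(
    "SELECT e.emp_name, z.emp_state, z.emp_city, z.emp_district "
    "FROM employee AS e JOIN employee_zip AS z ON z.emp_zip = e.emp_zip "
    "WHERE e.emp_id = ?",
    (1001,),
).fetchone()
print(row)  # ('John', 'UP', 'Agra', 'Dayal Bagh')
```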
Boyce Codd normal form (BCNF)
● It is an advanced version of 3NF, which is why it is also referred to as 3.5NF.
● BCNF is stricter than 3NF.
● A table complies with BCNF if it is in 3NF and, for every non-trivial functional dependency X->Y, X is a super key of the table.
Boyce Codd normal form (BCNF)
● Example: Suppose there is a company in which employees work in more than one department. The data is stored like this:

emp_id  emp_nationality  emp_dept                      dept_type  dept_no_of_emp
1001    Austrian         Production and planning       D001       200
1001    Austrian         Stores                        D001       250
1002    American         Design and technical support  D134       100
1002    American         Purchasing department         D134       600

Functional dependencies in the table above:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}

Candidate key: {emp_id, emp_dept}

This table is not in BCNF, as neither emp_id nor emp_dept alone is a key.
Therefore, to make the table comply with BCNF, we can break it into three tables, as in the next slides.
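The BCNF test for this table can be sketched in Python: a functional dependency violates BCNF when the closure of its left-hand side does not cover every attribute. The helper names closure and bcnf_violations are illustrative only.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose left-hand side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


attrs = {"emp_id", "emp_nationality", "emp_dept", "dept_type", "dept_no_of_emp"}
fds = [
    (("emp_id",), ("emp_nationality",)),
    (("emp_dept",), ("dept_type", "dept_no_of_emp")),
]

violations = bcnf_violations(attrs, fds)
print(len(violations))                                # 2
print(closure(("emp_id", "emp_dept"), fds) == attrs)  # True
```

Both dependencies are reported, while {emp_id, emp_dept} together determine every attribute, which is why that pair is the candidate key.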
Boyce Codd normal form (BCNF)
● Emp_nationality table:

emp_id  emp_nationality
1001    Austrian
1002    American

● Emp_dept table:

emp_dept                      dept_type  dept_no_of_emp
Production and planning       D001       200
Stores                        D001       250
Design and technical support  D134       100
Purchasing department         D134       600
Boyce Codd normal form (BCNF)
● Emp_dept_mapping table:

emp_id  emp_dept
1001    Production and planning
1001    Stores
1002    Design and technical support
1002    Purchasing department

Functional dependencies:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}

Candidate keys:
For the first table: emp_id
For the second table: emp_dept
For the third table: {emp_id, emp_dept}

These tables are now in BCNF, since the left-hand side of both functional dependencies is a key.
Normalization Example
Example 1 – (1NF) (Original table)
Unnormalized table

The Phone number column holds two values, which violates the 1NF rule.
First normal form (1NF) (Result)

In this table, atomicity is achieved and every column holds a single value.
Example 2 (Original table) – 2NF
This table has a composite primary key: Employee ID and Department ID.

The non-key attribute is Office Location.

In this case, Office Location depends only on Department ID, which is only part of the primary key.

Therefore, this table does not satisfy second normal form.
Example 2 (Result)– 2NF
The Office Location column is now fully dependent on the primary key of its table, Department ID.
Example 3 (Original table) – 3NF
In this table, Student ID determines Subject ID and Subject ID determines Subject.
Therefore, Student ID determines Subject via Subject ID.
This implies a transitive functional dependency, so this structure does not satisfy third normal form.
Example 3 (Result) – 3NF
In these tables, all the non-key attributes are now fully functionally dependent only on the primary key.
In the first table, the columns Student name, Subject ID and Address are dependent only on Student ID.
In the second table, Subject is dependent only on Subject ID.
Example 4 (Original table) – BCNF
For this table, one student can enrol in many subjects.

There can be multiple professors teaching one subject.

For each subject a student enrols in, only one professor is assigned to that student.
Example 4 (Result) – BCNF

A new Professor ID column is introduced, since it will be the super key of its table, which removes the functional dependency that involved the non-prime attributes.
Types of key
● A key is a value used to identify a record in a table uniquely.
● A key could be a single column or a combination of multiple columns.
● Columns in a table that are NOT used to identify a record uniquely are called non-key columns.
● There are several types of keys:

○ Primary key
○ Composite key
○ Foreign key
Types of key – Primary key
● A primary key is a single column value used to identify a database record uniquely.
● It has the following attributes:

○ A primary key cannot be NULL
○ A primary key value must be unique
○ The primary key values should rarely be changed
○ The primary key must be given a value when a new record is inserted
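The uniqueness and not-null rules can be tried out with SQLite from Python. This is a sketch under assumptions: the employee table, the text ids such as 'E101' and the helper try_insert are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite, for historical reasons, does not
# imply it for primary keys that are not INTEGER PRIMARY KEY.
conn.execute(
    "CREATE TABLE employee (emp_id TEXT NOT NULL PRIMARY KEY, emp_name TEXT)"
)
conn.execute("INSERT INTO employee VALUES ('E101', 'Herschel')")


def try_insert(emp_id, emp_name):
    """Attempt an insert and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, emp_name))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


duplicate = try_insert("E101", "Jon")  # same key as Herschel
missing = try_insert(None, "Ron")      # no key value supplied
fresh = try_insert("E102", "Jon")      # new, non-null key

print(duplicate)
print(missing)
print(fresh)  # ok
```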
Types of key – Composite key
● A composite key is a primary key composed of multiple columns used to identify a record uniquely.
● Example:

● Therefore, a combination of Full Name and Address is needed to identify a record uniquely. This combination is called a composite key.
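A composite key on Full Name and Address can be declared as a table-level PRIMARY KEY. The Python sketch below uses SQLite; the member table and the sample people and addresses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member ("
    "full_name TEXT NOT NULL, address TEXT NOT NULL, "
    "PRIMARY KEY (full_name, address))"
)


def try_insert(full_name, address):
    """Return True if the row was accepted, False if the key already exists."""
    try:
        conn.execute("INSERT INTO member VALUES (?, ?)", (full_name, address))
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("Alice Tan", "12 Hill Road")
same_name = try_insert("Alice Tan", "9 Lake View")   # same name, other address
duplicate = try_insert("Alice Tan", "12 Hill Road")  # same name and address

print(first, same_name, duplicate)  # True True False
```

Neither column is unique on its own; only the pair is.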


Types of key – Foreign key
● A foreign key is a key that references the primary key of another table.
● A foreign key helps to connect tables together.
● A foreign key can have a different name from the primary key it references.
● It ensures that rows in one table have corresponding rows in another.
● Unlike primary keys, foreign keys do not have to be unique.
● A foreign key can be null, even though primary keys cannot.
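These foreign key properties can be observed with SQLite from Python. A minimal sketch: the department and employee tables and the column name works_in are assumptions for the example, and SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript(
    """
    CREATE TABLE department (dept_id TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        works_in TEXT REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES ('D001');
    """
)

conn.execute("INSERT INTO employee VALUES (1, 'D001')")  # matching department row
conn.execute("INSERT INTO employee VALUES (2, 'D001')")  # FK values may repeat
conn.execute("INSERT INTO employee VALUES (3, NULL)")    # FK may be null

try:
    conn.execute("INSERT INTO employee VALUES (4, 'D999')")  # no such department
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(orphan_rejected, employee_count)  # True 3
```

The foreign key column is called works_in while the referenced primary key is dept_id, so the two names do not have to match.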
Types of dependencies – transitive dependencies
● A transitive functional dependency exists when changing a non-key column might cause any of the other non-key columns to change.
Types of dependencies – partial dependencies
● A partial dependency occurs when a non-prime attribute is functionally dependent on only part of a candidate key.

In this table, membership ID and full name form the composite key that uniquely identifies each row. However, physical address depends only on full name. Therefore, a partial dependency exists in this table.
Summary of First Normal Form (1NF) rules
● Remove repeating groups from the table
● Create a separate table for each set of related data
● Identify each set of related data with a primary key
Summary of Second Normal Form (2NF) rules
● It has to be in 1st Normal Form
● The table should not contain any partial dependency
Summary of Third Normal Form (3NF) rules
● It has to be in 2nd Normal Form
● There should be no transitive dependency for non-prime attributes
Summary of Boyce Codd Normal Form (BCNF) rules
● It has to be in 3rd Normal Form
● It is a stronger version of 3NF and was developed by Raymond F. Boyce and Edgar F. Codd
● For every functional dependency A->B, A has to be a super key of that particular table
