0% found this document useful (0 votes)
135 views

Normal Forms

The document discusses database normalization, which involves removing redundant data from tables to improve storage efficiency, data integrity, and scalability. It originally established three normal forms (1NF, 2NF, 3NF) and most tables reaching 3NF are also in BCNF. Normalization generally involves splitting tables into multiple tables to remove multivalued attributes and dependencies. The examples demonstrate how to determine the normal form of relations and how to bring relations to 1NF by handling repeating groups.

Uploaded by

Arif Abdullah
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
135 views

Normal Forms

The document discusses database normalization, which involves removing redundant data from tables to improve storage efficiency, data integrity, and scalability. It originally established three normal forms (1NF, 2NF, 3NF) and most tables reaching 3NF are also in BCNF. Normalization generally involves splitting tables into multiple tables to remove multivalued attributes and dependencies. The examples demonstrate how to determine the normal form of relations and how to bring relations to 1NF by handling repeating groups.

Uploaded by

Arif Abdullah
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 39

Database Normalization

Md. Fazlul Karim Patwary


IIT
Database Normalization
• Database normalization is the process of removing
redundant data from tables to improve storage efficiency,
data integrity, and scalability.

• Classification of normalization are called normal forms (or


NF), and there are algorithms for converting a given
database between them.

• Normalization generally involves splitting existing tables


into multiple ones, which must be re-joined or linked each
time a query is issued.
Database Normalization
Edgar F. Codd originally established three normal forms:
1NF, 2NF and 3NF.

• There are now others that are generally accepted, but 3NF
is widely considered to be sufficient for most applications.

• Most tables when reaching 3NF are also in BCNF (Boyce-


Codd Normal Form).
Dependencies: Some Definitions
Multivalued Attributes or repeating groups: non-key
attributes or groups of non-key attributes the values of
which are not uniquely identified by the value of the
Primary Key. In other words, their values are not
functionally dependent on primary key.

STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Dependencies: Some Definitions
Partial Dependency – when an non-key attribute is
determined by a part, but not the whole, of a
COMPOSITE primary key.

Partial
Dependency
CUSTOMER

Cust_ID Name Order_ID


101 AT&T 1234
101 AT&T 156
125 Cisco 1250
Dependencies: Some Definitions

• Transitive Dependency – when a non-key


attribute determines another non-key
attribute.
Transitive
Dependency

EMPLOYEE

Emp_ID F_Name L_Name Dept_ID Dept_Name


111 Mary Jones 1 Acct
122 Sarah Smith 2 Mktg
Normal Forms: Review

• Unnormalized – There are multivalued


attributes or repeating groups
• 1 NF – No multivalued attributes or
repeating groups.
• 2 NF – 1 NF plus no partial dependencies
• 3 NF – 2 NF plus no transitive dependencies
Normal Forms: Definition

First Normal Form: If all attributes are directly or indirectly


determined by the primary key then the table is in First
Normal Form (1NF).

Second Normal Form: If a table is in first normal form and


there are no partial dependencies then the table is called
is in second normal form.

Third Normal Form: If a table is in second normal form and


there are no partial dependencies then the table is called
is in third normal form.
Table: 1

Title Author1 Author2 ISBN Subject Pages Publisher

Database Abraham Henry F. 0072958863 MySQL, 1168 McGraw-Hill


System Silberschatz Korth Computers
Concepts

Operating Abraham Henry F. 0471694665 Computers 944 McGraw-Hill


System Silberschatz Korth
Concepts
First Normal Form
In Table 1, we have two violations of First Normal Form:

• There are more than one author field,

• Subject field contains more than one piece of


information. With more than one value in a single field, it
would be very difficult to search for all books on a given
subject.
First Normal Table: Table 2

Title Author ISBN Subject Pages Publisher

Database System Abraham 0072958863 MySQL 1168 McGraw-Hill


Concepts Silberschatz
Database System Henry F. Korth 0072958863 Computers 1168 McGraw-Hill
Concepts

Operating System Henry F. Korth 0471694665 Computers 944 McGraw-Hill


Concepts

Operating System Abraham 0471694665 Computers 944 McGraw-Hill


Concepts Silberschatz
Decomposition

Subject Table

Subject_ID Subject
Author Table
1 MySQL
Author_ID Last Name First Name
2 Computers
1 Silberschatz Abraham

2 Korth Henry
Book Table

ISBN Title Pages Publisher

0072958863 Database System 1168 McGraw-Hill


Concepts

0471694665 Operating System 944 McGraw-Hill


Concepts
Decomposition

Publisher Table

Publisher_ID Publisher Name

1 McGraw-Hill

Book Table
ISBN Title Pages Publisher_ID

0072958863 Database System 1168 1


Concepts

0471694665 Operating System 944 1


Concepts
Example 1: Determine NF
• ISBN  Title
• ISBN  Publisher
• Publisher  Address All attributes are directly or
indirectly determined by the
primary key; therefore, the
relation is at least in 1 NF

BOOK

ISBN Title Publisher Address


Example 1: Determine NF

• ISBN  Title The relation is at least in 1NF.


• ISBN  Publisher There is no COMPOSITE primary
key, therefore there can’t be
• Publisher  Address partial dependencies. Therefore,
the relation is at least in 2NF
BOOK

ISBN Title Publisher Address


Example 1: Determine NF

• ISBN  Title Publisher is a non-key attribute, and it


• ISBN  Publisher determines Address, another non-key
attribute. Therefore, there is a
• Publisher  Address transitive dependency, which means
that the relation is NOT in 3 NF.

BOOK

ISBN Title Publisher Address


Example 1: Determine NF
We know that the relation is at
• ISBN  Title least in 2NF, and it is not in 3 NF.
• ISBN  Publisher Therefore, we conclude that the
relation is in 2NF.
• Publisher  Address

BOOK

ISBN Title Publisher Address


Example 2: Determine NF
• Product_ID  Description

All attributes are directly or indirectly


determined by the primary key;
therefore, the relation is at least in 1 NF
ORDER

Order_No Product_ID Description


Example 2: Determine NF
• Product_ID  Description

The relation is at least in 1NF.


There is a COMPOSITE Primary Key (PK) (Order_No,
Product_ID), therefore there can be partial dependencies.
Product_ID, which is a part of PK, determines Description;
hence, there is a partial dependency. Therefore, the
relation is not 2NF. No sense to check for transitive
dependencies!

ORDER

Order_No Product_ID Description


Example 2: Determine NF
• Product_ID  Description

We know that the relation is at least in


1NF, and it is not in 2 NF. Therefore, we
conclude that the relation is in 1 NF.

ORDER

Order_No Product_ID Description


Example 2: Determine NF
• Product_ID  Description

In your solution you will write the following


justification:
1) No M/V attributes, therefore at least 1NF
2) There is a partial dependency (Product_ID 
Description), therefore not in 2NF
Conclusion: The relation is in 1NF

ORDER

Order_No Product_ID Description


Example 3: Determine NF
• Part_ID  Description Comp_ID and No are not
• Part_ID  Price determined by the primary
key; therefore, the relation is
• Part_ID, Comp_ID  No NOT in 1 NF. No sense in
looking at partial or transitive
dependencies.

PART

Part_ID Descr Price Comp_ID No


Example 3: Determine NF
In your solution you will write the
• Part_ID  Description following justification:
• Part_ID  Price 1) There are M/V attributes;
therefore, not 1NF
• Part_ID, Comp_ID  No Conclusion: The relation is not
normalized.

PART

Part_ID Descr Price Comp_ID No


Bringing a Relation to 1NF

STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 1NF
• Option 1: Make a determinant of the
repeating group (or the multivalued
attribute) a part of the primary key.
Composite
Primary Key

STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 1NF
• Option 2: Remove the entire repeating group from the
relation. Create another relation which would contain all
the attributes of the repeating group, plus the primary
key from the first relation. In this new relation, the
primary key from the original relation and the
determinant of the repeating group will comprise a
primary key.

STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 1NF
STUDENT

Stud_ID Name
101 Lennon
125 Jonson

STUDENT_COURSE

Stud_ID Course Units


101 MSI 250 3
101 MSI 415 3
125 MSI 331 3
Bringing a Relation to 2NF

Composite
Primary Key

STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 2NF
• Goal: Remove Partial Dependencies
Partial
Composite Dependencies
Primary Key

STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 2NF
• Remove attributes that are dependent from the part but
not the whole of the primary key from the original
relation. For each partial dependency, create a new
relation, with the corresponding part of the primary key
from the original as the primary key.
STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 2NF
CUSTOMER
STUDENT_COURSE
Stud_ID Name Course_ID Units
101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00 Stud_ID Course_ID
125 Johnson MSI 331 3.00
101 MSI 250
101 MSI 415
125 MSI 331

STUDENT COURSE

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 3NF

• Goal: Get rid of transitive dependencies.

Transitive
Dependency
EMPLOYEE

Emp_ID F_Name L_Name Dept_ID Dept_Name


111 Mary Jones 1 Acct
122 Sarah Smith 2 Mktg
Bringing a Relation to 3NF
• Remove the attributes, which are dependent on a non-
key attribute, from the original relation. For each
transitive dependency, create a new relation with the
non-key attribute which is a determinant in the transitive
dependency as a primary key, and the dependent non-key
attribute as a dependent.

EMPLOYEE

Emp_ID F_Name L_Name Dept_ID Dept_Name


111 Mary Jones 1 Acct
122 Sarah Smith 2 Mktg
Bringing a Relation to 3NF
EMPLOYEE

Emp_ID F_Name L_Name Dept_ID Dept_Name


111 Mary Jones 1 Acct
122 Sarah Smith 2 Mktg

EMPLOYEE

Emp_ID F_Name L_Name Dept_ID


111 Mary Jones 1
122 Sarah Smith 2

DEPARTMENT

Dept_ID Dept_Name
1 Acct
2 Mktg
Boyce-Codd Normal Form
A relation schema R is in BCNF with respect to a set F of functional
dependencies if for all functional dependencies in F+ of the form



where   R and   R, at least one of the following holds:


•    is trivial (i.e.,   )
•  is a superkey for R
Example schema not in BCNF:

bor_loan = ( customer_id, loan_number, amount )

because loan_number  amount holds on bor_loan but loan_number


is not a superkey
Third Normal Form
A relation schema R is in third normal form (3NF) if
• for all functional dependencies in F+ of the form   ,
where   R and   R, at least one of the following
holds:
    is a trivial functional dependency (  )
  contains a key for R
 every B   is part of some candidate key of R

• BCNF and 3NF


– A BCNF relation is in 3NF
– A 3NF relation is not necessary in BCNF
Third Normal Form
• Example
– Consider the two relational schemas
• R1 = (cust-num, name, house-num, street, city, state)
cust-num  name, house-num, street, city, state

• R2 = (house-num, street, city, state, zip)


house-num, street, city, state  zip
zip  state

– Are these relations in 3NF?

– For R1
• The only nontrivial functional dependencies in F+ are those with cust-num
as a member of the left-side of the FD
• As cust-num is a superkey of R1, these functional dependencies satisfy the
second condition for 3NF. i.e satisfy BCNF conditions.
Third Normal Form
• R1 = (cust-num, name, house-num, street, city, state)
cust-num  name, house-num, street, city, state

• R2 = (house-num, street, city, state, zip)


house-num, street, city, state  zip
zip  state

– For R2
There are two kinds of nontrivial functional dependencies in F+:
– Those with (house-num, street, city, state) as a subset of the left
hand side of the FD: As (house-num, street, city, state) is a
superkey for R2, these functional dependencies satisfy the
second condition for 3NF
– Those of the form   {zip}    {state} where   
For any such functional dependency:
(  {state}) – (  {zip}) = {state} (or = )
Because state is part of a candidate key of R2, such functional
dependencies satisfy the third condition for 3NF
Create Tables
Assume the supply department in a company is in charge of
bringing parts from different manufacturers. A part is uniquely
identified by its name and manufacturer; for convenience, a part
is also given an id.

A separate delivery is necessary for each type of part, from


each manufacturer. At most one delivery is made in one day for
one type of part from one manufacturer. A “transport” (e.g.
van23) is associated with each delivery. Each transport has a
unique driver. A driver can drive more than one “transports”.

First create non-normal tables and then progressively


create normal forms.

You might also like