0% found this document useful (0 votes)
9 views30 pages

4 - Normalization

Normalization is a data modeling technique aimed at organizing data to minimize redundancy and avoid anomalies during data operations such as insertion, update, and deletion. It involves decomposing relations into smaller, well-structured tables and follows specific steps to achieve various normal forms, addressing issues like functional, partial, and transitive dependencies. The ultimate goal is to create a better representation of real-world data for easier maintenance and management.

Uploaded by

yvoalexis24
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views30 pages

4 - Normalization

Normalization is a data modeling technique aimed at organizing data to minimize redundancy and avoid anomalies during data operations such as insertion, update, and deletion. It involves decomposing relations into smaller, well-structured tables and follows specific steps to achieve various normal forms, addressing issues like functional, partial, and transitive dependencies. The ultimate goal is to create a better representation of real-world data for easier maintenance and management.

Uploaded by

yvoalexis24
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 30

NORMALIZATION

Definitions
• A process of analyzing a relation to ensure that it is well formed

• Normalization involves decomposing relations to produce smaller,


well-structured relations

• A formal process for deciding which attributes should be grouped


together in a relation so that all anomalies are removed

• More specifically, if a relation is normalized (well-formed), rows can


be inserted, deleted, or modified without creating anomalies
Goals

• Minimize data redundancy thereby conserving space and


avoiding anomalies

• Make it easier to maintain data

• Provide a better design that is an improved representation


of the real world
Normalization is a logical data-
modelling technique used to ensure
that data are well structured from an
organization-wide view
Modification Anomalies
Tables that are not normalized are susceptible to experiencing
modification anomalies

◦ Insertion Anomaly - occurs when certain attributes cannot be inserted into


the database without the presence of other attributes

◦ Update Anomaly - exists when one or more instances of duplicated data is updated,
but not all

◦ Deletion Anomaly - exists when certain attributes are lost because of the deletion of
other attributes
Insertion Anomaly
customerId customerName carId year make model
1 Mischka Sophia 102 2003 Volkswagen Golf

2 Kenra Elienette 101 2014 BMW 550i

3 Thea Pauline 103 2010 Nissan Sentra

Most modification problems are solved by breaking an existing


table into two or more tables through a process known as
normalization
Customer Car

customerId customerName carId carId year make model

1 Mischka Sophia 102 102 2003 Volkswagen Golf

2 Kenra Elienette 101 101 2014 BMW 550i

3 Thea Pauline 103 103 2010 Nissan Sentra

104 2020 Honda Civic


Update Anomaly
exists when one or more instances of duplicated data is updated, but not all

Customer
customerId carId year make model price
Name
Mischka
1 102 2014 Mitsubishi Mirage 900K
Sophia
Kenra
2 101 2014 Toyota Prius 2M
Elienette
Thea
3 103 2010 Nissan Sentra 900K
Pauline
4 Ruel Paolo 102 2014 Mitsubishi Mirage 900K
James
5 104 2016 Honda Civic 1.2M
Aaron
Eunice
6 102 2014 Mitsubishi Mirage 900K
Czarina
Customer
customerId carId carId year make model price
Name
Mischka 102 2014 Mitsubishi Mirage 900K
1 102
Sophia
Kenra 101 2014 Toyota Prius 2M
2 101
Elienette
Thea 103 2010 Nissan Sentra 900K
3 103
Pauline
4 Ruel Paolo 102 102 2014 Mitsubishi Mirage 900K

James 104 2016 Honda Civic 1.2M


5 104
Aaron
Eunice 102 2014 Mitsubishi Mirage 900K
6 102
Czarina
Deletion Anomaly
exists when certain attributes are lost because of the deletion of
other attributes

EmpID EmpName Address DeptID DeptName DeptMgr

1 Hanae Takada Osaka D01 Accounting Valdez

2 Atsuko Ono Tokyo D02 Operations Deverala

3 Chigusa Mami Nagoya D03 Services Talavera

4 Akiko Mizuno Kyoto D04 Training Antonio

5 Chelsea Dy Chennai D02 Operations Deverala

6 Donald Cruz Mumbai D01 Accounting Valdez


EmpID EmpName Address DeptID DeptID DeptName DeptMgr

Hanae
1 Osaka D01 D01 Accounting Valdez
Yoshimori

2 Atsuko Imai Tokyo D02 D02 Operations Deverala

3 Chigusa Sazaki Nagoya D03 D03 Services Talavera

4 Reiko Kawai Kyoto D04 D04 Training Antonio

5 Kalpint Patel Chennai D02 D02 Operations Deverala

Donald
6 Mumbai D01 D01 Accounting Valdez
McElhenny
Definitions

Functional dependency
You can also describe this as a relationship where knowing the
value of one attribute (or a set of attributes) is enough to tell you
the value of another attribute (or set of attributes) in the same
table.

Examples:
SSS No. -> Employee’s Name, Employee’s contributions …
Bank Account No. -> Customer Name, Amount of Deposit…
Partial functional dependency – when a non-key attribute is
functionally dependent on part (but not all) of the primary key

Flight NoOf
FlightDate From To
Number Passengers
1750 Kuala
UA 36 3-Sep-16 1410 Manila 200
Lumpur

0455 Kuala
AA 704 31-Oct-16 0120 Manila 150
Lumpur

1750 Kuala
UA 36 4-Sep-16 1410 Manila 190
Lumpur

0435 Kuala
BA 9 31-Oct-16 0100 Manila 100
Lumpur
Transitive dependency – a functional dependency between the
primary key and one or more nonkey attributes that are
dependent on the primary key via another nonkey attribute

This means if we have a primary key A and a non-key domain B


and C where C is more dependent on B than A and B is directly
dependent on A, then C can be considered transitively dependent
on A.

A B C
Transitive dependency – a functional dependency between the
primary key and one or more nonkey attributes that are
dependent on the primary key via another nonkey attribute

This means if we have a primary key DeptID and a non-key


domain MgrID and MgrName where MgrName is more
dependent on MgrID than DeptID and MgrID is directly
dependent on DeptID, then MgrName can be considered
transitively dependent on DeptID.

DeptID MgrID MgrName


Steps in Normalization
Normalization can be accomplished in stages, each of which
corresponds to a normal form

1. First Normal Form - Any multivalued attribute (also called


repeating groups) have been removed

2. Second Normal Form – Any partial functional dependencies have


been removed

3. Third Normal Form – Any transitive dependencies have been


removed
Order ID : 1006 Order Date : Oct 24, 2019

Customer ID : 22
Customer Name : Dan’s Furniture
Customer Address : Fullerton, California

Product ID Product Description Product Finish Unit Price Ordered Quantity


7 Tea Table Walnut $450 2
5 TV Stand Oak $300 6
4 Porch Swing Pine $800 5
Order ID : 1007 Order Date : Oct 31, 2019

Customer ID : 65
Customer Name : Furniture Barn
Customer Address : Fort Collins, Colorado

Product ID Product Description Product Finish Unit Price Ordered Quantity


11 Table Cherry $1150 3
4 Porch Swing Pine $800 7
Table with multivalued Attributes:
Not in 1st Normal Form

Order Order Cust ID Cust Cust Prod Product Prod Unit Ordered
ID Date Name Address ID Descriptio Finish Price Qty
n

1006 24 Oct 22 Dan’s Fullerton, 7 Tea Table Walnu $450 2


2019 Furniture CA 5 TV Stand t $300 6
4 Porch Oak $800 5
Swing Pine
1007 31 Oct 65 Furniture Ft. 11 Table Cherry $1150 3
2019 Barn Collins,Co 4 Porch Pine $800 7
Swing
Table with no multivalued Attributes:
In 1st Normal Form
Order Order Cust Customer Customer Prod Product Prod Unit Ordere
ID Date ID Name Address ID Description Finish Price d Qty
1006 24 Oct 22 Dan’s Fullerton, 7 Tea Table Walnut $450 2
2019 Furniture CA
1006 24 Oct 22 Dan’s Fullerton, 5 TV Stand Oak $300 6
2019 Furniture CA
1006 24 Oct 22 Dan’s Fullerton, 4 Porch Swing Pine $800 5
2019 Furniture CA
1007 31 Oct 65 Furniture Ft. 11 Table Cherry $1150 3
2019 Barn Collin,Co
1007 31 Oct 65 Furniture Ft. 4 Porch Swing Pine $800 7
2019 Barn Collin,Co
Going to the 2nd Normal Form Full Dependency

Order Order Cust Customer Customer Prod Product Product Unit Ordered
ID Date ID Name Address ID Description Finish Price Qty

Partial Dependencies Partial Dependencies

Order_ID -> Order_Date, Customer_ID, Customer_Name,


Customer_Address

Product_ID -> Product_Description, Product_Finishm Unit_Price

Order_ID, Product_ID -> Order_Quantity


Second Normal Form
1NF PLUS No partial dependencies

Order Customer Customer Customer


Order ID
Date ID Name Address

Product Product
Product ID Unit Price
Description Finish

Ordered
Order ID Product ID
Quantity
Going to the 3rd Normal Form
Transitive Dependencies

Order Customer Customer Customer


Order ID
Date ID Name Address

Product Product
Product ID Unit Price
Description Finish

Ordered
Order ID Product ID
Quantity
Third Normal Form
2NF PLUS No transitive dependencies

OrderT

Product Product
Product ID Unit Price Product
Description Finish

Ordered
Order ID Product ID Order_Detail
Quantity

Customer
Third Normal Form
2NF PLUS No transitive dependencies

OrderT

Product Product
Product ID Unit Price Product
Description Finish

Ordered
Order ID Product ID Order_Detail
Quantity

Customer
Dependency Diagram Full Dependency

Transitive Dependencies

Order Order Cust Customer Customer Prod Product Product Unit Ordered
ID Date ID Name Address ID Description Finish Price Qty

Partial Dependencies Partial Dependencies

Order_ID, Product_ID -> Order_Quantity


Order_ID -> Order_Date, Customer_ID
Product_ID -> Product_Description, Product_Finishm Unit_Price
Customer_ID -> Customer_Name, Customer_Address
UNF

ORDERPRODUCT(orderid, orderdate,custid,custname, custadrs,{prodid, proddesc,prodfin,uprice,qty})

Order Order Cust ID Cust Cust Prod Product Prod Unit Ordered
ID Date Name Address ID Descriptio Finish Price Qty
n

1006 24 Oct 22 Dan’s Fullerton, 7 Tea Table Walnu $450 2


2019 Furniture CA 5 TV Stand t $300 6
4 Porch Oak $800 5
Swing Pine
1007 31 Oct 65 Furniture Ft. 11 Table Cherry $1150 3
2019 Barn Collins,Co 4 Porch Pine $800 7
Swing
1NF

ORDERPRODUCT(orderid, orderdate,custid,custname, custadrs,prodid, proddesc,prodfin,uprice,qty)

Order Order Cust Customer Customer Prod Product Prod Unit Ordere
ID Date ID Name Address ID Description Finish Price d Qty
1006 24 Oct 22 Dan’s Fullerton, 7 Tea Table Walnut $450 2
2019 Furniture CA
1006 24 Oct 22 Dan’s Fullerton, 5 TV Stand Oak $300 6
2019 Furniture CA
1006 24 Oct 22 Dan’s Fullerton, 4 Porch Swing Pine $800 5
2019 Furniture CA
1007 31 Oct 65 Furniture Ft. 11 Table Cherry $1150 3
2019 Barn Collin,Co
1007 31 Oct 65 Furniture Ft. 4 Porch Swing Pine $800 7
2019 Barn Collin,Co
3NF
OrderT(orderID, orderdate,custid)

Product(prodid, proddesc,finish,
uprice)
OrderT
Order_detail(orderID, prodid, qty)

Customer(custid,custname,custadrs) Product Product


Product ID Unit Price
Description Finish

Ordered
Order ID Product ID Order_Detail
Quantity

Customer
Source

Modern Database Management, 11th Ed


Hoffer , Prescott, Topi
Some definitions (anomalies), examples (functional dependencies) taken from the net

You might also like