Lecture 5 - Normalization of Relational Tables

The document discusses normalization of tables in database design. It describes common problems that can occur with single table designs, such as anomalies from inserting, deleting or updating records. The document then explains the process of normalization to transform entity-relationship diagrams into tables according to normalization rules. This helps structure tables to minimize problems like redundancy, multi-valued attributes, and various types of anomalies. The goals of normalization include achieving first normal form, which addresses repeating groups, and higher normal forms up to third normal form.

Uploaded by

Abdul Ghani Khan
Copyright
© Attribution Non-Commercial (BY-NC)

Normalization of Tables

“Between two evils, choose neither; between two goods, choose both.”
Tryon Edwards
Steps to E-R Transformation
1. Identify entities
2. Identify relationships
3. Determine relationship type
4. Determine level of participation
5. Assign an identifier for each entity
6. Draw completed E-R diagram
7. Deduce a set of preliminary skeleton tables along with a
proposed primary key for each table (using rules provided)
8. Develop a list of all attributes of interest (not already
listed) and systematically assign each to a table in such a
way as to achieve a 3NF design (i.e., no repeating groups,
no partial dependencies, and no transitive dependencies)
Tables
 Database design is the process of separating
information into multiple tables that are related to
each other
 Single table designs work only for the simplest of
situations in which data integrity problems are
easy to correct
 Anomalies (abnormalities) often arise in single
table designs as a result of inserting, deleting, or
updating records
 Some tables are better structured than others (i.e.,
result in fewer anomalies)
Redundancy
Unnecessary repetition or duplication of data

increases likelihood of errors due to keying inconsistencies
Multi-valued Problems
Solution 1? Include all authors’ names in a single field

Difficult to search for a single author’s name or create an
alphabetical list of authors
Multi-valued Problems
Solution 2? Add multiple columns, one for each value

empty fields waste storage space

awkward to search across fields (e.g., Any books by Snoopy? Must search Author1, Author2, etc.)

requires a new column every time a book has an additional author
Multi-valued Problems
Solution 3? Add multiple rows, one for each value

Data about a book must be repeated as many times as the book has authors (this
redundancy invites keying errors and wastes storage space in large files)

a count of the total number of books, or of the books from each publisher, would be wrong
Update Anomalies

To update an agent’s telephone number, every instance of it must be
changed

if we miss an instance or enter it incorrectly, the table becomes
unreliable

sometimes previous errors propagate further errors

• An update anomaly occurs when multiple record changes for a
single attribute are necessary.
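The update anomaly can be sketched with Python's built-in sqlite3 module. The slides do not show the agent and customer table, so the table name `customer`, its columns, and the sample rows below are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table design: every customer row repeats the agent's phone.
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT, "
    "agent_name TEXT, agent_phone TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "Customer A", "Agent X", "555-0100"),
        (2, "Customer B", "Agent X", "555-0100"),
        (3, "Customer C", "Agent Y", "555-0199"),
    ],
)

# An incomplete update: only one of Agent X's two rows is changed.
conn.execute(
    "UPDATE customer SET agent_phone = '555-0123' "
    "WHERE agent_name = 'Agent X' AND cust_id = 1"
)

# The same agent now has two different phone numbers on record.
phones = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT agent_phone FROM customer "
        "WHERE agent_name = 'Agent X' ORDER BY agent_phone"
    )
]
print(phones)
```

Storing the phone number once, in a separate agent table, would make this inconsistency impossible.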
Deletion Anomalies

What happens if a customer record is deleted?

What happens if an agent record is deleted?

• A deletion anomaly occurs when the removal of a record
results in the unintended loss of important information.
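A minimal sketch of a deletion anomaly, again using a hypothetical single `customer` table (names and rows are invented for illustration): deleting the only customer of an agent also erases everything known about that agent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table design mixing customer and agent facts.
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT, "
    "agent_name TEXT, agent_phone TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "Customer A", "Agent X", "555-0100"),
        (2, "Customer B", "Agent X", "555-0100"),
        (3, "Customer C", "Agent Y", "555-0199"),
    ],
)

# Customer C is Agent Y's only customer; removing the row removes the agent too.
conn.execute("DELETE FROM customer WHERE cust_id = 3")

remaining_agents = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT agent_name FROM customer ORDER BY agent_name"
    )
]
print(remaining_agents)
```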
Insertion Anomalies

What happens if we want to enter information regarding
an agent for whom we do not have a customer?

Do we add null values (blanks) for the other fields?

• An insertion anomaly occurs when there is not a reasonable place
to assign attributes and attribute values to records.
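The insertion anomaly can be sketched the same way. The schema below is hypothetical, including the assumption that a customer name is mandatory (`NOT NULL`): an agent without a customer then has nowhere to go until agents get a table of their own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table design; a customer name is assumed to be required.
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, "
    "cust_name TEXT NOT NULL, agent_name TEXT, agent_phone TEXT)"
)

# A new agent with no customer yet cannot be recorded in this table.
try:
    conn.execute(
        "INSERT INTO customer (cust_name, agent_name, agent_phone) "
        "VALUES (NULL, 'Agent Z', '555-0142')"
    )
    inserted = True
except sqlite3.IntegrityError:
    inserted = False

# With a separate (hypothetical) agent table the same fact has a natural home.
conn.execute("CREATE TABLE agent (agent_name TEXT PRIMARY KEY, agent_phone TEXT)")
conn.execute("INSERT INTO agent VALUES ('Agent Z', '555-0142')")
agent_count = conn.execute("SELECT COUNT(*) FROM agent").fetchone()[0]
print(inserted, agent_count)
```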
The Problem with Nulls

1. Nulls used in mathematical expressions
- unknown quantity leads to unknown total value
- misleading value of all inventory

Product ID  Product Description               Category     Price     Quantity  Total Value
801         Shur-Lock U-Lock                  Accessories  75.00     (null)    (null)
802         SpeedRite Cyclecomputer           (null)       60.00     20        1,200.00
803         SteelHead Microshell Helmet       Accessories  40.00     40        1,600.00
804         SureStop 133-MB Brakes            Components   25.00     10        250.00
805         Diablo ATM Mountain Bike          Bikes        1,200.00  (null)    (null)
806         Ultravision Helmet Mount Mirrors  (null)       7.45      10        74.50

Total: 3,124.50

2. Nulls used in aggregate functions
- blanks exist under category
- cannot be counted because they don’t exist!

Category     Total Occurrences
(null)       0
Accessories  2
Bikes        1
Components   1
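Both null problems can be reproduced with Python's sqlite3 module. This is a sketch using the inventory rows from the slide; the table and column names are assumptions, and the missing cells are stored as SQL NULL (Python `None`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, description TEXT, "
    "category TEXT, price REAL, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?, ?)",
    [
        (801, "Shur-Lock U-Lock", "Accessories", 75.00, None),
        (802, "SpeedRite Cyclecomputer", None, 60.00, 20),
        (803, "SteelHead Microshell Helmet", "Accessories", 40.00, 40),
        (804, "SureStop 133-MB Brakes", "Components", 25.00, 10),
        (805, "Diablo ATM Mountain Bike", "Bikes", 1200.00, None),
        (806, "Ultravision Helmet Mount Mirrors", None, 7.45, 10),
    ],
)

# 1. price * quantity is NULL when quantity is NULL, and SUM skips NULLs,
#    so the inventory total silently leaves out products 801 and 805.
total = conn.execute("SELECT SUM(price * quantity) FROM product").fetchone()[0]

# 2. COUNT(category) counts only non-NULL values, so the NULL group reports 0
#    even though two products have no category.
category_counts = conn.execute(
    "SELECT category, COUNT(category) FROM product "
    "GROUP BY category ORDER BY category"
).fetchall()

print(total)
print(category_counts)
```

The total reflects only four of the six products, and the uncategorized products form a group whose count is zero, which matches the two tables on the slide.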
Database Design Problems
• Use of the relational database model removes
some database anomalies
• Further removal of database anomalies relies on
a structured technique called normalization
• Presence of some of these anomalies is
sometimes justified in order to enhance
performance

Thus, database design consists of balancing the
art of design with the science of design
Normalization
• The goal in database design is to create well-structured tables
• Transform E-R models to tables following the rules provided
• Assuring that tables are well-structured with minimal problems
(redundancy, multi-valued attributes, update anomalies, insertion
anomalies, deletion anomalies) is achieved using a structured technique
called normalization
• Normalization is the structured decomposition of one table into two or
more tables using a procedure designed to determine the most
appropriate split
• Normalization is our method of making sure the E-R design was correct
in the first place
• Normalization refers to a series of forms: we will cover 1NF to 3NF,
which is usually sufficient. Note that there are also Boyce-Codd
Normal Form (BCNF), Fourth Normal Form (4NF), Fifth Normal Form (5NF),
and Domain-Key Normal Form (DKNF)
First Normal Form
• A table is in first normal form if it meets the following criteria: the data
are stored in a two-dimensional table with no two rows identical and
there are no repeating groups.

The following table is NOT in first normal form because it contains a multi-valued
attribute (an attribute with more than one value in each row).

Member_ID  Memb_FName  Memb_LName  Hobbies
1          Rodney      Jones       hiking, cooking
3          Francine    Moire       golf, theatre, hiking
2          Anne        Abel        concerts
Handling multi-valued attributes: Incorrect Solutions

All values in a single field:

Member_ID  Memb_FName  Memb_LName  Hobbies
1          Rodney      Jones       hiking, cooking
3          Francine    Moire       golf, theatre, hiking
2          Anne        Abel        concerts

One column per value:

Member_ID  Memb_FName  Memb_LName  Hobby1    Hobby2   Hobby3
1          Rodney      Jones       hiking    cooking
3          Francine    Moire       golf      theatre  hiking
2          Anne        Abel        concerts

One row per value:

Member_ID  Memb_FName  Memb_LName  Hobbies
1          Rodney      Jones       hiking
1          Rodney      Jones       cooking
3          Francine    Moire       golf
3          Francine    Moire       theatre
3          Francine    Moire       hiking
2          Anne        Abel        concerts
Handling multi-valued attributes: Correct Solution
• Create another entity (table) to handle multiple instances of the repeating group. This second table is then linked to the original table with an
identifier (i.e., foreign key). This solution has the following advantages:

no limit to the number of hobbies per member

no waste of disk space

searching becomes much easier within a column (e.g., who likes hiking?)

Original table:

Member_ID  Memb_FName  Memb_LName  Hobbies
1          Rodney      Jones       hiking, cooking
3          Francine    Moire       golf, theatre, hiking
2          Anne        Abel        concerts

Split into two tables:

Member_ID  Memb_FName  Memb_LName
1          Rodney      Jones
3          Francine    Moire
2          Anne        Abel

Member_ID  Hobby
1          hiking
1          cooking
3          golf
3          theatre
3          hiking
2          concerts
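The split above can be sketched in plain Python. The variable names are illustrative assumptions; the rows are the three members from the slide.

```python
# Unnormalized rows: the last field holds several hobbies in one string.
members_unnormalized = [
    (1, "Rodney", "Jones", "hiking, cooking"),
    (3, "Francine", "Moire", "golf, theatre, hiking"),
    (2, "Anne", "Abel", "concerts"),
]

# Table 1: one row per member, without the multi-valued attribute.
members = [(mid, first, last) for mid, first, last, _ in members_unnormalized]

# Table 2: one row per (member, hobby) pair, linked back by Member_ID.
member_hobbies = [
    (mid, hobby.strip())
    for mid, _, _, hobbies in members_unnormalized
    for hobby in hobbies.split(",")
]

# "Who likes hiking?" is now a simple search within one column.
hikers = sorted(mid for mid, hobby in member_hobbies if hobby == "hiking")
print(hikers)
```

A member with a fourth or fifth hobby simply adds rows to the second table; no new columns are needed.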
Handling Repeating Groups
• An attribute can have a group of several data entries. Repeating groups can be
removed by creating another table which holds those attributes that repeat. This
second table (validation table) is then linked to the original table with an identifier
(i.e., foreign key)
• Advantages: fewer characters stored in the tables; reduces miskeying and update anomalies

Original table:

Product_ID  Product_Name                      Category   Price
801         Shur-Lock U-Lock                  Accessory  75.00
802         SpeedRite Cyclecomputer           Component  60.00
803         SteelHead Microshell Helmet       Accessory  40.00
804         SureStop 133-MB Brakes            Component  25.00
805         Diablo ATM Mountain Bike          Bike       1,200.00
806         Ultravision Helmet Mount Mirrors  Accessory  7.45

Split into two tables:

Product_ID  Product_Name                      Category  Price
801         Shur-Lock U-Lock                  1         75.00
802         SpeedRite Cyclecomputer           2         60.00
803         SteelHead Microshell Helmet       1         40.00
804         SureStop 133-MB Brakes            2         25.00
805         Diablo ATM Mountain Bike          3         1200.00
806         Ultravision Helmet Mount Mirrors  1         7.45

Category_ID  Category
1            Accessory
2            Component
3            Bike
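A sketch of the validation-table idea using sqlite3. The lower-case table and column names are assumptions, and foreign-key enforcement is switched on explicitly because SQLite leaves it off by default. Product 807 is a made-up row used only to show a rejected insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.execute(
    "CREATE TABLE category (category_id INTEGER PRIMARY KEY, category TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT, "
    "category_id INTEGER REFERENCES category(category_id), price REAL)"
)
conn.executemany(
    "INSERT INTO category VALUES (?, ?)",
    [(1, "Accessory"), (2, "Component"), (3, "Bike")],
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        (801, "Shur-Lock U-Lock", 1, 75.00),
        (802, "SpeedRite Cyclecomputer", 2, 60.00),
        (803, "SteelHead Microshell Helmet", 1, 40.00),
        (804, "SureStop 133-MB Brakes", 2, 25.00),
        (805, "Diablo ATM Mountain Bike", 3, 1200.00),
        (806, "Ultravision Helmet Mount Mirrors", 1, 7.45),
    ],
)

# A category id that is not in the validation table is rejected outright.
try:
    conn.execute("INSERT INTO product VALUES (807, 'Hypothetical item', 9, 1.00)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The category name is recovered through a join on the identifier.
accessories = [
    row[0]
    for row in conn.execute(
        "SELECT p.product_id FROM product p "
        "JOIN category c ON c.category_id = p.category_id "
        "WHERE c.category = 'Accessory' ORDER BY p.product_id"
    )
]
print(rejected, accessories)
```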
Second Normal Form
• A table is in second normal form if it meets the following criteria: the relation is in
first normal form, and all nonkey attributes are functionally dependent on the entire
primary key.

This is a concern only for tables that have a composite primary key.

In the following table, EmpID and Training together (the composite primary key) determine Date,
whereas EmpID alone (part of the primary key) determines Dept.

EmpID  Training  Date       Dept
1      Word      12-Sep-99  Oncology
3      Excel     14-Oct-99  Paediatrics
2      Excel     14-Oct-99  Renal
1      Access    23-Nov-99  Oncology
Removing Partial Dependencies
• Remove partial dependencies by separating the relation into two relations. Reduces the problems of

update anomalies

delete anomalies

insert anomalies

redundancies

Original table:

EmpID  Training  Date       Dept
1      Word      12-Sep-99  Oncology
3      Excel     14-Oct-99  Paediatrics
2      Excel     14-Oct-99  Renal
1      Access    23-Nov-99  Oncology

Split into two tables:

EmpID  Training  Date
1      Word      12-Sep-99
3      Excel     14-Oct-99
2      Excel     14-Oct-99
1      Access    23-Nov-99

EmpID  Dept
1      Oncology
2      Renal
3      Paediatrics
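The partial dependency can be checked against the sample rows with a small helper. `determines` is a hypothetical function written for this sketch; note that sample data can only refute a functional dependency, never prove that it holds in general.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, equal lhs values always go with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


training = [
    {"EmpID": 1, "Training": "Word", "Date": "12-Sep-99", "Dept": "Oncology"},
    {"EmpID": 3, "Training": "Excel", "Date": "14-Oct-99", "Dept": "Paediatrics"},
    {"EmpID": 2, "Training": "Excel", "Date": "14-Oct-99", "Dept": "Renal"},
    {"EmpID": 1, "Training": "Access", "Date": "23-Nov-99", "Dept": "Oncology"},
]

# Dept is consistent with depending on EmpID alone (a partial dependency).
dept_depends_on_empid = determines(training, ["EmpID"], ["Dept"])
# Date does not depend on EmpID alone: employee 1 has two different dates.
date_depends_on_empid = determines(training, ["EmpID"], ["Date"])

# Decomposition: Dept moves to its own table keyed by EmpID.
emp_training = [(r["EmpID"], r["Training"], r["Date"]) for r in training]
emp_dept = sorted({(r["EmpID"], r["Dept"]) for r in training})
print(dept_depends_on_empid, date_depends_on_empid, emp_dept)
```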
Third Normal Form
• A table is in third normal form if it meets the following criteria: the relation is in second
normal form, and no nonkey field is functionally dependent on another nonkey field.

The following table is in second normal form but NOT in third normal form because RegistrationFee
depends on Member_ID (the primary key) only transitively: Member_ID determines Sport, and Sport
determines RegistrationFee.

Member_ID  Memb_FName  Memb_LName  Sport     RegistrationFee
1          Rodney      Jones       Swimming  $100
3          Francine    Moire       Tennis    $200
2          Anne        Abel        Tennis    $200
4          Goro        Azuma       Skiing    $150

Member_ID → Memb_FName, Memb_LName, Sport; Sport → RegistrationFee
Removing non-key Transitive Dependencies
• Remove transitive dependencies by placing the attributes involved in a new relational
table. Reduces the problems of:

update anomalies

delete anomalies

insert anomalies

redundancies

Original table:

MemberID  MembFName  MembLName  Sport     RegFee
1         Rodney     Jones      Swimming  $100
3         Francine   Moire      Tennis    $200
2         Anne       Abel       Tennis    $200
4         Goro       Azuma      Skiing    $150

Split into two tables:

MemberID  MembFName  MembLName  SportID
1         Rodney     Jones      1
3         Francine   Moire      2
2         Anne       Abel       2
4         Goro       Azuma      3

SportID  Sport     RegFee
1        Swimming  $100
2        Tennis    $200
3        Skiing    $150
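A sketch of the 3NF split using sqlite3. Table and column names are assumptions, Goro Azuma is linked to Skiing as in the unnormalized table, and the new Tennis fee of 250 is a made-up value. Because the fee is stored once per sport, a fee change is a single-row update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sport (sport_id INTEGER PRIMARY KEY, sport TEXT, reg_fee INTEGER)"
)
conn.execute(
    "CREATE TABLE member (member_id INTEGER PRIMARY KEY, fname TEXT, lname TEXT, "
    "sport_id INTEGER REFERENCES sport(sport_id))"
)
conn.executemany(
    "INSERT INTO sport VALUES (?, ?, ?)",
    [(1, "Swimming", 100), (2, "Tennis", 200), (3, "Skiing", 150)],
)
conn.executemany(
    "INSERT INTO member VALUES (?, ?, ?, ?)",
    [
        (1, "Rodney", "Jones", 1),
        (3, "Francine", "Moire", 2),
        (2, "Anne", "Abel", 2),
        (4, "Goro", "Azuma", 3),
    ],
)

# Changing the Tennis fee (hypothetical new value) touches exactly one row.
cursor = conn.execute("UPDATE sport SET reg_fee = 250 WHERE sport = 'Tennis'")
rows_changed = cursor.rowcount

# Every tennis member sees the new fee through the join.
fees = dict(
    conn.execute(
        "SELECT m.fname, s.reg_fee FROM member m "
        "JOIN sport s ON s.sport_id = m.sport_id"
    )
)
print(rows_changed, fees)
```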
Normalization Example: Video Store
A video rental shop tracks all of their information in one table. There are
now 20,000 records in it. Is it possible to achieve a more efficient design?
(They charge $10/movie/day.)

Cust_Name       Cust_address       Cust_Phone  Rental_date
Rodney Jones    23 Richmond St.    681-9854    15-Oct-99
Francine Moire  750-12 Kipps Lane  672-9999    4-Nov-99
Anne Abel       5 Sarnia Road      432-1120    3-Sep-99
Rodney Jones    23 Richmond St.    681-9854    22-Sep-99

(same rows, columns continued)

Video_1                Video_2               Video_3              VideoType_1  VideoType_2  VideoType3
Gone with the Wind     Braveheart            Mississippi Burning  Classic      Adventure    Adventure
Manhatten                                                         Comedy
Manhatten              The African Queen                          Comedy       Classic
Never Say Never Again  Silence of the Lambs                       Adventure    Horror

(same rows, columns continued)

Return_date  TotalPrice  Paid?
17-Oct-99    $ 60.00     yes

4-Sep-99     $ 20.00     yes
26-Sep-99    $ 80.00     yes

VIDEO (Cust_name, Cust_address, Cust_phone, Rental_date, Video_1,
Video_2, Video_3, VideoType_1, VideoType_2, VideoType3,
Return_date, Total_Price, Paid?)
Is the Video store in 1NF?
No attributes should form repeating groups - remove them by creating
another table. There are repeating groups for videos and customers.

CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)

Cust_Num  Cust_Name       Cust_address       Cust_Phone
1         Rodney Jones    23 Richmond St.    681-9854
2         Francine Moire  750-12 Kipps Lane  672-9999
3         Anne Abel       5 Sarnia Road      432-1120

VIDEO (VideoNum, VideoName, VideoType)

VideoNum  VideoName              VideoType
1         Gone with the Wind     Classic
2         Manhatten              Comedy
3         Never Say Never Again  Adventure
4         Braveheart             Adventure
5         Mississippi Burning    Adventure
6         The African Queen      Classic
7         Silence of the Lambs   Horror

RENTAL (Cust_num, VideoNum, Rental_date, Return_date, TotalPrice, Paid?)

Cust_Num  VideoNum  Rental_date  Return_date  TotalPrice  Paid?
1         1,4,5     15-Oct-99    17-Oct-99    $ 60.00     yes
2         2         4-Nov-99
3         2,6       3-Sep-99     4-Sep-99     $ 20.00     yes
1         3,7       22-Sep-99    26-Sep-99    $ 80.00     yes
Video Store: 1NF (cont’d)
Have not yet removed all repeating groups - video is a multi-valued
attribute - move to another table.

Before:

Cust_Num  VideoNum  Rental_date  Return_date  TotalPrice  Paid?
1         1,4,5     15-Oct-99    17-Oct-99    $ 60.00     yes
2         2         4-Nov-99
3         2,6       3-Sep-99     4-Sep-99     $ 20.00     yes
1         3,7       22-Sep-99    26-Sep-99    $ 80.00     yes

After:

RENTAL (RentalNum, Cust_Num, Rental_date, Return_Date, TotalPrice, Paid?)

RentalNum  Cust_Num  Rental_date  Return_date  TotalPrice  Paid?
1          1         15-Oct-99    17-Oct-99    $ 60.00     yes
2          2         4-Nov-99
3          3         3-Sep-99     4-Sep-99     $ 20.00     yes
4          1         22-Sep-99    26-Sep-99    $ 80.00     yes

RENTALDETAILS (RentalNum, VideoNum)

RentalNum  VideoNum
1          1
1          4
1          5
2          2
3          2
3          6
4          3
4          7
The Video Store is now in 1NF

CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)

Cust_Num  Cust_Name       Cust_address       Cust_Phone
1         Rodney Jones    23 Richmond St.    681-9854
2         Francine Moire  750-12 Kipps Lane  672-9999
3         Anne Abel       5 Sarnia Road      432-1120

VIDEO (VideoNum, VideoName, VideoType)

VideoNum  VideoName              VideoType
1         Gone with the Wind     Classic
2         Manhatten              Comedy
3         Never Say Never Again  Adventure
4         Braveheart             Adventure
5         Mississippi Burning    Adventure
6         The African Queen      Classic
7         Silence of the Lambs   Horror

RENTAL (RentalNum, Cust_Num, Rental_date, Return_Date, TotalPrice, Paid?)

RentalNum  Cust_Num  Rental_date  Return_date  TotalPrice  Paid?
1          1         15-Oct-99    17-Oct-99    $ 60.00     yes
2          2         4-Nov-99
3          3         3-Sep-99     4-Sep-99     $ 20.00     yes
4          1         22-Sep-99    26-Sep-99    $ 80.00     yes

RENTALDETAILS (RentalNum, VideoNum)

RentalNum  VideoNum
1          1
1          4
1          5
2          2
3          2
3          6
4          3
4          7
Is the Video Store in 2NF?
Yes: the only table that has a composite primary key (RENTALDETAILS) has no other fields,
so no partial dependency is possible.

CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)

Cust_Num  Cust_Name       Cust_address       Cust_Phone
1         Rodney Jones    23 Richmond St.    681-9854
2         Francine Moire  750-12 Kipps Lane  672-9999
3         Anne Abel       5 Sarnia Road      432-1120

VIDEO (VideoNum, VideoName, VideoType)

VideoNum  VideoName              VideoType
1         Gone with the Wind     Classic
2         Manhatten              Comedy
3         Never Say Never Again  Adventure
4         Braveheart             Adventure
5         Mississippi Burning    Adventure
6         The African Queen      Classic
7         Silence of the Lambs   Horror

RENTAL (RentalNum, Cust_Num, Rental_date, Return_Date, TotalPrice, Paid?)

RentalNum  Cust_Num  Rental_date  Return_date  TotalPrice  Paid?
1          1         15-Oct-99    17-Oct-99    $ 60.00     yes
2          2         4-Nov-99
3          3         3-Sep-99     4-Sep-99     $ 20.00     yes
4          1         22-Sep-99    26-Sep-99    $ 80.00     yes

RENTALDETAILS (RentalNum, VideoNum)

RentalNum  VideoNum
1          1
1          4
1          5
2          2
3          2
3          6
4          3
4          7
Is the Video Store in 3NF?
Does each attribute in each table depend upon the primary key?

Cust_Num  Cust_Name       Cust_address       Cust_Phone
1         Rodney Jones    23 Richmond St.    681-9854
2         Francine Moire  750-12 Kipps Lane  672-9999
3         Anne Abel       5 Sarnia Road      432-1120

VideoNum  VideoName              VideoType
1         Gone with the Wind     Classic
2         Manhatten              Comedy
3         Never Say Never Again  Adventure
4         Braveheart             Adventure
5         Mississippi Burning    Adventure
6         The African Queen      Classic
7         Silence of the Lambs   Horror

RentalNum  VideoNum
1          1
1          4
1          5
2          2
3          2
3          6
4          3
4          7

RentalNum  Cust_Num  Rental_date  Return_date  TotalPrice  Paid?
1          1         15-Oct-99    17-Oct-99    $ 60.00     yes
2          2         4-Nov-99
3          3         3-Sep-99     4-Sep-99     $ 20.00     yes
4          1         22-Sep-99    26-Sep-99    $ 80.00     yes
The Video Store is now in 3NF
Yes, because in each table, every attribute depends on the primary key and not
on any non-key attribute. Return dates and payments are now recorded per video
in RENTALDETAILS.

CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)

Cust_Num  Cust_Name       Cust_address       Cust_Phone
1         Rodney Jones    23 Richmond St.    681-9854
2         Francine Moire  750-12 Kipps Lane  672-9999
3         Anne Abel       5 Sarnia Road      432-1120

VIDEO (VideoNum, VideoName, VideoType)

VideoNum  VideoName              VideoType
1         Gone with the Wind     Classic
2         Manhatten              Comedy
3         Never Say Never Again  Adventure
4         Braveheart             Adventure
5         Mississippi Burning    Adventure
6         The African Queen      Classic
7         Silence of the Lambs   Horror

RENTAL (RentalNum, Cust_Num, Rental_date)

RentalNum  Cust_Num  Rental_date
1          1         15-Oct-99
2          2         4-Nov-99
3          3         3-Sep-99
4          1         22-Sep-99

RENTALDETAILS (RentalNum, VideoNum, ReturnDate, Amt_Paid)

RentalNum  VideoNum  ReturnDate  Amt_Paid
1          1         16-Oct-99   $10
1          4         17-Oct-99   $20
1          5         16-Oct-99   $10
2          2         5-Nov-99    $10
3          2         4-Sep-99    0
3          6         6-Sep-99    0
4          3         24-Sep-99   $5
4          7         16-Sep-99   0
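A sketch of the final design in sqlite3, showing that a per-rental total no longer needs to be stored because it can be derived with a query. The lower-case names are assumptions; the VIDEO table is left out for brevity, and dates are kept as text exactly as they appear on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    cust_num INTEGER PRIMARY KEY, cust_name TEXT,
    cust_address TEXT, cust_phone TEXT);
CREATE TABLE rental (
    rental_num INTEGER PRIMARY KEY,
    cust_num INTEGER REFERENCES customer(cust_num),
    rental_date TEXT);
CREATE TABLE rental_details (
    rental_num INTEGER REFERENCES rental(rental_num),
    video_num INTEGER,
    return_date TEXT, amt_paid INTEGER,
    PRIMARY KEY (rental_num, video_num));  -- composite primary key
""")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "Rodney Jones", "23 Richmond St.", "681-9854"),
        (2, "Francine Moire", "750-12 Kipps Lane", "672-9999"),
        (3, "Anne Abel", "5 Sarnia Road", "432-1120"),
    ],
)
conn.executemany(
    "INSERT INTO rental VALUES (?, ?, ?)",
    [(1, 1, "15-Oct-99"), (2, 2, "4-Nov-99"), (3, 3, "3-Sep-99"), (4, 1, "22-Sep-99")],
)
conn.executemany(
    "INSERT INTO rental_details VALUES (?, ?, ?, ?)",
    [
        (1, 1, "16-Oct-99", 10), (1, 4, "17-Oct-99", 20), (1, 5, "16-Oct-99", 10),
        (2, 2, "5-Nov-99", 10),
        (3, 2, "4-Sep-99", 0), (3, 6, "6-Sep-99", 0),
        (4, 3, "24-Sep-99", 5), (4, 7, "16-Sep-99", 0),
    ],
)

# The amount paid per rental is derived from the detail rows on demand.
paid_per_rental = conn.execute("""
    SELECT r.rental_num, c.cust_name, SUM(d.amt_paid)
    FROM rental r
    JOIN customer c ON c.cust_num = r.cust_num
    JOIN rental_details d ON d.rental_num = r.rental_num
    GROUP BY r.rental_num, c.cust_name
    ORDER BY r.rental_num
""").fetchall()
print(paid_per_rental)
```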
Conflicting Goals of Design
 Database design must reconcile the following
requirements:

Design elegance requires that the design must adhere to design
rules concerning nulls, derived attributes, redundancies, relationship
types, etc.

Information requirements are dictated by the end users

Operational (transaction) speed requirements are also dictated by
the end users
 Clearly, an elegant database design that fails to address
end user information requirements or one that forms the
basis for an implementation whose use progresses at a
snail's pace has little practical use.
