
7 Normalization


Chapter 15

Normalization for Relational Database

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Data Normalization
• Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data
• The process of decomposing relations with anomalies to produce smaller, well-structured relations


Well-Structured Relations
• A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies
• The goal is to avoid anomalies:
  - Insertion anomaly: adding new rows forces the user to create duplicate data
  - Deletion anomaly: deleting rows may cause a loss of data that would be needed for other future rows
  - Modification anomaly: changing data in a row forces changes to other rows because of duplication

General rule of thumb: a table should not pertain to more than one entity type



Example
EmpID  Name    Salary  Course#  CourseTitle   Date
100    Alaa    32000   459      SPSS          9/9/2016
100    Alaa    32000   876      Surveys       7/8/2016
140    Atheer  40000   333      Visual Basic  1/1/2016
150    Aisha   23000   459      SPSS          9/9/2016
150    Aisha   23000   901      C++           12/8/2016
140    Atheer  40000   901      C++           12/8/2016

Is this a relation? Yes: unique rows and no multivalued attributes

What's the primary key? Composite: EmpID, Course#


Anomalies in this Table

Insertion: we can't enter a new employee without having the employee take a class. In addition, adding a course for an existing employee duplicates the employee's data and the course data
Deletion: if we remove employee 140, we lose information about the existence of a Visual Basic class
Modification: giving a salary increase to employee 100 forces us to update multiple records

Why do these anomalies exist?

Because there are two themes (entity types) in this one relation. This results in data duplication and an unnecessary dependency between the entities
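
The deletion and modification anomalies can be checked directly against the rows of the example table. The following is a minimal Python sketch; the function names are hypothetical and only the row values come from the table.

```python
# Rows of the example table: (EmpID, Name, Salary, Course#, CourseTitle, Date)
ROWS = [
    (100, "Alaa", 32000, 459, "SPSS", "9/9/2016"),
    (100, "Alaa", 32000, 876, "Surveys", "7/8/2016"),
    (140, "Atheer", 40000, 333, "Visual Basic", "1/1/2016"),
    (150, "Aisha", 23000, 459, "SPSS", "9/9/2016"),
    (150, "Aisha", 23000, 901, "C++", "12/8/2016"),
    (140, "Atheer", 40000, 901, "C++", "12/8/2016"),
]


def rows_to_update_for_raise(rows, emp_id):
    """Modification anomaly: count the rows that repeat one employee's salary."""
    return sum(1 for row in rows if row[0] == emp_id)


def courses_lost_if_deleted(rows, emp_id):
    """Deletion anomaly: course titles that vanish when an employee's rows go."""
    taken = {row[4] for row in rows if row[0] == emp_id}
    remaining = {row[4] for row in rows if row[0] != emp_id}
    return sorted(taken - remaining)


print(rows_to_update_for_raise(ROWS, 100))  # salary of employee 100 is stored twice
print(courses_lost_if_deleted(ROWS, 140))   # courses known only through employee 140
```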
Functional Dependency
• A functional dependency, denoted by X → Y, between two sets of attributes X and Y means that the value of Y is determined by the value of X
• The value of X in a tuple uniquely (or functionally) determines the value of Y
• Y is functionally dependent on X
Functional Dependency (2)
• Ssn → Ename
  - The value of an employee's social security number uniquely determines the employee name
• Pnumber → {Pname, Plocation}
  - The value of a project's number (Pnumber) uniquely determines the project name and location
• {Ssn, Pnumber} → Hours
  - A combination of Ssn and Pnumber values uniquely determines the number of hours the employee currently works on a specific project
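
One way to make the definition concrete is to test whether a given set of rows violates a dependency. The sketch below is illustrative only: the helper `fd_holds` and the sample rows are assumptions, not part of the slides. Note that sample data can only refute a functional dependency; whether it truly holds is a statement about the meaning of the attributes.

```python
def fd_holds(rows, lhs, rhs):
    """Return False if two rows agree on lhs but differ on rhs, else True."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


# Made-up sample rows for illustration
SAMPLE = [
    {"Ssn": "111", "Ename": "Ali", "Pnumber": 1, "Pname": "ProductX",
     "Plocation": "Cairo", "Hours": 10},
    {"Ssn": "111", "Ename": "Ali", "Pnumber": 2, "Pname": "ProductY",
     "Plocation": "Giza", "Hours": 5},
    {"Ssn": "222", "Ename": "Mona", "Pnumber": 1, "Pname": "ProductX",
     "Plocation": "Cairo", "Hours": 20},
]

print(fd_holds(SAMPLE, ["Ssn"], ["Ename"]))             # not violated by the sample
print(fd_holds(SAMPLE, ["Ssn", "Pnumber"], ["Hours"]))  # not violated by the sample
print(fd_holds(SAMPLE, ["Ssn"], ["Hours"]))             # violated: Ssn 111 has two Hours values
```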



Normal Forms Based on Primary Key
• Most practical relational design projects take one of the following two approaches:
  - Perform a conceptual schema design using a conceptual model such as ER or EER and map the conceptual design into a set of relations
  - Design the relations based on external knowledge derived from an existing implementation of files, forms, or reports



Normalization of Relations
• The normalization process takes a relation schema through a series of tests to certify whether it satisfies a certain normal form
• Three normal forms are covered here: first, second, and third normal form (stricter forms such as BCNF also exist)
• It is a purifying process that improves the quality of the design and minimizes redundancy
Normalization Steps



First Normal Form
• For a relation to be in 1st normal form, repeating groups and multivalued attributes must be removed
• To change to 1NF:
  - Move nested relation attributes into a new relation
  - Propagate the primary key into it



Removing multivalued attributes

New relation to be in 1st normal form


Dnumber Dlocations



Removing Relation within Relation or Repeating Groups
Student: St_code, St_name, Address, Age, courses
courses (nested relation): C#, c_name, hours



Example Repeating Groups
St_code  St_name  Address   Age  C#    C_name     Hours
20111    Karim    AlRawda   18   DS34  Data Str.  3
                                 OR23  Ope. Rese  3
                                 DB12  Database   3
20112    Mona     Alfayhaa  17   N22   Network    2
                                 OR23  Ope. Rese  3
                                 DB12  Database   3
                                 DS34  Data Str.  3





To Be in First Normal Form
To put the previous data in 1st normal form, split the repeating group into another table that carries the primary key of the original relation as a foreign key
St_code  St_name  Address  Age

St_code  C#  C_name  Hours
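
The split can be sketched in Python using the student data from the repeating-groups example. This is a minimal illustration; the function name `to_1nf` and the dictionary layout are assumptions made for the sketch.

```python
# Nested form: each student carries a repeating group of courses
STUDENTS = [
    {"St_code": 20111, "St_name": "Karim", "Address": "AlRawda", "Age": 18,
     "courses": [("DS34", "Data Str.", 3), ("OR23", "Ope. Rese", 3),
                 ("DB12", "Database", 3)]},
    {"St_code": 20112, "St_name": "Mona", "Address": "Alfayhaa", "Age": 17,
     "courses": [("N22", "Network", 2), ("OR23", "Ope. Rese", 3),
                 ("DB12", "Database", 3), ("DS34", "Data Str.", 3)]},
]


def to_1nf(students):
    """Split the repeating group into its own relation, keyed by St_code + C#."""
    student_rows, course_rows = [], []
    for s in students:
        student_rows.append((s["St_code"], s["St_name"], s["Address"], s["Age"]))
        for c_no, c_name, hours in s["courses"]:
            # Propagate the primary key of the parent relation as a foreign key
            course_rows.append((s["St_code"], c_no, c_name, hours))
    return student_rows, course_rows


student_rows, course_rows = to_1nf(STUDENTS)
for row in course_rows:
    print(row)
```

Every row of both resulting relations now holds only atomic values, which is what 1NF requires.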



Second Normal Form
• Based on the concept of full functional dependency (versus partial dependency)
• To reach 2NF, remove partial dependencies by decomposing the relation into a number of 2NF relations
• In each resulting relation, nonprime attributes are associated only with the part of the primary key on which they are fully functionally dependent



Second Normal Form (2)
SSN  P#  Hours  Ename  Pname  Plocation

• Ename is functionally dependent on SSN only; P# is not needed
• Pname and Plocation are functionally dependent on P# only
• So this relation is not in second normal form
• Solution:

SSN  P#  Hours

SSN  Ename

P#  Pname  Plocation
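
The decomposition above can be reproduced mechanically from the key and the list of dependencies. The sketch below is a simplified illustration, not a general algorithm: the function names are hypothetical, and it assumes every listed dependency has either the whole key or a proper subset of the key on its left side.

```python
KEY = ("SSN", "P#")
FDS = [
    (("SSN", "P#"), ("Hours",)),
    (("SSN",), ("Ename",)),
    (("P#",), ("Pname", "Plocation")),
]


def partial_dependencies(key, fds):
    """Dependencies whose left side is a proper subset of the primary key."""
    key_set = frozenset(key)
    return [(lhs, rhs) for lhs, rhs in fds if frozenset(lhs) < key_set]


def decompose_2nf(key, fds):
    """Keep fully dependent attributes with the key; move the rest out."""
    key_set = frozenset(key)
    full_rhs = tuple(a for lhs, rhs in fds
                     if frozenset(lhs) == key_set for a in rhs)
    relations = [tuple(key) + full_rhs]
    relations += [tuple(lhs) + tuple(rhs)
                  for lhs, rhs in partial_dependencies(key, fds)]
    return relations


for relation in decompose_2nf(KEY, FDS):
    print(relation)
```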



Third Normal Form
• To be in 3rd normal form, remove transitive dependencies
• A transitive dependency exists when a nonkey attribute is dependent on another nonkey attribute



Third Normal Form (2)
SSN  Ename  Address  Dept_code  Dept_name  Mgr_SSN

• Dept_name and Mgr_SSN are functionally dependent on Dept_code, which is a nonkey attribute
• Solution:

SSN  Ename  Address  Dept_code

Dept_code  Dept_name  Mgr_SSN
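
A short Python sketch shows what the split buys: department facts are stored once, and joining on Dept_code recovers the original rows. The sample rows and function names are made up for illustration.

```python
# (SSN, Ename, Address, Dept_code, Dept_name, Mgr_SSN); values are made up
EMPLOYEES = [
    ("101", "Sara", "Jeddah", "D1", "Research", "900"),
    ("102", "Omar", "Riyadh", "D1", "Research", "900"),
    ("103", "Huda", "Jeddah", "D2", "Sales", "901"),
]


def split_3nf(rows):
    """Move the attributes that depend on Dept_code into their own relation."""
    emp = [(ssn, name, addr, dept) for ssn, name, addr, dept, _, _ in rows]
    dept = sorted({(d, dname, mgr) for _, _, _, d, dname, mgr in rows})
    return emp, dept


def rejoin(emp, dept):
    """Join the two relations back on Dept_code."""
    by_code = {d: (dname, mgr) for d, dname, mgr in dept}
    return [(s, n, a, d) + by_code[d] for s, n, a, d in emp]


emp, dept = split_3nf(EMPLOYEES)
print(dept)                            # each department appears once
print(rejoin(emp, dept) == EMPLOYEES)  # the join reproduces the original rows
```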



Exercises
Consider the following report. Suppose the sales order number together with the item ordered is the primary key. Normalize this report to reach a better design.

Sales Order

Fiction Company
202 N. Main
Manhattan, KS 66502

Customer Number:   1001                  Sales Order Number: 405
Customer Name:     ABC Company           Sales Order Date:   2/1/2000
Customer Address:  100 Points            Clerk Number:       210
                   Manhattan, KS 66502   Clerk Name:         Martin Lawrence

Item Ordered  Description    Quantity  Unit Price  Total
800           widgit small   40        60.00       2,400.00
801           tingimajigger  20        20.00       400.00
805           thingibob      10        100.00      1,000.00

Order Total 3,800.00



Relation
R = salesOrderNo, salesOrderDate, custNo, custName, address, clerkNo, clerkName, {itemsOrdered, description, quantity, unitPrice}

To be in 1st normal form, remove the repeating group:
R1 = salesOrderNo, salesOrderDate, custNo, custName, address, clerkNo, clerkName
R2 = salesOrderNo, itemsOrdered, description, quantity, unitPrice
2nd Normal Form
R1 = salesOrderNo, salesOrderDate, custNo, custName, address, clerkNo, clerkName
R2.1 = salesOrderNo, itemsOrdered, quantity
R2.2 = itemsOrdered, description, unitPrice



3rd Normal Form
R1.1 = salesOrderNo, salesOrderDate, custNo, clerkNo
R1.2 = custNo, custName, address
R1.3 = clerkNo, clerkName
R2.1 = salesOrderNo, itemsOrdered, quantity
R2.2 = itemsOrdered, description, unitPrice
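
The Total and Order Total columns of the report do not appear in any of these relations, presumably because they can be derived from quantity and unitPrice. The following Python sketch recomputes them from R2.1 and R2.2 using the report's values; the variable and function names are assumptions made for the sketch.

```python
# R2.1: (salesOrderNo, itemsOrdered, quantity)
ORDER_LINES = [(405, 800, 40), (405, 801, 20), (405, 805, 10)]

# R2.2: itemsOrdered -> (description, unitPrice)
ITEMS = {
    800: ("widgit small", 60.00),
    801: ("tingimajigger", 20.00),
    805: ("thingibob", 100.00),
}


def line_totals(order_no):
    """Join R2.1 with R2.2 and derive the per-line Total column."""
    return [(item, qty * ITEMS[item][1])
            for order, item, qty in ORDER_LINES if order == order_no]


def order_total(order_no):
    """Derive the Order Total shown at the bottom of the report."""
    return sum(total for _, total in line_totals(order_no))


print(line_totals(405))
print(order_total(405))
```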



Exercises
• Normalize the following schemas into 3rd normal form:
  - BRANCH (Branch#, Branch_Addr, {ISBN, Title, Author, Publisher, Num_copies})
1st Normal Form:
R1: Branch#, Branch_addr
R2: Branch#, ISBN, title, author, publisher, num_copies



2nd Normal Form:
R1: Branch#, Branch_addr
R2.1: Branch#, ISBN, num_copies
R2.2: ISBN, title, author, publisher

3rd Normal Form:
No change in the previous relations.



Exercises
• Project code, project title, project manager, project budget, {employeeNo, employeeName, completed_hour, departmentNo, department_name, rate_per_hour}
• Note: the rate per hour for each employee is fixed regardless of the project. Completed hour means the number of hours the employee has accomplished in this project
1st Normal Form:
R1: Pcode, ptitle, pmgr, pbudget
R2: Pcode, empNo, empName, c_hours, dNo, dName, rate

2nd Normal Form:
R1: Pcode, ptitle, pmgr, pbudget
R2.1: Pcode, empNo, c_hours
R2.2: empNo, empName, dNo, dName, rate

3rd Normal Form:
R1: Pcode, ptitle, pmgr, pbudget
R2.1: Pcode, empNo, c_hours
R2.2.1: empNo, empName, dNo, rate
R2.2.2: dNo, dName
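
The last step, splitting R2.2, can be sketched in Python. The sketch uses the simplified rule from the Third Normal Form slide (a dependency between nonkey attributes is transitive). The function names are hypothetical, and the listed dependencies are the ones this solution appears to assume.

```python
RELATION = ("empNo", "empName", "dNo", "dName", "rate")  # R2.2
KEY = ("empNo",)
FDS = [
    (("empNo",), ("empName", "dNo", "rate")),
    (("dNo",), ("dName",)),
]


def transitive_dependencies(key, fds):
    """Dependencies whose left side contains no key attribute (nonkey -> nonkey)."""
    key_set = set(key)
    return [(lhs, rhs) for lhs, rhs in fds if not (set(lhs) & key_set)]


def decompose_3nf(relation, key, fds):
    """Move transitively dependent attributes into their own relations."""
    transitive = transitive_dependencies(key, fds)
    moved = {a for _, rhs in transitive for a in rhs}
    base = tuple(a for a in relation if a not in moved)
    return [base] + [tuple(lhs) + tuple(rhs) for lhs, rhs in transitive]


for relation in decompose_3nf(RELATION, KEY, FDS):
    print(relation)
```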



Exercises
Al Salam Hospital - Doctor's report                 Date: 10/5/2004
Doctor Id.: A121       Doctor Name: Dr. Ahmed
Department Id: A       Department Name: Internal Diseases

                       Given Treatments
P#  Pat-name  Address  Item#  Description  Quantity  Unit Price
10  Saleh     Maadi    A01    Aspirin      10        1.5
                       A03    Panadol      6         3.5
                       B01    Vitamin C    12        4.0



Relation
R = docId, docname, deptId, deptName, {patient (p#, pname, address) {given treatment (item#, description, quantity, unit_price)}}
First Normal Form
• R1 = docId, docname, deptId, deptName
• R2 = docId, p#, pname, address
• R3 = docId, p#, item#, description, quantity, unit_price
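
Because the report nests treatments inside patients, the key is propagated twice: docId goes into the patient rows, and docId plus p# go into the treatment rows. A minimal Python sketch with the report's values follows; the dictionary layout and the name `flatten_report` are assumptions.

```python
REPORT = {
    "docId": "A121", "docname": "Dr. Ahmed",
    "deptId": "A", "deptName": "Internal Diseases",
    "patients": [
        {"p#": 10, "pname": "Saleh", "address": "Maadi",
         "treatments": [("A01", "Aspirin", 10, 1.5),
                        ("A03", "Panadol", 6, 3.5),
                        ("B01", "Vitamin C", 12, 4.0)]},
    ],
}


def flatten_report(report):
    """Produce rows for R1, R2, and R3 from the two-level nested report."""
    doc_id = report["docId"]
    r1 = (doc_id, report["docname"], report["deptId"], report["deptName"])
    r2, r3 = [], []
    for patient in report["patients"]:
        # First level: docId is carried into each patient row
        r2.append((doc_id, patient["p#"], patient["pname"], patient["address"]))
        for item, description, quantity, unit_price in patient["treatments"]:
            # Second level: docId and p# are carried into each treatment row
            r3.append((doc_id, patient["p#"], item, description,
                       quantity, unit_price))
    return r1, r2, r3


r1, r2, r3 = flatten_report(REPORT)
for row in r3:
    print(row)
```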



Second Normal Form
• R1 = docId, docname, deptId, deptName
• R2.1 = docId, p#
• R2.2 = p#, pname, address
• R3.1 = docId, p#, item#, quantity
• R3.2 = item#, description, unit_price



Third Normal Form
• R1.1 = docId, docname, deptId
• R1.2 = deptId, deptName
• R2.1 = docId, p#
• R2.2 = p#, pname, address
• R3.1 = docId, p#, item#, quantity
• R3.2 = item#, description, unit_price
