chp5 Normalization-1

Uploaded by

umarhanif696
Copyright
© © All Rights Reserved
NORMALIZATION

- Objectives:
▪ The purpose of normalization.
▪ The problems associated with redundant data.
▪ The identification of various types of update anomalies such as insertion,
deletion, and modification anomalies.
▪ How to recognize the appropriateness or quality of the design of relations.
▪ The concept of functional dependency, the main tool for measuring the
appropriateness of attribute groupings in relations.
▪ How functional dependencies can be used to group attributes into relations
that are in a known normal form.
▪ How to define normal forms for relations. How to undertake the process of
normalization.
▪ How to identify the most commonly used normal forms, namely first (1NF),
second (2NF), and third (3NF) normal forms, and Boyce–Codd normal form
(BCNF).
NORMALIZATION

▪ A technique for producing a set of relations with desirable properties, given the
data requirements of an enterprise. Developed by E.F. Codd (1972).
▪ Often performed as a series of tests on a relation to determine whether it
satisfies or violates the requirements of a given normal form.
▪ Four most commonly used normal forms are first (1NF), second (2NF), third (3NF)
and Boyce-Codd (BCNF) normal forms.
▪ Based on functional dependencies among the attributes of a relation.
▪ A relation can be normalized to a specific form to prevent the possible
occurrence of update anomalies.
▪ Data normalization is the process of structuring data into a form that:
- is easily accessible and maintainable
- eliminates "anomalies" which would cause problems during use
The Normalization
▪ Normalization is the process of organizing data in a database.
▪ This includes creating tables and establishing relationships
between those tables according to rules designed both to
protect the data and to make the database more flexible by
eliminating two factors:
- redundancy and
- inconsistent dependency.
Relational Database
• The relational database overcomes the limitations of the
file-oriented approach by creating multiple tables from the
single table. This avoids:
• Update anomalies
• Insert anomalies
• Delete anomalies


Database Normalization
• How do we decide how many tables and what fields go in
each table?
• Database normalization: a process to ensure that no significant
data anomalies are caused by the database design.
• "Normal form" means that a table conforms to certain rules.
• Three normal forms (1NF, 2NF and 3NF) provide the basic rules
for database structure.
Database Normalization continued
• A table that is not in third normal form has one or more design
flaws that can result in anomalies in the data
• Three kinds of data anomalies:
– Insert anomalies
– Delete anomalies
– Update anomalies
Normalization Benefits
• Facilitates data integration.
• Reduces data redundancy.
• Provides a robust architecture for retrieving and maintaining
data.
• Complements data modeling.
• Reduces the chances of data anomalies occurring.
Deletion Anomaly
• Occurs when the removal of a record results in a loss of
important information about an entity.
• Example:
• If all the information about a customer is contained in an order
file and the order is canceled, all the customer information
could be lost when the order record is deleted.
• Solution:
• Create two tables--one table contains order information
and the other table contains customer information.
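The deletion anomaly and its fix can be sketched in a few lines of Python. This is only an illustration: the order numbers, customer numbers, names and items are invented, not taken from the slides.

```python
# Hypothetical data: one flat "order file" that also carries customer details.
orders_flat = [
    {"order_no": 1, "cust_no": "C1", "cust_name": "Ann", "item": "Desk"},
    {"order_no": 2, "cust_no": "C2", "cust_name": "Bob", "item": "Lamp"},
]

# Cancelling order 2 deletes the only record that mentions customer C2.
after_cancel = [row for row in orders_flat if row["order_no"] != 2]
known_customers_flat = {row["cust_no"] for row in after_cancel}

# Fix: keep customers and orders in separate tables.
customers = {"C1": "Ann", "C2": "Bob"}
orders = [
    {"order_no": 1, "cust_no": "C1", "item": "Desk"},
    {"order_no": 2, "cust_no": "C2", "item": "Lamp"},
]
orders_after_cancel = [row for row in orders if row["order_no"] != 2]
known_customers_split = set(customers)  # unaffected by the cancelled order
```

With the flat file, customer C2 disappears together with the order. With two tables, the customer record survives.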
Update Anomaly
• Occurs when a change of a single attribute in one record
requires changes in multiple records
• Example:
• A staff person changes their telephone number and
every potential customer that person ever worked
with has to have the corrected number inserted.
• Solution:
• Put the employee's telephone number in one location,
as an attribute in the employee table.
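A small Python sketch of the update anomaly, assuming invented customer names, a staff member called "Lee" and made-up phone numbers:

```python
# Hypothetical data: the staff phone number is repeated on every customer row.
customers_flat = [
    {"customer": "Acme", "staff": "Lee", "staff_phone": "555-0100"},
    {"customer": "Bolt", "staff": "Lee", "staff_phone": "555-0100"},
]

# Updating only one row leaves the data inconsistent.
customers_flat[0]["staff_phone"] = "555-0199"
phones_seen = {r["staff_phone"] for r in customers_flat if r["staff"] == "Lee"}

# Fix: store the phone once, in an employee table, and reference the employee.
employees = {"Lee": {"phone": "555-0100"}}
customers = [
    {"customer": "Acme", "staff": "Lee"},
    {"customer": "Bolt", "staff": "Lee"},
]
employees["Lee"]["phone"] = "555-0199"  # one change, seen by every customer row
phones_after_fix = {employees[r["staff"]]["phone"] for r in customers}
```

In the flat layout the same person ends up with two different numbers. In the split layout a single update keeps every row consistent.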
Insertion Anomaly
• Occurs when there does not appear to be any reasonable
place to assign attribute values to records in the database.
This usually means a critical entity has been overlooked.
• Example:
• New attributes or entire records have to be added when they
are not needed. Where do you place information on
new Evaluators? Do you create a dummy Lead?
• Solution:
• Create a new table, with its own primary key, that contains
the relevant, functionally dependent attributes.
Functional Dependency
• It describes the relationship between attributes in a relation.
• For example, if A and B are attributes of relation R, B is functionally
dependent on A (denoted A → B), if each value of A in R is associated
with exactly one value of B in R.
• The determinant of a functional dependency refers to the attribute or group of
attributes on the left-hand side of the arrow.
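The definition above can be tested against sample rows. The sketch below uses a hypothetical helper `holds` and invented staff data. Note that such a check can only refute a dependency for a given set of rows; whether a dependency holds in general follows from the meaning of the attributes, not from one sample.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # A second, different value for the same determinant breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows.
staff = [
    {"staff_no": "S1", "position": "Manager", "branch": "B1"},
    {"staff_no": "S2", "position": "Assistant", "branch": "B1"},
    {"staff_no": "S3", "position": "Assistant", "branch": "B2"},
]

staff_no_determines_position = holds(staff, ("staff_no",), ("position",))
position_determines_staff_no = holds(staff, ("position",), ("staff_no",))
```

Here `staff_no` is a determinant of `position`, but not the other way round, because "Assistant" is associated with two staff numbers.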
Diagrammatic representation.
THE PROCESS OF NORMALIZATION
First Normal Form (1NF)
• A relation in which the intersection of each row and column contains one
and only one value

UNF to 1NF
• Nominate an attribute or group of attributes to act as the key for the
unnormalized table.
• Identify the repeating group(s) in the unnormalized table which repeats
for the key attribute(s).

Remove the repeating group by


• Entering appropriate data into the empty columns of rows containing the
repeating data (‘flattening’ the table).
Or by
• Placing the repeating data along with a copy of the original key
attribute(s) into a separate relation.
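Both ways of removing a repeating group can be sketched in Python. The client and rental rows are invented sample data, and `flatten` and `split` are hypothetical helper names.

```python
# Unnormalized rows: each client carries a repeating group of rentals.
unf = [
    {"client_no": "CR76", "name": "John", "rentals": [
        {"prop_no": "PG4", "rent": 350},
        {"prop_no": "PG16", "rent": 450},
    ]},
    {"client_no": "CR56", "name": "Aline", "rentals": [
        {"prop_no": "PG36", "rent": 375},
    ]},
]


def flatten(rows, group_attr):
    """Method 1: repeat the non-repeating data on every row of the group."""
    flat = []
    for row in rows:
        base = {k: v for k, v in row.items() if k != group_attr}
        for item in row[group_attr]:
            flat.append({**base, **item})
    return flat


def split(rows, key_attrs, group_attr):
    """Method 2: move the repeating group, plus a copy of the key,
    into a separate relation."""
    parent, child = [], []
    for row in rows:
        parent.append({k: v for k, v in row.items() if k != group_attr})
        for item in row[group_attr]:
            child.append({**{k: row[k] for k in key_attrs}, **item})
    return parent, child


flat = flatten(unf, "rentals")
parent, child = split(unf, ("client_no",), "rentals")
```

Flattening yields one wide 1NF relation with redundant client data. Splitting yields two relations linked by the copied key.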
Example 1 - Normalization UNF to 1NF
UNF to 1NF (Alternative)
SECOND NORMAL FORM (2NF)
It is based on the concept of full functional dependency.
• Full functional dependency
• If A and B are attributes of a relation, B is fully functionally
dependent on A if B is functionally dependent on A but not on any
proper subset of A, i.e. no partial dependencies exist.
• This applies only to relations with composite primary keys.
• 2NF Definition:
• A relation that is in 1NF and in which every non-primary-key attribute
is fully functionally dependent on the primary key.
• To Normalize a relation from 1NF to 2NF:

• Identify the primary key for the 1NF relation.


• Identify the functional dependencies in the relation.

• If partial dependencies exist on the primary key, remove them by placing them in
a new relation along with a copy of their determinant.
Not 2NF Example:

Because:
• Partial Dependency: (attributes depend on part of composite key).
• Customer No → Cname
• Prop No → Paddress, OwnerNo, OwnerName.
• Update Anomalies in 2NF:
• There is much less redundancy now, but anomalies still exist. For example, to update
the name of an owner such as Tony Show, we still have to update 2 rows in the
property relation (as he owns 2 properties). 3NF will remove such anomalies.
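Partial dependencies like the two listed above can be found mechanically once the primary key and the functional dependencies are written down. In this sketch the composite key (CustomerNo, PropNo) and the third dependency (RentStart, RentFinish on the full key) are assumptions added for illustration, and `partial_dependencies` is a hypothetical helper.

```python
def partial_dependencies(primary_key, fds):
    """Return the FDs whose determinant is a proper subset of the primary key
    and whose dependent attributes include at least one non-key attribute."""
    key = frozenset(primary_key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        if lhs < key and rhs - key:
            found.append((sorted(lhs), sorted(rhs - key)))
    return found


# Assumed composite primary key for the example relation.
key = ("CustomerNo", "PropNo")
fds = [
    (("CustomerNo",), ("CName",)),
    (("PropNo",), ("PAddress", "OwnerNo", "OwnerName")),
    # Assumed full dependency, for contrast with the partial ones.
    (("CustomerNo", "PropNo"), ("RentStart", "RentFinish")),
]

partials = partial_dependencies(key, fds)
```

Each partial dependency found this way becomes a new relation containing its determinant and the attributes it determines.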

THIRD NORMAL FORM (3NF)

• It is based on the concept of transitive dependency.
• Transitive dependency is a condition where A, B and C are attributes of a
relation such that if A → B and B → C, then C is transitively dependent on
A through B (provided that A is not functionally dependent on B or C).

• 3NF Definition:
• A relation that is in 1NF and 2NF and in which no non-primary-key attribute
is transitively dependent on the primary key.
• To normalize a relation from 2NF to 3NF
- Identify the primary key in the 2NF relation.
- Identify functional dependencies in the relation.

• If transitive dependencies exist on the primary key, remove them by placing them
in a new relation along with a copy of their determinant. Consider the previous example:

Property_Owner

PropNo  PAddress                Rent  OwnerNo  OwnerName
PG4     6 Lawarance St Glasgow  350   CO40     Tinan Murply
PG36    5 Novar Glasgow         450   CO93     Tony Show
PG16    2 Manor Rd Glasgow      375   CO93     Tony Show

- PropNo → OwnerNo
- OwnerNo → OwnerName
- Transitive dependency of OwnerName on PropNo through OwnerNo.

• To convert to 3NF, remove this attribute (OwnerName) and place it in a new relation
with the attribute it depends on (its determinant).
- PROP_FOR_RENT (PropNo, PAddress, Rent, OwnerNo)
- OWNER (OwnerNo, OwnerName)
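The decomposition above can be sketched as two projections in Python, followed by a join that checks nothing was lost. The rows are copied from the Property_Owner table as printed; `project` is a hypothetical helper.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (order preserved)."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


property_owner = [
    {"PropNo": "PG4", "PAddress": "6 Lawarance St Glasgow", "Rent": 350,
     "OwnerNo": "CO40", "OwnerName": "Tinan Murply"},
    {"PropNo": "PG36", "PAddress": "5 Novar Glasgow", "Rent": 450,
     "OwnerNo": "CO93", "OwnerName": "Tony Show"},
    {"PropNo": "PG16", "PAddress": "2 Manor Rd Glasgow", "Rent": 375,
     "OwnerNo": "CO93", "OwnerName": "Tony Show"},
]

prop_for_rent = project(property_owner, ("PropNo", "PAddress", "Rent", "OwnerNo"))
owner = project(property_owner, ("OwnerNo", "OwnerName"))

# Joining on OwnerNo rebuilds the original rows, so the split is lossless.
owner_name = {o["OwnerNo"]: o["OwnerName"] for o in owner}
rejoined = [{**p, "OwnerName": owner_name[p["OwnerNo"]]} for p in prop_for_rent]
```

After the split each owner name is stored once, so renaming an owner touches a single row in OWNER.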
Process of Decomposition
FDs for Customer_Rental Relation
PROF-COURSE Table
PROF-COURSE Unnormalized

PROF-NO  NAME        OFFICE-NO  PHONE-NO  COURSE-NO  DESCRIPTION      CREDIT-HOURS  DAY-PREF  TIME-PREF
1001     Dr. Benton  401A       5-5515    ACCT427    AIS              3             MW        8 am
                                          ACCT648    Advanced AIS     3             MW        2 pm
                                          COMP101    Software tools   1             MWF       11 am
1002     Dr. Carter  401B       5-8761    ACCT209    Principles       3             TR        3 pm
1003     Dr. Ross    402G       2-6785    ACCT327    Intermediate I   3             MWF       10 am
                                          ACCT328    Intermediate II  3             TR        10 am
1004     Dr. Green   401Z       5-9111    ACCT427    AIS              3             MW        10 am
                                          ACCT209    Principles       3             TR        9 am
First Normal Form
• A relation is said to be in first normal
form (1NF) when there is exactly one value
in each cell, at the intersection of a row and
a column.
• Therefore there are no repeating groups.
PROF-COURSE Table (Unnormalized): Repeating Groups
• In the unnormalized PROF-COURSE table, COURSE-NO, DESCRIPTION,
CREDIT-HOURS, DAY-PREF and TIME-PREF repeat for each PROF-NO.
They form a repeating group.
• Make a new table with these fields.
PROF-COURSE Relational Database
First Normal Form (1NF)
PROF

PROF-NO NAME OFFICE-NO PHONE-NO

1001 Dr. Benton 401A 5-5515

1002 Dr. Carter 401B 5-8761

1003 Dr. Ross 402G 2-6785

1004 Dr. Green 401Z 5-9111

COURSE-PREF(1NF)

PROF-NO COURSE-NO DESCRIPTION CREDIT-HOURS DAY-PREF TIME-PREF

1001 ACCT427 AIS 3 MW 8 am

1001 ACCT648 Advanced AIS 3 MW 2 pm

1002 ACCT209 Principles 3 TR 3 pm

1003 ACCT327 Intermediate I 3 MWF 10 am

1003 ACCT328 Intermediate II 3 TR 10 am

1004 ACCT427 AIS 3 MW 10 am

1004 ACCT209 Principles 3 TR 9 am

1001 COMP101 Software tools 1 MWF 11 am


Database Normalization
Primary Keys
• One key in each table should be the primary
key. The primary key is the unique
identifier of each record.
• The entity integrity rule says that each table
must have a unique primary key and that no part
of the primary key can be null (missing or unknown).
Database Normalization
Functional Dependency

• The other fields in the table should be functionally
dependent upon the primary key.
• In a table with fields X and Y, if there is only one
possible value of Y for every value of X, then Y is
said to be functionally dependent on X.
• In simple terms, field X (the primary key)
determines the value in field Y.
PROF-COURSE Relational Database
Primary Keys
PROF (primary key: PROF-NO)

PROF-NO NAME OFFICE-NO PHONE-NO

1001 Dr. Benton 401A 5-5515

1002 Dr. Carter 401B 5-8761

1003 Dr. Ross 402G 2-6785

1004 Dr. Green 401Z 5-9111

COURSE-PREF(1NF) (composite primary key: PROF-NO, COURSE-NO)

PROF-NO COURSE-NO DESCRIPTION CREDIT-HOURS DAY-PREF TIME-PREF

1001 ACCT427 AIS 3 MW 8 am

1001 ACCT648 Advanced AIS 3 MW 2 pm

1002 ACCT209 Principles 3 TR 3 pm

1003 ACCT327 Intermediate I 3 MWF 10 am

1003 ACCT328 Intermediate II 3 TR 10 am

1004 ACCT427 AIS 3 MW 10 am

1004 ACCT209 Principles 3 TR 9 am

1001 COMP101 Software tools 1 MWF 11 am


PROF-COURSE Relational Database
Primary Keys
• Primary key to primary key links (here PROF-NO) allow tables to
be joined so query information can be extracted from the database.

PROF

PROF-NO NAME OFFICE-NO PHONE-NO

1001 Dr. Benton 401A 5-5515

1002 Dr. Carter 401B 5-8761

1003 Dr. Ross 402G 2-6785

1004 Dr. Green 401Z 5-9111

COURSE-PREF(1NF)

PROF-NO COURSE-NO DESCRIPTION CREDIT-HOURS DAY-PREF TIME-PREF

1001 ACCT427 AIS 3 MW 8 am

1001 ACCT648 Advanced AIS 3 MW 2 pm

1002 ACCT209 Principles 3 TR 3 pm

1003 ACCT327 Intermediate I 3 MWF 10 am

1003 ACCT328 Intermediate II 3 TR 10 am

1004 ACCT427 AIS 3 MW 10 am

1004 ACCT209 Principles 3 TR 9 am

1001 COMP101 Software tools 1 MWF 11 am


PROF-COURSE Relational Database
Anomalies that can Occur in First Normal Form
• Can't insert a course without an instructor assigned.
• Delete a professor and course information may be lost.

PROF

PROF-NO NAME OFFICE-NO PHONE-NO

1001 Dr. Benton 401A 5-5515

1002 Dr. Carter 401B 5-8761

1003 Dr. Ross 402G 2-6785

1004 Dr. Green 401Z 5-9111

COURSE-PREF(1NF)

PROF-NO COURSE-NO DESCRIPTION CREDIT-HOURS DAY-PREF TIME-PREF

1001 ACCT427 AIS 3 MW 8 am

1001 ACCT648 Advanced AIS 3 MW 2 pm

1002 ACCT209 Principles 3 TR 3 pm

1003 ACCT327 Intermediate I 3 MWF 10 am

1003 ACCT328 Intermediate II 3 TR 10 am

1004 ACCT427 AIS 3 MW 10 am

1004 ACCT209 Principles 3 TR 9 am

1001 COMP101 Software tools 1 MWF 11 am


Second Normal Form
• A relation is said to be in second
normal form (2NF) when:
– It is in 1NF, and
– Every non-key field is functionally
dependent on the entire primary key
PROF-COURSE Relational Database
Creating the Second Normal Form
• The PROF table is already 2NF (its primary key is a single field).

PROF

PROF-NO NAME OFFICE-NO PHONE-NO

1001 Dr. Benton 401A 5-5515

1002 Dr. Carter 401B 5-8761

1003 Dr. Ross 402G 2-6785

1004 Dr. Green 401Z 5-9111

COURSE-PREF(1NF)

PROF-NO COURSE-NO DESCRIPTION CREDIT-HOURS DAY-PREF TIME-PREF

1001 ACCT427 AIS 3 MW 8 am

1001 ACCT648 Advanced AIS 3 MW 2 pm

1002 ACCT209 Principles 3 TR 3 pm

1003 ACCT327 Intermediate I 3 MWF 10 am

1003 ACCT328 Intermediate II 3 TR 10 am

1004 ACCT427 AIS 3 MW 10 am

1004 ACCT209 Principles 3 TR 9 am

1001 COMP101 Software tools 1 MWF 11 am


PROF-COURSE Relational Database
Creating the Second Normal Form
• Some COURSE-PREF fields (DESCRIPTION and CREDIT-HOURS) are functionally
dependent on only part of the primary key (COURSE-NO).
• Make a new table with these fields.

PROF (2NF)

PROF-NO NAME OFFICE-NO PHONE-NO

1001 Dr. Benton 401A 5-5515

1002 Dr. Carter 401B 5-8761

1003 Dr. Ross 402G 2-6785

1004 Dr. Green 401Z 5-9111

COURSE-PREF(1NF)

PROF-NO COURSE-NO DESCRIPTION CREDIT-HOURS DAY-PREF TIME-PREF

1001 ACCT427 AIS 3 MW 8 am

1001 ACCT648 Advanced AIS 3 MW 2 pm

1002 ACCT209 Principles 3 TR 3 pm

1003 ACCT327 Intermediate I 3 MWF 10 am

1003 ACCT328 Intermediate II 3 TR 10 am

1004 ACCT427 AIS 3 MW 10 am

1004 ACCT209 Principles 3 TR 9 am

1001 COMP101 Software tools 1 MWF 11 am


PROF-COURSE Relational Database
Second Normal Form (2NF) - Creating the COURSES Table
COURSES

COURSE-NO DESCRIPTION CREDIT-HOURS

ACCT427 AIS 3

ACCT648 Advanced AIS 3

ACCT209 Principles 3

ACCT327 Intermediate I 3

ACCT328 Intermediate II 3

COMP101 Software tools 1

COURSE-PREF

PROF-NO COURSE-NO DAY-PREF TIME-PREF

1001 ACCT427 MW 8 am

1001 ACCT648 MW 2 pm

1001 COMP101 MWF 11 am

1002 ACCT209 TR 3 pm

1003 ACCT327 MWF 10 am

1003 ACCT328 TR 10 am

1004 ACCT427 MW 10 am

1004 ACCT209 TR 9 am
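The split into COURSES and COURSE-PREF can be reproduced with a short Python sketch. The rows come from the COURSE-PREF (1NF) table, with COMP101 taking the MWF day preference shown in the 2NF table. `determines` is a hypothetical helper that checks a dependency against these rows only.

```python
# Columns: PROF-NO, COURSE-NO, DESCRIPTION, CREDIT-HOURS, DAY-PREF, TIME-PREF
course_pref_1nf = [
    ("1001", "ACCT427", "AIS", 3, "MW", "8 am"),
    ("1001", "ACCT648", "Advanced AIS", 3, "MW", "2 pm"),
    ("1001", "COMP101", "Software tools", 1, "MWF", "11 am"),
    ("1002", "ACCT209", "Principles", 3, "TR", "3 pm"),
    ("1003", "ACCT327", "Intermediate I", 3, "MWF", "10 am"),
    ("1003", "ACCT328", "Intermediate II", 3, "TR", "10 am"),
    ("1004", "ACCT427", "AIS", 3, "MW", "10 am"),
    ("1004", "ACCT209", "Principles", 3, "TR", "9 am"),
]


def determines(rows, lhs, rhs):
    """True if the columns at indexes lhs determine those at rhs in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        value = tuple(row[i] for i in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# DESCRIPTION and CREDIT-HOURS depend on COURSE-NO alone (partial dependency).
course_fields_partial = determines(course_pref_1nf, (1,), (2, 3))
# TIME-PREF needs the whole key (PROF-NO, COURSE-NO).
time_needs_full_key = not determines(course_pref_1nf, (1,), (5,))

# Project out the two 2NF tables; a set removes the duplicate course rows.
courses = sorted({(r[1], r[2], r[3]) for r in course_pref_1nf})
course_pref = [(r[0], r[1], r[4], r[5]) for r in course_pref_1nf]
```

The eight 1NF rows reduce to six distinct courses, while COURSE-PREF keeps all eight preference rows.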
Third Normal Form
• A relation is said to be in third normal
form (3NF) when:
– It is in 2NF, and
– Every non-key field is functionally
dependent on only the primary key (and not on
any other field in the table)
PROF-COURSE Relational Database
Creating Third Normal Form

PROF (2NF) Transitive Dependency

PROF-NO NAME OFFICE-NO PHONE-NO

1001 Dr. Benton 401A 5-5515

1002 Dr. Carter 401B 5-8761

1003 Dr. Ross 402G 2-6785

1004 Dr. Green 401Z 5-9111


PROF-COURSE Relational Database
Anomalies of this 2NF Table
• Offices cannot be added without a professor being assigned to them.
• Phone numbers are given to offices, not professors, so PHONE-NO
depends on OFFICE-NO.
PROF-COURSE Relational Database
Creating Third Normal Form
• In PROF (2NF), PHONE-NO depends on OFFICE-NO rather than directly
on PROF-NO (a transitive dependency).
• Make a new table with this field (PHONE-NO), keyed by OFFICE-NO.


PROF-COURSE Relational Database
Creating Third Normal Form

PROF-INFO (3NF) (foreign key: OFFICE-NO*)

PROF-NO NAME OFFICE-NO*

1001 Dr. Benton 401A

1002 Dr. Carter 401B

1003 Dr. Ross 402G

1004 Dr. Green 401Z

OFFICES (3NF) (primary key: OFFICE-NO)

OFFICE-NO PHONE-NO

401A 5-5515

401B 5-8761

402G 2-6785

401Z 5-9111
Database Normalization
Foreign Keys
• A non-key field in one table that is the
same as the primary key in another table is
called a foreign key.
• Just like primary key to primary key links,
foreign key to primary key links allow
tables to be joined so query information
can be extracted from the database.
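A minimal sketch of such a link, using Python's built-in sqlite3 module and the PROF-INFO and OFFICES data. The SQL table and column names use underscores instead of hyphens, and the rejected professor '1005' in office '999X' is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE offices (
    office_no TEXT PRIMARY KEY NOT NULL,
    phone_no  TEXT
);
CREATE TABLE prof_info (
    prof_no   TEXT PRIMARY KEY NOT NULL,
    name      TEXT,
    office_no TEXT REFERENCES offices(office_no)  -- foreign key
);
INSERT INTO offices VALUES
    ('401A', '5-5515'), ('401B', '5-8761'),
    ('402G', '2-6785'), ('401Z', '5-9111');
INSERT INTO prof_info VALUES
    ('1001', 'Dr. Benton', '401A'), ('1002', 'Dr. Carter', '401B'),
    ('1003', 'Dr. Ross', '402G'), ('1004', 'Dr. Green', '401Z');
""")

# The foreign key to primary key link lets the two tables be joined.
rows = conn.execute("""
    SELECT p.name, o.phone_no
    FROM prof_info AS p
    JOIN offices AS o ON p.office_no = o.office_no
    ORDER BY p.prof_no
""").fetchall()

# A professor cannot reference an office that does not exist.
try:
    conn.execute("INSERT INTO prof_info VALUES ('1005', 'Dr. New', '999X')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The join recovers each professor's phone number even though PHONE-NO is no longer stored in the professor table.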
PROF-COURSE Relational Database
These tables are already Third Normal Form
COURSES (3NF)

COURSE-NO DESCRIPTION CREDIT-HOURS

ACCT427 AIS 3

ACCT648 Advanced AIS 3

ACCT209 Principles 3

ACCT327 Intermediate I 3

ACCT328 Intermediate II 3

COMP101 Software tools 1

COURSE-PREF (3NF)

PROF-NO COURSE-NO DAY-PREF TIME-PREF

1001 ACCT427 MW 8 am

1001 ACCT648 MW 2 pm

1001 COMP101 MWF 11 am

1002 ACCT209 TR 3 pm

1003 ACCT327 MWF 10 am

1003 ACCT328 TR 10 am

1004 ACCT427 MW 10 am

1004 ACCT209 TR 9 am
PROF-COURSE Relational Database
Third Normal Form – 4 tables from 1
• The single unnormalized PROF-COURSE table is replaced by four 3NF tables.

PROF-INFO (3NF)

PROF-NO NAME OFFICE-NO*

1001 Dr. Benton 401A

1002 Dr. Carter 401B

1003 Dr. Ross 402G

1004 Dr. Green 401Z

OFFICES (3NF)

OFFICE-NO PHONE-NO

401A 5-5515

401B 5-8761

402G 2-6785

401Z 5-9111

COURSES (3NF)

COURSE-NO DESCRIPTION CREDIT-HOURS

ACCT427 AIS 3

ACCT648 Advanced AIS 3

ACCT209 Principles 3

ACCT327 Intermediate I 3

ACCT328 Intermediate II 3

COMP101 Software tools 1

COURSE-PREF (3NF)

PROF-NO COURSE-NO DAY-PREF TIME-PREF

1001 ACCT427 MW 8 am

1001 ACCT648 MW 2 pm

1001 COMP101 MWF 11 am

1002 ACCT209 TR 3 pm

1003 ACCT327 MWF 10 am

1003 ACCT328 TR 10 am

1004 ACCT427 MW 10 am

1004 ACCT209 TR 9 am
Summary of the Rules for Normal Forms
• 1NF if there are no repeating groups in the
relation
• 2NF if it is in 1NF and every non-key field is
functionally dependent on the entire primary
key
• 3NF if it is in 2NF and all functional dependencies
in the relation originate from the primary key (no
transitive dependencies)
PROF-COURSE Relational Database
Schema of the Relation

• A relation schema is a compact description of the
relational table. Underline primary keys; star (*)
foreign keys.
• The schema for this four-table database:
PROF-INFO (PROF-NO, NAME, OFFICE-NO*)
OFFICES (OFFICE-NO, PHONE-NO)
COURSES (COURSE-NO, DESCRIPTION, CREDIT-HOURS)
COURSE-PREF (PROF-NO, COURSE-NO, DAY-PREF, TIME-PREF)
• Boyce-Codd Normal Form (BCNF)
A relation is in BCNF if and only if every determinant is a candidate key.

• Fourth Normal Form (4NF)
A table is in 4NF if it is in BCNF and has no nontrivial multivalued dependencies. (A multivalued
dependency exists when there are at least three attributes in a relation, A, B and C, and for each
value of A there is a well-defined set of values for B and a well-defined set of values for C, but the set
of values for B is independent of the set of values for C.)

• 3NF to BCNF
- Identify all candidate keys in the relation.
- Identify all functional dependencies in the relation.

• If functional dependencies exist in the relation whose determinants are not candidate keys
for the relation, remove them by placing them in a new relation along with a
copy of their determinant.
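The test in the last step can be sketched with attribute closures. The sketch flags every nontrivial dependency whose determinant is not a superkey, which for minimal determinants is the same as not being a candidate key. The relation (student, course, tutor) and its two dependencies are hypothetical, and `closure` and `bcnf_violations` are illustrative helper names.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return nontrivial FDs whose determinant does not determine every
    attribute of the relation, i.e. is not a superkey."""
    all_attrs = set(all_attrs)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)          # skip trivial FDs
        and closure(lhs, fds) != all_attrs   # determinant is not a superkey
    ]


# Hypothetical relation: each tutor teaches one course, and a student
# has one tutor per course.
attrs = ("student", "course", "tutor")
fds = [
    (("student", "course"), ("tutor",)),
    (("tutor",), ("course",)),
]

violations = bcnf_violations(attrs, fds)
```

Here tutor → course is reported because tutor alone does not determine student. That dependency would move to its own relation (tutor, course).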
filmNo  fTitle      dirNo  director    actorNo  aName            role            timeOnScreen
F1100   Happy Days  D101   Jim Alan    A1020    Sheila Toner     Jean Simson     15.45
                    D101   Jim Alan    A1222    Peter Watt       Tom Kinder      25.38
                    D101   Jim Alan    A1020    Sheila Toner     Silvia Simpson  22.56
F1109   Snake Bite  D076   Sue Ramsay  A1567    Steven McDonald  Tim Rosey       19.56
                    D076   Sue Ramsay  A1222    Peter Watt       Archie Bold     10.44

filmNo  fTitle      dirNo  director    actorNo  aName            role            timeOnScreen
F1100   Happy Days  D101   Jim Alan    A1020    Sheila Toner     Jean Simson     15.45
F1100   Happy Days  D101   Jim Alan    A1222    Peter Watt       Tom Kinder      25.38
F1100   Happy Days  D101   Jim Alan    A1020    Sheila Toner     Silvia Simpson  22.56
F1109   Snake Bite  D076   Sue Ramsay  A1567    Steven McDonald  Tim Rosey       19.56
F1109   Snake Bite  D076   Sue Ramsay  A1222    Peter Watt       Archie Bold     10.44
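Dependency diagram for the film relation
(filmNo, fTitle, dirNo, director, actorNo, aName, role, timeOnScreen),
with functional dependencies labelled fd1 and fd2 in the original figure.

Relations after decomposition:
(filmNo, actorNo, role, timeOnScreen)
(filmNo, fTitle, dirNo, director)
(actorNo, aName)
(dirNo, director)
(filmNo, fTitle, dirNo)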

staffNo  branchNo  branchAddress                          name           position   hoursPerWeek
S4555    B002      City Center Plaza, Seattle, WA 98122   Ellen Layman   Assistant  16
S4555    B004      16 – 14th Avenue, Seattle, WA 98128    Ellen Layman   Assistant  9
S4612    B002      City Center Plaza, Seattle, WA 98122   Dave Sinclair  Assistant  14
S4612    B004      16 – 14th Avenue, Seattle, WA 98128    Dave Sinclair  Assistant  10

appNo  date/time       instructorID  iFName  iLName  clientID  cFName  cLName  cAddress
1001   25/07/00.10.00  I456          Jane    Watt    C034      Anne    Way     111 Storrie Road, Paisley
1102   29/07/00.10.00  I456          Jane    Watt    C034      Anne    Way     111 Storrie Road, Paisley
1203   30/07/00.11.00  I344          Tom     Jones   C034      Anne    Way     111 Storrie Road, Paisley
1334   2/08/00.13.00   I666          Karen   Black   C089      Mark    Fields  120 Lady Lane, Paisley
1455   2/08/00.13.00   I957          Steven  Smith   C019      John    Brown   13 Renfrew Road, Paisley
1676   25/08/00.10.00  I344          Tom     Jones   C039      Karen   Worth   34 High Street, Paisley
