
Normalization of Database Tables

The document discusses normalization of database tables. Normalization is a process that reduces data redundancy and improves data integrity. It involves organizing data into tables and defining relationships between those tables. Normal forms such as 1NF, 2NF, 3NF, and BCNF are applied to tables in sequence to remove anomalies and make the database more efficient and flexible. A table in a higher normal form also satisfies every lower normal form, and each step removes more types of dependencies between attributes.

Uploaded by

AsasAsas

Normalization of Database Tables

CHAPTER 4

Chapter Objectives

Understand concepts of normalization


Learn how to normalize tables
Understand normalization and database design issues

Database Tables and Normalization

Normalization is a process for assigning attributes to entities.
It reduces data redundancies.
An un-normalized relation (table) stores redundant data, which can cause insertion, deletion, and modification anomalies.
In simple words: normalization means keeping a single copy of each piece of data in your database.
Normalization theory provides a step-by-step method to remove redundant data and undesirable table structures.

Normal Forms

Tables are normalized by applying rules to create a series of normal forms:

First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
Boyce/Codd normal form (BCNF)
Fourth normal form (4NF)
Projection Join normal form (PJNF, aka 5NF)

A table or relation in a higher-level normal form always conforms to the lower-level normal forms.

Normal Forms
[Figure: 1NF, 2NF, 3NF, BCNF, 4NF, and PJ/NF (5NF) relations]

While higher-level normal forms are available, normalization up to BCNF is often found to be adequate for business data.

First Normal Form

A relation is in 1NF if all underlying domains contain atomic values only, i.e., the intersection of each row and column contains one and only one value.
The relation must not contain repeating groups.

PNo  PName  ENo  EName      JCode  ChgHr  Hrs
1    Alpha  101  John Doe   NE     $65    20
            105  Jane Vo    SA     $80    15
            110  Bob Lund   CP     $60    40
2    Beta   101  John Doe   NE     $65    20
            108  Jeb Lee    NE     $65    15
            106  Sara Lee   SA     $80    20
3    Omega  102  Beth Reed  PM     $125   20
            105  Jane Vo    SA     $80    10

Is the above relation in 1NF?

First Normal Form

The previous relation can be converted into first normal form by adding PNo and PName to each row.

PNo  PName  ENo  EName      JCode  ChgHr  Hrs
1    Alpha  101  John Doe   NE     $65    20
1    Alpha  105  Jane Vo    SA     $80    15
1    Alpha  110  Bob Lund   CP     $60    40
2    Beta   101  John Doe   NE     $65    20
2    Beta   108  Jeb Lee    NE     $65    15
2    Beta   106  Sara Lee   SA     $80    20
3    Omega  102  Beth Reed  PM     $125   20
3    Omega  105  Jane Vo    SA     $80    10

What is the primary key in this relation?
Do you see redundant data in this table?
What anomalies could be caused?
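The conversion to 1NF can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_1nf` and `is_unique` are hypothetical, blank cells of the unnormalized table are represented as `None`, and ChgHr is stored as a plain dollar amount. The uniqueness check is one way to explore the primary-key question on this sample data.

```python
# Rows as printed in the unnormalized table: PNo and PName are filled in
# only on the first row of each project group (None marks a blank cell).
# Columns: PNo, PName, ENo, EName, JCode, ChgHr (dollars), Hrs
unnormalized = [
    (1, "Alpha", 101, "John Doe", "NE", 65, 20),
    (None, None, 105, "Jane Vo", "SA", 80, 15),
    (None, None, 110, "Bob Lund", "CP", 60, 40),
    (2, "Beta", 101, "John Doe", "NE", 65, 20),
    (None, None, 108, "Jeb Lee", "NE", 65, 15),
    (None, None, 106, "Sara Lee", "SA", 80, 20),
    (3, "Omega", 102, "Beth Reed", "PM", 125, 20),
    (None, None, 105, "Jane Vo", "SA", 80, 10),
]


def to_1nf(rows):
    """Copy the most recent PNo/PName into rows where they are blank."""
    filled = []
    pno = pname = None
    for row in rows:
        if row[0] is not None:
            pno, pname = row[0], row[1]
        filled.append((pno, pname) + row[2:])
    return filled


def is_unique(rows, columns):
    """True if the given column positions identify every row uniquely."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(keys) == len(set(keys))


project_1nf = to_1nf(unnormalized)
print(is_unique(project_1nf, (0,)))    # PNo alone: False
print(is_unique(project_1nf, (0, 2)))  # PNo together with ENo: True
```

A uniqueness check on sample rows can only suggest a key. Whether a combination of attributes really is a key depends on the business rules, not on the rows that happen to be stored.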

Functional Dependency Revisited

If A and B are attributes (or groups of attributes) of a relation R, B is functionally dependent on A (denoted A → B) if each value of A in R is associated with exactly one value of B in R.
A is called a determinant.
Consider the relation

Student (ID, Name, Soc_Sec_Nbr, Major, Dept)

Assume a department offers several majors, e.g., the INSY department offers the INSY, MASI, and POMA majors.
How many determinants can you identify in Student?
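The definition of functional dependency can be tested mechanically against a set of rows. The sketch below is a minimal illustration: the helper name `fd_holds` is hypothetical, and the Student rows are invented sample data that do not come from the slides (only the INSY, MASI, and POMA majors in the INSY department follow the text).

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        a = tuple(row[name] for name in lhs)
        b = tuple(row[name] for name in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(a, b) != b:
            return False
    return True


# Hypothetical sample rows; the names and numbers are made up.
students = [
    {"ID": 1, "Name": "Ann", "Soc_Sec_Nbr": "111", "Major": "INSY", "Dept": "INSY"},
    {"ID": 2, "Name": "Raj", "Soc_Sec_Nbr": "222", "Major": "MASI", "Dept": "INSY"},
    {"ID": 3, "Name": "Ann", "Soc_Sec_Nbr": "333", "Major": "POMA", "Dept": "INSY"},
]

print(fd_holds(students, ["ID"], ["Name"]))     # True
print(fd_holds(students, ["Name"], ["ID"]))     # False: two students named Ann
print(fd_holds(students, ["Major"], ["Dept"]))  # True
print(fd_holds(students, ["Dept"], ["Major"]))  # False: INSY offers several majors
```

A check like this can refute a dependency, but it cannot prove one. A dependency is a statement about all legal states of the relation, so it has to come from the business rules.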

Functional Dependency Revisited

A dependency diagram

[Figure: dependency diagram for Student over ID, Name, Soc_Sec_Nbr, Major, Dept]

Functional Dependency Revisited

Full functional dependency

Attribute B is fully functionally dependent on attribute A if it is functionally dependent on A and not functionally dependent on any proper subset of A.
This becomes an issue only with composite keys.

Transitive dependency

If A, B, and C are attributes of a relation such that A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
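Both definitions can be checked with an attribute-closure computation. The sketch below is an illustration, not part of the original slides: the dependency list `FDS` is an assumption read off the Project example used elsewhere in this chapter, and the function names `closure` and `fully_dependent` are hypothetical.

```python
from itertools import combinations

# Assumed dependencies for the Project example (determinant, dependents).
FDS = [
    ({"PNo"}, {"PName"}),
    ({"ENo"}, {"EName", "JCode"}),
    ({"JCode"}, {"ChgHr"}),
    ({"PNo", "ENo"}, {"Hrs"}),
]


def closure(attrs, fds):
    """All attributes determined by attrs under fds (fixed-point iteration)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def fully_dependent(attr, determinant, fds):
    """attr depends on determinant but on no proper subset of it."""
    determinant = set(determinant)
    if attr not in closure(determinant, fds):
        return False
    for size in range(1, len(determinant)):
        for subset in combinations(sorted(determinant), size):
            if attr in closure(subset, fds):
                return False
    return True


key = {"PNo", "ENo"}
print(fully_dependent("Hrs", key, FDS))    # True
print(fully_dependent("PName", key, FDS))  # False: PNo alone determines PName
print("ChgHr" in closure({"ENo"}, FDS))    # True: reached via JCode (transitive)
```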

Second Normal Form

Dependency diagram for Project

[Figure: dependency diagram over PNo, PName, ENo, EName, JCode, ChgHr, Hrs]

Second Normal Form

A relation is in 2NF if:

It is in 1NF and
every nonkey attribute is fully dependent on the primary key, i.e., there is no partial dependency.

A nonkey attribute is one that is not a primary key or part of a primary key.

We create new relations that are in 2NF through projection of the original relation.

Project(PNo, PName)
Employee(ENo, EName, JCode, ChgHr)
Charge(PNo, ENo, Hrs)
2NF

[Figure: dependency diagrams for the 2NF relations Project (PNo, PName), Charge (PNo, ENo, Hrs), and Employee (ENo, EName, JCode, ChgHr)]

Second Normal Form

Tables in 2NF

Project
PNo  PName
1    Alpha
2    Beta
3    Omega

Employee
ENo  EName      JCode  ChgHr
101  John Doe   NE     $65
102  Beth Reed  PM     $125
105  Jane Vo    SA     $80
106  Sara Lee   SA     $80
108  Jeb Lee    NE     $65
110  Bob Lund   CP     $60

Charge
PNo  ENo  Hrs
1    101  20
1    105  15
1    110  40
2    101  20
2    108  15
2    106  20
3    102  20
3    105  10

Second Normal Form

Note that the original relation can be recreated through a natural join of the new relations.
Thus, no information is lost in the process of creating 2NF relations from a 1NF relation. This is called nonloss decomposition.
If a relation that is in 1NF has a noncomposite primary key (i.e., the primary key consists of a single attribute), what can you say about its status with regard to 2NF?
Do you see any redundant data in the tables that are in 2NF?
What anomalies could be caused by such redundancy?
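The nonloss property can be demonstrated on the sample data. The sketch below projects the 1NF relation into Project, Employee, and Charge and then joins them back. The helper names `project` and `natural_join` are hypothetical, the variable `project_t` stands for the Project relation, and ChgHr is stored as a plain dollar amount.

```python
HEADER = ("PNo", "PName", "ENo", "EName", "JCode", "ChgHr", "Hrs")
ROWS = [
    (1, "Alpha", 101, "John Doe", "NE", 65, 20),
    (1, "Alpha", 105, "Jane Vo", "SA", 80, 15),
    (1, "Alpha", 110, "Bob Lund", "CP", 60, 40),
    (2, "Beta", 101, "John Doe", "NE", 65, 20),
    (2, "Beta", 108, "Jeb Lee", "NE", 65, 15),
    (2, "Beta", 106, "Sara Lee", "SA", 80, 20),
    (3, "Omega", 102, "Beth Reed", "PM", 125, 20),
    (3, "Omega", 105, "Jane Vo", "SA", 80, 10),
]
original = [dict(zip(HEADER, row)) for row in ROWS]


def project(rows, attrs):
    """Relational projection: keep attrs only and drop duplicate rows."""
    out = []
    for row in rows:
        slim = {a: row[a] for a in attrs}
        if slim not in out:
            out.append(slim)
    return out


def natural_join(left, right):
    """Combine rows that agree on every attribute the two relations share."""
    out = []
    for a_row in left:
        for b_row in right:
            shared = a_row.keys() & b_row.keys()
            if all(a_row[name] == b_row[name] for name in shared):
                out.append({**a_row, **b_row})
    return out


project_t = project(original, ("PNo", "PName"))
employee = project(original, ("ENo", "EName", "JCode", "ChgHr"))
charge = project(original, ("PNo", "ENo", "Hrs"))

rejoined = natural_join(natural_join(charge, project_t), employee)
as_tuples = {tuple(row[a] for a in HEADER) for row in rejoined}

print(len(project_t), len(employee), len(charge))  # 3 6 8
print(as_tuples == set(ROWS))                      # True: nothing lost or added
```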

Third Normal Form

A relation is in 3NF if:

It is in 2NF and
every nonkey attribute is nontransitively dependent on the primary key (i.e., there is no transitive dependency).

Relation Employee has a transitive dependency:

ENo → JCode → ChgHr

Employee can be replaced by two relations that are in 3NF:

Employee(ENo, EName, JCode)
Job(JCode, ChgHr)
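As a small illustration of what this split buys, the sketch below derives the two 3NF relations from the 2NF Employee rows and counts how many rows store the hourly rate for job code NE before and after. The variable names are hypothetical, and ChgHr is stored as a plain dollar amount.

```python
# Employee in 2NF: (ENo, EName, JCode, ChgHr in dollars)
employee_2nf = [
    (101, "John Doe", "NE", 65),
    (102, "Beth Reed", "PM", 125),
    (105, "Jane Vo", "SA", 80),
    (106, "Sara Lee", "SA", 80),
    (108, "Jeb Lee", "NE", 65),
    (110, "Bob Lund", "CP", 60),
]

# Project away ChgHr to get Employee, and project onto (JCode, ChgHr) to get Job.
employee = sorted({(eno, name, jcode) for eno, name, jcode, _ in employee_2nf})
job = sorted({(jcode, rate) for _, _, jcode, rate in employee_2nf})

print(job)  # [('CP', 60), ('NE', 65), ('PM', 125), ('SA', 80)]

# Rows that would have to change if the NE rate changed.
ne_rows_before = sum(1 for row in employee_2nf if row[2] == "NE")
ne_rows_after = sum(1 for row in job if row[0] == "NE")
print(ne_rows_before, ne_rows_after)  # 2 1
```

With the rate stored once per job code, a rate change is a single-row update, which removes the modification anomaly for ChgHr.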

3NF

[Figure: dependency diagrams for the 3NF relations Project (PNo, PName), Job (JCode, ChgHr), Charge (PNo, ENo, Hrs), and Employee (ENo, EName, JCode)]

Third Normal Form

Tables in 3NF

Project
PNo  PName
1    Alpha
2    Beta
3    Omega

Employee
ENo  EName      JCode
101  John Doe   NE
102  Beth Reed  PM
105  Jane Vo    SA
106  Sara Lee   SA
108  Jeb Lee    NE
110  Bob Lund   CP

Charge
PNo  ENo  Hrs
1    101  20
1    105  15
1    110  40
2    101  20
2    108  15
2    106  20
3    102  20
3    105  10

Job
JCode  ChgHr
CP     $60
NE     $65
PM     $125
SA     $80
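The four 3NF tables map directly onto a relational schema. The sketch below uses Python's standard `sqlite3` module with an in-memory database. The column types, the constraints, and the "total charge per project" query are assumptions added for illustration and are not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Project  (PNo INTEGER PRIMARY KEY, PName TEXT NOT NULL);
CREATE TABLE Job      (JCode TEXT PRIMARY KEY, ChgHr INTEGER NOT NULL);
CREATE TABLE Employee (ENo INTEGER PRIMARY KEY, EName TEXT NOT NULL,
                       JCode TEXT NOT NULL REFERENCES Job(JCode));
CREATE TABLE Charge   (PNo INTEGER NOT NULL REFERENCES Project(PNo),
                       ENo INTEGER NOT NULL REFERENCES Employee(ENo),
                       Hrs INTEGER NOT NULL,
                       PRIMARY KEY (PNo, ENo));
""")

conn.executemany("INSERT INTO Project VALUES (?, ?)",
                 [(1, "Alpha"), (2, "Beta"), (3, "Omega")])
conn.executemany("INSERT INTO Job VALUES (?, ?)",
                 [("CP", 60), ("NE", 65), ("PM", 125), ("SA", 80)])
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)",
                 [(101, "John Doe", "NE"), (102, "Beth Reed", "PM"),
                  (105, "Jane Vo", "SA"), (106, "Sara Lee", "SA"),
                  (108, "Jeb Lee", "NE"), (110, "Bob Lund", "CP")])
conn.executemany("INSERT INTO Charge VALUES (?, ?, ?)",
                 [(1, 101, 20), (1, 105, 15), (1, 110, 40), (2, 101, 20),
                  (2, 108, 15), (2, 106, 20), (3, 102, 20), (3, 105, 10)])

# Joining the tables back together: hours times hourly rate, summed per project.
totals = conn.execute("""
    SELECT p.PName, SUM(c.Hrs * j.ChgHr)
    FROM Charge c
    JOIN Project p ON p.PNo = c.PNo
    JOIN Employee e ON e.ENo = c.ENo
    JOIN Job j ON j.JCode = e.JCode
    GROUP BY p.PNo, p.PName
    ORDER BY p.PNo
""").fetchall()
print(totals)  # [('Alpha', 4900), ('Beta', 3875), ('Omega', 3300)]
```

The query needs three joins to answer a question that the single 1NF table could answer directly, which is the processing cost that normalization trades for the removal of redundancy.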

Boyce-Codd Normal Form

A relation is in BCNF if every determinant is a candidate key.

A determinant is an attribute (or combination of attributes) on which some other attribute is fully functionally dependent.

BCNF is a special case of 3NF.

The potential to violate BCNF may occur in a relation that:

contains two (or more) composite candidate keys, and
in which these keys overlap and share at least one attribute.

Thus, if a table contains only one candidate key or only noncomposite keys, then 3NF and BCNF are equivalent.

Figure 4.7: 3NF Table Not in BCNF

Figure 4.8: Decomposition of Table Structure to Meet BCNF

Boyce-Codd Normal Form

Consider the following example:

The members of a recruiting team interview candidates on a one-to-one basis. Each member is assigned a particular room on a given date. Each candidate is interviewed only once on a specific date. He/she may return for follow-up interviews on later dates.

Interview (CID, IDate, ITime, StaffID, RmNo)

CID  IDate    ITime  StaffID  RmNo
C01  8-22-99  10:00  S01      B107
C02  8-22-99  11:00  S01      B107
C03  8-22-99  10:00  S05      B108
C01  8-29-99  3:00   S06      B108

Boyce-Codd Normal Form

This relation has the following functional dependencies:

(CID, IDate) → ITime, StaffID, RmNo
(StaffID, IDate, ITime) → CID, RmNo
(RmNo, IDate, ITime) → StaffID, CID
(StaffID, IDate) → RmNo

This relation does not have any partial or transitive dependencies on the primary key (CID, IDate).
It is not in BCNF because (StaffID, IDate) is a determinant but not a candidate key.
The new relations in BCNF are:

Interview (CID, IDate, ITime, StaffID)
Room (StaffID, IDate, RmNo)
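The BCNF test for Interview can be sketched with an attribute-closure computation over the four dependencies listed for this relation. The function names `closure` and `bcnf_violations` are hypothetical. The check flags any determinant whose closure does not cover the whole relation, that is, any determinant that is not a superkey, which is enough to expose the violation in this example.

```python
ATTRS = {"CID", "IDate", "ITime", "StaffID", "RmNo"}

# The four functional dependencies of Interview: (determinant, dependents).
FDS = [
    ({"CID", "IDate"}, {"ITime", "StaffID", "RmNo"}),
    ({"StaffID", "IDate", "ITime"}, {"CID", "RmNo"}),
    ({"RmNo", "IDate", "ITime"}, {"StaffID", "CID"}),
    ({"StaffID", "IDate"}, {"RmNo"}),
]


def closure(attrs, fds):
    """All attributes determined by attrs under fds (fixed-point iteration)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Determinants whose closure does not cover the whole relation."""
    return [lhs for lhs, _ in fds if closure(lhs, fds) != attrs]


violations = bcnf_violations(ATTRS, FDS)
print(violations)  # only the determinant (StaffID, IDate) is reported

# After the decomposition, Room has a single dependency and no violation.
room_attrs = {"StaffID", "IDate", "RmNo"}
room_fds = [({"StaffID", "IDate"}, {"RmNo"})]
print(bcnf_violations(room_attrs, room_fds))  # []
```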

Dependency Diagram

[Figures: Fig 1, Fig 2, and Fig 3, dependency diagrams over CID, IDate, ITime, StaffID, RmNo]

Fourth Normal Form

A table is in 4NF if

it is in 3NF and
has no multiple sets of multivalued dependencies.

Consider the following example:

Each course is taught by many teachers and requires many texts.

CTXU (Unnormalized)
Course   Teacher  Text
Physics  Green    Basic Mechanics
         Brown    Intro to Optics
Math     White    Modern Algebra
                  Intro to Calculus

CTXN (Normalized)
Course   Teacher  Text
Physics  Green    Basic Mechanics
Physics  Green    Intro to Optics
Physics  Brown    Basic Mechanics
Physics  Brown    Intro to Optics
Math     White    Modern Algebra
Math     White    Intro to Calculus

Fourth Normal Form

CTXN is in BCNF, because it is all key and there are no other functional dependencies.
It does, however, have redundant data that could cause update anomalies.
This table shows two multivalued dependencies:

Each course has a defined set of teachers: Course →→ Teacher
Each course has a defined set of textbooks: Course →→ Text

MVDs can exist only when the relation has at least three attributes.
An FD is a special case of an MVD in which the set of dependent values has a single value.

Fourth Normal Form

Tables in 4NF

CT
Course   Teacher
Physics  Green
Physics  Brown
Math     White

CX
Course   Text
Physics  Basic Mechanics
Physics  Intro to Optics
Math     Modern Algebra
Math     Intro to Calculus
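The split into CT and CX can be checked on the sample data. The sketch below projects CTXN onto (Course, Teacher) and (Course, Text) and joins the two projections on Course. Getting CTXN back exactly is what the two multivalued dependencies promise. The variable names are hypothetical.

```python
from itertools import product

# CTXN rows: (Course, Teacher, Text)
CTXN = [
    ("Physics", "Green", "Basic Mechanics"),
    ("Physics", "Green", "Intro to Optics"),
    ("Physics", "Brown", "Basic Mechanics"),
    ("Physics", "Brown", "Intro to Optics"),
    ("Math", "White", "Modern Algebra"),
    ("Math", "White", "Intro to Calculus"),
]

# Projections onto (Course, Teacher) and (Course, Text), duplicates removed.
ct = sorted({(course, teacher) for course, teacher, _ in CTXN})
cx = sorted({(course, text) for course, _, text in CTXN})

# Natural join of CT and CX on Course.
rejoined = sorted(
    (course, teacher, text)
    for (course, teacher), (course2, text) in product(ct, cx)
    if course == course2
)

print(len(ct), len(cx))          # 3 4
print(rejoined == sorted(CTXN))  # True: the decomposition is nonloss
```

In CT and CX, adding a new Physics teacher is one new row in CT. In CTXN the same change needs one row per Physics text.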


Conversion to 4NF

Figure 4.14: Multivalued Dependencies

Figure 4.15: Set of Tables in 4NF

Normalization and Database Design

Normalization should be part of the design process
E-R Diagram provides macro view
Normalization provides micro view of entities
    Focuses on characteristics of specific entities
    May yield additional entities
Difficult to separate normalization from E-R diagramming
Business rules must be determined
Normalization purity is difficult to sustain due to conflict in:
    Design efficiency
    Information requirements
    Processing

Denormalization

Normalized (decomposed) tables require additional processing (joins) to reassemble data, which can reduce system speed.
Sometimes full normalization is deliberately not carried out, because of processing speed requirements and the practical aspects of the situation.
A good example: storing Zip Code and City as attributes in a Customer relation violates 3NF because City is transitively dependent on Cust ID via Zip Code.

Why should we not create a separate relation ZIP (ZipCode, City)?
