CH 14

The document discusses normalization and functional dependencies in database design. It covers the concepts of normalization including normal forms and functional dependencies. The goals of normalization are to reduce data redundancy and update anomalies. Examples are provided to illustrate key normalization concepts.

Uploaded by

Hanako Ono

Chapter 14

Normalization

Pearson Education © 2014


Chapter 14 - Objectives

The potential problems associated with
redundant data in base relations.
Functional dependency
Normalization
Normal forms

2
Purpose of Normalization
Characteristics of a suitable set of relations
include:
the minimal number of attributes necessary to
support the data requirements of the enterprise;
attributes with a close logical relationship are
found in the same relation;
minimal redundancy with each attribute
represented only once with the important
exception of attributes that form all or part of
foreign keys.

3
How Normalization Supports
Database Design

4
Data Redundancy and Update
Anomalies
Major aim of relational database design is to
group attributes into relations to minimize
data redundancy.

5
Data Redundancy and Update
Anomalies
Potential benefits for implemented database
include:
Updates to the data stored in the database are
achieved with a minimal number of operations
thus reducing the opportunities for data
inconsistencies.
Reduction in the file storage space required by
the base relations thus minimizing costs.

6
Data Redundancy and Update
Anomalies
Problems associated with data redundancy
are illustrated by comparing the Staff and
Branch relations with the StaffBranch
relation.

7
Data Redundancy and Anomalies:
Staff&Branch vs StaffBranch

8
Data Redundancy and Update
Anomalies
StaffBranch relation has redundant data; the
details of a branch are repeated for every
member of staff.
If we update the address of branch B005 in
StaffBranch only in John White's row, Julie Lee's
row still holds the old (wrong) address for B005.
In contrast, the branch information appears
only once for each branch in the Branch
relation and only the branch number
(branchNo) is repeated in the Staff relation.
Updating the address in the Branch relation
affects all staff at that branch.

9
Data Redundancy and Update
Anomalies
Relations that contain redundant information
may potentially suffer from update
anomalies.
Types of update anomalies include
Insertion
Inserting a new member of staff at branch B005
requires re-entering the branch address, which
may be entered wrongly (e.g., an old address)
Deletion
Deleting John and Julie also loses the address
of branch B005
Modification
Modifying the address of B005 for John but not
for Julie leaves the data inconsistent

10
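The modification and deletion anomalies can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the staff numbers (S1, S2) and the address strings are assumed placeholder values; only the names John White and Julie Lee and branch B005 come from the example.

```python
# Illustrative sketch; staff numbers and address strings are assumed values.
staff_branch = [
    {"staffNo": "S1", "sName": "John White", "branchNo": "B005", "bAddress": "old address"},
    {"staffNo": "S2", "sName": "Julie Lee", "branchNo": "B005", "bAddress": "old address"},
]

# Modification anomaly: update B005's address in John's row only.
staff_branch[0]["bAddress"] = "new address"
addresses_for_b005 = {r["bAddress"] for r in staff_branch if r["branchNo"] == "B005"}
inconsistent = len(addresses_for_b005) > 1  # one branch, two addresses

# Deletion anomaly: removing all staff of B005 also removes its address.
remaining = [r for r in staff_branch if r["branchNo"] != "B005"]
b005_known_after_delete = any(r["branchNo"] == "B005" for r in remaining)

# Separate Staff and Branch relations: the address is stored once per branch.
staff = [
    {"staffNo": "S1", "sName": "John White", "branchNo": "B005"},
    {"staffNo": "S2", "sName": "Julie Lee", "branchNo": "B005"},
]
branch = {"B005": "old address"}
branch["B005"] = "new address"  # a single update is seen by every member of staff
staff = []                      # deleting all staff leaves the branch row in place
branch_survives = "B005" in branch

print(inconsistent, b005_known_after_delete, branch_survives)
```

In the single StaffBranch relation the partial update leaves two addresses for one branch, and deleting the staff rows loses the branch entirely; with separate relations neither problem arises.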
Functional Dependencies
Important concept associated with
normalization.
Functional dependency describes relationship
between attributes.
For example, if A and B are attributes of
relation R, B is functionally dependent on A
(denoted A → B), if each value of A in R is
associated with exactly one value of B in R.
Note: A need not be a candidate key, although
every candidate key functionally determines all
other attributes of R.

11
Characteristics of Functional
Dependencies
Diagrammatic representation.

The determinant of a functional dependency
refers to the attribute or group of attributes
on the left-hand side of the arrow.

12
An Example Functional
Dependency

13
Example Functional Dependency
that holds for all Time
Consider the values shown in staffNo and
sName attributes of the Staff relation (see
Slide 8).

Based on sample data, the following functional
dependencies appear to hold.

staffNo → sName
sName → staffNo

14
Example Functional Dependency
that holds for all Time
However, the only functional dependency that
remains true for all possible values for the
staffNo and sName attributes of the Staff
relation is:

staffNo → sName

15
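A check against sample data can refute a functional dependency, but it cannot prove that the dependency holds for all time. The sketch below illustrates the point; the helper name determines and the staff numbers are assumptions made for this example.

```python
from collections import defaultdict

def determines(rows, lhs, rhs):
    """True if each lhs value maps to exactly one rhs value in rows."""
    images = defaultdict(set)
    for row in rows:
        images[row[lhs]].add(row[rhs])
    return all(len(values) == 1 for values in images.values())

# Assumed sample rows; the staff numbers are hypothetical.
staff = [
    {"staffNo": "S1", "sName": "John White"},
    {"staffNo": "S2", "sName": "Julie Lee"},
]
before = determines(staff, "sName", "staffNo")  # appears to hold on this sample

# A second John White joins: a perfectly legal state of the relation.
staff.append({"staffNo": "S3", "sName": "John White"})
after = determines(staff, "sName", "staffNo")   # no longer holds
still = determines(staff, "staffNo", "sName")   # staffNo -> sName survives

print(before, after, still)
```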
Characteristics of Functional
Dependencies
Determinants should have the minimal number
of attributes necessary to maintain the
functional dependency with the attribute(s) on
the right-hand side.
i.e., no attribute can be removed from the
determinant without losing the dependency

This requirement is called full functional
dependency.

16
Example Full Functional
Dependency
The following functional dependency exists in
the Staff relation (see Slide 8).

staffNo, sName → branchNo

• True - each value of (staffNo, sName) is
associated with a single value of branchNo.
• However, branchNo is also functionally
dependent on a subset of (staffNo, sName),
namely staffNo. The example above is therefore
a partial dependency; the full functional
dependency is staffNo → branchNo.

17
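The difference between a full and a partial dependency can be tested mechanically on sample rows. The following is a rough sketch under stated assumptions: fd_holds and is_full_dependency are hypothetical helper names, and the three rows (including the staff numbers) are invented for illustration.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def is_full_dependency(rows, lhs, rhs):
    """lhs -> rhs holds and no proper, non-empty subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return False  # a smaller determinant suffices: partial dependency
    return True

# Assumed sample rows for Staff(staffNo, sName, branchNo).
staff = [
    {"staffNo": "S1", "sName": "John White", "branchNo": "B005"},
    {"staffNo": "S2", "sName": "Julie Lee", "branchNo": "B005"},
    {"staffNo": "S3", "sName": "John White", "branchNo": "B003"},
]

holds = fd_holds(staff, ("staffNo", "sName"), ("branchNo",))
full_composite = is_full_dependency(staff, ("staffNo", "sName"), ("branchNo",))
full_single = is_full_dependency(staff, ("staffNo",), ("branchNo",))
print(holds, full_composite, full_single)
```

The composite determinant passes the plain dependency check but fails the fullness check, because staffNo alone already determines branchNo.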
Characteristics of Functional
Dependencies
Main characteristics of functional
dependencies used in normalization:
There is a one-to-one relationship between the
attribute(s) on the left-hand side (determinant)
and those on the right-hand side of a functional
dependency.
Holds for all time.
The determinant has the minimal number of
attributes necessary to maintain the dependency
with the attribute(s) on the right-hand side.

18
Transitive Dependencies
Important to recognize a transitive dependency
because its existence in a relation can
potentially cause update anomalies.

Transitive dependency describes a condition
where A, B, and C are attributes of a relation
such that
if A → B and B → C,
then C is transitively dependent on A via B

19
Example Transitive Dependency
Consider functional dependencies in the
StaffBranch relation (see Slide 9).

staffNo → sName, position, salary, branchNo,
bAddress
branchNo → bAddress

• Transitive dependency, staffNo → bAddress
exists via branchNo.

20
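Given a list of single-attribute dependencies, the chain A → B → C can be found by a simple search. This is an illustrative sketch only: transitive_via is a hypothetical helper, and the dependency set is restricted to the two dependencies listed for StaffBranch on this slide, split into one pair per right-hand attribute.

```python
def transitive_via(fds, a):
    """Return (b, c) pairs where a -> b and b -> c, and b does not determine a."""
    direct = {rhs for lhs, rhs in fds if lhs == a}
    found = []
    for b in sorted(direct):
        if (b, a) in fds:
            continue  # b -> a would make a and b equivalent, not transitive
        for lhs, c in sorted(fds):
            if lhs == b and c != a and c != b:
                found.append((b, c))
    return found

# Dependencies from the StaffBranch example, one (determinant, dependent) pair each.
fds = {
    ("staffNo", "sName"),
    ("staffNo", "position"),
    ("staffNo", "salary"),
    ("staffNo", "branchNo"),
    ("staffNo", "bAddress"),
    ("branchNo", "bAddress"),
}

chains = transitive_via(fds, "staffNo")
print(chains)  # bAddress is reached from staffNo via branchNo
```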
The Process of Normalization
Formal technique for analyzing a relation
based on its primary key and the functional
dependencies between the attributes of that
relation.

Often executed as a series of steps. Each step
corresponds to a specific normal form, which
has known properties.

21
Identifying Functional
Dependencies
Identifying all functional dependencies
between a set of attributes is relatively simple
if the meaning of each attribute and the
relationships between the attributes are well
understood.
Based on users’ requirements specification
And common sense

22
Example - Identifying a set of
functional dependencies for the
StaffBranch relation
Examine semantics of attributes in
StaffBranch relation (see the next slide).
Assume that position held and branch
determine a member of staff’s salary.

23
Example Database

24
Example - Identifying a set of
functional dependencies for the
StaffBranch relation
With sufficient information available, identify
the functional dependencies for the
StaffBranch relation as:
staffNo → sName, position, salary, branchNo,
bAddress
branchNo → bAddress
bAddress → branchNo
branchNo, position → salary
bAddress, position → salary

25
Example - Using sample data to
identify functional dependencies.
Consider the data for attributes denoted A, B,
C, D, and E in the Sample relation (see next
slide).

Important to establish that sample data values
shown in relation are representative of all
possible values that can be held by attributes
A, B, C, D, and E. Assume true despite the
relatively small amount of data shown in this
relation.

26
Example - Using sample data to
identify functional dependencies.

27
Example - Using sample data to
identify functional dependencies.
Functional dependencies between attributes A
to E in the Sample relation.

A → C (fd1)
C → A (fd2)
B → D (fd3)
A, B → E (fd4)

28
Identifying the Primary Key for a
Relation using Functional
Dependencies
Main purpose of identifying a set of functional
dependencies for a relation is to specify the set
of integrity constraints that must hold on a
relation.

An important integrity constraint to consider
first is the identification of candidate keys, one
of which is selected to be the primary key for
the relation.

29
Example - Identify Primary Key for
StaffBranch Relation
StaffBranch relation has five functional
dependencies (see Slide 25).

The determinants are staffNo, branchNo,
bAddress, (branchNo, position), and (bAddress,
position).

To identify all candidate key(s), identify the
attribute (or group of attributes) that uniquely
identifies each tuple in this relation.

30
Example - Identifying Primary Key
for StaffBranch Relation
All attributes that are not part of a candidate
key should be functionally dependent on the
key.

The only candidate key and therefore primary
key for StaffBranch relation, is staffNo, as all
other attributes of the relation are functionally
dependent on staffNo.

31
Example - Identifying Primary Key
for Sample Relation
Sample relation has four functional
dependencies (see Slide 27).

The determinants in the Sample relation
are A, B, C, and (A, B). However, the only
determinant that functionally determines
all the other attributes of the relation is (A,
B).

(A, B) is identified as the primary key for
this relation.

32
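Key identification can be cross-checked by computing attribute closures from the listed dependencies. The sketch below is a brute-force illustration, not a method taken from the slides; closure and candidate_keys are hypothetical helper names, and the dependency lists are transcribed from the StaffBranch and Sample examples.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets whose closure is everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

sample_attrs = {"A", "B", "C", "D", "E"}
sample_fds = [
    ({"A"}, {"C"}),       # fd1
    ({"C"}, {"A"}),       # fd2
    ({"B"}, {"D"}),       # fd3
    ({"A", "B"}, {"E"}),  # fd4
]

staff_attrs = {"staffNo", "sName", "position", "salary", "branchNo", "bAddress"}
staff_fds = [
    ({"staffNo"}, {"sName", "position", "salary", "branchNo", "bAddress"}),
    ({"branchNo"}, {"bAddress"}),
    ({"bAddress"}, {"branchNo"}),
    ({"branchNo", "position"}, {"salary"}),
    ({"bAddress", "position"}, {"salary"}),
]

sample_keys = [sorted(k) for k in candidate_keys(sample_attrs, sample_fds)]
staff_keys = [sorted(k) for k in candidate_keys(staff_attrs, staff_fds)]
print(sample_keys)
print(staff_keys)
```

For StaffBranch the search reports staffNo alone. For Sample it reports (A, B) and, because C → A, also (B, C); (B, C) is not one of the listed determinants, and the slides select (A, B) as the primary key.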
The Process of Normalization
As normalization proceeds, the relations
become progressively more restricted
(stronger) in format and also less vulnerable
to update anomalies.

33
The Process of Normalization

34
The Process of Normalization

35
Unnormalized Form (UNF)
A table that contains one or more repeating
groups.

36
First Normal Form (1NF)
A relation in which the intersection of each
row and column (i.e., each cell) contains one
and only one value.
Why? So that it can be represented in a table
and each field is searchable (one value per
attribute)

37
UNF to 1NF
Nominate an attribute or group of attributes
to act as the key for the unnormalized table.

Identify the repeating group(s): any cell in
the unnormalized table that holds multiple
values for a single value of the key
attribute(s).

38
UNF to 1NF
Remove the repeating group by
Entering appropriate data into the empty
columns of rows containing the repeating data
(‘flattening’ the table). i.e., add columns
Or by
Placing the repeating data along with a copy of
the original key attribute(s) into a separate
relation. (i.e., add rows, duplicating all
attributes; one row for each of the multiple
values)

39
UNF
Name    Age  Subject        State  Zip
Adam    21   Biology, Math  AR     72701
Alex    20   English        MA     02108
Namita  22   Math           KS     66044
Julie   24   CS             AR     72701

1NF; 2 versions

Name    Age  Major    Minor  State  Zip
Adam    21   Biology  Math   AR     72701
Alex    20   English  NULL   MA     02108
Namita  22   Math     NULL   KS     66044
Julie   24   CS       NULL   AR     72701

Name    Age  Subject  State  Zip
Adam    21   Biology  AR     72701
Adam    21   Math     AR     72701
Alex    20   English  MA     02108
Namita  22   Math     KS     66044
Julie   24   CS       AR     72701
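The row-based 1NF version can be produced from the unnormalized table by emitting one row per value in the repeating group. This is a minimal sketch assuming the repeating values are stored as a comma-separated string; to_1nf is a hypothetical helper, and the State and Zip values follow the 1NF version of the table.

```python
unf = [
    {"Name": "Adam", "Age": 21, "Subject": "Biology, Math", "State": "AR", "Zip": "72701"},
    {"Name": "Alex", "Age": 20, "Subject": "English", "State": "MA", "Zip": "02108"},
    {"Name": "Namita", "Age": 22, "Subject": "Math", "State": "KS", "Zip": "66044"},
    {"Name": "Julie", "Age": 24, "Subject": "CS", "State": "AR", "Zip": "72701"},
]

def to_1nf(rows, repeating):
    """Emit one row per value of the repeating attribute, copying the other attributes."""
    flat = []
    for row in rows:
        for value in row[repeating].split(","):
            flat.append({**row, repeating: value.strip()})
    return flat

first_nf = to_1nf(unf, "Subject")
for row in first_nf:
    print(row["Name"], row["Age"], row["Subject"], row["State"], row["Zip"])
```

Every cell now holds a single value, at the cost of repeating Adam's age, state, and zip in two rows.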
Second Normal Form (2NF)
Based on the concept of full functional
dependency.

Full functional dependency indicates that if
A,B is a composite key and C is an attribute of a
relation,
C is fully dependent on A,B if
C is functionally dependent on A,B but not on any
proper subset of A,B, i.e., not dependent on just A
or just B

41
Second Normal Form (2NF)
A relation that is in 1NF and every non-
primary-key attribute is fully functionally
dependent on the primary key.
Why? Decreases data redundancy
Reduces insertion/deletion/modification
anomalies
Decreases storage

42
1NF to 2NF
Identify the primary key for the 1NF relation.
If the primary key is not composite
In 2NF by definition

Identify the functional dependencies in the


relation.

If partial dependencies exist on the primary
key remove them by placing them in a new
relation along with a copy of their
determinant.

43
1NF to 2NF - Example
Student (Name, Age, Subject, State, Zip)
Key:
No single attribute uniquely identifies records
Composite key
Age, Subject could be a key (based on data)
Add in domain knowledge to rule it out
May be >1 student of the same age in Math in future
Name, Age cannot be a key
Two records with Adam, 21
Name, Subject is our key

44
1NF to 2NF - Example
Functional dependencies:
Name, Subject -> Age, State, Zip
Name -> Age, State, Zip
Name -> Age; Name -> State; Name -> Zip
Example of partial dependency
Age, State & Zip are dependent on Name, and
Name is one component of our composite key,
so not in 2NF
Assumes Names are unique
Solution
Make a separate table with partial key for partial
dependency

45
Note data redundancy (Adam 21, etc.) reduced
1NF
Name    Age  Subject  State  Zip
Adam    21   Biology  AR     72701
Adam    21   Math     AR     72701
Alex    20   English  MA     02108
Namita  22   Math     KS     66044
Julie   24   CS       AR     72701

2NF
Name    Subject
Adam    Biology
Adam    Math
Alex    English
Namita  Math
Julie   CS

Name    Age  State  Zip
Adam    21   AR     72701
Alex    20   MA     02108
Namita  22   KS     66044
Julie   24   AR     72701
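The 2NF decomposition amounts to two projections of the 1NF relation, and joining them on Name gives back the original rows. The sketch below shows this on the table data; project is a hypothetical helper written for this example.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (first-seen order kept)."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out

student_1nf = [
    {"Name": "Adam", "Age": 21, "Subject": "Biology", "State": "AR", "Zip": "72701"},
    {"Name": "Adam", "Age": 21, "Subject": "Math", "State": "AR", "Zip": "72701"},
    {"Name": "Alex", "Age": 20, "Subject": "English", "State": "MA", "Zip": "02108"},
    {"Name": "Namita", "Age": 22, "Subject": "Math", "State": "KS", "Zip": "66044"},
    {"Name": "Julie", "Age": 24, "Subject": "CS", "State": "AR", "Zip": "72701"},
]

# Name, Subject is the key; Age, State, Zip depend on Name alone.
student_subject = project(student_1nf, ["Name", "Subject"])
student = project(student_1nf, ["Name", "Age", "State", "Zip"])

# Natural join on Name rebuilds the 1NF rows, so no information is lost.
rejoined = [
    {**s, **ss}
    for ss in student_subject
    for s in student
    if s["Name"] == ss["Name"]
]
lossless = (
    {tuple(sorted(r.items())) for r in rejoined}
    == {tuple(sorted(r.items())) for r in student_1nf}
)
print(len(student_subject), len(student), lossless)
```

Adam's age, state, and zip are now stored once instead of twice.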
Third Normal Form (3NF)

Based on the concept of transitive
dependency.

Transitive Dependency is a condition where
A, B and C are attributes of a relation such that
if A → B and B → C,
then C is transitively dependent on A through B.
(Provided that A is not functionally dependent
on B or C).

47
Third Normal Form (3NF)
A relation that is in 1NF and 2NF and in which
no non-primary-key attribute is transitively
dependent on the primary key.
Example:
In Student table,
Name -> Zip
Zip -> State
Transitively Name -> State

48
2NF to 3NF
Identify the primary key in the 2NF relation.
Name in Student table

Identify functional dependencies in the
relation.
Name -> Age; Name -> Zip; Zip -> State
If transitive dependencies exist on the
primary key remove them by placing them in
a new relation along with a copy of their
determinant.

49
2NF

Name    Age  State  Zip
Adam    21   AR     72701
Alex    20   MA     02108
Namita  22   KS     66044
Julie   24   AR     72701

3NF

Name    Age  Zip
Adam    21   72701
Alex    20   02108
Namita  22   66044
Julie   24   72701

Zip    State
72701  AR
02108  MA
66044  KS
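The 3NF step moves the transitive dependency Zip -> State into its own relation keyed by Zip. The following sketch works on the 2NF table data; the variable names are illustrative, and a dictionary stands in for the Zip relation.

```python
student_2nf = [
    {"Name": "Adam", "Age": 21, "State": "AR", "Zip": "72701"},
    {"Name": "Alex", "Age": 20, "State": "MA", "Zip": "02108"},
    {"Name": "Namita", "Age": 22, "State": "KS", "Zip": "66044"},
    {"Name": "Julie", "Age": 24, "State": "AR", "Zip": "72701"},
]

# Build the Zip relation; a conflicting state would mean Zip -> State does not hold.
zip_state = {}
for row in student_2nf:
    if zip_state.setdefault(row["Zip"], row["State"]) != row["State"]:
        raise ValueError(f"Zip {row['Zip']} maps to more than one state")

# The Student relation keeps Zip (the determinant) and drops State.
student_3nf = [
    {"Name": r["Name"], "Age": r["Age"], "Zip": r["Zip"]} for r in student_2nf
]

# State is recovered by a lookup through Zip.
julie_zip = next(r["Zip"] for r in student_3nf if r["Name"] == "Julie")
julie_state = zip_state[julie_zip]
print(len(student_3nf), len(zip_state), julie_state)
```

Adam and Julie share zip 72701, so the Zip relation has three rows while Student keeps four.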
General Definitions of 2NF and 3NF
Second normal form (2NF)
A relation that is in first normal form and every
non-candidate-key attribute is fully functionally
dependent on any candidate key.

Third normal form (3NF)
A relation that is in first and second normal form
and in which no non-candidate-key attribute is
transitively dependent on any candidate key.

51
