Normalization
Overview
What is normalization?
Introduction
Question:
How many, and what, relations (tables) should be
used to store my data?
Is the current relation free of problems?
Stud_ID  Stud_Name  Course_ID  Course_Name   Instructor  Office  Room  Credit
224      Waters     CIS20      Intro CIS     Greene      CBA001  205G  5
224      Waters     CIS40      Database Mgt  Hong        CBA908  311S  5
224      Waters     CIS50      Sys.Analysis  Purao       CBA700  139S  5
351      Byron      CIS30      COBOL         Hong        CBA908  629G  3
351      Byron      CIS50      Sys.Analysis  Purao       CBA700  139S  5
421      Smith      CIS20      Intro CIS     Greene      CBA001  205G  5
421      Smith      CIS30      COBOL         Hong        CBA908  629G  3
421      Smith      CIS50      Sys.Analysis  Purao       CBA700  139S  5
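To make the question concrete, the following Python sketch counts how often the same student and course facts are stored in the sample relation above. The tuple layout and the variable names are assumptions made for this illustration; they are not part of the original slides.

```python
from collections import Counter

# Rows of the sample relation, in the column order
# (Stud_ID, Stud_Name, Course_ID, Course_Name, Instructor, Office, Room, Credit).
rows = [
    (224, "Waters", "CIS20", "Intro CIS", "Greene", "CBA001", "205G", 5),
    (224, "Waters", "CIS40", "Database Mgt", "Hong", "CBA908", "311S", 5),
    (224, "Waters", "CIS50", "Sys.Analysis", "Purao", "CBA700", "139S", 5),
    (351, "Byron", "CIS30", "COBOL", "Hong", "CBA908", "629G", 3),
    (351, "Byron", "CIS50", "Sys.Analysis", "Purao", "CBA700", "139S", 5),
    (421, "Smith", "CIS20", "Intro CIS", "Greene", "CBA001", "205G", 5),
    (421, "Smith", "CIS30", "COBOL", "Hong", "CBA908", "629G", 3),
    (421, "Smith", "CIS50", "Sys.Analysis", "Purao", "CBA700", "139S", 5),
]

# How many times is each (Stud_ID, Stud_Name) pair stored?
student_copies = Counter(row[:2] for row in rows)
# How many times is each full course description stored?
course_copies = Counter(row[2:] for row in rows)

# Report every fact that is stored more than once.
for fact, copies in (student_copies + course_copies).items():
    if copies > 1:
        print(f"{fact} is stored {copies} times")
```

Every repeated fact is a place where an update can reach one copy and miss another. For example, moving Purao to a new office means changing three rows.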
Normalization
Normalization is the process of producing a set
of related relations (tables) with desirable
properties, given the data requirements of a
domain
The goal is to remove redundancy and other data
modification (insertion, update, and deletion)
problems
Usually done by dividing a table into two or more tables
Using normal forms as a guide
Normal Forms
Normal forms are guidelines
(steps) for the normalization
process
[Figure: hierarchy of normal forms, bottom to top: Unnormalized Form (UNF), 1st Normal Form (1NF), 3rd Normal Form (3NF), 4th Normal Form (4NF), ..., DK/NF]
Normalization – 1NF
A table is in 1NF if
it satisfies the definition of a relation
Review: what are the characteristics of a
relation?
No “repeating groups” (columns)
Repeating Groups
Customer ID  First Name  Surname    Telephone Number
123          Robert      Ingram     555-861-2025
456          Jane        Wright     555-403-1659
                                    555-776-4100
789          Maria       Fernandez  555-808-9633
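One common way to remove a repeating group is to store one row per value. The Python sketch below applies that idea to the customer table above; the function name `to_1nf`, the dictionary keys, and the row-per-phone layout are assumptions chosen for illustration, and other 1NF designs (such as a separate telephone table) are equally valid.

```python
# Unnormalized rows: the "phones" entry holds a list, i.e. a repeating group.
unf_customers = [
    {"customer_id": 123, "first_name": "Robert", "surname": "Ingram",
     "phones": ["555-861-2025"]},
    {"customer_id": 456, "first_name": "Jane", "surname": "Wright",
     "phones": ["555-403-1659", "555-776-4100"]},
    {"customer_id": 789, "first_name": "Maria", "surname": "Fernandez",
     "phones": ["555-808-9633"]},
]

def to_1nf(rows, group_column, value_column):
    """Emit one output row per value of the repeating group."""
    flat = []
    for row in rows:
        # Copy every attribute except the repeating group itself.
        base = {k: v for k, v in row.items() if k != group_column}
        for value in row[group_column]:
            flat.append({**base, value_column: value})
    return flat

customers_1nf = to_1nf(unf_customers, "phones", "phone")
for row in customers_1nf:
    print(row)
```

After the transformation every cell holds a single value, but Jane Wright's name now appears twice, which is the kind of redundancy the higher normal forms address.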
Transforming to 1NF: Example
Another example
[Figure: the same table shown in UNF and in 1NF]
Problems in 1NF
A 1NF table has essentially the same problems as a
spreadsheet table
Redundancy
Data may be inconsistent after modification
Higher Normal Forms
Normal forms higher than 1NF deal with
functional dependency
Functional Dependency
If each value of attribute A is associated with only
one value of attribute B, we say
A determines B
Or, B is dependent on A
Denoted as: A → B
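The definition can be tested against sample data. The sketch below is a minimal Python check, using three columns of the student and course relation from the introduction; the helper name `holds` is an assumption for this illustration. Note that sample rows can only refute a dependency: whether a dependency truly holds is decided by the business rules, not by the rows that happen to be stored.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# (Stud_ID, Course_ID, Instructor) taken from the sample relation.
enrollments = [
    {"Stud_ID": 224, "Course_ID": "CIS20", "Instructor": "Greene"},
    {"Stud_ID": 224, "Course_ID": "CIS40", "Instructor": "Hong"},
    {"Stud_ID": 224, "Course_ID": "CIS50", "Instructor": "Purao"},
    {"Stud_ID": 351, "Course_ID": "CIS30", "Instructor": "Hong"},
    {"Stud_ID": 351, "Course_ID": "CIS50", "Instructor": "Purao"},
    {"Stud_ID": 421, "Course_ID": "CIS20", "Instructor": "Greene"},
    {"Stud_ID": 421, "Course_ID": "CIS30", "Instructor": "Hong"},
    {"Stud_ID": 421, "Course_ID": "CIS50", "Instructor": "Purao"},
]

# Each course has one instructor in the sample, so this is not refuted.
print(holds(enrollments, ["Course_ID"], ["Instructor"]))
# Hong teaches both CIS40 and CIS30, so this dependency is refuted.
print(holds(enrollments, ["Instructor"], ["Course_ID"]))
```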
Functional Dependency Examples
Dependency example
For each SSN, there is only one corresponding
first name (or last name), so:
SSN determines FirstName
SSN → FirstName
Non-dependency example
Each instructor teaches multiple courses, so:
InstructorId does not determine CourseNumber
Functional Dependency and Keys
By definition, a primary key (candidate key)
functionally determines all other attributes
Primary key
Surrogate key
Composite primary key
Dependency diagram
Functional Dependency Exercise
CustomerNum → CustomerName?
{Street, City, State} → Zip?
CustomerName → Balance?
State → Zip?
RepNum → CustomerName?
Normalization – 2NF
A relation is in 2NF if
It is in 1st normal form, and
Every non-key attribute is functionally dependent on the
whole primary key (full dependency)
No partial dependency
This implies that a 1NF relation with a single-attribute
primary key (candidate key) is automatically in 2NF
Partial dependency
A non-key attribute is dependent on only part of a composite primary key
Example: in R(A, B, C, D) with primary key (A, B), B → D is a partial dependency
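A partial dependency can be searched for mechanically: test every proper subset of the composite key against every non-key attribute. The Python sketch below does this for a few columns of the student and course relation from the introduction. It assumes that (Stud_ID, Course_ID) is the primary key, which the slides do not state explicitly, and the helper names are hypothetical.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def partial_dependencies(rows, key, non_key):
    """List (key_part, attribute) pairs where a proper part of the key determines attribute."""
    found = []
    for size in range(1, len(key)):            # proper subsets only
        for part in combinations(key, size):
            for attr in non_key:
                if holds(rows, part, [attr]):
                    found.append((part, attr))
    return found

rows = [
    {"Stud_ID": 224, "Stud_Name": "Waters", "Course_ID": "CIS20", "Instructor": "Greene"},
    {"Stud_ID": 224, "Stud_Name": "Waters", "Course_ID": "CIS40", "Instructor": "Hong"},
    {"Stud_ID": 224, "Stud_Name": "Waters", "Course_ID": "CIS50", "Instructor": "Purao"},
    {"Stud_ID": 351, "Stud_Name": "Byron", "Course_ID": "CIS30", "Instructor": "Hong"},
    {"Stud_ID": 351, "Stud_Name": "Byron", "Course_ID": "CIS50", "Instructor": "Purao"},
    {"Stud_ID": 421, "Stud_Name": "Smith", "Course_ID": "CIS20", "Instructor": "Greene"},
    {"Stud_ID": 421, "Stud_Name": "Smith", "Course_ID": "CIS30", "Instructor": "Hong"},
    {"Stud_ID": 421, "Stud_Name": "Smith", "Course_ID": "CIS50", "Instructor": "Purao"},
]

# Assumed composite primary key: (Stud_ID, Course_ID).
violations = partial_dependencies(rows, ("Stud_ID", "Course_ID"), ["Stud_Name", "Instructor"])
for part, attr in violations:
    print(f"{part} -> {attr}")
```

In this sample, Stud_Name depends on Stud_ID alone and Instructor depends on Course_ID alone, so under the assumed key the relation is not in 2NF.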
A Relation in 1NF but Not in 2NF
Partial dependency
Transforming to 2NF
Identify the primary key (PK)
If the PK consists of only one field, the relation is already in 2NF
If the PK is a composite key, analyze the functional dependencies
between each part of the primary key and the non-key attributes
Move the attributes involved in a partial dependency to another relation
Example: R(A, B, C, D) with PK (A, B) and B → D
becomes R1(A, B, C) and R2(B, D)
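The split can be sketched in Python as two projections. The attribute names A, B, C, D follow the diagram above, with (A, B) taken as the primary key and D dependent on B alone; the row values and the helper name `project` are made up purely for illustration.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates and keeping first-seen order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result

# Hypothetical rows of R(A, B, C, D); D depends only on B.
r = [
    {"A": 1, "B": "x", "C": "c1", "D": "dx"},
    {"A": 2, "B": "x", "C": "c2", "D": "dx"},
    {"A": 1, "B": "y", "C": "c3", "D": "dy"},
]

r1 = project(r, ["A", "B", "C"])   # keeps the full key and the fully dependent attribute
r2 = project(r, ["B", "D"])        # holds the partial dependency B -> D exactly once per B

# Joining the two relations on B gives back the original rows, so nothing is lost.
rejoined = [{**left, **right} for left in r1 for right in r2 if left["B"] == right["B"]]
print(rejoined == r)
```

The value of D for B = "x" is stored once in r2 instead of twice in r, which is where the redundancy goes away.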
Transforming to 2NF: Example
Order
Transitive dependency
If A → B and B → C, then A → C
A B C D
A Relation in 2NF but Not in 3NF
Identify the primary key (PK) and look for
transitive dependencies
Transitive dependency
Transforming to 3NF
Move the attributes involved in transitive
dependency to another relation
Order_ID  Order_Date  Cust_ID  Cust_Name          Cust_Address
1006      10/24/2004  2        Value Furniture    Plano, TX
1007      10/25/2004  6        Furniture Gallery  Boulder, CO
1008      11/1/2004   2        Value Furniture    Plano, TX

Order
Order_ID  Order_Date  Cust_ID
1006      10/24/2004  2
1007      10/25/2004  6
1008      11/1/2004   2

Customer
Cust_ID  Cust_Name          Cust_Address
2        Value Furniture    Plano, TX
6        Furniture Gallery  Boulder, CO
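The same decomposition can be sketched in Python using the order data above. The helper `project` and the new address "Dallas, TX" are hypothetical, introduced only to show that after the split a customer change touches a single row.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates and keeping first-seen order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result

orders_flat = [
    {"Order_ID": 1006, "Order_Date": "10/24/2004", "Cust_ID": 2,
     "Cust_Name": "Value Furniture", "Cust_Address": "Plano, TX"},
    {"Order_ID": 1007, "Order_Date": "10/25/2004", "Cust_ID": 6,
     "Cust_Name": "Furniture Gallery", "Cust_Address": "Boulder, CO"},
    {"Order_ID": 1008, "Order_Date": "11/1/2004", "Cust_ID": 2,
     "Cust_Name": "Value Furniture", "Cust_Address": "Plano, TX"},
]

# Order_ID -> Cust_ID and Cust_ID -> Cust_Name, Cust_Address (transitive dependency),
# so the customer attributes move to their own relation.
orders = project(orders_flat, ["Order_ID", "Order_Date", "Cust_ID"])
customers = project(orders_flat, ["Cust_ID", "Cust_Name", "Cust_Address"])

# One update in Customer (hypothetical new address) ...
customers_by_id = {c["Cust_ID"]: c for c in customers}
customers_by_id[2]["Cust_Address"] = "Dallas, TX"

# ... is seen by every order of that customer when the relations are joined again.
report = [{**order, **customers_by_id[order["Cust_ID"]]} for order in orders]
for line in report:
    print(line)
```

In the original single table the same change would have to be applied to orders 1006 and 1008 separately, with the risk of updating only one of them.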
Some Practical Tips
If one table holds attributes of two different entities,
it usually has redundancy and data modification problems
Normalization Exercise 1-1
Normalization Exercise 1-2
Normalization Exercise 1-3
Normalization Exercise 1-4
Final database design with 3 tables
Summary
3NF: the tables are in 2NF, and every non-key attribute is
dependent on the key, the whole key, and nothing but the key
Eliminate transitive dependencies
2NF: the tables are in 1NF, and every non-key attribute is
dependent on the key, the whole key
Eliminate partial dependencies
UNF
Summary
Key concepts
Normalization
Normal forms
Functional dependency