
Carl Carison

Uploaded by

Monir Hossain
Chapter 8: Relational Database Design

Normalization in Databases

Database System Concepts, 6th Ed.


©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
Chapter 8: Relational Database Design

Features of Good Relational Design


Atomic Domains and First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Combine Schemas?
Suppose we combine instructor and department into inst_dept
(No connection to relationship set inst_dept)
Result is possible repetition of information
A Combined Schema Without Repetition

Consider combining relations


sec_class(sec_id, building, room_number) and
section(course_id, sec_id, semester, year)
into one relation
section(course_id, sec_id, semester, year,
building, room_number)
No repetition in this case
What About Smaller Schemas?
Suppose we had started with inst_dept. How would we know to split up
(decompose) it into instructor and department?
Write a rule “if there were a schema (dept_name, building, budget), then
dept_name would be a candidate key”
Denote as a functional dependency:
dept_name → building, budget
In inst_dept, because dept_name is not a candidate key, the building
and budget of a department may have to be repeated.
This indicates the need to decompose inst_dept
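The functional dependency above can be checked against sample data. The following is a minimal sketch, assuming rows are stored as dictionaries; the helper name fd_holds and all row values are hypothetical and are not taken from the slides. Passing the check on one instance does not prove the dependency for the schema, it only shows that this instance does not contradict it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical inst_dept rows; the values are illustrative only.
inst_dept = [
    {"ID": "1", "name": "Ann", "dept_name": "Physics", "building": "Watson", "budget": 70000},
    {"ID": "2", "name": "Bob", "dept_name": "Physics", "building": "Watson", "budget": 70000},
    {"ID": "3", "name": "Cy", "dept_name": "Music", "building": "Packard", "budget": 80000},
]

# dept_name -> building, budget holds, but dept_name does not determine ID or name,
# so dept_name is not a candidate key of inst_dept and department data is repeated.
dept_determines_rest = fd_holds(inst_dept, ["dept_name"], ["building", "budget"])
dept_is_key = fd_holds(inst_dept, ["dept_name"], ["ID", "name"])
print(dept_determines_rest, dept_is_key)  # True False
```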
Not all decompositions are good. Suppose we decompose
employee(ID, name, street, city, salary) into
employee1 (ID, name)
employee2 (name, street, city, salary)
The next slide shows how we lose information -- we cannot reconstruct
the original employee relation -- and so, this is a lossy decomposition.
A Lossy Decomposition
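A small sketch of how the employee decomposition loses information, assuming two employees who share a name. The rows and the helpers project and natural_join are hypothetical illustrations, not part of the slides.

```python
def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    out = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in out:
            out.append(t)
    return out


def natural_join(left, right):
    """Join rows that agree on every attribute the two sides share."""
    out = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                merged = {**l, **r}
                if merged not in out:
                    out.append(merged)
    return out


# Hypothetical rows: two different employees who are both named Kim.
employee = [
    {"ID": 1, "name": "Kim", "street": "Main", "city": "Perryridge", "salary": 75000},
    {"ID": 2, "name": "Kim", "street": "North", "city": "Hampton", "salary": 67000},
]

employee1 = project(employee, ["ID", "name"])
employee2 = project(employee, ["name", "street", "city", "salary"])
rejoined = natural_join(employee1, employee2)

# The join pairs each ID with both addresses, so spurious rows appear.
print(len(employee), len(rejoined))  # 2 4
```

The rejoined relation has more tuples than the original but less information: it can no longer say which Kim lives at which address.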
Example of Lossless-Join Decomposition

Lossless join decomposition


Decomposition of R = (A, B, C)
R1 = (A, B) R2 = (B, C)

r:
A | B | C
α | 1 | A
β | 2 | B

ΠA,B(r):
A | B
α | 1
β | 2

ΠB,C(r):
B | C
1 | A
2 | B

ΠA,B(r) ⋈ ΠB,C(r):
A | B | C
α | 1 | A
β | 2 | B
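The lossless-join example can be replayed in a few lines. This is only a sketch: it models r as a set of (A, B, C) tuples and assumes the two A values are α and β.

```python
# r over R = (A, B, C); the A values are assumed to be the symbols α and β.
r = {("α", 1, "A"), ("β", 2, "B")}

# Projections onto R1 = (A, B) and R2 = (B, C).
r1 = {(a, b) for a, b, c in r}
r2 = {(b, c) for a, b, c in r}

# Natural join of the projections on the shared attribute B.
joined = {(a, b, c) for a, b in r1 for b2, c in r2 if b == b2}

print(joined == r)  # True
```

Here the shared attribute B determines C in this instance, which is why joining the projections returns exactly r.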
Normal Forms

1NF
2NF
3NF
Other…
Not covered.
Normal Forms: Review

Unnormalized – There are multivalued attributes or repeating groups
1NF – No multivalued attributes or repeating groups
2NF – 1NF plus no partial dependencies
3NF – 2NF plus no transitive dependencies
First Normal Form
Domain is atomic if its elements are considered to be indivisible units
Examples of non-atomic domains:
• Set of names, composite attributes
• Identification numbers like CS101 that can be broken up into parts
A relational schema R is in first normal form if the domains of all
attributes of R are atomic
Non-atomic values complicate storage and encourage redundant
(repeated) storage of data
Example: Set of accounts stored with each customer, and set of
owners stored with each account
First Normal Form (Cont’d)
Atomicity is actually a property of how the elements of the domain are
used.
Example: Strings would normally be considered indivisible
Suppose that students are given roll numbers which are strings of
the form CS0012 or EE1127
If the first two characters are extracted to find the department, the
domain of roll numbers is not atomic.
Doing so is a bad idea: leads to encoding of information in
application program rather than in the database.
Example 1: Table Violating 1NF
Instructor | First Name | Last Name | Phone Number
123 | Ali | Baba | 215-000-1212
222 | Mikey | Mouse | 215-000-1212, 215-111-1212, 215-222-1212
555 | Donald | Duck | 312-000-1212, 312-111-1212, 312-222-1212
Example 1: Table Not Violating 1NF
Instructor | First Name | Last Name | Phone Number
123 | Ali | Baba | 215-000-1212
222 | Mikey | Mouse | 215-000-1212
222 | Mikey | Mouse | 215-111-1212
222 | Mikey | Mouse | 215-222-1212
555 | Donald | Duck | 312-000-1212
555 | Donald | Duck | 312-111-1212
555 | Donald | Duck | 312-222-1212

It violates other normal forms, though.


Example 2: Table Violating 1NF

Product ID Color Price


1 Black, Red $15
2 Yellow, Purple $20
5 White, Green $40
Example 2: Table Not Violating 1NF
Product ID Color Price
1 Black $15
1 Red $15
2 Yellow $20
2 Purple $20
5 White $40
5 Green $40

It violates other normal forms, though.
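The conversion in Example 2 can be sketched as a small function that turns each multi-valued cell into one row per value. The function name to_1nf and the dictionary layout are assumptions made for illustration.

```python
def to_1nf(rows, multi_attr, sep=","):
    """Emit one row per value found in the multi-valued attribute."""
    out = []
    for row in rows:
        for value in row[multi_attr].split(sep):
            out.append({**row, multi_attr: value.strip()})
    return out


# The product table from Example 2, with prices stored as plain numbers.
products = [
    {"product_id": 1, "color": "Black, Red", "price": 15},
    {"product_id": 2, "color": "Yellow, Purple", "price": 20},
    {"product_id": 5, "color": "White, Green", "price": 40},
]

flat = to_1nf(products, "color")
for row in flat:
    print(row["product_id"], row["color"], row["price"])
```

Each product now appears once per color, and the price is repeated on every such row, which is the redundancy that the later normal forms address.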


Types of Normalization

First Normal Form


• each field contains the smallest meaningful value
• the table does not contain repeating groups of fields or repeating data within the same field
• Create a separate field/table for each set of related data.
• Identify each set of related data with a primary key.
Tables Violating First Normal Form

PART (Primary Key) | WAREHOUSE
P0010 | Warehouse A, Warehouse B, Warehouse C
P0020 | Warehouse B, Warehouse D

Really Bad Set-up!

Better, but still flawed!

PART (Primary Key) | WAREHOUSE A | WAREHOUSE B | WAREHOUSE C
P0010 | Yes | No | Yes
P0020 | No | Yes | Yes


Table Conforming to 1NF

PART (Primary Key) | WAREHOUSE (Primary Key) | QUANTITY
P0010 | Warehouse A | 400
P0010 | Warehouse B | 543
P0010 | Warehouse C | 329
P0020 | Warehouse B | 200
P0020 | Warehouse D | 278


Second Normal Form – 2NF
• usually used in tables with a multiple-field primary key (composite key)
• each non-key field relates to the entire primary key
• any field that does not relate to the entire primary key is placed in a separate table
MAIN POINT – eliminate redundant data in a table
• Create separate tables for sets of values that apply to multiple records
Table Violating 2NF
Where is the problem?

PART (Primary Key) | WAREHOUSE (Primary Key) | QUANTITY | WAREHOUSE ADDRESS
P0010 | Warehouse A | 400 | 1608 New Field Road
P0010 | Warehouse B | 543 | 4141 Greenway Drive
P0010 | Warehouse C | 329 | 171 Pine Lane
P0020 | Warehouse B | 200 | 4141 Greenway Drive
P0020 | Warehouse D | 278 | 800 Massey Street


Tables Conforming to 2NF

PART_STOCK TABLE
PART (Primary Key) | WAREHOUSE (Primary Key) | QUANTITY
P0010 | Warehouse A | 400
P0010 | Warehouse B | 543
P0010 | Warehouse C | 329
P0020 | Warehouse B | 200
P0020 | Warehouse D | 278

WAREHOUSE TABLE
WAREHOUSE (Primary Key) | WAREHOUSE_ADDRESS
Warehouse A | 1608 New Field Road
Warehouse B | 4141 Greenway Drive
Warehouse C | 171 Pine Lane
Warehouse D | 800 Massey Street
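One way to express the two 2NF tables is as a schema with a composite primary key. This is a sketch using Python's built-in sqlite3 module with an in-memory database; the lowercase table and column names are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE warehouse (
        warehouse TEXT PRIMARY KEY,
        warehouse_address TEXT NOT NULL
    );
    CREATE TABLE part_stock (
        part TEXT NOT NULL,
        -- SQLite only enforces this reference when PRAGMA foreign_keys is ON.
        warehouse TEXT NOT NULL REFERENCES warehouse(warehouse),
        quantity INTEGER NOT NULL,
        PRIMARY KEY (part, warehouse)
    );
    """
)
conn.executemany(
    "INSERT INTO warehouse VALUES (?, ?)",
    [
        ("Warehouse A", "1608 New Field Road"),
        ("Warehouse B", "4141 Greenway Drive"),
        ("Warehouse C", "171 Pine Lane"),
        ("Warehouse D", "800 Massey Street"),
    ],
)
conn.executemany(
    "INSERT INTO part_stock VALUES (?, ?, ?)",
    [
        ("P0010", "Warehouse A", 400),
        ("P0010", "Warehouse B", 543),
        ("P0010", "Warehouse C", 329),
        ("P0020", "Warehouse B", 200),
        ("P0020", "Warehouse D", 278),
    ],
)

# Joining the two tables reproduces the original five-row table.
rows = conn.execute(
    """
    SELECT s.part, s.warehouse, s.quantity, w.warehouse_address
    FROM part_stock AS s
    JOIN warehouse AS w ON w.warehouse = s.warehouse
    ORDER BY s.part, s.warehouse
    """
).fetchall()
for row in rows:
    print(row)
```

Each address is now stored once, so changing the address of Warehouse B touches a single row instead of one row per stocked part.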
Third Normal Form – 3NF
• Usually used in tables with a single-field primary key
• Non-key fields do not depend on anything other than the table's primary key
• Each non-key field is a fact about the key
Values in a record that do not depend on that record's key do not belong in the table. In general, any time the contents of a group of fields may apply to more than a single record in the table, consider placing those fields in a separate table.
Table Violating 3NF
EMPLOYEE_DEPARTMENT TABLE

EMPNO (Primary Key) | FIRSTNAME | LASTNAME | WORKDEPT | DEPTNAME
000290 | John | Parker | E11 | Operations
000320 | Ramlal | Mehta | E21 | Software Support
000310 | Maude | Setright | E11 | Operations

The underlying problem is the transitive dependency to which the DEPTNAME attribute is subject. DEPTNAME actually depends on WORKDEPT, which in turn depends on the key EMPNO.
Tables Conforming to Third
Normal Form
EMPLOYEE TABLE
EMPNO (Primary Key) | FIRSTNAME | LASTNAME | WORKDEPT
000290 | John | Parker | E11
000320 | Ramlal | Mehta | E21
000310 | Maude | Setright | E11

DEPARTMENT TABLE
DEPTNO (Primary Key) | DEPTNAME
E11 | Operations
E21 | Software Support
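The 3NF split can be replayed on the three sample rows. The sketch below models rows as tuples, builds the department table from the WORKDEPT to DEPTNAME dependency, and checks that looking the name up again restores the original table. The variable names are illustrative assumptions.

```python
# Rows of EMPLOYEE_DEPARTMENT: (EMPNO, FIRSTNAME, LASTNAME, WORKDEPT, DEPTNAME)
employee_department = [
    ("000290", "John", "Parker", "E11", "Operations"),
    ("000320", "Ramlal", "Mehta", "E21", "Software Support"),
    ("000310", "Maude", "Setright", "E11", "Operations"),
]

# DEPARTMENT table: one entry per department code.
department = {}
for _empno, _first, _last, workdept, deptname in employee_department:
    # A conflicting name would mean WORKDEPT -> DEPTNAME does not hold.
    if department.setdefault(workdept, deptname) != deptname:
        raise ValueError(f"conflicting names for department {workdept}")

# EMPLOYEE table: the original rows without DEPTNAME.
employee = [(e, f, l, d) for e, f, l, d, _name in employee_department]

# Looking up DEPTNAME through WORKDEPT reconstructs the original rows.
rejoined = [(e, f, l, d, department[d]) for e, f, l, d in employee]
print(rejoined == employee_department)  # True
print(department)
```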


A Note on 2NF

A table may have multiple candidate keys.
A functional dependency of a non-prime attribute on part of any candidate key is a violation of 2NF.
It is necessary to establish that no non-prime attributes have part-key dependencies on any of these candidate keys.
Example
Candidate key: (Manufacturer, Model)
Primary key (PK): Model Full Name

Manufacturer | Model | Model Full Name | Manufacturer Country
Forte | X-Prime | Forte X-Prime | Italy
Forte | Ultraclean | Forte Ultraclean | Italy
Dent-o-Fresh | EZbrush | Dent-o-Fresh EZbrush | USA
Kobayashi | ST-60 | Kobayashi ST-60 | Japan
Hoch | Toothmaster | Hoch Toothmaster | Germany
Hoch | X-Prime | Hoch X-Prime | Germany

Example taken from Wikipedia:
http://en.wikipedia.org/wiki/Second_normal_form
Example

Electric Toothbrush Manufacturers
Manufacturer | Manufacturer Country
Forte | Italy
Dent-o-Fresh | USA
Kobayashi | Japan
Hoch | Germany

Electric Toothbrush Models
Manufacturer | Model | Model Full Name
Forte | X-Prime | Forte X-Prime
Forte | Ultraclean | Forte Ultraclean
Dent-o-Fresh | EZbrush | Dent-o-Fresh EZbrush
Kobayashi | ST-60 | Kobayashi ST-60
Hoch | Toothmaster | Hoch Toothmaster
Hoch | X-Prime | Hoch X-Prime
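The 2NF check for the toothbrush table can be sketched with attribute closures. The functional dependencies listed in FDS are assumptions read off the example (the slides do not state them explicitly), and the function names are hypothetical. The search over attribute subsets is brute force, which is fine for a handful of attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            s = set(combo)
            # Smaller keys are found first, so any superset is not minimal.
            if closure(s, fds) == all_attrs and not any(k <= s for k in keys):
                keys.append(s)
    return keys


def partial_dependencies(all_attrs, fds):
    """Non-prime attributes determined by a proper subset of a candidate key."""
    keys = candidate_keys(all_attrs, fds)
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                part = set(combo)
                for attr in sorted(closure(part, fds) - part - prime):
                    found.append((combo, attr))
    return found


ATTRS = {"Manufacturer", "Model", "ModelFullName", "ManufacturerCountry"}
# Assumed dependencies for the example table.
FDS = [
    ({"Manufacturer", "Model"}, {"ModelFullName"}),
    ({"ModelFullName"}, {"Manufacturer", "Model"}),
    ({"Manufacturer"}, {"ManufacturerCountry"}),
]

violations = partial_dependencies(ATTRS, FDS)
print(violations)  # [(('Manufacturer',), 'ManufacturerCountry')]
```

Under these assumed dependencies the only violation is that Manufacturer Country depends on Manufacturer alone, which matches the split into the Manufacturers and Models tables.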
More Examples
Example 1
Un-normalized Table:

Student# | Advisor# | Advisor | Adv-Room | Class1 | Class2 | Class3
1022 | 10 | Susan Jones | 412 | 101-07 | 143-01 | 159-02
4123 | 12 | Anne Smith | 216 | 101-07 | 159-02 | 214-01


Table in First Normal Form
No Repeating Fields
Data in Smallest Parts

Student# | Advisor# | AdvisorFName | AdvisorLName | Adv-Room | Class#
1022 | 10 | Susan | Jones | 412 | 101-07
1022 | 10 | Susan | Jones | 412 | 143-01
1022 | 10 | Susan | Jones | 412 | 159-02
4123 | 12 | Anne | Smith | 216 | 101-07
4123 | 12 | Anne | Smith | 216 | 159-02
4123 | 12 | Anne | Smith | 216 | 214-01


Is table in 2NF?
What is the key?

Student# | Advisor# | AdvisorFName | AdvisorLName | Adv-Room | Class#
1022 | 10 | Susan | Jones | 412 | 101-07
1022 | 10 | Susan | Jones | 412 | 143-01
1022 | 10 | Susan | Jones | 412 | 159-02
4123 | 12 | Anne | Smith | 216 | 101-07
4123 | 12 | Anne | Smith | 216 | 159-02
4123 | 12 | Anne | Smith | 216 | 214-01
2011 | 10 | Susan | Jones | 412 | 101-07


What do we notice?
• Advisor fields depend on Student# alone, which is only part of the key (Student#, Class#).


Tables in Second Normal Form
Redundant Data Eliminated
Table: Students
Student# | Advisor# | AdvFirstName | AdvLastName | Adv-Room
1022 | 10 | Susan | Jones | 412
4123 | 12 | Anne | Smith | 216
2011 | 10 | Susan | Jones | 412

Table: Registration
Student# | Class#
1022 | 101-07
1022 | 143-01
1022 | 159-02
4123 | 201-01
4123 | 211-02
4123 | 214-01
Table Registration is in 2NF.
How about the Students table?
What is the candidate key for Students?
Tables in 2NF.

Table: Advisors
Advisor# | AdvFirstName | AdvLastName | Adv-Room
10 | Susan | Jones | 412
12 | Anne | Smith | 216

Table: Students
Student# | Advisor#
1022 | 10
4123 | 12
2011 | 10

Table: Registration
Student# | Class#
1022 | 101-07
1022 | 143-01
1022 | 159-02
4123 | 201-01
4123 | 211-02
4123 | 214-01
Relationships for Example 1

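Registration (Student#, Class#)
Students (Student#, Advisor#)
Advisors (Advisor#, AdvFirstName, AdvLastName, Adv-Room)
Registration.Student# refers to Students.Student#; Students.Advisor# refers to Advisors.Advisor#.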
Example 2
Un-normalized Table:

EmpID | Name | Dept Code | Dept Name | Proj 1 | Time Proj 1 | Proj 2 | Time Proj 2 | Proj 3 | Time Proj 3
EN1-26 | Sean Breen | TW | Technical Writing | 30-T3 | 25% | 30-TC | 40% | 31-T3 | 30%
EN1-33 | Amy Guya | TW | Technical Writing | 30-T3 | 50% | 30-TC | 35% | 31-T3 | 60%
EN1-36 | Liz Roslyn | AC | Accounting | 35-TC | 90% | | | |


Table in First Normal Form
EmpID | Project Number | Time on Project | Last Name | First Name | Dept Code | Dept Name
EN1-26 | 30-T3 | 25% | Breen | Sean | TW | Technical Writing
EN1-26 | 30-TC | 40% | Breen | Sean | TW | Technical Writing
EN1-26 | 31-T3 | 30% | Breen | Sean | TW | Technical Writing
EN1-33 | 30-T3 | 50% | Guya | Amy | TW | Technical Writing
EN1-33 | 30-TC | 35% | Guya | Amy | TW | Technical Writing
EN1-33 | 31-T3 | 60% | Guya | Amy | TW | Technical Writing
EN1-36 | 35-TC | 90% | Roslyn | Liz | AC | Accounting


Tables in Second Normal Form

Table: Employees and Projects
EmpID | Project Number | Time on Project
EN1-26 | 30-T3 | 25%
EN1-26 | 30-TC | 40%
EN1-26 | 31-T3 | 30%
EN1-33 | 30-T3 | 50%
EN1-33 | 30-TC | 35%
EN1-33 | 31-T3 | 60%
EN1-36 | 35-TC | 90%

Table: Employees
EmpID | Last Name | First Name | Dept Code | Dept Name
EN1-26 | Breen | Sean | TW | Technical Writing
EN1-33 | Guya | Amy | TW | Technical Writing
EN1-36 | Roslyn | Liz | AC | Accounting

Are they in 3NF?

The underlying problem is the transitive dependency to which the Dept Name attribute is subject. Dept Name actually depends on Dept Code, which in turn depends on the key EmpID.
Tables in Third Normal Form

Table: Employees_and_Projects
EmpID | Project Number | Time on Project
EN1-26 | 30-T3 | 25%
EN1-26 | 30-TC | 40%
EN1-26 | 31-T3 | 30%
EN1-33 | 30-T3 | 50%
EN1-33 | 30-TC | 35%
EN1-33 | 31-T3 | 60%
EN1-36 | 35-TC | 90%

Table: Employees
EmpID | Last Name | First Name | Dept Code
EN1-26 | Breen | Sean | TW
EN1-33 | Guya | Amy | TW
EN1-36 | Roslyn | Liz | AC

Table: Departments
Dept Code | Dept Name
TW | Technical Writing
AC | Accounting
Relationships for Example 2

Employees_and_Projects (EmpID, ProjectNumber, TimeonProject)
Employees (EmpID, FirstName, LastName, DeptCode)
Departments (DeptCode, DeptName)
Example 3
• Un-normalized Table:

EmpID | Name | Manager | Dept | Sector | Spouse/Children
285 | Carl Carlson | Smithers | Engineering | 6G |
365 | Lenny | Smithers | Marketing | 8G |
458 | Homer Simpson | Mr. Burns | Safety | 7G | Marge, Bart, Lisa, Maggie
Table in First Normal Form
Fields contain smallest meaningful
values

EmpID | FName | LName | Manager | Dept | Sector | Spouse | Child1 | Child2 | Child3
285 | Carl | Carlson | Smithers | Eng. | 6G | | | |
365 | Lenny | | Smithers | Marketing | 8G | | | |
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Marge | Bart | Lisa | Maggie
Table in First Normal Form
No more repeated fields

EmpID | FName | LName | Manager | Department | Sector | Dependent
285 | Carl | Carlson | Smithers | Engineering | 6G |
365 | Lenny | | Smithers | Marketing | 8G |
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Marge
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Bart
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Lisa
458 | Homer | Simpson | Mr. Burns | Safety | 7G | Maggie


Second/Third Normal Form
Remove Repeated Data From Table
Step 1
EmpID | FName | LName | Manager | Department | Sector
285 | Carl | Carlson | Smithers | Engineering | 6G
365 | Lenny | | Smithers | Marketing | 8G
458 | Homer | Simpson | Mr. Burns | Safety | 7G

EmpID | Dependent
458 | Marge
458 | Bart
458 | Lisa
458 | Maggie
Tables in Second Normal Form
Remove Repeated Data From Table
Step 2

EmpID | FName | LName | ManagerID | Dept | Sector
285 | Carl | Carlson | 2 | Engineering | 6G
365 | Lenny | | 2 | Marketing | 8G
458 | Homer | Simpson | 1 | Safety | 7G

EmpID | Dependent
458 | Marge
458 | Bart
458 | Lisa
458 | Maggie

ManagerID | Manager
1 | Mr. Burns
2 | Smithers

We look for the transitive dependency.
Tables in Second Normal Form
How about 3NF?
Step 3
We look for the transitive dependency.
If I know Dept, then I know ManagerID and Sector. If I know EmpID, then I know Dept.
So ManagerID and Sector depend on EmpID only through Dept: EmpID → Dept → ManagerID, Sector.
Tables in Third Normal Form
Employees Table
EmpID | FName | LName | DeptCode
285 | Carl | Carlson | EN
365 | Lenny | | MK
458 | Homer | Simpson | SF

Manager Table
ManagerID | Manager
1 | Mr. Burns
2 | Smithers

Dependents Table
EmpID | Dependent
458 | Marge
458 | Bart
458 | Lisa
458 | Maggie

Department Table
DeptCode | Department | Sector | ManagerID
EN | Engineering | 6G | 2
MK | Marketing | 8G | 2
SF | Safety | 7G | 1
Example 4
Table Violating 1st Normal Form
Rep ID | Representative | Client 1 | Time 1 | Client 2 | Time 2 | Client 3 | Time 3
TS-89 | Gilroy Gladstone | US Corp. | 14 hrs | Taggarts | 26 hrs | Kilroy Inc. | 9 hrs
RK-56 | Mary Mayhem | Italiana | 67 hrs | Linkers | 2 hrs | |

Table in 1st Normal Form

Rep ID | Rep First Name | Rep Last Name | Client ID* | Client | Time With Client
TS-89 | Gilroy | Gladstone | 978 | US Corp | 14 hrs
TS-89 | Gilroy | Gladstone | 665 | Taggarts | 26 hrs
TS-89 | Gilroy | Gladstone | 782 | Kilroy Inc. | 9 hrs
RK-56 | Mary | Mayhem | 221 | Italiana | 67 hrs
RK-56 | Mary | Mayhem | 982 | Linkers | 2 hrs
Tables in 2nd and 3rd Normal
Form
Rep ID* | First Name | Last Name
TS-89 | Gilroy | Gladstone
RK-56 | Mary | Mayhem

Rep ID* | Client ID* | Time With Client
TS-89 | 978 | 14 hrs
TS-89 | 665 | 26 hrs
TS-89 | 782 | 9 hrs
RK-56 | 221 | 67 hrs
RK-56 | 982 | 2 hrs
RK-56 | 665 | 4 hrs

Client ID* | Client Name
978 | US Corp
665 | Taggarts
782 | Kilroy Inc.
221 | Italiana
982 | Linkers
This example comes from a tutorial from
http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=95
and
http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=104
Please check them out, as they are very well done.
Example 5
Table in 1st Normal Form

SupplierID | Status | City | PartID | Quantity
S1 | 20 | London | P1 | 300
S1 | 20 | London | P2 | 200
S2 | 10 | Paris | P1 | 300
S2 | 10 | Paris | P2 | 400
S3 | 10 | Paris | P2 | 200
S4 | 20 | London | P2 | 200
S4 | 20 | London | P4 | 300

Although this table is in 1NF, it contains redundant data. For example, information about the supplier's location and the location's status has to be repeated for every part supplied. Redundancy causes what are called update anomalies. Update anomalies are problems that arise when information is inserted, deleted, or updated. For example, the following anomalies could occur in this table:

INSERT. The fact that a certain supplier (S5) is located in a particular city (Athens) cannot be added until that supplier supplies a part.
DELETE. If a supplier's only row is deleted (for example, the row for S3), then not only is the information about quantity and part lost but also the information about the supplier.
UPDATE. If supplier S1 moved from London to New York, then two rows would have to be updated with this new information.
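The three anomalies can be reproduced on the 1NF table with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the table name supplier_parts and the column names are invented for the example, and S5's status of 30 is taken from the Suppliers table shown later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE supplier_parts (
        supplier_id TEXT NOT NULL,
        status INTEGER NOT NULL,
        city TEXT NOT NULL,
        part_id TEXT NOT NULL,
        quantity INTEGER NOT NULL,
        PRIMARY KEY (supplier_id, part_id)
    )
    """
)
conn.executemany(
    "INSERT INTO supplier_parts VALUES (?, ?, ?, ?, ?)",
    [
        ("S1", 20, "London", "P1", 300),
        ("S1", 20, "London", "P2", 200),
        ("S2", 10, "Paris", "P1", 300),
        ("S2", 10, "Paris", "P2", 400),
        ("S3", 10, "Paris", "P2", 200),
        ("S4", 20, "London", "P2", 200),
        ("S4", 20, "London", "P4", 300),
    ],
)

# INSERT anomaly: S5 cannot be recorded without a part, because part_id
# belongs to the primary key and may not be NULL.
try:
    conn.execute("INSERT INTO supplier_parts VALUES ('S5', 30, 'Athens', NULL, NULL)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# DELETE anomaly: removing S3's only shipment also removes all we know about S3.
conn.execute("DELETE FROM supplier_parts WHERE supplier_id = 'S3' AND part_id = 'P2'")
s3_rows_left = conn.execute(
    "SELECT COUNT(*) FROM supplier_parts WHERE supplier_id = 'S3'"
).fetchone()[0]

# UPDATE anomaly: moving S1 to New York has to change every row for S1.
rows_touched = conn.execute(
    "UPDATE supplier_parts SET city = 'New York' WHERE supplier_id = 'S1'"
).rowcount

print(insert_rejected, s3_rows_left, rows_touched)  # True 0 2
```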
Tables in 2NF
Suppliers
SupplierID | Status | City
S1 | 20 | London
S2 | 10 | Paris
S3 | 10 | Paris
S4 | 20 | London
S5 | 30 | Athens

Parts
SupplierID | PartID | Quantity
S1 | P1 | 300
S1 | P2 | 200
S2 | P1 | 300
S2 | P2 | 400
S3 | P2 | 200
S4 | P4 | 300
S4 | P5 | 400

Tables in 2NF but not in 3NF still contain modification anomalies. In the Suppliers table, Status depends on City, which in turn depends on SupplierID. The resulting anomalies are:

INSERT. The fact that a particular city has a certain status (Rome has a status of 50) cannot be inserted until there is a supplier in the city.
DELETE. Deleting any row in Suppliers destroys the status information about the city as well as the association between supplier and city.
