7 Normalization For Relational Databases

Chapter 15 discusses normalization for relational databases, outlining informal design guidelines, functional dependencies, and various normal forms including 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF. It emphasizes the importance of reducing redundancy, avoiding update anomalies, and ensuring clear semantics in relation schemas. The chapter also provides definitions and examples to illustrate the concepts of keys and normalization processes.

Chapter 15

Normalization for Relational Databases

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley


Chapter 15 Outline
 Informal Design Guidelines for Relation
Schemas
 Functional Dependencies
 Normal Forms Based on Primary Keys
 General Definitions of Second and Third
Normal Forms
 Boyce-Codd Normal Form

Copyright © 2011 Ramez Elmasri and Shamkant Navathe


Chapter 15 Outline (cont’d.)
 Multivalued Dependency and Fourth
Normal Form
 Join Dependencies and Fifth Normal Form

Introduction
 Levels at which we can discuss goodness
of relation schemas
 Logical (or conceptual) level
 Implementation (or physical storage) level
 Approaches to database design:
 Bottom-up or top-down

Informal Design Guidelines
for Relation Schemas
 Measures of quality
 Making sure attribute semantics are clear
 Reducing redundant information in tuples
 Reducing NULL values in tuples
 Disallowing possibility of generating spurious
tuples

Imparting Clear Semantics to
Attributes in Relations
 Semantics of a relation
 Meaning resulting from interpretation of
attribute values in a tuple
 Easier to explain semantics of relation
 Indicates better schema design

Guideline 1
 Design relation schema so that it is easy to
explain its meaning
 Do not combine attributes from multiple
entity types and relationship types into a
single relation
 Example of violating Guideline 1: Figure
15.3

Redundant Information in Tuples
and Update Anomalies
 Grouping attributes into relation schemas
 Significant effect on storage space
 Storing natural joins of base relations leads
to update anomalies
 Types of update anomalies:
 Insertion
 Deletion
 Modification

EXAMPLE OF AN UPDATE
ANOMALY
Consider the relation:
EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)
 Update Anomaly
• Changing the name of project number P1 from “Billing” to
“Customer-Accounting” requires the same update in the tuples of
all 100 employees working on project P1
 Insert Anomaly
• Cannot insert a project unless an employee is assigned to it.
• Conversely, cannot insert an employee unless he/she is
assigned to a project.

EXAMPLE OF AN UPDATE
ANOMALY (2)
 Delete Anomaly
• When a project is deleted, all the employees who work on that
project are deleted with it. Alternatively, if an
employee is the sole employee on a project, deleting that
employee also deletes the corresponding project.
 Design a schema that does not suffer from the
insertion, deletion and update anomalies. If
there are any present, then note them so that
applications can be made to take them into
account
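
The anomalies described above can be reproduced on a toy instance. The sketch below is only an illustration: the sample rows, the dictionary field names, and the helpers `rename_project` and `delete_employee` are hypothetical and do not come from the chapter.

```python
# Hypothetical instance of EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours).
emp_proj = [
    {"emp": "E1", "proj": "P1", "ename": "Smith", "pname": "Billing", "hours": 10},
    {"emp": "E2", "proj": "P1", "ename": "Wong", "pname": "Billing", "hours": 20},
    {"emp": "E2", "proj": "P2", "ename": "Wong", "pname": "Payroll", "hours": 5},
]


def rename_project(rows, proj, new_name):
    """Rename a project in every tuple that stores its name; return tuples touched."""
    touched = 0
    for row in rows:
        if row["proj"] == proj:
            row["pname"] = new_name
            touched += 1
    return touched


def delete_employee(rows, emp):
    """Return the relation without any tuple of the given employee."""
    return [row for row in rows if row["emp"] != emp]


# Modification anomaly: one logical change touches every P1 tuple.
touched = rename_project(emp_proj, "P1", "Customer-Accounting")

# Deletion anomaly: E2 is the only employee on P2, so P2 disappears as well.
remaining = delete_employee(emp_proj, "E2")
projects_left = {row["proj"] for row in remaining}

print(touched, sorted(projects_left))
```

In a design that stores project names in a separate relation keyed by Proj#, the rename would touch a single tuple and deleting an employee would leave the project intact.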

Guideline 2
 Design base relation schemas so that no
update anomalies are present in the
relations
 If any anomalies are present:
 Note them clearly
 Make sure that the programs that update the
database will operate correctly

NULL Values in Tuples
 May group many attributes together into a
“fat” relation
 Can end up with many NULLs
 Problems with NULLs
 Wasted storage space
 Problems understanding meaning

Guideline 3
 Avoid placing attributes in a base relation
whose values may frequently be NULL
 If NULLs are unavoidable:
 Make sure that they apply in exceptional cases
only, not to a majority of tuples

Generation of Spurious Tuples
 Figure 15.5(a)
 Relation schemas EMP_LOCS and
EMP_PROJ1
 NATURAL JOIN
 Result produces many more tuples than the
original set of tuples in EMP_PROJ
 Called spurious tuples
 Represent spurious information that is not valid
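
A small sketch of how spurious tuples arise. The slide names only the relations, so the attribute lists used here, EMP_LOCS(Ename, Plocation) and EMP_PROJ1(Ssn, Pnumber, Hours, Pname, Plocation), are an assumption, and the two sample rows are hypothetical.

```python
# Original relation, hypothetical data:
# (ssn, pnumber, hours, ename, pname, plocation)
emp_proj = {
    ("111", 1, 10.0, "Smith", "ProductX", "Houston"),
    ("222", 2, 20.0, "Wong", "ProductY", "Houston"),
}

# Project onto the two (assumed) schemas.
emp_locs = {(ename, loc) for (_, _, _, ename, _, loc) in emp_proj}
emp_proj1 = {(ssn, pno, hrs, pname, loc) for (ssn, pno, hrs, _, pname, loc) in emp_proj}

# NATURAL JOIN on the only common attribute, Plocation.
joined = {
    (ssn, pno, hrs, ename, pname, loc)
    for (ename, loc) in emp_locs
    for (ssn, pno, hrs, pname, loc2) in emp_proj1
    if loc == loc2
}

# Tuples in the join that were never in the original relation.
spurious = joined - emp_proj

print(len(emp_proj), len(joined), len(spurious))
```

Plocation is neither a key nor a foreign key in either projection, which is why joining on it pairs every employee in Houston with every project in Houston.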

Guideline 4
 Design relation schemas to be joined with
equality conditions on attributes that are
appropriately related
 Guarantees that no spurious tuples are
generated
 Avoid relations that contain matching
attributes that are not (foreign key, primary
key) combinations

Summary and Discussion of
Design Guidelines
 Anomalies cause redundant work to be
done
 Waste of storage space due to NULLs
 Difficulty of performing operations and joins
due to NULL values
 Generation of invalid and spurious data
during joins

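Definitions of Keys and Attributes
Participating in Keys
 Superkey: a set of attributes S of R such that no
two tuples in any legal state of R have the same
value for S
 Key: a superkey from which no attribute can be
removed without losing the superkey property
 Candidate key
 If more than one key in a relation schema, each is
a candidate key
• One is primary key
• Others are secondary keys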
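
Superkeys and candidate keys can be computed from a set of FDs with an attribute-closure routine. This is a minimal brute-force sketch, not an algorithm given in the slides. The FDs listed for EMP_PROJ are an assumption read off the earlier example, and the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    """A superkey determines every attribute of the relation."""
    return closure(attrs, fds) >= all_attrs


def candidate_keys(all_attrs, fds):
    """Enumerate minimal superkeys, smallest first (exponential, small schemas only)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if is_superkey(candidate, all_attrs, fds) and not any(k <= candidate for k in keys):
                keys.append(candidate)
    return keys


# Assumed FDs for EMP_PROJ(Emp, Proj, Ename, Pname, No_hours).
R = {"Emp", "Proj", "Ename", "Pname", "No_hours"}
FDS = [
    ({"Emp", "Proj"}, {"No_hours"}),
    ({"Emp"}, {"Ename"}),
    ({"Proj"}, {"Pname"}),
]

keys = candidate_keys(R, FDS)
print(keys)
```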

Introduction to Normalization
 Normalization: Process of decomposing
unsatisfactory "bad" relations by breaking up their
attributes into smaller relations
 Normal form: Condition using keys and FDs of a
relation to certify whether a relation schema is in
a particular normal form
 2NF, 3NF, BCNF based on keys and FDs of a relation
schema
 4NF based on keys, multi-valued dependencies

First Normal Form
 Disallows composite attributes, multivalued
attributes, and nested relations; attributes
whose values for an individual tuple are
non-atomic
 Considered to be part of the definition of
relation
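
One common way to reach 1NF is to move a multivalued attribute into its own relation. The sketch below assumes a hypothetical DEPARTMENT relation with a multivalued `dlocations` attribute. The data and the helper name `to_1nf` are illustrative only.

```python
# Hypothetical non-1NF relation: dlocations holds a set of values per tuple.
department = [
    {"dnumber": 5, "dname": "Research", "dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"dnumber": 4, "dname": "Administration", "dlocations": ["Stafford"]},
]


def to_1nf(rows):
    """Split the multivalued attribute into a separate relation keyed by dnumber."""
    dept = [{"dnumber": r["dnumber"], "dname": r["dname"]} for r in rows]
    dept_locations = [
        {"dnumber": r["dnumber"], "dlocation": loc}
        for r in rows
        for loc in r["dlocations"]
    ]
    return dept, dept_locations


dept, dept_locations = to_1nf(department)
print(len(dept), len(dept_locations))
```

Every attribute value in both resulting relations is atomic, and the original tuples can be rebuilt by joining on `dnumber`.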

Second Normal Form
 Uses the concepts of FDs, primary key
 Definitions:
 Prime attribute: an attribute that is a member of
the primary key K or of any other candidate key
 Full functional dependency: an FD Y → Z
where removal of any attribute from Y means
the FD does not hold any more

Examples
Second Normal Form
 {SSN, PNUMBER} → HOURS is a full FD since neither
SSN → HOURS nor PNUMBER → HOURS holds
 {SSN, PNUMBER} → ENAME is not a full FD (it is
called a partial dependency) since SSN → ENAME
also holds
 A relation schema R is in second normal form (2NF) if
every non-prime attribute A in R is fully functionally
dependent on the primary key (the whole key); in the
general definition, on every candidate key
 R can be decomposed into 2NF relations via the
process of 2NF normalization
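
The full versus partial distinction in the examples above can be checked mechanically. This is a sketch under the assumption that the only FDs are {SSN, PNUMBER} → HOURS and SSN → ENAME. The helper names are hypothetical.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_full_fd(lhs, rhs, fds):
    """True if lhs -> rhs holds and stops holding once any attribute is removed from lhs."""
    if not rhs <= closure(lhs, fds):
        return False
    # Removing one attribute at a time is enough: any smaller determinant
    # is contained in one of these subsets.
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)


# FDs assumed from the slide's examples.
FDS = [
    ({"SSN", "PNUMBER"}, {"HOURS"}),
    ({"SSN"}, {"ENAME"}),
]

hours_full = is_full_fd({"SSN", "PNUMBER"}, {"HOURS"}, FDS)
ename_full = is_full_fd({"SSN", "PNUMBER"}, {"ENAME"}, FDS)
print(hours_full, ename_full)
```

Because ENAME is only partially dependent on the key {SSN, PNUMBER}, a relation holding all four attributes is not in 2NF under these assumptions.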

Third Normal Form
 Based on concept of transitive dependency
 A relation schema R is in third normal form
(3NF) if it is in 2NF and no non-prime
attribute A in R is transitively dependent on
the primary key

Third Normal Form
 Definition
 Transitive functional dependency: an FD X → Z is
transitive if there is a set of attributes Y that is neither
a candidate key nor a subset of any key, and both
X → Y and Y → Z hold
 Examples:
 SSN → DMGRSSN is a transitive FD since
SSN → DNUMBER and DNUMBER → DMGRSSN hold
 SSN → ENAME is non-transitive since there is no such set
of attributes X where SSN → X and X → ENAME

General Definitions of Second
and Third Normal Forms

BCNF (Boyce-Codd Normal
Form)
 A relation schema R is in Boyce-Codd Normal
Form (BCNF) if whenever a nontrivial FD X → A holds in
R, then X is a superkey of R
 Each normal form is strictly stronger than the previous
one:
• Every 2NF relation is in 1NF
• Every 3NF relation is in 2NF
• Every BCNF relation is in 3NF
 There exist relations that are in 3NF but not in BCNF
 The goal is to have each relation in BCNF (or 3NF)
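
A relation that is in 3NF but not in BCNF can be found with a direct check of the definitions. The sketch below uses the student/course/instructor dependencies from this chapter's BCNF example. The 3NF test relies on the general definition (for each nontrivial FD, the left side is a superkey or the right side is prime), which is an assumption here since that slide's text is not reproduced. All function names are hypothetical, and only the listed FDs are inspected.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Enumerate minimal superkeys by brute force (small schemas only)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            cand = set(combo)
            if closure(cand, fds) >= all_attrs and not any(k <= cand for k in keys):
                keys.append(cand)
    return keys


def is_bcnf(all_attrs, fds):
    """Every nontrivial listed FD must have a superkey on its left side."""
    return all(rhs <= lhs or closure(lhs, fds) >= all_attrs for lhs, rhs in fds)


def is_3nf(all_attrs, fds):
    """Each listed FD has a superkey on the left or only prime attributes on the right."""
    prime = set().union(*candidate_keys(all_attrs, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= all_attrs:
            continue
        if not (rhs - lhs) <= prime:
            return False
    return True


R = {"Student", "Course", "Instructor"}
FDS = [
    ({"Student", "Course"}, {"Instructor"}),
    ({"Instructor"}, {"Course"}),
]

print(candidate_keys(R, FDS), is_3nf(R, FDS), is_bcnf(R, FDS))
```

Both {Course, Student} and {Instructor, Student} are candidate keys, so every attribute is prime and the relation passes the 3NF test, while Instructor → Course fails the BCNF test because Instructor alone is not a superkey.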

BCNF
 {Student, Course} → Instructor
 Instructor → Course
 Possible decompositions into 2 schemas:
 {Student, Instructor} and {Student, Course}
 {Course, Instructor} and {Student, Course}
 {Course, Instructor} and {Instructor, Student}
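
The three decompositions can be compared with the standard test for a binary lossless (nonadditive) join: the common attributes must functionally determine all of R1 − R2 or all of R2 − R1. The slide does not state this test, so treat the sketch as a supplementary check. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Binary decomposition test: (R1 & R2) -> (R1 - R2) or (R1 & R2) -> (R2 - R1)."""
    determined = closure(r1 & r2, fds)
    return (r1 - r2) <= determined or (r2 - r1) <= determined


FDS = [
    ({"Student", "Course"}, {"Instructor"}),
    ({"Instructor"}, {"Course"}),
]

options = [
    ({"Student", "Instructor"}, {"Student", "Course"}),
    ({"Course", "Instructor"}, {"Student", "Course"}),
    ({"Course", "Instructor"}, {"Instructor", "Student"}),
]

results = [is_lossless_binary(r1, r2, FDS) for r1, r2 in options]
print(results)
```

Under these FDs only the third option passes, because its common attribute Instructor determines Course. None of the three options keeps {Student, Course} → Instructor inside a single relation.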

Example
 Given the relation
Book(Book_title, Authorname, Book_type,
Listprice, Author_affil, Publisher)
The FDs are
Book_title → Publisher, Book_type
Book_type → Listprice
Authorname → Author_affil
What normal form is the relation in?
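
One way to work through the question is to compute the candidate key and then look for non-prime attributes that depend on only part of it. The sketch below assumes the three listed FDs are the only ones that hold. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Enumerate minimal superkeys by brute force (small schemas only)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            cand = set(combo)
            if closure(cand, fds) >= all_attrs and not any(k <= cand for k in keys):
                keys.append(cand)
    return keys


def partial_dependencies(all_attrs, fds):
    """Map each proper part of a key to the non-prime attributes it determines."""
    keys = candidate_keys(all_attrs, fds)
    prime = set().union(*keys)
    found = {}
    for key in keys:
        for a in key:
            part = key - {a}
            dependents = closure(part, fds) - part - prime
            if dependents:
                found[frozenset(part)] = dependents
    return found


BOOK = {"Book_title", "Authorname", "Book_type", "Listprice", "Author_affil", "Publisher"}
FDS = [
    ({"Book_title"}, {"Publisher", "Book_type"}),
    ({"Book_type"}, {"Listprice"}),
    ({"Authorname"}, {"Author_affil"}),
]

keys = candidate_keys(BOOK, FDS)
partial = partial_dependencies(BOOK, FDS)
print(keys, partial)
```

The only candidate key is {Authorname, Book_title}, and each half of it determines some non-prime attributes. Under the stated assumption the relation therefore violates 2NF and is in 1NF only.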

Summary
 Informal guidelines for good design
 Functional dependency
 Basic tool for analyzing relational schemas
 Normalization:
 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
