CHAPTER 14

Functional Dependencies and Normalization
for Relational Databases
Engr. Ahsan Shah
Informal Design Guidelines for
Relational Databases

❑Semantics of the Relation Attributes
❑Reducing the redundant values in tuples
❑Reducing the null values in tuples
❑Disallowing the possibility of generating spurious tuples
Semantics of the Relation Attributes
❑GUIDELINE 1:
− Each tuple in a relation should represent one
entity or relationship instance.
− Attributes of different entities should not be
mixed in the same relation.
− Only foreign keys should be used to refer to
other entities
− Entity and relationship attributes should be kept
apart as much as possible.
Redundant Information in Tuples and
Update Anomalies
❑Mixing attributes of multiple entities may cause
problems
❑Information is stored redundantly, wasting storage
❑Problems with update anomalies:
− Insertion anomalies
− Deletion anomalies
− Modification anomalies
Redundant Information in Tuples and
Update Anomalies
❑Consider the relation:
EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)
❑Update Anomaly: Changing the name of project
number P1 from “Billing” to “Customer-Accounting”
requires making this update in the tuples of
all 100 employees working on project P1.
❑Insert Anomaly: Cannot insert a project unless
an employee is assigned to it.
❑Conversely, cannot insert an employee unless
he/she is assigned to a project.
Redundant Information in Tuples and
Update Anomalies
❑Delete Anomaly: When a project is deleted, it will
result in deleting all the employees who work on
that project. Alternatively, if an employee is the sole
employee on a project, deleting that employee
would result in deleting the corresponding project.
Redundant Information in Tuples and
Update Anomalies
❑GUIDELINE 2:
− Design the base relation schemas so that no
insertion, deletion, or modification anomalies
are present in the relations.
− If any anomalies are present, note them clearly
and make sure that the programs that update the
database will operate correctly.
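The anomalies described above can be illustrated with a minimal sketch. Only the schema EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) comes from the slides; the rows and the helper names (`rename_project`, `delete_employee`, `project_names`) are assumptions made up for illustration.

```python
# Flat design from the text: EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours).
# All row values below are invented sample data.
emp_proj = [
    {"Emp#": "E1", "Proj#": "P1", "Ename": "Ann", "Pname": "Billing", "No_hours": 10},
    {"Emp#": "E2", "Proj#": "P1", "Ename": "Bob", "Pname": "Billing", "No_hours": 20},
    {"Emp#": "E3", "Proj#": "P2", "Ename": "Cid", "Pname": "Payroll", "No_hours": 15},
]


def rename_project(rows, proj, new_name):
    """Modification anomaly: every tuple of the project must be rewritten."""
    touched = 0
    for row in rows:
        if row["Proj#"] == proj:
            row["Pname"] = new_name
            touched += 1
    return touched


def delete_employee(rows, emp):
    """Deletion anomaly: removing a project's sole employee loses the project."""
    return [row for row in rows if row["Emp#"] != emp]


def project_names(rows):
    """Project numbers and names still recorded in the relation."""
    return {row["Proj#"]: row["Pname"] for row in rows}


updated_rows = rename_project(emp_proj, "P1", "Customer-Accounting")
remaining = delete_employee(emp_proj, "E3")
print(updated_rows)              # how many tuples one rename had to touch
print(project_names(remaining))  # P2 has disappeared together with E3
```

With separate EMPLOYEE, PROJECT, and WORKS_ON relations, the rename would touch a single PROJECT tuple and deleting E3 would leave project P2 in place.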
Null Values in Tuples
❑GUIDELINE 3: Relations should be designed such
that their tuples will have as few NULL values as
possible.
❑Attributes that are NULL frequently could be
placed in separate relations
❑Reasons for nulls:
− Attribute not applicable or invalid
− Attribute value unknown (may exist)
− Value known to exist, but unavailable
Spurious Tuples
❑Bad designs for a relational database may result in
erroneous results for certain JOIN operations
❑The "lossless join" property is used to guarantee
meaningful results for join operations

❑GUIDELINE 4: The relations should be designed to
satisfy the lossless join condition. No spurious
tuples should be generated by doing a natural-join
of any relations.
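A small sketch of how spurious tuples arise. The relation, its attribute names (Emp, Proj, City), and the helper functions are all hypothetical; the point is only that joining two projections on a non-key attribute can return tuples that were never in the original relation.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    joined = []
    for t1 in r:
        for t2 in s:
            if all(t1[a] == t2[a] for a in common):
                row = {**t1, **t2}
                if row not in joined:  # relations are sets: drop duplicates
                    joined.append(row)
    return joined


def project(rel, attrs):
    """Projection onto attrs, with duplicate tuples removed."""
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out


def as_set(rel):
    """Order-independent view of a relation, for comparisons."""
    return {tuple(sorted(t.items())) for t in rel}


# Invented sample data.
original = [
    {"Emp": "E1", "Proj": "P1", "City": "London"},
    {"Emp": "E2", "Proj": "P2", "City": "London"},
]

# Bad decomposition: the only shared attribute is City, which is not a key.
bad = natural_join(project(original, ["Emp", "City"]),
                   project(original, ["Proj", "City"]))
# Better decomposition: the shared attribute Proj determines City here.
good = natural_join(project(original, ["Emp", "Proj"]),
                    project(original, ["Proj", "City"]))

spurious = as_set(bad) - as_set(original)
print(len(bad), len(spurious))
print(as_set(good) == as_set(original))
```

The first join pairs every employee with every project in the same city, so it contains tuples that are not in `original`; the second join reproduces `original` exactly.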
Definition: Functional Dependency
❑Let R be a relation, and let X and Y be
arbitrary subsets of the set of attributes of R.
Then we say that Y is functionally dependent on
X, in symbols
X → Y
(read "X functionally determines Y"),
if and only if each X value in R has associated with
it precisely one Y value in R.
In other words:
Whenever two tuples of R agree on their X value,
they also agree on their Y value.
Example (SCP Relation)
S#   City      P#   QTY
S1   London    P1   100
S1   London    P2   100
S2   Paris     P1   200
S2   Paris     P2   200
S3   Delhi     P2   300
S4   Kolkata   P2   400
S4   Kolkata   P2   400
S4   Kolkata   P5   400
Example (SCP Relation) (contd.)
❑One FD that holds: {S#} → {City}
❑This holds because every tuple of that relation with a given
S# value also has the same City value.
❑The left-hand and right-hand sides of an FD are
sometimes called the determinant and the
dependent, respectively.
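The definition can be checked mechanically against a relation instance. The sketch below is one possible way to do it; the function name `satisfies_fd` is an assumption, and the rows are copied from the SCP table. Note that an instance can only refute an FD: a dependency that happens to hold in this sample is not thereby guaranteed to hold for all time.

```python
# Rows of the SCP relation, copied from the example table.
scp = [
    {"S#": "S1", "City": "London", "P#": "P1", "QTY": 100},
    {"S#": "S1", "City": "London", "P#": "P2", "QTY": 100},
    {"S#": "S2", "City": "Paris", "P#": "P1", "QTY": 200},
    {"S#": "S2", "City": "Paris", "P#": "P2", "QTY": 200},
    {"S#": "S3", "City": "Delhi", "P#": "P2", "QTY": 300},
    {"S#": "S4", "City": "Kolkata", "P#": "P2", "QTY": 400},
    {"S#": "S4", "City": "Kolkata", "P#": "P2", "QTY": 400},
    {"S#": "S4", "City": "Kolkata", "P#": "P5", "QTY": 400},
]


def satisfies_fd(rows, lhs, rhs):
    """True if any two rows that agree on lhs also agree on rhs."""
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


print(satisfies_fd(scp, ["S#"], ["City"]))  # the FD from the slide
print(satisfies_fd(scp, ["P#"], ["QTY"]))   # P1 appears with 100 and with 200
```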
Exercise
Check whether the SCP relation satisfies each of
the following FDs:
❑ < S#, P# > → <QTY>
❑ <S#, P#> → <City>
❑ < S#, P#> → <City, QTY>
❑ <S#, P#> → <S#>
❑ <S#, P#> → <S#, P#, QTY, City>
❑ <QTY> → <S#>
Functional dependencies
❑Consider the following relation schema

❑{Plane}→ StartTime
❑{Pilot, StartDate, StartTime}→ Plane
❑{Plane, StartDate}→ Pilot
Functional dependencies
❑A functional dependency is a constraint between
two sets of attributes from the database. Suppose
that the relational database schema has n attributes,
R = {A1, A2, …, An}, and that X and Y are subsets of R.
❑Definition.
A functional dependency between two sets of
attributes X and Y, denoted X → Y, specifies a constraint on
the possible tuples that can form a relation state
r of R.
The constraint is that any two tuples t1 and
t2 in r that have t1[X] = t2[X] must also have t1[Y] = t2[Y].
Functional Dependency
❑Main concept associated with normalization.

❑Functional Dependency
Describes relationship between attributes in a
relation.
If A and B are attributes of relation R, B is
functionally dependent on A (denoted A → B), if each
value of A in R is associated with exactly one value of
B in R.
Functional Dependency
❑Property of the meaning (or semantics) of the
attributes in a relation.

❑Diagrammatic representation:

Determinant of a functional dependency refers
to the attribute or group of attributes on the left-hand
side of the arrow.
Functional Dependency
❑Main characteristics of functional dependencies
used in normalization:
have a 1:1 relationship between attribute(s) on
left and right-hand side of a dependency;
hold for all time;
are nontrivial.
Functional dependencies
❑Example:
Social security number determines employee
name:
SSN → ENAME
Project number determines project name and
location:
PNUMBER → {PNAME, PLOCATION}
Employee SSN and project number together determine
the hours per week that the employee works on
the project:
{SSN, PNUMBER} → HOURS
Inference Rules for FDs
❑The following six inference rules hold for functional
dependencies. The first three (IR1 to IR3) are known as
Armstrong's inference rules.
IR1 (Reflexive): If Y ⊆ X, then X → Y.
IR2 (Augmentation): If X → Y, then XZ → YZ.
IR3 (Transitive): If X → Y and Y → Z, then X → Z.
Inference Rules for FDs
IR4 (Decomposition):
If X → YZ, then X → Y and X → Z.
IR5 (Union):
If X → Y and X → Z, then X → YZ.
IR6 (Pseudo-transitive):
If X → Y and WY → Z, then WX → Z.
❑The last three inference rules, as well as
any other inference rules, can be deduced
from IR1, IR2, and IR3 (completeness
property).
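As a worked illustration of that claim, IR4, IR5, and IR6 can each be derived in a few steps from IR1 to IR3. The derivations below are the standard ones, written out here as a sketch rather than taken from the slides.

```latex
\begin{align*}
\textbf{IR4 (decomposition).}\quad
  & X \to YZ        && \text{given} \\
  & YZ \to Y        && \text{IR1, since } Y \subseteq YZ \\
  & X \to Y         && \text{IR3 on the two lines above (and likewise } X \to Z\text{)} \\[6pt]
\textbf{IR5 (union).}\quad
  & X \to Y,\; X \to Z && \text{given} \\
  & X \to XY        && \text{IR2: augment } X \to Y \text{ with } X \\
  & XY \to YZ       && \text{IR2: augment } X \to Z \text{ with } Y \\
  & X \to YZ        && \text{IR3 on the two lines above} \\[6pt]
\textbf{IR6 (pseudo-transitive).}\quad
  & X \to Y,\; WY \to Z && \text{given} \\
  & WX \to WY       && \text{IR2: augment } X \to Y \text{ with } W \\
  & WX \to Z        && \text{IR3 with } WY \to Z
\end{align*}
```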
Dependency Types
*Consider PropertyNo and iDate as the primary key (PK)
Inference Rules for FDs
❑Closure: the set of all dependencies that can be
inferred from F is called the closure of F; it is
denoted by F+.
❑The closure of a set of attributes X with respect to F
is the set X+ of all attributes that are functionally
determined by X.
❑X+ can be calculated by repeatedly applying IR1,
IR2, and IR3 using the FDs in F.
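In practice the attribute closure is usually computed with a simple fixpoint loop: start from X and keep adding the right-hand side of any FD whose left-hand side is already covered. The sketch below assumes single-letter attribute names so that an attribute set can be written as a string; the function name `closure` and the sample FD set are illustrative assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds.

    attrs is an iterable of single-letter attribute names; fds is a list of
    (lhs, rhs) pairs in which each side is a string of attribute letters.
    """
    result = set(attrs)
    changed = True
    while changed:  # repeat until no FD adds anything new
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Hypothetical FD set: A -> B, B -> C, CD -> E.
F = [("A", "B"), ("B", "C"), ("CD", "E")]

print(sorted(closure("A", F)))   # A reaches B, then C
print(sorted(closure("AD", F)))  # with D available, CD -> E also fires
```

The loop terminates because each pass either adds at least one attribute or stops, and there are only finitely many attributes.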
Inference Rules for FDs
❑Example 1: Let Q be a relation with attributes
(A,B,C,D,E,G,H) and let the following functional
dependencies hold
F={f1: B→A;
f2: AD→CE;
f3: D→H;
f4: GH→ C;
f5: AC→D}
❑Find the closure X+ of X = {A, C}.
❑Answer: X+ = {A, C, D, E, H}: f5 adds D, then f2 adds E,
then f3 adds H; f1 and f4 never apply.
Inference Rules for FDs
❑Example 2: Let Q be a relation with attributes
(A,B,C,D,E,G) and let the following functional
dependencies hold
F = { f1: A → C;
f2: A → EG;
f3: B → D;
f4: G → E}
❑Find the closures X+ and Y+ of X = {A, B} and Y = {C, G, D}
❑Answer: X+ = {A, B, C, D, E, G}, Y+ = {C, D, E, G}
Inference Rules for FDs
❑Example 3:
F = {SSN →ENAME,
PNUMBER → {PNAME, PLOCATION},
{SSN, PNUMBER}→ HOURS}
Calculate the following closure sets with respect to
F:
{SSN }+ = {SSN, ENAME}
{PNUMBER }+ = {PNUMBER, PNAME, PLOCATION}
{SSN, PNUMBER}+ = {SSN, PNUMBER, ENAME,
PNAME, PLOCATION, HOURS}
Inference Rules for FDs
❑Example: Q(A, B, C), F = {AB → C, C → B}. F+ = ?
All subsets of attributes:
{A}, {B}, {C}, {A,B}, {A,C}, {B,C}, {A,B,C}
Closure of all subsets:
− A+ = A
− B+ = B
− C+ = BC
− AC+ = ABC, so AC is a candidate key
− AB+ = ABC, so AB is a candidate key
− BC+ = BC
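The enumeration above can be automated: try attribute subsets in order of increasing size and keep each one whose closure is the whole schema and that does not contain a smaller key. This is a brute-force sketch (exponential in the number of attributes), assuming single-letter attribute names; `closure` and `candidate_keys` are illustrative names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings of letters."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure covers every attribute."""
    attributes = sorted(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


# Q(A, B, C) with F = {AB -> C, C -> B}, as in the example.
keys = candidate_keys("ABC", [("AB", "C"), ("C", "B")])
print([sorted(k) for k in keys])
```

Because subsets are visited smallest first, any superset of an already found key is skipped, which is what makes the reported keys minimal.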
Finding Candidate Key
Normalization of Relations
❑Normalization: The process of decomposing
unsatisfactory "bad" relations by breaking up their
attributes into smaller relations
❑Normal form: Condition using keys and FDs of a
relation to certify whether a relation schema is in
a particular normal form
Normalization of Relations
❑2NF, 3NF, BCNF: based on keys and FDs of a
relation schema
❑4NF: based on keys and multi-valued dependencies
(MVDs); 5NF: based on keys and join dependencies (JDs)
(Chapter 11)
❑Additional properties may be needed to ensure a
good relational design (lossless join, dependency
preservation; Chapter 11)
Practical Use of Normal Forms
❑Normalization is carried out in practice so that
the resulting designs are of high quality and meet
the desirable properties
❑The practical utility of these normal forms
becomes questionable when the constraints on
which they are based are hard to understand or
to detect
❑The database designers need not normalize to the
highest possible normal form. (usually up to 3NF,
BCNF or 4NF)
❑Denormalization: the process of storing the join
of higher normal form relations as a base
relation, which is in a lower normal form
First Normal Form
❑First normal form (1NF):
− Disallows multivalued attributes, composite
attributes, and their combinations.
− It states that the domain of an attribute must
include only atomic values and that the value of
any attribute in a tuple must be a single value
from the domain of that attribute.
Second Normal Form
❑Second normal form (2NF) is based on the
concept of full functional dependency.
A functional dependency X→Y is a full functional
dependency if removal of any attribute A from X
means that the dependency does not hold any
more.
Examples:
{SSN, PNUMBER} -> HOURS is a full FD since
neither SSN -> HOURS nor PNUMBER -> HOURS
hold
{SSN, PNUMBER} -> ENAME is not a full FD (it is
called a partial dependency ) since SSN -> ENAME
also holds
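The full/partial distinction can be tested with attribute closures: X → Y is full exactly when no attribute can be dropped from X without losing Y. The sketch below uses the SSN/PNUMBER dependencies from these slides; the function names and the set-based FD encoding are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_full_fd(lhs, rhs, fds):
    """True if lhs -> rhs follows from fds and no attribute of lhs is removable."""
    lhs, rhs = set(lhs), set(rhs)
    if not rhs <= closure(lhs, fds):
        raise ValueError("lhs -> rhs does not follow from the given FDs")
    # Full means: dropping any single attribute from lhs breaks the dependency.
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)


F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

print(is_full_fd({"SSN", "PNUMBER"}, {"HOURS"}, F))  # full dependency
print(is_full_fd({"SSN", "PNUMBER"}, {"ENAME"}, F))  # partial: SSN alone suffices
```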
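2NF Example
❑Suppose you have a relation R(A, B, C, D)
with functional dependencies
AB → D
B → C
❑The key is AB, and B → C makes C partially
dependent on the key, so R is not in 2NF.
Decompose it into:
R1 (A, B, D)
R2 (B, C)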
Third Normal Form
❑Third normal form (3NF) is based on the
concept of transitive dependency.
❑Transitive functional dependency: a functional
dependency X → Z that can be derived from two
FDs X → Y and Y → Z
❑Examples:
SSN → DMGRSSN is a transitive FD since
SSN→DNUMBER and DNUMBER →DMGRSSN hold
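3NF Example
❑Suppose you have a relation R(A, B, C, D)
with functional dependencies
A → B
A → C
C → D
❑The key is A, and D depends on A only
transitively through C, so R is not in 3NF.
Decompose it into:
R1 (A, B, C)
R2 (C, D)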
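Example
❑Suppose you have a relation R(A, B, C, D, E, F)
with key AB and functional dependencies
AB → CDEF
A → F
D → E
2NF (remove the partial dependency A → F):
R1 (A, B, C, D, E)
R2 (A, F)
3NF (remove the transitive dependency D → E):
R1 (A, B, C, D)
R2 (A, F)
R3 (D, E)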
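The reasoning behind this decomposition can be sketched as a small classifier that labels each FD relative to the key. It is a simplification that assumes the relation has a single candidate key (so the prime attributes are exactly the key attributes) and single-letter attribute names; `classify_fds` is an illustrative name.

```python
def classify_fds(key, fds):
    """Label each FD as 'ok', 'partial', or 'transitive' relative to one key.

    Assumes the relation has exactly one candidate key. fds is a list of
    (lhs, rhs) strings of single-letter attribute names.
    """
    key = set(key)
    labels = {}
    for lhs, rhs in fds:
        lhs_set, rhs_set = set(lhs), set(rhs)
        non_prime_rhs = rhs_set - key
        if not non_prime_rhs or key <= lhs_set:
            labels[(lhs, rhs)] = "ok"          # lhs is a superkey, or rhs is prime
        elif lhs_set < key:
            labels[(lhs, rhs)] = "partial"     # part of the key: violates 2NF
        else:
            labels[(lhs, rhs)] = "transitive"  # non-superkey lhs: violates 3NF
    return labels


labels = classify_fds("AB", [("AB", "CDEF"), ("A", "F"), ("D", "E")])
for fd, label in labels.items():
    print(fd, label)
```

Each FD labelled "partial" is split off in the 2NF step, and each FD labelled "transitive" is split off in the 3NF step, which matches the relations R2 (A, F) and R3 (D, E) above.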
BCNF (Boyce-Codd Normal Form)
❑Boyce-Codd normal form (BCNF) was proposed as
a simpler form of 3NF, but it was found to be
stricter than 3NF. That is, every relation in BCNF is
also in 3NF; however, a relation in 3NF is not
necessarily in BCNF.
❑Definition. A relation schema R is in BCNF if
whenever a nontrivial functional dependency
X → A holds in R, then X is a superkey of R
(that is, X contains a candidate key of R)
BCNF (Boyce-Codd Normal Form)
❑Each normal form is strictly stronger than the
previous one:
− Every 2NF relation is in 1NF
− Every 3NF relation is in 2NF
− Every BCNF relation is in 3NF
❑There exist relations that are in 3NF but not in
BCNF.
❑The goal is to have each relation in BCNF (or
3NF).
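A BCNF check can be sketched with attribute closures: a nontrivial FD violates BCNF when the closure of its left-hand side does not cover the whole schema. The sketch checks only the FDs listed in F, which is enough for testing the original relation against F, and assumes single-letter attribute names; `bcnf_violations` is an illustrative name.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings of letters."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Nontrivial FDs in fds whose left-hand side is not a superkey."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != all_attrs
    ]


# Q(A, B, C) with F = {AB -> C, C -> B}: C is not a superkey.
violations = bcnf_violations("ABC", [("AB", "C"), ("C", "B")])
print(violations)
```

This Q is an instance of a relation that is in 3NF but not in BCNF: C → B is allowed by 3NF because B is part of the candidate key AB, yet C is not a superkey.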
