
Relational Database Design - FDs

The document discusses relational database design and normalization using functional dependency theory. It defines functional dependencies and how they can be used to determine if a relation is in a "good" form with no data redundancy or anomalies. If a relation is not in good form, it can be decomposed into smaller relations through lossless join decomposition in a way that eliminates redundancy while preserving all original data through joins. Normalization is the process of decomposing relations to eliminate anomalies using dependencies between attributes.

Uploaded by

hityasha
Copyright
© © All Rights Reserved

Relational Database Design:

Functional Dependency Theory


Bad Schema
• Consider a relation inst_dept, which represents the result of a natural join on the relations corresponding to instructor and department.
• This relation is in “bad form” because information may be repeated, and we may need null values to represent some information (for example, a department with no instructors).
How to Deal with a Bad Schema?
• The only way to deal with a bad schema is to split (decompose) it into a number of smaller schemas.
• Hopefully, each of the smaller schemas is in a “good form”; that is, there is no repetition of information and there is no need to use null values.
• In the case of the inst_dept schema, we would decompose it into two relational schemas: instructor and department.
• We know that both instructor and department are in good form (how do we know this?).
Normalization or Schema Refinement
• Normalization, or schema refinement, is a technique for organizing the data in a database in an efficient manner.
• It is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics such as:
    ◦ Insertion anomaly
    ◦ Update anomaly
    ◦ Deletion anomaly
• The most common technique for schema refinement is decomposition.
• Goal of normalization or schema refinement: eliminate redundancy.
• Redundancy refers to repetition of the same data, or duplicate copies of the same data stored in different locations.
• Normalization is used mainly for two purposes:
    ◦ Eliminating redundant (useless) data
    ◦ Ensuring data dependencies make sense, that is, data is logically stored
Anomalies
1. Update anomaly: If the same fact is stored in several rows and an update misses any of them (for example, because of a coding error), the data becomes inconsistent. This is known as an update anomaly.
Example: Employee 519 is shown as having different addresses on different records.

Employees’ Skill
Employee ID | Address       | Skill
426         | 24 Vile Parle | Typing
426         | 24 Vile Parle | Shorthand
519         | 21 Juhu       | Public speaking
519         | 23 Bandra     | Carpentry
Anomalies
2. Insertion anomaly: Until the new faculty member, Dr. Hayes, is assigned to teach at least one course, his details cannot be recorded.

Faculty and Their Courses
Faculty ID | Name       | Hire Date   | Course Code
389        | Dr. Brad   | 10-Feb-2009 | ENG-206
407        | Dr. Hilton | 19-Apr-2008 | PHY-201
407        | Dr. Hilton | 19-Apr-2008 | PHY-203
424        | Dr. Hayes  | 29-Mar-2018 |

3. Deletion anomaly: All information about Dr. Brad is lost if he temporarily ceases to be assigned to any courses.
Resolution: Decomposition of schema
1. Update: (Employee ID, Address), (Employee ID, Skill)
2. Insert: (Faculty ID, Name, Hire Date), (Faculty ID, Course Code)
3. Delete: (Faculty ID, Name, Hire Date), (Faculty ID, Course Code)
Lossless-Join Decomposition
• When we decompose a relation r into two relations r1 and r2, we must make sure that we can reconstruct the information that r originally contained. Reconstruction is achieved by using the natural-join operation.
• Decomposition of r = (A, B, C) into r1 = (A, B) and r2 = (B, C):

r              r1         r2
A | B | C      A | B      B | C
α | 1 | A      α | 1      1 | A
β | 2 | B      β | 2      2 | B

r1 ⋈ r2
A | B | C
α | 1 | A
β | 2 | B
Decomposition does not always work
• Not all decompositions are lossless-join decompositions.
• Suppose we decompose
    employee(ID, name, street, city, salary) into:
    employee1 (ID, name)
    employee2 (name, street, city, salary)
• This loses information: if two employees share a name, joining employee1 and employee2 pairs each ID with both employees' details, so we cannot reconstruct the original employee relation. This is a lossy decomposition.
A Lossy Decomposition
Example: A Lossy Decomposition
• Consider Supplier_Parts schema: Supplier_Parts (S#, Sname, City, P#, Qty)
• Decompose it as:
    Supplier(S#, Sname, City, Qty) and Parts(P#, Qty)

Supplier_Parts
S# | Sname | City   | P#  | Qty
3  | Smith | London | 301 | 20
5  | Nick  | NY     | 500 | 50
2  | Steve | Boston | 20  | 10
5  | Nick  | NY     | 400 | 40
5  | Nick  | NY     | 301 | 10

Supplier
S# | Sname | City   | Qty
3  | Smith | London | 20
5  | Nick  | NY     | 50
2  | Steve | Boston | 10
5  | Nick  | NY     | 40
5  | Nick  | NY     | 10

Parts
P#  | Qty
301 | 20
500 | 50
20  | 10
400 | 40
301 | 10
Example: A Lossy Decomposition (Contd…)
• Take natural join to reconstruct: Supplier ⋈ Parts

S# | Sname | City   | P#  | Qty
3  | Smith | London | 301 | 20
5  | Nick  | NY     | 500 | 50
2  | Steve | Boston | 20  | 10
2  | Steve | Boston | 301 | 10
5  | Nick  | NY     | 400 | 40
5  | Nick  | NY     | 20  | 10
5  | Nick  | NY     | 301 | 10

• We get extra tuples: (2, Steve, Boston, 301, 10) and (5, Nick, NY, 20, 10).
• Join is lossy.

Note: Common attribute Qty is not a superkey in Supplier or in Parts.
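
The lossy join above can be reproduced with a short sketch. The helper names `project` and `natural_join` are illustrative assumptions, not part of the slides; rows are modelled as plain Python dictionaries.

```python
def project(relation, attrs):
    """Project a list of dict rows onto attrs, removing duplicate rows."""
    seen = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in seen:
            seen.append(reduced)
    return seen


def natural_join(r1, r2):
    """Natural join of two lists of dict rows on their shared attribute names."""
    out = []
    for a in r1:
        for b in r2:
            shared = a.keys() & b.keys()
            if all(a[k] == b[k] for k in shared):
                merged = {**a, **b}
                if merged not in out:
                    out.append(merged)
    return out


ATTRS = ("S#", "Sname", "City", "P#", "Qty")
supplier_parts = [dict(zip(ATTRS, t)) for t in [
    (3, "Smith", "London", 301, 20),
    (5, "Nick", "NY", 500, 50),
    (2, "Steve", "Boston", 20, 10),
    (5, "Nick", "NY", 400, 40),
    (5, "Nick", "NY", 301, 10),
]]

# Decompose on Qty, as in the lossy example, then join back.
supplier = project(supplier_parts, ("S#", "Sname", "City", "Qty"))
parts = project(supplier_parts, ("P#", "Qty"))
rejoined = natural_join(supplier, parts)

# Tuples that appear after the join but were not in the original relation.
extra = [row for row in rejoined if row not in supplier_parts]
print(len(rejoined), len(extra))  # 7 2
```

The two spurious rows both come from Qty = 10, which matches two suppliers and two parts.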


Example: Lossless-Join Decomposition
• Consider Supplier_Parts schema: Supplier_Parts (S#, Sname, City, P#, Qty)
• Now, decompose it as:
    Supplier(S#, Sname, City) and Parts(S#, P#, Qty)

Supplier_Parts
S# | Sname | City   | P#  | Qty
3  | Smith | London | 301 | 20
5  | Nick  | NY     | 500 | 50
2  | Steve | Boston | 20  | 10
5  | Nick  | NY     | 400 | 40
5  | Nick  | NY     | 301 | 10

Supplier
S# | Sname | City
3  | Smith | London
5  | Nick  | NY
2  | Steve | Boston

Parts
S# | P#  | Qty
3  | 301 | 20
5  | 500 | 50
2  | 20  | 10
5  | 400 | 40
5  | 301 | 10
Example: Lossless-Join Decomposition (Contd..)

• Take natural join to reconstruct: Supplier ⋈ Parts

S# | Sname | City   | P#  | Qty
3  | Smith | London | 301 | 20
5  | Nick  | NY     | 500 | 50
2  | Steve | Boston | 20  | 10
5  | Nick  | NY     | 400 | 40
5  | Nick  | NY     | 301 | 10

• We get back the original relation.
• Join is lossless.

Note: Common attribute S# is a superkey in Supplier.


Lossless-Join Decomposition Definition
• Let R be a relation schema, and let R1 and R2 form a decomposition of R.
• We say that the decomposition is a lossless decomposition if there is no loss of information by replacing R with the two relation schemas R1 and R2.
• More precisely, we say the decomposition is lossless if, for all legal database instances (that is, all possible relations r on schema R),
    r = Π_R1(r) ⋈ Π_R2(r)
Decomposition Theory
Goal: Devise a theory for the following:
• Decide whether a particular relation R is in “good” form.
• In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that
    ◦ Each relation is in good form
    ◦ The decomposition is a lossless-join decomposition
• Our theory is based on:
    ◦ Functional dependencies
    ◦ Multivalued dependencies
Functional Dependencies
• Functional dependencies are constraints on the set of legal relations.
• They require that the value for a certain set of attributes uniquely determines the value for another set of attributes.
• A functional dependency is a generalization of the notion of a key.
Functional Dependencies (Cont.)
• Let R be a relation schema, and let α and β be two subsets of R:
    α ⊆ R and β ⊆ R
• The functional dependency
    α → β
  holds on R if and only if for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β. That is,
    t1[α] = t2[α] ⇒ t1[β] = t2[β]
• Example: Consider r(A, B) with the following instance of r.

A | B
1 | 4
1 | 5
3 | 7

• On this instance, A → B does NOT hold, but B → A does hold.
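
The definition can be checked mechanically on a single instance. The sketch below is illustrative only: the function name `satisfies` and the list-of-dicts representation are assumptions, and a passing check says nothing about whether the dependency holds on all legal instances.

```python
def satisfies(rows, lhs, rhs):
    """Return True if the instance `rows` satisfies the FD lhs -> rhs.

    Two tuples that agree on every attribute in lhs must also agree on rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value; any later
        # tuple with the same lhs value must carry the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


r = [{"A": 1, "B": 4}, {"A": 1, "B": 5}, {"A": 3, "B": 7}]
print(satisfies(r, ["A"], ["B"]))  # False: A = 1 appears with B = 4 and B = 5
print(satisfies(r, ["B"], ["A"]))  # True: every B value appears once
```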


Functional Dependencies (Cont.)
• K is a superkey for relation schema R if and only if K → R
• K is a candidate key for R if and only if
    ◦ K → R, and
    ◦ for no α ⊂ K, α → R
• Functional dependencies allow us to express constraints that cannot be expressed using superkeys.
• Consider the schema:
    inst_dept (ID, name, salary, dept_name, building, budget)
  We expect these two functional dependencies to hold:
    dept_name → building
    and ID → building
  but would not expect the following to hold:
    dept_name → salary
Use of Functional Dependencies
• We use functional dependencies to:
    ◦ Test relations to see if they are legal under a given set of functional dependencies
        If a relation r is legal under a set F of functional dependencies, we say that r satisfies F
    ◦ Specify constraints on the set of legal relations
        We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F
• Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances.
    ◦ For example, a specific instance of instructor may, by chance, satisfy name → ID
Trivial Functional Dependencies
• A functional dependency of the form α → β is trivial if β ⊆ α
• Example:
    ◦ ID, name → ID
    ◦ name → name
• A trivial functional dependency is always satisfied by all instances of a relation.
Functional Dependencies (Contd..)

StudentID | Semester | Course                | TA
1234      | 6        | Numerical Methods     | John
1221      | 4        | Numerical Methods     | Smith
1234      | 6        | Distributed Computing | Bob
1201      | 2        | Numerical Methods     | Peter
1201      | 2        | Physics II            | Simon

• For the above relation, some functional dependencies are:
    ◦ StudentID → Semester
    ◦ {StudentID, Course} → TA
    ◦ {StudentID, Course} → {TA, Semester}
Functional Dependencies (Contd..)

Employee_ID | Employee_Name | Department_ID | Department_Name
0001        | John Doe      | 1             | Human Resources
0002        | Jane Doe      | 2             | Marketing
0003        | John Smith    | 1             | Human Resources
0004        | Jane Goodall  | 3             | Sales

• For the above relation, some functional dependencies are:
    ◦ Employee_ID → Employee_Name
    ◦ Employee_ID → Department_ID
    ◦ Department_ID → Department_Name
Functional-Dependency Theory
Closure of a Set of Functional Dependencies
• Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.
    ◦ For example: If A → B and B → C, then we can infer that A → C
• The set of all functional dependencies logically implied by F is the closure of F.
• We denote the closure of F by F+.
• F+ is a superset of F.
    ◦ F = {A → B, B → C}
    ◦ F+ ⊇ {A → B, B → C, A → C}; F+ also contains the trivial dependencies (such as A → A and AB → B) and everything else derivable from F.
• Closure of a set of FDs (F+): the set of all FDs that includes F as well as all dependencies that can be inferred from F.
Computing F+
• We can compute F+, the closure of F, by repeatedly applying Armstrong’s Axioms:
    ◦ If β ⊆ α, then α → β (Reflexivity)
    ◦ If α → β, then γα → γβ (Augmentation)
    ◦ If α → β and β → γ, then α → γ (Transitivity)
• These rules are
    ◦ Sound (generate only functional dependencies that actually hold), and
    ◦ Complete (generate all functional dependencies that hold).
Example of Armstrong’s Axioms
• R = (A, B, C, G, H, I)
  F = { A → B
        A → C
        CG → H
        CG → I
        B → H }
• Some members of F+:
    ◦ A → H
        by transitivity from A → B and B → H
    ◦ AG → I
        by augmenting A → C with G, to get AG → CG, and then transitivity with CG → I
    ◦ CG → HI
        by augmenting CG → I with CG, to infer CG → CGI, and augmenting CG → H with I, to infer CGI → HI, and then transitivity
Procedure for Computing F+
F+ = F
repeat
    for each functional dependency f in F+
        apply reflexivity and augmentation rules on f
        add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
        if f1 and f2 can be combined using transitivity
            then add the resulting functional dependency to F+
until F+ does not change any further
Computing F+ (Contd..)
• Additional rules:
    ◦ If α → β holds and α → γ holds, then α → βγ holds (Union)
    ◦ If α → βγ holds, then α → β holds and α → γ holds (Decomposition)
    ◦ If α → β holds and γβ → δ holds, then αγ → δ holds (Pseudotransitivity)
• The above rules can be inferred from Armstrong’s axioms.

(i) Union
    α → β       given
    αα → αβ     augmentation rule
    α → αβ      union of identical sets
    α → γ       given
    αβ → γβ     augmentation rule
    α → βγ      transitivity rule and set union commutativity
Computing F+ (Contd..)
(ii) Decomposition: If α → βγ holds, then α → β holds and α → γ holds
    α → βγ      given
    βγ → β      reflexivity rule
    α → β       transitivity rule
    βγ → γ      reflexivity rule
    α → γ       transitivity rule

(iii) Pseudotransitivity: If α → β holds and γβ → δ holds, then αγ → δ holds
    α → β       given
    αγ → γβ     augmentation rule and set union commutativity
    γβ → δ      given
    αγ → δ      transitivity rule
Computing F+ (Contd..)
• R = (A, B, C, G, H, I)
  F = { A → B
        A → C
        CG → H
        CG → I
        B → H }
• Computation of some members of F+ from F using Armstrong’s Axioms and the additional rules inferred from them:
    ◦ A → H
        by transitivity from A → B and B → H
    ◦ AG → H
        by pseudotransitivity from A → C and CG → H
    ◦ AG → I
        by pseudotransitivity from A → C and CG → I
    ◦ CG → HI
        by union from CG → H and CG → I
Closure of Attribute Sets
• Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F.
• Algorithm to compute α+, the closure of α under F:

result := α;
repeat
    for each functional dependency β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ;
    end
until (result does not change)

Note: β ⊆ result means α → β, and β → γ is given, so α → γ follows by transitivity; hence result := result ∪ γ.
Example: Closure of Attribute Sets
• R = (A, B, C, G, H, I)
• F = { A → B
        A → C
        CG → H
        CG → I
        B → H }
• (AG)+
    1. result = AG
    2. result = ABCG (A → B and A → C)
    3. result = ABCGH (CG → H and CG ⊆ ABCG)
    4. result = ABCGHI (CG → I and CG ⊆ ABCGH)

Note: AG → ABCGHI, hence AG is a superkey of R. It is also a candidate key, because neither A+ = ABCH nor G+ = G contains all attributes of R.
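
The closure algorithm translates almost line for line into code. This is a minimal sketch under one stated assumption: every attribute name is a single character, so an attribute set can be written as a string such as "CG". The function name `closure` is illustrative.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) strings.

    Assumes single-character attribute names, so "CG" means {C, G}.
    """
    result = set(attrs)
    changed = True
    while changed:  # repeat ... until result does not change
        changed = False
        for lhs, rhs in fds:
            # if lhs is contained in result, add rhs to result
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print("".join(sorted(closure("AG", F))))  # ABCGHI
print("".join(sorted(closure("A", F))))   # ABCH
print("".join(sorted(closure("G", F))))   # G
```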


Uses of Attribute Closure
• There are several uses of the attribute closure algorithm:
    ◦ Testing for superkey:
        To test if α is a superkey, we compute α+ and check if α+ contains all attributes in R.
    ◦ Testing functional dependencies:
        We can check if a functional dependency α → β holds (or, in other words, is in F+) by checking if β ⊆ α+.
        That is, we compute α+ by using attribute closure, and then check if it contains β.
        This test is simple and cheap, so it is very useful.
    ◦ It gives us an alternative way to compute F+:
        For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
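
The three uses can be sketched on top of one closure routine. All function names here are illustrative assumptions, attribute names are assumed to be single characters, and the F+ enumeration skips empty left-hand and right-hand sides for brevity. It is exponential in the number of attributes, so it is only practical for tiny schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure covers every attribute of the schema."""
    return closure(attrs, fds) >= set(schema)


def implies(fds, lhs, rhs):
    """lhs -> rhs is in F+ exactly when rhs is contained in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


def nonempty_subsets(attrs):
    """Yield every non-empty subset of attrs as a sorted string."""
    items = sorted(attrs)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield "".join(combo)


def fd_closure(schema, fds):
    """Enumerate F+ as (lhs, rhs) pairs via attribute closures."""
    return {(lhs, rhs)
            for lhs in nonempty_subsets(schema)
            for rhs in nonempty_subsets(closure(lhs, fds))}


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(is_superkey("AG", "ABCGHI", F))  # True
print(is_superkey("A", "ABCGHI", F))   # False
print(implies(F, "AG", "I"))           # True

small = fd_closure("ABC", [("A", "B"), ("B", "C")])
print(("A", "C") in small)             # True
print(len(small))                      # 35
```

Even for F = {A → B, B → C} on three attributes, F+ has 35 members once trivial dependencies are counted, which is why the membership test through α+ is usually preferred over materialising F+.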
Canonical Cover
• Sets of functional dependencies may have redundant dependencies that can be inferred from the others.
    ◦ For example: A → C is redundant in {A → B, B → C, A → C}
• Parts of a functional dependency may be redundant (or extraneous).
    ◦ E.g.: (i) on RHS: {A → B, B → C, A → CD} can be simplified to {A → B, B → C, A → D}
        Forward: (1) A → CD ⇒ A → C and A → D  (2) A → B, B → C ⇒ A → C
        Reverse: (1) A → B, B → C ⇒ A → C  (2) A → C, A → D ⇒ A → CD
    ◦ E.g.: (ii) on LHS: {A → B, B → C, AC → D} can be simplified to {A → B, B → C, A → D}
        Forward: (1) A → B, B → C ⇒ A → C ⇒ A → AC  (2) A → AC, AC → D ⇒ A → D
        Reverse: (1) A → D ⇒ AC → D
• Canonical cover: A canonical cover of F, denoted by Fc, is a “minimal” set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies.
Canonical Cover (Contd..)
For example (RHS):
• {A → B, B → C, A → CD} ⇒ {A → B, B → C, A → D}
    ◦ A → CD ⇒ A → C and A → D
    ◦ A → B, B → C ⇒ A → C
  OR
    ◦ A+ = ABCD
• {A → B, B → C, A → D} ⇒ {A → B, B → C, A → CD}
    ◦ A → B, B → C ⇒ A → C
    ◦ A → C, A → D ⇒ A → CD
  OR
    ◦ A+ = ABCD

Note: Here, we do not need to apply Armstrong’s Axioms. We simply take the LHS attributes, compute their closure, and check whether the RHS is contained in it.
Finding Canonical Cover (Fc) of FDs

• Steps:
    1. Rewrite every FD so that its right-hand side is a single attribute
    2. For each FD, remove extraneous attributes from the LHS, if any exist
    3. Remove redundant FDs
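
The three steps can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: attribute names are single characters, the names `closure` and `minimal_cover` are made up for the example, and the result keeps singleton right-hand sides exactly as the steps describe. Some textbook definitions of a canonical cover additionally merge dependencies that share a left-hand side; that step is not shown.

```python
def closure(attrs, fds):
    """Attribute closure; each FD is (lhs, rhs), both iterables of attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: split every FD so that its right-hand side is one attribute.
    work = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    work = list(dict.fromkeys(work))  # drop duplicates, keep order

    # Step 2: an LHS attribute is extraneous if the remaining LHS
    # attributes still determine the RHS under the current set.
    for i, (lhs, a) in enumerate(work):
        for attr in sorted(lhs):
            smaller = work[i][0] - {attr}
            if smaller and a in closure(smaller, work):
                work[i] = (smaller, a)
    work = list(dict.fromkeys(work))  # step 2 may have created duplicates

    # Step 3: an FD is redundant if the other FDs already imply it.
    for fd in list(work):
        rest = [g for g in work if g != fd]
        if fd[1] in closure(fd[0], rest):
            work = rest
    return work


def show(cover):
    return ["".join(sorted(lhs)) + "->" + rhs for lhs, rhs in cover]


print(show(minimal_cover([("A", "B"), ("B", "C"), ("A", "CD")])))  # ['A->B', 'B->C', 'A->D']
print(show(minimal_cover([("A", "B"), ("B", "C"), ("AC", "D")])))  # ['A->B', 'B->C', 'A->D']
```

Both inputs are the RHS and LHS examples from the Canonical Cover slide, and both reduce to {A → B, B → C, A → D}.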
Lossless-join Decomposition
• We have formally defined lossless-join decomposition earlier.
• Let R be a relation schema, and let R1 and R2 form a decomposition of R. For the case of R = (R1, R2), we require that for all possible relations r on schema R
    r = Π_R1(r) ⋈ Π_R2(r)
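
The notes on the Supplier_Parts examples (common attribute is, or is not, a superkey of one side) suggest a schema-level check: a two-way decomposition is lossless if the common attributes R1 ∩ R2 functionally determine all of R1 or all of R2. The sketch below applies that check. The dependency set `FDS` is an assumption made for illustration (the slides do not list FDs for Supplier_Parts), and the function names are hypothetical. When only functional dependencies are known, a False result means the dependencies do not guarantee a lossless join.

```python
def closure(attrs, fds):
    """Attribute closure; each FD is a (lhs, rhs) pair of sets of names."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if R1 ∩ R2 determines all of R1 or all of R2 under fds."""
    reach = closure(r1 & r2, fds)
    return r1 <= reach or r2 <= reach


# Assumed dependencies for Supplier_Parts (not stated in the slides):
# S# -> Sname, City   and   {S#, P#} -> Qty
FDS = [({"S#"}, {"Sname", "City"}), ({"S#", "P#"}, {"Qty"})]

good = is_lossless({"S#", "Sname", "City"}, {"S#", "P#", "Qty"}, FDS)
bad = is_lossless({"S#", "Sname", "City", "Qty"}, {"P#", "Qty"}, FDS)
print(good, bad)  # True False
```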
Dependency Preservation
• Let F be a set of functional dependencies on a schema R, and let R1, R2, ..., Rn be a decomposition of R.
• Let Fi be the set of all functional dependencies in F+ that include only attributes of Ri.
• A decomposition is dependency preserving if
    (F1 ∪ F2 ∪ ... ∪ Fn)+ = F+
Test for Dependency Preservation
• To check if a dependency α → β is preserved in a decomposition of R into R1, R2, ..., Rn, we apply the following test, with attribute closure done with respect to F (it avoids computing F+):

result = α
repeat
    for each Ri in the decomposition
        t = (result ∩ Ri)+ ∩ Ri
        result = result ∪ t
until (result does not change)

• If result contains all attributes in β, then the functional dependency α → β is preserved.
• We apply the test on all dependencies in F to check if a decomposition is dependency preserving.
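Example: Test for Dependency Preservation

• R (ABCDEF):
• F = {A → BCD, A → EF, BC → AD, BC → E, BC → F, B → F, D → E}
• D = {ABCD, BF, DE} [Decomposition]
• On projections:

ABCD (R1) | BF (R2) | DE (R3)
A → BCD   | B → F   | D → E
BC → AD   |         |

• Need to check for: A → BCD, A → EF, BC → AD, BC → E, BC → F, B → F, D → E
• A → BCD, BC → AD, B → F, and D → E appear directly in the projections, so only A → EF, BC → E, and BC → F need the closure test. Below, (X)+/Fi denotes the closure of X computed with the dependencies projected onto Ri.
• (BC)+/F1 = ABCD. (ABCD)+/F2 = ABCDF. (ABCDF)+/F3 = ABCDEF.
    ◦ Preserves BC → E, BC → F
• (A)+/F1 = ABCD. (ABCD)+/F2 = ABCDF. (ABCDF)+/F3 = ABCDEF.
    ◦ Preserves A → EF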
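
The preservation test can be sketched directly from its pseudocode and run on the example above. Function names are illustrative, and attribute names are assumed to be single characters. The second decomposition, R(A, B, C) into {AB, AC} with F = {A → B, B → C}, is a hypothetical counterexample added here and does not appear in the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(lhs, rhs, decomposition, fds):
    """Test whether lhs -> rhs is preserved, using closures with respect to F."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # t = (result ∩ Ri)+ ∩ Ri
            t = closure(result & set(ri), fds) & set(ri)
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result


def preserves_all(decomposition, fds):
    """A decomposition is dependency preserving if every FD in F passes."""
    return all(is_preserved(l, r, decomposition, fds) for l, r in fds)


F = [("A", "BCD"), ("A", "EF"), ("BC", "AD"), ("BC", "E"),
     ("BC", "F"), ("B", "F"), ("D", "E")]
print(preserves_all(["ABCD", "BF", "DE"], F))  # True

# Hypothetical counterexample: B -> C is lost when R(A, B, C) is split into AB and AC.
G = [("A", "B"), ("B", "C")]
print(is_preserved("B", "C", ["AB", "AC"], G))  # False
```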
What Do We Know Thus Far?
• We know that any acceptable decomposition must be a lossless-join decomposition.
• We know how to test whether a decomposition of a relation into a set of relations is a lossless-join decomposition.
• We know that it is desirable to have a lossless-join decomposition that is also dependency preserving.
End
