Lecture16

The document discusses schema refinement in database design, focusing on the issues of redundancy and anomalies such as update, insertion, and deletion anomalies. It emphasizes the importance of decomposing relations to eliminate redundancy and maintain data integrity through functional dependencies. Additionally, it covers the concept of attribute closure and the algorithms used to determine it in the context of functional dependencies.

Uploaded by

theheatman675
COMP 421: Files and Databases

Lecture 16: Schema Refinement

1
Back to Database Design
• Do we just translate the ER diagram?

In HW1, we saw that we don’t really need a separate table for the relationship PLACES.

2
Back to Database Design
• Is it a good idea to store the same information multiple times?

An instance of the ORDERS table


3
Back to Database Design
• Decompose

An instance of the ORDERS table
4
Anomalies due to Redundant Storage
• Update anomaly: If one copy of repeated data is updated, an
inconsistency is created unless all other copies are updated in the same way
• Rating determines the hourly wage

ssn          name       lot  rating  hourly_wages  hours_worked
123-22-3666  Attishoo   48   8       10            40
231-31-5368  Smiley     22   8       10            30
131-24-3650  Smethurst  35   5       7             30
434-26-3751  Guldu      35   5       7             32
612-67-4134  Madayan    35   8       10            40
An instance of the Hourly_Emps table

Suppose we update hourly_wages for Attishoo. What happens to Smiley and Madayan, who have the same rating?
5
Anomalies due to Redundant Storage
• Insertion anomaly: It may not be possible to store certain information
unless some other, unrelated, information is stored
• Rating determines the hourly wage

ssn          name       lot  rating  hourly_wages  hours_worked
123-22-3666  Attishoo   48   8       10            40
231-31-5368  Smiley     22   8       10            30
131-24-3650  Smethurst  35   5       7             30
434-26-3751  Guldu      35   5       7             32
612-67-4134  Madayan    35   8       10            40
An instance of the Hourly_Emps table

Suppose we want to insert a new employee whose rating is 6. What should the hourly_wages value be?
6
Anomalies due to Redundant Storage
• Delete anomaly: It may not be possible to delete certain information
without losing some other, unrelated information
• Rating determines the hourly wage

ssn          name       lot  rating  hourly_wages  hours_worked
123-22-3666  Attishoo   48   8       10            40
231-31-5368  Smiley     22   8       10            30
131-24-3650  Smethurst  35   5       7             30
434-26-3751  Guldu      35   5       7             32
612-67-4134  Madayan    35   8       10            40
An instance of the Hourly_Emps table

Suppose we delete Smethurst’s and Guldu’s information from the table. How do we keep the
association between rating 5 and its hourly_wages value?
7
Decompositions
• Replace the relation with a collection of smaller relations
ssn          name       lot  rating  hourly_wages  hours_worked
123-22-3666  Attishoo   48   8       10            40
231-31-5368  Smiley     22   8       10            30
131-24-3650  Smethurst  35   5       7             30
434-26-3751  Guldu      35   5       7             32
612-67-4134  Madayan    35   8       10            40

is replaced by two smaller relations:

ssn          name       lot  rating  hours_worked
123-22-3666  Attishoo   48   8       40
231-31-5368  Smiley     22   8       30
131-24-3650  Smethurst  35   5       30
434-26-3751  Guldu      35   5       32
612-67-4134  Madayan    35   8       40

rating  hourly_wages
8       10
5       7
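
The decomposition above can be sketched in a few lines of Python. This is only an illustration under assumptions: the variable names `hourly_emps`, `emps`, `wages`, and `rejoined` are made up for this sketch and do not come from the lecture. It projects the original rows onto the two smaller relations and then joins them back on rating to show that no information is lost.

```python
# Rows of Hourly_Emps: (ssn, name, lot, rating, hourly_wages, hours_worked)
hourly_emps = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]

# First smaller relation: every column except hourly_wages.
emps = [
    (ssn, name, lot, rating, hours)
    for ssn, name, lot, rating, _wage, hours in hourly_emps
]

# Second smaller relation: one entry per rating (the dict removes duplicates).
wages = {rating: wage for _, _, _, rating, wage, _ in hourly_emps}

# Joining the two relations on rating rebuilds the original rows.
rejoined = [
    (ssn, name, lot, rating, wages[rating], hours)
    for ssn, name, lot, rating, hours in emps
]

print(wages)                    # {8: 10, 5: 7}
print(rejoined == hourly_emps)  # True
```

With this layout the wage for a rating is stored once, so changing it means updating a single entry of `wages` rather than every employee row with that rating.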
8
Eliminate Redundancy
• Redundancy is the root of several problems associated with the schema
• Redundant storage
• Insert/delete/update anomalies
• Functional dependencies (FDs) can be used to identify schemas with
such problems
• rating → hourly_wage
• cid → shippingAddress
• Main refinement technique: decomposition
• Another table with rating, hourly_wage
• Another table with cid, ShippingAddress

9
Functional Dependency
• A functional dependency is a kind of Integrity Constraint (IC)
• X → 𝑌 holds over relation R if the following holds for every pair of
tuples 𝑡1 and 𝑡2
• If 𝑡1 . 𝑋 = 𝑡2. 𝑋 then 𝑡1 . 𝑌 = 𝑡2 . 𝑌

A   B   C   D
a1  b1  c1  d1
a1  b1  c1  d1
a1  b2  c2  d1
a2  b1  c3  d2
a1  b2  c2  d1

Do these FDs hold over the given relation?
• A → D
• AB → C
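
The pairwise definition can be checked mechanically on an instance. The sketch below is a minimal illustration; the function name `fd_holds` and the dictionary-based row representation are assumptions of this example, not part of the lecture. Note that an instance can only show that an FD is violated; an FD that holds on one instance is not thereby guaranteed to hold for the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds on this instance.

    rows: list of dicts mapping attribute name -> value
    lhs, rhs: iterables of attribute names (a string like "AB" works)
    """
    seen = {}  # maps each LHS value combination to the RHS values first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(x, y) != y:
            return False
    return True


# The instance from the slide, one dict per tuple.
rows = [dict(zip("ABCD", values)) for values in [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b1", "c1", "d1"),
    ("a1", "b2", "c2", "d1"),
    ("a2", "b1", "c3", "d2"),
    ("a1", "b2", "c2", "d1"),
]]

print(fd_holds(rows, "A", "D"))   # True
print(fd_holds(rows, "AB", "C"))  # True
print(fd_holds(rows, "B", "C"))   # False: b1 appears with both c1 and c3
```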

10
Primary Key
• A special case of an FD, because it uniquely determines the values of all
other attributes
• Since no two distinct tuples can have the same primary key value, we never
encounter a situation where t1.X = t2.X for distinct tuples, so there is no chance of violating the
FD condition: If t1.X = t2.X then t1.Y = t2.Y
What are the FDs in this relation?

P   A   B   C   D
p1  a1  b1  c1  d1
p2  a1  b1  c1  d1
p3  a1  b2  c2  d1
p4  a2  b1  c3  d2
p5  a1  b2  c2  d1

• P → A, P → B, P → C, P → D
• P → AB, P → ABC, P → ABCD
• A → D
• AB → C
11
Primary Key
• An FD X → Y does not require X to be minimal
• If X determines all attributes and is minimal, then it is a key (candidate key)
• If X determines all attributes but is not minimal, then it is a superkey

P   A   B   C   D
p1  a1  b1  c1  d1
p2  a1  b1  c1  d1
p3  a1  b2  c2  d1
p4  a2  b1  c3  d2
p5  a1  b2  c2  d2

• P → A, P → B, P → C, P → D
• P → AB, P → ABC, P → ABCD
• PA → ABCD
• PB → ABCD

PA and PB are superkeys; P is a primary key


12
Closure of a set of FDs
• Set of FDs, S: A → D, AB → C, P → ABCD
• The closure of S is the set of all FDs implied by S
• Apply the three rules of Armstrong’s Axioms repeatedly to infer all FDs
• Reflexivity: if X ⊇ 𝑌, then X → 𝑌
• Augmentation: If X → 𝑌, then XZ → 𝑌𝑍 for any 𝑍
• Transitivity: If X → 𝑌 and Y → 𝑍, then X → 𝑍
• Additional rules:
• Union: If X → 𝑌 and X → 𝑍, then X → 𝑌𝑍
• Decomposition: If X → YZ, then X → Y and X → Z

Not valid: from XZ → Y you cannot conclude X → Y and Z → Y

You can add attributes to the LHS, but you cannot remove anything from the LHS
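
The two additional rules are not new axioms; they follow from reflexivity, augmentation, and transitivity. The derivation below is one standard way to see this, written out as a sketch and using the convention that a set of attributes is written as a string, so XX = X.

```latex
\begin{align*}
\textbf{Union.}\quad
 & X \to Y            && \text{given} \\
 & X \to XY           && \text{augment with } X \text{ (since } XX = X\text{)} \\
 & X \to Z            && \text{given} \\
 & XY \to YZ          && \text{augment with } Y \\
 & X \to YZ           && \text{transitivity on } X \to XY \text{ and } XY \to YZ \\[6pt]
\textbf{Decomposition.}\quad
 & X \to YZ           && \text{given} \\
 & YZ \to Y           && \text{reflexivity, since } YZ \supseteq Y \\
 & X \to Y            && \text{transitivity} \\
 & YZ \to Z           && \text{reflexivity, since } YZ \supseteq Z \\
 & X \to Z            && \text{transitivity}
\end{align*}
```

No similar derivation exists for removing attributes from the left-hand side, which is why XZ → Y does not give X → Y.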
13
Closure of a set of FDs
• Suppose we have a relation named Contracts
• Contracts(contractid, supplierid, taskid, deptid, partid, qty, value)
• Schema for Contracts: CSTDPQV
• The following ICs hold over this schema:
• The contractid C is a key: 𝐶 → 𝐶𝑆𝑇𝐷𝑃𝑄𝑉
• A task purchases a given part using a single contract: 𝑃𝑇 → 𝐶
• A department purchases at most one part from a supplier: 𝑆𝐷 → 𝑃

From PT → C and C → CSTDPQV, using transitivity we get PT → CSTDPQV

From SD → P, using augmentation (adding T to both sides) we get STD → PT

From STD → PT and PT → CSTDPQV, using transitivity we get STD → CSTDPQV
You cannot conclude SD → CSDPQV by removing T from both sides
15
Attribute Closure
• Given a set of functional dependencies (FDs) in a relation, the total set
of FDs that can be inferred is called the closure of the FDs.
• Computing the closure can be very expensive
• Typically we just want to check if a given FD X→Y is in the closure
of a set of FDs
• Suppose we have the following FDs:
• 𝐹 = {𝐴 → 𝐵, 𝐵 → 𝐶, 𝐶 → 𝐸}
• Is A → E in the closure of F?
• If A’s closure contains E, then A → E holds

16
Attribute Closure
Algorithm to find the closure of a set of attributes X

1. Start with Closure = X, the initial set of attributes.
2. For each functional dependency U → V in F
• If the left-hand side U is a subset of the current closure, add the right-hand side V to
the closure.
3. Repeat step 2 until no more attributes can be added to the closure.

F = {A → B, B → C, C → E}
Is A → E in the closure of F?
To answer this, we find A’s closure
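
The steps above can be sketched as a short Python function. This is a minimal sketch, not a reference implementation: the name `attribute_closure` and the representation of each FD as a `(lhs, rhs)` pair of attribute strings are assumptions made for this example.

```python
def attribute_closure(attrs, fds):
    """Compute the closure of the attribute set attrs under the FDs in fds.

    attrs: iterable of attribute names, e.g. "A" or {"A", "C"}
    fds:   list of (lhs, rhs) pairs, e.g. ("BC", "E") for BC -> E
    """
    closure = set(attrs)          # step 1: start with the given attributes
    changed = True
    while changed:                # step 3: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:      # step 2: scan every FD
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


F = [("A", "B"), ("B", "C"), ("C", "E")]
closure_a = attribute_closure("A", F)

print(sorted(closure_a))  # ['A', 'B', 'C', 'E']
print("E" in closure_a)   # True, so A -> E is in the closure of F
```

The outer loop is needed because an FD skipped in one pass can become applicable after a later FD adds attributes to the closure.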

17
Attribute Closure
Worked example: finding A’s closure

F = {A → B, B → C, C → E}
Is A → E in the closure of F?

• Start: Closure = {A}
• The LHS of A → B (= A) is a subset of the current closure, so we add the RHS (= B)
to the closure: Closure = {A, B}
• The LHS of B → C (= B) is a subset of the current closure, so we add the RHS (= C)
to the closure: Closure = {A, B, C}
• The LHS of C → E (= C) is a subset of the current closure, so we add the RHS (= E)
to the closure: Closure = {A, B, C, E}

Since A’s closure contains E, we can say A → E


22
Exercise
• Consider a relation R with 5 attributes ABCDE. You are given the
following dependencies:
• 𝐴→𝐵
• 𝐵𝐶 → 𝐸
• 𝐸𝐷 → 𝐴

1. List all keys for R (try all combinations: A, B, AB, AC, …). This is tedious by hand.
2. Let’s see if ACD is a key.
• Find the closure of ACD and check whether all 5 attributes are in the closure
3. Let’s see if ADE is a key.
4. Let’s see if BCD is a key.
5. Let’s see if CDE is a key.

23
Exercise (continued)
• Relation R with attributes ABCDE and dependencies A → B, BC → E, ED → A

2. Let’s see if ACD is a key.
• Find the closure of ACD and check whether all 5 attributes are in the closure
• Start: closure = {A, C, D}
• Using A → B: LHS = A is in the current closure, so add RHS = B to the closure,
closure = {A, B, C, D}
• Using BC → E: LHS = BC is in the current closure, so add RHS = E to the closure,
closure = {A, B, C, D, E}
• All 5 attributes are in the closure, so ACD is a superkey. No proper subset works
(closure of AC = {A, B, C, E}, closure of AD = {A, B, D}, closure of CD = {C, D}), so ACD is a key.
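
The "try all combinations" approach from step 1 is tedious by hand but easy to automate for a small schema. The sketch below is one possible brute-force version under stated assumptions: the names `attribute_closure` and `candidate_keys` and the `(lhs, rhs)` string-pair encoding of FDs are hypothetical choices for this example. It tries attribute sets in order of increasing size and skips any set that already contains a key, so only minimal sets are reported.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def candidate_keys(all_attrs, fds):
    """Brute-force search for all keys (minimal superkeys) of a relation."""
    everything = set(all_attrs)
    keys = []
    # Smaller sets are tried first, so any set that contains an
    # already-found key is a non-minimal superkey and can be skipped.
    for size in range(1, len(everything) + 1):
        for combo in combinations(sorted(everything), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if attribute_closure(candidate, fds) == everything:
                keys.append(candidate)
    return keys


FDS = [("A", "B"), ("BC", "E"), ("ED", "A")]
keys = candidate_keys("ABCDE", FDS)

print(["".join(sorted(k)) for k in keys])          # ['ACD', 'BCD', 'CDE']
print(sorted(attribute_closure("ADE", FDS)))       # ['A', 'B', 'D', 'E']
```

The search examines up to 2^n attribute sets, so it is only practical for small schemas such as this one.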

24
