
Assignment 2: Write Clearly Your Name, Student Number and Lab Number On The Front Page of Your Assignment

This document provides instructions for Assignment 2 for the course CMPUT 391. It includes details on the due date, percentage of overall grade, penalties for late assignments, and maximum marks. It lists 4 questions to answer, providing the weight and mark allocation for each question. Question 1 asks to prove an inference rule for functional dependencies. Question 2 involves calculating closures of sets of attributes and determining if a set of functional dependencies is minimal. Question 3 asks about keys and decomposing a relation into 2NF and 3NF. Question 4 involves optimizing a multi-table query by comparing costs of different join strategies.

Uploaded by

eder Hunter

Assignment 2

CMPUT 391 (Winter 2003)

Due Date (in class): Wednesday, February 12, 2003.


Percentage overall grade: 3%
Penalties: 20% off per day for late assignments
Maximum Marks: 100

Write your Name, Student Number and Lab Number clearly on the front
page of your assignment.
Deliverables:
Answers to questions 1, 2, 3, and 4, clearly typed on paper.
The TA will take into account the cleanliness of what is handed in. It is your responsibility to
make your assignment readable.

Question 1: (Armstrong Axioms) [10%]

There are inference rules for functional dependencies. Three of those rules are known as
Armstrong's axioms: reflexivity, augmentation, and transitivity. These axioms are sound and
complete. We saw in class two other rules that are inferred from these axioms. These derived rules
are known as the decomposition rule and the union rule.
There is yet another inference rule, called the pseudotransitivity rule, which stipulates that:
if X → Y and WY → Z then WX → Z
Prove this rule using the known axioms.
Solution:
1- X → Y (given)
2- WY → Z (given)
3- WX → WY (augmentation of 1 with W)
4- WX → Z (transitivity on 3 and 2)
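
As a sanity check on the soundness of the rule, the sketch below enumerates every pair of tuples over four binary attributes and confirms that any pair satisfying X → Y and WY → Z also satisfies WX → Z. It assumes that W, X, Y and Z are single attributes taking the values 0 or 1 (a simplification, since the rule is stated for attribute sets), and the helper names are hypothetical. It complements the proof above and does not replace it.

```python
from itertools import product

ATTRS = ("W", "X", "Y", "Z")  # assumed: each letter is one binary attribute


def pair_satisfies(t1, t2, lhs, rhs):
    """True if the pair of tuples does not violate the FD lhs -> rhs."""
    agree_lhs = all(t1[a] == t2[a] for a in lhs)
    agree_rhs = all(t1[a] == t2[a] for a in rhs)
    return (not agree_lhs) or agree_rhs


def pseudotransitivity_holds():
    """Check the rule on every pair of tuples over binary attributes."""
    tuples = [dict(zip(ATTRS, values)) for values in product((0, 1), repeat=4)]
    for t1, t2 in product(tuples, repeat=2):
        premises = (pair_satisfies(t1, t2, ("X",), ("Y",))
                    and pair_satisfies(t1, t2, ("W", "Y"), ("Z",)))
        if premises and not pair_satisfies(t1, t2, ("W", "X"), ("Z",)):
            return False  # a counterexample would land here
    return True


print(pseudotransitivity_holds())  # True
```

Checking pairs of tuples is enough here because a functional dependency can only be violated by two tuples at a time.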

Question 2: (Functional Dependencies) [25%]

1- Consider the following relation schema and set of functional dependencies:

Emp-Dept (SIN, E_Name, B_Date, Address, D_Num, D_Name, D_Manager)
F = { SIN → {E_Name, B_Date, Address, D_Num},
      D_Num → {D_Name, D_Manager}
    }
Calculate the closures {SIN}+ and {D_Num}+ with respect to F.
Solution:
Using the algorithm seen in class:
{SIN}+ = {SIN, E_Name, B_Date, Address, D_Num, D_Name, D_Manager}
{D_Num}+ = {D_Num, D_Name, D_Manager}
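
The two closures can be reproduced with a short fixed-point computation. This is a minimal sketch of the usual attribute-closure algorithm, assumed (not confirmed) to match the one seen in class; the function name `closure` and the list-of-pairs encoding of F are choices made for this illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its left side is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [
    ({"SIN"}, {"E_Name", "B_Date", "Address", "D_Num"}),
    ({"D_Num"}, {"D_Name", "D_Manager"}),
]

print(sorted(closure({"SIN"}, F)))    # all seven attributes of Emp-Dept
print(sorted(closure({"D_Num"}, F)))  # ['D_Manager', 'D_Name', 'D_Num']
```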

2- Is the set of functional dependencies F minimal? If not, try to find a minimal set of functional
dependencies that is equivalent to F (minimal cover). Prove the equivalence.
Solution:
No, the set of functional dependencies F is not minimal, since the right-hand sides of its
dependencies contain more than one attribute.
A minimal cover G of F is:
SIN → E_Name
SIN → B_Date
SIN → Address
SIN → D_Num
D_Num → D_Name
D_Num → D_Manager

To prove that two sets of functional dependencies F and E are equivalent, we show either that
F+ = E+ or that E covers F and F covers E.
To show that F covers E, we calculate X+ with respect to F for every FD X → Y in E and check
whether X+ includes the attributes in Y.
Rather than calculating G+ and F+, we show that G and F cover each other.
F covers G:
{SIN}+ = {SIN, E_Name, B_Date, Address, D_Num, D_Name, D_Manager} and {D_Num}+ =
{D_Num, D_Name, D_Manager} with respect to F (see 2.1). The right-hand side of every FD in G
is included in the closure of its left-hand side.
G covers F:
{SIN}+ = {SIN, E_Name, B_Date, Address, D_Num, D_Name, D_Manager} and {D_Num}+ =
{D_Num, D_Name, D_Manager} with respect to G, and the right-hand side of every FD in F is
included in the closure of its left-hand side.
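
The mutual-coverage argument can be checked mechanically. The sketch below is illustrative only: `closure` is the standard fixed-point closure, and `covers` is a hypothetical helper that tests whether every dependency of one set follows from the other.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD X -> Y in g satisfies Y within X+ computed under f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


F = [
    ({"SIN"}, {"E_Name", "B_Date", "Address", "D_Num"}),
    ({"D_Num"}, {"D_Name", "D_Manager"}),
]
G = [
    ({"SIN"}, {"E_Name"}),
    ({"SIN"}, {"B_Date"}),
    ({"SIN"}, {"Address"}),
    ({"SIN"}, {"D_Num"}),
    ({"D_Num"}, {"D_Name"}),
    ({"D_Num"}, {"D_Manager"}),
]

print(covers(F, G) and covers(G, F))  # True: F and G are equivalent
```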

3- What update anomalies can happen to Emp-Dept? Give examples.


Solution:
Insert anomaly: A new department and its manager cannot be added until at least one employee
is assigned to the department.
Delete anomaly: Removing the only employee in a department would also remove the department
and its manager.
Update anomaly: Changing the manager of a populous department would require updating many
tuples.

Question 3: (Functional Dependencies) [25%]

Consider the relation R = {A, B, C, D, E, F, G, H, I, J} and the set of functional dependencies


F = { {A, B} → C,
      A → {D, E},
      B → F,
      F → {G, H},
      D → {I, J}
    }
1- What is the key for R? Demonstrate it using the inference rules.
Decompose R into 2NF, then 3NF relations.
Solution:
A → DE (given) => A → D and A → E (decomposition)
Since A → D and D → IJ (given) => A → IJ (transitivity)
Using the union rule, A → ADEIJ; thus AB → ABDEIJ (augmentation)
Also AB → C (given) => AB → ABCDEIJ (union)
Since B → F (given) and F → GH (given), B → GH (transitivity)
Thus AB → AGH holds (augmentation). Also AB → AF holds from B → F (given) by augmentation.
Finally, using the union rule, AB → ABCDEFGHIJ.
So AB is a key. It is minimal, since A+ = {A, D, E, I, J} and B+ = {B, F, G, H} do not cover R.
This can also be determined by calculating AB+ with respect to the set F.
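
The closure-based check mentioned above can be sketched as follows. Attributes are encoded as single characters so that a string such as "AB" stands for the set {A, B}; this encoding and the function name are assumptions made for the illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


R = set("ABCDEFGHIJ")
FDS = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH"), ("D", "IJ")]

print(closure("AB", FDS) == R)   # True: AB determines every attribute
print(sorted(closure("A", FDS)))  # ['A', 'D', 'E', 'I', 'J']
print(sorted(closure("B", FDS)))  # ['B', 'F', 'G', 'H']
```

Because neither A+ nor B+ equals R, no proper subset of AB is a key.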
2NF                     3NF
R1 (A, B, C)            R1 (A, B, C)
R2 (A, D, E, I, J)      R2.1 (A, D, E)      R2.2 (D, I, J)
R3 (B, F, G, H)         R3.1 (B, F)         R3.2 (F, G, H)

2- What is the key for R if
F = { {A, B} → C,
      {B, D} → {E, F},
      {A, D} → {G, H},
      A → I,
      H → J
    }?
Demonstrate it using inference rules. Decompose R into 2NF, then 3NF in this case.
Solution:
AD → GH (given) => ABD → ABDGH (augmentation and reflexivity)
Since A → I (given), ABD → ABDI (augmentation and reflexivity)
AB → C (given) => ABD → ABCD
BD → EF (given) => ABD → ABDEF
AD → GH (given) => AD → H. Since H → J, AD → J. Thus, ABD → ABDJ.
Finally, using the union rule, ABD → ABCDEFGHIJ.
So ABD is a key. This can also be determined by calculating ABD+ with respect to the set F.
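
A brute-force search gives an independent check of this result. The sketch below enumerates attribute subsets by increasing size and keeps those whose closure is R and that contain no smaller key; it is an exhaustive illustration suited to a 10-attribute schema, not an efficient algorithm, and the names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    universe = set(attrs)
    keys = []
    for size in range(1, len(universe) + 1):
        for combo in combinations(sorted(universe), size):
            subset = set(combo)
            if any(key <= subset for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(subset, fds) == universe:
                keys.append(subset)
    return keys


FDS = [("AB", "C"), ("BD", "EF"), ("AD", "GH"), ("A", "I"), ("H", "J")]
print([sorted(key) for key in candidate_keys("ABCDEFGHIJ", FDS)])  # [['A', 'B', 'D']]
```

The search returns a single key because A, B and D never appear on the right-hand side of any dependency, so every key must contain all three.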

2NF                     3NF
R1 (A, B, C)            R1 (A, B, C)
R2 (B, D, E, F)         R2 (B, D, E, F)
R3 (A, D, G, H, J)      R3.1 (A, D, G, H)   R3.2 (H, J)
R4 (A, I)               R4 (A, I)

Question 4: (Query Optimization) [40%]

Consider the following query Q:

SELECT E-Name, Salary
FROM Employee, Works, Project
WHERE P-Type="design" AND P-Num=PNO AND ESIN=SIN AND B-Date > "1961-01-29"

On the following tables:

Employee(SIN, E-Name, B-Date, Address, Sex, Salary, Supervisor) with 10,000 tuples;
Works(ESIN, PNO) with 20,000 tuples;
Project(P-Name, P-Type, P-Num, Location, D-Num) with 500 tuples.

Knowing that one page can accommodate 100 tuples of Employee, 400 tuples of Works, or 120
tuples of Project, and assuming that we have 6 buffers in main memory, calculate the cost of
evaluating Q if we choose Block Nested Loop joins for both joins, Sort-Merge joins for both joins, or
Block Nested Loop for the first join and Sort-Merge for the second join. The first join is between
Project and Works, while the second joins the result with Employee. Assume that the same number
of tuples of the result of the first join fit per page as Project tuples (120). Which plan
would be the best? Assume that there are 5 types of projects and that ¾ of the employees were born after
January 29, 1961. All distributions are uniform. Push selections as early as possible in all cases.
Draw your query plans.
Solution:
Employee has 10,000 tuples at 100 per page: 10000/100 = 100 pages.
Works has 20,000 tuples at 400 per page: 20000/400 = 50 pages.
Project has 500 tuples at 120 per page: 500/120 ≈ 4.17, rounded up to 5 pages.
Since there are 5 types of projects, the selection on projects with type = design will generate
500/5 = 100 tuples, fitting in one page.
Since ¾ of the employees were born after January 29, 1961, the selection with the birth date
constraint will generate 7500 tuples, fitting in 75 pages.
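
The page counts and selection sizes above follow from ceiling division and the uniformity assumption. A small sketch, with a hypothetical helper `pages`, reproduces them:

```python
import math


def pages(tuples, tuples_per_page):
    """Number of pages needed; a partially filled page still counts as one."""
    return math.ceil(tuples / tuples_per_page)


employee_pages = pages(10_000, 100)
works_pages = pages(20_000, 400)
project_pages = pages(500, 120)

# Uniform distributions: 1 project type out of 5, and 3/4 of the employees.
design_tuples = 500 // 5
design_pages = pages(design_tuples, 120)
selected_employees = 10_000 * 3 // 4
selected_employee_pages = pages(selected_employees, 100)

print(employee_pages, works_pages, project_pages)   # 100 50 5
print(design_tuples, design_pages)                  # 100 1
print(selected_employees, selected_employee_pages)  # 7500 75
```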
[Query plan 1: Project is filtered by σ P-Type=design and pipelined into a BNL join with Works on PNO=P-Num; the result is written as T1. Employee is filtered by σ B-Date>1961-01-29 and written as T2. A BNL join of T1 and T2 feeds π E-Name, Salary, computed on the fly.]
Since the result of the selection on Project is smaller than the Works relation, Works is better
used as the outer relation.
The first select costs 5 I/Os.
Since its result is the size of one buffer, it can reside in main memory to do the join. Thus, the
cost of the first join is the cost of scanning Works: 50 I/Os.
The result of the first join is estimated at 20000/500 * 100 = 4000 tuples. This assumes a uniform
distribution (i.e., the number of employees assigned per project is uniform).
Since the distributions are assumed uniform, we have 20,000 Works tuples and 500 projects, that
is, 40 employees per project. Since we have 100 projects with type "design", that gives us 4000
tuples.
At 120 tuples per page, the result is about 34 pages (exactly 33 and a third). Thus writing T1 costs
34 I/Os. The cost of the second select is 100 I/Os and writing T2 costs 75 I/Os. The cost of the
second join is 34 + ⌈34/4⌉ * 75 = 34 + 9 * 75 = 709 I/Os.
Thus the total cost for this plan is 5+50+34+100+75+709 = 973 I/Os.
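
The arithmetic for this plan can be reproduced with the usual block nested loop cost formula. The sketch assumes that, out of the 6 buffers, one is reserved for the inner relation and one for output, leaving 4 pages per outer block, which is consistent with the 34/4 term above; the function name is hypothetical.

```python
import math


def bnl_cost(outer_pages, inner_pages, buffers):
    """Read the outer once, and scan the inner once per block of the outer."""
    block = buffers - 2  # assumed: one buffer for the inner, one for output
    return outer_pages + math.ceil(outer_pages / block) * inner_pages


join1_tuples = 20_000 // 500 * 100        # 40 Works tuples per project, 100 projects
t1_pages = math.ceil(join1_tuples / 120)  # result written as T1
second_join = bnl_cost(t1_pages, 75, 6)

# select Project + scan Works + write T1 + scan Employee + write T2 + second join
total = 5 + 50 + t1_pages + 100 + 75 + second_join

print(join1_tuples, t1_pages, second_join, total)  # 4000 34 709 973
```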

[Query plan 2: Works is sorted on PNO and written as T1. Project is filtered by σ P-Type=design and pipelined into a sort-merge join (SMJ) with T1 on PNO=P-Num; the result is written as T2 and sorted. Employee is filtered by σ B-Date>1961-01-29, then sorted and written as T3. An SMJ of T2 and T3 feeds π E-Name, Salary, computed on the fly.]
The first select costs 5 I/Os.
The result can fit in one buffer and can be sorted in main memory.
Sorting Works on PNO costs 2 * ⌈log5(50)⌉ * 50 = 2*3*50 = 300 I/Os. The SM join would cost
50 I/Os since the outer relation fits in memory.
Writing T2 costs 34 I/Os (see above) and sorting T2 on ESIN costs 2*3*34 = 204 I/Os.
Selecting Employees costs 100 I/Os for scanning and 75 I/Os to write T3. Sorting T3 on SIN costs
2*3*75 = 450 I/Os. The final join costs 75+34 = 109 I/Os.
Thus the total cost for this plan is
5+300+50+34+204+100+75+450+109 = 1327 I/Os.
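
The sorting costs in this plan follow the simplified estimate used in this solution: 2 * ⌈log5(N)⌉ * N I/Os for N pages with 6 buffers (a merge fan-in of 5). The sketch below reproduces that estimate with integer arithmetic; textbook formulas that count the initial run-generation pass separately can be written differently, so treat the function names and the formula as assumptions taken from the text.

```python
def merge_passes(pages, buffers):
    """Smallest p such that (buffers - 1) ** p >= pages."""
    fan_in = buffers - 1
    passes, capacity = 0, 1
    while capacity < pages:
        capacity *= fan_in
        passes += 1
    return passes


def sort_cost(pages, buffers):
    """Each pass reads and writes every page once."""
    return 2 * merge_passes(pages, buffers) * pages


sort_works = sort_cost(50, 6)
sort_t2 = sort_cost(34, 6)
sort_t3 = sort_cost(75, 6)
final_join = 75 + 34  # one scan of each sorted input

total = 5 + sort_works + 50 + 34 + sort_t2 + 100 + 75 + sort_t3 + final_join

print(sort_works, sort_t2, sort_t3, total)  # 300 204 450 1327
```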

[Query plan 3: Project is filtered by σ P-Type=design and pipelined into a BNL join with Works on PNO=P-Num; the result is written as T1 and sorted. Employee is filtered by σ B-Date>1961-01-29, then sorted and written as T2. An SMJ of T1 and T2 feeds π E-Name, Salary, computed on the fly.]
The first select costs 5 I/Os.
Since its result is the size of one buffer, it can reside in main memory to do the join. Thus, the
cost of the first join is the cost of scanning Works: 50 I/Os.
Writing T1 costs 34 I/Os (see above) and sorting T1 on ESIN costs 2*3*34 = 204 I/Os.
Selecting Employees costs 100 I/Os for scanning and 75 I/Os to write T2. Sorting T2 on SIN costs
2*3*75 = 450 I/Os. The final join costs 75+34 = 109 I/Os.
Thus the total cost for this plan is
5+50+34+204+100+75+450+109 = 1027 I/Os.

The best plan among these three is to use Block Nested Loop joins for both joins (973 I/Os, compared with 1327 and 1027 I/Os for the other two plans).
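
For reference, the three totals can be recomputed from the cost components listed in the solution and compared directly. The plan labels are hypothetical names chosen for this sketch.

```python
# Cost components (in I/Os) as listed in the solution for each plan.
plans = {
    "BNL then BNL": [5, 50, 34, 100, 75, 709],
    "SMJ then SMJ": [5, 300, 50, 34, 204, 100, 75, 450, 109],
    "BNL then SMJ": [5, 50, 34, 204, 100, 75, 450, 109],
}

totals = {name: sum(costs) for name, costs in plans.items()}
best = min(totals, key=totals.get)

print(totals)  # {'BNL then BNL': 973, 'SMJ then SMJ': 1327, 'BNL then SMJ': 1027}
print(best)    # BNL then BNL
```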
