2019 Spring Final Sol
Question Booklet
100 points (20% of course grade) + 20 points extra credit
Friday, May 3, 2019
●This exam booklet is printed single-sided. You may use the back of the pages as scratch space or
extra space (add a pointer if you are writing your answer there).
● The exam is open-book and open-notes; any written materials may be used. No electronic materials
can be used.
● You have 180 minutes to complete the exam.
● The problems do not necessarily come in increasing order of difficulty. You might want to look
through the entire exam before getting started, in order to plan your strategy.
● There is no penalty for guessing answers to questions. However, for short-answer questions,
simplicity and clarity of solutions will count. You may get as few as 0 points for a problem if your
solution is far more complicated than necessary.
● No explanations are needed unless asked for explicitly. But you are welcome to write 1-2 sentences if
you think the question does not specify all assumptions, to clarify your answer for possible partial/full
credit.
NAME (please print):
In accordance with both the letter and spirit of the Duke Community Standard, I have neither given nor
received assistance on this exam.
SIGNATURE:
Page 1 of 19
Problem 0 (1 point)
We fell a little short of the number of course evaluations needed for the promised free points, but as a thank
you to all of you who did complete the course evaluation, you will all receive one free point!
Problem 1: For each statement below, indicate whether it is “T” (true) or “F” (false).
No explanations needed. But if you are unsure or think either answer may be correct depending on the
context, you can give a 1 line explanation.
Problem 1a ( 2 points)
Given a table R(A, B, C) that has a B+ tree index on R(A, B), then there is no need to create a B+ tree index
on R(B).
ANS: F.
For answering queries like select * from R where B > X, index on R(A, B) does not help.
Problem 1b ( 3 points)
Given three tables R(A, B), S(B, C), and T(C, D), consider Selinger’s dynamic programming algorithm that
optimizes the natural join of the three tables. Suppose there are two sub query plans P1 and P2 for joining R
and S: P1 sorts the join result by B, while P2 sorts the join result by C, and P2 costs more than P1. We should
always discard P2 for better performance.
ANS: T. In Selinger’s algorithm we only consider optimal subplans so we should discard P2.
NOTE: We also gave points for F with the explanation that we cannot discard P2 from further consideration
because P1 and P2 produce relevant yet different interesting orders.
Problem 1c ( 4 points)
(2 points) If a DBMS uses steal and force policy, it never has to redo changes of a committed transaction.
Ans: T
(2 points) If a DBMS uses steal and force policy, it never has to undo changes of an uncommitted transaction.
Ans: F
Problem 1d ( 10 points)
Consider the following schedule involving three transactions T1, T2, T3.
False:
T1 releases lock on A after w1(A), because T2 needs it; and T3 acquires lock on B before r3(B), then T1 has
to acquire lock on B after w3(B), which is not allowed in 2PL.
If T1 retains the lock on A until it is done with w1(B), w2(A) cannot execute.
Problem 1e (2 points)
Given relations R(A, B), S(B, C), the following relational algebra expression is valid: πB(σA=5(R)) − σB=5(S).
Answer: False. The left operand has schema (B) while the right operand has schema (B, C), so the set
difference is not defined.
Problem 1f (6 points)
(3 points) Given relations R and S (without duplicates), R∩S = R − ((R − S ) ⋃ (S − R)) , where ⋃ , ∩ ,
and − denote set union, set intersection, and set difference
True
(3 points) Given relations R and S (with possible duplicates, bag semantics), if R has m copies of 1 and S
has n copies of 1, then for all values of m and n such that m >= n,
R ⋒ S = R − ((R − S) ⋓ (S − R)), where ⋓, ⋒, and − denote bag union, bag intersection, and bag
difference.
Answer: True. (R − S has m − n copies of 1 and S − R has none, so the RHS removes m − n copies from R,
leaving n = min(m, n) copies.)
(3 points) In the above question, for all values of m and n such that m < n,
R ⋒ S = R − ((R − S) ⋓ (S − R)), where ⋓, ⋒, and − denote bag union, bag intersection, and bag
difference.
Answer: False.
e.g., assume R is {1}, S is {1, 1}: LHS = {1}, but RHS = {1} − {1} = emptyset.
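The bag identities above can be checked mechanically; here is a small Python sketch (not part of the exam) that models bags as `collections.Counter` objects, where `+` plays the role of bag union, `-` bag difference (clamped at zero), and `&` bag intersection:

```python
from collections import Counter

# Bags (multisets) as Counters: '+' is bag union (adds multiplicities),
# '-' is bag difference (clamped at zero), '&' is bag intersection (min).

def rhs(R, S):
    # R - ((R - S) union (S - R)), the right-hand side of the identity
    return R - ((R - S) + (S - R))

# Case m >= n: R has two copies of 1, S has one.
R, S = Counter({1: 2}), Counter({1: 1})
assert rhs(R, S) == (R & S) == Counter({1: 1})   # identity holds

# Case m < n (the counterexample above): R = {1}, S = {1, 1}.
R, S = Counter({1: 1}), Counter({1: 2})
assert (R & S) == Counter({1: 1})    # LHS
assert rhs(R, S) == Counter()        # RHS is empty: identity fails
```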
Problem 1g (6 points)
Consider the following two XPath queries:
(3 points) Every element returned by the first query will be returned by the second.
Answer: False
The first query will return this element; the second query will not.
(3 points) Every element returned by the second query will be returned by the first.
Answer: True
The C and D elements that make an A element satisfy the condition in the second query will also make the
same A element satisfy the condition in the first query.
Problem 2: SQL etc. ( points 10)
Consider the schema of a database for course registration:
• Course: C(cid, name, year, capacity) (capacity denotes the maximum possible number of students). Note:
we clarified that cid is not the key here.
The following SQL query outputs the yearly enrollment rate (as e_rate) of the courses across different years.
SOLUTION:
(i) (3 points) How will you modify the given SQL query, if for each output tuple (c, y, r), course id c
should have enrollment rate r >= 0.9 in year y? (No need to rewrite the query, just state the changes).
SELECT C.cid, C.year, COUNT(S.sid) / C.capacity as e_rate
FROM Course C, Student S, Registration R
WHERE C.cid = R.cid AND R.sid = S.sid
GROUP BY C.cid, C.year
HAVING (COUNT(S.sid) / C.capacity) >= 0.9
or use the given query as a subquery and select on e_rate
(ii) (4 points) Using the given SQL query, output course ids c whose enrollment rate did not reach 0.9 in any
year. (You can assume that the given query result has been materialized as a view TEMP and use TEMP in
your new query.)
SELECT TEMP.cid FROM TEMP GROUP BY TEMP.cid HAVING
MAX(TEMP.e_rate) < 0.9
or
(iii) (3 points) Draw a query plan tree for the given SQL query that IS NOT CONSIDERED by Selinger’s
dynamic programming algorithm for the joins of C, S, and R.
NOTE: We did not deduct points if someone did not include the top-most aggregate step.
ta_id and student_id denote the ids of the TA and the student. For each homework, we have multiple
problems. Each answer may include the solutions to multiple problems (as in Gradescope). For each problem
in each homework submitted by a student, there is a grade, and if it is late then overdue = 1.
We denote this schema by DTAHPSGO by taking the first letter from each attribute. Suppose there are
following functional dependencies.
(i) (3 points): For the following example table, are there any rows violating the stated functional
dependencies? If so, identify any one violation, and state which functional dependency is violated,
and which duty_id’s violate them.
duty_id (D) TA_id (T) answer_id (A) homework_id (H) problem_id (P) student_id (S) grade (G) overdue (O)
1002 Jesse 105 2 3 5002 85 No
(ii) (4 points): Compute a BCNF decomposition of this table schema DTAHPSGO using the three FDs.
Show the steps.
Option 1:
From A -> SH and HT -> P we have AT -> SHT -> P, so AT -> P.
Option 2:
Using A -> SH, decompose DTAHSGO and get ASH and DTGO
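The AT -> P inference in Option 1 is an attribute-closure computation. A minimal Python sketch (an illustration, using only the two FDs that appear explicitly in the solution text, A -> SH and HT -> P):

```python
def closure(attrs, fds):
    """Closure of attribute set `attrs` under FDs given as (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole LHS is in the closure, pull in the RHS.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Only the two FDs that appear explicitly above: A -> SH and HT -> P.
fds = [(set("A"), set("SH")), (set("HT"), set("P"))]

assert "P" in closure(set("AT"), fds)   # so AT -> P, as Option 1 derives
```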
(iii) (3 points): Consider a simplified version of this schema with only the attributes DTH, and one
functional dependency D -> TH. If we decompose the table into DT and TH, is it a lossless join
decomposition? Explain your answer.
No. Consider the instance:
DTH:
duty_id TA_id homework_id
1001 Ray 1
1002 Ray 2
DT:
duty_id TA_id
1001 Ray
1002 Ray
TH:
TA_id homework_id
Ray 1
Ray 2
DT join TH:
1001 Ray 1
1001 Ray 2
1002 Ray 1
1002 Ray 2
Rows 2 and 3 are spurious tuples not in the original DTH instance, so the decomposition is not lossless.
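The counterexample can be replayed in a few lines of Python (an illustrative sketch; values as in the solution's tables):

```python
# Decompose the DTH instance from the solution into DT and TH,
# then natural-join them back on TA_id.
dth = [("1001", "Ray", "1"), ("1002", "Ray", "2")]
dt = {(d, t) for d, t, h in dth}
th = {(t, h) for d, t, h in dth}

rejoined = sorted((d, t, h) for d, t in dt for t2, h in th if t == t2)
assert len(rejoined) == 4                  # two spurious tuples appeared
assert ("1001", "Ray", "2") in rejoined    # spurious row 2
assert ("1002", "Ray", "1") in rejoined    # spurious row 3
```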
<!DOCTYPE Registration [
<!ELEMENT Registration (Course+)>
<!ELEMENT Course (Name, Student* )>
<!ATTLIST Course Capacity CDATA #REQUIRED>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Student (Name, Grade)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Grade (#PCDATA)>
]>
(i) (4 points) Find XPath expressions that are equivalent to the XQuery below. Select ALL OPTIONS that
are correct (or say that none are correct).
for $c in /Registration/Course
return
if (exists($c/Student[Grade >= 90 and Grade < 95])) then $c/Name
A. /Registration/Course[Student[Grade >= 90 and Grade < 95]]/Name
B. /Registration/Course[./Student/Grade >= 90 and ./Student/Grade < 95]/Name
C. /Registration/Course[Student[Grade >= 90][Grade < 95]]/Name
D. /Registration/Course[count(./Student[Grade < 90 or Grade >= 95]) = 0]/Name
ANS: AC
B is wrong because there can be multiple students in a course, and its condition only checks that some
student's grade is >= 90 and some (possibly different) student's grade is < 95.
D is wrong because the count of students outside the range [90, 95) need not be 0; it suffices that one
student falls inside the range.
(ii) (5 points) Write a query using XQuery to find courses where at least half of the enrolled students
received their highest score among all courses they took. You may use some of the aggregate functions:
count(), min(), max(), where aggr(nodeset) returns a single value. Assume that the names of students are
unique.
for $c in /Registration/Course
let $cnt := count(
for $s in $c/Student
let $g := max(//Student[Name = $s/Name]/Grade)
where $g <= $s/Grade
return $s/Name
)
let $cnt2 := count(for $s in $c/Student return $s)
where $cnt >= 0.5 * $cnt2
return $c/Name
Problem 5: LOGS (8 points)
Consider UNDO/REDO logging with fuzzy checkpointing. Recall that update record (T, A, o, n) denotes
that transaction T changed the value of A from o to n.
At the time of a system crash, let the log segment involving four transactions S, T, U, V be as follows.
1. (START S)
3. (S, Y, 5, 10)
4. (COMMIT S)
5. (START T)
7. (START CKPT(T))
9. (START U)
10. (COMMIT T)
14. (START V)
(i) (4 pts) Fill out the table below by writing “Guaranteed”, “Maybe”, or “Impossible” for each update,
indicating whether that updated value of the variable was written to disk by the time of the crash:
2 X = 20
3 Y = 10
6 X = 30
8 Y = 20
11 X = 40
13 Y = 30
17 Y = 40
18 Y = 50
(ii) (2 pts) After the REDO stage of recovery, what will be the value of X and Y?
X =
Y =
(iii) (2 pts) After the UNDO stage of recovery, what will be the value of X and Y?
X =
Y =
Solution:
(i) Since the second checkpoint did not complete, we only have a guarantee for the updates before
the first START CKPT. All other updates might have been written to disk while the second
checkpoint was running.
2 X = 20 Guaranteed
3 Y = 10 Guaranteed
6 X = 30 Guaranteed
8 Y = 20 Maybe
11 X = 40 Maybe
13 Y = 30 Maybe
17 Y = 40 Maybe
18 Y = 50 Maybe
(ii) X = 40
Y = 50
S committed before the first START CKPT, so it is complete. T and U committed as well by the time
of the crash, and all changes made prior to the START CKPT are guaranteed to be on disk, so we
do not need to go beyond the START CKPT at step 7. From this point, we start “repeating
history”, i.e., make all updates in order; as a result we get X = 40 (line 11) and Y = 50 (line 18).
(iii)
The only uncommitted transaction is V, so the UNDO step focuses on V and undoes its changes in
reverse order. It first reverts Y from 50 to 40 (line 18), then reverts Y from 40 to 30
(line 17). So the final values will be
X = 40
Y = 30.
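The REDO/UNDO arithmetic can be replayed in Python. The update records below are a reconstruction from the table in part (i): the transaction labels and old values are inferred from the log order (the excerpt omits these log lines), and the initial value of X is unknown but never needed.

```python
# Replay REDO/UNDO for Problem 5. Update records reconstructed from
# part (i); transaction labels and old values are inferred assumptions.
updates = [
    (2,  "S", "X", None, 20),   # initial value of X unknown
    (3,  "S", "Y", 5,    10),
    (6,  "T", "X", 20,   30),
    (8,  "T", "Y", 10,   20),
    (11, "U", "X", 30,   40),
    (13, "U", "Y", 20,   30),
    (17, "V", "Y", 30,   40),
    (18, "V", "Y", 40,   50),
]
committed = {"S", "T", "U"}   # V had not committed at the crash

db = {}
# REDO: repeat history, applying every update in log order.
for line, txn, var, old, new in updates:
    db[var] = new
assert (db["X"], db["Y"]) == (40, 50)   # values after the REDO stage

# UNDO: revert updates of uncommitted transactions in reverse log order.
for line, txn, var, old, new in reversed(updates):
    if txn not in committed:
        db[var] = old
assert (db["X"], db["Y"]) == (40, 30)   # values after the UNDO stage
```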
FROM R, S
Suppose that all attributes in R(A, B), S(B, C) are integers, the range of S.C is [11, 210], and the tuples are
uniformly distributed.
(i). (9 points) Assume that we have infinite memory, no indexes, and our only join algorithm is a nested
loops join. The database optimizer decides to execute the query with the following query plan. When X =
208, for each table and operator, estimate the number of output tuples and the I/O Cost (in terms of number
of pages) of each operator in the query plan.
The tables have the following number of tuples and pages on disk:
   tuples   pages
R  1,000    100
S  50,000   5,000
tuples: ________
cost: ________
S
Ans:
cost: ___0_____
For selection on S, the estimated number of tuples in the selection result is (210-208) / (210 - 11+1)
* 50000 = 500. And we have to scan the entire table.
For R join S, the result can be stored in memory so there is no I/O cost in this phase.
Now the solution needed some additional assumptions. If you assume that B is the primary key of R and a
foreign key in S, then since there are 50,000 tuples in S and 1,000 tuples in R, the estimated number of
matching tuples in R for every tuple in S is 0.02. So the estimated number of tuples in the final result is
500 * 0.02 = 10.
(note the uniformity assumption.)
If you assume some distinct number of values of B in S and R, the answer would be different.
Since the question was under-specified, everyone got 1.5 points for the top-most sub-question on #tuples.
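The estimates above, spelled out in a short Python sketch (the PK/FK step follows the solution's own arithmetic and its stated assumptions):

```python
lo, hi, x = 11, 210, 208          # S.C uniform over [11, 210], X = 208
n_s = 50_000                      # tuples in S

# Selectivity of S.C > X: 2 of the 200 equally likely values qualify.
sel = (hi - x) / (hi - lo + 1)
est_selected = sel * n_s
assert est_selected == 500

# PK/FK assumption from the solution: 1000/50000 = 0.02 matching
# R tuples per selected S tuple.
n_r = 1_000
est_join = est_selected * (n_r / n_s)
assert est_join == 10
```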
(ii) (8 points) Now assume that the memory has only 10 pages, one index lookup requires 4 I/O, and after
the selection S.C > X, the results are written back to disk. Suppose there is a clustered B+tree index on S(B).
Ignore page boundaries. Suppose the options are
● hash join
● index nested loop join
a. (4 points) Which join algorithm will you choose when X = 208 to have less I/O? Write costs or
explanations to justify your answer.
Ans:
2.a: Use hash join. When X = 208, there are 500 tuples in the selected part of S, fitting in 50 pages; R has
1,000 tuples fitting in 100 pages. In total we need about sqrt(min(50, 100)) + 2 ≈ 10 pages in memory, so
hash join can be performed; cost = 3 * (50 + 100) = 450.
The cost of indexed nested loop join is B(R) + |R|* (4 + 1) = 100 + 1,000 * 5 = 5,100
1 R tuple joins with at most 5 S tuples, which would fit in one page (ignoring page boundary).
When X = 110, there are 25,000 tuples in the selected part of S, fitting in 2,500 pages; R has 1,000 tuples
fitting in 100 pages. In total we need sqrt(min(2500, 100)) + 2 = 12 > 10 pages in memory, so hash join
cannot be performed.
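The cost comparison can be sketched numerically in Python (the feasibility rule sqrt(min(B1, B2)) + 2 <= M and the cost formulas follow the solution: 3 passes for hash join; 4 I/Os per index lookup plus 1 to fetch matching tuples for INLJ):

```python
import math

M = 10  # memory pages available

def hash_join_feasible(b1, b2):
    # Grace hash join needs roughly sqrt(size of the smaller input) + 2 pages.
    return math.sqrt(min(b1, b2)) + 2 <= M

def hash_join_cost(b1, b2):
    return 3 * (b1 + b2)   # partition both inputs, then read partitions back

def inlj_cost(b_outer, n_outer, lookup_io=4):
    # Scan the outer once; one index lookup + one fetch per outer tuple.
    return b_outer + n_outer * (lookup_io + 1)

# X = 208: selection leaves 50 pages of S; R is 100 pages, 1,000 tuples.
assert hash_join_feasible(50, 100)            # sqrt(50) + 2 < 10
assert hash_join_cost(50, 100) == 450
assert inlj_cost(100, 1_000) == 5_100         # so hash join wins

# X = 110: selection leaves 2,500 pages of S.
assert not hash_join_feasible(2500, 100)      # sqrt(100) + 2 = 12 > 10
```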
Suppose you have a table R(A, B, C) in London, and the table S(B, E, F) in Paris and we want to do an
equijoin. Here are the steps:
Semi-join, step 1: At London, project R onto the join column B. Send the projection R1(B) to Paris.
Bloom-join:
1. At London: compute a bit-vector of some size k. Hash R.B values into the range 0 to k-1; if
some tuple hashes to the p-th bucket, set bit p to 1 (p from 0 to k-1). Ship the bit-vector to Paris.
2. At Paris: hash each S.B value similarly and discard the S tuples that hash to a 0 in R's bit-vector.
Let the resulting subset of S be S1. Ship S1(B, E, F) to London.
(i) (3 points) State one reason why bloom-join can be more efficient than semi-join.
(i) Sending a bit vector is more efficient than sending a subset of R in step 1.
(ii) (3 points) State one reason why bloom-join can be less efficient than semi-join.
Solution:
(ii) Since hashing may have collisions, bloom-join can ship S tuples that do not appear in the final join
result. Semi-join only sends the S tuples that appear in the join result.
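A toy Python sketch of the bit-vector step (the value of `k`, the hash function, and the sample values are all my own illustrations). It also shows the collision effect behind answer (ii): value 11 hashes to the same bucket as 3, so an S tuple with B = 11 survives the filter even though it has no match in R.

```python
k = 8                       # bit-vector size (toy choice)

def h(v):
    return v % k            # toy hash into buckets 0 .. k-1

r_b = [3, 19, 42]           # R.B values at London
bits = [0] * k
for b in r_b:               # build the bit-vector: 3 -> 3, 19 -> 3, 42 -> 2
    bits[h(b)] = 1

s = [(3, "e1", "f1"), (7, "e2", "f2"), (11, "e3", "f3")]   # S at Paris
s1 = [t for t in s if bits[h(t[0])]]   # survivors shipped back to London
assert [t[0] for t in s1] == [3, 11]   # 11 is a false positive; 7 is dropped
```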
Consider four tables: S(A, B), T(B, C, D), U(C, D, E), V(B, C, F) and a natural join among them. Instead of
computing the result of the join, the goal is to compute “fully reduced relations”: (1) subsets of tuples from
S, T, U, V such that each of these tuples participates in at least one result tuple, and (2) these subsets should
be maximal, i.e., all tuples of S, T, U, V that appear in the original join result should stay.
S(A, B): (10, 9), (12, 9)
T(B, C, D): (9, 1, 7), (9, 1, 4)
U(C, D, E): (1, 7, 6), (1, 4, 1)
V(B, C, F): (9, 1, 5)
Note that there are four result tuples: (A, B, C, D, E, F) = (10, 9, 1, 7, 6, 5), (10, 9, 1, 4, 1, 5), (12, 9, 1, 7, 6, 5),
(12, 9, 1, 4, 1, 5), and all tuples from all relations participate in this result set.
S(A, B): (10, 9), (12, 9)
T(B, C, D): (9, 1, 7), (9, 2, 4)
U(C, D, E): (1, 7, 6), (1, 4, 1)
V(B, C, F): (9, 1, 5)
Here T(9, 2, 4) does not participate in the join result and is a “dangling tuple”. Similarly U(1, 4, 1). The fully
reduced relations for S, T, U, V will get rid of (only) these two tuples and produce the following output.
S(A, B): (10, 9), (12, 9)
T(B, C, D): (9, 1, 7)
U(C, D, E): (1, 7, 6)
V(B, C, F): (9, 1, 5)
(i) (4 points) Can you construct an instance where each of S, T, U, V has n tuples, and the result of the join
is n^2?
T has (1, c1, 1), (1, c2, 1), … , (1, cn, 1)
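One way to complete the construction (the S, U, V instances here are my own filling-in of the sketch, with everything funnelled through B = 1) can be verified in Python:

```python
# T has (1, c_i, 1) as in the solution sketch; S, U, V are chosen so that
# every S tuple joins with every T tuple, and U, V each match exactly once.
n = 4
S = [(f"a{i}", 1) for i in range(n)]           # S(A, B), all with B = 1
T = [(1, f"c{i}", 1) for i in range(n)]        # T(B, C, D)
U = [(f"c{i}", 1, 1) for i in range(n)]        # U(C, D, E)
V = [(1, f"c{i}", 1) for i in range(n)]        # V(B, C, F)

result = [
    (a, b, c, d, e, f)
    for a, b in S
    for b2, c, d in T if b2 == b               # n x n pairs from S x T
    for c2, d2, e in U if (c2, d2) == (c, d)   # exactly one match each
    for b3, c3, f in V if (b3, c3) == (b, c)   # exactly one match each
]
assert len(result) == n * n
```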
(ii) (10 points) Given arbitrary S, T, U, V, can you design an O(n log n) algorithm to compute the fully
reduced relations? Note that the join size can be Θ(n^2), so you cannot compute the join and project onto
individual relations! A brief description in English with some analysis is fine (no need for pseudocode etc.)
Hint1: One concept from class learnt in another context would help!
S(A, B)
   |
T(B, C, D)
  /        \
U(C, D, E)   V(B, C, F)
We need to do a semi-join pass bottom-up, then another pass top-down, and use sorting to keep rows
that actually join.
Bottom-up pass:
Step 1: U and T
-- See which tuples of T match with U1, delete the rest from T (O(n))
Step 2: V and T
-- See which tuples of T match with V1, delete the rest from T (O(n))
Step 3: T and S
-- See which tuples of S match with T1, discard the rest from S (O(n))
Top-down pass:
Now do the same trick top-down, i.e. first (new) S with (new) T, then (newer) T with U, and then with V to
keep only tuples in T, U, V that would participate in the join result.
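The two passes can be sketched in Python on the second example instance (relation values from the problem; hash-based semi-joins make each step O(n), while sorting instead gives the stated O(n log n) bound):

```python
def semijoin(keep, other, keep_key, other_key):
    """Keep the tuples of `keep` whose key appears in `other` (a semi-join)."""
    present = {other_key(t) for t in other}
    return [t for t in keep if keep_key(t) in present]

S = [(10, 9), (12, 9)]                 # S(A, B)
T = [(9, 1, 7), (9, 2, 4)]             # T(B, C, D); (9, 2, 4) is dangling
U = [(1, 7, 6), (1, 4, 1)]             # U(C, D, E); (1, 4, 1) is dangling
V = [(9, 1, 5)]                        # V(B, C, F)

# Bottom-up pass: reduce T by its children U and V, then S by T.
T = semijoin(T, U, lambda t: (t[1], t[2]), lambda u: (u[0], u[1]))  # on (C, D)
T = semijoin(T, V, lambda t: (t[0], t[1]), lambda v: (v[0], v[1]))  # on (B, C)
S = semijoin(S, T, lambda s: s[1], lambda t: t[0])                  # on B

# Top-down pass: reduce T by S, then U and V by T.
T = semijoin(T, S, lambda t: t[0], lambda s: s[1])
U = semijoin(U, T, lambda u: (u[0], u[1]), lambda t: (t[1], t[2]))
V = semijoin(V, T, lambda v: (v[0], v[1]), lambda t: (t[0], t[1]))

assert T == [(9, 1, 7)]   # dangling (9, 2, 4) removed
assert U == [(1, 7, 6)]   # dangling (1, 4, 1) removed
```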
Trivia: These queries are called “acyclic queries” for which such trees (called join trees) would always exist.
e.g., you cannot do this trick for queries like R(A, B), S(B, C), T(C, A), which is not acyclic and intuitively has
a cycle A -> B -> C -> A!
For the following three problems, assume we are considering replacing a hard disk drive (HDD) with a solid
state drive (SSD). Assume that our database can be held on both disks and pages from each relation are
stored in consecutive disk blocks initially. Suppose sequential scans over every table in the database on HDD
and SSD take roughly the same amount of time, but a random access on SSD is much faster. Write True/False;
no explanations are needed.
Ans: T
Ans: F
Problem X2.c (2 points)
Consider an external sort where the table being sorted has M2 pages (M is the number of pages in the
memory available for sorting), replacing HDD with SSD will improve the performance significantly.
Ans: T
INLJ with unclustered index will access pages from random locations.
BNLJ will load multiple pages of the outer relation from consecutive locations sequentially.
Sorting accesses next page from different runs that are not consecutively stored necessarily.