Midterm 02 Solutions
Name:
Brief Directions:
• Write clearly: First, you don't want me to spend the whole week grading, do you? Second,
it's good for you to write clearly!
• Good luck!
1 B+ Trees (20 points)
Consider a B+ tree where n = 4, i.e., the maximum number of keys in a node is 4 and the maximum
number of pointers is 5 at internal nodes and 4 at leaf nodes. Assume that the B+ tree initially
consists of a single node, which is both the root and the only leaf, that has the key 1.
1. 2 points What is the minimum number of keys that may appear in a non-root internal node?
Solution: $\lceil (n+1)/2 \rceil - 1 = 2$
2. 2 points What is the minimum number of keys that may appear in a non-root leaf node?
Solution: $\lfloor (n+1)/2 \rfloor = 2$
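The two bounds above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are illustrative and are not taken from the textbook.

```python
import math

def min_keys_internal(n: int) -> int:
    """Minimum keys in a non-root internal node: ceil((n+1)/2) - 1."""
    return math.ceil((n + 1) / 2) - 1

def min_keys_leaf(n: int) -> int:
    """Minimum keys in a non-root leaf node: floor((n+1)/2)."""
    return (n + 1) // 2

# For the tree in this problem, n = 4.
print(min_keys_internal(4), min_keys_leaf(4))
```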
3. 8 points [1] Consider the set of keys S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13}. Write down a
sequence of inserting the keys of S such that at the end the resulting B+ tree has 3 levels and
is as empty as possible, i.e., as many nodes as possible have the minimum number of keys.
Provide the B+ tree snapshots that correspond to the points right after node splits. (Use the
white space at the end of the page and the back side of this page.)
Solution: The sequence (1), 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 produces the sequence of splits
and the resulting B+ tree shown below, assuming the splits are "3 keys to the left, 2 keys
to the right".
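Before the first split (after inserting 13, 12, 11):
    leaf:     [1 11 12 13]
After inserting 10 (first leaf split):
    root:     [12]
    leaves:   [1 10 11] [12 13]
After inserting 9, 8 (second leaf split):
    root:     [10 12]
    leaves:   [1 8 9] [10 11] [12 13]
After inserting 7, 6 (third leaf split):
    root:     [8 10 12]
    leaves:   [1 6 7] [8 9] [10 11] [12 13]
After inserting 5, 4 (fourth leaf split):
    root:     [6 8 10 12]
    leaves:   [1 4 5] [6 7] [8 9] [10 11] [12 13]
After inserting 3, 2 (fifth leaf split, which also splits the root):
    root:     [8]
    internal: [4 6] [10 12]
    leaves:   [1 2 3] [4 5] [6 7] [8 9] [10 11] [12 13]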
This is just one of the possible solutions. However, the resulting tree is the only one that
is possible if the splits are "3 keys to the left, 2 keys to the right". There is exactly one
possible resulting tree in the case of "2 keys to the left, 3 keys to the right".
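The split sequence can be replayed with a small simulation. The following is a minimal sketch, not the textbook's algorithm verbatim: it assumes that an overflowing leaf keeps 3 keys on the left and sends 2 to the right (copying the first right key up), and that an overflowing internal node moves its middle key up. The names `Node`, `insert`, `build`, and `levels` are illustrative.

```python
from bisect import bisect_right, insort

N = 4  # maximum number of keys per node

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None for a leaf

def insert(node, key):
    """Insert key below node; return (separator, new_right_node) on a split, else None."""
    if node.children is None:
        insort(node.keys, key)
        if len(node.keys) <= N:
            return None
        right = Node(node.keys[3:])       # 3 keys stay left, 2 go right
        node.keys = node.keys[:3]
        return right.keys[0], right       # separator is copied up
    i = bisect_right(node.keys, key)
    split = insert(node.children[i], key)
    if split is None:
        return None
    sep, right_child = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right_child)
    if len(node.keys) <= N:
        return None
    mid = len(node.keys) // 2             # middle key moves up
    up = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return up, right

def build(sequence):
    root = Node([])
    for k in sequence:
        split = insert(root, k)
        if split:                          # root split: the tree grows by one level
            sep, right = split
            root = Node([sep], [root, right])
    return root

def levels(root):
    """Return the keys of every node, level by level from the root down."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level if n.children for c in n.children]
    return out

for lvl in levels(build([1, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2])):
    print(lvl)
```

Under these assumptions the final tree has three levels, and every non-root node except the leftmost leaf holds the minimum of 2 keys.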
[1] May be time consuming.
4. 8 points [2] Consider the set of keys S' = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}.
Write down a sequence of inserting the keys of S' such that at the end the resulting B+ tree
is as full as possible, i.e., as many nodes as possible have the maximum number of keys.
Provide the B+ tree snapshots that correspond to the points right after a node split. (Use
the next blank page.)
Remarks
Solution: There is exactly one resulting tree that has just two levels and, hence, is as full as
possible. It is produced as follows:
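Start:
    leaf:   [1]
Insert 2, 3, 6 (no split):
    leaf:   [1 2 3 6]
Insert 5 (split):
    root:   [5]
    leaves: [1 2 3] [5 6]
Insert 7, 9, 10 (split):
    root:   [5 9]
    leaves: [1 2 3] [5 6 7] [9 10]
Insert 11, 13, 14 (split):
    root:   [5 9 13]
    leaves: [1 2 3] [5 6 7] [9 10 11] [13 14]
Insert 15, 17, 18 (split):
    root:   [5 9 13 17]
    leaves: [1 2 3] [5 6 7] [9 10 11] [13 14 15] [17 18]
Insert 4, 8, 12, 16, 19, 20 (no extra splits):
    root:   [5 9 13 17]
    leaves: [1 2 3 4] [5 6 7 8] [9 10 11 12] [13 14 15 16] [17 18 19 20]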
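Because the tree never grows beyond two levels here, the insertion sequence can be replayed with a much simpler sketch that tracks only the leaves. It assumes "3 keys left, 2 keys right" leaf splits and that the root never overflows (true for this sequence, since the root ends with exactly 4 keys). The final batch includes 16, which appears in the final tree. The name `insert_two_level` is illustrative.

```python
from bisect import bisect_right, insort

MAX_KEYS = 4

def insert_two_level(leaves, key):
    """Insert key into a sorted list of leaves; split an overflowing leaf 3/2."""
    # In a two-level tree built by insertions only, the root separators are
    # the first keys of all leaves except the leftmost one.
    separators = [leaf[0] for leaf in leaves[1:]]
    i = bisect_right(separators, key)
    insort(leaves[i], key)
    if len(leaves[i]) > MAX_KEYS:
        leaves[i:i + 1] = [leaves[i][:3], leaves[i][3:]]

leaves = [[1]]
for k in [2, 3, 6, 5, 7, 9, 10, 11, 13, 14, 15, 17, 18, 4, 8, 12, 16, 19, 20]:
    insert_two_level(leaves, k)

root = [leaf[0] for leaf in leaves[1:]]
print(root)
print(leaves)
```

With 20 keys and at most 4 keys per leaf, at least 5 leaves are needed, and 5 leaf pointers fit in one root, so a completely full two-level tree is the best possible outcome.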
[2] May be time consuming.
2 Extensible Hash Index (10 points)
Assume that each bucket of an extensible hash index can fit exactly two records (each record is
a pair of hash key and pointer). Consider the following records, with the corresponding hash key
values.
Key hash key
a 0000
b 0001
c 0010
d 0011
e 0100
f 0110
g 1000
We insert the records in the order given above. Show the extensible hash index after all records
have been inserted.
Solution: As is always the case with extensible hash indices, the resulting index does not depend
on the order in which the records are inserted. The resulting index is:
Global depth: i = 3

Directory entries     Bucket contents   Local depth
000                   0000, 0001        3
001                   0010, 0011        3
010, 011              0100, 0110        2
100, 101, 110, 111    1000              1
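The index can be reproduced with a short simulation. This is a sketch under two assumptions: the directory is indexed by the leading (most significant) bits of the hash key, which matches the directory shown in the solution, and a full bucket is split on its next bit, doubling the directory when its local depth equals the global depth. The names `Bucket` and `build_index` are illustrative.

```python
BUCKET_SIZE = 2
BITS = 4  # length of each hash key in bits

class Bucket:
    def __init__(self, depth):
        self.depth = depth   # local depth
        self.keys = []

def build_index(hash_keys):
    global_depth = 0
    directory = [Bucket(0)]  # indexed by the first global_depth bits
    for h in hash_keys:
        while True:
            idx = int(h, 2) >> (BITS - global_depth)
            bucket = directory[idx]
            if len(bucket.keys) < BUCKET_SIZE:
                bucket.keys.append(h)
                break
            if bucket.depth == global_depth:
                # Double the directory: each old entry becomes two adjacent entries.
                directory = [b for b in directory for _ in range(2)]
                global_depth += 1
            # Split the full bucket on its next bit.
            bucket.depth += 1
            new = Bucket(bucket.depth)
            old_keys, bucket.keys = bucket.keys, []
            for k in old_keys:
                (new if k[bucket.depth - 1] == "1" else bucket).keys.append(k)
            for j, b in enumerate(directory):
                if b is bucket:
                    prefix = format(j, f"0{global_depth}b")
                    if prefix[bucket.depth - 1] == "1":
                        directory[j] = new
    return global_depth, directory

depth, directory = build_index(["0000", "0001", "0010", "0011", "0100", "0110", "1000"])
print("global depth:", depth)
for j, b in enumerate(directory):
    print(format(j, f"0{depth}b"), b.keys, "local depth", b.depth)
```

The sketch loops forever if more than two records share the same full hash key; that case does not occur in this exercise.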
3 Algebra (10 points)
Consider the following two definitions of the semijoin operator $\ltimes$, which may or may not be
equivalent.
• Direct (from page 255 of the textbook): The semijoin of relations R and S, written $R \ltimes S$,
is the bag of tuples t in R such that there is at least one tuple in S that agrees with t in all
attributes that R and S have in common.
• Indirect: Let us call a(R) the list of attributes of R. Then
$R \ltimes S = \pi_{a(R)}(R \bowtie S)$
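Solution: No, they are not equivalent. Consider the following counterexample with relations
R(A) and S(A, B):

R:
A
1

S:
A B
1 2
1 3

Under the direct definition, $R \ltimes S$ contains the single tuple (1). Under the indirect
definition, $R \bowtie S$ has two tuples, so the bag projection $\pi_A(R \bowtie S)$ contains (1) twice.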
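The counterexample can be replayed in Python with bags modelled as `collections.Counter` objects. This is only a sketch for this particular pair of relations; tuples are matched on their first component, which is the common attribute A.

```python
from collections import Counter

R = [(1,)]             # schema R(A)
S = [(1, 2), (1, 3)]   # schema S(A, B)

# Direct definition: keep each tuple of R that has at least one match in S.
direct = Counter(r for r in R if any(r[0] == s[0] for s in S))

# Indirect definition: bag projection of the natural join onto a(R) = (A).
indirect = Counter((r[0],) for r in R for s in S if r[0] == s[0])

print(direct)
print(indirect)
```

The two bags contain the same tuple with different multiplicities, which is exactly why the definitions differ under bag semantics.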
2. 5 points Consider the indirect definition of $\ltimes$. Assume that the schema of P is P(A, B, C, D)
and the schema of T is T(C, E). Prove the following, using the notation shown in the Appendix.
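Solution: Exercising transformation rules from the notes and the book we have
\[
\begin{aligned}
& \pi_{A,B}\, \sigma_{D=5 \wedge E=6} (P \bowtie T) \\
={} & \pi_{A,B}\, \sigma_{D=5}\, \sigma_{E=6} (P \bowtie T) \\
={} & \pi_{A,B}\, \sigma_{D=5} (P \bowtie \sigma_{E=6} T) \\
={} & \pi_{A,B} (\sigma_{D=5}(P) \bowtie \sigma_{E=6}(T)) \\
={} & \pi_{A,B} (\pi_{A,B,C}\, \sigma_{D=5}(P) \bowtie \pi_{C}\, \sigma_{E=6}(T)) \\
={} & \pi_{A,B}\, \pi_{A,B,C} (\pi_{A,B,C}\, \sigma_{D=5}(P) \bowtie \pi_{C}\, \sigma_{E=6}(T)) \\
={} & \pi_{A,B} ((\pi_{A,B,C}\, \sigma_{D=5} P) \ltimes (\pi_{C}\, \sigma_{E=6} T))
\end{aligned}
\]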
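As a sanity check (not a proof), the first and last expressions of the derivation can be compared on sample data. The relation contents below are hypothetical and chosen only to include duplicates; bags are modelled with `collections.Counter`, and the semijoin follows the indirect definition.

```python
from collections import Counter

# Hypothetical sample data.
P = [(1, 2, 10, 5), (1, 2, 10, 5), (3, 4, 20, 5), (5, 6, 10, 7)]  # P(A, B, C, D)
T = [(10, 6), (10, 6), (20, 8)]                                   # T(C, E)

# First expression: pi_{A,B} sigma_{D=5 and E=6} (P join T).
lhs = Counter(
    (a, b)
    for (a, b, c, d) in P
    for (c2, e) in T
    if c == c2 and d == 5 and e == 6
)

# Last expression: pi_{A,B} ((pi_{A,B,C} sigma_{D=5} P) semijoin (pi_C sigma_{E=6} T)).
p1 = [(a, b, c) for (a, b, c, d) in P if d == 5]
t1 = [(c,) for (c, e) in T if e == 6]
# Indirect semijoin: project the join back onto the attributes of p1.
semi = [(a, b, c) for (a, b, c) in p1 for (c2,) in t1 if c == c2]
rhs = Counter((a, b) for (a, b, c) in semi)

print(lhs == rhs, lhs)
```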
Consider the following SQL query, which returns the total amount for each product that "Jones"
bought.
1. 5 points Write an algebra expression that uses two cartesian products $\times$, exactly one selection
$\sigma$ and the $\mathrm{SUM}_{Product;\,Amount \mapsto Total}$ operator and computes the SQL query.
2. 5 points Show the series of transformations that transform the algebra expression of the
previous question into an expression where the cartesian products have been replaced by
joins and the selections are pushed as far down (as early) as possible.
Assume that the equation $\sigma_{R.A=S.A}(R \times S) = R \bowtie_{R.A=S.A} S$ is one of the transformation
rules.
Solution:
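\[
\begin{aligned}
& \mathrm{SUM}_{Product;\,Amount \mapsto Total}\; \sigma_{C.CID=O.CID \,\wedge\, O.OID=L.OID \,\wedge\, C.Name=\text{'Jones'}} ((C \times O) \times L) \\
={} & \mathrm{SUM}_{Product;\,Amount \mapsto Total}\; \sigma_{C.CID=O.CID}\, \sigma_{O.OID=L.OID \,\wedge\, C.Name=\text{'Jones'}} ((C \times O) \times L) \\
={} & \mathrm{SUM}_{Product;\,Amount \mapsto Total}\; \sigma_{C.CID=O.CID}\, \sigma_{O.OID=L.OID}\, \sigma_{C.Name=\text{'Jones'}} ((C \times O) \times L) \\
={} & \mathrm{SUM}_{Product;\,Amount \mapsto Total}\; \sigma_{C.CID=O.CID}\, \sigma_{O.OID=L.OID} (\sigma_{C.Name=\text{'Jones'}} (C \times O) \times L) \\
={} & \mathrm{SUM}_{Product;\,Amount \mapsto Total}\; \sigma_{C.CID=O.CID}\, \sigma_{O.OID=L.OID} ((\sigma_{C.Name=\text{'Jones'}} (C) \times O) \times L) \\
={} & \mathrm{SUM}_{Product;\,Amount \mapsto Total}\; \sigma_{C.CID=O.CID} ((\sigma_{C.Name=\text{'Jones'}} (C) \times O) \bowtie_{O.OID=L.OID} L) \\
={} & \mathrm{SUM}_{Product;\,Amount \mapsto Total}\; (\sigma_{C.CID=O.CID} (\sigma_{C.Name=\text{'Jones'}} (C) \times O) \bowtie_{O.OID=L.OID} L) \\
={} & \mathrm{SUM}_{Product;\,Amount \mapsto Total}\; ((\sigma_{C.Name=\text{'Jones'}} (C) \bowtie_{C.CID=O.CID} O) \bowtie_{O.OID=L.OID} L)
\end{aligned}
\]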
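The first and last expressions of this derivation can be compared on a toy database. The sketch below is a sanity check only: the schemas C(CID, Name), O(OID, CID), and L(OID, Product, Amount) are assumptions inferred from the attribute names in the expressions, and all rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data; schemas are assumed, not given in the text.
C = [(1, "Jones"), (2, "Smith")]                   # C(CID, Name)
O = [(100, 1), (101, 1), (102, 2)]                 # O(OID, CID)
L = [(100, "pen", 3), (100, "ink", 5),
     (101, "pen", 4), (102, "pen", 9)]             # L(OID, Product, Amount)

def sum_by_product(rows):
    """Group (product, amount) pairs by product and sum the amounts."""
    total = defaultdict(int)
    for product, amount in rows:
        total[product] += amount
    return dict(total)

# First expression: one selection over two cartesian products.
naive = sum_by_product(
    (product, amount)
    for (cid, name) in C
    for (oid, o_cid) in O
    for (l_oid, product, amount) in L
    if cid == o_cid and oid == l_oid and name == "Jones"
)

# Last expression: the selection is applied to C first, followed by two joins.
jones = [(cid, name) for (cid, name) in C if name == "Jones"]
c_o = [(cid, oid) for (cid, name) in jones for (oid, o_cid) in O if cid == o_cid]
pushed = sum_by_product(
    (product, amount)
    for (cid, oid) in c_o
    for (l_oid, product, amount) in L
    if oid == l_oid
)

print(naive == pushed, pushed)
```

Both plans return the same totals; the difference is that the second one never materializes the full cartesian product.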
3. 5 points Provide an additional (non-trivial) expression where the selections have been pushed
as far down (as early) as possible, but the join order is different. No need to show the series of
transformations that led you to this expression.
Solution:
4. 10 points So far you must have provided 2 algebra expressions with different join orders.
For each one of them provide an estimate of the size of its intermediate results. Also, provide
an estimate of the size of the final result. To save time, just put the size numbers next to the
edges in the algebra expressions.
If you reached that point you have got full points. So, what about the
$\mathrm{SUM}_{Product;\,Amount \mapsto Total}$? The book provides upper and lower bound numbers as well as
some pretty arbitrary estimates for the duplicate elimination operation, which is identical to
the problem at hand. If you are a combinatorics freak you will recognize Stirling's numbers as
the solution. More in class!
5. 7 points Assume that you have got indices on all attributes. Consider the following
simplification of the query:
Apply the INGRES algorithm to get a join order. Indicate which is the small relation hyperedge
that you pick in each step of the algorithm and show the resulting plan.
A What will get you full points in proving equivalence of algebraic
expressions
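Assume that the exercise is to prove that
\[
\begin{aligned}
& \sigma_{p \wedge q} (P \bowtie T) \\
={} & \sigma_{p}\, \sigma_{q} (P \bowtie T) \\
={} & \sigma_{p} (P \bowtie \sigma_{q} T) \\
={} & (\sigma_{p} P) \bowtie (\sigma_{q} T)
\end{aligned}
\]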