Solution: Quiz 1
• Do not open this quiz booklet until directed to do so. Read all the instructions on this page.
• When the quiz begins, write your name on the top of every page of this quiz booklet.
• You have 120 minutes to earn a maximum of 120 points. Do not spend too much time on
any one problem. Skim them all first, and attack them in the order that allows you to make
the most progress.
• You are allowed one double-sided letter-sized sheet with your own notes. No calculators,
cell phones, or other programmable or communication devices are permitted.
• Write your solutions in the space provided. Pages will be scanned and separated for grading.
If you need more space, write “Continued on S1” (or S2, S3, S4) and continue your solution
on the referenced scratch page at the end of the exam.
• Do not waste time and paper rederiving facts that we have studied in lecture, recitation, or
problem sets. Simply cite them.
• When writing an algorithm, a clear description in English will suffice. Pseudo-code is not
required. Be sure to argue that your algorithm is correct, and analyze the asymptotic
running time of your algorithm. Even if your algorithm does not meet a requested bound,
you may receive partial credit for inefficient solutions that are correct.
• Pay close attention to the instructions for each problem. Depending on the problem,
partial credit may be awarded for incomplete answers.
Name:
School Email:
2 6.006 Solution: Quiz 1 Name
(a) [1 point] Write your name and email address on the cover page.
Solution: OK!
(a) [5 points] Given array A of n integers, the Python function below appends all integers
from set {A[x] | 0 ≤ i ≤ x < j ≤ n and A[x] < k} to the end of dynamic array B.
def filter_below(A, k, i, j, B):
    if (j - i) > 1:
        c = (i + j) // 2
        filter_below(A, k, i, c, B)
        filter_below(A, k, c, j, B)
    elif (j - i) == 1 and A[i] < k:
        B.append(A[i])
Argue the worst-case running time of filter_below(A, k, 0, len(A), [])
in terms of n = len(A). You may assume that n is a power of two.
Solution: This function has a recurrence of the form T(n) = 2T(n/2) + f, where f
is the amortized constant cost of a possible append to B. The number of base cases,
n^(log_2 2) = n, dominates, so since a linear number of amortized constant-time
operations are performed, this function runs in worst-case O(n) time.
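As a sanity check on the linear bound, the sketch below instruments the same recursion with a call counter. The counter argument and the helper count_calls are illustrative additions, not part of the quiz; for n a power of two the recursion tree has n leaves and n - 1 internal nodes, so 2n - 1 calls are expected.

```python
def filter_below_counted(A, k, i, j, B, counter):
    """Same recursion as filter_below, but tallies every call in counter[0]."""
    counter[0] += 1
    if (j - i) > 1:
        c = (i + j) // 2
        filter_below_counted(A, k, i, c, B, counter)
        filter_below_counted(A, k, c, j, B, counter)
    elif (j - i) == 1 and A[i] < k:
        B.append(A[i])


def count_calls(n):
    """Number of calls made on an array of length n (hypothetical helper)."""
    counter = [0]
    # k = n is larger than every element, so every base case appends.
    filter_below_counted(list(range(n)), n, 0, n, [], counter)
    return counter[0]
```

Each call does constant work besides the (amortized constant) append, so a call count linear in n matches the O(n) claim.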
Common Mistakes:
• wrong recurrence, or solved correct recurrence incorrectly
• not acknowledging amortized operation
(b) [5 points] The integer array A = [4, 3, 1, 5, 0, 2] is not a heap. It is possible to make
A either a max or min heap by swapping two integers. State two such integers.
Solution: Swap 4 and 0 to construct a min heap.
      ____4____              ____0____
   __3__     __1     =>   __3__     __1
  5     0   2            5     4   2
Common Mistakes:
• trying for a max heap with a single swap, which is not possible
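The claim can be checked by brute force. The sketch below (helper names are hypothetical) tries every single swap of A and reports which ones yield a min or max heap, using the usual array layout where the parent of index i is (i - 1) // 2.

```python
def is_min_heap(A):
    """True if every node is <= its children in the implicit binary tree."""
    return all(A[(i - 1) // 2] <= A[i] for i in range(1, len(A)))


def is_max_heap(A):
    """True if every node is >= its children in the implicit binary tree."""
    return all(A[(i - 1) // 2] >= A[i] for i in range(1, len(A)))


def single_swap_heaps(A):
    """All (kind, x, y) such that swapping values x and y makes A a heap."""
    found = []
    for i in range(len(A)):
        for j in range(i + 1, len(A)):
            B = list(A)
            B[i], B[j] = B[j], B[i]
            if is_min_heap(B):
                found.append(("min", A[i], A[j]))
            if is_max_heap(B):
                found.append(("max", A[i], A[j]))
    return found
```

On A = [4, 3, 1, 5, 0, 2] the only result is ("min", 4, 0): the minimum 0 must move to the root for a min heap, while moving the maximum 5 to the root leaves 3 above 4, so no single swap gives a max heap.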
(c) [5 points] Let T be a binary search tree storing n integer keys in which the key k
appears m > 1 times. Let p be the lowest common ancestor of all nodes in T which
contain key k. Prove that p also contains key k.
Solution: Suppose for contradiction p contains some key k* ≠ k. Since p is the
lowest common ancestor of at least two nodes containing key k, one such node exists
in p’s left subtree and one such node exists in p’s right subtree. But then by the BST
property, k ≤ k* ≤ k, so k* = k, a contradiction.
Common Mistakes:
• assuming a node storing k is always parent or child of another node storing k
(d) [5 points] Given the hash family H = {h_a(k) = a(k + a) mod m | a ∈ {1, …, m}}
and some key k_1 ∈ {0, …, u − 1} where 2 < 2m < u, find a key k_2 ∈ {0, …, u − 1}
with k_2 ≠ k_1 such that h_a(k_1) = h_a(k_2) for every h_a ∈ H.
Solution: Choose k_2 = k_1 + m if k_1 < m, and k_2 = k_1 − m if k_1 ≥ m. Clearly,
0 ≤ k_2 < u by construction, with k_1 ≠ k_2. Further, h_a(k_1) = h_a(k_2) for every a because:
h_a(k_1) = (a k_1 + a^2) mod m = (a k_1 + a^2 ± am) mod m = (a(k_1 ± m) + a^2) mod m = h_a(k_2).
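A small numeric check of this construction, as a sketch: the function names below are hypothetical, and the hash is exactly a(k + a) mod m from the problem statement.

```python
def h(a, k, m):
    """Hash function h_a(k) = a(k + a) mod m from the family H."""
    return (a * (k + a)) % m


def colliding_key(k1, m):
    """Key k2 that differs from k1 by exactly m, as in the solution."""
    return k1 + m if k1 < m else k1 - m


def collides_for_all(k1, m):
    """True if k1 and its partner collide under every h_a, a in 1..m."""
    k2 = colliding_key(k1, m)
    return all(h(a, k1, m) == h(a, k2, m) for a in range(1, m + 1))
```

For instance, with m = 5 and u = 11 (so 2 < 2m < u), every k1 in {0, ..., 10} gets a partner inside the universe that collides for all five functions.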
Common Mistakes:
• choosing particular pair of keys instead of k2 for any k1
• choosing k2 = k1 mod m without bounding k2 ∈ {0, . . . , u − 1}
• choosing k2 which depends on a; fixed pair must collide for every a
(a) [6 points] A dealer in the casino has a deck of cards that is missing 3 cards. He will
help Jane find Dr. Yes if she helps him determine which cards are missing from his
deck. A full deck of cards contains kn cards, where each card has a value (an integer
i ∈ {1, . . . , n}) and a suit (one of k known English words), and no two cards have
both the same value and the same suit. Describe an efficient[1] algorithm to determine
the value and suit of each of the 3 cards missing from the deck.
Solution: Construct an initially empty hash table in constant time. For each card in
the deck, check whether its suit is in the hash table. If not, add the suit to the hash
table mapping to an empty dynamic array. Then in either case, append the card’s
number to the end of the suit’s dynamic array. At the end of this process, each suit
array contains the card numbers of that suit from the deck. Processing each card in
this way takes expected amortized O(1) time, so doing this with all nk − 3 cards takes
expected O(nk) time. After inserting all cards, check the length of each suit array in
O(k) time. Any suit array with length n has all its cards. For any suit array whose
length is less than n, sort the numbers in the array using counting sort in worst-case
O(n) time since the numbers are positive and bounded by n. Then loop through the
numbers in order to find any that are missing. At most three suit arrays have length
less than n. In total, this algorithm runs in expected O(nk) time.
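The steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name and the (value, suit) tuple representation are assumptions, a Python dict stands in for the hash table, and a presence array plays the role of counting sort (each count is 0 or 1 because cards are distinct). Like the solution, it only discovers suits that appear in the deck, so it assumes every suit still has at least one card.

```python
def missing_cards(deck, n):
    """Return the (value, suit) pairs absent from deck; values range over 1..n."""
    by_suit = {}  # hash table: suit -> dynamic array of values seen
    for value, suit in deck:
        by_suit.setdefault(suit, []).append(value)

    missing = []
    for suit, values in by_suit.items():
        if len(values) == n:
            continue  # this suit is complete
        # Counting-sort style tally over the bounded range 1..n.
        present = [False] * (n + 1)
        for v in values:
            present[v] = True
        missing.extend((v, suit) for v in range(1, n + 1) if not present[v])
    return missing
```

Only suits with fewer than n cards are scanned, and at most three such suits exist, so the scans add O(n) work on top of the expected O(nk) pass over the deck.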
Common Mistakes:
• using an integer sort on suits without mapping suits to integers
• looping over suits without first generating a list of suits
[1] By “efficient”, we mean that faster correct algorithms will receive more points than slower ones.
(b) [6 points] The dealer doesn’t know Dr. Yes, but he knows that Dr. Yes is one of the
k best players in the casino. Jane scans the room and for each of the p > k players,
she transmits back to headquarters a pair (c, ℓ) representing the number of chips c
and location ℓ of the player. Assuming that no two players have the same number of chips,
describe an efficient algorithm for headquarters to determine the locations of the k
players in the casino who have the most chips.
Solution: Build a max heap out of the p player pairs keyed by chip number in worst-
case O(p) time. Then, remove the maximum player pair from the heap k times and
store its location in a dynamic array to return. The k removals take worst-case O(k log p)
time, so the algorithm runs in worst-case O(p + k log p) time in total.
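A minimal sketch of this approach using Python's heapq module. Since heapq provides a min heap, chip counts are negated to simulate a max heap; the function name and the (chips, location) tuple layout are assumptions for illustration.

```python
import heapq


def top_k_locations(players, k):
    """Locations of the k players with the most chips, richest first.

    players: list of (chips, location) pairs with distinct chip counts.
    """
    # Negate chips so the min heap behaves like a max heap on chips.
    heap = [(-chips, location) for chips, location in players]
    heapq.heapify(heap)  # linear-time build, O(p)
    # k delete-max operations, O(k log p) in total.
    return [heapq.heappop(heap)[1] for _ in range(k)]
```

For example, with players [(50, "bar"), (900, "roulette"), (10, "door"), (300, "poker"), (700, "slots")] and k = 2, the result is ["roulette", "slots"].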
Common Mistakes:
• correct solutions that were inefficient
(c) [6 points] After determining the locations of the k players with the most chips, Jane
observes the game play of each of them. She watches each player play exactly h < k
game rounds. In any game round, a player will either win or lose chips. A player’s
win ratio is one plus the number of wins divided by one plus the number of losses
during the h observed hands. Given the number of observed wins and losses from
each of the k players, describe an efficient algorithm to sort the players by win ratio.
Solution: Observe that since wins plus losses equals h, each player's losses are ℓ = h − w,
so the win ratio (w + 1)/(h − w + 1) strictly increases with w. Hence one win ratio
(w_1 + 1)/(ℓ_1 + 1) is larger than another win ratio (w_2 + 1)/(ℓ_2 + 1) if and only if
w_1 > w_2, so it suffices to sort the players based on their wins. Since wins are
non-negative integers bounded by h < k, we can use counting sort to sort the players in
worst-case O(k) time.
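A sketch of the counting sort on wins, assuming players are identified by their index in a list of win counts (a representation chosen here for illustration). Buckets indexed by wins 0..h replace any explicit computation of the ratios.

```python
def sort_by_win_ratio(wins, h):
    """Player indices ordered by increasing win ratio.

    wins[p] is the number of observed wins of player p, with 0 <= wins[p] <= h.
    """
    buckets = [[] for _ in range(h + 1)]  # one bucket per possible win count
    for player, w in enumerate(wins):
        buckets[w].append(player)
    # Concatenating buckets in order sorts by wins, hence by win ratio.
    return [player for bucket in buckets for player in bucket]
```

With k players and h < k, building and reading the buckets takes O(k + h) = O(k) time; players with equal wins keep their input order.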
Common Mistakes:
• directly computing win ratios with arbitrary precision
• multiplying ratios by product of denominators (numbers can be exponential)
• comparison sorting win ratios by cross-multiplication (inefficient)
Solution: Maintain a doubly-linked list containing customers on the wait list in order, maintaining
a pointer to the front of the linked list corresponding to the front of the wait list, and a pointer
to the back of the linked list corresponding to the back of the wait list. Also maintain a hash
table mapping each customer name to the linked list node containing that customer. To implement
add_name(x), create a new linked list node containing name x and add it to the back of the linked
list in worst-case O(1) time. Then add name x to the hash table pointing to the newly created node.
To implement remove_name(x), look up name x in the hash table and remove the mapped
node from the linked list in expected O(1) time. Lastly, to implement seat(), remove the node
from the front of the linked list containing name x, remove name x from the hash table, and then
return x, also in expected O(1) time.
Note, this problem can also be solved with amortized bounds without using a linked list but the
amortization would need to be fully analyzed for full points.
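The data structure described above might look like the following sketch. The class and attribute names are hypothetical; a Python dict stands in for the hash table. The sketch assumes names are distinct, that remove_name is only called on names currently waiting, and that seat is only called on a non-empty list. It also drops the hash-table entry in remove_name, a housekeeping step the text leaves implicit.

```python
class _Node:
    """Doubly-linked list node holding one customer name."""
    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None


class WaitList:
    def __init__(self):
        self.front = None   # next customer to be seated
        self.back = None    # most recently added customer
        self.node_of = {}   # name -> linked list node

    def add_name(self, x):
        node = _Node(x)
        node.prev = self.back
        if self.back is None:
            self.front = node
        else:
            self.back.next = node
        self.back = node
        self.node_of[x] = node

    def _unlink(self, node):
        """Splice node out of the list in O(1) time."""
        if node.prev is None:
            self.front = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.back = node.prev
        else:
            node.next.prev = node.prev

    def remove_name(self, x):
        self._unlink(self.node_of.pop(x))

    def seat(self):
        node = self.front
        self._unlink(node)
        del self.node_of[node.name]
        return node.name
```

The linked list operations are worst-case O(1); the dict operations are what make the overall bounds expected rather than worst-case.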
Common Mistakes:
(a) [10 points] Assuming b2 − b1 < 6006, describe an O(n)-time algorithm to return
a range pair of A with respect to range (b1 , b2 ) if one exists. State whether your
algorithm’s running time is expected, worst-case, and/or amortized.
Solution: Let c = b_2 − b_1, a constant. Then, for each k ∈ {1, …, c}, we can find
whether any pair (i, j) satisfies a_i + a_j = b_1 + k. For each k ∈ {1, …, c}, build a hash
table H mapping each integer a_i to its index i in expected O(n) time. Then for each a_i, check
whether b_1 + k − a_i is in the hash table. If it is, then with j = H[b_1 + k − a_i] you have
found a range pair, so return (i, j). Otherwise, there is no a_j for which a_i + a_j = b_1 + k,
so we proceed to check a_{i+1}, until i = n. Each of the n checks takes expected constant time,
so checking for range pairs for one k runs in expected O(n) time. We repeat this procedure for
all k ∈ {1, …, c}, returning that no range pair exists if the process terminates without
finding one. Since c is constant, this algorithm runs in expected O(cn) = O(n) time.
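A sketch of this procedure in Python, with a dict as the hash table. The function name is hypothetical. Two simplifications are assumptions of this sketch: the table is built once and reused for every target (it does not depend on k), and, because the definition of a range pair is not reproduced here, the sketch follows the solution literally by testing the sums b1 + 1, ..., b2 and not excluding i = j.

```python
def range_pair_small_range(A, b1, b2):
    """Return (i, j) with A[i] + A[j] in {b1 + 1, ..., b2}, or None."""
    index_of = {a: i for i, a in enumerate(A)}  # expected O(n) build
    for target in range(b1 + 1, b2 + 1):        # target = b1 + k, k = 1..c
        for i, a in enumerate(A):
            j = index_of.get(target - a)         # expected O(1) lookup
            if j is not None:
                return (i, j)
    return None
```

For A = [10, 3, 25, 7] with b1 = 12 and b2 = 14, the first target 13 is hit by 10 + 3, so the sketch returns (0, 1).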
Common Mistakes:
• assuming that ai , b1 , b2 polynomially bounded
• checking sums within an incorrect range (not exactly the range from b1 and b2 )
(b) [15 points] Assuming log_n(max A − min A) < 6006 (with no restriction on b1 or
b2 ), describe an O(n)-time algorithm to return a range pair of A with respect to range
(b1 , b2 ) if one exists. State whether your algorithm’s running time is expected, worst-
case, and/or amortized.
Solution: Observe that since log_n(max A − min A) is some constant k, then n^k =
max A − min A. Loop through A to find min A in worst-case O(n) time and subtract
min A from each a_i. Now A contains integers in the range {0, …, n^(k+1)}, and we can
use radix sort to sort them in worst-case O(n) time. Lastly, add min A back to each value,
also in worst-case O(n) time, yielding array A′, containing the elements of A in sorted
order. Now we try to find a range pair, with a_i below denoting the i-th element of A′.
Initialize pointers i = 0 and j = n − 1. Repeat the following procedure, maintaining the
invariant that at the start of each loop, we’ve either returned a range pair or confirmed
that a_x cannot exist in a range pair for any x < i or x > j, which is vacuously true at
the start. To process loop (i, j), if b_1 ≤ a_i + a_j ≤ b_2, then return (i, j) as a range
pair. If i = j, then by the invariant, there is no range pair, so return that none exists.
Otherwise:
• If a_i + a_j < b_1, then a_i cannot be a part of a range pair with any a_x for x ≤ j, so
increase i by one, maintaining the invariant.
• If a_i + a_j > b_2, then a_j cannot be a part of a range pair with any a_x for x ≥ i, so
decrease j by one, maintaining the invariant.
Each loop takes worst-case O(1) time to execute, and with each loop, either i increases
or j decreases, so j − i decreases from n − 1 to j − i = 0, at which point the algorithm
terminates. Thus this algorithm runs in worst-case O(n) time.
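The two-finger scan can be sketched as below. To keep the example short, Python's built-in comparison sort stands in for the shift-and-radix-sort step, so this sketch runs in O(n log n) rather than the O(n) of the solution; the function name is hypothetical. It sorts indices rather than values so that the returned pair refers to positions in the original array.

```python
def range_pair_two_finger(A, b1, b2):
    """Return original indices (i, j) with b1 <= A[i] + A[j] <= b2, or None."""
    # Stand-in for radix sort: indices of A ordered by value.
    order = sorted(range(len(A)), key=lambda x: A[x])
    i, j = 0, len(A) - 1
    while i <= j:
        total = A[order[i]] + A[order[j]]
        if b1 <= total <= b2:
            return (order[i], order[j])
        if i == j:
            return None  # pointers met without finding a pair
        if total < b1:
            i += 1       # smallest remaining value is too small for any partner
        else:
            j -= 1       # largest remaining value is too large for any partner
    return None
```

The pointers only ever move toward each other, so the loop body runs at most n times.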
Common Mistakes:
• assuming each ai is polynomially bounded (not subtracting min A)
• assuming b2 − b1 is bounded and trying to use hash table as in 4a)
• two-finger algorithm mistakes:
– pointers move in same direction or can move back and forth
– algorithm has no termination condition
Solution: Maintain a latitude AVL tree keyed on distinct measurement latitudes, where each
latitude ℓ maps to a rainfall AVL tree containing all the measurement triples with latitude ℓ,
keyed by time. We only store nodes associated with measurements, so the height of each AVL
tree is bounded by O(log n). For each rainfall AVL tree, we augment each node p with the
maximum rainfall p.m of any measurement within p’s subtree. This augmentation can be
maintained in constant time at a node p by taking the maximum of the rainfall at p and the
augmented maximums of p’s left and right children; thus this augmentation can be maintained
without affecting the asymptotic running time of standard AVL tree operations.
To implement record_data(r, ℓ, t), search the latitude AVL tree for latitude ℓ in worst-case
O(log n) time. If ℓ does not exist in the latitude AVL tree, add a new node corresponding to ℓ
mapping to a new empty rainfall AVL tree, also in O(log n) time. In either case, add the
measurement triple to ℓ’s rainfall AVL tree, for a total running time of worst-case O(log n).
To implement peak_rainfall(ℓ, t), search the latitude AVL tree for latitude ℓ in worst-case
O(log n) time. If ℓ does not exist, return zero. Otherwise, we perform a one-sided range query on
the rainfall AVL tree associated with ℓ to find the peak rainfall at latitude ℓ since time t.
Specifically, let peak(p, t) be the maximum rainfall of any measurement in node p’s subtree
measured at time ≥ t (or zero if p is not a node):
\[
\mathrm{peak}(p, t) =
\begin{cases}
\max\{p.r,\ p.\mathrm{right}.m,\ \mathrm{peak}(p.\mathrm{left}, t)\} & \text{if } p.t \ge t \\
\mathrm{peak}(p.\mathrm{right}, t) & \text{if } p.t < t
\end{cases}
\]
Then peak_rainfall is simply peak(p, t) with p being the root of the tree, which can be computed
using at most O(log n) recursive calls. So this operation runs in worst-case O(log n) time.
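The recursion for peak can be sketched on a small hand-built tree. This sketch covers only the query: the class name is hypothetical, nodes are assembled bottom-up in the constructor, and AVL insertion and rebalancing are omitted, so it illustrates the augmentation and the one-sided range query rather than the full data structure.

```python
class RainNode:
    """Node of a rainfall tree keyed by time t, augmented with subtree max m."""
    def __init__(self, t, r, left=None, right=None):
        self.t, self.r = t, r
        self.left, self.right = left, right
        # Augmentation: max rainfall at this node or in either child subtree.
        self.m = max(r, subtree_max(left), subtree_max(right))


def subtree_max(p):
    """Augmented maximum of p's subtree, or zero if p is not a node."""
    return p.m if p is not None else 0


def peak(p, t):
    """Maximum rainfall in p's subtree among measurements at time >= t."""
    if p is None:
        return 0
    if p.t >= t:
        # p and its whole right subtree qualify; only the left needs recursion.
        return max(p.r, subtree_max(p.right), peak(p.left, t))
    # p and its whole left subtree are too early.
    return peak(p.right, t)
```

Each call recurses into at most one child, so the number of calls is bounded by the height of the tree.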
Note, this problem can also be solved where each latitude’s AVL tree is keyed by rainfall,
augmenting nodes with maximum time in subtree. We leave this as an exercise to the reader.
Rubric and common mistakes continued on S1.
For example, the left smaller count array of A = [10, 5, 12, 1, 11] is S = [0, 0, 2, 0, 3]. Describe an
O(n log n)-time algorithm to compute the left smaller count array of an array of n distinct integers.
State whether your algorithm’s running time is worst-case, amortized, and/or expected.
Solution: We compute values s_i increasing from i = 0 to i = n − 1 by maintaining at all times an
AVL tree T_i on integer keys a_0, …, a_{i−1}, where each AVL node p is augmented with the number
of nodes p.s in the subtree rooted at p. This augmentation can be maintained in constant time
at a node p by adding 1 to the sum of the augmented subtree sizes of p’s left and right children;
thus this augmentation can be maintained without affecting the asymptotic running time of
standard AVL tree operations.
To compute s_i, perform a one-sided range query on T_i. Specifically, let count(p, k) be the number
of nodes in the subtree rooted at p having key strictly less than k (or zero if p is not a node):
\[
\mathrm{count}(p, k) =
\begin{cases}
1 + p.\mathrm{left}.s + \mathrm{count}(p.\mathrm{right}, k) & \text{if } p.k < k \\
\mathrm{count}(p.\mathrm{left}, k) & \text{if } p.k \ge k
\end{cases}
\]
Then s_i is simply count(p, a_i) where p is the root of T_i, which can be computed using at most
O(log n) recursive calls. So computing s_i takes worst-case O(log n) time. We then maintain the
invariant by inserting a_i into T_i to form T_{i+1}. Repeating this procedure n times then computes
all s_i ∈ S in worst-case O(n log n) time.
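The count query and the size augmentation can be sketched as follows. To stay short, this sketch uses a plain unbalanced binary search tree in place of the AVL tree, so it does not achieve the O(n log n) worst-case bound (a sorted input degrades it to quadratic time); only the augmentation and the query mirror the solution. All names are hypothetical.

```python
class SizeNode:
    """BST node keyed by k, augmented with subtree size s."""
    def __init__(self, k):
        self.k = k
        self.left = None
        self.right = None
        self.s = 1


def size(p):
    """Augmented subtree size, or zero if p is not a node."""
    return p.s if p is not None else 0


def insert(p, k):
    """Plain BST insert that keeps sizes current (no AVL rebalancing)."""
    if p is None:
        return SizeNode(k)
    if k < p.k:
        p.left = insert(p.left, k)
    else:
        p.right = insert(p.right, k)
    p.s = 1 + size(p.left) + size(p.right)
    return p


def count(p, k):
    """Number of keys strictly less than k in the subtree rooted at p."""
    if p is None:
        return 0
    if p.k < k:
        return 1 + size(p.left) + count(p.right, k)
    return count(p.left, k)


def left_smaller_counts(A):
    """Left smaller count array of A: query before each insertion."""
    root, S = None, []
    for a in A:
        S.append(count(root, a))
        root = insert(root, a)
    return S
```

On the example from the problem statement, left_smaller_counts([10, 5, 12, 1, 11]) gives [0, 0, 2, 0, 3].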
Note, this problem can also be solved by modifying merge sort. A rigorous solution based on this
approach is continued on S2.
Common Mistakes:
You can use this paper to write a longer solution if you run out of space, but be sure to write
“Continued on S1” on the problem statement’s page.
Common Mistakes:
You can use this paper to write a longer solution if you run out of space, but be sure to write
“Continued on S2” on the problem statement’s page.
• If A_L[l] < A_R[r], we set A[i] ← A_L[l], S[i] ← S_L[l], and increment c, l, i by one.
• If A_L[l] > A_R[r], we set A[i] ← A_R[r], S[i] ← S_R[r] + c, and increment r, i by one.
Once this is completed, A is sorted and S contains the correct values, but in sorted order.
To recover the output in the original ordering of A, one can iterate over the original input,
search for each key (via binary search) in the sorted list, and output the corresponding S
value, which can be done in O(n log n) time. Alternatively, one can carry the original index of
each A[i] through the algorithm or use a hash map; as long as you clearly state how to
reconstruct the final output in the correct ordering in O(n log n) time, you get credit for
this part.
This algorithm runs in O(n log n) time, as the recurrence for the running time is
T(n) = 2T(n/2) + O(n); the indices i, l, r only increase in the (implicit) while-loop above.
(Continued on S3)
You can use this paper to write a longer solution if you run out of space, but be sure to write
“Continued on S3” on the problem statement’s page.
Solution: We prove correctness by induction. The base case is trivial: on a sub-array
of size 1, S[i] should be equal to 0. Now suppose our recursive calls correctly sorted A_L, A_R and
correctly computed S_L, S_R respectively on A_L and A_R. We show that A will be correctly sorted
(on A_L plus A_R) and S will be correctly computed on A. Since we know from merge sort that A is
correctly sorted, we do not need to prove this part. To show that S[i] is computed correctly (with
respect to A), consider the first case, when some element A_L[l] in the left half of A is being
added to A[i]. All elements in A_R appear after A_L[l] in the original array A, and thus if we
computed S_L[l] correctly from the recursive call, then S[i] is correctly computed, which is true
by the inductive hypothesis. Now consider the second case, where some element A_R[r] in the right
half of A is being added to A[i]. Within the right half, our recursive call correctly computed the
number of elements smaller than A_R[r], which is stored in S_R[r]. The number of elements in the
left half (A_L) that are smaller than A_R[r] is exactly equal to c, which counts the number of
elements from A_L that have been added to A so far. Since we increment c by one only when we add
an element from A_L to A, c is computed correctly. This proves that both A and S are correctly
computed at the end of the merge step.
(Common errors include: not using the counter (in our solution, c) correctly; sorting
A and computing differences between original indices and new indices; decreasing/increasing S
values during the merge step by iterating over L or R, which results in O(n^2) time during the
merge step.)
Since the problem asks for an O(n log n) solution and there exists a trivial O(n^2) solution (by
computing S[i] by iterating over j ∈ [1, i − 1]), there is no partial credit for any answer with
worse running time than O(n log n).