0% found this document useful (0 votes)
60 views6 pages

AA Exam 2021 Answers

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
60 views6 pages

AA Exam 2021 Answers

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

UNIVERSITY OF BRISTOL

January/February 2021 Examination Period

FACULTY OF ENGINEERING

Third Year Examination for the Degree of


Bachelor and Master of Engineering and Bachelor of Science

COMS30042
Advanced Algorithms

TIME ALLOWED:
2 Hours

This paper contains seven questions.


All answers will be used for assessment.
The maximum for this paper is 85 marks.

Other Instructions:
1. Calculators must have the Faculty of Engineering Seal of Approval.

TURN OVER ONLY WHEN TOLD TO START WRITING

Page 1 of 6
Q1. (a) Consider an array of n integers whose values are in the range 0, . . . , u − 1. Give an
algorithm that can sort the array in O(n log log u) time.
Solution: Make a Van Emde Boas tree from the integers taking O(n log log u)
time. In linear time find the largest element and call predecessor and delete re-
peatedly to get the array in reverse order, then reverse. Or we can find the smallest
element and call successor repeatedly.
[3 marks]
(b) State the space usage of the FKS hashing scheme and state the time complexity of
the construction.
Solution: Linear for both.
[2 marks]

Q2. Answer the following questions about Bloom filters which we use to represent a set of
elements. Bloom filters do not support deletions natively. However, your friend Martha
suggests a way round this problem. Her idea is as follows. Create two Bloom filters, one
called Badd and one callld Bdel . Whenever we want to add an element e to the set we
set the relevant places in Badd to 1. Whenever we want to delete an element e we set
the relevant places in Bdel to 1. In order to check if an element is in the set we have
to look at both Badd and Bdel . First check Badd . If it returns false then e is not in the
set. If it returns true then check Bdel . If this returns true then e is not in the set. If it
returns false then e is in the set.
(a) Does Martha’s idea give us the desirable property that false positives may occur but
false negatives are impossible. If you think yes then explain why and if no, give an
example that breaks this property.
Solution: You can have false positives even if for example you haven’t deleted
anything. You can also have false negatives by having a false positive in Bdel .
[5 marks]
(b) What is the time complexity of a lookup using Martha’s idea?
Solution: If the Bloom filter uses r hash functions then it is O(r ) time because
we need to do r hash function calls twice.
[5 marks]
(c) Another way to handle deletions is to store an m-bit number in the array positions.
In practice it turns out that 4 bits normally suffices. Describe how insert, delete and
lookup queries could be supported in such an extended Bloom filter.
Solution: The insert operation then increments the values at the relevant places
by 1 and the delete operation decrements the value of each of the locations by 1.
A lookup checks to see if all the values at the relevant locations equal 0.

Page 2 of 6 Qu. continues . . .


(cont.)

[5 marks]
(d) Does your solution using m-bit numbers in the array positions have the desirable
property that false positives may occur but false negatives are impossible? If you
think yes then explain why and if no, give an example that breaks this property.
Solution: No. If we insert x and y which have the same hash locations and then
delete y twice and then query x we will get a false negative.
[5 marks]
(e) Imagine we want to use a Bloom filter of size 800 bits for a Web cache which stores
up to 100 URLs picked from a universe of 1000000 URLs. How would you calculate
the optimal number of hash functions to use to minimise the false positive probability
and what false positive probability does it give? You can give the probability to one
decimal place.
Solution: The number of hash functions is 800/(100e) ∼ 2.9. In practice you
should round this up to 3. The false positive probability is ≈ 0.05
[5 marks]

Q3. A hash function family H = {h1 , h2 , . . . } with hi : U → T is pairwise independent iff for
randomly and uniformly chosen h ∈ H, we have that for any distinct x, y ∈ U and pair
of values u, v ∈ T , Pr(h(x) = u ∧ h(y ) = v ) = 1/|T |2 .
Let x1 , . . . , xn be different values chosen from U.
(a) Pick at random a1 , . . . , an ∈ {0, 1} and define

ha1 ,...,an (x1 , . . . , xn ) = a1 x1 + · · · + an xn mod 2

Prove that this family is pairwise independent or give a counter-example.


Solution:
This is not pairwise independent. Consider the zero string input x1 = x2 = · · · =
xn = 0. It always maps to zero and any other string will map to zero with probability
1/2. Hence the probability they are equal is 1/2 not 1/4.
[5 marks]
(b) Pick at random a0 , . . . , an ∈ {0, 1} and define

ha0 ,...,an (x1 , . . . , xn ) = a0 + a1 x1 + · · · + an xn mod 2

Let u = u1 , . . . , un and v = v1 , . . . , vn . ui and vi must differ at at least one index i


and we can ignore all other indices. We can regard a1 , . . . , an as selecting a subset of
values in the two inputs u and v . You can assume without proof that the probability
of the size of the subset of mismatching indices being even or odd size is equal.
Prove that this hash function family is pairwise independent.

Page 3 of 6 Turn Over/. . .


Solution: We only have to observe that if the size of the subset of mismatching
indices is even then h(u) = h(v ), if it is odd then h(u) 6= h(v ). The choice of
value for a0 flips parity of the subset size for both u and v . For the even case the
probability they are either both 0 or both 1 each has equal probability of 1/4. For
the odd case 0, 1 and 1, 0 each also have probability 1/4 as desired.
[5 marks]

Q4. This question is about devising a fast algorithm to count the number of distinct substrings
of a string of length n, including the empty string. For example, the string abb has five
distinct non-empty substrings which are (a, b, ab, bb, abb).
(a) i. Construct the atomic suffix tree of abb. The suffix tree you construct should
have exactly three leaves as we do not need to add an edge from the root that
points only to the sentinel.

root
a b

b $ b

b $

Solution:
[5 marks]
ii. What property of an atomic suffix tree gives you the number of distinct non-
empty substrings for a string?
Solution: The number of internal nodes excluding the root.
[5 marks]
iii. Give an algorithm that runs in O(n) time to count the number of distinct non-
empty substrings of a string of length n. You should provide a short explanation
in English of why your algorithm is correct and why its running time is O(n).
You may assume anything given in the lectures as part of your solution. [Hint:
The atomic tree has size O(n2 ).]
Solution: Construct a (compacted) suffix tree in O(n) and sum up the lengths
of all the edges in the tree.
[5 marks]
(b) This part of the question is about the DC3 suffix array construction algorithm. In
the DC3 algorithm indices B1 = {i mod 3 = 1} and B2 = {i mod 3 = 2} are
defined. In the first stage of the algorithm it computes the suffix array for indices
B1 ∪ B2 from a string of twice the length of the input. Show the suffix array that
is obtained in this way by the DC3 algorithm in its first stage using the input string

Page 4 of 6
bananas.
Solution: (0, 3, 2, 1)
[5 marks]

Q5. Consider the following problem which we will call MaxIncome. We are given work values
a1 , . . . , an and a bound b and incomes c1 , . . . , cn with bound k. Work ai gives income
ci . Is there a way of choosing work values so that their sum is at most b and the
corresponding total income is at least k?
(a) Show that MaxIncome is in NP. [5 marks]
(b) Show that MaxIncome is NP-complete by reducing it from one of the NP-complete
problems that has been covered in the lectures. [5 marks]

Solution: To check if MaxIncome is in NP we check in linear time if the total of the


work values is at most b and the sum of the incomes is at least k.
For the reduction, we reduce from SubsetSum. Create an instance of MaxIncome with

ai = ci = si
b=k =t

We now have:
P P
i ≤ b ⇐⇒
a Pi∈S si ≤ t ⇐⇒
X
i∈S
P si = t
i∈S ci ≤ k ⇐⇒ i∈S si ≥ t i∈S

A yes answer to the new problem gives a yes answer to the original problem. Similarly
a no answer to the new problem gives a new answer to the original problem.

Q6. Consider a 2D version of the RMQ problem. We are given a square n×n grid of integers.
A query RMQ (y1 , x1 , y2 , x2 ) returns an index of a smallest value in the rectangle with
bottom left coordinate (y1 , x1 ) and top right coordinate (y2 , x2 ). We can assume that
y1 < y2 and x1 < x2 .
Consider a solution which precomputes RMQ (y1 , x1 , y1 + ly , x1 + lx ) for all y1 , x1 , ly ∈
{21 , 22 , 23 , . . .} and lx ∈ {21 , 22 , 23 , . . .} and store the results in a table.
(a) Explain a fast method for performing the precomputation and give its running time
and gives its time complexity.
Solution: We use the same idea as in the 1D case. That is for a given bottom left
(y1 , x1 ) and fixed ly we find all the RMQs for lx = k and then use these to get the
RMQs for lx = 2k and so on doubling. This is O(log n) work in each dimension
for all O(n2 ) bottom left corners, giving O(n2 log2 n) time overall.

Page 5 of 6 Turn Over/Qu. continues . . .


(cont.)

[5 marks]
(b) How can queries be performed in O(1) time once this precomputation has been
finished?
Solution: Any rectangular query can be cover by 4 overlapping rectangles with
both dimensions being a power of 2. We can then make 4 queries to our precom-
puted table and take the minimum of these.
[5 marks]

Q7. Construct the Cartesian tree for the input 7, 6, 1, 7, 5, 8, 3, 4.

6 3

7 5 4

7 6
Solution:

[5 marks]

Page 6 of 6
END OF PAPER

You might also like