
CS 473, Spring 2020
Homework 5
Due Wednesday, March 11, 2020 at 9pm

1. Suppose we are hashing from the universe U = {0, 1, . . . , 2^w − 1} of w-bit strings to a hash table of size m = 2^ℓ; that is, we are hashing w-bit words into ℓ-bit labels. To define our universal family of hash functions, we think of words and labels as boolean vectors, and we specify our hash function by choosing a random ℓ × w boolean matrix. For any ℓ × w matrix M of 0s and 1s, define the hash function h_M : {0, 1}^w → {0, 1}^ℓ by the boolean matrix-vector product

\[ h_M(x) \;=\; Mx \bmod 2 \;=\; \bigoplus_{i=1}^{w} M_i x_i \;=\; \bigoplus_{i \,:\, x_i = 1} M_i, \]

where ⊕ denotes bitwise exclusive-or (that is, addition mod 2), M_i denotes the ith column of M, and x_i denotes the ith bit of x.

(a) Prove that the family M = {h_M | M ∈ {0, 1}^{ℓ×w}} of all random-matrix hash functions is universal.
Solution: Pick two arbitrary distinct words x ≠ y; we need to prove that Pr_M[h_M(x) = h_M(y)] ≤ 1/m. Definition-chasing implies that for any matrix M, we have h_M(x) = h_M(y) if and only if M(x ⊕ y) mod 2 = 0. Thus, it suffices to prove for any non-zero word z ∈ {0, 1}^w that Pr_M[Mz mod 2 = 0] ≤ 1/m.
Fix an arbitrary non-zero word z, and suppose z_j = 1. Then we have

\[ Mz \bmod 2 = 0 \iff \bigoplus_{i} M_i z_i = 0 \iff \bigoplus_{i \ne j} M_i z_i = M_j. \]

If we fix all the entries in M except the jth column M_j, the left side of the last equation is a fixed vector, but the right side is still uniformly random. The probability that the random vector M_j equals the fixed vector ⊕_{i≠j} M_i z_i is exactly 1/m. „

Rubric: 4 points
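For small parameters the claim can also be checked exhaustively. The following Python sketch (my own addition, not part of the solution) represents an ℓ × w matrix by its w columns, enumerates all such matrices for w = 3 and ℓ = 2, and verifies that every pair of distinct words collides under at most a 1/m fraction of them.

    from itertools import product

    w, ell = 3, 2              # small enough for exhaustive enumeration
    m = 2 ** ell               # number of labels

    def h(M, x):
        # XOR together the columns M[i] for which bit i of x is 1
        out = 0
        for i in range(w):
            if (x >> i) & 1:
                out ^= M[i]
        return out

    # A matrix is represented by its w columns, each an ell-bit integer.
    matrices = list(product(range(m), repeat=w))

    for x in range(2 ** w):
        for y in range(x + 1, 2 ** w):
            collisions = sum(h(M, x) == h(M, y) for M in matrices)
            assert collisions * m <= len(matrices)   # Pr[collision] <= 1/m
    print("universal: verified for w = 3, ell = 2")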

(b) Prove that M is not uniform.

Solution: h(0) = 0 for all h ∈ M, so Pr[h(0) = 0] = 1 rather than 1/m. „

Rubric: 1 point


(c) Now consider a modification of the previous scheme, where we specify a hash function by a random matrix M ∈ {0, 1}^{ℓ×w} and an independent random offset vector b ∈ {0, 1}^ℓ:

\[ h_{M,b}(x) \;=\; (Mx + b) \bmod 2 \;=\; \Bigl( \bigoplus_{i=1}^{w} M_i x_i \Bigr) \oplus b. \]

Prove that the family M⁺ of all such functions is strongly universal.

Solution: Fix two arbitrary distinct words x ≠ y and two labels u and v; we need to compute the probability that h(x) = u and h(y) = v.
Because x ≠ y, there is some index i with x_i ≠ y_i; without loss of generality, suppose x_i = 0 and y_i = 1. Fix all the entries in M except in column M_i; then

\[ h(x) = \bar u \oplus b \qquad\text{and}\qquad h(y) = \bar v \oplus M_i \oplus b \]

for some fixed vectors ū and v̄; specifically,

\[ \bar u = \bigoplus_{j \ne i} M_j x_j \qquad\text{and}\qquad \bar v = \bigoplus_{j \ne i} M_j y_j. \]

Definition-chasing now implies that h(x) = u and h(y) = v if and only if

\[ b = u \oplus \bar u \qquad\text{and}\qquad M_i = u \oplus \bar u \oplus v \oplus \bar v. \]

The vectors b and M_i are uniform and independent, so the probability of this event is exactly 1/m². „

Rubric: 4 points
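The same kind of exhaustive check works here (again my own sketch, not part of the solution): enumerate every pair (M, b) for w = 3 and ℓ = 2, and confirm that each pair of distinct words lands on each pair of labels exactly a 1/m² fraction of the time.

    from itertools import product

    w, ell = 3, 2
    m = 2 ** ell

    def h(M, b, x):
        # (Mx + b) mod 2, with M given as w columns, each an ell-bit integer
        out = b
        for i in range(w):
            if (x >> i) & 1:
                out ^= M[i]
        return out

    funcs = [(M, b) for M in product(range(m), repeat=w) for b in range(m)]

    for x in range(2 ** w):
        for y in range(2 ** w):
            if x == y:
                continue
            for u in range(m):
                for v in range(m):
                    hits = sum(h(M, b, x) == u and h(M, b, y) == v
                               for (M, b) in funcs)
                    assert hits * m * m == len(funcs)   # exactly 1/m^2
    print("strongly universal: verified for w = 3, ell = 2")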

(d) Prove that M⁺ is not 4-uniform.

Solution: h(3) = h(0) ⊕ h(1) ⊕ h(2) for all h ∈ M⁺: indeed, h(0) ⊕ h(1) ⊕ h(2) = b ⊕ (M_1 ⊕ b) ⊕ (M_2 ⊕ b) = M_1 ⊕ M_2 ⊕ b = h(3). „

Rubric: 1 point

(e) [Extra credit] Prove that M⁺ is actually 3-uniform.

Solution: Fix three distinct words x, y, z and three (not necessarily distinct) labels s, t, u. Because x, y, z are distinct, there must be two indices i ≠ j such that the bit-pairs (x_i, x_j), (y_i, y_j), (z_i, z_j) are also distinct. Fix all entries in M except those in columns M_i and M_j. Up to permutations of the variable names, there are only three cases to consider.

• Case 1: No (1, 1). Suppose x_i = x_j = y_j = z_i = 0 and y_i = z_j = 1. Then

\[ h(x) = \bar s \oplus b \qquad h(y) = \bar t \oplus M_i \oplus b \qquad h(z) = \bar u \oplus M_j \oplus b \]

for some fixed vectors s̄, t̄, and ū. It follows that h(x) = s and h(y) = t and h(z) = u if and only if

\[ b = s \oplus \bar s \qquad M_i \oplus b = t \oplus \bar t \qquad M_j \oplus b = u \oplus \bar u \]

and therefore (by solving the system of three mod-2 linear equations) if and only if

\[ M_i = s \oplus \bar s \oplus t \oplus \bar t \qquad M_j = s \oplus \bar s \oplus u \oplus \bar u \qquad b = s \oplus \bar s. \]

• Case 2: No (0, 1). Suppose x_i = x_j = y_j = 0 and y_i = z_i = z_j = 1. Then

\[ h(x) = \bar s \oplus b \qquad h(y) = \bar t \oplus M_i \oplus b \qquad h(z) = \bar u \oplus M_i \oplus M_j \oplus b \]

for some fixed vectors s̄, t̄, and ū. It follows that h(x) = s and h(y) = t and h(z) = u if and only if

\[ b = s \oplus \bar s \qquad M_i \oplus b = t \oplus \bar t \qquad M_i \oplus M_j \oplus b = u \oplus \bar u \]

and therefore if and only if

\[ M_i = s \oplus \bar s \oplus t \oplus \bar t \qquad M_j = t \oplus \bar t \oplus u \oplus \bar u \qquad b = s \oplus \bar s. \]

• Case 3: No (0, 0). Suppose x_j = y_i = 0 and x_i = y_j = z_i = z_j = 1. Then

\[ h(x) = \bar s \oplus M_i \oplus b \qquad h(y) = \bar t \oplus M_j \oplus b \qquad h(z) = \bar u \oplus M_i \oplus M_j \oplus b \]

for some fixed vectors s̄, t̄, and ū. It follows that h(x) = s and h(y) = t and h(z) = u if and only if

\[ M_i \oplus b = s \oplus \bar s \qquad M_j \oplus b = t \oplus \bar t \qquad M_i \oplus M_j \oplus b = u \oplus \bar u \]

and therefore if and only if

\[ M_i = t \oplus \bar t \oplus u \oplus \bar u \qquad M_j = s \oplus \bar s \oplus u \oplus \bar u \qquad b = s \oplus \bar s \oplus t \oplus \bar t \oplus u \oplus \bar u. \]

In all three cases, we have found unique values of M_i, M_j, and b such that h(x) = s and h(y) = t and h(z) = u. We conclude that Pr[(h(x) = s) ∧ (h(y) = t) ∧ (h(z) = u)] = 1/m³, as required. „

Rubric: Max 5 points extra credit. This is not the only correct proof, or even necessarily the
simplest proof.


2. (a) Prove that the item returned by GetOneSample(S) is chosen uniformly at random
from S.
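The pseudocode for GetOneSample, with its line marked (?), appears in the problem handout and is not reproduced here. As a reference point for the analysis below, here is a minimal Python sketch (my reconstruction) of the single-item reservoir-sampling routine it describes; the placement of line (?) is my assumption, chosen so that it executes with probability 1/ℓ in iteration ℓ, as parts (b)–(d) require.

    import random

    def get_one_sample(stream):
        """Reservoir sampling with a reservoir of size 1 (reconstruction)."""
        sample = None
        ell = 0
        for x in stream:
            ell += 1
            if random.randrange(ell) == 0:   # true with probability 1/ell
                sample = x                   # <-- line (?)
        return sample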
Solution: Let P(i, n) denote the probability that GetOneSample(S) returns the ith item in a given stream S of length n. We consider the behavior of the last iteration of the algorithm. With probability 1/n, GetOneSample returns the last item in the stream, and with probability (n − 1)/n, GetOneSample returns the recursively computed sample of the first n − 1 elements of S. Thus, we have the following recurrence for all positive integers i and n:

\[ P(i, n) = \begin{cases} 0 & \text{if } i > n \\ 1/n & \text{if } i = n \\ \frac{n-1}{n}\,P(i, n-1) & \text{if } i < n \end{cases} \]

The recurrence includes the implicit base case P(i, 0) = 0 for all i. The induction hypothesis implies that P(i, n − 1) = 1/(n − 1) for all i < n. It follows immediately that P(i, n) = 1/n for all i ≤ n, as required. „

Rubric: 3 points.

Solution: See our solution to problem 3 (with k = 1). „

Rubric: Full credit if and only if (1) the proposed algorithm for problem 3 is actually equivalent to GetOneSample when k = 1, (2) the proposed solution to problem 3 includes a correct proof of uniformity, and (3) the proposed solution to problem 3 doesn't refer to problem 2(a).

(b) What is the exact expected number of times that GetOneSample(S) executes line (?)?

Solution: Let X_i be the indicator variable that equals 1 if line (?) is executed in the ith iteration of the main loop; then X = Σ_i X_i is the total number of executions of line (?). The algorithm immediately gives us E[X_i] = Pr[X_i = 1] = 1/i, so linearity of expectation implies

\[ E[X] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} \frac{1}{i} = H_n. \] „

Rubric: 2 points. 1 point for “O(log n)”.
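As a quick sanity check (my own addition, assuming the reconstruction of GetOneSample sketched above and a stream length of n = 100), a simulation matches H_n:

    import random

    def executions(n):
        # Count how many times line (?) fires on a stream of length n.
        return sum(random.randrange(ell) == 0 for ell in range(1, n + 1))

    n, trials = 100, 200_000
    empirical = sum(executions(n) for _ in range(trials)) / trials
    exact = sum(1 / i for i in range(1, n + 1))          # H_n
    print(f"empirical {empirical:.3f}  vs  H_n = {exact:.3f}")   # both about 5.19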

(c) What is the exact expected value of ℓ when GetOneSample(S) executes line (?) for the last time?

Solution: Let ℓ* denote the value of ℓ when GetOneSample(S) executes line (?) for the last time. Part (a) immediately implies that ℓ* is uniformly distributed between 1 and n. It follows that E[ℓ*] = (n + 1)/2. „

Rubric: 2 points. 1 point for “O(n)”.


(d) What is the exact expected value of ℓ when GetOneSample(S) executes line (?) for the second time, or the algorithm ends, whichever happens first?

Solution: Let ℓ₂ denote the value of ℓ when GetOneSample(S) executes line (?) for the second time, or the algorithm ends, whichever happens first. To determine

\[ E[\ell_2] = \sum_{i=1}^{n} \Pr[\ell_2 \ge i], \]

we compute Pr[ℓ₂ ≥ i] for all i.
We immediately have Pr[ℓ₂ ≥ 1] = 1 and Pr[ℓ₂ ≥ i] = 0 for all i > n. For any i such that 2 ≤ i ≤ n, we have ℓ₂ ≥ i if and only if line (?) is not executed in iterations 2 through i − 1 of the main loop, so

\[ \Pr[\ell_2 \ge i] = \prod_{j=2}^{i-1} \frac{j-1}{j} = \frac{1}{i-1}. \]

We conclude that

\[ E[\ell_2] = \sum_{i=1}^{n} \Pr[\ell_2 \ge i] = 1 + \sum_{i=2}^{n} \frac{1}{i-1} = H_{n-1} + 1. \] „

Rubric: 3 points. This is not the only correct proof. 2 points for writing “O(log n)”; 1 point
for an exact answer that is incorrect but still O(log n).
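Again a quick simulation (mine; n = 100 assumed) agrees with H_{n−1} + 1. Note that the first execution of line (?) always happens at ℓ = 1, since Random(1) must return 1:

    import random

    def ell_at_second_execution(n):
        # First execution is always at ell = 1; find the next, or stop at n.
        for ell in range(2, n + 1):
            if random.randrange(ell) == 0:
                return ell
        return n   # line (?) never fired again before the stream ended

    n, trials = 100, 200_000
    empirical = sum(ell_at_second_execution(n) for _ in range(trials)) / trials
    exact = 1 + sum(1 / i for i in range(1, n))          # H_{n-1} + 1
    print(f"empirical {empirical:.3f}  vs  exact {exact:.3f}")   # both about 6.18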


3. (This is a continuation of the previous problem.) Describe and analyze an algorithm that
returns a subset of k distinct items chosen uniformly at random from a data stream of
length at least k. Prove your algorithm is correct.

Solution (O(k) time per element, 8/10): Intuitively, we can chain together k independent instances of GetOneSample, where the items rejected by each instance are treated as the input stream for the next instance. Folding these k instances together gives us the following algorithm, which runs in O(kn) time. The algorithm maintains a single array sample[1 .. k].

GetSamples(S, k):
    ℓ ← 0
    while S is not done
        x ← next item in S
        ℓ ← ℓ + 1
        for j ← 1 to min{k, ℓ}
            if Random(ℓ − j + 1) = 1
                swap x ↔ sample[j]
    return sample[1 .. k]

Correctness follows inductively from the correctness of GetOneSample as follows. The analysis in part (a) implies that sample[1] is equally likely to be any element of S. Let S′[1 .. n − 1] be the sequence of items rejected by the first instance; at each iteration ℓ ≥ 2, item S′[ℓ − 1] is either S[ℓ] (with probability 1 − 1/ℓ) or the previous value of sample[1] (with probability 1/ℓ). The output array sample[2 .. k] has the same distribution as the output array from GetSamples(S′, k − 1). Thus, the inductive hypothesis implies that sample[2 .. k] contains k − 1 distinct items chosen uniformly at random from S \ {sample[1]}. We conclude that sample[1 .. k] contains k distinct items chosen uniformly at random from S, as required. „
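For concreteness, here is a direct Python transcription of GetSamples (my own, using zero-based indices, with random.randrange(r) standing in for Random(r)):

    import random

    def get_samples(stream, k):
        """k chained single-item reservoirs, O(k) work per stream element."""
        sample = [None] * k
        ell = 0
        for x in stream:
            ell += 1
            for j in range(1, min(k, ell) + 1):
                # Instance j has now seen ell - j + 1 items, so it keeps the
                # current item with probability 1/(ell - j + 1).
                if random.randrange(ell - j + 1) == 0:
                    sample[j - 1], x = x, sample[j - 1]   # rejected item flows on
        return sample

    print(get_samples(iter(range(1, 101)), 5))   # five distinct items from 1..100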

Solution (O(1) time per element, 10/10): To develop the algorithm, let’s start with
a standard algorithm for a different problem. Recall from the lecture notes that the
Fisher-Yates algorithm randomly permutes an arbitrary array. (This algorithm is
called InsertionShuffle in the notes.)

FisherYates(S[1 .. n]):
    for ℓ ← 1 to n
        j ← Random(ℓ)
        swap S[j] ↔ S[ℓ]
    return S

We can prove that each of the n! possible permutations of the input array has the same probability of being output by FisherYates, by combining two observations:

• First, there are exactly n! possible outcomes of all the calls to Random. Specifically, the ℓth call to Random has ℓ possible outcomes, so the total number of outcomes is exactly ∏_{ℓ=1}^{n} ℓ = n!.

• Second, every permutation of S can be computed by FisherYates. Specifically, the location of S[n] in the output array gives us a unique outcome for the last call to Random, and (after undoing the last swap) the induction hypothesis implies that there is a sequence of Random outcomes that produces any desired permutation of S[1 .. n − 1]. (The base case n = 0 is trivial.)

In particular, each subset of k distinct elements of S has the same probability of appearing in the prefix S[1 .. k] of the output array.

Now suppose we modify the Fisher-Yates algorithm in two different ways. First, let the input be given in a stream rather than an explicit array; in particular, we do not know the value of n in advance. Second, we maintain only the first k elements of the output array.

FYSample(S, k):
    ℓ ← 0
    while S is not done
        x ← next item in S
        ℓ ← ℓ + 1
        j ← Random(ℓ)
        if ℓ ≤ k
            sample[ℓ] ← sample[j]
        if j ≤ k
            sample[j] ← x
    return sample[1 .. k]

The k elements produced by this modified algorithm have exactly the same
distribution as the first k elements of the output of FisherYates. Thus, FYSample
chooses a k-element subset of the stream uniformly at random, as required. The
algorithm runs in O(n) time, using only O(k) space. „
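Similarly, a Python rendering of FYSample (my sketch, zero-based):

    import random

    def fy_sample(stream, k):
        """Truncated streaming Fisher-Yates: keep only the first k output slots."""
        sample = []
        ell = 0
        for x in stream:
            ell += 1
            j = random.randrange(ell)        # uniform in {0, ..., ell - 1}
            if ell <= k:
                # Slots j and ell-1 are both kept, so perform the full swap.
                sample.append(x if j == ell - 1 else sample[j])
                if j != ell - 1:
                    sample[j] = x
            elif j < k:
                sample[j] = x                # old sample[j] would move past slot k
        return sample

    print(fy_sample(iter(range(1, 101)), 5))   # five distinct items from 1..100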

Rubric: 10 points = 5 for the algorithm + 5 for the proof. Max 8 points for O(kn)-time algorithm;
scale partial credit. These are neither the only correct algorithms nor the only correct proofs for
these algorithms.
