28 Randomized Algorithms
Lecture 28
Announcements
Three Concept checks this week (to get to a total of 30)
Concept Check 28 is a “standard” one for today’s lecture
Concept Check 29 asks you to fill out the official UW course evals.
Please fill these out! It’s super useful to know what you found valuable, what you found
frustrating, what took too long,…we are always working to improve the course.
Concept Check 30 asks you to fill out a different anonymous form with
specific feedback about the real world assignments.
The real world assignments were new this quarter, so the form asks about
logistics (e.g., would you rather have had these as part of regular homeworks,
like the programming questions, or as separate work like we did?), for any
examples to add to the suggested lists (like in real world 1/2), etc.
Final Logistics
What’s Fair Game?
All the content/theorems we learned.
Principles behind the programming questions, but not the code itself.
Applications lectures won’t be tested directly (that is to say, we won’t give you
a problem that would only make sense if you followed the applications lectures).
But the principles behind the applications are absolutely fair game.
And we reserve the right to give a problem inspired by the applications
lectures, as long as you could do it even if you didn’t see the applications.
E.g. we could ask a question about covariance (because you know the formula
from main lectures) but we wouldn’t ask you to think about the ML
implications of a picture of covariances.
What’s a randomized algorithm?
A randomized algorithm is an algorithm that uses randomness in the
computation.
Well, ok.
Let’s get a little more specific.
Two common types of algorithms
Las Vegas Algorithm
Always tells you the right answer
Takes varying amounts of time.
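Quick Sort
[Figure: quicksort on the array 20, 50, 70, 10, 60, 40, 30. Pivot 20 splits it
into (10) and (50, 70, 60, 40, 30). Pivot 50 splits the second part into (40, 30)
and (70, 60), which sort to (30, 40) and (60, 70). These combine to
(30, 40, 50, 60, 70), and the final result is 10, 20, 30, 40, 50, 60, 70.]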
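Total time: (number of levels) $\cdot\; O(n)$
How long does it take?
Well…it depends on what pivots you pick.
Partitioning does $O(k)$ work when $k$ elements are remaining in a subarray, so
each level of the recursion costs $O(n)$ in total. What varies is the number of
levels.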
For Simplicity
We’ll talk about how quicksort is really done at the end.
For now an easier-to-analyze version:
Cutting the array in half at every step is ideal: that gives about $\log_2 n$ levels.
We only get the perfect pivot with probability about $\frac{1}{n}$. That’s not very likely…
maybe we can settle for something more likely.
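For reference, here is a minimal sketch of a quicksort that picks its pivot uniformly at random. It assumes that this is the kind of simplified version the analysis has in mind. The function name `quicksort` and the list-based (not in-place) partitioning are illustrative choices, not taken from the lecture.

```python
import random


def quicksort(values, rng=random):
    """Return a sorted copy of `values`, choosing each pivot uniformly at random.

    Illustrative sketch: it builds new lists instead of partitioning in place,
    which keeps the recursion easy to read.
    """
    if len(values) <= 1:
        return list(values)
    pivot = values[rng.randrange(len(values))]
    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    larger = [v for v in values if v > pivot]
    # Each call does O(k) work on a subarray of size k before recursing.
    return quicksort(smaller, rng) + equal + quicksort(larger, rng)
```

The output is always correctly sorted; only the depth of the recursion (and so the running time) depends on the random pivots, which is what makes this a Las Vegas algorithm.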
Focus on an element
Let’s focus on one element of the array, $x_i$.
The recursion will stop when every element is all alone in its own
subarray.
Call an iteration “good for $x_i$” if the array containing $x_i$ in the next step is at
most $\frac{3}{4}$ the size it was in the current step.
[Figure: the array divided into three regions by pivot position.
Pivot near the small end: might leave $x_i$ in a big subarray (if $x_i$ is big).
Pivot in the middle: both subarrays are at most $\frac{3}{4}$ the size, so the iteration must be good for $x_i$.
Pivot near the large end: might leave $x_i$ in a big subarray (if $x_i$ is small).]
Good for $x_i$
At least half of the potential pivots guarantee $x_i$ ends up with a good iteration.
So we’ll use $P(\text{good for } x_i) \geq \frac{1}{2}$.
It’s actually quite a bit more than half for large arrays – a pivot in one of the two red
regions might still be good for $x_i$ (just bad for the others in the array), and
$x_i$ might be our pivot, in which case it’s totally done.
To avoid any tedious special cases for small arrays, just say at least $\frac{1}{2}$.
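The "at least half" claim can be checked by brute force for small arrays. The sketch below assumes distinct elements, a pivot chosen uniformly at random, and a "good" threshold of 3/4 of the current size. The helper name `good_pivot_fraction` is hypothetical.

```python
def good_pivot_fraction(n, rank):
    """Fraction of pivot choices that are good for the element of the given rank.

    `rank` is the 0-indexed position of the element in sorted order among n
    distinct elements. A pivot is counted as good if the subarray holding the
    element afterwards has size at most 3/4 * n, or if the element itself is
    the pivot (it is then finished).
    """
    good = 0
    for pivot in range(n):
        if pivot == rank:
            good += 1  # the element is the pivot, so it is done
            continue
        # Elements smaller than the pivot: `pivot` of them; larger: n - 1 - pivot.
        size = pivot if rank < pivot else n - 1 - pivot
        if 4 * size <= 3 * n:  # integer form of size <= (3/4) * n
            good += 1
    return good / n
```

For example, `good_pivot_fraction(100, 50)` counts 52 good pivots out of 100, so the bound of 1/2 holds with a little room to spare.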
How many levels?
How many levels do we need to go?
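Once $x_i$ is in a size 1 subarray, it’s done. How many iterations does it take?
Each good iteration shrinks the subarray containing $x_i$ to at most $\frac{3}{4}$ of its
size, so if we only had good iterations, we’d need at most
$\log_{4/3} n \in O(\log n)$.
I want (at the end of our process) to say with probability at least <blah> the
running time is at most $O(n \log n)$.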
What’s the probability of getting a lot of good iterations…what’s the tool we
should use?
Fill out the poll everywhere so Robbie
knows how long to explain
Go to pollev.com/cse312
Needed iterations
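$x_i$ is done after $\log_{4/3} n$ good-for-$x_i$ iterations.
Let’s imagine we do $c \ln n$ iterations for a suitable constant $c$. Let $X$ be the number of good-for-$x_i$ iterations. Let …
This argument so far applies equally to any other element $x_j$, but the events for different elements aren’t independent,
so….union bound!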
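One way the argument can be completed is sketched below. It uses the Chernoff bound $P(Y \le (1-\delta)\mu) \le e^{-\delta^2 \mu / 2}$ for a binomial $Y$ with mean $\mu$ and $0 < \delta < 1$. The specific choices $m = 24 \ln n$ iterations and $\delta = \frac{2}{3}$ are illustrative assumptions that make the arithmetic clean, not necessarily the constants used in lecture. Since each iteration is good for $x_i$ with probability at least $\frac{1}{2}$ whatever happened before (counting iterations after $x_i$ is finished as good), $X$ is at least as likely to be large as $Y \sim \mathrm{Bin}(m, \frac{1}{2})$.

```latex
\begin{align*}
m &= 24\ln n, \qquad Y \sim \mathrm{Bin}\!\left(m, \tfrac{1}{2}\right), \qquad \mu = E[Y] = 12\ln n \\
\log_{4/3} n &= \frac{\ln n}{\ln(4/3)} \approx 3.48\,\ln n \;\le\; 4\ln n = \left(1 - \tfrac{2}{3}\right)\mu \\
P\!\left(X < \log_{4/3} n\right) &\le P\!\left(Y \le \left(1 - \tfrac{2}{3}\right)\mu\right)
  \le \exp\!\left(-\frac{(2/3)^2 \cdot 12\ln n}{2}\right) = n^{-8/3} \\
P(\text{some element is not done after } m \text{ iterations}) &\le n \cdot n^{-8/3} = n^{-5/3}
\end{align*}
```

The last line is the union bound over all $n$ elements, which needs no independence. Under these assumptions, with probability at least $1 - n^{-5/3}$ the recursion has at most $24 \ln n$ levels, and at $O(n)$ work per level the running time is $O(n \log n)$.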
Another strategy: find the true median (very fancy, very impractical: take
421)
Algorithms with some probability of failure
There are also algorithms that sometimes give us the wrong answer. (Monte
Carlo Algorithms)
Wait, why would we accept a probability of failure?
How many independent runs of the algorithm do we need to get the right
answer with high probability?
Small Probability of Failure
How many independent runs of the algorithm do we need to get the right
answer with high probability?
Probability of failure
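As a rough illustration of the "how many runs" question, suppose each run fails independently with the same probability, and the overall procedure fails only if every run fails (for example, because a correct answer can be recognized when it appears). Both are assumptions made for this sketch, and the name `runs_needed` is hypothetical. Then $k$ runs all fail with probability $p^k$, and we want the smallest $k$ that pushes this below a target.

```python
def runs_needed(failure_prob, target):
    """Smallest k such that failure_prob ** k <= target.

    Assumes runs are independent and that the repeated procedure fails only
    when all k runs fail.
    """
    if not (0 < failure_prob < 1 and 0 < target < 1):
        raise ValueError("both probabilities must be strictly between 0 and 1")
    runs = 0
    all_fail = 1.0  # probability that every run so far has failed
    while all_fail > target:
        all_fail *= failure_prob
        runs += 1
    return runs
```

With a failure probability of 1/2 per run, `runs_needed(0.5, 0.001)` returns 10, since $2^{-10} \approx 0.00098$. The number of runs grows only logarithmically in one over the target failure probability.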