Slides04 Selection
Selection
Selection Problem
[Figure: a problem divided into small problems]
▪ Choose 𝑣 = 3.
1 2 5 3 1 3 4 0
▪ What are L, M, and R?
1 2 5 3 1 3 4 0
▪ L: 1 2 1 0
▪ M: 3 3
▪ R: 5 4
Divide
▪ L: 1 2 1 0
▪ M: 3 3
▪ R: 5 4
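The divide step on this example can be sketched in a few lines of Python. The function name `divide` is a hypothetical helper chosen for illustration; it is not from the slides.

```python
def divide(xs, v):
    """Split xs into (L, M, R): elements below, equal to, and above v."""
    left = [x for x in xs if x < v]
    mid = [x for x in xs if x == v]
    right = [x for x in xs if x > v]
    return left, mid, right

# The slide example with pivot v = 3.
print(divide([1, 2, 5, 3, 1, 3, 4, 0], 3))
# prints ([1, 2, 1, 0], [3, 3], [5, 4])
```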
Recurse
1 2 1 0 3 3 5 4
▪ L: 1 2 1 0
– If 𝑥∗ is in L, 𝑥∗ is the 𝑘-th smallest integer in L
▪ M: 3 3
– If 𝑥∗ is in M, 𝑥∗ = 3
▪ R: 5 4
– If 𝑥∗ is in R, 𝑥∗ is the (𝑘 − |L| − |M|)-th smallest integer in R
Example: 𝑘 = 4
1 2 5 3 1 3 4 0
▪ Divide with 𝑣 = 3: 1 2 1 0 | 3 3 | 5 4
▪ 𝑘 = 4 ≤ |L| = 4, so recurse on L: 1 2 1 0
▪ L in sorted order: 0 1 1 2
▪ The 4th smallest integer in L is 2
▪ Output 2
Formalize
Function Select(𝑺,𝒌)
▪ Divide:
– Pick an arbitrary value 𝑣 among 𝑥₁, 𝑥₂, 𝑥₃, … .
– Divide 𝑥₁, 𝑥₂, 𝑥₃, … into three subsets:
▪ 𝐿: 𝑥 < 𝑣,
▪ 𝑀: 𝑥 = 𝑣,
▪ 𝑅: 𝑥 > 𝑣.
▪ Recurse:
– Recurse on the subset that contains 𝑥∗.
▪ If 𝑘 ≤ |𝐿|, output Select(𝐿, 𝑘);
▪ If |𝐿| < 𝑘 ≤ |𝐿| + |𝑀|, output 𝑣;
▪ If |𝐿| + |𝑀| < 𝑘, output Select(𝑅, 𝑘 − |𝐿| − |𝑀|).
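A minimal Python sketch of Select as formalized above. It assumes 𝑘 is 1-indexed and that "arbitrary" means "take the first element"; both are choices made for this illustration, and the slides do not fix either.

```python
def select(s, k):
    """Return the k-th smallest element of s (k is 1-indexed)."""
    if not 1 <= k <= len(s):
        raise ValueError("k must satisfy 1 <= k <= len(s)")
    v = s[0]  # assumption: the "arbitrary" pivot is the first element
    left = [x for x in s if x < v]
    mid = [x for x in s if x == v]
    right = [x for x in s if x > v]
    if k <= len(left):
        return select(left, k)
    if k <= len(left) + len(mid):
        return v
    return select(right, k - len(left) - len(mid))

# The slide example with k = 4.
print(select([1, 2, 5, 3, 1, 3, 4, 0], 4))  # prints 2
```

Because 𝑀 always contains the pivot itself, both 𝐿 and 𝑅 are strictly smaller than 𝑆, so the recursion terminates.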
Running Time
We want to know 𝑇(𝑛)
Function Select(𝑺,𝒌)
▪ Divide: [cost: 𝑂(𝑛)]
– Pick an arbitrary value 𝑣 among 𝑥₁, 𝑥₂, 𝑥₃, … .
– Divide 𝑥₁, 𝑥₂, 𝑥₃, … into three subsets:
▪ 𝐿: 𝑥 < 𝑣,
▪ 𝑀: 𝑥 = 𝑣,
▪ 𝑅: 𝑥 > 𝑣.
▪ Recurse:
– Recurse on the subset that contains 𝑥∗.
▪ If 𝑘 ≤ |𝐿|, output Select(𝐿, 𝑘); [cost: 𝑇(|𝐿|)]
▪ If |𝐿| < 𝑘 ≤ |𝐿| + |𝑀|, output 𝑣; [cost: 𝑂(1)]
▪ If |𝐿| + |𝑀| < 𝑘, output Select(𝑅, 𝑘 − |𝐿| − |𝑀|). [cost: 𝑇(|𝑅|)]
Running Time
▪ 𝑇(𝑛) ≤ 𝑂(𝑛) + max{𝑇(|𝐿|), 𝑇(|𝑅|)}
≤ 𝑂(𝑛) + 𝑇(𝑛 − 1)
≤ 𝑂(𝑛) + 𝑂(𝑛 − 1) + 𝑇(𝑛 − 2) ≤ ⋯
= 𝑂(𝑛) + 𝑂(𝑛 − 1) + 𝑂(𝑛 − 2) + ⋯ + 𝑂(1) = 𝑂(𝑛²)
Fact
▪ |𝐿| + |𝑀| + |𝑅| = |𝑆| = 𝑛
▪ |𝑀| ≥ 1
▪ |𝐿|, |𝑅| ≤ 𝑛 − 1
▪ Very Bad!
– One-by-one: 𝑂(𝑛𝑘)
– Sorting: 𝑂(𝑛 log 𝑛)
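To see the quadratic bound actually occur, the sketch below counts how many elements the divide steps scan in total. It assumes the arbitrary pivot is always the first element (an assumption made for this illustration) and runs on an already sorted input with 𝑘 = 𝑛, so every round removes only the pivot. The name `scanned_elements` is hypothetical.

```python
def scanned_elements(s, k):
    """Total number of elements scanned by Select with a first-element pivot."""
    total = 0
    while True:
        total += len(s)  # the divide step looks at every element once
        v = s[0]
        left = [x for x in s if x < v]
        mid = [x for x in s if x == v]
        right = [x for x in s if x > v]
        if k <= len(left):
            s = left
        elif k <= len(left) + len(mid):
            return total
        else:
            s, k = right, k - len(left) - len(mid)

n = 1000
# Sorted input, k = n: subproblem sizes are n, n - 1, ..., 1.
print(scanned_elements(list(range(n)), n), n * (n + 1) // 2)
# prints 500500 500500
```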
Is it really that bad?
Using Randomness!
Quick Selection
Function Select(𝑺,𝒌)
▪ Divide:
– Pick a random value 𝑣 among 𝑥₁, 𝑥₂, 𝑥₃, … .
– Divide 𝑥₁, 𝑥₂, 𝑥₃, … into three subsets:
▪ 𝐿: 𝑥 < 𝑣,
▪ 𝑀: 𝑥 = 𝑣,
▪ 𝑅: 𝑥 > 𝑣.
▪ Recurse:
– Recurse on the subset that contains 𝑥∗.
▪ If 𝑘 ≤ |𝐿|, output Select(𝐿, 𝑘);
▪ If |𝐿| < 𝑘 ≤ |𝐿| + |𝑀|, output 𝑣;
▪ If |𝐿| + |𝑀| < 𝑘, output Select(𝑅, 𝑘 − |𝐿| − |𝑀|).
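Quick Selection can be sketched in Python as follows. The sketch assumes the pivot is drawn uniformly at random from the current subset and that 𝑘 is 1-indexed. The recursion is written as a loop, which is an implementation choice made here and not something the slides prescribe.

```python
import random

def quick_select(s, k, rng=random):
    """Return the k-th smallest element of s (1-indexed) using random pivots."""
    if not 1 <= k <= len(s):
        raise ValueError("k must satisfy 1 <= k <= len(s)")
    while True:
        v = rng.choice(s)  # pivot chosen uniformly at random
        left = [x for x in s if x < v]
        mid = [x for x in s if x == v]
        right = [x for x in s if x > v]
        if k <= len(left):
            s = left
        elif k <= len(left) + len(mid):
            return v
        else:
            s, k = right, k - len(left) - len(mid)

data = [1, 2, 5, 3, 1, 3, 4, 0]
# The answer does not depend on the random choices, only the running time does.
print([quick_select(data, k) for k in range(1, len(data) + 1)])
# prints [0, 1, 1, 2, 3, 3, 4, 5]
```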
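When we are lucky
[Figure: 𝑆 in sorted order, split into segments of size |𝑆|/4, |𝑆|/2, |𝑆|/4. The middle |𝑆|/2 is the lucky pivot area; the two outer |𝑆|/4 segments are the bad pivot area.]
▪ Fact 1: With probability 1/2, we are lucky!
▪ Fact 2: If we are always lucky, 𝑇(𝑛) = 𝑇(3𝑛/4) + 𝑂(𝑛) = 𝑂(𝑛)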
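Fact 2 follows from unrolling the recurrence into a geometric series. In the derivation below, the constant c stands for the hidden constant in the 𝑂(𝑛) term; it is introduced here for illustration only.

```latex
\begin{aligned}
T(n) &\le cn + T\!\left(\tfrac{3}{4}n\right) \\
     &\le cn + \tfrac{3}{4}cn + T\!\left(\left(\tfrac{3}{4}\right)^{2} n\right) \\
     &\le cn \sum_{i \ge 0} \left(\tfrac{3}{4}\right)^{i} = 4cn = O(n).
\end{aligned}
```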
Analysis
▪ 𝜏(𝑛): the time it takes to reduce the problem size from 𝑛 to at most 3𝑛/4
▪ 𝑇(𝑛) = 𝜏(𝑛) + 𝑇(3𝑛/4)
▪ 𝐸[𝜏(𝑛)]: the expected time it takes to reduce 𝑛 to at most 3𝑛/4
▪ 𝐸[𝑇(𝑛)] = 𝐸[𝜏(𝑛) + 𝑇(3𝑛/4)]
= 𝐸[𝜏(𝑛)] + 𝐸[𝑇(3𝑛/4)]
Fact
Since we are lucky with probability 1/2, the expected number of rounds it takes to become lucky is 2.
▪ 𝐸[𝜏(𝑛)] = 𝑂(𝑛), because each round costs 𝑂(𝑛)
▪ 𝐸[𝑇(𝑛)] = 𝑂(𝑛) + 𝐸[𝑇(3𝑛/4)] = 𝑂(𝑛)
Evaluate Random Algorithm by Expected Running Time!
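The claim that it takes 2 rounds in expectation to become lucky can be checked with a small simulation. The sketch assumes distinct elements, so that a pivot is identified by its rank in 1..𝑛, and treats ranks in the middle half as lucky. The function name and the fixed seed are choices made for this illustration.

```python
import random

def rounds_until_lucky(n, rng):
    """Draw uniform pivot ranks from 1..n until one lands in the middle half."""
    rounds = 0
    while True:
        rounds += 1
        rank = rng.randint(1, n)  # rank of the randomly chosen pivot
        if n // 4 < rank <= 3 * n // 4:
            return rounds

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 100_000
avg = sum(rounds_until_lucky(1000, rng) for _ in range(trials)) / trials
print(avg)  # the empirical average should be close to 2
```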
Other Viewpoints
Throw Randomness!
Median of medians (1973)
Blum, M.; Floyd, R. W.; Pratt, V. R.; Rivest, R. L.; Tarjan, R. E.
Function Select(𝑺,𝒌)
▪ Divide:
– Pick a good pivot value 𝑣 among 𝑥₁, 𝑥₂, 𝑥₃, … .
– Divide 𝑥₁, 𝑥₂, 𝑥₃, … into three subsets:
▪ 𝐿: 𝑥 < 𝑣,
▪ 𝑀: 𝑥 = 𝑣,
▪ 𝑅: 𝑥 > 𝑣.
▪ Recurse:
– Recurse on the subset that contains 𝑥∗.
▪ If 𝑘 ≤ |𝐿|, output Select(𝐿, 𝑘);
▪ If |𝐿| < 𝑘 ≤ |𝐿| + |𝑀|, output 𝑣;
▪ If |𝐿| + |𝑀| < 𝑘, output Select(𝑅, 𝑘 − |𝐿| − |𝑀|).
Trade-off
[Figure: groups of 5 elements]
How to select a good pivot?
32 4 5 32 63
13 14 8 2 42
Function Select(𝑺,𝒌)
▪ Divide:
– Pick a good pivot value 𝑣 among 𝑥₁, 𝑥₂, 𝑥₃, … . [cost: 𝑇(𝑛/5) + 𝑂(𝑛)]
– Divide 𝑥₁, 𝑥₂, 𝑥₃, … into three subsets: [cost: 𝑂(𝑛)]
▪ 𝐿: 𝑥 < 𝑣,
▪ 𝑀: 𝑥 = 𝑣,
▪ 𝑅: 𝑥 > 𝑣.
▪ Recurse:
– Recurse on the subset that contains 𝑥∗.
▪ If 𝑘 ≤ |𝐿|, output Select(𝐿, 𝑘); [cost: 𝑇(|𝐿|) ≤ 𝑇(𝑛 − 3𝑛/10)]
▪ If |𝐿| < 𝑘 ≤ |𝐿| + |𝑀|, output 𝑣;
▪ If |𝐿| + |𝑀| < 𝑘, output Select(𝑅, 𝑘 − |𝐿| − |𝑀|). [cost: 𝑇(|𝑅|) ≤ 𝑇(𝑛 − 3𝑛/10)]
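The slides' pivot-selection figure did not survive extraction, so the sketch below follows the standard median-of-medians construction, which matches the 𝑇(𝑛/5) term above: split the input into groups of 5, take the median of each group, and recursively select the median of those medians as the pivot. The group size, the base case for inputs of at most 5 elements, and the name `mom_select` are assumptions made for this illustration.

```python
def mom_select(s, k):
    """Return the k-th smallest element of s (1-indexed), deterministically."""
    if not 1 <= k <= len(s):
        raise ValueError("k must satisfy 1 <= k <= len(s)")
    if len(s) <= 5:
        return sorted(s)[k - 1]  # constant-size base case
    # Median of each group of at most 5 elements: O(n) work in total.
    groups = [s[i:i + 5] for i in range(0, len(s), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Good pivot: the median of the medians, found recursively on about n/5 items.
    v = mom_select(medians, (len(medians) + 1) // 2)
    left = [x for x in s if x < v]
    mid = [x for x in s if x == v]
    right = [x for x in s if x > v]
    if k <= len(left):
        return mom_select(left, k)
    if k <= len(left) + len(mid):
        return v
    return mom_select(right, k - len(left) - len(mid))

# The ten numbers shown on the pivot-selection slide.
data = [32, 4, 5, 32, 63, 13, 14, 8, 2, 42]
print([mom_select(data, k) for k in range(1, len(data) + 1)])
# prints [2, 4, 5, 8, 13, 14, 32, 32, 42, 63]
```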
Guess time!
𝑻(𝒏)=𝑻(𝟎.𝟐𝒏)+𝑻(𝟎.𝟕𝒏)+𝑶(𝒏)
Observation: Comparing to Master Theorem
[Recursion tree]
▪ Level 0: one problem of size 𝑛, cost 𝑂(𝑛)
▪ Level 𝑘: ……, cost 𝑂(0.9^𝑘 𝑛)
▪ Level log_{10/7} 𝑛: problems of size 1, cost 𝑂(0.9^{log_{10/7} 𝑛} 𝑛)
We allow some problems with size ≤ 1.
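Summing the per-level costs of the recursion tree gives a geometric series, which suggests a linear bound. The constant c below stands for the hidden constant in the 𝑂(𝑛) term and is introduced here for illustration only.

```latex
T(n) \;\le\; \sum_{k=0}^{\log_{10/7} n} c \cdot 0.9^{k}\, n
     \;\le\; cn \sum_{k \ge 0} 0.9^{k}
     \;=\; \frac{cn}{1 - 0.9}
     \;=\; 10\,cn .
```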
Make a guess
𝑻(𝒏) = 𝑻(𝟎.𝟐𝒏) + 𝑻(𝟎.𝟕𝒏) + 𝑶(𝒏)
▪ Guess: 𝑇(𝑛) ≤ 𝐵𝑛 for some constant 𝐵
▪ Try to prove it inductively
– Assume the 𝑂(𝑛) term is at most 𝐶𝑛
– Base step: 𝑇(1) = 1 ≤ 𝐵𝑛 for 𝑛 = 1, provided 𝐵 ≥ 1
– Inductive step:
𝑇(𝑛) ≤ 𝑇(0.2𝑛) + 𝑇(0.7𝑛) + 𝐶𝑛
≤ 0.9𝐵𝑛 + 𝐶𝑛
≤ 𝐵𝑛
It holds when 𝐵 ≥ 10𝐶
▪ We have 𝑇(𝑛) ≤ 10𝐶𝑛 = 𝑂(𝑛)
Remember not to use induction with Big 𝑂 notation!
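The guess can also be checked numerically. The sketch below evaluates one concrete instance of the recurrence, with 𝐶 = 1, subproblem sizes rounded down, and 𝑇(𝑛) = 1 for 𝑛 ≤ 1; all three choices are assumptions made for this illustration. It then compares 𝑇(𝑛) against 10𝑛.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def t(n):
    """T(n) = T(floor(0.2 n)) + T(floor(0.7 n)) + n, with T(n) = 1 for n <= 1."""
    if n <= 1:
        return 1
    return t(n // 5) + t(7 * n // 10) + n

# The ratio T(n) / n should stay below B = 10 (with C = 1).
for n in (10, 1_000, 100_000, 10_000_000):
    print(n, t(n), t(n) / n)
```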
One more Question
What if we group them by 2, 3, 4, 5, …?
Today’s goal