Introduction To Algorithms: Order Statistics
6.046J/18.401J
LECTURE 6
Order Statistics
• Randomized divide and conquer
• Analysis of expected time
• Worst-case linear-time order statistics
• Analysis
September 28, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L6.8
Calculating expectation
$$
\begin{aligned}
E[T(n)] &= E\left[\sum_{k=0}^{n-1} X_k \bigl(T(\max\{k,\, n-k-1\}) + \Theta(n)\bigr)\right] \\
&= \sum_{k=0}^{n-1} E\bigl[X_k \bigl(T(\max\{k,\, n-k-1\}) + \Theta(n)\bigr)\bigr]
  && \text{Linearity of expectation.} \\
&= \sum_{k=0}^{n-1} E[X_k] \cdot E\bigl[T(\max\{k,\, n-k-1\}) + \Theta(n)\bigr]
  && \text{Independence of } X_k \text{ from the other random choices.} \\
&= \frac{1}{n} \sum_{k=0}^{n-1} E\bigl[T(\max\{k,\, n-k-1\})\bigr] + \frac{1}{n} \sum_{k=0}^{n-1} \Theta(n)
  && \text{Linearity of expectation; } E[X_k] = 1/n. \\
&\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n)
  && \text{Upper terms appear twice.}
\end{aligned}
$$
Hairy recurrence
(But not quite as hairy as the quicksort one.)
$$E[T(n)] \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n)$$
Prove: E[T(n)] ≤ cn for some constant c > 0.
• The constant c can be chosen large enough so that E[T(n)] ≤ cn for the base cases.
Use fact (exercise):
$$\sum_{k=\lfloor n/2 \rfloor}^{n-1} k \le \frac{3}{8} n^2$$
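One way to settle the exercise is to evaluate the sum in closed form for even and odd n separately. The following is a sketch of that calculation, not necessarily the argument intended in the lecture.

```latex
\begin{aligned}
\text{Even } n:\quad
\sum_{k=n/2}^{n-1} k
  &= \frac{n(n-1)}{2} - \frac{1}{2}\cdot\frac{n}{2}\left(\frac{n}{2}-1\right)
   = \frac{3}{8}n^2 - \frac{n}{4} \;\le\; \frac{3}{8}n^2 . \\
\text{Odd } n:\quad
\sum_{k=(n-1)/2}^{n-1} k
  &= \frac{n(n-1)}{2} - \frac{(n-1)(n-3)}{8}
   = \frac{3}{8}\left(n^2-1\right) \;\le\; \frac{3}{8}n^2 .
\end{aligned}
```

In both cases the sum is the full arithmetic series up to n − 1 minus the part below ⌊n/2⌋.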
Substitution method
$$
\begin{aligned}
E[T(n)] &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + \Theta(n)
  && \text{Substitute inductive hypothesis.} \\
&\le \frac{2c}{n} \left(\frac{3}{8} n^2\right) + \Theta(n)
  && \text{Use fact.} \\
&= cn - \left(\frac{cn}{4} - \Theta(n)\right)
  && \text{Express as desired minus residual.} \\
&\le cn,
\end{aligned}
$$
if c is chosen large enough so that cn/4 dominates the Θ(n).
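The bound can also be checked numerically. The sketch below tabulates the recurrence taken with equality, under the assumption that the Θ(n) term is exactly n and that the cost for n = 0 is 0. With those choices the substitution argument goes through with c = 4. The function name `expected_time_bound` is illustrative only.

```python
def expected_time_bound(n_max):
    """Tabulate e[n] = (2/n) * sum_{k=floor(n/2)}^{n-1} e[k] + n for n = 0..n_max.

    Assumptions (not from the lecture): the Theta(n) term is exactly n,
    and e[0] = 0.
    """
    e = [0.0] * (n_max + 1)
    prefix = [0.0] * (n_max + 2)  # prefix[j] = e[0] + ... + e[j-1]
    for n in range(1, n_max + 1):
        upper_sum = prefix[n] - prefix[n // 2]  # e[n//2] + ... + e[n-1]
        e[n] = 2.0 * upper_sum / n + n
        prefix[n + 1] = prefix[n] + e[n]
    return e


if __name__ == "__main__":
    e = expected_time_bound(1000)
    for n in (10, 100, 1000):
        # Under the stated assumptions the ratio e[n] / n stays below 4.
        print(n, e[n] / n)
```

Prefix sums keep the tabulation linear in n_max instead of quadratic.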
Summary of randomized
order-statistic selection
• Works fast: linear expected time.
• Excellent algorithm in practice.
• But, the worst case is very bad: Θ(n²).
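For reference, randomized selection can be sketched as follows. This is a minimal illustration, not the lecture's RAND-SELECT verbatim: it assumes distinct elements, copies the input rather than partitioning in place, and loops instead of recursing. The name `rand_select` is hypothetical.

```python
import random


def rand_select(items, i):
    """Return the i-th smallest element of items (i is 1-based).

    Assumes all elements are distinct.
    """
    a = list(items)  # work on a copy; the caller's data is left untouched
    if not 1 <= i <= len(a):
        raise ValueError("i must satisfy 1 <= i <= len(items)")
    while True:
        pivot = a[random.randrange(len(a))]  # uniformly random pivot
        lower = [v for v in a if v < pivot]
        upper = [v for v in a if v > pivot]
        k = len(lower) + 1  # rank of the pivot within a
        if i == k:
            return pivot
        if i < k:
            a = lower  # continue in the lower part, same i
        else:
            a, i = upper, i - k  # continue in the upper part
```

An unlucky run of pivots (always the smallest or largest remaining element) shrinks the problem by only one element per round, which is where the Θ(n²) worst case comes from.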
Q. Is there an algorithm that runs in linear
time in the worst case?
A. Yes, due to Blum, Floyd, Pratt, Rivest,
and Tarjan [1973].
IDEA: Generate a good pivot recursively.
Worst-case linear-time order statistics
SELECT(i, n)
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4. if i = k then return x
   elseif i < k
      then recursively SELECT the ith smallest element in the lower part
      else recursively SELECT the (i − k)th smallest element in the upper part
(Step 4 is the same as in RAND-SELECT.)
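The four steps above can be sketched in Python as follows. This is an illustrative version under stated assumptions: elements are distinct, inputs of at most 5 elements are solved by sorting, leftover elements beyond the ⌊n/5⌋ full groups are skipped when forming medians, and partitioning builds new lists instead of working in place. The name `select` mirrors the slide, but the code is not the lecture's implementation.

```python
def select(items, i):
    """Return the i-th smallest element of items (i is 1-based).

    Assumes all elements are distinct.
    """
    a = list(items)
    if not 1 <= i <= len(a):
        raise ValueError("i must satisfy 1 <= i <= len(items)")
    if len(a) <= 5:
        return sorted(a)[i - 1]  # small input: solve directly

    # Step 1: median of each full group of 5 (leftover elements are skipped).
    full = len(a) - len(a) % 5
    medians = [sorted(a[j:j + 5])[2] for j in range(0, full, 5)]

    # Step 2: recursively select the median of the group medians as pivot x.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 3: partition around x; k is the rank of x.
    lower = [v for v in a if v < x]
    upper = [v for v in a if v > x]
    k = len(lower) + 1

    # Step 4: return x or recurse into the side that contains the answer.
    if i == k:
        return x
    if i < k:
        return select(lower, i)
    return select(upper, i - k)
```

For example, `select([9, 1, 8, 2, 7, 3, 6], 4)` returns the median of the seven values.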
Choosing the pivot
[Figure omitted: pivot-selection diagram; the only surviving label is "greater".]
Analysis (Assume all elements are distinct.)
Developing the recurrence
SELECT(i, n), total cost T(n):
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote. Cost: Θ(n).
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot. Cost: T(n/5).
3. Partition around the pivot x. Let k = rank(x). Cost: Θ(n).
4. if i = k then return x
   elseif i < k
      then recursively SELECT the ith smallest element in the lower part
      else recursively SELECT the (i − k)th smallest element in the upper part
   Cost: T(3n/4).
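The T(3n/4) cost of step 4 rests on a guarantee about the pivot. The counting argument behind it is not preserved in this section, so the following is a sketch of the standard reasoning. It assumes distinct elements and treats T(n) as Θ(1) for n < 50.

```latex
\begin{aligned}
\#\{\text{group medians} \le x\}
  &\ge \left\lfloor \lfloor n/5 \rfloor / 2 \right\rfloor = \lfloor n/10 \rfloor , \\
\#\{\text{elements} \le x\}
  &\ge 3 \lfloor n/10 \rfloor
  \quad \text{(each such median brings 2 smaller elements of its group)} , \\
3 \lfloor n/10 \rfloor &\ge n/4 \quad \text{for } n \ge 50 , \\
\text{so the upper part has at most } n - 3 \lfloor n/10 \rfloor &\le \tfrac{3}{4} n \text{ elements.}
\end{aligned}
```

The same count applies symmetrically to elements ≥ x, which bounds the lower part by 3n/4 as well.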
Solving the recurrence
$$T(n) = T\left(\frac{1}{5} n\right) + T\left(\frac{3}{4} n\right) + \Theta(n)$$
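This recurrence yields to the same substitution method used for the randomized algorithm. A sketch, assuming the inductive hypothesis T(k) ≤ ck for all k < n and a constant c large enough to cover the base cases:

```latex
\begin{aligned}
T(n) &\le \frac{1}{5} cn + \frac{3}{4} cn + \Theta(n) \\
     &= \frac{19}{20} cn + \Theta(n) \\
     &= cn - \left( \frac{1}{20} cn - \Theta(n) \right) \\
     &\le cn \quad \text{if } c \text{ is chosen large enough that } cn/20 \text{ dominates the } \Theta(n).
\end{aligned}
```

The argument works because the two recursive fractions sum to 1/5 + 3/4 = 19/20 < 1, so the work shrinks geometrically from one level of recursion to the next.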