CS4311 Design and Analysis of Algorithms: Lecture 8: Order Statistics
1
About this lecture
• Finding max, min in an unsorted array
(upper bound and lower bound)
2
Finding Maximum
in unsorted array
3
Finding Maximum (Method I)
• Let S denote the input set of n items
• To find the maximum of S, we can:
Step 1: Set max = item 1
Step 2: for k = 2, 3, …, n
if (item k is larger than max)
Update max = item k;
Step 3: return max;
# comparisons = n – 1
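The three steps of Method I can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function name `find_max_linear` and the explicit comparison counter are assumptions added to make the n – 1 count visible.

```python
def find_max_linear(items):
    """Return (maximum, comparisons) using one left-to-right scan.

    Hypothetical helper for illustration; the counter is not part of
    the algorithm itself, it only records how many comparisons ran.
    """
    if not items:
        raise ValueError("items must be non-empty")
    current_max = items[0]            # Step 1: max = item 1
    comparisons = 0
    for item in items[1:]:            # Step 2: k = 2, 3, ..., n
        comparisons += 1
        if item > current_max:
            current_max = item
    return current_max, comparisons   # Step 3: return max
```

For a list of n items the loop body runs exactly n – 1 times, which matches the count stated above.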
4
Finding Maximum (Method II)
• Define a function Find-Max as follows:
Find-Max(R, k) /* R is a set with k items */
1. if (k ≤ 2) return maximum of R;
2. Partition items of R into ⌊k/2⌋ pairs;
3. Delete smaller item from R in each pair;
4. return Find-Max(R, k - ⌊k/2⌋);
Calling Find-Max(S,n) gives the maximum of S
5
Finding Maximum (Method II)
Let T(n) = # comparisons for Find-Max with
problem size n
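One way to check the comparison count of Method II is to run it with a counter. The sketch below assumes the base case is k ≤ 2 and that an unpaired item (when k is odd) simply survives to the next round; the name `find_max_pairs` is hypothetical. Under this reading the count satisfies T(n) = T(n - ⌊n/2⌋) + ⌊n/2⌋ with T(1) = 0 and T(2) = 1, which gives T(n) = n – 1 by induction.

```python
def find_max_pairs(items):
    """Return (maximum, comparisons) by repeatedly pairing items.

    Illustrative sketch of Find-Max (Method II); assumes the base
    case k <= 2 and that a leftover unpaired item is kept as is.
    """
    items = list(items)
    k = len(items)
    if k == 0:
        raise ValueError("items must be non-empty")
    if k == 1:
        return items[0], 0
    if k == 2:
        return (items[0] if items[0] > items[1] else items[1]), 1
    half = k // 2                      # number of pairs, floor(k/2)
    # One comparison per pair; keep the larger item of each pair.
    survivors = [a if a > b else b
                 for a, b in zip(items[0:2 * half:2], items[1:2 * half:2])]
    survivors += items[2 * half:]      # unpaired item when k is odd
    best, later = find_max_pairs(survivors)
    return best, half + later
```

So the pairing scheme uses the same number of comparisons as the linear scan; it changes the order of comparisons, not their total.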
6
Lower Bound
Question: Can we find the maximum using
fewer than n – 1 comparisons?
Answer: No! To ensure that an item x is
not the maximum, there must be at least
one comparison in which x is the smaller
of the compared items
Each comparison has only one smaller item,
so it can rule out at most one item
So, we need to ensure n – 1 items are not max
⇒ at least n – 1 comparisons are needed
7
Finding Both Max and Min
in unsorted array
8
Finding Both Max and Min
Can we find both max and min quickly?
Solution 1:
First, find max with n – 1 comparisons
Then, find min with n – 1 comparisons
Total = 2n – 2 comparisons
9
Finding Both Max and Min
Better Solution: (Case 1: if n is even)
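First, partition items into n/2 pairs and
compare the two items in each pair
(n/2 comparisons); in each pair one item
is the larger and the other the smaller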
10
Finding Both Max and Min
Then, max = Find-Max in larger items
min = Find-Min in smaller items
# comparisons = n/2 (pairing) + (n/2 – 1) (Find-Max)
+ (n/2 – 1) (Find-Min) = 3n/2 – 2
11
Finding Both Max and Min
Better Solution: (Case 2: if n is odd)
We find max and min of first n - 1 items;
if (last item is larger than max)
Update max = last item;
if (last item is smaller than min )
Update min = last item;
# comparisons = [3(n – 1)/2 – 2] + 2 = 3(n – 1)/2
12
Finding Both Max and Min
Conclusion:
To find both max and min:
if n is odd: 3(n – 1)/2 comparisons
if n is even: 3n/2 – 2 comparisons
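The two cases can be combined in one routine. The following Python sketch follows the pairing idea described above and counts comparisons so the two formulas can be checked; the name `find_max_min` and the handling of n = 1 (zero comparisons) are assumptions made for the illustration.

```python
def find_max_min(items):
    """Return (max, min, comparisons) using the pairing method.

    Illustrative sketch: pair up the items, then search the larger
    items for the max and the smaller items for the min. For odd n
    the last item is compared against both results at the end.
    """
    items = list(items)
    n = len(items)
    if n == 0:
        raise ValueError("items must be non-empty")
    if n == 1:
        return items[0], items[0], 0
    paired = items if n % 2 == 0 else items[:-1]
    larger, smaller, comparisons = [], [], 0
    for a, b in zip(paired[0::2], paired[1::2]):
        comparisons += 1                     # one comparison per pair
        if a > b:
            larger.append(a); smaller.append(b)
        else:
            larger.append(b); smaller.append(a)
    hi = larger[0]
    for x in larger[1:]:                     # Find-Max in larger items
        comparisons += 1
        if x > hi:
            hi = x
    lo = smaller[0]
    for x in smaller[1:]:                    # Find-Min in smaller items
        comparisons += 1
        if x < lo:
            lo = x
    if n % 2 == 1:                           # Case 2: handle last item
        last = items[-1]
        comparisons += 2
        if last > hi:
            hi = last
        if last < lo:
            lo = last
    return hi, lo, comparisons
```

With n = 10 this uses 13 comparisons instead of the 18 needed by Solution 1.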
13
Lower Bound
Textbook Ex 9.1-2 (Very challenging):
• Show that we need at least
⌈3n/2⌉ – 2 comparisons
to find both max and min in the worst case
15
Selection in Linear Time
• In the next slides, we describe a recursive procedure
Select(S, k)
which finds the kth smallest
element in S
• Recursion is used for two purposes:
(1) selecting a good pivot (as in Quicksort)
(2) solving a smaller sub-problem
16
Select(S, k)
/* First, find a good pivot */
1. Partition S into ⌈|S|/5⌉ groups, each
group has five items (one group may
have fewer items);
2. Sort each group separately;
3. Collect median of each group into S';
4. Find median m of S':
m = Select(S', ⌈⌈|S|/5⌉/2⌉);
17
5. Let q = # items of S smaller than m;
6. If (k == q + 1)
return m;
/* Partition with pivot */
7. Else partition S into X and Y
X = {items smaller than m}
Y = {items larger than m}
/* Next, form a sub-problem */
8. If (k < q + 1)
return Select(X, k)
9. Else
return Select(Y, k – (q + 1));
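The Select procedure can be sketched in Python as below. Two things here are assumptions and not stated on the slides: the items are distinct, and inputs of at most five items are handled by sorting directly (the slides give no explicit base case). The function name `select` mirrors the slide; everything else is illustrative.

```python
def select(items, k):
    """Return the k-th smallest item (k is 1-indexed).

    Sketch of median-of-medians selection, assuming distinct items
    and a sort-based base case for inputs of size <= 5.
    """
    s = list(items)
    if not 1 <= k <= len(s):
        raise ValueError("k must be between 1 and len(items)")
    if len(s) <= 5:                           # assumed base case
        return sorted(s)[k - 1]
    # Sorted groups of five; the last group may be shorter.
    groups = [sorted(s[i:i + 5]) for i in range(0, len(s), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]
    # Median of medians: rank ceil(len(medians) / 2) within medians.
    m = select(medians, (len(medians) + 1) // 2)
    x = [v for v in s if v < m]               # items smaller than m
    y = [v for v in s if v > m]               # items larger than m
    q = len(x)
    if k == q + 1:
        return m
    if k < q + 1:
        return select(x, k)
    return select(y, k - (q + 1))
```

For example, `select(data, (len(data) + 1) // 2)` returns the lower median of `data` without sorting the whole input.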
18
Selection in Linear Time
Questions:
19
Running Time
• In our selection algorithm, we chose m,
which is the median of medians, to be a
pivot and partition S into two sets X and Y
• In fact, if we choose any other item as the
pivot, the algorithm is still correct
• Why don't we just pick an arbitrary pivot
so that we can save some time?
20
Running Time
• A closer look reveals that the worst-case
running time depends on |X| and |Y|
• Precisely, if T(|S|) denotes the worst-case
running time of the algorithm on S, then
T(|S|) = T(⌈|S|/5⌉) + Θ(|S|)
+ max {T(|X|), T(|Y|)}
21
Running Time
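• Later, we show that if we choose m, the
“median of medians”, as the pivot,
both |X| and |Y| are at most 3n/4
(for large enough n)
• Consequently,
T(n) ≤ T(⌈n/5⌉) + Θ(n) + T(3n/4)
• Since 1/5 + 3/4 < 1, this solves to T(n) = O(n)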
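A linear bound for this recurrence can be checked by substitution. The derivation below is a sketch under two assumptions that are not on the slide: the non-recursive work is at most a·n for some constant a, and the induction hypothesis T(m) ≤ c·m holds for all m < n. Small values of n are assumed to be covered by choosing c large enough.

```latex
% Assumptions: non-recursive work <= a n; T(m) <= c m for all m < n.
\begin{align*}
T(n) &\le T(\lceil n/5 \rceil) + T(3n/4) + a\,n \\
     &\le c\,(n/5 + 1) + \tfrac{3}{4}\,c\,n + a\,n \\
     &= c\,n - \bigl(\tfrac{c\,n}{20} - c - a\,n\bigr) \\
     &\le c\,n \qquad \text{whenever } \tfrac{c\,n}{20} - c \ge a\,n .
\end{align*}
% With c = 40a and n >= 40: c(n/20 - 1) >= c n / 40 = a n,
% so the condition holds and T(n) = O(n).
```

The key point is that the two sub-problem sizes sum to 19n/20, a constant fraction below n, so the linear term dominates.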
22
Median of Medians
• Let's begin with ⌈n/5⌉ sorted groups, each
has 5 items (one group may have fewer)
23
Median of Medians
• Then, we obtain the median of medians, m
27
Median of Medians
Similarly, we can show that at most
7n/10 + 5 items are smaller than m
⇒ |X| is at most 3n/4 for large enough n
(7n/10 + 5 ≤ 3n/4 once n ≥ 100)
Conclusion:
The “median of medians” helps us control
the worst-case size of the sub-problem;
without it, the algorithm runs in Θ(n²)
time in the worst case
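To see the quadratic worst case concretely, the sketch below runs the same partition-and-recurse scheme but always takes the first item as the pivot. This variant is an assumption chosen for illustration (the slides only say "arbitrary pivot"), and the name `select_first_pivot` is hypothetical. Each partition is charged one comparison per non-pivot item, and the items are assumed distinct.

```python
def select_first_pivot(items, k):
    """Return (k-th smallest, comparisons) using the first item as pivot.

    Illustrative variant with an arbitrary (first-item) pivot; written
    as a loop to avoid deep recursion on bad inputs.
    """
    s = list(items)
    if not 1 <= k <= len(s):
        raise ValueError("k must be between 1 and len(items)")
    comparisons = 0
    while True:
        pivot, rest = s[0], s[1:]
        comparisons += len(rest)      # each item is compared with pivot
        x = [v for v in rest if v < pivot]
        y = [v for v in rest if v > pivot]
        q = len(x)
        if k == q + 1:
            return pivot, comparisons
        if k < q + 1:
            s = x
        else:
            s, k = y, k - (q + 1)
```

On an already sorted input of n items with k = n, every pivot is the minimum of what remains, so the sub-problem shrinks by only one item per round and the total is n(n – 1)/2 comparisons.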
28