0% found this document useful (0 votes)
32 views28 pages

CS4311 Design and Analysis of Algorithms: Lecture 8: Order Statistics

This lecture discusses algorithms for finding order statistics like the maximum, minimum, and kth smallest element in an unsorted array. It presents two methods for finding the maximum element in O(n) time by repeatedly comparing elements. It also proves this is optimal by showing at least n-1 comparisons are needed. Finding both the maximum and minimum can be done in O(n) time by partitioning pairs of elements. The lecture then describes a selection algorithm that finds the kth smallest element in linear O(n) time using recursion and choosing the "median of medians" as the pivot element. This guarantees the two partitions will be at most 3n/4 in size, resulting in linear overall time complexity.

Uploaded by

Soe Sandar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
32 views28 pages

CS4311 Design and Analysis of Algorithms: Lecture 8: Order Statistics

This lecture discusses algorithms for finding order statistics like the maximum, minimum, and kth smallest element in an unsorted array. It presents two methods for finding the maximum element in O(n) time by repeatedly comparing elements. It also proves this is optimal by showing at least n-1 comparisons are needed. Finding both the maximum and minimum can be done in O(n) time by partitioning pairs of elements. The lecture then describes a selection algorithm that finds the kth smallest element in linear O(n) time using recursion and choosing the "median of medians" as the pivot element. This guarantees the two partitions will be at most 3n/4 in size, resulting in linear overall time complexity.

Uploaded by

Soe Sandar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 28

CS4311

Design and Analysis of


Algorithms
Lecture 8: Order Statistics

1
About this lecture
•Finding max, min in an unsorted array
(upper bound and lower bound)

•Finding both max and min (upper bound)

•Selecting the kth smallest element

kth smallest element  kth order statistics

2
Finding Maximum
in unsorted array

3
Finding Maximum (Method I)
• Let S denote the input set of n items
• To find the maximum of S, we can:
Step 1: Set max = item 1
Step 2: for k = 2, 3, …, n
if (item k is larger than max)
Update max = item k;
Step 3: return max;
# comparisons = n –1
4
Finding Maximum (Method II)
• Define a function Find-Max as follows:
Find-Max(R, k) /* R is a set with k items */
1. if (k 2) return maximum of R;
2. Partition items of R into bk/2c pairs;
3. Delete smaller item from R in each pair;
4. return Find-Max(R, k - bk/2c);
Calling Find-Max(S,n) gives the maximum of S

5
Finding Maximum (Method II)
Let T(n) = # comparisons for Find-Max with
problem size n

So, T(n) = T(n -bn/2c) + bn/2c for n ¸ 3


T(2) = 1
Solving the recurrence (by substitution),
we get T(n) = n - 1

6
Lower Bound
Question: Can we find the maximum using
fewer than n –1 comparisons?
Answer: No ! To ensure that an item x is
not the maximum, there must be at least
one comparison in which x is the smaller
of the compared items
So, we need to ensure n-1 items not max
 at least n –1 comparisons are needed

7
Finding Both Max and Min
in unsorted array

8
Finding Both Max and Min
Can we find both max and min quickly?
Solution 1:
First, find max with n –1 comparisons
Then, find min with n –1 comparisons
 Total = 2n –2 comparisons

Is there a better solution ??

9
Finding Both Max and Min
Better Solution: (Case 1: if n is even)
First, partition items into n/2 pairs;

Next, compare items within each pair;


= larger = smaller
10
Finding Both Max and Min
Then, max = Find-Max in larger items
min = Find-Min in smaller items

Find-Max Find-Min
# comparisons = 3n/2 –2
11
Finding Both Max and Min
Better Solution: (Case 2: if n is odd)
We find max and min of first n - 1 items;
if (last item is larger than max)
Update max = last item;
if (last item is smaller than min )
Update min = last item;

# comparisons = 3(n-1)/2
12
Finding Both Max and Min
Conclusion:
To find both max and min:
if n is odd: 3(n-1)/2 comparisons
if n is even: 3n/2 –2 comparisons

Combining: at most 3bn/2c comparisons

 better than finding max and min separately

13
Lower Bound
Textbook Ex 9.1-2 (Very challenging):
• Show that we need at least
d3n/2e–2 comparisons
to find both max and min in worst-case

Hint: Consider how many numbers may be


max or min (or both). Investigate how a
comparison affects these counts
14
Selecting kth smallest item
in unsorted array

15
Selection in Linear Time
• In next slides, we describe a recursive call
Select(S,k)
which supports finding the kth smallest
element in S
• Recursion is used for two purposes:
(1) selecting a good pivot (as in Quicksort)
(2) solving a smaller sub-problem

16
Select(S, k)
/* First,find a good pivot */
1. Partition S into d|S|/5e groups, each
group has five items (one group may
have fewer items);
2. Sort each group separately;
3. Collect median of each group into S’;
4. Find median m of S’
:
m = Select(S’
,dd|S|/5e/2e);

17
4. Let q = # items of S smaller than m;
5. If (k == q + 1)
return m;
/* Partition with pivot */
6. Else partition S into X and Y
X = {items smaller than m}
Y = {items larger than m}
/* Next,form a sub-problem */
7. If (k q + 1)
return Select(X, k)
8. Else
return Select(Y, k– (q+1));
18
Selection in Linear Time
Questions:

1. Why is the previous algorithm correct?


(Prove by Induction)

2. What is its running time?

19
Running Time
• In our selection algorithm, we chose m,
which is the median of medians, to be a
pivot and partition S into two sets X and Y
• In fact, if we choose any other item as the
pivot, the algorithm is still correct
• Why don’ t we just pick an arbitrary pivot
so that we can save some time ??

20
Running Time
• A closer look reviews that the worst-case
running time depends on |X| and |Y|
• Precisely, if T(|S|) denote the worst-case
running time of the algorithm on S, then

T(|S|) = T(d|S|/5e) + (|S|)

+ max {T(|X|),T(|Y|) }

21
Running Time
• Later, we show that if we choose m, the
“median of medians”, as the pivot,

both |X| and |Y| will be at most 3|S|/4

• Consequently,
T(n) = T(dn/5e) + (n) + T(3n/4)

 T(n) = (n) (obtained by substitution)

22
Median of Medians
• Let’
s begin with dn/5e sorted groups, each
has 5 items (one group may have fewer)

= larger = median = smaller

23
Median of Medians
• Then, we obtain the median of medians, m

Groups with median Groups with median


smaller than m =m larger than m 24
Median of Medians
Then, we know that all items marked with
X have value at most m
X X
X
X X X X
X X
X X
X

Groups with median


smaller than m =m X=“
value m”
25
Median of Medians
The number of items with value at most m
is at least
3(ddn/5e/2e–1) - 2

one group may have


each full group has min # of only 1 ‘
crossed’
item
3‘crossed’ items groups

 number of items: at least 3n/10 –5


26
Median of Medians
Previous page implies that at most
7n/10 + 5 items
are greater than m

 For large enough n (say, n 100)


7n/10 + 5  3n/4
 |Y| is at most 3n/4 for large enough n

27
Median of Medians
Similarly, we can show that at most
7n/10 + 5 items are smaller than m
 |X| is at most 3n/4 for large enough n

Conclusion:
The “ median of medians”helps us control
the worst-case size of the sub-problem
 without it, the algorithm runs in (n2)
time in the worst-case
28

You might also like