
Lecture Notes: 2D Maxima

Yufei Tao
Department of Computer Science and Engineering
Chinese University of Hong Kong
[email protected]

Jan 8, 2014

In this lecture, we will discuss the maxima problem defined as follows. Let (x1 , y1 ) and (x2 , y2 ) be
two different points in R2 . We say that the former dominates the latter if x1 ≥ x2 and y1 ≥ y2
(note that the two equalities cannot hold simultaneously because these are two different points).
Let P be a set of n points in R2 . A point p ∈ P is a maximal point of P if p is not dominated by any
point in P . We want to design an algorithm to report all the maximal points of P efficiently. In the
example of Figure 1, points 1, 2, 6, and 8 should be reported. We will assume that P is in general
position (in particular, no two points have the same x-coordinate, or the same y-coordinate). It
is not hard to observe that the maximal points must form a “staircase”, namely, if we walk along
them in ascending order of x-coordinate, their y-coordinates must be descending.

[figure omitted: eight points in the plane labeled 1–8; points 1, 2, 6, and 8 form the maximal staircase]

Figure 1: An example

Whether a point p ∈ P is dominated by any other point in P can be easily checked in O(n)
time. This implies a naive algorithm that finds all the maximal points in O(n2 ) time. We can, in
fact, settle the problem in O(n log n) time. First sort P in descending order of x-coordinate. Let
the sorted order be p1 , p2 , ..., pn . Then, we process the points in the sorted order by adhering to the
invariant that, after having processed pi , we have computed all the maximal points of {p1 , ..., pi },
and stored them in descending order of x-coordinate. For example, in Figure 1, after finishing with
point 4, we should be maintaining a sorted list: point 1, point 2. This turns out to be very easy.
First, p1 must be a maximal point. In general, suppose that the invariant holds after pi ; we process
pi+1 as follows. Let p∗ be the highest point among p1 , ..., pi (p∗ can be easily maintained). Observe that
pi+1 is a maximal point if and only if the y-coordinate of pi+1 is greater than that of p∗ . Hence,
we can decide whether to append pi+1 to our maximal list in just O(1) time. The total time spent
on processing p1 , ..., pn is thus O(n). The overall running time is O(n log n), which is determined
by sorting.
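As a concrete illustration, the sorting-based algorithm might be sketched as follows (the function name and sample coordinates are mine, not from the notes):

```python
def find_maxima(points):
    """Return the maximal points in descending order of x-coordinate.

    Sort by descending x, then keep a point iff its y-coordinate
    exceeds the highest y seen so far (the point p* in the text).
    """
    maxima = []
    best_y = float("-inf")  # y-coordinate of p*
    for x, y in sorted(points, reverse=True):  # descending x
        if y > best_y:  # not dominated by any point to its right
            maxima.append((x, y))
            best_y = y
    return maxima

# Example: only (5, 1), (4, 2), and (2, 4) are maximal.
print(find_maxima([(1, 3), (2, 4), (4, 2), (3, 1), (5, 1)]))
# → [(5, 1), (4, 2), (2, 4)]
```

The sort dominates the cost; the sweep itself is O(n), matching the analysis above.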
It can also be shown that any comparison-based algorithm must use Ω(n log n) time to solve the
problem. Hence, the algorithm explained earlier is already optimal. Although it may seem that
we have solved the problem, our lecture has actually just started. We will give an output-sensitive
algorithm that finishes in O(n log k) time, where k is the number of maximal points. Attention

should be paid to the methodology behind this algorithm, which is very useful in designing output-sensitive
algorithms. Before continuing, let us first observe a simple O(nk)-time algorithm. First,
find the rightmost point p of P in O(n) time, which must be a maximal point. Then, in O(n) time
remove from P all the points dominated by p and also p itself. Now, the rightmost point of (the
remaining) P is also guaranteed to be a maximal point. Hence, we repeat the above steps until P
becomes empty.
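The O(nk) algorithm can be sketched directly (illustrative name; tuple comparison finds the rightmost point since general position gives distinct x-coordinates):

```python
def find_maxima_nk(points):
    """O(nk)-time maxima: repeatedly take the rightmost remaining point
    (always maximal) and discard it together with everything it dominates."""
    points = list(points)
    maxima = []
    while points:
        p = max(points)  # rightmost point (largest x); it is maximal
        maxima.append(p)
        # Remove p and every point dominated by p.
        points = [q for q in points
                  if not (q[0] <= p[0] and q[1] <= p[1])]
    return maxima
```

Each iteration costs O(n) and reports one maximal point, hence O(nk) overall.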

1 Utilizing An Upper Bound of k


Let us first assume that, by magic, we know an upper bound k̂ of k (e.g., k̂ = n is a trivial upper
bound). We will design an algorithm whose efficiency depends on k̂.
First, divide P by x-coordinate into k̂ subsets P1 , ..., Pk̂ such that (i) every point in Pi has a
larger x-coordinate than all the points in Pj for any 1 ≤ i < j ≤ k̂, and (ii) |P1 | = |P2 | = ... =
|Pk̂−1 | = ⌈n/k̂⌉ (note that this condition implies |Pi | = O(n/k̂) for all i ∈ [1, k̂]). This can be easily
done in O(n log k̂) time using the multi-rank selection algorithm described in the appendix.
Next, we process the subsets Pi in ascending order of i. Our invariant is that, after we are
done with Pi , we must have computed the maximal points of P1 ∪ ... ∪ Pi (observe that they must
also be maximal points of P ). We achieve the purpose as follows. First, all the maximal points
of P1 are found in O(t1 |P1 |) = O(t1 n/k̂) time, where t1 is the number of those points. In general,
assuming that the invariant holds after Pi , we process Pi+1 as follows. Let p∗ be the highest of all
the maximal points in P1 ∪ ... ∪ Pi . Scan Pi+1 to remove all the points dominated by p∗ . Then, find
all the maximal points of the remaining Pi+1 in O(ti+1 |Pi+1 |) = O(ti+1 n/k̂) time, where ti+1 is the
number of those points—note that all these points must also be maximal points of P1 ∪ ... ∪ Pi+1 .
Overall, we spend O((n/k̂)(t1 + t2 + ... + tk̂ )) = O((n/k̂) · k) = O(n) time.
We thus have proved:

Lemma 1. If an upper bound k̂ of k is known, we can find all the maximal points in at most
cn log k̂ time for some constant c.

You may be puzzled why the lemma states the constant c explicitly—usually, we hide such
constants with big-O. There is, however, a good reason to do so, as will be clear in the next
section.
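The Section 1 procedure might be sketched as follows (all names are mine; a full sort stands in for the appendix's multi-rank selection, so this sketch costs O(n log n) to set up rather than O(n log k̂), but the chunk-by-chunk processing is as described above):

```python
import math

def maxima_with_bound(points, k_hat):
    """Find all maximal points given an upper bound k_hat on their count.

    Partition the points by descending x-coordinate into k_hat chunks
    P_1, ..., P_k_hat, then process the chunks left to right, discarding
    points dominated by p* (the highest maximal point found so far).
    """
    pts = sorted(points, reverse=True)            # descending x
    size = math.ceil(len(pts) / k_hat)
    maxima, best_y = [], float("-inf")
    for i in range(0, len(pts), size):
        # Remove from P_i every point dominated by p*.
        alive = [(x, y) for (x, y) in pts[i:i + size] if y > best_y]
        # Find the maxima of the survivors by repeated rightmost scans,
        # as in the O(nk) algorithm.
        while alive:
            p = max(alive)                        # rightmost survivor
            maxima.append(p)
            best_y = p[1]
            alive = [q for q in alive if q[1] > p[1]]
    return maxima
```

The inner while loop touches each chunk point O(ti ) times, mirroring the O(ti n/k̂) bound per chunk.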

2 The Final Algorithm


Lemma 1 is not immediately helpful—after all, if we set k̂ to the trivial bound n, then the running
time O(n log k̂) is no better than O(n log n), which we have already achieved. However, a more
clever use of the lemma leads to the desired O(n log k) bound. The main idea is to let the algorithm
take a guess k′ of k. Initially, the algorithm sets k′ to a small constant, and gradually increases it if k′ < k. But
how expensive is it to find out whether k′ < k? The answer is O(n log k′ ), thanks to Lemma 1.
Specifically, we simply run the algorithm of Section 1 by setting k̂ = k′ , and keep monitoring
the algorithm’s cost (this means counting the number of unit-time atomic operations in the RAM
model). We know that, if k′ ≥ k, then by Lemma 1, the algorithm should terminate within cn log k′ time.
Hence, as soon as the algorithm’s cost reaches 1 + cn log k′ , we can manually force the algorithm
to terminate, and declare that k′ < k.
Motivated by this, we start with k′ = 2^1 . If k′ < k, we increase k′ to 2^2 and try again. In
general, if k′ = 2^(2^i) is still smaller than k, the next k′ we will try is min{2^(2^(i+1)), n}. Clearly, this

algorithm will eventually find all the maximal points—it does so when k′ is at least k for the first
time.
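The guessing loop can be sketched as follows (all names are mine; for simplicity this toy version aborts a trial as soon as more than k′ maximal points have been found, which certifies k′ < k just as well as the cost monitoring described above, and it uses a plain sort rather than the Lemma 1 partition):

```python
def maxima_guessing(points):
    """Repeated-squaring driver: try guesses k' = 2^(2^0), 2^(2^1), ...;
    abort a trial once it reports more than k' maximal points."""
    pts = sorted(points, reverse=True)      # descending x
    n = len(points)
    k_prime = 2                             # start with 2^(2^0) = 2
    while True:
        maxima, best_y = [], float("-inf")
        for x, y in pts:
            if y > best_y:
                maxima.append((x, y))
                best_y = y
                if len(maxima) > k_prime:   # guess too small: k' < k
                    break
        else:
            return maxima                   # trial finished, so k' >= k
        k_prime = min(k_prime ** 2, n)      # square the guess and retry
```

Squaring the guess is what makes the trial costs form the geometric sum analyzed below.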
At first glance, it may appear that the running time is expensive because we may have to
attempt multiple values of k′ before the algorithm stops. A careful calculation reveals that this
intuition is not true. Suppose that eventually the algorithm stops at k′ = 2^(2^i) . The total running
time is:

O(n log 2^(2^0) + n log 2^(2^1) + n log 2^(2^2) + n log 2^(2^3) + ... + n log 2^(2^i))
= O(n(2^0 + 2^1 + 2^2 + 2^3 + ... + 2^i))
= O(n · 2^i)

How large is 2^i ? The definition of i implies 2^(2^(i−1)) < k, namely, 2^(i−1) < log2 k. Hence, 2^i < 2 log2 k.
We thus have designed an algorithm solving the maxima problem in O(n · 2^i) = O(n log k) time.

Appendix: Multi-Rank Selection


Let S be a set of n real values. We say that a value v ∈ S has rank i if |{u ∈ S | u ≥ v}| = i (i.e.,
the largest value in S has rank 1, the second largest rank 2, ...). Given any rank r ∈ [1, n], the
element with rank r can be selected in linear time O(n) using a textbook rank selection algorithm
(note that this does not require sorting S).
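For illustration, here is a quickselect-style sketch of single-rank selection (my own naming; this randomized version runs in expected linear time, whereas the worst-case O(n) bound assumed in the notes requires a median-of-medians pivot, which the sketch omits):

```python
import random

def rank_select(S, r):
    """Return the element of rank r in S (rank 1 = largest)."""
    pivot = random.choice(S)
    above = [v for v in S if v > pivot]   # ranks 1 .. len(above)
    below = [v for v in S if v < pivot]   # the len(below) smallest ranks
    if r <= len(above):
        return rank_select(above, r)
    if r > len(S) - len(below):
        return rank_select(below, r - (len(S) - len(below)))
    return pivot                          # r falls on the pivot's rank
```

Each recursive call discards the partition not containing rank r, giving the expected O(n) bound.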
In the multi-rank selection problem, we are given k ranks r1 , ..., rk in ascending order,
and need to find the k corresponding elements. This is doable in O(n log k) time as follows.
Without loss of generality, let us assume that k is a power of 2. We first pick the median rank rk/2 of
{r1 , ..., rk }, and find the element e with rank rk/2 . Then, divide S into S1 and S2 such that (i) the
former includes all the elements of S that are at least e, and (ii) the latter includes the other elements of
S. Note that |S1 | = rk/2 . We now recurse on two instances of the multi-rank selection problem: the first one on S1 with
ranks r1 , ..., rk/2 , and the second one on S2 with ranks r1+k/2 − rk/2 , r2+k/2 − rk/2 , ..., rk − rk/2 .
Let us analyze the running time. Define f (n, k) to be the time of the above algorithm issued on a
set S (n = |S|) and k designated ranks. If k = 1, we know f (n, 1) = O(n). For k > 1, we have:

f (n, k) = O(n) + f (n1 , k/2) + f (n − n1 , k/2)

where n1 = |S1 |. Each level of the recursion performs O(n) work in total, and there are O(log k)
levels, so solving the recurrence gives f (n, k) = O(n log k).
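A sketch of this recursion (the function name is mine; `sorted` stands in for a worst-case linear-time selection, and the values are assumed distinct, as duplicates would blur the rank definition):

```python
def multi_rank_select(S, ranks):
    """Return the elements of the given ascending ranks (rank 1 = largest)."""
    if not ranks:
        return []
    if len(ranks) == 1:
        # Base case: single-rank selection (sorting stands in for O(n) select).
        return [sorted(S, reverse=True)[ranks[0] - 1]]
    mid = len(ranks) // 2
    r_mid = ranks[mid - 1]                    # the median rank r_{k/2}
    e = sorted(S, reverse=True)[r_mid - 1]    # element of rank r_{k/2}
    S1 = [v for v in S if v >= e]             # holds ranks 1 .. r_{k/2}
    S2 = [v for v in S if v < e]              # holds the remaining ranks
    # Recurse; ranks in S2 shift down by |S1| = r_{k/2}.
    return (multi_rank_select(S1, ranks[:mid]) +
            multi_rank_select(S2, [r - r_mid for r in ranks[mid:]]))
```

With true linear-time selection at each node, every recursion level costs O(n) and there are O(log k) levels, matching the recurrence above.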
