Complexity, the Changing Minimum and Closest Pair

Sariel Har-Peled
November 29, 2005
598shp - Randomized Algorithms
1 Las Vegas and Monte Carlo algorithms
Definition 1.1 A Las Vegas algorithm is a randomized algorithm that always returns the correct
result. The only thing that varies between executions is its running time.
An example of a Las Vegas algorithm is the QuickSort algorithm.
Definition 1.2 A Monte Carlo algorithm is a randomized algorithm that might output an incorrect
result. However, the probability of error can be diminished by repeated executions of the algorithm.
The MinCut algorithm was an example of a Monte Carlo algorithm.
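To make the distinction concrete, here is a minimal Python sketch (not from the notes) of the Las Vegas example above: randomized QuickSort always returns a correctly sorted list, and only its running time depends on the random pivot choices.

```python
import random

def quicksort(arr):
    """Las Vegas: the output is always correct; only the running time is random."""
    if len(arr) <= 1:
        return list(arr)
    pivot = random.choice(arr)  # the random pivot is the only source of randomness
    less = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```

Whatever pivots are chosen, the result is the sorted sequence; a bad run of pivots only makes the algorithm slower, never wrong.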
1.1 Complexity Classes
I assume people know what Turing machines, NP, NPC, RAM machines, the uniform model, the
logarithmic model, PSPACE, and EXP are. If you do not know what these things are, you should
read about them. Some of that is covered in the randomized algorithms book, and some other
material is covered in any basic text on complexity theory.
Definition 1.3 The class P consists of all languages L that have a polynomial time algorithm A,
such that for any input x ∈ Σ*:
x ∈ L ⇒ A(x) accepts.
x ∉ L ⇒ A(x) rejects.
Definition 1.4 The class NP consists of all languages L that have a polynomial time algorithm
A, such that for any input x ∈ Σ*:
x ∈ L ⇒ there exists y ∈ Σ* such that A(x, y) accepts, where |y| (i.e., the length of y) is bounded
by a polynomial in |x|.
x ∉ L ⇒ for all y ∈ Σ*, A(x, y) rejects.
Definition 1.5 For a complexity class C, we define the complementary class co-C as the set of
languages whose complement is in the class C. That is,
co-C = { L̄ | L ∈ C },
where L̄ = Σ* \ L.
It is obvious that P = co-P and P ⊆ NP ∩ co-NP. (It is currently unknown whether P =
NP ∩ co-NP or whether NP = co-NP, although both equalities are believed to be false.)
Definition 1.6 The class RP (for Randomized Polynomial time) consists of all languages L that
have a randomized algorithm A with worst case polynomial running time such that for any input
x ∈ Σ*:
x ∈ L ⇒ Pr[A(x) accepts] ≥ 1/2.
x ∉ L ⇒ Pr[A(x) accepts] = 0.
An RP algorithm is a Monte Carlo algorithm, but the mistake can only occur if x ∈ L. The class
co-RP consists of all the languages that have a Monte Carlo algorithm that makes a mistake only if
x ∉ L. A problem which is in RP ∩ co-RP has an algorithm that does not make a mistake, namely
a Las Vegas algorithm.
Definition 1.7 The class ZPP (for Zero-error Probabilistic Polynomial time) is the class of
languages that have Las Vegas algorithms in expected polynomial time.
Definition 1.8 The class PP (for Probabilistic Polynomial time) is the class of languages that
have a randomized algorithm A with worst case polynomial running time such that for any input
x ∈ Σ*:
x ∈ L ⇒ Pr[A(x) accepts] > 1/2.
x ∉ L ⇒ Pr[A(x) accepts] < 1/2.
The class PP is not very useful. Why? (Hint: the gap between the two acceptance probabilities
can be exponentially small, so amplifying it by repetition may take exponentially many runs.)
Definition 1.9 The class BPP (for Bounded-error Probabilistic Polynomial time) is the class of
languages that have a randomized algorithm A with worst case polynomial running time such that
for any input x ∈ Σ*:
x ∈ L ⇒ Pr[A(x) accepts] ≥ 3/4.
x ∉ L ⇒ Pr[A(x) accepts] ≤ 1/4.
2 How many times can a minimum change, before it is THE minimum?
Let a_1, ..., a_n be a set of n numbers, and let us randomly permute them into the sequence
b_1, ..., b_n. Next, let c_i = min_{k=1..i} b_k, and let X be the random variable which is the number
of distinct values that appear in the sequence c_1, ..., c_n. What is the expectation of X?
Lemma 2.1 In expectation, the number of times the minimum of a prefix of n randomly permuted
numbers changes is O(log n). That is, E[X] = O(log n).
Proof: Consider the indicator variable X_i, such that X_i = 1 if c_i ≠ c_{i−1}. The probability for
that is ≤ 1/i, since this is the probability that the smallest number of b_1, ..., b_i is b_i. As such,
we have X = Σ_i X_i, and
E[X] = Σ_i E[X_i] ≤ Σ_{i=1..n} 1/i = O(log n).
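The lemma is easy to check empirically. The following sketch (the function names are my own) counts the number of prefix-minimum changes in a permutation, and averages that count over random permutations; the average should hover around the harmonic number H_n = Θ(log n).

```python
import random

def count_prefix_min_changes(values):
    """Number of distinct values in the prefix-minima sequence c_1, ..., c_n."""
    changes = 0
    current_min = float("inf")
    for v in values:
        if v < current_min:  # the prefix minimum changed at this position
            current_min = v
            changes += 1
    return changes

def average_changes(n, trials=2000, seed=0):
    """Monte Carlo estimate of E[X] over random permutations of {0, ..., n-1}."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        total += count_prefix_min_changes(perm)
    return total / trials
```

For n = 100 the harmonic number is H_100 ≈ 5.19, and the empirical average lands close to it.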
3 Closest Pair
Assumption 3.1 Throughout the discourse, we are going to assume that every hashing operation
takes (worst case) constant time. This is quite a reasonable assumption when true randomness is
available (using for example perfect hashing [CLRS01]). We probably will revisit this issue later in
the course.
For r a real positive number and a point p = (x, y) in R², define G_r(p) to be the point
(⌊x/r⌋·r, ⌊y/r⌋·r). We call r the width of the grid G_r. Observe that G_r partitions the plane into
square regions, which we call grid cells. Formally, for any i, j ∈ Z, the intersection of the half-planes
x ≥ ri, x < r(i + 1), y ≥ rj and y < r(j + 1) is said to be a grid cell. Further, we define a grid
cluster as a block of 3 × 3 contiguous grid cells.
For a point set P, and a parameter r, the partition of P into subsets by the grid G_r is denoted
by G_r(P). More formally, two points p, q ∈ P belong to the same set in the partition G_r(P) if
both points are mapped to the same grid point, or equivalently, belong to the same grid cell.
Note that every grid cell C of G_r has a unique ID; indeed, let p = (x, y) be any point in C,
and consider the pair of integer numbers id_C = id(p) = (⌊x/r⌋, ⌊y/r⌋). Clearly, only points inside
C are going to be mapped to id_C. This is very useful, since we can store a set P of points inside a
grid efficiently. Indeed, given a point p, compute its id(p). We associate with each unique id a
data-structure that stores all the points falling into this grid cell (of course, we do not maintain
such data-structures for grid cells which are empty). So, once we have computed id(p), we fetch the
data structure for this cell by using hashing. Namely, we store pointers to all those data-structures
in a hash table, where each such data-structure is indexed by its unique id. Since the ids are integer
numbers, we can do the hashing in constant time.
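The grid-cell id and the hash-table storage just described can be sketched as follows (the names `grid_id` and `Grid` are mine, and a Python dict stands in for the constant-time hash table of Assumption 3.1):

```python
import math
from collections import defaultdict

def grid_id(p, r):
    """Unique id of the grid cell of width r containing the point p = (x, y)."""
    x, y = p
    return (math.floor(x / r), math.floor(y / r))

class Grid:
    """Points bucketed by grid cell, in a hash table keyed by the cell id."""
    def __init__(self, r):
        self.r = r
        self.cells = defaultdict(list)  # id -> list of points in that cell

    def insert(self, p):
        self.cells[grid_id(p, self.r)].append(p)

    def cell(self, p):
        """All stored points lying in the same grid cell as p."""
        return self.cells.get(grid_id(p, self.r), [])
```

Both insertion and cell lookup reduce to one id computation plus one hash-table access, i.e., constant time under the hashing assumption.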
We are interested in solving the following problem:
Problem 3.2 Given a set P of n points in the plane, find the pair of points closest to each other.
Formally, return the pair of points realizing CP(P) = min_{p,q ∈ P, p ≠ q} ‖pq‖.
Lemma 3.3 Given a set P of n points in the plane, and a distance r, one can verify in linear
time whether CP(P) < r or CP(P) ≥ r.
Proof: Indeed, store the points of P in the grid G_r. For every non-empty grid cell, we maintain
a linked list of the points inside it. Thus, adding a new point p takes constant time. Indeed,
compute id(p), and check if id(p) already appears in the hash table; if not, create a new linked list
for the cell with this ID number, and store p in it. If a data-structure already exists for id(p), just
add p to it.
This takes O(n) time. Now, if any grid cell in G_r(P) contains more than, say, 9 points of P,
then it must be that CP(P) < r. Indeed, consider a cell C containing more than nine points of
P, and partition C into a 3 × 3 grid of equal squares. By the pigeonhole principle, one of those
squares must contain two points of P; let C′ be this square. Clearly, diam(C′) = diam(C)/3 =
√(r² + r²)/3 < r. Thus, the (at least) two points of P in C′ are at distance smaller than r from
each other.
Thus, when we insert a point p, we can fetch all the points of P that were already inserted,
in the cell of p and the 8 adjacent cells. All those cells must each contain at most 9 points of P
(otherwise, we would already have stopped, since the CP(·) of the inserted points is smaller than
r). Let S be the set of all those points, and observe that |S| ≤ 9 · 9 = O(1). Thus, we can compute
by brute force the closest point to p in S, where d(p, S) = min_{s ∈ S} ‖ps‖. This takes O(1) time.
If d(p, S) < r, we stop; otherwise, we continue to the next point.
Overall, this takes O(n) time. As for correctness, first observe that if CP(P) ≥ r then the
algorithm never makes a mistake, since it returns "CP(P) < r" only after finding a pair of
points of P at distance smaller than r. Thus, assume that p, q are the pair of points of P realizing
the closest pair, with ‖pq‖ = CP(P) < r. Clearly, when the later of them, say p, is being inserted,
the set S contains q, and as such the algorithm stops and returns "CP(P) < r".
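The decision procedure of Lemma 3.3 can be sketched as follows (the function name is mine, and a dict keyed by cell id replaces the hash table, under the constant-time hashing assumption; scanning the 3 × 3 cluster subsumes the "at most 9 points per cell" early stop, since a crowded cell is caught the moment its tenth point is inserted):

```python
import math
from collections import defaultdict

def closer_than(points, r):
    """Decide whether CP(points) < r by incremental insertion into a grid of
    width r; O(n) time under the constant-time hashing assumption."""
    cells = defaultdict(list)  # cell id -> points inserted so far in that cell
    for p in points:
        ci, cj = (math.floor(p[0] / r), math.floor(p[1] / r))
        # Any earlier point within distance r of p lies in p's grid cluster,
        # i.e., in p's cell or one of the 8 adjacent cells.
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                for q in cells[(ci + di, cj + dj)]:
                    if math.dist(p, q) < r:
                        return True  # found a pair at distance < r: stop early
        cells[(ci, cj)].append(p)
    return False  # all pairwise distances are >= r
```

The key invariant is that a point within distance r of p cannot escape the cluster of p, because each cell has side length exactly r.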
Lemma 3.3 gives a natural way of computing CP(P). Indeed, permute the points of P in
arbitrary fashion, and let P = ⟨p_1, ..., p_n⟩. Next, let r_i = CP({p_1, ..., p_i}). We can check if
r_{i+1} < r_i by just calling the algorithm of Lemma 3.3 on P_{i+1} and r_i. In fact, if r_{i+1} < r_i,
the algorithm of Lemma 3.3 would give us back the distance r_{i+1} (with the pair of points realizing
this distance).
In fact, consider the good case, where r_{i+1} = r_i = r_{i−1}. Namely, the length of the shortest
pair does not change for a while. In this case, we do not need to rebuild the data structure of
Lemma 3.3 for each point. We can just reuse it from the previous iteration. Thus, inserting a single
point takes constant time, as long as the closest pair does not change.
Things become bad when r_i < r_{i−1}. Because then, we need to rebuild the grid, and reinsert
all the points of P_i = ⟨p_1, ..., p_i⟩ into the new grid G_{r_i}(P_i). This takes O(i) time.
So, if the closest pair radius, in the sequence r_1, ..., r_n, changes only k times, then the running
time of our algorithm would be O(nk). In fact, we can do even better.
Theorem 3.4 Let P be a set of n points in the plane. One can compute the closest pair of points
of P in expected linear time.
Proof: Pick a random permutation of the points of P, and let ⟨p_1, ..., p_n⟩ be this permutation.
Let r_2 = ‖p_1 p_2‖, and start inserting the points into the data structure of Lemma 3.3. In the ith
iteration, if r_i = r_{i−1}, then this insertion takes constant time. If r_i < r_{i−1}, then we rebuild
the grid and reinsert the points. Namely, we recompute G_{r_i}(P_i).
To analyze the running time of this algorithm, let X_i be the indicator variable which is 1 if
r_i < r_{i−1}, and 0 otherwise. Clearly, the running time is proportional to
R = 1 + Σ_{i=2..n} (1 + X_i · i).
Thus, the expected running time is
E[R] = 1 + E[Σ_{i=2..n} (1 + X_i · i)] = n + Σ_{i=2..n} E[X_i] · i = n + Σ_{i=2..n} i · Pr[X_i = 1],
by linearity of expectation, and since for an indicator variable X_i we have E[X_i] = Pr[X_i = 1].
Thus, we need to bound Pr[X_i = 1] = Pr[r_i < r_{i−1}]. To bound this quantity, fix the points
of P_i, and randomly permute them. A point q ∈ P_i is called critical if CP(P_i \ {q}) > CP(P_i).
If there are no critical points, then r_{i−1} = r_i, and then Pr[X_i = 1] = 0. If there is one critical
point, then Pr[X_i = 1] = 1/i, as this is the probability that this critical point would be the last
point in the random permutation of P_i.
If there are two critical points, then let p, q be this unique pair of points of P_i realizing CP(P_i).
The quantity r_i is smaller than r_{i−1} only if either p or q is p_i. But the probability for that is
2/i (i.e., the probability, in a random permutation of i objects, that one of two marked objects
would be the last element in the permutation).
Observe that there cannot be more than two critical points. Indeed, if p and q are two points
realizing the closest distance, and s is any third point, then CP(P_i \ {s}) = ‖pq‖ = CP(P_i),
and so s is not critical.
We conclude that
E[R] = n + Σ_{i=2..n} i · Pr[X_i = 1] ≤ n + Σ_{i=2..n} i · (2/i) ≤ 3n.
We have that the expected running time is O(E[R]) = O(n).
Theorem 3.4 is a surprising result, since it implies that uniqueness (i.e., deciding if n real
numbers are all distinct) can be solved in linear time. However, there is a lower bound of Ω(n log n)
on uniqueness in the comparison tree model. This reality dysfunction can be easily explained
once one realizes that the model of computation of Theorem 3.4 is considerably stronger, using
hashing, randomization, and the floor function.
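Putting the pieces together, the algorithm of Theorem 3.4 can be sketched as follows (a simplified version with my own naming; as in the proof, a rebuild recomputes the grid from scratch with the new, smaller width):

```python
import math
import random
from collections import defaultdict

def closest_pair_distance(points):
    """Closest-pair distance in expected O(n) time, via randomized incremental
    insertion into a grid, rebuilding whenever the closest pair gets closer."""
    pts = list(points)
    random.shuffle(pts)  # the random permutation p_1, ..., p_n
    r = math.dist(pts[0], pts[1])  # r_2 = ||p_1 p_2||
    if r == 0:
        return 0.0
    cells = defaultdict(list)  # cell id -> inserted points in that cell

    def cell_id(p, width):
        return (math.floor(p[0] / width), math.floor(p[1] / width))

    cells[cell_id(pts[0], r)].append(pts[0])
    cells[cell_id(pts[1], r)].append(pts[1])
    for i in range(2, len(pts)):
        p = pts[i]
        ci, cj = cell_id(p, r)
        # The nearest inserted point, if closer than r, lies in p's 3x3 cluster.
        best = min((math.dist(p, q)
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)
                    for q in cells[(ci + di, cj + dj)]),
                   default=r)
        cells[(ci, cj)].append(p)
        if best < r:
            # The closest pair got closer: rebuild the grid with width r_i = best.
            r = best
            if r == 0:
                return 0.0
            cells = defaultdict(list)
            for q in pts[:i + 1]:
                cells[cell_id(q, r)].append(q)
    return r
```

Each non-rebuilding insertion is O(1), and the rebuild in iteration i costs O(i) but happens with probability at most 2/i, which is exactly the accounting in the proof above.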
4 Bibliographical notes
Section 1 follows [MR95, Section 1.5]. The closest-pair algorithm follows Golin et al. [GRSS95].
This is in turn a simplication of a result of Rabin [Rab76]. Smid provides a survey of such
algorithms [Smi00].
References
[CLRS01] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms.
MIT Press / McGraw-Hill, Cambridge, Mass., 2001.
[GRSS95] M. Golin, R. Raman, C. Schwarz, and M. Smid. Simple randomized algorithms for
closest pair problems. Nordic J. Comput., 2:3–27, 1995.
[MR95] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press,
New York, NY, 1995.
[Rab76] M. O. Rabin. Probabilistic algorithms. In J. F. Traub, editor, Algorithms and Complexity:
New Directions and Recent Results, pages 21–39. Academic Press, New York, NY,
1976.
[Smi00] M. Smid. Closest-point problems in computational geometry. In Jörg-Rüdiger Sack and
Jorge Urrutia, editors, Handbook of Computational Geometry, pages 877–935. Elsevier
Science Publishers B.V. North-Holland, Amsterdam, 2000.