Approx Lecture03-3 KCentre
Lecture 3: 9 September
Lecturer: J. van Leeuwen Scribes: Shay Uzery & Kasper van den Berg
3.1 Overview
During this lecture we will consider facility location problems. A typical question in facility
location is to determine a set of locations in a network such that all other locations (nodes) are
within “easy reach”. Facility location problems can be modelled by network models where
the edges are labelled with ‘distances’ or ‘costs’. In this lecture we will discuss the following
facility location problems: ‘finding dominating locations’, the ‘k-center problem’, and the
‘lazy tourist problem’.
Finding dominating locations (i.e. finding a dominating set) is a basic case of the facility
location problem. Here only existing edges count, and their weight is ignored.
Finding a minimum size dominating set is computationally hard – there is no known polyno-
mial time algorithm to compute it. We assume from now on that the networks we consider
are connected. If a network consists of unconnected subgraphs, the minimum dominating set
is the union of the minimum dominating sets of the subgraphs.
The ‘naive’ way to compute it is to check all $\binom{n}{k}$ subsets of V of size k, incrementing k
from 1 to n until a minimum size dominating set is found. Let OPT be the size of a
minimum size dominating set in G.
Proposition 3.2 A minimum dominating set can be found in a number of steps bounded by
$O\left(\log OPT \cdot \max_{0 \le k \le 2 \cdot OPT} \binom{n}{k}\right)$.
Proof: If there is a dominating set of size OPT then there is one of any size larger than
OPT (any superset will do). A minimum size set can thus be found as follows. Increment k
in powers of two (1, 2, 4, . . . ) until a j is found such that there is no dominating set of size
$2^{j-1}$ but there is one of size $2^j$ or n, whichever is smaller. Carry out a binary search between
$2^{j-1}$ and $2^j$ to find the smallest k for which a dominating set of size k exists. The search
takes O(j) = O(log OPT) iterations, and each iteration involves testing $\binom{n}{k}$ subsets, for some
k ≤ 2 · OPT.
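The search described in the proof can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the adjacency representation `adj` (a dict mapping each node to the set of its neighbours) and all function names are assumptions made for the example.

```python
from itertools import combinations

def is_dominating(adj, subset):
    """True if every node is in `subset` or adjacent to a node of `subset`."""
    covered = set(subset)
    for u in subset:
        covered |= adj[u]
    return len(covered) == len(adj)

def has_dominating_set(adj, k):
    """Brute force over all C(n, k) subsets of size k; returns one or None."""
    for subset in combinations(sorted(adj), k):
        if is_dominating(adj, subset):
            return set(subset)
    return None

def minimum_dominating_set(adj):
    n = len(adj)
    # Doubling phase: try sizes 1, 2, 4, ... (capped at n) until one works.
    lo, hi = 1, 1
    while has_dominating_set(adj, hi) is None:
        lo = hi + 1              # every size up to hi has failed
        hi = min(2 * hi, n)
    # Binary search: sizes below lo fail, size hi succeeds.
    while lo < hi:
        mid = (lo + hi) // 2
        if has_dominating_set(adj, mid) is not None:
            hi = mid
        else:
            lo = mid + 1
    return has_dominating_set(adj, lo)

# A path on five nodes needs two dominating nodes.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
best = minimum_dominating_set(path)
```

The binary search relies on the monotonicity noted in the proof: if a dominating set of size k exists, then one exists for every larger size up to n.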
Can one bound OPT? Let $D_G$ be the minimum degree of any vertex in G.
Proposition 3.4 A minimum dominating set can be found in a number of steps bounded by
$O\left(\log L \cdot \max_{0 \le k \le L} \binom{n}{k}\right)$, with $L = n \cdot \frac{1 + \ln(D_G + 1)}{D_G + 1}$.
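To get a feeling for the size of the bound L, the short sketch below evaluates it numerically. The function name and the sample values (n = 100 and minimum degree 9) are chosen for illustration only.

```python
import math

def dominating_set_bound(n, min_degree):
    """Evaluate L = n * (1 + ln(D_G + 1)) / (D_G + 1) for minimum degree D_G."""
    return n * (1 + math.log(min_degree + 1)) / (min_degree + 1)

# With n = 100 and minimum degree 9, L is roughly 33,
# so the search only needs to consider subset sizes up to about a third of n.
bound = dominating_set_bound(100, 9)
```

For minimum degree 0 the formula gives L = n, which is no improvement over the trivial bound.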
Theorem 3.5 (Baker, 1994) For the planar network case, there is an algorithm A running
on instances (G, ε) such that for any given ε > 0, A produces a dominating set of size ≤
(1 + ε) · OPT in time polynomial in |G|.
The running time of A may be exponential in 1/ε, however.
Definition 3.6 I ⊆ V is an independent set if every two nodes in I are not adjacent.
Definition 3.7 An independent set I is called maximal if for every node v ∉ I, I ∪ {v} is
not independent.
Definition 3.8 An independent set is called a maximum independent set if it has the largest
possible size of any independent set in G.
Computing a maximal independent set is ‘easy’. It can be done in polynomial time as follows:
start with S = ∅, add any v ∈ V such that {v} ∪ S is an independent set, and iterate. The
algorithm stops when for every v ∈ V \ S, S ∪ {v} is no longer independent.
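The greedy procedure above can be sketched as a single pass over the nodes. The representation (a dict `adj` from node to neighbour set) and the optional `order` parameter are assumptions made for this illustration.

```python
def maximal_independent_set(adj, order=None):
    """Greedily add every node that has no neighbour among the chosen nodes.

    One pass suffices: a node rejected once stays adjacent to a chosen node.
    """
    chosen = set()
    for v in (order if order is not None else sorted(adj)):
        if adj[v].isdisjoint(chosen):
            chosen.add(v)
    return chosen

path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
first = maximal_independent_set(path)                    # scans 0, 1, 2, 3, 4
second = maximal_independent_set(path, [1, 3, 0, 2, 4])  # scans 1 and 3 first
```

On this path the two scan orders give maximal independent sets of sizes 3 and 2, which shows that a maximal independent set need not be a maximum one.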
Note that squaring a planar network doesn’t have to result in a planar network again.
Lemma 3.10 Let I be a maximal independent set in G, S any minimum dominating set in G,
and J a maximal independent set in $G^2$. Then:
|J| ≤ |S| ≤ |I| (1)
Proof:
(a) |S| ≤ |I|.
We will show that I is a dominating set in G. For any node x ∈ V either x ∈ I or x is within
a distance 1 from a node in I, otherwise we could add x to I and I ∪ {x} would also be an
independent set, which is a contradiction to I being a maximal independent set. Thus I is a
dominating set in G. Since S is a minimum dominating set in G, |S| ≤ |I|.
(b) |J| ≤ |S|.
By squaring the network G, the neighbourhoods of all elements of S are turned into ‘cliques’
– groups of elements that are mutually connected. The cliques of the nodes u ∈ S span the
entire set V (some neighbourhoods of two nodes u1 ∈ S and u2 ∈ S might be combined into
one clique but this does not matter). At most one node of each clique can be in J. Thus
|J| ≤ |S|.
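The chain of inequalities can be checked on a small example. The sketch below assumes the usual definition of the square of a network (two distinct nodes are adjacent in $G^2$ when they are at distance at most 2 in G); the helper names and the dict-of-sets representation are illustrative choices.

```python
from itertools import combinations

def square(adj):
    """Connect every node to all nodes within distance 2 (assumed definition)."""
    sq = {}
    for u in adj:
        reach = set(adj[u])
        for w in adj[u]:
            reach |= adj[w]
        reach.discard(u)
        sq[u] = reach
    return sq

def maximal_independent_set(adj):
    """Greedy single pass in sorted node order."""
    chosen = set()
    for v in sorted(adj):
        if adj[v].isdisjoint(chosen):
            chosen.add(v)
    return chosen

def min_dominating_size(adj):
    """Brute-force size of a minimum dominating set (small inputs only)."""
    nodes = sorted(adj)
    for k in range(1, len(nodes) + 1):
        for subset in combinations(nodes, k):
            covered = set(subset).union(*(adj[u] for u in subset))
            if len(covered) == len(nodes):
                return k
    return 0

path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
size_I = len(maximal_independent_set(path))          # |I| in G
size_S = min_dominating_size(path)                   # |S| in G
size_J = len(maximal_independent_set(square(path)))  # |J| in the square
```

For the path on five nodes this gives |J| = 2, |S| = 2 and |I| = 3, in line with the lemma.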
The next problem concerns a metric-symmetric network. Here we assume that all edges are
present and weighted with ‘distances’ such that the triangle inequality is satisfied. We want
to find a dominating set of size ≤ k such that the maximum distance from any other node to its
nearest node in the set is minimised over all possible choices of a dominating set of size ≤ k.
Figure 2: The black nodes are the centers which cover all the nodes in the graph (the figure marks the cover radius).
Definition 3.11 The covering radius is the maximum of the minimum distances from any
node not in the dominating set to a node that is in the dominating set.
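As a small illustration of the definition, the sketch below computes the covering radius from a distance matrix. The names `dist` and `centers` and the sample points on a line are assumptions for the example. Including the centers themselves in the maximum is harmless, since each contributes distance 0.

```python
def covering_radius(dist, centers):
    """Largest distance from any node to its nearest center."""
    return max(min(dist[u][c] for c in centers) for u in range(len(dist)))

# Five points on a line; distances are absolute differences.
points = [0, 1, 3, 6, 7]
dist = [[abs(a - b) for b in points] for a in points]

radius_two_centers = covering_radius(dist, [1, 3])  # centers at points 1 and 6
radius_one_center = covering_radius(dist, [2])      # single center at point 3
```

Here the two centers give covering radius 2 (the point 3 is at distance 2 from the point 1), while the single center gives covering radius 4.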
Theorem 3.12 (Hochbaum & Shmoys, 1985) The k-center problem can be solved by an
approximation algorithm in polynomial time with performance ratio ≤ 2.
The key to the algorithm is a ‘bottleneck’ technique: start with an ‘empty’ network, add the
edges of E one by one, and see what ‘covering ranges’ are created.
Algorithm HS
(to be continued)
Definition 3.13 Let $G_i$ be the network with vertex set V and edges $e_1, \ldots, e_i$.
Observation:
The k-center problem is now the problem of determining the smallest i such that $G_i$ has a
dominating set of size ≤ k. The covering radius will be cost($e_i$).
Exercise. Argue that ‘the smallest i’ in the observation is well-defined. (Hint: what size
dominating set does $G_n$ have?)
Consider $G_i$ for increasing i. The minimum dominating set in $G_i$ will get smaller and smaller
as i increases. Lemma 3.10 gives a lower bound |J| on its size which is easy to compute, taking
J to be any maximal independent set in $G_i^2$. |J| also decreases when more edges are added.
As long as |J| > k, $G_i$ cannot have a dominating set of size ≤ k. We exploit this as follows.
Algorithm HS (continued)
Step 2. Start with a network $G_0 = \langle V, E_0 \rangle$ containing all vertices and no edges. Set
i = 1.
Step 3. Add edge $e_i$ to obtain the network $G_i$.
Step 4. Square $G_i$, resulting in $G_i^2$.
Step 5. Compute a maximal independent set $M_i$ in $G_i^2$.
Step 6. If $|M_i| > k$, then i := i + 1 and go back to step 3.
Step 7. Return $M_i$.
Comment: At this point i is the first index such that $|M_i| \le k$.
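Steps 2 to 7 can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the lecture's implementation: Step 1 is not shown in these notes and is assumed here to sort the edges by nondecreasing cost, squaring is assumed to connect nodes within distance 2, and the maximal independent set is computed greedily. All names are hypothetical.

```python
from itertools import combinations

def square(adj):
    """Connect every node to all nodes within distance 2 (assumed definition)."""
    sq = {}
    for u in adj:
        reach = set(adj[u])
        for w in adj[u]:
            reach |= adj[w]
        reach.discard(u)
        sq[u] = reach
    return sq

def maximal_independent_set(adj):
    """Greedy single pass in sorted node order."""
    chosen = set()
    for v in sorted(adj):
        if adj[v].isdisjoint(chosen):
            chosen.add(v)
    return chosen

def hochbaum_shmoys(dist, k):
    """Return (centers, cost of the last edge added) for a metric matrix."""
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(dist)
    # Assumed Step 1: order the edges by nondecreasing cost.
    edges = sorted(combinations(range(n), 2), key=lambda e: dist[e[0]][e[1]])
    adj = {v: set() for v in range(n)}      # Step 2: no edges yet
    for u, v in edges:                      # Step 3: add the next edge
        adj[u].add(v)
        adj[v].add(u)
        centers = maximal_independent_set(square(adj))  # Steps 4 and 5
        if len(centers) <= k:               # Steps 6 and 7
            return centers, dist[u][v]
    return set(range(n)), 0                 # reached only when n <= 1

# Two clusters of three points on a line, k = 2.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
centers, threshold = hochbaum_shmoys(dist, 2)
radius = max(min(dist[u][c] for c in centers) for u in range(len(points)))
```

On this input the sketch stops after the four edges of cost 1 and returns the nodes at positions 0 and 10, with covering radius 2. The optimum (centers at 1 and 11) has covering radius 1, so the factor 2 is attained here.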
[Illustration 3.3: a path of two edges, each of cost ≤ cost($e_i$), with nodes labelled v and w.]
Theorem 3.14 Algorithm HS always returns a solution to the k-center problem with a covering
radius ≤ 2 · OPT.
Proof: Let $G_j$ be the first network that has a dominating set of size ≤ k, so that cost($e_j$) = OPT.
According to Lemma 3.10, any maximal independent set in $G_j^2$ has size ≤ k, so the algorithm stops
at some index i with 1 ≤ i ≤ j. The networks are built in the order $G_1, \ldots, G_i, \ldots, G_j$ by adding
edges of nondecreasing cost, hence cost($e_i$) ≤ cost($e_j$) = OPT. We now consider
the quality of the set $M_i$ returned by the algorithm.
Clearly $|M_i| \le k$, thus $M_i$ is a feasible solution. Consider any node u ∈ G. $M_i$ is a maximal
independent set in $G_i^2$. It follows that $M_i$ is also a dominating set in $G_i^2$. Thus u is
either in $M_i$ or connected by an edge to some node v ∈ $M_i$ (in $G_i^2$). But then u is within a
distance of two edges from v in $G_i$, and both edges have cost ≤ cost($e_i$) (see illustration 3.3).
By the metric property of the network, the cost from u to v is
≤ 2 · cost($e_i$) ≤ 2 · OPT.
We note that algorithm HS is easily implemented to run in low polynomial time in |G|.
Algorithm HS does not build a candidate solution set for the k-center problem incrementally. A
greedy approach does. Let G = ⟨V, E⟩ be a metric-symmetric network again.
Definition 3.15 For u ∈ V and any subset J ⊆ V let d(u, J) be the shortest distance of u
to any node in J.
Consider the following algorithm that builds up a candidate set J in stages. In every stage it
adds the ‘farthest’ node into the set.
Algorithm G
Pick any initial vertex $v_0$. Set $J_0 = \{v_0\}$.
for i = 1 to k do
begin
$v_i$ := any node farthest from $J_{i-1}$
$d_i$ := the distance of $v_i$ to $J_{i-1}$
$J_i$ := $J_{i-1} \cup \{v_i\}$
end
return J = $J_{k-1}$
comment: $v_k$ is not included in the returned set
comment: $v_k$ is farthest from $J_{k-1}$, thus $d_k$ is the covering radius of J = $J_{k-1}$
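A compact Python sketch of Algorithm G follows. It keeps, for every node, the distance to the nearest chosen center, so each round costs O(n) time. The names `dist`, `start` and `farthest_first` are assumptions made for the illustration.

```python
def farthest_first(dist, k, start=0):
    """Greedy k-center: return (centers, covering radius).

    `centers` corresponds to J_{k-1} = {v_0, ..., v_{k-1}} and the returned
    radius to d_k, the distance of the farthest remaining node.
    """
    n = len(dist)
    centers = [start]
    nearest = list(dist[start])   # nearest[u] = distance from u to the centers
    for _ in range(min(k, n) - 1):
        far = max(range(n), key=lambda u: nearest[u])
        centers.append(far)
        nearest = [min(nearest[u], dist[far][u]) for u in range(n)]
    return centers, max(nearest)

# Two clusters of three points on a line, k = 2.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
centers, radius = farthest_first(dist, 2)
```

Starting from the point 0, the sketch adds the point 12 as the second center and reports covering radius 2, twice the optimum radius 1 obtained with centers at 1 and 11.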
Theorem 3.16 (Gonzalez, 1985) Algorithm G always returns a solution to the k-center
problem with a covering radius ≤ 2 · OPT.
Proof: Clearly |J| ≤ k by construction. We now consider the quality of the set J = $J_{k-1}$.
Let C be an optimum solution and let its covering radius be OPT. As $d_k$ is the covering
radius of J it follows that OPT ≤ $d_k$.
To measure the distance of any node to J it suffices to estimate $d_k$, as $v_k$ is at farthest distance
from it. Consider $J_k = \{v_0, \ldots, v_k\}$, a set with k + 1 elements. C covers the network with
≤ k ‘circles’ of radius ≤ OPT. Thus, by the pigeonhole principle, there must be a u ∈ C such that the circle around u
contains 2 elements $v_i$ and $v_j$, with necessarily $d(v_i, v_j) \le d(v_i, u) + d(u, v_j) \le 2 \cdot OPT$. Say i < j, thus $v_j$ is added
later to J than $v_i$ and $v_i \in J_{j-1}$. Now observe
$d_k \le d_j = d(v_j, J_{j-1}) \le d(v_j, v_i) \le 2 \cdot OPT$.
The first inequality follows because farthest distances decrease as j increases (or note that
$d(v_k, J_{k-1}) \le d(v_k, J_{j-1}) \le d(v_j, J_{j-1})$ by the choice of $v_j$ in the jth round). We conclude
that $d_k$ is within twice OPT.
Algorithm G again has an easy polynomial running time.
Both polynomial time algorithms we saw for the k-center problem achieve a performance ratio
of 2. Doing better seems to be very difficult. In fact, if it could be done we would have solved
a major open problem as stated in the following observation.
Theorem 3.17 If the metric, symmetric k-center problem could be solved by a polynomial
time algorithm with a performance ratio < 2, then the general dominating set problem could
be solved in polynomial time.
Proof: Consider the dominating set problem in its ‘decision’ variant: given k, is there a dominating
set of size k? We reduce the dominating set problem to an instance of the k-center problem.
Let G = ⟨V, E⟩ be any network, k the parameter for the dominating set query. Design a
‘complete’ symmetric network G′ with the following costs: cost(u, v) = 1 if (u, v) ∈ E, and cost(u, v) = 2 otherwise.
It is easily seen that G′ is metric. There is a k-center with radius ≤ 1 in G′ iff there is a
dominating set of size ≤ k in G.
Suppose the metric, symmetric k-center problem could be solved by a polynomial time algorithm
A with a performance ratio < 2. Run A on G′ with the given k. It will return a solution with covering radius
in the range [1, 2) if G has a dominating set of size ≤ k. It will return a value ≥ 2 if there is no such set. Thus,
the approximation algorithm can be used to decide the dominating set problem in polynomial
time.
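The reduction can be illustrated on a small network. The sketch below assumes the standard construction for this argument, in which edges of G get cost 1 and all other pairs get cost 2; the function names and the brute-force radius computation are illustrative and only suitable for tiny inputs.

```python
from itertools import combinations

def reduction_costs(adj):
    """Complete network G': cost 1 for edges of G, cost 2 for non-edges (assumed)."""
    nodes = sorted(adj)
    return {u: {v: 0 if u == v else (1 if v in adj[u] else 2) for v in nodes}
            for u in nodes}

def best_radius(cost, k):
    """Optimum k-center covering radius in G', by brute force (needs k <= n)."""
    nodes = sorted(cost)
    return min(max(min(cost[u][c] for c in cs) for u in nodes)
               for cs in combinations(nodes, k))

# The path on five nodes has no dominating set of size 1 but has one of size 2.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
cost = reduction_costs(path)
radius_k1 = best_radius(cost, 1)
radius_k2 = best_radius(cost, 2)
```

Because only the values 1 and 2 occur as costs, the optimum radius is 1 exactly when a dominating set of size ≤ k exists, and an approximation with ratio below 2 could not confuse the two cases.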
Solving the dominating set problem in polynomial time is intimately connected with the P-versus-NP
problem. By the theorem, so is the problem of finding a polynomial time algorithm
for the k-center problem with a performance ratio better than 2!
Variants of the k-center problem continue to be studied. It is a special case of the k-supplier
problem: given a set of suppliers Σ and a set of customers C, determine a subset S ⊆ Σ with
|S| ≤ k that covers C with smallest possible covering radius.
Consider a lazy tourist in a big city. Model the area he wants to visit as a graph, with the
nodes being all sites that can be visited and edges (u, v) representing that site u is visible
from v and vice versa.
The tourist wants to make a closed walk visiting the fewest possible number of places while
seeing all of them. (A walk is a tour that may pass through a node more than once.)
The lazy tourist problem can be approximated by finding a minimum size connected dominating
set, and vice versa: a lazy tourist walk constitutes a connected dominating set
[3].
Proposition 3.18 (a) If there exists a lazy tourist walk in G with k edges, then G has a
connected dominating set of size ≤ k.
(b) If G has a connected dominating set of size ≤ k, then there is a lazy tourist walk in G
with ≤ 2 · k edges.
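Proof:
(a) Take the ≤ k vertices visited during the walk.
(b) Let M be a connected dominating set of size ≤ k, T a spanning tree of the subgraph induced by M. Do a depth-first
traversal of T. The traversal uses every edge of T twice, giving a closed walk with 2 · (|M| − 1) ≤ 2 · k edges.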
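Part (b) can be sketched as a depth-first traversal that records each tree edge once on the way down and once on the way back. The function name, the dict-of-sets representation and the sample network are assumptions for this illustration; recursion is used for brevity and is only suitable for small inputs.

```python
def tourist_walk(adj, cds):
    """Closed walk through all nodes of a connected dominating set `cds`.

    The depth-first search implicitly builds a spanning tree of the subgraph
    induced by `cds`; every tree edge is walked twice.
    """
    start = min(cds)
    walk, seen = [start], {start}

    def visit(u):
        for v in sorted(adj[u]):
            if v in cds and v not in seen:
                seen.add(v)
                walk.append(v)   # go down the tree edge (u, v)
                visit(v)
                walk.append(u)   # come back along the same edge
    visit(start)
    if seen != set(cds):
        raise ValueError("the given set does not induce a connected subgraph")
    return walk

# Path on five sites; {1, 2, 3} is a connected dominating set.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
walk = tourist_walk(path, {1, 2, 3})
edges_used = len(walk) - 1
sites_seen = set(walk).union(*(path[u] for u in walk))
```

The walk 1, 2, 3, 2, 1 uses 4 = 2 · (3 − 1) edges and sees all five sites.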
The proposition implies that whenever we have a polynomial time approximation algorithm
for the connected dominating set problem with performance ratio α, then we automatically
have a polynomial time approximation for the lazy tourist problem with a performance ratio
≤ 2 · α.
Theorem 3.19 (Guha and Khuller, 2001) There is a polynomial time algorithm that computes
a connected dominating set within a performance ratio of ln ∆ + 3, where ∆ is the maximum
degree in G. In case vertices have weights and a minimum weight connected dominating
set is sought, a polynomial time approximation algorithm exists that achieves a performance
ratio of 3 · ln n.
References
[1] N. Alon, J.H. Spencer. The probabilistic method. J. Wiley & Sons, New York, 1992.
[2] B.S. Baker. Approximation algorithms for NP-complete problems on planar graphs. Jour-
nal of the ACM 41:1 (1994) 153-180.
[3] S. Guha, S. Khuller. Approximation algorithms for connected dominating sets. Algorith-
mica 20:4 (1998) 374-387.
[4] D.S. Hochbaum, D.B. Shmoys. A best possible heuristic for the k-center problem. Math.
of Operations Research 10 (1985) 180-184.