Eindhoven University of Technology

Course Notes on Approximation Algorithms

Mark de Berg
Chapter 1
Introduction to Approximation Algorithms
Many important computational problems are difficult to solve optimally. In fact, many of
those problems are NP-hard, which means that no polynomial-time algorithm exists that
solves the problem optimally unless P=NP. A well-known example is the Euclidean traveling
salesman problem (Euclidean TSP): given a set of points in the plane, find a shortest tour
that visits all the points. Another famous NP-hard problem is Independent Set: given a
graph G = (V, E), find a maximum-size independent set V ∗ ⊂ V . (A subset is independent
if no two vertices in the subset are connected by an edge.)
What can we do when faced with such difficult problems, for which we cannot expect
to find polynomial-time algorithms? Unless the input size is really small, an algorithm with
exponential running time is not useful. We therefore have to give up on the requirement that
we always solve the problem optimally, and settle for a solution close to optimal. Ideally,
we would like to have a guarantee on how close to optimal the solution is. For example, we
would like to have an algorithm for Euclidean TSP that always produces a tour whose length
is at most a factor ρ times the minimum length of a tour, for a (hopefully small) value of ρ.
We call an algorithm producing a solution that is guaranteed to be within some factor of the
optimum an approximation algorithm. This is in contrast to heuristics, which may produce
good solutions but do not come with a guarantee on the quality of their solution.
From now on we will use opt(I) to denote the value of an optimal solution to the problem
under consideration for input I. For instance, when we study Euclidean TSP then opt(P )
will denote the length of a shortest tour on a point set P , and when we study Independent
Set then opt(G) will denote the maximum size of any independent set of the input graph G.
When no confusion can arise we will sometimes simply write opt instead of opt(I). We
denote the value of the solution that a given approximation algorithm computes for input I
by Alg(I), or simply by Alg when no confusion can arise. In the remainder we make the
assumption that opt(I) > 0 and Alg(I) > 0, which will be the case in all problems and
algorithms we shall discuss.
As mentioned above, approximation algorithms come with a guarantee on the relation
between opt(I) and Alg(I). This is made precise by the following definition.
Definition 1.1
(i) An algorithm Alg for a minimization problem is called a ρ-approximation algorithm,
for some ρ > 1, if Alg(I) ≤ ρ · opt(I) for all inputs I.
(ii) An algorithm Alg for a maximization problem is called a ρ-approximation algorithm,
for some ρ < 1, if² Alg(I) ≥ ρ · opt(I) for all inputs I.
Note that any ρ-approximation algorithm for a minimization problem is also a ρ′-approximation
algorithm for any ρ′ > ρ. For example, any 2-approximation algorithm is also a 3-approximation
algorithm. Thus for an algorithm Alg for a minimization problem it is interesting to find
the smallest ρ such that Alg is a ρ-approximation algorithm. When we have found such a ρ
we say that the approximation ratio is tight. Here’s a more formal definition.
Definition 1.2 Let Alg be a ρ-approximation algorithm for a minimization problem. We say
that the approximation ratio ρ is tight when ρ = sup_I Alg(I)/opt(I), where the supremum
is over all possible inputs I.
To prove that an algorithm is a ρ-approximation and that the bound ρ is tight, we must thus
show two things.
• First, we must prove that Alg is a ρ-approximation algorithm: we must prove that
Alg(I) ≤ ρ · opt(I) for all inputs I.
• Second, we must show that for any ρ′ < ρ there is some input I such that Alg(I) >
ρ′ · opt(I). Often we actually show that there is some input I with Alg(I) = ρ · opt(I).
Remark. Saying that when we have a ρ-approximation algorithm, the number ρ is called
“the approximation ratio”, is a bit sloppy. It would be more accurate to say that ρ is the
approximation ratio only when the bound is tight. We will permit ourselves this abuse of
terminology, and when we have proved that Alg is a 2-approximation algorithm, we will often say
something like “Hence, the approximation ratio of Alg is 2”, even though we did not prove
that the bound is tight.
² In some texts an algorithm for a maximization problem is called a ρ-approximation algorithm if Alg(I) ≥
(1/ρ) · opt(I) for all inputs I. Thus, contrary to our definition, the approximation ratio ρ for a maximization
problem is always larger than 1.
The importance of lower bounds. It may seem strange that it is possible to prove that
an algorithm is a ρ-approximation algorithm: how can we prove that an algorithm always
produces a solution that is within a factor ρ of opt when we do not know opt? The crucial
observation is that, even though we do not know opt, we can often derive a lower bound (or,
in the case of maximization problems: an upper bound) on opt. If we can then show that our
algorithm always produces a solution whose value is at most a factor ρ from the lower bound,
then the algorithm is also within a factor ρ from opt. Thus finding good lower bounds on
opt is an important step in the analysis of an approximation algorithm. In fact, the search
for a good lower bound often leads to ideas on how to design a good approximation algorithm.
This is something that we will see many times in the coming chapters.
In the Load Balancing problem we are given n jobs with processing times t1, . . . , tn that
must be executed on m machines M1, . . . , Mm. The load of a machine Mi, denoted load(Mi),
is the total processing time of the jobs assigned to it, and the makespan of the assignment
equals max_{1≤i≤m} load(Mi). The Load Balancing
problem is to find an assignment of jobs to machines that minimizes the makespan, where
each job is assigned to a single machine. (We cannot, for instance, execute part of a job on
one machine and the rest of the job on a different machine.) Load Balancing is NP-hard.
Our first approximation algorithm for Load Balancing is greedy: we consider the jobs
one by one and assign each job to the machine whose current load is smallest.
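This greedy strategy, Greedy-Scheduling, can be rendered as a minimal Python sketch (the input representation and variable names are our own choices); it stores the machine loads in a min-heap, as discussed next.

import heapq

def greedy_scheduling(t, m):
    """Greedy-Scheduling: assign each job to the machine whose current
    load is smallest; return the makespan and the assignment."""
    heap = [(0, i) for i in range(m)]       # (load, machine) pairs, min-heap
    assignment = [[] for _ in range(m)]     # job indices per machine
    for j, tj in enumerate(t):
        load, i = heapq.heappop(heap)       # machine with minimum load
        assignment[i].append(j)
        heapq.heappush(heap, (load + tj, i))  # update its load: O(log m)
    makespan = max(load for load, _ in heap)
    return makespan, assignment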
This algorithm clearly assigns each job to one of the m available machines. Moreover, it runs
in polynomial time. In fact, if we maintain the set {load(Mi) : 1 ≤ i ≤ m} in a min-heap,
then we can find the machine k with minimum load in O(1) time and update load(Mk) in
O(log m) time. This way the entire algorithm can be made to run in O(n log m) time. The
main question is how good the assignment is. Does it give an assignment whose makespan
is close to opt? The answer is yes. To prove this we need a lower bound on opt, and then
we must argue that the makespan of the assignment produced by the algorithm is not much
more than this lower bound.
There are two very simple observations that give a lower bound. First of all, the best one
could hope for is that it is possible to spread the jobs perfectly over the machines so that
each machine has the same load, namely (1/m) · Σ_{1≤j≤n} tj. In many cases this already provides a
pretty good lower bound. When there is one very large job and all other jobs have processing
time close to zero, however, then this lower bound is weak. In that case the trivial lower
bound of max_{1≤j≤n} tj will be stronger. To summarize, we have

Lemma 1.3 opt ≥ max( (1/m) · Σ_{1≤j≤n} tj , max_{1≤j≤n} tj ).
Let’s define lb := max( (1/m) · Σ_{1≤j≤n} tj , max_{1≤j≤n} tj ) to be the lower bound provided by
Lemma 1.3. With this lower bound in hand we can prove that our simple greedy algorithm
gives a 2-approximation.

Lemma 1.4 Algorithm Greedy-Scheduling is a 2-approximation algorithm.
Proof. We must prove that Greedy-Scheduling always produces an assignment of jobs to ma-
chines such that the makespan T satisfies T ≤ 2 · opt. Consider an input t1, . . . , tn, m. Let Mi∗
be a machine determining the makespan of the assignment produced by the algorithm, that is,
a machine such that at the end of the algorithm we have load(Mi∗) = max_{1≤i≤m} load(Mi). Let
j∗ be the last job assigned to Mi∗. The crucial property of our greedy algorithm is that at the
time job j∗ is assigned to Mi∗, machine Mi∗ is a machine with the smallest load among all the
machines. So if load′(Mi) denotes the load of machine Mi just before job j∗ is assigned, then
load′(Mi∗) ≤ load′(Mi) for all 1 ≤ i ≤ m. Hence, load′(Mi∗) ≤ (1/m) · Σ_{1≤i≤m} load′(Mi). It
follows that

  load′(Mi∗) ≤ (1/m) · Σ_{1≤i≤m} load′(Mi) = (1/m) · Σ_{1≤j<j∗} tj < (1/m) · Σ_{1≤j≤n} tj ≤ lb.   (1.1)

Since tj∗ ≤ max_{1≤j≤n} tj ≤ lb, we conclude that the makespan satisfies

  T = load(Mi∗) = tj∗ + load′(Mi∗) ≤ lb + lb ≤ 2 · opt,

where the last inequality follows from Lemma 1.3.
So this simple greedy algorithm is never more than a factor 2 from optimal. Can we do
better? There are several strategies possible to arrive at a better approximation factor. One
possibility could be to see if we can improve the analysis of Greedy-Scheduling. Perhaps we
might be able to show that the makespan is in fact at most c · lb for some c < 2.
Another way to improve the analysis might be to use a stronger lower bound than the one
provided by Lemma 1.3. (Note that if there are instances where lb = opt/2 then an analysis
based on this lower bound cannot yield a better approximation ratio than 2.)
It is, indeed, possible to prove a better approximation factor for the greedy algorithm
described above: a more careful analysis shows that the approximation factor is in fact
(2 − 1/m), where m is the number of machines:

Theorem 1.5 Algorithm Greedy-Scheduling is a (2 − 1/m)-approximation algorithm.
Proof. The proof is similar to the proof of Lemma 1.4. We first slightly change (1.1) to get

  load′(Mi∗) ≤ (1/m) · Σ_{1≤i≤m} load′(Mi) = (1/m) · Σ_{1≤j<j∗} tj ≤ (1/m) · Σ_{1≤j≤n} tj − (1/m) · tj∗ ≤ lb − (1/m) · tj∗.   (1.2)

Now we can derive

  load(Mi∗) = tj∗ + load′(Mi∗)
            ≤ (1 − 1/m) · tj∗ + lb
            ≤ (1 − 1/m) · max_{1≤j≤n} tj + lb
            ≤ (2 − 1/m) · lb
            ≤ (2 − 1/m) · opt   (by Lemma 1.3)
The bound in Theorem 1.5 is tight for the given algorithm: for any m there are inputs
such that Greedy-Scheduling produces an assignment of makespan (2 − 1/m) · opt. Thus the
approximation ratio is fairly close to 2, especially when m is large. So if we want to get an
approximation ratio better than (2 − 1/m), then we have to design a better algorithm.
A weak point of our greedy algorithm is the following. Suppose we first have a large number
of small jobs and then finally a single very large job. Our algorithm will first spread the small
jobs evenly over all machines and then add the large job to one of these machines. It would
have been better, however, to give the large job its own machine and spread the small jobs over
the remaining machines. Note that our algorithm would have produced this assignment if the
large job had been handled first. This observation suggests the following adaptation
of the greedy algorithm: we first sort the jobs according to decreasing processing times, and
then run Greedy-Scheduling. We call the new algorithm Ordered-Scheduling.
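In the same Python sketch as before, Ordered-Scheduling is just a sort followed by the greedy pass:

def ordered_scheduling(t, m):
    """Ordered-Scheduling: sort the jobs by decreasing processing time,
    then run Greedy-Scheduling on the sorted list."""
    return greedy_scheduling(sorted(t, reverse=True), m)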
Does the new algorithm really have a better approximation ratio? The answer is yes.
However, the lower bound provided by Lemma 1.3 is not sufficient to prove this; we also need
the following lower bound.
Lemma 1.6 Consider a set of n jobs with processing times t1, . . . , tn that have to be sched-
uled on m machines, where t1 ≥ t2 ≥ · · · ≥ tn. If n > m, then opt ≥ tm + tm+1.

Proof. Since there are m machines, at least two of the jobs 1, . . . , m + 1, say jobs j and j′,
have to be scheduled on the same machine. Hence, the load of that machine is tj + tj′, which
is at least tm + tm+1 since the jobs are sorted by processing times.
Theorem 1.7 Algorithm Ordered-Scheduling is a (3/2)-approximation algorithm.

Proof. The proof is very similar to the proof of Lemma 1.4. Again we consider a machine Mi∗
that has the maximum load, and we consider the last job j∗ scheduled on Mi∗. If j∗ ≤ m,
then j∗ is the only job scheduled on Mi∗—this is true because the greedy algorithm schedules
the first m jobs on different machines. Hence, our algorithm is optimal in this case. Now
consider the case j∗ > m. As in the proof of Lemma 1.4 we can derive

  load(Mi∗) ≤ tj∗ + (1/m) · Σ_{1≤i≤n} ti.

For the first term we use that j∗ > m. Since the jobs are ordered by processing time we have
tj∗ ≤ tm+1 ≤ tm. We can therefore use Lemma 1.6 to get tj∗ ≤ (tm + tm+1)/2 ≤ opt/2. Since
the second term is at most lb ≤ opt by Lemma 1.3, the makespan is at most (3/2) · opt.
1.3 Exercises
Exercise 1.1 In Definition 1.1 it is stated that we should have ρ > 1 for minimization
problems and that we should have ρ < 1 for a maximization problem. Explain this.
Exercise 1.2 In Definition 1.2 the approximation ratio of an algorithm for a minimization
problem is defined. Give the corresponding definition for a maximization problem.
Exercise 1.3 Consider the Load Balancing problem on two machines. Thus we want
to distribute a set of n jobs with processing times t1 , . . . , tn over two machines such that
the makespan (the maximum of the processing times of the two machines) is minimized.
Professor Smart has designed an approximation algorithm Alg for this problem, and he claims
that his algorithm is a 1.05-approximation algorithm. We run Alg on a problem instance
where the total size of all the jobs is 200, and Alg returns a solution whose makespan is 120.
(i) Suppose that we know that all job sizes are at most 100. Can we then conclude that
professor Smart’s claim is false? Explain your answer.
(ii) Same question when all job sizes are at most 10.
Exercise 1.4 Suppose we have two algorithms for the same minimization problem, Alg1
and Alg2. Alg1 is a 2-approximation algorithm, and Alg2 is a 4-approximation algorithm.
Consider the following statements.
(A) There must be an input I such that Alg2(I) > 2 · Alg1(I).
Exercise 1.5 Consider a company that has to schedule jobs on a daily basis. That is, each
day the company receives a number of jobs that need to be scheduled (for that day) on one
of their machines. The company uses the Greedy-Scheduling algorithm to do the scheduling.
(Thus, each day the company runs Greedy-Scheduling on the set of jobs that must be executed
on that day.) The following information is known about the company and the jobs: the
company has 5 machines, the processing times tj of the jobs are always between 1 and 25
(that is, 1 ≤ tj ≤ 25 for all j), and the total processing time of all the jobs, Σ_{j=1}^n tj, is always
at least 500.
(ii) Give an example of a set of jobs satisfying the condition stated above such that the
makespan produced by Greedy-Scheduling on this input is ρ′ times the optimal makespan.
Try to make ρ′ as large as possible.
Note: ideally, the value for ρ′ that you prove in (ii) is equal to the value for ρ that you prove
in (i). If this is the case, your analysis is tight—it cannot be improved—and the value is the
approximation ratio of the algorithm.
Exercise 1.6 We have seen in this chapter that algorithm Greedy-Scheduling is a (2 − 1/m)-
approximation algorithm. Show that the bound 2 − 1/m is tight, by giving an example of an input
for which Greedy-Scheduling produces a solution in which the makespan is (2 − 1/m) · opt, and
argue that the makespan is indeed that large. NB: Your example should be for arbitrary m;
it is not sufficient to give an example for one specific value of m.
Exercise 1.7 Give an example of an input on which neither Greedy-Scheduling nor Ordered-
Scheduling gives an optimal solution.
Exercise 1.8 Consider the algorithm Ordered-Scheduling.
(i) Prove that if n ≤ 4—that is, if the number of jobs is at most four—then Ordered-
Scheduling is optimal.
(ii) Give an example for n = 5 in which Ordered-Scheduling gives a makespan that is equal
to (7/6) · opt.
(iii) Prove that for n = 5 the algorithm always gives a makespan that is at most (7/6) · opt.
(iv) Following the proof of Theorem 1.7, let Mi∗ be a machine determining the makespan,
and let j∗ be the last job assigned to Mi∗ by Ordered-Scheduling. Prove that then the
produced makespan is at most (1 + 1/j∗) · opt.
Exercise 1.11 Consider the following problem. A shipping company has to decide how to
distribute a load consisting of n containers over its ships. For 1 ≤ i ≤ n, let wi denote
the weight of container i. The ships are identical, and each ship can carry containers with a
maximum total weight W. (Thus if C(j) denotes the set of containers assigned to ship j, then
Σ_{i∈C(j)} wi ≤ W.) The goal is to assign containers to ships in such a way that a minimum
number of ships is used. Give a 2-approximation algorithm for this problem, and prove your
algorithm indeed gives a 2-approximation.
Hint: There is a very simple greedy algorithm that gives a 2-approximation.
Exercise 1.12 Let P be a set of n points in the plane. We say that a point p is covered by
a square s, if p is contained in the boundary or interior of s. A square cover of P is a set S of
axis-aligned unit squares (squares of size 1 × 1 whose edges are parallel to the x- and y-axis)
such that any point p ∈ P is covered by at least one square s ∈ S. We want to find a square
cover of P with a minimum number of squares.
(i) Consider the integer grid, that is, the grid defined by the horizontal and vertical lines at
integer coordinates. Cells in this grid are unit squares. Suppose we generate a square
cover by taking all grid cells covering at least one point from P—see Fig. 1.1. You
may assume that no point falls on the boundary between two squares. Analyze the
approximation ratio of this simple strategy. (N.B. You should prove that the strategy
gives a ρ-approximation, for some ρ, and give an example where the ratio between
the algorithm’s solution and the optimal solution is ρ′, for some ρ′. Ideally ρ = ρ′, which then
implies your analysis is tight.)
(ii) Suppose all points lie in between two horizontal grid lines. More precisely, assume there
is an integer k such that for every p ∈ P we have k ≤ py < k + 1, where py is the
y-coordinate of p. Give an algorithm that computes an optimal square covering for this
case. Your algorithm should run in O(n log n) time.
(iii) Using (ii), give an algorithm that computes in O(n log n) time a 2-approximation of a
minimum-size square cover for an arbitrary set of points in the plane. Prove that your
algorithm achieves the desired approximation ratio, and that it runs in O(n log n) time.
Fig. 1.1: The integer grid, and the square covering produced by it (in grey). Note that the
covering is not optimal, since the two squares in the top right corner can be replaced by the
dotted square.
(ii) It follows from (i) that if we can compute a vertex cover for G, we can also compute an
independent set for G. Now suppose we want to compute a maximum-size independent
set on G, and suppose we have a 2-approximation algorithm ApproxMinVertexCover for
finding a minimum-size vertex cover. Consider the following algorithm for computing a
maximum independent set.
ApproxMaxIndependentSet(G)
1: C ← ApproxMinVertexCover(G)   ▷ G = (V, E) is an undirected graph
2: return V \ C
Chapter 2
The Traveling Salesman Problem
Let G = (V, E) be an undirected graph. A Hamiltonian cycle of G is a cycle that visits every
vertex v ∈ V exactly once. Instead of Hamiltonian cycle, we sometimes also use the term
tour. Not every graph has a Hamiltonian cycle: if the graph is a single path, for example,
then obviously it does not have a Hamiltonian cycle. The problem Hamiltonian Cycle
is to decide for a given graph G whether it has a Hamiltonian cycle. Hamiltonian
Cycle is NP-complete.
Now suppose that G is a complete graph—that is, G has an edge between every pair of
vertices—where each edge e ∈ E has a non-negative length. It is easy to see that because G is
complete it must have a Hamiltonian cycle. Since the edges now have lengths, however, some
Hamiltonian cycles may be shorter than others. This leads to the traveling salesman problem,
or TSP for short: given a complete graph G = (V, E), where each edge e ∈ E has a length,
find a minimum-length tour (Hamiltonian cycle) of G. (The length of a tour is defined as the
sum of the lengths of the edges in the tour.) TSP is NP-hard. We are therefore interested in
approximation algorithms. Unfortunately, even this is too much to ask.
Theorem 2.1 There is no value c for which there exists a polynomial-time c-approximation
algorithm for TSP, unless P=NP.
if G has a Hamiltonian cycle then Alg returns a tour of length |V|. For the “only if” part,
suppose Alg returns a tour of length |V|. Then obviously that tour can only use edges of
length 1—in other words, edges from E—which means G has a Hamiltonian cycle.
Note that in the proof we could also have set the lengths of the edges in E to 0 and the
lengths of the other edges to 1. Then opt(G∗ ) = 0 if and only if G has a Hamiltonian cycle.
When opt(G∗ ) = 0, then c·opt(G∗ ) = 0 no matter how large c is. Hence, any approximation
algorithm must solve the problem exactly. In some sense, this is cheating: when opt = 0
allowing a (multiplicative) approximation factor does not help, so it is not surprising that
one cannot get a polynomial-time approximation algorithm (unless P=NP). The proof above
shows that this is even true when all edge lengths are positive, which is a stronger result.
This is disappointing news. But fortunately things are not as bad as they seem: when the
edge lengths satisfy the so-called triangle inequality then we can obtain good approximation
algorithms. The triangle inequality states that for every three vertices u, v, w we have

  length((u, w)) ≤ length((u, v)) + length((v, w)).

In other words, it is not more expensive to go directly from u to w than it is to go via some
intermediate vertex v. This is a very natural property. It holds for instance for Euclidean
TSP. Here the vertices in V are points in the plane (or in some higher-dimensional space),
and the length of an edge between two points is the Euclidean distance between them. As
we will see below, for graphs whose edge lengths satisfy the triangle inequality, it is fairly
easy to give a 2-approximation algorithm. With a little more effort, we can improve the
approximation factor to 3/2. For the special case of Euclidean TSP there is even a PTAS;
this algorithm is fairly complicated, however, and we will not discuss it here. We will use the
following property of graphs whose edge lengths satisfy the triangle inequality.
Observation 2.2 Let G = (V, E) be a graph whose edge lengths satisfy the triangle inequal-
ity, and let v1, v2, . . . , vk be any path in G. Then length((v1, vk)) ≤ length(v1, v2, . . . , vk).
Fig. 2.1: (i) A spanning tree (thick grey) and the tour (thin black) that is found when the
traversal shown in (ii) is used. (ii) Possible inorder traversal of the spanning tree in (i). The
traversal results from choosing v3 as the root vertex and visiting the children in some specific
order. (A different tour would result if we visit the children in a different order.)
From a computational point of view, however, this makes a huge difference: while TSP is NP-hard,
computing a minimum spanning tree can be done in polynomial time with a simple greedy
algorithm such as Kruskal’s algorithm or Prim’s algorithm—see [CLRS] for details.
Now let G = (V, E) be a complete graph whose edge lengths satisfy the triangle inequality.
As usual, we will derive our approximation algorithm for TSP from an efficiently computable
lower bound. In this case the lower bound is provided by the minimum spanning tree.
Lemma 2.3 Let opt denote the minimum length of any tour of the given graph G, and let
mst denote the total length of a minimum spanning tree of G. Then opt ≥ mst.

Proof. Let Γ be an optimal tour for G. By deleting an edge from Γ we obtain a path Γ′,
and since all edge lengths are non-negative we have length(Γ′) ≤ length(Γ) = opt. Because a
path is (a special case of) a tree, a minimum spanning tree is at least as short as Γ′. Hence,
mst ≤ length(Γ′) ≤ opt.
So the length of a minimum spanning tree provides a lower bound on the minimum length
of a tour. But can we also use a minimum spanning tree to compute an approximation
of a minimum-length tour? The answer is yes: we simply make an arbitrary vertex of the
minimum spanning tree T the root and use an inorder traversal of T to get the tour.
(An inorder traversal of a rooted tree is a traversal that starts at the root and then proceeds
as follows. Whenever a vertex is reached, we first visit that vertex and then we recursively
visit each of its subtrees.) Figure 2.1 illustrates this. We get the following algorithm.
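The algorithm, ApproxTSP, might be sketched in Python as follows. The sketch assumes a complete networkx graph whose edges carry a 'length' attribute; a preorder depth-first traversal plays the role of the inorder traversal.

import networkx as nx

def approx_tsp(G):
    """2-approximation for metric TSP: compute a minimum spanning tree,
    root it at an arbitrary vertex, and report the vertices in the order
    of an inorder (preorder) traversal of the tree."""
    T = nx.minimum_spanning_tree(G, weight='length')
    root = next(iter(T.nodes))
    tour = list(nx.dfs_preorder_nodes(T, source=root))
    return tour + [root]   # close the cycle by returning to the root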
Below the algorithm for the inorder traversal of the minimum-spanning tree T is stated.
Recall that we have assigned an arbitrary vertex u as the root of T , where the traversal is
started. This also determines for each edge (u, v) of T whether v is a child of u or vice versa.
InorderTraversal (u, Γ)
1: Append u to Γ.
2: for each child v of u do
3: InorderTraversal (v, Γ)
4: end for
The next theorem gives a bound on the approximation ratio of the algorithm.

Theorem 2.4 Let G = (V, E) be a complete graph whose edge lengths satisfy the triangle
inequality. Then ApproxTSP is a 2-approximation algorithm.
Proof. Let Γ denote the tour reported by the algorithm, and let mst denote the total length
of the minimum spanning tree. We will prove that length(Γ) 6 2 · mst. The theorem then
follows from Lemma 2.3.
Consider an inorder traversal of the minimum spanning tree T where we change line 3 to

3: InorderTraversal(v, Γ); Append u to Γ.

In other words, after coming back from recursively visiting a subtree of a node u we first
visit u again before we move on to the next subtree of u. This way we get a cycle Γ′ where
some vertices are visited more than once, and where every edge in Γ′ is also an edge in T.
In fact, every edge in T occurs exactly twice in Γ′, so length(Γ′) = 2 · mst. The tour Γ
can be obtained from Γ′ by deleting vertices so that only the first occurrence of each vertex
remains. This means that certain paths v1, v2, . . . , vk are shortcut by the single edge (v1, vk).
By Observation 2.2, all the shortcuts are at most as long as the paths they replace, so
length(Γ) ≤ length(Γ′) ≤ 2 · mst.
Is this analysis tight? Unfortunately, the answer is basically yes: there are graphs for which
the algorithm produces a tour of length (2 − 1/|V|) · opt, so when |V| gets larger and larger the
worst-case approximation ratio gets arbitrarily close to 2. Hence, if we want to improve the
approximation ratio, we have to come up with a different algorithm. This is what we do in
the next section.
Of course a minimum spanning tree—or any other tree for that matter—does not admit
an Euler tour, since the leaves of the tree have degree 1. The idea is therefore to add extra
edges to the minimum spanning tree such that all vertices have even degree, and then take
an Euler tour of the resulting graph. To this end we need the concept of so-called matchings.
Lemma 2.5 Let G = (V, E) be a graph and let V∗ ⊂ V be any subset of an even number
of vertices. Let opt denote the minimum length of any tour on G, and let M∗ be a perfect
matching on the complete graph G∗ = (V∗, E∗), where the lengths of the edges in E∗ are
equal to the lengths of the corresponding edges in E. Then length(M∗) ≤ (1/2) · opt.

If V∗ ≠ V then we can use basically the same argument: Let n∗ = |V∗| and number the
vertices from V∗ as v∗1, . . . , v∗n∗ in the order they are encountered by Γ. Consider the two
matchings M1 = {(v∗1, v∗2), . . . , (v∗n∗−1, v∗n∗)} and M2 = {(v∗2, v∗3), . . . , (v∗n∗, v∗1)}. One of these
has length at most length(Γ∗)/2, where Γ∗ is the tour v∗1, . . . , v∗n∗, v∗1. The result follows
because length(Γ∗) ≤ length(Γ) by the triangle inequality.
The algorithm is now as follows.
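Under the same assumptions as the ApproxTSP sketch above (a complete networkx graph with a 'length' attribute on each edge), the algorithm might be sketched as follows:

import networkx as nx

def christofides_tsp(G):
    """(3/2)-approximation for metric TSP: MST plus a minimum-weight
    perfect matching on the odd-degree vertices, then an Euler tour
    with shortcuts past repeated vertices."""
    T = nx.minimum_spanning_tree(G, weight='length')
    odd = [v for v, deg in T.degree() if deg % 2 == 1]   # even count
    M = nx.min_weight_matching(G.subgraph(odd), weight='length')
    multi = nx.MultiGraph(T)       # matching edges may duplicate tree
    multi.add_edges_from(M)        # edges, hence a multigraph
    tour, seen = [], set()
    for u, _ in nx.eulerian_circuit(multi):   # all degrees are even now
        if u not in seen:          # keep only the first occurrence
            seen.add(u)
            tour.append(u)
    return tour + [tour[0]]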
Theorem 2.6 Christofides’s algorithm is a (3/2)-approximation algorithm for TSP on complete
graphs whose edge lengths satisfy the triangle inequality.

Proof. First we note that in any graph, the number of odd-degree vertices must be even—this
is easy to show by induction on the number of edges. Hence, the set V ∗ has an even number of
vertices, so it admits a perfect matching. Adding the edges from the matching M to the tree
T ensures that every vertex of odd degree gets an extra incident edge, so all degrees become
even. (Note that the matching M may contain edges that were already present in T. Hence,
after we add these edges to T we in fact have a multi-graph. But this is not a problem for the
rest of the algorithm.) It follows that after adding the edges from M, we get a multi-graph
that has an Euler tour Γ. The length of Γ is at most length(T ) + length(M ), which is at most
(3/2)·opt by Lemmas 2.3 and 2.5. By Observation 2.2, removing the superfluous occurrences
of the vertices occurring more than once in line 5 can only decrease the length of the tour.
Christofides’s algorithm is still the best known algorithm for TSP for graphs satisfying the
triangle inequality. For Euclidean TSP, however, a PTAS exists. As mentioned earlier, this
PTAS is rather complicated and we will not discuss it here.
2.3 Exercises
Exercise 2.1 Consider the algorithm ApproxTSP presented in Section 2.1. When the edge
lengths of the input graph G satisfy the triangle inequality, ApproxTSP gives a 2-approximation.
Let c be any constant. Give an example of a graph G where the edge weights do not satisfy
the triangle inequality such that ApproxTSP returns a tour of length more than c·opt, where
opt is the minimum length of any tour for G.
NB: Describing the graph is not sufficient, you should also argue that the algorithm indeed
gives a tour of length more than c · opt. Note that it is not sufficient to argue that the length
of the tour is more than c times the length of a minimum-spanning tree, because opt could
be larger than mst.
Exercise 2.2 Consider the TSP problem for graphs where all the edges have either weight 1
or weight 2.
(i) Prove that the TSP problem is still NP-hard for these graphs.
(ii) Show that these edge weights satisfy the triangle inequality.
(iii) By part (ii) of this exercise, algorithm ApproxTSP gives a 2-approximation for graphs
where the edge weights are 1 or 2. Someone conjectures that for these special graphs, the
algorithm in fact always gives a (3/2)-approximation. Prove or disprove this conjecture.
Exercise 2.3 Let P be a set of n points in the plane. It is known that a minimum spanning
tree (MST) for P can be computed in O(n log n) time. In this exercise we are also interested
in computing an MST T for P , but we are allowed to add extra points (anywhere we like)
and use those extra points as additional vertices in T . Such a tree with extra points is called
a Steiner tree. Computing a minimum Steiner tree is NP-hard.
(i) Show that it sometimes helps to add extra points by giving an example of a point set
P and an extra point q such that an MST for P ∪ {q} is shorter than an MST for P .
(ii) Let P be a set of points and Q be any set of extra points. Prove that the length
of an MST for P is never more than twice the length of an MST for P ∪ Q. (Hence,
simply computing an MST for P gives a 2-approximation for the MST-with-extra-points
problem. In fact, one can show that the approximation ratio is even better, but proving
the factor 2 is sufficient.)
Exercise 2.4 Theorem 2.4 states that ApproxTSP is a 2-approximation algorithm. Give an
example of an input graph on n vertices (for arbitrary n) for which the algorithm can produce
a tour of length (2 − (1/n)) · opt.
Exercise 2.5 Consider the algorithm ApproxTSP from the Course Notes. When the edge
lengths of the input graph G satisfy the triangle inequality, ApproxTSP gives a 2-approximation.
Now suppose the edge lengths satisfy the following weak triangle inequality: for any three ver-
tices u, v, w we have length((u, w)) ≤ 2 · (length((u, v)) + length((v, w))).
(i) Explain how Observation 2.2 needs to be modified for this setting, and prove the new
version of the observation.
(ii) Now prove a bound on the approximation ratio of ApproxTSP for graphs satisfying the
weak triangle inequality.
NB: You do not have to write the complete analysis. It suffices to explain how the
modified version of Observation 2.2 changes the proof of Theorem 2.4, and what the
new bound on the approximation ratio will be.
Chapter 3
Lemma 3.1 Let G = (V, E) be a graph and let opt denote the minimum size of a vertex
cover for G. Let E∗ ⊂ E be any subset of pairwise disjoint edges, that is, any subset such
that each pair of edges in E∗ is disjoint. Then opt ≥ |E∗|.

Proof. Let C be an optimal vertex cover for G. By definition, any edge e ∈ E∗ must be
covered by a vertex in C, and since the edges in E∗ are disjoint any vertex in C can cover at
most one edge in E∗. Hence, |C| ≥ |E∗|.
This lemma suggests the greedy algorithm given in Algorithm 3.1. It is easy to check that the
while-loop in the algorithm indeed maintains the stated invariant. After the while-loop has
terminated—the loop must terminate since at every step we remove at least one edge from
E′—we have E \ E′ = E \ ∅ = E. Together with the invariant this implies that the algorithm
indeed returns a vertex cover. Next we show that the algorithm gives a 2-approximation.
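A minimal Python sketch of this greedy algorithm (the function name and edge representation are our own): it repeatedly selects an arbitrary remaining edge, puts both of its endpoints into the cover, and discards every edge covered by those endpoints.

def approx_vertex_cover(edges):
    """Greedy 2-approximation for (unweighted) Vertex Cover."""
    cover = set()
    remaining = set(edges)
    while remaining:
        u, v = remaining.pop()       # select an arbitrary edge
        cover.update((u, v))
        # remove all edges sharing an endpoint with the selected edge
        remaining = {e for e in remaining if u not in e and v not in e}
    return cover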
Theorem 3.2 Algorithm ApproxVertexCover produces a vertex cover C such that |C| ≤
2 · opt, where opt is the minimum size of a vertex cover.
Proof. Let E∗ be the set of edges selected in line 3 over the course of the algorithm. Then
C consists of the endpoints of the edges in E∗, and so |C| ≤ 2|E∗|. Moreover, any two
edges in E∗ are disjoint because as soon as an edge (vi, vj) is selected from E′ all edges in
E′ that share vi and/or vj are removed from E′. The theorem now follows from Lemma 3.1.
with the famous simplex method, which is exponential in the worst case but works quite well
in most practical applications. Hence, if we can formulate a problem as a linear-programming
problem then we can solve it efficiently, both in theory and in practice.
There are several problems that can be formulated as a linear-programming problem but
with one twist: the variables x1, . . . , xd cannot take on real values but only integer values.
This is called Integer Linear Programming. (When the variables can only take the values
0 or 1, the problem is called 0/1 Linear Programming.) Unfortunately, Integer Lin-
ear Programming and 0/1 Linear Programming are considerably harder than Linear
Programming. In fact, Integer Linear Programming and 0/1 Linear Programming
are NP-complete. However, formulating a problem as an integer linear program can still be
useful, as we shall see next.
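For concreteness, a 0/1 linear program for Weighted Vertex Cover (the problem treated next) can be set up by introducing a variable xi for every vertex vi, with xi = 1 meaning that vi is put into the cover; the integrality constraints are the constraints labeled (3.1) below.

  Minimize   Σ_{i=1}^n weight(vi) · xi
  Subject to xi + xj ≥ 1 for all edges (vi, vj) ∈ E
             xi ∈ {0, 1} for all 1 ≤ i ≤ n   (3.1)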
As noted earlier, solving 0/1 linear programs is hard. Therefore we perform relaxation:
we drop the restriction that the variables can only take integer values and we replace the
integrality constraints (3.1) by

  0 ≤ xi ≤ 1 for 1 ≤ i ≤ n.   (3.2)
This linear program can be solved in polynomial time. But what good is a solution where
the variables can take on any real number in the interval [0, 1]? A solution with xi = 1/3, for
instance, would suggest that we put 1/3 of the vertex vi into the cover—something that does
not make sense. First we note that the solution to our new relaxed linear program provides
us with a lower bound.
Lemma 3.3 Let optrelaxed denote the value of an optimal solution to the relaxed linear
program described above, and let opt denote the minimum weight of a vertex cover. Then
opt ≥ optrelaxed.
Proof. Any vertex cover corresponds to a feasible solution of the linear program, by setting
the variables of the vertices in the cover to 1 and the other variables to 0. Hence, the optimal
solution of the linear program is at least as good as this solution. (Stated differently: we
already argued that an optimal solution of the 0/1-version of the linear program corresponds
to an optimal solution of the vertex-cover problem. Relaxing the integrality constraints clearly
cannot make the solution worse.)
The next step is to derive a valid vertex cover—or, equivalently, a feasible solution to the 0/1
linear program—from the optimal solution to the relaxed linear program. We want to do this
in such a way that the total weight of the solution does not increase by much. This can simply
be done by rounding: we pick a suitable threshold τ , and then round all variables whose value
is at least τ up to 1 and all variables whose value is less than τ down to 0. The rounding
should be done in such a way that the constraints are still satisfied. In our algorithm we
can take τ = 1/2—see the first paragraph of the proof of Theorem 3.4 below—but in other
applications we may need to use a different threshold. We thus obtain the algorithm for
Weighted Vertex Cover shown in Algorithm 3.2.
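A compact Python sketch of this LP-rounding scheme, assuming scipy's linprog as the LP solver (the graph encoding and the function name are our own):

import numpy as np
from scipy.optimize import linprog

def approx_weighted_vertex_cover(n, edges, weight):
    """LP relaxation + rounding for Weighted Vertex Cover:
    minimize sum_i weight[i]*x[i] subject to x[u] + x[v] >= 1 for
    every edge (u, v) and 0 <= x[i] <= 1; round with threshold 1/2."""
    A = np.zeros((len(edges), n))
    for k, (u, v) in enumerate(edges):
        A[k, u] = A[k, v] = -1.0   # -x_u - x_v <= -1, i.e. x_u + x_v >= 1
    res = linprog(c=weight, A_ub=A, b_ub=-np.ones(len(edges)),
                  bounds=[(0.0, 1.0)] * n)
    return {i for i, xi in enumerate(res.x) if xi >= 0.5}

Every constraint xu + xv ≥ 1 forces at least one endpoint to have fractional value at least 1/2, so the rounded set is indeed a vertex cover, and its weight is at most twice the weight of the fractional solution.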
The integrality gap. Note that, as always, the approximation ratio of our algorithm is
obtained by comparing the obtained solution to a certain lower bound. Here (and this is
essentially always the case when LP relaxation is used) the lower bound is the solution to
the relaxed LP. The worst-case ratio between the solution to the integer linear program
(which models the problem exactly) and the solution to its relaxed version is called the
integrality gap. For approximation algorithms based on rounding the relaxation of an integer linear program,
one cannot prove a better approximation ratio than the integrality gap (assuming that the
solution to the relaxed LP is used as the lower bound).
In this section we will develop an approximation algorithm for Weighted Set Cover.
To this end we first formulate the problem as a 0/1 linear program: we introduce a variable
xi that indicates whether Si is in the cover (xi = 1) or not (xi = 0), and we introduce a
constraint for each element zj ∈ Z that guarantees that zj will be in at least one of the
chosen sets. The constraint for zj is defined as follows. Let

  S(j) := {i : 1 ≤ i ≤ n and zj ∈ Si}.

Then one of the chosen sets contains zj if and only if Σ_{i∈S(j)} xi ≥ 1. This leads to the
following 0/1 linear program.

  Minimize   Σ_{i=1}^n weight(Si) · xi
  Subject to Σ_{i∈S(j)} xi ≥ 1 for all 1 ≤ j ≤ m
             xi ∈ {0, 1} for all 1 ≤ i ≤ n   (3.3)
We relax this 0/1 linear program by replacing the integrality constraints in (3.3) by the
following constraints:

  0 ≤ xi ≤ 1 for 1 ≤ i ≤ n.   (3.4)
We obtain a linear program that we can solve in polynomial time. As in the case of Weighted
Vertex Cover, the value of an optimal solution to this linear program is a lower bound on
the value of an optimal solution to the 0/1 linear program and, hence, a lower bound on the
minimum total weight of a set cover for the given instance:
Lemma 3.5 Let optrelaxed denote the value of an optimal solution to the relaxed linear
program described above, and let opt denote the minimum weight of a set cover. Then
opt ≥ optrelaxed.
The next step is to use the solution to the linear program to obtain a solution to the 0/1 linear
program (or, in other words, to the set cover problem). Rounding in the same way as for the
vertex cover problem—rounding variables that are at least 1/2 to 1, and the other variables
to 0—does not work: such a rounding scheme will not give a set cover. Instead we use the
following randomized rounding strategy: put each set Si into the cover C with probability xi,
independently for each i.
Lemma 3.6 Let C be the collection of sets generated by the randomized rounding strategy.
Then E[weight of C] = Σ_{i=1}^n weight(Si) · xi ≤ opt.

Proof. By definition, the total weight of C is the sum of the weights of its subsets. Let’s
define an indicator random variable Yi that tells us whether a set Si is in the cover C:

  Yi = 1 if Si ∈ C, and Yi = 0 otherwise.

We have

  E[weight of C] = E[ Σ_{i=1}^n weight(Si) · Yi ]
                 = Σ_{i=1}^n weight(Si) · E[Yi]   (by linearity of expectation)
                 = Σ_{i=1}^n weight(Si) · Pr[Si is put into C]
                 = Σ_{i=1}^n weight(Si) · xi
So the total weight of C is very good. But is C a valid set cover? To answer this question, let’s
look at the probability that an element zj ∈ Z is not covered. Recall that Σ_{i∈S(j)} xi ≥ 1.
Suppose that zj is present in ℓ subsets, that is, |S(j)| = ℓ. To simplify the notation, let’s
renumber the sets such that S(j) = {1, . . . , ℓ}. Then we have

  Pr[zj is not covered] = (1 − x1) · · · · · (1 − xℓ) ≤ (1 − 1/ℓ)^ℓ,

where the last inequality follows from the fact that (1 − x1) · · · · · (1 − xℓ) is maximized when
the xi’s sum up to exactly 1 and are evenly distributed, that is, when xi = 1/ℓ for all i. Since
(1 − (1/ℓ))^ℓ ≤ 1/e, where e ≈ 2.718 is the base of the natural logarithm, we conclude that

  Pr[zj is not covered] ≤ 1/e ≈ 0.368.
So any element zj is covered with constant probability. But this is not good enough: there are
many elements zj and even though each one of them has a reasonable chance of being covered,
we cannot expect all of them to be covered simultaneously. (This is only to be expected, since
Weighted Set Cover is NP-complete, so we shouldn’t hope to find an optimal solution in
polynomial time.) What we need is that each element zj is covered with high probability. To
this end we simply repeat the above procedure t times, for a suitable value of t: we generate
covers C1 , . . . , Ct where each Cs is obtained using the randomized rounding strategy, and we
take C ∗ := C1 ∪ · · · ∪ Ct as our cover. Our final algorithm is shown in Algorithm 3.3.
3: t ← 2 ln m
4: for s ← 1 to t do   ▷ Compute Cs by randomized rounding:
5:   for i ← 1 to n do
6:     Put Si into Cs with probability xi
7:   end for
8: end for
9: C∗ ← C1 ∪ · · · ∪ Ct
10: return C∗
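A Python sketch of this randomized-rounding algorithm, under the same scipy assumptions as before (sets are given as collections of element indices 0, . . . , m − 1; names are our own):

import math
import random
import numpy as np
from scipy.optimize import linprog

def approx_weighted_set_cover(sets, weights, m):
    """Randomized rounding for Weighted Set Cover: solve the relaxed LP,
    then take t = 2 ln m independent rounding rounds and return their union."""
    n = len(sets)
    A = np.zeros((m, n))
    for i, S in enumerate(sets):
        for j in S:
            A[j, i] = -1.0     # constraint per element: sum over S(j) >= 1
    x = linprog(c=weights, A_ub=A, b_ub=-np.ones(m),
                bounds=[(0.0, 1.0)] * n).x
    t = math.ceil(2 * math.log(m))     # rounded up to an integer
    cover = set()
    for _ in range(t):                 # put S_i in with probability x_i
        cover |= {i for i in range(n) if random.random() < x[i]}
    return cover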
Theorem 3.7 The expected weight of the cover C∗ computed by the algorithm is O(opt · log m),
and C∗ is a valid set cover with probability at least 1 − 1/m.

Proof. The expected weight of each Cs is at most opt, so the expected total weight of C∗ is
at most t · opt = O(opt · log m). What is the probability that some fixed element zj is not
covered by any of the covers Cs? Since the covers Cs are generated independently, and each
Cs fails to cover zj with probability at most 1/e, we have

  Pr[zj is not covered by any of C1, . . . , Ct] ≤ (1/e)^t.

Since t = 2 ln m we conclude that zj is not covered with probability at most 1/m². Hence,
by the union bound over the m elements, the probability that some element remains
uncovered is at most m · (1/m²) = 1/m, so C∗ is a valid set cover with probability at least
1 − 1/m.
3.4 Exercises
Exercise 3.1 Consider the following greedy algorithm for unweighted Vertex Cover:
GreedyDegreeVertexCover(V, E)
1: C ← ∅; E′ ← E
2: while E′ ≠ ∅ do
Exercise 3.2 Explicitly write the 0/1-LP that solves Weighted Vertex Cover for the
graph in Fig. 3.1.
Fig. 3.1: A weighted graph on six vertices. The weight of each vertex is written next to it.
Exercise 3.3 Let X = {x1 , . . . , xn } be a set of n boolean variables. A boolean formula over
the set X is a CNF formula—in other words, is in conjunctive normal form—if it has the
form C1 ∧ C2 ∧ · · · ∧ Cm , where each clause Cj is the disjunction of a number of literals. In this
exercise we consider CNF-formulas where every clause has exactly three literals, and there
are no negated literals. An example of such a formula is

  (x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x4 ∨ x5) ∧ (x2 ∨ x3 ∨ x5).
Such CNF formulas are obviously satisfiable: setting all variables to true clearly makes every
clause true. Our goal is to make the CNF formula true by setting the smallest number of
variables to true.
(i) Consider the following algorithm for this problem.
Greedy-CNF(C, X)
1: ▷ C = {C1, . . . , Cm} is a set of clauses, X = {x1, . . . , xn} a set of variables.
2: while C 6= ∅ do
3: Take an arbitrary clause Cj ∈ C.
4: Let xi be one of the variables in Cj .
5: Set xi ← true.
6: Remove all clauses from C that contain xi .
7: end while
8: return X
(iii) Now give a different 3-approximation algorithm for the problem, based on the technique
of LP relaxation.
Exercise 3.4 Show that for any n > 1 there is an input graph G with n vertices such that
the integrality gap of the LP in ApproxWeightedVertexCover for G is (at least) 2 − 2/n. Hint:
Use a graph where all vertex weights are 1.
Exercise 3.5 Let d ≥ 2 be an integer, and let V be a set of elements called vertices. We call
a subset e ⊂ V of size d a d-edge on V. A d-hypergraph is a pair G = (V, E) where E is a set
of d-edges on V. Note that a 2-hypergraph is just a normal (undirected) graph.
A double vertex cover of a d-hypergraph G = (V, E) is a subset C ⊂ V such that for every
d-edge e ∈ E there are two vertices u, v ∈ C such that u ∈ e and v ∈ e. We want to compute
for a given d-hypergraph G = (V, E) a minimum-size double vertex cover.
(i) Formulate the problem as a 0/1 linear program, and briefly explain your formulation.
(ii) Give a polynomial-time approximation algorithm for this problem, based on the tech-
nique of LP rounding. Prove that your algorithm returns a valid solution (that is, a
double vertex cover) and prove a bound on its approximation ratio.
(iii) Now consider your LP for the case d = 3. Recall that the integrality gap for an LP is
the worst-case ratio between the value of an optimal fractional solution and the value of
an integral solution. Show that the integrality gap for your LP when d = 3 is at least c,
for some constant c > 1. Try to make c as large as possible.
(iv) Consider the case of arbitrary d again. Give an approximation algorithm that has
approximation ratio O(log n), with high probability.
Exercise 3.6 An electricity company has to decide how to connect the houses in a new
neighborhood to the electricity network. Connecting a house to the network is done via a
distribution unit. There are several possible locations where distribution units can be placed.
Thus the problem faced by the company is to decide which of the potential distribution units
to actually build, and then through which of these units each house will be served. An
additional difficulty is that for each house only some of the distribution units are suitable.
This problem can be modeled as follows. We have a set U = {u1 , . . . , un } of potential
distribution units, each of which has a cost fi associated to it; the cost fi must be paid if the
company decides to build unit ui . Moreover, we have a set H = {h1 , . . . , hm } of houses that
need to be connected to the network. Each house has a set U (hj ) ⊆ U of suitable distribution
units, and for each ui ∈ U (hj ) there is a cost gi,j that must be paid if the company decides
to connect house hj to unit ui . The goal of the company is to minimize its total cost, which
is the cost of building distribution units plus the cost of connecting each house to one of the
distribution units.
(i) Formulate the problem as a 0/1 linear program, and briefly explain your formulation.
(ii) Assume that |U (hj )| 6 4 for all hj . Give a polynomial-time approximation algorithm
for this case, based on the technique of LP rounding. Prove that your algorithm returns
a valid solution and prove a bound on its approximation ratio.
Exercise 3.7 Let G = (V, E) be an undirected edge-weighted graph, where the weight of an
edge (u, v) is denoted by weight(u, v). A matching in G is a collection M ⊂ E of edges such
that each vertex v ∈ V is incident to at most one edge in M . We want to compute a matching
in G of maximum total weight. We can model this as a 0/1-LP, as follows. We introduce a
variable xuv for every edge (u, v) ∈ E, where setting xuv := 1 indicates that we put (u, v) in
the matching, and setting xuv := 0 indicates that we do not put (u, v) in the matching. Let
N (u) be the set of neighbors of a node u ∈ V , that is, N (u) := {v ∈ V : (u, v) ∈ E}. Then
we can model the maximum-matching problem as a 0/1-LP:

  Maximize   Σ_{(u,v)∈E} weight(u, v) · xuv
  Subject to Σ_{v∈N(u)} xuv ≤ 1 for all u ∈ V
             xuv ∈ {0, 1} for all (u, v) ∈ E
Someone suggests to derive an approximation algorithm from this using the technique of
LP-relaxation, as follows.
MaxMatching(V, E)
1: Solve the relaxed linear program corresponding to the given problem:

     Maximize   Σ_{(u,v)∈E} weight(u, v) · xuv
     Subject to Σ_{v∈N(u)} xuv ≤ 1 for all u ∈ V
                0 ≤ xuv ≤ 1 for all (u, v) ∈ E
Chapter 4
Polynomial-time approximation schemes
When faced with an NP-hard problem one cannot expect to find a polynomial-time algorithm
that always gives an optimal solution. Hence, one has to settle for an approximate solution.
Of course one would prefer that the approximate solution is very close to optimal, for example
at most 5% worse. In other words, one would like to have an approximation ratio very
close to 1. The approximation algorithms we have seen so far do not quite achieve this:
for Load Balancing we gave an algorithm with approximation ratio 3/2, for Weighted
Vertex Cover we gave an algorithm with approximation ratio 2, and for Weighted Set
Cover the approximation ratio was even O(log n). Unfortunately it is not always possible
to get a better approximation ratio: for some problems one can prove that it is not only
NP-hard to solve the problem exactly, but that there is a constant c > 1 such that there is
no polynomial-time c-approximation algorithm unless P=NP. Vertex Cover, for instance,
cannot be approximated to within a factor 1.3606... unless P=NP, and for Set Cover one
cannot obtain a better approximation factor than Θ(log n).
Fortunately there are also problems where much better solutions are possible. In partic-
ular, some problems admit a so-called polynomial-time approximation scheme, or PTAS for
short. Such an algorithm works as follows. Its input is, of course, an instance of the problem
at hand, but in addition there is an input parameter ε > 0. The output of the algorithm
is then a solution whose value is at most (1 + ε) · opt for a minimization problem, or at
least (1 − ε) · opt for a maximization problem. The running time of the algorithm should be
polynomial in n; its dependency on ε can be exponential however. So the running time can be
O(2^{1/ε} · n²) for example, or O(n^{1/ε}), or O(n²/ε), etc. If the dependency on the parameter 1/ε
is also polynomial then we speak of a fully polynomial-time approximation scheme (FPTAS).
In this lecture we give an example of an FPTAS.
and value(S) := Σ_{x∈S} value(x). The goal is now to select a subset of the items whose value
is maximized, under the condition that the total weight of the selected items is at most W.
From now on, we will assume that weight(xi) ≤ W for all i. (Items with weight(xi) > W can
of course simply be ignored.)
We will first develop an algorithm for the case where all the values are integers. Let
Vtot := value(X), that is, Vtot is the total value of all items. The running time of our
algorithm will depend on n and Vtot . Since Vtot can be arbitrarily large, the running time of
our algorithm will not necessarily be polynomial in n. In the next section we will then show
how to obtain an FPTAS for Knapsack that uses this algorithm as a subroutine.
Our algorithm for the case where all values are integers is a dynamic-programming algo-
rithm. For 1 ≤ i ≤ n and 0 ≤ j ≤ Vtot, define

  A[i, j] := min{ weight(S) : S ⊂ {x1, . . . , xi} and value(S) = j }.

In other words, A[i, j] denotes the minimum possible weight of any subset S of the first i
items such that value(S) is exactly j. When there is no subset S ⊂ {x1, . . . , xi} of value
exactly j then we define A[i, j] = ∞. Note that Knapsack asks for a subset of weight
at most W with the maximum value. This maximum value is given by opt := max{j :
0 ≤ j ≤ Vtot and A[n, j] ≤ W}. This means that if we can compute all values A[i, j] then
we can compute opt. From the table A we can then also compute a subset S such that
value(S) = opt—see below for details. As is usual in dynamic programming, the values
A[i, j] are computed bottom-up by filling in a table. It will be convenient to extend the
definition of A[i, j] to include the case i = 0, as follows: A[0, 0] = 0 and A[0, j] = ∞ for j > 0.
Now we can give a recursive formula for A[i, j].
Now we can give a recursive formula for A[i, j].
Lemma 4.1
0
if j=0
∞ if i = 0 and j > 0
A[i, j] =
A[i − 1, j] if i > 0 and 0 < j < value(xi )
min(A[i − 1, j], A[i − 1, j − value(xi )] + weight(xi )) if i > 0 and j > value(xi )
Proof. The first two cases are simply by definition. Now consider the third and fourth cases.
Obviously the minimum weight of any subset of {x1, . . . , xi} of total value j is given by one
of the following two possibilities:
• the minimum weight of any subset S ⊂ {x1, . . . , xi} with value j and xi ∈ S, or
• the minimum weight of any subset S ⊂ {x1, . . . , xi} with value j and xi ∉ S.
The former equals A[i − 1, j − value(xi)] + weight(xi) and only exists when j ≥ value(xi);
the latter equals A[i − 1, j]. The lemma follows.
Finding an optimal subset S in line 18 of the algorithm can be done by “walking back” in
the table A, as is standard in dynamic-programming algorithms—see also the chapter on dy-
namic programming from [CLRS]. For completeness, we describe a subroutine ReportSolution
that finds an optimal subset.
ReportSolution(X, A, opt)
1: j ← opt; S ← ∅
2: for i ← n downto 1 do
3: if value(xi) ≤ j then
4: if weight(xi ) + A[i − 1, j − value(xi )] < A[i − 1, j] then
5: S ← S ∪ {xi }; j ← j − value(xi )
6: end if
7: end if
8: end for
9: return S
Theorem 4.2 Suppose all values in a Knapsack instance are integers. Then the problem
can be solved in O(nVtot ) time, where Vtot := value(X) is the total value of all items.
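A Python sketch of this dynamic program (the table layout follows Lemma 4.1; recovering the subset itself would proceed as in ReportSolution above):

def knapsack_integer_values(weights, values, W):
    """O(n * Vtot) dynamic program for Knapsack with integer values.
    A[i][j] = minimum weight of a subset of {x_1, ..., x_i} whose total
    value is exactly j (infinity if no such subset exists)."""
    n, Vtot = len(values), sum(values)
    INF = float('inf')
    A = [[0] + [INF] * Vtot]                 # row for i = 0
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        row = [0] * (Vtot + 1)
        for j in range(1, Vtot + 1):
            row[j] = A[i - 1][j]             # case: x_i not used
            if j >= v:                       # case: x_i used
                row[j] = min(row[j], A[i - 1][j - v] + w)
        A.append(row)
    # opt = largest value j whose minimum weight still fits in the knapsack
    return max(j for j in range(Vtot + 1) if A[n][j] <= W)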
Fig. 4.1: Rounding the values: each value (indicated by the small circles) is rounded to the
right endpoint of the interval ((j − 1)∆, j∆] in which it lies.
(since we assumed each item has weight at most W) we have opt ≥ max_{1≤i≤n} value(xi).
Thus we set

  ∆ := (ε/n) · lb,   (4.3)

where lb := max_{1≤i≤n} value(xi). Note that by working with lb instead of opt, the interval
size ∆ can only become smaller. Hence, the error on the individual values does not increase
and so intuitively condition (iii) is satisfied.
What about condition (ii)? For every item xi we have

  value∗(xi) ≤ ⌈ max_{1≤i≤n} value(xi) / ∆ ⌉ = ⌈ max_{1≤i≤n} value(xi) / ((ε/n) · lb) ⌉ = ⌈n/ε⌉.

It follows that Σ_{i=1}^n value∗(xi) = O(n²/ε), and condition (ii) is satisfied. Before we formally
prove that our algorithm is an FPTAS, we summarize the algorithm in pseudocode.
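In Python, Knapsack-FPTAS might be summarized as follows, on top of the knapsack_integer_values sketch above (for brevity it returns the optimal value of the rounded instance; the subset S∗ itself would be recovered with ReportSolution):

import math

def knapsack_fptas(weights, values, W, eps):
    """FPTAS for Knapsack: round each value up to the next multiple of
    Delta = (eps/n) * lb and solve the rounded instance exactly."""
    n = len(values)
    lb = max(values)                   # lower bound on opt
    delta = (eps / n) * lb             # interval size, cf. (4.3)
    scaled = [math.ceil(v / delta) for v in values]   # value*(x_i)
    return knapsack_integer_values(weights, scaled, W)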
Theorem 4.3 Knapsack-FPTAS runs in O(n³/ε) time and computes a feasible solution whose
value is at least (1 − ε) · opt.

Proof. To prove the running time, recall that value∗(xi) ≤ ⌈n/ε⌉ for all 1 ≤ i ≤ n. Hence,
value∗(X), the total new value, is at most n · ⌈n/ε⌉. Hence, by Theorem 4.2 the algorithm
runs in O(n³/ε) time.
Above we already argued intuitively that the total error is at most ε · opt. Next we
formally prove that the value of the solution we compute is indeed at least (1 − ε) · opt. To
this end, let Sopt denote an optimal subset, that is, a subset of weight at most W such that
value(Sopt) = opt. Let S∗ denote the subset returned by the algorithm. Since we did not
change the weights of the items, the subset S∗ has weight at most W. Hence, the computed
solution S∗ is feasible. It remains to show that value(S∗) ≥ (1 − ε) · opt.
Because S∗ is optimal for the new values, we have value∗(S∗) ≥ value∗(Sopt). Moreover

  value(xi)/∆ ≤ value∗(xi) ≤ value(xi)/∆ + 1,

where ∆ = (ε/n) · lb. Hence, we have
  value(S∗) = Σ_{xi∈S∗} value(xi)
            ≥ Σ_{xi∈S∗} ∆ · (value∗(xi) − 1)
            = ∆ · Σ_{xi∈S∗} value∗(xi) − |S∗| · ∆
            ≥ ∆ · Σ_{xi∈S∗} value∗(xi) − n · ∆
            ≥ ∆ · Σ_{xi∈Sopt} value∗(xi) − n · ∆   (since value∗(S∗) ≥ value∗(Sopt))
            ≥ Σ_{xi∈Sopt} value(xi) − n · ∆        (since ∆ · value∗(xi) ≥ value(xi))
            = opt − ε · lb                          (since n · ∆ = ε · lb)
            ≥ opt − ε · opt                         (since lb ≤ opt)

Thus value(S∗) ≥ (1 − ε) · opt, as claimed.
4.3 Exercises
Exercise 4.1 Consider the algorithm Knapsack-FPTAS described above. Suppose that in
step 2 of the algorithm we round the values down instead of up, that is, we use

  value∗(xi) := ⌊ value(xi)/∆ ⌋.
Prove or disprove: Theorem 4.3 is still true for this modified version of Knapsack-FPTAS.
Exercise 4.2 Consider the following problem. We are given a number W > 0 and a set X
of n weighted items, where the weight of the i-th item is denoted by wi . The goal is to find
a subset S ⊆ X with the largest possible weight under the condition that the weight of the
subset is at most W. Assume that 0 < wi ≤ W for all 1 ≤ i ≤ n, and that Σ_{i=1}^n wi > W.
Suppose that there is an algorithm Largest-Weight-Subset-Integer (X, W ) that finds an
optimal solution when all the weights are integers. We now want to develop an algorithm
that computes an optimal solution when the weights are real numbers (in the range (0 : W ]).
Since this problem is hard, we are interested in approximations. More precisely, we want to
find a subset of weight at least (1 − ε) · opt that is feasible, that is, has weight at most W ;
here opt denotes the weight of an optimal solution and ε is a given constant with 0 < ε < 1.
Largest-Weight-Subset(X, W, ε)
1: lb ← W/2
2: For all 1 ≤ i ≤ n let wi∗ ← ⌈ wi / ((ε/n) · lb) ⌉, and let W∗ ← ⌈ W / ((ε/n) · lb) ⌉.
3: Let X∗ be the set of items with the new weights wi∗.
4: S ← Largest-Weight-Subset-Integer(X∗, W∗).
5: return S
Prove or disprove: this algorithm gives a feasible solution of weight at least (1 − ε) · opt.
(iii) Someone else suggests to modify the algorithm and round the weights down instead of
up. Thus step 2 becomes:

2: For all 1 ≤ i ≤ n let wi∗ ← ⌊ wi / ((ε/n) · lb) ⌋, and let W∗ ← ⌊ W / ((ε/n) · lb) ⌋.
Prove or disprove: this algorithm gives a feasible solution of weight at least (1 − ε) · opt.
Exercise 4.4 Consider the Load Balancing problem on two machines. Thus we want to
distribute a set of n jobs with processing times t1 , . . . , tn over two machines such that the
makespan (the maximum of the processing times of the two machines) is minimized. In this
exercise we will develop a PTAS for this problem.
Let T = Σ_{j=1}^n tj be the total size of all jobs. We call a job large (for a given ε > 0) if its
processing time is at least ε · T, and we call it small otherwise.
(i) How many large jobs are there at most, and what is the number of ways in which the
large jobs can be distributed over the two machines?
(ii) Give a PTAS for the Load Balancing problem for two machines. Prove that your
algorithm achieves the required approximation ratio and analyze its running time.
Exercise 4.5 The TSP problem on a set P of points in the plane is to compute a shortest
tour visiting all the points in P , that is, a tour whose (Euclidean) length is minimized. Suppose
we have an algorithm IntegerTSP (P ) that, given a set P of n points in the plane with integer
coordinates in the range 0, . . . , m, computes a shortest tour in O(nm) time. Consider the
following PTAS for the general TSP problem, that is, for the case where the coordinates need
not be integral and we do not have a pre-specified range in which the coordinates lie. We
assume that minp∈P px = minp∈P py = 0, where px and py denote the x- and y-coordinate of
the point p.
PTAS-TSP(P, ε)
1: ∆ ← . . .
2: For each point p ∈ P define p∗ = (p∗x, p∗y), where p∗x = ⌈px/∆⌉ and p∗y = ⌈py/∆⌉.
   Let P∗ := {p∗ : p ∈ P}.
3: Compute a shortest tour on P∗ using the algorithm IntegerTSP(P∗), and
   return the reported tour (with each point p∗ ∈ P∗ replaced with its corre-
   sponding point p ∈ P).
(i) Derive a suitable value to be used for ∆ in Step 1, so that the resulting algorithm is a
PTAS. (Note: In this part of the exercise you don’t have to prove that the algorithm is
a PTAS.)
(ii) For a tour T on P , define length(T ) to be the Euclidean length of T . Moreover, define
length ∗ (T ) to be the length of T if each point p ∈ P is replaced by p∗ . Let T ∗ be the
tour computed by PTAS-TSP and let Topt be an optimal tour for the set P . Prove that
length(T ∗ ) 6 (1 + ε) · length(Topt ) for your choice of ∆, using a proof similar to the
proof of Theorem 3.3 in the Course Notes.
(iii) Analyze the running time of the algorithm for your choice of ∆.
(i) Prove that this implies that there is no FPTAS for Maximum Independent Set unless
P=NP. Hint: Assume Alg(G, ε) is an FPTAS that computes a (1 − ε)-approximation
for Maximum Independent Set on a graph G. Now give an algorithm that solves
Maximum Independent Set exactly by picking a suitable ε and using Alg(G, ε) as
a subroutine. Argue that your choice of ε leads to an exact solution and argue that the
resulting algorithm runs in polynomial time to derive a contradiction to the existence
of an FPTAS.
(ii) Does your proof also imply that there is no PTAS for Maximum Independent Set
unless P=NP? Explain your answer.