Scribe Notes 1
1.1 Introduction
Suppose P ≠ NP; then it is impossible to efficiently find optimal solutions for many discrete optimization problems such as Set Cover, Traveling Salesman, and Max-Cut. The study of approximation algorithms focuses on developing algorithms that relax the requirement of finding an optimal solution and instead efficiently search, on any instance, for a solution that is “good enough.” Furthermore, it provides mathematically rigorous methods to quantify what “good enough” means. We will often measure an approximation algorithm’s efficacy by its approximation ratio.
Definition 1.1. An α-approximation algorithm for an optimization problem is a polynomial time algorithm
that for all instances of the problem produces a result whose value is within a factor of α of the value of the
optimal solution. The algorithm’s approximation ratio is then given by α.
If we are dealing with a maximization problem, we expect α < 1. Similarly, we would expect α > 1 for a
minimization problem. In these notes, we will provide an overview of iconic techniques used in the design of
approximation algorithms by looking at an example problem: Set Cover.
In the Set Cover problem, we are given a set of elements E = {e1, e2, . . . , en} and a collection of subsets S1, S2, . . . , Sm ⊆ E, where each subset Sj is associated with a weight wj ≥ 0. The goal of this problem is to find a collection of subsets {Sj}_{j∈I} such that ∪_{j∈I} Sj = E while simultaneously minimizing the sum of weights ∑_{j∈I} wj.
Set Cover is one of Karp’s 21 NP-complete problems, so unless P = NP we cannot find an optimal solution efficiently for all instances. Instead, we will focus on designing various algorithms to approximate Set Cover. We will first discuss a greedy algorithm for Set Cover, then demonstrate two algorithms that rely on the linear programming relaxation of Set Cover: the first rounds the LP solution deterministically, the second randomly. Finally, we show an algorithm that rounds the dual LP for Set Cover.
Lecture 1: Intro to Approximation Algorithms: Set Cover
A greedy algorithm iteratively builds a solution by making a sequence of myopic decisions; each decision
optimizes an often simple heuristic though doing so may construct a solution that deviates from the optimum.
Let’s consider a simpler version of Set Cover where all sets Sj have weight wj = 1. If we are to construct
a cover iteratively, then a reasonable heuristic to optimize for is to always pick the set that includes the
greatest number of currently uncovered elements. Our greedy approximation algorithm is the following:
Greedy Algorithm
Given E = {e1, . . . , en} and S1, . . . , Sm ⊆ E, initialize I = ∅. While I does not cover E:
1. Compute the set of uncovered elements U = E \ ∪_{j∈I} Sj
2. Let ℓ = argmax_j |Sj ∩ U|, the set covering the greatest number of currently uncovered elements
3. Update I ← I ∪ {ℓ}
Return I.

Theorem 1.2. The greedy algorithm is a ln n-approximation algorithm for the unweighted Set Cover problem.
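As a concrete sketch (the function name and the use of Python set objects for the Sj are illustrative choices, not the scribe’s code), the greedy loop above can be written in a few lines:

```python
def greedy_set_cover(elements, sets):
    """Greedy unweighted set cover: repeatedly pick the set that
    covers the greatest number of currently uncovered elements.
    Returns the list I of chosen set indices."""
    uncovered = set(elements)
    I = []
    while uncovered:
        # index of the set covering the most uncovered elements
        l = max(range(len(sets)), key=lambda j: len(sets[j] & uncovered))
        if not sets[l] & uncovered:
            raise ValueError("the sets do not cover all elements")
        uncovered -= sets[l]
        I.append(l)
    return I
```

For instance, on E = {1, 2, 3, 4} with sets {1, 2, 3}, {3, 4}, {4}, the algorithm picks the first set and then the second.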
What does it mean to be a ln n-approximation algorithm for Set Cover? Set Cover seeks to minimize the sum of set weights, which here is just the number of sets chosen because we assume wj = 1. The claim states that the algorithm returns a solution whose cost is at most a factor ln n greater than the optimal sum of weights. Suppose that I* denotes the optimal set cover and I denotes the set cover returned by the algorithm. Then a ln n-approximation guarantees that
$$\sum_{j \in I^*} w_j \;\le\; \sum_{j \in I} w_j \;\le\; \ln n \cdot \sum_{j \in I^*} w_j$$
Proof. Let nt denote the number of elements not covered after t iterations of the greedy algorithm. At iteration t = 0, all elements remain to be covered, hence n0 = n. Suppose that I* is the optimal set cover consisting of k sets. At any iteration t, there always exists a set Sj that covers at least nt/k of the remaining elements. This is because I*, being a set cover, must cover all remaining nt elements using only k sets; if every set covered fewer than nt/k of them, then no combination of k sets could cover the remaining nt elements, and I* could not be a set cover.

Since some set covers at least nt/k of the remaining elements at every time step t, the greedy algorithm covers at least nt/k additional elements per iteration. Hence
$$n_{t+1} \;\le\; n_t - \frac{n_t}{k} \;=\; n_t\left(1 - \frac{1}{k}\right) \quad\Longrightarrow\quad n_t \;\le\; n_0\left(1 - \frac{1}{k}\right)^{t}$$
If t = k ln n, then nt ≤ n(1 − 1/k)^{k ln n} < n · e^{−ln n} = 1, meaning there are no more elements to cover. Each iteration adds one set, so the size of the returned set cover satisfies |I| ≤ k ln n. However, k = |I*|, and so |I| ≤ |I*| ln n = ln n · opt, as required.
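As a quick numeric illustration of this bound (the values n = 1000 and k = 7 are arbitrary choices made here):

```python
import math

# After t = ceil(k ln n) greedy iterations, the bound
# n_t <= n * (1 - 1/k)**t has dropped below 1, so no
# element can remain uncovered.
n, k = 1000, 7
t = math.ceil(k * math.log(n))
bound = n * (1 - 1 / k) ** t
assert bound < 1
```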
This algorithm turns out to be almost as good as we can achieve assuming P 6= NP.
Theorem 1.3. There exists some constant c > 0 such that if there exists a c ln n-approximation algorithm
for the unweighted set cover problem, then P = NP.
Linear programming is another versatile tool for the design and analysis of approximation algorithms. Recall that a linear program is an optimization problem of the form
$$\text{minimize } c^\top x \quad \text{subject to } Ax \ge b$$
where x, c ∈ R^n, b ∈ R^m, and A ∈ R^{m×n}. A similar system can be written for maximization problems, and we can append the constraint xi ∈ {0, 1} to turn this into an integer linear program. Many discrete optimization problems can be formulated as integer linear programs in which the solution vector encodes the combinatorial solution. However, integer linear programs are in general NP-hard to solve, so LP-based approximation algorithms take the following form to get around this:
1. Write the integer linear program for the discrete optimization problem.
2. Relax the constraint xi ∈ {0, 1} to xi ≥ 0.
3. Solve the linear program using a method such as simplex, interior point, etc.
4. Apply a rounding procedure to the continuous solution x that converts its components into integers, and return the rounded solution x̂.
First, we have to write an integer linear program for Set Cover. Begin by considering what we want our solution vector x to be. We want x to encode whether or not to include a set Sj in the final set cover, so let xj = 1 if Sj is included and xj = 0 otherwise.
We want to minimize the sum of weight of subsets we pick, therefore our objective is:
$$\text{minimize } \sum_{j=1}^{m} w_j x_j$$
Finally, we want every element to be covered, so we add a constraint for each ei ∈ E:
$$\sum_{j:\, e_i \in S_j} x_j \ge 1$$
In particular, this constraint says that for each ei, at least one xj = 1 among the sets containing it, meaning at least one Sj chosen in the final set cover contains ei. The final integer linear program is given by:
$$\begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{m} w_j x_j \\
\text{subject to} \quad & \sum_{j:\, e_i \in S_j} x_j \ge 1, \quad i = 1, \dots, n \\
& x_j \in \{0, 1\}, \quad j = 1, \dots, m
\end{aligned}$$
Now let’s sit back and relax the integer constraints. Replace xj ∈ {0, 1} with xj ≥ 0 to derive the following.
$$\begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{m} w_j x_j \\
\text{subject to} \quad & \sum_{j:\, e_i \in S_j} x_j \ge 1, \quad i = 1, \dots, n \\
& x_j \ge 0, \quad j = 1, \dots, m
\end{aligned}$$
This is indeed a relaxation because every feasible solution to the integer linear program is also a feasible solution to this linear program. Writing Z*_LP for the optimal value of the linear program and Z*_ILP for that of the integer linear program, we always have Z*_LP ≤ Z*_ILP = OPT, because the integer optimum is always feasible for the relaxed program.
Our relaxed LP will return a vector of real numbers. We now need to round the real number solutions to
integer solutions in order to return the combinatorial solution.
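For concreteness, the data (c, A, b) of the relaxed set cover LP can be assembled for a small instance as in the sketch below; the function name set_cover_lp and the dense matrix representation are choices made here for illustration, and the call to an actual LP solver is omitted:

```python
def set_cover_lp(elements, sets, weights):
    """Build (c, A, b) for the relaxed set cover LP:
    minimize c^T x subject to A x >= b and x >= 0.
    Row i of A is the covering constraint of element e_i."""
    c = list(weights)
    # A[i][j] = 1 exactly when e_i is in S_j
    A = [[1 if e in s else 0 for s in sets] for e in elements]
    b = [1] * len(elements)
    return c, A, b
```

Feeding these arrays to any LP solver yields the fractional optimum x* used by the rounding schemes below.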
One way of rounding an LP solution is to fix a threshold and include each set whose variable is at least that threshold. We use the following rounding scheme:
1. For each j, if x*j ≥ 1/f, update I ← I ∪ {Sj}.
2. Return I.
Here f is the maximum, over all elements, of the number of sets covering that element; that is, f = max_i fi, where fi denotes the number of sets containing ei. The rounding procedure includes subset Sj in our solution if and only if x*j ≥ 1/f. To analyze this algorithm we need to complete two tasks: (1) demonstrate that this rounding scheme actually returns a valid set cover, and (2) show that this set cover is a reasonable approximation. Let’s first show that I is a set cover.
Lemma 1.4. I is a set cover.
Proof. Element ei is covered exactly when some set containing it is included in the solution. Since x* is a feasible solution of the LP, we have ∑_{j: ei∈Sj} x*j ≥ 1. By definition there are fi ≤ f terms in this summation, so at least one of them satisfies x*j ≥ 1/fi ≥ 1/f, and the corresponding set is included in the solution; hence ei is covered.
Theorem 1.5. The rounding algorithm is an f-approximation algorithm for Set Cover.
Proof. Observe that we can bound the cost of the algorithm’s set cover I as follows:
$$\sum_{j \in I} w_j \;\le\; \sum_{j=1}^{m} w_j \cdot (f x^*_j)$$
This inequality follows because Sj is included in I only if x*j ≥ 1/f, i.e., f · x*j ≥ 1, and every remaining term is non-negative. However, this implies
$$\sum_{j=1}^{m} w_j \cdot (f x^*_j) \;=\; f \sum_{j=1}^{m} w_j x^*_j \;=\; f \cdot Z^*_{LP} \;\le\; f \cdot \text{opt}$$
Notice here that we use the fact that ∑_{j=1}^m wj x*j is the objective value of the LP optimal solution. As mentioned previously, this lower-bounds the integer optimum since the integer solution is always LP-feasible. This gives us a way to compare the algorithm’s solution with the optimal solution, allowing us to conclude that the algorithm is an f-approximation.
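A sketch of the deterministic rounding step in Python, assuming we are handed a feasible fractional solution x_star from an LP solver (the function name and argument layout are illustrative choices):

```python
def threshold_round(elements, sets, x_star):
    """Deterministic LP rounding for set cover: compute the maximum
    frequency f of any element and keep every set with x*_j >= 1/f.
    Returns (I, f), where I lists the chosen set indices."""
    # f = maximum number of sets containing any single element
    f = max(sum(1 for s in sets if e in s) for e in elements)
    I = [j for j, xj in enumerate(x_star) if xj >= 1.0 / f]
    return I, f
```

On E = {1, 2, 3} with sets {1, 2}, {2, 3}, {1, 3} and unit weights, x* = (1/2, 1/2, 1/2) is feasible; here f = 2, so every set is kept and the cover costs 3 ≤ f · Z_LP = 3.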
The ratio of an f-approximation algorithm depends on the instance, and for certain problems this yields a tighter bound. For example, the algorithm above immediately provides a 2-approximation for Vertex Cover. Vertex Cover is a special case of Set Cover in which we want to find a minimum set of vertices such that each edge is incident to at least one vertex in the set. Since each edge is incident to exactly two vertices, every element lies in exactly two sets, so the algorithm above immediately admits an f = 2 approximation.
LP rounding can also certify a tighter approximation ratio on a per-instance basis. We know from the analysis that the approximation ratio satisfies α ≤ f, but on a given instance the ratio may be much smaller than f; comparing the cost of our rounded solution against the LP relaxation’s value can sometimes give a better approximation ratio for that instance.
Another variant of rounding treats the solution probabilistically. Suppose we construct a random cover of E by choosing each Sj independently with probability x*j. If we let Xj denote the indicator random variable of the event that Sj is included in the sampled cover, then the expected cost of the random cover is exactly the LP optimum:
$$E\left[\sum_{j=1}^{m} w_j X_j\right] \;=\; \sum_{j=1}^{m} w_j \Pr[X_j = 1] \;=\; \sum_{j=1}^{m} w_j x^*_j \;=\; Z^*_{LP}$$
However, each element ei is left uncovered with probability ∏_{j: ei∈Sj}(1 − x*j) ≤ e^{−∑_{j: ei∈Sj} x*j} ≤ e^{−1}. This means that with constant probability, the random set cover can fail to cover all elements! We want the probability that each element is missed to be much smaller, say 1/n^c, so that we have by a union bound:
$$\Pr[\exists \text{ uncovered element}] \;\le\; \sum_{i=1}^{n} \Pr[e_i \text{ not covered}] \;\le\; n \cdot \frac{1}{n^c} \;=\; \frac{1}{n^{c-1}}$$
The way to achieve this is to boost each set’s inclusion probability by repetition. Take an unfair coin that lands heads with probability x*j and flip it c ln n times; include Sj if at least one of the tosses is heads. Then
$$\Pr[e_i \text{ not covered}] \;=\; \prod_{j:\, e_i \in S_j} (1 - x^*_j)^{c \ln n} \;\le\; \prod_{j:\, e_i \in S_j} e^{-x^*_j (c \ln n)} \;=\; e^{-c \ln n \sum_{j: e_i \in S_j} x^*_j} \;\le\; \frac{1}{n^c}$$
For each j = 1, . . . , m:
1. Flip a biased coin, which lands on heads with probability x*j, c ln n times.
2. If at least one of the coin tosses is heads, update I ← I ∪ {Sj}.
Return I.
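The boosted sampling above might be sketched as follows (the function name, the default c = 2, and the seeded RNG are illustrative choices made here):

```python
import math
import random

def randomized_round(n, x_star, c=2, seed=0):
    """Randomized LP rounding with boosting: for each j, flip a coin
    with heads-probability x*_j a total of ceil(c ln n) times and
    include S_j if any flip lands heads. n is the number of elements."""
    rng = random.Random(seed)
    reps = max(1, math.ceil(c * math.log(n)))
    I = []
    for j, xj in enumerate(x_star):
        # Pr[at least one head] = 1 - (1 - x*_j)^reps
        if any(rng.random() < xj for _ in range(reps)):
            I.append(j)
    return I
```

Sets with x*j = 1 are always included, since every flip lands heads; sets with x*j = 0 never are.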
Since our algorithm is randomized, it is possible that it returns neither a reasonable approximation nor even a valid set cover. Instead, we show that it returns a valid set cover that reasonably approximates the optimum with high probability.
Theorem 1.6. This algorithm returns an O(ln n)-approximation of set cover with high probability.
Proof sketch. Let F denote the event that the algorithm returns a valid set cover; we need to bound the expected cost given that a set cover is produced. We use conditional expectation: E[A] = E[A | B] Pr[B] + E[A | ¬B] Pr[¬B]. Then for n ≥ 2 and c ≥ 2:
$$\begin{aligned}
E\left[\sum_{j=1}^{m} w_j X_j \,\middle|\, F\right] &= \frac{1}{\Pr[F]}\left(E\left[\sum_{j=1}^{m} w_j X_j\right] - E\left[\sum_{j=1}^{m} w_j X_j \,\middle|\, \neg F\right]\Pr[\neg F]\right) \\
&\le \frac{1}{\Pr[F]}\, E\left[\sum_{j=1}^{m} w_j X_j\right] \\
&\le \frac{(c \ln n)\, Z^*_{LP}}{1 - \frac{1}{n^{c-1}}} \\
&\le 2c \ln n \cdot Z^*_{LP}
\end{aligned}$$
The second step drops the non-negative conditional term; the third uses that each Sj is included with probability at most c ln n · x*j and that Pr[F] ≥ 1 − 1/n^{c−1}; and the last uses that 1 − 1/n^{c−1} ≥ 1/2 for n ≥ 2 and c ≥ 2.
In CS170, one may have encountered the concept of LP duality. Call the given LP the primal LP. For a minimization primal, we can compute a dual linear program whose objective provides the tightest lower bound on the primal objective value. That is to say, it abides by the following property.
Theorem 1.7 (Weak Duality). If x is a feasible solution to a primal LP with a minimization objective, and y is a feasible solution to its dual LP with a maximization objective, then the value of the dual objective at y ≤ the value of the primal objective at x.
We can take the dual of the set cover LP that we derived and consider rounding a solution from there. The
dual for set cover can be written as
$$\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n} y_i \\
\text{subject to} \quad & \sum_{i:\, e_i \in S_j} y_i \le w_j, \quad j = 1, \dots, m \\
& y_i \ge 0, \quad i = 1, \dots, n
\end{aligned} \qquad (1.2)$$
We can interpret this program in the following manner. To lower-bound the primal objective value, imagine a shop owner pricing the elements. In the primal setting, selecting Sj for the cover means a buyer can always pay wj for all of the elements in Sj. To maximize the profit from selling each item separately, the seller assigns non-negative prices yi ≥ 0 (hence the dual variables). However, the prices must be assigned such that for each set, the sum of its elements’ prices is at most wj, as the buyer could always pay that much for the whole set (hence the dual constraints). Given an optimal dual solution, consider the following rounding algorithm.
1. Solve the dual LP (1.2) to obtain an optimal solution y*.
2. Return I′ given by
$$I' \;=\; \left\{\, j \;:\; \sum_{i:\, e_i \in S_j} y^*_i = w_j \,\right\}$$
Let us first verify that the solution I′ returned by this algorithm is a valid set cover.
Lemma 1.8. The collection {Sj}_{j∈I′} is a set cover.
Proof. For the sake of contradiction, suppose that some element ek is left uncovered. Then every subset Sj containing ek must satisfy
$$\sum_{i:\, e_i \in S_j} y^*_i \;<\; w_j$$
as the algorithm only includes sets for which the constraint ∑_{i: ei∈Sj} y*i ≤ wj is achieved with equality. Let ε > 0 be the minimum, over all sets Sj containing ek, of the difference between wj and the total price of the elements in Sj. Construct y′k = y*k + ε and set y′i = y*i for all i ≠ k. We still have ∑_{i: ei∈Sj} y′i ≤ wj for every set: sets not containing ek are unaffected, and each set containing ek had slack at least ε. Hence y′ is feasible for the dual with a strictly larger objective value than y*. This is not possible, as y* is assumed to be the optimal solution for dual LP (1.2), a contradiction.
We now show that the algorithm is an f-approximation.

Proof. Let us begin by writing out the cost of the dual-rounded solution, recalling that wj = ∑_{i: ei∈Sj} y*i for every j ∈ I′:
$$\sum_{j \in I'} w_j \;=\; \sum_{j \in I'} \sum_{i:\, e_i \in S_j} y^*_i \;=\; \sum_{i=1}^{n} \left|\{\, j \in I' : e_i \in S_j \,\}\right| \cdot y^*_i$$
The last equality holds because, for a fixed i, the double sum counts y*i exactly as many times as ei appears in sets of the cover. We then have the following:
$$\sum_{i=1}^{n} \left|\{\, j \in I' : e_i \in S_j \,\}\right| \cdot y^*_i \;\le\; \sum_{i=1}^{n} f_i\, y^*_i \;\le\; f \sum_{i=1}^{n} y^*_i \;\le\; f \cdot OPT$$
where the final inequality is weak duality: the dual objective ∑ y*i is at most the primal optimum.
It remains expensive to solve large linear programs. The fastest known algorithm for linear programming finds an approximate optimum in roughly the time it takes to multiply two matrices [1]. At the time of writing, matrix multiplication takes time O(n^ω) for ω ≈ 2.37 [2]. Special-purpose rounding algorithms are often much faster. Take the rounding algorithm for the dual program in the previous subsection as an example: instead of solving the dual LP optimally, it suffices to construct a feasible solution. We can therefore design a similar algorithm that repeatedly increases dual variables until constraints become tight. This technique is called primal-dual rounding and will be explored in future notes.
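A sketch of that idea (an illustration under the pricing interpretation above, not the algorithm from a later lecture): process the elements in any order, and for each uncovered element raise its price yi until some containing set’s constraint becomes tight, then add every newly tight set to the cover.

```python
def primal_dual_set_cover(elements, sets, weights):
    """Primal-dual sketch for set cover: raise the dual variable of
    each uncovered element until some constraint
    sum_{i: e_i in S_j} y_i <= w_j becomes tight; tight sets join
    the cover. Returns (I, y) with y dual-feasible."""
    y = {e: 0.0 for e in elements}
    slack = list(weights)  # remaining slack w_j - current prices in S_j
    I, covered = [], set()
    for e in elements:
        if e in covered:
            continue
        containing = [j for j, s in enumerate(sets) if e in s]
        # largest raise of y_e that keeps every constraint satisfied
        delta = min(slack[j] for j in containing)
        y[e] += delta
        for j in containing:
            slack[j] -= delta
            if slack[j] == 0 and j not in I:
                I.append(j)
                covered |= sets[j]
    return I, y
```

Each processed element makes at least one containing constraint tight, so the returned collection is always a cover, and the same counting argument as above bounds its cost by f times the dual objective.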
1.5 Conclusion
This note explored many different ways to approximate the set cover problem. Throughout the semester,
we will be looking into these techniques in much more depth and will be introducing some other related
techniques.
References
[1] Cohen, M. B., Lee, Y. T., & Song, Z. (2018). “Solving linear programs in the current matrix multiplication time.” arXiv preprint arXiv:1810.07896.
[2] Williams, V. V. (2012). “Multiplying matrices faster than Coppersmith-Winograd.” In Symposium on the Theory of Computing (pp. 887-898).