Introduction Algorithm
Introduction Algorithm
Chapter 1- Introduction
General Information
Time: MWF 1:55 – 2:45
Place: TUR 2303
My T. Thai 2
mythai@cise.ufl.edu
Instructor Information
Name: Dr. My T. Thai
Office: E566 CSE Building
Phone: (352)392-6842
Email: mythai@cise.ufl.edu
Office Hours: W 3:00pm – 5:00pm or by
appointments
My T. Thai 3
mythai@cise.ufl.edu
Why Approximation Algorithms
Problems that we cannot find an optimal
solution in a polynomial time
Eg: Set Cover, Bin Packing
Need to find a near-optimal solution:
Heuristic
Approximation algorithms:
This gives us a guarantee approximation ratio
My T. Thai 4
mythai@cise.ufl.edu
Why take this course
Your advisers/bosses give you a
computationally hard problem. Here are two
scenarios:
No knowledge about approximation:
Spend a few months looking for an optimal solution
Come to their office and confess that you cannot do it
Get fired
Knowledge about approximation:
My T. Thai 5
mythai@cise.ufl.edu
Why take this course (cont)
Knowledge about approximation
Show your boss that this is a NP-complete (NP-
hard) problem
There does not exist any polynomial time algorithm
to find an exact solution
Propose a good algorithm (either heuristic or
approximation) to find a near-optimal solution
Better yet, prove the approximation ratio
My T. Thai 6
mythai@cise.ufl.edu
Course Description
Covers a variety of techniques to design and
analyze many approximation algorithms for
computationally hard problems:
Combinatorial algorithms:
Greedy Techniques, Independent System, Submodular
Function
Cover various problems
Linear Programming based algorithms
Semidefinite Programming based algorithms
My T. Thai 7
mythai@cise.ufl.edu
Course Objectives
Grasp the essential techniques to design and
analyze approximation algorithms:
Combinatorial methods
Linear programming
Semidefinite programming
Primal-dual and relaxation methods
Hardness of approximation
Grasp the key ideas of graph theory
Able to model and solve many practical
problems raising inMyour real life
T. Thai 8
mythai@cise.ufl.edu
Textbooks
Required:
Vijay Vazirani, Approximation Algorithms,
Springer-Verlag, 2001
Recommended:
Vasek Chvatal, Linear Programming, W. H.
Freeman Company
Michael R. Garey and David S. Johnson,
Computers and Intractability: A Guide to the
Theory of NP-Completeness
Shall provide some appropriate lecture notes
My T. Thai 9
mythai@cise.ufl.edu
Grading Policies
Homework Assignments:
5 homework assignments, each weighs 10%
Due at the beginning of the lecture on the due date
No late assignment will be accepted
Midterm Exam:
One midterm exam, weighs 20%
In class, open book, open notes
One Final Exam:
Weighs 30%
My T. Thai 10
mythai@cise.ufl.edu
Grading Policies (cont)
Cut-off points:
A >= 85%
85% > B >= 75%
75% > C >= 65%
My T. Thai 11
mythai@cise.ufl.edu
Plagiarism Policy
Zero tolerance
Collaboration:
May discuss with other students on homework, but
must write up solution on your own independently
More info, see the class website
Other policy: Students are encouraged to attend
the class. Many proofs will not be shown on the
lecture slides
My T. Thai 12
mythai@cise.ufl.edu
Tentative Schedule
See the class website for detail information
Roughly divided into two parts:
Week 1 to Week 8: Combinatorial Algorithms
Week 9 to Week 16: Linear Programming: Dual-
Primal, Relaxation, Rounding, and Semidefinite
programming
My T. Thai 13
mythai@cise.ufl.edu
Introduction to Combinatorial
Optimization
My T. Thai 14
mythai@cise.ufl.edu
Combinatorial Optimization
The study of finding the “best” object from
within some finite space of objects, eg:
Shortest path: Given a graph with edge costs and a
pair of nodes, find the shortest path (least costs)
between them
Traveling salesman: Given a complete graph with
nonnegative edge costs, find a minimum cost cycle
visiting every vertex exactly once
Maximum Network Lifetime: Given a wireless
sensor networks and a set of targets, find a schedule
of these sensors to maximize network lifetime
My T. Thai 15
mythai@cise.ufl.edu
In P or not in P?
Informal Definitions:
The class P consists of those problems that are
solvable in polynomial time, i.e. O(nk) for some
constant k where n is the size of the input.
The class NP consists of those problems that
are “verifiable” in polynomial time:
Given a certificate of a solution, then we can verify
that the certificate is correct in polynomial time
My T. Thai 16
mythai@cise.ufl.edu
In P or not in P: Examples
In P:
Shortest path
Minimum Spanning Tree
Not in P (NP):
Vertex Cover
Traveling salesman
Minimum Connected Dominating Set
My T. Thai 17
mythai@cise.ufl.edu
NP-completeness (NPC)
A problem is in the class NPC if it is in NP and
is as “hard” as any problem in NP
My T. Thai 18
mythai@cise.ufl.edu
What is “hard”
Decision Problems: Only consider YES-NO
Decision version of TSP: Given n cities, is there a
TSP tour of length at most L?
Why Decision Problem? What relation between
the optimization problem and its decision?
Decision is not “harder” than that of its
optimization problem
If the decision is “hard”, then the optimization
problem should be “hard”
My T. Thai 19
mythai@cise.ufl.edu
NP-complete and NP-hard
A language L is NP-complete if:
1. L is in NP, and
2. For every L’ in NP, L’ can be polynomial-time
reducible to L
My T. Thai 20
mythai@cise.ufl.edu
Approximation Algorithms
An algorithm that returns near-optimal
solutions in polynomial time
Approximation Ratio ρ(n):
Define: C* as a optimal solution and C is the
solution produced by the approximation algorithm
max (C/C*, C*/C) <= ρ(n)
Maximization problem: 0 < C <= C*, thus C*/C
shows that C* is larger than C by ρ(n)
Minimization problem: 0 < C* <= C, thus C/C*
shows that C is larger than C* by ρ(n)
My T. Thai 21
mythai@cise.ufl.edu
Approximation Algorithms (cont)
PTAS (Polynomial Time Approximation
Scheme): A (1 + ε)-approximation algorithm
for a NP-hard optimization П where its running
time is bounded by a polynomial in the size of
instance I
FPTAS (Fully PTAS): The same as above +
time is bounded by a polynomial in both the
size of instance I and 1/ε
My T. Thai 22
mythai@cise.ufl.edu
A Dilemma!
We cannot find C*, how can we compare C to
C*?
How can we design an algorithm so that we can
compare C to C*
My T. Thai 23
mythai@cise.ufl.edu
Vertex Cover
Definition:
Given an undirected graph G (V , E ), a subset V ' V is called
a vertex cover of G iff for every edge e E , e has at least
one endpoint incident at V '
An Example
My T. Thai 24
mythai@cise.ufl.edu
Vertex Cover Problem
Definition:
Given an undirected graph G=(V,E), find a vertex
cover with minimum size (has the least number of
vertices)
This is sometimes called cardinality vertex cover
More generalization:
Given an undirected graph G=(V,E), and a cost
function on vertices c: V → Q+
Find a minimum cost vertex cover
My T. Thai 25
mythai@cise.ufl.edu
How to solve it
Matching:
A set M of edges in a graph G is called a matching
of G if no two edges in set M have an endpoint in
common
Example:
My T. Thai 26
mythai@cise.ufl.edu
How to solve it (cont)
Maximum Matching:
A matching of G with the greatest number of edges
Maximal Matching:
A matching which is not contained in any larger
matching
Note: Any maximum matching is certainly
maximal, but not the reverse
My T. Thai 27
mythai@cise.ufl.edu
Main Observation
No vertex can cover two edges of a matching
The size of every vertex cover is at least the
size of every matching: |M| ≤ |C|
|M| = |C| indicates the optimality
My T. Thai 29
mythai@cise.ufl.edu
Can Alg1 be improved?
Q1: Can the approximation ratio of Alg 1 be
improved by a better analysis?
Q2: Can we design a better approximation
algorithm using the similar technique (maximal
matching)?
Q3: Is there any other lower bounding method
that can lead to a better approximation
algorithm?
My T. Thai 30
mythai@cise.ufl.edu
Answers
A1: No by considering the complete bipartite
graphs Kn,n
My T. Thai 31
mythai@cise.ufl.edu
Answers (cont)
A3:
Currently a central open problem
Yes, we can obtain a better one by using the
semidefinite programming
Generalization vertex Cover
Can we still able to design a 2-approximation
algorithm?
Homework assignment!
My T. Thai 32
mythai@cise.ufl.edu
Set Cover
Definition: Given a universe U of n elements, a
collection of subsets of U, S = {S1, …, Sm}, and a cost
function c: S → Q+, find a minimum cost
subcollection C of S that covers all elements of U.
Example:
U = {1, 2, 3, 4, 5}
S1 = {1, 2, 3}, S2 = {2,3}, S3 = {4, 5}, S4 = {1, 2, 4}
c1 = c2 = c3 = c4 = 1
Solution C = {S1, S3}
If the cost is uniform, then the set cover problem asks
us to find a subcollection covering all elements of U
with the minimum size.
My T. Thai 33
mythai@cise.ufl.edu
An Example
My T. Thai 34
mythai@cise.ufl.edu
NP-completeness
Theorem: Set Cover (SC) is NP-complete
Proof:
INSTANCE: Given a universe U of n elements, a collection
of subsets of U, S = {S1, …, Sm}, and a positive integer b
QUESTION: Is there a , |C| ≤ b,
such that
My T. Thai 35
mythai@cise.ufl.edu
Proof (cont)
First we need to show that SC is in NP. Given a
collection of sets C, it is easy to verify that if |
C| ≤ b and the union of all sets listed in C does
include all elements in U.
To complete the proof, we need to show that
Vertex Cover (VC) ≤p Set Cover (SC)
Given an instance C of VC (an undirected graph
G=(V,E) and a positive integer j), we need to
construct an instance C’ of SC in polynomial
time such that C is satisfiable iff C’ is
satisfiable. 36
My T. Thai
mythai@cise.ufl.edu
Proof (cont)
Construction: Let U = E. We will define n
elements of U and a collection S as follows:
Label all vertices in V from 1 to n. Let Si be the set of all
edges that incident to vertex i. Finally, let b = j. This
construction is in poly-time with respect to the size of
VC instance.
My T. Thai 37
mythai@cise.ufl.edu
VERTEX-COVER p SET-COVER
one element
for every edge
VC SC
one set for every vertex,
containing the edges it covers
My T. Thai 38
mythai@cise.ufl.edu
Proof (cont)
Now, we need to prove that C is satisfiable iff C’ is .
That is, we need to show that if the original instance
of VC is a yes instance iff the constructed instance of
SC is a yes instance.
(→) Suppose G has a vertex cover of size at most j,
called C. By our construction, C corresponds to a
collection C’ of subsets of U. Since b = j, |C’| ≤ b.
Plus, C’ covers all elements in U since C “covers” all
edges in G. To see this, consider any element of U.
Such an element is an edge in G. Since C is a set cover,
at least endpoint of thisMyedge
T. Thai
is in C. 39
mythai@cise.ufl.edu
(←) Suppose there is a set cover C’ of size at
most b in our constructed instance. Since each
set in C’ is associated with a vertex in G, let C
be the set of these vertices. Then |C| = |C’| ≤ b =
j. Plus, C is a vertex cover of G since C’ is a set
cover. To see this, consider any edge e. Since e
is in U, C’ must contain at least one set that
includes e. By our construction, the only set
that include e correspond to nodes that are
endpoints of e. Thus C must contain at least one
endpoints of e.
My T. Thai 40
mythai@cise.ufl.edu
Solutions
Algorithm 1: (in the case of uniform cost)
1: C = empty
2: while U is not empty
3: pick a set Si such that Si covers the most
elements in U
4: remove the new covered elements from U
5: C = C union Si
6: return C
My T. Thai 41
mythai@cise.ufl.edu
Solutions (cont)
In the case of non-uniform cost
Similar method. At each iteration, instead of picking a
set Si such that Si covers the most uncovered elements,
we will pick a set Si whose cost-effectiveness α is
smallest, where α is defined as:
c( Si ) / | Si U |
Questions: Why choose smallest α? Why define α as
above
My T. Thai 42
mythai@cise.ufl.edu
Solutions (cont)
Algorithm 2: (in the case of non-uniform cost)
1: C = empty
2: while U is not empty
3: pick a set Si such that Si has the smallest α
4: for each new covered elements e in U
5: set price(e) = α
6: remove the new covered elements from U
7: C = C union Si
8: return C My T. Thai 43
mythai@cise.ufl.edu
Approximation Ratio Analysis
Let ek, k = 1…n, be the elements of U in the order in which
they were covered by Alg 2. We have:
Lemma 1:
For each k {1,..., n}, price(ek ) opt /( n k 1)
where opt is the cost of the optimal solution
Proof: Let Uj be the set the remaining set U at iteration j.
That is, Uj contains all the uncovered elements at
iteration j. At any iteration j, the leftover sets of the
optimal solution can cover the remaining elements at a
cost of at most opt. (Why?)
My T. Thai 44
mythai@cise.ufl.edu
Proof of Lemma 1 (cont)
Thus, among these leftover sets, there must exist
one set where its α value is at most opt/|Uj|. Let
ek be the element covered at this iteration, |Uj| ≥
n – k + 1. Since ek was covered by the most
cost-effective set in this iteration, we have:
price(ek) ≤ opt/|Uj| ≤ opt/(n-k+1)
My T. Thai 45
mythai@cise.ufl.edu
Approximation Ratio
Theorem 1: The set cover obtained form Alg 2
(also Alg 1) has a factor of Hn where Hn is a
harmonic series Hn = 1 + 1/2 + … + 1/n
Proof: It follows directly from Lemma 1
My T. Thai 46
mythai@cise.ufl.edu