Ding-Zhu Du
Panos M. Pardalos
Xiaodong Hu
Weili Wu
Introduction
to Combinatorial
Optimization
Springer Optimization and Its Applications
Volume 196
Series Editors
Panos M. Pardalos, University of Florida
My T. Thai, University of Florida
Honorary Editor
Ding-Zhu Du, University of Texas at Dallas
Advisory Editors
Roman V. Belavkin, Middlesex University
John R. Birge, University of Chicago
Sergiy Butenko, Texas A&M University
Vipin Kumar, University of Minnesota
Anna Nagurney, University of Massachusetts Amherst
Jun Pei, Hefei University of Technology
Oleg Prokopyev, University of Pittsburgh
Steffen Rebennack, Karlsruhe Institute of Technology
Mauricio Resende, Amazon
Tamás Terlaky, Lehigh University
Van Vu, Yale University
Michael N. Vrahatis, University of Patras
Guoliang Xue, Arizona State University
Yinyu Ye, Stanford University
Aims and Scope
Optimization has continued to expand in all directions at an astonishing rate. New algorithmic and theoretical techniques are continually being developed, and the diffusion into other disciplines is proceeding at a rapid pace, with a spotlight on machine learning, artificial intelligence, and quantum computing. Our knowledge of all aspects of the field has grown even more profound. At the same time, one of the most striking trends in optimization is the constantly increasing emphasis on the interdisciplinary nature of the field. Optimization has been a basic tool in areas including, but not limited to, applied mathematics, engineering, medicine, economics, computer science, operations research, and other sciences.
The series Springer Optimization and Its Applications (SOIA) aims to publish
state-of-the-art expository works (monographs, contributed volumes, textbooks,
handbooks) that focus on theory, methods, and applications of optimization. Topics
covered include, but are not limited to, nonlinear optimization, combinatorial opti-
mization, continuous optimization, stochastic optimization, Bayesian optimization,
optimal control, discrete optimization, multi-objective optimization, and more. New to the series portfolio are works at the intersection of optimization and machine learning, artificial intelligence, and quantum computing.
Volumes from this series are indexed by Web of Science, zbMATH, Mathematical
Reviews, and SCOPUS.
Ding-Zhu Du • Panos M. Pardalos • Xiaodong Hu •
Weili Wu
Introduction to
Combinatorial Optimization
Ding-Zhu Du
Department of Computer Science
University of Texas at Dallas
Richardson, TX, USA

Panos M. Pardalos
Department of Industrial & Systems Engineering
University of Florida
Gainesville, FL, USA

Xiaodong Hu
Academy of Mathematics and Systems Science
Chinese Academy of Sciences
University of Chinese Academy of Sciences
Beijing, China

Weili Wu
Department of Computer Science
University of Texas at Dallas
Richardson, TX, USA
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Since the fabric of the world is the most
perfect and was established by the wisest
Creator, nothing happens in this world in
which some reason of maximum or minimum
would not come to light.
—Euler
The motivation for writing this book came from our previous teaching experience at the undergraduate, junior graduate, and senior graduate levels.
The first observation is about the organization of courses on combinatorial optimization. Many textbooks are problem oriented, but our experience indicates that a methodology-oriented organization is preferred by students.
The second observation is about contents. At present, technological developments, such as wireless communication, cloud computing, social networks, and machine learning, involve many applications of combinatorial optimization and provide a platform on which new issues, new techniques, and new subareas can grow. This motivated us to update our teaching materials.
This book is methodology oriented and organized along a line leading the reader step by step from the very beginning toward the frontier of the field. Actually, all
materials are selected from lecture notes from three courses, which are taught at
undergraduate level, junior graduate (MS) level, and senior graduate (PhD) level,
respectively.
The first part is selected from a course on computer algorithm design and analysis. This course does not explicitly state that it is about combinatorial optimization; however, its contents overlap to a very large extent with combinatorial optimization.
The second part comes from a course on the design and analysis of approxima-
tion algorithms. The third part comes from a course on nonlinear combinatorial
optimization. These two parts overlap in a few chapters. Therefore, we combined and simplified them.
While all three parts have been used for teaching at the University of Texas
at Dallas and the University of Florida for many years, the second and the third
parts are also utilized for teaching in short summer courses at the University of
Chinese Academy of Sciences, Beijing University, Tsinghua University, Beijing
Jiaotong University, Xi’an Jiaotong University, Ocean University of China, Bei-
jing University of Technology, Lanzhou University, Zhejiang Normal University,
Shandong University, Harbin Institute of Technology, CityU of Hong Kong, and
PolyU of Hong Kong. Therefore, we wish to thank Professors Andy Yao, Francis
Yao, Jianzhong Li, Hong Gao, Xiaohua Jia, Jiannong Cao, Qizhi Fang, Jianliang
Wu, Naihua Xiu, Lingchen Kong, Dachuan Xu, Suixiang Gao, Wenguo Yang, Zhao
Zhang, Xujin Chen, Xianyue Li, and Hejiao Huang for their support.
Finally, we would like to acknowledge the partial support of the US National Science Foundation under grants 1747818, 1822985, and 1907472.
Contents

1 Introduction
1.1 What Is Combinatorial Optimization?
1.2 Optimal and Approximation Solutions
1.3 Preprocessing
1.4 Running Time
1.5 Data Structure
Exercises
Historical Notes
2 Divide-and-Conquer
2.1 Algorithms with Self-Reducibility
2.2 Rectilinear Minimum Spanning Tree
2.3 Fibonacci Search
2.4 Heap
2.5 Counting Sort
2.6 More Examples
Exercises
Historical Notes
3 Dynamic Programming and Shortest Path
3.1 Dynamic Programming
3.2 Shortest Path
3.3 Dijkstra Algorithm
3.4 Priority Queue
3.5 Bellman-Ford Algorithm
3.6 All Pairs Shortest Paths
Exercises
Historical Notes
4 Greedy Algorithm and Spanning Tree
4.1 Greedy Algorithms
4.2 Matroid
Bibliography
Chapter 1
Introduction
Let us start this textbook with a fundamental question, and explain what will constitute this book.
The aim of combinatorial optimization is to find an optimal object from a finite set
of objects. Those candidate objects are called feasible solutions, while the optimal
one is called an optimal solution. For example, consider the following problem.
Problem 1.1.1 (Minimum Spanning Tree) Given a connected graph G = (V , E)
with nonnegative edge weight c : E → R+ , find a spanning tree with minimum total
weight, where “spanning” means that all nodes are included in the tree and hence
a spanning tree interconnects all nodes in V . An example is shown in Fig. 1.1.
Clearly, the set of all spanning trees is finite, and the aim of this problem is to
find one with minimum total weight from this set. Each spanning tree is a feasible
solution, and the optimal solution is the spanning tree with minimum total weight,
which is also called the minimum spanning tree. Therefore, this is a combinatorial
optimization problem.
A combinatorial optimization problem may have more than one optimal solution.
For example, in Fig. 1.1, there are two spanning trees with minimum total length.
(The second one can be obtained by using edge (e, f ) to replace edge (d, f ).)
Therefore, when we speak of the optimal solution mentioned above, we mean an arbitrary member of the class of optimal solutions.
Combinatorial optimization is a proper subfield of discrete optimization. In fact, there exist some problems in discrete optimization which do not belong to combinatorial optimization.

(Fig. 1.1 shows a connected graph on nodes a, b, c, d, e, f with edge weights 1, 1, 1, 1, 2, 3, 3.)
Kruskal Algorithm
input: A connected graph G = (V , E) with nonnegative edge weight
c : E → R+ .
output: A minimum spanning tree T .
Sort all edges e1 , e2 , . . . , em in nondecreasing order of weight,
i.e., c(e1 ) ≤ c(e2 ) ≤ · · · ≤ c(em );
T ← ∅;
for i ← 1 to m do
if T ∪ {ei } does not contain a cycle
then T ← T ∪ {ei };
return T .
From this algorithm, we see that it is not hard to find an optimal solution for the minimum spanning tree problem. If every combinatorial optimization problem were like the minimum spanning tree, then we would be very happy: we could find an optimal solution for each of them. Unfortunately, there exist a large number of problems for which it is unlikely that optimal solutions can be computed efficiently. For example, consider the following problem.
Problem 1.2.2 (Minimum Length Rectangular Partition) Given a rectangle
with point-holes inside, partition it into smaller rectangles without holes inside so as to minimize the total length of the cuts.
Problems 1.1.1 and 1.2.2 are quite different. Problem 1.2.2 is intractable, while there exists an efficient algorithm to compute an optimal solution for Problem 1.1.1. Actually, in the theory of combinatorial optimization, we need to study not only how to design and analyze algorithms for finding optimal solutions but also how to design and analyze algorithms for computing approximation solutions. When should we put our efforts into optimal solutions, and when should we pay attention to approximation solutions? The ability to make such a judgment grows from the study of computational complexity.
The book consists of three building blocks: the design and analysis of computer algorithms for exact optimal solutions, the design and analysis of approximation algorithms, and nonlinear combinatorial optimization.
The first block contains Chaps. 2–7, which can be divided into two parts (Fig. 1.3). The first part is on algorithms with self-reducibility, including divide-and-conquer, dynamic programming, the greedy algorithm, local search, local ratio, etc., which are organized into Chaps. 2–4. The second part is on the incremental method, including the primal algorithm, the dual algorithm, and the primal-dual algorithm, which are organized into Chaps. 5–7. There is an intersection between algorithms with self-reducibility and primal-dual algorithms. In fact, in the computation process of the former, an optimal feasible solution is built up step by step based on certain techniques, and the latter also has a process to build up an optimal primal solution by using information from the dual side. Therefore, some algorithms belong to both classes.
1.3 Preprocessing
In the Kruskal algorithm, the first line sorts all edges into nondecreasing order of cost. This requires a preprocessing procedure for solving the following sorting problem.
Problem 1.3.1 (Sorting) Given a sequence of positive integers, sort them into
nondecreasing order.
The following is a simple algorithm to do the sorting job.
Insertion Sort
input: An array A with a sequence of positive integers.
output: An array A with a sequence of positive integers in
nondecreasing order.
for j ← 2 to length[A]
do key ← A[j ]
i ←j −1
while i > 0 and A[i] > key
do A[i + 1] ← A[i]
i ←i−1
end-while
A[i + 1] ← key
end-for
return A.
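For concreteness, the following is a minimal Python transcription of the pseudocode above; it is a sketch (0-indexed, unlike the pseudocode), not the book's own code.

def insertion_sort(a):
    # Sort list a in nondecreasing order, in place.
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift elements larger than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]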
An example of insertion sort is shown in Fig. 1.5.
Although insertion sort is simple, it runs a little slowly. Since sorting appears very often in algorithm design for combinatorial optimization problems, we will spend some space in Chap. 2 introducing faster algorithms.
1.4 Running Time
Fig. 1.5 An example for insertion sort. σ is the key lying outside of array A
The most important measure of quality for algorithms is running time. However, the same algorithm may take different times when run on different computers. To give a uniform standard, we agree to run algorithms on a theoretical computer model. This model is the multi-tape Turing machine, which has been widely accepted. Based on the Turing machine, the theory of computational complexity has been built up. We will touch on this part of the theory in Chap. 8.
However, throughout this book, except in Chap. 8, we will use the RAM model to evaluate the running time of algorithms. In the RAM model, we assume that each line of pseudocode requires a constant time. For example, the running time of insertion sort is calculated in Fig. 1.6.
Actually, the RAM model and the Turing machine model are closely related, and running times estimated in the two models are considered close enough. However, they sometimes differ in the estimation of running time. For example, consider the following piece of pseudocode.
for i = 1 to n
do assign First(i) ← i
end-for
According to the RAM model, the running time of this piece is O(n). However, on the Turing machine, the running time of this piece is O(n log n), because the assigned value has to be represented by a string of O(log n) symbols.
Theoretically, a constant factor is often ignored. For example, we usually say that the running time of insertion sort is O(n²) instead of giving the specific quadratic function of n. Here f(n) = O(g(n)) means that there exist constants c > 0 and n₀ > 0 such that

f(n) ≤ c · g(n) for all n ≥ n₀.

There are two more notations which appear very often in representations of running time. f(n) = Ω(g(n)) means that there exist constants c > 0 and n₀ > 0 such that

f(n) ≥ c · g(n) for all n ≥ n₀,

and f(n) = Θ(g(n)) means that there exist constants c₁ > 0, c₂ > 0, and n₀ > 0 such that

c₁ · g(n) ≤ f(n) ≤ c₂ · g(n) for all n ≥ n₀.
Kruskal Algorithm
input: A connected graph G = (V , E) with nonnegative edge weight c : E → R+ .
output: A minimum spanning tree T .
Sort all edges e1 , e2 , . . . , em in nondecreasing order of weight,
i.e., c(e1 ) ≤ c(e2 ) ≤ · · · ≤ c(em );
T ← ∅;
for each node v ∈ V do
Make-Set(v);
end-for
for i ← 1 to m do
if Find-Set(x) ≠ Find-Set(y) where ei = (x, y)
then T ← T ∪ ei
and Union(x, y);
end-for
return T .
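The following Python sketch implements this version of the Kruskal algorithm; the union-find helper (with path compression and union by rank) and all names are ours, not the book's.

def kruskal(n, edges):
    # edges: list of (weight, u, v) with nodes 0..n-1
    # Returns (total weight, list of tree edges).
    parent = list(range(n))
    rank = [0] * n

    def find(x):                      # Find-Set with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):                  # Union by rank
        x, y = find(x), find(y)
        if rank[x] < rank[y]:
            x, y = y, x
        parent[y] = x
        if rank[x] == rank[y]:
            rank[x] += 1

    tree, total = [], 0
    for w, u, v in sorted(edges):     # nondecreasing order of weight
        if find(u) != find(v):        # adding (u, v) creates no cycle
            union(u, v)
            tree.append((u, v))
            total += w
    return total, tree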
Exercises
1. In a city there are N houses, each of which is in need of a water supply. It costs
Wi dollars to build a well at house i, and it costs Cij to build a pipe between
houses i and j . A house can receive water if either there is a well built there or
there is some path of pipes to a house with a well. Give an algorithm to find the
minimum amount of money needed to supply every house with water.
2. Consider a connected graph G with all distinct edge weights. Show that the
minimum spanning tree of G is unique.
3. Consider a connected graph G = (V , E) with nonnegative edge weight c :
E → R+ . Suppose e1∗ , e2∗ , . . . , ek∗ are edges generated by Kruskal algorithm,
and e1 , e2 , . . . , ek are edges of a spanning tree in ordering c(e1 ) ≤ c(e2 ) ≤
· · · ≤ c(ek ). Show that c(ei∗ ) ≤ c(ei ) for all 1 ≤ i ≤ k.
Historical Notes
There are many books which have been written for combinatorial optimization
[72, 105, 264, 272, 275, 335, 358, 367, 368]. There are also many books published
in design and analysis of computer algorithms [73, 280], which cover a large
portion on combinatorial optimization problems. However, those books focus mainly on computing exact optimal solutions, with possibly a small part on approximation solutions. For approximation solutions, a large part of the material is usually covered in separate books [100, 387, 408]. For issues on computational complexity, the
reader may refer to [99, 260].
With recent developments in technology, combinatorial optimization has gained many new applications and new research directions [337, 338, 422, 425]. In this book, we try to meet requests from various areas for teaching, research, and reference by putting together three components: the classic part of combinatorial optimization, approximation theory developed in recent years, and the newly appearing nonlinear combinatorial optimization.
Chapter 2
Divide-and-Conquer
There exist a large number of algorithms in which the problem is reduced to several
subproblems, each of which is the same problem on a smaller-size input. Such a
problem is said to have the self-reducibility, and the algorithm is said to be with
self-reducibility.
For example, consider sorting problem again. Suppose input contains n numbers.
We may divide these n numbers into two subproblems. One subproblem is the
sorting problem on ⌈n/2⌉ numbers, and the other subproblem is the sorting problem on ⌊n/2⌋ numbers. After completely sorting each subproblem, combine the two sorted
sequences into one. This idea will result in a sorting algorithm, called the merge
sort. The pseudocode of this algorithm is shown in Algorithm 1.
The main body calls a procedure. This procedure contains two self-calls, which
means that the merge sort is a recursive algorithm, that is, the divide will continue
until each subproblem has an input of a single number. Then this procedure employs another procedure (Merge) to combine solutions of subproblems with smaller inputs into solutions of subproblems with larger inputs. The computation process on input
{5, 2, 7, 4, 6, 8, 1, 3} is shown in Fig. 2.1.
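A compact Python sketch of merge sort, mirroring the description above (the function names are ours):

def merge_sort(a):
    # Return a sorted copy of a (divide-and-conquer).
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # Merge: repeatedly take the smaller head element.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(merge_sort([5, 2, 7, 4, 6, 8, 1, 3]))  # [1, 2, 3, 4, 5, 6, 7, 8]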
Note that the running time of procedure Merge at each level is O(n). Let t (n) be
the running time of merge sort on input of size n. By the recursive structure, we can
obtain that t(1) = 0 and

t(n) ≤ 2 · t(⌈n/2⌉) + c · n

for some constant c > 0. Define T(1) = 0 and

T(n) = 2 · T(⌈n/2⌉) + c · n.

Then t(n) ≤ T(n), since, by induction,

t(n) ≤ 2 · t(⌈n/2⌉) + c · n
≤ 2 · T(⌈n/2⌉) + c · n (by the induction hypothesis)
= T(n).
Now, let us discuss how to solve recursive equation about T (n). Usually, we use
two stages. In the first stage, we consider special numbers n = 2k and employ the
recursive tree to find T (2k ) (Fig. 2.2), that is,
T(2^k) = 2 · T(2^{k−1}) + c · 2^k
= 2 · (2 · T(2^{k−2}) + c · 2^{k−1}) + c · 2^k
= ···
= 2^k · T(1) + kc · 2^k
= c · k · 2^k.
In general, we may guess that T(n) ≤ c′ · n log n for some constant c′ > 0. Let us show it by mathematical induction.
First, we choose c′ to satisfy T(n) ≤ c′ for n ≤ n₀, where n₀ will be determined later. This choice makes T(n) ≤ c′ n log n for n ≤ n₀, which meets the requirement for the basis step of the mathematical induction.
For the induction step, consider n ≥ n₀ + 1. Then we have

T(n) = 2 · T(⌈n/2⌉) + c · n
≤ 2 · c′ ⌈n/2⌉ log⌈n/2⌉ + c · n
≤ 2 · c′ ((n + 1)/2)(log(n + 1) − 1) + c · n
= c′ (n + 1) log(n + 1) − c′ (n + 1) + c · n
≤ c′ (n + 1)(log n + 1/n) − (c′ − c)n − c′
= c′ n log n + c′ log n − (c′ − c)n + c′/n.

Now, we choose n₀ sufficiently large such that n/2 > log n + 1/n and c′ > max(2c, T(1), . . . , T(n₀)). Then the above mathematical induction goes through. Therefore, we obtain the following.
Theorem 2.1.1 Merge sort runs in O(n log n) time.
By the mathematical induction, we can also prove the following result.
Theorem 2.1.2 Let T(n) = aT(n/b) + f(n), where a > 1 and b > 1 are constants and n/b means ⌊n/b⌋ or ⌈n/b⌉. Then we have the following:
1. If f(n) = O(n^{log_b a − ε}) for some positive constant ε, then T(n) = Θ(n^{log_b a}).
2. If f(n) = Θ(n^{log_b a}), then T(n) = Θ(n^{log_b a} log n).
3. If f(n) = Ω(n^{log_b a + ε}) for some positive constant ε and, moreover, af(n/b) ≤ cf(n) for sufficiently large n and some constant c < 1, then T(n) = Θ(f(n)).
In Fig. 2.1, we see a tree structure relating the problem and its subproblems. In general, for any algorithm with self-reducibility, the computational process produces a set of subproblems, on which we can construct a graph describing the relationship between them: add an edge from subproblem A to subproblem B if, at some iteration, subproblem A is reduced to several subproblems, including subproblem B. This graph is called the self-reducibility structure of the algorithm.
All algorithms with tree self-reducibility structure form a class, called divide-
and-conquer, that is, an algorithm is in class of divide-and-conquer if and only
if its self-reducibility structure is a tree. Thus, the merge sort is a divide-and-
conquer algorithm.
In a divide-and-conquer algorithm, it is not necessary to divide a problem evenly
or almost evenly. For example, we consider another sorting algorithm, called Quick
Sort. The idea is as follows.
In merge sort, the procedure Merge takes O(n) time, which is the main consumption of time. However, if A[i] ≤ A[q] for p ≤ i < q and A[q] ≤ A[j] for q < j ≤ r, then this procedure can be skipped: after sorting A[p .. q − 1] and A[q + 1 .. r], we can simply put them together to obtain sorted A[p .. r].
In order to have the above property satisfied, Quick Sort uses A[r] as a pivot to partition the elements A[p .. r − 1] into two subsequences such that one contains elements less than A[r] and the other contains elements at least A[r]. A pseudocode of Quick Sort is shown in Algorithm 2.
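Since Algorithm 2 is referenced but not reproduced here, the following is a hedged Python sketch of Quick Sort with a Lomuto-style partition that uses A[r] as the pivot, as described above:

def quick_sort(a, p=0, r=None):
    # In-place Quick Sort; partitions a[p..r-1] around the pivot a[r].
    if r is None:
        r = len(a) - 1
    if p < r:
        pivot, q = a[r], p
        for i in range(p, r):
            if a[i] < pivot:          # smaller elements go to the left part
                a[i], a[q] = a[q], a[i]
                q += 1
        a[q], a[r] = a[r], a[q]       # put the pivot into its final place
        quick_sort(a, p, q - 1)
        quick_sort(a, q + 1, r)
    return a

print(quick_sort([5, 2, 7, 4, 6, 8, 1, 3]))  # [1, 2, 3, 4, 5, 6, 7, 8]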
The division is not balanced in Quick Sort. In the worst case, one part contains
nothing, and the other contains r − p elements. This will result in running time
O(n²). However, Quick Sort has expected running time O(n log n). To see this, let T(n) denote the running time for n numbers. Note that the procedure Partition runs
in linear time. Then, we have
E[T(n)] ≤ (1/n)(E[T(n − 1)] + c₁n)
+ (1/n)(E[T(1)] + E[T(n − 2)] + c₁n)
+ ···
+ (1/n)(E[T(n − 1)] + c₁n)
= c₁n + (2/n) Σ_{i=1}^{n−1} E[T(i)].
Assume, by induction, that E[T(i)] ≤ c · i log i for all i < n, where c is a constant to be determined. Then

E[T(n)] ≤ c₁n + (2c/n) Σ_{i=1}^{n−1} i log i
= c₁n + c(n − 1) log ( Π_{i=1}^{n−1} i^i )^{2/(n(n−1))}
≤ c₁n + c(n − 1) log ( (1² + 2² + ··· + (n − 1)²) / (n(n − 1)/2) )
= c₁n + c(n − 1) log ( (2n − 1)/3 )
≤ c₁n + cn log (2n/3)
= cn log n + (c₁ − c log (3/2)) n.

Hence, choosing c ≥ c₁ / log(3/2) yields E[T(n)] ≤ cn log n, that is, the expected running time of Quick Sort is O(n log n).
2.2 Rectilinear Minimum Spanning Tree

Consider two points A = (x₁, y₁) and B = (x₂, y₂) in the plane. The rectilinear distance between A and B is defined by

d(A, B) = |x₁ − x₂| + |y₁ − y₂|.

The plane with the rectilinear distance is called the rectilinear plane, denoted the L₁-plane.
In this section, we study the following problem.
Problem 2.2.1 (Rectilinear Minimum Spanning Tree) Given n points in the
rectilinear plane, compute the minimum spanning tree on those n given points.
In Chap. 1, we already presented the Kruskal algorithm, which computes a minimum spanning tree in O(m log n) time. In this section, we will improve this result by showing that the rectilinear minimum spanning tree can be computed in O(n log n)
time. To do so, we first study an interesting problem as follows.
Problem 2.2.2 (All Northeast Nearest Neighbors) Consider a set P of n points in
the rectilinear plane. For each A = (xA , yA ) ∈ P , another point B = (xB , yB ) ∈ P
is said to lie in northeast (NE) area of A if xA ≤ xB and yA ≤ yB , but A = B.
Furthermore, B is the NE nearest neighbor of A if B has the shortest distance from
A among all points lying in the NE area of A. The problem asks to compute the NE nearest neighbor for every point in P . (The NE nearest neighbor of a point A is “none” if no given point lies in the northeast area of A.)
Let us design a divide-and-conquer algorithm to solve this problem. For sim-
plicity of description, assume all n points have distinct x-coordinates and distinct
y-coordinates. Now, we bisect n points by a vertical line L. Let Pl be the set of
points lying on the left side of L and Pr the set of points lying on the right side
of L. Suppose we have already solved the all NE nearest neighbors problem on the input point sets Pl and Pr , respectively. Let us discuss how to combine the solutions of the two subproblems into a solution for all NE nearest neighbors on P .
For a point A in Pr , its NE nearest neighbor in Pr is also its NE nearest neighbor in P . However, for a point A in Pl , the NE nearest neighbor in Pl may not be the NE nearest neighbor in P . Actually, let B₁ denote the NE nearest neighbor of A in Pl and B₂ the NE nearest neighbor of A in Pr . Then, if d(A, B₁) ≤ d(A, B₂), the NE nearest neighbor of A in P is B₁; otherwise, it is B₂.
By this lemma, we can compute the NE nearest neighbors in Pr for all points in
Pl as follows.
• For Pl , put all points in decreasing order of y-coordinate. For Pr , also put all points in decreasing order of y-coordinate, and put none in Pr as the first element. Assume that none has y-coordinate +∞ and that, for any point A ∈ Pl , d(A, none) = +∞.
• Employ three pointers left, right, and min. left is located in Pl ; right and min work in Pr (including none).
• Initially, assign left to the first point in Pl , and assign right and min to the first element of Pr .
• If right has y-coordinate higher than left and d(left, right) ≥ d(left, min), then move right to the next point in Pr .
• If right has y-coordinate higher than left and d(left, right) < d(left, min), then set min = right, and move right to the next point in Pr .
• If right has y-coordinate lower than left, then min is the NE nearest neighbor of left. Record this fact, and move left to the next point in Pl .
Since left, right, and min always move down and never move up, the above procedure
runs in O(n) time. Let T (n) be the running time for computing all NE nearest
neighbors for n points. Then we obtain T (n) = 2T (n/2) + O(n). Therefore,
T (n) = O(n log n).
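A Python sketch of the whole procedure may help; it exploits the fact that, for a point B in the NE area of A, d(A, B) = (x_B + y_B) − (x_A + y_A), so among NE candidates the nearest one is the one minimizing x + y. For brevity the combine step re-sorts by y, which costs an extra logarithmic factor over the bound above; all names are ours.

def ne_nearest_neighbors(points):
    # points: (x, y) tuples with pairwise distinct x- and y-coordinates
    nn = {p: None for p in points}

    def nearer(a, b):
        # Nearer of two NE candidates of the same point; None acts as "none".
        if a is None:
            return b
        if b is None:
            return a
        return a if a[0] + a[1] < b[0] + b[1] else b

    def solve(pts):                    # pts sorted by increasing x-coordinate
        if len(pts) <= 1:
            return
        mid = len(pts) // 2
        pl, pr = pts[:mid], pts[mid:]  # bisect by a vertical line
        solve(pl)
        solve(pr)
        # Combine: scan both halves in decreasing order of y-coordinate.
        left = sorted(pl, key=lambda p: -p[1])
        right = sorted(pr, key=lambda p: -p[1])
        r, cand = 0, None
        for p in left:
            while r < len(right) and right[r][1] > p[1]:
                cand = nearer(cand, right[r])  # every Pr point lies east of Pl
                r += 1
            nn[p] = nearer(nn[p], cand)

    solve(sorted(points))
    return nn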
We make a remark on the case that P contains points with the same x-coordinate or the same y-coordinate. If P has some points with the same x-coordinate, then in order to partition P into two even parts, we may also consider their y-coordinates. If P has some points with the same y-coordinate, then we may need to make a small adjustment in the combination procedure.
Theorem 2.2.4 Computing all NE nearest neighbors for n points can be done in
O(n log n) time.
Now, let us return to the rectilinear minimum spanning tree. Consider any point A. As shown in Fig. 2.3, divide the area surrounding A into eight octants. To make them disjoint, we assume that each octant contains only one of its two boundary rays, namely the one reached first when rotating an interior ray in the counterclockwise direction.
Lemma 2.2.5 Suppose (A, B) is an edge in a rectilinear minimum spanning tree.
Then B must be the nearest neighbor of A in an octant.
Proof Without loss of generality, assume that B lies in octant I of point A. For
contradiction, suppose that B is not the nearest neighbor of A in octant I. Let C be
the nearest neighbor of A in octant I, i.e., C lies in octant I and d(A, C) < d(A, B).
Note that the set of points in octant I lying within rectilinear distance d(A, B) from A forms a triangle (excluding the boundary through B), which has the property that every two points in this triangle have rectilinear distance less than d(A, B) (Fig. 2.3). Therefore, d(C, B) < d(A, B).
Remove edge (A, B) from the rectilinear minimum spanning tree, which will
partition the tree into two connected components, containing points A and B,
respectively. If A and C lie in the same component, then add edge (C, B); otherwise,
C and B must lie in the same component, and add edge (A, C). In either case, we obtain a shorter spanning tree, a contradiction.
Construct a graph G in the following way: For each point A, if an octant of A
contains another given point, then find a nearest neighbor B for A in this octant, and
add edge (A, B) to G.
Lemma 2.2.6 G contains a rectilinear minimum spanning tree.
Proof Consider a rectilinear minimum spanning tree T . For each point A, T must
contain an edge (A, B). By Lemma 2.2.5, B is the nearest neighbor of A in an
octant. Suppose (A, B) is not an edge of G. Then G must contain an edge (A, C)
lying in the same octant, where C is another nearest neighbor of A in that octant. Note that d(A, C) = d(A, B) and, by the argument in the proof of Lemma 2.2.5, d(C, B) < d(A, B).
Delete (A, B) from tree T . Then T is partitioned into two connected components. We claim that C and B must lie in the same component. In fact, otherwise, assume that A and C lie in the same component. Then we can shorten T by replacing (A, B) with (B, C), contradicting the minimality of T . Therefore, our claim is true. It follows that replacing (A, B) by (A, C) in T results in another minimum spanning tree T′. Continuing the above operations, we will find a rectilinear minimum spanning tree contained in G.
2.4 Heap
The heap is a quite useful data structure. Let us introduce it here and, by the way, give another sorting algorithm, Heap Sort.
A heap is a nearly complete binary tree, stored in an array (Fig. 2.4). What is a nearly complete binary tree? It is a binary tree in which every level, except possibly the last, is completely filled, and the last level is filled from left to right.
Left(i)
return 2i.

Right(i)
return 2i + 1.
Max-Heapify(A, i)
if A[Left(i)] ≥ A[Right(i)] and A[Left(i)] > A[i]
then exchange A[i] and A[Left(i)]
Max-Heapify(A, Left(i))
if A[Left(i)] < A[Right(i)] and A[Right(i)] > A[i]
then exchange A[i] and A[Right(i)]
Max-Heapify(A, Right(i));
Build-Max-Heap(A)
for i ← size[A]/2 down to 1
do Max-Heapify(A, i);
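A runnable 0-indexed Python sketch of these procedures and of Heap Sort (in 0-indexed arrays the children of i are 2i + 1 and 2i + 2, unlike the 1-indexed pseudocode above; names are ours):

def max_heapify(a, i, size):
    # Sift a[i] down within the max-heap a[0:size].
    largest, l, r = i, 2 * i + 1, 2 * i + 2
    if l < size and a[l] > a[largest]:
        largest = l
    if r < size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, size)

def build_max_heap(a):
    # Heapify bottom-up, starting from the last internal node.
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))

def heap_sort(a):
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move the current maximum to the end
        max_heapify(a, 0, end)
    return a

print(heap_sort([5, 2, 7, 4, 6, 8, 1, 3]))  # [1, 2, 3, 4, 5, 6, 7, 8]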
Let h = ⌊log n⌋ be the height of the heap. A node at depth i roots a subtree of height h − i, on which Max-Heapify takes O(h − i) time, and there are at most 2^i such nodes. Hence, the total running time of Build-Max-Heap is

Σ_{i=0}^{h} O(2^i (h − i)) = O(2^h Σ_{i=0}^{h} (h − i)/2^{h−i})
= O(2^h Σ_{i=0}^{h} i/2^i)
= O(n).
Since the number of steps is O(n) and Max-Heapify(A, 1) takes O(log n) time,
the running time of Heap Sort is O(n log n).
Theorem 2.4.1 Heap Sort runs in O(n log n) time.
We already have two sorting algorithms with O(n log n) running time and one sorting algorithm with expected O(n log n) running time. But no sorting algorithm seen so far runs faster than O(n log n). Is O(n log n) a barrier on the running time of sorting algorithms? In some sense, the answer is yes. All sorting algorithms presented previously belong to a class called comparison sort.
algorithms presented previously belong to a class, called comparison sort.
In comparison sort, order information about the input sequence can be obtained only by comparisons between elements of the input sequence. Suppose the input sequence contains n positive integers. Then there are n! possible permutations. The aim of a sorting algorithm is to determine the permutation which gives the nondecreasing order.
Each comparison divides the set of possible permutations into two subsets. The
comparison result tells which subset contains a nondecreasing order. Therefore,
every comparison sort algorithm can be represented by a binary decision tree
(Fig. 2.9). The (worst case) running time of the algorithm is the height (or depth)
of the decision tree.
Since the binary decision tree has n! leaves, its height T(n) satisfies

1 + 2 + ··· + 2^{T(n)} ≥ n!,

that is,

2^{T(n)+1} − 1 ≥ n! ≥ √(2πn) (n/e)^n.

Thus, T(n) ≥ log₂(n!) − 1 = Ω(n log n).
To break the barrier of running time O(n log n), one has to design a sorting
algorithm without using comparison. Counting sort is such an algorithm.
Let us use an example to illustrate Counting Sort, shown as Algorithm 4. This algorithm contains three arrays, A, B, and C. Array A contains the input sequence of positive integers. Suppose A = {4, 6, 5, 1, 4, 5, 2, 5}. Let k be the largest integer in the input sequence. Initially, the algorithm preprocesses array C in three stages:
1. Clean up array C.
2. For 1 ≤ i ≤ k, assign C[i] with the number of i’s appearing in array A. (In the
example, C = {1, 1, 0, 2, 3, 1} at this stage.)
3. Update C[i] such that C[i] is equal to the number of integers with value at most
i appearing in A. (In the example, C = {1, 2, 2, 4, 7, 8} at this stage.)
With the help of array C, the algorithm moves element A[j ] to array B, for j = n down to 1, by

B[C[A[j ]]] ← A[j ];
C[A[j ]] ← C[A[j ]] − 1.
C 122478
A 4 6 5 1 4 5 2 5̂
B 5
C 122468
A 4 6 5 1 4 5 2̂ 5
B 2 5
C 112468
A 4 6 5 1 4 5̂ 2 5
B 2 55
C 112458
A 4 6 5 1 4̂ 5 2 5
B 2 4 55
C 112358
A 4 6 5 1̂ 4 5 2 5
B 12 4 55
C 012358
A 4 6 5̂ 1 4 5 2 5
B 12 4555
C 012348
A 4 6̂ 5 1 4 5 2 5
B 12 45556
C 012347
A 4̂ 6 5 1 4 5 2 5
B 12445556
Lemma 2.5.1 Counting Sort runs in O(n + k) time, where k is the largest input value.
Proof The loop at line 1 takes O(k) time. The loop at line 4 takes O(n) time. The loop at line 7 takes O(k) time. The loop at line 10 takes O(n) time. Putting all together, the running time is O(n + k).
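A Python sketch of Counting Sort following the three stages and the right-to-left move described above (0-indexed output array; the decrement comes first in 0-indexed form):

def counting_sort(a, k):
    # Stable counting sort of list a of integers in 1..k.
    c = [0] * (k + 1)
    for x in a:                 # stage 2: count occurrences of each value
        c[x] += 1
    for i in range(1, k + 1):   # stage 3: prefix sums
        c[i] += c[i - 1]
    b = [0] * len(a)
    for x in reversed(a):       # scanning j = n down to 1 keeps stability
        c[x] -= 1
        b[c[x]] = x
    return b

print(counting_sort([4, 6, 5, 1, 4, 5, 2, 5], 6))  # [1, 2, 4, 4, 5, 5, 5, 6]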
A student found a simple way to “improve” Counting Sort. Let us consider the same example. At the second stage, C = {1, 1, 0, 2, 3, 1}, where C[i] is equal to the number of i’s appearing in array A. The student found that with this array C, array B can be filled in with integers immediately, without array A.
C 110231
B 1
B 12
B 1244
B 1244555
B 12445556
Is this method acceptable? The answer is no. Why not? Let us explain.
First, we should note that the numbers in the input sequence may come from labels of objects, and the same number may come from different objects. For example, consider a sequence of objects {329, 457, 657, 839, 436, 720, 355}. If we use their first digits from the right as labels, then we will obtain the label sequence {9, 7, 7, 9, 6, 0, 5}. When we apply Counting Sort to this sequence, we will obtain the sequence {720, 355, 436, 457, 657, 329, 839}. This is because a label gets moved together with its object in Counting Sort.
Moreover, consider two objects 329 and 839 with the same label 9. In input
sequence, 329 lies on the left side of 839. After Counting Sort, 329 still lies on the left side of 839.
A sorting algorithm is stable if for different objects with the same label, after
labels are sorted, the ordering of objects in output sequence is the same as their
ordering in input sequence. The following can be proved easily.
Lemma 2.5.2 Counting Sort is stable.
The student’s method does not keep the stability property.
With the stability property, we can use Counting Sort in the following way. Remember, after sorting on the rightmost digit, we obtained the sequence {720, 355, 436, 457, 657, 329, 839}. Now, we continue to sort this sequence based on the second rightmost digit, obtaining {720, 329, 436, 839, 355, 457, 657}. Sorting once more on the leftmost digit yields {329, 355, 436, 457, 657, 720, 839}, which is fully sorted.
As another application, suppose we want to sort n integers in the range 0 to n² − 1. Each such integer can be written as

an + b, where 0 ≤ a ≤ n − 1 and 0 ≤ b ≤ n − 1.

Apply Counting Sort first to b and then to a. Each application takes O(n) = O(n + k) time since k = n. Therefore, the total time is still O(n).
In general, suppose there are n integers, each of which can be represented in the form

a_d k^d + a_{d−1} k^{d−1} + ··· + a₀, where 0 ≤ aᵢ ≤ k − 1.

Applying Counting Sort d + 1 times, on a₀ first and a_d last, sorts these integers in O((d + 1)(n + k)) time.
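A Python sketch of this digit-by-digit procedure (commonly called radix sort; the function signature is ours):

def radix_sort(a, k, digits):
    # Sort integers having `digits` base-k digits, least significant first.
    for d in range(digits):
        key = lambda x: (x // k ** d) % k       # the d-th base-k digit
        c = [0] * k
        for x in a:
            c[key(x)] += 1
        for i in range(1, k):
            c[i] += c[i - 1]
        b = [0] * len(a)
        for x in reversed(a):                   # stability is essential here
            c[key(x)] -= 1
            b[c[key(x)]] = x
        a = b
    return a

print(radix_sort([329, 457, 657, 839, 436, 720, 355], 10, 3))
# [329, 355, 436, 457, 657, 720, 839]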
2.6 More Examples

Let us study more examples involving the divide-and-conquer technique and sorting algorithms.
Example 2.6.1 (Maximum Consecutive Subsequence Sum) Given a sequence of n
integers, find a consecutive subsequence with maximum sum.
Divide the input sequence S into two subsequences S₁ and S₂ such that |S₁| = ⌈n/2⌉ and |S₂| = ⌊n/2⌋. Let MaxSub(S) denote a consecutive subsequence of S with maximum sum. Then there are two cases.
Case 1. MaxSub(S) is contained in either S₁ or S₂. In this case, MaxSub(S) = MaxSub(S₁) or MaxSub(S) = MaxSub(S₂).
Case 2. MaxSub(S) ∩ S₁ ≠ ∅ and MaxSub(S) ∩ S₂ ≠ ∅. In this case, MaxSub(S) ∩ S₁ is the tail subsequence of S₁ with maximum sum. That
is, suppose S1 = {a1 , a2 , . . . , ap }. Then among subsequences {ap },
{ap−1 , ap }, . . . , {a1 , . . . , ap }, MaxSub(S)∩S1 is the one with maximum
sum. Therefore, it can be found in O(n) time. Similarly, MaxSub(S)∩S2
is the head subsequence with maximum sum, which can be computed in
O(n) time.
This gives a divide-and-conquer algorithm with running time O(n log n). Actually, there is also a linear-time solution. Let S_j denote the maximum sum of a consecutive subsequence ending at a_j. Then S₁ = a₁ and

S_{j+1} = S_j + a_{j+1}, if S_j > 0,
S_{j+1} = a_{j+1}, if S_j ≤ 0.
This recursive formula gives a linear time algorithm to compute Sj for all 1 ≤
j ≤ n. From them, find the maximum one, which is the solution for the maximum
consecutive subsequence sum problem.
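A Python sketch of this linear-time recursion (often attributed to Kadane; the variable names are ours):

def max_consecutive_sum(a):
    # Maximum sum of a nonempty consecutive subsequence.
    best = s = a[0]          # s = S_j, the best sum ending at position j
    for x in a[1:]:
        s = s + x if s > 0 else x
        best = max(best, s)
    return best

print(max_consecutive_sum([-2, 11, -4, 13, -5, -2]))  # 20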
Example 2.6.2 (Closest Pair of Points) Given n points in the Euclidean plane, find
a pair of points to minimize the distance between them.
Initially, we may assume that all n points have distinct x-coordinates since, if
not, we may rotate the coordinate system a little.
Now, divide all points into two half parts based on x-coordinates. Find the closest
pair of points in each part. Suppose δ1 and δ2 are distances of closest pairs in two
parts, respectively. Let δ = min(δ₁, δ₂). We next study whether there is a pair of points, one in each part, with distance less than δ (Fig. 2.10).
For each point u = (xu , yu ) in the left part (Fig. 2.10), consider the rectangle
Ru = {(x, y) | xu ≤ x ≤ xu + δ, yu − δ ≤ y ≤ yu + δ}. It has the following
properties:
• Every point in the right part and within distance δ from u lies in this rectangle.
• This rectangle contains at most six points in the right part because every two
points have distance at least δ.
For each u in the left part, check every point v lying in Ru to see whether d(u, v) < δ. If yes, then we keep a record, and the closest such pair found is the solution. If not, then the solution is either the closest pair of points in the left part or the closest pair of points in the right part.
Let T(n) be the time for finding the closest pair of points among n points. If the points are presorted by y-coordinate, the above combining step can be done in O(n) time, so the method gives the recursive relation T(n) ≤ 2T(n/2) + O(n), which yields T(n) = O(n log n).
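A Python sketch of the divide-and-conquer scheme just described; for simplicity it sorts the strip by y inside each call, which strictly adds a logarithmic factor but keeps the code short (all names are ours):

import math

def closest_pair(points):
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def solve(p):                       # p sorted by x-coordinate
        if len(p) <= 3:
            return min(dist(a, b)
                       for i, a in enumerate(p) for b in p[i + 1:])
        mid = len(p) // 2
        x0 = p[mid][0]                  # dividing vertical line
        delta = min(solve(p[:mid]), solve(p[mid:]))
        # Check pairs straddling the line: only points within delta of it.
        strip = sorted((q for q in p if abs(q[0] - x0) < delta),
                       key=lambda q: q[1])
        for i, a in enumerate(strip):
            for b in strip[i + 1:i + 8]:  # a constant number of candidates
                delta = min(delta, dist(a, b))
        return delta

    return solve(sorted(points))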
Example 2.6.3 (Selection) Given a sequence of n distinct numbers and a positive integer i, find the ith smallest number in the sequence. Consider the following algorithm A(n, i): (1) partition the input numbers into groups of five elements; (2) sort each group and collect the median of each group; (3) make a recursive call to find the median x of these medians; (4) partition all input numbers around x, and let k be the number of elements less than x; (5) if i ≤ k, then the ith smallest number lies in the left of x, and hence make a recursive call A(k, i); if k ≤ i − 2, then the ith smallest number lies in the right of x, and hence make a recursive call A(n − k − 1, i − k − 1); if i = k + 1, then x itself is the answer.
Now, let us analyze this algorithm. Let T (n) be the running time of A(n, i).
• Steps 1 and 2 take O(n) time.
• Step 3 takes T (⌈n/5⌉) time.
• Step 4 takes O(n) time.
• Step 5 takes T (max(k, n − k − 1)) time.
Therefore,

T(n) ≤ T(⌈n/5⌉) + T(max(k, n − k − 1)) + O(n).

We claim that

max(k, n − k − 1) ≤ n − (3⌈(1/2)⌈n/5⌉⌉ − 2).

In fact, at least ⌈(1/2)⌈n/5⌉⌉ groups have their median at most x, and each of them contributes at least three elements at most x, except possibly the incomplete last group, which contributes at least one; hence

k + 1 ≥ 3⌈(1/2)⌈n/5⌉⌉ − 2,

and symmetrically,

n − k ≥ 3⌈(1/2)⌈n/5⌉⌉ − 2.

Therefore,

n − k − 1 ≤ n − (3⌈(1/2)⌈n/5⌉⌉ − 2)

and

k ≤ n − (3⌈(1/2)⌈n/5⌉⌉ − 2).

Note that

n − (3⌈(1/2)⌈n/5⌉⌉ − 2) ≤ n − 3n/10 + 2 = 7n/10 + 2.

By the claim,

T(n) ≤ T(⌈n/5⌉) + T(7n/10 + 2) + c′n

for some constant c′ > 0.
We now show by induction that

T(n) ≤ cn   (2.1)

for some constant c > 0. First, choose c ≥ max(20c′, T(1), T(2)/2, . . . , T(59)/59). Therefore, (2.1) holds for n ≤ 59. Next, consider n ≥ 60. By the induction hypothesis, we have

T(n) ≤ c⌈n/5⌉ + c(7n/10 + 2) + c′n ≤ cn − c(n/10 − 3) + c′n ≤ cn,

since

c(n/10 − 3) ≥ cn/20 ≥ c′n.

The first inequality is due to n ≥ 60, and the second one is due to c ≥ 20c′. This ends the proof of T(n) = O(n).
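A Python sketch of the selection algorithm A(n, i) with groups of five (names are ours; distinct input numbers are assumed, as in the example):

def select(a, i):
    # Return the i-th smallest (i >= 1) of the distinct numbers in a.
    if len(a) <= 5:
        return sorted(a)[i - 1]
    # Steps 1-3: medians of groups of five, then their median x.
    medians = [sorted(a[j:j + 5])[len(a[j:j + 5]) // 2]
               for j in range(0, len(a), 5)]
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x; k elements are smaller than x.
    smaller = [y for y in a if y < x]
    larger = [y for y in a if y > x]
    k = len(smaller)
    if i <= k:                        # answer lies to the left of x
        return select(smaller, i)
    if i == k + 1:
        return x
    return select(larger, i - k - 1)  # answer lies to the right of x

print(select([7, 1, 9, 3, 5, 8, 2, 6, 4], 5))  # 5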
Example 2.6.4 (Largest Rectangular Area in Histogram) Consider a histogram as
shown in Fig. 2.14. Assume every bar has unit width and heights are h1 , h2 , . . . , hn ,
respectively. Find the largest rectangular area.
Let h_k = min(h_i, h_{i+1}, . . . , h_j). Denote by m(i, j) the largest rectangular area in the histogram with bars between i and j. Then we obtain the following recursive formula:

m(i, j) = max{ m(i, k − 1), m(k + 1, j), (j − i + 1) · h_k }.

Similarly to Quick Sort, the expected running time can be proved to be O(n log n).
Exercises
1. Use a recursion tree to estimate a good upper bound on the recurrence T (n) =
3T (n/2) + n and T (1) = 0. Use the mathematical induction to prove
correctness of your estimation.
2. Draw the recursion tree for T (n) = 3T (n/2) + cn, where c is a positive
constant, and guess an asymptotic upper bound on its solution. Prove your
bound by mathematical induction.
3. Show that for an input sequence in decreasing order, the running time of Quick Sort is Ω(n²).
4. Show that Counting Sort is stable.
5. Find an algorithm to sort n integers in the range 0 to n³ − 1 in O(n) time.
6. Let A[1 : n] be an array of n distinct integers sorted in increasing order.
(Assume, for simplicity, that n is a power of 2.) Give an O(log n)-time
algorithm to decide if there is an integer i, 1 ≤ i ≤ n, such that A[i] = i.
7. Given an array A of integers, please return an array B such that B[i] = |{A[k] |
k > i and A[k] < A[i]}|.
8. Given a string S and an integer k > 0, find the longest substring of S such that each symbol that appears in the substring appears at least k times.
9. Given an integer array A, please compute the number of pairs {i, j } with A[i] >
2 · A[j ].
10. Given a sorted sequence of distinct nonnegative integers, find the smallest
missing number.
11. Given two sorted sequences with m, n elements, respectively, design and
analyze an efficient divide-and-conquer algorithm to find the kth element
in the merge of the two sequences. The best algorithm runs in time
O(log(max(m, n))).
12. Design a divide-and-conquer algorithm for the following longest ascending
subsequence problem: Given an array A[1..n] of natural numbers, find the
length of the longest ascending subsequence. (A subsequence is a list A[i1 ],
A[i2 ], . . . , A[im ] where m is the length.)
13. Show that in a max-heap of length n, the number of nodes rooted at which the subtree has height h is at most ⌈n/2^{h+1}⌉.
14. Let A be an n × n matrix of integers such that each row is strictly increasing
from left to right and each column is strictly increasing from top to bottom. Give
an O(n)-time algorithm for finding whether a given number x is an element of
A, i.e., whether x = A(i, j ) for some i, j .
15. Let S be a set of n points, pi = (xi , yi ), 1 ≤ i ≤ n, in the plane. A point pj ∈ S
is a maximal point of S if there is no other point pk ∈ S such that xk ≥ xj and
yk ≥ yj . In Fig. 2.15, it illustrates the maximal points of a point-set S. Note
that the maximal points form a “staircase” which descends rightward. Give an
efficient divide-and-conquer algorithm to determine the maximal points of S.
16. Let A[1..n] be an array of n distinct integers where n ≥ 2. An element A[i] is a local maximum if A[i − 1] < A[i] and A[i] > A[i + 1] for 1 < i < n, A[i] > A[i + 1] for i = 1, and A[i − 1] < A[i] for i = n. Please design an algorithm to find a local maximum in O(log n) time.
17. The maximum subsequence sum problem is defined as follows: Given an array A[1..n] of integers, find values of i and j with 1 ≤ i ≤ j ≤ n such that Σ_{k=i}^{j} A[k] is maximized. Design a divide-and-conquer algorithm for solving the maximum subsequence sum problem in time O(n log n).
18. In the plane, there are n distinct points p₁, p₂, . . . , pₙ lying on the line y = 0 and also n distinct points q₁, q₂, . . . , qₙ lying on the line y = 1. Consider the n segments [p₁, q₁], [p₂, q₂], . . . , [pₙ, qₙ]. Design an algorithm to count the number of crossing pairs among these n segments. Your algorithm should run in O(n log n) time.
19. Design a divide-and-conquer algorithm for multiplying n complex numbers
using only 3(n − 1) real multiplications.
20. Consider a 0-1 matrix of order (2ⁿ − 1) × n. All rows are distinct 0-1 sequences of length n; that is, exactly one 0-1 sequence of length n is missing. Design an algorithm to find the missing sequence in O(2ⁿ) time.
21. Given a sequence of n distinct integers and a positive integer i, finding the ith
smallest one in the sequence can be done in O(n) time (see Example 2.6.3).
Now, consider the problem of finding the ith smallest one for every i =
1, 2, . . . , k. Can you do it in O(n log k) time?
22. An inversion in an array A[1..n] is a pair of indices i and j such that i < j
and A[i] > A[j ]. Design an algorithm to count the number of inversions in an
n-element array in O(n log n) time.
23. In Example 2.6.3, a linear time algorithm is given for finding the ith smallest number in an unsorted list of n distinct integers. Now, let us modify the first two
steps as follows: Initially, suppose all n integers are given in array A. Partition
all input integers into groups of three elements. Then sort each group, and place
its median into another array B. Repeat the same process for B, that is, partition
elements in B into groups of three elements, and then place the median of each
group into array C. Now, make a recursive call to find the median x of C. The
remaining part is the same as later steps in the linear time algorithm. Please
analyze the running time of this modified algorithm.
24. Design an O(n^{log₂ 3})-step algorithm for multiplication of two n-digit numbers, where a single step only allows the multiplication/division or addition/subtraction of single-digit numbers. Could you improve your algorithm to run in O(n^{log₃ 5}) steps?
Chapter 3
Dynamic Programming and Shortest Path

3.1 Dynamic Programming
Let us first study several examples and start from a simpler one.
Example 3.1.1 (Fibonacci Number) The Fibonacci number Fᵢ for i = 0, 1, . . . is defined by

F₀ = 0, F₁ = 1, and Fᵢ = Fᵢ₋₁ + Fᵢ₋₂ for i ≥ 2.
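A hedged Python illustration of the dynamic-programming treatment of this example, assuming the convention F₀ = 0, F₁ = 1 stated above; memoization ensures each Fᵢ is computed only once, so fib(n) takes O(n) additions:

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(i):
    # F_0 = 0, F_1 = 1 (the convention assumed here)
    if i < 2:
        return i
    return fib(i - 1) + fib(i - 2)

print([fib(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]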
Let Dᵢ be the length of the path from vᵢ to the root of T . The cost of T is defined by

cost(T) = Σᵢ₌₁ⁿ aᵢ Dᵢ.

The problem is to construct a labeled tree T minimizing the cost cost(T) for a given sequence of positive integers a₁, a₂, . . . , aₙ.
Let T (i, j ) be the optimal labeled tree for subsequence {ai , ai+1 , . . . , aj } and
sum(i, j) = aᵢ + aᵢ₊₁ + ··· + aⱼ. Then cost(T(i, i)) = 0 and, for i < j,

cost(T(i, j)) = sum(i, j) + min_{i ≤ k < j} ( cost(T(i, k)) + cost(T(k + 1, j)) ),

where

sum(i, j) = aᵢ, if i = j,
sum(i, j) = aᵢ + sum(i + 1, j), if i < j.
There are two remarks on this formula: (1) There are some exceptional cases; we will see one in the next section. (2) Divide-and-conquer can be considered as a special case of dynamic programming. Therefore, its running time can also be estimated with this formula; however, the outcome is usually too rough.
Similarly to divide-and-conquer, there are two ways to write software code for dynamic programming. The first way is to employ recursive calls, as shown in Algorithm 5. The second way, shown in Algorithm 6, avoids recursive calls, and hence in practice it runs faster with a smaller space requirement.
Before we study the next example, let us first introduce a concept, the guillotine cut. Consider a rectangle P. A cut on P is called a guillotine cut if it is a straight line segment that cuts P into two parts. A guillotine partition is a partition produced by a sequence of guillotine cuts.
Example 3.1.3 (Minimum Length Guillotine Partition) Given a rectangle with
point-holes inside, partition it into smaller rectangles without a hole inside by a
sequence of guillotine cuts to minimize the total length of cuts. Here, the guillotine
cut is a vertical or horizontal straight line segment which partitions a rectangle into
two smaller rectangles. An example is shown in Fig. 1.2.
Example 3.1.3 is a geometric optimization problem. It has infinitely many
feasible solutions. Therefore, strictly speaking, it is not a combinatorial optimization
problem. However, it can be reduced to a combinatorial optimization problem.
Lemma 3.1.4 (Canonical Partition) There exists a minimum length guillotine
partition such that every guillotine cut passes through a point-hole.
Proof Suppose there exists a guillotine cut AB not passing through any point-hole
(Fig. 3.3). Without loss of generality, assume that AB is a vertical cut. Let n1 be the
number of guillotine cuts touching AB on the left and n2 the number of guillotine
cuts touching AB on the right. Without loss of generality, assume n1 ≥ n2 . Then we
can move AB to the left without increasing the total length of rectangular guillotine
partition, until a point-hole is met. If this movement never meets a point-hole, then AB can be moved until it meets another vertical cut or the vertical boundary, and in either case AB can be deleted, contradicting the optimality of the partition.
By Lemma 3.1.4, we may consider only canonical guillotine partitions. During
the canonical guillotine partition, each subproblem can be determined by a rectangle in which each boundary edge is obtained by a guillotine cut or is a boundary edge of the given rectangle; hence each boundary edge has O(n) possibilities. This implies that the number of subproblems is O(n⁴).
To find an optimal one, let us study a guillotine cut on a rectangle P . Let n be
the number of point-holes. Since the guillotine cut passes a point-hole, there are at
most 2n possible positions for the cut. Suppose P₁ and P₂ are the two rectangles obtained from P by the guillotine cut. Let opt(P) denote the minimum total length of a guillotine partition of P . Then we have

opt(P) = min over all canonical guillotine cuts of P of ( length of the cut + opt(P₁) + opt(P₂) ).

The computation time for this recurrence is O(n) for each subproblem. Therefore, the optimal rectangular guillotine partition can be computed by dynamic programming in O(n⁵) time.
One of the important techniques in designing dynamic programming for a given problem is to replace the original problem by a proper one which can easily be seen to have self-reducibility. The following is such an example.
Example 3.1.5 Consider a horizontal strip. There are n target points lying inside
and m unit disks with centers lying outside of the strip where each unit disk di has
radius one and a positive weight w(di ). Each target point is covered by at least one
unit disk. The problem is to find a subset of unit disks, with minimum total weight,
to cover all target points.
First, without loss of generality, assume that all target points have distinct x-coordinates; otherwise, we may rotate the strip together with the coordinate system a little to reach such a property. Line up all target points p₁, p₂, . . . , pₙ in increasing order of x-coordinate. Let D_a be the set of all unit disks with centers lying above the strip and D_b the set of all unit disks with centers lying below the strip. Let ℓ₁, ℓ₂, . . . , ℓₙ be vertical lines passing through p₁, p₂, . . . , pₙ, respectively. For any two disks d, d′ ∈ D_a, define d ≺ᵢ d′ if the lowest intersection between the boundary of disk d and ℓᵢ is not lower than the lowest intersection between the boundary of disk d′ and ℓᵢ. Similarly, for any two disks d, d′ ∈ D_b, define d ≺ᵢ d′ if the highest intersection between the boundary of disk d and ℓᵢ is not higher than the highest intersection between the boundary of disk d′ and ℓᵢ.
For any two disks da ∈ D_a and db ∈ D_b with pᵢ covered by da or db, let Dᵢ(da, db) be an optimal solution of the following problem:

min w(D) = Σ_{d ∈ D} w(d)   (3.1)
subject to da, db ∈ D,
∀d ∈ D ∩ D_a : d ≺ᵢ da,
∀d ∈ D ∩ D_b : d ≺ᵢ db,
D covers target points p₁, p₂, . . . , pᵢ.
Lemma 3.1.6

w(Dᵢ(da, db)) = min{ w(Dᵢ₋₁(da′, db′)) + [da′ ≠ da] w(da) + [db′ ≠ db] w(db)
| da′ ≺ᵢ da, db′ ≺ᵢ db, and pᵢ₋₁ is covered by da′ or db′ },

where

[d ≠ d′] = 1 if d ≠ d′, and [d ≠ d′] = 0 otherwise.
Proof Let da′ be the disk in Dᵢ(da, db) ∩ D_a whose boundary has the lowest intersection with ℓᵢ₋₁ and db′ the disk in Dᵢ(da, db) ∩ D_b whose boundary has the highest intersection with ℓᵢ₋₁. We claim that

w(Dᵢ(da, db)) = w(Dᵢ₋₁(da′, db′)) + [da′ ≠ da] w(da) + [db′ ≠ db] w(db).   (3.2)

To prove it, we first show that if da′ ≠ da, then every target point pᵢ′ with i′ < i − 1 covered by da is also covered by da′, so that da, with w(da) > 0, is not needed in Dᵢ₋₁(da′, db′). In fact, otherwise there would exist i′ < i − 1 such that pᵢ′ is covered by da but not covered by da′.
This is impossible (Fig. 3.4). To see this, let A be the lowest intersection between the boundary of disk da′ and ℓᵢ′ and B the lowest intersection between the boundary of disk da′ and ℓᵢ. Then A and B lie inside the disk da. Let C and D be the intersection points between line AB and the boundary of disk da. Let E be the lowest intersection between the boundary of disk da and ℓᵢ₋₁ and F the lowest intersection between the boundary of disk da′ and ℓᵢ₋₁. Note that da and da′ lie above the strip. We have ∠CED > ∠AFB > π/2 and hence sin ∠CED < sin ∠AFB. Moreover, we have |AB| < |CD|. Thus,

radius(da) = |CD| / (2 sin ∠CED) > |AB| / (2 sin ∠AFB) = radius(da′),

contradicting the assumption that all disks have the same (unit) radius. Therefore, our claim is true. Similarly, if db′ ≠ db, then db, with w(db) > 0, is not needed in Dᵢ₋₁(da′, db′). Therefore, (3.2) holds.
This means that, for the equation in Lemma 3.1.6, the left side is at least the right side. To see that the left side is at most the right side, we note that on the right side, Dᵢ₋₁(da′, db′) ∪ {da, db} is always a feasible solution of the problem (3.1).
Let us employ the recursive formula in Lemma 3.1.6 to compute all Dᵢ(da, db). There are in total O(m²n) subproblems. With the recursive formula, each Dᵢ(da, db) can be computed in time O(m²). Therefore, all Dᵢ(da, db) can be computed by dynamic programming in time O(m⁴n). The solution of Example 3.1.5 is then given by

min{ w(Dₙ(da, db)) | da ∈ D_a, db ∈ D_b, and pₙ is covered by da or db }.
3.2 Shortest Path
Consider a directed network G = (V, E) with arc weight c and a source node s. The dynamic programming approach computes d*(u), the shortest distance from s to node u, based on the recurrence

d*(s) = 0,
d*(u) = min_{v ∈ N⁻(u)} { d*(v) + c(v, u) },

where N⁻(u) denotes the set of in-neighbors of u.
The dynamic programming table has one entry per node. Thus, the product of the table size and the computation time of the recursive formula is O(n²). However, this estimation of the running time of algorithm DP1 is not correct. In fact, we also need to consider the time for finding u ∈ T such that N⁻(u) ⊆ S. This requires checking whether a set is a subset of another set. What is the running time of this computation? Roughly speaking, this may take O(n log n) time, and hence in total the running time of algorithm DP1 is O(n(n + n log n)) = O(n² log n).
Can we improve this running time by a smarter implementation? The answer is
yes. Let us do this in two steps.
First, we introduce a new value d(u) = min_{v ∈ N⁻(u) ∩ S} (d*(v) + c(v, u)) and rewrite the algorithm DP1 accordingly. In the rewritten algorithm, the updating of d(u) is performed over all edges, and each edge is used for one update. Therefore, the total update time is O(m), where m = |E| is the number of edges.
Secondly, we introduce the topological sort. The topological sort of nodes in a
digraph G = (V , E) is an ordering of the nodes such that for any arc (u, v) ∈ E, node u is positioned before node v. Please note that a topological sort exists only for directed acyclic graphs, which are exactly those networks where dynamic programming works for the shortest path problem, by Theorem 3.2.2.
There is an algorithm with running time O(m) for topological sort, shown as Algorithm 7. Actually, in Algorithm 7, line 3 takes O(n) time and line 7 takes O(m) time; hence it runs in O(m + n) time in total. However, for the shortest path problem, the input directed graph is connected when arc directions are ignored, and hence n = O(m). Therefore, O(m + n) = O(m).
An example for topological sort is shown in Fig. 3.8. In each iteration, yellow
node is the one selected from S to initiate the iteration. During the iteration, the
yellow node will be moved from S to end of L, and all arcs from the yellow node
will be deleted; meanwhile, new nodes will be added to S.
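A Python sketch of this procedure (a Kahn-style implementation maintaining the set S of nodes with no incoming arcs; names are ours, and Algorithm 7 itself is not reproduced here):

from collections import deque

def topological_sort(n, arcs):
    # Nodes are 0..n-1; arcs is a list of (u, v) pairs.
    # Returns a topological order L, or None if a cycle exists.
    indeg = [0] * n
    out = [[] for _ in range(n)]
    for u, v in arcs:
        out[u].append(v)
        indeg[v] += 1
    s = deque(v for v in range(n) if indeg[v] == 0)  # the set S of sources
    order = []
    while s:
        u = s.popleft()
        order.append(u)           # move u from S to the end of L
        for v in out[u]:          # delete arcs leaving u
            indeg[v] -= 1
            if indeg[v] == 0:
                s.append(v)
    return order if len(order) == n else None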
Now, we can first do a topological sort and then carry out dynamic programming, which results in a dynamic programming algorithm (Algorithm 8) for the shortest path problem, running in O(m) time.
An example is shown in Fig. 3.9. At the beginning, the topological sort is done in
the previous example as shown in Fig. 3.8. In Fig. 3.9, the yellow node represents the
one removed from the front of T to initiate an iteration. During the iteration, all red arcs leaving the yellow node are used to update the values of d(·); meanwhile, the yellow node is added to S, and its d*(·) value is set equal to its d(·) value.
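A Python sketch of Algorithm 8, reusing topological_sort from the previous sketch; arc weights may be negative, since the graph is assumed acyclic:

def dag_shortest_path(n, arcs, s):
    # arcs: list of (u, v, w); the graph must be acyclic.
    order = topological_sort(n, [(u, v) for u, v, w in arcs])
    if order is None:
        raise ValueError("input graph contains a cycle")
    adj = [[] for _ in range(n)]
    for u, v, w in arcs:
        adj[u].append((v, w))
    INF = float("inf")
    d = [INF] * n
    d[s] = 0
    for u in order:                    # relax arcs in topological order
        if d[u] < INF:
            for v, w in adj[u]:
                d[v] = min(d[v], d[u] + w)
    return d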
It may be worth mentioning that Algorithm 8 works for acyclic directed graphs without restriction on arc weights; i.e., arc weights can be negative. This implies that the longest path problem can be solved in O(m) time if the input graph is acyclic. For the definition of the longest path problem, please see Chap. 8.
3.3 Dijkstra Algorithm
The Dijkstra algorithm is able to find the shortest path in any directed graph with nonnegative arc weights. Its design is based on the following important discovery.
Theorem 3.3.1 Consider a directed network G = (V , E) with a source node s
and a sink node t and every arc (u, v) has a nonnegative weight c(u, v). Suppose
(S, T ) is a partition of V such that s ∈ S and t ∈ T . If d(u) = minv∈T d(v), then
d ∗ (u) = d(u).
Proof For contradiction, suppose d(u) = min_{v∈T} d(v) > d*(u). Then there exists a path p (Fig. 3.10) from s to u such that

$$\mathrm{length}(p) = d^*(u) < d(u).$$

Let w be the first node in T on path p, and let p(s, w) be the piece of path p from s to w. Then d(w) ≤ length(p(s, w)). Since all arc weights are nonnegative, we have

$$d(w) \le \mathrm{length}(p(s, w)) \le \mathrm{length}(p) = d^*(u) < d(u) = \min_{v \in T} d(v) \le d(w),$$

a contradiction.
By Theorem 3.3.1, in dynamic programming for shortest path, we may replace
N − (u) ⊆ S by d(u) = minv∈T d(v) when all arc weights are nonnegative. This
replacement results in Dijkstra algorithm.
Dijkstra Algorithm
S ← ∅;
T ← V;
d(s) ← 0 and d(v) ← ∞ for every v ∈ V \ {s};
while T ≠ ∅ do begin
find u ← argmin_{v∈T} d(v);
S ← S ∪ {u};
T ← T − {u};
d*(u) ← d(u);
for every w ∈ N⁺(u), update d(w) ← min(d(w), d*(u) + c(u, w));
end-while
output d*(t).
Although the Dijkstra algorithm with simple buckets runs faster for small c, it cannot be counted as a polynomial-time algorithm. In fact, the input size of c is log c. Therefore, we would like to select a data structure which implements the Dijkstra algorithm in polynomial time. This data structure is the priority queue.
A priority queue is a data structure for maintaining a set S of elements, each
with an associated value, called a key. All keys are stored in an array A such that an
element belongs to set S if and only if its key is in array A. There are two types of
priority queues, the min-priority queue and the max-priority queue. Since they are
similar, we introduce one of them, the min-priority queue.
A min-priority queue supports the following operations: Minimum(S), Extract-Min(S), Decrease-Key(S, x, k), and Insert(S, x).
The min-heap can be employed in the implementation of these operations.
Minimum(S) returns the element of S with the smallest key, which can be
implemented as follows.
Heap-Minimum(A)
return A[1].
Extract-Min(S) removes and returns the element of S with the smallest key,
which can be implemented by using min-heap as follows.
Heap-Extract-Min(A)
min ← A[1];
A[1] ← A[heap-size[A]];
heap-size[A] ← heap-size[A]-1;
Min-Heapify(A, 1);
return min.
Decrease-Key(S, x, k) decreases the value of element x’s key to the new value
k, which is assumed to be no more than x’s current key value. Suppose that A[i]
contains x’s key. Then, Decrease-Key(S, x, k) can be implemented as an operation
of min-heap as follows.
Heap-Decrease-Key(A, i, key)
if key > A[i]
then error “new key is larger than current key”;
A[i] ← key;
while i > 1 and A[Parent(i)] > A[i]
do exchange A[i] ↔ A[Parent(i)]
and i ← Parent(i).
Insert(S, x) inserts the element x into S, which is implemented as follows.
Insert(A, key)
heap-size[A] ← heap-size[A] + 1;
A[heap-size[A]] ← +∞;
Heap-Decrease-Key(A, heap-size[A], key).
• Use a min-priority queue to keep the set T, and for every node u ∈ T, use d(u) as the key of u.
• Use operation Extract-Min(T) to obtain u satisfying d(u) = min_{v∈T} d(v). This operation (at line 9) is used O(n) times.
• Use operation Decrease-Key(T, v, key) on each edge (u, v) to update d(v) and the min-heap. This operation (at line 14) is used O(m) times.
• Therefore, the total running time is O((m + n) log n).
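The following minimal Python sketch illustrates this implementation with the standard library heapq. Since heapq offers no Decrease-Key, the sketch pushes a fresh entry and skips stale ones on extraction, a common substitute that keeps an O((m + n) log n)-style bound; the function name and input format are assumptions of this sketch.

import heapq

def dijkstra(n, arcs, s, t):
    # arcs: list of (u, v, weight) with nonnegative weights.
    out = [[] for _ in range(n)]
    for u, v, w in arcs:
        out[u].append((v, w))
    INF = float('inf')
    d = [INF] * n
    d[s] = 0
    heap = [(0, s)]                   # the set T, keyed by d(.)
    done = [False] * n                # the set S of finished nodes
    while heap:
        du, u = heapq.heappop(heap)   # Extract-Min
        if done[u]:
            continue                  # stale entry: skip
        done[u] = True                # now d*(u) = d(u)
        if u == t:
            break
        for v, w in out[u]:
            if du + w < d[v]:
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))   # stands in for Decrease-Key
    return d[t]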
3.5 Bellman-Ford Algorithm
Bellman-Ford Algorithm
input: A directed graph G = (V, E) with weight c : E → R,
a source node s and a sink node t.
output: The minimum weight of a path from s to t,
or a message “G contains a negative weight cycle”.
d(s) ← 0;
for u ∈ V \ {s} do
d(u) ← ∞;
for i ← 1 to n − 1 do
for each arc (u, v) ∈ E do
if d(u) + c(u, v) < d(v)
then d(v) ← d(u) + c(u, v);
for each arc (u, v) ∈ E do
if d(u) + c(u, v) < d(v)
then return “G contains a negative weight cycle”;
else return d(t).
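A minimal Python transcription of this pseudocode might look as follows; the function name and the arc-list input format are our own.

def bellman_ford(n, arcs, s, t):
    # arcs: list of (u, v, weight); weights may be negative.
    INF = float('inf')
    d = [INF] * n
    d[s] = 0
    for _ in range(n - 1):             # n - 1 rounds of relaxation
        for u, v, w in arcs:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    for u, v, w in arcs:               # a further improving round
        if d[u] + w < d[v]:            # signals a negative weight cycle
            raise ValueError("G contains a negative weight cycle")
    return d[t]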
This means that $(a^{(1)}_{st})$ is the adjacency matrix of graph G. Denote

$$A(G) = (a^{(1)}_{st}).$$
We claim that

$$A(G)^k = (a^{(k)}_{st}).$$
Let us prove this claim by induction on k. Suppose it is true for k. Consider a path
from s to t with exactly k + 1 arcs. Decompose the path at a node h such that the
subpath from s to h contains exactly k arcs and (h, t) is an arc. Then the subpath
from s to h has $a^{(k)}_{sh}$ choices and (h, t) has $a^{(1)}_{ht}$ choices. Therefore,

$$a^{(k+1)}_{st} = \sum_{h \in V} a^{(k)}_{sh}\, a^{(1)}_{ht}.$$

It follows that

$$(a^{(k+1)}_{st}) = (a^{(k)}_{sh})(a^{(1)}_{ht}) = A(G)^k \cdot A(G) = A(G)^{k+1}.$$
Denote

$$L(G) = (\ell^{(1)}_{st}).$$

This is called the weighted adjacency matrix, where $\ell^{(1)}_{st}$ is 0 if s = t, the weight of arc (s, t) if it exists, and ∞ otherwise. For example, the graph in Fig. 3.14 has weighted adjacency matrix

$$\begin{pmatrix} 0 & 4 & \infty \\ \infty & 0 & 6 \\ 5 & \infty & 0 \end{pmatrix}.$$
We next establish a recursive formula for $\ell^{(k)}_{st}$.

Lemma 3.6.3

$$\ell^{(k+1)}_{st} = \min_{h \in V} \big(\ell^{(k)}_{sh} + \ell^{(1)}_{ht}\big).$$
Proof Since the shortest path from s to h with at most k arcs and the shortest path
from h to t with at most one arc form a path from s to t with at most k + 1 arcs, we
have
$$\ell^{(k+1)}_{st} \le \min_{h \in V} \big(\ell^{(k)}_{sh} + \ell^{(1)}_{ht}\big).$$

Next, we show

$$\ell^{(k+1)}_{st} \ge \min_{h \in V} \big(\ell^{(k)}_{sh} + \ell^{(1)}_{ht}\big).$$
Case 2. Every path with weight $\ell^{(k+1)}_{st}$ from s to t contains exactly k + 1 arcs. In this case, we can find a node h′ on the path such that the piece from s to h′ contains exactly k arcs and (h′, t) ∈ E. Their weights should be $\ell^{(k)}_{sh'}$ and $\ell^{(1)}_{h't}$, respectively. Therefore,

$$\ell^{(k+1)}_{st} = \ell^{(k)}_{sh'} + \ell^{(1)}_{h't} \ge \min_{h \in V} \big(\ell^{(k)}_{sh} + \ell^{(1)}_{ht}\big).$$
If G does not contain a negative weight cycle, then some shortest path from s to t contains no cycle and hence has at most n − 1 arcs. Therefore, we have

Theorem 3.6.4 If G does not have a negative weight cycle, then $\ell^{(n-1)}_{st}$ is the weight of the shortest path from s to t, where n = |V|.
This suggests a dynamic programming to solve the all-pairs shortest paths problem by using the recursive formula in Lemma 3.6.3. Since each $\ell^{(k)}_{st}$ is computed in O(n) time, this algorithm runs in O(n⁴) time to compute $\ell^{(n-1)}_{st}$ for all pairs {s, t}.
Next, we give a method to speed up this algorithm. To do so, let us define a new operation for matrices. Consider two n × n square matrices A = (a_ij) and B = (b_ij). Define

$$(A \circ B)_{ij} = \min_{1 \le h \le n} (a_{ih} + b_{hj}).$$

This operation is associative (Lemma 3.6.5), i.e.,

$$(A \circ B) \circ C = A \circ (B \circ C).$$

Define

$$A^{(k)} = \underbrace{A \circ \cdots \circ A}_{k}.$$
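Since ◦ is associative, L(G)^(m) can be computed by repeated squaring using O(log n) min-plus products, each taking O(n³) time. The following minimal Python sketch illustrates this; the function names and the use of float('inf') for missing arcs are assumptions of this sketch. By Theorem 3.6.4, overshooting m ≥ n − 1 is harmless when there is no negative weight cycle.

def min_plus(A, B):
    # (A ∘ B)[i][j] = min over h of A[i][h] + B[h][j]
    n = len(A)
    return [[min(A[i][h] + B[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]

def faster_all_pairs(L):
    # Compute L^(m) for some m >= n - 1 by repeated squaring.
    n = len(L)
    m = 1
    while m < n - 1:
        L = min_plus(L, L)   # L^(2m) = L^(m) ∘ L^(m)
        m *= 2
    return L

INF = float('inf')
# the weighted adjacency matrix of the graph in Fig. 3.14
print(faster_all_pairs([[0, 4, INF], [INF, 0, 6], [5, INF, 0]]))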
The above result is derived under the assumption that G does not have a negative weight cycle. Suppose it is unknown whether or not G has a negative weight cycle. Can we modify the faster-all-pairs-shortest-paths algorithm to find a negative weight cycle if G has one? The answer is yes. However, we need to compute L(G)^(m) for m ≥ n.

Theorem 3.6.6 G contains a negative weight cycle if and only if L(G)^(n) contains a negative diagonal entry. Moreover, if L(G)^(n) contains a negative diagonal entry, then such an entry keeps its negative sign in every L(G)^(m) for m ≥ n.
Proof It follows immediately from the fact that a simple cycle contains at most n
arcs.
Next, let us study another algorithm for the all-pairs-shortest-paths problem.
First, we show a lemma.
Lemma 3.6.7 Assume V = {1, 2, . . . , n}. Let $d^{(k)}_{ij}$ denote the weight of the shortest path from i to j with internal nodes in {1, 2, . . . , k}. Then for i ≠ j,

$$d^{(k)}_{ij} = \begin{cases} c(i, j) & \text{if } k = 0, \\ \min\big(d^{(k-1)}_{ij},\ d^{(k-1)}_{ik} + d^{(k-1)}_{kj}\big) & \text{if } 1 \le k \le n, \end{cases}$$

and $d^{(k)}_{ij} = 0$ for i = j and k ≥ 0.
Proof We need only consider i ≠ j. Let p be the shortest path from i to j with internal nodes in {1, 2, . . . , k}. For k = 0, p does not contain any internal node. Hence, its weight is c(i, j). For k ≥ 1, there are two cases (Fig. 3.16).

Case 1. p does not contain internal node k. In this case,

$$d^{(k)}_{ij} = d^{(k-1)}_{ij}.$$

Case 2. p contains internal node k. Since p does not contain a cycle, node k appears exactly once. Suppose that node k decomposes path p into two pieces p1 and p2, from i to k and from k to j, respectively. Then the weight of p1 should be $d^{(k-1)}_{ik}$, and the weight of p2 should be $d^{(k-1)}_{kj}$. Therefore, in this case, we have

$$d^{(k)}_{ij} = d^{(k-1)}_{ik} + d^{(k-1)}_{kj}.$$
Denote $D^{(k)} = (d^{(k)}_{ij})$. Based on the recursive formula in Lemma 3.6.7, we obtain a dynamic programming as shown in Algorithm 10, which is called the Floyd-Warshall algorithm.
From the algorithm description, we can see the following.

Theorem 3.6.8 If G does not contain a negative weight cycle, then the Floyd-Warshall algorithm computes all-pairs shortest paths in O(n³) time.
If G contains a negative weight cycle, could Floyd-Warshall algorithm tell us this
fact? The answer is yes. Actually, we also have
Theorem 3.6.9 G contains a negative weight cycle if and only if $D^{(n)}$ contains a negative diagonal element.
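A minimal Python sketch of the Floyd-Warshall recursion with this diagonal test follows; the 0-based node indexing and the function name are assumptions of this sketch.

def floyd_warshall(c):
    # c: n x n matrix with c[i][j] the arc weight (float('inf') if no
    # arc, 0 on the diagonal); nodes indexed 0..n-1. Returns D^(n).
    n = len(c)
    d = [row[:] for row in c]                  # D^(0)
    for k in range(n):                         # allow node k as internal
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    for i in range(n):                         # the test of Theorem 3.6.9
        if d[i][i] < 0:
            raise ValueError("G contains a negative weight cycle")
    return d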
Exercises
5. A stair is a rectilinear polygon as shown in Fig. 3.17. Show that a minimum length rectangular partition of a given stair can be computed by dynamic programming in O(n² log n) time.
6. Given a hole-free rectilinear polygon, design a dynamic programming to partition it into small rectangles with the minimum total length of cuts.
7. Consider a horizontal line. There are n points lying below the line and m unit
disks with centers above the line. Every one of the n points is covered by some
unit disk. Each unit disk has a weight. Design a dynamic programming to find
a subset of unit disks covering all n points, with the minimum total weight. The
dynamic programming should run in polynomial time with respect to m and n.
8. Given a convex polygon in the Euclidean plane, partition it into triangles with the minimum total length of cuts. Design a dynamic programming to solve this problem in O(n³) time, where n is the number of vertices of the input polygon.
9. Does Dijkstra’s algorithm for shortest path work for input with negative weight
and without negative weight cycle? If yes, please give a proof. If not, please
give a counterexample and a way to modify the algorithm to work for input
with negative weight and without negative weight cycle.
10. Given a directed graph G = (V , E) and a positive integer k, count the number
of paths with at most k arcs from s to t for all pairs of nodes s and t.
11. Given a graph G = (V , E) and a positive integer k, count the number of paths
with at most k edges from s to t for all pairs of nodes s and t.
12. Given a directed graph G = (V , E) without loop, and a positive integer k,
count the number of paths with at most k arcs from s to t for all pairs of nodes
s and t.
13. Show Lemma 3.6.5, that is, A ◦ (B ◦ C) = (A ◦ B) ◦ C.
14. Does FASTER-ALL-PAIR-SHORTEST-PATH algorithm work for input with
negative weight and without negative weight cycle? If yes, please give a proof.
If not, please give a counterexample.
15. Does Floyd-Warshall algorithm work for input with negative weight and
without negative weight cycle? If yes, please give a proof. If not, please give a
counterexample.
$$\max\ c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$$
$$\text{subject to } a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \le S,$$
$$x_1, x_2, \ldots, x_n \in \{0, 1\},$$

$$c(i, i) = 0,$$
$$c(i, j) = w(i, j) + \min_{i < k \le j} \big(c(i, k - 1) + c(k, j)\big),$$

and monotone, origin (0, 0), in which every arc can have its direction either going up or to the right.
32. Consider three strings X, Y, Z over an alphabet Σ. Z is said to be a shuffle of X and Y if Z = X1Y1X2Y2 · · · XnYn, where X1, X2, . . . , Xn are substrings of X such that X = X1X2 · · · Xn and Y1, Y2, . . . , Yn are substrings of Y such that Y = Y1Y2 · · · Yn. Design a dynamic programming algorithm to determine whether Z is a shuffle of X and Y for given X, Y, and Z.
33. Consider a tree T = (V , E) with arbitrary integer weights w : E → Z.
Design an algorithm to compute the diameter of T , i.e., the maximum weight
of a simple path in T .
34. Let G = (V , E) be a planar graph lying in the Euclidean plane. The weight
of any edge (u, v) is the Euclidean distance between nodes u and v, denoted
by L(u, v). For any two nodes x and y, denote by d(x, y) the total weight of
shortest path between x and y. If there is no path between x and y, then define
d(x, y) = ∞. The stretch factor is defined to be the smallest upper bound for
ratio d(x, y)/L(x, y) for any two distinct nodes x, y ∈ V . Design an efficient
algorithm to find the stretch factor for given graph G.
35. Given a set of n integers a1 , a2 , . . . , an and a target integer T , design a dynamic
programming algorithm to determine whether there exists a subset of given
integers such that their sum is equal to T . Your algorithm should run in O(nT )
time.
36. (Sensor Barrier Cover) Consider a rectangle R. Randomly deploy a set of
sensors. Each sensor can monitor an area, called sensing area. Suppose that
the sensing area of every sensor is a unit disk. A subset of sensors is called a
barrier cover if their sensing areas cover a curve connecting two vertical edges
of R. Given a set of sensors, find the minimum barrier cover. Please formulate
this problem into a shortest path problem.
37. (Influence Maximization) A social network is a directed graph G = (V , E)
with an information diffusion model m. Suppose m is the linear threshold (LT)
model as follows: Each arc (u, v) is associated with a weight wuv ∈ [0, 1]
such that for any node v, the total weight of arcs coming to v is at most one.
Each node has two possible states, active and inactive. Initially, all nodes are inactive. To start an information diffusion process, we select a few nodes, called seeds, to activate, and select, for each node v, a threshold θv uniformly at random from [0, 1]. Then, step by step, more nodes will be activated. In each step, every inactive node v is evaluated for the total weight of its incoming arcs from active nodes. If this is less than its threshold θv, then v is kept inactive; otherwise, v is activated. This process ends at the step in which no new node can be activated.
node can be activated. Given a positive integer k, select k seeds to maximize
the expected number of active nodes (including themselves) at the end of the
process. This problem is called the influence maximization. Suppose G is an
in-arborescence. Design a dynamic programming algorithm for the influence
maximization problem. Could your algorithm run in a polynomial-time with
respect to |V |?
Historical Notes
Dynamic programming was proposed by Richard Bellman in 1953 [95] and later became a popular method in optimization and control theory. The basic idea stems from self-reducibility. In the design of computer algorithms, it is a powerful and elegant technique for finding efficient solutions to many optimization problems, and it has attracted a great deal of research effort, especially in the direction of speeding up dynamic programming. For example, Yao [434] and Borchers and Gupta [35] sped up dynamic programming with the quadrangle inequality, improving a construction for the rectilinear Steiner arborescence [345] from O(n³) time to O(n²) time.
The shortest path problem became a classical graph problem as early as 1873 [407]. A. Schrijver [361] provides a quite detailed historical note with a large list of references. There are many algorithms in the literature. Those closely related to dynamic programming can be found in Bellman [24], Dijkstra [83], Dial [82], and Fredman and Tarjan [149].

The all-pairs shortest paths problem was first studied by Alfonso Shimbel in 1953 [366], who gave an O(n⁴)-time solution. Floyd [144] and Warshall [402] found an O(n³)-time solution independently in the same year.
Examples and exercises about disk (or sensor) coverage can be found in [4,
160, 219, 296, 421], and those about social influence can be found in [393] for the
influence maximization, [271] for the effector detection, and [427] for the active
friending. Extended reading materials can be found in [188, 304, 379, 380, 435,
445].
Chapter 4
Greedy Algorithm and Spanning Tree
A problem that the greedy algorithm works for computing optimal solutions often
has the self-reducibility and a simple exchange property. Let us use two examples
to explain this point.
Example 4.1.1 (Activity Selection) Consider n activities with starting times
s1 , s2 , . . . , sn and ending times f1 , f2 , . . . , fn , respectively. They may be
represented by intervals [s1 , f1 ), [s2 , f2 ), . . ., and [sn , fn ). The problem is to
find a maximum subset of nonoverlapping activities, i.e., nonoverlapping intervals.
This problem has the following exchange property.
Lemma 4.1.2 (Exchange Property) Suppose f1 ≤ f2 ≤ · · · ≤ fn. In a maximum solution without interval [s1, f1), we can always exchange [s1, f1) with the first activity in the maximum solution, preserving the maximality.

Proof Let [si, fi) be the first activity in the maximum solution mentioned in the lemma. Since f1 ≤ fi, replacing [si, fi) by [s1, f1) will not create any overlap.
Theorem 4.1.4 Algorithm 11 produces an optimal solution for the activity selection
problem.
Proof Let us prove it by induction on n. For n = 1, it is trivial.
Consider n ≥ 2. Suppose {I1∗ , I2∗ , . . . , Ik∗ } is an optimal solution. By
Lemma 4.1.2, we may assume that I1∗ = [s1 , f1 ). By Lemma 4.1.3, {I2∗ , . . . , Ik∗ } is
an optimal solution for the activity selection problem on input {Ii | Ii ∩ I1∗ = ∅}.
Note that after selecting [s1, f1), if we ignore all iterations i with [si, fi) ∩ [s1, f1) ≠ ∅, then the remaining part is the same as the greedy algorithm running on input {Ii | Ii ∩ I1* = ∅}. By the induction hypothesis, it produces an optimal solution for the activity selection problem on input {Ii | Ii ∩ I1* = ∅}, which must contain k − 1 activities. Together with [s1, f1), they form a subset of k nonoverlapping activities, which must be optimal.
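As an illustration, here is a minimal Python sketch of this greedy algorithm in the spirit of Algorithm 11 (whose listing is not reproduced here); the function name and input format are our own.

def select_activities(intervals):
    # intervals: list of pairs (s_i, f_i) for activities [s_i, f_i).
    # Returns a maximum subset of pairwise nonoverlapping activities.
    chosen = []
    last_finish = float('-inf')
    for s, f in sorted(intervals, key=lambda a: a[1]):  # f_1 <= f_2 <= ...
        if s >= last_finish:       # [s, f) does not overlap the last choice
            chosen.append((s, f))
            last_finish = f
    return chosen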
Next, we study another example.
Example 4.1.5 (Huffman Tree) Given n characters a1, a2, . . . , an with weights f1, f2, . . . , fn, respectively, find a binary tree with n leaves labeled by a1, a2, . . . , an, respectively, to minimize

$$\sum_{i=1}^{n} f_i \cdot d(a_i),$$

where d(ai) is the depth of leaf ai, i.e., the number of edges on the path from the root to ai.
First, we show a property of optimal solutions.
Lemma 4.1.6 In any optimal solution, every internal node has two children, i.e.,
every optimal binary tree is full.
Proof If an internal node has only one child, then this internal node can be removed
to reduce the objective function value.
We can also show an exchange property and a self-reducibility.
Lemma 4.1.7 (Exchange Property) If fi > fj and d(ai) > d(aj), then exchanging ai with aj makes the objective function value decrease.

Proof Let d′(ai) and d′(aj) be the depths of ai and aj, respectively, after exchanging ai with aj. Then d′(ai) = d(aj) and d′(aj) = d(ai). Therefore, the difference of objective function values before and after the exchange is

$$\big(f_i d(a_i) + f_j d(a_j)\big) - \big(f_i d(a_j) + f_j d(a_i)\big) = (f_i - f_j)\big(d(a_i) - d(a_j)\big) > 0.$$

For a binary tree T, denote

$$c(T) = \sum_{a \text{ over leaves of } T} d(a) \cdot f(a),$$

where d(a) is the depth of leaf a and f(a) is the weight of leaf a. Then we have
the following self-reducibility. Let T* be an optimal tree and Tu a subtree of T* rooted at a node u. If Tu is not optimal for the weights at the leaves of Tu, then we have a binary tree Tu′ for those weights with c(Tu′) < c(Tu). Therefore, c((T* \ Tu) ∪ Tu′) < c(T*), contradicting the optimality of T*.
By Lemmas 4.1.7 and 4.1.8, we can construct an optimal Huffman tree as follows:
• Sort f1 ≤ f2 ≤ · · · ≤ fn.
• By the exchange property, there must exist an optimal tree in which a1 and a2 are siblings at the bottom level.
• By the self-reducibility, the problem is reduced to constructing an optimal tree for leaf weights {f1 + f2, f3, . . . , fn}.
• Go back to the initial sorting step. This process continues until only two weights exist.
In Fig. 4.2, an example is presented to explain this construction. The construction can be implemented with a min-priority queue (Algorithm 12).
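A minimal Python sketch of this construction with a min-priority queue, in the spirit of Algorithm 12, follows; it returns only the optimal objective value, and the function name and input format are our own. It relies on the fact that merging two subtrees adds their total weight to the objective, since every leaf below them gains one level of depth.

import heapq

def huffman_cost(weights):
    # weights: leaf weights f_1, ..., f_n with n >= 2.
    # Returns the optimal value of sum f_i * d(a_i).
    heap = list(weights)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        f1 = heapq.heappop(heap)     # the two smallest weights
        f2 = heapq.heappop(heap)     # become siblings at the bottom
        cost += f1 + f2
        heapq.heappush(heap, f1 + f2)
    return cost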
The Huffman tree problem is raised from the study of Huffman codes as follows.
Problem 4.1.9 (Huffman Codes) Given n characters a1, a2, . . . , an with frequencies f1, f2, . . . , fn, respectively, find prefix binary codes c1, c2, . . . , cn to minimize the total code length

$$\sum_{i=1}^{n} f_i \cdot |c_i|,$$

where |ci| is the length of code ci.
4.2 Matroid

Consider a finite set S and a collection C of subsets of S. (S, C) is called an independent system if

$$A \subset B,\ B \in C \Rightarrow A \in C,$$

i.e., it is hereditary. In the independent system (S, C), each subset in C is called an independent set.
Consider a maximization problem as follows.
Problem 4.2.1 (Independent Set Maximization) Let c be a nonnegative cost function on S. Denote $c(A) = \sum_{x \in A} c(x)$ for any A ⊆ S. The problem is to maximize c(A) subject to A ∈ C.
Also, consider the greedy algorithm in Algorithm 13.
For any F ⊆ S, a subset I of F is called a maximal independent subset of F if I ∈ C and no independent subset of F contains I as a proper subset. Define u(F) to be the maximum cardinality and v(F) the minimum cardinality of a maximal independent subset of F. Then, for an optimal solution A* and the solution A_G produced by Algorithm 13,

$$1 \le \frac{c(A^*)}{c(A_G)} \le \max_{F \subseteq S} \frac{u(F)}{v(F)}.$$
Proof Suppose S = {x1, x2, . . . , xn} with c(x1) ≥ c(x2) ≥ · · · ≥ c(xn), and denote Si = {x1, . . . , xi}. Then

$$c(A_G) = c(x_1)|S_1 \cap A_G| + \sum_{i=2}^{n} c(x_i)\big(|S_i \cap A_G| - |S_{i-1} \cap A_G|\big) = \sum_{i=1}^{n-1} |S_i \cap A_G|\big(c(x_i) - c(x_{i+1})\big) + |S_n \cap A_G|\, c(x_n).$$

Similarly,

$$c(A^*) = \sum_{i=1}^{n-1} |S_i \cap A^*|\big(c(x_i) - c(x_{i+1})\big) + |S_n \cap A^*|\, c(x_n).$$

Thus,

$$\frac{c(A^*)}{c(A_G)} \le \max_{1 \le i \le n} \frac{|A^* \cap S_i|}{|A_G \cap S_i|}.$$

Note that Si ∩ A_G is a maximal independent subset of Si, so

$$|S_i \cap A_G| \ge v(S_i),$$

while Si ∩ A* is an independent subset of Si, so

$$|S_i \cap A^*| \le u(S_i).$$

Therefore,

$$\frac{c(A^*)}{c(A_G)} \le \max_{F \subseteq S} \frac{u(F)}{v(F)}.$$
Theorem 4.2.4 An independent system (S, C) is a matroid if and only if for any
cost function c(·), Algorithm 13 gives a maximum solution.
Proof For necessity, we note that when (S, C) is a matroid, we have u(F) = v(F) for any F ⊆ S. Therefore, Algorithm 13 gives an optimal solution.
For sufficiency, we give a contradiction argument. To this end, suppose indepen-
dent system (S, C) is not a matroid. Then, there exists F ⊆ S such that F has two
maximal independent sets I and J with |I | < |J |. Define
$$c(e) = \begin{cases} 1 + \varepsilon & \text{if } e \in I, \\ 1 & \text{if } e \in J \setminus I, \\ 0 & \text{otherwise,} \end{cases}$$
where ε is a sufficiently small positive number such that c(I) < c(J). The greedy algorithm will produce I, which is not optimal.
This theorem establishes a tight relationship between matroids and greedy algorithms, built on the class of all nonnegative objective functions. It may be worth mentioning that the fact that the greedy algorithm reaches an optimal solution for some particular class of objective functions may not provide any additional information about the independent system. The following is a counterexample.
Example 4.2.5 Consider a complete bipartite graph G = (V1 , V2 , E) with |V1 | =
|V2 |. Let I be the family of all matchings. Clearly, (E, I) is an independent system.
However, it is not a matroid. An interesting fact is that maximal matchings may have
different cardinalities for some subgraph of G although all maximal matchings for
G have the same cardinality.
Furthermore, consider the problem max{c(I) | I ∈ I}, called the maximum assignment problem.
If c(·) is a nonnegative function such that for any u, u′ ∈ V1 and v, v′ ∈ V2,

this means that replacing edges (u1, v′) and (u′, v1) in M* by (u1, v1) and (u′, v′) will not decrease the total cost of the matching. Similarly, we can put all (ui, vi) into an optimal solution; that is, they form an optimal solution. This gives an exchange property. Actually, we can design a greedy algorithm to solve the maximum assignment problem. (We leave this as an exercise.)
Next, let us present some examples of the matroid.
Example 4.2.6 (Linear Vector Space) Let S be a finite set of vectors and I the
family of linearly independent subsets of S. Then (S, I) is a matroid.
In a matroid, all maximal independent subsets have the same cardinality. They
are also called bases. In a graph matroid obtained from a connected graph, every
base is a spanning tree.
Let B be the family of all bases of a matroid (S, C). Consider the following
problem:
Problem 4.2.8 (Base Cost Minimization) Consider a matroid (S, C) with base
family B and a nonnegative cost function on S. The problem is to minimize c(B)
subject to B ∈ B.
Theorem 4.2.9 An optimal solution of the base cost minimization can be computed
by Algorithm 14, a variation of Algorithm 13.
Proof Suppose that every base has cardinality m. Let M be a positive number such that c(e) < M for every e ∈ S. Define c′(e) = M − c(e) for all e ∈ S. Then c′(·) is a positive function on S, and the non-decreasing ordering with respect to c(·) is the non-increasing ordering with respect to c′(·). Note that c′(B) = mM − c(B) for any B ∈ B. Since Algorithm 13 produces a base with maximum value of c′, Algorithm 14 produces a base with minimum value of the function c.
The correctness of greedy algorithm for the minimum spanning tree can also be
obtained from this theorem.
Next, consider the following problem.
Problem 4.2.10 (Unit-Time Task Scheduling) Consider a set of n unit-time tasks,
S = {1, 2, . . . , n}. Each task i can be processed during a unit-time and has to be
completed before an integer deadline di and, if not completed, will receive a penalty
wi . The problem is to find a schedule for S on a machine within time n to minimize
total penalty.
A set of tasks is independent if there exists a schedule for these tasks without
penalty. Then we have the following.
Lemma 4.2.11 A set A of tasks is independent if and only if for any t = 1, 2, . . . , n,
Nt (A) ≤ t where Nt (A) = |{i ∈ A | di ≤ t}|.
Proof It is trivial for “only if” part. For the “if” part, note that if the condition
holds, then tasks in A can be scheduled in order of nondecreasing deadlines without
penalty.
Example 4.2.12 Let S be a set of unit-time tasks with deadlines and penalties and C
the collection of all independent subsets of S. Then, (S, C) is a matroid. Therefore,
an optimal solution for the unit-time task scheduling problem can be computed by
a greedy algorithm (i.e., Algorithm 13).
Proof (Hereditary) Trivial.
(Augmentation) Consider two independent sets A and B with |A| < |B|. Let k be the largest t such that Nt(A) ≥ Nt(B). (A few examples are presented in Fig. 4.5 to explain the definition of k.) Then k < n and Nt(A) < Nt(B) for k + 1 ≤ t ≤ n. Choose x ∈ {i ∈ B \ A | di = k + 1}. Then

$$N_t(A \cup \{x\}) = N_t(A) \le t \quad \text{for } 1 \le t \le k,$$

and

$$N_t(A \cup \{x\}) = N_t(A) + 1 \le N_t(B) \le t \quad \text{for } k + 1 \le t \le n.$$

By Lemma 4.2.11, A ∪ {x} is independent.
Example 4.2.13 Consider an independent system (S, C). For any fixed A ⊆ S,
define
CA = {B ⊆ S | A ⊆ B}.
Prim Algorithm
input: A graph G = (V, E) with nonnegative edge weight c : E → R+.
output: A spanning tree T.
U ← {s} for some s ∈ V;
T ← ∅;
while U ≠ V do
find the minimum weight edge (u, v) in cut (U, V \ U) with u ∈ U
and T ← T ∪ {(u, v)}, U ← U ∪ {v};
return T.
An example for using Prim algorithm is shown in Fig. 4.7. The construction starts
at node 1 and guarantees that the cut optimality conditions are satisfied at the end.
The min-priority queue can be used for implementing the Prim algorithm to obtain the following result.

Theorem 4.3.2 The Prim algorithm can construct a minimum spanning tree in O(m log m) time, where m is the number of edges in the input graph.

Proof The Prim algorithm can be implemented by using a min-priority queue in the following way:
• Store all edges in the current cut (U, W) in the min-priority queue S.
• At each iteration, choose the minimum weight edge (u, v) in the cut (U, W) by using operation Extract-Min(S), where u ∈ U and v ∈ W.
• For every edge (x, v) with x ∈ U, delete (x, v) from S. This needs a deletion operation on the min-priority queue, which runs in O(log m) time per edge.
• Add v to U.
• For every edge (v, y) with y ∈ V \ U, insert (v, y) into the priority queue. This also requires O(log m) time per edge.
In this implementation, the Prim algorithm runs in O(m log m) time.
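The following minimal Python sketch illustrates this implementation; instead of explicitly deleting edges that fall out of the cut, it discards them lazily upon extraction, a common substitute with the same O(m log m) bound. The function name and input format are our own.

import heapq

def prim(n, edges, s=0):
    # edges: list of (u, v, weight) for a connected undirected graph.
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))
    in_U = [False] * n
    in_U[s] = True
    heap = list(adj[s])              # all edges in the cut (U, V \ U)
    heapq.heapify(heap)
    tree = []
    while heap:
        w, u, v = heapq.heappop(heap)    # Extract-Min over the cut
        if in_U[v]:
            continue                     # no longer a cut edge: skip
        in_U[v] = True
        tree.append((u, v, w))
        for e in adj[v]:                 # new edges entering the cut
            if not in_U[e[2]]:
                heapq.heappush(heap, e)
    return tree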
Prim algorithm can be considered as a local-information greedy algorithm.
Actually, its correctness can also be established by an exchange property and a self-
reducibility as follows.
Lemma 4.3.3 (Exchange Property) Consider a cut (U, W) in a graph G = (V, E). Suppose edge e has the smallest weight in the cut (U, W). If a minimum spanning tree T does not contain e, then there must exist an edge e′ in T such that (T \ {e′}) ∪ {e} is still a minimum spanning tree.
The local ratio method is also a type of algorithm with self-reducibility. Its basic
idea is as follows.
Lemma 4.4.1 Let c(x) = c1(x) + c2(x). Suppose x* is an optimal solution of both min_{x∈Ω} c1(x) and min_{x∈Ω} c2(x). Then x* is an optimal solution of min_{x∈Ω} c(x). A similar statement holds for maximization problems.

Proof For any x ∈ Ω, c1(x) ≥ c1(x*) and c2(x) ≥ c2(x*), and hence c(x) ≥ c(x*).
Usually, the objective function c(x) is decomposed into c1(x) and c2(x) such that the optimal solutions of min_{x∈Ω} c1(x) constitute a big pool, so that the problem is reduced to finding an optimal solution of min_{x∈Ω} c2(x) within the pool. In this section, we present two examples to explain this idea.
First, we study the following problem.
Problem 4.4.2 (Weighted Activity Selection) Given n activities each with a time
period [si , fi ) and a positive weight wi , find a nonoverlapping subset of activities to
maximize the total weight.
Suppose, without loss of generality, f1 ≤ f2 ≤ · · · ≤ fn. First, we consider a special case: for every activity [si, fi), if si < f1, i.e., activity [si, fi) overlaps with activity [s1, f1), then wi = w1 > 0, and if si ≥ f1, then wi = 0. In this case, every feasible solution containing an activity overlapping with [s1, f1) is an optimal solution. Motivated by this special case, we may decompose the problem into two subproblems. The first one is in the special case, and the second one has weights as follows:

$$w'_i = \begin{cases} w_i - w_1 & \text{if } s_i < f_1, \\ w_i & \text{otherwise.} \end{cases}$$
In the second subproblem obtained from the decomposition, some activity may
have non-positive weight. Such an activity can be removed from our consideration
because putting it in any feasible solution would not increase the total weight. This
operation would simplify the problem by removing at least one activity. Repeat the
decomposition and simplification until no activity is left.
To explain how to obtain an optimal solution, let A′ be the set of remaining activities after the first decomposition and simplification, and let Opt′ be an optimal solution for the weighted activity selection problem on A′. Since the simplification does not affect the objective function value of an optimal solution, Opt′ is an optimal solution of the second subproblem in the decomposition. If Opt′ contains an activity overlapping with activity [s1, f1), then Opt′ is also an optimal solution of the first subproblem, and hence by Lemma 4.4.1, Opt′ is an optimal solution for the weighted activity selection problem on the original input A. If Opt′ does not contain an activity overlapping with [s1, f1), then Opt′ ∪ {[s1, f1)} is an optimal solution for both the first subproblem and the second subproblem, and hence also an optimal solution for the original problem.
Based on the above analysis, we may construct the following algorithm.
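Since the algorithm listing itself does not appear here, the following minimal Python sketch follows the decomposition and simplification steps described above; the function name, input format, and the O(n²)-time recursive organization are our own.

def weighted_activity_selection(acts):
    # acts: list of (s_i, f_i, w_i) with w_i > 0.
    # Returns an optimal nonoverlapping subset as (s_i, f_i) pairs.
    acts = sorted(acts, key=lambda a: a[1])        # f_1 <= ... <= f_n

    def solve(rest):
        rest = [a for a in rest if a[2] > 0]       # simplification step
        if not rest:
            return []
        s1, f1, w1 = rest[0]
        # decomposition: subtract w1 from every activity overlapping
        # [s1, f1); the first activity itself gets weight 0 and is
        # removed by the next simplification
        reduced = [(s, f, w - w1) if s < f1 else (s, f, w)
                   for s, f, w in rest]
        opt = solve(reduced)
        if all(s >= f1 for s, f in opt):   # Opt' misses all overlaps of [s1, f1)
            opt = [(s1, f1)] + opt         # then adding [s1, f1) stays optimal
        return opt

    return solve(acts)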
Proof Note that the number of arcs in T is equal to |V| − 1. Thus, condition (b) implies the connectivity of T when directions are ignored, which implies condition (a). Therefore, if T is not an arborescence, then condition (b) does not hold, i.e., there exists v ∈ V − {r} such that there is no directed path from r to v. Now, T contains an arc (v1, v) coming to v with v1 ≠ r, an arc (v2, v1) coming to v1 with v2 ≠ v, and so on. Since the directed graph G is finite, the sequence (v, v1, v2, . . .) must contain a cycle.

Conversely, if T contains a cycle, then T is not an arborescence by definition. This completes the proof of the lemma.
Now, we consider the minimum arborescence problem.
Problem 4.4.4 (Minimum Arborescence) Given a directed graph G = (V , E)
with positive arc weight w : E → R + and a vertex r ∈ V , compute an arborescence
with root r to minimize total arc weight.
The following special case gives a basic idea for a local ratio method.
Lemma 4.4.5 Suppose for each vertex v ∈ V − {r} all arcs coming to v have
the same weight. Then every arborescence with root r is optimal for the minimum arborescence problem.
Proof It follows immediately from the fact that each arborescence contains exactly
one arc coming to v for each vertex v ∈ V − {r}.
Exercises
1. Suppose that for every cut of the graph, there is a unique light edge crossing the cut. Show that the graph has a unique minimum spanning tree. Does the converse hold? If not, please give a counterexample.
2. Consider a finite set S. Let Ik be the collection of all subsets of S with size at
most k. Show that (S, Ik ) is a matroid.
3. Solve the following instance of the unit-time task scheduling problem.
ai 1 2 3 4 5 6 7
di 4 2 4 3 1 4 6
wi 70 60 50 40 30 20 10
That is, the maximal independent sets of (S, I′) are just the complements of the maximal independent sets of (S, I).
6. Suppose that a set of activities needs to be scheduled in a large number of lecture halls. We wish to schedule all the activities using as few lecture halls as possible. Give an efficient greedy algorithm to determine which activity should use which lecture hall.
7. Consider a set of n files, f1 , f2 , . . . , fn , of distinct sizes m1 , m2 , . . . , mn ,
respectively. They are required to be recorded sequentially on a single tape, in
some order, and retrieve each file exactly once, in the reverse order. The retrieval
of a file involves rewinding the tape to the beginning and then scanning the files
sequentially until the desired file is reached. The cost of retrieving a file is the
sum of the sizes of the files scanned plus the size of the file retrieved. (Ignore
the cost of rewinding the tape.) The total cost of retrieving all the files is the
sum of the individual costs.
(a) Suppose that the files are stored in some order fi1, fi2, . . . , fin. Derive a formula for the total cost of retrieving the files, as a function of n and the mik's.
(b) Describe a greedy strategy to order the files on the tape so that the total cost is minimized, and prove that this strategy is indeed optimal.
8. In merge sort, the merge procedure is able to merge two sorted lists of lengths
n1 and n2 , respectively, into one by using n1 + n2 comparisons. Given m sorted
lists, we can select two of them and merge these two lists into one. We can then
select two lists from the m − 1 sorted lists and merge them into one. Repeating
this step, we shall eventually end up with one merged list. Describe a general
algorithm for determining an order in which m sorted lists A1 , A2 , . . . , Am are
to be merged so that the total number of comparisons is minimum. Prove that
your algorithm is correct.
9. Let G = (V , E) be a connected undirected graph. The distance between two
vertices x and y, denoted by d(x, y), is the number of edges on the shortest
path between x and y. The diameter of G is the maximum of d(x, y) over all
pairs (x, y) in V × V . In the remainder of this problem, assume that G has at
least two vertices.
Consider the following algorithm on G: Initially, choose arbitrarily x0 ∈
V . Repeatedly, choose xi+1 such that d(xi+1 , xi ) = maxv∈V d(v, xi ) until
d(xi+1 , xi ) = d(xi , xi−1 ).
Can this algorithm always terminate? When it terminates, is d(xi+1 , xi )
guaranteed to equal the diameter of G? (Prove or disprove your answer.)
10. Consider a graph G = (V , E) with positive edge weight c : E → R + . Show
that for any spanning tree T and the minimum spanning tree T ∗ , there exists
a one-to-one onto mapping ρ : E(T ) → E(T ∗ ) such that c(ρ(e)) ≤ c(e) for
every e ∈ E(T ) where E(T ) denotes the edge set of T .
11. Consider a point set P in the Euclidean plane. Let R be a fixed positive number.
A steinerized spanning tree on P is a tree obtained from a spanning tree on P
by putting some Steiner points on its edges to break them into pieces each of
length at most R. Show that the steinerized spanning tree with the minimum number of Steiner points is obtained from the minimum spanning tree.
12. Consider a graph G = (V, E) with edge weight w : E → R+. Show that the spanning tree T which minimizes $\sum_{e \in E(T)} w(e)^\alpha$ for any fixed α > 1 is the minimum spanning tree, i.e., the one which minimizes $\sum_{e \in E(T)} w(e)$.
13. Let B be the family of all maximal independent subsets of an independent
system (E, I). Then (E, I) is a matroid if and only if for any nonnegative
function c(·), Algorithm 14 produces an optimal solution for the problem
min{c(I ) | I ∈ B}.
14. Consider a complete bipartite graph G = (U, V, E) with |U| = |V|. Let c(·) be a nonnegative function on E such that for any u, u′ ∈ U and v, v′ ∈ V,
Historical Notes

Chapter 5
Incremental Method

In this chapter, we study the incremental method, which is very different from the methods in the previous chapters. This method does not use self-reducibility. It starts from a feasible solution, and in each iteration, the computation moves from one feasible solution to another by improving the objective function value. The incremental method has been used in the study of many problems, especially in the study of network flow.
(c) $\sum_{f(s,v)>0} f(s, v) - \sum_{f(u,s)>0} f(u, s) = \sum_{f(u,t)>0} f(u, t) - \sum_{f(t,v)>0} f(t, v)$.
Proof
(a) By the capacity constraint, f(u, v) ≤ c(u, v) = 0 and f(v, u) ≤ c(v, u) = 0. By skew symmetry, f(u, v) = −f(v, u) ≥ 0. Hence, f(u, v) = 0.
(b) By flow conservation, for any x ∈ V \ {s, t},

$$\sum_{f(x,u)<0} f(x, u) + \sum_{f(x,v)>0} f(x, v) = \sum_{v \in V} f(x, v) = 0.$$

By skew symmetry,

$$\sum_{f(u,x)>0} f(u, x) = -\sum_{f(x,u)<0} f(x, u) = \sum_{f(x,v)>0} f(x, v).$$

For (y, z) ∈ E with y, z ∈ V \ {s, t}, if f(y, z) > 0, then f(y, z) appears in both the left-hand and the right-hand sides, and hence it will be cancelled. After cancellation, we obtain

$$\sum_{f(s,v)>0} f(s, v) + \sum_{f(t,v)>0} f(t, v) = \sum_{f(u,s)>0} f(u, s) + \sum_{f(u,t)>0} f(u, t).$$

In case the source s does not have an incoming arc, we have

$$|f| = \sum_{f(s,v)>0} f(s, v).$$
In Fig. 5.2, arc labels with underline give a flow. This flow has value 11.
The maximum flow problem is as follows.
Problem 5.1.2 (Maximum Flow) Given a flow network G = (V , E) with arc
capacity c : V × V → R+ , a source s, and a sink t, find a flow f with maximum
flow value. Usually, assume that s does not have incoming arc and t does not have
outgoing arc.
An important tool for the study of the maximum flow problem is the residual network. The residual network for a flow f in a network G = (V, E) with capacity c is the flow network Gf = (V, E′) with capacity c′(u, v) = c(u, v) − f(u, v) for any u, v ∈ V, where E′ = {(u, v) ∈ V × V | c′(u, v) > 0}. For example, the flow in Fig. 5.2 has its residual network as shown in Fig. 5.3. Two important properties of the residual network are included in the following lemmas.
Lemma 5.1.3 Suppose f′ is a flow in the residual network Gf. Then f + f′ is a flow in network G and |f + f′| = |f| + |f′|.

Proof For any u, v ∈ V, since f′(u, v) ≤ c′(u, v) = c(u, v) − f(u, v), we have f(u, v) + f′(u, v) ≤ c(u, v); that is, f + f′ satisfies the capacity constraint. Moreover, f(u, v) + f′(u, v) = −f(v, u) − f′(v, u) = −(f(v, u) + f′(v, u)), and for every u ∈ V \ {s, t},

$$\sum_{v \in V \setminus \{u\}} (f + f')(u, v) = \sum_{v \in V \setminus \{u\}} f(u, v) + \sum_{v \in V \setminus \{u\}} f'(u, v) = 0.$$

This means that f + f′ satisfies the skew symmetry and the flow conservation conditions. Therefore, f + f′ is a flow. Finally,

$$|f + f'| = \sum_{v \in V \setminus \{s\}} (f + f')(s, v) = \sum_{v \in V \setminus \{s\}} f(s, v) + \sum_{v \in V \setminus \{s\}} f'(s, v) = |f| + |f'|.$$
The answer for the second question is positive. Actually, we have the following.
Theorem 5.1.6 A flow f is maximum if and only if its residual network Gf does
not contain a path from source s to sink t.
To prove this theorem, let us first show a lemma.
A partition (S, T) of V is called an s-t cut if s ∈ S and t ∈ T. The capacity of an s-t cut is defined by

$$\mathrm{CAP}(S, T) = \sum_{u \in S,\, v \in T} c(u, v).$$

Lemma 5.1.7 Let (S, T) be an s-t cut. Then for any flow f,

$$|f| = \sum_{\substack{u \in S,\, v \in T \\ f(u,v)>0}} f(u, v) - \sum_{\substack{u \in S,\, v \in T \\ f(v,u)>0}} f(v, u) \le \mathrm{CAP}(S, T).$$
Thus,

$$\sum_{f(s,x)>0} f(s, x) + \sum_{\substack{u \in T,\, x \in S \\ f(u,x)>0}} f(u, x) = \sum_{f(x,s)>0} f(x, s) + \sum_{\substack{v \in T,\, x \in S \\ f(x,v)>0}} f(x, v),$$

that is,

$$|f| = \sum_{f(s,x)>0} f(s, x) - \sum_{f(x,s)>0} f(x, s) = \sum_{\substack{v \in T,\, x \in S \\ f(x,v)>0}} f(x, v) - \sum_{\substack{u \in T,\, x \in S \\ f(u,x)>0}} f(u, x) \le \sum_{\substack{v \in T,\, x \in S \\ f(x,v)>0}} f(x, v) \le \sum_{x \in S,\, v \in T} c(x, v).$$
Corollary 5.1.8 The maximum flow value is equal to the minimum s-t cut capacity.

Finally, we remark that the Ford-Fulkerson algorithm is not a polynomial-time algorithm. A counterexample is given in Fig. 5.5. On this counterexample, the algorithm may run for 2m steps, while the input size is O(log m). Clearly, 2m is not bounded by a polynomial in log m.
Let δf (x) denote the shortest path distance from source s to node x in the residual
network Gf of flow f where each arc is considered to have unit distance.
Lemma 5.2.1 When the Edmonds-Karp algorithm runs, δf(x) increases monotonically with each flow augmentation.

Proof For contradiction, suppose flow f′ is obtained from flow f through an augmentation with path P and δf′(v) < δf(v) for some node v. Without loss of generality, assume δf′(v) reaches the smallest value among all such v. Suppose arc (u, v) is on the shortest path from s to v in Gf′. Then δf′(u) = δf′(v) − 1, and hence δf′(u) ≥ δf(u). Next, let us consider two cases.

Case 1. (u, v) ∈ Gf. In this case, we have

$$\delta_f(v) \le \delta_f(u) + 1 \le \delta_{f'}(u) + 1 = \delta_{f'}(v),$$

a contradiction.

Case 2. (u, v) ∉ Gf. Then arc (v, u) must lie on the augmenting path P in Gf (Fig. 5.7). Therefore,

$$\delta_f(v) = \delta_f(u) - 1 \le \delta_{f'}(u) - 1 = \delta_{f'}(v) - 2 < \delta_{f'}(v),$$

a contradiction.
An arc (u, v) is critical in residual network Gf if (u, v) has the smallest capacity
in the shortest augmenting path in Gf .
Lemma 5.2.2 Each arc (u, v) can be critical at most (|V| + 1)/2 times.

Proof Suppose arc (u, v) is critical in Gf. Then (u, v) will disappear in the next residual network. Before (u, v) appears again, (v, u) has to appear on the augmenting path of some residual network Gf′. Thus, we have

$$\delta_{f'}(u) = \delta_{f'}(v) + 1.$$

By Lemma 5.2.1, the shortest path distance from s to u increases by at least 2(k − 1) when arc (u, v) is critical k times. Since this distance is at most |V| − 1, we have 2(k − 1) ≤ |V| − 1, and hence k ≤ (|V| + 1)/2.
Now, we establish a theorem on running time.
Theorem 5.2.3 The Edmonds-Karp algorithm runs in time O(|V| · |E|²).

Proof In each augmentation, there exists a critical arc. Since each arc can be critical at most (|V| + 1)/2 times, there are at most O(|V| · |E|) augmentations. In each augmentation, finding the shortest path takes O(|E|) time, and operations on the augmenting path also take O(|E|) time. Putting everything together, the Edmonds-Karp algorithm runs in time O(|V| · |E|²).
Note that the above theorem does not require that all arc capacities are integers.
Therefore, the modification of Edmonds and Karp is twofold: (1) Make the
algorithm halt within finitely many iterations, and (2) the number of iterations is
bounded by a polynomial.
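As an illustration, here is a minimal Python sketch of the Edmonds-Karp algorithm, using an adjacency matrix of residual capacities for brevity; the function name and input format are our own.

from collections import deque

def edmonds_karp(n, arcs, s, t):
    # arcs: list of (u, v, capacity); returns the maximum flow value.
    cap = [[0] * n for _ in range(n)]  # residual capacities c'(u, v)
    for u, v, c in arcs:
        cap[u][v] += c
    flow = 0
    while True:
        parent = [-1] * n              # shortest augmenting path by BFS
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:            # no augmenting path: f is maximum
            return flow
        b = float('inf')               # capacity of the critical arc
        v = t
        while v != s:
            b = min(b, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:                  # augment along the path
            cap[parent[v]][v] -= b
            cap[v][parent[v]] += b
            v = parent[v]
        flow += b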
5.3 Applications
The maximum flow has many applications. Let us show a few examples in this
section.
Example 5.3.1 Given an undirected graph G = (V , E) and two distinct vertices
s, t ∈ V , please give an algorithm to determine the connectivity between s and t,
i.e., the maximum number of s-to-t paths that are vertex-disjoint paths (other than
at s and t).
For each vertex v ∈ V , create two vertices v + and v − together with an arc
(v + , v − ).
For each edge (u, v) ∈ E, create two arcs (u⁻, v⁺) and (v⁻, u⁺). Then, we obtain a directed graph G′ from G (Fig. 5.8). Every path from s to t in G induces a path from s⁻ to t⁺ in G′, and a family of vertex-disjoint paths from s to t in G induces a family of arc-disjoint paths from s⁻ to t⁺, and vice versa. Therefore, assign every arc unit capacity in G′. Then the connectivity between s and t in G is equal to the maximum flow value from s⁻ to t⁺ in G′.
Example 5.3.2 Consider a set of wireless sensors lying in a rectangle which is a
piece of boundary area of the region of interest. The region is below the rectangle
and outside is above the rectangle. The monitoring area of each sensor is a unit
disk, i.e., a disk with radius of unit length. A point is said to be covered by a
sensor if it lies in the monitoring disk of the sensor. The set of sensors is called
5.3 Applications 107
a barrier cover if they can cover a line (not necessarily straight) connecting two
vertical edges (Fig. 5.9) of the rectangle. The barrier cover is used for protecting
any intruder coming from outside. Sensors are powered with batteries and hence
lifetime is limited. Assume that all sensors have unit lifetime. The problem is to find
the maximum number of disjoint barrier covers so that they can be used in turn to
maximize the lifetime of the system.
Use two points s and t to represent two vertical edges of the rectangle; we call them
vertical lines s and t, respectively. Construct a graph G by setting the vertex set
consisting of all sensors together with s and t (Fig. 5.9). The edge is constructed
based on the following rules:
• If the monitoring disk of sensor u and the monitoring disk of sensor v have
nonempty intersection, then add an edge (u, v).
• If vertical line s and the monitoring disk of sensor v have nonempty intersection,
then add an edge (s, v).
• If the monitoring disk of sensor u and vertical line t have nonempty intersection,
then add an edge (u, t).
In graph G, every path between s and t induces a barrier cover, and every set of vertex-disjoint paths between s and t induces a set of disjoint barrier covers, and vice versa. Therefore, we can further construct G′ from G as above (Fig. 5.8), so that the maximum number of disjoint barrier covers is equal to the maximum flow value from s⁻ to t⁺ in G′.
Example 5.3.4 (Maximum Bipartite Matching) Given a bipartite graph (U, V , E),
find a matching with maximum cardinality.
This problem can be transformed into a maximum flow problem as follows. Add
a source node s and a sink node t. Connect s to every node u in U by adding an
arc (s, u). Connect every node v in V to t by adding an arc (v, t). Add to every
edge in E the direction from U to V . Finally, assign every arc with unit capacity.
An example is shown in Fig. 5.10.
Motivated by observations on the example in Fig. 5.10, we may ask the following questions:
5.4 Matching
In this section, we study matching in a directed way. First, we define the augmenting
path as follows.
Consider a matching M in a bipartite graph G = (U, V, E). Let us call every edge in M a matched edge and every edge not in M an unmatched edge. A node v is called a free node if v is not an endpoint of a matched edge.
Definition 5.4.1 (Augmenting Path) The augmenting path is now defined to be a
path satisfying the following:
• It is an alternating path, that is, edges on the path are alternatively unmatched
and matched.
• The path is between two free nodes.
There is an odd number of edges in an augmenting path; the number of unmatched edges is one more than the number of matched edges. Therefore, if on an augmenting path we turn all matched edges to unmatched and all unmatched edges to matched, the considered matching becomes a matching with one more edge. Therefore, if a matching M has an augmenting path, then M cannot be maximum.
The following theorem indicates that the converse also holds.
Theorem 5.4.2 A matching M is maximum if and only if M does not have an
augmenting path.
Proof Let M be a matching without an augmenting path. For contradiction, suppose M is not maximum. Let M* be a maximum matching. Then |M| < |M*|. Consider M ⊕ M* = (M \ M*) ∪ (M* \ M), in which every node has degree at most two (Fig. 5.11).

Hence, it is a disjoint union of paths and cycles. Since each node with degree two must be incident to one edge of M and one edge of M*, those paths and cycles must be alternating. They can be classified into four types as shown in Fig. 5.12.

Note that in each of the first three types of connected components, the number of edges in M is not less than the number of edges in M*. Since |M| < |M*|, we have |M \ M*| < |M* \ M|. Therefore, a connected component of the fourth type must exist, that is, M has an augmenting path, a contradiction.
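As an illustration of the augmenting path method on bipartite graphs, the following minimal Python sketch grows a matching one augmenting path at a time by depth-first search, running in O(|V| · |E|) time; the function name and input format are our own.

def max_bipartite_matching(nu, nv, edges):
    # nu, nv: sizes of U and V; edges: list of (u, v) pairs.
    # Returns the size of a maximum matching.
    adj = [[] for _ in range(nu)]
    for u, v in edges:
        adj[u].append(v)
    match_v = [-1] * nv                # matched partner of each v, or -1

    def augment(u, seen):
        # alternating depth-first search from node u
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                # v is free, or its partner can be re-matched elsewhere
                if match_v[v] == -1 or augment(match_v[v], seen):
                    match_v[v] = u     # flip edges along the path
                    return True
        return False

    return sum(augment(u, [False] * nv) for u in range(nu))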
We now return to the question on augmentation of several paths at the same time.
The following algorithm is the result of a positive answer.
Fig. 5.11 M ⊕ M*

Hopcroft-Karp Algorithm (Algorithm 17)
1: M ← any edge;
2: while there exists an augmenting path do
3: find a maximal set of disjoint augmenting paths {P1, P2, . . . , Pk};
4: M ← M ⊕ (P1 ∪ P2 ∪ · · · ∪ Pk);
5: end while
6: return M.

We next analyze the Hopcroft-Karp algorithm (Algorithm 17).

Lemma 5.4.3 In each iteration, the length of the shortest augmenting path is increased by at least two.

Proof Suppose matching M′ is obtained from matching M through augmentation on a maximal set of shortest augmenting paths, {P1, P2, . . . , Pk}, for M. Let P be a shortest augmenting path for M′. If P is disjoint from {P1, P2, . . . , Pk}, then P is also an augmenting path for M; by the maximality of {P1, P2, . . . , Pk}, its length is longer than the length of P1. Note that every augmenting path has odd length. Therefore, the length of P is at least two longer than the length of P1.
Next, assume that P has an edge lying in Pi for some i. Note that every
augmenting path has two endpoints in U and V , respectively. Let u and v be two
endpoints of P and ui and vi two endpoints of Pi where u, ui ∈ U and v, vi ∈ V .
Without loss of generality, assume that (x, y) is the edge lying on P and also on some Pi such that no such shared edge exists between y and v on P. Clearly,

$$\mathrm{dist}_P(y, v) \ge \mathrm{dist}_{P_i}(y, v_i), \tag{5.1}$$

where dist_P(y, v) denotes the distance between y and v on path P. In fact, if dist_P(y, v) < dist_{P_i}(y, v_i), then replacing the piece of Pi between y and vi by the piece of P between y and v, we would obtain an augmenting path for M shorter than Pi, contradicting the shortest property of Pi. Now, we claim that the following holds:

$$\mathrm{dist}_P(u, y) \ge \mathrm{dist}_{P_i}(u_i, y) + 2. \tag{5.2}$$
To prove this claim, we may put the bipartite graph into a flow network as shown
in Fig. 5.10. Then every augmenting path receives a direction from U to V , and the
claim can be proved as follows.
Firstly, note that on path P, we assumed that the piece from y to v is disjoint from all P1, P2, . . . , Pk. This assumption implies that edge (x, y) is in the direction from x to y on P, so that dist_P(u, x) = dist_P(u, y) − 1.

Secondly, note that edge (x, y) also appears on Pi, and after augmentation, every edge in Pi must change its direction. Thus, edge (x, y) is in the direction from y to x on Pi. Hence, dist_{P_i}(u_i, y) + 1 = dist_{P_i}(u_i, x).

Thirdly, by Lemma 5.2.1, we have dist_{P_i}(u_i, x) ≤ dist_P(u, x).

Finally, putting (5.1) and (5.2) together, we obtain

$$|P| = \mathrm{dist}_P(u, y) + \mathrm{dist}_P(y, v) \ge \mathrm{dist}_{P_i}(u_i, y) + 2 + \mathrm{dist}_{P_i}(y, v_i) = |P_i| + 2.$$
There are two steps in finding a maximal set of disjoint augmenting paths for a
matching M in bipartite graph G = (U, V , E).
In the first step, employ the breadth-first search to put nodes into levels as follows. Initially, select all free nodes in U and put them in the first level. Next, put in the second level all nodes each with an unmatched edge connecting to a node in the first level. Then, put in the third level all nodes each with a matched edge connecting to a node in the second level. Continue in this alternating way until a free node in V is discovered, say in the kth level (Fig. 5.13). Let F be the set of all free nodes in the kth level and H the obtained subgraph. If the breadth-first search comes to an end and still cannot find a free node in V, then there is no augmenting path, and a maximum matching has already been obtained by the Hopcroft-Karp algorithm.
In the second step, employ the depth-first search to find a path from each node in F to a node in the first level. Such paths are searched one by one in H, and once a path is obtained, all nodes on this depth-first-search path are deleted from H, until no more such path can be found.

Since both steps work in O(|E|) time, the total time for finishing this task is O(|E|).
The alternating path method can also be used for the maximum matching in
general graph.
Problem 5.4.5 (Maximum Graph Matching) Given a graph G = (V , E), find a
matching with maximum cardinality.
The augmenting path is also defined to be a path satisfying the following:
• It is an alternating path, that is, edges on the path are alternatively unmatched
and matched.
• The path is between two free nodes.
Now, the proof of Theorem 5.4.2 can be applied to graph matching without any change, to show the following.
Theorem 5.4.6 A matching M is maximum if and only if M does not have an
augmenting path.
Therefore, we obtained Algorithm 18 for the maximum graph matching problem.
Definition 5.4.8 (Even and Odd Nodes) In an alternating tree, a node is called an odd node if its distance to the root is odd, and an even node if its distance to the root is even; e.g., the root is an even node.
Now, let us describe the blossom algorithm.
• For each free node x, construct an alternating tree with root x in the breadth-first-
search ordering.
• At an odd node y, if no matched edge is incident to y, then y is a free node, and
an augmenting path from x to y is found. If there exists a matched edge incident
to y, then such a matched edge is unique, and y can be extended uniquely to an
even node.
• At an even node z, if no unmatched edge is incident to z, then z cannot be
extended. If there exists an unmatched edge (u, z) incident to z, then consider
another ending node u of this edge. If u is a known even node, then a blossom is
found; shrink the blossom into an even node. If u is not a known even node, then
u can be counted as an odd node to continue our construction.
114 5 Incremental Method
• At a level consisting of even nodes, if none of them can be extended, then there is no augmenting path starting from free node x.
• Therefore, by the above construction of alternating trees, we can either find an augmenting path starting from free node x or determine that no such path exists. As soon as an augmenting path is found, we carry out an augmentation, the matching is updated, and we restart the search for an augmenting path.
• If for all free nodes no augmenting path can be found from the construction of alternating trees, then the current matching is maximum.
To show the correctness of the above algorithm, it is sufficient to explain why we can shrink a blossom into a node. An explanation is given in the following.

Lemma 5.4.9 Let B be a blossom in graph G. Let G/B denote the graph obtained from G by shrinking B into a node. Then G contains an augmenting path if and only if G/B contains an augmenting path.

Proof Note that an alternating path can be extended through a blossom to reach any of its connections (Fig. 5.15). Therefore, if an augmenting path passes through a blossom, then after shrinking the blossom into a node, the augmenting path is still an augmenting path. Conversely, if an augmenting path contains a node which is obtained from a blossom, then after expanding the blossom back, we can still obtain an augmenting path.
Clearly, each search for an augmenting path runs in O(|V| · |E|) time. Thus, we have the following.

Theorem 5.4.10 With the blossom algorithm, a maximum cardinality matching in graph G = (V, E) can be computed in O(|V|² · |E|) time.

Proof To obtain a maximum matching, we carry out at most |V| augmentations. To find an augmenting path, we may spend O(|V| · |E|) time to construct alternating trees. Therefore, the total running time is O(|V|² · |E|).
For weighted bipartite matching and weighted graph matching, can we use
the alternating path to deal with them? The answer is yes. However, it is more
complicated. We can find a better way, which will be introduced in Sect. 6.8.
In this and the next sections, we present more algorithms for the maximum flow problem, with better running time than the Edmonds-Karp algorithm.

First, we note that the idea in the Hopcroft-Karp algorithm can be extended from matching to flow. This extension gives a variation of the Edmonds-Karp algorithm, called the Dinitz algorithm.
Consider a flow network G = (V, E). The algorithm starts with the zero flow f(u, v) = 0 for every arc (u, v). In each substantial iteration, consider the residual network Gf of flow f. Start from the source node s and do the breadth-first search until node t is reached. If t cannot be reached, then the algorithm stops, and the maximum flow is already obtained. If t is reached at distance ℓ from node s, then the breadth-first-search tree contains ℓ levels, and its nodes are divided into classes V0, V1, . . . , Vℓ, where Vi is the set of all nodes at distance i from s, and ℓ ≤ |V|. Collect all arcs from Vi to Vi+1 for i = 0, 1, . . . , ℓ − 1. Let L(s) be the obtained levelable subnetwork. The above computation can be done in O(|E|) time. Next, the algorithm finds augmenting paths and does augmentations in the following way.

Step 1. Iteratively, for v ≠ t and u ≠ s, remove from L(s) every arc (u, v) with no incoming arc at u or no outgoing arc at v. Denote by L̂(s) the obtained levelable network.

Step 2. If L̂(s) is empty, then this iteration is completed; go to the next iteration. If L̂(s) is not empty, then it contains a path of length ℓ from s to t. Find such a path P by using the depth-first search. Do augmentation along the path P. Update L(s) by taking L̂(s) and deleting all critical arcs on P. Go to Step 1.
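A minimal Python sketch of this procedure follows: breadth-first search builds the levels, and depth-first search finds augmenting paths in the level network; dead ends are pruned with per-node arc pointers rather than by explicit arc removal, a common substitute. The function name and input format are our own.

from collections import deque

def max_flow_dinitz(n, arcs, s, t):
    # arcs: list of (u, v, capacity). Returns the maximum flow value.
    graph = [[] for _ in range(n)]     # residual arcs: [to, cap, rev_index]
    for u, v, c in arcs:
        graph[u].append([v, c, len(graph[v])])
        graph[v].append([u, 0, len(graph[u]) - 1])

    def bfs_levels():
        # builds the level classes V_0, V_1, ... of the residual network
        level = [-1] * n
        level[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, cap, _ in graph[u]:
                if cap > 0 and level[v] == -1:
                    level[v] = level[u] + 1
                    queue.append(v)
        return level if level[t] != -1 else None

    def push_path(u, limit, level, ptr):
        # depth-first search for an s-t path inside the level network
        if u == t:
            return limit
        while ptr[u] < len(graph[u]):
            v, cap, rev = graph[u][ptr[u]]
            if cap > 0 and level[v] == level[u] + 1:
                pushed = push_path(v, min(limit, cap), level, ptr)
                if pushed > 0:
                    graph[u][ptr[u]][1] -= pushed   # augment along the path
                    graph[v][rev][1] += pushed
                    return pushed
            ptr[u] += 1                             # prune a dead end
        return 0

    flow = 0
    while True:
        level = bfs_levels()
        if level is None:          # t unreachable: current flow is maximum
            return flow
        ptr = [0] * n
        while True:
            pushed = push_path(s, float('inf'), level, ptr)
            if pushed == 0:
                break
            flow += pushed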
This algorithm has the following property.

Lemma 5.5.1 Let δf(s, t) denote the distance from s to t in the residual graph Gf of flow f. Suppose flow f′ is obtained from flow f through an iteration of the Dinitz algorithm. Then δf′(s, t) ≥ δf(s, t) + 2.

Proof The proof is similar to the proof of Lemma 5.2.1.
The correctness of Dinitz algorithm is stated in the following theorem.
Theorem 5.5.2 Dinitz algorithm produces a maximum flow in O(|V |2 |E|) time.
Proof By Lemma 5.5.1, Dinitz algorithm runs within O(|V |) iterations. Let us
estimate the running time in each iteration.
• The construction of L(s) spends O(|E|) time.
• It needs O(|V|) time to find each augmenting path and to perform the augmentation. Since each augmentation removes at least one critical arc, there are at most O(|E|) augmentations. Thus, the total time for augmentations is O(|V| · |E|).
• Amortized over the whole iteration, the total time for removing arcs is at most O(|E|).
Therefore, each iteration runs in O(|V | · |E|) time. Hence, Dinitz algorithm runs in
O(|V |2 |E|) time. At the end of the algorithm, Gf does not contain a path from s to
t. Thus, f is a maximum flow.
Next, we explain two operations, push and relabel. Consider an active node v. Suppose that there exists an admissible arc (v, w). Then a flow of amount min(e(v), c(v, w)) is pushed along arc (v, w). If e(v) ≥ c(v, w), then the push is called a saturated push (the arc becomes saturated); otherwise, the push is called a non-saturated one.
Suppose that there does not exist an admissible arc (v, w). Then relabel d(v) by setting
d(v) ← min{d(w) + 1 | (v, w) is an arc of the residual graph}.
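For illustration, the following is a minimal Python sketch of the generic push-relabel loop using exactly these two operations. The data layout (dicts for residual capacities, arrays for labels d and excesses e) is an assumption made for the sketch, not the book's code.

def push_relabel(n, s, t, cap):
    """Generic Goldberg-Tarjan push-relabel maximum flow sketch.
    cap: dict (u, v) -> capacity; residual capacities kept in place."""
    for (u, v) in list(cap):
        cap.setdefault((v, u), 0)
    adj = {u: set() for u in range(n)}
    for (u, v) in cap:
        adj[u].add(v)
    d = {u: 0 for u in range(n)}       # labels
    e = {u: 0 for u in range(n)}       # excesses
    d[s] = n
    for v in list(adj[s]):             # saturate all arcs out of s
        e[v] += cap[(s, v)]
        cap[(v, s)] += cap[(s, v)]
        cap[(s, v)] = 0
    active = [v for v in range(n) if v not in (s, t) and e[v] > 0]
    while active:
        v = active.pop()               # arbitrary active node
        while e[v] > 0:
            w = next((w for w in adj[v]
                      if cap[(v, w)] > 0 and d[v] == d[w] + 1), None)
            if w is None:              # relabel: no admissible arc at v
                d[v] = 1 + min(d[w] for w in adj[v] if cap[(v, w)] > 0)
            else:                      # push min(e(v), c(v, w)) along (v, w)
                amt = min(e[v], cap[(v, w)])
                cap[(v, w)] -= amt
                cap[(w, v)] += amt
                e[v] -= amt
                e[w] += amt
                if w not in (s, t) and w not in active:
                    active.append(w)
    return e[t]                        # value of the resulting flow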
Lemma 5.6.4 For every node v, d(v) ≤ 2n during the computation, and there are at most 2n relabels at each node v, where n is the number of nodes. Moreover, all relabels need at most O(mn) computation time.
Proof Note that a relabel occurs only at active nodes. If a node has never been active, then its label is at most n − 1. If a node v has been active, then at the last time that v was active, there is a path from v to s in the residual graph. After a push, this path still exists, and hence d(v) ≤ d(s) + (n − 1) ≤ 2n − 1 < 2n; in particular, each node is relabeled at most 2n times. Since a relabel at v scans the arcs incident to v, all relabels take at most
Σ_{v∈V} deg(v) · (2n − 1) ≤ (2n − 1) · 2m = O(mn)
time, where m = |E|.
Fig. 5.16 An example for the Goldberg-Tarjan algorithm (it contains three iterations from (5) to (6) and two iterations from (8) to (9))
Moreover, when the algorithm terminates, there is no active node, and hence the preflow becomes a normal flow. Since the label of s is still d(s) = n, there is no path from s to t in the residual graph. Therefore, the flow is maximum.
Note that in the Goldberg-Tarjan algorithm, the selection of an active node is arbitrary. This leaves an opportunity for improvement: there are two interesting rules for the selection of the active node which improve the running time.
The First Rule (Excess Scaling) The algorithm is divided into phases, a Δ-scaling phase for Δ = 2^⌈log₂ C⌉, 2^⌈log₂ C⌉−1, ..., 1, where C = max{c(u, v) | (u, v) ∈ E}. At the beginning of the Δ-scaling phase, e(v) ≤ Δ for every active node v. At the end of the Δ-scaling phase, e(v) ≤ Δ/2 for every active node v. (When Δ = 1, Δ/2 is replaced by 0.) During the Δ-scaling phase, the active node v is selected to be one with excess e(v) > Δ/2 and, among such nodes, with the smallest label d(v).
In order to keep all active nodes with excess no more than Δ, a modification has to be made to the flow amount of a push. Along an admissible arc (u, v), a flow of amount min(e(u), c(u, v), Δ − e(v)) is pushed.
Note that with this modification, there is no change to the relabel and the saturated push. Therefore, Lemmas 5.6.4 and 5.6.5 still hold. However, a non-saturated push now occurs when the pushed amount is either e(u) or Δ − e(v); in either case, this amount is at least Δ/2. In Fig. 5.17, an example is presented for the computation in a Δ-scaling phase.
We next analyze the excess scaling algorithm, i.e., the Goldberg-Tarjan algorithm with the excess scaling rule.
Theorem 5.6.9 The excess scaling algorithm terminates at a maximum flow within time O(mn + n² log C).
Proof By Lemmas 5.6.4 and 5.6.5, the relabels and the saturated pushes use O(mn) time in total. By Lemma 5.6.8, the non-saturated pushes take O(n² log C) time in total. At the end of the algorithm, there is no active node, i.e., the preflow becomes a flow.
Moreover, in its residual graph, there is no path from source s to sink t since d(s) = n. Thus, the flow is maximum.
The Second Rule (Highest-Level Pushing) A level is a subset of nodes with the same label. In this rule, the active node v is selected from the highest level, i.e.,
v = argmax{d(u) | u is active}.
By Lemma 5.6.4, there are at most O(n²) relabels, and hence (b) occurs at most O(n²) times. Let Λ = max{d(v) | v is active}. Initially, Λ ≤ n. If (b) occurs, then Λ is increased. However, the total amount of increase at each node is at most 2n; hence Λ can be increased through relabels by at most 2n² in total. If (a) occurs, then Λ is decreased by one. Therefore, (a) can occur at most n + 2n² = O(n²) times. Putting everything together, the number of phases is at most O(n²).
Consider the potential Φ = Σ_{v active} z(v), where z(v) is the number of nodes with label at most d(v), i.e., z(v) = |{u ∈ V | d(u) ≤ d(v)}|. Again, for ease of speaking, let us say an operation makes a "deposit" if it increases the value of Φ and a "withdrawal" if it decreases the value of Φ.
Each relabel increases the label of an active node v, which increases z(v). However, the increase cannot exceed n, the total number of nodes. Therefore, each relabel makes a deposit of value at most n, and O(n²) relabels deposit at most O(n³) in total.
Each saturated push may activate a node, which deposits a value of at most 2n. Since there are O(mn) saturated pushes in total, they can deposit at most O(n²m).
Every non-saturated push on an admissible arc (v, w) makes v inactive. Since d(v) = d(w) + 1, z(v) − z(w) is equal to the number of nodes at the highest level in the phase containing the non-saturated push. Note that during a phase, an active node v at the highest level becomes inactive if and only if a non-saturated push occurs at v. Therefore, at the beginning of an expensive phase, there must exist at least k active nodes at the highest level. This means that every non-saturated push in an expensive phase withdraws at least k from Φ.
Summarizing the above argument, the total number of non-saturated pushes in expensive phases is at most O((n³ + n²m)/k), while cheap phases contribute fewer than k such pushes each over O(n²) phases, i.e., O(n²k) pushes. Choosing k = m^{1/2} balances the two bounds, so the total number of non-saturated pushes during the whole computation is at most O(n²m^{1/2}).
Exercises
5. Let G be a flow network in which every arc capacity is a positive odd integer.
Can we conclude that its maximum flow value is an odd integer? If not, please
give a counterexample.
6. Let G be a flow network. An arc (u, v) is said to be critical for a maximum flow f if f(u, v) = c(u, v), where c(u, v) is the capacity of (u, v). Show that an arc (u, v) is critical for every maximum flow if and only if decreasing its capacity by one decreases the maximum flow value by one.
7. Let A be an m × n matrix with non-negative real entries such that for every row and every column, the sum of the entries is an integer. Prove that there exists an m × n matrix B with non-negative integer entries and the same row and column sums as A.
8. Suppose there exist two distinct maximum flows f1 and f2 . Show that there
exist infinitely many maximum flows.
9. Consider a directed graph G with a source s, a sink t, and nonnegative arc capacities. Find a polynomial-time algorithm to determine whether G contains a unique minimum s-t cut.
10. (This is an example on which the Ford-Fulkerson algorithm runs with infinitely many augmentations.) Consider a flow network as shown in Fig. 5.19, where x = (√5 − 1)/2. Show by induction on k that the residual capacities c(u, v) − f(u, v) on the three vertical arcs can be x^k, 0, x^{k+1} for every k = 0, 1, 2, .... (Hint: The case k = 0 is shown in Fig. 5.20. The induction step is as shown in Fig. 5.21.)
11. Consider a flow network G = (V , E) with a source s, a sink t, and nonnegative
capacities. Suppose a maximum flow f is given. If an arc is broken, find a fast
algorithm to compute a new maximum flow based on f . A favorite algorithm
will run in O(|E| log |V |) time.
12. Consider a flow network G = (V , E) with a source s, a sink t, and nonnegative
integer capacities. Suppose a maximum flow f is given. If the capacity of an
arc is increased by one, find a fast algorithm to update the maximum flow. A
favorite algorithm runs in O(|E| + |V |) time.
13. Consider a directed graph G = (V, E) with a source s and a sink t. Instead of arc capacities, assume that there is a nonnegative integer node capacity c(v) on each node v ∈ V; that is, the total flow passing through node v cannot exceed c(v). Show that the maximum flow can be computed in polynomial time.
14. Show that the maximum flow of a flow network G = (V , E) can be
decomposed into at most |E| path-flows.
min_{U⊆V} (1/2) · (|V| + |U| − odd(G \ U)).
Historical Notes
algorithm [178], Goldberg-Rao algorithm [175], Sherman algorithm [365], and the algorithm of Kelner, Lee, Orecchia, and Sidford [240]. Currently, the best running time is O(|V||E|); this record is held by the Orlin algorithm [331]. For approximate solutions, the running time can be further improved [239].
Matching is a classical subject in graph theory. Both the maximum (cardinality) matching and the minimum cost perfect matching problems in bipartite graphs can be easily transformed into maximum flow problems. However, they can also be solved with alternating path methods. So far, the Hopcroft-Karp algorithm [215] is the fastest algorithm for maximum bipartite matching. In general graphs, these problems have to be solved with alternating path methods since, currently, no reduction has been found that transforms the matching problem into a flow problem. Those algorithms were designed by Edmonds [118]. An extension of the Hopcroft-Karp algorithm was made by Micali and Vazirani [313], which runs in O(√|V| · |E|) time.
For maximum weight matching, nobody has found a method to transform it into a flow problem. Therefore, we have to employ the alternating path and cycle method [118], too.
The Chinese postman problem was proposed by Kwan [269], and the first polynomial-time algorithm was given by Edmonds and Johnson [122], using minimum cost perfect matching in a complete graph with an even number of nodes.
Chapter 6
Linear Programming
maximize z = 4x + 5y
subject to 2x + 3y ≤ 60
x ≥ 0, y ≥ 0.
This example can be explained in the Euclidean plane as shown in Fig. 6.1. Each of the three inequalities gives a half plane. Their intersection is a triangle, which is called the feasible domain. In general, the feasible domain of an LP is the set of all points satisfying all constraints. For different values of z, z = 4x + 5y gives different lines which form a family of parallel lines. When z increases, the line z = 4x + 5y moves from left to right, and at the point (30, 0), it is at the last moment it intersects the feasible domain. Hence, (30, 0) is the point at which z = 4x + 5y reaches its maximum value, i.e., 120.
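As a quick numerical check of this geometric reasoning, one can solve the same LP with an off-the-shelf solver; the sketch below uses scipy.optimize.linprog (a tool choice made here for illustration; linprog minimizes, so the objective is negated).

from scipy.optimize import linprog

# maximize 4x + 5y  <=>  minimize -4x - 5y
res = linprog(c=[-4, -5],
              A_ub=[[2, 3]], b_ub=[60],          # 2x + 3y <= 60
              bounds=[(0, None), (0, None)])      # x >= 0, y >= 0
print(res.x, -res.fun)                            # expected: [30. 0.] 120.0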
In general, an LP may contain a large number of variables and a large number of constraints and hence cannot be solved geometrically as above. However, the above example gives us a hint toward a general method. An important observation is that the maximum of a linear objective over such a domain is attained at a vertex of the feasible domain.
max z = cx
s.t. Ax = b
x ≥ 0,
maximize z = 4x + 5y
subject to 2x + 3y + w = 60
x ≥ 0, y ≥ 0, w ≥ 0.
Working at the vertex (0, 20, 0), consider the objective function z = 4x + 5y. From this representation, we know that increasing x would increase the objective function value z; however, increasing x would bring down y in order to keep 2x + 3y + w = 60, and hence bring down z. Can we give a representation of the objective function which does not contain y? The answer is yes. Substituting y = 20 − (2/3)x − (1/3)w into z = 4x + 5y, we obtain
z = 100 + (2/3)x − (5/3)w.
From this representation, we can easily see that increasing x increases the objective function value z, although y is brought down. As long as y is kept nonnegative, this is fine. When y is brought down to 0, x has been increased to 30; that is, we move from vertex (0, 20, 0) to vertex (30, 0, 0).
At the vertex (30, 0, 0), we substitute x = 30 − (3/2)y − (1/2)w into z = 4x + 5y and obtain
z = 120 − y − 2w.
Since both coefficients are negative, z cannot be increased any further; hence (30, 0, 0) is optimal with z = 120.
In general, given a feasible basis I, partition
A = (A_I, A_Ī), x = (x_I, x_Ī)^T, c = (c_I, c_Ī).
Thus, Ax = b gives A_I x_I + A_Ī x_Ī = b, and hence
x_I = A_I^{-1} b − A_I^{-1} A_Ī x_Ī.
Substituting into z = cx, we obtain
z = c_I A_I^{-1} b + (c_Ī − c_I A_I^{-1} A_Ī) x_Ī.
If c̄_{j*} > 0 for some j* ∈ Ī, then increasing x_{j*} will increase the objective function value z. How much can x_{j*} be increased? Denote (ā_{ij}) = A_I^{-1} A = (I_m, A_I^{-1} A_Ī) and b̄ = A_I^{-1} b, where I_m is the identity matrix of order m. We want to increase x_{j*} and keep x_j = 0 for all j ∈ Ī − {j*}. Thus, for every j ∈ I, x_j = b̄_{i_j} − ā_{i_j j*} x_{j*}, where i_j is the row index such that ā_{i_j j} = 1. If ā_{i_j j*} ≤ 0, then x_j ≥ 0 for any x_{j*} ≥ 0. However, if ā_{i_j j*} > 0, then x_j ≥ 0 only for x_{j*} ≤ b̄_{i_j}/ā_{i_j j*}. This means that x_{j*} can be increased at most to
b̄_{i*}/ā_{i* j*} = min{ b̄_i/ā_{i j*} | ā_{i j*} > 0 }, (6.3)
The simplex method is carried out on a simplex table of the form
−z x^T
0 c
b A
and, with respect to a feasible basis I,
−z x^T
−c_I A_I^{-1} b   c − c_I A_I^{-1} A
A_I^{-1} b        A_I^{-1} A
−z x1 x2 x3 x4 x5 x6
0 3 1 2 0 0 0
30 1 1 3 1 0 0
24 2 2 5 0 1 0
36 4 1 2 0 0 1
Since c̄_1 = 3 > 0, we may move x1 into the basis. Note that 36/4 = min(30/1, 24/2, 36/4). We choose the 4 in the last row as the pivoting element. This means that I1 = (I0 − {6}) ∪ {1} = {4, 5, 1}. The simplex table with I1 is as follows:
−z x1 x2 x3 x4 x5 x6
−27 0 1/4 1/2 0 0 −3/4
21 0 3/4 5/2 1 0 −1/4
6 0 3/2 4 0 1 −1/2
9 1 1/4 1/2 0 0 1/4
Since c̄_2 > 0 and 6/(3/2) = min(21/(3/4), 6/(3/2), 9/(1/4)), we choose the 3/2 in the third row as the pivoting element and set I2 = (I1 − {5}) ∪ {2} = {4, 2, 1}. The simplex table with I2 is as follows:
−z x1 x2 x3 x4 x5 x6
−28 0 0 −1/6 0 −1/6 −2/3
18 0 0 1/2 1 −1/2 0
4 0 1 8/3 0 2/3 −1/3
8 1 0 −1/6 0 −1/6 1/3
Now, c̄_j ≤ 0 for all j ∈ Ī2. Therefore, the basic feasible solution given by I2, (x1 = 8, x2 = 4, x4 = 18, x3 = x5 = x6 = 0), is optimal; it achieves objective function value 28.
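The pivoting performed above is easy to mechanize. The following is a minimal tableau-pivot sketch (exact rational arithmetic, no degeneracy handling) that reproduces this example; the function names and tableau conventions are illustrative assumptions.

from fractions import Fraction

def pivot(T, r, c):
    """Pivot tableau T on entry T[r][c]; row 0 is the cost row,
    column 0 the right-hand side."""
    p = T[r][c]
    T[r] = [x / p for x in T[r]]
    for i in range(len(T)):
        if i != r and T[i][c] != 0:
            f = T[i][c]
            T[i] = [a - f * b for a, b in zip(T[i], T[r])]

def simplex(T, basis):
    """Run the simplex method on tableau T with the given basic columns."""
    while True:
        c = next((j for j in range(1, len(T[0])) if T[0][j] > 0), None)
        if c is None:
            return T, basis            # all reduced costs <= 0: optimal
        # ratio test (6.3): minimize b_i / a_ic over rows with a_ic > 0
        rows = [(T[i][0] / T[i][c], i) for i in range(1, len(T)) if T[i][c] > 0]
        if not rows:
            raise ValueError("unbounded LP")
        _, r = min(rows)
        pivot(T, r, c)
        basis[r - 1] = c

rows = [[0, 3, 1, 2, 0, 0, 0],         # cost row of the example above
        [30, 1, 1, 3, 1, 0, 0],
        [24, 2, 2, 5, 0, 1, 0],
        [36, 4, 1, 2, 0, 0, 1]]
T = [[Fraction(v) for v in row] for row in rows]
T, basis = simplex(T, [4, 5, 6])
print(-T[0][0], sorted(basis))         # expected: 28 and basis [1, 2, 4]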
When the nondegeneracy condition does not hold, the simplex method may fall into a cycle. The following is an example provided in [22]:
−z x1 x2 x3 x4 x5 x6 x7
1 3/4 −20 1/2 −6 0 0 0
0 1/4 −8 −1 9 1 0 0
0 1/2 −12 −1/2 3 0 1 0
0 0 0 1 0 0 0 1
−z x1 x2 x3 x4 x5 x6 x7
1 0 4 7/2 −33 −3 0 0
0 1 −32 −4 36 4 0 0
0 0 4 3/2 −15 −2 1 0
0 0 0 1 0 0 0 1
−z x1 x2 x3 x4 x5 x6 x7
1 0 0 2 −18 −1 −1 0
0 1 0 8 −84 −12 8 0
0 0 1 3/8 −15/4 −1/2 1/4 0
0 0 0 1 0 0 0 1
−z x1 x2 x3 x4 x5 x6 x7
1 −1/4 0 0 3 2 −3 0
0 1/8 0 1 −21/2 −3/2 1 0
0 −3/64 1 0 3/16 1/16 −1/8 0
0 −1/8 0 0 21/2 3/2 −1 1
−z x1 x2 x3 x4 x5 x6 x7
1 1/2 −16 0 0 1 −1 0
0 −3/2 56 1 0 2 −6 0
0 −1/4 16/3 0 1 1/3 −2/3 0
0 5/2 −56 0 0 −2 6 1
−z x1 x2 x3 x4 x5 x6 x7
1 7/4 −44 −1/2 0 0 2 0
0 −5/4 28 1/2 0 1 −3 0
0 1/6 −4 −1/6 1 0 1/3 0
0 0 0 1 0 0 0 1
After one more pivot, the table returns to the initial one, completing the cycle:
−z x1 x2 x3 x4 x5 x6 x7
1 3/4 −20 1/2 −6 0 0 0
0 1/4 −8 −1 9 1 0 0
0 1/2 −12 −1/2 3 0 1 0
0 0 0 1 0 0 0 1
−z x_I^T x_Ī^T
c0 0 c̄_Ī
b̄ I_m Ā_Ī
This arrangement makes the m rows other than the top row lexicographically positive. Moreover, they are distinct, since they differ within the identity matrix I_m.
If c̄_Ī ≤ 0, then I is an optimal basis. Otherwise, choose j* such that c̄_{j*} > 0. Denote Ā = (I_m, Ā_Ī) and c̄ = (0, c̄_Ī). If ā_{ij*} ≤ 0 for every i, then the algorithm stops, and we can conclude that the LP has no optimal solution. If there exists some i such that ā_{ij*} > 0, then choose i* with ā_{i*j*} > 0 such that the row
( b̄_{i*}/ā_{i*j*}, ā_{i*1}/ā_{i*j*}, ..., ā_{i*n}/ā_{i*j*} )
is lexicographically smallest, and pivot on ā_{i*j*}, i.e., update
(b̄_i, ā_i) ← (b̄_i, ā_i) − (b̄_{i*}, ā_{i*}) · ā_{ij*}/ā_{i*j*} for i ≠ i*,
and
(c0, c̄) ← (c0, c̄) − (b̄_{i*}, ā_{i*}) · c̄_{j*}/ā_{i*j*}.
This means that after a pivot, the new simplex table becomes
−z x^T
c0 − b̄_{i*} · c̄_{j*}/ā_{i*j*}   c̄ − ā_{i*} · c̄_{j*}/ā_{i*j*}
b̄_i − b̄_{i*} · ā_{ij*}/ā_{i*j*}   ā_i − ā_{i*} · ā_{ij*}/ā_{i*j*}
−z x5 x6 x7 x1 x2 x3 x4
1 0 0 0 3/4 −20 1/2 −6
0 1 0 0 1/4 −8 −1 9
0 0 1 0 1/2 −12 −1/2 3
0 0 0 1 0 0 1 0
−z x5 x6 x7 x1 x2 x3 x4
1 0 −3/2 0 0 −2 5/4 −21/2
0 1 −1/2 0 0 −2 −3/4 15/2
0 0 2 0 1 −24 −1 6
0 0 0 1 0 0 1 0
−z x5 x6 x7 x1 x2 x3 x4
1 0 −3/2 −5/4 0 −2 0 −21/2
0 1 −1/2 3/4 0 −2 0 15/2
0 0 2 1 1 −24 0 6
0 0 0 1 0 0 1 0
The second method for dealing with degeneracy is to modify the simplex algorithm by Bland's rule as follows:
• Choose the entering column index j* to be the smallest index satisfying c̄_{j*} > 0.
• Choose the leaving row index i* to be the smallest one if there is more than one i* satisfying
b̄_{i*}/ā_{i*j*} = min{ b̄_i/ā_{ij*} | ā_{ij*} > 0 }.
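As a sketch, Bland's two choices can be written in a few lines (tableau conventions as in the simplex sketch earlier; the function name is an illustrative assumption):

def blands_rule(T):
    """Return (entering column, leaving row) under Bland's rule for
    tableau T, or (None, None) if the current basis is optimal."""
    enter = next((j for j in range(1, len(T[0])) if T[0][j] > 0), None)
    if enter is None:
        return None, None              # no positive reduced cost: optimal
    best, leave = None, None
    for i in range(1, len(T)):         # scan rows in increasing index
        if T[i][enter] > 0:
            ratio = T[i][0] / T[i][enter]
            if best is None or ratio < best:   # strict '<' keeps smallest i on ties
                best, leave = ratio, i
    return enter, leave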
Theorem 6.3.1 With Bland's rule, the simplex algorithm will not run into a cycle, so that within finitely many iterations, the algorithm is able to determine whether the optimal value is infinite, and if the optimal value is finite, the algorithm will obtain an optimal solution.
Proof It is sufficient to show that with Bland's rule, the simplex algorithm will not run into a cycle. For contradiction, suppose a cycle exists. For simplicity of discussion, we delete all constraints whose row indices are not selected in the cycle. Thus, for each remaining row index i, b̄_i = 0, since the objective function value cannot change during the computation of the cycle. In this cycle, there also exist some column indices entering the feasible basis and then leaving, or vice versa. Let t be the largest column index among them. For simplicity of discussion, we also delete all columns with index j > t, since we always assign 0 to variable x_j for j > t. Next, let us consider two moments in this cycle.
At the first moment, t leaves the feasible basis. Assume column index s enters the feasible basis. Denote by ā_{ij} and c̄_j the coefficients of the constraints and the cost, respectively, at this moment.
At the second moment, t enters the feasible basis. Denote by ā′_{ij} and c̄′_j the coefficients of the constraints and the cost, respectively, at this moment.
After the deletion, suppose there are m rows left. Assume that at the first moment, j1, j2, ..., jm are the basis indices such that ā_{1j1} = ā_{2j2} = · · · = ā_{mjm} = 1. Consider a variable assignment x̂ with the following values:
x̂_j = −1 if j = s,
x̂_j = ā_{is} if j = j_i for i = 1, 2, ..., m,
x̂_j = 0 otherwise.
Clearly, x̂ satisfies all constraints. Note that (c̄′_j) can be obtained from (c̄_j) through elementary row operations. Therefore, at the first moment and at the second moment, the cost function value must be the same at x̂, that is,
−c̄_s = −c̄′_s + Σ_{i=1}^m c̄′_{j_i} ā_{is}.
Since at the first moment s enters the feasible basis, we have c̄_s > 0. Note that s < t, and at the second moment t enters the feasible basis. By Bland's rule, we have c̄′_s ≤ 0. Therefore,
Σ_{i=1}^m c̄′_{j_i} ā_{is} < 0.
It follows that for some i, c̄′_{j_i} ā_{is} < 0. By Bland's rule, c̄′_{j_i} ≤ 0. Therefore, ā_{is} > 0, contradicting the fact that t is the leaving index at the first moment.
Now, we apply Bland’s rule to the following LP:
−z x1 x2 x3 x4 x5 x6 x7
1 3/4 −20 1/2 −6 0 0 0
0 1/4 −8 −1 9 1 0 0
0 1/2 −12 −1/2 3 0 1 0
0 0 0 1 0 0 0 1
−z x1 x2 x3 x4 x5 x6 x7
1 0 4 7/2 −33 −3 0 0
0 1 −32 −4 36 4 0 0
0 0 4 3/2 −15 −2 1 0
0 0 0 1 0 0 0 1
−z x1 x2 x3 x4 x5 x6 x7
1 0 0 2 −18 −1 −1 0
0 1 0 8 −84 −12 8 0
0 0 1 3/8 −15/4 −1/2 1/4 0
0 0 0 1 0 0 0 1
−z x1 x2 x3 x4 x5 x6 x7
1 −1/4 0 0 3 2 −3 0
0 1/8 0 1 −21/2 −3/2 1 0
0 −3/64 1 0 3/16 1/16 −1/8 0
0 −1/8 0 0 21/2 3/2 −1 1
−z x1 x2 x3 x4 x5 x6 x7
1 1/2 −16 0 0 1 −1 0
0 −3/2 56 1 0 2 −6 0
0 −1/4 16/3 0 1 1/3 −2/3 0
0 5/2 −56 0 0 −2 6 1
−z x1 x2 x3 x4 x5 x6 x7
1 0 −24/5 0 0 7/5 −11/5 −1/5
0 0 112/5 1 0 4/5 −12/5 3/5
0 0 −4/15 0 1 2/15 −1/15 1/10
0 1 −112/5 0 0 −4/5 12/5 2/5
−z x1 x2 x3 x4 x5 x6 x7
1 0 −44 0 0 0 2 −4/5
0 0 28 5/4 0 1 −3 3/4
0 0 −4 −1/6 1 0 1/3 0
0 1 0 1 0 0 0 1
−z x1 x2 x3 x4 x5 x6 x7
1 0 −20 1 −6 0 0 −4/5
0 0 −8 −1/4 0 1 0 3/4
0 0 −12 −1/2 3 0 1 0
0 1 0 1 0 0 0 1
−z x1 x2 x3 x4 x5 x6 x7
1 −1 −20 0 −6 0 0 −9/5
0 1/4 −8 0 0 1 0 1
0 1/2 −12 0 3 0 1 1/2
0 1 0 1 0 0 0 1
How do we find the initial feasible basis? A popular way is to introduce artificial
variables y = (y1 , y2 , . . . , ym )T and solve the following LP:
max w = −ey
subject to Ax + Im y = b
x ≥ 0, y ≥ 0,
where e = (1, 1, . . . , 1) and Im is the identity matrix of order m. In this LP, those
artificial variables form a feasible basis. There are three possible outcomes resulting
from solving this LP.
(1) The cost function value w is reduced to 0, and all artificial variables are removed from the feasible basis. In this case, the final feasible basis can be used as an initial feasible basis for the original LP.
(2) The cost function reaches a negative maximum value. In this case, the original LP has no feasible solution.
(3) The cost function value w is reduced to 0; however, some artificial variable y_i remains in the feasible basis. Let b̄_i and ā_{ij} denote the coefficients of the constraints at the last moment. In this case, we must have y_i = b̄_i = 0; otherwise ey > 0 and w = −ey < 0. Note that there exists a variable x_j such that ā_{ij} ≠ 0, since rank(A) = m. This means that we may take ā_{ij} as a pivot element to move y_i out of the feasible basis and move x_j in, preserving cost function value 0. When all artificial variables have been moved out of the feasible basis, this case reduces to case (1), as the sketch below illustrates.
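A minimal sketch of this two-phase idea using scipy (the tool choice, function name, and data in the usage example are illustrative assumptions; the usage data matches the equality constraints of the worked example that follows):

import numpy as np
from scipy.optimize import linprog

def phase_one(A, b):
    """Phase I: decide feasibility of {Ax = b, x >= 0} via artificial variables."""
    m, n = A.shape
    b = np.asarray(b, dtype=float)
    neg = b < 0                        # flip rows so b >= 0, making y = b feasible
    A = np.where(neg[:, None], -A, A)
    b = np.abs(b)
    # minimize e*y  subject to  Ax + I y = b, x >= 0, y >= 0
    c = np.concatenate([np.zeros(n), np.ones(m)])
    A_eq = np.hstack([A, np.eye(m)])
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (n + m))
    feasible = res.status == 0 and res.fun < 1e-9
    return feasible, (res.x[:n] if feasible else None)

A = np.array([[1.0, 0, -1, 0],         # x1 - x3 = 3
              [1, -1, 0, -2]])          # x1 - x2 - 2x4 = 1
feasible, x0 = phase_one(A, [3, 1])
print(feasible, x0)                     # expected: True and a feasible x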
max z = −2x1
subject to x1 −x3 = 3,
x1 −x2 −2x4 = 1,
2x1 +x4 ≤ 7,
x1 , x2 , x3 , x4 ≥ 0.
max z = −2x1
subject to x1 −x3 = 3,
x1 −x2 −2x4 = 1,
2x1 +x4 +x5 = 7,
x1, x2, x3, x4, x5 ≥ 0.
To find an initial feasible basis, we introduce two artificial variables y1 and y2 and
solve the following LP:
The following tables are obtained with simplex algorithm with lexicographical rule:
−w y1 y2 x5 x1 x2 x3 x4
4 0 0 0 2 −1 −1 −2
3 1 0 0 1 0 −1 0
1 0 1 0 1 −1 0 −2
7 0 0 1 2 0 0 1

After the first pivot (x1 enters the basis, y2 leaves):

−w y1 y2 x5 x1 x2 x3 x4
2 0 −2 0 0 1 −1 2
2 1 −1 0 0 1 −1 2
1 0 1 0 1 −1 0 −2
5 0 −2 1 0 2 0 5

After the second pivot (x2 enters the basis, y1 leaves):

−w y1 y2 x5 x1 x2 x3 x4
0 −1 −1 0 0 0 0 0
2 1 −1 0 0 1 −1 2
3 1 0 0 1 0 −1 0
1 −2 0 1 0 0 2 1
At this point, we may stop the algorithm, since w has been reduced to 0 and all artificial variables have already been moved out of the feasible basis. A feasible basis {x5, x1, x2} is obtained for the original LP. Deleting the columns of the artificial variables and putting back the original cost function, we obtain the following:
−z x5 x1 x2 x3 x4
0 0 −2 0 0 0
2 0 0 1 −1 2
3 0 1 0 −1 0
1 1 0 0 2 1

After eliminating the basic variable x1 from the cost row (z = −2x1 = −6 − 2x3):

−z x5 x1 x2 x3 x4
6 0 0 0 −2 0
2 0 0 1 −1 2
3 0 1 0 −1 0
1 1 0 0 2 1
It is pretty lucky that this basis is already optimal. Therefore, we obtain the optimal solution (x1 = 3, x2 = 2, x3 = x4 = 0) with maximum objective function value −6.
Now, we can summarize what we obtained on the LP as follows:
Theorem 6.4.1 There are three possible outcomes for solving LP max{cx | Ax =
b, x ≥ 0}.
(a) There is no feasible solution.
(b) The maximum value of objective function is +∞.
(c) There is a maximum solution with finite objective function value. Then there exists a maximum solution which is a basic feasible solution associated with a feasible basis I such that c − c_I A_I^{-1} A ≤ 0. Moreover, if a basic feasible solution is associated with a feasible basis I satisfying c − c_I A_I^{-1} A ≤ 0, then it is a maximum solution.
6.5 Duality
(P ) : max z = cx
subject to Ax = b
x ≥ 0,
and
(D) : min w = yb
subject to yA ≥ c,
Let I be a feasible basis satisfying the optimality condition
c − c_I A_I^{-1} A ≤ 0,
let x be the associated basic feasible solution, and set y = c_I A_I^{-1}. Then y is dual-feasible, and
yb = yAx = c_I A_I^{-1} Ax = (c_I, c_I A_I^{-1} A_Ī) x = c_I x_I = cx.
(P ) : max z = cx
subject to Ax ≤ b,
x ≥ 0,
and
(D) : min w = yb
subject to yA ≥ c,
y ≥ 0.
The duality theorem still holds for them and the complementary slackness condition
has a different expression.
Corollary 6.5.4 (Complementary-Slackness) Consider a primal-feasible solu-
tion x and a dual-feasible solution y in a pair of primal LP and dual LP in symmetric
form. Then both x and y are optimal if and only if (yA−c)x = 0 and y(b−Ax) = 0.
Proof Note that
cx ≤ yAx ≤ yb.
By the duality theorem, both x and y are optimal if and only if cx = yb, that is, cx = yAx and yAx = yb. These two equalities are equivalent to (yA − c)x = 0 and y(b − Ax) = 0, respectively.
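The two slackness products are easy to verify numerically. The following sketch (an illustration, reusing the example LP from the beginning of this chapter and scipy.optimize.linprog) solves the primal and dual and checks that both products vanish at optimality:

import numpy as np
from scipy.optimize import linprog

# primal: max 4x + 5y  s.t.  2x + 3y <= 60, x, y >= 0
primal = linprog([-4, -5], A_ub=[[2, 3]], b_ub=[60], bounds=[(0, None)] * 2)
# dual:   min 60u      s.t.  2u >= 4, 3u >= 5, u >= 0
dual = linprog([60], A_ub=[[-2], [-3]], b_ub=[-4, -5], bounds=[(0, None)])
x, u = primal.x, dual.x
A, b, c = np.array([[2, 3]]), np.array([60]), np.array([4, 5])
print((u @ A - c) @ x)   # (yA - c)x, expected 0
print(u @ (b - A @ x))   # y(b - Ax), expected 0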
Another important corollary of the duality theorem is about separating hyper-
plane.
Corollary 6.5.5 (Separating Hyperplane Theorem)
(a) There does not exist x ≥ 0 such that Ax = b if and only if there exists y such
that yA ≥ 0 and yb < 0.
(b) There does not exist x ≥ 0 such that Ax ≤ b if and only if there exists y ≥ 0
such that yA ≥ 0 and yb < 0.
Proof First, we prove (a). Consider the following pair of primal and dual LPs:
(P ) : max z=0
subject to Ax = b
x ≥ 0,
and
(D) : min w = yb
subject to yA ≥ 0.
By the duality theorem, (P) has no feasible solution if and only if (D) approaches −∞. Note that if y is dual-feasible, then so is αy for any α > 0. Therefore, (D) approaches −∞ if and only if there is a dual-feasible solution y such that yb < 0.
Similarly, we can show (b).
Let us give a little explanation of the separating hyperplane. If the feasible domain {x | Ax = b, x ≥ 0} is nonempty, then b is located in the cone generated by a1, a2, ..., an, i.e., {Σ_{i=1}^n α_i a_i | α_i ≥ 0 for i = 1, 2, ..., n}. The separating hyperplane theorem (a) says that if b does not lie in this cone, then there exists a hyperplane separating the cone and b. Part (b) has a similar background. The separating hyperplane theorem is quite useful in the design of approximation algorithms in later chapters.
The duality opens the possibility of designing other algorithms for LP, which may have advantages in some cases. The dual simplex algorithm is useful when an initial dual-feasible solution is easily obtained.
Consider a basis I. I is called a dual-feasible basis if c − c_I A_I^{-1} A ≤ 0, i.e., y = c_I A_I^{-1} is a dual-feasible solution. Clearly, a dual-feasible basis is optimal if and only if A_I^{-1} b ≥ 0. In the following example, a primal-feasible basis does not appear explicitly; however, a dual-feasible basis {x4, x5, x6} is easy to find:
−z x1 x2 x3 x4 x5 x6
0 −3 −1 −2 0 0 0
30 1 1 3 1 0 0
−24 2 −2 5 0 1 0
36 4 1 2 0 0 1
These rules keep every c̄_j nonpositive after the pivot. According to these rules, the pivot element is ā_22 = −2 in this example. After the first pivot, we obtain the following:
−z x1 x2 x3 x4 x5 x6
12 −4 0 −9/2 0 −1/2 0
18 2 0 11/2 1 1/2 0
12 −1 1 −5/2 0 −1/2 0
24 5 0 9/2 0 1/2 1
Primal LP ←→ Dual LP
max ←→ min
Σ_j a_ij x_j = b_i ←→ y_i has no restriction
Σ_j a_ij x_j ≤ b_i ←→ y_i ≥ 0
Σ_j a_ij x_j ≥ b_i ←→ y_i ≤ 0
x_j has no restriction ←→ Σ_i a_ij y_i = c_j
x_j ≥ 0 ←→ Σ_i a_ij y_i ≥ c_j
x_j ≤ 0 ←→ Σ_i a_ij y_i ≤ c_j
For example, the maximum flow problem can be formulated as the following LP:
max Σ_{(s,u)∈E} x_su
(P ) : max z = cx
subject to Ax = b
x ≥ 0,
and
(D) : min w = yb
subject to yA ≥ c,
yaj > cj ⇒ xj = 0,
or
xj > 0 ⇒ yaj = cj
(RP): max −Σ_{i=1}^m u_i
subject to Σ_{j∈J(y)} a_ij x_j + u_i = b_i for i = 1, 2, ..., m,
x_j ≥ 0 for j ∈ J(y),
u_i ≥ 0 for i = 1, 2, ..., m.
If this LP does not have optimal value 0, then solve its dual LP:
(RD) : min vb
subject to vaj ≥ 0 for j ∈ J (y),
vi ≥ −1 for i = 1, 2, . . . , m.
Let us give (RD) another explanation. Consider applying the feasible direction method to solve the LP. At a dual-feasible point y, we want to find the next dual-feasible point y + λv (λ > 0) such that yb > (y + λv)b, where v is a descent feasible direction. We may like v to satisfy (RD). In fact, when (RP) does not have maximum value 0, it must have a negative maximum value. By the duality theorem, (RD) must have a negative minimum value, and hence v is a descent direction. Next, we can determine λ by
λ = min{ (c_j − y a_j)/(v a_j) | v a_j < 0 }.
Here, note that if there does not exist j such that vaj < 0, then λ = +∞ and
hence (P) does not have a feasible solution. Now, we summarize the primal-dual
algorithm.
Can this algorithm terminate within finitely many iterations? We cannot find a conclusion in the literature. However, it is known from nonlinear programming that this feasible direction method has global convergence, that is, if it generates an infinite sequence, then every cluster point of the sequence is an optimal solution.
Although three algorithms have been presented in previous sections for LP, none of them runs in polynomial time. The reason is that, in general, the number of extreme points (i.e., vertices) of the feasible domain is exponential. In this section, we present a polynomial-time algorithm which moves from one feasible point to another in the interior of the feasible domain; hence, it is called the interior point algorithm.
First, we assume that the LP is in the following form:
min cx
subject to Ax = b
x≥0
Actually, from LP (P) and its dual (D′) in Sect. 6.6, we can obtain an LP as follows:
min w − z = yb − cx
subject to yA ≥ c
Ax = b
x ≥ 0.
This LP has zero as the objective function value of its optimal solution. Modify it into standard form; then we obtain an LP satisfying our assumption.
In order to keep moving in the interior of the feasible domain, we replace the linear objective function by a nonlinear one, called the potential function,
f(x) = q log(cx) − Σ_{i=1}^n log x_i,
which contains a barrier term Σ_{i=1}^n log x_i to keep the iterates away from the boundary. Moreover, for simplicity of notation, we assume that the base of log is 2 in this section.
Next, we present the interior point algorithm and then explain and analyze it. In this algorithm, each iteration is divided into three stages: scaling, update, and scaling back. In the scaling stage, the point x^k is moved to y^k = 1⃗, where 1⃗ is the vector with 1 in every component, which has the same distance from each boundary y_i = 0. Let y = Dx, where D = diag(1/x^k_1, ..., 1/x^k_n), so that Dx^k = 1⃗. Then
Ax = Āy, cx = c̄y,
and
f̄(y) = q log(c̄y) − Σ_{i=1}^n log y_i
= q log(cx) − Σ_{i=1}^n log(D_i x_i)
= f(x) − Σ_{i=1}^n log D_i.
The projection P onto Null(Ā) satisfies ĀP = 0 and (y − Py)(Py)^T = 0. The update step is
y^{k+1} = y^k + λh = 1⃗ + 0.3 · h/‖h‖ ≥ 0.
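A minimal numeric sketch of one scaled step under the definitions above (Ā = A·diag(x), projection onto Null(Ā), step length 0.3 along −∇f̄); the function name and data layout are illustrative assumptions:

import numpy as np

def interior_point_step(A, c, x, q):
    """One scaling/update/scale-back iteration of the potential-reduction
    method sketched above; x is a strictly positive feasible point."""
    Dinv = np.diag(x)               # D = diag(1/x_i) sends x to the all-one vector
    Abar = A @ Dinv                 # Ā, so that Ax = Ā y with y = Dx
    cbar = c @ Dinv                 # c̄, so that cx = c̄ y
    y = np.ones_like(x)
    grad = q * cbar / (cbar @ y) - 1.0          # ∇f̄ at the all-one vector
    # h = -P ∇f̄(1), with P the projection onto Null(Ābar)
    M = Abar @ Abar.T
    h = -(grad - Abar.T @ np.linalg.solve(M, Abar @ grad))
    y_new = y + 0.3 * h / np.linalg.norm(h)     # update stage
    return Dinv @ y_new                          # scale back to original space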
To analyze the update, define
f1(y) = q log(c̄y),
f2(y) = Σ_{i=1}^n log y_i.
Then
d/dλ f1(1⃗ + λh) = ∇f1(1⃗ + λh)^T h,
d²/dλ² f1(1⃗ + λh) = q · (−(c̄h)²) / (c̄(1⃗ + λh))² ≤ 0.
f2(1⃗ + λh) ≥ Σ_{i=1}^n λh_i − 2 Σ_{i=1}^n λ²h_i²
= f2(1⃗) + λ∇f2(1⃗)^T h − 2λ²‖h‖²,
Lemma 6.7.3 Select q = n + √n. Then ‖h‖ ≥ 1.
Proof Let y* be the optimal solution, i.e., c̄y* = 0. Then Ā(y* − 1⃗) = 0. Therefore,
h^T (y* − 1⃗) = −(P(∇f̄(1⃗)))^T (y* − 1⃗)
= −(∇f̄(1⃗))^T (y* − 1⃗)
= −( (q/(c̄1⃗)) c̄ − 1⃗ )^T (y* − 1⃗)
= −(q/(c̄1⃗)) · c̄y* + (1⃗)^T y* + q − n
= Σ_{i=1}^n y*_i + √n
≥ ‖y*‖ + √n.
Moreover, h^T (y* − 1⃗) ≤ ‖h‖ · ‖y* − 1⃗‖. Therefore,
‖h‖ ≥ (‖y*‖ + √n)/‖y* − 1⃗‖ ≥ (‖y*‖ + √n)/(‖y*‖ + ‖1⃗‖) = 1.
f(x^k) − f(x^{k+1}) = f̄(y^k) − f̄(y^{k+1}) = f̄(1⃗) − f̄(1⃗ + λh).
By Lemma 6.7.2 (i.e., the estimates for f1 and f2 above),
f̄(1⃗ + λh) − f̄(1⃗) ≤ λ∇f̄(1⃗)^T h + 2λ²‖h‖².
Since h ∈ Null(Ā), we have ∇f̄(1⃗)^T h = −‖h‖². Therefore, with λ = 0.3/‖h‖ and ‖h‖ ≥ 1,
f̄(1⃗ + λh) − f̄(1⃗) ≤ −0.3 · ‖h‖ + 0.18 < −0.1.
log x_i^0 ≤ L. From the content of Sect. 6.4, we may assume that initially x^0 is a vertex. Next, compute an x′ satisfying Ax′ = b with log |x′_i| ≤ L for 1 ≤ i ≤ n. Since the feasible domain is compact, we can find two boundary points y′ and y″ on the line passing through x^0 and x′. Consider z = (y′ + y″)/2. Then z satisfies the condition.
Lemma 6.7.6 If cx^k < 2^{−L}, then x^k can be rounded into an exact optimal solution within O(n³) time.
Proof If x^k is not a vertex, then we can find a line passing through x^k that contains two boundary points y and z of the feasible domain with x^k ∈ (y, z). The linear function cx is nonincreasing in at least one of the two directions (x^k, y] and (x^k, z]. Thus, we find either cy ≤ cx^k or cz ≤ cx^k. Moreover, y and z have one more active constraint than x^k. Repeating this argument, we can find a vertex x′ of the feasible domain such that cx′ ≤ cx^k < 2^{−L}. Note that for any vertex x′, each component is a rational number with denominator at most 2^L, since all numbers in the input are integers. Therefore, if x′ were not optimal, then we would have cx′ ≥ 2^{−L}. Hence, x′ is optimal. The above operation may be performed O(n) times, and each operation, computing boundary points, takes O(n²) time. Therefore, the total running time is O(n³).
Theorem 6.7.7 Select ε = 2^{−L}. The interior point algorithm terminates within O(nL) steps.
Proof Since the feasible domain is compact, Σ_{i=1}^n log x_i has an upper bound M on the feasible domain. Therefore, right before the algorithm terminates, f(x^k) ≥ −L − M. Therefore, by Lemmas 6.7.4 and 6.7.5, the number of iterations is upper-bounded by
(2nL + L + M)/0.1 = O(nL).
Define the polytope of bipartite matchings P_bmatch to be the convex hull of χ_M over all matchings M, that is,
P_bmatch = { Σ_{M∈M} α_M χ_M | α_M ≥ 0, Σ_{M∈M} α_M = 1 },
where M denotes the set of all matchings. For a bipartite graph, P_bmatch is exactly the region defined by
Σ_{e∈δ(v)} x_e ≤ 1 for every v ∈ V1 ∪ V2,
x_e ≥ 0 for every e ∈ E.
Proof First, we show the necessity. Note that for any M ∈ M and v ∈ V1 ∪ V2, Σ_{e∈δ(v)} χ_M(e) ≤ 1. Therefore, for x ∈ P_bmatch,
Σ_{e∈δ(v)} x_e = Σ_{e∈δ(v)} Σ_{M∈M} α_M χ_M(e)
= Σ_{M∈M} α_M Σ_{e∈δ(v)} χ_M(e)
≤ Σ_{M∈M} α_M
≤ 1.
x = xe1 χM1 + (xe2 − xe1 )χM2 + · · · + (xek − xek−1 )χMk + (1 − xek )χ∅ .
dei = 1 if i is odd,
dei = −1 if i is even,
d(e) = 0 if e is not on the cycle C.
Let ε1 be the maximum ε > 0 such that x + εd ≥ 0, and ε2 the maximum ε > 0 such that x − εd ≥ 0. Denote y = x + ε1 d and z = x − ε2 d. Then we have
x = (ε2/(ε1 + ε2)) y + (ε1/(ε1 + ε2)) z.
dei = 1 if i is odd,
dei = −1 if i is even,
d(e) = 0 if e is not on the path.
Note that x_{e1} + x_{e2} ≤ 1 and x_{e2} > 0. Hence, 1 − x_{e1} ≥ x_{e2} > 0. Similarly, 1 − x_{ek} ≥ x_{e_{k−1}} > 0. Let ε1 be the maximum ε > 0 such that x + εd ≥ 0. Then, we must
Corollary 6.8.5 (König Theorem) In any bipartite graph, the cardinality of a maximum matching is equal to the cardinality of a minimum vertex cover.
Proof By LP duality,
max{1⃗^T x | x ∈ P_bmatch}
= max{1⃗^T x | x ≥ 0 and Σ_{e∈δ(v)} x_e ≤ 1 for every v ∈ V}
Theorem 6.8.9 Consider a graph G = (V, E). Let P(G) be the feasible region defined by the following constraints:
Σ_{e∈δ(v)} x_e = 1 for every v ∈ V,
Σ_{e∈δ(U)} x_e ≥ 1 for every U ⊆ V with odd |U|,
x_e ≥ 0 for every e ∈ E.
Then P(G) is exactly the perfect matching polytope of G.
Proof First, it can be verified that for every perfect matching M, χ_M satisfies all constraints; hence the perfect matching polytope is contained in P(G). Conversely, a point x of P(G), restricted to G/U and G/Ū, induces points that can be decomposed as
x′ = (1/m) Σ_{i=1}^m χ_{M′_i}
and
x″ = (1/m) Σ_{i=1}^m χ_{M″_i},
where M′_i and M″_i are perfect matchings in G/U and G/Ū, respectively (they may not be distinct). Since x′ and x″ agree on δ(U) = δ(Ū), we are able to pair up M′_i and M″_i such that M′_i and M″_i use the same edge in δ(U), so that M_i = M′_i ∪ M″_i is a perfect matching in G. Thus,
x = (1/m) Σ_{i=1}^m χ_{M_i}.
min w^T x
subject to Σ_{e∈δ(v)} x_e = 1 for every v ∈ V,
Σ_{e∈δ(U)} x_e ≥ 1 for every U ⊆ V with odd |U|,
x_e ≥ 0 for every e ∈ E.
In this algorithm, we do not need to write down all constraints; instead, we only consider those U in a laminar family.
Definition 6.8.11 (Laminar Family) A family F of sets is a laminar family if for any two sets A, B ∈ F, A ∩ B = ∅ or A ∩ B = A or A ∩ B = B, i.e., if A ∩ B ≠ ∅, then A ⊆ B or B ⊆ A.
Lemma 6.8.12 A laminar family F of distinct sets can have at most 2n − 1 members, where n is the total number of elements.
Proof Suppose first that F contains all singletons, i.e., for every element x, {x} ∈ F. Construct a graph T with node set F in which an edge (A, B) exists if and only if A ⊂ B and there is no third set C ∈ F such that A ⊂ C ⊂ B. Then T is a forest, and a vertex is a leaf if and only if it is a singleton. Moreover, each connected component of T is a rooted tree whose root is a largest set in this component. Since every singleton is in F, every internal node of T has at least two children. Let us give a proof by induction on the number of internal nodes.
For the basis step, consider T without internal nodes. Then the number of nodes in T is n ≤ 2n − 1.
For the induction step, consider T with at least one internal node. Suppose r is the root of a connected component which has at least one internal node. Then r has at least two children. Removing r results in at least two subtrees. Suppose that T1 is one of them and T1 contains n1 leaves. Let T2 be the remaining part after removing T1 and r. Then T2 has n − n1 leaves. By the induction hypothesis, T1 has at most 2n1 − 1 nodes and T2 has at most 2(n − n1) − 1 nodes. Therefore, the number of nodes in T is at most
1 + 2n1 − 1 + 2(n − n1) − 1 = 2n − 1.
If F does not contain all singletons, then we can add the missing singletons to F, which enlarges F. However, the enlarged F can have at most 2n − 1 members.
In this primal-dual algorithm, a matching M is grown through augmentations. Each augmentation is along an alternating path between two free nodes. A node is free if it is not covered by the matching M. To find an augmenting path, we may grow an alternating tree starting from each free node. In this tree, a node is called an even node if it has even distance from the free root; otherwise, it is called an odd node.
In this algorithm, we assume that the input graph G = (V, E) is simple and contains a perfect matching. Then the algorithm consists of five steps as follows:
Step 0. Initially, set M = ∅, Ω = {{v} | v ∈ V}, and yU = 0 for every U ∈ Ω.
Step 1. Let F be the set of free nodes with respect to the matching M, and let
Ey = {e | Σ_{U: e∈δ(U), |U| odd} yU = we}.
Step 2. If the alternating tree rooted at a free node v meets another free node u, then an augmenting path between v and u has been found; perform an augmentation and go back to Step 1.
Step 3. If no blossom can be found, no augmenting path can be found, and no alternating tree can be extended, then for each alternating tree, modify the dual solution by increasing yU for all even nodes U and decreasing yU for all odd nodes U at the same rate, until getting stuck at the boundary of the feasible domain of the dual LP (Fig. 6.4). If the process never gets stuck, then the primal is not feasible. (Since we assume that G contains a perfect matching, this cannot occur.)
Step 4. When the process in Step 3 gets stuck, there are two possibilities. (a) The first possibility is that a constraint Σ_{U: e∈δ(U), |U| odd} yU ≤ we becomes active, i.e., the equality is reached. In this case, add e to Ey and go back to Step 1.
(b) The second possibility is that the constraint yU ≥ 0 becomes active for some U with |U| odd and |U| ≥ 3. In this case, de-shrink the blossom U, add ⌊|U|/2⌋ matching edges back to M, remove U from Ω, and go back to Step 3 (Fig. 6.5). Please note that case (b) cannot occur if every node U is a singleton. Therefore, finally (a) occurs and Ey will be increased.
Next, we analyze this algorithm.
Exercises
1. Linda plans to put $12,000 in investment for two stocks. The history shows that
the first stock earns 6% interests and the second stock earns 8% interests. If
Linda wants to spend the money in the first stock at least twice as much as that
in the second stock, but must not be greater than $9000, then how can she buy
these two stocks in order to maximize her profit?
2. A teacher plans to rent buses from a company for a trip of 200 students. The
company has nine drivers and two types of buses. The first type bus has 50
seats; the rental cost is $800. The second type bus has 40 seats; rental cost is
$600. The company has ten buses of the first type and eight buses of the second
type. What plan can get the lowest total rental cost?
3. Transform the following LP into an equivalent LP in standard form:
min x1 − x2
subject to 2x1 + x2 ≥ 3
3x1 − x3 ≤ 7
x1 ≥ 0.
4. Please formulate the following problem into an LP: Given n lines ai x+bi y = ci
for i = 1, 2, . . . , n in a plane, find a point to minimize the total distance from
the point to these lines.
5. Transform the following problem into an LP:
max z = cx
subject to Ax = b
x ≥ 0,
(b)
max x + y
subject to x + 2y ≤ 10
2x + y ≤ 16
−x + y ≤ 3
x ≥ 0, y ≥ 0.
(c)
(d)
max 3x + cy + 2z
subject to 2x + 4y + 2z ≤ 200
x + 3y + 2z ≤ 100
x, y, z ≥ 0.
(e)
max u + v
subject to −u − 2v − w = 2
3u + v ≤ −1
v ≥ 0, w ≥ 0.
(f)
8. Can you find a simple method to solve the LP with only one equality constraint
as follows?
max cx
subject to Ax = b
x≥0
max w = −ey
subject to Ax + Im y = b
x ≥ 0, y ≥ 0,
min yb
subject to ya_j ≥ c_j − ε_j for j = 1, 2, ..., n
Σ_{j=1}^n ε_j ≤ ε
y ≥ 0, ε_j ≥ 0 for j = 1, 2, ..., n.
max cx
subject to Ax ≤ b
x≥0
max ex
subject to Ax ≤ b
x≥0
max cx
subject to Ax ≤ b
x≥0
max cx
subject to Ax ≤ b̂
x≥0
min cx
subject to b ≤ Ax ≤ b̂
x≥0
and
min cx
subject to Ax + y = b
x≥0
0 ≤ y ≤ b̂ − b.
19. Show that the following LP has either optimal value 0 or no feasible solution:
min cx − yb
subject to Ax ≥ b
yA ≤ c
x ≥ 0, y ≥ 0,
min cx
subject to Ax ≥ cT
x ≥ 0.
21. Using the primal-dual algorithm, please solve the following LP:
22. Consider a pair of primal LP (P) and dual LP (D). Suppose that both (P) and
(D) have feasible solutions. Show that there exist optimal solutions x ∗ and y ∗
for (P) and (D), respectively, such that
if and only if
xe = 1 for every v ∈ V1 ∪ V2 ,
e∈δ(v)
xe ≥ 0 for every e ∈ E.
25. Show that in a bipartite graph, every perfect matching cannot be represented as
a convex combination of other matchings.
26. (Doubly stochastic matrix) An n × n matrix A = (a_ij) is a doubly stochastic matrix if
a_ij ≥ 0 for 1 ≤ i ≤ n, 1 ≤ j ≤ n,
Σ_{j=1}^n a_ij = 1 for 1 ≤ i ≤ n,
Σ_{i=1}^n a_ij = 1 for 1 ≤ j ≤ n.
xe ≥ 0 for every e ∈ E,
30. Show that every d-regular bipartite graph has an edge-coloring with d colors, such that every vertex is incident to all d colors.
31. Let G = (U, V , E) be a bipartite graph. Suppose that there exists a matching
covering U and there exists a matching covering V . Show that a perfect
matching exists.
32. Consider a graph G. Let Pmatch (G) denote the matching polytope of G, i.e.,
Pmatch(G) = conv{χM | M is a matching in G}. Prove that Pmatch(G) ∩ {x | 1⃗^T x = k} is the convex hull of all matchings of size exactly k.
33. Consider a graph G and its perfect matching polytope P = conv{χM | M is a perfect matching of G}. An edge of P is a line segment s = [χM, χM′] between two vertices such that s = H ∩ P for some hyperplane H = {x | w^T x = α} with w^T x ≤ α for all x ∈ P. Prove that [χM, χM′] is an edge if and only if M ⊕ M′ is a single cycle.
Historical Notes
Education today, more than ever before, must see clearly the
dual objectives: education for living and educating for making a
living.
—James Wood-Mason
There are three types of incremental methods: primal, dual, and primal-dual. In Chap. 6, we touched all of them for linear programming (LP). This chapter is devoted to primal-dual methods, further exploring primal-dual techniques with a special interest in the minimum cost flow. Actually, the minimum cost flow is a fundamental optimization problem on networks; the shortest path problem and the assignment problem can be formulated as its special cases. We begin with the study of the assignment problem.
c(M) = Σ_{(u,v)∈M} c(u, v) ≥ Σ_{u∈S} d(u) + Σ_{v∈T} d(v).
min Σ_{u=1}^n Σ_{v=1}^n c(u, v) x_uv
subject to Σ_{v∈T} x_uv = 1 for all u ∈ S,
Σ_{u∈S} x_uv = 1 for all v ∈ T,
x_uv ≥ 0 for all u ∈ S, v ∈ T.
Lemma 7.1.4 After the update, d is still a valid label, Gd still contains M, and Z gets increased by at least one node.
Proof Note that any arc (v, u) with u ∈ S and v ∈ T must belong to M and hence is tight. We have only the following three cases for every arc:
(a) For a tight arc (u, v) with u ∈ S and v ∈ T: if u ∈ Z ∩ S, then we must have v ∈ Z ∩ T; therefore, after the update, (u, v) is still tight. If v ∈ Z ∩ T, then we may not have u ∈ Z, but on (u, v) the validity condition d(u) + d(v) ≤ c(u, v) still holds.
(b) For a tight arc (v, u) with u ∈ S and v ∈ T: if u ∈ Z ∩ S, then u ∉ RS, and hence u can be reached from RS only through v, which implies v ∈ Z; therefore, (v, u) is still tight. If v ∈ Z ∩ T, then we must have u ∈ Z ∩ S, and hence (v, u) stays tight.
(c) For a loose arc (u, v) with u ∈ S and v ∈ T: if u and v are both in Z or both not in Z, then the validity condition clearly holds. If u ∈ Z and v ∉ Z, then by the definition of Δ, the validity condition still holds for (u, v). If u ∉ Z and v ∈ Z, then d(u) + d(v) is decreased by Δ, and hence the validity condition holds.
The above argument shows that after the update, the label is still valid. Moreover, (b) also implies that Gd still contains M. (a) and (b) together imply that every node in Z stays in Z. The definition of Δ implies that one more node gets into Z.
By Lemmas 7.1.3 and 7.1.4, if M is not maximum, then the algorithm can increase either M or Z. Hence, it finally obtains a maximum matching M contained in Gd, i.e., one with minimum cost. A pseudocode of the algorithm is included in Algorithm 22.
t1 t2 t3 t4
w1 1 3 2 4
w2 2 1 4 2
w3 1 4 3 3
w4 4 2 1 3
Let us start with a slightly different initialization. For each worker wi, assign d(wi) = min{c(i, j) | j = 1, 2, 3, 4}. Then for each task tj, assign d(tj) = min{c(i, j) − d(wi) | i = 1, 2, 3, 4}. Then we obtain
0 0 0 1
1 0 2 1 2
1 1 0 3 0
1 0 3 2 1
1 3 1 0 1
where the leftmost column consists of d(wi), the top row consists of d(tj), and the matrix consists of the entries c(i, j) − d(wi) − d(tj). Therefore, an entry 0 in the cost matrix represents a tight edge. Let M be a maximum matching in Gd of tight edges, consisting of the edges marked 0∗. Mark the rows not covered by M; they form RS. Continue to mark rows and columns according to the following rules:
• If a row is marked, then mark every column which intersects the marked row at an entry 0.
• If a column is marked, then mark every row which intersects the marked column at an entry 0.
All marked rows and columns form the set Z. Now, consider all entries located at an intersection of a marked row and an unmarked column. (Each of them is put in parentheses.) Let Δ be the minimum of them, i.e., Δ = 1.
0 0 0 1
1 0∗ (2) (1) (2) ∗
1 1 0∗ 3 0
1 0 (3) (2) (1) ∗
1 3 1 0∗ 1
(A trailing ∗ marks a marked row; column t1 is the marked column.)
Subtract Δ from all marked rows and add Δ to all marked columns. We obtain the following matrix, with a minimum cost assignment marked by 0∗:
−1 0 0 1
2 0∗ (1) (0) (1) ∗
1 2 0∗ 3 0
2 0 (2) (1) (0∗) ∗
1 4 1 0∗ 1
Note that such an operation on the cost table does not change the optimality of an assignment. Therefore, we may use such operations without mentioning the label d at all. This induces a local ratio algorithm. Actually, the primal-dual method and the local ratio method have a very close relationship: many primal-dual algorithms have equivalent local ratio companions, and vice versa.
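For the 4 × 4 example above, the optimal assignment can be checked mechanically; the sketch below uses scipy.optimize.linear_sum_assignment (a tool choice made here for illustration).

import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i][j] = cost of assigning worker w_{i+1} to task t_{j+1}
cost = np.array([[1, 3, 2, 4],
                 [2, 1, 4, 2],
                 [1, 4, 3, 3],
                 [4, 2, 1, 3]])
rows, cols = linear_sum_assignment(cost)
print(list(zip(rows, cols)), cost[rows, cols].sum())
# expected optimal cost 6, e.g. w1->t1, w2->t2, w3->t4, w4->t3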
The Hungarian algorithm can also be used to solve the following problem:
7.2 Label-Correcting
In this section, we introduce the label-correcting algorithm for the shortest path problem. This is another view of the Bellman-Ford algorithm, which allows negative arc costs; the only restriction is that there be no negative cost cycle. The disadvantage is that the running time is slow. However, it induces a faster algorithm for all-pairs shortest paths.
Consider a directed network G = (V , E) with arc cost c : E → R, an origin
node s and a destination node t. The aim of the problem is to find a shortest path
from s to t.
Let us start to introduce the label-correcting algorithm by defining the node label.
A node label d : V → R is said to be valid if it satisfies the following conditions:
(a1) d(s) = 0.
(a2) d(v) ≤ d(u) + c(u, v) for any (u, v) ∈ E and v = s.
Let d ∗ (v) denote the cost of the shortest path from s to v. Then the valid node
label has the following properties:
Lemma 7.2.1 For any valid node label d(v), d(v) ≤ d ∗ (v).
Proof The proof is by induction on the number of arcs on the shortest path from s
to v, denoted by d̂(v). For d̂(v) = 0, we have v = s and hence d(s) = 0 = d ∗ (s).
For d̂(v) = k > 0, consider a shortest path P from s to v. Suppose arc (u, v) is on P. Then d̂(v) = d̂(u) + 1. By the induction hypothesis, d(u) ≤ d*(u). Therefore,
d(v) ≤ d(u) + c(u, v) ≤ d ∗ (u) + c(u, v) = d ∗ (v).
Lemma 7.2.2 Let d(·) be a valid node label. If d(v) is the cost of some path from s
to v, then d(v) = d ∗ (v) and vice versa.
Proof By Lemma 7.2.1, d(v) ≤ d ∗ (v). Since d(v) is the cost of some path from s
to v, we have d ∗ (v) ≤ d(v). Hence, d(v) = d ∗ (v).
From above two lemmas, we can see easily that the valid label can play a role of
dual solution. We now describe the label-correcting algorithm in Algorithm 23.
Algorithm 23 Label-correcting
Input: A directed network G = (V, E) with arc cost c, a start node s, and a destination node t.
Output: A shortest path from s to t.
1: d(s) ← 0; pred(s) ← ∅;
2: d(v) ← ∞ for v ∈ V \ {s};
3: QUEUE Q ← {s};
4: while Q ≠ ∅ do
5: remove a node u from Q
6: if d(u) < −nC where C = max{c(u, v) | (u, v) ∈ E} then
7: stop algorithm and output "negative cost cycle exists"
8: end if
9: for each (u, v) ∈ E do
10: if d(v) > d(u) + c(u, v) then
11: d(v) ← d(u) + c(u, v)
12: pred(v) ← u
13: add v to Q
14: end if
15: end for
16: end while
17: return d(t) together with a path (s, u1, ..., uk, t) where uk = pred(t), uk−1 = pred(uk), ..., s = pred(u1)
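A direct Python transcription of Algorithm 23 (FIFO-queue variant with a simple duplicate guard; the data layout is an illustrative assumption):

from collections import deque

def label_correcting(n, edges, s, t):
    """Label-correcting shortest path; edges: list of (u, v, cost).
    Returns (d(t), path) or raises on a negative cost cycle."""
    INF = float("inf")
    C = max(abs(cst) for _, _, cst in edges)   # |c| keeps the bound meaningful
    d = [INF] * n
    pred = [None] * n
    d[s] = 0
    out = {u: [] for u in range(n)}
    for u, v, cst in edges:
        out[u].append((v, cst))
    Q = deque([s])
    in_q = [False] * n
    in_q[s] = True
    while Q:
        u = Q.popleft()
        in_q[u] = False
        if d[u] < -n * C:
            raise ValueError("negative cost cycle exists")
        for v, cst in out[u]:
            if d[v] > d[u] + cst:       # label correction step (line 10)
                d[v] = d[u] + cst
                pred[v] = u
                if not in_q[v]:
                    Q.append(v)
                    in_q[v] = True
    path, x = [], t
    while x is not None:                # walk pred() back from t to s
        path.append(x)
        x = pred[x]
    return d[t], path[::-1]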
Theorem 7.2.4 If the network G does not contain a negative cost cycle, then the
label-correcting algorithm finds the shortest path from s to t within O(mn) time
where m = |E| and n = |V |.
Proof When the algorithm stops, Q is empty. This means that d(·) will not be
further updated. Hence, d(·) will be a valid label and meanwhile, d(v) is the cost of
a path from s to v for every v ∈ V . By Lemma 7.2.2, d(t) is the cost of the shortest
path from s to t.
Now, we look at the tree consisting of the arcs (pred(v), v) for v ∈ V \ {s} at the end of the algorithm. This tree is constructed according to a breadth-first search principle in the computation. Its depth is at most n − 1, and at each level, the computation checks each arc at most once at line 10, so each level costs O(m) time in total. Therefore, the algorithm has running time O(mn).
If we replace the queue Q by a stack in the label-correcting algorithm, then what
happens to the modified algorithm? The shortest path tree would be built up in a
way similar to the depth-first search style. The running time is still O(mn) with a
little harder analysis.
Now, let us explain why this is an algorithm of primal-dual type. As we mentioned, the valid label plays a dual role. The property that, for each v, d(v) is the cost of an actual path from s to v plays a primal role. The label-correcting algorithm is an incremental method on the primal side; the incremental direction is guided by the dual feasibility conditions.
Next, we present an application of the label-correcting algorithm.
At the end of the label-correcting algorithm, it outputs a label d(v) satisfying d(v) ≤ d(u) + c(u, v) for every arc (u, v) ∈ E. Define c′(u, v) = c(u, v) + d(u) − d(v). Then c′(u, v) ≥ 0 for every arc (u, v) ∈ E. An application is motivated by the following property:
Lemma 7.2.5 For any pair of nodes x, y ∈ V, a path from x to y is shortest for arc cost c′ if and only if it is shortest for arc cost c.
Proof Consider a path P = (x = x0, x1, ..., xk = y). Then the c′-cost of P telescopes to c′(P) = c(P) + d(x) − d(y); since the shift d(x) − d(y) does not depend on P, the two costs order all x-y paths identically.
Theorem 7.2.6 The above algorithm computes all-pairs shortest paths within O(mn + n² log n) time.
Proof Note that Dijkstra's algorithm runs in O(m + n log n) time on networks with nonnegative arc costs.
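The scheme just described (one label-correcting/Bellman-Ford pass to obtain d, reweighting by c′(u, v) = c(u, v) + d(u) − d(v), then Dijkstra from every node) is commonly known as Johnson's algorithm. A compact Python sketch, with a binary-heap Dijkstra and illustrative names:

import heapq

def johnson(n, edges):
    """All-pairs shortest paths on a digraph with negative arcs allowed
    (no negative cycles). edges: list of (u, v, cost)."""
    # 1) virtual source n connected to all nodes with cost 0; Bellman-Ford
    aug = edges + [(n, v, 0) for v in range(n)]
    d = [0] * (n + 1)
    for _ in range(n):
        for u, v, c in aug:
            if d[u] + c < d[v]:
                d[v] = d[u] + c
    # 2) reduced costs c'(u, v) = c(u, v) + d(u) - d(v) >= 0
    out = {u: [] for u in range(n)}
    for u, v, c in edges:
        out[u].append((v, c + d[u] - d[v]))
    # 3) Dijkstra from every node under c', then undo the shift
    dist = [[float("inf")] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        pq = [(0, s)]
        while pq:
            du, u = heapq.heappop(pq)
            if du > dist[s][u]:
                continue
            for v, c in out[u]:
                if du + c < dist[s][v]:
                    dist[s][v] = du + c
                    heapq.heappush(pq, (du + c, v))
        for v in range(n):
            dist[s][v] += d[v] - d[s]   # convert back to original costs
    return dist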
induce an arc (y, x) with capacity u(y, x) = f(x, y) and cost a(y, x) = −a(x, y). This is because, if we make an adjustment reducing the flow f(x, y), then the cost is reduced; this is equivalent to sending flow along (y, x) in the residual graph with cost −a(x, y). This new arc (y, x) may not merge with an original arc (y, x), because they may have different costs. This is a troublemaker in dealing with the residual graph.
Before designing an algorithm, let us first establish an optimality condition.
Lemma 7.3.3 (Optimality Condition) A maximum flow f has the minimum cost
if and only if its residual graph Gf does not contain a negative cost cycle.
Proof If Gf contains a negative cost cycle, then the cost can be reduced by adding a flow along this cycle. Next, assume that Gf for a maximum flow f does not contain a negative cost cycle. We show that f has the minimum cost. For contradiction, assume that f does not reach the minimum cost, so that its cost is larger than the cost of a minimum cost maximum flow f′. Note that every flow can be decomposed into a disjoint union of several path flows. This fact implies that f contains a path flow P that has cost larger than the cost of a path flow P′ in f′. Let P̂ be obtained from P by reversing its direction. Then P̂ ∪ P′ forms a negative cost cycle, which may be decomposed into several simple cycles, and one of them must also have negative cost. This simple cycle must be contained in Gf, a contradiction.
This optimality condition suggests an algorithm, shown in Algorithm 24. In this algorithm, a maximum flow is produced first, and then the Bellman-Ford algorithm is used to determine whether a negative cost cycle exists. If a negative cost cycle exists, then a flow is sent along it to reduce the cost; the new residual graph will have at least one arc on the cycle whose flow is cancelled. If no negative cost cycle exists, then the optimal solution has been found.
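A sketch of the cycle-cancelling loop in Python (negative cycle detection by Bellman-Ford with predecessor walking; residual bookkeeping simplified, and it assumes at most one of (u, v), (v, u) is an original arc; all names are illustrative):

def cancel_negative_cycles(nodes, cap, cost, flow):
    """Repeatedly find a negative cost cycle in the residual graph and
    saturate it. cap/cost/flow: dicts keyed by arc (u, v)."""
    def residual_arcs():
        for (u, v) in cap:
            if flow[(u, v)] < cap[(u, v)]:
                yield u, v, cost[(u, v)]          # forward residual arc
            if flow[(u, v)] > 0:
                yield v, u, -cost[(u, v)]         # reverse residual arc
    while True:
        d = {v: 0 for v in nodes}   # zero-init detects any negative cycle
        pred = {v: None for v in nodes}
        x = None
        for _ in range(len(nodes)):
            x = None
            for u, v, c in residual_arcs():
                if d[u] + c < d[v]:
                    d[v] = d[u] + c
                    pred[v] = u
                    x = v
        if x is None:
            return flow             # no negative cycle: minimum cost reached
        for _ in range(len(nodes)): # step back onto the cycle itself
            x = pred[x]
        cycle, v = [], x            # recover the cycle as residual arcs
        while True:
            cycle.append((pred[v], v))
            v = pred[v]
            if v == x:
                break
        delta = min(cap[a] - flow[a] if a in cap else flow[(a[1], a[0])]
                    for a in cycle)
        for (u, v) in cycle:        # push delta along the cycle
            if (u, v) in cap:
                flow[(u, v)] += delta
            else:
                flow[(v, u)] -= delta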
Theorem 7.3.4 Suppose every arc has an integral capacity and an integral
cost. Then Algorithm 24 terminates in at most O(mU C) iterations and runs in
O(m2 nU C) time where U is an upper bound for arc capacity and C is an upper
bound for arc cost.
Proof Note that every flow has cost upper-bounded by mUC. When every arc capacity is an integer, the maximum flow obtained by the Edmonds-Karp algorithm has an integral value on every arc. Since every arc cost is an integer, each iteration of cycle cancelling reduces the total cost by at least one. Therefore, the algorithm terminates within at most mUC iterations. Moreover, the Edmonds-Karp algorithm runs in O(m²n) time and the Bellman-Ford algorithm runs in O(mn) time. Therefore, the total running time of Algorithm 24 is O(m²nUC).
Clearly, cycle cancelling is a primal algorithm. To introduce the dual solution, let us define a label on nodes π : V → R, called the node potential.
Lemma 7.3.5 A maximum flow f has the minimum cost if and only if there exists a node potential π such that for every arc (x, y) in Gf, a(x, y) ≥ π(x) − π(y).
Proof For sufficiency, consider any cycle (x1, x2, ..., xk, x1) in Gf. We have
Σ_i a(x_i, x_{i+1}) ≥ Σ_i (π(x_i) − π(x_{i+1})) = 0,
that is, no negative cost cycle exists. Therefore, f has the minimum cost.
For necessity, suppose f has the minimum cost. Then Gf has no negative cost cycle. Therefore, consider a(x, y) as the length of arc (x, y). Using the Bellman-Ford algorithm, we can compute the distance d(x) from source s to each node x. Define π = −d. Then −π(y) ≤ −π(x) + a(x, y), i.e., a(x, y) ≥ π(x) − π(y), for any arc (x, y) in Gf.
The condition in Lemma 7.3.5 is called dual feasibility. The potential π plays the role of a dual solution, like the label in the Hungarian algorithm. Denote a^π(x, y) = a(x, y) − π(x) + π(y), which is called the reduced arc cost. Next, we show some properties of π.
Lemma 7.3.6 Let π be a dual-feasible node potential for the residual graph Gf of a flow f. Consider the reduced arc cost a^π(x, y). Let f′ be obtained from f through an augmentation along a shortest path from source s to sink t, and denote by d(x) the shortest distance from s to node x with respect to a^π. Then π′ = π − d is a dual-feasible node potential for the residual graph Gf′.
Proof Since π is dual-feasible for Gf, we have a^π(x, y) = a(x, y) − π(x) + π(y) ≥ 0 for every arc (x, y) in Gf. Moreover, since d(x) is the shortest distance from s to x with respect to the reduced arc cost a^π, we have d(y) ≤ d(x) + a^π(x, y) for any arc (x, y) in Gf. Therefore, for any arc (x, y) in Gf,
a(x, y) − π′(x) + π′(y) = a(x, y) − (π(x) − d(x)) + (π(y) − d(y)) = a^π(x, y) + d(x) − d(y) ≥ 0.
For an arc (x, y) on the shortest path from s to t, we have d(y) = d(x) + a^π(x, y); thus a^{π′}(x, y) = 0. Note that a new arc appearing in Gf′ can occur only as the reverse of an arc on the augmenting path in Gf. However, since
a^{π′}(x, y) = 0 for any arc (x, y) on this path, we have a^{π′}(y, x) = 0 for its reverse (y, x). Therefore, π′ is a dual-feasible node potential for Gf′.
Lemmas 7.3.5 and 7.3.6 suggest an algorithm as shown in Algorithm 25.
Theorem 7.3.7 Suppose every arc has integer capacity and integer cost. Then
Algorithm 25 terminates in at most O(nU ) iterations and runs in O(U mn log n)
time where U is an upper bound for arc capacity.
Proof Note that the maximum flow has value at most O(nU ). Each iteration will
increase flow value by at least one. Therefore, there are at most O(nU ) iterations.
In each iteration, since a π (x, y) ≥ 0 for every arc (x, y) in Gf , Dijkstra algorithm
can be employed to find the shortest path and compute d(x) for every node x in
Gf within O((m + n) log n) = O(m log n) time. Updating π will take O(n) time.
Putting all together, the total running time is O(U mn log n).
Neither cycle cancelling nor the successive shortest path algorithm runs in polynomial time.
Lemma 7.4.2 A circulation f is optimal if and only if its residual graph Gf does
not have a negative cost cycle.
Proof For necessity, suppose that Gf has a negative cost cycle. Then, this cycle can
be added to f , resulting in a circulation with cost less than the cost of f .
For sufficiency, suppose that Gf does not have a negative cost cycle. For
contradiction, assume that f is not optimal. Let f ∗ be a minimum cost circulation.
Then f* − f is a circulation in the residual graph Gf. Since the cost of f* − f is negative, its decomposition must contain a negative cost cycle, contradicting the assumption on Gf.
Lemma 7.4.3 The minimum circulation problem is equivalent to the minimum cost
maximum flow problem with possibly negative arc cost.
Proof To reduce minimum cost circulation to minimum cost maximum flow, add a source s and a sink t to the input network G of the minimum cost circulation problem, without connecting them to G. Then the maximum flow value is 0, and the minimum cost of a maximum flow is exactly the minimum cost of a circulation.
To reduce minimum cost maximum flow to minimum cost circulation, consider an input network G with source s and sink t for the minimum cost maximum flow problem. Add an arc (t, s) with cost −(1 + nC) and capacity nU, where C is the maximum arc cost and U is the maximum arc capacity in G; denote by G′ the obtained network. Then a minimum cost maximum flow f in G is turned into a circulation f′ of G′ passing through (t, s). Note that Gf does not contain any negative cost cycle; therefore, G′_{f′} does not contain any negative cost cycle avoiding (t, s). Moreover, Gf does not contain a path from s to t; therefore, G′_{f′} does not contain a cycle passing through (t, s). Therefore, the minimum cost circulation in G′ is equivalent to the minimum cost maximum flow in G.
Consider a node potential π(v) for each node v ∈ V. The node potential π(·) is said to be dual-feasible if π(v) ≤ π(u) + a(u, v) for every arc (u, v) ∈ E. Define a^π(u, v) = a(u, v) − π(u) + π(v); a^π is called the cost reduced by node potential π.

Lemma 7.4.4 Every circulation has the same cost under cost a(·, ·) and reduced cost a^π(·, ·).
Proof Note that every circulation can be decomposed into cycles. When the reduced cost a^π(·, ·) is applied to each cycle, the node potential terms cancel out: for every cycle Q,

Σ_{e∈Q} a^π(e) = Σ_{e∈Q} a(e).
The scaling technique can also be applied to the arc cost. Let G^(k) be obtained from G by giving sign(a(x, y)) · ⌊|a(x, y)|/2^k⌋ as the cost of each arc (x, y), where

sign(x) = 1 if x > 0, 0 if x = 0, and −1 if x < 0.

Clearly, G^(0) = G. Let L = ⌈log₂(C + 1)⌉, where C is the maximum arc cost. Then every arc in G^(L) has cost 0. Therefore, every circulation for G^(L) has the minimum cost 0. Without loss of generality, we may choose f^(L) = 0 as the minimum cost circulation for G^(L). Let 2G^(k) denote the network obtained from G^(k) by doubling the cost of every arc. Then the minimum cost circulation f^(k) for G^(k) is also a minimum cost circulation for 2G^(k). Note that G^(k−1) can be obtained from 2G^(k) by adding one to some positive arc costs and subtracting one from some negative arc costs. Adding one to an arc cost does not produce a negative cost cycle; hence, the minimum cost circulation is unchanged. Next, we study the case of subtracting one from a negative arc cost.
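The scaled costs themselves are easy to compute. The following small Python sketch (our own illustration, with hypothetical names) builds the arc costs of G^(k) and checks the two boundary cases G^(L) and G^(0):

from math import ceil, log2

def sign(x):
    # sign(x) = 1 if x > 0, 0 if x = 0, -1 if x < 0, as in the text
    return (x > 0) - (x < 0)

def scaled_costs(costs, k):
    # Arc costs of G^(k): sign(a) * floor(|a| / 2^k) for each arc.
    return {arc: sign(a) * (abs(a) >> k) for arc, a in costs.items()}

costs = {('u', 'v'): 7, ('v', 'w'): -3}
C = max(abs(a) for a in costs.values())
L = ceil(log2(C + 1))
assert all(a == 0 for a in scaled_costs(costs, L).values())  # G^(L) is all-zero
assert scaled_costs(costs, 0) == costs                       # G^(0) = G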
Consider a network G = (V, E) and a network G′ which is obtained from G by decreasing the cost of an arc (x, y) by one. Let f be a minimum cost circulation of G. We describe how to obtain a minimum cost circulation f′ of G′. Consider two cases.
Case 1. Gf does not contain a negative cost cycle. Then f′ = f.
Case 2. Gf contains a negative cycle Q. Clearly, Q contains arc (x, y). Without loss of generality, assume that for every arc e ∈ Q, the reduced cost a^π(e) = 0, and for every arc e in Gf, a^π(e) ≥ 0, where π is a node potential. Let H be the subgraph of Gf consisting of all arcs with reduced cost 0. Add a new node x′ and an arc (x′, y) to H, and set capacity c(x′, y) = c(x, y). Denote by H′ the obtained graph. Find a maximum flow f_{x′x} from x′ to x in H′. Merging node x′ into x, the flow f_{x′x} becomes a circulation f_x. We claim that f′ = f + f_x is the minimum cost circulation for G′.
To show our claim, we update the node potential by setting

π′(v) = π(v) + 1, if there exists a path from y to v such that every arc e on the path has a^π(e) = 0;
π′(v) = π(v), otherwise.
In the following, we show that a^{π′}(e) ≥ 0 for every arc e in Gf′.
If (x, y) is in Gf′, then we must have |f_{x′x}| < c(x′, y) = c(x, y). Hence, there is a cut between y and x in H′_{f_{x′x}}, which implies that π′(x) = π(x). Moreover, note that π′(y) = π(y) + 1. Therefore, the reduced cost a^{π′}(x, y) = 0 after the cost of (x, y) is reduced by one.
If (u, v) is in Gf with π′(u) = π(u) + 1 and π′(v) = π(v), then we must have a^π(u, v) ≥ 1. Hence, a^{π′}(u, v) ≥ 0.
If (u, v) is in Gf′ but not in Gf, then its reverse (v, u) carries flow of f_x, so a^π(v, u) = 0 and both u and v are reachable from y along arcs of reduced cost 0; hence π′(u) − π(u) = π′(v) − π(v) and a^{π′}(u, v) = a^π(u, v) = 0.
For any arc e other than the above three possibilities, we have a^{π′}(e) ≥ a^π(e) ≥ 0.
Theorem 7.5.1 The cost scaling algorithm computes a minimum cost circulation in O(m³n log C) time, where C is the largest arc cost.

Proof Note that G^(k−1) can be obtained from 2G^(k) by adding one to some positive arc costs and subtracting one from some negative arc costs. For each subtraction, we may need to compute a maximum flow of the subgraph H′ by the Edmonds–Karp algorithm in O(nm²) time. Since there are at most m subtractions at each of the O(log C) levels, the total running time is O(nm³ log C).
μ(Q) = (1/k) Σ_{i=1}^{k} a(e_i),

i.e.,

min_{v∈V} max_{0≤k≤n−1} (δ_n(s, v) − δ_k(s, v)) / (n − k) ≥ 0.

max_{0≤k≤n−1} (δ_n(s, v) − δ_k(s, v)) / (n − k) = 0.
Lemma 7.6.2

μ∗ = min_{v∈V} max_{0≤k≤n−1} (δ_n(s, v) − δ_k(s, v)) / (n − k).

Proof Let μ∗(a) denote the μ∗ for arc cost a. For every arc e, define a new arc cost a′(e) = a(e) − μ∗(a). Then, for every cycle Q, the mean-cost of Q is reduced by μ∗(a), and hence

min_{v∈V} max_{0≤k≤n−1} (δ_n(s, v) − δ_k(s, v)) / (n − k) − μ∗(a) = 0.
The following algorithm for computing the minimum cycle mean-cost is based on the characterization in Lemma 7.6.2 (a code sketch is given after Theorem 7.6.3). Clearly, this algorithm runs in O(mn) time. Hence, we have the following:
Theorem 7.6.3 Karp’s algorithm computes the minimum cycle mean-cost in
O(mn) time.
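A direct Python transcription of this characterization may look as follows (a sketch under our own naming conventions; it assumes every node is reachable from the start node s, e.g., G is strongly connected):

def min_cycle_mean(n, arcs, s=0):
    # Karp's algorithm: minimum mean cost over all cycles, or None if acyclic.
    # arcs is a list of (u, v, cost) with nodes numbered 0..n-1.
    INF = float('inf')
    # delta[k][v] = minimum cost of a walk from s to v using exactly k arcs.
    delta = [[INF] * n for _ in range(n + 1)]
    delta[0][s] = 0
    for k in range(1, n + 1):
        for u, v, c in arcs:
            if delta[k - 1][u] < INF:
                delta[k][v] = min(delta[k][v], delta[k - 1][u] + c)
    best = None
    for v in range(n):
        if delta[n][v] == INF:
            continue
        worst = max((delta[n][v] - delta[k][v]) / (n - k)
                    for k in range(n) if delta[k][v] < INF)
        best = worst if best is None else min(best, worst)
    return best

For example, min_cycle_mean(3, [(0, 1, 1), (1, 2, 2), (2, 0, 3)]) returns 2.0, the mean cost (1 + 2 + 3)/3 of the unique cycle.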
A circulation f is ε-optimal if and only if there exists a node potential π such that for every arc e in G, a^π(e) ≥ −ε. The minimum cycle mean-cost has a close relationship with the ε-optimality of circulations.

Lemma 7.6.4 f is an ε-optimal circulation if and only if −ε ≤ μ∗(Gf, a), where μ∗(Gf, a) is the minimum cycle mean-cost of the residual graph Gf with respect to arc cost a.
Proof First, suppose f is ε-optimal. Then there exists a node potential π such that for every arc e in Gf, a^π(e) ≥ −ε. For every cycle Q in Gf, we have

μ(Q) = (Σ_{e∈Q} a(e)) / |Q| = (Σ_{e∈Q} a^π(e)) / |Q| ≥ −ε.

Conversely, suppose −ε ≤ μ∗(Gf, a). Define a new arc cost a′(e) = a(e) + ε. With this new cost, μ∗(Gf, a′) = μ∗(Gf, a) + ε ≥ 0. It follows that f is the minimum cost circulation for this new cost. Therefore, there exists a node potential π such that for every arc e in Gf, (a′)^π(e) ≥ 0, i.e., a^π(e) = (a′)^π(e) − ε ≥ −ε. Therefore, f is ε-optimal.
However, all arc costs are integers. Therefore, Σ_{e∈Q} a^π(e) ≥ 0, i.e., f is optimal.
Proof After the iteration, every cycle Q in Gf′ contains at least one arc with nonnegative reduced cost, and hence its cost is at least (|Q| − 1)μ∗(Gf, a). Therefore,

μ(Q) ≥ (1 − 1/|Q|) μ∗(Gf, a) ≥ (1 − 1/n) μ∗(Gf, a).

Hence,

μ∗(Gf′, a) ≥ (1 − 1/n) μ∗(Gf, a).
Lemma 7.6.7 Let π∗ be the optimal node potential and f∗ the corresponding minimum cost circulation. If a^{π∗}(e) > nε > 0 for an arc e, then for any ε-optimal circulation f, f(e) = f∗(e) ≤ 0.

Proof For contradiction, assume f∗(e) > 0. Then the reverse ē of e is in Gf∗ and a^{π∗}(ē) < −nε < 0, contradicting the optimality of f∗.
For contradiction, first assume f(e) > f∗(e). Consider f ⊕ f∗, which can be decomposed into a union of cycles. Let Q be the cycle containing arc e. Then for every arc e′ in Q, f(e′) > f∗(e′), and hence e′ is in Gf∗ with a^{π∗}(e′) ≥ 0. Therefore, a(Q) ≥ a^{π∗}(e) > nε. Let Q̄ be the reverse of Q. Then a(Q̄) < −nε. Note that every arc in Q̄ is in Gf. Since f is ε-optimal, every cycle in Gf has cost at least −nε, a contradiction.
Next, assume f(e) < f∗(e). Let ē be the reverse of e. Then f(ē) > f∗(ē), and hence ē is in Gf∗. It follows that a^{π∗}(ē) ≥ 0. Thus, a^{π∗}(e) = −a^{π∗}(ē) ≤ 0, contradicting a^{π∗}(e) > 0.
∗
Corollary 7.6.8 If a π (e) < −nε < 0 for an arc e, then for any ε-optimal
circulation f , f (e) = f ∗ (e) ≥ 0.
∗
Proof Consider the reverse ē of e. Then a π (ē) > nε. This corollary follows
immediately by applying Lemma 7.6.7 to ē.
By Lemma 7.6.7 and its corollary, if |a^{π∗}(e)| > nε > 0, then for any ε-optimal circulation f, f(e) = f∗(e). Such an arc e is called ε-fixed.

Lemma 7.6.9 Let −nε = μ∗(Gf, a) < 0. Then Gf contains an arc e such that e is not nε-fixed, but is ε′-fixed for any ε′ < ε.
Proof Consider the minimum mean-cost cycle Q in Gf. Then f is nε-optimal. Note that another nε-optimal circulation can be obtained by augmenting along Q. These two nε-optimal circulations have different values at every arc in Q. Therefore, every arc e in Q is not nε-fixed.
However, since |a^{π∗}(Q)| = |a(Q)| = nε × |Q|, Q must contain an arc e such that |a^{π∗}(e)| ≥ nε, i.e., e is ε′-fixed for any ε′ < ε.
Now, we are ready to give the running time of the above strongly polynomial-time algorithm.

Theorem 7.6.10 The above algorithm computes a minimum cost circulation in O(m³n log n) time.

Proof Suppose f′ is the circulation obtained from f through n ln n iterations. By Lemma 7.6.6,

μ∗(Gf′, a) ≥ (1 − 1/n)^{n ln n} μ∗(Gf, a) ≥ (1/n) μ∗(Gf, a).

By Lemma 7.6.9, after O(n ln n) iterations, a new arc becomes fixed. When all arcs with positive reduced cost a^{π∗}(·) are fixed, the obtained circulation f reaches the minimum cost, and hence μ∗(Gf, a) ≥ 0, i.e., the algorithm terminates. Note that each iteration contains O(m) augmentations and each augmentation can be done in O(m) time. Therefore, the total running time is O(m³n log n).
Exercises
1. Show that the shortest path problem is a special case of the minimum cost flow
problem.
2. Show that the assignment problem can be formulated as a minimum cost flow
problem.
3. Show that the maximum flow problem is a special case of the minimum cost
flow problem.
4. Suppose that the residual graph Gf of flow f does not contain a negative cost cycle. Let the flow f′ be obtained from f through an augmentation along a minimum cost path from s to t in Gf. Show that Gf′ does not contain a negative cost cycle.
5. An edge cover C of a graph G = (V , E) is a subset of edges such that every
vertex is incident to an edge in C. Design a polynomial-time algorithm to find
the minimum edge cover, i.e., an edge cover with minimum cardinality.
6. (König theorem) Show that the minimum size of vertex cover is equal to the
maximum size of matching in bipartite graph.
7. Show that the vertex cover problem in bipartite graphs can be solved in
polynomial-time.
8. A matrix with all entries being 0 or 1 is called a 0-1 matrix. Consider a positive integer d and a 0-1 matrix M in which each row contains exactly two 1s. Please design an algorithm to find a minimum number of rows forming a submatrix such that for every d + 1 columns C₀, C₁, . . . , C_d, there exists a row at which C₀ has entry 1 but all of C₁, . . . , C_d have entry 0 (such a matrix is called a d-disjunct matrix).
9. Design a cycle cancelling algorithm for the Chinese postman problem.
10. Design a cycle cancelling algorithm for the minimum spanning tree problem.
11. Consider a graph G = (V, E) with nonnegative edge distance d(e) for e ∈ E. There are m source nodes s1, s2, . . . , sm and n sink nodes t1, t2, . . . , tn. Suppose these sources are required to provide those sink nodes with a certain type of product. Suppose that si is required to provide ai products and tj requires bj products. Assume Σ_{i=1}^m ai = Σ_{j=1}^n bj. The target is to find a transportation plan to minimize the total cost, where on each edge the cost is the product of the distance and the amount of products passing through the edge. Show that
a transportation plan is minimum if and only if there is no cycle such that the
total distance of unloaded edges is less than the total distance of loaded edges.
12. Consider m sources s1, s2, . . . , sm and n sinks t1, t2, . . . , tn. These sources are required to provide those sink nodes with a certain type of product. si is required to provide ai products and tj requires bj products. Assume Σ_{i=1}^m ai = Σ_{j=1}^n bj. Given a distance table (d_ij) between sources si and sinks tj, the target is to find a transportation plan to minimize the total cost, where on each edge the cost is the product of the distance and the amount of products passing through the edge. Show that a transportation plan is minimum if and only if there is no circuit [(i1, j1), (i2, j1), (i2, j2), . . . , (i1, jk)] such that (i1, j1), (i2, j2), . . . , (ik, jk) are loaded, (i2, j1), (i3, j2), . . . , (i1, jk) are unloaded, and Σ_{h=1}^k d(i_h, j_h) > Σ_{h=1}^k d(i_h, j_{h−1}) (with j0 = jk). Here, (i, j) is said to be loaded if there is at least one product transported from si to tj.
13. An input for the minimum cost circulation problem consists of a network G
with arc capacity c(u, v) and arc cost a(u, v). Suppose π is a dual-feasible node potential witnessing an optimal solution f, i.e., a^π(u, v) ≥ 0 for every arc (u, v) in Gf. Show that for every arc (u, v) in G, we have the following:
(a) a^π(u, v) < 0 ⇒ f(u, v) = c(u, v).
(b) f(u, v) > 0 ⇒ a^π(u, v) ≤ 0.
14. Show that the cost of an ε-optimal circulation is no more than |E|εU from the optimal, where |E| is the number of arcs and U is the maximum arc capacity.
15. Consider a strongly connected directed graph G = (V, E) with arc cost. Let v ∈ V be a node attaining

min_{v∈V} max_{0≤k≤n−1} (δ_n(s, v) − δ_k(s, v)) / (n − k).

Then, every cycle on the path from s to v with length n has the minimum mean-cost, where s is the start node.
16. Show that for two different circulations f and f′, if f(e) > f′(e) for an arc e, then e lies in Gf′.
17. Show that the strongly polynomial-time algorithm in Sect. 7.6 has running time
also upper-bounded by O(m2 n log(nC)) where m is the number of arcs, n is
the number of nodes, and C is the maximum arc cost.
18. Design a primal-dual algorithm for the minimum weight arborescence problem.
Historical Notes
The Hungarian method for the assignment problem was published in 1955 by Harold Kuhn [267, 268]. He gave the name "Hungarian method" because his work is based on contributions of two Hungarian mathematicians: Dénes König and Jenő Egerváry. Since then, a sequence of research efforts has been made on this algorithm [123, 232, 322, 377].
For the shortest path problem and its variations, the label-correcting algorithm has some advantages, so that it appears in the literature quite often [27, 190, 234, 369, 439].
Actually, both the assignment problem and the shortest path problem are special
cases of the minimum cost flow problem. The minimum cost flow is a fundamental
problem in the study of network flows. It has many applications. Especially, several
classic network optimization problems, such as the assignment and the shortest path,
can be formulated as its special cases.
The minimum cost maximum flow problem was studied as a follow-up to the maximum flow problem. Similarly, earlier algorithms run in pseudopolynomial time, such as the out-of-kilter algorithm [151], cheapest path augmentation [39], cycle cancelling [257], cut cancelling [129, 205], minimum mean cancelling [176], and successive shortest path [125]. Polynomial-time algorithms were found later, such as sped-up successive shortest path, the linear programming approach [332], capacity scaling [123], and cost scaling [177]. Strongly polynomial-time algorithms were obtained much later by Tardos [373], Goldberg and Tarjan [176, 177], etc. In the study of strongly polynomial-time algorithms, Karp's algorithm [238] for the minimum cycle mean problem [48] plays an important role. Currently, the fastest strongly polynomial-time algorithm has running time O(|E|² log² |V|). This record is also kept by an algorithm of Orlin [332].
Chapter 8
NP-Hard Problems and Approximation
Algorithms
The biggest difference between time and space is that you can’t
reuse time.
—Merrick Furst
The class P consists of all polynomial-time solvable decision problems. What is the
class NP? There are two popular misunderstandings:
(1) NP is the class of problems which are not polynomial-time solvable.
(2) A decision problem belongs to the class NP if its answer can be checked in
polynomial-time.
The misunderstanding (1) comes from an incorrect reading of NP as the brief name for “not polynomial-time solvable.” Actually, every problem in NP is polynomial-time solvable, but in a wider sense of computation, nondeterministic computation; that is, NP is the class of all nondeterministic polynomial-time solvable decision problems. Thus, NP is the brief name of “nondeterministic polynomial-time.”
What is nondeterministic computation? Let us explain it starting from computa-
tion model, Turing machine (TM). A TM consists of three parts, a tape, a head, and
a finite control (Fig. 8.1).
The tape has a left end and is infinitely long in the right direction; it is divided into infinitely many cells. Each cell can hold a symbol. All symbols possibly on the tape form an alphabet Γ, called the alphabet of tape symbols. In Γ, there is a special symbol B, called the blank symbol, which means the cell is actually empty. Initially, an input string is written on the tape. All symbols possibly in the input string form another alphabet Σ, called the alphabet of input symbols. Assume that both Σ and Γ are finite and B ∈ Γ \ Σ.
The head can read, erase, and write symbols on the tape. Moreover, it can move to
the left and right. In each move, the head can shift a distance of one cell. Please note
that in classical one-tape TM, the head is not allowed to stay in the place without
moving before the TM halts.
The finite control contains a finite number of states, forming a set Q. The TM's computation depends on a function δ : Q × Γ → Q × Γ × D, where D = {R, L} is the set of possible moving directions; R means moving to the right, while L means moving to the left. This function δ is called the transition function. For example, δ(q, a) = (p, b, L) means that when the TM in state q reads symbol a, it will change state to p, change symbol a to b, and then move to the left (the upper case in Fig. 8.2); δ(q, a) = (p, b, R) means that when the TM in state q reads symbol a, it will change state to p, change symbol a to b, and then move to the right (the lower case in Fig. 8.2). Initially, on an input x, the TM is in a special state s, called
the initial state, and its head is located at the leftmost cell, which contains the first
symbol of x if x is not empty. The TM stops moving if and only if it enters another
special state h, called the final state. An input x is said to be accepted if on x, the TM
will finally stop. All accepted inputs form a language, which is called the language
accepted by the TM. The language accepted by a TM M is denoted by L(M).
From the above description, we see that each TM can be described by the following parameters: an alphabet Σ of input symbols, an alphabet Γ of tape symbols, a finite set Q of states in the finite control, a transition function δ, and an initial state s.
The computation time of a TM M on an input x is the number of moves from the initial state to the final state, denoted by Time_M(x). A TM M is said to be polynomial-time bounded if there exists a polynomial p such that for every input x ∈ L(M), Time_M(x) ≤ p(|x|). So far, the TM we have described is the deterministic TM (DTM); that is, for each move, there exists at most one transition determined by the transition function. All languages accepted by polynomial-time bounded DTMs form a class, denoted by P.
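As a toy illustration of these definitions (our own, not from the book), the following Python sketch simulates a one-tape DTM given its transition function; a configuration where δ is undefined rejects, and reaching the final state h accepts:

def run_dtm(delta, x, s='s', h='h', blank='B', max_steps=10000):
    # delta maps (state, symbol) -> (state, symbol, 'L' or 'R').
    # max_steps is only a practical guard: a real TM need not halt.
    tape = list(x) if x else [blank]
    pos, state, steps = 0, s, 0
    while state != h and steps < max_steps:
        symbol = tape[pos]
        if (state, symbol) not in delta:
            return False                  # machine stuck: x rejected
        state, tape[pos], move = delta[(state, symbol)]
        pos += 1 if move == 'R' else -1
        if pos < 0:
            return False                  # fell off the left end
        if pos == len(tape):
            tape.append(blank)            # the tape is infinite to the right
        steps += 1
    return state == h

# A machine accepting all strings over {a}: scan right to the first blank.
delta = {('s', 'a'): ('s', 'a', 'R'), ('s', 'B'): ('h', 'B', 'R')}
assert run_dtm(delta, 'aaa') and run_dtm(delta, '')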
There are many variations of the TM, in which the TM has more freedom. For
example, the head is allowed to stay at the same cell during a move, the tape
may have no left end, and multiple tapes exist (Fig. 8.3). However, in terms of
polynomial-time computability, all of them have been proved to have the same power. Based on such experience, the following conclusion has been made:
Extended Church-Turing Thesis A function computable in polynomial-time in
any reasonable computational model using a reasonable time complexity measure
is computable by a deterministic TM in polynomial-time.
The extended Church-Turing thesis is a natural law of computation. It is similar to a physical law, which cannot have a mathematical proof but is obeyed by the natural world. By the extended Church-Turing thesis, the class P is independent of computational models. In the statement, “reasonable” is an important word. Are
there unreasonable computational models? The answer is yes. For example, the
nondeterministic Turing machine (NTM) is an important one among them. In an
NTM, for each move, there may exist many possible transitions (Fig. 8.4) and the
NTM can use any one of them. Therefore, the transition function δ in an NTM is a mapping from Q × Γ to 2^{Q×Γ×{R,L}}, that is, δ(q, a) is the set of all possible
Fig. 8.4 There are many possible transitions for each move in an NTM
transitions. When the NTM in state q reads symbol a, it can choose any one
transition from δ(q, a) to implement.
It is worth mentioning that for each nondeterministic move of a one-tape NTM, the number of possible transitions is upper-bounded by |Q| × |Γ| × 3, where Q is the set of states, Γ is the alphabet of tape symbols, and 3 is an upper bound for the number of moving choices. |Q| × |Γ| × 3 is a constant independent of the input size |x|.
The computation process of the DTM can be represented by a path, while the
computation process of the NTM has to be represented by a tree. When is an input
x accepted by an NTM? The definition is that as long as there is a path in the
computation tree, leading to the final state, then x is accepted. Suppose that at each
move, we make a guess for choice of possible transitions. This definition means that
if there exists a correct guess which leads to the final state, we will accept the input.
Let us look at an example. Consider the following problem:
Problem 8.1.1 (Hamiltonian Cycle) Given a graph G = (V , E), does G contain
a Hamiltonian cycle? Here, a Hamiltonian cycle is a cycle passing through each
vertex exactly once.
The following is a nondeterministic algorithm for the Hamiltonian cycle prob-
lem:
input a graph G = (V , E).
step 1 guess a permutation of all vertices.
step 2 check if guessed permutation gives a Hamiltonian cycle.
if yes, then accept input.
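A deterministic computer can simulate this nondeterministic algorithm by trying every possible guess; what matters for membership in NP is that the check in step 2 takes only polynomial time. A small Python sketch of this simulation (our own illustration):

from itertools import permutations

def hamiltonian_cycle(vertices, edges):
    # Try every permutation (every possible "guess"); accept if one of
    # them is a Hamiltonian cycle. Exponentially many guesses, but each
    # individual check runs in polynomial time.
    edge_set = {frozenset(e) for e in edges}
    for perm in permutations(vertices):
        # step 2: consecutive vertices (cyclically) must be adjacent
        if all(frozenset((perm[i], perm[(i + 1) % len(perm)])) in edge_set
               for i in range(len(perm))):
            return True
    return False

assert hamiltonian_cycle([1, 2, 3], [(1, 2), (2, 3), (3, 1)])
assert not hamiltonian_cycle([1, 2, 3], [(1, 2), (2, 3)])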
Lemma 8.1.4 If rank(A) = r < n, then there exists a nonzero vector z such that Az = 0 and every component of z is at most (αq)^q.

Proof Without loss of generality, assume that the upper-left r × r submatrix B is nonsingular. Set x_{r+1} = · · · = x_{n−1} = 0 and x_n = −1. Apply Cramer's rule to the system of equations
where a_{ij} is the element of A in the ith row and the jth column. Then we can obtain x_i = det B_i / det B, where B_i is a submatrix of A. By Lemma 3.1, |det B_i| ≤ (αq)^q. Now, set z₁ = det B₁, . . . , z_r = det B_r, z_{r+1} = · · · = z_{n−1} = 0, and z_n = det B. Then Az = 0.
|x_i| ≤ |det D_i| ≤ (αq)^q (|c₁| + · · · + |c_n|) ≤ (αq)^q n(α + (αq)^{q+1}) ≤ 2(αq)^{2q+1}.

β = Σ_{i=1}^m Σ_{j=1}^n log₂ |a_{ij}| + Σ_{j=1}^n log₂ |b_j| ≥ mn + log₂ α ≥ q + log₂ α.
max cx
subject to Ax ≥ b
x ∈ {0, 1}n ,
1 If they are rational numbers, then we can transform them into integers. If some of them are
irrational numbers, then we have to touch the complexity theory of real number computation,
which is out of scope of this book.
x  y  x∧y  x∨y  ¬x
0  0   0    0    1
0  1   0    1    1
1  0   0    1    0
1  1   1    1    0

By De Morgan's laws, ¬(xy) = x̄ + ȳ and ¬(x + y) = x̄ȳ.
Solution. Let B, J, and D denote Brown, John, and David, respectively. What the Chair said can be written as a Boolean formula:
Since this department actually needs more than one new faculty member, there is only one way to satisfy this Boolean formula, that is, B = 0, D = J = 1. Thus, John and David will be hired.
Now, we are ready to state Cook’s result.
Theorem 8.2.4 (Cook Theorem) The SAT problem is NP-complete.
After the first NP-complete problem is discovered, a large number of problems
have been found to be NP-hard or NP-complete. Indeed, there are many tools
passing the NP-hardness from one problem to another problem. We introduce one
of them as follows:
Consider two decision problems A and B. A is said to be polynomial-time many-one reducible to B, denoted by A ≤^p_m B, if there exists a polynomial-time computable function f mapping from all inputs of A to inputs of B such that A receives a yes-answer on input x if and only if B receives a yes-answer on input f(x) (Fig. 8.5).
For example, we have
Example 8.2.5 The Hamiltonian cycle problem is polynomial-time many-one
reducible to the decision version of the traveling salesman problem.
Proof To construct this reduction, for each input graph G = (V , E) of the
Hamiltonian cycle problem, we consider V as the set of cities and define a distance
table D by setting
d(u, v) = 1 if (u, v) ∈ E, and d(u, v) = |V| + 1 otherwise.
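In code, the mapping f of this reduction is tiny; the following Python sketch (our own representation, using a dictionary as the distance table) builds the TSP instance from a graph:

def tsp_instance(vertices, edges):
    # G has a Hamiltonian cycle iff the resulting instance has a tour
    # of length |V| (any tour using a non-edge costs more than |V|).
    edge_set = {frozenset(e) for e in edges}
    n = len(vertices)
    return {(u, v): 1 if frozenset((u, v)) in edge_set else n + 1
            for u in vertices for v in vertices if u != v}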
w̄x̄ȳ + w(x + y)
= (w̄ + x + y)(x̄ȳ + w)
= (w̄ + x + y)(x̄ + w)(ȳ + w)
= (w̄ + x + y)(w + x̄ + y)(w + x + ȳ)(w + x̄ + ȳ)
= q(w, x, y).
Those techniques will be studied systematically in the next few chapters. Before
doing so, we would touch a few fundamental NP-complete problems and their
related combinatorial optimization problems with their approximation solutions in
later sections of this chapter.
To end this section, let us mention a rough way to judge whether a problem has a possible polynomial-time solution. Note that in many cases, it is easy to judge whether a problem belongs to NP. For a decision problem A in NP, if it is hard to find a polynomial-time solution, then we may study its complement Ā = {x | x ∉ A}. If Ā ∈ NP, then it may be worth trying hard to find a polynomial-time solution. If it is hard to show Ā ∈ NP, then we may try to show the NP-hardness of problem A.
Actually, let co-NP denote the class consisting of all complements of decision
problems in NP. Then class P is contained in the intersection of NP and co-NP
(Fig. 8.7). So far, no natural problem has been found to exist in (NP∩co-NP)\P.
In history, there are two well-known problems in NP∩co-NP that were not known to have polynomial-time solutions for many years: the primality test and the decision version of linear programming. Eventually, both were shown to have polynomial-time solutions.
Since the mapping f has to satisfy (2), the idea is to find the relationship between the output of problem A and the output of problem B, that is, to find the mapping from inputs to inputs through the relationship between the outputs of the two problems. Let us explain this idea through an example.
Theorem 8.3.1 The Hamiltonian cycle problem is NP-complete.
Proof We already proved previously that the Hamiltonian cycle problem belongs
to NP. Next, we are going to construct a polynomial-time many-one reduction from
the NP-complete 3SAT problem to the Hamiltonian cycle problem.
The input of the 3SAT problem is a 3CNF F and the input of the Hamiltonian
cycle problem is a graph G. We need to find a mapping f such that for any 3CNF F ,
f (F ) is a graph such that F is satisfiable if and only if f (F ) contains a Hamiltonian
cycle. What can make F satisfiable? It is a satisfied assignment. Therefore, our con-
struction should give a relationship between assignments and Hamiltonian cycles.
Suppose F contains n variables x1 , x2 , . . . , xn and m clauses C1 , C2 , . . . , Cm . To
do so, we first build a ladder Hi with 4m + 2 levels, corresponding to a variable
xi as shown in Fig. 8.8. In this ladder, there are exactly two Hamiltonian paths
corresponding to two values 0 and 1 for xi . Connect n ladders into a cycle as
shown in Fig. 8.9. Then we obtain a graph H with exactly 2n Hamiltonian cycles
corresponding to 2n assignments of F .
Now, we need to find a way to involve clauses. An idea is to represent each clause
Cj by a point and represent the fact “clause Cj is satisfied under an assignment”
by the fact “point Cj is included in the Hamiltonian cycle corresponding to the
assignment.” To realize this idea, for each literal xi in clause Cj, we connect point Cj to the two endpoints of an edge, between the (4j − 1)th level and the (4j)th level, on the path corresponding to xi = 1 (Fig. 8.10), and for each x̄i in clause Cj, we connect point Cj to the two endpoints of an edge on the path corresponding to xi = 0. This completes our construction of the graph f(F) = G.
instead of a cycle. However, in the following, we would like to give a simple proof
by reducing the Hamiltonian cycle problem to the Hamiltonian path problem.
We are going to find a polynomial-time computable mapping f from graphs to
graphs such that G contains a Hamiltonian cycle if and only if f (G) contains a
Hamiltonian path. Our analysis starts from how to build a relationship between a
Hamiltonian cycle of G and a Hamiltonian path of f (G). If f (G) = G, then from
a Hamiltonian cycle of G, we can find a Hamiltonian path of f (G) by deleting an
edge; however, from a Hamiltonian path of f (G), we may not be able to find a
Hamiltonian cycle of G. To have “if and only if” relation, we first consider a simple
case that there is an edge (u, v) such that if G contains a Hamiltonian cycle C, then
C must contain edge (u, v). In this special case, we may attach two new edges (u, u′) and (v, v′) at u and v, respectively.
For simplicity of speaking, we may call these two edges two horns. Now, if G has the Hamiltonian cycle C, then f(G) has a Hamiltonian path between the endpoints of the two horns, u′ and v′. Conversely, if f(G) has a Hamiltonian path, then this Hamiltonian path must have the two endpoints u′ and v′; hence, we can get back C by deleting the two horns and putting back edge (u, v).
Now, we consider the general case that such an edge (u, v) may not exist. Note that for any vertex u of G, if u has k neighbors v1, v2, . . . , vk, then a Hamiltonian cycle of G must contain one of the edges (u, v1), (u, v2), . . . , (u, vk). Thus, we may first connect all of v1, v2, . . . , vk to a new vertex u′ and put two horns (u, w) and (u′, w′) (Fig. 8.12). This construction works similarly as above.
As a corollary of Theorem 8.3.1, we have
Corollary 8.3.4 The traveling salesman problem is NP-hard.
Proof In Example 8.2.5, a polynomial-time many-one reduction has been con-
structed from the Hamiltonian cycle problem to the traveling salesman problem.
When the distance table satisfies the triangular inequality, i.e., d(a, b) ≤ d(a, c) + d(c, b) for any three vertices a, b, and c, where d(a, b) is the distance between a and b, there is an easy way to obtain a tour (i.e., a Hamiltonian cycle) with total distance within twice of the optimal.
To do so, first compute a minimum spanning tree in the input graph and then travel around the minimum spanning tree (see Fig. 8.13). During this trip, a vertex which appears for the second time can be skipped without increasing the total distance of the trip, due to the triangular inequality. Note that the length of a minimum spanning tree is smaller than the minimum length of a tour. Moreover, this trip uses each edge of the minimum spanning tree exactly twice. Thus, the length of the Hamiltonian cycle obtained from this trip is within twice of the optimal.
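The following Python sketch (our own, using Prim's algorithm for the minimum spanning tree and a preorder traversal for the shortcut trip) implements this 2-approximation; it assumes d is a symmetric distance function satisfying the triangular inequality:

def tsp_double_tree(cities, d):
    # Prim's algorithm for a minimum spanning tree rooted at cities[0].
    root = cities[0]
    in_tree, children = {root}, {c: [] for c in cities}
    while len(in_tree) < len(cities):
        u, v = min(((u, v) for u in in_tree for v in cities
                    if v not in in_tree), key=lambda e: d(*e))
        children[u].append(v)
        in_tree.add(v)
    # Preorder walk of the tree = travel around the MST with skipping.
    tour, stack = [], [root]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour  # visit the cities in this order and return to tour[0]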
Christofides in 1976 introduced an idea to improve the above approximation. After computing the minimum spanning tree, consider all vertices of odd degree (called odd vertices) in the tree and compute a minimum perfect matching among
these odd vertices. Because in the union of the minimum spanning tree and the
minimum perfect matching, every vertex has even degree, one can travel along
edges in this union using each edge exactly once. This trip, called Euler tour, can be
modified into a traveling salesman tour (Fig. 8.14), without increasing the length by
the triangular inequality. Thus, an approximation is produced with length bounded
by the length of minimum spanning tree plus the length of the minimum perfect
matching on the set of vertices with odd degree. We claim that each Hamiltonian
cycle (namely, a traveling salesman tour) can be decomposed into a disjoint union of two parts, each of which is not smaller than the minimum perfect matching for the vertices
with odd degree. To see this, we first note that the number of vertices with odd
degree is even since the sum of degrees over all vertices in a graph is even. Now,
let x1 , x2 , · · · , x2k denote all vertices with odd degree in clockwise ordering of
the considered Hamiltonian cycle. Then (x1 , x2 ), (x3 , x4 ), · · · , (x2k−1 , x2k ) form
a perfect matching for vertices with odd degree and (x2 , x3 ), (x4 , x5 ), · · · , (x2k , x1 )
form the other perfect matching. The claim then follows immediately from the
triangular inequality. Thus, the length of the minimum matching is at most half
of the length of the minimum Hamiltonian cycle. Therefore, Christofides gave an
approximation within a factor of 1.5 from the optimal.
From the above example, we see that the ratio of objective function values
between approximation solution and optimal solution is a measure for the perfor-
mance of an approximation.
For a minimization problem, the performance ratio of an approximation algo-
rithm A is defined as follows:
r(A) = sup_I A(I)/opt(I)
where I is over all possible instances and A(I ) and opt (I ) are respectively the
objective function values of the approximation produced by algorithm A and the
optimal solution with respect to instance I .
For a maximization problem, the performance ratio of an approximation algo-
rithm A is defined by
r(A) = inf_I A(I)/opt(I).
This gives an instance I for the traveling salesman problem. Then G has a Hamiltonian cycle if and only if for I, the traveling salesman has a tour with length at most K|V|. The optimal tour has length |V|. Applying approximation algorithm A
to I , we will obtain a tour of length at most K|V |. Thus, G has a Hamiltonian cycle
if and only if approximation algorithm A produces a tour of length at most K|V |.
This means that the Hamiltonian cycle problem can be solved in polynomial-time.
Because the Hamiltonian cycle problem is NP-complete, we obtain a contradiction.
The above argument proved the following:
Theorem 8.3.8 If P = NP , then no polynomial-time approximation algorithm for
the traveling salesman problem in general case has a constant performance ratio.
For the longest path problem, there exists also a negative result.
Theorem 8.3.9 For any ε > 0, the longest path problem has no polynomial-time
n1−ε -approximation unless P = NP .
A vertex subset C is called a vertex cover if every edge has at least one endpoint in
C. Consider the following problem:
Problem 8.4.1 (Vertex Cover) Given a graph G = (V , E) and a positive integer
K, is there a vertex cover of size at most K?
The vertex cover problem is the decision version of the minimum vertex cover
problem as follows:
Problem 8.4.2 (Minimum Vertex Cover) Given a graph G = (V , E), compute a
vertex cover with minimum cardinality.
Then each clause Cj must have a true literal, namely the one adjacent to the vertex of Cj not in S. Thus, F is satisfiable.
The above construction is clearly polynomial-time computable. Hence, the 3SAT
problem is polynomial-time many-one reducible to the vertex cover problem.
have at least one endpoint in the complement of an independent set, which means
that the complement of an independent set must be a vertex cover. Conversely, if the
complement of a vertex subset I is a vertex cover, then every edge has an endpoint
not in I and hence I is independent. Furthermore, we see that a vertex subset I is
the maximum independent set if and only if the complement of I is the minimum
vertex cover.
Problem 8.4.6 (Maximum Clique) Given a graph G = (V , E), find a clique with
maximum size.
Here, a clique is a complete subgraph of the input graph G, and its size is the number of vertices in the clique. Let Ḡ be the complementary graph of G, that is, an edge e is in Ḡ if and only if e is not in G. Then a vertex subset I induces a clique in G if and only if I is an independent set in Ḡ. Thus, the subgraph on a vertex subset I is a maximum clique in G if and only if I is a maximum independent set in Ḡ.
From their relationship, we see clearly the following:
Corollary 8.4.7 Both the maximum independent set problem and the maximum
clique problem are NP-hard.
Next, we study the approximation of the minimum vertex cover problem.
Theorem 8.4.8 The minimum vertex cover problem has a polynomial-time 2-
approximation.
Proof Compute a maximal matching. The set of all endpoints of edges in this maximal matching forms a vertex cover, which is a 2-approximation for the minimum vertex cover problem, since each edge in the matching must have an endpoint in the minimum vertex cover.
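A greedy pass over the edges yields a maximal matching, so the whole 2-approximation fits in a few lines of Python (our own sketch):

def vertex_cover_2_approx(edges):
    # Take both endpoints of each edge of a greedily built maximal matching.
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge not yet matched
            cover.update((u, v))
    return cover

edges = [(1, 2), (2, 3), (3, 4)]
cover = vertex_cover_2_approx(edges)
assert all(u in cover or v in cover for u, v in edges)  # it is a vertex cover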
The minimum vertex cover problem can be generalized to hypergraphs. This
generalization is called the hitting set problem as follows:
Problem 8.4.9 (Hitting Set) Given a collection C of subsets of a finite set X, find
a minimum subset S of X such that every subset in C contains an element in S. Such
a set S is called a hitting set.
For the maximum independent set problem and the maximum clique problem,
there are negative results on their approximation.
Theorem 8.4.10 For any ε > 0, the maximum independent set problem has no
polynomial-time n1−ε -approximation unless NP = P .
Theorem 8.4.11 For any ε > 0, the maximum clique problem has no polynomial-
time n1−ε -approximation unless NP = P .
222 8 NP-Hard Problems and Approximation Algorithms
c1
c1 = x1 + x2 + x3
8.5 Three-Dimensional Matching 223
or
Define

xi = 1 if Pi ⊆ M, and xi = 0 if Qi ⊆ M.
Then this assignment satisfies F. In fact, for any clause Cj, in order to have elements x_{0j} and y_{0j} appear in M, M must contain the 3-set {x_{0j}, y_{0j}, z_{i_h k_h}} for some h ∈ {1, 2, 3}. This assignment assigns 1 to the hth literal of Cj according to the construction. Conversely, suppose F has a satisfied assignment. We can construct a
three-dimensional matching M as follows:
• If xi = 1, then put Pi into M. If xi = 0, then put Qi into M.
• If the hth literal of clause Cj is equal to 1, then put the 3-set {x_{0j}, y_{0j}, z_{i_h k_h}} into M.
• So far, all elements in X ∪ Y have been covered by the 3-sets put in M. However, there are m(n − 1) elements of Z that are left outside. We now use the 3-sets {x_{n+1,h}, y_{n+1,h}, z_{ik}} to play the role of garbage collectors. For each z_{ik} not appearing in the 3-sets in M, select a pair x_{n+1,h} and y_{n+1,h}, and then put the 3-set {x_{n+1,h}, y_{n+1,h}, z_{ik}} into M.
X′ ← X ∪ Y ∪ Z,
C′ ← C,
k ← |X ∪ Y ∪ Z|/3.

Clearly, for instance (X, Y, Z, C), a three-dimensional matching exists if and only if for instance (X′, C′, k), a set cover of size k exists.
For any subcollection A ⊆ C, define
By Lemma 8.5.5,
Therefore,
f(A_{i+1}) − f(A_i) ≥ (|X| − f(A_i))/opt,    (8.2)

that is,

|X| − f(A_{i+1}) ≤ (|X| − f(A_i))(1 − 1/opt) ≤ |X|(1 − 1/opt)^{i+1} ≤ |X| e^{−(i+1)/opt}.

Choose i such that |X| − f(A_{i+1}) < opt ≤ |X| − f(A_i). Then

g ≤ i + opt

and

opt ≤ |X| e^{−i/opt}.

Therefore,

g ≤ opt (1 + ln(|X|/opt)) ≤ opt (1 + ln γ).
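The greedy algorithm analyzed above is easy to state in code; here is a minimal Python sketch (our own naming), which repeatedly picks a set covering the most uncovered elements:

def greedy_set_cover(X, C):
    # C is a list of sets over the ground set X.
    uncovered, cover = set(X), []
    while uncovered:
        best = max(C, key=lambda S: len(S & uncovered))
        if not best & uncovered:
            return None                 # X is not coverable by C
        cover.append(best)              # greedy choice
        uncovered -= best
    return cover

X = {1, 2, 3, 4, 5}
C = [{1, 2, 3}, {3, 4}, {4, 5}, {1}]
assert greedy_set_cover(X, C) is not None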
The following theorem indicates that the above greedy approximation has the best possible performance ratio for the set cover problem:

Theorem 8.5.7 For any ρ < 1, there is no polynomial-time (ρ ln n)-approximation for the set cover problem unless NP = P, where n = |X|.

In the worst case, we may have γ = n. Therefore, this theorem indicates that the performance of the greedy algorithm is tight in some sense.
The hitting set problem is equivalent to the set cover problem. To see this
equivalence, for each element x ∈ X, define Sx = {C ∈ C | x ∈ C}. Then the
set cover problem on input (X, C) is equivalent to the hitting set problem on input
(C, {Sx | x ∈ X}). In fact, A ⊆ C covers X if and only if A hits every Sx . From this
equivalence, the following is obtained immediately:
Corollary 8.5.8 The hitting set problem is NP-hard and has a greedy (1 +
ln γ )-approximation. Moreover, for any ρ < 1, it has no polynomial-time ρ ln γ -
approximation unless NP = P.
8.6 Partition
b_{xi}[k] = b_{x̄i}[k] = 1 if k = i, and 0 otherwise,

and

b_{x̄i}[n + j] = 1 if x̄i appears in clause Cj, and 0 otherwise.
L = 3···3 1···1, consisting of m 3s followed by n 1s.
For example, if F = (x1 + x2 + x̄3 )(x̄2 + x̄3 + x4 ), then we would construct the
following 2(m + n) + 1 = 13 positive integers:
are satisfied under assignment σ. Clearly, the obtained A′ meets the condition that the sum of all numbers in A′ is equal to L.
Conversely, suppose that there exists a subset A′ of A such that the sum of all numbers in A′ is equal to L. Since L[i] = 1 for 1 ≤ i ≤ n, A′ contains exactly one of b_{xi} and b_{x̄i}. Define an assignment σ by setting

xi = 1 if b_{xi} ∈ A′, and xi = 0 if b_{x̄i} ∈ A′.
We claim that σ is a satisfied assignment for F. In fact, for any clause Cj, since L[n + j] = 3, there must be a b_{xi} or b_{x̄i} in A′ whose (n + j)th leftmost digit is 1. This means that there is a literal with assignment 1 appearing in Cj, i.e., making Cj satisfied.
Now, we show the NP-completeness of the partition problem.
Theorem 8.6.4 The partition problem is NP-complete.
Proof The partition problem can be seen as the subsum problem in the special case that L = S/2, where S = a1 + a2 + · · · + an. Therefore, it is in NP. Next, we show

subsum ≤^p_m partition.

Consider an instance of the subsum problem, consisting of n + 1 positive integers a1, a2, . . . , an and L, where 0 < L ≤ S. Since the partition problem is equivalent to the subsum problem with 2L = S, we may assume without loss of generality that 2L ≠ S. Now, consider an input for the partition problem, consisting of n + 1 positive integers a1, a2, . . . , an and |2L − S|. We will show that there exists a subset N1 of [n] such that Σ_{i∈N1} ai = L if and only if A = {a1, a2, . . . , an, |2L − S|} has
a partition (A1 , A2 ) such that the sum of all numbers in A1 equals the sum of all
numbers in A2 . Consider two cases as follows:
Case 1. 2L > S. First, suppose there exists a subset N1 of [n] such that Σ_{i∈N1} ai = L. Let A1 = {ai | i ∈ N1} and A2 = A − A1. Then the sum of all numbers in A2 is equal to

Σ_{i∈[n]−N1} ai + (2L − S) = (S − L) + (2L − S) = L = Σ_{i∈N1} ai.
Conversely, suppose A has a partition (A1 , A2 ) such that the sum of all numbers in
A1 equals the sum of all numbers in A2 . Without loss of generality, assume 2L−S ∈
A2 . Note that the sum of all numbers in A equals S + 2L − S = 2L. Therefore, the
sum of all numbers in A1 equals L.
Case 2. 2L < S. Let L′ = S − L and N2 = [n] − N1. Then 2L′ − S > 0 and Σ_{i∈N1} ai = L if and only if Σ_{i∈N2} ai = L′. Therefore, this case can be handled in a way similar to Case 1 by replacing L and N1 with L′ and N2, respectively.
max c1 x1 + c2 x2 + · · · + cn xn
subject to a1 x1 + a2 x2 + · · · + an xn ≤ S
x1 , x2 , . . . , xn ∈ {0, 1}
In this 0-1 linear programming, variable xi is an indicator that xi = 1 if the ith item
is chosen, and xi = 0 if the ith item is not chosen.
c1 x1 + c2 x2 + · · · + cn xn ≥ k,
a1 x1 + a2 x2 + · · · + an xn ≤ S.
ci = ai for 1 ≤ i ≤ n
k = S = (a1 + a2 + · · · + an )/2.
Then the partition problem receives yes-answer if and only if the decision version
of knapsack problem receives yes-answer.
The knapsack problem has a simple 1/2-approximation (Algorithm 27). Without loss of generality, assume ai ≤ S for every 1 ≤ i ≤ n. Otherwise, item i can be removed from consideration because it cannot be put in the knapsack. First, sort all items into the ordering c1/a1 ≥ c2/a2 ≥ · · · ≥ cn/an. Then put items one by one into the knapsack according to this ordering, until no more items can be put in. Suppose that the above process stops at the kth item, that is, either k = n or the first k items have been placed into the knapsack and the (k + 1)th item cannot be put in. In the former case, all n items can be put in the knapsack. In the latter case, if Σ_{i=1}^k ci > c_{k+1},
then take the first k items to form a solution; otherwise, take the (k + 1)th item as a
solution.
Theorem 8.6.7 Algorithm 27 produces a 1/2-approximation for the knapsack
problem.
Proof If all items can be put in the knapsack, then this gives an optimal solution. If not, then Σ_{i=1}^k ci + c_{k+1} > opt, where opt is the objective function value of an optimal solution. Hence, max(Σ_{i=1}^k ci, c_{k+1}) ≥ (1/2) · opt.
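Here is a Python sketch of Algorithm 27 (our own rendering of the pseudocode; it assumes ai ≤ S for all items, as above):

def knapsack_half_approx(items, S):
    # items: list of (c_i, a_i) pairs; returns indices of chosen items.
    order = sorted(range(len(items)),
                   key=lambda i: items[i][0] / items[i][1], reverse=True)
    chosen, size = [], 0
    for i in order:
        if size + items[i][1] > S:
            # The (k+1)th item does not fit: take the better of the
            # first k items and this single item.
            if sum(items[j][0] for j in chosen) >= items[i][0]:
                return chosen
            return [i]
        chosen.append(i)
        size += items[i][1]
    return chosen                        # all items fit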
From the above 1/2-approximation, we may make the following observation: for an item selected into the knapsack, two factors are considered:
• The first factor is the ratio ci/ai. A larger ratio means that volume is used for higher value.
• The second factor is the value ci itself. When putting an item with small ci and bigger ci/ai into the knapsack may block items with bigger ci and smaller ci/ai, we may select the one with bigger ci.
By properly balancing consideration on these two facts, we can obtain a construction
for (1 + ε)-approximation for any ε > 0 (Algorithm 28).
Denote α = cG · 2ε/(1 + ε), where cG is the total value of a 1/2-approximation solution obtained by Algorithm 27. Classify all items into two sets A and B. Let A be the set of all items each with value ci < α, and B the set of all items each with value ci ≥ α. Suppose |A| = m. Sort all items in A into the ordering c1/a1 ≥ c2/a2 ≥ · · · ≥ cm/am.
For any subset I of B with |I| ≤ 1 + 1/ε, if Σ_{i∈I} ai > S, then define c(I) = 0; otherwise, select the largest k ≤ m satisfying Σ_{i=1}^k ai ≤ S − Σ_{i∈I} ai and define c(I) = Σ_{i∈I} ci + Σ_{i=1}^k ci.

Lemma 8.6.8 Let cε = max_I c(I). Then

cε ≥ (1/(1 + ε)) · opt.
Σ_{i∈I} ci > α · (1 + 1/ε) = cG · (2ε/(1 + ε)) · (1 + 1/ε) = 2cG ≥ opt.
Thus, we must have |Ib| ≤ 1 + 1/ε and hence cε ≥ c(Ib). Moreover, we have

c(Ib) = Σ_{i∈Ib} ci + Σ_{i∈Ia} ci
 ≥ opt − α
 = opt − cG · (2ε/(1 + ε))
 ≥ opt − (opt/2) · (2ε/(1 + ε))
 = opt · (1/(1 + ε)).

Therefore, cε ≥ opt · (1/(1 + ε)).
max c1 x1 + c2 x2 + · · · + ck xk
subject to a1 x1 + a2 x2 + · · · + ak xk ≤ S
x1 , x2 , . . . , xk ∈ {0, 1}.
Then

This recursive formula gives a dynamic programming algorithm that solves the knapsack problem within O(nS) time. This is a pseudopolynomial-time algorithm, not a polynomial-time algorithm, because the input size of S is log₂ S, not S.
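One standard way to fill in the elided recursion is the capacity-indexed dynamic program below (a Python sketch, our own formulation; dp[j] is the best value achievable with capacity j):

def knapsack_dp(items, S):
    # items: list of (value c_i, size a_i); classic O(nS) table.
    dp = [0] * (S + 1)
    for c, a in items:
        for j in range(S, a - 1, -1):   # descending: each item used once
            dp[j] = max(dp[j], dp[j - a] + c)
    return dp[S]

assert knapsack_dp([(60, 10), (100, 20), (120, 30)], 50) == 220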
To construct a PTAS, we need to design another pseudopolynomial-time algorithm for the knapsack problem.
Let c(i, j) denote a subset of the index set {1, . . . , i} such that
(a) Σ_{k∈c(i,j)} ck = j, and
(b) Σ_{k∈c(i,j)} sk = min{Σ_{k∈I} sk | Σ_{k∈I} ck = j, I ⊆ {1, . . . , i}}.
If no index subset satisfies (a), then we say that c(i, j) is undefined, or write c(i, j) = nil. Clearly, opt = max{j | c(n, j) ≠ nil and Σ_{k∈c(n,j)} sk ≤ S}. Therefore, it suffices to compute all c(i, j). The following algorithm is designed with this idea.
Initially, compute c(1, j) for j = 0, . . . , csum by setting

c(1, j) := ∅ if j = 0; {1} if j = c1; nil otherwise,

where csum = Σ_{i=1}^n ci.
Next, compute c(i, j) for i ≥ 2 and j = 0, . . . , csum:

for i = 2 to n do
  for j = 0 to csum do
    case 1 [c(i − 1, j − ci) = nil]:
      set c(i, j) := c(i − 1, j);
    case 2 [c(i − 1, j − ci) ≠ nil] and [c(i − 1, j) = nil]:
      set c(i, j) := c(i − 1, j − ci) ∪ {i};
    case 3 [c(i − 1, j − ci) ≠ nil] and [c(i − 1, j) ≠ nil]:
      if Σ_{k∈c(i−1,j)} sk > Σ_{k∈c(i−1,j−ci)} sk + si
      then c(i, j) := c(i − 1, j − ci) ∪ {i}
      else c(i, j) := c(i − 1, j);

Finally, set opt = max{j | c(n, j) ≠ nil and Σ_{k∈c(n,j)} sk ≤ S}.
This algorithm computes the exact optimal solution for the knapsack problem with running time O(n³M log(MS)), where M = max_{1≤k≤n} ck, because the algorithm contains two loops: the outer loop runs in O(n) time, the inner loop runs in O(nM) time, and the central part runs in O(n log(MS)) time. This is a pseudopolynomial-time algorithm because the input size of M is log₂ M and the running time is not a polynomial with respect to the input size.
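The following Python sketch (our own rendering, with our own names) implements the value-indexed dynamic program above together with the value-rounding step of the approximation scheme analyzed next; the parameter h controls the approximation ratio 1 + 1/h:

def knapsack_by_value(items, S):
    # items: list of (value c_k, size s_k).
    # size[j] = minimum total size of a subset of total value exactly j
    # (None plays the role of nil); pick[j] records the subset itself.
    csum = sum(c for c, _ in items)
    size = [None] * (csum + 1)
    pick = [None] * (csum + 1)
    size[0], pick[0] = 0, []
    for i, (c, s) in enumerate(items):
        for j in range(csum, c - 1, -1):
            if size[j - c] is not None:
                cand = size[j - c] + s
                if size[j] is None or cand < size[j]:
                    size[j], pick[j] = cand, pick[j - c] + [i]
    best = max(j for j in range(csum + 1)
               if size[j] is not None and size[j] <= S)
    return pick[best]

def knapsack_fptas(items, S, h):
    # Round values down before the exact DP; by the analysis in the text
    # the chosen set has true value at least opt * (1 - 1/(h + 1)).
    n, M = len(items), max(c for c, _ in items)
    scale = M / (n * (h + 1))
    scaled = [(int(c / scale), s) for c, s in items]
    chosen = knapsack_by_value(scaled, S)
    return sum(items[i][0] for i in chosen)   # true value of chosen items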
c∗/c_h ≤ 1 + 1/h,

c_h = Σ_{k∈I} c_k
 ≥ Σ_{k∈I} ⌊c_k n(h + 1)/M⌋ · M/(n(h + 1))
 = (M/(n(h + 1))) Σ_{k∈I} ⌊c_k n(h + 1)/M⌋
 ≥ (M/(n(h + 1))) Σ_{k∈I∗} ⌊c_k n(h + 1)/M⌋
 ≥ (M/(n(h + 1))) Σ_{k∈I∗} (c_k n(h + 1)/M − 1)
 ≥ opt − M/(h + 1)
 ≥ opt (1 − 1/(h + 1)).
max a1x1 + a2x2 + · · · + anxn
subject to a1x1 + a2x2 + · · · + anxn ≤ S/2
x1, x2, . . . , xn ∈ {0, 1}
and

opt_k / (Σ_{i∈N1} ai) ≤ 1 + ε.

Therefore,

(S − opt_s) / (S − Σ_{i∈N2} ai) ≤ 1 + ε,

that is,

S − Σ_{i∈N2} ai ≥ (S − opt_s)/(1 + ε).

Thus,

Σ_{i∈N2} ai ≤ (εS + opt_s)/(1 + ε) ≤ (ε · 2opt_s + opt_s)/(1 + ε) ≤ opt_s(1 + ε).
Problem 8.7.2 (Strongly Planar 3SAT) Given a strongly planar 3CNF F , deter-
mine whether F is satisfiable.
x ⊕ y = x ȳ + x̄y.
We next show that for each ⊕ operation x ⊕ y = z, there exists a planar 3CNF F_{x⊕y=z} such that

x ⊕ y = z ⇔ F_{x⊕y=z} ∈ SAT,

x + y = z ⇔ c(x, y, z) ∈ SAT,
x · y = z ⇔ c(x̄, ȳ, z̄) ∈ SAT.
Since
x ⊕ y = (x + y) · ȳ + x̄ · (x + y),
we have
x ⊕ y = z ⇔ F′_{x⊕y=z} = c(x, y, u)c(ū, y, v̄)c(x, ū, w̄)c(v, w, z) ∈ SAT.

As shown in Fig. 8.20, F′_{x⊕y=z} is planar. F′_{x⊕y=z} contains some clauses with two literals. Each such clause x + y can be replaced by two clauses (x + y + w)(x + y + w̄) with a new variable w, as shown in Fig. 8.21. Then we can obtain a planar 3CNF F_{x⊕y=z} such that

x ⊕ y = z ⇐⇒ F_{x⊕y=z} ∈ SAT.
Fig. 8.20 CNF F′_{x⊕y=z} is planar
Finally, look back at the instance 3CNF F of the 3SAT problem at the beginning.
Let F ∗ be the product of F and all 3CNFs for all ⊕ operations appearing in crossers
used for removing cross-points in G∗ (F ). Then F ∗ is planar and
F ∈ SAT ⇐⇒ F ∗ ∈ SAT .
This completes our reduction from the 3SAT problem to the planar 3SAT problem.
and k is selected in the following way: for each edge (x, Cj), label it with x if Cj contains literal x, and with x̄ if Cj contains literal x̄. Select k to be the number of
Problem 8.8.2 (k-Center) Given a set C of n cities with a distance table, find a
subset S of k cities as centers to minimize
Theorem 8.8.3 The k-center problem with triangular inequality has a polynomial-
time 2-approximation.
Proof Consider the following algorithm:
Initially, choose arbitrarily a vertex s1 ∈ C and set S1 ← {s1};
for i = 2 to k do
  select si = argmax_{c∈C} d(c, S_{i−1}) and set S_i ← S_{i−1} ∪ {si};
output Sk.
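In code, this farthest-point greedy procedure is only a few lines (a Python sketch with our own naming; d is the distance function, the distance from a city to a set of centers is the minimum over its members, and k is assumed to be at most the number of cities):

def k_center_greedy(cities, k, d):
    centers = [cities[0]]                       # arbitrary first center
    while len(centers) < k:
        # pick the city farthest from the current set of centers
        centers.append(max(cities,
                           key=lambda c: min(d(c, s) for s in centers)))
    return centers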
We will show that this algorithm gives a 2-approximation.
Let S ∗ be an optimal solution. Denote
Classify all cities into k clusters such that each cluster contains a center s ∗ ∈ S ∗ and
d(c, s ∗ ) ≤ d ∗ for every city c in the cluster. Now, we consider two cases.
Case 1 Every cluster contains a member si ∈ Sk . Then for each city c in the cluster
with center s ∗ , d(c, si ) ≤ d(c, s ∗ ) + d(s ∗ , si ) ≤ 2 · opt.
d(c, Sk) ≤ d(c, S_{j−1})
 ≤ d(sj, S_{j−1})
 ≤ d(sj, si)
 ≤ d(si, s∗) + d(s∗, sj)
 ≤ 2 · opt.
Lemma 8.8.5 The decision version of the dominating set problem is NP-complete.

Proof Consider an input graph G = (V, E) of the vertex cover problem. For each edge (u, v), create a new vertex x_{uv} together with two edges (u, x_{uv}) and (x_{uv}, v) (Fig. 8.25). Then we obtain a modified graph G′. If G has a vertex cover of size ≤ k, then the same vertex subset must be a dominating set of G′, also of size ≤ k.
Conversely, if G′ has a dominating set D of size ≤ k, then without loss of generality, we may assume D ⊆ V. In fact, if x_{uv} ∈ D, then we can replace x_{uv} by either u or v, which results in a dominating set of the same size. Since D ⊆ V dominates all x_{uv} in G′, D covers all edges in G.
Now, we come back to the k-center problem.
Theorem 8.8.6 For any ε > 0, the k-center problem with triangular inequality
does not have a polynomial-time (2 − ε)-approximation unless NP=P.
Proof Suppose that the k-center problem has a polynomial-time (2 − ε)-
approximation algorithm A. We use algorithm A to construct a polynomial-time
algorithm for the decision version of the dominating set problem.
Consider an instance of the decision version of the dominating set problem,
consisting of a graph G = (V , E) and a positive integer k. Construct an instance of
the k-center problem by choosing all vertices as cities with distance table defined as
follows:
d(u, v) = 1 if (u, v) ∈ E, and d(u, v) = |V| + 1 otherwise.
If G has a dominating set of size at most k, then the k-center problem has an optimal solution with opt = 1. Therefore, algorithm A produces a solution with objective function value at most 2 − ε, which actually has to be 1 since every distance is either 1 or |V| + 1. If G does not have a dominating set of size at most k, then the k-center problem has an optimal solution with opt ≥ 2; hence algorithm A produces a solution with objective function value at least 2. Therefore, from the objective function value of the solution produced by algorithm A, we can determine whether G has a dominating set of size ≤ k. By Lemma 8.8.5, we have NP = P.
By Theorems 8.8.3 and 8.8.6, the k-center problem with triangular inequality
separates PTAS and APX.
Actually, APX is a large class which contains many problems not in PTAS if NP ≠ P. Those are called APX-complete problems. There are several reductions to
establish the APX-completeness. Let us introduce a popular one, the polynomial-
time L-reduction.
Consider two combinatorial optimization problems Π and Γ. Π is said to be polynomial-time L-reducible to Γ, written as Π ≤^p_L Γ, if there exist two polynomial-time computable functions h and g, and two positive constants a and b, such that
(L1) h maps from instances x of Π to instances h(x) of Γ such that

opt_Γ(h(x)) ≤ a · opt_Π(x),

where opt_Π(x) is the objective function value of an optimal solution for Π on instance x;
(L2) g maps from feasible solutions y of Γ on instance h(x) to feasible solutions g(y) of Π on instance x such that

|opt_Π(x) − obj_Π(g(y))| ≤ b · |opt_Γ(h(x)) − obj_Γ(y)|,

where obj_Π(y) is the objective function value of feasible solution y for Π (Fig. 8.26).
Theorem 8.8.8 If Π ≤^p_L Γ and Γ ∈ PTAS, then Π ∈ PTAS.
Proof Consider four cases.
Case 1. Both Π and Γ are minimization problems:
where V(G) denotes the vertex set of graph G. To see (L2), note that the path (u1, v1, u2, v2, u3, v3, u4) has a unique maximum independent set Iu = {u1, u2, u3, u4}. For any independent set I of G′, define g(I) to be obtained from I by replacing the set Iu by vertex u and removing all other vertices not in G. Then g(I) is an independent set of G. We claim
Now, for any vertex cover C, define g(C) to be the complement of C. Then we have
where optds(G′) denotes the size of a minimum dominating set of G′ and optsc(X, C) denotes the cardinality of a minimum set cover on input (X, C). In fact, suppose that C∗ is a minimum set cover on input (X, C). Then C∗ ∪ {o} is a dominating set of G′.
Next, consider a dominating set D of G′, generated by the polynomial-time (ρ ln n)-approximation for the dominating set problem. Then, we have |D| ≤ (ρ ln(2|X| + 2)) optds(G′). Construct a set cover S as follows:
Step 1. If D does not contain o, then add o. If D contains a vertex with label x ∈ X, then replace x by a vertex with label C ∈ C such that x ∈ C.
|S| ≤ |D|
 ≤ (ρ ln(2|X| + 2)) optds(G′)
 ≤ (ρ ln(2|X| + 2)) (1 + optsc(X, C))
 = (ln(2|X| + 2)/ln |X|) · (1 + 1/optsc(X, C)) · (ρ ln |X|) optsc(X, C).

ρ′ = (ln(2α + 2)/ln α) · (1 + 1/β) · ρ < 1.
For |X| < α or optsc(X, C) < β, an exactly optimal solution can be computed in polynomial-time. Therefore, there exists a polynomial-time (ρ′ ln n)-approximation for the set cover problem, and hence NP = P by Theorem 8.8.16.
The class Poly-APX may also be further divided into several levels. For example, Polylog-APX consists of all combinatorial optimization problems each of which has a polynomial-time O(ln^i n)-approximation for minimization, or a (1/O(ln^i n))-approximation for maximization, for some i ≥ 1.
Lemma 8.8.19 A scheduling (B1, B2, . . . , Bk) is secure if and only if for any 1 ≤ i ≤ k − 1, there is no point a lying above Bi and below Bi+1.
Proof If such a point a exists, then the scheduling is not secure since the intruder
can walk to point a during Bi works and enters into the area of interest during Bi+1
works. Thus, the condition is necessary.
For sufficiency, suppose the scheduling is not secure. Consider the moment at which the intruder first becomes able to enter the area of interest, and the location a where the intruder lies. Let Bi be the barrier cover working before this moment. Then a must lie above Bi and below Bi+1.
This lemma indicates that the secure scheduling can be reduced to the longest
path problem in directed graphs in the following way:
• Construct a directed graph G as follows: For each barrier cover Bi , create a node
i. For two barrier covers Bi and Bj , if there exists a point a lying above barrier
cover Bi and below barrier cover Bj , add an arc (i, j ).
• Construct the complement Ḡ of graph G, that is, Ḡ and G have the same node
set and an arc in Ḡ if and only if it is not in G.
By Lemma 8.8.19, each secure scheduling of barrier covers a corresponding
simple path in Ḡ, and a secure scheduling is maximum if and only if a corresponding
simple path is the longest one. Actually, the longest path problem can also be
reduced to the secure scheduling problem as shown in the proof of the following
theorem:
Theorem 8.8.20 For any ε > 0, the secure scheduling problem has no polynomial-
time n1−ε -approximation unless NP = P .
Proof Let us reduce the longest path problem in directed graph to the secure
scheduling problem. Consider a directed graph G = (V , E). Let Ḡ = (V , Ē) be
the complement of G, i.e., Ē = {(i, j ) ∈ V × V | (i, j ) ∈ E}. Draw a horizontal
line L and for each arc (i, j ) ∈ Ē, create a point (i, j ) on the line L. All points (i, j )
are apart from each other with distance 6 units (Fig. 8.32). At each point (i, j ), add
a disk Sij with center (i, j ) and unit radius. Cut line L into a segment L to include
all disks between two endpoints. Add more unit disks with centers on the segment
L to cover the uncovered part of L such that point (i, j ) is covered only by Sij . Let
B0 denote the set of sensors with constructed disks as their monitoring areas.
Now, let Bi be obtained from B0 in the following way:
ij
• For any (i, j ) ∈ Ē, remove Sij to break B0 into two parts. Add two unit disks Si1
ij
and Si2 to connect the two parts, such that point (i, j ) lies above them.
• For any (j, i) ∈ Ē, remove Sj i to break B0 into two parts. Add two unit disks
ij ij
Si1 and Si2 to connect the two parts, such that point (i, j ) lies below them.
• To make all constructed barrier covers disjoint, unremoved disks in B0 will be
made copies and put those copies into Bi (see Fig. 8.32).
252 8 NP-Hard Problems and Approximation Algorithms
Clearly, G has a simple path (i1 , i2 , . . . , ik ) if and only if there exists a secure
scheduling (Bi1 , Bi2 , . . . , Bik ). Therefore, our construction gives a reduction from
the longest path problem to the secure scheduling problem. Hence, this theorem can
be obtained from Theorem 8.3.9 for the longest path problem.2
Other than the polynomial-time L-reduction, there exist many reductions pre-
serving or amplifying the gap between approximation solutions and optimal solu-
tions. They are very useful tools to establish the inapproximability of a target
problem through transformation from known inapproximability of another problem.
The reader may find more in Chapter 10 of [100].
Exercises
2 Theorem 8.3.9 states for undirected graphs. The same theorem also holds for directed graphs
since the graph can be seen as special case of directed graphs.
Exercises 253
(b) If this problem has a polynomial-time 1/γ -approximation, then the mini-
mum vertex cover problem has a polynomial-time γ -approximation.
4. Show that the following problems are NP-hard:
(a) Given a directed graph, find the minimum subset of edges such that every
directed cycle contains at least one edge in the subset.
(b) Given a directed graph, find the minimum subset of vertices such that every
directed cycle contains at least one vertex in the subset.
5. Show the NP-completeness of the following problem: Given a sequence
of positive integers a1 , a2 , . . . , an , determine whether the sequence can be
partitioned into three parts with equal sums.
6. Given a Boolean formula F , determine whether F has at least two satisfying
assignments. Show that this problem is NP-complete.
7. Show NP-hardness of the following problem: Given a graph G and an integer
k > 0, determine whether G has a vertex cover C of size at most k, satisfying
the following conditions:
(a) The subgraph G|C induced by C has no isolated vertex.
(b) Every vertex in C is adjacent to a vertex not in C.
8. Show that all internal nodes of a depth-first search tree form a vertex cover,
which is 2-approximation for the minimum vertex cover problem.
9. Given a directed graph, find an acyclic subgraph containing maximum number
of arcs. Design a polynomial-time 1/2-approximation for this problem.
10. A wheel is a cycle with a center (not on the cycle) which is connected to
every vertex on the cycle. Prove the NP-completeness of the following problem:
Given a graph G, does G have a spanning wheel?
11. Given a 2-connected graph G and a vertex subset A, find the minimum vertex
subset B such that A ∪ B induces a 2-connected subgraph. Show that this
problem is NP-hard.
12. Show that the following problems are NP-hard:
(a) Given a graph G, find a spanning tree with minimum number of leaves.
(b) Given a graph G, find a spanning tree with maximum number of leaves.
13. Given two graphs G1 and G2 , show the following:
(a) It is NP-complete to determine whether G1 is isomorphic to a subgraph of
G2 or not.
(b) It is NP-hard to find a subgraph H1 of G1 and a subgraph H2 of G1 such
that H1 is isomorphic to H2 and |E(H1 )| = |E(H2 )| reaches the maximum
common.
14. Given a collection C of subsets of three elements in a finite set X, show the
following:
(a) It is NP-complete to determine whether there exists a set cover consisting of
disjoint subsets in C.
254 8 NP-Hard Problems and Approximation Algorithms
|{S ∈ C | S ∩ A = ∅} ∪ {S ∈ D | S ∩ A = ∅}|.
20. Design a FPTAS for the following problem: Consider n jobs and m identical
machine. Assume that m is a constant. Each job j has a processing time pj and
a weight wj . The processing
does not allow preemption. The problem is to find
a scheduling to minimize j wj Cj where Cj is the completion time of job j .
21. Design a FPTAS for the following problem: Consider a directed graph with a
source node s and a sink node t. Each edge e has an associated cost c(e) and
length (e). Given a length bound L, find a minimum cost path from s to t of
total length at most L.
Exercises 255
22. Show the NP-completeness of the following problem: Given n positive integers
1 , a2 , . . . , an , is there a partition (I1 , I2 ) of [n] such that |
a i∈I1 ai −
i∈I2 a i | ≤ 2?
23. (Ron Graham’s Approximation for Scheduling P ||Cmax ) Show that the follow-
ing algorithm gives a 2-approximation for the scheduling P ||Cmax problem:
• List all jobs. Process them according to the ordering in the list.
• Whenever a machine is available, move the first job from the list to the
machine until the list becomes empty.
24. In the proof of Theorem 8.7.4, if letting k be the degree of vertex x, then the
proof can also work. Please complete the construction of replacing vertex x by
cycle G(Fx ).
25. (1-in-3SAT) Given a 3CNF F , is there an assignment such that for each clause
of F , exactly one literal gets value 1? This is called the 1-in-3SAT problem.
Show the following:
(a) The 1-in-3SAT problem is NP-complete.
(b) The planar 1-in-3SAT problem is NP-complete.
(c) The strongly planar 1-in-3SAT is NP-complete.
26. (NAE3SAT) Given a 3CNF F , determine whether there exists an assignment
such that for each clause of F , is there an assignment such that for each clause
of F , not all three literals are equal? This is called the NAE3SAT problem.
Show the following:
(a) The NAE3SAT problem is NP-complete.
(b) The planar NAE3SAT is in P.
27. (Planar 3SAT with Variable Cycle) Given a 3CNF F which has G∗ (F ) with
property that all variables can be connected into a cycle without crossing, is F
satisfiable?
(a) Show that this problem is NP-complete.
(b) Show that the planar Hamiltonian cycle problem is NP-hard.
28. Show that the planar dominating set problem is NP-hard.
29. Show that the following are APX-complete problems:
(a) (Maximum 1-in-3SAT) Given a 3CNF F , find an assignment to maximize
the number of 1-in-3 clauses, i.e., exactly one literal equal to 1.
(b) (Maximum NAE3SAT) Given a 3CNF F , find an assignment to maximize
the number of NAE clauses, i.e., either one or two literals equal to 1.
30. (Network Steiner Tree) Given a network G = (V , E) with nonnegative edge
weight, and a subset of nodes, P , find a tree interconnecting all nodes in P ,
with minimum total edge weight. Show that this problem is APX-complete.
31. (Rectilinear Steiner Arborescence) Consider a rectilinear plan with origin O.
Given a finite set of terminals in the first of this plan, find the shortest
arborescence to connect all terminals, that is, the shortest directed tree rooted at
256 8 NP-Hard Problems and Approximation Algorithms
origin O such that for each terminal t, there is a path from O to t and the path
is allowed to go only to the right or upward. Show that this problem is NP-hard.
32. (Connected Vertex Cover) Given a graph G = (V , E), find a minimum vertex
cover which induces a connected subgraph. Show that this problem has a
polynomial-time 3-approximation.
33. (Weighed Connected Vertex Cover) Given a graph G = (V , E) with nonnega-
tive vertex weight, find a minimum total weight vertex cover which induces a
connected subgraph. Show the following:
(a) This problem has a polynomial-time O(ln n)-approximation where n =
|V |.
(b) For any 0 < ρ < 1, this problem has no polynomial-time (ρ ln n)-
approximation unless NP=P.
34. (Connected Dominating Set) In a graph G, a subset C is called a connected
dominating set if C is a dominating set and induces a connected subgraph.
Given a graph, find a minimum connected dominating set. Show that for any
0 < ρ < 1, this problem has no polynomial-time (ρ ln n)-approximation unless
NP=P where n is the number of vertices in input graph.
35. Show that the following problem is APX-complete: Given a graph with vertex
degree upper-bounded by a constant b, find a clique of the maximum size.
36. Show that the traveling salesman problem does not belong to Poly-APX if the
distance table is not required to satisfy the triangular inequality.
Historical Notes
MAXSNP class. This class is extended to APX by Khanna et al. [242]. Motivated
from the study of MAXSNP-completeness, the PCP theorem and the PCP system
were initiated by Arora et al. [10, 11] and Arora and Safra [13, 14]. With the PCP
system, many results are generated on inapproximability, such as Hastad [206, 207],
Lund and Yannakalis [305], Feige [134], Raz and Safra [347], and Zuckerman
[464, 465].
Recently, techniques developed in above classic results on algorithm designs and
inapproximability proofs have been widely used in the study of wireless sensor
networks and social networks, such as secure scheduling of barrier covers [452]
and influence maximization in various models [302–304].
Chapter 9
Restriction and Steiner Tree
min f (x)
subject to x ∈
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 259
D.-Z. Du et al., Introduction to Combinatorial Optimization, Springer Optimization
and Its Applications 196, https://doi.org/10.1007/978-3-031-10596-8_9
260 9 Restriction and Steiner Tree
• Meanwhile, estimate the cost of modification. Suppose this cost is within a factor
of α from the minimum solution, i.e.,
f (y) − f (x ∗ )
≤ α.
f (x ∗ )
f (y ∗ ) f (y) f (y) − f (x ∗ )
∗
≤ ∗
≤1+ ≤ 1 + α.
f (x ) f (x ) f (x ∗ )
Case 2 Case 1 does not occur. In this case, we cut R by a horizontal segment CD
at the middle of R. CD must lie between horizontal segments in P ∗ , i.e., from
each point on CD, going above and below must meet horizontal segments in P ∗ .
Therefore, the total length of horizontal segments above CD and directly facing CD
is equal to the length of CD. So is the total length of horizontal segments below CD
and directly facing CD. We charge 0.5 to those horizontal segments in P ∗ . Note that
each horizontal segment in P ∗ can be charged at most twice, one from above and
one from below. Therefore, the total charge is at most one.
From argument in the above two cases, we can see that modifying P ∗ into a
guillotine partition needs to add new cut segments of total length not exceeding the
total length of P ∗ . Therefore, the optimal solution for the minimum length guillotine
partition is 2-approximation for the minimum length rectangular partition.
262 9 Restriction and Steiner Tree
Lemma 9.2.3 Let T be a spanning tree and T ∗ be a minimum spanning tree. Let
E(T ) and E(T ∗ ) be their edge sets. Then, there is a one-to-one onto mapping f
from E(T ∗ ) to E(T ) such that
Lemma 9.2.5 By breaking each edge with length longer than R into smaller pieces
of length at most R, the minimum length spanning tree will induce a minimum
steinerized spanning tree.
Proof It follows immediately from Lemma 9.2.3.
Theorem 9.2.6 Suppose that for any set of terminals, there always exists a
minimum spanning tree with vertex degree at most D. Then, the minimum steinerized
spanning tree is (D − 1)-approximation for ST-MSP.
Proof Let P be a set of terminals and S ∗ an optimal tree on input P for ST-MSP.
Suppose S ∗ contains k Steiner points s1 , s2 , . . . , sk in the ordering of the breadth-
first search starting from a node of S ∗ . Let N(P ) denote the number of Steiner points
in a minimum steinerized spanning tree induced from the minimum length spanning
tree on P . We first show a claim that
To do so, consider a minimum length spanning tree T for P ∪ {s1 , . . . , si+1 }, with
degree at most D. Suppose si+1 has adjacent nodes v1 , . . . , vd (d ≤ D). Then,
one of the edges (si+1 , v1 ), . . . , (si+1 , vd ) has length not exceeding R because P ∪
{s1 , . . . , si } has distance at most R from si+1 (Fig. 9.5).
Without loss of generality, assume the length of (si+1 , v1 ) is at most R. Delete
edges (si+1 , v1 ), . . . , (si+1 , vd ), and add d − 1 edges (v1 , v2 ), . . . , (v1 , vd ). This
results in a spanning tree T on P ∪ {s1 , . . . , si }. Since d(v1 , vj ) ≤ d(v1 , si+1 ) +
9.2 Role of Minimum Spanning Tree 265
Note that for any set of terminals in the Euclidean plane, there is a minimum
spanning tree with degree at most 5 (an exercise in Chap. 1). Therefore, we have the
following:
Corollary 9.2.7 The minimum steinerized spanning tree is 4-approximation for ST-
MSP in the Euclidean plane.
Proof It follows immediately from the fact that for any set of terminals in the
Euclidean plane, there is a minimum spanning tree with degree at most 5. We leave
the proof of this fact as an exercise.
The following problem is closely related to ST-MSP:
Problem 9.2.8 (Bottleneck Steiner Tree) Given a set P of terminals in the
Euclidean plane and a positive integer k, find a Steiner tree on P with at most k
Steiner nodes, to minimize the length of longest edge.
Consider a spanning tree T on P . The steinerized spanning tree induced by T is
defined to be the tree obtained in the following way:
Optimal Steinerization:
input: A spanning tree T .
output: A steinerized spanning tree T .
for every edge e ∈ T do n(e) ← 0;
for i = 1 to k do
choose e ∈ T to maximize c(e)/n(e)
(remind: c(e) is the length of edge e)
and set n(e) ← n(e) + 1;
for every edge e ∈ T do
cut e evenly with n(e) Steiner points;
return T .
c(e1 ) c(ei )
= max .
n(e1 ) 1≤i≤t n(ei )
Denote by Opt (k; e1 , . . . , et ) the minimum value of longest edge length after
adding k Steiner points on edges e1 , . . . , et . We will show that
c(ei ) c(e1 )
Opt (k + 1; e1 , . . . , et ) = max max , . (9.2)
2≤i≤t n(ei ) n(e1 ) + 1
By induction hypothesis,
c(e1 )
Opt (k; e1 , . . . , et ) = .
n(e1 )
Note that in the algorithm on input e1 , e2 , . . . , et , if we ignore the step for adding
points on e1 , then the remaining steps are exactly those steps in the algorithm on
input e2 , . . . , et . Therefore, by induction hypothesis, we also have
c(e1 )
Opt (k − n(e1 ); e2 , . . . , et ) = max .
2≤i≤t n(e1 )
Note that
c(e1 )
Opt (k + 1; e1 , . . . , et ) ≤ max Opt (k − n(e1 ); e2 , . . . , et ), .
n(e1 ) + 1
Let n∗ (e1 ) denote the number of Steiner points on e1 in an optimal solution for
Opt (k + 1; e1 , . . . , et ). By (9.3), we must have n∗ (e1 ) > n(e1 ) + 1 and
9.2 Role of Minimum Spanning Tree 267
Note that
c(e1 ) c(e1 )
≤ < Opt (k; e1 , . . . , et ).
n∗ (e1 ) − 1 n(e1 ) + 1
Hence,
c(e1 )
max Opt (k + 1 − n∗ (e1 ); e2 , . . . , et ), ∗
< Opt (k; e1 , . . . , et ),
n (e1 ) − 1
a contradiction.
Lemma 9.2.10 Among steinerized spanning trees, the one induced by minimum
spanning tree reaches the minimum of longest edge length.
Proof It follows immediately from Lemma 9.2.3.
Theorem 9.2.11 The steinerized spanning tree induced by minimum spanning tree
is a 2-approximation solution for the bottleneck Steiner tree.
Proof Consider an optimal Steiner tree T ∗ for the bottleneck Steiner tree problem.
We want to modify T ∗ into a steinerized spanning tree. Note that every Steiner tree
can be decomposed into full components in each of which all terminals are leaves.
Since this decomposition is on terminals, it suffices to consider each full component.
Let T be a full component with k Steiner points with edge length at most R. We
arbitrarily select a Steiner point s as the root. A path from the root to a leaf is called
a root-leaf path. Its length is the number of edges on the path which is equal to
the number of Steiner points on the path. Let h be the length of a shortest root-leaf
path. We will show by induction on the depth d of T that there exists a steinerized
spanning tree for all terminals with at most k −h degree-two Steiner points and edge
length at most 2R. Here, the depth of T is the length of a longest root-leaf path.
For d = 0, T contains only one terminal, so it is trivial. For d = 1, T contains
only one Steiner point. We can directly connect the terminals without any Steiner
points since, by the triangular inequality, the distance between two terminals is at
most 2R.
Next, we consider d ≥ 2. Suppose s has t sons s1 , . . . , st . For each si , there is a
subtree Ti rooted at si with depth ≤ d − 1. Let ki be the number of Steiner points in
Ti and hi the length of a shortest root-leaf path in Ti , from si to a leaf vi (Fig. 9.6).
By induction hypothesis, there exists a steinerized spanning tree Si for all
terminals in Ti with at most ki − hi degree-two Steiner points and edge length
at most 2R. Without loss of generality, assume h1 ≥ h2 ≥ · · · ≥ ht . Connect all
Si for i = 1, . . . , t into a spanning tree S with edges (v1 , v2 ), . . . , (vt−1 , vt ), and
put hi Steiner points on edge (vi , vi+1 ). Note that the path between vi and vi+1 in
T contains hi + hi+1 + 2 edges. By triangular inequality, the distance between vi
268 9 Restriction and Steiner Tree
and vi+1 is at most (hi + hi+1 + 2)R ≤ 2(hi + 1)R. Therefore, hi Steiner points
would break (vi , vi+1 ) into hi + 1 pieces each of length ≤ 2R. Note that S contains
k1 + · · · + kt−1 + kt − ht = k − (ht + 1) Steiner points. Moreover, the path from s
to vt in T contains ht + 1 Steiner points. Hence, h ≤ ht + 1.
The minimum spanning tree also has many applications in the study the energy
efficient problems in wireless ad hoc and sensor networks. Those applications are
based on the following property.
Lemma 9.2.12 Let f be a nonnegative monotone nondecreasing function. Then,
the minimum length spanning tree is an optimal solution for the following problem:
min f (c(e))
e∈T
subject to T is over all spanning trees.
where c and α are positive constants and usually 2 ≤ α ≤ 6. Arc (u, v) is said to
exist if v is able to receive signals from u. Edge (u, v) is said to exist if v can receive
signals from u and u can also receive signals from v.
Suppose a directed graph G = (V , E) is obtained from setting up energy power
at every node. Then, denote
p(G) = p(u).
u∈V
p(G) ≥ p(T )
where
p(T ) = c e α
.
e∈T
Proof In every arborescence T , each arc (u, v) is unique arc going from u.
Therefore, to make (u, v) exist, we need to set up
p = c (u, v) α
.
Therefore,
p(T ) = c e α
.
e∈T
where α ≥ 2.
Proof Since x ∈ P , every edge of T has length at most R. Let Tr be the subgraph
of T , induced by all edges with length at most r. Let n(T , r) denote the number of
connected components of Tr . Then,
9.2 Role of Minimum Spanning Tree 271
R
e α
=α (n(T , r) − 1)r α−1 dr.
e∈T 0
Associate each node u ∈ P with a disk with center u and radius r/2. Then, for
each connected component, those disks form a connected region, and those regions
for different connected components are disjoint (Fig. 9.7). Moreover, since each
such region contains at least one disk with radius r/2, its area is at least π(r/2)2 .
Therefore, the boundary of each region has length at least π r. This is because
for surrounding a certain amount area, circle gives the shortest length. Let a(P , r)
denote the total area covered by those disks with radius r/2. Now, we have
R
a(P , R) = d(a(P , r))
0
R
≥ n(T , r)π rd(r/2)
0
R
π π R2
= (n(T , r) − 1)rdr +
2 0 4
π π R2
= e 2
+ .
4 4
e∈T
π π R2
π(1.5R)2 ≥ e 2
+ .
4 4
e∈T
Hence,
e 2
≤ 8R 2 .
e∈T
272 9 Restriction and Steiner Tree
Theorem 9.2.19 The minimum spanning tree induces an 8-approximation for the
min-energy broadcasting problem.
Proof Consider an optimal solution T ∗ for the min-energy broadcasting problem,
i.e., T ∗ is a minimum energy broadcasting routing. For each internal node u of T ∗ ,
we draw a smallest disk Du to cover all out-arc at u. Let D be the set of such disks
and Ru is the radius of disk Du . Those disks will cover all points in input set S and
the total energy consumption of T ∗ is
cRuα .
Du ∈D
Now, for each disk Du , construct a minimum spanning tree Tu on u and all endpoints
of out-arcs at u (Fig. 9.8). By Lemma 9.2.18,
e α
≤ 8 · Ruα .
e∈Tu
Note that ∪Du ∈D Tu is a spanning tree on all nodes of T ∗ . Let T be a minimum length
spanning tree on all nodes of T ∗ . However, the energy consumption of broadcasting
routing induced by T is at most
c e α
≤ c e α
≤ c · Ruα ≤ 8p(T ∗ ).
e∈T Du ∈D e∈Tu Du ∈D
Lemma 9.2.18 can be improved by more careful argument, so that the constant 8
is brought down to 6 [3], which is tight (Fig. 9.9).
Since for P , PTAS exists, we have that for any ε > 0, there exists a polynomial-
time approximation solution A for P with length
274 9 Restriction and Steiner Tree
length(A) ≤ (1 + ε) · opt (P ).
Connecting each cell center back to terminal in P , we obtain a solution for P with
length at most
L 1 1
length(A) + ≤ (1 + ε) 1 + + · opt (P ).
n n n
1 2
Fig. 9.11 A 3, 3 -cut
Fig.
9.12 Each
1 2
3 3 -partition results in a
,
binary tree with depth
O(log n)
Proof Trivial.
A ( 13 , 23 )-partition is a sequence of ( 13 , 23 )-cuts. Each cut divides a rectangle into
two smaller ones so that all obtained rectangles form a binary tree (Fig. 9.12). Since
each cut line is located at a grid line, every obtained rectangle has area at least
L2 /n4 . Therefore, this binary tree has a depth at most
For each cut segment, we put on m portals such that m portals divide the cut
segment equally (Fig. 9.13). Now, we are ready to describe a restriction as follows:
A rectilinear Steiner tree T is restricted if there exists a ( 13 , 23 )-partition such that
if a segment of T passes through a cut line, then it passes at a portal (Fig. 9.13).
Lemma 9.3.4 Minimum restricted rectilinear Steiner tree can be computed in time
n26 2O(m) by dynamic programming.
Proof Each cut has O(n2 ) choices. It takes O(n2 ) time to select the best one.
To show the lemma, it suffices to prove that the number of subproblems is
O(n24 2O(m) ). Each subproblem can be determined by the following four facts:
1. Determine a rectangle. (There are O(n8 ) possibilities.)
2. Determine position of portals at each edge. (There are O(n4 ) possibilities as
shown in Fig. 9.14.)
276 9 Restriction and Steiner Tree
Define
Then,
xf 2 (x) = f (x) − 1.
Therefore,
√
1± 1 − 4x
f (x) = .
2x
Thus,
k+1 (−4)k+1
N (k) = − ·
1/2 2
0.5(0.5 − 1)(0.5 − 2) · · · (0.5 − k)
= · (−1)k · 22k+1
(k + 1)!
= 2O(k) .
a(R)
nR · ≤ (TR∗ ). (9.4)
3
Now, we move each cross-point to closest portal as shown in Fig. 9.16. In each
rectangle R in the binary tree resulting from the selected ( 13 , 23 )-partition, such
moving will increase the total length within
b(R)
nR ·
m+1
where b(R) is the length of selected cut, i.e., the length of the shorter edge of R. By
(9.4), we have that moving cross-points increases the total length inside R, upper
bounded by
b(R) 3
nR · ≤ · (TR∗ ).
m+1 m+1
Note that as shown in Fig. 9.12, at each level of the binary tree, all rectangles have
disjoint interiors. Therefore, the sum of (TR∗ ) for R overall rectangles at each
level is at most (T ∗ ), the length of the optimal solution. Since the binary tree has
O(log n) levels, the total increased length in moving cross-points to portals is at
most
3
· O(log n) · (T ∗ ). (9.5)
m+1
3·O(log n)
Choose m + 1 = ε . Then, the totally increased length will be at most
ε · (T ∗ ).
9.4 Connected Dominating Set 279
This means that the minimum restricted rectilinear Steiner tree has performance
ratio 1 + ε when it is considered as an approximation of the rectilinear Steiner
minimum tree. Moreover, for the choice of m in (9.5), we have
2O(m) = nO(1) .
• For each cell e, the part of G lying in central area of cell may be broken
into several connected components. Let He denote set of those connected
components.
• For each such connected component H , find a minimum subset CH of nodes in
Se which dominates nodes in H and induces a connected subgraph (Fig. 9.19).
CH will be called a CDS for H in Se .
• Let Ce denote the union of CH for H over He .
• Let Cx,y denote the union of Ce for e over all cells of partition P (x, y).
Let us estimate the running time for computing Cx,y .
9.4 Connected Dominating Set 281
Fig. 9.20 Se √
is partitioned
into (a + 2) 22 small
areas
2
Lemma 9.4.2 Cx,y can be computed in time nO(a ) .
√ Let us first estimate the computation time for Ce . Partition Se into (a +
Proof
2) 22√small areas which are squares or rectangles with edge or longer-edge
length 2/2 (Fig. 9.20). Note that the diameter of each small area is at most one.
If a small area contains a node, then choose one of them which can dominate
others.
√ Therefore, the minimum dominating set for nodes inside Se contains at most
2(a + 2)2 nodes.
In a connected component H , if D is a dominating set, then we can connect D
into a CDS for H by adding at most 2(|D| − 1) nodes. (We leave the proof of this
fact as an exercise.) From this fact, it follows immediately that
√
|Ce | ≤ 2( 2(a + 2)2 − 1) = O(a 2 ).
Denote by ne the number of nodes lying in central area Se . Then, by exhausting
O(a 2 )
search, we can find Ce in ne .
For a > 2(h + 1), each node can lie in Se for at most four cells e. Therefore,
e∈P (x,y) ne ≤ 4n where n = |V |. Thus, total time for computing Cx,y is at most
2 2 2
nO(a
e
)
≤ (4n)O(a ) = nO(a ) .
e∈P (x,y)
282 9 Restriction and Steiner Tree
Therefore,
for analysis of approximation performance, we may consider
e∈P (x,x) |C e | instead of |C(x, x)|.
Let C ∗ be a minimum CDS for input graph G. We need to find a partition P (x, x)
such that C ∗ be modified into a CDS S to satisfy the restriction and
9.4 Connected Dominating Set 283
|Ce | ≤ (1 + ε)|C ∗ |
e∈P (x,x)
where Bx is the set of nodes in C ∗ , lying in boundary area ∪e∈P (x,x) (Se \ e).
Proof Consider a cell e and a connected component H of G ∩ Se . Suppose that
C ∗ ∩ Se does not have a connected component dominating H . Let us consider those
connected components of C ∗ ∩ Se , say C1 , . . . , Ck , each of which dominates at
least one node in H . Since H is connected, there must exist Ci and Cj such that
their distance is at most three, i.e., adding a most two nodes will connect Ci and Cj .
Therefore, adding totally at most 2(k − 1) nodes will connect all C1 , . . . , Ck into
a connected one. Note that h ≥ 2 and all C1 , . . . , Ck are connected outside of Se
through C ∗ . Therefore, every Ci must contain a node lying in outer boundary area
Se \ Se . We choose the one adjacent to a node in H . We may charge 2 to each such
node in k − 1 of them (Fig. 9.22). Moreover, each such node can be charged at most
five times since each node can be adjacent to at most five connected components of
G ∩ Se .
Finally, every node in boundary area can lie in Se for at most four cells e.
Summarize all above, each node in the boundary area can be repeatedly counted
for at most 14 times.
Now, we show the result of analysis.
Theorem 9.4.5 There exists a PTAS for the minimum CDS problem in unit disk
graphs.
Fig. 9.24 Each node appears in horizontal strip once and in vertical strip once
Proof As shown in Fig. 9.23, for each x, the boundary area ∪e∈P (x,x) (Se \ e) can be
covered by horizontal strips and vertical strips. All horizontal strips are disjoint, and
all vertical strips are disjoint for x over X (Fig. 9.24). Therefore, each node appears
in at most one horizontal strip and at most one vertical strip. Therefore, each node
appears in Bx at most twice for x over X. This implies
|Bx | ≤ 2|C ∗ |.
x∈X
Thus, there exists x ∈ X such that |Bx | ≤ 2/m. Choose m ≥ 38/ε. For this x, we
can modify C ∗ into a CDS C such that
|Ce | ≤ (1 + ε)|C ∗ |.
e∈P (x,x)
Exercises
rectangle into two parts such that these two parts contain possibly 2m windows
connected to each other. Those windows are located arbitrarily on two sides of
the cut segment. A m-guillotine partition is a rectangular partition which results
from a sequence of m-guillotine cuts. Please prove the following:
(a) The minimum length 1-guillotine rectangular partition can be computed
in O(n10m+5 ) time by a dynamic programming where n is the number of
holes in input rectangle.
(b) The minimum length 1-guillotine rectangular partition can be a (1 + m1 )-
approximation for the minimum length rectangular partition.
15. Using technique of m-guillotine partition, show that the rectilinear Steiner
minimum tree problem has a PTAS.
16. Using technique of m-guillotine partition, show that the following problem has
a PTAS: Given a rectangular polygon with rectangular holes inside (i.e., each
hole is a smaller rectangular polygon), partition it into hole-free rectangles with
the minimum total length of cuts.
17. Explain why the technique of portals is unable to establish a PTAS for the
minimum length rectangular partition problem.
18. Given a unit disk graph G = (V , E), find a minimum vertex cover. Please
design a PTAS.
19. Given a unit disk graph G = (V , E), find a minimum connected vertex cover.
Please design a PTAS.
20. (Minimum CDS with Routing Cost Constraint) Consider a unit disk graph
G = (V , E) and a CDS C. For any two nodes u and v, denote by dC (u, v)
the shortest distance between u and v passing through intermediate nodes in
C and by d(u, v) the shortest distance between u and v in G. Please find a
constant α > 0 and a PTAS for the minimum CDS such that for any two nodes
u and v, dC (u, v) ≤ αd(u, v).
21. Design a PTAS for the minimum CDS problem in unit ball graphs where a
graph is called a unit ball graph if all nodes can be placed in the Euclidean
three-dimensional space such that there exists an edge between two nodes u
and v if and only if the unit ball with center u intersects the unit ball at center v.
22. Could you design a PTAS for the following problems in unit n-dimensional ball
graphs, where a graph is called a unit n-dimensional ball graph if all nodes can
be placed in the Euclidean n-dimensional space such that there exists an edge
between two nodes u and v if and only if the unit n-dimensional ball with center
u intersects the unit n-dimensional ball at center v.
(a) The minimum CDS.
(b) The minimum CDS with routing cost constraint.
(c) The minimum vertex cover.
(d) The minimum connected vertex cover.
Historical Notes 289
Historical Notes
There are three classic Steiner minimum tree problems, the Euclidean Steiner
minimum tree, the rectilinear Steiner minimum tree, and the network Steiner
minimum tree. The Euclidean Steiner tree is the first one appearing in the literature.
It was initiated by studying Fermat problem [356]: Given three points in the
Euclidean plane, find a point connecting them with shortest total distance. Fermat
problem has two generalizations for more than three given points. One of them
was found by Gauss [356] and unfortunately named Steiner tree by Crourant and
Robbins [75]. The detail story can be found in Schreiber’s article [356].
All three classic Steiner minimum tree problems are NP-hard [148, 163, 165,
166, 236]. Therefore, one has to put a lot of efforts to study approximation solutions.
The minimum spanning tree is the first candidate. Therefore, determination of the
performance ratio of the minimum spanning tree becomes an attractive research
problem. Hwang [221] determined this ratio in the rectilinear plane. However,
for this ratio on the Euclidean Steiner tree, a tortuous story passed a sequence of
publications [67–69, 101, 172, 186, 352].
Does there exist a polynomial-time approximation with worst case performance
ratio better than that of the minimum spanning tree? For the network Steiner
minimum tree, Zelikovsky [437] gave a yes answer. In general, Du, Zhang, and
Feng [106, 107] showed that such approximations exist in all metric space as
long as Steiner minimum tree for a fixed number of points is polynomial-time
computable. Now, much better approximation algorithms have designed. But all
designs include restriction technique. For the network Steiner tree, the k-restricted
Steiner tree is always involved [40, 350], and hence the k-Steiner ratio [33] plays
an important role. For the Euclidean and rectilinear Steiner minimum tree, PTAS
can be constructed with guillotine partition.
The Steiner tree has many applications in the real world. Often, various applica-
tions also generate variations of Steiner tree, such as terminal Steiner trees [94, 289],
Steiner trees with minimum number of Steiner points [288, 310], acyclic directed
Steiner trees [438], bottleneck Steiner trees [394], k-generalized Steiner forest
[159], Steiner networks [173], and selected internal Steiner trees [217]. The
phylogenetic tree alignment can also be considered as a Steiner tree problem with a
given topology in a special metric space [346, 355, 395]. For all of them, restriction
plays an important role in the study of their approximation.
Is there a polynomial-time constant-approximation for weighted dominating
set in unit disk graphs? This open problem was solved by Ambühl, Erlebach,
Mihalák, and Nunkesser [4]. Using partition, they constructed a polynomial-time
72-approximation. Gao, Huang, Zhang, and Wu [160] introduced a new technique,
called the double partition, and improved the ratio to (6 + ε). Following this work,
through a few efforts [77, 128, 462, 463], this ratio is reduced to 4 + ε. Ding
et al. [85] note that above techniques can also be used for the weighted sensor
cover problem in unit disk graphs, which solves a long-standing open problem.
Actually, the unit disk graph is the mathematical formulation of homogeneous
wireless sensor networks. Coverage is an important issue in the study of wireless
sensor networks [58, 59, 201, 300, 301, 421, 423]. In 2005, Cardei et al. [43]
studied a sensor scheduling problem, called the maximum lifetime coverage, and
Historical Notes 291
This function has a property that for any two subcollections A and B,
In fact, comparing μ(A) + μ(B) with μ(A ∪ B), the difference is the number of
elements appearing in both A and B, i.e.,
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 293
D.-Z. Du et al., Introduction to Combinatorial Optimization, Springer Optimization
and Its Applications 196, https://doi.org/10.1007/978-3-031-10596-8_10
294 10 Greedy Approximation
Note that each element appearing in A ∩ B must appear in both A and B. Therefore,
we obtain the inequality. The equality sign may not hold since there may exist some
element appearing in a subset S in A and another subset S in B. However, S = S .
The function μ with inequality (10.1) is called a submodular function. In general,
consider a function f defined over all subsets of a set X, i.e., 2X . f is called a
submodular function if for any two subsets A and B of X,
A ⊂ B ⇒ f (A) ≤ f (B).
The submodular function has a lot of properties. The following two are important
ones:
Lemma 10.1.2 For any subset A and element x, denote x f (A) = f (A ∪ {x}) −
f (A). Then, the following holds:
(a) A set function f : 2X → R is submodular if and only if for any two subsets A
and B with A ⊆ B and for any x ∈ X \ B, x f (A) ≥ x f (B).
(b) A set function f : 2X → R is monotone nondecreasing if and only if for any
two subsets A and B with A ⊆ B and for any x ∈ B \ A, x f (A) ≤ x f (B).
Proof
(a) Suppose f is submodular. Consider two subsets A and B with A ⊂ B and an
element x ∈ X \ B. By modularity, we have
that is,
Conversely, suppose inequality (10.2) holds for any two subsets A and B with
A ⊂ B and for any element x ∈ X \ B. Consider any two subsets U and V .
Suppose U \ V = U \ (U ∩ V ) = (U ∪ V ) \ V = {y1 , y2 , . . . , yk }. Denote
Yi = {y1 , . . . , yi }. Then, we have
f (U ) − f (U ∩ V )
= y1 f (U ∩ V ) + y2 f ((U ∩ V ) ∪ Y1 ) + · · · + yk f ((U ∩ V ) ∪ Yk−1 )
≥ y1 f (V ) + y2 f (V ∪ Y1 ) + · · · + yk f (V ∪ Yk−1 )
= f (U ∪ V ) − f (V ),
that is,
10.1 What Is the Submodular Function? 295
f (U ) + f (V ) ≥ f (U ∪ V ) + f (U ∩ V ).
(b) If f is monotone nondecreasing, then for two subsets A and B with A ⊂ B and
x ∈ B \ A,
f (B) − f (A)
= (f (A ∪ {x1 }) − f (A)) + (f (A ∪ {x1 , x2 }) − f (A ∪ {x1 })) + · · ·
+(f (B) − f (A ∪ {x1 , . . . , xk−1 }))
≥ 0.
Therefore,
Applying this inequality recursively, we obtain the inequality in the statement of the
lemma.
The maximum set coverage problem can be formulated as
max μ(A)
subject to |A| ≤ k,
A ⊆ C.
f (A∗ ) ≤ f (A∗ ∪ Ai )
= f (Ai ) + u1 f (Ai ) + · · · + uk f (Ai ∪ {u1 , u2 , . . . , uk−1 })
≤ f (Ai ) + u1 f (Ai ) + · · · + uk f (Ai )
≤ f (Ai ) + k · xi+1 f (Ai ), (10.5)
where the first inequality is due to the monotonicity of f , the second inequality is
due to the submodularity of f , and the third inequality is due to the greedy rule in
the algorithm.
Denote ai = f (A∗ )−f (Ai ). Then, it follows from (10.5) that ai ≤ k(ai −ai+1 ).
Hence,
In the above proof, we did not consider decision versions of two optimization
problems and construct a polynomial-time many-one reduction between them.
Instead, we directly built a reduction between two optimization problems. Such
a reduction is called a polynomial-time Turing reduction. Generally speaking, a
problem A is said to be polynomial-time Turing reducible to another problem B
if A can be solved in polynomial-time by using B as an oracle (i.e., a subroutine).
Lemma 10.1.7 For any node set A, let σ (A) denote the total number of nodes
influenced by A. Then, σ (A) is a polymatroid function.
Proof We show that for two node subsets A and B with A ⊂ B and v ∈ A,
v σ (A) ≥ v σ (B).
Let I (A) denote the set of nodes influenced by node subset A. Then
In submodular optimization, the generalization of the set cover problem has more
applications. This problem is called the submodular set cover problem described as
follows:
Problem 10.2.1 (Submodular Set Cover (Standard Form)) Let f be a polyma-
troid function over 2X where X is a finite set and c be a nonnegative cost function
on X. Consider the minimization problem:
min c(A) = c(x) (10.6)
x∈A
subject to f (A) = f (X),
A ∈ 2X
γ = max f ({x})
x∈X
10.2 Submodular Set Cover 299
and
γ
1
H (γ ) = .
i
i=1
Proof We claim that the minimum submodular cover problem can be formulated as
the following integer LP:
min c(v)xv (10.7)
v∈X
s.t. v f (S)xv ≥ X−S f (S) for all S ∈ 2X ,
v∈X−S
xv ∈ {0, 1} for v ∈ X.
To show the claim, we first prove that for any set A ∈ 2X satisfying f (A) =
f (X), its indicator vector 1A is a feasible solution of LP (10.7), where 1A = (xv )v∈X
is defined by
1 if v ∈ A,
xv =
0 otherwise.
≥ A\S f (S)
= f (A) − f (S)
= f (X) − f (S)
= X−S f (S),
that is,
0 ≥ f (X) − f (A).
yS ≥ 0 for S ∈ 2X .
kv −1
1 ck+1 ck
v f (S)yS = v f (Ak ) − , (10.9)
H (γ ) rk+1 rk
S:v∈S k=0
10.2 Submodular Set Cover 301
v −1
k
ck+1 ck
v f (Ak ) −
rk+1 rk
k=0
v −1
k
ck ck
= (v f (Ak−1 ) − v f (Ak )) + v v f (Akv −1 )
rk rkv
k=1
kv
ck
= (v f (Ak−1 ) − v f (Ak )) , (10.10)
rk
k=1
where v f (Akv ) = 0 is used in absorbing the second term into the summation. For
any k = 0, 1, . . . , kv − 1, since v ∈ Ak , we have
ck c(v)
≤
rk v f (Ak )
v −1
k
kv
ck+1 ck v f (Ak−1 ) − v f (Ak )
v f (Ak ) − ≤ c(v)
rk+1 rk v f (Ak )
k=0 k=1
g
g
ck
g
ck
c(Ag ) = ck = xk f (Ak−1 ) = (f (Ak ) − f (Ak−1 ))
rk rk
k=1 k=1 k=1
g−1
ck ck+1 cg
= f (Ak ) − + f (Ag )
rk rk+1 rg
k=1
g−1 g−1
ck ck+1 ck+1 ck
= f (Ak ) − + − f (X)
rk rk+1 rk+1 rk
k=1 k=1
g−1
ck+1 ck
= − (f (X) − f (Ak ))
rk+1 rk
k=1
302 10 Greedy Approximation
g−1
= H (γ ) X−Ak f (Ak ) · yAk
k=1
= H (γ ) X−S f (S) · yS
S∈2X
≤ H (γ )opt,
where opt is the optimal value of the integer LP (10.7), which is an upper bound for
the objective value of {yS } in the dual LP (10.8).
Now, we present an example.
Problem 10.2.3 (Positively Dominating Set) Given a graph G = (V , E), find a
minimum positively dominating set where a node v is positively dominated by a node
set A if degA (v) ≥ deg(x). Here, deg(v) is the degree of v in G and degA (v) =
|{u | (u, v) ∈ E, u ∈ A or v ∈ A}|.
For any A ⊆ V , define
g(A) = min(deg(V )/2, degA (v)).
v∈V
We will show that g is a polymatroid function. To do so, we first show two properties
of the polymatroid function.
Lemma 10.2.4 Suppose f is a polymatroid function. Then, for any constant c ≥ 0,
ζ (A) = min(c, f (A)) is a polymatroid function.
Proof Note that for A ⊂ B,
Thus, it suffices to prove the modularity of ζ (A). Consider two node subsets A and
B with A ⊂ B and x ∈ B. We divide the proof into three cases:
Case 1. f (A ∪ {x}) > c.
Lemma
n 10.2.5 If fi for i = 1, 2, . . . , n are polymatroid functions, then f =
i=1 i is a polymatroid function.
f
Proof It follows immediately from the following derivations:
x hv (A) = 0 = x hv (B).
Case 2. v ∈ B \ A.
x hv (A) ≥ 0 = x hv (B).
x hv (A) = 1 = x hv (B).
If (x, v) ∈ E, then
304 10 Greedy Approximation
x hv (A) = 0 = x hv (B).
Now, we show that the positively dominating set problem can be formulated in
the following form:
min |A|
subject to g(A) = g(V )
A ∈ 2V .
Therefore,
γ = max g({v})
v∈V
1
≤ max deg(v) + deg(v)
v∈V 2
3
= max deg(v)
v∈V 2
10.3 Monotone Submodular Maximization 305
3
≤ .
2
Lemma 10.2.4 indicates how to deal with the following general form of the
submodular set cover problem.
Problem 10.2.10 (Submodular Set Cover (General Form)) Let f be a polyma-
troid function over 2X where X is a finite set and c be a nonnegative cost function
on X. Consider the minimization problem:
min c(A) = c(x) (10.11)
x∈A
subject to f (A) ≥ d,
A ∈ 2X ,
S ∈ 2X
where X is the universe set, b(x) is the budget cost of item x, and B is the total
budget.
This problem can also be seen as a generalization of the knapsack problem.
Algorithm 31 is an extension of
greedy 1/2-approximation algorithm for the
knapsack problem. Denote b(A) = x∈A b(x).
306 10 Greedy Approximation
x f (Si )
xi+1 = argmaxx∈X\Si ,
b(x)
f (S ∗ ) ≤ f (Si ∪ S ∗ )
! " ! "
= f (Si ) + y1 f (Si ) + y2 f Si ∪ S1∗ + · · · + yh f Si ∪ Sh−1
∗
b(S ∗ )
= f (Si ) + · (f (Si+1 ) − f (Si )).
b(xi+1 )
b(S ∗ )
αi ≤ · (αi − αi+1 ).
b(xi+1 )
Hence,
10.3 Monotone Submodular Maximization 307
b(x )
b(xi+1 ) − b(Si+1
∗)
αi+1 ≤ 1− αi ≤ αi · e .
b(S ∗ )
Therefore,
b(Sk+1 )
−
opt − f (Sk+1 ) = αk+1 ≤ α0 · e b(S ∗ ) ≤ opt · e−1 .
Hence,
Thus,
Next, we extend the PTAS for the knapsack problem to an algorithm for
Problem 10.3.1 which has a better performance ratio.
Theorem 10.3.3 Algorithm 32 is a (1 − 1/e)-approximation for Problem 10.3.1.
Proof Suppose optimal solution S ∗ = {u1 , u2 , . . . , uh } in ordering
Thus,
and
f (S(I )) ≥ f (I ∪ {v1 , v2 , . . . , vi })
≥ (1 − e−1 )f (S ∗ ) + e−1 f (I ) − vi+1 f (I ∪ {v1 , v2 , . . . , vi }).
Thus,
vi+1 f (I ∪ {v1 , v2 , . . . , vi })
1
≤ (u3 f ({u1 , u2 }) + u2 f ({u1 }) + u1 f (∅))
3
1
= f (I ).
3
Hence,
f (S(I )) ≥ f (I ∪ {v1 , v2 , . . . , vi })
1
≥ (1 − e−1 )f (S ∗ ) + e−1 f (I ) − f (I )
3
≥ (1 − e−1 )f (S ∗ ).
10.3 Monotone Submodular Maximization 309
Clearly, Algorithm 31 runs faster than Algorithm 32 although the performance ratio
is smaller.
Next, we study the monotone submodular maximization with matroid con-
straints.
Problem 10.3.4 Let f be a polymatroid function. Let (X, Ii ) be a matroid for every
1 ≤ i ≤ k. Consider the following problem:
max f (A)
subject to A ∈ Ii for every 1 ≤ i ≤ k.
k
|A \ I | ≤ | ∪ki=1 Ai | ≤ |Ak | ≤ k|B \ A|.
i=1
j
Suppose Hi−1 \ Hi = {v1 , . . . , vr }. Denote Hi−1 = Hi−1 \ {v1 , . . . , vj }. Then
Thus,
r
j
r(f (Ai ) − f (Ai−1 )) ≥ vj f Hi−1 ∪ Ai−1
j =1
! r "
= f Hi−1
0
∪ Ai−1 − f Hi−1 ∪ Ai−1
= f (Hi−1 ) − f (Hi ).
g
k[f (Ag ) − f (A0 )] = r [f (Ai ) − f (Ai−1 )]
i=1
g
≥ [f (Hi−1 ) − f (Hi )]
i=1
= f (H0 ) − f (Hg ).
1
If the objective function f is linear, then the performance ratio k+1 can be
improved to k1 . To see this, let us first show a lemma.
Lemma 10.3.7 Consider an independent system (X, I) which is the intersection of
k matroids (X, I). For any subset F of X, let u(F ) and v(F ) denote the maximal
size and the minimal size of maximal independent set in (X, I), respectively. Then,
u(F )/v(F ) ≤ k.
Proof Consider two maximal independent subsets I and J of F . Let Ii ⊇ I be a
maximal independent subset of I ∪ J with respect to (X, Ii ). For each e ∈ J \ I , if
e ∈ ∩ki=1 (Ii \ I ), then I ∪ {e} is independent in (X, I), contradicting the maximality
10.4 Random Greedy 311
k
k
|Ii | − k|I | = |Ii \ I | ≤ (k − 1)|J \ I | ≤ (k − 1)|J |.
i=1 i=1
max f (S)
subject to |S| ≤ k
S ∈ 2X .
312 10 Greedy Approximation
In this section, we will study a random greedy algorithm for this problem, as
shown in Algorithm 34. This algorithm can be proved not only to have theoretical
guaranteed performance ratio (1 − e−1 ) for monotone nondecreasing function f but
also to have performance ratio 1/e for general f .
Let us first consider the monotone nondecreasing function f .
Theorem 10.4.2 If f is monotone nondecreasing, then random greedy (Algo-
rithm 34) has approximation performance ratio 1 − e−1 .
Proof Let us fix all random process until Ai is obtained for 1 ≤ i ≤ k. Let OP T
be an optimal solution and denote opt = f (OP T ). Therefore, we have
1
E[ui f (Ai−1 )] = · u f (Ai−1 )
k
u∈Mi
1
≥ · u f (Ai−1 )
k
u∈OP T \Ai−1
1
≥ · (f (OP T ∪ Ai−1 ) − f (Ai−1 ))
k
opt − f (Ai−1 )
≥ .
k
The first inequality is due to greedy choice of Ai . The second inequality holds
because f is submodular. The third inequality is true because f is monotone
nondecreasing.
Now, we release the randomness of Ai and Ai−1 . Then, we have
Therefore,
1
opt − E[f (Ai )] ≤ 1 − (opt − E[f (Ai−1 )]).
k
This implies
1 i
opt − E[f (Ai )] ≤ 1 − · (opt − f (A0 ))
k
1 i
≤ 1− · opt.
k
Thus,
1 k
E[f (Ak )] ≥ 1 − 1 − · opt ≥ (1 − e−1 ) · opt.
k
h
= f (∅) + E[xi ] · ui f (Ai−1 )
i=1
h
= f (∅) + pi · ui f (Ai−1 )
i=1
h−1
= (1 − p1 ) · f (∅) + (pi − pi+1 )f (Ai ) + ph · f (A)
i=1
≥ (1 − p) · f (∅).
1
E[ui f (Ai−1 )] = · u f (Ai−1 )
k
u∈Mi
1
≥ · u f (Ai−1 )
k
u∈OP T \Ai−1
1
≥ · (f (OP T ∪ Ai−1 ) − f (Ai−1 ))
k
Exercises 315
The first inequality is due to greedy choice of Ai . The second inequality is true
because f is submodular.
Release randomness of Ai and Ai−1 and take expectation. By Lemma 10.4.4, we
obtain
1
E[ui f (Ai−1 )] ≥ · (E[f (OP T ∪ Ai−1 )] − E[f (Ai−1 )])
k
1 1 i−1
≥ · 1− · opt − E[f (Ai−1 )] .
k k
Thus,
i−1
1 1 1
E[f (Ai )] ≥ · 1− · opt + 1 − · E[f (Ai−1 )]
k k k
i−1
1 1
≥ · 1− · opt.
k k
Exercises
I ∗ = {X \ I | I ∈ I}.
Every node has two states, active and inactive. Before starting a process, every
node is inactive. Initially, for every node v, a threshold θv is selected randomly
from [0, 1] with uniform distribution, and activate at most k nodes, called them
If yes, then v is activated. Otherwise, v is kept inactive. The process ends when
no new active node is produced. The influence spread is the expected number of
active nodes at the end of the influence process. Prove that the influence spread
is a monotone nondecreasing submodular function with respect to the seed set.
19. (Independent Cascade Model) Consider a directed graph G = (V , E). Every
node has two states, active and inactive. Every arc (u, v) has a probability puv
which means that u can influence v successfully with probability puv . Before
starting a process, every node is inactive. Initially, activate at most k nodes,
called them as seeds. Then, an influence process is carried out step-by-step. In
each step, every freshly active node u will activate inactive neighbor through arc
(u, v) with success probability puv where, by a freshly active node, we mean
that a node becomes active at last step. When an inactive node receives influence
from k (k ≥ 2) incoming neighbors, we treat them as k independent events. The
process ends when no new active node is produced. The influence spread is the
expected number of active nodes at the end of the influence process. Prove that
the influence spread is a monotone nondecreasing submodular function with
respect to the seed set.
20. (Mutually Exclusive Cascade Model) Consider a directed graph G = (V , E).
Every node has two states, active and inactive. Every arc (u, v) has a probability
puv which means that u can influence v successfully with probability puv .
Before starting a process, every node is inactive. Initially, activate at most k
nodes, called them as seeds. Then, an influence process is carried out step-by-
step. In each step, every freshly active node u will activate inactive neighbor
through arc (u, v) with success probability puv where, by a freshly active
node, we mean that a node becomes active at last step. When an inactive
node receives influence from k (k ≥ 2) incoming neighbors, we treat them
as k mutually exclusive events. The process ends when no new active node is
produced. The influence spread is the expected number of active nodes at the
end of the influence process. Prove that the mutually exclusive cascade model
is equivalent to the linear threshold model, that is, in both models, the influence
spread is the same function with respect to the seed set.
21. Let f be a monotone nondecreasing submodular function on 2X . Assume A ⊂
A and B ⊆ B . Prove that
22. (General Threshold Model) Consider a directed graph G = (V , E). Every node
v has a monotone nondecreasing threshold function fv on subsets of incoming
Exercises 319
neighbors. Every node has two states, active and inactive. Before stating a
process, every node is inactive. Initially, for every node v, a threshold θv is
selected randomly from [0, 1] with uniform distribution, and activate at most k
nodes, called them as seeds. Then, an influence process is carried out step-by-
step. In each step, every inactive node v checks whether
k
wi (∪j ∈Si Aj )
i=1
Historical Notes
min f (x)
subject to x ∈
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 323
D.-Z. Du et al., Introduction to Combinatorial Optimization, Springer Optimization
and Its Applications 196, https://doi.org/10.1007/978-3-031-10596-8_11
324 11 Relaxation and Rounding
Let wi be the weight of vertex vi . Then, every vertex cover corresponds to a feasible
solution in the following 0-1 integer LP, and the minimum weight vertex cover
corresponds to the optimal solution of the 0-1 integer LP:
11.1 The Role of Rounding 325
min w1 x1 + w2 x2 + · · · + wn xn (11.1)
subject to xi + xj ≥ 1 for (vi , vj ) ∈ E
xi = 0 or 1 for i = 1, 2, · · · , n.
min w1 x1 + w2 x2 + · · · + wn xn (11.2)
subject to xi + xj ≥ 1 for (vi , vj ) ∈ E
0 ≤ xi ≤ 1 for i = 1, 2, · · · , n.
n
n
wi xiA ≤2 wi xi∗
i=1 i=1
and
n the optimal solution of (11.1) has objective function value not smaller than
w x ∗.
i=1 i i
Next, we consider the following problem:
Problem 11.1.3 (MAX-SAT) Given a CNF F , find an assignment to maximize the
number of satisfied clauses.
Suppose F contains m clauses C1 , . . . , Cm and n variables x1 , . . . , xn . We also
first formulate the MAX-SAT problem into an integer LP:
max z1 + z2 + · · · + zm
326 11 Relaxation and Rounding
subject to yi + (1 − yi ) ≥ zj for j = 1, 2, . . . , m,
xi ∈Cj x̄i ∈Cj
yi ∈ {0, 1} for i = 1, 2, . . . , n,
zj ∈ {0, 1} for j = 1, 2, . . . , m,
max z1 + z2 + · · · + zm
subject to yi + (1 − yi ) ≥ zj for j = 1, 2, . . . , m,
xi ∈Cj x̄i ∈Cj
0 ≤ yi ≤ 1 for i = 1, 2, . . . , n,
0 ≤ zj ≤ 1 for j = 1, 2, . . . , m.
Let us first show an inequality since it will be employed not only once.
! "k
Lemma 11.1.4 Let f (z) = 1 − 1 − kz . Then,
Thus,
Proof
% %
E[Zj ] = 1 − (1 − yi∗ ) yi∗
xi ∈Zj x̄i ∈Zj
∗
yi∗
k
xi ∈Zj (1 − yi ) + x̄i ∈Zj
≥ 1− (k = |Zj |)
k
yi∗ + ∗ k
xi ∈Zj x̄i ∈Zj (1 − yi )
≥ 1− 1− .
k
By Lemma 11.1.4,
⎛ ⎞
E[Zj ] ≥ (1 − e−1 ) ⎝ yi∗ + (1 − yi∗ )⎠ = (1 − e−1 ) · zj∗ .
xi ∈Zj x̄i ∈Zj
1
E[ZF ] ≥ optmax-sat · 1 − ,
e
where optmax-sat is the optimal objective function value of the MAX-SAT problem.
Proof By Lemma 11.1.5, we have
Therefore, we have
1
E[ZF |xi =1 ] = E[ZF | x1 = 1] ≥ optlp · 1 −
e
or
1
E[ZF |xi =0 ] = E[ZF | x1 = 0] ≥ optlp · 1 − .
e
In the former case, it means that among assignments with x1 = 1, the expectation
of the number of satisfied clauses not less than optlp · (1 − 1e ). In the latter case,
it means that among assignments with x1 = 0, the expectation of the number of
satisfied clauses is not less than optlp · (1 − 1e ). Motivated from this observation, we
can find the following way to derandomization procedure:
Derandomization
for i = 1 to n do
if E[ZF | xi = 1] ≥ optlp · (1 − 1e )
then xi ← 1 and
F ← F |xi =1
else xi ← 0 and
F ← F |xi =0
end-for.
11.2 Group Set Coverage 329
for 1 ≤ i ≤ k. Therefore,
Hence,
S1 ∪ · · · ∪ Si = S1 ∪ · · · ∪ Sg
Therefore, we can use the same argument as that in Case 1 to show that
n
max yj
i=j
m
s.t. yj ≤ xiS ∀j = 1, . . . , n,
i=1 S:j ∈S∈Gi
xiS ≤ 1 ∀i = 1, . . . , m,
S:S∈Gi
yj ∈ {0, 1} ∀j = 1, . . . , n,
xiS ∈ {0, 1} ∀S ∈ G and i = 1, 2, . . . , m.
n
max yj
i=j
m
s.t. yj ≤ xiS ∀j = 1, . . . , n,
i=1 S:j ∈S∈Gi
xiS ≤ 1 ∀i = 1, . . . , m,
S:S∈Gi
0 ≤ yj ≤ 1 ∀j = 1, . . . , n,
0 ≤ xiS ≤ 1 ∀S ∈ G and i = 1, 2, . . . , m.
follows:
Mutually Exclusive Rounding: For each group Gi , make a mutually exclusive
selection to choose ∗ and not select any subset with
one subset
∗
S with probability xiS
probability 1 − S∈Gi xiS . Set yj = 1 if element j appears in a selected subset, and
yj = 0, otherwise.
Let (xiS , yj ) be a solution obtained from the randomized rounding. We show
properties of this solution.
Lemma 11.2.4 E[yj ] ≥ (1 − e−1 )yj∗ .
Proof For each i = 1, . . . , n,
%
m % ! "
∗
Prob[yj = 0] = 1 − xiS
i=1 S:j ∈S∈Gi
332 11 Relaxation and Rounding
m ∗ Kj
i=1 S:j ∈S∈Gi (1 − xiS )
≤
Kj
(where Kj = |{(i, S) | j ∈ S ∈ Gi }|)
m ∗ Kj
i=1 S:j ∈S∈Gi xiS
= 1−
Kj
Kj
yj∗
≤ 1− .
Kj
Hence,
yi∗ Kj
Prob[yi = 1] ≥ 1 − 1 − .
Kj
where opt is the objective function value of optimal solution for the group set
coverage problem.
Proof By Lemma 11.2.4,
# $
n
n
E yj = E[yj ]
i=1 i=1
n
≥ (1 − e−1 ) · yj∗
j =1
≥ (1 − e−1 ) · opt.
11.3 Pipage Rounding 333
In this section, we introduce a rounding technique, called the pipage rounding since
it can be applied to submodular optimization.
Consider the following problem:
Problem 11.3.1 (Maximum Weight Hitting) Given a collection C of subsets of a
finite set X with nonnegative weight function w on C and a positive integer p, find a
subcollection A of X with |A| = p to maximize the total weight of subsets hit by A.
Assume X = {1, 2, . . . , n} and C = {S1 , S2 , . . . , Sm }. Denote wi = w(Si ). Let
xi be a 0-1 variable to indicate whether element i is in subset A. Then, this problem
can be formulated into the following integer LP:
m
max w j zj (11.3)
j =1
s.t. xi ≥ zj , j = 1, . . . , m,
i∈Sj
n
xi = p
i=1
xi ∈ {0, 1}, i = 1, 2, . . . , n
zj ∈ {0, 1}, j = 1, 2, . . . , m.
m
max L(x) = wj min{1, xi } (11.4)
j =1 i∈Sj
n
s.t. xi = p
i=1
xi ∈ {0, 1}, i = 1, 2, . . . , n
m %
max F (x) = wj (1 − (1 − xi )) (11.5)
j =1 i∈Sj
n
s.t. xi = p
i=1
xi ∈ {0, 1}, i = 1, 2, . . . , n
L(x) and F (x) have the same value when each xi takes value 0 or 1. But when
xi is relaxed to 0 ≤ xi ≤ 1, they may have different values. The following gives a
relationship between them:
Lemma 11.3.2 F (x) ≥ (1 − 1/e)L(x) for 0 ≤ x ≤ 1.
Proof Note that
k
% i∈Sj (1 − xi )
1− (1 − xi ) ≥ 1 − (k = |Sj |)
k
i∈Sj
k
i∈Sj xi
≥ 1− 1− .
k
⎧ ⎫
m ⎨ ⎬
max L(x) = wj min 1, xi (11.6)
⎩ ⎭
j =1 i∈Sj
n
s.t. xi = p
i=1
0 ≤ xi ≤ 1, i = 1, 2, . . . , n
m
max w j zj (11.7)
j =1
s.t. xi ≥ zj , j = 1, . . . , m,
i∈Sj
n
xi = p
i=1
0 ≤ xi ≤ 1, i = 1, 2, . . . , n
0 ≤ zj ≤ 1, j = 1, 2, . . . , m.
Pipage Rounding
x ← x∗;
while x has an noninteger component do begin
choose 0 < xk < 1 and 0 < xj < 1 (k = j );
define x(ε) by setting
⎧
⎨ xi if i = k, j,
xi (ε) = xj + ε if i = j,
⎩
xk − ε if i = k;
define ε1 = min(xj , 1 − xk ) and ε2 = min(1 − xj , xk );
if F (x(−ε1 )) ≥ F (x(ε2 ))
then x ← x(−ε1 )
else x ← x(ε2 );
end-while;
return x̄ = x.
336 11 Relaxation and Rounding
The existence of x_k and x_j is due to the fact that when x has a noninteger component, x has at least two noninteger components, since Σ_{i=1}^n x_i = p is an integer.
The following is an important property of F (x(ε)):
Lemma 11.3.3 F (x(ε)) is convex with respect to ε.
Proof If S_j contains only one of k and j, then the jth term of F(x(ε)), corresponding to 1 − Π_{i∈S_j} (1 − x_i(ε)), is linear and hence convex with respect to ε. If S_j contains both k and j, then the jth term of F(x(ε)) is of the form
In general, pipage rounding applies to an integer program on a bipartite graph G = (U, V, E) of the form

max L(x)    (11.8)
s.t. Σ_{e∈δ(v)} x_e ≤ p_v for v ∈ U ∪ V
  (or Σ_{e∈δ(v)} x_e = p_v, or Σ_{e∈δ(v)} x_e ≥ p_v),
x_e ∈ {0, 1} for e ∈ E.
As shown in Fig. 11.3, suppose L(x) has a companion F(x) such that
(A1) L(x) = F(x) for x_e ∈ {0, 1}.
(A2) L(x) ≤ cF(x) for 0 ≤ x_e ≤ 1.
We also assume the following:
(A3) The relaxation of the integer program (11.8), obtained by replacing x_e ∈ {0, 1} with 0 ≤ x_e ≤ 1 for e ∈ E, is equivalent to an LP or is polynomial-time solvable.
1. Consider the subgraph H_x of G induced by all edges e with 0 < x_e < 1. Let R be a cycle or a maximal path of H_x. Then, R can be decomposed into two matchings M_1 and M_2.
2. Define x(ε) by

x_e(ε) = x_e if e ∉ R;  x_e + ε if e ∈ M_1;  x_e − ε if e ∈ M_2.

Define

ε_1 = min( min_{e∈M_1} x_e, min_{e∈M_2} (1 − x_e) ),
ε_2 = min( min_{e∈M_1} (1 − x_e), min_{e∈M_2} x_e ).

3. If F(x(−ε_1)) ≥ F(x(ε_2))
then x ← x(−ε_1)
else x ← x(ε_2).
For the maximum weight hitting problem, we face a star (which is a bipartite graph) G = (U, V, E) where U = {u}, V = {v_1, v_2, . . . , v_n}, and E = {(u, v_1), (u, v_2), . . . , (u, v_n)}. Each variable x_i corresponds to an edge (u, v_i).
Therefore, in each iteration of pipage rounding, we deal with a maximal path
consisting of two edges.
11.4 Continuous Greedy

Why can pipage rounding be applied to set function optimization? We may get some idea from relaxation by expectation for set functions.
Consider a set function f(S) on subsets of a finite set X. Let X = {1, 2, . . . , n}. For each element i, let x_i be an indicator of whether i belongs to subset S, i.e., x_i = 1 if i ∈ S and x_i = 0 otherwise. Relax each x_i to a value in [0, 1] and interpret it as the probability that i belongs to a random subset R, independently over elements; then define

F(x) = E[f(R)] = Σ_{R⊆X} f(R) Π_{i∈R} x_i Π_{i∉R} (1 − x_i).

This is called the multilinear extension of f, which has the following property:
Theorem 11.4.1 Suppose F is the multilinear extension of f . Then,
1. If f is monotone nondecreasing, then F is monotone nondecreasing along any
direction d ≥ 0.
2. If f is submodular, then F is concave along any direction d ≥ 0.
3. If f is submodular, then F is convex along the line with direction e_i − e_j (i ≠ j), where e_i has its ith component equal to 1 and all others equal to 0.
Proof
1. Note that

∂F(x)/∂x_i = F(x_1, . . . , x_{i−1}, 1, x_{i+1}, . . . , x_n) − F(x_1, . . . , x_{i−1}, 0, x_{i+1}, . . . , x_n)
= E[f(R ∪ {i})] − E[f(R)].

Since f is monotone nondecreasing,

∂F(x)/∂x_i ≥ 0.

Hence, for any direction d ≥ 0,

dF(x + αd)/dα = ⟨d, ∇F(x + αd)⟩ ≥ 0.

2. Note that

∂²F(x)/∂x_i ∂x_j = E[f(R ∪ {i, j})] − E[f(R ∪ {i})] − E[f(R ∪ {j})] + E[f(R)]
= E[f(R ∪ {i, j}) − f(R ∪ {i}) − f(R ∪ {j}) + f(R)]
≤ 0

by the submodularity of f. Moreover, since F is linear in each single variable,

∂²F(x)/∂x_i² = 0.

Hence, for any direction d ≥ 0,

d²F(x + αd)/dα² = dᵀ H_F(x + αd) d ≤ 0,

where H_F denotes the Hessian of F, so F is concave along d.

3. For d = e_i − e_j (i ≠ j),

d²F(x + αd)/dα² = dᵀ H_F(x + αd) d = −2 ∂²F(x)/∂x_i ∂x_j ≥ 0,

so F is convex along e_i − e_j.
max f(S)
subject to S ∈ I,

and

F(x + d) ≥ F(x).

Therefore,

Since v_max(x(t/n)) ∈ P(M) and P(M) is convex and closed, we have x(1) ∈ P(M).
2. Note that

dF(x(t))/dt = ⟨x′(t), ∇F(x(t))⟩ = ⟨v_max(x(t)), ∇F(x(t))⟩.

Therefore,

Thus,

dF(x(t))/dt ≥ opt − F(x(t)).
Define g(t) = F(x(t)) and h(t) = g′(t) + g(t), so that h(t) ≥ opt. Solving this differential equation with g(0) = 0, we obtain

g(t) = ∫_0^t e^{s−t} h(s) ds.

Hence,

F(x(1)) = g(1) = ∫_0^1 e^{s−1} h(s) ds ≥ opt · ∫_0^1 e^{s−1} ds = opt · (1 − e^{−1}).
| (1/t) Σ_{i=1}^t f(S_i) − F(x) | ≤ ε · f(X)

with probability at least 1 − e^{−tε²/4}, where S_1, . . . , S_t are random subsets based on element selection probability x. Therefore, if t = O(1/ε²), then with a constant probability, we can compute an approximation of F(x) within ε · f(X) error.
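A minimal Python sketch of this sampling estimate, with an illustrative coverage function standing in for f:

    import random

    def sample_F(f, x, t):
        # average f over t random sets; element i is included with probability x[i]
        total = 0.0
        for _ in range(t):
            R = {i for i, p in enumerate(x) if random.random() < p}
            total += f(R)
        return total / t

    sets = [{0, 1}, {1, 2}, {2, 3}]                 # illustrative instance
    f = lambda R: sum(1 for S in sets if S & R)     # coverage function
    x = [0.5, 0.25, 0.75, 0.5]
    print(sample_F(f, x, t=4000))                   # estimate of F(x)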
Similarly, the sampling method can be employed for computing ∇F(x) since

∂F(x)/∂x_i = E[f(R ∪ {i})] − E[f(R)].
(2) How to compute v_max(x) = argmax_{v∈P(M)} ⟨v, ∇F(x)⟩? It looks like trouble, but it is not real trouble. This is a linear program. Why does it look like trouble? Because P(M) is not described by a constant number of constraints. From matroid theory, we know several ways to represent P(M); however, in general, every one of them involves a large number of constraints. For example, there is an exponential number of inequalities in the following representation:

P(M) = { x ≥ 0 | ∀ S ∈ 2^X : Σ_{j∈S} x_j ≤ r_M(S) }.

Why is it not real trouble? The reason is that P(M) is the convex hull of the vectors 1_I for I ∈ I, where

(1_I)_i = 1 if i ∈ I and (1_I)_i = 0 otherwise.

Since an optimal solution of a linear program can be found at a vertex, v_max(x) can be taken to be a solution of max{⟨1_I, ∇F(x)⟩ | I ∈ I}. This can be solved by a greedy algorithm since ∇F(x) ≥ 0.
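For instance, the following Python sketch computes v_max(x) with the greedy algorithm, given the (sampled) gradient and an independence oracle; the uniform-matroid oracle is an illustrative assumption.

    def greedy_vmax(grad, independent):
        # scan elements in nonincreasing gradient order (grad >= 0 for monotone f)
        I = set()
        for i in sorted(range(len(grad)), key=lambda i: -grad[i]):
            if independent(I | {i}):
                I.add(i)
        v = [0.0] * len(grad)
        for i in I:
            v[i] = 1.0
        return v                                  # the vertex 1_I of P(M)

    grad = [0.4, 0.9, 0.1, 0.7]                   # illustrative gradient estimate
    uniform_rank2 = lambda S: len(S) <= 2         # uniform matroid of rank 2
    print(greedy_vmax(grad, uniform_rank2))       # [0.0, 1.0, 0.0, 1.0]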
(3) How to solve the differential equation x′(t) = v_max(x(t)) numerically? Algorithm 39 shows a simple numerical solution. Using Algorithm 39 to solve the differential equation, we can obtain the following:
Lemma 11.4.5 x(1) ∈ P(M) and

F(x(1)) ≥ (1 − (1 − 1/n)^n) · opt − c/n.

Proof Note that

x(1) = (1/n) Σ_{t=0}^{n−1} v_max(x(t/n)),
that is, x(1) is a convex combination of points in P (M). Therefore, x(1) ∈ P (M).
Next, we employ Taylor expansion on F (x((t + 1)α)) = F (x(tα) + α ·
vmax (x(tα))). For some constant c, we have
The second inequality is due to Lemma 11.4.3. Exchanging two sides and adding
opt, we obtain
(4) How to round x(1) into an integer solution? By Theorem 11.4.1, F is convex
along line ei − ej . Therefore, we can apply pipage rounding to obtain an integer
solution x̂ such that F (x̂) ≥ F (x(1)).
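Putting (1)–(4) together, the following Python sketch runs the discretized continuous greedy; the sampled gradient, the illustrative instance, and the step count are assumptions, and the final pipage-rounding step is omitted.

    import random

    def grad_F(f, x, t=500):
        # sampled gradient: dF/dx_i = E[f(R + i)] - E[f(R - i)]
        n = len(x)
        g = [0.0] * n
        for _ in range(t):
            R = {i for i, p in enumerate(x) if random.random() < p}
            for i in range(n):
                g[i] += f(R | {i}) - f(R - {i})
        return [v / t for v in g]

    def continuous_greedy(f, n_elem, independent, steps):
        x = [0.0] * n_elem
        for _ in range(steps):
            g = grad_F(f, x)
            I = set()                       # v_max via greedy over the matroid
            for i in sorted(range(n_elem), key=lambda i: -g[i]):
                if independent(I | {i}):
                    I.add(i)
            for i in I:                     # x <- x + (1/steps) * 1_I
                x[i] += 1.0 / steps
        return x                            # fractional point in P(M)

    sets = [{0, 1}, {1, 2}, {2, 3}]         # illustrative instance
    f = lambda R: sum(1 for S in sets if S & R)
    print(continuous_greedy(f, 4, lambda S: len(S) <= 2, steps=10))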
Exercises

P(M) = { x ∈ R_+^X | for every S ⊆ X, x(S) ≤ r_M(S) }

can be described by

P_base(M) = { x ∈ R_+^X | for all S ⊆ X, x(S) ≤ r(S), x(X) = r(X) }.
where θ is uniformly random in [0, 1]. Prove that f is submodular if and only if f^L is convex.
14. Consider an undirected graph G = (V , E). We intend to orient every edge
such that each node has at most k incoming edges. Prove that this orientation is
possible if and only if |E[W ]| ≤ k|W | for every subset W of V , where E[W ]
is the edge set of the subgraph induced by W.
15. Design a continuous greedy approximation algorithm for submodular maxi-
mization with a knapsack constraint.
16. Design a continuous greedy approximation algorithm for submodular maxi-
mization with k matroid constraints.
Historical Notes
12 Nonsubmodular Optimization

In the real world, there are many set function optimization problems whose objective function and/or constraint is neither submodular nor supermodular. Usually, it is hard to study their approximation solutions. In this chapter, we summarize existing efforts in the literature.
12.1 An Example
Rumor is an important research subject in the study of social networks since its spread can have many negative effects. For example, a rumor about an earthquake may cause public panic, and a rumor about a political leader's health may shake the stock market. Therefore, many publications in the literature have proposed methods to block the spread of rumors. In this section, we introduce one of them: blocking the rumor by cutting at nodes.
Consider a social network represented by a directed graph G = (V, E) with the independent cascade (IC) model for information diffusion. In this model, every node has two states, active and inactive. A node being active means that it has been influenced. In the case we are studying, the spread of a rumor, an active node is a node that has received the negative influence, i.e., it has been influenced by the rumor. In the IC model, the information diffusion process consists of discrete steps. In the initial step, a subset of nodes, called seeds, are activated. In the spread of a rumor, the seeds are the rumor sources. In each subsequent step, every newly active node
tries to influence its inactive out-neighbors, where a node is called newly active if it became active in the previous step. Suppose u is a newly active node and v is an inactive out-neighbor of u. Then, v gets influenced by u, i.e., v becomes active, with probability p_uv, which is given as part of the model. When an inactive node v receives influence from more than one newly active in-neighbor, we assume that all newly active in-neighbors influence v independently. The process ends when no node becomes active in the current step.
Now, consider a situation in which a rumor is spreading in a social network G following the rule of the IC model. There may exist one or more rumor sources. Due to a budget limit, only k monitors are available for screening out the rumor and blocking it from passing through the monitored nodes. We meet the following problem:
Problem 12.1.1 (Blocking Rumor by Node Cuts) Given a set of rumor sources,
how to allocate k monitors to maximize the expectation of the number of blocked
nodes?
Let S denote the set of rumor sources spreading the same rumor and I_G(S) the set of active nodes in G, influenced by the information spread from S. Define σ_G(S) = E[|I_G(S)|], the expected number of nodes in I_G(S). Then, the expectation of the number of blocked nodes is

τ(C) = σ_G(S) − σ_{G∖C}(S),    (12.1)

where C is the set of monitors and G∖C is the network with the monitored nodes removed. The problem of blocking rumor by node cuts can be expressed as
max_{C: |C|≤k} τ(C).
max c1 x1 + c2 x2 + · · · + cn xn
subject to b1 x1 + b2 x2 + · · · + bn xn ≤ B
for j = 1, . . . , b_i, where all C_i are disjoint (Fig. 12.1). On every arc (x, y), set p_xy = 1. Thus, if we intend to block the rumor from influencing C_i, then we must allocate at least b_i monitors at the nodes u_{ij} for 1 ≤ j ≤ b_i. Clearly, the knapsack problem has a feasible solution with objective function value at least c if and only if B monitors can be allocated in the constructed social network to protect at least B + c nodes from the rumor influence.
Recall that a set function f : 2^X → R is submodular if for any two sets A ⊂ B and any element x ∉ B, Δ_x f(A) ≥ Δ_x f(B), where Δ_x f(A) = f(A ∪ {x}) − f(A).
Proposition 12.1.3 The function τ (·) defined in (12.1) is not submodular.
Proof Consider the social network shown in Fig. 12.2(a). It has four nodes r, u_1, u_2, and v and two paths (r, u_1, v) and (r, u_2, v). For every arc (u, v), assign p_uv = 1. r is the unique rumor source. Then, Δ_{u_1} τ(∅) = 1 and Δ_{u_1} τ({u_2}) = 2. Therefore, Δ_{u_1} τ(∅) < Δ_{u_1} τ({u_2}), contradicting the definition of submodularity.
A set function f : 2^X → R is supermodular if −f is submodular, that is, for any two sets A ⊂ B and any element x ∉ B, Δ_x f(A) ≤ Δ_x f(B).
Proposition 12.1.4 The function τ (·) defined in (12.1) is not supermodular.
Proof Consider the social network shown in Fig. 12.2(b). It has four nodes r, u_1, u_2, and v and a path (r, u_1, u_2, v). For every arc (u, v), assign p_uv = 1. r is the unique rumor source. Then, Δ_{u_1} τ(∅) = 3 and Δ_{u_1} τ({u_2}) = 1. Hence, Δ_{u_1} τ(∅) > Δ_{u_1} τ({u_2}), contradicting the definition of supermodularity.
How do we study a maximization problem for functions that are neither submodular nor supermodular? We introduce some approaches in this chapter.
and

η(q) ≥ 2 > 0

and

τ(q) ≥ 1 > 0.

Hence,

η(h) = 2 · ( −min{η(f), τ(f)} / min{η(q), τ(q)} ) · η(q) ≥ −2 · min{η(f), τ(f)} > 0

and

τ(h) = 2 · ( −min{η(f), τ(f)} / min{η(q), τ(q)} ) · τ(q) ≥ −2 · min{η(f), τ(f)} > 0.

Therefore,
In the real world, the DS decomposition sometimes exists quite naturally. For example, consider viral marketing in social networks with the independent cascade model for information diffusion. To advertise a product, the company initially has to distribute some free samples or discount coupons to potential buyers. These buyers form the seed set S for the marketing. Let I(S) be the set of active nodes at the end of the information diffusion process. Then, E[|I(S)|] is the expected number of active nodes, i.e., the expected number of customers who will adopt the product. Suppose the price of the product is c. The profit received by the company is

c · E[|I(S)|] − d · |S|,

where d is the cost of each free sample or the discount lost on each seed. It can be proved that both terms are monotone nondecreasing and submodular.
For each specific set function, one is always able to find a DS decomposition in
some way. However, no efficient approach has been found to do so. Therefore, there
exists an important open problem here.
Open Problem 1 Is there an efficient algorithm to produce a DS decomposition for
any given set function?
A set function m over 2^X is a modular function if for any two sets A and B, m(A) + m(B) = m(A ∪ B) + m(A ∩ B). The following lemma indicates that a modular function is similar to a linear set function.

Lemma 12.2.3 For any modular function m : 2^X → R,

m(A) = m(∅) + Σ_{x∈A} (m({x}) − m(∅)).
Therefore,

f(A) ≤ f(Y) + Σ_{j∈A∖Y} Δ_j f(A ∩ Y) − Σ_{j∈Y∖A} Δ_j f(Y ∖ {j})
≤ f(Y) + Σ_{j∈A∖Y} Δ_j f(∅) − Σ_{j∈Y∖A} Δ_j f(Y ∖ {j})
= m_u(A).

Moreover, for any set A ⊆ X with A ≠ ∅, suppose A = {x_{i_1}, x_{i_2}, . . . , x_{i_k}}; then we have

m_u(Y) = f(Y) = m_l(Y)

and

m_u ≥ f ≥ m_l.
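A minimal Python sketch of the modular upper bound m_u anchored at a set Y, following the displayed inequalities; the coverage instance used to test it is an illustrative assumption.

    def modular_upper_bound(f, X, Y):
        # m_u(A) = f(Y) + sum_{j in A\Y} Delta_j f(empty) - sum_{j in Y\A} Delta_j f(Y\{j})
        base = f(Y)
        gain = {j: f({j}) - f(set()) for j in X - Y}
        loss = {j: f(Y) - f(Y - {j}) for j in Y}
        def m_u(A):
            return (base
                    + sum(gain[j] for j in A - Y)
                    - sum(loss[j] for j in Y - A))
        return m_u

    sets = [{0, 1}, {1, 2}, {2, 3}]                  # illustrative instance
    f = lambda A: sum(1 for S in sets if S & A)      # coverage: submodular
    X, Y = {0, 1, 2, 3}, {1, 2}
    m_u = modular_upper_bound(f, X, Y)
    print(f(Y), m_u(Y))            # equal at A = Y
    print(f({0, 3}), m_u({0, 3}))  # m_u dominates f elsewhere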
12.3 Parameterized Methods

To deal with nonsubmodular optimization, one intends to measure how far the function differs from submodularity. Motivated by this intention, several parameters have been introduced, and theoretical results for submodular optimization have been extended to nonsubmodular optimization, usually with the parameter involved in the performance analysis. Let us give two examples in the following.
Consider a set function f : 2^X → R. The supermodular degree of an element u ∈ X by the function f is defined to be |D^+(u)|, where D^+(u) = {v ∈ X | Δ_u f(A ∪ {v}) > Δ_u f(A) for some A ⊆ X}. Consider the problem

max f(A)
subject to A ∈ C_i for i = 1, 2, . . . , k,
Suppose H_{i−1} ∖ H_i = {v_1, . . . , v_r}. Denote H^j_{i−1} = H_{i−1} ∖ {v_1, . . . , v_j}. Then,

= Δ_{v_j} f((D^+(v_j) ∩ H^j_{i−1}) ∪ S_{i−1}) + f((D^+(v_j) ∩ H^j_{i−1}) ∪ S_{i−1}) − f(S_{i−1})
≥ Δ_{v_j} f(H^j_{i−1} ∪ S_{i−1}).

Since

((D^+(v_j) ∩ H^j_{i−1}) ∖ S_{i−1}) ∪ {v_j} ∪ S_{i−1} = (D^+(v_j) ∩ H^j_{i−1}) ∪ {v_j} ∪ S_{i−1}

and (v_j, (D^+(v_j) ∩ H^j_{i−1}) ∖ S_{i−1}) is a candidate pair for the ith step of the algorithm, we have

f(S_i) − f(S_{i−1})
≥ f( ((D^+(v_j) ∩ H^j_{i−1}) ∖ S_{i−1}) ∪ {v_j} ∪ S_{i−1} ) − f(S_{i−1})
= f( (D^+(v_j) ∩ H^j_{i−1}) ∪ {v_j} ∪ S_{i−1} ) − f(S_{i−1})
≥ Δ_{v_j} f( (D^+(v_j) ∩ H^j_{i−1}) ∪ S_{i−1} )
≥ Δ_{v_j} f( H^j_{i−1} ∪ S_{i−1} ).

Thus,

r (f(S_i) − f(S_{i−1})) ≥ Σ_{j=1}^r Δ_{v_j} f( H^j_{i−1} ∪ S_{i−1} )
= f( H^0_{i−1} ∪ S_{i−1} ) − f( H^r_{i−1} ∪ S_{i−1} )
= f(H_{i−1}) − f(H_i).
k(D^+ + 1) [f(S_g) − f(S_0)] ≥ Σ_{i=1}^g r [f(S_i) − f(S_{i−1})]
≥ Σ_{i=1}^g [f(H_{i−1}) − f(H_i)]
= f(H_0) − f(H_g).
min g(A)
subject to f(A) = f(X)

γ = max_{x∈X} f({x})

and H(γ) = Σ_{i=1}^γ 1/i.
Proof Define

c(A) = Σ_{x∈A} g(x).

Then, Algorithm 41 is exactly the greedy algorithm for the submodular set cover (standard form) problem. Let S be the set obtained by the algorithm. Then,

c(S) ≤ H(γ) · c(OPT_c)

and

c(OPT_c) ≤ c(OPT_g) = Σ_{x∈OPT_g} g(x) ≤ χ(g) · g(OPT_g).

Therefore,

g(S) ≤ c(S) ≤ χ(g) H(γ) · g(OPT_g).
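A minimal Python sketch of this greedy algorithm for min c(A) subject to f(A) = f(X), picking the element with the best marginal gain per unit cost; the instance is illustrative.

    def greedy_submodular_cover(f, cost, X):
        A = set()
        target = f(X)
        while f(A) < target:
            # element with maximum marginal gain per unit cost
            best = max((x for x in X - A if f(A | {x}) > f(A)),
                       key=lambda x: (f(A | {x}) - f(A)) / cost[x])
            A.add(best)
        return A

    sets = [{0, 1}, {1, 2}, {2, 3}, {3, 4}]              # illustrative instance
    f = lambda A: len(set().union(*(sets[i] for i in A)))
    cost = {0: 1.0, 1: 2.0, 2: 1.5, 3: 1.0}
    print(greedy_submodular_cover(f, cost, {0, 1, 2, 3}))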
Problem 12.3.3 is also closely related to the generalized hitting set problem as
follows:
Problem 12.3.5 (Generalized Hitting Set) Given m nonempty collections
C1 , C2 , . . . , Cm of subsets of a finite set X, find the minimum subset A of X such that
every Ci has a member S ⊆ A.
Let C = ∪_{i=1}^m C_i. For every subcollection A ⊆ C, define

g(A) = | ∪_{A∈A} A |
and
This equivalence means that A is a minimum solution of problem (12.2) if and only
if ∪A∈A A is the minimum solution of the generalized hitting set problem.
Proof Suppose A is a minimum solution of problem (12.2). For contradiction, suppose ∪_{A∈A} A is not a minimum generalized hitting set. Consider a minimum generalized hitting set D. Then, |D| < |∪_{A∈A} A|. For each C_uv, let C′_uv be a member of C_uv contained in D. Denote

a contradiction.
Now, we present an application. Consider a wireless sensor network. Each sensor
has a communication disk and a sensing disk with itself as common center. If sensor
s1 lies in the communication disk of s2, then s1 can receive messages from s2. When
all sensors have the same size of communication disks and the same size of sensing
disks, they are said to be homogeneous. In a homogeneous wireless sensor system,
the communication network is an undirected graph, in which a virtual backbone is
a connected dominating set, that is, it is a node subset such that every node is either
in the subset or adjacent to the subset. Construction of the virtual backbone is an
important issue in the study of wireless sensor networks.
where α is a constant.
Condition (12.3) can be simplified as follows:
Lemma 12.3.9 To satisfy condition (12.3), it is sufficient to satisfy
Lemma 12.3.10 Suppose D contains a maximal independent set I . If for any two
nodes x and y in I with distance at most four, dD (x, y) ≤ α, then for any two nodes
u and v with distance two, dD (u, v) ≤ α + 2.
d_D(u, v) ≤ d_D(u′, v′) + 2 ≤ α + 2.
Motivated by this lemma, we divide the construction into two stages.
In the first stage, we construct a maximal independent set I . It is worth
mentioning a well-known conjecture about the maximal independent set.
Conjecture 12.3.11 Let α(G) be the size of the maximum independent set in graph
G and γc (G) the size of the minimum connected dominating set in graph G. Then,
for any unit disk graph G (i.e., the graph structure of any homogeneous wireless
sensor network),
α(G) ≤ 3 · γc (G) + 3.
This conjecture is still open. The best proved result (see [115]) is
Since |I| ≤ α(G) and γ_c(G) is a lower bound for opt_{α-cds}, the minimum size of a routing-cost constrained connected dominating set with parameter α, we have
In the second stage, for every pair of nodes u and v in I with distance at most four, let C_uv denote the collection of node subsets, each of which is the set of intermediate nodes on a path between u and v of length at most α (α ≥ 4). Let D be a node set hitting every C_uv. Then, D satisfies constraint (12.4), and hence (12.3) holds. This implies that D is a connected dominating set. Thus, the routing-cost constrained CDS problem is equivalent to the generalized hitting set problem with input collections C_uv.
Now, we prove that in this example, χ(g) and γ are bounded by constants:
Lemma 12.3.12
γ ≤ 25
and
χ (g) ≤ 420.
Proof Note that each node is adjacent to at most five nodes in an independent set. Thus, there exist at most 25 paths of length 2 sharing the same intermediate node with endpoints in I. Therefore, γ = max_{S∈C} f({S}) ≤ 25.

To estimate χ(g), we first note that there are at most π(d + 0.5)² / (π · 0.5²) = (2d + 1)² independent nodes within distance d from any node.
Suppose u is an intermediate node of a path between x and y in I with at most
distance 4. There are two cases:
Case 1. Both x and y are within a distance 2 from u. The number of possible pairs
{x, y} is at most 25(25 − 1)/2 = 300.
Case 2. One of x and y has distance one from u and the other has distance three
from u. The number of possible pairs {x, y} in this case is at most 5×(72 −52 ) =
5 × 24 = 120.
Putting two cases together, we obtain χ (g) ≤ 420.
Theorem 12.3.13 For any connected unit disk graph G, a connected dominating
set D can be constructed, in polynomial-time, to satisfy
and
Proof Let A be obtained by the greedy algorithm hitting all Cuv for all pairs of nodes
u, v in a maximal independent set I with d(u, v) ≤ 4. Then, by Lemma 12.3.12,
Moreover, by Lemma 12.3.10, we have that for any two nodes u and v with distance
two
dD (u, v) ≤ 6.
12.4 Sandwich Method

The sandwich method has been used quite often for solving nonsubmodular optimization problems in the literature. It runs as follows. Suppose we face a problem max_{A∈Ω} f(A), where Ω ⊆ 2^X and X is a finite set.

Sandwich Method:
• Input a set function f : 2^X → R.
• Initially, find two submodular functions u and l such that u(A) ≥ f(A) ≥ l(A) for A ∈ Ω. Then, carry out the following operations:
  – Compute an α-approximation solution S_u for max_{A∈Ω} u(A) and a β-approximation solution S_l for max_{A∈Ω} l(A).
  – Compute a feasible solution S_o for max_{A∈Ω} f(A).
  – Set S = argmax(f(S_u), f(S_o), f(S_l)).
• Output S.
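A minimal Python sketch of the sandwich method under a cardinality constraint. The toy objective f, its submodular bounds u and l, and the use of plain greedy for all three solvers are illustrative assumptions.

    def greedy_max(h, X, k):
        # plain greedy: repeatedly add the element with the largest marginal gain
        A = set()
        for _ in range(k):
            x = max(X - A, key=lambda e: h(A | {e}) - h(A))
            A.add(x)
        return frozenset(A)

    def sandwich(f, u, l, X, k):
        S_u = greedy_max(u, X, k)    # approximate max of the upper bound u
        S_l = greedy_max(l, X, k)    # approximate max of the lower bound l
        S_o = greedy_max(f, X, k)    # heuristic feasible solution for f itself
        return max((S_u, S_l, S_o), key=f)   # best candidate under the true f

    sets = [{0, 1}, {1, 2}, {2, 3}]                  # illustrative instance
    cover = lambda A: sum(1 for S in sets if S & A)
    f = lambda A: cover(A) + 0.3 * (len(A) >= 2)     # nonsubmodular toy objective
    u = lambda A: cover(A) + 0.15 * len(A)           # submodular, u >= f
    l = cover                                        # submodular, l <= f
    print(sorted(sandwich(f, u, l, frozenset(range(4)), k=2)))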
The performance ratio of this algorithm is data-dependent, as follows; hence, this algorithm is also called a data-dependent approximation algorithm.
Theorem 12.4.1 The solution S produced by the sandwich method satisfies the following:

f(S) ≥ max( (f(S_u) / u(S_u)) · α, (opt_l / opt_f) · β ) · opt_f,

where opt_f (opt_l) is the objective function value of an optimal solution for max_{A∈Ω} f(A) (max_{A∈Ω} l(A)).
Proof Since S_u is an α-approximation solution for max_{A∈Ω} u(A), we have

f(S_u) = (f(S_u) / u(S_u)) · u(S_u)
≥ (f(S_u) / u(S_u)) · α · opt_u
≥ (f(S_u) / u(S_u)) · α · u(OPT_f)
≥ (f(S_u) / u(S_u)) · α · opt_f,

where OPT_f is an optimal solution for max_{A∈Ω} f(A) and opt_u is the optimal objective value of max_{A∈Ω} u(A). Moreover,

f(S_l) ≥ l(S_l) ≥ β · opt_l = β · (opt_l / opt_f) · opt_f.
and

α_2(C) = Σ_{c,c′∈C: c≠c′} τ({c, c′}).
Clearly, α_1 is modular, that is, for any two subsets A ⊂ B and any element x ∉ B, Δ_x α_1(A) = Δ_x α_1(B).
However, we have
Lemma 12.4.2 α2 is supermodular.
Proof By the definition of α_2, we have that for any two subsets A ⊂ B and any element x ∉ B,

Δ_x α_2(A) = Σ_{y∈A} τ({x, y}) ≤ Σ_{y∈B} τ({x, y}) = Δ_x α_2(B).

Therefore, α_2 is supermodular.
By the inclusion-exclusion formula, we have

Lemma 12.4.3 For any set C,
Data-Dependent Approximation
Compute an optimal solution C_{α_1} for max{α_1(C) | |C| ≤ k}.
Compute a (1 − 1/e)-approximation C_β for max{β(C) | |C| ≤ k}.
Compute a feasible solution C_τ for max{τ(C) | |C| ≤ k}.
Choose C_data = argmax(τ(C_{α_1}), τ(C_β), τ(C_τ)).

where opt_τ (opt_{α_1} and opt_β) is the objective function value of an optimal solution for problem max{τ(C) | |C| ≤ k} (problem max{α_1(C) | |C| ≤ k} and problem max{β(C) | |C| ≤ k}, respectively).
Note that τ is monotone nondecreasing, i.e., for A ⊂ B, τ(A) ≤ τ(B). Therefore, C_τ can be obtained by the following greedy algorithm:

Greedy Algorithm
C_0 ← ∅;
for i = 1 to k do
  x ← argmax_{x∈V∖C_{i−1}} (τ(C_{i−1} ∪ {x}) − τ(C_{i−1}));
  C_i ← C_{i−1} ∪ {x};
end-for
return C_τ = C_k.
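In Python, the greedy algorithm reads as follows; the stand-in τ (a table of node sets saved by each cut) is an illustrative assumption, since in the rumor-blocking application τ would be estimated by simulating the IC model.

    def greedy_tau(tau, V, k):
        # the Greedy Algorithm above: k rounds of best marginal gain
        C = set()
        for _ in range(k):
            x = max(V - C, key=lambda v: tau(C | {v}) - tau(C))
            C.add(x)
        return C

    blocked = {0: {0, 1, 2}, 1: {1, 2}, 2: {2}}   # nodes saved by cutting at v
    tau = lambda C: len(set().union(*(blocked[v] for v in C)))
    print(greedy_tau(tau, {0, 1, 2}, k=2))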
From a theoretical point of view, the sandwich method is always applicable since we have the following:
Theorem 12.4.5 For any set function f on 2X , there exist two monotone nonde-
creasing submodular functions u and l such that u(A) ≥ f (A) ≥ l(A) for every
A ∈ 2X .
Proof By the first DS decomposition theorem, there exist two monotone nonde-
creasing submodular functions g and h such that f = g − h. Note that for every
A ∈ 2X , h(∅) ≤ h(A) ≤ h(X). Set u(A) = g(A) − h(∅) and l(A) = g(A) − h(X)
for any A ∈ 2X . Then, u and l meet our requirement.
However, in practice, it is often quite hard to find such an upper bound u and a lower bound l that are easily computable, since the DS decomposition exists but is not known to be efficiently computable. Therefore, more effort is required to construct them for specific real-world problems.
12.5 Algorithm Ending at Local Optimal Solution
For nonsubmodular optimization, there also exists a class of algorithms which end at local optimal solutions. What is a local optimal solution? For set function optimization, there exist several definitions in the literature. However, they have a property in common: all of them are necessary conditions for optimality. In this section, we introduce two of them, together with two algorithms which end at these two types of local optimal solutions, respectively.
Here are two necessary conditions for minimality:
1. Let f be a set function on 2X . Suppose A is a minimum solution of f in 2X .
Then, f (A) ≤ f (A \ {x}) and f (A) ≤ f (A ∪ {x}) for any x ∈ X.
2. Let f = g − h be a set function and g and h submodular functions on subsets of
X. If set A is a minimum solution for minY ⊆X f (Y ), then ∂h(A) ⊆ ∂g(A).
Condition 1 is obvious. Condition 2 needs a little explanation. First, let us explain the notation ∂h(A): ∂h(A) is the subgradient of the function h at set A, defined as the collection of modular functions c such that h(Y) − h(A) ≥ c(Y) − c(A) for all Y ⊆ X. Since A is a minimum solution of

min_{Y⊆X} f(Y),

we have f(A) ≤ f(Y) and hence g(Y) − g(A) ≥ h(Y) − h(A) for any Y ⊆ X. Therefore, for any c ∈ ∂h(A), g(Y) − g(A) ≥ h(Y) − h(A) ≥ c(Y) − c(A). This means that ∂h(A) ⊆ ∂g(A).
Condition 2 implies Condition 1. To see this, we first introduce two lemmas:
Lemma 12.5.1 Suppose A satisfies condition 2. Then, for any Y ∈ U , f (A) ≤
f (Y ) where
7: end for
8: σ ← argmin_{σ∈𝒜} f(A⁺_σ);
9: A⁺ ← A⁺_σ
10: end while
11: return A.
m^σ_{h,l}(A) and, moreover, h(S_i) = m^σ_{h,l}(S_i) for any S_i = {σ(1), . . . , σ(i)}. Let θ = max(|A|, |X| − |A|). Let 𝒜 be a collection of θ permutations σ of X such that σ(|A|) goes over all elements of A and σ(|A| + 1) goes over all elements of X ∖ A; that is, for any element x ∈ A, there exists σ ∈ 𝒜 such that A ∖ {x} = {σ(1), . . . , σ(|A| − 1)}, and for any x ∈ X ∖ A, there exists σ ∈ 𝒜 such that A ∪ {x} = {σ(1), . . . , σ(|A| + 1)}. Now, let A⁺_σ denote the minimum solution for min_{Y∈2^X} [g(Y) − m^σ_{h,l}(Y)]. Set

σ⁺ = argmin_{σ∈𝒜} f(A⁺_σ)

and

A⁺ = A⁺_{σ⁺}.
12.6 Global Approximation of Local Optimality

Sometimes, an algorithm may not be able to stop at a local optimal solution; instead, it stops at a locally approximately optimal solution. For example, consider the following problem:
max f (A)
subject to |A| ≤ k
A ∈ 2X
where f is a set function over 2X for a finite set X. Algorithm 43 is the submodular-
supermodular algorithm for this problem.
7: end for
8: σ ← argmax_{σ∈𝒜} f(A⁺_σ);
9: A⁺ ← A⁺_σ
10: end while
11: return A.

7: end for
8: σ ← argmin_{σ∈𝒜} f(A⁺_σ);
9: A⁺ ← A⁺_σ
10: end while
11: return A.
algorithm must stop within k iterations. Note that e < (1 + ε)^{1+1/ε} and 1 + 1/ε < 2/ε for ε < 1. Thus, e^{kε/2} < f(A)/opt. Hence, kε < 2 ln(f(A)/opt) ≤ 2 ln ζ.
12.7 Large-Scale System

Job Assignment There are K groups, J jobs, and I types of resources. For each resource i, the availability is b_i. For each job j, if group k completes job j, then the consumption of resource i is a^i_{jk}. The cost for group k working on job j is c_{jk}. The problem is to find an assignment distributing the J jobs to the K groups, subject to the availability of every resource, to minimize the total cost. Let x_{jk} be an indicator for assigning job j to group k. Then, this problem can be formulated as a 0-1 integer LP as follows:
min Σ_{k=1}^K Σ_{j=1}^J c_{jk} x_{jk}
subject to Σ_{k=1}^K Σ_{j=1}^J a^i_{jk} x_{jk} ≤ b_i for 1 ≤ i ≤ I,
Σ_{k=1}^K x_{jk} = 1 for 1 ≤ j ≤ J,
x_{jk} ∈ {0, 1} for all 1 ≤ j ≤ J, 1 ≤ k ≤ K.
Fig. 12.5 Each flight is represented by a directed edge. Each airport contains a set of ending points and a set of starting points. Each edge between them represents a possible transfer from one flight to another. Add a virtual starting point and a virtual ending point in each airport
from the virtual start point of the former airport to the virtual endpoint of the latter
airport.
For each airline, its flight map is a power law graph, that is, the number of nodes with degree k is α · k^{−β}, where α and β are positive constants. Therefore, fast algorithms for the shortest path problem should be developed on power law graphs. A successful solution can bring a big social benefit.
For more complicated problems, such as nonsubmodular optimizations, in more
complicated large-scale background, one of the successful algorithms developed
recently is optimization from samples.
There are different models for optimization from samples. In different models,
the same problems may have different computational complexity. In the following,
let us show an example.
First, we consider a model proposed by Balkanski et al. [18].
Definition 12.7.1 (Optimization from Samples) Consider a family F of set functions over 2^L, where L is the ground set, a constraint M ⊆ 2^L, and a distribution D on 2^L. F is said to be α-optimizable from samples in M if there exists an algorithm satisfying that for any parameter δ > 0 and sufficiently large L, there exists an integer t_0 ∈ poly(|L|, 1/δ) such that for all t ≥ t_0 and for any set of samples {S_i, f(S_i)}_{i=1}^t with f ∈ F and S_i selected i.i.d. from D, the algorithm takes the samples {S_i, f(S_i)}_{i=1}^t as input and returns S ∈ M satisfying that
where the expectation is taken over the randomness of the algorithm. (Note that the
algorithm runs not necessarily in polynomial-time.)
With this model, a negative result is obtained as follows:
Theorem 12.7.2 The maximum set coverage problem (Problem 10.1.1) cannot be approximated within a ratio better than 2^{−Ω(√(log |N|))} using polynomially many samples selected i.i.d. from any distribution D.
Next, we consider another model proposed by Chen et al. [54].
Definition 12.7.3 (Coverage Function) Consider a bipartite graph G = (L, R, E). For every node u ∈ L ∪ R, denote by N_G(u) the set of all neighbors of u. For any subset S ⊆ L ∪ R, denote N_G(S) = ∪_{u∈S} N_G(u). The coverage function f_G : 2^L → R_+ is defined by

f_G(S) = |N_G(S)|

for S ⊆ L.
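A tiny Python sketch of a coverage function in the sense of Definition 12.7.3, on an illustrative bipartite graph given by adjacency sets:

    adj = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"d"}}   # N_G(u) for u in L (illustrative)

    def f_G(S):
        covered = set()
        for u in S:
            covered |= adj[u]        # N_G(S) is the union of the N_G(u)
        return len(covered)

    print(f_G({0, 1}))   # 3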
where the expectation is taken over the randomness of the algorithm. (Note that the
algorithm runs not necessarily in polynomial-time.)
With this model, a positive result is obtained as follows:
Theorem 12.7.5 Suppose that the distribution D on 2L satisfies the following three
assumptions:
(a1) Feasibility For any sample S ∼ D, |S| ≤ k.
(a2) Polynomially bounded For any u ∈ L,

p_u = Pr_{S∼D}[u ∈ S] ≥ 1/|L|^c
Exercises
3. Give a counterexample to show that not every set function can be decomposed
into the sum of a monotone nondecreasing submodular function and a mono-
tone nondecreasing supermodular function.
4. Show that for any monotone nondecreasing submodular function f : 2X → R
and any Y ⊆ X, there exist a pair of monotone nondecreasing modular
functions u, l : 2X → R such that u(Y ) = f (Y ) = l(Y ) and u(S) ≥ f (S) ≥
l(S) for any S ⊆ X.
5. Let C = {A | |A| ≤ k}. Then, (X, C) is a matroid. This means that the size
constraint is a specific matroid constraint. With this constraint, the monotone
nonsubmodular maximization has a better approximation solution. Consider
the following maximization problem and a greedy algorithm as shown in
Algorithm 45:
max f(A)
subject to |A| ≤ k,

max Σ_{u,v∈I(S)} A(u, v)
subject to |S| ≤ k
8: end for
9: σ ← argmin_{σ∈𝒜} f(A⁺_σ);
10: A⁺ ← A⁺_σ
11: end while
12: return A.
10. Please modify Algorithm 43 into one which runs in polynomial time with a performance ratio not far from (1 − e^{−1}).
Historical Notes
1. P.K. Agarwal, M. van Kreveld, S. Suri: Label placement by maximum independent set in
rectangles, Comput. Geom. Theory Appl., 11(118): 209–218 (1998).
2. A.A. Ageev and M. Sviridenko: Pipage rounding: a new method of constructing algorithms
with proven performance guarantee, Journal of Combinatorial Optimization, 8: 307–328
(2004).
3. C. Ambühl: An optimal bound for the MST algorithm to compute energy efficient broadcast
trees in wireless networks, Proceedings, 32nd International Colloquium on Automata,
Languages and Programming, Springer LNCS 3580: 1139–1150 (2005).
4. C. Ambühl, T. Erlebach, M. Mihalák and M. Nunkesser: Constant-approximation for
minimum-weight (connected) dominating sets in unit disk graphs, Proceedings, 9th Inter-
national Workshop on Approximation Algorithms for Combinatorial Optimization (APPROX
2006), Springer LNCS 4110: 3–14 (2006).
5. E.M. Arkin, J.S.B. Mitchell and G. Narasimhan: Resource-constructed geometric network
optimization, Proceedings, 14th Annual Symposium on Computational Geometry, Minneapo-
lis, pp.307–316, 1998.
6. S. Arora: Polynomial-time approximation schemes for Euclidean TSP and other geometric
problems, Proceedings, 37th IEEE Symp. on Foundations of Computer Science, pp. 2–12,
1996.
7. S. Arora: Nearly linear time approximation schemes for Euclidean TSP and other geometric
problems, Proceedings, 38th IEEE Symp. on Foundations of Computer Science, pp. 554–563,
1997.
8. S. Arora: Polynomial-time approximation schemes for Euclidean TSP and other geometric
problems, Journal of ACM, 45: 753–782 (1998).
9. S. Arora, M. Grigni, D. Karger, P. Klein and A. Woloszyn: Polynomial time approximation
scheme for Weighted Planar Graph TSP, Proceedings, 9th ACM-SIAM Symposium on Discrete
Algorithms, pp. 33–41, 1998.
10. S. Arora, C. Lund, R. Motwani, M. Sudan and M. Szegedy: Proof verification and hardness of
approximation problems, Proceedings, 33rd IEEE Symposium on Foundations of Computer
Science, pp. 14–23, 1992.
11. S. Arora, C. Lund, R. Motwani, M. Sudan and M. Szegedy: Proof verification and hardness
of approximation problems, Journal of the ACM, 45: 753–782 (1998).
12. S. Arora, P. Raghavan and S. Rao: Polynomial Time Approximation Schemes for Euclidean k-
medians and related problems, Proceedings, 30th ACM Symposium on Theory of Computing,
pp. 106–113, 1998.
13. S. Arora and S. Safra: Probabilistic checking of proofs: A new characterization of NP,
Proceedings, 33rd IEEE Symposium on Foundations of Computer Science, pp. 2–13, 1992.
14. S. Arora and S. Safra: Probabilistic checking of proofs: A new characterization of NP, J.
Assoc. Comput. Mach., 45: 70–122 (1998).
15. Wenruo Bai, Jeffrey A. Bilmes: Greed is still good: maximizing monotone submodu-
lar+supermodular (BP) functions, Proceedings, ICML, 314–323, 2018.
16. B.S. Baker: Approximation algorithms for NP-complete problems on planar graphs, Proceed-
ings, 24th FOCS, pp. 265–273, 1983.
17. B.S. Baker: Approximation algorithms for NP-complete problems on planar graphs, Journal
of ACM, 41(1): 153–180 (1994).
18. E. Balkanski, A. Rubinstein and Y. Singer: The limitations of optimization from samples,
Proceedings, 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC),
Montreal, QC, Canada, June 19–23, pp. 1016–1027, 2017.
19. Anton Barhan and Andrey Shakhomirov: Methods for sentiment analysis of Twitter Mes-
sages, Proceedings, 12th Conference of Fruct Association, pp. 216–222, 2012.
20. J. Bar-Ilan, G. Kortsarz and D. Peleg: Generalized submodular cover problem and applica-
tions, Theoretical Computer Science, 250: 179–200 (2001).
21. D. Bayer and J.C. Lagarias: The non-linear geometry of linear programming, I. Affine and
projective scaling trajectories, II. Legendre transform coordinates, III. Central trajectories,
Preprints, AT&T Bell Laboratories (Murray Hill, NJ, 1986).
22. E.M. Beale: Cycling in the dual simplex algorithm, Naval Research Logistics Quarterly 2: 269–
276 (1955).
23. M. Bellare, O. Goldreich and M. Sudan: Free bits and nonapproximability, Proceedings, 36th
FOCS, pp.422–431, 1995.
24. R. Bellman: On a routing problem, Quarterly of Applied Mathematics, 16: 87–90 (1958).
25. P. Berman, B. DasGupta, S. Muthukrishnan, S. Ramaswami: Efficient approximation algo-
rithms for tiling and packing problems with rectangles, J. Algorithms, 41: 178–189 (2001).
26. P. Berman, G. Calinescu, C. Shah, A. Zelikovsky: Efficient energy management in sensor
networks, in Ad Hoc and Sensor Networks, Wireless Networks and Mobile Computing, vol. 2,
ed. by Y. Xiao, Y. Pan (Nova Science Publishers, Hauppauge, 2005).
27. D.P. Bertsekas: A simple and fast label correcting algorithm for shortest paths, Networks,
23(8): 703–709 (1993).
28. Aditya Bhaskara, Moses Charikar, Eden Chlamtac, Uriel Feige, Aravindan Vijayaraghavan:
Detecting high log-densities – an O(n^{1/4}) approximation for densest k-subgraph, Proceed-
ings, 42nd ACM International Symposium on Theory of Computing, ACM, New York, pp.
201–210, 2010.
29. Avrim Blum, Tao Jiang, Ming Li, John Tromp and M. Yannakakis: Linear approximation of
shortest superstrings, Journal of ACM, 41(4): 630–647 (1994).
30. Otakar Boruvka on Minimum Spanning Tree Problem (translation of both 1926 papers,
comments, history) (2000) Jaroslav Nesetril, Eva Milková, Helena Nesetrilová. (Section 7
gives his algorithm, which looks like a cross between Prim’s and Kruskal’s.)
31. Dimitris Bertsimas, Chung-Piaw Teo, Rakesh Vohra: On dependent randomized rounding
algorithms, Oper. Res. Lett. 24(3): 105–114 (1999).
32. Robert G. Bland: New finite pivoting rules for the simplex method, Mathematics of Opera-
tions Research 2 (2): 103–107 (1977).
33. Al Borchers and Ding-Zhu Du: The k-Steiner ratio in graphs, Proceedings, 27th ACM
Symposium on Theory of Computing, pp. 641–649, 1995.
34. Al Borchers and Ding-Zhu Du: The k-Steiner ratio in graphs, SIAM J. Comput., 26(3): 857–
869 (1997).
35. Al Borchers and Prosenjit Gupta: Extending the Quadrangle Inequality to Speed-Up Dynamic
Programming. Inf. Process. Lett., 49(6): 287–290 (1994).
36. O. Boruvka: On a minimal problem, Prace Morask’e Pridovedeké Spolecnosti, 3: 37–58
(1926).
61. Xiuzhen Cheng, Xiao Huang, Deying Li, Weili Wu, Ding-Zhu Du: A polynomial-time
approximation scheme for the minimum-connected dominating set in ad hoc wireless
networks, Networks, 42(4): 202–208 (2003).
62. X. Cheng, J.-M. Kim, and B. Lu: A polynomial time approximation scheme for the problem
of interconnecting highways, Journal of Combinatorial Optimization, 5: 327–343, (2001).
63. D. Cheriton and R.E. Tarjan: Finding minimum spanning trees, SIAM J. Comput., 5: 724–742
(1976).
64. J. Cheriyan, S. Vempala, A. Vetta: Network design via iterative rounding of setpair relaxations,
Combinatorica, 26(3): 255–275 (2006).
65. G. Choquet: Etude de certains réseaux de routes, C R Acad Sci Paris, 205: 310–313 (1938).
66. N. Christofides: Worst-case analysis of a new heuristic for the travelling salesman problem,
Technical Report, Graduate School of Industrial Administration, Carnegie-Mellon University,
Pittsburgh, PA, 1976.
67. F.R.K. Chung and E.N. Gilbert: Steiner trees for the regular simplex, Bull. Inst. Math. Acad.
Sinica, 4: 313–325 (1976).
68. F.R.K. Chung and R.L. Graham: A new bound for euclidean Steiner minimum trees, Ann.
N.Y. Acad. Sci., 440: 328–346 (1985).
69. F.R.K. Chung and F.K. Hwang: A lower bound for the Steiner tree problem, SIAM
J.Appl.Math., 34: 27–36 (1978).
70. V. Chvátal: A greedy heuristic for the set-covering problem, Mathematics of Operations
Research, 4(3): 233–235 (1979).
71. S.A. Cook: The complexity of theorem-proving procedures, Proceedings, 3rd ACM Sympo-
sium on Theory of Computing, pp. 151–158, 1971.
72. William J. Cook, William H. Cunningham, William R. Pulleyblank, Alexander Schrijver:
Combinatorial Optimization, (Wiley, 1997).
73. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein: Introduction to
Algorithms, (3rd ed.), (MIT Press, 2009).
74. Henry H. Crapo, Gian-Carlo Rota: On the Foundations of Combinatorial Theory: Combina-
torial Geometries, (Cambridge, Mass.: M.I.T. Press, 1970).
75. R. Courant and H. Robbins, What Is Mathematics?, (Oxford Univ. Press, New York, 1941).
76. W. H. Cunningham: Decomposition of submodular functions, Combinatorica, 3(1): 53–68
(1983).
77. D. Dai and C. Yu: A (5 + ε)-approximation algorithm for minimum weighted dominating set
in unit disk graph, Theoretical Computer Science, 410: 756–765 (2009).
78. G.B. Dantzig: Application of the simplex method to a transportation problem, in: Activity
Analysis of Production and Allocation, (Cowles Commission Monograph 13), T.C. Koopmans
(ed.), John-Wiley, New York, pp. 359–373, 1951.
79. G.B. Dantzig: Maximization of a linear function of variables subject to linear inequalities,
Chap. XXI of Activity Analysis of Production and Allocation, (Cowles Commission Mono-
graph 13), T.C. Koopmans (ed.), John-Wiley, New York, 1951, pp. 339–347.
80. G.B. Dantzig: A. Orden, P. Wolfe: Note on linear programming, Pacific J. Math, 5: 183–195
(1955).
81. G.B. Dantzig and P. Wolfe: Decomposition principle for linear programs, Operations
Research, 8: 101–111 (1960).
82. Robert B. Dial: Algorithm 360: Shortest-Path Forest with Topological Ordering [H], Com-
munications of the ACM, 12 (11): 632–633 (1969).
83. E.W. Dijkstra: A note on two problems in connexion with graphs, Numerische Mathematik,
1: 269–271 (1959).
84. Ling Ding, Xiaofeng Gao, Weili Wu, Wonjun Lee, Xu Zhu, Ding-Zhu Du: Distributed
construction of connected dominating sets with minimum routing cost in wireless networks,
Proceedings, ICDCS, pp. 448–457, 2010.
85. Ling Ding, Weili Wu, James Willson, Lidong Wu, Zaixin Lu, Wonjun Lee: Constant-
approximation for target coverage problem in wireless sensor networks, Proceedings,
INFOCOM, pp. 1584–1592, 2012.
86. Xingjian Ding, Jianxiong Guo, Deying Li, Weili Wu: Optimal wireless charger placement
with individual energy requirement, Theor. Comput. Sci., 857: 16–28 (2021).
87. Xingjian Ding, Jianxiong Guo, Yongcai Wang, Deying Li, Weili Wu: Task-driven charger
placement and power allocation for wireless sensor networks, Ad Hoc Networks, 119: 102556
(2021).
88. E.A. Dinic: Algorithm for solution of a problem of maximum flow in a network with power
estimation, Soviet Mathematics - Doklady, 11: 1277–1280 (1970).
89. Yefim Dinitz: Dinitz’ algorithm: the original version and Even’s version, in Oded Goldreich,
Arnold L. Rosenberg, Alan L. Selman (eds.), Theoretical Computer Science: Essays in
Memory of Shimon Even. (Springer, 2006): pp. 218–240, 2006.
90. I. Dinur, D. Steurer: Analytical approach to parallel repetition, Proceedings, 46th Annual
ACM Symposium on Theory of Computing, pp. 624–633, 2014.
91. Luobing Dong, Qiumin Guo, Weili Wu: Speech corpora subset selection based on time-
continuous utterances features, J. Comb. Optim., 37(4): 1237–1248 (2019).
92. Luobing Dong, Qiumin Guo, Weili Wu, Meghana N. Satpute: A semantic relatedness pre-
served subset extraction method for language corpora based on pseudo-Boolean optimization,
Theor. Comput. Sci., 836: 65–75 (2020).
93. Luobing Dong, Meghana N. Satpute, Weili Wu, Ding-Zhu Du: Two-phase multidocument
summarization through content-attention-based subtopic detection, IEEE Trans. Comput. Soc.
Syst., 8(6): 1379–1392 (2021).
94. D.E. Drake and S. Hougardy, On approximation algorithms for the terminal Steiner tree
problem, Information Processing Letters, 89: 15–18 (2004).
95. Stuart Dreyfus: Richard Bellman on the birth of dynamic programming, Operations Research,
50(1): 48–51 (2002).
96. Ding-Zhu Du: On heuristics for minimum length rectangular partitions, Technical Report,
Math. Sci. Res. Inst., Univ. California, Berkeley, 1986.
97. Ding-Zhu Du, R.L. Graham, P.M. Pardalos, Peng-Jun Wan, Weili Wu and W. Zhao: Analysis
of greedy approximations with nonsubmodular potential functions, Proceedings, 19th ACM-
SIAM Symposiun on Discrete Algorithms (SODA), pp. 167–175, 2008.
98. Ding-Zhu Du, D. Frank Hsu, and K.-J. Xu: Bounds on guillotine ratio, Congressus Numeran-
tium, 58: 313–318 (1987).
99. Ding-Zhu Du and Ker-I Ko: Theory of Computational Complexity (2nd Ed), (John Wiley,
New York, NY, 2014).
100. Ding-Zhu Du, Ker-I Ko, Xiaodong Hu: Design and Analysis of Approximation Algorithms,
(Springer, 2012).
101. Ding-Zhu Du and Frank K. Hwang: The Steiner ratio conjecture of Gilbert-Pollak is true,
Proceedings of National Academy of Sciences, 87: 9464–9466 (1990).
102. Ding-Zhu Du, Frank K. Hwang, M.T. Shing and T. Witbold: Optimal routing trees, IEEE
Transactions on Circuits, 35: 1335–1337 (1988).
103. Ding-Zhu Du, Zevi Miller: Matroids and subset interconnection design, SIAM J. Discrete
Math., 1(4): 416–424 (1988).
104. Ding-Zhu Du, L.Q. Pan, and M.-T. Shing: Minimum edge length guillotine rectangular
partition, Technical Report 0241886, Math. Sci. Res. Inst., Univ. California, Berkeley, 1986.
105. Ding-Zhu Du, Panos M. Pardalos, Weili Wu: Mathematical Theory of Optimization,
(Springer, 2010).
106. Ding-Zhu Du, Yan-Jun Zhang: On heuristics for minimum length rectilinear partitions,
Algorithmica, 5: 111–128 (1990).
107. Ding-Zhu Du, Yanjun Zhang and Qing Feng: On better heuristic for euclidean Steiner
minimum trees, Proceedings, 32nd FOCS, pp. 431–439, 1991.
108. Hongjie Du, Weili Wu, Wonjun Lee, Qinghai Liu, Zhao Zhang, Ding-Zhu Du: On minimum
submodular cover with submodular cost, J. Global Optimization, 50(2): 229–234 (2011).
109. Hongjie Du, Weili Wu, Shan Shan, Donghyun Kim, Wonjun Lee: Constructing weakly
connected dominating set for secure clustering in distributed sensor network, J. Comb. Optim.,
23(2): 301–307 (2012).
110. Hongwei Du, Panos M. Pardalos, Weili Wu, Lidong Wu: Maximum lifetime connected
coverage with two active-phase sensors, J. Glob. Optim., 56(2): 559–568 (2013).
111. Hongwei Du, Weili Wu, Qiang Ye, Deying Li, Wonjun Lee, Xuepeng Xu: CDS-based virtual
backbone construction with guaranteed routing cost in wireless sensor networks, IEEE Trans.
Parallel Distributed Syst., 24(4): 652–661 (2013).
112. Hongwei Du, Qiang Ye, Weili Wu, Wonjun Lee, Deying Li, Ding-Zhu Du, Stephen Howard:
Constant approximation for virtual backbone construction with guaranteed routing cost in
wireless sensor networks, Proceedings, INFOCOM, pp. 1737–1744, 2011.
113. Hongwei Du, Qiang Ye, Jiaofei Zhong, Yuexuan Wang, Wonjun Lee, Haesun Park:
Polynomial-time approximation scheme for minimum connected dominating set under
routing cost constraint in wireless sensor networks, Theor. Comput. Sci., 447: 38–43 (2012).
114. Xiufeng Du, Weili Wu, Dean F. Kelley: Approximations for subset interconnection designs,
Theoretical Computer Science, 207(1): 171–180 (1998).
115. Yingfan L. Du, Hongmin W. Du: A new bound on maximum independent set and minimum
connected dominating set in unit disk graphs, J. Comb. Optim., 30(4): 1173–1179 (2015).
116. J. Edmonds: Maximum matching and a polyhedron with 0, 1-vertices, Journal of Research
National Bureau of Section B, 69: 125–130 (1965).
117. J. Edmonds: Minimum partition of a matroid into independent subsets, Journal of Research
National Bureau of Section B, 69: 67–72 (1965).
118. J. Edmonds: Paths, trees and flowers, Canadian Journal of Mathematics, 17: 449–467 (1965).
119. J. Edmonds: Optimum branchings, Journal of Research National Bureau of Section B, 71:
233–240 (1967).
120. J. Edmonds: Submodular functions, matroids, and certain polyhedrons, in: Combinatorial
Structure and Their Applications (R. Guy, H. Hanani, N. Sauer, J. Schönheim, eds.) Gordon
and Breach, New York, pp. 69–87, 1970.
121. J. Edmonds: Edge-disjoint branchings, in Combinatorial Algorithms (R. Rustin, ed.) Algo-
rithmics Press, New York, pp. 91–96, 1973.
122. J. Edmonds, E.L. Johnson: Matching, Euler Tours, and the Chinese Postman, Math. Pro-
gramm., 5 : 88–124 (1973).
123. J. Edmonds, R. Karp: Theoretical improvements in algorithmic efficiency for network flow
problems, Journal of the ACM, 19(2): 248–264 (1972).
124. J. Edmonds, K. Pruhs: Scalably scheduling processes with arbitrary speedup curves, ACM
Trans. Algorithms, 8(3): 28 (2012).
125. M.A. Engquist: A successive shortest path algorithm for the assignment problem, Research
Report, Center for Cybernetic Studies (CCS) 375, University of Texas, Austin; 1980.
126. T. Erlebach, T. Grant, F. Kammer: Maximising lifetime for fault tolerant target coverage in
sensor networks, Sustain. Comput. Inform. Syst., 1: 213–225 (2011).
127. T. Erlebach, K. Jansen and E. Seidel: Polynomial-time approximation schemes for geometric
graphs, Proceedings, 12th SODA, pp. 671–679, 2001.
128. T. Erlebach, M. Mihalák: A (4 + ε)-approximation for the minimum-weight dominating set
problem in unit disk graphs, Proceedings, WAOA, pp. 135-1, 2009.
129. Thomas R. Ervolina, S. Thomas McCormick: Two strongly polynomial cut cancelling
algorithms for minimum cost network flow, Discrete Applied Mathematics, 4: 133–165
(1993).
130. Lidan Fan, Weili Wu: Rumor blocking, Encyclopedia of Algorithms, pp. 1887–1892, 2016.
131. Lidan Fan, Weili Wu, Kai Xing, Wonjun Lee: Precautionary rumor containment via trustwor-
thy people in social networks, Discrete Math., Alg. and Appl., 8(1): 1650004:1-1650004:18
(2016).
132. Lidan Fan, Weili Wu, Xuming Zhai, Kai Xing, Wonjun Lee, Ding-Zhu Du: Maximizing rumor
containment in social networks with constrained time, Social Netw. Analys. Mining, 4(1): 214
(2014).
133. Lidan Fan, Zaixin Lu, Weili Wu, Bhavani M. Thuraisingham, Huan Ma, Yuanjun Bi: Least
Cost Rumor Blocking in Social Networks, Proceedings, ICDCS, pp. 540–549, 2013.
134. Uriel Feige: A threshold of ln n for approximating set cover, J. ACM, 45(4): 634–652 (1998).
160. Xiaofeng Gao, Yaochun Huang, Zhao Zhang and Weili Wu: (6 + ε)-approximation for
minimum weight dominating set in unit disk graphs, Proceedings, COCOON, pp. 551–557.
2008.
161. Xiaofeng Gao, Wei Wang, Zhao Zhang, Shiwei Zhu, Weili Wu: A PTAS for minimum d-hop
connected dominating set in growth-bounded graphs, Optim. Lett., 4(3): 321–333 (2010).
162. Xiaofeng Gao, Weili Wu, Xuefei Zhang, Xianyue Li: A constant-factor approximation for
d-hop connected dominating sets in unit disk graph, Int. J. Sens. Networks, 12(3): 125–136
(2012).
163. M.R. Garey, R.L. Graham and D.S. Johnson, The complexity of computing Steiner minimal
trees, SIAM J. Appl. Math., 32: 835–859 (1977).
164. M.R. Garey and D.S. Johnson: The complexity of near-optimal graph coloring, J. Assoc.
Comput. Mach., 23: 43–49 (1976).
165. M.R. Garey and D.S. Johnson, The rectilinear Steiner tree is NP-complete, SIAM J. Appl.
Math., 32: 826–834 (1977).
166. M.R. Garey and D.S. Johnson: Computers and Intractability: A Guide to the Theory of NP-
Completeness, (W. H. Freeman and Company, New York, 1979).
167. N. Garg, J. Könemann: Faster and simpler algorithms for multicommodity flows and other
fractional packing problems, Proceedings, 39th Annual Symposium on the Foundations of
Computer Science, pp. 300–309, 1998.
168. N. Garg, G. Konjevod, R. Ravi, A polylogarithmic approximation algorithm for the group
Steiner tree problem, Proceedings, 9th SODA, vol. 95, p. 253, 1998.
169. Dongdong Ge, Yinyu Ye, Jiawei Zhang: The fixed-hub single allocation problem: a geometric
rounding approach, working paper, 2007.
170. Dongdong Ge, Simai Hey, Zizhuo Wang, Yinyu Ye, Shuzhong Zhang: Geometric rounding:
a dependent rounding scheme for allocation problems, working paper, 2008.
171. A.M.H. Gerards: A short proof of Tutte’s characterization of totally unimodular matrices,
Linear Algebra and Its Applications, 114/115: 207–212 (1989).
172. E.N. Gilbert and H.O. Pollak: Steiner minimal trees, SIAM J. Appl. Math., 16: 1–29 (1968).
173. M. X. Goemans, A. Goldberg, S. Plotkin, D. Shmoys, E. Tardos and D. P. Williamson:
Approximation algorithms for network design problems, Proceedings, 5th SODA, pp. 223–
232, 1994.
174. M.X. Goemans and D.P. Williamson: New 34 -approximation algorithms for the maximum
satisfiability problem, SIAM Journal on Discrete Mathematics, 7: 656–666 (1994).
175. A.V. Goldberg, S. Rao: Beyond the flow decomposition barrier, Journal of the ACM, 45(5):
783 (1998).
176. Andrew V. Goldberg, Robert E. Tarjan: Finding minimum-cost circulations by canceling
negative cycles, Journal of the ACM, 36 (4): 873–886 (1989).
177. Andrew V. Goldberg, Robert E. Tarjan: Finding minimum-cost circulations by successive
approximation. Math. Oper. Res., 15(3): 430–466 (1990).
178. A.V. Goldberg, R.E. Tarjan: A new approach to the maximum-flow problem, Journal of the
ACM, 35(4): 921 (1988).
179. C. C. Gonzaga: Polynomial affine algorithms for linear programming, Mathematical Pro-
gramming, 49: 7–21 (1990).
180. C. Gonzaga: An algorithm for solving linear programming problems in O(n3L) operations,
in: N. Megiddo, ed., Progress in Mathematical Programming: Interior-Point and Related
Methods, pp. 1–28, Springer, New York, 1988.
181. C. Gonzaga: Conical projection algorithms for linear programming, Mathematical Program-
ming, 43: 151–173 (1989).
182. T. Gonzalez, S.Q. Zheng: Bounds for partitioning rectilinear polygons, Proc. 1st Symp. on
Computational Geometry, pp. 281–287, 1985.
183. T. Gonzalez, S.Q. Zheng: Improved bounds for rectangular and guillotine partitions, Journal
of Symbolic Computation 7: 591–610 (1989).
184. R.L. Graham: Bounds on multiprocessing timing anomalies, Bell System Tech. J., 45: 1563–
1581 (1966).
185. R. L. Graham, Pavol Hell: On the history of the minimum spanning tree problem, Annals of
the History of Computing, 7(1): 43–57 (1985).
186. R.L. Graham and F.K. Hwang: Remarks on Steiner minimal trees, Bull. Inst. Math. Acad.
Sinica, 4: 177–182 (1976).
187. M. Grötschel, L. Lovász and A. Schrijver: Geometric Algorithms and Combinatorial Opti-
mization (2nd edition), (Springer-Verlag, 1988).
188. Shuyang Gu, Chuangen Gao, Ruiqi Yang, Weili Wu, Hua Wang, Dachuan Xu: A general
method of active friending in different diffusion models in social networks, Soc. Netw. Anal.
Min., 10(1): 41 (2020).
189. Shuyang Gu, Ganquan Shi, Weili Wu, Changhong Lu: A fast double greedy algorithm for non-
monotone DR-submodular function maximization, Discret. Math. Algorithms Appl., 12(1):
2050007:1-2050007:11 (2020).
190. F. Guerriero, R. Musmanno: Label correcting methods to solve multicriteria shortest path
problems, Journal of Optimization Theory and Applications, 111(3): 589–613 (2001).
191. S. Guha, S. Khuller: Approximation algorithms for connected dominating sets, Algorithmca,
20(4): 374–387 (1998).
192. Leonidas J. Guibas, Jorge Stolfi: On computing all north-east nearest neighbors in the L1
metric, Inf. Process. Lett., 17(4): 219–223 (1983).
193. Jianxiong Guo, Weili Wu: Adaptive influence maximization: If influential node unwilling to
be the seed, ACM Trans. Knowl. Discov. Data, 15(5): 84:1-84:23 (2021).
194. Jianxiong Guo, Weili Wu: Influence maximization: Seeding based on community structure,
ACM Trans. Knowl. Discov. Data, 14(6): 66:1-66:22 (2020)
195. Jianxiong Guo, Weili Wu: Continuous profit maximization: A study of unconstrained Dr-
submodular maximization, IEEE Trans. Comput. Soc. Syst., 8(3): 768–779 (2021).
196. Jianxiong Guo, Tiantian Chen, Weili Wu: A multi-feature diffusion model: rumor blocking in
social networks, IEEE/ACM Trans. Netw., 29(1): 386–397 (2021).
197. Jianxiong Guo, Yi Li, Weili Wu: Targeted protection maximization in social networks. IEEE
Trans. Netw. Sci. Eng., 7(3): 1645–1655 (2020).
198. Jianxiong Guo, Weili Wu: Discount advertisement in social platform: algorithm and robust
analysis, Soc. Netw. Anal. Min., 10(1): 57 (2020).
199. Jianxiong Guo, Weili Wu: Viral marketing with complementary products, in Nonlinear
Combinatorial Optimization (edited by Du, Pardalos, Zhang), Springer, pp. 309–315, 2019.
200. Jianxiong Guo, Weili Wu: A novel scene of viral marketing for complementary products,
IEEE Trans. Comput. Soc. Syst., 6(4): 797–808 (2019).
201. Ling Guo, Deying Li, Yongcai Wang, Zhao Zhang, Guangmo Tong, Weili Wu, Ding-Zhu Du:
Maximisation of the number of β-view covered targets in visual sensor networks, Int. J. Sens.
Networks, 29(4): 226–241 (2019)
202. D. Gusfield and L. Pitt, A bounded approximation for the minimum cost 2-sat problem,
Algorithmica, 8: 103–117 (1992).
203. E. Halperin, R. Krauthgamer: Polylogarithmic inapproximability, Proceedings, 35th ACM
Symposium on Theory of Computing, pp. 585–594, 2003.
204. T.E. Harris, F.S. Ross: Fundamentals of a Method for Evaluating Rail Net Capacities,
Research Memorandum, 1955.
205. Refael Hassin: The minimum cost flow problem: A unifying approach to existing algorithms
and a new tree search algorithm, Mathematical Programming, 25: 228–239 (1983).
206. J. Hastad: Clique is hard to approximate within n to the power 1−ε, Acta Math., 182: 105–142
(1999).
207. J. Hastad: Some optimal inapproximability results, J. Assoc. Comput. Mach., 48: 798–859
(2001).
208. D. Hausmann, B. Korte, T.A. Jenkyns: Worst case analysis of greedy type algorithms for
independence systems, Mathematical Programming Study, 12: 120–131 (1980).
209. M.T. Heideman, D. H. Johnson and C. S. Burrus: Gauss and the history of the fast Fourier
transform, IEEE ASSP Magazine, 1(4): 14–21 (1984).
210. “Sir Antony Hoare”. Computer History Museum. Archived from the original on 3 April 2015.
Retrieved 22 April 2015.
211. C. A. R. Hoare: Algorithm 64: Quicksort, Comm. ACM., 4(7): 321 (1961).
212. D.S. Hochbaum: Approximating covering and packing problems: set cover, vertex cover,
independent set, and related problems, in D.S. Hochbaum (ed.) Approximation Algorithms
for NP-Hard Problems, PWS Publishing Company, Boston, pp. 94–143, 1997.
213. D.S. Hochbaum and W. Maass, Approximation schemes for covering and packing problems
in image processing and VLSI, J.ACM, 32: 130–136 (1985).
214. A.J. Hoffman: Some recent applications of the theory of linear inequalities to extremal
combinatorial analysis, in Combinatorial Analysis (Yew York, 1958; R. Bellman, M. Hall,
Jr, eds.), American Mathematical Society, Providence, Rhode Islands, pp. 113–127, 1960.
215. J.E. Hopcroft, R.M. Karp: An n5/2 algorithm for maximum matchings in bipartite graphs,
SIAM Journal on Computing, 2 (4): 225–231 (1973).
216. Chenfei Hou, Suogang Gao, Wen Liu, Weili Wu, Ding-Zhu Du, Bo Hou: An approximation
algorithm for the submodular multicut problem in trees with linear penalties, Optim. Lett.,
15(4): 1105–1112 (2021).
217. S.-Y. Hsieh and S.-C. Yang: Approximating the selected-internal Steiner tree, Theoretical
Computer Science, 381: 288–291 (2007).
218. Luogeng Hua: Exploratory of Optimal Selection, (Science Press, 1971).
219. Yaochun Huang, Xiaofeng Gao, Zhao Zhang, Weili Wu: A better constant-factor approxima-
tion for weighted dominating set in unit disk graph, J. Comb. Optim., 18(2): 179–194 (2009).
220. H.B. Hunt III, M.V. Marathe, V. Radhakrishnan, S.S. Ravi, D.J. Rosenkrantz, and R.E.
Stearns: Efficient approximations and approximation schemes for geometric problems,
Journal of Algorithms, 26(2): 238–274 (1998).
221. F.K. Hwang: On Steiner minimal trees with rectilinear distance, SIAM J. Appl. Math., 30:
104–114 (1976).
222. F.K. Hwang: An O(n log n) algorithm for rectilinear minimal spanning trees, J. ACM, 26:
177–182 (1979).
223. O.H. Ibarra and C.E. Kim: Fast approximation algorithms for the knapsack and sum of subset
problems, J. Assoc. Comput. Mach., 22: 463–468 (1975).
224. R. Iyer and J. Bilmes: Algorithms for approximate minimization of the difference between
submodular functions, Proceedings, 28th UAI, pp. 407–417, 2012.
225. R. Iyer and J. Bilmes: Submodular optimization subject to submodular cover and submodular
knapsack constraints, Proceedings, Advances of NIPS, 2013.
226. Rishabh K. Iyer, Stefanie Jegelka, Jeff A. Bilmes: Fast Semidifferential-based Submodular
Function Optimization, Proceedings, ICML, pp. 855–863, 2013.
227. K. Jain: A factor 2 approximation algorithm for the generalized Steiner network problem,
Combinatorica, 21: 39–60 (2001).
228. Thomas A. Jenkyns: The efficacy of the “greedy” algorithm, Congressus Numerantium, no.
17: 341–350 (1976).
229. T. Jiang and L. Wang: An approximation scheme for some Steiner tree problems in the plane,
Lecture Notes in Computer Science, Vol. 834: 414–427 (1994).
230. T. Jiang, E.L. Lawler and L. Wang: Aligning sequences via an evolutionary tree: complexity
and algorithms, Proceedings, 26th STOC, 1994.
231. D.S. Johnson: Approximation algorithms for combinatorial problems, Journal of Computer
and System Sciences, 9: 256–278 (1974).
232. R. Jonker, A. Volgenant: A shortest augmenting path algorithm for dense and sparse linear
assignment problems, Computing, 38(4): 325–340 (1987).
233. L.V. Kantorovich: A new method of solving some classes of extremal problems, Doklady
Akad. Nauk SSSR, 28: 211–214 (1940).
234. A. Karczmarz, J. Lacki: Simple label-correcting algorithms for partially dynamic approximate
shortest paths in directed graphs, Proceedings, Symposium on Simplicity in Algorithms,
Society for Industrial and Applied Mathematics, pp. 106–120, 2020.
235. N. Karmarkar: A new polynomial-time algorithm for linear programming, Proceedings, 16th
Annual ACM Symposium on the Theory of Computing, pp. 302–311, 1984.
236. R.M. Karp: Reducibility among combinatorial problems, in Complexity of Computer Compu-
tations, (R.E. Miller and J.W. Thatcher, eds.), Plenum Press, New York, pp. 85–103, 1972.
237. R.M. Karp: Probabilistic analysis of partitioning algorithms for the traveling salesman
problem in the plane, Mathematics of Operations Research, 2(3): 209–224 (1977).
238. R. M. Karp: A characterization of the minimum cycle mean in a digraph, Discrete Mathemat-
ics, 23(3): 309–311 (1978).
239. L. Kou, G. Markowsky and L. Berman: A fast algorithm for Steiner trees, Acta Informatica,
15: 141–145 (1981).
240. J.A. Kelner, Y.T. Lee, L. Orecchia, A. Sidford: An almost-linear-time algorithm for approx-
imate max flow in undirected graphs, and its multicommodity generalizations, Proceedings,
25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 217–226, 2014.
241. L.G. Khachiyan: A polynomial algorithm for linear programming, Doklady Akad. Nauk
SSSR, 244: 1093–1096 (1979).
242. S. Khanna, R. Motwani, M. Sudan and U. Vazirani: On syntactic versus computational views
of approximability, SIAM J. Comput., 28: 164–191 (1999).
243. S. Khanna, S. Muthukrishnan and M. Paterson: On approximating rectangle tiling and
packing, Proceedings, 9th ACM-SIAM Symp. on Discrete Algorithms, pp. 384–393, 1998.
244. J. Kiefer: Sequential minimax search for a maximum, Proceedings of the American Mathe-
matical Society, 4(3): 502–506 (1953).
245. Donghyun Kim, Baraki H. Abay, R. N. Uma, Weili Wu, Wei Wang, Alade O. Tokuta:
Minimizing data collection latency in wireless sensor network with multiple mobile elements,
Proceedings, INFOCOM, pp. 504–512, 2012.
246. Donghyun Kim, Xianyue Li, Feng Zou, Zhao Zhang, Weili Wu: Recyclable connected
dominating set for large scale dynamic wireless networks, Proceedings, WASA, pp. 560–569,
2008.
247. Donghyun Kim, Wei Wang, Ling Ding, Jihwan Lim, Heekuck Oh, Weili Wu: Minimum
average routing path clustering problem in multi-hop 2-D underwater sensor networks, Optim.
Lett., 4(3): 383–392 (2010).
248. Donghyun Kim, Wei Wang, Deying Li, Joonglyul Lee, Weili Wu, Alade O. Tokuta: A joint
optimization of data ferry trajectories and communication powers of ground sensors for long-
term environmental monitoring, J. Comb. Optim., 31(4): 1550–1568 (2016).
249. Donghyun Kim, Wei Wang, Nassim Sohaee, Changcun Ma, Weili Wu, Wonjun Lee, Ding-Zhu
Du: Minimum data-latency-bound k-sink placement problem in wireless sensor networks,
IEEE/ACM Trans. Netw., 19(5): 1344–1353 (2011).
250. Donghyun Kim, Wei Wang, Junggab Son, Weili Wu, Wonjun Lee, Alade O. Tokuta:
Maximum lifetime combined barrier-coverage of weak static sensors and strong mobile
sensors, IEEE Trans. Mob. Comput., 16(7): 1956–1966 (2017).
251. Donghyun Kim, Wei Wang, Weili Wu, Deying Li, Changcun Ma, Nassim Sohaee, Wonjun
Lee, Yuexuan Wang, Ding-Zhu Du: On bounding node-to-sink latency in wireless sensor
networks with multiple sinks, Int. J. Sens. Networks, 13(1): 13–29 (2013).
252. Donghyun Kim, R. N. Uma, Baraki H. Abay, Weili Wu, Wei Wang, Alade O. Tokuta:
Minimum latency multiple data MULE trajectory planning in wireless sensor networks, IEEE
Trans. Mob. Comput., 13(4): 838–851 (2014).
253. Donghyun Kim, Zhao Zhang, Xianyue Li, Wei Wang, Weili Wu, Ding-Zhu Du: A better
approximation algorithm for computing connected dominating sets in unit ball graphs, IEEE
Trans. Mob. Comput., 9(8): 1108–1118 (2010).
254. Robert Kingan, Sandra Kingan: A software system for matroids, Graphs and Discovery,
DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pp. 287–296,
2005.
255. L.M. Kirousis, E. Kranakis, D. Krizanc and A. Pelc: Power consumption in packet radio
networks, Theoretical Computer Science, 243: 289–305 (2000).
256. V. Klee and G.J. Minty: How good is the simplex algorithm?, in O. Shisha (ed.), Inequalities
III, (Academic Press, New York, 1972).
257. Morton Klein: A primal method for minimal cost flows with applications to the assignment
and transportation problems, Management Science, 14 (3): 205–220 (1967).
258. Donald E. Knuth: The Art of Computer Programming: Volume 3, Sorting and Searching,
second edition, (Addison-Wesley, 1998).
259. Ker-I Ko: Computational Complexity of Real Functions and Polynomial Time Approximation,
Ph.D. Thesis, Ohio State University, Columbus, Ohio, 1979.
260. Ker-I Ko: Computational Complexity of Real Functions, (Birkhauser Boston, Boston, MA,
1991).
261. M. Kojima, S. Mizuno and A. Yoshise: A primal-dual interior point method for linear pro-
gramming, in: Progress in Mathematical Programming: Interior-Point and Related Methods
(N. Megiddo, ed.), pp. 29–48, (Springer, New York, 1988).
262. J. Komlós and M.T. Shing: Probabilistic partitioning algorithms for the rectilinear Steiner
tree problem, Networks, 15: 413–423 (1985).
263. Bernhard Korte, Dirk Hausmann: An analysis of the greedy heuristic for independence
systems, Ann. Discrete Math., 2: 65–74 (1978).
264. B. Korte, J. Vygen: Combinatorial Optimization, (Springer, 2002).
265. L. Kou, G. Markowsky and L. Berman: A Fast Algorithm for Steiner Trees, Acta Informatica,
15: 141–145 (1981).
266. J.B. Kruskal: On the shortest spanning subtree of a graph and the traveling salesman problem,
Proc. Amer. Math. Soc., 7: 48–50 (1956).
267. H.W. Kuhn: The Hungarian method for the assignment problem, Naval Research Logistics
Quarterly, 2: 83–97 (1955).
268. H.W. Kuhn: Variants of the Hungarian method for assignment problems, Naval Research
Logistics Quarterly, 3: 253–258 (1956).
269. M.K. Kwan: Graphic Programming Using Odd or Even Points, Chinese Math., 1: 273–277
(1962).
270. Lei Lai, Qiufen Ni, Changhong Lu, Chuanhe Huang, Weili Wu: Monotone submodular
maximization over the bounded integer lattice with cardinality constraints, Discret. Math.
Algorithms Appl., 11(6): 1950075:1-1950075:14 (2019).
271. T. Lappas, E. Terzi, D. Gunopulos and H. Mannila: Finding effectors in social networks,
Proceedings, 16th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining (KDD), pp.
1059–1068, 2010.
272. Eugene Lawler: Combinatorial Optimization: Networks and Matroids, (Dover, 2001).
273. D.T. Lee: Two-dimensional Voronoi diagrams in the L_p metric, J. ACM, 27: 604–618 (1980).
274. D.T. Lee and C.K. Wong: Voronoi diagrams in L_1 (L_∞) metrics with 2-dimensional storage
applications, SIAM J. Comput., 9: 200–211 (1980).
275. Jon Lee: A First Course in Combinatorial Optimization, (Cambridge University Press, 2004).
276. J. Lee, V. Mirrokni, V. Nagarajan and M. Sviridenko: Nonmonotone submodular maximiza-
tion under matroid and knapsack constraints, Proceedings, 41st ACM Symposium on Theory
of Computing, pp. 323–332, 2009.
277. J.K. Lenstra, D.B. Shmoys and E. Tardos: Approximation algorithms for scheduling unrelated
parallel machines, Mathematical Programming, 46: 259–271 (1990).
278. Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen,
and Natalie Glance: Cost-effective outbreak detection in networks, Proceedings, 13th ACM
SIGKDD international conference on Knowledge discovery and data mining (KDD), New
York, ACM, pp. 420–429, 2007.
279. C. Levcopoulos: Fast heuristics for minimum length rectangular partitions of polygons,
Proceedings, 2nd Symp. on Computational Geometry, pp. 100–108, 1986.
280. Anany V. Levitin: Introduction to the Design and Analysis of Algorithms, (Addison-Wesley,
2002).
281. Deying Li, Hongwei Du, Peng-Jun Wan, Xiaofeng Gao, Zhao Zhang, Weili Wu: Construction
of strongly connected dominating sets in asymmetric multihop wireless networks, Theor.
Comput. Sci., 410(8-10): 661–669 (2009).
282. Deying Li, Hongwei Du, Peng-Jun Wan, Xiaofeng Gao, Zhao Zhang, Weili Wu: Minimum
power strongly connected dominating sets in wireless networks, Proceedings, ICWN, pp. 447–
451, 2008.
283. Deying Li, Donghyun Kim, Qinghua Zhu, Lin Liu, Weili Wu: Minimum total communication
power connected dominating set in wireless networks, Proceedings, WASA, pp. 132–141,
2012.
284. Deying Li, Qinghua Zhu, Hongwei Du, Weili Wu, Hong Chen, Wenping Chen: Conflict-
free many-to-one data aggregation scheduling in multi-channel multi-hop wireless sensor
networks, Proceedings, ICC, pp. 1–5, 2011.
285. Guanfeng Li, Hui Ling, Taieb Znati, Weili Wu: A Robust on-Demand Path-Key Establishment
Framework via Random Key Predistribution for Wireless Sensor Networks, EURASIP J.
Wirel. Commun. Netw., 2006: 091304 (2006).
286. J. Li, Y. Jin: A PTAS for the weighted unit disk cover problem, in Automata, Languages, and
Programming, Proceedings, ICALP, pp. 898–909, 2015.
287. Xianyue Li, Xiaofeng Gao, Weili Wu: A Better Theoretical Bound to Approximate Connected
Dominating Set in Unit Disk Graph, Proceedings, WASA, pp. 162–175, 2008.
288. G.-H. Lin and G. Xue: Steiner tree problem with minimum number of Steiner points and
bounded edge-length, Information Processing Letters, 69: 53–57 (1999).
289. G.-H. Lin and G. Xue: On the terminal Steiner tree problem, Information Processing Letters,
84: 103–107 (2002).
290. H. Lin and J. Bilmes: Optimal selection of limited vocabulary speech corpora, Proceedings,
Interspeech, 2011.
291. A. Lingas, R.Y. Pinter, R.L. Rivest and A. Shamir: Minimum edge length partitioning of
rectilinear polygons, Proceedings, 20th Allerton Conf. on Communication, Control, and
Computing, pp. 53–63, Illinois, 1982.
292. A. Lingas: Heuristics for minimum edge length rectangular partitions of rectilinear figures,
Proceedings, 6th GI-Conference, pp. 199–210, Dortmund, Springer-Verlag, 1983.
293. Bin Liu, Xiao Li, Huijuan Wang, Qizhi Fang, Junyu Dong, Weili Wu: Profit Maximization
problem with coupons in social networks, Theor. Comput. Sci., 803: 22–35 (2020).
294. Bin Liu, Xiao Li, Huijuan Wang, Qizhi Fang, Junyu Dong, Weili Wu: Profit maximization
problem with coupons in social networks, Proceedings, AAIM, pp. 49–61, 2018.
295. Bin Liu, Yuxia Yan, Qizhi Fang, Junyu Dong, Weili Wu, Huijuan Wang: Maximizing profit
of multiple adoptions in social networks with a martingale approach, J. Comb. Optim., 38(1):
1–20 (2019).
296. Siwen Liu, Hongmin W. Du: Constant-approximation for minimum weight partial sensor
cover, Discret. Math. Algorithms Appl., 13(4): 2150047:1-2150047:8 (2021).
297. L. Lovász: On the ratio of optimal integral and fractional covers, Discrete Mathematics, 13:
383–390 (1975).
298. B. Lu, L. Ruan: Polynomial time approximation scheme for the rectilinear Steiner arbores-
cence problem, Journal of Combinatorial Optimization, 4: 357–363 (2000).
299. Wei Lu, Wei Chen, Laks V.S. Lakshmanan: From competition to complementarity: compara-
tive influence diffusion and maximization, Proceedings of the VLDB Endowment, 9(2): 60–71
(2015).
300. Zaixin Lu, Travis Pitchford, Wei Li, Weili Wu: On the maximum directional target coverage
problem in wireless sensor networks, Proceedings, MSN, pp. 74–79, 2014.
301. Zaixin Lu, Weili Wu, Wei Wayne Li: Target coverage maximisation for directional sensor
networks, Int. J. Sens. Networks, 24(4): 253–263 (2017).
302. Zaixin Lu, Wei Zhang, Weili Wu, Joonmo Kim, Bin Fu: The complexity of influence
maximization problem in the deterministic linear threshold model, J. Comb. Optim., 24(3):
374–378 (2012).
303. Zaixin Lu, Wei Zhang, Weili Wu, Bin Fu, Ding-Zhu Du: Approximation and inapproximation
for the influence maximization problem in social networks under deterministic linear
threshold model, ICDCS Workshops, pp. 160–165, 2011.
304. Zaixin Lu, Zhao Zhang, Weili Wu: Solution of Bharathi-Kempe-Salek conjecture for influ-
ence maximization on arborescence, J. Comb. Optim., 33(2): 803–808 (2017).
305. C. Lund, M. Yannakakis: On the hardness of approximating minimization problems, J. ACM,
41(5): 960–981 (1994).
306. Chuanwen Luo, Wenping Chen, Deying Li, Yongcai Wang, Hongwei Du, Lidong Wu, Weili
Wu: Optimizing flight trajectory of UAV for efficient data collection in wireless sensor
networks, Theor. Comput. Sci., 853: 25–42 (2021).
307. Chuanwen Luo, Lidong Wu, Wenping Chen, Yongcai Wang, Deying Li, Weili Wu: Trajectory
optimization of UAV for efficient data collection from wireless sensor networks, Proceedings,
AAIM, pp. 223–235, 2019.
308. Saunders Mac Lane: Some interpretations of abstract linear dependence in terms of projective
geometry, American Journal of Mathematics, 58 (1): 236–240 (1936).
309. Takanori Maehara, Kazuo Murota: A framework of discrete DC programming by discrete
convex analysis, Math. Program., 152(1-2): 435–466 (2015).
310. I. Mandoiu and A. Zelikovsky: A note on the MST heuristic for bounded edge-length Steiner
trees with minimum number of Steiner points, Information Processing Letters, 75(4): 165–
167 (2000).
311. N. Megiddo and M. Shub: Boundary behaviour of interior point algorithms in linear
programming, Research Report RJ 5319, IBM Thomas J. Watson Research Center (Yorktown
Heights, NY, 1986).
312. V. Melkonian and E. Tardos: Algorithms for a network design problem with crossing
supermodular demands, Networks, 43: 256–265 (2004).
313. S. Micali, V.V. Vazirani: An O(√|V| · |E|) algorithm for finding maximum matching in
general graphs, Proc. 21st IEEE Symp. Foundations of Computer Science, pp. 17–27 (1980).
314. Manki Min, Hongwei Du, Xiaohua Jia, Christina Xiao Huang, Scott C.-H. Huang, Weili Wu:
Improving Construction for Connected Dominating Set with Steiner Tree in Wireless Sensor
Networks, J. Glob. Optim., 35(1): 111–119 (2006).
315. M. Min, S.C.-H. Huang, J. Liu, E. Shragowitz, W. Wu, Y. Zhao and Y. Zhao, An
approximation scheme for the rectilinear Steiner minimum tree in presence of obstructions,
Novel Approaches to Hard Discrete Optimization, Fields Institute Communications Series,
American Math. Society, vol 37: 155–163 (2003).
316. George J. Minty: On the axiomatic foundations of the theories of directed linear graphs,
electrical networks and network-programming, Journal of Mathematics and Mechanics, 15:
485–520 (1966).
317. J.S.B. Mitchell: Guillotine subdivisions approximate polygonal subdivisions: A simple new
method for the geometric k-MST problem, Proceedings, 7th ACM-SIAM Symposium on
Discrete Algorithms, pp. 402–408, 1996.
318. J.S.B. Mitchell: Guillotine subdivisions approximate polygonal subdivisions: Part II - A
simple polynomial-time approximation scheme for geometric k-MST, TSP, and related
problem, SIAM J. Comput., 28: 1298–1307 (1999).
319. J.S.B. Mitchell: Guillotine subdivisions approximate polygonal subdivisions: Part III - Faster
polynomial-time approximation scheme for geometric network optimization, Proceedings,
9th Canadian Conference on Computational Geometry, pp. 229–232, 1997.
320. J.S.B. Mitchell, A. Blum, P. Chalasani, S. Vempala: A constant-factor approximation
algorithm for the geometric k-MST problem in the plane, SIAM J. Comput., 28: 771–781
(1999).
321. R.C. Monteiro and I. Adler: An O(n^3 L) primal-dual interior point algorithm for linear
programming, Manuscript, Department of Industrial Engineering and Operations Research,
University of California (Berkeley, CA, 1987).
322. J. Munkres: Algorithms for the assignment and transportation problems, Journal of the
Society for Industrial and Applied Mathematics, 5(1): 32–38 (1957).
323. K. Nagano, Y. Kawahara and K. Aihara: Size-constrained submodular minimization through
minimum norm base, Proceedings, 28th International Conference on Machine Learning,
Bellevue, WA, USA, 2011.
350. G. Robins and A. Zelikovsky: Improved Steiner tree approximation in graphs, Proceedings,
11th ACM-SIAM Symposium on Discrete Algorithms (SODA), San Francisco, CA, pp. 770–
779, January 2000.
351. Lu Ruan, Hongwei Du, Xiaohua Jia, Weili Wu, Yingshu Li, Ker-I Ko: A greedy approxima-
tion for minimum connected dominating sets, Theor. Comput. Sci., 329(1-3): 325–330 (2004).
352. J.H. Rubinstein and D.A. Thomas: The Steiner ratio conjecture for six points, J. Combinatorial
Theory, Ser. A, 58: 54–77 (1991).
353. S. Sahni: Approximate algorithms for the 0/1 knapsack problem, J. Assoc. Comput. Mach.,
22: 115–124 (1975).
354. S. Sahni and T. Gonzalez: P-complete approximation problems, J. Assoc. Comput. Mach.,
23: 555–565 (1976).
355. D. Sankoff: Minimal mutation trees of sequences, SIAM J. Appl. Math., 28: 35–42 (1975).
356. P. Schreiber: On the history of the so-called Steiner-Weber problem, Wiss. Z. Ernst-Moritz-
Arndt-Univ. Greifswald, Math.-nat.wiss. Reihe, 35(3): (1986).
357. A. Schrijver: Theory of Linear and Integer Programming, (Wiley, Chichester, 1986).
358. Alexander Schrijver: Combinatorial Optimization: Polyhedra and Efficiency, Algorithms and
Combinatorics, vol. 24, (Springer, 2003).
359. A. Schrijver: A combinatorial algorithm minimizing submodular functions in strongly
polynomial time, J. Combinatorial Theory (B), 80: 346–355 (2000).
360. A. Schrijver: On the history of the transportation and maximum flow problems, Mathematical
Programming, 91(3): 437–445 (2002).
361. A. Schrijver: On the history of the shortest path problem, Documenta Math, Extra Volume
ISMP: 155–167 (2012).
362. H.H. Seward: Internal sorting by floating digital sort, in Information Sorting in the
Application of Electronic Digital Computers to Business Operations, Master’s thesis,
Report R-232, Massachusetts Institute of Technology, Digital Computer Laboratory, pp. 25–
28, 1954.
363. M.I. Shamos and D. Hoey: Closest point problems, Proceedings, 16th Annual Symp. on
Foundations of Computer Science, pp. 151–162, 1975.
364. Shan Shan, Weili Wu, Wei Wang, Hongjie Du, Xiaofeng Gao, Ailian Jiang: Constructing
minimum interference connected dominating set for multi-channel multi-radio multi-hop
wireless network, Int. J. Sens. Networks, 11(2): 100–108 (2012).
365. J. Sherman: Nearly maximum flows in nearly linear time, Proceedings, 54th Annual IEEE
Symposium on Foundations of Computer Science (FOCS), pp. 263–269, 2013.
366. Alfonso Shimbel: Structural parameters of communication networks, Bulletin of Mathemati-
cal Biophysics, 15(4): 501–507 (1953).
367. Gerard Sierksma, Yori Zwols: Linear and Integer Optimization: Theory and Practice, (CRC
Press, 2015).
368. Gerard Sierksma, Diptesh Ghosh: Networks in Action; Text and Computer Exercises in
Network Optimization, (Springer, 2010).
369. A.J. Skriver, K.A. Andersen: A label correcting approach for solving bicriterion shortest-path
problems, Computers & Operations Research, 27(6): 507–524 (2000).
370. Petr Slavik: A tight analysis of the greedy algorithm for set cover, Journal of Algorithms,
25(2): 237–254 (1997).
371. M. Sviridenko: A note on maximizing a submodular set function subject to knapsack
constraint, Operations Research Letters, 32: 41–43 (2004).
372. Z. Svitkina and L. Fleischer: Submodular approximation: Sampling-based algorithms and
lower bounds, SIAM Journal on Computing, 40(6): 1715–1737 (2011).
373. E. Tardos: A strongly polynomial minimum cost circulation algorithm, Combinatorica, 5(3):
247–255 (1985).
374. J. Tarhio, E. Ukkonen: A greedy approximation algorithm for constructing shortest common
superstrings, Theoretical Computer Science, 57(1): 131–145 (1988).
375. M. Todd and B. Burrell: An extension of Karmarkar’s algorithm for linear programming using
dual variables, Algorithmica, 1: 409–424 (1986).
376. M.J. Todd and Y. Ye: A centered projective algorithm for linear programming, Technical
Report 763, School of Operations Research and Industrial Engineering, Cornell University
(Ithaca, NY, 1987).
377. N. Tomizawa: On some techniques useful for solution of transportation network problems,
Networks, 1(2): 173–194 (1971).
378. Guangmo Amo Tong, Ding-Zhu Du, Weili Wu: On misinformation containment in online
social networks, Proceedings, NeurIPS, pp. 339–349, 2018.
379. Guangmo Amo Tong, Shasha Li, Weili Wu, Ding-Zhu Du: Effector detection in social
networks, IEEE Trans. Comput. Soc. Syst., 3(4): 151–163 (2016).
380. Guangmo Tong, Ruiqi Wang, Xiang Li, Weili Wu, Ding-Zhu Du: An approximation algorithm
for active friending in online social networks, Proceedings, ICDCS, pp. 1264–1274, 2019.
381. Guangmo Amo Tong, Weili Wu, Ding-Zhu Du: Distributed Rumor Blocking in Social
Networks: A Game Theoretical Analysis, IEEE Transactions on Computational Social
Systems, 5(2): 468–480 (2018).
382. Guangmo Amo Tong, Weili Wu, Ling Guo, Deying Li, Cong Liu, Bin Liu, Ding-Zhu Du: An
efficient randomized algorithm for rumor blocking in online social networks, Proceedings,
INFOCOM, pp. 1–9, 2017.
383. J.S. Turner: Approximation algorithms for the shortest common superstring problem, Infor-
mation and Computation, 83(1): 1–20 (1989).
384. W.T. Tutte: Introduction to the theory of matroids, Modern Analytic and Computational
Methods in Science and Mathematics, vol. 37, (New York: American Elsevier Publishing
Company, 1971).
385. Pravin M. Vaidya: An algorithm for linear programming which requires O(((m+n)n^2 +
(m+n)^{1.5} n)L) arithmetic operations, Mathematical Programming, 47: 175–201 (1990).
386. S.A. Vavasis: Automatic domain partitioning in three dimensions, SIAM J. Sci. Stat. Comput.,
12(4): 950–970 (1991).
387. Vijay V. Vazirani: Approximation Algorithms, (Berlin: Springer, 2003).
388. Jan Vondrák: Optimal approximation for the submodular welfare problem in the value oracle
model, Proceedings, STOC, pp. 67–74, 2008.
389. Peng-Jun Wan, Ding-Zhu Du, Panos M. Pardalos, Weili Wu: Greedy approximations for
minimum submodular cover with submodular cost. Comp. Opt. and Appl., 45(2): 463–474
(2010).
390. Chen Wang, My T. Thai, Yingshu Li, Feng Wang, Weili Wu: Minimum coverage breach
and maximum network lifetime in wireless sensor networks, Proceedings, GLOBECOM, pp.
1118–1123, 2007.
391. Chen Wang, My T. Thai, Yingshu Li, Feng Wang, Weili Wu: Optimization scheme for sensor
coverage scheduling with bandwidth constraints, Optim. Lett., 3(1): 63–75 (2009).
392. Ailian Wang, Weili Wu, Junjie Chen: Social network rumors spread model based on cellular
automata, Proceedings, MSN, pp. 236–242, 2014.
393. Ailian Wang, Weili Wu, Lei Cui: On Bharathi-Kempe-Salek conjecture for influence maxi-
mization on arborescence, J. Comb. Optim., 31(4): 1678–1684 (2016).
394. L. Wang and D.-Z. Du: Approximations for bottleneck Steiner trees, Algorithmica, 32: 554–
561 (2002).
395. L. Wang and D. Gusfield: Improved approximation algorithms for tree alignment, Proceed-
ings, 7th Symp. on Combinatorial Pattern Matching, Springer LNCS, 1075: 220–233 (1996).
396. L. Wang and T. Jiang: An approximation scheme for some Steiner tree problems in the plane,
Networks, 28: 187–193 (1996).
397. L. Wang, T. Jiang and D. Gusfield: A more efficient approximation scheme for tree alignment,
Proceedings, 1st annual international conference on computational biology, pp. 310–319,
1997.
398. L. Wang, T. Jiang and E.L. Lawler: Approximation algorithms for tree alignment with a given
phylogeny, Algorithmica, 16: 302–315 (1996).
399. Wei Wang, Donghyun Kim, Nassim Sohaee, Changcun Ma, Weili Wu: A PTAS for minimum
d-hop underwater sink placement problem in 2-d underwater sensor networks, Discret. Math.
Algorithms Appl., 1(2): 283–290 (2009).
400. Wei Wang, Donghyun Kim, James Willson, Bhavani M. Thuraisingham, Weili Wu: A better
approximation for minimum average routing path clustering problem in 2-d underwater
sensor networks, Discret. Math. Algorithms Appl., 1(2): 175–192 (2009).
401. Zhefeng Wang, Yu Yang, Jian Pei and Enhong Chen: Activity maximization by effective
information diffusion in social networks, IEEE Transactions on Knowledge and Data
Engineering, 29(11): 2374–2387 (2017).
402. Stephen Warshall: A theorem on Boolean matrices, Journal of the ACM, 9(1): 11–12 (1962).
403. D.J.A. Welsh: Matroid Theory, L.M.S. Monographs, vol. 8, (Academic Press, 1976).
404. Neil White (ed.): Theory of Matroids, Encyclopedia of Mathematics and its Applications, vol.
26, (Cambridge: Cambridge University Press, 1986).
405. Neil White (ed.): Combinatorial geometries, Encyclopedia of Mathematics and its Applica-
tions, vol. 29, (Cambridge: Cambridge University Press, 1987).
406. Hassler Whitney: On the abstract properties of linear dependence, American Journal of
Mathematics, 57(3): 509–533 (1935).
407. Chr. Wiener: Ueber eine Aufgabe aus der Geometria situs, Mathematische Annalen, 6: 29–30
(1873).
408. David P. Williamson, David B. Shmoys: The Design of Approximation Algorithms, (Cam-
bridge University Press, 2011).
409. James Willson, Weili Wu, Lidong Wu, Ling Ding, Ding-Zhu Du: New approximation for
maximum lifetime coverage, Optimization, 63(6): 839–847 (2014).
410. James Willson, Zhao Zhang, Weili Wu, Ding-Zhu Du: Fault-tolerant coverage with maximum
lifetime in wireless sensor networks, Proceedings, INFOCOM, pp. 1364–1372, 2015.
411. Laurence A. Wolsey: Heuristic analysis, linear programming and branch and bound, Mathe-
matical Programming Study 13: 121–134 (1980).
412. Laurence A. Wolsey: Maximizing real-valued submodular functions: primal and dual heuris-
tics for location problems, Mathematics of Operations Research, 7: 410–425 (1982).
413. Laurence A. Wolsey: An analysis of the greedy algorithm for the submodular set covering
problem, Combinatorica, 2(4): 385–393 (1982).
414. Baoyuan Wu, Siwei Lyu, Bernard Ghanem: Constrained submodular minimization for
missing labels and class imbalance in multi-label learning, Proceedings, AAAI, pp. 2229–
2236, 2016.
415. Chenchen Wu, Yishui Wang, Zaixin Lu, P.M. Pardalos, Dachuan Xu, Zhao Zhang, Ding-
Zhu Du: Solving the degree-concentrated fault-tolerant spanning subgraph problem by DC
programming, Math. Program., 169(1): 255–275 (2018).
416. Lidong Wu, Hongwei Du, Weili Wu, Deying Li, Jing Lv, Wonjun Lee: Approximations for
minimum connected sensor cover, Proceedings, INFOCOM, pp. 1187–1194, 2013.
417. Lidong Wu, Hongwei Du, Weili Wu, Yuqing Zhu, Ailian Wang, Wonjun Lee: PTAS for
routing-cost constrained minimum connected dominating set in growth bounded graphs, J.
Comb. Optim., 30(1): 18–26 (2015).
418. Lidong Wu, Huijuan Wang, Weili Wu: Connected set-cover and group Steiner tree, Encyclo-
pedia of Algorithms, pp. 430–432, 2016.
419. Weili Wu, Xiuzhen Cheng, Min Ding, Kai Xing, Fang Liu, Ping Deng: Localized outlying
and boundary data detection in sensor networks, IEEE Trans. Knowl. Data Eng., 19(8): 1145–
1157 (2007).
420. Weili Wu, Hongwei Du, Xiaohua Jia, Yingshu Li, Scott C.-H. Huang: Minimum connected
dominating sets and maximal independent sets in unit disk graphs, Theor. Comput. Sci., 352(1-
3): 1–7 (2006).
421. Weili Wu, Zhao Zhang, Chuangen Gao, Hai Du, Hua Wang, Ding-Zhu Du: Quality of barrier
cover with wireless sensors, Int. J. Sens. Networks, 29(4): 242–251 (2019).
422. Weili Wu, Zhao Zhang, Wonjun Lee, Ding-Zhu Du: Optimal Coverage in Wireless Sensor
Networks, (Springer, 2020).
423. Biaofei Xu, Yuqing Zhu, Deying Li, Donghyun Kim, Weili Wu: Minimum (k, ω)-angle barrier
coverage in wireless camera sensor networks, Int. J. Sens. Networks, 21(3): 179–188 (2016).
424. Wen Xu, Zaixin Lu, Weili Wu, Zhiming Chen: A novel approach to online social influence
maximization, Soc. Netw. Anal. Min., 4(1): 153 (2014).
425. Wen Xu, Weili Wu: Optimal Social Influence, (Springer, 2020).
426. Ruidong Yan, Deying Li, Weili Wu, Ding-Zhu Du, Yongcai Wang: Minimizing influence of
rumors by blockers on social networks: algorithms and analysis, IEEE Trans. Netw. Sci. Eng.,
7(3): 1067–1078 (2020).
427. D.-N. Yang, H.-J. Hung, W.-C. Lee, W. Chen: Maximizing acceptance probability for active
friending in online social networks, Proceedings, 19th ACM SIGKDD international confer-
ence on Knowledge discovery and data mining, pp. 713–721, 2013.
428. Ruiqi Yang, Shuyang Gu, Chuangen Gao, Weili Wu, Hua Wang, Dachuan Xu: A constrained
two-stage submodular maximization, Theor. Comput. Sci., 853: 57–64 (2021).
429. Ruidong Yan, Yi Li, Weili Wu, Deying Li, Yongcai Wang: Rumor blocking through online
link deletion on social networks, ACM Trans. Knowl. Discov. Data, 13(2): 16:1-16:26 (2019).
430. Wenguo Yang, Jianmin Ma, Yi Li, Ruidong Yan, Jing Yuan, Weili Wu, Deying Li: Marginal
gains to maximize content spread in social networks, IEEE Trans. Comput. Soc. Syst., 6(3):
479–490 (2019).
431. Wenguo Yang, Jing Yuan, Weili Wu, Jianmin Ma, Ding-Zhu Du: Maximizing Activity Profit
in Social Networks, IEEE Trans. Comput. Soc. Syst., 6(1): 117–126 (2019).
432. M. Yannakakis: On the approximation of maximum satisfiability, Journal of Algorithms,
17(3): 475–502 (1994).
433. A.C. Yao: On constructing minimum spanning trees in k-dimensional spaces and related
problems, SIAM J. Comput., 11: 721–736 (1982).
434. F.F. Yao: Efficient dynamic programming using quadrangle inequalities, Proceedings, 12th
Ann. ACM Symp. on Theory of Computing, pp. 429–435, 1980.
435. Jing Yuan, Weili Wu, Yi Li, Ding-Zhu Du: Active friending in online social networks,
Proceedings, BDCAT, pp. 139–148, 2017.
436. Jing Yuan, Weili Wu, Wen Xu: Approximation for influence maximization, Handbook of
Approximation Algorithms and Metaheuristics, (2) 2018.
437. A. Zelikovsky: The 11/6-approximation algorithm for the Steiner problem on networks,
Algorithmica, 9: 463–470 (1993).
438. A. Zelikovsky: A series of approximation algorithms for the acyclic directed Steiner tree
problem, Algorithmica, 18: 99–110 (1997).
439. F.B. Zhan, C.E. Noon: A comparison between label-setting and label-correcting algorithms
for computing one-to-one shortest paths, Journal of Geographic information and decision
analysis, 4(2): 1–11 (2000).
440. Jianzhong Zhang, Shaoji Xu: Linear Programming, (Science Press, 1987).
441. Ning Zhang, Incheol Shin, Feng Zou, Weili Wu, My T. Thai: Trade-off scheme for fault
tolerant connected dominating sets on size and diameter, Proceedings, FOWANC, pp. 1–8,
2008.
442. Wei Zhang, Weili Wu, Wonjun Lee, Ding-Zhu Du: Complexity and approximation of the
connected set-cover problem, J. Glob. Optim., 53(3): 563–572 (2012).
443. Yapu Zhang, Jianxiong Guo, Wenguo Yang, Weili Wu: Targeted Activation Probability
Maximization Problem in Online Social Networks, IEEE Trans. Netw. Sci. Eng., 8(1): 294–
304 (2021).
444. Yapu Zhang, Jianxiong Guo, Wenguo Yang, Weili Wu: Mixed-case community detection
problem in social networks: Algorithms and analysis, Theor. Comput. Sci., 854: 94–104
(2021).
445. Yapu Zhang, Wenguo Yang, Weili Wu, Yi Li: Effector detection problem in social networks,
IEEE Trans. Comput. Soc. Syst., 7(5): 1200–1209 (2020).
446. Zhao Zhang, Xiaofeng Gao, Weili Wu: Algorithms for connected set cover problem and fault-
tolerant connected set cover problem, Theor. Comput. Sci., 410(8-10): 812–817 (2009).
447. Zhao Zhang, Xiaofeng Gao, Weili Wu, Ding-Zhu Du: PTAS for minimum connected
dominating set in unit ball graph, Proceedings, WASA, pp. 154–161, 2008.
448. Zhao Zhang, Xiaofeng Gao, Weili Wu, Ding-Zhu Du: A PTAS for minimum connected
dominating set in 3-dimensional Wireless sensor networks. J. Glob. Optim., 45(3): 451–458
(2009).
449. Zhao Zhang, Xiaofeng Gao, Xuefei Zhang, Weili Wu, Hui Xiong: Three approximation
algorithms for energy-efficient query dissemination in sensor database system, Proceedings,
DEXA, pp. 807–821, 2009.
450. Zhao Zhang, Joonglyul Lee, Weili Wu, Ding-Zhu Du: Approximation for minimum strongly
connected dominating and absorbing set with routing-cost constraint in disk digraphs, Optim.
Lett., 10(7): 1393–1401 (2016).
451. Zhao Zhang, James Willson, Zaixin Lu, Weili Wu, Xuding Zhu, Ding-Zhu Du: Approximat-
ing maximum lifetime k-coverage through minimizing weighted k-cover in homogeneous
wireless sensor networks, IEEE/ACM Trans. Netw., 24(6): 3620–3633 (2016).
452. Zhao Zhang, Weili Wu, Jing Yuan, Ding-Zhu Du: Breach-free sleep-wakeup scheduling for
barrier coverage with heterogeneous wireless sensors, IEEE/ACM Trans. Netw., 26(5): 2404–
2413 (2018).
453. Zhao Zhang, Weili Wu, Lidong Wu, Yanjie Li, Zongqing Chen: Strongly connected dominat-
ing and absorbing set in directed disk graph, Int. J. Sens. Networks, 19(2): 69–77 (2015).
454. Jiao Zhou, Zhao Zhang, Weili Wu, Kai Xing: A greedy algorithm for the fault-tolerant
connected dominating set in a general graph, J. Comb. Optim., 28(1): 310–319 (2014).
455. Jianming Zhu, Smita Ghosh, Weili Wu: Robust rumor blocking problem with uncertain rumor
sources in social networks, World Wide Web, 24(1): 229–247 (2021).
456. Jianming Zhu, Smita Ghosh, Weili Wu: Group influence maximization problem in social
networks, IEEE Trans. Comput. Soc. Syst., 6(6): 1156–1164 (2019).
457. Jianming Zhu, Junlei Zhu, Smita Ghosh, Weili Wu and Jing Yuan: Social influence max-
imization in hypergraph in social networks, IEEE Transactions on Network Science and
Engineering, 6(4): 801–811 (2019).
458. Yuqing Zhu, Deying Li, Ruidong Yan, Weili Wu, Yuanjun Bi: Maximizing the influence and
profit in social networks, IEEE Trans. Comput. Soc. Syst., 4(3): 54–64 (2017).
459. Yuqing Zhu, Weili Wu, Yuanjun Bi, Lidong Wu, Yiwei Jiang, Wen Xu: Better approximation
algorithms for influence maximization in online social networks, J. Comb. Optim., 30(1): 97–
108 (2015).
460. Yuqing Zhu, Zaixin Lu, Yuanjun Bi, Weili Wu, Yiwei Jiang, Deying Li: Influence and profit:
Two sides of the coin, Proceedings, ICDM, pp. 1301–1306, 2013.
461. Feng Zou, Xianyue Li, Donghyun Kim, Weili Wu: Construction of minimum connected
dominating set in 3-dimensional wireless network, Proceedings, WASA, pp. 134–140, 2008.
462. Feng Zou, Xianyue Li, Donghyun Kim and Weili Wu: Two constant approximation algorithms
for node-weighted Steiner tree in unit disk graphs, Proceedings, COCOA, pp. 278–285, 2008.
463. Feng Zou, Yuexuan Wang, XiaoHua Xu, Xianyue Li, Hongwei Du, Peng-Jun Wan, Weili
Wu: New approximations for minimum-weighted dominating sets and minimum-weighted
connected dominating sets on unit disk graphs, Theor. Comput. Sci., 412(3): 198–208 (2011).
464. D. Zuckerman: Linear degree extractors and the inapproximability of max clique and
chromatic number, Proceedings, 38th ACM Symposium on Theory of Computing, pp. 681–
690, 2006.
465. D. Zuckerman: Linear degree extractors and the inapproximability of Max Clique and
Chromatic Number, Theory Comput., 3: 103–128 (2007).