
A First Course in

Linear Optimization
— a dynamic book —

by
Jon Lee

Fourth Edition (Version 4.0)

ReEx PrEsS
Jon Lee
2013–2021

ReEx PrEsS

This work is licensed under the

Creative Commons Attribution 3.0 Unported License


(CC BY 3.0).

To view a copy of this license, visit

http://creativecommons.org/licenses/by/3.0/
where you will see the summary information below and can click through to the full
license information.
Go Forward

This is a book on linear optimization, written in LaTeX. I started it, aiming it at the course IOE 510, a masters-level course at the University of Michigan. Use it as is, or adapt it to your course! It is an ongoing project. It is alive! It can be used, modified (the LaTeX source is available) and redistributed as anyone pleases, subject to the terms of the Creative Commons Attribution 3.0 Unported License (CC BY 3.0). Please take special note that you can share (copy and redistribute in any medium or format) and adapt (remix, transform, and build upon for any purpose, even commercially) this material, but you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests that I endorse you or your use. If you are interested in endorsements, speak to my agent.
I started this material, but I don’t control so much what you do with it. Control is sometimes overrated — and I am a control freak, so I should know!
I hope that you find this material useful. If not, I am happy to refund what you paid to me.

Jon Lee

Ann Arbor, Michigan


started March 2013

Preface

This book is a treatment of linear optimization meant for students who are reasonably comfortable with matrix algebra (or willing to get comfortable rapidly). It is not a goal of mine to teach anyone how to solve small problems by hand. My goals are to introduce: (i) the mathematics and algorithmics of the subject at a beginning mathematical level, (ii) algorithmically-aware modeling techniques, and (iii) high-level computational tools for studying and developing optimization algorithms (in particular, Python/Gurobi1).
Proofs are given when they are important in understanding the algorithmics. I make free use of the inverse of a matrix. But it should be understood, for example, that B⁻¹b is meant as a mathematical expression for the solution of the square linear system of equations Bx = b . I am not in any way suggesting that an efficient way to calculate the solution of a large (often sparse) linear system is to calculate an inverse! Also, I avoid the dual simplex algorithm (e.g., even in describing branch-and-bound and cutting-plane algorithms), preferring to just think about the ordinary simplex algorithm applied to the dual problem. Again, my goal is not to describe the most efficient way to do matrix algebra!
Conventional illustrations are woefully few. Though if Lagrange could not be bothered1, who am I to aim higher? Still, I am gradually improving this aspect, and many of the algorithms and concepts are illustrated and verified in the modern way, with computer code.2
The material that I present was mostly well known by the 1960’s. As a student at Cornell in the late 70’s and early 80’s, I learned and got excited about linear optimization from Bob Bland, Les Trotter and Lou Billera, using [1] and [5]. The present book is a treatment of some of that material, with additional material on integer-linear optimization, most of which I originally learned from George Nemhauser and Les. But there is new material too; in particular, a “deconstructed post-modern” version of Gomory pure and mixed-integer cuts. There is nothing here on interior-point algorithms and the ellipsoid algorithm; don’t tell Mike Todd!

Jon Lee

Ann Arbor, Michigan


started March 2013
(or maybe really in Ithaca, NY in 1979)

1 New in the 4th Edition! But thanks for the very fond memories, AMPL.
Serious Acknowledgments

Throw me some serious funding for this project, and I will acknowledge you — seri-
ously!
Many of the pictures in this book were found floating around on the web. I am making “fair use” of them as they float through this document. Of course, I gratefully acknowledge those who own them.
Hearty thanks to many students and to Prof. Siqian Shen for pointing out typos in an earlier version.

Dedication

For students (even Ohio students). Not for publishers — maybe next time.

The Nitty Gritty

You can always get the released edition of this book (in .pdf format) from my web page or github, and the materials to produce them (LaTeX source, etc.) from me.
I make significant use of software. Everything seems to work with:

Python 3.8.3 (default, Jul 2 2020, 17:30:36) [MSC v.1916 64 bit (AMD64)]
(via Anaconda distribution)
Jupyter Notebook server 6.0.3 (via Anaconda distribution)
Gurobi Optimizer version 9.1.2 build v9.1.2rc0 (win64)
WinEdt 10.3
MiKTeX 2.9

Use of older versions is inexcusable. Newer versions will surely break things. Nonetheless, if you can report success or failure on newer versions, please let me know.
I use lots of LaTeX packages (which, as you may know, makes things rather fragile). I could not possibly gather the version numbers of those — I do have a day job! (but WinEdt does endeavor to keep the packages up to date).

Contents

1 Let’s Get Started 1


1.1 Linear Optimization and Standard Form . . . . . . . . . . . . . . . . . . . 1
1.2 A Standard-Form Problem and its Dual . . . . . . . . . . . . . . . . . . . 2
1.3 Linear-Algebra Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2 Modeling 9
2.1 A Production Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Norm Minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Network Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Modeling in Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

3 Algebra Versus Geometry 19


3.1 Basic Feasible Solutions and Extreme Points . . . . . . . . . . . . . . . . . 19
3.2 Basic Feasible Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.3 Basic Feasible Rays and Extreme Rays . . . . . . . . . . . . . . . . . . . . 25
3.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

4 The Simplex Algorithm 27


4.1 A Sufficient Optimality Criterion . . . . . . . . . . . . . . . . . . . . . . . 27
4.2 The Simplex Algorithm with No Worries . . . . . . . . . . . . . . . . . . 30
4.3 Anticycling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.4 Obtaining a Basic Feasible Solution . . . . . . . . . . . . . . . . . . . . . . 37
4.4.1 Ignoring degeneracy . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.4.2 Not ignoring degeneracy . . . . . . . . . . . . . . . . . . . . . . . . 39
4.5 The Simplex Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

5 Duality 45
5.1 The Strong Duality Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.2 Complementary Slackness . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.3 Duality for General Linear-Optimization Problems . . . . . . . . . . . . . 48
5.4 Theorems of the Alternative . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53


6 Sensitivity Analysis 57
6.1 Right-Hand Side Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
6.1.1 Local analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.1.2 Global analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.1.3 A brief detour: the column geometry for the Simplex Algorithm . 60
6.2 Objective Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.2.1 Local analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.2.2 Global analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

7 Large-Scale Linear Optimization 65


7.1 Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
7.1.1 The master reformulation . . . . . . . . . . . . . . . . . . . . . . . 66
7.1.2 Solution of the Master via the Simplex Algorithm . . . . . . . . . 68
7.2 Lagrangian Relaxation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
7.2.1 Lagrangian bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
7.2.2 Solving the Lagrangian Dual . . . . . . . . . . . . . . . . . . . . . 75
7.3 The Cutting-Stock Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 80
7.3.1 Formulation via cutting patterns . . . . . . . . . . . . . . . . . . . 80
7.3.2 Solution via continuous relaxation . . . . . . . . . . . . . . . . . . 80
7.3.3 The knapsack subproblem . . . . . . . . . . . . . . . . . . . . . . . 81
7.3.4 Applying the Simplex Algorithm . . . . . . . . . . . . . . . . . . . 82
7.3.5 A demonstration implementation . . . . . . . . . . . . . . . . . . 83
7.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

8 Integer-Linear Optimization 89
8.1 Integrality for Free . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
8.1.1 Some structured models . . . . . . . . . . . . . . . . . . . . . . . . 89
8.1.2 Unimodular basis matrices and total unimodularity . . . . . . . . 92
8.1.3 Consequences of total unimodularity . . . . . . . . . . . . . . . . 95
8.2 Modeling Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8.2.1 Disjunctions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8.2.2 Forcing constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
8.2.3 Piecewise-linear univariate functions . . . . . . . . . . . . . . . . 103
8.3 A Prelude to Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.4 Branch-and-Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
8.5 Cutting Planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
8.5.1 Pure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
8.5.2 Mixed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
8.5.3 Finite termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
8.5.4 Branch-and-Cut . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
8.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Appendices 125
A.1 LATEX template . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
A.2 MatrixLP.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
A.3 Production.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
A.4 Multi-commodityFlow.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . 143
A.5 pivot_example.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
A.6 pivot_tools.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
A.7 Circle.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
A.8 Decomp.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
A.9 SubgradProj.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
A.10 CSP.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
A.11 UFL.ipynb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
A.12 pure_gomory_example_1.ipynb . . . . . . . . . . . . . . . . . . . . . . . 237
A.13 pure_gomory_example_2.ipynb . . . . . . . . . . . . . . . . . . . . . . . 247

End Notes 273

Bibliography 279

Indexes 281
Index of definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
Index of Jupyter notebooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
Chapter 1

Let’s Get Started

Our main goals in this chapter are as follows:


• Introduce some terminology associated with linear optimization.

• Describe elementary techniques for transforming any linear-optimization problem into one in a ‘standard form.’

• Introduce the Weak Duality Theorem.

• Review ideas from linear algebra that we will make use of later.

1.1 Linear Optimization and Standard Form


Linear optimization is the study of the mathematics and algorithms associated with
minimizing or maximizing a real linear objective function of a finite number of real
variables, subject to a finite number of linear constraints, each being that a real linear
function on these variables be =, ≤, or ≥ a constant. A polyhedron is the solution set of a finite number of linear constraints; so we are studying optimization of a linear function on a polyhedron.
A solution of a linear-optimization problem is an assignment of real values to the
variables. A solution is feasible if it satisfies the linear constraints. A solution is optimal
if there is no feasible solution with better objective value. The set of feasible solutions
(which is a polyhedron) is the feasible region.
It is convenient to put a general linear-optimization problem into a standard form

min c'x
Ax = b ;
x ≥ 0 ,

where c ∈ Rn , b ∈ Rm , A ∈ Rm×n has full row rank m , and x is a vector of variables in Rn . That is, minimization of a linear function of a finite number of non-negative real variables, subject to a non-redundant and consistent system of linear equations. Note that even though the system of equations, Ax = b , has a solution, the problem may not have a feasible solution.
Through a finite sequence of simple transformations, every linear-optimization problem can be brought into an equivalent one in standard form. Specifically, we can apply any of the following steps, as needed, in the order presented.

• The maximum of c0 x is the same as the negative of the minimum of −c0 x .


• We can replace any non-positive variable xj with a non-negative variable xj⁻ , substituting −xj⁻ for xj . Additionally, we can replace any unrestricted variable xj with the difference of a pair of non-negative variables xj⁺ and xj⁻ ; that is, substituting xj⁺ − xj⁻ for xj . In this way, we can make all variables constrained to be non-negative.

• Next, if we have an inequality ∑_{j=1}^{n} αj xj ≤ γ , we simply replace it with ∑_{j=1}^{n} αj xj + s = γ , where a real slack variable s is introduced which is constrained to be non-negative. Similarly, we can replace ∑_{j=1}^{n} αj xj ≥ γ with ∑_{j=1}^{n} αj xj − s = γ , where a real surplus variable s is introduced which is constrained to be non-negative.
• Applying these transformations as needed results in a standard-form problem, except possibly for the condition that the matrix of coefficients of the system of equations have full row rank. But we can realize this last condition by carrying out elementary row operations on the system of equations, resulting in the elimination of any redundant equations or the identification that the system of equations is inconsistent. In the latter case, the linear-optimization problem is infeasible.
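As a small illustration of these steps (an example of mine, not taken from the text): the problem max x1 + 2x2 subject to x1 + x2 ≤ 3 , with x1 ≥ 0 and x2 unrestricted, becomes, after negating the objective, substituting x2⁺ − x2⁻ for x2 , and adding a slack variable s , the standard-form problem min −x1 − 2x2⁺ + 2x2⁻ subject to x1 + x2⁺ − x2⁻ + s = 3 and x1 , x2⁺ , x2⁻ , s ≥ 0 .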

1.2 A Standard-Form Problem and its Dual


Let c ∈ Rn , b ∈ Rm , and A ∈ Rm×n . Let x be a vector of variables in Rn . Consider the standard-form problem

min c'x
Ax = b ;      (P)
x ≥ 0 .

Let y be a vector of variables in Rm , and consider the linear-optimization problem

max y'b      (D)
y'A ≤ c' .

It is worth emphasizing that (P) and (D) are both defined from the same data A , b
and c . We have the following very simple but key result, relating the objective values
of feasible solutions of the two linear-optimization problems.

Theorem 1.1 (Weak Duality Theorem)

If x̂ is feasible in (P) and ŷ is feasible in (D), then c'x̂ ≥ ŷ'b .

Proof.

c'x̂ ≥ ŷ'Ax̂ ,

because ŷ'A ≤ c' (feasibility of ŷ in (D)) and x̂ ≥ 0 (feasibility of x̂ in (P)). Furthermore

ŷ'Ax̂ = ŷ'b ,

because Ax̂ = b (feasibility of x̂ in (P)). The result follows. □
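The theorem is also easy to check numerically. Here is a small sketch of my own (the data are made up; this is not one of the book's notebooks): pick any feasible x̂ for (P) and any feasible ŷ for (D), and compare objective values.

import numpy as np

# Made-up standard-form data; A has full row rank.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [2.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 2.0, 5.0, 4.0])

x_hat = np.array([1.0, 2.0, 1.0, 2.0])   # feasible for (P): A x_hat = b and x_hat >= 0
y_hat = np.array([1.0, 1.0])             # feasible for (D): y_hat' A <= c'

assert np.allclose(A @ x_hat, b) and np.all(x_hat >= 0)
assert np.all(y_hat @ A <= c + 1e-9)
print(c @ x_hat, ">=", y_hat @ b)        # 20.0 >= 10.0, as Weak Duality promises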

1.3 Linear-Algebra Review

For a matrix A ∈ Rm×n , we denote the entry in row i and column j as aij . For a matrix A ∈ Rm×n , we denote the transpose of A by A' ∈ Rn×m . That is, the entry in row i and column j of A' is aji .
Except when we state clearly otherwise, vectors are “column vectors.” That is, we
can view a vector x ∈ Rn as a matrix in Rn×1 . Column j of A is denoted by A·j ∈ Rm .
Row i of A is denoted by Ai· , and we view its transpose as a vector in Rn . We will have
far greater occasion to reference columns of matrices rather than rows, so we will often
write Aj as a shorthand for A·j , so as to keep notation less cluttered.
For matrices A ∈ Rm×p and B ∈ Rp×n , the (matrix) product AB ∈ Rm×n is defined to be the matrix having ∑_{k=1}^{p} aik bkj as the entry in row i and column j . Note that for the product AB to make sense, the number of columns of A and the number of rows of B must be identical. It is important to emphasize that matrix multiplication is associative; that is, (AB)C = A(BC) , and so we can always unambiguously write the product of any number of matrices without the need for any parentheses. Also, note that the product and transpose behave nicely together. That is, (AB)' = B'A' .
The dot product or scalar product of vectors x, z ∈ Rn is the scalar ⟨x, z⟩ := ∑_{j=1}^{n} xj zj , which we can equivalently see as x'z or z'x , allowing ourselves to consider a 1 × 1 matrix to be viewed as a scalar. Thinking about matrix multiplication again, and freely viewing columns as vectors, the entry in row i and column j of the product AB is the dot product ⟨(Ai·)', B·j⟩ .
Matrix multiplication extends to “block matrices” in a straightforward manner. If

A := [ A11 ··· A1p          B := [ B11 ··· B1n
       A21 ··· A2p                 B21 ··· B2n
        ⋮   ⋱   ⋮      and          ⋮   ⋱   ⋮
       Am1 ··· Amp ]               Bp1 ··· Bpn ] ,

where each of the Aij and Bij are matrices, and we assume that for all i and j the number of columns of Aik agrees with the number of rows of Bkj , then

AB = [ ∑_{k=1}^{p} A1k Bk1  ···  ∑_{k=1}^{p} A1k Bkn
       ∑_{k=1}^{p} A2k Bk1  ···  ∑_{k=1}^{p} A2k Bkn
               ⋮              ⋱            ⋮
       ∑_{k=1}^{p} Amk Bk1  ···  ∑_{k=1}^{p} Amk Bkn ] .

That is, block i, j of the product is ∑_{k=1}^{p} Aik Bkj , and Aik Bkj is understood as ordinary matrix multiplication.
For vectors x1 , x2 , . . . , xp ∈ Rn , and scalars λ1 , λ2 , . . . , λp , the vector ∑_{i=1}^{p} λi xi is a linear combination of x1 , x2 , . . . , xp . The linear combination is trivial if all λi = 0 . The vectors x1 , x2 , . . . , xp ∈ Rn are linearly independent if the only representation of the zero
vector in Rn as a linear combination of x1 , x2 , . . . , xp is trivial. The set of all linear combi-
nations of x1 , x2 , . . . , xp is the vector-space span of {x1 , x2 , . . . , xp } . The dimension of
a vector space V , denoted dim(V ) , is the maximum number of linearly-independent
vectors in it. Equivalently, it is the minimum number of vectors needed to span the
space.
A set of dim(V ) linearly-independent vectors that spans a vector space V is a ba-
sis for V . If V is the vector-space span of {x1 , x2 , . . . , xp } , then there is a subset of
{x1 , x2 , . . . , xp } that is a basis for V . It is not hard to prove the following very useful
result.

Theorem 1.2 (Greedy Basis Extension Theorem)


Let V be the vector-space span of {x1 , x2 , . . . , xp } . Then every linearly-independent
subset of {x1 , x2 , . . . , xp } can be extended to a basis for V using vectors from
x1 , x2 , . . . , xp .

The span of the rows of a matrix A ∈ Rm×n is the row space of A , denoted r.s.(A) :=
{y 0 A: y ∈ Rm } . Similarly, the span of the columns of a matrix A is the column space
of A , denoted c.s.(A) := {Ax : x ∈ Rn } . It is a simple fact that, for a matrix A ,
the dimension of its row space and the dimension of its column space are identical,
this common number being called the rank of A . The matrix A has full row rank if
its number of rows is equal to its rank. That is, if its rows are linearly independent.
Similarly, the matrix A has full column rank if its number of columns is equal to its
rank. That is, if its columns are linearly independent.
Besides the row and column spaces of a matrix A ∈ Rm×n , there is another very
important vector space associated with A . The null space of A is the set of vectors
having 0 dot product with all rows of A , denoted n.s.(A) := {x ∈ Rn : Ax = 0} .
An important result is the following theorem relating the dimensions of the row
and null spaces of a matrix.

Theorem 1.3 (Rank-Nullity Theorem)


If A is a matrix with n columns, then

dim(r.s.(A)) + dim(n.s.(A)) = n .

There are some simple operations on a matrix that preserve its row and null spaces.
The following operations are elementary row operations:

1. multiply a row by a non-zero scalar;

2. interchange a pair of rows;

3. add a scalar multiple of a row to another row;

4. delete a row that is identically zero.

There is one more operation that we allow, which is really one of convenience rather
than mathematics. It is convenient to be able to permute columns while also permuting
the corresponding column indices. That is, if A ∈ Rm×n , we regard the columns as
labeled, in left-to-right order: 1, 2, . . . , n . So we have

A = [A1 , A2 , . . . , An ] .

It can be convenient to have a permutation σ1 , σ2 , . . . , σn of 1, 2, . . . , n , and then write

[Aσ1 , Aσ2 , . . . , Aσn ] .

This matrix is really equivalent to A , because we regard its columns as labeled by


σ1 , σ2 , . . . , σn rather than 1, 2, . . . , n . Put another way, when we write a matrix, the order of the columns is at our convenience, but the labels of the columns are determined by the order that we choose for placing the columns.

The identity matrix Ir in Rr×r is the matrix having 1 as every diagonal element and 0 as every off-diagonal element. Via elementary row operations, any matrix A that is not all zero can be transformed into one of the form

[Ir , M ] .

Using corresponding operations on the associated system of equations, this is known


as Gauss-Jordan elimination.
For an r × r matrix B of rank r , there is a unique r × r matrix “B −1 ” such that
B −1 B = Ir . For this reason, such a matrix B is called invertible, and B −1 is called the
inverse of B . According to the definition, B −1 B = Ir , but we also have BB −1 = Ir .
Also, (B 0 )−1 = (B −1 )0 , and if A and B are both invertible, then (AB)−1 = B −1 A−1 .
Noting that,
B −1 [B, Ir ] = [Ir , B −1 ] ,
we see that there is a nice way to compute the inverse of a matrix B using elementary
row operations. That is, we perform elementary row operations on

[B, Ir ]

so that we have the form


[Ir , M ],
and the resulting matrix M is B −1 .
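This recipe is easy to express in code. Here is a small numpy sketch of my own (with partial pivoting added, which the text's description does not require but which helps numerically):

import numpy as np

def inverse_by_row_reduction(B):
    # Compute the inverse of B by elementary row operations on [B, I_r].
    r = B.shape[0]
    M = np.hstack([B.astype(float), np.eye(r)])
    for col in range(r):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # pick a usable pivot row
        M[[col, pivot]] = M[[pivot, col]]               # interchange a pair of rows
        M[col] /= M[col, col]                           # scale the pivot row
        for row in range(r):
            if row != col:
                M[row] -= M[row, col] * M[col]          # zero out the rest of the column
    return M[:, r:]                                     # the right block is now the inverse of B

B = np.array([[2.0, 1.0], [1.0, 3.0]])
print(np.allclose(inverse_by_row_reduction(B), np.linalg.inv(B)))   # True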
The Sherman-Morrison formula is a useful way to relate the inverse of a matrix to the inverse of a rank-1 change to the matrix:

(B + uv')⁻¹ = B⁻¹ − (B⁻¹ u v' B⁻¹) / (1 + v' B⁻¹ u) ,

where the r × r matrix B is invertible, u, v ∈ Rr , and it must be assumed that 1 + v'B⁻¹u ≠ 0 , for otherwise B + uv' would not be invertible.
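The formula is easy to sanity-check numerically. A small sketch of my own, with random data (B is shifted to be safely invertible):

import numpy as np

rng = np.random.default_rng(0)
r = 5
B = rng.standard_normal((r, r)) + r * np.eye(r)     # invertible, well conditioned
u = rng.standard_normal(r)
v = rng.standard_normal(r)

Binv = np.linalg.inv(B)
denom = 1.0 + v @ Binv @ u                          # must be nonzero for the formula to apply
sm = Binv - np.outer(Binv @ u, v @ Binv) / denom    # Sherman-Morrison update
print(np.allclose(sm, np.linalg.inv(B + np.outer(u, v))))   # True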
Next, we define the determinant of a square r × r matrix B , which we denote det(B) . We define the determinant in a non-standard but useful manner, via a recursive formula known as Laplace expansion.3
If r = 1 , then B = (b11) , and we define det(B) := b11 . For r > 1 , choose any fixed column j of B , and we define

det(B) := ∑_{i=1}^{r} (−1)^{i+j} bij det(B^{ij}) ,

where B^{ij} is the (r − 1) × (r − 1) matrix obtained by deleting row i and column j of B . It is a fact that this is well defined — that is, the value of det(B) does not depend on the choice of j (taken at each step of the recursion). Moreover, we have det(B') = det(B) , so we could just as well choose any fixed row i of B , and we have

det(B) = ∑_{j=1}^{r} (−1)^{i+j} bij det(B^{ij}) ,

resulting in the same value for det(B) .
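For illustration, here is a direct (and deliberately naive) transcription of the recursion into Python, a sketch of my own; it does factorial work, so it is only sensible for tiny matrices:

import numpy as np

def laplace_det(B):
    # Determinant by Laplace expansion along the first column (recursive).
    B = np.asarray(B, dtype=float)
    r = B.shape[0]
    if r == 1:
        return B[0, 0]
    total = 0.0
    for i in range(r):
        minor = np.delete(np.delete(B, i, axis=0), 0, axis=1)   # delete row i and column 0
        total += (-1) ** i * B[i, 0] * laplace_det(minor)
    return total

B = np.array([[1.0, 2.0], [3.0, 4.0]])
print(laplace_det(B), np.linalg.det(B))   # both are -2 (up to rounding)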


An interesting observation links det(B) with elementary row operations. Consider
performing elementary row operations on

[B, Ir ]

to obtain
[Ir , B −1 ] .
As we carry out the elementary row operations, we sometimes multiply a row by a
non-zero scalar. If we accumulate the product of all of these multipliers, the result is
det(B −1 ) ; equivalently, the reciprocal is det(B) .
Finally, for an invertible r × r matrix B and a vector b , we can express the unique solution x̄ of the system Bx = b via a formula involving determinants. Cramer’s rule is the following formula:

x̄j = det(B(j)) / det(B) , for j = 1, 2, . . . , r ,
where B(j) is defined to be the matrix B with its j-th column replaced by b . It is worth
emphasizing that direct application of Cramer’s rule is not to be thought of as a useful
algorithm for computing the solution of a system of equations. But it can be very useful
to have in the proof toolbox.4
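Still, Cramer’s rule is easy to check numerically; a small sketch of my own:

import numpy as np

rng = np.random.default_rng(1)
r = 4
B = rng.standard_normal((r, r)) + r * np.eye(r)   # invertible with high probability
b = rng.standard_normal(r)

x = np.empty(r)
detB = np.linalg.det(B)
for j in range(r):
    Bj = B.copy()
    Bj[:, j] = b                                  # B(j): column j replaced by b
    x[j] = np.linalg.det(Bj) / detB
print(np.allclose(x, np.linalg.solve(B, b)))      # True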

1.4 Exercises
Exercise 1.0 (Learn LaTeX)
Learn to use LaTeX for writing all of your homework solutions. Personally, I use MiKTeX, which is an implementation of LaTeX for Windows. Specifically, within MiKTeX, I am using pdfLaTeX (it only matters for certain things like including graphics and also pdf into a document). I find it convenient to use the editor WinEdt, which is very LaTeX friendly. A good book on LaTeX is

In Appendix A.1 there is a template to get started. Also, there are plenty of tutorials
and beginner’s guides on the web.

Exercise 1.1 (Convert to standard form)


Give an original example (i.e., with actual numbers) to demonstrate that you know
how to transform a general linear-optimization problem to one in standard form.

Exercise 1.2 (Weak Duality example)


Give an original example to demonstrate the Weak Duality Theorem.

Exercise 1.3 (Convert to ≤ form)


Describe a general recipe for transforming an arbitrary linear-optimization problem
into one in which all of the linear constraints are of ≤ type.

Exercise 1.4 (m + 1 inequalities)


Prove that the system of m equations in n variables Ax = b is equivalent to the system
Ax ≤ b augmented by only one additional linear inequality — that is, a total of only
m + 1 inequalities.

Exercise 1.5 (Weak duality for another form)


Give and prove a Weak Duality Theorem for

max c'x
Ax ≤ b ;      (P′)
x ≥ 0 .

HINT: Convert (P′) to a standard-form problem, and then apply the ordinary Weak Duality Theorem for standard-form problems.

Exercise 1.6 (Weak duality for a complicated form)


Give and prove a Weak Duality Theorem for

min c'x + f'w
Ax + Bw ≤ b ;
Dx = g ;      (P′)
x ≥ 0 , w ≤ 0 .

HINT: Convert (P′) to a standard-form problem, and then apply the ordinary Weak Duality Theorem for standard-form problems.

Exercise 1.7 (Weak duality for a complicated matrix form — with Python/Gurobi)
Python is an interpreted, general-purpose programming language. Anaconda is a free and open-source distribution of Python (and R). Via the Anaconda distribution, one also gets Jupyter Notebook, which is a convenient way to experiment with Python. Gurobi is a state-of-the-art commercial linear and integer-linear optimization software, with free temporary licensing for students. Gurobi can be easily accessed with gurobipy, a Python module. The Jupyter notebook MatrixLP.ipynb (see Appendix A.2) sets up and solves an instance of (P′) from Exercise 1.6. Run the code to see how it works. Now, extend the code to solve the dual of (P′). Also, after converting (P′) to standard form (as indicated in the HINT for Exercise 1.6), use Python/Gurobi to solve that problem and its dual. Make sure that you get the same optimal value for all of these problems.
Chapter 2

Modeling

Our goals in this chapter are as follows:

• Learn some basic linear-optimization modeling techniques.

• Learn how to use Python as an LP modeling language in connection with Gurobi as an LP solver.

2.1 A Production Problem


We suppose that a company has m resources, available in quantities bi , i = 1, 2, . . . , m , and n production activities, with per-unit profits cj , j = 1, 2, . . . , n . Each unit of activity j consumes aij units of resource i . Each production activity can be carried out at any non-negative level, as long as the resource availabilities are respected. We assume that any unused resource quantities have no value and can be disposed of at no cost. The problem is to find a profit-maximizing production plan. We can formulate this problem as the linear-optimization problem

max c'x
Ax ≤ b ;      (P)
x ≥ 0 ,

where b := (b1 , b2 , . . . , bm)' , c := (c1 , c2 , . . . , cn)' , A ∈ Rm×n is the matrix of aij , and x is a vector of variables in Rn .
From the very same data, we can formulate a related linear-optimization problem. The goal now is to set per-unit prices yi , for the resources i = 1, 2, . . . , m . The total cost of purchasing the resources from the company is then y'b , and we wish to minimize the total cost of obtaining the resources from the company. We want to set these prices in such a way that the company would never have an incentive to carry out any of the production activities versus simply selling the associated resources at these prices. That is, we require that ∑_{i=1}^{m} yi aij ≥ cj , for j = 1, 2, . . . , n . Because of our assumption that the company can dispose of any unused quantities of resources at no cost, we have yi ≥ 0 , for i = 1, 2, . . . , m . All in all, we have the linear-optimization problem

min y'b
y'A ≥ c' ;      (D)
y ≥ 0 .

Comparing this pair of linear-optimization problems with what you discovered in Exercise 1.5, we see that a Weak Duality Theorem holds: that is, the profit of any feasible production plan is bounded above by the cost of the resources determined by any set of prices that would render all production activities non-profitable.

2.2 Norm Minimization



“Norms” are very useful as a measure of the “size” of a vector. In some applications, we are interested in making the “size” small. There are many different “norms” (for example, the Euclidean norm), but two are particularly interesting for linear optimization.
For x ∈ Rn , the ∞-norm (or max-norm) of x is defined as

‖x‖∞ := max{ |xj| : j = 1, 2, . . . , n } .

We would like to formulate the problem of finding an ∞-norm minimizing solution of the system of equations Ax = b . This is quite easy, via the linear-optimization problem:

min t
t − xi ≥ 0 , i = 1, 2, . . . , n ;
t + xi ≥ 0 , i = 1, 2, . . . , n ;
Ax = b ,

where t ∈ R is an auxiliary variable. Notice how the minimization “pressure” ensures that an optimal solution (x̂, t̂) has t̂ = max{ |x̂j| : j = 1, 2, . . . , n } = ‖x̂‖∞ . This would not work for maximization!
The 1-norm of x is defined as

‖x‖1 := ∑_{j=1}^{n} |xj| .

Now, we would like to formulate the problem of finding a 1-norm minimizing solution of the system of equations Ax = b . This is quite easy, via the linear-optimization problem:

min ∑_{j=1}^{n} tj
tj − xj ≥ 0 , j = 1, 2, . . . , n ;
tj + xj ≥ 0 , j = 1, 2, . . . , n ;
Ax = b ,

where t ∈ Rn is a vector of n auxiliary variables. Notice how the minimization “pressure” ensures that an optimal solution (x̂, t̂) has t̂j = |x̂j| , for j = 1, 2, . . . , n (again, this would not work for maximization!), and so we will have ∑_{j=1}^{n} t̂j = ‖x̂‖1 .
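As a sketch of how the 1-norm model might be written in Python/Gurobi (my own illustration with made-up data, not one of the book's notebooks):

import numpy as np
import gurobipy as gp
from gurobipy import GRB

# Made-up data for Ax = b.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([2.0, 1.0])
m, n = A.shape

model = gp.Model()
x = model.addVars(n, lb=-GRB.INFINITY)           # x is unrestricted in sign
t = model.addVars(n)                             # auxiliary variables, nonnegative by default
model.setObjective(gp.quicksum(t[j] for j in range(n)), GRB.MINIMIZE)
model.addConstrs(t[j] - x[j] >= 0 for j in range(n))
model.addConstrs(t[j] + x[j] >= 0 for j in range(n))
model.addConstrs(gp.quicksum(A[i, j] * x[j] for j in range(n)) == b[i] for i in range(m))
model.optimize()
print([x[j].X for j in range(n)], model.ObjVal)  # a 1-norm minimizing solution and its 1-norm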

2.3 Network Flow



A finite network G is described by a finite set of nodes N and a finite set A of arcs.
Each arc e has two key attributes, namely its tail t(e) ∈ N and its head h(e) ∈ N . We
think of K ≥ 1 commodities as being allowed to “flow” along each arc, from its tail to
its head. Indeed, we have “flow” variables

xke := amount of flow of commodity k on arc e ,

for e ∈ A, and k = 1, 2, . . . , K . Formally, a flow x̂ on G is simply an assignment of any


real numbers x̂ke to the variables xke , for e ∈ A, and k = 1, 2, . . . , K . We assume that
the total flow on arc e should not exceed

ue := the flow upper bound on arc e ,

for e ∈ A . Associated with each arc e and commodity k is a cost

cke := cost per-unit-flow of commodity k on arc e ,

for e ∈ A, and k = 1, 2, . . . , K . The (total) cost of the flow x̂ is defined to be


∑_{k=1}^{K} ∑_{e∈A} cke xke .

We assume that we have further data for the nodes. Namely,

bkv := the net supply of commodity k at node v ,

for v ∈ N . A flow is conservative if the net flow out of node v , minus the net flow into
node v , is equal to the net supply at node v , for all nodes v ∈ N , and all commodities
k = 1, 2, . . . , K .
The multi-commodity min-cost network-flow problem is to find a minimum-cost
conservative flow that is non-negative and respects the flow upper bounds on the arcs.
We can formulate this as follows:
min ∑_{k=1}^{K} ∑_{e∈A} cke xke

∑_{e∈A : t(e)=v} xke − ∑_{e∈A : h(e)=v} xke = bkv , ∀ v ∈ N , k = 1, 2, . . . , K ;

∑_{k=1}^{K} xke ≤ ue , ∀ e ∈ A ;

xke ≥ 0 , ∀ e ∈ A , k = 1, 2, . . . , K .

2.4 Modeling in Software


Optimization modeling languages facilitate rapid development of mathematical optimization models, instantiation with data, and the subsequent solution by solvers. Well-known examples of optimization modeling languages are AMPL and GAMS. Another is Pyomo, which is a Python package. All of these are means to set up structured LP models, instantiate them with data, pass to an LP solver, and recover the solutions (with the opportunity to manipulate them) back in their environments. All have the ability to iterate, solving sequences of LPs, dynamically setting them up at each iteration. In fact, we will not use any of them here, but will instead work directly in Python, making direct calls to Gurobi, a state-of-the-art LP solver1. A strong advantage of working in Python is that it is a well-supported programming language with lots of useful add-on packages.

In Exercise 1.7, we saw how to set up “matrix-style” optimization models, instantiate them with data, and solve them. For models that relate to applications, it is often more natural and convenient to specify models in a way that does not obscure the problem being solved and is close to the way that we would naturally write the model mathematically. We will do this in Python, making direct calls to Gurobi. As a first step in this direction, we consider the Production problem of Section 2.1. For this problem, we specify the model in Python/Gurobi as follows.
First, it is convenient to number the resources as M := {0, 1, . . . , m − 1} and the variables as N := {0, 1, . . . , n − 1}. We do this in Python via:
M=list(range(0,m))
N=list(range(0,n))
We instantiate a Gurobi Model object via
model = gp.Model()
Note that model is the name that we have given Gurobi Model object in Python.
We create (continuous nonnegative) variables x[j], for j ∈ N, attached to model, via:
x = model.addMVar(n)
These variable names x[j] are accessible to us in Python and are not used internally by Gurobi.
We define and attach our objective function, revenueobjective, to model via:
1 Another is CPLEX.

revenueobjective = model.setObjective(sum(c[j]*x[j] for j in N), GRB.MAXIMIZE)

This objective name is accessible to us in Python and is not used internally by Gurobi.
Finally, we define our resource constraints and attach them to model via:

for i in M:
model.addConstr(sum(A[i,j]*x[j] for j in N) <= b[i], name='r'+str(i))

Note that we have created names, ri, for i ∈ M, for the constraints inside Gurobi. This enables us to easily retrieve constraint “attributes” from Gurobi. A Jupyter notebook giving the full Python/Gurobi implementation is Production.ipynb (see Appendix A.3).
Next, we consider the Network Flow problem of Section 2.3. The model is specified as:

x = model.addVars(ArcsCrossCommods)
model.setObjective(sum(sum(CapacityCosts[i,j][k]*x[(i,j),k] for (i,j) in Arcs)
for k in Commods), GRB.MINIMIZE)
model.addConstrs(sum(x[(i,j),k] for k in Commods) <= CapacityCosts[i,j][0]
for (i,j) in Arcs)
model.addConstrs(
(sum(x[(i, j),k] for j in Nodes if (i, j) in Arcs) - sum(x[(j, i),k]
for j in Nodes if (j,i) in Arcs)
== Supplies[i][k-1] for i in Nodes for k in Commods))

A Python/Gurobi implementation is in the Jupyter notebook Multi-commodityFlow.ipynb


(see Appendix A.4).

Example 2.1
Figure 2.1 depicts an 8-node network for a K = 2 commodity example. Each arc e is labeled [ue , c1e , c2e ]. Figures 2.2 and 2.3 depict the node supply data and the optimal solutions. Figure 2.2 corresponds to commodity 1 and Figure 2.3 corresponds to commodity 2. Node v is labeled v : bkv and arc e is labeled with the optimal value of xke .

Figure 2.1: Arc data

Figure 2.2: Commodity 1: supplies and flows



Figure 2.3: Commodity 2: supplies and flows

2.5 Exercises
Exercise 2.1 (Dual in Python/Gurobi)
Without changing the data specification in Production.ipynb (see Appendix A.3), use Python/Gurobi
to solve the dual of the Production Problem example, as described in Section 2.1. You will need
to modify the model in Production.ipynb appropriately.

Exercise 2.2 (Sparse solution for linear equations)


In some application areas, it is interesting to find a “sparse solution” — that is, one with few non-zeros — to a system of equations Ax = b, on say the domain −1 ≤ xj ≤ +1, for j = 1, 2, . . . , n. It is empirically well known that a 1-norm minimizing solution is a good heuristic for finding a sparse solution. The moral justification of this is as follows. We define the indicator function I≠0 : R → R by

I≠0(w) := 1 , if w ≠ 0 ;   I≠0(w) := 0 , if w = 0 .

It is easy to see (make a graph) that f(w) := |w| is the “best convex function under-estimator” of I≠0 on the domain [−1, 1]. So we can hope that minimizing ∑_{j=1}^{n} |xj| comes close to minimizing ∑_{j=1}^{n} I≠0(xj) .
Using Python/Gurobi try this idea out on several large examples, using 1-norm minimization as a heuristic for finding a sparse solution.
HINT: To get an interesting example, try generating a random m × n matrix A of zeros and ones, perhaps m = 50 equations and n = 500 variables, maybe with probability 1/2 of an entry being equal to one. Next, choose a random z̃ ∈ Rn satisfying −1 ≤ z̃j ≤ +1, for j = 1, 2, . . . , m/2, and z̃j = 0 for j = m/2 + 1, . . . , n. Now let b := Az̃. In this way, you will know that there is a solution (i.e., z̃) with only m/2 non-zeros (which is already pretty sparse). Your 1-norm minimizing solution might in fact recover this solution, or it may be sparser, or perhaps less sparse.

Exercise 2.3 (Bloody network)


A transportation problem is a special kind of (single-commodity min-cost) network-flow prob-
lem. There are certain nodes v called supply nodes which have net supply bv > 0. The other
nodes v are called demand nodes, and they have net supply bv < 0. There are no nodes with
bv = 0 , and all arcs point from supply nodes to demand nodes.
A simplified example is for matching available supply and demand of blood, in types A, B,
AB and O . Suppose that we have sv units of blood available, in types v ∈ {A, B, AB, O} . Also,
we have requirements dv by patients of different types v ∈ {A, B, AB, O} . It is very important
to understand that a patient of a certain type can accept blood not just from their own type.
Do some research to find out the compatible blood types for a patient; don’t make a mistake —
lives depend on this! In this spirit, if your model allocates any blood in an incompatible fashion, you
will receive a grade of F on this problem.
Describe a linear-optimization problem that satisfies all of the patient demand with com-
patible blood. You will find that type O is the most versatile blood, then both A and B, followed
by AB. Factor in this point when you formulate your objective function, with the idea of having
the left-over supply of blood being as versatile as possible.
Using Multi-commodityFlow.ipynb (see Appendix A.4) with a single commodity only; that is,
K = 1, set up and solve an example of a blood-distribution problem.

Exercise 2.4 (Mix it up)


“I might sing a gospel song in Arabic or do something in Hebrew. I want to mix it up and
do it differently than one might imagine.” — Stevie Wonder
We are given a set of ingredients 1, 2, . . . , m with availabilities bi , measured in kilograms,
and per kilogram costs ci . We are given a set of products 1, 2, . . . , n with minimum production
requirements dj , measured in kilograms, and per kilogram revenues ej . It is required that
product j have at least a fraction (by weight) of lij of ingredient i and at most a fraction (by
weight) of uij of ingredient i . The goal is to devise a plan to maximize net profit.
Formulate, mathematically, as a linear-optimization problem. Then, model with Python/Gurobi,
make up some data, try some computations, and report on your results.

Exercise 2.5 (Task scheduling)

We are given a set of tasks, numbered 1, 2, . . . , n that should be completed in the minimum
amount of time. For convenience, task 0 is a “start task” and task n+1 is an “end task”. Each task,
except for the start and end task, has a known duration di . For convenience, let d0 := 0 . Any
number of tasks can be carried out simultaneously, except that there are precedences between
tasks. Specifically, Ψi is the set of tasks that must be completed before task i can be started. Let
t0 := 0 , and for all other tasks i , let ti be a decision variable representing its start time.
Formulate the problem, mathematically, as a linear-optimization problem. The objective
should be to minimize the start time tn+1 of the end task. Then, model the problem with
Python/Gurobi, make up some data, try some computations, and report on your results.

Exercise 2.6 (Investing wisely)


Almost certainly, Albert Einstein did not say that “compound interest is the most powerful force
in the universe.”
A company wants to maximize their cash holdings at the end of T time periods. They have
an external inflow of pt dollars at the start of time period t , for t = 1, 2, . . . , T . At the start
of each time period, available cash can be allocated to any of K different investment vehicles
(in any available non-negative amounts). Money allocated to investment-vehicle k at the start
of period t must be held in that investment k for all remaining time periods, and it generates
income vt,tk
, vt,t+1
k
, . . . , vt,T
k
, per dollar invested. It should be assumed that money obtained
from cashing out the investment is incorporated into these parameters. For example, (v4,4 9
, v4,5
9
,
v4,6 , v4,7 , v4,8 , v4,9 , v4,10 , v4,11 , v4,12 ) = (0.1, 0.1, 0.1, 1.1, 0, 0, 0, 0, 0) can be interpreted as 1
9 9 9 9 9 9 9

dollar invested in investment vehicle #9 at the start of time period 4 yields 0.1 dollars of income
for times periods 4–7, and with the original dollar returned in time period 7, and no returns at
all in the remaining time periods 8–12.
Note that at the start of time period t , the cash available is the external inflow of pt , plus
cash accumulated from all investment vehicles in prior periods that was not reinvested. Finally,
assume that cash held over in any time period earns interest of q percent.
Formulate the problem, mathematically, as a linear-optimization problem. Then, model the
problem with Python/Gurobi, make up some data, try some computations, and report on your
results.
Chapter 3

Algebra Versus Geometry

Our goals in this chapter are as follows:


• Develop the algebra needed later for our algorithms.
• Develop some geometric understanding of this algebra.

Throughout, we refer to the standard-form problem

min c0 x
Ax = b ; (P)
x ≥ 0.

3.1 Basic Feasible Solutions and Extreme Points


A basic partition of A ∈ Rm×n is a partition of {1, 2, . . . , n} into a pair of ordered sets, the
basis β = (β1 , β2 , . . . , βm ) and the non-basis η = (η1 , η2 , . . . , ηn−m ), so that the basis matrix


Aβ := [Aβ1 , Aβ2 , . . . , Aβm ] is an invertible m × m matrix. The connection with the standard
“linear-algebra basis” is that the columns of Aβ form a “linear-algebra basis” for Rm . But for
us, “basis” almost always refers to β.
We associate a basic solution x̄ ∈ Rn with the basic partition via:

x̄η := 0 ∈ Rn−m ;
x̄β := Aβ⁻¹ b ∈ Rm .

We can observe that x̄β = Aβ⁻¹ b is equivalent to Aβ x̄β = b , which is the unique way to write b as a linear combination of the columns of Aβ . Of course this makes sense, because the columns of Aβ form a “linear-algebra basis” for Rm .
Note that every basic solution x̄ satisfies Ax̄ = b , because

Ax̄ = ∑_{j=1}^{n} Aj x̄j = ∑_{j∈β} Aj x̄j + ∑_{j∈η} Aj x̄j = Aβ x̄β + Aη x̄η = Aβ (Aβ⁻¹ b) + Aη 0 = b .

A basic solution x̄ is a basic feasible solution if it is feasible for (P). That is, if x̄β = Aβ⁻¹ b ≥ 0 .
It is instructive to have a geometry for understanding the algebra of basic solutions, but for standard-form problems, it is hard to draw something interesting in two dimensions. Instead, we observe that the feasible region of (P) is the solution set, in Rn , of

xβ + Aβ⁻¹ Aη xη = Aβ⁻¹ b ;
xβ ≥ 0 , xη ≥ 0 .

Projecting this onto the space of non-basic variables xη ∈ Rn−m , we obtain

Aβ⁻¹ Aη xη ≤ Aβ⁻¹ b ;
xη ≥ 0 .

Notice how we can view the xβ variables as slack variables.


In the following example, because it is convenient in Python, we use “zero indexing”. In
particular, we use indices {0, 1, . . . , n − 1} for the variables xj and the columns Aj , and for the
basic partition we label β := (β0 , β1 , . . . , βm−1 ) and η := (η0 , η1 , . . . , ηn−m−1 ).

Example 3.1
For this system, it is convenient to draw pictures when n − m = 2 , for example n = 6 and
m = 4 . In such a picture, the basic solution x̄ ∈ Rn maps to the origin x̄η = 0 ∈ Rn−m , but
other basic solutions (feasible and not) will map to other points.
Suppose that we have the data:

A := [ 1    2    1  0  0  0
       3    1    0  1  0  0
       3/2  3/2  0  0  1  0
       0    1    0  0  0  1 ] ,

b := (7, 9, 6, 33/10)' ,
β := (β0 , β1 , β2 , β3) = (0, 1, 3, 5) ,
η := (η0 , η1) = (2, 4) .

Then

Aβ = [Aβ0 , Aβ1 , Aβ2 , Aβ3] = [ 1    2    0  0
                                 3    1    1  0
                                 3/2  3/2  0  0
                                 0    1    0  1 ] ,

Aη = [Aη0 , Aη1] = [ 1  0
                     0  0
                     0  1
                     0  0 ] ,

xβ = (x0 , x1 , x3 , x5)' ,
xη := (x2 , x4)' .

We can calculate

Aβ⁻¹ Aη = [ −1    4/3
             1   −2/3
             2  −10/3
            −1    2/3 ] ,

Aβ⁻¹ b := (1, 3, 3, 3/10)' ,
and then we have plotted this in Figure 3.1. The plot has xη0 = x2 as the abscissa, and xη1 = x4 as the ordinate. In the plot, besides the non-negativity of the variables x2 and x4 , the four inequalities of Aβ⁻¹ Aη xη ≤ Aβ⁻¹ b are labeled with their slack variables — these are the basic variables x0 , x1 , x3 , x5 . The correct matching of the basic variables to the inequalities of Aβ⁻¹ Aη xη ≤ Aβ⁻¹ b is simply achieved by seeing that the i-th inequality has slack variable xβi .
The feasible region is colored cyan, while basic feasible solutions project to green points and basic infeasible solutions project to red points. We can see that the basic solution associated with the current basis is feasible, because the origin (corresponding to the non-basic variables being set to 0) is feasible.
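The quantities in Example 3.1 are easy to reproduce computationally. A small numpy sketch of my own (not one of the book's notebooks):

import numpy as np

A = np.array([[1, 2, 1, 0, 0, 0],
              [3, 1, 0, 1, 0, 0],
              [1.5, 1.5, 0, 0, 1, 0],
              [0, 1, 0, 0, 0, 1]], dtype=float)
b = np.array([7, 9, 6, 3.3])
beta, eta = [0, 1, 3, 5], [2, 4]     # basic and non-basic index lists

A_beta = A[:, beta]
A_eta = A[:, eta]
x_bar_beta = np.linalg.solve(A_beta, b)        # equals (1, 3, 3, 0.3), as in the example
A_bar_eta = np.linalg.solve(A_beta, A_eta)     # the matrix plotted against in Figure 3.1

print(x_bar_beta)
print(A_bar_eta)
print(np.all(x_bar_beta >= 0))                 # True: this basic solution is feasible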

A set S ⊂ Rn is a convex set if it contains the entire line segment between every pair of
points in S . That is,
λx1 + (1 − λ)x2 ∈ S , whenever x1 , x2 ∈ S and 0 < λ < 1 .
It is simple to check that the feasible region of every linear-optimization problem is a convex set
— do it!
For a convex set S ⊂ Rn , a point x̂ ∈ S is an extreme point of S if it is not on the interior of
any line segment wholly contained in S . That is, if we cannot write
x̂ = λx1 + (1 − λ)x2 , with x1 ≠ x2 ∈ S and 0 < λ < 1 .

Theorem 3.2
Every basic feasible solution of standard-form (P) is an extreme point of its feasible region.

Proof. Consider the basic feasible solution x̄ with

x̄η := 0 ∈ Rn−m ;
x̄β := Aβ⁻¹ b ∈ Rm .

Figure 3.1: Feasible region projected into the space of non-basic variables

If

x̄ = λx1 + (1 − λ)x2 , with x1 and x2 feasible for (P) and 0 < λ < 1 ,

then 0 = x̄η = λx1η + (1 − λ)x2η and 0 < λ < 1 implies that x1η = x2η = 0 . But then Aβ xiβ = b implies that xiβ = Aβ⁻¹ b = x̄β , for i = 1, 2 . Hence x̄ = x1 = x2 (but we needed x1 ≠ x2 ), and so we cannot find a line segment containing x̄ that is wholly contained in S . □

Theorem 3.3
Every extreme point of the feasible region of standard-form (P) is a basic solution.

Proof. Let x̂ be an extreme point of the feasible region of (P). We define

ρ := {j ∈ {1, 2, . . . , n} : x̂j > 0} .

That is, ρ is the list of indices for the positive variables of x̂ . Also, we let

ζ := {j ∈ {1, 2, . . . , n} : x̂j = 0} .

That is, ζ is the list of indices for the zero variables of x̂ . Together, ρ and ζ partition {1, 2, . . . , n} .
Our goal is to construct a basic partition, β, η , so that the associated basic solution is precisely x̂ .
The first thing that we will establish is that the columns of Aρ are linearly independent. We will do that by contradiction. Suppose that they are linearly dependent. That is, there exists zρ ∈ R|ρ| different from the zero vector, such that Aρ zρ = 0 . Next we extend zρ to a vector z ∈ Rn , by letting zζ = 0 . Clearly Az = Aρ zρ + Aζ zζ = 0 + Aζ 0 = 0 ; that is, z is in the null space of A . Next, let

x1 := x̂ + εz

and

x2 := x̂ − εz ,

with ε chosen to be sufficiently small so that x1 and x2 are non-negative. Because z is only non-zero on the ρ coordinates (where x̂ is positive), we can choose an appropriate ε . Notice that x1 ≠ x2 , because zρ and hence z is not the zero vector. Now, it is easy to verify that Ax1 = A(x̂ + εz) = Ax̂ + εAz = b + 0 = b and similarly Ax2 = b . Therefore, x1 and x2 are feasible solutions of (P). Also, ½ x1 + ½ x2 = ½ (x̂ + εz) + ½ (x̂ − εz) = x̂ . So x̂ is on the interior (actually it is the midpoint) of the line segment between x1 and x2 , in contradiction to x̂ being an extreme point of the feasible region of (P). Therefore, it must be that the columns of Aρ are linearly independent.
In particular, we can conclude that |ρ| ≤ m , since we assume that A ∈ Rm×n has full row rank. If |ρ| < m , we choose (via Theorem 1.2) m − |ρ| columns of Aζ to append to Aρ in such a way as to form a matrix Aβ having m linearly-independent columns — we note that such a choice is not unique. As usual, we let η be a list of the n − m indices not in β . By definition, the associated basic solution x̄ has x̄η = 0 , and we observe that it is the unique solution to the system of equations Ax = b having xη = 0 . But x̂η = 0 because x̂η is a subvector of x̂ζ = 0 . Therefore, x̂ = x̄ . That is, x̂ is a basic solution of (P). □

Taken together, these last two results give us the main result of this section.

Corollary 3.4
For a feasible point x̂ of standard-form (P), x̂ is extreme if and only if x̂ is a basic solution.

3.2 Basic Feasible Directions



For a point x̂ in a convex set S ⊂ Rn , a feasible direction relative to the feasible solution x̂ is a ẑ ∈ Rn such that x̂ + εẑ ∈ S , for sufficiently small positive ε ∈ R . Focusing now on the standard-form problem (P), for ẑ to be a feasible direction relative to the feasible solution x̂ , we need A(x̂ + εẑ) = b . But

b = A(x̂ + εẑ) = Ax̂ + εAẑ = b + εAẑ ,

so we need Aẑ = 0 . That is, ẑ must be in the null space of A .
Focusing on the standard-form problem (P), we associate a basic direction z̄ ∈ Rn with the basic partition β, η and a choice of non-basic index ηj via

z̄η := ej ∈ Rn−m ;
z̄β := −Aβ⁻¹ Aηj ∈ Rm .

Note that every basic direction z̄ is in the null space of A :

Az̄ = Aβ z̄β + Aη z̄η = Aβ (−Aβ⁻¹ Aηj) + Aη ej = −Aηj + Aηj = 0 .

So

A(x̂ + εz̄) = b ,

for every feasible x̂ and every ε ∈ R . Moving a positive amount in the direction z̄ corresponds to increasing the value of xηj , holding the values of all other non-basic variables constant, and making appropriate changes in the basic variables so as to maintain satisfaction of the equation system Ax = b .
There is a related point worth making. We have just seen that for a given basic partition β, η , each of the n − m basic directions is in the null space of A — there is one such basic direction for each of the n − m choices of ηj . It is very easy to check that these basic directions are linearly independent — just observe that they are columns of the n × (n − m) matrix

[     I
  −Aβ⁻¹ Aη ] .

Because the dimension of the null space of A is n − m , these n − m basic directions form a basis for the null space of A .
Now, we focus on the basic feasible solution x̄ determined by the basic partition β, η . The basic direction z̄ is a basic feasible direction relative to the basic feasible solution x̄ if x̄ + εz̄ is feasible, for sufficiently small positive ε ∈ R . That is, if

Aβ⁻¹ b − ε Aβ⁻¹ Aηj ≥ 0 ,

for sufficiently small positive ε ∈ R .
Recall that x̄β = Aβ⁻¹ b , and let Āηj := Aβ⁻¹ Aηj . So, we need that

x̄β − ε Āηj ≥ 0 ,

for sufficiently small positive ε ∈ R . That is,

x̄βi − ε āi,ηj ≥ 0 ,

for i = 1, 2, . . . , m . If āi,ηj ≤ 0 , for some i , then this imposes no restriction at all on ε . So, the only condition that we need for z̄ to be a basic feasible direction relative to the basic feasible solution x̄ is that there exists ε > 0 satisfying

ε ≤ x̄βi / āi,ηj , for all i such that āi,ηj > 0 .

Equivalently, we simply need that


x̄βi > 0 , for all i such that āi,ηj > 0 .
So, we have the following result:

Theorem 3.5
For a standard-form problem (P), suppose that x̄ is a basic feasible solution relative to the basic
partition β, η . Consider choosing a non-basic index ηj . Then the associated basic direction z̄
is a feasible direction relative to x̄ if and only if

x̄βi > 0 , for all i such that āi,ηj > 0 .
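Continuing the small numpy sketch from Example 3.1 (again my own illustration), we can form the basic direction for a chosen non-basic index and test the condition of Theorem 3.5:

import numpy as np

A = np.array([[1, 2, 1, 0, 0, 0],
              [3, 1, 0, 1, 0, 0],
              [1.5, 1.5, 0, 0, 1, 0],
              [0, 1, 0, 0, 0, 1]], dtype=float)
b = np.array([7, 9, 6, 3.3])
beta, eta = [0, 1, 3, 5], [2, 4]

A_beta = A[:, beta]
x_bar_beta = np.linalg.solve(A_beta, b)

j = 0                                            # choose non-basic index eta_j = eta[0] = 2
A_bar_etaj = np.linalg.solve(A_beta, A[:, eta[j]])

z_bar = np.zeros(A.shape[1])
z_bar[eta[j]] = 1.0                              # the non-basic part of z_bar is e_j
z_bar[beta] = -A_bar_etaj                        # the basic part is minus A_beta^{-1} A_{eta_j}

print(np.allclose(A @ z_bar, 0))                 # basic directions lie in the null space of A
print(np.all(x_bar_beta[A_bar_etaj > 0] > 0))    # Theorem 3.5's feasibility condition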

3.3 Basic Feasible Rays and Extreme Rays


For a non-empty convex set S ⊂ Rn , a ray of S is a ẑ ≠ 0 in Rn such that x̂ + τẑ ∈ S , for all x̂ ∈ S and all positive τ ∈ R .
Focusing on the standard-form problem (P), it is easy to see that ẑ ≠ 0 is a ray of the feasible region if and only if Aẑ = 0 and ẑ ≥ 0 .
Recall from Section 3.2 that for a standard-form problem (P), a basic direction z̄ ∈ Rn is associated with the basic partition β, η and a choice of non-basic index ηj via

z̄η := ej ∈ Rn−m ;
z̄β := −Aβ⁻¹ Aηj ∈ Rm .

If the basic direction z̄ is a ray, then we call it a basic feasible ray. We have already seen that Az̄ = 0 . Furthermore, z̄ ≥ 0 if and only if Āηj := Aβ⁻¹ Aηj ≤ 0 .
Therefore, we have the following result:

Theorem 3.6
The basic direction z̄ is a ray of the feasible region of (P) if and only if Āηj ≤ 0 .

Recall, further, that z̄ is a basic feasible direction relative to the basic feasible solution x̄ if x̄ + εz̄ is feasible, for sufficiently small positive ε ∈ R . Therefore, if z̄ is a basic feasible ray, relative to the basic partition β, η , and x̄ is the basic feasible solution relative to the same basic partition, then z̄ is a basic feasible direction relative to x̄ .
A ray ẑ of a convex set S is an extreme ray if we cannot write

ẑ = z1 + z2 , with z1 ≠ µz2 being rays of S and µ ≠ 0 .
Similarly to the correspondence between basic feasible solutions and extreme points for standard-
form problems, we have the following two results.

Theorem 3.7
Every basic feasible ray of standard-form (P) is an extreme ray of its feasible region.

Theorem 3.8
Every extreme ray of the feasible region of standard-form (P) is a positive multiple of a basic
feasible ray.

3.4 Exercises
Exercise 3.1 (Illustrate algebraic and geometric concepts)
Using the Jupyter notebook pivot_tools.ipynb (see Appendix A.6), make a small example,
say with six variables and four equations, to fully illustrate the concepts in this chapter. The
Jupyter notebook pivot_example.ipynb (see Appendix A.5) shows how to start to work with
pivot_tools.ipynb.

Exercise 3.2 (Basic feasible rays are extreme rays)


Prove Theorem 3.7.

Exercise 3.3 (Extreme rays are positive multiples of basic feasible rays)
If you are feeling very ambitious, prove Theorem 3.8.

Exercise 3.4 (Dual basic direction — do this if you will be doing Exercise 4.2)
Let β, η be a basic partition for our standard-form problem (P). As you will see on the first page
of the next chapter, we can associate with the basis β, a dual solution

ȳ′ := c′β Aβ−1

of

max y′b
y′A ≤ c′ .                                        (D)
It is easy to see that ȳ satisfies the constraints y 0 Aβ ≤ c0β (of (D)) with equality; that is, the dual
constraints indexed from β are “active”.
Let us assume that ȳ is feasible for (D). Now, let βℓ be a basic index, and let w̄ := Hℓ· be
row ℓ of H := Aβ−1 . Consider ỹ′ := ȳ′ − λw̄ , and explain (with algebraic justification) what is
happening to the activity of each constraint of (D), as λ increases. HINT: Think about the cases
of (i) i = ℓ , (ii) i ∈ β , i ≠ ℓ , and (iii) j ∈ η .
Chapter 4

The Simplex Algorithm

Our goal in this chapter is as follows:


• Develop a mathematically-complete Simplex Algorithm for optimizing standard-form
problems.

4.1 A Sufficient Optimality Criterion


The dual solution of (D) associated with basis β is

ȳ′ := c′β Aβ−1 .

Lemma 4.1
If β is a basis, then the primal basic solution x̄ (feasible or not) and the dual solution ȳ (feasible
or not) associated with β have equal objective value.

Proof. The objective value of x̄ is c′x̄ = c′β x̄β + c′η x̄η = c′β (Aβ−1 b) + c′η 0 = c′β Aβ−1 b . The objective
value of ȳ is ȳ′b = (c′β Aβ−1)b = c′β Aβ−1 b . □

The vector of reduced costs associated with basis β is

c̄′ := c′ − c′β Aβ−1 A = c′ − ȳ′A .
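As a quick numerical illustration (a sketch with made-up data; we solve linear systems with Aβ rather than forming an explicit inverse):

import numpy as np

A = np.array([[1., 1., 1., 0.],
              [1., 2., 0., 1.]])
b = np.array([4., 6.])
c = np.array([-1., -2., 0., 0.])
beta, eta = [2, 3], [0, 1]

x_bar = np.zeros(len(c)); x_bar[beta] = np.linalg.solve(A[:, beta], b)
y_bar = np.linalg.solve(A[:, beta].T, c[beta])     # y_bar' = c_beta' A_beta^{-1}
c_bar = c - y_bar @ A                              # vector of reduced costs

print(c @ x_bar, y_bar @ b)                        # equal objective values (Lemma 4.1)
print(c_bar[eta])                                  # c_bar_eta >= 0 would certify optimality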


Lemma 4.2
The dual solution of (D) associated with basis β is feasible for (D) if

c̄η ≥ 0 .

Proof. Using the definitions of ȳ and c̄ , the condition c̄η ≥ 0 is equivalent to

ȳ 0 Aη ≤ c0η .

The definition of ȳ gives us

ȳ 0 Aβ = c0β (equivalently, c̄β = 0) .

So we have
ȳ 0 [Aβ , Aη ] ≤ (c0β , c0η ) ,
or, equivalently,
ȳ 0 A ≤ c0 ,
Hence ȳ is feasible for (D). t
u

Theorem 4.3 (Weak Optimal Basis Theorem)


If β is a feasible basis and c̄η ≥ 0 , then the primal solution x̄ and the dual solution ȳ associated
with β are optimal.

Proof. We have already observed that c0 x̄ = ȳ 0 b for the pair of primal and dual solutions associ-
ated with the basis β . If these solutions x̄ and ȳ are feasible for (P) and (D), respectively, then
by weak duality these solutions are optimal. t
u

We can also take (P) and transform it into an equivalent form that is quite revealing. Clearly,
(P) is equivalent to
min c0β xβ + c0η xη
Aβ x β + Aη x η = b
xβ ≥ 0 , x η ≥ 0 .
Next, multiplying the equations on the left by Aβ−1 , we see that they are equivalent to

xβ + Aβ−1 Aη xη = Aβ−1 b .

We can also see this as

xβ = Aβ−1 b − Aβ−1 Aη xη .

Using this equation to substitute for xβ in the objective function, we are led to the linear objective
function  
c′β Aβ−1 b + min (c′η − c′β Aβ−1 Aη) xη = c′β Aβ−1 b + min c̄′η xη ,

which is equivalent to the original one on the set of points satisfying Ax = b . In this equivalent
form, it is now solely expressed in terms of xη . Now, if c̄η ≥ 0 , the best we could hope for in
minimizing is to set xη = 0 . But the unique solution having xη = 0 is the basic feasible solution
x̄ . So x̄ is optimal.

Example 4.4
This is a continuation of Example 3.1. In Figure 4.1, we have depicted the sufficient optimality
criterion, in the space of a particular choice of non-basic variables — not the choice previously
depicted. Specifically, we consider the equivalent problem

min c̄0η xη
Āη xη ≤ x̄β ;
xη ≥ 0.

This plot demonstrates the optimality of β := (2, 5, 3, 4) (η := (0, 1)). The basic directions
available from the basic feasible solution x̄ appear as standard unit vectors in the space of the
non-basic variables. The solution x̄ is optimal because c̄η ≥ 0 ; we can also think of this as
c̄η having a non-negative dot product with each of the standard unit vectors, hence neither
direction is improving.

Figure 4.1: Sufficient optimality criterion



4.2 The Simplex Algorithm with No Worries

Improving direction. Often it is helpful to directly refer to individual elements of the vector
c̄η ; namely,
c̄ηj = cηj − c′β Aβ−1 Aηj = cηj − c′β Āηj , for j = 1, 2, . . . , n − m .
If the sufficient optimality criterion is not satisfied, then we choose an ηj such that c̄ηj is negative,
and we consider solutions that increase the value of xηj up from x̄ηj = 0 , changing the values
of the basic variables to insure that we still satisfy the equations Ax = b , while holding the
other non-basic variables at zero.
Operationally, we take the basic direction z̄ ∈ Rn defined by
z̄η := ej ∈ Rn−m ;
z̄β := −Aβ−1 Aηj = −Āηj ∈ Rm ,
and we consider solutions of the form x̄ + λz̄ , with λ > 0 . The motivation is based on the
observations that
• c0 (x̄ + λz̄) − c0 x̄ = λc0 z̄ = λc̄ηj < 0 ;
• A(x̄ + λz̄) = Ax̄ + λAz̄ = b + λ0 = b .
That is, the objective function changes at the rate of c̄ηj , and we maintain satisfaction of the
Ax = b constraints.

Maximum step — the ratio test and a sufficient unboundedness criterion. By our
choice of direction z̄, all variables that are non-basic with respect to the current choice of basis
remain non-negative (xηj increases from 0 and the others remain at 0). So the only thing that
restricts our movement in the direction z̄ from x̄ is that we have to make sure that the current
basic variables remain non-negative. This is easy to take care of. We just make sure that we
choose λ > 0 so that
x̄β + λz̄β = x̄β − λĀηj ≥ 0 .
Notice that for i such that āi,ηj ≤ 0 , there is no limit on how large λ can be. In fact, it can well
happen that Āηj ≤ 0 . In this case, x̄ + λz̄ is feasible for all λ > 0 and c0 (x̄ + λz̄) → −∞ as
λ → +∞ , so the problem is unbounded.
Otherwise, to insure that x̄ + λz̄ ≥ 0 , we just enforce
λ ≤ x̄βi / āi,ηj , for i such that āi,ηj > 0 .
Finally, to get the best improvement in the direction z̄ from x̄, we let λ equal

λ̄ := min { x̄βi / āi,ηj : i such that āi,ηj > 0 } .
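In code, the ratio test and the unboundedness check are only a few lines; a sketch with made-up values for x̄β and Āηj :

import numpy as np

x_bar_beta = np.array([4., 6.])          # current basic values, A_beta^{-1} b (made up)
A_bar_etaj = np.array([1., 2.])          # A_beta^{-1} A_{eta_j} for the entering index (made up)

rows = np.where(A_bar_etaj > 1e-12)[0]
if rows.size == 0:
    print("unbounded: the basic direction is a ray")   # sufficient unboundedness criterion
else:
    ratios = x_bar_beta[rows] / A_bar_etaj[rows]
    lam_bar = ratios.min()                              # maximum step length
    i_star = rows[ratios.argmin()]                      # position in beta that leaves
    print(lam_bar, i_star)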

Non-degeneracy. There is a significant issue in even carrying out one iteration of this algo-
rithm. If x̄βi = 0 for some i such that āi,ηj > 0 , then λ̄ = 0 , and we are not able to make any
change from x̄ in the direction z̄ . Just for now, we will simply assume away this problem, using
the following hypothesis that every basic variable of every basic feasible solution is positive.
The problem (P) satisfies the non-degeneracy hypothesis if for every feasible basis β, we have
x̄βi > 0 for i = 1, 2, . . . , m . Under the non-degeneracy hypothesis, λ̄ > 0 .

Another basic feasible solution. By our construction, the new solution x̄ + λ̄z̄ is feasible
and has lesser objective value than that of x̄ . We can repeat the construction as long as the new
solution is basic. If it is basic, there is a natural guess as to what an appropriate basis may be. The
variable xηj , formerly non-basic at value 0 has increased to λ̄ , so clearly it must become basic.
Also, at least one variable that was basic now has value 0 . In fact, under our non-degeneracy
hypothesis, once we establish that the new solution is basic, we observe that exactly one variable
that was basic now has value 0. Let

i∗ := argmin { x̄βi / āi,ηj : i such that āi,ηj > 0 } .

If there is more than one i that achieves the minimum (which can happen if we do not assume
the non-degeneracy hypothesis), then we will see that the choice of i∗ can be any of these. We
can see that xβi∗ has value 0 in x̄ + λ̄z̄ . So it is natural to hope we can replace xβi∗ as a basic
variable with xηj .
Let
β̃ := (β1 , β2 , . . . , βi∗ −1 , ηj , βi∗ +1 , . . . , βm )
and
η̃ := (η1 , η2 , . . . , ηj−1 , βi∗ , ηj+1 , . . . , ηn−m ) .

Lemma 4.5
Aβ̃ is invertible.

Proof. Aβ̃ is invertible precisely when the following matrix is invertible:

Aβ−1 Aβ̃ = Aβ−1 [ Aβ1 , Aβ2 , . . . , Aβi∗−1 , Aηj , Aβi∗+1 , . . . , Aβm ]
        = [ e1 , e2 , . . . , ei∗−1 , Āηj , ei∗+1 , . . . , em ] .

But the determinant of this matrix is precisely āi∗,ηj ≠ 0 . □

Lemma 4.6
The unique solution of Ax = b having xη̃ = 0 is x̄ + λ̄z̄ .

Proof. (x̄ + λ̄z̄)j = 0, for j ∈ η̃. Moreover, x̄ + λ̄z̄ is the unique solution to Ax = b having xη̃ = 0
because Aβ̃ is invertible. t
u

Putting these two lemmata together, we have the following key result.

Theorem 4.7
x̄ + λ̄z̄ is a basic solution; in fact, it is the basic solution determined by the basic partition β̃, η̃ .

Passing from the partition β, η to the partition β̃, η̃ is commonly referred to as a pivot.

Worry-Free Simplex Algorithm

Input: c ∈ Rn , b ∈ Rm , A ∈ Rm×n of full row rank m , for the standard-form problem:

min c0 x
Ax = b; (P)
x ≥ 0,

where x is a vector of variables in Rn .


0. Start with any basic feasible partition β, η .
1. Let x̄ and ȳ be the primal and dual solutions associated with β, η .
If c̄η ≥ 0, then STOP: x̄ and ȳ are optimal.
2. Otherwise, choose a non-basic index ηj with c̄ηj < 0 .
3. If Āηj ≤ 0 , then STOP: (P) is unbounded and (D) is infeasible.
4. Otherwise, let

i∗ := argmin { x̄βi / āi,ηj : i such that āi,ηj > 0 } ,

replace β with

( β1 , β2 , . . . , βi∗−1 , ηj , βi∗+1 , . . . , βm )

and η with

( η1 , η2 , . . . , ηj−1 , βi∗ , ηj+1 , . . . , ηn−m ) .

5. GOTO 1.
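The following is a compact numpy transcription of Steps 0–5 above, offered only as a sketch (it assumes a starting feasible basis is supplied and, like the algorithm of this section, it does not guard against degeneracy); it is not a replacement for the pivot primitives of pivot_tools.ipynb. The instance at the bottom is made up.

import numpy as np

def worry_free_simplex(A, b, c, beta, tol=1e-9):
    """Steps 0-5 above; beta is a starting feasible basis (list of column indices)."""
    m, n = A.shape
    beta = list(beta)
    while True:
        eta = [j for j in range(n) if j not in beta]
        B = A[:, beta]
        x_beta = np.linalg.solve(B, b)                    # primal basic solution
        y = np.linalg.solve(B.T, c[beta])                 # dual solution
        c_bar = c[eta] - y @ A[:, eta]                    # reduced costs of non-basics
        if np.all(c_bar >= -tol):                         # Step 1: sufficient optimality
            x = np.zeros(n); x[beta] = x_beta
            return "optimal", x, y
        j = eta[int(np.argmin(c_bar))]                    # Step 2: entering index
        A_bar_j = np.linalg.solve(B, A[:, j])
        if np.all(A_bar_j <= tol):                        # Step 3: sufficient unboundedness
            return "unbounded", None, None
        rows = np.where(A_bar_j > tol)[0]                 # Step 4: ratio test
        i_star = rows[np.argmin(x_beta[rows] / A_bar_j[rows])]
        beta[i_star] = j                                  # pivot; Step 5: repeat

# tiny made-up instance (the slack-like columns 2, 3 give a starting feasible basis):
A = np.array([[1., 1., 1., 0.],
              [1., 2., 0., 1.]])
b = np.array([4., 6.])
c = np.array([-1., -2., 0., 0.])
print(worry_free_simplex(A, b, c, beta=[2, 3]))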
Example 4.8
This is a continuation of Example 3.1 / Example 4.4. In Figure 4.2, we have depicted the solution
one step after the initial solution depicted in Figure 3.1. The result of the next pivot is depicted
in Figure 4.3. Finally, in one more pivot, we reach the optimum depicted in Figure 4.1.

Theorem 4.9
Under the non-degeneracy hypothesis, the Worry-Free Simplex Algorithm terminates cor-
rectly.

Proof. Under the non-degeneracy hypothesis, every time we visit Step 1, we have a primal feasi-
ble solution with a decreased objective value. This implies that we never revisit a basic feasible
partition. But there are only a finite number of basic feasible partitions, so we must terminate,
after a finite number of pivots. But there are only two places where the algorithm terminates;
either in Step 1 where we correctly identify that x̄ and ȳ are optimal by our sufficient optimality
criterion, or in Step 3 because of our sufficient unboundedness criterion. t
u

Remark 4.10
There are two very significant issues remaining:
• How do we handle degeneracy? (see Section 4.3).
• How do we initialize the algorithm in Step 0? (see Section 4.4).

Figure 4.2: After one pivot



Figure 4.3: After two pivots

4.3 Anticycling

To handle degeneracy, we will eliminate it with an algebraic perturbation. It is convenient
to make the perturbation depend on an m × m non-singular matrix B — eventually we will
choose B in a convenient manner. We replace the problem (P) with

min c′x
Ax = bε(B) ;                                      (Pε(B))
x ≥ 0 ,

where
• bε(B) := b + B~ε , and ~ε := (ε, ε^2, . . . , ε^m)′ (these are exponents not superscripts);
• the scalar ε is an arbitrarily small indeterminant; ε is not given a numerical value; it is
simply considered to be a quantity that is positive, yet smaller than any positive real
number;
• 0 denotes a vector in which all entries are the zero polynomial (in ε);
• the variables xj are polynomials in ε with real coefficients;
• the ordering of polynomials used to interpret the inequality ≥ is described next.

The ordering is actually quite simple, but for the sake of precision, we describe it formally.

An ordered ring. The set of polynomials in ε, with real coefficients, forms what is known in
mathematics as an “ordered ring”. The ordering <ε is simple to describe. Let p(ε) := p0 + p1 ε + · · · + pm ε^m
and q(ε) := q0 + q1 ε + · · · + qm ε^m . Then p(ε) <ε q(ε) if the least j for which pj ≠ qj has pj < qj . Another
way to think about the ordering <ε is that p(ε) <ε q(ε) if p(ε) < q(ε) when ε is considered to be
an arbitrarily small positive number. Notice how the ordering <ε is in a certain sense a more
refined ordering than < . That is, if p(0) < q(0) , then p(ε) <ε q(ε) , but we can have p(0) = q(0)
without having p(ε) = q(ε) . Finally, we note that the zero polynomial “0” (all coefficients
equal to 0) is the zero of this ordered ring, so we can speak, for example, about polynomials that
are positive with respect to the ordering <ε . Concretely, p(ε) ≠ 0 is positive if the least i for
which pi ≠ 0 satisfies pi > 0 . Emphasizing that <ε is a more refined ordering than < , we see
that p(ε) ≥ε 0 implies that p(0) = p0 ≥ 0 .
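Concretely, the ordering <ε is just a lexicographic comparison of coefficient vectors; a small Python sketch:

# A sketch: the polynomial p0 + p1*eps + ... + pm*eps^m is stored as its
# coefficient list [p0, p1, ..., pm]; the ordering <_eps is lexicographic.
def less_eps(p, q):
    """True if p(eps) <_eps q(eps)."""
    for pj, qj in zip(p, q):
        if pj != qj:
            return pj < qj
    return False                          # equal polynomials

print(less_eps([0, 1, 0], [0, 0, 5]))     # eps <_eps 5*eps^2 ?  ->  False
print(less_eps([0, 0, 5], [0, 1, 0]))     # 5*eps^2 <_eps eps ?  ->  True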
For an arbitrary basis β, the associated basic solution x̄ε has x̄εβ := Aβ−1 (b + B~ε) = x̄β + Aβ−1 B~ε .
It is evident that x̄εβi is a polynomial, of degree at most m, in ε , for each i = 1, . . . , m .
Because the ordering <ε refines the ordering < , we have that x̄εβ ≥ε 0 implies that x̄β ≥ 0 . That
is, any basic feasible partition for (Pε(B)) is a basic feasible partition for (P). This implies that
applying the Worry-Free Simplex Algorithm to (Pε(B)), using the ratio test to enforce feasibility
of x̄ε in (Pε(B)) at each iteration, implies that each associated x̄β is feasible for (P). That is, the
choice of a leaving variable dictated by the ratio test when we work with (Pε(B)) is valid if we
instead do the ratio test working with (P).
The objective value associated with x̄ε , namely c′β Aβ−1 (b + B~ε) = ȳ′b + ȳ′B~ε , is a polynomial (of
degree at most m) in ε . Therefore, we can order basic solutions for (Pε(B)) using <ε , and
that ordering refines the ordering of the objective values of the corresponding basic solutions of
(P). This implies that if x̄ε is optimal for (Pε(B)) , then the x̄ associated with the same basis is
optimal for (P).

Lemma 4.11
The ε-perturbed problem (Pε(B)) satisfies the non-degeneracy hypothesis.

Proof. For an arbitrary basis matrix Aβ , the associated basic solution x̄ε has x̄εβ := Aβ−1 (b + B~ε) =
x̄β + Aβ−1 B~ε . As we have already pointed out, x̄εβi is a polynomial, of degree at most m, in ε , for
each i = 1, . . . , m . x̄εβi = 0 (the zero polynomial) implies that the i-th row of Aβ−1 B is all zero. But this is impossible
for the invertible matrix Aβ−1 B . □

Theorem 4.12
Let β0 be a basis that is feasible for (P). Then the Worry-Free Simplex Algorithm applied to
(Pε(Aβ0)), starting from the basis β0 , correctly demonstrates that (P) is unbounded or finds
an optimal basic partition for (P).

Proof. The first important point to notice is that we are choosing the perturbation of the original
right-hand side to depend on the choice of a basis that is feasible for (P). Then we observe that
x̄εβ0 := Aβ0−1 (b + Aβ0 ~ε) = Aβ0−1 b + ~ε . Now because x̄ is feasible for (P), we have Aβ0−1 b ≥ 0 .
Then, the ordering <ε implies that x̄εβ0 = Aβ0−1 b + ~ε ≥ε 0 . Therefore, the basis β0 is feasible for
(Pε(Aβ0)), and the Worry-Free Simplex Algorithm can indeed be started for (Pε(Aβ0)) on β0 .

Notice that it is only Step 4 of the Worry-Free Simplex Algorithm that really depends
on whether we are considering (Pε(Aβ0)) or (P). The sufficient optimality criterion and the
sufficient unboundedness criterion are identical for (Pε(Aβ0)) and (P). Because (Pε(Aβ0)) satisfies
the non-degeneracy hypothesis, the Worry-Free Simplex Algorithm correctly terminates
for (Pε(Aβ0)). □

(Pivot.mp4)

Figure 4.4: With some .pdf viewers, you can click above to see or download a short
video. Or just see it on YouTube (probably with an ad) by clicking here.

4.4 Obtaining a Basic Feasible Solution


Next, we will deal with the problem of finding an initial basic feasible solution for the standard-
form problem
min c0 x
Ax = b ; (P)
x ≥ 0.

4.4.1 Ignoring degeneracy


At first, we ignore the degeneracy issue — why worry about two things at once?! The idea is
rather simple. First, we choose any basic partition β̃, η̃ . If we are lucky, then Aβ̃−1 b ≥ 0 .

Otherwise, we have some work to do. We define a new non-negative variable xn+1 , which we
temporarily adjoin as an additional non-basic variable. So our basic indices remain as
 
β̃ = β̃1 , β̃2 , . . . , β̃m ,

while our non-basic indices are extended to


 
η̃ = η̃1 , η̃2 , . . . , η̃n−m , η̃n−m+1 := n + 1 .

This variable xn+1 is termed an artificial variable. The column for the constraint matrix
associated with xn+1 is defined as An+1 := −Aβ̃ 1 . Hence Ān+1 = −1 . Finally, we temporarily
put aside the objective function from (P) and replace it with one of minimizing the artificial
variable xn+1 . That is, we consider the so-called phase-one problem

min xn+1
Ax + An+1 xn+1 = b; (Φ)
x , xn+1 ≥ 0.

With this terminology, the original problem (P) is referred to as the phase-two problem.

It is evident that any feasible solution x̂ of (Φ) with x̂n+1 = 0 is feasible for (P). Moreover,
if the minimum objective value of (Φ) is greater than 0, then we can conclude that (P) has no
feasible solution. So, toward establishing whether or not (P) has a feasible solution, we focus
our attention on (Φ). We will soon see that we can easily find a basic feasible solution of (Φ).

Finding a basic feasible solution of (Φ). Choose i∗ so that x̄β̃i∗ is most negative. Then
we exchange β̃i∗ with η̃n−m+1 = n + 1 . That is, our new basic indices are
 
β := β̃1 , β̃2 , . . . , β̃i∗ −1 , n + 1, β̃i∗ +1 , . . . , β̃m ,

and our new non-basic indices are


 
η := η̃1 , η̃2 , . . . , η̃n−m , β̃i∗ .

Lemma 4.13
The basic solution of (Φ) associated with the basic partition β, η is feasible for (Φ).

Proof. This pivot, from β̃, η̃ to β, η amounts to moving in the basic direction z̄ ∈ Rn+1 defined
by
z̄η̃ := en−m+1 ∈ Rn−m+1 ;
z̄β̃ := −Aβ̃−1 An+1 = 1 ∈ Rm ,

in the amount λ := −x̄β̃i∗ > 0 . That is, x̄ + λz̄ is the basic solution associated with the basic
partition β, η . Notice how when we move in the direction z̄ , all basic variables increase at
exactly the same rate that xn+1 does. So, using this direction to increase xn+1 from 0 to −x̄β̃i∗ >
0 results in all basic variables increasing by exactly −x̄β̃i∗ > 0 . By the choice of i∗ , this causes
all basic variable to become non-negative, and xβ̃i∗ to become 0 , whereupon it can leave the
basis in exchange for xn+1 . t
u
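A small numpy sketch (made-up data) of the construction just described: the artificial column An+1, the choice of i∗, and the resulting feasible basis for (Φ):

import numpy as np

# made-up standard-form data with an infeasible starting basis:
A = np.array([[1., 1., 1., 0.],
              [1., 2., 0., 1.]])
b = np.array([-2., 3.])
beta_t = [2, 3]                                      # a basic partition; A_beta = I here

x_beta = np.linalg.solve(A[:, beta_t], b)            # = [-2, 3], not >= 0
A_art = -A[:, beta_t] @ np.ones(2)                   # artificial column A_{n+1} = -A_beta 1
A_phase1 = np.hstack([A, A_art.reshape(-1, 1)])      # constraint matrix of (Phi)
c_phase1 = np.zeros(A_phase1.shape[1]); c_phase1[-1] = 1.0   # minimize x_{n+1}

i_star = int(np.argmin(x_beta))                      # most negative basic value
beta_phase1 = beta_t.copy()
beta_phase1[i_star] = A_phase1.shape[1] - 1          # x_{n+1} enters in place of beta_{i*}
print(np.linalg.solve(A_phase1[:, beta_phase1], b))  # all entries >= 0: feasible for (Phi)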

The end game for (Φ). If (P) is feasible, then at the very last iteration of the Worry-Free
Simplex Algorithm on (Φ), the objective value will drop from a positive number to zero. As
this happens, xn+1 will be eligible to leave the basis, but so may other variables also be eligible.
That is, there could be a tie in the ratio-test of Step 4 of the Worry-Free Simplex Algorithm.
As is the case whenever there is a tie, any of the tying indices can leave the basis — all of the
associated variables are becoming zero simultaneously. For our purposes, it is critical that if
there is a tie, we choose i∗ so that βi∗ = n + 1 ; that is, xn+1 must be selected to become non-
basic. In this way, we not only get a feasible solution to (P), we get a basis for it that does not use
the artificial variable xn+1 . Now, starting from this basis, we can smoothly shift to minimizing
the objective function of (P).

4.4.2 Not ignoring degeneracy


Anticycling for (Φ). There is one lingering issue remaining. We have not discussed anticy-
cling for (Φ).

But this is relatively simple. We define an ε-perturbed version

min xn+1
Ax + An+1 xn+1 = b + ~ε ;                         (Φε)
x , xn+1 ≥ 0 ,

where ~ε := (ε, ε^2, . . . , ε^m)′ . Then we choose i∗ so that x̄β̃i∗ is most negative
with respect to the ordering <ε , and exchange β̃i∗ with η̃n−m+1 = n + 1 as before. Then, as in
Lemma 4.13, the resulting basis is feasible for (Φε).

We do need to manage the final iteration a bit carefully. There are two different ways we
can do this.

“Early arrival”. If (P) has a feasible solution, at some point the value of xn+1 will decrease to
a homogeneous polynomial in ε . That is, the constant term will become 0. At this point, although
xn+1 may not be eligible to leave the basis for (Φε), it will be eligible to leave for (P). So, at this
point we let xn+1 leave the basis, and we terminate the solution process for (Φε), having found
a feasible basis for (P). In fact, we have just constructively proved the following result.

Theorem 4.14
If standard form (P) has a feasible solution, then it has a basic feasible solution.

Note that because xn+1 may not have been eligible to leave the basis for (Φε) when we apply
the “early arrival” idea, the resulting basis may not be feasible for (Pε). So we will have to re-
perturb (P) .

“Be patient”. Perhaps a more elegant way to handle the situation is to fully solve (Φε). In
doing so, if (P) has a feasible solution, then the minimum objective value of (Φε) will be 0 (i.e.,
the zero polynomial), and xn+1 will necessarily be non-basic. That is because, at every iteration,
every basic variable in (Φε) is positive. Because xn+1 legally left the basis for (Φε) at the final
iteration, the resulting basis is feasible for (Pε). So we do not re-perturb (P) , and we simply
revert to solving (Pε) from the final basis of (Φε) .

4.5 The Simplex Algorithm

“This is a very complicated case,


Maude. You know, a lotta ins, a lotta
outs, a lotta what-have-yous. And,
uh, a lotta strands to keep in my head,

man. Lotta strands in old Duder’s head.” — The Dude

Putting everything together, we get a mathematically complete algorithm for linear opti-
mization. That is:
1. Apply an algebraic perturbation to the phase-one problem;
2. Solve the phase-one problem using the Worry-Free Simplex Algorithm, adapted to alge-
braically perturbed problems, but always giving preference to xn+1 for leaving the basis
whenever it is eligible to leave for the unperturbed problem. Go to the next step, as soon
as xn+1 leaves the basis;
3. Starting from the feasible basis obtained for the original standard-form problem, apply
an algebraic perturbation. (Note that the previous step may have left us with a basis that
is feasible for the original unperturbed problem, but infeasible for the original perturbed
problem — this is why we apply a perturbation anew; see the “Early arrival” paragraph
in Section 4.4.2.)
4. Solve the problem using the Worry-Free Simplex Algorithm, adapted to algebraically per-
turbed problems.

It is important to know that the Simplex Algorithm will be used, later, to prove the cele-
brated Strong Duality Theorem. For that reason, it is important that our algorithm be math-
ematically complete. But from a practical computational viewpoint, there is substantial over-
head in working with the -perturbed problems. Therefore, in practice, no computer code that
is routinely applied to large instances worries about the potential for cycling associated with the
very-real possibility of degeneracy.

4.6 Exercises
Exercise 4.1 (Carry out the Simplex Algorithm)
pivot_tools.ipynb (see Appendix A.6) implements the primitive steps of the simplex algo-
rithm. Using these primitives only, write a Python function to carry out the simplex algorithm.
Initialize your data as is done in pivot_example.ipynb. Do not worry about degeneracy/anti-
cycling. But I do want you to take care of algorithmically finding an initial feasible basis as
described in Section 4.4.1. Make some small examples to fully illustrate the different possibili-
ties for (P) (i.e., infeasible, optimal, unbounded).
Exercise 4.2 (Dual change — first do Exercise 3.4)
Let β, η be any basic partition for the standard-form problem (P). The associated dual solution
is ȳ′ := c′β Aβ−1 . Now, suppose that we pivot, letting ηj enter the basis and βℓ leave the basis,
so that the new partition β̃, η̃ is also a basic partition (in other words, Aβ̃ is invertible). Let ỹ
be the dual solution associated with the basic partition β̃, η̃ , and let Hℓ· be row ℓ of H := Aβ−1 .
Prove that

ỹ′ = ȳ′ + (c̄ηj / āℓ,ηj) Hℓ· .

HINT: Use the Sherman-Morrison formula; see Section 1.3.
Exercise 4.3 (Traditional phase one)
Instead of organizing the phase-one problem as (Φ), we could first scale rows of Ax = b as
necessary so as to achieve b ≥ 0. Then we can formulate the “traditional phase-one problem”

min xn+1 + xn+2 + · · · + xn+m
Ax + e1 xn+1 + e2 xn+2 + · · · + em xn+m = b ;     (Φ̂)
x , xn+1 , . . . , xn+m ≥ 0 .

Here we have m artificial variables: xn+1 , xn+2 , . . . , xn+m . It is easy to see that (i) β :=
{n + 1, n + 2, . . . , n + m} is feasible for (Φ̂), and (ii) the optimal value of (Φ̂) is zero if and only
if (P) has a feasible solution.
It may be that the optimal value of (Φ̂) is zero, but the optimal basis discovered by the
Simplex Algorithm applied to (Φ̂) contains some of the indices {n + 1, n + 2, . . . , n + m} of
artificial variables. Describe how we can take such an optimal basis and pass to a different
optimal basis that uses none of {n + 1, n + 2, . . . , n + m} (and is thus a feasible basis for (P)).
HINT: Use Theorem 1.2.

Exercise 4.4 (Worry-Free Simplex Algorithm can cycle)

Let θ := 2π/k, with integer k ≥ 5. The idea is to use the symmetry of the geometric circle,
and complete a cycle of the Worry-Free Simplex Algorithm in 2k pivots. Choose a constant γ
satisfying 0 < γ < tan(θ/2) . Let

A1 := (1, 0)′ ,   A2 := (0, γ)′ .

Let

R := [ cos θ   −sin θ ]
     [ sin θ    cos θ ] .

Then, for j = 3, 4, . . . , 2k , let

Aj := R^{(j−1)/2} A1 , for odd j ;
Aj := R^{(j−2)/2} A2 , for even j .

We can observe that for odd j , Aj is a rotation of A1 by (j − 1)π/k radians, and for even j , Aj
is a rotation of A2 by (j − 2)π/k radians.
Let cj := 1 − a1j − a2j /γ , for j = 1, 2, . . . , 2k , and let b := (0, 0)′ . Because b = 0 , the
problem is fully degenerate; that is, x̄ = 0 for all basic solutions x̄ . Notice that this implies that
either the problem has optimal objective value zero, or the objective value is unbounded on the
feasible region.
For k = 5 , you can choose γ := (1/2) tan(θ/2) , and then check that the following is a sequence
of bases β that are legal for the Worry-Free Simplex Algorithm:

β = (1, 2) → (2, 3) → (3, 4) → . . . → (2k − 1, 2k) → (2k, 1) → (1, 2) .

You need to check that for every pivot, the incoming basic variable xηj has negative reduced
cost, and that the outgoing variable is legally selected — that is that āi,ηj > 0 . Feel free to use
any software that you find convenient (e.g., Python, MATLAB, Mathematica, etc.).
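If you want to generate the data in Python, here is one possible sketch (for k = 5, with numpy), which you can then feed to pivot_tools.ipynb or any LP code:

import numpy as np

k = 5
theta = 2 * np.pi / k
gamma = 0.5 * np.tan(theta / 2)

R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
A1 = np.array([1.0, 0.0])
A2 = np.array([0.0, gamma])

cols = []
for j in range(1, 2 * k + 1):                        # j = 1, ..., 2k
    if j % 2 == 1:                                   # odd j: rotate A1
        cols.append(np.linalg.matrix_power(R, (j - 1) // 2) @ A1)
    else:                                            # even j: rotate A2
        cols.append(np.linalg.matrix_power(R, (j - 2) // 2) @ A2)

A = np.column_stack(cols)                            # the 2 x 2k constraint matrix
c = 1.0 - A[0, :] - A[1, :] / gamma                  # c_j = 1 - a_{1j} - a_{2j}/gamma
b = np.zeros(2)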

Note that it may seem hard to grasp the picture at all5 . But see Section 6.1.3 and Figure 4.5;
you can look at it from different perspectives using the Jupyter notebook Circle.ipynb (see
Appendix A.7).
If you are feeling ambitious, check that for all k ≥ 5 , we get a cycle of the Worry-Free
Simplex Algorithm.

Exercise 4.5
Run the code pivot_example.ipynb, but with the following line uncommented:

#pivot_perturb() # uncomment to perturb the right-hand side

See how this carries out the algebraic-perturbation method from Section 4.3.
Now, using this code as your starting point, solve the example from Exercise 4.4 with k = 5
to optimality, using the algebraic-perturbation method. Just change the data to correspond to
the example that we want to solve, and do the pivots one at a time, following the rules of the
simplex method.

Figure 4.5: A picture of the cycle with k = 5


Chapter 5

Duality

Our goals in this chapter are as follows:

• Establish the Strong Duality Theorem for the standard-form problem.

• Establish the Complementary Slackness Theorem for the standard-form problem.

• See how duality and complementarity carry over to general linear-optimization problems.

• Learn about “theorems of the alternative.”

As usual, we focus on the standard-form problem

min c0 x
Ax = b; (P)
x ≥ 0

and its dual


max y0 b
(D)
y0 A ≤ c0 .


5.1 The Strong Duality Theorem


We have already seen two simple duality theorems:
• Weak Duality Theorem. If x̂ is feasible in (P) and ŷ is feasible in (D), then c0 x̂ ≥ ŷ 0 b .
• Weak Optimal Basis Theorem. If β is a feasible basis and c̄η ≥ 0 , then the primal solution
x̄ and the dual solution ȳ associated with β are optimal.
The Weak Duality Theorem directly implies that if x̂ is feasible in (P) and ŷ is feasible in (D),
and c0 x̂ = ŷ 0 b , then x̂ and ŷ are optimal. Thinking about it this way, we see that both the Weak
Duality Theorem and the Weak Optimal Basis Theorem assert conditions that are sufficient for
establishing optimality.

Theorem 5.1 (Strong Optimal Basis Theorem)


If (P) has a feasible solution, and (P) is not unbounded, then there exists a basis β such that the
associated basic solution x̄ and the associated dual solution ȳ are optimal. Moreover, c0 x̄ = ȳ 0 b .

Proof. If (P) has a feasible solution and (P) is not unbounded, then the Simplex Algorithm
will terminate with a basis β such that the associated basic solution x̄ and the associated dual
solution ȳ are optimal. t
u

As a direct consequence, we have a celebrated theorem.

Theorem 5.2 (Strong Duality Theorem)


If (P) has a feasible solution, and (P) is not unbounded, then there exist feasible solutions x̂
for (P) and ŷ for (D) that are optimal. Moreover, c0 x̂ = ŷ 0 b .

It is important to realize that the Strong Optimal Basis Theorem and the Strong Duality
Theorem depend on the correctness of the Simplex Algorithm — this includes: (i) the correct-
ness of the phase-one procedure to find an initial feasible basis of (P), and (ii) the anti-cycling
methodology.

5.2 Complementary Slackness



With respect to the standard-form problem (P) and its dual (D), the solutions x̂ and ŷ are
complementary if

(cj − ŷ 0 A·j )x̂j = 0 , for j = 1, 2, . . . , n ;


ŷi (Ai· x̂ − bi ) = 0 , for i = 1, 2, . . . , m .

Theorem 5.3
If x̄ is a basic solution (feasible or not) of standard-form (P), and ȳ is the associated dual
solution, then x̄ and ȳ are complementary.

Proof. Notice that if x̄ is a basic solution then Ax̄ = b. Then we can see that complementarity of
x̄ and ȳ amounts to
c̄j x̄j = 0 , for j = 1, 2, . . . , n .
It is clear then that x̄ and ȳ are complementary, because if x̄j > 0 , then j is a basic index, and
c̄j = 0 for basic indices. t
u

Theorem 5.4
If x̂ and ŷ are complementary with respect to (P) and (D), then c0 x̂ = ŷ 0 b .

Proof.
c0 x̂ − ŷ 0 b = (c0 − ŷ 0 A)x̂ + ŷ 0 (Ax̂ − b) ,
which is 0 by complementarity. t
u

Corollary 5.5 (Weak Complementary Slackness Theorem)


If x̂ and ŷ are feasible and complementary with respect to (P) and (D), then x̂ and ŷ are
optimal.

Proof. This immediately follows from Theorem 5.4 and the Weak Duality Theorem. t
u
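Numerically, Corollary 5.5 gives an easy optimality check: verify feasibility of a candidate pair x̂, ŷ, and then verify complementarity. A sketch with made-up data:

import numpy as np

A = np.array([[1., 1., 1., 0.],
              [1., 2., 0., 1.]])
b = np.array([4., 6.])
c = np.array([-1., -2., 0., 0.])

x_hat = np.array([0., 3., 1., 0.])        # candidate primal solution
y_hat = np.array([0., -1.])               # candidate dual solution

primal_feasible = np.allclose(A @ x_hat, b) and np.all(x_hat >= 0)
dual_feasible = np.all(y_hat @ A <= c + 1e-9)
complementary = np.allclose((c - y_hat @ A) * x_hat, 0)

# If all three checks hold, Corollary 5.5 certifies that x_hat and y_hat are optimal.
print(primal_feasible, dual_feasible, complementary, c @ x_hat, y_hat @ b)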

Theorem 5.6 (Strong Complementary Slackness Theorem)


If x̂ and ŷ are optimal for (P) and (D), then x̂ and ŷ are complementary (with respect to (P)
and (D)).

Proof. If x̂ and ŷ are optimal, then by the Strong Duality Theorem, we have c0 x̂ − ŷ 0 b = 0 .
Therefore, we have

0 = (c′ − ŷ′A)x̂ + ŷ′(Ax̂ − b)
  = ∑_{j=1}^{n} (cj − ŷ′A·j) x̂j + ∑_{i=1}^{m} ŷi (Ai·x̂ − bi) .

Next, observing that x̂ and ŷ are feasible, we have cj − ŷ′A·j ≥ 0 and x̂j ≥ 0 for every j , while Ai·x̂ − bi = 0 for every i .

Clearly this expression is equal to a non-negative number. Finally, we observe that this expres-
sion can only be equal to 0 if

(cj − ŷ 0 A·j )x̂j = 0 , for j = 1, 2, . . . , n .

t
u

5.3 Duality for General Linear-Optimization Problems


Thus far, we have focused on duality for the standard-form problem (P). But we will see that
every linear-optimization problem has a natural dual. Consider the rather general linear mini-
mization problem

min c0P xP + c0N xN + c0U xU


AGP xP + AGN xN + AGU xU ≥ bG ;
ALP xP + ALN xN + ALU xU ≤ bL ; (G)
AEP xP + AEN xN + AEU xU = bE ;
xP ≥ 0 , xN ≤ 0 .
We will see in the next result that a natural dual for it is
0 0 0
max yG bG + yL bL + yE bE
0 0 0
yG AGP + yL ALP + yE AEP ≤ c0P ;
0
yG AGN + 0
yL ALN + 0
yE AEN ≥ c0N ; (H)
0 0 0
yG AGU + yL ALU + yE AEU = c0U ;
0 0
yG ≥ 0 , yL ≤ 0 .

Theorem 5.7
• Weak Duality Theorem: If (x̂P , x̂N , x̂U ) is feasible in (G) and (ŷG , ŷL , ŷE ) is feasible in
(H), then c′P x̂P + c′N x̂N + c′U x̂U ≥ ŷ′G bG + ŷ′L bL + ŷ′E bE .
• Strong Duality Theorem: If (G) has a feasible solution, and (G) is not unbounded,
then there exist feasible solutions (x̂P , x̂N , x̂U ) for (G) and (ŷG , ŷL , ŷE ) for (H) that are
optimal. Moreover, c′P x̂P + c′N x̂N + c′U x̂U = ŷ′G bG + ŷ′L bL + ŷ′E bE .

Proof. The Weak Duality Theorem for general problems can be demonstrated as easily as it
was for the standard-form problem and its dual. But the Strong Duality Theorem for general
problems is most easily obtained by converting our general problem (G) to the standard-form

min c′P xP − c′N x̃N + c′U x̃U − c′U x̃̃U
AGP xP − AGN x̃N + AGU x̃U − AGU x̃̃U − sG = bG ;
ALP xP − ALN x̃N + ALU x̃U − ALU x̃̃U + tL = bL ;
AEP xP − AEN x̃N + AEU x̃U − AEU x̃̃U = bE ;
xP ≥ 0 , x̃N ≥ 0 , x̃U ≥ 0 , x̃̃U ≥ 0 , sG ≥ 0 , tL ≥ 0 .

Above, we substituted −x̃N for xN and x̃U − x̃̃U for xU . Taking the dual of this standard-form
problem, we obtain

max y′G bG + y′L bL + y′E bE
y′G AGP + y′L ALP + y′E AEP ≤ c′P ;
− y′G AGN − y′L ALN − y′E AEN ≤ −c′N ;
y′G AGU + y′L ALU + y′E AEU ≤ c′U ;
− y′G AGU − y′L ALU − y′E AEU ≤ −c′U ;
− y′G ≤ 0 ;
+ y′L ≤ 0 ,

which is clearly equivalent to (H). t


u

With respect to (G) and its dual (H), the solutions (x̂P , x̂N , x̂U ) and (ŷG , ŷL , ŷE ) are com-
plementary if
(cj − ŷ′G AGj − ŷ′L ALj − ŷ′E AEj) x̂j = 0 , for all j ;
ŷi (AiP x̂P + AiN x̂N + AiU x̂U − bi) = 0 , for all i .

Theorem 5.8
• Weak Complementary Slackness Theorem: If (x̂P , x̂N , x̂U ) and (ŷG , ŷL , ŷE ) are feasible
and complementary with respect to (G) and (H), then (x̂P , x̂N , x̂U ) and (ŷG , ŷL , ŷE ) are
optimal.
• Strong Complementary Slackness Theorem: If (x̂P , x̂N , x̂U ) and (ŷG , ŷL , ŷE ) are opti-
mal for (G) and (H), (x̂P , x̂N , x̂U ) and (ŷG , ŷL , ŷE ) are complementary (with respect to
(G) and (H)).

Proof. Similarly to the proof for standard-form (P) and its dual (D), we consider the following
expression:

0 = ∑_{j∈P} (cj − ŷ′G AGj − ŷ′L ALj − ŷ′E AEj) x̂j      [ each factor ≥ 0 and each x̂j ≥ 0 ]
  + ∑_{j∈N} (cj − ŷ′G AGj − ŷ′L ALj − ŷ′E AEj) x̂j      [ each factor ≤ 0 and each x̂j ≤ 0 ]
  + ∑_{j∈U} (cj − ŷ′G AGj − ŷ′L ALj − ŷ′E AEj) x̂j      [ each factor = 0 ]
  + ∑_{i∈G} ŷi (AiP x̂P + AiN x̂N + AiU x̂U − bi)         [ each ŷi ≥ 0 and each slack ≥ 0 ]
  + ∑_{i∈L} ŷi (AiP x̂P + AiN x̂N + AiU x̂U − bi)         [ each ŷi ≤ 0 and each slack ≤ 0 ]
  + ∑_{i∈E} ŷi (AiP x̂P + AiN x̂N + AiU x̂U − bi) ,        [ each slack = 0 ]

where the bracketed sign information follows from feasibility of (x̂P , x̂N , x̂U ) and (ŷG , ŷL , ŷE ).

The results follow easily using the Weak and Strong Duality Theorems for (G) and (H). □

The table below summarizes the duality relationships between the type of each primal con-
straint and the type of each associated dual variable. Highlighted in yellow are the relation-
ships for the standard-form (P) and its dual (D). It is important to note that the columns are
labeled “min” and “max”, rather than primal and dual — the table is not correct if “min” and
“max” are interchanged.

                    min         max

                     ≥          ≥ 0
   constraints       ≤          ≤ 0        variables
                     =          unres.

                    ≥ 0          ≤
   variables        ≤ 0          ≥         constraints
                   unres.        =
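The sign conventions in the table can be observed directly from a solver. A minimal Python/Gurobi sketch follows (the instance is made up; Pi is Gurobi's attribute for the dual value of a constraint). For this minimization instance, the ≥ constraint should report a nonnegative dual, the ≤ constraint a nonpositive one, and the equation a dual that is unrestricted in sign.

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("dual-signs")
x = m.addVars(2, lb=0.0, name="x")
m.setObjective(3 * x[0] + 2 * x[1], GRB.MINIMIZE)
cG = m.addConstr(x[0] + x[1] >= 4, name="geq")       # expect dual >= 0
cL = m.addConstr(x[0] - x[1] <= 10, name="leq")      # expect dual <= 0
cE = m.addConstr(2 * x[0] + x[1] == 9, name="eq")    # dual unrestricted in sign
m.optimize()
for con in (cG, cL, cE):
    print(con.ConstrName, con.Pi)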

5.4 Theorems of the Alternative

In this section, we use linear-optimization duality to understand when a linear-optimization


problem has a feasible solution. This fundamental result, expounded by Farkas6 , opened the
door for studying linear inequalities and optimization.

Theorem 5.9 (Farkas Lemma)


Let A ∈ Rm×n and b ∈ Rm be given. Then exactly one of the following two systems has a
solution.
Ax = b ;
(I)
x ≥ 0.
y0 b > 0 ;
(II)
y 0 A ≤ 00 .

Proof. It is easy to see that there cannot simultaneously be a solution x̂ to (I) and ŷ to (II).
Otherwise we would have
0 ≥ (ŷ′A) x̂ = ŷ′b > 0       (using ŷ′A ≤ 0′ and x̂ ≥ 0) ,

which is a clear inconsistency.


Next, suppose that (I) has no solution. Then the following problem is infeasible:

min 00 x
Ax = b; (P)
x ≥ 0.

Its dual is
max y0 b
(D)
y0 A ≤ 00 .
Because (P) is infeasible, then (D) is either infeasible or unbounded. But ŷ := 0 is a feasible
solution to (D), therefore (D) must be unbounded. Therefore, there exists a feasible solution
ŷ to (D) having objective value greater than zero (or even any fixed constant). Such a ŷ is a
solution to (II). t
u
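Computationally, one way to decide which alternative holds is to solve a bounded variant of (D): maximize y′b subject to y′A ≤ 0′ and the normalization y′b ≤ 1. If the optimal value is positive we obtain a solution of (II); if it is 0, then (by the argument above) (I) must have a solution. A Python/Gurobi sketch with made-up data:

import numpy as np
import gurobipy as gp
from gurobipy import GRB

A = np.array([[1., 1.],
              [-1., -1.]])
b = np.array([1., 1.])                        # here (I) happens to be infeasible

m = gp.Model("farkas")
y = m.addMVar(2, lb=-GRB.INFINITY, name="y")
m.addConstr(A.T @ y <= np.zeros(2))           # y'A <= 0'
m.addConstr(b @ y <= 1.0)                     # normalization, so the LP is bounded
m.setObjective(b @ y, GRB.MAXIMIZE)
m.optimize()

if m.ObjVal > 1e-9:
    print("system (II) has a solution:", y.X) # certificate that (I) is infeasible
else:
    print("system (I) { Ax = b , x >= 0 } has a solution")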

Remark 5.10
Geometrically, the Farkas Lemma asserts that exactly one of the following holds:
(I) b is in the “cone generated by the columns of A” (i.e., b is a non-negative linear combina-
tion of the columns of A), or
(II) there is ŷ ∈ Rm that makes an acute angle with b and a non-acute (i.e., right or obtuse)
angle with every column of A .
In the case of (II), considering the hyperplane H containing the origin having ŷ as its normal
vector, this H separates b from the cone generated by the columns of A . So, the Farkas Lemma
has the geometric interpretation as a “Separating-Hyperplane Theorem.” See Figure 5.1 for an
example with m = 2 and n = 4 . The cone is red and the point b that we separate from the
cone is blue. The green point is a solution ŷ for (II), and the dashed green line is the separating
hyperplane. Notice how the (solid) green vector makes an acute angle with the blue vector and
a non-acute angle with all points in the cone.

Figure 5.1: Case (II) of the Farkas Lemma

In a similar fashion to the Farkas Lemma, we can develop theorems of this type for feasible
regions of other linear-optimization problems.

Theorem 5.11 (Theorem of the Alternative for Linear Inequalities)


Let A ∈ Rm×n and b ∈ Rm be given. Then exactly one of the following two systems has a
solution.
Ax ≥ b . (I)
y0 b > 0 ;
y 0 A = 00 ; (II)
y ≥ 0.

Proof. It is easy to see that there cannot simultaneously be a solution x̂ to (I) and ŷ to (II).
Otherwise we would have
0 = (ŷ′A) x̂ ≥ ŷ′b > 0       (using ŷ′A = 0′ , together with ŷ ≥ 0 and Ax̂ ≥ b) ,

which is a clear inconsistency.



Next, suppose that (I) has no solution. Then the following problem is infeasible:

min 00 x
(P)
Ax ≥ b.

Its dual is
max y0 b
y0 A = 00 ; (D)
y ≥ 0.
Because (P) is infeasible, then (D) is either infeasible or unbounded. But ŷ := 0 is a feasible
solution to (D), therefore (D) must be unbounded. Therefore, there exists a feasible solution ŷ
to (D) having objective value greater than zero (or even greater than any fixed constant). Such
a ŷ is a solution to (II). t
u

5.5 Exercises
Exercise 5.1 (Dual picture)
For the standard-form problem (P) and its dual (D), explain aspects of duality and comple-
mentarity using this picture:

Exercise 5.2 (Reduced costs as dual values)


In this exercise, we will see that we can regard reduced costs (corresponding to an optimal basic
partition) as (optimal) values of dual variables for non-negativity constraints.
Consider the ordinary standard-form problem

z := min c0 x dual variables


Ax = b; y (P)
x ≥ 0,

and let β, η be an optimal basic partition for (P).


We can equivalently see (P) as

z := min c0 x dual variables


Ax = b; y (P̃)
x ≥ 0, w

where in (P̃), we regard the non-negativity constraints of (P) as ordinary structural constraints
— with dual variables.
Define w̄ ∈ Rn by
w̄β := 0 ∈ Rm ;
w̄η := c̄η ∈ Rn−m .

Prove that together, ȳ′ := c′β Aβ−1 and w̄ are optimal for the dual of (P̃).

Exercise 5.3 (Duality and complementarity with Python/Gurobi)


After optimization using Python/Gurobi, it is easy to get more information regarding primal
and dual problems. In particular, we can obtain optimal primal and dual solutions, and slacks
for these solutions in the primal and dual constraints. See how this is done in Production.ipynb
(Appendix A.3), and verify the concepts of duality and complementarity developed in this
chapter.

Exercise 5.4 (Complementary slackness)


Construct an example where we are given x̂ and ŷ and asked to check whether x̂ is optimal
using complementary slackness. I want your example to have the property that x̂ is optimal, x̂
and ŷ are complementary, but ŷ is not feasible.
The idea is to see an example where there is not a unique dual solution complementary to
x̂ , and so x̂ is optimal, but we only verify it with another choice of ŷ.

Exercise 5.5 (Over complementarity)


With respect to the standard-form problem (P) and its dual (D), complementary solutions x̂
and ŷ are overly complementary if exactly one of

cj − ŷ 0 A·j and x̂j is 0 , for j = 1, 2, . . . , n .

Prove that if (P) has an optimal solution, then there are always optimal solutions for (P) and
(D) that are overly complementary.
HINT: Let v be the optimal objective value of (P). For each j = 1, 2, . . . , n , consider

max xj
c0 x ≤ v
(Pj )
Ax = b.
x ≥ 0.

(Pj ) seeks an optimal solution of (P) that has xj positive. Using the dual of (Pj ), show that
if no optimal solution x̂ of (P) has x̂j positive, then there is an optimal solution ŷ of (D) with
cj − ŷ 0 A·j positive. Once you do this you can conclude that, for any fixed j, there are optimal
solutions x̂ and ŷ with the property that exactly one of

cj − ŷ 0 A·j and x̂j is 0 .

Take all of these n pairs of solutions x̂ and ŷ and combine them appropriately to construct
optimal x̂ and ŷ that are overly complementary.

Exercise 5.6 (Another proof of a Theorem of the Alternative)


Prove the Theorem of the Alternative for Linear Inequalities directly from the Farkas Lemma,
without appealing to linear-optimization duality. HINT: Transform (I) of the Theorem of the
Alternative for Linear Inequalities to a system of the form of (I) of the Farkas Lemma.

Exercise 5.7 (A general Theorem of the Alternative)


State and prove a “Theorem of the Alternative” for the system:

AGP xP + AGN xN + AGU xU ≥ bG ;
ALP xP + ALN xN + ALU xU ≤ bL ;                    (I)
AEP xP + AEN xN + AEU xU = bE ;
xP ≥ 0 , xN ≤ 0 .

Exercise 5.8 (Dual ray)


Consider the linear-optimization problem

min c0 x
Ax ≥ b; (P)
x ≥ 0.

a) Suppose that (P) is infeasible. Then, by a ‘Theorem of the Alternative’ there is a solution
to what system?
b) Suppose, further, that the dual (D) of (P) is feasible. Take a feasible solution ŷ of (D) and
a solution ỹ to your system of part (a) and combine them appropriately to prove that (D)
is unbounded.
Chapter 6

Sensitivity Analysis

Our goal in this chapter is as follows:

• Learn how the optimal value of a linear-optimization problem behaves when the right-
hand side vector and objective vector are varied.

6.1 Right-Hand Side Changes

We define a function f : Rm → R via

f (b) := min c0 x
Ax = b; (Pb )
x ≥ 0.

That is, (Pb ) is simply (P) with the optimal objective value viewed as a function of its right-hand
side vector b .


6.1.1 Local analysis

Consider a fixed basis β for (Pb). Associated with that basis is the basic solution x̄β = Aβ−1 b
and the corresponding dual solution ȳ′ = c′β Aβ−1 . Let us assume that ȳ is feasible for the dual
of (Pb) — or, equivalently, c′η − ȳ′Aη ≥ 0′ . Considering the set B of b ∈ Rm such that β is an
optimal basis, it is easy to see that B is just the set of b such that x̄β := Aβ−1 b ≥ 0 . That is,
B ⊂ Rm is the solution set of m linear inequalities (in fact, it is a “simplicial cone” — we will
return to this point in Section 6.1.3). Now, for b ∈ B , we have f(b) = ȳ′b . Therefore, f is a
linear function on b ∈ B . Moreover, as long as b is in the interior of B , we have ∂f/∂bi = ȳi . So we
have that ȳ is the gradient of f , as long as b is in the interior of B . Now what does it mean for
b to be in the interior of B ? It just means that x̄βi > 0 for i = 1, 2, . . . , m .
Let us focus our attention on changes to a single right-hand side element bi . Suppose that
β is an optimal basis of (P) , and consider the problem

min c0 x
Ax = b + ∆i ei ; (Pi )
x ≥ 0,

where ∆i ∈ R . The basis β is feasible (and hence still optimal) for (Pi) if Aβ−1 (b + ∆i ei) ≥ 0 .
Let h^i := Aβ−1 ei . So

[h^1 , h^2 , . . . , h^m] = Aβ−1 .

Then, the condition Aβ−1 (b + ∆i ei) ≥ 0 can be re-expressed as x̄β + ∆i h^i ≥ 0 . It is straightforward
to check that β is feasible (and hence still optimal) for (Pi) as long as ∆i is in the interval
[Li , Ui] , where

Li := max { −x̄βk / h^i_k : k such that h^i_k > 0 } ,

and

Ui := min { −x̄βk / h^i_k : k such that h^i_k < 0 } .

It is worth noting that it can be the case that h^i_k ≤ 0 for all k, in which case we define Li := −∞ ;
and it could be the case that h^i_k ≥ 0 for all k, in which case we define Ui := +∞ .
In summary, for all ∆i satisfying Li ≤ ∆i ≤ Ui , β is an optimal basis of (P) . It is important
to emphasize that this result pertains to changing one right-hand side element and holding all
others constant. For a result on simultaneously changing all right-hand side elements, we refer
to Exercise 6.3.
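A numpy sketch of the formulas for Li and Ui (made-up data; h^i is obtained here by a linear solve rather than by forming Aβ−1 explicitly, and the chosen basis is assumed to be optimal):

import numpy as np

A = np.array([[1., 1., 1., 0.],
              [1., 2., 0., 1.]])
b = np.array([4., 6.])
beta = [1, 2]                                      # assume this basis is optimal (made up)

B = A[:, beta]
x_beta = np.linalg.solve(B, b)

i = 0                                              # vary b_i (0-based index)
h = np.linalg.solve(B, np.eye(2)[:, i])            # h^i = A_beta^{-1} e_i

L = max((-x_beta[k] / h[k] for k in range(2) if h[k] > 1e-12), default=-np.inf)
U = min((-x_beta[k] / h[k] for k in range(2) if h[k] < -1e-12), default=np.inf)
print(f"basis stays feasible (and optimal) for Delta_{i} in [{L}, {U}]")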

6.1.2 Global analysis



The domain of f is the set of b for which (Pb ) has an optimal solution. Assuming that the
dual of (Pb ) is feasible (note that this just means that y 0 A ≤ c0 has a solution), then (Pb ) is never
unbounded. So the domain of f is just the set of b ∈ Rm such that (Pb ) is feasible.

Theorem 6.1
The domain of f is a convex set.

Proof. Suppose that bj is in the domain of f , for j = 1, 2 . Therefore, there exist xj that are
feasible for (Pbj ) , for j = 1, 2 . For any 0 < λ < 1 , let b̂ := λb1 + (1 − λ)b2 , and consider
x̂ := λx1 + (1 − λ)x2 . It is easy to check that x̂ is feasible for (Pb̂ ) , so we can conclude that b̂ is
in the domain of f . t
u

Before going further, we need a few definitions. We consider functions f : Rm → R. The


domain of f is the subset S of Rm on which f is defined. We assume that S is a convex set. A
function f : Rm → R is a convex function on its domain S, if

f (λu1 + (1 − λ)u2 ) ≤ λf (u1 ) + (1 − λ)f (u2 ) ,

for all u1 , u2 ∈ S and 0 < λ < 1 . That is, f is never underestimated by linear interpolation.
A function f : Rm → R is an affine function if it has the form f(u1 , . . . , um) = a0 + a1 u1 + · · · + am um , for constants a0 , a1 , . . . , am ∈ R . If a0 = 0 , then we say that f is a linear function.


Affine (and hence linear) functions are easily seen to be convex.
A function f : Rm → R having a convex set as its domain is a convex piecewise-linear
function if, on its domain, it is the pointwise maximum of a finite number of affine functions.

It would be strange to refer to a function as being “convex piecewise-linear” if it were not convex!
The next result justifies the moniker.

Theorem 6.2
If fˇ is a convex piecewise-linear function, then it is a convex function.

Proof. Let

f̌(u) := max { fi(u) : 1 ≤ i ≤ k } ,

for u in the domain of fˇ , where each fi is an affine function. That is, fˇ is the pointwise maximum
of a finite number (k) of affine functions.

Then, for 0 < λ < 1 and u1 , u2 ∈ Rm ,



f̌(λu1 + (1 − λ)u2) = max_{1≤i≤k} fi(λu1 + (1 − λ)u2)
                    = max_{1≤i≤k} { λfi(u1) + (1 − λ)fi(u2) }     (using the definition of affine)
                    ≤ max_{1≤i≤k} λfi(u1) + max_{1≤i≤k} (1 − λ)fi(u2)
                    = λ max_{1≤i≤k} fi(u1) + (1 − λ) max_{1≤i≤k} fi(u2)
                    = λf̌(u1) + (1 − λ)f̌(u2) .

t
u

Theorem 6.3
f is a convex piecewise-linear function on its domain.

Proof. We refer to the dual

f (b) := max y 0 b
(Db )
y0 A ≤ c0 ;

of (Pb ).
A basis β is feasible or not for (Db ), independent of b . Thinking about it this way, we can
see that

f(b) = max { c′β Aβ−1 b : β is a dual feasible basis } ,

and so f is a convex piecewise-linear function, because it is the pointwise maximum of a finite


number of affine (even linear) functions. t
u
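For a tiny instance, Theorem 6.3 can be checked by brute force: enumerate the dual feasible bases and take the pointwise maximum of the corresponding linear functions ȳ′b. A sketch (made-up data; this enumeration is only sensible for very small m and n):

import itertools
import numpy as np

A = np.array([[1., 1., 1., 0.],
              [1., 2., 0., 1.]])
c = np.array([-1., -2., 0., 0.])
m, n = A.shape

# collect y_bar for every dual feasible basis (i.e., every basis with c_bar >= 0)
duals = []
for beta in itertools.combinations(range(n), m):
    B = A[:, list(beta)]
    if abs(np.linalg.det(B)) < 1e-9:
        continue
    y = np.linalg.solve(B.T, c[list(beta)])
    if np.all(c - y @ A >= -1e-9):
        duals.append(y)

def f(b):
    """f(b) = max over dual feasible bases of y_bar' b (Theorem 6.3)."""
    return max(y @ b for y in duals)

print(f(np.array([4., 6.])), f(np.array([8., 6.])))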

6.1.3 A brief detour: the column geometry for the Simplex Algorithm

In this section, we will describe a geometry for visualizing the Simplex Algorithm.7 The
ordinary geometry for a standard-form problem, in the space of the non-basic variables for
same choice of basis, can be visualized when n − m = 2 or 3. The “column geometry” that we
will describe is in Rm+1 , so it can be visualized when m + 1 = 2 or 3. Note that the graph of
the function f (b) (introduced at the start of this chapter) is also in Rm+1 , which is why we take
the present detour.

We think of the n points

( cj ; Aj ) ∈ Rm+1 ,

for j = 1, 2, . . . , n , and the additional so-called requirement line

{ ( z ; b ) : z ∈ R } .

We think of the first component of these points and of the line as the vertical dimension; so the
requirement line is thought of as vertical. It is of particular interest to think of the cone generated
by the n points. That is,

K := { ( c′x ; Ax ) ∈ Rm+1 : x ≥ 0 } .
Notice how the top coordinate of a point in the cone gives the objective value of the associated x
for (P). So the goal of solving (P) can be thought of as that of finding a point on the intersection
of the requirement line and the cone that is as low as possible.
Restricting ourselves to a basis β , we have the cone

Kβ := { ( c′β xβ ; Aβ xβ ) ∈ Rm+1 : xβ ≥ 0 } .

The cone Kβ is an “m-dimensional simplicial cone.” Next, we observe that if β is a feasible basis,
then Kβ intersects the requirement line uniquely at the point

( c′β x̄β ; Aβ x̄β ) ,

where x̄ is the basic solution associated with β .


In a pivot of the Simplex Algorithm from basis β to basis β̃, we do so with the goal of having
Kβ̃ intersect the requirement line at a lower point than did Kβ . In Figure 6.1 (m = 2 and the
coordinate axes are the red lines), we see an example depicting a single pivot. Kβ is the yellow
cone, intersecting the blue requirement line at the red point. After the pivot (with one cone
generator exchanged), we have the green cone Kβ̃ intersecting the requirement line at the pink
point.
So at each iteration of the Simplex Algorithm, we exchange a single “generator” of the simplicial
cone Kβ associated with our basis β, to descend along the requirement line, ultimately finding
a point of K that meets the requirement at its lowest point.

6.2 Objective Changes

“Here is what is needed for Occupy Wall Street to become a force for change: a clear, and
clearly expressed, objective. Or two.” — Elayne Boosler

Figure 6.1: A simplex pivot

We define a function g : Rn → R via


g(c) := min c0 x
Ax = b ; (Pc )
x ≥ 0.
That is, (Pc ) is simply (P) with the optimal objective value viewed as a function of its objective
vector c .

6.2.1 Local analysis


Consider a fixed basis β for (Pc). Associated with that basis is the basic solution x̄β = Aβ−1 b
and the corresponding dual solution ȳ′ = c′β Aβ−1 . Let us assume that x̄ is feasible for (Pc) — or,
equivalently, Aβ−1 b ≥ 0 . Considering the set C of c ∈ Rn such that β is an optimal basis, it is easy
to see that this is just the set of c such that c′η − c′β Aβ−1 Aη ≥ 0′ . That is, C ⊂ Rn is the solution
set of n − m linear inequalities (in fact, it is a cone). Now, for c ∈ C , we have g(c) = c′β x̄β .
Therefore, g is a linear function on c ∈ C .

6.2.2 Global analysis


The domain of g is the set of c for which (Pc ) has an optimal solution. Assuming that (Pc ) is
feasible, then the domain of g is just the set of c ∈ Rn such that (Pc ) is not unbounded.
Similarly to the case of variations in the right-hand side vector b, we have the following two
results.

Theorem 6.4
The domain of g is a convex set.

A function g : Rn → R is a concave function on its domain S, if

g(λu1 + (1 − λ)u2 ) ≥ λg(u1 ) + (1 − λ)g(u2 ) ,

for all u1 , u2 ∈ S and 0 < λ < 1 . That is, g is never overestimated by linear interpolation.
The function g is a concave piecewise-linear function if it is the pointwise minimum of a finite
number of affine functions.

Theorem 6.5
g is a concave piecewise-linear function on its domain.

6.3 Exercises
Exercise 6.1 (Local sensitivity analysis with Python/Gurobi)
We can easily carry out some local sensitivity analysis with Python/Gurobi. See how this is done
in Production.ipynb (Appendix A.3). Verify the calculations of Python/Gurobi by ‘hand’,
using the ideas and formulas in Section 6.1.1 to make the calculations yourself; you may use
any convenient software (e.g., Python, MATLAB, Mathematica, etc.) to assist you, but only for
doing arithmetic on scalars, vector and matrices.

Exercise 6.2 (Illustrate global sensitivity analysis using Python/Gurobi)


Using Python/Gurobi, make an original example, with at least three constraints, graphing the
objective value of (P ) , as a single b[i] is varied from −∞ to +∞ . As you work on this, bear in
mind Theorem 6.3, using local analysis to identify successive ranges where the optimal value is
linear.
Exercise 6.3 (“I feel that I know the change that is needed.” — Mahatma Gandhi)
We are given 2m numbers satisfying Li ≤ 0 ≤ Ui , i = 1, 2, . . . , m . Let β be an optimal basis for
all of the m problems
min c0 x
Ax = b + ∆i ei ; (Pi )
x ≥ 0,
for all ∆i satisfying Li ≤ ∆i ≤ Ui . Let’s be clear on what this means: For each i individually,
the basis β is optimal when the ith right-hand side component is changed from bi to bi + ∆i , as
long as ∆i is in the interval [Li , Ui ] (see Section 6.1.1).
The point of this problem is to be able to say something about simultaneously changing all
of the bi . Prove that we can simultaneously change bi to

b̃i := bi + λi · (Li or Ui) ,

where λi ≥ 0 , when λ1 + λ2 + · · · + λm ≤ 1 . [Note that in the formula above, for each i we can pick either
Li (a decrease) or Ui (an increase)].

Exercise 6.4 (Domain for objective variations)


Prove Theorem 6.4.

Exercise 6.5 (Concave piecewise-linear function)


Prove Theorem 6.5.
Chapter 7

Large-Scale Linear Optimization

Our goals in this chapter are as follows:

• To see some approaches to large-scale linear-optimization problems

• In particular, to learn about decomposition, Lagrangian relaxation and column generation.

• Also, via a study of the “cutting-stock problem,” we will have a first glimpse at some
issues associated with integer-linear optimization.

7.1 Decomposition


In this section we describe what is usually known as Dantzig-Wolfe Decomposition. It is


an algorithm aimed at efficiently solving certain kinds of structured linear-optimization prob-
lems. The general viewpoint is that we might have a very efficient way to solve a certain type
of structured linear-optimization problem, if it were not for a small number of constraints that
break the structure. For example, the constraint matrix might have the form in Figure 7.1, where
if it were not for the top constraints, the optimization problem would separate into many small
problems8 .
 

Figure 7.1: Nearly separates

7.1.1 The master reformulation

Theorem 7.1 (The Representation Theorem)


Let
min c0 x
Ax = b; (P)
x ≥ 0.
Suppose that (P) has a non-empty feasible region. Let X := {x̂j : j ∈ J } be the set of basic-
feasible solutions of (P), and let Z := {ẑ k : k ∈ K} be the set of basic-feasible rays of (P).
Then the feasible region of (P) is equal to

{ ∑_{j∈J} λj x̂j + ∑_{k∈K} µk ẑk : ∑_{j∈J} λj = 1 ; λj ≥ 0 , j ∈ J ; µk ≥ 0 , k ∈ K } .

Proof. Let S be the feasible region of (P) . Let

S′ := { ∑_{j∈J} λj x̂j + ∑_{k∈K} µk ẑk : ∑_{j∈J} λj = 1 ; λj ≥ 0 , j ∈ J ; µk ≥ 0 , k ∈ K } .

We will demonstrate that S = S′ . It is very easy to check that S′ ⊂ S , and we leave that to the
reader. For the other direction, suppose that x̂ ∈ S , and consider the system

∑_{j∈J} λj x̂j + ∑_{k∈K} µk ẑk = x̂ ;
∑_{j∈J} λj = 1 ;                                  (I)
λj ≥ 0 , j ∈ J ; µk ≥ 0 , k ∈ K .

Keep in mind that in (I), x̂ is fixed, as are the x̂j and the ẑk — the variables are the λj
and the µk . By way of establishing that S ⊂ S′ , suppose that x̂ ∉ S′ — that is, suppose that (I)
/ S 0 — that is, suppose that (I)

has no solution. Applying the Farkas Lemma to (I) , we see that the system

w′x̂ + t > 0 ;
w′x̂j + t ≤ 0 , ∀ j ∈ J ;                          (II)
w′ẑk ≤ 0 , ∀ k ∈ K

has a solution, say ŵ, t̂ . Now, consider the linear-optimization problem

min −ŵ0 x
Ax = b; (P̂)
x ≥ 0.

(P̂) cannot be unbounded, because −ŵ0 ẑ k ≥ 0 , for all k ∈ K . In addition, every basic feasible
solution of (P̂) has objective value at least t̂ . By Theorem 5.1 (the Strong Optimal Basis Theo-
rem), this implies that the optimal value of (P̂) is at least t̂ . But the objective value −ŵ0 x̂ of x̂ is
less than t̂ . Therefore, x̂ cannot be feasible. That is, x̂ ∉ S . □

Corollary 7.2 (The Decomposition Theorem)


Let
min c0 x
Ex ≥ h ;
(Q)
Ax = b ;
x ≥ 0.
Let S := {x ∈ Rn : Ax = b , x ≥ 0} , let X := {x̂j : j ∈ J } be the set of basic-feasible
solutions of S, and let Z := {ẑk : k ∈ K} be the set of basic-feasible rays of S. Then (Q) is
equivalent to the Master Problem
    min  Σ_{j∈J} (c0 x̂j) λj + Σ_{k∈K} (c0 ẑk) µk
         Σ_{j∈J} (E x̂j) λj + Σ_{k∈K} (E ẑk) µk ≥ h ;
         Σ_{j∈J} λj = 1 ;                                                 (M)
         λj ≥ 0 , j ∈ J ; µk ≥ 0 , k ∈ K .

Proof. Using the Representation Theorem, we just substitute the expression

    Σ_{j∈J} λj x̂j + Σ_{k∈K} µk ẑk

for x in c0 x and in Ex ≥ h of (Q), and it is easy to see that (M) is equivalent to (Q). □

Decomposition is typically applied in a way such that the constraints defining S are somehow relatively “nice,” and the constraints Ex ≥ h somehow are “complicating” the situation.
For example, we may have a problem where the overall constraint matrix has the form depicted
in Figure 7.1. In such a scenario, we would let E be the block of linking rows across the top of
Figure 7.1, and A the block-diagonal part below it.

We note that there is nothing special here about the “nice” constraints being “=”, and the
complicating constraints being “≥”. The method, with small modifications, can handle any
types of constraints; we take the particular form that we do for some convenience.

7.1.2 Solution of the Master via the Simplex Algorithm


Next, we describe how to solve (M) using the Simplex Algorithm. Our viewpoint is that we
cannot write out (M) explicitly; there are typically far too many variables. But we can reasonably
maintain a basic solution of (M̄), the standard-form problem obtained from (M) by adding slack
variables for the Ex ≥ 0 constraints, because the number of constraints of (M̄), is just one more
than the number of constraints in Ex ≤ h .
The only part of the Simplex Algorithm that is sensitive to the total number of variables is
the step in which we check whether there is a variable with a negative reduced cost. So rather
than checking this directly, we will find an indirect way to carry it out.
Toward this end, we define dual variables y and σ for (M) .
    min  Σ_{j∈J} (c0 x̂j) λj + Σ_{k∈K} (c0 ẑk) µk                      dual variables
         Σ_{j∈J} (E x̂j) λj + Σ_{k∈K} (E ẑk) µk ≥ h ;                  y ≥ 0
         Σ_{j∈J} λj = 1 ;                                              σ unrestricted      (M)
         λj ≥ 0 , j ∈ J ; µk ≥ 0 , k ∈ K .

While σ is a scalar variable, y is a vector with a component for each row of E .


Using a vector of slack variables s , we obtain the standard-form problem

    min  Σ_{j∈J} (c0 x̂j) λj + Σ_{k∈K} (c0 ẑk) µk
         Σ_{j∈J} (E x̂j) λj + Σ_{k∈K} (E ẑk) µk − Is = h ;
         Σ_{j∈J} λj = 1 ;                                              (M̄)
         λj ≥ 0 , j ∈ J ; µk ≥ 0 , k ∈ K ; s ≥ 0 .

We will temporarily put aside how we calculate values for y and σ , but for now we suppose
that we have a basic partition of (M̄) and an associated dual solution ȳ and σ̄ .

Entering variable. Notice that nonnegativity of the dual variables y in (M) is equivalently
realized in (M̄) via the reduced costs of the slack variables being nonnegative. Therefore, a slack
variable si is eligible to enter the basis if ȳi < 0.
The reduced cost of a variable λj is

    (c0 x̂j) − ȳ0 (E x̂j) − σ̄ = −σ̄ + (c0 − ȳ0 E) x̂j .

It is noteworthy that with the dual solution fixed (at ȳ and σ̄), the reduced cost of λj is a constant
(−σ̄) plus a linear function of x̂j . A variable λj is eligible to enter the basis if its reduced cost
is negative. So we formulate the following optimization problem:

−σ̄ + min (c0 − ȳ 0 E) x


Ax = b; (SUB)
x ≥ 0.

If the “subproblem” (SUB) has an optimal solution, then it has a basic optimal solution — that
is, an x̂j . In such a case, if the optimal objective value of (SUB) is negative, then the λj corre-
sponding to the optimal x̂j is eligible to enter the current basis of (M̄). On the other hand, if
the optimal objective value of (SUB) is non-negative, then we have a proof that no non-basic λj
is eligible to enter the current basis of (M̄).
If (SUB) is unbounded, then (SUB) has a basic feasible ray ẑ k having negative objective 
value. That is, (c0 − ȳ 0 E) ẑ k < 0 . Amazingly, the reduced cost of µk is precisely c0 ẑ k −
ȳ 0 E ẑ k = (c0 − ȳ 0 E) ẑ k , so, in fact, µk is then eligible to enter the current basis of (M̄).

Leaving variable. To determine the choice of leaving variable, let us suppose that B is the
basis matrix for (M̄). Writing (v ; α) for the column vector v stacked above the scalar α, note
that B consists of at least one column of the form (E x̂j ; 1) , and columns of the forms
(E ẑk ; 0) and (−ei ; 0) .
With respect to the current basis, to carry out the ratio test of the Simplex Algorithm, we
simply need

    B−1 (h ; 1)

and:

    B−1 (E x̂j ; 1)   if λj is entering the basis, or
    B−1 (E ẑk ; 0)   if µk is entering the basis, or
    B−1 (−ei ; 0)    if si is entering the basis.

Calculation of basic primal and dual solutions. It is helpful to explain a bit about the
calculation of basic primal and dual solutions. As we have said, B consists of at least one column
of the form (E x̂j ; 1) , and columns of the forms (E ẑk ; 0) and (−ei ; 0) .
So organizing the basic variables λj , µk and si into a vector ζ , with their order appropriately
matched with the columns of B , the vector ζ̄ of values of ζ is precisely the solution of

    Bζ = (h ; 1) .

That is,

    ζ̄ = B−1 (h ; 1) .
Finally, organizing the costs c0 x̂j , c0 ẑ k , and 0 of the basic variables λj , µk and si into a vector ξ ,
with their order appropriately matched with the columns of B , the associated dual solution
(ȳ, σ̄) is precisely the solution of
(y 0 , σ)B = ξ 0 .
That is,
(ȳ 0 , σ̄) = ξ 0 B −1 .

Starting basis. It is not obvious how to construct a feasible starting basis for (M̄); after all,
we may not have at hand any basic feasible solutions and rays of S . Next, we give a simple
recipe. First, we take as x̂1 any basic feasible solution of (P) . Such a solution can be readily
obtained by using our usual (phase-one) methodology of the Simplex Algorithm. Our initial
basic variables are all of the slack variables si and also λ1 , associated with x̂1 . So we have the
initial basis matrix

    B = [ −I   E x̂1 ]
        [ 0′     1  ] .
It is very easy to see that this is an invertible matrix.
It is very important to realize that we have given a recipe for finding an initial basic solution
of (M̄). This basic solution is feasible precisely when x̂1 satisfies the Ex ≥ h constraints. If
this solution is not feasible, then we would introduce an artificial variable and do a phase-
one procedure. Following the methodology of Section 4.4.1, we introduce the single artificial
column

    [ 1 − E x̂1 ]
    [     1     ] ,
with cost 1. We let the artificial variable enter the basis, removing the slack variable that is the
most negative from the basis. This yields a feasible basis for the phase-one problem, with pos-
itive objective value. Now we carry out phase-one of the simplex method, using Decomposition,
minimizing the artificial variable, seeking to drive it down to zero.

A demonstration implementation.

It is not completely trivial to write a small Python/Gurobi code for the Decomposition Algo-
rithm. First of all, we solve the subproblems (SUB) using functionality of Gurobi. Another

point is that rather than carry out the simplex method at a detailed level on (M̄), we just ac-
cumulate all columns of (M̄) that we generate, and always solve linear-optimization problems,
using functionality of Gurobi, with all of the columns generated thus far. In this way, we do not
maintain bases ourselves, and we do not carry out the detailed pivots of the Simplex Algorithm.
Note that the linear-optimization functionality of Gurobi does give us a dual solution, so we do
not compute that ourselves. Our code is in the Jupyter notebook Decomp.ipynb (see Appendix
A.8)
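For readers who want to see the shape of such a code, here is a minimal sketch (it is not the
book's Decomp.ipynb) of the column-generation loop in Python/Gurobi. It assumes the data c, E,
h, A, b are NumPy arrays, that x̂1 is a basic feasible solution of {Ax = b , x ≥ 0} with E x̂1 ≥ h
(so no phase-one is needed), and that every subproblem has an optimal solution (so no ray
columns are generated); all names are illustrative only.

import numpy as np
import gurobipy as gp
from gurobipy import GRB

def dantzig_wolfe(c, E, h, A, b, xhat1, max_iter=100, tol=1e-8):
    master = gp.Model("restricted-master")
    master.Params.OutputFlag = 0
    link = [master.addConstr(gp.LinExpr() >= float(h[i])) for i in range(len(h))]
    convexity = master.addConstr(gp.LinExpr() == 1.0)    # the sum of the lambda's is 1
    def add_point_column(xhat):                          # column of (M) for a point of S
        coeffs = [float(v) for v in E @ xhat] + [1.0]
        master.addVar(lb=0.0, obj=float(c @ xhat),
                      column=gp.Column(coeffs, link + [convexity]))
    add_point_column(xhat1)
    for _ in range(max_iter):
        master.optimize()
        ybar = np.array([r.Pi for r in link])            # duals of the linking rows
        sigma = convexity.Pi                             # dual of the convexity row
        sub = gp.Model("subproblem")                     # min (c' - ybar'E)x : Ax = b, x >= 0
        sub.Params.OutputFlag = 0
        x = sub.addMVar(len(c), lb=0.0)
        sub.addConstr(A @ x == b)
        sub.setObjective((c - E.T @ ybar) @ x, GRB.MINIMIZE)
        sub.optimize()
        if sub.ObjVal - sigma > -tol:                    # no lambda has negative reduced cost
            break
        add_point_column(x.X)                            # price the new extreme point into (M)
    return master

In a fuller implementation one would add the phase-one artificial column described below when
E x̂1 ≥ h fails, and handle an unbounded subproblem by adding the corresponding ray column.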
In Figures 7.2 and 7.3, we see quite good behavior for the Decomposition Algorithm, for
a problem with 100 variables, 200 “complicating” constraints (i.e., rows of E), and 50 “nice”
constraints (i.e., rows of A).

Figure 7.2: Example: Phase-one objective values with Decomposition



Figure 7.3: Example: Phase-two objective values with Decomposition

Convergence and lower bounds. Practically speaking, the convergence behavior of the
Decomposition Algorithm can suffer from a tailing-off effect. That is, while the sequence of
objective values for successive iterates is non-increasing, at some point improvements can be-
come quite small. It would be helpful to know when we already have a very good but possibly
non-optimal solution. If we could rapidly get a good lower bound on z , then we could stop the
Decomposition when the its objective value is close to such a lower bound. Lower bounds on
z can be obtained from feasible solutions to the dual of (Q) . But there is another way, closely
related to the dual of (Q), to rapidly get good lower bounds. We develop this in the next section.

7.2 Lagrangian Relaxation

Again, we consider
z := min c0 x
Ex ≥ h ;
(Q)
Ax = b ;
x ≥ 0,
but our focus now is on efficiently getting a good lower bound on z , with again the view that
we are able to quickly solve many linear-optimization problems having only the constraints:
Ax = b , x ≥ 0 .

7.2.1 Lagrangian bounds


For any fixed choice of ŷ ≥ 0, consider the following “Lagrangian” optimization problem

v(ŷ) := ŷ 0 h + min (c0 − ŷ 0 E)x


Ax = b (Lŷ )
x ≥ 0.

Note that the only variables in the minimization are x , because we consider ŷ to be fixed.

Theorem 7.3
v(ŷ) ≤ z , for all ŷ in the domain of v.

Proof. Let x∗ be an optimal solution for (Q). Clearly x∗ is feasible for (Lŷ ). Therefore

    v(ŷ) ≤ ŷ0 h + (c0 − ŷ0 E) x∗
         = c0 x∗ − ŷ0 (Ex∗ − h)
         ≤ z .

The last inequality uses the fact that x∗ is optimal for (Q), so z = c0 x∗ , and also that Ex∗ ≥ h and
ŷ ≥ 0 . □

From what we learned in studying sensitivity analysis, it can be seen that v is a concave
(piecewise-linear) function on its domain (see Theorem 6.5). Because of this nice behavior, it is
plausible that we could calculate the maximum of v as a means of getting a good lower bound
on z . Before doing that, we examine the precise relationship between primal and dual solutions
of (Q), maximizers of v , and primal and dual solutions of the Lagrangian.

Theorem 7.4
Suppose that x∗ is optimal for (Q) , and suppose that ŷ and π̂ are optimal for the dual of (Q) .
Then x∗ is optimal for (Lŷ ) , π̂ is optimal for the dual of (Lŷ ) , ŷ is a maximizer of v(y) over
y ≥ 0, and the maximum value of v(y) over y ≥ 0 is z .

In the theorem above, we refer to two duals. The dual of (Q) is:

max y0 h + π0 b
y 0 E + π 0 A ≤ c0 ;
y≥0.

The dual of (Lŷ ) is:


ŷ 0 h + max π 0 b
π 0 A ≤ c0 − ŷ 0 E .

Proof. x∗ is clearly feasible for (Lŷ ) . Because ŷ and π̂ are feasible for the dual of (Q) , we have
ŷ ≥ 0, and ŷ 0 E + π̂ 0 A ≤ c0 . The latter implies that π̂ is feasible for the dual of (Lŷ ) .
Using the Strong Duality Theorem for (Q) implies that c0 x∗ = ŷ0 h + π̂0 b . Using that Ex∗ ≥ h
(feasibility of x∗ in (Q)), we then have that (c0 − ŷ 0 E) x∗ ≤ π̂ 0 b . Finally, using the Weak Duality
Theorem for (Lŷ ) , we have that x∗ is optimal for (Lŷ ) and π̂ is optimal for the dual of (Lŷ ) .
Next,

    z ≥ v(ŷ)                          (by Theorem 7.3)
      = ŷ0 h + (c0 − ŷ0 E) x∗          (because x∗ is optimal for (Lŷ))
      = c0 x∗ − ŷ0 (Ex∗ − h)
      = c0 x∗                          (because ŷ0 (Ex∗ − h) = 0 , by complementary slackness for the optimal pair x∗ and (ŷ, π̂))
      = z .

Therefore the inequality holds as an equation, and so ŷ is a maximizer of v and the maximum
value is z . □

Theorem 7.5
Suppose that ŷ is a maximizer of v(y) over y ≥ 0 , and suppose that π̂ is optimal for the dual
of (Lŷ ) . Then ŷ and π̂ are optimal for the dual of (Q) , and the optimal value of (Q) is v(ŷ) .

Proof.

    v(ŷ) = max { v(y) : y ≥ 0 }
         = max { y0 h + min { (c0 − y0 E) x : Ax = b , x ≥ 0 } : y ≥ 0 }
         = max { y0 h + max { π0 b : π0 A ≤ c0 − y0 E } : y ≥ 0 }
         = max { y0 h + π0 b : y0 E + π0 A ≤ c0 , y ≥ 0 }
         = z .
The third equation follows from taking the dual of the inner (minimization) problem. The last
equation follows from seeing that the final maximization (over y ≥ 0 and π simultaneously) is
just the dual of (Q).
So, we have established that the optimal value z of (Q) is v(ŷ) . Looking a bit more closely,
we have established that z = ŷ0 h + π̂0 b , and because π̂0 A ≤ c0 − ŷ0 E and ŷ ≥ 0 , we have that ŷ
and π̂ are feasible, and hence optimal, for the dual of (Q) . □

Note that the conclusion of Theorem 7.5 gives us an optimal ŷ and π̂ for the dual of (Q), but
not an optimal x∗ for (Q) itself.

7.2.2 Solving the Lagrangian Dual


Theorem 7.3 gives us a simple way to calculate a lower bound on z , by solving a potentially
much-easier linear-optimization problem. But the bound depends on the choice of ŷ ≥ 0 . Can
we find the best such ŷ ? This would entail solving the so-called Lagrangian Dual problem of
maximizing v(y) over all y ≥ 0 in the domain of v . It should seem that there is hope for doing
this — because v is a concave function. But v is not a smooth function (it is piecewise linear),
so we cannot rely on calculus-based techniques.

Theorem 7.6
Suppose that we fix ŷ , and solve for v(ŷ) . Let x̂ be the solution of (Lŷ ) . Let γ̂ := h − E x̂ .
Then
v(ỹ) ≤ v(ŷ) + (ỹ − ŷ)0 γ̂ ,
for all ỹ in the domain of v .

Proof.
v(ŷ) + (ỹ − ŷ)0 γ̂ = ŷ 0 h + (c0 − ŷ 0 E)x̂ + (ỹ − ŷ)0 (h − E x̂)
= ỹ 0 h + (c0 − ỹ 0 E)x̂
≥ v(ỹ) .
The inequality follows from the fact that x̂ is feasible (but possibly not optimal) for (Lỹ ). □

Subgradient. What is v(ŷ) + (ỹ − ŷ)0 γ̂ ? It is a linear estimation of v(ỹ) starting from the
actual value of v at ŷ . The direction ỹ − ŷ is what we add to ŷ to move to ỹ . The choice of
γ̂ := h − E x̂ is made so that Theorem 7.6 holds. That is, γ̂ is chosen in such a way that the linear
estimation is always an upper bound on the value v(ỹ) of the function, for all ỹ in the domain of
v . The nice property of γ̂ demonstrated with Theorem 7.6 has a name: we say that γ̂ := h − E x̂
is a subgradient of (the concave function) v at ŷ (because it satisfies the inequality of Theorem
7.6).

Subgradient Optimization. Next, we describe a simple “Projected Subgradient Optimiza-


tion Algorithm” for solving the Lagrangian Dual. The general idea is to iteratively move in the
direction of a subgradient.

Projected Subgradient Optimization Algorithm

0. Start with any ŷ 1 ∈ Rm . Let k := 1 .


1. Solve (Lŷk ) to get x̂k .
2. Calculate the subgradient γ̂ k := h − E x̂k .
3. Let ŷ k+1 ← ProjRm+ (ŷ k + λk γ̂ k ) , where λk > 0 is a step size.

4. Let k ← k + 1 , and GOTO 1.

Above, ProjRm+ (·) means project onto the nonnegative orthant Rm+ . That is, we take the closest
point (in Euclidean norm) to the argument of the function. In fact, this means just zeroing-out
the negative entries.
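As a concrete illustration, here is a minimal Python/Gurobi sketch of the method (it is not the
book's SubgradProj.ipynb). The harmonic step sizes λk = 1/(1 + k) anticipate the discussion
below, and we assume that each Lagrangian subproblem has an optimal solution; the names are
made up for the sketch.

import numpy as np
import gurobipy as gp
from gurobipy import GRB

def projected_subgradient(c, E, h, A, b, iters=200, step=lambda k: 1.0 / (1.0 + k)):
    sub = gp.Model("Lagrangian")                     # the feasible region Ax = b, x >= 0 never changes
    sub.Params.OutputFlag = 0
    x = sub.addMVar(len(c), lb=0.0)
    sub.addConstr(A @ x == b)
    y = np.zeros(len(h))                             # start at yhat^1 = 0 (any nonnegative point works)
    best = -np.inf
    for k in range(iters):
        sub.setObjective((c - E.T @ y) @ x, GRB.MINIMIZE)
        sub.optimize()                               # solve (L_y)
        best = max(best, float(y @ h) + sub.ObjVal)  # v(y) = y'h + min (c' - y'E)x
        gamma = h - E @ x.X                          # subgradient of v at y
        y = np.maximum(y + step(k) * gamma, 0.0)     # step, then project onto the nonnegative orthant
    return best, y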

Convergence. We have neglected, thus far, to fully specify the Subgradient Optimization
Algorithm. We can stop if, at some iteration k , we have γ̂ k = 0 (or, more generally, if ŷ k =
ProjRm+ (ŷ k + λk γ̂ k )), because the algorithm will make no further progress if this happens, and

indeed we will have found that ŷ k is a maximizer of v(y) over y ≥ 0 . But this is actually very
unlikely to happen. In practice, we may stop if k reaches some pre-specified iteration limit, or
if after many iterations, v is barely increasing.
We are interested in mathematically analyzing the convergence behavior of the algorithm,
letting the algorithm iterate infinitely. We will see that the method converges (in a certain
sense) if we take a sequence of step sizes λk > 0 that tends to zero, but not too quickly. Specifically, we
will require that Σ_{k=1}^{∞} λk² < +∞ and Σ_{k=1}^{∞} λk = +∞ . That is, “square summable, but not
summable.” For example, taking λk := α/(β + k) , with α > 0 and β ≥ 0 , we get a sequence
of step sizes satisfying this property; in particular, for α = 1 and β = 0 we have the harmonic
series Σ_{k=1}^{∞} 1/k , whose partial sums satisfy ln(k + 1) < Σ_{i=1}^{k} 1/i < ln(k) + 1 , and Σ_{k=1}^{∞} 1/k² = π²/6 .
To prove convergence of the algorithm, we must first establish a key technical lemma.

Lemma 7.7
Let y∗ be any maximizer of v(y) over y ≥ 0 . Suppose that λk > 0 , for all k . Then

    ‖y∗ − ŷ^{k+1}‖² − ‖y∗ − ŷ^1‖² ≤ Σ_{i=1}^{k} λi² ‖γ̂^i‖² − 2 Σ_{i=1}^{k} λi ( v(y∗) − v(ŷ^i) ) .

Proof. Let w^{k+1} := ŷ^k + λk γ̂^k ; that is, the unprojected (k + 1)-st iterate. For k ≥ 1 , we have

    ‖y∗ − ŷ^{k+1}‖² − ‖y∗ − ŷ^k‖²
      ≤ ‖y∗ − w^{k+1}‖² − ‖y∗ − ŷ^k‖²
      = ‖(y∗ − ŷ^k) − λk γ̂^k‖² − ‖y∗ − ŷ^k‖²
      = λk² ‖γ̂^k‖² − 2λk (y∗ − ŷ^k)0 γ̂^k
      ≤ λk² ‖γ̂^k‖² − 2λk ( v(y∗) − v(ŷ^k) ) .

The first inequality uses the fact that the projection of a point onto a convex set is no further from
any point in that convex set than the unprojected point. The final inequality uses the assumption
that λk > 0 and the subgradient inequality

    v(ỹ) ≤ v(ŷ^k) + (ỹ − ŷ^k)0 γ̂^k ,

plugging in y∗ for ỹ . Finally, adding up the established inequality over k yields the result. □
Now, let

    vk∗ := max_{i=1,...,k} v(ŷ^i) , for k = 1, 2, . . .

That is, vk∗ is the best value seen up through the k-th iteration.

Theorem 7.8 (“Square summable, but not summable” convergence)


Let y∗ be any maximizer of v(y) over y ≥ 0 . Assume that we take a basic solution as the
solution of each Lagrangian subproblem. Suppose that λk > 0 , for all k . Suppose further
that Σ_{k=1}^{∞} λk² < +∞ and Σ_{k=1}^{∞} λk = +∞ . Then limk→∞ vk∗ = v(y∗) .

Proof. Because ‖y∗ − ŷ^{k+1}‖² is non-negative, the inequality in the statement of Lemma 7.7 gives

    2 Σ_{i=1}^{k} λi ( v(y∗) − v(ŷ^i) ) ≤ ‖y∗ − ŷ^1‖² + Σ_{i=1}^{k} λi² ‖γ̂^i‖² .

Because vk∗ ≥ v(ŷ^i) for all i ≤ k , we then have

    2 ( Σ_{i=1}^{k} λi ) ( v(y∗) − vk∗ ) ≤ ‖y∗ − ŷ^1‖² + Σ_{i=1}^{k} λi² ‖γ̂^i‖² ,

or

    v(y∗) − vk∗ ≤ ( ‖y∗ − ŷ^1‖² + Σ_{i=1}^{k} λi² ‖γ̂^i‖² ) / ( 2 Σ_{i=1}^{k} λi ) .

Next, we observe that ‖γ̂^i‖² is bounded by some constant Γ , independent of i , because our
algorithm takes γ̂ := h − E x̂ , where x̂ is a basic solution of a Lagrangian subproblem. There
are only a finite number of bases. Therefore, we can take

    Γ := max { ‖h − E x̂‖² : x̂ is a basic solution of Ax = b , x ≥ 0 } .

So, we have

    v(y∗) − vk∗ ≤ ( ‖y∗ − ŷ^1‖² + Γ Σ_{i=1}^{k} λi² ) / ( 2 Σ_{i=1}^{k} λi ) .

Now, we get our result by observing that ‖y∗ − ŷ^1‖² is a constant, Σ_{i=1}^{k} λi² converges to a
constant, and Σ_{i=1}^{k} λi goes to +∞ (as k increases without limit), and so the right-hand side of
the final inequality converges to zero. The result follows. □

A simple implementation. It is very easy to write a small Gurobi/Python code for Subgra-
dient Optimization. Our code is in the Jupyter notebook SubgradProj.ipynb (see Appendix
A.9). Typical behavior is a very bad first iteration, then some iterations to recover from that,
and then a slow and steady convergence to an optimum. The method is usually stopped after
a predetermined number of iterations or after progress becomes very slow. In Figure 7.4, we
see this typical behavior, for a problem with 100 variables, 200 “complicating” constraints (i.e.,
rows of E), and 50 “nice” constraints (i.e., rows of A).

Figure 7.4: Example: Projected subgradient optimization with harmonic step sizes

Practical steps. Practically speaking, in order to get a ŷ with a reasonably high value of v(ŷ) ,
it can be better to choose a sequence of λk that depends on a “good guess” of the optimal value
of v(ŷ), taking bigger steps when one is far away, and smaller steps when one is close (try to
develop this idea in Exercise 7.3). A further idea is to take shorter steps when the subgradient has
a big norm. With these ideas, we can achieve faster practical convergence of the algorithm; see
Figure 7.5.

Dual estimation. From Theorem 7.5, we see that the Subgradient Optimization Method is a
way to try and quickly find an estimate of an optimal solution to the dual of (Q). At each step,
ŷ together with the π̂ that is optimal for the dual of (Lŷ ) give a feasible solution of the dual of (Q) with
objective value v(ŷ). But note that we give something up — we do not get an x∗ that solves (Q)
from a ŷ that maximizes v and a π̂ that is optimal for the dual of (Lŷ ) . There is no guarantee

Figure 7.5: Example: Projected subgradient optimization with better step sizes

that an x̂ that is optimal for (Lŷ ) will be feasible for (Q) .



7.3 The Cutting-Stock Problem

The cutting-stock problem is a nice concrete topic at this point. We will develop a technique
for it, using column generation, but the context is different than for decomposition. Moreover,
the topic is a nice segue into integer linear optimization — the topic of the next chapter.
The story is as follows. We have stock rolls of some type of paper of (integer) width W .
But we encounter (integer) demand di for rolls of (integer) width wi < W , for i = 1, 2, . . . , m .
The cutting-stock problem is to find a plan for satisfying demand, using as few stock rolls as
possible.9

7.3.1 Formulation via cutting patterns


There are several different ways to formulate the cutting-stock problem mathematically. A par-
ticularly useful way is based on a consideration of the problem from the point of view of the
worker who has to adjust the cutting machine. What she dearly hopes for is that a plan can be
formulated that does not require that the machine be adjusted (for different cutting patterns)
too many times. That is, she hopes that there are a relatively small number of ways that will be
utilized for cutting a stock roll, and that these good ways can each be repeated many times.
With this idea in mind, we define a cutting pattern to be a solution of
    Σ_{i=1}^{m} wi ai ≤ W ;
ai ≥ 0 integer, i = 1, . . . , m ,
where ai is the number of pieces of width wi that the pattern yields.
Conceptually, we could form a matrix A with m rows, and an enormous number of columns,
where each column is a distinct pattern. Then, letting xj be the number of times that we use
pattern Aj , we can conceptually formulate the cutting-stock problem as
    z := min Σ_j xj
         Σ_j Aj xj ≥ d ;                         (CSP)
         xj ≥ 0 integer, ∀ j .

7.3.2 Solution via continuous relaxation


Our approach to getting a good solution to (CSP) is to solve its continuous relaxation and then
round. Toward this end, we subtract surplus variables and consider the linear-optimization
problem

    z̄ := min Σ_j xj
         Σ_j Aj xj − t = d ;
         xj ≥ 0 , ∀ j ;                          (C̄SP)
         t ≥ 0 .

We endeavor to compute a basic optimum (x̄, t̄) of (C̄SP) . Because of the nature of the formulation, we
can see that ⌈x̄⌉ is feasible for (CSP). Moreover, we have produced a solution using 1′⌈x̄⌉ stock
rolls, and we can give an a priori bound on its quality. Specifically, as we will see in the next
theorem, the solution that we obtain wastes at most m − 1 stock rolls, in comparison with an
optimal solution. Moreover, we have a practically-computable bound on the number of wasted
rolls, which is no worse than the worst-case bound of m − 1 . That is, our waste is at worst
1′⌈x̄⌉ − ⌈z̄⌉ .

Theorem 7.9

    ⌈z̄⌉ ≤ z ≤ 1′⌈x̄⌉ ≤ ⌈z̄⌉ + (m − 1) .

Proof. Because (C̄SP) is a relaxation of (CSP) and because z is an integer, we have ⌈z̄⌉ ≤ z .
Because ⌈x̄⌉ is a feasible solution of (CSP), we have z ≤ 1′⌈x̄⌉ . Now, 1′⌈x̄⌉ = Σ_{i=1}^{m} (x̄βi + fi) ,
with each fi < 1 . But Σ_{i=1}^{m} (x̄βi + fi) = 1′x̄ + Σ_{i=1}^{m} fi ≤ ⌈1′x̄⌉ + Σ_{i=1}^{m} fi . Therefore,
1′⌈x̄⌉ ≤ ⌈1′x̄⌉ + Σ_{i=1}^{m} fi . Now the left-hand side of this last inequality is an integer, so we may
round down the right-hand side; noting that ⌈1′x̄⌉ = ⌈z̄⌉ (because 1′x̄ = z̄) and that Σ_{i=1}^{m} fi < m ,
we can conclude that 1′⌈x̄⌉ ≤ ⌈z̄⌉ + (m − 1) . □

7.3.3 The knapsack subproblem


Toward describing how we can solve (C̄SP) by the Simplex Algorithm, we introduce a vector
y ∈ Rm of dual variables.

    z̄ := min Σ_j xj                              dual variables
         Σ_j Aj xj − t = d ;                      y
         xj ≥ 0 , ∀ j ;                           (C̄SP)
         t ≥ 0 .

We suppose that we have a feasible basis of (C̄SP) and that we have, at hand, the associated
dual solution ȳ . For each i , 1 ≤ i ≤ m , the reduced cost of ti is simply ȳi . Therefore, if ȳi < 0 ,
then ti is eligible to enter the basis.
So, moving forward, we may assume that ȳi ≥ 0 for all i . We now want to examine the
reduced cost of an xj variable. The reduced cost is simply

    1 − ȳ0 Aj = 1 − Σ_{i=1}^{m} ȳi aij .

The variable xj is then eligible to enter the basis if 1 − Σ_{i=1}^{m} ȳi aij < 0 . Therefore, to check
whether there is some column xj with negative reduced cost, we can solve the so-called knapsack problem

    max Σ_{i=1}^{m} ȳi ai
        Σ_{i=1}^{m} wi ai ≤ W ;
        ai ≥ 0 integer, i = 1, . . . , m ,

and check whether the optimal value is greater than one. If it is, then the new variable that we
associate with this solution pattern (i.e., column of the constraint matrix) is eligible to enter the
basis.

Our algorithmic approach for the knapsack problem is via recursive optimization (known
popularly as dynamic programming10 ). We will solve this problem for all positive integers up
through W . That is, we will solve
    f(s) := max Σ_{i=1}^{m} ȳi ai
                Σ_{i=1}^{m} wi ai ≤ s ;
                ai ≥ 0 integer, i = 1, . . . , m ,

starting with f(s) = 0 , for 0 ≤ s < min_{i=1,...,m} {wi} , and proceeding from s = min_{i=1,...,m} {wi} − 1 by
incrementing the argument of f by 1 at each step. Then, we have the recursion

    f(s) = max_{i : wi ≤ s} { ȳi + f(s − wi) } , for s ≥ min_{i=1,...,m} {wi} .

It is important to note that we can always calculate f (s) provided that we have already calcu-
lated f (s0 ) for all s0 < s . Why does this work? It follows from a very simple observation: If
we have optimally filled a knapsack of capacity s and we remove any item i, then what remains
optimally fills a knapsack of capacity s − wi . If there were a better way to fill the knapsack of
capacity s − wi , then we could take such a way, replace the item i , and we would have found
a better way to fill a knapsack of capacity s . Of course, we do not know even a single item that
we can be sure is in an optimally filled knapsack of capacity s , and this is why in the recursion,
we maximize over all items that can fit in (i.e., i : wi ≤ s).
The recursion appears to calculate the value of f (s) , but it is not immediate how to recover
optimal values of the ai . Actually, this is rather easy.

Recover the Solution of a Knapsack Problem

0. Let s := W , and let ai := 0 , for i = 1, . . . , m .


1. While (s > 0)
(a) Find ı̂ : f (s) = ȳı̂ + f (s − wı̂ ).
(b) Let aı̂ := aı̂ + 1 .
(c) Let s := s − wı̂ .
2. Return ai , for i = 1, . . . , m .
Note that in Step 1.a, there must be such an ı̂ , by virtue of the recursive formula for calcu-
lating f (s) . In fact, if we like, we can save an appropriate ı̂ associated with each s at the time
that we calculate f (s) .
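A compact Python rendering of this recursion, together with the recovery loop just described,
might look as follows. This is a sketch only, with made-up names; it assumes that the wi are
positive integers, that the ȳi are nonnegative, and that W is an integer.

def solve_knapsack(ybar, w, W):
    m = len(w)
    f = [0.0] * (W + 1)           # f[s] = best value for a knapsack of capacity s
    pick = [None] * (W + 1)       # an item attaining the max in the recursion at capacity s
    for s in range(min(w), W + 1):
        for i in range(m):
            if w[i] <= s and ybar[i] + f[s - w[i]] > f[s]:
                f[s] = ybar[i] + f[s - w[i]]
                pick[s] = i
    a = [0] * m                   # recover an optimal pattern by following the saved labels
    s = W
    while s > 0 and pick[s] is not None:
        a[pick[s]] += 1
        s -= w[pick[s]]
    return f[W], a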

7.3.4 Applying the Simplex Algorithm


An initial feasible basis. It is easy to get an initial feasible basis. We just consider the m
patterns Ai := ⌊W/wi⌋ ei , for i = 1, 2, . . . , m . The values of the m basic variables associated
with the basis of these patterns are x̄i = di / ⌊W/wi⌋ , which are clearly non-negative.

Basic solutions: dual and primal. At any iteration, the basis matrix B has some columns
corresponding to patterns and possibly other columns for ti variables. The column correspond-
ing to ti is −ei .
Organizing the basic variables xj and ti into a vector ζ , with their order appropriately
matched with the columns of B , the vector ζ̄ of values of ζ is precisely the solution of

Bζ = d .

That is,
ζ̄ = B −1 d .
The cost of an xj is 1, while the cost of a ti is 0. Organizing the costs of the basic variables
into a vector ξ , with their order appropriately matched with the columns of B , the associated
dual solution ȳ is precisely the solution of

y0 B = ξ0 .

That is,
ȳ 0 = ξ 0 B −1 .

7.3.5 A demonstration implementation


We can use Python/Gurobi, in a somewhat sophisticated manner, to implement our algorithm
for the cutting-stock problem. As we did for the Decomposition Algorithm, rather than carry
out the simplex method at a detailed level on (C̄SP), we just accumulate all columns of (C̄SP)
that we generate, and always solve linear-optimization problems, using functionality of Gurobi,
with all of the columns generated thus far. In this way, we do not maintain bases ourselves,
and we do not carry out the detailed pivots of the Simplex Algorithm. Note that the linear-
optimization functionality of Gurobi does give us a dual solution, so we do not compute that
ourselves. Our full code is in the Jupyter notebook CSP.ipynb (see Appendix A.10).
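As an indication of the overall structure, here is a minimal Python/Gurobi sketch of the
pattern-generation loop (again, it is not the actual CSP.ipynb). It starts from the patterns
⌊W/wi⌋ ei , states the demand rows as “≥” constraints so that the solver handles the surplus
variables, and uses a knapsack routine such as the solve_knapsack sketched above for pricing;
the names and structure are illustrative assumptions.

import gurobipy as gp

def cutting_stock_lp(d, w, W, tol=1e-8):
    m = len(d)
    lp = gp.Model("CSP-relaxation")
    lp.Params.OutputFlag = 0
    rows = [lp.addConstr(gp.LinExpr() >= float(d[i])) for i in range(m)]
    def add_pattern(a):                              # one variable per pattern, each with cost 1
        nz = [i for i in range(m) if a[i] > 0]
        lp.addVar(lb=0.0, obj=1.0,
                  column=gp.Column([float(a[i]) for i in nz], [rows[i] for i in nz]))
    for i in range(m):                               # starting patterns floor(W/w_i) * e_i
        add_pattern([W // w[i] if j == i else 0 for j in range(m)])
    while True:
        lp.optimize()
        ybar = [rows[i].Pi for i in range(m)]
        value, a = solve_knapsack(ybar, w, W)        # pricing: most attractive pattern
        if value <= 1.0 + tol:                       # every pattern has nonnegative reduced cost
            return lp
        add_pattern(a)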
On the example provided, our algorithm gives a lower bound of 1378 on the minimum num-
ber of stock rolls needed to cover demand, and it gives us an upper bound (feasible solution)
of 1380.

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0.]
[0. 2. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 4. 0.]
[0. 0. 0. 0. 3.]]
***** x:
x[ 0 ]= 205.0
x[ 1 ]= 1160.5
x[ 2 ]= 71.5
x[ 3 ]= 272.25
x[ 4 ]= 39.0
***** y': [1. 0.5 0.5 0.25 0.3333]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.5
***** DP Knap objval: 1.5
***** Column: [1. 1. 0. 0. 0.]

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0. 1.]
[0. 2. 0. 0. 0. 1.]
[0. 0. 2. 0. 0. 0.]
[0. 0. 0. 4. 0. 0.]
[0. 0. 0. 0. 3. 0.]]
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 1058.0
x[ 2 ]= 71.5
x[ 3 ]= 272.25
x[ 4 ]= 39.0
x[ 5 ]= 205.0
***** y': [0.5 0.5 0.5 0.25 0.3333]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.25
***** DP Knap objval: 1.25
***** Column: [0. 2. 0. 1. 0.]

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0. 1. 0.]
[0. 2. 0. 0. 0. 1. 2.]
[0. 0. 2. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1.]
[0. 0. 0. 0. 3. 0. 0.]]
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 71.5
x[ 3 ]= 7.75
x[ 4 ]= 39.0
x[ 5 ]= 205.0
x[ 6 ]= 1058.0
***** y': [0.625 0.375 0.5 0.25 0.3333]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.0833333333333333
***** DP Knap objval: 1.0833333333333333
***** Column: [0. 0. 0. 3. 1.]

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0. 1. 0. 0.]
[0. 2. 0. 0. 0. 1. 2. 0.]
[0. 0. 2. 0. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1. 3.]
[0. 0. 0. 0. 3. 0. 0. 1.]]
***** x:

x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 71.5
x[ 3 ]= 0.0
x[ 4 ]= 35.5556
x[ 5 ]= 205.0
x[ 6 ]= 1058.0
x[ 7 ]= 10.3333
***** y': [0.6111 0.3889 0.5 0.2222 0.3333]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.0555555555555556
***** DP Knap objval: 1.0555555555555556
***** Column: [0. 1. 0. 0. 2.]

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 2. 0. 0. 0. 1. 2. 0. 1.]
[0. 0. 2. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1. 3. 0.]
[0. 0. 0. 0. 3. 0. 0. 1. 2.]]
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 71.5
x[ 3 ]= 0.0
x[ 4 ]= 0.0
x[ 5 ]= 205.0
x[ 6 ]= 1033.3846
x[ 7 ]= 18.5385
x[ 8 ]= 49.2308
***** y': [0.6154 0.3846 0.5 0.2308 0.3077]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.0
***** DP Knap objval: 1.0
***** No more improving columns
***** Pattern generation complete. Main LP solved to optimality.
***** Total number of patterns generated: 9
***** A:
[[1. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 2. 0. 0. 0. 1. 2. 0. 1.]
[0. 0. 2. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1. 3. 0.]
[0. 0. 0. 0. 3. 0. 0. 1. 2.]]
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 71.5
x[ 3 ]= 0.0
x[ 4 ]= 0.0

x[ 5 ]= 205.0
x[ 6 ]= 1033.3846
x[ 7 ]= 18.5385
x[ 8 ]= 49.2308
***** Optimal LP objective value: 1377.6538461538462
***** rounds up to: 1378.0 (lower bound on rolls needed)
***** x rounded up:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 72.0
x[ 3 ]= 0.0
x[ 4 ]= 0.0
x[ 5 ]= 205.0
x[ 6 ]= 1034.0
x[ 7 ]= 19.0
x[ 8 ]= 50.0
***** Number of rolls used: 1380.0

By solving a further integer-linear optimization problem to determine the best way to cover
demand using all patterns generated in the course of our algorithm, we improve the upper
bound to 1379.

***** Now solve the ILP over all patterns generated to try and get a better soution...
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 72.0
x[ 3 ]= 1.0
x[ 4 ]= 1.0
x[ 5 ]= 205.0
x[ 6 ]= 1034.0
x[ 7 ]= 17.0
x[ 8 ]= 49.0
***** Number of rolls used: 1379.0

It remains unknown whether the optimal value for this instance is 1378 or 1379.

7.4 Exercises
Exercise 7.1 (Dual solutions)
Refer to (Q) and (M) defined in the Decomposition Theorem (i.e., Corollary 7.2). What is the
relationship between optimal dual solutions of (Q) and (M) ?

Exercise 7.2 (Lagrangian value function)


Using Theorem 6.5, prove that v (from Section 7.2.1) is a concave piecewise-linear function on
its domain.
Exercise 7.3 (Play with subgradient optimization)
Play with the Gurobi/Python code in the Jupyter notebook SubgradProj.ipynb (see Appendix
A.9). Try bigger examples. Try different ideas for the step size, with the goal of gaining faster
convergence — be a real engineer and think ‘outside of the box’ (you can use any information
you like: e.g., the current subgradient γ̂ k , the current function value v(ŷ k ), an estimate v̄ of the
maximum value of v, etc.).

Exercise 7.4 (Cutting it closer to reality)


Real cutting machines may have a limited number, say K , of blades. This means that we can cut
at most K + 1 pieces for patterns that leave no scrap (i.e., Σ_{i=1}^{m} wi ai = W ⇒ Σ_{i=1}^{m} ai ≤ K + 1)
and at most K pieces for patterns that leave scrap (i.e., Σ_{i=1}^{m} wi ai < W ⇒ Σ_{i=1}^{m} ai ≤ K).
Describe how to modify our algorithm for the cutting-stock problem to account for this. Modify
CSP.ipynb that I provided (see Appendix A.10) to try this out.

Exercise 7.5 (Another kind of question)


Print is dying, right? Why should we care about the cutting-stock problem?
Chapter 8

Integer-Linear Optimization

Our goals in this chapter are as follows:


• to develop some elementary facility with modeling using integer variables;
• to learn how to recognize when we can expect solutions of linear-optimization problems
to be integer automatically;
• to learn the fundamentals of the ideas that most solvers employ to handle integer vari-
ables;
• to learn something about solver-aware modeling in the context of integer variables.

8.1 Integrality for Free


8.1.1 Some structured models
Network-flow problem. Recapitulating a bit from Section 2.3, a finite network G is de-
scribed by a finite set of nodes N and a finite set A of arcs. Each arc e has two key attributes,
namely its tail t(e) ∈ N and its head h(e) ∈ N , both nodes. We think of a single commodity
as being allowed to “flow” along each arc, from its tail to its head. Indeed, we have “flow”
variables
xe := amount of flow on arc e ,
for e ∈ A . Formally, a flow x̂ on G is simply an assignment of any real numbers x̂e to the
variables xe , for all e ∈ A . We assume that the flow on arc e should be non-negative and
should not exceed
ue := the flow upper bound on arc e ,
for e ∈ A . Associated with each arc e is a cost

ce := cost per-unit-flow on arc e ,


for e ∈ A . The (total) cost of the flow x̂ is defined to be

    Σ_{e∈A} ce x̂e .

We assume that we have further data for the nodes. Namely,


bv := the net supply at node v ,
for v ∈ N . A flow is conservative if the net flow out of node v , minus the net flow into node
v , is equal to the net supply at node v , for all nodes v ∈ N .
The single-commodity min-cost network-flow problem is to find a minimum-cost conser-
vative flow that is non-negative and respects the flow upper bounds on the arcs. This is the
K = 1 commodity version of the multi-commodity min-cost network-flow problem from Sec-
tion 2.3.
We can formulate the single-commodity min-cost network-flow problem as follows:
    min Σ_{e∈A} ce xe
        Σ_{e∈A : t(e)=v} xe − Σ_{e∈A : h(e)=v} xe = bv , ∀ v ∈ N ;
        0 ≤ xe ≤ ue , ∀ e ∈ A .
As we have stated this, it is just a structured linear-optimization problem. But there are many
situations where the given net supplies at the nodes and the given flow capacities on the arcs
are integer, and we wish to constrain the flow variables to be integers.
We will see that it is useful to think of the network-flow problem in matrix-vector language.
We define the network matrix of G to be a matrix A having rows indexed from N , columns
indexed from A , and entries

    ave :=  +1 , if v = t(e) ;
            −1 , if v = h(e) ;
             0 , if v ∉ {t(e), h(e)} ,
for v ∈ N , e ∈ A . With this notation, and organizing the bv in a column-vector indexed
accordingly with the rows of A , and organizing the ce , xe and ue as three column-vectors
indexed accordingly with the columns of A , we can rewrite the network-flow formulation as
min c0 x
Ax = b;
x ≤ u;
x ≥ 0.
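A tiny Python sketch of building the network matrix A from lists of nodes and arcs (each arc
given as a (tail, head) pair) may help fix the sign convention; the example data are made up.

import numpy as np

def network_matrix(nodes, arcs):
    idx = {v: i for i, v in enumerate(nodes)}
    A = np.zeros((len(nodes), len(arcs)))
    for e, (t, h) in enumerate(arcs):
        A[idx[t], e] = 1.0       # +1 in the row of the tail of arc e
        A[idx[h], e] = -1.0      # -1 in the row of the head of arc e
    return A

# Example: a 3-node network with arcs (1,2), (2,3), (1,3).
print(network_matrix([1, 2, 3], [(1, 2), (2, 3), (1, 3)]))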

Assignment problem on a graph.



A finite bipartite graph G is described by two finite sets of vertices V1 and V2 , and a set
E of edges, each one of which is an ordered pair of the form (i, j) with i ∈ V1 and j ∈ V2 . A
perfect matching M of G is a subset of E such that each vertex of the graph meets exactly one
edge in M . We assume that there are given edge weights

cij := for (i, j) ∈ E ,

and our goal is to find a perfect matching that has minimum (total) weight.
We can define

xij := indicator variable for choosing edge (i, j) to be in M ,

for all (i, j) ∈ E . Then we can model the problem of finding a perfect matching of G having
minimum weight via the formulation:
    min Σ_{(i,j)∈E} cij xij
        Σ_{j∈V2 : (i,j)∈E} xij = 1 , ∀ i ∈ V1 ;
        Σ_{i∈V1 : (i,j)∈E} xij = 1 , ∀ j ∈ V2 ;
        xij ∈ {0, 1} , ∀ (i, j) ∈ E .


It will be useful to think of this assignment-problem formulation in matrix-vector language.
We define the vertex-edge incidence matrix of the bipartite graph G to be a matrix A having
rows indexed from V1 ∪ V2 , columns indexed from E , and entries

    av,(i,j) :=  1 , if v = i or v = j ;
                 0 , otherwise,

for v ∈ V1 ∪ V2 , (i, j) ∈ E . With this notation, and organizing the cij and xij as column-
vectors indexed accordingly with the columns of A , we can rewrite the assignment-problem
formulation as
min c0 x
Ax = 1 ;
x ∈ {0, 1}E .

Staffing problem. In this problem, we have discrete time periods numbered 1, 2, . . . , m , and
we are given

bi := the minimum number of workers required at time period i ,

for each i = 1, 2, . . . , m . Additionally, there is an allowable set of “shifts.” An allowable shift


is simply a given collection of time periods that a worker is allowed to staff. It may well be that
not all subsets of {1, 2, . . . , m} are allowable; e.g., we may not want to allow too many or too
few time periods, and we may not want to allow idle time to be interspersed between non-idle
times. We suppose that the allowable shifts are numbered 1, 2, . . . , n , and we have

cj := the per worker cost to staff shift j ,

for each j = 1, 2, . . . , n . It is convenient to encode the shifts as a 0, 1-valued matrix A , where

    aij :=  1 , if shift j contains time period i ;
            0 , otherwise,

for i = 1, 2, . . . , m , j = 1, 2, . . . , n . Letting x be an n-vector of variables, with xj representing


the number of workers assigned to shift j , we can formulate the staffing problem as

min c0 x
Ax ≥ b;
x ≥ 0 integer.

As we have stated it, this staffing problem is really a very general type of integer-linear-
optimization problem because we have not restricted the form of A beyond it being 0, 1-valued.
In some situations, however, it may be reasonable to assume that shifts must consist of a
consecutive set of time periods. In this case, the 1’s in each column of A occur consecutively, so
we call A a consecutive-ones matrix.

8.1.2 Unimodular basis matrices and total unimodularity

In this section we explore the essential properties of a constraint matrix so that basic solu-
tions are guaranteed to be integer. This has important implications for the network-flow, assign-
ment, and staffing problems that we introduced.
Let A be an m × n real matrix. A basis matrix Aβ is unimodular if det(Aβ ) = ±1 . Checking
whether a large unstructured matrix has all of its basis matrices unimodular is not a simple mat-
ter. Nonetheless, we will see that this property is very useful for guaranteeing integer optimal
solutions of linear-optimization problems, and certain structured constraint matrices have this property.

Theorem 8.1
If A is an integer matrix, all basis matrices of A are unimodular, and b is an integer vector, then
every basic solution x̄ of
Ax = b ;
x ≥ 0
is an integer vector.

Proof. Of course x̄ηj = 0, an integer, for j = 1, 2, . . . , n − m, so we concentrate now on the basic


variables. By Cramer’s rule, the basic variables take on the values
    x̄βi = det(Aβ(i)) / det(Aβ) , for i = 1, 2, . . . , m ,

where Aβ (i) is defined to be the matrix Aβ with its i-th column, Aβi , replaced by b . Because
we assume that A and b are all integer, the numerator above is the determinant of an integer
matrix, which is an integer. Next, the fact that A has unimodular basis matrices tells us that the
determinant of the invertible matrix Aβ is ±1 . That is, the denominator above is ±1 . So, we
have an integer divided by ±1 , which results in an integer value for x̄βi . □

We note that Theorem 8.1 asserts that all basic solutions are integer, whether or not they are
feasible. There is a converse to this theorem.

Theorem 8.2
Let A be an integer matrix in Rm×n . If the system

Ax = b;
x ≥ 0

has integer basic feasible solutions for every integer vector b ∈ Rm , then all basis matrices of
A are unimodular.

It is important to note that the hypothesis of Theorem 8.2 is weaker than the conclusion of
Theorem 8.1. For Theorem 8.2, we only require integrality for basic feasible solutions.

Proof. (Theorem 8.2). Let β be an arbitrary basis, choose an arbitrary i (1 ≤ i ≤ m), and consider
the associated basic solution when b := ei + ∆Aβ 1 . The basic solution x̄ has x̄β equal to the i-th
column of A−1β plus ∆1 . Note that if we choose ∆ to be an integer, then b is integer. Furthermore,
if we choose ∆ to be sufficiently large, then x̄β is non-negative. Therefore, we can choose ∆ so
that b is integer and x̄ is a basic feasible solution. Therefore, by our hypothesis, x̄ is integer.
So the i-th column of A−1β plus ∆1 is an integer vector . But this clearly implies that the i-th
column of A−1β is an integer vector . Now, because i was arbitrary, we conclude that A−1β is an
integer matrix . Of course Aβ is an integer matrix as well. Now, it is a trivial observation that
an integer matrix has an integer determinant. Furthermore, the determinants of Aβ and A−1 β
are reciprocals. Of course the only integers with integer reciprocal are 1 and −1 . Therefore, the
determinant of Aβ is 1 or −1 . t
u

Before turning to specific structured linear-optimization problems, we introduce a stronger


property than unimodularity of basis matrices. The main reason for introducing it is that for
the structured linear-optimization problems that we will look at, the constraint matrices satisfy
this stronger property, and the inductive proofs that we would deploy for proving the weaker
property naturally prove the stronger property as well.
Let A be an m × n real matrix. A is totally unimodular (TU) if every square non-singular
submatrix B of A has det(B) = ±1 .
Obviously every entry of a TU matrix must be 0 , ±1 , because the determinant of a 1 × 1
submatrix is just its single entry. It is quite easy to make an example of even a 2 × 2 non-TU
matrix with all entries 0 , ±1 :

    [ 1  −1 ]
    [ 1   1 ] .
It is trivial to see that if A is TU, then every basis matrix of A is unimodular. But note that
even for integer A , every basis matrix of A could be unimodular, but A need not be TU. For
example,

    [ 2  1 ]
    [ 1  1 ]

has only itself as a basis matrix, and its determinant is 1, but there is a 1 × 1 submatrix with
determinant 2, so A is not TU. Still, as the next result indicates, there is a way to get the TU
property from unimodularity of basis matrices.
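The definition can be checked by brute force on small matrices. The following Python sketch
(illustrative only, and exponential in the matrix size) tests every square submatrix, and it rejects
both of the small examples above.

import numpy as np
from itertools import combinations

def is_totally_unimodular(A, tol=1e-9):
    m, n = A.shape
    for r in range(1, min(m, n) + 1):
        for rows in combinations(range(m), r):
            for cols in combinations(range(n), r):
                det = np.linalg.det(A[np.ix_(rows, cols)])
                # a submatrix determinant must be (numerically) 0, +1, or -1
                if min(abs(det), abs(abs(det) - 1.0)) > tol:
                    return False
    return True

print(is_totally_unimodular(np.array([[1., -1.], [1., 1.]])))   # False (2x2 determinant is 2)
print(is_totally_unimodular(np.array([[2., 1.], [1., 1.]])))    # False (a 1x1 submatrix has determinant 2)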

Theorem 8.3
If every basis matrix of [A, Im ] is unimodular, then A is TU.

Proof. Let B be an r × r invertible submatrix of A , with r < m . It is an easy matter to choose a


(m × m) basis matrix H of [A, Im ] that includes the r columns of A that contain the columns of B ,
and then the m − r identity columns that have their ones in rows other than those used by B .
If we permute the rows of A so that B is within the first r rows, then we can put the identity
columns to the right, in their natural order, and the basis we construct is
    H = [ B      0    ]
        [ ×    Im−r   ] .
Clearly B and H have the same determinant. Therefore, the fact that every basis matrix has
determinant 1 or −1 implies that B does as well. □

Next, we point out some simple transformations that preserve the TU property.

Theorem 8.4
If A is TU, then all of the following leave A TU.
(i) multiplying any rows or columns of A by −1 ;
(ii) duplicating any rows or columns of A ;
(iii) appending standard-unit columns (that is, all entries equal to 0 except a single entry of
1) ;
(iv) taking the transpose of A .

We leave the simple proof to the reader.


Remark 8.5
Relationship with transformations of linear-optimization problems. The significance of The-
orem 8.4 for linear-optimization problems can be understood via the following observations:
• (i) allows for reversing the sense of an inequality (i.e., switching between “≤” and “≥”) or
variable (i.e., switching between non-negative and non-positive) in a linear-optimization
problem with constraint matrix A .
• (ii) together with (i) allows for replacing an equation with a pair of oppositely sensed
inequalities and for replacing a sign-unrestricted variable with the difference of a pair of
non-negative variables.
• (iii) allows for adding a non-negative slack variable for a “≤” inequality, to transform
it into an equation. Combining (iii) with (i) , we can similarly subtract a non-negative
surplus variable for a “≥” inequality, to transform it into an equation.
• (iv) allows for taking the dual of a linear-optimization problem with constraint matrix A .

8.1.3 Consequences of total unimodularity


Network flow.

Theorem 8.6
If A is a network matrix, then A is TU.

Proof. A network matrix is simply a 0 , ±1-valued matrix with exactly one +1 and one −1 in
each column.
Let B be an r × r invertible submatrix of the network matrix A . We will demonstrate that
det(B) = ±1 , by induction on r . For the base case, r = 1 , the invertible submatrices have a
single entry which is ±1 , which of course has determinant ±1 . Now suppose that r > 1 , and
we inductively assume that all (r − 1) × (r − 1) invertible submatrices of A have determinant
±1 .
Because we assume that B is invertible, it cannot have a column that is a zero-vector.
Moreover, it cannot be that every column of B has exactly one +1 and one −1 : if it did, then
adding up all the rows of B would give a non-trivial linear combination of the rows of B that
yields the zero vector, and B would not be invertible.
So, we only need to consider the situation in which B has a column with a single non-
zero ±1 . By expanding the determinant along such a column, we see that, up to a sign, the
determinant of B is the same as the determinant of an (r − 1) × (r − 1) invertible submatrix of
A . By the inductive hypothesis, this is ±1 . □

Corollary 8.7
The single-commodity min-cost network-flow formulation
    min Σ_{e∈A} ce xe
        Σ_{e∈A : t(e)=v} xe − Σ_{e∈A : h(e)=v} xe = bv , ∀ v ∈ N ;
        0 ≤ xe ≤ ue , ∀ e ∈ A .

has an integer optimal solution if: (i) it has an optimal solution, (ii) each bv is an integer, and
(iii) each ue is an integer or is infinite.

Proof. Recall that we can rewrite the single-commodity min-cost network-flow formulation as
min c0 x
Ax = b;
x ≤ u;
x ≥ 0,
where A is a network matrix. For the purpose of proving the theorem, we may as well assume
that the linear-optimization problem has an optimal solution. Next, we transform the formula-
tion into standard form:
min c0 x
Ax = b;
x + s = u;
x , s ≥ 0.
The constraint matrix has the form

    [ A  0 ]
    [ I  I ] .

This matrix is TU, by virtue of the fact that A
is TU, and that it arises from A using operations that preserve the TU property. Finally, we
delete any redundant equations from this system of equations, and we delete any rows that
have infinite right-hand side ue . The resulting constraint matrix is TU, and the right-hand side
is integer, so an optimal basic solution exists and will be integer. □

Remark 8.8
Considering Example 2.1, we can see that Corollary 8.7 does not extend to more than one com-
modity.

Assignments.

Theorem 8.9
If A is the vertex-edge incidence matrix of a bipartite graph, then A is TU.

Proof. The constraint matrix A for the formulation has its rows indexed by the vertices of G .
With each edge having exactly one vertex in V1 and exactly one vertex in V2 , the constraint
matrix has the property that for each column, the only non-zeros are a single 1 in a row indexed
from V1 and a single 1 in a row indexed from V2 .
Certainly multiplying any rows (or columns) of a matrix by −1 does not bear upon whether or
not it is TU. It is easy to see that by multiplying the rows of A indexed from V1 by −1 , we obtain a
network matrix, thus by Theorem 8.6, the result follows. □

Corollary 8.10
The continuous relaxation of the following formulation for finding a minimum-weight perfect
matching of the bipartite graph G has a 0, 1-valued optimal solution whenever it is feasible.
    min Σ_{(i,j)∈E} cij xij
        Σ_{j∈V2 : (i,j)∈E} xij = 1 , ∀ i ∈ V1 ;
        Σ_{i∈V1 : (i,j)∈E} xij = 1 , ∀ j ∈ V2 ;
        xij ≥ 0 , ∀ (i, j) ∈ E .

Proof. After deleting any redundant equations, the resulting formulation has a TU constraint
matrix and integer right-hand side. Therefore, its basic solutions are all integer. The constraints
imply that no variable can be greater than 1, therefore the optimal value is not unbounded, and
the only integer solutions have all xij ∈ {0, 1} . The result follows. □

A matching M of G is a subset of E such that each vertex of the graph is met by no more
than one edge in M . An interesting variation on the problem of finding a perfect matching of G

having minimum weight, is to find a maximum-cardinality matching of G . This problem is


always feasible, because M := ∅ is always a matching.
How big can a matching of a finite graph G be? A vertex cover of G is a set W of vertices
that touches all of the edges of G . Notice that if M is a matching and W is a vertex cover, then
|M | ≤ |W | , because each element of W touches at most one element of M . Can we always find a
matching M and a vertex cover W so that |M | = |W | ? The next result, due to the mathematician
König11 , tells us that the answer is ’yes’ when G is bipartite.

Corollary 8.11 (König’s Theorem)


If G is a bipartite graph, then the maximum cardinality of a matching of G is equal to the
minimum cardinality of a vertex cover of G .

Proof. We can formulate the problem of finding the maximum cardinality of a matching of G
as follows:

    max Σ_{(i,j)∈E} xij
        Σ_{j∈V2 : (i,j)∈E} xij ≤ 1 , ∀ i ∈ V1 ;
        Σ_{i∈V1 : (i,j)∈E} xij ≤ 1 , ∀ j ∈ V2 ;
        xij ≥ 0 integer, ∀ (i, j) ∈ E .


It is easy to see that we can relax integrality, and the optimal value will be unchanged, because
A is TU, and the constraint matrix will remain TU after introducing slack variables. The dual
of the resulting linear-optimization problem is
    min Σ_{v∈V} yv
        yi + yj ≥ 1 , ∀ (i, j) ∈ E ;
        yv ≥ 0 , ∀ v ∈ V .

It is easy to see that after putting this into standard form via the subtraction of surplus variables,
the constraint matrix has the form [A0 , −I] , where A is the vertex-edge incidence matrix of G .
This matrix is TU, therefore an optimal integer solution exists.
Next, we observe that because of the minimization objective and the form of the constraints,
an optimal integer solution will be 0, 1-valued; just observe that if ȳ is an integer feasible solution
and ȳv > 1 , for some v ∈ V , then decreasing ȳv to 1 (holding the other components of ȳ
constant) produces another integer feasible solution with a lesser objective value. This implies
that every integer feasible solution ȳ with any ȳv > 1 is not optimal.
Next, let ŷ be an optimal 0, 1-valued solution. Let

W := {v ∈ V : ŷv = 1} .
It is easy to see that W is a vertex cover of G and that |W| = Σ_{v∈V} ŷv . The result now follows
from the strong duality theorem. □

For studying matching in non-bipartite graphs, one can have a look at [3, Chapter 4].

Staffing.

Theorem 8.12
If A is a consecutive-ones matrix, then A is TU.

Proof. Let B be an r × r invertible submatrix of a consecutive-ones matrix A . We will demon-


strate that det(B) = ±1 , by induction on r . We take care that we preserve the ordering of the
rows of A in B . In this way, B is also a consecutive-ones matrix. Note that only the sign of the
determinant of B depends on the ordering of its rows (and columns).
For the base case, r = 1 , the invertible submatrix B has a single entry which is 1 , which
of course has determinant 1 . Now suppose that r > 1 , and we inductively assume that all
(r − 1) × (r − 1) invertible submatrices of all consecutive-ones matrices have determinant ±1 .
(We will see that the ’all’ in the inductive hypothesis will be needed — it will not be enough to
consider just (r − 1) × (r − 1) invertible submatrices of our given matrix A).
Next, we will reorder the columns of B so that all columns with a 1 in the first row come
before all columns with a 0 in the first row. Note that there must be a column with a 1 in the
first row, otherwise B would not be invertible. Next, we further reorder the columns, so that
among all columns with a 1 in the first row, a column of that type with the fewest number of 1s
is first.

Our matrix B now has this form:

    [ 1   1···1   0···0 ]
    [ 1   1···1         ]
    [ ⋮     ⋮           ]
    [ 1   1···1         ]
    [ 0                 ]
    [ ⋮     G       F   ]
    [ 0                 ]
where F and G are the submatrices indicated. Note that F and G are each consecutive-ones
matrices.

Next, we subtract the top row from all other rows that have a 1 in the first column. Such
row operations do not change the determinant of B , and we get a matrix of the form

    [ 1   1···1   0···0 ]
    [ 0   0···0         ]
    [ ⋮     ⋮           ]
    [ 0   0···0         ]
    [ 0                 ]
    [ ⋮     G       F   ]
    [ 0                 ]
Note that this resulting matrix need not be a consecutive-ones matrix — but that is not
needed. By expanding the determinant of this latter matrix along the first column, we see that
the determinant of this matrix is the same as that of the matrix obtained by striking out its first
row and column,

    [ 0···0         ]
    [   ⋮           ]
    [ 0···0         ]
    [   G       F   ]
But this matrix is an (r − 1) × (r − 1) invertible consecutive-ones matrix (note that it is not


necessarily a submatrix of A). So, by our inductive hypothesis, its determinant is ±1 .
□

Corollary 8.13
Let A be a shift matrix such that each shift is a contiguous set of time periods, let c be a vector
of non-negative costs, and let b be a vector of non-negative integer demands for workers in the
time periods. Then there is an optimal solution x̄ of the continuous relaxation

min c0 x
Ax ≥ b;
x ≥ 0

of the staffing formulation with x̄ integer, whenever the relaxation is feasible.

Proof. A is a consecutive-ones matrix when each shift is a contiguous set of time periods. There-
fore A is TU. After subtracting surplus variables to put the problem into standard form, the
constraint matrix takes the form [A, −I] , which is also TU . The result follows. □
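A tiny illustration of Corollary 8.13, with made-up data: the shifts below are contiguous and the
demands are integer, so solving only the continuous relaxation already returns an integer optimal
solution (a sketch in Python/Gurobi).

import numpy as np
import gurobipy as gp
from gurobipy import GRB

A = np.array([[1., 0., 0.],     # shift 1 covers periods 1-2, shift 2 covers periods 2-3,
              [1., 1., 0.],     # shift 3 covers period 3 only (consecutive ones in each column)
              [0., 1., 1.]])
b = np.array([3., 5., 4.])      # integer worker demands per time period
c = np.array([2., 3., 1.])      # nonnegative per-worker shift costs

m = gp.Model("staffing-lp")
m.Params.OutputFlag = 0
x = m.addMVar(3, lb=0.0)        # continuous relaxation: no integrality imposed
m.addConstr(A @ x >= b)
m.setObjective(c @ x, GRB.MINIMIZE)
m.optimize()
print(x.X)                      # the reported optimal vertex is integer, as the corollary promises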

8.2 Modeling Techniques

8.2.1 Disjunctions

Example 8.14
Suppose that we have a single variable x ∈ R , and we want to model the disjunction

−12 ≤ x ≤ 2 or 5 ≤ x ≤ 20 .

By introducing a binary variable y ∈ {0, 1} , we can model the disjunction as

x ≤ 2 + M1 y ,
x + M2 (1 − y) ≥ 5,

where the constant scalars M1 and M2 (so-called big M’s) are chosen to be appropriately large.
A little analysis tells us how large. Considering our assumption that x could be as large as 20 ,
we see that M1 should be at least 18 . Considering our assumption that x could be as small as
−12 , we see that M2 should be at least 17 . In fact, we should choose these constants to be as
small as possible so as to make the feasible region with y ∈ {0, 1} relaxed to 0 ≤ y ≤ 1 as small as
possible. So, the best model for us is:

x ≤ 2 + 18y ,
x + 17(1 − y) ≥ 5.

It is interesting to see a two-dimensional graph of this in x − y space; see Figures 8.1 and 8.2.
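A Python/Gurobi sketch of this little model (illustrative only; the variable bounds encode the
assumed range −12 ≤ x ≤ 20, and an arbitrary objective of maximizing x is added so there is
something to solve):

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("disjunction")
x = m.addVar(lb=-12.0, ub=20.0, name="x")
y = m.addVar(vtype=GRB.BINARY, name="y")
m.addConstr(x <= 2 + 18 * y)          # y = 0 forces x <= 2
m.addConstr(x + 17 * (1 - y) >= 5)    # y = 1 forces x >= 5
m.setObjective(x, GRB.MAXIMIZE)
m.optimize()
print(x.X, y.X)                        # x = 20 is attained with y = 1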

Figure 8.1: Optimal choice of “big M’s”

8.2.2 Forcing constraints


The uncapacitated facility-location problem involves n customers, numbered 1, 2, . . . , n and m
facilities, numbered 1, 2 . . . , m . Associated with each facility, we have

fi := fixed cost for operating facility i ,

for i = 1, . . . , m . Associated with each customer/facility pair, we have

cij := cost for satisfying all of customer j’s demand from facility i ,

Figure 8.2: Comparing optimal vs non-optimal “big M’s”

for i = 1, . . . , m , j = 1, . . . , n . The goal is to determine a set of facilities to operate and an


allocation of each customer’s demand across operating facilities, so as to minimize the total cost.
The problem is “uncapacitated” in the sense that each facility has no limit on its ability to satisfy
demand from even all customers.
We formulate this optimization problem with

yi := indicator variable for operating facility i ,

for i = 1, . . . , m , and

xij := fraction of customer j demand satisfied by facility i ,

for i = 1, . . . , m , j = 1, . . . , n .

Our formulation is as follows:


min  Σ_{i=1}^m fi yi + Σ_{i=1}^m Σ_{j=1}^n cij xij
     Σ_{i=1}^m xij = 1 ,     for j = 1, . . . , n ;
     −yi + xij ≤ 0 ,         for i = 1, . . . , m , j = 1, . . . , n ;
     yi ∈ {0, 1} ,           for i = 1, . . . , m ;
     xij ≥ 0 ,               for i = 1, . . . , m , j = 1, . . . , n .

All of these constraints are self-explanatory except for the mn constraints:

− yi + xij ≤ 0 for i = 1, . . . , m , j = 1, . . . , n . (S)

These constraints simply enforce that for any feasible solution x̂, ŷ , we have that ŷi = 1 when-
ever x̂ij > 0 . It is an interesting point that this could also be enforced via the m constraints:
−nyi + Σ_{j=1}^n xij ≤ 0 ,   for i = 1, . . . , m . (W)

We can view the coefficient −n of yi as a “big M”, rendering the constraint vacuous when yi = 1 .
Despite the apparent parsimony of the latter formulation, it turns out that the original formu-
lation is preferred. The Python/Gurobi code in the Jupyter notebook UFL.ipynb can be used to
compare the use of (S) versus (W) (see Appendix A.11).
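For readers who want to experiment before opening UFL.ipynb, here is a minimal Python/Gurobi sketch of the model with the (S) constraints; the cost data below is random and purely illustrative (it is not the data of the notebook).

import numpy as np
import gurobipy as gp
from gurobipy import GRB

np.random.seed(0)
m, n = 3, 5                                   # facilities, customers
f = 10 + 10 * np.random.rand(m)               # fixed operating costs
c = np.random.rand(m, n)                      # service costs

model = gp.Model()
y = model.addVars(m, vtype=GRB.BINARY)
x = model.addVars(m, n)                       # x[i, j] >= 0 by default
model.setObjective(gp.quicksum(f[i] * y[i] for i in range(m))
                   + gp.quicksum(c[i, j] * x[i, j] for i in range(m) for j in range(n)),
                   GRB.MINIMIZE)
model.addConstrs(gp.quicksum(x[i, j] for i in range(m)) == 1 for j in range(n))
model.addConstrs(x[i, j] <= y[i] for i in range(m) for j in range(n))         # the (S) constraints
# the weaker alternative (W) would instead be:
#   model.addConstrs(gp.quicksum(x[i, j] for j in range(n)) <= n * y[i] for i in range(m))
model.optimize()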

8.2.3 Piecewise-linear univariate functions


Of course many useful functions are non-linear. Integer-linear optimization affords a good way
to approximate well-behaved univariate non-linear functions. Suppose that f : R → R has
domain the interval [l, u] , with l < u . For some n ≥ 2 , we choose n breakpoints l = ξ 1 < ξ 2 <
· · · < ξ n−1 < ξ n = u . Then, we approximate f linearly between adjacent pairs of breakpoints.
That is, we approximate f by
fˆ(x) := Σ_{j=1}^n λj f (ξ j ) ,

where we require that


Σ_{j=1}^n λj = 1 ;
λj ≥ 0 ,   for j = 1, . . . , n ,

and the adjacency condition:

λj and λj+1 may be positive for only one value of j .

This adjacency condition means that we “activate” the interval [ξ j , ξ j+1 ] for approximating
f (x) . That is, we will approximate f (x) by

λj f (ξ j ) + λj+1 f (ξ j+1 ) ,

with

λj + λj+1 = 1;
λj , λj+1 ≥ 0.

We can enforce the adjacency condition using 0, 1-variables. Let



yj := 1 , if the interval [ξ j , ξ j+1 ] is activated; and yj := 0 , otherwise,

for j = 1, 2, . . . , n − 1 .
The situation is depicted in Figure 8.3, where the red curve graphs the non-linear function
f.

Figure 8.3: Piecewise-linear approximation

We only want to allow one of the n − 1 intervals to be activated, so we use the constraint

Σ_{j=1}^{n−1} yj = 1 .

We only want to allow λ1 > 0 if the first interval [ξ 1 , ξ 2 ] is activated. For an internal breakpoint
ξ j , 1 < j < n , we only want to allow λj > 0 if either [ξ j−1 , ξ j ] or [ξ j , ξ j+1 ] is activated. We

only want to allow λn > 0 if the last interval [ξ n−1 , ξ n ] is activated. We can accomplish these
restrictions with the constraints

λ1 ≤ y1 ;
λj ≤ yj−1 + yj , for j = 2, . . . , n − 1 ;
λn ≤ yn−1 .

Notice how if yk is 1 , for some k (1 ≤ k ≤ n − 1), then necessarily all of the other yj are 0 (j ≠ k),
and only λk and λk+1 can be positive.
How do we actually use this? If we have a model involving such a non-linear f (x), then
wherever we have f (x) in the model, we simply substitute Σ_{j=1}^n λj f (ξ j ) , and we incorporate
the further constraints:
Σ_{j=1}^n λj = 1 ;
Σ_{j=1}^{n−1} yj = 1 ;
λ1 ≤ y1 ;
λj ≤ yj−1 + yj ,   for j = 2, . . . , n − 1 ;
λn ≤ yn−1 ;
λj ≥ 0 ,   for j = 1, . . . , n ;
yj ∈ {0, 1} ,   for j = 1, . . . , n − 1 .

Of course a very non-linear f (x) will demand an fˆ(x) := Σ_{j=1}^n λj f (ξ j ) with a high value for n ,
so as to get an accurate approximation. And higher values for n imply more binary variables
yj , which come at a high computational cost.
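The following Python/Gurobi sketch assembles these constraints for a made-up function (here f (x) = x², purely for illustration) on [0, 4] with n = 5 breakpoints; it also includes the usual linking constraint x = Σ_{j=1}^n ξ j λj , which the discussion above leaves implicit.

import gurobipy as gp
from gurobipy import GRB

f = lambda t: t * t                                 # illustrative non-linear function
n = 5
xi = [0.0, 1.0, 2.0, 3.0, 4.0]                      # breakpoints xi^1 < ... < xi^n

model = gp.Model()
lam = model.addVars(n)                              # lambda_j >= 0
y = model.addVars(n - 1, vtype=GRB.BINARY)          # y_j activates the interval [xi^j, xi^(j+1)]
x = model.addVar(lb=xi[0], ub=xi[-1])
fhat = model.addVar(lb=-GRB.INFINITY)               # carries the approximation of f(x)

model.addConstr(gp.quicksum(lam[j] for j in range(n)) == 1)
model.addConstr(gp.quicksum(y[j] for j in range(n - 1)) == 1)
model.addConstr(lam[0] <= y[0])
model.addConstrs(lam[j] <= y[j - 1] + y[j] for j in range(1, n - 1))
model.addConstr(lam[n - 1] <= y[n - 2])
model.addConstr(x == gp.quicksum(xi[j] * lam[j] for j in range(n)))
model.addConstr(fhat == gp.quicksum(f(xi[j]) * lam[j] for j in range(n)))
# fhat can now stand in for f(x) anywhere in a larger model.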

8.3 A Prelude to Algorithms


For reasons that will become apparent, for the purpose of developing algorithms for linear-
optimization problems in which some variables are required to be integer, it is convenient to
assume that our problem has the form

z := max y 0 b
y 0 A ≤ c0 ;
(DI )
y ∈ Rm ;
yi integer, for i ∈ I .

The set I ⊂ {1, 2, . . . , m} allows for a given subset of the variables to be constrained to be integer.
This linear-optimization problem has a non-standard form, but it is convenient that the dual of
the continuous relaxation has the standard form
min c0 x
Ax = b; (P)
x ≥ 0.

To prove that an algorithm for (DI ) is finite, it is helpful to assume that the feasible region
of the continuous relaxation (D) of (DI ) is non-empty and bounded.
We saw in Section 7.3.2 that there are situations in which rounding the solution of a continu-
ous relaxation can yield a good solution to an optimization problem involving integer variables.
But generally, this is not the case.

Example 8.15
Consider the problem
max y2
2ky1 + y2 ≤ 2k ;
− 2ky1 + y2 ≤ 0;
− y2 ≤ 0;
y1 , y2 integer,
where k ≥ 1 is a possibly large positive integer. It is easy to check that (y1 , y2 ) = (0, 0) and
(y1 , y2 ) = (1, 0) are both optimal solutions of this problem, but the optimal solution of the con-
tinuous relaxation is (y1 , y2 ) = (1/2 , k). If we consider rounding y1 up or down in the continuous
solution, we do not get a feasible solution, and moreover we are quite far from the optimal
solutions.

Example 8.16
Consider the problem
max Σ_{i=1}^m yi
yi + yℓ ≤ 1 ,   for all 1 ≤ i < ℓ ≤ m ;
−yi ≤ 0 ,       for all 1 ≤ i ≤ m ;
yi integer ,    for all 1 ≤ i ≤ m ,

where m ≥ 3 is a possibly large positive integer. It is easy to check that each integer optimal
solution sets any single variable to one and the rest to zero, achieving objective value 1, while
the (unique) continuous optimal solution sets all variables to 1/2 , achieving objective value m/2 .
We can see that the continuous solution is not closely related to the integer solutions.

8.4 Branch-and-Bound

Next, we look at a rudimentary framework called branch-and-bound, which aims at finding


an optimal solution of (DI ), a linear-optimization problem having some integer variables. We
assume that (P), the dual of the continuous relaxation of (DI ), has a feasible solution. Hence,
even the continuous relaxation (D) of (DI ) is not unbounded.
Our algorithm maintains a list L of optimization problems that all have the general form of
(DI ). Keep in mind that problems on the list have integer variables. We maintain a lower bound
LB, satisfying LB ≤ z . Put simply, LB is the objective value of the best (objective maximizing)
feasible solution ỹLB of (DI ) that we have seen so far. Initially, we set LB = −∞ , and we update
it in an increasing fashion.
The algorithm maintains the key invariant for branch-and-bound:

Every feasible solution of the original problem (DI ) with greater objective value than LB is feasible
for a problem on the list.
We stop when the list is empty, and because of the property that we maintain, we correctly
conclude that the optimal value of (DI ) is LB when we do stop.
At a general step of the algorithm, we select and remove a problem (D̃I ) on the list, and
we solve its continuous relaxation (D̃). If this continuous relaxation is infeasible, then we do
nothing further with this problem. Otherwise, we let ȳ be its optimal solution, and we proceed
as follows.

• If ȳ 0 b ≤ LB , then no feasible solution to the selected problem can have objective value
greater than LB, so we are done processing this selected problem.
• If ȳi is integer for all i ∈ I , then we have solved the selected problem. In this case, if
ȳ 0 b > LB , then we
– reset LB to ȳ 0 b ;
– reset ȳLB to ȳ .
• Finally, if ȳ 0 b > LB and ȳi is not integer for some i ∈ I , then (it is possible that this selected
problem has a feasible solution that is better than ȳLB , so) we
– select some i ∈ I such that ȳi is not integer;
– place two new “child” problems on the list, one with the constraint yi ≤ bȳi c ap-
pended (the so-called down branch), and the other with the constraint yi ≥ dȳi e
appended (the so-called up branch).
(observe that every feasible solution to a parent is feasible for one of its children, if
it has children.)

Because the key invariant for branch-and-bound is maintained by the processing rules, the
following result is evident.

Theorem 8.17
Suppose that the original (P) is feasible. Then at termination of branch-and-bound, we have
LB = −∞ if (DI ) is infeasible, or else ȳLB is an optimal solution of (DI ).
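To make the processing rules concrete, here is a minimal Python sketch (not one of the book's notebooks; the function names solve_relaxation and branch_and_bound are ours) of the framework for max y 0 b subject to y 0 A ≤ c0 with yi integer for i ∈ I , solving each continuous relaxation with Gurobi and carrying the branching constraints as bounds on y. It assumes, as in the finite-termination discussion below, that the feasible region of the continuous relaxation is bounded.

import math
import numpy as np
import gurobipy as gp
from gurobipy import GRB

def solve_relaxation(A, b, c, lower, upper):
    # maximize b'y subject to A'y <= c and lower <= y <= upper
    m = A.shape[0]
    model = gp.Model()
    model.Params.OutputFlag = 0
    y = model.addMVar(m, lb=lower, ub=upper)
    model.setObjective(b @ y, GRB.MAXIMIZE)
    model.addConstr(A.T @ y <= c)
    model.optimize()
    if model.status != GRB.OPTIMAL:
        return None, None                 # treated as "nothing further to do" for this subproblem
    return y.X, model.ObjVal

def branch_and_bound(A, b, c, I, tol=1e-6):
    m = A.shape[0]
    LB, ybest = -math.inf, None
    L = [(np.full(m, -GRB.INFINITY), np.full(m, GRB.INFINITY))]   # the list, as pairs of bounds on y
    while L:
        lower, upper = L.pop()                                    # last-in/first-out: "diving"
        ybar, val = solve_relaxation(A, b, c, lower, upper)
        if ybar is None or val <= LB:
            continue                                              # infeasible, or cannot beat LB
        frac = [i for i in I if abs(ybar[i] - round(ybar[i])) > tol]
        if not frac:
            LB, ybest = val, ybar                                 # integer-feasible: update the incumbent
            continue
        i = frac[0]                                               # a naive branching-variable choice
        down_upper = upper.copy(); down_upper[i] = math.floor(ybar[i])   # down branch
        up_lower = lower.copy();   up_lower[i] = math.ceil(ybar[i])      # up branch
        L.append((lower, down_upper))
        L.append((up_lower, upper))
    return LB, ybest

For instance, with the data of Example 8.15 for k = 10 (constraints entered as columns of A), branch_and_bound(np.array([[20., -20., 0.], [1., 1., -1.]]), np.array([0., 1.]), np.array([20., 0., 0.]), [0, 1]) recovers the optimal value 0, together with an optimal solution, after a single branching.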

Finite termination. If the feasible region of the continuous relaxation (D) of (DI ) is a bounded
set, then we can guarantee finite termination. If we do not want to make such an assumption,
then if we assume that the data for the formulation is rational, it is possible to bound the region
that needs to be searched, and we can again assure finite termination.

Solving continuous relaxations. Some remarks are in order regarding the solution of con-
tinuous relaxations. Conceptually, we apply the Simplex Algorithm to the dual (P̃) of the con-
tinuous relaxation (D̃) of a problem (D̃I ) selected and removed from the list. At the outset, for
an optimal basis β of (P̃), the optimal dual solution is given by ȳ 0 := c0β A−1
β . If i ∈ I is chosen,
such that ȳi is not an integer, then we replace the selected problem (D̃I ) with one child having
the additional constraint yi ≤ bȳi c (the down branch) and another with the constraint yi ≥ dȳi e
appended (the up branch).
Adding a constraint to (D̃) adds a variable to the standard-form problem (P̃). So, a basis
for (P̃) remains feasible after we introduce such a variable.

• The down branch: The constraint yi ≤ bȳi c , dualizes to a new variable xdown in (P̃). The
variable xdown has a new column Adown := ei and a cost coefficient of cdown := bȳi c . Notice
that the fact that ȳi is not an integer (and hence ȳ violates yi ≤ bȳi c) translates into the
fact that the reduced cost c̄down of xdown is c̄down = cdown − ȳ 0 Adown = bȳi c − ȳi < 0 , so xdown
is eligible to enter the basis.
• The up branch: Similarly, the constraint yi ≥ dȳi e , or equivalently −yi ≤ −dȳi e , dualizes
to a new variable xup in (P̃). The variable xup has a new column Aup := −ei and a cost
coefficient of cup := −dȳi e . Notice that the fact that ȳi is not an integer (and hence ȳ
violates yi ≥ dȳi e) translates into the fact that the reduced cost c̄up of xup is c̄up = cup −
ȳ 0 Aup = −dȳi e + ȳi < 0 , so xup is eligible to enter the basis.

In either case, provided that we have kept the optimal basis for the (P̃) associated with a
problem (D̃I ), the Simplex Algorithm picks up on the (P̃) associated with a child (D̃I ) of that
problem, with the new variable of the child’s (P̃) entering the basis.
Notice that the (P̃) associated with a problem (D̃I ) on the list could be unbounded. But
this just implies that the problem (D̃I ) is infeasible.

Partially solving continuous relaxations. Notice that as the Simplex Algorithm is applied
to the (P̃) associated with any problem (D̃I ) from the list, we generate a sequence of non-
increasing objective values, each one of which is an upperbound on the optimal objective value
of (D̃I ). That is, for any such (P̃), we start with the upperbound value of its parent, and then we
gradually decrease it, step-by-step of the Simplex Algorithm. At any point in this process, if the
objective value of the Simplex Algorithm falls at or below the current LB, we can immediately
terminate the Simplex Algorithm on such a (P̃) — its optimal objective value will be no greater
than LB — and conclude that the optimal objective value of (D̃I ) is no greater than LB.

A global upper bound. As the algorithm progresses, if we let UBbetter be the maximum, over
all problems on the list, of the objective value of the continuous relaxations, then any feasible
solution ŷ with objective value greater than LB satisfies ŷ 0 b ≤ UBbetter . Of course, it may be
that no optimal solution is feasible to any problem on the list — for example if it happens that
LB = z . But we can see that

z ≤ UB := max {UBbetter , LB} .

It may be useful to have UB at hand, because we can always stop the computation early, say
when UB−LB < τ , returning the feasible solution ȳLB , with the knowledge that z − ȳLB0 b ≤ τ . But
notice that we do not readily have the objective value of the continuous relaxation for problems
on the list — we only solve the continuous relaxation for such a problem after it is selected
(for processing). But, for every problem on the list, we can simply keep track of the optimal
objective value of its parent’s continuous relaxation, and use that instead. Alternatively, we can
re-organize our computations a bit, solving continuous relaxations of subproblems before we
put them on the list.

Selecting a subproblem from the list. Which subproblem from the list should we process
next?
• A strategy of last-in/first-out, known as diving, often results in good increases in LB. To
completely specify such a strategy, one would have to decide which of the two children
of a subproblem is put on the list last (i.e., the down branch or the up branch). A good
choice can affect the performance of this rule, and such a good choice depends on the
type of model being solved.

• A strategy of first-in/first-out is very bad. It can easily result in an explosion in the size of
the list of subproblems.
• A strategy of choosing a subproblem to branch on having objective value for its contin-
uous relaxation equal to UB, known as best bound, is a sound strategy for seeking a
decrease in UB. If such a rule is desired, then it is best to solve continuous relaxations of
subproblems before we put them on the list.
A hybrid strategy, doing mostly diving at the start (to get a reasonable value of LB) and shifting
more and more to best bound (to work on proving that LB is at or near the optimal value) has
rather robust performance.

Selecting a branching variable. Probably very many times, we will need to choose an i ∈ I
for which ȳi is fractional, in order to branch and create the child subproblems. Which such i
should we choose? Naïve rules such as choosing randomly or the so-called most fractional rule
of choosing an i that maximizes min{ȳi −bȳi c , dȳi e− ȳi } seem to have rather poor performance.
Better rules are based on estimates of how the objective value of the children will change relative
to the parent.

Using dual variables to bound the “other side” of an inequality. Our constraint
system y 0 A ≤ c0 can be viewed as y 0 Aj ≤ cj , for j = 1, 2, . . . , n ; that is, cj is an upper bound on
y 0 Aj . We may wonder if we can also derive lower bounds on y 0 Aj .

Theorem 8.18
Let LB be the objective value of any feasible solution of (DI ). Let x̄ be an optimal solution of
(P), and assume that x̄j > 0 for some j . Then

cj + (LB − c0 x̄)/x̄j ≤ y 0 Aj

is satisfied by every optimal solution of (DI ).

Proof. We consider a parametric version of (DI ). For ∆j ∈ R, consider

z(∆j ) := max y0 b
y0 A ≤ c0 + ∆j e0j ;
(DI (∆j ))
y ∈ Rm ;
yi integer, for i ∈ I .

Let zR (∆j ) be defined the same way as z(∆j ), but with integrality relaxed. Using ideas from
Chapters 6 and 7, we can see that zR is a concave (piecewise-linear) function on its domain, and
x̄j is a subgradient of zR at ∆j = 0 . It follows that

z(∆j ) ≤ zR (∆j ) ≤ zR (0) + ∆j x̄j = c0 x̄ + ∆j x̄j .

So, we can observe that for
∆j < (LB − c0 x̄)/x̄j ,
we will have z(∆j ) < LB. Therefore, every ŷ that is feasible for (DI (∆j )) with ∆j < (LB −
c0 x̄)/x̄j will have ŷ 0 b < LB . So such a ŷ cannot be optimal for (DI ). □

It is interesting to consider two special cases of Theorem 8.18:

Corollary 8.19 (Variable fixing)


Let LB be the objective value of any feasible solution of (DI ). Let x̄ be an optimal solution of
(P). Assume that x̄j > 0 is the optimal dual variable for a constraint of the form: yk ≤ 1 (or
−yk ≤ 0) . If c0 x̄ − LB < x̄j , then yk = 1 (respectively, yk = 0) is satisfied by every optimal
solution of (DI ).

Because of Exercise 5.2, this is known as reduced-cost fixing.
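As a small illustration of how Theorem 8.18 and Corollary 8.19 might be applied in code, the helper below (a sketch; dual_side_bounds is a hypothetical name, not part of the book's software) computes, from an optimal solution x̄ of (P) and an incumbent value LB, the implied lower bound cj + (LB − c0 x̄)/x̄j on y 0 Aj for each j with x̄j > 0 ; when the column Aj comes from a bound constraint on some yk , this is exactly reduced-cost fixing.

def dual_side_bounds(c, xbar, LB, tol=1e-9):
    # returns {j: c_j + (LB - c'xbar)/xbar_j} over all j with xbar_j > 0
    objval = sum(cj * xj for cj, xj in zip(c, xbar))
    bounds = {}
    for j, (cj, xj) in enumerate(zip(c, xbar)):
        if xj > tol:
            bounds[j] = cj + (LB - objval) / xj
    return bounds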

8.5 Cutting Planes


This section is adapted from material in [2] and [4]. In fact, those papers were developed
to achieve versions of Gomory cutting-plane algorithms (with finiteness proofs) that would
mesh with our column-generation treatment of many topics in this book (i.e., cutting stock,
decomposition, and branch-and-bound).

8.5.1 Pure
In this section, we assume that all yi variables are constrained to be integer. That is, I =
{1, 2, . . . , m} .
We can choose any non-negative w ∈ Rn , and we see that

w ≥ 0 and y 0 A ≤ c0 =⇒ y 0 (Aw) ≤ c0 w .

Note that this inequality is valid for all solutions of y 0 A ≤ c0 , integer or not. Next, if Aw is integer,
we can exploit the integrality of y . We see that

Aw ∈ Zm , y ∈ Zm =⇒ y 0 (Aw) ≤ bc0 wc ,

for all integer solutions of y 0 A ≤ c0 .


The inequality y 0 (Aw) ≤ bc0 wc is called a Chvátal-Gomory cut. The condition Aw ∈ Zm
may seem a little awkward, but usually we have that A is integer, so we can get Aw ∈ Zm by
then just choosing w ∈ Zn . In fact, for the remainder of this section, we will assume that A and
c are integer.
Of course, it is by no means clear how to choose appropriate w, and this is critical for getting
useful inequalities. We should also bear in mind that there are examples for which Chvátal-
Gomory cuts are rather ineffectual. Trying to apply such cuts to Example 8.15 reveals that infeasi-
ble integer points can “guard” Chvátal-Gomory cuts from getting close to any feasible integer
points.
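As a tiny concrete aid, here is a sketch of forming a single Chvátal-Gomory cut from a chosen multiplier vector w ≥ 0 , under the running assumption that A and c are integer (the function name chvatal_gomory_cut is ours, for illustration only).

import math
import numpy as np

def chvatal_gomory_cut(A, c, w):
    # returns (Aw, floor(c'w)); the cut y'(Aw) <= floor(c'w) is valid for all integer y with y'A <= c',
    # provided Aw is integer (e.g., when w is integer and A is integer)
    w = np.asarray(w, dtype=float)
    assert np.all(w >= 0)
    return A @ w, math.floor(float(c @ w))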
We would like to develop a concrete algorithmic scheme for generating Chvátal-Gomory
cuts. We will do this via basic solutions. Let β be any basis for P. The associated dual basic
solution (for the continuous relaxation (D)) is ȳ 0 := c0β A−1
β . Suppose that ȳi is not an integer.
Our goal is to derive a valid cut for (DI ) that is violated by ȳ.
Let
b̃ := ei + Aβ r,
where r ∈ Zm , and, as usual, ei denotes the i-th standard unit vector in Rm . Note that by
construction, b̃ ∈ Zm .

Theorem 8.20
ȳ 0 b̃ is not an integer, and so y 0 b̃ ≤ bȳ 0 b̃c cuts off ȳ.

Proof. ȳ 0 b̃ = ȳ 0 (ei + Aβ r) = ȳi + (c0β A−1β )Aβ r = ȳi + c0β r , which is not an integer, because
ȳi ∉ Z while c0β r ∈ Z. □

At this point, we have an inequality y 0 b̃ ≤ bȳ 0 b̃c which cuts off ȳ, but we have not established
its validity for (DI ).
Let H·i := A−1β ei , the i-th column of A−1β . Now let

w := H·i + r.

Clearly we can choose r ∈ Zm so that w ≥ 0; we simply choose r ∈ Zm so that

rk ≥ −bhki c, for k = 1, . . . , m. (∗≥ )

Theorem 8.21
Choosing r ∈ Zm satisfying (∗≥ ), we have that y 0 b̃ ≤ bȳ 0 b̃c is valid for (DI ).

Proof. Because w ≥ 0 and y 0 A ≤ c0 , we have the validity of

y 0 Aβ (A−1β ei + r) ≤ c0β (A−1β ei + r) ,

even for the continuous relaxation (D) of (DI ). Simplifying this, we have

y 0 (ei + Aβ r) ≤ ȳi + c0β r.

The left-hand side is clearly y 0 b̃, and the right-hand side is

ȳi + c0β r = ȳi + ȳ 0 Aβ r = ȳ 0 (ei + Aβ r) = ȳ 0 b̃.

So we have that y 0 b̃ ≤ ȳ 0 b̃ is valid even for (D). Finally, observing that b̃ ∈ Zm and y is constrained
to be in Zm for (DI ), we can round down the right-hand side and get the result. □

So, given any non-integer basic dual solution ȳ, we have a way to produce a valid inequality
for (DI ) that cuts it off. This cut for (DI ) is used as a column for (P): the column is b̃ with objec-
tive coefficient bȳ 0 b̃c. Taking β to be an optimal basis for (P), the new variable corresponding to
this column is the unique variable eligible to enter the basis in the context of the primal simplex
algorithm applied to (P) — the reduced cost is precisely

bȳ 0 b̃c − ȳ 0 b̃ < 0.

The new column for A is b̃ which is integer. The new objective coefficient for c is bȳ 0 b̃c
which is an integer. So the original assumption that A and c are integer is maintained, and we
can repeat. In this way, we get a legitimate cutting-plane framework for (DI ) — though we
emphasize that we do our computations as column generation with respect to (P).
There is clearly a lot of flexibility in how r can be chosen. Next, we demonstrate that in a
very concrete sense, it is always best to choose a minimal r ∈ Zm satisfying (∗≥ ).

Theorem 8.22
Let r ∈ Zm be defined by
rk = −bhki c, for k = 1, . . . , m, (∗= )
and suppose that r̂ ∈ Zm satisfies (∗≥ ) and r ≤ r̂. Then the cut determined by r dominates
the cut determined by r̂.

Proof. It is easy to check that our cut can be re-expressed as
yi ≤ bȳi c + (c0β − y 0 Aβ ) r .
Noting that c0β − y 0 Aβ ≥ 0 for all y that are feasible for (D), we see that the strongest inequality
is obtained by choosing r ∈ Zm to be minimal. □
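Before the worked example, here is a small numpy sketch of this cut-generation step. It is illustrative only: the function name pure_gomory_column is ours (the book's tool for this is pure_gomory( ) in pivot_tools.ipynb), indices are 0-based to match the example below, i indexes a fractional ȳi , and a serious implementation would neither form an explicit inverse nor ignore floating-point rounding when taking floors.

import math
import numpy as np

def pure_gomory_column(A, c, beta, i):
    # from an optimal basis beta of (P) and a fractional ybar_i, choose r via (*_=) and return the
    # new column btilde = e_i + A_beta r together with its objective coefficient floor(ybar' btilde)
    A_beta = A[:, beta]
    H = np.linalg.inv(A_beta)                             # conceptually A_beta^{-1}
    ybar = c[beta] @ H                                    # the dual basic solution
    r = np.array([-math.floor(hk) for hk in H[:, i]])     # the minimal choice (*_=)
    btilde = np.identity(len(beta))[:, i] + A_beta @ r
    return btilde, math.floor(float(ybar @ btilde))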

Example 8.23
We work through an example in pure_gomory_example_1.ipynb (see Appendix A.12), which
again uses pivot_tools.ipynb (see Appendix A.6). The function library pivot_tools.ipynb
contains two (additional) useful tools for this: pure_gomory( ) and dual_plot( , ).
Let
A = [ 7  8  −1  1  3 ; 5  6  −1  2  1 ]   (rows separated by semicolons),   b = (26, 19)0 ,
and c0 = (126, 141, −10, 5, 67) .
So, the integer program (DI ) which we seek to solve is defined by five inequalities in the two
variables y0 and y1 . For the basis of (P), β = (0, 1), we have
   
Aβ = [ 7  8 ; 5  6 ] ,   and hence   A−1β = [ 3  −4 ; −5/2  7/2 ] .

It is easy to check that for this choice of basis, we have
x̄β = (2, 3/2)0 ,
and for the non-basis η = (2, 3, 4), we have c̄0η = (5, 1/2, 1) , which are both non-negative,
and so this basis is optimal for (P). The associated dual basic solution, depicted in Figure 8.4, is
ȳ 0 = (51/2, −21/2) , and the objective value is z = 463 1/2.

Because both ȳ0 and ȳ1 are not integer, we can derive a cut for (DI ) from either. Recalling
the procedure, for any fractional ȳi , we start with the i-th column H·i of H := A−1β , and we get a
new A·j := ei + Aβ r. Throughout we will choose r via (∗= ). So we have
H·0 = (3, −5/2)0  ⇒  r = (−3, 3)0  ⇒  b̃ = (1, 0)0 + [ 7 8 ; 5 6 ] (−3, 3)0 = (4, 3)0 =: A·5 ,
H·1 = (−4, 7/2)0  ⇒  r = (4, −3)0  ⇒  b̃ = (0, 1)0 + [ 7 8 ; 5 6 ] (4, −3)0 = (4, 3)0 .
In fact, for this iteration of this example, we get the same cut for either choice of i. To calculate
the right-hand side of the cut, we have
ȳ 0 b̃ = (51/2, −21/2) (4, 3)0 = 70 1/2 ,

Figure 8.4:

so the cut for (DI ) is


4y0 + 3y1 ≤ 70.
Now, we do our simplex-method calculations with respect to (P).
The new column for (P) is A·5 (above) with objective coefficient c5 := 70. Following the
ratio test, when index 5 enters the basis, index 2 leaves the basis, and so the new basis is β =
(0, 5), with
 
Aβ = [ 7  4 ; 5  3 ] ,
with objective value 462, a decrease. At this point, index 4 has a negative reduced cost, and
index 0 leaves the basis. So we now have β = (4, 5), which turns out to be optimal.
The associated dual basic solution depicted in Figure 8.5 is

ȳ 0 = (131/5, −58/5) , and the objective value is z = 460 4/5.

We observe that the objective function has decreased, but unfortunately both ȳ0 and ȳ1 are
not integers. So we must continue. We have
   
Aβ = [ 3  4 ; 1  3 ] ,   and hence   A−1β = [ 3/5  −4/5 ; −1/5  3/5 ] .

Figure 8.5:

We observe that the objective function has decreased, but because both ȳ0 and ȳ1 are not
integers, we can again derive a cut for (DI ) from either. We calculate
H·0 = (3/5, −1/5)0  ⇒  r = (0, 1)0  ⇒  b̃ = (1, 0)0 + [ 3 4 ; 1 3 ] (0, 1)0 = (5, 3)0 =: A·6 ,
H·1 = (−4/5, 3/5)0  ⇒  r = (1, 0)0  ⇒  b̃ = (0, 1)0 + [ 3 4 ; 1 3 ] (1, 0)0 = (3, 2)0 =: A·7 .
Correspondingly, we have ȳ 0 A·6 = 96 1/5 and ȳ 0 A·7 = 55 2/5, giving us c6 := 96 and c7 := 55.
So, we have two possible cuts for (DI ):
5y0 + 3y1 ≤ 96   and   3y0 + 2y1 ≤ 55.

Choosing to incorporate both as columns for (P), and letting index 7 enter the basis, index
5 leaves (according to the ratio test), and it turns out that we reach an optimal basis β = (7, 5)
after this single pivot. The associated dual basic solution is depicted in Figure 8.6 (the second
graphic is zoomed in):
ȳ 0 = (25, −10) , and the objective value is z = 460.

Not only has the objective decreased, but now all of the ȳi are integers, so we have an optimal
solution for (DI ).

Figure 8.6:
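The first iteration of the example can be re-checked quickly with numpy (0-based indices; the data is exactly that of the example, and the floor computations here happen to be numerically safe for this data):

import math
import numpy as np

A = np.array([[7., 8., -1., 1., 3.], [5., 6., -1., 2., 1.]])
b = np.array([26., 19.])
c = np.array([126., 141., -10., 5., 67.])
beta = [0, 1]
H = np.linalg.inv(A[:, beta])
ybar = c[beta] @ H
print(ybar, ybar @ b)                        # [25.5 -10.5] and the objective value 463.5
r = np.array([-math.floor(h) for h in H[:, 0]])
btilde = np.array([1., 0.]) + A[:, beta] @ r
print(btilde, math.floor(ybar @ btilde))     # [4. 3.] and 70: the cut 4*y0 + 3*y1 <= 70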

8.5.2 Mixed
In this section, we no longer assume that all yi variables are constrained to be integer. That is,
we only assume that I ⊂ {1, 2, . . . , m} is non-empty. The cuts from the previous section cannot
be guaranteed to be valid, so we start anew.
Let β be any basis partition for (P), and let ȳ be the associated dual basic solution. Suppose
that ȳi ∉ Z, for some i ∈ I. We aim to find a cut, valid for (DI ) and violated by ȳ.
Let
b̃1 := ei + Aβ r,
and r ∈ Rm will be determined later. We will accumulate the conditions we need to impose on
r, as we go.
Let w1 be the basic solution associated with the basis β and the “right-hand side” b̃1 . So
w1β = h·i + r, where h·i is defined as the i-th column of A−1β , and w1η = 0. Choosing r ≥ −h·i ,
we can make w1 ≥ 0. Moreover, c0 w1 = c0β (h·i + r) = c0β h·i + c0β r = ȳi + c0β r, so because we
assume that ȳi ∉ Z, we can choose r ∈ Zm , and we have that c0 w1 ∉ Z.
Next, let
b̃2 := Aβ r.
Let w2 be the basic solution associated with the basis β and the “right-hand side” b̃2 . So, now
further choosing r ≥ 0, we have wβ2 = r ≥ 0, wη2 = 0, and c0 w2 = c0β r.
So, we choose r ∈ Zm so that:

rk ≥ max{−bhki c, 0}, for k = 1, . . . , m, (∗≥+ )

Because we have chosen w1 and w2 to be non-negative, forming (y 0 A)wl ≤ c0 wl , for l = 1, 2,
we get a pair of valid inequalities for (D). They have the form y 0 b̃l ≤ c0 wl , for l = 1, 2. Let αj0
denote the j-th row of Aβ . Then our inequalities have the form:
(1 + αi0 r)yi + Σ_{j : j≠i} (αj0 r)yj ≤ ȳi + ȳ 0 Aβ r ,    (I1)
(αi0 r)yi + Σ_{j : j≠i} (αj0 r)yj ≤ ȳ 0 Aβ r .    (I2)
Now, defining z := Σ_{j : j≠i} (αj0 r)yj , we have the following inequalities in the two variables yi
and z, with the indicated slopes:
(1 + αi0 r)yi + z ≤ ȳi + ȳ 0 Aβ r ,   slope −1/(1 + αi0 r)    (B1)
(αi0 r)yi + z ≤ ȳ 0 Aβ r ,   slope −1/(αi0 r)    (B2)

Note that the intersection point (yi∗ , z ∗ ) of the lines associated with these inequalities (subtract
the second equation from the first) has yi∗ = ȳi and z ∗ = Σ_{j : j≠i} (αj0 r)ȳj . Also, the “slopes”
indicated regard yi as the ordinate and z as the abscissa.
Bearing in mind that we choose r ∈ Zm and that A is assumed to be integer, we have that
αi0 r ∈ Z. There are now two cases to consider:
• αi0 r ≥ 0, in which case the first line has negative slope and the second line has a more
negative slope (or an infinite slope, when αi0 r = 0);
• αi0 r ≤ −1, in which case the second line has positive slope and the first line has a more
positive slope (or an infinite slope, when αi0 r = −1).

See Figures 8.7 and 8.8.


In both cases, we are interested in the point (z 1 , yi1 ) where the first line intersects the line
yi = bȳi c + 1 and the point (z 2 , yi2 ) where the second line intersects the line yi = bȳi c.
We can check that
z 1 = ȳi + ȳ 0 Aβ r − (1 + αi0 r) (bȳi c + 1) ,
z 2 = ȳ 0 Aβ r − (αi0 r)bȳi c .
Subtracting, we have
z 1 − z 2 = (ȳi − bȳi c) − (1 + αi0 r) ,
where ȳi − bȳi c ∈ (0, 1) and 1 + αi0 r ∈ Z. So we see that: z 1 < z 2 precisely when αi0 r ≥ 0 ;
and z 2 < z 1 precisely when αi0 r ≤ −1 . Moreover, the slope of the line through the pair of points
(z 1 , yi1 ) and (z 2 , yi2 ) is just
1/(z 1 − z 2 ) = 1/( (ȳi − bȳi c) − (1 + αi0 r) ) .

Figure 8.7: (F-BMI) cut when αi0 r ≥ 0

We now define the inequality

((ȳi − bȳi c) − (1 + αi0 r)) (yi − bȳi c) ≥ z − ȳ 0 Aβ r + (αi0 r)bȳi c,

which has the more convenient form

((1 + αi0 r) − (ȳi − bȳi c)) yi + z ≤ ȳ 0 Aβ r − (ȳi − bȳi c − 1) bȳi c. (F-BMI)

By construction, we have the following two results.

Lemma 8.24
(F-BMI) is satisfied at equality by both of the points (z 1 , yi1 ) and (z 2 , yi2 ).

Figure 8.8: (F-BMI) cut when αi0 r ≤ −1

Lemma 8.25
(F-BMI) is valid for
 
{ (yi , z) ∈ R2 : (B1), yi ≥ dȳi e } ∪ { (yi , z) ∈ R2 : (B2), yi ≤ bȳi c } .

Lemma 8.26
(F-BMI) is violated by the point (yi∗ , z ∗ ).

Proof. Plugging (yi∗ , z ∗ ) into (F-BMI), and making some if-and-only-if manipulations, we obtain

(ȳi − bȳi c − 1) (ȳi − bȳi c) ≥ 0,

which is not satisfied. □

Finally, translating (F-BMI) back to the original variables y ∈ Rm , we get
((1 + αi0 r) − (ȳi − bȳi c)) yi + Σ_{j : j≠i} (αj0 r)yj ≤ ȳ 0 Aβ r − (ȳi − bȳi c − 1) bȳi c ,
or,
− (ȳi − bȳi c − 1) yi + y 0 Aβ r ≤ ȳ 0 Aβ r − (ȳi − bȳi c − 1) bȳi c ,
which, finally, has the convenient form
y 0 (Aβ r − (ȳi − bȳi c − 1) ei ) ≤ c0β r − (ȳi − bȳi c − 1) bȳi c .    (F-GMI)

We immediately have:

Theorem 8.27
(F-GMI) is violated by the point ȳ.

Finally, we have:

Theorem 8.28
(F-GMI) is valid for the following relaxation of the feasible region of (D):

{y ∈ Rm : y 0 Aβ ≤ c0β , yi ≥ dȳi e} ∪ {y ∈ Rm : y 0 Aβ ≤ c0β , yi ≤ bȳi c}

Proof. The proof, maybe obvious, is by a simple disjunctive argument. We will argue that
(F-BMI) is valid for both S1 := {y ∈ Rm : y 0 Aβ ≤ c0β , −yi ≤ −bȳi c − 1} and S2 := {y ∈
Rm : y 0 Aβ ≤ c0β , yi ≤ bȳi c}.
The inequality (F-BMI) is simply the sum of (B1) and the scalar ȳi − bȳi c times −yi ≤
−bȳi c − 1. It follows then that taking (I1) plus ȳi − bȳi c times −yi ≤ −bȳi c − 1, we get an
inequality equivalent to (F-GMI).
Similarly, it is easy to check that the inequality (F-BMI) is simply (B2) plus 1 − (ȳi − bȳi c)
times yi ≤ bȳi c. It follows then that taking (I2) plus 1 − (ȳi − bȳi c) times yi ≤ bȳi c, we also get
an inequality equivalent to (F-GMI). □

In our algorithm, we append columns to (P), rather than cuts to (D). The column for (P)
corresponding to (F-GMI) is
Aβ r − (ȳi − bȳi c − 1) ei ,
and the associated cost coefficient is
c0β r − (ȳi − bȳi c − 1) bȳi c.

So A−1β times the column is
r − (ȳi − bȳi c − 1) h·i .
Agreeing with what we calculated in Lemma 8.26, we have the following result.

Proposition 8.29
The reduced cost of the column for (P) corresponding to (F-GMI) is

(ȳi − bȳi c − 1) (ȳi − bȳi c) < 0.

Proof.
c0β r − (ȳi − bȳi c − 1) bȳi c − c0β (r − (ȳi − bȳi c − 1) h·i )
= (ȳi − bȳi c − 1) (c0β h·i − bȳi c)
= (ȳi − bȳi c − 1) (ȳi − bȳi c) . □

Next, we come to the choice of r.



Theorem 8.30
Let r ∈ Zm be defined by

rk = max{0, −bhki c}, for k = 1, 2, . . . , m, (∗=+ )

and suppose that r̂ ∈ Zm satisfies (∗≥+ ) and r ≤ r̂. Then the cut determined by r dominates
the cut determined by r̂.

Proof. We simply rewrite (F-GMI) as

(c0β − y 0 Aβ )r ≥ (ȳi − bȳi c − 1) (bȳi c − yi ).

Observing that c0β − y 0 Aβ ≥ 0 for y that are feasible for (D), we see that the tightest inequality
of this type, satisfying (∗≥+ ), arises by choosing a minimal r. The result follows. □
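Analogously to the pure case, here is a small numpy sketch (illustrative only; the function name mixed_gomory_column is ours) of generating the column for (P) corresponding to (F-GMI), using the minimal choice (∗=+ ) of r. As in the derivation, A is assumed integer, beta is an optimal basis of (P), and i (0-based here) is the index of a fractional ȳi with i ∈ I.

import math
import numpy as np

def mixed_gomory_column(A, c, beta, i):
    A_beta = A[:, beta]
    H = np.linalg.inv(A_beta)                                     # conceptually A_beta^{-1}
    ybar = c[beta] @ H
    fi = ybar[i] - math.floor(ybar[i])                            # fractional part of ybar_i, in (0, 1)
    r = np.array([max(0, -math.floor(hk)) for hk in H[:, i]])     # the minimal choice (*_=+)
    column = A_beta @ r - (fi - 1.0) * np.identity(len(beta))[:, i]
    cost = float(c[beta] @ r) - (fi - 1.0) * math.floor(ybar[i])
    return column, cost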

8.5.3 Finite termination


Making a version of our Gomory cutting-plane scheme that we can prove is finitely terminating
is rather technical, though it can be done in essentially the same manner for both the pure and
mixed cases. We need to treat the objective-function value as an additional variable (numbered
first), employ the Simplex Algorithm adapted to the ε-perturbed problem, always choose the
least-index i ∈ I having ȳi ∉ Z , and choose r via (∗= ) or (∗=+ ), as appropriate, to generate the
Gomory cuts. Details can be found in [2] and [4].

8.5.4 Branch-and-Cut
State-of-the-art algorithms for (mixed-)integer linear optimization (like Gurobi, Cplex, and Xpress)
combine cuts with branch-and-bound. There are a lot of software design and tuning issues that
make this work successfully.

8.6 Exercises
Exercise 8.1 (Task scheduling, continued)
Consider again the “task scheduling” Exercise 2.5. Take the dual of the linear-optimization
problem that you formulated. Explain how this dual can be interpreted as a kind of network
problem. Using Python/Gurobi, solve the dual of the example that you created for Exercise 2.5
and interpret the solution.

Exercise 8.2 (Pivoting and total unimodularity)



A pivot in an m × n matrix A means choosing a row i and column j with aij ≠ 0 , subtracting
(akj /aij ) times row i from all other rows k (≠ i) , and then dividing row i by aij . Note that after the
pivot, column j becomes the i-th standard-unit column. Prove that if A is TU, then it is TU after
a pivot.
Exercise 8.3 (Comparing formulations for a toy problem)
Consider the systems:
S1 : 2x1 + 2x2 + x3 + x4 ≤ 2;
xj ≤ 1;
−xj ≤ 0.

S2 : x1 + x2 + x3 ≤ 1;
x1 + x2 + x4 ≤ 1;
−xj ≤ 0.

S3 : x1 + x2 ≤ 1;
x1 + x3 ≤ 1;
x1 + x4 ≤ 1;
x2 + x3 ≤ 1;
x2 + x4 ≤ 1;
−xj ≤ 0.
Notice that each system has precisely the same set of integer solutions. In fact, each system
chooses, via its feasible integer (0/1) solutions, the “vertex packings” of the graph with vertices
1, 2, 3, 4 and edges {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}.

A vertex packing of a graph is a set of vertices with no edges between them. For this particular
graph we can see that the packings are: ∅, {1}, {2}, {3}, {4}, {3, 4}.
Compare the feasible regions Si of the continuous relaxations, for each pair of these systems.
Specifically, for each choice of pair i 6= j , demonstrate whether or not the solution set of Si is
contained in the solution set of Sj . HINT: To prove that the solution set of Si is contained in the
solution set of Sj , it suffices to demonstrate that every inequality of Sj is a non-negative linear
combination of the inequalities of Si . To prove that the solution set of Si is not contained in the
solution set of Sj , it suffices to give a solution of Si that is not a solution of Sj .
Exercise 8.4 (Comparing facility-location formulations)
We have seen two formulations of the forcing constraints for the uncapacitated facility-location
problem. We have a choice of the mn constraints: −yi + xij ≤ 0 , for i = 1, . . . , m and
j = 1, . . . , n , or the m constraints: −nyi + Σ_{j=1}^n xij ≤ 0 , for i = 1, . . . , m . Which formulation
is stronger? That is, compare (both computationally and analytically) the strength of the two as-
sociated continuous relaxations (i.e., when we relax yi ∈ {0, 1} to 0 ≤ yi ≤ 1 , for i = 1, . . . , m).
The Jupyter notebook UFL.ipynb can be used to perform experiments comparing the use of
(S) versus (W). (see Appendix A.11).

Exercise 8.5 (Comparing piecewise-linear formulations)


We have seen that the adjacency condition for piecewise-linear univariate functions can be mod-
eled by

λ1 ≤ y1 ;
λj ≤ yj−1 + yj , for j = 2, . . . , n − 1 ;
λn ≤ yn−1 .

An alternative formulation is
Σ_{i=1}^{j} yi ≤ Σ_{i=1}^{j+1} λi ,   for j = 1, . . . , n − 2 ;
Σ_{i=j}^{n−1} yi ≤ Σ_{i=j}^{n} λi ,   for j = 2, . . . , n − 1 .

Explain why this alternative formulation is valid, and compare its strength to the original for-
mulation, when we relax yi ∈ {0, 1} to 0 ≤ yi ≤ 1 , for i = 1, . . . , n − 1 . (Note that for both
formulations, we require λi ≥ 0, for i = 1, . . . , n , Σ_{i=1}^{n} λi = 1 , and Σ_{i=1}^{n−1} yi = 1).

Exercise 8.6 (Variable fixing)


Prove Corollary 8.19.

Exercise 8.7 (Gomory cuts)


Prove that we need at least k Chvátal-Gomory cuts to solve Example 8.15. You can observe this
bad behavior specifically for Gomory cuts in pure_gomory_example_2.ipynb (see Appendix
A.12).

Exercise 8.8 (Solve pure integer problems using Gomory cuts)


Extend what you did for Exercise 4.1 to now solve pure integer problems using Gomory cuts.
pivot_tools.ipynb (see Appendix A.6) contains two (additional) useful tools for this: pure_gomory( )
and dual_plot( , ). Using only the functions in pivot_tools.ipynb, extend your code from Ex-
ercise 4.1 to solve pure integer problems using Gomory cuts. As before, do not worry about
degeneracy/anti-cycling. Make some small examples to fully illustrate your code.

Exercise 8.9 (Make amends)


Find an interesting applied problem, model it as a pure- or mixed- integer linear-optimization prob-
lem, and test your model with Python/Gurobi.

Credit will be given for deft modeling, sophisticated use of Python/Gurobi, testing on meaningfully-
large instances, and insightful analysis. Try to play with Gurobi integer solver options (they can
be set through Python) to get better behavior of the solver.
Your grade on this problem will replace your grades on up to 6 homework problems (i.e.,
up to 6 homework problems on which you have lower grades than you get on this one). I will

not consider any re-grades on this one! If you already have all or mostly A’s (or not), do a good job
on this one because you want to impress me, and because you are ambitious, and because this
problem is what we have been working towards all during the course, and because you should
always finish strong.

Take rest
Appendices


A.1 LATEX template


LATEX TEMPLATE

Your actual name ([email protected])

This template can serve as a starting point for learning LATEX. You may download MiKTeX from
miktex.org to get started. Look at the source file for this document (in Section 5) to see how to get
all of the effects demonstrated.

1. This is the first section where we make some lists


It is easy to make enumerated lists:
(1) This is the first item
(2) Here is the second
And even enumerated sublists:
(1) This is the first item
(2) Here is the second with a sublist
(a) first sublist item
(b) and here is the second

2. Here is a second section where we typeset some math


You can typeset math inline, like Σ_{j=1}^n aij xj , by just enclosing the math in dollar signs.
But if you want to display the math, then you do it like this:
Σ_{j=1}^n aij xj   ∀ i = 1, . . . , m.
And here is a matrix:
( 1      π   2   1/2   ν )
( 6.2    r   2   4     5 )
( |y 0 |   R   R   r     R̂ )
Here is an equation array, with the equal signs nicely aligned:
(2.1)   Σ_{j=1}^n xj = 5
(2.2)   Σ_{j=1}^n yj = 7
(2.3)   Σ_{j∈S} xj = 29

The equations are automatically numbered, like x.y, where x is the section number and y is the y-th
equation in section x. By tagging the equations with labels, we can refer to them later, like (2.3) and (2.1).
Theorem 2.1. This is my favorite theorem.
Proof. Unfortunately, the space here does not allow for including my ingenious proof of Theorem 2.1. 

3. Here is how I typset a standard-form linear-optimization problem

min c0 x
(P) Ax = b;
x ≥ 0.
Notice that in this example, there are 4 columns separated by 3 &’s. The ‘rrcl’ organizes justification
within a column. Of course, one can make more columns.

Date: October 31, 2020.



4. Graphics
This is how to include and refer to Figure 1 with pdfLaTeX.

Figure 1. Another duality

5. The LATEX commands to produce this document


Look at the LATEX commands in this section to see how each of the elements of this document was produced.
Also, this section serves to show how text files (e.g., programs) can be included in a LATEX document verbatim.
% LaTeX_Template.tex // J. Lee
%
% ----------------------------------------------------------------
% AMS-LaTeX ************************************************
% **** -----------------------------------------------------------
\documentclass{amsart}
\usepackage{graphicx,amsmath,amsthm}
\usepackage{hyperref}
\usepackage{verbatim}
\usepackage[a4paper,text={16.5cm,25.2cm},centering]{geometry}
% ----------------------------------------------------------------
\vfuzz2pt % Don’t report over-full v-boxes if over-edge is small
\hfuzz2pt % Don’t report over-full h-boxes if over-edge is small
% THEOREMS -------------------------------------------------------
\newtheorem{thm}{Theorem}[section]
\newtheorem{cor}[thm]{Corollary}
\newtheorem{lem}[thm]{Lemma}
\newtheorem{prop}[thm]{Proposition}
\theoremstyle{definition}
\newtheorem{defn}[thm]{Definition}
\theoremstyle{remark}
\newtheorem{rem}[thm]{Remark}
\numberwithin{equation}{section}
% MATH -----------------------------------------------------------
\newcommand{\Real}{\mathbb R}
\newcommand{\eps}{\varepsilon}
\newcommand{\To}{\longrightarrow}
\newcommand{\BX}{\mathbf{B}(X)}
\newcommand{\A}{\mathcal{A}}
% ----------------------------------------------------------------
\begin{document}

\title{\LaTeX~ Template}

\date{\today}

\maketitle

\href{mailto:[email protected]}
{Your actual name ([email protected])}

%
%\medskip
%
%(this identifies your work and it \emph{greatly} help’s me in returning homework to you by email
%---- just plug in the appropriate replacements in the \LaTeX~ source; then when I click on the
%hyperlink above, my email program opens up starting a message to you)

\bigskip

% ----------------------------------------------------------------

This template can serve as a starting point for learning \LaTeX. You may download MiKTeX from
{\tt miktex.org}
to get started. Look at the source file for this
document (in Section \ref{sec:appendix})
to see how to get all of the effects demonstrated.

\section{This is the first section where we make some lists}

It is easy to make enumerated lists:


\begin{enumerate}
\item This is the first item
\item Here is the second
\end{enumerate}

And even enumerated sublists:


\begin{enumerate}
\item This is the first item
\item Here is the second with a sublist
\begin{enumerate}
\item first sublist item
\item and here is the second
\end{enumerate}
\end{enumerate}

\section{Here is a second section where we typeset some math}

You can typeset math inline, like $\sum_{j=1}^n a_{ij} x_j$, by just enclosing the math in dollar signs.

But if you want to \emph{display} the math, then you do it like this:

\[
\sum_{j=1}^n a_{ij} x_j~ \forall~ i=1,\ldots,m.
\]

And here is a matrix:


\[
\left(
\begin{array}{ccccc}
1 & \pi & 2& \frac{1}{2} & \nu \\
6.2 & r & 2 & 4 & 5 \\
|y’| & \mathcal{R} & \mathbb{R} & \underbar{r} & \hat{R} \\
\end{array}
\right)
\]

Here is an equation array, with the equal signs nicely aligned:


\begin{eqnarray}
\sum_{j=1}^n x_j &=& 5 \label{E1} \\
\sum_{j=1}^n y_j &=& 7 \label{E7} \\

\sum_{j\in S} x_j &=& 29 \label{E4}


\end{eqnarray}

The equations are automatically numbered, like $x.y$, where


$x$ is the section number and $y$ is the $y$-th equation in section $x$.
By tagging the equations
with labels, we can refer to them later, like (\ref{E4}) and (\ref{E1}).

\begin{thm}\label{Favorite}
This is my favorite theorem.
\end{thm}
\begin{proof}
Unfortunately, the space here does not allow for including my ingenious proof
of Theorem \ref{Favorite}.
\end{proof}

\section{Here is how I typset a standard-form linear-optimization problem}

\[
\tag{P}
\begin{array}{rrcl}
\min & c’x & & \\
& Ax & = & b~; \\
& x & \geq & \mathbf{0}~.
\end{array}
\]

Notice that in this example, there are 4 columns separated by 3 \&’s.


The ‘rrcl’ organizes justification within a column.
Of course, one can make more columns.

\section{Graphics}

This is how to include and refer to Figure \ref{nameoffigure} with pdfLaTeX.

\begin{figure}[h!!]
\includegraphics[width=0.4\textwidth]{yinyang.jpg}
\caption{Another duality}\label{nameoffigure}
\end{figure}

\section{The \LaTeX~ commands to produce this document}


\label{sec:appendix}

Look at the \LaTeX~ commands in this section to see how each of the elements
of this document was produced. Also, this section serves to show
how text files (e.g., programs) can be included in a \LaTeX~ document verbatim.

\bigskip

\hrule

\small
\verbatiminput{LaTeX_Template.tex}
\normalsize

% ----------------------------------------------------------------

\end{document}
% ----------------------------------------------------------------

A.2 MatrixLP.ipynb
MatrixLP

August 23, 2021

Example: Setting up and solving a matrix-style LP with Python/Gurobi

min c0 x + f 0 w
Ax + Bw ≤ b
Dx = g
x ≥ 0, w ≤ 0

Note that we have the following dual, but we don’t model it:

max y0 b + v0 g
y0 A + v0 D ≤ c0
y0 B ≥ f0
y ≤ 0, v unrestricted

Rather, we recover its solution from Gurobi.


References: * Jon Lee, “A First Course in Linear Optimization”, Fourth Edition (Version 4.0), Reex
Press, 2013-20.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f
import numpy as np
import gurobipy as gp
from gurobipy import GRB

class StopExecution(Exception):
def _render_traceback_(self):
pass

[2]: # setting the matrix sizes and random data


n1=7
n2=15
m1=2
m2=4
np.random.seed(56) # set seed to be able to repeat the same random data
A=np.random.rand(m1,n1)
B=np.random.rand(m1,n2)
D=np.random.rand(m2,n1)

# Organize the situation (i.e., choose the right-hand side coefficients)


# so that the primal problem has a feasible solution
xs=np.random.rand(n1)
ws=-np.random.rand(n2)
b=np.matmul(A,xs)+np.matmul(B,ws)+0.01*np.random.rand(m1)
g=np.matmul(D,xs)

# Organize the situation (i.e., choose the objective coefficients)


# so that the dual problem has a feasible solution
ys=-np.random.rand(m1)
vs=np.random.rand(m2)-np.random.rand(m2)
c=np.matmul(np.transpose(A),ys)+np.matmul(np.transpose(D),vs)+0.01*np.random.
,→rand(n1)

f=np.matmul(np.transpose(B),ys)-0.01*np.random.rand(n2)

[3]: model = gp.Model()


model.reset()
x = model.addMVar(n1) # default is a nonnegative continuous variable
w = model.addMVar(n2, ub=0.0, lb=-GRB.INFINITY)
objective = model.setObjective(c@x+f@w, GRB.MINIMIZE)
constraints1 = model.addConstr(A@x+B@w <= b)
constraints2 = model.addConstr(D@x == g)

--------------------------------------------
Warning: your license will expire in 10 days
--------------------------------------------

Using license file C:\Users\jonxlee\gurobi.lic


Academic license - for non-commercial use only - expires 2021-06-28
Discarded solution information

[4]: model.optimize()
if model.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", model.status)
print("***** This is a problem. Model does not have an optimal solution")
raise StopExecution

Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)


Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 6 rows, 22 columns and 72 nonzeros
Model fingerprint: 0x734450bc
Coefficient statistics:
Matrix range [2e-03, 1e+00]
Objective range [1e-01, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [7e-01, 3e+00]
Presolve time: 0.01s
Presolved: 6 rows, 22 columns, 72 nonzeros

Iteration Objective Primal Inf. Dual Inf. Time


0 -7.2547823e+30 1.946982e+31 7.254782e+00 0s
9 2.6453973e+00 0.000000e+00 0.000000e+00 0s

Solved in 9 iterations and 0.02 seconds


Optimal objective 2.645397253e+00

[5]: print("***** Primal solution:")


for j in range(0,n1): print("x[",j,"]=",
np.format_float_positional(np.ndarray.item(x[j].X),4,pad_right=4))
print(" ")
for j in range(0,n2): print("w[",j,"]=",
np.format_float_positional(np.ndarray.item(w[j].X),4,pad_right=4))
print(" ")
print("***** Dual solution:")
for i in range(0,m1): print("y[",i,"]=",
np.format_float_positional(constraints1[i].Pi,4,pad_right=4))
print(" ")
for i in range(0,m2): print("v[",i,"]=",
np.format_float_positional(constraints2[i].Pi,4,pad_right=4))

***** Primal solution:


x[ 0 ]= 0.2689
x[ 1 ]= 0.0080
x[ 2 ]= 1.3952
x[ 3 ]= 0.
x[ 4 ]= 0.4962
x[ 5 ]= 0.
x[ 6 ]= 0.

w[ 0 ]= 0.
w[ 1 ]= 0.
w[ 2 ]= 0.
w[ 3 ]= 0.
w[ 4 ]= 0.
w[ 5 ]= 0.
w[ 6 ]= 0.
w[ 7 ]= 0.
w[ 8 ]= 0.
w[ 9 ]= 0.
w[ 10 ]= -4.7348
w[ 11 ]= 0.
w[ 12 ]= -4.392
w[ 13 ]= 0.
w[ 14 ]= 0.

***** Dual solution:


y[ 0 ]= -0.4424
y[ 1 ]= -0.7261

v[ 0 ]= -0.8196
v[ 1 ]= -0.6668
v[ 2 ]= -0.0458
v[ 3 ]= 0.1904

A.3 Production.ipynb
Production

June 25, 2021

Production model: constraint-style LP with Python/Gurobi


Notes: * This example is meant to show how to: * do constraint-style LP’s (as opposed to matrix
style), though the model we are setting up is max{c0 x : Ax ≤ b, x ≥ 0}. * extract primal and dual
solutions, primal and dual slacks, and sensitivity information are printed * pass constraint names
to Gurobi and then retrieve constraints from Gurobi by these names
References: * Jon Lee, “A First Course in Linear Optimization”, Fourth Edition (Version 4.0), Reex
Press, 2013-20.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f
import numpy as np
import gurobipy as gp
from gurobipy import GRB

class StopExecution(Exception):
def _render_traceback_(self):
pass
[2]: # Some toy data
m=3
n=2
M=list(range(0,m))
N=list(range(0,n))
A = np.array([ [8, 5], [8, 6], [8, 7] ])
b = np.array([32, 33, 35])
c = np.array([3 ,2])

[3]: model = gp.Model()


model.reset()
x = model.addMVar(n)
revenueobjective = model.setObjective(sum(c[j]*x[j] for j in N), GRB.MAXIMIZE)
for i in M: # naming the constraints r0,r1,r2,... (inside Gurobi)
model.addConstr(sum(A[i,j]*x[j] for j in N) <= b[i], name='r'+str(i))

Using license file C:\Users\jonxlee\gurobi.lic


Academic license - for non-commercial use only - expires 2021-01-10
Discarded solution information

[4]: model.optimize()
if model.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", model.status)
print("***** This is a problem. Model does not have an optimal solution")
raise StopExecution
print(" ")
print("primal var, dual slack, obj delta-lb, obj delta-ub")
for j in N: print("x[",j,"]=",np.format_float_positional(np.ndarray.item(x[j].
,→X),4,pad_right=4),

" t[",j,"]=", np.format_float_positional(np.ndarray.item(x[j].


,→RC),4,pad_right=4),

" L[",j,"]=", np.format_float_positional(np.ndarray.item(x[j].


,→SAObjLow-c[j]),4,pad_right=4),

" U[",j,"]=", np.format_float_positional(np.ndarray.item(x[j].


,→SAObjUp-c[j]),4,pad_right=4))

print(" ")
print("dual vars, primal slack, rhs delta-lb, rhs delta-ub")
for i in M:
constr=model.getConstrByName('r'+str(i)) # retriving from Gurobi the
,→named constraints r0,r1,r2,...

print("y[",i,"]=",np.format_float_positional(constr.Pi,4,pad_right=4),
" s[",i,"]=", np.format_float_positional(constr.
,→Slack,4,pad_right=4),

" L[",i,"]=", np.format_float_positional(constr.


,→SARHSLow-b[i],4,pad_right=4),

" U[",i,"]=", np.format_float_positional(constr.


,→SARHSUp-b[i],4,pad_right=4))
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 3 rows, 2 columns and 6 nonzeros
Model fingerprint: 0x32d9daed
Coefficient statistics:
Matrix range [5e+00, 8e+00]
Objective range [2e+00, 3e+00]
Bounds range [0e+00, 0e+00]
RHS range [3e+01, 4e+01]
Presolve time: 0.00s
Presolved: 3 rows, 2 columns, 6 nonzeros

Iteration Objective Primal Inf. Dual Inf. Time


0 5.0000000e+30 5.250000e+30 5.000000e+00 0s
3 1.2125000e+01 0.000000e+00 0.000000e+00 0s

Solved in 3 iterations and 0.01 seconds


Optimal objective 1.212500000e+01

primal var, dual slack, obj delta-lb, obj delta-ub


x[ 0 ]= 3.3750 t[ 0 ]= 0. L[ 0 ]= -0.3333 U[ 0 ]= 0.2
x[ 1 ]= 1. t[ 1 ]= 0. L[ 1 ]= -0.125 U[ 1 ]= 0.25

dual vars, primal slack, rhs delta-lb, rhs delta-ub


y[ 0 ]= 0.2500 s[ 0 ]= 0. L[ 0 ]= -1. U[ 0 ]= 1.
y[ 1 ]= 0.125 s[ 1 ]= 0. L[ 1 ]= -1. U[ 1 ]= 0.5
y[ 2 ]= 0. s[ 2 ]= 1.0000 L[ 2 ]= -1. U[ 2 ]= inf

A.4 Multi-commodityFlow.ipynb
Multi-commodityFlow

June 25, 2021

Multi-Commodity Network-Flow model: constraint-style LP with Python/Gurobi

min Σ_{k=1}^{K} Σ_{e∈A} c_e^k x_e^k
Σ_{e∈A : t(e)=v} x_e^k − Σ_{e∈A : h(e)=v} x_e^k = b_v^k ,   for v ∈ N , k = 1, 2, . . . , K ;
Σ_{k=1}^{K} x_e^k ≤ u_e ,   for e ∈ A ;
x_e^k ≥ 0 ,   for e ∈ A , k = 1, 2, . . . , K

Notes: * K=1 is ordinary single-commodity network flow. Integer solutions for free when node-
supplies and arc capacities are integer. * K=2 example below with integer data gives a fractional
basic optimum. This example doesn’t have any feasible integer flow at all.
References: * Jon Lee, “A First Course in Linear Optimization”, Fourth Edition (Version 4.0), Reex
Press, 2013-20.
MIT License
Copyright (c) 2021 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f
import itertools
import numpy as np
#%matplotlib notebook
%matplotlib inline
import matplotlib.pyplot as plt
import gurobipy as gp
from gurobipy import GRB
import networkx as nx

class StopExecution(Exception):
def _render_traceback_(self):
pass

[2]: # parameters
solveLPOnly=True # set False to solve as an IP

[3]: # # Some toy data: 1 commodity


# Supplies= {
# # node i: [supply commodity[1] ... supply commodity[K]],
# 1: [12.],
# 2: [6.],
# 3: [-2.],
# 4: [0.],
# 5: [-9.],
# 6: [-7.]}

# CapacityCosts = {
# # arc (i,j): [capacity, cost commodity[1] ... cost commodity[K]],
# (1,2): [6., 2],
# (1,3): [8., -5],
# (2,4): [5., 3],
# (2,5): [7., 12],
# (3,5): [5., -9],
# (4,5): [8., 2],
# (4,6): [5., 0],
# (5,6): [5., 4]}

# Some toy data: 2 commodities with a fractional LP basic optimum


Supplies= {
# node i: [supply commodity[1] ... supply commodity[K]],
1: [1., 0.],
2: [0., -1.],
3: [0., 0.],
4: [0., 0.],
5: [0., 0.],
6: [0., 0.],
7: [0., 1.],
8: [-1., 0.]}

CapacityCosts = {
# arc (i,j): [capacity, cost commodity[1] ... cost commodity[K]],
(1,2): [1., 1, 1],
(1,3): [1., 1, 1],
(2,5): [1., 1, 1],
(3,4): [1., 1, 1],
(4,1): [1., 1, 1],
(4,7): [1., 1, 1],
(5,6): [1., 1, 1],
(6,2): [1., 1, 1],
(6,8): [1., 1, 1],
(7,3): [1., 1, 1],
(7,8): [1., 1, 1],
(8,5): [1., 1, 1]}

[4]: Nodes=list(Supplies.keys()) # get node list from supply data


K=len(Supplies[Nodes[0]]) # get number of commodities from supply data
Commods=list(range(1,K+1)) # name the commodities 1,2,...,K
Arcs=list(CapacityCosts.keys()) # get arc list from Capacity/Cost data
ArcsCrossCommods=list(itertools.product(Arcs,Commods)) # make cross product of
,→Arcs and Commods for variable indexing

[5]: model = gp.Model()


if solveLPOnly==True:
x = model.addVars(ArcsCrossCommods)
else:
x = model.addVars(ArcsCrossCommods,vtype=GRB.INTEGER)
model.setObjective(sum(sum(CapacityCosts[i,j][k]*x[(i,j),k] for (i,j) in Arcs)
,→for k in Commods), GRB.MINIMIZE)

model.addConstrs(sum(x[(i,j),k] for k in Commods) <= CapacityCosts[i,j][0] for


,→(i,j) in Arcs)

model.addConstrs(
(sum(x[(i, j),k] for j in Nodes if (i, j) in Arcs) - sum(x[(j, i),k] for j in
,→Nodes if (j,i) in Arcs)

== Supplies[i][k-1] for i in Nodes for k in Commods))


model.update()

--------------------------------------------
Warning: your license will expire in 3 days
--------------------------------------------

Using license file C:\Users\jonxlee\gurobi.lic


Academic license - for non-commercial use only - expires 2021-06-28
[6]: model.optimize()
if model.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", model.status)
print("***** This is a problem. Model does not have an optimal solution")
raise StopExecution
print(" ")
print("***** Flows:")
for (i,j) in Arcs:
arcflow=""
for k in Commods:
arcflow += str(round(x[(i,j),k].X,4))
arcflow += " "
print("x[(",i,",",j,"), *]=", arcflow, "capacity:", CapacityCosts[i,j][0])

Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)


Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 28 rows, 24 columns and 72 nonzeros
Model fingerprint: 0xf7e9da00
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [1e+00, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [1e+00, 1e+00]
Presolve removed 26 rows and 22 columns
Presolve time: 0.01s
Presolved: 2 rows, 2 columns, 4 nonzeros

Iteration Objective Primal Inf. Dual Inf. Time


0 8.0000000e+00 1.000000e+00 0.000000e+00 0s
1 8.0000000e+00 0.000000e+00 0.000000e+00 0s

Solved in 1 iterations and 0.01 seconds


Optimal objective 8.000000000e+00

***** Flows:
x[( 1 , 2 ), *]= 0.5 0.5 capacity: 1.0
x[( 1 , 3 ), *]= 0.5 0.0 capacity: 1.0
x[( 2 , 5 ), *]= 0.5 0.0 capacity: 1.0
x[( 3 , 4 ), *]= 0.5 0.5 capacity: 1.0
x[( 4 , 1 ), *]= 0.0 0.5 capacity: 1.0
x[( 4 , 7 ), *]= 0.5 0.0 capacity: 1.0
x[( 5 , 6 ), *]= 0.5 0.5 capacity: 1.0
x[( 6 , 2 ), *]= 0.0 0.5 capacity: 1.0
x[( 6 , 8 ), *]= 0.5 0.0 capacity: 1.0
x[( 7 , 3 ), *]= 0.0 0.5 capacity: 1.0
x[( 7 , 8 ), *]= 0.5 0.5 capacity: 1.0
x[( 8 , 5 ), *]= 0.0 0.5 capacity: 1.0
[7]: G = nx.DiGraph()
G.add_nodes_from(Nodes)
G.add_edges_from(Arcs)
plt.figure(figsize=(8,8))
edge_labels=nx.draw_networkx_edge_labels(G,edge_labels=CapacityCosts,
pos=nx.shell_layout(G), label_pos=0.3, font_size=10)
nx.draw_shell(G, with_labels=True, node_color='cyan', node_size=800,
font_size=20, arrowsize=20)
print("Network with node labels and capacities/costs on arcs")

Network with node labels and capacities/costs on arcs


[8]: #k=2
for k in Commods:
Supply1_label={}
for i in Nodes:
Supply1_label[i]= str(i)+': '+str(Supplies[i][k-1])

Flow0=np.zeros(len(Arcs))
Flow=dict(zip(list(Arcs), Flow0))
for (i,j) in Arcs: Flow[i,j]= str(round(x[(i,j),k].X,4))
H=nx.relabel_nodes(G, Supply1_label)
plt.figure(figsize=(8,8))
edge_labels=nx.draw_networkx_edge_labels(H,edge_labels=Flow,
pos=nx.shell_layout(G), label_pos=0.7, font_size=10)
nx.draw_shell(H, with_labels=True, node_color='cyan',
node_size=1200, font_size=10, arrowsize=15)
print("Network with supplies and flows for commodity ",k)

Network with supplies and flows for commodity 1


Network with supplies and flows for commodity 2

A.5 pivot_example.ipynb
pivot_example

June 25, 2021

Example: pivot tools for standard form linear-optimization problem P


For standard-form problems

z = min c′x                    (P)
        Ax = b
         x ≥ 0.

Notes: * Can work with an ε-perturbed right-hand side
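
For orientation, here is a sketch of the quantities that the pivot tools report for a basic partition (β, η), consistent with pivot_algebra() in pivot_tools:

x̄_β = A_β⁻¹ b,   x̄_η = 0,   ȳ = (A_β′)⁻¹ c_β,   c̄_η = c_η − A_η′ ȳ,

with objective value objval = c_β′ x̄_β = ȳ′b.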


References: * Jon Lee, “A First Course in Linear Optimization”, Fourth Edition (Version 4.0), Reex
Press, 2013-20.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f

[2]: %run ./pivot_tools.ipynb

pivot_tools loaded: pivot_perturb, pivot_algebra, N, pivot_ratios, pivot_swap,


pivot_plot, pure_gomory, mixed_gomory, dual_plot
[3]: A = sym.Matrix(([1, 2, 1, 0, 0, 0],
[3, 1, 0, 1, 0, 0],
[sym.Rational(3,2), sym.Rational(3,2), 0, 0, 1, 0],
[0, 1, 0, 0, 0, 1]))
m = A.shape[0]
n = A.shape[1]
c = sym.Matrix([6, 7, -2, 0, 4, sym.Rational(9,2)])
b = sym.Matrix([7, 9, 6, sym.Rational(33,10)])
beta = [0,1,3,5]
eta = list(set(list(range(n)))-set(beta))
A_beta = copy.copy(A[:,beta])
A_eta = copy.copy(A[:,eta])
c_beta = copy.copy(c[beta,0])
c_eta = copy.copy(c[eta,0])
Perturb=False ### do NOT change this!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! You
,→can perturb later

[4]: A
[4]: [ 1    2    1   0   0   0 ]
     [ 3    1    0   1   0   0 ]
     [ 3/2  3/2  0   0   1   0 ]
     [ 0    1    0   0   0   1 ]
[5]: c
[5]: [ 6  7  −2  0  4  9/2 ]′

[6]: #pivot_perturb() # uncomment to perturb the right-hand side

[7]: b
[7]: [ 7  9  6  33/10 ]′

[8]: beta
[8]:
[0, 1, 3, 5]
[9]: eta
[9]:
[2, 4]
[10]: A_beta
[10]: [ 1    2    0  0 ]
      [ 3    1    1  0 ]
      [ 3/2  3/2  0  0 ]
      [ 0    1    0  1 ]

[11]: A_eta
[11]: [ 1  0 ]
      [ 0  0 ]
      [ 0  1 ]
      [ 0  0 ]

[12]: pivot_algebra()

pivot_algebra() done

[13]: sym.N(objval)
[13]:
28.35
[14]: xbar_beta
[14]: [ 1  3  3  3/10 ]′

[15]: cbar_eta
[15]: [ 3/2  −7/3 ]′

[16]: pivot_ratios(1)
[ 3/4  ∞  ∞  9/20 ]′
x̄ + λz̄ :
[ 1 − 4λ/3,  3 + 2λ/3,  0,  3 + 10λ/3,  λ,  3/10 − 2λ/3 ]′
[17]: c.dot(zbar) # agrees with cbar_eta(1)
[17]: −7/3
[18]: pivot_plot()

[19]: pivot_swap(1,3)

swap accepted - new partition:


eta: [2, 5]
beta: [0, 1, 3, 4]
*** MUST APPLY pivot_algebra()! ***
[20]: pivot_algebra()

pivot_algebra() done

[21]: sym.N(objval)
[21]:
27.3
[22]: xbar_beta
[22]: [ 2/5  33/10  9/2  9/20 ]′

[23]: cbar_eta
[23]: [ −2  7/2 ]′

[24]: pivot_ratios(0)
[ 2/5  ∞  ∞  ∞ ]′
x̄ + λz̄ :
[ 2/5 − λ,  33/10,  λ,  9/2 + 3λ,  9/20 + 3λ/2,  0 ]′

[25]: c.dot(zbar) # agrees with cbar_eta(0)


[25]:
−2
[26]: pivot_plot()
[27]: pivot_swap(0,0)

swap accepted - new partition:


eta: [0, 5]
beta: [2, 1, 3, 4]
*** MUST APPLY pivot_algebra()! ***

[28]: pivot_algebra()

pivot_algebra() done

[29]: sym.N(objval)
[29]:
26.5
[30]: xbar_beta
[30]: [ 2/5  33/10  57/10  21/20 ]′

[31]: cbar_eta
[31]: [ 2  −1/2 ]′

[32]: pivot_ratios(1)
[ ∞  33/10  ∞  ∞ ]′
x̄ + λz̄ :
[ 0,  33/10 − λ,  2/5 + 2λ,  57/10 + λ,  21/20 + 3λ/2,  λ ]′

[33]: c.dot(zbar) # agrees with cbar_eta(1)


[33]: −1/2
[34]: pivot_plot()
[35]: pivot_swap(1,1)

swap accepted - new partition:


eta: [0, 1]
beta: [2, 5, 3, 4]
*** MUST APPLY pivot_algebra()! ***

[36]: pivot_algebra()

pivot_algebra() done

[37]: sym.N(objval)
[37]:
24.85
[38]: xbar_beta
[38]: [ 7  33/10  9  6 ]′

[39]: cbar_eta
[39]: [ 2  1/2 ]′

[40]: pivot_plot()
[41]: xbar
[41]: [ 0  0  7  9  6  33/10 ]′

[42]: objval
[42]: 497/20
[43]: c.dot(xbar) # reality check
[43]: 497/20
[44]: c_beta.dot(xbar_beta) # reality check
[44]: 497/20
[45]: ybar.dot(b) # reality check
[45]: 497/20
[46]: sym.transpose(c)-sym.transpose(ybar)*A # reality check
[46]: [ 2  1/2  0  0  0  0 ]

[47]: b-A*xbar # reality check


[47]: 0
0
 
0
0
165

A.6 pivot_tools.ipynb
pivot_tools

June 25, 2021

Pivot tools for standard form linear-optimization problem P


For standard-form problems

z = min c′x                    (P)
        Ax = b
         x ≥ 0.

Notes: * Can work with an ε-perturbed right-hand side. * β = (β_0, β_1, ..., β_{m−1}) has m entries from
{0, 1, ..., n−1}. * η = (η_0, η_1, ..., η_{n−m−1}) has n − m entries from {0, 1, ..., n−1}. * So, for the
purpose of selecting j (corresponding to η_j entering the basis), we view c̄_η = (c̄_{η_0}, c̄_{η_1}, ..., c̄_{η_{n−m−1}}).
* For pivot_ratios(j): j must be in {0, 1, ..., n−m−1}. The output of pivot_ratios(j) is m numbers,
and they correspond to the basic variables numbered β_0, β_1, ..., β_{m−1}. So, for the purpose of
selecting i (corresponding to β_i leaving the basis), i must be in {0, 1, ..., m−1}. * For pivot_swap(j,i): j
must be in {0, 1, ..., n−m−1} and i must be in {0, 1, ..., m−1}.
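
As a sketch of what pivot_ratios(j) reports, in the notation above: writing Ā_{η_j} := A_β⁻¹ A_{η_j}, the displayed direction z̄ has z̄_{η_j} = 1, z̄_{β_i} = −(Ā_{η_j})_i for i = 0, 1, ..., m−1, and zeros in the other nonbasic positions; the i-th ratio is x̄_{β_i} / (Ā_{η_j})_i when (Ā_{η_j})_i > 0 and ∞ otherwise, so the least ratio is the largest step λ for which x̄ + λz̄ stays nonnegative.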
References: * Jon Lee, “A First Course in Linear Optimization”, Fourth Edition (Version 4.0), Reex
Press, 2013-20.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[ ]: import numpy as np
import sympy as sym
sym.init_printing()
import copy
import operator
eps = sym.symbols('epsilon')
lam = sym.symbols('lambda')
xlz = sym.Symbol('\\bar{x}+\\lambda \\bar{z} :')
from IPython.display import Latex, Math
#########################################################################
### CHOOSE A BACKEND --- IF YOU SWITCH, RESTART THE KERNEL
# evaluates faster than 'notebook':
%matplotlib inline
# evaluates slower than 'inline' (gives interactive plots, though delayed when
,→running all cells):

#%matplotlib notebook
#########################################################################
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
from matplotlib.ticker import AutoMinorLocator, MultipleLocator
from scipy.spatial import ConvexHull, convex_hull_plot_2d
import itertools
import seaborn as sns; sns.set(); sns.set_style("whitegrid"); color_list = sns.
,→color_palette("muted")

[ ]: # perturb
def pivot_perturb():
global m, b, Perturb, eps
Perturb = True
for i in range(m):
for j in range(m):
b[i] += A_beta[i,j]*eps**(j+1)
print('pivot_perturb() done')

[ ]: # algebra
def pivot_algebra():
global m, n, objval, xbar, xbar_beta, xbar_eta, ybar, cbar_eta, ratios
xbar_beta = A_beta.solve(b)
xbar_eta = sym.zeros(n-m,1)
objval = c_beta.dot(xbar_beta)
xbar = sym.zeros(n,1)
for i in range(m): xbar[beta[i]]=xbar_beta[i]
for j in range(n-m): xbar[eta[j]]=xbar_eta[j]
ybar = A_beta.transpose().solve(c[beta,0])
#cbar_eta = c_eta.transpose()- ybar.transpose()*A_eta
cbar_eta = c_eta- A_eta.transpose()*ybar
ratios=sym.oo*sym.ones(m,1)
print('pivot_algebra() done')

[ ]: # numerical version of a d-by-1 array


def N(parray):
for i in range(parray.shape[0]): display(sym.N(parray[i]))

[ ]: # ratios (and direction) for a given nonbasic index eta_j


def pivot_ratios(j):
global ratios, zbar
if j>n-m-1:
display(Latex("error: $j$ is out of range."))
else:
A_etaj=copy.copy(A[:,eta[j]])
Abar_etaj = A_beta.solve(A_etaj)
for i in range(m):
if Abar_etaj[i] > 0:
ratios[i] = xbar_beta[i] / Abar_etaj[i]
else:
ratios[i] = sym.oo
display(ratios)
zbar=sym.zeros(n,1)
for i in range(m): zbar[beta[i]] = -Abar_etaj[i]
zbar[eta[j]] = 1
display(xlz,xbar+lam*zbar)

[ ]: # swap nonbasic eta_j in and basic beta_i out


def pivot_swap(j,i):
global A_beta, A_eta, c_beta, c_eta
if i>m-1 or j>n-m-1:
display(Latex("error: $j$ or $i$ is out of range. swap not accepted"))
else:
save = copy.copy(beta[i])
beta[i] = copy.copy(eta[j])
eta[j] = save
A_beta = copy.copy(A[:,beta])
A_eta = copy.copy(A[:,eta])
c_beta = copy.copy(c[beta,0])
c_eta = copy.copy(c[eta,0])
display(Latex("swap accepted --- new partition:"))
print('eta:',eta)
print('beta:',beta)
print('*** MUST APPLY pivot_algebra()! ***')

[ ]: # plot
def pivot_plot():
if n-m != 2 or Perturb == True:
display(Latex("Hey friend --- give me a break!"))
display(Latex("This plotting only works if there are $n-m=2$ nonbasic
,→variables and no rhs perturbation"))
return
A_beta_inv = A_beta.inv()
Abar_eta = A_beta_inv*A_eta
M = sym.zeros(n,n-m)
M[0:m,:] = Abar_eta
M[m:n,:] = -sym.eye(n-m)
h = sym.zeros(n,1)
h[0:m,0] = xbar_beta
feaspoints=np.empty((0,2))
infeaspoints=np.empty((0,2))
bbar=sym.zeros(2,1)
M2=sym.zeros(2,2)
for i in range(n-1):
for j in range(i+1,n):
bbar[0]=h[i]
bbar[1]=h[j]
M2[0,:]=M[i,:]
M2[1,:]=M[j,:]
if abs(sym.det(M2)) >0.0001:
xy = M2.solve(bbar)
if min(h - M*xy) >= -0.00001:
feaspoints=np.r_[feaspoints,np.transpose(xy)]
else:
infeaspoints=np.r_[infeaspoints,np.transpose(xy)]
hull = ConvexHull(feaspoints)
fig, ax = plt.subplots(figsize=(8,8))
ax.set(xlabel=r"$x_{}$".format(eta[0]), ylabel=r"$x_{}$".format(eta[1]))
ax.spines['left'].set_position(('data',0.0))
ax.spines['bottom'].set_position(('data',0.0))
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.xaxis.set_ticks_position('bottom')
ax.yaxis.set_ticks_position('left')
plt.xlim(float(min(cbar_eta[0],min(feaspoints[:,0])))-1.25,
,→float(max(feaspoints[:,0]))+0.25)

plt.ylim(float(min(cbar_eta[1],min(feaspoints[:,1])))-0.25,
,→float(max(feaspoints[:,1]))+0.25)

plt.fill(feaspoints[hull.vertices,0], feaspoints[hull.vertices,1], 'cyan',


,→alpha=0.3)

x = np.linspace(float(min(feaspoints[:,0]))-0.5,float(max(feaspoints[:
,→,0]))+0.5,100)

for i in range(m):
if Abar_eta[i,1] != 0:
y = (xbar_beta[i] - Abar_eta[i,0]*x) / Abar_eta[i,1]
plt.plot(x, y, linewidth=3, label=r"$x_{}$".format(beta[i]))
else:
plt.vlines(float(xbar_beta[i]/ Abar_eta[i,0]),
,→float(min(cbar_eta[1],min(feaspoints[:,1]))),

float(max(feaspoints[:,0])), label=r"$x_{}$".
,→format(beta[i]))

for simplex in hull.simplices:


plt.fill(feaspoints[simplex, 0], feaspoints[simplex, 1], 'cyan',
,→alpha=0.5)

arrow=plt.arrow(0,0, float(cbar_eta[0]),float(cbar_eta[1]), color='magenta',


,→width = 0.02, head_width = 0.1, label=r"$\bar{c}_\eta$")

ax.scatter(feaspoints[:,0], feaspoints[:,1], color='green',zorder=8)


ax.scatter(infeaspoints[:,0], infeaspoints[:,1], color='red',zorder=7)
plt.legend(loc="upper left",title="slacks")
plt.title(r"In the space of the non-basic variables",size=18)
#ax.grid()
plt.show()

Gomory cutting-plane tool for dual-form pure-integer problem DI


For dual-form pure-integer problem
max y′b                        (DI)
    y′A ≤ c′
    y ∈ Z^m.

Notes: * A and c MUST be integer * The variables are y0 , y1 , . . . , ym−1 , so valid input arguments
for pure_gomory(i) are i ∈ {0, 1, . . . , m − 1}.
Reference: * Qi He, Jon Lee. Another pedagogy for pure-integer Gomory. RAIRO – Operations
Research, 51:189–197, 2017.
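
As a sketch of what pure_gomory(i) constructs, following the reference above and the code below: with e_i the i-th standard unit vector and h^i := A_β⁻¹ e_i (the i-th column of the basis inverse), it takes r := −⌊h^i⌋ (componentwise floor) and b̃ := e_i + A_β r, and then appends b̃ as a new column of A with cost ⌊ȳ′b̃⌋; that is, it adds the Gomory cut y′b̃ ≤ ⌊ȳ′b̃⌋ to the dual-form system.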
[ ]: # pure gomory cut
def pure_gomory(i):
global A, c, A_beta, A_eta, c_eta, cbar_eta, m, n, beta, eta
if i>m-1:
display(Latex("error: $i$ is out of range."))
else:
ei = sym.zeros(m,1)
ei[i]=1 # ei is the i-th standard unit column
hi = A_beta.solve(ei) # i-th column of basis inverse
#r = -sym.floor(hi) # best choice of r
r = -(hi.applyfunc(sym.floor))
btilde = ei + A_beta*r # new column for P
A = A.row_join(btilde)
c = c.col_join(sym.Matrix(([sym.floor(ybar.dot(btilde))])))
eta.insert(n-m,n)
n += 1
A_eta = copy.copy(A[:,eta])
c_eta = copy.copy(c[eta,0])
cbar_eta = c_eta - A_eta.transpose()*ybar
print('*** PROBABLY WANT TO APPLY pivot_algebra()! ***')

[ ]: # dual plot
def dual_plot(delta=None,center=None):
if delta==None: delta=2
if center==None: center=ybar
if m != 2 or Perturb == True:
display(Latex("Hey friend --- give me a break!"))
display(Latex("This plotting only works if there are $m=2$ dual
,→variables and no rhs perturbation"))

return
M=sym.transpose(A)
feaspoints=np.empty((0,2))
infeaspoints=np.empty((0,2))
c2=sym.zeros(2,1)
M2=sym.zeros(2,2)
for i in range(n-1):
for j in range(i+1,n):
c2[0]=c[i]
c2[1]=c[j]
M2[0,:]=M[i,:]
M2[1,:]=M[j,:]
if abs(sym.det(M2)) > 0.0001:
y0y1 = M2.solve(c2)
if min(c - M*y0y1) >= -0.00001:
feaspoints=np.r_[feaspoints,np.transpose(y0y1)]
else:
infeaspoints=np.r_[infeaspoints,np.transpose(y0y1)]
hull = ConvexHull(feaspoints)
fig, ax = plt.subplots(figsize=(8,8))
ax.xaxis.set_label_coords(1.05, 0.49)
ax.yaxis.set_label_coords(0.5, 1.05)
ax.set(xlabel=r"$y_{}$".format(0), ylabel=r"$y_{}$".format(1))
ax.spines['left'].set_position(('data',ybar[0]))
ax.spines['bottom'].set_position(('data',ybar[1]))
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
# set major ticks to show every 1 (integer)
ax.xaxis.set_major_locator(MultipleLocator(1))
ax.yaxis.set_major_locator(MultipleLocator(1))
ax.xaxis.set_ticks_position('bottom')
ax.yaxis.set_ticks_position('left')
plt.xlim(float(center[0])-delta,float(center[0])+delta)
plt.ylim(float(center[1])-delta,float(center[1])+delta)
plt.fill(feaspoints[hull.vertices,0], feaspoints[hull.vertices,1], 'cyan',
,→alpha=0.3)
y1 = np.linspace(float(min(feaspoints[:,0]))-0.5,float(max(feaspoints[:
,→,0]))+0.5,100)

for j in range(n):
if M[j,1] != 0:
y2 = (c[j] - M[j,0]*y1) / M[j,1]
plt.plot(y1, y2, linewidth=2, label=r"constraint ${}$".format(j))
else:
plt.vlines(float(c[j]/ M[j,0]), float(center[1])-delta,
float(center[1])+delta, linewidth=2, label=r"constraint
,→${}$".format(j))

for simplex in hull.simplices:


plt.fill(feaspoints[simplex, 0], feaspoints[simplex, 1], 'cyan',
,→alpha=0.5)

arrow=plt.arrow(float(ybar[0]),float(ybar[1]),0.5*float(b[0]/(b.dot(b))**0.
,→5),0.5*float(b[1]/(b.dot(b))**0.5), color='magenta', width = 0.01*delta,

,→head_width = 0.02*delta, label=r"$b$")

ax.scatter(feaspoints[:,0], feaspoints[:,1], color='green',zorder=8)


ax.scatter(infeaspoints[:,0], infeaspoints[:,1], color='red',zorder=7)
# the integer grid
xp = np.arange(np.floor(float(center[0])-delta)-1, np.
,→ceil(float(center[0])+delta)+2)

yp = np.arange(np.floor(float(center[1])-delta)-1, np.
,→ceil(float(center[1])+delta)+2)

pp = itertools.product(xp, yp)
plt.scatter(*zip(*pp), marker='o', s=5, color='black',zorder=9)
# sorting plot legend entries by label
handles, labels = ax.get_legend_handles_labels()
hl = sorted(zip(handles, labels), key=operator.itemgetter(1))
handles2, labels2 = zip(*hl)
ax.legend(handles2, labels2, loc="lower left",title="constraints")
ax.grid(which='major')
plt.show()

[ ]: print('pivot_tools loaded: pivot_perturb, pivot_algebra, N, pivot_ratios,


,→pivot_swap, pivot_plot, pure_gomory, mixed_gomory, dual_plot')

A.7 Circle.ipynb
Circle

June 25, 2021

Hoffman’s circle
Reference: * Jon Lee. Hoffman’s circle untangled. SIAM Review, 39(1):98-105, 1997.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: import numpy as np
#%matplotlib notebook
%matplotlib inline
import mpl_toolkits.mplot3d as a3
import matplotlib.colors as colors
import pylab as pl
t = 2 * np.pi/5
c = np.cos(t)
s = np.sin(t)
M = np.array([[c, -s, 0], [s, c, 0], [(c - 1)/c, s/c, 1]])
x = np.array((1,0,0))
y = np.array((0, 0.5*np.tan(t/2), 0))
T=np.row_stack((x,y, M.dot(x), M.dot(y), M.dot(M.dot(x)), M.dot(M.dot(y)),
M.dot(M.dot(M.dot(x))), M.dot(M.dot(M.dot(y))),
M.dot(M.dot(M.dot(M.dot(x)))), M.dot(M.dot(M.dot(M.dot(y)))), x))
ax = a3.Axes3D(pl.figure(figsize=(5,8)),azim=42,elev=15)
for i in range(10):
vtx = np.row_stack(([0,0,0],T[i],T[i+1]))
tri = a3.art3d.Poly3DCollection([vtx])
tri.set_color(colors.rgb2hex(np.random.rand(3)))
tri.set_edgecolor('k')
ax.add_collection3d(tri)
ax.set_xlim3d(-1,1)
ax.set_ylim3d(-1,1)
ax.set_zlim3d(-3,4)
pl.show()

A.8 Decomp.ipynb
Decomp

June 25, 2021

Decomposition Algorithm with Python/Gurobi


Apply the (Dantzig-Wolfe) Decomposition Algorithm to:

z = min c′x                    (Q)
        Ex ≥ h
        Ax = b
         x ≥ 0,

treating Ex ≥ h as the “complicating constraints”.


Notes: * In this implementation, we never delete generated columns.
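
For orientation, here is a sketch of the restricted Main (master) problem that the code below assembles, writing x^k for the basic feasible solutions and z^l for the basic feasible rays of {x ≥ 0 : Ax = b} generated so far (this notation is ours, not the code's):

min  ∑_k (c′x^k) λ_k + ∑_l (c′z^l) μ_l
     ∑_k (E x^k) λ_k + ∑_l (E z^l) μ_l − s = h
     ∑_k λ_k = 1
     λ, μ, s ≥ 0.

The Subproblem then minimizes (c′ − ȳ′E)x over {x ≥ 0 : Ax = b}, where ȳ collects the dual values on the first block of constraints; an optimal basic solution or ray whose reduced cost (relative to σ, the dual value on the convexity constraint) is negative is appended as a new column.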
References: * Jon Lee, “A First Course in Linear Optimization”, Fourth Edition (Version 4.0), Reex
Press, 2013-20.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f
import numpy as np
#%matplotlib notebook
%matplotlib inline
import matplotlib.pyplot as plt
import gurobipy as gp
from gurobipy import GRB

class StopExecution(Exception):
def _render_traceback_(self):
pass

[2]: MAXIT = 500


# generate a random example
n = 100 # number of variables
m1 = 200 # number of equations to relax
m2 = 50 # number of equations to keep
np.random.seed(25) # change the seed for a different example
E=0.01*np.random.randint(-5,high=5,size=(m1,n)).astype(float) #np.random.
,→randn(m1,nt)

A=0.01*np.random.randint(-2,high=3,size=(m2,n)).astype(float) #np.random.
,→randn(m2,nt)

# choose the right-hand sides so that Q will be feasible


xfeas=0.1*np.random.randint(0,high=5,size=n).astype(float)
h=E.dot(xfeas) - 0.1*np.random.randint(0,high=10,size=m1).astype(float)
b=A.dot(xfeas)

# choose the objective function so that the dual of Q will be feasible


yfeas=0.1*np.random.randint(0,high=5,size=m1).astype(float)
pifeas=0.1*np.random.randint(-5,high=5,size=m2).astype(float)
c=np.transpose(E)@yfeas + np.transpose(A)@pifeas + 0.1*np.random.
,→randint(0,high=1,size=n).astype(float)

[3]: print("***** Solve as one big LP --- for comparison purposes")


modelQ = gp.Model()
modelQ.reset()
xQ = modelQ.addMVar(n)
objective = modelQ.setObjective(c@xQ, GRB.MINIMIZE)
constraintsQ1 = modelQ.addConstr(E@xQ >= h)
constraintsQ2 = modelQ.addConstr(A@xQ == b)
modelQ.optimize()
if modelQ.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", modelQ.status)
print("***** This is a problem. Stopping execution.")
raise StopExecution
print(" ")
print("***** Proceed to Decomposition")

# initialization for Decomposition


results1=[]
results2=[]
ITER=0
xgen=0
zgen=0
y=np.zeros(m1)

# set up the Subproblem model and get one basic feasible solution
modelS = gp.Model()
#modelS.setParam('OutputFlag', 0) # quiet the Gurobi output
x = modelS.addMVar(n)
constraintsS = modelS.addConstr(A@x == b)
#modelS.setObjective(c@x, GRB.MINIMIZE)
modelS.optimize()
if modelS.status != GRB.Status.OPTIMAL:
print("***** Gurobi (initial) Subproblem solve status:", modelS.status)
print("***** This is a problem. Stopping execution.")
raise StopExecution
xgen += 1

# construct a basis
XZ=np.reshape(x.X,(n,1))
#Z=np.r_[np.zeros((n-m1,m1)),np.eye(m1)]
#Z=np.empty((n,0), dtype=float)
h1=np.r_[h,(1)]
#B=np.c_[np.r_[np.eye(m1),np.zeros((1,m1))],np.r_[[email protected],(1)]]
B=np.c_[-np.r_[np.eye(m1),np.zeros((1,m1))],np.r_[[email protected],(1)]]

# set up the Main Phase-2 model


modelM2 = gp.Model()
s = modelM2.addMVar(m1+1)
modelM2.setObjective([email protected]*s[m1], GRB.MINIMIZE)
modelM2.addConstrs((-s[i] + E[i,:]@x.X*s[m1] == h[i] for i in range(m1)))
modelM2.addConstr(s[m1]==1)
modelM2.update()
constraintsM2=modelM2.getConstrs()

# Identify if the constructed basis is feasible to see if Phase 1 is needed


if min(np.linalg.solve(B, h1)) >= -1e-10:
print('***** Phase I not needed')
Phase=2
modelM=modelM2
else:
print('***** Phase I needed')
Phase=1
ITERphaseI=1
modelM1=modelM2.copy()
#modelM1.setParam('OutputFlag', 0) # quiet the Gurobi output
constraintsM1=modelM1.getConstrs()
# create the artificial variable
newcol=gp.Column(-np.r_[[email protected](m1),(1)],constraintsM1)
modelM1.setObjective(0.0, GRB.MINIMIZE)
modelM1.addVar(obj=1.0, column=newcol, name='artificial')
modelM=modelM1

while True:
ITER += 1
print(" ")
print("***** Currently in Phase", Phase, ". Iteration number", ITER)
print("***** Solving Main LP...")
modelM.optimize()
if modelM.status != GRB.Status.OPTIMAL:
print("***** Gurobi Main solve status:", modelM.status)
print("***** This is a problem. Stopping execution.")
raise StopExecution
results1=np.append(results1,ITER-1)
results2=np.append(results2,modelM.Objval)
if Phase==1 and modelM.Objval < 0.0000001:
print("***** Phase I succeeded")
print("LP iter", " LP val")
print("--------- ---------")
for j in range(ITER):
print(np.int(results1[j]), " ", np.round(results2[j],9))
fig, ax = plt.subplots(figsize=(10,10))
ax.plot(results1[0:ITER], results2[0:ITER])
ax.set(xlabel='iteration', ylabel='LP objective value')
ax.set_xticks(ticks=results1, minor=False)
ax.grid()
plt.show()
ITERphaseI=ITER
Phase=2
# switch to the Phase II model
modelM=modelM2
modelM.optimize()
# overwrite last iteration result with phase-II objective value
results2[ITER-1]=modelM.Objval
if ITER == MAXIT: break

constraintsM=modelM.getConstrs()
for i in range(m1):
y[i]=constraintsM[i].Pi
sigma=constraintsM[m1].Pi
if Phase==1: modelS.setObjective((-y.dot(E))@x, GRB.MINIMIZE)
else: modelS.setObjective((c-y.dot(E))@x, GRB.MINIMIZE)
print(" ")
print("***** Solving Subproblem LP...")
modelS.optimize()
if modelS.status != GRB.Status.OPTIMAL and modelS.status != GRB.Status.
,→UNBOUNDED:

print("***** Gurobi Subproblem solve status:", modelS.status)


print("***** This is a problem. Stopping execution.")
raise StopExecution
if modelS.status == GRB.Status.OPTIMAL:
print("***** Gurobi Subproblem solve status:", modelS.status)
reducedcost = -sigma + modelS.Objval
print("***** sigma=",sigma)
print("***** reduced cost=",reducedcost)
if reducedcost < -0.0001:
xnew=x.X
if Phase==1:
newcol=gp.Column(np.r_[E@xnew,(1)],constraintsM1)
modelM1.addVar(obj=0.0, column=newcol)
newcol=gp.Column(np.r_[E@xnew,(1)],constraintsM2)
modelM2.addVar(obj=c@xnew, column=newcol)
XZ=np.c_[XZ,xnew]
xgen += 1
else:
if Phase==1:
print("***** No more improving columns for Main")
print("***** Phase I finished without a feasible solution")
print("***** Phase I objective", modelM.Objval)
break
else: # Phase 2
print("***** No more improving columns for Main")
print("***** Phase II finished")
print("***** Phase II objective", modelM.Objval)
break
if modelS.status == GRB.Status.UNBOUNDED:
print("***** Gurobi Subproblem solve status:", modelS.status)
znew=x.UnbdRay
if Phase==1:
newcol=gp.Column(np.r_[E@znew,(0)],constraintsM1)
modelM1.addVar(obj=0.0, column=newcol)
reducedcost = -y.dot(E)@znew
newcol=gp.Column(np.r_[E@znew,(0)],constraintsM2)
modelM2.addVar(obj=c@znew, column=newcol)
if Phase==2:
reducedcost = (c-y.dot(E))@znew
print("***** reduced cost=", reducedcost)
#if reducedcost > 0.0001: input()
XZ=np.c_[XZ,znew]
zgen += 1

print("LP iter", " LP val")


print("--------- ---------")
for j in range(ITERphaseI-1,ITER):
print(np.int(results1[j]), " ", np.round(results2[j],9))
# recover the solution in the original variables x
greekvar=modelM2.getVars()[m1:ITER+m1]
greekval=np.zeros(ITER)
for i in range(ITER):
greekval[i] = greekvar[i].X
xhat=XZ@greekval
print("***** Reality check: recover the optimal x found by decomposition.")
print("***** Its objective value is:", np.round(c@xhat,9))
print(" ")
print("***** Compare with LP value calculated without decomposition:",np.
,→round(modelQ.Objval,9))

if ITER > ITERphaseI:


fig, ax = plt.subplots(figsize=(10,10))
ax.plot(results1[ITERphaseI-1:ITER], results2[ITERphaseI-1:ITER])
ax.plot(results1[ITERphaseI-1:ITER], modelQ.Objval*np.
,→ones(ITER-ITERphaseI+1))

ax.set(xlabel='iteration', ylabel='LP objective value')


ax.set_xticks(ticks=results1[ITERphaseI-1:ITER], minor=True)
ax.grid()
plt.show()
print(" ")
print("***** Number of basic-feasible solutions generated:", xgen)
print(" ")
print("***** Number of basic-feasible rays generated:", zgen)

***** Solve as one big LP --- for comparison purposes

--------------------------------------------
Warning: your license will expire in 3 days
--------------------------------------------

Using license file C:\Users\jonxlee\gurobi.lic


Academic license - for non-commercial use only - expires 2021-06-28
Discarded solution information
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 250 rows, 100 columns and 21957 nonzeros
Model fingerprint: 0xd5eae979
Coefficient statistics:
Matrix range [1e-02, 5e-02]
Objective range [2e-02, 5e-01]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 1e+00]
Presolve time: 0.01s
Presolved: 250 rows, 100 columns, 21957 nonzeros

Iteration Objective Primal Inf. Dual Inf. Time


0 -4.2122000e+31 1.799360e+33 4.212200e+01 0s
211 -5.6119344e+00 0.000000e+00 0.000000e+00 0s

Solved in 211 iterations and 0.04 seconds


Optimal objective -5.611934358e+00

***** Proceed to Decomposition


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Model fingerprint: 0x3f5107c3
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [0e+00, 0e+00]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Presolve time: 0.01s
Presolved: 50 rows, 100 columns, 3992 nonzeros

Iteration Objective Primal Inf. Dual Inf. Time


0 0.0000000e+00 1.271200e+01 0.000000e+00 0s
74 0.0000000e+00 0.000000e+00 0.000000e+00 0s

Solved in 74 iterations and 0.02 seconds


Optimal objective 0.000000000e+00
***** Phase I needed

***** Currently in Phase 1 . Iteration number 1


***** Solving Main LP...
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 202 columns and 602 nonzeros
Model fingerprint: 0x29dd1f2e
Coefficient statistics:
Matrix range [2e-04, 1e+00]
Objective range [1e+00, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Presolve removed 201 rows and 202 columns
Presolve time: 0.00s
Presolve: All rows and columns removed
Iteration Objective Primal Inf. Dual Inf. Time
0 1.1729795e-02 0.000000e+00 0.000000e+00 0s

Solved in 0 iterations and 0.01 seconds


Optimal objective 1.172979517e-02

***** Solving Subproblem LP...


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [1e-02, 5e-02]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Iteration Objective Primal Inf. Dual Inf. Time
0 -5.6353739e+30 5.468380e+31 5.635374e+00 0s

Solved in 112 iterations and 0.02 seconds


Unbounded model
***** Gurobi Subproblem solve status: 5
***** reduced cost= -2.901800799665117

***** Currently in Phase 1 . Iteration number 2


***** Solving Main LP...
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 203 columns and 802 nonzeros
Coefficient statistics:
Matrix range [2e-04, 2e+01]
Objective range [1e+00, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Iteration Objective Primal Inf. Dual Inf. Time
0 -3.6272510e+29 1.241503e+32 3.627251e-01 0s
4 9.9780072e-03 0.000000e+00 0.000000e+00 0s

Solved in 4 iterations and 0.01 seconds


Optimal objective 9.978007225e-03

***** Solving Subproblem LP...


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [5e-04, 5e-02]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Iteration Objective Primal Inf. Dual Inf. Time
0 -7.8114845e+30 0.000000e+00 3.124594e+01 0s

Solved in 0 iterations and 0.01 seconds


Unbounded model
***** Gurobi Subproblem solve status: 5
***** reduced cost= -7.811484455526056

***** Currently in Phase 1 . Iteration number 3


***** Solving Main LP...
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 204 columns and 1002 nonzeros
Coefficient statistics:
Matrix range [2e-04, 4e+03]
Objective range [1e+00, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Iteration Objective Primal Inf. Dual Inf. Time
0 -6.1027222e+28 3.537703e+30 6.102722e-02 0s
1 9.9592529e-03 0.000000e+00 0.000000e+00 0s

Solved in 1 iterations and 0.01 seconds


Optimal objective 9.959252945e-03

***** Solving Subproblem LP...


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [4e-04, 5e-02]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Iteration Objective Primal Inf. Dual Inf. Time
0 -1.5688147e+30 0.000000e+00 6.275259e+00 0s

Solved in 0 iterations and 0.01 seconds


Unbounded model
***** Gurobi Subproblem solve status: 5
***** reduced cost= -1.5688147347327117

***** Currently in Phase 1 . Iteration number 4


***** Solving Main LP...
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 205 columns and 1202 nonzeros
Coefficient statistics:
Matrix range [2e-04, 4e+03]
Objective range [1e+00, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Iteration Objective Primal Inf. Dual Inf. Time
0 -2.4512730e+28 1.720162e+30 2.451273e-02 0s
1 9.9527969e-03 0.000000e+00 0.000000e+00 0s

Solved in 1 iterations and 0.01 seconds


Optimal objective 9.952796903e-03

***** Solving Subproblem LP...


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [3e-04, 5e-02]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Iteration Objective Primal Inf. Dual Inf. Time
0 -1.8823585e+30 5.931674e+34 7.529434e+00 0s

Solved in 77 iterations and 0.02 seconds


Unbounded model
***** Gurobi Subproblem solve status: 5
***** reduced cost= -0.1282899745878549

.
.
.
.
.
.

***** Currently in Phase 1 . Iteration number 30


***** Solving Main LP...
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 231 columns and 6413 nonzeros
Coefficient statistics:
Matrix range [9e-05, 4e+03]
Objective range [1e+00, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Iteration Objective Primal Inf. Dual Inf. Time
0 -2.0978320e+29 3.303289e+30 2.097832e-01 0s
14 0.0000000e+00 0.000000e+00 0.000000e+00 0s

Solved in 14 iterations and 0.01 seconds


Optimal objective 0.000000000e+00
***** Phase I succeeded
LP iter LP val
--------- ---------
0 0.011729795
1 0.009978007
2 0.009959253
3 0.009952797
4 0.009540816
5 0.009489317
6 0.009481518
7 0.009476081
8 0.009465465
9 0.009463658
10 0.009004181
11 0.008995995
12 0.008995994
13 0.008988601
14 0.007936928
15 0.007613552
16 0.006527189
17 0.005966359
18 0.005907057
19 0.005898025
20 0.005897971
21 0.00589796
22 0.005852746
23 0.005221517
24 0.000209534
25 0.00019799
26 0.000195965
27 0.000166034
28 5.2869e-05
29 0.0
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 230 columns and 6212 nonzeros
Model fingerprint: 0x5fffefb2
Coefficient statistics:
Matrix range [9e-05, 4e+03]
Objective range [3e+00, 4e+04]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Presolve removed 55 rows and 200 columns
Presolve time: 0.01s
Presolved: 146 rows, 30 columns, 4362 nonzeros

Iteration Objective Primal Inf. Dual Inf. Time


0 -5.5652233e+02 5.013465e+02 0.000000e+00 0s
12 -3.3379333e+00 0.000000e+00 0.000000e+00 0s

Solved in 12 iterations and 0.02 seconds


Optimal objective -3.337933307e+00

***** Solving Subproblem LP...


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [3e-03, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Iteration Objective Primal Inf. Dual Inf. Time
0 -2.0980458e+32 2.924022e+32 2.098046e+02 0s

Solved in 90 iterations and 0.03 seconds


Unbounded model
***** Gurobi Subproblem solve status: 5
***** reduced cost= -1800.0217056372962

***** Currently in Phase 2 . Iteration number 31


***** Solving Main LP...
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 231 columns and 6412 nonzeros
Coefficient statistics:
Matrix range [9e-05, 4e+03]
Objective range [3e+00, 4e+04]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Iteration Objective Primal Inf. Dual Inf. Time
0 -1.1250136e+32 5.581086e+31 1.125014e+02 0s
8 -3.3480733e+00 0.000000e+00 0.000000e+00 0s

Solved in 8 iterations and 0.01 seconds


Optimal objective -3.348073342e+00

***** Solving Subproblem LP...


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [2e-02, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Iteration Objective Primal Inf. Dual Inf. Time
0 -2.7250898e+32 2.809026e+35 2.725090e+02 0s

Solved in 71 iterations and 0.02 seconds


Unbounded model
***** Gurobi Subproblem solve status: 5
***** reduced cost= -246.04454754170908

***** Currently in Phase 2 . Iteration number 32


***** Solving Main LP...
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 232 columns and 6612 nonzeros
Coefficient statistics:
Matrix range [9e-05, 4e+03]
Objective range [3e+00, 4e+04]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Iteration Objective Primal Inf. Dual Inf. Time
0 -6.1511137e+31 3.580951e+31 6.151114e+01 0s
8 -3.3880381e+00 0.000000e+00 0.000000e+00 0s

Solved in 8 iterations and 0.01 seconds


Optimal objective -3.388038102e+00

***** Solving Subproblem LP...


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [2e-03, 2e+00]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Iteration Objective Primal Inf. Dual Inf. Time
0 -1.0613914e+32 0.000000e+00 1.061391e+02 0s

Solved in 0 iterations and 0.01 seconds


Unbounded model
***** Gurobi Subproblem solve status: 5
***** reduced cost= -106.13913722403913

***** Currently in Phase 2 . Iteration number 33


***** Solving Main LP...
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 233 columns and 6812 nonzeros
Coefficient statistics:
Matrix range [9e-05, 4e+03]
Objective range [3e+00, 4e+04]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Iteration Objective Primal Inf. Dual Inf. Time
0 -1.6584240e+30 2.503175e+30 1.658424e+00 0s
1 -3.3883900e+00 0.000000e+00 0.000000e+00 0s

Solved in 1 iterations and 0.01 seconds


Optimal objective -3.388390009e+00

***** Solving Subproblem LP...


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [4e-03, 2e+00]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Iteration Objective Primal Inf. Dual Inf. Time
0 -7.6231335e+31 6.759435e+34 7.623134e+01 0s

Solved in 53 iterations and 0.01 seconds


Unbounded model
***** Gurobi Subproblem solve status: 5
***** reduced cost= -3.3419183360137676

.
.
.
.
.
.

***** Currently in Phase 2 . Iteration number 402


***** Solving Main LP...
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 201 rows, 602 columns and 80835 nonzeros
Coefficient statistics:
Matrix range [1e-06, 9e+04]
Objective range [3e+00, 9e+05]
Bounds range [0e+00, 0e+00]
RHS range [2e-03, 1e+00]
Iteration Objective Primal Inf. Dual Inf. Time
0 -4.5864905e+27 2.680439e+30 4.586490e-03 0s
12 -5.6119344e+00 0.000000e+00 0.000000e+00 0s
Solved in 12 iterations and 0.02 seconds
Optimal objective -5.611934358e+00

***** Solving Subproblem LP...


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 50 rows, 100 columns and 3992 nonzeros
Coefficient statistics:
Matrix range [1e-02, 2e-02]
Objective range [2e-03, 6e-01]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 8e-02]
Iteration Objective Primal Inf. Dual Inf. Time
0 -1.0996308e+00 0.000000e+00 0.000000e+00 0s

Solved in 0 iterations and 0.01 seconds


Optimal objective -1.099630761e+00
***** Gurobi Subproblem solve status: 2
***** sigma= -1.0996307611968144
***** reduced cost= -2.886579864025407e-15
***** No more improving columns for Main
***** Phase II finished
***** Phase II objective -5.611934358015313
LP iter LP val
--------- ---------
29 -3.337933307
30 -3.348073342
31 -3.388038102
32 -3.388390009
33 -3.389532342
34 -3.390085393
35 -3.392675548
36 -3.396216294
37 -3.397063091
38 -3.397096763
39 -3.397100656
40 -3.399325683
41 -3.399516444
42 -3.403298253
43 -3.404126621
44 -3.404216698
45 -3.408199788
46 -3.427992646
47 -3.428853489
48 -3.428915332
49 -3.430004579
50 -3.434533448
51 -3.434578091
52 -3.435023422
53 -3.438204684
54 -3.464591303
55 -3.482106764
56 -3.490610142
57 -3.490932305
58 -3.501570718
59 -3.502282783
60 -3.502396016
61 -3.50255097
62 -3.502559503
63 -3.50255961
64 -3.550147616
65 -3.551899073
66 -3.551960331
67 -3.551969951
68 -3.551974273
69 -3.552126815
70 -3.553304142
71 -3.559367309
72 -3.585686828
73 -3.591579287
74 -3.592097794
75 -3.592099263
76 -3.592106236
77 -3.592166783
78 -3.592167111
79 -3.592260832
80 -3.596953678
81 -3.647858835
82 -3.648322636
83 -3.648823441
84 -3.648949993
85 -3.652426765
86 -3.652868367
87 -3.653271957
88 -3.653291285
89 -3.653357576
90 -3.653383423
91 -3.653470503
92 -3.65349083
93 -3.653526157
94 -3.653531168
95 -3.653533577
96 -3.653739817
97 -3.653741749
98 -3.653769089
99 -3.653784884
100 -3.653787571
101 -3.653789803
102 -3.65379341
103 -3.653852793
104 -3.654009584
105 -3.658634444
106 -3.658687275
107 -3.65871201
108 -3.658712204
109 -3.658814114
110 -3.65885407
111 -3.658860826
112 -3.658860967
113 -3.65925888
114 -3.659299173
115 -3.659304181
116 -3.659369008
117 -3.65937317
118 -3.659571116
119 -3.659591636
120 -3.662523932
121 -3.663623479
122 -3.690648603
123 -3.691647171
124 -3.691920471
125 -3.692219149
126 -3.692226784
127 -3.692227667
128 -3.69224252
129 -3.692262591
130 -3.692263321
131 -3.692280862
132 -3.69685685
133 -3.696945712
134 -3.696946859
135 -3.696994546
136 -3.698976225
137 -3.70918171
138 -3.710432304
139 -3.712114723
140 -3.712283978
141 -3.712284316
142 -3.712505779
143 -3.712494631
144 -3.712494764
145 -3.712494922
146 -3.712495897
147 -3.712496908
148 -3.712496927
149 -3.712496993
150 -3.712638233
151 -3.712645067
152 -3.712645222
153 -3.713073043
154 -3.714257511
155 -3.714998487
156 -3.715013073
157 -3.715280387
158 -3.717719529
159 -3.728594886
160 -3.729083702
161 -3.72921576
162 -3.729824162
163 -3.744162516
164 -3.746749996
165 -3.77967583
166 -3.781225932
167 -3.781468171
168 -3.781479385
169 -3.781487726
170 -3.782062008
171 -3.788729529
172 -3.972231302
173 -3.975517219
174 -3.975557851
175 -3.975558722
176 -3.976423564
177 -3.986095797
178 -4.258082375
179 -4.259585372
180 -4.260048104
181 -4.260287583
182 -4.260424079
183 -4.260467422
184 -4.260468281
185 -4.260522077
186 -4.260526128
187 -4.263706418
188 -4.277614201
189 -4.277885813
190 -4.409400967
191 -4.424280887
192 -4.43257427
193 -4.43473067
194 -4.435537701
195 -4.438437738
196 -4.438457408
197 -4.438590262
198 -4.438593435
199 -4.440856242
200 -4.440958833
201 -4.440996679
202 -4.440997645
203 -4.440998211
204 -4.442302585
205 -4.45460487
206 -4.455341733
207 -4.455354671
208 -4.456026423
209 -4.456211001
210 -4.461281822
211 -4.469839164
212 -4.496361478
213 -4.497160806
214 -4.498120496
215 -4.498468493
216 -4.499015464
217 -4.499138373
218 -4.500065923
219 -4.500068586
220 -4.50008956
221 -4.500082092
222 -4.500070377
223 -4.500071096
224 -4.501173911
225 -4.501176912
226 -4.501399887
227 -4.513787235
228 -4.519158959
229 -4.521410536
230 -4.52145703
231 -4.52172698
232 -4.521732696
233 -4.523604918
234 -4.527626001
235 -4.528792498
236 -4.528966128
237 -4.529101321
238 -4.52910312
239 -4.529107607
240 -4.52912014
241 -4.529120881
242 -4.530701735
243 -4.537670426
244 -4.543712819
245 -4.543776577
246 -4.543786349
247 -4.543843701
248 -4.552229795
249 -4.562895646
250 -4.566556559
251 -4.569979955
252 -4.570164845
253 -4.576497467
254 -4.631658441
255 -4.631948951
256 -4.63426494
257 -4.642431515
258 -4.647712706
259 -4.650776229
260 -4.717520442
261 -4.725017172
262 -4.750776754
263 -4.763716149
264 -4.772849804
265 -4.822353057
266 -4.859152509
267 -4.887218111
268 -4.900392368
269 -4.905109795
270 -4.913457592
271 -4.950264681
272 -4.958831288
273 -4.967308956
274 -4.968549952
275 -4.970439999
276 -4.973099586
277 -4.97367364
278 -5.08041008
279 -5.103975181
280 -5.153945381
281 -5.16617746
282 -5.18042044
283 -5.306079391
284 -5.325503408
285 -5.339985574
286 -5.340005494
287 -5.340323655
288 -5.345260503
289 -5.346623869
290 -5.357443756
291 -5.359033035
292 -5.359033066
293 -5.359033826
294 -5.360459676
295 -5.365110233
296 -5.365571553
297 -5.365658649
298 -5.366202814
299 -5.368920728
300 -5.369390306
301 -5.377392993
302 -5.381176444
303 -5.396181952
304 -5.400643055
305 -5.401286148
306 -5.407390193
307 -5.418281105
308 -5.431235833
309 -5.440408885
310 -5.442246726
311 -5.44295267
312 -5.44666539
313 -5.447514456
314 -5.447957295
315 -5.450120679
316 -5.460446866
317 -5.462434463
318 -5.466522923
319 -5.466532795
320 -5.471456194
321 -5.472807558
322 -5.473479812
323 -5.474296092
324 -5.476893335
325 -5.47831061
326 -5.47844126
327 -5.478470585
328 -5.479990089
329 -5.480742118
330 -5.483757895
331 -5.488471497
332 -5.489758502
333 -5.489930413
334 -5.492845731
335 -5.505524974
336 -5.506461579
337 -5.513494955
338 -5.51693178
339 -5.524576569
340 -5.526151153
341 -5.527584181
342 -5.530487668
343 -5.533704324
344 -5.53799806
345 -5.549260521
346 -5.550891687
347 -5.552268263
348 -5.560029132
349 -5.567472165
350 -5.569508323
351 -5.570169625
352 -5.574374383
353 -5.57526478
354 -5.575745448
355 -5.57853413
356 -5.579312536
357 -5.579508418
358 -5.58225834
359 -5.582374154
360 -5.584805914
361 -5.584957514
362 -5.585573266
363 -5.586538647
364 -5.587187136
365 -5.589449889
366 -5.590118166
367 -5.592863859
368 -5.594737779
369 -5.595086263
370 -5.595586761
371 -5.596032289
372 -5.599250818
373 -5.59996404
374 -5.600327908
375 -5.601578421
376 -5.602616138
377 -5.603063199
378 -5.60359981
379 -5.603868388
380 -5.606775329
381 -5.606968038
382 -5.607355782
383 -5.607960439
384 -5.608439572
385 -5.609511875
386 -5.609740755
387 -5.610708933
388 -5.61093085
389 -5.610956345
390 -5.61118588
391 -5.611261848
392 -5.611352693
393 -5.611479619
394 -5.61151479
395 -5.611620436
396 -5.611689574
397 -5.611735221
398 -5.611799514
399 -5.611839283
400 -5.611860504
401 -5.611934358
***** Reality check: recover the optimal x found by decomposition.
***** Its objective value is: -5.611934358

***** Compare with LP value calculated without decomposition: -5.611934358


***** Number of basic-feasible solutions generated: 235

***** Number of basic-feasible rays generated: 167



A.9 SubgradProj.ipynb
SubgradProj

June 25, 2021

Subgradient Optimization with Python/Gurobi


Apply Subgradient Optimization to:

z = min c′x                    (Q)
        Ex ≥ h
        Ax = b
         x ≥ 0,

relaxing Ex ≥ h in the Lagrangian.
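
For orientation, a sketch of the quantities computed below: for y ≥ 0 the Lagrangian value is

v(y) = y′h + min{ (c′ − y′E)x : Ax = b, x ≥ 0 },

which is a lower bound on z. If x̄ attains the inner minimum, then g := h − E x̄ serves as the subgradient of v at y, and each iteration updates y ← max(y + λg, 0) componentwise, where the stepsize λ is either harmonic (1/k) or the Polyak-type choice (target − v(y))/‖g‖² driven by a guessed target value.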


References: * Jon Lee, “A First Course in Linear Optimization”, Fourth Edition (Version 4.0), Reex
Press, 2013-20.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f
import numpy as np
#%matplotlib notebook
%matplotlib inline
import matplotlib.pyplot as plt
import gurobipy as gp
from gurobipy import GRB

class StopExecution(Exception):
def _render_traceback_(self):
pass

[2]: MAXIT = 500


HarmonicStepSize = False # If you choose False, then you have to guess a
,→'target value'

GUESS = -5.6 # but don't guess a target value higher than z!!
,→!!

SmartInitialization = True # Set 'False' to initialize with y=0.

# generate a random example


n = 100 # number of variables
m1 = 200 # number of equations to relax
m2 = 50 # number of equations to keep
np.random.seed(25) # change the seed for a different example
E=0.01*np.random.randint(-5,high=5,size=(m1,n)).astype(float) #np.random.
,→randn(m1,nt)

A=0.01*np.random.randint(-2,high=3,size=(m2,n)).astype(float) #np.random.
,→randn(m2,nt)

# choose the right-hand sides so that Q will be feasible


xfeas=0.1*np.random.randint(0,high=5,size=n).astype(float)
h=E.dot(xfeas) - 0.1*np.random.randint(0,high=10,size=m1).astype(float)
b=A.dot(xfeas)

# choose the objective function so that the dual of Q will be feasible


yfeas=0.1*np.random.randint(0,high=5,size=m1).astype(float)
pifeas=0.1*np.random.randint(-5,high=5,size=m2).astype(float)
c=np.transpose(E)@yfeas + np.transpose(A)@pifeas + 0.1*np.random.
,→randint(0,high=1,size=n).astype(float)

[3]: # solve the problem as one big LP --- for comparison purposes
modelQ = gp.Model()
modelQ.reset()
x = modelQ.addMVar(n)
objective = modelQ.setObjective(c@x, GRB.MINIMIZE)
constraintsQ1 = modelQ.addConstr(E@x >= h)
constraintsQ2 = modelQ.addConstr(A@x == b)
modelQ.optimize()
if modelQ.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", modelQ.status)
print("***** This is a problem. Model Q does not have an optimal solution")
raise StopExecution

--------------------------------------------
Warning: your license will expire in 3 days
--------------------------------------------

Using license file C:\Users\jonxlee\gurobi.lic


Academic license - for non-commercial use only - expires 2021-06-28
Discarded solution information
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 250 rows, 100 columns and 21957 nonzeros
Model fingerprint: 0xd5eae979
Coefficient statistics:
Matrix range [1e-02, 5e-02]
Objective range [2e-02, 5e-01]
Bounds range [0e+00, 0e+00]
RHS range [3e-18, 1e+00]
Presolve time: 0.01s
Presolved: 250 rows, 100 columns, 21957 nonzeros

Iteration Objective Primal Inf. Dual Inf. Time


0 -4.2122000e+31 1.799360e+33 4.212200e+01 0s
211 -5.6119344e+00 0.000000e+00 0.000000e+00 0s

Solved in 211 iterations and 0.03 seconds


Optimal objective -5.611934358e+00

[4]: # 'SmartInitialization' chooses the initial y so that the dual of the Lagrangian
,→Subproblem has (pi=0 as)

# a feasible solution, thus making sure that the initial Lagrangian Subproblem
,→is not unbounded.

if SmartInitialization:
modelY = gp.Model()
modelY.reset()
yvar = modelY.addMVar(m1)
constraintsY = modelY.addConstr(np.transpose(E)@yvar <= c)
modelY.optimize()
y=yvar.X
else: y=np.zeros(m1)

# initialization
k=1
bestlb = -np.Inf

# set up the Lagrangian relaxation


modelL = gp.Model()
modelL.reset()
modelL.setParam('OutputFlag', 0) # quiet the Gurobi output
x = modelL.addMVar(n)
constraintsL = modelL.addConstr(A@x == b)
objective = modelL.setObjective((c-y.dot(E))@x, GRB.MINIMIZE)

modelL.optimize()
if modelL.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", modelL.status)
print("***** This is a problem. Lagrangian Subproblem is unbounded.")
print("***** The algorithm cannot work with this starting y.")
raise StopExecution
v = y.dot(h) + modelL.Objval
results1=[0]
results2=[v]
bestlb = v

Discarded solution information


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 100 rows, 200 columns and 17965 nonzeros
Model fingerprint: 0x64395335
Coefficient statistics:
Matrix range [1e-02, 5e-02]
Objective range [0e+00, 0e+00]
Bounds range [0e+00, 0e+00]
RHS range [2e-02, 5e-01]
Presolve time: 0.01s
Presolved: 100 rows, 200 columns, 17965 nonzeros

Iteration Objective Primal Inf. Dual Inf. Time


0 0.0000000e+00 1.684880e+02 0.000000e+00 0s
79 0.0000000e+00 0.000000e+00 0.000000e+00 0s

Solved in 79 iterations and 0.02 seconds


Optimal objective 0.000000000e+00
Discarded solution information

[5]: while k < MAXIT:


k += 1
g = h - E.dot(x.X)
if HarmonicStepSize:
stepsize = 1/k # This one converges in theory, but it is
,→slow.

else: # Instead, you can make a GUESS at the max


stepsize = (GUESS - v)/(g@g) # and then use this 'Polyak' stepsize
y = np.maximum(y + stepsize*g, np.zeros(m1)) # The projection keeps y>=0.
objective = modelL.setObjective((c-y.dot(E))@x, GRB.MINIMIZE)
modelL.optimize()
if modelL.status != GRB.Status.OPTIMAL:
k -= 1
print("***** Gurobi solve status:", GRB.OPTIMAL)
print("***** This is a problem. Lagrangian Subproblem is unbounded.")
print("***** The algorithm cannot continue after k =",k)
break
v = y.dot(h) + modelL.Objval
bestlb = np.max((bestlb,v))
results1=np.append(results1,k-1)
results2=np.append(results2,v)

print("***** z:", modelQ.Objval)


print("***** first lower bound:", results2[0])
print("***** best lower bound:", bestlb)

***** z: -5.611934358015312
***** first lower bound: -35.97487470911054
***** best lower bound: -6.309166317427381
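The 'Polyak' stepsize used above is (GUESS − v(y))/‖g‖², i.e., the usual (max value − v(y))/‖g‖² rule with the unknown maximum of v replaced by a guess; the harmonic stepsize 1/k is the theoretically safe but slow alternative. For this run, the subgradient iterations raise the Lagrangian lower bound from about −35.97 at the starting y to about −6.31, against the optimal value z ≈ −5.61 reported by Gurobi.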

[6]: if k > 1:
fig, ax = plt.subplots(figsize=(10,10))
ax.plot(results1, results2)
ax.plot(results1, modelQ.Objval*np.ones(k))
ax.set(xlabel='iteration', ylabel='v(y)')
ax.grid()
plt.show()

A.10 CSP.ipynb
CSP

June 25, 2021

Cutting-Stock model: column generation with Python/Gurobi

min e′x
Ax − t = d
x, t ≥ 0,

where the columns of A are cutting patterns, d is the demand vector, and e is the all-ones vector (so e′x counts the rolls cut).
Notes: * In this implementation, we never delete generated columns (i.e., patterns) * Knapsack
subproblems solved by DP or ILP (Gurobi) or both [user options] * At the end, we solve the ILP
over all columns generated, aiming to improve on the rounded-up LP solution from column-
generation
References: * Jon Lee, “A First Course in Linear Optimization”, Fourth Edition (Version 4.0), Reex
Press, 2013-20.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f
import numpy as np
#%matplotlib notebook
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns; sns.set(); sns.set_style("whitegrid"); color_list = sns.color_palette("muted")

import gurobipy as gp
from gurobipy import GRB

class StopExecution(Exception):
def _render_traceback_(self):
pass

[2]: # set at least one of the following two parameters to 'True'


# if both are set to 'True', then DP overwrites what IP calculates (but we can still compare)

IP=True # set True for solution of knapsack problem by IP (i.e., Gurobi)


DP=True # set True for solution of knapsack problem by DP
results1=[]
results2=[]
ITER=0

[3]: # Some toy data


W=110
m=5; M=range(m)
Widths=np.array([70.0,40.0,55.0,25.0,35.0])
Demands=np.array([205,2321,143,1089,117])

[4]: # set up the Main LP model


LP = gp.Model()
LP.setParam('OutputFlag', 0) #comment out to see more Gurobi output
minsum = LP.setObjective(0, GRB.MINIMIZE)
s=LP.addVars(m)
for i in M:
LP.addConstr(-s[i] == Demands[i])
LP.update()
demandconstraints=LP.getConstrs()
# initialize with elementary patterns
nPAT=0
A = np.zeros((m,m))
for i in M:
nPAT += 1
A[i,nPAT-1] = np.floor(W/Widths[i])
newcol=gp.Column(A[:,i],demandconstraints)
LP.addVar(obj=1.0, column=newcol)
LP.update()
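Each initial (elementary) pattern uses just one width, repeated as many times as fits into the stock roll: ⌊W/Widths[i]⌋ pieces. For the toy data above (W = 110 and widths 70, 40, 55, 25, 35), these counts are 1, 2, 2, 4 and 3, which is exactly the starting matrix A printed at the first iteration of the column-generation loop below.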

--------------------------------------------
Warning: your license will expire in 3 days
--------------------------------------------

Using license file C:\Users\jonxlee\gurobi.lic


Academic license - for non-commercial use only - expires 2021-06-28

The Knapsack model for generating an improving column

max ∑_{i=1}^{m} ȳ_i a_i
s.t. ∑_{i=1}^{m} w_i a_i ≤ W
     a_i ≥ 0 and integer, for i = 1, . . . , m.
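Here ȳ is the vector of dual values on the demand constraints of the main LP. A pattern a prices out (is worth adding as a column) exactly when its reduced cost 1 − ∑_{i} ȳ_i a_i is negative, i.e., when the knapsack optimum exceeds 1; this is the test reducedcost < -0.0001 in the main loop below, and column generation stops as soon as it fails.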

[5]: # set up for solving the knapsack subproblems: either by DP or IP (or both)
#
y=np.zeros(m)
if IP==True:
# set up the Subproblem ILP knapsack model for Gurobi
Knap = gp.Model()
Knap.setParam('OutputFlag', 0) #comment out to see more Gurobi output
a = Knap.addMVar(m,vtype=GRB.INTEGER)
knapsackobjective = Knap.setObjective(y@a, GRB.MAXIMIZE)
knapsackconstraint = Knap.addConstr(Widths@a <= W)
if DP==True:
# DP for knapsack. Local notation: max c'x, s.t. a'x <= b, x>=0 int.
def Knapf(a,b,c):
m=np.size(a)
f=np.zeros(b+1)
i=-np.ones(b+1,dtype=int)
v=-np.Inf*np.ones(m)
for s in range(min(a),b+1):
for j in range(m):
if a[j]<=s: v[j]=c[j] + f[s-a[j]]
else: v[j]=-np.Inf
f[s]=max(v)
i[s]=np.argmax(v) # save the index j where the max occurred for that s

#
x=np.zeros(m)
s=b+0
while s>=min(a):
x[i[s]] += 1
s=s-a[i[s]]
return f[b], x
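As a quick sanity check of the DP routine, it can be called directly on tiny made-up data (hypothetical numbers, not the instance data above); run this right after the cell above, with numpy and Knapf in scope:

val, pattern = Knapf(np.array([3, 5, 7]), 10, np.array([1.0, 2.0, 3.0]))  # widths 3, 5, 7; capacity 10; values 1, 2, 3
print(val, pattern)  # best value is 4.0, attained e.g. by one piece of width 3 and one of width 7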
[6]: # fancy output function
def fancyoutput():
plt.figure()
print("***** Patterns / Widths:", Widths, "Stock roll width:", W)
Aw=np.zeros((m,nPAT))
for i in M:
for j in range(nPAT):
Aw[i,j]=A[i,j]*Widths[i]
Aw=np.c_[ Aw, np.zeros(m) ]
wlist=[''] * m
for i in M:
wlist[i]='w'+str(i)
K=np.diagflat(Widths)
Bw=np.c_[Aw,K]
T = np.arange(Bw.shape[1])
for i in range(Bw.shape[0]):
plt.bar(T, Bw[i],
tick_label = np.concatenate((np.arange(nPAT),np.array([' ']),wlist)),
bottom = np.sum(Bw[:i], axis = 0),
color = color_list[i % len(color_list)])
plt.show()

print("***** A:")
print(A)

[7]: while True:


print(" ")
print("***** Solving LP...")
ITER += 1
LP.optimize()
if LP.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", LP.status)
print("***** This is a problem. LP does not have an optimal solution")
raise StopExecution
results1=np.append(results1,ITER-1)
results2=np.append(results2,LP.Objval)
print("***** A:")
print(A)
print("***** x:")
x = LP.getVars()
for j in range(nPAT):
print("x[",j,"]=",round(x[j+m].X,4))
for i in M:
y[i]=demandconstraints[i].Pi
print("***** y':",np.round(y,4))
#
if IP==True:
knapsackobjective = Knap.setObjective(y@a, GRB.MAXIMIZE)
print(" ")
print("***** Solving Knapsack...")
Knap.optimize()
if Knap.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", Knap.status)
print("***** This is a problem. Knapsack IP does not have an optimal
,→solution")

raise StopExecution
print("***** Gurobi Knap objval:",Knap.Objval)
reducedcost = 1.0-Knap.Objval
pattern=a.X+np.zeros(m)
#
if DP==True:
results = Knapf(Widths.astype(int),W,y)
print("***** DP Knap objval: ",results[0])
reducedcost = 1.0-results[0]
pattern=results[1]
#
if reducedcost < -0.0001:
print("***** Column:",pattern)
A=np.c_[ A, pattern ]
nPAT += 1
newcol=gp.Column(pattern,demandconstraints)
LP.addVar(obj=1.0, column=newcol)
else:
print("***** No more improving columns")
break

print("***** Pattern generation complete. Main LP solved to optimality.")


print("***** Total number of patterns generated: ", nPAT)
print("***** A:")
print(A)
print("***** x:")
x = LP.getVars()
for j in range(nPAT):
print("x[",j,"]=",round(x[j+m].X,4))
print("***** Optimal LP objective value:", LP.Objval)
print("***** rounds up to: ", np.ceil(LP.Objval), "(lower bound on rolls
,→needed)")

print("***** x rounded up:")


for j in range(nPAT):
print("x[",j,"]=",np.ceil(x[j+m].X))
print("***** Number of rolls used:", sum(np.ceil(x[j+m].X) for j in range(nPAT)))
fancyoutput()
fig, ax = plt.subplots(figsize=(10, 10))
ax.plot(results1[0:ITER], results2[0:ITER])
ax.plot(results1, np.ceil(LP.Objval)*np.ones(ITER))
ax.plot(results1, sum(np.ceil(x[j+m].X) for j in range(nPAT))*np.ones(ITER))
ax.set(xlabel='LP iteration', ylabel='LP objective value')
ax.set_xticks(ticks=results1, minor=False)
ax.grid()
plt.show()
print("LP iter", " LP val")
print("--------- ---------")
for j in range(ITER):
print(np.int(results1[j]), " ", np.round(results2[j],4))
print(" ")

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0.]
[0. 2. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 4. 0.]
[0. 0. 0. 0. 3.]]
***** x:
x[ 0 ]= 205.0
x[ 1 ]= 1160.5
x[ 2 ]= 71.5
x[ 3 ]= 272.25
x[ 4 ]= 39.0
***** y': [1. 0.5 0.5 0.25 0.3333]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.5
***** DP Knap objval: 1.5
***** Column: [1. 1. 0. 0. 0.]

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0. 1.]
[0. 2. 0. 0. 0. 1.]
[0. 0. 2. 0. 0. 0.]
[0. 0. 0. 4. 0. 0.]
[0. 0. 0. 0. 3. 0.]]
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 1058.0
x[ 2 ]= 71.5
x[ 3 ]= 272.25
x[ 4 ]= 39.0
x[ 5 ]= 205.0
***** y': [0.5 0.5 0.5 0.25 0.3333]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.25
***** DP Knap objval: 1.25
***** Column: [0. 2. 0. 1. 0.]

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0. 1. 0.]
[0. 2. 0. 0. 0. 1. 2.]
[0. 0. 2. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1.]
[0. 0. 0. 0. 3. 0. 0.]]
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 71.5
x[ 3 ]= 7.75
x[ 4 ]= 39.0
x[ 5 ]= 205.0
x[ 6 ]= 1058.0
***** y': [0.625 0.375 0.5 0.25 0.3333]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.0833333333333333
***** DP Knap objval: 1.0833333333333333
***** Column: [0. 0. 0. 3. 1.]

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0. 1. 0. 0.]
[0. 2. 0. 0. 0. 1. 2. 0.]
[0. 0. 2. 0. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1. 3.]
[0. 0. 0. 0. 3. 0. 0. 1.]]
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 71.5
x[ 3 ]= 0.0
x[ 4 ]= 35.5556
x[ 5 ]= 205.0
x[ 6 ]= 1058.0
x[ 7 ]= 10.3333
***** y': [0.6111 0.3889 0.5 0.2222 0.3333]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.0555555555555556
***** DP Knap objval: 1.0555555555555556
***** Column: [0. 1. 0. 0. 2.]

***** Solving LP...


***** A:
[[1. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 2. 0. 0. 0. 1. 2. 0. 1.]
[0. 0. 2. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1. 3. 0.]
[0. 0. 0. 0. 3. 0. 0. 1. 2.]]
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 71.5
x[ 3 ]= 0.0
x[ 4 ]= 0.0
x[ 5 ]= 205.0
x[ 6 ]= 1033.3846
x[ 7 ]= 18.5385
x[ 8 ]= 49.2308
***** y': [0.6154 0.3846 0.5 0.2308 0.3077]

***** Solving Knapsack...


***** Gurobi Knap objval: 1.0
***** DP Knap objval: 1.0
***** No more improving columns
***** Pattern generation complete. Main LP solved to optimality.
***** Total number of patterns generated: 9
***** A:
[[1. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 2. 0. 0. 0. 1. 2. 0. 1.]
[0. 0. 2. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1. 3. 0.]
[0. 0. 0. 0. 3. 0. 0. 1. 2.]]
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 71.5
x[ 3 ]= 0.0
x[ 4 ]= 0.0
x[ 5 ]= 205.0
x[ 6 ]= 1033.3846
x[ 7 ]= 18.5385
x[ 8 ]= 49.2308
***** Optimal LP objective value: 1377.6538461538462
***** rounds up to: 1378.0 (lower bound on rolls needed)
***** x rounded up:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 72.0
x[ 3 ]= 0.0
x[ 4 ]= 0.0
x[ 5 ]= 205.0
x[ 6 ]= 1034.0
x[ 7 ]= 19.0
x[ 8 ]= 50.0
***** Number of rolls used: 1380.0
***** Patterns / Widths: [70. 40. 55. 25. 35.] Stock roll width: 110

***** A:
[[1. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 2. 0. 0. 0. 1. 2. 0. 1.]
[0. 0. 2. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1. 3. 0.]
[0. 0. 0. 0. 3. 0. 0. 1. 2.]]
LP iter LP val
--------- ---------
0 1748.25
1 1645.75
2 1381.25
3 1380.3889
4 1377.6538

[8]: print(" ")


print("***** Now solve the ILP over all patterns generated to try and get a
,→better soution...")

for var in LP.getVars():


var.vtype=GRB.INTEGER
LP.optimize()
if LP.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", LP.status)
print("***** This is a problem. Hit enter to continue")
input()
print("***** x:")
for j in range(nPAT):
print("x[",j,"]=",round(x[j+m].X+0,4))
print("***** Number of rolls used:", sum(np.ceil(x[j+m].X) for j in range(nPAT)))
fancyoutput()

***** Now solve the ILP over all patterns generated to try and get a better solution...
***** x:
x[ 0 ]= 0.0
x[ 1 ]= 0.0
x[ 2 ]= 72.0
x[ 3 ]= 1.0
x[ 4 ]= 1.0
x[ 5 ]= 205.0
x[ 6 ]= 1034.0
x[ 7 ]= 17.0
x[ 8 ]= 49.0
***** Number of rolls used: 1379.0
***** Patterns / Widths: [70. 40. 55. 25. 35.] Stock roll width: 110
***** A:
[[1. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 2. 0. 0. 0. 1. 2. 0. 1.]
[0. 0. 2. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 4. 0. 0. 1. 3. 0.]
[0. 0. 0. 0. 3. 0. 0. 1. 2.]]
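So, for this instance, the ILP over the generated patterns uses 1379 rolls: one better than the 1380 obtained by simply rounding up the LP solution, and at most one worse than optimal, since the LP bound shows that at least 1378 rolls are needed.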

A.11 UFL.ipynb
UFL

June 28, 2021

Uncapacitated-Facility-Location models with Python/Gurobi


The base model that we work with is

min ∑_{i∈M} f_i y_i + ∑_{i∈M} ∑_{j∈N} c_ij x_ij
s.t. ∑_{i∈M} x_ij = 1, for j ∈ N;
     x_ij ≥ 0, for i ∈ M, j ∈ N;
     0 ≤ y_i ≤ 1, and integer, for i ∈ M.

Notes: * We make two solves, first with the weak forcing constraints

∑_{j∈N} x_ij ≤ n y_i , for i ∈ M,

and then with the strong forcing constraints

xij ≤ yi , for i ∈ M, j ∈ N.

* Random instances with m facilities and n customers. Play with m, n and possibly with demand
and scale factor in f .
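(A one-line comparison of the two formulations: summing the strong forcing constraints x_ij ≤ y_i over j ∈ N yields ∑_{j∈N} x_ij ≤ n y_i, so every solution of the strong LP relaxation also satisfies the weak forcing constraints; the strong relaxation is therefore at least as tight, and the two solves below show how much that matters computationally.)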
References: * Jon Lee, “A First Course in Linear Optimization”, Fourth Edition (Version 4.0), Reex
Press, 2013-20.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f
import numpy as np
#%matplotlib notebook
%matplotlib inline
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d
import gurobipy as gp
from gurobipy import GRB

class StopExecution(Exception):
def _render_traceback_(self):
pass

[2]: # parameters
m=75 # number of facilities
n=4000 # number of customers
M=list(range(0,m))
N=list(range(0,n))
np.random.seed(10) # set seed to be able to repeat the same random data
solveLPsOnly=False # set True to only solve LP relaxations

# random locations in the unit square


fPx=np.random.rand(m)
fPy=np.random.rand(m)
cPx=np.random.rand(n)
cPy=np.random.rand(n)

# cost data
demand=10*np.random.rand(n) # these will be 'baked' into the shipping costs
f=200*np.random.rand(m) # facility costs
c=np.zeros((m,n))
for i in range(0,m):
for j in range(0,n):
c[i,j]=demand[j]*np.sqrt(np.square(fPx[i]-cPx[j])+np.square(fPy[i]-cPy[j]))

# = demand times per-unit transportation costs (distance)

[3]: # set up the weak model


model = gp.Model()
model.reset()
#model.setParam('Threads', 1) # uncomment to ask for 1 thread
if solveLPsOnly==True:
y=model.addVars(m,ub=1.0)
else:
y=model.addVars(m,vtype=GRB.BINARY)
x=model.addVars(m,n)
model.setObjective(sum(f[i]*y[i] for i in M) + sum(sum(c[i,j]*x[i,j] for i in M) for j in N), GRB.MINIMIZE)

demandconstraints = model.addConstrs((sum(x[i,j] for i in M) == 1 for j in N))


weakforceconstraints = model.addConstrs((sum(x[i,j] for j in N) <= n*y[i] for i in M))

Using license file C:\Users\jonxlee\gurobi.lic


Academic license - for non-commercial use only - expires 2021-08-26
Discarded solution information

[4]: # solve the weak model


model.optimize()
if model.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", model.status)
print("***** This is a problem. Model does not have an optimal solution")
raise StopExecution
for i in M: print("y[",i,"]=",round(y[i].X,4))
ytot=round(sum (y[i].X for i in M))
print("y total =",ytot)

Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)


Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 4075 rows, 300075 columns and 600075 nonzeros
Model fingerprint: 0xfc880efe
Variable types: 300000 continuous, 75 integer (75 binary)
Coefficient statistics:
Matrix range [1e+00, 4e+03]
Objective range [3e-04, 2e+02]
Bounds range [1e+00, 1e+00]
RHS range [1e+00, 1e+00]
Found heuristic solution: objective 17544.136375
Presolve time: 0.64s
Presolved: 4075 rows, 300075 columns, 600075 nonzeros
Variable types: 300000 continuous, 75 integer (75 binary)

Root relaxation: objective 1.229656e+03, 597 iterations, 0.08 seconds

Nodes | Current Node | Objective Bounds | Work


Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time

0 0 1229.65553 0 75 17544.1364 1229.65553 93.0% - 1s


H 0 0 8324.6715293 1229.65553 85.2% - 1s
H 0 0 4673.5132638 1229.65553 73.7% - 1s
H 0 0 4419.8133391 1229.65553 72.2% - 2s
H 0 0 4368.5310262 1229.65553 71.9% - 2s
0 0 1391.46920 0 70 4368.53103 1391.46920 68.1% - 4s
H 0 0 4063.1409361 1391.46920 65.8% - 7s
0 0 1558.72879 0 72 4063.14094 1558.72879 61.6% - 7s
H 0 0 4037.6172963 1558.72879 61.4% - 13s
0 0 1753.61158 0 73 4037.61730 1753.61158 56.6% - 13s
0 0 1795.39708 0 72 4037.61730 1795.39708 55.5% - 15s
0 0 1795.39708 0 72 4037.61730 1795.39708 55.5% - 15s
H 0 0 3862.9299415 1795.39708 53.5% - 27s
H 0 0 3799.1987178 1795.39708 52.7% - 27s
H 0 0 3668.9696315 1795.39708 51.1% - 27s
0 0 1921.17962 0 70 3668.96963 1921.17962 47.6% - 27s
H 0 0 3635.6910048 1921.17962 47.2% - 42s
H 0 0 3535.7603537 1921.17962 45.7% - 42s
H 0 0 3444.3909501 1921.17962 44.2% - 42s
H 0 0 3328.5221017 1921.17962 42.3% - 42s
H 0 0 3252.1304013 1921.17962 40.9% - 42s
0 0 2074.95883 0 71 3252.13040 2074.95883 36.2% - 42s
H 0 0 3190.2555864 2074.95883 35.0% - 57s
H 0 0 3085.7395476 2074.95883 32.8% - 57s
H 0 0 3083.1364083 2074.95883 32.7% - 57s
H 0 0 3067.7311804 2074.95883 32.4% - 57s
H 0 0 3066.7687696 2074.95883 32.3% - 57s
H 0 0 2958.9643419 2074.95883 29.9% - 57s
H 0 0 2946.5871499 2074.95883 29.6% - 57s
H 0 0 2921.4140285 2074.95883 29.0% - 57s
H 0 0 2842.0681101 2074.95883 27.0% - 57s
H 0 0 2802.9809811 2074.95883 26.0% - 57s
H 0 0 2792.0637739 2074.95883 25.7% - 57s
H 0 0 2779.7296742 2074.95883 25.4% - 57s
H 0 0 2763.1285431 2074.95883 24.9% - 57s
H 0 0 2760.9864797 2074.95883 24.8% - 57s
H 0 0 2745.1240381 2074.95883 24.4% - 57s
H 0 0 2735.8698677 2074.95883 24.2% - 57s
H 0 0 2727.9013440 2074.95883 23.9% - 57s
H 0 0 2714.4940747 2074.95883 23.6% - 57s
H 0 0 2714.1367239 2074.95883 23.5% - 57s
H 0 0 2708.3760559 2074.95883 23.4% - 57s
H 0 0 2704.6679993 2074.95883 23.3% - 57s
0 0 2185.58189 0 71 2704.66800 2185.58189 19.2% - 58s
0 0 2186.20338 0 68 2704.66800 2186.20338 19.2% - 58s
0 0 2297.87203 0 67 2704.66800 2297.87203 15.0% - 75s
0 0 2392.45333 0 66 2704.66800 2392.45333 11.5% - 90s
0 0 2392.51043 0 66 2704.66800 2392.51043 11.5% - 91s
0 0 2493.21367 0 66 2704.66800 2493.21367 7.82% - 109s
H 0 0 2689.2832904 2493.21367 7.29% - 116s
0 0 2504.48700 0 62 2689.28329 2504.48700 6.87% - 116s
0 0 2504.50603 0 62 2689.28329 2504.50603 6.87% - 117s
H 0 0 2683.5226224 2504.50603 6.67% - 126s
H 0 0 2671.7945495 2504.50603 6.26% - 126s
H 0 0 2670.9772972 2504.50603 6.23% - 126s
H 0 0 2666.7624678 2504.50603 6.08% - 126s
0 0 2532.96155 0 43 2666.76247 2532.96155 5.02% - 126s
H 0 0 2565.6289184 2532.96155 1.27% - 127s
0 0 2533.15328 0 37 2565.62892 2533.15328 1.27% - 128s
0 0 2537.72352 0 21 2565.62892 2537.72352 1.09% - 130s
H 0 0 2538.4262791 2537.72352 0.03% - 131s
0 0 cutoff 0 2538.42628 2538.42628 0.00% - 132s

Cutting planes:
Implied bound: 11142

Explored 1 nodes (14761 simplex iterations) in 133.17 seconds


Thread count was 8 (of 8 available processors)

Solution count 10: 2538.43 2565.63 2666.76 ... 2745.12

Optimal solution found (tolerance 1.00e-04)


Best objective 2.538426279075e+03, best bound 2.538426279075e+03, gap 0.0000%
y[ 0 ]= 0.0
y[ 1 ]= 0.0
y[ 2 ]= 0.0
y[ 3 ]= 0.0
y[ 4 ]= 0.0
y[ 5 ]= 0.0
y[ 6 ]= 0.0
y[ 7 ]= 0.0
y[ 8 ]= 0.0
y[ 9 ]= 1.0
y[ 10 ]= 0.0
y[ 11 ]= 0.0
y[ 12 ]= 0.0
y[ 13 ]= 1.0
y[ 14 ]= 0.0
y[ 15 ]= 0.0
y[ 16 ]= 0.0
y[ 17 ]= 0.0
y[ 18 ]= 1.0
y[ 19 ]= 1.0
y[ 20 ]= 1.0
y[ 21 ]= 1.0
y[ 22 ]= 0.0
y[ 23 ]= 1.0
y[ 24 ]= 0.0
y[ 25 ]= 0.0
y[ 26 ]= 1.0
y[ 27 ]= 0.0
y[ 28 ]= 0.0
y[ 29 ]= 0.0
y[ 30 ]= 0.0
y[ 31 ]= 0.0
y[ 32 ]= 0.0
y[ 33 ]= 1.0
y[ 34 ]= 0.0
y[ 35 ]= 0.0
y[ 36 ]= 1.0
y[ 37 ]= 0.0
y[ 38 ]= 1.0
y[ 39 ]= 1.0
y[ 40 ]= 0.0
y[ 41 ]= 0.0
y[ 42 ]= 1.0
y[ 43 ]= 0.0
y[ 44 ]= 0.0
y[ 45 ]= 0.0
y[ 46 ]= 1.0
y[ 47 ]= 0.0
y[ 48 ]= 0.0
y[ 49 ]= 0.0
y[ 50 ]= -0.0
y[ 51 ]= 0.0
y[ 52 ]= 1.0
y[ 53 ]= 0.0
y[ 54 ]= 1.0
y[ 55 ]= 0.0
y[ 56 ]= 0.0
y[ 57 ]= 0.0
y[ 58 ]= 0.0
y[ 59 ]= 0.0
y[ 60 ]= 0.0
y[ 61 ]= 1.0
y[ 62 ]= 0.0
y[ 63 ]= 0.0
y[ 64 ]= 0.0
y[ 65 ]= 0.0
y[ 66 ]= 0.0
y[ 67 ]= 1.0
y[ 68 ]= 0.0
y[ 69 ]= 1.0
y[ 70 ]= 0.0
y[ 71 ]= 0.0
y[ 72 ]= 0.0
y[ 73 ]= 0.0
y[ 74 ]= 0.0
y total = 19

[5]: # set up and solve the strong model


model.reset()
model.remove(weakforceconstraints)
strongforceconstraints = model.addConstrs((x[i,j] <= y[i] for i in M for j in N))

model.optimize()
if model.status != GRB.Status.OPTIMAL:
print("***** Gurobi solve status:", model.status)
print("***** This is a problem. Model does not have an optimal solution")
raise StopExecution
for i in M: print("y[",i,"]=",round(y[i].X,4))
print("y total =", round(sum (y[i].X for i in M),4))

Discarded solution information


Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 304000 rows, 300075 columns and 900000 nonzeros
Model fingerprint: 0xdcd5646b
Variable types: 300000 continuous, 75 integer (75 binary)
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [3e-04, 2e+02]
Bounds range [1e+00, 1e+00]
RHS range [1e+00, 1e+00]
Found heuristic solution: objective 17544.136375
Presolve time: 1.02s
Presolved: 304000 rows, 300075 columns, 900000 nonzeros
Variable types: 300000 continuous, 75 integer (75 binary)

Deterministic concurrent LP optimizer: primal and dual simplex


Showing first log only...

Warning: Markowitz tolerance tightened to 0.5


Concurrent spin time: 0.00s

Solved with dual simplex

Root relaxation: objective 2.538426e+03, 11556 iterations, 1.09 seconds

Nodes | Current Node | Objective Bounds | Work


Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time

* 0 0 0 2538.4262791 2538.42628 0.00% - 2s


Explored 0 nodes (11556 simplex iterations) in 2.80 seconds
Thread count was 8 (of 8 available processors)

Solution count 2: 2538.43 17544.1

Optimal solution found (tolerance 1.00e-04)


Best objective 2.538426279075e+03, best bound 2.538426279075e+03, gap 0.0000%
y[ 0 ]= -0.0
y[ 1 ]= -0.0
y[ 2 ]= -0.0
y[ 3 ]= -0.0
y[ 4 ]= -0.0
y[ 5 ]= -0.0
y[ 6 ]= -0.0
y[ 7 ]= -0.0
y[ 8 ]= -0.0
y[ 9 ]= 1.0
y[ 10 ]= -0.0
y[ 11 ]= -0.0
y[ 12 ]= -0.0
y[ 13 ]= 1.0
y[ 14 ]= -0.0
y[ 15 ]= -0.0
y[ 16 ]= -0.0
y[ 17 ]= -0.0
y[ 18 ]= 1.0
y[ 19 ]= 1.0
y[ 20 ]= 1.0
y[ 21 ]= 1.0
y[ 22 ]= -0.0
y[ 23 ]= 1.0
y[ 24 ]= -0.0
y[ 25 ]= -0.0
y[ 26 ]= 1.0
y[ 27 ]= -0.0
y[ 28 ]= -0.0
y[ 29 ]= -0.0
y[ 30 ]= -0.0
y[ 31 ]= -0.0
y[ 32 ]= -0.0
y[ 33 ]= 1.0
y[ 34 ]= -0.0
y[ 35 ]= -0.0
y[ 36 ]= 1.0
y[ 37 ]= -0.0
y[ 38 ]= 1.0
y[ 39 ]= 1.0
y[ 40 ]= -0.0
y[ 41 ]= -0.0
y[ 42 ]= 1.0
y[ 43 ]= -0.0
y[ 44 ]= -0.0
y[ 45 ]= -0.0
y[ 46 ]= 1.0
y[ 47 ]= -0.0
y[ 48 ]= -0.0
y[ 49 ]= -0.0
y[ 50 ]= -0.0
y[ 51 ]= -0.0
y[ 52 ]= 1.0
y[ 53 ]= -0.0
y[ 54 ]= 1.0
y[ 55 ]= -0.0
y[ 56 ]= -0.0
y[ 57 ]= -0.0
y[ 58 ]= -0.0
y[ 59 ]= -0.0
y[ 60 ]= -0.0
y[ 61 ]= 1.0
y[ 62 ]= -0.0
y[ 63 ]= -0.0
y[ 64 ]= -0.0
y[ 65 ]= -0.0
y[ 66 ]= -0.0
y[ 67 ]= 1.0
y[ 68 ]= -0.0
y[ 69 ]= 1.0
y[ 70 ]= -0.0
y[ 71 ]= -0.0
y[ 72 ]= -0.0
y[ 73 ]= -0.0
y[ 74 ]= -0.0
y total = 19.0
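The contrast with the weak formulation above is the point of the experiment: with the weak forcing constraints, the root relaxation value is only about 1229.7 and Gurobi needs many rounds of cutting planes (over two minutes here) to close the gap, while with the strong forcing constraints the root relaxation value already equals the optimal value 2538.43 and the model solves in a few seconds with no branching.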

[6]: # plot the results


#
if solveLPsOnly == False:
fxopen=np.zeros(ytot)
fyopen=np.zeros(ytot)
count=-1
for i in M:
if round(y[i].X)==1:
count += 1
fxopen[count]=fPx[i]
fyopen[count]=fPy[i]
# Get current figure size
fig_size = plt.rcParams["figure.figsize"]
#print("Current size:", fig_size)
fig_size[0] = 10
fig_size[1] = 10
plt.rcParams["figure.figsize"] = fig_size

# Voronoi diagram for the open facilities


points=np.column_stack((fxopen,fyopen))
vor = Voronoi(points)
fig = voronoi_plot_2d(vor,show_vertices=False)

# open facilities are blue, closed facilities are opaque red,


# Voronoi cells capture the customers assigned to each open facility
plt.scatter(cPx,cPy,s=1)
plt.scatter(fPx,fPy,c='red',alpha=0.3)
plt.scatter(fxopen,fyopen,c='blue')

A.12 pure_gomory_example_1.ipynb
pure_gomory_example_1

June 25, 2021

Example 1: Gomory cutting-planes for dual-form pure-integer problem


For the dual-form pure-integer problem

max y′b        (DI)
s.t. y′A ≤ c′
     y ∈ Z^m.

Notes: * A and c MUST be integer
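As a reminder of the type of cut being generated (a sketch, in the notation above): for any λ ≥ 0, every feasible y satisfies y′(Aλ) ≤ c′λ; when Aλ is an integer vector, the left-hand side is an integer for every integer y, so the Chvátal-Gomory cut y′(Aλ) ≤ ⌊c′λ⌋ is also valid. pure_gomory() appends such a cut as a new column of A, with the rounded-down value appended to c; for instance, the first call below adds the column (4, 3)′ = ½·(8, 6)′ with cost ⌊141/2⌋ = 70.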


Reference: * Qi He, Jon Lee. Another pedagogy for pure-integer Gomory. RAIRO – Operations
Research, 51:189–197, 2017.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f

[2]: %run ./pivot_tools.ipynb

pivot_tools loaded: pivot_perturb, pivot_algebra, N, pivot_ratios, pivot_swap,


pivot_plot, pure_gomory, mixed_gomory, dual_plot
[3]: A = sym.Matrix(([7, 8,-1, 1, 3],
[5, 6, -1, 2, 1]))
m = A.shape[0]
n = A. shape[1]
c = sym.Matrix([126, 141, -10, 5, 67])
b = sym.Matrix([26, 19])
beta = [0,1]
eta = list(set(list(range(n)))-set(beta))
A_beta = copy.copy(A[:,beta])
A_eta = copy.copy(A[:,eta])
c_beta = copy.copy(c[beta,0])
c_eta = copy.copy(c[eta,0])
Perturb=False ### do NOT change this!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! You can perturb later

[4]: A
[4]: [ 7  8  −1  1  3 ]
     [ 5  6  −1  2  1 ]

[5]: c
[5]: [ 126 ]
     [ 141 ]
     [ −10 ]
     [   5 ]
     [  67 ]

[6]: #pivot_perturb()

[7]: b
[7]: [ 26 ]
     [ 19 ]

[8]: pivot_algebra()

pivot_algebra() done

[9]: xbar_beta
[9]: [  2  ]
     [ 3/2 ]

[10]: cbar_eta
[10]: [  5  ]
      [ 1/2 ]
      [  1  ]
[11]: ybar
[11]: [  51/2 ]
      [ −21/2 ]

[12]: dual_plot()

[13]: pure_gomory(1)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[14]: pivot_algebra()

pivot_algebra() done
[15]: A
[15]: [ 7  8  −1  1  3  4 ]
      [ 5  6  −1  2  1  3 ]

[16]: c
[16]: [ 126 ]
      [ 141 ]
      [ −10 ]
      [   5 ]
      [  67 ]
      [  70 ]

[17]: eta
[17]:
[2, 3, 4, 5]
[18]: cbar_eta
[18]: [   5  ]
      [  1/2 ]
      [   1  ]
      [ −1/2 ]

[19]: pivot_ratios(3)
 

3
x̄ + λz̄ :
 
2
3 − λ
2 2 
 0 
 
 0 
 
 0 
λ

[20]: pivot_swap(3,1)

swap accepted - new partition:


eta: [2, 3, 4, 1]
beta: [0, 5]
*** MUST APPLY pivot_algebra()! ***

[21]: pivot_algebra()

pivot_algebra() done
[22]: xbar_beta
[22]: 2
3

[23]: cbar_eta
[23]:  4 
 5 
 
 −3
1

[24]: pivot_ratios(2)
2
5

x̄ + λz̄ :
 
2 − 5λ
 0 
 
 0 
 
 0 
 
 λ 
8λ + 3

[25]: pivot_swap(2,0)

swap accepted - new partition:


eta: [2, 3, 0, 1]
beta: [4, 5]
*** MUST APPLY pivot_algebra()! ***

[26]: pivot_algebra()

pivot_algebra() done

[27]: xbar_beta
[27]:  2 
5
31
5

[28]: cbar_eta
[28]:  23 
5
2
3
 
5
1

[29]: ybar
[29]:  131

5
− 58
5

[30]: dual_plot()

[31]: pure_gomory(0)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[32]: pure_gomory(1)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[33]: pivot_algebra()
pivot_algebra() done

[34]: cbar_eta
[34]:  23 
5
 2 
 3 
 
 5 
 1 
 
− 1 
5
− 52

[35]: pivot_ratios(5)
 
2
31
3
x̄ + λz̄ :
 
0
 0 
 
 0 
 
 0 
 2 λ 
 − 
 5 5 
 31 − 3λ 
5 5 
 0 
λ

[36]: pivot_swap(5,0)

swap accepted - new partition:


eta: [2, 3, 0, 1, 6, 4]
beta: [7, 5]
*** MUST APPLY pivot_algebra()! ***

[37]: pivot_algebra()

pivot_algebra() done

[38]: xbar_beta
[38]: 2
5

[39]: cbar_eta
[39]:
 
5
0
 
1
 
1
 
1
2

[40]: ybar
[40]:  25 
−10

[41]: dual_plot()
[42]: dual_plot(.1)

A.13 pure_gomory_example_2.ipynb
pure_gomory_example_2

June 25, 2021

Example 2: Gomory cutting-planes for dual-form pure-integer problem


For the dual-form pure-integer problem

max y′b        (DI)
s.t. y′A ≤ c′
     y ∈ Z^m.

Notes: * A and c MUST be integer
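This instance is parameterized by k; below, with k = 3, we have A = [[6, −6, 0], [1, 1, −1]], c = (6, 0, 1)′ and b = (0, 1)′. It is apparently chosen so that a fairly long run is needed: eight Gomory cuts are added before the process ends at the integer point ȳ = (1, 0)′.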


Reference: * Qi He, Jon Lee. Another pedagogy for pure-integer Gomory. RAIRO – Operations
Research, 51:189–197, 2017.
MIT License
Copyright (c) 2020 Jon Lee
Permission is hereby granted, free of charge, to any person obtaining a copy of this software
and associated documentation files (the “Software”), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial
portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARIS-
ING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
[1]: %reset -f

[2]: %run ./pivot_tools.ipynb

pivot_tools loaded: pivot_perturb, pivot_algebra, pivot_ratios, pivot_swap,


pivot_plot, pure_gomory, dual_plot
[3]: k=3
A = sym.Matrix(([2*k, -2*k, 0],
[1, 1, -1]))
m = A.shape[0]
n = A. shape[1]
c = sym.Matrix([2*k, 0, 1])
b = sym.Matrix([0,1])
beta = [0,1]
eta = list(set(list(range(n)))-set(beta))
A_beta = copy.copy(A[:,beta])
A_eta = copy.copy(A[:,eta])
c_beta = copy.copy(c[beta,0])
c_eta = copy.copy(c[eta,0])
Perturb=False ### do NOT change this!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! You can perturb later

[4]: A
[4]: [ 6  −6   0 ]
     [ 1   1  −1 ]

[5]: c
[5]: 6
0
1

[6]: #pivot_perturb()

[7]: b
[7]: [ 0 ]
     [ 1 ]

[8]: pivot_algebra()

pivot_algebra() done

[9]: dual_plot(2*k+1)
[10]: ybar
[10]: [ 1/2 ]
      [  3  ]

[11]: pure_gomory(0)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[12]: pivot_algebra()

pivot_algebra() done

[13]: dual_plot()
[14]: cbar_eta
[14]:  
4 − 12

[15]: pivot_ratios(1)
 
6
6
11
x̄ + λz̄ :
1 λ 
2 − 12
 1 − 11λ 
2 12 
 0 
λ
[16]: pivot_swap(1,1)

swap accepted - new partition:


eta: [2, 1]
beta: [0, 3]
*** MUST APPLY pivot_algebra()! ***

[17]: pivot_algebra()

pivot_algebra() done

[18]: cbar_eta
[18]:  41 6

11 11

[19]: xbar_beta
[19]:  5 
11
6
11

[20]: dual_plot()
[21]: ybar
[21]:  6 
11
30
11

[22]: pure_gomory(1)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[23]: dual_plot()
[24]: pivot_algebra()

pivot_algebra() done

[25]: cbar_eta
[25]:  41 6 8

11 11 − 11

[26]: pivot_ratios(2)
 
1
1
x̄ + λz̄ :
 5 5λ 
11 − 11
 0 
 
 0 
 
 6 6λ 
11 − 11
λ

[27]: pivot_swap(2,0)

swap accepted - new partition:


eta: [2, 1, 0]
beta: [4, 3]
*** MUST APPLY pivot_algebra()! ***

[28]: pivot_algebra()

pivot_algebra() done

[29]: cbar_eta
[29]:  2 8

3 5 5

[30]: xbar_beta
[30]: 1
0

[31]: dual_plot()
[32]: ybar
[32]:  2 
5
2

[33]: pure_gomory(0)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[34]: dual_plot()
[35]: pivot_algebra()

pivot_algebra() done

[36]: cbar_eta
[36]:  2 8

3 5 5 − 25

[37]: pivot_ratios(3)
 
5
0
x̄ + λz̄ :
 
0
 0 
 
 0 
 4λ 
− 
 5 
1 − λ 
5
λ

[38]: pivot_swap(3,1)

swap accepted - new partition:


eta: [2, 1, 0, 3]
beta: [4, 5]
*** MUST APPLY pivot_algebra()! ***

[39]: pivot_algebra()

pivot_algebra() done

[40]: cbar_eta
[40]:  1

3 1 1 2

[41]: xbar_beta
[41]: 1
0

[42]: ybar
[42]:  1 
2
2

[43]: dual_plot()
[44]: pure_gomory(0)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[45]: pivot_algebra()

pivot_algebra() done

[46]: cbar_eta
[46]:  1

3 1 1 2 − 12

[47]: pivot_ratios(4)
 
4
0
x̄ + λz̄ :
 
0
 0 
 
 0 
 
 0 
 
1 − λ 
 4
 − 3λ 
4
λ

[48]: pivot_swap(4,1)

swap accepted - new partition:


eta: [2, 1, 0, 3, 5]
beta: [4, 6]
*** MUST APPLY pivot_algebra()! ***

[49]: pivot_algebra()

pivot_algebra() done

[50]: cbar_eta
[50]:  4 2

3 2 0 3 3

[51]: xbar_beta
[51]: 1
0

[52]: ybar
[52]:  2 
3
2

[53]: pure_gomory(0)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[54]: pivot_algebra()

pivot_algebra() done

[55]: cbar_eta
[55]:  4 2

3 2 0 3 3 − 32
[56]: pivot_ratios(5)
 
3
0
x̄ + λz̄ :
 
0
 0 
 
 0 
 
 0 
 
1 − λ 
 3
 0 
 
 − 2λ 
3
λ

[57]: pivot_swap(5,1)

swap accepted - new partition:


eta: [2, 1, 0, 3, 5, 6]
beta: [4, 7]
*** MUST APPLY pivot_algebra()! ***

[58]: pivot_algebra()

pivot_algebra() done

[59]: cbar_eta
[59]:  
3 4 −2 3 2 1

[60]: pivot_ratios(2)
1
4

x̄ + λz̄ :
 
λ
 0 
 
 0 
 
 0 
 
1 − 4λ
 
 0 
 
 0 

[61]: pivot_swap(2,0)

swap accepted - new partition:


eta: [2, 1, 4, 3, 5, 6]
beta: [0, 7]
*** MUST APPLY pivot_algebra()! ***

[62]: pivot_algebra()

pivot_algebra() done

[63]: cbar_eta
[63]:  5 1 9 3 3

2 3 2 4 2 4

[64]: xbar_beta
[64]:  1 
4
3
4

[65]: dual_plot()
[66]: ybar
[66]:  3 
4
3
2

[67]: pure_gomory(1)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[68]: pivot_algebra()

pivot_algebra() done

[69]: dual_plot()
[70]: cbar_eta
[70]:  5 1 9 3 3

2 3 2 4 2 4 − 12

[71]: pivot_ratios(6)
 
1
1
x̄ + λz̄ :
 
1
4 − λ4
 0 
 
 0 
 
 
 0 
 
 0 
 
 0 
 
 0 
3 
4 − 3λ
4

λ

[72]: pivot_swap(6,1)

swap accepted - new partition:


eta: [2, 1, 4, 3, 5, 6, 7]
beta: [0, 8]
*** MUST APPLY pivot_algebra()! ***

[73]: pivot_algebra()

pivot_algebra() done

[74]: cbar_eta
[74]:  19 7 3 2

2 4 1 6 3 2 3

[75]: dual_plot()
[76]: ybar
[76]:  5 
6
1

[77]: pure_gomory(0)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[78]: pivot_algebra()

pivot_algebra() done

[79]: cbar_eta
[79]:  19 7 3 2

2 4 1 6 3 2 3 − 56

[80]: pivot_ratios(7)
 
0
6
5
x̄ + λz̄ :
 
− λ6
 0 
 
 0 
 
 0 
 
 
 0 
 
 0 
 
 0 
 
 0 
 
1 − 5λ 6

λ

[81]: pivot_swap(7,0)

swap accepted - new partition:


eta: [2, 1, 4, 3, 5, 6, 7, 0]
beta: [9, 8]
*** MUST APPLY pivot_algebra()! ***

[82]: pivot_algebra()

pivot_algebra() done

[83]: cbar_eta
[83]:  
2 −1 1 −1 −1 −1 −1 5

[84]: xbar_beta
[84]: 0
1

[85]: dual_plot()
[86]: pivot_ratios(6)
 

1
3
x̄ + λz̄ :
 
0
 0 
 
 0 
 
 0 
 
 0 
 
 0 
 
 0 
 
 λ 
 
1 − 3λ

[87]: pivot_swap(6,1)

swap accepted - new partition:


eta: [2, 1, 4, 3, 5, 6, 8, 0]
beta: [9, 7]
*** MUST APPLY pivot_algebra()! ***

[88]: pivot_algebra()

pivot_algebra() done

[89]: cbar_eta
[89]:  5 4 4 2 1 1 10

3 3 3 1 3 3 3 3

[90]: dual_plot()
[91]: ybar
[91]:  1 
3
2
3

[92]: pure_gomory(1)

*** PROBABLY WANT TO APPLY pivot_algebra()! ***

[93]: pivot_algebra()

pivot_algebra() done

[94]: cbar_eta
[94]:  5 4 4 2 1 1 10

3 3 3 1 3 3 3 3 − 23

[95]: pivot_ratios(8)
 
1
1
x̄ + λz̄ :
 
0
 0 
 
 0 
 
 0 
 
 0 
 
 
 0 
 
 0 
1 λ
3−3
 
 0 
 2 2λ 
3 − 3 
λ

[96]: pivot_swap(8,1)

swap accepted - new partition:


eta: [2, 1, 4, 3, 5, 6, 8, 0, 7]
beta: [9, 10]
*** MUST APPLY pivot_algebra()! ***

[97]: pivot_algebra()

pivot_algebra() done

[98]: cbar_eta
[98]:  
1 6 2 5 4 3 1 0 2

[99]: xbar_beta
[99]: 0
1

[100]: ybar
[100]: 1
0

[101]: dual_plot()

End Notes

1
“The reader will find no figures in this work. The methods which I set forth do not require either
constructions or geometrical or mechanical reasonings: but only algebraic operations, subject to a regular
and uniform rule of procedure.” — Joseph-Louis Lagrange, Preface to “Mécanique Analytique,” 1815.

2
“The testing of this hypothesis, however, will be postponed until it is programmed for an electronic
computer.” — Ailsa H. Land and Alison G. Doig (inventors of branch-and-bound), last line of: An
Automatic Method of Solving Discrete Programming Problems, Econometrica, 1960, Vol. 28, No. 3, pp.
497–520.

3
“Il est facile de voir que...”, “il est facile de conclure que...”, etc. — Pierre-Simon Laplace, frequently
in “Traité de Mécanique Céleste.”

4
“One would be able to draw thence well some corollaries that I omit for fear of boring you.” — Gabriel
Cramer, Letter to Nicolas Bernoulli, 21 May 1728. Translated from “Die Werke von Jakob Bernoulli,” by
R.J. Pulskamp.

5
“Two months after I made up the example, I lost the mental picture which produced it. I really regret
this, because a lot of people have asked me your question, and I can’t answer.” — Alan J. Hoffman, private
communication with J. Lee, August, 1994.

6
“Fourier hat sich selbst vielfach um Ungleichungen bemüht, aber ohne erheblichen Erfolg.” — Gyula
Farkas, “Über die Theorie der Einfachen Ungleichungen,” Journal für die Reine und Angewandte Mathematik,
vol. 124:1–27.

7
“The particular geometry used in my thesis was in the dimension of the columns instead of the rows.
This column geometry gave me the insight that made me believe the Simplex Method would be a very
efficient solution technique for solving linear programs. This I proposed in the summer of 1947 and by
good luck it worked!” — George B. Dantzig, “Reminiscences about the origins of linear programming,”
Operations Research Letters vol. 1 (1981/82), no. 2, 43–48.

8
“George would often call me in and talk about something on his mind. One day in around 1959,
he told me about a couple of problem areas: something that Ray Fulkerson worked on, something else
whose details I forget. In both cases, he was using a linear programming model and the simplex method
on a problem that had a tremendous amount of data. Dantzig in one case, Fulkerson in another, had
devised an ad hoc method of creating the data at the moment it was needed to fit into the problem. I
reflected on this problem for quite awhile. And then it suddenly occurred to me that they were all doing
the same thing! They were essentially solving a linear programming problem whose data - whose columns
- being an important part of the data, were too many to write down. But you could devise a procedure
for creating one when you needed it, and creating one that the simplex method would choose to work
with at that moment. Call it the column-generation method. The immediate, lovely looking application
was to the linear programming problem, in which you have a number of linear programming problems
connected only by a small number of constraints. That fit in beautifully with the pattern. It was a way
of decomposing such a problem. So we referred to it as the decomposition algorithm. And that rapidly
became very famous.” — Philip Wolfe, interviewed by Irv Lustig ∼2003.

9
“So they have this assortment of widths and quantities, which they are somehow supposed to make
out of all these ten-foot rolls. So that was called the cutting stock problem in the case of paper. So Paul
[Gilmore] and I got interested in that. We struck out (failed) first on some sort of a steel cutting problem,
but we seemed to have some grip on the paper thing, and we used to visit the paper mills to see what
they actually did. And I can tell you, paper mills are so impressive. I mean they throw a lot of junk in at
one end, like tree trunks or something that’s wood, and out the other end comes – swissssssh – paper! It’s
one damn long machine, like a hundred yards long. They smell a lot, too. We were quite successful. They
didn’t have computers; believe me, no computer in the place. So we helped the salesman to sell them the
first computer.” — Ralph E. Gomory, interviewed by William Thomas, New York City, July 19, 2010.

10
“I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision
processes. An interesting question is, Where did the name, dynamic programming, come from? The
1950s were not good years for mathematical research. We had a very interesting gentleman in Washington
named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the
word research. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would
turn red, and he would get violent if people used the term research in his presence. You can imagine
how he felt, then, about the term mathematical. The RAND Corporation was employed by the Air Force,
and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson
and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What
title, what name, could I choose? In the first place I was interested in planning, in decision making, in
thinking. But planning, is not a good word for various reasons. I decided therefore to use the word
‘programming’. I wanted to get across the idea that this was dynamic, this was multistage, this was time-
varying. I thought, let's kill two birds with one stone. Let's take a word that has an absolutely precise
meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an
adjective, and that is it's impossible to use the word dynamic in a pejorative sense. Try thinking of some
combination that will possibly give it a pejorative meaning. It's impossible. Thus, I thought dynamic
programming was a good name. It was something not even a Congressman could object to. So I used it as
an umbrella for my activities.” — Richard E. Bellman, “Eye of the Hurricane: An Autobiography,” 1984.

11
“Vielleicht noch mehr als der Berührung der Menschheit mit der Natur verdankt die Graphentheorie
der Berührung der Menschen untereinander.” — Dénes König, “Theorie Der Endlichen Und Unendlichen
Graphen,” 1936.
The Afterward

Bibliography

[1] Stephen P. Bradley, Arnoldo C. Hax, and Thomas L. Magnanti. Applied Mathematical Pro-
gramming. Addison Wesley, 1977. Available at http://web.mit.edu/15.053/www/. [Cited
on page vii]
[2] Qi He and Jon Lee. Another pedagogy for pure-integer Gomory. RAIRO-Operations Research,
51(1):189–197, 2017. [Cited on pages 110 and 120]
[3] Jon Lee. A First Course In Combinatorial Optimization. Cambridge Texts in Applied Mathe-
matics. Cambridge University Press, Cambridge, 2004. [Cited on page 97]
[4] Jon Lee and Angelika Wiegele. Another pedagogy for mixed-integer Gomory. EURO Journal
on Computational Optimization, 5:455–466, 2017. [Cited on pages 110 and 120]
[5] Katta G. Murty. Linear And Combinatorial Programming. John Wiley & Sons Inc., New York,
1976. [Cited on page vii]

279
Index of definitions

1-norm, 11
∞-norm, 11
(matrix) product, 3
affine function, 59
algebraic perturbation, 34
arcs, 12, 89
artificial variable, 37
basic direction, 24
basic feasible direction relative to the basic feasible solution, 24
basic feasible ray, 25
basic feasible solution, 20
basic partition, 19
basic solution, 20
basis, 4, 19
basis matrix, 19
best bound, 109
big M, 101
bipartite graph, 91
branch-and-bound, 106
breakpoints, 103
Chvátal-Gomory cut, 110
column space, 5
complementary, 47, 49
concave function, 63
concave piecewise-linear function, 63
consecutive-ones matrix, 92
conservative, 12, 90
convex function, 59
convex piecewise-linear function, 59
convex set, 21
cost, 12, 90
Cramer's rule, 7
cutting pattern, 80
cutting-stock problem, 80
Dantzig-Wolfe Decomposition, 66
demand nodes, 17
determinant, 6
dimension, 4
diving, 108
dot product, 4
down branch, 107
dual solution, 27
edge weights, 91
edges, 91
elementary row operations, 5
extreme point, 21
extreme ray, 25
feasible, 2
feasible direction relative to the feasible solution, 24
feasible region, 2
flow, 12, 89
full column rank, 5
full row rank, 5
Gauss-Jordan elimination, 6
head, 12, 89
identity matrix, 6
inverse, 6
invertible, 6
key invariant for branch-and-bound, 106
knapsack problem, 81
Lagrangian Dual, 75
Laplace expansion, 6
linear combination, 4
linear constraints, 1
linear function, 59
Linear optimization, 1
linearly independent, 4
lower bound, 106
Master Problem, 67
matching, 96
max-norm, 11
most fractional, 109
multi-commodity min-cost network-flow problem, 12
network, 12, 89
network matrix, 90
nodes, 12, 89
non-basis, 19
non-degeneracy hypothesis, 31
null space, 5
objective function, 1
optimal, 2
overly complementary, 54
perfect matching, 91
phase-one problem, 37
phase-two problem, 37
pivot, 32, 121
polyhedron, 1
rank, 5
ratio test, 30
ray, 25
recursive optimization, 82
reduced costs, 27
reduced-cost fixing, 110
row space, 5
scalar product, 4
Sherman-Morrison formula, 6
single-commodity min-cost network-flow problem, 90
slack variable, 2
solution, 2
span, 4
standard form, 2
subgradient, 75
sufficient unboundedness criterion, 30
supply nodes, 17
surplus variable, 2
tail, 12, 89
the adjacency condition, 103
totally unimodular (TU), 93
transportation problem, 17
trivial, 4
uncapacitated facility-location problem, 101
unimodular, 92
up branch, 107
vertex cover, 97
vertex packing, 121
vertex-edge incidence matrix of the bipartite graph, 91
vertices, 91
Index of Jupyter notebooks

Circle.ipynb, 43
CSP.ipynb, 83, 87
Decomp.ipynb, 71
MatrixLP.ipynb, 8
Multi-commodityFlow.ipynb, 14, 17
pivot_example.ipynb, 26, 41, 43
pivot_tools.ipynb, 26, 41, 112, 122
Production.ipynb, 14, 16, 54, 63
pure_gomory_example_1.ipynb, 112
pure_gomory_example_2.ipynb, 122
SubgradProj.ipynb, 77, 86
UFL.ipynb, 103, 121
