Integer Programming
Lecture Notes
Marco Chiarandini
Contents

1 Introduction
  1.1 Operations Research
  1.2 Mathematical Modeling
  1.3 Resource Allocation
    1.3.1 Mathematical model
    1.3.2 General Model
    1.3.3 Duality
  1.4 Diet Problem
  1.5 The Mathematical Model
    1.5.1 Solving LP Models in Practice
  1.6 A Brief History of Linear Programming (LP)
  1.7 Fourier Motzkin elimination method

3 Duality
  3.1 Derivation and Motivation
    3.1.1 Bounding approach
    3.1.2 Geometric Interpretation of Duality
    3.1.3 Multipliers Approach
    3.1.4 Duality Recipe
  3.2 Duality Theory
  3.3 Lagrangian Duality
  3.4 Dual Simplex
  3.5 Sensitivity Analysis
Chapter 1

Introduction
• Production Planning and Inventory Control: planning the issue of orders for refilling warehouses, avoiding stock-outs and satisfying space capacity limits.
• Budget Investment: given a budget and a number of projects, each with its own foreseen return and resource cost, find the projects to fund that maximize the profit.
• Blending and Refining in the chemical industry.
• Energy Planning: for example, deciding when to activate a cogeneration plant in order to meet the forecast demand for heat and electricity in the next few hours.
• Manpower Planning: for example, scheduling the shifts of a group of nurses such that a department of a hospital is manned 24 hours a day with exactly 3 nurses and working agreements are respected; or, in the airline and railway industries, rostering crews such that geographical locations and working agreements are satisfied.
• Packing Problems: for example, filling containers with 3D packs without exceeding capacity while minimizing the free space.
• Cutting Problems: for example, in the textile or paper industry, cutting a paper roll into pieces to accommodate journals of different sizes while minimizing the waste.
• Routing: for example, in logistics, delivering products (oil, beer, food, etc.) to customers or retailers such that the total traveling distance is minimized and the capacity of the vehicles is respected.
• Location Decisions: for example, deciding where to open a set of warehouses so as to ensure a satisfactory coverage of a number of retailers.

In all these contexts, planning decisions must be made concerning quantitative issues: for example, the fewest number of people, or the shortest route. On the other hand, not all plans are feasible: there are constraining rules. Moreover, there is a limited amount of available resources. It can be extremely difficult to figure out what to do.
In Figure 1.1, we depict a common scheme of the solution process in applied optimization.
[Figure 1.1: a problem (e.g., crew scheduling) is modeled (variables, constraints, objective function); algorithms and solvers (simplex, b&b; Gurobi, CPLEX) produce a solution, which supports the decision.]
First we observe the real life system and interview the persons involved to understand the problem. We then write a problem description in clear, plain English. This is useful to get back to the commissioners and ensure that there are no misunderstandings. You should challenge your description by presenting cases that are not valid for the real life situation but that would be allowed by your description. This procedure helps make the description precise and less prone to misinterpretations. Then we are ready to introduce mathematical notation, which makes it almost impossible to misinterpret your model and removes all sources of ambiguity. The real life objects are abstracted to sets, graphs, networks or other mathematical concepts. Then the model, made of known parameters, unknowns, objectives and constraints, is formulated. Any word description is at this point removed. Finally, the model is solved on some test data and the solution is interpreted and cross-checked with respect to reality. The central idea in this process is to build a mathematical model describing exactly what one wants, and what the "rules of the game" are.
parameters. Finally, the constraints indicating the interplay between the different variables must be
expressed in mathematical terms.
[Plot of the constraint 5x1 + 10x2 ≤ 60 in the (x1, x2) plane.]
max c1 x1 + c2 x2 + c3 x3 + . . . + cn xn = z
subject to a11 x1 + a12 x2 + a13 x3 + . . . + a1n xn ≤ b1
a21 x1 + a22 x2 + a23 x3 + . . . + a2n xn ≤ b2
           ⋮
am1 x1 + am2 x2 + am3 x3 + . . . + amn xn ≤ bm
x1 , x2 , . . . , xn ≥ 0
The words “subject to” are often abbreviated to “s.t.”. More concisely the model can be written
in scalar form as:
max  ∑_{j=1}^{n} cj xj
s.t. ∑_{j=1}^{n} aij xj ≤ bi ,  i = 1, . . . , m
     xj ≥ 0 ,  j = 1, . . . , n
max z = cT x
Ax ≤ b
x ≥ 0
1.3.3 Duality
Above we saw the factory planning problem from the perspective of the company owning the raw materials. We assumed that it was convenient for the company to produce and sell products. However, a plausible alternative would be to close the factory and sell the raw materials to the market. What would be the price of the raw materials such that this becomes feasible and attractive? To answer this question we have to solve a resource valuation problem. Let's take the point of view of an outside company that has to make an offer for buying the raw materials.

From this standpoint the unknowns to be determined are the values of a unit of raw material i, which we indicate by zi, for i = 1, 2, . . . , m. These values are the variables of the problem. The total expense for buying the raw materials is given by ∑_{i=1}^{m} bi zi. The buying company is interested in minimizing precisely this value. From the perspective of the owning company this value is known as the opportunity cost of owning the raw materials. It is the value that could be obtained by the opportunity of selling all the raw materials and closing the factory. The owning company wants to minimize the lost opportunity cost with respect to producing and selling the products. The value zi has to be larger than the prevailing unit market value ρi of material i, otherwise the price would contradict the market and the owning company would prefer selling to someone else. Similarly, for each single product j ∈ J the opportunity cost derived from producing a unit of product has to be larger than the unit price σj of the product. If this were not true, then the owning company would not sell the raw materials but rather use them to produce the product and sell that one instead.
We can therefore write the model for the resource valuation problem as follows:

min ∑_{i=1}^{m} bi zi                            (1.1)
    ∑_{i=1}^{m} zi aij ≥ σj ,  j = 1 . . . n     (1.2)
    zi ≥ ρi ,  i = 1 . . . m                     (1.3)

Constraints (1.2) and (1.3) ensure that we are not contradicting the market, while the objective (1.1) aims at making the deal appealing for the buying company.

Let yi = zi − ρi be the markup that the owning company would make by reselling the raw material at the price zi with respect to the price ρi at which it bought it. We can then rewrite the model above as:

min ∑_{i=1}^{m} yi bi + ∑_i ρi bi                (1.4)
    ∑_{i=1}^{m} yi aij ≥ cj ,  j = 1 . . . n     (1.5)
    yi ≥ 0 ,  i = 1 . . . m                      (1.6)
where in the objective function the term ∑_i ρi bi is constant and does not impact the optimal solution. The problem we wrote is known as the dual of the previous resource allocation problem, which consequently gets the name of primal. The two models are each the dual of the other.
max u = ∑_{j=1}^{n} cj xj                        min w = ∑_{i=1}^{m} yi bi
∑_{j=1}^{n} aij xj ≤ bi , i = 1, . . . , m       ∑_{i=1}^{m} yi aij ≥ cj , j = 1 . . . n
xj ≥ 0 , j = 1, . . . , n                        yi ≥ 0 , i = 1 . . . m
As we will see, the optimal value u∗ of the primal problem is the same as the optimal value w∗ of the dual problem, ie, u∗ = w∗.
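This can be verified numerically. The following is a minimal sketch, assuming the gurobipy API (the same one used for the diet model later in this chapter): it solves the resource allocation primal and reads the optimal dual values yi from the constraint attribute Pi.

# Sketch: resource allocation primal; dual values read from the solver.
from gurobipy import Model, GRB

m = Model("factory")
x1 = m.addVar(name="x1")                  # addVar defaults to lower bound 0
x2 = m.addVar(name="x2")
m.setObjective(6*x1 + 8*x2, GRB.MAXIMIZE)
c1 = m.addConstr(5*x1 + 10*x2 <= 60)      # raw material 1
c2 = m.addConstr(4*x1 + 4*x2 <= 40)       # raw material 2
m.optimize()
print(m.ObjVal)                           # primal optimum u*
print(60*c1.Pi + 40*c2.Pi)                # dual objective w*: equals u*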
min cost/weight
subject to nutrition requirements:
eat enough but not too much of Vitamin A
eat enough but not too much of Sodium
eat enough but not too much of Calories
...
The problem was motivated in the 1930s and 1940s by the US army. It was first formulated as
a linear programming problem by George Stigler.
Suppose there are restrictions on the number of calories (between 2000 and 2250) and on the amount of Vitamin A (between 5,000 and 50,000 units).
F = set of foods
N = set of nutrients
Decision Variables: xi = number of servings of food i ∈ F to consume.

Constraint Set 1: For each nutrient j ∈ N, at least meet the minimum required level:

∑_{i∈F} aij xi ≥ Nmin_j ,  ∀j ∈ N
Constraint Set 2: For each nutrient j ∈ N, do not exceed the maximum allowable level:

∑_{i∈F} aij xi ≤ Nmax_j ,  ∀j ∈ N
Constraint Set 3: For each food i ∈ F, select at least the minimum required number of servings:

xi ≥ Fmin_i ,  ∀i ∈ F

Constraint Set 4: For each food i ∈ F, do not exceed the maximum allowable number of servings:

xi ≤ Fmax_i ,  ∀i ∈ F
All together we obtain the following system of equalities and inequalities that gives the linear programming problem:

min ∑_{i∈F} ci xi
    ∑_{i∈F} aij xi ≥ Nmin_j ,  ∀j ∈ N
    ∑_{i∈F} aij xi ≤ Nmax_j ,  ∀j ∈ N
    xi ≥ Fmin_i ,  ∀i ∈ F
    xi ≤ Fmax_i ,  ∀i ∈ F
# diet.dat
data;
set NUTR := A B1 B2 C ;
set FOOD := BEEF CHK FISH HAM MCH MTL SPG TUR;
# Model diet.py
from gurobipy import Model, GRB, quicksum

m = Model("diet")

# One non-negative variable per food: the number of servings to buy.
# The data (foods, categories, cost, nutritionValues, minNutrition,
# maxNutrition) is assumed to be read, e.g., from diet.dat above.
buy = m.addVars(foods, name="buy")

# Objective: minimize the total cost of the diet
m.setObjective(quicksum(cost[f] * buy[f] for f in foods), GRB.MINIMIZE)

# Nutrition constraints: stay within [min, max] for every category
for c in categories:
    m.addConstr(quicksum(nutritionValues[f, c] * buy[f] for f in foods)
                <= maxNutrition[c], name=c + 'max')
    m.addConstr(quicksum(nutritionValues[f, c] * buy[f] for f in foods)
                >= minNutrition[c], name=c + 'min')

# Solve
m.optimize()
• In 1947, Dantzig (1914-2005), working for the US Air Force at the Pentagon, invented the (primal) simplex algorithm. (program = plan)
• In 1954, Lemke described the dual simplex algorithm. In 1954, Dantzig and Orchard-Hays presented the revised simplex algorithm.
• In 1970, Victor Klee and George Minty created an example showing that the classical simplex algorithm has exponential worst-case behavior.
• In 1979, L. Khachiyan found a new polynomial-time algorithm for linear programming, the ellipsoid method. Although a theoretical breakthrough, in practice it was terribly slow.
• In 1984, Karmarkar discovered yet another polynomial-time algorithm for linear programming, the interior point method. It proved to be a strong competitor for the simplex method.
Some other important marks in the history of optimization are the following:
• In the 1950s, Network Flow Theory began with the work of Ford and Fulkerson.
1.7 Fourier-Motzkin elimination method

The idea of the method is to transform the system into another one by eliminating some variables, such that the two systems have the same solutions over the remaining variables.
Let M = {1 . . . m} be the set that indexes the constraints. For a variable xj, j = 1 . . . n, we partition the rows of the matrix A into those in which xj appears with a negative, null, and positive coefficient, respectively, that is:

N = {i ∈ M | aij < 0}
Z = {i ∈ M | aij = 0}
P = {i ∈ M | aij > 0}
Let xr be the variable to eliminate. Isolating xr in each constraint where it appears, the system can be rewritten as:

xr ≥ b'_i − ∑_{k=1}^{r−1} a'_{ik} xk ,  air < 0    i.e.  xr ≥ Ai(x1, . . . , xr−1), i ∈ N
xr ≤ b'_i − ∑_{k=1}^{r−1} a'_{ik} xk ,  air > 0    i.e.  xr ≤ Bi(x1, . . . , xr−1), i ∈ P
all other constraints (i ∈ Z)

which is equivalent to

Ai(x1, . . . , xr−1) ≤ Bj(x1, . . . , xr−1),  i ∈ N, j ∈ P
all other constraints (i ∈ Z)

We eliminated xr, but in place of the |N| + |P| original inequalities we now have:

|N| · |P| inequalities
|Z| inequalities

After d iterations, if |P| = |N| = n/2 at each step, the number of inequalities grows exponentially, in the order of (1/4)(n/2)^(2^d).
Example
−7x1 + 6x2 ≤ 25
x1 − 5x2 ≤ 1
x1 ≤ 7
−x1 + 2x2 ≤ 12
−x1 − 3x2 ≤ 1
2x1 − x2 ≤ 10
Let x2 be the variable we choose to eliminate. Here N = {2, 5, 6}, Z = {3} and P = {1, 4}, hence after the elimination we obtain |Z| + |N × P| = 1 + 3 · 2 = 7 constraints.
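The elimination step is mechanical enough to be scripted. A small sketch in Python (exact arithmetic via the fractions module) performing one Fourier-Motzkin step on the system above:

# One Fourier-Motzkin step: eliminate x2 from rows a1*x1 + a2*x2 <= b.
from fractions import Fraction as F

rows = [(-7, 6, 25), (1, -5, 1), (1, 0, 7),
        (-1, 2, 12), (-1, -3, 1), (2, -1, 10)]

N = [r for r in rows if r[1] < 0]   # x2 with negative coefficient
Z = [r for r in rows if r[1] == 0]  # x2 absent
P = [r for r in rows if r[1] > 0]   # x2 with positive coefficient

new = list(Z)
for rn in N:
    for rp in P:
        # scale by positive factors so the x2 coefficients become -1 and +1,
        # then add the two inequalities: x2 cancels, direction is preserved
        sn, sp = F(1, -rn[1]), F(1, rp[1])
        new.append(tuple(sn * a + sp * b for a, b in zip(rn, rp)))

for a1, a2, b in new:
    print(f"{a1}*x1 <= {b}")        # a2 is 0 in every resulting row
print(len(new))                     # 7 = |Z| + |N|*|P|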
By adding one variable and one inequality, Fourier-Motzkin elimination can be turned into an
LP solver. How?
Chapter 2

The Simplex Method

In this chapter we study the simplex method (or simplex algorithm). It was the first algorithm to solve linear programming problems, proposed in 1947 by George Dantzig in the technical report "Maximization of a Linear Function of Variables Subject to Linear Inequalities" Dantzig [1951].
2.1 Preliminaries
We lay down some definitions from linear algebra that will be useful to motivate and describe the
simplex algorithm.
• A combination of vectors v1, . . . , vk is a vector

x = λ1 v1 + · · · + λk vk = ∑_{i=1}^{k} λi vi

and it is called linear, affine, conic or convex depending on the restrictions imposed on the multipliers λi, as in the hull definitions below.
[Figure 2.1: geometrical interpretation of the linear, affine, conic and convex hulls of a set of points.]
[Figure 2.2: a convex function and its epigraph.]
• For a set of points S ⊆ Rn linear hull (aka linear span), affine hull, conic hull and convex hull
are respectively the sets:
lin(S) = {λ1 v1 + · · · + λk vk |k ≥ 0; v1 , · · · , vk ∈ S; λ1 , · · · , λk ∈ R}
aff(S) = {λ1 v1 + · · · + λk vk |k ≥ 1; v1 , · · · , vk ∈ S; λ1 , · · · , λk ∈ R; λ1 + · · · + λk = 1}
cone(S) = {λ1 v1 + · · · + λk vk |k ≥ 0; v1 , · · · , vk ∈ S; λ1 , · · · , λk ≥ 0}
conv(S) = {λ1 v1 + · · · + λk vk |k ≥ 0; v1 , · · · , vk ∈ S; λ1 , · · · , λk ≥ 0; λ1 + · · · + λk = 1}
See Figure 2.1 for a geometrical interpretation of these concepts. The set of points can be the
vectors made by the columns of an n × m matrix A, hence the previous definitions can refer
to a matrix as well.
• convex set: if x, y ∈ S and 0 ≤ λ ≤ 1 then λx + (1 − λ)y ∈ S
• convex function if its epigraph {(x, y) ∈ R2 : y ≥ f (x)} is a convex set or f : X → R, if
∀x, y ∈ X, λ ∈ [0, 1] it holds that f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y)
• See Fig 2.2.
• Given a set of points X ⊆ R^n the convex hull conv(X) is the set of all convex combinations of the points: conv(X) = {λ1 x1 + λ2 x2 + . . . + λn xn | xi ∈ X, λ1 , . . . , λn ≥ 0 and ∑_i λi = 1}
Depending on whether we study systems of linear equalities or inequalities and using integer or
continuous variables we may be in a different field of mathematics:
• Linear algebra studies linear equations
• Integer linear algebra studies linear diophantine equations
• Linear programming studies linear inequalities (simplex method)
• Integer linear programming studies linear diophantine inequalities
For the set F of feasible solutions of a problem there are three possible cases:

1. F = ∅ (the problem is infeasible)
2. F 6= ∅ and ∃ an optimal solution:
(a) one optimal solution
(b) infinitely many optimal solutions
3. F 6= ∅ and 6 ∃ an optimal solution (the problem is unbounded)
min{cT x | x ∈ P } where P = {x ∈ Rn | Ax ≤ b}
If P is a nonempty bounded polyhedron and x∗ is an optimal solution to the problem, then x∗ is either a vertex of P or a convex combination of optimal vertices of P; in particular, some vertex of P is optimal.
Proof. The first part of the proof shows by contradiction that x∗ must be on the boundary of P .
Then, if x∗ is not a vertex, it is a convex combination of vertices and it shows that all points of the
convex combination are also optimal.
For the first part, suppose, for the sake of contradiction, that x∗ ∈ int(P ), that is, x∗ is interior to P, not on the boundary. Then there exists some ε > 0 such that the ball of radius ε centered at x∗ is contained in P, that is, B_ε(x∗) ⊂ P. Therefore,

x∗ − (ε/2) · c/||c|| ∈ P

and this point has objective value cT x∗ − (ε/2)||c|| < cT x∗.
Hence x∗ is not an optimal solution, a contradiction. Therefore, x∗ must lie on the boundary of P. If x∗ is not a vertex itself, it must be a convex combination of vertices of P, say x1, . . . , xt. Then x∗ = ∑_{i=1}^{t} λi xi with λi ≥ 0 and ∑_{i=1}^{t} λi = 1. Observe that

0 = cT (∑_{i=1}^{t} λi xi − x∗) = cT ∑_{i=1}^{t} λi (xi − x∗) = ∑_{i=1}^{t} λi (cT xi − cT x∗).

Since x∗ is an optimal solution, all terms in the sum are nonnegative. Since the sum is equal to zero, each individual term must be equal to zero. Hence, cT x∗ = cT xi for each xi, so every xi is also optimal, and therefore all points of the face whose vertices are x1, . . . , xt are optimal solutions.
It follows from the theorem that an optimal solution lies at the intersection of the hyperplanes bounding the supporting half-spaces. Since there are finitely many half-spaces describing a polyhedron, there are finitely many candidate points to examine for optimality.
A solution method could proceed as follows: write all inequalities as equalities and solve all (n choose m) systems of linear equalities (n = # variables, m = # equality constraints). For each point we then need to check whether it is feasible and whether it is best in cost (optimality condition). We can solve each system of linear equations by Gaussian elimination.

Checking all (n choose m) systems may result in a lot of work. Recall that [Cormen et al., 2009, p. 1097]

(n choose m) = O((en/m)^m)

hence the asymptotic upper bound is an exponential function. The simplex method, as we will see, tries to be smarter and visits only some of the vertices. Shortly said, it finds a solution that is at the intersection of some m hyperplanes. Then it tries to systematically produce the other points by exchanging one hyperplane with another, and at each point it checks optimality.
Systems of linear equations can be solved in several ways, for example:

1. Gaussian elimination
2. By inverse matrix
3. Cramer's rule

Gaussian elimination proceeds in two phases:

1. Forward elimination: reduces the system to triangular (row echelon) form by elementary row operations
2. Back substitution: solves the triangular system starting from the last equation
Alternatively, one can compute the reduced row echelon form (RREF) of A and read immediately
the solution from there.
We illustrate this procedure on the following numerical example.
2x + y − z = 8 (R1)
−3x − y + 2z = −11 (R2)
−2x + y + 2z = −3 (R3)
On the right side we perform the computations. The style is taken from emacs org mode, that offers
a convenient environment for working with tables.
|----+----+----+----+-----|
| R1 |  2 |  1 | -1 |   8 |
| R2 | -3 | -1 |  2 | -11 |
| R3 | -2 |  1 |  2 |  -3 |
|----+----+----+----+-----|

|---------------+---+-----+------+---|
| R1'=1/2 R1    | 1 | 1/2 | -1/2 | 4 |
| R2'=R2+3/2 R1 | 0 | 1/2 |  1/2 | 1 |
| R3'=R3+R1     | 0 |   2 |    1 | 5 |
|---------------+---+-----+------+---|

|-------------+---+-----+------+---|
| R1'=R1      | 1 | 1/2 | -1/2 | 4 |
| R2'=2 R2    | 0 |   1 |    1 | 2 |
| R3'=R3-4 R2 | 0 |   0 |   -1 | 1 |
|-------------+---+-----+------+---|

|---------------+---+-----+---+-----|
| R1'=R1-1/2 R3 | 1 | 1/2 | 0 | 7/2 |
| R2'=R2+R3     | 0 |   1 | 0 |   3 |
| R3'=-R3       | 0 |   0 | 1 |  -1 |
|---------------+---+-----+---+-----|

|---------------+---+---+---+----|
| R1'=R1-1/2 R2 | 1 | 0 | 0 |  2 |   => x = 2
| R2'=R2        | 0 | 1 | 0 |  3 |   => y = 3
| R3'=R3        | 0 | 0 | 1 | -1 |   => z = -1
|---------------+---+---+---+----|
Gaussian elimination can be implemented in polynomial time, O(n^2 m), but some care must be taken to guarantee that all the numbers appearing during the run can be represented by polynomially many bits.
By inverse matrix
Ax = b
x = A−1 b
Calculating the inverse of a matrix is computationally very expensive. In practice (that is, in computer systems) the computation is rather performed via LU factorization: each matrix can be expressed as the product of a permutation matrix P, a lower triangular matrix L (with diagonal elements equal to 1) and an upper triangular matrix U:

A = P L U

x = A^{-1} b = U^{-1} L^{-1} P^T b
z1 = P^T b,  z2 = L^{-1} z1,  x = U^{-1} z2
The last two equations are solved by forward substitution Lz2 = z1 and by backward substitution
U x = z2 .
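In code, this is what dedicated routines do for us. A short sketch, assuming NumPy and SciPy are available, on the numerical example solved above by Gaussian elimination:

# Solving Ax = b via LU factorization: factor once, reuse for many b.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[ 2.,  1., -1.],
              [-3., -1.,  2.],
              [-2.,  1.,  2.]])
b = np.array([8., -11., -3.])

lu, piv = lu_factor(A)          # A = P L U, stored compactly
x = lu_solve((lu, piv), b)      # forward + backward substitution
print(x)                        # [ 2.  3. -1.], as in the example above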
Cramer's rule To Do. See Wikipedia for now. It is computationally expensive, but we will use it to derive a result later.
Solving Ax = b in practice and at the computer is done:
– via LU factorization (much quicker if one has to solve several systems with the same matrix
A but different vectors b)
x ∈ Rn , c ∈ Rn , A ∈ Rm×n , b ∈ Rm
The algorithm first ensures that the problem is in a standard form. Then it determines an easy-to-find initial solution. We will initially assume that this initial solution is feasible. Next, the algorithm proceeds by iterating through feasible solutions that are vertices of the polyhedron representing the feasibility region. Finally, it uses an optimality condition to terminate. A few exceptions may occur: initial infeasibility, unboundedness, more than one solution, and cycling in case of degeneracies. We will see how these situations are handled.
Standard Form The first step in the algorithm is to put the LP problem in a standard form:
Proposition 1 (Standard form). Each linear program can be converted in the form:
max cT x
Ax ≤ b
x ∈ Rn
c ∈ Rn , A ∈ Rm×n , b ∈ Rm
Proof. If the problem is not in the standard form already we can transform it:
• if there are inequalities of the type ax ≥ b, then we change them to the form −ax ≤ −b.
For now, we assume that in the standard form b ≥ 0. As we will see, if this is not true then finding an initial feasible solution is not trivial.
Proposition 2 (Equational standard form). Each LP problem can be converted to the form:
max cT x
Ax = b
x ≥ 0
x ∈ Rn , c ∈ Rn , A ∈ Rm×n , b ∈ Rm
that is, the objective is to maximize, the constraints are all equalities, the variables are all non-
negative.
1. we add one non-negative slack variable xn+i to the left hand side of each constraint i = 1, . . . , m
of the type ≤:
max ∑_{j=1}^{n} cj xj
    ∑_{j=1}^{n} aij xj + x_{n+i} = bi ,  i = 1, . . . , m
    xj ≥ 0 ,  j = 1, . . . , n
    x_{n+i} ≥ 0 ,  i = 1, . . . , m
We assume here that the problem is already in standard form. If it were not, there might be greater-or-equal constraints, in which case we would subtract so-called non-negative surplus variables to make them equality constraints.
4. Again we assume b ≥ 0.
Hence, every LP problem with n variables and m constraints is converted to an LP problem with at most (m + 2n) variables and m equations (n = # original variables, m = # constraints).
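As an illustration, a tiny sketch (assuming NumPy) of this conversion for a problem already in standard form: the slack variables simply append an identity block to A and zero costs to c.

# Converting max{c^T x : Ax <= b, x >= 0} to equational standard form.
import numpy as np

A = np.array([[5., 10.],
              [4.,  4.]])
b = np.array([60., 40.])
c = np.array([6., 8.])

A_eq = np.hstack([A, np.eye(len(b))])          # [A | I]: slacks x3, x4
c_eq = np.concatenate([c, np.zeros(len(b))])   # slacks cost nothing
print(A_eq)   # [[ 5. 10.  1.  0.]
              #  [ 4.  4.  0.  1.]]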
The relevant form for simplex algorithm is the equational standard form and it is this form most
text books refer to when referring to the standard form. We call the equational standard form
determined by the procedure above canonical if the b terms are all non-negative. It is not always
trivial to put the problem in canonical equational standard form and for infeasible problems it is
simply not possible, as we will see.
From the geometrical point of view, the feasibility region of the problem

max{cT x | Ax = b, x ≥ 0}

is the intersection of the set of solutions of Ax = b, which is an affine space (a plane not passing through the origin), and the non-negative orthant x ≥ 0. For a case in R^3 with Ax = b made by x1 + x2 + x3 = 0, the situation is shown in Figure 2.4 (in R^3 the orthant is called octant).
Note that Ax = b is a system of equations that we can solve by Gaussian elimination. Elementary row operations on [A | b], such as:

• multiplying all entries in some row of [A | b] by a nonzero real number λ
• replacing the ith row of [A | b] by the sum of the ith row and the jth row, for some i 6= j

do not affect the set of feasible solutions. We assume rank([A | b]) = rank(A) = m, ie, the rows of A are linearly independent. Otherwise, we remove the linearly dependent rows and change the value of m accordingly.
• the square matrix A_B is non-singular, ie, all columns indexed by B are linearly independent
• x_B = A_B^{-1} b is non-negative, ie, x_B ≥ 0
In the definition, the last condition ensures the feasibility of the solution.
We call xj, j ∈ B, basic variables and the remaining variables nonbasic variables. Nonbasic variables are set to zero in the basic feasible solution determined by B.
Theorem 2.2 (Uniqueness of a basic feasible solution). A basic feasible solution is uniquely deter-
mined by the set B.
Proof.

Ax = A_B x_B + A_N x_N = b
x_B + A_B^{-1} A_N x_N = A_B^{-1} b
x_B = A_B^{-1} b

since x_N = 0 and A_B is non-singular; hence the solution is unique.
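A basic solution for a given basis B is straightforward to compute. A small sketch with NumPy on the resource allocation example in equational standard form:

# x_B = A_B^{-1} b, x_N = 0, for a chosen index set B.
import numpy as np

A = np.array([[5., 10., 1., 0.],
              [4.,  4., 0., 1.]])
b = np.array([60., 40.])

def basic_solution(A, b, B):
    x = np.zeros(A.shape[1])
    x[B] = np.linalg.solve(A[:, B], b)   # raises if A_B is singular
    return x

print(basic_solution(A, b, [2, 3]))  # [ 0.  0. 60. 40.]  slack basis, feasible
print(basic_solution(A, b, [0, 1]))  # [ 8.  2.  0.  0.]  feasible (and optimal)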
Figure 2.5: The points p and q represent feasible solutions but they are not extreme points. The
point r is a feasible solution and an extreme point.
Theorem 2.3. Let P be a (convex) polyhedron from LP in std. form. For a point v ∈ P the following
are equivalent:
The proof, not shown here, goes through recognizing that vertices of P are linearly independent combinations of the columns of the matrix A, and that such are the columns in A_B, since A_B is non-singular by definition.
From the previous theorem and the fundamental theorem of linear programming, it follows that
Theorem 2.4. Let LP = max{cT x | Ax = b, x ≥ 0} be feasible and bounded, then the optimal
solution is a basic feasible solution.
We have thus learned how to find algebraically the vertices of the polyhedron. The idea for a solution algorithm is therefore to examine all basic solutions. From what we saw, this corresponds to generating different sets B of indices of the columns of A and checking whether the conditions for being a basic feasible solution hold. For a matrix A that after transformation to equational standard form has n + m columns, there are finitely many possible subsets B to examine, precisely (m+n choose m). If n = m, then (2m choose m) ≈ 4^m. Hence, even though at each iteration it might be easy to retrieve the value of the corresponding solution, we are still left with exponentially many iterations to perform in the worst case in which we have to see all vertices of the polyhedron.
We are now ready to start working on a numerical example. Let's consider our previous problem from resource allocation. In scalar form:

max 6x1 + 8x2
5x1 + 10x2 ≤ 60
4x1 + 4x2 ≤ 40
x1 , x2 ≥ 0

We put the problem in canonical equational standard form:
5x1 + 10x2 + x3 = 60
4x1 + 4x2 + x4 = 40
or, equivalently, in matrix form:

max z = [6 8] [x1 x2]^T

[5 10 1 0] [x1 x2 x3 x4]^T = [60 40]^T
[4  4 0 1]

x1 , x2 , x3 , x4 ≥ 0
If the equational standard form is canonical, one decision variable is isolated in each constraint, it does not appear in the other constraints nor in the objective function, and the b terms are non-negative.
The advantage of the canonical form is evident: it gives immediately a basic feasible solution:
x1 = 0, x2 = 0, x3 = 60, x4 = 40
The basis of this solution is B = {3, 4}. Consequently, N = {1, 2}. If this solution is also optimal
then the algorithm can terminate and return the solution. Is this solution optimal?
Looking at the signs of the coefficients in z it seems not: since they are positive, if we can increase the variables x1 and x2 to become larger than zero then the solution quality would improve. Let's then try to increase a promising variable, i.e., one with a positive coefficient in z. Let's take x1 and consider how much we can increase its value by looking at the first constraint. Since x2 stays equal to zero, this variable does not appear in the constraint:
5x1 + x3 = 60
Isolating first x1 and then x3 we can plot the line represented by this constraint:
x1 = 60/5 − x3/5
x3 = 60 − 5x1 ≥ 0

[Plot of the line 5x1 + x3 = 60 in the (x1, x3) plane.]
From the explicit form we see that if we increase x1 more than 12 then x3 becomes negative and
thus the whole solution infeasible. This constraint imposes therefore an upper bound of 12 to the
increase of x1 . Let’s analyze the second constraint now:
4x1 + x4 = 40

[Plot of the line 4x1 + x4 = 40 in the (x1, x4) plane.]
For a similar reasoning as above we observe that this constraint imposes an upper bound of 10 to
the increase of x1 .
It follows that the value of x1 can be increased at most up to 10. Increasing x1 to 10 makes x4 = 0 because of the second constraint. Hence, we want x4 to exit the basis while x1 enters it. In order to bring the problem back to canonical standard form after the increase of x1 we need to perform elementary row operations. To this end it is convenient to work with a particular organization of the data called simplex tableau (plural tableaux).
x1 x2 x3 x4 −z b
x3 5 10 1 0 0 60
x4 4 4 0 1 0 40
6 8 0 0 1 0
The variables that label the columns remain fixed throughout the iterations, while the labels of the rows change depending on which variables are in basis. The column −z will never change throughout the iterations of the algorithm. The last row is given by the objective function. Note that some textbooks put this row as the first row on top. With the new basis, the new tableau corresponding to a canonical standard form looks like this:
x1 x2 x3 x4 −z b
x3 0 ? 1 ? 0 ?
x1 1 ? 0 ? 0 ?
0 ? 0 ? 1 ?
that is, there is a permuted identity matrix whose last column, −z, remains fixed while the other
two columns indicate which variable is in basis.
The decisions that we have done so far: to select a variable to increase, the amount of the
increase, which variable has to decrease and putting the tableau the new form, can be written in
general terms as the following pivot operations.
Definition 2 (Pivot operations). The following operations are done at each iteration of the simplex:
1. Determine a pivot:
   column: choose a column s with strictly positive reduced cost (an improving variable)
   row: among the rows i with a_is > 0, choose one for which the ratio b_i/a_is is minimal

2. Update the tableau by elementary row operations so that the column of the entering variable becomes a unit column.

Note that the choice of the row of the pivot also gives us the increase value θ of the entering variable, that is,

θ = min_i { b_i / a_is : a_is > 0 }
Let’s get back to our numerical example and perform the simplex iterations:
• x1 enters the basis and x4 leaves the basis. The pivot is the element in the column of x1 and the row of x4, ie, 4. After the elementary row operations that put the tableau in the new form, from the last row we read: 2x2 − 3/2 x4 − z = −60, that is: z = 60 + 2x2 − 3/2 x4. Since x2 and x4 are nonbasic we have z = 60 and x1 = 10, x2 = 0, x3 = 10, x4 = 0.
• Are we done? No, there are still positive coefficients in the objective row! Let x2 enter the basis. We determine the pivot, which is 5, hence x3 is the variable that exits. After the row operations we obtain:
| | x1 | x2 | x3 | x4 | -z | b |
|--------------+----+----+------+------+----+-----|
| I’=I/5 | 0 | 1 | 1/5 | -1/4 | 0 | 2 |
| II’=II-I’ | 1 | 0 | -1/5 | 1/2 | 0 | 8 |
|--------------+----+----+------+------+----+-----|
| III’=III-2I’ | 0 | 0 | -2/5 | -1 | 1 | -64 |
• Are we done? Yes! The variables not in basis have negative coefficients in the objective function corresponding to the tableau we reached. Hence, if we increased them, we would worsen the objective function. The solution we have found is therefore the optimal one.
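The pivot operations above are easy to automate. A compact toy sketch (exact rational arithmetic via the fractions module; tableau layout as above, with the −z column left implicit) that reproduces the two iterations just performed:

# One simplex pivot on a tableau: last row = objective, last column = b.
from fractions import Fraction as F

def pivot(T, basis):
    """Perform one pivot in place; return False when optimal."""
    last = len(T) - 1
    # column: first variable with positive reduced cost
    s = next((j for j, cj in enumerate(T[last][:-1]) if cj > 0), None)
    if s is None:
        return False                 # optimality: all reduced costs <= 0
    # row: minimum ratio b_i / a_is over rows with a_is > 0
    # (raises ValueError if no such row exists: the problem is unbounded)
    r = min((i for i in range(last) if T[i][s] > 0),
            key=lambda i: T[i][-1] / T[i][s])
    T[r] = [v / T[r][s] for v in T[r]]           # scale pivot row to 1
    for i in range(len(T)):
        if i != r and T[i][s] != 0:              # clear column s elsewhere
            T[i] = [a - T[i][s] * p for a, p in zip(T[i], T[r])]
    basis[r] = s
    return True

T = [[F(5), F(10), F(1), F(0), F(60)],
     [F(4), F(4),  F(0), F(1), F(40)],
     [F(6), F(8),  F(0), F(0), F(0)]]
basis = [2, 3]                       # initial basis: x3, x4
while pivot(T, basis):
    pass
print(basis, -T[-1][-1])             # [1, 0] (x2, x1 in basis) and z = 64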
Definition 3 (Reduced costs). The coefficients in the objective function of the nonbasic variables,
c̄N , are called reduced costs.
Note that basic variables always have coefficients equal to zero in the last row.
[Figure 2.6: left, the path followed by the simplex on our example, from (0, 0) through (10, 0) to the optimum (8, 2); right, a case in which the choice of the direction of increase influences the number of iterations.]
Definition 4 (Improving variables). An improving variable is a non basic variable with positive
reduced cost.
Proposition 3 (Optimality condition). A basic feasible solution is optimal when the reduced costs in
the corresponding simplex tableau are nonpositive, ie:
c̄N ≤ 0
In Figure 2.6 left, we represent graphically the solution process executed by the simplex algo-
rithm. Starting from the vertex (0, 0), we moved to (10, 0) and finally reached the optimal solution
in (8, 2). For this problem the other path with x2 increased before x1 would have been of the same
length and hence lead to the same number of iterations of the simplex algorithm. However, the
situation is not always like this. In Figure 2.6 right, and in Figure 2.7, we see that choosing one
direction of increase rather than another may influence considerably the efficiency of the search.
We said earlier that trying all points implies approximately 4^m iterations. This is an asymptotic upper bound. On the other hand, to find an asymptotic lower bound we should apply the clairvoyant's rule, that is, use the shortest possible sequence of steps for any pair of vertices we may choose as starting and optimal solutions. However, the length of this path for a general polyhedron in R^n is not known. Hirsch conjectured O(n), but the best known result is n^{1+ln n}.
In practice, the simplex algorithm runs in between 2m and 3m iterations. (Hence, relevant to
note, the number of iterations depends on the number of constraints.)
Tableaux and Dictionaries We chose to use the tableau representation, which is the original Dantzig representation of the simplex algorithm. An alternative representation by means of dictionaries, due to Chvátal, is equally widespread in textbooks. The tableau representation is more amenable to computer implementation than the dictionary one. However, efficient codes use the revised simplex method and hence neither of these two representations.
Let’s consider the general LP problem:
max ∑_{j=1}^{n} cj xj
    ∑_{j=1}^{n} aij xj ≤ bi ,  i = 1, . . . , m
    xj ≥ 0 ,  j = 1, . . . , n
The equational standard form can be written, perhaps more intuitively, also by isolating the slack
variables:
max z = ∑_{j=1}^{n} cj xj
    x_{n+i} = bi − ∑_{j=1}^{n} aij xj ,  i = 1, . . . , m
    xj ≥ 0 ,  j = 1, . . . , n
    x_{n+i} ≥ 0 ,  i = 1, . . . , m
This form gives immediately the dictionary representation. We compare this representation side
by side with the tableau representation:
Tableau:

[ I  ĀN  0 | b̄  ]
[ 0  c̄N  1 | −d̄ ]

Dictionary:

xr = b̄r − ∑_{s6∈B} ārs xs ,  r ∈ B
z = d̄ + ∑_{s6∈B} c̄s xs

The last row of the dictionary gives the same objective function that can be derived from the last row of the tableau, namely:

∑_{r∈B} 0 xr + ∑_{s6∈B} c̄s xs − z = −d̄.
Decisions in the two cases must correspond. In the dictionary, the pivot operations are given by:

1. Determine a pivot:
   column: choose the column s with strictly positive reduced cost
   row: choose the row r in which xs has a negative coefficient and the ratio b̄r/ārs is minimal

2. Update: express the entering variable and substitute it in the other rows.
Example 2.1.
max 6x1 + 8x2
5x1 + 10x2 ≤ 60
4x1 + 4x2 ≤ 40
x1 , x2 ≥ 0
Tableau:

     x1  x2  x3  x4  −z |  b
x3    5  10   1   0   0 | 60
x4    4   4   0   1   0 | 40
      6   8   0   0   1 |  0

Dictionary:

x3 = 60 − 5x1 − 10x2
x4 = 40 − 4x1 − 4x2
z  =      6x1 + 8x2
2.4 Exception Handling

2.4.1 Unboundedness
We consider the following LP problem instance:
max 2x1 + x2
x2 ≤ 5
−x1 + x2 ≤ 1
x1 , x2 ≥ 0
• We write the initial tableau
| | x1 | x2 | x3 | x4 | -z | b |
|----+----+----+----+----+----+---|
| x3 | 0 | 1 | 1 | 0 | 0 | 5 |
| x4 | -1 | 1 | 0 | 1 | 0 | 1 |
|----+----+----+----+----+----+---|
| | 2 | 1 | 0 | 0 | 1 | 0 |
• x2 entering, x4 leaving
| | x1 | x2 | x3 | x4 | -z | b |
|-------------+----+----+----+----+----+----|
| II’=II-I’ | 1 | 0 | 1 | -1 | 0 | 4 |
| I’=I | -1 | 1 | 0 | 1 | 0 | 1 |
|-------------+----+----+----+----+----+----|
| III’=III-I’ | 3 | 0 | 0 | -1 | 1 | -1 |
−x1 + x2 + x4 = 1

where x4, being nonbasic, is set to zero. Hence, x1 can increase without restriction (x2, which is in basis, simply increases with it). This is why, when writing the maximum allowed increase, we enforced a_is > 0: θ = min{ b_i/a_is : a_is > 0, i = 1, . . . , m }. Rows with a_is ≤ 0 impose no restriction, and if no row has a_is > 0 the variable can increase arbitrarily.
• x1 entering, x3 leaving
| | x1 | x2 | x3 | x4 | -z | b |
|--------------+----+----+----+----+----+-----|
| I’=I | 1 | 0 | 1 | -1 | 0 | 4 |
| II’=II+I’ | 0 | 1 | 1 | 0 | 0 | 5 |
|--------------+----+----+----+----+----+-----|
| III’=III-3I’ | 0 | 0 | -3 | 2 | 1 | -13 |
x1 + x3 − x4 = 4
x2 + x3 + 0x4 = 5

x4 can now be increased arbitrarily: in the first constraint it will be compensated by x1 (x3 is nonbasic and hence 0), and in the second constraint it does not appear at all.
We are therefore in the condition of an unbounded problem. We recognise this when a variable
chosen to enter in the basis is not upper bounded in its increase. Figure 2.8 provides the geometrical
view of the solution process for this example.
2.4.2 Infinite Solutions
max x1 + x2
5x1 + 10x2 ≤ 60
4x1 + 4x2 ≤ 40
x1 , x2 ≥ 0
| | x1 | x2 | x3 | x4 | -z | b |
|----+----+----+----+----+----+----|
| x3 | 5 | 10 | 1 | 0 | 0 | 60 |
| x4 | 4 | 4 | 0 | 1 | 0 | 40 |
|----+----+----+----+----+----+----|
| | 1 | 1 | 0 | 0 | 1 | 0 |
• x2 enters, x3 leaves
| | x1 | x2 | x3 | x4 | -z | b |
|-------------+-----+----+-------+----+----+----|
| I'=I/10     | 1/2 | 1  |  1/10 | 0  | 0  | 6  |
| II'=II-4I'  | 2   | 0  |  -2/5 | 1  | 0  | 16 |
|-------------+-----+----+-------+----+----+----|
| III'=III-I' | 1/2 | 0  | -1/10 | 0  | 1  | -6 |
• x1 enters, x4 leaves
| | x1 | x2 | x3 | x4 | -z | b |
|----------------+----+----+------+------+----+-----|
| I’=I-II’/2 | 0 | 1 | 1/5 | -1/4 | 0 | 2 |
| II’=II/2 | 1 | 0 | -1/5 | 1/2 | 0 | 8 |
|----------------+----+----+------+------+----+-----|
| III’=III-II’/2 | 0 | 0 | 0 | -1/4 | 1 | -10 |
The corresponding solution is x1 = (8, 2, 0, 0), z = 10. Applying the optimality condition we see that the solution is optimal. However, we are used to seeing nonbasic variables with reduced costs different from 0. Here x3 has reduced cost equal to 0. Let's make it enter the basis.
• x3 enters, x2 leaves
| | x1 | x2 | x3 | x4 | -z | b |
|-------------+----+----+----+------+----+-----|
| I'=5I       | 0  | 5  | 1  | -5/4 | 0  | 10  |
| II'=II+I'/5 | 1  | 1  | 0  |  1/4 | 0  | 10  |
|-------------+----+----+----+------+----+-----|
| III'=III    | 0  | 0  | 0  | -1/4 | 1  | -10 |
We find a different solution that has the same value: x2 = (10, 0, 10, 0), z = 10. Note that we use a subscript to differentiate it from the first solution.

Hence we found two optimal solutions. If we continued from here we would again bring x2 into the basis and x3 out, thus cycling.
If more than one solution is optimal, then we saw that also all their convex combinations are
optimal solutions. Let’s then express all optimal solutions. The convex combination is:
x = ∑_{i=1}^{2} αi xi
αi ≥ 0  ∀i = 1, 2
∑_{i=1}^{2} αi = 1
Any vector x resulting from the convex combination with coefficients α1 = α and α2 = 1 − α is
given by:
[x1]       [ 8]           [10]
[x2] = α · [ 2] + (1−α) · [ 0]
[x3]       [ 0]           [10]
[x4]       [ 0]           [ 0]
or
x1 = 8α + 10(1 − α)
x2 = 2α
x3 = 10(1 − α)
x4 = 0.
A problem has infinite solutions when the objective hyperplane is parallel to one of the faces of the
feasibility region with dimension larger than 0. The example is depicted in Figure 2.9. A face could
have larger dimensions and the simplex would find all its extreme vertices before looping between
them.
2.4.3 Degeneracy
Let this time the LP problem instance be:
max x2
−x1 + x2 ≤ 0
x1 ≤ 2
x1 , x2 ≥ 0
• The initial tableau is
| | x1 | x2 | x3 | x4 | -z | b |
|----+----+----+----+----+----+---|
| x3 | -1 | 1 | 1 | 0 | 0 | 0 |
| x4 | 1 | 0 | 0 | 1 | 0 | 2 |
|----+----+----+----+----+----+---|
| | 0 | 1 | 0 | 0 | 1 | 0 |
Figure 2.9: The example with infinite solutions. The objective function is parallel with the edge of
the feasibility region. The solutions found by the simplex are the two extremes of the segment.
The novel element here is that a right-hand side coefficient is zero, ie, b1 = 0. In the pivot operations, a null b term makes the increase value θ null, hence the entering variable will not actually be increased.
Definition 5 (Degenerate pivot step). We call degenerate pivot step a pivot step in which the
entering variable stays at zero.
• Let’s proceed and make x2 enter the basis and x3 leave it.
| | x1 | x2 | x3 | x4 | -z | b |
|---+----+----+----+----+----+---|
| | -1 | 1 | 1 | 0 | 0 | 0 |
| | 1 | 0 | 0 | 1 | 0 | 2 |
|---+----+----+----+----+----+---|
| | 1 | 0 | -1 | 0 | 1 | 0 |
• In the next step we end up avoiding the constraint with the null b term and the step is not degenerate anymore. We exit from the degeneracy state and reach an optimal tableau:
| | x1 | x2 | x3 | x4 | -z | b |
|---+----+----+----+----+----+----|
| | 0 | 1 | 0 | 1 | 0 | 2 |
| | 1 | 0 | 0 | 1 | 0 | 2 |
|---+----+----+----+----+----+----|
| | 0 | 0 | -1 | -1 | 1 | -2 |
The situation is represented graphically in Figure 2.10. If n is the number of original variables, degeneracies arise when n + 1 or more constraints meet at a vertex. In other terms, there are polyhedra that have overdetermined vertices, that is, the number of facets that meet in those vertices is larger than dim(P). In this case, every choice of dim(P) inequalities defining these facets determines a basis that produces a basic solution. In linear algebra terms, for the n + m variables of an LP problem in equational standard form, a basic solution that belongs to a basis B has n variables set to zero and the remaining m variables set to A_B^{-1} b. In a degenerate basic solution more than n variables are set to zero. It follows that the same solution x is the solution of more than one regular n × n subsystem.

Degeneracy may lead to cycling in the simplex.
Figure 2.10: In the origin 3 constraints meet. In that vertex the simplex method encounters a
degeneracy, which in this case is resolved and another vertex reached.
Proof. A tableau is completely determined by specifying which variables are basic and which are nonbasic. There are only

(n+m choose m)

different possibilities. The simplex method always moves to non-worsening tableaux. If the simplex method fails to terminate, it must visit some of these tableaux more than once; hence, the algorithm cycles.
Degenerate conditions may appear often in practice but cycling is rare and some pivoting rules
prevent cycling. (Ex. 7 of Sheet 3 shows the smallest possible example.)
Under certain pivoting rules cycling may happen. So far we chose an arbitrary improving variable
to enter the basis.
• Largest Coefficient: select the improving variable with largest coefficient in last row of the
tableau, ie, reduced cost. This is the original Dantzig’s rule, and it was shown that it can lead
to cycling.
• Largest increase: select the improving variable that leads to the best absolute improvement,
ie, argmaxj {cj θj }. This rule is computationally more costly.
• Steepest edge: select the improving variable that, if brought into the basis, would move the current basic feasible solution in a direction closest to the direction of the vector c (ie, maximizes the cosine of the angle between the two vectors):

a · b = ||a|| ||b|| cos θ   =⇒   max  cT (x_new − x_old) / ||x_new − x_old||
• Bland’s rule chooses the improving variable with the lowest index and, if there are more than
one leaving variable, the one with the lowest index. This rule prevents cycling but it is slow.
• Perturbation method: perturb the values of the bi terms to avoid bi = 0, which must occur for cycling. To avoid cancellations: 0 < ε_m ≪ ε_{m−1} ≪ · · · ≪ ε_1 ≪ 1. It can be shown to be the same as the lexicographic method, which prevents cycling.
• It is unknown whether there exists a pivot rule that leads to polynomial time. The best would be the clairvoyant's rule: that is, choose the pivot rule that gives the shortest possible sequence of steps. This corresponds to determining the diameter of the m-dimensional polytope. The diameter of a polytope P is the maximum distance between any two vertices in the edge graph of P (Figure 2.12). The diameter gives a lower bound for any pivoting rule for the simplex algorithm. Hirsch conjectured (1957) that the diameter of any n-facet convex polytope in d-dimensional Euclidean space is at most n − d. Kalai and Kleitman (1992) gave an O(n^{log n}) upper bound on the diameter, namely n^{1+ln n}. The Hirsch conjecture was disproved in May 2010 by Francisco Santos from the University of Cantabria in Santander. He constructed a 43-dimensional polytope with 86 facets and diameter bigger than 43. [Documenta Math. 75, Who Solved the Hirsch Conjecture? Günter M. Ziegler]. In general terms he showed the existence of polytopes with diameter (1 + ε)(n − d). It remains open whether the diameter is polynomial, or even linear, in n and d.
• In practice the simplex runs in between 2m and 3m number of iterations, hence the running
time seems to be dependent on the number of constraints.
• Positive results are of smoothed complexity type: that is, average case on slight random perturbations of worst-case inputs. D. Spielman and S. Teng (2001), Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time:

O(max(n^5 log^2 m, n^9 log^4 n, n^3 σ^{−4}))
Figure 2.12: The shortest path between any two vertices of a polyhedron may contain an exponentially growing number of vertices as the dimension grows.
• One of the most prominent mysteries in Optimization remains the question of whether a lin-
ear program can be solved in strongly-polynomial time. A strongly polynomial-time method
would be polynomial in the dimension n and in the number m of inequalities only, whereas
the complexity of the known weakly-polynomial time algorithms for linear programming, like
the ellipsoid method or variants of the interior-point method, also depend on the binary en-
coding length of the input. The simplex method, though one of the oldest methods for linear
programming, still is a candidate for such a strongly polynomial time algorithm. This would
require the existence of a pivoting rule that results in a polynomial number of pivot steps.
Since the famous Klee-Minty example, many techniques for deriving exponential lower bounds
on the number of iterations for particular pivoting rules have been found.
Some very important pivoting rules, however, have resisted a super-polynomial lower-bound
proof for a very long time. Among them the pivoting rules Random Edge (uniformly random
improving pivots) Randomized Bland’s Rule (random shuffle the indexes + lowest index for
entering + lexicographic for leaving) Random-Facet and Zadeh’s pivoting rule (least-entered
rule: enter the improving variable that has been entered least often – it minimizes revisits).
Random-Facet has been shown to yield sub-exponential running time of the simplex method
independently by Kalai as well as by Matousek, Sharir and Welzl. For every linear program
with at most n√variables and at most m constraints, the expected number of pivot steps is
bounded by eC m ln n , where C is a (not too large) constant. (Here the expectation means
the arithmetic average over all possible orderings of the variables.) O. Friedmann together
with Hansen and Zwick have shown super-polynomial (but subexponential) lower bounds for
Random Edge, Randomized Bland’s rule and Zadeh’s pivoting rules in 2011. The same authors
in 2015 proposed an improved version of the Random-Facet rule that achieves the best known
sub-exponential running time. These results are unrelated to the diameter of the polytope.
(Sources: Mathematical Optimization Society, 2012 Tucker Prize Citation; Thomas Dueholm Hansen, Aarhus University, http://cs.au.dk/~tdh/.)
2.5 Infeasibility and Initialization

Consider the following example:
max x1 − x2
x1 + x2 ≤ 2
2x1 + 2x2 ≥ 5
x1 , x2 ≥ 0
The second constraint is of greater-or-equal type. To make it less-or-equal we multiply the left-hand side and the right-hand side by -1, yielding a negative right-hand side term. The equational standard form becomes:
max x1 − x2
x1 + x2 + x3 = 2
−2x1 − 2x2 + x4 = −5
x1 , x2 , x3 , x4 ≥ 0
Alternatively, keeping the right-hand side positive and subtracting a surplus variable:

max x1 − x2
x1 + x2 + x3 = 2
2x1 + 2x2 − x4 = 5
x1 , x2 , x3 , x4 ≥ 0
However, when we write the corresponding tableau we observe that it is not in canonical form,
that is, we cannot recognize an identity submatrix.
| | x1 | x2 | x3 | x4 | -z | b |
|----+----+----+----+----+----+---|
| x3 | 1 | 1 | 1 | 0 | 0 | 2 |
| x4 | 2 | 2 | 0 | -1 | 0 | 5 |
|----+----+----+----+----+----+---|
| | 1 | -1 | 0 | 0 | 1 | 0 |
We note that, similarly to the canonical form, one decision variable is isolated in each constraint and does not appear in the other constraints nor in the objective function, but for x4 the coefficient is −1. If we take x3 and x4 in basis then, reading from the tableau, their values are x3 = 2 and x4 = −5. This does not comply with the definition of basic feasible solution, which requires all variables to be non-negative. Hence, we do not have an initial basic feasible solution!
In general finding any feasible solution is as difficult as finding an optimal solution, otherwise we
could do binary search on the values of the objective function (that is, solving a sequence of systems
of linear inequalities, one of which being the constrained objective function).
To find an initial feasible solution we formulate an auxiliary problem and solve it.
In our example above we introduce an auxiliary non-negative variable x5 in the second constraint and minimize its value, ie, we maximize w = −x5:

max w = −x5
x1 + x2 + x3 = 2
2x1 + 2x2 − x4 + x5 = 5
x1 , x2 , x3 , x4 , x5 ≥ 0

If w∗ = 0 then x5 = 0 and the two problems are equivalent; if w∗ > 0 then it is not possible to set x5 to zero and the original problem does not have a feasible solution.
Let’s solve this auxiliary problem.
• In the initial tableau we introduce a new row at the bottom for the new objective function to maximize, and a new column denoted by −w. We keep the row for the original objective function and the column −z. In the pivot operations we will keep −z and −w always in basis.
| | x1 | x2 | x3 | x4 | x5 | -z | -w | b |
|----+----+----+----+----+----+----+----+---|
| | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 2 |
| | 2 | 2 | 0 | -1 | 1 | 0 | 0 | 5 |
| z | 1 | -1 | 0 | 0 | 0 | 1 | 0 | 0 |
|----+----+----+----+----+----+----+----+---|
| w | 0 | 0 | 0 | 0 | -1 | 0 | 1 | 0 |
• The initial tableau is not yet in canonical form but it can very easily be made such by letting
x5 enter the basis:
| | x1 | x2 | x3 | x4 | x5 | -z | -w | b |
|-------+----+----+----+----+----+----+----+---|
| | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 2 |
| | 2 | 2 | 0 | -1 | 1 | 0 | 0 | 5 |
| z | 1 | -1 | 0 | 0 | 0 | 1 | 0 | 0 |
|-------+----+----+----+----+----+----+----+---|
| IV+II | 2 | 2 | 0 | -1 | 0 | 0 | 1 | 5 |
Now we have a basic feasible solution. It is [0, 0, 2, 0, 5] and its objective value is w = −5. It
is not optimal and therefore we proceed to find the optimal solution.
| | x1 | x2 | x3 | x4 | x5 | -z | -w | b |
|--------+----+----+----+----+----+----+----+----|
| | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 2 |
| II-2I’ | 0 | 0 | -2 | -1 | 1 | 0 | 0 | 1 |
| III-I’ | 0 | -2 | -1 | 0 | 0 | 1 | 0 | -2 |
|--------+----+----+----+----+----+----+----+----|
| IV-2I’ | 0 | 0 | -2 | -1 | 0 | 0 | 1 | 1 |
The tableau is optimal. The optimal value can be read from the last row of the tableau: w∗ = −1. Hence, we see that x5 = 1 and it cannot be decreased further. Then no solution with x5 = 0 exists and there is no feasible solution for our initial problem.
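A solver can confirm this diagnosis. A quick check, assuming SciPy is available (linprog minimizes, so the objective is negated, and the ≥ constraint is flipped):

# Infeasibility check with scipy: status 2 means "infeasible".
from scipy.optimize import linprog

res = linprog(c=[-1, 1],                 # max x1 - x2  ->  min -x1 + x2
              A_ub=[[1, 1], [-2, -2]],   # x1 + x2 <= 2; 2x1 + 2x2 >= 5
              b_ub=[2, -5])
print(res.status)                        # 2: no feasible point exists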
Figure 2.13: The feasibility region is the intersection of the half-spaces described by the constraints.
In this case it is empty.
The original problem is infeasible. We can appreciate this also graphically from Figure 2.13
where we see that the intersection between the half-spaces that define the problem is empty.
Let's change the right-hand side of the second constraint in the previous example to be 2 instead of 5:
max x1 − x2
x1 + x2 ≤ 2
2x1 + 2x2 ≥ 2
x1 , x2 ≥ 0
The equational standard form becomes:
max x1 − x2
x1 + x2 + x3 = 2
2x1 + 2x2 − x4 = 2
x1 , x2 , x3 , x4 ≥ 0
Since it is not canonical we resort to the Phase I of the simplex by formulating the auxiliary problem:
| | x1 | x2 | x3 | x4 | x5 | -z | -w | b |
|----+----+----+----+----+----+----+----+---|
| | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 2 |
| | 2 | 2 | 0 | -1 | 1 | 0 | 0 | 2 |
| z | 1 | -1 | 0 | 0 | 0 | 1 | 0 | 0 |
|----+----+----+----+----+----+----+----+---|
| w | 0 | 0 | 0 | 0 | -1 | 0 | 1 | 0 |
• we can set the problem in canonical form by making x5 entering the basis:
| | x1 | x2 | x3 | x4 | x5 | -z | -w | b |
|-------+----+----+----+----+----+----+----+---|
| | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 2 |
| | 2 | 2 | 0 | -1 | 1 | 0 | 0 | 2 |
| z | 1 | -1 | 0 | 0 | 0 | 1 | 0 | 0 |
|-------+----+----+----+----+----+----+----+---|
| IV+II | 2 | 2 | 0 | -1 | 0 | 0 | 1 | 2 |
The basic feasible solution is [0, 0, 2, 0, 2] and yields an objective value w = −2. This solution
is not optimal.
| | x1 | x2 | x3 | x4 | x5 | -z | -w | b |
|----+----+----+----+------+------+----+----+----|
| | 0 | 0 | 1 | 1/2 | -1/2 | 0 | 0 | 1 |
| | 1 | 1 | 0 | -1/2 | 1/2 | 0 | 0 | 1 |
| z | 0 | -2 | 0 | 1/2 | -1/2 | 1 | 0 | -1 |
|----+----+----+----+------+------+----+----+----|
| w | 0 | 0 | 0 | 0 | -1 | 0 | 1 | 0 |
The solution is optimal and w∗ = 0 hence x5 = 0 and we have a starting feasible solution for
the original problem.
From now we can proceed with the Phase II of the simplex method, which works on the original
problem as we have learned.
• First we rewrite the tableau that we reached by keeping only what we need:
| | x1 | x2 | x3 | x4 | -z | b |
|----+----+----+----+------+----+----|
| | 0 | 0 | 1 | 1/2 | 0 | 1 |
| | 1 | 1 | 0 | -1/2 | 0 | 1 |
|----+----+----+----+------+----+----|
| z | 0 | -2 | 0 | 1/2 | 1 | -1 |
• x4 enters the basis, x3 leaves. After the pivot we reach an optimal tableau:

| | x1 | x2 | x3 | x4 | -z | b |
|----+----+----+----+----+----+----|
| | 0 | 0 | 2 | 1 | 0 | 2 |
| | 1 | 1 | 1 | 0 | 0 | 2 |
|----+----+----+----+----+----+----|
| z | 0 | -2 | -1 | 0 | 1 | -2 |
The solution process is geometrically depicted in Figure 2.14. Phase I starts from the origin,
which for this problem is not a feasible solution, it then jumps to the first feasible solution and from
there with the Phase II to the optimal solution.
Figure 2.14: The geometric representation of the feasible example. The blue line represents the
objective function.
Dictionary form In dictionary form, the problem

max x1 − x2
x1 + x2 ≤ 2
2x1 + 2x2 ≥ 5
x1 , x2 ≥ 0

reads:

x3 = 2 − x1 − x2
x4 = −5 + 2x1 + 2x2
z = x1 − x2
We introduce a correction of infeasibility: an auxiliary non-negative variable x0 added to the dictionary rows, with the auxiliary objective of minimizing x0. This new problem is still infeasible at x1 = x2 = 0, but it can be made feasible by letting x0 enter the basis. Which variable should leave? The most infeasible one: the variable that has the negative b term with the largest absolute value.
Chapter 3
Duality
In the previous chapter we saw that the economic interpretation of a given LP problem leads us to define a dual problem. In this chapter we look at the theory of duality from a more general and systematic perspective. We present four analytic ways to derive a dual problem. Then we look at four important theorems that are at the foundation of linear programming. Finally, we present important practical uses of duality, such as the dual simplex method, sensitivity analysis and infeasibility proofs.
3.1 Derivation and Motivation

3.1.1 Bounding approach

Consider the following LP problem:

max 4x1 + x2 + 3x3 = z
x1 + 4x2 ≤ 1
3x1 + x2 + x3 ≤ 3
x1 , x2 , x3 ≥ 0

Any feasible solution to this problem provides a lower bound to the objective value. By attempts, we can get
(x1 , x2 , x3 ) = (1, 0, 0) z∗ ≥ 4
(x1 , x2 , x3 ) = (0, 0, 3) z∗ ≥ 9
Which is the best one? Clearly the largest lower bound 9 is the best lower bound. If we knew that
we cannot do better then we could claim the solution (0, 0, 3) optimal. How can we know what is the
best we can do? We could look at upper bounds. Let’s try to derive one upper bound. We multiply
left and right hand sides of the constraint inequalities by a positive factor and sum the inequalities.
This will not change the sense of the inequality.
2 · ( x1 + 4x2 ) ≤ 2·1
+3 · ( 3x1 + x2 + x3 ) ≤ 3·3
4x1 + x2 + 3x3 ≤ 11x1 + 11x2 + 3x3 ≤ 11
In the left-most side of the last row we rewrote the objective function of our problem. The left-hand
side of the inequality obtained by summing left and right hand sides is certainly larger than this
function, indeed, the three variables must all be non-negative and their coefficients in the objective
function are one by one smaller than their corresponding coefficients in the left-hand side of the
obtained inequality. Hence z ∗ ≤ 11. Is this the best upper bound we can find?
To obtain this upper bound we chose two arbitrary multipliers y1, y2 ≥ 0 that preserve the sense of the inequalities and made a linear combination of the inequalities:

y1 · ( x1 + 4x2 ) ≤ y1 (1)
+ y2 · ( 3x1 + x2 + x3 ) ≤ y2 (3)
(y1 + 3y2) x1 + (4y1 + y2) x2 + y2 x3 ≤ y1 + 3y2
We aim at

cT x ≤ yT Ax ≤ yT b,

hence we have to impose some restrictions on the multipliers, namely that the coefficient of each variable xi, i = 1, 2, 3, in the objective function does not exceed the coefficient of the same variable in the left-hand side of the linear combination of the constraints:

y1 + 3y2 ≥ 4
4y1 + y2 ≥ 1
y2 ≥ 3
Thus
z = 4x1 + x2 + 3x3 ≤ (y1 + 3y2 )x1 + (4y1 + y2 )x2 + y2 x3 ≤ y1 + 3y2 .
Then to attain the best upper bound we need to solve the following problem:
min y1 + 3y2 = w
y1 + 3y2 ≥ 4
4y1 + y2 ≥ 1
y2 ≥ 3
y1 , y2 ≥ 0
This is the dual problem of our original LP instance. We will soon prove that z∗ = w∗.
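Before the proof, a quick numerical check, assuming SciPy is available (linprog minimizes, so the primal objective is negated and the ≥ constraints of the dual are flipped):

# Primal and dual of the bounding example: the optima coincide.
from scipy.optimize import linprog

primal = linprog(c=[-4, -1, -3],                    # max 4x1 + x2 + 3x3
                 A_ub=[[1, 4, 0], [3, 1, 1]],
                 b_ub=[1, 3])
dual = linprog(c=[1, 3],                            # min y1 + 3y2
               A_ub=[[-1, -3], [-4, -1], [0, -1]],  # >= flipped to <=
               b_ub=[-4, -1, -3])
print(-primal.fun, dual.fun)                        # both 9.0: z* = w*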
3.1.2 Geometric Interpretation of Duality

Consider the problem:

max x1 + x2 = z
2x1 + x2 ≤ 14
−x1 + 2x2 ≤ 8
2x1 − x2 ≤ 10
x1 , x2 ≥ 0
The feasibility region and the objective function are plotted in Figure 3.1, left. The feasible solution
x∗ = (4, 6) yields z ∗ = 10. To prove that this solution is optimal we need to show that no other
feasible solution can do better. To do this we need to verify that y ∗ = (3/5, 1/5, 0) is a feasible
solution of the dual problem:
min 14y1 + 8y2 + 10y3 = w
2y1 − y2 + 2y3 ≥ 1
y1 + 2y2 − y3 ≥ 1
y1 , y2 , y3 ≥ 0
and that w∗ = 10. To put it differently, we multiply the constraints 2x1 + x2 ≤ 14 by 3/5 and
multiply −x1 + 2x2 ≤ 8 by 1/5. Since the sum of the resulting two inequalities reads x1 + x2 ≤ 10,
we conclude that every feasible solution x1 , x2 of the original problem satisfies x1 + x2 ≤ 10,
3/5 · ( 2x1 + x2 ) ≤ 3/5 · 14
1/5 · ( −x1 + 2x2 ) ≤ 1/5 · 8
x1 + x2 ≤ 10
Interpreted geometrically, this conclusion means that the entire region of feasibility is contained in
the half plane x1 + x2 ≤ 10; this fact is evident from Figure 3.1, center. Actually, we have support
for a stronger conclusion: the set of all points (x1 , x2 ) satisfying
2x1 + x2 ≤ 14
(3.1)
−x1 + 2x2 ≤ 8
is a subset of the half plane x1 + x2 ≤ 10. Now let us consider linear combinations of these two
inequalities. Each of these combinations has the form:
v · ( 2x1 + x2 ) + w · ( −x1 + 2x2 ) ≤ 14v + 8w
for some non-negative v and w. Geometrically, it represents a half-plane that contains the set
represented by (3.1) and whose boundary line passes through the point (4, 6). Examples of these
boundary lines are
v = 1, w = 0 =⇒ 2x1 + x2 = 14
v = 1, w = 1 =⇒ x1 + 3x2 = 22
v = 2, w = 1 =⇒ 3x1 + 4x2 = 36
Figure 3.1: Left: the feasibility region and the objective function x1 + x2 . Center: the half plane
x1 + x2 ≤ 10 containing the feasibility region. Right: the boundary lines 2x1 + x2 = 14,
x1 + 3x2 = 22 and 3x1 + 4x2 = 36, all passing through the point (4, 6).
Suppose the maximum is finite, say its value is z∗ and attained by the element x∗ of P . Let
a1 x ≤ b1 , . . . , ak x ≤ bk
be the constraints of the system. Then cT x∗ ≤ bT y∗ (the inequality follows since the yi∗ give a
feasible solution for the minimum). The inequality holds also in the opposite direction:
cT x = yT Ax ≤ bT y
This yields the LP-duality equation cT x∗ = bT y∗ .
Working columnwise, the optimality conditions c̄k ≤ 0 for all k = 1, . . . , n + m can be written as
π1 a11 + π2 a21 + . . . + πm am1 + πm+1 c1 ≤ 0
    ...
π1 a1n + π2 a2n + . . . + πm amn + πm+1 cn ≤ 0
π1 a1,n+1 + π2 a2,n+1 + . . . + πm am,n+1 ≤ 0     (3.3)
    ...
π1 a1,n+m + π2 a2,n+m + . . . + πm am,n+m ≤ 0
πm+1 = 1
π1 b1 + π2 b2 + . . . + πm bm (≤ 0)
For the columns that correspond to the variables in basis (eg, from n + 1 to n + m, ie, the second
block of constraints in the system (3.3)) we will have an identity matrix and hence πi ≤ 0, i = 1..m.
From the last row of the final tableau we read z = −πb. Since we want to maximize z then we would
try to solve for min(−πb) or equivalently for max πb. We can then rewrite:
max π1 b1 + π2 b2 . . . + πm bm
π1 a11 + π2 a21 . . . + πm am1 ≤ −c1
.. .. ..
. . .
π1 a1n + π2 a2n . . . + πm amn ≤ −cn
π1 , π2 , . . . πm ≤ 0
and setting y = −π and substituting we obtain:
max −y1 b1 + −y2 b2 . . . + −ym bm
−y1 a11 + −y2 a21 . . . + −ym am1 ≤ −c1
.. .. ..
. . .
−y1 a1n + −y2 a2n . . . + −ym amn ≤ −cn
−y1 , −y2 , . . . − ym ≤ 0
which leads to the same dual as we saw in the previous section:
min bT y = w
AT y ≥ c
y ≥ 0
Example 3.1. We sketch a numerical example:
max 6x1 + 8x2
5x1 + 10x2 ≤ 60
4x1 + 4x2 ≤ 40
x1 , x2 ≥ 0
The conditions on the multipliers π are:
5π1 + 4π2 + 6π3 ≤ 0
10π1 + 4π2 + 8π3 ≤ 0
1π1 + 0π2 + 0π3 ≤ 0
0π1 + 1π2 + 0π3 ≤ 0
0π1 + 0π2 + 1π3 = 1
and the objective is max 60π1 + 40π2 ,
which we transform by setting:
y1 = −π1 ≥ 0
y2 = −π2 ≥ 0
(P)  max z = cT x          (D)  min w = bT y
     Ax ≤ b                     AT y ≥ c
     x ≥ 0                      y ≥ 0.
We proceed to derive the dual of (D). We first put the dual in the standard form:
− max −bT y
  −AT y ≤ −c
     y ≥ 0
then we use the same rules used to derive (D) from (P) to derive the dual of (D):
− min −cT x
  −Ax ≥ −b
     x ≥ 0
which is again (P): the dual of the dual is the primal.
From the derivations we saw already that the dual problem yields upper bounds (to maximization
primal problems). This is true in general:
(P) max{cT x | Ax ≤ b, x ≥ 0}
(D) min{bT y | AT y ≥ c, y ≥ 0}
for any feasible solution x of (P) and any feasible solution y of (D):
cT x ≤ bT y
Proof. In scalar form, from the feasibility of x in (P) we have that Σ_{j=1}^{n} aij xj ≤ bi for
i = 1..m and xj ≥ 0 for j = 1..n. From the feasibility of y in (D) we have that cj ≤ Σ_{i=1}^{m} yi aij
for j = 1..n and yi ≥ 0 for i = 1..m. It follows that:
Σ_{j=1}^{n} cj xj ≤ Σ_{j=1}^{n} ( Σ_{i=1}^{m} yi aij ) xj = Σ_{i=1}^{m} ( Σ_{j=1}^{n} aij xj ) yi ≤ Σ_{i=1}^{m} bi yi .
The following theorem is due to Von Neumann and Dantzig, 1947, and Gale, Kuhn and Tucker,
1951.
(P) max{cT x | Ax ≤ b, x ≥ 0}
(D) min{bT y | AT y ≥ c, y ≥ 0}
4. (P) has feasible solution x∗ = [x∗1 , . . . , x∗n ]T , (D) has feasible solution y∗ = [y∗1 , . . . , y∗m ]T and
cT x∗ = bT y∗
Proof. All other combinations of the 3 possible outcomes (Optimal, Infeasible, Unbounded) for (P)
and (D) are ruled out by the weak duality theorem. For example, the combination (P) unbounded
(+∞) and (D) unbounded (−∞) would clearly violate the weak duality theorem. To show 4, we use
the simplex method (other proofs independent of the simplex method exist, eg, via Farkas Lemma
and convex polyhedral analysis.)
To prove the statement 4. we need to exhibit for an optimal solution x∗ a dual feasible solution y∗
satisfying cT x∗ = bT y∗ . Suppose we apply the simplex method. We know that the simplex method
produces an optimal solution whenever one exists, and in statement 4. we are assuming that one
does indeed exist. Let x∗ be this optimal solution. The final dictionary will be an optimal dictionary
for the primal problem. The objective function in this final dictionary is ordinarily written (see page
34) as
z = d̄ + Σ_{k=1}^{n+m} c̄k xk = d̄ + Σ_{r∈B} c̄r xr + Σ_{s∉B} c̄s xs
But, since this is the optimal dictionary and we prefer stars to bars for denoting optimal “stuff”, let us
write z∗ instead of d̄. Also, the reduced costs of the basic variables will be zero and nonbasic variables
will generally consist of a combination of original variables as well as slack variables. Instead of using
c̄k for the coefficients of all these variables, let us differentiate and use c̄j for the objective coefficients
corresponding to original variables, and let us use c̄n+i for the objective coefficients corresponding to
slack variables. Also, for those original variables that are basic we put c̄j = 0, and for those slack
variables that are basic we put c̄n+i = 0. With these new notations, we can rewrite the objective
function as
z = z∗ + Σ_{j=1}^{n} c̄j xj + Σ_{i=1}^{m} c̄n+i xn+i     (3.4)

z∗ = Σ_{j=1}^{n} cj x∗j
(This is just the original objective function with the substitution of the optimal value for the original
variables.)
We now define
yi∗ = −c̄n+i , i = 1, 2, . . . , m.
and obtain
Σ_{j=1}^{n} cj xj = z∗ + Σ_{j=1}^{n} c̄j xj − Σ_{i=1}^{m} y∗i ( bi − Σ_{j=1}^{n} aij xj )
                  = ( z∗ − Σ_{i=1}^{m} y∗i bi ) + Σ_{j=1}^{n} ( c̄j + Σ_{i=1}^{m} aij y∗i ) xj
Theorem 3.4 (Complementary Slackness). Given a feasible solution x∗ for (P) and a feasible
solution y∗ for (D), necessary and sufficient conditions for the optimality of both are:
( cj − Σ_{i=1}^{m} y∗i aij ) x∗j = 0,     j = 1, . . . , n
This implies that, for any j = 1..n, if x∗j ≠ 0 then Σ_{i=1}^{m} y∗i aij = cj (no slack nor surplus),
and if Σ_{i=1}^{m} y∗i aij > cj then x∗j = 0.
Proof. From the weak duality theorem:
z∗ = cT x∗ ≤ y∗T Ax∗ ≤ bT y∗ = w∗ .
At optimality, by the strong duality theorem, the chain holds with equality, hence
cT x∗ − y∗T Ax∗ = 0
In scalars:
Σ_{j=1}^{n} ( cj − Σ_{i=1}^{m} y∗i aij ) x∗j = 0
where each factor ( cj − Σ_{i=1}^{m} y∗i aij ) is ≤ 0 by dual feasibility and each x∗j is ≥ 0 by primal
feasibility. A sum of nonpositive terms can be zero only if every term is zero.
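The theorem can be checked numerically on Example 3.1; the following sketch (ours, assuming numpy is available) verifies both the equality of the objective values and the complementary slackness products for the optimal pair x∗ = (8, 2), y∗ = (2/5, 1):

# Hedged sketch: check complementary slackness for Example 3.1,
# max 6x1 + 8x2 s.t. 5x1 + 10x2 <= 60, 4x1 + 4x2 <= 40.
import numpy as np

A = np.array([[5.0, 10.0], [4.0, 4.0]])
b = np.array([60.0, 40.0])
c = np.array([6.0, 8.0])
x = np.array([8.0, 2.0])    # primal optimum
y = np.array([0.4, 1.0])    # dual optimum (2/5, 1)

print(np.isclose(c @ x, b @ y))            # True: equal objective values (64)
print(np.isclose((c - A.T @ y) * x, 0))    # True True: (cj - sum_i yi aij) xj = 0
print(np.isclose((b - A @ x) * y, 0))      # True True: (bi - sum_j aij xj) yi = 0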
final tableau (only the entries relevant for the analysis are shown):
        x0     x1   x2   s1     s2   s3   −z   b
               0    1    0                     5/2
               1    0    0                     7
               0    0    1                     2
       −1/5    0    0   −1/5    0   −1         −62
• Which are the values of the reduced costs? (−1/5, 0, 0, −1/5, 0, −1)
• Which are the values of the dual variables? (1/5, 0, 1) - see the proof of the strong duality
theorem.
• Which are the values of the shadow prices (or marginal values of the resources)? (1/5, 0, 1)
• If one slack variable > 0 then there is overcapacity, that is, the constraint to which the slack
variable belongs is not tight. This can be assessed also via complementary slackness theorem:
y2 is dual variable associated with the second constraint, y2 = 0 from the tableau, hence the
second constraint is not binding.
• without the constraints Σ_{i} yi aij ≥ cj there would be no negotiation, because P would be
better off producing and selling
• at optimality the situation is indifferent (strong duality theorem)
• resource 2, which was not totally utilized in the primal, has been given value 0 in the dual
(complementary slackness theorem). Plausible: since we do not use all the resource, we are
likely to place not so much value on it
• for product 0, Σ_{i} yi aij > cj , hence it is not profitable to produce it (complementary
slackness theorem)
(The second statement can be also seen as a proof of the weak duality theorem.) The problem
PR is easy to solve:
PR(y1 , y2 ) = min_{x1 ,x2 ,x3 ,x4 ≥0}  (13 − 2y1 − 3y2 ) x1 + (6 − 3y1 ) x2
                                     + (4 − 4y1 − 2y2 ) x3 + (12 − 5y1 − 4y2 ) x4 + 7y1 + 2y2
if the coefficient of any xi is < 0 then the bound is −∞ and hence useless to our purposes of finding
the strongest bound. Hence,
(13 − 2y1 − 3y2 ) ≥ 0
(6 − 3y1 ) ≥ 0
(4 − 4y1 − 2y2 ) ≥ 0
(12 − 5y1 − 4y2 ) ≥ 0
If they all hold then we are left with 7y1 + 2y2 , because all other terms go to 0. Thus, the strongest
bound is obtained by maximizing 7y1 + 2y2 subject to the four conditions above and y1 , y2 ≥ 0,
which is precisely the dual problem.
For a problem in equational standard form, the primal-dual pair is:
min z = cT x          max bT y
Ax = b                AT y ≤ c
x ≥ 0                 y ∈ Rm
where c ∈ Rn , A ∈ Rm×n , b ∈ Rm . Moreover,
max{cT x | Ax ≤ b, x ≥ 0} = min{bT y | AT y ≥ c, y ≥ 0}
                          = − max{−bT y | −AT y ≤ −c, y ≥ 0}
Figure 3.3: The figure shows an iteration of the simplex in the primal problem and the corresponding
step in the dual problem.
An alternative view: we can solve the primal problem with the primal simplex and observe what
happens in the dual problem. This has an important application. Since the last terms of the tableau
become the right hand side terms of the constraints, whenever a tableau in the primal problem is not
optimal, the corresponding tableau in the dual problem is non canonical (or infeasible). Iterating in
the two algorithms we observe that:
While in the primal simplex we increase a variable that can improve the objective function, in
the dual simplex we take a constraint that is not yet satisfied and use it to diminish the value of a
variable until the constraint becomes satisfied. See Figure 3.3.
Hence, the dual simplex applied on the primal problem can be used for resolving an infeasible
start. This yields a dual based Phase I algorithm (Dual-primal algorithm) (see Ex 10, Sheet 3)
max −x1 − x2               min 4y1 − 8y2 − 7y3
−2x1 − x2 ≤ 4              −2y1 − 2y2 − y3 ≥ −1
−2x1 + 4x2 ≤ −8            −y1 + 4y2 + 3y3 ≥ −1
−x1 + 3x2 ≤ −7             y1 , y2 , y3 ≥ 0
x1 , x2 ≥ 0
The Phase I is thus terminated. We note that the tableaux are optimal also with respect to the
Phase II hence we are done.
The final tableau for this problem is the following (we show only the numbers that are relevant for
our analysis):
        x0     x1   x2   s1     s2   s3   −z   b
               0    1    0                     5/2
               1    0    0                     7
               0    0    1                     2
       −1/5    0    0   −1/5    0   −1         −62
What-if analysis: what changes in the solution if some input data change? Instead of solving
each modified problem from scratch, we can exploit the results obtained from solving the original
problem.
• How much more expensive should a product that is not produced be before it becomes worth
producing? Look at the reduced costs: we want cj − πaj > 0, hence we must increase the
original cost cj accordingly.
• What is the value of extra capacity of manpower? Adding one unit of the first and of the second
resource we obtain an increase in the objective value of 1/5 and 1, respectively.
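The marginal values can also be observed empirically by re-solving perturbed problems; a small sketch (ours, assuming scipy, and using the data of Example 3.1) follows:

# Hedged sketch of what-if analysis by re-solving: perturb the capacity
# of one resource and compare objective values, for the example
# max 6x1 + 8x2 s.t. 5x1 + 10x2 <= 60, 4x1 + 4x2 <= 40.
from scipy.optimize import linprog

def solve(b):
    res = linprog(c=[-6, -8], A_ub=[[5, 10], [4, 4]], b_ub=b,
                  bounds=[(0, None), (0, None)])
    return -res.fun

base = solve([60, 40])
print(solve([61, 40]) - base)   # ~0.4 = shadow price 2/5 of resource 1
print(solve([60, 41]) - base)   # ~1.0 = shadow price of resource 2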
max{cT x | Ax = b, l ≤ x ≤ u}     (3.9)
If a new variable is added to the problem, a new feasible solution is easily derived by setting the
new variable to zero. We then need to check whether it is worth increasing it (primal iteration).
The last tableau on the right gives the possibility to estimate the effect of variations. For a
variable in basis the perturbation goes unchanged in the reduced costs. Eg:
max (6 + δ)x1 + 8x2   =⇒   c̄1 = −(2/5) · 5 − 1 · 4 + 1 · (6 + δ) = δ.
If δ > 0 then the variable must enter in basis and we need to bring the tableau in canonical form
for the new basis and hence δ changes the obj value. For a variable not in basis, if it changes the
sign of the reduced cost then it is worth bringing in basis. The δ term propagates to other columns
via pivot operations.
      x1   x2   x3    x4    −z   b
x3    5    10   1     0     0    60 + δ
x4    4    4    0     1     0    40 + ε
      6    8    0     0     1    0

      x1   x2   x3     x4     −z   b
x2    0    1    1/5    −1/4   0    2 + (1/5)δ − (1/4)ε
x1    1    0    −1/5   1/2    0    8 − (1/5)δ + (1/2)ε
      0    0    −2/5   −1     1    −64 − (2/5)δ − ε
Looking at the b entry of the objective row, −64 − (2/5)δ − ε, we see what would be the
contribution to the objective value of increases δ and ε of the resources. If both δ = ε = 1
then it would be more convenient to augment the second resource.
Let's analyze the situation when only one of the resources changes, ie, let ε = 0. If the first
capacity becomes 60 + δ, then all RHS terms change and we must check feasibility. Which are the
multipliers for the first row? π1 = 1/5, π2 = −1/4, π3 = 0.
I: 1/5 · (60 + δ) − 1/4 · 40 + 0 · 0 = 12 + δ/5 − 10 = 2 + δ/5
II: −1/5 · (60 + δ) + 1/2 · 40 + 0 · 0 = −12 + 20 − δ/5 = 8 − δ/5
There is a risk that the RHS becomes negative. Eg: if δ = −20 then the tableau stays optimal but
not feasible. We need to apply the dual simplex, and the increase in the objective value would
therefore be less than predicted by the marginal values. In Figure 3.4, we plot the objective value
as a function of the increase δ.
Figure 3.4: the objective value as a function of δ; the linear rate 2/5 holds only between the
breakpoints δ = −10 and δ = 40.
After the addition of a new constraint with slack variable x5 , the tableau is brought back to
canonical form:
      x1   x2   x3     x4     x5   −z   b
x2    0    1    1/5    −1/4   0    0    2
x1    1    0    −1/5   1/2    0    0    8
x5    0    0    5/5    6/4    1    0    −2
      0    0    −2/5   −1     0    1    −64
Consider now a change in a coefficient of the matrix A, eg, the coefficient of x2 in the first
constraint:
      x1   x2       x3   x4   −z   b
x3    5    10 + δ   1    0    0    60
x4    4    4        0    1    0    40
      6    8        0    0    1    0
• then look at c
• finally look at b
      x1   x2                            x3     x4     −z   b
x2    0    (10 + δ) · 1/5 + 4 · (−1/4)   1/5    −1/4   0    2
x1    1    (10 + δ) · (−1/5) + 4 · 1/2   −1/5   1/2    0    8
      0    −(2/5)δ                       −2/5   −1     1    −64
either I. ∃x ∈ Rn : Ax = b and x ≥ 0
or II. ∃y ∈ Rm : yT A ≥ 0T and yT b < 0
The two cases are exclusive: if both held, then (0 ≤) yT Ax = yT b (< 0), a contradiction.
Proof. We show only one direction, since it will be useful in our proof of the duality theorem. Let
Ā = [A | Im ]
Ax ≤ b has a solution x ≥ 0 ⇐⇒ Āx̄ = b has a solution x̄ ≥ 0
By (I): for every y ∈ Rm with yT Ā ≥ 0, ie, with yT A ≥ 0 and y ≥ 0, it holds that yT b ≥ 0.
This is related to the Fourier & Motzkin elimination method.
(P ) max{cT x | Ax ≤ b, x ≥ 0}
Assume (P) has an optimal solution x∗ with value z∗ . We show that (D) has an optimal solution
as well and that its value coincides with z∗ . Let γ = cT x∗ be the optimal value of (P). We know
by assumption that, for every ε > 0,
Ax ≤ b, cT x ≥ γ has a solution x ≥ 0,     while     Ax ≤ b, cT x ≥ γ + ε has no solution x ≥ 0.
Let's define:
Â = [ A ; −cT ]     b̂ = [ b ; −γ − ε ]
By the Farkas Lemma, in the first case (ε = 0) every ŷ ≥ 0 with ŷT Â ≥ 0 satisfies ŷT b̂ ≥ 0,
while in the second case (ε > 0) there exists ŷ ≥ 0 with ŷT Â ≥ 0 and ŷT b̂ < 0. Writing
ŷ = (u, z) with u ∈ Rm and z ∈ R, the two cases read:
AT u ≥ zc, bT u ≥ zγ     and     AT u ≥ zc, bT u < z(γ + ε).
Hence z > 0, since z = 0 would contradict the separation of the two cases. We can then set
v = (1/z) u ≥ 0, obtaining
AT v ≥ c,     bT v < γ + ε,
ie, a feasible solution of (D) with value smaller than γ + ε. By weak duality, γ is a lower bound
for (D). Since (D) is feasible and bounded, there exists an optimal y∗ with
γ ≤ bT y∗ < γ + ε     for every ε > 0,
hence bT y∗ = γ, otherwise we would obtain a contradiction.
General form:
max cT x            infeasible ⇔ ∃y∗ :
A1 x = b1           bT1 y1 + bT2 y2 + bT3 y3 > 0
A2 x ≤ b2           AT1 y1 + AT2 y2 + AT3 y3 ≤ 0
A3 x ≥ b3           y2 ≤ 0
x ≥ 0               y3 ≥ 0
Example 3.8. The problem
max cT x
x1 ≤ 1
x1 ≥ 2
is infeasible, and a certificate y∗ = (y1 , y2 ) must satisfy:
bT1 y1 + bT2 y2 > 0
AT1 y1 + AT2 y2 ≤ 0
y1 ≤ 0
y2 ≥ 0
which for this instance read:
y1 + 2y2 > 0
y1 + y2 ≤ 0
y1 ≤ 0
y2 ≥ 0
y1 = −1, y2 = 1 is a valid certificate.
Note that the Farkas infeasibility certificate is not unique! It can be reported in place of the
dual solution because it has the same dimension. To repair infeasibility we should change the
primal at least so much that the certificate of infeasibility is no longer valid. Only constraints
with yi ≠ 0 in the certificate of infeasibility cause the infeasibility.
3.7 Summary
In this chapter we have presented the following topics regarding LP duality:
• Derivation:
1. bounding
2. multipliers
3. recipe
4. Lagrangian
• Theory:
– Symmetry
– Weak duality theorem
– Strong duality theorem
– Complementary slackness theorem
– Farkas Lemma: Strong duality + Infeasibility certificate
• Dual Simplex
• Economic interpretation
• Geometric Interpretation
• Sensitivity analysis
Chapter 4
Revised Simplex Method
The running time of the simplex algorithm depends on the total number of pivot operations and on
the cost of each single operation. We have already discussed in the previous chapter what is known
about the number of iterations. Let's now focus on the cost of a single iteration.
The complexity of a single pivot operation in the standard simplex is determined by the selection
of the entering variable, the ratio test for the leaving variable, and the update of the tableau, which
costs O(mn) arithmetic operations.
Hence, the most costly operation in the simplex is updating the tableau in the pivot operation.
We can observe that we are doing operations that are not actually needed. For example, in
the tableau the only column that really matters is the one of the entering variable. Moreover,
we have space issues: we need to store the whole tableau, that is, O(mn) floating point numbers;
this can become a lot: for 1000 constraints and 50000 variables in double precision floating point,
storing the whole tableau yields 400MB. Further, most problems have sparse matrices (they contain
many zeros). Sparse matrices are typically handled efficiently by special storage schemes and
specialized operators, but the standard simplex immediately disrupts sparsity. Finally, a problem
with an iterated method like the simplex is that floating point errors accumulate and may become
very important.
There are several ways to improve the efficiency of the pivot operation. To gain a better insight
we need a matrix description of the simplex. As in the previous chapter, all vectors are column
vectors and denoted by lowercase letters in bold face. Matrices are denoted by upper case letters.
We consider a general LP problem in standard form
max Σ_{j=1}^{n} cj xj
    Σ_{j=1}^{n} aij xj ≤ bi     i = 1..m
    xj ≥ 0                      j = 1..n
After the introduction of the slack variables xn+1 , xn+2 , . . . , xn+m the problem can be written in
vector form as
max cT x
Ax = b
x ≥ 0
or
max{cT x | Ax = b, x ≥ 0}
where x ∈ Rn+m , A ∈ Rm×(n+m) , c ∈ R(n+m) , and b ∈ Rm .
We aim at understanding the relationship between each tableau and the initial data. We saw
that every tableau corresponds to a basic feasible solution (the reverse is not true because of
possible degeneracies). For each basic feasible solution we can partition the columns of A into the
basic columns AB and the nonbasic ones AN , and correspondingly the variables into xB and xN .
Moreover,
• xN = 0
• xB ≥ 0
Ax = AN xN + AB xB = b (4.1)
AB xB = b − AN xN (4.2)
Theorem 4.1. Basic feasible solution ⇐⇒ AB is non-singular (ie, the rows are linearly independent
and det(AB ) ≠ 0).
Proof. We have already shown previously that if AB is non-singular then there is a unique basic
feasible solution given by xB = A−1B b. Now we set out to prove that if x is a basic feasible solution
for B then AB is non-singular. Since a basic feasible solution x satisfies Ax = b and xN = 0, then
it satisfies AB xB = b − AN xN = b. Hence, AB xB = b. From linear algebra we know that if xB is
unique then the matrix of the system AB xB = b must be full rank, or, equivalently, non-singular.
To verify that there are no other basic feasible solutions for B, consider an arbitrary vector x̃ such
that AB x̃B = b and x̃N = 0. Since the resulting vector satisfies Ax̃ = AB x̃B + AN x̃N = b, it must
satisfy the top m equations in the tableau representing x. But then x̃N = 0 implies x̃B = xB and
hence x̃ = x.
xB = A−1B b − A−1B AN xN     (4.3)
z = cT x = cTB xB + cTN xN .
z = cTB (A−1B b − A−1B AN xN ) + cTN xN = cTB A−1B b + (cTN − cTB A−1B AN ) xN
which is the dictionary corresponding to a basis B. In tableau form, for a basic feasible solution B
we have:

      A−1B AN                 I    0    A−1B b
      cTN − cTB A−1B AN       0    1    −cTB A−1B b

The identity matrix I of size m × m occupies the columns of the variables that are in basis. The
other terms of the matrix are given by Ā = A−1B AN .
The cost of one iteration of the revised simplex in a trivial implementation is determined by the
matrix operations needed to write the values in the tableau. These operations are:
• Compute A−1B AN : O(m2 n)
• Compute A−1B b: O(m2 )
The overall complexity is O(m2 (m + n)). This is apparently more than the standard simplex;
however, smart implementations can be more efficient. The most important gain can be achieved
by noting that at each iteration of the simplex we do not need to compute all elements of Ā.
Example 4.1. The LP problem is given on the left. Its equational standard form is derived on the
right:

max x1 + x2              max x1 + x2
−x1 + x2 ≤ 1             −x1 + x2 + x3 = 1
x1 ≤ 3                   x1 + x4 = 3
x2 ≤ 2                   x2 + x5 = 2
x1 , x2 ≥ 0              x1 , x2 , x3 , x4 , x5 ≥ 0
The initial tableau is:
x1 x2 x3 x4 x5 −z b
−1 1 1 0 0 0 1
1 0 0 1 0 0 3
0 1 0 0 1 0 2
1 1 0 0 0 1 0
The basic variables are x1 , x2 , x4 and the non basic ones x3 , x5 . With this information we can
write, looking at the initial tableau:

     | −1  1  0 |        | 1  0 |
AB = |  1  0  1 |   AN = | 0  0 |   xB = (x1 , x2 , x4 )T   xN = (x3 , x5 )T
     |  0  1  0 |        | 0  1 |

cTB = (1, 1, 0)     cTN = (0, 0)
The tableau is not optimal hence we proceed to the next pivot operation. We describe the operations
of selecting a variable to enter the basis, one to leave the basis and updating the tableau in terms
of matrix calculations.
Entering variable: In the standard simplex we look at the reduced costs in the tableau. In the
revised simplex we need to calculate cTN − cTB A−1B AN . This is decomposed into two steps:
Step 1. Find yT by solving the linear system yT AB = cTB . It is possible to calculate yT = cTB A−1B ,
but the system can be solved more efficiently without calculating the inverse of AB .
Step 2. Calculate cTN − yT AN (each term cj − yT aj can be calculated independently, and for some
pivoting rules one does not need to calculate them all).
In our example, the first element of this vector is positive and the calculation can stop. It
corresponds to the variable x3 , which therefore has a positive reduced cost and is selected to enter
the basis.
Leaving variable In the standard simplex we need now to determine the largest feasible amount
θ to increase the entering variable without producing an infeasible solution. We do the constraint
analysis, looking at the tableau and knowing that x5 will remain non-basic and hence zero:
R1: x1 − x3 + x5 = 1 x1 = 1 + x3 ≥ 0
R2: x2 + 0x3 + x5 = 2 x2 = 2
R3: x3 + x4 − x5 = 2 x4 = 2 − x3 ≥ 0
Hence, the first and the second constraints do not pose any limit to the increase of x3 . The third
constraint is the most restrictive. It determines that the largest increase θ is 2.
Translating these operations in matrix operations we observe that they can be expressed as:
xB = x0B − A−1B AN xN
where x0B is the current solution and xB the adjacent solution to which we are moving. Since in
the new solution only one non basic variable changes its value, the selected one x3 , not all the
terms of A−1B AN xN need to be calculated but only those that correspond to the entering variable.
Let us denote by d the corresponding column of A−1B AN and by a the column of AN that
corresponds to this variable. We have
d = A−1B a
xB = x0B − dθ
We can thus describe the calculation that we have to carry out in the revised simplex to find θ
such that xB stays positive.
Step 4. Determine the largest θ such that xB = x0B − dθ ≥ 0. If there is no such θ, then the
problem is unbounded. Otherwise, at least one component of x0B − dθ equals zero and the
corresponding variable is leaving the basis.
Step 3:
| d1 |   | −1  0   1 | | 1 |            | −1 |
| d2 | = |  0  0   1 | | 0 |   =⇒   d = |  0 |
| d3 |   |  1  1  −1 | | 0 |            |  1 |

Step 4:
     | 1 |   | −1 |
xB = | 2 | − |  0 | θ ≥ 0
     | 2 |   |  1 |

The first two terms do not pose any limit, while for the third term it must be 2 − θ ≥ 0,
which implies θ ≤ 2. Hence, it is x4 that goes to zero and thus leaves the basis.
Updating the tableau This part is the computationally heaviest part of the standard simplex.
In the revised simplex, instead, this step comes for free.
Step 5. The update of xB is done by setting the value found for θ in x0B − dθ ≥ 0 and replacing
x0B with xB . AB is updated by replacing the leaving column by the entering column.
Step 5.
     | x1 − d1 θ |   | 3 |        | −1  1  1 |
xB = | x2 − d2 θ | = | 2 |   AB = |  1  0  0 | .
     |     θ     |   | 2 |        |  0  1  0 |

Incidentally, note that the order of the columns of AB is not important as long as it matches the
order of the components of xB . Hence, the next iteration could just as well be entered with the
basis heading (x3 , x1 , x2 ):
      | 2 |        | 1  −1  1 |
x0B = | 3 |   AB = | 0   1  0 | .
      | 2 |        | 0   0  1 |
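The whole iteration can be condensed in a few lines of numpy; the following is an illustrative sketch (not the notes' implementation), where numpy.linalg.solve stands in for the factorization-based solution of the two linear systems:

# Hedged sketch: one revised simplex iteration on Example 4.1.
import numpy as np

A = np.array([[-1.0, 1, 1, 0, 0],
              [ 1.0, 0, 0, 1, 0],
              [ 0.0, 1, 0, 0, 1]])
b = np.array([1.0, 3.0, 2.0])
c = np.array([1.0, 1, 0, 0, 0])

basis = [0, 1, 3]                      # x1, x2, x4 (0-indexed)
nonbasis = [2, 4]                      # x3, x5
AB, AN = A[:, basis], A[:, nonbasis]

xB = np.linalg.solve(AB, b)            # current basic solution (1, 2, 2)
y = np.linalg.solve(AB.T, c[basis])    # Step 1: solve y^T AB = cB^T
red = c[nonbasis] - y @ AN             # Step 2: reduced costs of nonbasic vars
k = int(np.argmax(red))                # entering variable: most positive
d = np.linalg.solve(AB, AN[:, k])      # Step 3: solve AB d = a
ratios = np.where(d > 1e-9, xB / np.where(d > 1e-9, d, 1), np.inf)
r = int(np.argmin(ratios))             # Step 4: ratio test, leaving row
theta = ratios[r]
xB = xB - d * theta                    # Step 5: update solution and basis
xB[r] = theta
basis[r] = nonbasis[k]
print(basis, xB)                       # [0, 1, 2] and [3. 2. 2.]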
The basis heading is an ordered list of basic variables that specifies the actual order of the m
columns of AB . For simplicity, during the solution process the basis heading is updated by replacing
the leaving variable with the entering variable.
The revised simplex allows to save many operations, especially if there are many variables! Also
in terms of space the revised simplex is convenient: note indeed that we do not need to store the
matrix AB ; a vector containing the basis heading is enough. Special ways to fetch the matrix A
from memory then help to provide further speed-ups. Finally, the revised simplex provides a better
control over numerical issues, since A−1B can be recomputed once the variables that are in basis
are known.
There are different implementations of the revised simplex, depending on how yT AB = cTB and
AB d = a are solved. They are in fact not solved from scratch every time. The next section provides
the general idea behind these implementations.
Eta Factorization of the Basis Let AB = B and let's consider the kth iteration. The matrix
Bk will differ from the matrix Bk−1 by a single column, which is replaced by the column a appearing
in Bk−1 d = a solved in Step 3. Hence:
Bk = Bk−1 Ek
where Ek is the eta matrix differing from the identity matrix only in one column. In our example:
| −1  1  1 |   | −1  1  0 | | 1  0  −1 |
|  1  0  0 | = |  1  0  1 | | 0  1   0 |
|  0  1  0 |   |  0  1  0 | | 0  0   1 |
No matter how we solve yT Bk−1 = cTB and Bk−1 d = a, their update always relies on Bk =
Bk−1 Ek with Ek available. Moreover, when the initial basis is made of slack variables, then B0 = I
and B1 = E1 , B2 = E1 E2 , . . .:
Bk = E1 E2 . . . Ek     (eta factorization)
When B0 ≠ I:
Bk = B0 E1 E2 . . . Ek     (eta factorization)
((((yT B0 )E1 )E2 ) · · · )Ek = cTB
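The following sketch (ours, with an assumed representation of each eta matrix as a pair (p, v) of a column index and a column vector) shows how yT Bk = cTB can be solved by sweeping the eta file backwards:

# Hedged sketch: solve y^T (E1 E2 ... Ek) = cB^T via an eta file,
# assuming B0 = I. Each eta matrix is stored as (p, v): the identity
# with column p replaced by the vector v.
import numpy as np

def solve_yT_via_etas(etas, cB):
    # Process Ek first, then Ek-1, ..., then E1 (backward substitution).
    y = cB.astype(float).copy()
    for p, v in reversed(etas):
        # Solving u^T E = y^T: u equals y except in position p, where
        # u_p v_p + sum_{i != p} y_i v_i = y_p.
        u = y.copy()
        u[p] = (y[p] - sum(y[i] * v[i] for i in range(len(v)) if i != p)) / v[p]
        y = u
    return y

# Tiny check against a dense solve, using the E1 of the example:
etas = [(2, np.array([-1.0, 0.0, 1.0]))]
B = np.eye(3)
for p, v in etas:
    E = np.eye(3); E[:, p] = v; B = B @ E
cB = np.array([1.0, 1.0, 0.0])
print(solve_yT_via_etas(etas, cB), np.linalg.solve(B.T, cB))  # both (1, 1, 1)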
LU Factorization To solve the system Ax = b by Gaussian elimination we put the matrix A in row
echelon form by means of elementary row operations. Each row operation corresponds to multiplying left
and right hand sides by a lower triangular matrix L and a permutation matrix P . Hence, the method
throughout its iterations is equivalent to:
Ax = b
L1 P1 Ax = L1 P1 b
L2 P2 L1 P1 Ax = L2 P2 L1 P1 b
...
Lm Pm . . . L2 P2 L1 P1 Ax = Lm Pm . . . L2 P2 L1 P1 b
thus
U = Lm Pm . . . L2 P2 L1 P1 A     (triangular factorization of A)
where U is an upper triangular matrix whose entries in the diagonal are ones (if A is nonsingular such a
triangularization is unique).
For a square matrix A the LU decomposition is given by:
A = LU
PA = LU
We can compute the triangular factorization of B0 before the initial iterations of the simplex:
Lm Pm . . . L2 P2 L1 P1 B0 = U
and then, at iteration k, maintain
Lm Pm . . . L2 P2 L1 P1 Bk = Um Um−1 . . . U1 E1 E2 · · · Ek
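In practice one computes the factorization once and reuses it for both solves of an iteration; a minimal sketch with scipy's LU routines (an assumption: scipy.linalg available) follows:

# Hedged sketch: factor the basis once, then reuse the factors for the
# two solves of a revised simplex iteration (PB = LU with partial pivoting).
import numpy as np
from scipy.linalg import lu_factor, lu_solve

B = np.array([[-1.0, 1, 0],
              [ 1.0, 0, 1],
              [ 0.0, 1, 0]])
cB = np.array([1.0, 1.0, 0.0])
a = np.array([1.0, 0.0, 0.0])

lu, piv = lu_factor(B)                 # computed once per (re)factorization
y = lu_solve((lu, piv), cB, trans=1)   # solves B^T y = cB, ie y^T B = cB^T
d = lu_solve((lu, piv), a)             # solves B d = a
print(y, d)                            # y = [-1, 0, 2], d = [-1, 0, 1]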
Efficient Implementations
• Dual simplex with steepest edge pricing
• Linear Algebra:
– Dynamic LU-factorization using Markowitz threshold pivoting (Suhl and Suhl, 1990)
– sparse linear systems: Typically these systems take as input a vector with a very small number
of nonzero entries and output a vector with only a few additional nonzeros.
• Presolve, ie problem reductions: removal of redundant constraints, fixed variables, and other extraneous
model elements.
• dealing with degeneracy, stalling (long sequences of degenerate pivots), and cycling:
Tableaux and Vertices Each tableau is associated with exactly one vertex of the feasibility region.
The reverse is not always true: one vertex of the feasibility region can have more than one associated
tableau. For example, degenerate vertices have several associated tableaux.
Now, the non basic variables are (x2 , x4 ) = (0, 0). The constraints that are active are 4x1 + 4x2 ≤ 40 and
x2 ≥ 0. Hence, there are still two active constraints when the non basic variables are two. If in the original
space of the problem we have 3 variables and there are 6 constraints, we would have 3 constraints active at
the vertices. After we add the slack variables we have 6 variables in all. If any of the slack variables is
positive then some constraint xi ≥ 0 of the original variables is active, otherwise the corresponding
constraint of the original problem is active. Hence, we can generalize: the non basic variables are always n
and they tell which constraints (among the original ones and the variable feasibility constraints xi ≥ 0) are
active. A basic feasible solution implies a matrix of active constraints with rank n, some of which may be
due to the original variables being zero. Let a tableau be associated with a solution that makes exactly
n + 1 constraints active. Then, one basic variable is zero.
Definition 6. In a polyhedron in Rn , two vertices are adjacent iff:
In terms of tableaux, this condition means that between two adjacent vertices there are n − 1 basic
variables in common.
4.3 More on LP
4.3.1 LP: Rational Solutions
• A precise analysis of running time for an algorithm includes the number of bit operations together with
the number of arithmetic operations.
Example 4.2. The knapsack problem, aka the budget allocation problem, which asks to choose among a
set of n investments those that maximize the profit and cost in total less than B, can be solved by
dynamic programming in
O(n|B|)
The number B needs b = log |B| bits, hence the running time is exponential in the number of bits
needed to represent B, ie, O(n2^b ).
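For concreteness, here is a standard sketch of that dynamic program (ours, not from the notes); the two nested loops make the O(n|B|), ie pseudo-polynomial, running time visible:

# Hedged sketch: 0-1 knapsack by dynamic programming; dp[r] is the best
# profit achievable with budget r after considering a prefix of the items.
def knapsack(costs, profits, B):
    dp = [0] * (B + 1)
    for cost, profit in zip(costs, profits):
        for r in range(B, cost - 1, -1):   # descending: each item used once
            dp[r] = max(dp[r], dp[r - cost] + profit)
    return dp[B]

print(knapsack([2, 3, 4], [3, 4, 6], 5))   # 7: take the items of cost 2 and 3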
• Weakly polynomial time algorithms have running times that depend (polynomially) on the sizes of the
numbers involved in the problem, and hence on the number of bits needed to represent them.
• Strongly polynomial time algorithms: the running time of the algorithm is independent of the sizes of
the numbers involved. Eg: same running time for input numbers with 10 bits as for inputs with a million
bits.
• Running time depends on the sizes of numbers. We have to restrict attention to rational instances
when analyzing the running time of algorithms and assume they are coded in binary.
Theorem 4.2 (Rational Solutions). Optimal feasible solutions to LP problems are always rational as
long as all coefficient and constants are rational.
Proof: derives from the fact that in the simplex we only perform multiplications, divisions and sums
of rational numbers
• In spite of this: No strongly polynomial-time algorithm for LP is known.
How Large Problems Can We Solve? The speed up due to algorithmic improvements has been
more important than the one due to technology and machine architecture improvements (Bixby, 2007).
Chapter 5
Modeling in Mixed Integer Linear Programming
Often we need to deal with integral inseparable quantities. For example, if we are modeling the presence of
a bus on a line, a value of 0.3 would not have a meaning in practice. Sometimes it may be enough to round
fractional solutions to their nearest integers, but other times rounding is not a feasible option, as it may be
costly and a feasible solution is not ensured.
Discrete Optimization is the mathematical field that deals with optimization where the variables must
be integer. Combinatorial optimization is also a kind of discrete optimization, although the term refers to
problems that have a particular structure, like selecting subgraphs, patterns, etc.
In this chapter we will study mixed integer linear programming (MILP). The world is not linear but we
will see that MILP constitutes a very powerful tool and that many situations, apparently non linear, can
actually be modeled in linear terms. In other cases it is possible to linearize by approximation. After all
“OR is the art and science of obtaining bad answers to questions to which otherwise worse answers would be
given.”
An integer linear programming (ILP) problem has linear objective function, linear constraints and integer
variables. A mixed integer linear programming (MILP) problem has linear objective function, linear con-
straints and both integer and real valued variables. A binary programming or 0–1 programming problem
is an ILP problem where variables can take only two values, 0 and 1. Non-linear programming refers to
all problems that although written in mathematical programming terms may have a non linear objective
function and/or non linear constraints.
Here is a non exhaustive list of mathematical programming formulations. We will not see NLP in this
course.
Linear Programming (LP)    Integer (Linear) Programming (ILP)    0–1 Integer Programming (BIP)

max cT x                   max cT x                              max cT x
Ax ≤ b                     Ax ≤ b                                Ax ≤ b
x ≥ 0                      x ≥ 0, x integer                      x ∈ {0, 1}n
• Z set of integers
• Z+0 set of nonnegative integers ({0} ∪ Z+ )
Whenever different types of variables are present, we will try to comply with the convention used in the
MIPLIB 2003 and use the letters:
Note that the definition above is not an MILP formulation. However, many COP can be formulated as
IP or BIP. Typically, one defines an incidence vector of S, xS ∈ Bn , such that:
xSj = 1 if j ∈ S, and xSj = 0 otherwise.
That is, an element j of N is selected if xSj = 1 and not selected if xSj = 0. Then, one expresses the
structural constraints as a function of xS .
Figure 5.1: the constraints 3x1 − 2x2 ≥ −4 and 50x1 + 31x2 ≤ 250 and a level line of the objective
x1 + 0.64x2 ; only the integer points inside the polytope are feasible.
Rounding
A trivial heuristic to find integer solutions to an MILP problem is to relax the integrality constraint, solve
the linear programming problem thus derived and then round up or down each single fractional value of the
solution found.
The situation is represented in Figure 5.1. The feasibility region is made of the red dots that represent
integer solutions contained in the convex region defined by the constraints. The feasibility region is not
continuous: now the optimum can be on the border (vertices) of the polytope but also internal.
The linear programming relaxation of this problem is obtained by substituting the integrality constraints
on the two variables with the requirements that they must be non-negative, ie, x1 , x2 ≥ 0. The problem
obtained is a linear programming problem that we can solve with the simplex method.
Let’s denote by (ILP) the original problem and by (LPR) its linear relaxation. If the solution to (LPR)
is integer then the (ILP) is solved. On the other hand if the solution is rational then we can try to round
the values of the variables to their nearest integers.
The solution of (LPR) is (376/193, 950/193). The situation is depicted in the figure. The circles filled in
blue represent the solutions obtained by rounding down or up the values of the variables. For each rounded
solution we need to test whether it is feasible. If we are in R2 then there are 2^2 possible roundings (up or
down) of the variables and solutions to test. If we are in Rn then there are 2^n possible solutions. Hence,
in large problems checking all possible roundings may become computationally costly. Moreover, rounding
does not guarantee that a feasible solution is found and it can be arbitrarily bad with respect to the optimal
solution. In our example, the optimum of (ILP) is (5, 0) (the red circle in the figure), while any rounded
solution is quite far from that.
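The rounding heuristic can be sketched as follows (our illustration, assuming scipy; the problem data are reconstructed from Figure 5.1 and may differ in inessential details from the original instance):

# Hedged sketch: solve the LP relaxation, then try all 2^n floor/ceil
# roundings and keep the best feasible one. Assumed data:
# max x1 + 0.64 x2, 50x1 + 31x2 <= 250, 3x1 - 2x2 >= -4, x >= 0.
from itertools import product
from math import floor, ceil
from scipy.optimize import linprog

c = [-1, -0.64]
A_ub = [[50, 31], [-3, 2]]
b_ub = [250, 4]

lp = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
print(lp.x)                     # ~ (1.948, 4.922) = (376/193, 950/193)

best = None
for x in product(*[(floor(v), ceil(v)) for v in lp.x]):
    if all(sum(a * v for a, v in zip(row, x)) <= rhs
           for row, rhs in zip(A_ub, b_ub)) and min(x) >= 0:
        val = x[0] + 0.64 * x[1]
        if best is None or val > best[0]:
            best = (val, x)
print(best)   # (4.56, (2, 4)), while the ILP optimum (5, 0) has value 5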
There are two main techniques to solve MILP problem exactly: branch and bound and cutting planes.
max x1 + 4x2
x1 + 6x2 ≤ 18
x1 ≤ 3
x1 , x2 ≥ 0
x1 , x2 integers
Figure 5.2: the constraints x1 = 3 and x1 + 6x2 = 18, the line x1 + x2 = 5, and a level line of the
objective x1 + 4x2 .
We solve the linear programming relaxation. The situation is depicted in Figure 5.2.The optimal solution
is fractional. If we knew the constraint represented by the dashed line in the figure then solving the linear
programming relaxation would give us an integer value, which would be optimal for the original problem. If
we do not know that constraint we can devise a method to add constraints to our problem that would cut
out the rational solution of the current linear relaxation but not cut out any integer feasible solution. This
would bring us closer to the integer optimal solution. An example of such an added cut is given in the figure
by the line just below the constraint x1 + 6x2 ≤ 18. Iterating the procedure as many times as needed
would eventually lead us to an integer solution.
max x1 + 2x2
x1 + 4x2 ≤ 8
4x1 + x2 ≤ 8
x1 , x2 ≥ 0, integer
The branch and bound technique works by solving the linear relaxation and then branching on the
fractional values. Branching corresponds to splitting the problem into two subproblems, each one taking
a different part of the feasibility region. The solution to the linear relaxation is cut out by restricting a
fractional variable to be:
• at most the largest integer smaller than the fractional value of the variable (floor), or
• at least the smallest integer larger than the fractional value of the variable (ceil).
Each new subproblem is then a linear programming problem with an added constraint and we have seen
in sensitivity analysis how a solution can be derived from an optimal tableau after the introduction of a
constraint. When at a node the solution of the linear programming relaxation is integer, then we can stop
branching on that node. The optimal solution will be the best one found at the leaves of the branch and
bound tree.
The branch and bound process for our example problem is shown in Figure 5.3. Each node bears the
information of the best feasible solution encountered in its subtree and the value of the linear relaxation,
which in this case represents the best possible value that can be achieved. The optimal solution has value
4 and is shown on the right hand side of the corresponding node.
Figure 5.3: the branch and bound tree for the example. The root relaxation has value 4.8. Branching
on x1 ≤ 1 and x1 ≥ 2 gives nodes with relaxation values 4.5 and 2, the latter already integer
(x1 = 2, x2 = 0). Branching the first node further on x2 ≤ 1 and x2 ≥ 2 yields the integer solutions
x1 = 1, x2 = 1 of value 3 and x1 = 0, x2 = 2 of value 4, which is the optimum.
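A minimal branch and bound for this example can be sketched in a few lines (an illustration, assuming scipy; real solvers add much more machinery, eg, cutting planes and better branching rules):

# Hedged sketch: branch and bound for max x1 + 2x2 s.t. x1 + 4x2 <= 8,
# 4x1 + x2 <= 8, x >= 0 integer, using the LP relaxation as dual bound.
from math import floor, isclose
from scipy.optimize import linprog

c = [-1.0, -2.0]
A = [[1.0, 4.0], [4.0, 1.0]]
b = [8.0, 8.0]
best = {"val": float("-inf"), "x": None}

def bnb(bounds):
    res = linprog(c, A_ub=A, b_ub=b, bounds=bounds)
    if not res.success:
        return                         # infeasible node: prune
    if -res.fun <= best["val"]:
        return                         # bound no better than incumbent: prune
    frac = [i for i, v in enumerate(res.x)
            if not isclose(v, round(v), abs_tol=1e-6)]
    if not frac:
        best["val"], best["x"] = -res.fun, [round(v) for v in res.x]
        return                         # integer solution: update incumbent
    i, v = frac[0], res.x[frac[0]]     # branch on the first fractional var
    lo, hi = bounds[i]
    if floor(v) >= lo:
        bnb(bounds[:i] + [(lo, floor(v))] + bounds[i+1:])      # xi <= floor(v)
    if hi is None or floor(v) + 1 <= hi:
        bnb(bounds[:i] + [(floor(v) + 1, hi)] + bounds[i+1:])  # xi >= ceil(v)

bnb([(0, None), (0, None)])
print(best)   # value 4.0 at x = (0, 2), as in Figure 5.3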
• the parameters, that represent the values that are fixed and known;
• the decision variables, that answer the questions of the decision maker. They must be of a suitable
type according to the decisions they represent (continuous, integer valued, binary).
It is helpful, in order to formulate constraints, to first write down the relationship between the variables
in plain words. Then, these relationships can be transformed into logical sentences using connectives such
as and, or, not, implies. Finally, the logical sentences can be converted to mathematical constraints.
Example 5.4. The decisions to take are whether to produce in a time period or not. This can be modeled
by using binary integer variables, xi = 1 or xi = 0, for any period i. With these variables we can formulate
the constraint in three steps:
• Plain English: “The power plant must not work in both of two neighboring time periods”
• Logical sentence: xi = 1 implies xi+1 = 0
• Mathematical constraint: xi + xi+1 ≤ 1
We now provide the formulation of a series of relevant problems with several real life applications.
Parameters: We use the letter I to indicate the set of persons, indexed by i = 1..n, and the letter J to
indicate the set of jobs, indexed by j = 1..n. We represent the proficiency of person i for job j by a
numerical value ρij : the higher the value, the higher the proficiency for the job.
Decision Variables: xij = 1 if person i is assigned job j, and xij = 0 otherwise.
Objective Function:
max Σ_{i=1}^{n} Σ_{j=1}^{n} ρij xij
Constraints:
Each person is assigned one job:
Σ_{j=1}^{n} xij = 1     for all i ∈ I
Parameters: vi the value (or profit) of item i; wi the weight of item i; W the knapsack capacity.
Decision Variables:
xi = 1 if item i is taken, and xi = 0 otherwise, for i = 1, 2 . . . , n
Objective Function: we want to maximize the total value of the selected items:
max Σ_{i=1}^{n} vi xi
– regions: M = {1, . . . , 5}
– centers: N = {1, . . . , 6}
– cost of centers: cj = 1 for j = 1, . . . , 6
– coverages: S1 = (1, 2), S2 = (1, 3, 5), S3 = (2, 4, 5), S4 = (3), S5 = (1), S6 = (4, 5)
Parameters: The set of regions M , the set of centers N , the coverage of regions for each center, Sj , the
cost of each center cj .
Objective:
min Σ_{j=1}^{n} cj xj
Constraints: all regions must be safely served. We define an incidence matrix A of size m × n:
aij = 1 if region i is covered by center j (ie, if i ∈ Sj ), and aij = 0 otherwise.
        x1   x2   x3   x4   x5   x6
        S1   S2   S3   S4   S5   S6
    1   1    1    0    0    1    0
    2   1    0    1    0    0    0
A = 3   0    1    0    1    0    0
    4   0    0    1    0    0    1
    5   0    1    1    0    0    1
where we labeled the columns with the variables and the sets they represent and the rows by the region
identifier. A feasible solution is a selection of the columns of the matrix such that they identify a submatrix
that has at least a 1 in each row, or in other terms such that all rows are covered by the selected columns.
The one we formulated is a set covering problem. Two other variants, set packing and set partitioning
are also relevant for real life applications. We sketch the formulations here.
These problems can be generalized to the cases where the coverage must be larger than 1, that is, where
the right hand side of the constraints are larger than 1.
These problems have several applications. Examples are:
• Aircrew scheduling: the legs to cover define the set M and the rosters (crew shifts during which a crew
can cover a number of legs) defines the set N .
• Vehicle routing: the customers to visit define the set M , the routes that visit customers define the set
N.
Here is an example that can be modeled as a generalized set covering.
Decision Variables:
• xi ∈ N0 : number of people starting to work in hour i (i = 1, . . . , 15). For easiness of expressing the
constraints we also define the variables xi , i = −5, ..., −1, 0.
Objective Function:
min Σ_{i=1}^{9} xi
Constraints:
• Demand:
Σ_{i=t−6}^{t} xi ≥ dt     for t = 1, . . . , 15
• Bounds:
x−5 , . . . , x0 = 0
For a graph with weights on the edges, the weight of a matching is the sum of the weights on the edges
of the matching.
Definition 5.7 (Maximum weighted matching problem). Given a graph G = (V, E) and weights we on the
edges e ∈ E find the matching of maximum weight.
Binary variables xe indicate whether an edge is selected or not:
max Σ_{e∈E} we xe
Σ_{e incident to v} xe ≤ 1     for all v ∈ V
xe ∈ {0, 1}     for all e ∈ E
The constraint ensures that for each vertex the number of selected edges that are incident to the vertex is
not more than 1.
A particular case is a bipartite matching that arises when the graph is bipartite. A bipartite matching is
equivalent to an assignment problem.
Vertex Cover
Definition 5.8 (Vertex cover problem). Given a graph G, select a subset S ⊆ V such that each edge has at
least one end vertex in S. With binary variables xv indicating whether v ∈ S, the problem can be
formulated as min Σ_{v∈V} xv subject to xv + xu ≥ 1 for all (u, v) ∈ E, xv ∈ {0, 1}.
Roughly speaking, an approximation algorithm is an algorithm that runs in polynomial time and that
guarantees in the worst case a certain approximation ratio with respect to the optimal solution. Formally,
if OPT(π) is the optimal solution of an instance π of a minimization problem, and A(π) is the solution
found by the approximation algorithm, the approximation ratio AR is defined as:
AR = max_{π} A(π) / OPT(π)
An approximation algorithm for vertex cover can be easily derived from the linear programming solution.
Let x∗ be the optimal solution of the linear programming relaxation of the MILP formulation of the vertex
cover problem. Then, a cover SLP can be constructed by selecting the vertices whose variables received a
value of at least 1/2, that is:
SLP = {v ∈ V : x∗v ≥ 1/2}.
The set SLP is a cover, since x∗v + x∗u ≥ 1 implies x∗v ≥ 1/2 or x∗u ≥ 1/2.
Proposition 5.1. The LP rounding approximation algorithm described above gives a 2-approximation:
|SLP | ≤ 2|SOP T | (at most as bad as twice the optimal solution)
Proof. Let x̄ be the optimal solution for the MILP formulation of the vertex cover. Then
Σ_{v∈V} x∗v ≤ Σ_{v∈V} x̄v . Moreover,
|SLP | = Σ_{v∈SLP} 1 ≤ Σ_{v∈V} 2x∗v ≤ 2 Σ_{v∈V} x̄v = 2|SOPT |.
A related problem is the maximum independent set problem: select the largest subset of vertices such
that no two of them are the endpoints of the same edge. Also in this case we could design an algorithm
that rounds the LP relaxation of the MILP formulation. However, the optimal solution to this LP problem
sets xv = 1/2 for all variables and has value |V |/2. This fact implies that the LP relaxation rounding
algorithm gives only an O(n)-approximation (almost useless). (To prove this fact think about the worst
possible instance, which is a complete graph. What is the optimal integer max independent set solution
for a complete graph?)
Definition 5.10 (Traveling salesman problem). Given a set of n locations and costs cij of travelling from
one location i to another location j, find the cheapest tour that visits all locations.
The problem is modeled in graph terms by defining a directed graph D = (V, A). In this context a tour
that visits all vertices is called a Hamiltonian tour. Note that if the costs are symmetric everywhere then
the graph can be undirected.
The problem can be formulated as a MILP problem as follows.
Parameters The set of locations identified by 1, ..., n indexed by i and j and the set of traveling costs cij .
Variables: xij = 1 if location j is visited immediately after location i, and xij = 0 otherwise.
Objective:
min Σ_{i=1}^{n} Σ_{j=1}^{n} cij xij
Constraints: each location is left and entered exactly once:
Σ_{j : j≠i} xij = 1     for all i
Σ_{i : i≠j} xij = 1     for all j
The previous constraints alone do not remove the possibility that subtours are found. To eliminate this
possibility there are two ways:
• cut set constraints
Σ_{i∈S} Σ_{j∉S} xij ≥ 1     ∀S ⊂ N, S ≠ ∅
The problem with these constraints is that there are exponentially many of them (look at the quantifier
on the right side). You can learn how to deal with this issue in one of the assignments.
Modeling Trick III: “Either/Or Constraints” In conventional mathematical models, the solution
must satisfy all constraints. Suppose that your constraints are of the type “either/or”:
a1 x1 + a2 x2 ≤ b1 or
d1 x1 + d2 x2 ≤ b2
Introduce a new variable y ∈ {0, 1} and a large number M :
a1 x1 + a2 x2 ≤ b1 + M y if y = 0 then this is active
d1 x1 + d2 x2 ≤ b2 + M (1 − y) if y = 1 then this is active
Hence, binary integer programming allows to model alternative choices. For example, we can model the
case of two disjoint feasible regions, ie, disjunctive constraints, which are not possible in LP.
We introduce an auxiliary binary variable y and M , a big number:
Ax ≤ b + M y if y = 0 then this is active
A0 x ≤ b0 + M (1 − y) if y = 1 then this is active
Example 5.9. At least one of the two constraints must be satisfied:
3x1 + 2x2 ≤ 18 or x1 + 4x2 ≤ 16
Introduce new variable y ∈ {0, 1} and a large number M :
3x1 + 2x2 ≤ 18 + M y
x1 + 4x2 ≤ 16 + M (1 − y)
If y = 1 then x1 + 4x2 ≤ 16 is the active constraint and the other is always satisfied.
If y = 0 then 3x1 + 2x2 ≤ 18 is the active constraints and the other is always satisfied.
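With a MILP solver the trick can be tested directly; here is a sketch (ours, with an assumed objective max x1 + x2 , using scipy.optimize.milp, available in scipy >= 1.9):

# Hedged sketch of the either/or trick with an auxiliary binary y.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

M = 1e4
# Variables: x1, x2, y. Constraints:
#   3x1 + 2x2 - M y <= 18
#    x1 + 4x2 + M y <= 16 + M
cons = LinearConstraint(np.array([[3, 2, -M], [1, 4, M]]),
                        ub=np.array([18, 16 + M]))
res = milp(c=[-1, -1, 0],                 # maximize x1 + x2 (assumed objective)
           constraints=cons,
           integrality=[0, 0, 1],         # y integer, x continuous
           bounds=Bounds([0, 0, 0], [np.inf, np.inf, 1]))
print(res.x, -res.fun)   # x = (16, 0), y = 1: only x1 + 4x2 <= 16 enforced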
Modeling Trick IV: “Either/Or Constraints” We can generalize the previous trick to the case
where exactly K of the N constraints
a11 x1 + a12 x2 + a13 x3 + . . . + a1m xm ≤ d1
a21 x1 + a22 x2 + a23 x3 + . . . + a2m xm ≤ d2
...
aN1 x1 + aN2 x2 + aN3 x3 + . . . + aNm xm ≤ dN
must be satisfied. We need to introduce binary variables y1 , y2 , . . . , yN and a large number M and impose:
a11 x1 + a12 x2 + a13 x3 + . . . + a1m xm ≤ d1 + M y1
a21 x1 + a22 x2 + a23 x3 + . . . + a2m xm ≤ d2 + M y2
...
aN1 x1 + aN2 x2 + aN3 x3 + . . . + aNm xm ≤ dN + M yN
y1 + y2 + . . . + yN = N − K
Since in a feasible solution K of the y-variables will be 0, then K constraints will be satisfied.
Similarly, we can model the case where at least h ≤ k of the constraints Σ_{j=1}^{n} aij xj ≤ bi ,
i = 1, . . . , k, must be satisfied. We introduce auxiliary binary variables yi , i = 1..k, and impose:
Σ_{j=1}^{n} aij xj ≤ bi + M yi     i = 1, . . . , k
Σ_{i} yi ≤ k − h
Modeling Trick V: “Possible Constraints Values” A constraint must take on one of N given
values:
a1 x1 + a2 x2 + a3 x3 + . . . + am xm = d1 or
a1 x1 + a2 x2 + a3 x3 + . . . + am xm = d2 or
..
.
a1 x1 + a2 x2 + a3 x3 + . . . + am xm = dN
We introduce the binary variables y1 , y2 , . . . , yN and impose:
a1 x1 + a2 x2 + a3 x3 + . . . + am xm = d1 y1 + d2 y2 + . . . dN yN
y1 + y2 + . . . yN = 1
Example 5.10. The constraint must equal 6 or 8 or 12:
• 3x1 + 2x2 = 6 or
• 3x1 + 2x2 = 8 or
• 3x1 + 2x2 = 12
Reformulate with auxiliary variables y1 , y2 , y3 ∈ {0, 1}:
• 3x1 + 2x2 = 6y1 + 8y2 + 12y3 and
• y1 + y2 + y3 = 1
Example 5.11 (Disjunctive constraints in scheduling). Two tasks, P and Q, must be performed by the same
person. The duration of P (resp. Q) is dp (resp. dq ) units. The start time of P (resp. Q) is denoted by
sp (resp. sq ). We want to enforce either sp + dp ≤ sq or sq + dq ≤ sp .
Trick: Define a binary variable ipq indicating whether P precedes Q, and introduce the following constraints:
sp + dp ≤ sq + M (1 − ipq )
sq + dq ≤ sp + M ipq
5.4 Formulations
Problems can have more than one MILP formulation. Let's start by considering the Uncapacitated Facility
Location problem.
Variables:
yj = 1 if depot j is open, and yj = 0 otherwise;
xij = fraction of the demand of i satisfied by j.
Objective:
min Σ_{i∈M} Σ_{j∈N} cij xij + Σ_{j∈N} fj yj
Constraints:
Σ_{j=1}^{n} xij = 1     ∀i = 1, . . . , m
and one of the two following alternative ways of linking x and y:
Σ_{i∈M} xij ≤ m yj     ∀j ∈ N          (aggregated)
xij ≤ yj     ∀i ∈ M, j ∈ N             (disaggregated)
A polyhedron is a valid formulation for a set of feasible solutions X if it does not leave out any of the
solutions of the feasible region X.
Definition 5.13 (Convex Hull). Given a set X ⊆ Zn , the convex hull of X is defined as:
conv(X) = { x : x = Σ_{i=1}^{t} λi x^i , Σ_{i=1}^{t} λi = 1, λi ≥ 0 for i = 1, . . . , t,
            for all finite subsets {x^1 , . . . , x^t } of X }
Hence:
max{cT x : x ∈ X} ≡ max{cT x : x ∈ conv(X)}
This is an important result: it means that we can solve the integer programming problem by solving its
linear programming relaxation. However, the description of the convex hull conv(X) may require an
exponential number of inequalities and it may be not known.
What makes a formulation better than another? Let’s suppose that we have two formulations P1 and P2
and that
X ⊆ conv(X) ⊆ P1 ⊂ P2
Then we can conclude that:
P1 is better than P2
Definition 5.14. Given a set X ⊆ Rn and two formulations P1 and P2 for X, P1 is a better formulation
than P2 if P1 ⊂ P2 .
We can now get back to our two alternative formulations for the UFL problem.
We show that
P2 ⊂ P1
• P2 ⊆ P1 , because summing xij ≤ yj over i ∈ M we obtain Σ_{i∈M} xij ≤ m yj .
• P2 ⊂ P1 , because there exists a point in P1 but not in P2 . For example, let m = 6 = 3 · 2 = k · n. The
following solution
x10 = 1, x20 = 1, x30 = 1,
x41 = 1, x51 = 1, x61 = 1
under formulation P1 would admit a fractional value for y0 and y1 :
Σ_{i} xi0 ≤ 6y0     y0 = 1/2
Σ_{i} xi1 ≤ 6y1     y1 = 1/2
while under the formulation P2 the variables y could not take a fractional value. Since they must be
integer for their proper use in the objective function, we showed that there is a solution in P1 but not
in P2 , while not removing any feasible solution.
Chapter 6
Well Solved Problems
6.1 Relaxations
Suppose we have the following ILP problem:
z = max{c(x) : x ∈ X ⊆ Zn }
The set X represents the set of all feasible solutions. In an ILP this set is a subset of Zn . Since the problem
is a maximization problem, any feasible solution x∗ of value z̲ gives a lower bound z̲ ≤ z. Then, to prove the
optimality of a feasible solution we also need an upper bound z̄. If z̄ = z̲, the solution that gives z̲ is
optimal. Alternatively, we can stop our search process when z̄ − z̲ ≤ ε for a given reasonably small ε.
The concepts, roles and determination of upper and lower bounds are linked to the sense of the optimiza-
tion function. In a minimization problem their roles are exchanged. To avoid this dependency on the sense
of the objective function, the following concepts are instead used.
• Primal bounds: every feasible solution gives a primal bound. In some problems it may be easy or hard
to find feasible solutions. Heuristics are used to provide such type of solutions.
• Dual bounds: they are obtained through relaxations of the problem formulation.
In our initial maximization problem, the lower bounds are primal bounds and the upper bounds are dual
bounds.
Given a primal bound pb and a dual bound db, it is possible to calculate the optimality gap:
gap = |pb − db| / (pb + ε) · 100
The ε is added to avoid division by zero when pb = 0. To avoid a confusing behaviour when 0 lies between
pb and db, a different definition, which includes the above one, is often used. For a minimization problem,
this is:
gap = (pb − db) / inf{|z|, z ∈ [db, pb]} (·100)
If pb ≥ 0 and db ≥ 0 then the gap is (pb − db)/db. If db = pb = 0 then gap = 0. If no feasible solution is
found, or db ≤ 0 ≤ pb, then the gap is not computed.
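In code, the convention above reads as follows (a small sketch of the formula, for a minimization problem):

# Hedged sketch of the optimality-gap formula, leaving the gap undefined
# when no feasible solution is known or when db <= 0 <= pb.
def optimality_gap(pb, db):
    if pb is None or db <= 0 <= pb:
        return None                     # gap not computed
    if pb == db == 0:
        return 0.0
    # inf{|z| : z in [db, pb]} = min(|db|, |pb|) when 0 is not in the interval
    return (pb - db) / min(abs(pb), abs(db)) * 100

print(optimality_gap(102.0, 100.0))     # 2.0 (percent)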
Proposition 6.1. (RP ) z R = max{f (x) : x ∈ T ⊆ Rn } is a relaxation of
(IP ) z = max{c(x) : x ∈ X ⊆ Rn } if:
(i) X ⊆ T , and
(ii) f (x) ≥ c(x) for all x ∈ X.
In other terms:
z R = max_{x∈T} f (x) ≥ max_{x∈X} f (x) ≥ max_{x∈X} c(x) = z
Moreover, let x∗ be an optimal solution for RP. If x∗ ∈ X and f (x∗ ) = c(x∗ ), then x∗ is also
optimal for IP.
Combinatorial relaxations Some complicating constraints are removed, leaving a problem that is easy
(that is, solvable in polynomial time). For example, the TSP can be relaxed to an Assignment problem by
dropping the subtour elimination constraints.
Lagrangian relaxation It is obtained by bringing all or some constraints into the objective function via
multipliers. That is,
IP : z = max{cT x : Ax ≤ b, x ∈ X ⊆ Zn }
LR : z(u) = max{cT x + uT (b − Ax) : x ∈ X}
Duality The concept of (strong) duality works only for linear programming. We can however adapt the
definition of dual pairs to integer programming as follows:
z = max{c(x) : x ∈ X}     w = min{w(u) : u ∈ U }
An advantage of a dual problem with respect to the LP relaxation is that, once we have a dual problem,
we do not need to solve an LP to have a bound: any feasible dual solution gives a dual bound for the
primal problem.
Here is an example of a dual pair in ILP: the maximum cardinality matching problem and the minimum
cardinality vertex cover problem are weak dual pairs. Indeed, it is easy to see that the LP relaxations of
these two problems are dual of each other, hence
z ≤ z LP = wLP ≤ w.
The two problems are strong dual pairs when the graphs are bipartite.
Definition 6.2 (Separation problem for a COP). Given x∗ ∈ P , is x∗ ∈ conv(X)? If not, find an inequality
ax ≤ b satisfied by all points in X but violated by the point x∗ .
(i) Efficient optimization property: there exists a polynomial time algorithm for max{cx : x ∈ X ⊆ Rn }
(ii) Strong duality property: there exists a strong dual D min{w(u) : u ∈ U } that allows to quickly verify
optimality
(iii) Efficient separation property: there exists an efficient algorithm for the separation problem
(iv) Efficient convex hull property: there is a compact description of the convex hull available.
Note that if the efficient convex hull property is true, then the strong duality property and the efficient
separation property hold, via the description of conv(X).
Problems that are easy typically have all four properties satisfied.
Polyhedral analysis is the field of theoretical analysis that proves results about the strength of certain
inequalities (whether they are facet defining) and the description of the convex hull of some discrete set
X ⊆ Zn (we see one way to do this in the next section).
IP : max{cT x : Ax ≤ b, x ∈ Zn+ }
For a basis B:
AB xB + AN xN = b
xN = 0,     AB xB = b,     AB an m × m non-singular matrix,     xB ≥ 0
Cramer’s rule for solving systems of linear equations: for
| a  b | | x |   | e |
| c  d | | y | = | f |
we have
x = det | e  b | / det | a  b |        y = det | a  e | / det | a  b |
        | f  d |       | c  d |                | c  f |       | c  d |
In general,
x = A−1B b = (A^{adj}_B b) / det(AB )
A classical sufficient condition is the following: A is TUM if (i) every entry of A is in {0, +1, −1};
(ii) each column contains at most two nonzero entries; (iii) the rows of A can be partitioned into two
sets I1 and I2 such that each column with two nonzero entries satisfies Σ_{i∈I1} aij = Σ_{i∈I2} aij .
Proof: by induction.
Basis: a matrix of one element in {+1, −1} is TUM.
Induction: let C be a square submatrix of size k. If C has a column of all 0s, then it is singular and
det C = 0. If C has a column with only one nonzero entry, then expand the determinant on that entry
and apply the induction hypothesis. If C has two nonzero entries in each column, then
∀j : Σ_{i∈I1} aij = Σ_{i∈I2} aij
hence summing the rows of I1 and subtracting the rows of I2 gives the zero vector: the rows are linearly
dependent and det C = 0.
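Total unimodularity can also be verified on small matrices by brute force, directly from the definition; a sketch (ours, assuming numpy; exponential, so only for tiny examples):

# Hedged sketch: check every square submatrix has determinant in {-1, 0, +1}.
from itertools import combinations
import numpy as np

def is_TUM(A):
    A = np.asarray(A)
    m, n = A.shape
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                d = round(np.linalg.det(A[np.ix_(rows, cols)]))
                if d not in (-1, 0, 1):
                    return False
    return True

# The node-arc incidence matrix of a directed graph is TUM:
print(is_TUM([[ 1,  1,  0],
              [-1,  0,  1],
              [ 0, -1, -1]]))   # True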
• TUM
• Balanced matrices
• Perfect matrices
• Integer vertices
Eg: Shortest path, max flow, min cost flow, bipartite weighted matching
Chapter 7
Network Flows
models arise in broader problem contexts and how the algorithms that we have
developed for the core models can be used in conjunction with other methods to
solve more complex problems that arise frequently in practice. In particular, this
discussion permits us to introduce and describe the basic ideas of decomposition
methods for several important network optimization models: constrained shortest
paths, the traveling salesman problem, the vehicle routing problem, multicommodity
flows, and network design.
Since the proof of the pudding is in the eating, we have also included a chapter
on some aspects of computational testing of algorithms. We devote much of our
discussion to devising the best possible algorithms for solving network flow problems,
in the theoretical sense of computational complexity theory. Although the
theoretical model of computation that we are using has proven to be a valuable guide
for modeling and predicting the performance of algorithms in practice, it is not a
perfect model, and therefore algorithms that are not theoretically superior often
perform best in practice. Although empirical testing of algorithms has traditionally
been a valuable means for investigating algorithmic ideas, the applied mathematics,
computer science, and operations research communities have not yet reached a
consensus on how to measure algorithmic performance empirically. So in this chapter
we not only report on computational experience with an algorithm we have presented,
but also offer some thoughts on how to measure computational performance
and compare algorithms.
variants as well as other basic models that we consider in later chapters. We assume
our readers are familiar with the basic notation and definitions of graph theory; those
readers without this background might consult Section 2.2 for a brief account of this
material.
Let G = (N, A) be a directed network defined by a set N of n nodes and a set A of m directed arcs. Each arc (i, j) ∈ A has an associated cost $c_{ij}$ that denotes the cost per unit flow on that arc. We assume that the flow cost varies linearly with the amount of flow. We also associate with each arc (i, j) ∈ A a capacity $u_{ij}$ that denotes the maximum amount that can flow on the arc and a lower bound $l_{ij}$ that denotes the minimum amount that must flow on the arc. We associate with each node i ∈ N an integer number b(i) representing its supply/demand. If b(i) > 0, node i is a supply node; if b(i) < 0, node i is a demand node with a demand of −b(i); and if b(i) = 0, node i is a transshipment node. The decision variables in the minimum cost flow problem are arc flows, and we represent the flow on an arc (i, j) ∈ A by $x_{ij}$. The minimum cost flow problem is an optimization model formulated as follows:
Minimize $\sum_{(i,j) \in A} c_{ij} x_{ij}$  (1.1a)
subject to
$\sum_{\{j:(i,j) \in A\}} x_{ij} - \sum_{\{j:(j,i) \in A\}} x_{ji} = b(i)$ for all $i \in N$,  (1.1b)
$l_{ij} \le x_{ij} \le u_{ij}$ for all $(i,j) \in A$.  (1.1c)
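Since (1.1a)–(1.1c) is just a linear program, a tiny instance can be passed directly to a generic LP solver. A sketch using scipy (the instance and all names here are made up for illustration):

```python
import numpy as np
from scipy.optimize import linprog

# Tiny instance: nodes 0..3, arcs given as (tail, head, cost, capacity).
arcs = [(0, 1, 2, 4), (0, 2, 2, 2), (1, 2, 1, 2), (1, 3, 3, 3), (2, 3, 1, 5)]
b = [3, 0, 0, -3]          # b(0) = 3 supply, b(3) = -3 demand
n, m = len(b), len(arcs)

# Flow balance (1.1b): for each node i, (flow out) - (flow in) = b(i).
A_eq = np.zeros((n, m))
for k, (i, j, _, _) in enumerate(arcs):
    A_eq[i, k] = 1.0       # arc k leaves node i
    A_eq[j, k] = -1.0      # arc k enters node j

c = [cost for (_, _, cost, _) in arcs]
bounds = [(0, cap) for (_, _, _, cap) in arcs]   # (1.1c) with l_ij = 0

res = linprog(c, A_eq=A_eq, b_eq=b, bounds=bounds)
print(res.x, res.fun)      # optimal arc flows; total cost 10.0 on this instance
```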
Shortest path problem. The shortest path problem is perhaps the simplest
of all network flow problems. For this problem we wish to find a path of minimum
cost (or length) from a specified source node s to another specified sink node t,
assuming that each arc (i, j) ∈ A has an associated cost (or length) $c_{ij}$. Some of the
simplest applications of the shortest path problem are to determine a path between
two specified nodes of a network that has minimum length, or a path that takes least
time to traverse, or a path that has the maximum reliability. As we will see in our
later discussions, this basic model has applications in many different problem do-
mains, such as equipment replacement, project scheduling, cash flow management,
message routing in communication systems, and traffic flow through congested cities.
If we set b(s) = 1, b(t) = −1, and b(i) = 0 for all other nodes in the minimum
cost flow problem, the solution to the problem will send 1 unit of flow from node s
to node t along the shortest path. The shortest path problem also models situations
in which we wish to send flow from a single-source node to a single-sink node in an
uncapacitated network. That is, if we wish to send v units of flow from node s to
node t and the capacity of each arc of the network is at least v, we would send the
flow along a shortest path from node s to node t. If we want to determine shortest
paths from the source node s to every other node in the network, then in the minimum
cost flow problem we set b(s) = (n − 1) and b(i) = −1 for all other nodes. [We can set each arc capacity $u_{ij}$ to any number larger than (n − 1).] The minimum cost
flow solution would then send unit flow from node s to every other node i along a
shortest path.
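A sketch of this reduction (not from the text; the instance is made up): with b(s) = 1, b(t) = −1 and uncapacitated arcs, the arcs carrying flow in the optimal solution trace a shortest s–t path.

```python
import numpy as np
from scipy.optimize import linprog

# Shortest s-t path as a min cost flow: b(s) = 1, b(t) = -1, no capacities.
arcs = [(0, 1, 1), (0, 2, 4), (1, 2, 1), (1, 3, 5), (2, 3, 1)]  # (tail, head, length)
n, s, t = 4, 0, 3
b = np.zeros(n)
b[s], b[t] = 1, -1

A_eq = np.zeros((n, len(arcs)))
for k, (i, j, _) in enumerate(arcs):
    A_eq[i, k], A_eq[j, k] = 1.0, -1.0

res = linprog([w for (_, _, w) in arcs], A_eq=A_eq, b_eq=b, bounds=(0, None))
# Arcs carrying one unit of flow form a shortest 0 -> 3 path:
print([arcs[k][:2] for k in range(len(arcs)) if res.x[k] > 0.5])
# -> [(0, 1), (1, 2), (2, 3)], total length 3
```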
The maximum flow problem can be modeled as a minimum cost flow problem by setting b(i) = 0 for all nodes, giving every original arc zero cost, and adding an uncapacitated return arc (t, s) with cost −1. Then, since any flow on arc (t, s) must travel from node s to node t through the arcs in A
[since each b(i) = 0], the solution to the minimum cost flow problem will maximize
the flow from node s to node t in the original network.
In this book, we also study the following generalizations of the minimum cost
flow problem.
Convex cost flow problems. In the minimum cost flow problem, we assume
that the cost of the flow on any arc varies linearly with the amount of flow. Convex
cost flow problems have a more general cost structure: The cost is a convex function
of the amount of flow. Flow costs vary in a convex manner in numerous problem
settings, including (1) power losses in an electrical network due to resistance, (2)
congestion costs in a city transportation network, and (3) expansion costs of a com-
munication network.
Other Models
In this book we also study two other important network models: the minimum span-
ning tree problem and the matching problem. Although these two models are not
flow problems per se, because of their practical and mathematical significance and
because of their close connection with several flow problems, we have included
them as part of our treatment of network flows.
Matching problems. A matching in a graph G = (N, A) is a set of arcs
with the property that every node is incident to at most one arc in the set; thus a
matching induces a pairing of (some of) the nodes in the graph using the arcs in A.
In a matching, each node is matched with at most one other node, and some nodes
might not be matched with any other node. The matching problem seeks a matching
that optimizes some criterion. Matching problems on bipartite graphs (i.e., those
with two sets of nodes and with arcs that join only nodes between the two sets, as
in the assignment and transportation problems) are called bipartite matching prob-
lems, and those on nonbipartite graphs are called nonbipartite matching problems.
There are two additional ways of categorizing matching problems: cardinality match-
ing problems, which maximize the number of pairs of nodes matched, and weighted
matching problems, which maximize or minimize the weight of the matching. The
weighted matching problem on a bipartite graph is also known as the assignment
problem. Applications of matching problems arise in matching roommates to hostels,
matching pilots to compatible airplanes, scheduling airline crews for available flight
legs, and assigning duties to bus drivers.
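For a quick experiment with the cardinality/weight distinction, networkx offers a weighted nonbipartite matching routine. A small sketch (the graph is made up; returned edge tuples may appear in either orientation):

```python
import networkx as nx

# Path 1-2-3-4 with edge weights 5, 11, 5: maximum weight and maximum
# cardinality pull in different directions.
G = nx.Graph()
G.add_weighted_edges_from([(1, 2, 5), (2, 3, 11), (3, 4, 5)])

print(nx.max_weight_matching(G))                       # {(2, 3)}: weight 11, one pair
print(nx.max_weight_matching(G, maxcardinality=True))  # {(1, 2), (3, 4)}: weight 10, two pairs
```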
1.3 APPLICATIONS
Networks are pervasive. They arise in numerous application settings and in many
forms. Physical networks are perhaps the most common and the most readily iden-
tifiable classes of networks; and among physical networks, transportation networks
are perhaps the most visible in our everyday lives. Often, these networks model
homogeneous facilities such as railbeds or highways. But on other occasions, they
correspond to composite entities that model, for example, complex distribution and
logistics decisions. The traditional operations research "transportation problem" is
illustrative. In the transportation problem, a shipper with inventory of goods at its
warehouses must ship these goods to geographically dispersed retail centers, each
with a given customer demand, and the shipper would like to meet these demands
incurring the minimum possible transportation costs. In this setting, a transportation
link in the underlying network might correspond to a complex distribution channel
with, for example, a trucking shipment from the warehouse to a railhead, a rail
shipment, and another trucking leg from the destination rail yard to the customer's
site.
Physical networks are not limited to transportation settings; they also arise in
several other disciplines of applied science and engineering, such as mathematics,
chemistry, and electrical, communications, mechanical, and civil engineering. When
physical networks occur in these different disciplines, their nodes, arcs, and flows
model many different types of physical entities. For example, in a typical communication network, nodes will represent telephone exchanges and transmission facilities, arcs will denote copper cables or fiber optic links, and flow would signify the transmission of messages or of data. Figure 1.1 shows some typical associations for the nodes, arcs, and flows in a variety of physical networks.
Network flow problems also arise in surprising ways for problems that on the
surface might not appear to involve networks at all. Sometimes these applications
are linked to a physical entity, and at other times they are not. Sometimes the nodes
and arcs have a temporal dimension that models activities that take place over time.
In this section we briefly discuss two special cases of the minimum cost flow
problem, both of which occur frequently in practical applications. For a more
detailed discussion see, e.g., [91, Section 3.12].
In the Assignment Problem, the input consists of a set of persons
P1 , P2 , . . . , Pn , a set of jobs J1 , J2 , . . . , Jn and an n × n matrix M = [Mij ]
whose entries are non-negative integers. Here Mij is a measure for the skill of
person Pi in performing job Jj (the lower the number the better Pi performs
job Jj). The goal is to find an assignment π of persons to jobs so that each person gets exactly one job and the sum $\sum_{i=1}^{n} M_{i\pi(i)}$ is minimized. Given
any instance of the assignment problem, we may form a complete bipartite
graph B = (U, V ; E) where U = {P1 , P2 , . . . , Pn }, V = {J1 , J2 , . . . , Jn } and
E contains the edge Pi Jj with the weight Mij for each i ∈ [n], j ∈ [n]. Now
the assignment problem is equivalent to finding a minimum weight perfect
matching in B. Clearly we can also go the other way and hence the assignment
problem is equivalent to the weighted bipartite matching problem. It
is also easy to see from this observation that the assignment problem is a
(very) special case of the minimum cost flow problem. In fact, if we think
of Mij as a cost and orient all edges from U to V in the bipartite graph
above, then what we are seeking is an integer-valued flow of minimum cost
so that the value of the balance vector equals 1 for each Pi, i = 1, 2, . . . , n,
and equals -1 for each Jj , j = 1, 2, . . . , n.
Inspecting the description of the buildup algorithm above, it is not hard
to see that the following holds (Exercise 4.53).
Theorem 4.10.8 The buildup algorithm solves the assignment problem for
a bipartite graph on n vertices in time $O(n^3)$. □
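In practice the assignment problem is usually solved directly rather than via a flow formulation; scipy ships an $O(n^3)$ Hungarian-type routine. A sketch with a made-up skill matrix M:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# M[i][j] = cost of person i doing job j (lower is better).
M = np.array([[4, 1, 3],
              [2, 0, 5],
              [3, 2, 2]])

rows, cols = linear_sum_assignment(M)   # optimal permutation of jobs
print(list(zip(rows, cols)))            # [(0, 1), (1, 0), (2, 2)]
print(M[rows, cols].sum())              # minimum total cost: 5
```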
Chapter 8
Cutting Plane Algorithms
Theorem 8.1. Every valid inequality for X can be obtained by applying this CG procedure a finite number of times.
Init.: $t = 0$, $P^0 = P$
$$P^{t+1} = P \cap \{x : a_i x \le b_i,\ i = 1, \ldots, t\}$$
From the optimal tableau of the LP relaxation:
$$x_u = \bar b_u - \sum_{j \in N} \bar a_{uj} x_j, \quad u \in B$$
$$z = \bar d + \sum_{j \in N} \bar c_j x_j$$
• If the basic optimal solution to the LP relaxation is not integer, then there exists some row u with $\bar b_u \notin \mathbb{Z}$. The Chvátal-Gomory cut applied to this row is:
$$x_{B_u} + \sum_{j \in N} \lfloor \bar a_{uj} \rfloor x_j \le \lfloor \bar b_u \rfloor$$
Writing $f_u = \bar b_u - \lfloor \bar b_u \rfloor$, we have $f_u > 0$, or else u would not be the row of a fractional solution. It follows that the current optimum $x^*$, in which $x^*_N = 0$, is cut off!
• Moreover, when x is integer, since all coefficients in the CG cut are integer, the slack variable of the cut is also integer:
$$s = -f_u + \sum_{j \in N} f_{uj} x_j, \quad \text{where } f_{uj} = \bar a_{uj} - \lfloor \bar a_{uj} \rfloor$$
(Theoretically the procedure terminates after a finite number of iterations, but in practice it is not successful when used alone.)
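A sketch (not from the notes; the helper function is made up) of reading a CG/Gomory cut off a tableau row, using exact fractions to avoid rounding; the row below matches the x2-row of the final tableau in the example that follows:

```python
from fractions import Fraction
from math import floor

def gomory_cut(row, rhs):
    """Given a tableau row x_Bu + sum_j abar_uj x_j = bbar_u with fractional
    right-hand side, return the cut sum_j f_uj x_j >= f_u over the nonbasic
    variables, where f(v) = v - floor(v)."""
    f = {j: a - floor(a) for j, a in row.items()}
    f_u = rhs - floor(rhs)
    return f, f_u

# Row of x2 in the optimal tableau: x2 + 1/6 x3 - 1/6 x4 = 15/6.
row = {"x3": Fraction(1, 6), "x4": Fraction(-1, 6)}
f, f_u = gomory_cut(row, Fraction(15, 6))
print(f, ">=", f_u)   # cut: 1/6 x3 + 5/6 x4 >= 1/2
```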
max x1 + 4x2
x1 + 6x2 ≤ 18
x1 ≤ 3
x1 , x2 ≥ 0
x1 , x2 integer
[Figure: feasible region of the LP relaxation in the (x1, x2) plane, bounded by x1 = 3 and x1 + 6x2 = 18, with an objective contour x1 + 4x2 = 2.]
| | x1 | x2 | x3 | x4 | -z | b |
|---+----+----+----+----+----+----|
| | 1 | 6 | 1 | 0 | 0 | 18 |
| | 1 | 0 | 0 | 1 | 0 | 3 |
|---+----+----+----+----+----+----|
| | 1 | 4 | 0 | 0 | 1 | 0 |
| | x1 | x2 | x3 | x4 | -z | b |
|---+----+----+----+----+----+----|
| | 0 | 6 | 1 | -1 | 0 | 15 |
| | 1 | 0 | 0 | 1 | 0 | 3 |
|---+----+----+----+----+----+----|
| | 0 | 4 | 0 | -1 | 1 | -3 |
| | x1 | x2 | x3 | x4 | -z | b |
|---+----+----+------+------+----+------|
| | 0 | 1 | 1/6 | -1/6 | 0 | 15/6 |
| | 1 | 0 | 0 | 1 | 0 | 3 |
|---+----+----+------+------+----+------|
| | 0 | 0 | -2/3 | -1/3 | 1 | -13 |
x2 = 5/2, x1 = 3
Optimum, not integer
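As an aside (not in the notes), the LP relaxation optimum can be double-checked with scipy; linprog minimizes, so the objective is negated, and variables are nonnegative by default:

```python
from scipy.optimize import linprog

# LP relaxation of: max x1 + 4 x2  s.t.  x1 + 6 x2 <= 18,  x1 <= 3,  x >= 0.
res = linprog(c=[-1, -4], A_ub=[[1, 6], [1, 0]], b_ub=[18, 3])
print(res.x, -res.fun)   # [3.  2.5] 13.0 -- fractional, so a cut is needed
```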
• CG cut $\sum_{j \in N} f_{uj} x_j \ge f_u$: here $\frac{1}{6} x_3 + \frac{5}{6} x_4 \ge \frac{1}{2}$
• Let’s see how it looks in the space of the original variables: from the first tableau:
x3 = 18 − 6x2 − x1
x4 = 3 − x1
$$\frac{1}{6}(18 - 6x_2 - x_1) + \frac{5}{6}(3 - x_1) \ge \frac{1}{2} \quad \Longrightarrow \quad x_1 + x_2 \le 5$$
• Graphically:
[Figure: the feasible region with the cut x1 + x2 = 5 added, together with x1 = 3, x1 + 6x2 = 18 and the objective contour x1 + 4x2 = 2.]
• Let’s continue:
| | x1 | x2 | x3 | x4 | x5 | -z | b |
|---+----+----+------+------+----+----+------|
| | 0 | 0 | -1/6 | -5/6 | 1 | 0 | -1/2 |
| | 0 | 1 | 1/6 | -1/6 | 0 | 0 | 5/2 |
| | 1 | 0 | 0 | 1 | 0 | 0 | 3 |
|---+----+----+------+------+----+----+------|
| | 0 | 0 | -2/3 | -1/3 | 0 | 1 | -13 |
Dual simplex ratio rule: $\min_j \left| \frac{\bar c_j}{\bar a_{ij}} \right|$ over the columns j with $\bar a_{ij} < 0$ in the leaving row i.

The next CG cut, expressed in the original variables (using $x_3 = 18 - x_1 - 6x_2$ and $x_5 = 5 - x_1 - x_2$):
$$4(18 - x_1 - 6x_2) + (5 - x_1 - x_2) \ge 2 \quad \Longrightarrow \quad x_1 + 5x_2 \le 15$$
[Figure: the feasible region after also adding the cut x1 + 5x2 = 15.]
• . . . (and so on, until the LP optimum is integer)