Linear and Integer Programming Made Easy
T. C. Hu • Andrew B. Kahng
Linear Programming
  Preliminaries (Chapter 1): Matrix Notation & Gauss–Jordan Elimination
  Simplex Method (Chapter 4): An Algorithm that Enables Computers to Solve Linear Programs

Integer Programming
  Knapsack Problem (Chapter 8): One of the Simplest Integer Programs
The subject of linear programming was discovered by Dr. George B. Dantzig, and
the subject of integer programming was discovered by Dr. Ralph E. Gomory.
Thousands of papers and hundreds of books have been published. Is there still a
need for this book?
The earlier algorithms for integer programming were based on cutting planes. In
this book, we map non-basic columns into group elements to satisfy congruence
relations. The first six chapters of the book are about linear programming. Then,
Chapter 8 introduces the knapsack problem, which is known to have time complexity O(nb), where n is the number of types of items and b is the capacity of the knapsack. We present a new algorithm which has time complexity O(nw), where
n is the number of types of items and w is the weight of the best item, that is, the
item with the highest ratio of value to weight.
The unique contents of this book include:
1. The column generating technique for solving very large linear programs with too
many columns to write down (Chapter 7)
2. A new knapsack algorithm with its time complexity O(nw), where n is the
number of types of items and w is the weight of the best item (Chapter 8)
Thus, we explain the two features in detail and devote all of Chapter 9 to the
asymptotic algorithm for integer programming; we present “The World Map on
Integer Programs” in Chapter 10.
Chapter 11 of this book introduces the practical application of linear and integer
programming. Ultimately, real-world problems must be formulated as linear or
integer programs and then solved on computers using commercial or public-domain
software packages. We give examples and pointers to this end.
Otter Cove, CA T. C. Hu
La Jolla, CA A. B. Kahng
June 2015
Contents
1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Matrix Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Solving Simultaneous Equations . . . . . . . . . . . . . . . . . . . . . . . 3
Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
The Gauss–Jordan Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Inverse of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Linear Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Ratio Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3 Dimension of the Solution Space . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.1 Solution Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2 Convex Sets and Convex Functions . . . . . . . . . . . . . . . . . . . . . 31
3.3 Geometric Interpretation of Linear Programs . . . . . . . . . . . . . . 34
3.4 Theorem of Separating Hyperplanes . . . . . . . . . . . . . . . . . . . . 35
3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4 Introduction to the Simplex Method . . . . . . . . . . . . . . . . . . . . . . . . 39
4.1 Equivalent Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Basic, Feasible, and Optimum Solutions . . . . . . . . . . . . . . . . . 41
4.3 Examples of Linear Programs . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.4 Pivot Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.5 Simplex Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.6 Review and Simplex Tableau . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.7 Linear Programs in Matrix Notation . . . . . . . . . . . . . . . . . . . . 55
4.8 Economic Interpretation of the Simplex Method . . . . . . . . . . . . 56
4.9 Summary of the Simplex Method . . . . . . . . . . . . . . . . . . . . . . 57
4.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Epilogue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
1 Preliminaries
This chapter provides a brief linear algebra “refresher” and covers notation and
some matrix properties that will be used throughout this book.
1.1 Matrix Properties

In this book, we will use parentheses to denote row vectors and brackets to denote column vectors. Thus, a matrix A defined as

\[
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35}
\end{bmatrix}
\]

can be written as three row vectors or five column vectors, with $(a_{11}, a_{12}, a_{13}, a_{14}, a_{15})$ as its first row and $[a_{11}, a_{21}, a_{31}]$ as its first column. Notice that $a_{ij}$ is the entry in the ith row and the jth column.
A matrix of m rows and n columns has m × n entries and is called an m × n matrix. Thus, for m = 2 and n = 3, we can have a matrix A defined as

\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\]

or a matrix B defined as

\[
B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}
\]
In general, the sum of two m × n matrices is an m × n matrix whose entries are the sums of the corresponding entries of the summands.

A matrix $C = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}$ multiplied by a scalar α is $\alpha C = \begin{bmatrix} \alpha c_{11} & \alpha c_{12} \\ \alpha c_{21} & \alpha c_{22} \end{bmatrix}$. In general, an m × n matrix C multiplied by a scalar α is the matrix obtained by multiplying each entry of C by α.
The transpose of an m × n matrix A is denoted by $A^T$; it is an n × m matrix, obtained by writing each row of A as a column instead. So, for matrices B and C below, we have $B^T$ and $C^T$ as

\[
B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}
\qquad
B^T = \begin{bmatrix} b_{11} & b_{21} \\ b_{12} & b_{22} \\ b_{13} & b_{23} \end{bmatrix}
\]

\[
C = \begin{bmatrix} c_{11} \\ c_{21} \\ c_{31} \end{bmatrix}
\qquad
C^T = \begin{bmatrix} c_{11} & c_{21} & c_{31} \end{bmatrix}
\]
For a square matrix, we can define a determinant that is useful in linear algebra.
The determinant of square matrix C is denoted as det C. We will discuss this in
Section 1.3.
Matrices can also be multiplied together. Two matrices A and B can be multiplied if the number of columns of A is the same as the number of rows of B. We could have

\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\qquad
B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}
\]

\[
AB = \begin{bmatrix}
a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31} & a_{11}b_{12} + a_{12}b_{22} + a_{13}b_{32} \\
a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} & a_{21}b_{12} + a_{22}b_{22} + a_{23}b_{32}
\end{bmatrix}
\]
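To make the row-by-column rule concrete, here is a small Python sketch (the function name matmul and the list-of-rows representation are our own illustrative choices, not from the original text):

```python
def matmul(A, B):
    """Multiply an m x k matrix A by a k x n matrix B, both stored as
    lists of rows; entry (i, j) of the product is the dot product of
    row i of A with column j of B."""
    m, k, n = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "cols of A must equal rows of B"
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(n)]
            for i in range(m)]

# A 2 x 3 matrix times a 3 x 2 matrix gives a 2 x 2 matrix, as above.
A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
print(matmul(A, B))  # [[58, 64], [139, 154]]
```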
1.2 Solving Simultaneous Equations

Matrices in which the leading (first nonzero) entry of each row is a 1, each leading 1 lies to the right of the leading 1 in the row above, and each leading 1 is the only nonzero entry in its column are said to be in reduced row echelon form. Note that the matrices A and B above are not in reduced row echelon form.

Consider the system of simultaneous equations

\[
\begin{aligned}
x_1 + 3x_2 + 2x_3 &= 7 \\
2x_2 + x_3 &= 3 \\
x_1 - x_2 + x_3 &= 2
\end{aligned}
\]

We can write this system as an augmented matrix:

\[
\left[\begin{array}{ccc|c} 1 & 3 & 2 & 7 \\ 0 & 2 & 1 & 3 \\ 1 & -1 & 1 & 2 \end{array}\right]
\]
This is just a convenient way for us to express and solve a system of linear
equations.
When we solve a system of linear equations expressed as an augmented matrix, there are three operations, called elementary row operations, which we are allowed to perform on the rows. We can:

1. Swap two rows.
2. Multiply a row by a nonzero constant.
3. Add a multiple of one row to another row.

Notice that these operations all preserve the solution(s) to the system of linear equations. Swapping two rows clearly has no effect on the solutions, multiplying both sides of an equation by a nonzero constant does not affect its solutions, and adding a multiple of one equation to another with common solutions preserves the solutions.
Gaussian Elimination
1. Swap rows so the leftmost column with a nonzero entry is nonzero in the
first row.
2. Add multiples of the first row to every other row so that the entry in that column
of every other row is zero.
3. Ignore the first row and repeat the process.
Please see the example below of the Gauss–Jordan Method to see how Gaussian
elimination operates.
The Gauss–Jordan Method first applies Gaussian elimination to get a row echelon
form of the matrix. Then, the rows are reduced even further into reduced row
echelon form. In the case of a square matrix, the reduced row echelon form is the
identity matrix if the initial matrix was invertible.
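As an illustration of the method just described, here is a minimal Python sketch (the function name gauss_jordan and the plain nested-list input are our own choices); it applies exactly the three elementary row operations to bring an augmented matrix into reduced row echelon form:

```python
def gauss_jordan(M):
    """Reduce augmented matrix M (a list of rows) to reduced row
    echelon form in place, using the three elementary row operations."""
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols - 1):          # last column is the RHS
        # 1. Swap so the pivot position holds a nonzero entry.
        pr = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        # 2. Scale the pivot row so the leading entry is 1.
        p = M[pivot_row][col]
        M[pivot_row] = [x / p for x in M[pivot_row]]
        # 3. Eliminate this column from every other row.
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M

# The example system of this section; the solution is x1 = 2, x2 = 1, x3 = 1.
M = [[1, 3, 2, 7],
     [0, 2, 1, 3],
     [1, -1, 1, 2]]
print(gauss_jordan(M))  # rows become [1,0,0,2], [0,1,0,1], [0,0,1,1] (as floats)
```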
\[
\begin{aligned}
x_1 + 3x_2 + 2x_3 &= 7 \\
2x_2 + x_3 &= 3 \\
x_1 - x_2 + x_3 &= 2
\end{aligned}
\]

Applying Gaussian elimination (subtract row 1 from row 3, then add twice row 2 to row 3) gives

\[
\left[\begin{array}{ccc|c} 1 & 3 & 2 & 7 \\ 0 & 2 & 1 & 3 \\ 0 & 0 & 1 & 1 \end{array}\right]
\]

Notice that we now have an augmented matrix in row echelon form, and Gaussian elimination is complete. We now want to row reduce this matrix into reduced row echelon form. Multiply row 2 by 0.5 to get a leading 1 in row 2:

\[
\begin{aligned}
x_1 + 3x_2 + 2x_3 &= 7 \\
x_2 + 0.5x_3 &= 1.5 \\
x_3 &= 1
\end{aligned}
\qquad
\left[\begin{array}{ccc|c} 1 & 3 & 2 & 7 \\ 0 & 1 & 0.5 & 1.5 \\ 0 & 0 & 1 & 1 \end{array}\right]
\]

Continuing to eliminate above each leading 1 yields the reduced row echelon form, from which we read off the solution $x_1 = 2$, $x_2 = 1$, $x_3 = 1$.
1.3 Inverse of a Matrix

For a 2 × 2 matrix $M = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$, the determinant is

\[
\det M = \alpha\delta - \beta\gamma
\]

Note that if the determinant of a matrix is zero, then the matrix has no inverse.
For any n × n square matrix, there is an easy way to calculate its inverse matrix. First, we put an n × n identity matrix to the right of it. Then, we use Gaussian elimination to convert the left matrix into the identity matrix and execute the same row operations on the right matrix. Once the left matrix is the identity matrix, the matrix on the right will be the inverse of the initial matrix.
Accordingly, if we want to find the inverse of

\[
\begin{bmatrix} 1 & 3 & 2 \\ 0 & 2 & 1 \\ 1 & -1 & 1 \end{bmatrix}
\]

we place the identity matrix to its right and row reduce. We end up with

\[
\begin{bmatrix} 1 & 3 & 2 \\ 0 & 2 & 1 \\ 1 & -1 & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} 1.5 & -2.5 & -0.5 \\ 0.5 & -0.5 & -0.5 \\ -1 & 2 & 1 \end{bmatrix}
\]
As an exercise, the reader should verify that this is the inverse of the given matrix.
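The procedure lends itself to a short Python sketch (the function name invert and its list-of-rows input format are our own choices); it augments the matrix with the identity and performs the same row reduction:

```python
def invert(A):
    """Invert square matrix A by augmenting with the identity and
    applying Gauss-Jordan reduction; raises if A is singular."""
    n = len(A)
    # Augment each row of A with the corresponding row of the identity.
    M = [list(map(float, A[i])) + [1.0 if j == i else 0.0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # Find a row at or below `col` with a nonzero pivot and swap it up.
        pr = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pr is None:
            raise ValueError("matrix is singular (determinant is zero)")
        M[col], M[pr] = M[pr], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    # The right half of M is now the inverse.
    return [row[n:] for row in M]

A = [[1, 3, 2], [0, 2, 1], [1, -1, 1]]
print(invert(A))  # [[1.5, -2.5, -0.5], [0.5, -0.5, -0.5], [-1.0, 2.0, 1.0]]
```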
1.4 Matrix Multiplication

The multiplication of matrices is associative; that is, (AB)C = A(BC). Consider all of the ways to evaluate the product of four matrices A, B, C, and D. Suppose that A is a 1000 × 20 matrix, B is 20 × 1, C is 1 × 50, and D is 50 × 100. The number of multiplications needed to obtain the final result depends on the order of multiplication! Table 1.1 shows the number of multiplications to obtain the final result for each order of matrix multiplication.
Table 1.1 Number of multiplications to obtain the result with different orders of multiplication

((AB)C)D   (1000 × 20 × 1) + (1000 × 1 × 50) + (1000 × 50 × 100) = 5,070,000
A((BC)D)   (20 × 1 × 50) + (20 × 50 × 100) + (20 × 100 × 1000) = 2,101,000
(AB)(CD)   (1 × 20 × 1000) + (1 × 50 × 100) + (1 × 100 × 1000) = 125,000
(A(BC))D   (50 × 1 × 20) + (50 × 20 × 1000) + (50 × 100 × 1000) = 6,001,000
A(B(CD))   (100 × 50 × 1) + (100 × 1 × 20) + (100 × 1000 × 20) = 2,007,000
The optimum order for multiplying a chain of matrices has been studied since the 1970s. It was discovered by Hu and Shing [5]¹ that the optimum order of multiplication of n matrices,

\[
M_1 \times M_2 \times M_3 \times \cdots \times M_{n-1} = M
\]

can be found in O(n log n) time by reducing the problem to optimally partitioning a convex polygon.

¹ T. C. Hu and M. T. Shing, Combinatorial Algorithms, Dover, 2001.
[Fig. 1.1 The resulting view of multiplying four matrices with different orders: five polygon diagrams with vertex weights 1000, 20, 1, 50, 100, one for each of ((AB)C)D, A((BC)D), (AB)(CD), (A(BC))D, and A(B(CD)).]
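For comparison, here is a sketch of the classical O(n³) dynamic program for the same problem (not the faster Hu–Shing algorithm); dims lists the chain's dimensions, so matrix i is dims[i] × dims[i+1], and the function name is our own choice:

```python
def chain_order_cost(dims):
    """Minimum number of scalar multiplications needed to multiply a
    chain of matrices, where matrix i has shape dims[i] x dims[i+1].
    Classical O(n^3) dynamic program over increasing chain lengths."""
    n = len(dims) - 1                      # number of matrices
    cost = [[0] * n for _ in range(n)]     # cost[i][j]: best for chain i..j
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j))
    return cost[0][n - 1]

# A: 1000 x 20, B: 20 x 1, C: 1 x 50, D: 50 x 100, as in Table 1.1.
print(chain_order_cost([1000, 20, 1, 50, 100]))  # 125000, i.e., (AB)(CD)
```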
1.5 Exercises
1. [The matrices for this exercise were garbled in extraction.]
2. Compute the inverse of each of the following matrices by placing the identity matrix to the right of it and using row reduction:

(a) \[\begin{bmatrix} 4 & 3 \\ 3 & 2 \end{bmatrix}\]

(b) \[\begin{bmatrix} 7 & 2 & 1 \\ 0 & 3 & 1 \\ 3 & 4 & 2 \end{bmatrix}\]
3. Compute the determinant of each of the following matrices:

(a) \[\begin{bmatrix} 6 & 9 \\ 1 & 1 \end{bmatrix}\]

(b) \[\begin{bmatrix} 5 & 3 \\ 7 & 11 \end{bmatrix}\]
4. What is the formula for finding the inverse of the square matrix $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$? Apply it to find the inverse of

\[
\begin{bmatrix} 7 & 3 \\ 2 & 1 \end{bmatrix}
\]
2 Introduction

We will now explore the basics of linear programming. In this chapter, we will go
through several basic definitions and examples to build our intuition about linear
programs. We will also learn the ratio method, which we will find useful in solving
certain linear programs.
2.1 Linear Programming

Consider the following problem. We have two resources, wood and iron. We can
use up to 96 units of wood per day, and we can use up to 72 units of iron per day.
With these two resources, we can make two products, A and B. It requires 12 units
of wood and 6 units of iron to produce one copy of product A, and it requires 8 units
of wood and 9 units of iron to produce one copy of product B. Furthermore, we can
sell products A and B for $4 and $5 per copy, respectively. How do we allocate
resources to the production of A and B in order to maximize our revenue? Table 2.1 summarizes the problem.

Table 2.1 Requirements and prices for products A and B

                 Product A   Product B   Available per day
Wood (units)        12           8              96
Iron (units)         6           9              72
Price ($/copy)       4           5
With the information in Table 2.1, we can draw Figure 2.1 using A and B to
represent the amounts of the two products. Two lines represent upper bounds on
utilization of the two resources, wood and iron. For example, the line “wood” shows
the maximum usage of wood. That is, the 96 available units of wood could
potentially be used toward making eight copies of product A and zero copies of
product B. Or, the wood could be used toward making 12 copies of product B and
zero copies of product A. An important observation is that even though there is
enough wood to make 12 copies of product B, this is not feasible since only 72 units
of iron are available. The amounts of A and B produced must satisfy resource
constraints on both wood and iron. As such, the polygon JKLM represents our
solution space, that is, the set of all feasible solutions to our problem.
[Fig. 2.1 The solution space: the lines wood: 12A + 8B = 96 and iron: 6A + 9B = 72 bound the feasible polygon JKLM in the (A, B) plane, under the objective maximize 4A + 5B.]
[Fig. 2.2 Search for the feasible point with maximum projection by shifting the dotted line along c = (4, 5): the dotted line, normal to c with slope −4/5, sweeps across the region bounded by wood: 12A + 8B = 96 and iron: 6A + 9B = 72.]
[Fig. 2.3 Optimum solution and optimum fractional solution of the revenue, on the same diagram of wood: 12A + 8B = 96 and iron: 6A + 9B = 72.]
\[
\begin{aligned}
\max\; z &= x_1 + x_2 \\
\text{subject to}\;\; 2x_1 + x_2 &\le 13 \\
0 \le x_1 &\le 5 \\
0 \le x_2 &\le 5
\end{aligned}
\tag{2.1}
\]
Notice that our solution space consists of the area bounded by ABCDE in
Figure 2.4.
In Figure 2.4, the coordinates of the vertices and their values are A = (0, 0) with value z = 0, B = (5, 0) with value z = 5, C = (5, 3) with value z = 8, D = (4, 5) with value z = 9, and E = (0, 5) with value z = 5.
The famous Simplex Method for solving a linear program (which we will present in Chapter 4) starts from vertex A, moves along the x1 axis to B, and then moves vertically upward until it reaches C. From the vertex C, the Simplex Method goes upward and slightly westward to D. At the vertex D, the gradient of the function z points toward the northeast, and the gradient is a convex combination of the vectors that are normal to lines ED and DC. This indicates that the corner point D is the vertex that maximizes the value of z¹ (Figure 2.5):

\[
\max z = \max(x_1 + x_2) = (4 + 5) = 9
\]
¹ A corner point is also called an extreme point. Such a point is not "between" any other two points in the region.
For three variables x1, x2, and x3, we can imagine a twisted cube such as a single die,
where the minimum is at the lowest corner point and the maximum is at the highest
corner point. Then, we can imagine that the Simplex Method would traverse along
the boundary of the cube to search for the optimal solution among extreme points.
\[
\begin{aligned}
\max\; z &= x_1 + x_2 + x_3 \\
\text{subject to}\;\; 6x_1 + 3x_2 + x_3 &\le 15 \\
4x_1 + 5x_2 + 6x_3 &\le 15 \\
x_1, x_2, x_3 &\ge 0
\end{aligned}
\tag{2.3}
\]
Thus, when solving a linear program, the classical tool of "calculus" is not used.

2.2 Ratio Method

In this section, we will familiarize ourselves with linear programs and the ratio method. We will proceed by exploring several examples.
Example 1 Consider a bread merchant carrying a knapsack who goes to the farmers
market. He can fill his knapsack with three kinds of bread to sell. A loaf of raisin bread
can be sold for $12, a loaf of wheat bread for $10, and a loaf of white bread for $1.
Furthermore, the loaves of raisin, wheat, and white bread weigh 11 lbs, 10 lbs, and
9 lbs, respectively. If the bread merchant can carry 20 lbs of bread in the knapsack,
what kinds of breads should he carry if he wants to get the most possible cash?
This problem can be formulated as a linear program (2.2) where x1 denotes the
number of loaves of raisin bread, x2 denotes the number of loaves of wheat bread,
and x3 denotes the number of loaves of white bread in his knapsack.
In this example, there are three “noteworthy” integer solutions: x1 ¼ x3 ¼ 1,
x2 ¼ 0 or x1 ¼ x3 ¼ 0, x2 ¼ 2 or x1 ¼ x2 ¼ 0, x3 ¼ 2. So, the merchant would carry
two loaves of wheat bread and get $20, rather than one loaf each of raisin bread and
white bread to get $13 or two loaves of white bread to get $2. However, notice that
if he can cut a loaf of raisin bread into pieces, 9/11 of a loaf of raisin bread is worth $12 × 9/11 ≈ $9.82, and he should therefore carry 20/11 = 1 + 9/11 loaves of raisin bread.

In general, a knapsack problem with three items can be written as

\[
\begin{aligned}
\max\; z &= v_1 x_1 + v_2 x_2 + v_3 x_3 \\
\text{subject to}\;\; w_1 x_1 + w_2 x_2 + w_3 x_3 &\le b \\
x_1, x_2, x_3 &\ge 0
\end{aligned}
\tag{2.4}
\]
where vj is the value associated with item j, wj is the weight associated with item j,
and b is the total weight restriction of the knapsack.
Intuitively, we would like to carry an item which is of low weight and high value.
In other words, the ratio of the value to weight for such an item should be
maximized. Applying this idea to (2.2), we have

\[
\frac{12}{11} > \frac{10}{10} > \frac{1}{9}
\]

(i.e., raisin bread > wheat bread > white bread), which indicates that we should fill the knapsack with the first item, that is, carry 20/11 = 1 + 9/11 loaves of raisin bread.
Definition The ratio method is a method that can be applied to a linear program to
get the optimal solution as long as the variables are not restricted to be integers. It
operates simply by taking the maximum or minimum ratio between two appropriate
parameters (e.g., value to cost, or value to weight).
To formalize this idea of the ratio method, consider the general linear program
\[
\begin{aligned}
\max\; v &= v_1 x_1 + v_2 x_2 + v_3 x_3 + \cdots \\
\text{subject to}\;\; w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots &\le b \\
x_j &\ge 0
\end{aligned}
\tag{2.5}
\]
Furthermore, let $v_k / w_k = \max_j v_j / w_j$ (where $w_j \ne 0$). Then the feasible solution that maximizes v is $x_k = b / w_k$, with $x_j = 0$ for $j \ne k$; the maximum profit value of v is obtained by filling the knapsack with a single item. In total, we obtain a profit of $v = b \cdot (v_k / w_k)$ dollars.
It is easy to prove that the max ratio method for selecting the best item is correct
for any number of items and any right-hand side (total weight restriction). It should
be emphasized that the variables in (2.5) are not restricted to be integers. Normally,
when we say a “knapsack problem,” we refer to a problem like (2.2) but with
integer constraints. To distinguish the difference, we call (2.2) a “fractional knap-
sack problem.”
Example 2 Consider another merchant who goes to the farmers market. His goal is
not to get the maximum amount of cash but to minimize the total weight of his
knapsack as long as he can receive enough money. He also has the choice of three
kinds of bread: raisin, wheat, and white (see (2.6)). Then his problem becomes
minimizing the total weight subject to the amount of cash received being at least,
say, $30, as shown in (2.7).
\[
\begin{aligned}
\min\; z &= w_1 x_1 + w_2 x_2 + w_3 x_3 \\
\text{subject to}\;\; v_1 x_1 + v_2 x_2 + v_3 x_3 &\ge c \\
x_j &\ge 0
\end{aligned}
\tag{2.6}
\]
where vj is the value associated with item j, wj is the weight associated with item j,
and c is the minimum amount of cash received.
When the objective function is to minimize the total weight, we also take ratios
associated with the items, but we want the minimum ratio of weight to value. We
have

\[
\frac{11}{12} < \frac{10}{10} < \frac{9}{1}
\]

and the minimum ratio is 11/12, which means that we should carry 30/12 = 2.5 loaves of the raisin bread, and the total weight of the knapsack is 11 lbs × 2.5 = 27.5 lbs.
Thus, in a fractional knapsack problem, we always take a ratio involving weight
and value. To maximize profit, we use the maximum ratio of value to weight. To
minimize weight, we use the reciprocal, that is, the minimum ratio of weight to
value. The ratio method is always correct for a single constraint and no integer
restrictions.
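Both uses of the ratio method fit in a few lines of Python (the function names are our own choices; as in the text, this assumes a single constraint and no integrality restrictions):

```python
def max_value(values, weights, capacity):
    """Fractional knapsack: fill the knapsack with the single item of
    maximum value-to-weight ratio."""
    k = max(range(len(values)), key=lambda j: values[j] / weights[j])
    amount = capacity / weights[k]          # x_k = b / w_k
    return k, amount, capacity * values[k] / weights[k]

def min_weight(values, weights, cash_needed):
    """Minimization variant: meet a cash target with the single item of
    minimum weight-to-value ratio."""
    k = min(range(len(values)), key=lambda j: weights[j] / values[j])
    amount = cash_needed / values[k]        # x_k = c / v_k
    return k, amount, cash_needed * weights[k] / values[k]

values, weights = [12, 10, 1], [11, 10, 9]   # raisin, wheat, white bread
print(max_value(values, weights, 20))        # item 0, 20/11 loaves, ~$21.82
print(min_weight(values, weights, 30))       # item 0, 2.5 loaves, 27.5 lbs
```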
Example 3 Consider a merchant who mixes apple juice and orange juice, in different proportions, into three drinks A, B, and C:

\[
\begin{aligned}
\max\; z &= x_1 + x_2 + x_3 \\
\text{subject to}\;\;
\begin{bmatrix} 4 \\ 1 \end{bmatrix} x_1 +
\begin{bmatrix} 1 \\ 4 \end{bmatrix} x_2 +
\begin{bmatrix} 3 \\ 2 \end{bmatrix} x_3 &\le
\begin{bmatrix} 15 \\ 15 \end{bmatrix} \\
x_j &\ge 0
\end{aligned}
\tag{2.8}
\]
where x1, x2, and x3 are the amounts of drinks A, B, and C, respectively.
Definition Mixing the apple juice and orange juice into drink A in a specified proportion is an activity. That is, an activity is a column vector in the above linear program, e.g., $\begin{bmatrix} 4 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 4 \end{bmatrix}$, or $\begin{bmatrix} 3 \\ 2 \end{bmatrix}$. Also in the example above, the amount of a drink that is produced is called the activity level, e.g., x1 or x2.
If the activity level is zero, i.e., x1 ¼ 0, it means that we do not select that activity
at all. In this example, we have three activities, so we have three activity levels to
choose. (In fact, the book of T. C. Koopmans (ed.) in 1951 is called Activity
Analysis of Production and Allocation.)
Here, x1, x2, and x3 are the amounts of drinks A, B, and C, respectively, that the
merchant will carry to the market. By trial and error, we might find that an optimum
solution is x1 = 3, x2 = 3, and x3 = 0, or [x1, x2, x3] = [3, 3, 0].
Note that every drink needs five parts of juice, and if we take the ratio of value to
weight as before, all the ratios are equal. We have
\[
\frac{1}{4+1} = \frac{1}{1+4} = \frac{1}{3+2} = \frac{1}{5}
\tag{2.9}
\]
The reason that we do not select drinks A and C or B and C in the optimum solution
is that drinks A and B are more compatible in the sense that they do not compete for
the same kinds of juices heavily.
Definition Two activities are compatible if the right-hand side (RHS) of the
constraint function can be expressed as a sum of the two activities with
non-negative activity level coefficients.
The notion of compatibility is central to a linear program with more than one constraint. To illustrate this point, we let the ratio of apple juice to orange juice for drink A be 2:3. We keep the ratios for drink B and drink C the same. Thus, we have the linear program

\[
\begin{aligned}
\max\; v &= x_1 + x_2 + x_3 \\
\text{subject to}\;\;
\begin{bmatrix} 2 \\ 3 \end{bmatrix} x_1 +
\begin{bmatrix} 1 \\ 4 \end{bmatrix} x_2 +
\begin{bmatrix} 3 \\ 2 \end{bmatrix} x_3 &\le
\begin{bmatrix} 15 \\ 15 \end{bmatrix} \\
x_j &\ge 0
\end{aligned}
\tag{2.10}
\]
In matrix notation, such a program is

\[
\begin{aligned}
\max\; v &= cx \\
\text{subject to}\;\; Ax &\le b \\
x &\ge 0
\end{aligned}
\tag{2.12}
\]

Now consider a homemaker shopping to meet the family's vitamin requirements. We shall represent
every food in the supermarket as a vector with two components, the first component
being the amount of vitamin A the food contains and the second component being
the amount of vitamin B it contains. For example, we shall represent beef as [3, 1],
meaning that a pound of beef contains three units of vitamin A and one unit of
vitamin B. Similarly, we may represent wheat as [1, 1]. We may also potentially represent a food as [−1, 2], meaning that that particular food will destroy one unit of vitamin A but provide two units of vitamin B.
Thus, the process of deciding to buy a particular food is equivalent to selecting a
column vector aj in the matrix A. There are n column vectors, so we say that there
are n activities. The amount of a particular food j to be purchased is called its
activity level and is denoted by xj. For example, if we associate j with beef, then
xj ¼ 3 means that we should buy three pounds of beef, and xj ¼ 0 means that we
should not buy any beef. Since we can only buy from the supermarket, it is natural
to require $x_j \ge 0$. The unit cost of food j is denoted by $c_j$, so the total bill for all purchases is $\sum_j c_j x_j$. The total amount of vitamin A in all the foodstuffs purchased is $\sum_j a_{1j} x_j$, and similarly, the amount of vitamin B is $\sum_j a_{2j} x_j$. As such, the linear program describing our problem is

\[
\begin{aligned}
\min\; z &= \sum_j c_j x_j \\
\text{subject to}\;\; \sum_j a_{ij} x_j &\ge b_i \quad (i = 1, \ldots, m;\; j = 1, \ldots, n) \\
x_j &\ge 0
\end{aligned}
\tag{2.13}
\]
Example 4 (Homemaker Problem) Assume that the supermarket stocks four kinds
of food costing $15, $7, $4, and $6 per pound, and that the relevant nutritional
characteristics of the food can be represented as
\[
\begin{bmatrix} 3 \\ 1 \end{bmatrix},\;
\begin{bmatrix} 1 \\ 1 \end{bmatrix},\;
\begin{bmatrix} 0 \\ 1 \end{bmatrix},\;
\begin{bmatrix} -1 \\ 2 \end{bmatrix}.
\]

If we know the nutritional requirements for vitamins A and B for the whole family are [3, 5], then we have the following linear program:

\[
\begin{aligned}
\min\; z &= 15x_1 + 7x_2 + 4x_3 + 6x_4 \\
\text{subject to}\;\; 3x_1 + x_2 - x_4 &\ge 3 \\
x_1 + x_2 + x_3 + 2x_4 &\ge 5 \\
x_j &\ge 0
\end{aligned}
\]
Example 5 (Pill Salesperson) Let us twist the Homemaker Problem a little bit and
consider a vitamin pill salesperson who wants to compete with the supermarket.
Since the taste and other properties of the food are of no concern (by assumption),
the salesperson merely wants to provide pills that contain equivalent nutrition at a
lower cost than the food. Let us assume that there are two kinds of pill, one for
vitamin A and one for vitamin B, and each pill supplies one unit of its vitamin.
Suppose that the salesperson sets the price of vitamin A pills at y1 and vitamin B
pills at y2. If the prices satisfy the constraints
\[
\begin{aligned}
3y_1 + y_2 &\le 15 \\
y_1 + y_2 &\le 7 \\
y_2 &\le 4 \\
-y_1 + 2y_2 &\le 6
\end{aligned}
\]
then no matter what combination of food items we select from the supermarket, it is
always cheaper to satisfy our nutritional requirements by buying the pills. Since the
requirements for vitamins A and B are [3, 5], the total amount that we have to pay
the salesperson is $3y_1 + 5y_2$. Of course, in setting prices, the salesperson would like to maximize the amount he receives for his goods. Once again, we can represent his problem as a linear program:

\[
\begin{aligned}
\max\; w &= 3y_1 + 5y_2 \\
\text{subject to}\;\; 3y_1 + y_2 &\le 15 \\
y_1 + y_2 &\le 7 \\
y_2 &\le 4 \\
-y_1 + 2y_2 &\le 6 \\
y_1, y_2 &\ge 0
\end{aligned}
\]
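For readers who want to experiment, here is a sketch using SciPy's linprog on the data reconstructed above (linprog minimizes, so the salesperson's maximization is encoded by negating its objective, and each ≥ constraint is negated into ≤ form; variables are non-negative by default):

```python
from scipy.optimize import linprog

# Homemaker: min 15x1 + 7x2 + 4x3 + 6x4  subject to  Ax >= b, x >= 0.
c = [15, 7, 4, 6]
A = [[3, 1, 0, -1],
     [1, 1, 1, 2]]
b = [3, 5]
primal = linprog(c, A_ub=[[-a for a in row] for row in A],
                 b_ub=[-v for v in b])

# Pill salesperson: max 3y1 + 5y2  subject to  yA <= c, y >= 0.
dual = linprog([-3, -5],
               A_ub=[[3, 1], [1, 1], [0, 1], [-1, 2]],
               b_ub=[15, 7, 4, 6])

print(primal.fun, -dual.fun)  # the two optimal values coincide (duality)
```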
Now, let us study a general method to solve a linear program with a single constraint. The following are two linear programs, each of which has a single constraint:

\[
\begin{aligned}
\min\; z &= x_1 + 2x_2 + 3x_3 \\
\text{subject to}\;\; 4x_1 + 5x_2 + 6x_3 &\ge 60, \quad x_j \ge 0
\end{aligned}
\]

\[
\begin{aligned}
\max\; v &= x_1 + 2x_2 + 3x_3 \\
\text{subject to}\;\; 6x_1 + 5x_2 + 4x_3 &\le 60, \quad x_j \ge 0
\end{aligned}
\]

To solve the minimization problem, let us take one variable at a time. To satisfy the constraint using x1 alone, we need
\[
x_1 = \frac{60}{4} = 15 \quad\text{and}\quad z = \$15
\]
\[
\text{For } x_2, \;\; x_2 = \frac{60}{5} = 12 \quad\text{and}\quad z = \$24
\]
\[
\text{For } x_3, \;\; x_3 = \frac{60}{6} = 10 \quad\text{and}\quad z = \$30
\]
The intuitive idea is to take the ratios 1/4, 2/5, and 3/6 and select the minimum ratio if we want to minimize. If we want to maximize, the ratios should be 1/6, 2/5, and 3/4, and we select the maximum ratio.
\[
\text{For } x_1, \;\; x_1 = \frac{60}{6} = 10 \quad\text{and}\quad v = \$10
\]
\[
\text{For } x_2, \;\; x_2 = \frac{60}{5} = 12 \quad\text{and}\quad v = \$24
\]
\[
\text{For } x_3, \;\; x_3 = \frac{60}{4} = 15 \quad\text{and}\quad v = \$45
\]
To formalize the idea, let the linear program be
\[
\begin{aligned}
\min\; z &= c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \\
\text{subject to}\;\; a_1 x_1 + a_2 x_2 + \cdots + a_n x_n &\ge b \\
x_j &\ge 0 \quad (j = 1, \ldots, n)
\end{aligned}
\tag{2.18}
\]

\[
\begin{aligned}
\max\; v &= c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \\
\text{subject to}\;\; a_1 x_1 + a_2 x_2 + \cdots + a_n x_n &\le b \\
x_j &\ge 0
\end{aligned}
\tag{2.19}
\]
Let us now consider the Homemaker Problem in Example 4 again, but with a
different set of data. Assume that the supermarket has four kinds of food, each kind
labeled by its vitamin contents of vitamin A and vitamin B, as follows:
\[
\begin{bmatrix} 4 \\ 1 \end{bmatrix},\;
\begin{bmatrix} 1 \\ 4 \end{bmatrix},\;
\begin{bmatrix} 3 \\ 2 \end{bmatrix},\;
\begin{bmatrix} 4 \\ 8 \end{bmatrix}.
\]
All items cost one dollar per pound, except the fourth item which costs two dollars
per pound. And the homemaker needs 6 units of vitamin A and 6 units of vitamin B
for his family. So the homemaker’s linear program becomes
\[
\begin{aligned}
\min\; z &= x_1 + x_2 + x_3 + x_4 \\
\text{subject to}\;\;
\begin{bmatrix} 4 \\ 1 \end{bmatrix} x_1 +
\begin{bmatrix} 1 \\ 4 \end{bmatrix} x_2 +
\begin{bmatrix} 3 \\ 2 \end{bmatrix} x_3 +
\begin{bmatrix} 2 \\ 4 \end{bmatrix} x_4 &\ge
\begin{bmatrix} 6 \\ 6 \end{bmatrix} \\
x_j &\ge 0 \quad (j = 1, 2, 3, 4)
\end{aligned}
\tag{2.20}
\]
Note that the vector $\begin{bmatrix} 4 \\ 8 \end{bmatrix}$ has been normalized to $\begin{bmatrix} 2 \\ 4 \end{bmatrix}$, which costs one dollar. Note also
that (2.20) has two constraints, and we want to solve the problem by a graphic
method. In Figure 2.6, the horizontal axis measures the amount of vitamin A and the
vertical axis measures the amount of vitamin B of the item. Also, there is a line from the origin to $\begin{bmatrix} 6 \\ 6 \end{bmatrix}$.
Since we have two constraints, we need to select the best pair of items. The pair which intersects the 45° line furthest from the origin is the cheapest pair. For (2.20), $x_1 = \frac{6}{7}$, $x_4 = \frac{9}{7}$, and $z = \frac{15}{7}$.
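To check this pair, substitute into the constraints of (2.20); both are met with equality:

\[
4 \cdot \tfrac{6}{7} + 2 \cdot \tfrac{9}{7} = \tfrac{24 + 18}{7} = 6,
\qquad
1 \cdot \tfrac{6}{7} + 4 \cdot \tfrac{9}{7} = \tfrac{6 + 36}{7} = 6,
\qquad
z = \tfrac{6}{7} + \tfrac{9}{7} = \tfrac{15}{7}.
\]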
The Pill Salesperson Problem corresponding to (2.20) becomes (2.21) where the
vitamin A pill costs π 1 dollars and the vitamin B pill costs π 2 dollars:
\[
\begin{aligned}
\max\; v &= 6\pi_1 + 6\pi_2 \\
\text{subject to}\;\; 4\pi_1 + \pi_2 &\le 1 \\
\pi_1 + 4\pi_2 &\le 1 \\
3\pi_1 + 2\pi_2 &\le 1 \\
2\pi_1 + 4\pi_2 &\le 1
\end{aligned}
\tag{2.21}
\]
2.3 Exercises
1. Draw the solution space graphically for the following problems. Then, deter-
mine the optimum fractional solutions using your graph.
(a) Producing one copy of A requires 3 units of water and 4 units of electricity,
while producing one copy of B requires 7 units of water and 6 units of electricity.
The profits of products A and B are 2 units and 4 units, respectively. Assuming
that the factory can use up to 21 units of water and 24 units of electricity per
day, how should the resources be allocated in order to maximize profits?
(b) Suppose there are two brands of vitamin pills. One pill of Brand A costs $3
and contains 30 units of vitamin X and 10 units of vitamin Y. One pill of
Brand B costs $4 and contains 15 units of vitamin X and 15 units of vitamin
Y. If you need 60 units of vitamin X and 30 units of vitamin Y and want to minimize your spending, how will you purchase your pills?
2. Why do we use max x0,

\[
x_0 - c_1 x_1 - c_2 x_2 - \cdots - c_n x_n = 0,
\]

and use min z,

\[
-z + c_1 x_1 + c_2 x_2 + \cdots + c_n x_n = 0?
\]
(b)
\[
\begin{aligned}
\max\; z &= 3x_1 + 8x_2 + x_3 \\
\text{subject to}\;\; 6x_1 + 15x_2 + 3x_3 &\le 90 \\
x_j &\ge 0
\end{aligned}
\]
7. In Exercises 5 and 6, both constraints are equations. Will the solutions change if
the constraints are inequalities?
8. Prove that the ratio method works correctly for both maximizing and minimizing
the objective function.
This chapter covered the basics of linear programming. You should now be
comfortable with the concept of a linear program, how the ratio method works, and
the notation that we have been using in order to express linear programs. It is highly
advisable for you to be comfortable with the material in this chapter before
moving on.
Also, you may have noticed that the Homemaker Problem and the Pill Salesper-
son Problem in the last section are closely related. It turns out that they are dual
linear programs. We will be revisiting these problems later in Chapter 5 as we
explore duality theory, and you will see why they both yielded the same optimal
value for the objective function.
In the next chapter, we will discuss solution spaces and some properties of linear
programs. Let’s go!
3 Dimension of the Solution Space
In this chapter, we will discuss solution spaces, convex sets and convex functions,
the geometry of linear programs, and the theorem of separating hyperplanes. We
will apply what we cover in this chapter to build our understanding of the Simplex
Method, duality theory, and other concepts in subsequent chapters.
3.1 Solution Spaces

Before we try to solve a linear program, let us first solve a simple set of simultaneous equations:

\[
\begin{aligned}
x_1 + x_2 &= 5 \\
x_1 - x_2 &= 1
\end{aligned}
\tag{3.1}
\]

Adding and subtracting the two equations gives the solution

\[
\begin{aligned}
x_1 &= 3 \\
x_2 &= 2
\end{aligned}
\tag{3.2}
\]
[Fig. 3.1 Lines and their intersections: x1 + x2 = 5, x1 − x2 = 1, x1 = 3, and x2 = 2.]
In order to fix a point on the line x1 + x2 = 5, we need to specify one parameter, say x1 = 3.8. Then the point is uniquely determined as [x1, x2] = [3.8, 1.2]. Hence, because we need to specify one parameter, we say that the line x1 + x2 = 5 is of one dimension. In a three-dimensional space, the equation x1 + 2x2 + 3x3 = 6 defines a plane, and that plane is of two dimensions. In order to fix a point on the plane, we need to specify two parameters, say x1 = 2 and x2 = 1, in which case the third parameter, and hence the point, is uniquely determined:

\[
x_3 = \frac{6 - x_1 - 2x_2}{3} = \frac{6 - 2 - 2(1)}{3} = \frac{2}{3}
\]
Similarly, the circle $x_1^2 + x_2^2 = 25$ is of one dimension since we need to specify only one parameter, the angle from the x1 axis, to determine a point uniquely.
If we impose the constraints

\[
\begin{aligned}
x_1 + x_2 &\le 5 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]

then the solution space is a triangular portion of the plane, as we can see in Figure 3.1, and still of two dimensions.
We can now see that each inequality constraint defines a half-space, and the
solution space is the intersection of all the half-spaces. We have also discussed the
dimension of a solution space of a linear program and now want to talk about the
shape of a solution space. It turns out the solution space always corresponds to a
convex set.
3.2 Convex Sets and Convex Functions

If we consider a two-dimensional convex set as the area of a room, then a person
can be anywhere inside the room, and he can see all other points in the convex set.
Our formal definition of a convex set will capture this idea. In Figure 3.2 the top
three shapes represent convex sets, while the bottom three do not.
Definition A set X is convex if for any two points x1 and x2 in the set, all points lying on the line segment connecting x1 and x2 also belong to the set. In other words, all points of the form

\[
\lambda x_1 + (1 - \lambda) x_2 \quad (0 \le \lambda \le 1)
\]

also belong to X.
For example, all points satisfying

\[
\begin{aligned}
x_1 + x_2 &\le 5 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]

form a convex set. All points on the x1–x2 plane also form a convex set.
Fig. 3.2 Examples of convex (top row) and non-convex (bottom row) sets
Definition A point x of a convex set X is an extreme point of X if

\[
x = \lambda x_1 + (1 - \lambda) x_2 \quad (0 < \lambda < 1)
\]

implies $x_1 = x_2 = x$, or $x_1 \notin X$, or $x_2 \notin X$. In other words, there is no way to express an extreme point x of convex set X as an interior point of a line segment where the two distinct endpoints x1 and x2 both belong to X.

Definition A function f(x) defined on a convex set X is a convex function if

\[
f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2) \quad (0 \le \lambda \le 1)
\]

for all x1 and x2 in X; it is strictly convex if the inequality is strict for $0 < \lambda < 1$ and $x_1 \ne x_2$. A function f(x) is concave if $-f(x)$ is convex.
Note that a convex function is always defined on a convex set X. Otherwise, the
point λx1 þ ð1 λÞx2 may not be in X. Geometrically, if we consider X as a plane
and f(x) as a surface plotted above the plane, then the surface of the convex function
has the property that a line segment connecting any two points on the surface lies
entirely above or on the surface. For a strictly convex function, the line segment lies
entirely above the surface except at the two endpoints.
Theorem 3.1 If f(x) is a convex function defined on a closed and bounded convex
set X, then a local minimum (strict or not strict) of f(x) is a global minimum of f(x).
Proof Let f(x) have a local minimum at x0, i.e., there exists a neighborhood of x0 such that $f(x) \ge f(x_0)$ for $|x - x_0| < \varepsilon$. Suppose there were another point x* in X with $f(x^*) < f(x_0)$. All the points of the form $\lambda x^* + (1 - \lambda)x_0$ $(0 \le \lambda \le 1)$ belong to X since X is convex. We can take λ sufficiently small, e.g., $\varepsilon/2$, such that $x = \lambda x^* + (1 - \lambda)x_0$ is in the ε-neighborhood of x0. By the assumption that x0 is a local minimum of f(x), we have $f(x) \ge f(x_0)$. But by the convexity of f,

\[
f(x) = f(\lambda x^* + (1 - \lambda)x_0) \le \lambda f(x^*) + (1 - \lambda) f(x_0) < f(x_0),
\]

a contradiction. Hence x0 is a global minimum. □
Theorem 3.2 If f(x) is a concave function defined on a closed and bounded convex
set X, then a global minimum of f(x) exists at an extreme point of X.
Proof Let $v_i$ be the extreme points of X and let $x = \sum_i \lambda_i v_i$ with $\sum_i \lambda_i = 1$, $\lambda_i \ge 0$. Then

\[
f(x) = f\Big(\sum_i \lambda_i v_i\Big) \ge \sum_i \lambda_i f(v_i) \ge \min_i f(v_i). \qquad \square
\]
A linear function is both a convex function and a concave function. Thus, for a
linear function, a local minimum is a global minimum, and a global minimum
occurs at an extreme point.
3.3 Geometric Interpretation of Linear Programs

Consider the linear program

\[
\begin{aligned}
\max\; z &= cx \\
\text{subject to}\;\; Ax &\le b \\
x &\ge 0
\end{aligned}
\tag{3.5}
\]
Every point in the convex set (also known as a convex polytope) is a feasible
solution of the linear program. Among all feasible solutions, we want the point
which maximizes the objective function z ¼ cx. Consider a set of parallel
hyperplanes

\[
3x_1 + 2x_2 + 4x_3 = k
\]

Each hyperplane has the same normal vector (3, 2, 4). We can imagine a hyperplane with normal vector (3, 2, 4) moving along its normal vector in x1–x2–x3 space; for each position, there is an associated value of the parameter k, such as 16, 14, 12, etc. At certain positions, the hyperplane intersects the convex set defined by the constraints of the linear program.
If we imagine that the hyperplane moves from smaller values of the parameter
k to larger values, then there will be an intersection point of the convex set with the
hyperplane of largest value, which is the optimum solution of the linear program.
In general, the optimum solution of a linear program is a vertex of the convex
polytope (see Figure 3.3a). However, if the hyperplane defined by the objective
function z ¼ cx is parallel to one of the faces of the convex polytope, then every
point on the face is an optimum solution (see Figure 3.3b).
Consider the linear program
\[
\begin{aligned}
\min\; z &= \sum_j c_j x_j \quad (j = 1, \ldots, n) \\
\text{subject to}\;\; \sum_j a_{ij} x_j &= b_i \quad (i = 1, \ldots, m) \\
x_j &\ge 0
\end{aligned}
\tag{3.6}
\]
In most cases, n is much larger than m. Let us assume that n = 100 and m = 2 so that the activity space is of 100 − 2 = 98 dimensions.

[Fig. 3.3 The optimal solution of a linear program: (a) a vertex of a convex polytope; and (b) every point on a face.]

There is another way of
viewing the linear program geometrically. Namely, we have 100 column vectors,
each having two components. In the Homemaker Problem, the 100 column vectors
represent 100 kinds of food, each containing different amounts of vitamins A and
B. The nutrition requirements of the homemaker’s family are also represented by a
two-component vector b with components b1, the amount of vitamin A, and b2, the
amount of vitamin B.
Under the linear program model, we can buy half a loaf of bread for one dollar if
the whole loaf is priced at two dollars and one third of a watermelon for one dollar if
the whole watermelon is three dollars. For simplicity, we can let each column
vector have two components which represent the amount of vitamins A and B in
one dollar’s worth of that kind of food.
Let us now set up a coordinate system where the horizontal axis denotes the
amount of vitamin A contained in an item and the vertical axis denotes the amount
of vitamin B contained in an item. We can plot the 100 items as 100 vectors where
the length of a vector is determined by what one dollar can buy. The nutrition
requirement vector b can also be plotted on the same diagram. This is shown in
Figure 3.4 where only five vectors instead of 100 vectors are plotted.
Note that these vectors follow the rule of vector addition. If $a_1 = [a_{11}, a_{21}]$ and $a_2 = [a_{12}, a_{22}]$, then $a_1 + a_2 = a_3$, where $a_3$ has the components $[a_{11} + a_{12}, a_{21} + a_{22}]$. In this two-dimensional space, we want to select two vectors of minimum total cost, with each component of their vector sum greater than or equal to the corresponding component of b.
[Fig. 3.4 Food vectors and the requirement vector b plotted in the vitamin A–vitamin B plane, illustrating the two cases yA ≥ 0 with yb > 0 and yA ≥ 0 with yb < 0. Fig. 3.5 The vectors a1 = [−2, 4], a2 = [−3, 3], and b = [−4, 1] together with y = (1, 2): the angles θ1 and θ2 between y and a1, a2 are less than 90°, while the angle θ3 between y and b is greater than 90°.]

In this form, the theorem of separating hyperplanes states that either b can be expressed as a non-negative combination of the columns of A, or there exists a vector y with yA ≥ 0 and yb < 0, but never both. For example, consider the system

\[
\begin{bmatrix} -2 \\ 4 \end{bmatrix} x_1 + \begin{bmatrix} -3 \\ 3 \end{bmatrix} x_2 = \begin{bmatrix} -4 \\ 1 \end{bmatrix}, \qquad x_1, x_2 \ge 0
\]
for which there is no non-negative solution x. However, consider yb for the vector y = (1, 2), as shown in Figure 3.5:

\[
\begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} -2 \\ 4 \end{bmatrix} = 6 \ge 0
\qquad
\begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} -3 \\ 3 \end{bmatrix} = 3 \ge 0
\qquad
\begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} -4 \\ 1 \end{bmatrix} = -2 < 0
\]

Thus y = (1, 2) makes an angle of less than 90° with every column of A but an angle of more than 90° with b, certifying that b cannot be a non-negative combination of the columns.
3.5 Exercises
6. We say two linear programs are equivalent if they have the same unique
optimum solutions and the dimensions of their solution spaces are the same.
Are the linear programs in (3.7)–(3.9) equivalent? Why or why not?
\[
\begin{aligned}
\max\; v &= x_1 + 2x_2 + 3x_3 \\
\text{subject to}\;\; 12x_1 + 12x_2 + 6x_3 &\le 30 \\
4x_1 + 10x_2 + 18x_3 &\le 15 \\
x_j &\ge 0
\end{aligned}
\tag{3.7}
\]

First divide the first inequality by 2, yielding this modified problem:

\[
\begin{aligned}
\max\; v &= x_1 + 2x_2 + 3x_3 \\
\text{subject to}\;\; 6x_1 + 6x_2 + 3x_3 &\le 15 \\
4x_1 + 10x_2 + 18x_3 &\le 15 \\
x_j &\ge 0
\end{aligned}
\tag{3.8}
\]

Then divide the second column by 2 and the third column by 3 (i.e., rescale the variables $x_2$ and $x_3$):

\[
\begin{aligned}
\max\; v &= x_1 + x_2 + x_3 \\
\text{subject to}\;\; 6x_1 + 3x_2 + x_3 &\le 15 \\
4x_1 + 5x_2 + 6x_3 &\le 15 \\
x_j &\ge 0
\end{aligned}
\tag{3.9}
\]
Good job on completing this chapter! We now have all the knowledge needed
before learning about fundamental linear programming techniques such as the
Simplex Method—in Chapter 4, next!
4 Introduction to the Simplex Method
In this chapter, we will learn the Simplex Method, which is a widely used technique
for solving linear programs. Even though the notation can be a bit daunting, the
technique is actually quite simple. Just keep in mind that the Simplex Method
essentially involves iterative application of the ratio test (which you already know)
and the pivot operation (which we are about to cover).
4.1 Equivalent Formulations

Here, we shall summarize some standard tricks for converting between equivalent formulations. We have already seen some conversions between inequalities
and equations, and we now systematically list all of these. More conversions that
help in formulating real-world problems as linear programs are given as “tips and
tricks” in Section 11.2 and on this book’s website.
An inequality can be converted to an equation by adding a slack variable:

\[
x_1 + 2x_2 + 3x_3 \le 10
\]

is equivalent to

\[
x_1 + 2x_2 + 3x_3 + s_1 = 10, \qquad s_1 \ge 0 \quad (s_1 \text{ is a slack variable})
\]

Conversely, an equation can be converted to a pair of inequalities:

\[
3x_1 + 2x_2 + x_3 = 10
\]

is the same as

\[
3x_1 + 2x_2 + x_3 \le 10 \quad\text{and}\quad 3x_1 + 2x_2 + x_3 \ge 10
\]
Using these conversions, we can put a linear program into an equivalent form.
The reader should convince himself or herself that these conversions are valid and
that the dimension of the solution space remains the same for all three conversions.
If you are interested, please see Chapter 11 for additional tricks in formulating
problems as linear programs!
4.2 Basic, Feasible, and Optimum Solutions
From now on, we shall always assume that the system of linear equations is
consistent and nonredundant. Recall that this means a system of linear equations
has at least one solution and that none of the equations can be expressed as a linear
combination of the other equations. Also, by convention, we will assume that there
are m rows (equations) and n columns (variables) in the matrix, and all variables are
required to be non-negative.
Definition A linear program has an unbounded solution if the objective function value can be made arbitrarily large while satisfying all constraints.
Definition Let B be a set of m independent columns of the constraint matrix. The column vectors of B are called the basic vectors. Sometimes we say that the m independent column vectors of B form the basis.
Definition A feasible solution is called a basic feasible solution if the set of column
vectors corresponding to positive components of the solution is independent.
For brevity, we shall say “solution” for a feasible solution and “basic solution” for a
basic feasible solution.
Consider the following system of equations.
\[
\begin{bmatrix} 4 \\ 8 \\ 10 \end{bmatrix} x_1 +
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} x_2 +
\begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} x_3 +
\begin{bmatrix} 0 \\ 0 \\ 3 \end{bmatrix} x_4 =
\begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}, \qquad x_j \ge 0 \;(j = 1, 2, 3, 4)
\]
Theorem 4.1 If there exists a solution to a system of linear equations, there exists
a basic solution to the linear equations.
Theorem 4.2 If there exists a feasible solution to a system of linear equations, then
there exists a basic feasible solution.
Proof Write a feasible solution to the system as

\[
\sum_{j=1}^{n} x_j a_j = b, \qquad x_j \ge 0 \;(j = 1, \ldots, n)
\tag{4.1}
\]
Suppose the columns $a_j$ with $x_j > 0$ are dependent, so that

\[
\sum_{j=1}^{n} \lambda_j a_j = 0
\tag{4.2}
\]

with not all $\lambda_j$ zero. We can assume that some $\lambda_j$ are positive. If not, we can multiply (4.2) by −1. Among those positive $\lambda_j$, one of them must give maximal value to the ratio $\lambda_j / x_j$. Without loss of generality, let that ratio be $\lambda_n / x_n$, and denote its value by θ. The value θ is positive because $\lambda_n$ and $x_n$ are positive. So, we have

\[
\theta = \max_j \frac{\lambda_j}{x_j} = \frac{\lambda_n}{x_n} > 0
\]

Subtracting $\frac{1}{\theta}$ times (4.2) from (4.1) gives

\[
\sum_{j=1}^{n} \Big( x_j - \frac{\lambda_j}{\theta} \Big) a_j = b
\tag{4.3}
\]

where every coefficient remains non-negative and the coefficient of $a_n$ becomes zero. We thus obtain a feasible solution with fewer positive components; repeating the argument yields a basic solution. □
The vertices of a convex set that is defined by linear equations and inequalities
correspond to basic feasible solutions.
Theorem 4.3 The set of all feasible solutions to a linear program is a convex set.
Proof Let x1 and x2 be two feasible solutions, i.e.,

\[
A x_1 = b \;\;(x_1 \ge 0), \qquad A x_2 = b \;\;(x_2 \ge 0)
\]

Then for $0 \le \lambda \le 1$, we have $\lambda x_1 + (1 - \lambda) x_2 \ge 0$ because each of the two terms is a product of non-negative numbers. Furthermore,

\[
A(\lambda x_1 + (1 - \lambda) x_2) = \lambda A x_1 + (1 - \lambda) A x_2 = \lambda b + (1 - \lambda) b = b,
\]

so $\lambda x_1 + (1 - \lambda) x_2$ is also a feasible solution. □
Theorem 4.4 If all feasible solutions of a linear program are bounded, any
feasible solution is a linear convex combination of basic feasible solutions.
Proof Feasible solutions form a compact convex set, where basic feasible solutions
correspond to extreme points of the convex set. Since any point of a convex set is a
linear convex combination of its extreme points,1 any feasible solution is a linear
convex combination of basic feasible solutions. □
Theorem 4.5 If there exists an optimum solution, then there exists a basic optimum
solution.
The following examples will be referenced in the next few sections. It is advised
that the reader sit with the examples and try to understand what row reductions are
being carried out before moving forward.
Example 1
\[
\begin{aligned}
\min\; z &= x_1 + x_2 + 2x_3 + x_4 \\
\text{subject to}\;\; x_1 + 2x_3 - 2x_4 &= 2 \\
x_2 + x_3 + 4x_4 &= 6 \\
x_1, x_2, x_3, x_4 &\ge 0
\end{aligned}
\tag{4.4}
\]

Writing the objective function as an equation, we have

\[
\begin{aligned}
-z + x_1 + x_2 + 2x_3 + x_4 &= 0 & (0) \\
x_1 + 2x_3 - 2x_4 &= 2 & (1) \\
x_2 + x_3 + 4x_4 &= 6 & (2)
\end{aligned}
\]
¹ A feasible solution x* is called an optimum solution if $cx^* \le cx$ for all feasible solutions x and $-\infty < cx^*$.
First, eliminate the basic variables x1 and x2 from the objective row:

\[
\begin{aligned}
-z - x_3 - x_4 &= -8 & (0') &= (0) - (1) - (2) \\
x_1 + 2x_3 - 2x_4 &= 2 & (1') &= (1) \\
x_2 + x_3 + 4x_4 &= 6 & (2') &= (2)
\end{aligned}
\]

\[
\begin{aligned}
-z + \tfrac{1}{2}x_1 - 2x_4 &= -7 & (0'') &= (0') + \tfrac{1}{2}(1') \\
\tfrac{1}{2}x_1 + x_3 - x_4 &= 1 & (1'') &= \tfrac{1}{2}(1') \\
-\tfrac{1}{2}x_1 + x_2 + 5x_4 &= 5 & (2'') &= (2') - (1'')
\end{aligned}
\]

\[
\begin{aligned}
-z + \tfrac{1}{2}x_1 - 2x_4 &= -7 & (0''') &= (0'') \\
\tfrac{1}{2}x_1 + x_3 - x_4 &= 1 & (1''') &= (1'') \\
-\tfrac{1}{10}x_1 + \tfrac{1}{5}x_2 + x_4 &= 1 & (2''') &= \tfrac{1}{5}(2'')
\end{aligned}
\]

\[
\begin{aligned}
-z + \tfrac{3}{10}x_1 + \tfrac{2}{5}x_2 &= -5 & (0'''') &= (0''') + 2(2''') \\
\tfrac{4}{10}x_1 + \tfrac{1}{5}x_2 + x_3 &= 2 & (1'''') &= (1''') + (2''') \\
-\tfrac{1}{10}x_1 + \tfrac{1}{5}x_2 + x_4 &= 1 & (2'''') &= (2''')
\end{aligned}
\]

Since all coefficients in the objective row are now non-negative, the optimum solution is
x1 = 0, x2 = 0, x3 = 2, x4 = 1, and z = 5.
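As a quick machine check of Example 1, here is a sketch using SciPy's linprog with equality constraints (not part of the original text):

```python
from scipy.optimize import linprog

# Example 1: min x1 + x2 + 2x3 + x4 subject to the two equations of (4.4).
res = linprog(c=[1, 1, 2, 1],
              A_eq=[[1, 0, 2, -2],
                    [0, 1, 1, 4]],
              b_eq=[2, 6])
print(res.x, res.fun)  # approximately [0, 0, 2, 1] and 5.0
```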
Example 2
\[
\begin{aligned}
\max\;& x_1 + 2x_2 \\
\text{subject to}\;\; -x_1 + x_2 &\le 6 \\
x_2 &\le 8 \\
x_1 + x_2 &\le 12 \\
4x_1 + x_2 &\le 36 \\
x_1, x_2 &\ge 0
\end{aligned}
\tag{4.5}
\]

Adding slack variables $s_1, \ldots, s_4$ and writing the objective as an equation:

\[
\begin{aligned}
x_0 - x_1 - 2x_2 &= 0 & (1) \\
-x_1 + x_2 + s_1 &= 6 & (2) \\
x_2 + s_2 &= 8 & (3) \\
x_1 + x_2 + s_3 &= 12 & (4) \\
4x_1 + x_2 + s_4 &= 36 & (5)
\end{aligned}
\]
\[
\begin{aligned}
x_0 + 7x_1 + 2s_4 &= 72 & (1') &= (1) + 2(5) \\
-5x_1 + s_1 - s_4 &= -30 & (2') &= (2) - (5) \\
-4x_1 + s_2 - s_4 &= -28 & (3') &= (3) - (5) \\
-3x_1 + s_3 - s_4 &= -24 & (4') &= (4) - (5) \\
4x_1 + x_2 + s_4 &= 36 & (5') &= (5)
\end{aligned}
\]

\[
\begin{aligned}
x_0 + \tfrac{7}{3}s_3 - \tfrac{1}{3}s_4 &= 16 & (1'') &= (1') + \tfrac{7}{3}(4') \\
s_1 - \tfrac{5}{3}s_3 + \tfrac{2}{3}s_4 &= 10 & (2'') &= (2') - \tfrac{5}{3}(4') \\
s_2 - \tfrac{4}{3}s_3 + \tfrac{1}{3}s_4 &= 4 & (3'') &= (3') - \tfrac{4}{3}(4') \\
x_1 - \tfrac{1}{3}s_3 + \tfrac{1}{3}s_4 &= 8 & (4'') &= -\tfrac{1}{3}(4') \\
x_2 + \tfrac{4}{3}s_3 - \tfrac{1}{3}s_4 &= 4 & (5'') &= (5') + \tfrac{4}{3}(4')
\end{aligned}
\]

\[
\begin{aligned}
x_0 + s_2 + s_3 &= 20 & (1''') &= (1'') + (3'') \\
s_1 - 2s_2 + s_3 &= 2 & (2''') &= (2'') - 2(3'') \\
3s_2 - 4s_3 + s_4 &= 12 & (3''') &= 3(3'') \\
x_1 - s_2 + s_3 &= 4 & (4''') &= (4'') - (3'') \\
x_2 + s_2 &= 8 & (5''') &= (5'') + (3'')
\end{aligned}
\]

All coefficients in the objective row are now non-negative, so the optimum solution is

\[
x_1 = 4, \;\; x_2 = 8, \;\; s_1 = 2, \;\; s_2 = 0, \;\; s_3 = 0, \;\; s_4 = 12, \quad\text{and}\quad x_0 = 20.
\]
4.4 Pivot Operation

In the last section, you may have noticed that for both examples, the row reductions
carried out at each step resulted in a change of basic variables. It turns out that at
each step we applied the pivot operation.
In short, the pivot operation swaps a basic variable and a non-basic variable. So, in
Example 2 from Section 4.3, notice how we started out with s1, s2, s3, s4 as our basic
variables, and after the first step, we had basic variables x2, s1, s2, s3. This means that
we applied the pivot operation in order to switch out s4 from our basis with x2.
Definition We call the column of the non-basic variable that we are swapping into
our basis the pivot column, we call the row that contains the nonzero basic variable
the pivot row, and we call the intersection of the pivot column and the pivot row the
pivot.
In order to swap the basic variable with the non-basic variable, we scale the pivot
row so that the coefficient of the non-basic variable is 1. Then, we add multiples of
the pivot row to all other rows so that they have a coefficient of 0 in the pivot
column. This is the pivot operation.
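In code, the pivot operation is only a few lines; here is a Python sketch on a tableau stored as a list of rows (our own illustrative formulation):

```python
def pivot(T, r, s):
    """Pivot tableau T on row r, column s: scale row r so T[r][s] == 1,
    then clear column s from every other row."""
    p = T[r][s]
    T[r] = [x / p for x in T[r]]
    for i in range(len(T)):
        if i != r and T[i][s] != 0:
            f = T[i][s]
            T[i] = [a - f * b for a, b in zip(T[i], T[r])]
```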
² Solution space is also called solution set.
4.5 Simplex Method

In the Simplex Method, we begin by solving the constraint equations with respect to an initial set of basic variables, say x1 and x2. Then we successively pick different sets of variables to be non-basic variables until we find that the value of the objective function cannot be decreased.
Let us summarize the Simplex Method as follows. Assume all constraints are
converted into m equations, and the n variables are all restricted to be non-negative
(m ≤ n).
1. Start with a basic feasible solution, say

\[
x_i = b_i \;(i = 1, \ldots, m), \qquad x_j = 0 \;(j = m + 1, \ldots, n)
\]

2. Express the objective function in terms of the non-basic variables, say

\[
\min\; z = \sum c_j x_j
\]

We shall eliminate the basic variables from the objective function so that the objective function is of the form, say,

\[
-z + \sum \bar{c}_j x_j = -60 \quad (j = m + 1, \ldots, n)
\]

or

\[
w + \sum \bar{c}_j x_j = 100 \quad (j = m + 1, \ldots, n)
\]

In both cases, if all $\bar{c}_j \ge 0$, then increasing the value of any non-basic variable will not improve the objective function, and z = 60 or w = 100 will be the value of the objective function, which is actually the optimal value. If all $\bar{c}_j > 0$, then all non-basic $x_j$ must be zero to obtain z = 60 or w = 100. The convention that $\bar{c}_j \ge 0$ implies optimality has become standard practice, and we use (−z) and (+w) in conformance with this practice.
3. If not all $\bar{c}_j \ge 0$ in the objective function, we choose the variable with the most negative coefficient, say $\bar{c}_s < 0$, and increase $x_s$ from zero to the largest value possible. We use the ratio test to determine how much $x_s$ can be increased without causing any current basic variable to become negative. The ratio test is

\[
\min_i \frac{b_i}{a_{is}} = \frac{b_r}{a_{rs}}, \qquad \forall a_{is} > 0
\]
Then we apply the pivot operation with ars as our pivot. Recall that this changes
xs from non-basic to basic and changes xr from basic to non-basic. Again, the
coefficients of all basic variables are zero in the objective function, and we
perform the optimality test.
Note that we can choose any cs < 0 and let xs be the new basic variable. But
once xs is chosen, the new non-basic variable is determined by the ratio test.
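Putting the three steps together, here is a compact Python sketch of the Simplex Method for a maximization problem with constraints Ax ≤ b, b ≥ 0 (our own illustrative code: slack variables form the initial basis, and degeneracy handling such as Bland's rule is omitted):

```python
def simplex(c, A, b):
    """Maximize c.x subject to A.x <= b, x >= 0, assuming b >= 0.
    Row 0 of the tableau holds the objective as x0 - c.x = 0; the slack
    variables form the initial basis. Returns (optimal value, x)."""
    m, n = len(A), len(c)
    T = [[-cj for cj in c] + [0.0] * m + [0.0]]          # objective row
    for i in range(m):                                    # constraint rows
        T.append(list(map(float, A[i]))
                 + [1.0 if j == i else 0.0 for j in range(m)] + [float(b[i])])
    basis = [n + i for i in range(m)]                     # slacks are basic
    while True:
        # Step 2: optimality test on the objective row.
        s = min(range(n + m), key=lambda j: T[0][j])
        if T[0][s] >= 0:
            break
        # Step 3: ratio test picks the pivot row.
        rows = [i for i in range(1, m + 1) if T[i][s] > 1e-12]
        if not rows:
            raise ValueError("unbounded")
        r = min(rows, key=lambda i: T[i][-1] / T[i][s])
        # Pivot: x_s enters the basis; row r's variable leaves.
        p = T[r][s]
        T[r] = [x / p for x in T[r]]
        for i in range(m + 1):
            if i != r and T[i][s] != 0:
                f = T[i][s]
                T[i] = [a - f * v for a, v in zip(T[i], T[r])]
        basis[r - 1] = s
    x = [0.0] * n
    for i, j in enumerate(basis):
        if j < n:
            x[j] = T[i + 1][-1]
    return T[0][-1], x

# Example 2 of Section 4.3: the optimum is x0 = 20 at (x1, x2) = (4, 8).
print(simplex([1, 2], [[-1, 1], [0, 1], [1, 1], [4, 1]], [6, 8, 12, 36]))
```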
4.6 Review and Simplex Tableau

When we solve simultaneous equations, we can work with the coefficients only,
and we need not write down the names of the variables. This technique is used in
linear programming: the computational framework is condensed into the so-called
Simplex tableau, where the names of the variables are written along the outside
edges. To illustrate the use of the tableau, we shall repeat Example 1 again.
\[
\begin{aligned}
\min\; z &= x_1 + x_2 + 2x_3 + x_4 \\
\text{subject to}\;\; x_1 + 2x_3 - 2x_4 &= 2 \\
x_2 + x_3 + 4x_4 &= 6 \\
x_1, x_2, x_3, x_4 &\ge 0
\end{aligned}
\tag{4.6}
\]

        x1   x2   x3   x4 |  RHS
-z |     1    1    2    1 |    0
x1 |     1    0    2   -2 |    2
x2 |     0    1    1    4 |    6
Note that in the above Simplex tableau, the objective function is written above
the tableau (i.e., the 0th row); the current basic variables x1 and x2 are written to the
left of the tableau.
If we subtract the first row and the second row from the 0th row expressing the objective function, we have

        x1   x2   x3   x4 |  RHS
-z |     0    0   -1   -1 |   -8
x1 |     1    0    2   -2 |    2
x2 |     0    1    1    4 |    6
Note that the labels along the upper boundary remain the same during the
computation. Each row represents an equation; the equation is obtained by
multiplying the corresponding coefficients by the variables. The double vertical
lines separate the left-hand side (LHS) from the right-hand side (RHS). The labels
along the left boundary of the tableau are the current basic variables; their values
are equal to those on the right-hand side.
\[
\begin{aligned}
\max\; w &= 3\pi_1 + 5\pi_2 \\
\text{subject to}\;\; 3\pi_1 + \pi_2 &\le 15 \\
\pi_1 + \pi_2 &\le 7 \\
\pi_2 &\le 4 \\
-\pi_1 + 2\pi_2 &\le 6
\end{aligned}
\tag{4.7}
\]
In the computational aspect of linear algebra, the goal is to find the solution to simultaneous equations, a point of zero dimension. When the number of variables n is more than the number of equations m, the solution space has n − m dimensions. When we start with any n ≥ 2 variables and one constraint, we use the Ratio Method. The Ratio Method gives the optimum solution, which has one variable with a positive value and all other variables equal to zero. When there are m ≥ 2 constraints, the linear program is

\[
\begin{aligned}
\min\; z &= cx \\
\text{subject to}\;\; Ax &= b \\
x &\ge 0
\end{aligned}
\qquad\text{or}\qquad
\begin{aligned}
\max\; x_0 &= cx \\
\text{subject to}\;\; Ax &= b \\
x &\ge 0
\end{aligned}
\]
While the RHS b has m components, m < n, we need to find m positive variables x1, x2, . . ., xm and let the rest of the variables xm+1, . . ., xn be zero. In other words, the basis has m columns, all associated with positive values, while the variables associated with the other (n − m) columns are zero.
In terms of the Homemaker Problem, the homemaker needs to buy m items if he
needs m kinds of vitamins. For the Pill Salesperson Problem, the pill salesperson
needs to fix the prices of m kinds of vitamin pills to compete with the supermarket.
In the Simplex tableau, there are m + 1 rows and n + 2 columns, with the rows associated with the current basic variables and the 0th row associated with z or x0. The n + 2 columns are associated with all variables (including z or x0), with the rightmost column representing the vector b.
To generalize the Ratio Method, we would need to find the denominator of the ratio, or multiply all columns of the matrix A by the inverse of a square matrix of size (m + 1) × (m + 1). This would mean that we need to choose an m × m square matrix. Instead, we can select the most promising column (the column with the most negative coefficient) to enter the basis, replacing the rth row and making ars = 1. The operation needs a test to ensure that all components of the RHS remain positive. The beauty of the Simplex Method is that the process of replacing a square matrix by another matrix which differs in only one column is quite simple. As folklore, we need to execute this process about 2m to 3m times, independent of n, even if n ≫ m.
Let us put the following linear program into the tableau form (Tableau 4.1).
\[
\begin{aligned}
\max\;& x_0 \\
\text{subject to}\;\; x_0 + 2x_3 - 2x_4 - x_5 &= 0 \\
x_1 - 2x_3 + x_4 + x_5 &= 4 \\
x_2 + 3x_3 - x_4 + 2x_5 &= 2
\end{aligned}
\]
Tableau 4.1

        x0   x1   x2   x3   x4   x5 |  RHS
x0 |     1    0    0    2   -2   -1 |    0
x1 |     0    1    0   -2    1*   1 |    4
x2 |     0    0    1    3   -1    2 |    2

There are two negative coefficients, −2 and −1, in the 0th row, so we select the column under x4 to enter the basis. Because the fourth column has only one positive entry, we do not need to perform the feasibility test. The entry for the pivot is indicated by an asterisk (*). Note that the labels on the left of the tableau are the current basic variables.

Tableau 4.2

        x0   x1   x2   x3   x4   x5 |  RHS
x0 |     1    2    0   -2    0    1 |    8
x4 |     0    1    0   -2    1    1 |    4
x2 |     0    1    1    1*   0    3 |    6

In Tableau 4.2, the only negative coefficient is −2, under x3, and under it there is only one positive coefficient eligible for pivoting, indicated by an asterisk (*). The result is shown in Tableau 4.3.

Tableau 4.3

        x0   x1   x2   x3   x4   x5 |  RHS
x0 |     1    4    2    0    0    7 |   20
x4 |     0    3    2    0    1    7 |   16
x3 |     0    1    1    1    0    3 |    6

In Tableau 4.3, all coefficients in the 0th row are non-negative. As such, we conclude that an optimum result is x0 = 20, x4 = 16, and x3 = 6.
Note that $B B^{-1} = I$, $B^{-1} a_j = \bar{a}_j$, and $B^{-1} b = \bar{b}$, where B is the corresponding basis matrix and $B^{-1}$ is the inverse matrix of B. We shall discuss this more in Chapter 6.
For example, the basis of Tableau 4.2 consists of the columns of x0, x4, and x2, and the basis of Tableau 4.3 consists of the columns of x0, x4, and x3:

\[
\begin{bmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & -2 & 2 \\ 0 & 1 & -2 \\ 0 & -1 & 3 \end{bmatrix}
\begin{bmatrix} 1 & 4 & 2 \\ 0 & 3 & 2 \\ 0 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

\[
\begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 2 \\ -2 \\ 3 \end{bmatrix}
=
\begin{bmatrix} -2 \\ -2 \\ 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 4 & 2 \\ 0 & 3 & 2 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 0 \\ 4 \\ 2 \end{bmatrix}
=
\begin{bmatrix} 20 \\ 16 \\ 6 \end{bmatrix}
\]

That is, $B^{-1} a_3$ gives the x3 column of Tableau 4.2, and $B^{-1} b$ gives the right-hand side of Tableau 4.3.
Now let us add another column to Tableau 4.1 and name it Tableau 4.4.
Here we choose the most negative coefficient and perform a feasibility test to choose the pivot. This gives us Tableau 4.5; Tableaus 4.6 and 4.7 are similarly obtained.
This simple example shows that choosing the most negative coefficient may not be efficient. We use this example to show that we need the feasibility test (4/1 > 2/1), and we need to choose the pivot as shown in Tableau 4.4. Up to now, there is no known "perfect" pivot selection rule.
In the Simplex Method, the choice of pivot is by the feasibility test. In rare cases,
there is no improvement of the value of the basic feasible solution after the pivot.
The way to cure it is called Bland’s rule:
Tableau 4.4
Tableau 4.5
Tableau 4.6
Tableau 4.7
1. In the case of a tie in the feasibility test, select the column with the smallest index
j to enter the basis.
2. If there are two or more columns which could be dropped out, select the column
which has the smallest index j.
4.7 Linear Programs in Matrix Notation

Consider the linear program

\[
\begin{aligned}
\min\; z &= cx \\
\text{subject to}\;\; Ax &= b \\
x &\ge 0
\end{aligned}
\tag{4.8}
\]

Assume that A = [B, N], where B is a feasible basis. Then the program (4.8) can be written as

\[
\begin{aligned}
\min\; z &= c_B x_B + c_N x_N \\
\text{subject to}\;\; B x_B + N x_N &= b \\
x_B, x_N &\ge 0
\end{aligned}
\tag{4.9}
\]

Solving for $x_B = B^{-1} b - B^{-1} N x_N$ and substituting into the objective function gives

\[
z = c_B B^{-1} b + (c_N - c_B B^{-1} N) x_N
\]

Defining $\pi = c_B B^{-1}$, this becomes

\[
z = \pi b + (c_N - \pi N) x_N
\]

Now, if we select the first m variables to be basic variables and express the objective function z in terms of the non-basic variables,

\[
z = z_0 + \sum \bar{c}_j x_j
\]

where

\[
\bar{c}_j = 0 \quad (j = 1, \ldots, m), \qquad \bar{c}_j = c_j - \pi a_j \quad (j = m + 1, \ldots, n)
\]
Notice that if the relative cost is negative, i.e., $\bar{c}_j = c_j - \pi a_j < 0$, then we should bring the corresponding $x_j$ into the basis.
Definition This operation of changing the basis is the result of what is sometimes
called the pricing operation. In the economic interpretation it corresponds to
lowering the price of an underutilized resource.
Definition The component πi is called the shadow price of row i under the present basis.
In Step 2, we could choose any $a_{0j} = \bar{c}_j < 0$ and the algorithm will still work. The criterion $\min_j \bar{c}_j < 0$ is used analogously to the steepest descent (greedy) algorithm. In Step 3, the test $\min_i a_{i0} / a_{is}$ for selecting the pivot row is called the ratio test. It is used to maintain $a_{i0} \ge 0$.
The three steps are repeated until in Step 2 there are no negative coefficients in
the 0th row. Then the optimum solution is obtained by setting the basic variables xi
on the left of the tableau equal to ai0 in that current tableau.
4.10 Exercises
\[
\begin{aligned}
x_1 + x_2 + x_3 &= 30 \\
2x_1 + 2x_2 + 2x_3 &= 60 \\
x_1 + 2x_2 + 3x_3 &= 60
\end{aligned}
\]

\[
\begin{aligned}
12x + 5y + u &= 30 \\
x + 2y + v &= 12 \\
3x - 2y + w &= 0
\end{aligned}
\]
\[
\begin{aligned}
\max\; x_0 &= x_1 + x_2 + x_3 + x_4 \\
\text{subject to}\;\;
\begin{bmatrix} 9 \\ 1 \end{bmatrix} x_1 +
\begin{bmatrix} 1 \\ 9 \end{bmatrix} x_2 +
\begin{bmatrix} 5 \\ 4 \end{bmatrix} x_3 +
\begin{bmatrix} 4 \\ 8 \end{bmatrix} x_4 &=
\begin{bmatrix} 11 \\ 11 \end{bmatrix} \\
x_j &\ge 0 \;(j = 1, 2, 3, 4)
\end{aligned}
\]
Without doing the actual computations, could you find the optimal basic
variables? Use a graph if you need it.
7. Solve the following linear programming problem and draw the solution space
graphically. How would you describe this particular solution space?
\[
\begin{aligned}
\max\;& 8x + 5y \\
\text{subject to}\;\; x + y &\le 350 \\
x &\le 200 \\
y &\le 200 \\
x, y &\ge 0
\end{aligned}
\]
8. For the following matrices, apply the pivot operation to the bolded elements:

(a)
\[
\begin{bmatrix} 2 & 3 & 3 \\ 2 & 4 & 2 \\ 3 & 5 & 1 \end{bmatrix}
\]

(b)
\[
\begin{bmatrix} 1 & 2 & 2 & 8 \\ 0 & 3 & 5 & 1 \\ 6 & 5 & 3 & 5 \end{bmatrix}
\]
9. Do the labels (names) of all variables above the Simplex tableau change during
the computation? Do the labels (names) of variables to the left of the Simplex
tableau change? Are these names of the current basic variables or of the current
non-basic variables?
10. Solve the following linear programming problems using the Simplex Method:

(a)
\[
\begin{aligned}
\max\;& 2x + y \\
\text{subject to}\;\; 4x + y &\le 150 \\
2x - 3y &\le 40 \\
x, y &\ge 0
\end{aligned}
\]

(b)
\[
\begin{aligned}
\min\;& 6x - 4y + 2z \\
\text{subject to}\;\; x + y + 4z &\le 20 \\
5y + 5z &\le 100 \\
x + 3y + z &\le 400 \\
x, y, z &\ge 0
\end{aligned}
\]

(c)
\[
\begin{aligned}
\max\;& 3x + 5y + 2z \\
\text{subject to}\;\; 5x + y + 4z &\le 50 \\
x - y + z &\le 150 \\
2x + y + 2z &\le 100 \\
x, y, z &\ge 0
\end{aligned}
\]
11. Solve the following linear programming problems using the Simplex Method.
Note that some variables are unrestricted:
(a)
\[
\begin{aligned}
\min\;& x + y \\
\text{subject to}\;\; x + 2y &\ge 100 \\
3x - 6y &\le 650 \\
y \ge 0&, \; x \text{ unrestricted}
\end{aligned}
\]

(b)
\[
\begin{aligned}
\max\;& x + y + 2z \\
\text{subject to}\;\; 5x + y + z &\le 240 \\
x - y + z &\le 50 \\
2x + y + 2z &\le 400 \\
x, z \ge 0&, \; y \text{ unrestricted}
\end{aligned}
\]
12. You need to buy pills in order to help with your vitamin and supplement
deficiency in calcium, iron, and vitamin A. Each pill of Type I contains
6 units of calcium, 2 units of iron, and 3 units of vitamin A and costs $0.10.
Each pill of Type II contains 1 unit of calcium, 4 units of iron, and 7 units of
vitamin A and costs $0.12. You need a minimum of 250 units of calcium,
350 units of iron, and 420 units of vitamin A. How should you purchase pills of
Type I and Type II in order to minimize your spending?
13. The K Company produces three types (A, B, C) of accessories which yield
profits of $7, $5, and $3, respectively. To manufacture an A accessory, it takes
3 minutes on machine I, 2 minutes on machine II, and 2 minutes on machine III.
In order to manufacture a B accessory, it takes 1 minute, 3 minutes, and
1 minute on machines I, II, and III, respectively. In order to manufacture a C
accessory, it takes 1 minute, 2 minutes, and 2 minutes on machines I, II, and III,
respectively. There are 10 hours available on machine I, 7 hours on machine II,
and 8 hours on machine III for manufacturing these accessories each day. How
many accessories of each type should the K Company manufacture each day in
order to maximize its profit?
Duality and Complementary Slackness
5
In this chapter, we will develop the concept of duality, as well as the related
theorem of complementary slackness which not only tells us when we have optimal
solutions, but also leads us to the Dual Simplex Method.
min cx                    max yb
subject to Ax ≥ b         subject to yA ≤ c
x ≥ 0                     y ≥ 0
Definition It turns out that for every linear program, there is another linear
program closely associated with it. The two linear programs form a pair. If we
call the first linear program the primal program, then we call the second linear
program the dual program.
x₁ = x₂ = x₃ = 1 with z = 6
and
y₁ = y₂ = 1/12 with w = 3/2
Notice that the value of the minimized objective function z is greater than or equal
to the value of the maximized objective function w. It turns out that this is always
the case. This is analogous to the Max-Flow Min-Cut Theorem, which states that
the value of a network cut is always greater than or equal to the value of the flow
across that cut. In our example, the optimal solutions are
x₁ = 7/4, x₂ = x₃ = 0 with z = 7/4
and
y₁ = 1/4, y₂ = 0 with w = 7/4
The key thing to note in this example is that z = w.
In a linear program, such as the Homemaker Problem, we always write the
objective function as an equation
min z:  −z + c₁x₁ + c₂x₂ + ⋯ + cₙxₙ = 0
max x₀:  x₀ − c₁x₁ − c₂x₂ − ⋯ − cₙxₙ = 0
In both cases, the coefficients cⱼ can be either negative or positive. Notice that if we have all cⱼ ≥ 0 in the tableau in the minimization problem, then the current value of z is optimum.
Recall the Homemaker Problem and the Pill Salesperson Problem of Chapter 2. The
homemaker and the pill salesperson wanted to minimize the cost and maximize
the profit, respectively. These were not random examples. It turns out that most
linear programs exist in pairs. Furthermore, many interesting relations can exist
between a given pair of linear programs. Chief among these is the duality relation-
ship. Consider the following pair of linear programs, shown side by side.
For ease of exposition, we shall use the canonical form in our discussion. Notice
the symmetry between the two programs in the canonical form.
Theorem 5.1 (Theorem of Duality) Given a pair of primal and dual programs
(in canonical form), exactly one of the following cases must be true.
1. Both programs have optimum solutions and their values are the same,
i.e., min z = max w, that is, min cx = max yb.
2. One program has no feasible solution, and the other program has at least one
feasible solution, but no (finite) optimum solution.
3. Neither of the two programs has a feasible solution.
Cases 2 and 3 are easy to understand. To prove case 1, we apply the theorem of
separating hyperplanes from Section 3.4.
Lemma 5.1 If x and y are feasible solutions to a pair of primal and dual programs
(shown as follows), then cx ≥ yb.
min cx                    max yb
subject to Ax ≥ b         subject to yA ≤ c
x ≥ 0                     y ≥ 0

y(Ax − b) = 0
(c − yA)x = 0
α = y(Ax − b) ≥ 0
β = (c − yA)x ≥ 0
Corollary of Theorem 5.2 Given a pair of primal and dual problems in canonical
form, the necessary and sufficient conditions for feasible solutions x and y to be
optimum are that they satisfy the relations (5.1)–(5.4).
yᵢ(aᵢx − bᵢ) = 0 for each i
and likewise
(cⱼ − yaⱼ)xⱼ = 0 for each j
The relations (5.1)–(5.4) must hold true for every pair of optimum solutions. It may happen that both yᵢ = 0 and aᵢx − bᵢ = 0 are true. The following theorem emphasizes that there will exist at least one pair of optimum solutions for which yᵢ = 0 and aᵢx − bᵢ = 0 cannot happen at the same time.
(Ax − b) + yᵀ > 0
and also
(c − yA) + xᵀ > 0
If c̄ⱼ ≥ 0 for all j, then no improvements are possible. If c̄ⱼ < 0, then we should increase the level of that activity, i.e., make the non-basic variable basic.
In the Simplex Method, we successively interchange basic and non-basic variables until all non-basic variables are priced out, i.e., c̄ⱼ ≥ 0 for all non-basic variables.
We can now more formally state the definition of the pricing operation from
Section 4.8.
c̄ⱼ = cⱼ − Σᵢ yᵢaᵢⱼ ≶ 0
Each inequality constraint Σⱼ aᵢⱼxⱼ ≥ bᵢ can be viewed as a half space. The intersection of all these half spaces defines the convex set which is the solution space (a high-dimensional room).
For any point inside the solution space, we can always move the point along the
gradient (steepest descent) to improve the value until the point is at a corner where
the gradient is the convex combination of the normal vectors to the hyperplanes
which define the corner point (extreme point). Notice that due to inequality
constraints, we may have to move along an edge which minimizes the objective
function but not at the quickest rate. Finally we may be at the intersection of two
inequalities or the extreme point of the convex set. At that point, any local
movement will increase the objective function.
Before we discuss the Dual Simplex Method, let us first review the Simplex
Method. The two methods are tied together by duality theory. Consider a linear
program in standard form
min z = Σⱼ cⱼxⱼ        (j = 1, . . . , n)
subject to Σⱼ aᵢⱼxⱼ = bᵢ        (i = 1, . . . , m) (m ≤ n)
xⱼ ≥ 0
Assume that we have put this into diagonal form with respect to x1, . . ., xm, and –z.
This gives
The top row of the tableau expresses x0 in terms of all variables. Every row in the
tableau is an equation. To the left of the tableau are the current basic variables. We
start with a tableau in which āᵢ₀ ≥ 0 (i = 1, . . . , m). This condition is called primal feasible.
Notice that the coefficients in the 0th row associated with the starting basic variables are zero, and we can read the values of the basic variables as the current b̄ᵢ ≥ 0 because the first m columns form an identity matrix I. If all coefficients in the 0th equation are non-negative, then we have the optimum solution. If not, we first pick some c̄ₛ < 0, and use the ratio test to decide which basic variable leaves the basis:

b̄ᵣ/āᵣₛ = min { b̄ᵢ/āᵢₛ : āᵢₛ > 0 }

Then, we perform the pivot operation which makes āᵣₛ = 1. The process is iterated until all c̄ⱼ ≥ 0. If all c̄ⱼ ≥ 0, the current solution is the optimum solution. Note that at every step of the Simplex Method, we keep a primal feasible solution, i.e., b̄ᵢ ≥ 0, and at the end, c̄ⱼ ≥ 0. The current solution is both primal and dual feasible, hence optimum.
The Dual Simplex Method is carried out in the same tableau as the primal
Simplex Method. The Dual Simplex Method starts with a dual feasible solution
and maintains its dual feasibility. It first decides what variable is to leave the basis
and then decides which variable is to enter the basis. The Dual Simplex Method can
be described in the following steps:
max { c̄ⱼ/āᵣⱼ : āᵣⱼ < 0 } = c̄ₛ/āᵣₛ

or, equivalently,

min { c̄ⱼ/|āᵣⱼ| : āᵣⱼ < 0 } = c̄ₛ/|āᵣₛ|

Since b̄ᵣ < 0, we select that row, and select xᵣ to leave the basis. To select the non-basic variable to enter the basis, we again perform a test like the ratio test and find

min { c̄ⱼ/|āᵣⱼ| : āᵣⱼ < 0 } = c̄ₛ/|āᵣₛ|
Again, āᵣₛ is made 1 by the pivot operation. The process is iterated until all b̄ᵢ ≥ 0.
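As a companion to the sketch in Chapter 4, here is one Dual Simplex iteration in the same hypothetical tableau layout; it selects the leaving row first (most negative b̄ᵢ) and then applies the dual ratio test so that all relative costs stay non-negative:

import numpy as np

def dual_simplex_iteration(T):
    rhs = T[1:, 0]
    if np.all(rhs >= 0):
        return False                        # primal feasible as well: optimum
    r = 1 + int(np.argmin(rhs))             # leaving row: most negative b_i
    cols = [j for j in range(1, T.shape[1]) if T[r, j] < 0]
    if not cols:
        raise ValueError("no negative pivot candidate: primal is infeasible")
    s = min(cols, key=lambda j: T[0, j] / abs(T[r, j]))   # dual ratio test
    T[r] /= T[r, s]                         # make the pivot element 1
    for i in range(T.shape[0]):
        if i != r:
            T[i] -= T[i, s] * T[r]
    return True

Note how the roles are reversed relative to the primal iteration: the row is chosen before the column, and only negative elements of the pivot row are candidates.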
We shall do the following example by the Dual Simplex Method:
The system is already in diagonal form with respect to x₁ and x₃. Putting this in the usual Simplex tableau, we have Tableau 5.1. Note that ā₀ⱼ ≥ 0 (j = 1, 2, 3, 4),
Tableau 5.1
and hence it is dual feasible. In the leftmost column, ā₁₀ = −4 < 0, so it is not primal feasible. Thus we select x₁ to leave the basis. Among coefficients in the first row, ā₁₂ = −3 and ā₁₄ = −1 are potential pivots. Since

c̄₂/|ā₁₂| = 3/3 < c̄₄/|ā₁₄| = 5/1

we select x₂ to enter the basis. We then make ā₁₂ equal to 1 by multiplying the first row by −1/3. Then, we obtain Tableau 5.2 via row reductions.
In Tableau 5.2, ā₀ⱼ ≥ 0 (j = 1, 2, 3, 4) and āᵢ₀ ≥ 0 (i = 1, 2), so the optimum solution is x₂ = 4/3, x₃ = 5/3, x₁ = x₄ = 0, and z = 3.
Tableau 5.2
max w = 1 − 4π₁ + 3π₂
subject to π₁ ≤ 0
−3π₁ + π₂ ≤ 3
π₂ ≤ 0        (5.10)
−π₁ + π₂ ≤ 5
πᵢ ≶ 0
The optimum solution of the dual is π₁ = −1, π₂ = 0. (This can be solved by first converting it into equations and introducing π = y′ − y″ with y′, y″ ≥ 0. Then we can use the
Simplex Method.) The top row of the Simplex Method is c̄ⱼ = cⱼ − πaⱼ, and if c̄ⱼ ≥ 0, then π is a solution to the dual program. Here in the starting Tableau 5.1, c₁ = c₃ = 0, a₁ = [1; 0], and a₃ = [0; 1]. Therefore what appears in the top row under x₁ and x₃ is 0 − πeⱼ = −πⱼ.
For example, in Tableau 5.2, we have 1 = −π₁ and 0 = −π₂. Thus, the optimum
tableau contains optimum solutions of both primal and dual programs. This being
the case, we have the choice of solving either the original program or its dual and
the choice of using either the primal or the dual method. In this example, it is
unwise to solve the dual program because we would need to add four slack
variables, and it would become a program with seven non-negative variables and
four equations.
Note
1. c̄ⱼ ≥ 0 implies that we have a dual feasible solution, since c̄ⱼ = cⱼ − πaⱼ ≥ 0, where π satisfies the dual constraint.
2. We first decide which variable to leave the basis and then which variable to enter
the basis (the opposite of Simplex).
3. Only negative elements are pivot candidates.
4. The “ratio test” is used to maintain dual feasibility.
5.5 Exercises
min z = 4x₁ + x₂ + x₃
subject to 3x₁ + 2x₂ + x₃ ≥ 23
x₁ + x₃ ≥ 10
8x₁ + x₂ + 2x₃ ≥ 40
x₁, x₂, x₃ ≥ 0
3. Solve the given problem using the Dual Simplex Method and use the Simplex
Method to verify optimality:
4. Solve the given problem using the Dual Simplex Method and use the Simplex
Method to verify optimality:
max z = −2x₁ − x₂
subject to 2x₁ + x₂ + x₃ ≥ 4
x₁ + 2x₂ − x₃ ≥ 6
x₁, x₂, x₃ ≥ 0
5. Solve the given problem using the Dual Simplex Method and use the Simplex
Method to verify optimality:
6. Solve the given problem using the Dual Simplex Method and use the Simplex
Method to verify optimality:
min z = 2x₁ + x₂
subject to 3x₁ + 4x₂ ≥ 24
4x₁ + 3x₂ ≥ 12
x₁ + 2x₂ ≥ 1
x₁ ≥ 2, x₂ ≥ 0
Revised Simplex Method
6
In the Simplex Method, all entries of the Simplex tableau are changed from one
iteration to the next iteration. As such, the Simplex Method requires lots of
calculations and storage, and we would like a more efficient method to solve a
linear program. Assume that we have an m × n constraint matrix A and the optimum tableau is obtained in, say, the pth iteration. Then, effectively, we have calculated p(m + 1)(n + 1) numbers. Notice that once a tableau is obtained during the
calculation, we have all the information necessary to calculate the next iteration;
all preceding tableaus, including the starting tableau, can be ignored.
Suppose that we keep the starting tableau, and we wish to generate all the entries
in a particular tableau. What information is needed? Let us say that we are
interested in all the entries in the 29th tableau. Then all we need is the B⁻¹ associated with the 29th tableau and the names of the current basic variables. All other entries of the 29th tableau can be generated from the entries of the starting tableau and the current B⁻¹ of the 29th tableau. Note that π = c_B B⁻¹, which means the current shadow price π is obtained by multiplying c_B of the starting tableau by the current B⁻¹.
We define b̄ as b̄ = B⁻¹b, where B⁻¹ is from the 29th tableau and b is from the starting tableau. Similarly, we let any column āⱼ be given by āⱼ = B⁻¹aⱼ, where B⁻¹ is from the 29th tableau and aⱼ is from the starting tableau. The relative cost (or the modified cost) is c̄ⱼ = cⱼ − πaⱼ, where cⱼ and aⱼ are from the starting tableau and π is the current shadow price.
The idea is that once we have the B⁻¹ of the 29th tableau and the labels and names of the basic variables, we can generate all entries if we keep the starting tableau. So, what additional entries in the 29th tableau do we need to get the B⁻¹ of the 30th tableau?
It turns out that we need the non-basic column of the 29th tableau with a negative relative cost and the updated right-hand side (RHS), where c̄ⱼ = cⱼ − πaⱼ, b̄ = B⁻¹b, and π = c_B B⁻¹; here c_B is from the starting tableau and B⁻¹ is the inverse from the 29th tableau.
In matrix notation

[1  −π ] [cⱼ]   [cⱼ − πaⱼ]   [c̄ⱼ]
[0  B⁻¹] [aⱼ] = [B⁻¹aⱼ   ] = [āⱼ]

and

[1  −π ] [0]   [−πb ]   [−z]
[0  B⁻¹] [b] = [B⁻¹b] = [b̄ ]
In the first equation, if c̄ⱼ ≥ 0, we are not interested in the components of āⱼ. Only if c̄ⱼ < 0 do we want to compute B⁻¹aⱼ, do the feasibility test among positive entries in āⱼ and the right-hand side b̄, and determine the pivot.
In the starting tableau, the identity is its own inverse, and π of the starting tableau is zero because π = c_B B⁻¹ and c_B = 0 in the starting tableau. Thus, we can adjoin a column [1, 0, . . . , 0] to the identity matrix in the starting tableau, so that we have

[1  0]   [1  −π ]
[0  I] = [0  B⁻¹]
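The bookkeeping can be sketched as follows (a simplified illustration, not the book's exact procedure): only c, A, b from the starting tableau and the current B⁻¹ are stored, and B⁻¹ is updated by the standard product-form (eta-matrix) pivot.

import numpy as np

def revised_simplex_step(c, A, b, basis, Binv):
    # basis: NumPy int array of the current basic column indices.
    pi = c[basis] @ Binv                    # shadow prices: pi = c_B B^(-1)
    cbar = c - pi @ A                       # price out every column
    s = int(np.argmin(cbar))
    if cbar[s] >= -1e-9:
        return basis, Binv, True            # all relative costs >= 0: optimum
    a_s = Binv @ A[:, s]                    # generate only the entering column
    bbar = Binv @ b
    rows = np.where(a_s > 1e-9)[0]
    if rows.size == 0:
        raise ValueError("unbounded")
    r = rows[np.argmin(bbar[rows] / a_s[rows])]   # ratio test
    E = np.eye(len(basis))                  # eta matrix representing the pivot
    E[:, r] = -a_s / a_s[r]
    E[r, r] = 1.0 / a_s[r]
    basis[r] = s
    return basis, E @ Binv, False

Only the (m + 1)-sized objects π, āⱼ, and b̄ are formed at each step; the full tableau is never written down.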
Let us consider the same example in Chapter 4 which was solved by the Simplex
tableau. (See Tableau 4.1.) Now we can do it by the revised Simplex Method. The
starting tableau is given below in Tableau 6.1.
We have
B* = [1 0 0; 0 1 0; 0 0 1]
We use the row vector (1, 0, 0) to multiply the first column [2, −2, 3] and get c̄₃ = 2. Since it is positive, we do not care about the components below it. We use the same row vector (1, 0, 0) to multiply the second column vector [−2, 1, −1] and get c̄₄ = −2.
Tableau 6.1
Since in ā₄, ā₁₄ = 1 is the only positive coefficient, we shall drop a₁ from the basis.
We then perform the pivot operation with ā14 as the pivot (Tableau 6.2). The new
tableau is Tableau 6.3.
We put question marks in positions to indicate that they have not been
calculated.
Now the inverse of the new basis is
B*⁻¹ = [1 2 0; 0 1 0; 0 1 1]
c̄₁ = (1, 2, 0)·[0, 1, 0] = 2
c̄₃ = (1, 2, 0)·[2, −2, 3] = −2
c̄₅ = (1, 2, 0)·[1, −1, −2] = −1
As a check, we have
c̄₂ = (1, 2, 0)·[0, 0, 1] = 0
c̄₄ = (1, 2, 0)·[−2, 1, −1] = 0
We use the row vector (1, 2, 0) to multiply the column vector in the starting tableau [1, −1, −2] and get c̄₅ = −1, so we do not care about the rest of the entries. Going back, we use the same row vector (1, 2, 0) to multiply the column vector [2, −2, 3] and get c̄₃ = −2.
Tableau 6.2
Tableau 6.3
Since ā₂₃ = 1 is the only positive element, a₂ should be dropped. We perform the pivot operation with ā₂₃ as the pivot (Tableau 6.4). The new tableau is Tableau 6.5.
Now the inverse of the new basis is
B*⁻¹ = [1 4 2; 0 3 2; 0 1 1]
c̄₁ = (1, 4, 2)·[0, 1, 0] = 4
c̄₂ = (1, 4, 2)·[0, 0, 1] = 2
c̄₅ = (1, 4, 2)·[1, −1, −2] = −7
As a check, we have
Tableau 6.4
Tableau 6.5
Tableau 6.6
min z = c₁x₁ + c₂x₂ + c₃x₃
subject to a₁₁x₁ + a₁₂x₂ + a₁₃x₃ = b₁        (6.1)
a₂₁x₁ + a₂₂x₂ + a₂₃x₃ = b₂
xⱼ ≥ 0
We can find scalars for each row, say π₁ for the first row and π₂ for the second, such that for each basic column the rows, weighted by these scalars, sum to the corresponding cost cⱼ. In other words,
c₁ − π₁a₁₁ − π₂a₂₁ = 0
c₂ − π₁a₁₂ − π₂a₂₂ = 0
min z = x₁ + x₂ + x₃
subject to x₁ − x₂ + 2x₃ = 2        (6.2)
−x₁ + 2x₂ − x₃ = 1
For (6.2), with respect to the basic columns x₁ and x₂, these scalars are π₁ = 3 and π₂ = 2:
1 − 3(1) − 2(−1) = 0
1 − 3(−1) − 2(2) = 0
Notice that
[π₁ π₂] = [c₁ c₂] [a₁₁ a₁₂; a₂₁ a₂₂]⁻¹
After we put the linear program into diagonal form with respect to −z, x₁, x₂, we obtain the equations
−z + c̄₃x₃ = −z₀
x₁ + ā₁₃x₃ = b̄₁
x₂ + ā₂₃x₃ = b̄₂
which here are
−z − 3x₃ = −8
x₁ + 3x₃ = 5        (6.3)
x₂ + x₃ = 3
xⱼ ≥ 0
because the pivot is ā₁₃, which means that x₁ will become a non-basic variable and x₃ will become a basic variable.
If we put the linear program into a diagonal form with respect to −z, x₂, x₃, we obtain
−z + x₁ = −3
−(1/3)x₁ + x₂ = 4/3        (6.4)
(1/3)x₁ + x₃ = 5/3
In other words,
−z + c̄₁x₁ = −3
ā₂₁x₁ + x₂ = 4/3
ā₁₁x₁ + x₃ = 5/3
where c̄ⱼ ≥ 0 for all non-basic variables. This condition indicates that the optimum solution has been obtained. Note that (6.4) is obtained from (6.3) by multiplying the first row by −1 and the second row by 0 and subtracting from the 0th row.
In general, after a linear program is in diagonal form, such as (6.4), whatever appears above the identity matrix I in the 0th row is the current −(π₁, π₂, . . . , πₘ). Also, whatever appears in the position of I is the current B⁻¹. For example, the following matrix occupies the position of I in (6.4):

[−1/3  1]   [0  3]⁻¹
[ 1/3  0] = [1  1]
If we interpret the example linear program (6.2) as the task of buying adequately
nutritional food as cheaply as possible, then there are three kinds of food, each
costing one dollar per unit ðc1 ¼ c2 ¼ c3 ¼ 1Þ. The first food contains one unit of
vitamin A and destroys one unit of vitamin B, and the second food destroys one unit
of vitamin A and contains two units of vitamin B.
If we decide to buy five units of the first food and three units of the second food,
the minimum requirements will be satisfied at the cost of eight dollars.
5[1; −1] + 3[−1; 2] = [2; 1]
This is equivalent to paying three dollars for one unit of vitamin A and two dollars for one unit of vitamin B, since
2[1; 0] + 1[0; 1] = [2; 1]
π = c_B B⁻¹        (6.5)
min z = cx
subject to Ax = b        (6.6)
x ≥ 0
Let us partition c into (cB, cN), A into (B, N), and x into [xB, xN]. Then we can
rewrite the linear program as
min z = c_Bx_B + c_Nx_N
subject to Bx_B + Nx_N = b        (6.7)
x_B, x_N ≥ 0
If we multiply (6.7) by π and subtract from the objective function, we obtain the
following:
In other words, z equals the original RHS b multiplied by the current shadow prices.
In the Simplex Method, every entry in
[c  0]   [c_B  c_N  0]
[A  b] = [B    N    b]
is changed from iteration to iteration; for example, the 100th tableau is obtained from the 99th tableau. If we knew the value of B⁻¹ for the 99th tableau, we could directly generate every entry in the 100th tableau by multiplying the original tableau by the B⁻¹ of the 99th tableau without going through the first 99 iterations.
Since we do not know B⁻¹ of the 99th tableau, we cannot go directly from the first tableau to the 100th. But we can save work by keeping only the value of B⁻¹ for every tableau, together with the original tableau. The size of B⁻¹ is only (m + 1) × (m + 1), which may be much smaller than (m + 1) × n. This technique works because we are only interested in a non-basic column j when c̄ⱼ < 0. If all c̄ⱼ ≥ 0, then we are not interested in generating the matrix N corresponding to the non-basic columns. In the worst case, we only need to generate the following portions of the tableau:
+  +  +  +
B⁻¹  b̄
(Here + denotes an entry that must be generated.)
Note that if c̄ⱼ = cⱼ − πaⱼ ≥ 0 for all j, then the current solution is optimal.
Column Generating Technique
7
min z = x₁ + x₂ + x₃ + x₄ + x₅
subject to [2; 4]x₁ + [5; 1]x₂ + [4; 3]x₃ + [5; 4]x₄ + [4; 5]x₅ = [11; 11]        (7.1)
xⱼ ≥ 0
The five columns are plotted as shown in Figure 7.2. The requirement is [11; 11], shown as a line from the origin to the coordinate [11; 11]. There are four line segments connecting [2; 4] to [5; 1], [2; 4] to [4; 3], [2; 4] to [5; 4], and [5; 4] to [4; 5].
[Figure 7.2: the five columns plotted in the plane, with intersection points A, B, C, and D on the line from the origin to (11, 11).]
The intersections of the four dotted line segments with the line from the origin to [11; 11] are four possible solutions with the values of z = 11/3, 33/10, 33/12, 22/9. The detailed computations are shown in Tableaus 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, and 7.7.
Tableau 7.1
The result of multiplying Tableau 7.1 by the inverse of [2 5; 4 1] is shown in Tableau 7.2.
Tableau 7.2
The result of subtracting the first row and the second row from the 0th row in
Tableau 7.2 is shown in Tableau 7.3.
Tableau 7.3
We have x₁ = 44/18, x₂ = 22/18, and z = 11/3 (A in Figure 7.2).
The result of multiplying Tableau 7.1 by the inverse of [2 4; 4 3] is shown in Tableau 7.4.
Tableau 7.4
The result of subtracting the first row and the second row from the 0th row in
Tableau 7.4 is shown in Tableau 7.5.
Tableau 7.5
We have x₁ = 11/10, x₃ = 22/10, and z = 33/10 (B in Figure 7.2).
The result of multiplying Tableau 7.1 by the inverse of [2 5; 4 4] is shown in Tableau 7.6.
Tableau 7.6
The result of subtracting the first row and the second row from the 0th row in
Tableau 7.6 is shown in Tableau 7.7.
Tableau 7.7
We have x₁ = 11/12, x₄ = 22/12, and z = 33/12 (C in Figure 7.2).
The result of multiplying Tableau 7.1 by the inverse of [5 4; 4 5] is shown in Tableau 7.8.
Tableau 7.8
The result of subtracting the first row and the second row in Tableau 7.8 from the
0th row is shown in Tableau 7.9.
Tableau 7.9
In the numerical example, all cⱼ are equal to one, so the modified cost c̄ⱼ = cⱼ − πaⱼ is minimum if πaⱼ is maximum.
The first shadow price is π = (1/6, 1/6), since c_B B⁻¹ = (1, 1)[2 5; 4 1]⁻¹ = π. So, we look for an aⱼ which has maxⱼ (a₁ⱼ + a₂ⱼ).
Consider the Homemaker and Pill Salesperson examples in Chapter 2. Let the
supermarket have only two kinds of food: one kind has two units of vitamin A and
four units of vitamin B, and the other kind has five units of vitamin A and one unit
of vitamin B. Furthermore, let the homemaker’s family need 11 units of vitamin A
and 11 units of vitamin B. Then the homemaker will pay 66/18 dollars. In the meantime, the pill salesperson will charge 1/6 dollar per capsule for vitamin A and 1/6 dollar per capsule for vitamin B,
11(1/6 + 1/6) = 66/18
Once the supermarket has a third kind of food with vitamin contents [4, 3] for vitamins A and B, the homemaker will buy the foods [2, 4] and [4, 3], and the pill salesperson will decrease the price of a vitamin A capsule to 1/10 dollar and increase the price of a vitamin B capsule to 1/5 dollar.
Let us look at Figure 7.2 again. When the shadow price is π = (π₁, π₂) = (1/6, 1/6), the best vector is the one with maxⱼ (a₁ⱼ + a₂ⱼ), with three kinds of food having negative modified cost. When the shadow price becomes (π₁, π₂) = (1/9, 1/9), i.e., z = 22/9, then all five kinds of food have non-negative modified costs (see the 0th row of Tableau 7.9), and the computation stops.
The following is a brief summary of the Simplex Method, the revised Simplex
Method, and the column generating technique.
1. Simplex Method: Search the top row and use the most negative column to replace a column in the current basis until c̄ⱼ ≥ 0 for all columns. (Too much storage and work.)
2. Revised Simplex Method: Keep a tableau of size (m + 1) × (n + 2). Use a column to replace another in the current basis if c̄ⱼ < 0 (where c̄ⱼ = cⱼ − πaⱼ). Only generate the entries of the column when c̄ⱼ < 0. Use a ratio test to keep the next solution feasible. Since the exchange uses one new column and replaces one old column, the work is reasonable.
Notice that the result z = πb can be checked in each iteration, where π is the current shadow price and b is the original right-hand side.
3. Column Generating Technique: This technique is typically applied to linear
programs with too many columns to write down. The coefficients in the Simplex
tableau are not random numbers. The entries in a column could be the path from
one point to another point, or perhaps they could be weights of several items in a
knapsack packing problem. The columns could be partitioned into several sets,
or each set of columns could be the multicommodity flows of a special
commodity. Therefore, the problem of selecting a column to enter the basis is
an optimization problem itself. The optimum solution is found when the best column has c̄ⱼ ≥ 0.
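The loop below illustrates the technique on example (7.1) (Python; here the five columns are simply enumerated, whereas in a genuine column-generation setting maxⱼ πaⱼ would itself be solved as a subproblem because the columns are too many to write down):

import numpy as np

# Columns and requirement of example (7.1); every cost c_j equals 1.
cols = np.array([[2, 4], [5, 1], [4, 3], [5, 4], [4, 5]], dtype=float).T
b = np.array([11.0, 11.0])

basis = [0, 1]                              # start with columns a1 and a2
while True:
    B = cols[:, basis]
    pi = np.linalg.solve(B.T, np.ones(2))   # pi = c_B B^(-1)
    cbar = 1.0 - pi @ cols                  # price out the candidate columns
    s = int(np.argmin(cbar))
    if cbar[s] >= -1e-9:
        break                               # best column has cbar >= 0: stop
    xB = np.linalg.solve(B, b)
    d = np.linalg.solve(B, cols[:, s])      # entering column in basis coordinates
    r = min((i for i in range(2) if d[i] > 1e-9), key=lambda i: xB[i] / d[i])
    basis[r] = s                            # ratio test chooses the leaving column

xB = np.linalg.solve(cols[:, basis], b)
print(basis, xB, xB.sum())

Starting from the basis (a₁, a₂), the loop performs the same kind of basis exchanges as Tableaus 7.1–7.9 and stops at π = (1/9, 1/9) with z = 22/9, the optimum of (7.1).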
The Knapsack Problem
8
8.1 Introduction
We are now going to be looking at integer programs. We start with the simplest
integer program, which has a single constraint. Consider a hiker with a knapsack
that can carry 20 lbs. There are three kinds of items with various values and
weights, and he wants to select items to put in the knapsack with maximum total
value. So we have the knapsack problem, as follows:
If we use the Ratio Method, the first item has the best ratio, with
12/11 > 10/10 > 7/9
Since x₁ = 20/11 is not an integer, x₁ = ⌊20/11⌋ = 1. So the integer program (8.1) would
20
1. The best item may not be used in the optimum integer solution.
2. The maximum linear program solution, x₁ = 20/11 with objective function value (20/11) · 12 = 240/11 ≈ 21.82, is usually better than the optimum integer solution (the integer constraint on variables certainly cannot help the optimum solution’s value).
3. We cannot round off the optimum linear program solution to get the optimum
integer solution, and the difference between the two values can be arbitrarily
large. For example, we can change (8.1) to (8.2):
value.
Denote the density of the best item by p₁ = v₁/w₁ and the density of the second best item by p₂ = v₂/w₂, where pⱼ denotes the value per pound of item j. Intuitively, we would fill the knapsack with the first item and then fill leftover space with the
second item, etc. This intuitive approach usually gives very good results, but it does
not always give the optimum solution due to integer restrictions.
One thing is clear: if b is sufficiently large compared to w1, then x1 > 0 in the
optimum solution. That is, the best item would be used at least once. To see this, let
us assume that x₁ = 0 in the optimum solution. Then, the maximum value will certainly not exceed p₂b. In other words, the maximum value without the best item (i.e., x₁ = 0) is at most p₂b.
However, if we try to fill the knapsack with only the best item, we can obtain the
value
⌊b/w₁⌋ v₁ > ((b − w₁)/w₁) v₁ = p₁w₁ (b − w₁)/w₁ = p₁(b − w₁)
Setting
p₁(b − w₁) = p₂b  or  b = (p₁/(p₁ − p₂)) w₁
gives the threshold. That is, for b ≥ (p₁/(p₁ − p₂)) w₁, we have p₁(b − w₁) ≥ p₂b.        (8.4)
In other words, if b is sufficiently large, x₁ > 0 in the optimum solution. Note that the above condition (8.4) and b ≥ lcm(w₁, w₂) are both sufficient conditions but not necessary conditions.
Define
F_k(y) = max Σⱼ₌₁ᵏ vⱼxⱼ        (0 ≤ k ≤ n)
where
Σⱼ₌₁ᵏ wⱼxⱼ ≤ y        (0 ≤ y ≤ b)
Here, F_k(y) is the maximum value obtained by using the first k items only when the total weight limitation is y. Notice that F_k(y) can be calculated recursively for k = 0, 1, . . . , n and y = 0, 1, . . . , b. F_k(0) = 0 if the weight-carrying capacity is zero.
When there are two kinds of items available, we could either only use the first item or we can use the second item at least once. So,
F₂(y) = max{F₁(y), v₂ + F₂(y − w₂)}
Note that the formula does not restrict x₂ = 1, since the term F₂(y − w₂) could contain x₂ ≥ 1. Similarly,
F₃(y) = max{F₂(y), v₃ + F₃(y − w₃)}
F₄(y) = max{F₃(y), v₄ + F₄(y − w₄)}
⋮
When the first k items are chosen to obtain F_k(y), either the kth item is used at least once, or it is not used at all. If it is used at least once, then the total weight limitation is reduced to y − w_k. If it is not used, then F_k(y) is the same as F_{k−1}(y).
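In code, the recursion gives the classic O(nb) dynamic program. Below is a minimal version (Python), run on the hiker's data from (8.1) — values 12, 10, 7, weights 11, 10, 9, capacity 20:

def knapsack(values, weights, b):
    # F[y] holds F_k(y); sweeping items outer and capacities inner implements
    # F_k(y) = max(F_{k-1}(y), v_k + F_k(y - w_k)).
    F = [0] * (b + 1)
    for v, w in zip(values, weights):
        for y in range(w, b + 1):
            F[y] = max(F[y], v + F[y - w])
    return F[b]

print(knapsack([12, 10, 7], [11, 10, 9], 20))   # prints 20

The optimum is 20 (two copies of the second item), while any packing that uses the best item is worth at most 12 + 7 = 19 — an instance of point 1 above.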
Remember that x₁ > 0 in the optimum solution when b is sufficiently large. This means that
F_n(b) = v₁ + F_n(b − w₁)  for  b > (p₁w₁)/(p₁ − p₂)        (8.6)
The most interesting feature of the knapsack problem is that its optimum integer
solution will be periodic if the right-hand side bound b becomes asymptotically
large.
Define θ(b) = p₁b − F_n(b) to be the difference between the optimum linear program value and the optimum integer program value using n kinds of items. Then,
θ(b − w₁) = p₁(b − w₁) − F_n(b − w₁)
= p₁b − p₁w₁ − (F_n(b) − v₁)        (from (8.6), for sufficiently large b)
= p₁b − F_n(b)        (since v₁ = p₁w₁)
= θ(b)
This shows that the function θ(b) is periodic in nature, with period w1 for suffi-
ciently large b.
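The periodicity is easy to observe numerically. The sketch below computes θ(b) = p₁b − F_n(b) for the hiker's items from (8.1); since w₁ = 11 and the bound in (8.4) works out to b ≥ 132 for this data, θ(b) = θ(b + 11) holds for all sufficiently large b:

def theta_values(values, weights, b_max):
    F = [0] * (b_max + 1)                   # unbounded-knapsack recursion
    for v, w in zip(values, weights):
        for y in range(w, b_max + 1):
            F[y] = max(F[y], v + F[y - w])
    p1 = max(v / w for v, w in zip(values, weights))
    return [p1 * b - F[b] for b in range(b_max + 1)]

th = theta_values([12, 10, 7], [11, 10, 9], 200)
print(all(abs(th[b] - th[b + 11]) < 1e-9 for b in range(140, 189)))   # True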
In other words, b ≥ 540 implies that x₁ ≥ 1. This means that for two weight limitations b and b′ with b ≥ b′ ≥ 540 and b ≡ b′ (mod w₁), the optimum solutions to both problems are almost the same, except that we fulfill the part of the weight limitation b − b′ using the first item. Assume that we calculate θ(b) for all values of b starting with b = 0, 1, . . .. Then it will be seen that θ(b) is periodic, i.e., θ(b) = θ(b + 15) for b ≥ 26.
We use (8.5) and (8.7) to build Tables 8.1 and 8.2 to show the periodic nature of
optimum integer solutions of the knapsack problem.
Notice that in the tables, the values of θ(b) are the same for the two intervals,
b ¼ 26 to 40 and b ¼ 41 to 55.
Next, recall that θ(b) is the difference between the optimum linear program
value and the optimum integer program value. For smaller values of b, we have
θ(0) = 0
θ(1) = (p₁ − p₅) · 1 = 1.2
θ(2) = (p₁ − p₅) · 2 = 2.4
θ(3) = (p₁ − p₅) · 3 = 3.6
θ(4) = (p₁ − p₄) · 4 = 0.8
θ(5) = (p₁ − p₄) · 4 + (p₁ − p₅) · 1 = 2.0
θ(6) = (p₁ − p₄) · 4 + (p₁ − p₅) · 2 = 3.2
θ(7) = (p₁ − p₃) · 7 = 0.4
θ(8) = (p₁ − p₃) · 7 + (p₁ − p₅) · 1 = 1.6
etc.
If we keep calculating, we would get Table 8.1, where the θ(b) values for b = 26 to 40 are identical to the θ(b) values for b = 41 to 55.
Table 8.2 shows the periodic solutions (listed as item weights wⱼ) for each residue class b (mod w₁).
To see how Table 8.2 can be used to find F_n(b) for all b, take b = 43 ≡ 13 (mod 15). In the 13th row, we have periodic solution 1, 12, which
Table 8.1
Values of b    Solutions Σⱼ wⱼxⱼ    θ(b)
0 … 25
26                                  1.2
⋮                                   ⋮
40                                  2.0
41                                  1.2
⋮                                   ⋮
55                                  2.0
Table 8.2
b (mod w₁)    wⱼ’s periodic solution    Σⱼ₌₂ⁿ wⱼxⱼ    θ value
0             0                          0             0
1             1                          1             1.2
2             1, 1                       2             2.4
3             4, 7, 7                    18            3.6
4             4                          4             0.8
5             1, 4                       5             2.0
6             7, 7, 7                    21            3.2
7             7                          7             0.4
8             1, 7                       8             1.6
9             12, 12                     24            0.8
10            1, 12, 12                  25            2.0
11            4, 7                       11            1.2
12            12                         12            0.4
13            1, 12                      13            1.6
14            7, 7                       14            0.8
1. Given any b, find b (mod w₁), where w₁ is the weight of the best item.
2. Find the θ(b) values for increasing values of b until θ(b) occurs periodically.
3. Use Table 8.2 to find Σⱼ₌₂ⁿ wⱼxⱼ.
Note that the algorithm above has a time complexity of O(nb). In the next
section, we shall develop another algorithm that has time complexity O(nw1). The
reader could and should stop here and read the next section after he or she finishes
Chapter 9, on Asymptotic Algorithms.
8.2 Landa’s Algorithm

The content of this section is a simplified version of the paper published in Research Trends in Combinatorial Optimization, written by T. C. Hu, L. Landa, and M.-T. Shing.¹ This section may be read independently from the preceding section.
Mathematically, the knapsack problem is defined as follows:
max v = Σᵢ₌₁ⁿ vᵢxᵢ
subject to w = Σᵢ₌₁ⁿ wᵢxᵢ ≤ b        (8.8)
with vi, wi, xi, b all non-negative integers. The first optimal knapsack algorithm
based on dynamic programming was developed by Gilmore and Gomory (1966).
Let F_k(y) be the maximum value obtained in (8.8) when the knapsack has a weight-carrying capacity y and only the first k kinds of items are used. We can compute F_k(y) for k = 1, . . . , n and y = 1, . . . , b as follows:
In general,
F_k(y) = max{F_{k−1}(y), v_k + F_k(y − w_k)}        (8.9)
Based on (8.9), we can build a table of n rows and b columns and solve the knapsack
problem in O(nb) time. Note that this algorithm runs in pseudo-polynomial time
because any reasonable encoding of the knapsack problem requires only a polyno-
mial in n and log2b.
When there are no restrictions on the values of xi and we need to calculate the
optimal solutions for large values of b, Gilmore and Gomory discovered that the
optimum solutions have a periodic structure when b is sufficiently large. For
simplicity, we always assume
¹ T. C. Hu, L. Landa, and M.-T. Shing, “The Unbounded Knapsack Problem,” Research Trends in Combinatorial Optimization, 2009, pp. 201–217.
v₁/w₁ = maxⱼ (vⱼ/wⱼ)
in the rest of this section, and, in the case of a tie, we let w₁ < wⱼ so that the first item is always the best item. Note that the best item can be found in O(n) time without any need to sort the items according to their value-to-weight ratios.
When the weight-carrying capacity b exceeds a critical value b**, the optimal solution for b is equal to the optimal solution for b − w₁ plus a copy of the best item. In other words, we can first fill a portion of the knapsack using the optimal solutions for a smaller weight-carrying capacity and fill the rest of the knapsack with the best item only. Two simple upper bounds for the critical value b** are
b** ≤ lcm(w₁, wⱼ)        (j ≠ 1)        (8.10)
and
b** ≤ (w₁p₁)/(p₁ − p₂)        (8.11)
where p₁ = v₁/w₁ is the highest value-to-weight ratio and p₂ = v₂/w₂ is the second highest value-to-weight ratio.
For illustration purposes, we shall use the following numerical example in this
section:
Definition To find the starting point of the periodic optimal structure, we shall partition the weight-carrying capacity b into w₁ classes (called threads), where thread t(b) = b (mod w₁).
Definition We shall call the weight-carrying capacity the boundary and present an
algorithm to pinpoint the necessary and sufficient value of the boundary in each
thread (called the thread critical boundary). For brevity, we shall use:
The optimal solution for any boundary b larger than the critical value b** = 36 will exhibit the periodic structure that uses one or more copies of the best item in our example.
Landa (2004) proposes a new algorithm for computing the thread critical boundaries. Landa focuses on the gain from filling the otherwise unusable space (if we are limited to using integral copies of the best item only). One major advantage of using gain is that the gain computation involves only integers. Moreover, the gain concept enables us to obtain a very simple algorithm for finding the optimal solutions when the boundary b is less than b*(t(b)). To understand how Landa’s algorithm works, we first formally define the concept of gain, a measure of the efficiency of different packings for a boundary b.
Definition The gain of a boundary b is the difference between the optimal value
V(b) and the value of the packing using the best item only. We define
g(b) = V(b) − v₁⌊b/w₁⌋        (8.14)
Proof
g(b_b) = V(b_b) − v₁⌊b_b/w₁⌋ = V(b_b) − v₁⌊(b_s + w₁)/w₁⌋
= V(b_s) + v₁ − v₁⌊(b_s + w₁)/w₁⌋
= V(b_s) − v₁⌊b_s/w₁⌋
= g(b_s)        □
Proof Let V(b_s) and g(b_s) be the value of the optimal packing for b_s and its gain. We can create a feasible solution for b_b = b_s + w₁ by adding one copy of the best item. The optimal value V(b_b) must be greater than or equal to V(b_s) + v₁, since that packing is a feasible solution for b_b. From the definition (8.14), we have g(b_b) ≥ g(b_s). □
From now on, if there is more than one optimal solution for a boundary b, we
prefer the optimal solution with the largest x1. Since we want to find the smallest
boundary where the optimal solution becomes periodic in each thread, we shall
partition boundaries in each thread into two categories.
Let us build Table 8.3 for our numerical example (8.12), where each column
corresponds to a thread. Each cell has two numbers: the top number is the boundary
and the bottom number is its gain. All special boundaries are marked with darker
borders, and the special cell with the largest gain (i.e., thread critical boundary) in
each column is marked with darker borders and shading. Since there is no cell
above the first row, we define all cells in the first row as special boundaries.
Lemma 8.2 The gain in any cell is strictly less than v1.
If g(b) ≥ v₁, then
V(b) ≥ v₁ + v₁⌊b/w₁⌋ = v₁⌊(b + w₁)/w₁⌋ > (v₁/w₁)b
a contradiction. □
Lemma 8.3 There exists a cell in each thread that has the maximum gain, with all
boundaries in the thread beyond that cell having the same gain.
Proof We know that all gains are non-negative integers, and from Lemma 8.2, all gains are less than v₁. Since the gains are monotonically non-decreasing in each thread by Corollary 8.1, the gain will stabilize somewhere in each thread. □
Suppose we have an optimal solution for a boundary b_s with gain g(b_s), and the optimal solution for the boundary b_b = b_s + w_k (in a different thread) uses the same optimal solution for b_s plus one copy of the kth item. Then the gain g(b_b) of the boundary b_b is related to g(b_s) based on the formula stated in Theorem 8.1.
Theorem 8.1 If the optimal solution for the boundary b_b = b_s + w_k in the thread t(b_b) consists of the optimal solution for the boundary b_s in the thread t(b_s) plus one copy of the kth item, then
g(b_b) = g(b_s) + v_k − v₁⌊(t(b_s) + w_k)/w₁⌋        (8.15)
Proof Since
g(b_b) = V(b_b) − v₁⌊b_b/w₁⌋ = V(b_s) + v_k − v₁⌊(b_s + w_k)/w₁⌋
and
b_s = b_s (mod w₁) + w₁⌊b_s/w₁⌋ = t(b_s) + w₁⌊b_s/w₁⌋
we have
g(b_b) = V(b_s) + v_k − v₁(⌊(t(b_s) + w_k)/w₁⌋ + ⌊b_s/w₁⌋)
= V(b_s) − v₁⌊b_s/w₁⌋ + v_k − v₁⌊(t(b_s) + w_k)/w₁⌋
= g(b_s) + v_k − v₁⌊(t(b_s) + w_k)/w₁⌋        □
Note that in Theorem 8.1, the formula (8.15) does not contain the actual value of
the optimal solutions V(bb) or V(bs). It relates the gains of two cells in two different
threads t(bb) and t(bs) using only the values of v1, w1, vk, wk, t(bs), and t(bb).
From now on, we describe the method for obtaining the largest gain (and its
corresponding boundary) in each thread. From the principle of optimality, the
sub-solution of an optimal solution must be an optimal solution of the
sub-boundary. However, the principle does not say how to extend an optimal
solution of a boundary to the optimal solution of a large boundary. We shall
show how to use the gain formula in Theorem 8.1 to discover the gain of a larger
boundary from a smaller one. For brevity, we shall use:
Starting with only one type of item with value v₁ and weight w₁, we have the thread critical boundary b¹(i) = i and the gain g¹ₜ(i) = 0 for every 0 ≤ i ≤ w₁ − 1.
Next, we want to pack the boundaries in each thread with the first and second types of items and see if the addition of a second type of item with value v₂ and weight w₂ can increase the maximum gain in each thread.
Assuming that we have obtained the correct value of g²ₜ(i), the largest gain in thread i using the first two types of items (e.g., g²ₜ(0) = g¹ₜ(0) = 0), we can use the
gain formula in Theorem 8.1 to get a better gain for the thread j = (i + w₂) (mod w₁) and set the gain g²ₜ(j) to
g²ₜ(j) = max{ g¹ₜ(j), g²ₜ(i) + v₂ − v₁⌊(i + w₂)/w₁⌋ }
Table 8.4 Thread critical boundaries using the first item only
Threads:      0      1      2      3      4      5      6      7
k = 1:    b = 0  b = 1  b = 2  b = 3  b = 4  b = 5  b = 6  b = 7
          g = 0  g = 0  g = 0  g = 0  g = 0  g = 0  g = 0  g = 0
Table 8.5 Thread critical boundaries using the item types 1 and 2
Threads:      0      1      2      3      4      5      6      7
k = 2:    b = 0  b = 1  b = 2  b = 3  b = 4  b = 5  b = 6  b = 7
          g = 0  g = 0  g = 0  g = 0  g = 0  g = 3  g = 3  g = 3
Table 8.5 thus contains the thread critical boundaries using the first two types of items.
Since gcd(w₁, w₃) = gcd(8, 18) = 2 ≠ 1, we cannot visit all the threads. We can only visit 0, 2, 4, 6 in one chain if we start with thread 0. Instead of visiting all the threads in a single chain, we need two chains. The second chain involves the threads 1, 3, 5, and 7.
Definition The technique of finding the thread which does not use the newly introduced type is called the double-loop technique, which will be used whenever gcd(w₁, w_k) ≠ 1.
There exists at least one cell in the chain which does not use any kth-type item. Hence, for our numerical example, there must exist a thread j ∈ {1, 3, 5, 7} such that
g³ₜ(j) = g²ₜ(j) ≥ g³ₜ(i) + 17 − 8⌊(i + 18)/8⌋        (w₁ = 8)
Table 8.6 shows the values of g³ₜ(j) and b³(j) of the threads 7, 1, 3, and 5 after the first loop. Note that both threads 1 and 5 can serve as the starting cell for the chain.
Table 8.6 The g³ₜ(j) and b³(j) values after the first loop
i    j    g²ₜ(j)    g³ₜ(i) + 17 − 8⌊(i + 18)/8⌋    g³ₜ(j)    b³(j)    Remarks
5    7    3         4                               4         23
7    1    0         −3                              0         1        Possible starting cell for the chain 1, 3, 5, 7
1    3    0         1                               1         19
3    5    3         2                               3         5        Possible starting cell for the chain 1, 3, 5, 7
Then we start with the thread 5 and visit threads in the second chain in the order
of 7, 1, and 3 one more time, to obtain the thread critical boundaries and gains
shown in Table 8.7.
Table 8.7 Thread critical boundaries using the item types 1, 2, and 3
Threads:      0      1      2       3       4       5      6      7
k = 3:    b = 0  b = 1  b = 18  b = 19  b = 36  b = 5  b = 6  b = 23
          g = 0  g = 0  g = 1   g = 1   g = 2   g = 3  g = 3  g = 4
1. Find the best item type, i.e., the item type with the highest value-to-weight ratio, and name it as the first item. In case of a tie, pick the item with the smallest weight.
2. Create an array T with w₁ entries. Initialize the entry T[i], 0 ≤ i ≤ w₁ − 1, with the ordered pair (b¹(i) = i, g¹ₜ(i) = 0).
3. For k = 2, . . . , n, introduce the challenger types, one type at a time. Starting from the thread 0, we set bᵏ(0) = 0 and gᵏₜ(0) = 0, traverse the threads in the first chain in the order of j = w_k (mod w₁), 2w_k (mod w₁), . . ., compute the (bᵏ(j), gᵏₜ(j)) using the formula (8.16), and update entries in T accordingly. If gcd(w₁, w_k) > 1, use the double-loop technique to compute (bᵏ(j), gᵏₜ(j)) for the threads in each of the remaining chains and update the entries in the array T accordingly.
Since Step 1 takes O(n) time, Step 2 takes O(w₁) time, and Step 3 takes O(w₁) time for each new item type and there are n − 1 new item types, Landa’s algorithm runs in O(nw₁) time.
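The whole algorithm fits in a few lines. The sketch below (Python) follows Steps 1–3, traversing each chain twice — a simple way to realize the double-loop technique, since at least one cell in every chain does not use the new item type. With the item data implied by Tables 8.4–8.7 (assumed here to be v = (8, 3, 17) and w = (8, 5, 18)), it reproduces the boundaries and gains of Table 8.7:

from math import gcd

def landa(values, weights):
    # Step 1: best item = highest value-to-weight ratio; ties -> smaller weight.
    best = min(range(len(values)),
               key=lambda j: (-values[j] / weights[j], weights[j]))
    v1, w1 = values[best], weights[best]
    g = [0] * w1                 # Step 2: largest known gain in each thread
    b = list(range(w1))          #         and its corresponding boundary
    for k in range(len(values)): # Step 3: introduce the challenger types
        if k == best:
            continue
        vk, wk = values[k], weights[k]
        d = gcd(w1, wk)
        for start in range(d):               # one chain per residue class
            i = start
            for _ in range(2 * (w1 // d)):   # two passes around the chain
                j = (i + wk) % w1
                cand = g[i] + vk - v1 * ((i + wk) // w1)   # formula (8.15)
                if cand > g[j]:
                    g[j], b[j] = cand, b[i] + wk
                i = j
    return g, b

print(landa([8, 3, 17], [8, 5, 18]))
# ([0, 0, 1, 1, 2, 3, 3, 4], [0, 1, 18, 19, 36, 5, 6, 23])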
8.3 Exercises
min z = x₁ + x₂ + x₃ + x₄
subject to [3; 1]x₁ + [1; 3]x₂ + [2; 1]x₃ + [1; 2]x₄ = [11; 11]        (9.1)
xⱼ ≥ 0 integers
So, in the case of (9.1), the optimum solution of the associated linear program is x₁ = 11/4, x₂ = 11/4, and x₃ = x₄ = 0. In other words, x₁ and x₂ are the basic variables,
[Fig. 9.1 Possible coordinates of sums of integer multiples of the vectors [3; 1] and [1; 3] from (9.1)]
Since the solutions of the associated linear program are not integers, the only
way to solve it is to increase the values of the non-basic variables x3 and x4.
In solving the associated linear program, we multiply by the inverse of the matrix [3 1; 1 3], with
[3 1; 1 3]⁻¹ = (1/8)[3 −1; −1 3]
The result of multiplying (1/8)[3 −1; −1 3] by the constraint of (9.1) is shown as (9.2).
The constraint of (9.1) becomes a congruence relation, where all numbers are taken (mod 1).
[1; 0]x₁ + [0; 1]x₂ + [5/8; 1/8]x₃ + [1/8; 5/8]x₄ = [6/8; 6/8]        (9.2)
To satisfy the relation (9.2), we see by inspection that x₃ = 1 and x₄ = 1. Then, we obtain the optimum integer solution by substituting x₃ and x₄ into (9.1) and solving the reduced integer program
min z = x₁ + x₂ + x₃ + x₄
subject to [3; 1]x₁ + [1; 3]x₂ + [2; 1]x₃ + [1; 2]x₄ = [8; 8]        (9.3)
xⱼ ≥ 0 integers
In this example, we first solve the associated linear program. If basic variables are
not integers, we solve the equivalent constraint relation by increasing the values of
non-basic variables.
Let us use matrix notation and describe the general approach.
Consider an integer program
max z = c⁰x⁰
subject to A⁰x⁰ ≤ b        (9.4)
x⁰ ≥ 0 integers

max z = cx
subject to Ax = b        (9.5)
x ≥ 0 integers

max z = c_Bx_B + c_Nx_N
subject to Bx_B + Nx_N = b        (9.6)
x_B, x_N ≥ 0 integers

max z = c_BB⁻¹b − (c_BB⁻¹N − c_N)x_N
subject to x_B + B⁻¹Nx_N = B⁻¹b        (9.7)
x_B, x_N ≥ 0 integers
If we consider (9.7) as a linear program, i.e., drop the integer restriction on x_B and x_N, and if B is the optimum basis of the linear program, then the optimum solution to the linear program is
x_B = B⁻¹b, x_N = 0
where
c_BB⁻¹N − c_N ≥ 0
max z = −(7/10)x₁ − (11/10)x₂ + 112/10
1. Treat the integer program as a linear program. If the associated linear program
has an optimum solution in integers, then the optimum solution is also the
optimum solution of the original integer program.
2. If not, map all non-basic columns of the associated linear program and its RHS
into group elements and find the cheapest way of expressing the RHS.
3. Substitute the values of the non-basic variables from Step 2 into the original
integer program, and obtain the values of basic variables.
It turns out that we cannot solve all integer programs simply by following these
three steps. However, most integer programs with large RHS b can be solved with
the above three steps.
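Step 2 — finding the cheapest non-basic combination that matches the RHS in the group — can be carried out as a shortest-path search over the group elements. A minimal sketch (Python), assuming the non-basic columns and the RHS have been scaled to integer numerators over a common denominator D, as in (9.2) where D = 8:

import heapq

def cheapest_group_solution(cols, rhs, costs, D):
    # Minimize sum(costs[j] * x_j) over non-negative integers x_j such that
    # sum_j x_j * cols[j] is congruent to rhs componentwise (mod D).
    target = tuple(r % D for r in rhs)
    start = (0,) * len(rhs)
    best = {start: (0, (0,) * len(cols))}
    pq = [(0, start)]
    while pq:
        d, state = heapq.heappop(pq)
        if state == target:
            return best[state][1]            # multipliers for non-basic columns
        if d > best[state][0]:
            continue                         # stale queue entry
        for j, col in enumerate(cols):
            nxt = tuple((s + a) % D for s, a in zip(state, col))
            nd = d + costs[j]
            if nxt not in best or nd < best[nxt][0]:
                x = list(best[state][1]); x[j] += 1
                best[nxt] = (nd, tuple(x))
                heapq.heappush(pq, (nd, nxt))
    return None

# Non-basic columns of (9.2) are [5/8, 1/8] and [1/8, 5/8]; the RHS is [6/8, 6/8].
print(cheapest_group_solution([(5, 1), (1, 5)], (6, 6), (1, 1), 8))   # (1, 1)

The search returns x₃ = 1, x₄ = 1, the same inspection result found for (9.2) above.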
Let us consider another integer program as shown in (9.10).
min z = x₁ + x₂ + x₃ + x₄
subject to [3; 2]x₁ + [2; 3]x₂ + [1; 2]x₃ + [1; 5]x₄ = [11; 11]        (9.10)
xⱼ ≥ 0 integers
This is shown in Figure 9.2. There are three straight lines from the origin: one straight line with multiples of [3; 2], another straight line with multiples of [2; 3], and the third straight line with multiples of [1; 2]. The point [11; 11] is marked with a black square. The point with coordinates [8; 13] is also shown as a black square (to be used in Example 2, below).
[Fig. 9.2 Possible coordinates of integer multiples of the columns of the integer program (9.10)]
Note that the line from the origin to the point [11; 11] is in the cone spanned by [3; 2] and [2; 3]. The associated linear program of (9.10) has its basic variables x₁ and x₂ with their values
x₁ = 11/5, x₂ = 11/5, x₃ = 0, x₄ = 0, and z = 22/5
since [3 2; 2 3]⁻¹ = (1/5)[3 −2; −2 3].
The result is
[1  0  −1/5  −7/5 | 11/5]
[0  1   4/5  13/5 | 11/5]        (9.11)
or
[4/5; 4/5]x₃ + [3/5; 3/5]x₄ ≡ [1/5; 1/5]
with many possible solutions to (9.11):
x₃ = 4, x₄ = 0
x₃ = 0, x₄ = 2
x₃ = 2, x₄ = 1
Example 1
min z = x₁ + x₂ + x₃ + x₄
subject to [3; 2]x₁ + [2; 3]x₂ + [1; 2]x₃ + [1; 5]x₄ = [44; 44]        (9.12)
xⱼ (mod 1) = 0 and xⱼ ≥ 0 (j = 1, 2, 3, 4)
x₁ = 44/5, x₂ = 44/5, x₃ = 0, x₄ = 0, and z = 88/5.
All the non-basic columns and the RHS are mapped into the congruence relation
(1/5)[−1; 4]x₃ + (1/5)[−7; 13]x₄ = (1/5)[44; 44]
or
[4/5; 4/5]x₃ + [3/5; 3/5]x₄ ≡ [4/5; 4/5]        (9.13)
where the cheapest solution is x₃ = 1 and x₄ = 0. When we substitute the values x₃ = 1 and x₄ = 0 back into (9.12), we get the optimum integer solution of (9.12):
x₁ = 9, x₂ = 8, x₃ = 1, and x₄ = 0 with z = 18 > 88/5 = 17.6.
Here we are very lucky. We pay very little to make all variables integers.
Let us try a different RHS b as shown in (9.12a), which is the same as (9.10):
min z = x₁ + x₂ + x₃ + x₄
subject to [3; 2]x₁ + [2; 3]x₂ + [1; 2]x₃ + [1; 5]x₄ = [11; 11]        (9.12a)
xⱼ ≥ 0 integers
The associated linear program has the optimum solution
x₁ = 11/5, x₂ = 11/5, x₃ = 0, x₄ = 0, and z = 22/5.
And the congruence relation
[−1/5; 4/5]x₃ + [−7/5; 13/5]x₄ = [11/5; 11/5]
or
[4/5; 4/5]x₃ + [3/5; 3/5]x₄ ≡ [1/5; 1/5]        (9.13a)
(1) x₃ = 0 and x₄ = 2 with x₁ = 5 and x₂ = −3.
(2) x₃ = 4 and x₄ = 0 with x₁ = 3 and x₂ = −1.
min z = x₁ + x₂ + x₃ + x₄
subject to [3; 2]x₁ + [2; 3]x₂ + [1; 2]x₃ + [1; 5]x₄ = [44; 44]        (9.12b)
all xⱼ (mod 1) = 0 and xⱼ ≥ 0
As noted above, the associated linear program has the optimum solution
x₁ = 44/5, x₂ = 44/5, x₃ = 0, x₄ = 0, and z = 88/5.
And the congruence relation becomes
[4/5; 4/5]x₃ + [3/5; 3/5]x₄ ≡ [4/5; 4/5]        (9.13b)
with the cheapest solution x₃ = 1 and x₄ = 0.
When we substitute the values x₃ = 1 and x₄ = 0 back into (9.12b), we obtain the optimum integer solution
x₁ = 9, x₂ = 8, x₃ = 1, and x₄ = 0.
Example 2
min z = x₁ + x₂ + x₃ + x₄
subject to [3; 2]x₁ + [2; 3]x₂ + [1; 2]x₃ + [1; 5]x₄ = [8; 13]        (9.14)
xⱼ ≥ 0 integers
Note that [8; 13] is spanned by [3; 2] and [1; 2] and is also spanned by [2; 3] and [1; 2]. Which pair should we choose?
If we choose [2; 3] and [1; 2], we obtain the optimum associated linear program solution
x₂ = 3, x₃ = 2, and z = 5.
If we choose [3; 2] and [1; 2], we obtain the solution
9.2 Exercises
Integer programs are very hard to solve. Even the knapsack problem, one of the
simplest integer programs, is NP-complete. In order to solve this problem, most
people would use the dynamic programming-type algorithm of Gilmore and
Gomory. This algorithm is pseudo-polynomial and has time complexity O(nb),
where b is the weight-carrying capacity of the knapsack. The knapsack problem can
also be solved using Landa’s algorithm, as we saw in Section 8.2. Landa’s algo-
rithm is also pseudo-polynomial and has time complexity O(nw1), where w1 is the
weight of the best item.
When there are two constraints in an integer program, i.e., the RHS b = [b₁; b₂], we can see many nice features of integer programs. We illustrate these features in Figure 10.1. In this figure, there are two cones: the outer cone and the inner cone. The outer cone, with its origin at [0; 0], is composed of two thick lines that extend along the vectors [2; 1] and [1; 3]. The inner cone, with its origin at [3; 4], is composed of two thin lines that are parallel to the two thick lines.
We solve an integer program
max x₀ = c₁x₁ + c₂x₂ + ⋯
subject to [2; 1]x₁ + [1; 3]x₂ + ⋯ = [b₁; b₂]        (10.1)
xⱼ ≥ 0 integers
If the associated linear program has [2 1; 1 3] as its optimal basis, then the integer program (10.1) also has the same optimal basis, provided that [b₁; b₂] can be expressed as integer combinations of [2; 1] and [1; 3] and images of some non-basic columns.
[Fig. 10.1 The outer cone and inner cone of the integer program in (10.1) corresponding to basis vectors [2; 1] and [1; 3]]
max x₀ = c₁x₁ + c₂x₂ + c₃x₃ + c₄x₄ + ⋯
subject to [2; 1]x₁ + [1; 3]x₂ + [4; 3]x₃ + [1; 5]x₄ + ⋯ = [b₁; b₂]        (10.2)
xⱼ ≥ 0 integers
In summary, there are four regions in Figure 10.1 and in Figure 10.2: (i), (iia), (iib), and (iii):
(i) The region inside the inner cone, with the RHS [b₁, b₂] very far away from [3; 4] in Figure 10.1 or from [5; 8] in Figure 10.2. If [b₁, b₂] is in this region,
[Fig. 10.2 The outer cone and inner cone of the integer program in (10.2) corresponding to basis vectors [4; 3] and [1; 5]]
(1) we have to solve the congruence relation, and (2) we do not have to worry
that the basic variables of the associated linear program would be forced to be
negative.
(ii) Here [b₁, b₂] lies within the strips that are between parallel thin and thick lines in the figure. We use (iia) to denote the upper strip and (iib) to denote the lower strip. These strips extend farther and farther away from the origin. (Note that the two strips could be of different widths and vary from problem to problem.) In this case, (1) we have to solve the congruence relation, and (2) we should consider that the basic variables of the associated linear program could be forced to be negative. If the [b₁, b₂] values are very large, the final integer solution may contain the vectors [2; 1] and [1; 3] many times over as in Figure 10.1, or the vectors [4; 3] and [1; 5] many times over as in Figure 10.2.
(iii) This region is the intersection of the two strips of constant widths
corresponding to (iia) and (iib), i.e., it is a parallelogram near the origin.
This is in some sense the worst region, particularly if there are many
constraints in the integer program. For [b1, b2] in this region, (1) we may
have no integer solution, and (2) we have troubles of the same nature as noted
for (iia) and (iib).
[Fig. 10.3 The outer cone and inner cone of the integer program in (10.3) corresponding to basis vectors [5; 1] and [1; 5]]
Since we have no control over the integer program, we may have to find a
basis—not the optimal basis of the associated linear program—and try our luck
again and again. However, there is some good news. If we randomly select [b1, b2]
to be our RHS of the integer program, then there are more points in region (i) than
all integer points in the other three regions (iia), (iib), and (iii).
If we have an integer program with two constraints, and most vectors have
components with small values such as
max x₀ = c₁x₁ + c₂x₂ + ⋯ + cₙxₙ
subject to [5; 1]x₁ + [1; 5]x₂ + [1; 1]x₃ + [2; 1]x₄ + ⋯ = [b₁; b₂]        (10.3)
xⱼ ≥ 0 integers
then presumably we do not even have to solve the congruence relations of the associated linear program of (10.3) or worry about its basic variables being forced to be negative. This is because most [b₁, b₂] in (10.3) can be expressed as positive integer combinations of the given vectors in (10.3). In Figure 10.3, we assume that [5 1; 1 5] is the optimum basis of the associated linear program of (10.3). Note that the two strips of (10.3) are narrow, the region (iii) is small, and most of the integer points are in region (i).
Linear and Integer Programming
in Practice 11
When formulating a problem as a linear program or integer program, the first step is
to establish clear and complete notation. Then, three questions must be answered: What are the decision variables? What is the objective function? What are the constraints?
Once you have answered these three questions, you will be well on your way to writing down the desired linear or integer program.
Example 1: Fitting a Line. Given a set of points with coordinates (x1, y1), . . ., (xn,
yn), formulate a linear program to find the best-fit linear function, i.e., the line for
which the maximum distance of any given point above or below the line is
minimized (see Figure 11.1).
To write this as a linear program, the first step is to observe that the general form
of a linear function is ax þ by þ c ¼ 0. Thus, to find the best-fit line, we seek the set
of values a, b, and c such that the maximum deviation of any given point above or
below the line is minimized. If we introduce a variable e (for “error”) to denote the
maximum deviation of any given point from the line corresponding to the linear
function, then we want to minimize e. We formulate the desired linear program as
min e
subject to axⱼ + byⱼ + c ≤ e
axⱼ + byⱼ + c ≥ −e
where there is a pair of inequality constraints for each given point (xj, yj).
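With a solver at hand (here scipy's linprog; the data points below are made up), the formulation can be tried directly. To avoid the degenerate all-zero solution of the general form ax + by + c = 0, this sketch fixes b = −1, i.e., it fits y = ax + c and minimizes the worst vertical error e:

import numpy as np
from scipy.optimize import linprog

pts = np.array([(1, 1.2), (2, 1.9), (3, 3.2), (4, 3.8)])   # hypothetical points
x, y = pts[:, 0], pts[:, 1]
ones = np.ones((len(x), 1))
# Variables v = [a, c, e]; constraints a*x_j + c - y_j <= e and y_j - a*x_j - c <= e.
A_ub = np.vstack([np.hstack([x[:, None], ones, -ones]),
                  np.hstack([-x[:, None], -ones, -ones])])
b_ub = np.concatenate([y, -y])
res = linprog(c=[0, 0, 1], A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None), (None, None), (0, None)])
a, c0, e = res.x
print("best-fit line: y = %.3f x + %.3f, max error %.3f" % (a, c0, e))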
max Σⱼ x₁ⱼ        (amount of flow leaving the source)
variable” xjk to denote whether the kth color is used on vertex vj. This leads us to the
following integer program:
min Σₖ yₖ
subject to Σₖ xⱼₖ = 1        (j = 1, . . . , n)        (each vertex has only one color)
Some readers will realize that this “schedule classes into classrooms” illustration of
graph coloring is actually the interval partitioning problem, which is greedily
solvable in polynomial time. This being said, the general graph colorability prob-
lem is intractable and hence germane to the use of integer programming.
The reader may observe recurring motifs in the above integer program examples,
notably, the introduction of binary “indicator” variables to represent “used by” versus
“not used by” or “assigned to” versus “not assigned to”, etc. We next present some
additional techniques that are useful in formulating linear and integer programs.
Linear Programming
Here, the absolute value operator is problematic. However, we can avoid the |xⱼ| expression by creating two variables xⱼ⁺, xⱼ⁻:
xⱼ = xⱼ⁺ − xⱼ⁻
|xⱼ| = xⱼ⁺ + xⱼ⁻
xⱼ⁺, xⱼ⁻ ≥ 0        ∀j ∈ J
The original and the rewritten linear programs are the same only if at least one of xⱼ⁺, xⱼ⁻ is always zero for each j. This can be proved by contradiction. If both xⱼ⁺ and xⱼ⁻ are positive for a particular j and δ = min(xⱼ⁺, xⱼ⁻), then subtracting δ from both xⱼ⁺ and xⱼ⁻ leaves the value xⱼ unchanged, but reduces |xⱼ| by 2δ, which contradicts the optimality assumption.
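A tiny sketch of the split in action (Python with scipy's linprog; a made-up instance): minimize |x₁| + |x₂| subject to x₁ + 2x₂ = 4, with variable order (x₁⁺, x₁⁻, x₂⁺, x₂⁻):

from scipy.optimize import linprog

# min (x1+ + x1-) + (x2+ + x2-)  s.t.  (x1+ - x1-) + 2*(x2+ - x2-) = 4
res = linprog(c=[1, 1, 1, 1],
              A_eq=[[1, -1, 2, -2]], b_eq=[4],
              bounds=[(0, None)] * 4)
print(res.x, res.fun)    # x2+ = 2, everything else 0; objective value 2

The optimum sets x₂ = 2 and x₁ = 0, and as the proof above predicts, neither pair has both split variables positive.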
Min or Max Terms in the Objective. Consider the following optimization:
min maxₖ∈K Σⱼ∈J cₖⱼxⱼ
subject to Σⱼ∈J aᵢⱼxⱼ ≷ bᵢ        ∀i ∈ I
xⱼ ≥ 0        ∀j ∈ J
Here, the objective that we want to minimize is the maximum over several terms.
This can be accomplished by creating an additional variable z which upper-bounds
all the terms over which the max is taken. Hence, our objective becomes the
minimization of z. The resulting linear program is then:
min z
subject to Σⱼ∈J aᵢⱼxⱼ ≷ bᵢ        ∀i ∈ I
Σⱼ∈J cₖⱼxⱼ ≤ z        ∀k ∈ K
xⱼ ≥ 0        ∀j ∈ J
This looks daunting but, with appropriate manipulation of variables, can also be formulated as a linear program. The first step is to introduce a variable t, where t = 1/(Σⱼ∈J dⱼxⱼ + β). The optimization can then be rewritten as
min Σⱼ∈J cⱼxⱼt + αt
subject to Σⱼ∈J aᵢⱼxⱼ ≷ bᵢ        ∀i ∈ I
Σⱼ∈J dⱼxⱼt + βt = 1
xⱼ ≥ 0        ∀j ∈ J
t > 0
Substituting yⱼ = xⱼt (so xⱼ = yⱼ/t) makes everything linear in the new variables:
min Σⱼ∈J cⱼyⱼ + αt
subject to Σⱼ∈J aᵢⱼyⱼ ≷ bᵢt        ∀i ∈ I
Σⱼ∈J dⱼyⱼ + βt = 1
yⱼ ≥ 0        ∀j ∈ J
t > 0
Variables That Are Minimum Values. Consider the situation where we wish to use a variable that is always equal to the minimum of other variables, e.g.,
y = min{xᵢ}        i ∈ I
Lᵢ ≤ xᵢ ≤ Uᵢ        i ∈ I
We can realize such a variable y by introducing new binary variables dᵢ, where
Lᵢ ≤ xᵢ ≤ Uᵢ        i ∈ I
y ≤ xᵢ        i ∈ I
y ≥ xᵢ − (Uᵢ − L_min)(1 − dᵢ)        i ∈ I
Σᵢ dᵢ = 1
The situation where a variable must always be equal to the maximum of other variables, that is, y = max{xᵢ}, is handled analogously (the binary variable dᵢ = 1 if xᵢ is the maximum value, and dᵢ = 0 otherwise).
Variables That Are Absolute Differences. It may be desirable for a variable to be the absolute difference of two other variables, i.e.,
y = |x₁ − x₂|        0 ≤ xᵢ ≤ U
With binary variables d₁ (d₁ = 1 if x₁ − x₂ ≥ 0) and d₂ (d₂ = 1 if x₂ − x₁ > 0), this is expressed as
0 ≤ xᵢ ≤ U
0 ≤ y − (x₁ − x₂) ≤ 2U·d₂
0 ≤ y − (x₂ − x₁) ≤ 2U·d₁
d₁ + d₂ = 1
The logical AND of binary variables, d = min{dᵢ}, i ∈ I, dᵢ binary, is realized by
d ≤ dᵢ        i ∈ I
d ≥ Σᵢ dᵢ − (|I| − 1)
d ≥ 0
The logical OR, d = max{dᵢ}, i ∈ I, dᵢ binary, is realized by
d ≥ dᵢ        i ∈ I
d ≤ Σᵢ dᵢ
d ≤ 1
The logical NOT,
d′ = NOT d, d binary,
is achieved using
d′ = 1 − d
Either-Or Bounds. To require L₁ ≤ x ≤ U₁ or L₂ ≤ x ≤ U₂, introduce a binary variable y and write
x ≤ U₁ + (U₂ − U₁)y
x ≥ L₁ + (L₂ − L₁)y
Fixed-Cost Form of the Objective. Real-world costs often have both fixed and variable components, yielding cost minimization formulations such as

$$\min \; C(x)$$

$$\text{subject to} \quad a_i x + \sum_{j\in J} a_{ij} w_j \gtreqless b_i \quad \forall i \in I$$

$$x \ge 0, \qquad w_j \ge 0 \quad \forall j \in J$$

$$\text{where} \quad C(x) = \begin{cases} 0, & x = 0 \\ k + cx, & x > 0 \end{cases}$$

Introducing a binary variable $y$ to indicate whether $x > 0$, together with a constant $u$ that upper-bounds $x$, the fixed charge $k$ is captured as follows:

$$\min \; k y + c x$$

$$\text{subject to} \quad a_i x + \sum_{j\in J} a_{ij} w_j \gtreqless b_i \quad \forall i \in I$$

$$x \ge 0 \ \text{integer}, \qquad w_j \ge 0 \quad \forall j \in J$$

$$x \le u y, \qquad y \ \text{binary}$$
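A minimal fixed-charge sketch using SciPy's milp (all numbers invented, and $x$ kept continuous here for simplicity): producing any $x > 0$ incurs a setup cost $k = 10$ plus a unit cost $c = 2$, and demand requires $x \ge 3$.

import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Variables: [x, y] with y binary; u is any valid upper bound on x.
k, c, u = 10.0, 2.0, 100.0
cons = LinearConstraint(
    np.array([[1.0, 0.0],     # x >= 3           (demand)
              [1.0, -u]]),    # x - u*y <= 0     (x > 0 forces y = 1)
    [3.0, -np.inf],
    [np.inf, 0.0])
res = milp(c=np.array([c, k]),                 # minimize c*x + k*y
           constraints=cons,
           integrality=np.array([0, 1]),
           bounds=Bounds([0, 0], [np.inf, 1]))
print("x =", res.x[0], " y =", res.x[1], " cost =", res.fun)   # 3.0, 1.0, 16.0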
Either-Or Constraints. Sometimes only one of a pair of constraints needs to hold:

$$\min \sum_{j\in J} c_j x_j$$

$$\text{subject to either} \quad \sum_{j\in J} a_{1j} x_j \le b_1 \qquad (1)$$

$$\text{or} \quad \sum_{j\in J} a_{2j} x_j \le b_2 \qquad (2)$$

$$x_j \ge 0 \quad \forall j \in J$$

The above constraints can then be rewritten as follows, where $M_1$ and $M_2$ are large constants:

$$\min \sum_{j\in J} c_j x_j$$

$$\text{subject to} \quad \sum_{j\in J} a_{1j} x_j \le b_1 + M_1 y$$

$$\sum_{j\in J} a_{2j} x_j \le b_2 + M_2 (1 - y)$$

$$x_j \ge 0 \quad \forall j \in J, \qquad y \ \text{binary}$$
To convert the strict inequality that arises when constraint (1) is violated (i.e., $\sum_{j\in J} a_{1j} x_j > b_1$), the following pair of either-or constraints is used, where $E$ is a small positive constant:

$$\sum_{j\in J} a_{1j} x_j \ge b_1 + E$$

$$\sum_{j\in J} a_{2j} x_j \le b_2$$
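The big-M device is easy to exercise numerically. In the sketch below (invented data; SciPy's milp assumed, with the constraints written in the "$\ge$" direction), the solver may satisfy either of the two constraints and picks whichever is cheaper:

import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Minimize x1 + x2 subject to (1) x1 + 2*x2 >= 4 OR (2) 4*x1 + x2 >= 4.
# Variables: [x1, x2, y]; y = 1 relaxes (1), y = 0 relaxes (2).
M = 10.0                                  # large enough to void a constraint
cons = LinearConstraint(
    np.array([[1.0, 2.0, M],              # x1 + 2*x2 + M*y >= 4
              [4.0, 1.0, -M]]),           # 4*x1 + x2 - M*y >= 4 - M
    [4.0, 4.0 - M],
    [np.inf, np.inf])
res = milp(c=np.array([1.0, 1.0, 0.0]),
           constraints=cons,
           integrality=np.array([0, 0, 1]),
           bounds=Bounds([0, 0, 0], [np.inf, np.inf, 1]))
print("x =", res.x[:2], " cost =", res.fun)   # enforcing (2) wins: x = [1, 0]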
Product of Variables. A final important case is when we must handle the product of a binary variable $d$ and a continuous, upper-bounded variable $0 \le x \le u$. In this case,

$$y = dx$$

can be rewritten as

$$y \le u d$$

$$y \le x$$

$$y \ge x - u(1 - d)$$

$$y \ge 0$$
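These four inequalities can also be checked by brute force. The snippet below (plain Python, sample values made up) confirms that they admit exactly $y = dx$:

# For binary d and 0 <= x <= u, the feasible y under the four constraints
# y <= u*d, y <= x, y >= x - u*(1 - d), y >= 0 is exactly y = d*x.
u = 5.0
for d in (0, 1):
    for x in (0.0, 1.25, 2.5, u):
        candidates = [i * 0.25 for i in range(int(u / 0.25) + 1)]
        feasible = [y for y in candidates
                    if y <= u * d and y <= x and y >= x - u * (1 - d) and y >= 0]
        assert feasible == [d * x]
print("y = d*x linearization verified on sample points")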
We close this chapter by pointing out how one actually obtains solutions to linear
and integer programs in practice. Of course, it is not possible to solve real-world
mathematical programs by hand, nor is it reasonable to implement the Simplex or
other solution methods from scratch. Today, there are many mature, scalable
software packages available either for free or for nominal license costs; the
references at the end of this book point to examples such as CPLEX from IBM.
To input your linear or integer program to a solver, it is necessary to translate the
program into one of several standard formats that go by names such as AMPL,
GAMS, CPLEX, MPS, etc. Let us see a few examples.
Recall the linear program example (3.7) in Chapter 3:

$$\max \; x_1 + 2x_2 + 3x_3$$

$$\text{subject to} \quad 12x_1 + 12x_2 + 6x_3 \le 30$$

$$4x_1 + 10x_2 + 18x_3 \le 15$$

$$x_1, x_2, x_3 \ge 0$$

Format 1: AMPL (“A Mathematical Programming Language”). The example can be encoded in a model file, say example.mod, as follows:

var x1 >= 0;
var x2 >= 0;
var x3 >= 0;
maximize TOTAL: x1 + 2*x2 + 3*x3;
subject to LIM1: 12*x1 + 12*x2 + 6*x3 <= 30;
subject to LIM2: 4*x1 + 10*x2 + 18*x3 <= 15;
Format 2: GAMS (“General Algebraic Modeling System”). The same example can be encoded in a file example.gms as follows:

Variables TOTAL;
Positive Variables x1, x2, x3;
Equations
obj "max TOTAL"
lim1 "lim1"
lim2 "lim2";
obj.. TOTAL =e= x1 + 2*x2 + 3*x3;
lim1.. 12*x1 + 12*x2 + 6*x3 =l= 30;
lim2.. 4*x1 + 10*x2 + 18*x3 =l= 15;
Model example /all/;
Solve example using lp maximizing TOTAL;
Format 3: CPLEX LP format. The same program can also be written in a file example.lp as follows:

Maximize
TOTAL: x1 + 2 x2 + 3 x3
Subject to
LIM1: 12 x1 + 12 x2 + 6 x3 <= 30
LIM2: 4 x1 + 10 x2 + 18 x3 <= 15
Bounds
x1 >= 0
x2 >= 0
x3 >= 0
End
It can be somewhat tedious to write and debug the small codes or scripts that
translate your linear or integer program into one of these input formats. However,
such codes and scripts can tremendously improve productivity and the automation
of experimental studies—and their elements are highly reusable from one project
to another. Finally, you will have a great feeling of satisfaction and accomplishment after successfully framing your problem as a linear or integer program, feeding it correctly to a solver, and obtaining the desired solution!
Appendix: The Branch and Bound Method
of Integer Programming
The integer algorithms described in previous chapters are classified as cutting plane
type, since they all generate additional constraints or cutting planes. In this section,
we shall discuss an entirely different approach, which can be called the tree search
method. The tree search type of algorithm includes the branch and bound method,
the additive algorithm, the direct search algorithm, and many others.
The common features of the tree search type of algorithm are that (1) they are easy to understand, (2) they are easy to program on a computer, and (3) the upper bound on the number of steps needed in the algorithm is of the order $O(k^n)$, where $n$ is the number of variables. Features (1) and (2) are two advantages of the tree search type of algorithm. Feature (3) is a disadvantage, since it implies exponential growth of the amount of computation as the problem becomes larger.
In the previous chapters, we have concentrated on the cutting plane type
algorithms because they give better insight into the problem. For small problems,
the tree search type needs less computing time than the cutting plane type, but the
growth of the computing time is more rapid in the tree search type.
Consider an integer program

$$\min \; z = c y$$

$$\text{subject to} \quad A y \ge b \qquad \text{(A.1)}$$

$$y \ge 0 \quad \text{integers}$$

We first solve (A.1) as a linear program, ignoring the integer constraints. If some variable $y_k$ in the optimum solution has a non-integer value $y_k = [y_k] + f_k$ (where $[y_k]$ denotes the integer part and $0 < f_k < 1$), we solve two linear programs, one with the additional constraint $y_k = [y_k]$ and one with $y_k = [y_k] + 1$. If one of the two linear programs, say the one with $y_k = [y_k]$, still does not give integer solutions, i.e., $y_l = [y_l] + f_l$, then two more linear programs are solved, one with $y_k = [y_k]$, $y_l = [y_l]$ and one with $y_k = [y_k]$, $y_l = [y_l] + 1$ as the additional constraints.
All solutions obtained in this way can be partially ordered as a tree with the root
of the tree representing the linear program solution obtained without any additional
integer constraints. When a solution $y^0$ does not satisfy the integer constraints, it branches into two other solutions $y^1$ and $y^2$. The solution $y^0$ is called the predecessor of $y^1$ and $y^2$, and $y^1$ and $y^2$ are called the successors of $y^0$.
If the successors of $y^1$ and $y^2$ are all infeasible, then we have to branch from $y^0$ again with $y_l = [y_l] - 1$ and $y_l = [y_l] + 2$. A node may have more than two successors. For example, a node with $y_l$ non-integer may have many successors, corresponding to $y_l = [y_l]$, $y_l = [y_l] - 1$, ..., $y_l = [y_l] + 1$, $y_l = [y_l] + 2$, etc. A node is called a terminal node if it has no successors; this definition implies that a terminal node represents either a feasible integer solution or an infeasible integer solution.
The idea of the branch and bound method lies in the following two facts:
1. Because the predecessor has fewer constraints than the successors and additional
constraints cannot improve the value of the objective function, the optimum
value of a successor is always larger than or equal to the optimum value of the
predecessor.
2. If two integer feasible solutions have the same predecessor, one with $y_l = [y_l] + 1$ and one with $y_l = [y_l] + 2$, then the optimum value of the first solution is less than the optimum value of the second. That is to say, the further the value of $y_l$ is from the linear programming solution, the worse is the resulting value of the objective function.
During the computation of the branch and bound method, we keep the optimum
value z* of the best integer feasible solution obtained so far. If a node with a
non-integer solution has an optimum value greater than z*, then all the successors of
that node must have optimum values greater than z*. Hence, there is no sense in
branching from that node. The advantage of the branch and bound method is that it
can be used for mixed integer problems.
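To make the method concrete, here is a compact sketch in Python (using SciPy's linprog for the linear-program relaxations). Note that it branches with the bound constraints $y_k \le [y_k]$ and $y_k \ge [y_k] + 1$, a common variant of the equality-constraint branching described above, and the example data are invented:

import numpy as np
from scipy.optimize import linprog

def branch_and_bound(c, A_ub, b_ub, bounds, best=(np.inf, None)):
    # Minimize c.y subject to A_ub y <= b_ub, bounds, y integer.
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    if not res.success or res.fun >= best[0]:
        return best                      # infeasible node, or bounded out by z*
    frac = [k for k, v in enumerate(res.x) if abs(v - round(v)) > 1e-6]
    if not frac:
        return (res.fun, res.x)          # integer solution: new incumbent z*
    k = frac[0]
    f = np.floor(res.x[k])
    lo, hi = bounds[k]
    for nb in ((lo, f), (f + 1, hi)):    # two successor nodes
        child = list(bounds)
        child[k] = nb
        best = branch_and_bound(c, A_ub, b_ub, child, best)
    return best

# Invented example: min -5*y1 - 4*y2
# subject to 6*y1 + 4*y2 <= 24, y1 + 2*y2 <= 6, y >= 0 integer.
z, y = branch_and_bound(np.array([-5.0, -4.0]),
                        np.array([[6.0, 4.0], [1.0, 2.0]]),
                        np.array([24.0, 6.0]),
                        [(0, None), (0, None)])
print("optimum z* =", z, " at y =", y)   # expect z* = -20 at y = [4, 0]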
Epilogue
In this epilogue, we hope to give a global view of the book that can serve as a guide for going through the subjects of linear and integer programming again.
Consider a linear program with a single constraint. If we want to maximize the
objective function of the linear program, the optimum solution uses variable xk
associated with the item with the largest ratio of its value to its weight.
When a linear program has two constraints, we can use the same intuitive idea, except that the denominator of the ratio is replaced by a $2 \times 2$ square matrix. Dividing by a $2 \times 2$ square matrix is the same as multiplying by the inverse of the square matrix, and this generalization eventually leads us to the classical Simplex Method for solving problems with m constraints and n columns.
There are several notations used in the first three chapters: $\min z$, $\max z$, or $\max w$ and $\max v$. And when we start using the Simplex tableau, we always use $\max x_0$ and $\min z$, say $x_0 + 2x_3 - 2x_4 - x_5 = 0$.
The Simplex Method is shown in Tableaus 4.1, 4.2, and 4.3. A typical Simplex
tableau is shown in Tableau 4.1, which shows x0 ¼ 0, x1 ¼ 4, and x2 ¼ 2. After an
iteration, we have x0 ¼ 8, x2 ¼ 6, and x4 ¼ 4 in Tableau 4.2. When all modified
coefficients in the top row are non-negative as shown in Tableau 4.3, the solution is
optimum.
If we were to use brute force to solve a linear program, we would need to search over all choices of $(m + 1)$ columns from $(n + 1)$ columns, and for each choice solve the simultaneous equations. In the Simplex Method, we choose the column with the most negative coefficient to enter the basis and then do a pivot step. The pivot element $a_{rs}$ in the $r$th row and $s$th column is decided by a feasibility test.
If the modified coefficient is already non-negative, we do not need to find all entries in that column. Also, we do not need to carry all entries forward, iteration after iteration.
The revised Simplex Method is introduced in Chapter 6 with Tableaus 6.1, 6.2, 6.3, 6.4, 6.5, and 6.6. The entries that are not calculated are replaced by question marks, which makes the savings of the revised Simplex Method visible at a glance.

We could try
$$\min \; z = x_1 + x_2 + x_3 + x_4 + x_5 + x_6$$

$$\text{subject to} \quad x_1 \begin{bmatrix} 3 \\ 1 \end{bmatrix} + x_2 \begin{bmatrix} 1 \\ 3 \end{bmatrix} + x_3 \begin{bmatrix} 2 \\ 1 \end{bmatrix} + x_4 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + x_5 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + x_6 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 11 \\ 11 \end{bmatrix}$$

$$x_j \ge 0$$
When we raise the values of non-basic variables, we may force the basic variables
to be negative. In general, we could take the optimum solutions of the associated
linear program and combine them with non-basic variables that satisfy the congru-
ence relationships. However, it is possible that a better basis of the associated linear
program may not give a better value of the objective function.
References
1. Bland RG (1977) New finite pivoting rules for the simplex method. Math Oper Res 2
(2):103–107
2. Dantzig GB, Orchard-Hays W (1953) Alternate algorithm for revised simplex method using product form of the inverse. RAND Report RM-1268. The RAND Corporation, Santa Monica
3. Dantzig GB (1963) Linear programming and extensions. Princeton University Press, Princeton
4. Dantzig GB, Thapa MN (1997) Linear programming. In: Introduction, vol 1; Theory and
extensions, vol 2. Springer series in operations research. Springer, New York
5. Hu TC, Shing M-T (2001) Combinatorial algorithms. Dover, Mineola
6. Hu TC (1969) Integer programming and network flows. Addison-Wesley, Reading
7. Lemke CE (1954) The dual method of solving the linear programming problems. Naval Res
Log Q 1(1):36–47
8. Von Neumann J, Morgenstern O (1944) Theory of games and economic behavior. Wiley,
New York
9. Wolfe P (1963) A technique for resolving degeneracy in linear programming. J SIAM Appl
Math 11(2):205–211
17. Gomory RE (1963) Large and non-convex problems in linear programming. In: Proceedings of
the symposium on the interactions between mathematical research and high-speed computing
of American Mathematics Society, vol 15, pp 125–139
18. Gomory RE, Hu TC (1964) Synthesis of a communication network. SIAM J 12(2):348–369
19. Gomory RE, Johnson EL (2003) T-space and cutting planes. Math Program Ser B 96:341–375
20. Gomory RE (1965) Mathematical programming. Am Math Mon 72(2):99–110
21. Gomory RE (1965) On the relation between integer and non-integer solutions to linear
programs. Proc Natl Acad Sci U S A 53(2):260–265, Also published in Dantzig GB, Veinott
AF Jr (eds) (1968) Mathematics of the decision sciences, Part 1. Lectures in applied mathe-
matics, vol 2. American Mathematical Society, pp. 288–294
22. Gomory RE (1969) Some polyhedra related to combinatorial problems. J Linear Algebra Appl
2(4):451–558
23. Gomory RE, Johnson EL (1972) Some continuous functions related to corner polyhedra, part
I. Math Program 3(1):23–85
24. Gomory RE, Johnson EL (1972) Some continuous functions related to corner polyhedra, part
II. Math Program 3(3):359–389
25. Gomory RE, Baumol WJ (2001) Global trade and conflicting national interest. MIT Press,
Cambridge
26. Gomory RE, Johnson EL, Evans L (2003) Corner polyhedra and their connection with cutting
planes. Math Program 96(2):321–339
27. Hu TC (1969) Integer programming and network flows. Addison-Wesley, Reading, Chapter 17
by R. D. Young
28. Hu TC, Landa L, Shing M-T (2009) The unbounded knapsack problem. In: Cook WJ, Lovász
L, Vygen J (eds) Research trends in combinatorial optimization. Springer, Berlin, pp 201–217
29. Johnson EL (1980) Integer programming: facets, subadditivity and duality for group and semi-
group problems. CBMS-NSF conferences series in applied mathematics 32
30. Schrijver A (1986) Theory of linear and integer programming, Wiley interscience series in
discrete mathematics and optimization. Wiley, New York
31. Schrijver A (2002) Combinatorial optimization. Springer, Berlin, Vol A Chapters 1–38,
pp 1–648; Vol B Chapters 39–69, pp 649–1218; Vol C Chapters 70–83, pp 1219–1882
32. Young RD (1965) A primal (all integer) integer programming algorithm. J Res Natl Bur Stand B 69(3):213–250. See also Chapter 17, pp 287–310 in the book by T. C. Hu
33. Bellman RE (1956) Dynamic programming. R-295 RAND Corporation. The RAND Corpora-
tion, Santa Monica
34. Bellman RE, Dreyfus SE (1962) Applied dynamic programming. R-352-PR RAND Corpora-
tion. The RAND Corporation, Santa Monica
35. Dreyfus SE, Law AM (1977) The art and theory of dynamic programming, vol 130, Mathe-
matics in science and engineering. Academic, New York
References on NP-Completeness
42. Cook SA (1973) A hierarchy for nondeterministic time complexity. J Comput Syst Sci
7(4):343–353
43. Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of
NP-completeness. Freeman, San Francisco
44. Karp RM (1972) Reducibility among combinatorial problems. In: Miller RE, Thatcher JW (eds) Complexity of computer computations. Plenum, New York, pp 85–103
Index
B
Basic feasible solution, 42
Basic solution, 42
  basic variables, 42
  degenerate, 42
Basis, 42
  basic vectors, 42
Best-fit linear function, 117
Bland’s rule, 54
Boundary, 94
Branch and bound, 87

C
Chain, 99
Challenge, 98
Column generating technique, 81, 86
Compatibility, 21
Complementary slackness, 61
  strong, 65
  weak, 64–65
Concave function, 33
Cone, 36, 108
  inner and outer, 113
Congruence relation, 104
Consistent, 30

D
Dantzig, G.B., vii
Dimension of solution space, 29
Double-loop technique, 100
Dual feasible, 68
Dual program, 61
Dual simplex method, 66–71
Dual variables, 61
Duality
  canonical form, 63
  standard form, 63
  theorem of, 63

E
Economic interpretation, 56–57, 65
Equivalent formulations, 39
Extreme point (= corner point), 16, 32

F
Feasibility test, 53, 54, 74
Feasible, 14, 42
  basic, 42

H
Hoffman, A.J., viii
Homemaker Problem, 22, 35, 52, 62, 63

I
Indicator variable, 119–120
Integer program, 18, 87
  solving an integer program, 103–112

J
Johnson, E.L., viii

K
Kleitman, D.J., viii
Knapsack problem, 20, 87–101
  gain, 95
  optimal knapsack problem, 93
  periodic, 90

L
Landa’s algorithm, 93–101
  chain, 99
  challenge type, 98
  high-level description, 101
  special boundary, 96
Linear equations, 3
  consistent, 30
  redundant, 31
Linear program, 13, 16, 18, 61
  canonical form, 47
  geometric interpretation, 34–35
  standard form, 47

O
Objective function, 18
  fixed-cost form, 126
  optimum solution, 34

P
Pill salesperson, 24, 63, 85
Pivot
  Bland’s rule, 54
  column, 46
  operation, 46
  row, 46
Pricing operation, 57, 66
Primal feasible, 67
Primal program, 61
Product of variables, 128

R
Ratio method, 18–27
Ratio test, 49, 57
Redundant, 31
Relative cost, 56, 73
Revised simplex method, 73–79
Row echelon form, 3

S
Separating hyperplanes, 35, 63
  theorem, 35
Shadow prices, 57, 61, 79
Shing, M.-T., viii