Linear and Nonlinear Programming with Maple™:
An Interactive, Applications-Based Approach
Paul E. Fishback
Grand Valley State University
Allendale, Michigan, U.S.A.
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize to
copyright holders if permission to publish in this form has not been obtained. If any copyright material has
not been acknowledged, please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmit-
ted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented,
including photocopying, microfilming, and recording, or in any information storage or retrieval system,
without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.
com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and
registration for a variety of users. For organizations that have been granted a photocopy license by the CCC,
a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation without intent to infringe.
Fishback, Paul E.
Linear and nonlinear programming with Maple : an interactive, applications‑based
approach / Paul E. Fishback.
p. cm. ‑‑ (Textbooks in mathematics)
Includes bibliographical references and index.
ISBN 978‑1‑4200‑9064‑2 (hardcover : alk. paper)
1. Linear programming. 2. Maple (Computer file) I. Title. II. Series.
QA402.5.F557 2010
519.7’2‑‑dc22 2009040276
List of Tables xv
Foreword xix
I Linear Programming 1
1 An Introduction to Linear Programming 3
1.1 The Basic Linear Programming Problem Formulation . . . . . 3
1.1.1 A Prototype Example: The Blending Problem . . . . . . 4
1.1.2 Maple’s LPSolve Command . . . . . . . . . . . . . . . . 7
1.1.3 The Matrix Inequality Form of an LP . . . . . . . . . . . 8
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2 Linear Programming: A Graphical Perspective in R2 . . . . . 13
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.3 Basic Feasible Solutions . . . . . . . . . . . . . . . . . . . . . . 19
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
A Projects 319
A.1 Excavating and Leveling a Large Land Tract . . . . . . . . . . 319
A.2 The Juice Logistics Model . . . . . . . . . . . . . . . . . . . . . 322
A.3 Work Scheduling with Overtime . . . . . . . . . . . . . . . . . 325
A.4 Diagnosing Breast Cancer with a Linear Classifier . . . . . . . 327
A.5 The Markowitz Portfolio Model . . . . . . . . . . . . . . . . . 330
A.6 A Game Theory Model of a Predator-Prey Habitat . . . . . . . 334
Bibliography 383
Index 387
List of Figures
4.1 Feasible region for FuelPro LP along with contour z = 46. . . . 120
4.2 Ordered pairs (δ1 , δ2 ) for which changes in the first two con-
straints of FuelPro LP leave the basic variables, {x1 , s1 , x2 }, un-
changed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.3 Feasible region for LP (4.34). . . . . . . . . . . . . . . . . . . . . 139
5.1 Feasible region and solution for the GLKC ILP relaxation. . . . 147
5.2 Branching on x1 in the feasible region of the relaxation LP. . . 149
5.3 Tree diagram for ILP (5.1). . . . . . . . . . . . . . . . . . . . . . 153
5.4 Tree diagram for MILP (5.11). . . . . . . . . . . . . . . . . . . . 156
5.5 Tour consisting of 4 destinations. . . . . . . . . . . . . . . . . . 158
5.6 Two subtours of 4 destinations. . . . . . . . . . . . . . . . . . . 160
xiii
xiv
6.5 The paraboloid f(x1, x2) = x1² + x2², together with one chord illustrating the notion of convexity. . . . 196
6.6 The paraboloid f(x1, x2) = x1² + x2², together with the linearization, or tangent plane, at an arbitrary point. . . . 198
6.7 Surface plot of ConPro objective function. . . . . . . . . . . . . 201
6.8 The quadratic form f(x1, x2) = 2x1² + 2x1x2 + 3x2². . . . 208
6.9 Quadratic form f(x1, x2) = −2x1² + 2x1x2 − 3x2². . . . 208
6.10 Quadratic form f(x1, x2) = x1² + 4x1x2 + x2². . . . 209
The book itself is organized into two parts, with the first focusing on lin-
ear programming and the second on nonlinear programming. Following
these two parts are four appendices, which contain course projects, Maple
resources, and a summary of important linear algebra facts.
A primary goal of this book is to “bridge the gap,” so to speak, that separates
the two primary classes of textbooks on linear and nonlinear programming
currently available to students. One consists of those management science
books that lack the level of mathematical detail and rigor found in this text.
Typically, they assume little to no linear algebra background knowledge on
the part of the reader. Texts in the second class are better suited for gradu-
ate level courses on mathematical programming. In simple terms, they are
written at “too high a level” for this book’s intended audience.
Undergraduate students who use this book will be exposed early to topics
from introductory linear algebra, such as properties of invertible matrices
and facts regarding nullspaces. Eigenvalues, of course, are essential for
the classification of quadratic forms. Most important, however, is the extent
to which partitioned matrices play a central role in developing major ideas.
In particular, the reader discovers in Section 2.4 that the simplex algorithm
may be viewed entirely in terms of multiplication of such matrices. This per-
spective from which to view the algorithm provides streamlined approaches
for constructing the revised simplex method, developing duality theory, and
approaching the process of sensitivity analysis.
Some linear algebra topics arising in this text are not ones usually encountered
by students in an introductory course. Most notable are certain properties of
the matrix transpose, the Spectral Theorem, and facts regarding matrix norms.
As these topics arise in the text, brief digressions summarize key ideas, and
the reader is referred to appropriate locations in the appendices.
Maple
As the title indicates, Maple is the software of choice for this text. While many
practitioners in the field of mathematical programming do not consider Maple
well-suited for large-scale problems, this software is ideal in terms of its abil-
ity to meet the pedagogical goals of this text and is accessible to students at
over 90% of advanced research institutions worldwide. By utilizing Maple’s
symbolic computing components, its numeric capabilities, its graphical ver-
satility, and its intuitive programming structures, the student who uses this
text should acquire a deep conceptual understanding of major mathematical
programming principles, along with the ability to solve moderately sized
real-world applications.
The text does not assume prior Maple experience on the part of its reader.
The Maple novice should first read Appendix C, Getting Started with Maple.
It provides a sufficient amount of instruction for the reader unfamiliar with
this software, but no more than that needed to start reading Chapter 1. Maple
commands are then introduced throughout the book, as the need arises, and
a summary of all such commands is found in Appendix D. Finally, sample
Maple work sheets are provided in the text itself and are also accessible online
at www.lp-nlp-with-maple.org.
Waypoints
Any mathematics text should strive to engage its reader. This book is no
exception, and, hopefully, it achieves success in this regard. Interspersed
throughout are “Waypoints,” where the reader is asked to perform a simple
computation or to answer a brief question, either of which is intended to assess
his or her understanding. Such Waypoints are meant to facilitate the “hands-
on,” or interactive, learning approach suggested by this book’s title. The
instructor can utilize them in various ways, for example, by using them as a
means to intersperse lecture with small-group discussions or as an additional
source of text exercises.
Projects
There are several different possible outlines for using this book as a primary
course text, depending upon course goals and instructional time devoted to
the various topics. In terms of linear programming, core topics include Chap-
ter 1, Sections 2.1-2.4, Chapter 3, Sections 4.1-4.2, and Section 5.1. These cover
the essentials of linear programming, basic applications, duality, sensitivity
analysis, and integer linear programming via the branch and bound method.
Sections 2.5, 4.3, and 5.2 address optional topics, namely the interior point
algorithm, the dual simplex method, and the cutting plane algorithm, re-
spectively. When considering which of these three additional topics to cover,
however, the instructor should bear in mind that the cutting plane algorithm
requires use of the dual simplex method.
Acknowledgments
Linear Programming
Chapter 1
An Introduction to Linear Programming
Of course, the calculus provides one means of solving this problem. If we let
x and y denote the dimensions of the corral, then the available fence length
dictates x + 2y = 100, where x denotes the side opposite the barn, and the area, A,
is given by A = xy. Rewriting A as a function of a single variable, say x, and
solving the equation dA/dx = 0 for x,
we deduce that Farmer Brown maximizes the enclosed area by using 50 feet
of fence opposite the barn and 25 feet for each of the opposite two sides.
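A quick Maple computation confirms this result (our own sketch, not part of the original text; the variable names are ours):

> A:=x*(100-x)/2:             # enclosed area, with x the side opposite the barn
> solve(diff(A,x)=0,x);       # critical point of A
                                      50
> subs(x=50,A);               # maximal enclosed area, in square feet
                                     1250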
A fictitious, yet simple and illuminating example used throughout this chap-
ter is that of the FuelPro Petroleum Company, a small refinery that sells two
grades of fuel, premium and regular unleaded. For purposes of simplicity, we
will assume that only two stocks, stock A and stock B, are used to produce the
two grades and that 28 and 32 gallons of each stock, respectively, are avail-
able. The following table summarizes how much of each stock is required for
each grade. (Assume all quantities are measured in gallons.)
Some logical questions the company might wish to address are the following:
1. What are the possible production combinations of fuel types that will
satisfy the above conditions or set of constraints?
Waypoint 1.1.1. In the preceding example, experiment and list four
combinations of premium and regular unleaded fuel types, three of
which satisfy the listed constraints and one of which does not. Of
those three that do, which produces the largest profit?
Let us denote the number of gallons of premium grade and regular unleaded
produced in a given hour by the variables x1 and x2 , respectively. Using
information regarding the production of the two fuel grade types, we can
construct inequalities relating these two variables.
1. From information provided in the first row of Table 1.1, we know that
each gallon of premium requires 2 gallons of stock A and each gallon
of regular unleaded requires 2 gallons of stock A. Since only 28 gallons
of stock A are available to FuelPro each hour, we obtain the inequality

    2x1 + 2x2 ≤ 28.

2. Likewise, each gallon of premium requires 3 gallons of stock B, and each
gallon of regular unleaded requires 2 gallons. Since only 32 gallons of
stock B are available each hour, 3x1 + 2x2 ≤ 32.
3. At most 8 gallons of premium grade fuel can be sold each hour. Thus
x1 ≤ 8.
4. Only nonnegative values of x1 and x2 make sense for the problem situ-
ation. Therefore, x1 , x2 ≥ 0.
Suppose next that FuelPro earns a profit of 40 cents on each gallon of premium
and 30 cents on each gallon of regular unleaded, so that, measured in tenths of
dollars, the hourly profit equals z = f(x1, x2) = 4x1 + 3x2. Combining this function
with the above constraints, the optimization problem faced by FuelPro can be
written as follows:

    maximize z = 4x1 + 3x2
    subject to
        x1 ≤ 8
        2x1 + 2x2 ≤ 28
        3x1 + 2x2 ≤ 32
        x1, x2 ≥ 0.

In general terms, a linear programming problem, or LP, seeks values of decision
variables x1, x2, . . . , xn that optimize a linear objective function

    z = f(x1, x2, . . . , xn) = c1x1 + c2x2 + · · · + cnxn,      (1.2)

subject to finitely many linear inequality constraints in these variables (Definition 1.1.1).
While Definition 1.1.1 appears to stipulate that all constraints are of inequality
type, equations are permissible as well, as long as the constraint equations are
linear in the variables x1 , x2 , . . . , xn . However, in Section 1.1.3 we will discover
how the set of constraints of any LP can be expressed entirely in terms of
inequalities.
It is important to note that any LP has hidden assumptions due to its linear
nature. For example, in our petroleum model we assume that there is no
interaction between the amounts of premium and regular unleaded, e.g.,
no product terms x1 x2 . We are also assuming that the objective function is
linear. In certain other situations, the function f might be better modeled by
a quadratic function, in which case the problem is not an LP but a quadratic
programming problem. Finally, we are assuming that the decision variables
can take on fractional, noninteger values, which may not always be realistic
for the application at hand. In Chapter 5 we discuss problems in which we
desire only integer-valued solutions for one or more of our decision variables.
> restart;
> with(Optimization);
[ImportMPS, Interactive, LPSolve, LSSolve, Maximize, Minimize,
NLPSolve, QPSolve]
> f:=(x1,x2)->4*x1+3*x2;
> constraints:=[x1<=8,2*x1+2*x2<=28,3*x1+2*x2<=32];
> LPSolve(f(x1,x2),constraints,'maximize',assume=nonnegative);
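The output of this command is not reproduced in the extraction; it should consist of the optimal objective value together with the optimizing decision variable values, essentially (up to floating-point formatting):

    [46., [x1 = 4., x2 = 10.]]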
maximize z = c · x (1.4)
subject to
Ax ≤ b
x ≥ 0.
> c:=Vector[row]([4,3]);
                      c := [4, 3]
> A:=Matrix(3,2,[1,0,2,2,3,2]);
                           [1  0]
                      A := [2  2]
                           [3  2]
> b:=<8,28,32>;
                           [ 8]
                      b := [28]
                           [32]
> LPSolve(c,[A,b],assume=nonnegative,'maximize');
                      [46, [4, 10]]
Observe that Maple’s output consists of a list. The first entry is the optimal
objective value; the second entry is the corresponding vector of decision
variable values.
The second constraint can be rewritten as −x1 − 5x2 ≤ −10. The third is the
combination of two constraints, 2x1 − x2 ≤ 3 and −2x1 + x2 ≤ −3. Thus, (1.5)
becomes
maximize z = c · x
subject to
Ax ≤ b
x ≥ 0,
where

        [ 1   1]                          [  6]            [x1]
    A = [−1  −5],    c = [2  3],    b =   [−10],    and x = [x2].
        [ 2  −1]                          [  3]
        [−2   1]                          [ −3]
Exercises in this section demonstrate other means for expressing LPs in matrix
inequality form, including those containing variables unrestricted in sign or
those whose constraints involve absolute values.
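To preview these two cases: an absolute value constraint such as |x1 − x2| ≤ 2, which appears in Exercise 1(f), is equivalent to the pair of linear inequalities x1 − x2 ≤ 2 and −x1 + x2 ≤ 2, while a variable u unrestricted in sign may be replaced by the difference u = u⁺ − u⁻ of two nonnegative variables u⁺ and u⁻.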
(a)
(b)
(c)
(d)
maximize z = 2x1 − x2
subject to
x1 + 3x2 ≥ 8
x1 + x2 ≥ 4
x1 − x2 ≤ 2
x1 , x2 ≥ 0
(f)
minimize z = x1 + 4x2
subject to
x1 + 2x2 ≤ 5
|x1 − x2 | ≤ 2
x1 , x2 ≥ 0.
Table 1.2 summarizes grass and forb data regarding digestive capacity
and energy content. Food bulk records the extent to which the mass of
a substance increases after it enters the digestive system and becomes
liquid-saturated. For example, two grams of grass, when consumed,
expands to 2 × 1.64 = 3.28 grams within the digestive system. To distin-
guish the mass of food prior to consumption from that in the digestive
system, we use units of gm-dry and gm-wet, respectively.
The digestive capacity of the vole is 31.2 gm-wet per day, and the vole must
consume enough food to meet an energy requirement of at least 13.9 kcal per day.
1 Based upon Pendegraft, [36], (1997). For this problem it helps to have some Legos, specifically
(a) Let x1 and x2 denote the number of grams of grass and number of
grams of forb, respectively, consumed by the vole on a given day.
Construct an LP that minimizes the vole’s total daily foraging time
given the digestive capacity and energy constraints. Verify that
your units make sense for each constraint and for your objective
function. Then write this LP in matrix inequality form.
The points in the x1 x2 -plane that satisfy all the constraints and sign conditions
in (1.6) form the feasible region for the FuelPro model.
Waypoint 1.2.1. Sketch and shade the graph of the feasible re-
gion in the x1 x2 -plane. While this task is easily accomplished by
hand, Maple can also do so through the use of the inequal com-
mand as described in Appendices C and D. The list of inequalities
used within this command is given by [x1<=8, 2*x1+2*x2<=28,
3*x1+2*x2<=32, x1>=0, x2>=0]. The resulting graph should resemble
that in Figure 1.1.
[Figure 1.1: The feasible region for the FuelPro LP in the x1x2-plane.]
The function f = 4x1 + 3x2 from (1.6) is a multivariate function whose graph
consists of a surface in R3 . Recall that a contour or level curve of such a function
is a curve in the x1x2 -plane that results from holding z constant. In other words,
it is a curve that results from slicing the surface with a horizontal plane of the
form z = c, where c is a constant. A contour diagram is merely a collection of
contours plotted on a single set of axes.
Maple's contourplot command, located in the plots package, produces contour diagrams. Var-
ious options include specifying the output values used for the contours,
specifically via contours=L, where L is a list of contour values. Figure 1.2 il-
lustrates an example, using the expression x1² + x2², along with contour values
of 1, 2, 3, and 4:
> contourplot(x1^2+x2^2,x1=-3..3,x2=-3..3,contours=[1,2,3,4]);
For the FuelPro problem, the objective function f is a linear function of two
variables so that its contours form parallel lines.
Waypoint 1.2.2. Superimpose on your feasible region from the previ-
ous Waypoint the graphs of several contours of the objective function
f (x1 , x2 ) = 4x1 + 3x2. (Recall from Appendices C and D that plot struc-
tures can be superimposed by use of the display command.) Then
use these contours to determine the amounts of premium and regular
unleaded fuel types in the feasible region that correspond to the max-
imum profit. Your solution should of course agree with that obtained
by using the LPSolve command in Section 1.1.
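One possible Maple realization of this Waypoint (a sketch; the plot ranges are our own choice) combines the commands named above:

> with(plots):
> FR:=inequal([x1<=8,2*x1+2*x2<=28,3*x1+2*x2<=32,x1>=0,x2>=0],
   x1=0..12,x2=0..16):                  # shaded feasible region
> CP:=contourplot(4*x1+3*x2,x1=0..12,x2=0..16,
   contours=[20,30,40,46]):             # contours of the objective function
> display(FR,CP);                       # superimpose the two plot structures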
Recall from Section 1.1 that the solution of an LP can take one of four possible
forms.
1. The LP can be infeasible, meaning that its feasible region is the empty
set.
2. The LP can have a unique optimal solution, as is the case for the FuelPro
problem.
3. The LP can have alternative optimal solutions. That is, there exist at least
two optimal solutions.

4. The LP can be unbounded, meaning that, within the feasible region, the
objective function takes on values that are arbitrarily large (for a maximization problem).
The feasible region for (1.7), together with objective function contours z = 32,
z = 34 and z = 36, is shown in Figure 1.3.
In this example, contours of the objective function f are parallel to the bound-
ary of the constraint 3x1 + 2x2 ≤ 18. Moreover, objective values corresponding
to these contours illustrate how all points on the segment from (2, 6) to (4, 3)
yield the same optimal objective value, z = 36. One way to represent this
segment is through a parametric representation as follows:

    x = (1 − t)(2, 6) + t(4, 3),    where t ∈ [0, 1].

[Figure 1.3: Feasible region for LP (1.7), together with the objective contours z = 32, z = 34, and z = 36.]
1. For each LP below, graph the feasible region, and sketch at least three
contours of the objective function. Use your contours to determine the
nature of the solution. If the LP has a unique optimal solution, list the
values of the decision variables along with the corresponding objective
function value. If the LP has alternative optimal solutions, express the
general solution in parametric form. You may wish to combine Maple’s
inequal, contourplot, and display commands to check each solution.
(a)
(b)
(c)
(d)
(e)
2. Use the graphical methods discussed in this section to solve the Foraging
Herbivore Model, Exercise 4, from Section 1.1.
maximize z = 2x1 − x2
subject to
x1 + 3x2 ≥ 8
x1 + x2 ≥ 4
x1 − x2 ≤ 2
x1 , x2 ≥ 0.
(a) Sketch the feasible region for this LP, along with the contours
corresponding to z = 50, z = 100, and z = 150.
(b) If M is an arbitrarily large positive real number, what is a descrip-
tion, in terms of M, of the feasible points (x1 , x2 ) corresponding to
z = M?
(c) Explain why the LP is unbounded.
maximize z = f (x1 , x2 )
subject to
x1 ≥ 1
x2 ≤ 1
x1 , x2 ≥ 0.
We see that the feasible region has five “corner points” or “extreme points,”
[Figure: The feasible region, with its extreme point (4, 10) labeled.]
one of which is the optimal solution. This result holds true in general. Namely,
for the LP (1.4) having a unique optimal solution, the optimal solution occurs
at an “extreme point” of the feasible region in Rn . The simplex algorithm, which
we develop in Section 2.1, is a linear-algebra based method that graphically
corresponds to starting at one extreme point on the boundary of the feasible
region, typically (0, 0), and moving to adjacent extreme points through an
iterative process until an optimal solution is obtained. The purpose of this
section is to lay a foundation for this algorithm by developing appropriate
terminology that connects the underlying ideas of systems of equations to
those of the feasible region and its extreme points.
We will use the terminology that (1.8) is the original LP (1.6) expressed in
standard form.
Note that requiring the slack variables to be nonnegative assures us that our
three equations are equivalent to the three inequalities from (1.6). Recalling
the matrix inequality form notation (1.4) and denoting the vector of slack
variables by
s1
s = s2 ,
s3
we can express the standard matrix form as follows:
    maximize z = c · x                                      (1.9)
    subject to
                 [x]
        [A | I3] [s] = b
        x, s ≥ 0,
where I3 denotes the 3-by-3 identity matrix. Thus we have converted the
m = 3 inequalities in n = 2 decision variables from the original LP into
a matrix equation, or system of equations, involving m = 3 equations in
m + n = 5 unknowns. Note in particular that [A|I3] is a 3-by-5 matrix and that
the vector [x; s], obtained by stacking x on top of s, belongs to R5.
Waypoint 1.3.1. Before proceeding, explain why the matrix equation
[A|I3] [x; s] = b has infinitely many solutions.
" #
x
We now seek to determine the set of solutions to [A|I3 ] = b, ignoring mo-
s
mentarily the sign restrictions placed on the variables in the LP. Recall that
the row space (resp. column space) of a matrix is the set of all linear combina-
tions of the row vectors (resp. column vectors) that comprise the matrix. The
dimensions of the row and column spaces are equal, having common value
referred to as the rank of the matrix. In the FuelPro example,
             [1  0  1  0  0]
    [A|I3] = [2  2  0  1  0],
             [3  2  0  0  1]

a matrix whose rank equals m = 3.
On the other hand, the null space of [A|I3 ] is the set of all solutions to the
homogeneous matrix equation [A|I3 ] v = 0. By the Rank-Nullity Theorem
(Theorem B.6.1), the sum of its dimension, referred to as the nullity, and the
rank equals the number of columns of [A|I3 ], which is m + n = 5. Hence, the
dimension of the null space is n = 2.
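These rank and nullity computations are easily checked in Maple (a sketch of our own, using the LinearAlgebra package):

> with(LinearAlgebra):
> M:=Matrix([[1,0,1,0,0],[2,2,0,1,0],[3,2,0,0,1]]):   # the matrix [A|I3]
> Rank(M);                                            # returns 3
> nops(NullSpace(M));                                 # nullity: 5 - 3 = 2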
" #
x
Finally, we recall that the solution to a consistent matrix equation [A|I3 ] =b
s
can be written in the form " #
x
= vh + vp ,
s
where vh is the general solution to the homogeneous matrix equation [A|I3 ] v =
Waypoint 1.3.2. For the FuelPro LP, determine the general solution to
the matrix equation [A|I3] [x; s] = b. For each extreme point in Figure
1.5, choose the free variables in a way that produces the given extreme
point. Next to the extreme point, label the vector [x; s] with all five of
its entries.
[Figure 1.5: The feasible region for the FuelPro LP, on which the extreme points are to be labeled.]
The solutions to the preceding matrix equation play a special role in the
context of the LP, in that they correspond to what we call basic solutions.
Definition 1.3.1. Consider an LP consisting of m inequalities and n decision
variables, whose matrix inequality form is given by (1.4). Then the constraints
of the standard matrix form of this LP are represented by the matrix equation

    [A | Im] [x; s] = b,                                    (1.10)

where s denotes the vector of slack variables. A basic solution of the LP is a
solution of (1.10) obtained by setting n of the m + n variables, called the nonbasic
variables, equal to zero, provided the m columns of [A|Im] corresponding to the
remaining m variables, the basic variables, form a linearly independent set.
The fact that the basic variables, denoted by the vector x′ , are uniquely deter-
mined in Definition 1.3.1 follows from the Invertible Matrix Theorem and the
assumption that the m columns form a linearly independent set. For when
this occurs, these columns yield an m by m invertible submatrix, B, of [A|Im ],
so that Bx′ = b has a unique solution.
For the FuelPro example, basic feasible solutions correspond to the points
(0, 0), (8, 0), (8, 4), (4, 10), and (0, 14).
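For instance, continuing the preceding Maple sketch, the basic feasible solution at (4, 10) arises from choosing {x1, x2, s1} as basic variables: the corresponding columns of [A|I3] form an invertible matrix B, and solving Bx′ = b recovers the solution (the column indices below are our own choice):

> B:=M[1..3,[1,2,3]]:           # columns of [A|I3] for x1, x2, and s1
> LinearSolve(B,<8,28,32>);     # returns the vector <4, 10, 4>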
1. Sketch the feasible region for the LP. Label all extreme points.

2. Introduce slack variables and express the LP in standard matrix form.

3. Determine the basic feasible solutions for the LP. Complete this last task without actually performing elementary row operations. Instead use the extreme points from (1).

4. Graphically solve the LP by using contours of the objective function.

Exercises Section 1.3
1. For each of the following LPs, write the LP in the standard matrix form
    maximize z = c · x
    subject to
                 [x]
        [A | Im] [s] = b
        x, s ≥ 0.
(a)
(b)
2. Show that in the FuelPro LP no basic solution exists for which x1
and s1 are both nonbasic. This result underscores the importance of the
linear independence condition in Definition 1.3.1.
3. The following LP has its feasible region given by the segment connecting
the points x = (3, 4) and x = (4, 2):

where the weights σi satisfy σi ≥ 0 for all i and where σ1 + σ2 + · · · + σp = 1.
The set of all convex combinations of vectors in V is referred to as
the convex polyhedron, or convex hull, generated by V. Suppose that the
feasible region of the LP is the convex polyhedron generated by its
basic feasible solutions, V = {x1, x2, . . . , xp}. Show that if the LP has an
optimal solution, then it has a basic feasible solution that is also optimal.
(Hint: Start with an LP having optimal solution x̃. Express x̃ as a convex
combination of the basic feasible solutions and show that the objective
function evaluated at one of the basic feasible solutions is at least as
large as the objective function evaluated at x̃.)
Chapter 2
The Simplex Algorithm
1. We begin at a basic feasible solution of the LP, which corresponds to an
extreme point of the feasible region, typically the origin.

2. We use the top row of the tableau to determine which nonbasic variable
should then become positive in order to increase the objective function
most rapidly. Because this variable will switch from nonbasic to basic,
we call it the entering variable.

3. We then find the basic variable that swaps places with the entering
variable and becomes nonbasic.

4. We pivot, by means of elementary row operations, to produce the tableau
corresponding to the new basic feasible solution.

5. We repeat the previous steps until the objective function can no longer
be increased. The basic feasible solution at that stage is the optimal
solution.
If x2 = 0, the constraint equations take the form

    x1 + s1 = 8
    2x1 + s2 = 28
    3x1 + s3 = 32.

After the first iteration, in which x1 replaces s1 as a basic variable, the
constraint equations, with s1 = 0, become

    x1 + 0x2 = 8
    2x2 + s2 = 12
    2x2 + s3 = 8.
We again apply the ratio test. Note that the first equation places no
restriction on the growth of x2 . The remaining two equations limit the
increase in x2 to 6 and 4, respectively. We thus focus on the third equa-
tion, which tells us that x2 replaces s3 as a basic variable, and we pivot
on the entry highlighted in Table 2.2. The new tableau is given in Table
2.3.
The new BFS corresponds to the extreme point at (8, 4). We have BV =
{x1 = 8, x2 = 4, s2 = 4}, NBV = {s1 , s3 } and z = 44.
6. The process is repeated one last time. The variable s1 becomes basic.
The corresponding ratios are given by 8, 4, and −8/3. Since the negative
ratio corresponds to unlimited growth in s1, the value of 4 “wins” the
ratio test. Thus s1 replaces s2 as a basic variable and we pivot on the
highlighted entry in Table 2.3. The resulting tableau appears in Table
2.4.
The new BFS corresponds to the extreme point at (4, 10). We have
BV = {x1 = 4, x2 = 10, s1 = 4}, NBV = {s2 , s3 } and z = 46. Note that
all coefficients in the top row of the tableau are nonnegative, so we
have obtained the optimal solution, and FuelPro maximizes its profit if
x1 = 4, x2 = 10, in which case z = 46, i.e., a profit of $4.60 per hour. These
results agree with those obtained using Maple’s LPSolve command and
graphical tools from Section 1.2.
• When performing the simplex algorithm, one should keep careful
track of the basic and nonbasic variables and understand the ratio-
nale behind the ratio test. Doing so will prove beneficial during later
discussions of special cases, duality, and sensitivity analysis. A Maple
worksheet found at the end of this section can facilitate this process.
• The preceding example was fairly straightforward. As such, it raises
a variety of questions. For example, “What happens in the process if
the LP is unbounded? What happens if the LP has alternative optimal
solutions? What happens if the LP has equality constraints?” We address
all these questions in the next section.
Since 5 is the most positive nonbasic variable coefficient in the top row, x1 is
the entering variable. The third constraint row yields the smallest ratio of 3/2,
so s3 becomes nonbasic and is replaced by x1. The new tableau is shown in
Table 2.6.
The new BFS corresponds to the extreme point at (3/2, 0), where BV =
{x1 = 3/2, s1 = 1/2, s2 = 3/2}, NBV = {x2, s3}, and z = −15/2. Since all nonbasic vari-
able coefficients in the top row of the tableau are nonpositive, this solution is
optimal.
Waypoint 2.1.1. Now solve the minimization LP (2.2) by converting
it to a maximization problem.
The Maple worksheet presented below applies to any LP expressed in matrix inequality
form as
maximize z = c · x (2.3)
subject to
Ax ≤ b
x ≥ 0.
In this case, the worksheet is implemented to solve the FuelPro LP. The user
enters the quantities c, A, and b, as well as the number of constraints and
number of decision variables. (The Vector command is used to enter c as a
row vector.) A for-do loop structure then creates arrays of decision variables
and slack variables of appropriate respective sizes. These quantities are used
to create a matrix labeled LPMatrix. Note that the task of keeping track of
basic versus nonbasic variables is still left to the user.
> restart;with(LinearAlgebra):
> c:=Vector[row]([4,3]);
# Create row vector with objective coefficients.
c := [4, 3]
> A:=Matrix(3,2,[1,0,2,2,3,2]);
# Matrix of constraint coefficients.
         [1  0]
    A := [2  2]
         [3  2]
> b:=<8,28,32>;
# Constraint bounds.
         [ 8]
    b := [28]
         [32]
> n:=2:m:=3:
# Enter the number of decision variables and constraints,
respectively.
> x:=array(1..n):s:=array(1..m):
# Create arrays of decision and slack variables.
> Labels:=
Matrix(1,2+n+m,[z,seq(x[i],i=1..n),seq(s[j],j=1..m),RHS]):
# Create a top row of variable labels for the tableau
> LPMatrix:=
< UnitVector(1,m+1) | <-c,A> | <ZeroVector[row](m),
IdentityMatrix(m)> | <0,b> >;
# Create matrix corresponding to the tableau.
                [1  −4  −3  0  0  0   0]
    LPMatrix := [0   1   0  1  0  0   8]
                [0   2   2  0  1  0  28]
                [0   3   2  0  0  1  32]
> Tableau:=proc(M) return(<Labels,M>): end:
# Procedure for printing tableau with labels.
> RowRatios:=proc(M,c) local k:
for k from 2 to nops(convert(Column(M,c+1),list)) do
if M[k,c+1]=0 then print(cat("Row ",convert(k-1,string),
" Undefined")) else
print(cat("Row ",convert(k-1,string)," Ratio =",
convert(evalf(M[k,nops(convert(Row(M,k),list))]/M[k,c+1]),
string))) end if; end do; end:
# The ratio test procedure applied to column c of M.
> Iterate:=proc(M,r,c)
RowOperation(M,r+1,(M[r+1,c+1])^(-1),inplace=true):
Pivot(M,r+1,c+1,inplace=true): return(Tableau(M)):
end:
# Iteration of the simplex algorithm applied to row r,
# column c.
> Tableau(LPMatrix);
# Display initial tableau.
    [z  x1  x2  s1  s2  s3  RHS]
    [1  −4  −3   0   0   0    0]
    [0   1   0   1   0   0    8]
    [0   2   2   0   1   0   28]
    [0   3   2   0   0   1   32]
> RowRatios(LPMatrix,1);
# Determine ratios corresponding to column 1.
> Iterate(LPMatrix,1,1);
# Pivot on entry in row 1, column 1 of matrix.
> RowRatios(LPMatrix,2);
# Determine ratios corresponding to column 2.
> Iterate(LPMatrix,3,2);
# Pivot on entry in row 3, column 2 of matrix.
    [z  x1  x2    s1  s2   s3  RHS]
    [1   0   0  −1/2   0  3/2   44]
    [0   1   0     1   0    0    8]
    [0   0   0     1   1   −1    4]
    [0   0   1  −3/2   0  1/2    4]
> RowRatios(LPMatrix,3);
# Determine ratios corresponding to column 3.
> Iterate(LPMatrix,2,3);
# Pivot on entry in row 2, column 3 of matrix to obtain final
tableau.
    [z  x1  x2  s1   s2  s3  RHS]
    [1   0   0   0  1/2   1   46]
    [0   1   0   0   −1   1    4]
    [0   0   0   1    1  −1    4]
    [0   0   1   0  3/2  −1   10]
(a)
(b)
(c)
(d)
What happens? Specify the attribute of this problem that makes solving
it more challenging than solving the minimization problem in Exercise
1.
3. A constraint is said to be binding at the optimal solution of an LP if there
is equality in that constraint in the optimal solution. For the first two
LPs in Exercise 1, determine which constraints are binding.
The feasible region for the LP is shown in Figure 2.1. Superimposed on the
feasible region is the contour z = 12 of the objective function. The fact that the
contour z = 12 and all others are parallel to one edge of the feasible region
can be immediately seen by observing that the objective function is a scalar
multiple of the left-hand side of the first constraint.
[Figure 2.1: The feasible region, together with the objective contour z = 12.]
After introducing two slack variables s1 and s2 and performing one iteration
of the simplex algorithm, we arrive at the tableau given in Table 2.7.
The nonbasic variables are given by NBV = {x2 , s1 } and neither has a nega-
tive coefficient in the top row, thereby indicating that the current solution is
optimal. But let’s take a closer look. The top row of the tableau dictates that
increasing s1 will decrease the value of z. (In fact, it will return us to the initial
basic feasible solution at the origin!) On the other hand, increasing the other
nonbasic variable x2 from its current value of zero will not change the current z value
because x2 currently has a coefficient of zero in the top row. An additional iteration
of the simplex algorithm in which x2 becomes basic leads to x1 = 4/3, x2 = 8/3,
and s1 = s2 = 0. Thus, each of the vectors x = (2, 0) and x = (4/3, 8/3) is a solution
to the LP. By Exercise 5 of Section 1.3, so is every point on the line segment
connecting the two vectors:

               [2]       [4/3]
    x = (1 − t)[0] + t   [8/3],
               [0]       [ 0 ]
               [2]       [ 0 ]

where t ∈ [0, 1]. This infinite set of vectors corresponds to points along the
edge of the feasible region connecting (2, 0) and (4/3, 8/3).
Assume that x1 , s2 , and s3 are basic. The variable x2 is the entering variable,
and clearly any strict increase in its value will lead to a strict increase in z. The
row of the tableau corresponding to the first constraint places no limit on the
increase of x2 due to the coefficient of zero in the x2 column. Thus, if either
of the marked entries were also zero, then the corresponding constraints
would also place no limit on the increase of x2. But the same conclusion can
be made if either of the marked entries were less than zero. The variable
x2 could increase as much as desired, and the slack variables could increase in
a positive manner, ensuring that all constraint equations were still satisfied.
Thus, if a variable in the top row of an LP maximum problem has a negative coefficient
and all the entries below the top row are less than or equal to zero, then the LP is
unbounded.
Waypoint 2.2.1. The tableau shown in Table 2.9 results when the sim-
plex algorithm is used to solve a standard maximization LP. Assume
that a ≥ 0 and that BV = {x1 , s1 }.
1. For which values of a does the LP have alternative optimal
solutions?
2. For which values of a is the LP unbounded?
2.2.3 Degeneracy
Recall that Definition 1.3.2 states the basic variables in any basic feasible
solution must be nonnegative. For the examples we have considered thus
far, basic variables have always been positive. A degenerate LP is one that
possesses a basic feasible solution in which at least one basic variable equals
zero. Degenerate LPs are important from the standpoint of convergence of
the simplex algorithm.
For a nondegenerate LP, the new value of each entering variable must be
nonzero, so that the objective function must strictly increase at each iteration.
In particular, the simplex algorithm cannot encounter the same basic feasible
solution twice, so the algorithm must terminate due to the finite number of
such solutions.
For a degenerate LP, the objective function can remain unchanged over suc-
cessive iterations. One of the simplest examples illustrating this possibility is
given by the LP
[Figure 2.2: The feasible region for the degenerate LP.]

Applying the ratio test, we find that, due to degeneracy, the ratio corresponding
to the bottom row is zero.
To determine which of s1 or s2 becomes nonbasic, we consider the equations
corresponding to rows 1 and 2. Since x2 remains nonbasic, the equation in row
1 corresponds to x2 + s1 = 6, whereas that in row 2 yields x1 + s2 = 0. The first
equation permits x1 to increase by 6. However, in the second equation, the
basic variable s2 = 0 so that x1 cannot increase without s2 becoming negative.
If we treat the ratio in the bottom row as the smallest “positive” ratio and
pivot in a way that x1 replaces s2 as a basic variable, we obtain the new
tableau in Table 2.11.
Note that the objective function value has not changed from zero, nor has the
extreme point in the feasible region. Only the roles of x1 and s2 as basic
versus nonbasic variables have changed; their values after the first iteration
remain equal to zero.
At the next iteration, x2 becomes basic, and we are once again faced with a
ratio of 0. In this case, the equations corresponding to the bottom two rows
are given by 2x2 + s1 = 6 and x1 − x2 = 0. The first permits x1 to increase by
3. The second permits x2 to increase without bound, provided x1 does so as
well in a manner that their difference remains equal to 0. Thus, we now let
x2 replace s1 as a basic variable, and the final tableau becomes that shown in
Table 2.12.
(a) Suppose the current basic solution is not optimal but that the LP is
bounded. After the next iteration of the simplex algorithm, what
are the new values of the decision variables and objective function
in terms of a and/or b?
(b) Suppose the current basic solution is optimal but that the LP has
alternative optimal solutions. Determine another such solution in
terms of a and/or b.
(c) If the LP is unbounded, what can be said about the signs of a and
b?
4. Table 2.14 results after two iterations of the simplex algorithm are ap-
plied to a maximization problem having objective function,
z = f (x1 , x2 ) = x1 + 3x2 .
(b) Suppose the coefficients of the objective function are both positive.
If the extreme point at the origin corresponds to the initial basic
feasible solution, what is the smallest possible number of iterations
required to obtain a degenerate basic feasible solution? What is
the largest possible number of iterations? What attribute of the
objective function determines this outcome?
In certain parts of the U.S., studies have shown that the vole, or common field
mouse, is an herbivore whose diet consists predominantly of grass and a type
of broad-leafed herb known as forb. Empirical studies suggest that the vole
forages in a way that minimizes its total foraging time, subject to a set of two
constraints [5].
For example, two grams of grass, when consumed, expands to 2 × 1.64 = 3.28
grams within the digestive system. To distinguish the mass of food prior to
consumption from that in the digestive system, we use units of gm-dry and
gm-wet, respectively.
The digestive capacity of the vole is 31.2 gm-wet per day, and the vole must
consume enough food to meet an energy requirement of at least 13.9 kcal per
day. Assume that the vole’s foraging rate is 45.55 minutes per gram of grass
and 21.87 minutes per gram of forb.
Let x1 and x2 denote the quantity of grass and forb, respectively, consumed by
the vole on a given day. Units of both x1 and x2 are gm-dry, and consumption
results in a total mass of 1.64x1 + 2.67x2 gm-wet within the digestive system.
Food bulk limitations then lead to the constraint

    1.64x1 + 2.67x2 ≤ 31.2.
    maximize z = c · x                                      (2.7)
    subject to
                 [x]
        [A | I2] [s] = b
        x, s ≥ 0.
Note that LP (2.8) has an equality constraint; thus, its feasible region consists
of a segment in R2 .
Clearly this system yields a basic feasible solution in which the basic variables
are given by s1 = 6, a2 = 1, and a3 = 10.
By choosing M sufficiently large, the artificial
variables in the new LP’s final solution are ensured to equal zero. For if
this outcome occurs and all artificial variables are zero, then the values of all
remaining variables (decision, slack, and excess) are exactly the same as those
in the optimal solution to the original LP.
Since a2 and a3 are basic variables at the outset, pivots in their columns are
necessary to record the initial objective function value. Performing two pivots
yields the tableau in Table 2.17.
Two more iterations of the simplex algorithm are then necessary to achieve
the optimal solution. The intermediate and final tableaus are given in Tables
2.19 and 2.20.
Waypoint 2.3.2. Now use the Big M Method to solve the Foraging
Herbivore Model (2.6).
(a)
(b)
minimize z = x1 + x2
subject to
2x1 + 3x2 ≥ 30
−x1 + 2x2 ≤ 6
x1 + 3x2 ≥ 18
x1 , x2 ≥ 0
(c)
For example, the product M² can be calculated directly, but it can also be
computed by using (2.11) along with the fact that all products involving A, B, C,
and I2 are defined. The result is as follows:

    M² = [A  B ] · [A  B ]                                  (2.12)
         [C  I2]   [C  I2]

       = [A² + BC    AB + BI2]
         [CA + I2C   CB + I2²]

       = [A² + BC    AB + B ]
         [CA + C     CB + I2].
Waypoint 2.4.1. Calculate each of the quantities AB + B, CA + C,
and CB + I2 . Use the results to compute the remaining entries of M2 .
Then calculate M2 directly using M itself to verify that your answer
is correct.
For example, suppose that A and B are 3-by-3 and 3-by-2 matrices, respectively.
Assume that c is a (column) vector in R3, d is a (row) vector in R2, and that
0_{3×1} and 0_{1×3} represent the zero column vector and zero row vector in
R3, respectively. Then

    [A        0_{3×1}]        [B  c]
    [0_{1×3}     1   ]  and   [d  2]

are 4-by-4 and 4-by-3 matrices, respectively, whose product is given by
> A:=Matrix(2,2,[1,3,2,0]):
> B:=Matrix(2,2,[2,1,-1,1]):
> C:=Matrix(2,2,[-5,0,2,6]):
> I2:=IdentityMatrix(2):
> M:=< <A|B>,<C,I2> >;
maximize z = c · x
subject to
Ax ≤ b
x ≥ 0,
where c = [4  3],

        [1  0]            [ 8]
    A = [2  2],   and b = [28].
        [3  2]            [32]

The initial tableau, as discussed in Section 2.1, is shown in Table 2.21.
The row vector y = [4  0  0] records this first row operation. Moreover, after
one iteration of the simplex algorithm, the new row 0, in matrix form, is given by

    [1 | −c + yA | y | yb] = [1  0  −3  4  0  0  32].       (2.15)
Waypoint 2.4.2. Verify that Equation (2.15) is indeed correct.
The remaining row operations needed to perform the pivot involve subtracting
twice row 1 from row 2 and then three times row 1 from row 3. Each of these
row operations can be performed separately on the identity matrix to obtain

         [ 1  0  0]             [ 1  0  0]
    E1 = [−2  1  0]   and  E2 = [ 0  1  0],
         [ 0  0  1]             [−3  0  1]

respectively. Their product

               [ 1  0  0]
    M = E2E1 = [−2  1  0]                                   (2.16)
               [−3  0  1]

records both row operations at once.
The results of (2.15) and (2.16) can now be combined in the following manner: We con-
struct a partitioned matrix whose first column consists of [1; 0_{3×1}], i.e., column
0 of the tableau matrix, which remains constant from one iteration to the next.
We then augment this first column on the right by the matrix [y; M], which re-
sults in

    [1        y]
    [0_{3×1}  M].

This matrix is then right-multiplied by the original tableau
matrix as follows:
The result of (2.17) is the tableau matrix obtained after one simplex algorithm
iteration is applied to the FuelPro LP. A Maple worksheet Simplex Algorithm
as Partitioned Matrix Multiplication.mw, useful for verifying this fact is
given as follows.
> restart;with(LinearAlgebra):
> c:=Vector[row]([4,3]);
# Create row vector with objective coefficients.
c := [4, 3]
> A:=Matrix(3,2,[1,0,2,2,3,2]);
# Matrix of constraint coefficients.
         [1  0]
    A := [2  2]
         [3  2]
> b:=<8,28,32>;
# Constraint bounds.
         [ 8]
    b := [28]
         [32]
> TableauMatrix:=
< UnitVector(1,4) | <-c,A> | <ZeroVector[row](3),
IdentityMatrix(3)> | <0,b> >;
# Create initial tableau matrix.
                     [1  −4  −3  0  0  0   0]
    TableauMatrix := [0   1   0  1  0  0   8]
                     [0   2   2  0  1  0  28]
                     [0   3   2  0  0  1  32]
> y:=Vector[row]([4,0,0]);
    y := [4, 0, 0]
> M:=Matrix(3,3,[1,0,0,-2,1,0,-3,0,1]);
         [ 1  0  0]
    M := [−2  1  0]
         [−3  0  1]
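The worksheet would conclude by forming the partitioned matrix in (2.17) and carrying out the multiplication; one possible final step (our own sketch, using the quantities defined above):

> S1:=< <1|y>, <ZeroVector(3)|M> >:
# Partitioned matrix with first column (1,0,0,0) and remaining block (y over M).
> S1.TableauMatrix;
# Returns the tableau matrix after one iteration of the simplex algorithm.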
Waypoint 2.4.3. For the FuelPro LP, use the elementary row operations
performed at the second iteration of the simplex algorithm to verify
that

                                  [1  0    0]
    y2 = [0  0  3/2]   and   M2 = [0  1   −1].              (2.19)
                                  [0  0  1/2]

Then combine these results with the previously computed values
of y1 and M1 to verify that (2.18) coincides with the tableau matrix
corresponding to Table 2.3 from Section 2.1.
Theorem 2.4.1 expresses the tableau matrix after k iterations of the simplex
algorithm as the product of such a partitioned matrix with the initial tableau
matrix, where

    M = Mk Mk−1 · · · M1   and   y = y1 + y2M1 + y3M2M1 + · · · + yk Mk−1 · · · M1.

For the FuelPro LP,

                                           [ 1  0  0]            [1  0    0]
    y1 = [4  0  0],  y2 = [0  0  3/2],  M1 = [−2  1  0],  and M2 = [0  1   −1].
                                           [−3  0  1]            [0  0  1/2]

It then follows that for k = 2,

    y = y1 + y2M1
      = [4  0  0] + [0  0  3/2] M1
      = [4  0  0] + [−9/2  0  3/2]
      = [−1/2  0  3/2]
and

               [  1    0    0 ]
    M = M2M1 = [  1    1   −1 ].
               [−3/2   0   1/2]

Using these results, we can form the partitioned matrix

    [1        y]
    [0_{m×1}  M],

which, in turn, provides a means for determining the tableau matrix after the second
iteration.
At this stage, Theorem 2.4.1 is merely a theoretical result, one that does not
affect the manner in which we perform the simplex algorithm. However, it
demonstrates that the quantities y and M alone in (2.21) can be used with an
LP’s initial tableau matrix to determine the entire tableau matrix at a given
iteration. In the next section we exploit this fact and improve the efficiency of
the simplex algorithm.
6. For the FuelPro LP, calculate the vector, y3 , and matrix, M3 , correspond-
ing to the third and final iteration of the simplex algorithm. Then, cal-
culate the final tableau matrix, using the vectors and matrices, M1 , M2 ,
M3 , y1 , y2 , and y3 .
2.5.1 Notation
Throughout this section we shall again refer to the standard maximization
problem
maximize z = c · x
subject to
Ax ≤ b
x ≥ 0,
where A is an m-by-n matrix, c and x belong to Rn , and b belongs to Rm . We
will assume that all entries of b are positive. The revised simplex algorithm
builds upon the result of Theorem 2.4.1 from Section 2.4. Namely, the tableau
matrix after the kth iteration, in partitioned matrix multiplication form, is
given by
" # " # " #
1 y 1 −c 0 0 1 −c + yA y yb
· = , (2.22)
0 M 0 A Im b 0 MA M Mb
where y is a row vector in Rm and M is an m-by-m matrix. The vector y and
matrix M record all row operations used to obtain the kth tableau matrix. To
underscore the fact that these quantities correspond to the kth iteration, we
modify our notation and relabel y and M from Theorem 2.4.1 and (2.22) as yk
and Mk , respectively. We also define the (m + 1)-by-(m + 1) matrix
" #
1 yk
Sk = k = 0, 1, 2, . . . (2.23)
0 Mk
With this notation, the tableau matrix after the kth iteration is the product of
Sk and the initial tableau matrix:
" #
1 −c 0 0
Sk · . (2.24)
0 A Im b
In the subsequent discussions, we will use the following notation. We let BVk,
where 0 ≤ k ≤ m, be an m + 1 by 1 column vector of “labels,” whose top entry
is z and whose jth entry is the basic variable associated with row j of the
tableau after the kth iteration. For example, in the FuelPro LP, BV1 = (z, x1, s2, s3)ᵗ,
since the basic variables after the first iteration are given by x1 (row 1), s2 (row 2),
and s3 (row 3).
Recall that our convention is to number the top row of the tableau matrix,
excluding labels, as row 0 and the leftmost column as column 0.
1. Initially, BV0 = (z, s1, s2, s3)ᵗ and S0 = I4.

2. We have that

    BV0 = S0 [0; b] = (0, 8, 28, 32)ᵗ.                      (2.25)
Since the nonbasic variables are x1 and x2, we compute the top entries in
the matrix-vector products of S0 and the corresponding initial tableau
matrix column vectors:

    S0 (−4, 1, 2, 3)ᵗ = (−4, 1, 2, 3)ᵗ  and  S0 (−3, 0, 2, 2)ᵗ = (−3, 0, 2, 2)ᵗ,   (2.26)
3. Using the results of (2.25) and (2.26), we compute the respective ratios
as 8, 14, and 32/3. The result tells us that x1 replaces s1 in the second entry
(i.e., row 1 entry) of BV0, so that BV1 = (z, x1, s2, s3)ᵗ.
4. We now pivot on the second entry of v = (−4, 1, 2, 3)ᵗ. This process requires the
following three elementary row operations:
(a) Add 4 times row 1 to row 0;
(b) Add -2 times row 1 to row 2;
(c) Add -3 times row 1 to row 3.
To compute S1, we perform on S0 = I4 these elementary row operations,
in the order they are listed. The result is as follows:

         [1   4  0  0]
    S1 = [0   1  0  0].
         [0  −2  1  0]
         [0  −3  0  1]
5. We have that

    S1 [0; b] = (32, 8, 12, 8)ᵗ.                            (2.27)

The nonbasic variables are x2 and s1. Thus we compute only the top
entries in the matrix-vector products of S1 and the corresponding initial
tableau matrix column vectors:

    S1 (−3, 0, 2, 2)ᵗ = (−3, 0, 2, 2)ᵗ  and  S1 (0, 1, 0, 0)ᵗ = (4, 1, −2, −3)ᵗ,   (2.28)
6. The results of (2.27), (2.28), and the ratio test tell us to pivot on the fourth
entry (i.e., on the row 3 entry) in the result from (2.28). Thus x2 replaces
s3 as a basic variable and

    BV2 = (z, x1, s2, x2)ᵗ.
7. The required row operations to pivot on the row 3 entry of (2.28) are as
follows:
Waypoint 2.5.1. Execute a third iteration of the revised simplex algo-
rithm for the FuelPro LP. Your results should indicate that BV3 =
(z, x1, s1, x2)ᵗ and S3 [0; b] = (46, 4, 4, 10)ᵗ. Begin to carry out a fourth
iteration. You should discover that when S3 is multiplied by either
column vector of the tableau matrix corresponding to a nonbasic
variable, the top entry of the vector is positive. Hence the algorithm
terminates with BV3 = S3 [0; b], so that z = 46, x1 = 4, s1 = 4, and x2 = 10.
Solve each of the following LPs using the revised simplex algorithm.
1.
2.
3.
Starting with the basic feasible solution at the origin, we arrive at the optimal solution of the FuelPro LP after only three
iterations of the simplex algorithm. Obviously, for larger scale problems, the
number of iterations needed to obtain an optimal solution can become quite
large. How much so was demonstrated in 1972, when Klee and Minty con-
structed a family of LPs, each having n decision variables and n constraints,
where n = 1, 2, 3, . . ., yet requiring 2ⁿ simplex algorithm iterations to solve
[19]. In simple terms, the number of iterations needed to solve an LP can be
exponential in the problem size. However, this is an extreme case, and for
most practical applications, far fewer iterations are required.
maximize z = c · x (2.29)
subject to
Ax = b
x ≥ 0,
To begin the algorithm, we start with a point, x0, that belongs to the interior
of the LP’s feasible region. To say that x0 is feasible means that Ax0 = b. That
it belongs to the interior means each component of x0 is strictly positive, as
opposed to merely nonnegative. For example, x0 = (3, 4, 5, 14, 15)ᵗ satisfies
both these conditions for the FuelPro LP. Finally, we say that x0 belongs to
the boundary of the feasible region if at least one of its components is zero.
We now start at x0 and seek a new point x1 = x0 + Δx whose objective value,
c · x1, is larger than that at x0. An initial choice for Δx is cᵗ, the transpose
vector of c. That this choice makes sense can be seen if we observe that
∇(c · x) = cᵗ, the gradient of z, which “points” in the direction of greatest
increase in z. (We discuss the gradient in much greater detail in Part II of
the text.) However, this choice does not work if we use the feasibility of x0,
along with the desired feasibility of x1. Namely,

    b = Ax1
      = A(x0 + cᵗ)
      = Ax0 + Acᵗ
      = b + Acᵗ,

which forces Acᵗ = 0, a condition that does not hold in general. This difficulty
leads us to replace cᵗ by the projected gradient,

    cp = Pcᵗ,                                               (2.30)

where P = In − Aᵗ(AAᵗ)⁻¹A.
In this definition, we assume that the m-by-m matrix AAᵗ is invertible. This
is true for the FuelPro LP and any matrix A having linearly independent row
vectors.
Waypoint 2.6.1. For the FuelPro LP, calculate the projection matrix P.
Then use your result to verify that the projected gradient is given by

    cp = (6/35, 1/7, −6/35, −22/35, −4/5)ᵗ.
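One way to carry out this Waypoint in Maple (a sketch; here A denotes the standard-form FuelPro constraint matrix, including slack variable columns):

> with(LinearAlgebra):
> A:=Matrix(3,5,[1,0,1,0,0,2,2,0,1,0,3,2,0,0,1]):
> c:=Vector[row]([4,3,0,0,0]):
> P:=IdentityMatrix(5)-Transpose(A).MatrixInverse(A.Transpose(A)).A:
# The projection matrix of (2.30).
> cp:=P.Transpose(c);
# The projected gradient; its entries should match those listed above.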
Proof. Using the preceding properties, along with properties of the transpose,
we have

    c · x1 − c · x0 = c(x1 − x0)
                    = c · (αcp)
                    = α cPcᵗ
                    = α cP²cᵗ
                    = α (cPᵗ)(Pcᵗ)
                    = α (Pcᵗ)ᵗ(Pcᵗ)
                    = α ‖cp‖²
                    ≥ 0.
    x1 = x0 + αcp                                           (2.31)
       = (3, 4, 5, 14, 15)ᵗ + α (6/35, 1/7, −6/35, −22/35, −4/5)ᵗ
       = (3 + (6/35)α, 4 + (1/7)α, 5 − (6/35)α, 14 − (22/35)α, 15 − (4/5)α)ᵗ.

The last component of this vector dictates that α can increase by no more than
18 3/4 before x1 violates the sign restrictions.

Setting α = 18 3/4 results in

    x1 = (87/14, 187/28, 25/14, 31/14, 0)ᵗ,                 (2.32)

with corresponding objective value z = c · x1 = 1257/28 ≈ 44.9.
The matrix D that we elect to use has the effect of “centering” x̃k in the feasible
region of the new LP. It is an n-by-n matrix defined by

        [xk,1    0   . . .    0 ]
    D = [ 0    xk,2  . . .    0 ],                          (2.33)
        [ .      .    . .     . ]
        [ 0      0   . . .  xk,n]

where xk,i, 1 ≤ i ≤ n, denotes the ith component of xk.
where

    P = In − Ãᵗ(ÃÃᵗ)⁻¹Ã.
4. Since x̃0 = e, where e is the vector in Rn all of whose entries are one, α
can be at most the reciprocal of the absolute value of the most negative
entry of c̃p before an entry of x̃0 + αc̃p becomes negative. We therefore
set α to equal this quantity.

5. Define

    x̃1 = e + λαc̃p,
2. Using x0,

        [3  0  0   0   0]
    D = [0  4  0   0   0].
        [0  0  5   0   0]
        [0  0  0  14   0]
        [0  0  0   0  15]

3. In the new LP, c̃ = cD = [12  12  0  0  0] and

             [3  0  5   0   0]
    Ã = AD = [6  8  0  14   0].
             [9  8  0   0  15]

These quantities are then used to construct the projection matrix, P,
and projected gradient, c̃p. The second of these is given by

    c̃p ≈ (4.57, 5.85, −2.74, −5.30, −5.86)ᵗ.
4. The last entry of c̃p indicates that α = 1/5.86 ≈ .1706, the reciprocal of
the absolute value of the most negative component of c̃p. Using this value
along with λ = .75, we compute

    x̃1 = e + λαc̃p ≈ (1.58, 1.75, .65, .32, .25)ᵗ,

so that x1 = Dx̃1 ≈ (4.75, 7.00, 3.25, 4.51, 3.75)ᵗ.
> restart;with(LinearAlgebra):
> c:=Vector[row]([4,3,0,0,0]);
# Create row vector with objective coefficients.
c := [4, 3, 0, 0, 0]
> b:=<8,28,32>;
# Constraint bounds.
         [ 8]
    b := [28]
         [32]
> A:=Matrix(3,5,[1,0,1,0,0,2,2,0,1,0,3,2,0,0,1]):
# Standard-form constraint matrix, including slack variable columns.
> N:=4:lambda:=.75:
# Set number of iterations, N, and parameter lambda.
> x:=array(0..N):
# Array of iteration values.
> x[0]:=<3,4,5,14,15>;
# Set initial value.
          [ 3]
          [ 4]
    x0 := [ 5]
          [14]
          [15]
> for i from 0 to (N-1) do
d:=DiagonalMatrix(convert(x[i],list)):
# Determine transformation matrix.
Atilde:=A.d: ctilde:=c.d:
# Calculate coefficient matrix and objective
# coefficients for new LP.
P:=IdentityMatrix(5)-
Transpose(Atilde).MatrixInverse(Atilde.Transpose(Atilde)).Atilde:
# The projection matrix.
cp:=P.Transpose(ctilde):
# Determine projected gradient.
alpha:=(abs(min(seq(cp[j],j=1..5))))^(-1):
# Find alpha.
x[i+1]:=d.(<1,1,1,1,1>+lambda*alpha*cp):
# Determine next iterate.
end do:
For the sake of brevity, we do not print here the output that appears at the
end of this worksheet. Instead, we summarize these results in Table 2.23 but
list only the entries of each xi that correspond to the two decision variables,
x1 and x2 , in the original FuelPro LP.
TABLE 2.23: Results of applying the interior point algorithm to the FuelPro
LP
n (xn,1 , xn,2 ) z
0 (3, 4) 24
1 (4.75, 7) 40
2 (5.1, 7.95) 44.07
3 (4.66, 8.9) 45.32
4 (4.1, 9.79) 45.76
The results in Table 2.23 dictate that convergence to the optimal solution takes
place very quickly. A graphical illustration of this phenomenon is depicted
in Figure 2.6.
[Figure 2.6: The iterates x0, x1, x2, x3, and x4 superimposed on the FuelPro feasible region.]
Since 1984, interior point methods have continually evolved and improved
in their efficiency. Interestingly, after a great deal of controversy centered
on the issuance of patents for what could be viewed as newly developed
mathematical tools, AT&T applied for, and received, a patent for Karmarkar’s
method in 1988. This patent expired in 2006.
(a) The vector cp belongs to the null space of A. That is, Acp = 0.
(b) The matrix P is symmetric, meaning that Pt = P.
(c) The matrix P satisfies P²cᵗ = Pcᵗ.
3. Assume in the FuelPro LP that x0 = (3, 4, 5, 14, 15)ᵗ, which corresponds
to (3, 4) in the x1x2-plane. If we denote x̃k = (x̃1, x̃2)ᵗ, sketch the image in
the x̃1x̃2-plane of each basic feasible solution of the FuelPro LP under the
first centering transformation D⁻¹. Show that the region in the x̃1x̃2-plane
enclosed by these points coincides with the feasible region of the LP,

    maximize z = c̃ · x̃
    subject to
        Ãx̃ ≤ b
        x̃ ≥ 0,
(a)
(b)
Daily nutritional guidelines for each category in Table 3.1 are summarized
in Table 3.2. The listed minimum requirements for vitamins A and C, cal-
cium, and iron are based upon the assumption that deficiencies are remedied
through beverages and daily vitamin supplements.
Suppose that sandwiches cost $6.00 (turkey breast), $5.00 (club sandwich),
$3.50 (veggie sub), and $5.00 (breakfast sandwich). We seek to minimize the
total daily cost of maintaining a diet that is comprised solely of these four
sandwiches and fulfills the previously stated nutritional guidelines. We will
permit partial consumption of any sandwich and assume that the purchase
price of any such sandwich is pro-rated on the basis of the fraction consumed.
For example, consumption of one-half of a club will contribute only $2.50 to
total cost. Certainly this last assumption is unrealistic in that the number of
purchased sandwiches of each type must be integer-valued. In Chapter 5, we
will learn how to account for this fact.
> restart:with(LinearAlgebra):with(Optimization):
> x:=array(1..4);
x := array(1..4, [])
Each column of Table 3.1 can be viewed as a vector in R10 recording nutritional
information for a particular sandwich. One means of storing this information
in Maple is to first construct a “nutrition vector” for each sandwich and then
to use the resulting vectors to form a “nutrition matrix.” The following syntax
performs this task:
> TurkeyBreast:=<280,4.5,20,46,5,18,8,35,6,25>:
> Club:=<320,6,35,47,5,24,8,35,8,30>:
> VeggieSub:=<230,3,0,44,5,9,8,35,6,25>:
> BreakfastSandwich:=<470,19,200,53,5,28,10,15,25,25>:
> A:=<TurkeyBreast | Club | VeggieSub | BreakfastSandwich>:
In a similar manner, each column of Table 3.2 can also be entered as a vector
in R10 . The terms Maximum, maximum, max, as well as corresponding terms for
minima, have reserved meanings in Maple, so we must choose names for these
vectors with care. Here we call them MinimumAmount and MaximumAmount, with entries read from the corresponding columns of Table 3.2.
> Prices:=<6,5,3.5,5>:
> TotalCost:=add(Prices[i]*x[i],i=1..4);
> add(A[1,j]*x[j],j=1..4)<=3000;
# For example, this constraint bounds the first nutritional category
# (calories) by a daily maximum of 3000.
We wish to ensure that analogous guidelines hold for other categories. One
means of accomplishing this efficiently in Maple is through use of the se-
quence command, seq. What follows is syntax illustrating how this command
can be used to produce all model constraints:
> MinimumGuidelines:=
seq(MinimumAmount[i]<=add(A[i,j]*x[j],j=1..4),i=1..10):
> MaximumGuidelines:=
seq(add(A[i,j]*x[j],j=1..4)<=MaximumAmount[i],i=1..10):
The preceding two command lines produce the LP constraints in the form of two sequences, whose results can be used to create the constraint list for purposes of invoking the LPSolve command. Here is the command line that combines the previous results to determine the optimal solution of the LP.
> LPSolve(TotalCost,[MinimumGuidelines,MaximumGuidelines],
assume=’nonnegative’);
Waypoint 3.1.1. Use Maple to verify that the cheapest, four-sandwich
diet costs approximately $29.81 per day and consists of no turkey
breast sandwiches, approximately 5.14 club sandwiches, .314 veggie
subs, and .6 breakfast sandwiches.
Maple structures of sequences, sums, vectors, matrices, and arrays all played
a role in solving the diet LP in this section. It is important to note that in
many problems more than one structure can be used to perform a task. For
example, Prices was entered as a vector but, for purposes of computing total
cost, could also be entered as a list, Prices:=[6,5,3.5,5]. Choosing this
approach would not affect subsequent command lines. A matrix was chosen over an array to record nutritional data for all sandwiches merely because nutritional information for each sandwich was presented in tabular form (as one would expect from reading product labels) and could be easily entered as a vector. These vectors were combined to produce the matrix A. In other problem types, use of an array is more appropriate. When confronted with choices such as these, two important points to bear in mind are the following: Which method permits the easiest entry of data relevant to
needed operations for constructing the LP? For example, a Matrix can be
multiplied by a column Vector, but not by a list structure, unless the list is
first converted to a Vector through use of the convert command.
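As a small illustration of this last point, consider the following throwaway example; the matrix B and list L are hypothetical and unrelated to the diet data:

> B:=Matrix(2,2,[1,2,3,4]):
> L:=[5,6]:
> B.convert(L,Vector);
# The product B.L alone would raise an error, since a list cannot multiply
# a Matrix; converting L to a Vector first yields the result <17, 39>.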
of transporting the coal from the mines to the cities. Because the total coal
amount produced by the mines equals the combined demand of the cities, this
LP constitutes a balanced transportation problem. Clearly if the total demand
exceeds total available supply, any LP model attempting to minimize cost
will prove infeasible. The case when total supply exceeds total demand is
addressed in the exercises.
> x:=array(1..3,1..4);
If the entries of Table 3.4 are also entered as a 3-by-4 array (matrix), labeled
Cost, the total transportation cost, in millions of dollars, is computed as
follows:
> Cost:=Matrix(3,4,[2,3,2,4,4,2,2,1,3,4,3,1]):
> TotalCost:=add(add(Cost[i,j]*x[i,j],i=1..3),j=1..4);
As was the case for the Four-Sandwich Diet Problem from Section 3.1, the seq and add commands can be combined to ensure that supply and demand
constraints are met. For example, suppose that the cities’ respective demand
amounts are defined by the list (or array or vector) named Demand. Then
seq(add(x[i,j],i=1..3) >=Demand[j], j=1..4) is a sequence of four in-
equalities, which, if satisfied, ensures that demand is met.
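Putting these pieces together, a minimal sketch of the complete model might read as follows. The Supply and Demand values below are hypothetical placeholders, since the actual entries of the coal problem's tables are not reproduced here; note that they have been chosen so that total supply equals total demand:

> Supply:=[45,60,35]:
# Hypothetical supply amounts at the three mines.
> Demand:=[40,30,45,25]:
# Hypothetical demand amounts at the four cities.
> SupplyConstraints:=seq(add(x[i,j],j=1..4)<=Supply[i],i=1..3):
> DemandConstraints:=seq(add(x[i,j],i=1..3)>=Demand[j],j=1..4):
> LPSolve(TotalCost,[SupplyConstraints,DemandConstraints],assume=nonnegative);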
The solution of the Coal Distribution Problem has the property that all de-
cision variables are integer-valued. This is a trait we desire of LPs in a wide
variety of contexts. In many situations we are forced to use the techniques of
integer linear programming, which form the basis of discussion in Chapter
5. Fortunately, the nature of the Coal Distribution Problem, in particular the
form of its constraints, guarantees, a priori, an integer-valued solution.
\begin{aligned}
\text{minimize } z &= \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}x_{ij} \qquad (3.1)\\
\text{subject to}\quad
\sum_{j=1}^{n} x_{ij} &= s_i, \quad 1 \le i \le m\\
\sum_{i=1}^{m} x_{ij} &= d_j, \quad 1 \le j \le n\\
x_{ij} &\ge 0 \quad \text{for } 1 \le i \le m \text{ and } 1 \le j \le n.
\end{aligned}
Observe that we may express each constraint as an equality due to the bal-
anced nature of the LP.
\[
\underbrace{\begin{bmatrix}
1_{1\times n} & 0_{1\times n} & \cdots & 0_{1\times n}\\
0_{1\times n} & 1_{1\times n} & \cdots & 0_{1\times n}\\
\vdots & \vdots & \ddots & \vdots\\
0_{1\times n} & 0_{1\times n} & \cdots & 1_{1\times n}\\
I_n & I_n & \cdots & I_n
\end{bmatrix}}_{m\ \text{blocks}}
\cdot
\begin{bmatrix} x_{11}\\ \vdots\\ x_{1n}\\ x_{21}\\ x_{22}\\ \vdots\\ x_{mn} \end{bmatrix}
=
\begin{bmatrix} s\\ d \end{bmatrix} \qquad (3.2)
\]
Ax = b, (3.3)
To minimize the cost of transporting the coal from the mines to the cities, we
view the problem as one combining two separate transportation problems. In
the first of these, transshipment points are viewed as “demand points”; in the
second, they constitute “supply points.” Combined with these two problems
are constraints, sometimes referred to as conservation constraints, which reflect
the requirement that the amount of coal entering a transshipment point is the
same as the amount leaving.
In a manner similar to that for the original problem, we use arrays to record the
shipment amounts between the various points. We define xi j where 1 ≤ i ≤ 3
and 1 ≤ j ≤ 2, as the amount of coal transported from Mine i to Station j.
Similarly, y jk , where 1 ≤ j ≤ 2 and 1 ≤ k ≤ 4, denotes the amount transported
from Station j to City k. In Maple we have:
> x:=array(1..3,1..2):
> y:=array(1..2,1..4):
Entries from Tables 3.5 and 3.6 can be entered as two matrices, which we label
CostTo and CostFrom. Using them, we construct the total transportation cost
as follows:
> CostTo:=Matrix(3,2,[4,3,5,4,2,4]):
> CostFrom:=Matrix(2,4,[3,5,4,3,4,3,4,4]):
> TotalCost:=add(add(CostTo[i,j]*x[i,j],i=1..3),j=1..2)+
add(add(CostFrom[j,k]*y[j,k],j=1..2),k=1..4):
Each constraint for the LP belongs to one of the following three categories:
1. The supply at each mine is the sum of the amounts transported from
that mine to each of the stations.
2. The demand at each city is the sum of the amounts transported to that
city from each of the stations.
3. Conservation constraints that guarantee the total amounts transported
into and out of each station are equal.
Using the arrays x and y and the Demand and Supply lists defined in the first
problem, we express these constraints in Maple as follows:
> SupplyConstraints:=seq(add(x[i,j],j=1..2)<=Supply[i],i=1..3):
> DemandConstraints:=seq(add(y[j,k],j=1..2)>=Demand[k],k=1..4):
> NetFlowConstraints:=
seq(add(x[i,j],i=1..3)=add(y[j,k],k=1..4),j=1..2):
Waypoint 3.2.2. Use the previously defined sequences to form a list
of constraints and determine the entries of x and y that minimize total
costs in this transshipment model.
Exercises Section 3.2
TABLE 3.7: Market demand for corn and soy, measured in tons
Market/Commodity Corn Soy
1 2 5
2 5 8
3 10 13
4 17 20
Each plantation produces 2 tons of corn per acre and 4 tons of soy per
acre. The costs of growing corn and soy are $200 per acre and $300 per
acre, respectively, regardless of the plantation. The cost of transporting
either good from plantation i to market j is 20i + 30 j dollars per ton.
(a) Suppose crop 1 represents corn and crop 2 represents soy. Let xi jk ,
where 1 ≤ i ≤ 3, 1 ≤ j ≤ 4, and 1 ≤ k ≤ 2, represent the amount
of crop k, measured in tons, that is transported from plantation i
to market j. Let yik , where 1 ≤ i ≤ 3, and 1 ≤ k ≤ 2, represent the
1 Based upon Blandford, Boisvert, and Charles [8], (1982).
From a physical perspective, we may view the network as modeling the flow
of fluid through pipes. Sources and sinks are locations from which fluid enters
and exits the system, with rate of fluid entry or exit at node i measured by
fi . Arcs represent pipes, with Mi j denoting the maximum possible fluid flow
through the pipe starting at node i and ending at node j. Figure 3.1 illustrates flow between two nodes. Note that there exist two flow capacities between these two nodes, Mij and Mji, and that these values need not be equal. For example, if Mij > 0 and Mji = 0, then flow is permitted only in the direction from node i to node j but not in the reverse direction.
FIGURE 3.1: Flow between nodes i and j. Arcs in the two directions carry capacities Mij and Mji, and the net outflows at the nodes are fi and fj.

The minimum cost network flow problem is the LP that seeks to determine
the flow values between nodes that minimize the total cost of fluid flow
through the network, subject to the flow constraints and the assumption that
conservation of flow occurs at each node. By conservation of flow we mean
the combined net outflow at node i due to fluid both flowing in from and also
flowing out to adjacent nodes is given by fi . If xi j , where 1 ≤ i, j ≤ n, is the
decision variable representing the flow from node i to node j, the minimum
cost network flow LP is given by (3.4):
\begin{aligned}
\text{minimize } z &= \sum_{i=1}^{n}\sum_{j=1}^{n} C_{ij}x_{ij} \qquad (3.4)\\
\text{subject to}\quad
\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n} x_{ji} &= f_i, \quad 1 \le i \le n\\
x_{ij} &\le M_{ij}, \quad 1 \le i,j \le n\\
x_{ij} &\ge 0, \quad 1 \le i,j \le n.
\end{aligned}
Summing the conservation-of-flow constraints over all nodes causes each variable x_{ij} to appear exactly once with a positive sign and once with a negative sign, so that the left-hand sides total zero. Hence, a necessary condition for (3.4) to be feasible is that \sum_{i=1}^{n} f_i = 0.
Both the transportation and transshipment problems from Section 3.2 are
examples of minimum cost network flow LPs. For the first type of problem,
there exist two classes of nodes, those at which the net outflow is positive, the
supply points, and those at which it is negative, the demand points. The net
outflow numbers represent the respective supply and demand amounts. The
available supply from the first type of point may be used as the flow capacity
to any demand point. The flow capacity along any arc starting and ending at
two supply points as well as along any arc starting at a demand point is zero.
For the transshipment LP, there exists a third class of nodes, namely those at
which the net outflow is zero.
We already know from Theorem 3.2.1 that the solution of the balanced
transportation problem having integer supply and demand amounts is it-
self integer-valued. Under appropriate conditions, this result generalizes to
the minimum cost network flow problem and is spelled out by Theorem
3.3.1, sometimes known as the Integrality Theorem.
FIGURE 3.2: A five-node network. Each arc is labeled with the ordered pair (unit cost, flow capacity), and each node with its net outflow.
> with(LinearAlgebra):with(Optimization):
> f:=[3,-1,-2,-4,4]:
# Outflow numbers corresponding to nodes.
> M:=Matrix(5,5,
[0,4,0,0,4,0,0,3,3,0,0,0,0,3,0,0,0,0,0,0,0,4,0,0,0]):
# Matrix of flow capacities.
> C:=Matrix(5,5,
[0,3,0,0,6,0,0,3,3,0,0,0,0,2,0,0,0,0,0,0,0,4,0,0,0]):
# Matrix of unit flow costs.
> x:=array(1..5,1..5):
# Array of decision variables.
> z:=add(add(C[i,j]*x[i,j],j=1..5),i=1..5):
# Network cost function.
> FlowConservation:=
seq(add(x[i,j],j=1..5)-add(x[j,i],j=1..5)=f[i],i=1..5):
# Conservation of flow constraints.
> FlowCapacities:=seq(seq(x[i,j]<=M[i,j],j=1..5),i=1..5):
# Flow capacity constraints.
> LPSolve(z,[FlowConservation,FlowCapacities],
assume=nonnegative):
# Solve minimum cost flow LP, suppressing output.
> assign(%[2]):
# Assign the decision variables to their solution values.
> print(x,z);
# Print the solution array along with objective value.
x := [0 3 0 0 0]
     [0 0 3 3 0]
     [0 0 0 1 0] ,  45
     [0 0 0 0 0]
     [0 4 0 0 0]
For example, suppose an individual wishes to travel from his home in Grand
Rapids, Michigan to Graceland, Tennessee and is considering routes that pass
through the cities of Chicago, Indianapolis, Lansing, and St. Louis. Approx-
imate distances between the various pairs of cities are shown in Figure 3.3.
Assume that the lack of a single segment connecting two cities, e.g., Grand
Rapids and St. Louis, indicates that a direct route between those two cities
is not under consideration. That the arrows in the diagram are bidirectional indicates that travel is possible in either direction and that the corresponding transportation costs, i.e., distances, are equal.
FIGURE 3.3: Approximate driving distances, in miles, between Grand Rapids, Chicago, Indianapolis, Lansing, St. Louis, and Graceland.
Our goal is to determine the path of shortest length that starts at Grand Rapids
and ends at Graceland, traveling only upon segments shown in Figure 3.3.
Costs associated with the objective function are determined using inter-city
driving distances, which we record by the matrix
\[
C=\begin{bmatrix}
0 & 140 & 220 & 70 & 0 & 0\\
140 & 0 & 180 & 0 & 300 & 0\\
220 & 180 & 0 & 250 & 0 & 460\\
70 & 0 & 250 & 0 & 0 & 720\\
0 & 300 & 0 & 0 & 0 & 250\\
0 & 0 & 460 & 720 & 250 & 0
\end{bmatrix}. \qquad (3.5)
\]
Note that our flow capacity constraints already guarantee that the decision
variable corresponding to pairs of cities between which travel is not possible
must equal 0 in the solution. Thus, in our cost matrix, we may set to zero
(or any value, for that matter) the cost corresponding to any such pair of
cities. Observe also that the matrix C is symmetric, i.e., C = C^t, which reflects the facts that flow is bidirectional and that transportation costs between any two nodes are the same, regardless of direction.
Using the objective z, along with the flow capacity and conservation of flow constraints, we have the information required to solve the minimum cost flow problem. Its solution, as computed using Maple, is given by z = 680,
x13 = x36 = 1 and all other decision variables equal to zero. Thus, the shortest
path corresponds to the 680-mile route from Grand Rapids to Indianapolis to
Graceland.
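A minimal Maple sketch of this computation follows; it mirrors the worksheet for the previous example, with the matrix C taken from (3.5) and a unit flow capacity placed on each usable segment:

> with(LinearAlgebra):with(Optimization):
> C:=Matrix(6,6,[0,140,220,70,0,0,140,0,180,0,300,0,220,180,0,250,0,460,
  70,0,250,0,0,720,0,300,0,0,0,250,0,0,460,720,250,0]):
> f:=[1,0,0,0,0,-1]:
# One unit of flow enters at node 1 (Grand Rapids) and exits at node 6
# (Graceland).
> M:=Matrix(6,6,(i,j)->`if`(C[i,j]>0,1,0)):
# Unit capacity wherever a road segment exists; zero capacity elsewhere.
> x:=array(1..6,1..6):
> z:=add(add(C[i,j]*x[i,j],j=1..6),i=1..6):
> FlowConservation:=seq(add(x[i,j],j=1..6)-add(x[j,i],j=1..6)=f[i],i=1..6):
> FlowCapacities:=seq(seq(x[i,j]<=M[i,j],j=1..6),i=1..6):
> LPSolve(z,[FlowConservation,FlowCapacities],assume=nonnegative);
# The reported minimum, 680, with x[1,3] = x[3,6] = 1, agrees with the
# route described above.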
Clearly, many variants of this problem exist. For example, the objective can
become one of minimizing the time, not the distance, required to travel from
the source to the sink.
Waypoint 3.3.1. Suppose that the driving speed between all pairs of cities in Figure 3.3 is 70 miles per hour, with the exception of the segment from Grand Rapids to Indianapolis, where speed is limited to
55 miles per hour. Assuming that one travels from one city to another
at a constant speed equal to the legal maximum, determine the route
that minimizes the time required to travel from Grand Rapids to
Graceland.
FIGURE: The network connecting the plant (node V) to the city (node I) through intermediate nodes II, III, and IV, with arc capacities as recorded in the matrix M below.
new arc to a large number, m, such as any number greater than the sum of
the already-existing capacities. We use m = 100.
Our goal is to maximize the flow leaving the city or, by conservation, the flow
along the “artificial” arc from the plant to the city. In other words, we seek
to maximize x51 or, equivalently, to minimize z = −x51. Hence, we set C51 = −1 and all other unit flow costs to zero, which gives us

\[ C=\begin{bmatrix} 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ -1&0&0&0&0 \end{bmatrix}. \]

Along with

\[ M=\begin{bmatrix} 0&2&3&0&0\\ 0&0&3&4&0\\ 0&0&0&0&2\\ 0&0&0&0&1\\ 100&0&0&0&0 \end{bmatrix}, \]

we have all the data needed to solve the problem of maximizing the flow through the city.
Waypoint 3.3.2. Use Maple to verify that the flow from the city is
maximized when x12 = x24 = x45 = 1, x13 = x35 = 2, and x51 = 3.
3. Consult a road atlas and determine the route of shortest driving distance
from Billings to Great Falls, Montana that uses only interstate and U.S.
highways. Then determine the route of shortest time duration, assum-
ing interstate highway speeds of 70 miles per hour and U.S. highway
speeds of 55 miles per hour.
4. A hospital discharges to its outpatient system a portion of its patients,
each of whom receives one of three types of follow-up services: speech
therapy, physical therapy, and occupational therapy. Each therapy type
is performed at two different outpatient clinics.2 Table 3.8 indicates the
maximum number of new outpatient admissions of a specified therapy
type that each clinic can accept during any given day.
Suppose the hospital represents the source and the state of complete
discharge denotes the sink. Using the clinics and therapy types as nodes
2 Based upon Duncan and Noble, [12], (1979).
4.1 Duality
Duality occurs when two interrelated parts comprise the whole of something.
In the context of linear programming, duality refers to the notion that every
LP has a corresponding dual LP, whose solution provides insight into the
original LP.
inequality form as
minimize w = y · b (4.3)
subject to
yA ≥ c
y ≥ 0.
There are several features to observe about the dual LP. First, its goal is
minimization. Second, its objective function coefficients are determined from
the right-hand sides of the original LP’s constraints. (We call the original LP
the primal LP.) Finally, the constraints of the dual LP are all greater-than-or-
equal-to, and the right-hand sides now are the objective coefficients of the
primal LP.
Viewing the dual LP in the form (4.4) gives us better insight into the correspon-
dence between primal LP and its dual. Whereas the primal LP involved three
constraints in two decision variables, the dual LP involves two constraints
in three decision variables. Moreover, there is a natural correspondence be-
tween each decision variable in the dual and a constraint in the primal. For
example, by comparing variable coefficients, we see that the decision variable
y3 in (4.4) corresponds to the third constraint in the primal.
Waypoint 4.1.1. Consider the LP
cx0 ≤ y0 b.
The consequences of the Weak Duality Theorem are significant. For example,
suppose x0 is primal-feasible, y0 is dual-feasible, and cx0 = y0 b. Then x0 and
y0 are optimal solutions to their respective LPs since weak duality implies
the primal objective value is no larger than cx0 and the dual objective value
is no smaller than y0 b.
Of particular interest is the case when both the primal and dual possess
optimal solutions. Key to understanding the consequence of this situation is
the result from Theorem 2.4.1 of Section 2.4.3. Recall that after each iteration
of the simplex algorithm, the tableau matrix corresponding to (4.1) can be
written in the form
" # " # " #
1 y 1 −c 01×m 0 1 −c + yA y yb
· = , (4.5)
0m×1 M 0m×1 A Im b 0m×1 MA M Mb
where y is a row vector in Rm and M is an m-by-m matrix.
Reintroducing the top row of variable labels to (4.5), we obtain the tableau
shown in Table 4.1.
The result given in Table 4.1 holds for any iteration of the simplex algorithm.
However, the algorithm for this maximization problem terminates when all
coefficients in the top row of the tableau are nonnegative, which forces
−c + yA ≥ 0, i.e., yA ≥ c, and y ≥ 0.
w_0 \overset{\mathrm{def}}{=} y_0 \cdot b = z_0 = c \cdot x_0.
Thus we have proven the following result, which we refer to as the Strong
Duality Theorem.
Theorem 4.1.2. If LP (4.1) has an optimal solution x0 , then its dual LP (4.3)
also has an optimal solution, y0 , and their corresponding objective function
values, z0 and w0 , are equal. In other words, there is equality in Weak Duality
at the optimal solution, and
z0 = cx0 = y0 b = w0 .
Moreover, the optimal dual decision variable values, y0 , are given by the
coefficients of the slack variables in the top row of the primal LP’s final
tableau.
By combining the result of Theorem 4.1.2, the fact that the dual of an LP’s
dual is the original LP, and arguments such as those following the proof of
the Weak Duality Theorem (Theorem 4.1.1), we arrive at the following major
result, which summarizes the three possible outcomes for an LP and its dual:
Theorem 4.1.3. For an LP and its dual LP, one of the three possible outcomes
must occur:
1. If the primal LP has an optimal solution, then so does the dual LP and
the conclusions of both the Weak and Strong Duality Theorems hold.
Let us recap the two means of obtaining the dual LP solution. One way is
to solve it directly. For the FuelPro LP, this means using the Big M method
(Section 2.3) and introducing excess and artificial variables as follows:
minimize w = 8y1 + 28y2 + 32y3 + Ma1 + Ma2
subject to
y1 + 2y2 + 3y3 − e1 + a1 = 4
2y2 + 2y3 − e2 + a2 = 3
y1 , y2 , y3 , e1 , e2 , a1 , a2 ≥ 0.
Here we use M = 100. Each iteration of the simplex algorithm yields a dual-
feasible solution with corresponding objective value, yb, which decreases
from one iteration to the next. The tableau obtained at the final iteration is
shown in Table 4.2.
TABLE 4.2: Final tableau for FuelPro dual LP after being solved with the Big M Method

  w | y1   y2   y3   e1   e2     a1    a2   | RHS
  1 | -4   0    0    -4   -10    -96   -90  | 46
  0 | 1    0    1    -1   1      1     -1   | 1
  0 | -1   1    0    1    -3/2   -1    3/2  | 1/2

At the final iteration, we achieve an optimal solution of y1 = 0, y2 = 1/2, and y3 = 1, which agrees with the slack variable coefficients in the top row of
the primal solution tableau. Furthermore, the optimal dual objective value,
w0 = 46, equals the optimal objective value from the primal.
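As a check, the dual LP can also be handed directly to Maple's LPSolve, bypassing the Big M iterations entirely; a minimal sketch, with the dual objective coefficients taken to be the FuelPro constraint bounds 8, 28, and 32:

> with(Optimization):
> LPSolve(8*y1+28*y2+32*y3,[y1+2*y2+3*y3>=4,2*y2+2*y3>=3],
  assume=nonnegative);
# Returns the optimal value 46 at y1 = 0, y2 = 1/2, y3 = 1, in agreement
# with Table 4.2.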
TABLE 4.3: Top rows of tableau for iterations of primal FuelPro LP (the z column has been omitted)

  Iteration | [−c + yA]_1  [−c + yA]_2 | y1     y2    y3   | yb
  Zero      | -4           -3          | 0      0     0    | 0
  First     | 0            -3          | 4      0     0    | 32
  Second    | 0            0           | −1/2   0     3/2  | 44
  Third     | 0            0           | 0      1/2   1    | 46
A close inspection of Table 4.2 reveals other interesting patterns. First, the
coefficients of the excess variables in the top row of the tableau are the additive
inverses of the decision variable values in the optimal solution of the primal.
We will not prove this general result, but it illustrates how we can solve the
original LP by solving its corresponding dual if doing so appears to require
fewer computations, for example if the original LP has far more constraints
than variables.
and
[y0 ] j [b − Ax0 ] j = 0 j = 1, 2, . . . , m.
In other words, we have obtained an optimal solution to an LP and its dual
if and only if both of the following two conditions hold:
• Each decision variable in the primal is zero or the corresponding con-
straint in the dual is binding.
• Each decision variable in the dual is zero or the corresponding constraint
in the primal is binding.
Proof. Feasibility and associativity dictate
cx_0 \le (y_0 A)\,x_0 = y_0\,(Ax_0) \le y_0 b. \qquad (4.7)
Moreover, x_0 and y_0 are simultaneously optimal if and only if equality holds throughout (4.7), which is equivalent to
0 = (y_0 A - c)\,x_0 \quad\text{and}\quad 0 = y_0\,(b - Ax_0). \qquad (4.8)
The constraint and sign conditions guarantee that all entries of the vectors
y0 A − c, x0 , y0 , and b − Ax0
are nonnegative. Thus the product of any two corresponding entries in either
matrix product from (4.8) must be zero. Hence x0 and y0 are simultaneously
optimal if and only if
[y_0 A − c]_i [x_0]_i = 0, i = 1, 2, . . . , n,
and
[y0 ] j [b − Ax0 ] j = 0 j = 1, 2, . . . , m.
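These conditions are straightforward to verify numerically for the FuelPro LP; a minimal sketch, taking the primal data as A = Matrix(3,2,[1,0,2,2,3,2]), c = [4, 3], and b = <8, 28, 32> (the bounds inferred from the dual objective above):

> with(LinearAlgebra):
> A:=Matrix(3,2,[1,0,2,2,3,2]): b:=<8,28,32>: c:=Vector[row]([4,3]):
> x0:=<4,10>: y0:=Vector[row]([0,1/2,1]):
> y0.A-c, b-A.x0;
# Returns [0, 0] and <4, 0, 0>: both dual constraints are binding, and the
# only nonzero primal slack (the first) pairs with y0[1] = 0, so every
# entrywise product required by (4.8) vanishes.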
Suppose that FuelPro Petroleum Company considers selling all its assets. These
assets arise from three sources corresponding to the three original constraints:
the availability of premium and the availability of stocks A and B. The inter-
ested buyer seeks to purchase all assets (at respective prices of y1 , y2 , and y3 )
and to do so at minimum cost. This goal yields the minimization objective w
in (4.9).
For the dual constraints, the profit earned from each fuel grade can be viewed
as coming from one of the three preceding asset types. The extent to which
each type contributes to a particular fuel grade’s profit is determined by an
appropriate entry in A. From FuelPro Petroleum Company’s perspective, the
values of y1 , y2 , and y3 must guarantee that the amount earned from the sale
of the three assets is worth at least as much as the profit these assets currently
generate, be it profit stemming from premium or from regular unleaded. This
fact motivates an interpretation of each of the two constraints in the dual LP.
Ed and Steve love to gamble. When they can’t find enough friends to play
Texas Hold ’Em, they instead play a matrix game using the following 3-by-3
payoff matrix:
\[
A=\begin{bmatrix} 1 & -1 & 2\\ 2 & 4 & -1\\ -2 & 0 & 2 \end{bmatrix}. \qquad (4.10)
\]
In this game, at each play, Ed picks a column and, simultaneously, Steve
picks a row. The dollar amount ai, j in the resulting entry then goes to Ed if it
is positive and to Steve if it is negative. For example, if Ed chooses column
three and Steve simultaneously chooses row one, then Steve pays Ed $2. Since
whatever dollar amount Ed wins on a play is lost by Steve and vice versa,
money merely changes hands. Such a contest is known as a zero-sum matrix
game. The columns of the payoff matrix form Ed’s possible pure strategies and
the rows of the matrix form Steve’s.
Pure strategy Nash equilibria candidates consist of entries of the payoff ma-
trix. In the case of Ed and Steve’s game, there are nine such possibilities. For
example, consider the entry in column three, row one of (4.10). This entry
does not constitute a pure strategy equilibrium, for if Steve recognizes that
Ed always chooses column three, then Steve can increase his earnings by
always choosing row two instead of row one. Similar reasoning, applied to
the remaining eight entries of the payoff matrix, demonstrates that no pure
strategy Nash equilibrium exists for this game.
In contrast to a pure strategy, whereby Ed and Steve always choose the same
column or row, respectively, a mixed strategy arises by assigning probabilities
to the various choices. For example, Ed may choose column one, column two, and column three with respective probabilities 1/2, 1/3, and 1/6. Clearly
there are infinitely many possible mixed strategies for each player. A mixed
strategy Nash equilibrium consists of a mixed strategy for each player, called the
Using this notation, we now consider mixed strategies for each of the two
players, starting with Ed.
Waypoint 4.1.2. Suppose Ed chooses columns one, two, and three
with respective probabilities, x1 , x2 , and x3 and that Steve always
chooses row one. Use A to determine a linear function, f1, of x1, x2, and x3, that represents Ed's expected winnings as a function of these three probabilities. Repeat this process and find functions, f2 and f3, that represent Ed's winnings as functions of x1, x2, and x3, when
Steve instead always chooses row two or always chooses row three,
respectively. Your results should establish that
\[
\begin{bmatrix} f_1(x_1,x_2,x_3)\\ f_2(x_1,x_2,x_3)\\ f_3(x_1,x_2,x_3) \end{bmatrix} = Ax, \qquad (4.11)
\]
where x = (x_1, x_2, x_3)^t.
Waypoint 4.1.3. Suppose Steve chooses rows one, two, and three with
respective probabilities, y1 , y2 , and y3 and that Ed always chooses
column one. Use A to determine a linear function, g1 , of y1 , y2 , and
y3 , that represents Steve’s expected winnings as a function of these
three probabilities. Repeat this process and find functions, g2 and g3 ,
that represent Steve's winnings as functions of y1, y2, and y3, when Ed
instead always chooses column two or always chooses column three,
respectively. Your results should establish that
\[
\begin{bmatrix} g_1(y_1,y_2,y_3) & g_2(y_1,y_2,y_3) & g_3(y_1,y_2,y_3) \end{bmatrix} = yA, \qquad (4.12)
\]
where y = [y_1, y_2, y_3].
If z denotes the minimum of the three entries of Ax, then Ax ≥ ze, where e = (1, 1, 1)^t. Since Ed's payoffs are associated with positive entries of A, his goal is to choose x1, x2, and x3 so as to maximize z. Thus he seeks to determine the solution to the LP
We now use duality to formally verify that the equilibrium mixed strategies, x_0 = (3/28, 9/28, 4/7)^t and y_0 = [1/2, 5/14, 1/7], together with the corresponding payoff z_0 = w_0 = 13/14, constitute a mixed strategy Nash equilibrium for this matrix game.
z = zy0 e (4.15)
= y0 (ze)
≤ y0 (Ax) (since x is primal-feasible)
= (y0 A)x
≤ (w0 et )x (since y0 is dual-feasible)
= w0 (et x)
= z0 (since w0 = z0 )
Thus, Ed’s earnings are no more than z0 . Similar reasoning demonstrates that
if Steve deviates from his equilibrium strategy, choosing a mixed strategy y ≠ y0, with corresponding earnings, w, and if Ed continues to follow his equilibrium mixed strategy, x0, then w ≥ w0. In other words, Steve does not
decrease his losses.
In Section 6.4.5 we will re-examine the zero-sum matrix game from the per-
spective of nonlinear programming, and in Section 8.4.4 we investigate the
consequences of the players having two different payoff matrices.
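Ed's LP is also easy to check computationally. A minimal sketch, with the three constraints read off from the rows of the payoff matrix (4.10) and z left as a free variable:

> with(Optimization):
> LPSolve(z,[x1-x2+2*x3>=z, 2*x1+4*x2-x3>=z, -2*x1+2*x3>=z,
  x1+x2+x3=1, x1>=0, x2>=0, x3>=0], 'maximize');
# Returns z0 = 13/14 at (x1, x2, x3) = (3/28, 9/28, 4/7), matching the
# equilibrium strategy above.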
2. Consider the LP
maximize z = 3x1 + 4x2
subject to
x1 ≤ 4
x1 + 3x2 ≤ 15
−x1 + 2x2 ≥ 5
x1 − x2 ≥ 9
x1 + x2 = 6
x1 , x2 ≥ 0.
By rewriting the fifth constraint as a combination of two inequalities,
express this LP in the matrix inequality form (4.1). Then construct the
corresponding dual LP and demonstrate explicitly, without appealing
to Table 4.4, how it possesses a variable that is unrestricted in sign.
3. Prove that the dual of the dual of LP (4.1) is the original LP. (Hint:
Transpose properties may prove useful.)
4. Simplex algorithm iterations applied to an LP result in the tableau (4.5).
(a) How many constraints and decision variables are in the dual of
this LP?
(b) If the basic variables in the tableau are given by {x1 , x2 , x3 }, what
information does Weak Duality provide about the objective value
in the solution of the dual LP?
5. Consider the LP
maximize z = 2x1 − x2
subject to
x1 − x2 ≤ 1
x1 − x2 ≥ 2
x1 , x2 ≥ 0.
Show that both this LP and its corresponding dual are infeasible.
(a) What are the constraints of the LP? (Hint: Use the matrix, M−1 ,
where M is as given in Theorem 2.4.1.)
(b) Fill in the entries marked in the tableau.
(c) Determine the solutions of both the LP and its corresponding dual.
7. Solve the LP
maximize z = x1 + 5x2
subject to
−x1 + x2 ≤ 4
x2 ≤ 6
2x1 + 3x2 ≤ 33
2x1 + x2 ≤ 24
x1 , x2 ≥ 0,
Suppose that

x_0 = (8/17, 0, 5/17, 2/17, 0, 0)^t.
Without performing the simplex algorithm, show that x_0 is the optimal solution of the LP and determine the corresponding dual solution. (Hint: Let y_0 = [y1, y2, y3, y4]. Calculate both y_0 A − c and b − Ax_0 and apply Theorem 4.1.4 to construct a system of equations in the variables y1, y2, y3, and y4.)
The feasible region and the contour corresponding to the optimal solution
z = 46 are shown in Figure 4.1.
FIGURE 4.1: Feasible region for FuelPro LP along with contour z = 46.
There are many ways in which the original LP can be modified. Examples
include the introduction of an additional variable, e.g., fuel type, a change
in an objective coefficient, an increase or decrease in a constraint bound, or
and if the LP is feasible and bounded, then the tableau matrix after the final
iteration can be written as
" # " # " #
1 y 1 −c 01×m 0 1 −c + yA y yb
· = , (4.18)
0 M 0 A I3 b 0 MA M Mb
for some suitable 1 by m vector y (the solution to the dual!) and some m-by-m
matrix M. In particular, all steps of the simplex algorithm used to achieve the
final tableau are completely determined by the entries of y and M.
" # " #
1 y 1 −c − cδ 0 0
·
0 M 0 A Im b
" # " # " #!
1 y 1 −c 0 0 0 −cδ 0 0
= · +
0 M 0 A Im b 0 0m×n 0m×m 0
" # " #
1 −c + yA y yb 0 −cδ 0 0
= +
0 MA M Mb 0 0m×n 0m×m 0
" #
1 −c − cδ + yA y yb
= . (4.20)
0 MA M Mb
By comparing (4.20) to (4.18), we see that the only changes that result from
applying the same steps of the algorithm to the new LP occur in the top row of
the tableau. In particular, the tableau corresponding to (4.20) reflects the need
for additional iterations only if at least one entry of −c − cδ + yA is negative.
Recall that in the original LP, the basic variables at the final iteration were
given by x1 , x2 , and s1 . To update the values of x1 and the objective, we must
pivot on the highlighted entry in Table 4.8. Doing so yields the tableau in
Table 4.9.
TABLE 4.9: FuelPro tableau under changed premium cost and after additional pivot

  z | x1  x2  s1  s2       s3     | RHS
  1 | 0   0   0   1/2 − δ  1 + δ  | 46 + 4δ
  0 | 1   0   0   −1       1      | 4
  0 | 0   0   1   1        −1     | 4
  0 | 0   1   0   3/2      −1     | 10
From this tableau, we see that an additional iteration of the simplex algorithm is required only if one of 1/2 − δ or 1 + δ is strictly negative. Hence if −1 ≤ δ ≤ 1/2 (meaning the premium cost stays between 3 and 4.5), then the values of the decision variables remain at (x1, x2) = (4, 10). However, the corresponding objective function changes to z = 46 + 4δ.
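This range is easy to spot-check with LPSolve. A minimal sketch at δ = 1/4, so that c = [4.25, 3], again taking the FuelPro constraint data to be A = Matrix(3,2,[1,0,2,2,3,2]) and b = <8, 28, 32>:

> with(Optimization):
> LPSolve(Vector[row]([4.25,3]),[Matrix(3,2,[1,0,2,2,3,2]),<8,28,32>],
  'maximize');
# Returns 47 at (x1, x2) = (4, 10), in agreement with z = 46 + 4(1/4).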
The basic variables at the final iteration are given by x2 = 10 and x3 = 20/3.
Suppose in this case that the first objective coefficient increases from 4 to 4 + δ.
Then applying to the new LP the same steps of the simplex algorithm that
led to Table 4.10, we obtain the tableau given in Table 4.11.
The current solution remains optimal provided δ < 1. In this case, no addi-
tional pivots are necessary in order to update the objective value. This stems
from the fact that we elected to increase the objective coefficient of a deci-
sion variable that was nonbasic in the optimal solution of the original LP. In
the case of the FuelPro LP, both decision variables were basic in the optimal
solution of the original LP.
Waypoint 4.2.1. Solve the LP
Note that the top row of this matrix, with the exception of the rightmost entry, is identical to that obtained by applying the simplex algorithm to the
original, unmodified LP. Thus, on first inspection, (4.25) corresponds to the
final tableau of the modified LP, and the set of basic variables is the same as in
the original LP. However, care must be taken to ensure that all basic variables
are nonnegative.
For example, suppose that in the FuelPro LP, an additional δ units of stock A
becomes available, in which case the right-hand side of the second constraint
in (4.16) increases from 28 to 28 + δ. The final tableau for the original LP is
given in Table 4.12.
In this situation, the basic variables corresponding to rows 1-3 are given by x1, s1, and x2, respectively. Since

\[ M=\begin{bmatrix} 0 & -1 & 1\\ 1 & 1 & -1\\ 0 & \frac{3}{2} & -1 \end{bmatrix} \quad\text{and}\quad u=\begin{bmatrix} 0\\ 1\\ 0 \end{bmatrix}, \]

an increase in the right-hand side of the second constraint by δ yields

\[ \begin{bmatrix} x_1\\ s_1\\ x_2 \end{bmatrix} = Mb + \delta Mu = \begin{bmatrix} 4-\delta\\ 4+\delta\\ 10+\frac{3}{2}\delta \end{bmatrix}. \]
For basic variables x1, s1, and x2 to all remain positive, it must be the case that −4 < δ < 4 or, in other words, the amount of available stock A stays between 24 and 32 gallons. For values of δ in this range, the corresponding objective value is given by

\[ z_0 = y\,(b + \delta u) = 46 + \tfrac{1}{2}\delta. \]
A close inspection of the preceding example, together with the general objec-
tive formula z = y (b + δu) in (4.25), illustrates a fundamental role played by
slack variables in sensitivity analysis. The increase in the objective value in
this particular example is
\[ \tfrac{1}{2}\delta = y_2\,\delta = y\,(\delta u). \]
Thus, slack variable coefficient values in the top row of the final tableau,
which are represented by y, can be used to determine a change in objective
value when a constraint bound is changed. This outcome is true even when
the previously discussed sensitivity analysis is not performed, provided the
set of basic variables remains intact under such a change. This gives slack
variables an extremely useful role in sensitivity analysis and leads us to the
following definition.
Then the instantaneous rate of change, ∂z_0/∂b_j = y_j, the jth slack variable coefficient value in the top row of the LP's final tableau, is called the shadow price of the objective function z with respect to constraint j. For a small amount of change in b_j, small enough so that the set of basic variables in the solution of the new LP is identical to that in the solution of the original LP, this shadow price can be used to compute the amount of increase or decrease in the objective function.
\[ \frac{\partial z_0}{\partial b_1} = y_1 = 3. \]
Assume that increasing the right-hand side of the first constraint by .01 does
not change the set of basic variables in the new solution from that in the
solution of the original LP. Then the objective z increases by ∆z = 3(.01) = .03.
Waypoint 4.2.2. Consider the three-variable LP from Waypoint 4.2.1.
1. Suppose that the right-hand side of a particular constraint in-
creases by ∆b and that the set of basic variables in the solution
of the new LP is identical to that obtained in the original LP. Use
the definition of shadow price to determine how much the ob-
jective function will increase or decrease. Perform this analysis
for each constraint that is binding in the original LP.
2. By how much may the right-hand side of each binding con-
straint change and the set of basic variables in the solution to
the resulting LP equal that from the solution to the original LP?
For this range, what are the corresponding values of the basic
variables and the objective function?
In the preceding example, we modified the right-hand side of only one con-
straint. Yet the underlying principles can be easily modified to situations
involving more than one constraint bound. In this case, the constraint vector
b is adjusted by the vector δ = (δ1, δ2, δ3)^t, where δi, for i = 1, 2, 3, denotes the change in the bound on constraint i. Now the final matrix corresponding to that in
(4.25) becomes
" #
1 −c + yA y y (b + δ)
. (4.28)
0 MA M Mb + Mδ
For example, in the FuelPro LP, if the available premium increases by δ1 and the available stock A increases by δ2, then δ = (δ1, δ2, 0)^t, so that

\begin{aligned}
\begin{bmatrix} x_1\\ s_1\\ x_2 \end{bmatrix} &= Mb + M\delta \qquad (4.29)\\
&= \begin{bmatrix} 4-\delta_2\\ 4+\delta_1+\delta_2\\ 10+\frac{3}{2}\delta_2 \end{bmatrix}. \qquad (4.30)
\end{aligned}
The set of ordered pairs, (δ1, δ2), for which x1, s1, and x2 are all nonnegative
is shown in Figure 4.2. For ordered pairs in this shaded region, the new
objective value is given by z = y (b + δ).
FIGURE 4.2: Ordered pairs (δ1 , δ2 ) for which changes in the first two con-
straints of FuelPro LP leave the basic variables, {x1 , s1 , x2 }, unchanged.
The reason for this fact becomes apparent if we contrast the two vectors,
−c − cδ + yA and − c + y(A + Aδ ),
which appear in the top rows of the respective tableau matrices (4.20) and
(4.31). Changing an objective coefficient of a single decision variable in an
LP only changes the coefficient of the same variable in the top row when
the simplex algorithm is performed. But when an entry of A is modified,
the presence of the vector-matrix product y(A + Aδ ) in the top row of (4.31)
introduces the possibility that more than one coefficient in the top row of the
tableau is altered.
y(A + Aδ ) ≥ c,
then the solution to the LP whose final tableau matrix is given in (4.31) remains
optimal. Moreover, no additional pivots are needed to update any of the basic
variable values. This second fact follows from noting that entries of −c + yA
corresponding to basic variables are all zero (in the solution of the original
LP), as are those in yAδ . (Recall in our choice of Aδ that column entries of Aδ
corresponding to basic variables must equal zero.)
For example, in the solution to the three-variable LP (4.21), the variable x1 was nonbasic. Suppose the coefficient of x1 changes from 2 to 2 + δ in the second constraint. Referring to the slack variable coefficients in the top row of the final tableau, we note that the current solution remains optimal if y = [1, 1] satisfies

\[ y(A+A_\delta) = y\begin{bmatrix} 3 & 1 & 3\\ 2+\delta & 2 & 3 \end{bmatrix} = \begin{bmatrix} 5+\delta & 3 & 6 \end{bmatrix} \ge \begin{bmatrix} 4 & 3 & 6 \end{bmatrix}, \]

that is, if −1 ≤ δ.
To begin the analysis in this situation, we first perform a pivot to update the
values of the basic variables and the objective. This pivot leads to the result
in Table 4.15.
> e:=UnitVector[row](ObjectiveCoefficient,2);
e = [1 0]
> LPMatrixNew:=<UnitVector(1,m+1)|<convert(-c-delta*e,Matrix)+y.A,M.A>|
<y,M>|<y.b,M.b>>;
# Create new tableau matrix obtained by perturbing first objective
coefficient.
LPMatrixNew := [1  −δ  0  0  1/2  1   46]
               [0  1   0  0  −1   1   4 ]
               [0  0   0  1  1    −1  4 ]
               [0  0   1  0  3/2  −1  10]
> Iterate(LPMatrixNew,1,1);
# Update tableau entries.
z  x1  x2  s1  s2       s3     RHS
1  0   0   0   1/2 − δ  1 + δ  46 + 4δ
0  1   0   0   −1       1      4
0  0   0   1   1        −1     4
0  0   1   0   3/2      −1     10
> ConstraintNumber:=2:
# Assess sensitivity to the bound of the second constraint.
> e:=UnitVector(ConstraintNumber,m);
e = [0]
    [1]
    [0]
> LPMatrixNew:=<UnitVector(1,m+1)|<convert(-c,Matrix)+y.A,M.A>|
<y,M>|<y.(b+delta*e),M.(b+delta*e)>>;
# Create new tableau matrix obtained by perturbing the second constraint
bound.
LPMatrixNew := [1  0  0  0  1/2  1   46 + (1/2)δ]
               [0  1  0  0  −1   1   4 − δ      ]
               [0  0  0  1  1    −1  4 + δ      ]
               [0  0  1  0  3/2  −1  10 + (3/2)δ]
RealRange = (−4, 4)
> RowNumber:=3:ColumnNumber:=2:
# Assess sensitivity to entry in
row 3, column 2 of coefficient matrix.
> Adelta:=Matrix(m,n,{(RowNumber,ColumnNumber)=delta});
# Create an m by n perturbation matrix.
Adelta = [0  0]
         [0  0]
         [0  δ]
> LPMatrixNew:=
<UnitVector(1,m+1)|<convert(-c,Matrix)+y.(A+Adelta),M.(A+Adelta)>|
<y,M>|<y.b,M.b>>;
# Create new tableau matrix that results from perturbing assigned
entry of A.
LPMatrixNew := [1  0  δ      0  1/2  1   46]
               [0  1  δ      0  −1   1   4 ]
               [0  0  −δ     1  1    −1  4 ]
               [0  0  1 − δ  0  3/2  −1  10]
> Iterate(LPMatrixNew,3,2);
z  x1  x2  s1  s2                    s3              RHS
1  0   0   0   1/2 − (3/2)δ/(1 − δ)  1 + δ/(1 − δ)   46 − 10δ/(1 − δ)
0  1   0   0   −1 − (3/2)δ/(1 − δ)   1 + δ/(1 − δ)   4 − 10δ/(1 − δ)
0  0   0   1   1 + (3/2)δ/(1 − δ)    −1 − δ/(1 − δ)  4 + 10δ/(1 − δ)
0  0   1   0   (3/2)/(1 − δ)         −1/(1 − δ)      10/(1 − δ)
2. Consider the LP
(a) Solve this LP. One of the three decision variables in your solution
should be nonbasic.
(b) By how much can the objective coefficient of x1 increase or decrease
and the set of basic variables in the solution be unchanged from
that of original LP? For this range of coefficient values, what are
the values of the decision variables and objective function? Repeat
this process, separately, for each of the other objective coefficients.
(c) By how much can the bound in the first constraint increase or
decrease and the set of basic variables in the solution be unchanged
from that of the original LP? For this range of values, what are
the values of all variables, both decision and slack, as well as
the objective function? What is the corresponding shadow price?
Repeat this process for the bound in the second constraint.
(d) Now consider the LP formed by changing the bounds on the sec-
ond and third constraints:
Sketch the set of ordered pairs, (δ2 , δ3 ), for which the set of basic
variables in the solution is the same as that of the original LP. For
points in this set, what are the values of the decision variables and
the objective function in terms of δ2 and δ3 ?
(e) By how much can the coefficient of x1 in the second constraint
increase or decrease and the set of basic variables in the solution
be unchanged from that of the original LP? For this range of values,
what are the values of all variables, both decision and slack, as well
as the objective function?
3. Suppose that FuelPro currently produces its two fuels in a manner that
optimizes profits but is considering adding a mid-grade blend to its line
of products. In terms of the LP, this change corresponds to introducing
a new decision variable. Each gallon of mid-grade fuel requires two
gallons of stock A and two and a half gallons of stock B and, hence,
diverts resources away from producing the other fuel types. The com-
pany seeks to determine the smallest net profit per gallon of mid-grade
blend that is needed to increase the company’s overall profit from its
current amount.
(a) The introduction of a third fuel type adds an extra column to the
original coefficient matrix. Determine the new coefficient matrix,
Ã.
(b) Suppose that pm denotes the profit per gallon of the mid-grade fuel
type. Then c̃ = [4, 3, pm] records the profit per gallon of each of the
three fuel types. If y0 denotes the solution of the original dual LP,
calculate −c̃ + y0 Ã in terms of pm .
(c) Explain why −c̃ + y0 Ã records the decision variable coefficients
in the top row of the tableau that result if the same steps of the
simplex algorithm used to solve the original LP are applied to the
new LP having the third decision variable.
(d) Use the result from the previous question to determine the smallest
net profit per gallon of mid-grade blend that is needed to increase
the company’s overall profit from its current amount.
4.3 The Dual Simplex Method

At each iteration, the primal LP has a basic feasible solution, whose values
are recorded by the entries of Mb. Unless the LP is degenerate, all such values
are positive. During this process but prior to the termination of the algorithm,
at least one entry of y or −c + yA is negative, implying y is not dual-feasible.
At the completion of the algorithm, both yA ≥ c and y ≥ 0, whereby we
simultaneously obtain optimal solutions to both the primal LP and its dual.
Unfortunately, if b has one or more negative entries, then so too can Mb. In
this case the primal has a solution that is basic, but not basic feasible. The
dual simplex method, which uses the same tableau as the original method, is
well suited for addressing such a situation. At each stage of the algorithm,
all entries of y and −c + yA are nonnegative, implying y is dual-feasible.
At the same time, at least one entry of the rightmost tableau column, Mb, is
negative, so that the basic variable values for the primal at that stage constitute
merely a basic, but not basic feasible, solution. The goal of the algorithm then
becomes one of making all entries of Mb nonnegative. For when we achieve
this outcome, we obtain a basic feasible solution of the primal, whose values
are recorded by Mb, along with a basic feasible solution, y, of the dual. Since
their objective values are both recorded by yb, both solutions are optimal, and
our tableau is identical to that obtained using the regular simplex algorithm.
FIGURE 4.3: Basic solutions in the x1x2-plane encountered by the dual simplex method.
The tableau shows that s1 = −30 is the most negative basic variable, so it
becomes the departing variable. Thus we focus on its corresponding row and
must decide which nonbasic variable, x1 or x2 , replaces s1 . The two equations,
−2x1 + s1 = −30 and −3x2 + s1 = −30, indicate that we may let s1 increase to
zero by allowing either x1 to increase to 15 or x2 to increase to 10. The first of
these choices results in z = −3 · 15 = −45, the latter in z = −2 · 10 = −20. Because
our goal is to maximize z, we let x2 replace s1 as a basic variable and thus pivot
on the highlighted entry in Table 4.16. From this analysis, we conclude that
the dual simplex ratio test works as follows: Once we have determined a row
corresponding to the departing basic variable, i.e., a pivot row, we determine
the column of the entering basic variable as follows. For each negative entry
in the pivot row, we compute the ratio of the corresponding entry in the
top row (which is necessarily nonnegative by dual-feasibility) divided by the
pivot row entry. The smallest of all such ratios, in absolute value, determines
the pivot column.
We thus pivot on the highlighted entry in Table 4.16. The resulting tableau is
given in Table 4.17. Observe that x1 = 0 and x2 = 10, which corresponds to
another basic solution in Figure 4.3.
At this stage, s2 is the only negative basic variable. By the ratio test applied
to the row corresponding to s2, we see that x1 replaces s2 as a basic variable.
The resulting pivot leads to the final tableau in Table 4.18.
The dual simplex algorithm terminates at this stage because all entries in
the right-hand side of the tableau, below the top row, are nonnegative. In
other words, a basic feasible solution to the primal LP has been obtained. The
primal solution is given by (x1 , x2 ) = (6, 6) with z = −30, which corresponds
to w = 30 in the original LP (4.34).
If we compare this method with the Big M Method for solving the LP, we see
Another setting in which the dual simplex method proves extremely useful
is in sensitivity analysis, specifically when one desires to know the effect of
adding a constraint to an LP. For example, suppose that FuelPro must add
a third stock, call it stock C, to each fuel type in order to reduce emissions.
Assume that production of each gallon of premium requires 5 gallons of
stock C and that production of each gallon of regular unleaded requires 6
gallons of stock C. Furthermore, at most 75 gallons of stock C are available
for production. It follows then that we must add the constraint 5x1 + 6x2 ≤ 75
to the original FuelPro LP. Instead of completely resolving a new LP, we may
begin with the final tableau for the original FuelPro LP, given by Table 2.4
from Section 2.1, and add a row and column, together with a slack variable
s4 corresponding to the new constraint. The result is shown in Table 4.19.
To update the values of the basic variables x1 and x2 in the tableau, we must
perform two pivots. The resulting tableau is given in Table 4.20.
At this stage, y = [0, 1/2, 1, 0] is dual-feasible, so that we may apply the dual simplex algorithm directly. Only one iteration is required, in which case we replace a negative basic variable, s4, with the nonbasic variable s2. Pivoting on the highlighted entry in Table 4.20 yields the final tableau in Table 4.21.
From this final tableau, we observe that the addition of the new constraint leads to an updated solution of (x1, x2) = (21/4, 65/8), with a corresponding profit of z = 363/8 = 45.375. Observe how the added constraint has decreased FuelPro
As the steps involved in applying the dual simplex algorithm are quite similar
to those of its regular counterpart, we should not be surprised that the Maple
worksheet, Simplex Algorithm.mw, from Section 2.1 is easily modified to
execute the dual simplex method. Specifically, we substitute for the RowRatios
procedure in the original worksheet a procedure that computes "column ratios" instead. Syntax for doing so is given as follows:
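The original RowRatios procedure is not reproduced here, so the following is only a sketch of what such a replacement might look like; the tableau is assumed to be stored in a Matrix T, and PivotRow identifies the tableau row of the departing variable (both names are hypothetical):

> ColumnRatios:=proc(T::Matrix,PivotRow::posint)
    local j,n;
    n:=LinearAlgebra:-ColumnDimension(T);
    # Tableau row 0 (the top row) occupies Matrix row 1, so tableau row
    # PivotRow occupies Matrix row PivotRow+1. Columns 2 through n-1 hold
    # the variable columns; column n holds the right-hand sides.
    for j from 2 to n-1 do
      if T[PivotRow+1,j]<0 then
        print(j-1,abs(T[1,j]/T[PivotRow+1,j]));
      end if;
    end do;
  end proc:
# The pivot column is the one whose printed ratio is smallest in absolute
# value, as described in the ratio test above.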
Here we follow our numbering convention that column 0 and row 0 denote
the leftmost column and top row, respectively, of the tableau matrix.
Exercises Section 4.3
minimize z = x1 + 4x2
subject to
x1 + 3x2 ≥ 1
4x1 + 18x2 ≥ 5
x1 , x2 ≥ 0
(b)
minimize z = x1 + x2
subject to
x1 ≤ 8
−x1 + x2 ≤ 4
−x1 + 2x2 ≥ 6
2x1 + x2 ≤ 25
3x1 + x2 ≥ 18
−x1 + 2x2 ≥ 6
x1 , x2 ≥ 0
2. Consider again the scenario when a third stock, stock C, is added to each
of the fuel types in the FuelPro LP. Assume that each gallon of premium
requires 5 units of stock C, each unit of regular unleaded requires 6 units
of stock C. What is the minimum number of available units of stock C
that will change the optimal solution from its value of (x1 , x2 ) = (4, 10)
in the original LP?
Chapter 5
Integer Linear Programming
Letting x1 and x2 denote the numbers of produced rec kayaks and sea kayaks,
respectively, we can easily formulate GLKC’s goal as that of solving ILP (5.1):
Figure 5.1 illustrates the candidate solution values of x1 and x2 for ILP 5.1,
namely the ordered pairs in the shaded region whose entries are integer-
valued. We call such candidates feasible lattice points.
One approach to solving ILP 5.1 is of course to merely evaluate the objective
function at all feasible lattice points and to determine which produces the
largest objective value. If the inequalities that comprise the ILP produce a
bounded region, then there exist only a finite number of candidates from
which to choose. For small-scale problems, this approach is satisfactory, but
for larger-scale problems, it proves extremely inefficient, thereby motivating
the need for different methods.
While the shaded portion represents the feasible region for the relaxation LP,
only the lattice points in this feasible region are also feasible for the ILP.
FIGURE 5.1: Feasible region and solution for the GLKC ILP relaxation.
A naive approach to solving the ILP consists of rounding off the relaxation
solution to the nearest lattice point. Unfortunately, this approach can lead to
decision variable values that do not correspond to the optimal solution or are
infeasible to begin with.
Waypoint 5.1.1. Determine the solution of ILP (5.1) by graphical in-
spection of Figure 5.1. How does this solution compare to that of the
relaxation or to that obtained by rounding off the relaxation solution?
Note that the first of these LPs is nothing more than our original LP using
that portion of its feasible region in Figure 5.2 that falls to the left of x1 = 2.
The second LP uses that portion to the right of x1 = 3.
Solving LPs (5.3) and (5.4), we obtain the following solutions, respectively:

1. x1 = 2, x2 = 1.6̄, and z = 12.6̄;

2. x1 = 3, x2 = 0, and z = 9.
We now branch again by using these results and adding more constraints
to the corresponding LPs. However, the solution of (5.4) is integer-valued,
so adding more constraints to (5.4) can only produce an LP whose optimal objective value is no greater than 9. For this reason, we label x = (3, 0)^t, which corresponds to z = 9, a candidate for the optimal solution of (5.2), and we branch no further on LP (5.4).
and
1. x1 = 2, x2 = 1, and z = 10;
2. x1 = 1.5, x2 = 2, z = 12.5.
and
" #
2
These results indicate that x = is also a candidate solution. Because its
1
objective
" # value is larger than that produced by the first candidate solution,
3
x= , we eliminate the first candidate from further consideration.
0
and
Clearly, the first constraint in (5.7) is redundant in light of the newly added,
more restrictive third constraint. Moreover, the first and third constraints in
(5.8) can be combined into the equality constraint, x1 = 2. We only express
the constraints for each LP in this manner so as to emphasize the manner in
which we are adding more and more constraints to the original LP at each
stage of branching.
The second of the preceding LPs is infeasible. The first has its solution given by x1 = 1, x2 = 2.3̄, and z = 12.3̄. Branching once more on x2 produces LPs (5.9) and (5.10).
" # " #
1 0
LPs (5.9) and (5.10) produce solutions of x = and x = , respectively,
2 3
with corresponding objective values of z = 11 and z = 12. Both are candidate
solutions, so we branch no further.
At this stage the branching process is complete, and we have determined that four candidate solutions exist. The first of these, x = (3, 0)^t, has already been eliminated from consideration. Of the remaining three, x = (2, 1)^t, x = (1, 2)^t, and x = (0, 3)^t, the last yields the largest objective value of z = 12. Thus, ILP (5.1) has its solution given by x = (0, 3)^t, so that GLKC should produce no "rec" kayaks and three sea kayaks.
Relaxation of ILP: x = (2.25, 1.5)^t, z = 12.75
  x1 ≤ 2: x = (2, 1.6̄)^t, z = 12.6̄
    x1 ≤ 2, x2 ≤ 1: x = (2, 1)^t, z = 10 (candidate solution)
    x1 ≤ 2, x2 ≥ 2: x = (1.5, 2)^t, z = 12.5
      x1 ≤ 2, x2 ≥ 2, x1 ≤ 1: x = (1, 2.3̄)^t, z = 12.3̄
        x1 ≤ 2, x2 ≥ 2, x1 ≤ 1, x2 ≤ 2: x = (1, 2)^t, z = 11 (candidate solution)
        x1 ≤ 2, x2 ≥ 2, x1 ≤ 1, x2 ≥ 3: x = (0, 3)^t, z = 12 (optimal solution)
      x1 ≤ 2, x2 ≥ 2, x1 ≥ 2: infeasible
  x1 ≥ 3: x = (3, 0)^t, z = 9 (candidate solution)
Here are some points to bear in mind when using the branch and bound
method.
1. Create a tree diagram with nodes and arrows. Each node should clearly
convey the LP being solved, together with its optimal solution, espe-
cially the objective function value.
2. Along each branch, or edge, of the tree diagram connecting two nodes,
indicate the constraint that is added to the LP at the top node in order
to formulate the LP of the bottom node.
3. Before branching on a particular variable, pause to consider whether
doing so is absolutely necessary. Remember that in a maximization
(resp. minimization) problem, objective values can only decrease (resp.
increase) from one node to a lower node connected along a particular
branch.
4. At a certain point in the solution process, all possible cases along a
branch become exhausted and other previously constructed branches
must be examined for candidate solutions. Backtracking, the process of
revisiting unresolved branches in exactly the reverse order they were
created, is a systematic means for carrying out this process.
> A:=Matrix(2,2,[2,1,2,3]);
# Enter constraint matrix.
" #
2 1
A :=
2 3
> b:=<6,9>;
# Enter constraint bounds.
" #
6
b :=
9
> x:=<x1,x2>;
# Create vector of decision variables.
" #
x
x := 1
x2
> ConstraintMatrix:=A.x-b:
> Constraints:=seq(ConstraintMatrix[i]<=0,i=1..2);
> LPSolve(c.x,[Constraints],assume='nonnegative','maximize');
# Solve relaxation.
> LPSolve(c.x,[Constraints,x1>=3],assume='nonnegative','maximize');
# Add an inequality to form a new list of constraints, then solve the
# resulting LP.
Remaining LPs that result from further branching are handled in a similar manner so as to obtain the final solution of x = (0, 3)^t.
Branching on x3: the LP with added constraint x3 ≤ 3 has solution (x1, x2, x3) = (0, 1/3, 3) with z = 10, the optimal solution; the LP with added constraint x3 ≥ 4 is infeasible.
> restart:with(LinearAlgebra):with(Optimization):
> c:=Vector[row]([3,4]);
# Create vector of objective coefficients.
c := [3  4]
> A:=Matrix(2,2,[2,1,2,3]);
# Matrix of constraint coefficients.
" #
2 1
A :=
2 3
> b:=<6,9>;
# Constraint bounds.
$b := \begin{bmatrix} 6 \\ 9 \end{bmatrix}$
> LPSolve(c,[A, b],assume=nonnegint,'maximize',depthlimit=5);
$\left[ 12, \begin{bmatrix} 0 \\ 3 \end{bmatrix} \right]$
> LPSolve(x1+3*x2+3*x3,[x1+3*x2+2*x3<=7,2*x1+2*x2+x3<=11],
assume=nonnegative,integervariables=[x1,x3],'maximize');
# Solve the LP, specifying x1 and x3 are integer-valued.
The TSP seeks to determine the minimum distance an individual must travel
in order to begin at one location, pass through each of a list of other inter-
mediate locations exactly one time, and return to the starting point. Such a
round trip is known as a tour.
To solve the TSP, we first define the binary decision variables $x_{ij}$, where
$1 \le i, j \le n$. A value of $x_{ij} = 1$ in the final solution indicates that travel takes
place from destination i to j; a value of 0 indicates that no such travel takes
place. That no travel takes place from a destination to itself dictates that the
problem formulation must guarantee a solution in which $x_{ii} = 0$, for $1 \le i \le n$.
Each location is the starting point from which one travels to a new destination,
a condition expressed as $\sum_{j=1}^{n} x_{ij} = 1$ for $1 \le i \le n$. Similarly, each destination
is also the ending point of travel from a previous destination, so that $\sum_{j=1}^{n} x_{ji} = 1$
for $1 \le i \le n$. We will denote the distance between locations i and j, where
$1 \le i, j \le n$, by $d_{ij}$. Of course, $d_{ij} = d_{ji}$, and to ensure that $x_{ii} = 0$ for $1 \le i \le n$, we
set each $d_{ii} = M$, where M is a number much larger than any distance between
two different destinations. With this notation, the total distance traveled is
given by $D = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} x_{ij}$.
At first glance, it might then appear that the solution to the TSP is the same
as that obtained by solving the BLP
$$\begin{aligned}
\text{minimize } D &= \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} x_{ij} & (5.12)\\
\text{subject to}\quad
& \sum_{j=1}^{n} x_{ij} = 1, \quad 1 \le i \le n\\
& \sum_{j=1}^{n} x_{ji} = 1, \quad 1 \le i \le n.
\end{aligned}$$
To guarantee that only tours, and not subtours, are feasible, we must introduce
more decision variables and use these to add additional constraints to (5.12).
We label these decision variables as pi , where 1 ≤ i ≤ n, and let each pi denote
the position of destination i in a tour. For example, if 1 → 3 → 2 → 4 → 1, then
p1 = 1, p3 = 2, p2 = 3, and p4 = 4. Note that for each i, 1 ≤ pi ≤ n. Furthermore,
to solve the TSP, we may assume we start at location 1, meaning p1 = 1.
To verify that all tours satisfy (5.13), we consider two cases for each pair
of distinct destinations, i and j, in a tour, where $2 \le i, j \le n$. In the first case,
$x_{ij} = 1$, so travel takes place directly from destination i to destination j and
$p_j = p_i + 1$. Then
$$p_i - p_j + 1 = 0 = (n-1)(1 - x_{ij}),$$
so that (5.13) holds. In the second case $x_{ij} = 0$, and no such direct travel
takes place. Since $p_1 = 1$ and $2 \le i, j \le n$, we have $p_i - p_j \le n - 2$ so that
$p_i - p_j + 1 \le n - 1$. Thus, (5.13) holds for the second case as well.
To see why (5.13) rules out subtours, suppose to the contrary that a feasible
solution contains one, and begin by selecting a subtour not containing the first
destination. Then, for some k satisfying $2 \le k \le n-1$ and for some subset
$\{i_1, i_2, \ldots, i_k\}$ of distinct elements of $\{2, \ldots, n\}$,
$$i_1 \to i_2 \to i_3 \to \cdots \to i_k = i_1.$$
Applying (5.13) to each leg of this subtour, for which the corresponding decision
variable equals 1, yields $p_{i_{m+1}} \ge p_{i_m} + 1$; chaining these inequalities gives
$p_{i_k} \ge p_{i_1} + (k-1)$. However, $i_1 = i_k$ so that $p_{i_1} = p_{i_k}$, from which it follows
$k \le 1$. But this result contradicts our assumption $k \ge 2$. We may therefore
conclude that decision variables corresponding to a subtour do not satisfy
constraints (5.13).
Combining all these results, we finally obtain an ILP formulation of the TSP:
$$\begin{aligned}
\text{minimize } D = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} x_{ij} & & (5.16)\\
\text{subject to}\quad
\sum_{j=1}^{n} x_{ij} &= 1, & 1 \le i \le n\\
\sum_{j=1}^{n} x_{ji} &= 1, & 1 \le i \le n\\
p_i - p_j + 1 &\le (n-1)(1 - x_{ij}), & 2 \le i, j \le n\\
p_1 &= 1\\
2 \le p_i &\le n, & 2 \le i \le n\\
p_i &\in \mathbb{Z}, & 1 \le i \le n\\
x_{ij} &\in \{0, 1\}, & 1 \le i, j \le n.
\end{aligned}$$
While (5.16) provides a means of stating the TSP as an ILP, the task of solving
it proves inefficient, even for problems involving a moderate number of
decision variables. For this reason, determining more efficient methods for
solving larger-scale TSPs remains a much-pursued goal in the field of linear
programming.
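As an illustration of (5.16) at small scale, the following worksheet sketch (not taken from the text) solves a hypothetical four-city instance with Maple's LPSolve command. The distance matrix d is invented for illustration, and the sketch assumes LPSolve accepts the binaryvariables and integervariables options together.

> restart:with(Optimization):
> n:=4: M:=1000:
> d:=Matrix(n,n,[M,10,15,20,10,M,35,25,15,35,M,30,20,25,30,M]):
# Hypothetical symmetric distances; each diagonal entry equals the
# large constant M so that no solution travels from a destination
# to itself.
> Obj:=add(add(d[i,j]*x[i,j],j=1..n),i=1..n):
> Cons:=[seq(add(x[i,j],j=1..n)=1,i=1..n),
seq(add(x[j,i],j=1..n)=1,i=1..n),
seq(seq(p[i]-p[j]+1<=(n-1)*(1-x[i,j]),j=2..n),i=2..n),
p[1]=1,seq(2<=p[i],i=2..n),seq(p[i]<=n,i=2..n)]:
# Assignment constraints, the subtour-elimination constraints of
# (5.16), and bounds on the position variables p[i].
> LPSolve(Obj,Cons,binaryvariables={seq(seq(x[i,j],j=1..n),i=1..n)},
integervariables={seq(p[i],i=1..n)});
# Minimize total distance; the returned list contains the tour
# length followed by the values of the decision variables.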
3. To what digits (0-9) can one assign each of the letters in the phrase below
to make the equation mathematically valid?
(Hint: Note that only ten letters of the alphabet are used in this statement
so that we may number the letters. For example, suppose we number
“o”, “n”, and “e” as 1, 2, and 3, respectively. If we define xi j , where
1 ≤ i ≤ 10 and 0 ≤ j ≤ 9 is a binary decision variable whose solution
value is 1 provided letter i corresponds to digit j, then the letter “o” is
represented by the sum $\sum_{j=0}^{9} j x_{1j}$, the letter “n” by the sum $\sum_{j=0}^{9} j x_{2j}$, and
the letter “e” by the sum $\sum_{j=0}^{9} j x_{3j}$. This means that the word “one” is
represented by the sum
$$100 \sum_{j=0}^{9} j x_{1j} + 10 \sum_{j=0}^{9} j x_{2j} + \sum_{j=0}^{9} j x_{3j}.$$
TABLE 5.2: Weight and nutritional data taken from manufacturers' web sites

Item          | Powerbar™ | Cereal Bar | Cliff Bar™ | Luna Bar™ | Gu Gel™
Carbs (gms.)  | 29        | 43         | 42         | 27        | 25
Weight (gms.) | 42        | 65         | 68         | 48        | 32
6. The n-Queens Problem asks for the largest number of queens that can
be placed on an n-by-n chessboard so that no two queens attack one another.
1 Based upon Machol, [25], (1970).
(a) Formulate a BLP that solves the n-by-n Sudoku puzzle. To get
started, let xi jk , where 1 ≤ i, j, k ≤ n, denote a binary decision vari-
able, whose solution value equals 1 provided the entry in row i,
column j of the puzzle equals k, and zero otherwise. Let the objec-
tive of the BLP be any constant function; the constraints correspond
to feasible puzzle solutions and can be broken down as follows:
i. Each column contains one entry equal to k, where 1 ≤ k ≤ n.
ii. Each row contains one entry equal to k, where 1 ≤ k ≤ n.
2 Based upon Letavec and Ruggiero, [22], (2002).
3 Based upon Bartlett, et al., [3], (2008).
(a) Under a low-high policy, checks presented on the same day are
processed in the order of lower to higher check amounts. Calculate
the total amount of NSF fees the bank charges Joe under this policy.
(b) Under a high-low policy, checks presented on the same day are first
listed in order of descending amounts. Starting at the top of the list,
the bank ascertains whether the current balance exceeds the given
check amount. If so, the bank clears the check by deducting its
amount from the current balance and then moves to the next check
on this list. If not, the bank charges an NSF fee before proceeding
to the next check, which is of smaller amount and therefore may
not necessarily bounce. Calculate the total amount of NSF fees the
bank charges Joe under this policy. Your value should be larger
than that charged in (a).
(g) How does the solution of the BLP from the previous question
compare with that of the high-low policy?
5.2.1 Motivation
From a graphical perspective, the branch and bound method determines an
ILP’s solution through a process of repeatedly subdividing the feasible region
of the corresponding relaxation LP. The cutting plane algorithm, developed
by Ralph Gomory, also involves an iterative process of solving LPs. Each
new LP consists of its predecessor LP, combined with an additional, cleverly
constructed constraint that “trims” the feasible region, so to speak, without
removing feasible lattice points.
Our primary focus will be on pure ILPs (as opposed to MILPs) having the
property that all constraint bounds and all decision variable coefficients are
integer-valued. The GLKC ILP,
We now choose a row in (5.5) for which the corresponding basic variable is not
integer-valued. In this particular case, any row will suffice, but as a general
rule, we choose the row whose basic variable has its fractional value closest
to one-half. Thus, in this case we select the bottom row, which corresponds
to the equation
$$x_2 - \frac{1}{2} s_1 + \frac{1}{2} s_2 = \frac{3}{2}. \qquad (5.19)$$
$$x_2 - s_1 - 1 = \frac{1}{2} - \frac{1}{2} s_1 - \frac{1}{2} s_2. \qquad (5.21)$$
The “trimming” of the relaxation LP’s feasible region arises from the fact that
whenever terms of an equation corresponding to a row of the relaxation’s
final tableau are separated in this manner, with terms having integer com-
ponent coefficients on one side of the equation and terms having fractional
component coefficients on the other, each side of the resulting equation is
nonpositive whenever (x1 , x2 ) is a feasible solution of the ILP.
To demonstrate that this assertion holds for this particular example, we first
consider the constraint equations associated with the original relaxation of
(5.18):
2x1 + x2 + s1 = 6
2x1 + 3x2 + s2 = 9.
$$-\frac{1}{2} s_1 - \frac{1}{2} s_2 + s_3 = -\frac{1}{2}. \qquad (5.22)$$
We now add this equation to the relaxation solution tableau, (5.5), by adding
a new row and column. The result is shown in Table 5.6.
TABLE 5.6: Tableau for the relaxation after a new slack variable, s3, has been
introduced

z   x1  x2  s1    s2    s3  RHS
1   0   0   1/4   5/4   0   51/4
0   1   0   3/4   −1/4  0   9/4
0   0   1   −1/2  1/2   0   3/2
0   0   0   −1/2  −1/2  1   −1/2
At this stage of the algorithm, both s1 and s2 are nonbasic and $s_3 = -\frac{1}{2}$. Thus,
the current basic variables constitute a basic, but not basic feasible, solution
of the LP formed by adding constraint (5.22) to the relaxation of (5.18). To
obtain a basic feasible solution, we utilize the dual simplex algorithm from
Section 4.3.
Since only s3 is a negative basic variable, we apply the ratio test to the bottom
row of Table 5.6. The result is that variable s1 replaces s3 as a basic variable.
The resulting tableau is given in Table 5.7.
TABLE 5.7: Tableau after the first iteration of the cutting plane algorithm

z   x1  x2  s1  s2  s3   RHS
1   0   0   0   1   1/2  25/2
0   1   0   0   −1  3/2  3/2
0   0   1   0   1   −1   2
0   0   0   1   1   −2   1
The tableau in Table 5.7 marks the end of the first iteration of what we refer to as
the cutting plane algorithm.
While we could repeat the entire process of recording the equation corre-
sponding to this row, decomposing variable coefficients into their integer
and fractional components, and grouping terms with fractional coefficients
on one side of the equation, we see from close inspection of the first iteration
that a simpler means exists. Namely, we may perform the following steps:
3. Perform the dual simplex algorithm. The first step of this algorithm
focuses on the new row just added to the tableau, which corresponds
to the negative basic variable s4 .
TABLE 5.8: Tableau for the relaxation after a new slack variable, s4, has been
introduced

z   x1  x2  s1  s2  s3    s4  RHS
1   0   0   0   1   1/2   0   25/2
0   1   0   0   −1  3/2   0   3/2
0   0   1   0   1   −1    0   2
0   0   0   1   1   −2    0   1
0   0   0   0   0   −1/2  1   −1/2
Executing the dual simplex algorithm a second time, we see that s3 replaces
s4 as a basic variable. The resulting pivot leads to the result in Table 5.9.
TABLE 5.9: Tableau for the GLKC ILP after second iteration of cutting plane
algorithm
z x1 x2 s1 s2 s3 s4 RHS
1 0 0 0 1 0 1 12
0 1 0 0 -1 0 3 0
0 0 1 0 1 0 -2 3
0 0 0 1 1 0 -4 3
0 0 0 0 0 1 -2 1
The "tableau
# " indicates
# an optimal solution to original ILP given by
x1 0
x= = and z = 12. In addition, all slack variables are integer-valued.
x2 3
In addition, we have omitted from the following worksheet a new Maple pro-
cedure, called fracpart, which computes the fractional part of its argument.
It can be defined at the start of the worksheet as follows:
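A plausible reconstruction of that definition (the original is omitted from this excerpt) is:

> fracpart:=x->x-floor(x):
# Fractional part of x, defined so that, for example, fracpart(-1/2)
# returns 1/2, matching the values the cutting plane derivation
# requires.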
We begin this portion of the worksheet with the final tableau matrix,
LPMatrix, that results when the simplex algorithm is used to solve the GLKC’s
ILP relaxation. Recall again our numbering convention that column 0 and row
0 denote the leftmost column and top row, respectively, of the tableau matrix.
> Tableau(LPMatrix);
# Display final tableau for GLKC ILP relaxation.
z   x1  x2  s1    s2    RHS
1   0   0   1/4   5/4   51/4
0   1   0   3/4   −1/4  9/4
0   0   1   −1/2  1/2   3/2
> evalf(%);
z x1 x2 s1 s2 RHS
1 0 0 .25 1.25 12.75
0 1 0 .75 −.25 2.25
0 0 1 −.5 .5 1.5
> m:=m+1:
> for j from 2 to m do evalf(fracpart(LPMatrix[j,n+m+1]));
end do;
# Determine basic variable whose fractional part is closest
to one half.
.25
.50
> k:=2:
# Basic variable corresponding to row 2 has fractional part
closest to one half.
> RowCoefficients:=Row(LPMatrix,k+1):
# Row of the tableau used to generate the cut (this assignment is
# omitted from the original excerpt).
> NewRow:=-Vector[row]([seq(fracpart(RowCoefficients[j]),
j=1..(n+m+1)),-1,fracpart(LPMatrix[k+1,n+m+1])]);
# Create new bottom row corresponding to added constraint.
$NewRow := \begin{bmatrix} 0 & 0 & 0 & -\frac{1}{2} & -\frac{1}{2} & 1 & -\frac{1}{2} \end{bmatrix}$
> LPMatrix:=<<SubMatrix(LPMatrix,1..m,1..(n+m))|ZeroVector(m)|
SubMatrix(LPMatrix,1..m,(m+n+1)..(m+n+1))>,NewRow>;
# Create new matrix incorporating addition of new slack
variable to original LP’s relaxation.
$$LPMatrix := \begin{bmatrix}
1 & 0 & 0 & \frac{1}{4} & \frac{5}{4} & 0 & \frac{51}{4}\\
0 & 1 & 0 & \frac{3}{4} & -\frac{1}{4} & 0 & \frac{9}{4}\\
0 & 0 & 1 & -\frac{1}{2} & \frac{1}{2} & 0 & \frac{3}{2}\\
0 & 0 & 0 & -\frac{1}{2} & -\frac{1}{2} & 1 & -\frac{1}{2}
\end{bmatrix}$$
> x:=array(1..n):s:=array(1..m):
# Create arrays of decision and slack variables.
> Labels:=Matrix(1,2+n+m,[z,seq(x[i],i=1..n),seq(s[j],
j=1..m),RHS]):
# Create a new top row of labels.
> Tableau(LPMatrix);
z   x1  x2  s1    s2    s3  RHS
1   0   0   1/4   5/4   0   51/4
0   1   0   3/4   −1/4  0   9/4
0   0   1   −1/2  1/2   0   3/2
0   0   0   −1/2  −1/2  1   −1/2
> ColumnRatios(LPMatrix,m);
# Perform dual simplex algorithm ratio test on row m.
> Iterate(LPMatrix,m,3);
# Pivot on entry in row m, column 3 so that s1 replaces
s3 as a basic variable.
z   x1  x2  s1  s2  s3   RHS
1   0   0   0   1   1/2  25/2
0   1   0   0   −1  3/2  3/2
0   0   1   0   1   −1   2
0   0   0   1   1   −2   1
> LPMatrix:=<<SubMatrix(LPMatrix,1..m,1..(n+m))|ZeroVector(m)|
SubMatrix(LPMatrix,1..m,(m+n+1)..(m+n+1))>,NewRow>;
# Create new matrix incorporating addition of new slack
variable to original LP’s relaxation.
> x:=array(1..n):s:=array(1..m):
# Create arrays of decision and slack variables.
> Labels:=Matrix(1,2+n+m,[z,seq(x[i],i=1..n),
seq(s[j],j=1..m),RHS]):
# Create a new top row of labels.
> Tableau(LPMatrix);
z   x1  x2  s1  s2  s3    s4  RHS
1   0   0   0   1   1/2   0   25/2
0   1   0   0   −1  3/2   0   3/2
0   0   1   0   1   −1    0   2
0   0   0   1   1   −2    0   1
0   0   0   0   0   −1/2  1   −1/2
> Iterate(LPMatrix,m,5);
# Pivot on entry in row m, column 5 to obtain final tableau.
z   x1  x2  s1  s2  s3  s4  RHS
1   0   0   0   1   0   1   12
0   1   0   0   −1  0   3   0
0   0   1   0   1   0   −2  3
0   0   0   1   1   0   −4  3
0   0   0   0   0   1   −2  1
Part II

Nonlinear Programming
Chapter 6
Algebraic Methods for Unconstrained
Problems
Unless stated otherwise, we will incorporate any sign restrictions into the
constraints. Of course, NLPs whose goal involves maximization and/or whose
constraints include equations as opposed to inequalities can easily be converted
to the form of (6.1) through simple algebraic manipulations. Finally,
an NLP having no constraints is said to be unconstrained.
> display(FeasibleRegion);
[Figure: plot of the feasible region in the x1x2 plane.]
As was the case in the linear programming setting, contours are useful for
estimating the solution of an NLP. In Maple, they can be generated and then
superimposed on the feasible region by combining the contourplot and
display commands as was done in Section 1.2. Unfortunately, Maple does
not label contours. This fact, along with the nonlinearity of the objective and
constraints, can make estimating the solution of an NLP using its contour
diagram extremely difficult.
One means of gaining a better sense of how the objective function changes
within the feasible region is to add within the implicitplot command the
options filled=true,coloring=[white, black]. Doing so has the effect of
shading between contours in such a way that smallest function values corre-
spond to white regions, largest values correspond to black, and intermediate
values to varying shades of grey. Maple syntax that achieves this outcome is
given as follows:
> f:=(x1,x2)->x1*x2;
f := (x1 , x2 ) → x1 x2
> ObjectiveContours:=contourplot(f(x1,x2),x1=0..1,x2=-1..1,
filled=true,coloring=[white, black]):
# Create contour plot.
> display({FeasibleRegion,ObjectiveContours});
# Superimpose contours on previously constructed feasible
region.
The result in this case produces the graph in Figure 6.2. It suggests that the
solution of (6.2) occurs at the point on the unit circle corresponding to the angle
$\pi/4$, which is of course $(x_1, x_2) = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$.
FIGURE 6.2: Feasible region and contour shading for NLP (6.2).
> restart:with(Optimization):
> f:=(x1,x2)->x1*x2;
f := (x1 , x2 ) → x1 x2
> g2:=(x1,x2)->-x1;
g2 := (x1 , x2 ) → −x1
> NLPSolve(f(x1,x2),[g1(x1,x2)<=0,g2(x1,x2)<=0],’maximize’);
Thus NLP (6.3) has a solution of $(x_1, x_2) \approx (.7071, .7071)$, with corresponding
objective value of .5. In Section 6.4 we will develop algebraic techniques that
prove this solution equals $(x_1, x_2) = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$.
Each unit of produced pipe generates $1400 of revenue but costs $350 in
materials and $200 in labor to produce. The company has $40,000 of available
funds to spend, and the ratio of labor units to material units must be at least
one-third in order to ensure adequate labor to produce pipe from acquired
materials. Assume ConPro sells all the pipe that it produces.
Funding limits imply that $350x_1 + 200x_2 \le 40{,}000$, and the requirement of
adequate labor to produce pipe from acquired materials forces $x_1 \le 3x_2$.
Assuming positive decision variable values capable of taking on non-integer
values, we obtain NLP (6.6):
$$\begin{aligned}
\text{maximize } f(x_1, x_2) &= 1400 x_1^{1/2} x_2^{1/3} - 350x_1 - 200x_2 & (6.6)\\
\text{subject to}\quad
350x_1 + 200x_2 &\le 40{,}000\\
x_1 - 3x_2 &\le 0\\
x_1, x_2 &> 0.
\end{aligned}$$
Note that this NLP includes sign restrictions on its decision variables. By our
convention, these restrictions contribute two constraints, −x1 ≤ 0 and −x2 ≤ 0,
so that the NLP has four constraints total.
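As a quick numeric check of (6.6), one can proceed along the same lines as the NLPSolve computation shown earlier in this section; the following is a sketch rather than the text's own worksheet, and its output is omitted:

> with(Optimization):
> NLPSolve(1400*x1^(1/2)*x2^(1/3)-350*x1-200*x2,
[350*x1+200*x2<=40000,x1-3*x2<=0],
assume=nonnegative,'maximize');
# Numeric solution of NLP (6.6); the strict sign restrictions
# x1, x2 > 0 are approximated here by assume=nonnegative.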
$$\begin{aligned}
\text{minimize } f(x_1, x_2) &= \frac{x_1 + 2}{x_2 + 1}\\
\text{subject to}\quad
x_1 + 2x_2 &\le 3\\
-x_1 + x_2 &\le 1
\end{aligned}$$
Currently, Pam completes the 800 meter run in 3 minutes and throws
the shot put 6.7 meters. Pam’s coach estimates that for each hour per
week Pam devotes to weight lifting, she will decrease her 800 meter run
time by half that many seconds and increase her shot put distance by
one-tenth of a meter. Thus, for example, lifting weights two hours per
week will decrease her 800 meter run time by 1 second and increase
her shot put distance by .2 meters. Similarly, the coach estimates that
for each hour per week Pam devotes to distance running and for each
hour per week Pam devotes to speed workouts, she will decrease her
800 meter run time by that many seconds.
1 Based upon Ladany, [20], (1975).
(a) Pam is expected to train between six and ten hours per week.
(b) Between two and four hours of this time must be devoted to weight
lifting.
(c) Between three and four hours should be spent distance running.
(d) In order to ensure Pam devotes sufficient time to speed workouts,
she must allot at least 60% of her total running time to this activity.
Based upon this information, set up and solve an NLP having three de-
cision variables and ten constraints (including sign restrictions), whose
solution indicates the number of hours Pam should devote weekly to
weight lifting, distance running, and speed workouts in order to max-
imize her score in the 800 meter run and shot put components of the
pentathlon. By how much will her total score increase?
We shall later discover that constrained NLPs are solved using a method that
involves cleverly converting them to an unconstrained problem of the form
(6.7). For unconstrained NLPs, our investigation follows a line of reasoning
similar to that from a single-variable calculus course. In this section we de-
rive necessary conditions for a feasible point to be an optimal solution. This
step will be the easy part. The more challenging task, which is addressed in
Sections 6.3 and 6.4, is to establish sufficient conditions.
6.2.1 Differentiability
Deriving necessary conditions begins with stating a clear definition of differ-
entiability. Throughout subsequent discussions, we let kxk denote the usual
Euclidean vector norm in Rn .
Here are some important facts to keep in mind regarding this definition.
2. Equation (6.8), due to the presence of the limit, implicitly assumes that
f is defined at inputs sufficiently close to x0 . Consequently, there must
exist a small neighborhood, or open disk, about x0 that is contained in
S. A set S ⊆ Rⁿ in which every point has an open disk about it contained
in S is said to be open.
> with(VectorCalculus):
> Gradient(x1^2+x1*x2^3,[x1,x2]);
> with(VectorCalculus):
> f:=(x1,x2)->x1^2+x2^2;
> Delf:=unapply(Gradient(f(x1,x2),[x1,x2]),[x1,x2]):
> Delf(x1,x2);
$2x_1\, \bar{e}_{x_1} + 2x_2\, \bar{e}_{x_2}$
A function can have many local maxima and/or minima, and many inputs
can share the distinction of being the global maximum (or minimum).
Proof. The proof involves little more than applying the single-variable result
to each component of x0. Without loss of generality, assume f has a local
minimum at x0. Fix an arbitrary j, where $1 \le j \le n$, and write
$$x_0 = \begin{bmatrix} x_{0,1} \\ x_{0,2} \\ \vdots \\ x_{0,j} \\ \vdots \\ x_{0,n} \end{bmatrix}.$$
Define $f_j$ to be the “cross-section” obtained by fixing all but the jth component of f.
In other words,
$$f_j(x) = f\left( \begin{bmatrix} x_{0,1} \\ \vdots \\ x_{0,j-1} \\ x \\ x_{0,j+1} \\ \vdots \\ x_{0,n} \end{bmatrix} \right).$$
Observe that
$$f_j'(x_{0,j}) = \left[ \nabla f(x_0) \right]_j, \qquad (6.11)$$
the jth component of $\nabla f(x_0)$.
Waypoint 6.2.2. Suppose f (x1 , x2 ) = x31 − 3x1 x22 . A surface plot of this
function, which is sometimes called a “monkey saddle,” is shown in
Figure 6.3.
1. Show that x0 = 0 is the only critical point of f .
2. Explain why this critical point is neither a local maximum nor
a local minimum.
[FIGURE 6.3: Surface plot of the “monkey saddle” $f(x_1, x_2) = x_1^3 - 3x_1x_2^2$.]
2. Use Definition 6.2.1 to show that $f(x_1, x_2) = x_1^3 - 2x_1 - x_2$ is differentiable
at $x_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.
3. Suppose S ⊆ Rn , f : S → R and x0 belongs to the interior of S. If d is a
nonzero vector in Rn , we define the directional derivative of f at x0 in the
direction of d as
$$f_d'(x_0) = \lim_{h \to 0^+} \frac{f(x_0 + hd) - f(x_0)}{h}, \qquad (6.12)$$
provided this limit exists.
(a) Show that if f is differentiable at x0 , then the limit in (6.12) exists
and that
fd′ (x0 ) = ∇ f (x0 )t d.
(b) Show that the preceding quantity fd′ (x0 ) depends only upon the
direction of d, not its magnitude. Hence, d may always be assigned
a unit vector value.
(c) For the ConPro objective function f , determine the directional
derivative of f in the direction of the origin at (x1 , x2 ) = (10, 20).
4. Suppose S ⊆ Rn , f : S → R, and f is differentiable at x1 in S. Fix x2 in
S. (Note: The differentiability assumption implies x1 + hx2 also belongs
to S for h sufficiently close to zero.) Show that φ(h) = f (x1 + hx2 ) is
differentiable at h = 0 and satisfies
φ′ (0) = ∇ f (x1 )t x2 .
FIGURE 6.4: Graph of $f(x) = |x|^{3/2}$.
6.3.1 Convexity
The preceding example illustrates how, in the single-variable setting, concav-
ity still plays a role in classifying a critical point, even though the function
may not be twice-differentiable there. As we shall discover, this principle also
applies to the general unconstrained NLP. However, instead of using phrases
such as “concave up” and “concave down” to describe the behavior of a func-
tion, we will instead use the terms convex and concave. Before defining these
terms in the context of functions, we first define convexity for sets.
In the single-variable setting, we recall using the terms “concave up” and
“concave down.” In the language of Definition 6.3.2, these merely correspond
to “convex” and “concave,” respectively.
FIGURE 6.5: The paraboloid $f(x_1, x_2) = x_1^2 + x_2^2$, together with one chord
illustrating the notion of convexity.
Proof. Assume first that f is convex on S, let x0 and x be in S, and define $d = x - x_0$.
By Exercise 3 from Section 6.2, the differentiability of f implies the existence
of its directional derivatives at x0. Using the two equivalent formulations of
this quantity, we have
$$\begin{aligned}
\nabla f(x_0)^t (x - x_0) &= \nabla f(x_0)^t d\\
&= \lim_{h \to 0^+} \frac{f(x_0 + hd) - f(x_0)}{h}\\
&= \lim_{h \to 0^+} \frac{f\left(h(x_0 + d) + (1-h)x_0\right) - f(x_0)}{h}. \qquad (6.15)
\end{aligned}$$
Since f is convex,
$$f\left(h(x_0 + d) + (1-h)x_0\right) \le h f(x_0 + d) + (1-h) f(x_0). \qquad (6.16)$$
Substituting the result from (6.16) into (6.15) leads to (6.14), which completes
the first half of the proof.
For the reverse implication, choose x1 and x2 in S and t ∈ [0, 1] and assume
(6.14) holds. Since S is a convex set, x0 = tx1 + (1 − t)x2 belongs to S. Now apply
inequality (6.14) twice, first with x = x1 , and then with x = x2 . The results,
Since x1 and x2 were arbitrary in S and t was arbitrary in [0, 1], f is convex on
S by Definition 6.3.2. This completes the proof of the theorem.
Then compute the difference f (x) − T(x), and establish that this quantity is
nonnegative, regardless of the choices of x and x0 in S.
" # " #
x0,1 2x1
For example, if f (x1 , x2 ) = x21 + x22 and x0 = , then ∇ f (x) = and the
x0,2 2x2
198 Chapter 6. Algebraic Methods for Unconstrained Problems
x1
x2
FIGURE 6.6: The paraboloid f (x1 , x2 ) = x21 +x22 , together with the linearization,
or tangent plane, at an arbitrary point.
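Completing the computation suggested above (a step not shown in this excerpt):
$$f(x) - T(x) = x_1^2 + x_2^2 - \left[ x_{0,1}^2 + x_{0,2}^2 + 2x_{0,1}(x_1 - x_{0,1}) + 2x_{0,2}(x_2 - x_{0,2}) \right] = (x_1 - x_{0,1})^2 + (x_2 - x_{0,2})^2 \ge 0,$$
so inequality (6.14) holds at every x0, confirming that the paraboloid is convex on $\mathbb{R}^2$.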
Now assume that x0 is a strict local minimum, implying that f (x0 ) < f (x) in
(6.17). Then x0 is a global minimum of f on S by the preceding result. To
show it is unique with this property, we assume that x⋆ is a second global
minimum so that f (x⋆ ) = f (x0 ). By the convexity of f again, for all 0 ≤ t ≤ 1,
$$f\left(tx^\star + (1-t)x_0\right) \le t f(x^\star) + (1-t) f(x_0) = f(x_0). \qquad (6.19)$$
Since x0 is a global minimum, the reverse inequality also holds, from which it
follows that these two quantities are equal. Taking t close to zero then produces
points arbitrarily near x0 at which f equals f(x0). Hence, x0 is not a strict local
minimum of f, a contradiction, and therefore x0 is the unique global minimum of
f on S.
The case when f is strictly convex is similar and is left as an exercise. This
completes the proof.
$$\text{maximize } f(x_1, x_2) = 1400 x_1^{1/2} x_2^{1/3} - 350x_1 - 200x_2,$$
where $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in S = \mathbb{R}^2_+ = \{(x_1, x_2) \mid x_1, x_2 > 0\}$. (6.20)
Recall in this example that x1 denotes the number of units of material, x2 the
number of units of labor, and f the resulting profit. A surface plot of f is shown
in Figure 6.7.
[FIGURE 6.7: Surface plot of the ConPro profit function f(x1, x2).]
However, using Theorem 6.3.1 to formally verify this fact is extremely chal-
lenging, as it requires us to prove that T − f is nonnegative on S, where T is
the tangent plane to f at an arbitrary point of S. Fortunately, in Section 6.4 we
develop a much simpler means for establishing f is concave, one analogous
to the single-variable second derivative test. Before starting that discussion,
however, we investigate how convexity arises in the context of regression.
When n = 1, then $f(x) = a^t x + b$ reduces to the standard regression line. If n > 1,
then the derivation of f is known as multiple linear regression. A simple example
of such a problem is illustrated by the data in Table 6.1, which lists several
cigarette brands. To each brand, we associate its tar and nicotine amounts, its
mass, and the amount of produced carbon monoxide (CO).
In many cases, data such as that in Table 6.1 is given in spreadsheet format.
Maple’s ExcelTools package provides a convenient means for importing Ex-
cel spreadsheet data into Maple array structures. Once the package is loaded,
a range of Excel cells is imported as an array using a command of the form
Import("c:\\Users\\Documents\\CigaretteData.xls","Sheet1","A1:A25").
Observe that this command indicates the directory of the Excel file, the file’s
name, the name of the worksheet in the file, and the cell range. Assuming that
the entries from (6.1) are located in rows 1-25 and columns A-D of the first
sheet of the file CigaretteData.xls, we can read the entries into four vectors as
follows:
> restart:with(ExcelTools):with(Statistics):
with(VectorCalculus):with(LinearAlgebra):
> tar:=
convert(Import("c:\\Users\\Documents\\CigaretteData.xls",
"Sheet1","A1:A25"),Vector):
> nicotine:=
convert(Import("c:\\Users\\Documents\\CigaretteData.xls",
"Sheet1","B1:B25"),Vector):
> mass:=
convert(Import("c:\\Users\\Documents\\CigaretteData.xls",
"Sheet1","C1:C25"),Vector):
> CO:=
convert(Import("c:\\Users\\Documents\\CigaretteData.xls",
"Sheet1","D1:D25"),Vector):
Here we have converted the arrays to vectors through use of the convert
command. Doing so allows us to access entries in each vector using a single
index. Now we can use this data to determine the linear function of best fit
as follows:
> x:=Matrix(3,1,[x1,x2,x3]):
# Independent variable vector.
> a:=Matrix(3,1,[a1,a2,a3]):
# Unknown regression coefficients.
> f:=unapply((Transpose(a).x)[1,1]+b,[x1,x2,x3]):
# Define linear regression function.
> S:=add((f(tar[i],nicotine[i],mass[i])-CO[i])^2,i=1..25):
# Sum of squared-errors.
> DelS:=Gradient(S,[a1,a2,a3,b]): # Gradient of S.
> fsolve({DelS[1]=0,DelS[2]=0,DelS[3]=0,DelS[4]=0},
{a1,a2,a3,b});
# Determine critical point of S
Thus the linear function that best predicts carbon monoxide output as a
function of tar, nicotine, and cigarette mass, based upon the information in
Table 6.1, is given by
$$f(x) = a^t x + b = .96257x_1 - 2.63167x_2 - .13048x_3 + 3.20221.$$
> f:=(x1,x2,x3)->.96257*x1-2.63167*x2-.13048*x3+3.20221:
# Enter best-fit function.
> CObar:=Mean([seq(CO[i],i=1..25)]);
# Compute mean carbon monoxide value.
CObar := 12.528
> Rsquared:=
add((f(tar[i],nicotine[i],mass[i])-CObar)^2,i=1..25)/
add((CObar-CO[i])^2,i=1..25);
.91859
(a) The absolute value function, f (x) = |x|. (Hint: Use the triangle
inequality.)
(b) Any quadratic function $f(x) = ax^2 + bx + c$, where a, b, and c are
real numbers and a ≥ 0.
5. Suppose that $f(x_1, x_2) = 2x_1^2 x_2 + x_1^2 x_2^2 - 4x_1 x_2 - 2x_1 x_2^2$. Show that f is
neither convex nor concave on $\mathbb{R}^2$. (Hint: First fix x1, say x1 = 1, and
consider the graph of f(1, x2). Then fix x2 = 1 and consider the graph of
f(x1, 1).)
11. The Body Fat Index (BFI) is one means of measuring an individual’s
fitness. One method of computing this value is Brozek’s formula, which
defines
$$BFI = \frac{457}{\rho} - 414.2,$$
where ρ denotes the body density in units of kilograms per liter. Un-
fortunately, an accurate measurement of ρ can only be accomplished
through a process known as hydrostatic weighing, which requires
recording an individual’s weight while under water, with all air ex-
pelled from his or her lungs. In an effort to devise less time-consuming
means for estimating BFI, researchers have collected data that suggests
$$P_2(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2. \qquad (6.27)$$
When $f'(x_0) = 0$,
$$f(x) \approx f(x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 \quad \text{for } x \approx x_0. \qquad (6.28)$$
The sign of f ′′ (x0 ) indicates whether locally, near x0 , the graph of f is parabolic
opening up, indicating x0 is a local minimum, or parabolic opening down,
indicating x0 is a local maximum. Of course, these two cases correspond to
the graph of f being convex or concave, respectively at x0 . If f ′′ (x0 ) = 0,
no conclusion may be made regarding the nature of the critical point; other
means of investigation are required.
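For instance, $f(x) = x^4$, $f(x) = -x^4$, and $f(x) = x^3$ each satisfy $f'(0) = f''(0) = 0$, yet $x_0 = 0$ is a local minimum of the first, a local maximum of the second, and neither for the third.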
We usually refer to A as the matrix associated with the quadratic form. When
n = 1, this quantity is merely the coefficient a of the quadratic power function
$f(x) = ax^2$.
[Surface plot of the quadratic form in (6.29).]
2. $$f(x_1, x_2) = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^t \begin{bmatrix} -2 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = -2x_1^2 + 2x_1x_2 - 3x_2^2 \qquad (6.30)$$
[Surface plot of the quadratic form in (6.30).]
3. $$f(x_1, x_2) = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^t \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1^2 + 4x_1x_2 + x_2^2 \qquad (6.31)$$
[Surface plot of the quadratic form in (6.31).]
Then f is positive definite (resp. negative definite) if and only if all eigenvalues
of A are positive (resp. negative).
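This criterion is straightforward to check with Maple's LinearAlgebra package; the following minimal sketch (not part of the original worksheet) examines the matrix from (6.30):

> with(LinearAlgebra):
> Eigenvalues(Matrix(2,2,[-2,1,1,-3]));
# Returns -5/2+sqrt(5)/2 and -5/2-sqrt(5)/2; both eigenvalues are
# negative, so the quadratic form in (6.30) is negative definite.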
Waypoint 6.4.1. Determine the eigenvalues of the matrices associated
with the quadratic forms in Figures 6.8 and 6.9. Use your results to
classify each as positive definite or negative definite.
With one additional tool, we shall discover how these three types of
quadratic forms hold the key for constructing our higher-dimensional version
of the second derivative test.
$$f(x) = f(x_0) + \nabla f(x_0)^t (x - x_0) + \frac{1}{2}(x - x_0)^t H_f(x_0)(x - x_0) + \|x - x_0\|^2 R(x_0; x), \qquad (6.32)$$
In the same way that first-order differentiability forces the gradient vector to
be comprised of first-order partial derivatives, second-order differentiability
leads to a Hessian matrix consisting of second-order partial derivatives:
$$H_f(x) = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n}\\[2mm]
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n}\\[2mm]
\vdots & \vdots & \ddots & \vdots\\[2mm]
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}. \qquad (6.33)$$
> with(VectorCalculus):
> Hessian(x1^2+x1*x2^3,[x1,x2]);
$\begin{bmatrix} 2 & 3x_2^2 \\ 3x_2^2 & 6x_1x_2 \end{bmatrix}$
> with(VectorCalculus):
> f:=(x1,x2)->x1^2+x1*x2^3;
> Hf:=unapply(Hessian(f(x1,x2),[x1,x2]),[x1,x2]):
> Hf(x1,x2);
$\begin{bmatrix} 2 & 3x_2^2 \\ 3x_2^2 & 6x_1x_2 \end{bmatrix}$
Waypoint 6.4.2. For each of the following functions, determine a
general formula for the Hessian matrix, H f (x). Then, determine the
critical point, x0 , of each function, and evaluate H f (x0 ).
The Hessian is an extremely useful tool for classifying local optimal solutions
and also for establishing a function is convex on its domain. Theorem 6.4.2,
our long-desired “second derivative test,” addresses the first of these issues.
Before stating this theorem, we describe an important result from linear alge-
bra that we will utilize in its proof.
Definition 6.4.4. A square, invertible matrix, P, satisfying P−1 = Pt is said to
be an orthogonal matrix.
An orthogonal n-by-n matrix, P, has interesting properties. Among them are
the following:
• The column vectors of P are orthonormal, meaning that if we label the
column vectors as $u_1, u_2, \ldots, u_n$, then $u_i^t u_j$ equals zero for $1 \le i, j \le n$
with $i \ne j$ and equals 1 if $i = j$.
• kPxk = kxk for any x in Rn .
• The matrix Pt is orthogonal as well.
We now prove
$$(x - x_0)^t H_f(x_0)(x - x_0) \ge \lambda \|x - x_0\|^2, \qquad (6.40)$$
a result that, when combined with (6.38), leads to
$$f(x) - f(x_0) \ge \|x - x_0\|^2 \left( \frac{\lambda}{2} + R(x_0; x) \right).$$
Since λ > 0 and R(x0 ; x) → 0 as x → x0 , we will then be able to conclude
that f (x) − f (x0 ) > 0 for all x sufficiently close to x0 , which is precisely what it
means to say that x0 is a strict local minimum of f .
Since $P^t$ is also orthogonal, $\|w\| = \|x - x_0\|$. We now use this fact to obtain the
desired lower bound in (6.40):
Thus, we have established (6.40) so that x0 is a local minimum and the proof
of (2) is complete.
If $H_f(x_0)$ is indefinite, then there exist $x_+$ and $x_-$ in S such that $x_+^t H_f(x_0) x_+ > 0$
and $x_-^t H_f(x_0) x_- < 0$. Define $\varphi(h) = f(x_0 + hx_+)$. By the result from Exercise
4 of Section 6.2, along with a simple extension of this result to the second
derivative, φ is twice-differentiable in a neighborhood of the origin where it
satisfies
Since φ′ (0) = 0 and φ′′ (0) > 0, φ has a local minimum of φ(0) = f (x0 ) at h = 0,
from which it follows that f cannot have a local maximum at x0 . A similar
argument using x− establishes that f cannot have a local minimum at x0 . This
completes the proof of (3).
Waypoint 6.4.3. Use Theorem 6.4.2 to classify, where possible,
all strict local maxima and minima of the functions corresponding
to Figures 6.8-6.9.
While Theorem 6.4.2 is useful for determining local maxima and minima, it
cannot be used by itself to determine global optimal solutions of the uncon-
strained NLP in which we seek to minimize f (x), for x belonging to some
open, convex set S in Rn . The difficulty lies in making the transition from
“local” to “global.” Fortunately, the Global Optimal Solutions Theorem (The-
orem 6.3.2) helps us make this transition, provided we show f is convex on
S. This task appeared quite difficult in general at the end of Section 6.3. The
next result demonstrates that a more straightforward method exists, one that
requires showing H f (x0 ) is positive semidefinite for all x0 in S, not just at the
critical point.
Theorem 6.4.3 and our previous results provide a framework for solving
unconstrained NLPs. Suppose S ⊆ Rⁿ is an open convex set and that f : S → R
is twice-differentiable at each point of S. Consider the general unconstrained
NLP
minimize f (x), where x ∈ S. (6.46)
The process of determining the global minimum is as follows:
Waypoint 6.4.4. Show that the preceding quantity is negative in S.
(Hint: First establish the general inequality $\sqrt{a^2 + b^2} < a + b$ for a, b > 0.)
Since both eigenvalues are negative on S, $H_f(x)$ is negative definite on S, so
that f is strictly concave. Thus, the unique global maximum is given by
$$x_0 = \begin{bmatrix} 784/9 \\ 2744/27 \end{bmatrix} \approx \begin{bmatrix} 87.11 \\ 101.63 \end{bmatrix}.$$
> with(VectorCalculus):with(LinearAlgebra):
> f:=(x1,x2)->1400*x1^(1/2)*x2^(1/3)-350*x1-200*x2;
# Enter function to be maximized.
$f := (x_1, x_2) \rightarrow 1400\sqrt{x_1}\, x_2^{1/3} - 350x_1 - 200x_2$
> Delf:=unapply(Gradient(f(x1,x2),[x1,x2]),[x1,x2]):
# Create the gradient function of f.
> solve({Delf(x1,x2)[1]=0,Delf(x1,x2)[2]=0},{x1,x2});
# Determine the critical point by solving a system of two
equations in x1 and x2. These are formed by setting each
component of the gradient of f equal to 0.
> evalf(%);
# Determine floating point approximation of critical point.
> subs(%,f(x1,x2));
# Determine objective value at critical point.
10162.96296
> Hf:=Hessian(f(x1,x2),[x1,x2]);
# Create a matrix, Hf, consisting of the Hessian of f.
$$H_f = \begin{bmatrix}
-350\, \dfrac{\sqrt[3]{x_2}}{x_1^{3/2}} & \dfrac{700}{3}\, \dfrac{1}{\sqrt{x_1}\, x_2^{2/3}}\\[3mm]
\dfrac{700}{3}\, \dfrac{1}{\sqrt{x_1}\, x_2^{2/3}} & -\dfrac{2800}{9}\, \dfrac{\sqrt{x_1}}{x_2^{5/3}}
\end{bmatrix}$$
> Eigenvalues(Hf);
# Determine eigenvalues of Hf.
$$\begin{bmatrix}
-\dfrac{175}{9}\, \dfrac{8x_1^3 + 9x_1x_2^2 - \sqrt{64x_1^6 + 81x_1^2x_2^4}}{x_2^{5/3}\, x_1^{5/2}}\\[4mm]
-\dfrac{175}{9}\, \dfrac{8x_1^3 + 9x_1x_2^2 + \sqrt{64x_1^6 + 81x_1^2x_2^4}}{x_2^{5/3}\, x_1^{5/2}}
\end{bmatrix}$$
> IsDefinite(Hf,’query’=’negative_semidefinite’);
# Example illustrating use of IsDefinite command. Result
shows H is negative semidefinite on domain, x1>0,x2>0.
$$0 \le \frac{490000}{9\, x_1\, x_2^{4/3}}, \qquad 0 \le \frac{350\left(8x_1^2 + 9x_2^2\right)}{9\, x_1^{3/2}\, x_2^{5/3}}$$
Recall from Section 4.1 Steve and Ed’s zero-sum matrix game involving the
3-by-3 payoff matrix
1 −1 2
2 4 −1 .
A = (6.51)
−2 0 2
$$\begin{aligned}
\text{maximize } & z & (6.52)\\
\text{subject to}\quad
& Ax \ge ze\\
& e^t x = 1\\
& x \ge 0.
\end{aligned}$$
(Recall that e denotes a 3-by-1 vector of 1s.) Steve's equilibrium mixed strategy
is obtained by solving this LP; Ed's equilibrium mixed strategy corresponds to
the dual:
$$\begin{aligned}
\text{minimize } & w & (6.53)\\
\text{subject to}\quad
& y^t A \le w e^t\\
& e^t y = 1\\
& y \ge 0.
\end{aligned}$$
(Note: Notation in this formulation of the dual LP differs slightly from that
used in Section 4.1.5. Here we assume that y is a column vector, as opposed
to a row vector, in R3 .)
This zero-sum matrix game can also be described using results from this
section. Consider the matrix product yt Ax. Since x1 + x2 + x3 = 1 and y1 + y2 +
y3 = 1, this product defines a function f : R4 → R as follows:
$$f(x_1, x_2, y_1, y_2) = \begin{bmatrix} y_1 \\ y_2 \\ 1 - y_1 - y_2 \end{bmatrix}^t A \begin{bmatrix} x_1 \\ x_2 \\ 1 - x_1 - x_2 \end{bmatrix}. \qquad (6.54)$$
Waypoint 6.4.5. Show that f in (6.54) has one critical point, a saddle
point, at
$$\begin{bmatrix} x_{1,0} \\ x_{2,0} \\ y_{1,0} \\ y_{2,0} \end{bmatrix} = \begin{bmatrix} 3/28 \\ 9/28 \\ 1/2 \\ 5/14 \end{bmatrix},$$
where $f(x_{1,0}, x_{2,0}, y_{1,0}, y_{2,0}) = \frac{13}{14}$. This demonstrates how the mixed
strategy Nash equilibrium of a zero-sum matrix game corresponds to
a saddle point of a single function, which is formed using the payoff
matrix and whose output at the saddle point is the game value. That
this outcome holds in general is a special case of a classic result from
game theory, known as the von Neumann Minimax Theorem.
We conclude this section by noting that, while neither player can improve
his standing by unilaterally deviating from his equilibrium mixed strategy, it
is possible for both players to simultaneously deviate from their equilibrium
mixed strategies and for one or the other to benefit as a result. Exercise (8)
illustrates this phenomenon.
$$Q(x) = f(x_0) + \nabla f(x_0)^t (x - x_0) + \frac{1}{2}(x - x_0)^t H_f(x_0)(x - x_0)$$
for the function $f(x_1, x_2) = e^{-(x_1^2 + x_2^2)}$ at $x_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. Plot this approximation
together with the function f.
(b) Confirm that the solution just obtained coincides with the saddle
point of the function
$$f(x_1, y_1) = \begin{bmatrix} y_1 \\ 1 - y_1 \end{bmatrix}^t A \begin{bmatrix} x_1 \\ 1 - x_1 \end{bmatrix}.$$
(c) Sketch the set of point(s) in the x1 y1 plane representing all possible
mixed strategies for the two players that yield the game value from
(a).
(d) Sketch the set of point(s) in the x1 y1 plane representing all possible
mixed strategies for the two players that yield a game value that
is 20% larger than that from (a).
9. Consider the zero-sum matrix game having the 2-by-2 payoff matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}. \qquad (6.57)$$
During the past several years, numeric methods have evolved and improved
dramatically. We will not attempt to describe all developments in this field
but will instead focus on three elementary tools: the Steepest Descent Method,
Newton’s Method, and the Levenberg-Marquardt Algorithm.
with equality in the absolute value possible if and only if $d = \pm \frac{\nabla f(x_0)}{\|\nabla f(x_0)\|}$.
(See Appendix B.) Consequently $f_d'(x_0)$ is a minimum when d is chosen to
have direction exactly opposite that of $\nabla f(x_0)$. We summarize this fact by
saying $-\nabla f(x_0)$ is the optimal descent direction of f at $x_0$.
The Steepest Descent Method begins with an initial value x0 and a corre-
sponding optimal descent direction, −∇ f (x0 ). If we define
$$\varphi(t) = f\left(x_0 - t \nabla f(x_0)\right), \qquad (7.2)$$
$$f(x) = \frac{1}{2} x^t A x + b^t x + c \qquad (7.3)$$
$$= x_1^2 + x_1x_2 + \frac{3}{2}x_2^2 + x_1 - 4x_2 + 6.$$
If $x_0 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, straightforward calculations show that $f(x_0) = 12.5$ and
$\nabla f(x_0) = \begin{bmatrix} 2 \\ -6 \end{bmatrix}$. In this case, (7.2) yields $\varphi(t) = 46t^2 - 40t + 12.5$.
Figures 7.1 and 7.2 provide two depictions of the function φ. Figure 7.1
illustrates the graph of f , along with an embedded arc consisting of the
ordered triples
$$\left\{ \begin{bmatrix} x_0 - t\nabla f(x_0) \\ \varphi(t) \end{bmatrix} \;\middle|\; 0 \le t \le 1 \right\}. \qquad (7.4)$$
FIGURE 7.1: Plot of function f, together with the curve of triples defined by (7.4).
Figure 7.2 depicts φ itself plotted as a function of t. The strict global minimum
of φ is given by $t_0 = \frac{10}{23} \approx .435$. Using $t_0$, we define
$$x_1 = x_0 - t_0 \nabla f(x_0) = \begin{bmatrix} 3/23 \\ 37/23 \end{bmatrix} \approx \begin{bmatrix} .1304 \\ 1.609 \end{bmatrix}. \qquad (7.5)$$
Note that f (x1 ) = φ(t0 ) < φ(0) = f (x0 ). In fact, direct computation shows that
f (x1 ) = 3.804.
FIGURE 7.2: A plot of $\varphi(t) = f\left(x_0 - t\nabla f(x_0)\right)$.
Table 7.1 lists results of the Steepest Descent Method applied to the function
$f(x_1, x_2) = x_1^2 + x_1x_2 + \frac{3}{2}x_2^2 + x_1 - 4x_2 + 6$ using an initial value $x_0 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and a
> restart:with(VectorCalculus):with(LinearAlgebra):
> SteepestDescentMethod:=proc(function,initial,N,tolerance)
local x,f,Delf,d,j,epsilon,phi,t0: global iterates:
# Create local and global variables.
x:=array(0..N):
# Define array of iterates.
f:=unapply(function,[x1,x2]):
# Create function.
Delf:=unapply(Gradient(function,[x1,x2]),[x1,x2]):
# Create corresponding gradient function.
x[0]:=evalf(initial):
# Set initial array value.
epsilon:=10:
# Set initial epsilon to a large value.
j:=0: while (tolerance<=epsilon and j<=N-1) do
# Create loop structure to perform algorithm.
d:=-Delf(op(x[j])): # Compute optimal descent direction.
phi:=unapply(simplify(f(x[j][1]+t*d[1],x[j][2]+t*d[2])),t):
t0:=subs(fsolve(D(phi)(t)=0, {t=0..infinity},
maxsols=1),t):
# Determine minimum of phi.
x[j+1]:=[x[j][1]+t0*d[1],x[j][2]+t0*d[2]]:
# Use minimum of phi to construct next iterate.
epsilon:=evalf(Norm(Delf(x[j+1][1],x[j+1][2]),Euclidean)):
# Update epsilon using gradient.
j:=j+1:
# Increase loop index.
end do:
iterates:=[seq(x[i],i=0..j)]:RETURN(x[j]):end:
# Construct iterates and return last array value.
> f:=(x1,x2)->x1^2+x1*x2+3/2*x2^2+x1-4*x2+6;
# Enter function.
$f := (x_1, x_2) \rightarrow x_1^2 + x_1x_2 + \frac{3}{2}x_2^2 + x_1 - 4x_2 + 6$
[−1.396540715, 1.795964168]
> iterates;
# Print iterates.
Note that the list, iterates, can then be plotted using the pointplot com-
mand.
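For instance, a minimal sketch (assuming the plots package is loaded and SteepestDescentMethod has just been executed):

> with(plots):
> pointplot(iterates,connect=true,symbol=solidcircle);
# Display the iterates joined by line segments, which makes the
# zig-zag path toward the minimum visible.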
Waypoint 7.1.1. Apply the Steepest Descent Method to the function
$f : \mathbb{R}^2 \to \mathbb{R}$ defined by $f(x_1, x_2) = x_1^4 + x_2^4 - 4x_1x_2 + 1$. Use a tolerance of
$\epsilon = .1$, and investigate the outcome using three different initial values,
$x_0 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, $x_0 = \begin{bmatrix} -1 \\ -2 \end{bmatrix}$, and $x_0 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$. Compare your results with those
obtained using algebraic methods from Section 6.4.
The preceding example, and others like it, demonstrate how the Steepest
Descent Method can generate a sequence, {xk }, that converges to a critical
point of f that is a local, but not global, minimum or that is a saddle point.
Thus, additional conditions must be met in order to ensure that the sequence
of iterates converges in the first place, and, if it does, whether it yields a local
or global minimum.
If there exists $m \in \mathbb{R}$ such that $m \le f(x)$ for all $x \in S$, then the sequence
$\{x_k\} \subseteq S_0$ generated using the Steepest Descent Method satisfies
$$\lim_{k \to \infty} \nabla f(x_k) = 0. \qquad (7.6)$$
This theorem does not state that the sequence {xk } converges to the global
minimum of f on S. In fact, it does not even state that this sequence converges
at all. However, if even some subsequence of this sequence has a limit point,
x⋆ , in S, then the Lipschitz continuity and (7.6) guarantee ∇ f (x⋆ ) = 0. (We
note that if S0 is closed and bounded, then such a subsequence must exist due
to a fundamental analysis result known as the Bolzano-Weierstrass Theorem.
The limit of the subsequence must belong to S0 or its boundary.) Because f (xk )
is decreasing in k, x⋆ must therefore correspond to either a local minimum
or a saddle point.
Here, $\|A\|$ denotes the spectral norm of A, defined as the square root of
the largest eigenvalue of $AA^t$.
2. If f is defined as above and A is also symmetric, the eigenvalues of $AA^t$
are the eigenvalues of $A^2$, and the spectral norm reduces to $\|A\| = \rho(A)$,
where $\rho(A)$ is the spectral radius, or maximum of the absolute values of
the eigenvalues of A.
3. More generally, if all second-order partial derivatives of f exist and
are continuous on O, then ∇f is Lipschitz continuous on O provided
$\|H_f(x)\| = \rho\left(H_f(x)\right)$ is bounded by a constant independent of x in O. We
will omit the proof of this result, which follows from a generalization
to $\mathbb{R}^n$ of the Mean Value Theorem [4].
illustrates the usefulness of Theorem 7.1.1. (Hereafter, we will state the ConPro
objective in terms of minimization.) We first note that f is bounded below on
S. Now suppose we let $x_0 = \begin{bmatrix} 10 \\ 10 \end{bmatrix}$, which corresponds to the lower level set
$S_0 = \{x \mid f(x) \le f(10, 10)\}$ shown in Figure 7.3.
FIGURE 7.3: The lower level set $S_0 = \{x \mid f(x) \le f(10, 10)\}$.
Clearly we can draw a bounded, open set O in S that contains S0 and has the
property that if $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ belongs to O, then $\delta < x_1 \le M$ and $\delta < x_2 \le M$, where
δ and M are suitably chosen positive constants. The set contained within the
dashed lines in Figure 7.3 is one such example.
which are in fact both positive. The larger of these two eigenvalues is obtained
by using the positive root in (7.8). This fraction is bounded by a constant
independent of x in O, due to the fact that its numerator is bounded there and
its denominator is bounded away from zero. Specifically,
$$\begin{aligned}
|\lambda| &= \frac{175\left(9x_1x_2^2 + 8x_1^3 + \sqrt{81x_2^4x_1^2 + 64x_1^6}\right)}{9\, x_1^{5/2}\, x_2^{5/3}}\\
&\le \frac{175\left(9M^3 + 8M^3 + \sqrt{81M^6 + 64M^6}\right)}{9\, \delta^{5/2}\, \delta^{5/3}}\\
&= \frac{175\left(17 + \sqrt{145}\right)M^3}{9\, \delta^{25/6}}.
\end{aligned}$$
Thus, ∇f is Lipschitz continuous on O, and we conclude that the hypothesis
of Theorem 7.1.1 is satisfied. The first eight Steepest Descent Method iterates,
starting from $x_0 = \begin{bmatrix} 10 \\ 10 \end{bmatrix}$, are given in Table 7.2. In the fourth column, $\|x_k - x^\star\|$
measures the distance from $x_k$ to the true minimum of f, which we computed
in Section 6.4 and is given by
$$x^\star = \left( \frac{784}{9}, \frac{2744}{27} \right) \approx (87.11, 101.63).$$
TABLE 7.2: Results of the Steepest Descent Method applied to the ConPro
objective function, f , from (7.7)
k    $x_k^t$              $\|\nabla f(x_k)\|$    $\|x_k - x^\star\|$
0 [10, 10] 173.245 119.759
1 [91.680, 85.907] 40.277 16.373
2 [82.493, 95.792] 3.6065 7.444
3 [87.236, 100.200] 2.788 1.435
4 [86.558, 100.930] .420 .892
5 [87.125, 101.455] .334 .175
6 [87.043, 101.543] .0152 .110
7 [87.113, 101.608] .0411 .022
8 [87.103, 101.619] .006 .014
Inequality (7.9) is sometimes rephrased by stating that the sequence of objective
values $\{f(x_k)\}$ converges to $f(x^\star)$ at a linear rate. Use of this term stems
from the fact that the ratio $\frac{f(x_{k+1}) - f(x^\star)}{f(x_k) - f(x^\star)}$ is bounded.
The proof of Theorem 7.1.2 can be found in more advanced texts [4]. Instead
of proving it, we examine its validity in the context of the positive definite
quadratic function from (7.3). Thus, $f(x) = \frac{1}{2}x^t A x + b^t x + c$, where
$A = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$, $b = \begin{bmatrix} 1 \\ -4 \end{bmatrix}$, and c = 6.
Assume that $x_0 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, in which case $f(x_0) = 12.5$. Recall that the global
minimum is given by $x^\star = \begin{bmatrix} -\frac{7}{5} \\ \frac{9}{5} \end{bmatrix}$ and that $f(x^\star) = 1.7$.
5
√
5− 5
The Hessian, H f (x) is simply A, whose eigenvalues are given by λ1 =
√ 2
5+ 5
and λ1 = . Thus
2
λ2 − λ1 2 1
= .
λ2 + λ1 5
Using the first four Steepest Descent Method iterates from Table 7.1, we obtain
the values shown in Table 7.3. In each row, the ratio of the second entry to the
third is always less than or equal to $\frac{1}{5}$, thereby demonstrating the validity of
inequality (7.9) for these first few iterates.
Table 7.4 focuses on the iterates themselves and not their corresponding
objective values. It illustrates the validity of inequality (7.11).
Perhaps the most important conclusion we can draw from Theorem 7.1.2
is that eigenvalues play a crucial role in determining convergence rate. In
particular, the greater the difference between the largest and smallest eigenvalues
of $H_f(x^\star)$, the closer $\frac{\lambda_n - \lambda_1}{\lambda_n + \lambda_1}$ becomes to one. This in turn slows
down the rates of convergence. Exercises 1 and 2 provide further examples
illustrating this phenomenon.
A major drawback of the Steepest Descent Method is its slow rate of convergence
near the minimum. The simple positive definite quadratic function
$f(x_1, x_2) = x_1^2 + x_1x_2 + \frac{3}{2}x_2^2 + x_1 - 4x_2 + 6$ from (7.3) illustrates this phenomenon,
both in Table 7.1, where the relative change in $\|\nabla f(x_k)\|$ is largest for the first
few iterations, but also in Figure 7.4.
contours of f , the first three Steepest Descent Method iterates, and scaled
versions of the corresponding optimal descent directions, −∇ f (xk ), where
k = 0, 1, 2, 3. The descent direction vectors demonstrate a drawback with the
Steepest Descent Method known as “zig-zagging” or “hemstitching.” This
phenomenon, which frequently results for functions having mildly “elliptic”
or “elongated” valleys, slows the rate of convergence.
[FIGURE 7.4: Contours f(x) = 12.5, 8, 4, and 2, together with the iterates x0, x1, x2, x3 and their scaled optimal descent directions, illustrating the “zig-zag” pattern.]
$$f(x_0) - f(x_0 + d) \approx -\nabla f(x_0)^t d - \frac{1}{2} d^t H_f(x_0) d. \qquad (7.12)$$
2
If we intend to express the first iterate, x1 , in the form x1 = x0 + d, then the
decrease in f is maximized by differentiating the right-hand side of (7.12)
with respect to the vector d. Doing so, setting the result equal to 0 and solving
for d, we obtain
d = −H f (x0 )−1 ∇ f (x0 ), (7.13)
We call the vector d from (7.13) the Newton direction of f at x0 . It is the second-
order analog of the optimal descent direction associated with the Steepest
Descent Method. Like the optimal descent direction, it produces a curve,
“embedded” in the graph of f. Figure 7.5 depicts both the optimal descent
and Newton directions starting from $x_0 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ for the quadratic function
$f(x_1, x_2) = x_1^2 + x_1x_2 + \frac{3}{2}x_2^2 + x_1 - 4x_2 + 6$.
Thus, starting from x0 and moving in the Newton direction leads to
x1 = x0 + d = x0 − H f (x0 )−1 ∇ f (x0 ). Of course, assuming invertibility of the Hes-
sian at each stage, this process can be repeated, thereby leading to Newton’s
Method. We now summarize the steps of the algorithm.
Newton’s Method
To obtain an approximate solution of NLP (7.1) under the assumption f is
twice-differentiable on S, choose an initial value x0 and a tolerance ε > 0, and
repeatedly set
$$x_{k+1} = x_k - H_f(x_k)^{-1} \nabla f(x_k), \qquad k = 0, 1, 2, \ldots,$$
until $\|\nabla f(x_k)\| < \epsilon$, assuming $H_f(x_k)$ is invertible at each stage.
For example, recall the positive definite quadratic function from (7.3). If
[FIGURE 7.5: The graph of f, together with the Newton and optimal descent directions at $x_0 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$.]
$A = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$, $b = \begin{bmatrix} 1 \\ -4 \end{bmatrix}$, and c = 6, this function is given by
$$f(x) = \frac{1}{2}x^t A x + b^t x + c = x_1^2 + x_1x_2 + \frac{3}{2}x_2^2 + x_1 - 4x_2 + 6.$$
$$f(x) = \frac{1}{2}x^t A x - b^t x + c,$$
where A is an n-by-n positive definite matrix, b is in $\mathbb{R}^n$, and c is a real
scalar.
1. Compute ∇ f (x), and use your result to determine the global
minimum x⋆ in terms of A and b.
2. Given x0 in Rn , determine the Newton direction in terms of x0 ,
A, and b.
3. Show that the first iterate, x1 , obtained using Newton’s method
is the global minimum obtained in (1).
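This behavior can be verified concretely for the quadratic from (7.3); the following sketch is not part of the original worksheet:

> with(VectorCalculus):with(LinearAlgebra):
> f:=(x1,x2)->x1^2+x1*x2+3/2*x2^2+x1-4*x2+6:
> Delf:=unapply(Gradient(f(x1,x2),[x1,x2]),[x1,x2]):
> Hf:=unapply(Hessian(f(x1,x2),[x1,x2]),[x1,x2]):
> x0:=<1,-1>:
> xNew:=x0-MatrixInverse(Hf(1,-1)).Delf(1,-1);
# A single Newton step from (1,-1); for this positive definite
# quadratic it lands exactly on the global minimum (-7/5, 9/5).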
NewtonsMethod(function,initial,N,tolerance),
> restart:with(VectorCalculus):with(LinearAlgebra):
> NewtonsMethod:=proc(function,initial,N,tolerance)
local x,f,Delf,Hf,d,j,epsilon: global iterates:
# Create local and global variables.
x:=array(0..N):
[87.10975551, 101.6262013]
> iterates;
# Print iterates.
While Newton’s Method has its drawbacks, the advantage of its use stems
from the fact that an initial value sufficiently close to a known local minimum
yields a sequence of iterates converging to the minimum more rapidly to
the minimum than a sequence obtained using the Steepest Descent Method.
Theorem 7.2.1 summarizes this result.
Because the Hessian matrix, $H_f(x)$, is a function of x, so too will be its matrix
norm. We say that $H_f(x)$ is Lipschitz continuous in the matrix norm if there
exists a constant, M, satisfying
$$\|H_f(x_1) - H_f(x_2)\| \le M \|x_1 - x_2\|. \qquad (7.15)$$
Phrased another way, the norm of the matrix difference, $H_f(x_1) - H_f(x_2)$, is
bounded by a constant times the distance from x2 to x1. A consequence of
(7.15), whose proof we omit, is that if $H_f(x_1)$ is positive definite and if $H_f(x)$ is
Lipschitz continuous in some neighborhood of x1, then $H_f(x_2)$ is also positive
definite for all x2 in a sufficiently small neighborhood of x1.
Proof. Choose δ in (0, 1) small enough to satisfy the following three conditions:
We now focus on the difference $\nabla f(x^\star) - \nabla f(x_0)$ in (7.18) and introduce a
vector-valued function $\varphi : [0,1] \to \mathbb{R}^n$ given by
$$\varphi(t) = \nabla f\left(tx^\star + (1-t)x_0\right).$$
7.2. Newton’s Method 245
Note that $\varphi(0) = \nabla f(x_0)$ and $\varphi(1) = \nabla f(x^\star)$. An application of the chain rule
yields
$$\varphi'(t) = H_f\left(tx^\star + (1-t)x_0\right)\left(x^\star - x_0\right),$$
and φ is continuously differentiable on [0, 1] since f is continuously twice-differentiable
on S. By the Fundamental Theorem of Calculus,
$$\nabla f(x^\star) - \nabla f(x_0) = \int_{t=0}^{t=1} H_f\left(tx^\star + (1-t)x_0\right)\left(x^\star - x_0\right)\, dt. \qquad (7.19)$$
We now apply the result from (7.20) to bound the norm of the vector difference
$x_1 - x^\star$. In this process we utilize the fact that the eigenvalues of $H_f(x_0)^{-1}$
are the reciprocals of the eigenvalues of $H_f(x_0)$. Hence, $\|H_f(x_0)^{-1}\| \le \frac{1}{m}$. Using
this bound, along with properties of the definite integral and the Lipschitz
continuity of $H_f(x)$, we obtain the following sequence of inequalities:
$$\begin{aligned}
\|x_1 - x^\star\| &= \left\| H_f(x_0)^{-1} \int_{t=0}^{t=1} \left[ H_f\left(tx^\star + (1-t)x_0\right) - H_f(x_0) \right] \left(x^\star - x_0\right)\, dt \right\|\\
&\le \left\| H_f(x_0)^{-1} \right\| \int_{t=0}^{t=1} \left\| H_f\left(tx^\star + (1-t)x_0\right) - H_f(x_0) \right\| \, \left\| x^\star - x_0 \right\|\, dt\\
&\le \left\| H_f(x_0)^{-1} \right\| \int_{t=0}^{t=1} M \left\| tx^\star + (1-t)x_0 - x_0 \right\| \, \left\| x^\star - x_0 \right\|\, dt\\
&= \left\| H_f(x_0)^{-1} \right\| M \left\| x^\star - x_0 \right\|^2 \int_{t=0}^{t=1} t\, dt\\
&= \frac{M}{2m} \left\| x^\star - x_0 \right\|^2.
\end{aligned}$$
Hence,
$$\|x_1 - x^\star\| \le \frac{M}{2m} \|x_0 - x^\star\|^2.$$
Since $\|x_0 - x^\star\| < \sqrt{\frac{2m}{M}}\, \delta$, we see that $\|x_1 - x^\star\| \le \delta^2 < \delta$, so the preceding
argument can be applied again, this time starting at x1. Repeating the results,
we eventually obtain inequality (7.16) with $C = \frac{M}{2m}$, which completes the
proof.
$$\frac{\|x_{k+1} - x^\star\|}{\|x_k - x^\star\|^2} \le C,$$
illustrates this principle. Table 7.5 lists the first five iterations of both methods
using an initial value of $x_0 = \begin{bmatrix} 10 \\ 10 \end{bmatrix}$.
Both methods demonstrate convergence of iterates to the global minimum,
but they do so in a different manner. For small k, the error $\|x_k - x^\star\|$ using
the Steepest Descent Method is much less than that obtained using Newton's
Method. Starting at k = 4, this trend reverses. In fact, if $\epsilon = .01$, the Steepest
Descent Method requires nine iterations before $\|\nabla f(x_k)\| < \epsilon$, whereas Newton's
TABLE 7.5: Results of the Steepest Descent and Newton’s Methods applied
to the ConPro objective function
     Steepest Descent Method               Newton's Method
k    $x_k^t$             $\|x_k - x^\star\|$    $x_k^t$              $\|x_k - x^\star\|$
0 [10, 10] 119.76 [10, 10] 119.76
1 [91.680, 85.907] 16.373 [28.063, 29.111] 93.518
2 [82.493, 95.792] 7.444 [56.819, 61.720] 50.103
3 [87.236, 100.200] 1.435 [80.0760, 91.040] 12.714
4 [86.558, 100.930] 0.892 [86.7370, 100.922] 0.801
5 [87.125, 101.455] 0.175 [87.110, 101.626] 0.004
Method requires only five. That one of these methods is more effective when
iterates are far from the global minimum, and the other more effective when
iterates are near it, prompts the question of whether it is possible to construct
a single procedure that blends the Steepest Descent and Newton's Methods so
as to be effective in either case. In the next section, we see that this is the case
and combine the best of both techniques, creating a new procedure known
as the Levenberg-Marquardt Algorithm, which is particularly well-suited for
approximating solutions of unconstrained NLPs, especially those arising in
the area of nonlinear regression.
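To make the comparison concrete, a single Newton iteration for the ConPro objective can be computed directly in Maple. The following is a minimal sketch (the commands mirror worksheets appearing elsewhere in this chapter); its result should agree with the k = 1 Newton entry in Table 7.5.

> with(VectorCalculus):with(LinearAlgebra):
> f:=(x1,x2)->-1400*x1^(1/2)*x2^(1/3)+350*x1+200*x2:
# Enter the ConPro objective function.
> Delf:=unapply(Gradient(f(x1,x2),[x1,x2]),[x1,x2]):
# Gradient of f.
> Hf:=unapply(Hessian(f(x1,x2),[x1,x2]),[x1,x2]):
# Hessian of f.
> X0:=<10,10>:
# Initial value from Table 7.5.
> X1:=evalf(X0-MatrixInverse(Hf(X0[1],X0[2])).Delf(X0[1],X0[2]));
# One Newton iteration: X1 = X0 - Hf(X0)^(-1).Delf(X0); the
# result should be near [28.063, 29.111].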
3. Suppose that f : R² → R is defined by f(x1, x2) = ln(1 + x1²) + (1/2)x2² and that x0 = [2, 2]^t. Recall from Exercise 5 of Section 7.1 that the Steepest Descent Method, starting with x0, produces a sequence of iterates converging to the global minimum of f at the origin. Show that Newton's Method fails to produce such a convergent sequence. Then experiment with other initial values, x0, and try to account for the varying outcomes by using the Hessian of f.
minimize f(x), x ∈ S.

Assume x0 belongs to S. The Levenberg Method starts with the initial value x0 and, at each iteration, uses a trial iterate formula given by

w = xk − [Hf(xk) + λIn]⁻¹ ∇f(xk),  k = 0, 1, 2, . . . .    (7.21)
3. If f (w) < f (xk ), then set xk+1 = w, decrease λ, and return to (2) with xk
replaced by xk+1 .
4. If f (w) ≥ f (xk ), then increase λ and return to (2).
5. Repeat the process until a desired level of tolerance has been achieved,
e.g., ‖∇f(xk)‖ is smaller than a specified value ε.
The algorithm does not stipulate the extent to which we should decrease or
increase λ in (3) and (4), respectively. We return to this issue shortly. For now,
mere experimentation with various positive values will suffice.
Waypoint 7.3.1. Perform the second iteration of the Levenberg Method for the unconstrained ConPro Manufacturing Company NLP, using the previously obtained value of x1, together with λ = .005.
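For readers attempting the waypoint, the trial iterate (7.21) can be computed in a few Maple commands. The sketch below uses a placeholder value for the current iterate Xk, not the waypoint's actual x1, which must be substituted in:

> with(VectorCalculus):with(LinearAlgebra):
> f:=(x1,x2)->-1400*x1^(1/2)*x2^(1/3)+350*x1+200*x2:
> Delf:=unapply(Gradient(f(x1,x2),[x1,x2]),[x1,x2]):
> Hf:=unapply(Hessian(f(x1,x2),[x1,x2]),[x1,x2]):
> Xk:=<28.0,29.0>: lambda:=.005:
# Placeholder current iterate and damping value; replace Xk
# with the x1 obtained in the first iteration.
> w:=evalf(Xk-MatrixInverse(Hf(Xk[1],Xk[2])
+lambda*IdentityMatrix(2)).Delf(Xk[1],Xk[2]));
# Trial iterate from (7.21). Compare f(w[1],w[2]) with
# f(Xk[1],Xk[2]) to decide whether to accept w and decrease
# lambda, or reject it and increase lambda.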
Note that Diag(Hf(xk)) utilizes only those entries along the diagonal of Hf(xk).
3. If f (w) < f (xk ), where w is determined in (2), then set xk+1 = w and
λk+1 = λk ρ−1 . Return to (2), replace k with k + 1, and compute a new trial
iterate.
4. If f (w) ≥ f (xk ) in (3), determine a new trial iterate, w, using (7.25) with
λ = λk .
5. If f (w) < f (xk ), where w is determined in (4), then set xk+1 = w and
λk+1 = λk . Return to (2), replace k with k + 1, and compute a new trial
iterate.
The procedure is invoked as LevenbergMarquardt(function, initial, N, tolerance, damp, scale),
> restart:with(VectorCalculus):with(LinearAlgebra):
> LevenbergMarquardt:=
proc(function,initial,N,tolerance,damp,scale)
local x,f,d,Delf,Hf, DiagH,lambda,rho,w,j,m,epsilon:
global iterates:
> f:=(x1,x2)->-1400*x1^(1/2)*x2^(1/3)+350*x1+200*x2;
# Enter function.

f := (x1, x2) → −1400·x1^(1/2)·x2^(1/3) + 350·x1 + 200·x2
> iterates;
# Print iterates.
[[10., 10.], [28.00148014, 29.04621145], [56.73574951, 61.62388852],
[80.03451430, 90.98472144], [86.73264262, 100.9143105],
[87.10972680, 101.6261320]]
Unlike the case of multiple linear regression, where (7.26) results in an uncon-
strained NLP whose critical points are determined analytically, minimization
of (7.26) in the nonlinear setting frequently requires a numeric algorithm.
Such was the case for the ConPro Manufacturing Company when it constructed
its production function, which was based upon the data shown in Table 7.7.
Recall x1 and x2 denote the number of units of material and labor, respectively,
used to produce P(x1 , x2 ) units of pipe.
To determine the Cobb-Douglas function that best fits the data in Table 7.7,
we minimize the sum of squared-errors
f(α, β) = Σ_{k=1}^{16} (x1^α·x2^β − P(x1, x2))²,  α, β > 0,    (7.28)

where the sum is over the sixteen ordered triples (x1, x2, P(x1, x2)) in Table 7.7. If we apply the Levenberg-Marquardt Algorithm, using an initial value x0 = [α0, β0]^t = [.5, .5]^t, an initial damping factor λ0 = .001, a scaling factor of ρ = 10, and a tolerance of ε = .001, we obtain after five iterates, α ≈ .510 and β ≈ .322.
These values are quite close to those used by ConPro in its actual production
formula.
Figure 7.7 illustrates a plot of the data triples (x1, x2, P) from Table 7.7 along with the "best-fit" production function P(x1, x2) = x1^{.510}·x2^{.320}.
[FIGURE 7.7: a surface plot over the (x1, x2)-plane showing the data triples from Table 7.7 together with the best-fit production function.]
One specific tool developed during the last decade is known as Lipschitz
Global Optimization, or LGO [38]. Implementation of LGO for various software
programs, e.g., Maple and Excel, is now available as an add-on package. In
Maple, for example, the Global Optimization Toolbox package is loaded
using the command with(GlobalOptimization), and the global minimum
is computed using the command GlobalSolve, whose arguments essentially
take the same form as those used for the LPSolve and NLPSolve commands.
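As a simple illustration of the syntax, consider the following sketch (an assumption here is that the toolbox is installed; the objective is an arbitrary multimodal example):

> with(GlobalOptimization):
> GlobalSolve(x^2-cos(4*x),x=-2..2);
# Searches the interval globally; NLPSolve, by contrast, may
# return only a local minimum for multimodal objectives.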
(a) Find the logistic function P of the form given in Equation 7.30
that best fits the data provided in Table 7.8 in the sense of least
squares. That is, determine α, β, and k using Table 7.8 and nonlinear
regression. (Hint: To determine sensible initial values of α, β, and
k, you may wish to graph the data points first, using Maple’s
pointplot command. Then experiment with various values of α,
β, and k. For each choice, use Maple to construct the graph of P from (7.30). Superimpose this graph on that of the data points and see
if your values of α, β, and k make reasonably good initial choices.)
minimize f(x1, x2) = −1400x1^(1/2)x2^(1/3) + 350x1 + 200x2    (8.2)
subject to
g1 (x) = 350x1 + 200x2 − 40, 000 ≤ 0
g2 (x) = x1 − 3x2 ≤ 0
g3 (x) = −x1 ≤ 0
g4 (x) = −x2 ≤ 0.
> with(VectorCalculus):
> g1:=(x1,x2)->350*x1+200*x2-40000:
> g2:=(x1,x2)->x1-3*x2:
> g3:=(x1,x2)->-x1:
> g4:=(x1,x2)->-x2:
> g:=(x1,x2)-><g1(x1,x2),g2(x1,x2),g3(x1,x2),g4(x1,x2)>:
> g(x1,x2);
L(x, λ, µ) = f(x) + Σ_{i=1}^{m} λi gi(x) + Σ_{j=1}^{p} µj hj(x)    (8.6)
Note that in each product, λt g(x) and µt h(x), there exists a natural correspon-
dence between each constraint of the NLP (8.1) and a component of λ or µ.
In particular, a constraint of inequality-type (respectively, of equality-type)
corresponds to a component of λ (respectively, to a component of µ).
∇L(x0, λ0, µ0) = ∇f(x0) + Σ_{i=1}^{m} λ0,i ∇gi(x0) + Σ_{j=1}^{p} µ0,j ∇hj(x0)    (8.7)
λ0 ≥ 0    (8.9)

∇L(x0, λ0, µ0) = ∇f(x0) + Jg(x0)^t λ0 + Jh(x0)^t µ0 = 0    (8.10)

λ0,i gi(x0) = 0 for 1 ≤ i ≤ m.    (8.11)
0 = ∇L(x0, λ0, µ0)    (8.12)
  = ∇f(x0) + Jg(x0)^t λ0 + Jh(x0)^t µ0
  = ∇f(x0) + Σ_{i=1}^{m} λ0,i ∇gi(x0) + Σ_{j=1}^{p} µ0,j ∇hj(x0)
  = ∇f(x0) + Σ_{i∈I} λ0,i ∇gi(x0) + Σ_{j=1}^{p} µ0,j ∇hj(x0).
" # " #
480/7 68.57
x0 = ≈ , (8.13)
80 80
with corresponding objective value, f (x0 ) = −9, 953.15. Thus, using approx-
imately 68.57 units of material and 80 units of labor, ConPro Manufacturing
Company has a maximum profit of $9,953.15, given the constraints dictated
in (8.2). Observe that this solution differs from that of the unconstrained ver-
sion of
" the problem,
# " which # we solved in Section 6.4 and which had solution
784/9 87.11
x0 = ≈ with corresponding profit approximately equal to
2744/27 101.63
$10,162.96.
A simple calculation shows that at the optimal solution, only the first constraint, g1(x) = 350x1 + 200x2 − 40,000 ≤ 0, is binding. Thus, ∇g1(x0) = [350, 200]^t trivially satisfies the regularity condition, so by (8.11), λ0,2 = λ0,3 = λ0,4 = 0.
"#
480/7
Evaluating ∇L at x0 = and λ0,2 = λ0,3 = λ0,4 = 0 and setting each
80
component of the result to zero, we obtain
r
7 √3
λ0,1 = 80 − 1 ≈ .04. (8.15)
120
0 = ∇ f (x0 ) + Jg (x0 )t λ0
= ∇ f (x0 ) + λ0,1 ∇g1 (x0 )
≈ ∇ f (x0 ) + .04∇g1 (x0 ).
Thus, for the ConPro Manufacturing Company NLP, at the optimal solution, x0
the gradients of f and the first constraint function, g1 , are scalar multiples of
one another, where the scaling factor is approximately −.04.
FIGURE 8.1: Feasible region of the ConPro Manufacturing Company NLP illus-
trating objective and constraint gradients at the solution, x0 .
Since the units of g1, the only binding constraint at the solution, are dollars (of available funds for material and labor), the units of λ0,1 are dollars of profit per dollar of available material and labor. Put another way, the multiplier λ0,1 is the instantaneous rate of change of the objective, f, with respect to the constraint function, g1. This means that if g1 changes by a small amount, small enough so that it remains a binding constraint in the new solution, we should notice the value of f at the optimal solution change by an amount approximately equal to the change in g1, multiplied by λ0,1.
For example, suppose the funds available to ConPro for material and labor increase by $100 from $40,000 to $40,100. Then the new first constraint becomes
350x1 + 200x2 ≤ 40,100, or, equivalently, g1 in (8.2) changes to
g1 (x) = 350x1 + 200x2 − 40,100. (8.16)
In terms of Figure 8.1, such a change corresponds to moving slightly up and to the right the segment on the boundary of the feasible region that passes through x0 = [480/7, 80]^t. The multiplier, λ0,1 ≈ .04, indicates that, under such a change, we should expect to see the objective value in the solution of the new NLP change by Δf ≈ .04(−100) = −4. This is a fairly good approximation, for if we replace the original first constraint by that in (8.16) and solve the resulting NLP (with Maple), we obtain a new optimal objective value of −9,957.21. This value, less that of f(x0) = −9,953.15, is approximately −4.06.
Thus, with $100 more in funds available for material and labor, ConPro's profit increases by a mere $4.06.
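This sensitivity check is easy to reproduce with Maple's NLPSolve command; in the following sketch, the variable bounds 0..200 are an assumption chosen merely to enclose the solution:

> with(Optimization):
> f:=(x1,x2)->-1400*x1^(1/2)*x2^(1/3)+350*x1+200*x2:
> NLPSolve(f(x1,x2),{350*x1+200*x2<=40000,x1-3*x2<=0},
x1=0..200,x2=0..200);
# Original budget: optimal value near -9953.15.
> NLPSolve(f(x1,x2),{350*x1+200*x2<=40100,x1-3*x2<=0},
x1=0..200,x2=0..200);
# Increased budget: optimal value near -9957.21, a change of
# approximately -4.06, as predicted by the multiplier.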
(a)
minimize f(x1, x2) = x1² − 6x1 + x2² − 4x2
subject to
x1 + x2 ≤ 3
x1 ≤ 1
x2 ≥ 0
" #
1
Solution: x0 =
2
(b)
minimize f(x1, x2) = 4x1² − 12x1 − x2² − 6x2
subject to
x1 + x2² ≤ 2
x1 ≤ 1
2x1 + x2 = 1
" #
−1/4
Solution: x0 =
3/2
(a) Verify that the regularity condition does not hold at x0 for this NLP.
(b) Show that there exists no vector λ0 in R3 for which λ0 ≥ 0,
∇L(x0 , λ0 ) = 0, and the complementary slackness conditions hold.
This result demonstrates that the regularity condition in the hy-
pothesis of Theorem 8.1.1 is important for guaranteeing the exis-
tence of the multiplier vectors.
(a) Verify that the regularity condition does not hold at x0 for this NLP.
(b) Construct at least two different multiplier vectors, λ0 , in R4 for
which λ0 ≥ 0, ∇L(x0 , λ0 ) = 0, and the complementary slackness
conditions hold. This result demonstrates that the regularity con-
dition in the hypothesis of Theorem 8.1.1 is important for guaran-
teeing the uniqueness of the multiplier vectors.
4. Suppose in (1a) that the right-hand side of the second constraint in-
creases from 1 to 1.1. By approximately how much will the optimal
objective value for the new NLP differ from that of the original prob-
lem? Answer this question without actually solving the new NLP.
maximize z = ct · x (8.17)
subject to
Ax ≤ b
x ≥ 0.
per week Pam devotes to weight lifting, distance running, and speed
workouts, and the constraints give requirements as to how Pam al-
lots her training time. The objective function represents the portion of
Pam’s total pentathlon score stemming from her performances in the
800 meter run and shot put. It is based upon International Amateur
Athletic Federation scoring formulas, together with the fact that Pam
currently completes the 800 meter run in 3 minutes and throws the shot
put 6.7 meters. The NLP has a total of ten constraints, including sign
restrictions.
(a) The solution to this NLP is given by x1 = 2.5, x2 = 3, and x3 =
4.5 hours. Use this information to determine the values of the
ten corresponding Lagrange multipliers. (Hint: First determine
which constraints are nonbinding at the solution. The result, by
complementary slackness, quickly indicates a subset of multipliers
equalling zero.)
(b) If Pam has 30 more minutes of available training time each week, by approximately how much will her score increase? Answer this question using your result from (a).
Waypoint 8.2.1. Show that x0 = [0, 0]^t is a regular KKT point, yet not a solution of the NLP
We now combine the results of (8.20) and (8.21), together with (8.10), which
states
∇ f (x0 ) + Jg (x0 )t λ0 + Jh (x0 )t µ0 = 0.
Theorem 8.2.1 provides the tools for solving a wide range of problems, in-
cluding the ConPro Manufacturing Company NLP:
minimize f(x1, x2) = −1400x1^(1/2)x2^(1/3) + 350x1 + 200x2    (8.22)
subject to
g1 (x) = 350x1 + 200x2 − 40000 ≤ 0
g2 (x) = x1 − 3x2 ≤ 0
g3 (x) = −x1 ≤ 0
g4 (x) = −x2 ≤ 0.
In this case, S = R²₊ = {(x1, x2) | x1, x2 > 0}. Recall that in Section 6.4 we established f is convex on S by using Theorem 6.4.3. Each constraint function, gi(x), from (8.22) is an affine transformation and, hence, is convex on S. (See Section 6.3.1.) Thus (8.22) is a convex NLP, so finding its solution reduces to finding its regular KKT points.
∇L(x0) = ∇f(x0) + Σ_{i=1}^{4} λi ∇gi(x0) = 0.    (8.23)
λ1 g1(x0) = 0
λ2 g2(x0) = 0
λ3 g3(x0) = 0
λ4 g4(x0) = 0.    (8.24)
> with(VectorCalculus):with(LinearAlgebra):
> f:=(x1,x2)->-1400*x1^(1/2)*x2^(1/3)+350*x1+200*x2:
# Enter objective function.
> g1:=(x1,x2)->350*x1+200*x2-40000:
> g2:=(x1,x2)->x1-3*x2:
> g3:=(x1,x2)->-x1:
> g4:=(x1,x2)->-x2:
> g:=(x1,x2)-><g1(x1,x2),g2(x1,x2),g3(x1,x2),g4(x1,x2)>:
# Enter vector-valued constraint function.
> g(x1,x2);
> lambda:=<lambda1,lambda2,lambda3,lambda4>:
# Create vector of multipliers.
> L:=unapply(f(x1,x2)+Transpose(lambda).g(x1,x2),
[x1,x2,lambda1,lambda2,lambda3,lambda4]):
# Create Lagrangian Function.
> LG:=Gradient(L(x1,x2,lambda1,lambda2,lambda3,lambda4),
[x1,x2,lambda1,lambda2,lambda3,lambda4]):
# Create Lagrangian Gradient.
> CS:=seq(g(x1,x2)[i]*lambda[i]=0,i=1..4):
# Create complementary slackness equations.
> solve({CS,LG[1]=0,LG[2]=0},{x1,x2,lambda1,lambda2,
lambda3,lambda4});
# Solve system of equations to determine KKT point
# and multipliers.
We have omitted the output produced by the last command, due to the fact Maple returns several solutions. However, only one of these satisfies the feasibility conditions along with the sign restriction on the multipliers. The resulting values are x1 = 480/7 ≈ 68.57, x2 = 80, λ1 = √(7/120)·80^(1/3) − 1 ≈ .04, and λ2 = λ3 = λ4 = 0. Thus the sole KKT point is x0 = [480/7, 80]^t ≈ [68.57, 80]^t. Only the first constraint, g1(x) = 350x1 + 200x2 − 40,000 ≤ 0, is binding at x0. Thus, the regularity condition is trivially satisfied, so that x0 is a regular KKT point and, therefore, is the solution of the ConPro Manufacturing Company NLP (8.22).
In this section we have developed a tool for solving a wide range of NLPs. In
the next section, we develop means of solving certain non-convex NLPs.
(b)
minimize f(x1, x2) = (x1 − 6)² + (x2 − 4)²
subject to
−x1 + x2² ≤ 0
x1 ≤ 4
x1 , x2 ≥ 0
(c)
minimize f(x1, x2) = x1² − ln(x2 + 1)
subject to
2x1 + x2 ≤ 3
x1 , x2 ≥ 0
(d)
(e)
minimize f(x1, x2) = 1/(x1 x2)
subject to
x1 + 2x2 = 3
x1 , x2 > 0
minimize f (x)
subject to
‖x‖ ≤ 1
Show that the optimal solution occurs at the origin, unless A possesses a
positive eigenvalue. Show that in this latter case, the solution occurs at
the eigenvector, x0 , which is normalized to have length one and which
corresponds to the largest eigenvalue, λ. Verify that the corresponding
objective value is given by f (x0 ) = −λ.
L̃(x, λ, µ) = f(x) + λ^t g̃(x) + µ^t h(x)    (8.26)
           = f(x) + Σ_{i=1}^{k} λi gi(x) + Σ_{j=1}^{p} µj hj(x),
Waypoint 8.3.1. Calculate the restricted Lagrangian for NLP (8.25) at each of its two KKT points.
To understand how the restricted Lagrangian compares to the original, note that if x is feasible, then λi gi(x) ≤ 0 for 1 ≤ i ≤ m, so that

L̃(x, λ, µ) = f(x) + λ^t g̃(x) + µ^t h(x)
           = f(x) + Σ_{i=1}^{k} λi gi(x) + Σ_{j=1}^{p} µj hj(x)
           ≥ f(x) + Σ_{i=1}^{m} λi gi(x) + Σ_{j=1}^{p} µj hj(x)
           = L(x, λ, µ).
∇L̃(x0, λ0, µ0) = ∇f(x0) + Jg̃(x0)^t λ0 + Jh(x0)^t µ0    (8.27)
              = ∇f(x0) + Jg(x0)^t λ0 + Jh(x0)^t µ0
              = ∇L(x0, λ0, µ0)
              = 0.
∂L̃/∂xk |_(x0,λ0,µ0) = 0 for 1 ≤ k ≤ n.    (8.28)

∂L̃/∂λi |_(x0,λ0,µ0) = gi(x0) = 0 for 1 ≤ i ≤ k.

∂L̃/∂λi |_(x0,λ0,µ0) = 0 for 1 ≤ i ≤ m.    (8.29)

Finally,

∂L̃/∂µj |_(x0,λ0,µ0) = hj(x0) = 0 for 1 ≤ j ≤ p.    (8.30)
Definition 8.3.2. Suppose the triple (x0, λ0, µ0) is a critical point of the restricted Lagrangian, L̃, where x0 is a KKT point of (8.1) with multiplier vectors, λ0 and µ0, where λ0 ≥ 0. Then we say that L̃ satisfies the saddle point criteria at (x0, λ0, µ0) if and only if

L̃(x0, λ, µ) ≤ L̃(x0, λ0, µ0) ≤ L̃(x, λ0, µ0).    (8.31)
We leave it as an exercise to show that for a convex NLP, any triple, (x0, λ0, µ0), where x0 is a KKT point with corresponding multiplier vectors, λ0 and µ0, is a saddle point of L̃. Thus a regular KKT point for a convex NLP is a solution of the NLP and also corresponds to a saddle point of the restricted Lagrangian.

To test whether a KKT point and its multiplier vectors satisfy the saddle point criteria, we first define φ(x) = L̃(x, λ0, µ0) and establish that φ(x0) ≤ φ(x) for all feasible x in S. For when this holds, feasibility requirements, along with the condition λ ≥ 0, yield

L̃(x, λ0, µ0) ≥ L̃(x0, λ0, µ0)
            = f(x0) + λ0^t g̃(x0) + µ0^t h(x0)
            ≥ f(x0) + λ^t g̃(x0) + µ^t h(x0)
            = L̃(x0, λ, µ),
φ(x) = L̃(x, λ0)
     = f(x) + (1/2)g2(x) + 2g3(x)
     = (3/2)x1² − 6x1 + (1/2)x2² − 2x2 − 4.

An elementary exercise establishes that ∇φ vanishes precisely at x0 = [2, 2]^t and that the Hessian of φ is constant and positive definite. Thus, φ has a global minimum at x0, so that x0 is a solution of NLP (8.25).
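Both claims are quickly verified in Maple:

> phi:=(x1,x2)->3/2*x1^2-6*x1+1/2*x2^2-2*x2-4:
> solve({diff(phi(x1,x2),x1)=0,diff(phi(x1,x2),x2)=0});
# Returns {x1 = 2, x2 = 2}, the critical point of phi.
> with(VectorCalculus):
> Hessian(phi(x1,x2),[x1,x2]);
# A constant matrix with diagonal entries 3 and 1, which is
# positive definite.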
" #
1/2
At the second KKT point, x0 = , only the second constraint is binding.
−1/2
In this case, φ(x0 ) = x21 − x22 has a saddle point at x0 . Thus, we may not
1 1
conclude x0 is also a solution of NLP (8.25). In fact, f , − = −.5, whereas
2 2
f (2, 2) = −12.
(a)
(b)
minimize f (x1 , x2 ) = x1 x2
subject to
x1² + x2² ≤ 1
2x1 + 3x2 ≤ 3
x1 ≥ 0
(c)
2. Verify that the restricted Lagrangian for the ConPro Manufacturing Com-
pany NLP satisfies the saddle point criteria at its optimal solution.
minimize f(x)
subject to
g1(x) ≤ 0
g2(x) ≤ 0
⋮
gm(x) ≤ 0,
(d) Finally, apply Equation (8.10) to the right-hand side of the previous
inequality.
minimize f(x) = (1/2)x^t Qx + p^t x    (8.34)
subject to
Ax = b
Cx ≤ d,
As a first step towards solving (8.34), we consider the special case when it has only equality-type constraints. In other words, we seek to solve

minimize f(x) = (1/2)x^t Qx + p^t x    (8.35)
subject to
Ax = b.
0 = ∇L(x0)    (8.37)
  = ∇f(x0) + A^t µ0
  = Qx0 + p + A^t µ0

for some µ0 in R^p. Equivalently, x0 and µ0 satisfy the matrix equation

[ 0    A ] [ µ0 ]   [  b ]
[ A^t  Q ] [ x0 ] = [ −p ].    (8.38)
In the ideal situation, the matrix Q is positive definite. When this occurs, (8.35) is a convex NLP, so that any solution of (8.38) yields a regular KKT point, x0, which is a solution of (8.35) by Theorem 8.2.1.
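In Maple, solving (8.38) amounts to a single LinearSolve command. Here is a minimal sketch using hypothetical data for which Q is positive definite:

> with(LinearAlgebra):
> Q:=Matrix(2,2,[2,0,0,4]): p:=<1,-1>:
> A:=Matrix(1,2,[1,1]): b:=<1>:
# Hypothetical data; Q is positive definite.
> B:=<<ZeroMatrix(1,1)|A>,<Transpose(A)|Q>>:
# Coefficient matrix of (8.38).
> sol:=LinearSolve(B,<b,-p>);
# The first entry is the multiplier mu0; the remaining
# entries form the solution x0 of (8.35).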
When Q is not positive definite, the situation becomes somewhat more com-
plicated. In this case, equation (8.38) still remains valid. However, the coef-
ficient matrix need not be invertible and, even if it is, (8.35) is no longer a
convex NLP, so a solution of (8.38) need not yield a solution of the original
problem.
Fortunately, a tool exists for circumventing this problem, one that applies to many situations. To understand the rationale behind it, we first consider a simple example. Suppose, in (8.35), that Q = [[1, 2], [2, −1]], p = 0, A = [1 −1], and b = 1. (For the sake of consistent notation throughout the discussion, we express b using vector notation even though it is a scalar for this particular problem.) The feasible region for this NLP consists of the line x1 − x2 = 1 in R². However, the eigenvalues of Q are mixed in sign, so the NLP is not convex.
FIGURE 8.2: Quadratic form, f , together with the image under f of the line
x1 − x2 = 1.
" #
1 x
then the restriction of f (x) = xt Qx to the set of vectors x = 1 satisfying
2 x2
Ax = b is a quadratic expression in x1 , whose leading coefficient is given
q1 a22 − 2qa2 a1 + q2 a21
by . Thus, as a function of x1 alone, this restriction of f is
2a22
convex provided q21 a22 − 2qa2 a1 + q2 a21 > 0. This was the case in our preceding
example, where q1 = 1, q2 = −1, q = 2, a1 = 1, and a2 = −1.
" #
0 A
The matrix B = , which is identical to the coefficient matrix in (8.38),
At Q
is known as a bordered Hessian. It consists of the Hessian of f , Q in this case,
bordered above by A and to the left by At . In the absence of constraints, B is
simply Q. Thus, the bordered Hessian can be viewed as a generalization of
the normal Hessian to the constrained setting, one that incorporates useful
information regarding the constraints themselves. It is an extremely useful
tool for solving the quadratic programming problem (8.35). However, before
stating a general result that spells out how this is so, we introduce some new
terminology.
Definition 8.4.1. Suppose M is an n-by-n matrix. Then the leading principal
minor of order k, where 0 ≤ k ≤ n, is the determinant of the matrix formed by
deleting the last n − k rows and the last n − k columns of M.
Note that when k = n, the matrix in question is M itself, so the leading principal minor of order n is just det(M). With this terminology, we now state a classic result, known as the bordered Hessian test.
Theorem 8.4.1. Assume the quadratic programming problem (8.35) has a KKT point at x0 with corresponding multiplier vector, µ0. Let B = [[0, A], [A^t, Q]] denote the corresponding (n + m)-by-(n + m) bordered Hessian formed using Q and A. Then x0 is a solution of (8.35) if the determinant of B and the last n − m leading principal minors of B all have the same sign as (−1)^m.
The proof of this result can be found in a variety of sources [29]. Instead of pursuing the proof, we focus on applying the result.
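The minors appearing in the test are straightforward to compute. A minimal sketch, assuming the bordered Hessian has been entered as a Maple matrix B and that the values of n and m are known:

> with(LinearAlgebra):
> minors:=[seq(Determinant(SubMatrix(B,1..k,1..k)),k=2*m+1..n+m)];
# The last n-m leading principal minors of B, of orders 2m+1
# through n+m; by Theorem 8.4.1, each should have the same
# sign as (-1)^m.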
minimize f(x) = (1/2)x^t Qx + p^t x    (8.40)
subject to
Ax = b
Cx ≤ d,

where Q = [[1, 2, 0], [2, 4, 1], [0, 1, 3]], p = [2, −3, 0]^t, A = [1  0  1], b = 2, C = [[1, 2, 2], [4, 1, 0]], and d = [2, 5]^t.
> with(Optimization):with(LinearAlgebra):
> Q:=Matrix(3,3,[1,2,0,2,4,1,0,1,3]):
> p:=<2,-3,0>:
> A:=Matrix(1,3,[1,0,1]):
> b:=<2>:
> C:=Matrix(2,3,[1,2,2,4,1,0]):
> d:=<2,5>:
> QPSolve([p,Q],[C,d,A,b]);

4.3125, [1.25, −.375, .75]
Note that p, b, and d must be entered as vectors using either the <, > notation
or the Vector command. They may not be entered as matrices or as scalars.
One drawback of the QPSolve command is its inability to return the corre-
sponding Lagrange multipliers. To obtain these values, we must use tools
introduced in the context of solving the ConPro problem at the end of Section
8.2.1 and formulate the Lagrangian function. Here is syntax that demon-
strates an efficient means for constructing the Lagrangian corresponding to a
quadratic programming problem, such as (8.40):
> with(VectorCalculus):with(LinearAlgebra):
> Q:=Matrix(3,3,[1,2,0,2,4,1,0,1,3]):
> p:=Matrix(3,1,[2,-3,0]):
> x:=Matrix(3,1,[x1,x2,x3]):
> f:=unapply(1/2*(Transpose(x).Q.x)[1,1]+
(Transpose(p).x)[1,1],[x1,x2,x3]):
> A:=Matrix(1,3,[1,0,1]):
> b:=Matrix(1,1,[2]):
> h:=unapply(convert(A.x-b,Vector),[x1,x2,x3]):
> C:=Matrix(2,3,[1,2,2,4,1,0]):
> d:=Matrix(2,1,[2,5]):
> g:=unapply(convert(C.x-d,Vector),[x1,x2,x3]):
> lambda:=<lambda1,lambda2>:
> mu:=<mu1>:
> L:=unapply(f(x1,x2,x3)+(Transpose(lambda).g(x1,x2,x3))
+(Transpose(mu).h(x1,x2,x3)),[x1,x2,x3,lambda1,lambda2,mu1]):
Observe the index, [1,1], added to (Transpose(x).Q.x) in the fifth line of the worksheet. It is required to convert (Transpose(x).Q.x), which Maple views as a 1-by-1 matrix, to a scalar-valued expression. In essence, the added index "removes the brackets," so to speak, from the matrix (Transpose(x).Q.x). For identical reasons, we add the index, [1,1], immediately after (Transpose(p).x).
Once the Lagrangian, L, has been created, KKT points and corresponding
multipliers can be computed as was done in the ConPro worksheet at the end
of Section 8.2.1.
A pure strategy Nash equilibrium for the bimatrix game is defined in a manner analogous to that for the zero-sum case. To determine such equilibria, we consider the four possible ordered pairs formed by the two column choices
for Ed and the two row choices for Steve. For example, the combination
consisting of Ed choosing column one and Steve choosing row one is not a
pure strategy equilibrium. If Ed recognizes that Steve always chooses row
one, then Ed can increase his earnings by choosing column two provided
Steve continues to follow his own strategy. Similar reasoning applied to other
cases establishes that no pure strategy Nash equilibrium exists.
and

z0,2 = max_y { y^t Bx0 | e^t y = 1 and y ≥ 0 }.
Observe that due to the manner in which equilibrium strategies, x0 and y0 , are
defined, if one player deviates from his equilibrium strategy while the second
player continues to follow his own, then the first sees no improvement in his
earnings.
minimize f(z, x, y) = e^t z − x^t(A + B)y    (8.45)
subject to
A^t y ≤ z1·e
Bx ≤ z2·e
e^t x = 1
e^t y = 1
x, y ≥ 0,

where z = [z1, z2]^t. The proof that the solution of (8.45) coincides with that of
z2
(8.44) is beyond the scope of this text and can be found in other sources [27].
We will focus on the connection between this problem and that of the general
quadratic programming model and apply these tools to Ed and Steve’s game.
We will also leave it as an exercise to show that when B = −A, the solutions
of (8.44) and (8.45) are not only identical but also coincide with the solution
of the zero-sum matrix game.
We first express (8.45) in the standard form from (8.34). For the sake of compact notation, we let w be the vector in R⁶ whose first two entries we associate to z, the next two to x, and the last two to y. In other words, w = [z, x, y]^t.
Now define p = [1, 1, 0, 0, 0, 0]^t and

Q = [ 0_{2×2}   0_{2×2}      0_{2×2}   ]
    [ 0_{2×2}   0_{2×2}      −(A + B)  ]
    [ 0_{2×2}   −(A + B)^t   0_{2×2}   ],

so that the objective of (8.45) becomes

f(w) = (1/2)w^t Qw + p^t w.    (8.46)
We now seek to express the constraints from (8.45) in the form of a matrix
inequality, Cw ≤ d, along with a matrix equation Ew = b. (We use E instead
of A since A already denotes Ed’s payoff matrix.)
minimize f(w) = (1/2)w^t Qw + p^t w    (8.47)
subject to
Ew = b
Cw ≤ d.
Waypoint 8.4.1. Use Maple's QPSolve command to verify that the solution of (8.47) is given by w0 = [5/3, 7/2, 1/2, 1/2, 2/3, 1/3]^t. Thus Ed's equilibrium strategy consists of x0 = [1/2, 1/2]^t, implying he chooses each column with equal probability. His earnings are given by z0,1 = 5/3. Steve, on the other hand, has an equilibrium strategy of y0 = [2/3, 1/3]^t, meaning he chooses the first row 2/3 of the time and has earnings of z0,2 = 7/2. That Steve's earnings are much larger than Ed's is intuitively obvious if we compare the relative sizes of the entries in A and B.
We can verify these results by applying our newly developed tools for quadratic programming problems. Using tools for calculating KKT points, as discussed in Section 8.2, we can verify w0 is the only KKT point of (8.47). The multiplier vector corresponding to the eight inequality constraints is given by λ0 = [1/6, 5/6, 1/6, 5/6, 0, 0, 0, 0]^t, and the multiplier vector corresponding to the two equality constraints is µ0 = [11/6, 11/3]^t. Only the first four constraints of Cw ≤ d are binding. Thus, to apply the bordered Hessian test, we form a submatrix of C using its top four rows:
are binding. Thus, to apply the bordered Hessian test, we form a submatrix
of C using its top four rows:
C̃ = [ 0_{2×2}   A^t     ]
    [ B         0_{2×2} ].
" #
E
It is easy to check that the rows forming are linearly independent, so x0 is
C̃
regular.
We end this section by noting that the mixed strategy Nash equilibrium
solution of a bimatrix game need not be unique in general. It can be the case
that (8.45) has multiple solutions. While the value of the objective function,
f , in (8.45) must be the same for all of these, the players’ equilibrium mixed
strategies and corresponding payoffs, as determined by entries of x0 , y0 , and
z0 , may vary.
(a) Rewrite the problem in the standard form (8.34). (Hint: The matrix
Q is simply the Hessian of f .)
(b) Explain why the problem is convex. Then determine its solution
using formula (8.38).
2. Solve each of the following nonconvex, quadratic programming prob-
lems.
(a)
minimize f(x) = (1/2)x^t Qx + p^t x
subject to
Ax = b,

where Q = [[1, −2, 0], [−2, 3, 1], [0, 1, 3]], p = [2, 1, 0]^t, A = [[1, 2, −1], [4, 0, 1]], and b = [1, 2]^t.
(b)
minimize f(x) = (1/2)x^t Qx
subject to
Cx ≤ d,

where Q = [[1, −2, 0], [−2, 3, 1], [0, 1, 3]], C = [[−1, 2, 2], [1, −1, −3], [−1, 1, 0]], and d = [1, 2, 3]^t.
(c)
minimize f(x) = (1/2)x^t Qx + p^t x
subject to
Ax = b
Cx ≤ d,

where Q = [[2, 4, 0, 0], [4, −3, 1, 2], [0, −2, −3, −1], [0, −1, −1, 0]], p = [2, 3, 0, 1]^t, A = [[−1, 7, −2, −1], [0, −5, 4, 2], [3, 1, 5, 6]], b = [2, 4, 3]^t, C = [[1, 1, 1, 3], [−1, 3, 1, 0]], and d = [1, 1]^t.
where S = −AQ⁻¹A^t.
4. An individual’s blood type (A, B, AB, or O) is determined by a pair
of inherited alleles, of which there are three possibilities: A, B, and O,
where A and B are dominant over O. Table 8.1 summarizes the blood types produced by each allele pair. Note that the allele pairs AO = OA and BO = OB result in blood types A and B, respectively, due to the dominant nature of A and B.
Let x = [x1, x2, x3]^t be a vector whose entries represent the frequencies of the alleles, A, B, and O, within a certain population. An individual is heterozygous for blood type if he or she inherits different allele types.
Since

f(z, x, y) = e^t z − x^t(A + B)y
           = e^t z
           = z1 + z2
           = w − z,
Based upon the rules of the game, we can associate to the two drivers
the following payoff matrices, where row 1 and column 1 are associated
with continuing straight and row 2 and column 2 to swerving:
" # " #
−1 0 −1 2
A= and B =
2 1 0 1
Set up and solve a quadratic programming problem that determines all Nash equilibria for the game. There are three. Two are pure strategy equilibria, in which one driver always continues driving straight and the other swerves, and vice versa. The third equilibrium is of mixed-strategy type, in which each driver elects to continue straight or to swerve with a non-zero probability.
where h : S → R^p is the vector-valued function defined by h(x) = [h1(x), h2(x), . . . , hp(x)]^t.
In a nutshell, to solve (8.50), we will apply Newton’s Method to the associated
Lagrangian function. At each iteration, we solve a quadratic programming
problem. The matrix associated with the objective function is the Hessian
of the Lagrangian. Constraints are formulated using the gradient of L, the
Jacobian of h, and h itself.
where x ∈ S and µ ∈ R^p.

Recall results (8.28) and (8.30) from Section 8.3. These demonstrated that if x0 is a KKT point of NLP (8.50) with corresponding multiplier vector, µ0, then the gradient of L with respect to both vector variables, x and µ, vanishes at (x0, µ0). In other words, (x0, µ0) is a critical point of L.
The nature of this critical point provides information useful for solving (8.50).
If this critical point is a minimum of L on S × Rp , then for all feasible x in S,
Using the Newton direction formula, Equation (7.13) from Section 7.2, we
obtain
Equation (8.56) bears a resemblance to Equation (8.38) from Section 8.4, which
arose in the context of quadratic programming. To see this connection more
clearly, we will associate to (8.50) the following quadratic subproblem:
minimize (1/2)(Δx)^t HL,x(x1, µ1)(Δx) + (∇f(x1) + Jh(x1)^t µ1)^t (Δx)    (8.57)
subject to Jh(x1)(Δx) = −h(x1).
Thus, the values of Δx and Δµ can be obtained by solving either the quadratic subproblem (8.57) or the matrix equation (8.56). Using these values, we obtain the Newton direction of L at w1 = [x1, µ1]^t, namely, Δw = [Δx, Δµ]^t. Therefore, w2 = w1 + Δw, so that x2 = x1 + Δx and µ2 = µ1 + Δµ. This completes the first iteration of the Sequential Quadratic Programming Technique. We now summarize this process.
It is important to recognize that the SQPT, for the case of equality constraints, generates a sequence of iterates identical to that obtained by applying Newton's Method to the associated Lagrangian function.
The solution of (8.58) can of course be obtained by solving the constraint for one decision variable, substituting the result into the objective, and minimizing the resulting function of a single variable. Doing so leads to a solution of x0 = [−1/√2, 1/√2]^t with corresponding objective value f(x0) = −sin(√2).
While NLP (8.58) is a very simple example, it is also well suited for demon-
strating our new technique, due to the nature of the objective function and
the nonlinear constraint.
" #
1
We use as our initial value x1 = and µ1 = 1. (For the sake of consistency
2
with our previous derivations, we express µ1 and h using vector notation even
though both are scalar-valued for this particular problem.) The quantities
needed to compute ∆x and ∆µ consist of the following:
[ sin(1) + 2   −sin(1)      2 ] [ Δx ]       [ cos(1) + 2  ]
[ −sin(1)      sin(1) + 2   4 ] [    ]  = −  [ −cos(1) + 4 ].    (8.59)
[ 2            4            0 ] [ Δµ ]       [ 4           ]
" # −.7
∆x
Solving (8.59), we obtain ≈ −.650, from which it follows that
∆µ
−.550
x2 ≈ x1 + ∆x
" #
.3
=
1.35
and µ2 = µ1 + ∆µ ≈ .45.
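The system (8.59) and its solution can be reproduced in Maple. The sketch below assumes, consistent with (8.59) and the solution stated earlier, that NLP (8.58) has objective f(x1, x2) = −sin(x2 − x1) and constraint h(x) = x1² + x2² − 1 = 0; if the actual statement of (8.58) differs, only the first two function definitions change:

> with(VectorCalculus):with(LinearAlgebra):
> f:=(x1,x2)->-sin(x2-x1):
# Assumed objective of (8.58).
> h:=(x1,x2)-><x1^2+x2^2-1>:
# Assumed constraint of (8.58).
> Delf:=unapply(Gradient(f(x1,x2),[x1,x2]),[x1,x2]):
> Jh:=unapply(Jacobian(h(x1,x2),[x1,x2]),[x1,x2]):
> L:=unapply(f(x1,x2)+<mu>.h(x1,x2),[x1,x2,mu]):
> HLx:=unapply(Hessian(L(x1,x2,mu),[x1,x2]),[x1,x2,mu]):
> B:=<<HLx(1,2,1)|Transpose(Jh(1,2))>,<Jh(1,2)|ZeroMatrix(1,1)>>:
# Coefficient matrix of (8.59) at x1 = [1,2], mu1 = 1.
> w:=evalf(LinearSolve(B,<-Delf(1,2)-Transpose(Jh(1,2)).<1>,-h(1,2)>));
# Should return approximately [-.7, -.650, -.550].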
Table 8.2 provides values of xk and µk for the first seven iterations of the SQPT applied to NLP (8.58) using x1 = [1, 2]^t and µ1 = 1.
It should not come as a surprise, then, that a sequence of SQPT iterates, {(x1, µ1), (x2, µ2), . . .}, can have the same undesirable property. We can demonstrate this phenomenon by applying the SQPT to NLP (8.58) using a different initial value. If x1 = [−1, −2]^t and µ1 = −1, then the resulting sequences of iterates do not converge to x0 = [−1/√2, 1/√2]^t and µ0 = cos(√2)/√2, respectively. Instead, the sequence {xk} converges to the second KKT point of NLP (8.58), x0 = [1/√2, −1/√2]^t, and the sequence {µk} converges to the corresponding multiplier vector, µ0 = −cos(√2)/√2 ≈ −.110. Note that while x0 in this case is a KKT point, it is not a solution of (8.58). Mere comparison of objective values shows this to be the case.
programming problems of the form (8.57), each of which has only equality-type constraints.

minimize (1/2)(Δx)^t HL,x(x1, µ1)(Δx) + ∇f(x1)^t (Δx)    (8.61)
subject to
Jg(x1)(Δx) ≤ −g(x1)
Jh(x1)(Δx) = −h(x1).
The multiplier vectors λ2 and µ2 are assigned the values associated with
the solution of (8.61). Note that this process differs from the situation when
only equality-type constraints were present. There, the multiplier vector was
labeled ∆µ, which we then added to µ1 to obtain µ2 .
7. HL,x(x, λ, µ) = [ 2 + 4x2x3 + 2µ1   4x1x3             4x1x2      ]
                   [ 4x1x3             2 − 2λ2 + 2µ1     2x1²       ]
                   [ 4x1x2             2x1²              −2 + 2µ1   ].
2. −h(x1) = −10

3. ∇f(x1) = [26, 10, −2]^t

4. Jg(x1) = [[1, 2, −1], [0, −4, 0]]

5. Jh(x1) = [2  4  6]

6. HL,x(x1, λ1, µ1) = [[24, 12, 8], [12, −2, 2], [8, 2, −4]].
Now we substitute these results into (8.61) and solve the resulting quadratic programming problem. To find the KKT point and corresponding multipliers, we could certainly construct the associated Lagrangian and use methods from Section 8.2.1. In an effort to streamline the solution process, however, we will first determine the KKT point using the matrix form of Maple's QPSolve command. If we enter the preceding quantities in Maple so that items 1, 2, 3, 4, 5, and 6 correspond to d, b, p, C, A, and Q, respectively, then QPSolve([p,Q],[C,d,A,b]) returns a value of Δx ≈ [−.4000, −.7500, −1.0335]^t.
While the QPSolve command does not return the multiplier values, we can
obtain them by using ∆x along with the given functions. Substituting ∆x into
each side of the inequality Jg (x1 )(∆x) ≤ −g(x1 ) establishes that the constraint
corresponding to the second row of Jg (x1 ) is binding. Thus λ2,1 , the first
component of λ2 , equals zero by complementary slackness.
Since λ2,1 = 0, we use only the unknown value, λ2,2, in the vector of unknowns. In this example, substitution of all known quantities into Equation (8.63) yields

[ 24  12   8  2   0 ] [      ]   [ −26 ]
[ 12  −2   2  4  −4 ] [  Δx  ]   [ −10 ]
[  8   2  −4  6   0 ] [  µ2  ] = [   2 ],
[  2   4   6  0   0 ] [ λ2,2 ]   [ −10 ]
[  0  −4   0  0   0 ] [      ]   [   3 ]
from which it follows that Δx ≈ [−.4, −.75, −1.0335]^t, µ2 ≈ .4268, and λ2,2 ≈ 1.587.
> with(VectorCalculus):with(LinearAlgebra):with(Optimization):
> f:=(x1,x2,x3)->x1^2+2*x1^2*x2*x3+x2^2-x3^2:
# Enter objective function.
> Delf:=unapply(Gradient(f(x1,x2,x3),[x1,x2,x3]),[x1,x2,x3]):
# Gradient of f.
> h:=(x1,x2,x3)-><x1^2+x2^2+x3^2-4>:
# Vector-valued form of equality constraint function.
> Jh:=unapply(Jacobian(h(x1,x2,x3),[x1,x2,x3]),[x1,x2,x3]):
# Jacobian function of h.
> g:=(x1,x2,x3)-><x1+2*x2-x3-2,-x2^2+1>:
# Vector-valued form of inequality constraint functions.
> Jg:=unapply(Jacobian(g(x1,x2,x3),[x1,x2,x3]),[x1,x2,x3]):
# Jacobian function of g.
> lambda:=<lambda1,lambda2>:
# Create vector of multipliers.
8.5. Sequential Quadratic Programming 311
> L:=unapply(f(x1,x2,x3)+Transpose(lambda).g(x1,x2,x3)
+<mu>.h(x1,x2,x3),[x1,x2,x3,mu,lambda1,lambda2]):
# Create Lagrangian Function.
> HLx:=unapply(Hessian(L(x1,x2,x3,mu,lambda1,lambda2),
[x1,x2,x3]),[x1,x2,x3,mu,lambda1,lambda2]):
# Hessian in x1,x2,x3 of the Lagrangian.
> X1:=1,2,3:mu1:=-1:Lambda1:=1,1:
# Initial choices for variables and multipliers. Note
# Lambda1 is case sensitive since lambda1 has already
# been defined as a variable in the Lagrangian.
> Qsol:=QPSolve([Delf(X1),HLx(X1,mu1,Lambda1)],
[Jg(X1),-g(X1),Jh(X1),-h(X1)]);
# Solve first quadratic subproblem in iteration process.
Qsol := −8.1547, [−.3994, −.7500, −1.0335]
> Jg(X1).<Qsol[2][1],Qsol[2][2],Qsol[2][3]>+g(X1);
# Determine which inequality constraints in solution of
# subproblem are binding.
(−.8659)ex + (0)e y
> Ctilde:=SubMatrix(Jg(X1),[2],[1,2,3]);
# The second constraint is binding, so we form a submatrix
# of Jg(X1) using only the second row. Furthermore, we
# know the first component of Lambda2 is zero by
# complementary slackness.

Ctilde := [0  −4  0]
> dtilde:=SubVector(g(X1),[2]);
# Use the entry of g(X1) corresponding to the binding
# constraint.
> B:=<<HLx(X1,mu1,Lambda1)|Transpose(Jh(X1))|Transpose(Ctilde)>,
<Jh(X1)|ZeroMatrix(1,2)>,<Ctilde|ZeroMatrix(1,2)>>;
# Create coefficient matrix for finding changes in x1, x2,
# and x3, along with new multiplier values.

     [ 24  12   8  2   0 ]
     [ 12  −2   2  4  −4 ]
B := [  8   2  −4  6   0 ]
     [  2   4   6  0   0 ]
     [  0  −4   0  0   0 ]

> w:=evalf(MatrixInverse(B).<-Delf(X1),-h(X1),-dtilde>);
# Solve for the changes in x1, x2, x3, together with the
# new multiplier values.

w := [−.3994, −.7500, −1.0335, .4268, 1.5869]
> X2:=X1[1]+w[1,1],X1[2]+w[2,1],X1[3]+w[3,1];
# Create X2 using each entry of X1, added to the
# corresponding entry in w.
> mu2:=w[4,1];

µ2 := .4268

> Lambda2:=0,w[5,1];
# Lambda2 has its first entry equal to zero since the first
# constraint in the quadratic subproblem was not binding at
# the solution. We use the fifth entry of w to form the
# second component of Lambda2. Now return to Step 1 and
# repeat the process again starting at X2, mu2, Lambda2.

Lambda2 := 0, 1.5869
The simplest means for carrying out this process is to utilize a decision-making rule, whereby at each iteration, we choose the fraction of Δx that best decreases the objective value while simultaneously minimizing the "infeasibility error." This task is accomplished by introducing a new function,

M(x) = f(x) + ρP(x),

where P is a penalty function and where ρ is a fixed, large positive real number. The function, M, is referred to as a merit function. Corresponding to NLP (8.60), various different choices exist for P, with one of the most common being

P(x) = Σ_{i=1}^{m} max(gi(x), 0)² + Σ_{j=1}^{p} hj(x)².    (8.65)
Note that if x is feasible for (8.60), then P(x) = 0, and that the value of P
increases with the overall amount of constraint violation.
x2 = x1 + t0Δx    (8.69)
   ≈ [1, 2, 3]^t + 1.3062·[−.4, −.75, −1.034]^t
   ≈ [.478, 1.020, 1.649]^t.
The preceding worksheet is easily modified to incorporate use of the merit function. After the objective and constraint functions, f, g, and h, have been defined, the penalty and merit functions are constructed using the following input:

> P:=unapply(add(piecewise(g(x1,x2,x3)[i]>=0,g(x1,x2,x3)[i]^2,0),
i=1..2)+h(x1,x2,x3)[1]^2,[x1,x2,x3]):
> M:=unapply(f(x1,x2,x3)+10*P(x1,x2,x3),[x1,x2,x3]):
To account for the scaling of ∆x, we modify the worksheet after the compu-
tation of w as follows:
> w:=evalf(MatrixInverse(B).<-Delf(X1),-h(X1),-dtilde>);

w := [−.3994, −.7500, −1.0335, .4268, 1.5869]
> phi:=t->M(X1[1]+t*w[1,1],X1[2]+t*w[2,1],X1[3]+t*w[3,1]):
> t0:=NewtonsMethod(phi(t),1,5,.01);
# NewtonsMethod procedure modified for a function of one
# variable, using an initial value 1, maximum number of
# five iterations, and tolerance of .01.
t0 := 1.3062
> X2:=X1[1]+t0*w[1,1],X1[2]+t0*w[2,1],X1[3]+t0*w[3,1];

X2 := .478, 1.020, 1.649
(b)

along with x1 = [1, −1]^t, λ1 = 1, and µ1 = 1.
2. Use the MSQPT, with initial values x1 = [2, 2, 2]^t and λ1 = .5e, where e is the vector in R¹⁰ all of whose entries are one, to estimate the solution of this NLP.
Appendix A
Projects
An excavation company must prepare a large tract of land for future construc-
tion.1 The company recognizes the difficulty and lack of aesthetics associated
with leveling the site to form one horizontal plane. Instead, it divides the
site into eight rectangles, with each rectangle assigned to fall in one of three
possible planes, whose equations are to be determined. Rectangles within a
given plane must be pairwise adjacent.
Each plane will be expressed as a function of x and y, and to keep the planes
relatively “flat,” we require that the slopes in both the x- and y-directions are
close to zero. (The actual tolerance will be prescribed as a constraint in the
LP.) In addition, adjacent planes should meet along their edges so as to avoid
“jumps.” This entire process of plane-fitting and leveling is accomplished
through a process of transporting among the various rectangles. The com-
pany’s goal is to minimize the total costs stemming from the digging of dirt,
the filling of dirt, and the transporting of dirt from one rectangle to another.
The eight rectangles and their designated planes are shown in Figure A.1.
We define the decision variables for this model using the following notation:
[Figure A.1 shows the eight rectangles, labeled (1) through (8), grouped into Planes 1, 2, and 3; each rectangle is marked with its corner coordinates, its area A, and its original height h.]
FIGURE A.1: Land tract site consisting of eight rectangles forming three
planes.
aj = aj1 − aj2,  bj = bj1 − bj2,  and dj = dj1 − dj2,  1 ≤ j ≤ 3,

Equations for planes can then be created using this notation. For example,

T1(x, y) = a1x + b1y + d1
         = (a11 − a12)x + (b11 − b12)y + (d11 − d12)
represents the equation of the first plane. For the sake of aesthetics, we pre-
scribe that each plane has slopes in the x- and y-directions falling between -.1
and .1.
Costs, in dollars, associated with filling, cutting, and transporting are given
as follows:
• The cost of removing one cubic foot of dirt from any rectangle is $2.
• The cost of adding one cubic foot of dirt to any rectangle is $1.
• The cost of transporting one cubic foot of dirt from one rectangle to another is $1.
1. Construct the objective function for this LP, which consists of total costs
stemming from three sources: the digging of dirt from the rectangles,
the adding of dirt to the rectangles, and the transporting of dirt between
pairs of rectangles.
2. Now formulate a set of constraints for the LP. They arise from the
following requirements:
(a) There is a gap between the original height of each rectangle and the
height of the new plane containing the rectangle. This gap equals
the difference between the fill and cut amounts for the rectangle.
A total of eight equations arises from this condition.
(b) Each rectangle has an area, Ai. The net volume of dirt removed from that rectangle can be expressed in two ways. The first uses the quantities Ai, ci, and fi; the other utilizes the family of variables, {tij}. For example, the volume of dirt added to rectangle one equals
3. Now use Maple to solve the LP that minimizes the objective from (1),
subject to the constraints from (2).
Assume the company collects “raw,” unprocessed grape juice, hereafter re-
ferred to simply as “juice,” from grape growers at two different plants, labeled
plant 1 (k = 1) and plant 2 (k = 2). The former is located in an urban setting,
the latter in a rural region. The collection process takes place over a 12-month
period, with month i = 1 corresponding to September, when the grape har-
vest is largest. The majority of grapes are collected from growers who belong
to a regional Grape Cooperative Association (GCA). However, during the
winter months, when local harvesting is low, the company frequently seeks
grape sources elsewhere, including overseas.
The objective of the company is to minimize costs that stem from three dif-
ferent sources: cost associated with shipping finished products to stores, cost
stemming from moving juice between the two plants, and cost due to storing
juice at each plant.
We define the decision variables for this model using the following notation.
Assume all amounts are measured in tons.
• TSi, j,k , where 1 ≤ i ≤ 12 and 1 ≤ j, k ≤ 2: Amount of product j shipped
to stores from plant k at month i.
• TIi,k,m , where 1 ≤ i ≤ 12 and 1 ≤ k, m ≤ 2: Amount of grape juice
transferred into plant k from plant m during month i. A stipulation, of
course, is that TIi,k,k = 0 for 1 ≤ k ≤ 2 and 1 ≤ i ≤ 12.
• TOi,k,m , where 1 ≤ i ≤ 12 and 1 ≤ k, m ≤ 2: Amount of grape juice trans-
ferred out of plant k and into plant m during month i. Again, we assume
that TOi,k,k = 0 for 1 ≤ k ≤ 2 and 1 ≤ i ≤ 12. This family of decision
variables may not appear necessary since we can interpret negative val-
ues of TIi,k,m as the amount of juice transferred out of plant k and into
plant m, but adding these variables, along with appropriate constraints,
permits us to construct a model in which all decision variables are non-
negative. For example, a negative value of TIi,k,m − TOi,k,m means that
plant k suffers a net loss of juice to plant m, even though each individual
quantity in this difference is nonnegative.
• EIi,k , where 1 ≤ i ≤ 12 and 1 ≤ k ≤ 2: The amount of grape juice
inventory at plant k at end of month i.
Associated with the first three of these four families of variables are various
costs, which are as follows:
• The cost of transporting juice from plant k to the other plant during
month i is $65 per ton.
• Manufacturing costs: For each plant, jam costs $150 per ton to produce
and juice concentrate $175 per ton.
• The cost of storing juice in each plant from one season to the next:
Assume this amount is $500 per ton for either plant.
1. Formulate the objective function for this LP, which consists of the total
cost of manufacturing the two products at the plants, transferring the
juice from one plant to the other, and storing juice from one season to
the next. Your function will involve triple summations.
There are several important quantities that determine the LP’s con-
straints.
• At the start of every year, each plant is required to have 5 tons of
juice on hand.
• The amount of juice delivered to each plant at the start of month i is approximately

P1(i) = 15(1 + sin(πi/6))

tons at plant 1 and P2(i) = 50(1 + sin(πi/6)) tons at plant 2:
> P1:=[seq(15*(1+sin(Pi*i/6)),i=1..12)]:
> P2:=[seq(50*(1+sin(Pi*i/6)),i=1..12)]:
(a) At the end of each year, the juice inventory at each plant must
equal 70 tons.
(b) During each month, the amount transferred into one plant equals
the amount transferred out of the other.
(c) Production of the products at the plants must meet monthly de-
mand.
(d) At most 50 tons can be shipped from one plant to the other during
any given month.
(e) Each plant is limited in how much jam and juice concentrate it can
produce annually.
(f) Each month, the amount of juice transferred into one plant equals
the amount transferred out of the other.
(g) A balance of juice exists from one month to the next.
i. At the end of month 1, the ending inventory at each plant is
the initial inventory of 70, plus the juice transferred in, less
the juice transferred out, less juice lost to production of each
product, plus juice obtained via normal delivery.
ii. At the end of month i, where 2 ≤ i ≤ 12, the ending inventory
is that from the previous month, plus the juice transferred in,
less the juice transferred out, less juice lost to production of
each product, plus juice obtained via normal delivery.
3. Now use Maple to solve the LP that minimizes the objective from (1),
subject to the constraints from (2).
Each guard works five consecutive days per week and has the option of
working overtime on one or both of his two days off. The larger the regular
workforce, the more costly to the state are fringe benefit expenditures paid
to the guards. On the other hand, the smaller the workforce, the more the
3 Based upon Maynard, [31], (1980).
state pays in overtime wages. The problem then becomes one of determining
the workforce size and method of scheduling that meets staffing needs yet
minimizes labor costs stemming from regular wages, overtime wages, and
fringe benefits.
For the sake of simplicity, we focus on the 8 a.m. to 4 p.m. shift and for the sake
of notational convention, we assume that the days of the week are numbered
so that Monday corresponds to day 1, Tuesday to day 2, etc. Each prison
guard is assigned to work the same 8-hour shift on five consecutive days,
with the option of working the same shift on one or both of his following two
days off. To say that a guard is assigned to work schedule k, means that his
standard work week begins on day k.
We define the decision variables for this model using the following notation:
• xk , where 1 ≤ k ≤ 7: The number of guards assigned to work schedule
k.
• uk , where 1 ≤ k ≤ 7: The number of guards assigned to work schedule
k, who work overtime on their first day off.
• vk , where 1 ≤ k ≤ 7: The number of guards assigned to work schedule
k, who work overtime on their second day off.
Staffing costs stem from wages, both regular and overtime, as well as fringe benefits, and are summarized as follows:
• The regular pay rate for guards is $10 per hour, except on Sundays,
when it is $15 per hour instead.
• A guard who works on one or more of his or her two days off is paid at
a rate of $15 per hour on each of those days.
• Each guard receives fringe benefits amounting to 30% of his or her
regular wages.
1. Construct the objective function for this ILP, which consists of total
staffing costs due to regular wages, overtime wages, and fringe benefits.
2. Now formulate a set of constraints for the ILP. They arise from the
following requirements:
(a) Prison requirements stipulate at least 50 guards be present each
day. The exception to this rule occurs on Sundays when 55 guards
are required during family visitation.
(b) To prevent guard fatigue on any given day, the number of prison
guards working overtime can comprise no more than 25% of the
total staff working that day.
3. Now use Maple to solve the ILP that minimizes the objective from (1),
subject to the constraints from (2).
Breast cancer is the most common form of cancer and the second largest cause
of cancer deaths among women. In the mid-1990s a noninvasive diagnostic
tool was developed that can be described in terms of nonlinear program-
ming.4 Researchers use small-gauge needles to collect fluid from both ma-
lignant and benign tumors of a large number of women. With the aid of a
computer program called Xcyt, the boundaries of the cell nuclei are analyzed
and categorized with regard to a number of features, including area, perime-
ter, number of concavities, fractal dimension, variance of grey scale, etc. Mean
values, variance, and other quantities for these attributes are computed, and
ultimately 30 data items for each sample are represented by a vector in R30 .
The set of all such vectors constitutes what is known as a “training set.” Our
aim is to construct a hyperplane in R30 , using these training vectors, that “best
separates” the training vectors corresponding to benign tumors from those
corresponding to malignant ones. Figure A.2 illustrates this basic concept in
R2 , in which case the hyperplane consists of a line. This hyperplane can
be used to classify, with a reasonable degree of certainty, whether a newly
collected sample corresponds to a benign or malignant tumor. Because the
model consists of two classes and a hyperplane is used to separate the data,
we refer to this prediction model as a linear classifier.
FIGURE A.2: Hyperplane consisting of a solid line that separates two classes
of training vectors, circles and boxes, in R2 .
Ideally all malignant tumor vectors fall on one side of this plane and all benign vectors fall on the other. In this case, there exists some positive constant ε such that n^t xi + d ≥ ε for 1 ≤ i ≤ 10 and n^t xi + d ≤ −ε for 11 ≤ i ≤ 20. We can think of ε as the smallest possible distance between any of the training vectors and the desired plane. It determines two new planes, n^t x + d = ε and n^t x + d = −ε, as depicted in Figure A.2. The plane, n^t x + d = 0, separates these two new planes and is a distance ε from each.
Of course, by scaling n and d by 1/ε if necessary, we may rewrite (A.1) as follows:

yi(n^t xi + d) ≥ 1, where 1 ≤ i ≤ 20.    (A.2)

Thus, the three planes in Figure A.2 can be expressed as

n^t x + d = −1,  n^t x + d = 0,  and n^t x + d = 1.
3. Use Maple to solve the NLP that arises by minimizing the sum of ‖n‖²/2 and the penalty function, subject to the constraints in (A.4) together with the sign restrictions δi ≥ 0 for i = 1, 2, . . . , 20.
4. The solution to your NLP determines the equation of the desired plane.
Use Maple to plot this plane, together with the training vectors. To do
this, create two pointplot3d structures, one for each of the two vec-
tor types. (Specifying different symbols, e.g., symbol=solidcircle and
symbol=solidbox for the two plots will highlight the difference between
the two training vector types.) Then create a third plot structure, this
one producing the separating plane. Combine all three plot structures
with the display command.
5. Suppose two new sample vectors are given by v1 = [10.1, 5.97, 31.7]^t and v2 = [8.54, 5.33, 21.8]^t. Use your newly constructed linear classifier, f : R³ → R, given by f(x) = n^t x + d, to classify each of v1 and v2 as corresponding to benign or malignant vector type.
1. Pick four stocks and determine the prices of each over an extended
period of time. For example, you may wish to determine the monthly
closing price of the stock over a period of several years. Various means
exist for acquiring such data. Internet resources, such as Yahoo! Finance,
allow the user to specify the dates of interest for a given stock and
whether the closing prices are to be sampled daily, weekly, or monthly.
Data can then be downloaded in Excel format.
> with(ExcelTools):
> S:=Import("c:\\Users\\stockdata.xls",
"stock1","A1:A100"):
For each stock, we wish to compute its average rate of return per unit time period. For example, if the 100 values described above are collected on the last day of each month, then the rate of return for month j is given by

rj = (S_{j+1} − Sj)/Sj,

where Sj denotes the closing price recorded at the end of month j.
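Assuming the 100 closing prices have been imported into S as above, and that S is indexed as a column matrix, the full list of returns can then be produced in one line:

> r:=[seq((S[j+1,1]-S[j,1])/S[j,1],j=1..99)]:
# List of 99 monthly rates of return computed from the 100
# imported closing prices.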
2. For each stock you selected, compute the corresponding rates of return.
µ = (1/N)·Σ_{j=1}^{N} rj  and  σ² = (1/N)·Σ_{j=1}^{N} (rj − µ)²,
respectively. The second of these measures the extent to which the data
deviates from the mean. Thus, a larger variance corresponds to more
“volatility” in the data.
3. For each of the stocks, compute the corresponding mean and variance
of the rates of return. One way to do this in Maple is to create a list of
these rates of return, one list for each stock. The Mean command takes
each list as its argument and computes the corresponding mean. The
Variance command works similarly. Both commands are located in the Statistics package.
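For example, with the list r of monthly returns constructed earlier:

> with(Statistics):
> mu:=Mean(r);
# Average monthly rate of return.
> sigma2:=Variance(r);
# Note: Maple's Variance applies the sample normalization
# 1/(N-1) by default, slightly different from the 1/N
# formula above.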
With means and variances in hand for the five different investments, we
are now in a position to construct a portfolio. Suppose we let xi , where
1 ≤ i ≤ 5, denote the fraction of our available funds we will devote to
investment i. Given a sense of how much risk we are willing to tolerate,
we seek to determine values of these weights that maximize the return
on our portfolio.
µp = x1µ1 + x2µ2 + x3µ3 + x4µ4 + x5µ5
and
σ2p = x21 σ21 + x22 σ22 + x23 σ23 + x24 σ24 + x25 σ25 . (A.6)
Although past returns do not guarantee future performance, we can still use the mean and variance in (A.6) to estimate the performance and volatility of a portfolio for a given set of weights.
maximize f(x1, x2, x3, x4, x5) = Σ_{i=1}^{5} xiµi − α·Σ_{i=1}^{5} xi²σi²    (A.7)

subject to Σ_{i=1}^{5} xi = 1

and xi ≥ 0 for 1 ≤ i ≤ 5.
minimize f(x) = (1/2)x^t Qx + p^t x    (A.8)
subject to
Ax = b
Cx ≤ d,
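One consistent way of casting (A.7) in the form (A.8) is to take Q = 2α·diag(σ1², . . . , σ5²), p = −[µ1, . . . , µ5]^t, A = [1 1 1 1 1], b = 1, C = −I5, and d = 0. A minimal sketch, using hypothetical means and variances:

> with(Optimization):with(LinearAlgebra):
> alpha:=2:
# Hypothetical risk-aversion parameter.
> mus:=<.05,.07,.06,.09,.03>: sigma2s:=<.01,.02,.015,.03,.001>:
# Hypothetical means and variances of the five investments.
> Q:=2*alpha*DiagonalMatrix(convert(sigma2s,list)):
> p:=-mus:
> A:=Matrix(1,5,[1,1,1,1,1]): b:=<1>:
> C:=-IdentityMatrix(5): d:=Vector(5):
# Nonnegativity constraints written as -x <= 0.
> QPSolve([p,Q],[C,d,A,b]);
# Returns the optimal objective value and portfolio weights.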
6. Experiment with various risk aversion parameters and solve (A.8) for
each value you choose. Describe what happens to the weights and
objective function value as α increases, and provide an economic inter-
pretation of the nonzero Lagrange multipliers.
In the desert, snakes prey on small rodents in both open and vegetated habi-
tats.6 Should the rodent and snake elect to reside in the same habitat, we ex-
pects the two species to have negative and positive payoffs, respectively, due
to potential inter-species encounters. Similarly, when the rodent and snake
reside in different habitats, the payoffs are positive and negative, respectively.
Let habitats 1 and 2 correspond to the vegetated and open areas. (Hereafter,
we simply refer to the vegetated area as the “bush.”) Corresponding to the
snake, we define the 2-by-2 payoff matrix A, where [A]i j represents the snake’s
payoff when it resides in habitat i and the rodent in habitat j. For the rodent,
we denote its payoff matrix B, where [B]i j denotes the rodent’s payoff when
it resides in habitat i and the snake in habitat j. We expect the diagonal entries of A to be nonnegative and those of B nonpositive.
1. Use the values of e, β, Psb , and Pso to construct the snake’s payoff matrix,
A.
2. Use d, α, Psb , Pso , ρb , and ρo to construct the rodent’s payoff matrix, B.
6 Based upon Bouskila [9] (2001).
3. Determine the mixed strategy Nash equilibrium for this bimatrix game
model of the predator-prey habitat.
4. Suppose the snake’s average payoff must increase by 10% from its equi-
librium value and that the rodent’s average loss must decrease by 10%.
What fraction of time should each species spend in each habitat to bring
this change about?
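For exercise 3, the mixed strategy equilibrium of a 2-by-2 bimatrix game can be sketched via the standard indifference conditions (assuming A and B have been entered as matrices; q and r denote the probabilities that the rodent and snake, respectively, choose habitat 1):
> eq1:=q*A[1,1]+(1-q)*A[1,2]=q*A[2,1]+(1-q)*A[2,2]: # snake indifferent between habitats
> eq2:=r*B[1,1]+(1-r)*B[1,2]=r*B[2,1]+(1-r)*B[2,2]: # rodent indifferent between habitats
> solve({eq1,eq2},{q,r});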
Appendix B
Important Results from Linear Algebra
What follows is a list of results from linear algebra that are referenced at
various places in this text. Further elaboration on each of them can be found
in a variety of other sources, such as [15] and [21]. We assume in this list of
results that all matrices have real-valued entries.
The following statements are equivalent for an n-by-n matrix, A:

3. A is invertible.
4. The set of column vectors of A and the set of row vectors of A both span
Rn .
5. The set of column vectors of A and the set of row vectors of A both form linearly independent sets.
6. The set of column vectors of A and the set of row vectors of A both form
a basis for Rn .
If A is an invertible n-by-n matrix, then the unique solution of Ax = b has entries given by Cramer's rule:

x_i = det(A_i(b))/det(A),

where A_i(b) denotes the matrix obtained from A by replacing its ith column with b.
While several different matrix norms exist for Mn (R), the one that connects
the Euclidean norm of a vector, x, in Rn with that of the matrix-product vector,
Ax, is known as the spectral norm. It is defined as

‖A‖ = max{ √λ | λ is an eigenvalue of A^t A }.

Note that by the last property from B.3, each eigenvalue in this set is nonnegative. The spectral norm of a matrix A in Mn(R) satisfies the vector-norm inequality

‖Ax‖ ≤ ‖A‖ ‖x‖ for all x ∈ Rn.

In other words, multiplication of x by A yields a vector whose length is no more than that of x, scaled by the matrix norm, ‖A‖.
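For instance, the following sketch (with a matrix of our own choosing) verifies the definition using Maple's LinearAlgebra package, in which Norm(A,2) computes the spectral norm:
> with(LinearAlgebra):
> A:=Matrix(2,2,[1,2,2,1]):
> Eigenvalues(Transpose(A).A); # eigenvalues of A^t A are 9 and 1
> Norm(A,2);                   # spectral norm: sqrt(9) = 3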
Appendix C

Getting Started with Maple
Maple follows certain conventions and requires the user to observe various
rules. Among those that are most important are the following:
1. Maple is case-sensitive.
2. Every Maple command line ends with either a semicolon (;) or colon (:),
and the command line is executed by hitting the “Enter” key.
7. Maple permits copying and pasting. Short-cut keys for copy and paste
are “CTRL-C” and “CTRL-V,” respectively.
8. Maple provides the means for the user to enter ordinary text using its
“text mode." To switch to this mode, select the “T” button at the top of
the worksheet or type “CTRL-T.”
Commonly used built-in functions are those used to compute square roots, logarithms, exponentials, and trigonometric values. Note that the exponential e^x is written exp(x) and the constant π is written Pi.
> x:=9:
> sqrt(x);
3
> ln(1);
0
> exp(2);
e2
> sin(Pi/3);
(1/2)√3
Maple returns exact values where possible. To obtain a floating-point rep-
resentation of a number, use the evalf command. For example, evalf(Pi)
returns 3.141592654. The default number of digits Maple uses for floating
point values is 10. To change this number, at the start of a worksheet enter
the command Digits:=N;, where N is the number of desired digits.
Frequently one wishes to take output from one command line and use it
directly in the next line without retyping the output value or assigning it a
name. The % symbol is useful for such situations in that it uses the most recent
output as its value.
> theta:=Pi/3;
θ := (1/3)π
> sin(theta);
(1/2)√3
> evalf(%);
0.8660254040
A word of warning is in order regarding the use of %: Its value is the output
from the most recently executed command line. This value may or may not be
the output from the previous line in the worksheet, depending upon whether
or not command lines are executed in the order in which they appear. For this
reason, use % with caution.
> y:=(x+1)*(x-3);
y := (x + 1)(x − 3)
> expand(y);
x2 − 2x − 3
> z:=tˆ2-t-6;
z := t2 − t − 6
> factor(z);
(t − 3)(t + 2)
> subs(t=1,z);
−6
> y:=xˆ2+3*x+1;
y := x2 + 3x + 1
> solve(y=0,x);
−3/2 + (1/2)√5, −3/2 − (1/2)√5
> fsolve(y=0,x);
−2.618033989, −.3819660113
> solve({y=2*x+3,y=-x+4},{x,y});
{x = 1/3, y = 11/3}
As a general rule, solve returns exact roots, both real- and complex-valued
of an equation or system of equations, to the extent Maple is capable of
doing so. If unable to compute all roots, Maple will return a warning of the
form warning: SolutionsMayBeLost. Sometimes, Maple will express certain
solutions using placeholders of the form RootOf. This is especially true if one
root is difficult for Maple to compute, yet a second root is dependent upon
the first. Here is an example:
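(The worksheet below is a sketch with a quartic of our own choosing; the exact form of Maple's output may vary by version.)
> solve({xˆ4+2*x-1=0,y=2*x},{x,y});
{x = RootOf(_Z^4 + 2_Z − 1), y = 2 RootOf(_Z^4 + 2_Z − 1)}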
This output indicates that each solution value x, of which there are four,
corresponds to a value of y, each of which is twice x.
> f:=x->xˆ2-4;
f := x → x2 − 4
> f(3);
5
> f(3+h);
(3 + h)2 − 4
> expand(%);
5 + 6h + h2
Functions of more than one variable are entered in a very similar manner.
> f:=(x1,x2)->sin(x1+x2);
f := (x1, x2) → sin(x1 + x2)
> f(x1,Pi);
sin(x1 + π)
Functions are more appropriate than expressions when one wishes to perform
“function-like” operations, such as evaluating function values or creating new
functions from old ones. For example, to compute the derivative function, use
the Maple D operator. Here is an example utilizing this operator to calculate
first and second derivatives of a function, along with critical points and in-
flection points and their respective function outputs. Note the documentation
through the use of the #.
> f:=x->x*exp(x);
f := x → xex
> D(f)(x); # Calculate derivative function.
ex + xex
> evalf(D(f)(1)); # Approximate derivative value at x=1.
5.436563656
> solve(D(f)(x)=0,x); # Determine critical point.
−1
> f(-1); # Compute corresponding function value.
−e^(−1)
> D(D(f))(x); # Calculate second derivative.
2e^x + xe^x
> solve(D(D(f))(x)=0,x); # Determine inflection point.
−2
> f(-2); # Compute corresponding function value.
−2e^(−2)
The commands diff(f(x),x) and diff(f(x),x$2) also yield the first and
second derivatives as expressions and could be combined with the solve
command to determine the preceding critical point and inflection point.
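For instance, continuing with f(x) = xe^x from above (a brief illustration of our own):
> solve(diff(f(x),x)=0,x);
−1
> solve(diff(f(x),x$2)=0,x);
−2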
For example, here is a 2-by-3 array, in which each entry is assigned the sum of the corresponding column and row indices. The entire array is then printed.
> A:=array(1..2,1..3):
> for i from 1 to 2 do for j from 1 to 3 do A[i,j]:=i+j: od: od:
> print(A);
[2 3 4]
[3 4 5]
Later we will see how arrays provide a convenient means for creating vari-
ables labeled with double or triple indices.
Lists are merely one-dimensional arrays. The only advantage of a list is that
it need not be first defined using the array command. For example, the
command L:=[1,2,4,8,16] defines a list, such that L[1]=1, L[2]=2, and so
on. A sequence, created using the seq command, can be thought of as a list
without brackets. It is frequently used to create longer sequences of values, as in the following commands, which create a sequence, called S, consisting of the nonnegative powers of two from 2^0 through 2^10. This sequence is then used to form a list, labeled T.
> S:=seq(2ˆj,j=0..10);
S := 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024
> T:=[S];
T := [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
> op(T);
1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024
> nops(T);
11
The Maple commands sum and add are used to add finite sequences of values
together. The first command is used to determine the closed form representa-
tion, if one exists, of a sum involving symbolic quantities. The add command
is used instead to add explicitly a finite sequence of values. The following
examples illustrate the commands and their differences:
> sum(rˆk,k=0..n);
r^(n+1)/(r − 1) − 1/(r − 1)
> add(2ˆk,k=0..5);
63
> restart;
> with(LinearAlgebra);
Matrices are constructed in two different ways. The first involves using the
Matrix command, Matrix(m,n,L), where m and n denote the number of rows
and columns, respectively, and where L is a list of the matrix entries, reading
across the rows. For example:
> A:=Matrix(2,3,[1,4,0,2,-3,7]);
" #
1 4 0
A=
2 −3 7
A second method for defining a matrix uses column vector notation. A column
vector1 in Rn can be entered as <a1,a2,a3,...,an>, where a1,...,an denote
the entries of the vector. Matrices are then formed by adjoining vectors using
the symbols <, |, and >. Here is the preceding matrix A entered in such a
manner:
> v1:=<1,2>:
> v2:=<4,-3>:
> v3:=<0,7>:
> A:=<v1|v2|v3>;
A := [ 1  4  0 ]
     [ 2 −3  7 ]
Separating the vectors by commas instead stacks them into a single column vector:
> v1:=<1,2>:
> v2:=<4,-3>:
> v3:=<0,7>:
> A:=<v1,v2,v3>;
A := [ 1 ]
     [ 2 ]
     [ 4 ]
     [−3 ]
     [ 0 ]
     [ 7 ]
Once matrices are entered into Maple, various operations can be performed
on them. To multiply matrices A and B use a period. A scalar multiplied by a
matrix still requires the asterisk, however. The following worksheet illustrates
the use of several LinearAlgebra package commands:
> restart;
1 Unless specifically stated otherwise, we assume throughout this text that every vector is a
column vector.
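The worksheet itself is not reproduced here; the following minimal sketch (with matrices of our own choosing) illustrates the conventions just described:
> with(LinearAlgebra):
> A:=Matrix(2,2,[1,2,3,4]):
> B:=Matrix(2,2,[0,1,1,0]):
> A.B;  # Matrix multiplication uses the period.
> 3*A;  # Scalar multiplication still requires the asterisk.
> MatrixInverse(A); # One of several LinearAlgebra package commands.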
Finally, we point out two common issues that frequently arise in the context
of arrays and matrices. First, for large arrays and matrices, Maple returns
output in summary format. For example, an 11-by-2 matrix A is displayed as
A := [ 11 x 2 Matrix
       Data Type: anything
       Storage: rectangular
       Order: Fortran_order ]
To view the actual contents of the matrix, select the output, right-click, and
select “Browse.” Actual entries are displayed in table format, which can then
be displayed in the worksheet and even exported to Excel. Second, Maple is
capable of converting various data types from one form to another. Typical
conversions used throughout this text include those converting arrays to
matrices and vice versa. For example, convert(A,Matrix) converts an array,
A, to a Matrix type, thereby permitting the use of matrix operations. The
command convert(B,array) converts a Matrix object, B, to an array. A second
frequently used conversion will be that from a list to a vector (or vice versa).
For example, convert(L,Vector) converts a list, L, to a Vector. For a thorough
list of data types and permissible conversions, type ?convert at the command
prompt.
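For instance (a list of our own choosing):
> L:=[1,2,3]:
> v:=convert(L,Vector): # v is the column vector with entries 1, 2, 3.
> convert(v,list);
[1, 2, 3]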
> restart;
> f:=x->x+sin(x);
f := x → x + sin(x)
> plot(f(x),x=0..2*Pi,y=0..6,color=blue);
# Plot function in blue on interval [0,2*Pi], restricting
# output values to the interval [0,6].
> restart;
> g:=(x1,x2)->x1*exp(-x1ˆ2-x2ˆ2);
g := (x1, x2) → x1 e^(−x1² − x2²)
> plot3d(g(x1,x2),x1=-2..2,x2=-2..2,color=red,
style=wireframe,axes=framed);
# Plot function in red on specified rectangle using the
# wireframe style and a framed set of axes.
To plot more than one function or expression on a single set of axes, enter
the functions and/or expressions as a list within the plot command. Colors
can be specified using a list in which each color matches the corresponding
function or expression entry. Here is an example, in which the functions
f (x) = x + sin(x), y = x − 1, and y = x + 1 are plotted on a single set of axes, in
black, blue and red, respectively:
> restart;
> f:=x->x+sin(x):
> plot([f(x),x-1,x+1],x=0..2*Pi,y=0..6,color=
[black,blue,red]);
# f, x-1, and x+1 are plotted black, blue, and red,
# respectively.
The pointplot command plots a list of data points in R2 , with each data
point itself expressed as a list of two numbers.
> restart;
> with(plots):
> L:=[[1,2],[1,1],[0,.5],[-1,-1]];
> pointplot(L,color=blue,symbol=circle,symbolsize=20);
# Plot points as blue circles using specified size.
The output is shown in Figure C.5.
Finally, the inequal command is extremely useful for graphing regions in R2 that satisfy a list of linear inequalities. Here is an example, which shades the region bounded by x1 ≤ 2, x2 ≥ 1, x1 + x2 ≤ 4, and x1 − x2 ≥ −2.
> restart;
> with(plots):
> inequal([x1<=2,x2>=1,x1+x2<=4,x1-x2>=-2],x1=-3..3,x2=0..5,
optionsfeasible=(color=grey),
optionsexcluded=(color=white),
optionsclosed=(color=black));
# Shade in grey the region of points satisfying all
# inequalities in the list. The exterior and boundary
# of the region are colored white and black, respectively.
Superimposing Plots
One of the most useful Maple tools for combining plot structures is the
display command, which is located in the plots package. It allows one to superimpose a list of plot structures, all of which must lie in R2 or all in R3. The general procedure for accomplishing this task
first requires creating the individual plot structures themselves (e.g., plot,
implicitplot, pointplot, etc.) as discussed previously. However, each plot
structure is assigned a name and the command line performing the assign-
ment ends with a colon so as to suppress output. Once the structures are
all created in this manner, a list is formed using their respective names, and
this list is used as the argument for the display command. Here is an exam-
ple, which superimposes results from the preceding pointplot and inequal
commands.
> restart;
> with(plots):
> L:=[[1,2],[1,1],[0,.5],[-1,-1]];
> G1:=pointplot(L,color=blue,symbol=circle,symbolsize=20):
> G2:=inequal([x1<=2,x2>=1,x1+x2<=4,x1-x2>=-2],x1=-3..3,
x2=0..5,optionsfeasible=(color=grey),optionsexcluded
=(color=white),optionsclosed=(color=black)):
> display([G1,G2]);
The output is shown in Figure C.7.
Appendix D

Summary of Maple Commands
Here is a list of Maple commands used at various points in this text, organized
by package name. Each command is accompanied by a brief explanation of
its syntax as well as an example. To learn more about a particular command,
type ?command name at the command prompt. To load a particular package,
type with(package name).
Functions

1. f:=x->expression;
Defines the function f, whose value at x is given by expression.
Example:
> f:=x->xˆ2-x;
f := x → x² − x
> f(3);
6
2. piecewise(cond1,expr1,...,condn,exprn,expr_otherwise);
Defines a piecewise expression, whose value is given by the expression paired with the first condition that holds, and by expr_otherwise if none does.
Example:
> f:=x->piecewise(x<0,-xˆ2,x<=2,xˆ2,0);
Defines the function f, whose output equals −x² for negative inputs, x² for inputs between 0 and 2, inclusive, and 0 otherwise.
3. unapply(expression,variables);
Converts an expression into a function of the variables listed in variables.
Example:
> f:=unapply(x1ˆ2-x2ˆ2,[x1,x2]);
f := (x1, x2) → x1² − x2²
Plot Structures
1. plot(expression,interval,options);
Plots an expression of a single variable over the specified interval using
prescribed options, such as graph color, etc.
Example:
> f:=x->xˆ2:
> plot(f(x),x=-1..3,color=red);
Plots the function f(x) = x² in red on the interval [−1, 3].
2. plot3d(expression,intervals,options);
Plots an expression of two variables over the specified rectangular region using prescribed options.
Example:
> f:=(x1,x2)->x1*x2:
> plot3d(f(x1,x2),x1=0..2,x2=-2..2,style=wireframe,
color=blue);
Plots the function f (x1 , x2 ) = x1 x2 in blue, wireframe style on the
region 0 ≤ x1 ≤ 2 and −2 ≤ x2 ≤ 2.
3. implicitplot(relation,options);
Plots a relation in two variables using specified options, which dictate
plot region, color, etc.
Example:
> implicitplot(x1ˆ2-x2ˆ2=1,x1=-5..5,x2=-5..5,color=green,
thickness=3);
Plots the relation x1² − x2² = 1 in green, at thickness level 3, on the region −5 ≤ x1 ≤ 5 and −5 ≤ x2 ≤ 5.
4. contourplot(relation,options);
Creates a contour plot of an expression in two variables using specified
options, which dictate plot region, color, contour values, etc.
Example:
> contourplot(x1ˆ2-x2ˆ2,x1=-5..5,x2=-5..5,color=blue,
contours=[-2,-1,0,1,2]);
Creates a blue contour plot on the region −5 ≤ x1 ≤ 5 and −5 ≤ x2 ≤ 5 of the expression z = x1² − x2², using contour values z = −2, −1, 0, 1, 2.
5. inequal(inequalities,options);
Plots the region in the plane consisting of points that satisfy the given
inequalities. The inequalities must be linear, allow for equality, and
should be enclosed within brackets and separated by commas. The options dictate the plot region, the color of the region satisfying the inequalities (optionsfeasible), the color of the boundary of the region (optionsclosed), and the color of the points not satisfying at least one inequality (optionsexcluded). This command is contained in the plots package.
Example:
> with(plots):
> inequal([x1+x2<=3,x2>=x1],x1=0..5,x2=0..5,
optionsfeasible=(color=red),optionsclosed=(color=green),
optionsexcluded=(color=yellow));
Plots in red the set of points, (x1 , x2 ), that satisfy the inequalities,
x1 + x2 ≤ 3 and x2 ≥ x1 and that belong to the plot region 0 ≤ x1 ≤ 5
and 0 ≤ x2 ≤ 5. The boundary of the region is colored green, and
the points not satisfying at least one inequality are colored yellow.
This command also has coloring options for situations when at least
one inequality is strict, i.e., involves < or > as opposed to ≤ or ≥.
6. pointplot(list_of_points,options);
Plots the specified list_of_points using specified options. Each point
should consist of two numbers, separated by commas and enclosed
within brackets. Then these points themselves are separated by com-
mas and enclosed within brackets again. Options permit specifying the
point color, the point symbol, (asterisk, box, circle, cross, diagonalcross,
diamond, point, solidbox, solidcircle, soliddiamond), and the size of
this symbol, whose default value is 15. This command is contained in
the plots package.
Example:
> with(plots):
> pointplot([[-1,3],[0,0],[3,7]],symbol=box,symbolsize=20);
Plots, as red boxes, the ordered pairs, (−1, 3), (0, 0), and (3, 7). The
size of the symbol is slightly larger than the default value.
7. display(plot_structures,options);
Superimposes the plot structures in the given list on a single set of axes. This command is contained in the plots package.
Example:
> with(plots):
> graph1:=pointplot([[-1,3],[0,0],[3,7]],symbol=box):
> graph2:=inequal([x1+x2<=3,x2>=x1],x1=0..5,
x2=0..5,optionsfeasible=(color=red),
optionsclosed=(color=green),
optionsexcluded=(color=yellow)):
> display([graph1,graph2],axes=framed);
Displays the superposition of the set of points specified by graph1,
along with the region specified by graph2.
Programming

1. if condition then statement1 else statement2 end if;
Executes statement1 when condition holds and statement2 otherwise.
Example:
> a:=5:
> if a > 2 then x:=5 else x:=0 end if:
> x;
5
2. for index from start to finish by change do statement end do:
Performs the task specified by statement using index value index, which varies from start to finish in increments of change.
Example:
> for i from 0 to 20 by 5 do print(i) end do;
0
5
10
15
20
3. while condition do statement end do:
Performs statement so long as condition is true.
Example:
> a:=5:
> while a < 8 do a:=a+1: print(a): end do:
6
7
8
4. proc(arguments) local variables: global variables: statements end;
Defines a procedure of the prescribed arguments, declaring any variables as local or global and executing statements when the procedure is called.
Example:
> PowerCounter:=proc(number,value) local i: global n:
n:=0:
for i from 0 to 5 do
if numberˆi <= value then n:=n+1: end if:
end do:
RETURN(n):
end:
> PowerCounter(2,33);
6
Procedure for determining the number of powers of number that are less
than or equal to value, where the powers vary from 0 to 5.
Arrays and Lists

1. array(range1,range2,...);
Creates an array whose indices vary over the specified ranges; individual entries can then be assigned.
Example:
> A:=array(0..2,1..3);
Creates an empty array whose row indices range from 0 to 2 and whose column indices range from 1 to 3.
Example:
> A:=array(0..2,1..3);
> for i from 0 to 2 do for j from 1 to 3 do
A[i,j]:=i+j:od:od:
Creates an array, A, consisting of 9 entries, where the indices vary
from 0 to 2 and 1 to 3, respectively. Each entry is the sum of the
two corresponding indices. For example, A[0, 2] = 2.
2. [list-entries];
Creates an ordered list using list-entries. The values of list-entries must be separated by commas. Values in the list can be extracted using subscripts; the command nops determines the size of the list, and the command op removes the outer brackets.
Examples:
> L:=[1,2,4,8,16,32]:
> L[3];
4
> nops(L);
6
> op(L);
1, 2, 4, 8, 16, 32
> L:=[blue,red,black,green]:
> L[3];
black
> nops(L);
4
> op(L);
blue, red, black, green

Sequences, Sums, and Products
1. seq(expression,index=start to finish);
Creates a sequence of values that results when the integer index value
index, which varies between start and finish, is substituted into
expression.
Example:
> seq(3ˆi,i=-2..3);
1/9, 1/3, 1, 3, 9, 27
2. sum(expression,index=start to finish);
Computes a closed form for the sum that results when the integer
index is substituted into expression, where index varies from start
to finish and the resulting sequence of values are added together.
Example:
> sum((1/2)ˆi,i=0..infinity);
2
3. add(expression,index=start to finish);
Computes the explicit sum that results when the integer index is sub-
stituted into expression, where index varies from start to finish and
the resulting sequence of values are added together.
Example:
> add(2ˆi,i=-2..6);
511/4
4. product(expression,index=start to finish);
Computes the explicit product that results when the integer index is
substituted into expression, where index varies from start to finish
and the resulting sequence of values are multiplied together.
Example:
> product(2ˆi,i=1..3);
64
Linear Algebra

Commands in this section are contained in the LinearAlgebra package.
1. Vector(entries):
Constructs a vector having components given by entries. The entries
should be separated by commas and enclosed with brackets. By default,
Maple assumes a vector is a column vector. To enter a row vector instead,
type Vector[row](entries).
Examples:
> with(LinearAlgebra):
> Vector([1,2]);
" #
1
2
> Vector[row]([1,2]);
[1  2]
Alternatively, a column vector can be entered using <...> notation, e.g., <1,2>.
2. Matrix(m,n,entries);
Constructs an m-by-n matrix having prescribed entries, which are read across the rows. The entries should be separated by commas and enclosed within brackets.
Example:
> with(LinearAlgebra):
> Matrix(2,3,[1,4,5,-3,2,5]);
" #
1 4 5
−3 2 5
3. Norm(vector,Euclidean);
Computes the Euclidean norm of vector, which is entered as a column vector using <...> notation or as a Matrix.
Example:
> with(LinearAlgebra):
> u:=<3,4>:
> Norm(u,Euclidean);
5
> v:=Matrix(2,1,[3,4]):
> Norm(v,Euclidean);
5
Note: Maple’s VectorCalculus package also contains a Norm command,
which accepts a row vector as its argument, as opposed to a column
vector. Because of this fact, and for the sake of uniformity, when using
both packages, we always load the LinearAlgebra package second.
4. ZeroMatrix(m,n);
Constructs an m-by-n zero matrix. Similarly, ZeroVector(m) constructs a zero column vector with m entries.
Example:
> with(LinearAlgebra):
> ZeroMatrix(2,3);
" #
0 0 0
0 0 0
5. IdentityMatrix(m);
Constructs an m-by-m identity matrix.
Example:
> with(LinearAlgebra):
> IdentityMatrix(2);
" #
1 0
0 1
6. UnitVector(m,n);
Constructs a column vector in Rn having an entry of 1 in component m
and a zero in all other components.
Example:
> with(LinearAlgebra):
> UnitVector(2,4);
[0]
[1]
[0]
[0]
7. DiagonalMatrix(entries);
Constructs a square, diagonal matrix using entries along the diagonal.
The entries should be separated by commas and enclosed with brackets.
Example:
> with(LinearAlgebra):
> DiagonalMatrix([1,3,-5]);
[1 0  0]
[0 3  0]
[0 0 −5]
8. Row(M,m);
Extracts, as a row vector, row m of the matrix M. Likewise, Column(M,n) extracts, as a column vector, column n of the matrix M.
Example:
> with(LinearAlgebra):
> A:=Matrix(2,3,[1,4,5,-3,2,5]);
" #
1 4 5
−3 2 5
> Row(A,2);
[−3  2  5]
9. RowOperation(M,operation);
Performs a specified row operation, dictated by operation, on the ma-
trix M. There are three such operations.
> RowOperation(M,j,k);
Multiplies row j of M by k.
> RowOperation(M,[i,j]);
Interchanges rows i and j of M.
> RowOperation(M,[i,j],k);
Adds k times row j to row i.
Example:
> with(LinearAlgebra):
> A:=IdentityMatrix(3);
[1 0 0]
[0 1 0]
[0 0 1]
> RowOperation(A,3,4);
[1 0 0]
[0 1 0]
[0 0 4]
> RowOperation(A,[1,3]);
[0 0 1]
[0 1 0]
[1 0 0]
> RowOperation(A,[1,3],-2);
Produces the matrix
[1 0 −2]
[0 1  0]
[0 0  1]
Note: The option, inplace=true, replaces the value of the matrix with
that obtained by performing the specified row operation. For example,
RowOperation(M,[1,2], inplace=true) interchanges rows 1 and 2 of
M and overwrites the value of M with this new matrix.
10. Pivot(M,i,j);
Pivots on the entry in row i, column j of the matrix M.
Example:
> with(LinearAlgebra):
> A:=Matrix(3,3,[1,4,5,-3,2,5,7,0,-1]);
[ 1  4  5]
[−3  2  5]
[ 7  0 −1]
> Pivot(A,1,3);
[  1     4    5]
[ −4    −2    0]
[36/5   4/5   0]
11. RowDimension(M);
Determines the row dimension of the matrix M. Likewise,
ColumnDimension(M); determines the column dimension of M.
Example:
> with(LinearAlgebra):
> A:=Matrix(2,3,[1,4,5,-3,2,5]):
> RowDimension(A);
2
> ColumnDimension(A);
3
12. GaussianElimination(M);
Performs Gaussian elimination on the matrix M and produces the row
echelon form.
13. ReducedRowEchelonForm(M);
Computes the reduced row echelon form of M.
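Example (of our own construction, illustrating both commands):
> with(LinearAlgebra):
> A:=Matrix(2,3,[1,2,3,2,4,8]):
> GaussianElimination(A);
[1 2 3]
[0 0 2]
> ReducedRowEchelonForm(A);
[1 2 0]
[0 0 1]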
14. LinearSolve(A,b,free=t);
Solves the matrix equation Ax = b. The option free=t specifies that any free variables are labeled t, along with appropriate subscripts.
Example:
> A:=Matrix(3,3,[1,1,-1,2,1,1,3,1,3]);
A := [1 1 −1]
     [2 1  1]
     [3 1  3]
> b:=<2,1,0>;
b := [2]
     [1]
     [0]
> LinearSolve(A,b,free=t);
[−1 − 2t3]
[ 3 + 3t3]
[    t3  ]
15. Determinant(M);
Computes the determinant of the matrix M.
16. MatrixInverse(M);
Calculates the inverse of the matrix M. If the matrix is not invertible, an
error message is returned.
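Example (of our own construction, illustrating both commands):
> with(LinearAlgebra):
> A:=Matrix(2,2,[2,1,1,1]):
> Determinant(A);
1
> MatrixInverse(A);
[ 1 −1]
[−1  2]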
17. Eigenvalues(M);
Calculates the eigenvalues of the matrix M.
18. Eigenvectors(M);
Calculates the eigenvectors and corresponding eigenvalues of the ma-
trix M. The eigenvalues are returned as a column vector, and the eigen-
vectors as corresponding columns in a matrix.
Example:
> with(LinearAlgebra):
> A:=Matrix(3,3,[1,0,0,2,3,0,4,5,3]):
> Eigenvectors(A);
[1]   [ 2  0  0]
[3] , [−2  0  0]
[3]   [ 1  1  0]

Thus, the matrix A has the eigenvector (2, −2, 1)^t, corresponding to the eigenvalue 1, and the eigenvector (0, 0, 1)^t, corresponding to the repeated eigenvalue of 3.
19. IsDefinite(M,’query’=q);
Tests whether the matrix M is positive (semi-) definite, negative (semi-
) definite, or indefinite. The returned value is true or false, and the
parameter q specifies the matrix form to be determined. Choices
for q consist of ’positive_definite’, ’positive_semidefinite’,
’negative_definite’, ’negative_semidefinite’, and ’indefinite’.
If the matrix has symbolic entries, the command returns conditions on
these entries for which q is satisfied.
Examples:
> with(LinearAlgebra):
> A:=Matrix(3,3,[1,0,0,2,3,0,4,5,3]):
> IsDefinite(A,'query'='positive_definite');
false
> A:=Matrix(3,3,[1,0,0,2,3,0,4,5,x]):
> IsDefinite(A,'query'='positive_semidefinite');
0 ≤ −33/4 + 2x and 0 ≤ −33/4 + 4x and 0 ≤ 4 + x
Excel Tools

Commands in this section are contained in the ExcelTools package.

1. Import("directory\\filename.xls","sheetname","cellrange");
Imports cells cellrange from worksheet sheetname of the Excel file filename.xls, located within the specified directory.
All command arguments are contained in quotation marks, and the
data itself is imported into an array, which can then be converted to a
Matrix and/or Vector structure.
Example:
> with(ExcelTools):
> A:=convert(Import("c:\\file.xls",
"sheet1","A1:D5"),Matrix):
> B:=convert(Import("c:\\file.xls",
"sheet1","E1:E5"),Vector):
Statistics

Commands in this section are contained in the Statistics package.

1. Mean(L);
Computes the mean value of the entries in the list, L.
2. Variance(L);
Computes the variance of the entries in the list, L.
Example:
> with(Statistics):
> L:=[1,2,3,4,5]:
> Mean(L);
3
> Variance(L);
2.5
Optimization

Commands in this section are contained in the Optimization package.

1. LPSolve(expression,constraints,options);
Solves the linear programming problem consisting of the objective expression, subject to the entries of constraints. The quantity expression must be linear in the decision variables, and constraints must consist of linear inequalities in the decision variables, separated by commas and enclosed within brackets. The command permits a variety of
options, which can specify whether the objective is to be minimized
or maximized (the default goal is minimization), can dictate sign re-
strictions on decision variables, and can require one or more decision
variables to be integer-valued in the solution. The output of the com-
mand is a list, with the first entry given by the optimal objective value
and the second as a vector specifying the corresponding decision vari-
able values.
Examples:
> with(Optimization):
> LPSolve(-4*x1-5*x2,[x1+2*x2<=6,5*x1+4*x2<=20],
assume=nonnegative);
[−19, [x1 = 2.6666, x2 = 1.6666]]
> LPSolve(-7*x1+2*x2,[4*x1-12*x2<=20,-x1+3*x2<=3],
assume=nonnegative,’maximize’);
[2, [x1 = 0, x2 = 1]]
> LPSolve(3*x1-2*x2,[x1-2*x2<=5,-x1+3*x2<=3,x1>=2],
assume=integer);
[4, [x1 = 2, x2 = 1]]
> LPSolve(x1-2*x2,[x1-2*x2<=5,-x1+3*x2<=3,x1>=0],
assume=binary);
[−2, [x1 = 0, x2 = 1]]
2. LPSolve(c,[A,b],options);
This is the matrix form of the LPSolve command, which minimizes the objective c^t x subject to the constraints Ax ≤ b, with c and b entered as vectors and A as a matrix. (The closing call and its output below are our reconstruction of this elided example; the data match the second LPSolve example above.)
Example:
> with(Optimization):
> c:=Vector[row]([-7,2]);
c := [−7  2]
> b:=<20,3>;
b := [20]
     [ 3]
> A:=Matrix(2,2,[4,-12,-1,3]);
A := [ 4 −12]
     [−1   3]
> LPSolve(c,[A,b],assume=nonnegative,'maximize');
[2, [0, 1]]
3. NLPSolve(expression,constraints,options);
Similar to the LPSolve command, NLPSolve minimizes expression,
subject to constraints, which consists of a list of inequalities. Both
expression and any particular constraint, are permitted to be nonlinear
in the decision variables. The advanced numeric method used by Maple
to execute this command generally returns a local optimal solution.
However, if the problem is convex, this local optimal solution is a global
optimal solution as well.
Example:
> with(Optimization):
> NLPSolve(x1ˆ2+x2ˆ2+3*x2,[x1ˆ2+x2ˆ2<=3,x1-x2<=4],
assume=nonnegative,’maximize’);
4. QPSolve(quadratic expression,constraints,options);
A special case of the NLPSolve command is QPSolve, which solves
quadratic programming problems. In this situation, the objective,
quadratic expression, is quadratic in the decision variables. Each
entry of constraints takes the form of a linear or nonlinear inequality.
Example:
> with(Optimization):
> QPSolve(x1ˆ2+x2ˆ2+3*x2,[x1+x2<=3,x1-x2<=4]);
5. QPSolve([p,Q],[C,d,A,b],options);
This is the matrix form of the QPSolve command, which solves the quadratic programming problem of minimizing (1/2)x^t Qx + p^t x subject to the linear constraints, Ax = b and Cx ≤ d. Note that p, b, and d must be entered as vectors and not as matrices. In addition, if only C and d are listed within the brackets, Maple assumes they correspond to the linear inequalities. If the problem involves equality constraints only, constraints are given by [NoUserValue,NoUserValue,A,b].
Examples:
> with(Optimization):
> Q:=Matrix(2,2,[2,1,1,3]):
> p:=<5,-3>:
> C:=Matrix(2,2,[1,0,0,3]):
> d:=<3,0>:
> A:=Matrix(2,2,[4,2,2,1]):
> b:=<6,3>:
> QPSolve([p,Q],[C,d,A,b]);
[9.75, [1.5  0]]
> QPSolve([p,Q],[C,d]);
[−6.25, [−2.5  0]]
> QPSolve([p,Q],[NoUserValue,NoUserValue,A,b]);
[3.70, [0.4  2.2]]
Vector Calculus
Commands in this section are contained in the VectorCalculus package.
1. Gradient(expression,variables);
Calculates the gradient vector of the multivariable expression with respect to the variables prescribed by variables. The quantity variables consists of comma-separated variables enclosed within brackets. The output of the command is expressed in terms of unit vectors corresponding to each variable; each such vector takes the form e_variable.
Example:
> with(VectorCalculus):
> Gradient(x1ˆ2+2*x1*x2,[x1,x2]);
(2x1 + 2x2) e_x1 + 2x1 e_x2
2. Jacobian(expressions,variables);
Calculates the Jacobian matrix of the multivariable expressions
with respect to the variables prescribed by variables. The quantity
expressions consists of comma-separated expressions enclosed within
brackets, and variables consists of comma-separated variables en-
closed within brackets. The output of the command is expressed as
a matrix.
Example:
> with(VectorCalculus):
> Jacobian([x1ˆ2+2*x1*x2,4*x2ˆ2-3*x1*x2],[x1,x2]);
[2x1 + 2x2      2x1    ]
[  −3x2      8x2 − 3x1 ]
3. Hessian(expression,variables);
Calculates the Hessian matrix of the multivariable expression with re-
spect to the variables prescribed by variables. The quantity variables
consists of comma-separated variables enclosed within brackets.
Example:
> with(VectorCalculus):
> Hessian(x1ˆ2+2*x1*x2,[x1,x2]);
[2 2]
[2 0]
Bibliography
[1] Apte, A., Apte, U., Beatty, R., Sarkar, I., and Semple, J., The Impact
of Check Sequencing on NSF (Not-Sufficient Funds) Fees, Interfaces, 34
(2004), 97-105.
[3] Bartlett, A., Chartier, T., Langville, A., and Rankin, T., An Integer Pro-
gramming Model for the Sudoku Problem, The Journal of Online Mathe-
matics and its Applications, 9 (2008).
[4] Bazaraa, M., Sherali, H., and Shetty, C., Nonlinear Programming: Theory and Algorithms, John Wiley and Sons, Hoboken, 2006.
[7] Bevington, P., and Robinson, D., Data Reduction and Error Analysis for the Physical Sciences, McGraw-Hill, New York, 2002.
[8] Blandford, D., Boisvert, R., and Charles, C., Import Substitution for Live-
stock Feed in the Caribbean Community, American Journal of Agricultural
Economics, 64, (1982), 70-79.
[10] Dantzig, G., and Thapa, M., Linear Programming 2: Theory and Extensions,
Springer, New York, 2003.
[11] DeWitt, C., Lasdon, L., Waren, A., Brenner, D., and Melhem., S., OMEGA:
An Improved Gasoline Blending System for Texaco, Interfaces, 19 (1989),
85-101.
[12] Duncan, I.B., and Noble, B.M., The Allocation of Specialties to Hospitals
in a Health District, The Journal of the Operational Research Society, 30
(1979), 953-964.
[13] Floudas, C.A., Pardalos, P.M., (Eds.) Recent Advances in Global Optimiza-
tion, Princeton University Press, Princeton, 1992.
[15] Horn, R., and Johnson, C., Matrix Analysis, Cambridge University Press,
Cambridge, 1985.
[16] Horst, R., and Tuy, H., Global Optimization - Deterministic Approaches,
Third Edition, Springer, Berlin, 1996.
[17] Jarvis, J., Rardin, R., Unger, V., Moore, R., Schimpler, C., Optimal Design
of Regional Wastewater Systems: A Fixed-Charge Network Flow Model,
Operations Research, 26 (1978), 538-550.
[19] Klee, V., and Minty, G.J., How Good is the Simplex Method?, in Inequal-
ities III, Shisha, O., (Ed.), Academic Press, New York, 1972, 159-175.
[21] Lay, D., Linear Algebra, Third Edition, Addison-Wesley, New York, 2006.
[22] Letavec, C., and Ruggiero, J., The n-Queens Problem, INFORMS Trans-
actions on Education, 2 (2002), 101-103.
[26] Mangasarian, O., Street, W., and Wolberg, W., Breast Cancer Diagnosis and Prognosis via Linear Programming, Mathematical Programming Technical Reports 94-10, Madison, WI, 1994.
[29] Marsden, J., and Tromba, A., Vector Calculus, Fifth Edition, W.H. Freeman and Company, New York, 2003.
[30] Marshall, K., and Suurballe, J., A Note on Cycling in the Simplex Algo-
rithm, Naval Research Logistics Quarterly, 16 (1969), 121-137.
[31] Maynard, J., A Linear Programming Model for Scheduling Prison
Guards, UMAP Module 272, The Consortium for Mathematics and its
Applications, Birkhauser, Boston, 1980.
[32] Moreb, A., and Bafail, A., A Linear Programming Model Combining
Land Leveling and Transportation Problems, The Journal of the Operational
Research Society, 45 (1994), 1418-1424.
[33] Nash, J., Non-Cooperative Games, The Annals of Mathematics, 54 (1951),
286-295.
[34] Neumaier, A., Interval Methods for Systems of Equations, Cambridge Uni-
versity Press, Cambridge, 1990.
[35] Nocedal, J., and Wright, S., Numerical Optimization, Springer, New York,
2000.
[36] Pendegraft, N., Lego of my Simplex, ORMS Today (online), 24 (1997).
[37] Penrose, K., Nelson, A., and Fisher, G., Generalized Body Composition
Prediction Equation for Men Using Simple Measurement Techniques,
Medicine and Science in Sports and Exercise, 17 (1985), 189.
[38] Pintér, J.D., Global Optimization in Action, Kluwer Academic Publishers,
Dordrecht, 1996.
[39] Pintér, J.D., Optima, Mathematical Programming Society Newsletter, 52
(1996).
[40] Polak, E., Optimization: Algorithms and Consistent Approximations,
Springer, New York, 1997.
[41] Reed, H.S., and Holland, R.H., The Growth Rate of an Annual Plant
Helianthus, Proceedings of the National Academy of Sciences, 5 (1919), 135-
144.
[42] Schuster, E., and Allen, S., Raw Material Management at Welch’s Inc.,
Interfaces, 28 (1998), 13-24.
[43] Sharp, J., Snyder, J., and Green, H., A Decomposition Algorithm for
Solving the Multifacility Production Problem with Nonlinear Production
Costs, Econometrica, 38 (1970), 490-506.
[44] Straffin, P., Applications of Calculus, The Mathematical Association of
America, Washington D.C., 1996.
[45] Von Neumann, J., and Morgenstern, O., Theory of Games and Economic
Behavior, Third Edition, Princeton University Press, Princeton, 1980.
[46] Yamashita, N., Fukushima, M., On the Rate of Convergence of the
Levenberg-Marquardt Method, Computing, (Supplement), 15 (2001), 239-
249.
[47] Zoutendijk, G., Nonlinear Programming, Computational Methods, in
Integer and Nonlinear Programming, Abadie, J., (Ed.), North Holland, Am-
sterdam, 1970.
Index
tableau matrix, 56
tour, 158
training vectors, 327
transportation problem, 85
  as a minimum cost network flow problem, 93
  balanced, 86
  integrality of solution, 87, 89
  with transshipment, 89
Tucker, W., 264
twice-differentiable, 210