
Chapter 1 Linear Programming

1.1 Transportation of Commodities


We consider a market consisting of a certain number of providers
and demanders of a commodity and a network of routes between the
providers and the demanders along which the commodity can be shipped
from the providers to the demanders. In particular, we assume that
the transportation network is given by a set A of arcs, where (i, j) ∈ A
means that there exists a route connecting the provider i and the demander j.
We denote by cij the unit shipment cost on the arc (i, j), by si the
available supply at the provider i, and by d_j the demand at the demander j. The variables are the quantities x_ij of the commodity that are shipped over the arc (i, j) ∈ A, and the problem is to minimize the
transportation costs
(1.1)   minimize  ∑_{(i,j)∈A} c_ij x_ij   over all x = (x_ij) ≥ 0

under the natural constraints:


• the supply s_i at i must be at least the total amount shipped from i to all j such that (i, j) ∈ A, i.e.,

(1.2)   ∑_{j:(i,j)∈A} x_ij ≤ s_i   for all i ,

• the demand d_j at j must be satisfied in the sense that it is less than or equal to the total amount shipped to j from all i such that (i, j) ∈ A, i.e.,

(1.3)   ∑_{i:(i,j)∈A} x_ij ≥ d_j   for all j .

The problem (1.1)-(1.3) is a Linear Program (LP) whose solution by


the simplex method and primal-dual interior-point methods will be
considered in sections 1.2 and 1.3 below.

1.1.1 Dantzig’s original transportation model


As an example we consider G.B. Dantzig’s original transportation model:
We assume two providers i = 1 and i = 2 of tin cans located at Seattle
and San Diego and three demanders j = 1, j = 2, and j = 3 located at
New York, Chicago, and Topeka, respectively:
Sets
i canning plants: Seattle , San Diego
j markets: New York , Chicago , Topeka

The following table contains the distances dij in thousands of miles


between the providers and the demand centers

             New York   Chicago   Topeka
Seattle         2.5        1.7       1.8
San Diego       2.5        1.8       1.4

The freight F in dollars per case per thousand miles is F = 90, so that
the transport cost cij in thousands of dollars per case is given by

c_ij = F d_ij / 1000 .

The supplies si , 1 ≤ i ≤ 2, and the demands dj , 1 ≤ j ≤ 3, are given as


follows
Supplies
s1 Seattle 325
s2 San Diego 575
Demands
d1 New York 325
d2 Chicago 300
d3 Topeka 275
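For concreteness, the data above can be fed directly to an LP solver. The following sketch is not part of the original notes; the route ordering and the use of scipy.optimize.linprog are choices made here for illustration. It minimizes (1.1) subject to (1.2) and (1.3):

import numpy as np
from scipy.optimize import linprog

# routes ordered as (Seattle, San Diego) x (New York, Chicago, Topeka)
dist = np.array([2.5, 1.7, 1.8, 2.5, 1.8, 1.4])   # d_ij in thousands of miles
c = 90.0 * dist / 1000.0                          # c_ij = F d_ij / 1000 with F = 90

# supply constraints (1.2): shipments out of each plant <= s_i
A_supply = np.array([[1, 1, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]])
s = np.array([325.0, 575.0])
# demand constraints (1.3): shipments into each market >= d_j, written as <=
A_demand = -np.array([[1, 0, 0, 1, 0, 0],
                      [0, 1, 0, 0, 1, 0],
                      [0, 0, 1, 0, 0, 1]])
d = -np.array([325.0, 300.0, 275.0])

res = linprog(c, A_ub=np.vstack([A_supply, A_demand]),
              b_ub=np.concatenate([s, d]), bounds=(0, None), method="highs")
print(res.x.reshape(2, 3))   # optimal shipments x_ij
print(res.fun)               # minimal cost; should come out to about 153.675 (thousand dollars)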

1.1.2 The primal-dual and the dual problem


For the inequality constraints (1.2) and (1.3) we introduce Lagrange
multipliers psi ≥ 0 and pdj ≥ 0 which are also called dual variables or
shadow prices. Then, the LP (1.1)-(1.3) can be restated as the saddle
point problem

(1.4)   min_{x≥0}  max_{p^s, p^d ≥ 0}  L(x, p^s, p^d) ,

where the Lagrangian L is given by


(1.5)   L(x, p^s, p^d) := ∑_{(i,j)∈A} c_ij x_ij + ∑_i p^s_i ( ∑_{j:(i,j)∈A} x_ij − s_i ) + ∑_j p^d_j ( d_j − ∑_{i:(i,j)∈A} x_ij ) .

The optimality conditions for (1.5) turn out to be


(1.6)   x_ij ≥ 0 ,   L_{x_ij} ≥ 0 ,   x_ij · L_{x_ij} = 0   for all (i, j) ∈ A ,
(1.7)   p^s_i ≥ 0 ,   L_{p^s_i} ≤ 0 ,   p^s_i · L_{p^s_i} = 0   for all i ,
(1.8)   p^d_j ≥ 0 ,   L_{p^d_j} ≤ 0 ,   p^d_j · L_{p^d_j} = 0   for all j .
Computing the derivatives of the Lagrangian L, (1.6)-(1.8) represents
the Linear Complementarity Problem (LCP)
(1.9)    x_ij ≥ 0 ,   p^s_i + c_ij − p^d_j ≥ 0 ,   x_ij · (p^s_i + c_ij − p^d_j) = 0   for all (i, j) ∈ A ,
(1.10)   p^s_i ≥ 0 ,   s_i − ∑_{j:(i,j)∈A} x_ij ≥ 0 ,   p^s_i · ( s_i − ∑_{j:(i,j)∈A} x_ij ) = 0   for all i ,
(1.11)   p^d_j ≥ 0 ,   ∑_{i:(i,j)∈A} x_ij − d_j ≥ 0 ,   p^d_j · ( ∑_{i:(i,j)∈A} x_ij − d_j ) = 0   for all j .

Although, the complementarity conditions (1.9)-(1.11) can be derived


rigorously, let us comment on them from an intuitive point of view:
(i) Complementarity condition (1.9):
The supply price p^s_i at i plus the transportation cost c_ij from i to j must be at least the market price p^d_j at j, i.e., p^s_i + c_ij − p^d_j ≥ 0. Otherwise, in a competitive marketplace, another provider would replicate the supplier i, increasing the supply of the commodity, which drives down the market price; this would repeat until the inequality is satisfied. On the other hand, if the supply price plus the cost of delivery strictly exceeds the market price, i.e., p^s_i + c_ij − p^d_j > 0, then nothing is shipped from i to j because doing so would incur a loss, and hence x_ij = 0.
(ii) Complementarity condition (1.10):
In case s_i > ∑_{j:(i,j)∈A} x_ij there is an excess supply at i. In a competitive marketplace, the provider is not willing to pay for more supply, because he is already over-supplied, and hence p^s_i = 0. On the other hand, if s_i = ∑_{j:(i,j)∈A} x_ij, the supplier might be willing to pay for additional supply of the commodity, whence p^s_i ≥ 0.

(iii) Complementarity condition (1.11):


Assume ∑_{i:(i,j)∈A} x_ij > d_j, which means that the supply exceeds the demand. Hence, the demander is not willing to pay for more goods, i.e., p^d_j = 0. Otherwise, if ∑_{i:(i,j)∈A} x_ij = d_j, the demander might consider ordering more of the commodity, whence p^d_j ≥ 0.
The LCP (1.9)-(1.11) represents the complementary slackness condi-
tions of the LP (1.1)-(1.3) which are both the necessary and the suffi-
cient optimality conditions for the LP. Moreover, the conditions (1.9)-
(1.11) are also the necessary and sufficient optimality conditions of the
problem
(1.12)   max_{p^s, p^d ≥ 0}   ∑_j d_j p^d_j − ∑_i s_i p^s_i ,

(1.13)   subject to   c_ij ≥ p^d_j − p^s_i   for all (i, j) ∈ A ,


which is called the dual linear program.
Since the LCP (1.9)-(1.11) contains both the primal and the dual vari-
ables, it is referred to as the primal-dual formulation of the LP.
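The dual program (1.12)-(1.13) can be solved in the same way as the primal. The sketch below is again only an illustration (not part of the notes); it computes shadow prices for the Dantzig example above, and by LP duality its optimal value coincides with the minimal transportation cost:

import numpy as np
from scipy.optimize import linprog

s = np.array([325.0, 575.0])            # supplies s_i
d = np.array([325.0, 300.0, 275.0])     # demands d_j
c = 90.0 * np.array([[2.5, 1.7, 1.8],
                     [2.5, 1.8, 1.4]]) / 1000.0   # costs c_ij

# variables: p = (p^s_1, p^s_2, p^d_1, p^d_2, p^d_3) >= 0
obj = np.concatenate([s, -d])           # minimize s^T p^s - d^T p^d  (= maximize (1.12))
A_ub, b_ub = [], []
for i in range(2):
    for j in range(3):                  # constraint (1.13): p^d_j - p^s_i <= c_ij
        row = np.zeros(5)
        row[i] = -1.0
        row[2 + j] = 1.0
        A_ub.append(row)
        b_ub.append(c[i, j])

res = linprog(obj, A_ub=np.array(A_ub), b_ub=b_ub, bounds=(0, None), method="highs")
print(res.x)      # shadow prices (p^s, p^d)
print(-res.fun)   # optimal dual value; equals the minimal primal cost (about 153.675)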

1.1.3 Model generalization


The primal-dual formulation of the LP has the advantage that exten-
sions of the underlying model can easily be accomplished. One of these
extensions is to consider the demand as a function of the prices p, i.e.,
we replace d with some function d(p). As long as this function is affine, the complementarity problem stays linear, as is the case for d_j(p) := d_j (1 − p^d_j). However, we note that in this case the demand becomes negative if p^d_j > 1.
A more realistic scenario occurs by introducing some price elasticity ej
and a reference price p̄dj at j and to define the demand according to
(1.14)   d_j(p^d_j) := ( p̄^d_j / p^d_j )^{e_j} .
In this case, the demand depends nonlinearly on the price which means
that we are faced with a nonlinear complementarity problem (NCP).
Methods to solve nonlinear programming problems will be presented
in Chapters 3 and 4.

1.2 The Simplex Method


A linear program is a problem of the following form
(1.15) minimize c1 x1 + c2 x2 + · · · + cn xn ≡ cT x
over all x ∈ Rn which satisfy finitely many linear equality and inequal-
ity constraints
(1.16) ai1 x1 + ai2 x2 + · · · + ain xn ≤ bi , 1 ≤ i ≤ m1 ,
(1.17) ai1 x1 + ai2 x2 + · · · + ain xn = bi , m1 + 1 ≤ i ≤ m .
Here, ck , aik , bi are given real numbers. The function cT x is called
the objective function, and any x ∈ Rn which satisfies (1.16),(1.17) is
referred to as a feasible point. Introducing additional variables and
equations, the linear program (1.15),(1.16),(1.17) can be transformed
into a form where the constraints only consist of equations and elemen-
tary inequalities of the form xi ≥ 0. Moreover, it is useful to cast the
objective functional cT x in the form cT x ≡ −xp . For that purpose, in
(1.16) any inequality
ai1 x1 + · · · + ain xn ≤ bi
is replaced with an equation and an elementary inequality by means of
the slack variable xn+i
ai1 x1 + · · · + ain xn + xn+i = bi , xn+i ≥ 0 .
If the objective functional c1 x1 + · · · + cn xn does not have the required
form, we introduce an additional variable xn+m1 +1 and an additional
equation according to
c1 x1 + · · · + cn xn + xn+m1 +1 = 0
which has to be added to the other constraints. Then, the minimization
of cT x is equivalent to the maximization of xp under the thus extended
constraints.
Therefore, without restriction of generality we may assume that the
linear program has the following standard form:
(1.18)   LP(I, p) :  maximize x_p ,
(1.19)   x ∈ R^n :  Ax = b ,
(1.20)   x_i ≥ 0   for i ∈ I .
Here, I ⊂ N := {1, 2, . . . , n + m1 + 1} is an index set (possibly empty),
p stands for a fixed index with p ∈ N \I, A = (a1 |a2 | . . . |an+m1 +1 ) is
a real (m + 1) × (n + m1 + 1)-matrix with columns ai and b ∈ Rm+1

is a given vector. The variables x_i, i ∈ I, resp. x_i, i ∉ I, are called constrained resp. free (unconstrained) variables.
We denote by
P := {x ∈ Rn | Ax = b & xi ≥ 0 for all i ∈ I}
the feasible set of LP (I, p). A vector x̄ ∈ P is said to be the optimal
solution of LP (I, p), if x̄p = max{xp | x ∈ P }.
To illustrate the ideas, we use the following example:
minimize − x1 − 2x2
x: − x1 + x2 ≤ 2
x1 + x2 ≤ 4
x1 ≥ 0, x2 ≥ 0.
After introducing the slack variables x3 , x4 and the additional variable x5
for the objective functional, the problem can be stated in standard form
LP (I, p) with p = 5, I = {1, 2, 3, 4}:
maximize x5
x: − x1 + x2 + x3 =2
x1 + x2 + x4 =4
− x1 − 2x2 + x5 =0
xi ≥ 0 for i ≤ 4.
The problem can be graphically displayed in R2 . The shaded set P in Figure
1 is a polygon.
[Figure 1. Feasible set: the shaded polygon P in the (x_1, x_2)-plane, bounded by the lines x_1 = 0, x_2 = 0, x_3 = 0, x_4 = 0, with vertices labeled A, B, C, D and the infeasible point E.]

We first consider the linear system Ax = b of LP (I, p). For a vector


J = (j1 , . . . , jr ), ji ∈ N , of indices we denote by AJ := (aj1 | . . . |ajr ) the

submatrix of A with columns aji ; xJ refers to the vector (xj1 , . . . , xjr )T . For
notational simplicity, the set {ji | i = 1, 2, . . . , r} of components of J will
also be denoted by J, and we will use the notation p ∈ J, if there exists t
such that p = jt . We define
Definition 1.1: An index vector J = (j1 , . . . , jM ) with M := m+1 different
indices ji ∈ N is called a basis of Ax = b resp. of LP (I, p), if AJ is regular.
Obviously, A has a basis if and only if the rows of A are linearly independent.
Besides J, sometimes A_J is referred to as a basis as well; the variables x_i with i ∈ J are called basis variables, the other variables x_k (indices k) with k ∉ J are said to be non-basis variables (non-basis indices). In case an index vector K contains all non-basis indices, we will write J ⊕ K = N.
In the previous example, JA := (3, 4, 5), JB := (4, 5, 2) are bases.
To a basis J, J ⊕ K = N we assign a uniquely determined solution
x̄ = x̄(J) of Ax = b, called a basis solution, with the property x̄K = 0. Since
Ax̄ = AJ x̄J + AK x̄K = AJ x̄J = b ,
x̄ is given by

(1.21)   x̄_J := b̄ ,   x̄_K := 0 ,   where b̄ := A_J^{-1} b .

Moreover, for any basis J, a solution x of Ax = b is uniquely determined by its non-basis part x_K and its basis part x̄: this follows from multiplying Ax = A_J x_J + A_K x_K = b by A_J^{-1} and using (1.21):

(1.22)   x_J = b̄ − A_J^{-1} A_K x_K = x̄_J − A_J^{-1} A_K x_K .

Choosing x_K arbitrarily and defining x_J, and hence x, by (1.22), we obtain a solution of Ax = b. Thus (1.22) provides a parametrization of the solution set {x | Ax = b} via the parameters x_K.
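As a small numerical illustration of (1.21) (a check added here, not part of the notes), the basis solutions for the bases J_A = (3, 4, 5) and J_B = (4, 5, 2) of the example above can be computed directly:

import numpy as np

# constraint matrix and right-hand side of the example in standard form
A = np.array([[-1.,  1., 1., 0., 0.],
              [ 1.,  1., 0., 1., 0.],
              [-1., -2., 0., 0., 1.]])
b = np.array([2., 4., 0.])

for J in [(3, 4, 5), (4, 5, 2)]:                 # bases J_A and J_B (1-based indices)
    cols = [j - 1 for j in J]                    # 0-based column indices
    x_bar = np.zeros(5)
    x_bar[cols] = np.linalg.solve(A[:, cols], b) # (1.21): x_bar_J = A_J^{-1} b, x_bar_K = 0
    print(J, x_bar)                              # J_A -> (0,0,2,4,0), J_B -> (0,2,0,2,4)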
If the basis solution x̄ associated with the basis J of Ax = b is a feasible
solution of LP (I, p), x̄ ∈ P , i.e., due to x̄K = 0 there holds
(1.23) x̄i ≥ 0 for all i ∈ I ∩ J ,
J is called a feasible basis of LP (I, p) and x̄ is said to be a feasible basis
solution. Moreover, a feasible basis is called non-degenerate, if instead of
(1.23) the sharper condition
(1.24) x̄i > 0 for all i ∈ I ∩ J.
holds true. The linear program LP (I, p) is said to be non-degenerate, if all
feasible bases J of LP (I, p) are non-degenerate.
Geometrically, the feasible basis solutions of the different bases of LP (I, p)
correspond to the vertices of the polyhedron P of feasible solutions, provided
the set of vertices of P is non-empty. In the example (cf. Fig. 1), the
vertex A ∈ P corresponds to the feasible basis JA = (3, 4, 5), since A is
determined by x1 = x2 = 0 and {1, 2} is the complementary set of JA with
respect to N = {1, 2, 3, 4, 5}; B corresponds to JB = (4, 5, 2), C corresponds

to JC = (1, 2, 5) etc. The basis JE = (1, 4, 5) is non-feasible, since the


associated basis solution E is not feasible (E ∉ P).
The simplex method for the solution of linear programs due to G.B. Dantzig
proceeds as follows: Starting from a feasible basis J0 of LP (I, p) with p ∈ J0 ,
the simplex steps
Ji → Ji+1
recursively generate a sequence {Ji } of additional feasible bases Ji of LP (I, p)
with p ∈ Ji and the following property: The values x̄(Ji )p of the objective
functional associated with the feasible basis solutions x̄(Ji ) w.r.t. Ji are
non-decreasing
x̄(Ji )p ≤ x̄(Ji+1 )p for i ≥ 0 .
If the bases Ji are non-degenerate and LP (I, p) admits an optimal solution,
the sequence {Ji } terminates after finitely many steps with a basis JL whose
basis solution x̄(JL ) is an optimal solution of LP (I, p), and we have
x̄(Ji )p < x̄(Ji+1 )p for 0 ≤ i ≤ L − 1 .
Furthermore, two subsequent bases J = (j_1, . . . , j_M) and J̃ = (j̃_1, . . . , j̃_M) are neighbors in the following sense: J and J̃ possess exactly M − 1 common components. J and J̃ are related by an exchange of indices: there exist exactly two indices q, s ∈ N with q ∈ J, s ∉ J and q ∉ J̃, s ∈ J̃, i.e.,
J̃ = (J ∪ {s}) \ {q}.
For non-degenerate problems, neighboring feasible bases correspond geo-
metrically to neighboring vertices of P . In the example (cf. Fig. 1), we have
that JA = (3, 4, 5) and JB = (4, 5, 2) are neighbors, A and B are neighboring
vertices.

1.2.1 Phase II of the simplex method


The simplex method, or more precisely 'phase II' of the simplex method, assumes knowledge of a feasible basis J of LP(I, p) with p ∈ J. Such a feasible basis can be obtained by the so-called 'phase I', provided LP(I, p)
has a feasible solution. We first describe a typical step of phase II which
starts from a feasible basis J and generates a feasible neighboring basis J˜
of LP (I, p):
Simplex step: Assumption: J = (j1 , . . . , jM ) is a feasible basis of LP (I, p)
with p = jt ∈ J, J ⊕ K = N .
Step 1: Compute the vector

b̄ := A_J^{-1} b

and thus the basis solution x̄ associated with J satisfying x̄_J := b̄, x̄_K := 0.
Step 2: Compute the row vector

π := e_t^T A_J^{-1} ,

where et = (0, . . . , 1, . . . , 0)T ∈ RM is the t-th unit vector in RM . Using π,


compute the numbers
ck := πak , k ∈ K.

Step 3: Check whether


(1.25) ck ≥ 0 for all k ∈ K ∩ I
and ck = 0 for all k ∈ K \ I

If yes, stop: the basis solution x̄ is the optimal solution of LP (I, p).
If no, determine s ∈ K such that
(1.26)   c_s = min { c_k < 0 | k ∈ K ∩ I }   or   |c_s| = max { |c_k| ≠ 0 | k ∈ K \ I }

and set σ := −sign(cs ).


Step 4: Compute the vector
ā := (ᾱ_1, ᾱ_2, . . . , ᾱ_M)^T := A_J^{-1} a_s .

Step 5: If
(1.27) σ ᾱi ≤ 0 for all i with ji ∈ I,

stop: LP (I, p) does not admit a finite optimum. Otherwise,


Step 6: determine an index r with jr ∈ I, σ ᾱr > 0 and
b̄_r / (σ ᾱ_r)  =  min { b̄_i / (σ ᾱ_i)  |  i : j_i ∈ I and σ ᾱ_i > 0 } .

Step 7: Choose as J˜ any suitable index vector with


J˜ := (J ∪ {s}) \ {jr },
for instance
J˜ := (j1 , . . . , jr−1 , s, jr+1 , . . . , jM )
or
J˜ := (j1 , . . . , jr−1 , jr+1 , . . . , jM , s).

We will motivate these rules and assume that J = (j1 , . . . , jM ) is a feasible


basis of LP(I, p) with p = j_t ∈ J and J ⊕ K = N. In view of (1.21), step 1 provides the associated basis solution x̄ = x̄(J). Since all solutions of Ax = b

can be represented in the form (1.22), due to p = jt , the objective functional


satisfies
(1.28)   x_p = e_t^T x_J = x̄_p − e_t^T A_J^{-1} A_K x_K = x̄_p − π A_K x_K = x̄_p − c_K x_K ,
if the row vector π and the components ck of the row vector cK , cTK ∈ Rn−m
are chosen as in step 2. cK is called the vector of reduced costs:
In view of (1.28), ck , k ∈ K stands for the amount by which the objective
functional xp decreases, if xk is enlarged by one unit. Hence, if (1.25) is
satisfied (cf. step 3), for each feasible solution x of LP (I, p), (1.28) and
xi ≥ 0, i ∈ I imply
x_p = x̄_p − ∑_{k∈K∩I} c_k x_k ≤ x̄_p ,

i.e., the basis solution x̄ is the optimal solution of LP (I, p). This motivates
the test (1.25) and the assertion a) of step 3. If (1.25) does not hold true, there exists an index s ∈ K for which either
(1.29)   c_s < 0 ,   s ∈ K ∩ I ,
or
(1.30)   |c_s| ≠ 0 ,   s ∈ K \ I .
Assume that s is such an index. We set σ := −sign(cs ). Since due to
(1.28) an increase in σxs yields an increase in the objective functional xp ,
we consider the following family of vectors x(θ) ∈ Rn+m1 +1 , θ ∈ R,
(1.31)   x(θ)_J := b̄ − θσ A_J^{-1} a_s = b̄ − θσ ā ,
(1.32)   x(θ)_s := θσ ,
(1.33)   x(θ)_k := 0   for k ∈ K, k ≠ s .
Here, ā := A_J^{-1} a_s is chosen as in step 4.
In the example we have I = {1, 2, 3, 4}, and J_0 = J_A = (3, 4, 5) is a feasible basis, K_0 = (1, 2), p = 5 ∈ J_0, t_0 = 3. We obtain:

A_{J_0} = I_3  (the 3 × 3 unit matrix) ,   b̄ = (2, 4, 0)^T ,

x̄(J_0) = (0, 0, 2, 4, 0)^T (corresponding to point A in Fig. 1) and π A_{J_0} = e_{t_0}^T ⇒ π = (0, 0, 1).
The reduced costs are c_1 = π a_1 = −1, c_2 = π a_2 = −2. Hence, J_0 is not optimal. Choosing in step 3 the index s = 2, we obtain

ā = A_{J_0}^{-1} a_2 = (1, 1, −2)^T .

The family of solutions x(θ) is given by


x(θ) = (0, θ, 2 − θ, 4 − θ, 2θ)T .
Geometrically, in Fig. 1 x(θ), θ ≥ 0 describes a ray pointing from vertex
A (θ = 0) towards the neighboring vertex B (θ = 2) along an edge of the
polyhedron P.
In view of (1.22), we have Ax(θ) = b for all θ ∈ R. In particular, observing
(1.28), x̄ = x(0) and the choice of σ
(1.34) x(θ)p = x̄p − cs x(θ)s = x̄p + θ |cs | ,
such that the objective functional increases monotonically along the ray
x(θ). Consequently, among the solutions x(θ) of Ax = b we pick the best
feasible solution, i.e., we are looking for the largest θ ≥ 0 with
x(θ)l ≥ 0 for all l ∈ I.
Taking (1.33) into account, this is equivalent to choosing the largest θ ≥ 0
with
(1.35) x(θ)ji ≡ b̄i − θσ ᾱi ≥ 0 for all i with ji ∈ I ,
since x(θ)k ≥ 0, k ∈ K ∩ I, θ ≥ 0 is automatically satisfied due to (1.33).
Now, if σ ᾱi ≤ 0 for all ji ∈ I (cf. step 5 ), (1.35) implies that x(θ), θ ≥ 0 is a
feasible solution of LP (I, p) with sup{x(θ)p | θ ≥ 0} = +∞: Then, LP (I, p)
does not have a finite optimum. This justifies step 5. Otherwise, there is a
largest θ =: θ̄ such that (1.35) holds true:
θ̄ = b̄_r / (σ ᾱ_r) = min { b̄_i / (σ ᾱ_i)  |  i : j_i ∈ I and σ ᾱ_i > 0 } .
This determines an index r with jr ∈ I, σ ᾱr > 0 and
(1.36) x(θ̄)jr = b̄r − θ̄σ ᾱr = 0, x(θ̄) is a feasible solution.
In the example we have
θ̄ = 2 = b̄_1/ᾱ_1 = min { b̄_1/ᾱ_1 , b̄_2/ᾱ_2 } ,   r = 1 .
x(θ̄) = (0, 2, 0, 2, 4)T corresponds to the vertex B in Fig. 1.
Due to the feasibility of J we have θ̄ ≥ 0, and (1.34) implies
x(θ̄)p ≥ x̄p .
If J is non-degenerate, there holds θ̄ > 0, and hence
x(θ̄)p > x̄p .
In view of (1.22), (1.33), and (1.36), x = x(θ̄) is the uniquely determined solution of Ax = b with the additional property
x_{j_r} = 0 ,   x_k = 0 for k ∈ K, k ≠ s ,

i.e., x_K̃ = 0, K̃ := (K ∪ {j_r}) \ {s}. The uniqueness of x implies the regularity of A_J̃, J̃ := (J ∪ {s}) \ {j_r}. Hence, x(θ̄) = x̄(J̃) is the basis solution associated with the neighboring feasible basis J̃, and there holds

(1.37)   x̄(J̃)_p > x̄(J)_p ,   if J is non-degenerate,
(1.38)   x̄(J̃)_p ≥ x̄(J)_p ,   otherwise.
In the example we obtain the new basis
J1 = (2, 4, 5) = JB , K1 = (1, 3) ,
which corresponds to the vertex B in Fig. 1. With regard to the objective
functional x5 , we have that B is ’better’ than A : x̄(JB )5 = 4 > x̄(JA )5 = 0.
According to the definition of r, we always have j_r ∈ I, whence

J \ I ⊂ J̃ \ I ,

i.e., in the step J → J̃ only constrained variables x_{j_r}, j_r ∈ I, are eliminated from the basis. As soon as a free variable x_s, s ∉ I, has become a basis variable, it remains a basis variable in all subsequent simplex steps. In particular, p ∈ J̃ due to p ∈ J and p ∉ I. The new basis J̃ satisfies again the assumption of the simplex step, so that the simplex step can be applied to J̃. Starting from a first feasible basis J_0 of LP(I, p)
with p ∈ J0 , we thus obtain a sequence
J0 → J1 → J2 → · · ·
of feasible bases J_i of LP(I, p) with p ∈ J_i, for which in case of non-degeneracy of all J_i there holds
x̄(J0 )p < x̄(J1 )p < x̄(J2 )p < · · · .
In this case, Ji will not occur again. Since there are only finitely many
index vectors J, the method must terminate after finitely many steps. Con-
sequently, for the simplex method we have shown:
Theorem 1.1 Assume that J0 is a feasible basis of LP (I, p) with p ∈ J0 . If
LP (I, p) is non-degenerate, starting from J0 , the simplex method generates
a finite sequence of feasible bases Ji of LP (I, p) with p ∈ Ji and x̄(Ji )p <
x̄(Ji+1 )p . The last basis solution either is an optimal solution of LP (I, p)
or LP (I, p) does not admit a finite optimum.
We proceed with the example: As a result of the first simplex step we
have obtained the new feasible basis J1 = (2, 4, 5) = JB , K1 = (1, 3), t1 = 3,
such that

A_{J_1} = [ 1 0 0 ; 1 1 0 ; −2 0 1 ] ,   b̄ = (2, 2, 4)^T ,   x̄(J_1) = (0, 2, 0, 2, 4)^T (corresponding to vertex B),

π A_{J_1} = e_{t_1}^T ⇒ π = (2, 0, 1) .

The reduced costs are c_1 = π a_1 = −3, c_3 = π a_3 = 2. Hence, J_1 is not optimal:

s = 1 ,   ā = A_{J_1}^{-1} a_1 = (−1, 2, −3)^T   ⇒   r = 2 .

Therefore,

J_2 = (2, 1, 5) = J_C ,   K_2 = (3, 4) ,   t_2 = 3 ,
A_{J_2} = [ 1 −1 0 ; 1 1 0 ; −2 −1 1 ] ,   b̄ = (3, 1, 7)^T ,
x̄(J_2) = (1, 3, 0, 0, 7)^T (corresponding to vertex C),
π A_{J_2} = e_{t_2}^T ⇒ π = (1/2, 3/2, 1) .

The reduced costs are c_3 = π a_3 = 1/2 > 0, c_4 = π a_4 = 3/2 > 0.
The optimality criterion is satisfied, and hence, x̄(J2 ) is optimal, i.e.,
x̄1 = 1, x̄2 = 3, x̄3 = 0, x̄4 = 0, x̄5 = 7. The optimal value of the objective
functional x5 is given by x̄5 = 7.
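The step rules above can be turned into code almost verbatim. The following sketch is not from the notes; the function name, tolerances, and 0-based indexing are choices made here. It applies Steps 1-7 with dense numpy linear algebra to the example LP and reproduces the iterates J_A → J_B → J_C:

import numpy as np

def simplex_step(A, b, J, K, t, I):
    """One phase-II step (Steps 1-7 above) for LP(I, p) with p = J[t].
    Returns ('optimal', x_bar), ('unbounded', None), or ('step', J_new, K_new)."""
    AJ = A[:, J]
    b_bar = np.linalg.solve(AJ, b)                 # Step 1: basis solution
    pi = np.linalg.solve(AJ.T, np.eye(len(J))[t])  # Step 2: pi = e_t^T A_J^{-1}
    c_red = {k: pi @ A[:, k] for k in K}           # reduced costs c_k = pi a_k
    # Step 3: optimality test
    if all(c_red[k] >= -1e-12 for k in K if k in I) and \
       all(abs(c_red[k]) <= 1e-12 for k in K if k not in I):
        x_bar = np.zeros(A.shape[1]); x_bar[J] = b_bar
        return ('optimal', x_bar)
    # entering index s: most negative reduced cost among constrained non-basis indices
    s = min((k for k in K if k in I and c_red[k] < 0), key=lambda k: c_red[k], default=None)
    if s is None:                                  # else a free non-basis index with c_k != 0
        s = max((k for k in K if k not in I), key=lambda k: abs(c_red[k]))
    sigma = -np.sign(c_red[s])
    a_bar = np.linalg.solve(AJ, A[:, s])           # Step 4
    ratios = [(b_bar[i] / (sigma * a_bar[i]), i)   # Step 6: ratio test
              for i in range(len(J)) if J[i] in I and sigma * a_bar[i] > 1e-12]
    if not ratios:                                 # Step 5: no finite optimum
        return ('unbounded', None)
    _, r = min(ratios)
    J_new = J[:r] + [s] + J[r+1:]                  # Step 7: exchange j_r against s
    K_new = [k for k in K if k != s] + [J[r]]
    return ('step', J_new, K_new)

# example from the text (0-based: x1..x5 -> 0..4, p = x5 at position t = 2 in J)
A = np.array([[-1., 1., 1., 0., 0.],
              [ 1., 1., 0., 1., 0.],
              [-1.,-2., 0., 0., 1.]])
b = np.array([2., 4., 0.])
I = {0, 1, 2, 3}                      # constrained variables x1..x4
J, K, t = [2, 3, 4], [0, 1], 2        # start basis J_A = (3, 4, 5)

while True:
    result = simplex_step(A, b, J, K, t, I)
    if result[0] != 'step':
        break
    _, J, K = result
print(result[0], result[1])   # expected: optimal with x = (1, 3, 0, 0, 7)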
In a practical implementation of the simplex method, each step J → J̃ requires the solution of three linear systems with the matrix A_J:
(1.39)   A_J b̄ = b   ⇒  b̄   (Step 1),
(1.40)   π A_J = e_t^T   ⇒  π   (Step 2),
(1.41)   A_J ā = a_s   ⇒  ā   (Step 4).
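In practice one factorizes A_J once and reuses the factors for all three systems. Here is a minimal sketch using an LU decomposition (scipy's lu_factor/lu_solve stand in for the F A_J = R decomposition discussed below; this is an added illustration, not the exact machinery of the notes), reusing the small example's data:

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def solve_simplex_systems(A, J, t, s, b):
    """Solve (1.39)-(1.41) with a single factorization of A_J (0-based indices)."""
    AJ = A[:, J]
    lu, piv = lu_factor(AJ)                  # factorize A_J once
    b_bar = lu_solve((lu, piv), b)           # (1.39): A_J b_bar = b
    e_t = np.zeros(len(J)); e_t[t] = 1.0
    pi = lu_solve((lu, piv), e_t, trans=1)   # (1.40): pi A_J = e_t^T, i.e. A_J^T pi = e_t
    a_bar = lu_solve((lu, piv), A[:, s])     # (1.41): A_J a_bar = a_s
    return b_bar, pi, a_bar

# first simplex step of the example: J_0 = (3,4,5), p = x5, entering column x2
A = np.array([[-1., 1., 1., 0., 0.],
              [ 1., 1., 0., 1., 0.],
              [-1.,-2., 0., 0., 1.]])
b = np.array([2., 4., 0.])
print(solve_simplex_systems(A, [2, 3, 4], 2, 1, b))   # b_bar=(2,4,0), pi=(0,0,1), a_bar=(1,1,-2)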

The computational work for the successive bases J → J̃ → · · · can be significantly reduced by taking into account that J and J̃ are neighbors: the new basis matrix A_J̃ is obtained from A_J by replacing a column of A_J
with another column of A. This can be used, e.g., in case of decompositions
of the basis matrix AJ of type
F AJ = R, F regular, R upper triangular matrix.
Taking advantage of such a decomposition, the linear systems (1.39)-(1.41) can be easily solved:
R b̄ = F b   ⇒  b̄ ,
R^T z = e_t   ⇒  z   ⇒  π = z^T F ,
R ā = F a_s   ⇒  ā .
Moreover, by means of a decomposition F A_J = R of A_J, in each simplex step a similar decomposition F̃ A_J̃ = R̃ can easily be computed for the neighboring basis J̃ = (j_1, . . . , j_{r−1}, j_{r+1}, . . . , j_M, s) (cf. Step 7): The matrix

F A_J̃ is an upper Hessenberg matrix of the form (shown here for M = 4, r = 2)

F A_J̃ =  [ x  x  x  x ]
          [    x  x  x ]
          [    x  x  x ]
          [       x  x ]   =: R_0 .
The subdiagonal elements can easily be eliminated by elimination matrices E_{r,r+1}, E_{r+1,r+2}, . . . , E_{M−1,M}, and hence R_0 can be transformed into an upper triangular matrix R̃:
F̃ A_J̃ = R̃ ,   F̃ := E F ,   R̃ := E R_0 ,   E := E_{M−1,M} E_{M−2,M−1} · · · E_{r,r+1} .
Therefore, it seems reasonable to implement the simplex method in such a way that in each simplex step J → J̃ a 4-tuple M = {J; t; F, R} with
the property
jt = p, F AJ = R,

is transformed into a similar 4-tuple M̃ = {J̃; t̃; F̃, R̃}. Besides a feasible


basis J0 with p ∈ J0 of LP (I, p), the initialization of this variant of the
simplex method also requires a decomposition F0 AJ0 = R0 of AJ0 .
Instead of the decomposition F A_J = R, standard implementations of the simplex method use other quantities which enable an efficient solution of the linear systems (1.39)-(1.41). The so-called 'Inverse-Basis-Method' uses 5-tuples of the form
the form
M̂ = {J; t; B, b̄, π}
with
j_t = p ,   B := A_J^{-1} ,   b̄ := A_J^{-1} b ,   π := e_t^T A_J^{-1} .

Another variant implements 5-tuples


M̄ = {J; t; Ā, b̄, π}
with
j_t = p ,   Ā := A_J^{-1} A_K ,   b̄ := A_J^{-1} b ,   π := e_t^T A_J^{-1} ,   J ⊕ K = N .

In the transition J → J̃, the computational work can be reduced by exploiting that for neighboring bases J, J̃ the inverse A_J̃^{-1} can be computed by multiplying A_J^{-1} with a suitable Frobenius matrix G: A_J̃^{-1} = G A_J^{-1}. Here, the computational work is even a little bit less than in case of the decomposition F A_J = R. However, a serious drawback is the numerical instability: if a basis matrix A_{J_i} is ill conditioned, the inevitable large errors in A_{J_i}^{-1} and A_{J_i}^{-1} A_{K_i} are carried over into M̂_i and M̄_i and into all subsequent 5-tuples M̂_j, M̄_j, j > i.
The following practical example illustrates the gain in numerical stability when the triangular decomposition is employed instead of the 'Inverse-Basis-Method'. Consider a linear program with constraints of the form

(1.42)   Ax = b ,   A = (A^{(1)}, A^{(2)}) ,   x ≥ 0 .

The matrix A is chosen as the 5 × 10 matrix given by the 5 × 5 submatrices A^{(1)}, A^{(2)} with

A^{(1)} = (a^{(1)}_{ik}) ,   a^{(1)}_{ik} := 1/(i + k) ,   i, k = 1, . . . , 5 ,
A^{(2)} := I_5   (the 5 × 5 unit matrix) .

Here, A^{(1)} is badly conditioned, whereas A^{(2)} is well conditioned. The right-hand side is chosen as the vector b := A^{(1)} e, e := (1, 1, 1, 1, 1)^T, i.e.,

b_i := ∑_{k=1}^{5} 1/(i + k) ,

so that both bases J_1 := (1, 2, 3, 4, 5), J_2 := (6, 7, 8, 9, 10) are feasible for (1.42) with the basis solutions

(1.43)   x̄(J_1) := (b̄_1, 0) ,   b̄_1 := A_{J_1}^{-1} b = (A^{(1)})^{-1} b = e ,
         x̄(J_2) := (0, b̄_2) ,   b̄_2 := A_{J_2}^{-1} b = (A^{(2)})^{-1} b = b .

As a start basis we choose J2 = (6, 7, 8, 9, 10) and transform it by the Inverse-


Basis-Method resp. the triangular decomposition method into the new basis
J1 and then, using another sequence of exchange steps, return to the start
basis J2 :

J2 → · · · → J1 → · · · → J2 .

For the associated basis solutions (1.43), this cycling process yields the fol-
lowing results (machine accuracy eps ≈ 10−11 , inexact digits are underlined):

Basis exact basis solution Inverse-Basis-Method Triangular decomposition


1.4500000000E + 00 1.4500000000E + 00 1.4500000000E + 00
1.0928571428E + 00 1.0928571428E + 00 1.0928571428E + 00
J2 b̄2 = 8.8452380952E − 01 8.8452380952E − 01 8.8452380952E − 01
7.4563492063E − 01 7.4563492063E − 01 7.4563492063E − 01
6.4563492063E − 01 6.4563492063E − 01 6.4563492063E − 01
1 1.0000000182E + 00 1.0000000786E + 00
1 9.9999984079E − 01 9.9999916035E − 01
J1 b̄1 = 1 1.0000004372E + 00 1.0000027212E + 00
1 9.9999952118E − 01 9.9999956491E − 01
1 1.0000001826E + 00 1.0000014837E + 00
1.4500000000E + 00 1.4500010511E + 00 1.4500000000E + 00
1.0928571428E + 00 1.0928579972E + 00 1.0928571427E + 00
J2 b̄2 = 8.8452380952E − 01 8.8452453057E − 01 8.8452380950E − 01
7.4563492063E − 01 7.4563554473E − 01 7.4563492060E − 01
6.4563492063E − 01 6.4563547103E − 01 6.4563492059E − 01

We obtain the following result: Due to AJ2 = I5 , at the beginning both


methods provide the exact solution. For the basis J1 , both methods yield
the same inexact results which reflect the bad condition of AJ1 . This can not
be avoided, unless the computations are carried out with higher accuracy.
After the step with the badly conditioned basis matrix AJ1 , the situation
changes drastically in favor of the triangular decomposition method. This
method provides the basis solution associated with J2 practically at machine
accuracy, whereas the Inverse-Basis-Method reproduces the solution with
the same inaccuracy as in case of the previous basis J1 . In case of the
Inverse-Basis-Method, all subsequent bases inherit the bad condition of the
basis matrix AJ .

1.2.2 Phase I of the simplex method


The initialization of phase II of the simplex method requires a feasible ba-
sis J0 of LP (I, p) with p = jt0 ∈ J0 resp. an associated 4-tuple M0 =
{J0 ; t0 ; F0 , R0 }, where the regular matrix F0 and the regular upper trian-
gular matrix R0 provide a decomposition F0 AJ0 = R0 of the basis matrix
AJ0 .
In some special cases, it is easy to find a feasible basis J0 (M0 ), e.g., when
we are faced with a linear program of the following form:

minimize c1 x1 + · · · + cn xn
n
x ∈ R : ai1 x1 + · · · + ain xn ≤ bi , i = 1, 2, . . . , m
xi ≥ 0 for i ∈ I1 ⊂ {1, 2, . . . , n},

where bi ≥ 0 for i = 1, 2, . . . , m. Introducing slack variables, we obtain the


equivalent problem
maximize x_{n+m+1}
x ∈ R^{n+m+1} :   a_{i1} x_1 + · · · + a_{in} x_n + x_{n+i} = b_i ,   i = 1, 2, . . . , m ,
                  c_1 x_1 + · · · + c_n x_n + x_{n+m+1} = 0 ,
                  x_i ≥ 0   for i ∈ I_1 ∪ {n + 1, n + 2, . . . , n + m} ,
representing the standard form LP(I, p) of the previous section with

A = [ a_11  · · ·  a_1n   1              ]         [ b_1 ]
    [   ⋮            ⋮          ⋱        ]     b = [  ⋮  ]
    [ a_m1  · · ·  a_mn              1   ]         [ b_m ]
    [ c_1   · · ·  c_n                 1 ] ,       [  0  ] ,

p := n + m + 1 ,   I := I_1 ∪ {n + 1, n + 2, . . . , n + m} .
In view of bi ≥ 0, we have that J0 := (n + 1, n + 2, . . . , n + m + 1) is a feasible
basis with p = jt ∈ J0 , t := m + 1. A corresponding M0 = (J0 ; t0 ; F0 , R0 )
is given by t0 := m + 1, F0 := R0 := Im+1 , where Im+1 denotes the unit
matrix with m + 1 rows.
For more general linear programs (P ), the so-called ’phase I’ provides a
feasible basis. This phase is based on techniques where phase II is applied
to a modified linear program (P̃ ) which is such that a feasible start basis is
known, i.e., (P̃ ) can be solved by means of phase II and each optimal basis
of (P̃ ) yields a feasible start basis for (P ). Here, we will only describe one
of these techniques. We refer to the literature on linear programming with
respect to other techniques.
We consider a linear program of the following form:
(1.44) minimize c1 x1 + · · · + cn xn
x ∈ Rn : aj1 x1 + · · · + ajn xn = bj , j = 1, 2, . . . , m
xi ≥ 0 for i ∈ I ⊆ {1, 2, . . . , n} ,
and assume that bj ≥ 0 holds true for all j (multiply the j-th constraint by
−1, if bj < 0).
First, we extend the constraints by introducing artificial variables xn+1 ,
. . . , xn+m ,
a11 x1 +···+ a1n xn +xn+1 = b1
(1.45) .. .. .. ..
. . . .
am1 x1 + · · · + amn xn +xn+m = bm
xi ≥ 0 for i ∈ I ∪ {n + 1, . . . , n + m} .
Obviously, the feasible solutions of (1.44) are uniquely assigned to those
feasible solutions of (1.45) for which the artificial variables are zero:
(1.46) xn+1 = xn+2 = · · · = xn+m = 0 .

Now, we design a maximization problem with the constraints (1.45) whose


optimal solutions satisfy (1.46), provided (1.44) admits feasible solutions.
For this purpose, we consider LP(Î, p̂):
maximize xn+m+1
x: a11 x1 +···+ a1n xn +xn+1 = b1
.. .. .. ..
. . . .
am1 x1 + · · · + amn xn +xn+m = bm
xn+1 + ··· +xn+m +xn+m+1 = 0
xi ≥ 0 for i ∈ Iˆ := I ∪ {n + 1, . . . , n + m}, p̂ := n + m + 1

A possible start basis for this problem is Ĵ_0 := (n + 1, . . . , n + m + 1), which is feasible, since the associated basis solution x̄ with

x̄_j = 0 ,   x̄_{n+i} = b_i ,   x̄_{n+m+1} = − ∑_{i=1}^{m} b_i ,   for 1 ≤ j ≤ n, 1 ≤ i ≤ m,

is feasible due to b_j ≥ 0.
A 4-tuple M_0 = {Ĵ_0; t̂_0; F̂_0, R̂_0} corresponding to Ĵ_0 is given by

t̂_0 := m + 1 ,   F̂_0 := [  I_m         0 ]        R̂_0 := I_{m+1} ,
                         [ −1 · · · −1   1 ] ,

where I_m and I_{m+1} denote the unit matrices with m and m + 1 rows, respectively.
Now, for the solution of LP(Î, p̂), phase II of the simplex method can be launched. Due to x_{n+m+1} = − ∑_{i=1}^{m} x_{n+i} ≤ 0, LP(Î, p̂) has a finite maximum and hence, phase II provides an optimal basis J̄ and the associated basis solution x̄ = x̄(J̄), which is the optimal solution of LP(Î, p̂).
We distinguish the three cases:
1: x̄_{n+m+1} < 0, i.e., (1.46) does not hold true for x̄,
2: x̄_{n+m+1} = 0 and no artificial variable is a basis variable,
3: x̄_{n+m+1} = 0 and there exists an artificial variable in J̄.
In case 1, (1.44) is not solvable, since any feasible solution corresponds to a feasible solution of LP(Î, p̂) with x_{n+m+1} = 0. In case 2, the optimal basis J̄ of LP(Î, p̂) readily gives a feasible start basis for phase II of the simplex method. Case 3 represents a degenerate problem, since the artificial variables in the basis J̄ are zero. If necessary, by a renumbering of the equations and the artificial variables we may achieve that the artificial variables in the basis J̄ are the variables x_{n+1}, x_{n+2}, . . . , x_{n+k}. In LP(Î, p̂), we then eliminate the remaining artificial variables which are not in J̄ and, instead of x_{n+m+1}, introduce a new variable x_{n+k+1} := −x_{n+1} − · · · − x_{n+k} and a new variable x_{n+k+2} for the objective functional. The optimal basis J̄ of LP(Î, p̂) yields a feasible start basis J̄ ∪ {x_{n+k+2}} for the problem

equivalent to (1.44)

maximize xn+k+2
x: a11 x1 +···+ a1n xn +xn+1 = b1
.. .. .. ..
. . . .
ak1 x1 +···+ akn xn +xn+k = bk
xn+1 + ··· +xn+k +xn+k+1 =0
ak+1,1 x1 +···+ ak+1,n xn = bk+1
.. .. ..
. . .
am1 x1 +···+ amn xn = bm
c1 x 1 +···+ cn xn +xn+k+2 =0
x_i ≥ 0   for i ∈ I ∪ {n + 1, . . . , n + k + 1} .
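A compact way to run this phase-I feasibility test numerically is to minimize the sum of the artificial variables directly. The sketch below is an added illustration under the simplifying assumption that all variables are constrained (I = {1, . . . , n}); it is not part of the notes:

import numpy as np
from scipy.optimize import linprog

def phase_one(A, b):
    """Phase I for Ax = b, x >= 0: minimize the sum of artificial variables a
    subject to Ax + a = b, x >= 0, a >= 0 (rows with b_j < 0 are flipped first)."""
    A, b = np.array(A, dtype=float), np.array(b, dtype=float)
    flip = b < 0
    A[flip], b[flip] = -A[flip], -b[flip]          # enforce b >= 0
    m, n = A.shape
    cost = np.concatenate([np.zeros(n), np.ones(m)])       # sum of artificials
    res = linprog(cost, A_eq=np.hstack([A, np.eye(m)]), b_eq=b,
                  bounds=(0, None), method="highs")
    feasible = res.fun < 1e-9                      # case 1 above iff this fails
    return feasible, res.x[:n]

# feasible set of the earlier example: -x1 + x2 + x3 = 2, x1 + x2 + x4 = 4, x >= 0
ok, x0 = phase_one([[-1, 1, 1, 0], [1, 1, 0, 1]], [2, 4])
print(ok, x0)    # True and a feasible point of the constraints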

1.3 Primal-Dual Interior Point Methods


1.3.1 Primal-Dual Methods
1.3.1.1 Optimality Conditions for LP
We recall from Section 1.1 the definition of the standard form of an LP: Given vectors b ∈ R^m, c ∈ R^n, and a matrix A ∈ R^{m×n}, we are looking for a vector x ∈ R^n satisfying

(1.47) minimize cT x subject to Ax = b , x ≥ 0 .

The sets

(1.48)   F_P := {x ∈ R^n | Ax = b , x ≥ 0} ,
(1.49)   F_P^o := {x ∈ R^n | Ax = b , x > 0}

are called the primal feasible set and the primal strictly feasible set, respec-
tively.
The dual of the LP is given by: Find λ ∈ R^m, s ∈ R^n, such that

(1.50) maximize bT λ subject to AT λ + s = c , s ≥ 0 .

The sets

(1.51)   F_D := {(λ, s) ∈ R^m × R^n | A^T λ + s = c , s ≥ 0} ,
(1.52)   F_D^o := {(λ, s) ∈ R^m × R^n | A^T λ + s = c , s > 0}

are referred to as the dual feasible set and the dual strictly feasible set,
respectively.
Theorem 1.2 (KKT conditions) A vector (x*, λ*, s*) ∈ R^n × R^m × R^n
is a solution of (1.47),(1.50) if and only if the following Karush-Kuhn-Tucker

(KKT) conditions are satisfied


(1.53) AT λ + s = c ,
(1.54) Ax = b ,
(1.55) (x, s) ≥ 0 ,
(1.56) xT s = 0 .
Proof: See, e.g., [2, 3].
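The KKT conditions are easy to check numerically for a candidate triple. The helper below is an added illustration (not from the notes):

import numpy as np

def kkt_residuals(A, b, c, x, lam, s):
    """Return the residuals of (1.53)-(1.56); all (near) zero together with (x, s) >= 0
    certify optimality of x for the primal and of (lam, s) for the dual."""
    return (np.linalg.norm(A.T @ lam + s - c),   # (1.53) dual feasibility
            np.linalg.norm(A @ x - b),           # (1.54) primal feasibility
            float(min(x.min(), s.min())),        # (1.55) nonnegativity (should be >= 0)
            float(x @ s))                        # (1.56) complementarity

# example (1.69)/(1.70) discussed further below: min x1 s.t. x1 + x2 + x3 = 1, x >= 0
A = np.array([[1.0, 1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 0.0, 0.0])
x = np.array([0.0, 0.5, 0.5]); lam = np.array([0.0]); s = c - A.T @ lam
print(kkt_residuals(A, b, c, x, lam, s))   # (0.0, 0.0, 0.0, 0.0)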
The KKT conditions (1.53)-(1.56) represent an LCP which is called the
primal-dual problem. The sets
(1.57) FP D := {(x, λ, s) | Ax = b , AT λ + s = c , (x, s) ≥ 0} ,
(1.58) FPo D := {(x, λ, s) | Ax = b , AT λ + s = c , (x, s) > 0}
are said to be the primal-dual feasible set and the primal-dual strictly feasible
set, respectively. Moreover, we denote by
(1.59)   Ω_P := {x* ∈ R^n | x* solves (1.47)} ,
(1.60)   Ω_D := {(λ*, s*) ∈ R^m × R^n | (λ*, s*) solves (1.50)} ,
(1.61)   Ω := Ω_P × Ω_D = {(x*, λ*, s*) | (x*, λ*, s*) solves (1.53)-(1.56)}
the primal, the dual, and the primal-dual solution set, respectively.
Theorem 1.3 (Characterization of solutions) There holds:
(i) If the primal and dual problems are feasible, i.e., FP D 6= ∅, the set Ω is
nonempty.
(ii) If either the primal or the dual problem has an optimal solution, so
does the other, and the values of the objective functionals are equal.
Proof of (i): Assertion (i) follows from the application of Farkas’ lemma
to the system

[ −A   0  0 ] [ x ]   [  0   ]       [ −b ]
[  0   I  0 ] [ s ] + [ A^T  ] λ  =  [  c ] ,   (x, s, β) ≥ 0 ,
[ c^T  0  1 ] [ β ]   [ −b^T ]       [  0 ]
whose solutions coincide with Ω.
Proof of (ii): Assume that x∗ is an optimal solution of the LP. Due to
the necessity of the KKT conditions, there exist λ∗ , s∗ such that the 3-
tuple (x∗ , λ∗ , s∗ ) satisfies (1.53)-(1.56). Since the KKT conditions are also
sufficient, (x∗ , λ∗ , s∗ ) is a primal-dual solution and hence, (λ∗ , s∗ ) is a dual
solution. The same argument applies to an optimal solution of the dual
problem. The proof that the optimal objective values are equal is left as an
exercise.
Corollary (Bounds for the objective functionals) There holds:
(i) Assume that the LP is feasible. Then, the objective functional cT x is
bounded from below on its feasible region if and only if the dual problem is
feasible.

(ii) Assume that the dual problem is feasible. Then, the objective functional
bT λ is bounded from above on its feasible region if and only if the primal
problem is feasible.
Proof: The proof is left as an exercise.
The following result reveals a condition for the existence and boundedness
of the primal and dual solution sets:
Theorem 1.4 (Existence of primal/dual solutions) Assume that the
primal and dual problems are feasible, i.e., FP D 6= ∅. Then, there holds:
(i) If the dual problem has a strictly feasible point, the primal solution set
ΩP is nonempty and bounded.
(ii) If the primal problem has a strictly feasible point, the set
{s* ∈ R^n | (λ*, s*) ∈ Ω_D for some λ* ∈ R^m}
is nonempty and bounded.
Proof of (i): Let (λ̄, s̄) be a strictly feasible dual point and assume that x̂ is some primal feasible point. Then, we have
(1.62)   0 ≤ s̄^T x̂ = c^T x̂ − b^T λ̄ ,
and the set
T := {x ∈ R^n | Ax = b , x ≥ 0 , c^T x ≤ c^T x̂}
is nonempty (x̂ ∈ T) and closed. For any x ∈ T, (1.62) implies

∑_{i=1}^{n} s̄_i x_i = s̄^T x = c^T x − b^T λ̄ ≤ c^T x̂ − b^T λ̄ = s̄^T x̂ .

Since s̄_i > 0 and x_i ≥ 0, it follows that

x_i ≤ (1/s̄_i) s̄^T x̂   =⇒   ‖x‖_∞ ≤ max_{1≤i≤n} (1/s̄_i) s̄^T x̂ .

Since x has been arbitrarily chosen from T , we deduce that T is nonempty,


bounded and closed. Consequently, cT x must attain its minimum on T , i.e.,
there exists x* such that
x* ∈ T ,   c^T x* ≤ c^T x   for all x ∈ T .
Obviously, x∗ ∈ ΩP . Therefore, ΩP is nonempty and bounded as a subset
of the bounded set T .
Proof of (ii): The proof of (ii) is left as an exercise.
We note that there are LPs with F_PD ≠ ∅ but F_PD^o = ∅, i.e., there is no strictly feasible point. An example is given by

(1.63)   min_{x∈R^3} x_1   subject to   x_1 + x_3 = 0 ,   x ≥ 0 .

The associated dual problem is given by


(1.64)   max_{λ∈R, s∈R^3} 0   subject to   (1, 0, 1)^T λ + (s_1, s_2, s_3)^T = (1, 0, 0)^T ,   s ≥ 0 .

Any feasible primal-dual point (x, λ, s) ∈ F_PD is of the form x_1 = x_3 = s_2 = 0, whence F_PD^o = ∅.
An important feature of the strictly feasible set F_PD^o is that F_PD^o ≠ ∅ implies that
(1.65) {(x∗ , s∗ ) | (x∗ , λ∗ , s∗ ) ∈ Ω for some λ∗ }
is a bounded set, which is an immediate consequence of the previous theo-
rem.
An important property that will play a crucial role in the convergence analy-
sis of primal-dual interior-point algorithms is strict complementarity:
Assume that (x∗ , λ∗ , s∗ ) is a solution of the LCP. Then, (1.56) implies
x∗i = 0 and/or s∗i = 0 for all 1 ≤ i ≤ n .
We define the inactive sets
(1.66)   I_P := {1 ≤ i ≤ n | x*_i ≠ 0 for some x* ∈ Ω_P} ,
(1.67)   I_D := {1 ≤ i ≤ n | s*_i ≠ 0 for some (λ*, s*) ∈ Ω_D} .
Theorem 1.5 (Goldman-Tucker theorem) There holds
(1.68) IP ∪ ID = {1, ..., n} .
Hence, there exists a primal solution x∗ ∈ ΩP and a dual solution (λ∗ , s∗ ) ∈
ΩD such that x∗ + s∗ > 0.
Proof: We refer to [5].
Primal-dual solutions (x∗ , λ∗ , s∗ ) satisfying x∗ + s∗ > 0 are called strictly
complementary solutions. The Goldman-Tucker theorem guarantees the ex-
istence of such a solution. The following example shows that an LP may
have multiple solutions some of which are strictly complementary and others
not: Consider the LP
(1.69)   min_{x∈R^3} x_1   subject to   x_1 + x_2 + x_3 = 1 ,   x ≥ 0 .

The associated dual problem is as follows:

(1.70)   max_{λ∈R, s∈R^3} λ   subject to   (1, 1, 1)^T λ + s = (1, 0, 0)^T ,   s ≥ 0 .

Primal-dual solutions are given by


(1.71) x∗ = (0, t, 1 − t)T , λ∗ = 0 , s∗ = (1, 0, 0)T , t ∈ [0, 1] .
For t ∈ (0, 1) the solutions are strictly complementary, whereas for t = 0
and t = 1 they are not, because there is an index i such that both x∗i and
s∗i are zero.
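This can be checked mechanically; the small snippet below is an added illustration (not from the notes): a solution (x*, λ*, s*) is strictly complementary iff x*_i s*_i = 0 for all i and x* + s* > 0 componentwise.

import numpy as np

def strictly_complementary(x, s, tol=1e-12):
    """Check x_i s_i = 0 for all i (complementarity) and x + s > 0 (strictness)."""
    return bool(np.all(np.abs(x * s) <= tol) and np.all(x + s > tol))

s_star = np.array([1.0, 0.0, 0.0])
for t in (0.0, 0.5, 1.0):                              # primal solutions (1.71)
    x_star = np.array([0.0, t, 1.0 - t])
    print(t, strictly_complementary(x_star, s_star))   # True only for t in (0, 1)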

1.3.1.2 Central Path


The central path C is an arc of strictly feasible points parametrized by a
scalar τ > 0 such that each point (xτ , λτ , sτ ) ∈ C solves the system
(1.72) AT λτ + sτ = c,
(1.73) Axτ = b,
(1.74) xτ,i sτ,i = τ , 1≤i≤n,
(1.75)   (x_τ, s_τ) > 0 .
As we shall show, the existence of C is guaranteed provided F_PD^o ≠ ∅. As a first result in this direction, we prove:
Lemma 1.1 Assume F_PD^o ≠ ∅ and let K ≥ 0. Then, the set
{(x, s) | (x, λ, s) ∈ FP D for some λ , xT s ≤ K}
is bounded.
Proof: Let (x̄, λ̄, s̄) ∈ FPo D and (x, λ, s) ∈ FP D , xT s ≤ K be arbitrarily
given. Then, the two equations
A(x̄ − x) = 0 ,
A^T(λ̄ − λ) + (s̄ − s) = 0
imply
(x̄ − x)T (s̄ − s) = − (x̄ − x)T AT (λ̄ − λ) = 0 ,
whence
x̄T s + s̄T x ≤ K + x̄T s̄ .
Due to (x̄, s̄) > 0 we have
0 < ξ := min_{1≤i≤n} min(x̄_i, s̄_i) .
Hence, from the previous inequality we deduce
ξeT (x + s) ≤ K + x̄T s̄ ,
where e := (1, ..., 1)T , and further
0 ≤ x_i ≤ (1/ξ)(K + x̄^T s̄) ,   0 ≤ s_i ≤ (1/ξ)(K + x̄^T s̄) ,   1 ≤ i ≤ n .

Theorem 1.6 (Existence of the central path) Under the assumption


F_PD^o ≠ ∅, for every τ > 0 there is a solution (x_τ, λ_τ, s_τ) of (1.72)-(1.75).
Moreover, the (xτ , sτ )-component of the solution is uniquely determined.

Proof: We prove that (xτ , sτ ) is the unique minimizer of


(1.76)   min_{(x,s)∈H^o} f_τ(x, s) ,

where f_τ(x, s) is the logarithmic barrier function

(1.77)   f_τ(x, s) := (1/τ) x^T s − ∑_{j=1}^{n} log(x_j s_j)

and H^o is the reduced strictly feasible set

H^o := {(x, s) | (x, λ, s) ∈ F_PD^o for some λ ∈ R^m} .
We first show that each level set of f_τ,

{(x, s) ∈ H^o | f_τ(x, s) ≤ κ} ,   κ > 0 ,

is contained in a compact subset of H^o. For that purpose, we rewrite f_τ according to

(1.78)   f_τ(x, s) = ∑_{j=1}^{n} g(x_j s_j / τ) + n − n log τ ,

where g denotes the strictly convex, nonnegative function


g(t) := t − log t − 1 , t ∈ R+
satisfying
(1.79) g(t) → ∞ for t → 0 and t → ∞ .
In view of (1.78) we have

f_τ(x, s) ≤ κ   ⇐⇒   ∑_{j=1}^{n} g(x_j s_j / τ) ≤ κ̄ := κ − n + n log τ ,

whence for each 1 ≤ i ≤ n

g(x_i s_i / τ) ≤ κ̄ − ∑_{j≠i} g(x_j s_j / τ) ≤ κ̄ .

Taking (1.79) into account, there exists M > 0 such that

(1.80)   1/M ≤ x_i s_i ≤ M ,   1 ≤ i ≤ n ,

and consequently,

(1.81)   x^T s ≤ n M .

Observing the previous lemma, from (1.81) we deduce the existence of M_u > 0 such that

x_i ∈ (0, M_u] ,   s_i ∈ (0, M_u] ,   1 ≤ i ≤ n .

Using (1.80), it follows that for all 1 ≤ i ≤ n

x_i ≥ 1/(M s_i) ≥ 1/(M M_u) ,
s_i ≥ 1/(M x_i) ≥ 1/(M M_u) .

Setting M_ℓ := 1/(M M_u), we conclude

x_i ∈ [M_ℓ, M_u] ,   s_i ∈ [M_ℓ, M_u] ,   1 ≤ i ≤ n .
Since fτ is bounded from below on Ho according to
fτ (x, s) ≥ n (1 − log τ ) ,
fτ attains its minimum in Ho . This minimum must be unique, since fτ is
strictly convex in Ho which follows from the fact that the first term in (1.77)
is linear on Ho , i.e.,
xT s = cT x − bT λ = cT x − x̄T AT λ = cT x − x̄T (c − s) = cT x + x̄T s − x̄T c
for any (x, s) ∈ Ho and any x̄ such that Ax̄ = b, whereas the second sum-
mation term has a positive definite Hessian for all (x, s) > 0.
It remains to be shown that the unique minimizer (xτ , sτ ) of (1.76) corre-
sponds to the (x, s)-component of the solution of (1.72)-(1.75). We note
that (xτ , sτ ) solves the problem
(1.82)   min_{x,s} f_τ(x, s)   subject to   Ax = b ,   A^T λ + s = c ,   (x, s) > 0 .


Setting
X := diag(x1 , ..., xn ) , S := diag(s1 , ..., sn ) ,
the KKT conditions for (1.82) imply the existence of Lagrange multipliers
ν and µ such that
(1.83)   ∂f_τ/∂x (x, s) = A^T ν   =⇒   s/τ − X^{-1} e = A^T ν ,
(1.84)   ∂f_τ/∂λ (x, s) = A µ    =⇒   0 = A µ ,
(1.85)   ∂f_τ/∂s (x, s) = µ      =⇒   x/τ − S^{-1} e = µ .
Combining (1.84) and (1.85) yields

(1.86)   A( x/τ − S^{-1} e ) = 0 .

Taking the inner product of the left-hand side in (1.86) with ν and observing (1.83), we find

( A( x/τ − S^{-1} e ) )^T ν = ( x/τ − S^{-1} e )^T A^T ν = ( x/τ − S^{-1} e )^T ( s/τ − X^{-1} e ) = 0 .

It follows that

0 = ( (1/τ) X e − S^{-1} e )^T (X^{-1/2} S^{1/2}) (X^{1/2} S^{-1/2}) ( (1/τ) S e − X^{-1} e ) = ‖ (1/τ) (XS)^{1/2} e − (XS)^{-1/2} e ‖² ,

and hence,

(1/τ) (XS)^{1/2} e − (XS)^{-1/2} e = 0   =⇒   X S e = τ e ,

which concludes the proof of the theorem.
A commonly used primal-dual interior-point approach is to handle the inequality constraints x ≥ 0 by a standard logarithmic barrier function parametrized by a barrier parameter τ > 0, which leads to the family of parametrized minimization subproblems

(1.87)   min_x  c^T x − τ ∑_{i=1}^{n} log x_i   subject to   Ax = b .

The domain of the logarithmic barrier function is the set of strictly feasible points for the LP, and the optimality conditions imply the existence of a Lagrange multiplier λ ∈ R^m such that

τ X^{-1} e + A^T λ = c ,   Ax = b ,   x > 0 .

If we define s ∈ R^n by means of

s_i := τ / x_i ,   1 ≤ i ≤ n ,

we see that the minimizer x_τ of (1.87) is the x-component of the central path vector (x_τ, λ_τ, s_τ) ∈ C. Hence, we may refer to the path

(1.88)   {x_τ ∈ R^n | x_τ solves (1.87) , τ > 0}

as the primal central path.

1.3.2 Path following algorithms


1.3.2.1 Preliminaries
The optimality conditions for the LP can be written as the nonlinear system

(1.89)   F(x, λ, s) = ( A^T λ + s − c ,  Ax − b ,  XSe )^T = 0 ,   (x, s) ≥ 0 .

Using the same nonlinear function F, the central path vector (x_τ, λ_τ, s_τ) ∈ C turns out to be the solution of the nonlinear system

(1.90)   F(x_τ, λ_τ, s_τ) = ( 0 ,  0 ,  τ e )^T ,   (x_τ, s_τ) > 0 .
Most primal-dual algorithms take Newton steps toward points on C. To
describe the search direction, a centering parameter σ ∈ [0, 1] and a duality
measure µ according to

(1.91)   µ := (1/n) ∑_{i=1}^{n} x_i s_i

are introduced. Then, the Newton-like algorithm is as follows:


Step 1: Initialization
Choose (x0 , λ0 , s0 ) ∈ FPo D .
Step 2: Iteration loop
For k ≥ 0 compute

(1.92)   [ 0     A^T   I   ] [ ∆x^k ]     [           0            ]
         [ A     0     0   ] [ ∆λ^k ]  =  [           0            ] ,
         [ S^k   0     X^k ] [ ∆s^k ]     [ −X^k S^k e + σ_k µ_k e ]

where σk ∈ [0, 1] and µk := (xk )T sk /n, and set


(1.93)   ( x^{k+1}, λ^{k+1}, s^{k+1} ) = ( x^k, λ^k, s^k ) + α_k ( ∆x^k, ∆λ^k, ∆s^k ) ,

where αk is such that (xk+1 , sk+1 ) > 0.

Path following algorithms restrict the iterates to a neighborhood of C and


follow C to a solution of the LP. The two most common neighborhoods of C
are the the so-called 2-norm neighborhood
(1.94) N2 (θ) := {(x, λ, s) ∈ FPo D | kXSe − µek2 ≤ θµ} , θ ∈ (0, 1) ,

where ‖ · ‖_2 stands for the Euclidean norm, and the one-sided ∞-norm neighborhood
(1.95)   N_{−∞}(γ) := {(x, λ, s) ∈ F_PD^o | x_i s_i ≥ γµ, 1 ≤ i ≤ n} ,   γ ∈ (0, 1) .
In the sequel, we will investigate three classes of methods:
• short-step path following methods,
• Mizuno-Todd-Ye predictor-corrector methods,
• long-step path following methods.

1.3.2.2 Short-step path following methods


This method starts at a point (x0 , λ0 , s0 ) ∈ N2 (θ) and uses uniform values
αk = 1 , σk = σ , k≥0,
where θ and σ satisfy some relationship (see the theorem below).

[Figure 2. Short-step path following algorithm: the iterates, plotted in the (x_1 s_1, x_2 s_2)-plane, stay inside the neighborhood N_2(θ) around the central path.]

Step 1: Initialization

Choose (x^0, λ^0, s^0) ∈ F_PD^o and set θ := 0.4, σ := 1 − 0.4/√n.
Step 2: Iteration loop
For k ≥ 0 set σ_k = σ and compute

(1.96)   [ 0     A^T   I   ] [ ∆x^k ]     [           0            ]
         [ A     0     0   ] [ ∆λ^k ]  =  [           0            ] ,
         [ S^k   0     X^k ] [ ∆s^k ]     [ −X^k S^k e + σ_k µ_k e ]

where µ_k := (x^k)^T s^k / n. Set

(1.97)   ( x^{k+1}, λ^{k+1}, s^{k+1} ) = ( x^k, λ^k, s^k ) + ( ∆x^k, ∆λ^k, ∆s^k ) .
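A minimal dense-algebra sketch of the short-step iteration (1.96)-(1.97) follows. It is an added illustration, not part of the notes; the toy problem, the starting point, and the iteration count are assumptions chosen here:

import numpy as np

def short_step_lp(A, b, c, x, lam, s, n_iter=50):
    """Short-step path following sketch: (x, lam, s) must be strictly feasible
    (Ax = b, A^T lam + s = c, x > 0, s > 0). Iteration count is ad hoc."""
    m, n = A.shape
    sigma = 1.0 - 0.4 / np.sqrt(n)        # theta = 0.4, sigma = 1 - 0.4/sqrt(n)
    for _ in range(n_iter):
        mu = x @ s / n                    # duality measure (1.91)
        X, S = np.diag(x), np.diag(s)
        # assemble and solve the Newton system (1.96)
        KKT = np.block([
            [np.zeros((n, n)), A.T,              np.eye(n)],
            [A,                np.zeros((m, m)), np.zeros((m, n))],
            [S,                np.zeros((n, m)), X]])
        rhs = np.concatenate([np.zeros(n), np.zeros(m), -x * s + sigma * mu * np.ones(n)])
        d = np.linalg.solve(KKT, rhs)
        dx, dlam, ds = d[:n], d[n:n+m], d[n+m:]
        x, lam, s = x + dx, lam + dlam, s + ds   # full step, alpha = 1, as in (1.97)
    return x, lam, s

# tiny test problem: min x1 s.t. x1 + x2 = 1, x >= 0  (optimum x = (0, 1))
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 0.0])
x0 = np.array([0.5, 0.5]); lam0 = np.array([-2.0]); s0 = c - A.T @ lam0   # strictly feasible start
x, lam, s = short_step_lp(A, b, c, x0, lam0, s0)
print(np.round(x, 6), x @ s / 2)    # x approaches (0, 1), duality measure shrinks geometrically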

Figure 2 contains the first iterates of the algorithm. The horizontal and the
vertical coordinate axes stand for the x1 s1 and the x2 s2 product, respec-
tively. The central path is the line emanating from (0, 0) at an angle of π/4.
The search directions appear to be curves rather than straight lines. The
solution is at (0, 0) and the problem is to reach that point maintaining the
feasibility conditions
Ax = b , AT λ + s = c .
The choice of θ and σ is motivated by the following result:
Theorem 1.7 (Properties of the short-step path following algo-
rithm) Let θ ∈ (0, 1) and σ ∈ (0, 1) be given such that
(1.98)   ( θ² + n(1 − σ)² ) / ( 2^{3/2} (1 − θ) ) ≤ σ θ .
Then
(1.99) (x, λ, s) ∈ N2 (θ) =⇒ (x(α), λ(α), s(α)) ∈ N2 (θ) , α ∈ [0, 1] .
Proof: For a proof we refer to [5].
1.3.2.3 Mizuno-Todd-Ye predictor-corrector methods
Predictor-corrector methods consist of predictor steps with σk = 0 to reduce
the duality measure µ and corrector step with σk = 1 to improve centrality.
They work with an inner neighborhood N2 (0.25) and an outer neighborhood
N2 (0.5) such that even-index iterates are confined to the inner neighborhood,
whereas odd-index iterates stay in the outer neighborhood.
Step 1: Initialization
Choose (x0 , λ0 , s0 ) ∈ N2 (0.25).
Step 2: Iteration loop
For k ≥ 0 do:
Predictor step: If k is even, set σ_k = 0 and solve

(1.100)  [ 0     A^T   I   ] [ ∆x^k ]     [      0       ]
         [ A     0     0   ] [ ∆λ^k ]  =  [      0       ] .
         [ S^k   0     X^k ] [ ∆s^k ]     [ −X^k S^k e   ]
Choose αk as the largest value of α ∈ [0, 1] such that
(1.101) (xk + α∆xk , λk + α∆λk , sk + α∆sk ) ∈ N2 (0.5) .

Set

(1.102)  ( x^{k+1}, λ^{k+1}, s^{k+1} ) = ( x^k, λ^k, s^k ) + α_k ( ∆x^k, ∆λ^k, ∆s^k ) .
Corrector step: If k is odd, set σ_k = 1 and solve

(1.103)  [ 0     A^T   I   ] [ ∆x^k ]     [          0           ]
         [ A     0     0   ] [ ∆λ^k ]  =  [          0           ] ,
         [ S^k   0     X^k ] [ ∆s^k ]     [ −X^k S^k e + µ_k e   ]

where µ_k := (x^k)^T s^k / n, and set

(1.104)  ( x^{k+1}, λ^{k+1}, s^{k+1} ) = ( x^k, λ^k, s^k ) + ( ∆x^k, ∆λ^k, ∆s^k ) .

[Figure 3. Path following predictor-corrector algorithm: iterates in the (x_1 s_1, x_2 s_2)-plane alternating between the inner neighborhood N_2(0.25) and the outer neighborhood N_2(0.5) around the central path.]

Figure 3 displays the path following predictor-corrector algorithm. We


start from (x0 , λ0 , s0 ) in the inner neighborhood N2 (0.25) and use a predictor
step with σ0 = 0 to arrive at the new iterate (x1 , λ1 , s1 ) at the boundary of
the outer neighborhood N2 (0.5). The corrector step with σ1 = 1 and unit
step α = 1 leads us back into the inner neighborhood N2 (0.25). This cycle
then repeats.
The predictor steps achieve a substantial reduction of the duality measure:

Lemma 1.2 (Properties of the predictor step) Assume (x, λ, s) ∈


N2 (0.25) and that (∆x, ∆λ, ∆s) has been computed as described in the
predictor step (i.e., with σ = 0). Then, there holds
(1.105)   (x + α∆x, λ + α∆λ, s + α∆s) ∈ N_2(0.5)   for all α ∈ [0, ᾱ] ,
where
(1.106)   ᾱ := min( 1/2 , ( µ / (8 ‖∆X ∆S e‖) )^{1/2} ) .
For the new value µ_new of the duality measure we have
(1.107)   µ_new ≤ (1 − ᾱ) µ .
Proof: We refer to [5].
Corrector steps return to the inner neighborhood without changing the du-
ality measure:
Lemma 1.3 (Properties of the corrector step) Assume (x, λ, s) ∈
N2 (0.5) and that (∆x, ∆λ, ∆s) has been computed by the corrector step
with σ = 1 and α = 1. Then, we have
(1.108) (x + ∆x, λ + ∆λ, s + ∆s) ∈ N2 (0.25) , µnew = µ .
Proof: We refer to [5].
1.3.2.4 Long-step path following methods
The long-step path following method generates iterates in the neighborhood
N−∞ (γ) which for small γ contains most of the strictly feasible points. At
each step, the centering parameter σ stays between two fixed limits 0 <
σmin < σmax < 1 and the step length αk of the search direction is chosen as
large as possible, provided the new iterate stays in N−∞ (γ).
Step 1: Initialization
Given γ ∈ (0, 1) and 0 < σmin < σmax < 1, choose (x0 , λ0 , s0 ) ∈ N−∞ (γ).
Step 2: Iteration loop
For k ≥ 0 choose σ_k ∈ [σ_min, σ_max] and compute

(1.109)  [ 0     A^T   I   ] [ ∆x^k ]     [           0            ]
         [ A     0     0   ] [ ∆λ^k ]  =  [           0            ] ,
         [ S^k   0     X^k ] [ ∆s^k ]     [ −X^k S^k e + σ_k µ_k e ]

where µ_k := (x^k)^T s^k / n. Choose α_k as the largest value such that

(1.110)   (x^k + α∆x^k, λ^k + α∆λ^k, s^k + α∆s^k) ∈ N_{−∞}(γ)

and set

(1.111)  ( x^{k+1}, λ^{k+1}, s^{k+1} ) = ( x^k, λ^k, s^k ) + α_k ( ∆x^k, ∆λ^k, ∆s^k ) .

[Figure 4. Long-step path following algorithm: iterates in the (x_1 s_1, x_2 s_2)-plane staying inside the neighborhood N_{−∞}(γ) around the central path.]

The lower bound σmin on the centering parameter guarantees that the
search directions start out by moving off the boundary of N−∞ (γ) and into
its interior: Small steps would improve the centrality, whereas large steps
lead outside the neighborhood. The step size selection αk ensures that we
stay at least at the boundary.
Lemma 1.4 (Properties of the long-step path following algorithm) For given γ ∈ (0, 1) and 0 < σ_min < σ_max < 1 there exists δ < n, independent of n, such that
(1.112)   µ_{k+1} ≤ (1 − δ/n) µ_k ,   k ≥ 0 .
Proof: We refer to [5].
1.3.2.5 Convergence of the path following algorithms
As far as the convergence of the sequence of iterates of the three previously
introduced path following primal-dual interior-point methods is concerned,
we have the following result:
Theorem 1.8 (Convergence of iterates of path following methods) Assume that {(x^k, λ^k, s^k)}_{k∈ℕ_0} is a sequence of iterates generated either by the short-step resp. long-step path following method or by the predictor-corrector path following algorithm, and suppose that the sequence {µ_k}_{k∈ℕ_0} of duality measures tends to zero as k → ∞. Then, the sequence {(x^k, s^k)}_{k∈ℕ_0} is bounded and thus contains a convergent subsequence. Each limit point is a strictly complementary primal-dual solution.

Proof: We refer to [5].


1.3.3 Mehrotra’s predictor-corrector algorithm
In contrast to the algorithms treated in the previous subsection, Mehro-
tra’s algorithm produces a sequence of infeasible iterates (xk , λk , sk ) with
(x^k, s^k) > 0. Each iteration step involves the following three components:
• an 'affine-scaling' predictor step, which is the Newton direction for the nonlinear function F defined in (1.89),
• a centering term by means of an adaptively chosen centering parameter σ,
• a corrector step that compensates for some nonlinearity in the predictor step direction.
Predictor step: Given (x, λ, s) with (x, s) > 0, the affine-scaling direction (∆x^aff, ∆λ^aff, ∆s^aff) is obtained as the solution of the system

(1.113)  [ 0   A^T   I ] [ ∆x^aff ]     [ −r_c  ]
         [ A   0     0 ] [ ∆λ^aff ]  =  [ −r_b  ] ,
         [ S   0     X ] [ ∆s^aff ]     [ −XSe  ]

where r_b and r_c stand for the residuals

(1.114)   r_b := Ax − b ,   r_c := A^T λ + s − c .

The step lengths are chosen separately for the primal and dual components:

(1.115)   α^p_aff := argmax {α ∈ [0, 1] | x + α∆x^aff ≥ 0} ,
(1.116)   α^d_aff := argmax {α ∈ [0, 1] | s + α∆s^aff ≥ 0} .

Adaptive choice of the centering parameter: If we perform a full


step to the boundary in the affine-scaling direction, the resulting value of
the duality measure would be
p af f T d af f
µaf f = (x + αaf f ∆x ) (s + αaf f ∆s )/n .
Mehrotra has suggested a heuristics for the choice of the centering parameter
µaf f 3
(1.117) σ = ( ) ,
µ
which can be motivated as follows:
If µ_aff ≪ µ, the affine-scaling direction leads to a significant reduction of the duality measure. Consequently, the centering parameter σ should be chosen close to zero. On the other hand, if µ_aff is only a bit smaller than µ, the trajectory should lead closer to the central path C, which can be realized by choosing the centering parameter σ closer to 1. In order to compute the centering step direction, we would have to solve

(1.118)  [ 0   A^T   I ] [ ∆x^c ]     [  0   ]
         [ A   0     0 ] [ ∆λ^c ]  =  [  0   ] .
         [ S   0     X ] [ ∆s^c ]     [ σµe  ]

Instead of doing so, we will combine the centering step with the corrector
step.
Corrector step: The impact of a full step in the affine-scaling direction
on the pairwise products xi si , 1 ≤ i ≤ n, is as follows
(1.119)   (x_i + ∆x^aff_i)(s_i + ∆s^aff_i) = x_i s_i + x_i ∆s^aff_i + s_i ∆x^aff_i + ∆x^aff_i ∆s^aff_i = ∆x^aff_i ∆s^aff_i ,
where we have used that the sum of the first three terms is zero due to
(1.113). The corrector step is designed in such a way that the pairwise
products x_i s_i come closer to the target value of zero:

(1.120)  [ 0   A^T   I ] [ ∆x^cor ]     [          0          ]
         [ A   0     0 ] [ ∆λ^cor ]  =  [          0          ] ,
         [ S   0     X ] [ ∆s^cor ]     [ −∆X^aff ∆S^aff e    ]

where

∆X^aff := diag(∆x^aff_1, ..., ∆x^aff_n) ,   ∆S^aff := diag(∆s^aff_1, ..., ∆s^aff_n) .
Now, it is an easy exercise to show that (1.119) and (1.120) imply
(1.121)   (x_i + ∆x^aff_i + ∆x^cor_i)(s_i + ∆s^aff_i + ∆s^cor_i) = ∆x^aff_i ∆s^cor_i + ∆x^cor_i ∆s^aff_i + ∆x^cor_i ∆s^cor_i .
If for µ → 0 the coefficient matrix in (1.113) resp. (1.120) is approaching a
nonsingular limit, we indeed have
‖(∆x^aff, ∆s^aff)‖ = O(µ) ,   ‖(∆x^cor, ∆s^cor)‖ = O(µ²) ,
which implies
∆x^aff_i ∆s^aff_i = O(µ²) ,
∆x^aff_i ∆s^cor_i + ∆x^cor_i ∆s^aff_i + ∆x^cor_i ∆s^cor_i = O(µ³) .
However, if the limiting matrix is singular, it is not guaranteed that the
corrector step is smaller in norm than the predictor step (in fact, often it is
larger). Nevertheless, numerical evidence suggests that also in this case the
corrector step improves the overall performance of the algorithm.
Combining the centering and the corrector step amounts to the solution of
the linear system

(1.122)  [ 0   A^T   I ] [ ∆x^cc ]     [            0             ]
         [ A   0     0 ] [ ∆λ^cc ]  =  [            0             ] .
         [ S   0     X ] [ ∆s^cc ]     [ σµe − ∆X^aff ∆S^aff e    ]
A commonly used variant of Mehrotra’s predictor-corrector step is given as
follows:
Step 1: Initialization

Choose (x0 , λ0 , s0 ) with (x0 , s0 ) > 0.


Step 2: Iteration loop
For k ≥ 0 set
(x, λ, s) = (xk , λk , sk )
and solve (1.113) for (∆x^aff, ∆λ^aff, ∆s^aff). Compute

(1.123)   α^p_aff := argmax {α ∈ [0, 1] | x^k + α∆x^aff ≥ 0} ,
(1.124)   α^d_aff := argmax {α ∈ [0, 1] | s^k + α∆s^aff ≥ 0} ,
(1.125)   µ_aff := ( x^k + α^p_aff ∆x^aff )^T ( s^k + α^d_aff ∆s^aff ) / n ,
(1.126)   σ := ( µ_aff / µ )^3 .

Solve (1.122) for (∆x^cc, ∆λ^cc, ∆s^cc) and compute the search direction and the step length to the boundary according to

(1.127)   (∆x^k, ∆λ^k, ∆s^k) := (∆x^aff, ∆λ^aff, ∆s^aff) + (∆x^cc, ∆λ^cc, ∆s^cc) ,
(1.128)   α^p_max := argmax {α ≥ 0 | x^k + α∆x^k ≥ 0} ,
(1.129)   α^d_max := argmax {α ≥ 0 | s^k + α∆s^k ≥ 0} .

Set

α^p_k := min(0.99 · α^p_max , 1) ,   α^d_k := min(0.99 · α^d_max , 1) ,

and compute

(1.130)   x^{k+1} := x^k + α^p_k ∆x^k ,
(1.131)   (λ^{k+1}, s^{k+1}) := (λ^k, s^k) + α^d_k (∆λ^k, ∆s^k) .
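The variant above maps almost line by line onto code. The following sketch is an added illustration, not part of the notes; the toy problem, starting point, tolerances, and iteration cap are choices made here:

import numpy as np

def mehrotra_lp(A, b, c, x, lam, s, n_iter=30):
    """Sketch of the predictor-corrector variant (1.113), (1.122)-(1.131).
    Requires (x, s) > 0; the iterates may be infeasible."""
    m, n = A.shape
    e = np.ones(n)

    def newton_solve(X, S, r1, r2, r3):
        # coefficient matrix shared by (1.113) and (1.122)
        K = np.block([[np.zeros((n, n)), A.T,              np.eye(n)],
                      [A,                np.zeros((m, m)), np.zeros((m, n))],
                      [S,                np.zeros((n, m)), X]])
        d = np.linalg.solve(K, np.concatenate([r1, r2, r3]))
        return d[:n], d[n:n+m], d[n+m:]

    def max_step(v, dv):
        # largest alpha >= 0 with v + alpha*dv >= 0
        neg = dv < 0
        return np.min(-v[neg] / dv[neg]) if neg.any() else np.inf

    for _ in range(n_iter):
        mu = x @ s / n
        r_b, r_c = A @ x - b, A.T @ lam + s - c             # residuals (1.114)
        if mu < 1e-10 and np.linalg.norm(r_b) < 1e-10 and np.linalg.norm(r_c) < 1e-10:
            break
        X, S = np.diag(x), np.diag(s)
        # predictor (affine-scaling) step (1.113)
        dx_a, dl_a, ds_a = newton_solve(X, S, -r_c, -r_b, -x * s)
        ap = min(1.0, max_step(x, dx_a)); ad = min(1.0, max_step(s, ds_a))   # (1.123)-(1.124)
        mu_aff = (x + ap * dx_a) @ (s + ad * ds_a) / n       # (1.125)
        sigma = (mu_aff / mu) ** 3                           # (1.126)
        # combined centering-corrector step (1.122)
        dx_c, dl_c, ds_c = newton_solve(X, S, np.zeros(n), np.zeros(m),
                                        sigma * mu * e - dx_a * ds_a)
        dx, dl, ds = dx_a + dx_c, dl_a + dl_c, ds_a + ds_c   # (1.127)
        ap = min(0.99 * max_step(x, dx), 1.0)                # (1.128), (1.130)
        ad = min(0.99 * max_step(s, ds), 1.0)                # (1.129), (1.131)
        x, lam, s = x + ap * dx, lam + ad * dl, s + ad * ds
    return x, lam, s

# toy problem: min x1 s.t. x1 + x2 = 1, x >= 0; start infeasible but strictly positive
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 0.0])
x, lam, s = mehrotra_lp(A, b, c, np.array([2.0, 2.0]), np.array([0.0]), np.array([1.0, 1.0]))
print(np.round(x, 6))   # should approach the solution (0, 1)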

References
[1] G.B. Dantzig; Linear Programming and Extensions. Princeton Univ. Press,
Princeton, 1963
[2] R. Fletcher; Practical Methods of Optimization. Wiley, New York, 1987
[3] O.L. Mangasarian; Nonlinear Programming. McGraw-Hill, New York, 1969
[4] J. Stoer and R. Bulirsch; Introduction to Numerical Analysis. 3rd Edition.
Springer, Berlin-Heidelberg-New York, 2002
[5] S.J. Wright; Primal-Dual Interior-Point Methods. SIAM, Philadelphia, 1997
