QUASI-NEWTON METHODS
In this chapter we take another approach toward the development of methods lying
somewhere intermediate between steepest descent and Newton's method. Again working
under the assumption that evaluation and use of the Hessian matrix is impractical
or costly, the idea underlying quasi-Newton methods is to use an approximation to
the inverse Hessian in place of the true inverse that is required in Newton’s method.
The form of the approximation varies among different methods—ranging from
the simplest where it remains fixed throughout the iterative process, to the more
advanced where improved approximations are built up on the basis of information
gathered during the descent process.
The quasi-Newton methods that build up an approximation to the inverse
Hessian are analytically the most sophisticated methods discussed in this book for
solving unconstrained problems and represent the culmination of the development
of algorithms through detailed analysis of the quadratic problem. As might be
expected, the convergence properties of these methods are somewhat more difficult
to discover than those of simpler methods. Nevertheless, we are able, by continuing
with the same basic techniques as before, to illuminate their most important features.
In the course of our analysis we develop two important generalizations of
the method of steepest descent and its corresponding convergence rate theorem.
The first, discussed in Section 10.1, modifies steepest descent by taking as the
direction vector a positive definite transformation of the negative gradient. The
second, discussed in Section 10.8, is a combination of steepest descent and Newton’s
method. Both of these fundamental methods have convergence properties analogous
to those of steepest descent.
10.1 MODIFIED NEWTON METHOD

A basic iterative process for solving the problem

minimize f(x)

is

x_{k+1} = x_k − α_k S_k g_k,   (1)

where g_k denotes the gradient of f at x_k, S_k is a symmetric positive definite matrix, and α_k is chosen to minimize f(x_{k+1}).† For the quadratic problem

f(x) = (1/2) x^T Q x − b^T x,   (2)

where Q is symmetric and positive definite, we can find an explicit expression for α_k in (1). The algorithm becomes

x_{k+1} = x_k − α_k S_k g_k,   (3a)

where

g_k = Q x_k − b,   (3b)

α_k = (g_k^T S_k g_k) / (g_k^T S_k Q S_k g_k).   (3c)
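As a concrete illustration, here is a minimal sketch of iteration (3) applied to a small quadratic. The particular Q, b, and S_k (a fixed S here) are arbitrary example data chosen for the sketch, not taken from the text.

```python
import numpy as np

# Example quadratic f(x) = 0.5 x^T Q x - b^T x (illustrative data only).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])
b = np.array([1.0, -2.0, 0.5])

# A fixed symmetric positive definite matrix S_k = S for every k.
S = np.diag([1.0, 0.5, 2.0])

x = np.zeros(3)
for _ in range(20):
    g = Q @ x - b                      # gradient, equation (3b)
    Sg = S @ g                         # deflected direction S_k g_k
    alpha = (g @ Sg) / (Sg @ Q @ Sg)   # exact line-search step, equation (3c)
    x = x - alpha * Sg                 # update (3a)

print("iterate:       ", x)
print("true minimizer:", np.linalg.solve(Q, b))
```

Two special cases confirm that the method interpolates between our earlier ones: with S = I the iteration is exactly steepest descent, while with S = Q^{-1} the step size (3c) equals one and the minimizer is reached in a single iteration, as in Newton's method.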
We may then derive the convergence rate of this algorithm by slightly extending the analysis carried out for the method of steepest descent: at every step k there holds

E(x_{k+1}) ≤ [(B_k − b_k) / (B_k + b_k)]^2 E(x_k),

where E(x) = (1/2)(x − x*)^T Q (x − x*) with x* the unique minimum point, and b_k and B_k are, respectively, the smallest and largest eigenvalues of the matrix S_k Q.
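The bound is easy to check numerically. The sketch below uses the same kind of illustrative Q, b, and S as above (again arbitrary example data), computes b_k and B_k from the eigenvalues of S Q, and asserts the per-step rate.

```python
import numpy as np

Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])
b = np.array([1.0, -2.0, 0.5])
S = np.diag([1.0, 0.5, 2.0])

x_star = np.linalg.solve(Q, b)
E = lambda x: 0.5 * (x - x_star) @ Q @ (x - x_star)   # error function E(x)

# Eigenvalues of S Q are real and positive (S Q is similar to S^{1/2} Q S^{1/2}).
eigs = np.linalg.eigvals(S @ Q).real
bk, Bk = eigs.min(), eigs.max()
rate = ((Bk - bk) / (Bk + bk)) ** 2

x = np.array([5.0, -3.0, 1.0])
for _ in range(10):
    g = Q @ x - b
    Sg = S @ g
    x_next = x - (g @ Sg) / (Sg @ Q @ Sg) * Sg
    assert E(x_next) <= rate * E(x) + 1e-12   # per-step bound from the theorem
    x = x_next
```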
† The algorithm (1) is sometimes referred to as the method of deflected gradients, since the direction vector can be thought of as being determined by deflecting the gradient through multiplication by S_k.
In terms of p_k = S_k^{1/2} g_k and T_k = S_k^{1/2} Q S_k^{1/2}, the single-step reduction of E satisfies

(E(x_k) − E(x_{k+1})) / E(x_k) = (p_k^T p_k)^2 / [(p_k^T T_k p_k)(p_k^T T_k^{-1} p_k)],

and the Kantorovich inequality applied to T_k, whose eigenvalues coincide with those of S_k Q, yields the bound stated above.
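For completeness, here is a brief sketch of where this identity comes from, assuming E(x) = (1/2)(x − x*)^T Q(x − x*) as above, so that g_k = Q(x_k − x*):

\begin{aligned}
E(x_k) &= \tfrac{1}{2}\, g_k^{T} Q^{-1} g_k,\\
E(x_{k+1}) &= E(x_k) - \alpha_k\, g_k^{T} S_k g_k + \tfrac{1}{2}\,\alpha_k^{2}\, g_k^{T} S_k Q S_k g_k
            = E(x_k) - \frac{\bigl(g_k^{T} S_k g_k\bigr)^{2}}{2\, g_k^{T} S_k Q S_k g_k},
\end{aligned}

the last equality using the optimal step (3c). Dividing the decrease by E(x_k) and substituting p_k = S_k^{1/2} g_k and T_k = S_k^{1/2} Q S_k^{1/2} gives the displayed ratio.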
A Classical Method
We conclude this section by mentioning the classical modified Newton's method, a
standard method for approximating Newton's method without evaluating [F(x_k)]^{-1}
for each k. We set

x_{k+1} = x_k − α_k [F(x_0)]^{-1} g_k,

where F denotes the Hessian of f.
In this method the Hessian at the initial point x0 is used throughout the process.
The effectiveness of this procedure is governed largely by how fast the Hessian is
changing—in other words, by the magnitude of the third derivatives of f .
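A minimal sketch of this procedure is given below, assuming the user supplies routines for f, its gradient, and its Hessian; the quadratic-plus-quartic test function and all numerical values are illustrative choices, not from the text. The Hessian is factored once at x_0 and that factorization is reused at every step.

```python
import numpy as np

def classical_modified_newton(f, grad, hess, x0, tol=1e-8, max_iter=200):
    """Iterate x_{k+1} = x_k - alpha_k [F(x0)]^{-1} g_k, where F(x0) is the
    Hessian at the starting point and alpha_k comes from a backtracking search."""
    L = np.linalg.cholesky(hess(x0))   # factor F(x0) once, assumed positive definite
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # Solve F(x0) d = g_k with the stored factorization (two triangular solves).
        d = np.linalg.solve(L.T, np.linalg.solve(L, g))
        # Backtracking line search standing in for the exact minimization of f(x_{k+1}).
        t = 1.0
        while f(x - t * d) > f(x) - 1e-4 * t * (g @ d):
            t *= 0.5
        x = x - t * d
    return x

# Illustrative test problem: f(x) = 0.5 (3 x1^2 + 2 x2^2) + 0.25 x1^4.
f = lambda x: 0.5 * (3 * x[0]**2 + 2 * x[1]**2) + 0.25 * x[0]**4
grad = lambda x: np.array([3 * x[0] + x[0]**3, 2 * x[1]])
hess = lambda x: np.array([[3 + 3 * x[0]**2, 0.0], [0.0, 2.0]])

print(classical_modified_newton(f, grad, hess, np.array([2.0, 1.0])))   # -> near (0, 0)
```

Because only gradients are evaluated after the initial factorization, each step is cheap; as noted above, the price is that convergence degrades when the true Hessian drifts away from F(x_0).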