Support Lecture 1
Figure: a few steps of the bisection method applied over the starting interval [a1, b1]; the larger red dot marks the root of the function.
The bisection method in mathematics is a root-finding method that repeatedly bisects an
interval and then selects a subinterval in which a root must lie for further processing. It is a
very simple and robust method, but it is also relatively slow. Because of this, it is often used
to obtain a rough approximation to a solution which is then used as a starting point for more
rapidly converging methods.[1] The method is also called the interval halving method,[2] the
binary search method,[3] or the dichotomy method.[4]
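A minimal sketch of the method in code (Python; the test function, bracketing interval, and tolerance below are illustrative choices, not part of the text above):

```python
def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Find a root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must bracket a root (opposite signs)")
    for _ in range(max_iter):
        m = 0.5 * (a + b)              # bisect the current interval
        fm = f(m)
        if fm == 0.0 or 0.5 * (b - a) < tol:
            return m                   # interval small enough, or exact root found
        if fa * fm < 0:                # keep the half that still brackets a sign change
            b, fb = m, fm
        else:
            a, fa = m, fm
    return 0.5 * (a + b)

# illustrative example: root of x^3 - x - 2 on [1, 2] (about 1.5214)
print(bisect(lambda x: x**3 - x - 2, 1.0, 2.0))
```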
Broyden–Fletcher–Goldfarb–Shanno algorithm
In numerical optimization, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm
is an iterative method for solving unconstrained nonlinear optimization problems.
The BFGS method approximates Newton's method, a class of hill-climbing optimization
techniques that seeks a stationary point of a (preferably twice continuously differentiable)
function. For such problems, a necessary condition for optimality is that the gradient be
zero. Newton's method and the BFGS methods are not guaranteed to converge unless the
function has a quadratic Taylor expansion near an optimum. These methods use both the
first and second derivatives of the function. However, BFGS has proven to have good
performance even for non-smooth optimizations.[1]
In quasi-Newton methods, the Hessian matrix of second derivatives doesn't need to be
evaluated directly. Instead, the Hessian matrix is approximated using rank-one updates
specified by gradient evaluations (or approximate gradient evaluations). Quasi-Newton
methods are generalizations of the secant method to find the root of the first derivative for
multidimensional problems. In multi-dimensional problems, the secant equation does not
specify a unique solution, and quasi-Newton methods differ in how they constrain the
solution. The BFGS method is one of the most popular members of this class.[2] Also in
common use is L-BFGS, which is a limited-memory version of BFGS that is particularly
suited to problems with very large numbers of variables (e.g., >1000). The L-BFGS-B[3]
variant handles simple box constraints.
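As a usage-level illustration of these variants (assuming SciPy is available; the objective, gradient, starting point, and bounds below are made-up examples, not taken from the text), BFGS and L-BFGS-B can be selected through scipy.optimize.minimize:

```python
import numpy as np
from scipy.optimize import minimize

# illustrative objective: the Rosenbrock function and its gradient
def f(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def grad(x):
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])

x0 = np.array([-1.2, 1.0])

# unconstrained minimization with BFGS
res = minimize(f, x0, jac=grad, method="BFGS")

# limited-memory variant with simple box constraints
res_b = minimize(f, x0, jac=grad, method="L-BFGS-B",
                 bounds=[(-2.0, 2.0), (-2.0, 2.0)])

print(res.x, res_b.x)   # both should be close to (1, 1)
```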
Rationale
The search direction p_k at stage k is given by the solution of the analogue of the Newton equation

    B_k p_k = -\nabla f(x_k),

where B_k is an approximation to the Hessian matrix, updated iteratively at each stage, and \nabla f(x_k) is the gradient of the function evaluated at x_k. A line search in the direction p_k is then used to find the next point x_{k+1}. Instead of requiring the full Hessian matrix at the point x_{k+1} to be computed as B_{k+1}, the approximate Hessian at stage k is updated by the addition of two matrices:

    B_{k+1} = B_k + U_k + V_k.
Both U_k and V_k are symmetric rank-one matrices, but they have different (matrix) bases. The symmetric rank-one assumption here means that we may write

    C = a \, b b^T

for a scalar a and a vector b. So, equivalently, U_k and V_k together construct a rank-two update matrix, which is robust against the scale problems often suffered in gradient descent searches (e.g., in Broyden's method).
The quasi-Newton condition imposed on this update is

    B_{k+1} (x_{k+1} - x_k) = \nabla f(x_{k+1}) - \nabla f(x_k).
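To make the rank-two structure and the quasi-Newton condition concrete, here is a small numerical check (the matrix B and the vectors s_k, y_k are arbitrary illustrative data, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# an illustrative symmetric positive definite "current" Hessian approximation B_k
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)

# illustrative step s_k and gradient difference y_k with y_k^T s_k > 0
s = rng.standard_normal(n)
y = B @ s + 0.1 * rng.standard_normal(n)

# the two symmetric rank-one terms U_k and V_k of the BFGS update
U = np.outer(y, y) / (y @ s)
V = -np.outer(B @ s, B @ s) / (s @ B @ s)
B_next = B + U + V

# the quasi-Newton condition: B_{k+1} (x_{k+1} - x_k) = grad f(x_{k+1}) - grad f(x_k)
print(np.allclose(B_next @ s, y))   # True
```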
Algorithm
From an initial guess x_0 and an approximate Hessian matrix B_0, the following steps are repeated as x_k converges to the solution.

1. Obtain a direction p_k by solving B_k p_k = -\nabla f(x_k).
2. Perform a line search to find an acceptable stepsize \alpha_k in the direction found in the first step, then update x_{k+1} = x_k + \alpha_k p_k.
3. Set s_k = \alpha_k p_k.
4. y_k = \nabla f(x_{k+1}) - \nabla f(x_k).
5. B_{k+1} = B_k + \frac{y_k y_k^T}{y_k^T s_k} - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k}.
Here f(x) denotes the objective function to be minimized. Convergence can be checked by observing the norm of the gradient, ||\nabla f(x_k)||. Practically, B_0 can be initialized with B_0 = I, so that the first step is equivalent to a gradient descent step, but further steps are more and more refined by B_k, the approximation to the Hessian.
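The five steps translate almost line for line into code. A minimal sketch (the backtracking line search, stopping tolerance, curvature guard, and test problem are simplifying assumptions of mine, not part of the algorithm statement above):

```python
import numpy as np

def bfgs(f, grad, x0, tol=1e-6, max_iter=200):
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                 # B_0 = I: the first step is a gradient descent step
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:    # convergence check on the gradient norm
            break
        p = np.linalg.solve(B, -g)     # step 1: solve B_k p_k = -grad f(x_k)
        alpha = 1.0                    # step 2: crude backtracking (Armijo) line search
        while alpha > 1e-12 and f(x + alpha * p) > f(x) + 1e-4 * alpha * (g @ p):
            alpha *= 0.5
        s = alpha * p                  # step 3: s_k = alpha_k p_k
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g                  # step 4: y_k = grad f(x_{k+1}) - grad f(x_k)
        if y @ s > 1e-12:              # step 5: rank-two update (skipped if curvature is tiny)
            B = B + np.outer(y, y) / (y @ s) - np.outer(B @ s, B @ s) / (s @ B @ s)
        x, g = x_new, g_new
    return x

# illustrative test: a convex quadratic whose minimizer is (1, -2)
Q = np.array([[3.0, 0.5], [0.5, 2.0]])
b = np.array([2.0, -3.5])
print(bfgs(lambda x: 0.5 * x @ Q @ x - b @ x, lambda x: Q @ x - b, np.zeros(2)))
```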
The first step of the algorithm is carried out using the inverse of the matrix B_k, which is usually obtained efficiently by applying the Sherman–Morrison formula to the fifth step of the algorithm, giving

    B_{k+1}^{-1} = \left(I - \frac{s_k y_k^T}{y_k^T s_k}\right) B_k^{-1} \left(I - \frac{y_k s_k^T}{y_k^T s_k}\right) + \frac{s_k s_k^T}{y_k^T s_k}.
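As a companion to the formula above, here is that inverse update written as a standalone function (a sketch; the function name and the assumption that s and y are NumPy vectors are mine):

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """One BFGS update of the inverse Hessian approximation H_k = B_k^{-1}.

    Implements H_{k+1} = (I - rho s y^T) H_k (I - rho y s^T) + rho s s^T
    with rho = 1 / (y^T s), so no linear solve or matrix inversion is needed.
    """
    rho = 1.0 / (y @ s)
    I = np.eye(H.shape[0])
    return (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) + rho * np.outer(s, s)

# with H_k at hand, step 1 of the algorithm reduces to a matrix-vector product: p_k = -H_k @ g_k
```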