Lecture 15: BFGS and SR1
It should be noted that quasi-Newton methods are different from modified-Hessian methods: quasi-Newton methods do not work with the exact Hessian matrix at all, but instead maintain an approximation to it.
Newton's method requires us to solve for the Newton step p = −[∇2 f(xk)]−1 ∇f(xk). In the general case this is not a cheap operation (it means solving a linear system).
On the other hand, Newton's method has quadratic local convergence, which is much better than linear convergence. Quasi-Newton methods are a compromise between convergence rate and per-iteration cost.
Suppose we have the model function at xk (as in Newton's method or the trust-region method)

mk(p) = f(xk) + pT ∇f(xk) + (1/2) pT Bk p    (15.1)
Here Bk is an approximation to the Hessian of f. But how do we get Bk+1 from Bk without computing the Hessian exactly?
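As a quick illustration (a minimal Python sketch; the function name and test data are ours, not from the lecture), the model (15.1) is cheap to evaluate, and for positive definite Bk its minimizer is exactly the quasi-Newton step p = −Bk^{−1} ∇f(xk):

    import numpy as np

    def model(p, f_x, grad_x, B):
        """Quadratic model (15.1): m_k(p) = f(x_k) + p^T grad f(x_k) + (1/2) p^T B_k p."""
        return f_x + p @ grad_x + 0.5 * p @ B @ p

    # For positive definite B, minimizing the model gives the (quasi-)Newton step.
    B = np.array([[2.0, 0.5], [0.5, 1.0]])
    g = np.array([1.0, -1.0])
    p_star = -np.linalg.solve(B, g)   # argmin of the model over p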
After one step, the next model is

mk+1(p) = f(xk+1) + pT ∇fk+1 + (1/2) pT Bk+1 p    (15.2)
where xk+1 = xk + pk. The quasi-Newton idea is that the approximation Bk+1 should satisfy the following condition (the secant condition): the gradient of the model function mk+1 should match the gradient of f at both xk and xk+1. The match at xk+1 is automatic, since ∇mk+1(0) = ∇fk+1; the match at xk means

∇mk+1(−pk) = ∇fk+1 − Bk+1 pk = ∇fk    (15.3)

so Bk+1 pk = ∇fk+1 − ∇fk. We denote

sk = xk+1 − xk,    yk = ∇fk+1 − ∇fk    (15.4)
then (the secant equation)
Bk+1 sk = yk (15.5)
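For a quick sanity check (our own numerical sketch, not from the lecture): if f is quadratic with Hessian A, then yk = A sk holds exactly, so the true Hessian satisfies the secant equation (15.5):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))
    A = A @ A.T + 4 * np.eye(4)            # a symmetric positive definite Hessian
    b = rng.standard_normal(4)
    grad = lambda x: A @ x + b             # gradient of f(x) = (1/2) x^T A x + b^T x

    x_k, x_k1 = rng.standard_normal(4), rng.standard_normal(4)
    s_k, y_k = x_k1 - x_k, grad(x_k1) - grad(x_k)
    print(np.allclose(A @ s_k, y_k))       # True: A satisfies the secant equation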
This (15.5) is the condition that our Bk+1 should satisfy! Intuitively, in the one-dimensional case we can informally write it as

Bk+1 = (∇fk+1 − ∇fk) / (xk+1 − xk)    (15.6)

The right-hand side is a difference quotient of the gradient, i.e. it is "like" the Hessian. But the single equation (15.5) cannot determine Bk+1 uniquely: it imposes only n conditions on the n(n+1)/2 free entries of a symmetric matrix. For many problems, we also want Bk+1 to be positive definite, to make sure the resulting direction is a descent direction.
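To make the non-uniqueness concrete (a toy 2-D example of ours): the secant equation fixes only how Bk+1 acts on sk, so symmetric matrices differing off that direction all satisfy it:

    import numpy as np

    s = np.array([1.0, 0.0])
    y = np.array([2.0, 0.0])
    # Both matrices are symmetric, positive definite, and satisfy B s = y;
    # the secant equation says nothing about the (2, 2) entry.
    B1 = np.array([[2.0, 0.0], [0.0, 1.0]])
    B2 = np.array([[2.0, 0.0], [0.0, 5.0]])
    print(np.allclose(B1 @ s, y), np.allclose(B2 @ s, y))  # True True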
15.1.2 SR1
The SR1 (symmetric rank-1) method updates Bk by a rank-1 correction,

Bk+1 = Bk + σvvT    (15.7)

where σ = 1 or −1, and σ, v are chosen so that Bk+1 satisfies the secant equation yk = Bk+1 sk. The name rank-1 comes from the fact that vvT is a rank-1 matrix. Substituting the update into the secant equation, we compute

yk = Bk sk + [σvT sk]v    (15.8)
so v must be parallel to yk − Bk sk. Writing

v = a(yk − Bk sk)    (15.9)

and substituting into (15.8) gives σa2 [skT(yk − Bk sk)] = 1, which forces

σ = sign(skT (yk − Bk sk)),    a = ±|skT (yk − Bk sk)|^{−1/2}    (15.10)

Plugging σ and v back into (15.7) gives

Bk+1 = Bk + (yk − Bk sk)(yk − Bk sk)T / ((yk − Bk sk)T sk)    (15.11)
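A minimal sketch of the SR1 update (15.11) in Python (function name ours); the sanity check confirms the updated matrix satisfies the secant equation:

    import numpy as np

    def sr1_update(B, s, y):
        """SR1 update (15.11): B+ = B + r r^T / (r^T s), where r = y - B s."""
        r = y - B @ s
        return B + np.outer(r, r) / (r @ s)

    rng = np.random.default_rng(1)
    B = np.eye(3)
    s, y = rng.standard_normal(3), rng.standard_normal(3)
    print(np.allclose(sr1_update(B, s, y) @ s, y))  # True: secant equation holds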
Since we actually need the inverse Hk = Bk^{−1} to compute the step, the Sherman-Morrison formula (see (A.27) in the book) lets us invert this rank-1 update cheaply, giving

Hk+1 = Hk + (sk − Hk yk)(sk − Hk yk)T / ((sk − Hk yk)T yk)    (15.12)
We can also derive this formula directly by positing a rank-1 update for Hk+1, as we did for Bk+1. However, there are two issues with this method: the denominator (sk − Hk yk)T yk can be zero or very close to zero, in which case the update breaks down or becomes numerically unstable; and the update does not preserve positive definiteness, so the resulting direction may fail to be a descent direction.
For the first issue, we can set a rule to skip the iteration: if |ykT (sk − Hk yk)| < r ∥yk∥ ∥sk − Hk yk∥ for some small r, say r = 10−8, we skip the iteration by setting Hk+1 = Hk; otherwise the denominator is not too small, and we can still use the update formula.
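In code, the safeguarded inverse update might look as follows (a sketch; we use the relative test above with threshold r = 10−8):

    import numpy as np

    def sr1_h_update(H, s, y, r=1e-8):
        """SR1 update (15.12) for H = B^{-1}, with the skipping rule:
        if the denominator is tiny relative to its factors, keep H unchanged."""
        v = s - H @ y
        denom = v @ y
        if abs(denom) < r * np.linalg.norm(y) * np.linalg.norm(v):
            return H                      # skip the iteration: H_{k+1} = H_k
        return H + np.outer(v, v) / denom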
15.1.3 BFGS
The famous quasi-Newton method BFGS is named after four distinguished mathematicians: Broyden, Fletcher, Goldfarb, and Shanno. The idea is similar to the SR1 method above, but instead of a rank-1 update we use a rank-2 update formula. So BFGS uses an update of the form

Bk+1 = Bk + auuT + bvvT    (15.14)
Multiplying by sk and imposing the secant equation (15.5),

yk = Bk sk + au(uT sk) + bv(vT sk)    (15.15)

which means

a(uT sk)u + b(vT sk)v = yk − Bk sk    (15.16)
Here we actually have multiple choices for the vectors u and v, but BFGS takes u = yk and v = Bk sk, to match the two terms on the right-hand side. Matching coefficients gives a(ykT sk) = 1 and b(skT Bk sk) = −1, i.e. a = 1/(ykT sk) and b = −1/(skT Bk sk). Then we must have
Bk+1 = Bk + yk ykT/(ykT sk) − Bk sk skT Bk/(skT Bk sk)    (BFGS)
Bk+1 = (I − ρk yk skT) Bk (I − ρk sk ykT) + ρk yk ykT    (DFP)    (15.18)

where ρk = 1/(skT yk). Note that the two lines of (15.18) are two different updates (BFGS and DFP respectively), not equal to each other. Using the relation Hk+1 yk = sk (the secant equation for the inverse), we get the corresponding rank-2 updates for Hk+1 = Bk+1^{−1}:
Hk+1 = Hk + sk skT/(skT yk) − Hk yk ykT Hk/(ykT Hk yk)    (DFP)
Hk+1 = (I − ρk sk ykT) Hk (I − ρk yk skT) + ρk sk skT    (BFGS)    (15.19)
The latter one is BFGS. Now that we have the updating formula, what should the initial value H0 be? It is quite difficult to come up with a good choice unless we compute the Hessian (or its inverse) explicitly; in practice we often simply set H0 to be the identity.
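As a sketch, the BFGS form of (15.19) translates directly into code (function name ours):

    import numpy as np

    def bfgs_h_update(H, s, y):
        """BFGS update (15.19): H+ = (I - rho s y^T) H (I - rho y s^T) + rho s s^T."""
        rho = 1.0 / (y @ s)
        V = np.eye(len(s)) - rho * np.outer(s, y)
        return V @ H @ V.T + rho * np.outer(s, s)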
The BFGS algorithm: given a starting point x0, an initial inverse approximation H0, and a tolerance, repeat: (step 2) compute the search direction pk = −Hk ∇fk; (step 3) compute xk+1 = xk + αk pk with step length αk chosen to satisfy the Wolfe conditions (important); (step 4) compute sk = xk+1 − xk, yk = ∇fk+1 − ∇fk, update Hk+1 by (15.19), set k ← k + 1, and go to step 2.
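Putting the algorithm together (a hedged sketch; the function names are ours, and we use scipy.optimize.line_search, which enforces the Wolfe conditions, in line with the remark below):

    import numpy as np
    from scipy.optimize import line_search

    def bfgs(f, grad, x0, tol=1e-6, max_iter=200):
        """Sketch of the BFGS loop: H0 = I, Wolfe line search, update (15.19)."""
        x, H, g = x0, np.eye(len(x0)), grad(x0)
        for _ in range(max_iter):
            if np.linalg.norm(g) < tol:
                break
            p = -H @ g                                    # step 2: search direction
            alpha = line_search(f, grad, x, p, gfk=g)[0]  # Wolfe step length
            if alpha is None:                             # line search failed
                break
            x_new = x + alpha * p
            g_new = grad(x_new)
            s, y = x_new - x, g_new - g
            if y @ s > 1e-10:                             # curvature holds: safe update
                rho = 1.0 / (y @ s)
                V = np.eye(len(x)) - rho * np.outer(s, y)
                H = V @ H @ V.T + rho * np.outer(s, s)
            x, g = x_new, g_new
        return x

    # Example usage on the Rosenbrock function:
    rosen = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
    rosen_grad = lambda x: np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])
    print(bfgs(rosen, rosen_grad, np.array([-1.2, 1.0])))  # approx [1., 1.]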
The step length should not be generated by a simple backtracking algorithm, since the update relies on the curvature condition: the Wolfe curvature condition guarantees ykT sk > 0, which keeps Hk+1 positive definite, while backtracking enforces only sufficient decrease. The performance may be degraded using backtracking. We can use an exact line search here instead, which also satisfies the curvature condition.