Optimization 2
Lecture outline
New topics:
• Axial iteration
• Levenberg-Marquardt algorithm
• Application
Introduction: Problem specification
    min_x f(x)

[Figure: a non-convex function f(x) with a local minimum and the global minimum marked.]
Optimization algorithm – Random direction

Choosing the direction 1: axial iteration

• Minimize along each of the axial (coordinate) directions in turn; a code sketch follows below.

[Figure: the optimization algorithm's iterates when the search directions are the axial directions.]
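As a concrete illustration (not from the lecture), a minimal Python sketch of axial iteration: minimize along one coordinate axis at a time using a scalar line minimization. The quadratic test function and the use of scipy's minimize_scalar are illustrative choices.

# Minimal sketch of axial iteration (coordinate-wise line minimization).
# The test function and the use of minimize_scalar are illustrative choices.
import numpy as np
from scipy.optimize import minimize_scalar

def f(x):
    # a simple convex quadratic with correlated variables
    return x[0]**2 + 4 * x[1]**2 + 2 * x[0] * x[1] - 4 * x[0]

def axial_iteration(f, x0, n_sweeps=20):
    x = np.asarray(x0, dtype=float)
    for _ in range(n_sweeps):
        for i in range(x.size):                     # one axis at a time
            def along_axis(t, i=i):
                xt = x.copy()
                xt[i] = t                           # vary only coordinate i
                return f(xt)
            x[i] = minimize_scalar(along_axis).x    # exact 1-D line minimization
    return x

print(axial_iteration(f, [5.0, 5.0]))

Each sweep performs one line minimization per axis; when the variables are strongly correlated many sweeps are needed, which motivates the better direction choices discussed next.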
Gradient and Partial Derivatives
A function of several variables can be written as f(x1, x2), etc. Often we abbreviate multiple arguments in a single vector as f(x). For a function g(x, y) with signature g : Rⁿ × Rᵐ → R, its derivative with respect to just x is written ∇x g(x, y).

Gradient and tangent plane (1st-degree Taylor expansion):

    τ_{x,1}(y) = f(x) + (y − x)ᵀ ∇f(x)
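As a quick numerical check (an illustrative example, not from the lecture), the sketch below compares f(y) with the tangent-plane value τ_{x,1}(y) for a nearby point y, using the Rosenbrock function that appears later in the lecture.

# Check the 1st-degree Taylor (tangent-plane) approximation numerically.
import numpy as np

def f(v):
    x, y = v
    return 100.0 * (y - x**2)**2 + (1.0 - x)**2     # Rosenbrock function

def grad_f(v):
    x, y = v
    return np.array([-400.0 * x * (y - x**2) - 2.0 * (1.0 - x),
                     200.0 * (y - x**2)])

x = np.array([-1.0, 1.0])
y = x + np.array([1e-3, -2e-3])                     # a nearby point
tangent = f(x) + (y - x) @ grad_f(x)                # tau_{x,1}(y)
print(f(y), tangent)                                # agree to first order in |y - x|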
Choosing the direction 2: steepest descent
[Figure: the steepest-descent direction shown on a contour plot.]
Optimization algorithm – Steepest descent
[Figure: steepest-descent iterates on the contour plot, zig-zagging down the valley.]
• After each line minimization the new gradient is always orthogonal to the previous step direction (true of any line minimization).
• Consequently, the iterates tend to zig-zag down the valley in a very inefficient manner.
Gradient Descent
• Take the descent step direction d = −∇f(x).

[Figure: line search along the descent direction, shown over the (x1, x2) plane.]

Gradient Descent for the Rosenbrock function

    f(x, y) = 100 (y − x²)² + (1 − x)²
[Figure: contour plots of the Rosenbrock function, full view and a zoomed detail near the curved valley.]
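A minimal Python sketch (not the lecture's code) of gradient descent on the Rosenbrock function, using a simple backtracking line search in place of an exact line minimization; the starting point and constants are illustrative choices.

# Gradient descent with a backtracking (Armijo) line search on the Rosenbrock function.
import numpy as np

def f(v):
    x, y = v
    return 100.0 * (y - x**2)**2 + (1.0 - x)**2

def grad(v):
    x, y = v
    return np.array([-400.0 * x * (y - x**2) - 2.0 * (1.0 - x),
                     200.0 * (y - x**2)])

x = np.array([-1.0, 1.0])
for it in range(100000):
    g = grad(x)
    if np.linalg.norm(g) < 1e-3:                    # same stopping rule as in the plots
        break
    d = -g                                          # steepest-descent direction
    alpha = 1.0
    while f(x + alpha * d) > f(x) - 1e-4 * alpha * (g @ g):
        alpha *= 0.5                                # backtracking line search
    x = x + alpha * d
print(it, x)                                        # progress is slow along the valley

The slow, zig-zagging progress along the valley is exactly the behaviour described above.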
Choosing the direction 3: conjugate gradients

    p_n = −∇f_n + β_n p_{n−1},   β_n = (∇f_nᵀ ∇f_n) / (∇f_{n−1}ᵀ ∇f_{n−1})
• For an N-dimensional quadratic, the minimum is reached in exactly N steps.
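A minimal sketch of nonlinear conjugate gradients using the Fletcher-Reeves form of β_n given above; the exact line search via minimize_scalar and the Rosenbrock test function are illustrative choices, and the N-step guarantee applies only to quadratic functions.

# Nonlinear conjugate gradients (Fletcher-Reeves) with an exact 1-D line search.
import numpy as np
from scipy.optimize import minimize_scalar

def f(v):
    x, y = v
    return 100.0 * (y - x**2)**2 + (1.0 - x)**2

def grad(v):
    x, y = v
    return np.array([-400.0 * x * (y - x**2) - 2.0 * (1.0 - x),
                     200.0 * (y - x**2)])

x = np.array([-1.0, 1.0])
g = grad(x)
p = -g                                              # first direction: steepest descent
for it in range(500):
    if np.linalg.norm(g) < 1e-3:
        break
    alpha = minimize_scalar(lambda a: f(x + a * p)).x   # line minimization along p
    x = x + alpha * p
    g_new = grad(x)
    beta = (g_new @ g_new) / (g @ g)                # Fletcher-Reeves beta
    p = -g_new + beta * p                           # new conjugate direction
    g = g_new
print(it, x)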
The Hessian Matrix
Minimize the second-order Taylor expansion of f about x over δx:

    min_δx  f(x + δx) ≈ a + gᵀ δx + ½ δxᵀ H δx

where a = f(x), g is the gradient and H is the Hessian of f at x. For a minimum we require ∇f(x + δx) = 0, and so

    ∇f(x + δx) = g + H δx = 0

with the solution δx = −H⁻¹ g (the Newton step).
This gives the Newton iterative update

    x_{n+1} = x_n − H_n⁻¹ g_n

or, with a step length α_n chosen by a line search,

    x_{n+1} = x_n − α_n H_n⁻¹ g_n
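A minimal sketch of the Newton update x_{n+1} = x_n − H_n⁻¹ g_n on the Rosenbrock function, with an analytic gradient and Hessian; there is no damping or step-length control here, so this is only an illustration.

# Newton's method on the Rosenbrock function: solve H dx = -g at each iterate.
import numpy as np

def grad(v):
    x, y = v
    return np.array([-400.0 * x * (y - x**2) - 2.0 * (1.0 - x),
                     200.0 * (y - x**2)])

def hess(v):
    x, y = v
    return np.array([[1200.0 * x**2 - 400.0 * y + 2.0, -400.0 * x],
                     [-400.0 * x,                        200.0]])

x = np.array([-1.0, 1.0])
for it in range(100):
    g = grad(x)
    if np.linalg.norm(g) < 1e-3:
        break
    dx = np.linalg.solve(hess(x), -g)   # Newton step: H dx = -g
    x = x + dx                          # no line search / damping in this sketch
print(it, x)                            # reaches the minimum (1, 1) in a few steps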
[Figure: Newton iterations on the Rosenbrock function (two views); gradient < 1e-3 after 15 iterations.]
3. Memory footprint
4. Region of convergence
Non-linear least squares
    f(x) = Σ_{i=1}^{M} r_i²(x)

Gradient:

    ∇f(x) = 2 Σ_i r_i(x) ∇r_i(x)

Hessian:

    H = ∇∇ᵀ f(x) = 2 Σ_i ∇( r_i(x) ∇ᵀ r_i(x) )
                 = 2 Σ_i ∇r_i(x) ∇ᵀ r_i(x) + r_i(x) ∇∇ᵀ r_i(x)

Gauss-Newton approximation: near the minimum the residuals r_i(x) are small, so the second term can be dropped, giving

    H_GN = 2 Σ_i ∇r_i(x) ∇ᵀ r_i(x)
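A minimal Gauss-Newton sketch for this least-squares form, writing the Rosenbrock function as the residuals r₁ = 10(y − x²) and r₂ = 1 − x so that f = r₁² + r₂² (this residual split and the starting point are illustrative choices).

# Gauss-Newton on the Rosenbrock function written as a sum of squared residuals.
import numpy as np

def residuals(v):
    x, y = v
    return np.array([10.0 * (y - x**2), 1.0 - x])       # f = r1^2 + r2^2

def jacobian(v):
    x, y = v
    return np.array([[-20.0 * x, 10.0],                  # d r1 / d(x, y)
                     [-1.0,       0.0]])                 # d r2 / d(x, y)

x = np.array([-1.0, 1.0])
for it in range(100):
    r, J = residuals(x), jacobian(x)
    g = 2.0 * J.T @ r                                    # gradient of f
    if np.linalg.norm(g) < 1e-3:
        break
    H_gn = 2.0 * J.T @ J                                 # Gauss-Newton Hessian approximation
    x = x + np.linalg.solve(H_gn, -g)                    # solve H_GN dx = -g
print(it, x)

Because the rows of the Jacobian J are ∇ᵀr_i, the gradient is 2 Jᵀr and the Gauss-Newton Hessian is 2 JᵀJ.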
[Figure: Gauss-Newton iterations on the Rosenbrock function (two views); gradient < 1e-3 after 14 iterations.]
[Figure: two further runs on the Rosenbrock function; gradient < 1e-3 after 15 and 14 iterations respectively.]
Method: x_{n+1} = x_n + δx, where δx solves

1. Newton:            H δx = −g
2. Gauss-Newton:      H_GN δx = −g
3. Gradient descent:  λ δx = −g
Levenberg-Marquardt algorithm
• Away from the minimum, in regions of negative curvature, the Gauss-Newton approximation is not very good.
[Figure: a function with a region of negative curvature, comparing the Newton direction and the gradient-descent direction.]
• The method uses the modified Hessian

    H(x, λ) = H_GN + λ I

• Small λ gives the Gauss-Newton step; large λ gives a (scaled) gradient-descent step.
• Note: this algorithm does not require explicit line searches.
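A minimal Levenberg-Marquardt sketch using the modified Hessian H_GN + λI; the ×10 / ÷10 schedule for λ is a common heuristic chosen here for illustration, and the residuals are the same Rosenbrock split used above.

# Levenberg-Marquardt: Gauss-Newton with a damped Hessian H_GN + lambda*I,
# increasing lambda when a step fails and decreasing it when a step succeeds.
import numpy as np

def residuals(v):
    x, y = v
    return np.array([10.0 * (y - x**2), 1.0 - x])

def jacobian(v):
    x, y = v
    return np.array([[-20.0 * x, 10.0],
                     [-1.0,       0.0]])

x, lam = np.array([-1.0, 1.0]), 1e-3
for it in range(200):
    r, J = residuals(x), jacobian(x)
    g = 2.0 * J.T @ r
    if np.linalg.norm(g) < 1e-3:
        break
    H = 2.0 * J.T @ J + lam * np.eye(2)          # modified Hessian H_GN + lambda*I
    dx = np.linalg.solve(H, -g)
    if np.sum(residuals(x + dx)**2) < np.sum(r**2):
        x, lam = x + dx, lam / 10.0              # accept step, trust Gauss-Newton more
    else:
        lam *= 10.0                              # reject step, move toward gradient descent
print(it, x)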
Example
[Figure: Levenberg-Marquardt method on the Rosenbrock function (two views); gradient < 1e-3 after 31 iterations.]
• Minimization using Levenberg-Marquardt (no line search) takes 31 iterations.
Matlab: lsqnonlin
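For readers not using MATLAB, SciPy's least_squares plays the analogous role; a minimal sketch below, assuming the same Rosenbrock residual split as above (this is not the lecture's code).

# SciPy analogue of MATLAB's lsqnonlin for nonlinear least squares.
import numpy as np
from scipy.optimize import least_squares

def residuals(p):
    # Rosenbrock written as residuals: f(x, y) = r1^2 + r2^2
    x, y = p
    return np.array([10.0 * (y - x**2), 1.0 - x])

result = least_squares(residuals, x0=np.array([-1.0, 1.0]), method="lm")
print(result.x)   # approaches the minimizer (1, 1)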
Comparison
[Figure: Gauss-Newton method with line search (gradient < 1e-3 after 14 iterations) vs. Levenberg-Marquardt method (gradient < 1e-3 after 31 iterations) on the Rosenbrock function.]