
Optimization Methods in Engineering Design
Day-3a
Lecture outline

Unconstrained continuous optimization:


• Convexity
• Iterative optimization algorithms
• Gradient descent
• Newton’s method
• Gauss-Newton method

New topics:
• Axial iteration
• Levenberg-Marquardt algorithm
• Application
Introduction: Problem specification

Suppose we have a cost function (or objective function)

$$f(\mathbf{x}) : \mathbb{R}^n \mapsto \mathbb{R}$$

Our aim is to find the value of the parameters x that minimize this function:

$$\mathbf{x}^* = \arg\min_{\mathbf{x}} f(\mathbf{x})$$
subject to the following constraints:

• equality: $c_i(\mathbf{x}) = 0, \quad i = 1, \ldots, m_e$

• inequality: $c_i(\mathbf{x}) \geq 0, \quad i = m_e + 1, \ldots, m$

We will start by focussing on unconstrained problems


Unconstrained optimization
For a function of one variable $f(x)$, we seek $\min_x f(x)$.

[Figure: a function of one variable with both a local minimum and a global minimum]

• down-hill search (gradient descent) algorithms can find local minima


• which of the minima is found depends on the starting point
• such minima often occur in real applications
Reminder: convexity
Class of functions

[Figure: examples of a convex and a non-convex function]

• Convexity provides a test for a single extremum


• A non-negative sum of convex functions is convex
Class of functions continued

[Figure: single extremum – convex; single extremum – non-convex; multiple extrema – non-convex; noisy; horrible]


Optimization algorithm – key ideas

• Find $\delta\mathbf{x}$ such that $f(\mathbf{x} + \delta\mathbf{x}) < f(\mathbf{x})$

• This leads to an iterative update: $\mathbf{x}_{n+1} = \mathbf{x}_n + \delta\mathbf{x}$

• Reduce the problem to a series of 1D line searches: $\delta\mathbf{x} = \alpha\,\mathbf{p}$

[Figure: contour-plot demo of iterative descent with randomly chosen search directions (Optimization algorithm – Random direction)]
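These key ideas map directly onto a generic descent loop. The following Python sketch is illustrative only: the names `descend` and `direction_fn`, and the simple backtracking rule for choosing the step length α, are my own assumptions, not taken from the slides.

```python
import numpy as np

def descend(f, grad, x0, direction_fn, n_iter=100, tol=1e-6):
    """Generic iterative descent: pick a direction p, then line-search along it."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        p = direction_fn(x)                      # choose a search direction
        alpha = 1.0                              # find alpha with f(x + alpha*p) < f(x)
        while alpha > 1e-12 and f(x + alpha * p) >= f(x):
            alpha *= 0.5                         # crude backtracking line search
        x = x + alpha * p                        # update x_{n+1} = x_n + alpha*p
        if np.linalg.norm(grad(x)) < tol:        # stop when the gradient is small
            break
    return x
```

With `direction_fn` returning a random direction, the negative gradient, or a Newton step, this same loop reproduces the variants discussed on the following slides.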
Choosing the direction 1: axial iteration

Alternate minimization over x and y

[Figure: contour-plot demo of axial iteration – alternating minimization along the coordinate axes (Optimization algorithm – axial directions)]
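A minimal sketch of axial iteration, assuming the helper name `axial_iteration` and SciPy's scalar minimizer for the 1D line minimizations (both my own choices, not from the slides):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def axial_iteration(f, x0, n_sweeps=50):
    """Alternate exact 1D minimizations along each coordinate axis in turn."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_sweeps):
        for i in range(x.size):
            def f_along_axis(t, i=i):
                xt = x.copy()
                xt[i] = t                      # vary only coordinate i, others fixed
                return f(xt)
            x[i] = minimize_scalar(f_along_axis).x
    return x
```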
Choosing the direction 2: steepest descent

Move in the direction of the negative gradient, $-\nabla f(\mathbf{x}_n)$


[Figure: contour-plot demo of a steepest-descent step along the gradient direction]
Optimization algorithm – Steepest descent
Steepest descent
[Figure: steepest descent zig-zagging across the contours of a quadratic function]
• The gradient is everywhere perpendicular to the contour lines.

• After each line minimization the new gradient is always orthogonal to the previous step direction (true of any line minimization).

• Consequently, the iterates tend to zig-zag down the valley in a very inefficient manner.
A harder case: Rosenbrock’s function

$$f(x, y) = 100\,(y - x^2)^2 + (1 - x)^2$$

[Figure: contour plot of the Rosenbrock function]

Minimum is at $(1, 1)$.


Steepest descent on Rosenbrock function

[Figure: steepest descent on the Rosenbrock function – full view and zoomed view showing the zig-zag steps]

• The zig-zag behaviour is clear in the zoomed view (100 iterations)

• The algorithm crawls down the valley
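This behaviour can be reproduced with a few lines of Python. The sketch below applies steepest descent with a backtracking line search to the Rosenbrock function; the starting point, step rule, and iteration count are illustrative assumptions, not values from the slides.

```python
import numpy as np

def rosenbrock(v):
    x, y = v
    return 100.0 * (y - x**2)**2 + (1.0 - x)**2

def rosenbrock_grad(v):
    x, y = v
    return np.array([-400.0 * x * (y - x**2) - 2.0 * (1.0 - x),
                     200.0 * (y - x**2)])

def steepest_descent(f, grad, x0, n_iter=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        p = -grad(x)                              # descent direction: negative gradient
        alpha = 1.0
        while alpha > 1e-12 and f(x + alpha * p) >= f(x):
            alpha *= 0.5                          # backtrack until the step goes downhill
        x = x + alpha * p
    return x

x = steepest_descent(rosenbrock, rosenbrock_grad, x0=[-1.0, 1.0])
# Even after 100 iterations the iterate remains far from the minimum at (1, 1),
# reflecting the slow, zig-zagging progress along the valley floor.
```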


Conjugate Gradients – sketch only
The method of conjugate gradients chooses successive descent directions $\mathbf{p}_n$ such that it is guaranteed to reach the minimum in a finite number of steps.

• Each $\mathbf{p}_n$ is chosen to be conjugate to all previous search directions with respect to the Hessian $H$:

$$\mathbf{p}_n^\top H\, \mathbf{p}_j = 0, \qquad 0 < j < n$$

• The resulting search directions are mutually linearly independent.

• Remarkably, $\mathbf{p}_n$ can be chosen using only knowledge of $\mathbf{p}_{n-1}$, $\nabla f(\mathbf{x}_{n-1})$ and $\nabla f(\mathbf{x}_n)$ (see Numerical Recipes):

$$\mathbf{p}_n = \nabla f_n + \frac{\nabla f_n^\top\, \nabla f_n}{\nabla f_{n-1}^\top\, \nabla f_{n-1}}\; \mathbf{p}_{n-1}$$
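As a rough illustration, here is a nonlinear conjugate-gradient sketch using the Fletcher-Reeves coefficient above. The code follows the usual sign convention (search direction built from the negative gradient), and the simple backtracking line search is my own simplification; a practical implementation would use a stronger line search (see Numerical Recipes).

```python
import numpy as np

def conjugate_gradient_fr(f, grad, x0, n_iter=100, tol=1e-6):
    """Nonlinear conjugate gradients with the Fletcher-Reeves coefficient."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    p = -g                                       # first direction: steepest descent
    for _ in range(n_iter):
        alpha = 1.0
        while alpha > 1e-12 and f(x + alpha * p) >= f(x):
            alpha *= 0.5                         # crude line search along p
        x = x + alpha * p
        g_new = grad(x)
        if np.linalg.norm(g_new) < tol:
            break
        beta = (g_new @ g_new) / (g @ g)         # Fletcher-Reeves: |grad_n|^2 / |grad_{n-1}|^2
        p = -g_new + beta * p                    # new direction from current gradient and previous p
        g = g_new
    return x
```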
Choosing the direction 3: conjugate gradients

Again, this uses first derivatives only, but avoids “undoing” previous work.

• An N-dimensional quadratic form can be minimized in at most N conjugate-descent steps.

• 3 different starting points.

• Minimum is reached in exactly 2 steps.
Choosing the direction 4: Newton’s method
Start from Taylor expansion in 2D
• A function may be approximated locally by its Taylor series expansion about a point $\mathbf{x}_0$:

$$f(\mathbf{x} + \delta\mathbf{x}) \approx f(\mathbf{x}) + \left(\frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y}\right)\begin{pmatrix}\delta x\\ \delta y\end{pmatrix} + \frac{1}{2}\,\left(\delta x,\ \delta y\right)\begin{pmatrix}\frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x\,\partial y}\\ \frac{\partial^2 f}{\partial x\,\partial y} & \frac{\partial^2 f}{\partial y^2}\end{pmatrix}\begin{pmatrix}\delta x\\ \delta y\end{pmatrix}$$

The expansion to second order is a quadratic function:

$$f(\mathbf{x} + \delta\mathbf{x}) \approx a + \mathbf{g}^\top \delta\mathbf{x} + \frac{1}{2}\,\delta\mathbf{x}^\top H\, \delta\mathbf{x}$$

Now minimize this expansion over $\delta\mathbf{x}$:

$$\min_{\delta\mathbf{x}}\ f(\mathbf{x} + \delta\mathbf{x}) \approx a + \mathbf{g}^\top \delta\mathbf{x} + \frac{1}{2}\,\delta\mathbf{x}^\top H\, \delta\mathbf{x}$$

For a minimum we require that $\nabla f(\mathbf{x} + \delta\mathbf{x}) = 0$, and so

$$\nabla f(\mathbf{x} + \delta\mathbf{x}) = \mathbf{g} + H\,\delta\mathbf{x} = 0$$

with solution $\delta\mathbf{x} = -H^{-1}\mathbf{g}$ (in Matlab: δx = −H\g).

This gives the iterative update

$$\mathbf{x}_{n+1} = \mathbf{x}_n - H_n^{-1}\,\mathbf{g}_n$$

[Figure: contour-plot demo of the Newton update]

• If $f(\mathbf{x})$ is quadratic, then the solution is found in one step.

• The method has quadratic convergence (as in the 1D case).

• The solution $\delta\mathbf{x} = -H_n^{-1}\,\mathbf{g}_n$ is guaranteed to be a downhill direction provided that $H$ is positive definite.

• Rather than jump straight to the predicted solution at $\mathbf{x}_n - H_n^{-1}\,\mathbf{g}_n$, it is better to perform a line search:

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \alpha_n H_n^{-1}\,\mathbf{g}_n$$

• If $H = I$, then this reduces to steepest descent.
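A compact Python sketch of Newton's method with a line search, following the update above; the helper names and the backtracking rule are my own assumptions, and `np.linalg.solve(H, -g)` plays the role of Matlab's −H\g.

```python
import numpy as np

def newton_linesearch(f, grad, hess, x0, n_iter=50, tol=1e-3):
    """Newton's method with a backtracking line search along the Newton direction."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H = hess(x)
        dx = np.linalg.solve(H, -g)              # Newton step: solve H dx = -g
        alpha = 1.0
        while alpha > 1e-12 and f(x + alpha * dx) >= f(x):
            alpha *= 0.5                         # line search: x_{n+1} = x_n + alpha*dx
        x = x + alpha * dx
    return x
```

If $H$ is not positive definite the Newton direction may not be downhill; the Levenberg-Marquardt modification later in the lecture addresses exactly this case.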


Newton’s method - example
[Figure: Newton method with line search on the Rosenbrock function, full and zoomed views; gradient < 1e-3 after 15 iterations; ellipses show successive quadratic approximations]

• The algorithm converges in only 15 iterations – far superior to steepest descent.

• However, the method requires computing the Hessian matrix at each iteration – this is not always feasible.
Optimization algorithm – Newton method
Performance issues for optimization algorithms

1. Number of iterations required

2. Cost per iteration

3. Memory footprint

4. Region of convergence
Non-linear least squares

$$f(\mathbf{x}) = \sum_{i=1}^{M} r_i^2(\mathbf{x})$$

Gradient:

$$\nabla f(\mathbf{x}) = 2 \sum_i r_i(\mathbf{x})\, \nabla r_i(\mathbf{x})$$

Hessian:

$$H = \nabla\nabla^\top f(\mathbf{x}) = 2 \sum_i \nabla\!\left(r_i(\mathbf{x})\, \nabla^\top r_i(\mathbf{x})\right) = 2 \sum_i \left(\nabla r_i(\mathbf{x})\, \nabla^\top r_i(\mathbf{x}) + r_i(\mathbf{x})\, \nabla\nabla^\top r_i(\mathbf{x})\right)$$

which is approximated as

$$H_{GN} = 2 \sum_i \nabla r_i(\mathbf{x})\, \nabla^\top r_i(\mathbf{x})$$

This is the Gauss-Newton approximation:

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \alpha_n H_n^{-1}\,\mathbf{g}_n \qquad \text{with} \qquad H_n(\mathbf{x}) = H_{GN}(\mathbf{x}_n)$$
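Writing the residuals as a vector $\mathbf{r}(\mathbf{x})$ with Jacobian $J$ (rows $\nabla^\top r_i$), the gradient is $\mathbf{g} = 2 J^\top \mathbf{r}$ and the Gauss-Newton Hessian is $H_{GN} = 2 J^\top J$, so the step solves $(J^\top J)\,\delta\mathbf{x} = -J^\top \mathbf{r}$. A minimal sketch under those definitions (the helper names are mine):

```python
import numpy as np

def gauss_newton(residuals, jac, x0, n_iter=50, tol=1e-3):
    """Gauss-Newton with a backtracking line search for f(x) = sum_i r_i(x)^2."""
    f = lambda v: np.sum(residuals(v) ** 2)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        r = residuals(x)
        J = jac(x)                               # J[i, :] = gradient of r_i at x
        g = 2.0 * J.T @ r                        # gradient of f
        if np.linalg.norm(g) < tol:
            break
        H_gn = 2.0 * J.T @ J                     # Gauss-Newton Hessian approximation
        dx = np.linalg.solve(H_gn, -g)
        alpha = 1.0
        while alpha > 1e-12 and f(x + alpha * dx) >= f(x):
            alpha *= 0.5                         # line search as in the update above
        x = x + alpha * dx
    return x
```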

Gauss-Newton method with line search

[Figure: Gauss-Newton method with line search on the Rosenbrock function, full and zoomed views; gradient < 1e-3 after 14 iterations]

• Minimization with the Gauss-Newton approximation and line search takes only 14 iterations.
Comparison
[Figure: Newton method with line search (gradient < 1e-3 after 15 iterations) vs. Gauss-Newton method with line search (gradient < 1e-3 after 14 iterations) on the Rosenbrock function]

Newton:
• requires computing the Hessian
• exact solution if quadratic

Gauss-Newton:
• approximates the Hessian by a product of gradients of the residuals
• requires only first derivatives
Summary of minimization methods

Update: $\mathbf{x}_{n+1} = \mathbf{x}_n + \delta\mathbf{x}$

1. Newton: $H\,\delta\mathbf{x} = -\mathbf{g}$

2. Gauss-Newton: $H_{GN}\,\delta\mathbf{x} = -\mathbf{g}$

3. Gradient descent: $\lambda\,\delta\mathbf{x} = -\mathbf{g}$
Levenberg-Marquardt algorithm
• Away from the minimum, in regions of negative curvature, the Gauss-Newton approximation is not very good.

• In such regions, a simple steepest-descent step is probably the best plan.

• The Levenberg-Marquardt method is a mechanism for varying between steepest-descent and Gauss-Newton steps depending on how good the $H_{GN}$ approximation is locally.
[Figure: 1D function comparing a Newton step and a gradient-descent step]
• The method uses the modified Hessian

$$H(\mathbf{x}, \lambda) = H_{GN} + \lambda I$$

• When $\lambda$ is small, $H$ approximates the Gauss-Newton Hessian.

• When $\lambda$ is large, $H$ is close to the identity, causing steepest-descent steps to be taken.
LM Algorithm
$$H(\mathbf{x}, \lambda) = H_{GN}(\mathbf{x}) + \lambda I$$

1. Set $\lambda = 0.001$ (say).

2. Solve $\delta\mathbf{x} = -H(\mathbf{x}, \lambda)^{-1}\,\mathbf{g}$.

3. If $f(\mathbf{x}_n + \delta\mathbf{x}) > f(\mathbf{x}_n)$, increase $\lambda$ (×10 say) and go to 2.

4. Otherwise, decrease $\lambda$ (×0.1 say), let $\mathbf{x}_{n+1} = \mathbf{x}_n + \delta\mathbf{x}$, and go to 2.

Note: This algorithm does not require explicit line searches.
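The four steps above translate into a short loop. The sketch below reuses the residual/Jacobian interface from the Gauss-Newton sketch earlier; the ×10 and ×0.1 factors follow the slide, while the function names and iteration limit are illustrative assumptions.

```python
import numpy as np

def levenberg_marquardt(residuals, jac, x0, lam=1e-3, n_iter=200, tol=1e-3):
    """Levenberg-Marquardt: vary lambda between Gauss-Newton and gradient-descent behaviour."""
    f = lambda v: np.sum(residuals(v) ** 2)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        r = residuals(x)
        J = jac(x)
        g = 2.0 * J.T @ r                               # gradient of f
        if np.linalg.norm(g) < tol:
            break
        H = 2.0 * J.T @ J + lam * np.eye(x.size)        # modified Hessian H(x, lambda)
        dx = np.linalg.solve(H, -g)
        if f(x + dx) > f(x):
            lam *= 10.0      # step rejected: increase lambda (more like gradient descent)
        else:
            lam *= 0.1       # step accepted: decrease lambda (more like Gauss-Newton)
            x = x + dx
    return x
```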
Example

[Figure: Levenberg-Marquardt method on the Rosenbrock function, full and zoomed views; gradient < 1e-3 after 31 iterations]

• Minimization using Levenberg-Marquardt (no line search) takes 31 iterations.

Matlab: lsqnonlin
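For readers working in Python rather than Matlab, SciPy's `scipy.optimize.least_squares` offers comparable nonlinear least-squares functionality, with `method='lm'` selecting a Levenberg-Marquardt implementation. The residual form of the Rosenbrock function below is my own illustrative example, not part of the slides.

```python
import numpy as np
from scipy.optimize import least_squares

def rosen_residuals(v):
    # f(x, y) = 100*(y - x^2)^2 + (1 - x)^2 = r1^2 + r2^2
    x, y = v
    return np.array([10.0 * (y - x**2), 1.0 - x])

result = least_squares(rosen_residuals, x0=[-1.0, 1.0], method='lm')
print(result.x)   # close to the minimum at (1, 1)
```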
Comparison

[Figure: Gauss-Newton method with line search (gradient < 1e-3 after 14 iterations) vs. Levenberg-Marquardt method (gradient < 1e-3 after 31 iterations) on the Rosenbrock function]

• more iterations than Gauss-Newton, but
• no line search required,
• and more frequently converges
