Multivariate Optimization
J. McNames, Portland State University, ECE 4/557, Ver. 1.14
Overview of Multivariate Optimization Topics

- Problem definition
- Algorithms
  - Cyclic coordinate method
  - Steepest descent
  - Conjugate gradient algorithms
  - PARTAN
  - Newton's method
  - Levenberg-Marquardt
- Concise, subjective summary

Multivariate Optimization Overview

- The unconstrained optimization problem is a generalization of the line
  search problem: find a vector a* such that

      a* = argmin_a f(a)

- Note that there are no constraints on a.
- Example: find the vector of coefficients w ∈ R^(p×1) that minimizes
  the average absolute error of a linear model.
- Akin to a blind person trying to find their way to the bottom of a
  valley in a multidimensional landscape: we want to reach the bottom
  with the minimum number of cane taps.
- Also vaguely similar to taking core samples for oil prospecting.
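To make the linear-model example above concrete, here is a minimal MATLAB sketch. It uses fminsearch (derivative-free Nelder-Mead) because the average absolute error is not differentiable everywhere; the synthetic data, model size, and starting vector are illustrative assumptions, not part of the original slides.

% Fit w (slope and intercept) by minimizing the average absolute error.
% The data here are synthetic, for illustration only.
rng(0);                          % reproducible example
n  = 100;
x  = randn(n,1);
t  = 2*x + 1 + 0.5*randn(n,1);   % targets from an assumed "true" model
X  = [x ones(n,1)];              % design matrix with intercept column
mae = @(w) mean(abs(t - X*w));   % f(w): average absolute error
w0  = zeros(2,1);                % starting vector
w   = fminsearch(mae, w0);       % Nelder-Mead, no gradient required
fprintf('w = [%.3f %.3f], MAE = %.3f\n', w(1), w(2), mae(w));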

Example 1: Optimization Problem

[Figures: contour map of the example cost function over a_1, a_2 ∈ [-5, 5],
a quiver plot of its gradient field, and three 3-D surface views.]

Example 1: MATLAB Code

function [] = OptimizationProblem();

%===============================================================================
% User-Specified Parameters
%===============================================================================
x = -5:0.05:5;
y = -5:0.05:5;

%===============================================================================
% Evaluate the Function
%===============================================================================
[X,Y] = meshgrid(x,y);
[Z,G] = OptFn(X,Y);
functionName   = 'OptimizationProblem';
fileIdentifier = fopen([functionName '.tex'],'w');

%===============================================================================
% Contour Map
%===============================================================================
figure;
FigureSet(2,'Slides');
contour(x,y,Z,50);
xlabel('a_1');
ylabel('a_2');
zoom on;
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Contour');
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\stepcounter{exc}\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');

%===============================================================================
% Quiver Map
%===============================================================================
figure;
FigureSet(1,'Slides');
axis([-5 5 -5 5]);
contour(x,y,Z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
hold on;
xCoarse = -5:0.5:5;
yCoarse = -5:0.5:5;
[X,Y] = meshgrid(xCoarse,yCoarse);
[ZCoarse,GCoarse] = OptFn(X,Y);
nr  = length(xCoarse);            % number of grid points
dzx = GCoarse(1:nr,1:nr);
dzy = GCoarse(nr+(1:nr),1:nr);
quiver(xCoarse,yCoarse,dzx,dzy);
hold off;
xlabel('a_1');
ylabel('a_2');
zoom on;
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Quiver');
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');

%===============================================================================
% 3D Maps
%===============================================================================
figure;
set(gcf,'Renderer','zbuffer');
FigureSet(1,'Slides');
h = surf(x,y,Z);
set(h,'LineStyle','None');
xlabel('a_1');
ylabel('a_2');
shading interp;
grid on;
AxisSet(8);
hl = light('Position',[0,0,30]);
set(hl,'Style','Local');
set(h,'BackFaceLighting','unlit');
material dull

for c1 = 1:3
    switch c1
        case 1, view(45,10);
        case 2, view(-55,22);
        case 3, view(-131,10);
        otherwise, error('Not implemented.');
    end
    fileName = sprintf('%s-%s%d',functionName,'Surface',c1);
    print(fileName,'-depsc');
    fprintf(fileIdentifier,'%%==============================================\n');
    fprintf(fileIdentifier,'\\newslide\n');
    fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
    fprintf(fileIdentifier,'%%==============================================\n');
    fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
    fprintf(fileIdentifier,'\n');
end

%===============================================================================
% List the MATLAB Code
%===============================================================================
fprintf(fileIdentifier,'%%==============================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: MATLAB Code}\n');
fprintf(fileIdentifier,'%%==============================================\n');
fprintf(fileIdentifier,'\t\\matlabcode{Matlab/%s.m}\n',functionName);

fclose(fileIdentifier);

Global Optimization?

- In general, all optimization algorithms find a local minimum in as few
  steps as possible.
- There are also global optimization algorithms based on ideas such as
  - Evolutionary computing
  - Genetic algorithms
  - Simulated annealing
- None of these guarantee convergence in a finite number of iterations.
- All require a lot of computation.
Optimization Comments

- Ideally, when we construct models we should favor those that can be
  optimized with few shallow local minima and reasonable computation.
- Graphically, you can think of the function to be minimized as the
  elevation in a complicated high-dimensional landscape. The problem is
  to find the lowest point.
- The most common approach is to go downhill.
- The gradient points in the most uphill direction, so the steepest
  downhill direction is the opposite of the gradient.
- Most optimization algorithms use a line search algorithm.
- The methods mostly differ only in the way that the direction of
  descent is generated.

Optimization Algorithm Outline

The basic steps of these algorithms are as follows (a minimal sketch
follows this outline):

1. Pick a starting vector a.
2. Find the direction of descent, d.
3. Move in that direction until a minimum is found:

       α := argmin_α f(a + αd)
       a := a + αd

4. Loop to 2 until convergence.

- Most of the theory of these algorithms is based on quadratic surfaces.
  Near local minima, this is a good approximation.
- Note that the functions should (must) have continuous gradients
  (almost) everywhere.
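A minimal sketch of this outline, assuming a simple differentiable test function with an analytic gradient; fminbnd plays the role of the line search in step 3. All names and tolerances here are illustrative.

% Generic descent loop following the outline above.
f    = @(a) (a(1)-1)^2 + 4*(a(2)+2)^2;   % example quadratic cost
grad = @(a) [2*(a(1)-1); 8*(a(2)+2)];    % its gradient
a    = [-3; 1];                          % 1. starting vector
for k = 1:50
    d = -grad(a);                        % 2. direction of descent
    if norm(d) < 1e-8, break; end        % converged
    d = d/norm(d);
    phi   = @(alpha) f(a + alpha*d);     % 3. line search along d
    alpha = fminbnd(phi, 0, 10);
    a     = a + alpha*d;                 % 4. loop to 2
end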

Cyclic Coordinate Method

1. For i = 1 to p,

       a_i := argmin_λ f([a_1, a_2, ..., a_{i-1}, λ, a_{i+1}, ..., a_p])

2. Loop to 1 until convergence.

+ Simple to implement
+ Each line search can be performed semi-globally to avoid shallow
  local minima
+ Can be used with nominal variables
+ f(a) can be discontinuous
+ No gradient required
− Very slow compared to gradient-based optimization algorithms
− Usually only practical when the number of parameters, p, is small

There are modified versions with faster convergence. A self-contained
sketch of the basic method follows.
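The instructor's example code further below relies on the course's OptFn and LineSearch helpers; here, as a complement, is a self-contained sketch for p = 2 on an illustrative cost function of my own choosing, with fminbnd performing each one-dimensional search over [-5, 5]. No gradient is used.

% Self-contained cyclic coordinate sketch (illustrative function).
f = @(a) (a(1)-2)^2 + (a(2)+3)^2 + 0.5*sin(3*a(1))*sin(3*a(2));
a = [-3 1];                        % starting point
for pass = 1:20                    % loop until (approximate) convergence
    f1 = @(lam) f([lam a(2)]);     % vary coordinate 1 only
    a(1) = fminbnd(f1, -5, 5);
    f2 = @(lam) f([a(1) lam]);     % vary coordinate 2 only
    a(2) = fminbnd(f2, -5, 5);
end
fprintf('a = [%.3f %.3f], f(a) = %.4f\n', a(1), a(2), f(a));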
Example 2: Cyclic Coordinate Method

[Figures: full contour map over X, Y ∈ [-5, 5] with the cyclic
coordinate search trajectory, a zoomed contour map near the path, and
the function value versus iteration over roughly 25 iterations.]

Example 2: Cyclic Coordinate Method

[Figure: Euclidean position error versus iteration.]

Example 2: Relevant MATLAB Code

function [] = CyclicCoordinate();
%clear all;
close all;

ns = 26;

x  = -3;
y  = 1;
b0 = -1;
ls = 30;

a = zeros(ns,2);
f = zeros(ns,1);

[z,dzx,dzy] = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;
for cnt = 2:ns,
    if rem(cnt,2)==1,
        d = [1 0]'; % Along x direction
    else
        d = [0 1]'; % Along y direction
    end;

    [b,fmin] = LineSearch([x y]',d,b0,ls);

    x = x + b*d(1);
    y = y + b*d(2);

    a(cnt,:) = [x y];
    f(cnt)   = fmin;
end;

% Locate the two local minima by brute force on fine grids
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1  = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);

[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1   = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);

figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc CyclicCoordinateContourA;

figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-1.5+(-2:0.05:2),-1.5+(-2:0.05:2));
[z,dzx,dzy] = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc CyclicCoordinateContourB;

figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a - ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc CyclicCoordinatePositionError;

figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc CyclicCoordinateErrorLinear;
Steepest Descent

- The gradient of the function f(a) is defined as the vector of partial
  derivatives:

      ∇_a f(a) = [ ∂f(a)/∂a_1  ∂f(a)/∂a_2  ...  ∂f(a)/∂a_p ]^T

- It can be shown that the gradient, ∇_a f(a), points in the direction
  of maximum ascent.
- The negative of the gradient, -∇_a f(a), points in the direction of
  maximum descent.
- A vector d is a direction of descent if there exists a δ > 0 such
  that f(a + λd) < f(a) for all 0 < λ < δ.
- It can also be shown that d is a direction of descent if and only if
  (∇_a f(a))^T d < 0.
- The algorithm of steepest descent uses d = -∇_a f(a).
- This is the most fundamental of all algorithms for minimizing a
  continuously differentiable function.

Steepest Descent (continued)

+ Very stable algorithm
− Can converge very slowly once near the local minima, where the
  surface is approximately quadratic
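A self-contained sketch of steepest descent under illustrative assumptions: a simple quadratic cost with an analytic gradient, and fminbnd as the line search. The fixed iteration count mirrors the examples in these notes.

% Minimal steepest descent sketch (illustrative function).
f    = @(a) (a(1)-2)^2 + 4*(a(2)+3)^2;
grad = @(a) [2*(a(1)-2); 8*(a(2)+3)];
a    = [-3; 1];
fval = zeros(26,1); fval(1) = f(a);
for k = 2:26
    g = grad(a);
    if norm(g) < 1e-10, fval(k:end) = f(a); break; end
    d = -g/norm(g);                          % steepest descent direction
    alpha = fminbnd(@(s) f(a + s*d), 0, 10); % line search
    a = a + alpha*d;
    fval(k) = f(a);
end
plot(0:25, fval, '.-'); xlabel('Iteration'); ylabel('Function Value');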

Example 3: Steepest Descent

[Figures: full contour map over X, Y ∈ [-5, 5] with the steepest
descent trajectory, a zoomed contour map near (-1.6, -1.7) showing the
characteristic zig-zag path, the function value versus iteration, and
the Euclidean position error versus iteration.]

Example 3: Relevant MATLAB Code

function [] = SteepestDescent();
%clear all;
close all;

ns = 26;
x  = -3;
y  = 1;
b0 = 0.01;
ls = 30;

a = zeros(ns,2);
f = zeros(ns,1);

[z,g]  = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;
d = -g/norm(g);
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y]',d,b0,ls);

    x = x + b*d(1);
    y = y + b*d(2);

    [z,g] = OptFn(x,y);
    d = -g;
    d = d/norm(d);

    a(cnt,:) = [x y];
    f(cnt)   = z;
end;

% Locate the two local minima on fine grids (as in Example 2), giving
% (xopt,yopt,zopt) and (xopt2,yopt2,zopt2)

[zopt zopt2]

% (The remaining plotting code is identical to Example 2 except that the
% zoomed contour uses meshgrid(-1.6+(-0.5:0.01:0.5),-1.7+(-0.5:0.01:0.5))
% and the figures are printed as SteepestDescentContourA, ...ContourB,
% ...PositionError, and ...ErrorLinear.)
Conjugate Gradient Algorithms

1. Take a steepest descent step.
2. For i = 2 to p:

       α   := argmin_α f(a + αd)
       a   := a + αd
       g_i := ∇f(a)
       β   := (g_i^T g_i) / (g_{i-1}^T g_{i-1})
       d   := -g_i + βd

3. Loop to 1 until convergence.

- Based on quadratic approximations of f
- This version is called the Fletcher-Reeves method
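A minimal Fletcher-Reeves sketch, assuming a convex quadratic test function with an analytic gradient and using fminbnd for the line searches; the function and iteration count are illustrative.

% Minimal Fletcher-Reeves conjugate gradient sketch.
f    = @(a) (a(1)-2)^2 + 4*(a(2)+3)^2 + a(1)*a(2);
grad = @(a) [2*(a(1)-2) + a(2); 8*(a(2)+3) + a(1)];
a = [-3; 1];
g = grad(a);
d = -g;                                      % 1. steepest descent step
for k = 1:20
    alpha = fminbnd(@(s) f(a + s*d), 0, 10); % line search along d
    a  = a + alpha*d;
    go = g;                                  % previous gradient
    g  = grad(a);
    if norm(g) < 1e-10, break; end
    beta = (g'*g)/(go'*go);                  % Fletcher-Reeves beta
    d = -g + beta*d;
end
fprintf('a = [%.4f %.4f]\n', a(1), a(2));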
Example 4: Fletcher-Reeves Conjugate Gradient

[Figures: full contour map over X, Y ∈ [-5, 5] with the conjugate
gradient trajectory, a zoomed contour map near (2, -3), the function
value versus iteration, and the Euclidean position error versus
iteration.]
Example 4: Relevant MATLAB Code

function [] = FletcherReeves();
%clear all;
close all;

ns = 26;
x  = -3;
y  = 1;
b0 = 0.01;
ls = 30;

a = zeros(ns,2);
f = zeros(ns,1);

[z,g]  = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;
d = -g/norm(g); % First direction
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y]',d,b0,ls);

    x = x + b*d(1);
    y = y + b*d(2);

    go    = g; % Old gradient
    [z,g] = OptFn(x,y);

    beta = (g'*g)/(go'*go); % Fletcher-Reeves
    d    = -g + beta*d;

    a(cnt,:) = [x y];
    f(cnt)   = z;
end;

% Locate the two local minima on fine grids (as in Example 2), then plot.
% (The plotting code is identical to Example 2 except that the zoomed
% contour uses meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5) and the figures are
% printed as FletcherReevesContourA, ...ContourB, ...PositionError, and
% ...ErrorLinear.)
Conjugate Gradient Algorithms Continued

- There is also a variant called Polak-Ribiere, where

      β := ((g_i - g_{i-1})^T g_i) / (g_{i-1}^T g_{i-1})

+ Only requires the gradient
+ Converges in a finite number of steps when f(a) is quadratic and
  perfect line searches are used
− Less stable numerically than steepest descent
− Sensitive to inexact line searches
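The two updates differ by a single inner-product term; the snippet below shows them side by side with illustrative gradient values. For a quadratic with exact line searches the two coincide, since successive gradients are orthogonal.

% go is the previous gradient, g the current one (example values only).
go = [-9; 29];
g  = [ 3;  1];
beta_fr = (g'*g)/(go'*go)           % Fletcher-Reeves
beta_pr = ((g - go)'*g)/(go'*go)    % Polak-Ribiere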

Example 5: Polak-Ribiere Conjugate Gradient Example 5: Polak-Ribiere Conjugate Gradient

5 2.5

4 2.6

3 2.7

2 2.8

1 2.9
Y

Y
0 3

1 3.1

2 3.2

3 3.3

4 3.4

5 3.5
5 0 5 1.5 2 2.5
X X

J. McNames Portland State University ECE 4/557 Multivariate Optimization Ver. 1.14 47 J. McNames Portland State University ECE 4/557 Multivariate Optimization Ver. 1.14 48
Example 5: Polak-Ribiere Conjugate Gradient Example 5: Polak-Ribiere Conjugate Gradient

7
6

6
5

Euclidean Position Error


5
4
Function Value

4
3
3
2
2

1
1

0 0
0 5 10 15 20 25 0 5 10 15 20 25
Iteration Iteration

J. McNames Portland State University ECE 4/557 Multivariate Optimization Ver. 1.14 49 J. McNames Portland State University ECE 4/557 Multivariate Optimization Ver. 1.14 50

Example 5: MATLAB Code d = -g + beta * d ;

a ( cnt ,:) = [ x y ];
f ( cnt ) = z;
end ;
function [] = PolakRibiere ();
% clear all ;
[x , y ] = meshgrid (0+( -0 .01 :0 .001 :0 .01 ) ,3+( -0 .01 :0 .001 :0 .01 ));
close all ;
[z , dzx , dzy ] = OptFn (x , y );
[ zopt , id1 ] = min ( z );
ns = 26;
[ zopt , id2 ] = min ( zopt );
x = -3;
id1 = id1 ( id2 );
y = 1;
xopt = x ( id1 , id2 );
b0 = 0 .01 ;
yopt = y ( id1 , id2 );
ls = 30;

[x , y ] = meshgrid (1 .883 +( -0 .02 :0 .001 :0 .02 ) , -2 .963 +( -0 .02 :0 .001 :0 .02 ));
a = zeros ( ns ,2);
f = zeros ( ns ,1); [z , dzx , dzy ] = OptFn (x , y );
[ zopt2 , id1 ] = min ( z );
[z , g ] = OptFn (x , y ); [ zopt2 , id2 ] = min ( zopt2 );
a (1 ,:) = [ x y ]; id1 = id1 ( id2 );
f (1) = z; xopt2 = x ( id1 , id2 );
d = -g / norm ( g ); % First direction yopt2 = y ( id1 , id2 );
for cnt = 2: ns ,
[b , fmin ] = LineSearch ([ x y ] ,d , b0 , ls ); figure ;
FigureSet (1 ,4 .5 ,2 .75 );
x = x + b * d (1); [x , y ] = meshgrid ( -5:0 .1 :5 , -5:0 .1 :5);
y = y + b * d (2); z = OptFn (x , y );
contour (x ,y ,z ,50);
go = g ; % Old gradient h = get ( gca , Children );
[z , g ] = OptFn (x , y ); set (h , LineWidth ,0 .2 );
axis ( square );
beta = (( g - go ) * g )/( go * go ); hold on ;

J. McNames Portland State University ECE 4/557 Multivariate Optimization Ver. 1.14 51 J. McNames Portland State University ECE 4/557 Multivariate Optimization Ver. 1.14 52
h = plot ( a (: ,1) , a (: ,2) , k ,a (: ,1) , a (: ,2) , r ); AxisSet (8);
set ( h (1) , LineWidth ,1 .2 ); print - depsc P o l a k R i b i e r e C o n t o u r B ;
set ( h (2) , LineWidth ,0 .6 );
h = plot ( xopt , yopt , kx , xopt , yopt , rx ); figure ;
set ( h (1) , LineWidth ,1 .5 ); FigureSet (2 ,4 .5 ,2 .75 );
set ( h (2) , LineWidth ,0 .5 ); k = 1: ns ;
set ( h (1) , MarkerSize ,5); xerr = ( sum ((( a - ones ( ns ,1)*[ xopt2 yopt2 ]) ) . ^2) ) . ^(1/2);
set ( h (2) , MarkerSize ,4); h = plot (k -1 , xerr , b );
hold off ; set ( h (1) , Marker , . );
xlabel ( X ); set (h , MarkerSize ,6);
ylabel ( Y ); xlabel ( Iteration );
zoom on ; ylabel ( Euclidean Position Error );
AxisSet (8); xlim ([0 ns -1]);
print - depsc P o l a k R i b i e r e C o n t o u r A ; ylim ([0 xerr (1)]);
grid on ;
figure ; set ( gca , Box , Off );
FigureSet (1 ,4 .5 ,2 .75 ); AxisSet (8);
[x , y ] = meshgrid (1 .5 :0 .01 :2 .5 , -3 .5 :0 .01 : -2 .5 ); print - depsc P o l a k R i b i e r e P o s i t i o n E r r o r ;
z = OptFn (x , y );
contour (x ,y ,z ,75); figure ;
h = get ( gca , Children ); FigureSet (2 ,4 .5 ,2 .75 );
set (h , LineWidth ,0 .2 ); k = 1: ns ;
axis ( square ); h = plot (k -1 ,f , b ,[0 ns ] , zopt *[1 1] , r ,[0 ns ] , zopt2 *[1 1] , g );
hold on ; set ( h (1) , Marker , . );
h = plot ( a (: ,1) , a (: ,2) , k ,a (: ,1) , a (: ,2) , r ); set (h , MarkerSize ,6);
set ( h (1) , LineWidth ,1 .2 ); xlabel ( Iteration );
set ( h (2) , LineWidth ,0 .6 ); ylabel ( Function Value );
hold off ; ylim ([0 f (1)]);
xlabel ( X ); xlim ([0 ns -1]);
ylabel ( Y ); grid on ;
zoom on ; set ( gca , Box , Off );

J. McNames Portland State University ECE 4/557 Multivariate Optimization Ver. 1.14 53 J. McNames Portland State University ECE 4/557 Multivariate Optimization Ver. 1.14 54

AxisSet (8);
print - depsc P o l a k R i b i e r e E r r o r L i n e a r ;
Parallel Tangents (PARTAN)

1. First gradient step:

       d   := -∇f(a)
       α   := argmin_α f(a + αd)
       s_p := αd
       a   := a + s_p

2. Gradient step:

       d_g := -∇f(a)
       α   := argmin_α f(a + αd_g)
       s_g := αd_g
       a   := a + s_g

3. Conjugate step:

       d_p := s_p + s_g
       α   := argmin_α f(a + αd_p)
       s_p := αd_p
       a   := a + s_p

4. Loop to 2 until convergence.
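A minimal PARTAN sketch under the same illustrative assumptions as the earlier sketches (analytic gradient, fminbnd line searches); the anchor variable mirrors the role of (xa, ya) in the example code further below.

% Minimal PARTAN sketch: alternate a gradient step with a search along
% the direction joining the current point to the point two steps back.
f    = @(a) (a(1)-2)^2 + 4*(a(2)+3)^2 + a(1)*a(2);
grad = @(a) [2*(a(1)-2) + a(2); 8*(a(2)+3) + a(1)];
anchor = [-3; 1];                             % starting point
a = anchor;
g = grad(a); d = -g/norm(g);
a = a + fminbnd(@(s) f(a + s*d), 0, 10)*d;    % stand-in conjugate step
for k = 1:10
    g  = grad(a); d = -g/norm(g);             % gradient step
    ag = a + fminbnd(@(s) f(a + s*d), 0, 10)*d;
    dc = ag - anchor;                         % PARTAN (conjugate) direction
    if norm(dc) > 1e-12
        dc = dc/norm(dc);
        a = ag + fminbnd(@(s) f(ag + s*dc), 0, 10)*dc;
    else
        a = ag;                               % could not move
    end
    anchor = ag;                              % update anchor point
end
fprintf('a = [%.4f %.4f]\n', a(1), a(2));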
PARTAN Concept

[Diagram: zig-zag trajectory through points a_0, a_1, ..., a_7, with
gradient steps alternating with conjugate steps.]

- The first two steps are steepest descent.
- Thereafter, each iteration consists of two steps:
  1. Search along the direction d_i = a_i - a_{i-2}, where a_i is the
     current point and a_{i-2} is the point from two steps ago.
  2. Search in the direction of the negative gradient, d_i = -∇f(a_i).

Example 6: PARTAN

[Figure: full contour map over X, Y ∈ [-5, 5] with the PARTAN
trajectory.]

Example 6: PARTAN

[Figures: zoomed contour map near (2, -3) and the function value versus
iteration.]
Example 6: PARTAN

[Figure: Euclidean position error versus iteration.]

Example 6: MATLAB Code

function [] = Partan();
%clear all;
close all;

ns = 26;
x  = -3;
y  = 1;
b0 = 0.01;
ls = 30;

a = zeros(ns,2);
f = zeros(ns,1);

[z,g]  = OptFn(x,y);
a(1,:) = [x y];
f(1)   = z;
xa = x;
ya = y;

% First step - substitute for a conjugate step
d = -g/norm(g); % First direction
[bp,fmin] = LineSearch([x y]',d,b0,100);
x = x + bp*d(1); % Stand-in for a conjugate step
y = y + bp*d(2);
a(2,:) = [x y];
f(2)   = fmin;

cnt = 2;
while cnt < ns,
    % Gradient step
    [z,g] = OptFn(x,y);
    d = -g/norm(g); % Direction
    [bg,fmin] = LineSearch([x y]',d,b0,ls);

    xg = x + bg*d(1);
    yg = y + bg*d(2);

    cnt = cnt + 1;
    a(cnt,:) = [xg yg];
    f(cnt)   = OptFn(xg,yg);
    fprintf('G : %d %5.3f\n',cnt,f(cnt));
    if cnt==ns,
        break;
    end;

    % Conjugate step
    d = [xg-xa yg-ya]';
    if norm(d)~=0,
        d = d/norm(d);
        [bp,fmin] = LineSearch([xg yg]',d,b0,ls);
    else
        bp = 0;
    end;

    if bp>0, % Line search in conjugate direction was successful
        fprintf('P : ');
        x = xg + bp*d(1);
        y = yg + bp*d(2);
    else % Could not move - do another gradient update
        cnt = cnt + 1;
        a(cnt,:) = a(cnt-1,:);
        f(cnt)   = f(cnt-1);
        if cnt==ns,
            break;
        end;
        fprintf('G2: ');
        [z,g] = OptFn(xg,yg);
        d = -g/norm(g); % Direction
        [bp,fmin] = LineSearch([xg yg]',d,b0,ls);
        x = xg + bp*d(1);
        y = yg + bp*d(2);
    end;

    % Update anchor point
    xa = xg;
    ya = yg;

    cnt = cnt + 1;
    a(cnt,:) = [x y];
    f(cnt)   = OptFn(x,y);
    fprintf('%d %5.3f\n',cnt,f(cnt));
end;

% Locate the two local minima on fine grids (as in Example 2), then plot.
% (The plotting code is identical to Example 2 except that the zoomed
% contour uses meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5) and the figures are
% printed as PartanContourA, ...ContourB, ...PositionError, and
% ...ErrorLinear.)

PARTAN Pros and Cons

[Diagram: the same zig-zag trajectory a_0 through a_7 as above.]

+ For quadratic functions, converges in a finite number of steps
+ Easier to implement than 2nd order methods
+ Can be used with a large number of parameters
+ Each (composite) step is at least as good as steepest descent
+ Tolerant of inexact line searches
− Each (composite) step requires two line searches
Newton's Method

      a_{k+1} = a_k - H(a_k)^{-1} ∇f(a_k)

where ∇f(a_k) is the gradient and H(a_k) is the Hessian of f(a), the
p × p matrix of second partial derivatives:

      H(a_k) = [ ∂²f(a)/∂a_1²      ∂²f(a)/∂a_1∂a_2  ...  ∂²f(a)/∂a_1∂a_p
                 ∂²f(a)/∂a_2∂a_1   ∂²f(a)/∂a_2²     ...  ∂²f(a)/∂a_2∂a_p
                 ...               ...              ...  ...
                 ∂²f(a)/∂a_p∂a_1   ∂²f(a)/∂a_p∂a_2  ...  ∂²f(a)/∂a_p²   ]

- Based on a quadratic approximation of the function f(a).
- If f(a) is quadratic, converges in one step.
- If H(a) is positive-definite, the problem is well defined near local
  minima where f(a) is nearly quadratic.
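A minimal safeguarded Newton sketch under illustrative assumptions (a quadratic cost, so the Hessian is constant). It solves H d = -g with backslash rather than forming inv(H), and reverts to steepest descent when the Newton direction is not a direction of descent, mirroring the safeguard in Example 7 below.

% Minimal Newton sketch with a steepest descent safeguard.
f    = @(a) (a(1)-2)^2 + 4*(a(2)+3)^2 + a(1)*a(2);
grad = @(a) [2*(a(1)-2) + a(2); 8*(a(2)+3) + a(1)];
H    = [2 1; 1 8];                       % constant Hessian for this f
a    = [-3; 1];
for k = 1:20
    g = grad(a);
    if norm(g) < 1e-10, break; end
    d = -H\g;                            % Newton direction (solve, no inv)
    if d'*g > 0, d = -g; end             % safeguard: ensure descent
    alpha = fminbnd(@(s) f(a + s*d), 0, 2);
    a = a + alpha*d;
end
fprintf('a = [%.4f %.4f] after %d iterations\n', a(1), a(2), k);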

Example 7: Newton's with Steepest Descent Safeguard

[Figures: full contour map over X, Y ∈ [-5, 5] with the safeguarded
Newton trajectory, a zoomed contour map near (1, -2.4), and the function
value versus iteration (about 90 iterations).]
Example 7: Newton's with Steepest Descent Safeguard

[Figure: Euclidean position error versus iteration.]

Example 7: Relevant MATLAB Code

function [] = Newtons();
%clear all;
close all;

ns = 100;
x  = -3; % Starting x
y  = 1;  % Starting y
b0 = 1;

a = zeros(ns,2);
f = zeros(ns,1);
[z,g,H] = OptFn(x,y);
a(1,:)  = [x y];
f(1)    = z;

for cnt = 2:ns,
    d = -inv(H)*g;
    if d'*g>0, % Revert to steepest descent if not a direction of descent
        %fprintf('(%2d of %2d) Min. Eig:%5.3f Reverting...\n',cnt,ns,min(eig(H)));
        d = -g;
    end;
    d = d/norm(d);
    [b,fmin] = LineSearch([x y]',d,b0,100);
    %a(cnt,:) = (a(cnt-1,:)' - inv(H)*g)'; % Pure Newton's method

    x = x + b*d(1);
    y = y + b*d(2);

    [z,g,H] = OptFn(x,y);

    a(cnt,:) = [x y];
    f(cnt)   = z;
end;

% Locate the two local minima on fine grids (as in Example 2), then plot.
% (The plotting code is identical to Example 2 except that the zoomed
% contour uses meshgrid(1.0+(-1:0.02:1),-2.4+(-1:0.02:1)) and the figures
% are printed as NewtonsContourA, ...ContourB, ...PositionError, and
% ...ErrorLinear.)

Newton's Method Pros and Cons

      a_{k+1} = a_k - H(a_k)^{-1} ∇f(a_k)

+ Very fast convergence near local minima
− Not guaranteed to converge (may actually diverge)
− Requires the p × p Hessian
− Requires a p × p matrix inverse that uses O(p³) operations

Levenberg-Marquardt

1. Determine whether μ_k I + H(a_k) is positive definite. If not,
   μ_k := 4μ_k and repeat.
2. Solve the following equation for a_{k+1}:

       [μ_k I + H(a_k)] (a_{k+1} - a_k) = -∇f(a_k)

3. Compute the ratio of the actual to the predicted reduction,

       r_k = (f(a_k) - f(a_{k+1})) / (q(a_k) - q(a_{k+1}))

   where q(a) is the quadratic approximation of f(a) based on f(a_k),
   ∇f(a_k), and H(a_k).
4. If r_k < 0.25, then μ_{k+1} := 4μ_k.
   If r_k > 0.75, then μ_{k+1} := μ_k/2.
   If r_k ≤ 0, then a_{k+1} := a_k.
5. If not converged, k := k + 1 and loop to 1.

Levenberg-Marquardt Comments

- Similar to Newton's method, but with safety provisions for regions
  where the quadratic approximation is inappropriate.
- Compare:

      Newton's: a_{k+1} = a_k - H(a_k)^{-1} ∇f(a_k)
      LM:       [μ_k I + H(a_k)] (a_{k+1} - a_k) = -∇f(a_k)

- If μ = 0, these are equivalent.
- If μ → ∞, then a_{k+1} → a_k.
- μ is chosen to ensure that the smallest eigenvalue of μ_k I + H(a_k)
  is positive and sufficiently large.
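A minimal sketch of steps 1-5, assuming an illustrative quadratic cost with an analytic gradient and constant Hessian. The 0.25/0.75 thresholds follow the slide above; note that the example code further below uses 0.50 in place of 0.75 for speed.

% Minimal Levenberg-Marquardt sketch; no line search is needed, and
% mu is adapted from the trust ratio r.
f    = @(a) (a(1)-2)^2 + 4*(a(2)+3)^2 + a(1)*a(2);
grad = @(a) [2*(a(1)-2) + a(2); 8*(a(2)+3) + a(1)];
H    = [2 1; 1 8];                        % constant Hessian for this f
a    = [-3; 1];
mu   = 1e-4;
for k = 1:25
    g = grad(a);
    while min(eig(mu*eye(2) + H)) <= 0    % 1. ensure positive definiteness
        mu = 4*mu;
    end
    dx   = -(mu*eye(2) + H)\g;            % 2. solve the key equation
    aNew = a + dx;
    predRed = -(g'*dx + 0.5*dx'*H*dx);    % q(a) - q(aNew)
    if predRed < 1e-14, break; end        % model predicts no progress
    r = (f(a) - f(aNew))/predRed;         % 3. trust ratio
    if r < 0.25, mu = 4*mu;               % 4. adapt mu
    elseif r > 0.75, mu = mu/2;
    end
    if r > 0, a = aNew; end               % if r <= 0, keep a_k
end
fprintf('a = [%.4f %.4f]\n', a(1), a(2));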

Example 8: Levenberg-Marquardt

[Figures: full contour map over X, Y ∈ [-5, 5] with the
Levenberg-Marquardt trajectory, a zoomed contour map near (2, -3), and
the function value versus iteration.]
Example 8: Levenberg-Marquardt

[Figure: Euclidean position error versus iteration.]

Example 8: Relevant MATLAB Code

function [] = LevenbergMarquardt();
%clear all;
close all;

ns  = 26;
x   = -3; % Starting x
y   = 1;  % Starting y
eta = 0.0001;

a = zeros(ns,2);
f = zeros(ns,1);
[zn,g,H] = OptFn(x,y);
a(1,:)   = [x y];
f(1)     = zn;
ap = [x y]'; % Previous point

for cnt = 2:ns,
    [zn,g,H] = OptFn(x,y);
    while min(eig(eta*eye(2)+H))<0,
        eta = eta*4;
    end;

    a(cnt,:) = (ap - inv(eta*eye(2)+H)*g)';

    x = a(cnt,1);
    y = a(cnt,2);

    zo = zn; % Old function value
    zn = OptFn(x,y);

    xd = (a(cnt,:)' - ap);

    qo = zo;
    qn = zo + g'*xd + 0.5*xd'*H*xd; % Quadratic model prediction

    if qo == qn, % Test for convergence
        x = a(cnt,1);
        y = a(cnt,2);
        a(cnt:ns,:) = ones(ns-cnt+1,1)*[x y];
        f(cnt:ns,:) = OptFn(x,y);
        break;
    end;

    r = (zo-zn)/(qo-qn);

    if r<0.25,
        eta = eta*4;
    elseif r>0.50, % 0.75 is recommended, but much slower
        eta = eta/2;
    end;

    if zn>zo, % Back up
        a(cnt,:) = a(cnt-1,:);
    else
        ap = a(cnt,:)';
    end;

    x = a(cnt,1);
    y = a(cnt,2);

    a(cnt,:) = [x y];
    f(cnt)   = OptFn(x,y);

    %disp([cnt a(cnt,:) f(cnt) r eta])
end;

% Locate the two local minima on fine grids (as in Example 2), then plot.
% (The plotting code is identical to Example 2 except that the zoomed
% contour uses meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5) and the figures are
% printed as LevenbergMarquardtContourA, ...ContourB, ...PositionError,
% and ...ErrorLinear.)

Levenberg-Marquardt Pros and Cons

      [μ_k I + H(a_k)] (a_{k+1} - a_k) = -∇f(a_k)

- Many equivalent formulations
+ No line search required
+ Can be used with approximations to the Hessian
+ Extremely fast convergence (2nd order)
− Requires the gradient and Hessian (or approximate Hessian)
− Requires O(p³) operations for each solution to the key equation
Optimization Algorithm Summary

Algorithm             Convergence   Stable   Needs ∇f(a)   Needs H(a)   Line Search
Cyclic Coordinate     Slow          Y        N             N            Y
Steepest Descent      Slow          Y        Y             N            Y
Conjugate Gradient    Fast          N        Y             N            Y
PARTAN                Fast          Y        Y             N            Y
Newton's Method       Very Fast     N        Y             Y            N
Levenberg-Marquardt   Very Fast     Y        Y             Y            N
