Unconstrained Multivariable Optimization

This document discusses unconstrained multivariable optimization methods, including grid search, gradient methods, conjugate gradient methods, Newton and Marquardt methods, and quasi-Newton (secant) methods. Gradient methods calculate the gradient of the objective function f to determine a search direction, then optimize the step length along that direction. Conjugate gradient methods improve on gradient methods by generating conjugate search directions, which guarantees reaching the optimum in n iterations for quadratic functions. The document provides examples of applying gradient and conjugate gradient methods to minimize test functions.


Chapter 6

UNCONSTRAINED MULTIVARIABLE OPTIMIZATION
Methods
1. Function Values Only (grid search)
2. First Derivatives of f (gradient and conjugate direction methods)
3. Second Derivatives of f (e.g., Newton's method)
4. Quasi-Newton methods
Grid Search
• The function is evaluated at every point of a grid
• The extremum found at one particular grid point → the optimum value
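A minimal grid-search sketch, assuming a two-variable test function; the function (the same one used in the conjugate gradient example later in these slides), the grid bounds, and the spacing are illustrative choices, not part of the slides:

```python
import numpy as np

def f(x1, x2):
    # Illustrative test function; its minimum is at (3, 5)
    return (x1 - 3)**2 + 9*(x2 - 5)**2

# Evaluate f at every point of a uniform grid and keep the best point
x1_vals = np.linspace(0.0, 6.0, 61)
x2_vals = np.linspace(0.0, 10.0, 101)
best = min(((f(a, b), a, b) for a in x1_vals for b in x2_vals), key=lambda t: t[0])
print("grid minimum: f = %.4f at (%.2f, %.2f)" % best)
```

Note that the accuracy of a grid search is limited by the grid spacing, which motivates the derivative-based methods that follow.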
Gradient Method

(1) Calculate a search direction $s^k$
(2) Select a step length in that direction to reduce f(x)

$x^{k+1} = x^k + \alpha^k s^k = x^k + \Delta x^k$
Gradient Method: Steepest Descent (Search Direction)

$s^k = -\nabla f(x^k)$   (no need to normalize)

The method terminates at any stationary point. Why?

$\nabla f(x) = 0$

So the procedure can stop at a saddle point. Need to show that $H(x^*)$ is positive definite for a minimum.
Gradient Method: Step Length

How to pick $\alpha$:
• analytically
• numerically
Analytical Method

How does one minimize a function in a search direction using an analytical method?

It means s is fixed and you want to pick $\alpha$, the step length, to minimize f(x). Note that $\Delta x^k = \alpha s^k$.

$f(x^k + \alpha s^k) \approx f(x^k) + \nabla^T f(x^k)\,\Delta x^k + \tfrac{1}{2}(\Delta x^k)^T H(x^k)\,\Delta x^k$

$\dfrac{df(x^k + \alpha s^k)}{d\alpha} = 0 = \nabla^T f(x^k)\,s^k + \alpha\,(s^k)^T H(x^k)\,s^k$

Solve for $\alpha$:

$\alpha = -\dfrac{\nabla^T f(x^k)\,s^k}{(s^k)^T H(x^k)\,s^k}$    (6.9)

This yields a minimum of the approximating function.
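A small sketch of Eq. (6.9), assuming a quadratic objective with constant Hessian; the helper name `exact_step_length` and the specific numbers (taken from the worked example later in the slides) are illustrative:

```python
import numpy as np

def exact_step_length(grad, s, H):
    # Eq. (6.9): alpha = -grad^T s / (s^T H s); exact for a quadratic f,
    # otherwise it minimizes the local quadratic approximation
    return -float(grad @ s) / float(s @ H @ s)

# Example: f = (x1 - 3)^2 + 9(x2 - 5)^2, so H = [[2, 0], [0, 18]] (constant)
H = np.array([[2.0, 0.0], [0.0, 18.0]])
x = np.array([1.0, 1.0])
grad = np.array([2*(x[0] - 3), 18*(x[1] - 5)])   # [-4, -72]
s = -grad                                        # steepest-descent direction
alpha = exact_step_length(grad, s, H)
print(alpha, x + alpha*s)                        # ~0.0557, ~[1.223, 5.011]
```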
Numerical Method

Use a coarse search first:
(1) Fixed $\alpha$ ($\alpha$ = 1) or variable $\alpha$ ($\alpha$ = 1, 2, ½, etc.)

Options for optimizing $\alpha$:
(1) Use interpolation such as quadratic, cubic
(2) Region Elimination (Golden Search)
(3) Newton, Secant, Quasi-Newton
(4) Random
(5) Analytical optimization

(1), (3), and (5) are preferred. However, it may not be desirable to exactly optimize $\alpha$ (better to generate new search directions). A golden-section sketch for option (2) follows below.
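As an illustration of option (2), a golden-section (region elimination) sketch for the step length; the bracket [0, 1], the tolerance, and the 1-D function (the one from the worked example later in the slides) are assumptions made for the example:

```python
import math

def golden_section(phi, a, b, tol=1e-6):
    # Region elimination: shrink [a, b] around the minimizer of a unimodal phi
    invphi = (math.sqrt(5.0) - 1.0) / 2.0        # ~0.618
    c, d = b - invphi*(b - a), a + invphi*(b - a)
    while b - a > tol:
        if phi(c) < phi(d):
            b, d = d, c
            c = b - invphi*(b - a)
        else:
            a, c = c, d
            d = a + invphi*(b - a)
    return 0.5*(a + b)

# Step-length problem from the worked example: f(alpha) = (4a - 2)^2 + 9(72a - 4)^2
alpha = golden_section(lambda a: (4*a - 2)**2 + 9*(72*a - 4)**2, 0.0, 1.0)
print(alpha)   # ~0.0557
```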
Suppose we calculate the gradient at the point $x^T = [2 \;\; 2]$.
Gradient Method: Termination Criteria

[Two sketches of f(x) versus x illustrate the pitfalls.] Left: a big change in f(x) but little change in x; the code will stop if $\Delta x$ is the sole criterion. Right: a big change in x but little change in f(x); the code will stop if $\Delta f$ is the sole criterion.

For minimization you can use up to three criteria for termination:

(1) $\dfrac{\left|f(x^k) - f(x^{k+1})\right|}{\left|f(x^k)\right|} < \varepsilon_1$, except when $f(x^k) \to 0$; then use $\left|f(x^k) - f(x^{k+1})\right| < \varepsilon_2$

(2) $\dfrac{\left|x_i^{k+1} - x_i^k\right|}{\left|x_i^k\right|} < \varepsilon_3$, except when $x^k \to 0$; then use $\left|x_i^{k+1} - x_i^k\right| < \varepsilon_4$

(3) $\left\|\nabla f(x^k)\right\| < \varepsilon_5$ or $\left|s_i^k\right| < \varepsilon_6$
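The slides present these as separate tests; a sketch of one way to combine them in code is below. The helper name `converged`, the tolerance values, and the use of all three tests together are assumptions, not something the slides prescribe:

```python
import numpy as np

def converged(f_old, f_new, x_old, x_new, grad_new,
              eps1=1e-8, eps2=1e-8, eps3=1e-8, eps4=1e-8, eps5=1e-6):
    x_old, x_new = np.asarray(x_old, dtype=float), np.asarray(x_new, dtype=float)
    # (1) relative change in f (absolute change when f is near zero)
    small_df = abs(f_new - f_old) < (eps2 if abs(f_old) < eps2 else eps1*abs(f_old))
    # (2) relative change in each x_i (absolute change when x_i is near zero)
    tol_x = np.where(np.abs(x_old) < eps4, eps4, eps3*np.abs(x_old))
    small_dx = bool(np.all(np.abs(x_new - x_old) < tol_x))
    # (3) gradient norm (the |s_i| alternative of criterion (3) is omitted here)
    small_grad = np.linalg.norm(grad_new) < eps5
    return small_df and small_dx and small_grad
```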
Gradient Method: Conjugate Search Directions

Improvement over the gradient method for general quadratic functions; basis for many NLP techniques.

Two search directions are conjugate relative to Q if

$(s^i)^T Q\,(s^j) = 0$

To minimize $f(x_{n \times 1})$ when H is a constant matrix (= Q), you are guaranteed to reach the optimum in n conjugate-direction stages if you minimize exactly at each stage (one-dimensional search).
Conjugate Gradient Method

Step 1. At $x^0$ calculate $f(x^0)$. Let $s^0 = -\nabla f(x^0)$.

Step 2. Save $\nabla f(x^0)$ and compute

$x^1 = x^0 + \alpha^0 s^0$

by minimizing f(x) with respect to $\alpha$ in the $s^0$ direction (i.e., carry out a unidimensional search for $\alpha^0$).

Step 3. Calculate $f(x^1)$ and $\nabla f(x^1)$. The new search direction is a linear combination of $s^0$ and $\nabla f(x^1)$:

$s^1 = -\nabla f(x^1) + s^0\,\dfrac{\nabla^T f(x^1)\,\nabla f(x^1)}{\nabla^T f(x^0)\,\nabla f(x^0)}$

For the kth iteration the relation is

$s^{k+1} = -\nabla f(x^{k+1}) + s^k\,\dfrac{\nabla^T f(x^{k+1})\,\nabla f(x^{k+1})}{\nabla^T f(x^k)\,\nabla f(x^k)}$    (6.6)

For a quadratic function it can be shown that these successive search directions are conjugate. After n iterations (k = n), the quadratic function is minimized. For a nonquadratic function, the procedure cycles again with $x^{n+1}$ becoming $x^0$.

Step 4. Test for convergence to the minimum of f(x). If convergence is not attained, return to step 3.

Step n. Terminate the algorithm when $\|\nabla f(x^k)\|$ is less than some prescribed tolerance.
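The procedure above can be sketched as a short routine; this sketch assumes a quadratic f with constant Hessian H so that the exact step length of Eq. (6.9) can be used for the 1-D search, and the function names, restart logic, and iteration caps are illustrative choices:

```python
import numpy as np

def conjugate_gradient(grad, H, x0, tol=1e-8, max_cycles=50):
    # Conjugate gradients with the Eq. (6.6) update (Fletcher-Reeves form)
    # and the Eq. (6.9) exact step length; grad(x) returns the gradient.
    x = np.asarray(x0, dtype=float)
    n = len(x)
    for _ in range(max_cycles):
        g = grad(x)
        s = -g                                           # Step 1: steepest descent start
        for _ in range(n):                               # at most n stages, then restart
            alpha = -float(g @ s) / float(s @ H @ s)     # Step 2: exact 1-D search
            x = x + alpha*s
            g_new = grad(x)
            if np.linalg.norm(g_new) < tol:              # Step 4 / Step n: convergence
                return x
            beta = float(g_new @ g_new) / float(g @ g)
            s = -g_new + beta*s                          # Step 3: Eq. (6.6)
            g = g_new
    return x

# The test function of the example on the next slide: f = (x1 - 3)^2 + 9(x2 - 5)^2
grad = lambda x: np.array([2*(x[0] - 3), 18*(x[1] - 5)])
H = np.array([[2.0, 0.0], [0.0, 18.0]])
print(conjugate_gradient(grad, H, [1.0, 1.0]))           # -> [3. 5.] in two stages
```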
Example

Minimize $f = (x_1 - 3)^2 + 9(x_2 - 5)^2$ using the method of conjugate gradients with $x_1^0 = 1$ and $x_2^0 = 1$ as the initial point.

In vector notation, $x^0 = [1 \;\; 1]^T$ and

$\nabla f(x^0) = [-4 \;\; -72]^T$

For steepest descent,

$s^0 = -\nabla f(x^0) = [4 \;\; 72]^T$

Steepest Descent Step (1-D Search)

$x^1 = [1 \;\; 1]^T + \alpha^0\,[4 \;\; 72]^T, \quad \alpha^0 \ge 0$

The objective function can be expressed as a function of $\alpha^0$ as follows:

$f(\alpha^0) = (4\alpha^0 - 2)^2 + 9(72\alpha^0 - 4)^2$

Minimizing $f(\alpha^0)$, we obtain f = 3.1594 at $\alpha^0 = 0.0555$. Hence $x^1 = [1.223 \;\; 5.011]^T$.

Calculate Weighting of Previous Step

The new gradient can now be determined as

$\nabla f(x^1) = [-3.554 \;\; 0.197]^T$

and $\beta^0$ can be computed as

$\beta^0 = \dfrac{(3.554)^2 + (0.197)^2}{(4)^2 + (72)^2} = 0.00244$

Generate New (Conjugate) Search Direction

$s^1 = -\nabla f(x^1) + \beta^0 s^0 = [3.554 \;\; -0.197]^T + 0.00244\,[4 \;\; 72]^T = [3.564 \;\; -0.022]^T$

and

$x^2 = [1.223 \;\; 5.011]^T + \alpha^1\,[3.564 \;\; -0.022]^T$

One-Dimensional Search

Solving for $\alpha^1$ as before [i.e., expressing $f(x^1 + \alpha^1 s^1)$ as a function of $\alpha^1$ and minimizing with respect to $\alpha^1$] yields f = 5.91 × 10⁻¹⁰ at $\alpha^1 = 0.4986$. Hence

$x^2 = [3.0000 \;\; 5.0000]^T$

which is the optimum (reached in 2 steps, which agrees with the theory).
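As a quick check (not part of the original example), the two search directions used above are numerically conjugate with respect to the Hessian of f:

```python
import numpy as np

# H is the Hessian of f = (x1 - 3)^2 + 9(x2 - 5)^2
H = np.array([[2.0, 0.0], [0.0, 18.0]])
s0 = np.array([4.0, 72.0])         # first (steepest descent) direction
s1 = np.array([3.564, -0.022])     # second (conjugate) direction
print(s0 @ H @ s1)                 # ~0; the small residual comes from the rounded values
```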
Derivation of the weighting factor: for a quadratic, $\nabla f(x^{k+1}) - \nabla f(x^k) = \alpha^k H s^k$, so

$s^k = \dfrac{1}{\alpha^k}\,H^{-1}\left[\nabla f(x^{k+1}) - \nabla f(x^k)\right]$

$(s^k)^T = \dfrac{1}{\alpha^k}\left[\nabla f(x^{k+1}) - \nabla f(x^k)\right]^T H^{-1}$

Using the definition of conjugate directions, $(s^k)^T H s^{k+1} = 0$:

$\left[\nabla f(x^{k+1}) - \nabla f(x^k)\right]^T H^{-1} H \left[-\nabla f(x^{k+1}) + \beta^k s^k\right] = 0$

Because the line searches are exact, $\nabla^T f(x^k)\,\nabla f(x^{k+1}) = 0$, $\nabla^T f(x^{k+1})\,s^k = 0$, and $\nabla^T f(x^k)\,s^k = -\nabla^T f(x^k)\,\nabla f(x^k)$. Solving for the weighting factor:

$\beta^k = \dfrac{\nabla^T f(x^{k+1})\,\nabla f(x^{k+1})}{\nabla^T f(x^k)\,\nabla f(x^k)}$

$s^{k+1} = -\nabla f(x^{k+1}) + \beta^k s^k$
Newton Method: Linear vs. Quadratic Approximation of f(x)

$f(x) \approx f(x^k) + (x - x^k)^T\,\nabla f(x^k) + \tfrac{1}{2}(x - x^k)^T H(x^k)(x - x^k)$

$\Delta x = x - x^k = \alpha^k s^k$

(1) Using a linear approximation of f(x):

$\dfrac{df(x)}{d(\Delta x)} = 0 = \nabla f(x^k)$, so you cannot solve for $\Delta x$!

(2) Using a quadratic approximation for f(x):

$\dfrac{df(x)}{d(\Delta x)} = 0 = \nabla f(x^k) + H(x^k)(x - x^k)$

Newton's method solves one of these, either

$x^{k+1} = x^k - H^{-1}(x^k)\,\nabla f(x^k)$

or the equivalent simultaneous linear equations $H(x^k)\,\Delta x^k = -\nabla f(x^k)$.
Note: both the direction and the step length are determined.
- Requires second derivatives (the Hessian)
- $H$, $H^{-1}$ must be positive definite (for a minimum) to guarantee convergence
- Iterate if f(x) is not quadratic

Modified Newton's Procedure:

$x^{k+1} = x^k - \alpha^k H^{-1}(x^k)\,\nabla f(x^k)$

$\alpha^k = 1$ gives Newton's method. (If H = I, you have steepest descent.)

Example

$f(x) = x_1^2 + 20 x_2^2$

Minimize f starting at $x^0 = [1 \;\; 1]^T$.
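For this example, a single Newton step can be sketched as below; solving $H\,\Delta x = -\nabla f$ instead of forming $H^{-1}$ is an implementation choice for the sketch, not something the slide specifies:

```python
import numpy as np

# Example: f(x) = x1^2 + 20 x2^2, starting from x0 = [1, 1]
def grad(x):
    return np.array([2*x[0], 40*x[1]])

H = np.array([[2.0, 0.0], [0.0, 40.0]])    # constant Hessian, positive definite

x0 = np.array([1.0, 1.0])
x1 = x0 - np.linalg.solve(H, grad(x0))     # Newton step: solve H dx = -grad f
print(x1)                                  # [0. 0.]: one step reaches the minimum of a quadratic
```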
Marquardt's Method

If $H(x)$ or $H^{-1}(x)$ is not always positive definite, make it positive definite.

Let $\tilde{H}(x) = \left[H(x) + \beta I\right]$; similarly for $\tilde{H}^{-1}(x)$.

$\beta$ is a positive constant large enough to shift all the negative eigenvalues of $H(x)$.

Example

At the start of the search, $H(x)$ is evaluated at $x^0$ and found to be

$H(x^0) = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$

which is not positive definite, as the eigenvalues are $e_1 = 3$, $e_2 = -1$.

Modify $H(x^0)$ (with $\beta = 2$) to be

$\tilde{H} = \begin{bmatrix} 1 + 2 & 2 \\ 2 & 1 + 2 \end{bmatrix}$

which is positive definite, as the eigenvalues are $e_1 = 5$, $e_2 = 1$.

$\beta$ is adjusted as the search proceeds.
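A small numerical sketch of the eigenvalue-shift example above (NumPy's `eigvalsh` is used here only to display the eigenvalues):

```python
import numpy as np

H = np.array([[1.0, 2.0], [2.0, 1.0]])
print(np.linalg.eigvalsh(H))           # [-1.  3.]  -> not positive definite

beta = 2.0
H_tilde = H + beta*np.eye(2)           # [[3, 2], [2, 3]]
print(np.linalg.eigvalsh(H_tilde))     # [1.  5.]  -> positive definite
```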
Step 1
Pick $x^0$, the starting point. Let $\varepsilon$ = convergence criterion.

Step 2
Set k = 0. Let $\beta^0 = 10^3$.

Step 3
Calculate $\nabla f(x^k)$.

Step 4
Is $\left\|\nabla f(x^k)\right\| < \varepsilon$? If yes, terminate. If no, continue.
Step 5
Calculate $s(x^k) = -\left[H^k + \beta^k I\right]^{-1} \nabla f(x^k)$.

Step 6
Calculate $x^{k+1} = x^k + s(x^k)$.

Step 7
Is $f(x^{k+1}) < f(x^k)$? If yes, go to step 8. If no, go to step 9.

Step 8
Set $\beta^{k+1} = \tfrac{1}{4}\beta^k$ and $k = k + 1$. Go to step 3.

Step 9
Set $\beta^k = 2\beta^k$. Go to step 5.
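Steps 1-9 can be sketched as the loop below. The iteration cap, the use of `numpy.linalg.solve` for Step 5, and the test function at the end are assumptions made for illustration:

```python
import numpy as np

def marquardt(f, grad, hess, x0, eps=1e-6, max_iter=200):
    # Sketch of Steps 1-9; f, grad, and hess are supplied by the user
    x = np.asarray(x0, dtype=float)
    beta = 1.0e3                                          # Step 2: beta^0 = 10^3
    for _ in range(max_iter):
        g = grad(x)                                       # Step 3
        if np.linalg.norm(g) < eps:                       # Step 4
            return x
        while True:
            s = -np.linalg.solve(hess(x) + beta*np.eye(len(x)), g)   # Step 5
            x_new = x + s                                 # Step 6
            if f(x_new) < f(x):                           # Step 7
                beta, x = beta/4.0, x_new                 # Step 8
                break
            beta = 2.0*beta                               # Step 9: increase beta, retry
    return x

# Illustrative test function: f = (x1 - 3)^2 + 9(x2 - 5)^2
f = lambda x: (x[0] - 3)**2 + 9*(x[1] - 5)**2
grad = lambda x: np.array([2*(x[0] - 3), 18*(x[1] - 5)])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 18.0]])
print(marquardt(f, grad, hess, [1.0, 1.0]))               # -> approximately [3. 5.]
```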
Secant Methods

Recall that for a one-dimensional search the secant method only uses values of $f(x)$ and $f'(x)$:

$x^{k+1} = x^k - \left[\dfrac{f'(x^k) - f'(x^p)}{x^k - x^p}\right]^{-1} f'(x^k)$

It approximates $f'(x)$ by a straight line (the secant); hence it is called a "quasi-Newton" method.

The basic idea (for a quadratic function):

$\nabla f(x^k) + H\,\Delta x = 0 \quad \text{or} \quad (x^{k+1} - x^k) = -H^{-1}\,\nabla f(x^k)$

Pick two points to start ($x^k$ = reference point):

$\nabla f(x^2) = \nabla f(x^k) + H(x^2 - x^k)$

$\nabla f(x^1) = \nabla f(x^k) + H(x^1 - x^k)$

$\nabla f(x^2) - \nabla f(x^1) \equiv y = H(x^2 - x^1)$
For a non-quadratic function, $\tilde{H}$ would be calculated, after taking a step from $x^k$ to $x^{k+1}$, by solving the secant equations

$y^k = \tilde{H}\,\Delta x^k \quad \text{or} \quad \Delta x^k = \tilde{H}^{-1} y^k$

- An infinite number of candidates exist for $\tilde{H}$ when $n > 1$.
- We want to choose $\tilde{H}$ (or $\tilde{H}^{-1}$) close to $H$ (or $H^{-1}$) in some sense. Several methods can be used to update $\tilde{H}$.
• Probably the best update formula is the BFGS update (Broyden–Fletcher–Goldfarb–Shanno), ca. 1970.

• BFGS is the basis for the unconstrained optimizer in the Excel Solver.

• It does not require inverting the Hessian matrix but approximates the inverse using values of $\nabla f$.
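The slides do not spell out the formula, but for reference here is a minimal sketch of the standard BFGS update of the inverse-Hessian approximation; the function name and calling convention are illustrative:

```python
import numpy as np

def bfgs_inverse_update(Hinv, dx, dgrad):
    # Standard BFGS update of the inverse-Hessian approximation, where
    # dx = x_{k+1} - x_k and dgrad = grad f(x_{k+1}) - grad f(x_k)
    rho = 1.0 / float(dgrad @ dx)
    I = np.eye(len(dx))
    V = I - rho*np.outer(dx, dgrad)
    return V @ Hinv @ V.T + rho*np.outer(dx, dx)
```

In practice the update is skipped (or damped) when `dgrad @ dx` is not sufficiently positive, so that the approximation stays positive definite.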