
Constrained and Unconstrained Optimization

Carlos Hurtado

Department of Economics
University of Illinois at Urbana-Champaign
[email protected]

Oct 10th, 2017



On the Agenda

1 Numerical Optimization
2 Minimization of Scalar Function
3 Golden Search
4 Newton’s Method
5 Polytope Method
6 Newton’s Method Reloaded
7 Quasi-Newton Methods
8 Non-linear Least-Square
9 Constrained Optimization





Numerical Optimization

Numerical Optimization

- In some economic problems, we would like to find the value that maximizes or minimizes a function.
- We are going to focus on minimization problems:

  min_{x} f(x)

  or

  min_{x} f(x)  s.t.  x ∈ B

- Notice that minimization and maximization are equivalent, because we can maximize f(x) by minimizing −f(x).



Numerical Optimization

Numerical Optimization

- We want to solve this problem in a reasonable time.
- Most often, the CPU time is dominated by the cost of evaluating f(x).
- We would like to keep the number of evaluations of f(x) as small as possible.
- There are two types of objectives:
  - Finding a global minimum: the lowest possible value of the function over the range.
  - Finding a local minimum: the smallest value within a bounded neighborhood.





Minimization of Scalar Function

Bracketing Method

- We would like to find the minimum of a scalar function f(x), with f: R → R.
- The bracketing method is a direct method that does not use curvature or a local approximation.
- We start with a bracket:

  (a, b, c)  s.t.  a < b < c,  f(a) > f(b)  and  f(c) > f(b)

- We search for the minimum by selecting a trial point in one of the two intervals.
  - If c − b > b − a, take d = (b + c)/2.
  - Else, if c − b ≤ b − a, take d = (a + b)/2.
  - If f(d) > f(b), there is a new bracket (d, b, c) or (a, b, d).
  - If f(d) < f(b), there is a new bracket (a, d, c).
- Continue until the distance between the extremes of the bracket is small (a minimal sketch of this loop is given below).
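A minimal sketch of this loop, assuming a unimodal f and a valid starting bracket. The function name and tolerance are illustrative, and the update below always keeps the lowest point in the interior of the bracket (a slightly tighter rule than the (a, d, c) case above), so the interval shrinks at every step.

def bracket_minimize(f, a, b, c, tol=1e-8):
    """Shrink a bracket (a, b, c) with f(a) > f(b) < f(c) until c - a < tol."""
    while c - a > tol:
        # trial point: midpoint of the larger sub-interval
        if c - b > b - a:
            d = (b + c) / 2.0
            if f(d) < f(b):
                a, b = b, d          # new bracket (b, d, c)
            else:
                c = d                # new bracket (a, b, d)
        else:
            d = (a + b) / 2.0
            if f(d) < f(b):
                c, b = b, d          # new bracket (a, d, b)
            else:
                a = d                # new bracket (d, b, c)
    return b

# example: minimum of f(x) = (x - 2) * x * (x + 2)**2 near x = 1.28
print(bracket_minimize(lambda x: (x - 2) * x * (x + 2)**2, 0.0, 1.0, 2.0))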
Minimization of Scalar Function

Bracketing Method
- We selected the new point using the midpoint between the extremes, but what is the best location for the new point d?

  [Diagram: points ordered a, b, d, c on the real line]

- One possibility is to minimize the size of the next search interval.
- The next search interval will be either from a to d or from b to c.
- The proportion of the left interval is

  w = (b − a)/(c − a)

- The proportion of the new interval is

  z = (d − b)/(c − a)


Golden Search

Golden Search
- The proportion of the new segment will be

  1 − w = (c − b)/(c − a)

  or

  w + z = (d − a)/(c − a)

- Moreover, if d is the new candidate to minimize the function,

  z/(1 − w) = [(d − b)/(c − a)] / [(c − b)/(c − a)] = (d − b)/(c − b)

- Ideally we will have

  z = 1 − 2w

  and

  z/(1 − w) = w

  (the first condition makes the two possible new intervals equally long; the second makes each new interval a scaled copy of the current one).
Golden Search

Golden Search

- The previous equations imply w² − 3w + 1 = 0, or

  w = (3 − √5)/2 ≈ 0.38197

- In mathematics, the golden ratio is φ = (1 + √5)/2.
- This goes back to Pythagoras.
- Notice that 1 − 1/φ = (3 − √5)/2.
- The Golden Search algorithm uses the golden ratio to set the new point (using a weighted average); a minimal sketch is given below.
- This reduces the bracketing interval by about 40% at each iteration.
- The performance is independent of the function that is being minimized.
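A minimal golden-section search sketch, assuming a unimodal f on [a, c]; the function name and tolerance are illustrative, not from the slides.

import math

def golden_search(f, a, c, tol=1e-8):
    """Golden-section search for a local minimum of a unimodal f on [a, c]."""
    w = (3 - math.sqrt(5)) / 2        # ≈ 0.38197, the proportion derived above
    b = a + w * (c - a)               # interior points at the golden proportions
    d = c - w * (c - a)
    fb, fd = f(b), f(d)
    while c - a > tol:
        if fb < fd:                   # minimum lies in [a, d]
            c, d, fd = d, b, fb
            b = a + w * (c - a)
            fb = f(b)
        else:                         # minimum lies in [b, c]
            a, b, fb = b, d, fd
            d = c - w * (c - a)
            fd = f(d)
    return (a + c) / 2

print(golden_search(lambda x: (x - 2) * x * (x + 2)**2, 0.0, 2.0))  # ≈ 1.2808

Because w satisfies w² − 3w + 1 = 0, one of the two interior points can be reused at every iteration, so only one new function evaluation is needed per step.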
Golden Search

Golden Search

- Sometimes the performance can be improved substantially when a local approximation is used.
- When we combine a local approximation with golden search we get a method called Brent's method.
- Let us suppose that we want to minimize y = x(x − 2)(x + 2)²


[Figure: plot of y = x(x − 2)(x + 2)² for x ∈ [−2, 2]]


Golden Search

Golden Search

- We can use the minimize_scalar function from the scipy.optimize module.

>>> def f(x):
...     return (x - 2) * x * (x + 2)**2
>>> from scipy.optimize import minimize_scalar
>>> opt_res = minimize_scalar(f)
>>> print(opt_res.x)
1.28077640403
>>> opt_res = minimize_scalar(f, method='golden')
>>> print(opt_res.x)
1.28077640147
>>> opt_res = minimize_scalar(f, bounds=(-3, -1), method='bounded')
>>> print(opt_res.x)
-2.0000002026




Newton’s Method

Newton’s Method
- Let us assume that the function f(x): R → R is infinitely differentiable.
- We would like to find x* such that f(x*) ≤ f(x) for all x ∈ R.
- Idea: Use a Taylor approximation of the function f(x).
- The polynomial approximation of order two around a is:

  p(x) = f(a) + f'(a)(x − a) + (1/2) f''(a)(x − a)²

- To find an optimal value for p(x) we use the FOC:

  p'(x) = f'(a) + f''(a)(x − a) = 0

- Hence,

  x = a − f'(a)/f''(a)
Newton’s Method

Newton’s Method
- Newton's method starts with a given x_1.
- To compute the next candidate to minimize the function we use

  x_{n+1} = x_n − f'(x_n)/f''(x_n)

- Do this until

  |x_{n+1} − x_n| < ε  and  |f'(x_{n+1})| < ε

- Newton's method is very fast (quadratic convergence); a minimal sketch of the iteration is given below.
- Theorem:

  |x_{n+1} − x*| ≤ (|f'''(x*)| / (2|f''(x*)|)) |x_n − x*|²
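A minimal sketch of the scalar Newton iteration, assuming f' and f'' are available in closed form; the function names and tolerances are illustrative, not from the slides.

def newton_minimize(fp, fpp, x, tol=1e-10, max_iter=100):
    """Newton iteration for a stationary point: x_{n+1} = x_n - f'(x_n)/f''(x_n)."""
    for _ in range(max_iter):
        x_new = x - fp(x) / fpp(x)
        if abs(x_new - x) < tol and abs(fp(x_new)) < tol:
            return x_new
        x = x_new
    return x

# example: f(x) = (x - 2) * x * (x + 2)**2 = x^4 + 2x^3 - 4x^2 - 8x
fp  = lambda x: 4*x**3 + 6*x**2 - 8*x - 8    # f'(x)
fpp = lambda x: 12*x**2 + 12*x - 8           # f''(x)
print(newton_minimize(fp, fpp, 1.0))          # ≈ 1.2808, a local minimum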
Newton’s Method

Newton’s Method
- A Quick Detour: Root Finding
- Consider the problem of finding zeros of p(x).
- Assume that you know a point a where p(a) is positive and a point b where p(b) is negative.
- If p(x) is continuous between a and b, we can approximate it as:

  p(x) ≈ p(a) + (x − a) p'(a)

- The approximate zero is then:

  x = a − p(a)/p'(a)

- The idea is the same as before: Newton's method also works for finding roots (a one-line sketch is given below).
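A minimal root-finding version of the same update; the function names are illustrative.

def newton_root(p, dp, x, tol=1e-12, max_iter=100):
    """Find a zero of p via x_{n+1} = x_n - p(x_n)/p'(x_n)."""
    for _ in range(max_iter):
        x_new = x - p(x) / dp(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

print(newton_root(lambda x: x**2 - 2, lambda x: 2*x, 1.0))  # ≈ 1.41421356 (√2)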
Newton’s Method

Newton’s Method

- There are several issues with Newton's method:
  - The iteration point is stationary.
  - The starting point enters a cycle.
  - The derivative does not exist.
  - The derivative is discontinuous.
- Newton's method finds a local optimum, not a global optimum.




Polytope Method

Polytope Method

- The Polytope (a.k.a. Nelder-Mead) method is a direct method to find the solution of

  min_{x} f(x)

  where f: R^n → R.
- We start with points x1, x2 and x3, such that f(x1) ≥ f(x2) ≥ f(x3).
- Using the midpoint between x2 and x3, we reflect x1 to the point y1.
- Check if f(y1) < f(x1).
  - If true, you have a new polytope.
  - If not, try x2. If that also fails, try x3.
  - If nothing works, shrink the polytope toward x3.
- Stop when the size of the polytope is smaller than ε (a minimal sketch of one step is given below).
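A minimal sketch of one reflect-or-shrink step in two dimensions. This is a simplified variant of the idea above (the full Nelder-Mead method also has expansion and contraction moves); all names and the quadratic test function are illustrative.

import numpy as np

def polytope_step(f, simplex):
    """One simplified reflect-or-shrink step on a simplex [x1, x2, x3] in R^2."""
    # order vertices so that f(x1) >= f(x2) >= f(x3)
    simplex = sorted(simplex, key=f, reverse=True)
    x1, x2, x3 = simplex
    centroid = (x2 + x3) / 2.0          # midpoint of the two best points
    y1 = centroid + (centroid - x1)     # reflect the worst point through it
    if f(y1) < f(x1):
        return [y1, x2, x3]             # accept the reflected point
    # otherwise shrink every vertex toward the best point x3
    return [x3 + 0.5 * (v - x3) for v in [x1, x2, x3]]

# usage sketch on a simple bowl with minimum at (1, 2)
f = lambda v: (v[0] - 1)**2 + (v[1] - 2)**2
simplex = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
for _ in range(100):
    simplex = polytope_step(f, simplex)
print(min(simplex, key=f))              # best vertex, close to the minimizer (1, 2)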
Polytope Method

Polytope Method
- Let us consider the following function:

  f(x0, x1) = (1 − x0)² + 100 (x1 − x0²)²

- The function looks like:


[Figure: surface and contour plots of f(x0, x1) = (1 − x0)² + 100 (x1 − x0²)²]


Polytope Method

Polytope Method

- In Python we can do:

>>> def f2(x):
...     return (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2
>>> from scipy.optimize import fmin
>>> opt = fmin(func=f2, x0=[0, 0])
>>> print(opt)
[ 1.00000439  1.00001064]




Newton’s Method Reloaded

Newton’s Method

- What can we do if we want to use Newton's method for a function f: R^n → R?
- We can use a quadratic approximation at a, where a' = (a1, . . . , an):

  p(x) = f(a) + ∇f(a)'(x − a) + (1/2)(x − a)' H(a)(x − a)

  where x' = (x1, . . . , xn).
- The gradient ∇f(x) is a multi-variable generalization of the derivative:

  ∇f(x)' = (∂f(x)/∂x1, . . . , ∂f(x)/∂xn)


Newton’s Method Reloaded

Newton’s Method
- The Hessian matrix H(x) is a square matrix of second-order partial derivatives that describes the local curvature of a function of many variables:

  H(x) = \begin{pmatrix}
           \frac{\partial^2 f(x)}{\partial x_1^2} & \frac{\partial^2 f(x)}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\
           \frac{\partial^2 f(x)}{\partial x_2 \partial x_1} & \frac{\partial^2 f(x)}{\partial x_2^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_2 \partial x_n} \\
           \vdots & \vdots & \ddots & \vdots \\
           \frac{\partial^2 f(x)}{\partial x_n \partial x_1} & \frac{\partial^2 f(x)}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f(x)}{\partial x_n^2}
         \end{pmatrix}

- The FOC is:

  ∇p(x) = ∇f(a) + H(a)(x − a) = 0

- We can solve this to get:

  x = a − H(a)⁻¹ ∇f(a)
Newton’s Method Reloaded

Newton’s Method

- Following the same logic as in the one-dimensional case:

  x^{k+1} = x^k − H(x^k)⁻¹ ∇f(x^k)

- How do we compute H(x^k)⁻¹ ∇f(x^k)?
- We can solve for the search direction s:

  s = H(x^k)⁻¹ ∇f(x^k)   ⟺   H(x^k) s = ∇f(x^k)

- The search direction, s, is the solution of a system of linear equations (and we know how to solve that!); a minimal sketch is given below.
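A minimal sketch of one multi-dimensional Newton step, assuming the gradient and Hessian are available as functions; the example reuses f(x0, x1) = (1 − x0)² + 100 (x1 − x0²)² from the Polytope section, and all names are illustrative.

import numpy as np

def newton_step(grad, hess, x):
    """One Newton step: solve H(x) s = grad(x) instead of inverting H(x)."""
    s = np.linalg.solve(hess(x), grad(x))
    return x - s

# gradient and Hessian of f(x0, x1) = (1 - x0)^2 + 100*(x1 - x0^2)^2
grad = lambda x: np.array([-2*(1 - x[0]) - 400*x[0]*(x[1] - x[0]**2),
                           200*(x[1] - x[0]**2)])
hess = lambda x: np.array([[2 - 400*(x[1] - 3*x[0]**2), -400*x[0]],
                           [-400*x[0],                   200.0]])

x = np.array([0.5, 0.5])
for _ in range(20):
    x = newton_step(grad, hess, x)
print(x)   # converges to [1. 1.]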




Quasi-Newton Methods

Quasi-Newton Methods

- For Newton's method we need the Hessian of the function.
- If the Hessian is unavailable, the "full" Newton's method cannot be used.
- Any method that replaces the Hessian with an approximation is a quasi-Newton method.
- One advantage of quasi-Newton methods is that the Hessian matrix does not need to be inverted.
- Newton's method requires the Hessian to be inverted, which is typically implemented by solving a system of equations.
- Quasi-Newton methods usually generate an estimate of the inverse directly.


Quasi-Newton Methods

Quasi-Newton Methods

- In the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, the Hessian matrix is approximated using updates specified by gradient evaluations (or approximate gradient evaluations).
- In Python:

>>> import numpy as np
>>> from scipy.optimize import fmin_bfgs
>>> def f(x):
...     return (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2
>>> opt = fmin_bfgs(f, x0=[0.5, 0.5])

- Using the gradient we can improve the approximation:

>>> def gradient(x):
...     return np.array((-2*(1 - x[0]) - 100*4*x[0]*(x[1] - x[0]**2),
...                      200*(x[1] - x[0]**2)))
>>> opt2 = fmin_bfgs(f, x0=[10, 10], fprime=gradient)


Quasi-Newton Methods

Quasi-Newton Methods

- One of the methods that requires the fewest function calls (and is therefore very fast) is the Newton-Conjugate-Gradient (NCG) method.
- The method uses a conjugate gradient algorithm to (approximately) invert the local Hessian.
- If the Hessian is positive definite, then the local minimum of this function can be found by setting the gradient of the quadratic form to zero.
- In Python:

>>> from scipy.optimize import fmin_ncg
>>> opt3 = fmin_ncg(f, x0=[10, 10], fprime=gradient)




Non-linear Least-Square

Non-linear Least-Square

- Suppose it is desired to fit a set of data {xi, yi} to a model y = f(x; p), where p is a vector of parameters for the model that need to be found.
- A common method for determining which parameter vector gives the best fit to the data is to minimize the sum of squared errors. (Why?)
- The error is usually defined for each observed data point as:

  ei(yi, xi; p) = ||yi − f(xi; p)||

- The sum of the squared errors is:

  S(p; x, y) = Σ_{i=1}^{N} ei²(yi, xi; p)


Non-linear Least-Square

Non-linear Least-Square

- Suppose that we model some population data observed at several times ti:

  yi = f(ti; (A, b)) = A e^{b ti}

- The parameters A and b are unknown to the economist.
- We would like to minimize the squared errors to approximate the data (a fitting sketch is given below).


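A minimal fitting sketch using scipy.optimize.curve_fit, which minimizes the sum of squared errors over (A, b); the synthetic data and starting values are illustrative, not from the slides.

import numpy as np
from scipy.optimize import curve_fit

def model(t, A, b):
    """Exponential growth model y = A * exp(b * t)."""
    return A * np.exp(b * t)

# synthetic "population" data generated from A = 2.5, b = 0.3 plus noise
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 30)
y = model(t, 2.5, 0.3) + rng.normal(scale=0.5, size=t.size)

# curve_fit searches for the (A, b) that minimize the sum of squared residuals
p_hat, p_cov = curve_fit(model, t, y, p0=[1.0, 0.1])
print(p_hat)   # estimates close to [2.5, 0.3]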




Constrained Optimization

Constrained Optimization

- Let us find the minimum of a scalar function subject to constraints:

  min_{x ∈ R^n} f(x)  s.t.  g(x) = a  and  h(x) ≥ b

- Here we have g: R^n → R^m and h: R^n → R^k.
- Notice that we can re-write the problem as an unconstrained version:

  min_{x ∈ R^n}  f(x) + (p/2) [ Σ_{i=1}^{m} (gi(x) − ai)² + Σ_{j=1}^{k} max{0, bj − hj(x)} ]

- For a "very large" value of p, the constraints need to be satisfied (penalty method); a minimal sketch is given below.
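A minimal penalty-method sketch using scipy.optimize.minimize; the example problem, penalty schedule, and names are illustrative, not from the slides.

import numpy as np
from scipy.optimize import minimize

# illustrative problem: min x0^2 + x1^2  s.t.  x0 + x1 = 1  and  x0 >= 0.2
f = lambda x: x[0]**2 + x[1]**2
g = lambda x: x[0] + x[1] - 1.0        # equality constraint, written as g(x) - a = 0
h = lambda x: x[0] - 0.2               # inequality constraint, written as h(x) - b >= 0

def penalized(x, p):
    # quadratic penalty on the equality part, hinge penalty on the inequality part
    return f(x) + (p / 2.0) * (g(x)**2 + max(0.0, -h(x)))

x = np.array([0.0, 0.0])
for p in [1e1, 1e2, 1e4, 1e6]:         # increase the penalty gradually, warm-starting each solve
    x = minimize(penalized, x, args=(p,), method='Nelder-Mead').x
print(x)                                # ≈ [0.5 0.5]; both constraints hold at the solution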


Constrained Optimization

Constrained Optimization

- If the objective function is quadratic, the optimization problem looks like

  min_{x ∈ R^n}  q(x) = (1/2) x'Gx + x'c  s.t.  g(x) = a  and  h(x) ≥ b

- The structure of this type of problem can be exploited efficiently (a minimal sketch of the equality-constrained case is given below).
- This forms the basis for Augmented Lagrangian and Sequential Quadratic Programming methods.
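For the special case of linear equality constraints only, the exploitable structure is a linear (KKT) system; a minimal sketch with illustrative matrices follows.

import numpy as np

# min (1/2) x'Gx + x'c   s.t.   Ax = b   (linear equality constraints only)
G = np.array([[2.0, 0.0], [0.0, 2.0]])   # illustrative positive definite G
c = np.array([-2.0, -5.0])
A = np.array([[1.0, 1.0]])               # single constraint x0 + x1 = 1
b = np.array([1.0])

# KKT conditions: Gx + c + A'lam = 0 and Ax = b, i.e. [[G, A'], [A, 0]] [x, lam] = [-c, b]
n, m = G.shape[0], A.shape[0]
KKT = np.block([[G, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([-c, b])
sol = np.linalg.solve(KKT, rhs)
x, lam = sol[:n], sol[n:]
print(x)   # minimizer on the constraint set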


Constrained Optimization

Constrained Optimization

- The Augmented Lagrangian methods combine the Lagrangian with the penalty method.
- The Sequential Quadratic Programming Algorithms (SQPA) solve the problem by using quadratic approximations of the Lagrangian function.
- The SQPA is the analogue of Newton's method for the constrained case.
- How does the algorithm solve the problem? It is possible with extensions of the simplex method, which we will not cover.
- The previous extensions can be solved with the BFGS algorithm.


Constrained Optimization

Constrained Optimization

- Let us consider the utility maximization problem of an agent with a constant elasticity of substitution (CES) utility function:

  U(x, y) = (α x^ρ + (1 − α) y^ρ)^{1/ρ}

- Denote by px and py the prices of goods x and y respectively.
- The constrained optimization problem for the consumer is:

  max_{x,y} U(x, y; ρ, α)  subject to  x ≥ 0, y ≥ 0  and  px·x + py·y = M

- A minimal solver sketch for this problem is given below.
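A minimal sketch using scipy.optimize.minimize with the SLSQP method (an SQP-type solver); the parameter values α, ρ, the prices, and the income below are illustrative, not from the slides.

import numpy as np
from scipy.optimize import minimize

alpha, rho = 0.3, 0.5          # illustrative CES parameters
px, py, M = 1.0, 2.0, 10.0     # illustrative prices and income

def neg_utility(z):
    # minimize the negative of U(x, y) = (alpha*x^rho + (1-alpha)*y^rho)^(1/rho)
    x, y = np.maximum(z, 0.0)  # guard against tiny negative trial values from the solver
    return -(alpha * x**rho + (1 - alpha) * y**rho)**(1 / rho)

budget = {'type': 'eq', 'fun': lambda z: M - px * z[0] - py * z[1]}
bounds = [(0, None), (0, None)]          # x >= 0, y >= 0

res = minimize(neg_utility, x0=[1.0, 1.0], method='SLSQP',
               bounds=bounds, constraints=[budget])
print(res.x)   # optimal bundle (x*, y*) on the budget line px*x + py*y = M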
