Optimization Techniques in Finance


2. Constraint optimization and Lagrange multipliers

Andrew Lesniewski

Baruch College, New York

Fall 2019


Outline

1 Constraint optimization problems

2 Numerical methods


Formulation of the problem

In most financial applications, the variables in an optimization problem are restricted to vary in a subset Ω of R^n rather than in the entire space R^n.
Examples:
(i) The correlation parameter in a stochastic volatility model has to lie within
the range [−1, 1].
(ii) The volatility parameters in a term structure model should be all positive,
while the unconstrained optimizer stubbornly produces negative values.
(iii) A portfolio manager's mandate is to be long only, meaning that the weights
of all assets in the portfolio are positive.
(iv) The share of each position in the portfolio cannot exceed 5% of its total
value.
(v) No industry group should take more than 20% share of the total portfolio
value.
Oftentimes we are “lucky” and get the right results using an unconstrained
method (like in data fitting). Otherwise, we have to resort to the specialized
methods developed to handle constrained problems.


Formulation of the problem

We shall assume that the subset Ω is specified in terms of a number of functions c_i(x), i = 1, . . . , m, which define equality constraints or inequality constraints.
Specifically, given an objective function f(x), we formulate the following problem:

\min_{x \in \mathbb{R}^n} f(x), \quad \text{subject to} \quad \begin{cases} c_i(x) = 0, & \text{if } i \in E, \\ c_i(x) \leq 0, & \text{if } i \in I. \end{cases}   (1)

Here, E and I are disjoint subsets of the set of indices 1, . . . , m, such that
E ∪ I = {1, . . . , m}.
A point x ∈ Rn is called feasible, if it satisfies all the constraints. We can thus
characterize the subset Ω as the set of all feasible points of the problem:

Ω = {x ∈ Rn : ci (x) = 0, if i ∈ E; ci (x) ≤ 0, if i ∈ I}. (2)

A point x^* is a local solution to (1), if it is feasible, and f(x) ≥ f(x^*), for all x in a neighborhood of x^* (which is contained in Ω).
It is a strict local solution if f(x) > f(x^*), for all x ≠ x^* in a neighborhood of x^*.


Examples
Solving constraint optimization problems is challenging and, before developing a general methodology, we discuss a few examples.
Example 1. Consider the problem:

\min x_1 + x_2, \quad \text{subject to} \quad \begin{cases} x_1 x_2 = a^2, & \text{where } a > 0, \\ x_1, x_2 \geq 0. \end{cases}   (3)

In this case, c_1(x) = x_1 x_2 - a^2, E = {1}, c_2(x) = -x_1, c_3(x) = -x_2, I = {2, 3}, and the feasible set is the hyperbola x_1 x_2 = a^2 in the first quadrant.
The special feature of this problem is that the constraints can be solved. Namely, x_2 = a^2/x_1, which reduces the problem to minimizing a function of one variable:

g(x_1) = x_1 + \frac{a^2}{x_1}.

Setting the derivative of g(x_1) to zero we find that x_1^* = a (the solution -a is rejected because it is not in the feasible set). That means that x_2^* = a, and inspection shows that x^* = (a, a) is, in fact, a global minimum.
Situations in which the constraints can be solved are rare. In general, solving the constraints is either impossible or it leads to cumbersome calculations.
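As a quick numerical sanity check, the reduced one-variable problem can be handed to an off-the-shelf solver. A minimal sketch (not part of the lecture), assuming SciPy and the hypothetical value a = 2:

# Example 1 after solving the constraint: minimize g(x1) = x1 + a^2/x1, x1 > 0.
# Here a = 2 is a hypothetical choice; the minimizer should be x1* = a.
from scipy.optimize import minimize_scalar

a = 2.0
g = lambda x1: x1 + a**2 / x1

res = minimize_scalar(g, bounds=(1e-6, 1e3), method="bounded")
print(res.x, a**2 / res.x)  # both approximately 2.0, i.e. x* = (a, a)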


Examples

Example 2. Consider the problem:

\min x_1 + x_2, \quad \text{subject to} \quad x_1^2 + x_2^2 = 2.   (4)

In this case, c_1(x) = x_1^2 + x_2^2 - 2, E = {1}, I = ∅, and the feasible set consists of the circle of radius \sqrt{2}.
Inspection shows that the solution to this problem is

x^* = \begin{pmatrix} -1 \\ -1 \end{pmatrix}.

We note that

\nabla f(x^*) = -\frac{1}{2} \nabla c_1(x^*),   (5)

i.e., at the solution, the gradient of the objective function is proportional to the gradient of the constraint function.
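A minimal numerical check of (5) at x^* = (-1, -1), assuming NumPy (not part of the lecture):

import numpy as np

# Gradients of f(x) = x1 + x2 and c1(x) = x1^2 + x2^2 - 2 at x* = (-1, -1).
x_star = np.array([-1.0, -1.0])
grad_f = np.array([1.0, 1.0])              # ∇f is constant
grad_c1 = 2.0 * x_star                     # ∇c1(x) = (2x1, 2x2)

# At the solution, ∇f(x*) = -(1/2) ∇c1(x*): the two gradients are parallel.
print(grad_f, -0.5 * grad_c1)              # both print [1. 1.]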


Examples

This is not a coincidence! We will see that this is a general fact: at a local
minimizer, the gradient of the objective function is a linear combination of the
gradients of the constraints.
We rewrite (5) in the form:

\nabla f(x^*) + \frac{1}{2} \nabla c_1(x^*) = 0.   (6)

The proportionality coefficient λ^* = 1/2 (in this example) is called the Lagrange multiplier.
We can interpret this observation geometrically as follows.


Examples

If x is a feasible point that is not a solution to (4), then there is a small vector h = εd, ε > 0, such that x + h is feasible and f(x + h) < f(x), i.e.

0 = c_1(x + h) \approx \nabla c_1(x)^T h,
0 > f(x + h) - f(x) \approx \nabla f(x)^T h.

Therefore, for x to be a solution, there cannot exist a vector (direction) d such that the conditions below hold simultaneously:

\nabla c_1(x)^T d = 0,
\nabla f(x)^T d < 0.

Thus x can be a solution if and only if ∇f(x) and ∇c_1(x) are parallel. In other words, there has to exist a scalar λ such that ∇f(x) = -λ∇c_1(x).


Equality constraints and Lagrange Multiplier Theorem


Let us now consider the general constrained optimization problem with equality constraints only (i.e. I = ∅).
Reasoning along the lines of Example 2, we argue that a feasible point x is a solution to (1), provided that there is no vector d ∈ R^n, d ≠ 0, with the properties:

\nabla c_i(x)^T d = 0, \text{ for } i = 1, \ldots, m,
\nabla f(x)^T d < 0.   (7)

We shall call a feasible point x regular, if the vectors ∇c1 (x), . . . , ∇cm (x) are
linearly independent.
If x is regular, the m vectors ∇ci (x) ∈ Rn span an m-dimensional subspace
W (x) ⊂ Rn .
The first condition in (7) defines the subspace of first order feasible variations:

V (x) = {d : ∇ci (x)T d = 0, for i = 1, . . . , m}. (8)

Notice that V (x) is simply the orthogonal complement of W (x), V (x) = W (x)⊥ .


Equality constraints and Lagrange Multiplier Theorem

The (impossibility of the) second condition in (7) implies that the gradient ∇f (x)
of the objective function has to be perpendicular to V (x).
Indeed, if the inner product of ∇f(x) with some nonzero d ∈ V(x) were positive then, replacing d with -d, it would be negative, which is assumed impossible. Therefore, ∇f(x) ∈ V(x)^⊥.
But V (x)⊥ = W (x)⊥⊥ = W (x).
Consequently, ∇f(x) must be a linear combination of the constraint gradients ∇c_i(x), which span W(x), i.e.

\nabla f(x) + \sum_{i=1}^m \lambda_i \nabla c_i(x) = 0,

with some scalars λ_i.


Equality constraints and Lagrange Multiplier Theorem

The Lagrange Multiplier Theorem formulated below states necessary conditions for local minima of (1). It puts the informal reasoning above on a rigorous basis.
Lagrange Multiplier Theorem. Let x^* be a regular local minimizer of f(x) subject to c_i(x) = 0, for i = 1, . . . , m. Then:
(i) There exists a unique vector λ^* = (λ_1^*, . . . , λ_m^*) of Lagrange multipliers, such that

\nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla c_i(x^*) = 0.   (9)

(ii) If, in addition, f(x) and c_i(x) are twice continuously differentiable, then

d^T \Big( \nabla^2 f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla^2 c_i(x^*) \Big) d \geq 0,   (10)

for all d ∈ V(x^*).


Lagrangian function

It is convenient (and customary) to write the necessary conditions in terms of the Lagrangian function. This is a function of n + m variables defined as follows:

L(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i c_i(x).   (11)

The necessary conditions of the Lagrange Multiplier Theorem can be parsimoniously formulated as follows:

\nabla_x L(x^*, \lambda^*) = 0,
\nabla_\lambda L(x^*, \lambda^*) = 0,   (12)
d^T \nabla^2_{xx} L(x^*, \lambda^*) d \geq 0, \text{ for all } d \in V(x^*).

We emphasize that these conditions are necessary but not sufficient: a solution
to the system above may not represent a local minimum.


Example

Example 3. Consider the problem:

\min \frac{1}{2}(x_1^2 + x_2^2 + x_3^2), \quad \text{subject to} \quad x_1 + x_2 + x_3 = 3.

The first order necessary conditions read:

x_i + \lambda = 0, \text{ for } i = 1, 2, 3,
x_1 + x_2 + x_3 = 3.

Solving this system yields

x^* = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad \lambda^* = -1.

Note also that the second order (positive definiteness) condition holds, as ∇^2_{xx} L(x^*, λ^*) = I_3 (the 3 × 3 identity matrix). In fact, x^* is a global minimum.
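Since the first order conditions here are linear in (x, λ), they can be solved directly. A minimal sketch, assuming NumPy (not part of the lecture):

import numpy as np

# Unknowns (x1, x2, x3, λ): stationarity x_i + λ = 0 and feasibility Σ x_i = 3.
A = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0],
              [1.0, 1.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, 0.0, 3.0])

x1, x2, x3, lam = np.linalg.solve(A, b)
print((x1, x2, x3), lam)   # (1.0, 1.0, 1.0) and -1.0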


Sufficient conditions

The theorem below gives a sufficient condition for the existence of a local
minimizer.
Second Order Sufficient Conditions. Let f(x) and c_i(x) be twice continuously differentiable and let x^* ∈ Ω ⊂ R^n, λ^* ∈ R^m, be such that

\nabla_x L(x^*, \lambda^*) = 0,
\nabla_\lambda L(x^*, \lambda^*) = 0,   (13)

and

d^T \nabla^2_{xx} L(x^*, \lambda^*) d > 0, \text{ for all } d \in V(x^*), \ d \neq 0.   (14)

Then x^* is a strict local minimizer of f(x) subject to the equality constraints c_i(x) = 0, i = 1, . . . , m.


Example

Example 4. Consider the quadratic optimization problem:

\min f(x) = \frac{1}{2} x^T A x + x^T b, \quad \text{subject to} \quad x^T c = 1,   (15)

where b, c ∈ R^n are constant vectors, and A ∈ Mat_n(R) is positive definite.
First and second order conditions yield

\lambda^* = -\frac{b^T A^{-1} c + 1}{c^T A^{-1} c},
x^* = -A^{-1}(b + \lambda^* c),
\nabla^2_{xx} L(x^*, \lambda^*) = A.

Note that, since A is positive definite, condition (14) holds for all d (and in particular, for d ∈ V(x^*) = {d : c^T d = 0}).
Consequently, x^* is a strict (global) minimizer.
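A sketch verifying the closed-form solution numerically; the data A, b, c below are hypothetical (any positive definite A will do):

import numpy as np

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])       # positive definite
b = np.array([1.0, -1.0])
c = np.array([1.0, 1.0])

Ainv_c = np.linalg.solve(A, c)
lam = -(b @ Ainv_c + 1.0) / (c @ Ainv_c)   # λ* from the first order conditions
x = -np.linalg.solve(A, b + lam * c)       # x* = -A^{-1}(b + λ* c)

print(x, c @ x)   # c @ x should equal 1, i.e. the constraint holds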


Sensitivity analysis
Lagrange multipliers often have an intuitive interpretation, depending on the specific problem at hand.
In general, they can be interpreted as the rates of change of the objective
function as the constraint functions are varied.
Let x ∗ and λ∗ be a local minimizer and the corresponding Lagrange multiplier,
respectively, of a constrained optimization problem for f (x).
Consider now the following family of constrained optimization problems, parameterized by a vector u = (u_1, . . . , u_m) ∈ R^m:

\min_{x \in \mathbb{R}^n} f(x), \quad \text{subject to} \quad c_i(x) - u_i = 0.   (16)

One can think of the family of constraint functions c_i(x, u) = c_i(x) - u_i as defining the parametric problem.
Then, at least for small values of u, the solution x(u) with the corresponding
Lagrange multipliers λ(u) forms a continuously differentiable function, such that

x(0) = x ∗ ,
λ(0) = λ∗ .


Sensitivity analysis

Observe that, since ∇_x c_i(x, u) = ∇_x c_i(x), we have the following equation:

\nabla_x f(x(u)) + \sum_{i=1}^m \lambda_i(u) \nabla_x c_i(x(u)) = 0,   (17)

which is simply the first order condition for (16).
We denote by

p(u) = f(x(u)),   (18)

the minimum of the objective function as a function of the parameters u. The function p(u) is called the primal function.
Then, for all u where x(u) is defined, we have the following relation:

∇u p(u) = −λ(u). (19)

In words: the Lagrange multipliers are the (negative of the) rates of change of
the primal function as the parameters are varied.
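Relation (19) is easy to check numerically. A sketch on the perturbed Example 3 (minimize ½(x_1^2 + x_2^2 + x_3^2) subject to x_1 + x_2 + x_3 - 3 = u), where by symmetry x_i(u) = (3 + u)/3 and λ(u) = -(3 + u)/3 (not part of the lecture):

# Finite-difference check of ∇_u p(u) = -λ(u) for the perturbed Example 3.
def p(u):                       # primal function p(u) = f(x(u))
    xi = (3.0 + u) / 3.0        # x_i(u) = (3 + u)/3 by symmetry
    return 1.5 * xi**2

def lam(u):                     # multiplier from x_i + λ = 0
    return -(3.0 + u) / 3.0

eps = 1e-6
dp_du = (p(eps) - p(-eps)) / (2.0 * eps)
print(dp_du, -lam(0.0))         # both approximately 1.0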


Sensitivity analysis
The proof is a straightforward calculation. From the chain rule and (17), we get for each j = 1, . . . , m,

\nabla_{u_j} p(u) = \nabla_{u_j} f(x(u))
= \sum_{k=1}^n \nabla_{x_k} f(x(u)) \nabla_{u_j} x_k(u)
= -\sum_{i=1}^m \sum_{k=1}^n \lambda_i(u) \nabla_{x_k} c_i(x(u)) \nabla_{u_j} x_k(u).

However, using the chain rule again,

\sum_{k=1}^n \nabla_{x_k} c_i(x(u)) \nabla_{u_j} x_k(u) = \nabla_{u_j} c_i(x(u)) = \nabla_{u_j} u_i = \delta_{ji},

where we have also used the fact that c_i(x(u)) = u_i.


Sensitivity analysis

As a consequence,

\nabla_{u_j} p(u) = -\sum_{i=1}^m \lambda_i(u) \delta_{ji} = -\lambda_j(u),

which proves equation (19).


Inequality constraints

We will now move on to the case of inequality constraints. We begin with a motivational example.
Example 5. Consider the slightly modified Example 2:

\min x_1 + x_2, \quad \text{subject to} \quad x_1^2 + x_2^2 \leq 2.   (20)

In this case, c_1(x) = x_1^2 + x_2^2 - 2, E = ∅, I = {1}, and the feasible set consists of the circle of radius \sqrt{2} together with its interior.
Inspection shows that the solution to this problem continues to be

x^* = \begin{pmatrix} -1 \\ -1 \end{pmatrix}.

Notice that condition (5) continues to hold. We will argue that, in the case of an inequality constraint, the sign of the Lagrange multiplier is not a coincidence.


Inequality constraints

Assume that a feasible point x ∈ R^2 is not a local minimizer.
Arguing along the lines of Example 2, this means that there exists a (small) vector h = εd, ε > 0, such that both conditions below hold simultaneously:

c_1(x) + \varepsilon \nabla c_1(x)^T d \leq 0,
\nabla f(x)^T d < 0.   (21)

We consider two cases.


Case 1. x lies inside the circle, i.e. c_1(x) < 0 (the constraint is inactive). In this case, assuming that ∇f(x) ≠ 0, the vector d = -∇f(x) satisfies both conditions, provided that ε > 0 is sufficiently small.
Note that this vector does not work if ∇f(x) = 0.


Inequality constraints

Case 2. x lies on the boundary of the circle, i.e. c_1(x) = 0 (the constraint is active).
In this case, the conditions read

\nabla c_1(x)^T d \leq 0,
\nabla f(x)^T d < 0.   (22)

The first of these conditions defines a closed half-plane, while the second one
defines an open half-plane.
For x to be a local minimizer, the intersection of these two half-planes must be empty!
A reflection shows that this is possible only if there is a positive constant λ such that

\nabla f(x) = -\lambda \nabla c_1(x).


Inequality constraints

We can formulate the results of the analysis of Cases 1 and 2 in the following elegant way using the Lagrange function.
If x^* is a local minimizer of f(x) (no feasible descent direction d is possible), then

\nabla_x L(x^*, \lambda^*) = 0, \quad \text{for some } \lambda^* \geq 0,

with the additional requirement that

\lambda^* c_1(x^*) = 0.

The latter condition is called the complementary slackness condition. It means that λ^* can be strictly positive only if the constraint c_1(x) is active.
In Case 1, c_1(x^*) < 0, and so λ^* = 0, while in Case 2, λ^* = 1/2 is positive.


Inequality constraints and the KKT conditions

Consider now the general problem (1), in which both equality and inequality
constraints are present.
For any feasible point x ∈ Ω, we define the set of active inequality constraints by

A(x) = {i ∈ I : ci (x) = 0}. (23)

A constraint i ∈ I for which ci (x) < 0 is called inactive at x.


If x^* is a local minimum solution to (1), then it is also a local minimum to the same problem with all constraints inactive at x^* ignored. Thus the inactive constraints do not matter.
The active constraints can be treated as equality constraints. A local minimum solution to (1) thus solves the equality constrained problem:

\min_{x \in \mathbb{R}^n} f(x), \quad \text{subject to} \quad c_i(x) = 0, \text{ if } i \in E \cup A(x^*).   (24)


Inequality constraints and the KKT conditions

Consequently, assuming that x^* is regular, there exist Lagrange multipliers λ_i^*, i ∈ E ∪ A(x^*), so that

\nabla f(x^*) + \sum_{i \in E \cup A(x^*)} \lambda_i^* \nabla c_i(x^*) = 0.

For convenience, we also introduce the Lagrange multiplier λ_i^* = 0 for each inactive constraint i. We can thus write the above condition compactly as

\nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla c_i(x^*) = 0,
\lambda_i^* c_i(x^*) = 0.   (25)

An important fact about the Lagrange multipliers corresponding to the inequality constraints is that they are nonnegative.


Inequality constraints and the KKT conditions

This can be seen as follows.
If the i-th constraint is replaced by a weaker constraint c_i(x) ≤ u_i, where u_i > 0, then the optimal objective function will not increase. The reason for this is that the feasible set will become larger.
From the sensitivity equation (19):

\lambda_i^* = -\lim_{u_i \downarrow 0} \frac{\partial p(u)}{\partial u_i} \geq 0.

The arguments above are not exactly a proof, but they are convincing enough to motivate the following necessary conditions.


Inequality constraints and the KKT conditions


As in the case of equality constraints only, it is convenient to work with the Lagrangian function:

L(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i c_i(x).   (26)

Karush-Kuhn-Tucker Necessary Conditions. Let x^* be a local minimum solution to (1), and assume that x^* is regular. Then there exists a unique vector of Lagrange multipliers λ_i^*, i = 1, . . . , m, such that

\nabla_x L(x^*, \lambda^*) = 0,
\lambda_i^* c_i(x^*) = 0,   (27)
\lambda_i^* \geq 0, \text{ for } i \in I.

If, additionally, f(x) and c_i(x) are twice continuously differentiable, then

d^T \nabla^2_{xx} L(x^*, \lambda^*) d \geq 0,   (28)

for all d, such that ∇c_i(x^*)^T d = 0, i ∈ E ∪ A(x^*).
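In practice, the KKT conditions are enforced by general-purpose solvers. A minimal sketch applying SciPy's SLSQP method to Example 5 (not part of the lecture); note that SciPy's 'ineq' convention is c(x) ≥ 0, the opposite sign of (1):

import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0] + x[1]                 # objective of Example 5
c1 = lambda x: x[0]**2 + x[1]**2 - 2.0    # lecture convention: c1(x) <= 0

res = minimize(f, x0=np.array([0.0, 0.0]), method="SLSQP",
               constraints=[{"type": "ineq", "fun": lambda x: -c1(x)}])
print(res.x)                              # approximately (-1, -1)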


Inequality constraints and the KKT conditions

Note that the complementary slackness condition

\lambda_i^* c_i(x^*) = 0   (29)

is just a compact way of stating that λ_i^* = 0 if the constraint c_i is inactive at x^*, while λ_i^* can be nonzero (and, in fact, nonnegative) if the constraint is active.


Examples

Example 6. Consider the problem:

\min \frac{1}{2}(x_1^2 + x_2^2 + x_3^2), \quad \text{subject to} \quad x_1 + x_2 + x_3 \leq -3.

The first order necessary conditions read:

x_i + \lambda = 0, \text{ for } i = 1, 2, 3,

or

x_i = -\lambda, \text{ for } i = 1, 2, 3.


Examples

We now have two possibilities:
(i) The constraint is inactive, x_1^* + x_2^* + x_3^* < -3, in which case λ^* = 0. This implies x_i^* = 0, for i = 1, 2, 3, which contradicts the constraint. Thus (0, 0, 0) is not a minimizer.
(ii) The constraint is active, x_1^* + x_2^* + x_3^* = -3, in which case λ^* = 1, and

x^* = -\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.

Note also that the second order (positive definiteness) condition holds, as ∇^2_{xx} L(x^*, λ^*) = I_3.
A reflection shows that x^* is, in fact, a global minimizer.


Examples
Example 7. Consider the problem:

\min 2(x_1 - 1)^2 + (x_2 - 2)^2, \quad \text{subject to} \quad \begin{cases} x_1 + 2x_2 \leq 1, \\ x_1 \geq x_2. \end{cases}

The Lagrange function is

L(x, \lambda) = 2(x_1 - 1)^2 + (x_2 - 2)^2 + \lambda_1(x_1 + 2x_2 - 1) + \lambda_2(-x_1 + x_2),

and the KKT conditions read:

4x_1 - 4 + \lambda_1 - \lambda_2 = 0,
2x_2 - 4 + 2\lambda_1 + \lambda_2 = 0,
\lambda_1(x_1 + 2x_2 - 1) = 0,
\lambda_2(-x_1 + x_2) = 0,
\lambda_1 \geq 0,
\lambda_2 \geq 0.


Examples

Complementarity conditions yield 4 cases:
(i) If λ_1 = 0, λ_2 = 0, then x_1 = 1, x_2 = 2. This violates the second constraint.
(ii) If λ_1 = 0, -x_1 + x_2 = 0, then x_1 = 4/3, x_2 = 4/3, λ_2 = 4/3. This violates the first constraint.
(iii) If x_1 + 2x_2 - 1 = 0, λ_2 = 0, then x_1 = 5/9, x_2 = 2/9, λ_1 = 16/9. This is a feasible solution.
(iv) If x_1 + 2x_2 - 1 = 0, -x_1 + x_2 = 0, then x_1 = 1/3, x_2 = 1/3, λ_1 = 2, λ_2 = -2/3. This violates λ_2 ≥ 0.
Since

\nabla^2_{xx} L(x, \lambda) = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}

is (constant and) positive definite, the point x^* = (5/9, 2/9), with λ^* = (16/9, 0), satisfies the first and second order KKT conditions.
Inspection shows that it is a strict minimizer.
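The case analysis can be automated: each choice of active set turns the KKT system into a linear system. A sketch, assuming NumPy (not part of the lecture):

import numpy as np
from itertools import product

# Rows are equations in the unknowns (x1, x2, λ1, λ2).
stationarity = [([4.0, 0.0, 1.0, -1.0], 4.0),   # 4x1 - 4 + λ1 - λ2 = 0
                ([0.0, 2.0, 2.0, 1.0], 4.0)]    # 2x2 - 4 + 2λ1 + λ2 = 0
active = [([1.0, 2.0, 0.0, 0.0], 1.0),          # x1 + 2x2 - 1 = 0
          ([-1.0, 1.0, 0.0, 0.0], 0.0)]         # -x1 + x2 = 0
inactive = [([0.0, 0.0, 1.0, 0.0], 0.0),        # λ1 = 0
            ([0.0, 0.0, 0.0, 1.0], 0.0)]        # λ2 = 0

for flags in product([False, True], repeat=2):
    rows = stationarity + [active[i] if on else inactive[i]
                           for i, on in enumerate(flags)]
    A = np.array([r[0] for r in rows]); b = np.array([r[1] for r in rows])
    x1, x2, l1, l2 = np.linalg.solve(A, b)
    ok = (x1 + 2*x2 - 1 <= 1e-12) and (x2 - x1 <= 1e-12) and min(l1, l2) >= 0
    print(flags, (x1, x2), (l1, l2), "KKT point" if ok else "rejected")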


Sufficient conditions

Assume now that the functions f(x) and c_i(x), i = 1, . . . , m, are twice continuously differentiable, and let x^* ∈ R^n, λ^* ∈ R^m be such that

\nabla_x L(x^*, \lambda^*) = 0,
c_i(x^*) = 0, \text{ if } i \in E,
c_i(x^*) \leq 0, \text{ if } i \in I,   (30)
\lambda_i^* c_i(x^*) = 0,
\lambda_i^* \geq 0, \text{ for } i \in I,

and

d^T \nabla^2_{xx} L(x^*, \lambda^*) d > 0,   (31)

for all d ≠ 0, such that ∇c_i(x^*)^T d = 0, i ∈ E ∪ A(x^*).
Assume also that λ_i^* > 0 for all active inequality constraints (strict complementary slackness condition).
Then x^* is a strict local minimizer of f(x) subject to the constraints.


Penalty function method

A practical way of solving constrained optimization problems is by adding a penalty function per constraint to the objective function.
These penalty functions attempt to mimic the effect of the constraint functions, i.e. discourage the optimizer from searching outside of the feasible set.
Typically, one considers a sequence of such modified objective functions, in which the penalty functions are multiplied by larger and larger positive coefficients, thus forcing the optimizer to stay closer and closer to the feasible set.
Two commonly used penalty function methods are:
(i) Exterior penalty functions impose a penalty for violation of constraints.
(ii) Interior penalty functions impose a penalty for approaching the boundary
of an inequality constraint.


Penalty function method

The simplest exterior penalty method is the quadratic penalty function.
The quadratic penalty function applied to constraint optimization problems with equality constraints only, c_i(x) = 0, for i = 1, . . . , m, is defined as

Q(x, \mu) = f(x) + \frac{\mu}{2} \sum_{i=1}^m c_i(x)^2,   (32)

where µ > 0 is the penalty parameter.
As µ → ∞, the presence of the penalty term imposes a more and more severe penalty for constraint violations.


Penalty function method

Algorithmically, we consider a sequence µ_0 < . . . < µ_k → ∞, and the exit criterion

\| \nabla Q(x, \mu_k) \|_\infty \leq \varepsilon_k,   (33)

where ε_k > 0, k = 0, 1, . . . , is a sequence with ε_k → 0, as k → ∞.
We start the search with an initial guess x_0.
For k = 0, 1, . . . , we find a minimizer x_k^* of Q(x, µ_k) satisfying the stopping criterion (33).
In the next step, we increase µ_k to µ_{k+1}, initialize the search with x_k^* (“warm start”), and find a minimizer x_{k+1}^* satisfying (33).
We stop, once the final convergence test is satisfied.
The expectation is that, as k → ∞, x_k^* converges to the minimizer of the original problem, and thus, for µ_k sufficiently large, x_k^* will be a good approximation to x^*.
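A sketch of this loop, assuming SciPy's BFGS as the inner unconstrained solver and, for simplicity, a fixed inner tolerance instead of a decreasing sequence ε_k (not part of the lecture):

import numpy as np
from scipy.optimize import minimize

def quadratic_penalty(f, cons, x0, mus=(1.0, 10.0, 100.0, 1000.0)):
    """Equality constraints only: minimize Q(x, µ) for increasing µ."""
    x = np.asarray(x0, dtype=float)
    for mu in mus:                                    # µ_0 < µ_1 < ... -> ∞
        Q = lambda x, mu=mu: f(x) + 0.5 * mu * sum(c(x)**2 for c in cons)
        x = minimize(Q, x, method="BFGS").x           # warm start at previous x*
    return x

# min x1 + x2 subject to x1^2 + x2^2 - 2 = 0 (the example on the next slides).
f = lambda x: x[0] + x[1]
c = lambda x: x[0]**2 + x[1]**2 - 2.0
print(quadratic_penalty(f, [c], x0=[0.0, 0.1]))       # approaches (-1, -1)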


Example
Example 8. Consider again the problem of Example 2:

\min x_1 + x_2, \quad \text{subject to} \quad x_1^2 + x_2^2 - 2 = 0.

The quadratic penalty function is given by

Q(x, \mu) = x_1 + x_2 + \frac{\mu}{2}(x_1^2 + x_2^2 - 2)^2.

We find that the gradient and Hessian of Q(x, µ) are

\nabla Q(x, \mu) = \begin{pmatrix} 1 + \mu x_1 (x_1^2 + x_2^2 - 2) \\ 1 + \mu x_2 (x_1^2 + x_2^2 - 2) \end{pmatrix},

and

\nabla^2 Q(x, \mu) = \mu \begin{pmatrix} 3x_1^2 + x_2^2 - 2 & 2x_1 x_2 \\ 2x_1 x_2 & x_1^2 + 3x_2^2 - 2 \end{pmatrix},

respectively.


Example

We choose µ_0 = 1 and perform Newton's search with the initial guess x_0 = (0, 0.1).
The table below shows the values of x_k^* corresponding to µ_k = k.

µ_k     x_k^*
1       (-1.19148788, -1.19148788)
2       (-1.10715987, -1.10715987)
3       (-1.07474445, -1.07474445)
4       (-1.05745377, -1.05745377)
...     ...
100     (-1.00249069, -1.00249069)
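The table can be verified without a full two-dimensional Newton search: by symmetry the minimizer satisfies x_1 = x_2 = t, and the gradient above vanishes when 2µt^3 - 2µt + 1 = 0. A sketch solving this cubic, assuming NumPy (not part of the lecture):

import numpy as np

# Roots of 2µ t^3 - 2µ t + 1 = 0; the minimizer corresponds to the root near -1.
for mu in [1, 2, 3, 4, 100]:
    roots = np.roots([2.0 * mu, 0.0, -2.0 * mu, 1.0])
    t = min(r.real for r in roots if abs(r.imag) < 1e-10)
    print(mu, t)   # -1.19148788, -1.10715987, ..., -1.00249069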


Penalty function method

The sequence of µ_k can be chosen adaptively, based on the difficulty of minimizing Q(x, µ_k).
When minimizing Q(x, µ_k) turns out to be difficult, in the next step we choose µ_{k+1} to be modestly larger than µ_k.
Otherwise, we can take µ_{k+1} significantly larger than µ_k.
As µ_k → ∞, the search in general becomes increasingly hard, due to the fact that the Hessian of Q(x, µ_k) becomes increasingly ill conditioned.
This phenomenon can be seen in Example 8, where the magnitude of the Hessian increases in proportion to µ.
One can prove the following general theorem [2]: If x_k^* is the exact global minimizer of Q(x, µ_k), then every limit point x^* of the sequence x_k^* is a global solution to the original problem.


Penalty function method

The quadratic penalty function can be defined for a problem with both equality and inequality constraints as follows:

Q(x, \mu) = f(x) + \frac{\mu}{2} \sum_{i \in E} c_i(x)^2 + \frac{\mu}{2} \sum_{i \in I} \big( c_i(x)^+ \big)^2,   (34)

where a^+ = max(a, 0).


Augmented Lagrangian method

A refinement of the quadratic penalty method is the augmented Lagrangian method.
It reduces the possibility of ill conditioning discussed above by introducing approximate Lagrange multipliers into the objective function.
We discuss the case of equality constraints only; see [2] and [1] for the general case.
We consider the augmented Lagrangian function

L_\mu(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i c_i(x) + \frac{\mu}{2} \sum_{i=1}^m c_i(x)^2.   (35)

It differs from the standard Lagrangian function (11) by the presence of the quadratic penalty term.
Note that the λ_i's are not, strictly speaking, the Lagrange multipliers, as the critical points of L_µ(x, λ) are not feasible points for the original problem.


Augmented Lagrangian method

We do expect that, as µ_k → ∞, the augmented Lagrangian function will tend to the standard Lagrangian function, L_{µ_k}(x_k^*, λ) → L(x^*, λ^*).
Heuristically, the first order condition reads:

0 \approx \nabla_x L_\mu(x, \lambda) = \nabla f(x) + \sum_{i=1}^m \big( \lambda_i + \mu c_i(x) \big) \nabla c_i(x),

from which we deduce that the “true” Lagrange multipliers are

\lambda_i^* \approx \lambda_i + \mu c_i(x),

or

c_i(x) \approx \frac{\lambda_i^* - \lambda_i}{\mu}.


Augmented Lagrangian method

In other words, if λ_i is close to λ_i^*, the infeasibility of x will be much smaller than 1/µ.
This suggests that the value of λ_i at each iteration step k (denoted by λ_{i,k}) is updated according to the following rule:

\lambda_{i,k+1} = \lambda_{i,k} + \mu_k c_i(x_k^*),   (36)

for all i = 1, . . . , m.


Augmented Lagrangian method

Algorithmically, we consider a sequence µ_0 < . . . < µ_k → ∞, and the stopping criterion

\| \nabla_x L_\mu(x, \lambda) \|_\infty \leq \varepsilon_k,   (37)

where ε_k > 0, k = 0, 1, . . . , is a sequence with ε_k → 0, as k → ∞.
We start the search with an initial guess x_0 and λ_0.
For k = 0, 1, . . . , we find a minimizer x_k^* of L_{µ_k}(x, λ_k) satisfying the exit criterion (37).
In the next step, we increase µ_k to µ_{k+1}, update λ_k to λ_{k+1} according to the rule (36), initialize the search with x_k^*, and find a minimizer x_{k+1}^* satisfying (37).
We exit, once the final convergence test is satisfied.
The expectation is that, as k → ∞, x_k^* converges to the minimizer of the original problem without the necessity of sending µ_k → ∞. This should mitigate the danger of ill conditioning and speed up the convergence.
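A sketch of the augmented Lagrangian loop with the update rule (36), assuming SciPy's BFGS as the inner solver (not part of the lecture); for the problem below the multipliers should approach λ^* = 1/2:

import numpy as np
from scipy.optimize import minimize

def augmented_lagrangian(f, cons, x0, lam0, mus=(1.0, 2.0, 3.0, 4.0, 5.0)):
    x, lam = np.asarray(x0, dtype=float), np.asarray(lam0, dtype=float)
    for mu in mus:
        L = lambda x, lam=lam, mu=mu: (f(x)
                                       + sum(l * c(x) for l, c in zip(lam, cons))
                                       + 0.5 * mu * sum(c(x)**2 for c in cons))
        x = minimize(L, x, method="BFGS").x                    # warm start
        lam = lam + mu * np.array([c(x) for c in cons])        # update rule (36)
    return x, lam

f = lambda x: x[0] + x[1]
c = lambda x: x[0]**2 + x[1]**2 - 2.0
print(augmented_lagrangian(f, [c], x0=[0.0, 0.1], lam0=[0.3]))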


Example

Consider again Example 8. The augmented Lagrangian function reads

L_\mu(x, \lambda) = x_1 + x_2 + \lambda(x_1^2 + x_2^2 - 2) + \frac{\mu}{2}(x_1^2 + x_2^2 - 2)^2.

We find that the gradient and Hessian of L_µ(x, λ) are

\nabla L_\mu(x, \lambda) = \begin{pmatrix} 1 + 2\lambda x_1 + \mu x_1 (x_1^2 + x_2^2 - 2) \\ 1 + 2\lambda x_2 + \mu x_2 (x_1^2 + x_2^2 - 2) \end{pmatrix},

and

\nabla^2 L_\mu(x, \lambda) = \begin{pmatrix} 2\lambda + \mu(3x_1^2 + x_2^2 - 2) & 2\mu x_1 x_2 \\ 2\mu x_1 x_2 & 2\lambda + \mu(x_1^2 + 3x_2^2 - 2) \end{pmatrix},

respectively.


Example

We choose µ_0 = 1 and perform Newton's search with the initial guess x_0 = (0, 0.1) and λ_0 = 0.3.
The table below shows the values of x_k^* and λ_k corresponding to µ_k = k.

µ_k     λ_k           x_k^*
1       0.62706599    (-1.07867187, -1.07867187)
2       0.40227548    (-0.97149492, -0.97149492)
3       0.58299634    (-1.01494835, -1.01494835)
4       0.28520661    (-0.9901962, -0.9901962)
5       0.69491662    (-1.02027986, -1.02027986)
...     ...           ...
10      0.31472584    (-0.99044963, -0.99044963)


Barrier methods

Exterior penalty methods allow for generating infeasible points during the search. Therefore, they are not suitable when feasibility has to be strictly enforced.
This could be the case if the objective function is undefined outside of the feasible set.
Barrier methods are similar in spirit to the exterior penalty method.
They generate a sequence of unconstrained modified differentiable objective functions whose unconstrained minimizers are expected to converge to the solution of the constrained problem in the limit.


Barrier methods

Barrier methods belong to the category of interior penalty function methods and apply to inequality constraint optimization problems, i.e. E = ∅ in (1).
The essence of the method is to add a penalty term B(x) to the objective function, which has the following properties:
(i) It is defined and continuous whenever c_i(x) < 0, for all i = 1, . . . , m.
(ii) It goes to +∞, whenever c_i(x) ↑ 0, for any i.
Commonly used barrier functions are:

B(x) = -\sum_{i=1}^m \log(-c_i(x)) \quad \text{(logarithmic)},   (38)

and

B(x) = -\sum_{i=1}^m \frac{1}{c_i(x)} \quad \text{(inverse)}.   (39)


Barrier methods

Algorithmically, barrier methods work similarly to the penalty function methods.
We select a sequence of positive numbers µ_0 > . . . > µ_k → 0, and consider a sequence of functions:

R(x, \mu_k) = f(x) + \mu_k B(x).   (40)

We find a local minimizer x_k^*, and use it as the initial guess for the next iteration.
A general convergence theorem guarantees that any limit point of this sequence is a global minimizer of the original constrained problem.
Since the barrier function is defined only in the interior of the feasible set, any successive iterate must also be an interior point.
The barrier term µ_k B(x) goes to zero for interior points as µ_k → 0. The barrier term thus becomes increasingly irrelevant for interior points, while allowing x_k^* to move closer to the boundary of Ω.
This behavior is expected if the solution to the original constrained problem lies on the boundary.
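A sketch of the barrier loop with the logarithmic barrier (38), illustrated on the problem of Example 9 on the next slide; returning +∞ outside the interior keeps the derivative-free search feasible (not part of the lecture):

import numpy as np
from scipy.optimize import minimize

def barrier_method(f, cons, x0, mus=(1.0, 0.1, 0.01, 0.001)):
    x = np.asarray(x0, dtype=float)
    for mu in mus:                                   # µ_0 > µ_1 > ... -> 0
        def R(x, mu=mu):
            if any(c(x) >= 0.0 for c in cons):       # barrier undefined outside
                return np.inf                        # the interior of Ω
            return f(x) + mu * sum(-np.log(-c(x)) for c in cons)
        x = minimize(R, x, method="Nelder-Mead").x   # warm start
    return x

# min ½(x1^2 + x2^2) subject to x1 >= 2, i.e. c1(x) = 2 - x1 <= 0.
f = lambda x: 0.5 * (x[0]**2 + x[1]**2)
c = lambda x: 2.0 - x[0]
print(barrier_method(f, [c], x0=[3.0, 1.0]))         # approaches (2, 0)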


Example

Example 9. Consider the problem:

\min \frac{1}{2}(x_1^2 + x_2^2), \quad \text{subject to} \quad x_1 \geq 2.

Its optimal solution is clearly x^* = (2, 0).
The function

R(x, \mu_k) = \frac{1}{2}(x_1^2 + x_2^2) - \mu_k \log(x_1 - 2)

has a global minimum at

x_k^* = (1 + \sqrt{1 + \mu_k}, 0).

Indeed, setting ∂R/∂x_1 = x_1 - µ_k/(x_1 - 2) = 0 yields x_1^2 - 2x_1 - µ_k = 0, whose root in the feasible region x_1 > 2 is x_1 = 1 + \sqrt{1 + µ_k}, while ∂R/∂x_2 = x_2 = 0.
As µ_k decreases to zero, the unconstrained minimum x_k^* converges to the constrained minimum x^*.


References

[1] Bertsekas, D. P.: Nonlinear Programming, Athena Scientific (2016).

[2] Nocedal, J., and Wright, S. J.: Numerical Optimization, Springer (2006).
