
Simon Fraser University, Department of Economics

Econ 798 – Introduction to Mathematical Economics


Prof. Alex Karaivanov
Lecture Notes 4

1 Unconstrained optimization
In this section we address the problem of maximizing (or minimizing) a function when
there are no constraints on its arguments. This case is less interesting for economics,
which typically deals with problems where resources are constrained, but it is a natural
starting point for solving the more economically relevant constrained optimization problems.

1.1 Univariate case


Let f : U ⊆ ℝ → ℝ be C². We are interested in finding maxima (or minima) of this function.
We need to start by defining what we mean by these concepts.
Definition (local maximum) – done before.

A point x0 ∈ U is a local maximum for the function f if ∃ ε > 0 such that f(x0) ≥ f(x),
∀x ∈ U ∩ N_ε(x0), where N_ε(x0) denotes an ε-ball around x0. If f(x0) > f(x),
∀x ∈ U ∩ N_ε(x0) with x ≠ x0, we say that the local maximum is strict.

Clearly a function can have many local maxima in its domain, or none at all.

Definition (global maximum)

A point x0 ∈ U is a global maximum for the function f if f(x0) ≥ f(x), ∀x ∈ U.
So how do we go about finding local (global) maxima? Most of the time we use differentiation
and set the first derivative to zero, but a zero first derivative is neither a necessary
condition for a maximum (e.g., a corner maximum or a kink maximum), nor a sufficient one
(a point with f′ = 0 could be a minimum or some other critical point). Thus some care is
needed to ensure that what one finds by setting f′ = 0 is indeed what one is looking for.
Let us call a point that is either a local maximum or a local minimum a local extremum.
The following theorem is the basic result used in unconstrained optimization problems for
the univariate case.
Theorem 19 (Conditions for local extrema)
Let f′(x0) = 0. If:
(i) f″(x0) < 0, then x0 is a local maximum of f.
(ii) f″(x0) > 0, then x0 is a local minimum of f.
(iii) f″(x0) = 0, then we cannot conclude whether x0 is a local extremum of f.
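
As an aside, here is a minimal computational sketch of Theorem 19 (an illustration added here,
not part of the original notes): it classifies the critical points of the purely illustrative
function f(x) = x³ - 3x using sympy.

    import sympy as sp

    x = sp.symbols('x')
    f = x**3 - 3*x                        # illustrative function (an assumption)

    f1 = sp.diff(f, x)                    # first derivative
    f2 = sp.diff(f, x, 2)                 # second derivative

    for x0 in sp.solve(sp.Eq(f1, 0), x):  # critical points: f'(x0) = 0
        curvature = f2.subs(x, x0)
        if curvature < 0:
            kind = 'local maximum'        # Theorem 19 (i)
        elif curvature > 0:
            kind = 'local minimum'        # Theorem 19 (ii)
        else:
            kind = 'cannot conclude'      # Theorem 19 (iii)
        print(x0, kind)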
Theorem 20
A continuous function f whose domain is a closed interval [a, b] ⊂ ℝ attains a global
maximum and a global minimum on that interval.

1.2 The multivariate case
Now consider more general functions of the type f : U ⊆ ℝⁿ → ℝ (multivariate).

Theorem 21 (First-order (necessary) conditions for a local extremum)

Let f : U ⊆ ℝⁿ → ℝ be a C¹ function. If x0 is a local extremum of f in the interior
of U, then:

    ∂f(x0)/∂xi = 0,   i = 1, ..., n

The above theorem states that at an interior local extremum all first partial derivatives
must be equal to zero, i.e., we can solve the system of n equations defined by the condition
above and look for interior extrema only among its solutions. Note also that the above can be
written equivalently as

    ∇f(x0) = 0_{n×1}

i.e., at an interior local extremum the gradient of f is zero. Remembering that the gradient
is a vector pointing in the direction in which the function changes fastest, we see that the
above condition implies that at the extremum there is no such best direction, i.e., whatever
direction we move in, we cannot reach a higher function value (if we are talking about a maximum).

The first-order condition ∇f(x0) = 0_{n×1} is only necessary. To obtain sufficient conditions,
as in the univariate case (Theorem 19), we need to know something about the second derivatives
of f. In order to do so, we need some useful concepts from linear algebra.

Definition (Principal submatrix and minor)

Let A be an n × n matrix. A k-th order principal submatrix of A is the matrix
formed by deleting some n - k rows and the corresponding n - k columns of A.
A k-th order principal minor is the determinant of a k-th order principal submatrix.

Definition (Leading principal submatrix and minor)

The k-th order leading principal submatrix (LPS) of A is formed by deleting the last
n - k rows and columns of A. Its determinant is called the k-th order leading principal
minor (LPM).

For a symmetric n × n matrix A, define the following concepts.

Definition (positive/negative definite symmetric matrix) (Note: we saw alternative
definitions earlier, using quadratic forms!)

(a) The symmetric matrix A is positive definite (p.d.) iff all its n LPMs are
positive.
(b) The symmetric matrix A is negative definite (n.d.) iff all its LPMs are non-zero
and alternate in sign, that is |A1| < 0, |A2| > 0, etc.

Definition (positive/negative semidefinite symmetric matrix)

(a) The matrix A is positive semi-definite (p.s.d.) iff all its principal minors
are non-negative.
(b) The matrix A is negative semi-definite (n.s.d.) iff all its odd-order principal
minors are non-positive and all its even-order principal minors are non-negative.

Note: the above must be true for all principal minors, not just the leading ones.
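
As a quick illustration of these definitions, here is a minimal numpy sketch (added for
illustration, not part of the original notes) that classifies a symmetric matrix via its
leading principal minors; the example matrix is the Hessian that appears in Example 1 below.

    import numpy as np

    def leading_principal_minors(A):
        # |A1|, |A2|, ..., |An| for a square matrix A
        n = A.shape[0]
        return [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]

    def classify(A):
        # Strict definiteness only; semidefiniteness requires ALL principal minors,
        # not just the leading ones (see the note above).
        lpms = leading_principal_minors(A)
        if all(m > 0 for m in lpms):
            return 'positive definite'
        if all((m < 0 if k % 2 == 1 else m > 0) for k, m in enumerate(lpms, start=1)):
            return 'negative definite'
        return 'inconclusive from LPMs alone'

    H = np.array([[-4.0, -1.0],
                  [-1.0, -4.0]])          # Hessian from Example 1 below
    print(classify(H))                    # negative definite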
Finally, we are ready to state the sufficient conditions for local extrema.

Theorem 22 (Second-order (su cient) conditions for local extrema)

Let f : U ⊆ ℝⁿ → ℝ be a C² function. Let also x0 ∈ U satisfy ∇f(x0) = 0_{n×1} and
let H(x0) be the Hessian of f at x0. Then:

(i) If H(x0) is negative definite, then x0 is a strict local maximum of f.
(ii) If H(x0) is positive definite, then x0 is a strict local minimum of f.

If the Hessian is only p.s.d. (n.s.d.), the extremum need not be strict.

Theorem 23 (Second-order necessary conditions)

Let f : U ⊆ ℝⁿ → ℝ be a C² function. Let also x0 ∈ int U (the interior of U, i.e.,
not a boundary point). If x0 is a local maximum (minimum) of f, then ∇f(x0) = 0_{n×1}
and H(x0) is n.s.d. (p.s.d.).
The following examples illustrate how the theory from above is applied.

Example 1 (Multi-product firm)

Suppose we have a firm producing two goods in quantities q1 and q2, sold at prices p1 and
p2. Let the cost of producing q1 units of good 1 and q2 units of good 2 be given by
C(q1, q2) = 2q1² + q1 q2 + 2q2². The firm maximizes profits, i.e., it solves:

    max_{q1, q2}  π = p1 q1 + p2 q2 - (2q1² + q1 q2 + 2q2²) = pᵀq - qᵀAq

where p = (p1, p2)ᵀ, q = (q1, q2)ᵀ and

    A = [  2   0.5 ]
        [ 0.5   2  ]
How do we solve for the optimal quantities the firm will choose? Take the partial derivatives
of π with respect to q1 and q2 and set them to zero, to find qi = (4pi - pj)/15 for
i, j = 1, 2 with i ≠ j. We also need to verify that this is indeed a maximum. The Hessian
of the objective is

    H = [ -4  -1 ]
        [ -1  -4 ]

Let's check whether the leading principal minors alternate in sign: we have
|H1| = det[-4] = -4 < 0 and |H2| = det(H) = 15 > 0, i.e., the candidate solution is indeed
a maximum.
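
A minimal numerical check of Example 1 (the price values p1 = 3, p2 = 2 are assumptions
chosen purely for illustration): it solves the first-order conditions and confirms the
closed-form solution qi = (4pi - pj)/15.

    import numpy as np

    p = np.array([3.0, 2.0])              # assumed illustrative prices
    A = np.array([[2.0, 0.5],
                  [0.5, 2.0]])            # cost matrix, so C(q) = q'Aq

    # FOC of pi(q) = p'q - q'Aq:  p - 2Aq = 0  =>  q = (2A)^(-1) p
    q_foc = np.linalg.solve(2 * A, p)
    q_closed = np.array([(4 * p[0] - p[1]) / 15,
                         (4 * p[1] - p[0]) / 15])
    print(q_foc, q_closed)                # both give (2/3, 1/3)

    # Hessian of the objective is -2A; its LPMs are -4 and 15, so q_foc is a maximum
    H = -2 * A
    print(np.linalg.det(H[:1, :1]), np.linalg.det(H))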

Example 2 (OLS)

Think of some variable y which depends on x1, x2, ..., xk and assume we have a dataset of n
observations (i.e., n vectors Xi = (x1i, x2i, ..., xki), i = 1, ..., n). Assume that x1 is a
vector of ones. We are looking for the "best fit" between a linear function of the
observations, Xβ, and our dependent variable y. (Note that X is n × k and β is a k × 1
vector of coefficients.) Thus we can write:

    yi = β1 x1i + ... + βk xki + εi,   i = 1, ..., n

where the εi are the 'residuals' (errors) between the fitted line Xβ and y. The above can be
written more compactly in matrix form as:

    y = Xβ + ε

Remember, we want to find the best fit, i.e., the coefficients β which minimize the ε's in
some sense. One possible criterion (used by the OLS method) is to choose β to minimize
Σ_{i=1}^{n} εi², i.e., we want to solve the problem:

    min_β S(β) = Σi (yi - β1 x1i - ... - βk xki)² = (y - Xβ)ᵀ(y - Xβ) =
               = yᵀy - βᵀXᵀy - yᵀXβ + βᵀXᵀXβ

The first-order condition for the above minimization problem is (using the matrix
differentiation rules):

    ∂S(β)/∂β = -2Xᵀy + 2XᵀXβ = 0

from which we find β = (XᵀX)⁻¹Xᵀy – a candidate minimum. So is this β indeed a minimum of
S(β)? We need to check whether the Hessian, H(β) = ∂²S(β)/∂β² = 2XᵀX, a k × k matrix, is
p.s.d. This is true in this case (think of all the squared terms). (Exercise: prove that the
Hessian is p.s.d. using one of the given definitions.)
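
A minimal numpy sketch of the OLS result above, on synthetic data (the data-generating values
below are assumptions for illustration only): it computes β = (XᵀX)⁻¹Xᵀy from the first-order
condition and verifies that the Hessian 2XᵀX is p.s.d.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 200, 3
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # first regressor: vector of ones
    beta_true = np.array([1.0, 2.0, -0.5])                          # assumed coefficients
    y = X @ beta_true + rng.normal(scale=0.1, size=n)               # y = X beta + eps

    # Candidate minimizer from the FOC: beta = (X'X)^(-1) X'y
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    print(beta_hat)                                                 # close to beta_true

    # Hessian of S(beta) is 2X'X; it is p.s.d. since z'(2X'X)z = 2||Xz||^2 >= 0
    H = 2 * X.T @ X
    print(np.all(np.linalg.eigvalsh(H) >= -1e-10))                  # True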

1.3 Constrained optimization
1.3.1 Introduction
In this section we look at problems of the following general form:
    max_{x ∈ ℝⁿ} f(x)                                        (NLP)
    s.t.  g(x) ≤ b
          h(x) = c

We call the above problem a Non-Linear Optimization Problem (NLP). In it, f(x) is called
the objective function, g(x) ≤ b are the inequality constraints, and h(x) = c are the
equality constraints. Note that any optimization problem can be written in the above
canonical form. For example, if we want to minimize a function h(x), we can do this by
maximizing -h(x).
It turns out that it is easier not to solve (NLP) directly, but instead to solve another,
related problem (Lagrange's or Kuhn-Tucker's) for x and then verify that x solves the
original NLP as well. We will also be interested in whether we obtain all solutions to the
NLP in this way, i.e., whether it is true that if x solves the NLP it also solves the related
problem. Thus we would like to see when the Lagrange or Kuhn-Tucker methods are both
necessary and sufficient for obtaining the solutions to the original NLP.

1.3.2 Equality constraints


Start simple, assuming that the problem we deal with has only equality constraints, i.e.,

    max_{x ∈ ℝⁿ} f(x)
    s.t.  h(x) = c,   c ∈ ℝᵐ

The equality constraints restrict the domain over which we maximize. Notice that if the number
of constraints were equal to the number of variables (m = n) and the constraints were
independent, we could potentially solve for x from the constraints alone and there would be
nothing left to optimize. Thus a well-defined problem will typically have m < n (fewer
constraints than choice variables).

(a) The Lagrange multipliers method


The method for solving problems of the above type is called the Lagrange Multipliers Method
(LMM). What it does is convert the NLP into a related problem (call it the LMM problem)
with a different objective function and no constraints, so that we can then use the usual
unconstrained optimization techniques.

What is the price we have to pay for this simplification? During the conversion to the LMM
we end up with m more variables to optimize over. We next verify what the connection is
between the solutions to the LMM and the original NLP and, most importantly, what conditions
are needed for the solutions to the LMM to be solutions to our NLP with equality constraints.

Let us describe how the Lagrange method works. First we form the new objective function,
called the Lagrangean:

    L(x, λ) ≡ f(x) + λᵀ(c - h(x))

Notice that we added m new variables, λj, j = 1, ..., m – one for each constraint. These are
called Lagrange multipliers. Note that at any feasible point they multiply zeros (since
c - h(x) = 0), so the value of the objective does not change.

The LMM problem is:

    max_{x, λ} L(x, λ)                                        (LMM)

Suppose we have set all partial derivatives to zero and arrived at a candidate solution
(x*, λ*). We need to check whether it is indeed a maximum, i.e., a second-order condition
must be verified as well.

Let's now go through the above steps in more detail. First, write down the first-order
(necessary) conditions for a local extremum in the LMM problem:

    ∂L/∂xi = -λ1 ∂h1/∂xi - ... - λm ∂hm/∂xi + ∂f/∂xi = 0,   i = 1, ..., n
    ∂L/∂λj = cj - hj(x) = 0,                                 j = 1, ..., m

Note we have m + n equations in the same number of unknowns.
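
As a sketch of how this system of m + n equations can be solved in practice, consider the
illustrative problem max x1·x2 subject to x1 + x2 = 1 (this example is an assumption added
here, not from the notes), solved numerically with scipy.optimize.fsolve.

    from scipy.optimize import fsolve

    # Lagrangean: L(x, lam) = x1*x2 + lam*(1 - x1 - x2)
    def foc(z):
        x1, x2, lam = z
        return [x2 - lam,              # dL/dx1 = 0
                x1 - lam,              # dL/dx2 = 0
                1.0 - x1 - x2]         # dL/dlam = 0 (the constraint)

    x1, x2, lam = fsolve(foc, x0=[0.3, 0.3, 0.3])
    print(x1, x2, lam)                 # approximately 0.5, 0.5, 0.5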

(b) Second-order conditions of LMM and the bordered Hessian


Suppose the above system of first-order conditions has a solution (x*, λ*). We need to check
whether it is indeed a maximizer. The standard way in unconstrained problems was to see
whether the Hessian is n.s.d. Here, we form the so-called bordered Hessian, defined as:
    Ĥ(x*, λ*)  (an (m+n) × (m+n) matrix)  ≡

        [    0         ...      0         ∂h1/∂x1       ...    ∂h1/∂xn    ]
        [   ...        ...     ...           ...        ...       ...     ]
        [    0         ...      0         ∂hm/∂x1       ...    ∂hm/∂xn    ]
        [ ∂h1/∂x1      ...   ∂hm/∂x1     ∂²L/∂x1²       ...   ∂²L/∂x1∂xn  ]
        [   ...        ...     ...           ...        ...       ...     ]
        [ ∂h1/∂xn      ...   ∂hm/∂xn    ∂²L/∂xn∂x1      ...    ∂²L/∂xn²   ]
where all derivatives are evaluated at (x*, λ*). This is nothing but our usual Hessian of the
Lagrangean (from the unconstrained optimization method), but notice that we have ordered the
matrix of second partials in a particular way – first taking all second partials with respect
to the λ's and then with respect to the x's. The bordered Hessian Ĥ can be written more
compactly in block form as:

    Ĥ = [ 0 (m×m)         Jh (m×n)       ]
        [ (Jh)ᵀ (n×m)     H(L(x)) (n×n)  ]

where Jh is the Jacobian of h(x) and H(L(x)) is the "Hessian" of L(x) (the matrix of second
partials of L taken only with respect to the xi).

Because of all the zeros, it turns out that we need only the last n - m leading principal
minors of Ĥ to determine whether it is n.s.d. Let |Ĥ_{m+l}| denote the leading principal
minor whose last diagonal element is ∂²L/∂x_l², so |Ĥ_{m+1}| ends with ∂²L/∂x1², |Ĥ_{m+2}|
ends with ∂²L/∂x2², etc. Then we have the following result:

Proposition:

If sign|Ĥ_{m+l}| = (-1)^l for l = m+1, ..., n (i.e., for the last n - m leading principal
minors, of orders 2m+1 through m+n), then the bordered Hessian is n.s.d. and the candidate
solution is a maximum of the LMM.
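
Continuing the illustrative problem from the sketch above (max x1·x2 subject to x1 + x2 = 1,
an assumption for illustration), with m = 1 and n = 2 the proposition requires inspecting
only the full determinant |Ĥ_3|, whose sign should be (-1)² > 0.

    import numpy as np

    # Bordered Hessian at (x1, x2, lam) = (0.5, 0.5, 0.5) for L = x1*x2 + lam*(1 - x1 - x2),
    # ordered with the lambda block first, as in the definition above.
    H_bordered = np.array([[0.0, 1.0, 1.0],     # [ 0,        dh/dx1,  dh/dx2 ]
                           [1.0, 0.0, 1.0],     # [ dh/dx1,   L_11,    L_12   ]
                           [1.0, 1.0, 0.0]])    # [ dh/dx2,   L_21,    L_22   ]

    # The last n - m = 1 leading principal minor is det(H_bordered); sign (-1)^2 = +1 for a max
    print(np.linalg.det(H_bordered))            # 2.0 > 0, so (0.5, 0.5) is a constrained max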
Example: (A Consumer's problem)
A consumer has income y and wants to choose the quantities of n goods, q1, ..., qn, to buy in
order to maximize her strictly concave utility U(q1, ..., qn), taking the prices of the goods
p1, ..., pn as given. Her problem can be written as (using vector notation):

    max_q U(q)
    s.t.  pᵀq = y

Set up the Lagrangean:

    L(q, λ) = U(q) + λ(y - pᵀq)

The FOCs are:

    y - pᵀq = 0
    ∂U/∂qi - λ pi = 0,   i = 1, ..., n

which can be solved for (q*, λ*). Check as an exercise that the SOC (the fact that the
bordered Hessian is n.s.d.) is satisfied (Hint: use the concavity of U).
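
A minimal numerical sketch of the consumer's problem under an assumed Cobb-Douglas utility
U(q1, q2) = 0.6·log q1 + 0.4·log q2 (the functional form, prices, and income below are
illustrative assumptions, not part of the notes): it solves the FOCs above and compares the
result with the well-known closed-form demands qi = αi·y/pi.

    import numpy as np
    from scipy.optimize import fsolve

    alpha = np.array([0.6, 0.4])          # assumed Cobb-Douglas weights
    p = np.array([2.0, 1.0])              # assumed prices
    y = 10.0                              # assumed income

    # FOCs: dU/dqi - lam*pi = alpha_i/q_i - lam*p_i = 0, plus the budget y - p'q = 0
    def foc(z):
        q1, q2, lam = z
        return [alpha[0] / q1 - lam * p[0],
                alpha[1] / q2 - lam * p[1],
                y - p[0] * q1 - p[1] * q2]

    q1, q2, lam = fsolve(foc, x0=[1.0, 1.0, 1.0])
    print(q1, q2)                         # approximately 3.0 and 4.0
    print(alpha * y / p)                  # closed-form demands: [3.0, 4.0]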

(c) The constraint qualification


Notice that the above proposition does not say anything about whether an x* obtained as a
solution to the LMM will solve the original NLP problem. In general this need not be the
case, since the FOCs of the Lagrangean may be neither necessary nor sufficient for a maximum,
and thus additional conditions are needed. One condition ensuring that the solutions of the
NLP appear among the solutions of the LMM is the so-called constraint qualification (CQ):
    J(h(x*)) = [ ∂h1(x*)/∂x1   ...   ∂h1(x*)/∂xn ]
               [     ...       ...       ...     ]     has rank m
               [ ∂hm(x*)/∂x1   ...   ∂hm(x*)/∂xn ]
If there is only one constraint, the CQ is equivalent to the gradient of h not being a vector
of zeros at x*. The CQ is only a necessary condition, so if the Jacobian is singular
(rank-deficient) at some x̂, we should still treat x̂ as a candidate maximum and check
separately whether it solves the NLP.
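
A minimal numerical check of the CQ at an assumed candidate point (both the constraint
h(x) = x1² + x2² = 1 and the point used are illustrative assumptions): numpy's matrix_rank
is applied to the constraint Jacobian.

    import numpy as np

    # Single constraint h(x) = x1^2 + x2^2 = 1, so m = 1; candidate point x = (1, 0) (assumed)
    def jacobian_h(x):
        return np.array([[2 * x[0], 2 * x[1]]])    # 1 x 2 Jacobian of h

    J = jacobian_h(np.array([1.0, 0.0]))
    print(np.linalg.matrix_rank(J) == 1)           # True: the CQ holds at this point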

Theorem 26 (Equality constraints)

Consider the problem:

    max_{x ∈ ℝⁿ} f(x)
    s.t.  h(x) = c ∈ ℝᵐ

Let C = {x ∈ ℝⁿ : h1(x) = c1, ..., hm(x) = cm}, i.e., the set of feasible points.

Let x* be a local maximum of f on C and suppose it satisfies the constraint qualification
rank(J(h(x*))) = m. Then ∃ λ* ∈ ℝᵐ such that (x*, λ*) is a critical point of the
Lagrangean L(x, λ) = f(x) + λᵀ(c - h(x)).

The above theorem implies that if the constraint qualification holds, we can use the FOCs
of the LMM to find candidate maxima and then verify which of them solves the original NLP
problem.
