Mcnotes 41
1 Unconstrained optimization
In this section we address the problem of maximizing (or minimizing) a function when there are no constraints on its arguments. This case is not especially interesting for economics, which typically deals with problems in which resources are constrained, but it is a natural starting point for solving the more economically relevant constrained optimization problems.
1.2 The multivariate case
Now consider more general functions of the type f : U ⊆ R^n → R (multivariate). If x^0 is an interior local extremum of f, then

∂f(x^0)/∂x_i = 0,   i = 1,...,n
The above theorem states that at an interior local extremum all first partial derivatives must be equal to zero, i.e., we can solve the system of n equations defined by the condition above and look for interior extrema only among its solutions. Note also that the above can be written equivalently as

∇f(x^0) = 0_{n×1}

i.e., at an interior local extremum the gradient of f is zero. Remembering that the gradient is a vector pointing in the direction in which the function changes fastest, we see that the above condition implies that at the extremum there is no such best direction, i.e., moving in any direction cannot lead us to a higher functional value (if we are talking about a maximum).
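As a quick illustration of using this condition in practice, here is a small sketch of mine (not part of the notes; the function f(x1, x2) = x1^3 - 3x1 + x2^2 is an arbitrary example) that solves the system ∇f = 0 symbolically in Python:

import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
f = x1**3 - 3*x1 + x2**2                         # arbitrary example function (not from the notes)

gradient = [sp.diff(f, v) for v in (x1, x2)]     # the n first partial derivatives
critical_points = sp.solve(gradient, (x1, x2), dict=True)
print(critical_points)                           # two critical points: (1, 0) and (-1, 0)

Whether each solution is a maximum, a minimum, or neither is then decided by the second-order conditions discussed next.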
The k-th order leading principal submatrix (LPS) of A is formed by deleting the last n - k columns and rows of A. Its determinant is called the leading principal minor (LPM).
(a) The symmetric matrix A is positive definite (p.d.) if and only if all its n LPMs are positive.
(b) The symmetric matrix A is negative definite (n.d.) if and only if all its LPMs are nonzero and alternate in sign starting with a negative one, that is, |A_1| < 0, |A_2| > 0, etc.
(a) The matrix A is positive semi-definite (p.s.d.) if and only if all its principal minors are non-negative.
(b) The matrix A is negative semi-definite (n.s.d.) if and only if all its odd-order principal minors are non-positive and all its even-order principal minors are non-negative.
Note: the above must be true for all principal minors, not just the leading ones.
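These minor tests are easy to automate. The sketch below is my own illustration (not part of the notes): it classifies a symmetric matrix using its leading principal minors; the matrix A used at the end is the Hessian from Example 1 below.

import numpy as np

def leading_principal_minors(A):
    # determinants of the k-th order leading principal submatrices, k = 1,...,n
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

def classify_definiteness(A):
    minors = leading_principal_minors(A)
    if all(m > 0 for m in minors):
        return "positive definite"
    # negative definite: LPMs nonzero and alternating in sign, starting with |A_1| < 0
    if all((-1) ** (k + 1) * m < 0 for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "neither p.d. nor n.d. (for semi-definiteness, check all principal minors)"

A = np.array([[-4.0, -1.0], [-1.0, -4.0]])       # Hessian from Example 1 below
print(classify_definiteness(A))                  # negative definite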
Finally, we are ready to state the sufficient conditions for local extrema: if ∇f(x*) = 0 and the Hessian of f at x* is positive definite (negative definite), then x* is a strict local minimum (maximum). If the Hessian is only p.s.d. (n.s.d.) the extremum may not be strict.
The following examples illustrate how the theory from above is applied.
Example 1 (A firm's problem)
Suppose we have a firm producing two goods in quantities q1 and q2, sold at prices p1 and p2. Let the cost of producing q1 units of good 1 and q2 units of good 2 be C(q1, q2) = 2q1^2 + q1 q2 + 2q2^2. The firm maximizes profit, i.e., it solves:

max_q  π(q) = p^T q - q^T A q

where p = (p1, p2)^T, q = (q1, q2)^T and

A = ( 2    0.5 )
    ( 0.5  2   )
How do we solve for the optimal quantities the firm will choose? First take the partial derivatives of π with respect to q1 and q2 and set them to zero, to find

q_i = (4 p_i - p_j) / 15,   i, j = 1, 2,  i ≠ j.

We also need to verify that this is indeed a maximum. The Hessian of the objective is

H = ( -4  -1 )
    ( -1  -4 )

Let's check whether the leading principal minors alternate in sign: we have |H_1| = det[-4] = -4 < 0 and |H_2| = det(H) = 15 > 0, i.e., the candidate solution is indeed a maximum.
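For a quick numerical cross-check of this example, here is a sketch of mine (the prices p1 = 10, p2 = 4 are arbitrary, not from the notes):

import numpy as np

p = np.array([10.0, 4.0])                        # illustrative prices (my assumption)
A = np.array([[2.0, 0.5], [0.5, 2.0]])

# FOC: p - (A + A^T) q = 0, since the profit is p^T q - q^T A q
q_star = np.linalg.solve(A + A.T, p)
q_formula = np.array([(4 * p[0] - p[1]) / 15, (4 * p[1] - p[0]) / 15])
print(q_star, q_formula)                         # both give [2.4, 0.4]

H = -(A + A.T)                                   # Hessian of the profit function
print(np.linalg.det(H[:1, :1]), np.linalg.det(H))    # -4.0 and 15.0: alternating signs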
Example 2 (OLS)
Think of some variable y which depends on x_1, x_2, ..., x_k and assume we have a dataset of n observations (i.e., n vectors X_i = (x_{1i}, x_{2i}, ..., x_{ki}), i = 1,...,n). Assume that x_1 is a vector of ones. We are looking for the "best fit" between a linear function of the observations, Xβ, and our dependent variable y. (Note that X is n × k and β is a k × 1 vector of coefficients.) Thus we can write:

y_i = β_1 x_{1i} + β_2 x_{2i} + ... + β_k x_{ki} + ε_i,   i = 1,...,n

where the ε_i are the 'residuals' (errors) between the fitted line Xβ and y. The above can be written more compactly in matrix form as:

y = Xβ + ε
Remember, we want to find the best fit, i.e., the coefficients β which minimize the ε's in some sense. One possible criterion (used by the OLS method) is to choose β to minimize Σ_{i=1}^{n} ε_i^2, i.e., we want to solve the problem:

min_β  S(β) = Σ_i (y_i - β_1 x_{1i} - ... - β_k x_{ki})^2 = (y - Xβ)^T (y - Xβ) =
            = y^T y - β^T X^T y - y^T Xβ + β^T X^T Xβ
The first-order condition for the above minimization problem is (using the matrix differentiation rules):

∂S(β)/∂β = -2 X^T y + 2 X^T Xβ = 0
from which we find β = (X^T X)^{-1} X^T y, a candidate minimum. Is β indeed a minimum of S(β)? We need to check whether the Hessian, H(β) = ∂^2 S(β)/∂β^2 = 2 X^T X, a k-by-k matrix, is p.s.d. This is true in this case (think of all the squared terms). (Exercise: prove that the Hessian is p.s.d. using one of the given definitions.)
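To make the formula concrete, here is a small sketch of mine that computes β = (X^T X)^{-1} X^T y on simulated data (the data-generating process below is my own arbitrary illustration) and confirms that the Hessian 2 X^T X is p.s.d.:

import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])   # first column of ones
beta_true = np.array([1.0, 2.0, -0.5])                           # illustrative coefficients
y = X @ beta_true + rng.normal(scale=0.1, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)     # (X'X)^{-1} X'y without forming the inverse
print(beta_hat)                                  # close to beta_true

H = 2 * X.T @ X                                  # Hessian of S(beta)
print(np.all(np.linalg.eigvalsh(H) >= 0))        # True: H is positive semi-definite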
1.3 Constrained optimization
1.3.1 Introduction
In this section we look at problems of the following general form:
max_{x ∈ R^n}  f(x)            (NLP)
s.t.  g(x) ≤ b
      h(x) = c
We call the above problem a Non-Linear Optimization Problem (NLP). In it, f(x) is called the objective function, g(x) ≤ b are the inequality constraints, and h(x) = c are the equality constraints. Note that any optimization problem can be written in the above canonical form. For example, if we want to minimize a function h(x), we can do this by maximizing -h(x).
It turns out that it is easier not to solve (NLP) directly, but instead to solve another, related problem (Lagrange's or Kuhn-Tucker's) for x* and then verify that x* solves the original NLP as well. We will also be interested in whether we obtain all solutions to the NLP in this way, i.e., whether it is true that if x* solves the NLP it also solves the related problem. Thus we would like to know when Lagrange's or Kuhn-Tucker's methods are both necessary and sufficient for obtaining solutions to the original NLP.
Notice that we added m new variables, λ_j, j = 1,...,m, one for each constraint. These are called Lagrange multipliers. Note that they multiply terms which are zero when the constraints hold, so in fact the functional value of the objective does not change. The LMM problem is:

max_{x, λ}  L(x, λ) = f(x) + Σ_{j=1}^{m} λ_j (c_j - h_j(x))
Suppose we have set all partial derivatives to zero and arrived at a candidate solution (x*, λ*). We need to check whether it is indeed a maximum, i.e., a second-order condition must be verified as well.
Let's now go through the above steps in more detail. First, write down the first-order (necessary) conditions for a local extremum in the LMM problem:

∂L/∂x_i = -λ_1 ∂h_1/∂x_i - ... - λ_m ∂h_m/∂x_i + ∂f/∂x_i = 0,   i = 1,...,n
∂L/∂λ_j = c_j - h_j(x) = 0,   j = 1,...,m
Note we have m + n equations in the same number of unknowns.
Because of all the zeros, it turns out we need only the last n - m leading principal minors of the bordered Hessian Ĥ (the Hessian of L with respect to (λ, x), which has an m × m block of zeros in its upper-left corner) to determine whether it is n.s.d. Let |Ĥ_{m+1}| be the LPM with last element ∂^2 L/∂x_1^2, |Ĥ_{m+2}| be the LPM with last element ∂^2 L/∂x_2^2, etc. Then we have the following result:

Proposition:
If sign|Ĥ_{m+l}| = (-1)^l, l = 1,...,n - m, then the bordered Hessian is n.s.d. and the candidate solution is a maximum of the LMM.
Example: (A Consumer's problem)
A consumer has income y and wants to choose the quantities of n goods, q_1,...,q_n, to buy in order to maximize her strictly concave utility U(q_1,...,q_n), taking as given the prices of the goods p_1,...,p_n. Her problem can be written as (using vector notation):

max_q  U(q)
s.t.  p^T q = y
Set up the Lagrangean:
L(q, λ) = U(q) + λ(y - p^T q)

The FOCs are:

∂L/∂λ = y - p^T q = 0
∂L/∂q_i = ∂U/∂q_i - λ p_i = 0,   i = 1,...,n

which can be solved for (q*, λ*). As an exercise, check that the SOC (the fact that the bordered Hessian is n.s.d.) is satisfied (Hint: use the concavity of U).
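As a concrete instance, here is a sketch of mine (the log utility U(q) = a1 log q1 + a2 log q2 is an illustrative choice, not from the notes) that solves the FOC system symbolically:

import sympy as sp

q1, q2, lam = sp.symbols('q1 q2 lambda', positive=True)
p1, p2, y, a1, a2 = sp.symbols('p1 p2 y a1 a2', positive=True)

U = a1 * sp.log(q1) + a2 * sp.log(q2)            # illustrative utility (my assumption)
L = U + lam * (y - p1 * q1 - p2 * q2)            # the Lagrangean

focs = [sp.diff(L, v) for v in (q1, q2, lam)]    # n + m = 3 equations in 3 unknowns
solution = sp.solve(focs, (q1, q2, lam), dict=True)[0]
print(sp.simplify(solution[q1]))                 # a1*y/(p1*(a1 + a2))
print(sp.simplify(solution[q2]))                 # a2*y/(p2*(a1 + a2))

With this utility the consumer spends the fixed budget shares a1/(a1 + a2) and a2/(a1 + a2) on the two goods, which is easy to verify against the printed expressions.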
Theorem 26 (Equality constraints)
Consider the problem

max_{x ∈ R^n}  f(x)
s.t.  h(x) = c ∈ R^m

If x* is a local maximum of f subject to h(x) = c and the constraint qualification holds (the Jacobian of h at x*, an m × n matrix, has rank m), then there exists λ* ∈ R^m such that (x*, λ*) satisfies the first-order conditions of the LMM.
The above theorem implies that if the constraint qualification holds, we can use the FOCs of the LMM to find candidate maxima and then verify which of them solves the original NLP problem.