Lecture 4
Unconstrained & Constrained Optimization
Teng Wah Leo
1 Unconstrained Optimization
We will now deal with the simplest of optimization problems, those without constraints, or what we refer to as unconstrained optimization problems. By definition, for a function F(.) in multiple variables, x = [x1, x2, ..., xn] ∈ U ⊂ Rn, such that F : U → R1, we say that

1. x* ∈ U is a max of F(.) if F(x*) ≥ F(x) ∀x ∈ U

2. x* ∈ U is a strict max of F(.) if F(x*) > F(x) ∀x ≠ x* ∈ U

3. x* ∈ U is a local max of F(.) if there is a "ball" B(x*) around x* such that F(x*) ≥ F(x) ∀x ∈ B(x*) ∩ U

4. x* ∈ U is a strict local max of F(.) if there is a "ball" B(x*) around x* such that F(x*) > F(x) ∀x ≠ x* ∈ B(x*) ∩ U

The latter two points regarding local and strict local max essentially say that for "nearby" points, the function F(.) achieves its max at x*. If the maximum is achieved on the entire domain U of the function F(.), then we say the max is global/absolute at x*. These definitions simply need to be reversed for the case of minimization, or finding the minimum.
The next question is regarding the manner in which a maximum or minimum is characterized. Focusing on maxima, you would recall that for a single variable function, a maximum is achieved when the function attains a critical point, which is when the slope is zero, in other words, when the function peaks. In the multivariate case, the criterion is similar, with the sole difference being that the maximum is achieved in the interior of the domain, an interior solution. To be precise, a maximum is attained when the Jacobian vector is equal to zero,

∂F(x*)/∂xi = 0, ∀i ∈ {1, 2, ..., n}

and this solution is an interior solution if the "ball" B(x*) is in the domain of F(.). All this of course assumes the continuity and differentiability of the function. This criterion remains true for a minimization problem.
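For concreteness, here is a minimal sketch in Python (sympy assumed available; the function F is an illustrative choice, not one from these notes) that finds a critical point by solving the first order conditions:

    # Locate the critical point of F(x, y) = -(x - 1)^2 - 2(y + 2)^2 by
    # solving dF/dx = dF/dy = 0 (the Jacobian vector set to zero).
    import sympy as sp

    x, y = sp.symbols('x y')
    F = -(x - 1)**2 - 2*(y + 2)**2

    grad = [sp.diff(F, v) for v in (x, y)]        # the Jacobian vector of F
    critical = sp.solve(grad, (x, y), dict=True)  # interior critical point(s)
    print(critical)                               # [{x: 1, y: -2}]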
So what distinguishes a maximization from a minimization problem? It is for this reason we developed the Hessian matrix, and the ideas of negative and positive (semi)definiteness. Recall that in the univariate case, a critical point is a maximum when the function is concave around it, and a minimum when the function is convex around it. These are important distinguishing features, without knowledge of which you might instead be finding a maximum for a minimization problem and vice versa. So the onus is always on you to verify whether you are setting up and solving the "correct problem". To reiterate the conditions for maximization and minimization in the case of multivariate functions, let F : U → R1 be continuous and twice differentiable, and x* be the critical value of the solution, then

1. if the Hessian matrix, ∇²F(x*), is negative definite, then x* is a strict local max of the function F(.);

2. if the Hessian matrix, ∇²F(x*), is positive definite, then x* is a strict local min of the function F(.);

3. if the Hessian matrix, ∇²F(x*), is indefinite, then x* is neither a (strict) local max nor min of the function F(.).

The strict conditions are referred to as the sufficient conditions for maximization or minimization as the case may be, while the weaker conditions (semi-definiteness, with inequalities ≤ and ≥) are referred to as necessary conditions. If the function is concave, then your solution x* is a global max of F(.), and if the function is convex, your solution would be a global min of F(.).
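As a quick numerical illustration of these conditions (numpy assumed available; the Hessian below belongs to the illustrative function used earlier, evaluated at its critical point (1, -2)):

    # Classify a critical point by the definiteness of the Hessian: all
    # eigenvalues negative => strict local max; all positive => strict local
    # min; mixed signs => indefinite, neither a max nor a min.
    import numpy as np

    H = np.array([[-2.0,  0.0],
                  [ 0.0, -4.0]])
    eig = np.linalg.eigvalsh(H)
    if np.all(eig < 0):
        print("negative definite: strict local max", eig)
    elif np.all(eig > 0):
        print("positive definite: strict local min", eig)
    else:
        print("indefinite: neither max nor min", eig)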
To consolidate what we have learned, we will go through several examples here, one
from microeconomics, one from macroeconomics, and another from econometrics.
Example 1 Consider an individual maximizing his utility subject to his budget constraint. For simplicity, let this individual choose between quantities of two types of goods, 1 and 2, with respective quantities x1 and x2. Let his utility function, which describes the level of his felicity, be U(.). The individual's income is y, and the prices of goods 1 and 2 are p1 and p2 respectively. Then the constrained maximization problem is,

max_{x1,x2} U(x1, x2)
subject to p1x1 + p2x2 = y
Although there is a constraint in this optimization problem, it is quite easy to change this into an unconstrained problem in terms of one good. With the solution in that single good, you can always find the solution for the other by substituting your solution back into the budget constraint. So the new unconstrained problem becomes,

max_{x1} U(x1, (y − p1x1)/p2)
which is now an unconstrained problem in terms of x1. The condition that describes the maximization occurs when the slope of the utility function is equal to zero, which occurs at the critical point or solution, x1* and x2*. Using the Chain Rule,

Ux1 + Ux2 (dx2/dx1) = 0
⇒ Ux1 − Ux2 (p1/p2) = 0
⇒ Ux1/Ux2 = p1/p2
which gives your standard equilibrium condition of the marginal rate of substitution equating with the price ratio (which is the slope of the budget constraint). How do you know if you have maximized or minimized? Again using the Chain Rule in concert with the Product Rule, the second derivative with respect to x1 is

Ux1x1 + 2Ux1x2 (dx2/dx1) + Ux2x2 (dx2/dx1)² + Ux2 (d²x2/dx1²)
To complete the analysis, we can now rationalize why we need the utility function to be concave in the quantities of the goods. Given the concavity assumption, Uxixi would be negative for i = {1, 2}; indeed, the first three terms above form the quadratic form of the Hessian of U in the direction (1, dx2/dx1), which concavity guarantees is non-positive. Further, since the second derivative of x2 with respect to x1 is 0 given the linearity of the budget constraint, the last term vanishes. The second derivative of the utility function is therefore negative, consequently ensuring we are indeed maximizing.
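To see the substitution method at work numerically, here is a minimal sketch (scipy assumed available) with an assumed Cobb-Douglas utility U(x1, x2) = x1^0.5 x2^0.5 and illustrative values y = 100, p1 = 2, p2 = 4, none of which come from the notes:

    # Substitute the budget constraint and maximize the resulting
    # unconstrained function of x1 alone (minimize the negative of it).
    from scipy.optimize import minimize_scalar

    y, p1, p2 = 100.0, 2.0, 4.0

    def neg_utility(x1):
        x2 = (y - p1 * x1) / p2          # x2 from the budget constraint
        return -(x1**0.5 * x2**0.5)

    res = minimize_scalar(neg_utility, bounds=(1e-9, y/p1 - 1e-9), method='bounded')
    x1 = res.x
    x2 = (y - p1 * x1) / p2
    print(x1, x2)   # ~25.0 and ~12.5

At the solution the marginal rate of substitution, x2/x1 = 0.5, indeed equals the price ratio p1/p2.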
Example 2 Consider the firm's profit maximization problem, where the typical firm faces a revenue function R(.) and a cost function C(.). The choice is to maximize profit with respect to a vector of n inputs x. In other words,

max_x Π(x) = R(x) − C(x)

Then based on what we understand about maximization, at the critical point for all xi, xi*, where xi is the typical element of x and xi* is the typical element of x*, the following must be true,

∂Π(x*)/∂xi = ∂R(x*)/∂xi − ∂C(x*)/∂xi = 0
which is just your marginal revenue equating with the marginal cost condition. This sort
of condition at equilibrium is what we refer to as a first order condition.
Let's give the problem a little more structure, so that we can get a more recognizable equilibrium condition. Let the production function be F(.), a function of the same n inputs. Let the price of the single good produced by these inputs be p, so that F : Rn → R1. Let the cost function be linear, so that C(x) = w·x, where w is the vector of n input prices. The first order condition is now,

p ∂F(x*)/∂xi − wi = 0
⇒ p ∂F(x*)/∂xi = wi

In other words, we get our standard equilibrium condition that the firm should engage an input up to the extent where its value marginal product, p ∂F(x*)/∂xi, equates with the price of the input.
To ascertain whether the firm has maximized its profit, we have to check the Hessian matrix, which in the current example requires giving more structure to the profit function, or more precisely the production function.
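With an assumed concrete technology (a sketch, not an example from the notes): F(x) = x^0.5 with a single input, output price p = 10 and input price w = 2, the first order condition p F'(x) = w pins down the input demand, and the second derivative confirms a maximum:

    # Solve the firm's FOC symbolically and confirm concavity (sympy assumed).
    import sympy as sp

    x = sp.symbols('x', positive=True)
    p, w = 10, 2
    profit = p * sp.sqrt(x) - w * x
    foc = sp.Eq(sp.diff(profit, x), 0)     # p * F'(x) = w
    x_star = sp.solve(foc, x)[0]
    # Second derivative negative => the profit function is concave in x.
    print(x_star, sp.diff(profit, x, 2).subs(x, x_star) < 0)   # 25/4, True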
Example 3 Another useful example is the ordinary least squares regression. We will use the full spectrum of results we have learned regarding optimization to see its relevance. In performing a regression, we are in effect trying to fit a line that best traverses the location of observations in the variables we are interested in using to explain a phenomenon. Of course we cannot possibly explain every change, which then requires an idiosyncratic term we call the error, ε. This variable is what allows us to use statistics to find out how markets, individuals, firms, etc. behave on average. Without going into excessive detail, the objective in the ordinary least squares procedure is to minimize the sum of these squared errors.
Consider a simple model,

yi = β0 + β1 xi + εi

where the subscript i denotes the ith observation. The objective is to minimize the sum of squared errors with respect to the coefficients/parameters β0 and β1. The second parameter tells us the effect that the variable x has on y, while the first is just the intercept of this equation of a straight line.
min_{β0,β1} Σ εi² = min_{β0,β1} Σ (yi − β0 − β1xi)²

where the sums run over i = 1, ..., n. Differentiating with respect to each parameter in turn, the first order conditions are

−2 Σ (yi − β0 − β1xi) = 0
−2 Σ xi(yi − β0 − β1xi) = 0

Dividing the first condition by n and writing ȳ and x̄ for the sample means, it becomes

ȳ − β0 − β1x̄ = 0
Combining the two conditions by substituting away β0 first gives us,

Σ yixi − (ȳ − β1x̄) Σ xi − β1 Σ xi² = 0

⇒ β1 = (Σ yixi − ȳ Σ xi) / (Σ xi² − x̄ Σ xi)

⇒ β1 = (n Σ yixi − Σ yi Σ xi) / (n Σ xi² − (Σ xi)²)
We can now substitute this back into the other equation to obtain,

β0 = (1/n) (Σ yi − β1 Σ xi)
   = (Σ xi² Σ yi − Σ xi Σ xiyi) / (n Σ xi² − (Σ xi)²)

To verify that we have indeed minimized, check the matrix of second derivatives of the objective with respect to (β0, β1),

[ 2n       2Σ xi  ]
[ 2Σ xi    2Σ xi² ]

and notice that both the first and second order leading principal minors are positive: the first is 2n, and the second is 4(n Σ xi² − (Σ xi)²), which is positive so long as the xi are not all identical. The Hessian is therefore positive definite, and we have found a minimum.
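The closed-form estimators can be checked against a library routine. A minimal sketch (numpy assumed available, with made-up data purely for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, 50)
    y = 2.0 + 3.0 * x + rng.normal(0, 1, 50)   # true intercept 2, slope 3

    n = len(x)
    denom = n * np.sum(x**2) - np.sum(x)**2
    b1 = (n * np.sum(x * y) - np.sum(y) * np.sum(x)) / denom
    b0 = (np.sum(x**2) * np.sum(y) - np.sum(x) * np.sum(x * y)) / denom
    print(b0, b1)               # close to (2, 3)
    print(np.polyfit(x, y, 1))  # [slope, intercept]: the same estimates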
2 Constrained Optimization

In constrained optimization problems, we maximize or minimize our objective function subject to constraints that may be an equality (which implies it will be binding, something we will discuss in more detail shortly), or an inequality (which implies it can be binding or not). We deal with the equality constraints first. A typical constrained optimization problem is of the form,

max_x F(x)
subject to h1(x) = b1, ..., hq(x) = bq
           g1(x) ≤ a1, ..., gm(x) ≤ am

where x is a vector of n elements as before, and F(.) is our objective function such that F : Rn → R1. The h(.) functions are written with equality signs, and they are the equality constraints, where the bi are just constants. The g(.) functions written with inequality signs are the inequality constraints. It is on the former type of constraints that we focus first.
When the number of variables is small, it may be easier to simply substitute away one of the variables, as in the previous section, and change the question into an unconstrained problem (when you have an equality constraint; otherwise, you'd still have to "experiment" to check whether the constraints are binding). Generally, however, we need an alternative technique. That would be the Lagrangian Method. Consider now a constrained optimization problem with equality constraints,

max_x F(x)
subject to h1(x) = b1, ..., hq(x) = bq

The method requires us to first form the Lagrangian Function, and optimize it (maximize or minimize as the case might be),

L(x, Λ) = F(x) − λ1(h1(x) − b1) − · · · − λq(hq(x) − bq)
The optimization is with respect to all the n variables as well as the Lagrange multipliers, λi for i = {1, ..., q}, where Λ is the vector of Lagrange multipliers. It should be noted that the first derivatives of the constraints cannot all be equal to zero at the solution, failing which the technique will not work. As you would notice from this caveat, the non-degenerate constraint qualification, this happens only if the constraint is not dependent on the variables you are maximizing over, which means that for all intents and purposes it is generally not an issue in economics. In economics, the multipliers are typically referred to as the shadow prices. That is a rather neat term in the sense that it highlights the cost to the optimization problem of the constraint. Put another way, it tells you how important the constraint is to your optimization problem. In the way we have structured the Lagrangian function, the Lagrange multipliers will always be non-negative, so that the higher the value of a multiplier, the more important the constraint. Take for instance the utility maximization problem. The equality constraint we had imposed ensures that the choice will be at the point where the indifference curve is just tangent to the budget plane. Without the constraint, an individual would, due to non-satiation, always choose to have more, ensuring that we have no interior solution, as long as a good is a "good".
The first order conditions at the critical points or solution are thus,

∂L(x*, Λ*)/∂x1 = 0, ..., ∂L(x*, Λ*)/∂xn = 0
∂L(x*, Λ*)/∂λ1 = 0, ..., ∂L(x*, Λ*)/∂λq = 0
To check the second order conditions, we form what is known as the bordered Hessian. In other words, you set up the matrix of second derivatives of the objective function, and border the top and left with the cross partials of the Lagrangian between the variables and the respective constraints. The reason you see the border as a first derivative with respect to the constraint is because

∂²L(x*)/∂xi∂λj = −∂hj(x*)/∂xi

whereas the first derivative of the Lagrangian with respect to the multiplier alone would have given you the respective constraint hj(x). The matrix can also be written as follows,

∇²x,Λ L = [ 0         −∇h(x) ]
          [ −∇h(x)ᵀ   ∇²x L  ]
The conditions to verify the concavity or convexity of the Lagrangian function involve simply investigating the last n − q of the n + q leading principal minors. Another way to think about this is that you need to check from the largest leading principal minor all the way to the second smallest one that includes elements from the submatrix ∇²x L. This is because the smallest leading principal minor including an element from ∇²x L will always be negative. Further note that, unlike the standard Hessian, for positive (semi-)definiteness the sign of these determinants is now uniformly negative, while for negative (semi-)definiteness they still alternate in sign in the same way.
Let’s consolidate using several examples.
Example 4 Let's consider the utility maximization problem we had earlier, but this time, we will solve it using the Lagrangian Method. The Lagrangian function is,

L(x1, x2, λ) = U(x1, x2) − λ(p1x1 + p2x2 − y)

with first order conditions,

∂L/∂x1 = Ux1 − λp1 = 0
∂L/∂x2 = Ux2 − λp2 = 0
∂L/∂λ = −(p1x1 + p2x2 − y) = 0

Dividing the first condition by the second yields

Ux1/Ux2 = p1/p2

which is exactly what we obtained previously using the substitution method. However, you would notice that as the number of variables increases, it becomes easier to use the Lagrangian Method as opposed to substitution. Technically, given a functional form for the utility function, all you would need to do is to substitute the final first order condition (the budget constraint) into the equilibrium condition to solve for one of the variables, and with that answer, substitute it back into the final condition to get the other critical value.
To figure out the concavity of the Lagrangian function, note that the bordered Hessian is just,

[ 0      −p1      −p2    ]
[ −p1    Ux1x1    Ux1x2  ]
[ −p2    Ux2x1    Ux2x2  ]

and all we need to do is to calculate the single third order leading principal minor, which is just the determinant of the bordered Hessian above, which is
0(Ux1x1Ux2x2 − Ux1x2Ux2x1) + p1(−p1Ux2x2 + p2Ux1x2) − p2(−p1Ux2x1 + p2Ux1x1)
= p1(−p1Ux2x2 + p2Ux1x2) − p2(−p1Ux2x1 + p2Ux1x1)
= −p1²Ux2x2 + p1p2Ux1x2 + p2p1Ux2x1 − p2²Ux1x1 ≥ 0

The last inequality follows since Uxixi ≤ 0, while Uxixj ≥ 0 for i ≠ j. Therefore, since the determinant of the bordered Hessian is positive, the Lagrangian is negative semi-definite on the constraint set, and we have a local maximum. Notice that had we only focused on the submatrix without the border, ∇²x L, its sequence of leading principal minors would have to satisfy Ux1x1 ≤ 0 and Ux1x1Ux2x2 − Ux1x2² ≥ 0 for it to be negative semi-definite, which is similar to what we have found in the bordered Hessian case. However, you have to keep in mind that since we are dealing with a constrained optimization, the bordered Hessian is the correct matrix to examine. Note further that this is merely a sufficient condition for a local maximum.
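As a numerical illustration of the bordered Hessian test (numpy assumed; the prices and second derivatives below are assumed values satisfying Uxixi ≤ 0 and Uxixj ≥ 0, not quantities from the notes):

    import numpy as np

    p1, p2 = 2.0, 4.0
    U11, U12, U22 = -0.02, 0.01, -0.04    # assumed Ux1x1, Ux1x2, Ux2x2

    H = np.array([[0.0, -p1,  -p2],
                  [-p1, U11,  U12],
                  [-p2, U12,  U22]])
    # With n = 2 and q = 1 we inspect the single largest leading principal
    # minor, det(H); a positive determinant signals a local maximum.
    print(np.linalg.det(H))    # 0.64 > 0 here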
We now turn to inequality constraints. Recall that the general problem is,

max_x F(x)
subject to h1(x) = b1, ..., hq(x) = bq
           g1(x) ≤ a1, ..., gm(x) ≤ am

where x is a vector of n elements as before, and F(.) is our objective function such that F : Rn → R1. The h(.) functions are written with equality signs, and they are the equality constraints, where the bi are just constants. The g(.) functions written with inequality signs are the inequality constraints. It is on the latter type of constraints that we now focus. Let's focus just on the inequality constraints,

max_x F(x)
subject to g1(x) ≤ a1, ..., gm(x) ≤ am
The setup is as before in converting the constrained problem into an unconstrained one: form the Lagrangian

L(x, Λ) = F(x) − λ1(g1(x) − a1) − · · · − λm(gm(x) − am)

However, there are significant differences in the first order conditions, which are now,

∂L(x*, Λ*)/∂x1 = 0, ..., ∂L(x*, Λ*)/∂xn = 0
λ1*(g1(x*) − a1) = 0, ..., λm*(gm(x*) − am) = 0
λ1* ≥ 0, ..., λm* ≥ 0
g1(x*) ≤ a1, ..., gm(x*) ≤ am
The key difference is the last three sets of conditions. The second set of conditions is what is known as the complementary slackness conditions. Essentially, what each is saying is that one or both parts of the product, λi* or gi(x*) − ai for i ∈ {1, ..., m}, will equate with zero. This condition replaces the first order condition which was derived previously as the derivative of the Lagrangian function with respect to the Lagrange multiplier. The reason is that a priori we do not know if the constraint is binding or otherwise. We say a constraint is binding if it holds with equality; otherwise the constraint is not binding. When the Lagrange multiplier is positive, its corresponding constraint must be binding, so that the complementary slackness condition holds. On the other hand, when the constraint is not binding, the Lagrange multiplier must be equal to zero. The manner in which we determine whether a constraint is binding or otherwise is simply to check each possible configuration against these conditions. Further, note that should there be binding constraints, the Jacobian matrix of the binding constraints at the solution must have rank equal to the number of binding constraints (the constraint qualification).

To get a feel for what is encompassed in deriving the solution, consider this simple example, which is found in your text.
Example 5 Let us maximize f(x, y) = xy subject to x² + y² ≤ 1. The Lagrangian thus is,

L(x, y, λ) = xy − λ(x² + y² − 1)

and the first order conditions are,

∂L/∂x = y − 2λx = 0
∂L/∂y = x − 2λy = 0
λ(x² + y² − 1) = 0,  λ ≥ 0,  x² + y² ≤ 1

If λ = 0, the first two conditions give x = y = 0, yielding the candidate (0, 0, 0) with f(0, 0) = 0. If λ > 0, the constraint binds, and the first two conditions give y = 2λx and x = 2λy, so that x = 4λ²x, which for x ≠ 0 implies λ = 1/2 and hence y = x. With x² + y² = 1 this yields the candidates (1/√2, 1/√2, 1/2) and (−1/√2, −1/√2, 1/2), each with f = 1/2. Since these two triplets yield the highest Lagrangian function values, they are the solutions. In other words, (0, 0, 0) is not a solution. Note the last number in each triplet is just the Lagrange multiplier.
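A numerical cross-check of this example (scipy assumed available):

    # Maximize xy on the unit disk by minimizing -xy subject to
    # x^2 + y^2 <= 1 (scipy's 'ineq' convention is fun(v) >= 0).
    from scipy.optimize import minimize

    cons = ({'type': 'ineq', 'fun': lambda v: 1 - v[0]**2 - v[1]**2},)
    res = minimize(lambda v: -v[0] * v[1], x0=[0.5, 0.25], constraints=cons)
    print(res.x, -res.fun)   # ~[0.707, 0.707] and ~0.5, i.e. (1/sqrt(2), 1/sqrt(2))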
In the more general case where we have both inequality and equality constraints, what you would need to do is to merge all the first order conditions together. As before, the Jacobian matrix of the equality and binding inequality constraints evaluated at the solution must have a rank equal to the number of such constraints. To consolidate what we have learned, let's consider a final pair of examples.
Example 6 Consider maximizing f(x, y) = x − y² subject to the equality constraint x² + y² = 4 and the non-negativity constraints x ≥ 0 and y ≥ 0. Checking the constraint qualification first, note that the point (0, 0), at which both non-negativity constraints would bind, is not in the constraint set based on the equality constraint. If either of the non-negativity constraints is binding, it would yield (2, 0) or (0, 2) as a solution, which would yield a 2 × 2 Jacobian with a rank of 2, which is what we need. The Lagrangian is

L = x − y² − µ(x² + y² − 4) + λ1x + λ2y
The first order condition with respect to x is

1 − 2µx + λ1 = 0  ⇒  1 + λ1 = 2µx

and since λ1 ≥ 0, the left hand side must be greater than zero. Since x ≥ 0, this in turn implies that x > 0 and µ > 0, and complementary slackness then implies that λ1 = 0. From the first order condition with respect to y,

2y(1 + µ) = λ2

Since µ > 0, for this first order condition to hold, either both y and λ2 are zero, or both are non-zero. However, by the complementary slackness condition, both of them cannot be positive, which means that y = λ2 = 0. Since this is the case, by the equality constraint x = 2, and with λ1 = 0 the first condition in turn implies that µ = 1/4. The solution is then (x, y, µ, λ1, λ2) = (2, 0, 1/4, 0, 0).
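A quick numerical confirmation (scipy assumed available):

    # Maximize x - y^2 subject to x^2 + y^2 = 4, x >= 0, y >= 0,
    # by minimizing the negative of the objective.
    from scipy.optimize import minimize

    cons = ({'type': 'eq', 'fun': lambda v: v[0]**2 + v[1]**2 - 4},)
    res = minimize(lambda v: -(v[0] - v[1]**2), x0=[1.0, 1.0],
                   bounds=[(0, None), (0, None)], constraints=cons)
    print(res.x)   # ~[2, 0], matching (x*, y*) = (2, 0)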
Example 7 Consider next maximizing f(x, y) = x² + y² subject to 2x + y ≤ 2, x ≥ 0 and y ≥ 0, with the Lagrangian

L = x² + y² − λ(2x + y − 2) + µ1x + µ2y

Therefore the first order conditions are,

∂L/∂x = 2x − 2λ + µ1 = 0,  ∂L/∂y = 2y − λ + µ2 = 0
λ(2x + y − 2) = 0,  µ1x = 0,  µ2y = 0
2x + y ≤ 2,  x ≥ 0,  y ≥ 0
λ ≥ 0,  µ1 ≥ 0,  µ2 ≥ 0
Consider first the case where λ = 0, so that the inequality constraint 2x + y ≤ 2 is not binding. The first two conditions then give 2x = −µ1 and 2y = −µ2. In so far as both µi have to be greater than or equal to zero, the conditions hold only if x = y = µ1 = µ2 = 0. This means that f(x, y) = 0. Incidentally, this is the critical point of the objective function.
If the main constraint is binding, λ > 0, and 2x + y = 2. From the first two first order conditions,

2x + µ1 = 4y + 2µ2

Consider first x > 0 and y = 0, which implies that µ1 = 0 and µ2 ≥ 0. When this is true, we have x = 1, λ = 1, and µ2 = 1. When we have y > 0 and x = 0, y = 2 so that µ2 = 0. Therefore λ = 4, and µ1 = 8. Comparing the candidates, f(0, 0) = 0, f(1, 0) = 1 and f(0, 2) = 4, so the constrained maximum is attained at (0, 2).
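Since the complementary slackness conditions only produce candidates, the last step is the direct comparison just made, sketched below (the objective f = x² + y² is the one used in this example):

    # Evaluate the objective at each candidate and keep the largest.
    f = lambda x, y: x**2 + y**2
    candidates = [(0, 0), (1, 0), (0, 2)]
    for x, y in candidates:
        print((x, y), f(x, y))   # (0, 2) attains the largest value, 4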
Thus far the setup we have discussed is for maximization problems. The typical minimization problem with equality and inequality constraints is of the following form,

min_x F(x)
subject to h1(x) = b1, ..., hq(x) = bq
           g1(x) ≥ a1, ..., gm(x) ≥ am

and the Lagrangian function is formed as,

L(x, Γ, Λ) = F(x) − γ1(h1(x) − b1) − · · · − γq(hq(x) − bq) − λ1(g1(x) − a1) − · · · − λm(gm(x) − am)

The first order conditions and the rank condition on the Jacobian matrix are also the same as in the maximization case. The sole difference is in the structure of the original problem. Can you see the difference? (The inequality constraints now bound the g(.) functions from below.)
Another formulation you will often encounter involves inequality constraints in the optimization, and a set of non-negativity constraints on the variables. Specifically, the problem is,

max_x F(x)
subject to h1(x) ≤ b1, ..., hq(x) ≤ bq
           x1 ≥ 0, ..., xn ≥ 0
For this problem, we form the Lagrangian using only the first q constraints,

L̃(x, Λ) = F(x) − λ1(h1(x) − b1) − · · · − λq(hq(x) − bq)

In other words, the Lagrangian L̃ is missing the non-negativity constraints, relative to the usual setup. Due to this alternative setup, the first order conditions are now altered,

∂L̃/∂x1 ≤ 0, ..., ∂L̃/∂xn ≤ 0,  ∂L̃/∂λ1 ≥ 0, ..., ∂L̃/∂λq ≥ 0
x1(∂L̃/∂x1) = 0, ..., xn(∂L̃/∂xn) = 0,  λ1(∂L̃/∂λ1) = 0, ..., λq(∂L̃/∂λq) = 0
These first order conditions can be derived from our original setup by realizing the relationship between the two Lagrangian functions,

L = L̃ + ν1x1 + · · · + νnxn

This means that, based on the usual method, the first order conditions for the independent variables should be,

∂L/∂xi = ∂L̃/∂xi + νi = 0
⇒ ∂L̃/∂xi = −νi
Since the Lagrange multiplier νi for all i ∈ {1, ..., n} must be at least zero, this means that ∂L̃/∂xi ≤ 0. Further, by the standard complementary slackness condition that xiνi = 0, the new complementary slackness condition should now be xi(∂L̃/∂xi) = 0. For the multipliers on the constraints,

∂L/∂λi = ∂L̃/∂λi = bi − hi(x) ≥ 0

while λi ≥ 0, so that the new complementary slackness condition here is now λi(∂L̃/∂λi) = 0, since if the constraint is not binding, λi must be zero, while if the constraint is binding, ∂L̃/∂λi = bi − hi(x) = 0.
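In practice, non-negativity constraints are often handled directly as bounds. A minimal sketch (scipy assumed; the objective is an assumed illustration in which the non-negativity constraint on x1 binds):

    # Maximize F(x1, x2) = -(x1 + 1)^2 - (x2 - 1)^2 subject to x1, x2 >= 0.
    # The unconstrained peak (-1, 1) violates x1 >= 0, so x1* = 0 binds.
    from scipy.optimize import minimize

    F = lambda v: -((v[0] + 1)**2) - (v[1] - 1)**2
    res = minimize(lambda v: -F(v), x0=[1.0, 0.0], bounds=[(0, None), (0, None)])
    x1, x2 = res.x
    print(x1, x2)          # ~0.0, ~1.0
    # At the solution dF/dx1 = -2*(x1 + 1) = -2 < 0 while x1 = 0, so the
    # condition x1 * (dL/dx1) = 0 holds with a strictly negative derivative.
    print(-2 * (x1 + 1))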
3 The Envelope Theorem

The final tool we will discuss is the Envelope Theorem, which describes how the optimized value of the objective function changes with a parameter. Consider the unconstrained problem with a parameter a,

max_x F(x; a)

and let x*(a) be the solution. If x*(a) is a twice continuously differentiable function of a, then we have,

dF(x*(a)|a)/da = ∂F(x*(a)|a)/∂a
To see that this is the case, use the Chain Rule to totally differentiate the objective function,

dF(x*(a)|a)/da = Σ [∂F(x*(a)|a)/∂xi] (dxi/da) + ∂F(x*(a)|a)/∂a
              = ∂F(x*(a)|a)/∂a

where the sum runs over i = 1, ..., n, since ∂F(x*(a)|a)/∂xi = 0 is a first order condition for maximization of the objective function.
The primary gain to us regarding the Envelope Theorem is that it eliminates the necessity
of examining how the parameter affects the solution, and consequently the objective
function. Rather, we can directly examine the objective function with respect to the
parameter.
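A numerical sanity check of the theorem (pure Python; the objective F(x; a) = -(x - a)² + a is an assumed illustration with solution x*(a) = a):

    def F(x, a):
        return -(x - a)**2 + a

    a, h = 2.0, 1e-6
    x_star = lambda a: a                     # solves the FOC -2(x - a) = 0
    value = lambda a: F(x_star(a), a)        # the maximized value function

    total = (value(a + h) - value(a - h)) / (2 * h)                  # dF/da
    partial = (F(x_star(a), a + h) - F(x_star(a), a - h)) / (2 * h)  # partial only
    print(total, partial)                    # both ~1.0, as the theorem asserts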
The idea works as well for constrained problems, in the sense that it says we do not need to examine the objective function underlying the Lagrangian function; rather, we can examine the parameter's effect on the Lagrangian function. To state the Envelope Theorem in the case of a constrained problem (with equality constraints), the Lagrangian function for q equality constraints, evaluated at the optimal point/solution, is

L(x*(a), Λ(a)|a) = F(x*(a)|a) − λ1(a)(h1(x*(a)|a) − b1) − · · · − λq(a)(hq(x*(a)|a) − bq)

and the theorem states that

dF(x*(a)|a)/da = ∂L(x*(a), Λ(a)|a)/∂a
Example 8 We had previously solved the individual consumer's utility maximization problem. Generically, given two goods, 1 and 2, consumed in quantities x1 and x2, and a utility function U(., .), we can derive their demands as functions of the prices of both goods and the individual's income (y), xi ≡ xi(p1, p2, y) for i ∈ {1, 2}. If you were to substitute these demands back into the utility function, you would obtain what is referred to as the Indirect Utility Function, V(., .), that is V(p1, p2, y) = U(x1(p1, p2, y), x2(p1, p2, y)). The effects that the variables have on the indirect utility, via prices and income, are derived likewise through their relationships with the quantities demanded of the goods. Consequently, the properties of the indirect utility function are as follows,

1. If p′i ≥ pi for i ∈ {1, 2} ⇒ V(p′1, p′2, y) ≤ V(p1, p2, y), that is, indirect utility is non-increasing in prices.

2. If y′ ≥ y ⇒ V(p1, p2, y) ≤ V(p1, p2, y′)

3. V(p1, p2, y) = V(αp1, αp2, αy) for all α > 0. In this case, we say that the indirect utility function is homogeneous of degree zero in prices and income.

It is through this relationship between the utility and indirect utility functions that we can back out or derive the demand functions for the goods using the indirect utility. This relationship is known as Roy's Identity.
We will now use some of the techniques we have learned to derive this result. Since the indirect utility is a function of prices and income, if we were to depict this simple two good case, both the vertical and horizontal axes would be for the prices of the goods. For the budget constraint, instead of the quantities being variables, we have the prices as the "variables" now. Note that the indirect utility is decreasing in such a diagram, since as prices increase, the quantities of the goods fall, giving us a lower level of indirect and direct utility. Therefore, the parallel problem here now is to find the prices that minimize the indirect utility. Setting up the minimization problem,

L = V(p1, p2, y) − λ(y − p1x1 − p2x2)

the first order conditions with respect to the prices are

∂L/∂pi = Vpi + λx*i = 0  ⇒  Vp1/Vp2 = x*1/x*2

where the last equation is just the relative version of Roy's Identity. Next, using the Envelope Theorem, we know that,

∂L/∂y |x* = ∂V(p1, p2, y)/∂y = λ
Therefore,

−Vpi/x*i = λ = ∂V/∂y
⇒ x*i = −Vpi/Vy

for all i. This is known as Roy's Identity, and the demand obtained is known as the Marshallian Demand.
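Roy's Identity is easy to verify symbolically for a concrete case. A sketch (sympy assumed) with an assumed Cobb-Douglas utility U = x1^0.5 x2^0.5, whose indirect utility is V(p1, p2, y) = y/(2√(p1p2)) and whose Marshallian demand for good 1 is y/(2p1):

    import sympy as sp

    p1, p2, y = sp.symbols('p1 p2 y', positive=True)
    V = y / (2 * sp.sqrt(p1 * p2))
    x1_roy = -sp.diff(V, p1) / sp.diff(V, y)   # Roy's Identity
    print(sp.simplify(x1_roy))                 # y/(2*p1), the Marshallian demand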
Another useful formula is Shephard's Lemma, and likewise it can be derived using the Envelope Theorem.
Example 9 The idea is as follows: just as you can find an individual's choice by maximizing her utility subject to her budget constraint, you could likewise minimize the expenditure subject to a minimum level of utility she wishes to derive from her choice. In other words, the problem can be written as,

min_{x1,x2} p1x1 + p2x2
subject to: u(x) ≥ u

or equivalently,

max_{x1,x2} −p1x1 − p2x2
subject to: −u(x) ≤ −u

We will treat the inequality constraint as an equality, since in the course of minimizing the expenditure, the best the individual can do is to just meet the minimum utility constraint. The consequent Lagrangian function is,

L = p1x1 + p2x2 − λ(u(x) − u)
Note that the demand derived here is the Hicksian Demand. Let the optimized expenditure function be denoted e(p1, p2, u), which we obtain by substituting the solution x*i back into the expenditure equation. Then by the Envelope Theorem,

de(p1, p2, u)/dpi = ∂e(p1, p2, u)/∂pi = ∂L/∂pi |x*,λ* = x*i

The significance of this result is that given the expenditure function, you can always back out the Hicksian demand by differentiating the expenditure function with respect to the respective price.
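Shephard's Lemma can be verified the same way for the assumed Cobb-Douglas case above, where the expenditure function is e(p1, p2, u) = 2u√(p1p2) and the Hicksian demand for good 1 is u√(p2/p1):

    import sympy as sp

    p1, p2, u = sp.symbols('p1 p2 u', positive=True)
    e = 2 * u * sp.sqrt(p1 * p2)
    h1 = sp.diff(e, p1)                            # differentiate w.r.t. own price
    print(sp.simplify(h1 - u * sp.sqrt(p2 / p1)))  # 0: matches the Hicksian demand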
To complete our discussion of some of the key formulas, we will provide an example from Producer Theory. Pertaining to the firm's profit maximizing choice, there is likewise a useful formula, Hotelling's Lemma.
Lemma 1 Let q(p) be the firm's supply function, a vector of n goods with typical element qi(p). Further, if the firm's profit function is π(p), where qi(.) and π(.) are both twice continuously differentiable, then

qi(p) = ∂π(p)/∂pi

To see this, fix a price vector p* and define h(p) = π(p) − q(p*)ᵀp. Since at any prices p the firm can always stick with the fixed plan q(p*), π(p) ≥ q(p*)ᵀp − c*, where c* is the (unchanged) cost of producing q(p*); hence h(p) ≥ −c*, with equality at exactly p = p*. In other words, h(p) attains its minimum at p*, and by the first order condition for this minimum (equivalently, the Envelope Theorem),

∂h(p*)/∂pi = ∂π(p*)/∂pi − qi(p*) = 0

which is precisely the lemma.
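A sketch of Hotelling's Lemma for an assumed single-input firm with F(x) = √x and input price w, whose profit function works out to π(p) = p²/(4w) and whose supply is q(p) = p/(2w):

    import sympy as sp

    p, w = sp.symbols('p w', positive=True)
    profit = p**2 / (4 * w)
    q = sp.diff(profit, p)    # Hotelling's Lemma: differentiate w.r.t. price
    print(q)                  # p/(2*w), the profit maximizing supply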