
Micro Notes August 14 2016

This document outlines topics in microeconomic theory, specifically choice under constraint. The key topics covered are: 1) Consumer choice theory including utility maximization, budget constraints, demand functions and comparative statics. 2) Firm production including profit maximization, cost minimization, and aggregation of firms. 3) General equilibrium modeling markets and interactions of consumers and firms.

Uploaded by

Cristina Tessari

Micro Theory

Geoffrey Heal
August 14, 2016

Contents

I Choice under constraint
1 Consumer Choice
    1.1 Comparative Statics
    1.2 Preferences
2 Utility Maximization Problem (UMP)
    2.1 Digression on constrained maximization
    2.2 UMP solutions
3 Expenditure Minimization Problem (EMP)
    3.1 Using First Order Conditions
    3.2 Derivation of Slutsky Equation for labor supply
        3.2.1 Example
    3.3 Welfare Evaluation of Economic Changes
    3.4 Deadweight loss from commodity taxation
4 First Problem Set
5 Preference Aggregation
6 Preference aggregation and social choice
7 Final comments on preferences and choice
    7.1 Framing
    7.2 Endogenous Preferences

II Firms and Production Plans
    7.3 Properties of the Production Set
    7.4 Profit Maximization (PMP)
    7.5 Cost Minimization (CMP)
        7.5.1 Examples
    7.6 Aggregation of Firms
8 Second Problem Set

III Choice under Uncertainty
9 Preferences over Lotteries
    9.1 Paradoxes
10 Risk Aversion
    10.1 Risk Management
        10.1.1 Mean-Variance
    10.2 Comparison of payoffs in terms of return and risk
    10.3 A Geometric Approach to Insurance
11 Discount Rates and the Elasticity of Marginal Utility
12 State-Dependent Preferences
13 Subjective Probabilities
    13.1 Savage's Axioms
14 Non-Expected Utility Approaches
    14.1 MinMax Approaches
    14.2 MaxMin Expected Utility
    14.3 Smooth Ambiguity Aversion
    14.4 Examples
15 Third Problem Set

IV General Equilibrium
16 Edgeworth Box
17 The Theorems of Welfare Economics
18 Problem Set 4
19 Existence of General Equilibrium
20 Public Goods
21 External Effects
22 Common Property Resources
23 Non-Convex Production Sets
24 Time and Uncertainty in General Equilibrium: the Arrow-Debreu Model
These notes are loosely based on Microeconomic Theory by Andreu Mas-Colell, Michael Whinston and Jerry Green. But the coverage differs: I cover some material that they don't and they cover a lot that I don't.
There are three basic topics to be covered this semester:
1. Optimal constrained choice under certainty - consumers, firms, governments, investors
2. Optimal constrained choice under uncertainty and incomplete information - all of the above topics again
3. General market equilibrium: interaction of consumers, firms, governments, investors via markets

Part I

Choice under constraint


Examples are
1. Consumer maximizing wellbeing subject to budget or wealth constraints
2. Firm maximizing profits subject to technological and market constraints
3. Government maximizing welfare subject to budget, technology constraints
Basic concepts are choice set and preference ordering. A consumer wants to make the best choice from the set of alternatives available to her (the choice set). This requires that we define the set of alternatives to be considered and also a way of ranking these (a preference ordering) so that we can define a best choice.
Definition 1. The choice set is the set of alternatives from which the consumer can choose. It is defined by what is offered and by technological constraints. Denote this by $X$.
Next we need
Definition 2. Preference ordering: a ranking of all the alternatives in $X$. Denoted by $\succsim$; $x \succsim y$ denotes that $x$ is preferred or indifferent to $y$.
Strict preference is defined by $x \succsim y$ and not $y \succsim x$, denoted $x \succ y$.
Indifference is defined by $x \succsim y$ and $y \succsim x$, denoted $x \sim y$.
Complete means that for all $x, y \in X$, either $x \succsim y$ or $y \succsim x$.
Transitive means $x \succsim y,\ y \succsim z \Rightarrow x \succsim z$.
A preference ordering is said to be rational if it is complete and transitive.
Proposition 1. If $\succsim$ is rational then
(1) $\succ$ is transitive and irreflexive ($x \succ x$ is never true),
(2) $\sim$ is reflexive ($x \sim x$ for all $x$) and transitive, and
(3) if $x \succ y \succsim z$ then $x \succ z$.

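Proof. First prove that $\succ$ is transitive. $x \succ y,\ y \succ z \Rightarrow x \succsim y,\ y \succsim z \Rightarrow x \succsim z$. Suppose it is not the case that $x \succ z$. Then $x \sim z$. This means that $x \succsim z$ and $z \succsim x$. But $z \succsim x,\ x \succsim y \Rightarrow z \succsim y$, which contradicts $y \succ z$.
To show that $\succ$ is irreflexive note that $x \succ y \Leftrightarrow x \succsim y$ and not $y \succsim x$, so $x \succ x \Leftrightarrow x \succsim x$ and not $x \succsim x$, a contradiction.
Next prove that $\sim$ is transitive. Assume $x \sim y,\ y \sim z$. Then $x \succsim y,\ y \succsim x,\ y \succsim z,\ z \succsim y$. Then $z \succsim y,\ y \succsim x \Rightarrow z \succsim x$. And $x \succsim y,\ y \succsim z \Rightarrow x \succsim z$. So $z \succsim x$ and $x \succsim z \Rightarrow z \sim x$. Reflexivity of $\sim$ follows from completeness, which gives $x \succsim x$.
Finally consider the last case, $x \succ y \succsim z$. There are two possibilities, $y \succ z$ and $y \sim z$. In the first case we are done by the transitivity of strict preference. So assume that $x \succ y \sim z$.
Also assume that $x \not\succ z$, meaning that either $x \sim z$ or $z \succ x$. But if $x \sim z$ and $z \sim y$ then $x \sim y$, which contradicts $x \succ y$. And if $z \succ x$ then $z \succsim x$, and since $y \succsim z$ (because $y \sim z$) transitivity gives $y \succsim x$, which again contradicts $x \succ y$.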

1 Consumer Choice

Commodity space is $\mathbb{R}^N$. Commodity bundles are vectors in $\mathbb{R}^N$. The consumption set is $X \subseteq \mathbb{R}^N$. It reflects physiological constraints - how much effort can be supplied, how much food is needed to survive, dependence of effort on food consumption, maximum of 24 hours in a day, etc. It is generally closed; for simplicity take it to be $\mathbb{R}^N_+$.
The budget is a constraint on the amount that can be spent, which is not to exceed income $W$. $p \in \mathbb{R}^N$ is a price vector, and $p \cdot x$ is the cost of bundle $x$ at prices $p$. The budget set $B$ is $B = \{x \in \mathbb{R}^N_+ : p \cdot x \le W\}$.
The consumer problem is to choose a best point in $B$ according to the ordering $\succsim$, i.e. to choose in $\{x \in B : \forall y \in B,\ x \succsim y\}$. The demand correspondence is $x(p, W) = \{x \in \mathbb{R}^N_+ : x \in B,\ \forall y \in B,\ x \succsim y\}$. This is a correspondence (set-valued map) as it is not necessarily single-valued.
Note that the price vector $p$ is orthogonal to the budget hyperplane $H = \{x : p \cdot x = W\}$, i.e. $p \cdot z = 0$ for any vector $z$ lying in the hyperplane $H$. To see this let $y$ be such that $p \cdot y = W$, and also let $x \in H$; then $(x - y)$ is a vector parallel to $H$, and $p \cdot x = p \cdot y = W$ so $p \cdot [x - y] = 0$ and $p$ is orthogonal to $(x - y)$.
Definition 3. Demand correspondence $x(p, W)$ is homogeneous of degree zero if $x(\alpha p, \alpha W) = x(p, W)$ for any $p, W$ and any $\alpha > 0$.
Also needed is
Definition 4. Demand correspondence satisfies Walras' Law if for every $p > 0$ and $W > 0$ we have $p \cdot x = W$ for all $x \in x(p, W)$.
Assume that the demand correspondence is single-valued, i.e. a function.
Proposition 2. The demand function is homogeneous of degree zero.
Proof. $\{x : \alpha p \cdot x \le \alpha W\} = \{x : p \cdot x \le W\}$, so scaling prices and wealth by $\alpha > 0$ leaves the budget set, and hence the best choice from it, unchanged.

1.1 Comparative Statics

We investigate how consumption changes with changes in wealth and in prices. For fixed $\bar p$, the function $x(\bar p, W)$ is the Engel function. Its image in $\mathbb{R}^N$, $\{x(\bar p, W) : W > 0\}$, is the wealth expansion path. A commodity $l$ is normal at $(p, W)$ if $\partial x_l / \partial W \ge 0$; otherwise it is inferior.
The locus of $x_i(p, W)$ as $p_i$ varies, all other variables constant, is the demand curve for good $i$, sometimes known as the offer curve.
The derivative $\partial x_i(p, W)/\partial p_i$ is the own price effect, and the cross derivatives $\partial x_i(p, W)/\partial p_k$ are the cross price effects. There is a matrix of such partials with the own effects on the diagonal:
$$D_p x(p, W) = \begin{pmatrix} \frac{\partial x_1(p,W)}{\partial p_1} & \cdots & \frac{\partial x_1(p,W)}{\partial p_N} \\ \vdots & \ddots & \vdots \\ \frac{\partial x_N(p,W)}{\partial p_1} & \cdots & \frac{\partial x_N(p,W)}{\partial p_N} \end{pmatrix}$$
Elasticities are defined as
$$\varepsilon_{n,k} = \frac{\partial x_n(p, W)}{\partial p_k} \frac{p_k}{x_n(p, W)}, \qquad \varepsilon_{n,W} = \frac{\partial x_n(p, W)}{\partial W} \frac{W}{x_n(p, W)}$$

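Proposition 3. If the demand function $x(p, W)$ is homogeneous of degree zero then for all $p$ and $W$
$$\sum_{k=1}^{N} \frac{\partial x_n(p, W)}{\partial p_k} p_k + \frac{\partial x_n(p, W)}{\partial W} W = 0, \quad n = 1, \ldots, N$$
Proof. Consider the demand function $x(\alpha p, \alpha W)$, differentiate with respect to $\alpha$ and let $\alpha = 1$:
$$\frac{\partial x_n}{\partial \alpha} = \sum_k \frac{\partial x_n}{\partial p_k} p_k + \frac{\partial x_n}{\partial W} W = \left(\frac{\partial x_n}{\partial p_1}, \ldots, \frac{\partial x_n}{\partial p_N}\right) \cdot (p_1, \ldots, p_N) + \frac{\partial x_n}{\partial W} W$$
But as $x(\alpha p, \alpha W) = x(p, W)\ \forall \alpha > 0$, we have $\partial x_n / \partial \alpha = 0$.

In matrix notation $D_p x(p, W)\, p + D_W x(p, W)\, W = 0$.
By dividing through by $x_n$ we see that in elasticity form proposition 3 takes the form
$$\sum_k \varepsilon_{n,k}(p, W) + \varepsilon_{n,W}(p, W) = 0 \quad \forall n$$
Proposition 4. From Walras' Law - $p \cdot x(p, W) = W$ - it follows that
$$\sum_n p_n \frac{\partial x_n}{\partial p_k} + x_k = 0 \ \ \forall k \quad \text{and} \quad \sum_n p_n \frac{\partial x_n}{\partial W} = 1$$
Proof. Just differentiate Walras' Law by prices or by wealth.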
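Propositions 3 and 4 can be checked numerically. The sketch below is only an illustration: it assumes a particular demand function (the Cobb-Douglas demand, with a hypothetical share parameter `a = 0.3` and arbitrary prices and wealth) and uses central finite differences in place of the analytic derivatives.

```python
# Finite-difference check of Propositions 3 and 4.
# Assumption: Cobb-Douglas demand x1 = a*W/p1, x2 = (1-a)*W/p2 (illustrative only).

def demand(p, W, a=0.3):
    """Hypothetical two-good demand function used only for this check."""
    return [a * W / p[0], (1 - a) * W / p[1]]

def f(z):
    """Demand as a function of the stacked argument z = [p1, p2, W]."""
    return demand(z[:2], z[2])

def partial(func, args, i, h=1e-6):
    """Central difference of a vector-valued func w.r.t. its i-th argument."""
    up = list(args); up[i] += h
    dn = list(args); dn[i] -= h
    return [(u - d) / (2 * h) for u, d in zip(func(up), func(dn))]

z = [2.0, 5.0, 100.0]          # arbitrary prices and wealth
x = f(z)
dx_dp1, dx_dp2, dx_dW = (partial(f, z, i) for i in range(3))

# Proposition 3: sum_k dx_n/dp_k * p_k + dx_n/dW * W = 0 for each good n
euler = [dx_dp1[n] * z[0] + dx_dp2[n] * z[1] + dx_dW[n] * z[2] for n in range(2)]

# Proposition 4: sum_n p_n dx_n/dp_k + x_k = 0 for each k, and sum_n p_n dx_n/dW = 1
cournot = [z[0] * d[0] + z[1] * d[1] + x[k] for k, d in enumerate((dx_dp1, dx_dp2))]
engel = z[0] * dx_dW[0] + z[1] * dx_dW[1]

print(euler, cournot, engel)
```

Up to finite-difference error, the entries of `euler` and `cournot` should be zero and `engel` should be one.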

1.2 Preferences

We have assumed completeness and transitivity of preference orderings. We now add assumptions about desirability and convexity.
Definition 5. The preference ordering $\succsim$ on $X$ is monotone if $x \in X$ and $y \gg x$ implies $y \succ x$. (Here $y \gg x$ means that $y_i > x_i\ \forall i$.) It is strongly monotone if $y \ge x$ and $y \ne x$ implies $y \succ x$.
A weaker version is
Definition 6. The preference relation $\succsim$ on $X$ is locally non-satiated if for every $x \in X$ and every $\varepsilon > 0$ there is a $y \in X$ such that $\|x - y\| < \varepsilon$ and $y \succ x$.
The indifference set containing $x$ is $\{y \in X : y \sim x\}$.
The upper contour set is the set of points at least as good as $x$, that is the set of points on or above the indifference set containing $x$, $\{y : y \succsim x\}$.
The lower contour set is the set of points that $x$ is at least as good as: $\{y : x \succsim y\}$.
Local non-satiation rules out thick indifference sets.
Definition 7. The preference relation $\succsim$ on $X$ is convex if for every $x \in X$ the upper contour set is convex, that is if $y \succsim x$ and $z \succsim x$ then $\alpha y + (1 - \alpha) z \succsim x$ for any $\alpha \in [0, 1]$.
A strengthening of this is
Definition 8. The preference relation $\succsim$ on $X$ is strictly convex if for every $x \in X$ the upper contour set is strictly convex, that is if $y \succsim x$, $z \succsim x$ and $y \ne z$ then $\alpha y + (1 - \alpha) z \succ x$ for any $\alpha \in (0, 1)$.
Homothetic preferences are often used because they have some neat properties:
Definition 9. A monotone preference relation on $X$ is homothetic if $x \sim y \Rightarrow \alpha x \sim \alpha y\ \forall \alpha \ge 0$.
Intuitively, with homothetic preferences every indifference curve can be obtained from one by scaling it up or down. Another property often used is that of being quasi-linear:
Definition 10. A preference relation $\succsim$ is called quasi-linear with respect to commodity 1 (called the numeraire commodity) if
(1) all indifference sets are parallel displacements of each other along the axis of commodity 1, that is if $x \sim y$ then $x + \alpha e_1 \sim y + \alpha e_1$ for $e_1 = (1, 0, 0, \ldots, 0)$ and any scalar $\alpha > 0$, and
(2) good 1 is desirable, that is $x + \alpha e_1 \succ x$ for all $x$ and $\alpha > 0$.
Next continuity of preferences:
Definition 11. The preference relation $\succsim$ on $X$ is continuous if its upper and lower contour sets are closed.
Rational continuous preferences can be represented by a real-valued function which we call a utility function. This means there is a function $U : X \to \mathbb{R}$ such that $U(y) \ge U(x)$ if and only if $y \succsim x$.
Proposition 5. If the preference ordering $\succsim$ on $X$ is rational and continuous then there exists a continuous utility function that represents this ordering.
This means that the utility function numbers indifference sets in such a way that the numbering is higher for more preferred sets. Note that there are in general very many ways of doing this so that the utility representation is not unique. Preferences are in some sense more fundamental than the utility representation. And a utility representation is not always possible: closedness of the contour sets matters, as the next example shows.

An interesting case to think about and use to test your understanding is that of lexicographic preferences. To see what these are take the case of $\mathbb{R}^2$ so all points we look at are 2-vectors. Then define $\succsim$ by $x \succsim y$ if either $x_1 > y_1$ or $x_1 = y_1$ and $x_2 \ge y_2$. This is like the alphabetical ordering. What do indifference curves look like in this case? Is there a continuous utility function?
The properties of preference orderings translate into properties of utility functions:
1. Monotonicity implies that $U(x) > U(y)$ if $x \gg y$.
2. Convexity of preferences implies that $U(x)$ is quasi-concave: $\{y : U(y) \ge U(x)\}$ is convex for all $x$.
Proposition 6. A continuous preference on $X$ is homothetic if and only if it admits a utility function that is homogeneous of degree one, i.e. $U(\alpha x) = \alpha U(x)\ \forall \alpha > 0$.
Proof. If $\succsim$ is homothetic then $I(x) = \{y : y \sim x\}$ and $I(\alpha x) = \{y : y \sim \alpha x\} = \alpha I(x)$. So there is a utility function for which $U(\alpha x) = \alpha U(x)$. Now conversely suppose there is such a function. Then $x \sim y$ means $U(x) = U(y)$, so $U(\alpha x) = \alpha U(x) = \alpha U(y) = U(\alpha y)$ and $\alpha x \sim \alpha y$: the preference is homothetic.
Proposition 7. Note that if $U(x)$ represents $\succsim$ then so does $\phi(U(x))$ for any strictly increasing function $\phi$. So $U$ is unique only up to an order-preserving transformation.
Proof. Look at $I_U(x) = \{y : U(y) = U(x)\}$ and $I_\phi(x) = \{y : \phi(U(y)) = \phi(U(x))\}$. If $U(x) = U(y)$ then $\phi(U(x)) = \phi(U(y))$, so $I_U \subseteq I_\phi$. And if $\phi(U(x)) = \phi(U(y))$ then $U(x), U(y)$ are in the same level set for $\phi : \mathbb{R} \to \mathbb{R}$, which is strictly increasing, so $U(x) = U(y)$ and $I_\phi \subseteq I_U$.
This means that the scale of the utility function is arbitrary, and that there is no meaning to utility differences - we can't say that 1 is preferred to 2 by more than 3 is preferred to 4 because preference differences are not meaningful: we can only rank options, we can't rank their differences.
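A small numerical illustration of this point follows. It is a sketch under assumptions that are not in the notes: the utility function $U(x) = x_1 x_2$, the transformation $\phi = \ln$, and the four bundles are all chosen purely for illustration.

```python
import math

# Assumption: an illustrative utility function on two goods.
def U(x):
    return x[0] * x[1]

# Assumption: an illustrative strictly increasing transformation.
def phi(u):
    return math.log(u)

bundles = [(2, 5), (1, 1), (4, 5), (1, 2)]

# Rankings from worst to best under U and under phi(U)
rank_U = sorted(bundles, key=U)
rank_phi = sorted(bundles, key=lambda x: phi(U(x)))

# Utility gaps between consecutive bundles in the ranking
gaps_U = [U(b) - U(a) for a, b in zip(rank_U, rank_U[1:])]
gaps_phi = [phi(U(b)) - phi(U(a)) for a, b in zip(rank_phi, rank_phi[1:])]

print(rank_U == rank_phi)                 # the ordering is preserved
print(gaps_U.index(max(gaps_U)))          # position of the largest gap under U
print(gaps_phi.index(max(gaps_phi)))      # position of the largest gap under phi(U)
```

The two rankings coincide, but the largest utility gap falls between different pairs of bundles under the two representations, so comparing differences carries no information about preferences.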
Proposition 8. A quasi-linear preference can be represented by a utility function of the form $U(x) = x_1 + \phi(x_2, \ldots, x_N)$.
In the special case of $N = 2$ then $U(x) = x_1 + \phi(x_2)$ and indifference curves are all parallel but displaced along the horizontal axis. The marginal rate of substitution between the two goods is independent of the consumption of the first good and depends only on that of the second.

2 Utility Maximization Problem (UMP)

Consider the utility maximization problem $\max_x U(x)$ subject to $p \cdot x = W$. We assume from now on that there is a unique solution to this problem.
Proposition 9. Suppose that $U(x)$ is a continuous utility function representing a non-satiated preference ordering $\succsim$ on the consumption set $X = \mathbb{R}^N_+$. Then the demand function $x(p, W)$ has the following properties:
1. Homogeneous of degree zero in $(p, W)$
2. Walras' Law: $p \cdot x = W$ for all $x \in x(p, W)$
Proof. Clearly $x(\alpha p, \alpha W) = x(p, W)$, proving 1. Point 2 follows from non-satiation.

2.1 Digression on constrained maximization

We can turn the constrained maximization problem $\max_x U(x)$ subject to $p \cdot x = W$ into an unconstrained problem by introducing the Lagrangian
$$L = U(x) + \lambda [W - p \cdot x] \qquad (2.1)$$
where $\lambda \in \mathbb{R}^1_+$ is a Lagrange multiplier. We assume the function $U$ to be concave: in this case

Proposition 10. If there exists $\lambda^* \ge 0$ such that $(x^*, \lambda^*)$ form a saddle point of the Lagrangian then $x^*$ solves the constrained utility maximization problem.
Proof. Suppose $(x^*, \lambda^*)$ satisfy:
$x^*$ maximizes $L(x, \lambda^*)$ over $x$,
$\lambda^*$ minimizes $L(x^*, \lambda)$ over $\lambda$,
which means that they form a saddle-point of $L$. Then $L(x^*, \lambda^*) = U(x^*) + \lambda^* (W - p \cdot x^*) = U(x^*)$ because $\partial L/\partial \lambda = (W - p \cdot x^*) = 0$. Hence at $x^*$ the budget constraint is satisfied. Now note that $L(x^*, \lambda^*) \ge L(x, \lambda^*)\ \forall x$, which implies that $U(x^*) + \lambda^* (W - p \cdot x^*) \ge U(x) + \lambda^* (W - p \cdot x)\ \forall x$, and as $p \cdot x^* = W$ we have $U(x^*) \ge U(x) + \lambda^* (W - p \cdot x)\ \forall x$, and so in particular $U(x^*) \ge U(x)$ for any $x$ satisfying $p \cdot x = W$. Hence $x^*$ is a constrained maximum.

2.2 UMP solutions

Necessary conditions for $x^*$ to be a solution to the UMP are that there exist a Lagrange multiplier $\lambda$ such that:
$$\frac{\partial U(x^*)}{\partial x_n} \le \lambda p_n, \text{ with equality if } x_n^* > 0$$
So if we are at an interior optimum then
$$\frac{\partial U(x^*)}{\partial x_n} = \lambda p_n$$
In matrix notation, letting $\nabla U(x) = (\partial U/\partial x_1, \ldots, \partial U/\partial x_N)$, we have
$$\nabla U(x^*) \le \lambda p, \quad x^* \cdot [\nabla U(x^*) - \lambda p] = 0$$
This tells us that at an interior solution the gradient vector of the utility function must be proportional to the price vector, or more intuitively marginal rates of substitution between commodities must be equal to their price ratios:
$$\frac{\partial U(x^*)/\partial x_k}{\partial U(x^*)/\partial x_j} = \frac{p_k}{p_j}$$
Note that the value of the Lagrange multiplier gives us the marginal utility of wealth, i.e. the rate at which utility increases with wealth if the budget constraint is relaxed slightly. To see this note that if $W$ changes by $\Delta W$ then the resulting change in utility is
$$\Delta U = \sum_n \frac{\partial U}{\partial x_n} \Delta x_n = \lambda \sum_n p_n \Delta x_n = \lambda \Delta W$$
Definition 12. Let $V(p, W) = U(x^*)$ for any $x^* \in x(p, W)$. Then $V(p, W) : \mathbb{R}^{N+1} \to \mathbb{R}$ is called the indirect utility function: $V(p, W) = U(x(p, W))$.
Proposition 11. If $U$ is continuous and represents a locally non-satiated preference ordering then the indirect utility function $V(p, W)$ is
1. Homogeneous of degree zero
2. Strictly increasing in $W$ and non-increasing in $p_n$ for any $n$
3. Quasi-convex, i.e. $\{(p, W) : V(p, W) \le \bar V\}$ is convex for any $\bar V$
4. Continuous in $p$ and $W$.
5. $\partial V/\partial W = \lambda$
6. $\partial V/\partial p_n = -\lambda x_n(p, W)$

Proof. Point 6: $\frac{\partial V}{\partial p_n} = \sum_j \frac{\partial U}{\partial x_j} \frac{\partial x_j}{\partial p_n} = \lambda \sum_j p_j \frac{\partial x_j}{\partial p_n}$. But $\sum_j x_j p_j = W \Rightarrow x_n + \sum_j p_j \frac{\partial x_j}{\partial p_n} = 0$, so $\partial V/\partial p_n = -\lambda x_n$.

Example 1: Cobb-Douglas Utility

$$\max_x\ x_1^{\alpha} x_2^{1-\alpha}, \quad p_1 x_1 + p_2 x_2 \le W$$
As $U$ is monotone in both arguments, we know that a solution will be on the budget hyperplane. Write out a Lagrangian
$$L = x_1^{\alpha} x_2^{1-\alpha} + \lambda [W - p_1 x_1 - p_2 x_2]$$
The FOCs are
$$\alpha x_1^{\alpha-1} x_2^{1-\alpha} = \lambda p_1, \qquad (1-\alpha) x_1^{\alpha} x_2^{-\alpha} = \lambda p_2$$
Dividing the two FOCs we get
$$\frac{\alpha x_2}{(1-\alpha) x_1} = \frac{p_1}{p_2} \ \Rightarrow\ x_2 = x_1 \frac{p_1 (1-\alpha)}{p_2 \alpha}$$
From the budget constraint
$$p_1 x_1 + p_2 x_1 \frac{p_1 (1-\alpha)}{p_2 \alpha} = W$$
Hence
$$x_1 = \frac{\alpha W}{p_1}, \quad x_2 = \frac{(1-\alpha) W}{p_2}$$
From these we can work out the income and price elasticities:
$$PED_{1,1} = PED_{2,2} = -1, \quad PED_{1,2} = PED_{2,1} = 0, \quad IED_1 = IED_2 = 1$$
Note that the cross PEDs are zero, which is obvious when you rewrite the utility function as $\alpha \ln x_1 + (1-\alpha) \ln x_2$. We can work with the log of the utility function as it is an increasing transformation and so preserves the underlying ordering, i.e. the upper contour sets are the same as before. It is often interesting to look at expenditure shares as a function of price and income. In this case we have
$$S_1 = \frac{p_1 x_1}{W} = \alpha, \quad S_2 = \frac{p_2 x_2}{W} = 1 - \alpha$$
so the expenditure shares are just the exponents of the consumption levels in the Cobb-Douglas case.
Indirect utility function: by definition $V(p, W) = U(x^*)$. From above $x_1 = \alpha W/p_1$, $x_2 = (1-\alpha) W/p_2$, so
$$V(p, W) = W \left(\frac{\alpha}{p_1}\right)^{\alpha} \left(\frac{1-\alpha}{p_2}\right)^{1-\alpha}$$
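The Cobb-Douglas results can be checked numerically. The following is a minimal sketch with illustrative parameter values that are assumptions, not values from the notes: it compares the closed-form demand with a brute-force search along the budget line, and recovers the demand for good 1 from the indirect utility function as $-(\partial V/\partial p_1)/(\partial V/\partial W)$, which follows from points 5 and 6 of proposition 11.

```python
# Assumptions: alpha, prices and wealth below are arbitrary illustrative values.
ALPHA, P1, P2, W = 0.3, 2.0, 5.0, 100.0

def utility(x1, x2, a=ALPHA):
    return x1 ** a * x2 ** (1 - a)

def demand(p1, p2, w, a=ALPHA):
    """Closed-form Cobb-Douglas demand derived in Example 1."""
    return a * w / p1, (1 - a) * w / p2

def indirect_utility(p1, p2, w, a=ALPHA):
    return w * (a / p1) ** a * ((1 - a) / p2) ** (1 - a)

# Brute-force search along the budget line p1*x1 + p2*x2 = W
grid = [i * (W / P1) / 100000 for i in range(1, 100000)]
best_x1 = max(grid, key=lambda x1: utility(x1, (W - P1 * x1) / P2))

x1_star, x2_star = demand(P1, P2, W)

# Central differences of V, then x1 = -(dV/dp1) / (dV/dW)
h = 1e-5
dV_dp1 = (indirect_utility(P1 + h, P2, W) - indirect_utility(P1 - h, P2, W)) / (2 * h)
dV_dW = (indirect_utility(P1, P2, W + h) - indirect_utility(P1, P2, W - h)) / (2 * h)
roy_x1 = -dV_dp1 / dV_dW

print(best_x1, x1_star, roy_x1)
print(P1 * x1_star / W, P2 * x2_star / W)   # expenditure shares
```

All three values of $x_1$ should agree up to grid and finite-difference error, and the expenditure shares should equal $\alpha$ and $1 - \alpha$.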
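Example 2: Linear Utility
$$\max_x \{a x_1 + b x_2\} \quad \text{s.t. } p_1 x_1 + p_2 x_2 = W$$
There exists $\lambda$ such that
$$\frac{\partial U}{\partial x_i} \le \lambda p_i, \ = \text{ if } x_i > 0, \quad i = 1, 2$$
So
$$a \le \lambda p_1, \ = \text{ if } x_1 > 0; \qquad b \le \lambda p_2, \ = \text{ if } x_2 > 0$$
So if $x_1, x_2 > 0$ then $a/b = p_1/p_2$. Otherwise either $x_1 > 0$ and $x_2 = 0$ or vice versa.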
Example 3: Fixed Coefficients
Utility is $U(x) = \min\{a x_1, b x_2\}$ so the problem is
$$\max_x \min\{a x_1, b x_2\}, \quad p_1 x_1 + p_2 x_2 = W$$
We know that to avoid wasting either good we need $a x_1 = b x_2$ so we want to max $x_1$ with $x_2 = x_1 a/b$. So $p_1 x_1 + p_2 x_1 \frac{a}{b} = W$ and
$$x_1 = \frac{bW}{p_1 b + p_2 a}, \quad x_2 = \frac{aW}{p_1 b + p_2 a}$$
In this case the indirect utility function is
$$V(p_1, p_2, W) = a x_1 = b x_2 = \frac{abW}{p_1 b + p_2 a}$$
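Example 4: Constant Elasticity of Substitution

The problem is $\max \{x_1^{\rho} + x_2^{\rho}\}^{1/\rho}$, $p_1 x_1 + p_2 x_2 = W$. The Lagrangian is
$$L = \{x_1^{\rho} + x_2^{\rho}\}^{1/\rho} + \lambda (W - p_1 x_1 - p_2 x_2)$$
FOCs are
$$\frac{1}{\rho} \{x_1^{\rho} + x_2^{\rho}\}^{\frac{1}{\rho} - 1} \rho x_i^{\rho - 1} = \lambda p_i, \ i = 1, 2, \quad \text{so} \quad \frac{x_1}{x_2} = \left(\frac{p_1}{p_2}\right)^{\frac{1}{\rho - 1}}$$
Using the budget constraint
$$x_2 \left(\frac{p_1}{p_2}\right)^{\frac{1}{\rho - 1}} p_1 + x_2 p_2 = x_2 p_2 \left\{ \left(\frac{p_1}{p_2}\right)^{\frac{1}{\rho - 1} + 1} + 1 \right\} = W$$
so
$$x_2 p_2 \left\{ p_1^{\frac{\rho}{\rho - 1}} p_2^{-\frac{\rho}{\rho - 1}} + 1 \right\} = W$$
Let $r = \rho/(\rho - 1)$. Then
$$x_2 p_2 \left\{ p_1^{r} p_2^{-r} + 1 \right\} = x_2 p_2 \frac{p_1^{r} + p_2^{r}}{p_2^{r}} = W$$
so finally
$$x_2 = W \frac{p_2^{r-1}}{p_1^{r} + p_2^{r}}, \quad x_1 = W \frac{p_1^{r-1}}{p_1^{r} + p_2^{r}}$$
Indirect utility is
$$V(p_1, p_2, W) = \left[ \left(\frac{W p_1^{r-1}}{p_1^{r} + p_2^{r}}\right)^{\rho} + \left(\frac{W p_2^{r-1}}{p_1^{r} + p_2^{r}}\right)^{\rho} \right]^{1/\rho} = W \frac{(p_1^{r} + p_2^{r})^{1/\rho}}{p_1^{r} + p_2^{r}} = W (p_1^{r} + p_2^{r})^{-1/r}$$
where the last two steps use $(r - 1)\rho = r$ and $1/\rho - 1 = -1/r$.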
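As a sanity check on the CES formulas, the sketch below evaluates them at illustrative parameter values. These are assumptions: $\rho = 0.5$, so $r = -1$, and the prices and wealth are arbitrary. It also compares the closed-form demand with a brute-force search along the budget line.

```python
# Assumptions: rho, prices and wealth are arbitrary illustrative values (rho < 1, rho != 0).
RHO, P1, P2, W = 0.5, 2.0, 3.0, 60.0
R = RHO / (RHO - 1)          # r = rho / (rho - 1)

def ces_utility(x1, x2, rho=RHO):
    return (x1 ** rho + x2 ** rho) ** (1 / rho)

def ces_demand(p1, p2, w, r=R):
    """Closed-form CES demand from Example 4."""
    denom = p1 ** r + p2 ** r
    return w * p1 ** (r - 1) / denom, w * p2 ** (r - 1) / denom

def ces_indirect(p1, p2, w, r=R):
    return w * (p1 ** r + p2 ** r) ** (-1 / r)

x1, x2 = ces_demand(P1, P2, W)
spend = P1 * x1 + P2 * x2            # should equal W
u_direct = ces_utility(x1, x2)       # utility at the demanded bundle
u_indirect = ces_indirect(P1, P2, W)

# Brute-force search along the budget line as an independent check
grid = [i * 0.01 for i in range(1, 3000)]
best_x1 = max(grid, key=lambda a: ces_utility(a, (W - P1 * a) / P2))

print(x1, x2, spend, u_direct, u_indirect, best_x1)
```

With these numbers the demanded bundle is $(18, 8)$, it exhausts the budget, and both routes to the utility level give 50.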

3 Expenditure Minimization Problem (EMP)

Consider the problem
$$\min_{x \ge 0}\ p \cdot x \quad \text{s.t. } U(x) \ge \bar U$$
This is the EMP, the dual to (opposite of) the UMP. (We get the dual of a constrained maximization problem by interchanging the objective and constraint.) Here we seek to minimize expenditure subject to not falling below a specified utility level: in the UMP we maximize utility subject to not exceeding a specified expenditure level.
Proposition 12. Suppose $U$ is continuous and represents a locally non-satiated preference on $\mathbb{R}^N_+$ and that the price vector $p \gg 0$. Then
1. If $x^*$ is optimal in the UMP when wealth is $W > 0$ then $x^*$ is also optimal in the EMP when the required utility level is $\bar U = U(x^*)$. Moreover the minimized expenditure level in the EMP is exactly $W$.
2. If $x^*$ is optimal in the EMP when the required utility level is $\bar U > U(0)$, then $x^*$ is optimal in the UMP with wealth $p \cdot x^*$. Moreover the maximized utility level in this UMP is exactly $\bar U$.

Definition 13. The expenditure function $e(p, U)$ is the minimized value of expenditure in the EMP for prices $p$ and required utility $U$.
Its properties are similar to those of the indirect utility function:
Proposition 13. Suppose $U$ is continuous and represents a locally non-satiated preference on $\mathbb{R}^N_+$. Then the expenditure function $e(p, U) : \mathbb{R}^{N+1} \to \mathbb{R}$ is
1. Homogeneous of degree one in $p$
2. Strictly increasing in $U$ and non-decreasing in $p_n\ \forall n$.
3. Concave in $p$.
4. Continuous in $p$ and $U$.
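Example: Cobb-Douglas
To compute the expenditure function for the Cobb-Douglas utility considered in earlier examples we will take a monotone transform of this function by taking logs: $U(x) = \alpha \ln x_1 + (1-\alpha) \ln x_2$. The problem is
$$\min_x\ (p_1 x_1 + p_2 x_2) \quad \text{s.t. } \alpha \ln x_1 + (1-\alpha) \ln x_2 \ge U$$
and we assume the constraint holds with equality by non-satiation.
$$L = p_1 x_1 + p_2 x_2 + \lambda (U - \alpha \ln x_1 - (1-\alpha) \ln x_2)$$
$$p_1 = \frac{\lambda \alpha}{x_1}, \quad p_2 = \frac{\lambda (1-\alpha)}{x_2}, \quad x_1 = x_2 \frac{p_2}{p_1} \frac{\alpha}{1-\alpha}$$
Using the constraint
$$\alpha \ln x_2 + \alpha \ln\left(\frac{p_2}{p_1} \frac{\alpha}{1-\alpha}\right) + (1-\alpha) \ln x_2 = U$$
$$\ln x_2 = U - \alpha \ln\left(\frac{p_2}{p_1} \frac{\alpha}{1-\alpha}\right)$$
$$x_1 = \exp(U) \left(\frac{p_2}{p_1} \frac{\alpha}{1-\alpha}\right)^{1-\alpha}, \quad x_2 = \exp(U) \left(\frac{p_1}{p_2} \frac{1-\alpha}{\alpha}\right)^{\alpha}$$
The expenditure function is thus
$$e(p, U) = \exp(U) \left\{ p_1 \left(\frac{p_2}{p_1} \frac{\alpha}{1-\alpha}\right)^{1-\alpha} + p_2 \left(\frac{p_1}{p_2} \frac{1-\alpha}{\alpha}\right)^{\alpha} \right\} = \exp(U) \left(\frac{p_1}{\alpha}\right)^{\alpha} \left(\frac{p_2}{1-\alpha}\right)^{1-\alpha}$$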
Example: Fixed Coefficients
We know that $V(p_1, p_2, W) = \frac{abW}{p_1 b + p_2 a}$, so inverting, $W = \text{Expenditure} = U\, \frac{p_1 b + p_2 a}{ab}$.
Example: CES
We know that $U = V(p_1, p_2, W) = W (p_1^{r} + p_2^{r})^{-1/r}$, so $W = \text{Expenditure} = U (p_1^{r} + p_2^{r})^{1/r}$.
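The three expenditure functions above can be checked against their indirect utility functions: by proposition 12, $e(p, V(p, W)) = W$. The sketch below is illustrative only. All parameter values are assumptions, and the Cobb-Douglas case uses the log form of utility, as in the example.

```python
import math

# Assumptions: all parameter values below are arbitrary and purely illustrative.
P1, P2, W = 2.0, 3.0, 60.0
ALPHA = 0.3                    # Cobb-Douglas exponent
A_COEF, B_COEF = 2.0, 5.0      # fixed coefficients a, b
RHO = 0.5
R = RHO / (RHO - 1)            # CES parameter r = rho / (rho - 1)

# Cobb-Douglas in log form: U = alpha*ln(x1) + (1 - alpha)*ln(x2)
def v_cd(p1, p2, w):
    return math.log(w) + ALPHA * math.log(ALPHA / p1) + (1 - ALPHA) * math.log((1 - ALPHA) / p2)

def e_cd(p1, p2, u):
    return math.exp(u) * (p1 / ALPHA) ** ALPHA * (p2 / (1 - ALPHA)) ** (1 - ALPHA)

# Fixed coefficients: U = min(a*x1, b*x2)
def v_fc(p1, p2, w):
    return A_COEF * B_COEF * w / (p1 * B_COEF + p2 * A_COEF)

def e_fc(p1, p2, u):
    return u * (p1 * B_COEF + p2 * A_COEF) / (A_COEF * B_COEF)

# CES
def v_ces(p1, p2, w):
    return w * (p1 ** R + p2 ** R) ** (-1 / R)

def e_ces(p1, p2, u):
    return u * (p1 ** R + p2 ** R) ** (1 / R)

pairs = {"cobb_douglas": (v_cd, e_cd), "fixed": (v_fc, e_fc), "ces": (v_ces, e_ces)}

# Duality: spending e(p, V(p, W)) should return the original wealth W
round_trip = {name: e(P1, P2, v(P1, P2, W)) for name, (v, e) in pairs.items()}

# Homogeneity of degree one in prices: e(2p, U) / e(p, U) should equal 2
scaling = {name: e(2 * P1, 2 * P2, v(P1, P2, W)) / e(P1, P2, v(P1, P2, W))
           for name, (v, e) in pairs.items()}

print(round_trip)
print(scaling)
```

Each round trip should return $W$ and each ratio should be 2, in line with point 1 of proposition 13.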

Definition 14. The optimal commodity vector in the EMP, denoted $h(p, U) : \mathbb{R}^{N+1} \to \mathbb{R}^N$, is known as the Hicksian or compensated demand function.
The adjective "compensated" indicates that this is demand as a function of prices with the utility level constant, which means that as prices change wealth must be altered too to allow the consumer to maintain the same utility level.
Proposition 14. Suppose $U$ is continuous and represents a locally non-satiated preference $\succsim$ on $\mathbb{R}^N_+$. Then for any $p \gg 0$ the Hicksian demand function $h(p, U)$ is
1. homogeneous of degree zero in $p$
2. $U(h(p, U)) = U$ for any $p$
Next we establish a connection between the expenditure function and the compensated (Hicksian) demand function. Note that $e(p, U) = h(p, U) \cdot p$.
Proposition 15. Let $U$ be a continuous utility function representing a locally non-satiated strictly convex preference relation $\succsim$ on $X$. For all $p$ and $U$ the Hicksian demand $h(p, U)$ is the derivative vector of the expenditure function $e(p, U)$ with respect to prices:
$$h(p, U) = \nabla_p e(p, U) \quad \text{or} \quad h_n(p, U) = \partial e(p, U)/\partial p_n\ \ \forall n$$
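Proof. Assume for simplicity that $h(p, U) \gg 0$ and is differentiable. Using the chain rule we can write
$$\frac{\partial e(p, U)}{\partial p_n} = \frac{\partial}{\partial p_n} \sum_i h_i(p, U)\, p_i = \sum_i p_i \frac{\partial h_i}{\partial p_n} + h_n(p, U)$$
Using the FOCs of the EMP, $p_i = \lambda\, \partial U/\partial h_i$, this is
$$\frac{\partial e(p, U)}{\partial p_n} = \lambda \sum_i \frac{\partial h_i}{\partial p_n} \frac{\partial U}{\partial h_i} + h_n(p, U)$$
Note that $U = U(h(p, U))\ \forall p$ so that $\sum_i \frac{\partial h_i}{\partial p_n} \frac{\partial U}{\partial h_i} = 0$. This proves the proposition.

So
1. the compensated demand function is the regular demand function evaluated at the same prices and at the wealth level required to reach the specified utility level, $h(p, U) = x(p, e(p, U))$, and
2. the regular demand function is the compensated demand function evaluated at the same prices and at the utility level given by the indirect utility function at the same prices and wealth, $x(p, W) = h(p, V(p, W))$.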
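The next proposition states that the effect of a price change on demand can be decomposed into two parts, one due to the price change alone at constant welfare level and the other due to the change in real income resulting from a price change. These are called the substitution and income effects respectively.
Proposition 16. (The Slutsky equation) Assume $U$ is a continuous utility function representing a locally non-satiated strictly convex preference relation $\succsim$ defined on $X$. Then for all $(p, W)$ and $U = V(p, W)$ we have
$$\frac{\partial h_n(p, U)}{\partial p_k} = \frac{\partial x_n(p, W)}{\partial p_k} + \frac{\partial x_n(p, W)}{\partial W}\, x_k(p, W) \quad \forall n, k$$
or equivalently in matrix notation
$$D_p x(p, W) = D_p h(p, U) - D_W x(p, W)\, x(p, W)^T$$
Proof. Consider a consumer facing the price-wealth pair $(\bar p, \bar W)$ and attaining utility level $\bar U$. Her wealth level must satisfy $\bar W = e(\bar p, \bar U)$. For all $(p, U)$ we have $h_n(p, U) = x_n(p, e(p, U))$. Differentiating this with respect to $p_k$ and evaluating at $(\bar p, \bar U)$ gives
$$\frac{\partial h_n(\bar p, \bar U)}{\partial p_k} = \frac{\partial x_n(\bar p, e(\bar p, \bar U))}{\partial p_k} + \frac{\partial x_n(\bar p, e(\bar p, \bar U))}{\partial W} \frac{\partial e(\bar p, \bar U)}{\partial p_k} \qquad (3.1)$$
This yields from proposition 15
$$\frac{\partial h_n(\bar p, \bar U)}{\partial p_k} = \frac{\partial x_n(\bar p, e(\bar p, \bar U))}{\partial p_k} + \frac{\partial x_n(\bar p, e(\bar p, \bar U))}{\partial W}\, h_k(\bar p, \bar U)$$
Finally since $\bar W = e(\bar p, \bar U)$ and $h_k(\bar p, \bar U) = x_k(\bar p, e(\bar p, \bar U)) = x_k(\bar p, \bar W)$ we have
$$\frac{\partial h_n(\bar p, \bar U)}{\partial p_k} = \frac{\partial x_n(\bar p, \bar W)}{\partial p_k} + \frac{\partial x_n(\bar p, \bar W)}{\partial W}\, x_k(\bar p, \bar W)$$
Simple rearrangement gives
$$\frac{\partial x_n(\bar p, \bar W)}{\partial p_k} = \frac{\partial h_n(\bar p, \bar U)}{\partial p_k} - \frac{\partial x_n(\bar p, \bar W)}{\partial W}\, x_k(\bar p, \bar W)$$
Now let $k = n$ so that we are looking at own price effects:
$$\frac{\partial x_n(\bar p, \bar W)}{\partial p_n} = \frac{\partial h_n(\bar p, \bar U)}{\partial p_n} - \frac{\partial x_n(\bar p, \bar W)}{\partial W}\, x_n(\bar p, \bar W)$$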
This allows an intuitive interpretation. It gives a decomposition of the effect of a price change on Marshallian (conventional) demand into two components. A price change has two effects: it changes the relative prices of commodities, and it makes the consumer better or worse off, that is raises or lowers her real income. The first of these is the substitution effect, the effect of a price change alone, keeping utility constant: $\partial h_n(\bar p, \bar U)/\partial p_n$. The second is an income effect, the effect of the price change on real income, $-\frac{\partial x_n(\bar p, \bar W)}{\partial W}\, x_n(\bar p, \bar W)$. We can also express this in terms of elasticities by multiplying through by $p_n/x_n$:
$$\frac{\partial x_n(\bar p, \bar W)}{\partial p_n} \frac{p_n}{x_n} = \frac{\partial h_n(\bar p, \bar U)}{\partial p_n} \frac{p_n}{x_n} - \frac{\partial x_n(\bar p, \bar W)}{\partial W} \frac{\bar W}{x_n} \frac{p_n x_n}{\bar W}$$
which states that the elasticity of regular (Marshallian) demand equals that of compensated (Hicksian) demand minus the income elasticity of demand times the expenditure share.
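The own-price Slutsky decomposition can be verified numerically. The sketch below assumes Cobb-Douglas preferences with illustrative parameter values (not taken from the notes), builds the Hicksian demand as $h(p, U) = x(p, e(p, U))$, and approximates the derivatives by central differences.

```python
# Assumption: Cobb-Douglas utility x1^a * x2^(1-a) with illustrative numbers.
A, P1, P2, W = 0.3, 2.0, 5.0, 100.0

def marshallian_x1(p1, p2, w):
    return A * w / p1

def indirect_utility(p1, p2, w):
    return w * (A / p1) ** A * ((1 - A) / p2) ** (1 - A)

def expenditure(p1, p2, u):
    return u * (p1 / A) ** A * (p2 / (1 - A)) ** (1 - A)

def hicksian_x1(p1, p2, u):
    # h(p, U) = x(p, e(p, U))
    return marshallian_x1(p1, p2, expenditure(p1, p2, u))

h = 1e-5
U_BAR = indirect_utility(P1, P2, W)   # utility level attained at (p, W)

# Central-difference derivatives with respect to p1 and W
dx_dp = (marshallian_x1(P1 + h, P2, W) - marshallian_x1(P1 - h, P2, W)) / (2 * h)
dx_dW = (marshallian_x1(P1, P2, W + h) - marshallian_x1(P1, P2, W - h)) / (2 * h)
dh_dp = (hicksian_x1(P1 + h, P2, U_BAR) - hicksian_x1(P1 - h, P2, U_BAR)) / (2 * h)

substitution_effect = dh_dp
income_effect = -dx_dW * marshallian_x1(P1, P2, W)

print(dx_dp, substitution_effect, income_effect)
```

With these numbers the total effect of about $-7.5$ splits into a substitution effect of about $-5.25$ and an income effect of about $-2.25$.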

3.1 Using First Order Conditions

Consider a consumer problem
$$\max_x U(x \mid y), \quad p \cdot x = W$$
where $y$ is a parameter that affects preferences - for example, temperature affects preferences for drinks and clothes. For the two-good case the FOC are
$$\frac{U_1(x \mid y)}{U_2(x \mid y)} - \frac{p_1}{p_2} = F(x, y, p) = 0$$
We can use the implicit function theorem to get the impact of a change in the parameter $y$ on the demand for say $x_1$:
$$\frac{\partial x_1}{\partial y} = -\frac{F_y}{F_{x_1}} = -\frac{U_2 U_{1y} - U_1 U_{2y}}{U_2 U_{11} - U_1 U_{21}}$$
and for particular functional forms we can determine the sign of this.

3.2 Derivation of Slutsky Equation for labor supply

Another example is the supply curve for labor as a function of the wage rate. Let utility be $U(y, L)$ where $y$ is consumption and $L$ is leisure, so $U$ is increasing in both. Income is given by $y = (K - L) w + A$ where $w$ is the wage rate and $K$ is the number of hours available for work and leisure. $A$ is the agent's non-labor income. The consumer problem is
$$\max_L U(y, L), \quad y + wL = wK + A$$
and we let $S = wK + A$ be the total income the agent could earn if he devoted all his time to work. This is his wealth: unlike in previous cases it depends on the price $w$. The FOCs are
$$U_L - \lambda w = 0, \quad U_y - \lambda = 0$$
From this we get the Marshallian demand function $x_l(p, w, S)$ for leisure and by solving the expenditure minimization problem we get the Hicksian or compensated demand $h_l(p, w, U)$ for leisure. Using the Slutsky equation we can write
$$\frac{\partial h_l}{\partial w} = \frac{\partial x_l}{\partial w} + \frac{\partial x_l}{\partial S}\, l$$
where $l$ is the amount of leisure consumed. Note that $x_l(p, w, S) = x_l(p, w, wK + A)$ from which
$$\frac{dx_l}{dw} = \frac{\partial x_l}{\partial w} + K \frac{\partial x_l}{\partial S}$$
Note that $x_l$ depends on $w$ via two of its arguments so we need to take this into account when differentiating with respect to $w$. I have used $dx_l/dw$ to stand for the derivative taking this into account. The standard derivation of the Slutsky equation does not take this into account.
Substituting this into the Slutsky equation we have
$$\frac{\partial h_l}{\partial w} = \frac{dx_l}{dw} - K \frac{\partial x_l}{\partial S} + \frac{\partial x_l}{\partial S}\, l$$
so
$$\frac{dx_l}{dw} = \frac{\partial h_l}{\partial w} + \frac{\partial x_l}{\partial S}\, (K - l)$$
Here the first term is negative as it is the own price effect: an increase in the wage rate makes leisure more expensive and reduces consumption of leisure (substitution effect), while if leisure is a normal good the second term is positive representing the income effect of a wage change.
3.2.1 Example

We illustrate this with a Cobb-Douglas utility function, taking $K = 24$ and $A = 0$. $U(y, L) = y^{\alpha} L^{1-\alpha} = w^{\alpha} (24 - L)^{\alpha} L^{1-\alpha}$ so the FOC is
$$\frac{\partial U}{\partial L} = -\alpha w^{\alpha} (24 - L)^{\alpha - 1} L^{1-\alpha} + w^{\alpha} (1 - \alpha) (24 - L)^{\alpha} L^{-\alpha} = 0$$
which implies
$$\frac{1 - \alpha}{\alpha} = \frac{L}{24 - L}$$
so the ratio of leisure to work equals the ratio of the exponents of leisure and work. Labor supply is independent of the wage rate: income effects exactly offset substitution effects. In the first problem set you will investigate whether this is true for a CES utility function.
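The wage-independence of labor supply in this example can be illustrated with a brute-force search. The sketch below makes assumptions beyond the notes: $\alpha = 0.6$ and the three wage rates are arbitrary, and a simple grid search over leisure stands in for solving the FOC.

```python
# Assumptions: alpha = 0.6 and the wage rates are illustrative; K = 24 and A = 0 as in the example.
ALPHA, HOURS = 0.6, 24.0

def utility(leisure, wage, alpha=ALPHA):
    income = wage * (HOURS - leisure)          # y = w * (24 - L)
    return income ** alpha * leisure ** (1 - alpha)

def best_leisure(wage, steps=24000):
    """Grid search over leisure in (0, 24) for the utility-maximizing choice."""
    grid = [HOURS * i / steps for i in range(1, steps)]
    return max(grid, key=lambda l: utility(l, wage))

choices = {w: best_leisure(w) for w in (5.0, 20.0, 80.0)}
closed_form = HOURS * (1 - ALPHA)              # L = 24 * (1 - alpha) from the FOC

print(choices, closed_form)
```

The chosen leisure is the same at every wage and matches $L = 24(1 - \alpha)$, which is what the FOC implies.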
Now we switch to an application in the cost-benefit area:

3.3 Welfare Evaluation of Economic Changes

An issue we can investigate is the effect of a price change on consumer welfare - this could be the result of a policy measure such as a tax. Suppose prices change from $p^0$ to $p^1$: by how much is the consumer better or worse off? In principle the indirect utility function can tell us this: it goes from $V(p^0, W)$ to $V(p^1, W)$ and we just need to evaluate the difference $V(p^1, W) - V(p^0, W)$. In general this number is hard to interpret as we don't know what the units are: they are utility and the utility function is unique only up to an order-preserving transformation. However starting from an indirect utility function $V(p, W)$ and an arbitrary strictly positive price vector $\bar p$ we can consider the expenditure function $e(\bar p, V(p, W))$. This function gives the wealth needed to reach the utility level $V(p, W)$ when prices are $\bar p$. This is strictly increasing in the value of $V$, and so is an indirect utility function, but giving answers in dollars rather than utility. So
$$e(\bar p, V(p^1, W)) - e(\bar p, V(p^0, W))$$
provides a measure of the welfare change when the price changes from $p^0$ to $p^1$: the measure is in dollars. We can do this for any price vector $\bar p$, and natural choices are $p^0$ and $p^1$. Associated with these choices are the equivalent variation EV and compensating variation CV, introduced by Hicks.
Let $U^0 = V(p^0, W)$, $U^1 = V(p^1, W)$ and note that as we are holding income constant $e(p^0, U^0) = e(p^1, U^1) = W$. Define
$$EV(p^0, p^1, W) = e(p^0, U^1) - e(p^0, U^0) = e(p^0, U^1) - W$$
$$CV(p^0, p^1, W) = e(p^1, U^1) - e(p^1, U^0) = W - e(p^1, U^0)$$
The interpretation of the EV is as follows. Imagine you ask the consumer,
before the price change has occurred: Would you rather have the price
change from p0 to p1 or a cash payment of $x? The the EV is the cash
payment at which the consumer is indifferent between the two. It is, for
her, equivalent to the price change. (If the prices increase then the EV is
negative.) Note that e (p0 , U 1 ) is the expenditure needed to achieve utility
level U 1 = V (p1 , W ), the level generated by the price change, at the prices
p0 . So EV is therefore the extra wealth needed to compensate for the price
change. We can also write the EV as follows:

V p0 , W + EV = U 1
The CV answers a different question: suppose the price change has already
occurred, and we ask how much it would take to compensate the consumer
for it and bring her back to her original welfare level at the new prices. The
question is: "How much do I have to pay you to compensate you for the price
change?" Note that $e(p^1, U^0)$ tells us how much it costs to attain the initial
welfare level at the new prices $p^1$. So the CV is the change in wealth needed
to compensate for the new prices and get back to the initial state. We can
write the CV as

$$V(p^1, W - CV) = U^0$$
In each case we are trying to restore the consumer to the original welfare
level but in the EV case using the initial prices and in the CV case using the
final prices. These two clearly in general give different answers, but they will
nevertheless give the same ranking of alternatives - the consumer is better off
under $p^1$ if and only if these measures are both positive. For normal goods
(income elasticity of demand $> 0$) we have that $EV > CV$ and in the case of quasi-linear preferences
we have $EV = CV$. To see this consider the two-good case with just the
price of good 1 changing so that $p_1^0 \neq p_1^1$, $p_2^0 = p_2^1$. We can express the EV in
terms of the Hicksian or compensated demand.
Recall that $W = e(p^0, U^0) = e(p^1, U^1)$ and that (by proposition 15)
$h_1(p, U) = \partial e(p, U)/\partial p_1$. Hence we can write

$$EV(p^0, p^1, W) = e(p^0, U^1) - W = e(p^0, U^1) - e(p^1, U^1) = \int_{p_1^1}^{p_1^0} h_1(p_1, p_2, U^1)\, dp_1$$

so the EV can be represented by the area between $p_1^1$ and $p_1^0$ to the left of the
Hicksian demand curve for good 1 associated with utility level $U^1$. Similarly
the CV can be expressed as

$$CV(p^0, p^1, W) = \int_{p_1^1}^{p_1^0} h_1(p_1, p_2, U^0)\, dp_1$$

which is the area between the two prices to the left of the Hicksian demand
curve corresponding to utility level $U^0$. Clearly

$$EV(p^0, p^1, W) - CV(p^0, p^1, W) = \int_{p_1^1}^{p_1^0} \left[ h_1(p_1, p_2, U^1) - h_1(p_1, p_2, U^0) \right] dp_1$$

which is zero, so that $EV = CV$, if the Hicksian demand for good 1 is
independent of the utility level $U$. This is the case for quasi-linear preferences.
To see this assume $U(x) = f(x_1) + x_2$ and recall that the Hicksian demand
is the solution to the expenditure minimization problem

$$\min_{x_1, x_2} \{p_1 x_1 + p_2 x_2\}, \quad f(x_1) + x_2 \geq U$$

This reduces to $\min_{x_1} \left[ p_1 x_1 + p_2 \left( U - f(x_1) \right) \right]$ and the solution to this is
$x_1 = (f')^{-1}\left(\frac{p_1}{p_2}\right)$, $x_2 = U - f(x_1)$. Hence the expenditure function is

$$e(p_1, p_2, U) = p_1 (f')^{-1}\left(\frac{p_1}{p_2}\right) + p_2 \left[ U - f\left( (f')^{-1}\left(\frac{p_1}{p_2}\right) \right) \right]$$

and the derivative of this with respect to $p_1$ is independent of the utility level.
In this case the EV and CV are equal and are both equal to the conventional
Marshallian consumer surplus, the area between the two prices to the left of
the regular demand curve.
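To make the two measures concrete, here is a minimal numerical sketch. It assumes a Cobb-Douglas consumer with $U = x_1^a x_2^{1-a}$, which is not the case treated in the text, and the function names and numbers are illustrative only. For this utility the indirect utility and expenditure functions have closed forms, so EV and CV follow directly from their definitions.

```python
# Hypothetical example: Cobb-Douglas utility U = x1**a * x2**(1 - a).

def indirect_utility(p1, p2, wealth, a):
    """V(p, W) for Cobb-Douglas preferences."""
    return wealth * (a / p1) ** a * ((1 - a) / p2) ** (1 - a)

def expenditure(p1, p2, utility, a):
    """e(p, U): wealth needed to reach `utility` at prices (p1, p2)."""
    return utility / ((a / p1) ** a * ((1 - a) / p2) ** (1 - a))

a, wealth = 0.5, 100.0
p0 = (1.0, 1.0)   # initial prices
p1 = (2.0, 1.0)   # the price of good 1 doubles

u0 = indirect_utility(*p0, wealth, a)
u1 = indirect_utility(*p1, wealth, a)

ev = expenditure(*p0, u1, a) - wealth   # EV = e(p0, U1) - W
cv = wealth - expenditure(*p1, u0, a)   # CV = W - e(p1, U0)

print(f"EV = {ev:.2f}, CV = {cv:.2f}")
```

Both numbers are negative because the price rises, and EV exceeds CV, as expected when good 1 is normal.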

3.4 Deadweight loss from commodity taxation

A standard question in public finance is: which way of raising government
revenue reduces consumer welfare least? We will use the EV-CV machinery
to compare the costs of raising money by commodity taxes with the cost of
raising it by a lump-sum tax, a subtraction from wealth.
Consider a two commodity world with initial prices $p_1^0, p_2^0$, where a tax of
$t$ per unit is levied on the first good, so that its price changes to $p_1^1 = p_1^0 + t$.
Revenue raised is $T = t\, x_1(p^1, W)$. The alternative is a tax of $T$ on wealth,
reducing this to $W - T$.
The consumer is worse off under the commodity tax if $EV(p^0, p^1, W)$,
which is negative, is more negative than $-T$, that is $EV < -T$ or $0 < -T - EV$.
In terms of expenditure functions she is worse off under the
commodity tax if $W - T > e(p^0, U^1)$ or $W - T - e(p^0, U^1) > 0$, that is if the
wealth she has after the lump-sum tax exceeds the wealth needed at prices
$p^0$ to generate the utility she gets under the commodity tax. We can equate
these two criteria:

$$-T - EV(p^0, p^1, W) = W - T - e(p^0, U^1)$$

This is called the deadweight loss from commodity taxation. It measures how much worse off the consumer is because of the use of commodity
rather than lump-sum taxation. We can write this in terms of the Hicksian
or compensated demand curve at utility level $U^1$. Since $W = e(p^1, U^1)$ and
$x_1(p^1, W) = h_1(p^1, U^1)$, so that $T = t\, h_1(p_1^0 + t, p_2, U^1)$, we have

$$-T - EV(p^0, p^1, W) = e(p^1, U^1) - e(p^0, U^1) - T$$

$$= \int_{p_1^0}^{p_1^0 + t} h_1(p_1, p_2, U^1)\, dp_1 - t\, h_1(p_1^0 + t, p_2, U^1)$$

$$= \int_{p_1^0}^{p_1^0 + t} \left[ h_1(p_1, p_2, U^1) - h_1(p_1^0 + t, p_2, U^1) \right] dp_1$$

as $h_1(p_1^0 + t, p_2, U^1)$ is constant over the range of integration. Because $h_1$ is non-increasing in $p_1$, this
expression is non-negative, and is strictly positive if $h_1$ is strictly decreasing
in $p_1$.
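The deadweight loss can be computed directly in a simple case. The sketch below again assumes a Cobb-Douglas consumer with $U = x_1^a x_2^{1-a}$ and made-up numbers; it is an illustration of the formula $-T - EV$, not part of the original argument.

```python
# Hypothetical example: Cobb-Douglas utility U = x1**a * x2**(1 - a).

def indirect_utility(p1, p2, wealth, a):
    """V(p, W) for Cobb-Douglas preferences."""
    return wealth * (a / p1) ** a * ((1 - a) / p2) ** (1 - a)

def expenditure(p1, p2, utility, a):
    """e(p, U): wealth needed to reach `utility` at prices (p1, p2)."""
    return utility / ((a / p1) ** a * ((1 - a) / p2) ** (1 - a))

a, wealth = 0.5, 100.0
p1_before, p2, t = 1.0, 1.0, 1.0
p1_after = p1_before + t

x1_after = a * wealth / p1_after     # Marshallian demand for good 1 under the tax
revenue = t * x1_after               # T = t * x1(p1, W)

u1 = indirect_utility(p1_after, p2, wealth, a)
ev = expenditure(p1_before, p2, u1, a) - wealth

deadweight_loss = -revenue - ev      # -T - EV
print(f"T = {revenue:.2f}, EV = {ev:.2f}, DWL = {deadweight_loss:.2f}")
```

The tax raises 25 in revenue but costs the consumer about 29.29 in equivalent wealth, so roughly 4.29 is lost relative to a lump-sum tax of the same yield.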

First Problem Set

1. Compute demand functions for goods 1 and 2 when the utility functions
are (1) $U(x_1, x_2) = x_1^3 x_2^5$ and (2) $U(x_1, x_2) = 3x_1^{1/2} + x_2^{1/2}$.
2. Show that if $x(p, W)$ is homogeneous of degree one with respect to
$W$ [i.e. $x(p, aW) = a\, x(p, W)\ \forall a > 0$] and satisfies Walras' Law, then
$\varepsilon_{l,W} = 1$ for every $l$, where $\varepsilon_{l,W}$ is the elasticity of demand for good $l$
with respect to wealth. Can you say something about $D_W x(p, W)$ and
the form of the Engel functions and curves in this case?
3. Show that the elasticity of demand for good $l$ with respect to price $p_k$,
$\varepsilon_{l,k}$, can be written as $\varepsilon_{l,k} = d \ln (x_l(p, W)) / d \ln p_k$. Derive a similar expression for $\varepsilon_{l,W}$. Show that if we estimate the parameters $(a_0, a_1, a_2, b)$
in the equation $\ln (x_l(p, W)) = a_0 + a_1 \ln p_1 + a_2 \ln p_2 + b \ln W$ these
parameters provide estimates of the elasticities $\varepsilon_{l,1}, \varepsilon_{l,2}, \varepsilon_{l,W}$.
4. Draw a convex preference relation that is locally non-satiated but not
monotone.
5. Suppose that in a two commodity world the consumer's utility function takes the form $U(x) = [\alpha_1 x_1^\rho + \alpha_2 x_2^\rho]^{1/\rho}$, known as the CES or constant elasticity of substitution function. (A) Show that when $\rho = 1$,
indifference curves are linear. (B) Show that as $\rho \to 0$, the utility
function comes to represent the same preferences as the generalized
Cobb-Douglas $x_1^{\alpha_1} x_2^{\alpha_2}$. (C) Show that as $\rho \to -\infty$, indifference curves
become right angled, that is they become the indifference map of the
Leontief function $\min \{x_1, x_2\}$.
6. The elasticity of substitution between goods 1 and 2 is defined as

$$\sigma_{1,2} = -\frac{\partial [x_1/x_2]}{\partial [p_1/p_2]} \cdot \frac{p_1/p_2}{x_1/x_2}$$

Show that for CES functions $\sigma_{1,2} = 1/(1 - \rho)$. What is this elasticity
for the Linear, Cobb-Douglas and Leontief cases?
7. Consider a consumer who chooses his consumption bundle $x_1, x_2, x_3, ...., x_n$
to maximize his utility $U(x_1, ...., x_n)$ subject to the budget constraint
$\sum_k p_k x_k \leq Y$. Prove the Engel Proposition that the sum of the
products of each income elasticity with its budget share must equal
one, i.e.

$$\sum_k \theta_k E_k = 1$$

where $E_k = \frac{\partial x_k}{\partial Y} \frac{Y}{x_k}$ and $\theta_k = p_k x_k / Y$.

8. For the utility function

$$U(x_1, x_2) = \alpha_1 \ln (x_1 - \beta_1) + \alpha_2 \ln (x_2 - \beta_2)$$

prove that demands $x_i$, $i = 1, 2$ satisfy the so-called linear expenditure
system

$$p_i x_i = p_i \beta_i + \alpha_i \left( Y - \sum_j p_j \beta_j \right), \quad i = 1, 2$$

9. A consumer has the utility function $\{y^\rho + L^\rho\}^{1/\rho}$ where $y$ is income, $L$
is leisure and the two are related by the budget $y = w(24 - L)$ where
$w$ is the wage rate. What is the slope of the labor supply curve?

Preference Aggregation

The first issue to look at here is that of social indifference curves. Recall that
for a rational preference relation $\succsim$ on $\mathbb{R}^N$ the preferred or indifferent set to
the point $x$ is $PI(x) = \{y : y \succsim x\}$. Let there be $I$ individuals indexed by
$i$, each consuming $x_i$. Then for each person we have $PI_i(x_i) = \{y : y \succsim_i x_i\}$
where $\succsim_i$ denotes the $i$-th individual's preference ordering. Consider

$$PI(x_1, ..., x_I) = \sum_i PI_i(x_i) = \left\{ Y : Y = \sum_i y_i,\ y_i \in PI_i(x_i)\ \forall i \right\}$$

So this is the set of points that can be divided between agents so that each
is in the preferred or indifferent set to $x_i$. The boundary of this set is the
social indifference curve SIC associated with $(x_1, ...., x_I)$, $SI(x_1, ..., x_I)$, and
is the set of points that can be divided between agents so that each is on
the indifference curve containing $x_i$. An important question is: do these
indifference curves define a rational preference - complete and transitive?
Complete is not an issue: transitive is.
Proposition 17. A sufficient condition for the SICs $SI(x_1, ..., x_I)$ for all
possible allocations $x_1, ....., x_I$ to form a transitive preference is that all agents
have identical and homothetic preferences. (Note: a necessary and sufficient
condition is a bit weaker but not a lot: it is that each agent have a preference
that is an affine translation of a given homothetic preference. See paper by
me and Chichilnisky Jour Math Econ 1983.)
Proof. Recall that a preference is homothetic if and only if $x \sim y \Rightarrow \alpha x \sim \alpha y\ \forall \alpha > 0$.
This implies that $x \succsim y \Rightarrow \alpha x \succsim \alpha y\ \forall \alpha > 0$. So if agent $i$'s
preferences are homothetic then all preferred or indifferent sets are scalar
multiples of each other: $\exists \alpha > 0 : PI_i(x_i) = \alpha PI_i(y_i)$ for any $x_i, y_i$. If all
preferences are homothetic and identical there exists an individual $k$ and an
allocation $x_k$ and scalars $\alpha_j(x_j)$ such that

$$PI_j(x_j) = \alpha_j(x_j) PI_k(x_k)$$

Hence

$$\sum_j PI_j(x_j) = PI_k(x_k) \sum_j \alpha_j(x_j) = \alpha PI_k(x_k)$$

so the social preferences are homothetic too and so transitive.

So we can get a well-behaved aggregate preference or social preference
from individual preferences only under restrictive conditions.
Next we look at the aggregation of demands rather than preferences.
Each individual demand function depends on prices and income or wealth:
$d_i(p, W_i)$. Aggregate demand is $D = \sum_i d_i(p, W_i)$. We can write this as
$D = D(p, W_1, ....., W_I)$ as the price is the same for all. An interesting question is:
when does aggregate demand depend only on the total wealth/income rather
than on the distribution? Formally, when do we have $D = D(p, \sum_i W_i)$ so
that the distribution does not affect aggregate demand? We are looking for
conditions under which a change in the distribution of income/wealth will
not change aggregate demand, that is any set of alterations in the wealth
levels $\Delta W_i$, $\sum_i \Delta W_i = 0$, leads to no change in aggregate demand for any
good. Letting $d_{l,i}$ be $i$'s demand for good $l$, this means

$$\sum_i \frac{\partial d_{l,i}}{\partial W_i} \Delta W_i = 0 \quad \forall l$$

This is only going to be true if

$$\frac{\partial d_{l,i}(p, W_i)}{\partial W_i} = \frac{\partial d_{l,k}(p, W_k)}{\partial W_k}$$

This means that the effect of a change in wealth on demand is the same for
all individuals and all wealth levels, which means that for a given price vector
all wealth expansion paths are parallel. A sufficient condition for this is that
all preferences are homothetic and identical. All preferences being quasi-linear with respect to the same good is also sufficient. As with preference
aggregation, there is a weaker necessary and sufficient condition, but not a
lot weaker. Here is an example for the quasi-linear case with two goods.
Let $U_i(x_{i,1}, x_{i,2}) = x_{i,1} + f(x_{i,2})$. If prices are $p$ and wealth is $W_i$, after
using the budget constraint the utility maximization problem is

$$\max_{x_{i,2}} \left\{ \frac{W_i}{p_1} - \frac{p_2}{p_1} x_{i,2} + f(x_{i,2}) \right\}$$

The FOCs give

$$x_{i,2} = (f')^{-1}\left(\frac{p_2}{p_1}\right), \quad x_{i,1} = \frac{W_i}{p_1} - \frac{p_2}{p_1} (f')^{-1}\left(\frac{p_2}{p_1}\right)$$

Now consider a group of $N$ consumers with such preferences and demands.
Their aggregate demand is

$$x_2 = N (f')^{-1}\left(\frac{p_2}{p_1}\right), \quad x_1 = \frac{\sum_i W_i}{p_1} - N \frac{p_2}{p_1} (f')^{-1}\left(\frac{p_2}{p_1}\right)$$

which is of the same form as the individual demands and independent of the
distribution of income.
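The independence from the wealth distribution is easy to check numerically. The sketch below assumes the particular form $f(x) = \ln x$, so that $(f')^{-1}(q) = 1/q$, and wealth levels high enough for interior solutions; the function names and numbers are illustrative.

```python
# Hypothetical specification: U_i = x_i1 + ln(x_i2), so f(x) = ln(x),
# f'(x) = 1/x and the inverse of f' is q -> 1/q.

def individual_demand(p1, p2, wealth):
    """Interior demands (x1, x2); assumes wealth is large enough that x1 >= 0."""
    x2 = 1.0 / (p2 / p1)                  # (f')^{-1}(p2 / p1)
    x1 = wealth / p1 - (p2 / p1) * x2
    return x1, x2

def aggregate_demand(p1, p2, wealths):
    """Sum individual demands over all consumers."""
    demands = [individual_demand(p1, p2, w) for w in wealths]
    return sum(d[0] for d in demands), sum(d[1] for d in demands)

p1, p2 = 1.0, 2.0
equal = [10.0, 10.0, 10.0]
unequal = [4.0, 6.0, 20.0]      # same total wealth, different distribution

agg_equal = aggregate_demand(p1, p2, equal)
agg_unequal = aggregate_demand(p1, p2, unequal)
print(agg_equal, agg_unequal)
```

Both wealth profiles sum to 30 and give the same aggregate demand, because every consumer's wealth expansion path has the same slope.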
Finally, return to the consumer's expenditure minimization problem EMP:

$$\min_{x_i} \{p \cdot x_i\}, \quad U_i(x_i) \geq \bar{U}_i$$

Hicksian or compensated demand $x_i(p, \bar{U}_i) = \arg\min \{p \cdot x : U_i(x) \geq \bar{U}_i\}$.
We know that $PI_i(x_i) = \{y : U_i(y) \geq U_i(x_i)\}$ is a convex set, and so is
$\{x : U_i(x) \geq \bar{U}_i\}$.
Proposition 18. $x_i(p, \bar{U}_i)$ minimizes $p \cdot x$ over $P_i(\bar{U}_i) = \{x : U_i(x) \geq \bar{U}_i\}$
for each $i$ if and only if $\sum_i x_i(p, \bar{U}_i) = X(p, \bar{U}_1, ..., \bar{U}_N)$ minimizes $p \cdot X$
over $\sum_i P_i(\bar{U}_i)$. In words, the order of set summation and minimization
can be interchanged: the sum of the cost minima over the individual preferred or indifferent sets equals the minimum over the sum of the preferred
or indifferent sets.
So although we can't aggregate utility maxima we can aggregate expenditure minima, because there is no budget constraint and no income effect.

Preference aggregation and social choice

We have spoken of preference aggregation via the market (aggregation of
demands) and via the summation of indifference curves, which turned out
to be the same problem. Democratic systems also aggregate preferences - voting is a way of doing this and getting a social preference from a set of
disparate individual preferences. Voting is an exercise in social choice. Let
$\Pi$ be the set of all possible individual preference orderings, $\succsim_i \in \Pi\ \forall i$.
Definition 15. A social choice rule is a function $\Phi : \Pi^N \to \Pi$ which associates with any N-tuple of individual preferences, each in $\Pi$, a social preference, also in $\Pi$.

All forms of voting are social choice rules - single non-transferable votes,
transferable votes, proportional representation, etc. They all map a diverse set of individual preferences into a single social preference. Voting
systems and other social choice systems run into the Condorcet Paradox.
Consider a set of three people $\{A, B, C\}$ with preferences over three alternatives $\{\alpha, \beta, \gamma\}$. Let their rankings of these alternatives be as follows:
$A : \alpha \succ \beta \succ \gamma$, $B : \beta \succ \gamma \succ \alpha$, $C : \gamma \succ \alpha \succ \beta$. Let them vote between
the three options. Two prefer $\alpha$ to $\beta$, and two prefer $\beta$ to $\gamma$. We expect,
then, that they will vote for $\alpha$ over $\gamma$. But in fact two prefer $\gamma$ to $\alpha$, so we
have $\alpha \succ \beta \succ \gamma \succ \alpha$. This is called a voting cycle or Condorcet cycle.
Transitive individual preferences are aggregated via voting to an intransitive
social preference in this case.
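The cycle can be verified mechanically with pairwise majority votes. This is a small sketch using the standard three-voter profile; the labels `alpha`, `beta`, `gamma` and the helper `majority_winner` are illustrative names, and with an odd number of voters and strict rankings no ties arise.

```python
from itertools import combinations

# Each ranking lists alternatives from most to least preferred.
rankings = {
    "A": ["alpha", "beta", "gamma"],
    "B": ["beta", "gamma", "alpha"],
    "C": ["gamma", "alpha", "beta"],
}

def majority_winner(x, y, rankings):
    """Return whichever of x, y is ranked higher by a strict majority of voters."""
    votes_x = sum(r.index(x) < r.index(y) for r in rankings.values())
    return x if votes_x > len(rankings) / 2 else y

pairwise = {
    (x, y): majority_winner(x, y, rankings)
    for x, y in combinations(["alpha", "beta", "gamma"], 2)
}
print(pairwise)
```

Alpha beats beta, beta beats gamma, and gamma beats alpha, so the majority relation is not transitive.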
More generally consider a social choice rule as defined above satisfying
the following properties:
1. Unrestricted Domain - it works for all possible N-tuples of preferences
in $\Pi^N$.
2. Pareto Principle: if all individual preferences prefer alternative $\alpha$ to
alternative $\beta$ then the social preference also prefers $\alpha$ to $\beta$.
3. Independence of Irrelevant Alternatives: the social preference between
any two alternatives $\{\alpha, \beta\}$ depends only on individuals' preferences
over $\alpha$ and $\beta$ and not on their preferences about other alternatives.
Formally, for any pair of alternatives $\alpha, \beta$, and for any two preference N-tuples $\succsim_i$ and $\succsim_i'$, if $\succsim_i$ and $\succsim_i'$ agree on $\{\alpha, \beta\}$ then the social
preference between $\alpha$ and $\beta$ is the same for both preference N-tuples.
4. Non-Dictatorship: there is no individual $i$ such that $\{\alpha \succsim_i \beta \Rightarrow \alpha \succsim \beta\}$
where $\succsim$ denotes the social preference.
Theorem 1. (Arrow's Impossibility Theorem) There is no social choice rule
satisfying the above four conditions.
Another way of thinking about this: any rule satisfying the first three conditions is dictatorial.
This is a very influential result - it has been taken as saying that perfect
democracy is impossible. But there are social choice rules that satisfy points
2, 3 and 4 if we place some restrictions on the N-tuples of preferences admitted
and drop the unrestricted domain condition. There are many results showing
that there is a social choice rule satisfying 2, 3 and 4 if the preferences we
consider are all in some way similar. The classic case of similarity is single-peakedness: choices are over a naturally ordered one dimensional variable
(tax rate, budget deficit, ....) and each person has a most preferred value for
this and ranks alternatives lower, the further they are from this most preferred
value. In this case voting works, and the outcome will be the most preferred
outcome of the median voter when voters are ranked by their most preferred
outcomes. Here is a formal statement and proof.
Definition 16. Consider preferences over the real line $\mathbb{R}^1$. We say an ordering $\succsim$ is single peaked if there exists an alternative $m$ such that for $y, z > m$,
$y \succ z$ iff $z > y$ and for $y, z < m$, $y \succ z$ iff $y > z$. A profile of preferences
$\succsim_i$ is single peaked if for every $i$ there exists such an $m_i$.
Another related definition:
Definition 17. An agent $k$ is the median agent for the preference profile
$\succsim_i$, $i \in I$, if $N\{i : m_i \geq m_k\} \geq \frac{I}{2}$ and $N\{i : m_i \leq m_k\} \geq \frac{I}{2}$.
Here $N\{X\}$ denotes the number of elements in the set $\{X\}$. Now we
show that with single peaked preferences, the median voter's preferred point
is always chosen by a majority voting process.
Proposition 19. Consider a profile of single-peaked preferences with agent $k$
the median agent. Then the peak of the median agent $m_k$ cannot be defeated
by any other alternative by majority voting.
Proof. Pick any $y$ and assume that $m_k > y$. Consider the set of agents
$S \subset I$ who have peaks greater than or equal to $m_k$, $\{i \in I : m_i \geq m_k\}$. Then
$m_i \geq m_k > y\ \forall i \in S$. Hence $m_k \succ_i y\ \forall i \in S$. As agent $k$ is a median agent,
$N\{S\} \geq \frac{I}{2}$. So at least half the voters prefer $m_k$ to $y$, and $y$ cannot win a majority against $m_k$. The case $y > m_k$ is symmetric.
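A quick numerical check of the median voter result follows. It assumes a special case of single-peakedness in which each voter ranks alternatives by distance from her peak; the peaks and the grid of alternatives are made up for illustration.

```python
def prefers(peak, a, b):
    """True if a voter with this peak strictly prefers a to b (a is closer to the peak)."""
    return abs(a - peak) < abs(b - peak)

def defeats(challenger, incumbent, peaks):
    """True if a strict majority of voters prefers the challenger to the incumbent."""
    votes_for = sum(prefers(m, challenger, incumbent) for m in peaks)
    return votes_for > len(peaks) / 2

peaks = [0.1, 0.35, 0.4, 0.7, 0.9]            # hypothetical most-preferred points
median_peak = sorted(peaks)[len(peaks) // 2]  # middle peak with an odd number of voters

alternatives = [i / 100 for i in range(101)]
median_undefeated = not any(defeats(y, median_peak, peaks) for y in alternatives)
print(median_peak, median_undefeated)
```

No alternative on the grid wins a majority against the median peak, while a non-median peak such as 0.7 is defeated by the median.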

7 Final comments on preferences and choice

7.1 Framing

We have set up a very formal and rational model of choice by consumers.
Psychologists who study consumer behavior find this excessively rational.
Choices are affected by "framing" - how an issue is presented. Here's an
example from Psychological Science, 21(1) pp 86-92 2010, "A Dirty Word or
a Dirty World?" by David Hardisty, Eric Johnson and Elke Weber.
"Suppose you are purchasing a round trip flight from Los Angeles to New
York City, and you are debating between two tickets, one of which includes
a carbon tax [offset]. You are debating between the following two tickets,
which are otherwise identical. Which would you choose?" The ticket including
the carbon tax [offset] costs $392.70 and the ticket without costs $385. The
results are striking and depend on whether the extra $7.70, a mere 2% of the
ticket cost, is described as a tax or an offset, and on the political leanings
of the subjects. When the $7.70 is described as an offset, the proportions of
democrats, independents and republicans agreeing to pay the extra amount
are 56%, 49% and 53% respectively, roughly the same. But when the $7.70
was described as a tax, the results changed dramatically in the case of the
independents and republicans: the numbers were now 50%, 28% and 13%.
The acceptance rate almost halved for independents and dropped by 75% for
republicans. Nothing changed except the frame of reference through which
people saw the issue and whether this triggered their hostility to taxes. There
are many other cases of framing affecting a choice.

7.2 Endogenous Preferences

We have taken preferences as given, primitive data exogenous to the choice
process. Perhaps preferences are in part at least formed by social and economic experience. Does this invalidate our approach? Not necessarily. In a
more dynamic context we can model preferences that respond to consumption experience. For example, let $c_t$ be a person's consumption vector at time
$t$, and let preferences be such that they can be represented by the following
utility function: $U = \sum_t u(c_t)\, \delta^t$ with $\delta \in [0, 1]$ the discount factor. This
is a standard model of preferences over time, and takes preferences in each
period as exogenous and fixed. We can modify this to $U = \sum_t u(c_t, z_t)\, \delta^t$
where $z_t = \sum_{\tau=0}^{t-1} c_\tau \gamma^{t-\tau}$. This defines a complete transitive preference ordering over consumption sequences, in which preferences today depend on
past consumption experiences. (See paper by me and Harl Ryder, Rev Econ
Studs 1973.)

Part II

Firms and Production Plans


Firms seek to maximize profits subject to various constraints so there is a lot
of overlap with consumer theory. But there is no budget constraint, which
makes matters easier.
We let $y \in \mathbb{R}^N$ be a production plan, a list of inputs used and the outputs
produced from them. Sign convention: inputs are negative and outputs
positive, so $\{2, 5, 7, -1, -9, 0\}$ means that 2, 5 and 7 units of goods one, two
and three are produced and 1 and 9 units of four and five are used as inputs,
with six neither consumed nor produced. With this sign convention, $p \cdot y$ is
profit: it is the cost of inputs subtracted from the value of outputs. So firms
seek to maximize $p \cdot y$.
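As a small illustration of the sign convention, the sketch below evaluates $p \cdot y$ for the production plan in the text. The price vector is hypothetical, chosen only to make the arithmetic easy to follow.

```python
# Production plan from the text: outputs are positive, inputs negative.
plan = [2, 5, 7, -1, -9, 0]
# Hypothetical prices for the six goods (not from the text).
prices = [3, 1, 2, 4, 1, 5]

revenue = sum(p * y for p, y in zip(prices, plan) if y > 0)
input_cost = sum(-p * y for p, y in zip(prices, plan) if y < 0)
profit = sum(p * y for p, y in zip(prices, plan))   # the inner product p . y

print(revenue, input_cost, profit)
```

The inner product equals revenue minus input cost, so no separate bookkeeping of inputs and outputs is needed.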
Definition 18. The production possibility set is the set of production plans
that are feasible for the firm, denoted $Y \subset \mathbb{R}^N$. Clearly $y \in Y$.
The production possibility set is limited primarily by technology - you
need iron ore to make steel, wheels to make a car, etc - but also by laws and
regulations - you cannot dump toxic chemicals in the water, at least in the
US - and by available resources - an oil company may run out of oil reserves.
We sometimes characterize the production possibility set in terms of a
transformation function $F : \mathbb{R}^N \to \mathbb{R}$, $Y = \{y \in \mathbb{R}^N : F(y) \leq 0\}$. The
boundary of the production set is known as the transformation frontier,
$\{y : F(y) = 0\}$.
The ratio

$$\frac{\partial F/\partial y_l}{\partial F/\partial y_j} = MRT_{lj}$$

is known as the marginal rate of transformation between $l$ and $j$. This number
is the negative of the slope of the frontier.
If there is only one output $y_1$ we can write

$$y_1 = f(y_2, ...., y_N)$$

and the function $f$ is known as the production function. For a given level of
output $y_1$ we can look at the set of all inputs that can produce $y_1$:

$$\{y_2, .., y_N : f(y_2, .., y_N) \geq y_1\}$$

The boundary of this set is called the $y_1$ isoquant: $\{y_2, .., y_N : f(y_2, ..., y_N) = y_1\}$.
The slope of this isoquant is the marginal rate of technical substitution. A
common example of a production function is the Cobb-Douglas:

$$y = x_1^\alpha x_2^\beta$$

Here the marginal rate of technical substitution [MRTS] between $x_1$ and $x_2$
is

$$MRTS_{1,2} = \frac{\alpha x_2}{\beta x_1}$$

7.3 Properties of the Production Set

1. $Y$ is non-empty
2. $Y$ is closed
3. No free lunch: $Y \cap \mathbb{R}^N_+ \subset \{0\}$
4. Possibility of inaction: $0 \in Y$. Can conflict with sunk costs.
5. Free disposal: if $y \in Y$ & $y' \leq y \Rightarrow y' \in Y$. Note that this can conflict
with environmental regulations.
6. Irreversibility: $y \in Y, y \neq 0 \Rightarrow -y \notin Y$
7. Constant returns to scale: $y \in Y \Rightarrow \alpha y \in Y\ \forall \alpha > 0$. Geometrically $Y$
is a cone. Efficiency neither increases nor falls with scale.
8. Decreasing returns to scale: $Y$ is strictly convex. Efficiency falls with
scale.
9. Increasing returns to scale: $y \in Y \Rightarrow \alpha y \in Y\ \forall \alpha > 1$ & $\exists y \in Y : \alpha y \notin Y, \alpha \in (0, 1)$.
10. Additivity: $y, y' \in Y \Rightarrow y + y' \in Y$
11. Convexity: strict convexity implies diminishing returns to scale, convexity implies non-increasing returns to scale. (Non-increasing returns
means $y \in Y \Rightarrow \alpha y \in Y, \alpha \in [0, 1]$.)

7.4 Profit Maximization (PMP)

Problem is

$$\max_y \{p \cdot y\}, \quad y \in Y$$

or

$$\max_y \{p \cdot y\}, \quad F(y) \leq 0$$

and given that the profit maximizing plan will generally be in the boundary
we can work with

$$\max_y \{p \cdot y\}, \quad F(y) = 0$$

Definition 19. Profit function $\pi(p) = \max_y \{p \cdot y\},\ y \in Y$

Definition 20. Supply correspondence/function $y(p) = \{y \in Y : p \cdot y = \pi(p)\}$

Profit maximization involves maximizing the value of output net of the cost
of input: we may write this problem in several ways.

$$\max_x \{p_y y - p_x \cdot x\},\ y = f(x), \qquad \max_y \{p \cdot y\},\ F(y) = 0$$

depending on whether we have a single output whose production can be
expressed as a function of the inputs $y = f(x)$ or many outputs so that we
need to work with an implicit function $F(y) = 0$. For the general implicit
function case the first order conditions are

$$p_l = \lambda \frac{\partial F(y^*)}{\partial y_l}$$

where $\lambda$ is a Lagrange multiplier. For the alternative case we can write the
maximand as $p_y f(x) - p_x \cdot x$ and the FOCs are

$$p_y \frac{\partial f}{\partial x_l} \leq p_l, \quad = \text{ if } x_l > 0$$

These FOCs are generally necessary conditions for profit maximization: they
are also sufficient if the production possibility set is convex. From the FOCs
it follows that the marginal rate of transformation will equal the price ratio:

$$\frac{\partial F/\partial y_l}{\partial F/\partial y_j} = \frac{p_l}{p_j}$$

and of course the FOC for the single output case states that the price of an
input is the value of its marginal product.
Some facts about the profit function and the supply correspondence:
Proposition 20. (1) The profit function $\pi(p)$ is homogeneous of degree one.
(2) The supply correspondence $y(p)$ is homogeneous of degree zero.
(3) If $y(p)$ consists of a single point then $\pi(p)$ is differentiable at $p$ and
$\nabla \pi(p) = y(p)$ (Hotelling's Lemma)
(4) If $y(p)$ is a function and differentiable at $p$, then $Dy(p) = D^2 \pi(p)$ is
a symmetric and positive semi-definite matrix with $Dy(p)\, p = 0$.
Point (4) here is sometimes called the "Law of Supply": it says that
quantities respond to a price change in the same direction as the price change
- if a price increases then the supply of that good increases too.
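Hotelling's Lemma can be checked numerically in a simple case. The sketch below assumes a single-output technology $y = \sqrt{x}$ with output price `p` and input price `w`; these choices and the function names are illustrative. The gradient of the profit function is approximated by central finite differences and compared with the optimal net output vector $(y^*, -x^*)$.

```python
import math

# Hypothetical single-output technology y = sqrt(x), output price p, input price w.
def optimal_plan(p, w):
    """Profit-maximizing input and output from the FOC p * f'(x) = w."""
    x = (p / (2 * w)) ** 2
    return x, math.sqrt(x)

def profit(p, w):
    """Maximized profit at prices (p, w)."""
    x, y = optimal_plan(p, w)
    return p * y - w * x

p, w, h = 4.0, 1.0, 1e-6
x_star, y_star = optimal_plan(p, w)

# Central finite differences approximate the gradient of the profit function.
dprofit_dp = (profit(p + h, w) - profit(p - h, w)) / (2 * h)
dprofit_dw = (profit(p, w + h) - profit(p, w - h)) / (2 * h)

print(dprofit_dp, y_star)    # derivative w.r.t. output price vs. supply
print(dprofit_dw, -x_star)   # derivative w.r.t. input price vs. minus input demand
```

The derivative with respect to the output price matches supply, and the derivative with respect to the input price matches the negative of input demand, consistent with the sign convention for inputs.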

7.5 Cost Minimization (CMP)

If a firm is maximizing profits then there is no way of producing the same
output at lower cost, so it is also cost minimizing. Minimizing the cost of
producing output $y$ requires

$$\min_x \{p_x \cdot x\}, \quad f(x) \geq y$$

The minimized value of this gives the cost function $c(p_x, y)$, the cost of
producing $y$ at prices $p_x$: $c(p_x, y) = \min_x p_x \cdot x,\ f(x) \geq y$. The FOCs for
this problem require that

$$\frac{f_j}{f_k} = \frac{p_j}{p_k}$$

so the MRTS equals the price ratio.
Proposition 21. (1) $c(p_x, y)$ is homogeneous of degree one in prices and
non-decreasing in output.
(2) $c(p_x, y)$ is a concave function of $p_x$
(3) If $f(x)$ is homogeneous of degree one (constant returns to scale) then
$c(p_x, y)$ is homogeneous of degree one in $y$.
(4) If $f(x)$ is concave then $c(p_x, y)$ is a convex function of $y$, which means
that marginal costs are non-decreasing in $y$.
Using the cost function we can restate the problem of maximizing profits
as:

$$\max_y \{p y - c(p_x, y)\}$$

and a FOC for this is clearly that

$$p - \frac{\partial c}{\partial y} \leq 0, \quad = \text{ if } y > 0$$

which means that marginal cost should equal price.


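7.5.1 Examples:

Cobb-Douglas Production Function

Consider the function $y = L^\alpha K^\beta$, $\alpha + \beta < 1$, which shows diminishing
returns to scale. The PMP is

$$\max_{L,K} \{py - wL - rK\},\ y = L^\alpha K^\beta \ \Rightarrow\ \max_{L,K} \{p L^\alpha K^\beta - wL - rK\}$$

and so the FOCs are

$$\alpha p L^{\alpha - 1} K^\beta - w = 0$$

$$\beta p L^\alpha K^{\beta - 1} - r = 0$$

which imply

$$\frac{K}{L} = \frac{\beta w}{\alpha r}$$

This yields

$$L = \left\{ p \left( \frac{\alpha}{w} \right)^{1 - \beta} \left( \frac{\beta}{r} \right)^{\beta} \right\}^{\frac{1}{1 - \alpha - \beta}}, \quad K = \left\{ p \left( \frac{\alpha}{w} \right)^{\alpha} \left( \frac{\beta}{r} \right)^{1 - \alpha} \right\}^{\frac{1}{1 - \alpha - \beta}}$$

The supply function associated with these is

$$y = L^\alpha K^\beta = \left\{ \left( \frac{\alpha}{w} \right)^{\alpha} \left( \frac{\beta}{r} \right)^{\beta} p^{\alpha + \beta} \right\}^{\frac{1}{1 - \alpha - \beta}}$$

Next look at the CMP:

$$\min_{L,K} \{wL + rK\}, \quad L^\alpha K^\beta \geq y$$

The Lagrangian is

$$\mathcal{L} = wL + rK + \lambda \left( y - L^\alpha K^\beta \right)$$

and the FOCs are

$$w - \lambda \alpha \frac{y}{L} = 0, \quad r - \lambda \beta \frac{y}{K} = 0, \quad y - L^\alpha K^\beta = 0$$

Dividing the first by the second and substituting into the third yields factor
demand functions given $y$:

$$L = \left( \frac{\alpha r}{\beta w} \right)^{\frac{\beta}{\alpha + \beta}} y^{\frac{1}{\alpha + \beta}}, \quad K = \left( \frac{\alpha r}{\beta w} \right)^{-\frac{\alpha}{\alpha + \beta}} y^{\frac{1}{\alpha + \beta}}$$

We can now show that

$$\lambda = \frac{1}{k} w^{\frac{\alpha}{\alpha + \beta}} r^{\frac{\beta}{\alpha + \beta}} y^{\frac{1 - \alpha - \beta}{\alpha + \beta}}, \quad k = \left( \alpha^\alpha \beta^\beta \right)^{\frac{1}{\alpha + \beta}}$$

The cost function is

$$c(w, r, y) = wL + rK = \frac{\alpha + \beta}{k} w^{\frac{\alpha}{\alpha + \beta}} r^{\frac{\beta}{\alpha + \beta}} y^{\frac{1}{\alpha + \beta}}$$

so the marginal cost is

$$MC = \frac{\partial c}{\partial y} = \frac{1}{k} w^{\frac{\alpha}{\alpha + \beta}} r^{\frac{\beta}{\alpha + \beta}} y^{\frac{1 - \alpha - \beta}{\alpha + \beta}} = \lambda$$

so the Lagrange multiplier from the CMP is the marginal cost.
We can compute the average cost

$$AC = \frac{c(w, r, y)}{y} = G y^{\frac{1}{\alpha + \beta} - 1}, \quad G = \frac{\alpha + \beta}{k} w^{\frac{\alpha}{\alpha + \beta}} r^{\frac{\beta}{\alpha + \beta}}$$

and check whether this is increasing or decreasing with output:

$$\frac{\partial AC}{\partial y} = \left( \frac{1}{\alpha + \beta} - 1 \right) G y^{\frac{1}{\alpha + \beta} - 2} < 0 \iff \alpha + \beta > 1$$

so average costs are decreasing with output if and only if we have increasing
returns to scale. With diminishing returns to scale average costs are rising
with output level.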
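The closed forms above can be sanity-checked numerically. The sketch below uses hypothetical parameter values, computes the conditional factor demands, and compares the closed-form cost function and the Lagrange multiplier with a finite-difference estimate of marginal cost. The constant `k` is taken to be $(\alpha^\alpha \beta^\beta)^{1/(\alpha+\beta)}$, which is what the factor demands imply.

```python
# Hypothetical parameter values; alpha + beta < 1 gives diminishing returns to scale.
alpha, beta = 0.3, 0.5
w, r, y = 2.0, 3.0, 5.0
s = alpha + beta

def factor_demands(w, r, y):
    """Cost-minimizing labor and capital for output y."""
    ratio = (alpha * r) / (beta * w)
    labor = ratio ** (beta / s) * y ** (1 / s)
    capital = ratio ** (-alpha / s) * y ** (1 / s)
    return labor, capital

def cost(w, r, y):
    """Minimized cost wL + rK."""
    labor, capital = factor_demands(w, r, y)
    return w * labor + r * capital

labor, capital = factor_demands(w, r, y)
output = labor ** alpha * capital ** beta     # should reproduce y

k = (alpha ** alpha * beta ** beta) ** (1 / s)
closed_form_cost = (s / k) * w ** (alpha / s) * r ** (beta / s) * y ** (1 / s)

multiplier = w * labor / (alpha * y)          # lambda from the first FOC
h = 1e-6
marginal_cost = (cost(w, r, y + h) - cost(w, r, y - h)) / (2 * h)

print(output, cost(w, r, y), closed_form_cost, multiplier, marginal_cost)
```

The factor demands reproduce the target output, the two cost expressions agree, and the multiplier matches the numerical marginal cost.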
Leontief technology

The production function is

$$f(x_1, x_2) = \min [a x_1, b x_2]$$

As we know that the firm will not waste inputs, it must operate where $y = a x_1 = b x_2$. So the input demands are

$$(x_1, x_2) = \left( \frac{y}{a}, \frac{y}{b} \right)$$

and the cost function is

$$c(w_1, w_2, y) = y \left( \frac{w_1}{a} + \frac{w_2}{b} \right)$$

Example: Linear Technology

Consider a linear production function

$$f(x_1, x_2) = a x_1 + b x_2$$

Inputs are perfect substitutes and so the firm will use whichever is cheaper per unit of output.
Hence the input demands are

$$(x_1, x_2) = \begin{cases} \left( \frac{y}{a}, 0 \right) & \text{if } \frac{w_1}{a} < \frac{w_2}{b} \\ \left( 0, \frac{y}{b} \right) & \text{if } \frac{w_1}{a} > \frac{w_2}{b} \\ \{(x_1, x_2) : a x_1 + b x_2 = y,\ x_1, x_2 \geq 0\} & \text{if } \frac{w_1}{a} = \frac{w_2}{b} \end{cases}$$

and the cost function is just

$$c(w_1, w_2, y) = \min \left[ \frac{w_1}{a}, \frac{w_2}{b} \right] y$$
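The two cost functions translate directly into code. This is a minimal sketch with hypothetical numbers; the function names are illustrative.

```python
def leontief_cost(w1, w2, y, a, b):
    """Cost of producing y with f(x1, x2) = min(a*x1, b*x2): both inputs are needed."""
    x1, x2 = y / a, y / b
    return w1 * x1 + w2 * x2

def linear_cost(w1, w2, y, a, b):
    """Cost of producing y with f(x1, x2) = a*x1 + b*x2: use only the cheaper input."""
    return min(w1 / a, w2 / b) * y

# Hypothetical input prices, output level and technology parameters.
w1, w2, y, a, b = 2.0, 3.0, 10.0, 1.0, 2.0

print(leontief_cost(w1, w2, y, a, b))   # 2*10 + 3*5
print(linear_cost(w1, w2, y, a, b))     # min(2, 1.5) * 10
```

With the same prices and parameters the linear technology is cheaper, since the firm can avoid the more expensive input entirely.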

Figure 7.1:

7.6 Aggregation of Firms

We noted that it is hard to aggregate consumers. It is much easier to aggregate firms. In this context a useful result is the following, which tells us
that maximizing profits over the sum of all firms' production possibility sets
is equivalent to maximizing over each set individually and then adding up
these maxima:
Proposition 22. Let $Y_i$, $i = 1, ..., I$ be production possibility sets and $Y = \sum_i Y_i$ be their sum. $p \in \mathbb{R}^N$ is a price vector. Let $y^* = \arg\max_{y \in Y} \{p \cdot y\}$ and
$y_i^* = \arg\max_{y_i \in Y_i} \{p \cdot y_i\}$. Then $y^* = \sum_i y_i^*$.
Proof. $\sum_i Y_i = Y$. Any $y \in Y$ can be written as $y = \sum_i y_i$, $y_i \in Y_i$.
But $p \cdot y_i^* \geq p \cdot y_i\ \forall y_i \in Y_i, \forall i$. So $\sum_i p \cdot y_i^* = p \cdot \sum_i y_i^* \geq p \cdot \sum_i y_i\ \forall y_i \in Y_i$
$\Rightarrow p \cdot \sum_i y_i^* \geq p \cdot y\ \forall y \in Y$.
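The proposition can be illustrated with finite production sets. The sketch below uses two hypothetical firms with a handful of feasible plans each, builds the aggregate set as all pairwise sums, and compares the aggregate maximizer with the sum of the individual maximizers.

```python
from itertools import product

# Two hypothetical firms, each with a finite set of feasible plans
# (good 1 is an output, good 2 an input; inputs are negative).
Y1 = [(0, 0), (2, -1), (3, -2)]
Y2 = [(0, 0), (1, -1), (4, -3)]
prices = (2, 1)

def profit(plan):
    """Profit p . y of a production plan at the given prices."""
    return sum(p * q for p, q in zip(prices, plan))

def add(plan_a, plan_b):
    """Componentwise sum of two production plans."""
    return tuple(a + b for a, b in zip(plan_a, plan_b))

best_1 = max(Y1, key=profit)
best_2 = max(Y2, key=profit)

# Aggregate production set: all sums of one plan from each firm.
Y = [add(y1, y2) for y1, y2 in product(Y1, Y2)]
best_aggregate = max(Y, key=profit)

print(best_1, best_2, best_aggregate)
```

The plan that maximizes profit over the summed set is the sum of the two firms' individual profit-maximizing plans.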

Second Problem Set

1. Show that the production function $y = f(x)$ has constant returns to
scale if and only if it is homogeneous of degree one.
2. For a Cobb-Douglas production function of two arguments with the sum
of the exponents less than one, show that input demands are decreasing
in their own prices.
3. For the same production function as 2 above, show that the cross price
derivatives of input demands are equal, i.e. if $K, L$ are the two inputs
and $w, r$ their prices then $\frac{\partial K}{\partial w} = \frac{\partial L}{\partial r}$.
4. For the same production function as 2 above derive the supply function.
5. Suppose $f(z)$ is a concave production function with $L - 1$ inputs
$(z_1, z_2, ...., z_{L-1})$. Suppose $\partial f/\partial z_l > 0\ \forall l, z \geq 0$ and that the matrix
$D^2 f(z)$ is negative definite for all $z$. Use the firm's first order conditions and the implicit function theorem to prove the following statements: (A) An increase in the output price always increases the profit-maximizing level of output. (B) An increase in output price increases
the demand for some input. (C) An increase in the price of an input
reduces the demand for that input.

Part III

Choice under Uncertainty


Definition 21. A simple lottery $L$ is a list of $N$ exclusive and exhaustive outcomes $1, ..., N$ with associated probabilities $(p_1, p_2, ...., p_N)$, $\sum_n p_n = 1$, $p_n \in [0, 1]$, where $p_n$ is the probability of outcome $n$ occurring.
A simple lottery can be represented by a point in the $(N - 1)$-dimensional
simplex $\Delta = \{p \in \mathbb{R}^N_+ : \sum_n p_n = 1\}$.

Definition 22. Given $K$ simple lotteries $L_k = (p_1^k, ...., p_N^k)$, $k = 1, ..., K$, and
probabilities $1 \geq a_k \geq 0$, $\sum_k a_k = 1$, the compound lottery $(L_1, .., L_K; a_1, .., a_K)$
is the risky alternative that yields the simple lottery $L_k$ with probability $a_k$.
For any such compound lottery we can calculate a corresponding reduced
lottery as the simple lottery $L = (p_1, ..., p_N)$ that generates the same ultimate
distribution over outcomes. The probability of each outcome $1, ..., N$ is found
by multiplying the probability of each lottery $a_k$ by the probability $p_n^k$ that
outcome $n$ occurs in lottery $k$, and then adding over lotteries $k$. So the
probability of outcome $n$ in the reduced lottery is

$$p_n = a_1 p_n^1 + a_2 p_n^2 + .... + a_K p_n^K = \sum_k a_k p_n^k$$
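The reduction of a compound lottery is a weighted average of probability vectors. A minimal sketch follows, with a hypothetical two-lottery example over three outcomes; the function name is illustrative.

```python
def reduce_lottery(simple_lotteries, weights):
    """Reduced lottery: p_n = sum_k a_k * p_n^k."""
    n_outcomes = len(simple_lotteries[0])
    return [
        sum(a * lottery[n] for a, lottery in zip(weights, simple_lotteries))
        for n in range(n_outcomes)
    ]

# Hypothetical compound lottery over three outcomes.
L1 = [1.0, 0.0, 0.0]
L2 = [0.25, 0.5, 0.25]
weights = [0.5, 0.5]

reduced = reduce_lottery([L1, L2], weights)
print(reduced)
```

The reduced probabilities still sum to one, so the result is itself a point in the simplex.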

Preferences over Lotteries

We assume the preferences over simple or compound lotteries depend only
on the outcomes and their probabilities, and not in any way on the process
used to arrive at these outcomes and probabilities. So we take the set of
alternatives to be the set of all simple lotteries $\mathcal{L}$ over the set of outcomes
$C$. People are assumed to have rational (complete, transitive) preferences $\succsim$
over $\mathcal{L}$. Additional technical assumptions are:
Definition 23. The preference relation $\succsim$ on the space of simple lotteries $\mathcal{L}$
is continuous if for any $l, l', l'' \in \mathcal{L}$ the sets

$$\{a \in [0, 1] : al + (1 - a) l' \succsim l''\} \subset [0, 1]$$

and

$$\{a \in [0, 1] : l'' \succsim al + (1 - a) l'\} \subset [0, 1]$$

are closed.
So the set of combinations of $l, l'$ that are at least as good as $l''$ is a closed
set, as is the set of combinations that are no better than $l''$. As in
the deterministic case this is ruling out lexicographic preferences where the
agent places all emphasis on the probability of one particular outcome - for
example on the risk of death being zero. Next comes another assumption, a
very crucial one.
Definition 24. The preference relation $\succsim$ on the space of simple lotteries $\mathcal{L}$
satisfies the independence axiom if for all $l, l', l'' \in \mathcal{L}$ and $a \in (0, 1)$ we have
$l \succsim l'$ iff $al + (1 - a) l'' \succsim al' + (1 - a) l''$.
In words, if we mix two lotteries with a third one, the preference
ordering of the two resulting mixtures does not depend on (is independent
of) the particular third lottery used. $l$ is preferred to $l'$ iff $l$ in any mixture
with a third lottery $l''$ is preferred to $l'$ in the same mixture with $l''$. Here's
an example of what this axiom means.
Suppose $l \succsim l'$ and $a = 0.5$. $0.5l + 0.5l''$ is the lottery resulting from a
coin toss between $l, l''$, say heads giving $l$ and tails $l''$. Likewise $0.5l' + 0.5l''$
is the lottery generated by a coin toss between $l', l''$, again heads giving $l'$.
Conditional on heads $0.5l + 0.5l''$ is at least as good as $0.5l' + 0.5l''$, and
conditional on tails they give the same outcome. So it is reasonable that
$0.5l + 0.5l'' \succsim 0.5l' + 0.5l''$, which is what the axiom implies.
Definition 25. The utility function $U : \mathcal{L} \to \mathbb{R}$ has an expected utility form
if there is an assignment of numbers $u_1, ...., u_N$ to the $N$ outcomes such that
for every simple lottery $l = (p_1, ...., p_N) \in \mathcal{L}$ we have

$$U(l) = \sum_n u_n p_n$$

A utility function with the expected utility form is called a von Neumann-Morgenstern (vN-M) expected utility function.
Note that if we let $l^n$ denote the degenerate lottery yielding outcome $n$
with probability one then $U(l^n) = u_n$. The expression $U(l) = \sum_n u_n p_n$ is
linear in probabilities, suggesting

45

Proposition 23. A utility function \(U : L \to \mathbb{R}\) has an expected utility form
if and only if it is linear in probabilities, that is
\[ U\left(\sum_{k=1}^{K} a_k l_k\right) = \sum_{k=1}^{K} a_k U(l_k) \]
for any \(K\) lotteries \(l_k\) and probabilities \(a_1, \dots, a_K \ge 0\), \(\sum_k a_k = 1\). [In words,
the utility of a mixture of simple lotteries equals the expectation of the utilities
of the individual simple lotteries.]
Proof. Suppose \(U\) has the linearity property. We can write any lottery
\(l = (p_1, \dots, p_N)\) as a combination of the degenerate lotteries \(l^1, \dots, l^N\):
\(l = \sum_n p_n l^n\). Then
\[ U(l) = U\left(\sum_n p_n l^n\right) = \sum_n p_n U(l^n) = \sum_n p_n u_n. \]
So \(U\) has the expected utility form.
Now suppose \(U\) has the expected utility form, and consider a compound
lottery \((l_1, \dots, l_K; a_1, \dots, a_K)\) where \(l_k = (p_1^k, \dots, p_N^k)\). Its reduced lottery is
\(l' = \sum_k a_k l_k\). Recall that the probability of outcome \(n\) in the reduced lottery is
\[ p_n = a_1 p_n^1 + a_2 p_n^2 + \dots + a_K p_n^K = \sum_k a_k p_n^k. \]
Then we have
\[ U\left(\sum_k a_k l_k\right) = \sum_n u_n \left[\sum_k a_k p_n^k\right] = \sum_k a_k \left[\sum_n u_n p_n^k\right] = \sum_k a_k U(l_k) \]
which completes the proof.
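The linearity property can be checked numerically on a small case. The sketch below is only an illustration: the three lotteries, the mixing weights and the outcome utilities are made-up numbers, and the helper names `reduce_compound` and `expected_utility` are hypothetical.

```python
def reduce_compound(lotteries, weights):
    """Reduced lottery: p_n = sum_k a_k * p_n^k."""
    n_outcomes = len(lotteries[0])
    return [sum(a * l[n] for a, l in zip(weights, lotteries))
            for n in range(n_outcomes)]

def expected_utility(lottery, utils):
    """U(l) = sum_n u_n * p_n."""
    return sum(p * u for p, u in zip(lottery, utils))

# Made-up example: three simple lotteries over N = 3 outcomes.
lotteries = [[0.2, 0.3, 0.5], [1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
weights = [0.5, 0.25, 0.25]   # a_1, a_2, a_3, summing to one
utils = [1.0, 4.0, 9.0]       # u_1, u_2, u_3

reduced = reduce_compound(lotteries, weights)
lhs = expected_utility(reduced, utils)   # U(sum_k a_k l_k)
rhs = sum(a * expected_utility(l, utils)
          for a, l in zip(weights, lotteries))  # sum_k a_k U(l_k)
print(reduced, lhs, rhs)
```

The two printed utilities agree up to rounding, as the proposition requires.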


The utility functions we discussed in the section on consumer preferences were unique only up to a monotone transformation, that is, an order-preserving transformation. They are ordinal functions. That is not true of
vN-M utility functions: they are unique up to a positive linear (affine) transformation, and
are cardinal functions.
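Proposition 24. Suppose that \(U : L \to \mathbb{R}\) is a vN-M expected utility function for the
preference relation \(\succsim\) on \(L\). Then \(\tilde{U}\) is another vN-M function for \(\succsim\) if and
only if there exist scalars \(\beta > 0, \gamma\) such that \(\tilde{U}(l) = \beta U(l) + \gamma\ \ \forall l \in L\).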

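Proof. Choose two lotteries \(\bar{l}, \underline{l}\) such that \(\bar{l} \succsim l \succsim \underline{l}\ \ \forall l \in L\). If \(\bar{l} \sim \underline{l}\) all lotteries
are indifferent and the result is trivial. So we assume \(\bar{l} \succ \underline{l}\).
Note that if \(U\) is a vN-M function and \(\tilde{U} = \beta U + \gamma\) then
\[ \tilde{U}\left(\sum_k a_k l_k\right) = \beta U\left(\sum_k a_k l_k\right) + \gamma = \beta\left[\sum_k a_k U(l_k)\right] + \gamma = \sum_k a_k \left[\beta U(l_k) + \gamma\right] = \sum_k a_k \tilde{U}(l_k) \]
so \(\tilde{U}\) has the expected utility form.
For the reverse proof we need to show that if both \(U, \tilde{U}\) have the expected
utility form then there exist scalars \(\beta > 0, \gamma\) such that \(\tilde{U} = \beta U + \gamma\). Consider
any lottery \(l\) and define \(\lambda_l\) by
\[ U(l) = \lambda_l U(\bar{l}) + (1-\lambda_l) U(\underline{l}) = U\left(\lambda_l \bar{l} + (1-\lambda_l) \underline{l}\right) \]
Hence
\[ \lambda_l = \frac{U(l) - U(\underline{l})}{U(\bar{l}) - U(\underline{l})} \]
Since \(\lambda_l U(\bar{l}) + (1-\lambda_l) U(\underline{l}) = U(\lambda_l \bar{l} + (1-\lambda_l)\underline{l})\) and \(U\) represents the preference \(\succsim\) it follows that \(l \sim \lambda_l \bar{l} + (1-\lambda_l)\underline{l}\). In this case since \(\tilde{U}\) is also linear
and represents the same preferences
\[ \tilde{U}(l) = \tilde{U}\left(\lambda_l \bar{l} + (1-\lambda_l)\underline{l}\right) = \lambda_l \tilde{U}(\bar{l}) + (1-\lambda_l)\tilde{U}(\underline{l}) = \lambda_l\left[\tilde{U}(\bar{l}) - \tilde{U}(\underline{l})\right] + \tilde{U}(\underline{l}) \]
Substituting for \(\lambda_l\) and rearranging terms shows that \(\tilde{U}(l) = \beta U(l) + \gamma\) where
\[ \beta = \frac{\tilde{U}(\bar{l}) - \tilde{U}(\underline{l})}{U(\bar{l}) - U(\underline{l})}, \qquad \gamma = \tilde{U}(\underline{l}) - \beta U(\underline{l}) \]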

A result of this proposition is that for vN-M utilities, utility differences
have meaning, which was not true for ordinal utilities. So if there are four
outcomes it is meaningful to say the utility difference between outcomes 1
and 2 is greater than that between outcomes 3 and 4, which corresponds to
the statement \(u_1 - u_2 > u_3 - u_4\). This in turn is equivalent to \(0.5u_1 + 0.5u_4 > 0.5u_2 + 0.5u_3\),
which in turn means that the lottery \((0.5, 0, 0, 0.5)\) is preferred
to \((0, 0.5, 0.5, 0)\).
A final point to note before we get to the most important result of this
section is that the independence axiom implies that indifference curves are
straight lines in prospect space. To see this note that an indifference curve
is a straight line if \(l_1 \sim l_2 \Rightarrow a l_1 + (1-a) l_2 \sim l_1 \sim l_2\ \ \forall a \in [0,1]\). Suppose
this is not the case and in fact \(0.5l_1 + 0.5l_2 \succ l_2\). This is the same as
\(0.5l_1 + 0.5l_2 \succ 0.5l_2 + 0.5l_2\). But since \(l_1 \sim l_2\), by the independence axiom,
we must have \(0.5l_1 + 0.5l_2 \sim 0.5l_2 + 0.5l_2\), a contradiction. So indifference curves are linear.
We can also show that they are parallel straight lines: it is easy to construct
a contradiction if they are not.
Proposition 25. [Expected utility theorem.] Suppose that the rational preference relation \(\succsim\) on \(L\) satisfies the continuity and independence axioms.
Then \(\succsim\) admits a representation in the expected utility form, that is we can
assign numbers \(u_n\) to each outcome \(1, \dots, N\) in such a manner that for any
two lotteries \(l = (p_1, \dots, p_N)\), \(l' = (p'_1, \dots, p'_N)\) we have
\[ l \succsim l' \iff \sum_n u_n p_n \ge \sum_n u_n p'_n \]

9.1 Paradoxes

Lots of people feel uneasy about the independence axiom and there are various paradoxes that seem to violate it. One of the best known is the Allais
Paradox. The number of outcomes is 3, \(N = 3\).

First prize     Second prize    Third prize
$2,500,000      $500,000        $0

The first choice is between two lotteries \(l_1, l'_1\):
\[ l_1 = (0, 1, 0), \qquad l'_1 = (0.10, 0.89, 0.01) \]
The second is between \(l_2, l'_2\):
\[ l_2 = (0, 0.11, 0.89), \qquad l'_2 = (0.10, 0, 0.90) \]

It is common in experimental situations for people to rank \(l_1 \succ l'_1\) and \(l'_2 \succ l_2\).
The first choice implies the certainty of $500,000 is preferred to a lottery
offering a 10% chance of five times as much together with a small risk of
getting zero. The second implies that a 10% chance of getting $2,500,000
beats an 11% chance of $500,000.
These choices are inconsistent with a vN-M utility function. To see this
denote by \(u_{25}, u_5, u_0\) the utilities of the three outcomes. Then \(l_1 \succ l'_1\) implies
\[ u_5 > 0.1u_{25} + 0.89u_5 + 0.01u_0 \]
Adding \(0.89u_0 - 0.89u_5\) to both sides gives
\[ 0.11u_5 + 0.89u_0 > 0.1u_{25} + 0.9u_0 \]
and therefore \(l_2 \succ l'_2\). There are several possible reactions to this.
1. People will change their choices if the inconsistency with the underlying
axioms is shown to them.
2. This paradox is not very relevant because it involves probabilities near
zero and one, and vastly different outcomes.
3. Regret is relevant: we prefer l1 to l10 because we will always regret
not getting $500,000 when we could have got that with certainty if we
choose l10 and get zero. In the second case there is a good chance of
getting zero anyway.
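The algebra behind the Allais inconsistency can be confirmed with a few lines of code. This is a minimal sketch: the utility triples are arbitrary illustrative numbers, and `allais_gaps` is a hypothetical helper. For any utilities the two expected-utility gaps coincide, so an expected utility maximizer who strictly prefers \(l_1\) to \(l'_1\) must also prefer \(l_2\) to \(l'_2\).

```python
# Probabilities are listed in the order (first prize, second prize, third prize).
l1, l1_prime = (0.0, 1.0, 0.0), (0.10, 0.89, 0.01)
l2, l2_prime = (0.0, 0.11, 0.89), (0.10, 0.0, 0.90)

def expected_utility(lottery, utils):
    return sum(p * u for p, u in zip(lottery, utils))

def allais_gaps(utils):
    """Return (EU(l1) - EU(l1'), EU(l2) - EU(l2')) for utils = (u25, u5, u0)."""
    gap_first = expected_utility(l1, utils) - expected_utility(l1_prime, utils)
    gap_second = expected_utility(l2, utils) - expected_utility(l2_prime, utils)
    return gap_first, gap_second

# Arbitrary utility assignments, chosen only for illustration.
for utils in [(1.0, 0.9, 0.0), (10.0, 2.0, 1.0), (3.0, 2.9, -5.0)]:
    gap_first, gap_second = allais_gaps(utils)
    print(utils, round(gap_first, 10), round(gap_second, 10))
```

Both gaps equal \(0.11u_5 - 0.1u_{25} - 0.01u_0\), so they always have the same sign.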
Another interesting paradox is the Ellsberg Paradox, as follows. There
is an urn with 90 balls in it, 30 red and the rest black and yellow. So the
probabilities of black and yellow are unknown: this is uncertainty or ambiguity
rather than risk. People are given the following choices between pairs of
lotteries:
Gamble A: $10 if you draw red.
Gamble B: $10 if you draw black.
Gamble C: $10 for red or yellow.
Gamble D: $10 for black or yellow.
Generally we find that people prefer A to B and also D to C. This also
implies a violation of the independence axiom, as follows. Writing \(R, B, Y\) for the
probabilities of red, black and yellow,
\[ A \succ B \Rightarrow R u_{10} + (1-R) u_0 > B u_{10} + (1-B) u_0 \Rightarrow R > B \]
\[ D \succ C \Rightarrow B u_{10} + Y u_{10} + R u_0 > R u_{10} + Y u_{10} + B u_0 \Rightarrow B > R \]
So there is no set of probabilities and utilities that supports this choice as
expected utility maximization. Choices under ambiguity seem to violate the
independence axiom.
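A brute-force search makes the same point. The sketch below assumes only that \(u_{10} > u_0\), so that each expected-utility comparison reduces to a comparison of winning probabilities; the function name and grid size are illustrative choices.

```python
def consistent_black_probabilities(red=1/3, steps=1000):
    """Search for P(black) under which both A > B and D > C maximize expected utility.

    With u10 > u0, gamble X beats gamble Y exactly when X wins with higher probability.
    """
    found = []
    for i in range(steps + 1):
        black = (1 - red) * i / steps
        yellow = 1 - red - black
        a_beats_b = red > black                    # A pays on red, B on black
        d_beats_c = black + yellow > red + yellow  # D pays on black/yellow, C on red/yellow
        if a_beats_b and d_beats_c:
            found.append(black)
    return found

print(consistent_black_probabilities())
```

The search comes back empty: no single probability for black rationalizes both choices.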

10 Risk Aversion

Consider a lottery over monetary amounts, non-negative numbers, with probabilities given by the density function \(f(t)\), \(t \in [0, \infty)\). The cumulative distribution
function is \(F(x) = \int_0^x f(t)\,dt\). Note that the final distribution for
a compound lottery is a weighted average of the distributions of each of the
component lotteries: if \((l_1, \dots, l_K; a_1, \dots, a_K)\) is a compound lottery then
the cumulative distribution is \(F(x) = \sum_k a_k F_k(x)\).
We now take the set of all lotteries to be the set of all cumulative distributions over non-negative amounts of money. We can apply the vN-M
theorem to show that there is a utility function of the form
\[ U(F) = \int u(x)\,dF(x) \]
Note that \(U\) is defined on lotteries and \(u\) is defined on amounts of money. \(U\) is
generally called the vN-M utility and \(u\) the Bernoulli utility. The axioms of
expected utility theory place restrictions on \(U\) but not on \(u\), which could be
any increasing continuous function.
Note: it is sometimes argued that \(u\) should be bounded above, because
of the St Petersburg Paradox. Assume that \(u\) is unbounded and let \(x_m\) be
an amount of money such that \(u(x_m) > 2^m\). Consider the following lottery:
toss a coin repeatedly until heads comes up. If this happens on the \(m\)th
toss the payoff is \(x_m\). The expected utility from this lottery is
\[ \sum_{m=1}^{\infty} u(x_m) \left(\frac{1}{2}\right)^m > \sum_{m=1}^{\infty} 2^m \left(\frac{1}{2}\right)^m = +\infty \]
so you ought in principle to be willing to pay any amount to play this lottery.
Clearly most people are not, so this is an argument for \(u\) being bounded
above.
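The divergence is easy to see in partial sums. In this sketch the unbounded case takes \(u(x_m) = 2^m\), which is the lower bound used in the argument rather than a strict instance of \(u(x_m) > 2^m\), and the bounded case uses a hypothetical sequence \(u(x_m) = 1 - e^{-m}\), bounded above by 1.

```python
import math

def partial_expected_utility(utility_of_prize, tosses):
    """Sum over m = 1..tosses of u(x_m) * (1/2)^m."""
    return sum(utility_of_prize(m) * 0.5 ** m for m in range(1, tosses + 1))

def unbounded(m):
    return 2.0 ** m            # assumption: u(x_m) = 2^m

def bounded(m):
    return 1.0 - math.exp(-m)  # hypothetical utility sequence bounded above by 1

for tosses in (10, 50, 200):
    print(tosses,
          partial_expected_utility(unbounded, tosses),
          partial_expected_utility(bounded, tosses))
```

Each extra toss adds exactly one unit of expected utility in the unbounded case, while the bounded sum stays below one.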
Definition 26. A decision maker is risk averse if for any lottery \(F(\cdot)\) the
degenerate lottery that yields the expected amount \(\int x\,dF(x)\) with certainty
is at least as good as \(F\). If for all \(F\) the decision-maker is indifferent between
these two lotteries we say he is risk-neutral, and we say he is strictly risk
averse if he is indifferent only when the two lotteries are the same.
So a decision maker is risk averse if and only if
\[ \int u(x)\,dF(x) \le u\left(\int x\,dF(x)\right) \quad \forall F \]
In words, the expected utility of the outcome does not exceed the utility of
the expected outcome. This is called Jensen's Inequality, and is the inequality
used to define a concave function, so risk aversion is equivalent to the function
\(u\) being concave: diminishing marginal utility of income or wealth.
Definition 27. The certainty equivalent of a lottery \(F\), denoted \(c(F, u)\), is
the amount of money that the individual regards as indifferent to the gamble
represented by the lottery:
\[ u(c(F, u)) = \int u(x)\,dF(x) \]
If the decision-maker is an expected utility maximizer with Bernoulli utility function \(u\) on amounts of money then:
Proposition 26. The following are equivalent:
1. The decision-maker is risk-averse
2. \(u(\cdot)\) is concave
3. \(c(F, u) \le \int x\,dF(x)\) for all \(F\)
So risk-aversion, concavity and the certainty equivalent being no greater than
the expectation are all equivalent.
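As a concrete illustration of the certainty equivalent, the sketch below uses an assumed concave utility \(u(x) = \sqrt{x}\) and a made-up 50/50 lottery over 0 and 100; the function names are hypothetical.

```python
import math

def expected_value(outcomes, probs):
    return sum(p * x for x, p in zip(outcomes, probs))

def certainty_equivalent(outcomes, probs, u, u_inverse):
    """Solve u(c) = E[u(x)] for c, given the inverse of u."""
    expected_u = sum(p * u(x) for x, p in zip(outcomes, probs))
    return u_inverse(expected_u)

outcomes, probs = [0.0, 100.0], [0.5, 0.5]
ce = certainty_equivalent(outcomes, probs, math.sqrt, lambda v: v * v)
mean = expected_value(outcomes, probs)
print(ce, mean)
```

Here expected utility is 5, so the certainty equivalent is 25, well below the mean of 50, as concavity of \(u\) requires.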

10.1 Risk Management

We will look at insurance and portfolio choice.


Insurance: a strictly risk-averse person with initial wealth \(W\) runs a risk
of losing \(\$D\) with probability \(\pi\). A unit of insurance costs \(\$q\) and pays $1 if
the loss occurs. So if \(a\) units of insurance are purchased then her wealth is
\(W - aq\) if there is no loss and \(W - aq - D + a\) if the loss occurs. So expected
wealth is
\[ (W - aq)(1 - \pi) + \pi(W - aq - D + a) = W - \pi D + a(\pi - q) \]
Utility maximization requires
\[ \max_a\ (1 - \pi)\,u(W - aq) + \pi\,u(W - aq - D + a) \]
The FOC is
\[ -q(1 - \pi)\,u'(W - aq) + \pi(1 - q)\,u'(W - D + a(1 - q)) \le 0, \quad = \text{ if } a > 0 \]
Assume the price of insurance is actuarially fair, that is it is equal to the
expected cost of the insurance. This means \(q = \pi\). Then the FOC requires
\[ u'(W - D + a(1 - \pi)) - u'(W - a\pi) \le 0, \quad = \text{ if } a > 0 \]
Since \(u'(W - D) > u'(W)\) it must be the case that \(a > 0\) and so
\[ u'(W - D + a(1 - \pi)) = u'(W - a\pi) \]
Because \(u'\) is strictly decreasing in its argument this means that
\[ W - D + a(1 - \pi) = W - a\pi \]
or equivalently
\[ a = D \]
which means that the agent insures fully. So if the insurance is actuarially
fair the risk-averse agent insures fully. Her wealth is then \(W - \pi D\) whether
the loss occurs or not.
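The full-insurance result can be reproduced with a simple grid search. This is a sketch under stated assumptions: log utility, wealth 100, loss 50 and loss probability 0.1 are illustrative choices, and `optimal_coverage` is a hypothetical helper that restricts coverage to the interval from 0 to \(D\).

```python
import math

def optimal_coverage(wealth, loss, prob_loss, price, u=math.log, grid=2000):
    """Grid search for the coverage a in [0, loss] that maximizes expected utility."""
    best_a, best_eu = 0.0, -math.inf
    for i in range(grid + 1):
        a = loss * i / grid
        no_loss_wealth = wealth - a * price
        loss_wealth = wealth - a * price - loss + a
        eu = (1 - prob_loss) * u(no_loss_wealth) + prob_loss * u(loss_wealth)
        if eu > best_eu:
            best_a, best_eu = a, eu
    return best_a

fair = optimal_coverage(100.0, 50.0, 0.1, price=0.1)     # q equal to the loss probability
loaded = optimal_coverage(100.0, 50.0, 0.1, price=0.15)  # q above the loss probability
print(fair, loaded)
```

With the fair price the search picks full coverage; once the price exceeds the loss probability, chosen coverage falls below the loss.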
Demand for a risky asset: There are two assets, a safe asset with a
return of 1 per dollar invested, and a risky one with a random return of \(z\) per
dollar invested. \(z\) has a distribution function \(F(z)\) which we assume satisfies
\(\int z\,dF(z) > 1\), so that its mean return exceeds that of the safe asset.
Wealth \(W\) can be invested in any way between the two assets, with \(a, b\)
the amounts invested in the risky and safe assets respectively, with \(a + b = W\).
For any realization of \(z\) the portfolio pays \(az + b\). The choice problem is
\[ \max_{a,b} \int u(az + b)\,dF(z) = \int u(W + a(z - 1))\,dF(z), \quad 0 \le a \le W \]
The solution \(a^*\) must satisfy
\[ \phi(a^*) = \int u'(W + a^*[z - 1])(z - 1)\,dF(z) \quad \le 0 \text{ if } a^* < W, \quad \ge 0 \text{ if } a^* > 0 \]
Note that \(\int z\,dF(z) > 1 \Rightarrow \phi(0) > 0\). So \(a^* = 0\) cannot satisfy this condition
and the optimal portfolio has \(a^* > 0\). Conclusion: if a risk is actuarially
favorable then a risk averter will always accept at least a small amount of it.
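A small numerical example shows a risk averter taking a positive position in an actuarially favorable asset. The two-point return distribution (0.5 or 1.6 with equal probability, mean 1.05), log utility and wealth of 100 are all illustrative assumptions, and `optimal_risky_holding` is a hypothetical helper.

```python
import math

def optimal_risky_holding(wealth, returns, probs, u=math.log, grid=1000):
    """Grid search over a in [0, wealth] for max E[u(wealth + a (z - 1))]."""
    best_a, best_eu = 0.0, -math.inf
    for i in range(grid + 1):
        a = wealth * i / grid
        eu = sum(p * u(wealth + a * (z - 1.0)) for z, p in zip(returns, probs))
        if eu > best_eu:
            best_a, best_eu = a, eu
    return best_a

a_star = optimal_risky_holding(100.0, returns=[0.5, 1.6], probs=[0.5, 0.5])
print(a_star)
```

For these numbers the first order condition gives \(a^* = 50/3 \approx 16.7\), strictly positive but far from all of wealth.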

Definition 28. Given a Bernoulli utility function \(u\) for money, the Arrow-Pratt coefficient of absolute risk aversion at \(x\) is defined as \(r_A(x) = -u''(x)/u'(x)\).
Note the negative sign in front: for concave functions this is always non-negative.
We know that risk neutrality is equivalent to linearity, and that risk
aversion seems to increase with the curvature of \(u\). The utility function can be
recovered from \(r_A\) by integrating twice, up to two integration constants. The
integration constants don't matter as \(u\) is unique only up to two constants
anyway.
Example: Consider the utility function \(u(x) = -e^{-ax}\) for \(a > 0\). Then
we have \(u'(x) = ae^{-ax}\) and \(u''(x) = -a^2 e^{-ax}\), so that \(r_A(x, u) = a\ \ \forall x\).
Given two utility functions \(u_1(x), u_2(x)\), when can we say that one is
more risk averse than the other?
Proposition 27. The following are all equivalent:
1. \(r_A(x, u_2) \ge r_A(x, u_1)\ \ \forall x\)
2. There exists an increasing concave function \(\psi(\cdot)\) such that \(u_2(x) = \psi(u_1(x))\ \ \forall x\): that is \(u_2\) is a concave transformation of \(u_1\).
3. \(c(F, u_2) \le c(F, u_1)\ \ \forall F(\cdot)\)
4. Whenever \(u_2(\cdot)\) finds a lottery \(F(\cdot)\) at least as good as a riskless outcome \(\bar{x}\), then
\(u_1(\cdot)\) also finds \(F(\cdot)\) at least as good as \(\bar{x}\). That is,
\(\int u_2(x)\,dF(x) \ge u_2(\bar{x}) \Rightarrow \int u_1(x)\,dF(x) \ge u_1(\bar{x})\ \ \forall F(\cdot), \bar{x}\).
Proof. We show that 1 and 2 are equivalent. Note that for some increasing
function \(\psi\) we always have \(u_2(x) = \psi(u_1(x))\) because the two represent the
same (increasing) ordering on \(\mathbb{R}^1\). Differentiating
\[ u_2'(x) = \psi'(u_1(x))\,u_1'(x) \]
and again
\[ u_2''(x) = \psi'(u_1(x))\,u_1''(x) + \psi''(u_1(x))\,(u_1'(x))^2 \]
Dividing both sides of the expression for \(u_2''\) by \(u_2'\) and using the first line we get
\[ r_A(x, u_2) = r_A(x, u_1) - \frac{\psi''(u_1(x))}{\psi'(u_1(x))}\,u_1'(x) \]
From this we note that
\[ r_A(x, u_2) \ge r_A(x, u_1) \iff \psi''(u_1) \le 0 \]

More-risk-averse-than is a transitive but incomplete ordering. Many Bernoulli
utility functions will be incomparable, that is we will have \(r_A(x, u_1) > r_A(x, u_2)\)
at some \(x\) and also \(r_A(x', u_1) < r_A(x', u_2)\) for some \(x' \ne x\).
Example
Next we will consider the portfolio choices of two risk-averse individuals
and show that the more risk-averse will always invest less in the risky asset.
As before, there are two assets, a safe asset with a return of 1 per dollar
invested, and a risky one with a random return of \(z\) per dollar invested. \(z\)
has a distribution function \(F(z)\) which we assume satisfies \(\int z\,dF(z) > 1\), so
that its mean return exceeds that of the safe asset.
Wealth \(W_i\) can be invested in any way between the two assets, with \(a_i, b_i\)
the amounts invested in the risky and safe assets respectively, with \(a_i + b_i = W_i\).
For any realization of \(z\) the portfolio pays \(a_i z + b_i\). The choice problem
is
\[ \max_{a_i, b_i} \int u_i(a_i z + b_i)\,dF(z) = \int u_i(W_i + a_i(z - 1))\,dF(z), \quad 0 \le a_i \le W_i \]
For interior solutions the solutions \(a_i^*\) must satisfy
\[ \phi_i(a_i^*) = \int u_i'(W_i + a_i^*[z - 1])(z - 1)\,dF(z) = 0 \]
The concavity of \(u_2\) implies that \(\phi_2\) is decreasing, so if we show that \(\phi_2(a_1^*) < 0\),
it follows that \(a_2^* < a_1^*\), which is what we want to show.
Now \(u_2(x) = \psi(u_1(x))\) where \(\psi\) is a concave function. Hence, taking the two
individuals to have the same wealth \(W_2 = W_1\),
\[ \phi_2(a_1^*) = \int (z - 1)\,\psi'(u_1(W_1 + a_1^*[z - 1]))\,u_1'(W_1 + a_1^*[z - 1])\,dF(z) < 0 \]
The final inequality follows from the first order condition for individual 1, recalling that in
this case we have the integrand of the FOC multiplied by \(\psi'\), a positive decreasing function of
\(z\), which puts less weight on the positive terms (\(z > 1\)) than on the negative ones.
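The comparison can be illustrated by solving the first order condition directly. This sketch assumes CARA utilities \(u_i(x) = -e^{-r_i x}\) (so a higher \(r_i\) means more risk averse), a made-up two-point return distribution and common wealth of 100; `phi` and `interior_demand` are hypothetical helper names, and bisection relies on \(\phi_i\) being decreasing.

```python
import math

RETURNS = [0.5, 1.6]   # hypothetical gross returns on the risky asset
PROBS = [0.5, 0.5]     # mean return 1.05 > 1

def phi(a, wealth, r):
    """phi(a) = E[u'(wealth + a (z - 1)) (z - 1)] for u(x) = -exp(-r x)."""
    return sum(p * r * math.exp(-r * (wealth + a * (z - 1.0))) * (z - 1.0)
               for z, p in zip(RETURNS, PROBS))

def interior_demand(wealth, r, tol=1e-10):
    """Bisection on the decreasing function phi to find phi(a*) = 0 on [0, wealth]."""
    lo, hi = 0.0, wealth
    if phi(hi, wealth, r) >= 0.0:
        return wealth          # corner solution: invest everything
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid, wealth, r) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a_less_averse = interior_demand(100.0, r=0.01)
a_more_averse = interior_demand(100.0, r=0.02)
print(a_less_averse, a_more_averse)
```

For this distribution the FOC solves in closed form as \(a^* = \ln(1.2)/(1.1\,r)\), so doubling the coefficient of absolute risk aversion halves the risky holding.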
Definition 29. The Bernoulli utility function u (.) for money exhibits decreasing absolute risk aversion if rA (x, u) is a decreasing function of x.
People whose preferences show decreasing absolute risk aversion take more
risk as they become richer.


Proposition 28. The following properties are equivalent:
1. The Bernoulli utility function \(u\) exhibits decreasing absolute risk aversion
2. Whenever \(x_2 < x_1\), \(u_2(z) = u(x_2 + z)\) is a concave transformation of
\(u_1(z) = u(x_1 + z)\)
3. For any risk \(F(z)\) the certainty equivalent of the lottery formed by
adding risk \(z\) to wealth level \(x\), given by the amount \(c_x\) at which
\(u(c_x) = \int u(x + z)\,dF(z)\), is such that \(x - c_x\) is decreasing in \(x\). So the higher is \(x\),
the less the person is willing to pay to remove the risk.
4. For any \(F(z)\), if \(\int u(x_2 + z)\,dF(z) \ge u(x_2)\) and \(x_2 < x_1\) then
\(\int u(x_1 + z)\,dF(z) \ge u(x_1)\).

Next we look at another concept of risk aversion, relative rather than
absolute. This is a measure of aversion to proportional fluctuations in wealth,
rather than absolute fluctuations.
Definition 30. The coefficient or index of relative risk aversion (RRA) is
\(r_R(x, u) = -x u''(x)/u'(x)\).
Decreasing relative risk aversion [\(r_R\) decreasing with \(x\)] means that a
person becomes less averse to a given proportional risk as her income rises.
Examples of constant IRRA utility functions:
\(u(C) = \log C\), \(u' = 1/C\), \(u'' = -1/C^2\), IRRA \(= 1\).
\(u(C) = C^{1-\eta}/(1-\eta)\), \(u' = C^{-\eta}\), \(u'' = -\eta C^{-\eta-1}\), IRRA \(= \eta\). If
\(\eta < 1\) then \(u(C) > 0\) and \(u\) is unbounded: if \(\eta > 1\) then \(u < 0\) and \(u\) is
bounded above.
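The closed-form coefficients in these examples can be checked by finite differences. This is a rough numerical sketch: the step size, the evaluation points and the parameter values (\(a = 0.5\), \(\eta = 3\)) are arbitrary, and the function names are hypothetical.

```python
import math

def derivatives(u, x, h=1e-4):
    """Central finite-difference approximations to u'(x) and u''(x)."""
    first = (u(x + h) - u(x - h)) / (2.0 * h)
    second = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
    return first, second

def absolute_risk_aversion(u, x):
    first, second = derivatives(u, x)
    return -second / first

def relative_risk_aversion(u, x):
    return x * absolute_risk_aversion(u, x)

def cara(x, a=0.5):
    return -math.exp(-a * x)            # u(x) = -e^{-ax}

def crra(c, eta=3.0):
    return c ** (1.0 - eta) / (1.0 - eta)

for x in (1.0, 2.0, 5.0):
    print(x,
          absolute_risk_aversion(cara, x),   # close to a = 0.5 at every x
          relative_risk_aversion(math.log, x),  # close to 1
          relative_risk_aversion(crra, x))   # close to eta = 3
```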
Proposition 29. The following are equivalent for a Bernoulli utility function
\(u(\cdot)\):
1. \(r_R(x, u)\) is decreasing in \(x\)
2. Whenever \(x_2 < x_1\), \(u_2(t) = u(t x_2)\) is a concave transform of \(u_1(t) = u(t x_1)\)
3. Given any risk \(F(t)\) on \(t > 0\), the certainty equivalent \(\bar{c}_x\) defined by \(u(\bar{c}_x) = \int u(tx)\,dF(t)\)
is such that \(x/\bar{c}_x\) is decreasing in \(x\).
Proof. We will show that 1. implies 3. Pick a distribution \(F(t)\) over \(t\), and
for any \(x\) define \(u_x(t) = u(tx)\). Let \(c(x)\) be the usual certainty equivalent
from definition 27: \(u_x(c(x)) = \int u_x(t)\,dF(t)\). Note that as \(u_x'(t) = x u'(tx)\)
\[ -\frac{u_x''(t)}{u_x'(t)} = -\frac{1}{t}\,tx\,\frac{u''(tx)}{u'(tx)} \]
for any \(x\). Hence if 1. holds then \(u_{x'}\) is less risk averse than \(u_x\) whenever
\(x' > x\). It follows from proposition 27 that \(c(x') > c(x)\) and so \(c\) is increasing.
By the definition of \(u_x\), \(u_x(c(x)) = u(x c(x))\). In addition
\[ u_x(c(x)) = \int u_x(t)\,dF(t) = \int u(tx)\,dF(t) = u(\bar{c}_x) \]
Hence \(\bar{c}_x/x = c(x)\) and so \(x/\bar{c}_x\) is decreasing, completing the proof.
10.1.1 Mean-Variance

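Next a simple illustration of the role that the index of RRA can play. Let
\(y\) be a random variable distributed as \(F(y)\) with mean \(\mu_y\) and variance \(\sigma^2\). The expected
utility associated with this is \(Eu = \int u(y)\,dF(y)\) and define \(x\) so that
\[ u(\mu_y - x) = Eu = \int u(y)\,dF(y) \]
Here \(x\) is the cost of risk bearing, the difference between the mean \(\mu_y\) of the prospect \(F\) and its certainty equivalent. Taking a second-order Taylor expansion of \(u\) around \(\mu_y\),
\[ u(\mu_y - x) - u(\mu_y) = \int \left[u(y) - u(\mu_y)\right] dF(y) \approx \int \left[u'(\mu_y)(y - \mu_y) + \frac{u''(\mu_y)}{2}(y - \mu_y)^2\right] dF(y) \]
so that, since the first-order term integrates to zero,
\[ u(\mu_y - x) - u(\mu_y) \approx \frac{u''(\mu_y)}{2}\int (y - \mu_y)^2\,dF(y) = \frac{u''(\mu_y)}{2}\,\sigma^2 \]
and we can rewrite the LHS to first order to give
\[ -u'(\mu_y)\,x \approx \frac{u''(\mu_y)}{2}\,\sigma^2, \qquad x \approx -\frac{1}{2}\,\frac{\mu_y u''}{u'}\,\frac{\sigma^2}{\mu_y} = \frac{1}{2}\,\eta\,\frac{\sigma^2}{\mu_y} \]
where \(\eta = -\mu_y u''/u'\) is the IRRA evaluated at the mean.
So the cost of risk bearing is one half of the variance of the risk over the
mean outcome times the IRRA. It is also the index of absolute risk aversion
times half the variance.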
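To see how good this approximation is for a small risk, the sketch below compares the exact cost of risk bearing with one half of absolute risk aversion times the variance. Log utility and a 50/50 gamble over 90 and 110 are illustrative assumptions; the function names are hypothetical.

```python
import math

def exact_risk_cost(outcomes, probs, u, u_inverse):
    """x solving u(mean - x) = E[u(y)], i.e. mean minus the certainty equivalent."""
    mean = sum(p * y for y, p in zip(outcomes, probs))
    expected_u = sum(p * u(y) for y, p in zip(outcomes, probs))
    return mean - u_inverse(expected_u)

def approximate_risk_cost(outcomes, probs, abs_risk_aversion):
    """Second-order approximation: 0.5 * r_A(mean) * variance."""
    mean = sum(p * y for y, p in zip(outcomes, probs))
    variance = sum(p * (y - mean) ** 2 for y, p in zip(outcomes, probs))
    return 0.5 * abs_risk_aversion(mean) * variance

outcomes, probs = [90.0, 110.0], [0.5, 0.5]
exact = exact_risk_cost(outcomes, probs, math.log, math.exp)
approx = approximate_risk_cost(outcomes, probs, lambda y: 1.0 / y)  # r_A = 1/y for log utility
print(exact, approx)
```

The exact cost is about 0.501 and the approximation gives 0.5; the gap widens as the risk gets larger relative to the mean.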
We can write
\[ Eu = u(\mu_y - x) \approx u(\mu_y) + \frac{1}{2}\,u''(\mu_y)\,\sigma^2 \]
which is a function of the mean outcome \(\mu_y\) and its variance \(\sigma^2\), and from
this we can use the implicit function theorem to get
\[ \frac{\partial \mu_y}{\partial \sigma} = -\frac{u''\sigma}{u' + 0.5u'''\sigma^2} \]
Assuming \(u''' \approx 0\), so the utility is almost quadratic, we have
\[ \frac{\partial \mu_y}{\partial \sigma} = -\frac{u''\sigma}{u'} > 0 \]
as the slope of an indifference curve in \(\mu_y\)-\(\sigma\) space. So this is linear if
the index of absolute risk aversion (IARA) is constant and defines convex
preferred-or-indifferent sets if the IARA is increasing.

10.2 Comparison of payoffs in terms of return and risk

In comparing risky choices, we can ask two different questions: is one more
rewarding than the other, in terms of offering better outcomes, and is one
more risky than the other?
First we formalize the idea that distribution F yields unambiguously
higher returns than distribution G. We assume distributions satisfy F (0) = 0
and F (x) = 1 for some x. Two possible approaches: one to ask whether every
expected utility maximizer whose utility is increasing in income will prefer
one to the other, and the second is to ask if for every amount of money x the
probability of getting at least x is greater under one than under the other.
Both approaches lead to the same concept.
Definition 31. The distribution \(F\) first order stochastically dominates \(G\) if,
for every non-decreasing function \(u : \mathbb{R} \to \mathbb{R}\),
\[ \int u(x)\,dF(x) \ge \int u(x)\,dG(x) \]
Proposition 30. The distribution \(F\) first order stochastically dominates distribution \(G\) if and only if \(F(x) \le G(x)\ \ \forall x\).
Proof. Note first that if \(f(x), g(x)\) are the pdfs of \(F(x), G(x)\) and \([a, b]\)
contains the supports of both distributions then integrating by parts
\[ \int_a^b u(x) f(x)\,dx = \left[u(x) F(x)\right]_a^b - \int_a^b u'(x) F(x)\,dx \]
which reduces, since \(u(b)F(b) - u(a)F(a) = u(b) \cdot 1 - u(a) \cdot 0\), to
\[ \int_a^b u(x) f(x)\,dx = u(b) - \int_a^b u'(x) F(x)\,dx \]
So comparing the expected utility under the two distributions gives
\[ \int_a^b u(x) f(x)\,dx - \int_a^b u(x) g(x)\,dx = -\int_a^b u'(x) F(x)\,dx + \int_a^b u'(x) G(x)\,dx = \int_a^b u'(x)\left[G(x) - F(x)\right] dx \]
We want this to be positive for all increasing functions \(u\): \(u'(x) > 0\ \ \forall x \in [a, b]\).
Clearly this is true if \(G(x) > F(x)\ \ \forall x \in (a, b)\). So we have shown that
\(G(x) \ge F(x)\) implies that \(F\) first order stochastically dominates \(G\).
Now the reverse: \(F\) FOSD \(G\) \(\Rightarrow G(x) \ge F(x)\). Suppose to the contrary
that \(\exists x_0 : F(x_0) > G(x_0)\). Then we can construct a function \(u\) for which \(u'\)
is very large where \(F > G\) and very small elsewhere. This makes the integral
on the RHS negative, so that \(F\) does not dominate \(G\).
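For discrete distributions on a common grid of outcomes the CDF condition is a one-line check. The sketch below uses two made-up distributions on the outcomes 0, 1, 2, 3 and, as one spot check, an increasing utility \(u(x) = \sqrt{x}\); the function names are hypothetical.

```python
import math
from itertools import accumulate

def cdf(probs):
    """Cumulative distribution on a common, increasing grid of outcomes."""
    return list(accumulate(probs))

def first_order_dominates(f_probs, g_probs, tol=1e-12):
    """True if F(x) <= G(x) at every grid point, i.e. F FOSD G."""
    return all(F <= G + tol for F, G in zip(cdf(f_probs), cdf(g_probs)))

def expected_utility(outcomes, probs, u):
    return sum(p * u(x) for x, p in zip(outcomes, probs))

outcomes = [0.0, 1.0, 2.0, 3.0]
f_probs = [0.10, 0.20, 0.30, 0.40]   # more mass on high outcomes
g_probs = [0.25, 0.25, 0.25, 0.25]

print(first_order_dominates(f_probs, g_probs))   # F dominates G
print(first_order_dominates(g_probs, f_probs))   # G does not dominate F
print(expected_utility(outcomes, f_probs, math.sqrt),
      expected_utility(outcomes, g_probs, math.sqrt))
```

A single utility function cannot prove dominance, but any increasing \(u\) must rank \(F\) at least as high as \(G\) once the CDF condition holds.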
Next we turn to a discussion of when one distribution is more risky than
another, and of second order stochastic dominance. We compare only distributions with the same mean.
Definition 32. For any two distributions \(F\) and \(G\) with the same mean, \(F\)
second order stochastically dominates \(G\) (or is less risky than \(G\)) if for every
non-decreasing concave function \(u : \mathbb{R}_+ \to \mathbb{R}\) we have
\[ \int u(x)\,dF(x) \ge \int u(x)\,dG(x) \]
Next we discuss an alternative way of characterizing second order stochastic dominance, using the idea of a mean-preserving spread.
Definition 33. Distribution G is a mean-preserving spread of distribution
F if G is the reduction of a compound lottery made up of the distribution
F with an additional lottery so that when F selects x the final outcome is
x + z where z is a random variable whose mean is zero.
Proposition 31. Consider two distributions \(F\) and \(G\) with the same mean.
Then the following statements are equivalent.
1. \(F(\cdot)\) second order stochastically dominates \(G(\cdot)\)
2. \(G(\cdot)\) is a mean-preserving spread of \(F(\cdot)\)
3. \(\int_0^x G(t)\,dt \ge \int_0^x F(t)\,dt\ \ \forall x\)
For a good clear exposition of this issue see
www.princeton.edu/~dixitak/Teaching/EconomicsOfUncertainty/Slides&Notes/Notes04.pdf
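The integral condition is straightforward to check for lotteries on the integer outcomes 0, 1, 2, ..., where the integrated CDF is piecewise linear and only needs to be compared at integer points. The sketch below uses a made-up sure outcome of 2 and its mean-preserving spread (1 or 3 with equal probability); the function names are hypothetical.

```python
from itertools import accumulate

def integrated_cdf(probs):
    """Integral of F from 0 to x at x = 1, 2, ... for a lottery on outcomes 0, 1, 2, ..."""
    return list(accumulate(accumulate(probs)))

def mean(probs):
    return sum(x * p for x, p in enumerate(probs))

def second_order_dominates(f_probs, g_probs, tol=1e-12):
    """True if F and G have equal means and int_0^x G >= int_0^x F for all x."""
    if abs(mean(f_probs) - mean(g_probs)) > tol:
        return False
    return all(IG >= IF - tol
               for IF, IG in zip(integrated_cdf(f_probs), integrated_cdf(g_probs)))

f_probs = [0.0, 0.0, 1.0, 0.0, 0.0]   # outcome 2 for sure
g_probs = [0.0, 0.5, 0.0, 0.5, 0.0]   # spread: 1 or 3 with equal probability

print(second_order_dominates(f_probs, g_probs))   # F is less risky than G
print(second_order_dominates(g_probs, f_probs))
```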

10.3 A Geometric Approach to Insurance

Figure 10.1:
Figure 10.1 gives a geometric way of thinking about insurance. There are
two states, 1 and 2. It is not certain which will occur, and their respective
probabilities are \(p_1, p_2\). The consumer's initial endowment is at the point \(z_1\)
giving \(z_{11}\) in state 1 and \(z_{12}\) in state 2. The 45 degree line shows situations
where income is the same in each state, and these are therefore fully-insured
positions. The consumer's expected utility is given by
\[ u(z_{11})\,p_1 + u(z_{12})\,p_2 \]
and the slope of an indifference curve is therefore
\[ -\frac{p_1 u'(z_{11})}{p_2 u'(z_{12})} \]
On the 45 degree line, \(z_{11} = z_{12}\) so this slope is, in absolute value, just \(p_1/p_2\), the ratio of the
probabilities.
Now consider the move from the initial position \(z_1\) to the fully insured
position \(z_2\). This involves selling \(z_1 - z_0\) of income in state 1 and buying
\(z_2 - z_0\) of income in state 2. This transaction will move the consumer to a
fully insured position. What is the expected value of this transaction? The
probability of state 1 is \(p_1\) so the probability of giving up \(z_1 - z_0\) is \(p_1\), and the
probability of state 2 and so of acquiring \(z_2 - z_0\) is \(p_2\). So the expected value
of this transaction is \(-p_1(z_1 - z_0) + p_2(z_2 - z_0)\), which is zero if
\[ \frac{p_1}{p_2} = \frac{z_2 - z_0}{z_1 - z_0} \]
so the transaction is actuarially fair if the slope of the budget line, which is the
right hand side here, equals the probability ratio. As the slope of an indifference
curve is always equal to the probability ratio on the 45 degree line, the offer of
actuarially fair insurance will always be accepted and lead to full insurance.
Note that in this context convexity of the preferred-or-indifferent sets is
equivalent to risk aversion: it implies a preference for moving towards the 45
degree line.

11 Discount Rates and the Elasticity of Marginal Utility

We have discussed the index of relative risk aversion, \(-x u''(x)/u'(x)\). This
parameter is also important in other areas of economics, where it is known as
the elasticity of the marginal utility of consumption and generally denoted
\(\eta(c)\). To see why it is called this note that
\[ \eta(c) = -\frac{du'(c)}{dc}\,\frac{c}{u'(c)} = -\frac{c u''}{u'} \]
Note also that the proportional rate of change of the present value of marginal
utility \(u'(c_t)\,e^{-\delta t}\) (where \(c_t\) is a function of time \(t\)) is given by
\[ \frac{d \ln\left[u'(c_t)\,e^{-\delta t}\right]}{dt} = -\eta(c)\,\frac{1}{c}\frac{dc}{dt} - \delta = -\eta g - \delta \]
where \(g = \frac{1}{c}\frac{dc}{dt}\).
Now consider the problem
\[ \max_c \int_0^\infty u(c_t)\,e^{-\delta t}\,dt, \qquad c_t + \frac{dk}{dt} = f(k) \]
where we are maximizing the integral of the utility of consumption, discounted at rate \(\delta \ge 0\), subject to the constraint that consumption plus
investment \(dk/dt\) adds up to output \(f(k)\) where \(k\) is the capital stock and
\(f\) a strictly concave production function. This is called the Ramsey problem
or the optimal growth problem.
To solve this problem we use a Hamiltonian:
\[ H = u(c_t)\,e^{-\delta t} + \lambda_t e^{-\delta t}\left[f(k_t) - c_t\right] \]
where \(\lambda_t e^{-\delta t}\) is a time-varying shadow price, \(c_t\) is called the control variable
and \(k_t\) the state variable, and the expression multiplied by the shadow price
is the rate of change of the state variable.
First order conditions (necessary conditions) for a path of \(c_t, k_t\) to solve
this problem are that
\[ \frac{\partial H}{\partial c_t} = 0\ \ \forall t, \qquad \frac{d\left(\lambda_t e^{-\delta t}\right)}{dt} = -\frac{\partial H}{\partial k_t} \]
If all functions are concave these conditions, together with one additional technical
condition known as a transversality condition, are not only necessary but also
sufficient.
Applying these conditions gives
\[ u'(c_t) = \lambda_t \]
and
\[ \frac{d\lambda_t}{dt} - \delta\lambda_t = -\lambda_t f'(k_t), \qquad \frac{d\lambda_t}{dt} = 0 \Rightarrow \delta = f'(k) \]
from which, since \(\lambda_t = u'(c_t)\) implies \(\frac{1}{\lambda_t}\frac{d\lambda_t}{dt} = -\eta g\),
\[ \eta g + \delta = f'(k_t) \]
So the return on capital, \(f'(k)\), has to equal the elasticity of MU times the
growth rate of consumption plus the discount rate. And the LHS here is the
rate at which the present value of the marginal utility of consumption declines.
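The condition is easy to evaluate numerically. The sketch below assumes a Cobb-Douglas production function \(f(k) = k^\alpha\), which is not specified in the text, and illustrative parameter values; the function names are hypothetical.

```python
def required_return(eta, growth, delta):
    """Return on capital implied by the condition f'(k) = eta * g + delta."""
    return eta * growth + delta

def steady_state_capital(alpha, delta):
    """Capital stock at which f'(k) = delta when f(k) = k**alpha (zero consumption growth)."""
    return (alpha / delta) ** (1.0 / (1.0 - alpha))

# Illustrative numbers: eta = 2, consumption growth 2% and discount rate 1%.
print(required_return(eta=2.0, growth=0.02, delta=0.01))

k_star = steady_state_capital(alpha=0.3, delta=0.03)
print(k_star, 0.3 * k_star ** (0.3 - 1.0))   # marginal product equals delta at k_star
```

With these numbers the required return on capital is 5%, four fifths of which comes from the growth term rather than from pure time preference.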

12 State-Dependent Preferences

So far we have assumed that lotteries deliver money and that preferences are
over amounts of money, with no other characteristics mattering. It may be
however that the circumstances under which money is delivered matter: for
example, money if you have lost your house in an earthquake or hurricane
may be more valuable than money if you just won the lottery. An umbrella
is much more useful if it is about to rain than on a hot dry day. We will call
circumstances such as whether it is dry or raining, or whether there is an
earthquake or hurricane, states of nature. \(S\) is the set of all possible states
of nature (earthquakes, hurricanes, dry, wet, ...), assumed to be finite, non-intersecting and exhaustive, and \(s \in S\) is a particular state (earthquake etc).
The probability of \(s\) occurring is \(\pi_s\).
Definition 34. A random variable is a function \(g : S \to \mathbb{R}_+\) that maps
states into monetary outcomes.
Every random variable \(g(\cdot)\) gives rise to a money lottery described by the
distribution function \(F(\cdot)\) where \(F(x) = \sum_{\{s : g(s) \le x\}} \pi_s\) for all \(x\). A random
variable can now be represented by a vector of the monetary payoffs it gives
in each of the states in \(S\), denoted \((x_1, \dots, x_S)\) where we are using \(S\) also to
denote the number of states in \(S\). The set of all random variables is now \(\mathbb{R}_+^S\).
In this framework the primitive concept is a preference ordering over the
set of all random variables, \(\mathbb{R}_+^S\). Such a preference can be represented by an
expected utility function as before, with one difference: the utility function
can now depend on the state of nature. So we have
Definition 35. The preference relation \(\succsim\) on \(\mathbb{R}_+^S\)
has an expected utility representation if for every \(s \in S\) there is a function \(u_s : \mathbb{R}_+ \to \mathbb{R}\) such that for any
\((x_1, \dots, x_S)\) and \((x'_1, \dots, x'_S) \in \mathbb{R}_+^S\),
\[ (x_1, \dots, x_S) \succsim (x'_1, \dots, x'_S) \iff \sum_s \pi_s u_s(x_s) \ge \sum_s \pi_s u_s(x'_s). \]

Consider a world with two states only, \(s_1, s_2\). Let \(x_1, x_2\) be monetary
amounts delivered in each state. Then if \(x_1 = x_2\) we have the same payoff
in each state and there is no uncertainty about the payoff. The marginal
rate of substitution between \(x_1, x_2\) is now \(\pi_1 u_1'(x)/\pi_2 u_2'(x)\) and if the utility function is state-independent then this reduces to the ratio of the probabilities.
Generally it depends on probabilities and preferences.
With state-dependent preferences it is no longer the case that a risk-averse person will always purchase actuarially fair insurance. Insurance that
pays \(p_i\) in state \(i\) is actuarially fair if \(\pi_1 p_1 + \pi_2 p_2 = 0\), that is \(p_2 = -p_1 \pi_1/\pi_2\). This
means that the budget line for insurance has the slope \(-\pi_1/\pi_2\), and this is
the slope of an indifference curve on the risk free line only if preferences are
not state-dependent.
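A worked case shows how state dependence breaks full insurance. The sketch assumes a specific and purely illustrative form of state dependence: \(u_1(x) = \log x\) in the no-loss state and \(u_2(x) = \theta \log x\) in the loss state, with actuarially fair coverage \(a\) costing \(\pi a\) where \(\pi\) is the loss probability. Under these assumptions the first order condition reduces to \(x_2 = \theta x_1\), which gives the closed form used below; `fair_coverage` is a hypothetical helper.

```python
def fair_coverage(wealth, loss, prob_loss, theta):
    """Optimal actuarially fair coverage when u1 = log(x) and u2 = theta * log(x).

    The FOC x2 = theta * x1, with x1 = wealth - prob_loss * a and
    x2 = wealth - loss + (1 - prob_loss) * a, is solved for a;
    negative solutions are truncated at zero (no insurance bought).
    """
    a = (theta * wealth - (wealth - loss)) / ((1.0 - prob_loss) + theta * prob_loss)
    return max(0.0, a)

state_independent = fair_coverage(100.0, 30.0, 0.1, theta=1.0)
state_dependent = fair_coverage(100.0, 30.0, 0.1, theta=0.8)
print(state_independent, state_dependent)
```

With \(\theta = 1\) the agent covers the whole loss of 30; with \(\theta = 0.8\), where money is worth less at the margin in the loss state, she buys only about 10.2 units even though the price is fair.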

13 Subjective Probabilities

Von Neumann and Morgenstern took probabilities as given, objective, and
deduced the existence of preferences such that agents maximize the expectation of utility at these probabilities.
de Finetti, an Italian probabilist, worked the other way, and derived
probabilities from agents' behavior. He said: suppose you have to set the
price of a promise to pay $1 if there was life on Mars one billion years ago,
and nothing otherwise. Your opponent can then choose which side of this
bet you are on and which side he is on. de Finetti defined the price you
set for this promise as your subjective probability of there having been
life on Mars one billion years ago. Here we have deduced probabilities from
observed behavior. They are not based on experimental evidence and can
clearly differ from person to person.
Savage (The Foundations of Statistics, 1954) developed a theory of choice under
uncertainty in which both preferences and probabilities are deduced from
behavior. He postulated a larger and more demanding set of axioms, and
in exchange for this greater complexity proved more: that people under
these conditions will behave as if they have both preferences and personal
probabilities and maximize their expected utilities using these preferences
and probabilities.
Savage's framework is as follows. Primitive concepts are states and outcomes. The set of states \(s \in S\) is an exhaustive list of all scenarios that might
unfold. Knowing which state occurs resolves all uncertainty. An event is
any subset \(A \subset S\). The set of outcomes is \(X\), typical member \(x \in X\). An
outcome specifies everything that affects the chooser's well-being.
The objects of choice are acts, which are functions from states to outcomes, and acts are denoted \(f \in F\), \(f : S \to X\). The state is uncertain and
so not known when the act is chosen, but the actor does know that if the
state is \(s\) then the outcome is \(f(s)\).
Acts whose payoffs do not depend on the state of the world \(s\) are constant
functions in \(F\). We will use the notation \(x \in F\) to indicate the constant
function in \(F\) whose outcome is always equal to \(x \in X\). Suppose \(f, g\) are two
acts and \(A\) is an event: then we define a new act by
\[ f_A^g(s) = \begin{cases} g(s), & s \in A, \\ f(s), & s \in A^c \end{cases} \]
Intuitively this is \(f\) but replaced on \(A\) by \(g\).
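Acts and the spliced act are simple to represent concretely. In the sketch below an act is a dictionary from states to outcomes and an event is a set of states; the weather states, the outcomes and the helper name `splice` are made up for illustration.

```python
def splice(f, g, event):
    """Return the act that equals g on the event A and f on its complement."""
    return {state: (g[state] if state in event else f[state]) for state in f}

# Hypothetical state space and acts.
states = ["rain", "cloud", "sun"]
f = {"rain": "stay home", "cloud": "walk", "sun": "beach"}
g = {"rain": "cinema", "cloud": "museum", "sun": "hike"}
A = {"rain", "cloud"}          # the event "not sunny"

f_A_g = splice(f, g, A)
print(f_A_g)   # g's outcomes on A, f's outcome in the remaining state
```

Constant acts fit the same representation: a dictionary that assigns the same outcome to every state.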

13.1 Savage's Axioms

Axiom P1. Preferences are a complete transitive relation on \(F\).
Axiom P2. Preferences between two acts \(f, g\) depend only on the values
of \(f, g\) where they differ.
Let \(A\) be an event and \(A^c\) its complement. Suppose \(f, g\) are equal if \(A\)
does not occur, that is on \(A^c\), so they differ only on \(A\). Alter \(f, g\) only on
\(A^c\) to get \(f', g'\) such that \(f(s) = f'(s)\), \(g(s) = g'(s)\), \(s \in A\). So there is
no change on \(A\). They are still equal off \(A\), though not necessarily to \(f, g\):
\(f(s) = g(s)\), \(f'(s) = g'(s)\), \(s \in A^c\).
Then P2 requires:
\[ f \succsim g \iff f' \succsim g' \]
This is sometimes written \(f \succsim_A g\), \(f\) preferred or indifferent to \(g\) given
\(A\). This is often referred to as Savage's sure thing principle.
Axiom P3. If you take an act that guarantees an outcome \(x\) on an
event \(A\) and you change it on \(A\) from \(x\) to another outcome \(y\), the preference
between the two acts should follow the preference between the two outcomes.
Formally let \(f_A^x\) be an act that produces \(x\) for every state in \(A\). A null event
is, roughly, one that is thought to be impossible. Formally an event \(A\) is
null if whenever two acts yield the same outcome off \(A\) they are ranked as
equivalent. For every act \(f \in F\), every non-null event \(A \subset S\), and \(x, y \in X\),
\[ x \succsim y \iff f_A^x \succsim f_A^y \]
Here \(x, y\) can also be interpreted as the acts that yield \(x, y\) respectively in
every state. This is a monotonicity assumption. Another interpretation
is that rankings should be independent of the events with which they are
associated (note the requirement that this hold for every non-null event \(A\)).
Axiom P4. For every \(A, B \subset S\) and every \(x, y, z, w \in X\) with \(x \succ y\) and \(z \succ w\),
\[ y_A^x \succsim y_B^x \iff w_A^z \succsim w_B^z \]
This is an axiom about probabilities: presumably \(y_A^x \succsim y_B^x\) means that
you think event \(A\) is more likely than event \(B\).
Axiom P5. There are \(f, g\) such that \(f \succ g\).

Axiom P6. For every f, g, h F with f  g there exists a partition of S [a collection of pairwise disjoint events whose union is S] denoted
{A1 , A2 , .., An } such that for every i
fAhi  g & f  gAh i
This is roughly like a continuity assumption, but it is hard to state continuity
in Savages framework.
Axiom P7. Consider acts $f, g \in F$ and an event $A \subset S$. If for every $s \in A$, $f \succsim_A g(s)$ then $f \succsim_A g$, and if for every $s \in A$, $g(s) \succsim_A f$, then $g \succsim_A f$.
Proposition 32. [Savage] Assume that $X$ is finite. Then $\succsim$ satisfies P1 to P6 if and only if there exists a probability measure $\mu$ on states $S$ and a non-constant utility function $u : X \to \mathbb{R}$ such that for every $f, g \in F$,
\[ f \succsim g \iff \int_S u(f(s))\, d\mu(s) \ge \int_S u(g(s))\, d\mu(s) \]
Furthermore $\mu$ is unique and $u$ is unique up to positive linear transformations.


As a generalization we also have the same result without the assumption that $X$ is finite, that is for infinite outcome sets, if we use P7:
Proposition 33. [Savage] $\succsim$ satisfies P1 to P7 if and only if there exists a probability measure $\mu$ on states $S$ and a non-constant utility function $u : X \to \mathbb{R}$ such that for every $f, g \in F$,
\[ f \succsim g \iff \int_S u(f(s))\, d\mu(s) \ge \int_S u(g(s))\, d\mu(s) \]
Furthermore $\mu$ is unique and $u$ is unique up to positive linear transformations.


These results look at first sight like the von Neumann-Morgenstern result, but in fact they are far stronger: here the probabilities are derived from preferences rather than given in advance.
We use this theorem just as we use the vNM one, and can go through the same mechanisms with it, but the applicability is greater. These axioms produce a preference representation with a separation of preferences (utility function) from beliefs (probabilities). (Are preferences and beliefs really separate?)
An illustration of the sure thing principle, P2: here are four bets
1. If horse A wins you get a trip to Paris, and otherwise you get a trip to Rome
2. If horse A wins you get a trip to London and otherwise a trip to Rome
3. If horse A wins you get a trip to Paris and otherwise a trip to Los Angeles
4. If horse A wins you get a trip to London and otherwise a trip to Los Angeles
Clearly 1 and 2 are the same if A loses. Generally your choice will depend on preferences and beliefs or probabilities, but presumably the chance of A winning is the same in each case, so the choice depends on your preferences between Paris and London. The same is true for 3 and 4, and axiom P2 requires $1 \succsim 2 \iff 3 \succsim 4$. If two acts are equal on a given event, it does not matter what they are equal to. So it doesn't matter if when the horse loses you get Rome or LA.

14 Non-Expected Utility Approaches

Suppose agents cannot derive subjective probabilities: they have no basis at all for assigning probabilities to events. An example is the probability that there is life in the universe on a planet other than ours within $10^{12}$ light years from us. How can we then describe rational choice under uncertainty? Another example is the Ellsberg paradox, where the numbers of black and yellow balls are unknown. Here is another version of the Ellsberg paradox.
There are two urns, each with 100 balls. Urn 1 has 50 black and 50 red. The numbers of red and black in urn 2 are not known. You are asked whether you would rather bet on a red ball being withdrawn from urn 1 than from urn 2: most people reply yes. Then you are asked whether you would rather bet on a black ball being withdrawn from urn 1 than from urn 2: again most people answer yes. But this is inconsistent with probabilistic reasoning. If you would sooner bet on a red ball being taken from urn 1 than urn 2 you must believe that a red ball is more likely to be drawn from urn 1 than from urn 2: the probability of red from urn 1 is 0.5, so the probability of red from urn 2 is less than 0.5. Similarly for a black ball: if you prefer to bet on urn 1 then you must think a black ball is more likely from urn 1 than from urn 2, and its chance from urn 1 is 0.5 and so it is less than this from urn 2. So the chances of both red and black from urn 2 are less than 0.5. But they have to sum to 1.
Here is a related example. You have to bet on the toss of a coin. There
are two coins and you can choose which to toss. One has been tossed many
times and came down heads 50% of them. The other has never been tested.
Which would you rather bet on? In one case you know that the odds are
50/50: in the other case this is a reasonable assumption but you have no
evidence. Most people would sooner bet on the tested coin.
In both of these examples people cannot quantify probabilities and stay
away from bets involving unquantified risks.

14.1 MinMax Approaches

One set of approaches to these problems is to ignore probabilistic information. The classic approach of this type is the maxmin approach, due to Wald. Recall that $f$ is an act that maps states $S$ to outcomes $X$. We say that an act $f$ is preferred to an act $g$ if the worst outcome associated with the choice of $f$ is better than the worst outcome associated with $g$. Formally
\[ f \succsim g \iff \min_{s \in S} f(s) \ge \min_{s \in S} g(s) \]
We then seek the act that is ranked best in this ordering:
\[ \max_{f \in F} \min_{s \in S} f(s) \]
Another approach is the minmax regret approach, due to Savage. The regret associated with a state $s$ is the difference between the outcome of the best possible act given that state and the outcome of the act chosen:
\[ r(s, g) = \max_{f \in F} f(s) - g(s) \]
The max regret for a given policy is the maximum of this for all possible states:
\[ \max_{s \in S} r(s, g) = \max_{s \in S} \left\{ \max_{f \in F} f(s) - g(s) \right\} \]
and so is the worst shortfall you could have between actual and ideal outcomes under act $g$. The optimum policy then minimizes this over all acts:
\[ \min_{g \in F} \max_{s \in S} r(s, g) \]

Both of these approaches neglect any probabilistic information available.


One approach that does take probabilistic information into account is to
consider all probability distributions that are consistent with what we know,
and use all of these in making a decision.
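The maxmin and minmax regret rules of section 14.1 can be sketched on a small payoff table. This is only an illustration: the three acts, the three states and the payoff numbers are hypothetical, and the function names are ours.

```python
# Rows are acts, columns are states; entries are the outcomes f(s).
# All numbers are hypothetical and only illustrate the two rules.
payoffs = {
    "safe":   [4, 4, 4],
    "medium": [2, 6, 7],
    "risky":  [0, 5, 12],
}
n_states = 3

def maxmin_act(table):
    """Wald's rule: pick the act whose worst outcome is best."""
    return max(table, key=lambda act: min(table[act]))

def max_regret(table, act):
    """Largest shortfall of `act` relative to the best act in each state."""
    best_in_state = [max(row[s] for row in table.values()) for s in range(n_states)]
    return max(best_in_state[s] - table[act][s] for s in range(n_states))

def minmax_regret_act(table):
    """Savage's rule: pick the act whose maximum regret is smallest."""
    return min(table, key=lambda act: max_regret(table, act))

print(maxmin_act(payoffs), minmax_regret_act(payoffs))
```

With these numbers the two rules disagree: maxmin selects the constant act, while minmax regret selects the act with the largest upside, because forgoing that upside is the largest possible regret.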

14.2 MaxMin Expected Utility

An approach due to Gilboa and Schmeidler ("Maxmin Expected Utility with a Non-Unique Prior", Journal of Mathematical Economics, 1989, 18, 141-53) rests on the following axioms. The framework is as with Savage above.
Axiom 1. We have a complete transitive ordering $\succsim$ over $F$.
Axiom 2. Continuity: For every $f, g, h \in F$ if $f \succ g \succ h$ then there exist $\alpha, \beta \in (0, 1)$ such that
\[ \alpha f + (1 - \alpha) h \succ g \succ \beta f + (1 - \beta) h \]
Axiom 3. Monotonicity: For every $f, g \in F$, $f(s) \succsim g(s) \ \forall s \in S \Rightarrow f \succsim g$.
Axiom 4. Nontriviality: There exist $f, g \in X$ : $f \succ g$.
Axiom 5. Independence: For every $f, g \in F$, constant $h \in F$, $\alpha \in (0, 1)$,
\[ f \succsim g \iff \alpha f + (1 - \alpha) h \succsim \alpha g + (1 - \alpha) h \]
Axiom 6. Uncertainty Aversion. For every $f, g \in F$, $\alpha \in (0, 1)$, $f \sim g \Rightarrow \alpha f + (1 - \alpha) g \succsim f$.
Proposition 34. A preference satisfies the above axioms if and only if there exists a closed convex set of probabilities $C$ and a non-constant function $u : X \to \mathbb{R}$ such that for every $f, g \in F$
\[ f \succsim g \iff \min_{p \in C} \int_S u(f(s))\, dp(s) \ge \min_{p \in C} \int_S u(g(s))\, dp(s) \]
Furthermore in this case $C$ is unique and $u$ is unique up to a positive linear transformation.
What this theorem is saying is: for each act, look at the probability distribution in $C$ that gives the worst possible expected utility for that act, and evaluate the act according to that distribution. So evaluate each act by its minimal expected utility over $C$ and choose the act that is best according to this ranking.
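The representation in Proposition 34 can be sketched numerically with two states. This is an illustration under assumptions of ours: the utility function, the two acts and the set of priors (all distributions putting probability between 0.3 and 0.7 on the first state, represented by a grid) are hypothetical.

```python
import math

def u(x):
    """Hypothetical concave utility of money."""
    return math.sqrt(x)

# Two states; an act lists the money outcome in each state.
acts = {
    "f": [16.0, 0.0],   # pays 16 in state 0, nothing in state 1
    "g": [4.0, 4.0],    # constant act
}

# Finite grid standing in for the closed convex set C of priors:
# p = (q, 1 - q) with q between 0.3 and 0.7.
priors = [(q / 100, 1 - q / 100) for q in range(30, 71)]

def expected_utility(act, p):
    return sum(prob * u(x) for prob, x in zip(p, act))

def mmeu_value(act):
    """Minimum expected utility of the act over the set of priors."""
    return min(expected_utility(act, p) for p in priors)

best = max(acts, key=lambda name: mmeu_value(acts[name]))
print(best, {name: mmeu_value(act) for name, act in acts.items()})
```

Expected utility is linear in $p$, so the minimum is attained at an extreme point of the set; here the uncertain act is evaluated at the prior least favourable to it and loses to the constant act, even though it would win under the most favourable prior.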

14.3 Smooth Ambiguity Aversion

In this case we again work with many probability distributions that are consistent with what we know. But rather than focussing only on the worst
of them, in the sense of lowest expected utility, we give them all weights and
take note of them all according to these weights. The weight attached to
a distribution can be thought of as the subjective assessment of the chance
of that probability distribution being the correct one. (Klibanoff, Marinacci
and Mukerji, Decision-Making under Ambiguity, Econometrica, 2005, 73(6),
1848-1892)
The first assumption is that for any probability over states $S$ there is a utility such that acts are ranked by the expectation of that utility:
Axiom 1. Let $p$ be a probability over states $S$. Then there exists $u : X \to \mathbb{R}$ such that
\[ f \succsim g \iff \int_S u(f(s))\, dp \ge \int_S u(g(s))\, dp \]
This is the von Neumann-Morgenstern theorem, taking the probability $p$ over states as objective.
For each action $f \in F$ and each probability $p$ we now have an expected utility $E_p f = \int_S u(f(s))\, dp$.
Axiom 2. There exist weights $\pi(p) \ge 0$, $\int \pi(p)\, dp = 1$, and a function $\phi : \mathbb{R} \to \mathbb{R}$ such that
\[ f \succsim g \iff \int_p \pi(p)\, \phi(E_p f)\, dp \ge \int_p \pi(p)\, \phi(E_p g)\, dp \]
In words this states that we prefer $f$ to $g$ if and only if the expectation of the function $\phi$ of the expected utilities according to the weights $\pi$ is greater for $f$ than for $g$. We can think of $\phi$ as a second order utility function - defined on expected utilities - and the weights $\pi$ as second order probabilities.

14.4 Examples

First look at the two-urn version of the Ellsberg paradox. Urn 1 has 100 balls,
50 red and 50 yellow. Urn 2 has also 100 red and yellow balls in unknown
proportions. You are asked if you are interested in betting $10 on a red ball
being drawn from urn 1 or urn 2. We consider the value of this bet firstly
with linear utilities and then with concave utilities.
Linear utilities: the value of the bet on urn 1 is clearly $0.5 \times 10 + 0.5 \times 0 = 5$. With urn 2 all possible distributions of 100 balls between red and yellow are possible, and we can take either the mmu approach or the smooth ambiguity approach. Let $p_n$, $n = 0, .., 100$ be the probability of choosing a red ball given that the number of red balls is $n$. So $p_n = n/100$.
With mmu the worst possible distribution is that giving zero probability to choosing a red ball, $p_0$. In this case the value of the bet is zero, so the mmu approach values this bet at zero.
With smooth ambiguity we consider all probabilities $p_0, ...., p_{100}$ and not just $p_0$. We give each probability $p_n$ a weight or likelihood or second order probability $\pi_n$. Then with linear $u$, the value of the bet is
\[ \sum_n \pi_n p_n 10 = \sum_n \pi_n \frac{n}{100} 10 = \frac{1}{10} \sum_n n \pi_n \]
and if we rank all numbers of red balls $n$ as equally likely ($\pi_n = \pi = \frac{1}{101}$, since $n$ takes 101 values) then this is
\[ \frac{1}{10} \sum_n n \pi_n = \frac{1}{10} \sum_{n=0}^{100} \frac{n}{101} = 5 \]
which is the same as the value of the bet on urn 1.


Now assume risk and ambiguity aversion, with a utility of money function given by $u(x)$, $u(0) = 0$. The value of a bet on urn 1 is just $0.5 u(10) + 0.5 u(0) = 0.5 u(10)$.
According to MMU the value of the bet on urn 2 is $u(0) = 0$.
Now consider the smooth ambiguity approach. If the number of red balls is $n$, so the probability of red is $n/100$, the expected utility is $\frac{n}{100} u(10) + \frac{100 - n}{100} u(0) = \frac{n}{100} u(10)$. So overall the value of the bet is
\[ \sum_{n=0}^{100} \pi_n\, \phi\!\left( \frac{n}{100} u(10) \right) \]
If $\phi$ is linear, and the weights are equal, this is just $0.5 u(10)$, the same as the value of a bet on urn 1. This is the case of no ambiguity aversion.
Let $u(10) = 100$ and consider instead the case of $\phi(x) = x^{0.5}$, a strictly concave function. Then the value of the bet is
\[ \sum_{n=0}^{100} \frac{1}{101}\, \phi\!\left( \frac{n}{100} 100 \right) = \frac{1}{101} \sum_{n=0}^{100} \phi(n) = \frac{1}{101} \sum_{n=0}^{100} \sqrt{n} < \frac{1}{101} \sum_{n=0}^{100} n \]
so the value of the bet is reduced by ambiguity aversion.
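The two-urn calculation under smooth ambiguity can be checked numerically. This is a sketch using the values in the text ($u(0) = 0$, $u(10) = 100$, equal weights $1/101$); the function and variable names are ours, and the last step, converting the value under concave $\phi$ back into utility units by inverting $\phi$, is an addition of ours to make the comparison with urn 1 in the same units.

```python
import math

N = 100                              # balls in urn 2
u10 = 100.0                          # u(10), with u(0) = 0
weights = [1 / (N + 1)] * (N + 1)    # equal second order probabilities

def smooth_value(phi):
    """Sum over n of weight_n * phi(expected utility given n red balls)."""
    return sum(w * phi(n / N * u10) for n, w in enumerate(weights))

linear = smooth_value(lambda x: x)   # phi linear: no ambiguity aversion
concave = smooth_value(math.sqrt)    # phi(x) = x ** 0.5

# Utility level whose phi-value equals `concave` (the inverse of sqrt is squaring).
equivalent_utility = concave ** 2
urn1_value = 0.5 * u10

print(linear, concave, equivalent_utility, urn1_value)
```

With linear $\phi$ the bet on urn 2 is worth the same as the bet on urn 1; with the concave $\phi$ its utility equivalent falls below $0.5 u(10)$.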

Here is another example. A model $m_i$ is a mapping from acts $f \in F$, which we take to be $\mathbb{R}^1$ for this example, to probability distributions $p(x)$ over outcomes $x \in X$. (Think of this as a macroeconomic model or a model of the stock market, whose output is a distribution over possible outcomes, with the act being the choice of an interest rate or an investment level.) We want to choose the act that leads to the best outcomes but don't know which model is the right one. So an act $f$ gives a distribution over outcomes that depends on the model:
\[ m_i(f) = p(x \mid f, m_i) \]
which is a distribution over $x$ conditional on the act $f$ and the model $m_i$. The expected utility from an act contingent on model $i$ being correct is therefore
\[ Eu(f \mid m_i) = \int u(x)\, dp(x \mid f, m_i) \]
The mmu approach is to value each act $f$ according to the model that gives the worst outcome:
\[ \min_{m_i} \int u(x)\, dp(x \mid f, m_i) \]
and then maximize this value across acts:
\[ \max_f \min_{m_i} \int u(x)\, dp(x \mid f, m_i) \]
The smooth ambiguity approach will assign second order likelihoods or probabilities $\pi_i$ to the models $m_i$ and evaluate expected utilities by the concave function $\phi$. So the problem is
\[ \max_f \sum_i \pi_i\, \phi\left( Eu(f \mid m_i) \right) \]
The FOCs for this are
\[ \sum_i \pi_i\, \phi'\left( Eu(f \mid m_i) \right) Eu'(f \mid m_i) = 0 \]
Define ambiguity-adjusted probabilities $\pi_i'$ as
\[ \pi_i' = \frac{\pi_i\, \phi'\left( Eu(f \mid m_i) \right)}{\sum_j \pi_j\, \phi'\left( Eu(f \mid m_j) \right)} \]
and divide the FOC through by the denominator of this expression to get a new way of stating the FOC:
\[ \sum_i \pi_i'\, Eu'(f \mid m_i) = 0 \]
So the expected sum of the marginal expected payoffs from a change in the act $f$ must be zero, where the expectation is calculated at the ambiguity-adjusted probabilities. Because $\phi$ is concave and so $\phi'$ is decreasing, these adjusted probabilities give more weight to bad outcomes and less to good outcomes than the original second-order probabilities $\pi_i$.
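The reweighting toward bad outcomes can be seen in a small numerical sketch. The two models, their expected utilities and the choice of a square-root second order utility are hypothetical; `adjusted_weights` is a helper name of ours.

```python
import math

def adjusted_weights(weights, expected_utils, phi_prime):
    """Reweight second order probabilities by phi'(Eu) and renormalise."""
    raw = [w * phi_prime(eu) for w, eu in zip(weights, expected_utils)]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical inputs: two models, equally likely a priori, giving
# expected utilities 4 (bad model) and 16 (good model) for some act.
pi = [0.5, 0.5]
eu = [4.0, 16.0]

def sqrt_prime(x):
    """Derivative of the concave second order utility sqrt(x); decreasing in x."""
    return 0.5 / math.sqrt(x)

def linear_prime(x):
    """Derivative of a linear second order utility: no ambiguity aversion."""
    return 1.0

pi_adj = adjusted_weights(pi, eu, sqrt_prime)
pi_lin = adjusted_weights(pi, eu, linear_prime)
print(pi_adj, pi_lin)
```

With the concave function the model with the lower expected utility receives more than its original weight; with a linear function the weights are unchanged.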

Problem
A university has an endowment W that it may invest in bonds B or equity E.
Each type of security may go up 10% or go down 10%. The distributions are
not independent. The university has two financial advisers X and Y who give
different estimates of the probabilities of the possible cases, and the university
cannot tell which if either is correct. For adviser X these probabilities are
xij , and for Y they are yij . The university evaluates outcomes according to a
concave utility function U (P ) where P is the financial payoff. Formulate the
university's investment problem according to the MaxMin Expected Utility
approach and the Smooth Ambiguity approach.

Answer
Here is the table of possible outcomes and the probabilities that advisor X assigns to them: for advisor Y replace $x_{ij}$ by $y_{ij}$. Rows give the outcome for bonds B and columns the outcome for equity E.

B \ E     +10%    -10%
+10%      x11     x12
-10%      x21     x22

The investment in equities is $eW$ and that in bonds is $(1 - e) W$. So the expected utility according to advisor X is
\[
\begin{aligned}
EU_x ={} & x_{11} U[1.1 eW + 1.1 (1 - e) W] + x_{12} U[0.9 eW + 1.1 (1 - e) W] \\
& + x_{21} U[1.1 eW + 0.9 (1 - e) W] + x_{22} U[0.9 eW + 0.9 (1 - e) W]
\end{aligned}
\]

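This can be simplified to
\[ EU_x = K_x + x_{12} U[W (1.1 - 0.2e)] + x_{21} U[W (0.9 + 0.2e)] \]
where $K_x = x_{11} U[1.1W] + x_{22} U[0.9W]$.
Note that
\[ e = 0 \Rightarrow EU_x = K_x + x_{12} U[1.1W] + x_{21} U[0.9W] \]
\[ e = 1 \Rightarrow EU_x = K_x + x_{12} U[0.9W] + x_{21} U[1.1W] \]
and
\[ \frac{\partial EU_x}{\partial e} = 0.2 W \left\{ x_{21} U'(C_{21}) - x_{12} U'(C_{12}) \right\} \]
where $C_{ij}$ is income in state $i, j$. This derivative is positive at $e = 0$ if $x_{21} U'(0.9W) > x_{12} U'(1.1W)$ and negative at $e = 1$ if $x_{21} U'(1.1W) < x_{12} U'(0.9W)$; since $U'$ is decreasing, both hold when $x_{12}$ and $x_{21}$ are close to each other.
For the MaxMin Expected Utility approach we need to find $EU_x(e)$ and $EU_y(e)$ for each value of $e$, pick the min,
\[ V(e) = \min \left\{ EU_x(e), EU_y(e) \right\} \]
and then choose $e$ to maximize $V(e)$.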


For the Smooth Ambiguity Approach we value a policy $e$ as follows: let $\pi_i$ be the probability the university assigns to advisor $i$ being right (a second order probability) and $\phi : \mathbb{R} \to \mathbb{R}$ be a concave increasing function. Then the objective is
\[ \max_e V(e), \qquad V(e) = \pi_x \phi\left( EU_x(e) \right) + \pi_y \phi\left( EU_y(e) \right) \]
To solve this maximization problem we choose $e$ so that
\[ \frac{\partial V}{\partial e} = \pi_x \phi'\left( EU_x(e) \right) EU_x'(e) + \pi_y \phi'\left( EU_y(e) \right) EU_y'(e) = 0 \]
Note that we can divide both sides of this equation by $\pi_x \phi'\left( EU_x(e) \right) + \pi_y \phi'\left( EU_y(e) \right)$ giving
\[ \frac{\pi_x \phi'\left( EU_x(e) \right)}{\pi_x \phi'\left( EU_x(e) \right) + \pi_y \phi'\left( EU_y(e) \right)} EU_x'(e) + \frac{\pi_y \phi'\left( EU_y(e) \right)}{\pi_x \phi'\left( EU_x(e) \right) + \pi_y \phi'\left( EU_y(e) \right)} EU_y'(e) = 0 \]
which we can write as
\[ \pi_x' EU_x'(e) + \pi_y' EU_y'(e) = 0 \]
where $\pi_x', \pi_y'$ are ambiguity-adjusted second order probabilities:
\[ \pi_x' = \frac{\pi_x \phi'\left( EU_x(e) \right)}{\pi_x \phi'\left( EU_x(e) \right) + \pi_y \phi'\left( EU_y(e) \right)}, \qquad \pi_y' = \frac{\pi_y \phi'\left( EU_y(e) \right)}{\pi_x \phi'\left( EU_x(e) \right) + \pi_y \phi'\left( EU_y(e) \right)} \]
So the solution involves setting the expected marginal gain from shifting the portfolio equal to zero, where the expectation is taken via these ambiguity-adjusted second order probabilities. Note that if $\phi$ is strictly concave then the ambiguity adjustment involves placing more weight on the bad outcome than with the initial second order probabilities. If $\phi$ is linear there is no change.
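The two formulations of the university's problem can be compared numerically by a grid search over the equity share $e$. Everything numerical here is an assumption made for illustration: the advisers' probability tables, the log utility $U$, the exponential second order utility and the equal second order probabilities are not part of the problem as stated.

```python
import math

W = 1.0
U = math.log                          # hypothetical concave utility of payoff

def phi(v):
    """Hypothetical concave increasing second order utility."""
    return -math.exp(-5 * v)

# Joint probabilities keyed by (bond move, equity move), +1 = up 10%, -1 = down 10%.
x = {(+1, +1): 0.40, (+1, -1): 0.10, (-1, +1): 0.40, (-1, -1): 0.10}  # adviser X
y = {(+1, +1): 0.10, (+1, -1): 0.40, (-1, +1): 0.10, (-1, -1): 0.40}  # adviser Y
pi_x, pi_y = 0.5, 0.5                 # second order probabilities

def expected_utility(e, probs):
    """Expected utility of putting share e in equity and 1 - e in bonds."""
    total = 0.0
    for (bond, equity), p in probs.items():
        payoff = W * (e * (1 + 0.1 * equity) + (1 - e) * (1 + 0.1 * bond))
        total += p * U(payoff)
    return total

def maxmin_value(e):
    return min(expected_utility(e, x), expected_utility(e, y))

def smooth_value(e):
    return pi_x * phi(expected_utility(e, x)) + pi_y * phi(expected_utility(e, y))

grid = [i / 1000 for i in range(1001)]
e_maxmin = max(grid, key=maxmin_value)
e_smooth = max(grid, key=smooth_value)
print(e_maxmin, e_smooth)
```

In this example adviser X is optimistic about equity and adviser Y pessimistic. The maxmin rule looks only at the pessimistic adviser as soon as any equity is held and so holds none, while the smooth rule weighs both advisers and holds an interior share.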

15 Third Problem Set

Problem 1. Consider the following two lotteries: $L$ = \$200 with probability 0.7, \$0 with probability 0.3, and $L'$ = \$1200 with probability 0.1 and \$0 with probability 0.9. Let $x_L$ and $x_{L'}$ be the sure amounts of money the individual finds indifferent to $L$ and $L'$ respectively. Show that if preferences are monotone, the individual must prefer $L$ to $L'$ if and only if $x_L > x_{L'}$.
Problem 2. Consider the insurance problem studied in section 10.1, and show that if insurance is not actuarially fair (the price $q$ per unit of cover exceeds the probability of the loss) then the individual will not insure completely.
Problem 3. Show that if an individual has a Bernoulli utility function of the form
\[ u(x) = \beta x^2 + \gamma x \]
then her utility from a distribution is determined by the mean and variance of the distribution and by these alone. Note: $\beta < 0$ for concavity of $u$ and we limit the distribution to values no greater than $-\gamma / 2\beta$ as $u$ is decreasing after this.
Problem 4. Assume that a firm is risk-neutral with respect to profits
and that if there is uncertainty about prices then production choices are made
after the resolution of this uncertainty. The firm faces a choice between two
alternatives. In the first prices are uncertain. In the second they are certain
and equal to the expected value of the uncertain case. Show that a firm that
maximizes expected profits will prefer the first alternative to the second.
Problem 5. Suppose that an individual has a Bernoulli utility $u(x) = x^{1/2}$.
1. Calculate the coefficients of absolute and relative risk aversion when
x=5
2. Calculate the certainty equivalent for the gamble (16, 4 : 0.5, 0.5)
3. Calculate the certainty equivalent for the gamble (36, 16 : 0.5, 0.5)
Problem 6. Consider a lottery over monetary outcomes that pays $x + \epsilon$ with probability 0.5 and $x - \epsilon$ with probability 0.5. Compute the second derivative of this lottery's certainty equivalent with respect to $\epsilon$ and show that the limit of this derivative as $\epsilon \to 0$ is exactly $-r_A(x)$.
Problem 7. A university has an endowment W that it may invest in bonds B or equity E. Each type of security may go up 10%, stay constant, or go down 10%. The distributions are not independent. The university has two financial advisers X and Y who give different estimates of the probabilities of the nine possible cases, and the university cannot tell which if either is correct. For adviser X these probabilities are $x_{ij}$, and for Y they are $y_{ij}$. The university evaluates outcomes according to a concave utility function $U(P)$ where $P$ is the financial payoff. Formulate the university's investment problem according to the MaxMin Expected Utility approach and the Smooth Ambiguity approach.

Part IV

General Equilibrium
Next we study the interactions of firms and consumers through markets. Firms as before are characterized by production possibility sets: firm $i$ has production possibility set $Y_i \subset \mathbb{R}^N$, and $y_i \in Y_i$ is a production plan. There are $I$ firms. The sign convention as before is that inputs are negative and outputs positive. A price vector $p$ is an element of $\mathbb{R}^N_+$ with $\sum_l p_l = 1$, $p_l \ge 0 \ \forall l$. Clearly profits are given by $\pi_i = p.y_i$. Firms seek to:
\[ \max_{y_i \in Y_i} \{ p.y_i \} = \pi_i \]
Consumers as before have preferences $\succsim_j$ on $X_j \subset \mathbb{R}^N_+$ and a consumption vector is $x_j \in X_j$. There are $J$ consumers. Preferences are represented by an ordinal utility function $u_j : \mathbb{R}^N \to \mathbb{R}$. Consumers have endowments $w_j \in \mathbb{R}^N$: $w_j$ is the vector of goods that individual $j$ owns and can either consume or sell. Typically it contains labor, which may be consumed as leisure or sold as work, and any other items that belong to the individual. Firms are owned by individuals: individual $j$ owns a fraction $\theta_{ji}$ of firm $i$, entitling her to this fraction of its profits. So the consumer choice problem is
\[ \max_{x_j} u_j(x_j), \quad p.x_j \le p.w_j + \sum_i \theta_{ji} \pi_i \]
Here total spending power is the value of endowments plus the income from shareholdings. A set of consumption and production plans, one for each consumer and producer, is called an allocation.
Definition 36. Let $y_i, x_j$ be an allocation. We say this is feasible if
\[ \sum_j x_j \le \sum_j w_j + \sum_i y_i \]
that is if consumption is less than or equal to production plus endowments for each good or service.
Another important definition is that of Pareto efficiency (aka Pareto optimality).
Definition 37. An allocation $y_i^*, x_j^*$ is Pareto efficient if it is feasible and there is no other feasible allocation $\hat{y}_i, \hat{x}_j$ such that $u_j(\hat{x}_j) \ge u_j(x_j^*) \ \forall j$, $> u_j(x_j^*)$ some $j$. In other words, there is no other feasible allocation where someone is better off and no-one is worse off.
We can also work with the Pareto ranking:
Definition 38. An allocation $y_i^*, x_j^*$ is Pareto superior to another allocation $\hat{y}_i, \hat{x}_j$ if $u_j(x_j^*) \ge u_j(\hat{x}_j) \ \forall j$ & $\exists j : u_j(x_j^*) > u_j(\hat{x}_j)$. In words, someone is better off and no-one is worse off at the starred allocation. This is a partial ordering, like the vector ordering.
Next we introduce the idea of a competitive equilibrium, which is a set
of prices, production and consumption plans such that firms are maximizing
profits, consumers are maximizing utility, and all markets clear:
Definition 39. A competitive equilibrium is a price vector $p^*$, a set of production plans $y_i^*$ for each firm $i \in I$, and a set of consumption vectors $x_j^*$ one for each consumer $j \in J$, such that consumers and producers maximize utilities and profits respectively and demand is less than or equal to supply:
1. $\forall i$, $y_i^* \in \arg\max p^*.y$, $y \in Y_i$
2. $\forall j$, $x_j^* \in \arg\max u_j(x)$, $p^*.x \le p^*.w_j + \sum_i \theta_{ji} \pi_i$
3. $\sum_j x_j^* \le \sum_j w_j + \sum_i y_i^*$

There are several questions one can ask about this concept. One is - is it Pareto efficient? Another is - does such an equilibrium exist? We will investigate the first question extensively. Before tackling the general cases, we will look at a simple 2 x 2 case that can be studied geometrically, the case of two consumers trading two goods, with no production - a 2 x 2 exchange economy.

16 Edgeworth Box

Let consumers be a and b, and goods 1 and 2. Consumption and endowment vectors are in $\mathbb{R}^2$. The total endowment of good $i$ is $w_i = w_{ai} + w_{bi}$, $i = 1, 2$. An Edgeworth Box (see figure 16.1) is a rectangle whose horizontal side is of length $w_1$ and whose vertical side is $w_2$ long. The lower left corner is the origin for consumer a's preferences and the top right corner is that for consumer b's preferences, increasing to the south west. The budget line for a consumer is a line whose slope equals the price ratio, and which goes through that consumer's endowment vector [because she can afford her endowment whatever the prices are]. Any point in this rectangle represents an allocation of the two goods between the two consumers: its coordinates relative to the normal lower left origin are the amounts allocated to consumer a, and the remaining amounts, which are the coordinates relative to the upper right origin, are the amounts allocated to consumer b. So in particular the initial endowments of the two consumers form a point in this box. A line through this point with slope equal to the price ratio gives the budget lines of both consumers.

Figure 16.1: An Edgeworth Box


By varying the slope of the budget line we can trace out the offer curves of each consumer, and check whether supply and demand match.
By looking at points where pairs of indifference curves are tangential to each other, we can locate the Pareto efficient allocations, forming a set generally called the contract curve. We can also find the Pareto efficient points that are Pareto superior to the initial allocation - these are attractive points from a bargaining perspective.

Figure 16.2:
The equilibrium prices associated with any allocation are given by the slope of the budget line that goes through that allocation and is simultaneously tangent to indifference curves of a and b: this is shown in figure 16.1. [Note - there may be more than one set of equilibrium prices associated with an initial allocation.] This is a point where demand and supply are equal and note that it is on the contract curve and so is Pareto efficient. So this geometric approach suggests that a competitive equilibrium is Pareto efficient. Figure 16.2 shows a configuration where demand and supply are not equated. The initial endowment is E, and at the prices shown A wants to move to the point A where her indifference curve is tangent to the budget line, and likewise B wants to move to B. So A wants to sell DE of good 1 and buy CE of good 2, and B wants to buy GE of good 1 and sell FE of good 2. So we have demand exceeding supply for good 1 and vice versa for good 2.
Note that if the two agents' preferences are identical and homothetic, then the contract curve will be a straight line along the diagonal of the box, and equilibrium prices will be independent of the initial allocation of endowments. Here's the argument for the contract curve being the diagonal. On the contract curve both agents' indifference curves have the same slope. Because preferences are identical and homothetic, this means they must both consume the two goods in the same proportions. This only happens on the diagonal.
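A situation where demand and supply are not equated at the posted prices, as discussed for figure 16.2, can be sketched for a 2 x 2 exchange economy. The Cobb-Douglas preferences, the endowments and the two price vectors are hypothetical choices of ours, not the configuration drawn in the figures.

```python
# Cobb-Douglas consumer: u(x1, x2) = x1**alpha * x2**(1 - alpha).
# Her demand spends share alpha of wealth on good 1 and 1 - alpha on good 2.
def demand(alpha, endowment, p):
    wealth = p[0] * endowment[0] + p[1] * endowment[1]
    return (alpha * wealth / p[0], (1 - alpha) * wealth / p[1])

# Hypothetical economy: identical preferences, mirrored endowments.
alpha_a, alpha_b = 0.5, 0.5
w_a, w_b = (1.0, 3.0), (3.0, 1.0)
total = (w_a[0] + w_b[0], w_a[1] + w_b[1])

def excess_demand(p):
    """Total demand minus total endowment for each good at prices p."""
    xa = demand(alpha_a, w_a, p)
    xb = demand(alpha_b, w_b, p)
    return (xa[0] + xb[0] - total[0], xa[1] + xb[1] - total[1])

z_off = excess_demand((0.5, 1.0))   # good 1 priced too low
z_eq = excess_demand((1.0, 1.0))    # prices at which both markets clear
print(z_off, z_eq)
```

At the first price vector there is excess demand for good 1 and excess supply of good 2, and the value of excess demand sums to zero across goods; at the second both markets clear.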

17 The Theorems of Welfare Economics

We investigate the relationship between competitive equilibrium and Pareto efficiency.
Definition 40. An allocation $x_j^*, y_i^*$ and a price vector $p^*$ form a price equilibrium with transfers if there is an assignment of wealth levels $W_j$ with $\sum_j W_j = p^*.\sum_j w_j + \sum_i p^*.y_i^*$ [the value of the wealth is the sum of the value of endowments plus profits] such that
1. $\forall i$, $y_i^* \in \arg\max p^*.y$, $y \in Y_i$
2. $\forall j$, $x_j^* \in \arg\max u_j(x_j)$, $p^*.x_j \le W_j$
3. $\sum_j x_j^* = \sum_j w_j + \sum_i y_i^*$
In words, what we have here is a way of dividing wealth between people such that the allocation $x_j^*, y_i^*$ forms a competitive equilibrium at prices $p^*$.
Proposition 35. First theorem of welfare economics. If preferences are locally non-satiated, then if $x_j^*, y_i^*, p^*$ is a price equilibrium with transfers, the allocation $x_j^*, y_i^*$ is Pareto efficient. In particular any competitive equilibrium is Pareto efficient.
Proof. Preference maximization implies that anything that the consumer strictly prefers is unaffordable to her. Formally,
\[ u_j(x_j) > u_j(x_j^*) \Rightarrow p^*.x_j > W_j \]
Local non-satiation implies an additional property:
\[ u_j(x_j) \ge u_j(x_j^*) \Rightarrow p^*.x_j \ge W_j \]
A consumption vector that is at least as good as $x_j^*$ costs at least as much. [This follows from local non-satiation: if there were a consumption vector that is at least as good and that costs less, then there would be another vector arbitrarily close to it that is better and also costs less, contradicting utility maximization.]
Let $x_j', y_i'$ be an allocation that is Pareto superior to $x_j^*, y_i^*$, so that $u_j(x_j') \ge u_j(x_j^*) \ \forall j$, $> u_j(x_j^*)$ some $j$. Then we must have
\[ p^*.x_j' \ge p^*.x_j^* \ \forall j \quad \& \quad \exists j : p^*.x_j' > p^*.x_j^* \]
In this case we have that
\[ \sum_j p^*.x_j' > \sum_j W_j = p^*.\sum_j w_j + \sum_i p^*.y_i^* \]
Because $y_i^*$ is profit-maximizing at prices $p^*$ for firm $i$, we know that
\[ p^*.\sum_j w_j + \sum_i p^*.y_i^* \ge p^*.\sum_j w_j + \sum_i p^*.y_i' \]
and hence
\[ \sum_j p^*.x_j' > p^*.\sum_j w_j + \sum_i p^*.y_i' \]
so that the allocation $x_j', y_i'$ cannot be feasible. Feasibility requires that
\[ \sum_j x_j' = \sum_j w_j + \sum_i y_i' \]
which contradicts the previous inequality, taking the inner product with $p^*$ on both sides.
So the take-away here is that if consumers are maximizing utility and firms are maximizing profits, all facing the same prices, and markets clear [the allocation is feasible] then the allocation is Pareto efficient. All facing the same prices is crucial: it means that marginal rates of substitution in production and consumption are all the same.

Definition 41. An allocation $x_j^*, y_i^*$ and a price vector $p^*$ form a quasi-equilibrium with transfers if there is an assignment of wealth levels $W_j$ with $\sum_j W_j = p^*.\sum_j w_j + \sum_i p^*.y_i^*$ such that
1. $\forall i$, $y_i^* \in \arg\max p^*.y$, $y \in Y_i$
2. $\forall j$, if $x_j \succ_j x_j^*$ then $p^*.x_j \ge W_j$
3. $\sum_j x_j^* = \sum_j w_j + \sum_i y_i^*$
Note that the second condition here is different from that in definition 40: we are not asking for utility maximization, but that any preferred choice costs no less than the wealth level. Under some extra conditions this does imply utility maximization - as we will see.
Proposition 36. Second theorem of welfare economics. Assume that all production sets $Y_i$ are convex and that all preferences are also convex and locally non-satiated. Then for every Pareto efficient allocation $x_j^*, y_i^*$ there is a price vector $p \ne 0$ such that $p, x_j^*, y_i^*$ form a quasi-equilibrium with transfers.


Proof. 1. Let $V_j = \{ x \in \mathbb{R}^N : u_j(x) > u_j(x_j^*) \}$ and define $V = \sum_j V_j$. This is the set of aggregate allocations that can make everyone better off than at $x_j^*$. Also let $Y = \sum_i Y_i$. Clearly $V, V_j, Y$ are all convex sets.
2. Call $\sum_j w_j = w$, the aggregate endowment. The set $Y + w$, the aggregate production set translated by the aggregate endowment, is the set of all aggregate bundles available for consumption given the technology and endowments.
3. Note that $V \cap (Y + w) = \emptyset$. This is an implication of the Pareto efficiency of the allocation - if this intersection were non-empty then there would be a vector that is feasible (in $Y + w$) and can be used to give every consumer a greater utility level than $x_j^*$ (is in $V = \sum_j V_j$).
4. Next note that there is a vector $p \ne 0 \in \mathbb{R}^N$ and a number $r$ such that $p.z \ge r \ \forall z \in V$ and $p.z \le r \ \forall z \in Y + w$. This is just an application of the separating hyperplane theorem: $V$ and $Y + w$ are convex sets with no points in common (we proved this above in 3).
5. Next note that $u_j(x_j) \ge u_j(x_j^*) \ \forall j \Rightarrow p.\left( \sum_j x_j \right) \ge r$. In this case by local non-satiation there is a vector $\hat{x}_j$ arbitrarily close to $x_j$ such that $u_j(\hat{x}_j) > u_j(x_j)$ and so $\hat{x}_j \in V_j$, $\sum_j \hat{x}_j \in V$, so $p.\sum_j \hat{x}_j \ge r$, and taking the limit as $\hat{x}_j \to x_j$ we have that $\sum_j p.x_j \ge r$.
6. $p.\left( \sum_j x_j^* \right) = p.\left( w + \sum_i y_i^* \right) = r$. By 5, $p.\left( \sum_j x_j^* \right) \ge r$, but in addition we know that $\sum_j x_j^* = \sum_i y_i^* + w \in Y + w$ and so $p.\left( \sum_j x_j^* \right) \le r$, implying that in fact $p.\left( \sum_j x_j^* \right) = r$. Since $\sum_j x_j^* = w + \sum_i y_i^*$ we also know that $p.\left( w + \sum_i y_i^* \right) = r$.
7. For each firm consider an arbitrary $y_i \in Y_i$. Clearly $y_i + \sum_{h \ne i} y_h^* \in Y$, so that by separation (point 4 above)
\[ p.\left( w + y_i + \sum_{h \ne i} y_h^* \right) \le r = p.\left( w + y_i^* + \sum_{h \ne i} y_h^* \right) \]
Hence $p.y_i \le p.y_i^*$.
8. For any consumer $u_j(x_j) > u_j(x_j^*) \Rightarrow p.x_j \ge p.x_j^*$. So assume $u_j(x_j) > u_j(x_j^*)$. By 5 and 6 above we have
\[ p.\left( x_j + \sum_{k \ne j} x_k^* \right) \ge r = p.\left( x_j^* + \sum_{k \ne j} x_k^* \right) \]
so $p.x_j \ge p.x_j^*$.
9. The wealth levels $p.x_j^* = W_j$ support $p, x_j^*, y_i^*$ as a quasi-equilibrium with transfers. Conditions 1 and 2 of definition 41 follow from 7 and 8 above. Condition 3 follows from the feasibility of a Pareto efficient allocation.
What is the distinction between a quasi-equilibrium and an equilibrium,
and when are they the same? In a quasi-equilibrium, anything that is better,
costs no less: in an equilibrium, it costs more. When might something that
is better not cost more? When the price of a good in which you are not
satiated is zero. In this case in R2 the budget line is horizontal or vertical
and a consumer may wish to go infinitely far along it but be limited by the
amount of the good available - the total endowment.
Something similar could happen if the consumer has endowments only of
goods that have zero prices, so effectively has zero income.
If every consumer has positive wealth at a quasi-equilibrium with transfers then it is a price equilibrium with transfers and so any Pareto efficient
allocation can be supported as a price equilibrium with transfers.

Proposition 37. Suppose that at a quasi-equilibrium every consumer has strictly positive wealth $W_j > 0$. Then it follows that $x_j \succ_j x_j^* \Rightarrow p.x_j > W_j$ so that the quasi-equilibrium is a price equilibrium with transfers.
Proof. Suppose in contradiction that there is $x_j \succ_j x_j^*$ & $p.x_j = W_j$. Because $W_j > 0$ there exists an $x_j'$ such that $p.x_j' < W_j$. But for all $a \in [0, 1)$, $p.\left( a x_j + (1 - a) x_j' \right) < W_j$. But if $a$ is close enough to 1, the continuity of $\succsim_j$ implies that $a x_j + (1 - a) x_j' \succ_j x_j^*$, contradicting point 2 of definition 41.

18 Problem Set 4

Problem 1.
Consider an Edgeworth box economy in which consumers have the Cobb-Douglas utility functions $u_1(x_{11}, x_{21}) = x_{11}^{\alpha} x_{21}^{1 - \alpha}$ and $u_2(x_{12}, x_{22}) = x_{12}^{\beta} x_{22}^{1 - \beta}$. Consumer $i$'s endowments are $(\omega_{1i}, \omega_{2i}) > 0$. Solve for the equilibrium price ratio and allocation. How do these change with a small change in $\omega_{11}$?
Problem 2
Give the mathematical formula for two preferences which lead to an Edgeworth box in which prices are independent of the initial allocation of endowments amongst the agents. Prove that in this case the prices are independent of the initial allocation. Assume that the total endowments of the two goods are equal.
Problem 3
Construct, and give the mathematical formulae for, an Edgeworth box in which there is more than one competitive equilibrium from some initial allocations. Hint: you might work with linear preferences.

19 Existence of General Equilibrium

We now know something about the welfare properties of a general equilibrium. But we don't actually know if such an equilibrium exists. This is not a trivial question: it is easy to construct examples of economies where there is clearly no price vector at which all markets clear simultaneously. So this question needs some work. The main concept in working on this is the excess demand function. Recall definition 39: a competitive equilibrium is a set of consumption plans $x_j^*$, production plans $y_i^*$ and prices $p^*$ such that
1. $\forall i$, $y_i^* \in \arg\max \{p^* \cdot y : y \in Y_i\}$
2. $\forall j$, $x_j^* \in \arg\max \{u_j(x_j) : p^* \cdot x_j \le W_j\}$
3. $\sum_j x_j^* = \sum_j w_j + \sum_i y_i^*$
So firms are maximizing profits, consumers utility, and demand and supply balance. Now define for any price vector $p$ the excess demand $z(p)$ associated with that price:
$$z(p) = \sum_j x_j(p) - \sum_j w_j - \sum_i y_i(p)$$
where $y_i(p) = \arg\max \{p \cdot y : y \in Y_i\}$ and $x_j(p) = \arg\max \{U_j(x) : p \cdot x \le p \cdot w_j + \sum_i \theta_{ji} \pi_i\}$. So $z(p)$ is the difference between demand and supply at prices $p$, and for an equilibrium we need this to be non-positive for all goods. It can be negative (supply greater than demand) for goods whose price is zero. So the question now is: does there exist a price $p^*$ such that $z(p^*) \le 0$?
Note that $z(p)$ is a map from prices to commodity space. Prices can be considered as points in the simplex in $\mathbb{R}^N$, $S^N = \{p \in \mathbb{R}^N : p_l \ge 0, \sum_l p_l = 1\}$. We know that demand and supply functions are homogeneous of degree zero in prices so we can always scale prices to be in the simplex without changing excess demand.
Next we use $z(p)$ to build a map from the simplex to itself. Define the following function $z^+(p)$ on $S^N$: $z_l^+(p) = \max \{z_l(p), 0\}$. Note that $z^+$ is continuous whenever $z$ is, and that
$$z^+(p) \cdot z(p) = \sum_l \max \{z_l, 0\}\, z_l = 0 \Rightarrow z(p) \le 0$$

Now construct
$$a(p) = \sum_l \left[ p_l + z_l^+(p) \right]$$
Clearly we have $a(p) \ge 1 \ \forall p$ as $\sum_l p_l = 1$. Now define the continuous function from $S^N$ to itself by
$$f(p) = \frac{1}{a(p)} \left[ p + z^+(p) \right]$$

This is a continuous function from a compact convex set $S^N$ to itself and so has a fixed point $p^*$ (Brouwer's fixed point theorem) such that $f(p^*) = p^*$. By Walras' Law
$$0 = p^* \cdot z(p^*) = f(p^*) \cdot z(p^*) = \frac{1}{a(p^*)} \left[ p^* + z^+(p^*) \right] \cdot z(p^*) = \frac{1}{a(p^*)} \left[ p^* \cdot z(p^*) + z^+(p^*) \cdot z(p^*) \right] = \frac{1}{a(p^*)}\, z^+(p^*) \cdot z(p^*)$$
Therefore $z^+(p^*) \cdot z(p^*) = 0$, which from above means that $z(p^*) \le 0$ as required. So we have proved that there is a price at which all markets clear, provided that all demand and supply functions are continuous functions, which requires that all production sets be strictly convex and all utilities be strictly quasi-concave. We have:
Proposition 38. If all production sets in the economy are strictly convex and all utilities strictly quasi-concave then there exists a price vector $p^*$ at which $z(p^*) \le 0$, that is, at which all markets clear.
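The construction in the proof can be checked numerically. The sketch below is illustrative only and is not part of the original notes: it assumes a two-good pure exchange economy with two Cobb-Douglas consumers (the shares, endowments and function names are hypothetical), computes $z(p)$, builds the map $f(p) = [p + z^+(p)]/a(p)$, and locates the market-clearing price by bisection rather than by any fixed-point algorithm.

```python
# Illustrative sketch: excess demand z(p) and the map f(p) = (p + z+(p)) / a(p)
# for a hypothetical 2-good, 2-consumer Cobb-Douglas exchange economy.

ALPHAS = [0.3, 0.6]                      # assumed expenditure shares on good 1
ENDOWMENTS = [(1.0, 2.0), (3.0, 1.0)]    # assumed endowments (good 1, good 2)


def excess_demand(p):
    """Return z(p) = (z1, z2) for strictly positive prices p = (p1, p2)."""
    p1, p2 = p
    z1 = z2 = 0.0
    for share, (w1, w2) in zip(ALPHAS, ENDOWMENTS):
        wealth = p1 * w1 + p2 * w2
        z1 += share * wealth / p1 - w1          # demand minus endowment, good 1
        z2 += (1 - share) * wealth / p2 - w2    # demand minus endowment, good 2
    return (z1, z2)


def f(p):
    """The map from the simplex to itself used in the existence proof."""
    z_plus = [max(zl, 0.0) for zl in excess_demand(p)]
    a = sum(pl + zl for pl, zl in zip(p, z_plus))   # a(p) >= 1 on the simplex
    return tuple((pl + zl) / a for pl, zl in zip(p, z_plus))


def equilibrium_price(tol=1e-12):
    """Bisection on p1 (with p2 = 1 - p1) for z1 = 0; z1 is decreasing in p1 here."""
    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess_demand((mid, 1 - mid))[0] > 0:
            lo = mid      # good 1 is in excess demand, so raise its price
        else:
            hi = mid
    p1 = 0.5 * (lo + hi)
    return (p1, 1 - p1)
```

At the price returned by `equilibrium_price()`, `excess_demand` is numerically zero and `f` leaves the price unchanged, which is the fixed-point property the proof relies on. Away from equilibrium, `f` shifts weight toward goods in excess demand while keeping prices on the simplex.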
Now we move on to consider some situations where competitive equilibria
are not efficient.

20 Public Goods

First a result on characterizing Pareto efficient allocations. The basic proposition is that we can characterize PE allocations as allocations that maximize
a weighted sum of utilities.
Proposition 39. Suppose that the consumption vectors $x_j^*$ maximize the weighted utility sum $\sum_j \lambda_j U_j(x_j)$ subject to $\sum_j x_j \in \sum_i Y_i + \sum_j w_j$, where $\lambda_j > 0 \ \forall j$. Then the $x_j^*$ are PE.


Proof. Suppose the $x_j^*$ are not PE. Then there exists an alternative set of consumption vectors $x_j'$ which are feasible ($\sum_j x_j' \in \sum_i Y_i + \sum_j w_j$) and such that $U_j(x_j') \ge U_j(x_j^*) \ \forall j$ and $\exists j : U_j(x_j') > U_j(x_j^*)$. In this case $\sum_j \lambda_j U_j(x_j') > \sum_j \lambda_j U_j(x_j^*)$, a contradiction.
A public good is one that, if provided for one person, is provided for all, or for local public goods, for all in a group or location. The traditional textbook examples are law and order, public health, and defense. A more contemporary example is air quality, which if improved for one person in a region is necessarily improved for all. Public goods are said to be non-excludable and non-rivalrous.
The traditional market mechanism does not work well for public goods, as the provider cannot ensure that everyone who benefits from these goods pays for them. For example, a group in New York City may decide to incur costs to make the air in NYC cleaner and less dangerous, and ask people to pay for this. But they have no way of ensuring that everyone who benefits pays. People can free ride, enjoying the benefits without paying. This is why we refer to public goods, reflecting the fact that these are normally provided by the government, which has the ability to force people to pay via taxes.
Let $c_j \in \mathbb{R}$ be consumption of a normal, private, good, and $g \in \mathbb{R}$ be the level of provision of a public good. For each individual $j$ utility depends on the consumption of both: $U_j(c_j, g)$, where $g$, being the consumption of the public good, is the same for all, but people can of course choose different levels of the private goods. We assume $U_j$ to be concave. We suppose that each person has a budget that can be divided between the regular consumption goods and contributing to the provision of the public good. The total amount of the public good provided is a function of the total amount contributed by all individuals: $g = f\left(\sum_j g_j\right)$ where $g_j$ is $j$'s contribution to the public good and $f$ is concave. Hence the individual optimization problem is (setting the price of the private good equal to one)
$$\max_{c_j, g_j} U_j(c_j, g), \quad g = f\left(\sum_k g_k\right), \quad c_j = W_j - g_j$$

In this problem the agent is optimizing over her choice of consumption of the private good and contribution to the public good, taking as given the contributions that she thinks others are making ($g_k$, $k \ne j$) as in a Nash non-cooperative equilibrium, giving rise to the Lagrangian
$$L = U_j\left(c_j, f\left(\sum_k g_k\right)\right) + \lambda_j \left\{ W_j - g_j - c_j \right\}$$

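and the first order conditions are
$$\frac{\partial U_j}{\partial c_j} = \lambda_j, \quad \frac{\partial U_j}{\partial g} f' = \lambda_j$$
so that
$$\frac{\partial U_j / \partial c_j}{\partial U_j / \partial g} = f' \quad \text{or} \quad \frac{\partial U_j}{\partial g} = \frac{\partial U_j}{\partial c_j} \frac{1}{f'} \quad \text{or} \quad \frac{\partial U_j / \partial g}{\partial U_j / \partial c_j} = \frac{1}{f'}$$
This last expression says that the marginal rate of substitution between the public and private good should equal the marginal rate of transformation between them.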
Now look at the socially efficient allocation between public and private goods. Consider the problem
$$\max \sum_j U_j(c_j, g), \quad g = f\left(\sum_k g_k\right), \quad \sum_j c_j = \sum_j W_j - \sum_k g_k$$
which leads to a Pareto efficient allocation with the public good. The Lagrangean is
$$L = \sum_j U_j\left(c_j, f\left(\sum_k g_k\right)\right) + \lambda \left\{ \sum_j W_j - \sum_j c_j - \sum_k g_k \right\}$$

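and first order conditions are
$$\frac{\partial U_j}{\partial c_j} = \lambda, \quad f' \sum_j \frac{\partial U_j}{\partial g} = \lambda$$
implying that
$$\sum_j \frac{\partial U_j}{\partial g} = \frac{\partial U_j}{\partial c_j} \frac{1}{f'} \quad \text{or} \quad \frac{\partial U_j}{\partial g} = \frac{\partial U_j}{\partial c_j} \frac{1}{f'} - \sum_{l \ne j} \frac{\partial U_l}{\partial g}$$
We can also write that
$$\sum_j \frac{\partial U_j / \partial g}{\partial U_j / \partial c_j} = \frac{1}{f'}$$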

This is known as the Bowen-Lindahl-Samuelson formula for the optimal provision of a public good. Note the difference between the first order condition that the individual chooses, $\frac{\partial U_j / \partial g}{\partial U_j / \partial c_j} = \frac{1}{f'}$, and that which is Pareto efficient, $\sum_j \frac{\partial U_j / \partial g}{\partial U_j / \partial c_j} = \frac{1}{f'}$. Individual choices do not lead to an efficient outcome in this case. In fact it is easy to see that individual choices under-provide the public good relative to a Pareto efficient outcome, for clearly $\sum_j \frac{\partial U_j / \partial g}{\partial U_j / \partial c_j} > \frac{\partial U_j / \partial g}{\partial U_j / \partial c_j}$, and if $f$ is strictly concave then this implies that the optimal level of provision is greater than the private level.
Now let's look into having a market for the public good in which different people pay different prices and the good is provided by a profit-maximizing firm. So person $j$ pays price $p_j$ for the public good (the price of the private good is one), and everyone pays the provider of the public good, the production of which uses as an input the private good. So the individual problem is
$$\max_{c_j, g} U_j(c_j, g), \quad W_j - c_j - p_j g = 0$$
and the first order conditions are
$$\frac{\partial U_j}{\partial c_j} = \lambda_j, \quad \frac{\partial U_j}{\partial g} = \lambda_j p_j, \quad \text{so} \quad \frac{\partial U_j / \partial g}{\partial U_j / \partial c_j} = p_j$$
and the problem for the firm producing the public good is
$$\max \ g \sum_k p_k - z \quad \text{where } g = f(z)$$
and the FOCs are
$$f' \sum_k p_k = 1 \quad \text{or} \quad \sum_k p_k = \frac{1}{f'}$$
Combining both sets of FOCs we see that
$$\sum_j \frac{\partial U_j / \partial g}{\partial U_j / \partial c_j} = \frac{1}{f'}$$
which is the condition needed for Pareto efficiency.
Example
Let $u(c_j, g) = \alpha \log(g) + \log(c_j)$. Let $J$ be the total number of people, $W$ the total endowment and $W/J$ the share of each person. Assume the public good is produced from the private according to $g(z) = z$ where $z$ is the amount of the private good allocated in total to the production of the public good.
A Pareto efficient allocation is the solution to
$$\max \sum_j \left\{ \alpha \log(g) + \log(c_j) \right\}, \quad g + \sum_j c_j = W$$
The Lagrangean is
$$L = \sum_j \left\{ \alpha \log(g) + \log(c_j) \right\} + \lambda \left\{ W - g - \sum_j c_j \right\}$$
and the FOCs are
$$\frac{J \alpha}{g} = \lambda, \quad \frac{1}{c_j} = \lambda$$
from which it follows that the efficient allocation satisfies
$$c_j = \frac{W}{J(1 + \alpha)}, \quad g = \frac{\alpha W}{1 + \alpha}$$
The typical individual solves the problem
$$\max \left\{ \alpha \log(g) + \log(c_j) \right\}, \quad g = z_j + \sum_{k \ne j} z_k, \quad c_j = \frac{W}{J} - z_j$$
Substituting into the utility function we find the FOCs are
$$\frac{\alpha}{z_j + \sum_{k \ne j} z_k} = \frac{1}{\frac{W}{J} - z_j}$$
or
$$\frac{\alpha W}{J} = z_j (1 + \alpha) + \sum_{k \ne j} z_k$$
and summing over $j$ gives
$$\alpha W = \sum_j z_j (1 + \alpha) + \sum_j \sum_{k \ne j} z_k$$
or
$$\alpha W = (J + \alpha) \sum_j z_j$$
so that
$$g = \frac{\alpha W}{J + \alpha}, \quad c_j = \frac{W}{J + \alpha}$$
which is the same as the efficient allocation only if $J = 1$.
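A short numerical check of this example may help. The sketch below is illustrative only: it takes utility to be $u = \alpha \log g + \log c_j$ with $g = \sum_j z_j$, and the function names and the parameter values in the usage note are hypothetical. It computes the efficient and the private (Nash) allocations from their closed forms and measures how far each is from the Samuelson condition $\sum_j MRS_j = 1/f'$ and from the individual condition $MRS_j = 1/f'$, with $f' = 1$.

```python
# Illustrative sketch of the log-utility public good example,
# assuming u = alpha*log(g) + log(c) and the technology g = z (so f' = 1).

def efficient_allocation(W, J, alpha):
    """Pareto efficient (c_j, g) with equal treatment of the J consumers."""
    g = alpha * W / (1 + alpha)
    c = W / (J * (1 + alpha))
    return c, g


def nash_allocation(W, J, alpha):
    """Symmetric private-contribution outcome (c_j, g)."""
    g = alpha * W / (J + alpha)
    c = W / (J + alpha)
    return c, g


def mrs(c, g, alpha):
    """Marginal rate of substitution (dU/dg) / (dU/dc) = alpha * c / g."""
    return alpha * c / g


def samuelson_gap(c, g, J, alpha):
    """Sum of the J identical MRS terms minus 1/f' = 1; zero at an efficient point."""
    return J * mrs(c, g, alpha) - 1.0


def individual_gap(c, g, alpha):
    """Own MRS minus 1/f' = 1; zero at the private optimum."""
    return mrs(c, g, alpha) - 1.0
```

For instance, with the assumed values `W = 120`, `J = 4`, `alpha = 0.5`, the efficient level of the public good is 40 while the private level is about 13.3, and the two coincide when `J = 1`.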

21 External Effects

These occur whenever the consumption/production of one person/firm affects the welfare/profits of another person/firm. Let $c_j$ be the consumption vector of agent $j$, and $c_{-j}$ be the vector of consumption vectors of all agents other than $j$. Then for each agent utility takes the form $U_j(c_j, c_{-j})$. Individual endowments are $W_j$. A social optimum is the solution to
$$\max \sum_j a_j U_j(c_j, c_{-j}), \quad \sum_j p \cdot c_j = \sum_j W_j$$

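where the $a_j \ge 0$ are welfare weights. The Lagrangean is
$$L = \sum_j a_j U_j(c_j, c_{-j}) + \lambda \left( \sum_j W_j - \sum_j p \cdot c_j \right)$$
and the FOCs are
$$\frac{1}{\lambda} a_j \frac{\partial U_j}{\partial c_{j,l}} = p_l - \frac{1}{\lambda} \sum_{k \ne j} a_k \frac{\partial U_k}{\partial c_{j,l}}$$
Note that if the external effects are harmful the terms $\partial U_k / \partial c_{j,l}$ are negative, and we can think of the second term on the RHS as a tax, which corrects for the external costs by adding them to the market prices of goods.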
The private optimum is the solution to
$$\max U_j(c_j, c_{-j}), \quad p \cdot c_j = W_j$$
and clearly the FOCs are
$$\frac{1}{\lambda_j} \frac{\partial U_j}{\partial c_{j,l}} = p_l$$
which differ from the social optimum by the term $-\frac{1}{\lambda} \sum_{k \ne j} a_k \frac{\partial U_k}{\partial c_{j,l}}$. Hence if this amount is added to the price $p_l$ for individual $j$, the private and social FOCs will coincide. Note that the tax is in principle person-specific. These taxes are known as Pigovian taxes after Arthur Pigou. Pollution taxes (carbon taxes) are examples of Pigovian taxes, and cap and trade systems are also systems for adding external costs to the prices that agents face.
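As a rough illustration of the tax logic, the sketch below uses a deliberately simple two-agent case that is an assumption, not taken from the notes: agent 1 has quasi-linear utility $\log(x) + m_1$ over a polluting good $x$ and a numeraire, agent 2 has utility $m_2 - D x$, welfare weights are equal to one, and the multiplier is therefore one. The price `P` and marginal harm `D` are hypothetical numbers.

```python
# Illustrative sketch of a Pigovian tax in an assumed quasi-linear setting.
# Agent 1: U1 = log(x) + m1, buys x at price P. Agent 2: U2 = m2 - D * x.

P = 2.0   # assumed market price of the polluting good
D = 0.5   # assumed constant marginal harm imposed on agent 2 per unit of x


def private_demand(price):
    """Agent 1's choice of x: the FOC is 1/x = price."""
    return 1.0 / price


def social_optimum():
    """Maximize U1 + U2 over x: the FOC is 1/x = P + D."""
    return 1.0 / (P + D)


def pigovian_tax():
    """Minus the marginal external effect dU2/dx = -D (weights and multiplier equal 1)."""
    return D
```

Facing the untaxed price `P`, agent 1 consumes more than the social optimum. Facing `P + pigovian_tax()`, her private choice coincides with it.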


22 Common Property Resources

Common property resources are resources to which all have equal access. A classic example is fisheries, though ground water is also a common property resource. Assume that the total production from a common property resource $Y$ is a function $F$ of the total inputs applied to the resource $X$, $Y = F(X)$, where $X = \sum_i x_i$, and $x_i$ is the input applied by person $i$. So $Y$ could be the catch from a fishery and $x_i$ the number of vessel-hours applied to the fishery by agent $i$. Alternatively $Y$ could be the water withdrawn from an aquifer and $x_i$ the capacity of the wells drilled by person $i$.
We assume $F' > 0$ and $F'' < 0$. These imply that $\exists k \ge 0 : \lim_{X \to \infty} F'(X) = k$, and $\lim_{X \to \infty} F''(X) = 0$.
We assume further that person $i$'s output is
$$y_i = x_i \frac{F(X)}{X}$$
which means that she gets as her output a share of the total equal to the share of inputs that she provides. Another way of thinking of this is that $F(X)/X$ is the average product of the input, and each agent gets the average product times the amount of input she provides. We can write this as
$$y_i = x_i \frac{F(x_i + x_{-i})}{x_i + x_{-i}}$$
where $x_{-i}$ is the total input provided by all agents other than $i$. If the cost of input is $p$ then each agent seeks to maximize
$$\pi_i = x_i \frac{F(x_i + x_{-i})}{x_i + x_{-i}} - p x_i$$

If she takes $X$ as given, as would make sense if there are many agents, each small with respect to the total, the FOCs are
$$\frac{F(X)}{X} = p$$
so that average product equals price.
Now look at the Pareto efficient outcome. We seek to maximize $F(X) - pX$ and the FOC is
$$\frac{\partial F}{\partial X} = p$$
or marginal product equals price. As $F(X)/X > F'(X)$, we see that the resource is over-used under a competitive regime. Note that under the assumptions specified on $F$, it is the case that
$$\lim_{X \to \infty} \left[ \frac{F(X)}{X} - F'(X) \right] = 0$$
so that with infinitely many agents the two outcomes are the same. They are also the same when there is only one agent.
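The gap between the two conditions can be seen in a small numerical sketch. It is illustrative only: the technology $F(X) = \sqrt{X}$, the input price in the usage note, the bisection bounds and the function names are all assumptions, not part of the notes. It solves "average product equals price" and "marginal product equals price" by bisection and compares the resulting rents $F(X) - pX$.

```python
import math

# Illustrative sketch: open access vs. efficient use of a common property
# resource, assuming the technology F(X) = sqrt(X).


def F(X):
    return math.sqrt(X)


def F_prime(X):
    return 0.5 / math.sqrt(X)


def bisect_decreasing(func, lo=1e-9, hi=1e9, iters=200):
    """Root of a decreasing function on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def open_access_input(p):
    """Total input when each small agent takes X as given: F(X)/X = p."""
    return bisect_decreasing(lambda X: F(X) / X - p)


def efficient_input(p):
    """Total input maximizing F(X) - p*X: F'(X) = p."""
    return bisect_decreasing(lambda X: F_prime(X) - p)


def rent(X, p):
    """Surplus of output over input cost."""
    return F(X) - p * X
```

With an assumed input price of `p = 0.5`, open access uses four times the efficient input and drives the rent to zero, while the efficient level leaves a strictly positive rent.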

23 Non-Convex Production Sets

If production sets are not convex, then


1. there may be no competitive equilibrium, as supply functions are not
continuous
2. it is not the case that any Pareto efficient allocation can be supported as a CE
In this case we generally work with what is called a marginal cost pricing
equilibrium, which is a form of regulated equilibrium.
Definition 42. A marginal cost pricing equilibrium (MCPE) is an allocation $(x_j^*, y_i^*)$ and a price vector $p^*$ and wealth levels $W_j$, $\sum_j W_j = p^* \cdot \sum_j w_j + \sum_i p^* \cdot y_i^*$, such that
1. For each firm $i$ the first order conditions for profit maximization are satisfied, i.e. $p^* = \gamma_i \nabla F_i(y_i^*)$, for some $\gamma_i > 0$.
2. For each consumer $j$, $x_j^*$ maximizes $U_j(x_j)$ subject to $p^* \cdot x_j \le W_j$
3. $\sum_j x_j^* = \sum_j w_j + \sum_i y_i^*$
So this is a competitive equilibrium except that firms are not necessarily
maximizing profits, but they are satisfying the FOCs for profit maximization.
Profits may be negative, so that they could be increased by closing down. So
any CE is a MCPE, but the converse is not true.
Note that a MCPE satisfies the FOCs for PE: all marginal rates of substitution and transformation are equal. We can see that this is necessary for PE from proposition 39, which shows that a PE allocation is the maximum of a weighted sum of utilities subject to production and resource constraints. Assume the sets $Y_i$ can be described by functions $F_i(y_i) \le 0$. Then the maximization problem in proposition 39 is
$$\max_{x_j, y_i} \sum_j \lambda_j U_j(x_j), \quad \sum_j x_j = \sum_j w_j + \sum_i y_i, \quad F_i(y_i) = 0$$
The Lagrangian is
$$L = \sum_j \lambda_j U_j(x_j) + \mu \cdot \left\{ \sum_j w_j + \sum_i y_i - \sum_j x_j \right\} - \sum_i \gamma_i F_i(y_i)$$
which gives as FOCs those of the MCPE: $\lambda_j \nabla U_j(x_j) = \mu$ and $\mu = \gamma_i \nabla F_i(y_i)$, with the multiplier $\mu$ playing the role of the price vector $p^*$.


In the US many regulated utilities are expected to price at or near marginal
cost and then cover losses from the fixed elements of two-part tariffs.


24 Time and Uncertainty in General Equilibrium: the Arrow-Debreu Model

Next we extend the static, deterministic model of competitive equilibrium


considered above to a world of time and uncertainty. Assume that the world
lasts for T > 0 time periods, indexed by t = 1, ..., T and that any one of S > 0
states of the world may obtain, indexed by s = 1, ..., S. A state is a complete
description of the evolution of the world over all time periods, a description
so complete that it resolves all uncertainty. There are as usual N distinct
goods and services, indexed by n = 1, ..., N .
The major innovation is to distinguish commodities by the date and state in which they are available: $c_{jts}$ is now individual $j$'s consumption vector in period $t$ in state $s$. It is a vector in $\mathbb{R}^N$, and a complete description of agent $j$'s consumption is $\{c_{jts}\}_{t=1,..,T,\ s=1,..,S} \in \mathbb{R}^{NTS}$. We denote this by $c_j \in \mathbb{R}^{NTS}$. There are markets and prices for all $NTS$ commodities. So commodity $k$ at date $t$ and in state $s$ is a different commodity from $k$ at date $t'$ and in state $s'$. An umbrella if it rains is different from one if it is dry: $500,000 when your house has just burned down is different from $500,000 if your house is intact.
We have $TS$ times as many commodities, markets and prices as in the atemporal certain case, and the commodities are called time-state-contingent commodities or just contingent commodities. People have preferences over this extended commodity space $\mathbb{R}^{NTS}$, which will reflect their attitudes towards risk and time, and firms' production sets $Y_i \subset \mathbb{R}^{NTS}$ contain production plans that are state-contingent and extend over time.
We can define a competitive equilibrium in this extended commodity
space exactly as before. And as before it will be Pareto efficient, meaning
that no alternative feasible allocation of goods between states, dates and
people will make someone better off without making someone else worse off.
Let $p_{kts}$ be the price of good $k$ in period $t$ and state $s$, $p \in \mathbb{R}^{NTS}$ the overall price vector, and $y_{its}$ be firm $i$'s production plan in period $t$ and state $s$, with $y_i \in \mathbb{R}^{NTS}$ its overall production plan. Firm $i$'s profit is as usual $p \cdot y_i$. So a competitive equilibrium in the state-contingent commodity world is defined as follows:
Definition 43. A competitive equilibrium of the economy with time-state-contingent commodities is a set of prices $p^* \in \mathbb{R}^{NTS}$, a set of production plans for each firm $y_i^* \in \mathbb{R}^{NTS}$ and a set of consumption plans for each person $x_j^* \in \mathbb{R}^{NTS}$ such that
1. $y_i^*$ maximizes $\pi_i = p^* \cdot y$ for $y \in Y_i$
2. $x_j^*$ maximizes $U_j(x_j)$ subject to $p^* \cdot x_j = p^* \cdot w_j + \sum_i \theta_{ji} \pi_i = W_j$
3. $\sum_j x_j^* = \sum_j w_j + \sum_i y_i^*$

A consumer's endowment $w_j$ is now in $\mathbb{R}^{NTS}$, giving endowments of goods by date and state. Note that except for the dimensionality of the commodity space this is exactly as the earlier definition of a competitive equilibrium, definition 39.
Trading occurs at the start of time, before a state of the world is realized.
Agents trade and enter into contracts contingent on the state of the world
(and time period), before the first time period and when the state is unknown.
One important point to understand is that in this framework firms' profits are not stochastic variables, but certain numbers. They are known independently of the state of the world that obtains. The prices are known, and the firm produces and sells various state-contingent commodities. These are sold before the state is known. Once the state is known the firms deliver whatever they have contracted to deliver in that state, and have already been paid for this, and have already paid for any state-contingent inputs. So firms are fully hedged against uncertainty. Individuals' consumption vectors are however uncertain: they are state-contingent. But conditional on a state and time period they are known with certainty.
The complexity of having state-contingent commodities and a commodity space $\mathbb{R}^{NTS}$ can be avoided by the device of securities, which are contracts that pay a specified amount (generally taken to be a unit of the currency, $1) if and only if a particular state occurs. To focus on the role of securities assume now that there is only one time period but that there are many states, so that the commodity space is $\mathbb{R}^{NS}$, with each physical commodity being contingent on the state that occurs. Then a competitive equilibrium will exist and be efficient under the standard assumptions of earlier sections, but with $NS$ prices and commodities.
As an alternative suppose that before uncertainty is realized, people and firms can trade securities that pay one dollar if and only if any particular state occurs, so that security $s$ pays $1 if and only if state $s$ occurs, and securities are available for all states $s \in S$ (this assumption is often referred to as complete insurance markets). Once uncertainty is realized and the state is known, agents can trade on risk-free spot markets (normal markets) using the money obtained from the securities they purchased that pay off in that state. Under certain conditions this structure can imitate the outcome that would occur if we had a competitive equilibrium with the full $NS$ markets. But instead we have only $N$ markets for goods and $S$ markets for securities, where $S$ is the number of states, and generally $N + S < NS$.
To see this, assume that $p^* \in \mathbb{R}^{NS}$ (there is only one time period now) is the competitive equilibrium price vector with a full set of state-contingent commodity markets. A typical element is $p^*_{ks}$, the price of good $k$ in state $s$. Let $x^*_{jks}$ be the amount of good $k$ person $j$ bought in state $s$ at the competitive equilibrium, and let $y^*_{iks}$ be firm $i$'s supply of good $k$ in state $s$.
Now let $q_s$ be the price of a security that pays one unit if and only if state $s$ occurs. And let $\tilde{p}_{ks}$ be the price of a unit of good $k$ on the spot market once uncertainty has been resolved and the state is known to be $s$. Then the cost of buying a unit of good $k$ if state $s$ occurs is $\tilde{p}_{ks} q_s$. If $\tilde{p}_{ks} q_s = p^*_{ks} \ \forall s \in S, k$, then consumers and firms will make exactly the same choices as they made at the competitive equilibrium with the full set of contingent commodity markets. They will trade securities so as to fund these trades. To purchase $x^*_{jks}$ in this framework will cost agent $j$ $\tilde{p}_{ks} x^*_{jks}$ on the spot market once $s$ has been realized, which she funds with securities costing $\tilde{p}_{ks} q_s x^*_{jks}$, so she will buy securities to the value of $r_{js}$ where
$$\sum_k \tilde{p}_{ks} q_s x^*_{jks} = r_{js}$$
Her purchase of securities across all states $s$ must satisfy the budget constraint
$$\sum_s r_{js} = W_j$$
which follows from item 2 of definition 43.

So if consumers anticipate the spot prices $\tilde{p}_{ks}$ that will rule once uncertainty is resolved and believe these to satisfy $\tilde{p}_{ks} q_s = p^*_{ks} \ \forall s \in S, k$, where $q_s$ are the security prices and $p^*_{ks}$ the contingent commodity prices, the outcome will be an efficient competitive equilibrium. The conclusion: $S$ securities markets and $N$ goods markets can replace $NS$ contingent commodity markets if agents have price expectations that mimic the prices that would have ruled on contingent commodity markets. (Arrow, The Role of Securities in an Optimal Allocation of Risk-Bearing, Review of Economic Studies 1964, first published in French in 1953.)
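The accounting behind this conclusion can be sketched in a few lines. Everything numerical below is an assumption made for illustration (two goods, three states, arbitrary positive prices and a hypothetical consumption plan `x`): spot prices are set so that spot price times security price equals the contingent price, and the sketch confirms that the total outlay on securities equals the cost of the same plan on the full set of contingent markets.

```python
# Illustrative sketch: S securities plus N spot markets replicate the cost of
# a plan bought on N*S contingent markets when spot[k][s] * q[s] == p_star[k][s].

# p_star[k][s]: assumed contingent-commodity price of good k in state s
p_star = [[0.2, 0.3, 0.1],
          [0.1, 0.2, 0.1]]
q = [0.3, 0.5, 0.2]            # assumed price of the security paying 1 in state s
x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]          # hypothetical consumption plan x[k][s]

N, S = len(p_star), len(q)

# Spot prices that satisfy spot * q = contingent price, state by state.
spot = [[p_star[k][s] / q[s] for s in range(S)] for k in range(N)]


def securities_needed(s):
    """Units of security s required: spending on the spot market in state s."""
    return sum(spot[k][s] * x[k][s] for k in range(N))


def security_outlay(s):
    """r_s: cost before the state is known of the holding of security s."""
    return q[s] * securities_needed(s)


contingent_cost = sum(p_star[k][s] * x[k][s] for k in range(N) for s in range(S))
total_security_outlay = sum(security_outlay(s) for s in range(S))
```

Here `total_security_outlay` matches `contingent_cost`, while only `N + S = 5` markets are used instead of `N * S = 6`. The saving grows quickly with the number of goods and states.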