Notes 03
Reminders
On the space of lotteries L that offer a finite number of consequences (C1 , C2 , . . . Cn ) with
probabilities (p1 , p2 , . . . pn ), we established the existence of a utility function u(L) such that:
[1] it represents the preferences, that is, it has the property that for any two lotteries La
and Lb ,
$$u(L_a) > u(L_b) \text{ if and only if } L_a \succ L_b$$
[2] has the expected utility property
$$EU(L) \equiv u(L) = \sum_{i=1}^{n} p_i \, u(C_i)$$
The proof was “constructive”. On the bounded (and closed, if you are a mathematician)
set of lotteries we let $\bar{L}$ be the best and $\underline{L}$ the worst. Then we showed the existence of a
unique p such that L is indifferent to the lottery that yields $\bar{L}$ with probability p and $\underline{L}$ with
probability (1 − p). (This is often written for short as the lottery $p \bar{L} + (1 - p) \underline{L}$; this is
convenient but it should be understood that no sum in any usual number or vector sense
is intended.) Then we simply defined u(L) = p, and verified that it had the two desired
properties.
The same preferences can be represented by another utility function ũ which also has the
expected utility property if (and only if, although we did not prove this) ũ is an increasing
linear (a pedantic mathematician would say “affine”) transform of u: there are constants a,
b with b > 0 such that ũ(c) = a + b u(c) for all c.
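As a quick numerical illustration of the last point, here is a minimal Python sketch. The square-root utility, the constants a = 3 and b = 2, and the two lotteries are all made-up assumptions, chosen only so the arithmetic is easy to follow; none of them come from the notes.

```python
import math

def expected_utility(probs, consequences, u):
    """Expected utility: sum of p_i * u(C_i) over the lottery's consequences."""
    return sum(p * u(c) for p, c in zip(probs, consequences))

# Hypothetical utility-of-consequences function (an assumption for illustration).
u = math.sqrt

def u_tilde(c):
    """Increasing affine transform of u with a = 3 and b = 2 > 0."""
    return 3.0 + 2.0 * u(c)

# Two illustrative lotteries, each written as (probabilities, consequences).
lottery_a = ([0.5, 0.5], [100.0, 0.0])
lottery_b = ([1.0], [36.0])

eu_a = expected_utility(*lottery_a, u)
eu_b = expected_utility(*lottery_b, u)
eu_a_t = expected_utility(*lottery_a, u_tilde)
eu_b_t = expected_utility(*lottery_b, u_tilde)

print(eu_a, eu_b)      # 5.0 6.0
print(eu_a_t, eu_b_t)  # 13.0 15.0
```

Both utility functions rank the sure 36 above the 50:50 gamble, and the transformed expected utilities are exactly 3 + 2 times the original ones.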
Each consequence Ci is a degenerate lottery that yields this consequence with probability 1
and all other consequences Cj with probability zero, so the construction automatically
gives a utility function u(Ci ) over consequences. We call this the von Neumann-Morgenstern
utility function, to distinguish it from the expected utility function for a non-degenerate
lottery. (A pedantic mathematician would create different symbols for the two.)
Many of our applications will be expressed in terms of actions a, possible states of the
world s, and consequence functions c = F (a, s). We can convert our theory of preferences
over lotteries easily to this context by writing the expected utility of an action as the
expectation of a random variable, namely the utility of the consequence that the action
yields in each possible state of the world:
$$EU(a) = \sum_{j=1}^{m} \Pr(s_j) \, u(F(a, s_j))$$
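The formula translates directly into code. The sketch below is only an illustration: the states, actions, probabilities, and money payoffs are invented, and the logarithmic utility is an assumption.

```python
import math

def expected_utility_of_action(action, state_probs, F, u):
    """EU(a) = sum over states s_j of Pr(s_j) * u(F(a, s_j))."""
    return sum(prob * u(F(action, state)) for state, prob in state_probs.items())

# Hypothetical states of the world and their probabilities.
state_probs = {"rain": 0.3, "sun": 0.7}

# Hypothetical consequence function c = F(a, s), in money terms.
payoffs = {
    ("umbrella", "rain"): 84.0,
    ("umbrella", "sun"): 84.0,
    ("no umbrella", "rain"): 50.0,
    ("no umbrella", "sun"): 100.0,
}

def F(action, state):
    return payoffs[(action, state)]

actions = ["umbrella", "no umbrella"]

# Best action under logarithmic utility and under linear (risk-neutral) utility.
best_log = max(actions, key=lambda a: expected_utility_of_action(a, state_probs, F, math.log))
best_linear = max(actions, key=lambda a: expected_utility_of_action(a, state_probs, F, lambda c: c))

print(best_log, "|", best_linear)
```

With these numbers the two decision-makers choose differently: expected money is higher without the umbrella (85 against 84), but the logarithmic utility favors the sure 84.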
Risk aversion
In consumer theory without uncertainty, if c is a positive scalar magnitude like money income
or wealth or consumption, the utility functions $c$, $c^2$, $\sqrt{c}$, $e^c$, and $\ln(c)$ would all represent the
same preferences (all reflecting the trivial property that more is better). But as components
of expected utility, these are different. For example, if there are just two consequences with
probabilities p1 , p2 , the three expected utility functions
$$p_1 c_1 + p_2 c_2, \quad p_1 \ln(c_1) + p_2 \ln(c_2), \quad \text{and} \quad p_1 (c_1)^2 + p_2 (c_2)^2$$
represent very different preferences. (Just sketch indifference curves in (c1 , c2 ) space.) Specif-
ically, they represent preferences with very different attitudes toward risk. We now develop
this idea.
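Before developing the idea, a small numerical comparison may help. The sketch below evaluates the three expected utility functions on a sure prospect and on a risky prospect with the same expected money value; the particular numbers (10 or 90 with equal probabilities, against a sure 50) are made up for illustration.

```python
import math

def eu(u, probs, consequences):
    """Expected utility of a prospect under utility-of-consequences function u."""
    return sum(p * u(c) for p, c in zip(probs, consequences))

probs = (0.5, 0.5)
risky = (10.0, 90.0)   # expected money value 50
sure = (50.0, 50.0)    # 50 in both states

utilities = {
    "linear": lambda c: c,
    "log": math.log,
    "square": lambda c: c ** 2,
}

# Positive gap: the risky prospect is preferred; negative gap: the sure one is.
gap = {name: eu(u, probs, risky) - eu(u, probs, sure) for name, u in utilities.items()}

for name, value in gap.items():
    print(name, value)
```

The linear form is indifferent, the logarithmic form prefers the sure prospect, and the quadratic form prefers the risky one.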
We will usually take the consequences c to be monetary magnitudes such as income or
wealth. If the underlying preferences are defined over quantities of goods, then we can work
in terms of the indirect utility function of income or wealth, so long as the relative prices
are constant or are not the focus of the analysis.
With this convention, if preferences can be represented by expected utility where the
utility-of-consequences function is linear, so we can take u(c) = c up to an increasing linear
transformation, that means the decision-maker is indifferent between two alternatives that
yield equal expected income or wealth, $\sum_i p_i c_i$, regardless of the variance or any other
measure of dispersion of the distribution over consequences. In other words, this would be
a risk-neutral decision-maker. But this is an exceptional case, and raises difficulties, one of
which was the starting-point of this whole subject. So we begin there.
But no one seems willing to pay any very large sums, let alone unbounded sums, to play this
game. This is the St. Petersburg Paradox. (For most of the 20th century it was renamed
the Leningrad Paradox :-) .)
Nicholas’ brother Daniel Bernoulli offered the following resolution. “People’s perceptions
of money are logarithmic. Therefore the log of the value they place on the game equals
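$$\begin{aligned}
\sum_{n=1}^{\infty} 2^{-n} \ln(2^n) &= \sum_{n=1}^{\infty} 2^{-n} \, n \ln(2) = \frac{\ln(2)}{2} \sum_{n=1}^{\infty} n \, 2^{-(n-1)} \\
&= \frac{\ln(2)}{2} \left[ \sum_{n=1}^{\infty} 2^{-(n-1)} \right]^2 = \frac{\ln(2)}{2} \, 2^2 = 2 \ln(2) = \ln(2^2) = \ln(4)
\end{aligned}$$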
This is like the “compensating variation” in ECO 310 – it is the change in money income
that compensates for, or cancels out, the effect of the lottery and leaves Bernoulli at the
same level of utility as before. We could instead look for the “equivalent variation,” namely
the sure amount of money that would give Bernoulli the same utility as the expected utility
he would get when given a gift of the lottery. Then we want the Y that solves the equation
$$\ln(W_0 + Y) = \sum_{n=1}^{\infty} 2^{-n} \ln(W_0 + 2^n)$$
If you have elementary programming skills, try these out for a few values of W0 .
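Here is one way to do it in Python for the equivalent variation Y. This is a rough sketch, not a careful numerical treatment: the infinite sum is simply truncated after 200 terms (an arbitrary choice; the omitted tail is negligible), and the function name and the sample values of W0 are my own.

```python
import math

def equivalent_variation(w0, n_terms=200):
    """Solve ln(w0 + Y) = sum_{n>=1} 2^{-n} ln(w0 + 2^n) for Y.

    The infinite sum is truncated after n_terms terms.
    """
    expected_log_wealth = sum(
        2.0 ** -n * math.log(w0 + 2.0 ** n) for n in range(1, n_terms + 1)
    )
    return math.exp(expected_log_wealth) - w0

# A few illustrative values of initial wealth W0.
for w0 in (0.0, 10.0, 100.0, 1000.0, 1000000.0):
    print(w0, equivalent_variation(w0))
```

With W0 = 0 the sum reduces to the one in Bernoulli's calculation, so Y should come out at 4, which is a useful check on the code.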
More importantly, Bernoulli’s resolution of the paradox is unsatisfactory in a more fun-
damental way. Even with a logarithmic utility-of-consequences function, the paradox can
be reconstructed by changing the reward if heads show up first on the nth toss from $2^n$ to
$R_n = \exp(2^n)$. Then the utilities of consequences are $u(R_n) = \ln(\exp(2^n)) = 2^n$, and now
the expected utility is infinite. But most people still would not be willing to pay very large
sums up front for this prospect. The only sure way to avoid the paradox in this framework is
to have a utility-of-consequences function that is bounded above, but that can create other
problems. More realistically, perhaps people just don’t believe that the prizes will actually
be paid out if a large value of n is realized, and such disbelief is justified since the prizes
soon start to exceed the GDP of the US or of the whole world.
and C2 = C0 − k with probabilities 1/2 each. Suppose the utility-of-consequences function u
is concave. Then, as is evident from Figure 1,
$$\tfrac{1}{2} u(C_1) + \tfrac{1}{2} u(C_2) < u(C_0) \qquad (1)$$
In terms of expected utilities, this becomes EU (L) < EU (L0 ). (To be mathematically
rigorous, I should define a concept of “strict concavity” that will yield strict inequalities and
ordinary or weak concavity that will yield only weak inequalities, but I will leave this out;
the textbook is likewise sloppy.)
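[Figure 1: the concave utility-of-consequences function u(C) plotted against C. The horizontal axis marks $C_2$, $C^{CE}$, $C_0$, and $C_1$; the vertical axis marks $u(C_0)$ and $\tfrac{1}{2} u(C_1) + \tfrac{1}{2} u(C_2)$.]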
The figure also shows that there is a $C^{CE} < C_0$ (but $C^{CE} > C_2$) such that
$$\tfrac{1}{2} u(C_1) + \tfrac{1}{2} u(C_2) = u(C^{CE})$$
so the decision maker with this utility function is indifferent between the random prospect
L and the sure prospect $C^{CE}$. Then we call $C^{CE}$ the certainty-equivalent of L. Suppose
this person starts at C0 but is then confronted with the prospect of gaining or losing k with
equal probabilities 1/2 each. We can think of $C_0 - C^{CE}$ as the highest insurance premium he
is willing to pay to avoid the risk; it is called the risk premium for the decision maker in this
initial situation facing this random prospect.
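To make the certainty-equivalent and the risk premium concrete, here is a small Python sketch. It assumes a logarithmic utility-of-consequences function (so that its inverse is the exponential) and the made-up values C0 = 100 and k = 50; the function name is mine.

```python
import math

def certainty_equivalent(u, u_inverse, probs, consequences):
    """Sure amount whose utility equals the expected utility of the prospect."""
    expected_utility = sum(p * u(c) for p, c in zip(probs, consequences))
    return u_inverse(expected_utility)

# Hypothetical starting point and size of the gamble.
c0, k = 100.0, 50.0

# Gain or lose k with probability 1/2 each, evaluated with u = ln.
ce = certainty_equivalent(math.log, math.exp, [0.5, 0.5], [c0 + k, c0 - k])
risk_premium = c0 - ce

print(ce, risk_premium)
```

For the logarithmic case the certainty-equivalent is the geometric mean of the two outcomes, here the square root of 150 × 50 = 7500, so it lies below C0 = 100 and above C2 = 50 as the figure suggests.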
There are three different ways of characterizing a concave function:
[1] $u''(C) < 0$,
[2] $u'(C)$ is a decreasing function of C, and
[3] For any distinct C1 , C2 and any p ∈ (0, 1), p u(C1 ) + (1 − p) u(C2 ) < u(p C1 + (1 − p) C2 ).
They have different degrees of generality: the first requires u to be twice differentiable,
the second requires it to be only once-differentiable, and the third does not require any
differentiability at all, for example it works for piecewise linear functions. For most of our
uses, we will work with twice-differentiable functions, so this distinction is not so important
and we can use the three characterizations indifferently. There is another point where care is
necessary: the inequalities in the above definitions can be weak, in which case we will speak
of a “weakly concave” function, and distinguish it from a “strictly concave” function where
the inequalities are strict. An expected-utility maximizer with a weakly concave utility-of-
consequences function may be indifferent to some fair gambles, but can never be an actual
risk-lover. This distinction too should be kept in mind, but for most of our uses it will not
play a role.
Jensen’s Inequality
Returning to the idea that the concavity of the u function implies risk-aversion, we can
develop it more generally. Consider a lottery whose outcomes are a random variable c with
a given distribution. Then u(c) is another random variable. We show that
$$E[u(c)] < u(E[c]) \qquad (2)$$
that is, the expected utility of the lottery is less than the sure utility one would get from
having the monetary expected outcome with certainty. (2) is called Jensen’s Inequality.
Know and remember it well; it is used all the time in the economics of uncertainty. The (1)
we started with is a special case with just two possible outcomes of the random c.
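A quick numerical check of the inequality, for a lottery with three outcomes, can be done in a few lines of Python. The outcomes, probabilities, and the square-root utility below are made-up assumptions for illustration.

```python
import math

def expectation(probs, values):
    """Mathematical expectation of a discrete random variable."""
    return sum(p * v for p, v in zip(probs, values))

# Hypothetical lottery over money outcomes, and a concave utility function.
probs = [0.2, 0.5, 0.3]
outcomes = [10.0, 40.0, 90.0]
u = math.sqrt

expected_utility = expectation(probs, [u(c) for c in outcomes])  # E[u(c)]
utility_of_mean = u(expectation(probs, outcomes))                # u(E[c])

print(expected_utility, utility_of_mean)
```

Here E[c] = 49, so the utility of the mean is 7, while the expected utility comes out lower, as the inequality requires.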
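To prove it, write $\bar{c} = E[c]$ for brevity, and use Taylor's theorem with remainder to write
$$u(c) = u(\bar{c}) + (c - \bar{c}) \, u'(\bar{c}) + \tfrac{1}{2} (c - \bar{c})^2 \, u''(\tilde{c})$$
for some $\tilde{c}$ lying between $c$ and $\bar{c}$. Since u is assumed to be concave, $u''(\tilde{c}) < 0$ and so
$$u(c) \le u(\bar{c}) + (c - \bar{c}) \, u'(\bar{c}),$$
with strict inequality whenever $c \ne \bar{c}$. Taking expectations of both sides and using $E[c - \bar{c}] = 0$ gives (2).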
and then the risk premium is $\Pi \equiv C_0 - C^{CE}$. For some functional forms of u, it may be
possible to solve the equation explicitly. But most of the time we have to rely on numerical
solution. Alternatively, for small risks we can use a Taylor approximation to get an expression
that is due to Pratt and Arrow; it is called the Pratt or Arrow-Pratt approximation. Here
is a heuristic derivation; a more rigorous justification requires more calculus but yields no
better understanding.
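Write (3) as
$$u(C_0 - \Pi) = \sum_{i=1}^{n} p_i \, u(C_0 + X_i).$$
Expand each side in a Taylor series around $C_0$, taking the risk to have zero mean, $\sum_i p_i X_i = 0$, so that $\sum_i p_i X_i^2 = \mathrm{Var}[X]$:
$$u(C_0) - \Pi \, u'(C_0) + \ldots = u(C_0) + \tfrac{1}{2} u''(C_0) \, \mathrm{Var}[X] + \ldots$$
where . . . indicates higher-order (smaller for small risks) terms in the expansion.
Canceling u(C0 ) from both sides, and equating the leading terms that remain on each
side (rigorous proof of the validity of this step is what requires more math), we have
$$-\Pi \, u'(C_0) = \tfrac{1}{2} u''(C_0) \, \mathrm{Var}[X]$$
or
$$\Pi = \frac{1}{2} \, \frac{-u''(C_0)}{u'(C_0)} \, \mathrm{Var}[X].$$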
This is quite intuitive: the risk premium is proportional to the variance of the random
component of C, which can be thought of as a measure of the magnitude of the risk, and
also proportional to a measure of the extent of concavity or curvature of the utility-of-
consequences function. The latter factor then serves as a measure of the decision-maker’s
risk aversion for small risks around C0 :
$$A(C_0) = \frac{-u''(C_0)}{u'(C_0)}. \qquad (4)$$
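To see how good the approximation is for a small risk, the Python sketch below compares it with the exact risk premium, found by solving u(C0 − Π) = Σ pi u(C0 + Xi) by bisection. Everything specific here is an assumption for illustration: logarithmic utility (for which A(C0) = 1/C0), C0 = 100, and a risk of ±10 with equal probabilities. Bisection is just one simple choice of numerical method.

```python
import math

def risk_premium_exact(u, c0, xs, ps, lo, hi, tol=1e-12):
    """Solve u(c0 - premium) = sum_i p_i * u(c0 + x_i) by bisection on [lo, hi].

    Assumes u is increasing, so u(c0 - premium) is decreasing in the premium.
    """
    target = sum(p * u(c0 + x) for p, x in zip(ps, xs))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if u(c0 - mid) > target:
            lo = mid   # premium too small: sure utility still exceeds the target
        else:
            hi = mid
    return 0.5 * (lo + hi)

def risk_premium_arrow_pratt(abs_risk_aversion, variance):
    """Small-risk approximation: half of A(C0) times Var[X]."""
    return 0.5 * abs_risk_aversion * variance

# Hypothetical example: log utility, C0 = 100, zero-mean risk of +10 or -10.
c0 = 100.0
xs, ps = [10.0, -10.0], [0.5, 0.5]
variance = sum(p * x ** 2 for p, x in zip(ps, xs))   # mean of X is zero

exact = risk_premium_exact(math.log, c0, xs, ps, lo=0.0, hi=10.0)
approx = risk_premium_arrow_pratt(1.0 / c0, variance)  # A(C0) = 1/C0 for u = ln

print(exact, approx)
```

The approximate premium is 0.5, and the exact one is only slightly larger. Making the risk bigger relative to C0 widens the difference, which is what "for small risks" is warning about.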
This is called the coefficient of absolute risk aversion to distinguish it from the coefficient of
relative risk aversion, where the latter pertains to risks that are expressed as a proportion of
the initial position, and the risk premium is likewise expressed as a proportion. Consider a
decision-maker who starts from the position $C_0$ and faces the risk of going to $C_0 (1 + \hat{X})$ where
$\hat{X}$ is a random variable with zero mean, for example taking on values $\hat{X}_i$ with probabilities
$p_i$ and satisfying $\sum_i p_i \hat{X}_i = 0$. The relative risk premium, call it $\hat{\Pi}$, solves the equation
$$u(C_0 (1 - \hat{\Pi})) = \sum_{i=1}^{n} p_i \, u(C_0 (1 + \hat{X}_i)).$$
We can obtain an Arrow-Pratt measure of relative risk aversion by calculations very
similar to those above for the absolute case; I highly recommend that you try this yourself
to improve your understanding and skills. The resulting measure is
$$R(C_0) = \frac{-C_0 \, u''(C_0)}{u'(C_0)}. \qquad (5)$$
$$u'(C) = a \exp(-aC), \quad u''(C) = -a^2 \exp(-aC), \quad A(C) = \frac{-u''(C)}{u'(C)} = a$$
so this function has constant absolute risk aversion, and a is the parameter that measures
this risk aversion. By reversing the steps, that is, by solving the differential equation
− u00 (C)/u0 (C) = a, we can show that all utility-of-consequences functions with constant
absolute risk aversion belong to the family (6), of course within an increasing linear trans-
formation.
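The constancy of A(C) can also be checked numerically. The sketch below estimates u' and u'' by central finite differences and evaluates A(C) at a few points. It assumes the specific form u(C) = −exp(−aC), which is consistent with the derivatives above but is my choice of normalization; the value a = 0.5, the step size, and the evaluation points are likewise arbitrary.

```python
import math

def absolute_risk_aversion(u, c, h=1e-3):
    """Estimate A(c) = -u''(c) / u'(c) using central finite differences."""
    first = (u(c + h) - u(c - h)) / (2.0 * h)
    second = (u(c + h) - 2.0 * u(c) + u(c - h)) / (h * h)
    return -second / first

a = 0.5  # hypothetical risk-aversion parameter

def u_cara(c):
    """Assumed constant-absolute-risk-aversion form: u(C) = -exp(-a C)."""
    return -math.exp(-a * c)

points = (1.0, 5.0, 10.0)
estimates = [absolute_risk_aversion(u_cara, c) for c in points]

# Relative risk aversion is C times absolute risk aversion, so it grows with C here.
relative = [c * est for c, est in zip(points, estimates)]

print(estimates)
print(relative)
```

The estimates of A(C) stay at a up to finite-difference error, while the corresponding relative risk aversion increases in proportion to C.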
[2] Next, consider
$$u(C) = \begin{cases} \dfrac{1}{1-r} \, C^{1-r} & \text{for } r > 0, \ r \ne 1 \\ \ln(C) & \text{for } r = 1 \end{cases} \qquad (7)$$
For this family of functions,
$$u'(C) = C^{-r}, \quad u''(C) = -r \, C^{-r-1}, \quad R(C) = \frac{-C \, u''(C)}{u'(C)} = r;$$
More on Cardinal v. Ordinal Utility
Suppose the utility-of-consequences function is logarithmic, so the expected utility of a lottery
$L = (C_1, C_2; p_1, p_2)$ is
$$EU(L) = p_1 \ln(C_1) + p_2 \ln(C_2).$$
Then
$$e^{EU(L)} = (C_1)^{p_1} (C_2)^{p_2},$$
the famous Cobb-Douglas utility function from ECO 310. And for any two lotteries $L$, $L'$,
$$L \succ L' \text{ if and only if } EU(L) > EU(L') \text{ if and only if } e^{EU(L)} > e^{EU(L')},$$
so we could have used the Cobb-Douglas utility function equally well to represent the same
preferences. Only it would not have the “expected utility” form – it would not be the mathe-
matical expectation of anything. (This approach is useful when comparing random prospects
with common probabilities but different monetary consequences, as is done when we con-
sider bets of different magnitude, or insurance coverage with different sizes of deductible,
coinsurance, and indemnity, for a given event. We will discuss such applications in a couple
of weeks.)
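A short Python sketch of this equivalence, using two made-up lotteries with common probabilities, is given below. The function names and the numbers are assumptions for illustration only.

```python
import math

def expected_log_utility(c1, c2, p1, p2):
    """Expected utility with logarithmic utility of consequences."""
    return p1 * math.log(c1) + p2 * math.log(c2)

def cobb_douglas(c1, c2, p1, p2):
    """The monotone transform exp(EU): C1^p1 * C2^p2."""
    return c1 ** p1 * c2 ** p2

p1, p2 = 0.5, 0.5
lottery = (100.0, 25.0)        # hypothetical consequences (C1, C2)
lottery_prime = (64.0, 36.0)   # same probabilities, different consequences

eu, eu_prime = (expected_log_utility(*lot, p1, p2) for lot in (lottery, lottery_prime))
cd, cd_prime = (cobb_douglas(*lot, p1, p2) for lot in (lottery, lottery_prime))

print(eu, eu_prime)
print(cd, cd_prime)
```

Both representations rank the first lottery above the second, and the Cobb-Douglas values are exactly the exponentials of the expected utilities (50 and 48 here, up to rounding).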
More generally, we can take any nonlinear increasing transformation of the whole of
expected utility to represent the same preferences. In this sense we could say that utility is
still ordinal. But the transforms do not have the expected utility form. If we want to preserve
that, the only kind of transforms permissible are increasing linear (or pedantically, affine)
transforms of the underlying utility-of-consequences function. So the utility-of-consequences
function has to be cardinal if we want the expected utility form of the overall objective that
is maximized.
safety net means that there is little effective downside. This can happen even when the
risk is statistically unfavorable in the sense that the expected value of the money amount
is negative. Similarly, if a firm is close to bankruptcy, owners or managers with substantial
equity stakes may find it attractive to take excessive risks, knowing that the downside will
fall on other lenders or bondholders: for the equity holder, going bankrupt for a million
dollars is no worse than going bankrupt for a thousand.
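[Figure: the utility-of-consequences function u(C) plotted against C, with the level $C_L$ marked on the horizontal axis.]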
While a convex utility-of-consequences function (at least over some intervals of wealth or
income) can explain gambling behavior in an expected utility framework, probably a more
realistic explanation is that people enjoy the act of gambling for its own sake, as a form of
entertainment. However, this goes against a basic assumption underlying expected utility
theory, namely the assumption that compound lotteries can be collapsed into their simple
form without affecting preferences. If you enjoy gambling, you care about the process by
which uncertainty is resolved. This violates the compound lottery axiom. (There is some
very recent work that studies some consequences of such behavior.)
Many other kinds of utility-of-consequences functions have been suggested. Friedman
and Savage argued that there would be risk aversion for very low and very high levels of
wealth, and a middle range of risk preference. This is somewhat the opposite of the safety
net idea described above.
Kahneman and Tversky proposed the “prospect theory” of choice under risk, in which
the status quo plays a special role. The utility-of-consequences function is generally concave
for gains and convex for losses, with a discontinuity of slope (kink) at the status-quo point.
Such people would be especially averse to small gambles around the status quo, but may
like some larger gambles. Figure 3 shows these examples. Rabin studied some consequences
of this kink, and we will consider this in more detail later.
[Figure 3: two panels, "Friedman and Savage" and "Kahneman and Tversky", each plotting u(C) against C. The Kahneman and Tversky panel marks the status quo $C_0$, where there is a discontinuous change in slope (kink).]
The utility-of-consequences function may be different in different states of the world, if
something about the state (for example the state of your health, or whether your favorite
sports team wins) makes you value the consequences (for example money) differently. Suppose
in state $s_j$ for j = 1, 2, . . . m, the utility-of-consequences function is $u_j(c)$. Then the
expected utility of an action a is
$$EU(a) = \sum_{j=1}^{m} \Pr(s_j) \, u_j(F(a, s_j))$$
where c = F (a, s) is the consequence function. The exact form of a state-dependent utility
is very specific to its context, so it is not possible to develop any useful general theory of
this. But we will use it from time to time in problems and applications.