Gilboa Notes For Introduction To Decision Theory
Itzhak Gilboa
March 6, 2013
Contents
1 Preference Relations
2 Utility Representations
2.1 Representation of a preference order
2.2 Characterization theorems for maximization of utility
3 Semi-Orders
3.1 Just Noticeable Difference
3.2 A note on the proof
3.3 Uniqueness of the utility function
4 Choice Functions
7 de Finetti’s Theorem
7.1 Model and Theorem
7.2 Proof
8 Anscombe-Aumann’s Theorem
8.1 Model and Theorem
8.2 Proof
9 Savage’s Theorem
9.1 Set-up
9.2 Axioms
9.3 Results
9.3.1 Finitely additive measures
9.3.2 Non-atomic measures
9.3.3 Savage’s Theorem(s)
9.4 The proof and qualitative probabilities
13 References
These are notes for a basic class in decision theory. The focus is on decision
under risk and under uncertainty, with relatively little on social choice. The
notes contain the mathematical material, including all the formal models and
proofs that will be presented in class, but they do not contain the discussion of
background, interpretation, and applications. The course is designed for 30-40
hours.
1 Preference Relations
A set of ordered pairs of elements of a set X is called a binary relation on X. Formally, R is
a binary relation on X if R ⊂ X × X.
A binary relation R on X is
— reflexive if for every x ∈ X, xRx;
— complete if for every x, y ∈ X, xRy or yRx (or possibly both);
— symmetric if for every x, y ∈ X, xRy implies yRx;
— transitive if for every x, y, z ∈ X, xRy and yRz imply xRz.
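These properties are mechanical enough to check by brute force on a finite X. The following Python sketch (an added illustration, not part of the notes) encodes a relation as a set of ordered pairs; the example relation ≥ on {1, 2, 3} is reflexive, complete, and transitive, but not symmetric.

```python
from itertools import product

def is_reflexive(R, X):
    # xRx for every x in X
    return all((x, x) in R for x in X)

def is_complete(R, X):
    # xRy or yRx (or both) for every x, y in X
    return all((x, y) in R or (y, x) in R for x, y in product(X, X))

def is_symmetric(R, X):
    # xRy implies yRx
    return all((y, x) in R for (x, y) in R)

def is_transitive(R, X):
    # xRy and yRz imply xRz
    return all((x, z) in R
               for (x, y) in R for (w, z) in R if y == w)

# The relation ">=" on {1, 2, 3}
X = {1, 2, 3}
R = {(x, y) for x in X for y in X if x >= y}
```

Note that, as the proof below shows, completeness already forces reflexivity.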
Proof: Given x ∈ X, apply the definition of completeness to the pair (x, y) with
both elements equal to x. Then xRy or yRx, and in both cases this reads xRx. □
and
f (x) = { y | xRy } .
To see that (1) holds, assume, first, that xRy. Then y ∈ f (x) and by symmetry
also x ∈ f (y). Further, transitivity implies that z ∈ f (x) also satisfies z ∈ f (y)
and vice versa. Thus, f (x) = f (y). Conversely, if f (x) = f (y) we first note
that, by reflexivity, y ∈ f (y), hence y ∈ f (x) and xRy. □
The set { y | xRy } is called the equivalence class of x. The set A defined in
the proof, namely, the set of all equivalence classes (which obviously defines a
partition of X) is called the quotient set, denoted X/R.
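On a finite X, the equivalence classes and the quotient set can be computed directly from the definition. A small Python sketch (an added, assumed example, with "same parity" playing the role of R):

```python
def equivalence_class(R, X, x):
    # the equivalence class of x: { y | xRy }
    return frozenset(y for y in X if (x, y) in R)

def quotient_set(R, X):
    # X/R: the set of all equivalence classes; when R is reflexive,
    # symmetric, and transitive, this is a partition of X
    return {equivalence_class(R, X, x) for x in X}

# "same parity" on {0, ..., 5} has exactly two classes
X = set(range(6))
R = {(x, y) for x in X for y in X if x % 2 == y % 2}
classes = quotient_set(R, X)
```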
Proof: To see that ˜ is transitive, assume that x˜y and y˜z; we have (x % y
and y % x) as well as (y % z and z % y). The first two parts imply (by
transitivity of %) x % z, and the second two imply z % x, so we get x˜z.
To see that ≻ is transitive, assume x ≻ y and y ≻ z. That is, (x % y and
not y % x) as well as (y % z and not z % y). The first two parts imply x % z by
transitivity of % as above. We need to show that z % x does not hold. Indeed,
assume it did. Then we would have z % x and x % y, and transitivity (of %
again) would imply z % y, which is in contradiction to y ≻ z. Hence ¬(z % x)
and x ≻ z. □
2 Utility Representations
2.1 Representation of a preference order
Proof: (i) implies (ii): Let there be given x, y ∈ X. Assume first that
x ≻ y. If u(y) ≥ u(x), by (i) we have y % x, a contradiction to x ≻ y. Hence,
u(x) > u(y). Conversely, assume u(x) > u(y). If y % x we would have (by (i)
again) u(y) ≥ u(x), which isn’t true. Hence ¬(y % x). But completeness implies
that x % y or y % x has to hold, and if the latter doesn’t hold, the former does.
So we have x % y and ¬(y % x), that is, x ≻ y.
(ii) implies (i): Let there be given x, y ∈ X. Assume first that x % y.
If u(y) > u(x), by (ii) we have y ≻ x, a contradiction to x % y. Hence,
u(x) ≥ u(y). Conversely, assume that u(x) ≥ u(y). If x % y didn’t hold, we
would have, by completeness, y ≻ x, and then, applying (ii), u(y) > u(x), a
contradiction. Hence u(x) ≥ u(y) implies x % y.
(i) implies (iii): Let there be given x, y ∈ X. Assume first that x˜y. Then
x % y and y % x. Applying (i) we have u(x) ≥ u(y) as well as u(y) ≥ u(x), hence
u(x) = u(y). Conversely, assume that u(x) = u(y). Then we have u(x) ≥ u(y),
which implies (by (i)) x % y, as well as u(y) ≥ u(x) which implies y % x, and
x˜y follows. □
To see that (iii) is strictly weaker than (i) and (ii), take a representation u
of a relation % with more than one equivalence class, and define v = −u. Such
a v will still represent indifferences as in (iii) but not preferences as in (i) or (ii).
Proof: It is easy to see that (ii) implies (i), independently of the cardinality
of X. Indeed, if u represents %, the latter is complete because ≥ is complete on
the real numbers, and % is transitive because so is ≥ (on the real numbers).
The main part of the proof is to show that (i) implies (ii). To this end,
we can assume, without loss of generality, that the equivalence classes of ˜ are
singletons, that is, that no two distinct alternatives are equivalent. The reason is
that from each equivalence class of ˜, say A, we can choose a single representative
xA ∈ A. Clearly, if we restrict attention to % on { xA | A ∈ X/˜ } (where X/˜
denotes the quotient set, consisting of equivalence classes of ˜), the relation is still
complete and transitive and the cardinality of X/˜ is finite or countably infinite.
Thus, if we manage to prove that (i) implies (ii) in this restricted case (of
singleton equivalence classes), we will have a function u : { xA | A ∈ X/˜ } → R
that represents % on { xA | A ∈ X/˜ }. It remains to extend it to all of X in the
obvious way (that respects equivalence, that is, set u(y) = u(xA ) for all y ∈ A
and for every A ∈ X/˜).
(As we will shortly see, assuming that the equivalence sets are singletons
doesn’t make a huge difference. However, it’s a good exercise to go over this
reasoning if it’s not immediately obvious.)
So let us now turn to the proof that (i) implies (ii) when the equivalence
classes of ˜ are singletons. Let us start with a simple proof for the finite case:
assume that (i) holds, that X is finite, and define, for x ∈ X,
u (x) = # { y | x % y }
we basically counted, for each x, how many y’s it “beats”. This is as if an
alternative x collects “points” for its “victories” in the matches with other alternatives,
and we assume that the points for all alternatives are equal. However,
we could do the same trick with points that are not necessarily equal. Suppose
that, for each y there is a “weight” αy > 0. Then, defining
X
u (x) = αy
{ y | x% y }
(if this is a real number) the proof above goes through: transitivity proves that
x % y implies u (x) ≥ u (y), and, because αx > 0, x ≻ y implies u (x) > u (y).
All that is left is to choose weights αy > 0 such that the summation above is
always finite. This, however, can easily be done because X is countable. We can
take any enumeration of X, X = {x1 , ..., xn , ...}, and set αxn = 1/2^n . Since the
entire series ∑n αxn converges, u (x) is well-defined, that is, it is a real number
for every x.
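The weighted-count construction can be carried out verbatim in code for a finite initial segment of a countable X. In this Python sketch (an added illustration; the enumeration and the preference ≥ are assumed examples), each x collects the weight 2^{-(n+1)} of every x_n it weakly beats:

```python
def utility(enumeration, weakly_prefers):
    # weight 2^{-(n+1)} for the n-th element of the enumeration;
    # u(x) sums the weights of all y that x weakly beats
    weight = {x: 2.0 ** -(n + 1) for n, x in enumerate(enumeration)}
    return {x: sum(weight[y] for y in enumeration if weakly_prefers(x, y))
            for x in enumeration}

# Assumed example: preference over a few integers given by ">="
X_enum = [3, 1, 2, 0]
u = utility(X_enum, lambda x, y: x >= y)
```

Because every weight is strictly positive, strict preferences translate into strict utility gaps, exactly as in the argument above.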
Let us now look at a second proof, which uses induction. We consider an
enumeration of X, X = {x1 , ..., xn , ...}. Let Xn = {x1 , ..., xn } be the set consisting
of the first n elements of X according to this enumeration. Clearly, Xn ⊂ Xn+1
for all n ≥ 1 and X = ∪n≥1 Xn . We define u by induction: set u (x1 ) = 0 and
then, for each n ≥ 1, we are about to define u (xn+1 ) ∈ R given the definition
of u on Xn . We will prove that, according to this definition, for every n ≥ 1, if
u represents % on Xn , it will also represent % on Xn+1 , and then observe that
this means also that u represents % on X.
Observe that, when we say “u represents % on Xn ” we refer to the values of
u on Xn , which are the first n numbers defined in the proof. Formally speaking,
the function u on Xn is a different function than the function u defined on Xn+1 .
However, at stage n + 1 of the proof we only define u (xn+1 ) without changing
the values of u on Xn , and thus there is no need to use a different notation for
the function defined on the smaller set, Xn , and for its extension to the larger
set, Xn+1 .
The induction step is, however, trivial: given Xn and u that is defined on it,
let there be given xn+1 . If xn+1 ≺ xi for all i ≤ n, set u (xn+1 ) = mini≤n u(xi ) − 1.
Symmetrically, if xn+1 ≻ xi for all i ≤ n, set u (xn+1 ) = maxi≤n u(xi ) + 1.
Otherwise, xn+1 is “between” two alternatives xi , xj , that is, there are i, j ≤ n
such that
xi ≺ xn+1 ≺ xj
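The inductive step can be sketched as follows (Python; an added finite illustration with an assumed strict preference and no ties). For an element falling between two predecessors we take a midpoint, which is one concrete way of picking a value in the gap:

```python
def build_utility(enumeration, strictly_prefers):
    # assumes no ties: for distinct x, y exactly one of xPy, yPx holds
    u = {}
    for x in enumeration:
        below = [u[y] for y in u if strictly_prefers(x, y)]
        above = [u[y] for y in u if strictly_prefers(y, x)]
        if not u:
            u[x] = 0.0
        elif not above:                 # x beats all earlier elements
            u[x] = max(u.values()) + 1
        elif not below:                 # x loses to all earlier elements
            u[x] = min(u.values()) - 1
        else:                           # x is between two earlier elements
            u[x] = (max(below) + min(above)) / 2
    return u

X_enum = [5, 2, 9, 7]
u = build_utility(X_enum, lambda x, y: x > y)
```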
Proof: If there were such a function, then, for every value of x1 ∈ [0, 1] there
would be an open interval of utility values, I(x1 ) = (u (x1 , 0) , u (x1 , 1))
(with u (x1 , 1) > u (x1 , 0)), such that, for x1 > y1 , I(x1 ) and I(y1 ) are disjoint
(because u (x1 , 0) > u (y1 , 1)). However, on the real line we can only have
countably many disjoint open intervals (for instance, because each such interval
contains a rational number). □
One direction to follow (again, Debreu, 1959) is to assume that the set of
alternatives X is a topological space, and require that % be continuous with
respect to this topology. For example, in the case X = Rk , we define continuity
as follows: % is continuous if, for all x, y ∈ X and every {xn } ⊂ Rk such that
xn → x, (i) if xn % y for all n, then x % y, and (ii) if y % xn for all n, then
y % x.
We will not prove this theorem here. Rather, we follow the other direction,
due to Cantor (1915), where no additional assumptions are made on X. In
particular, it need not be a topological space, and if it happens to be one,
we still will not insist on continuity of u. Instead, Cantor used the notion of
separability: % is separable if there exists a countable set Z ⊂ X such that, for
every x, y ∈ X\Z, if x ≻ y, then there exists z ∈ Z such that x % z % y.
To see that the separability axiom has some mathematical power, and that it
may give us hope for utility representation, let us see why it rules out Debreu’s
example above. In this example, suppose that Z ⊂ X = [0, 1]2 is a countable
set. Consider its projection on the first coordinate, that is,
Clearly, Z1 is also countable. Consider x1 ∈ [0, 1]\Z1 and note that (x1 , 1) ≻
(x1 , 0). However, no element of Z can be between these two in terms of preference,
since it cannot have x1 as its first coordinate.
Proof of Theorem 9
As above, the discussion is simplified if we assume, without loss of generality,
that there are no equivalences.
We may go back to the two proofs of Theorem 5 and try to build on these
ideas. The second proof, by which the values of u (xn ) were defined by induction
on n, doesn’t generalize very gracefully. For a countable set, one can have an
enumeration of the elements, such that each one of them has only finitely many
predecessors. This allowed us to find a value for u (xn ) for each n, so that u
represented % on the elements up to xn . However, when X is uncountable, no
such enumeration exists. Thus, there will be (many) elements x of X that have
infinitely many predecessors. And then it might be impossible to find a value
u (x) that allows u to represent % on all the elements up to x. (For example,
assume that x ≻ z and yn ≻ x for all n, where we have already assigned the
values u (z) = 0 and u (yn ) = 1/n.)
However, the first proof does extend to the general set-up. Recall that, in
the countable case, we agreed that for each y there would be a “weight” αy > 0
and that, given these weights, we would define
u (x) = ∑{ y | x % y } αy .
You might think that, when X is uncountable, the corresponding idea would
be to have an integral (over all { y | x % y } for each x) instead of a sum. But
this would require a definition of an algebra on X (which is not the hard part)
and a definition of αy > 0 so that the function α· is integrable relative to
that algebra (which is harder). However, these complications are not necessary:
the separability requirement says that countably many elements “tell the entire
story”. Hence, we should take the sum not over all { y | x % y }, but only over
those elements of Z that are in this set.
Hence, for the proof that (i) implies (ii), let Z = {z1 , z2 , ...} and define
u(x) = ∑{ zi ∈ Z | x % zi } 1/2^i − ∑{ zi ∈ Z | zi ≻ x } 1/2^i .    (2)
Clearly u(x) ∈ R for all x (in fact, u(x) ∈ [−1, 1]). It is easy to see that x % y
implies u(x) ≥ u(y). To see the converse, assume that x ≻ y. If one of {x, y} is
in Z, u(x) > u(y) follows from the definition (2). Otherwise, invoke separability
to find z ∈ Z such that x % z % y and then use (2).
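Definition (2) can be computed directly. In this Python sketch (an added, assumed example: alternatives are real numbers compared by ≥, and Z is a small finite stand-in for the countable separating set), x = 0.3 and x = 0.4 receive the same value because no element of the chosen Z separates them, which is exactly why Z must be a separating set:

```python
def cantor_utility(x, Z, weakly_prefers):
    # u(x) = sum over z_i in Z with x % z_i of 2^{-i}
    #      minus sum over z_i in Z with z_i strictly preferred to x
    u = 0.0
    for i, z in enumerate(Z, start=1):
        if weakly_prefers(x, z):
            u += 2.0 ** -i
        else:        # by completeness, z % x and not x % z, i.e. z strictly beats x
            u -= 2.0 ** -i
    return u

# Assumed example: % is ">=" on reals, Z is a finite stand-in for Z = {z1, z2, ...}
Z = [0.0, 1.0, 0.5, 0.25]
prefers = lambda a, b: a >= b
```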
Another little surprise in this theorem, somewhat less pleasant, is how messy
the proof of the converse direction is. Normally we expect axiomatizations to
have sufficiency, which is a challenge to prove, and necessity which is simple. If
it is obvious that the axioms are necessary, they are probably quite transparent
and compelling. (If, by contrast, sufficiency is hard to prove, the theorem is
surprising in a good sense: the axioms take us a long way.) Yet, we should be
ready to sometimes work harder to prove the necessity of conditions such as
continuity, separability, and others that bridge the gap between the finite and
the infinite.
In our case, if we have a representation by a function u, and if the range
of u were the entire line R, we would know what to do: to select a set Z that
satisfies separability, take the rational numbers Q = {q1 , q2 , ...}, for each such
number qi select zi such that u(zi ) = qi , and then show that Z separates X.
The problem is that we may not find such a zi for each qi . In fact, it is even
possible that range(u) ∩ Q = ∅.
In some sense, we need not worry too much if a certain qi is not in the range
of u. In fact, life is a little easier: if no element will have this value, we will not
be asked to separate between x and y whose utility is this value. However, what
happens if we have values of u very close to qi on both sides, but qi is missing?
In this case, if we fail to choose elements with u-values close to qi , we may later
be confronted with x ≻ y such that u(x) ∈ (qi , qi + ε) and u(y) ∈ (qi − ε, qi ) and
we will not have an element z of Z with x % z % y.
Now that we see what the problem is, we can also find a solution: for each
qi , find a countable non-increasing sequence {zik }k such that
assuming that the sets on the right-hand sides are non-empty. The (countable)
union of these countable sets will do the job. □
v = f (u)
also represents %. (In Theorem 8 one would need to require that f be continuous
to guarantee that v is continuous on X, as is u. In the other two theorems we
don’t have a topology on X, and continuity of u or v is not defined. Therefore it
makes no difference if f is continuous or not.) For that reason, a utility function
u that represents % is called ordinal. Importantly, when we say that “u is ordinal”
we don’t refer to a property of the function u as a mathematical object per se,
but to the way we use it. Saying that u is ordinal is like saying “I’m using
the function u, but don’t take me too literally; it’s actually but an example of
a function, a representative of a large class of functions that are equivalent in
terms of their observable content, and any monotone transformation of u can
be used as “the” utility function just as well. I’ll try not to say anything that
depends on the particular function u and that would not hold true if I were to
replace u by a monotone transformation thereof, v.”
3 Semi-Orders
3.1 Just Noticeable Difference
or
S′/S > 1 + λ
or
log (S′) − log (S) > δ ≡ log (1 + λ) > 0.
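Weber's law and its logarithmic form can be sketched numerically (Python; the Weber fraction λ = 0.1 is an assumed value): taking u = log and δ = log(1 + λ) turns the ratio threshold into exactly the kind of comparison that an L-representation uses.

```python
import math

lam = 0.1                       # assumed Weber fraction (lambda)
delta = math.log(1 + lam)       # the jnd on the log scale

def noticeably_stronger(S_prime, S):
    # S' P S  iff  S'/S > 1 + lambda  iff  log(S') - log(S) > delta
    return math.log(S_prime) - math.log(S) > delta
```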
Inspired by this law, Luce (1956) was interested in strict preferences P that
can be described by a utility function u through the equivalence
If (3) holds for u : X → R and δ > 0, we say that the pair (u, δ) L-represents P .
Seeking to axiomatize L-representations, Luce considered the binary relation
P (on a set of alternatives X) as primitive. The relation P is interpreted as
strict preference, and I = (P ∪ P −1 )c is interpreted as absence of preference in
either direction, or “indifference”.1
Luce formulated three axioms, which are readily seen to be necessary for an
L-representation. He defined a relation P to be a semi-order if it satisfied these
three axioms, and showed that, if X is finite, the axioms are also sufficient for
L-representation.
To state the axioms, it will be useful to have a notion of concatenation of
relations. Given two binary relations B1 ,B2 ⊂ X × X, let B1 B2 ⊂ X × X be
defined as follows: for all x, y ∈ X,
We can finally state Luce’s axioms. The relation P (or (P, I)) is a semi-order
if:
L1. P is irreflexive (that is, xP x for no x ∈ X);
L2. P IP ⊂ P
L3. P P I ⊂ P .
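The concatenation operation and the three axioms are easy to test mechanically on finite examples. In this Python sketch (an added illustration), a jnd-type relation xPy ⇔ u(x) > u(y) + 1 passes, while the four-element relation P = {(x, y), (z, w)} used in part (iii) of the proposition below fails L2:

```python
from itertools import product

def compose(B1, B2):
    # B1B2 = { (x, y) : there is z with x B1 z and z B2 y }
    return {(x, y) for (x, z) in B1 for (w, y) in B2 if z == w}

def is_semi_order(P, X):
    # I: absence of preference in either direction (note xIx for all x)
    I = {(x, y) for x, y in product(X, X)
         if (x, y) not in P and (y, x) not in P}
    L1 = all((x, x) not in P for x in X)       # irreflexivity
    L2 = compose(compose(P, I), P) <= P        # PIP subset of P
    L3 = compose(compose(P, P), I) <= P        # PPI subset of P
    return L1 and L2 and L3

# A jnd-type relation: xPy iff x > y + 1 (utility is the identity, delta = 1)
X_jnd = [0.0, 0.5, 1.0, 1.6]
P_jnd = {(x, y) for x in X_jnd for y in X_jnd if x > y + 1}

# The relation of example (iii): violates L2
X_ex = ["x", "y", "z", "w"]
P_ex = {("x", "y"), ("z", "w")}
```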
For example, if X = R2 and P is defined by Pareto domination, P is transitive,
but you can verify that it satisfies neither L2 nor L3.
Conditions L2 and L3 restrict the indifference relation I. For the Pareto
relation P , the absence of preference, I, means intrinsic incomparability. Hence
we can have, say, xP zP wIy without being able to say much on the comparison
between x and y. It is possible that y is incomparable to any of x, z, w because
one of y’s coordinates is higher than the corresponding coordinate for all of
x, z, w. This is not the case if I only reflects the inability to discern small
differences. Thus, L2 and L3 can be viewed as saying that the incomparability
of alternatives, reflected in I, can only be attributed to issues of discernibility,
and not to fundamental problems as in the case of Pareto dominance.
Looking at Luce’s three conditions, you may wonder why not require also
L4. IP P ⊂ P .
The answer is that it follows from the previous two. More precisely:
Proof: (i) Assume L3. To see that L4 holds, let there be given x, y, z, w ∈ X
such that xIyP zP w. We need to show that xP w. If not, we have either wP x or
wIx. We argue that in either case, yP x. Indeed, if wP x, we have yP zP wP x.
Recall that P is transitive by L3. Hence yP x. If, however, wIx, we have
yP zP wIx and, by L3, yP x. However, this is a contradiction because xIy.
(ii) Assume L4. To see that L3 holds, let there be given x, y, z, w ∈ X such
that xP yP zIw. We need to show that xP w. If not, we have either wP x or wIx.
We argue that in either case, wP z. Indeed, if wP x, we have wP xP yP z and
(since L4 also implies transitivity of P ), wP z. If, however, wIx, then wIxP yP z
and L4 implies wP z. Thus, in both cases we obtain wP z, which contradicts
zIw.
(iii) Consider X = {x, y, z, w} and P = {(x, y) , (z, w)}. L3 and L4 hold
vacuously (as there are no chains of two P relations) but L2 doesn’t. (If it
did, we would have xP w because xP yIzP w, and, indeed, also zP y because
zP wIxP y.)
(iv) Consider X = {x, y, z, w} and P = {(x, y) , (y, z) , (x, z)}. L2 holds, as
P IP = {(x, z)} (because xP yIyP z but there is no other quadruple of elements
satisfying this chain of relations) and, indeed, (x, z) ∈ P . However, L3 does not
hold (if it did, xP yP zIw would imply xP w) nor does L4 (if it did, wIxP yP z
would imply wP z). □
We will not prove this theorem here, but we will make several comments
about it.
If we drop L3 (but do not add L4), we get a family of relations that Fishburn
(1970b, 1985) defined as interval relations. Fishburn proved that, if X is finite,
a relation is an interval relation if and only if it can be represented as follows:
for every x ∈ X we have an interval, (b(x), e(x)), with b(x) ≤ e(x), such that
that is, xP y iff the entire range of values associated with x, (b(x), e(x)), is higher
than the range of values associated with y, (b(y), e(y)).
Given such a representation, you can define u(x) = b(x) and δ(x) = e(x) −
b(x) to get an equivalent representation
Comparing (4) to (3), you can think of (4) as a representation with a variable
just noticeable difference, whereas (3) has a constant jnd, which is normalized
to 1.
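Fishburn's condition can be stated directly in code. A Python sketch (added; the three intervals are assumed examples, and whether the endpoint comparison is strict is a matter of convention):

```python
def interval_relation(intervals):
    # x P y iff the whole interval of x lies above that of y: b(x) > e(y)
    # (taking the strict inequality at the endpoints as our convention)
    return {(x, y)
            for x, (bx, ex) in intervals.items()
            for y, (by, ey) in intervals.items()
            if bx > ey}

# Assumed example; note the variable jnd delta(x) = e(x) - b(x)
iv = {"a": (0.0, 0.4), "b": (0.3, 0.5), "c": (0.9, 1.0)}
P = interval_relation(iv)
```

Here "b" overlaps "a", so neither is preferred to the other, while "c" lies strictly above both.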
3.2 A note on the proof
Claim 12 Q is transitive
Claim 13 E is transitive
Proof: Define ˜ as follows: x˜y if for every z, (xP z ⇔ yP z) and (zP x ⇔ zP y).
Clearly, ˜ is an equivalence relation. Also, x˜y implies xEy. To see the converse,
assume that xEy. If there exists z such that xP zP y (or yP zP x) then
xP y (yP x) and xEy cannot hold. Hence xP z ⇒ yP z and vice versa. Similarly,
zP x ⇒ zP y and vice versa. □
Moreover, one can get an L-representation of P by a function u that simultaneously
also satisfies
The functions u and v can be quite different on [0, 1]. But if x is such
that u(x) = 1, we will also have to have v(x) = 1. To see this, imagine that
v(x) > 1. Then there are alternatives y with v(y) ∈ (1, v(x)). This would mean
that, according to v, yP 0, while according to u, yI0, a contradiction.
The same logic applies to any point we start out with. That is, for every
x, y,
u(x) − u(y) = 1 ⇔ v(x) − v(y) = 1
and this obviously generalizes to (u(x) − u(y) = k ⇔ v(x) − v(y) = k) for every
k ∈ Z. Moreover, we obtain
we know that these inequalities will also hold for any other utility function v.
And the reason is, again, that utility differences became observable to a certain
degree: we have an observable distinction between “greater than the jnd” and
“smaller than (or equal to) the jnd”. This distinction, however coarse, gives
us some observable anchor by which we can measure distances along the utility
scale: we can count how many integer jnd steps exist between alternatives.
One can make a stronger claim if one recalls that the semi-orders were defined
for a given probability threshold, say, p = 75%. If one varies the probability,
one can obtain a different semi-order. Thus we have a family of semi-orders
{Pp }p>.5 . Under certain assumptions, all these semi-orders can be represented
simultaneously by the same utility function u, and a corresponding family of
jnd’s, {δ p }p>.5 such that
In this case, it is easy to see that the utility u will be unique to a larger degree
than before. We may even find that, as p → .5, δ p → 0, that is, that if we
are willing to make do with very low probabilities of detection, we will get very
low jnd’s, and correspondingly, any two functions u and v that L-represent the
semi-orders {Pp }p>.5 will be identical.
Observe that the uniqueness result depends discontinuously on the jnd δ:
the smaller δ is, the less freedom we have in choosing the function u, since
sup |u(x) − v(x)| ≤ δ. But when we consider the case δ = 0, we are back with a
weak order, for which u is only ordinal.
4 Choice Functions
The binary relation approach assumes that we observe choices between pairs
of alternatives. More generally, given a set of alternatives X, we may assume
that the choice is observed within various subsets of X, and not only between
pairs. Assume that X is finite, and denote by B ⊂ 2^X \{∅} the collection of
(non-empty) subsets of X that are choice sets, that is, such that choice within
each of them can be observed. This choice is assumed to be a subset of the set offered.
Thus we define a choice correspondence to be a function
C : B → 2^X \{∅}
with
C(B) ⊂ B ∀B ∈ B.
Further, we assume that B includes all subsets of size ≤ 3, so that the choice
functions we consider will be sufficiently informative.
This axiom states that, if in one context (B), where y was available (y ∈
B), x was chosen (x ∈ C(B)), then in any other context (B 0 ) where both are
available (x, y ∈ B 0 ), if y is good enough to be chosen (y ∈ C(B 0 )), then so is x
(x ∈ C(B 0 )). Thus, if in one instance x was observed to be at least as good as
y, we will never find that y is strictly better than x.
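WARP can be checked mechanically for a finite choice correspondence. In this Python sketch (an added illustration; both correspondences are assumed examples), choosing the maximum of each set satisfies the axiom, while a context-dependent chooser violates it:

```python
from itertools import combinations

def satisfies_warp(C):
    # C maps each choice set (a frozenset) to its non-empty chosen subset.
    # WARP: if x, y in B, x in C(B), and x, y in B' with y in C(B'),
    # then x in C(B') as well.
    for B, chosen in C.items():
        for x in chosen:
            for y in B:
                for Bp, chosenp in C.items():
                    if (x in Bp and y in Bp
                            and y in chosenp and x not in chosenp):
                        return False
    return True

X = [1, 2, 3]
sets = [frozenset(s) for r in (1, 2, 3) for s in combinations(X, r)]

# Choosing the maximum of every set satisfies WARP
C_good = {B: frozenset({max(B)}) for B in sets}

# A context-dependent chooser violates it: 1 beats 2 in {1, 2},
# yet 2 is chosen over 1 from {1, 2, 3}
C_bad = {frozenset({1, 2}): frozenset({1}),
         frozenset({1, 2, 3}): frozenset({2})}
```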
A choice function that satisfies WARP can be thought of as a binary relation.
To be precise, one may start with a choice function C, and, if it satisfies WARP,
define a binary relation %∗ such that C picks the %∗ -maximal elements in B for
every B ∈ B. Conversely, if one starts with a binary relation %, one may define
the choice function that selects the %-maximal elements (in B for every B ∈ B)
and show that it satisfies WARP. Details follow.
Let us first assume that a choice function C : B → 2^X \{∅} is given. Define
a binary relation %∗ = %∗ (C) as follows: for every x, y ∈ X, x %∗ y if (and only
if)2 there exists B ∈ B such that x, y ∈ B, and x ∈ C(B). That is, we say that
x %∗ y if there is a context in which x was revealed to be at least as desirable
as y.
Taking ≻∗ to be the asymmetric part of %∗ , we find that x ≻∗ y iff (i) for
at least one B ∈ B with x, y ∈ B, we have x ∈ C(B), but (ii) for no B ∈ B such
that x, y ∈ B is it the case that y ∈ C(B).
Note that, if x ≻∗ y, then there exists a B ∈ B such that x, y ∈ B, and x ∈ C(B)
but not y ∈ C(B). We could therefore say that x was “revealed to be strictly
preferred to” y. Indeed, it makes sense to define this formally: we write x ≻0 y
if there exists B ∈ B, with x, y ∈ B, such that x ∈ C(B) but y ∉ C(B). Thus,
x ≻∗ y implies x ≻0 y. But the converse isn’t generally true: it is possible that,
given one B, only x is chosen, and given another, B′, only y is chosen (while x, y
are in both B and B′). That is, the definition of ≻0 allows for the possibility
that x ≻0 y and y ≻0 x. By contrast, the definition of ≻∗ implies asymmetry:
if x ≻∗ y, we know that in some contexts (sets B ∈ B with x, y ∈ B), x was
chosen, but in none was y chosen.
C ∗ (B) = C ∗ (B, %) = { x ∈ B | x % y ∀y ∈ B}
(observe that this is not necessarily a choice function as we’re not guaranteed
that the set hereby defined is non-empty.)
We can now state formally the equivalence between binary orders that are
complete and transitive and choice functions that satisfy WARP. Let us start
with the more immediate result:
2 In case we haven’t mentioned this: definitions are always characterizations, that is, “if
and only if” statements. For this very reason, it is considered better style not to write “...
and only if” in definitions.
Proposition 14 If % is a weak order, then C ∗ (B, %) is a choice function that
satisfies WARP. Furthermore, the relation corresponding to C ∗ is %: % = %∗ (C ∗ ).
Conversely, let us now start with a choice function that satisfies WARP and
define the relation from it.
Proof: Let there be given a choice function C that satisfies WARP. To see
that %∗ = %∗ (C) is complete, consider B = {x, y} (which is in B as we assumed
that all sets with no more than three elements are in B). Because C({x, y}) ≠ ∅,
it has to be the case that x ∈ C({x, y}), and then x %∗ y, or y ∈ C({x, y}), and
then y %∗ x (or both).
To see that %∗ is transitive, assume that x %∗ y and y %∗ z, and we will
prove that x %∗ z. As x %∗ y, there exists D ∈ B such that x, y ∈ D and
x ∈ C(D). WARP then implies that the same would hold for D′ = {x, y} ∈ B:
x ∈ C({x, y}). Similarly, y %∗ z means that there exists some E ∈ B such
that y, z ∈ E and y ∈ C(E), and this implies also y ∈ C({y, z}). Let us now
consider B = {x, y, z} ∈ B. We need to show that x ∈ C({x, y, z}) (and then,
by definition of %∗ , x %∗ z is established). Assume that this is not the case, that
is, x ∉ C({x, y, z}). Can it be the case that y ∈ C({x, y, z})? The negative
answer is given by WARP: since x ∈ C({x, y}), x will be chosen whenever y
is (provided they are both available). Hence we find that x ∉ C({x, y, z})
implies y ∉ C({x, y, z}). But, by the same token, y ∉ C({x, y, z}) implies
z ∉ C({x, y, z}), and it follows that, if x ∉ C({x, y, z}), then C({x, y, z}) = ∅, a
contradiction to the definition of choice functions. Hence x ∈ C({x, y, z}) and
%∗ is transitive.
We now turn to show that, if we define C ∗ from the relation %∗ , we get the
function C that we started out with. That is, we wish to show that, for every
B ∈ B, C ∗ (B, %∗ ) = C(B).
Fix B. To see that C ∗ (B, %∗ ) ⊂ C(B), let x ∈ C ∗ (B, %∗ ), that is, x is a
%∗ -maximum in B. Choose y ∈ C(B). Since x is a %∗ -maximum in B, we know
that x %∗ y. By definition of %∗ , for some B 0 , x, y ∈ B 0 , x ∈ C(B 0 ). But then
WARP implies x ∈ C(B) and C ∗ (B, %∗ ) ⊂ C(B) is established.
To see the converse inclusion, namely, that, C(B) ⊂ C ∗ (B, %∗ ), let x ∈
C(B). By definition of %∗ , this implies that x %∗ y for every y ∈ B. That
is, x is a %∗ -maximum in B. But this, in turn, is precisely the definition of
C ∗ (B, %∗ ). Hence x ∈ C ∗ (B, %∗ ) and C(B) ⊂ C ∗ (B, %∗ ) also holds.
Finally, to see uniqueness of the relation %∗ , it suffices to consider the sets
B’s that are pairs, and to observe that C on these sets is sufficient to define %∗
uniquely. □
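The construction in the proof can be replayed on a small example (a Python sketch added here; X and the weak order ≥ are assumed): build C∗ from %, read off the revealed relation %∗, and check that it coincides with %.

```python
from itertools import combinations

def c_star(B, weakly_prefers):
    # C*(B, %): the %-maximal elements of B
    return frozenset(x for x in B if all(weakly_prefers(x, y) for y in B))

def revealed(C):
    # x %* y iff some B contains both x and y, and x is chosen from B
    return {(x, y) for B, chosen in C.items() for x in chosen for y in B}

X = [1, 2, 3]
prefers = lambda x, y: x >= y
sets = [frozenset(s) for r in (1, 2, 3) for s in combinations(X, r)]
C = {B: c_star(B, prefers) for B in sets}
```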
5 von Neumann-Morgenstern/Herstein-Milnor Theorem
In this section we present a theorem that is a combination of results by the
people whose names appear in the title. de Finetti was the first to indicate the
type of result he needed to have, and we’ll discuss the context of his result later
on. von Neumann and Morgenstern (vNM) had the famous theorem which we
will study later on. The theorem we present here is slightly more general than
the result they proved, as it will be used for other structures as well. The
generalized version suggested here is still a special case of the generalization of
vNM’s theorem provided by Herstein and Milnor (1953).
Let there be an underlying set A and suppose that we are interested in
objects of choice that are described as real-valued functions on A. Thus,
X ⊂ R^A .
A relation % ⊂ X × X will be assumed to satisfy the following three axioms:
A1. Weak order: % is complete and transitive.
A2. Continuity: For every x, y, z ∈ X, if x  y  z, there exist α, β ∈ (0, 1)
such that
αx + (1 − α)z  y  βx + (1 − β)z.
U (x) = cV (x) + d ∀x ∈ X.
(i) for every λ ∈ (0, 1),
x ≻ λx + (1 − λ)y ≻ y
x ≻ λx + (1 − λ)y ≻ μx + (1 − μ)y ≻ y.
x′ ≻ λ′ x′ + (1 − λ′ )y ≻ y
x ∼ λx + (1 − λ)y ∼ y
Proof: Let x ∼ y and assume that for some λ ∈ (0, 1), z ≡ λx + (1 − λ)y
does not satisfy z ∼ x. Assume that z ≻ x, y. (The proof for the case z ≺ x, y is
symmetric.) By the previous lemma, we know that
z ≻ αz + (1 − α)x ≻ x
z = λx + (1 − λ)y ≻ μx + (1 − μ)y ≻ x ∼ y.
Next consider μ > λ and observe that μx + (1 − μ)y ≻ y. Pick one such μ
and denote w = μx + (1 − μ)y, so that z ≻ w ≻ y.
Since w ≻ y, for every β ∈ (0, 1) we have
w ≻ βw + (1 − β) y ≻ y
but for β = λ/μ we have βw + (1 − β) y = λx + (1 − λ)y = z, so that w ≻ z, a
contradiction. Hence we have z ∼ x ∼ y. □
that is, that there exists ν < α such that νx + (1 − ν)z ≻ αy + (1 − α)z. But
this is in contradiction to
The next lemma is a key step in defining the utility value for an alternative:
Lemma 20 Assume that x, y, z ∈ X are such that x ≻ y and x % z % y. Then
there exists a unique α = α (x, y, z) ∈ [0, 1] such that z ∼ αx + (1 − α) y.
G = { α ∈ [0, 1] | αx + (1 − α)y ≻ z }
E = { α ∈ [0, 1] | αx + (1 − α)y ∼ z }
B = { α ∈ [0, 1] | αx + (1 − α)y ≺ z }
B = [0, α∗ ] ; G = (α∗ , 1]
or
B = [0, α∗ ) ; G = [α∗ , 1].
x ≻ z ≻ α∗ x + (1 − α∗ )y
λx + (1 − λ) [α∗ x + (1 − α∗ )y]
α∗ x + (1 − α∗ )y ≻ z ≻ y
λ [α∗ x + (1 − α∗ )y] + (1 − λ) y
for λ ∈ (0, 1).
αx + (1 − α)y ≻ z
Thus, continuity necessitates that both B and G be open intervals in [0, 1].
Since [0, 1] cannot be split into two disjoint open intervals, we find that E ≠ ∅.
□
It will be useful to have a notation for the alternatives that are, in terms of
preferences, in the range of a set of alternatives. For Y ⊂ X, define
[Y ]% = { x ∈ X | ∃g ∈ Y, g % x, and ∃b ∈ Y, x % b }
(We will only use this notation for finite, and rather small sets Y . Still, this
notation will save some lines.) For example, for two alternatives, b, g ∈ X such
that g % b,
[{b, g}]% = { x ∈ X | g % x % b }
In this case, we can also simplify notation and write [b, g]% for [{b, g}]% . With
this notation, we can state the following.
Proof: By Lemma 20, for every x ∈ [b, g]% there is a unique α = α (g, b, x) ∈
[0, 1] such that x ∼ αg + (1 − α) b. Define U (x) = α (g, b, x).
To see that U represents %, consider x, y ∈ [b, g]% . We have
and thus
x % y if f
32
which, in light of Lemma 17, is equivalent to
α (g, b, x) ≥ α (g, b, y)
or to
U (x) ≥ U (y) .
z = λx + (1 − λ)y    (6)
and from
y ∼ α(g, b, y)g + (1 − α(g, b, y))b.
Thus
Finally, we wish to prove that this U is unique. Assume that V : [b, g]≽ → R
is also affine and represents ≽. Define
c = V(g) − V(b) > 0 and d = V(b),
so that
V(x) = cUb,g(x) + d.
Clearly, we’re nearing the end of the proof. We have more or less what
we needed: an affine function that represents preferences. This function can be
defined over each preference interval separately, no matter how large it is. Thus,
if X happens to have a maximal and a minimal elements, we’re done: we only
need to apply Lemma 21 to the interval between the minimal and the maximal
element, which spans all of X. However, some more work is needed if maximal
or minimal elements fail to exist.
We define the function U as follows. If all elements in X are equivalent, we
set U (x) = 0 for all x. This function is affine, and it represents preferences.
Moreover, it is unique up to a positive affine transformation: any other function
that represents preferences has to be a constant as well. Otherwise, not all
elements of X are equivalent. Thus, there are b, g ∈ X such that g  b. Fix
these two alternatives until the end of the proof, and set U (b) = 0 and U (g) = 1.
For x ≠ b, g, define U(x) as follows:
(i) for x ∈ [b, g]≽, define U(x) = Ub,g(x);
(ii) for x ≻ g, define U(x) = 1/Ub,x(g), so that
for c = 1 − Ux,g(b) and d = Ux,g(b) (observe that 0 < c, d < 1).
Thus, for every x, U(x) is the unique number such that the vector (0, 1, U(x))
(which is not necessarily an increasing list of numbers) is an increasing affine
transformation of (U[{b,g,x}](b), U[{b,g,x}](g), U[{b,g,x}](x)). Put differently, for
each x there exists a unique function V[{b,g,x}] : [{b, g, x}] → R such that (i)
V[{b,g,x}] is an increasing affine transformation of U[{b,g,x}], so that V[{b,g,x}] is
affine and represents ≽ on [{b, g, x}]; (ii) V[{b,g,x}](b) = 0 and V[{b,g,x}](g) = 1.
And then U(x) = V[{b,g,x}](x).
We wish to show that U so defined satisfies the two conditions, namely,
that it represents preferences and that it is affine. Let there be given x, y ∈ X
and consider the set Y = {b, g, x, y}. We know that there exists an affine U[Y]
that represents preferences on all of [Y], Y = {b, g, x, y} included. It has a
unique increasing affine transformation, V[Y], that also satisfies V[Y](b) = 0 and
V[Y](g) = 1. Consider z ∈ [Y]. We wish to show that U(z) = V[Y](z). Indeed,
we know that U(z) = V[{b,g,z}](z); moreover, V[Y] (weakly) extends V[{b,g,z}]
from [{b, g, z}] to all of [Y]; since they are both affine, and both represent
preferences on [{b, g, z}], with V[{b,g,z}](b) = V[Y](b) = 0 and V[{b,g,z}](g) =
V[Y](g) = 1, we have V[{b,g,z}](·) = V[Y](·) on [{b, g, z}]. Hence V[{b,g,z}](z) = V[Y](z)
and U(z) = V[Y](z) follows. Because V[Y] represents preferences on [Y], we have,
in particular, x ≽ y iff V[Y](x) ≥ V[Y](y), that is, iff U(x) ≥ U(y).
6 vNM Expected Utility
6.1 Model and Theorem
Since the objects of choice are lotteries, the observable choices are modeled
by a binary relation on L, ≽ ⊂ L × L. The vNM axioms are:
Theorem 22 (vNM) ≽ ⊂ L × L satisfies V1-V3 if and only if there exists u :
X → R such that, for every P, Q ∈ L,
P ≽ Q iff Σ_{x∈X} P(x)u(x) ≥ Σ_{x∈X} Q(x)u(x).
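As a quick numerical illustration of the representation (outside the formal development), a lottery can be coded as a finitely supported probability assignment and compared by its expected utility. The utility values below are illustrative assumptions.

```python
# Sketch of the vNM representation: a lottery is a dict x -> P(x), and
# P >= Q iff sum_x P(x)u(x) >= sum_x Q(x)u(x).
def expected_utility(P, u):
    return sum(p * u[x] for x, p in P.items())

u = {"good": 1.0, "mid": 0.4, "bad": 0.0}   # illustrative utilities
P = {"good": 0.5, "bad": 0.5}               # 50-50 between best and worst
Q = {"mid": 1.0}                            # the middle outcome for sure

print(expected_utility(P, u))  # 0.5
print(expected_utility(Q, u))  # 0.4 -> P is preferred to Q
```

Note that the ranking of P against Q depends on where u places "mid" between the extremes, which is exactly the calibration question of the previous section.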
6.2 Proof
u (x) = U ([x])
The first lemmas of Theorem 16 are needed whichever way we look at the
vNM or Herstein-Milnor theorems. However, once we have established these, when
the time comes to define the utility function, there are two other ways to
continue. The proof provided above is relatively general, yet it makes use of very
little machinery. Moreover, it has the advantage of mimicking a process by
which the decision maker’s utility is calibrated. However, this proof does not
shed much light on the geometry of preferences. The following approaches add
something in this respect.
X = {x1, x2, x3} where x1 ≻ x2 ≻ x3. Every lottery in L is a vector (p1, p2, p3)
such that pi ≥ 0 and p1 + p2 + p3 = 1. For visualization, let us focus on the
probabilities of the best and worst outcomes. Formally, consider the p1p3 plane:
draw a graph in which the x axis corresponds to p1 and the y axis to p3. The
Marschak-Machina Triangle is
∆ = {(p1, p3) | p1, p3 ≥ 0, p1 + p3 ≤ 1}.
Thus, the point (1, 0) corresponds to the best lottery x1 (with probability 1),
(0, 0) to x2, and (0, 1) to the worst lottery x3. Every lottery P corresponds
to a unique point (p1, p3) in the triangle, and vice versa. We will refer to the
point (p1, p3) by P as well.
Consider the point (0, 0). By reasoning as in the previous proof, we conclude
that, along the segment connecting (1, 0) with (0, 1) there exists a unique point
which is equivalent to (0, 0). Such a unique point will exist along the segment
connecting (1, 0) with (0, c) for every c ∈ [0, 1]. The continuity axiom implies (in
the presence of the independence axiom) that these points generate a continuous
curve, which is the indifference curve of x2 .
Lemmas 17 and 18 imply that the indifference curves are linear. (Otherwise,
they will have to be “thick”, and for some c we will obtain intervals of indifference
on the segment connecting (1, 0) with (0, c).) We want to show that they are
also parallel.3
Consider two lotteries P ∼ Q. Consider another lottery R such that S =
R + (Q − P) is also in the triangle. (In this equation, the points are considered
as vectors in ∆.) We claim that R ∼ S. Indeed, if, say, R ≻ S, the independence
axiom would have implied ½R + ½Q ≻ ½S + ½Q, and, by P ∼ Q, also ½S + ½Q ∼
½S + ½P. We would have obtained ½R + ½Q ≻ ½S + ½P while we know that
these two lotteries are identical. (Not only equivalent, simply equal, because
S + P = R + Q.) Similarly, S ≻ R is impossible. That is, the line segment
3 You may suggest that linear indifference curves that are not parallel would intersect,
contradicting transitivity. But if the intersection is outside the triangle, such preferences may
well be transitive. See Chew (1983) and Dekel (1986).
connecting R and S is also an indifference curve. However, by P − Q = R − S
we realize that the indifference curve going through R, S is parallel to the one
going through P, Q. This argument can be repeated for practically every R if
Q is sufficiently close to P . (Some care is needed near the boundaries.) Thus
all indifference curves are linear and parallel.
The Independence axiom might bring to mind some high school geometry.
Geometrically, the Independence axiom states that indifference curves should
be parallel: consider P, Q, R, and draw a triangle whose base is PQ and whose
apex is R. Assume that P ∼ Q, so that the base of the triangle is an indifference
curve. Then, when you consider points on the edges PR and QR that are
proportionately removed from P (respectively, Q) in the direction of R, that is, αP + (1 − α)R
and αQ + (1 − α)R, you find that, by the Independence axiom, they are also
equivalent to each other. Thus, the segment connecting them is also part of
an indifference curve. But the proportionality means that we generated similar
triangles, and their bases are parallel.
Once we know that the indifference curves are linear and parallel, we’re more
or less done: linear and parallel lines can be described by a single linear function.
That is, one can choose two numbers a1 and a3 such that all the indifference
curves are of the form a1 p1 + a3 p3 = c (varying the constant c from one curve to
the other). Setting u(x1 ) = a1 , u(x2 ) = 0, and u(x3 ) = a3 , this is an expected
utility representation.
This argument can be repeated for any finite set of outcomes X. “Patching”
together the representations for all the finite subsets is done in the same way as
in the algebraic approach.
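A small numerical sketch of this geometry (with illustrative utility values, not taken from the text): in the triangle, expected utility is affine in (p1, p3), so every indifference curve is a line a1p1 + a3p3 = c with the same slope.

```python
# Sketch: in the Marschak-Machina triangle, the expected utility of
# (p1, p3) is u1*p1 + u2*(1 - p1 - p3) + u3*p3, so indifference curves
# are the parallel lines a1*p1 + a3*p3 = const with a1 = u1 - u2 and
# a3 = u3 - u2. The utility values are illustrative assumptions.
u1, u2, u3 = 1.0, 0.6, 0.0          # x1 > x2 > x3

def EU(p1, p3):
    return u1 * p1 + u2 * (1 - p1 - p3) + u3 * p3

a1, a3 = u1 - u2, u3 - u2
slope = -a1 / a3                     # dp3/dp1 along any indifference line

# Two points on the indifference curve through (0, 0), i.e., through x2:
p, q = (0.0, 0.0), (0.3, 0.3 * slope)
print(EU(*p), EU(*q))                # equal: both lie on x2's curve
```

The slope −a1/a3 does not depend on the curve's constant c, which is the "linear and parallel" conclusion of the argument above.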
Consider the sets
A = { P − Q ∈ R^X | P ≽ Q }
and
B = { P − Q ∈ R^X | Q ≻ P }.
where, for every R ∈ L, εR > 0. You may verify that this topology renders
vector operations continuous. (Observe that this is not the standard topology
on RX , even if X is finite, because εR need not be bounded away from 0. That
is, as we change the “target” R, the length of the interval coming out of P in
the direction of R, still inside the neighborhood, changes and may converge to
zero. Still, in each given direction R − P there is an open segment, leading from
P towards R, which is in the neighborhood.)
When we separate A from B by a linear functional, we can refer to the
functional as the utility function u. Linearity of the utility with respect to the
probability values guarantees affinity, i.e., that
u(αP + (1 − α)Q) = αu(P) + (1 − α)u(Q).
Since every P has a finite support, using this property inductively results in the
expected utility formula.
7 de Finetti’s Theorem
7.1 Model and Theorem
x ≽ y iff px ≥ py.
As a reminder, ∆^{n−1} is the set of probability vectors on {1, ..., n}. The
notation px refers to the inner product, that is, Σ_i p_i x_i, which is the expected
payoff of x relative to the probability p.
7.2 Proof
Let us first show that D1-D3 are equivalent to the existence of p ∈ Rⁿ such that
x ≽ y iff px ≥ py
for every x, y ∈ X.
Necessity of the axioms is immediate. To prove sufficiency, observe first that,
for every x, y ∈ X,
x ≽ y iff x − y ≽ 0.
Define
A = { x ∈ X | x ≽ 0 }
and
B = { x ∈ X | 0 ≻ x }.
We wish to show that they are convex. To this end, we start by observing
that, if x ≽ y, then x ≽ z ≽ y where z = (x + y)/2. This is true because, defining
d = (y − x)/2, we have x + d = z and z + d = y. D3 implies that x ≽ z ⇔ x + d ≽ z + d,
i.e., x ≽ z ⇔ z ≽ y. Hence z ≻ x would imply y ≻ z and y ≻ x, a contradiction.
Hence x ≽ z, and z ≽ y follows from x ≽ z.
Next we wish to show that if x ≽ y, then x ≽ z ≽ y for any z = λx + (1 − λ)y
with λ ∈ [0, 1]. If λ is a binary rational (i.e., of the form k/2^i for some k, i ≥ 1),
the conclusion follows from an inductive application of the previous claim (for
λ = 1/2). As for other values of λ, z ≻ x (or y ≻ z) would imply, by continuity,
the same preference in an open neighborhood of z, including binary rationals.
It follows that one can separate A from B by a linear function. That is,
there exists a linear f : X → R and a number c ∈ R such that
x ∈ A iff f(x) ≥ c.
One can verify that c may be taken to be 0 and, writing f(x) = px for some p ∈ Rⁿ,
x ≽ y
iff x − y ≽ 0
iff x − y ∈ A
iff f(x − y) ≥ 0
iff px ≥ py.
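A numerical sketch of the representation just obtained (the probability vector below is an illustrative assumption): preferences over payoff vectors are recovered by comparing inner products with p.

```python
# Sketch of de Finetti's representation: x >= y iff p.x >= p.y for a
# probability vector p. The vector p is an illustrative assumption.
p = [0.2, 0.5, 0.3]

def dot(p, x):
    return sum(pi * xi for pi, xi in zip(p, x))

def weakly_preferred(x, y):
    return dot(p, x) >= dot(p, y)

x = [10, 0, 0]    # pays 10 in state 1
y = [0, 4, 4]     # pays 4 in states 2 and 3
print(dot(p, x), dot(p, y))   # 2.0 vs 3.2 -> y is preferred
```

Note that the comparison depends only on x − y, mirroring the step "x ≽ y iff x − y ≽ 0" in the proof.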
8 Anscombe-Aumann’s Theorem
8.1 Model and Theorem
Anscombe-Aumann’s model has states of the world, and derives subjective prob-
abilities on them, as does de Finetti’s. However, in this model it is not assumes
that the outcomes are real numbers; rather, the outcomes are vNM lotteries.
So we have two levels of uncertainty: first, we do not know which state of the
world will obtain, and we don’t even have a probability for that uncertainty.
Second, given a state, the decision maker will be facing a lottery with known,
objective probabilities as in the vNM model.
Formally, we use the set-up introduced by Fishburn (1970). As a re-
minder, the vNM lotteries are
L = { P : X → [0, 1] | #{x | P(x) > 0} < ∞, Σ_{x∈X} P(x) = 1 }
and this set is endowed with a mixing operation: for every P, Q ∈ L and every
α ∈ [0, 1], αP + (1 − α)Q ∈ L is given by
(αP + (1 − α)Q)(x) = αP(x) + (1 − α)Q(x).
The state space is S. We wish to state that acts are functions from S to
L. In general we would need to endow S with a σ-algebra, and deal with
measurable and bounded acts. Both of these terms have to be defined in terms
of preferences, because we don’t have yet a utility function. Instead, we will
simplify our lives and assume that S is finite. However, the theorem holds also
for general measurable spaces.
The set of acts is F = L^S. We will endow F with a mixture operation as
well, performed pointwise. That is, for every f, g ∈ F and every α ∈ [0, 1],
αf + (1 − α)g ∈ F is given by
(αf + (1 − α)g)(s) = αf(s) + (1 − α)g(s).
P ≽ Q, understood as fP ≽ fQ where, for every R ∈ L, fR ∈ F is the constant
act given by fR(s) = R for all s ∈ S.
The interpretation is that, if the decision maker chooses f ∈ F and Nature
chooses s ∈ S, a roulette wheel is spun, with distribution f (s) over the outcomes
X, so that your probability to get outcome x is f (s)(x).
For a function u : X → R we will use the notation
E_P u = Σ_{x∈X} P(x)u(x)
for P ∈ L.
Thus, if you choose f ∈ F and Nature chooses s ∈ S, you will get a lottery
f(s), which has the expected u-value of
E_{f(s)} u = Σ_{x∈X} f(s)(x)u(x).
Anscombe-Aumann’s axioms are the following. The first three are identical
to the vNM axioms. Observe that they now apply to more complicated crea-
tures: rather than to specific vNM lotteries, we now deal with functions whose
values are such lotteries, or, if you will, with vectors of vNM lotteries, indexed
by the state space S. The next two axioms are almost identical to de Finetti’s
last two axioms, guaranteeing monotonicity and non-triviality:
Theorem 24 (Anscombe-Aumann) ≽ satisfies AA1-AA5 if and only if there
exist a probability measure μ on S and a non-constant function u : X → R such
that, for every f, g ∈ F,
f ≽ g iff ∫_S (E_{f(s)} u) dμ(s) ≥ ∫_S (E_{g(s)} u) dμ(s).
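For a finite S, the double expectation in the theorem is a finite sum, which the following sketch computes (μ and u below are illustrative assumptions, not derived from axioms).

```python
# Sketch of the Anscombe-Aumann representation: an act f maps states to
# lotteries; f >= g iff sum_s mu(s) * E_{f(s)} u >= the same for g.
mu = {"s1": 0.6, "s2": 0.4}        # illustrative subjective probabilities
u = {"win": 1.0, "lose": 0.0}      # illustrative vNM utility on outcomes

def EP(P, u):                       # objective (roulette) expectation
    return sum(p * u[x] for x, p in P.items())

def V(f):                           # subjective expected utility of an act
    return sum(mu[s] * EP(f[s], u) for s in mu)

f = {"s1": {"win": 1.0}, "s2": {"win": 0.5, "lose": 0.5}}
g = {"s1": {"win": 0.5, "lose": 0.5}, "s2": {"win": 1.0}}
print(V(f), V(g))                   # 0.8 vs 0.7 -> f is preferred
```

The two layers of the model are visible in the code: EP averages over the roulette lottery at a given state, and V averages those values over states with the subjective μ.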
8.2 Proof
The first part of the proof is a direct application of Theorem 16. The objects
of choice can be thought of as matrices whose columns are states in S and whose
rows are outcomes in X. For the sake of concreteness, let's assume that
X is also finite, as is S. Then, every act f can be thought of as a matrix of
non-negative numbers, such that in each column (that is, for every state s), the
numbers sum up to 1 (defining a probability distribution over the outcomes in
X). Viewed thus, an act f is an extreme point of the set F if, at each and
every column s, it assigns probability 1 to an outcome x. Thus, there are |X|^{|S|}
extreme points, and F is their convex hull.
The first three axioms mean that we can have a representation of ≽ by
an affine function U. We now wish to show that this affine function can be
represented as
U(f) = Σ_{x,s} f(s)(x) u(x, s)    (7)
for some u : X × S → R.
proof would be immediate. However, the set F has the additional constraint
that Σ_x f(s)(x) = 1 for each s separately, and this means that it has many
more extreme points, and a bit more needs to be said to obtain (7).
Let us choose x∗ ∈ X and shift U so that U ([x∗ ] , ..., [x∗ ]) = 0. This can be
done without loss of generality. Next, define, for each s,
u (x∗ , s) = 0.
us(·) ≡ u(·, s) : L → R,
that is, to have us be defined for all lotteries on X, with u(x, s) = u([x], s) =
us([x]); that is, to define us in such a way that the degenerate lottery [x],
assigning probability 1 to x, has the same value as the outcome x. (Obviously,
this is an abuse of notation, but we're accustomed to such sins by now.)
For P ∈ L, s ∈ S, define hP,s ∈ F by
hP,s(s′) = P if s′ = s, and hP,s(s′) = [x*] if s′ ≠ s,
and
us(P) = U(hP,s).
That is, us(P) is the U value of the act ([x*], ..., [x*], P, [x*], ..., [x*]) that obtains [x*] at each state s′ ≠ s
and takes the value P at s, where n = |S|.
Given f ∈ F, consider the act
f′ = (1/n) f + (1 − 1/n) ([x*], ..., [x*]).
We can think of f′ as the mixture of f and the constant act [x*]; because of our
definition of the mixture operation as a pointwise operation, affinity of U and
the normalization U(([x*], ..., [x*])) = 0 yield
U(f′) = (1/n) U(f).    (8)
On the other hand, we can also think of f′ as the n-fold mixture of acts,
each of which equals [x*] in all but one state. Formally, define gs ∈ F by
gs(s′) = h_{f(s),s}(s′) = f(s) if s′ = s, and [x*] if s′ ≠ s,
so that f′ = Σ_s (1/n) gs. Affinity of U then implies
(1/n) U(f) = U(f′) = Σ_s (1/n) U(gs)
and
U(f) = Σ_s U(gs).
Next, note that, by definition of gs, which is ([x*], ..., [x*], f(s), [x*], ..., [x*]),
and the definition of us(·), we get
U(gs) = us(f(s)),
so that
U(f) = Σ_s us(f(s)).
It only remains to note that, at each and every state s,
us(f(s)) = Σ_x f(s)(x) u(x, s),
as in the reasoning in the vNM case (where affinity of U yields the result
immediately, as the extreme points are the degenerate lotteries). ∎
v(x, s) = u(x, s) + β_s
then we get a matrix v that also satisfies (10): indeed, for every f ∈ F,
Σ_{x,s} f(s)(x) v(x, s) = Σ_{x,s} f(s)(x) [u(x, s) + β_s]
= Σ_{x,s} f(s)(x) u(x, s) + Σ_{x,s} f(s)(x) β_s
= Σ_{x,s} f(s)(x) u(x, s) + Σ_s β_s Σ_x f(s)(x)
= Σ_{x,s} f(s)(x) u(x, s) + Σ_s β_s
because, for every f ∈ F and s ∈ S, f(s) is a vNM lottery, so that Σ_x f(s)(x) =
1. Thus, shifting the utility numbers u(x, s) by a constant β_s in column s (for
every x) results in a constant shift (by Σ_s β_s) of U(f) = Σ_{x,s} f(s)(x) u(x, s), and thus in a new
matrix that still represents preferences as in (10).
Let us pick an outcome x∗ ∈ X and henceforth assume that u (x∗ , s) = 0 for
all s. In view of the above, this restriction entails no loss of generality. One may
verify that the remaining degree of freedom is only a positive multiplication of
all {u (x, s)}x,s (by the same positive number).
Define, for each s ∈ S,
us(x) = u(x, s).
We wish to show that the functions us are non-negative multiples of a single function u : X → R. More precisely,
we will distinguish between two types of states: those that are "null", intuitively
corresponding to having a zero subjective probability, and that do not matter for
the decision, and those that are "non-null", intuitively corresponding to positive
subjective probabilities. For any two non-null states s, s′, we wish to show that
us′ is a positive multiple of us. Then, we can fix one function u : X → R and
write u(x, s) = us(x) = μ_s u(x) for some μ_s > 0. Without loss of generality,
assume that the coefficients μ_s are normalized so that they sum up to 1. This
allows us to think of them as probabilities, writing
Σ_{x,s} f(s)(x) u(x, s) = Σ_{x,s} f(s)(x) μ_s u(x)
= Σ_s μ_s Σ_x f(s)(x) u(x)
= Σ_s μ_s (E_{f(s)} u),
namely, an expected utility: the inner expression is the expectation of u
relative to the objective probabilities given by the lottery f(s), and these
expectations are integrated with respect to the probability vector μ, interpreted as the
decision maker's subjective probability over the state space S.
us(x) = u(x, s).
Because f(s)(x) and g(s)(x) are independent of s, this can be written as
Σ_x Σ_s f(s)(x) u(x, s) ≥ Σ_x Σ_s g(s)(x) u(x, s),
that is,
Σ_x f(s)(x) u(x) ≥ Σ_x g(s)(x) u(x).
In other words, the sum of state-utilities, u = Σ_{s∈S} us, is a vNM function
that represents preferences over constant acts. We now wish to show that for
every s there is μ_s ≥ 0 such that us(·) = μ_s u(·).
Lemma Let a, b ∈ Rⁿ be such that
az ≥ 0 ⇒ bz ≥ 0
for every z ∈ Rⁿ with Σ_i z_i = 0. Then there are λ ≥ 0 and c ∈ R such that
b_i = λa_i + c
for every i ≤ n.
To see this, consider the linear programming problem (P): minimize bz
subject to
az ≥ 0
1z = 0.
The condition
az ≥ 0 ⇒ bz ≥ 0  ∀z ∈ Rⁿ, 1z = 0
is equivalent to (P) being bounded, which is equivalent to its dual being feasible.
The dual will have two variables, say, λ for the first constraint and c for the
second. Its objective function is
0λ + 0c = 0,
that is, the dual is
Max_{λ,c} 0
subject to
λa_i + c = b_i ∀i, λ ≥ 0,
and its feasibility yields
b_i = λa_i + c.
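A numerical sanity check of the easy direction of the lemma (with illustrative vectors): if b = λa + c·1 with λ ≥ 0, then az ≥ 0 implies bz ≥ 0 for every z with 1z = 0, because the constant c is wiped out by Σ_i z_i = 0.

```python
# Sketch checking the lemma's direction: with b = lambda*a + c*1 and
# lambda >= 0, we have bz = lambda*az whenever sum(z) = 0, so
# az >= 0 implies bz >= 0. Vectors a, lam, c are illustrative.
import random

a = [1.0, -2.0, 0.5, 3.0]
lam, c = 2.0, -7.0
b = [lam * ai + c for ai in a]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

random.seed(0)
for _ in range(1000):
    z = [random.uniform(-1, 1) for _ in a]
    z = [zi - sum(z) / len(z) for zi in z]     # force 1z = 0
    if dot(a, z) >= 0:
        assert dot(b, z) >= -1e-9              # bz = lam * az >= 0
```

The harder converse direction is what the LP duality argument above delivers.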
Monotonicity implies that, whenever this is the case (f(s) ≽ g(s)), we have
f ≽ g. However, f ≽ g is equivalent to
Σ_{x,s′} f(s′)(x) u(x, s′) ≥ Σ_{x,s′} g(s′)(x) u(x, s′)
and, since f(s′) = g(s′) for s′ ≠ s, also to
Σ_x f(s)(x) us(x) ≥ Σ_x g(s)(x) us(x)
or
Σ_x [f(s)(x) − g(s)(x)] us(x) ≥ 0,
that is,
[f(s)(·) − g(s)(·)] b ≥ 0.
Consider a vNM lottery P such that P(x) = 1/n for all x (where now n = |X|). Select f such
that f(s) = P. For z ∈ (−1/n, 1/n)ⁿ with 1z = 0, select g such that f(s′) = g(s′) for
s′ ≠ s and g(s)(x) = 1/n − z(x), so that f(s)(·) − g(s)(·) = z. For every such z
we therefore get that az ≥ 0 ⇒ bz ≥ 0. Due to homogeneity, this also implies
that az ≥ 0 ⇒ bz ≥ 0 holds for every vector z ∈ Rⁿ with 1z = 0. By the lemma, we
have λ ≥ 0 and c ∈ R such that
us(x_i) = λu(x_i) + c.
Since us(x*) = 0 and u(x*) = 0, we get c = 0, and hence
us(x) = λ_s u(x) ∀x ∈ X.
Conversely, if λ_s = 0, it follows that us(x) vanishes for all x, and then s is null.
Thus, λ_s > 0 iff s is non-null (and λ_s = 0 iff s is null). Since there are non-null
states, Σ_s λ_s > 0, and we can define
μ_s = λ_s / Σ_{s′} λ_{s′},
which delivers the representation for all f ∈ F.
9 Savage’s Theorem
9.1 Set-up
Savage’s model includes two primitive concepts: states and outcomes. The set
of states, S, should be thought of as an exhaustive list of all scenarios that might
unfold. An event is any subset A ⊂ S. There are no measurability constraints,
and S is not endowed with an algebra of measurable events. If you wish to
be more formal about it, you can define the set of events to be the maximal
σ-algebra, Σ = 2S , with respect to which all subsets are measurable.
The set of outcomes will be denoted by X. An outcome x is assumed to
specify all that is relevant to your well-being, insomuch as it may be relevant to
your decision.
The objects of choice are acts, which are defined as functions from states to
outcomes, and denoted by F . That is,
F = X S = {f | f : S → X} .
Acts whose payoffs do not depend on the state of the world s are constant
functions in F . We will abuse notation and denote them by the outcome they
result in. Thus, x ∈ X is also understood as x ∈ F with x(s) = x.
Since the objects of choice are acts, Savage assumes a binary relation ≽ ⊂
F × F. The relation will have its symmetric and asymmetric parts, ∼ and ≻,
defined as usual. It will also be extended to X with the natural convention.
Specifically, for two outcomes x, y ∈ X, we say that x ≽ y if and only if the
constant function that always yields x is related by ≽ to the constant function
that always yields y.
For two acts f, g ∈ F and an event A ⊂ S, define an act f_A^g by
f_A^g(s) = g(s) if s ∈ A, and f_A^g(s) = f(s) if s ∈ A^c.
Think of f_A^g as "f, where on A we replaced it by g".
An event A is null if, for every f, g ∈ F , f ∼A g. That is, if you know
that f and g yield the same outcomes if A does not occur, you consider them
equivalent.
9.2 Axioms
P1 ≽ is a weak order.
P4 For every A, B ⊂ S and every x ≻ y, z ≻ w,
y_A^x ≽ y_B^x iff w_A^z ≽ w_B^z.
P6 For every f ≻ g and every h ∈ X, there is a finite partition {A_1, ..., A_n} of S such that, for every i,
f_{A_i}^h ≻ g and f ≻ g_{A_i}^h.
9.3 Results
9.3.1 Finitely additive measures
and μ(Ω) = 1. Condition (11) is referred to as σ-additivity.
Finite additivity is the condition that μ(A ∪ B) = μ(A) + μ(B) whenever
A ∩ B = ∅, which is clearly equivalent to (11) if you replace ∞ by any
finite n:
μ(∪_{i=1}^{n} A_i) = Σ_{i=1}^{n} μ(A_i)    (12)
whenever i ≠ j ⇒ A_i ∩ A_j = ∅.
Setting B_n = ∪_{i=1}^{n} A_i, (11) means
μ(lim_{n→∞} B_n) = μ(∪_{i=1}^{∞} A_i) = Σ_{i=1}^{∞} μ(A_i) = lim_{n→∞} μ(B_n),
that is, σ-additivity of μ is equivalent to saying that the measure of the limit is
the limit of the measure, when increasing sequences of events are concerned.
In the case of a σ-additive μ, all three definitions coincide. But this is not
true for finite additivity. Moreover, the condition that Savage needs, and the
condition that turns out to follow from P6, is the strongest.
Hence, we will define a finitely additive measure μ to be non-atomic if for
every event A with μ(A) > 0, and for every r ∈ [0, 1], there is an event B ⊂ A
such that μ(B) = rμ(A).
Observe that this theorem restricts u to be bounded. (Of course, this was not
stated in Theorem 27 because when X is finite, u is bounded.) The boundedness
of u follows from P3. Indeed, if u is not bounded one can generate acts whose
expected utility is infinite (following the logic of the St. Petersburg Paradox).
This, in and of itself, is not an insurmountable difficulty, but P3 will not hold for
such acts: you may strictly improve f from, say, x to y on a non-null event A,
and yet the resulting act will be equivalent to the first one, both having infinite
expected utility. Hence, as stated, P3 implies that u is bounded. An extension
of Savage’s theorem to unbounded utilities is provided in Wakker (1993a).5
A corollary of the theorem is that an event A is null if and only if μ(A) = 0. In
Savage’s formulation, this fact is stated on par with the integral representation
(13).
Savage’s proof is too long and involved to be covered here. Savage (1954) de-
velops the proof step by step, alongside conceptual discussions of the axioms.
Fishburn (1970) provides a more concise proof, which may be a bit laconic, and
Kreps (1988, pp. 115-136) provides more details. Here I will only say a few
words about the strategy of the proof, and introduce another concept in this
context.
Savage first deals with the case |X| = 2. That is, there are two outcomes,
say, 1 and 0, with 1 ≻ 0. Thus every f ∈ F is characterized by an event A,
that is, f = 1_A. Correspondingly, ≽ ⊂ F × F can be thought of as a relation
≽ ⊂ Σ × Σ with Σ = 2^S.
In this set-up P4 has no bite. Let us translate P1-P3 and P5 to the language
of events. P1 would mean, again, that ≽ (understood as a relation on events)
is a weak order. P2 is equivalent to the condition:
A ≽ B iff A ∪ C ≽ B ∪ C
whenever C ∩ (A ∪ B) = ∅.
Fishburn reports that this became obvious during a discussion they had later on.
A binary relation on an algebra of events that satisfies these conditions was
defined by de Finetti to be a qualitative probability. The idea was that subjective
judgments of "at least as likely as" on events that satisfied certain regularities
might be representable by a probability measure, that is, that a probability
measure μ would satisfy
A ≽ B iff μ(A) ≥ μ(B).    (14)
If such a measure existed, and if it were unique, one could use the likelihood
comparisons ≽ as a basis for the definition of subjective probability. Observe
that such a definition would qualify as a definition by observable data if you are
willing to accept judgments such as "I find A at least as likely as B" as valid
data.6
de Finetti conjectured that every qualitative probability has a (quantitative)
probability measure that represents it. It turns out that this is true if |S| ≤ 4,
but a counterexample can be constructed for |S| = 5. Such a counterexample was
found by Kraft, Pratt, and Seidenberg (1959), who also provided a necessary
and sufficient condition for the existence of a representing measure.
You can easily convince yourself that even if such a measure exists, it will
typically not be unique. The set of measures that represent a given qualitative
probability is defined by finitely many inequalities. Generically, one can expect
that the set will not be a singleton.
However, Savage found that for |X| = 2 his relation was a qualitative prob-
ability defined on an infinite space, which also satisfied P6. This turned out to
be a powerful tool. With P6 one can show that every event A can be split into
two, B ⊂ A and A\B, such that B ∼ A\B.7 Equipped with such a lemma,
one can go on to find, for every n ≥ 1, a partition of S into 2ⁿ equivalent events,
Π_n = {A_1^n, ..., A_{2ⁿ}^n}. Moreover, using P2 we can show that the union of every k
events from Π_n is equivalent to the union of any other k events from the same
partition. Should there be a probability measure μ that represents %, it has to
6 We will discuss such cognitive data in Part IV.
7 Kopylov (2007) provides a different proof, which also generalizes Savage’s theorem.
satisfy μ(A_i^n) = 1/2ⁿ and μ(∪_{i=1}^{k} A_i^n) = k/2ⁿ.
Given an event B such that S ≻ B, one may ask, for every n, what is the
number k such that
∪_{i=1}^{k+1} A_i^n ≻ B ≽ ∪_{i=1}^{k} A_i^n.
Should μ represent ≽, it would have to satisfy
(k + 1)/2ⁿ > μ(B) ≥ k/2ⁿ.
With a little bit of work one can convince oneself that there is a unique μ(B)
that satisfies the above for all n. Moreover, it is easy to see that
B ≽ C implies μ(B) ≥ μ(C).    (15)
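A numerical sketch of this dyadic construction (with an illustrative setting, not Savage's abstract one): take S = [0, 1) with μ equal to length, and simulate the likelihood comparisons by μ itself; the partition counts k(n)/2ⁿ then converge to μ(B).

```python
# Sketch of Savage's dyadic approximation: with S = [0, 1), mu = length,
# and the n-th partition into 2^n equal intervals, k(n) is the largest k
# such that the union of k atoms is weakly less likely than B.
# Likelihood comparisons are simulated by mu itself (illustrative).
def k(n, mu_B):
    """Largest k with k/2^n <= mu_B."""
    return int(mu_B * 2**n)

mu_B = 0.3                         # an event B of measure 0.3
for n in (1, 4, 10, 20):
    print(n, k(n, mu_B) / 2**n)    # 0.0, 0.25, 0.2998..., -> 0.3
```

The squeeze (k + 1)/2ⁿ > μ(B) ≥ k/2ⁿ is what pins down the unique candidate value μ(B) across all n.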
The problem then is that the converse is not trivial. In fact, Savage provides
beautiful examples of qualitative probability relations, for which there exists a
unique μ satisfying (15) but not the converse direction.
Here P6 is used again. Savage shows that P6 implies that % (applied to
events) satisfies two additional conditions, which he calls fineness and tightness.
(Fineness has an Archimedean flavor, while tightness can be viewed as a conti-
nuity of sorts.) With these conditions, it can be shown that the only μ satisfying
(15) satisfies also
B ≻ C implies μ(B) > μ(C).
conditions. He shows a qualitative probability relation that has a unique μ satisfying (15),
which is fine but not tight, and one which is tight but not fine, and neither of these has a
probability that represents it as in (14).
proceed. He first shows that if two acts have the same distribution (with finite
support), according to μ, they are equivalent. This means that, for a finite X,
one can deal with equivalence classes defined by distributions over outcomes.
Then Savage proves that the preference relation over these classes satisfies the
vNM axioms, and finally he extends the representation to an infinite X.
10 Choquet Expected Utility
10.1 Capacities and Choquet Integration
with the convention x_{m+1} = 0. If v is additive, this integral is equivalent to
the Riemann integral (and to Σ_{j=1}^{m} x_j v(E_j)). You can also verify that (16)
is equivalent to the following definition, which applies to any bounded non-
negative f (even if S were infinite, as long as f were measurable with respect
to the algebra on which v is defined):
∫_S f dv = ∫_0^∞ v(f ≥ t) dt
where the integral on the right is a standard Riemann integral. (Observe that
it is well defined, because v(f ≥ t) is a non-increasing function of t.)
For functions that may be negative, the integral is defined so that, for every
function f and constant c,
∫_S (f + c) dv = ∫_S f dv + c,
a property that holds for non-negative f and c. So we make sure the property
holds in general: given a bounded f, take a c > 0 such that g = f + c ≥ 0, and define
∫_S f dv = ∫_S g dv − c.
10.2 Comonotonicity
The Choquet integral has many nice properties: it respects "shifts", namely,
the addition of a constant, as well as multiplication by a positive constant. It
is also continuous and monotone in the integrand. But it is not additive in
general. Indeed, if we had
∫_S (f + g) dv = ∫_S f dv + ∫_S g dv
for every f and g, we could take f = 1_A and g = 1_B for disjoint A and B, and
show that v(A ∪ B) = v(A) + v(B).
However, there are going to be pairs of functions f, g for which the Choquet
integral is additive. To see this, observe that (16) can be re-written as
∫_S f dv = Σ_{j=1}^{m} x_j [v(∪_{i=1}^{j} E_i) − v(∪_{i=1}^{j−1} E_i)].
Assume, without loss of generality, that each E_i is a singleton. (This is possible
because we only required a weak inequality x_j ≥ x_{j+1}.) That is, there is some
permutation of the states, π : S → S, defined by the order of the x_i's, such that
∪_{i=1}^{j} E_i consists of the first j states in this permutation. Given this π, define a
probability vector p_π on S by p_π(∪_{i=1}^{j} E_i) = v(∪_{i=1}^{j} E_i). It is therefore true that
∫_S f dv = ∫_S f dp_π,
that is, the Choquet integral of f equals the integral of f relative to some
additive probability pπ . Note, however, that pπ depends on f . Since different
f ’s have, in general, different permutations π that rank the states from high f
values to low f values, the Choquet integral is not additive in general.
Assume now that two functions, f and g, happen to have the same permu-
tation π. They will have the same p_π and then
∫_S f dv = ∫_S f dp_π and ∫_S g dv = ∫_S g dp_π.
In other words, if f and g are two functions such that there exists a permu-
tation of the states π, according to which both f and g are non-increasing, we
will have additivity of the integral for f and g. When will f and g have such a
permutation? It is not hard to see that a necessary and sufficient condition is
the following:
f and g are comonotonic if there are no s, t ∈ S such that f (s) > f (t) and
g(s) < g(t).
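The comonotonicity condition can be checked numerically: for comonotonic pairs the Choquet integral is additive, while an anti-comonotonic pair can break additivity. The capacity below is an illustrative non-additive one.

```python
# Sketch: the Choquet integral is additive for comonotonic f, g (they
# share a ranking permutation pi, hence the same p_pi), but can fail
# additivity otherwise.
def choquet(f, v):
    states = sorted(f, key=lambda s: f[s], reverse=True)
    total, union, v_prev = 0.0, set(), 0.0
    for s in states:
        union.add(s)
        v_cur = v(frozenset(union))
        total += f[s] * (v_cur - v_prev)
        v_prev = v_cur
    return total

v = lambda A: (len(A) / 2) ** 2       # illustrative non-additive capacity
f = {"s": 3.0, "t": 1.0}
g = {"s": 2.0, "t": 0.0}              # comonotonic with f
h = {"s": 0.0, "t": 2.0}              # NOT comonotonic with f

fg = {k: f[k] + g[k] for k in f}
fh = {k: f[k] + h[k] for k in f}
print(choquet(fg, v), choquet(f, v) + choquet(g, v))  # 2.0 and 2.0
print(choquet(fh, v), choquet(f, v) + choquet(h, v))  # 3.0 vs 2.0
```

The failure for f and h reflects the fact that f + h is constant: the sum destroys the hedging that the non-additive v rewards, which is exactly the phenomenon exploited by uncertainty aversion in the next section.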
For two acts f, g ∈ F, we say that f and g are comonotonic if there are no
s, t ∈ S such that f(s) ≻ f(t) and g(s) ≺ g(t).
(where the integrals are in the sense of Choquet). Furthermore, in this case v
is unique, and u is unique up to positive linear transformations.
still inside the set. Applying Theorem 16, one gets an equivalent of Anscombe-
Aumann representation, restricted to the cone of π-non-decreasing vectors. For
this cone, we therefore obtain a representation by a probability vector pπ . One
then proceeds to show that all these probability vectors can be described by a
single non-additive measure v.
11 Maxmin Expected Utility
11.1 Model and Theorem
Thus, uncertainty aversion requires that the decision maker have a preference
for mixing: two equivalent acts can only improve by mixing, or "hedging",
between them. Observe that uncertainty aversion is also a weakened version
of Anscombe-Aumann's independence axiom (which would have required αf +
(1 − α)g ∼ f whenever f ∼ g).
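A numerical sketch of why mixing helps under the maxmin criterion (the set of priors below is an illustrative assumption): two bets that are equivalent under the worst-case evaluation strictly improve when hedged.

```python
# Sketch of maxmin expected utility: J(f) = min over a set of priors of
# the expected utility of f. The priors are illustrative (p1 in [0.3, 0.7]).
priors = [(p1, 1 - p1) for p1 in (0.3, 0.4, 0.5, 0.6, 0.7)]

def J(f):
    return min(p1 * f[0] + p2 * f[1] for p1, p2 in priors)

f = (1.0, 0.0)            # bet on state 1
g = (0.0, 1.0)            # bet on state 2
m = (0.5, 0.5)            # the 50-50 mixture of f and g
print(J(f), J(g), J(m))   # 0.3, 0.3, 0.5: mixing strictly helps
```

The mixture is evaluated the same way by every prior, so the adversarial minimum cannot punish it; this is the uncertainty-aversion inequality J(½f + ½g) ≥ min{J(f), J(g)} holding strictly.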
11.2 Idea of Proof
f ≽ g ⇔ J(f) ≥ J(g)
by letting
J((c, c, ..., c)) = c
for the certainty equivalent c of f, so that
J(f) = c.
a_f f = J(f)
a_f g ≥ J(g) ∀g
and
J(f) = min_g a_g f
for all α ∈ [0, 1]. This implies that the supporting hyperplane defined by af has
to coincide with J on the segment connecting f and (c, c, ..., c). Hence
af (c, c, ..., c) = c
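With J(f) defined as the minimum over a set of priors of the expected value of f, the minimizing prior plays the role of a_f. A sketch (all numbers invented; a_f is taken to be the minimizing extreme prior) checks the three properties used above: a_f · f = J(f), a_f · g ≥ J(g) for every g, and a_f · (c, ..., c) = c because a_f sums to one:

```python
def dot(p, f):
    # Expected value of f under prior p.
    return sum(p[s] * f[s] for s in f)

priors = [{'s': 0.3, 't': 0.7}, {'s': 0.7, 't': 0.3}]

def J(f):
    # Worst-case expectation over the set of priors.
    return min(dot(p, f) for p in priors)

f = {'s': 1.0, 't': 0.0}
a_f = min(priors, key=lambda p: dot(p, f))   # supporting prior at f

g = {'s': 0.0, 't': 1.0}
c = 0.4
const = {'s': c, 't': c}

print(dot(a_f, f) == J(f))   # True: the hyperplane touches J at f
print(dot(a_f, g) >= J(g))   # True: and lies (weakly) above J everywhere
print(dot(a_f, const))       # close to 0.4 = c, since a_f sums to one
```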
12 Arrow’s Impossibility Theorem
Assume that there is a set of alternatives A = {1, 2, ..., m} with m ≥ 3 and a
set of individuals N = {1, 2, ..., n} with n ≥ 2.
Let the set of linear orderings be R = {≻ ⊂ A × A | ≻ is complete, transitive,
and asymmetric}.
A preference aggregation function maps profiles of preferences to a preference
that is attributed to society. That is, a preference aggregation function is f :
Rn → R.
Given such a function, define:
1. Unanimity: For all a, b ∈ A, if a ≻_i b ∀i ∈ N, then a f((≻_i)_i) b.

2. Independence of Irrelevant Alternatives (IIA): For all a, b ∈ A and all profiles (≻_i)_i, (≻′_i)_i,
if

a ≻_i b ⇔ a ≻′_i b   ∀i ∈ N

then

a f((≻_i)_i) b ⇔ a f((≻′_i)_i) b.
Proof: Clearly, all (the n different) dictatorial functions satisfy the two conditions.
The interesting (in fact, amazing) fact is that the converse is true as
well. We turn to prove it now (based on one of the short proofs provided by
Geanakoplos, 2005).
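The first, easy direction can even be checked exhaustively for a tiny case. The sketch below (with m = 3 and n = 2; the helper names `above` and `dictator` are ad hoc) enumerates all 36 profiles and verifies that a dictatorial rule satisfies both Unanimity and IIA:

```python
from itertools import permutations, product

alts = ('a', 'b', 'c')
orders = list(permutations(alts))           # the 6 strict linear orders

def above(order, x, y):
    # x is ranked above y in this order.
    return order.index(x) < order.index(y)

def dictator(profile):
    # Society simply adopts individual 0's order.
    return profile[0]

profiles = list(product(orders, repeat=2))  # n = 2 individuals

# Unanimity: if everyone ranks x above y, so does society.
for prof in profiles:
    soc = dictator(prof)
    for x in alts:
        for y in alts:
            if x != y and all(above(o, x, y) for o in prof):
                assert above(soc, x, y), "unanimity violated"

# IIA: the social ranking of {x, y} depends only on individual
# rankings of {x, y}.
for p1 in profiles:
    for p2 in profiles:
        for x in alts:
            for y in alts:
                if x == y:
                    continue
                if all(above(p1[i], x, y) == above(p2[i], x, y)
                       for i in range(2)):
                    assert above(dictator(p1), x, y) == \
                           above(dictator(p2), x, y), "IIA violated"

print("dictatorial rule satisfies Unanimity and IIA on all",
      len(profiles), "profiles")
```

The theorem's content is of course the converse: with m ≥ 3, no non-dictatorial f survives the same exhaustive check.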
Assume not. Then there exists a profile (≻_i)_i and an alternative a such
that a is extreme in each of the ≻_i but not in ≻. Thus, there are b, c ∈ A such that
b ≻ a ≻ c. We can modify the profile (≻_i)_i to get another profile (≻′_i)_i such
that

(i) a is top (bottom) at ≻′_i whenever it is top (bottom) at ≻_i;

(ii) c ≻′_i b for all i

— simply by switching between b and c, if needed, in each ≻_i.

The ranking between a and any other alternative has not changed (it is the
same in ≻′_i as in ≻_i for each i), and, by IIA, a is ranked, relative to b and c,
in ≻′ = f((≻′_i)_i) as it was in ≻ = f((≻_i)_i). Thus, b ≻′ a ≻′ c, while unanimity
implies c ≻′ b, a contradiction. □
b ≻_i c ⇒ b ≻ c   ∀b, c ∈ A\{a}
(I)  d ≻_j a   ∀j ≤ i
     a ≻_j d   ∀j > i
⇒ d ≻ a

and

(II) d ≻_j a   ∀j < i
     a ≻_j d   ∀j ≥ i
⇒ a ≻ d.
Given distinct b, c ∈ A\{a}, consider a profile in which

b ≻_i a ≻_i c
d ≻_j a   ∀d ≠ a, j < i
a ≻_j d   ∀d ≠ a, j > i.

Then on {b, a} preferences look like pattern (I), and b ≻ a follows. On {c, a}
preferences look like pattern (II), and a ≻ c follows. Hence b ≻ c. Finally,
due to IIA, this has to be the case whenever the individuals have the same
rankings between b and c as in such profiles. However, the b/c rankings of the
other individuals were not constrained above, which means that individual i's
ranking between b and c determines that of society. □
a ≻_{i(c)} b        (17)
b ≻_{i(a)} c
c ≻_{i(b)} a

which is possible unless i(a) = i(c) = i(b). However, in such a profile society's
preferences would be cyclical. Hence, it has to be the case that there is no profile
for which (17) happens; that is, it has to be the case that i(a) = i(c) = i(b),
and the conclusion follows. □
13 References
(Not all of the following are mentioned above, but many of them might be
mentioned in class.)
de Finetti, B. (1930), “Funzione caratteristica di un fenomeno aleatorio,” Atti
Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Nat., 4: 86-133.

——— (1937), “La Prévision: Ses Lois Logiques, Ses Sources Subjectives,”
Annales de l’Institut Henri Poincaré, 7: 1-68.

Ellsberg, D. (1961), “Risk, Ambiguity and the Savage Axioms,” Quarterly Journal
of Economics, 75: 643-669.

Fishburn, P. C. (1970a), Utility Theory for Decision Making. New York: John
Wiley and Sons.

——— (1985), Interval Orders and Interval Graphs. New York: Wiley and
Sons.
Gilboa, I. and R. Lapson (1995), “Aggregation of Semi-Orders: Intransitive
Indifference Makes a Difference”, Economic Theory, 5: 109-126.
Karni, E., D. Schmeidler and K. Vind (1983), “On State Dependent Preferences
and Subjective Probabilities,” Econometrica, 51: 1021-1031.
Knight, F. H. (1921), Risk, Uncertainty, and Profit. Boston, New York: Houghton
Mifflin.
Maccheroni, F., M. Marinacci, and A. Rustichini (2006a), “Ambiguity Aversion,
Robustness, and the Variational Representation of Preferences,” Econometrica,
74: 1447-1498.
Savage, L. J. (1954), The Foundations of Statistics. New York: John Wiley and
Sons. (Second edition in 1972, Dover.)
Shafer, G. (1986), “Savage Revisited”, Statistical Science, 1: 463-486.