ECON 203 Lecture Notes
Contents
1 Strategic Environments 1
1.1 Components of a non-cooperative game (“extensive form”) . . . . . . . . . . 1
1.2 Some special classes of extensive form games . . . . . . . . . . . . . . . . . . 6
1.3 Strategies and the strategic (normal) form . . . . . . . . . . . . . . . . . . . 7
2.7.3 Existence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.8 Applications of mixed strategy Nash equilibria . . . . . . . . . . . . . . . . . 57
2.8.1 An auction problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.8.2 Bertrand competition with capacity constraints . . . . . . . . . . . . 58
2.9 Normal form refinements of Nash equilibrium . . . . . . . . . . . . . . . . . 62
2.9.1 Application of weak dominance . . . . . . . . . . . . . . . . . . . . . 62
2.9.2 Trembling hand perfection . . . . . . . . . . . . . . . . . . . . . . . . 63
2.10 Arguments for and against Nash equilibrium . . . . . . . . . . . . . . . . . . 65
2.10.1 Possible justifications for Nash equilibrium . . . . . . . . . . . . . . . 65
2.10.2 Possible criticisms of Nash equilibrium . . . . . . . . . . . . . . . . . 65
2.11 Correlated equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
6 Repeated Games with Complete Information 162
6.1 Infinitely repeated games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.1.1 Some preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.1.2 Nash equilibria with no discounting . . . . . . . . . . . . . . . . . . . 164
6.1.3 Nash equilibria with discounting . . . . . . . . . . . . . . . . . . . . . 171
6.1.4 Subgame perfect Nash equilibria . . . . . . . . . . . . . . . . . . . . . 174
6.1.5 A short list of other topics . . . . . . . . . . . . . . . . . . . . . . . . 177
6.2 Finitely repeated games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
6.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
6.3.1 The repeated Bertrand model . . . . . . . . . . . . . . . . . . . . . . 179
6.3.2 The repeated Cournot model . . . . . . . . . . . . . . . . . . . . . . 181
6.3.3 Cooperation with cyclical demand . . . . . . . . . . . . . . . . . . . . 185
6.3.4 Multimarket contact . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
6.3.5 Price wars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
1 Strategic Environments
1.1 Components of a non-cooperative game (“extensive form”)
1. Who is playing?
Concept of a game tree: A formal representation of the sequence of events and decisions
(related to decision trees)
1. Nodes
2. A mapping from nodes to the set of players
3. Branches
4. A mapping from branches to the set of action labels
5. Precedence (a partial ordering)
6. A probability distribution over branches for all nodes that map to nature, N
Definition: The root of a tree is a node with two properties: (i) it has no predecessors, and
(ii) it is a predecessor for everything else (equivalently, everything else is a successor).
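The components above can be collected into a small data structure. The following is a minimal sketch of our own devising (the class and field names are illustrative, not part of the formal definition):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    player: str = ""      # who moves here ("N" for nature, "" for a terminal node)
    payoffs: tuple = ()   # utilities attached to a terminal node
    # branches: a mapping from action labels to successor nodes (encodes precedence)
    children: dict = field(default_factory=dict)

def successors(node):
    """All nodes that follow `node` in the precedence ordering."""
    out = []
    for child in node.children.values():
        out.append(child)
        out.extend(successors(child))
    return out

# A one-move example: player 1 chooses A or B at the root.
root = Node(player="1",
            children={"A": Node(payoffs=(1, 0)), "B": Node(payoffs=(0, 1))})

# The root has no predecessor and is a predecessor of everything else:
print(len(successors(root)))  # 2
```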
Assumptions:
A1. The number of nodes is finite (relax later).
A3. There is a unique path (following precedence) from the root of the tree to each terminal
node.
available at all nodes within any element of the partition, and (iii) no element of the
partition contains both a node and its predecessor.
Examples:
4. How are payoffs determined?
1.2 Some special classes of extensive form games
Games of Perfect Recall: Informally, (i) a player never forgets a decision that he or she
took in the past, and (ii) a player never forgets information that he or she possessed
when making a previous decision (for a formal definition, see e.g. Kreps p. 374).
[Two game-tree figures: one illustrating a violation of Condition (i), one illustrating a violation of Condition (ii).]
Concept of a strategy: A mapping that assigns a feasible action to all information sets
for which the player is the decision-maker (a complete contingent plan).
Mathematical structure of a strategy: Let A be the set of action labels associated with all branches in the game. Suppose that there are Ki information sets for which player i is the decision-maker. Then a strategy for player i, denoted si, is an element of some strategy set Si ⊆ A^Ki. That is, si is a Ki-dimensional vector, which lists an action for each information set.
Examples:
MP-A: S1 = {H, T }, S2 = {(h, h), (h, t), (t, h), (t, t)}
S = ∏_{i=1}^{I} Si is the set of feasible strategy profiles.
s−i = (s1, ..., si−1, si+1, ..., sI) is a profile of strategies for every player but i.
S−i = ∏_{j≠i} Sj is the set of feasible strategy profiles for every player but i.
Payoffs: A strategy profile s induces a probability distribution over terminal nodes. Let
gi (s) denote the associated expected utility of player i, given this probability dis-
tribution, and given the mapping from terminal nodes into i’s utility. Let g(s) =
(g1(s), g2(s), ..., gI(s)). Notice that g : S → R^I. This is known as the payoff function.
A game in strategic (normal) form: A collection of players I = {1, ..., I}, a strategy
profile set S, and a payoff function g. (We will sometimes write this as (I, S, g).)
Example 1: MP-A
Example 2: MP-B
Player 1: 2 information sets; 2 possible choices at first, 3 possible choices at second;
cardinality of strategy set = 2 × 3 = 6. S1 = {(c, k), (c, m), (c, n), (d, k), (d, m), (d, n)}
Player 2: 3 information sets; 2 possible choices at each; cardinality of strategy set = 2^3 = 8. S2 = {(e, g, i), (e, g, j), (e, h, i), (e, h, j), (f, g, i), (f, g, j), (f, h, i), (f, h, j)}.
S = S1 × S2 . |S| = 6 × 8 = 48 (number of strategy profiles).
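The counting in this example is easy to verify mechanically; a quick sketch (action labels as in the example):

```python
from itertools import product

# Player 1: 2 choices at her first information set, 3 at her second.
S1 = list(product("cd", "kmn"))
# Player 2: 2 choices at each of 3 information sets.
S2 = list(product("ef", "gh", "ij"))
# Feasible strategy profiles: the Cartesian product S = S1 x S2.
S = list(product(S1, S2))

print(len(S1), len(S2), len(S))  # 6 8 48
```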
Illustrative calculations for the payoff function:
(e, g, i) and (c, k) yields (0,1) with probability 0.5, and (0,1) with probability 0.5. Expected
payoff is (0,1).
(e, h, i) and (d, m) yields (0,2) with probability 0.5 and (1,1) with probability 0.5. Expected
payoff is (0.5, 1.5).
(f, g, j) and (c, n) yields (1,0) with probability 0.5 and (0,3) with probability 0.5. Expected
payoff is (0.5, 1.5).
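Each calculation is a probability-weighted average of the payoff vectors at the terminal nodes reached; a sketch:

```python
def expected_payoff(lottery):
    """lottery: list of (probability, payoff vector) pairs induced by a strategy profile."""
    dims = len(lottery[0][1])
    return tuple(sum(prob * payoffs[k] for prob, payoffs in lottery) for k in range(dims))

# (e, h, i) together with (d, m): (0, 2) with probability 0.5 and (1, 1) with probability 0.5.
print(expected_payoff([(0.5, (0, 2)), (0.5, (1, 1))]))  # (0.5, 1.5)
```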
2 Strategic Choice in Static Games of Complete Information
Games of Complete Information: Each player knows the payoff received at each termi-
nal node for every other player.
Definition: si is a dominant strategy for i iff for all ŝi ∈ Si (with ŝi ≠ si) and s−i ∈ S−i, gi(si, s−i) > gi(ŝi, s−i).
2.2 Dominance and iterated dominance
Notation: Ki ≡ |Si|. For player i, label the individual strategies si^k, k = 1, ..., Ki.
Definition: si^k is a dominated strategy for player i iff there exist ρm ≥ 0 for 1 ≤ m ≤ Ki with ∑_{m=1}^{Ki} ρm = 1 such that, for all s−i ∈ S−i,
∑_{m=1}^{Ki} ρm gi(si^m, s−i) > gi(si^k, s−i).
Remarks (concerning iterative deletion of dominated strategies):
2. To justify iterative deletion of dominated strategies, one must assume common knowledge
of rationality.
Normal form: si = qi, Si = R+, gi(s1, ..., sI) = P(∑i si) si − ci(si).
Specialize to two identical firms, linear cost, and linear demand: ci (qi ) = cqi , p = a − bQ.
Payoff functions:
gi(qi, q−i) = (a − c)qi − b qi q−i − b qi^2
Theorem: The linear Cournot model with two identical firms is dominance solvable. The solution involves q = (a − c)/(3b) for each firm.
Lemma: If q−i ∈ [q^0, q^1], then all qi < γ(q^1) and all qi > γ(q^0) are dominated.
For any fixed qi < γ(q^1), let D(q−i) ≡ gi(γ(q^1), q−i) − gi(qi, q−i). Then
dD(q−i)/dq−i = b(qi − γ(q^1)) < 0.
Consequently, for q−i < q^1, D(q−i) > 0. Thus, qi is dominated by γ(q^1).
The argument for qi > γ(q 0 ) is completely symmetric, and is left as an exercise. Q.E.D.
Remark: The lemma relies only on the submodularity of the payoff function.
Proof of the Theorem: We use the lemma to iteratively delete dominated strategies.
This produces a sequence of intervals, as follows:
Strategy set: [0, ∞)
First iteration: [0, γ(0)] (note: γ(0) is the monopoly quantity, (a − c)/2b)
Note that
γ^t(0) = −((a − c)/b) ∑_{j=1}^{t} (−1/2)^j.
This series is absolutely convergent; hence it is also convergent. Thus, the upper and lower
bounds converge to a single point. Since all other quantities are ruled out, the game
is dominance solvable.
So γ^∞(0) = (a − c)/(3b), as claimed. Q.E.D.
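The shrinking intervals in the proof can be traced numerically. A sketch with illustrative parameter values a = 10, b = 1, c = 1, so the dominance-solvable outcome is (a − c)/(3b) = 3:

```python
a, b, c = 10.0, 1.0, 1.0   # illustrative demand p = a - bQ and marginal cost c

def gamma(q):
    """Best response to the rival's quantity q (zero if the formula goes negative)."""
    return max(0.0, (a - c) / (2 * b) - q / 2)

# Iterate gamma starting from 0; the images give the interval endpoints in the proof.
q = 0.0
for _ in range(60):
    q = gamma(q)

print(abs(q - (a - c) / (3 * b)) < 1e-12)  # True
```

With N − 1 rivals the relevant argument is their total quantity, and for N ≥ 3 the first image already satisfies γ((N − 1)γ(0)) = 0, which is why the iteration stalls in the N-firm case discussed next.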
The lemma continues to apply, except now q−i refers to the total quantity of all other firms (q−i = ∑_{j≠i} qj). Hence, iterative domination proceeds as follows:
But (N − 1)γ(0) ≥ (a − c)/b (with equality if N = 3), so γ((N − 1)γ(0)) = 0. Consequently, nothing
happens after the first iteration. All dominance tells us is that each firm will produce
not more than the monopoly quantity.
Normal form: si = pi, Si = P. Let L denote the set of firms naming the lowest price (i s.t. pi ≤ pj for all j), and let ℓ ≡ |L|. Then
gi(s1, ..., sI) = si (Q(si)/ℓ) − ci(Q(si)/ℓ) if i ∈ L, and 0 otherwise.
Case 1: Begin by taking P to be a discrete price grid (prices are quoted in “pennies”):
pi ∈ {0, 1, 2, 3, ...}. Also assume c is an integer.
Second iteration: eliminate 1 (dominated by pi = c)
Conclusion: price can’t be lower than unit cost. No other prices are eliminated (when
pj = c, firm i earns zero profits for all pi ≥ c, so no price at or above marginal cost is
dominated).
p = 0 is dominated by p = c.
Is any other p > 0 dominated? No. Choose any p′, p″ > 0. Neither price is dominated by the other. Why? Consider p−i ∈ (0, min{p′, p″}). Then gi(p′, p−i) = gi(p″, p−i) = 0.
Note in particular that prices below cost are not eliminated by domination.
Motivating example: Bertrand competition with linear costs. Below-cost prices do not
seem plausible.
Object: name a number that is lower than the one named by your opponent.
Normal form:
Note: neither strategy is dominated.
Definition: si^k is a weakly dominated strategy for player i iff there exist ρm ≥ 0 for 1 ≤ m ≤ Ki with ∑_{m=1}^{Ki} ρm = 1 such that, for all s−i ∈ S−i,
∑_{m=1}^{Ki} ρm gi(si^m, s−i) ≥ gi(si^k, s−i),
with strict inequality for at least one s−i ∈ S−i.
Remark: Dominated strategies, as we have defined them, are sometimes called strictly
dominated strategies. The term “dominated,” when used by itself, will always denote
strict domination.
Example: MP-A
ht is (strictly) dominated by th; hh and tt are weakly dominated by th. Thus, all choices
for player 2 are dominated (either weakly or strictly), except th. Given that Player 2
will play th, Player 1 is indifferent between playing H and T .
Any p′ < c is weakly dominated by p″ = c. If p−i < p′, both p′ and p″ yield the same payoff (zero). But if p−i ≥ p′, p′ yields a strictly negative payoff, while p″ still yields zero.
p′ = c is weakly dominated by p″ slightly greater than c. The payoffs associated with p′ are zero for all p−i, while the payoffs associated with p″ are always non-negative, and strictly positive for p−i > p″.
p′ > pm (the monopoly price, which, for simplicity, we will assume to be unique) is weakly dominated by pm. If p−i < pm, both yield zero payoffs. If p−i ∈ [pm, p′), pm yields strictly positive payoffs, while p′ yields a zero payoff. If p−i ≥ p′, the payoff from pm strictly exceeds the payoff from p′.
Provided that Q(p)(p − c) is strictly increasing in p on [c, pm], no other p′ ∈ (c, pm] is weakly dominated. Choose any such p′. Consider any probability distribution over non-negative prices, with CDF F, that does not place all weight on p′. If F(p′) = 1, then, for any p−i > p′, randomizing according to F produces a strictly lower payoff than p′. If F(p′) < 1, then, for any p−i > p′, the payoff from this randomization scheme is bounded above by F(p−i)Q(p−i)(p−i − c), which is strictly less than Q(p′)(p′ − c) for p−i sufficiently close to p′. But the latter term is what firm i earns, for such p−i, when choosing p′.
p′ yields strictly higher profits if p−i ∈ (p′, p″). Consider p″ < p′; p′ yields strictly higher profits if p−i > p′.
I players (“bidders”).
Each has a valuation, vi , for some indivisible object that is being auctioned.
The highest bid wins the object. In case of ties, the winner is determined randomly (with
equal probability for all winning bids).
The winner pays the second highest bid, ps (in the event of a tie, this equals the winning
bid).
Proof: Let p−i denote the vector of bids by players other than i, and let p^m_{−i} denote the highest bid among these players. There are two cases to consider.
Case 1: p′ < vi.
(i) p^m_{−i} < p′. Then i's payoff is the same (vi − p^m_{−i}) regardless of whether it bids p′ or p″.
(ii) p^m_{−i} = p′. Then i's payoff is strictly higher when it bids p″ (vi − p^m_{−i} > 0) than when it bids p′ ((1/n)(vi − p^m_{−i}), where n is the number of winning bids).
(iii) p^m_{−i} ∈ (p′, p″). Then i's payoff is strictly higher when it bids p″ (vi − p^m_{−i} > 0) than when it bids p′ (zero).
(iv) p^m_{−i} ≥ p″. Then i's payoff is zero regardless of whether it bids p′ or p″.
Case 2: p′ > vi.
(i) p^m_{−i} < p″. Then i's payoff is the same (vi − p^m_{−i}) regardless of whether it bids p′ or p″.
(ii) p^m_{−i} = p″. Then i's payoff is zero regardless of whether it bids p′ or p″.
(iii) p^m_{−i} ∈ (p″, p′]. Then i's payoff is strictly negative when it bids p′, and zero when it bids p″.
(iv) p^m_{−i} > p′. Then i's payoff is zero regardless of whether it bids p′ or p″. Q.E.D.
Caveat: The theorem does not establish that nothing weakly dominates playing v. However, this is true. (As an exercise, prove this. In particular, show that, for any randomization over bids with CDF F, there is some p^m_{−i} such that bidding v is strictly better.)
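The case analysis can be spot-checked by brute force. The following sketch (with an illustrative valuation and bid grid of our choosing) compares a truthful bid against every alternative bid, for every possible highest opposing bid:

```python
def payoff(vi, bid, highest_other, n_tied=2):
    """Second-price auction payoff: win outright, tie, or lose against the top rival bid."""
    if bid > highest_other:
        return vi - highest_other            # win and pay the second-highest bid
    if bid == highest_other:
        return (vi - highest_other) / n_tied  # tie: win with equal probability
    return 0.0                                # lose

vi = 5.0
grid = [x / 2 for x in range(21)]             # bids in {0, 0.5, ..., 10}
truthful_ok = all(payoff(vi, vi, m) >= payoff(vi, b, m) for b in grid for m in grid)
print(truthful_ok)  # True
```

Truthful bidding is never beaten at any point of the grid, in line with the weak-dominance claim.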
I players (voters), I ≥ 3.
Players simultaneously cast votes for one of J policies: each names some j ∈ {1, ..., J}. Abstentions are not permitted. Si = {1, ..., J}
The policy with the most votes is adopted (in the event of ties, a policy is selected with
equal probabilities from the set of policies receiving the most votes)
Payoffs for i are vij when policy j is adopted. We assume that vij ≠ vik for all i and all j ≠ k.
Observation: Domination does not eliminate any strategies. Imagine that all other players
vote for the same policy. Then i’s vote cannot affect the outcome; therefore, i’s payoff
is the same for all si ∈ Si .
Theorem: In the simple voting problem, weak domination eliminates the possibility that i would vote for her least favorite policy j′(i) ≡ arg min_j vij. With I > 3, weak domination does not rule out any other strategy.
Proof: Fix any strategy for voters k ≠ i. Let π (a J-dimensional probability vector) denote the outcome if i does not vote.
Now imagine that i votes for j′(i). There are three possibilities: (a-i) no change in outcome, (a-ii) j′(i) is added to the set of possible outcomes, or (a-iii) the set of possible outcomes shrinks to j′(i). Possibility (a-i) yields π, and the other possibilities are strictly worse than π.
Next imagine that i votes for j″(i) ≡ arg max_j vij. There are three possibilities: (b-i) no change in outcome, (b-ii) j″(i) is added to the set of possible outcomes, or (b-iii) the set of possible outcomes shrinks to j″(i). Possibility (b-i) yields π, and the other possibilities are better than π.
Thus, regardless of what other voters have chosen, the outcome when i votes for j″(i) is at least as attractive to i as when i votes for j′(i). In cases where (a-ii), (a-iii), (b-ii), or (b-iii) occurs, voting for j″(i) is strictly better than voting for j′(i), so voting for j′(i) is weakly dominated.
Now we argue that weak domination does not rule out any other strategy when I > 3.
First assume that I is odd. Suppose that J∗−i(s−i) = {j′(i), k}, and that every j ≠ i votes either for j′(i) or for k. Then i is strictly better off voting for k than voting for anything else. Next suppose that I is even. Again assume that every j ≠ i votes either for j′(i) or for k, but in this case imagine that j′(i) receives one more vote than k. Once again, i is strictly better off voting for k than voting for anything else. Q.E.D.
Corollary: If players avoid weakly dominated strategies, and if J = 2, then the policy
preferred by the majority is victorious.
Alternatively:
2.4 Rationalizability
Definition: (i) A 1-rationalizable strategy for player i is an element of Si that is a best re-
sponse to some (independent) probability distribution over strategies for other players.
An example:
Remark: For two player games, rationalizable strategies are exactly what is left over after
iterative deletion of dominated strategies. This equivalence generalizes to more than
2 players if one does not insist on independence. If one does insist on independence,
the set of rationalizable strategies is smaller.
Definition: s∗ = (s∗1, ..., s∗I) ∈ S is a pure strategy Nash equilibrium if for all i and si ∈ Si, gi(s∗i, s∗−i) ≥ gi(si, s∗−i).
Theorem: A finite game of perfect information has a pure strategy Nash equilibrium.
Proof: To keep things simple, we’ll treat cases with no uncertainty - Nature makes no
choices. (As an exercise, try to generalize the proof to include cases with uncertainty.)
Define the length L of the game as the longest path (measured in number of branches,
following precedence) from a root node to a terminal node. The proof is by induction
on the length of the game.
Begin with L = 1. This is just a one-person decision problem with a finite number of choices. Every finite set of real numbers contains a maximal element; the corresponding choice is the equilibrium of the one-player game.
Now imagine that the theorem is true for all games of length L ≤ n. We argue that it
must also be true for games of length n + 1.
Consider a finite game of perfect information with length n + 1. This game begins with an initial decision by some player (without loss of generality, player 1). Let T denote the number of potential choices at the root node. Each of these choices leads to a successor node. We index these successor nodes t = 1, ..., T. One can think of each one of these T successor nodes as the root node for a (finite, perfect information) continuation game of length n or less. By hypothesis, there exists a pure strategy Nash equilibrium for
each of these continuation games. Construct an equilibrium for the n + 1 length game
as follows. Within each continuation game, pick a mapping from nodes to actions that
corresponds to some equilibrium of that continuation game. Let vi^t denote the payoff to i for the equilibrium associated with the continuation game emanating from node t. At the root node, let player 1 choose the action that leads to any node t∗ ∈ arg max_t v1^t (since the set of alternatives is finite, a maximal element exists).
The preceding is plainly a pure strategy Nash equilibrium for the full game. Any deviation
by a player i 6= 1 has the same effect on i’s payoff as a deviation in the continuation
game emanating from node t∗ . Since, by construction, the strategy profile constitutes
an equilibrium in this continuation game, the deviation does not benefit player i.
Now consider some deviation by player 1. For this deviation, imagine that 1's initial choice leads to node k ∈ {1, ..., T}. Let u denote 1's payoff from deviating. Since the strategy profile constitutes an equilibrium in the continuation game emanating from node k, we have u ≤ v1^k ≤ max_t v1^t, which is 1's payoff in the proposed equilibrium. Hence, the deviation is not in 1's interest. Q.E.D.
Remark: “Zermelo’s theorem” suggests an algorithm for finding equilibria: backward in-
duction. Procedure: at each penultimate node, identify an optimal choice for the
decision-maker. Treat the penultimate node as a terminal node, using the payoffs as-
sociated with the aforementioned optimal choice. Repeat until all decisions in the tree
are resolved.
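The backward-induction procedure translates directly into a short recursion; a sketch using a nested-tuple game representation of our own devising (an integer marks the player who moves, "T" marks a terminal node):

```python
def backward_induction(node):
    """Return the payoff vector reached by backward induction from `node`."""
    tag, content = node
    if tag == "T":
        return content                      # terminal node: payoffs as given
    # Resolve every subtree first, then let the mover pick her best continuation.
    values = [backward_induction(child) for child in content]
    return max(values, key=lambda payoffs: payoffs[tag])

# Player 0 moves at the root; player 1 moves after the left branch.
tree = (0, [(1, [("T", (3, 1)), ("T", (0, 2))]),
            ("T", (2, 0))])
print(backward_induction(tree))  # (2, 0)
```

At the left node player 1 would pick (0, 2), so player 0 prefers the right branch; anticipating the follower's choice is exactly what the recursion captures.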
Example #1: Chess. Implication of previous theorem: either white always wins, black
always wins, or it is always a draw. Solvable by backward induction.
Two players
Two piles of money, one for each player. Start with $1 in each pile.
Every time i says “continue,” $1 is taken from i’s pile, and $2 is placed into j’s pile.
Implication of backward induction: Say “stop” at every node. The game ends imme-
diately.
How do people actually play? Tend to say “continue” until there is a substantial amount
of money in both piles.
Theorem: Suppose S1, ..., SI are compact, convex Euclidean sets. Suppose that gi is continuous in s and quasiconcave in si. Then there exists a pure strategy Nash equilibrium.
Terminology:
X ⊂ R^N is compact if it is closed (contains all of its limit points) and bounded.
f : R^N → R is quasiconcave if, for all t ∈ R and x, y ∈ R^N such that f(x) ≥ t and f(y) ≥ t, f(αx + (1 − α)y) ≥ t for all α ∈ [0, 1].
Kakutani Fixed Point Theorem: Suppose that S is compact and convex. Suppose also that γ : S ⇒ S is a non-empty valued, convex valued, upper hemicontinuous correspondence. Then there exists s∗ ∈ S such that s∗ ∈ γ(s∗).
Proof of the Nash existence theorem: Define i’s best response correspondence, γ i : S−i ⇒
Si , as follows:
γ i (s−i ) = {si ∈ Si | si maximizes gi (·, s−i )}
Note that a Nash equilibrium is a fixed point of γ; that is, s ∈ S such that s ∈ γ(s).
Suppose the claim is false. Then there exist s^t_{−i} → s∗−i and s^t_i ∈ γi(s^t_{−i}) converging to s′i ∉ γi(s∗−i). It follows that there must exist s∗i ∈ Si with gi(s∗i, s∗−i) > gi(s′i, s∗−i).
By continuity of gi, gi(s^t_i, s^t_{−i}) converges to gi(s′i, s∗−i), and gi(s∗i, s^t_{−i}) converges to gi(s∗i, s∗−i). But then, for t sufficiently large, we must have gi(s^t_i, s^t_{−i}) < gi(s∗i, s^t_{−i}), which implies that s^t_i ∉ γi(s^t_{−i}), a contradiction.
Suppose that s′i, s″i ∈ γi(s−i). Plainly, gi(s′i, s−i) = gi(s″i, s−i) ≡ g∗. Since gi is quasiconcave in si, gi(αs′i + (1 − α)s″i, s−i) ≥ g∗ for α ∈ [0, 1]. The previous expression must hold with equality, or it would contradict s′i, s″i ∈ γi(s−i). But then αs′i + (1 − α)s″i ∈ γi(s−i), as claimed.
Structure: N ≥ 2 firms simultaneously select price. Customers purchase from the firm with
the lowest announced price, dividing equally in the event of ties. Quantity purchased
is given by a continuous, strictly decreasing function Q(P ).
Theorem: If ci (qi ) = cqi (identical linear costs), then there exists an equilibrium in which
all output is sold at the price p = c, and there does not exist an equilibrium in which
output is sold at any other price.
p ≡ min_i p∗i
The theorem asserts that p = c. Suppose that this is false. Then p > c (p < c is not possible
since firm 1 would be losing money, which it could avoid by deviating to p1 = c). There
are two cases to consider.
(i) p∗2 > p. Then firm 2 receives a payoff of 0. By deviating to any p2 ∈ (c, p), firm 2 would
earn a positive profit. Consequently, the initial configuration could not have been an
equilibrium.
(ii) p∗2 = p. Then firm 2 receives a payoff π2 ≤ (p − c)Q(p)/2. If instead firm 2 set p′2 ∈ (c, p), it would earn (p′2 − c)Q(p′2). Since Q is continuous, firm 2 can, by choosing p′2 close enough to (but smaller than) p, obtain a payoff arbitrarily close to (p − c)Q(p), which is greater than the payoff received in the proposed equilibrium.
We have ruled out the possibility that p 6= c. To conclude the proof, we must demonstrate
the existence of an equilibrium with p = c. Suppose that every firm sets pi = c. Then
every firm receives a payoff of zero. Deviating to a lower (below-cost) price clearly
cannot yield positive profits. Moreover, a firm cannot make any sales by setting a
higher (above-cost) price. Consequently, there is no profitable deviation. Q.E.D.
Henceforth, to keep things simple, order the firms so that c1 ≤ c2 ≤ ... ≤ cN , and assume
that Q(p)(p − c1 ) is increasing in p over the range [c1 , c2 + ε] for some ε > 0. Relaxing
this assumption is straightforward, but requires attention to some tedious details.
Theorem: Suppose that ci (qi ) = ci qi . If c1 < c2 , then no pure strategy Nash equilibrium
exists.
Proof: Define p as before. One rules out equilibria with p > c2 through essentially the
same argument as in the last theorem (either firm 1 or firm 2 must have an incentive
to undercut the other). If p = c2 and pi = c2 for some i > 1, firm 1 could profit by
setting a price slightly below p. If p = c2 and pi > c2 for all i > 1, firm 1 could profit
by raising its price slightly. If p < c2 , then firm 1 must be making all sales (otherwise
some other firm i could avoid losses by deviating to pi = ci ). This means that firm 1 is
quoting a price strictly below any other quoted price. But then firm 1 could increase
its profits by quoting a slightly higher price. Q.E.D.
Alternative approach #1: Treat customers as players in the game. When two or more firms quote the same price, customers are indifferent. As usual, we are free to resolve this indifference in a way that is consistent with equilibrium.
Theorem: Suppose that ci (qi ) = ci qi . Order the firms so that c1 ≤ c2 ≤ ... ≤ cN . With
alternative approach #1:
(i) there does not exist any equilibrium in which output is sold at a price less than c1 or
greater than c2 ,
(ii) if c1 = c2 , there exists an equilibrium in which all output is sold at the price p = c1 ,
and
(iii) if c1 < c2 , for all p ∈ [c1 , c2 ] there exists a Nash equilibrium in which all output is sold
at the price p. Moreover, in any equilibrium, firm 1 makes all of the sales.
(i) The argument closely parallels that given in the proof of the first theorem in this section.
Briefly: we can’t have p < c1 because some firm i could then avoid losing money by
setting pi = c1 . If p > c2 and firm 2 makes no sales, firm 2 could earn positive profits
by setting p2 ∈ (c2 , p). If p > c2 and firm 2 does make some sales, then firm 1 could
increase profits by setting a price slightly below (but sufficiently close to) p.
(ii) Suppose that p1 = p2 = c2, pi > c1 for i > 2, and all customers purchase the good from firm 1. Every firm receives a payoff of zero. Deviations to prices below c1 result in negative profits; deviations to prices above c1 yield zero profits; and deviations to c1 (by i > 2) yield non-positive profits (strictly negative when ci > c1 and i makes strictly positive sales).
(iii) Choose some p ∈ [c1 , c2 ]. Suppose that p1 = p2 = p, pi > p for i > 2, and all customers
purchase the good from firm 1. Firm 1 earns a non-negative profit, but cannot earn
a greater profit by raising price (as it would sell nothing) or by lowering price (as it
already serves the entire market, and as Q(p)(p − c1 ) is increasing in p over the range
[c1 , c2 ]). All other firms earn zero, but cannot sell positive quantity unless they set
pi < c2 , in which case they would incur losses.
To see that firm 1 must make all sales, note that, for p < c2 , any other firm i making
positive sales could avoid losses by setting pi = ci . For p = c2 , if any other firm made
positive sales, firm 1 could achieve a higher payoff by setting a price slightly below
(but sufficiently close to) p. Q.E.D.
Question: For the case of c1 < c2, are all of these equilibria equally plausible? Equilibria with p < c2 require some firm to set a price below cost. As we have seen, this is a weakly dominated strategy. We will return to this issue later.
Alternative approach #2: Assume that prices must be quoted in discrete units (“pen-
nies”); that is, the feasible price set is {0, 1, 2, ...}. Assume also that ci is a whole
number for each i. Finally, return to the assumption that customers split evenly be-
tween the lowest-price firms in the event of a tie.
The case of c1 = c2 : (i) For the same reasons as before, there is an equilibrium in which
p1 = p2 = c1 , and all other firms set higher prices.
(iii) One can show that all output must be sold at either a price of c1 or at a price of c1 + 1.
(Check: firm 1 would undercut any higher price.)
(iv) Notice that, as the price grid becomes increasingly fine, we obtain convergence to the same answer we obtained with the original model (when price is selected from R+), and with alternative approach #1.
The case of c1 < c2 . (i) For the same reasons as above, there is an equilibrium in which
p1 ∈ {c1 + 1, c1 + 2, ..., c2 }, p2 = p1 + 1, and all other firms charge higher prices.
(ii) One can show that there does not exist an equilibrium for which output is sold at any
other price.
(iii) One can also show that firm 1 must sell all output.
(iv) Notice that, as the price grid becomes increasingly fine, we obtain convergence to
the same answer we obtained with alternative approach #1. In particular, the prices
charged by firms 1 and 2 become identical in the limit, but all customers purchase the
good from firm 1.
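These grid-pricing claims are easy to confirm by enumerating best responses directly. A sketch with illustrative numbers of our choosing (two firms, demand Q(p) = 20 − p, c1 = 2, c2 = 5):

```python
def profit(p_own, p_other, cost):
    """One firm's profit under the penny-grid Bertrand rules (ties split the market)."""
    if p_own > p_other:
        return 0.0
    share = 0.5 if p_own == p_other else 1.0
    return (p_own - cost) * max(0, 20 - p_own) * share

def is_equilibrium(p1, p2, c1, c2, grid):
    no_dev1 = all(profit(p1, p2, c1) >= profit(q, p2, c1) for q in grid)
    no_dev2 = all(profit(p2, p1, c2) >= profit(q, p1, c2) for q in grid)
    return no_dev1 and no_dev2

grid = range(21)
equilibria = [(p1, p2) for p1 in grid for p2 in grid if is_equilibrium(p1, p2, 2, 5, grid)]
print(equilibria)  # [(3, 4), (4, 5), (5, 6)]
```

Consistent with claims (i)-(iii): firm 1 sells at a price in {c1 + 1, ..., c2} with firm 2 one penny above, and firm 1 makes all sales.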
Some observations concerning Bertrand competition (in a setting with one-shot, si-
multaneous choices and homogeneous products):
3. The effects of production costs. Suppose N = 2, and that marginal costs are c1 and c2, with c1 < c2. Then changing c1 (while it remains below c2) has no effect on price.
Competitive Analysis:
Herfindahl-Hirschman Index: H = 10,000 × ∑_{i=1}^{N} αi^2, where αi denotes the market share of firm i.
Lerner Index: Li = (pi − c′i(qi))/pi (represents firm i's markup over marginal cost, measured as a percent of price), with the industry aggregate L = ∑_{i=1}^{N} αi Li.
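Both indices are mechanical to compute; a sketch with illustrative numbers:

```python
def hhi(shares):
    """Herfindahl-Hirschman Index: 10,000 times the sum of squared market shares."""
    return 10_000 * sum(s ** 2 for s in shares)

def industry_lerner(shares, prices, marginal_costs):
    """Share-weighted Lerner index: sum over firms of alpha_i * (p_i - mc_i) / p_i."""
    return sum(a * (p - mc) / p for a, p, mc in zip(shares, prices, marginal_costs))

# Symmetric duopoly selling at marginal cost (the Bertrand outcome with c1 = c2 = 10):
print(hhi([0.5, 0.5]))                                  # 5000.0
print(industry_lerner([0.5, 0.5], [10, 10], [10, 10]))  # 0.0
```

In the thought experiment that follows, reducing c1 moves the shares to (1, 0), so HHI jumps to 10,000 even though welfare rises.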
Thought experiment: Start with c1 = c2 . Reduce c1 . What happens to the HHI, to the
LI, and to welfare?
After the change, HHI = 10,000, and LI > 0 (its magnitude depending on which equilibrium price is selected).
Concentration rises, and markups rise. That sounds bad, but in fact social welfare rises.
Remark: Price competition seems like the right model, but yields various implausible pre-
dictions. As we proceed, we will be relaxing three assumptions that are essential for
producing the preceding results:
1. Non-spatial models
q1 = Q(p1 , p2 )
q2 = Q(p2 , p1 )
First order condition: Q1 (p1 , p2 )(p1 − c) + Q(p1 , p2 ) = 0
Implication: p1 − c = −Q(p1, p2)/Q1(p1, p2) > 0 (provided firm 1 produces strictly positive quantity). In contrast to the case with homogeneous products, firms earn strictly positive profits.
If the second-derivative terms are small (e.g. linear demand implies Q11 = Q12 = 0), we have
0 < dp1/dp2 < 1/2
Graphically:
Observation: best response functions are upward sloping — a case of strategic complements.
From the perspective of the firms, equilibrium is locally inefficient; one could increase profits
for both firms by raising both prices.
Two firms
Firms are endowed with product characteristics: a for firm 1, 1 − b for firm 2, with a < 1 − b
Payoff for a type θ consumer: 0 if no purchase, v − p − t(x − θ)2 if purchases a type x good
at price p.
For simplicity, we will assume that v is very large so that all customers buy something, and so that θ̂ is well-defined.
Q1(p1, p2) = θ̂(p1, p2, a, b, t)
Q2(p1, p2) = 1 − θ̂(p1, p2, a, b, t)
where θ̂(p1, p2, a, b, t) is given by the solution of
p1 + t(a − θ̂)^2 = p2 + t(1 − b − θ̂)^2
Solving, we get:
θ̂ = (p2 − p1)/(2t(1 − a − b)) + (1 + a − b)/2
It is easy to check that these demand functions satisfy the assumptions used for the non-
spatial model, above; moreover, the second partials are zero. Consequently, we have a
standard price setting problem with strategic complementarities (upward sloping best
response functions).
p1 = c + (1/2)(p2 − c) + (t/2)(1 + a − b)(1 − a − b)
Notice: (i) strategic complements, (ii) p1 > c, (iii) price rises with c, (iv) price rises with t
(markets are more “insulated”)
Combining this with the corresponding equation for firm 2 and solving simultaneously
yields:
p∗1 = c + t(1 − a − b)(1 + (a − b)/3)
p∗2 = c + t(1 − a − b)(1 + (b − a)/3)
Notice: (i) equilibrium prices are strictly greater than costs, (ii) equilibrium prices rise with
costs and with transportation costs, (iii) for a = b, equilibrium prices decline with a
(they fall as the firms move closer together).
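One can confirm the equilibrium formulas numerically by iterating the best-response equations (a sketch; the parameter values are illustrative):

```python
# Illustrative parameters: locations a and 1 - b, transport cost t, marginal cost c.
a, b, t, c = 0.2, 0.3, 1.0, 1.0

def best_response(p_other, loc_term):
    # p_i = c + (p_j - c)/2 + (t/2) * loc_term * (1 - a - b), per the first-order condition
    return c + (p_other - c) / 2 + (t / 2) * loc_term * (1 - a - b)

p1 = p2 = c
for _ in range(100):   # the map is a contraction with modulus 1/2, so it converges fast
    p1, p2 = best_response(p2, 1 + a - b), best_response(p1, 1 + b - a)

p1_star = c + t * (1 - a - b) * (1 + (a - b) / 3)
p2_star = c + t * (1 - a - b) * (1 + (b - a) / 3)
print(abs(p1 - p1_star) < 1e-9, abs(p2 - p2_star) < 1e-9)  # True True
```

The iteration also illustrates strategic complementarity: each firm's price moves up as the other's does, halving the distance to the fixed point each round.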
Firm i ∈ {L, H} produces a good that yields quality vi, with vH > vL.
Each consumer purchases either one unit of one of these goods, or nothing.
Consumers are characterized by a preference parameter, θ, which indicates the value at-
tached to quality. If a consumer of type θ purchases a good of type i, her utility is
given by
u(θ, i) = θvi − pi
θ̂ satisfies:
θ̂ vH − pH = θ̂ vL − pL,
so
θ̂ = (pH − pL)/(vH − vL)
Thus, assuming a total population normalized to unity (with types distributed uniformly on [θ̲, θ̄]), demands are given as follows:
QH(pL, pH) = [θ̄ − (pH − pL)/(vH − vL)] [θ̄ − θ̲]^{−1}
QL(pL, pH) = [(pH − pL)/(vH − vL) − θ̲] [θ̄ − θ̲]^{−1}
Notice that this is a completely standard case of Bertrand price competition with differenti-
ated products and linear demands. It will therefore exhibit strategic complementarities
(upward sloping best response functions). Solving for the equilibrium is completely
standard, and analogous to the case of horizontal differentiation.
Theorem: Assume that θ̄ < 2θ̲. There exists an equilibrium in which firm H makes all of the sales and earns strictly positive profits.
Remark: This result contrasts sharply with the case of horizontal differentiation.
p∗L = c
Now we check to see whether this is an equilibrium. Firm L cannot do better by lowering
price (since this would require pL < c), or by raising price (since it would sell nothing).
Firm H cannot benefit from reducing price (since quantity would be unaffected). We
need to check whether firm H benefits from raising price.
For any pH ≥ p*H, firm H will make sales to all customers with

vLθ − c ≤ vHθ − pH,

or

θ ≥ θ* ≡ (pH − c)/(vH − vL)
2.6.3 Cournot (quantity) competition
N ≥ 2 firms
The quantity Q is taken to market, where it is auctioned off at the market clearing price,
P (Q)
Payoffs: gi(q) = P(qi + Σ_{j≠i} qj) qi − ci(qi)
Existence:
Si = ℝ₊. Convex, closed, but not bounded. This does not pose a problem in practice. Analytic procedure: impose an artificial bound at some extremely large quantity, find an equilibrium, and show that it remains an equilibrium when one removes the bound.
To apply our earlier theorem, need gi (q) to be continuous and quasiconcave in qi . Since
P and ci are usually assumed to be continuous, there is usually no problem with
continuity. However, we don’t necessarily have quasiconcavity.
Sufficient conditions for quasiconcavity: (i) ci is convex, and (ii) P is concave. Condition (i)
rules out increasing returns to scale, which is not surprising, since it creates difficulties
with existence in other settings as well. Condition (ii) is not a conventional property
of demand functions; it is satisfied for linear functions, but not for isoelastic ones.
For what follows, we will assume that a pure strategy Nash equilibrium exists.
Firm i solves:

max_{qi} P(qi + Q−i)qi − ci(qi)

−1 < dqi/dQ−i < 0

Note that the best response function slopes downwards. This is a case of strategic substitutes.
Properties of Nash equilibrium:
Graphical depiction:
Parametric example: ci(qi) = cqi, p = a − bQ. We have previously solved for the best response function:

γi(Q−i) = (a − c)/(2b) − Q−i/2

For a symmetric equilibrium, this implies

q* = (a − c)/(2b) − (N − 1)q*/2,

or

q* = (a − c)/(b(N + 1))

p* = c + (a − c)/(N + 1)
Some remarks:
1. From the perspective of the firms, the equilibrium is locally inefficient: total quantity exceeds the monopoly quantity.
2. Comparative statics differ from Bertrand because of strategic substitutability (vs. strate-
gic complementarity). Example: assume constant unit costs, c1 and c2 , and raise c2 .
Cournot: best response for firm 2 shifts down. Result: firm 2’s quantity falls (2
becomes less aggressive), while firm 1’s quantity rises (1 becomes more aggressive).
Bertrand: best response for firm 2 shifts up. Result: firm 2’s price rises (2 becomes
less aggressive), and firm 1’s price rises (1 becomes less aggressive as well).
3. Price always exceeds marginal cost. To see this, rewrite the first order condition:

p − c′i(qi) = −P′(Q)qi > 0
4. Achieve perfect competition in the limit as the number of firms grows. To illustrate,
assume symmetry. The first-order condition becomes:
P(Q*) = c′(Q*/N) − (1/N)P′(Q*)Q*
As long as we don’t change demand as we add firms, Q∗ and P 0 (Q∗ ) should remain
bounded, which means that price converges to marginal cost. For Cournot, adding
firms therefore moves the outcome smoothly from the monopoly case to the competitive
case. One can also see this directly in the parametric case considered above.
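The convergence claim is easy to verify in the linear parametric case. A small sketch (the parameter values are illustrative, not from the notes):

```python
# Symmetric linear Cournot: q* = (a-c)/(b(N+1)), p* = c + (a-c)/(N+1)
a, b, c = 10.0, 1.0, 2.0   # illustrative demand and cost parameters

def q_star(N):
    return (a - c) / (b * (N + 1))

def p_star(N):
    return c + (a - c) / (N + 1)

# N = 1 reproduces the monopoly price (a + c)/2
assert abs(p_star(1) - (a + c) / 2) < 1e-12

# each firm's first-order condition P(Q) + P'(Q)q - c = 0 holds at q*
N = 5
q = q_star(N)
assert abs((a - b * N * q) - b * q - c) < 1e-12

# price declines monotonically toward marginal cost as N grows
prices = [p_star(N) for N in (1, 2, 10, 100, 10000)]
assert all(x > y for x, y in zip(prices, prices[1:]))
assert abs(p_star(10000) - c) < 1e-2
```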
5. In contrast to Bertrand, the more efficient firm need not produce all of the output. The first order conditions imply that, for any firms i and j,

P(Q*) + P′(Q*)qi* − c′i(qi*) = P(Q*) + P′(Q*)qj* − c′j(qj*) = 0

Thus, if firms exhibit non-increasing returns to scale, and if c′i(x) < c′j(x) for all x, then (i) qi* > qj* (the more efficient firm produces more), (ii) c′i(qi*) < c′j(qj*) (the equilibrium violates productive efficiency, in the sense that the lower cost firm produces at lower cost on the margin), and (iii) p − c′i(qi*) > p − c′j(qj*) (the more efficient firm has a larger markup).
Competitive Analysis:
Using the first order condition and the definition of demand elasticity, ε, we have:

Li ≡ (p − c′i(qi))/p = −P′(Q)qi/p = αi/ε

where αi is firm i's market share.
Using this formula and the definition of the industry Lerner index, we have:

L = Σ_{i=1}^N αi Li = (Σ_{i=1}^N αi²)/ε = H/ε
Notice: (i) The equilibrium markup is positively related to concentration, (ii) The equilib-
rium markup is negatively related to demand elasticity.
Example: Three firms, constant returns to scale, costs ci for firm i, constant elasticity of demand. Initial configuration: c1 = c2 = c3, H = 1/3.
Exercise #1: Remove one firm. Then H rises, L rises, p rises, and welfare falls. This is a
“bad” increase in concentration.
Exercise #2: Reduce c1. We know that firm 1 ends up with a market share λ > 1/3, while the other firms split the rest of the market. Thus,

H = λ² + 2((1 − λ)/2)² = (3λ² − 2λ + 1)/2

This reaches a global minimum at λ = 1/3. Therefore, when firm 1's costs fall, H rises, L rises, p falls, and welfare rises. This is a "good" increase in concentration.
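A quick numeric check of the Herfindahl algebra above (my own sketch):

```python
# Three firms: firm 1 has share λ, the other two split (1 − λ) equally
def H(lam):
    return lam ** 2 + 2 * ((1 - lam) / 2) ** 2

# algebraic identity H(λ) = (3λ² − 2λ + 1)/2
for lam in (0.0, 0.2, 1/3, 0.5, 0.9, 1.0):
    assert abs(H(lam) - (3 * lam**2 - 2 * lam + 1) / 2) < 1e-12

# global minimum at equal shares λ = 1/3, where H = 1/3
grid = [i / 1000 for i in range(1001)]
lam_min = min(grid, key=H)
assert abs(lam_min - 1/3) < 1e-3
assert abs(H(1/3) - 1/3) < 1e-12

# any move of λ away from 1/3 (cost fall or cost rise for firm 1) raises H
assert H(0.5) > H(1/3) and H(0.2) > H(1/3)
```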
Exercise #3: Increase c1 . Same reasoning, but different conclusion: H rises, L rises, p rises,
and welfare falls. So the relationship between concentration and either price or welfare
isn’t stable, even when it is driven only by cost changes.
Lesson: a welfare evaluation of a change in concentration depends on the factors that caused
it.
2.6.4 Public Goods
Definition: A public good has two characteristics: (i) it displays non-rivalry in consump-
tion, and (ii) it is non-excludable.
A simple model:
N consumers
The public good is produced from the private good through a convex technology, for which
the cost of producing g in terms of x is given by the convex function c(g)
A feasible allocation consists of a vector (g, x1, x2, ..., xN) such that Z = Σ_{i=1}^N xi + c(g)
Imagine that the planner has a Samuelson-Bergson social welfare function of the form

W(u¹, ..., uᴺ) = Σ_{i=1}^N αi uⁱ
i=1
Maximize this function over allocations subject to the feasibility constraint by setting up the Lagrangian and differentiating (assuming an interior solution):

αi u_x^i(g, xi) = λ

Σ_{i=1}^N αi u_g^i(g, xi) = λc′(g)
Using the first expression to solve for αi = λ/u_x^i and substituting this into the second expression yields:

Σ_{i=1}^N u_g^i(g, xi)/u_x^i(g, xi) = c′(g),
which we can rewrite in a more familiar form (the Samuelson condition):

Σ_{i=1}^N MRS^i_{g,x} = MRT_{g,x}
A non-cooperative equilibrium: each consumer i chooses a contribution t to solve

max_t u^i(c⁻¹(T*₋ᵢ + t), zi − t)

The first order condition implies

u_g^i(g*, xi*)/u_x^i(g*, xi*) = c′(g*)

or

MRS^i_{g,x} = MRT_{g,x}
Notice that this diverges from the Samuelson conditions for optimal provision.
Structure of government funding for the public good: a tax mi is levied on each consumer
PN
i, and all revenues, M = i=1 mi , are contributed to the public good. Private
contributions are allowed. The private contribution game is as described above. When
consumers make their decisions concerning contributions, they regard taxes and the
government contribution as predetermined.
Claim: provided that the equilibrium is interior, a change in government provision does not
change either aggregate provision of the public good or private consumption by any
individual. In other words, public contributions to the public good fully crowd out
private contributions.
Each consumer i chooses ti to maximize

u^i(c⁻¹(T₋ᵢ + ti + M), zi − mi − ti)

Thus, an interior equilibrium, (x1*, ..., xN*), satisfies the following first order condition for each i:

u_g^i(c⁻¹(Z − Σ_{i=1}^N xi*), xi*) = c′(c⁻¹(Z − Σ_{i=1}^N xi*)) · u_x^i(c⁻¹(Z − Σ_{i=1}^N xi*), xi*)
Observation: the preceding argument remains true even when Σ_{i=1}^N mi = 0 and mi < 0 for some i. This is a case of pure redistribution. Consequently, as long as everyone is making voluntary contributions to the public good, redistributional policies affect neither the level of the public good nor the private consumption of any individual.
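The full crowd-out claim can be illustrated numerically. The functional forms below (log utilities and a unit-cost technology) are my own assumptions for the sketch, not the general model in the notes:

```python
# Each consumer i picks a private contribution t to maximize
#   ln(g) + ln(x_i),  where g = (others' contributions) + M + t,
#   x_i = z_i - m_i - t, and taxes m_i fund the government contribution M.
def equilibrium(z, m, iters=2000):
    N = len(z)
    M = sum(m)                        # government contribution
    t = [0.1] * N                     # starting guess for private contributions
    for _ in range(iters):            # iterate best responses (Gauss-Seidel)
        for i in range(N):
            others = sum(t) - t[i] + M
            # FOC of ln(others + t) + ln(z_i - m_i - t)
            t[i] = max(0.0, ((z[i] - m[i]) - others) / 2)
    g = sum(t) + M
    x = [z[i] - m[i] - t[i] for i in range(N)]
    return g, x

z = [10.0, 10.0, 10.0]
g0, x0 = equilibrium(z, [0.0, 0.0, 0.0])      # no taxes
g1, x1 = equilibrium(z, [1.0, 0.5, 0.0])      # taxes fund public provision
assert abs(g0 - g1) < 1e-6                    # aggregate provision unchanged
assert all(abs(u - w) < 1e-6 for u, w in zip(x0, x1))  # private consumption unchanged
```

As long as each consumer's equilibrium contribution remains strictly positive, shifting provision from private contributions to tax-financed government contributions leaves both g and every xi unchanged, exactly as the claim states.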
Qualification: changes in taxes and transfers may have real effects on the equilibrium allocation to the extent they alter the set of consumers making strictly positive transfers to the public good.
2.6.5 Voting
I players (voters), I ≥ 3.
Players simultaneously cast votes: each names a policy j. Abstentions are not permitted. Si = {1, ..., J}

The policy with the most votes is adopted (in the event of ties, a policy is selected with equal probabilities from the set of policies receiving the most votes)

Payoffs for i are vij when policy j is adopted. We assume that vij ≠ vik for all i and all j ≠ k.
Claim: For any policy j ∈ {1, ..., J}, there is a pure strategy Nash equilibrium in which j
is adopted.
Demonstration: Pick any j. Let si = j for all i. No player can change the outcome by
deviating.
Note: This result remains valid even when all players agree that policy j is the worst
possible outcome. It holds even when J = 2 (the case for which weak dominance
selects the majority-preferred outcome).
Example: MP-B
Proposed solution: Player 1 and player 2 both flip their coins. Expected payoffs for player
1:
For H: (1/2)(2) + (1/2)(−2) = 0

For T: (1/2)(−2) + (1/2)(2) = 0
Consequently, provided that player 2 flips his coin, player 1 can’t improve upon flipping
her coin. Conversely, provided that player 1 flips her coin, player 2 can’t improve upon
flipping his coin.
2.7.2 Definitions
Approach #1: Randomizations over (normal form) pure strategies (“instructions”). These
randomizations are known as mixed strategies.
Example: MP-A. Player 2's strategy set is {s1, s2, s3, s4} = {hh, ht, th, tt}. A mixed strategy δi is a four-dimensional vector (δi1, δi2, δi3, δi4), where Σ_{k=1}^4 δik = 1, and where δik ∈ [0, 1] denotes the probability of playing sk. Note that there are three degrees of freedom for choosing a mixed strategy in this game.
∆ ≡ ∆1 × ... × ∆I
Define πi(δ) to be the expected payoff to player i when choices are governed by the probability vectors (δ1, ..., δI).
Definition: A mixed strategy Nash equilibrium of the game ({Si }i∈1,...,I , g) is a pure strategy
equilibrium of the game ({∆i }i∈1,...,I , π).
Approach #2: Randomizations over actions at each information set in the extensive form.
These randomizations are known as behavior strategies.
Example: MP-A. Player 1 chooses from the set {H, T } at only one information set. There-
fore, mixed strategies and behavior strategies are equivalent. Player 2 chooses from
the set {h, t} at two information sets, which we can identify with player 1’s preceding
action, H or T . A behavior strategy for player 2 consists of a randomization over
the set {h, t} for the information set associated with H, call it δ2H = (δ2Hh , δ 2Ht )
(where δ 2Hk denotes the probability of playing k ∈ {h, t} having observed that player
1 chose H), and another randomization over the set {h, t} for the information set asso-
ciated with T , call it δ 2T = (δ 2T h , δ 2T t ) (where δ 2T k denotes the probability of playing
k ∈ {h, t} having observed that player 1 chose T ). Notice that there are only two
degrees of freedom in choosing a behavior strategy for this game (since δ 2Kh + δ 2Kt = 1
for K ∈ {H, T }). In contrast, a mixed strategy for player 2 consists of a randomiza-
tion δ 2 = (δ 2hh , δ 2ht , δ 2th , δ 2tt ) over the pure strategies hh, ht, th, and tt. Notice that
there are three degrees of freedom in choosing a mixed strategy for this game (since
δ 2hh + δ 2ht + δ 2th + δ 2tt = 1).
Kuhn’s Theorem: Consider any game of perfect recall. For any behavior strategy of player
i, there is a mixed strategy for i that yields the same distribution over outcomes for
any strategies, mixed or behavioral, chosen by other players. Moreover, for any mixed
strategy of player i, there is a behavior strategy for i that yields the same distribution
over outcomes for any strategies, mixed or behavioral, chosen by other players. In this
sense, mixed strategies and behavior strategies are equivalent.
Example: MP-A. Suppose that player 1 chooses with probabilities (δ 1H , δ 1T ) (this can be
viewed as either a mixed or behavior strategy). If player 2 follows a mixed strategy
(δ 2hh , δ 2ht , δ 2th , δ 2tt ), the probability distribution over outcomes is as follows:
Outcome Probability
H, h δ 1H (δ 2hh + δ 2ht )
H, t δ 1H (δ 2th + δ 2tt )
T, h δ 1T (δ 2hh + δ 2th )
T, t δ 1T (δ 2ht + δ 2tt )
If instead player 2 follows a behavior strategy (δ 2Hh , δ 2Ht , δ 2T h , δ 2T t ), the probability distri-
bution over outcomes is as follows:
Outcome Probability
H, h δ 1H δ 2Hh
H, t δ 1H δ 2Ht
T, h δ 1T δ 2T h
T, t δ 1T δ 2T t
Beginning with any mixed strategy, one can plainly select a feasible behavior strategy that
achieves the same probability distribution over outcomes (use δ 2Hh = δ 2hh + δ2ht and
δ 2T h = δ 2hh +δ 2th ). Beginning with any behavior strategy one can also select a feasible
mixed strategy that achieves the same probability distribution over outcomes. To see
this, begin by selecting some arbitrary value δ ∗ ∈ [0, 1] for δ 2hh . To equate all of the
terms in the preceding tables, we must then have δ 2ht = δ 2Hh −δ ∗ , δ2th = δ 2T h −δ ∗ , and
δ 2tt = 1 − δ 2T h − δ 2Hh + δ ∗ . This is a feasible mixed strategy provided that all three
of these implied probabilities are non-negative. This requires δ ∗ ≤ min{δ 2Hh , δ2T h }
and δ ∗ ≥ δ 2T h + δ2Hh − 1. But since δ 2T h + δ 2Hh − 1 ≤ min{δ 2Hh , δ 2T h }, it is always
possible to select some δ∗ ∈ [0, 1] that satisfies these conditions.
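The two outcome tables can be checked mechanically. A small sketch of the equivalence (the particular probability values are arbitrary illustrations):

```python
# Outcome distributions in MP-A under mixed vs. behavior strategies for player 2
def dist_mixed(d1H, mix):
    d_hh, d_ht, d_th, d_tt = mix   # probabilities on pure strategies hh, ht, th, tt
    return {('H', 'h'): d1H * (d_hh + d_ht), ('H', 't'): d1H * (d_th + d_tt),
            ('T', 'h'): (1 - d1H) * (d_hh + d_th), ('T', 't'): (1 - d1H) * (d_ht + d_tt)}

def dist_behavior(d1H, dHh, dTh):   # dHh = P(h | H observed), dTh = P(h | T observed)
    return {('H', 'h'): d1H * dHh, ('H', 't'): d1H * (1 - dHh),
            ('T', 'h'): (1 - d1H) * dTh, ('T', 't'): (1 - d1H) * (1 - dTh)}

# mixed -> behavior: dHh = d_hh + d_ht, dTh = d_hh + d_th
mix = (0.1, 0.3, 0.4, 0.2)
m = dist_mixed(0.6, mix)
bhv = dist_behavior(0.6, mix[0] + mix[1], mix[0] + mix[2])
assert all(abs(m[k] - bhv[k]) < 1e-12 for k in m)

# behavior -> mixed: pick any feasible d_hh = δ* and back out the rest
dHh, dTh = 0.4, 0.5
dstar = 0.5 * min(dHh, dTh)        # any δ* in the feasible range works
mix2 = (dstar, dHh - dstar, dTh - dstar, 1 - dHh - dTh + dstar)
assert all(p >= 0 for p in mix2) and abs(sum(mix2) - 1) < 1e-12
assert all(abs(dist_mixed(0.6, mix2)[k] - dist_behavior(0.6, dHh, dTh)[k]) < 1e-12
           for k in m)
```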
2.7.3 Existence
Consequently, we can apply the existence theorem for pure strategy Nash equilibria.
Q.E.D.
Helpful observation: In any mixed strategy equilibrium, a player must be indifferent between all pure strategies to which he attaches strictly positive probability. To see why, inspect the previous formula.
Illustration with MP-B: If player 1 places positive probability on both choices, then player 2 randomizes to make player 1 indifferent:

2δ2h − 2δ2t = −2δ2h + 2δ2t

So

2δ2h − 2(1 − δ2h) = −2δ2h + 2(1 − δ2h)

Solving, we obtain δ2h = 1/2 = δ2t. The game is symmetric, so a similar argument implies δ1H = 1/2 = δ1T.
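The indifference logic can be verified directly. A minimal sketch using the ±2 payoffs of MP-B:

```python
# Player 1's expected payoff in MP-B when player 2 plays h with probability d2h
def payoff1(choice, d2h):
    if choice == 'H':
        return 2 * d2h - 2 * (1 - d2h)
    return -2 * d2h + 2 * (1 - d2h)

# δ_2h = 1/2 makes player 1 exactly indifferent (both payoffs zero)
assert payoff1('H', 0.5) == payoff1('T', 0.5) == 0.0

# any other δ_2h gives player 1 a strict best response, so player 1 would not
# be willing to mix
for d in (0.1, 0.4, 0.6, 0.9):
    assert payoff1('H', d) != payoff1('T', d)
```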
The highest bidder wins (in the event of a tie, the winner is selected at random from the
highest bidders)
Bidders pay their bids to the auctioneer regardless of whether they win.
Payoffs: v − bi if player i wins the object, and −bi if i does not win
Equilibrium analysis:
There do not exist any pure strategy Nash equilibria (easy to check)
Consider symmetric mixed strategy equilibria. Let F(x) denote the CDF for the bid. Indifference across all bids x in the support requires

F(x)^{I−1}(v − x) − (1 − F(x)^{I−1})x = C

Since arbitrarily small bids win with probability close to zero and cost almost nothing, C = 0, so

F(x) = (x/v)^{1/(I−1)}

From this, it follows that:

E(x) = ∫₀^v x dF(x) = v/I

Consequently, for total expected revenue, we have IE(x) = v.
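A Monte Carlo check (the sampling scheme is my construction): under the indifference condition with C = 0, the equilibrium bid CDF is F(x) = (x/v)^{1/(I−1)}, bids can be sampled by inverse transform, and total expected revenue should equal v.

```python
import random
random.seed(0)

I, v, n = 3, 10.0, 200_000

# inverse-transform sampling: if U ~ Uniform[0,1], then x = v·U^(I-1)
# has CDF P(x ≤ t) = (t/v)^(1/(I-1)) = F(t)
bids = [v * random.random() ** (I - 1) for _ in range(n)]
mean_bid = sum(bids) / n

assert abs(mean_bid - v / I) < 0.1     # E[bid] ≈ v/I
assert abs(I * mean_bid - v) < 0.3     # total expected revenue ≈ v

# every bid in the support earns zero net payoff: F(x)^(I-1)·v − x = 0
for x in (1.0, 5.0, 9.0):
    Fx = (x / v) ** (1 / (I - 1))
    assert abs(Fx ** (I - 1) * v - x) < 1e-9
```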
Cost structure:

ci(qi) = 0 if qi ≤ K, and ci(qi) = ∞ if qi > K
Single buyer with reservation value v.
Demand structure:

Q(p) = Q if p ≤ v, and Q(p) = 0 if p > v
The firms compete by simultaneously announcing prices. The buyer makes purchases (if
any) from the firm quoting the lowest price, and splits purchases equally between the
two firms in the event of a tie. If the low-price firm has insufficient capacity to satisfy
demand, the buyer purchases the residual from the high-price firm.
Assume: (i) 2K > Q (otherwise p = v is a pure strategy Nash equilibrium), and (ii) K < Q
(otherwise p = 0 is a Nash equilibrium).
Payoffs:

gi(pi, pj) = 0 if pi > v
gi(pi, pj) = (Q − K)pi if v ≥ pi > pj
gi(pi, pj) = Qpi/2 if v ≥ pi = pj
gi(pi, pj) = Kpi if v ≥ pi < pj
We argue as follows:
(i) We can rule out any strategy profile with p1 6= p2 in the standard way, as the firm with
the lower price could increase profits by slightly raising price.
(ii) We can rule out any strategy profile with p1 = p2 > 0 in the standard way, as either
firm could increase profits by slightly undercutting the other.
(iii) We can rule out any strategy profile with p1 = p2 = 0, on the grounds that either firm
could earn positive (and therefore higher) profits by setting p = v, in which case its
payoff would be (Q − K)v > 0.
We will look for symmetric equilibria, wherein each firm chooses some probability distribution on [0, v]. We will use F to denote the CDF for this distribution.

We claim that any symmetric equilibrium mixed strategy (F) must have the following two properties: (1) F has no probability atoms, and (2) F has no "flat" portions, except possibly an initial segment starting at zero.
The payoff from playing any price p (given that the other firm's choices are dictated by F) is given by

πi(p, F) = [lim_{q↑p} F(q)]·(Q − K)p + [F(p) − lim_{q↑p} F(q)]·(Qp/2) + [1 − F(p)]·Kp
Suppose there is a probability atom at p0 > 0. Compare the payoff to a firm from playing
p0 to the payoff from playing p0 − ε for some small ε. This creates a shift in probability
weight from the second term to the third term in the preceding expression. The
magnitude of this shift remains bounded away from zero even when ε is arbitrarily
small. Since the payoff associated with the third term is higher, expected payoff must
rise for sufficiently small ε, even though price is lower (the impact of the latter effect
becomes small on the same order as ε). There is also a shift in probability weight from
the first term to the second, but this reinforces the increase in payoff.
Next suppose there is a probability atom at p0 = 0. Compare the payoff to a firm from
playing p0 to the payoff from any positive p ≤ v; the former profit is zero, while the
latter is strictly positive.
One implication of property #1: The probability of a tie is zero. This allows us to simplify the expression for payoffs:

πi(p, F) = F(p)(Q − K)p + (1 − F(p))Kp
Significance: there are no “flat” portions of F , except for an initial segment starting at
zero.
Suppose on the contrary that there is some interval [b, d] such that F(p) > 0 and F′(p) = 0 for p ∈ [b, d]. Then over this range,

∂πi(p, F)/∂p = K + F(p)(Q − 2K) ≥ K + (Q − 2K) = Q − K > 0
Consequently, by playing p = d, the firm would achieve strictly higher profits than by playing a price in supp(F) sufficiently close to b. We know that the bottom of the support, a, is strictly positive because of the indifference condition (since πi(0, F) = 0 < πi(v, F)).
From the mixed strategy indifference condition, we know that ∀p ∈ [a, v],

F(p)(Q − K)p + (1 − F(p))Kp = C

Evaluating at p = v (where F(v) = 1) implies C = (Q − K)v. So

F(p) = K/(2K − Q) − (Q − K)v/((2K − Q)p)
Pure strategy equilibria exist when K lies at the boundaries of the interval [Q/2, Q]. As K approaches these boundaries, the mixed strategy equilibrium converges to the pure strategy equilibrium.
(i) As K ↓ Q/2, a ↑ v, which means that the support of F converges to v. In the limit
where K = Q/2, there is a pure strategy equilibrium with p = v.
(ii) As K ↑ Q, a ↓ 0, which means that the support of F converges to [0, v]. However, for any p > 0, F(p) converges to unity. Consequently, all of the probability mass is converging towards 0. In the limit where K = Q, there is a pure strategy equilibrium with p = 0.
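The closed-form F can be checked against the indifference condition. A numeric sketch (parameter values are illustrative, chosen so that Q/2 < K < Q):

```python
Q, K, v = 1.0, 0.7, 10.0   # illustrative: Q/2 < K < Q

def F(p):
    return K / (2 * K - Q) - (Q - K) * v / ((2 * K - Q) * p)

a = (Q - K) * v / K        # bottom of the support solves F(a) = 0
assert abs(F(a)) < 1e-12 and abs(F(v) - 1.0) < 1e-12

# indifference: every p in [a, v] yields the same expected payoff (Q − K)v
for p in (a, 0.5 * (a + v), 0.9 * v, v):
    payoff = F(p) * (Q - K) * p + (1 - F(p)) * K * p
    assert abs(payoff - (Q - K) * v) < 1e-9
```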
(ii) The Bertrand problem with homogeneous products and differing linear costs (firms
pricing below cost)
Criterion: Discard all Nash equilibria that involve any player selecting a weakly dominated
strategy with positive probability.
Eliminates any equilibrium in which an individual votes for his least favorite outcome.
Application #2: Bertrand pricing, constant unit costs, p ∈ ℝ₊ (either method of resolving consumer indifference).

Every pure strategy Nash equilibrium involves at least one player selecting a weakly dominated strategy, pi ≤ ci.
Application #3: Bertrand pricing, constant unit costs, p ∈ {0, 1, 2, ...} (and ci ∈ {0, 1, 2, ...}
for all i, with c1 ≤ c2 ≤ c3 ...)
The weak dominance criterion rules out every equilibrium in which any player selects pi ≤ ci .
The outcome differs according to whether c1 < c2 . There are two cases to consider:
(i) c1 = c2. Then the weak dominance criterion selects the equilibrium with p1 = p2 = c1 + 1. Note: as the price grid becomes finer, the equilibrium price converges to c1. Consequently, it is natural to accept p1 = p2 = c1 as a solution when p ∈ ℝ₊, even though, as a technical matter, both firms play weakly dominated strategies.
(ii) c1 < c2. Then the weak dominance criterion selects the equilibrium with p1 = c2, and p2 = c2 + 1. Note: as the price grid becomes finer, the equilibrium prices for firms 1 and 2 both converge to c2. Consequently, it is natural to accept p1 = p2 = c2 as the solution when p ∈ ℝ₊, even though, as a technical matter, firm 2's strategy is weakly dominated.
Alternative perspective on motivating problems: There is always some risk that an-
other player will make a “mistake.”
Formalization: Consider a finite normal form game ({Si }i∈1,...,I , g); as before, let Ki denote
the cardinality of player i’s pure strategy set.
Let ∆i(ε) be the collection of vectors (δi1, ..., δiKi) such that δik ∈ [ε, 1] for all k ∈ {1, ..., Ki}, and Σ_{k=1}^{Ki} δik = 1. This is the set of probability distributions over player i's pure strategies with the property that everything is played with at least probability ε.
An ε-constrained mixed strategy equilibrium for the game ({Si}i∈1,...,I, g) is a pure strategy Nash equilibrium for the game ({∆i(ε)}i∈1,...,I, π) (where π is the expected payoff function, exactly as before).
Definition #1: A trembling hand perfect equilibrium is any limit of ε-constrained equilibria
as ε goes to zero.
Definition #2: A trembling hand perfect equilibrium is a mixed strategy Nash equilibrium,
δ, with the following property: there exists some sequence of totally mixed strategy
profiles δ t → δ, such that, for all i and t, π i (δ i , δ t−i ) ≥ π i (si , δ t−i ) for all si ∈ Si .
Some theorems:
Remark: Trembling-hand perfection can be extended to games with infinite strategy sets by taking ε to be a lower bound on density, rather than probability. The second definition must then be adjusted as follows: as δt → δ, each player i has some sequence of best responses converging to δi.
Application: Bertrand competition with homogeneous products and identical linear costs, c.
We know that, in any pure strategy Nash equilibrium, all output is sold at the price p =
c. We also know that this price is weakly dominated. However, this equilibrium is
trembling-hand perfect, in the sense just described. Similarly, when costs differ across
firms, p1 = p2 = c2 is trembling hand perfect, even though p2 = c2 is weakly dominated.
3. Any deterministic theory of strategic behavior must have the property that agents play
Nash equilibria.
6. Experimental evidence.
1. A misguided objection: The "Nash assumption" — opponents' choices are fixed, and would not change if a player made an out-of-equilibrium choice — is not valid.
Definition: A conjectural variations equilibrium consists of s* ∈ S and, for each player i, a function ri*: Si → Sj (where ri*(si) represents i's conjecture of what j will play if i chooses si) such that
Key points:
(i) “Nash assumption” is that each player takes the strategies of the other players as fixed,
not their actions.
(ii) If a particular model (such as Cournot or Bertrand) does not allow for responses, this
is a problem with the model, and not with the equilibrium concept.
(iii) The conjectural variations concept does not coherently depict strategic choice even in
single-decision games, as it is impossible for both players to move first.
2. Nash equilibrium is too inclusive (see section 2.9, above, and the material in sections 4
and 5, below).
Example:
Pure strategy Nash equilibria: (A,b) with payoffs (6,10), and (B,a) with payoffs (10,6)
Mixed strategy Nash equilibrium: (6/7, 1/7) for each player, with expected payoff ≈ 8.57.
One potential agreement: Appoint a third party to flip a coin. The party announces
“heads” or “tails.” If heads, the players play (A, b). If tails, the players play (B, a).
Expected payoff is 8. Note: can choose the parameters differently so that this domi-
nates the mixed strategy equilibrium (change the 9s to 8s).
Another potential agreement: Appoint a third party to roll a six-sided die. Neither player observes the outcome. The third party tells P1 whether the number falls into the set {1, 2} or {3, 4, 5, 6}. He tells P2 whether the number falls into the set {1, 2, 3, 4} or {5, 6}. Players can condition their choices on the information received from the third party.
Check to see that this is self-enforcing (since the game is symmetric, we need only check
for player 1):
If player 1 is told that the die roll is in the set {1, 2}, she believes that 2 will play a with certainty. B is a best response to a.

If player 1 is told that the die roll is in the set {3, 4, 5, 6}, she believes that 2 will play both a and b with probability 0.5. In that case, A yields an expected payoff of 7.5, while B yields an expected payoff of 5. A is the best response.
Note: expected payoff here is 8 1/3, which is still not quite as good as the mixed strategy equilibrium, but we'll see in a moment that we can do even better.
Definition: Consider a finite game ({Si}ᴵᵢ₌₁, g). δ* (a probability distribution over S) is a correlated equilibrium iff ∀i and for all si chosen with strictly positive probability, si solves

max_{s′i ∈ Si} E_{s−i}[gi(s′i, s−i) | si, δ*]
Return to example:
Translating the previous equilibrium into these terms:
Are there other correlated equilibria? Consider the following family of distributions:
Conditions for this to be a correlated equilibrium (since it’s symmetric, we only have to
check player 1):
If player 1 is told to play B, she thinks that player 2 will select a with certainty. B is a
best response.
If player 1 is told to play A, she thinks that player 2 will select a with probability 2γ/(1 + γ), and that player 2 will select b with probability (1 − γ)/(1 + γ). A is a best response provided that

9·(2γ/(1 + γ)) + 6·((1 − γ)/(1 + γ)) ≥ 10·(2γ/(1 + γ)) + 0·((1 − γ)/(1 + γ))
γ = 0 corresponds to the very first correlated agreement we considered. The most attractive equilibrium is the one for which γ = 3/4. The associated expected payoff is 8 3/4, which is strictly higher than the payoff available from any mixed or pure strategy Nash equilibrium.
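The incentive calculation can be replicated numerically. This sketch rests on my reading of the example (payoffs g1(A,a)=9, g1(A,b)=6, g1(B,a)=10, g1(B,b)=0, and signal distribution P(A,a)=γ, P(A,b)=P(B,a)=(1−γ)/2, P(B,b)=0, reconstructed from the numbers above since the original matrix is not reproduced here):

```python
# Under the assumed distribution, P(opponent plays a | told A) = 2γ/(1+γ),
# and a player told B believes the opponent plays a for sure.
def expected_payoff(g):
    return 9 * g + (6 + 10) * (1 - g) / 2

def incentive_ok_when_told_A(g):
    pa = 2 * g / (1 + g)                 # belief that opponent plays a
    follow = 9 * pa + 6 * (1 - pa)       # obey and play A
    deviate = 10 * pa + 0 * (1 - pa)     # disobey and play B
    return follow >= deviate - 1e-12

assert expected_payoff(0.0) == 8.0           # γ = 0: the coin-flip agreement
assert incentive_ok_when_told_A(0.75)        # γ = 3/4 is incentive compatible
assert not incentive_ok_when_told_A(0.80)    # beyond 3/4 the constraint fails
assert abs(expected_payoff(0.75) - 8.75) < 1e-12
assert expected_payoff(0.75) > 60 / 7        # beats the mixed payoff ≈ 8.57
```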
3 Strategic Choice in Static Games with Incomplete
Information
A game is said to have incomplete information when some players do not know the payoffs
of other players.
Applications include: competition between firms with private information about costs and
technology, auctions where each potential buyer may attach a different valuation to the
item, negotiations with uncertainty about the other party’s preferences or objectives,
and so forth.
Two players
Player 2 may be either a “friend” or a “foe,” but player 1 doesn’t know which.
3.1 Bayesian equilibrium

In general, dealing with such situations would seem to require us to specify beliefs about others' payoffs, beliefs about others' beliefs about others' payoffs, etc.
A general formulation: Each player has a payoff function gi (s, θi ) for s ∈ S and θi ∈ Θi ,
where θi is sometimes referred to as player i’s “type.” Define Θ = Θ1 × ... × ΘI . For
simplicity, assume Θ is a Euclidean set. Let the CDF F (θ) (for θ ∈ Θ) describe the
joint distribution of the players’ types. Each player knows her own value of θi , and F
is common knowledge, but i does not know θj for any j 6= i.
We can describe this as a Bayesian game characterized by the vector ({Si}ᴵᵢ₌₁, g, Θ, F)
Alternatively, we can describe this as a more conventional game (the “extended game”), as
follows.
Play unfolds as follows: (i) nature selects types, (ii) players observe their own types, but do
not observe the types of any other players, (iii) players simultaneously select si ∈ Si .
Given F and g, we can, for any σ ∈ Σ, compute an expected payoff for player i, ui (σ), as
follows:
ui (σ) = Eθ [gi (σ 1 (θ1 ), ..., σ I (θI ), θi )]
For the following, suppose that Θi is a finite set for each i (the extension to infinite sets is
straightforward).
Example #1, revisited: Let p denote the probability that player 2 is a friend; 1 − p is the
probability that player 2 is a foe.
Normal form for extended game:
Nash equilibrium: (H, ht) if p ≥ 1/2, (T, ht) if p ≤ 1/2.

For p < 1/2, σ1 = T, same decision rule for 2. For p = 1/2, 1 could choose H or T, or randomize between them; 2's decision rule is unchanged.
Problem: How should the government deal with externalities (including public goods) when
it does not have information about consumers’ preferences? In general, agents will have
incentives either to understate or exaggerate certain aspects of their preferences. How
can the government elicit correct information?
The taste parameters are drawn from some probability distribution with CDF F , which we
take for simplicity to be the same for both individuals. F is common knowledge.
Government policy: Consumers will be asked to name their types. They are not constrained to tell the truth. Let θᵢᴬ be i's announced type. The government's policy is a function of the announced types.
Note: The government actually maximizes u1 +u2 +T1 +T2 , but T1 +T2 = 0 is an additional
constraint.
Let x∗i (θ1 , θ2 ) denote the optimal solution as a function of type. That is, x∗i (θ1 , θ2 ) solves
Remarks on approach: The government's problem is known as a mechanism design problem.
We will study direct revelation mechanisms, in which reported types are mapped to out-
comes on the assumption that the parties have revealed their information truthfully.
To justify our focus on direct revelation mechanisms, we invoke a theorem known as the
revelation principle. This theorem states that, in searching for an optimal mechanism
within a much broader class, the designer can, without loss, restrict attention to direct
revelation mechanisms for which truth-telling is an optimal strategy for each party.
xᵢᴾ(θ₁ᴬ, θ₂ᴬ) = xᵢ*(θ₁ᴬ, θ₂ᴬ)

T₁(θ₁ᴬ, θ₂ᴬ) = ∫ u₂(x₁*(θ₁ᴬ, θ₂ᵀ), x₂*(θ₁ᴬ, θ₂ᵀ), θ₂ᵀ) dF(θ₂ᵀ) + K₁(θ₂ᴬ) ≡ t₁(θ₁ᴬ) + K₁(θ₂ᴬ)
The idea: tax captures the expected externality which 1’s choice has on 2.
Given this scheme, what will the consumers choose? This is a game of incomplete informa-
tion.
Claim: Announcing the truth (that is, θᵢᴬ(θᵢᵀ) = θᵢᵀ) is a Bayesian Nash equilibrium.
Proof: Consider consumer 1's decision (the argument is identical for player 2). Imagine that consumer 2 follows the decision rule θ₂ᴬ(θ₂ᵀ) = θ₂ᵀ. Then consumer 1's payoff is
= ∫ [u₁(x₁*(θ₁ᴬ, θ₂ᵀ), x₂*(θ₁ᴬ, θ₂ᵀ), θ₁ᵀ) + u₂(x₁*(θ₁ᴬ, θ₂ᵀ), x₂*(θ₁ᴬ, θ₂ᵀ), θ₂ᵀ)] dF(θ₂ᵀ) + ∫ K₁(θ₂ᵀ) dF(θ₂ᵀ)

The second term does not depend on θ₁ᴬ, so we can ignore it. Consider the integrand of the first term. By definition, x₁*(θ₁ᵀ, θ₂ᵀ) and x₂*(θ₁ᵀ, θ₂ᵀ) maximize u₁(x₁, x₂, θ₁ᵀ) + u₂(x₁, x₂, θ₂ᵀ). So clearly, consumer 1 does at least as well by announcing θ₁ᵀ, thereby producing x₁*(θ₁ᵀ, θ₂ᵀ) and x₂*(θ₁ᵀ, θ₂ᵀ) for each θ₂ᵀ, than by announcing some other θ₁ᴬ, thereby producing x₁*(θ₁ᴬ, θ₂ᵀ) and x₂*(θ₁ᴬ, θ₂ᵀ) for each θ₂ᵀ. Q.E.D.
Implication: For this mechanism, there exists a Bayesian Nash equilibrium that achieves
the first-best outcome.
Setting K₁(θ₂ᴬ) = −t₂(θ₂ᴬ) and K₂(θ₁ᴬ) = −t₁(θ₁ᴬ), the budget balances:

T₁(θ₁ᴬ, θ₂ᴬ) + T₂(θ₁ᴬ, θ₂ᴬ) = [t₁(θ₁ᴬ) − t₂(θ₂ᴬ)] + [t₂(θ₂ᴬ) − t₁(θ₁ᴬ)] = 0
An alternative tax/subsidy scheme: Attempt to achieve the first best through a mechanism that has a dominant strategy solution (in which case uniqueness is assured, common knowledge of distributions is unimportant, and the solution concept is difficult to challenge):

xᵢᴾ(θ₁ᴬ, θ₂ᴬ) = xᵢ*(θ₁ᴬ, θ₂ᴬ)

T₁(θ₁ᴬ, θ₂ᴬ) = u₂(x₁*(θ₁ᴬ, θ₂ᴬ), x₂*(θ₁ᴬ, θ₂ᴬ), θ₂ᴬ) + K₁(θ₂ᴬ) = t₁(θ₁ᴬ, θ₂ᴬ) + K₁(θ₂ᴬ)
The idea: tax captures the externality which 1’s choice has on 2, assuming 2’s announcement
is correct.
Given this scheme, what will the consumers choose? This is another game of incomplete
information.
Claim: Announcing the truth (that is, θ_i^A(θ_i^T) = θ_i^T) is a (weakly) dominant strategy.
Proof: Since the problem is symmetric, we can consider consumer 1's decision. Let G denote the CDF for θ_2^A (since 2 may not be telling the truth, G may bear practically no relation to F). Given any realization of type, θ_1^T, consumer 1's expected payoff is given by

∫ [ u_1( x_1^*(θ_1^A, θ_2^A), x_2^*(θ_1^A, θ_2^A), θ_1^T ) + t_1(θ_1^A, θ_2^A) + K_1(θ_2^A) ] dG(θ_2^A)
= ∫ [ u_1( x_1^*(θ_1^A, θ_2^A), x_2^*(θ_1^A, θ_2^A), θ_1^T ) + u_2( x_1^*(θ_1^A, θ_2^A), x_2^*(θ_1^A, θ_2^A), θ_2^A ) ] dG(θ_2^A) + ∫ K_1(θ_2^A) dG(θ_2^A)

As before, the second term does not depend on θ_1^A, so we can ignore it. Considering the integrand of the first term for each realization θ_2^A, consumer 1 does at least as well by announcing θ_1^T, thereby producing x_1^*(θ_1^T, θ_2^A) and x_2^*(θ_1^T, θ_2^A) for each θ_2^A, than by announcing some other θ_1^A, thereby producing x_1^*(θ_1^A, θ_2^A) and x_2^*(θ_1^A, θ_2^A). Q.E.D.
A potential problem: As before, the budget may not balance. In this case, we cannot use the same trick as before, because t_1 depends on both θ_1^A and θ_2^A. One can instead set

K_1(θ_2^A) = −E_{θ_1^T}[ t_2(θ_1^T, θ_2^A) ] = −∫ u_1( x_1^*(θ_1^T, θ_2^A), x_2^*(θ_1^T, θ_2^A), θ_1^T ) dF(θ_1^T)

(and symmetrically for K_2), which yields, with truthful announcements,

E_{θ_1^T, θ_2^T}[ T_1(θ_1^T, θ_2^T) + T_2(θ_1^T, θ_2^T) ] = 0

Consequently, the budget balances in expectation in equilibrium, but not necessarily for all realizations of preferences, and not outside of equilibrium.
Comments: (i) This is known as a Groves mechanism. (ii) One can prove that there does
not exist a dominant strategy mechanism that balances the budget for all realizations
of preferences.
N bidders, i = 1, ..., N. Each bidder i privately observes a valuation v_i, drawn independently from a CDF F (with density f) on [v_ℓ, v_h].

For some purposes, we will assume that F is uniform on [0, α].
We will consider outcomes of four different types of auctions, making comparisons across
auctions to determine the most profitable method of selling the good.
1. Second-price sealed bid auction: Bidders simultaneously submit sealed bids, b_i (for bidder i); the highest bid wins, but the winner pays the second highest bid. A winner is chosen from among the high bidders with equal probability in the event of a tie.
We have already analyzed this model with complete information. We found that b_i = v_i weakly dominates all other choices; the same argument applies with incomplete information.
The seller always receives the second highest bid, which equals the second highest valuation. Consequently, expected revenue equals the expectation of the second order statistic of the set of valuations, v_(2):

E[v_(2)] = ∫_{v_ℓ}^{v_h} v N(N − 1) f(v) [F(v)]^{N−2} [1 − F(v)] dv
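As a quick numerical check on this formula (a sketch, not part of the notes), one can estimate the seller's expected revenue by simulation; for F uniform on [0, α] the closed form is E[v_(2)] = α(N − 1)/(N + 1):

```python
import random

def expected_second_highest(n_bidders, n_draws, alpha=1.0, seed=0):
    """Monte Carlo estimate of E[v_(2)], the expected second-highest
    valuation (= seller's revenue in the second-price auction), for
    i.i.d. uniform valuations on [0, alpha]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        vals = sorted(rng.uniform(0.0, alpha) for _ in range(n_bidders))
        total += vals[-2]  # second-highest valuation
    return total / n_draws

# Closed form for the uniform case: E[v_(2)] = alpha * (N - 1) / (N + 1)
print(expected_second_highest(3, 200_000))  # close to (3 - 1)/(3 + 1) = 0.5
```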
2. First-price sealed bid auction: Bidders simultaneously submit sealed bids, b_i (for bidder i); the highest bidder wins, and pays her own bid. A winner is chosen from among the high bidders with equal probability in the event of a tie.
Payoffs are vi − bi for the winner, and 0 for all other bidders.
To construct a pure strategy Bayesian Nash equilibrium, we must identify decision rules, σ_i(v_i), mapping valuations, v_i, to bids, b_i. We look for a symmetric equilibrium in which all bidders use the same rule, σ.

Assume at the outset that σ(v) is strictly increasing in v. (Later, we will have to verify that this property is in fact satisfied by the rule that we derive.)

In that case, the probability of winning upon submitting the bid b is [F(σ^{−1}(b))]^{N−1}, so a bidder with valuation v who bids b obtains the expected payoff

π(v, b) = (v − b) [F(σ^{−1}(b))]^{N−1}

Let Π(v) ≡ π(v, σ(v)) denote the equilibrium expected payoff.
Notice that

dΠ(v)/dv = ∂π(v, σ(v))/∂v + [∂π(v, σ(v))/∂b] × dσ(v)/dv

By the Envelope Theorem (σ(v) maximizes π(v, ·)), ∂π(v, σ(v))/∂b = 0. Consequently,

dΠ(v)/dv = [F(σ^{−1}(σ(v)))]^{N−1} = [F(v)]^{N−1}
Moreover, notice that

Π(v) = (v − σ(v)) [F(σ^{−1}(σ(v)))]^{N−1} = (v − σ(v)) [F(v)]^{N−1}

Taking the previous two expressions and substituting them into the identity

∫_{v_ℓ}^{v} (dΠ(w)/dw) dw = Π(v) − Π(v_ℓ)

we obtain (since F(v_ℓ) = 0 implies Π(v_ℓ) = 0)

∫_{v_ℓ}^{v} [F(w)]^{N−1} dw = (v − σ(v)) [F(v)]^{N−1} − 0

Solving for the bid function:

σ(v) = v − ( ∫_{v_ℓ}^{v} [F(w)]^{N−1} dw ) / [F(v)]^{N−1}
Notice: (i) This is increasing in v (as we assumed at the outset). (ii) Each bidder bids less than his or her actual valuation. (iii) lim_{N→∞} σ(v) = v. (iv) In general, the highest bid will not equal the second highest valuation, so realized revenues will differ from those collected in a second-price auction.
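The equilibrium bid function σ(v) = v − (∫_{v_ℓ}^v [F(w)]^{N−1} dw)/[F(v)]^{N−1} is easy to evaluate numerically. A sketch (not from the notes) using a simple trapezoid rule, with the uniform case, where σ(v) = (N − 1)v/N, as a check:

```python
def first_price_bid(v, N, F, v_low=0.0, grid=10_000):
    """Equilibrium first-price bid sigma(v) = v - (integral from v_low to v
    of F(w)**(N-1) dw) / F(v)**(N-1), via a trapezoid rule."""
    if v <= v_low:
        return v_low
    h = (v - v_low) / grid
    ws = [v_low + i * h for i in range(grid + 1)]
    vals = [F(w) ** (N - 1) for w in ws]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return v - integral / (F(v) ** (N - 1))

# Uniform on [0, 1]: the closed form is sigma(v) = (N - 1) * v / N.
F_unif = lambda w: w
print(first_price_bid(0.8, 4, F_unif))  # approximately 0.6
```

The same function works for any continuous F with F(v_low) = 0, which is how one sees properties (i)-(iii) numerically.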
The next step is to compute expected revenue. In a first-price auction, the seller receives
the highest bid. Consequently, expected revenue equals the expected value of the first
order statistic of the set of bids:
E[b_(1)] = ∫_{v_ℓ}^{v_h} σ(v) N f(v) [F(v)]^{N−1} dv
Consequently, for the uniform case, where F(v) = v/α on [0, α] and σ(v) = ((N − 1)/N) v, we have

E[b_(1)] = ((N − 1)/N) E[v_(1)]
         = ((N − 1)/N) ∫_0^α v N f(v) [F(v)]^{N−1} dv
         = ((N − 1)/N) ∫_0^α v N (1/α) (v/α)^{N−1} dv
         = ((N − 1)/N) (N/(N + 1)) α
         = ((N − 1)/(N + 1)) α
Notice: expected revenues are exactly the same as for the second-price auction. This is a special case of the revenue equivalence theorem, which states that first- and second-price sealed bid auctions always generate the same expected revenues in auctions with independent private valuations, irrespective of the distribution function. This equivalence holds despite the fact that, for any given realization of valuations, realized revenues will differ across these two auctions.
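A simulation sketch of this equivalence for the uniform case (not from the notes; it assumes the equilibrium bidding rules derived above, b = v in the second-price auction and b = (N − 1)v/N in the first-price auction):

```python
import random

def auction_revenues(N, alpha, n_draws, seed=1):
    """Expected revenue in first- and second-price auctions with i.i.d.
    uniform valuations on [0, alpha], under the equilibrium bidding rules."""
    rng = random.Random(seed)
    fp = sp = 0.0
    for _ in range(n_draws):
        vals = sorted(rng.uniform(0.0, alpha) for _ in range(N))
        sp += vals[-2]                 # winner pays second-highest valuation
        fp += (N - 1) * vals[-1] / N   # winner pays own bid sigma(v_(1))
    return fp / n_draws, sp / n_draws

fp, sp = auction_revenues(N=5, alpha=1.0, n_draws=200_000)
print(fp, sp)  # both close to (N - 1)/(N + 1) = 2/3
```

Realized revenues differ draw by draw (σ(v_(1)) ≠ v_(2) in general), but the two sample means agree, as the theorem predicts.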
General proof of revenue equivalence theorem: Consider any auction structure and an associated Bayesian Nash equilibrium σ(v). Let p(b) denote the equilibrium probability of winning as a function of the bid, and let t(b) denote the expected payment to the auctioneer, given the bid. The expected payoff to a bidder with valuation v is

Π(v) = v p(σ(v)) − t(σ(v))

By the same envelope argument as before, dΠ(v)/dv = p(σ(v)). So

Π(v) = Π(v_ℓ) + ∫_{v_ℓ}^{v} (dΠ(w)/dw) dw = Π(v_ℓ) + ∫_{v_ℓ}^{v} p(σ(w)) dw
But then, using the definition of Π(v), we have

t(σ(v)) = v p(σ(v)) − ∫_{v_ℓ}^{v} p(σ(w)) dw − Π(v_ℓ)

But for both first- and second-price auctions, we have shown that bids are strictly increasing in valuations, so Π(v_ℓ) = 0 and

p(σ(v)) = [F(v)]^{N−1}

Each bidder's expected payment, and hence expected revenue, is therefore the same for both types of auctions. Q.E.D.
3. English (ascending price) auction: Posted price of the good is slowly increased. Bid-
ders drop out, until there is only one left, who purchases the good for the price at which
the second-to-last bidder dropped out. (If the last bidders drop out simultaneously, a
winner is selected from among them at random.)
Weakly dominant strategy for each bidder: stay in until the posted price exceeds your
valuation.
Implication: the good is sold for a price equal to the second highest valuation, exactly as
in a second-price sealed bid auction.
4. Dutch (descending price) auction: Posted price of the good is slowly decreased, until
a bidder claims the object at the posted price. (In the event of a tie, the winner is
selected at random from those attempting to claim the object at the posted price.)
A strategy is completely described by a decision rule that specifies a price at which to claim
the object.
The mapping from these strategy profiles into payoffs is the same as for the first-price
auction.
Thus, the descending price auction is strategically equivalent to the first-price auction, and
should produce the same outcome.
Remarks:
(i) All four varieties of auctions produce the same expected revenue with independent private valuations.

(ii) Revenue equivalence does not necessarily hold once one relaxes the assumption of independent private valuations.

(iii) Another natural question is whether there is any other kind of auction that yields greater revenue. This is another mechanism design question, and it can be addressed once again by invoking the revelation principle. In general, one can improve upon the payoff that the seller obtains from the first-price auction by specifying a reserve price in excess of the seller's valuation.
4 Dynamic Games and the Problem of Credibility
Example #1: A simple model of entry into an industry.
If the entrant chooses "In", the incumbent must choose to "Fight" or to "Accommodate".
There are two Nash equilibria: (In, Accommodate) and (Out, Fight).

Only (In, Accommodate) makes sense. For (Out, Fight), the entrant is discouraged from entering by a non-credible threat.

Note: in this example, weak dominance eliminates (Out, Fight), but the problem here is more general.
General problem: impose conditions to guarantee that players anticipate sensible things
“off the equilibrium path.”
Notation:
h(t) denotes the information set (element of the information partition) containing t
S(t) denotes the successors to the node t

Definition: A proper subgame of an extensive form game is {t} ∪ S(t) (along with the associated mappings from information sets to players, from branches to action labels, and from terminal nodes to payoffs) such that h(t) = {t} and ∀ t′ ∈ S(t), h(t′) ⊆ S(t).
Definition: (Due to Selten) Consider a Nash equilibrium in behavior strategies, δ ∗ . This
equilibrium is subgame perfect iff for every proper subgame, the restriction of δ ∗ to
that subgame forms a Nash equilibrium in behavior strategies.
Observation: For finite games, one can solve for subgame perfect Nash equilibria (SPNE) by using backward induction on the subgames of the extensive form.
Return to example #1: There is one proper subgame (shown in the preceding illustration). The only equilibrium of this subgame involves the Incumbent choosing to "Accommodate." Consequently, (In, Accommodate) is subgame perfect, whereas (Out, Fight) is not.
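Backward induction in example #1 can be sketched mechanically. The payoffs below are hypothetical stand-ins for the omitted game tree — Out: (0, 2); (In, Fight): (−1, −1); (In, Accommodate): (1, 1) — but any payoffs with the same ordering give the same logic:

```python
# Hypothetical payoffs (entrant, incumbent); the notes' figure is omitted here.
PAYOFFS = {
    ("In", "Fight"): (-1, -1),
    ("In", "Accommodate"): (1, 1),
    ("Out", None): (0, 2),
}

def backward_induction():
    """Solve the entry game backward: first the incumbent's best response in
    the subgame following In, then the entrant's choice anticipating it."""
    i_choice = max(["Fight", "Accommodate"],
                   key=lambda a: PAYOFFS[("In", a)][1])
    e_payoff_in = PAYOFFS[("In", i_choice)][0]
    e_choice = "In" if e_payoff_in > PAYOFFS[("Out", None)][0] else "Out"
    return e_choice, i_choice

print(backward_induction())  # ('In', 'Accommodate')
```

The non-credible threat (Out, Fight) never appears, because the incumbent's choice is computed within the subgame rather than taken at face value.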
Example #2: Similar to previous game, except that E can enter with one of two different
production technologies, IN1 or IN2 . The payoffs from IN1 are as in the previous
example; the payoffs from IN2 are shown below:
Nash equilibria: (IN1, Accommodate), (Out, Fight).

Since there are no proper subgames, both equilibria are subgame perfect. However, (Out, Fight) is still implausible (all we have done is to add an inferior entry technology).
The problem: We need a theory of sensible choices at each information set, and not just
in proper subgames.
What is missing from subgame perfection? Beliefs at each information set. If we had
a description of initial beliefs at an information set, we could ask whether choices are
reasonable given those beliefs.
Notation:
H is the set of information sets (the information partition)

φ(h) denotes the player who makes the decision at information set h

Definition: A system of beliefs is a mapping μ : X → [0, 1] such that, ∀ h ∈ H, Σ_{t∈h} μ(t) = 1.
Illustration:
Definition: A weak perfect Bayesian equilibrium (WPBE) is a pair (δ∗, μ∗) such that (i) the strategies δ∗ are sequentially rational given the beliefs μ∗, and (ii) where possible, μ∗ is computed from δ∗ using Bayes rule. That is, for any information set h with Prob(h | δ∗) > 0 and any t ∈ h,

μ∗(t) = Prob(t | δ∗) / Prob(h | δ∗)
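The Bayes-rule part of the definition is mechanical. A sketch of the computation (node names are illustrative):

```python
def bayes_beliefs(node_probs):
    """Beliefs at an information set h: mu(t) = Prob(t)/Prob(h), where
    Prob(h) is the total probability of reaching any node in h under the
    strategy profile. Returns None when h is off the equilibrium path
    (Prob(h) = 0), in which case Bayes rule places no restriction."""
    prob_h = sum(node_probs.values())
    if prob_h == 0:
        return None
    return {t: p / prob_h for t, p in node_probs.items()}

# On-path example: nodes t1, t2 reached with probabilities 0.2 and 0.1
print(bayes_beliefs({"t1": 0.2, "t2": 0.1}))  # t1 -> 2/3, t2 -> 1/3
# Off-path: weak PBE permits any beliefs here
print(bayes_beliefs({"t1": 0.0, "t2": 0.0}))  # None
```

The `None` branch is exactly the freedom that the examples below exploit: off the equilibrium path, WPBE lets us pick beliefs arbitrarily.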
Return to example #2: Consider the Nash equilibrium (Out, F ight). Can we supple-
ment this with a system of beliefs for which the strategy profile is sequentially rational?
System of beliefs consists of a single parameter, λ, such that μ(t1 ) = λ and μ(t2 ) = 1 − λ.
For the proposed equilibrium, Prob(h1 | δ ∗ ) = 0, so we cannot compute λ from Bayes rule.
We are therefore allowed to pick any value of λ ∈ [0, 1].
Note, however, that for all λ ∈ [0, 1], I's optimal choice at h1 is "Accommodate." Hence, the equilibrium strategies are not sequentially rational for any beliefs. This is not a weak perfect Bayesian equilibrium.
Now consider the Nash equilibrium (IN1, Accommodate). For this equilibrium, Prob(h1 | δ∗) = 1, so we can compute λ from Bayes rule:

μ(t1) = Prob(t1 | h1) = Prob(t1 | δ∗) / Prob(h1 | δ∗) = 1/1 = 1

Likewise, μ(t2) = 0. Note that Accommodate is optimal given these beliefs. This is a weak perfect Bayesian equilibrium.
A WPBE: E plays (Out, IN2 if In), I plays Fight (if h1 is reached), I's beliefs at h1: μ(t1) = 1, μ(t2) = 0.

This equilibrium is not, however, subgame perfect. There is one proper subgame, and (IN2, Accommodate) is the equilibrium for this subgame. Consequently, E should play In at the root node. This is a SPNE, and it is also a WPBE.
The first WPBE is not reasonable. If E plays In, then E plainly prefers IN2. Yet if I observes that E has played In, I infers that E has played IN1. These beliefs seem inconsistent with E's incentives. WPBE does not rule them out, however, because they are off the equilibrium path.

Remark: One can strengthen WPBE in a variety of ways. For example, one can define a perfect Bayesian equilibrium (PBE) as a WPBE that is also a WPBE in all proper subgames. This assures subgame perfection. However, it still doesn't appear to place sufficient restrictions on out-of-equilibrium beliefs.
Suppose we have an equilibrium in which 1 played Out at h1 . Then, to make this a W P BE,
we are allowed to pick any beliefs for 2 at the information set h2 , e.g. including μ(t3 ) =
0.9, and μ(t4 ) = 0.1. But it is clear that the only sensible belief is μ(t3 ) = μ(t4 ) = 0.5
(nature picks L and R with equal probabilities, and 1’s choice cannot depend on N’s
choice).
4.3 Sequential equilibrium
Definition: A behavior strategy profile δ is strictly mixed if every action at every informa-
tion set is selected with strictly positive probability.
Note: For a strictly mixed behavior strategy profile δ, every information set is reached
with strictly positive probability. Consequently, one can completely infer a system of
beliefs, μ, from δ using Bayes rule. Let M denote the mapping from strictly mixed
behavior strategies to systems of belief.
Definition: (δ, μ) is consistent iff there exists a sequence of strictly mixed behavior strategy
profiles δ n → δ such that μn → μ, where μn = M(δ n ).
Definition: (Due to Kreps and Wilson) (δ, μ) is a sequential equilibrium iff (δ, μ) is consistent and δ is sequentially rational given μ at every information set.
Return to example #3: Recall the undesirable equilibrium: E plays (Out, IN2 if In), I plays Fight (if h1 is reached), I's beliefs at h1: μ(t1) = 1, μ(t2) = 0.
Are these beliefs consistent? Consider a sequence of completely mixed behavior strategies,
as illustrated below, with λn → 0 and αn → 0 (so that the sequence converges to the
equilibrium strategy for E).
Apply Bayes rule for any element of this sequence:

Prob(t1 | h1) = (λ_n α_n) / λ_n = α_n

Since α_n → 0, consistency requires μ(t1) = 0; the proposed belief μ(t1) = 1 is therefore not consistent, so this WPBE is not a sequential equilibrium.
Graphical interpretation:
Return to example #4: Recall the problematic beliefs: player 1 selects "Out," and player 2 makes some inference μ(t3) ≠ 0.5.
Are these beliefs consistent? Consider a sequence of completely mixed behavior strategies,
as illustrated below, with λn → 0 (so that the sequence converges to the equilibrium
strategy for 1).
Apply Bayes rule for any element of this sequence:

Prob(t3 | h2) = 0.5λ_n / (0.5λ_n + 0.5λ_n) = 0.5
Plainly, this converges to 0.5 as n → ∞. But then, to satisfy consistency, we must have
μ∗ (t3 ) = μ∗ (t4 ) = 0.5. The proposed beliefs violate this condition. Hence, the associ-
ated equilibrium cannot be sequential.
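The two limit computations can be sketched side by side (λ_n, α_n as in the examples; a numerical sketch, not from the notes):

```python
def belief_t1(lam, alpha):
    """Example #3 sequence: Prob(In) = lam, Prob(IN1 | In) = alpha, so
    mu_n(t1) = (lam * alpha) / lam = alpha along the sequence."""
    return (lam * alpha) / lam

def belief_t3(lam):
    """Example #4 sequence: nature picks L/R with probability 0.5 each and
    player 1 plays In with probability lam independently of nature, so
    mu_n(t3) = 0.5*lam / (0.5*lam + 0.5*lam) = 0.5 for every n."""
    return 0.5 * lam / (0.5 * lam + 0.5 * lam)

for n in range(1, 5):
    lam = alpha = 10.0 ** (-n)
    print(belief_t1(lam, alpha), belief_t3(lam))
# belief_t1 -> 0 along the sequence (so mu(t1) = 1 is not consistent),
# while belief_t3 is pinned at 0.5 for every n.
```

The point of the sketch: trembles can discipline off-path beliefs either by forcing a particular limit (example #4) or by ruling a proposed belief out (example #3).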
Graphical interpretation:
In each case, consistency requires us to take a continuous selection from the correspondence
M(δ) (more generally, we insist on a selection that makes this correspondence lower
hemicontinuous).
Motivation: If players make all conceivable mistakes with strictly positive probability, then
all information sets will be reached with positive probability, and there won’t be any
need to think about beliefs off the equilibrium path.
Problem: Normal form trembling hand perfect equilibria need not be subgame perfect.
One can demonstrate this in two player games, where normal form trembling hand
perfection is equivalent to selecting Nash equilibria in which no player chooses a weakly
dominated strategy. Avoiding weakly dominated strategies in the normal form is not
sufficient to guarantee subgame perfection.
Example #5:
Three Nash equilibria: ((A, C), E), ((B, C), F ), and ((B, D), F ).
There is one proper subgame, and only one Nash equilibrium for this subgame: (C, E).
Consequently, ((A, C), E) is the only subgame perfect equilibrium.
Neither player has a weakly dominated strategy in any of the three Nash equilibria. Con-
sequently, all three equilibria are normal-form trembling-hand perfect.
Why ((B, C), F) and ((B, D), F) survive: trembles can place more weight on (A, D) than on (A, C). This is peculiar since, having played A, it is obviously better for player 1 to select C rather than D.
Note: If one first eliminates dominated strategies (A, D) and then looks for equilibria in
strategies that are not weakly dominated, one isolates ((A, C), E).
Extensive form trembling-hand perfection: The idea is to have players tremble inde-
pendently among all actions at each information set.
Formalization: The agent normal form of an extensive form game is the normal form that
one would obtain if each player selected a different agent to make her decisions at every
information set, and if all of the player’s agents acted independently, with the object
of maximizing the original player’s payoff.
For the game just considered, the agent normal form is as follows, where player 1a picks
rows, player 1b picks columns, and player 2 picks boxes:
For the game just considered, the only extensive form trembling-hand perfect equilibrium
is ((A, C), E). One can see this directly: if player 1a plays A with positive probability,
then player 1b will necessarily prefer C to D.
Relation between EF T HP and SE: (i) Both use trembles in the extensive form to de-
rive implications for information sets off the equilibrium path
(ii) SE is easier to compute, as it requires one only to think about best responses to the
limiting strategies, and not along the entire sequence of strategies
(iv) EF T HP ⊆ SE
Can verify that (Out, Fight) and (IN1, Accommodate) are both still Nash equilibria.

This is obvious for the case of (IN1, Accommodate), since beliefs are implied by the strategies.

For (Out, Fight), consider beliefs of the form μ(t1) = 0.25, μ(t2) = 0.75. It is easy to check that Fight is optimal given these beliefs. Moreover, these beliefs are consistent with the equilibrium strategy. To see this, consider the following sequence of strategies for E:
Prob(Out) = 1 − λn
Prob(IN1 ) = 0.25λn
Prob(IN2 ) = 0.75λn
This is not a reasonable equilibrium. E can assure himself of 0 by playing Out. E can get at most −1 by playing IN2. Therefore, he would never play IN2. On the other hand, E might get as much as +2 by playing IN1, which exceeds what he could get by playing Out. This means that E could conceivably justify playing IN1 (with the expectation that I will play Accommodate). Thus, if I finds herself at information set h1, she should conclude that E has played IN1. Given this conclusion, it is in fact optimal for I to play Accommodate.
This is an example of a “forward induction” argument. We will see more of this when we
come to dynamic games of incomplete information.
5 Dynamic Games with Complete Information: Appli-
cations
5.1 Leader-follower problems
5.1.1 Sequential quantity competition (the Stackelberg model)
Framework
Strategy sets
Nash equilibria
(ii)′ Q2∗(q1∗) solves max_{q2} π2(q1∗, q2)

(ii)′′ Q2∗(q1∗) ∈ γ2(q1∗), where γ2 is the Cournot best response correspondence

Note that condition (ii)′′ only restricts the value of the function Q2∗ evaluated at the point q1∗. To sustain a Nash equilibrium, one is free to select any value for Q2∗(q1) when q1 ≠ q1∗. In particular, one can use:

Q2∗(q1) = γ2(q1∗) if q1 = q1∗, and Q2∗(q1) = +∞ otherwise
In this way, one can sustain as an outcome almost any point along firm 2’s best response
function. However, firm 2’s responses to quantities other than q1∗ are not credible.
This is because we have not insisted on subgame perfection.
An equilibrium in any such subgame entails firm 2 making a best response to the choice of
firm 1, q1 , that defines the subgame. That is, firm 2 must select γ 2 (q1 ) in response to
every value of q1 . In other words, subgame perfection requires Q∗2 (q1 ) = γ 2 (q1 ).
To have a Nash equilibrium, condition (i) (or equivalently condition (i)′) must still hold. Consequently, q1∗ solves max_{q1} π1(q1, γ2(q1)).
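For a concrete special case (not in the notes): with linear inverse demand P(Q) = a − Q and constant unit cost c, the follower's best response is γ_2(q1) = (a − c − q1)/2, and the leader's problem max q1·(a − q1 − γ_2(q1) − c) gives q1 = (a − c)/2:

```python
def stackelberg_linear(a, c):
    """Stackelberg quantities for the illustrative linear case P(Q) = a - Q,
    unit cost c. Follower: gamma2(q1) = (a - c - q1) / 2. Substituting into
    the leader's profit gives q1 * (a - c - q1) / 2, maximized at
    q1 = (a - c) / 2."""
    q1 = (a - c) / 2
    q2 = (a - c - q1) / 2  # follower's best response
    return q1, q2

def cournot_linear(a, c):
    """Symmetric Cournot quantities for the same linear case."""
    q = (a - c) / 3
    return q, q

print(stackelberg_linear(12, 0), cournot_linear(12, 0))
# leader expands to 6.0, follower contracts to 3.0; Cournot is 4.0 each
```

In this case the leader's profit, q1·(a − q1 − q2 − c) = 18, exceeds both the follower's and the symmetric Cournot profit of 16, illustrating the comparisons stated next.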
Graphically:
This is known as the Stackelberg equilibrium.
Compared to the Cournot equilibrium, firm 1 produces more (is more aggressive), and firm
2 produces less (is less aggressive).
Firm 1 (the leader) does (i) better than in a simultaneous move setting, and (ii) better than
firm 2.
Framework
Sales for firm i are given by Qi (p1 , p2 )
Strategy sets
Remark: We will move directly to subgame perfect equilibria, skipping Nash equilibria
involving non-credible out-of-equilibrium responses.
Notice that firm 2's best response correspondence, γ2(p1), is not well defined for all p1: against any p1 > c, firm 2 wants to undercut by the smallest possible amount, which does not exist with a price continuum. This implies that there are subgames in which a Nash equilibrium does not exist, and hence, strictly speaking, subgame perfect equilibria do not exist. (A standard resolution: let firm 2 win ties, so that matching p1 is a best response whenever c ≤ p1 ≤ pm; denote the resulting follower strategy by P2∗.)

Note that g1(p1, P2∗) = 0 for all p1 ≥ c. Consequently, firm 1 is completely indifferent with respect to all prices not below cost.

Implication: any (p1, P2∗) with p1 ≥ c is a subgame perfect equilibrium. Consequently, the equilibrium price can be any p ∈ [c, pm].
Note: it is not possible to eliminate any of these outcomes through dominance arguments.
Compared to the Bertrand equilibrium, firm 1 sets a (weakly) higher price (is less aggres-
sive), and firm 2 sets a (weakly) higher price (is less aggressive).
Firm 1 (the leader) does (i) the same as in a simultaneous move setting, and (ii) (weakly)
worse than firm 2.
Firm 2 (the follower) does (weakly) better than in a simultaneous move setting.
Heterogeneous products
The analysis is the same as with Stackelberg, except with Bertrand best responses instead
of Cournot.
Graphically:
Compared to the Bertrand equilibrium, firm 1 sets a higher price (is less aggressive), and
firm 2 sets a higher price (is less aggressive).
Assuming symmetry, firm 1 (the leader) does (i) better than in a simultaneous move setting,
and (ii) worse than firm 2 (firm 1 sets the higher price, and firm 2 makes a best response)
1. The leader always does (weakly) better than in a simultaneous move setting. (Note that
this is not necessarily true for mixed strategy equilibria. MP-B is an example.)
2. Whether firm 2 does better or worse than in a simultaneous move setting depends on whether the model exhibits strategic substitutes or strategic complements.
3. Whether there is a first or second mover advantage (in terms of which player does better
in symmetric models) is related to whether the model exhibits strategic substitutes or
strategic complements.
Framework
Stage 2: Having observed each others’ capacities, firms select prices simultaneously. Cus-
tomers attempt to purchase the good from the firm naming the lowest price. However,
firm i can only sell quantity up to capacity Ki . Consumers who are unable to purchase
the good from the low-price firm may purchase it (if available) from the high-price firm.
Note: With downward sloping demand, consumers implicitly differ in their valuations of the
good (or they have different valuations for different units). Consequently, the quantity
sold by the high-price firm will depend upon which consumers are turned away from
the low-price firm.
Assumption: Consumers are rationed to maximize total surplus (efficient rationing). High
value consumers purchase from the low-price firm (they are the most eager), and low
value consumers purchase from the high-price firm. Thus:
If pi > pj:  qj = min{Kj, Q(pj)},  qi = min{Ki, max{Q(pi) − Kj, 0}}

If pi = pj:  qi = min{Ki, max{Q(pi)/2, Q(pi) − Kj}}
In other words, if there is a tie, each firm has a “claim” on half of the market, but can also
serve any customers turned away by its rival.
Review
If either firm has chosen Ki < Q(c2 ) in stage 1, p = c2 is no longer an equilibrium for stage
2. Graphically (for the case of downward sloping demand):
When capacity constraints are binding, pure strategy Nash equilibria frequently do not
exist. The reason: gi (p1 , p2 ) is discontinuous at p1 = p2 > c2 .
However, in such cases, there are generally mixed strategy equilibria. Consequently, allow-
ing for either pure or mixed strategy equilibria in subgames, we can solve for subgame
perfect Nash equilibria in the two-stage game with endogenous capacity.
Analytic preliminaries
Some definitions:
γc(q) denotes the Cournot best response function with constant unit cost c, i.e. arg max_{q′} (P(q + q′) − c) q′

Let qc denote the Cournot equilibrium quantity with constant unit cost c.

γ0(q) denotes the Cournot best response with zero unit cost, i.e. arg max_{q′} P(q + q′) q′
Graphically:
Theorem: (qc , qc ) followed by (P (2qc ), P (2qc )) occurs on the equilibrium path of a subgame
perfect Nash equilibrium. Furthermore, it is the unique SPNE outcome of this game.
Interpretation: If capacity is not easily changed in the short run but price is, the outcome will be Cournot, not Bertrand.
In region 1, each firm has chosen capacity less than or equal to its zero-cost best response to the other's capacity (Ki ≤ γ0(Kj) for both firms). In region 2, at least one firm has chosen capacity greater than its zero-cost best response.
Each point in the non-negative quadrant of the capacity plane defines a distinct subgame.
We need to know what happens in each of those subgames.
Claim: For any subgame following capacity choices (K1 , K2 ) in region 1, there is an
equilibrium for which both firms name price P (K1 + K2 ) with probability 1, and each
sells exactly Ki (in fact, this is the unique equilibrium).
Proof of the claim: Suppose that firm j selects pj = p∗ ≡ P(K1 + K2). We will show that pi = p∗ is firm i's best response.

Selecting pi = p∗, firm i sells Ki and earns profits of Ki p∗. For any pi < p∗, firm i's quantity would still equal its capacity, and its profits would be Ki pi < Ki p∗. Consequently, pi < p∗ is less profitable than pi = p∗.
Now consider firm i's best choice among prices satisfying pi ≥ p∗. Firm i's profits are then given by

[Q(pi) − Kj] pi

Consider the change of variables qi ≡ Q(pi) − Kj. Note that pi = Q^{−1}(qi + Kj) = P(qi + Kj). Substituting this into the expression for i's profits, firm i's problem becomes

max_{0 ≤ qi ≤ Ki} qi P(qi + Kj)
Ignoring the constraint that Ki ≥ qi , we know that firm i’s best deviation would entail
qi = γ 0 (Kj ). But, since we are in region 1, we also know that Ki ≤ γ 0 (Kj ). Thus,
the best choice for firm i is qi = Ki . Changing variables back to prices, we see that
the best choice for i satisfies pi = P (Ki + Kj ), which is what we set out to prove.
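A grid-search sketch of this argument for an illustrative linear demand Q(p) = 1 − p (not from the notes): with (K1, K2) = (0.3, 0.3), which lies in region 1 since γ_0(0.3) = (1 − 0.3)/2 = 0.35 ≥ 0.3, firm 1's best response to p∗ = P(K1 + K2) is p∗ itself:

```python
def region1_best_response(K1, K2, grid=2000):
    """Stage-2 pricing when capacities lie in region 1, with the illustrative
    linear demand Q(p) = 1 - p and efficient rationing. Verifies by grid
    search that p* = P(K1 + K2) is firm 1's best response when firm 2
    names p*."""
    Q = lambda p: max(1.0 - p, 0.0)
    p_star = 1.0 - (K1 + K2)  # market-clearing price P(K1 + K2)

    def profit1(p1):
        if p1 < p_star - 1e-12:        # undercut: demand exceeds capacity
            return p1 * min(K1, Q(p1))
        if abs(p1 - p_star) <= 1e-12:  # tie: each firm sells its capacity
            return p1 * K1
        # price above the rival: serve residual demand Q(p1) - K2
        return p1 * min(K1, max(Q(p1) - K2, 0.0))

    best = max((i / grid for i in range(grid + 1)), key=profit1)
    return p_star, best

p_star, best = region1_best_response(0.3, 0.3)
print(p_star, best)  # the grid-search best response coincides with p*
```

Undercutting sells no extra units (the firm is already at capacity), and raising price runs into the residual-demand problem solved in the text, whose unconstrained optimum γ_0(K2) lies above K1 in region 1.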
Remark: This argument, by itself, proves the theorem for a class of games. Consider
what happens when c gets large:
(ii) We can rule out larger and larger sets of possible first stage choices through dominance.
Specifically, let π m ≡ maxq qP (q) (monopoly profits assuming zero costs). Plainly,
firm i will never choose any Ki such that cKi > π m . Consequently, we can rule out
all Ki > π m /c. As c rises, this rules out more capacity choices.
Once all surviving capacity pairs lie in region 1, every stage-2 subgame yields the market-clearing outcome, so the first stage is just Cournot. This game is said to have an exact Cournot reduced form; its outcome must be the Cournot quantities, (qc, qc).
Of course, for lower values of c, one must explicitly consider the possibility of ending up in
region 2. The analysis is more difficult.
(i) For all capacity pairs in region 2, there exists a mixed strategy equilibrium (possibly pure).

(ii) Within region 2, each firm's expected profit is decreasing in its own capacity.

(iii) Expected profits change continuously as one moves across the boundary from region 1 to region 2.
The first half of the theorem (existence of a Cournot-equivalent SPNE) follows directly
from these properties. Graphically:
We already know that deviations within region 1 (q1 ≤ qs, the boundary capacity) do not make firm 1 better off. Since expected profits for firm 1 are decreasing in K1 within region 2, and since expected profits change continuously as one moves across the boundary from region 1 to region 2, any q1 > qs yields strictly lower profits than qs. Therefore, qc is the optimal response to qc.
Note: This game (unlike the one with high c after deleting dominated strategies) does
not have an exact Cournot reduced form. Choices of capacity in region 2 do not
yield reduced form Cournot payoffs. However, the game nevertheless gives rise to the
Cournot outcome.
Consider the case of random rationing (high value consumers are no more likely than anyone
else to buy from the low-price firm). This is considered in Davidson and Deneckere,
RAND Journal of Economics, 1986.
The residual demand curve is steeper because, even with a higher price, a firm still at-
tracts some of the high value customers who are not as discouraged by the high price.
Graphically:
Steeper residual demand implies that there is a greater benefit to raising price. This can destroy the original equilibrium.
However, even with this modification, the model yields: (i) equilibria with strictly positive
profits, and (ii) downward sloping first-stage reduced form reaction functions (that
is, strategic substitutabilities). Thus, the model retains the essential features of the
Cournot solution.
Suppose that firms cannot observe each other's capacities. This is equivalent to saying that they choose capacity and price simultaneously (even though capacity is a longer-run decision). In that case, one can show that the equilibria yield zero profits, like Bertrand.
3. Caution: although this is a justification for studying Cournot equilibria, the justification
is specific to this particular model. This is not a general justification for Cournot.
5.3 Price competition with endogenous product characteristics
Framework:
Two firms
Payoff for a type θ consumer: 0 if no purchase, v − p − t(x − θ)2 if purchases a type x good
at price p.
Timing of decisions:
Stage 1: The firms simultaneously select product characteristics a and b (by convention,
we can always label the firms so that a ≤ 1 − b).
Stage 2: Having observed a and b, the firms simultaneously select prices. Firms produce
output to meet demand at a cost of c per unit.
Analysis:
We have previously solved for equilibrium conditional upon fixed (a, b). This is the equi-
librium that will prevail in each subgame.
From our previous analysis, we know that, for any given vector (p1, p2, a, b), firm 1's profits are given by

π1(p1, p2, a, b) = (p1 − c) θ̂(p1, p2, a, b, t)

where θ̂, the location of the consumer who is indifferent between the two firms, is

θ̂(p1, p2, a, b, t) = (p2 − p1) / (2t(1 − a − b)) + (1 + a − b) / 2
Now let’s consider firm 1’s optimal choice of a in stage 1. Note that
∂π 1
Firm 1’s first order condition tells us that ∂p1
= 0 when evaluated at p∗1 . One can evaluate
the last two terms in the preceding expression using the formulas reproduced above.
After some algebra, one obtains:
∙ ¸
dπ1 ∗ 3a + b + 1
= − (p1 − c) <0
da 6(1 − a − b)
With firm 1 to the left of firm 2 (a < 1 − b), this implies that firm 1 wishes to reduce a
(move further to the left). Consequently, firm 1’s optimal choice in stage 1 is to set
a = 0, regardless of b. A symmetric argument implies that firm 2 sets b = 1. Thus,
the equilibrium involves the maximum possible product differentiation.
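One can verify the sign of dπ1/da numerically (a sketch, not from the notes, with t = 1, c = 0, and the standard stage-2 equilibrium prices p1 = c + t(1 − a − b)(1 + (a − b)/3), p2 = c + t(1 − a − b)(1 − (a − b)/3) for this quadratic transport-cost model):

```python
def hotelling_profit1(a, b, t=1.0, c=0.0):
    """Firm 1's reduced-form (stage-1) profit at locations a and 1 - b,
    using the standard stage-2 equilibrium prices for quadratic
    transportation costs."""
    p1 = c + t * (1 - a - b) * (1 + (a - b) / 3.0)
    p2 = c + t * (1 - a - b) * (1 - (a - b) / 3.0)
    # Location of the indifferent consumer
    theta_hat = (p2 - p1) / (2 * t * (1 - a - b)) + (1 + a - b) / 2.0
    return (p1 - c) * theta_hat

# Numerical derivative with respect to a at (a, b) = (0.2, 0.3):
h = 1e-6
dpi_da = (hotelling_profit1(0.2 + h, 0.3) - hotelling_profit1(0.2 - h, 0.3)) / (2 * h)
print(dpi_da)  # negative: firm 1 gains by moving left, toward a = 0
```

The finite-difference value matches the closed form −(p1∗ − c)(3a + b + 1)/(6(1 − a − b)) at these parameters, confirming the maximum-differentiation logic.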
Welfare analysis: It is easy to check that locations 1/4 and 3/4 (that is, a = b = 1/4) maximize social surplus. Conclude that competition produces excessive product differentiation.
Remarks:
(i) Price competition plays a central role in producing this result. Price competition
becomes less severe when the firms are positioned further away from each other. Firms
endogenously select a highly differentiated configuration to minimize price competition.
The associated prediction is that firms will seek niches in markets.
Illustration of the role of price competition: Suppose that the government regulates the industry, setting a price p such that v − p > t (all buyers are willing to travel the length of the interval to purchase the good). This is a pure location problem. There is a unique equilibrium: a = 1 − b = 1/2 (no product differentiation).
Exercise: Solve the two-stage model for the case where transportation costs are given by
t |x − θ|.
5.4 Entry
5.4.1 Homogeneous goods
Stage 1: Each potential firm decides whether it is in the industry, or out of the industry.
If it chooses to be in the industry, it pays a cost K > 0.
Stage 2: Having observed the number of actual competitors, firms play some market game
(e.g. Cournot or Bertrand).
From the perspective of stage 2, the entry cost is known as a sunk cost. The distinguishing
features of a sunk cost are: (1) it is incurred before the commercial success or failure
of the enterprise is established, and (2) it is non-recoverable.
SPNE: This game has a finite number of stages, so we solve it by backward induction.
Let π(N) denote the (expected) profits to each firm in the equilibrium that prevails
with N active firms (assume uniqueness for simplicity).
Now consider the first stage. SPNE requires N^e firms to elect in, where

π(N^e) ≥ K and π(N^e + 1) ≤ K
[Figure: per-firm profit π(N) under Cournot and under Bertrand, plotted against the number of firms N = 1, . . . , 5.]
Remarks
(i) N^e = 1 for Bertrand, which gives rise to the monopoly outcome for all K > 0.
(ii) As long as π(N) is decreasing in N, a decline in K (weakly) increases the number of firms.
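Both remarks can be illustrated with an assumed linear specification (illustrative, not from the notes): P(Q) = a − Q and constant unit cost c, so that per-firm Cournot profit is π(N) = ((a − c)/(N + 1))².

```python
def pi(N, a=10.0, c=2.0):
    """Per-firm Cournot (gross) profit with N symmetric firms, P(Q) = a - Q."""
    return ((a - c) / (N + 1)) ** 2

def price(N, a=10.0, c=2.0):
    return (a + N * c) / (N + 1)

def free_entry_N(K):
    """Largest N with pi(N) >= K >= pi(N + 1)."""
    N = 1
    while pi(N + 1) > K:          # one more entrant would still cover K
        N += 1
    return N

for K in [4.0, 1.0, 0.25, 0.02]:
    N = free_entry_N(K)
    print(K, N, round(price(N), 3))   # N^e rises and price falls toward c as K falls
```

With these illustrative numbers, N^e rises from 2 to 55 as K falls from 4 to 0.02, and the equilibrium price declines toward marginal cost.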
This provides a foundation for various results concerning convergence to competitive equi-
librium.
Example: Cournot. Recall the condition for aggregate equilibrium quantity in the symmetric case:

P(Q) = c′(Q/N^e) − (1/N^e) P′(Q) Q

The second term vanishes as N goes to infinity, so price converges to marginal cost.
Similarly, one can hold K fixed and let the market get large: Q(p) = αQ0 (p), with α → ∞.
Note: The convergence result does not hold for Bertrand competition.
(iv) Suppose that π is also a function of some parameter θ, π(N, θ). Suppose that π is strictly increasing in θ for all N. Then an increase in θ weakly increases N^e.
Suppose that θ indexes demand — that an increase in θ shifts the demand curve out without
changing the elasticity of demand. For a fixed N, this will generally increase π. Thus,
N^e will rise with θ. But then, it might well be that price falls. For an example, consider the Cournot game. The equilibrium price-cost margin is 1/(Nε). This does not change with θ. However, larger θ increases profits and induces more entry. With larger N^e, the equilibrium price-cost margin (and therefore the equilibrium price) is lower. Thus, we have a situation where an increase in demand causes a decrease in price.
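This comparative static can be reproduced in the assumed linear Cournot sketch: scale demand by θ, Q(p) = θ(a − p), so that π(N, θ) = θ((a − c)/(N + 1))² while the equilibrium price (a + Nc)/(N + 1) depends only on N.

```python
def pi(N, theta, a=10.0, c=2.0):
    """Per-firm Cournot profit when demand is scaled: Q(p) = theta * (a - p)."""
    return theta * ((a - c) / (N + 1)) ** 2

def price(N, a=10.0, c=2.0):
    return (a + N * c) / (N + 1)       # depends on N only, not on theta

def free_entry_N(theta, K=1.0):
    N = 1
    while pi(N + 1, theta) > K:
        N += 1
    return N

for theta in [1.0, 4.0, 16.0]:
    N = free_entry_N(theta)
    print(theta, N, round(price(N), 3))   # more demand -> more entry -> lower price
```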
Suppose that θ indexes “conduct” — e.g. θ = 0 corresponds to Bertrand, while θ = 1
corresponds to Cournot (higher values of θ are less competitive in the sense that they
lead to higher equilibrium profits). A less competitive environment — higher θ — leads
to greater entry (higher N^e). With less competitive conduct, one can thereby end
up with more competitive solutions (compare the two-stage Cournot and Bertrand
outcomes above).
Welfare analysis
Is the level of equilibrium entry efficient? Clearly, it’s not efficient in the first-best sense
if the continuation equilibrium involves a wedge between price and marginal cost. Is
it second-best efficient? That is, if an omniscient planner could specify the number
of firms, but could not prescribe prices, how would her choices compare with the
equilibrium outcome? (Arguably, this is what merger policy entails.)
Assume:
(i) d/dN [N q(N)] > 0 (aggregate quantity rises with N), and
(ii) d/dN q(N) < 0 (per-firm quantity falls with N).
Proof:
W′(N) = P[Nq(N)][Nq′(N) + q(N)] − c(q(N)) − Nc′(q(N))q′(N) − K
= π(N) − K + Nq′(N)(P[Nq(N)] − c′(q(N)))
< π(N) − K

(the inequality follows because q′(N) < 0 and price exceeds marginal cost).
Area A is the gross-of-K social gain from adding the N ∗ -th firm. Obviously, K < A <
A + B + C. But A + B + C = π(N ∗ − 1), so π(N ∗ − 1) > K. Consequently, there
must be at least N ∗ − 1 firms in equilibrium.
Intuition: Entry is associated with an externality. Firms steal business from each other.
The cost of entry, K, is a social cost, but not all of the gains are social gains — much
is just a redistribution of surplus.
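The business-stealing comparison can be sketched in the same assumed linear Cournot example: a second-best planner picks N (firms then play Cournot), while free entry picks N^e.

```python
a, c, K = 10.0, 2.0, 1.0

def Q(N):  return N * (a - c) / (N + 1)               # aggregate Cournot quantity
def pi(N): return ((a - c) / (N + 1)) ** 2            # per-firm gross profit
def W(N):  return 0.5 * Q(N) ** 2 + N * pi(N) - N * K # CS + profits - entry costs

N_e = 1
while pi(N_e + 1) > K:                                # free entry
    N_e += 1
N_star = max(range(1, 50), key=W)                     # second-best planner's choice

print(N_e, N_star)    # 6 3: free entry is excessive, and N_e >= N_star - 1
```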
Same setting, but now assume the cost K is incurred in stage 2, and only if the firm
produces a strictly positive quantity. (If the entrant actually has to pay K in stage 1,
this investment is recoverable.)
Here, the cost of entry is a fixed cost of production (it is only incurred if the firm decides
to produce something).
Consider the following special case: Bertrand competition with linear variable costs, c(q) =
cq (so that total costs are given by cq + K for q > 0, and zero for q = 0). Assume
also that there is some quantity that a monopolist could produce profitably (i.e. the
demand curve lies somewhere above the average cost curve).
Claim: In equilibrium, all output is sold at the price p∗ at which the demand curve crosses the average cost curve, and a single firm serves the market.
Reason: One cannot have an equilibrium in which any output is sold at a price above p∗; otherwise, a firm could enter, set a lower price, and earn strictly positive profits. Plainly, one cannot have an equilibrium in which output is sold at a lower price, since the seller would be losing money. Finally, if two firms set this price and one subsequently produces all of the output, then both earn zero profits, and no firm can earn strictly positive profits by deviating to any other price.
Graphically:
Interpretation: There is a single firm. It breaks even, setting prices equal to average
costs. This is called a contestable market (Baumol, Panzar, and Willig).
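A numerical sketch of the zero-profit price (assuming linear demand Q(p) = a − p, variable cost cq, and fixed cost K; the parameter values are illustrative):

```python
a, c, K = 10.0, 2.0, 4.0                 # linear demand Q(p) = a - p, fixed cost K

def profit(p):
    return (p - c) * max(a - p, 0.0) - K

# lowest break-even price: bisect between c (negative profit) and the
# monopoly price (positive profit)
lo, hi = c, (a + c) / 2
for _ in range(100):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if profit(mid) < 0 else (lo, mid)
p_star = hi

print(round(p_star, 4))                             # price = average cost: c + K/Q(p*)
print(round(p_star - (c + K / (a - p_star)), 8))    # ~0: the active firm breaks even
```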
Comparison with outcome in the presence of sunk entry costs: In both instances,
only one firm is active. However, with fixed costs, the active firm earns zero profits.
With sunk costs, the active firm earns monopoly profits.
Source of the difference: sinking entry costs alerts the other firm to the presence of a
competitor, and the firm can react to entry. With fixed costs, entry is “hit and run” —
competitors have no opportunity to react to entry. Entry is more difficult in the case
of sunk costs because prices are strategic complements — the ability to react to entry
means that the entrant faces a more aggressively competitive environment.
Welfare analysis
Here, the planner can control both the number of firms and their outputs/prices. However,
the problem is second-best because the planner must guarantee each firm non-negative
profits.
Solution: As long as P (Q) > c, it is socially beneficial to increase quantity (marginal social
benefits exceed marginal social costs). In a first-best exercise, one would increase
quantity to Q0 , but this is ruled out in the second-best problem. The non-negative
profits constraint binds at Q∗ ; hence Q∗ is the second-best optimum.
Conclusion: Equilibrium in the entry problem with fixed costs (and no sunk costs) achieves
the second-best outcome. The result extends to more complicated settings (e.g. mul-
tiproduct firms).
5.4.2 Entry with horizontal product differentiation
Framework:
Payoff for a type θ consumer: 0 if no purchase, v − p − t(x − θ)² if she purchases a type x good at price p.
Timing of decisions:
Stage 1: Firms choose either to stay out, or to enter with some particular product char-
acteristic, x. Entrants incur the sunk cost K.
Stage 2: Having observed the set of entrants and associated product characteristics, the
operating firms simultaneously select prices. Firms produce output to meet demand
at a cost of c per unit.
Question: What happens in the limit as K goes to zero? Without product differentiation,
we know that we obtain the monopoly outcome for all K > 0. Does this change with
product differentiation?
Theorem: Consider any interval [θ1, θ2]. Let r(θ1, θ2) denote the (subgame perfect Nash) equilibrium probability that at least one firm enters with a product characteristic in this interval (x ∈ [θ1, θ2]). Then lim_{K→0} r(θ1, θ2) = 1.
Remarks:
(i) The theorem applies to all possible equilibria, both pure and mixed (which is why the
result is stated in terms of probability). Note that existence is guaranteed.
(ii) The theorem implies that, in the limit, firms blanket the product space. This in
turn implies that price converges to c (check the formulas for equilibrium price in the
two-firm case — as a approaches 1 − b, equilibrium price declines to c).
Proof of the theorem: Assume that the theorem is false. Then, for some θ1 , θ2 , and
ε > 0, there exists a sequence Kn → 0 such that limn→∞ r (θ1 , θ2 ) < 1 − ε. For any
n, consider a firm that chooses not to enter in equilibrium (thereby earning a profit of
zero). Consider the following strategy as a potential deviation:
Stage 1: Enter with x′ = (θ1 + θ2)/2.
Stage 2: In all subgames, set p′ = c + (t/2)((θ2 − θ1)/2)².
We will calculate a lower bound on the payoff resulting from adoption of this strategy.
If at least one other firm enters with product type in [θ1 , θ2 ], then sales and variable profits
for the deviating entrant are non-negative. This occurs with probability no greater
than 1 − ε.
With probability of at least ε, no other firm enters with product type in [θ1 , θ2 ]. For all
such cases, the entrant does at least as well as when other firms enter with product
characteristics of both θ1 and θ2 , in each instance charging p = c. We calculate the
entrant’s variable profits for this worst-case scenario. The entrant will make sales to
all customers in the interval [x′ − z, x′ + z], where z is given by the solution to

v − c − t(θ1 − (x′ − z))² = v − p′ − tz²

which yields z = (θ2 − θ1)/8, so that the entrant's variable profits are

2z(p′ − c) = (t/32)(θ2 − θ1)³ > 0
Consequently, the entrant's expected profits (before observing others' entry and product characteristic decisions) are bounded below by ε(t/32)(θ2 − θ1)³. For K sufficiently small, the entrant's net expected profits, ε(t/32)(θ2 − θ1)³ − K, are strictly positive. Since the entrant has a profitable deviation, the system could not have been in equilibrium — a contradiction. Q.E.D.
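The bound used in the proof can be checked numerically for illustrative parameter values:

```python
t, c = 2.0, 1.0
theta1, theta2 = 0.2, 0.6
x_dev = (theta1 + theta2) / 2                        # deviant's location x'
p_dev = c + (t / 2) * ((theta2 - theta1) / 2) ** 2   # deviant's price p'

def gap(z):
    """Deviant's utility minus the worst-case rival's (located at theta1,
    priced at c) for the consumer at x' - z; linear and decreasing in z."""
    return (-p_dev - t * z ** 2) - (-c - t * (theta1 - (x_dev - z)) ** 2)

lo, hi = 0.0, (theta2 - theta1) / 2
for _ in range(80):                                  # bisection for gap(z) = 0
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if gap(mid) > 0 else (lo, mid)
z = 0.5 * (lo + hi)

profit = 2 * z * (p_dev - c)
print(round(z, 6), round(profit, 6))   # z = (theta2 - theta1)/8, profit = (t/32)(theta2 - theta1)^3
```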
5.4.3 Entry with vertical product differentiation
Framework:
Each consumer purchases either one unit of one of these goods, or nothing.
Consumers are characterized by a preference parameter, θ, which indicates the value at-
tached to quality. If a consumer of type θ purchases a good of quality v at price p, her
utility is given by
u(θ, v, p) = θv − p
Timing of decisions:
Stage 1: Firms choose either to stay out, or to enter with some particular product quality,
v. Entrants incur the sunk cost K.
Stage 2: Having observed the set of entrants and associated product qualities, the oper-
ating firms simultaneously select prices. Firms produce output to meet demand at a
cost of c per unit (costs are invariant with respect to quality).
Question: What happens in the limit as K goes to zero? Without product differentiation, we know that we obtain the monopoly outcome for all K > 0. With horizontal product differentiation, one obtains convergence to the competitive outcome. What happens with vertical differentiation?
Theorem: Assume that θ̄ < 2θ̲ (where the taste parameter θ is distributed on [θ̲, θ̄]). For all K > 0 there exists an SPNE wherein a single firm enters with quality vH, and earns monopoly profits.
Remarks:
(i) This result coincides with the outcome for no product differentiation, and contrasts
with the outcome for horizontal product differentiation. Here, one does not obtain
convergence to perfect competition in the limit.
(ii) One can also demonstrate that this is the unique outcome for this game.
Proof:
For convenience, designate the single firm that enters in equilibrium as firm 1. Consider any
subgame in which firm 1 enters with v1 = vH , and some other firm (for convenience,
“firm 2”) enters with vL ≤ v2 ≤ vH . We claim that there exists a continuation
equilibrium in which firm 2 earns zero profits. For v2 = vH , this follows directly from
the usual Bertrand argument (in equilibrium, p = c, and neither firm earns variable
profits). For v2 < vH , the claim follows from the theorem presented in section 2.6.2.
Now we prove the theorem by constructing an SPNE with the desired properties. Strategies are as follows. In the first stage, firm 1 enters with quality vH and all other firms stay
out. In the second stage, if firm 1 is the only active firm, it sets the monopoly price
(conditional on the quality of its product). If firm 1 is active along with one other firm,
and if firm 1’s quality is vH , then the firms play the continuation equilibrium described
in the previous paragraph. For any other configuration of entrants and qualities, select
some arbitrary continuation equilibrium.
We claim that this is, in fact, an SPNE. By construction, strategies give rise to an equilibrium in the second stage for every first stage outcome. Thus, we need only
check that first stage choices are optimal. Since firm 1 earns monopoly profits in
equilibrium, it is obvious that firm 1 does not have a profitable deviation, regardless
of how one selects the continuation outcomes in subgames where v1 ≠ vH. Given firm
1’s strategy and the continuation equilibria described above, any other firm would,
after entering, earn zero variable profits, irrespective of its product quality. Since
K > 0, this implies that the entrant would receive negative profits. Thus, entry is not
a profitable deviation. Q.E.D.
Two dimensions of generalization: (i) unit production costs are c(v), with c′(v) > 0, and (ii) θ̄ − θ̲ may be large.
Consider the following hypothetical: All products are offered, with p(v) = c(v) for all v.
What would consumers purchase?
Possibility #1: All consumers purchase the good with quality vH . (Note: c(v) = c,
the case considered above, assures this). Then, as K → 0, the number of firms in
equilibrium has a finite bound, and firms earn positive profits.
Possibility #2: Consumers with higher values of θ would purchase goods with higher v. Provided c″(v) > 0, one can show that, for the hypothetical, the set of purchased qualities will be some interval [v′, v″]. The resulting behavior then resembles the
outcome obtained with horizontal product differentiation. As K → 0, firms blanket
the interval, and the outcome converges to perfect competition.
5.4.4 Monopolistic competition
Motivation: One can also think about endogenous entry in models with non-spatially dif-
ferentiated products. One advantage: in spatial models, there are two dimensions to
product variety: the number of products, and the type of products. With appropri-
ate non-spatial models, one can summarize variety simply by the number of products.
This makes it easier to compare equilibrium diversity with optimal diversity.
1. Products are distinct, and each firm faces a downward sloping demand curve.
2. The decisions of any given firm have negligible effects on any other firm.
A formalization:
Numeraire good, y. Utility (implicit in the first order conditions below) is u = y + g(Σ_{j=1}^N f(xj)).
Maximizing subject to the budget constraint y + Σ_{i=1}^N pi xi = I yields the following first order conditions:

g′(Σ_{j=1}^N f(xj)) f′(xi) = pi
Assume that firms have “U-shaped” production cost functions c(xi). Profits are given by:

pi xi − c(xi) = g′(Σ_{j=1}^N f(xj)) f′(xi) xi − c(xi)
Note: For condition (ii), the firm takes the value of g′(N∗ f(x∗)) ≡ K as a fixed parameter,
acting as if it faces the inverse demand function Kf 0 (x) = p. Justification: firms
are small relative to the market, so their individual effects on K are negligible, even
though their collective effects on K are substantial. For this reason, a monopolistically
competitive equilibrium is not a Nash equilibrium — it does not fit into the framework
of equilibrium analysis developed in this course.
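A worked example with assumed functional forms (illustrative choices, not from the notes): g(z) = ln(1 + z), f(x) = 2√x, and c(x) = F + cx. Each firm takes the market index g′(Nf(x)) as fixed (written k below, to avoid a clash with the entry cost K), so it faces inverse demand p = k f′(x); the symmetric equilibrium then solves the firm's first order condition, zero profits (free entry), and consistency of k.

```python
import math

# F = fixed cost, c = slope of variable cost: c(x) = F + c*x has U-shaped
# average cost.  Assumed forms: f(x) = 2*sqrt(x), g(z) = ln(1 + z).
F, c = 0.04, 0.01

k = 2 * math.sqrt(c * F)              # solves the FOC and zero-profit conditions jointly
x = F / c                             # per-firm output
N = (1 / k - 1) / (2 * math.sqrt(x))  # from k = g'(N f(x)) = 1 / (1 + N f(x))

p = k / math.sqrt(x)                  # price = k f'(x)
profit = p * x - (F + c * x)          # zero in equilibrium (free entry)
foc = k / (2 * math.sqrt(x)) - c      # d/dx [k f'(x) x - c(x)] = 0

print(round(N, 6), round(profit, 12), round(foc, 12))
```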
Chamberlin’s work led to the conventional view that monopolistic competition is inefficient
because firms hold excess capacity, or equivalently that it is inefficient because there
is too much variety. Reason: firms produce at an inefficiently low scale.
That argument is generally incorrect. It may be socially desirable to produce a good even
if there is insufficient demand to produce it at minimally efficient scale.
The relation between equilibrium variety and optimal variety is dictated by two factors:
(i) When a product is added, the revenues generated fall short of incremental consumer
surplus because the firm can’t perfectly price discriminate. This creates a bias toward
too little entry.
(ii) Firms don’t take into account the effect of introducing a product on the profits of
others. If the goods are substitutes, this creates a bias toward too much entry (we
have already seen this in a setting with homogeneous products). If the goods are
complements, it creates a bias toward too little entry (e.g. nuts and bolts).
If goods are complements, these effects are reinforcing, and there is too little variety relative
to the social optimum. If goods are substitutes, the effect can go in either direction.
Caveat: specific conclusions here depend upon the notion of social optimality that is used.
A natural second-best problem requires the planner to consider only allocations in
which each producer breaks even. The first-best problem would ignore this constraint.
5.5 Entry deterrence
Theme: Actions taken by an incumbent firm prior to the arrival of an entrant may alter
competitive conditions in a way that makes entry less attractive. The potential to
discourage entrants may distort incumbents’ choices away from the decisions they
would make either if they were not facing entry, or if entry were certain. For this
to occur, decisions taken prior to entry must have lasting effects. Such effects will
exist if decisions represent long-term commitments.
Observation: Certain activities, such as predation and limit pricing, do not commit the
firm to future actions. Consequently, they cannot act as entry deterrents in the sense
considered here. One needs alternative theories (e.g. reputation in models of in-
complete information) to explain why certain actions can deter entry even without
commitment.
5.5.1 Product differentiation
Product selection
Model: Consider the standard Hotelling spatial location model with two firms (an incum-
bent and an entrant), product characteristics on the unit interval, quadratic trans-
portation costs, and a uniform distribution of consumers.
Stage 1: The incumbent selects its product characteristic, xI.
Stage 2: Having observed xI, the entrant selects either in or out; if in, the entrant must select product characteristic xE and incur an entry cost K.
Stage 3: Having observed entry decisions and product characteristics, the incumbent and
the entrant simultaneously choose prices.
Let Πi (xI , xE ) denote reduced form profits (having solved for the stage 3 equilibrium, as
we did earlier in the course).
Observations:
(3) xI = 1/2 also minimizes ΠE(xI, 1) for xI ∈ [0, 1/2].
Conclusions:
(2) If ΠE(1/2, 1) < K, the incumbent will set xI = 1/2, thereby preempting entry.
Notice that the incumbent acts differently than it would if entry occurred with certainty. However, it acts the same as it would if no entry occurred with certainty. This is an example of a situation in which entry is blockaded.
Product proliferation
Stage 1: The incumbent selects products: pI ∈ {A, B, AB, N} (N for none). Incurs setup
cost Ki if it chooses to produce product i.
Stage 2: Having observed pI , the entrant selects either in or out; if in, the entrant must
select products pE ∈ {A, B, AB, N}, and incurs an entry cost Ki if it chooses to
produce product i.
Stage 3: Having observed entry decisions and product characteristics, the incumbent and
the entrant simultaneously choose prices (Bertrand). Production costs are ci per unit
for good i (same for both firms).
Note: if both firms produce any good, profits earned from sales of that good are zero
π0i : profits earned from sales of good i when only good i is produced
πci : profits earned from sales of good i when good j is produced competitively (p = c)
πdi : profits earned from sales of good i when a different firm produces each good
πm : profits earned from sales of both goods when the same firm produces both goods
Assumptions:
Solution:
Stage 2:
If pI = N, then pE = A; I earns 0
Implication:
If entry were not possible, the incumbent would produce only one product. If entry were certain, the incumbent would produce at most one product. Here, the incumbent chooses to proliferate products because this represents a commitment that deters entry.
Remark: One can also make this point in the context of a spatial location model. The
incumbent has an incentive to crowd the product space to eliminate niches.
Issue: Even though product characteristics may be somewhat fixed in the short-run, it may
be possible to withdraw products quickly and at relatively low cost. Moreover, the
incentives for withdrawal are greater for an incumbent than for an entrant, because the
incumbent captures some of the lost business through other imperfectly substitutable
products. Consequently, product proliferation may not be a credible commitment.
This issue is taken up in the paper by Ken Judd. We can study it in the context of
the preceding model by adding a stage 2.5 (between stages 2 and 3), in which the
incumbent and the entrant are both allowed to withdraw products at a cost Wi for
product i.
Exercise: show that, as long as Wi is not too large, the only subgame perfect equilibrium involves I producing product A, and E producing product B (assuming πdA − KA > πdB − KB).
Conclusion: product proliferation only deters entry if it is a firm commitment, and cannot
be reversed at low cost.
5.5.2 Capacity
Motivation: Can incumbent firms deter entry by holding extra capacity? In particular,
will they hold excess (unused) capacity? Example: Alcoa decision (1945).
Model:
Two firms: an incumbent (firm 1) and a potential entrant (firm 2) producing a homogeneous
product.
Each firm will make investments in capacity, ki . Each unit of capacity for firm i costs ri .
Ultimately, firms will also choose quantities (Cournot), subject to the constraint that qi ≤
ki . Production costs wi per unit.
Sequence of decisions:
Stage 1: The incumbent installs an initial capacity, k1^0, paying r1 per unit of capacity.
Stage 2: The entrant chooses in or out. If in, the entrant pays the cost K.
Stage 3: The firms simultaneously choose capacity, ki, and quantity, qi, subject to the constraints k1 ≥ max{k1^0, q1} and k2 ≥ q2.
Central questions: (1) Does the fact that the incumbent can put k1^0 in place and pay for it in advance change anything? (2) Will the incumbent install capacity that it does not ultimately use?
Intuition: By installing capacity in advance, a portion of what would otherwise be part of marginal cost in stage 3 (the capacity cost, r1 per unit) is converted to a sunk cost, thereby reducing the marginal cost that the incumbent perceives in stage 3. This makes the incumbent play more aggressively.
Since quantities are strategic substitutes, it also makes the entrant play less aggressively
in stage 3, which benefits the incumbent at the expense of the entrant. Thus, the
incumbent has an incentive to invest in capacity preemptively. When the effect on
the entrant is sufficiently strong, preemptive investment may deter entry.
We solve for subgame perfect equilibria recursively, beginning in stage 3 and working back
to the start of the game.
Observation: the best response function for each firm depends only on the quantity of the
other firm, and not on the other firm’s capacity.
Let γw(q) denote firm 1's best response function when its marginal cost is w1, and γwr(q) its best response function when its marginal cost is w1 + r1.
Consider firm 2. Clearly, this firm would always choose q2 = k2. Consequently, it acts as if it is a Cournot competitor to firm 1, with marginal cost w2 + r2, and best response function γ2(q1). Putting these together:
Let

A ≡ (qL, γ2(qL))
B ≡ (qH, γ2(qH))

The stage 3 equilibrium quantities are then:

A if k1^0 ≤ qL
B if k1^0 ≥ qH
(k1^0, γ2(k1^0)) if k1^0 ∈ (qL, qH)

As one varies k1^0, one maps out all of the points on γ2(q1) between A and B.
Let qi∗(k1^0) be the equilibrium quantity for firm i given k1^0, and let πi∗(k1^0) denote associated profits.
Notice that q2∗(k1^0) and π2∗(k1^0) are both decreasing in k1^0.
Stage 2: Firm 2 enters if π2∗(k1^0) > K, remains out if π2∗(k1^0) < K, and is indifferent for the case of equality.
Case (i): π2∗(Mwr) < K. The monopolist achieves the overall monopoly outcome by installing capacity Mwr (the monopoly quantity at marginal cost w1 + r1). Entry is blockaded. There is no unused capacity.
Case (ii): π2∗(qH) > K. This means that it is impossible to deter entry. The incumbent selects k1^0 to achieve its most-preferred point on the segment AB. There is no unused capacity.
Case (iii): π2∗(qH) ≤ K ≤ π2∗(Mwr). Define Z as the solution to π2∗(Z) = K. To deter entry, the incumbent must set k1^0 ≥ Z. Since Z > Mwr, and since the profit function is concave in quantity, the optimal entry deterrence strategy is to set k1^0 = Z. Entry is deterred, and the incumbent ends up producing Z (since Z < Mw); there is no unused capacity.
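These possibilities can be illustrated with an assumed linear specification, P(Q) = a − Q, with symmetric unit production cost w and unit capacity cost r (parameters chosen so that entry is neither blockaded nor impossible to deter, and the incumbent holds extra capacity Z between Mwr and qH):

```python
import math

a, w, r, K = 10.0, 2.0, 2.0, 2.0            # illustrative parameters
c2 = w + r                                   # entrant's full marginal cost

qL = (a - 2 * (w + r) + c2) / 3              # Cournot q1 when capacity cost is still marginal
qH = (a - 2 * w + c2) / 3                    # Cournot q1 when capacity is sunk
Mwr = (a - w - r) / 2                        # monopoly quantity at cost w + r

def pi2(k1):
    """Entrant's variable profit: firm 1 produces clamp(k1), firm 2 best-responds."""
    q1 = min(max(k1, qL), qH)
    return ((a - c2 - q1) / 2) ** 2          # P - c2 = q2 at firm 2's best response

Z = a - c2 - 2 * math.sqrt(K)                # solves pi2(Z) = K on [qL, qH]

pi1_deter = (a - Z - w - r) * Z              # incumbent alone, producing Z
grid = [qL + i * (qH - qL) / 1000 for i in range(1001)]
pi1_accom = max((a - q - (a - c2 - q) / 2 - w - r) * q for q in grid)

print(round(Z, 3), round(pi1_deter, 3), round(pi1_accom, 3))
```

With these numbers Z exceeds Mwr (the incumbent holds extra, but not unused, capacity) and deterrence profits exceed accommodation profits.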
The incumbent compares the profits earned from this optimal deterrence strategy with the profits earned from the optimal accommodation strategy, which involves setting k1^0 to achieve its most-preferred point on the segment AB.
Note that K affects the profits earned from optimal deterrence, but does not affect the
profits from optimal accommodation. Larger K makes deterrence relatively more
attractive. Graphically:
If K is very large, then Z < Mwr , and entry is blockaded. For intermediate values of K,
Z ∈ (Mwr , T ), and the incumbent holds extra capacity to deter entry. For small values
of K, either Z does not exist (deterrence is impossible), or Z > T , so the incumbent
accommodates entry.
Conclusions:
(1) The incumbent may deter entry by building more capacity than it would either if entry were impossible, or if it occurred with certainty. Intuition: an investment in capacity lowers costs, thereby making the incumbent more aggressive if faced with entry.
(2) The incumbent does not, however, build unused (excess) capacity. Intuition: quanti-
ties are strategic substitutes. We know that the incumbent will exhaust capacity if
entry occurs. If entry does not occur, rival’s quantity is lower (0), and therefore the
incumbent’s optimal choice of quantity is at least as great.
Observation: Manipulation of capacity by an entrant may also enhance the feasibility of
entry. Intuition: by limiting its own capacity, the entrant reduces the threat to the
incumbent, and thereby reduces the incumbent’s incentives to drive it from the market.
Timing of decisions:
Stage 1: The entrant chooses in or out; if in, it pays a setup cost K, names its capacity,
kE , and a price, pE . Capacity is costless.
Stage 2: The incumbent names a price pI . In the event of ties, consumers resolve their
indifference in favor of the incumbent.
Stage 2 solution: Define Πu(pE) ≡ max_{p≤pE} (p − cI)Q(p), the incumbent's profits from undercutting the entrant, and ΠA(pE, kE) ≡ max_{p≥pE} (p − cI)[Q(p) − kE], its profits from accommodating (serving residual demand).
If Πu(pE) > ΠA(pE, kE), the incumbent undercuts the entrant, setting p = arg max_{p≤pE} (p − cI)Q(p), and the entrant sells nothing.
If Πu(pE) < ΠA(pE, kE), the incumbent sets p = arg max_{p≥pE} (p − cI)[Q(p) − kE], which is a higher price than the entrant's, and the entrant sells kE.
Stage 1 solution:
To make positive sales, the entrant must select (pE , kE ) subject to the constraint that
Πu (pE ) ≤ ΠA (pE , kE ).
Note that a low value of kE increases ΠA (pE , kE ), while a low value of pE reduces Πu (pE ).
Thus, the entrant must restrict capacity and set a relatively low price. Note that the
constraint is always satisfied for kE and pE sufficiently small, so the constraint set is
non-empty.
The entrant solves max_{pE,kE} (pE − cE)kE subject to Πu(pE) ≤ ΠA(pE, kE). If K is sufficiently small, entry occurs.
Intuition: Prices are strategic complements. Capacity makes the entrant play aggres-
sively, inducing the incumbent to play aggressively, which reduces the entrant’s profits.
Consequently, the entrant weakens itself (by establishing less capacity) in order to in-
duce the incumbent to be less aggressive.
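The entrant's constrained problem can be sketched with a grid search under an assumed linear demand Q(p) = a − p (illustrative parameters; as noted above, the constraint set is non-empty for small pE and kE):

```python
a, cI, cE = 10.0, 2.0, 2.0               # assumed linear demand Q(p) = a - p

def pi_u(pE):
    """Incumbent's best profit from undercutting (p <= pE, entrant sells 0)."""
    p = min(pE, (a + cI) / 2)
    return (p - cI) * (a - p)

def pi_a(pE, kE):
    """Incumbent's best profit from accommodating: p >= pE on residual demand."""
    p = max(pE, (a - kE + cI) / 2)
    return (p - cI) * max(a - p - kE, 0.0)

best = (0.0, 0.0, 0.0)                    # (entrant profit, pE, kE)
steps = 400
for i in range(1, steps):
    pE = cE + i * ((a + cI) / 2 - cE) / steps
    for j in range(1, steps):
        kE = j * (a - cE) / steps
        if pi_u(pE) <= pi_a(pE, kE):      # incumbent prefers to accommodate
            v = (pE - cE) * kE
            if v > best[0]:
                best = (v, pE, kE)

v, pE, kE = best
print(round(v, 3), round(pE, 3), round(kE, 3))   # small kE, pE well below monopoly
```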
5.5.3 Exclusionary contracts
Motivation: Recent high-profile antitrust actions raise important questions about the po-
tential anticompetitive effects of exclusionary relationships between entities operating
at different levels of a production process.
1. Exclusive dealing: Contractual relation between two firms in a vertical chain, wherein
one firm agrees to buy only from, or sell only to the other.
2. Tying: A firm is said to engage in tying when it makes the sale (or price) of one of its
products conditional upon the purchase of some other product. Includes requirements
contracts and bundling.
The “Chicago view”: There is only one monopoly rent to be had. If a firm extracts rents
through prices, it can’t also extract rent through exclusivity.
Model:
An incumbent seller, S, with unit cost cS; a potential entrant, E, with unit cost cE < cS and entry cost F; and a single buyer, B.
B's demand is given by x(p) (derived from quasi-linear utility, so that consumer surplus, denoted z(p), is a valid welfare measure).
Sequence of decisions:
Stage 1: S offers B a payment t in exchange for an exclusive arrangement.
Stage 2: B accepts or rejects the offer.
Stage 3: E decides whether to enter the market. Prior to entry, contracts with E are impossible.
Stage 4: S and (possibly) E set prices, and B determines purchases. Renegotiation of any contract between S and B is not allowed.
Solution:
Stage 4: If E has not entered or if B and S have an exclusive deal, S sets the monopoly
price p∗ ; B receives z(p∗ ) + t, S receives π ∗ − t (where t = 0 if there is no exclusive
arrangement). E receives 0 if it has not entered, and −F if it has entered.
If E has entered and if B and S do not have an exclusive deal, E and S set p = cS , and E
makes all sales; B receives z(cS ), E receives x(cS ) (cS − cE ) − F > 0, and S earns 0.
Conclusion: the parties can enter an exclusive arrangement only if π∗ ≥ z(cS) − z(p∗). But this condition is never satisfied, so exclusion never occurs.
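A quick numerical check with an assumed linear demand x(p) = a − p: S's gain from exclusion is π∗ = (a − cS)²/4, while B's required compensation is z(cS) − z(p∗) = 3(a − cS)²/8, which is strictly larger.

```python
a, cS = 10.0, 4.0                   # assumed linear demand x(p) = a - p

def z(p):                           # consumer surplus at price p
    return 0.5 * (a - p) ** 2

p_star = (a + cS) / 2               # S's monopoly price
pi_star = (p_star - cS) * (a - p_star)

print(pi_star, z(cS) - z(p_star))   # 9.0 < 13.5: the buyer's loss exceeds S's gain
```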
Intuition: With more than one buyer, the decision of one buyer to enter into an exclusive
arrangement with the seller confers negative externalities on the other buyers, which
the first buyer ignores. Reciprocal externalities lead to suboptimal decisions on the
part of the buyers.
Model:
Exactly the same as the preceding model, except that there are two identical buyers, B1
and B2 .
Assume that the buyers' demands are independent of each other's purchases (in particular, they are not competitors in a downstream market).
Assume that the entrant is more efficient at serving both buyers: 2(cS − cE )x(cS ) > F .
However, assume that the entrant is less efficient at serving only one buyer: (cS −cE )x(cS ) <
F.
In stage 1, the seller makes separate (and potentially different) offers to B1 and B2 .
Solution:
Stage 4: If E has not entered, or if E has entered and both buyers have signed exclusive deals, S sets the monopoly price p∗; Bi receives z(p∗) + ti, S receives 2π∗ − t1 − t2 (where ti = 0 for any buyer without an exclusive arrangement). E receives 0 if it has not entered and −F if it has.
If E has entered, buyer i has signed an exclusive deal, and buyer j has not, then S charges i the monopoly price p∗, earning π∗ − ti; Bi receives z(p∗) + ti; E makes all sales to Bj at the price cS, earning (cS − cE)x(cS) − F; and Bj receives z(cS).
If E has entered and if neither buyer has signed an exclusive deal, E and S set p = cS , and
E makes all sales; Bi receives z(cS ), E receives 2x(cS ) (cS − cE ) − F > 0, and S earns
0.
Stage 3: E enters iff neither buyer has signed an exclusive deal.
If ti > z(cS) − z(p∗), Bi accepts the exclusive offer regardless of what she expects j to do.
If ti = z(cS) − z(p∗), Bi accepts the exclusive offer if she expects j to accept, and is indifferent if she expects j to reject.
If ti ∈ (0, z(cS) − z(p∗)), Bi accepts the exclusive offer if and only if she expects j to accept.
If ti = 0, Bi rejects the exclusive offer if she expects j to reject, and is indifferent if she expects j to accept.
Note: for some (t1, t2), there are multiple continuation equilibria. Any equilibrium involves some selection from these. Let T denote the set of offers (t1, t2) for which both buyers accept in the selected continuation equilibrium.
Stage 1: The best exclusive strategy for S is to pick (t1, t2) ∈ T to minimize t1 + t2. For simplicity, imagine that we have picked the continuation equilibria so that the minimum exists. If tM ≡ min_{(t1,t2)∈T} t1 + t2 > 2π∗, then no exclusion occurs. If tM ≤ 2π∗, then there is an exclusive equilibrium; with strict inequality, one necessarily gets exclusion.
Reason: For all ε > 0, (z(cS ) − z(p∗ ) + ε, 0) ∈ T . The resulting payoff for S is 2π ∗ −
(z(cS ) − z(p∗ ) + ε). Under the preceding condition, this is strictly positive for small
ε. This is equivalent to tM < 2π ∗ .
Notes: (i) The value of ti is indeterminate. In fact, for any t ∈ [0, min{2π∗, z(cS) − z(p∗)}], there is an equilibrium in which S pays t (in total) for exclusivity.
(ii) In an exclusive equilibrium for which ti < z(cS ) − z(p∗ ) for i = 1, 2, there is also a non-
exclusive continuation equilibrium for period 2 onward. The non-exclusive outcome
might be considered more plausible, in that the buyers might coordinate their choices.
A modified model:
Assume that S approaches B1 and B2 sequentially, with B1 responding to S’s offer before
S makes an offer to B2 .
Claim: Provided that 2π ∗ > z(cS ) − z(p∗ ), the only subgame perfect equilibrium involves
S purchasing exclusivity from B1 for nothing, thereby excluding E at no cost.
If S has not signed an exclusive deal with B1, then S will successfully offer B2 the amount z(cS) − z(p∗) in return for exclusivity (while B2 is indifferent about accepting, one must resolve B2's choice in favor of accepting if S is to have a best choice in this subgame). As a result, B1's payoff will be z(p∗).
B1 knows that, if it accepts an offer, its payoff will be z(p∗) + t1, and if it rejects, its payoff will be z(p∗). Hence, it is willing to accept for all t1 ≥ 0. For S's optimum to be well defined, we must resolve B1's indifference in favor of acceptance when t1 = 0.
A model of tying
Production:
Firm I is a monopolist for X, and produces output in this market at unit cost cX.
Firm i ∈ {I, E} produces output in the Y market at unit cost ciY.
Demand:
Each consumer buys either 0 or 1 unit of X, and values this unit at w (same for all
consumers).
Each consumer buys either 0 or 1 unit of Y, and attaches a value vi to this unit if it is Yi
(provided that he is not also consuming Yj).
Consumers differ in their valuations of the Y products, and there is some distribution F of
(vI , vE ) across the population. We normalize the total population to unity.
Utility is given by the sum of valuations for the consumed goods, minus the total price
paid.
Sequence of decisions:
Stage 3: Bertrand competition. Firms name prices simultaneously (if both are present).
Firm E, if present, names price pE . If firm I has not tied its products together, it
names a price q for X, and a price pI for YI . If I has tied its products together, it
names a price r for the bundle.
Solution:
A consumer buys Yi rather than Yj or nothing iff vi − pi > max{vj − pj, 0}. Let
Gi(pI, pE) = {(vI, vE) : vi − pi > max{vj − pj, 0}}, and
yi(pI, pE) = ∫_{Gi(pI,pE)} dF
Continuation equilibrium:
I sets q = w.
A consumer buys the bundle from I, rather than YE or nothing, iff w + vI − r > max{vE −
pE, 0}, or equivalently,
r − w < min{pE + vI − vE, vI}
Similarly, a consumer buys the good from E, rather than the bundle or nothing, iff vE − pE >
max{w + vI − r, 0}, or equivalently,
pE < min{r − w + vE − vI, vE}
Compare these with the corresponding conditions in the absence of a tie:
pi < min{pj + vi − vj, vi}
Notice that r − w has simply taken the place of pI .
This implies that demand for Yi is given by yi (r − w, pE ), where yi is the same function as
before.
I chooses r to solve
max_{pI} (pI − cIY + (w − cX)) yI(pI, pE)
where pI ≡ r − w.
Note that this is the same as in the case without a tie, except for the presence of the w − cX
term.
Intuitively, the margin from selling YI is effectively larger because YI is tied to the sale of
a profitable good (X). This has the same effect as reducing the cost of YI.
Note that both I’s profits and E’s profits are lower than if I does not tie.
(iii) E has not entered. Regardless of whether I has tied, I sets the monopoly price.
Stage 2: E enters if anticipated profits are sufficient to cover entry costs, K. Notice
that anticipated profits are strictly higher if I has not tied. Consequently, there are
circumstances under which E will not enter if I has tied, but will enter if I has not
tied.
(i) E will not enter whether or not I ties. Then I is indifferent between tying and not
tying.
(ii) E will enter whether or not I ties. Then I will not tie.
(iii) E enters if I does not tie, but does not enter if I does tie. Then I will tie, and this
will deter entry.
Note: I must be able to commit to a tie. Otherwise, when E entered, I would always
abandon the tie.
Environment:
Sequence of decisions:
After each offer, the non-offering party either accepts or rejects.
If she accepts, the pie is divided as per the offer, and the game ends.
The game may continue forever (infinite horizon version), or end after a finite number of
rounds (finite horizon version)
Payoffs:
If the parties expect division x to be implemented in the t-th round, party 1 receives a
payoff of δ1^t x, and party 2 receives a payoff of δ2^t (1 − x).
Period T − 1: Given the continuation equilibrium (which gives 1 a payoff of unity one period
hence), 1 will not accept any x < δ1. Consequently, 2 proposes δ1 and 1 accepts.
Exercise: Solve for the SP NE of a T -period bargaining model first for T even, and then
for T odd. Show that the payoffs from these equilibria converge to a common limit
as T → ∞. Solve for the limiting payoffs.
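The exercise can be checked numerically by backward induction. A sketch, with illustrative discount factors δ1 = 0.9, δ2 = 0.8 (an assumption; any values in (0, 1) work):

```python
def proposer_value(T, deltas=(0.9, 0.8)):
    """Period-1 proposer's SPNE share of the pie in the T-period game.
    deltas[i] is player (i+1)'s discount factor; player 1 proposes in odd periods."""
    v = 1.0                        # in the final period, the proposer takes everything
    for t in range(T - 1, 0, -1):  # fold back to period 1
        responder = t % 2          # odd t: player 1 proposes, player 2 (index 1) responds
        v = 1.0 - deltas[responder] * v   # offer the responder its discounted continuation
    return v

# Both even and odd horizons converge to the Rubinstein share (1 - d2)/(1 - d1*d2):
limit = (1 - 0.8) / (1 - 0.9 * 0.8)
assert abs(proposer_value(200) - limit) < 1e-9
assert abs(proposer_value(201) - limit) < 1e-9
```

For T = 1 the function returns 1 (the proposer keeps the whole pie), and for T = 2 it returns 1 − δ2, matching the two-period logic in the text.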
Solution of the infinite horizon version:
When receiving an offer, player i accepts any share greater than δ i (1 − δ j )/(1 − δ i δ j ), and
rejects any smaller share.
These strategies constitute a SP NE. Moreover, it is the unique SP NE outcome for this
game.
Remarks:
(i) Uniqueness is surprising, given other results on infinite horizon games (next section)
(iii) When discount factors are equal, the proposer receives (1 − δ)/(1 − δ^2) = 1/(1 + δ) > 1/2.
Note that, as δ → 1 (the parties become increasingly patient), equilibrium shares
converge to 1/2.
First, imagine that player i is making the offer. If i demands any share less than (1 −
δj)/(1 − δiδj), the offer will be accepted, and i's payoff will be lower than in the
equilibrium. If i demands any share greater than (1 − δj)/(1 − δiδj), the offer will be
rejected, and in the next period i will receive 1 − (1 − δi)/(1 − δiδj). Discounted to
the period in which i makes the offer, this is equivalent to
δi [1 − (1 − δi)/(1 − δiδj)] = δi^2 (1 − δj)/(1 − δiδj) < (1 − δj)/(1 − δiδj)
Thus, the deviation yields a lower payoff for i.
Second, imagine that player i is receiving the offer. If i rejects an offer, i will receive a
payoff of (1 − δj )/(1 − δ i δ j ) in the following period. Thus, it is optimal for i to accept
offers of at least δ i (1 − δj )/(1 − δ i δ j ), and to reject lower offers.
Let vL (i, j) and vH (i, j) denote, respectively, the lowest and highest SP NE continuation
payoffs to player i when player j makes the current offer. Since the infinite horizon
problem is stationary, these functions do not depend on the period t.
Strategy of proof: we will show that, for each (i, j) (including i = j), vL (i, j) = vH (i, j),
and we will solve for this common value.
(i) For i ≠ j,
vL(i, i) ≥ 1 − δj vH(j, j)
Reason: j gets no more than vH(j, j) in the continuation, and will therefore accept δj vH(j, j)
today.
(ii) For i ≠ j,
vH(j, i) ≤ δj vH(j, j)
This follows from (i) and vL(i, i) + vH(j, i) ≤ 1. Intuitively, j gets no more than vH(j, j)
in the continuation. Player i therefore does not need to offer j more than δj vH(j, j)
today.
(iii) For i ≠ j,
vH(i, i) ≤ max{1 − δj vL(j, j), δi vH(i, j)}
Reason: At this point, we don't know whether vH(i, i) is achieved by making an offer that
is accepted, or one that is rejected.
Assume for the moment that it is achieved by making an offer that is accepted. Since
player j gets a discounted continuation payoff of at least δ j vL (j, j), j will reject any x
such that 1 − x ≤ δ j vL (j, j). Thus, for this case, vH (i, i) ≤ 1 − δ j vL (j, j) .
Now assume that vH (i, i) is achieved by making an offer that is rejected. Then the best
i can hope for is the best outcome for i when j makes the offer one period hence:
δ i vH (i, j).
(iv) Combining (ii) (with the roles of i and j reversed) and (iii),
vH(i, i) ≤ max{1 − δj vL(j, j), δi^2 vH(i, i)}
(v) vH(i, i) ≤ 1 − δj vL(j, j).
Assume not. Then, by (iv), vH(i, i) ≤ δi^2 vH(i, i), which implies vH(i, i) ≤ 0. But, since
neither δj nor vL(j, j) can exceed unity, we then have 1 − δj vL(j, j) ≥ 0 ≥ δi^2 vH(i, i), a
contradiction.
Rearranging yields
vL(i, i) ≥ (1 − δj)/(1 − δjδi)
Rearranging yields
vH(i, i) ≤ (1 − δj)/(1 − δjδi)
This follows because i can always refuse j's offer, receiving at least vL(i, i) in the following
period. Combining this with (vi) yields
vL(i, j) ≥ δi (1 − δj)/(1 − δjδi)
6 Repeated Games with Complete Information
Motivating example: The Prisoners’ dilemma, again.
Cooperation may seem more plausible. One reason: participants may expect to interact
more in the future. If you “Fink” today, your opponent may retaliate in the future.
Simple device for getting at these issues: imagine that the game is repeated.
Two important cases: (i) potentially infinite repetitions, (ii) finite repetitions.
Terminology: The game played in each period is called the stage game. The dynamic
game formed by infinite repetitions of a stage game is called a supergame.
Observations:
(i) Even if the stage game is finite, the associated supergame is not.
Evaluating payoffs:
What we need: a mapping from strategy profiles into expected payoffs. (For finite games,
this is given by the composition of the mapping from strategy profiles into distributions
over terminal nodes, with the mapping from terminal nodes to payoffs.)
For repeated games, one can assume that payoffs are distributed immediately after each
play of the stage game. Then any strategy profile maps to a distribution of paths
through the game tree, and any path through the game tree maps to a sequence of
payoffs for each player i, vi = (vi (1), vi (2), ...).
Remaining issue: We need a mapping from strategy profiles to scalar payoffs for each player.
How do we get from payoff sequences to scalar payoffs?
General answer: assume that players have utility functions mapping sequences of payoffs
into utility: ui (vi )
(ii) For the case of no discounting, we can use the average payoff criterion:
ui(vi) = lim_{T→∞} (1/T) Σ_{t=1}^{T} vi(t)
Strategies:
The nature of strategies will depend upon assumptions about what is observed each time
the game is played.
For the time being, we will assume that, each time the stage game is played, all players can
observe all previous choices.
With this assumption, each sequence of choices up to (but not including) period t corre-
sponds to a separate information set (for each player) in period t. Consequently, we
proceed as follows.
Let a(t) = (a1 (t), ..., aI (t)) be the profile of actions chosen in period t
A t-history, h(t), is a sequence of action profiles (a(1), ..., a(t − 1)), summarizing everything
that has occurred prior to period t.
Since, by assumption, h(t) is observed by all players, there is, for each player, a one-to-one
correspondence between t-histories and period t information sets.
Consequently, a strategy is a mapping from all values of t ∈ {1, 2, ...} and all possible
t−histories to period t actions (for period 1, the set of 1-histories is degenerate).
Example of a strategy: σi^N(t, h(t)) ≡ F for all t, h(t).
Note: this repeats the equilibrium of the stage game in every period.
Claim: If players use the average payoff criterion, (σ1^N, σ2^N) is a Nash equilibrium.
Demonstration: Check to see whether a player can gain by deviating from this strategy,
given that his opponent plays this strategy.
If the player sticks to the strategy, the sequence of payoffs will be vi (t) = 1 for all t. The
average payoff is 1.
Conclusion: Repeating the equilibrium of the stage game is a Nash equilibrium for the
supergame.
Remark: The same proposition obviously holds with discounting, and without discounting
using the overtaking criterion (a sequence of ones always beats a sequence of ones and
zeros).
Exercise: Prove that this point is completely general (it holds for all stage games).
Question: Can we get anything other than repetitions of the stage game equilibrium?
With these strategies, the game would unfold as follows: Both players would play NF
forever. If any player ever deviated from this path, then subsequently both players
would play F forever.
Claim: If players use the average payoff criterion to evaluate payoffs, this is a Nash equi-
librium.
Demonstration: Check to see whether a player can gain by deviating from this strategy,
given that his opponent plays this strategy.
If the player sticks to the strategy, the sequence of payoffs will be vi (t) = 3 for all t. The
average payoff is 3.
Now consider a deviation to some other strategy. Let t0 be the first period t for which
this strategy dictates playing F when h(t) does not contain an F (if there is no such
t0 , then the deviation also generates an average payoff of 3). If the player deviates to
this strategy, the sequence of payoffs will be
vi(t) = 3 for t < t0
vi(t) = 4 for t = t0
vi(t) ≤ 1 for t > t0
(For t > t0, this follows because the opponent's strategy will always dictate playing F.)
The associated average payoff is not larger than 1, and therefore certainly less than 3.
Consequently, this is a Nash equilibrium.
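A quick numerical sketch of the average-payoff comparison (a large finite T stands in for the limit):

```python
def average(head, tail_value, T=10**6):
    """Average payoff of a sequence beginning with `head`, equal to tail_value thereafter."""
    return (sum(head) + tail_value * (T - len(head))) / T

follow = average([], 3)     # stick with the strategy: payoff 3 every period
deviate = average([4], 1)   # fink at t0 = 1: 4 once, then at most 1 forever
assert follow == 3.0
assert deviate < 3.0        # the one-period gain washes out of the average
```

Any finite prefix of payoffs is irrelevant under the average criterion, which is why only the tail values 3 and 1 matter.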
(ii) We cannot necessarily claim that the players will cooperate in this way, because there
are many other Nash equilibria.
Illustration: For all t, let h∗(t) be the t-history such that (i) a1(t′) = F for t′ < t odd,
and a1(t′) = NF for t′ < t even; (ii) a2(t′) = NF for all t′ < t. In words: player 1 has
alternated between NF and F, while player 2 has always chosen NF.
σ2(t, h(t)) = NF if h(t) = h∗(t); F otherwise
In other words, player 1 alternates between NF and F , while player 2 always plays NF
(the result is h∗ (∞)). However, if either deviates from this path, both subsequently
play F forever.
Claim: When players use the average payoff criterion, there is also a Nash equilibrium
wherein both players select the preceding strategy.
Demonstration: As long as no player deviates, player 1’s payoffs alternate between 3 and
4, while 2’s payoffs alternate between 3 and 0. Average payoffs are 3.5 for player 1,
and 1.5 for player 2.
Now imagine that player i considers deviating to some other strategy. If this deviation has
any effect on the path of outcomes (and hence on payoffs), there must be some period
t0 in which the actions taken diverge from h∗(∞). Assuming that player j sticks with
its equilibrium strategy, vi(t) ≤ 1 for t > t0. Consequently, player i's average payoff is
at most 1. This is less than the payoff received by either player in equilibrium.
Some definitions
First, we identify the set of feasible payoffs (including things that can be achieved through
arbitrary randomizations).
Let C = {w | w is in the convex hull of payoff vectors from pure strategy profiles in the
stage game}
Remark: C is potentially larger than the set of payoffs achievable through mixed strategies
in the stage game, since we allow for correlations.
For our example (the prisoners’ dilemma):
Can anything in C occur in equilibrium as an average payoff? No. Each player can assure
himself of a payoff of at least unity each period by playing F all of the time. Therefore,
we know that no player can get a payoff smaller than unity.
πi^m = min_{(δ1,...,δi−1,δi+1,...,δI)} max_{δi} πi(δ)
This is the average payoff that player i can assure himself, simply by making a best response
to what everyone else is supposed to do (according to their strategies) in every period.
Player i cannot receive an average payoff less than πi^m in equilibrium.
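For the prisoners' dilemma stage game used in this section (3 for mutual NF, 4 and 0 for unilateral finking, 1 for mutual F), the minmax payoff can be computed by brute force over the punisher's mixed strategies:

```python
import numpy as np

# Payoffs to player 1; rows: player 1 plays NF, F; cols: player 2 plays NF, F.
payoff1 = np.array([[3.0, 0.0],
                    [4.0, 1.0]])

# Player 2 mixes with prob(NF) = q; compute player 1's best-response payoff for each q.
qs = np.linspace(0.0, 1.0, 1001)
br_payoffs = [max(payoff1 @ np.array([q, 1.0 - q])) for q in qs]
minmax1 = min(br_payoffs)          # most severe punishment: player 2 plays F (q = 0)
assert abs(minmax1 - 1.0) < 1e-9   # player 1 can always guarantee the (F, F) payoff
```

Here F strictly dominates NF for the punished player, so the best response always yields 1 + 3q, minimized at q = 0.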
D is called the set of feasible and individually rational payoffs.
Question: How does E compare with D? It’s reasonably clear that E isn’t larger, but
can it be smaller?
The folk theorem: Consider a supergame formed by repeating a finite stage game an infi-
nite number of times. Suppose that players use the average payoff criterion to evaluate
outcomes. Then E = D.
Step 2: Consider any sequence of actions h0 (∞) yielding average payoffs w ∈ D. Then
there exists a pure strategy Nash equilibrium for the supergame where this sequence
of actions is taken on the equilibrium path. We show this by construction.
Consider the following strategies. If play through period t−1 has conformed to h0 (t), players
continue to follow h0 (∞) in period t. If play has not conformed to h0 (t), inspect the
actual history h(t) to find the first lone deviator (in other words, ignore any period in
which there are multiple deviators). If no lone deviator exists in any period prior to
t, then revert to following h0(∞) in period t. If the first lone deviator is i, then all
j ≠ i play
δ−i^m = arg min_{(δ1,...,δi−1,δi+1,...,δI)} max_{δi} πi(δ)
It is easy to check that this is a Nash equilibrium. If all players choose their equilibrium
strategies, the outcome is h0(∞), and the average payoff for i is wi ≥ πi^m (since w ∈ D
by assumption). If player i deviates, then, assuming all others play their equilibrium
strategies, i will be the first lone deviator, and subsequently can do no better than πi^m in
any period. This means that i's average payoff will be no greater than πi^m. Thus, the
deviation is unprofitable.
Step 3: For all w ∈ D, there exists a sequence of actions yielding average payoffs of w.
Idea: alternate actions to produce the same frequencies as the randomization. This is easy
if the randomization involves rational frequencies. If it involves irrational frequencies,
one varies the frequency in the sequence to achieve the right frequency in the limit.
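The alternation idea in Step 3 can be sketched for a rational-weight target: to achieve the average (2/3)·v(a) + (1/3)·v(b), cycle through the block (a, a, b) forever (payoff values here are illustrative):

```python
va, vb = 3.0, 1.0                 # illustrative stage payoffs from actions a and b
block = [va, va, vb]              # frequencies 2/3 and 1/3
T = 3 * 10**5                     # a multiple of the block length
avg = sum(block[t % 3] for t in range(T)) / T
assert abs(avg - (2 * va + vb) / 3) < 1e-9   # average matches the randomization
```

Irrational frequencies require the limiting-frequency construction described in the text rather than a fixed finite block.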
Interpretation of the folk theorem: (i) Anything can happen. Comparative statics are
problematic.
(ii) The inability to write binding contracts is not very damaging. Anything attainable
through a contract is also obtainable through a self-enforcing agreement, at least with
no discounting. The equilibrium that gets played is determined by a process of ne-
gotiation. It is natural to expect players to settle on some self-enforcing agreement
that achieves the efficient frontier. The precise location may depend upon bargaining
strengths.
Remark: We can think of δ as the product of a pure time-preference discount factor, ρ, and a
continuation probability, λ (measuring the probability of continuing the game in period
t + 1, conditional upon having reached t): δ = ρλ. In particular, assume that, if the
game ends, subsequent payoffs are zero (this is just a normalization). Let T be the
realized horizon of the game. Then expected payoffs are
Σ_{k=1}^{∞} prob(T = k) [ Σ_{t=1}^{k} ρ^(t−1) vi(t) ]
= Σ_{k=1}^{∞} λ^(k−1)(1 − λ) [ Σ_{t=1}^{k} ρ^(t−1) vi(t) ]
= Σ_{k=1}^{∞} Σ_{t=1}^{k} [ λ^(k−1)(1 − λ) ρ^(t−1) vi(t) ]
= Σ_{t=1}^{∞} Σ_{k=t}^{∞} [ λ^(k−1)(1 − λ) ρ^(t−1) vi(t) ]
= Σ_{t=1}^{∞} ρ^(t−1) vi(t) [ (1 − λ) λ^(t−1) Σ_{k=t}^{∞} λ^(k−t) ]
= Σ_{t=1}^{∞} (λρ)^(t−1) vi(t)
The magnitude of δ in any context will depend upon factors such as the frequency of
interaction, detection lags, and interest rates.
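The identity derived above (expected discounted payoff under a random horizon equals discounting at δ = ρλ) can be verified numerically; a sketch with arbitrary payoffs, truncating the horizon distribution at N periods:

```python
import numpy as np

rng = np.random.default_rng(1)
rho, lam, N = 0.95, 0.9, 400
v = rng.random(N)                        # arbitrary payoffs v_i(1..N); zero afterwards
disc = rho ** np.arange(N)               # rho^(t-1)
S = np.cumsum(disc * v)                  # S[k-1] = sum_{t<=k} rho^(t-1) v_i(t)

# E_T[ sum_{t<=T} rho^(t-1) v_i(t) ] with P(T = k) = lam^(k-1) (1 - lam):
probs = (lam ** np.arange(N)) * (1 - lam)
lhs = np.sum(probs * S) + lam ** N * S[-1]     # horizons beyond N all yield S[N]
rhs = np.sum((lam * rho) ** np.arange(N) * v)  # discounting at delta = rho * lam
assert abs(lhs - rhs) < 1e-9
```

The truncation is exact here because payoffs are zero after period N, so every horizon beyond N contributes the same total S[N].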
Imagine again that both players use the following strategies:
In period 1, play NF.
In period t > 1, play F if h(t) contains an F; NF otherwise.
If players discount payoffs at the rate δ, is this a Nash equilibrium?
If the player sticks to the strategy, the sequence of payoffs will be vi(t) = 3 for all t. The
discounted payoff is
Σ_{t=1}^{∞} 3δ^(t−1) = 3/(1 − δ)
Now consider a deviation to some other strategy. Without loss of generality, imagine that
player i deviates to F in period 1. Player i knows that j will play F in all subsequent
periods (since this is dictated by j's strategy). Consequently, it is optimal for i to
play F in all subsequent periods, having deviated in the first. Thus,
vi(t) = 4 for t = 1
vi(t) = 1 for t > 1
The deviation yields a discounted payoff of 4 + δ/(1 − δ), so sticking with the strategy is
weakly preferred iff
3/(1 − δ) ≥ 4 + δ/(1 − δ)
This is equivalent to
δ ≥ 1/3
Thus, the strategies still constitute an equilibrium provided that the players do not discount
the future too much.
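The δ ≥ 1/3 cutoff can be verified directly:

```python
def V_coop(d):  return 3 / (1 - d)          # play NF forever
def V_dev(d):   return 4 + d * 1 / (1 - d)  # fink once, then (F, F) forever

assert abs(V_coop(1/3) - V_dev(1/3)) < 1e-9   # indifferent exactly at delta = 1/3
assert V_coop(0.5) > V_dev(0.5)               # patient: cooperation preferred
assert V_coop(0.2) < V_dev(0.2)               # impatient: deviation preferred
```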
The folk theorem
Notice we can now think of utility as a weighted average of the single period payoffs,
ui(vi) = Σ_{t=1}^{∞} μt vi(t),
where μt = (1 − δ)δ^(t−1), and Σ_{t=1}^{∞} μt = 1.
In this setting, the folk theorem needs to be restated slightly: for any w ∈ D with wi > πi^m
for all i, there exists δ∗ < 1 such that for all δ ∈ (δ∗, 1), w is the payoff vector for some
Nash equilibrium.
6.1.4 Subgame perfect Nash equilibria
The equilibria constructed to establish the folk theorem may not be subgame perfect. Pun-
ishing a player through a minmax strategy profile may not be credible, since the punishers
may suffer. Can we sustain cooperation in SP NE?
Claim: All of the Nash equilibria considered above for the repeated prisoners’ dilemma
are SP NE.
Nash reversion
Formally, consider some arbitrary stage game, as well as the supergame consisting of the
infinitely repeated stage game. Assume that the stage game has at least one Nash equi-
librium. For any particular stage-game Nash equilibrium, a∗ , consider the following
strategies:
σi^N(t, h(t)) = ai∗ for all t, h(t)
Now imagine that we want to support some particular outcome, h∗ (∞), as an equilibrium
path. Let’s try to do this with the following strategies: If play through period t − 1
has conformed to h∗ (t), players continue to follow h∗ (∞) in period t. If play has not
conformed to h∗ (t), then players use σ N .
Claim: If the aforementioned strategies constitute a Nash equilibrium, then the equilibrium
is subgame perfect.
Demonstration: We use precisely the same argument as for the prisoners’ dilemma. All
Nash equilibrium strategies necessarily constitute Nash equilibria in all subgames that
are reached along the equilibrium path. For any subgame off the equilibrium path,
the prescribed strategy profile is σ N . All subgames are identical to the original game,
and σ N is a Nash equilibrium of the original game. Thus, we have a Nash equilibrium
in every subgame.
Implication: When one uses Nash reversion to punish deviations, it is particularly sim-
ple to build SP NE and check subgame perfection: one simply makes sure that the
equilibrium is Nash (equivalently, that no player has an incentive to deviate from a
prescribed choice on the equilibrium path).
Examples:
(ii) Bertrand competition (minmax payoffs and Nash payoffs are both zero)
However, Nash reversion is frequently much less severe than minmax punishments.
Example: Cournot competition (minmax payoffs are 0, while Nash profits are strictly
positive)
Question: Does the validity of the folk theorem depend, in general, on the ability to use
non-credible punishments (at least for stage games with the property that Nash payoffs
exceed minmax)?
Answer: Subject to some technical conditions, one can prove versions of the folk theorem
(with and without discounting) for SP NE. The proofs are considerably more difficult.
Implication: If the stage-game Nash equilibrium payoffs exceed minmax payoffs, then, for
δ sufficiently close to unity, there exist more severe punishments than Nash reversion.
Example:
This game has only one Nash equilibrium: (F , NF ). Note that this gives player 1 the
maximum possible payoffs. It is therefore impossible to force player 1 to do anything
through Nash reversion.
Exercise: For this example, construct a SP NE in which (NF, NF ) is chosen on the
equilibrium path. Either use the average payoff criterion, or assume an appropriate
value for δ.
We will see another explicit example of punishments that are more severe than Nash rever-
sion when we analyze the dynamic Cournot model.
6.2 Finitely repeated games
One might think that finitely repeated games come to look a lot like infinitely repeated games
when the horizon is sufficiently long. This is correct for Nash equilibria (where credibility
is not required), but not for subgame perfect equilibria.
Theorem: Consider any finitely repeated game. Suppose that there is a unique Nash
equilibrium for the stage game. Then there is also a unique SP NE for the repeated
game, consisting of repetitions of the stage game equilibrium.
Demonstration (by induction on T): For T = 1, the repeated game is the stage game,
which has a unique Nash equilibrium.
Now assume the theorem is true for T − 1. Consider the T-times repeated game. All
subgames beginning in the second period simply consist of the (T − 1)-times repeated
game, which, by assumption, has a unique SP NE. Thus, in a SP NE, actions taken in
the first period have no effect on choices in subsequent periods. In equilibrium, first
period choices must therefore be mutual best responses for the stage game. This means
that the first period choices must be the Nash equilibrium choices for the stage game.
Q.E.D.
Remarks:
(i) It is often said that a finitely repeated game “unravels” from the end, much like the
centipede game.
(ii) Cooperation may be possible when the stage game has multiple Nash equilibria.
Example:
There are two Nash equilibria: (b, B) and (c, C). (a, A) is Pareto superior, but it is not a
Nash equilibrium.
Strategies: Play a (A) in the first period. If the outcome in the first period was (a, A),
play b (B) in the second period; otherwise, play c (C).
This is plainly a Nash equilibrium: any other strategy yields a gain of at most 1 unit in the
first period, and a loss of at least 2 in the second period.
Remark: There are folk theorems for finite horizon games formed by repetitions of stage
games that possess multiple equilibria.
6.3 Applications
6.3.1 The repeated Bertrand model
Stage game: N ≥ 2 firms simultaneously select price. Customers purchase from the firm
with the lowest announced price, dividing equally in the event of ties. Quantity pur-
chased is given by a continuous, strictly decreasing function Q(P ). Firms produce with
constant marginal cost c. Let
π(p) ≡ (p − c)Q(p)
Observation: (i) Nash reversion involves setting p = c, which generates 0 profits. This
is also the minmax profit level. Thus, Nash reversion generates the most severe
possible punishment. Anything that can be sustained as an equilibrium outcome can
be sustained using Nash reversion as punishments. Therefore, we can, without loss of
generality, confine attention to equilibria that make use of Nash reversion.
(ii) The static Bertrand solution is unique. Thus, we know that no cooperation can be
sustained in SP NE for finite repetitions. Henceforth, we focus on infinite repetitions.
Analysis of equilibria:
Consider the following h(∞): both firms select some price p∗ ∈ [c, pm ] in every period.
Assuming that players discount future utility, when can we sustain this path as the outcome
of a SP NE?
Given the preceding observation, we answer this question by determining the conditions
under which this outcome can be supported as a Nash equilibrium using Nash reversion.
On the path, each firm earns π(p∗)/N per period; the best deviation (undercutting p∗
slightly) captures essentially the entire market profit π(p∗) once, followed by zero under
Nash reversion. The no-deviation condition (1/(1 − δ)) π(p∗)/N ≥ π(p∗) is equivalent to
1/(1 − δ) ≥ N
Implications:
(i) Cooperation becomes more difficult with more firms. For N = 2, cooperation is sus-
tainable iff δ ≥ 1/2. As N → ∞, the threshold discount factor, 1 − 1/N, converges to
unity.
(iii) There is no longer a sharp discontinuity between one firm and two, as in the static
Bertrand model. However, given (ii), there is still a sharp discontinuity between some
N and N + 1, where the best cooperative equilibrium shifts from monopoly to perfect
competition.
6.3.2 The repeated Cournot model
Stage game: N = 2 firms simultaneously select quantities. The market clearing price is
given by P(Q) = a − bQ. Firms produce with constant marginal cost c.
Let Qm = (a − c)/(2b) denote monopoly quantity, and let πm = (a − c)^2/(4b) denote
monopoly profits. Let qc = (a − c)/(3b) denote Cournot duopoly quantity, and let
πc = (a − c)^2/(9b) denote Cournot duopoly profits (both per firm).
Consider the following h(∞): each firm sets Qm/2 in every period.
Assuming that players discount future utility, when can we sustain this path as the outcome
of a SP NE, using Nash reversion?
Imagine instead that the firm makes a static best response to Qm/2 (this is its best possible
deviation). Best response profits given that the rival plays Qm/2 are (9/64)(a − c)^2/b.
In every subsequent period (after the deviation occurs), the deviating firm earns the
static Cournot profits, πc. The deviation therefore yields discounted profits of
(9/64)(a − c)^2/b + (δ/(1 − δ)) (a − c)^2/(9b)
Comparing this with the discounted collusive profits, (1/(1 − δ)) πm/2, the monopoly
outcome is sustainable iff δ ≥ 9/17.
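The deviation profit and the implied Nash-reversion threshold can be verified with exact rational arithmetic, normalizing a − c = b = 1 (so all profits are in units of (a − c)^2/b):

```python
from fractions import Fraction as F

q_half_m = F(1, 4)                        # Qm/2 with (a - c)/b = 1
q_br = (1 - q_half_m) / 2                 # static best response to Qm/2 = 3/8
assert (1 - q_br - q_half_m) * q_br == F(9, 64)   # deviation profit 9(a-c)^2/(64b)

pi_coop, pi_dev, pi_c = F(1, 8), F(9, 64), F(1, 9)
# Sustainable iff pi_coop/(1-d) >= pi_dev + d*pi_c/(1-d); solving at equality:
d_star = (pi_dev - pi_coop) / (pi_dev - pi_c)
assert d_star == F(9, 17)
```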
Implication: Using Nash reversion, it is easier to get cooperation with Bertrand than with
Cournot. In the static setting, Bertrand is more competitive. Consequently, Nash
reversion punishments are more severe.
Exercise: We know that Cournot profits decline with the number of firms. This means
that, for the repeated Cournot game, Nash reversion punishments become more se-
vere with more firms. Does the preceding "implication" mean that, for Cournot,
cooperation is easier to sustain with more firms? If not, why not?
Remark: The preceding concerns the sustainability of the monopoly outcome. One can
perform a similar calculation for other quantities. In contrast to the Bertrand model,
it turns out that it is easier to sustain less cooperative outcomes (that is, the threshold
discount factors are lower). Indeed, for Cournot, it is possible to sustain some degree
of cooperation (profits in excess of πc) for all δ > 0. This is a consequence of the
envelope theorem: as one reduces quantities starting at the Cournot equilibrium, the
improvement in profits is first-order, but the change in the difference between profits
and best-deviation profits is second-order.
Exercise: For the linear Cournot model, solve for the most profitable symmetric equilib-
rium sustained by Nash reversion, as a function of the discount factor, δ.
Alternative punishments
Motivation: From the folk theorem, it is obvious that more severe punishments may
be available than Nash reversion. In principle, the associated strategies could be
extremely complex, which would make them difficult to analyze.
Under some circumstances, however, it is possible to characterize the most severe pun-
ishments within large classes of strategies, and to show that the associated strategies
have a relatively simple “stick and carrot” structure. We illustrate using the Cournot
model.
Let gi (qi , qj ) denote firm i’s profits when it produces qi and j produces qj . Assume we have
chosen q L and q H so that gi (q L , q L ) > gi (q H , qH ).
(ii) Having defined σ sc (t−1, h(t−1)) for all feasible histories h(t−1), we define σ sc (t, h(t)) as
follows. If qi (t−1) = σ sc (t−1, h(t−1)) for i = 1, 2, then σ sc (t, (h(t − 1), q(t − 1))) = q L .
Otherwise, σ sc (t, (h(t − 1), q(t − 1))) = qH .
In words, the choice between qL and q H is always determined by play in the previous period.
If firms have played their prescribed choices in the previous period, then they play q L .
If one or both deviated in the previous period, they play qH .
When both players select stick-and-carrot strategies, play evolves as follows. On the
equilibrium path, (qL , q L ) is played every period. If a firm deviates in a single period
t − 1, both players play (qH , q H ) in the following period as a punishment, after which
they return to (q L , qL ) forever. Notice that, if a player deviates, this strategy requires
the player to participate in its own punishment in the following period by playing
qH . If it refuses and instead deviates in the punishment period, the punishment is
prolonged. If, on the other hand, it cooperates in its punishment, the punishment
period ends and cooperation is restored. Thus, there is both a “stick” (a one-period
punishment) and a “carrot” (a reward for participating in the punishment). Use of the
carrot can lead players to willingly participate in a very severe one-period punishment.
Given the stationary structure of the game and of the equilibrium, there are only two
deviations to check: from σ sc (t, h(t)) = q L , and from σ sc (t, h(t)) = qH .
From qL we have:
gi(γi(qL), qL) − gi(qL, qL) ≤ δ [gi(qL, qL) − gi(qH, qH)]
From qH we have:
gi(γi(qH), qH) − gi(qH, qH) ≤ δ [gi(qL, qL) − gi(qH, qH)]
Specialize to the case where the "carrot" is the monopoly outcome, and the "stick" is the
competitive outcome (price equal to marginal cost). That is, qL = (a − c)/(4b) and
qH = (a − c)/(2b). Then these expressions can be rewritten as
(a − c)^2/(64b) ≤ δ (a − c)^2/(8b)
(a − c)^2/(16b) ≤ δ (a − c)^2/(8b)
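The two no-deviation conditions can be checked with the same normalization (a − c = b = 1):

```python
from fractions import Fraction as F

g_LL, g_HH = F(1, 8), F(0)          # per-firm profit on the carrot and on the stick
gain_L = F(9, 64) - g_LL            # one-shot gain from cheating on qL = 1/64
gain_H = F(1, 16) - g_HH            # one-shot gain from cheating on qH = 1/16

def sustainable(d):
    future_loss = d * (g_LL - g_HH)  # next period: punished instead of rewarded
    return gain_L <= future_loss and gain_H <= future_loss

assert sustainable(F(1, 2))
assert not sustainable(F(49, 100))   # the binding constraint is delta >= 1/2
```

The stick-phase constraint binds: participating in one's own punishment is the harder incentive to satisfy.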
Implications: Since 1/2 < 9/17, these strategies allow the firms to sustain the monopoly
outcome for a wider range of discount factors than with Nash reversion. Indeed, they
can now achieve the monopoly outcome for the same range of discount factors as with
the infinitely repeated Bertrand model.
Remark: The stick used here yields zero profits for a single period. One can also use more
severe sticks that yield negative profits for a single period. Under some conditions,
this allows one to construct punishments that yield zero discounted payoffs. The firms
are willing to take losses in the short-term because they expect to earn positive profits
in subsequent periods.
Motivation: There is some evidence indicating that oligopoly prices tend to be counter-
cyclical (oligopolists are more prone to enter price wars when demand is strong). If
one thinks in terms of conventional supply and demand curves, this is counterintuitive.
Note: the evidence is controversial.
Insight: The ability to sustain cooperation depends generally on the importance of the
future relative to the present (we saw this with respect to the role of δ). When the
present looms large relative to the future, cooperation is more difficult to sustain. This
is what occurs during booms.
Model:
Demand is random. Each period, one of two states, H or L, is realized. The states are
equally probable, and realizations are independent across periods. Demand for state
i is Qi (p), with QH (p) > QL (p) for all p.
Notation:
Let πk^m denote industry monopoly profits in state k:
πk^m = max_p πk(p)
Equilibrium analysis:
Consider any stationary, symmetric equilibrium path such that both firms select the price
pH in state H and pL in state L.
Construct equilibrium strategies using Nash reversion (here, these are the most severe
possible subgame perfect punishments since they yield zero profits)
Given the stationary structure of the problem and the usual dynamic programming argu-
ment, we need only check to see whether the firms have incentives to make one period
deviations in each state.
For state H:

((N − 1)/N) π_H(p_H) ≤ (δ/(1 − δ)) [ (1/2) π_H(p_H) + (1/2) π_L(p_L) ] (1/N)

For state L:

((N − 1)/N) π_L(p_L) ≤ (δ/(1 − δ)) [ (1/2) π_H(p_H) + (1/2) π_L(p_L) ] (1/N)
Note that the right-hand sides of these expressions are the same, since the future looks
the same irrespective of the current demand state. For any given price, the left-hand
side is greater for the high demand state. Therefore, a given price is more difficult to
sustain for the high demand state than for the low demand state.
Specialized parametric assumptions:

Before going further, we will simplify the model by making some parametric assumptions:

(i) Q_k = θ_k − p

(ii) θ_L = 1 and θ_H = 2

(iii) c = 0

(iv) N = 2

Under these assumptions, π_k(p) = p(θ_k − p), π^m_L = 1/4, p^m_L = 1/2, π^m_H = 1, and p^m_H = 1.
Look back at the constraints. If the constraint is satisfied for monopoly in state H, then it
is also satisfied for monopoly in state L. Therefore, we need only check the constraint
for state H. Substituting, we have
1 ≤ (δ/(1 − δ)) [ (1/2) × 1 + (1/2) × (1/4) ]

This is equivalent to δ ≥ 8/13.

Note: Since 8/13 > 1/2, it is more difficult to sustain full monopoly here than in the Bertrand model with time-invariant demand.
We know that, for δ < 8/13, p^m_H becomes unsustainable for state H. However, the constraint for p^m_L in the low state holds with strict inequality at δ = 8/13. Consequently, one would still expect it to hold for slightly smaller δ.

Proceed as follows: Assume that, for some δ < 8/13, p^m_L is sustainable. Calculate the highest sustainable level of profits for state H.

To compute the highest level of sustainable profits, π_H, for state H under the aforementioned assumption, we substitute into the equilibrium constraint:
π_H ≤ (δ/(1 − δ)) [ (1/2) π_H + (1/2) × (1/4) ]

For the highest sustainable level of state H profits, this constraint holds with equality. Rearranging yields

π^δ_H = δ/(8 − 12δ)

One can check the following:

For δ = 8/13, π^δ_H = 1 = π^m_H

For δ = 1/2, π^δ_H = 1/4 = π^m_L

Thus, as long as δ ∈ [1/2, 8/13], the assumption that π^m_L is sustainable is valid.
Exercise: Verify that, when δ < 1/2, the only SP NE outcome involves repetitions of the static Bertrand outcome (price equal to marginal cost). As in the standard repeated Bertrand model, no cooperation is sustainable for discount factors below 1/2.
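Numerical check: the endpoint calculations above can be verified in exact rational arithmetic (an illustrative sketch under the parametrization θ_L = 1, θ_H = 2, c = 0, N = 2; function names are ours):

```python
from fractions import Fraction

def pi_H_max(delta):
    """Highest sustainable state-H profit, pi_H = delta / (8 - 12*delta),
    assuming p_L^m remains sustainable."""
    return Fraction(delta) / (8 - 12 * Fraction(delta))

# Endpoint checks from the notes:
assert pi_H_max(Fraction(8, 13)) == 1              # full monopoly profit pi_H^m
assert pi_H_max(Fraction(1, 2)) == Fraction(1, 4)  # equals pi_L^m

# The incentive constraint holds with equality at pi_H_max, for any delta:
def slack(delta):
    pi = pi_H_max(delta)
    return Fraction(delta) / (1 - Fraction(delta)) * (pi / 2 + Fraction(1, 8)) - pi

assert slack(Fraction(7, 13)) == 0
print("endpoint checks pass")
```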
Properties of equilibrium:
(ii) Comparison of prices in the two states

For δ ≥ 8/13, p_H = p^m_H > p_L = p^m_L. Prices move pro-cyclically (higher in booms).

For δ = 1/2, π_H = π^m_L. To achieve the same profits in the high demand state as in the low demand state, prices must be lower in the high demand state. Therefore, prices move counter-cyclically.
Conclusion: There is a range of discount factors over which the best sustainable price
moves countercyclically.
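Numerical check: at δ = 1/2, state H profits are capped at π^m_L = 1/4, and the sustainable (lower) price solving p(2 − p) = 1/4 lies below p_L = 1/2 (a sketch under the parametrization θ_L = 1, θ_H = 2):

```python
import math

# Lower root of p*(2 - p) = 1/4, i.e. p^2 - 2p + 1/4 = 0
p_H = 1 - math.sqrt(3) / 2   # sustainable boom-state price
p_L = 0.5                    # monopoly price in state L

print(round(p_H, 4))         # 0.134
print(p_H < p_L)             # True: the boom-state price is lower
```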
6.3.4 Multimarket contact
Motivation: In certain circles, there is a concern that large, conglomerate enterprises are
anticompetitive.
Corwin Edwards: When firms come into contact with each other across many separate
markets (geographic or otherwise), opportunistic behavior in any market is likely to
be met with retaliation in many markets, and this may blunt the edge of competition.
(i) It is correct that, with multimarket contact, deviations may lead to more severe punish-
ments involving larger numbers of markets. However,
(ii) Knowing this, if a firm were to deviate from a cooperative agreement, it would deviate
in all markets. Consequently, it is not obvious that multimarket contact does anything
more than increase the scale of the problem.
It turns out that multimarket contact can facilitate cooperation, but not for the reasons
suggested by Edwards.
Notation:
i denotes firm
k denotes market
G_ik denotes the net gain to firm i from deviating in market k for the current period, for a particular equilibrium

π^c_ik denotes the discounted payoff from continuation (next period forward) for firm i in market k, assuming no deviation from the equilibrium in the current period.

π^p_ik denotes the discounted "punishment" payoff from continuation (next period forward) for firm i in market k, assuming that i deviates from the equilibrium in the current period.

For each i, the pooled incentive constraint is

Σ_k G_ik + δ Σ_k π^p_ik ≤ δ Σ_k π^c_ik
Implication: Multimarket contact pools incentive constraints across markets. This may
enlarge the set of outcomes that satisfies the incentive constraints.
For example, the set {(x, y) | x ≤ 4 and y ≤ 4} is strictly smaller than the set {(x, y) | x + y ≤ 8}.
As it turns out, pooling incentive constraints strictly expands the set of sustainable out-
comes, and in particular improves upon the best cooperative outcome, in a number of
different circumstances. We will study one of them.
Idea: When there is more enforcement power than needed to achieve full cooperation in
one market, the extra enforcement power can be used in another market where full
cooperation is not achievable. We will consider two examples.
Example #1: Differing numbers of firms in each market.
Suppose firms produce homogeneous goods in each market and compete by naming prices
(Bertrand)
Imagine that there are two markets. There are N firms in market 1 and N + 1 firms in
market 2. Moreover,
(N − 1)/N < δ < N/(N + 1)
From our analysis of the infinitely repeated Bertrand problem, we know that the monopoly
price is sustainable for market 1:
Σ_{t=1}^∞ δ^{t−1} [π(p^m_1)/N] > π(p^m_1)
Thus, if single-market firms operate in both markets, market 1 will be monopolized, while
market 2 will be competitive.
Now suppose that N conglomerate firms operate in both markets, and that one single-
market firm operates in market 2. Let 1−α denote the share of market 2 served by the
single-market firm. We will attempt to sustain a cooperative arrangement wherein the
N conglomerate firms divide the remaining share (α) equally. The incentive constraint
for the single-market firm is:
Σ_{t=1}^∞ (1 − α) π(p_2) δ^{t−1} ≥ π(p_2)

This is equivalent to the requirement that α ≤ δ. Thus, the conglomerate firms must cede at least the share 1 − δ to the single-market firm to deter the single-market firm from deviating.
For the conglomerate firms, the incentive constraint becomes
Σ_{t=1}^∞ [π(p^m_1)/N + α π(p_2)/N] δ^{t−1} ≥ π(p^m_1) + π(p_2)

We will attempt to sustain a cooperative arrangement that cedes as little market share to the single-market firm as possible (α = δ). Making this substitution and rearranging, we obtain (after some algebra):

π(p_2) ≤ π(p^m_1) (N/(N + 1)) × (δ − (N − 1)/N) / (N/(N + 1) − δ)
Under our assumptions (cooperation is sustainable in market 1 but not in market 2), the RHS of this inequality is strictly positive. Thus, through multimarket contact, one can always sustain p_2 > c in market 2 without sacrificing profits in market 1. If δ is sufficiently close to N/(N + 1) (cooperation in market 2 is almost sustainable in isolation), one can achieve monopoly profits in market 2. Note that the conglomerate firms must always cede a larger market share to the single-market firm to sustain cooperation.
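Numerical check: the bound above can be evaluated directly, and the sketch below also confirms that the conglomerates' pooled incentive constraint holds with equality at the bound (α = δ; parameter values are illustrative):

```python
from fractions import Fraction

def market2_profit_bound(delta, N, pi1_m):
    """pi(p2) <= pi(p1^m) * (N/(N+1)) * (delta - (N-1)/N) / (N/(N+1) - delta)."""
    num = delta - Fraction(N - 1, N)
    den = Fraction(N, N + 1) - delta
    return pi1_m * Fraction(N, N + 1) * num / den

delta, N, pi1 = Fraction(7, 12), 2, 1   # (N-1)/N = 1/2 < delta < 2/3 = N/(N+1)
pi2 = market2_profit_bound(delta, N, pi1)
print(pi2 > 0)   # True: some cooperation in market 2 is sustainable

# Pooled incentive constraint holds with equality at the bound:
lhs = (Fraction(pi1, N) + delta * pi2 / N) / (1 - delta)
assert lhs == pi1 + pi2
```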
Example #2: Negatively correlated demand shocks.

Consider the model analyzed in the preceding section. Imagine that there are two such markets, and that the same firms operate in both markets. Suppose moreover that the demand shocks in these markets are perfectly negatively correlated (so that there is always a market in state H and a market in state L). Since the markets are symmetric, this means that there is really only one demand state. Pooling incentive constraints across markets, we obtain:

((N − 1)/N) [π_H(p_H) + π_L(p_L)] ≤ (δ/(1 − δ)) [ (1/2) π_H(p_H) + (1/2) π_L(p_L) ] (2/N)
This is equivalent to δ ≥ (N − 1)/N, which is exactly the same as for the simple repeated Bertrand model. For example, with N = 2, we obtain full cooperation in both states for all δ ≥ 1/2.
Remark: As the correlation between the demand shocks rises, the gain to multimarket
contact declines. When the shocks are perfectly positively correlated, there is no gain.
This implies that the potential harm from multimarket contact is greater when the
markets are less closely related.
6.3.5 Price wars with imperfect observability

Motivation: Price wars appear to occur in practice. However, in the standard model, one only obtains price wars off the equilibrium path. This means they happen with probability zero.
Insight: One can generate price wars on the equilibrium path by considering repeated games
in which actions are not perfectly observable. To enforce cooperation, the players must
punish outcomes that are correlated with deviations. But sometimes those outcomes
occur even without deviations, setting off a price war. In that case, the punishments
must be chosen very carefully to assure that the consequences of the occasional war do
not outweigh the benefits of cooperation.
Model:
The firms compete by naming prices (Bertrand); indifferent consumers divide equally be-
tween the firms.
Firms do not observe each other's price choices, even well after the fact.
In each period, a low demand state occurs with probability α; in the low state, consumers purchase nothing.
The firms cannot directly observe the state of demand, even well after the fact.
A firm only observes its own price and the quantity that it sells.
Analysis of equilibrium:
Object: sustain p^m
Problem: if a firm ends up with zero sales, there are two explanations: (i) demand is low,
and (ii) its competitor has deviated. It cannot tell the difference.
To sustain p^m, the equilibrium must punish deviations. Since deviations cannot be distinguished from low demand, the only option is to enter a punishment phase (price war) any time a firm has zero sales.
Key difference from previous models: the trigger for a price war occurs with strictly positive
probability in equilibrium.
Because price wars will actually occur, the firms want the consequences of these wars to be
no more severe than absolutely necessary to sustain cooperation. We no longer use
grim strategies that involve punishing forever.
Charge p^m initially.

As long as firm i has played p^m and made positive sales (or played p > p^m) in all previous periods, it continues to play p^m.

If, in any period t − 1, firm i either deviated to p < p^m or made zero sales, the game enters a punishment phase (of some fixed length T) in period t.
When the punishment phase is over, the strategies reinitialize, treating the first non-
punishment period as if it were the first period of the game.
Value functions:
Let V c denote the expected present value of payoffs from the current period forward when
play is not in a punishment phase.
Let V p denote the expected present value of payoffs from the current period forward in the
first period of a punishment phase.
V^p = Σ_{s=0}^{T−1} (δ^s × 0) + δ^T V^c = δ^T V^c

V^c = (1 − α)(π^m/2 + δV^c) + α(0 + δV^p)

Substituting for V^p in the expression for V^c using the first expression yields:

V^c = (1 − α)(π^m/2 + δV^c) + αδ^{T+1} V^c
Deviations:

Deviations are only relevant when play is not in one of the punishment phases. The best possible deviation is to slightly undercut p^m. The expected present value of the resulting profits is given by

V^d = (1 − α)(π^m + δV^p) + α(0 + δV^p) = (1 − α)π^m + δ^{T+1} V^c
For cooperation to be sustainable, we need V^c ≥ V^d:

V^c ≥ (1 − α)π^m + δ^{T+1} V^c

Now we substitute the expression for V^c derived above. The π^m term cancels; as in the standard repeated Bertrand model, the feasibility of cooperation is all or nothing. Rearranging terms yields the equilibrium condition:

2(1 − α)δ + (2α − 1)δ^{T+1} ≥ 1

Note that the equilibrium condition does not hold for T = 0 (the left-hand side reduces to δ).

An increase in T reduces the absolute value of the second term on the LHS. This can increase the value of the LHS only if 2α − 1 is negative. In that case, the value of the LHS remains bounded below 2δ(1 − α). Consequently, the equilibrium condition holds for some T > 0 if and only if

(i) 2α − 1 < 0

(ii) 2δ(1 − α) > 1
Condition (ii) implies condition (i), so we only need to check (ii). When (ii) is satisfied, cooperation is possible. The best cooperative equilibrium involves the least severe punishments consistent with incentive compatibility. This requires us to pick the smallest value of T satisfying the equilibrium condition.

Condition (ii) can be rewritten as

(ii)′ δ > 1/(2(1 − α))
For the special case of α = 0, this gives δ > 12 , which is the correct answer for the Bertrand
model when there is no observability problem.
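Numerical check: a sketch that searches for the smallest punishment length T satisfying the equilibrium condition (the function name is ours):

```python
from fractions import Fraction

def smallest_T(delta, alpha, T_max=200):
    """Smallest T with 2*(1-alpha)*delta + (2*alpha-1)*delta**(T+1) >= 1, else None."""
    for T in range(T_max + 1):
        if 2 * (1 - alpha) * delta + (2 * alpha - 1) * delta ** (T + 1) >= 1:
            return T
    return None

# With alpha = 0 the condition reduces to 2*delta - delta**(T+1) >= 1:
print(smallest_T(Fraction(2, 3), 0))     # 2: a finite war length suffices
print(smallest_T(Fraction(49, 100), 0))  # None: no cooperation below delta = 1/2
```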
Conclusions:
(ii) Price wars are set off by declines in demand. (Note that this contrasts with the model of cyclical demand, in which prices fall when demand is high. The key difference concerns observability.)
(iii) When an equilibrium price war occurs, everyone believes correctly that no one deviated. It may seem odd to enter a punishment phase under those circumstances. However, if the firms didn't punish this non-deviation, the incentives to comply with the cooperative agreement would vanish.
(iv) Imperfect observability makes cooperation more difficult (it raises the threshold value of δ).
7 Strategic Choice in Dynamic Games with Incomplete
Information
Games of incomplete information become particularly interesting in dynamic settings where
the actions of players can reveal something about their types. Knowing this, players
have incentives to tailor their actions to manipulate the inferences of others. Naturally,
others anticipate this manipulation, and attempt to make inferences subject to the
knowledge that they are being manipulated.
7.1 Reputation
7.1.1 Reputation with complete information
We will illustrate some ideas using a variant of the simple entry game considered earlier.
We know that if this game is played only once, the only plausible outcome (SP NE) is (In,
Accommodate). Yet, in practice, an incumbent might have a reason to play fight if
confronted with entry, in order to establish a reputation for toughness.
We say that an individual has a reputation if they are expected to behave a certain way in
the current environment because they have behaved the same way in similar environ-
ments in the past when playing against others.
To model reputation, it is therefore natural to consider a setting in which (1) the game is
repeated, and (2) the incumbent encounters different opponents each period. This is
an example of a repeated game with both long-lived and short-lived players.
Infinite repetitions: Imagine that the incumbent in the above game plays this game re-
peatedly against a sequence of different opponents. These repetitions continue forever.
The incumbent discounts the future at the rate δ. The opponents care only about
their payoffs in the current period.
Clearly, this repeated game has a subgame perfect equilibrium in which all opponents
enter and the incumbent always accommodates entry. However, there may be other
equilibria (even though opponents all have one-period horizons).
Proposed equilibrium:
Strategy for opponents: If either all opponents have stayed out in the past, or if the
incumbent has never accommodated entry in the past, then play Out; otherwise, play
In.
Strategy for incumbent: Imagine that the current opponent enters. If either all opponents have stayed out in the past or if the incumbent has never accommodated entry in the past, then play Fight; otherwise, play Acc.
First, check optimality for the opponents:

(1) Suppose that either all opponents have stayed out in the past, or that the incumbent has never accommodated entry in the past. Then the current opponent expects the incumbent to respond to entry by playing Fight. Consequently, the opponent's best choice is to stay Out.

(2) Now suppose that some opponent has entered in the past and that the incumbent has accommodated entry. Then the current opponent expects the incumbent to accommodate entry in the current period. Consequently, the opponent's best choice is to play In.
Next, check optimality for the incumbent:

(1) Suppose that either all opponents have stayed out in the past, or that the incumbent has never accommodated entry in the past. Imagine that the current opponent plays In. If the incumbent plays the choice prescribed by her equilibrium strategy (Fight), she receives a payoff of −1 in the current period, and a payoff of 2 in all subsequent periods (since all subsequent opponents will stay out). Her discounted payoff is therefore −1 + 2δ/(1 − δ). If the incumbent instead plays Acc, she receives a payoff of 1 in the current period and in all future periods (since, according to the strategies, entry will occur and she will accommodate). Her discounted payoff is therefore 1/(1 − δ). Thus, it is optimal for the incumbent to play Fight provided that

−1 + 2δ/(1 − δ) ≥ 1/(1 − δ)

This requires δ ≥ 2/3.
(2) Suppose that some opponent has entered in the past and that the incumbent has
accommodated entry. Imagine that the current opponent plays In. Regardless
of what the incumbent does, all future opponents will play In. Consequently, the
incumbent’s best choice is to accommodate entry.
Remarks:
(1) The incumbent benefits from a “reputation” for toughness. She fights anyone who
enters to sustain this reputation. If she ever fails to fight, she loses this reputation.
This provides her with the incentive to fight. However,
(2) In equilibrium, the incumbent never does anything to create or to maintain this repu-
tation. She is simply endowed with it.
(3) There are also SP NE in which the incumbent does not benefit from a reputation.
(4) The ability to sustain a reputation vanishes when the horizon is finite (standard unrav-
elling argument).
A one-shot game:
The unique SP NE is (In, Acc)
Now assume that the entrant does not know the incumbent’s type. He assumes that the
incumbent is wimpy with probability μ and macho with probability 1 − μ. The game
is:
The requirement of subgame perfection allows us to reduce this tree to:
This is a simple single-person decision problem. E’s payoff from In is 3μ − 1. E’s payoff
from Out is 0.
Introducing reputation: Imagine that the incumbent has been operating in the industry
for some time. The entrant will look back on the incumbent’s past behavior to
make inferences about whether the incumbent is macho or wimpy. Knowing this,
the incumbent may behave in a way designed to mislead potential entrants. This
compounds the entrant’s inference problem.
One could model this in a setting with (finitely) repeated interaction. To keep things
simple, we will use a more highly stylized model.
A simple representation: Suppose that, prior to playing the preceding game (stage 2),
the incumbent has the opportunity to raid the market of another firm (stage 1). In
stage 1, the incumbent can choose either to raid (called m for “acting macho”), or not
to raid (called w for “acting wimpy”). (One can think of this as playing a similar
game against some other opponent.)
In stage 1, payoffs are as follows: Im receives 2 from m and 0 from w, while Iw receives −1/2 from m and 0 from w.
Extensive form:
Sequential equilibria
We will solve for the sequential equilibria of this game. Since sequential equilibria are sub-
game perfect, we can replace any proper subgame possessing a unique equilibrium with
the payoffs from that equilibrium. Here, that means substituting for the incumbent’s
final decisions, as follows:
We will work with the reduced game and abstract from the incumbent’s final decisions
(they are always implicit)
Claim #1: For μ < 1/3, there is a sequential equilibrium in which both Iw and Im play m. E enters if the incumbent has played w, but does not enter if the incumbent has played m. Beliefs off the equilibrium path: if w is observed, E believes he faces Iw with probability 1.
Demonstration: First we verify that all actions are optimal, given beliefs and other
players’ strategies
E if m: receives 3μ − 1 from In, and 0 from Out; with μ < 1/3, Out is optimal
Beliefs are consistent: Let Iw play w with probability ε, and let Im play w with probability ε². Then Pr(I = Iw | w) = με/(με + (1 − μ)ε²), which converges to unity as ε → 0.
Uniqueness: First notice that Im's lowest possible payoff is 3 if it plays m, and its highest possible payoff is 2 if it plays w. Thus, Im must play m with certainty in all sequential equilibria.
Next we argue that there cannot be an equilibrium in which Iw plays w with certainty. In that case, m would indicate that the incumbent was Im with probability 1, so E would not enter upon observing m, and w would indicate that the incumbent was Iw with probability 1, so E would enter upon observing w. But then, by playing m rather than w, Iw would increase her payoff from 1 to 1.5.
Finally we argue that there cannot be an equilibrium in which Iw mixes between m and w. Iw would be willing to mix only if she were indifferent. Upon observing w, E would infer that the incumbent was Iw with probability 1 (since Im makes this choice with probability zero). E would therefore choose In, which means that Iw would receive a payoff of 1. Upon observing m, E would infer that the incumbent was Iw with some probability λ < μ < 1/3, and would therefore choose Out, which means that Iw would receive a payoff of 1.5. Thus, Iw could not be indifferent between w and m.
Conclusion: For μ < 1/3, there is a unique sequential equilibrium. It has the property that Iw imitates Im in order to deter entry. Iw succeeds because E fears that the incumbent might be macho upon observing m. In other words, Iw acts macho to disguise her true type.
Now suppose instead that μ > 1/3.

Can Iw play m with probability 1? No. Upon observing m, E would infer that the incumbent is Iw with probability μ > 1/3; we have already shown that, for such inferences, E enters. Thus, if Iw plays m, she receives a payoff of 0.5. On the other hand, if Iw plays w, she receives a payoff no smaller than 1. Consequently, Iw would have an incentive to deviate to w.
Can Iw play w with probability 1? No. Upon observing m, E would infer that the incumbent is Im with probability 1 and not enter. Upon observing w, E would infer that the incumbent is Iw with probability 1 and enter. Thus, Iw would receive a payoff of 1.5 from playing m, and a payoff of 1 from playing w. Iw would therefore deviate to m.
If Iw is mixing between m and w, then we also know that, upon observing w, E must infer
that the incumbent is Iw , and consequently E will enter.
That leaves two decisions: Iw ’s choice between m and w, and E’s choice between In and
Out conditional upon having observed m.
To have a mixed strategy equilibrium, Iw must randomize between m and w to make E indifferent between In
and Out conditional on observing m. Moreover, E must randomize between In and
Out conditional upon observing m to make Iw indifferent between m and w.
Let E randomize between In and Out with equal probabilities (conditional upon observing
m). Then Iw ’s expected payoff from playing m is 1. If Iw plays w, E will enter and
Iw ’s payoff will also be 1. Thus, this strategy for E makes Iw indifferent between m
and w, as required.
Next, let Iw randomize between m and w with probabilities φ and 1 − φ, respectively, where

φ ≡ (1 − μ)/(2μ)

Then

Pr(I = Iw | m) = φμ/((1 − μ) + φμ) = 1/3
With this posterior, E is indifferent between entering and not entering, as required.
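Numerical check: for any prior μ > 1/3, randomizing with φ = (1 − μ)/(2μ) drives the posterior to exactly 1/3, the belief at which E's entry payoff 3λ − 1 equals zero (a sketch in exact arithmetic; the function name is ours):

```python
from fractions import Fraction

def posterior_wimpy_given_m(mu):
    """Pr(I = Iw | m) when Iw plays m with prob. phi = (1-mu)/(2mu), Im plays m for sure."""
    phi = (1 - mu) / (2 * mu)
    return phi * mu / ((1 - mu) + phi * mu)

for mu in [Fraction(2, 5), Fraction(1, 2), Fraction(9, 10)]:
    assert posterior_wimpy_given_m(mu) == Fraction(1, 3)
    assert 3 * posterior_wimpy_given_m(mu) - 1 == 0   # E is exactly indifferent
print("posterior = 1/3 for every prior above 1/3")
```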
(i) When μ ≥ 1/3, wimps deter entry in some fraction of cases by randomly imitating macho players.

(iv) The probability of imitation declines as μ rises, but the effect on macho types does not change with μ.
7.2 Cooperation in finite horizon games
Idea: Agents may be able to sustain cooperation if, by cooperating, they can constructively
mislead others about their objectives.
Imagine that player 2 is either “sane” with probability 1 − θ, or “crazy” with probability θ.
A crazy player behaves mechanically as follows: she plays C as long as player 1 has never played N in the past; otherwise, she plays N.
If the game is played once, player 1 and the sane player 2 will both choose N.
If the game is played more than once, the possibility that player 2 might be crazy obviously
gives player 1 some incentive to play C in early rounds. Less obviously, it also gives
player 2 an incentive to play C, since 2 may wish to give 1 the impression that 2 is
crazy.
Two repetitions
Second period: Both player 1 and the sane player 2 will plainly choose N regardless of
what has previously transpired, and regardless of player 1’s beliefs about player 2’s
type. The crazy player 2 picks N if player 1 chose N in the first period, and C if
player 1 chose C in the first period.
First period: The sane player 2 knows that player 1 will choose N in the second period
regardless of what player 2 chooses in the first period. Therefore, the sane player 2
will choose N. The crazy player 2 will, of course, choose C.
Player 1 knows that, in the first period, the sane player 2 will choose N, and that the crazy
player 2 will choose C. Player 1 also knows the continuation for the second period. If
player 1 chooses C, her total expected payoff is θ(1 + b) − (1 − θ)a = θ(1 + b + a) − a.
If player 1 chooses N, her total expected payoff is θb. Thus, 1 plays C provided that θ(1 + b + a) − a > θb, or θ > a/(1 + a). Player 1 plays N when this inequality is reversed, and is indifferent when equality holds.
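Numerical check: a sketch of the first-period comparison (the payoff expressions follow the calculation above; parameter values are illustrative, and the cutoff is independent of b):

```python
from fractions import Fraction

def payoff_C(theta, a, b):
    # Play C: vs. crazy (prob. theta) earn 1 now and b next period; vs. sane lose a
    return theta * (1 + b) - (1 - theta) * a

def payoff_N(theta, a, b):
    # Play N: the only gain is b against the crazy type's first-period C
    return theta * b

a, b = Fraction(2), Fraction(3)
cutoff = a / (1 + a)   # 2/3
print(payoff_C(cutoff + Fraction(1, 100), a, b) > payoff_N(cutoff + Fraction(1, 100), a, b))  # True
print(payoff_C(cutoff - Fraction(1, 100), a, b) > payoff_N(cutoff - Fraction(1, 100), a, b))  # False
```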
Three repetitions
Beliefs: Let μ denote player 1’s probability assessment that player 2 is crazy, assessed at
the beginning of the second round of play, conditional upon the outcome of the first
round.
Suppose first that player 1 selected C in the first round. In this case, the continuation game is the same as the two-repetition game considered above, with μ replacing θ.
If μ > a/(1 + a), the continuation is as described in the "two-repetitions" section above, with player 1 selecting C in the second round.

If μ < a/(1 + a), the continuation is as described in the "two-repetitions" section above, with player 1 selecting N in the second round.

If μ = a/(1 + a), the continuation is as described in the "two-repetitions" section above, with player 1 potentially randomizing between C and N in the second round.
Suppose instead that player 1 selected N in the first round. In this case, the continuation game differs from the two-repetition game considered above, since the "crazy" player 2 will select N in both remaining periods, no matter what else happens. This means that the third-round outcome must be (N, N) no matter what has happened in the second round; consequently, the outcome will also be (N, N) in the second round.
Now consider candidate equilibria in which player 1 and both types of player 2 select C in the first round. Observe that such an equilibrium would involve no revelation of information in the first round (since both types of player 2 play C).
Given that no information is revealed in the first round (in equilibrium), what will happen
in the continuation game?
With no information revealed in the first round, the continuation will be as described above,
with μ = θ.
Suppose that θ > a/(1 + a). Then player 1 selects C in the second round, and receives a continuation payoff of θ(1 + b + a) − a, while the sane player 2 selects N in the second round and receives a continuation payoff of b (see above). Thus, player 1's total equilibrium payoff is 1 + θ(1 + b + a) − a, while the sane player 2's total equilibrium payoff is 1 + b.
Now let’s check to see whether it is optimal for both players to select C in the first period
(assuming that the other will do so).
Player 1: If player 1 chooses N, her total payoff is b (given that continuation payoffs are zero when player 1 chooses N in round 1). Thus, player 1 is willing to play C in round 1 provided that 1 + θ(1 + b + a) − a ≥ b. This simplifies to θ ≥ (a + b − 1)/(a + b + 1) = θ*. Note that, under our assumptions, θ* ∈ (0, 1).
Player 2: If the sane player 2 chooses N, his total payoff is b (in any sequential equilibrium, continuation payoffs are zero when the sane player 2 has revealed his type, since μ = 0 < a/(1 + a)). Thus, player 2 is willing to play C in round 1 provided that 1 + b ≥ b, which is always satisfied.
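Numerical check: at θ = θ*, player 1 is exactly indifferent between cooperating and collecting b immediately (a sketch; a = 2, b = 3 are illustrative):

```python
from fractions import Fraction

def theta_star(a, b):
    return Fraction(a + b - 1, a + b + 1)

def player1_coop_payoff(theta, a, b):
    # 1 from mutual C in round 1, plus the continuation theta*(1+b+a) - a
    return 1 + theta * (1 + b + a) - a

a, b = 2, 3
t = theta_star(a, b)
print(t)                                    # 2/3
assert player1_coop_payoff(t, a, b) == b    # exactly indifferent at theta*
```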
Remarks: (1) In the preceding equilibrium, the sane player 2 imitates the crazy player
2 because this maintains the possibility in player 1’s mind that player 2 is crazy, and
thereby induces player 1 to play C in round 2 (which in turn permits player 2 to earn
a payoff of b in round 2).
(2) Note that we obtain cooperation in the first round, possible cooperation (or partial
cooperation) in the second round, and no cooperation in the last round. This is
consistent with the way people play the repeated prisoners’ dilemma: cooperation
occurs in early rounds, and collapses in later rounds.
(3) This is only one equilibrium, and we need to make some restrictive parametric assump-
tions to generate it. Cooperation is only one possible outcome.
T repetitions: Assume that the game is played T times. We will show that, for large T, one necessarily obtains cooperation in almost every period.
Theorem: In any sequential equilibrium, the number of stages where one player or the
other plays N is bounded above by a constant that depends on θ, but is independent
of T .
(1) If the sane player 2’s type ever becomes known to player 1 prior to some period t, then
both players must select N in round t and in all succeeding rounds. This follows from
induction on the number of stages left in the game. It is obviously true for t = T .
Now assume that it is true for some arbitrary t ≤ T . In round t − 1, both players
know that their actions cannot affect outcomes in subsequent rounds. Consequently,
they must make static best responses in round t − 1, which means that they both play
N.
(2) If both players have selected C in rounds 1 through t − 1, and if the sane player 2 selects
N in round t, then both players must select N in all succeeding rounds. Since the
choice of N in round t reveals player 2’s type as sane, this follows directly from step
(1).
(3) If player 1 plays N in round t − 1, then player 1 and both types of player 2 all select N
in all subsequent rounds.
Claim: In all rounds t′ > t − 1, both the crazy player 2 and the sane player 2 will play N. For the crazy player 2, this is obvious. If the sane player 2 chooses C in round t′, he will reveal himself to be sane and, by step (1), both players will select N in all subsequent periods. If player 2 instead chooses N in round t′, he receives a strictly higher round-t′ payoff (irrespective of what player 1 chooses), and the continuation can be no worse (since he can choose N in every subsequent period). Consequently, player 2 chooses N, as claimed.
Now consider player 1. Based on the preceding claim, this player knows that both the crazy player 2 and the sane player 2 will choose N in all rounds t′ > t − 1. Consequently, in any round t″ > t − 1, player 1 knows that the subsequent actions of player 2 (whether sane or crazy) are independent of player 1's round-t″ action. Thus, player 1 must make a static best response to player 2's expected choice in round t″, which means that she chooses N.
(4) Define M* ≡ (b + (1 − θ)a)/θ. If neither player has selected N in any round up to and including t′ where t′ < T − M*, then player 1 must select C in round t′ + 1.

Suppose on the contrary that there was an equilibrium in which, with strictly positive probability, player 1 selects N in round t′ + 1. Then, by step (3), she receives at most b from that point forward (since the continuation is for both players to select N in all subsequent rounds).

Imagine instead that she deviates to the following continuation strategy: play C until player 2 plays N, and thereafter play N. Since no information about player 2's type has been revealed through round t′, her total payoff from that point forward is no worse than θ(T − t′) − (1 − θ)a > θM* − (1 − θ)a = b. Consequently, this is a profitable deviation, which contradicts our initial supposition.
(5) If neither player has selected N in any round up to and including t′ where t′ < T − M* − 1, then player 2 must select C in round t′ + 1.

Suppose on the contrary that there was an equilibrium in which, with strictly positive probability, player 2 selects N in round t′ + 1. Then, by steps (2) and (4), his continuation payoff is b (player 1 will play C in round t′ + 1 and N thereafter).

Imagine instead that player 2 deviates to the following continuation strategy: play C in round t′ + 1 and N in all subsequent rounds. Then, by step (4), he receives a payoff of 1 in round t′ + 1 and a payoff of b in round t′ + 2 (since t′ + 1 < T − M*, player 1 selects C in round t′ + 2). By step (2), he receives a payoff of 0 thereafter. Thus, his total payoff is 1 + b > b. This is a profitable deviation, which contradicts our initial supposition.
We conclude that neither player selects N with positive probability in any round t < T − M∗.
Note that M ∗ is independent of T . Q.E.D.
Remarks:
(1) Though we have ruled out equilibria in which either player selects N with positive
probability in any round t < T − M∗, we have not proven that there exists an equilib-
rium in which the players select C in these rounds. However, combining our theorem
with a general result on the existence of sequential equilibria establishes this point.
(2) Note that, for any θ, no matter how small, the fraction of rounds in which we necessarily
obtain cooperation goes to unity as the number of rounds becomes large. In this sense,
players almost always cooperate with long horizons, even if craziness is only a remote
possibility.
(3) The preceding analysis presupposes a particular form of craziness. If one allows for
all conceivable forms of craziness, one obtains folk-like theorems (anything can happen
with sufficiently long finite horizons and arbitrarily small probabilities of craziness,
provided that one does not restrict the form of craziness).
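A quick numerical sketch of the bound (the parameter values for θ, a, and b below are illustrative assumptions, not from the notes): it computes M∗ and confirms that the fraction of rounds with guaranteed cooperation approaches one as the horizon T grows.

```python
# Numerical sketch of step (4)'s bound M* = (b + (1 - theta)*a) / theta.
# theta, a, b are illustrative values (assumed, not from the notes).

def m_star(theta, a, b):
    """Bound on the number of terminal rounds in which N can be played."""
    return (b + (1 - theta) * a) / theta

theta, a, b = 0.01, 2.0, 1.0
M = m_star(theta, a, b)          # independent of the horizon T

# Cooperation is guaranteed in rounds t < T - M*, so the cooperative
# fraction (T - M*)/T goes to 1 as T becomes large.
for T in (1_000, 10_000, 1_000_000):
    frac = max(T - M, 0.0) / T
    print(T, round(frac, 4))
```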
7.3 Signaling
The model:
216
There is a single worker and two potential employers
The worker’s productivity (marginal revenue product) is given by θ, and is the same for
both firms.
Both firms have the same prior beliefs: the worker is type θL with probability λ and type θH with probability 1 − λ, where θH > θL.
Eventually, the firms will bid for the worker’s services. However, before entering the job
market, the worker has the opportunity to obtain education.
Education is costly, but does nothing for the worker’s level of productivity.
Assume: c1 > 0 (education is always costly on the margin) and c12 < 0 (education is less
costly on the margin for more productive workers), where subscripts on c(e, θ) denote partial derivatives.
Workers have a reservation wage, 0. The level is a normalization; the more important
assumption here is that all workers have the same outside opportunities.
Stage 1: Nature selects the worker’s type, θ ∈ {θL, θH}, which the worker privately observes.
Stage 2: The worker selects an education level, e ≥ 0, which both firms observe.
Stage 3: Having observed e but not θ, the firms simultaneously select wage offers, wi
(i = 1, 2)
217
Payoffs: The worker receives u(w, e, θ) = w − c(e, θ) if she accepts an offer (and −c(e, θ)
otherwise), the winning firm receives θ − w, and the losing firm receives 0.
Where we are headed: There are WPBE for this game in which high quality workers
receive education while lower quality workers do not, and the market pays educated
workers more despite the fact that education does not contribute to productivity. Ed-
ucation instead serves as a signal of productivity.
Simplifying assumption: We will confine attention to WPBE in which the two firms have
the same beliefs, contingent upon the observed level of e.
Stage 4: Provided that the highest wage offer is non-negative, the worker accepts it. In
the event of a tie, the worker could accept the offer from either firm.
Stage 3: Let μ(e) denote the beliefs of the firms (specifically, the probability that the
worker is of type θL ) conditional upon having observed e. The expected productivity
of the worker is then μ(e)θL + (1 − μ(e))θH . This represents the value of the worker
to each firm. Stage 3 involves Bertrand bidding between the firms. Consequently, we
know from earlier results that the resulting equilibrium wage is w(e) = μ(e)θL + (1 − μ(e))θH.
Stage 2: w(e) describes the tradeoff between wages and education facing the worker. To
determine the worker’s best choice, we have to study her preferences in greater detail.
218
Along an indifference curve, u(w, e, θ) = w − c(e, θ) is constant, so dw/de|u=C = c1(e, θ) > 0.
Thus, indifference curves slope upward. Moreover,
(d/dθ)(dw/de|u=C) = c12(e, θ) < 0
Thus, indifference curves are flatter for higher productivity workers. Graphically:
These indifference curves exhibit a characteristic known as the Spence-Mirrlees single cross-
ing property.
For any w(e), we can find the worker’s optimal choice by selecting the point of tangency
with an indifference curve.
Question: Where does w(e) (equivalently, μ(e)) come from? It is implied by (and must
be consistent with) the worker’s choice on the equilibrium path. For a WPBE, it can
be anything between θL and θH off the equilibrium path (that is, for values of e not
chosen by the worker). This is because, for such e, one can choose w(e) arbitrarily.
This flexibility in selecting the function w(e) gives rise to a variety of different kinds of
equilibria.
219
Categories of equilibria:
1. Separating. Each type chooses a different education level, so education fully reveals
the worker’s type.
2. Pooling. Both types choose the same education level.
3. Hybrids. Some members of one type pool with members of the other type, while
some separate.
Separating equilibria:
Claim #1: In any separating equilibrium, w(e(θi )) = θi for i = H, L. (The worker is paid
her marginal product.)
Proof: In a WPBE, beliefs are derived from Bayes’ rule where possible. Type θL workers
choose e(θL ) with probability one, while type θH workers choose it with probability
zero. Therefore, when e(θL ) is observed, the firms must believe μ(e(θL )) = 1, which
implies w(e(θL )) = θL . A similar argument holds for e(θH ).
Claim #2: In any separating equilibrium, e(θL) = 0. (Low types obtain no education.)
Proof: Suppose not. By claim #1, type θL workers are paid θL. Suppose they instead
deviated to e = 0. Since μ(0) ∈ [0, 1], w(0) ≥ θL. This deviation eliminates costly
education without reducing pay, so it is beneficial, a contradiction.
From claim #2, it follows that type θL workers receive a payoff of u(θL , 0, θL ). This places
them on an indifference curve as follows:
220
Now we can see how to construct a separating equilibrium. By claim #1, θH workers must
receive an allocation somewhere on the dashed horizontal line at θH. If this allocation
were to the left of IL, then the θL workers would imitate the θH workers (by picking
the level of education, e(θH ), that the θH workers select in equilibrium).
Suppose then that we select the level of education defined by the intersection of IL and the
horizontal line at θH . On the graph, this corresponds to e1 . Formally, e(θH ) = e1 ,
where e1 is defined by u(θL , 0, θL ) = u(θH , e1 , θL ).
221
Given this wage function, type θL is happy to select e(θL ) = 0, and type θH is happy to
select e(θH ) = e1 . If the firms observe e ∈ {0, e1 }, they make the correct inference
about the worker’s type given the worker’s strategy, and set the wage equal to marginal
product.
Every other value of e occurs with zero probability in equilibrium. Consequently, for
WPBE, we are free to select an arbitrary value for μ(e), provided that μ(e) ∈ [0, 1].
This is equivalent to selecting an arbitrary function w(e) such that (i) w(e) ∈ [θL , θH ],
(ii) w(0) = θL , and (iii) w(e1 ) = θH . The wage schedule in the diagram satisfies
these properties. Another wage schedule that would suffice to sustain this outcome:
w(e) = θL for e < e1 , and w(e) = θH for e ≥ e1 (which corresponds to μ(e) = 1 for
e < e1 and μ(e) = 0 for e ≥ e1 ).
Remarks: (i) In this equilibrium, education acts as a signal of productivity, and is corre-
lated with wages. However, it does not add to productivity.
(ii) The critical assumption here is that education is less costly for more highly productive
workers (single crossing).
222
Question: Are there other WPBE separating equilibria?
Let I′H be the type θH indifference curve through the type θL equilibrium allocation. Let
e2 denote the intersection between I′H and the horizontal line at θH. Formally, e2
satisfies u(θL, 0, θH) = u(θH, e2, θH). Graphically:
Any e(θH) ∈ [e1, e2] can be supported in a separating equilibrium by a suitably chosen
wage schedule.
Remark: The first separating equilibrium considered above (with e(θH) = e1) weakly
Pareto dominates all of the others. In some sense, it seems like the most natural, and
certainly least wasteful, result. We will return to this point later.
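To make the construction concrete, here is a minimal sketch under an assumed cost function c(e, θ) = e/θ (my illustration, not the notes’ specification). It satisfies c1 = 1/θ > 0 and c12 = −1/θ² < 0, and yields closed forms e1 = θL(θH − θL) and e2 = θH(θH − θL).

```python
# Assumed functional form (illustration only): c(e, theta) = e / theta,
# so u(w, e, theta) = w - e / theta.  Then:
#   e1 solves u(tL, 0, tL) = u(tH, e1, tL)  ->  e1 = tL * (tH - tL)
#   e2 solves u(tL, 0, tH) = u(tH, e2, tH)  ->  e2 = tH * (tH - tL)

def u(w, e, theta):
    return w - e / theta

tL, tH = 1.0, 2.0                 # illustrative types, tL < tH
e1 = tL * (tH - tL)               # most efficient separating education level
e2 = tH * (tH - tL)               # highest supportable separating level

assert abs(u(tL, 0, tL) - u(tH, e1, tL)) < 1e-12   # theta_L indifferent at e1
assert abs(u(tL, 0, tH) - u(tH, e2, tH)) < 1e-12   # theta_H indifferent at e2

# Any e(theta_H) in [e1, e2] is supportable: theta_L does not want to
# imitate, and theta_H prefers separating to taking (tL, 0).
for eH in (e1, (e1 + e2) / 2, e2):
    assert u(tH, eH, tL) <= u(tL, 0, tL) + 1e-12
    assert u(tH, eH, tH) >= u(tL, 0, tH) - 1e-12
```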
Pooling equilibria:
In a pooling equilibrium, every worker chooses the same education level, eP , with probability
1 (e(θL ) = e(θH ) = eP ). Therefore, in a W P BE, the employer must, upon seeing this
education level, hold beliefs μ(eP ) = λ. It follows that w(eP ) = λθL + (1 − λ)θH ≡ θE .
223
Let e3 be defined by u(θL , 0, θL ) = u(θE , e3 , θL ); it is given by the intersection of a horizontal
line at θE and the type θL indifference curve through the point (e, w) = (0, θL )
Claim: For every eP ∈ [0, e3 ], there is a pooling equilibrium in which both types of workers
choose eP with probability one.
Notice that w(eP ) = θE as required. For all other values of e, we are free to choose any
value for w(e) ∈ [θL , θH ]. In the graph, we have made these choices so that it is
optimal for both types of workers to select eP . Another equilibrium wage schedule that
would suffice to sustain this outcome: w(e) = θL for e < eP , and w(e) = θE for e ≥ eP
(which corresponds to μ(e) = 1 for e < eP and μ(e) = λ for e ≥ eP ).
Notice that one cannot have a pooling equilibrium with eP > e3 . Type θL workers would
then prefer (w, e) = (w(0), 0) to (θE , eP ) (since w(0) ≥ θL ), which means that they
would deviate to e = 0.
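Continuing the same assumed cost function c(e, θ) = e/θ (illustration only), the bound e3 has the closed form e3 = θL(θE − θL), and one can verify the deviation argument directly:

```python
# Assumed example (c(e, theta) = e / theta, illustration only):
# e3 solves u(tL, 0, tL) = u(thetaE, e3, tL)  ->  e3 = tL * (thetaE - tL).

def u(w, e, theta):
    return w - e / theta

tL, tH, lam = 1.0, 2.0, 0.5          # illustrative values; lam = Pr(theta_L)
thetaE = lam * tL + (1 - lam) * tH   # pooled wage
e3 = tL * (thetaE - tL)

assert abs(u(tL, 0, tL) - u(thetaE, e3, tL)) < 1e-12  # theta_L indifferent at e3

# A pooling level eP > e3 cannot be sustained: theta_L prefers to deviate
# to e = 0, where the wage is at least tL.
assert u(thetaE, e3 + 0.1, tL) < u(tL, 0, tL)
```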
224
Remark: Pooling equilibria with strictly positive levels of education are extremely inef-
ficient. Education accomplishes nothing — it neither adds to productivity nor differ-
entiates the workers, as in a separating equilibrium. Nevertheless, workers still incur
the costs of obtaining education because there are substantial wage penalties for the
uneducated.
How do we resolve the vast multiplicity of WPBE? The problem comes from having
too much latitude in selecting the wage schedule w(e). WPBE only ties it down
at educational levels that are chosen in equilibrium. Elsewhere, one can choose any
schedule satisfying the restriction that w(e) ∈ [θL, θH].
Other standard refinements, such as sequential equilibrium, are not helpful, even though
some of these outcomes are not very plausible.
For signaling games, the literature has developed a number of “forward induction” refine-
ments.
Equilibrium dominance
Consider a candidate equilibrium, and let uL and uH denote the equilibrium payoffs of
types θL and θH, respectively. Say that a deviation to some education level e′ satisfies
the following conditions: (i) e′ is chosen with probability zero in equilibrium; (ii)
u(w, e′, θL) < uL for every w ∈ [θL, θH]; and (iii) u(w, e′, θH) > uH for some
w ∈ [θL, θH]. The refinement requires that the firms place probability zero on θL
(that is, μ(e′) = 0) after observing any such e′.
Condition (ii) is the equilibrium dominance condition. It says that e′ is not as good as
the equilibrium outcome for θL regardless of the outcome that e′ generates. Note that
this differs from the usual notion of dominance.
225
Condition (iii) states that there is some conceivable continuation outcome following e′ for
which the choice of e′ improves upon the equilibrium outcome for θH. Note that, to
check this condition, it suffices to consider w = θH.
Intuition: If we observe an e′ satisfying these conditions, we can rule out the possibility
that a type θL might have made this choice, but we can’t rule out the possibility that
a type θH might have made this choice. Hence we should infer that the deviator is a
type θH. This is sometimes known as the intuitive criterion.
Claim: The most efficient separating equilibrium (with e(θH) = e1) satisfies equilibrium
dominance. No other WPBE satisfies equilibrium dominance. In other words, the
equilibrium dominance criterion reduces the WPBE set to a single outcome.
226
Consider first an inefficient separating equilibrium (with e(θH) > e1) and a deviation
e′ ∈ (e1, e(θH)), as shown.
(ii) By construction, uL = u(θL, 0, θL) > u(w, e′, θL) for all w ∈ [θL, θH].
Consequently, μ(e′) = 0, which rules out any w(e′) < θH. Since any wage schedule sup-
porting the preceding allocation must have w(e′) < θH (since type θH would otherwise
choose e′), any such equilibrium does not satisfy equilibrium dominance.
Intuition: A type θH worker should be able to go to an employer and make the following
speech. “I have obtained the level of education e′, and I am a θH. You should believe
me when I tell you this. If indeed I were a θH, it would be in my interests to obtain e′
and to tell you this, assuming that you would believe me. However, if I were a θL, it
would not be in my interests to obtain e′ and to tell you this, regardless of what you
would then believe. Consequently my claim is credible.”
In this pooling equilibrium, e(θL) = eP and e(θH) = eP. Consider some e′, as shown. Let’s
check the three conditions.
227
(i) e′ is not chosen in equilibrium.
Consequently, μ(e′) = 0, which rules out any w(e′) < θH. Since any wage schedule sup-
porting the preceding allocation must have w(e′) < θH (since type θH would otherwise
choose e′), any such equilibrium does not satisfy equilibrium dominance.
Intuition: A type θH worker can make the same speech as in the last instance.
If e′ > e1, then there is no w ∈ [θL, θH] such that u(w, e′, θH) > u(θH, e1, θH) = uH, so condition (iii)
cannot hold.
228
Thus, the equilibrium dominance condition does not restrict beliefs for any out-of-equilibrium
action.
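Under the same assumed cost function c(e, θ) = e/θ (illustration only), one can verify this mechanically: an inefficient separating equilibrium admits a deviation e′ satisfying conditions (ii) and (iii), while the efficient one (e(θH) = e1) does not.

```python
# Grid check (assumed example, c(e, theta) = e / theta): does some deviation
# e' satisfy both equilibrium-dominance conditions against a separating
# equilibrium with education e(theta_H) = eH?

def u(w, e, theta):
    return w - e / theta

tL, tH = 1.0, 2.0
e1 = tL * (tH - tL)                  # most efficient separating level

def has_dominated_deviation(eH, step=0.01):
    uL = u(tL, 0.0, tL)              # theta_L's equilibrium payoff
    uH = u(tH, eH, tH)               # theta_H's equilibrium payoff
    e = 0.0
    while e <= tH * (tH - tL):
        if e != 0.0 and e != eH:
            cond_ii = u(tH, e, tL) < uL    # (ii): hardest case is w = tH
            cond_iii = u(tH, e, tH) > uH   # (iii): it suffices to try w = tH
            if cond_ii and cond_iii:
                return True
        e = round(e + step, 10)
    return False

assert not has_dominated_deviation(e1)   # efficient equilibrium survives
assert has_dominated_deviation(1.5)      # an inefficient one is eliminated
```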
Intuition: There is nothing to be gained from increasing the level of education. For any
e′ < e1, a type θH worker might attempt to convince the employer of his type by
making a speech similar to the one above: “I have obtained the level of education e′,
and I am a θH. You should believe me when I tell you this. If indeed I were a θH,
it would be in my interests to obtain e′ and to tell you this, assuming that you would
believe me.” Unfortunately, the employer would respond: “Yes, but if you were a type θL
worker, it would also be in your interests to obtain e′ and to tell me this, assuming
that I would believe you. Consequently your speech is not credible.”
Though equilibrium dominance is remarkably powerful in this setting, the argument turns
out to be less general than one might hope.
Illustration: Consider a signaling model with three types of workers, θL , θM , and θH (for
low, medium, and high).
229
θM could separate from θL at a lower level of education, e′. Will equilibrium dominance
eliminate the inefficient equilibrium?
For condition (ii), we need: u(w, e′, θL) < u(θL, 0, θL) = uL for all w ∈ [θL, θH].
The same argument as before tells us that this is true for w ∈ [θL, θM]. However, there’s
no reason it has to be true for w ∈ [θM, θH], and indeed it isn’t true in the picture.
Intuition: θM makes the speech: “I’m playing e′, and I’m a θM. You should believe me
because, if you do, it’s in my interests to make this speech. Moreover, if I’m a θL, it’s
not in my interests to make this speech.”
The employer reacts: “I’m not so sure you’re a θM . Let’s imagine that you’re a θH . As
a θH , you might be hoping that, by making this speech, I’m going to infer that you’re
a θH . That wouldn’t be an unreasonable thing for you to hope since, if that were
your inference, then any θH would have an incentive to make this speech. So I can’t
rule out that you’re a θH , and neither can my competitor (the other employer). By
making this speech, you might be hoping that we bid your wage up to θH . But in that
case I can’t rule out the possibility that you’re a θL either. A θL who thought that the
speech would generate the inference of θH would certainly have an incentive to make
the speech. So the speech doesn’t prove anything.”
Stronger refinements:
A variety of alternative criteria have been developed. We will cover the D1 criterion, which
is generally easy to use and often reduces the equilibrium set significantly.
230
In words, the D1 criterion states that, if one ever observed e′, one would assume that this
choice was made by the type of individual most inclined to make it. This is very strong.
We are not simply saying that μj(e′)/μk(e′) > λj/λk, where λi denotes the population
proportion of type i (so that the inequality implies that posteriors place more weight
on j relative to k than priors); we’re saying that k is ruled out altogether.
Provided that the single crossing property is satisfied and that there is no upper bound on
e, one can show that the D1 criterion isolates the most efficient separating equilibrium
in this class of models, regardless of how many types there are.
Return to the case of two types. What happens as λ → 0 (so that the population consists
almost entirely of highly productive workers)?
The separating equilibrium does not change. It does not depend on the population frac-
tions.
Consequently, when λ is small, pooling equilibria Pareto dominate all separating equilibria.
Moreover, the separating equilibrium seems increasingly silly: if the chances of being a
θL are one in a billion, would we expect everyone to nevertheless obtain large amounts
of costly education merely to prove that they are not that one?
231
This suggests that there might be something wrong with the argument for the intuitive
criterion.
One possibility: Suppose that the speech (“I’m a θH ...”) is credible to the employer. Then
it is in the interests of every θH to make the speech. Therefore, if a worker fails to
make the speech, one can infer that she is a type θL . The employer should then be
willing to offer only θL to anyone failing to make the speech.
This observation does not affect the inefficient separating equilibria, since the θL workers are
paid θL anyway. However, it does affect the pooling equilibrium. If λ is sufficiently
small, then the θL workers will prefer to make the speech rather than receive θL .
Graphically:
(The allocation (0, θL) is what type θL would end up with if they didn’t choose e′ and
make the speech, assuming the speech is credible.) This is illustrated for the point e′1
in the diagram.
232
So, for this case, the assumption that the speech is credible implies that it is not
credible.
Does this reasoning always rescue the pooling equilibrium? No. Imagine that λ is large
(average productivity is θE2 in the diagram). Note that type θH prefers (θH, e2) to
(θE2, 0), but that θL prefers (θL, 0) to (θH, e2). Thus, type θH would have an incentive
to make the speech assuming that it is credible, and type θL would not have an incentive
to make the speech even assuming that the employers would then infer that she is a
θH. Thus, the speech is credible.
Conclude: One still eliminates the pooling equilibria when the most efficient separating
equilibrium is better for type θH . However, the pooling equilibrium survives when it
Pareto dominates the most efficient separating equilibrium.
2. If we think that players actually may make mistakes with some small probability, then
all information sets are reached with positive probability, and there is no room for
refinements of beliefs.
The forward induction reasoning breaks down. Suppose that θH makes her speech, “I am a
θH and I have chosen e′ because...” The employer might simply say, “How do I know
you’re not just a θL who made a mistake, and now you’re trying to make the most of
it?”
Pushing this line of reasoning brings us back to trembling-hand perfection and sequential
equilibria. These refinements don’t help us with signaling problems unless we know
enough to place structure on the mistake-generation process.
233
7.4 Cheap talk
Model:
Let θ denote private information known to the sender but not to the receiver (we will also refer
to this as the sender’s “type”).
This private information affects the payoff for both the sender and the receiver.
The receiver must take some action a ∈ ℝ. This action affects both the sender and receiver.
The receiver’s payoff is uR(a, θ) = −(a − θ)².
Note that this is optimized at a = θ. In other words, the receiver would like to match
the action to the sender’s type, taking the “appropriate” action for each type. (In an
employment situation, one could think of this as shorthand for paying the appropriate
wage and assigning skill-appropriate tasks).
The sender’s payoff is uS(a, θ) = −(a − (θ + c))², with c > 0.
Note that this is optimized at a = θ + c. In other words, each sender wants the receiver to
believe that he is a higher type than he actually is. (In an employment situation, the
worker might want the employer to think that he is more qualified than he actually is,
but he doesn’t want to be thought of as too qualified lest the employer assign him to
a job that he can’t handle.)
In this setting, can the sender credibly communicate anything about his private informa-
tion?
234
Stage 1: Nature selects the sender’s type, θ, which is uniformly distributed on [0, 1]; the
sender privately observes θ.
Stage 2: The sender chooses some message m ∈ M. This choice is payoff-irrelevant (hence
the name “cheap talk”).
Stage 3: Having observed m (but not θ), the receiver selects an action a ∈ ℝ.
Remark: We obviously can’t have a perfectly informative equilibrium. If each type could
credibly announce its type (m = θ), then each type θ would imitate type θ + c. So
how much information can the parties communicate credibly?
A “babbling” equilibrium:
Consider the following strategies and beliefs. For every m ∈ M, the receiver’s beliefs about
θ are uniform over [0, 1], and the receiver responds to every m ∈ M by setting a = 1/2.
All sender-types send the same message, m∗ .
Given the receiver’s responses, the sender’s choice is optimal (since the message is irrel-
evant). The receiver’s responses are optimal given beliefs. The beliefs are derived
from equilibrium strategies through application of Bayes’ rule (trivially) when m = m∗.
Hence this is always a WPBE. It is also easy to check that it is a sequential equilib-
rium.
A two-message equilibrium:
Are there also equilibria in which the senders bifurcate into two groups, differentiated by
two different messages?
Let’s try to construct such an equilibrium. We will refer to the two messages as m1
(“grunt”) and m2 (“snort”).
Let a(m) denote the receiver’s response to the message m. If the equilibrium conveys
information, it must be the case that a(m1) ≠ a(m2). Without loss of generality,
assume that a(m1) < a(m2).
235
Claim #1: A two-message equilibrium has the property that the population divides into
two segments, with θ ∈ [0, θ∗ ) choosing m1 , θ ∈ (θ∗ , 1] choosing m2 , and θ∗ indifferent
(and therefore choosing either message).
Demonstration: For any type θ, define the gain from sending m2 rather than m1 as
∆(θ) ≡ uS(a(m2), θ) − uS(a(m1), θ)
Given that uS is quadratic with bliss point θ + c and that a(m1) < a(m2), ∆ is strictly
increasing in θ.
Thus, if type θ′ prefers a(m2) to a(m1), then type θ″ > θ′ must also prefer a(m2) to a(m1).
It follows that the types must divide into two segments, with all types in the lower
segment weakly preferring a(m1), and all types in the upper segment weakly preferring
a(m2). By continuity, the type on the boundary between the two segments must be
indifferent between a(m1) and a(m2). This establishes the claim.
Claim #2: a(m1) = θ∗/2, and a(m2) = (θ∗ + 1)/2.
Demonstration: Since the receiver has quadratic preferences, she always sets a = E(θ |
m). The preceding expressions are simply the conditional expectations for θ ∈ [0, θ∗ ],
and θ ∈ [θ∗ , 1], respectively (given the uniform distribution).
Solving for the equilibrium: We use the fact that θ∗ is indifferent between the two
messages: uS (a(m1 ), θ∗ ) = uS (a(m2 ), θ∗ ). Given our functional assumptions and
claim #2, this is equivalent to
(θ∗ + c) − θ∗/2 = (θ∗ + 1)/2 − (θ∗ + c)
This simplifies to θ∗ = 1/2 − 2c.
Graphically:
236
Remarks: (1) Provided c < 1/4, this is an equilibrium. Note that there is also a babbling
equilibrium.
(2) It is easy to show that one can turn this into a WPBE when |M| > 2; simply have the
receiver believe that θ = 0 when unchosen messages are selected (one can check that
this is also sequential).
(3) Notice that the equilibrium is asymmetric: the upper segment is larger than the lower
segment. Thus, communication is more informative in the “lower quality” range.
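A quick numerical check of the two-message equilibrium, assuming the quadratic sender payoff peaked at a = θ + c (with an illustrative value of c):

```python
# Check of the two-message equilibrium with c = 0.1 (illustrative value),
# assuming the quadratic sender payoff uS(a, theta) = -(a - theta - c)**2.

def uS(a, theta, c):
    return -(a - theta - c) ** 2

c = 0.1
t_star = 0.5 - 2 * c        # boundary type theta* = 1/2 - 2c
a1 = t_star / 2             # receiver action after m1: E[theta | [0, theta*]]
a2 = (t_star + 1) / 2       # receiver action after m2: E[theta | [theta*, 1]]

# The boundary type is exactly indifferent between the two actions:
assert abs(uS(a1, t_star, c) - uS(a2, t_star, c)) < 1e-12
# Lower types strictly prefer a1, higher types strictly prefer a2:
assert uS(a1, t_star - 0.05, c) > uS(a2, t_star - 0.05, c)
assert uS(a2, t_star + 0.05, c) > uS(a1, t_star + 0.05, c)
```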
Using precisely the same arguments as before, one can show that:
(1) The population divides into three segments, with θ ∈ [0, θ1 ) choosing m1 (“grunt”),
θ ∈ (θ1 , θ2 ) choosing m2 (“snort”), θ ∈ (θ2 , 1] choosing m3 (“shriek”), θ1 indifferent
237
between m1 and m2 (and therefore choosing either message), and θ2 indifferent between
m2 and m3 (and therefore choosing either message).
(2) a(m1) = θ1/2, a(m2) = (θ1 + θ2)/2, and a(m3) = (θ2 + 1)/2 (these are just the conditional
expectations of types within each segment).
Remarks: (1) Provided c < 1/12, this is an equilibrium. Note that there is also a two-
message equilibrium and a babbling equilibrium.
238
(2) As above, it is easy to show that one can turn this into a WPBE (and a sequential
equilibrium) when |M| > 3.
(3) Notice again that the higher segments are larger, so that communication continues to
be more informative in the lower quality range.
Using precisely the same arguments as before, one can show that:
(1) The population divides into consecutive segments, with θ ∈ [0, θ1 ) choosing m1 , θ ∈
(θt , θt+1 ) choosing mt+1 , θ ∈ (θT −1 , 1] choosing mT , and all types on the boundaries
between segments indifferent between the messages assigned to those segments.
(2) a(m1) = θ1/2, a(mt+1) = (θt+1 + θt)/2, and a(mT) = (θT−1 + 1)/2 (these are just the
conditional expectations of types within each segment).
The indifference condition for the first boundary type, θ1, is
(θ1 + c) − θ1/2 = (θ1 + θ2)/2 − (θ1 + c)
More generally, indifference at each boundary θt implies θt+1 = 2θt − θt−1 + 4c, so that
θt = tθ1 + 4c ∑_{k=1}^{t−1} k = tθ1 + 2ct(t − 1)
239
We are free to pick θ1 . Any choice of θ1 implies a sequence of boundary points, θt , between
successive segments.
If, for some choice of θ1 , there exists a T such that θT = 1, then this partition corresponds
to an equilibrium with T segments (messages).
Claim: There exists an equilibrium with T distinct segments (messages) if and only if
4c ∑_{k=1}^{T−1} k ≤ 1 (equivalently, 2cT(T − 1) ≤ 1)
When this condition fails to hold, then θT > 1 for every possible choice of θ1 .
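A sketch that constructs the partition numerically for a given bias c (the value of c below is illustrative), using θt = tθ1 + 2ct(t − 1) and the existence condition 2cT(T − 1) ≤ 1:

```python
# Construct the T-segment cheap-talk partition for a given bias c.
# Existence of T segments requires 4c * sum_{k=1}^{T-1} k = 2c*T*(T-1) <= 1.

def max_segments(c):
    """Largest T such that a T-segment equilibrium exists (T* in the notes)."""
    T = 1
    while 2 * c * (T + 1) * T <= 1:      # can we support T + 1 segments?
        T += 1
    return T

def boundaries(c, T):
    """Boundary types theta_1, ..., theta_T (with theta_T = 1)."""
    theta1 = (1 - 2 * c * T * (T - 1)) / T      # chosen so that theta_T = 1
    return [t * theta1 + 2 * c * t * (t - 1) for t in range(1, T + 1)]

c = 0.01                     # illustrative bias
T = max_segments(c)
bs = boundaries(c, T)
assert abs(bs[-1] - 1.0) < 1e-9

# Segment widths grow by 4c each step: communication is finer at the bottom.
widths = [bs[0]] + [bs[t] - bs[t - 1] for t in range(1, T)]
assert all(w2 > w1 for w1, w2 in zip(widths, widths[1:]))
print(T, [round(b, 4) for b in bs])
```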
Remarks:
(1) For any given value of c, there exists a T ∗ such that there exist equilibria with T distinct
segments iff T ≤ T ∗ (including T = 1, the case of babbling). In other words, there
can be up to (and including) T ∗ informative messages. T ∗ provides a bound on the
informativeness of language.
(2) Note that θt − θt−1 = θ1 + 4c(t − 1). Thus, the segments become wider as one moves to
higher values of θ. Once again, communication is more informative at the lower end
of the quality spectrum.
(3) As before, it is easy to show that one can turn this into a WPBE or a sequential
equilibrium if there are unused messages.
240