MIT Game Theory Lecture Notes
Introduction
Game Theory is a misnomer for Multiperson Decision Theory. It develops tools, methods, and language that allow a coherent analysis of decision-making processes when there is more than one decision-maker and each player's payoff possibly depends on the actions taken by the other players. In this lecture, I will illustrate some of these methods on simple examples.
Note that, since a player’s preferences on his actions depend on which actions the
other parties take, his action depends on his beliefs about what the others do. Of course,
what the others do depends on their beliefs about what each player does. In this way, a
player’s action, in principle, depends on the actions available to each player, each player’s
preferences on the outcomes, each player’s beliefs about which actions are available to
each player and how each player ranks the outcomes, and further his beliefs about each
player’s beliefs, ad infinitum.
When players think through what the other players will do, taking what the other
players think about them into account, they may find a clear way to play the game.
Consider the following “game”:
1\2    L       m       R
T     1, 1    0, 2    2, 1
M     2, 2    1, 1    0, 0
B     1, 0    0, 0   −1, 1
Here, there are two players, namely Player 1 and Player 2. Player 1 has strategies T, M, and B, and Player 2 has strategies L, m, and R. They pick their strategies simultaneously.
2 CHAPTER 1. INTRODUCTION
Every pair of strategies leads to a payoff for each player, measured by a real number. In each entry, the first number is the payoff of Player 1, and the second number is the payoff of Player 2. For instance, if Player 1 plays T and Player 2 plays R, then
Player 1 gets a payoff of 2 and Player 2 gets a payoff of 1. Let’s assume that each player
knows that these are the strategies and the payoffs, each player knows that each player
knows this, each player knows that each player knows that each player knows this,. . .
ad infinitum.
Now, Player 1 looks at his payoffs, and realizes that, no matter what the other player
plays, it is better for him to play M rather than B. That is, if Player 2 plays L, M
gives 2 and B gives 1; if Player 2 plays m, M gives 1, B gives 0; and if Player 2 plays
R, M gives 0, B gives −1. Therefore, he realizes that he should not play B. Now he
compares T and M . He realizes that, if Player 2 plays L or m, M is better than T , but
if she plays R, T is definitely better than M . Would Player 2 play R? What would she
play? To find an answer to these questions, Player 1 looks at the game from Player 2’s
point of view. He realizes that, for Player 2, there is no strategy that is outright better
than any other strategy. For instance, R is the best strategy if Player 1 plays B, but
otherwise it is strictly worse than m. Would Player 2 think that Player 1 would play
B? Well, she knows that Player 1 is trying to maximize his expected payoff, given by
the first entries as everyone knows. She must then deduce that Player 1 will not play B.
Therefore, Player 1 concludes, she will not play R (as it is worse than m in this case).
Ruling out the possibility that Player 2 plays R, Player 1 looks at his payoffs and sees
that M is now better than T , no matter what. On the other side, Player 2 goes through
similar reasoning, and concludes that Player 1 must play M , and therefore plays L.
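The elimination logic above can be sketched in code (an illustrative sketch of mine, not part of the original notes):

```python
# Iterated elimination of strictly dominated strategies for the 3x3 game above.
# u1[r][c] and u2[r][c] give Player 1's and Player 2's payoffs.
u1 = {"T": {"L": 1, "m": 0, "R": 2},
      "M": {"L": 2, "m": 1, "R": 0},
      "B": {"L": 1, "m": 0, "R": -1}}
u2 = {"T": {"L": 1, "m": 2, "R": 1},
      "M": {"L": 2, "m": 1, "R": 0},
      "B": {"L": 0, "m": 0, "R": 1}}

def eliminate(rows, cols):
    """Repeatedly delete strictly dominated strategies until none remain."""
    changed = True
    while changed:
        changed = False
        # A row r is strictly dominated if some row t does better against every column.
        for r in list(rows):
            if any(all(u1[t][c] > u1[r][c] for c in cols) for t in rows if t != r):
                rows.remove(r)
                changed = True
        # Symmetrically for columns, using Player 2's payoffs.
        for c in list(cols):
            if any(all(u2[r][t] > u2[r][c] for r in rows) for t in cols if t != c):
                cols.remove(c)
                changed = True
    return rows, cols

print(eliminate(["T", "M", "B"], ["L", "m", "R"]))  # (['M'], ['L'])
```

The procedure removes B, then R, then T, then m, reproducing the (M, L) prediction derived in the text.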
Exercise 1 In the above analysis, players are assumed to make many assumptions about the other players' reasoning capabilities. What are these assumptions? How would the analysis change if these assumptions were changed, e.g., if players acted rationally but assumed that the other parties play a random strategy?
The kind of reasoning in the above analyses does not always yield such a clear
prediction. Imagine that you want to meet with a friend in one of two places, about
which you both are indifferent. Unfortunately, you cannot communicate with each other
until you meet. This situation is formalized in the following game, which is called a pure coordination game:
1\2       Left    Right
Top       1, 1    0, 0
Bottom    0, 0    1, 1
Here, Player 1 chooses between Top and Bottom rows, while Player 2 chooses between
Left and Right columns. In each box, the first and the second numbers denote the payoffs
of players 1 and 2, respectively. Note that Player 1 prefers Top to Bottom if he knows
that Player 2 plays Left; he prefers Bottom if he knows that Player 2 plays Right.
Similarly, Player 2 prefers Left if she knows that Player 1 plays Top. There is no clear
prediction about the outcome of this game.
One may look for the stable outcomes (strategy profiles) in the sense that no player
has incentive to deviate if he knows that the other players play the prescribed strategies.
(Such strategy profiles are called Nash equilibria, after John Nash.) Here, Top-Left and Bottom-Right are such outcomes, but Bottom-Left and Top-Right are not stable in this sense. For instance, if Bottom-Left were known to be played, each player would like to deviate.
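This stability check is mechanical; a minimal sketch of it for the coordination game above (an illustration of mine, not part of the original notes):

```python
# A profile is stable (a pure-strategy Nash equilibrium) if neither player
# gains by deviating unilaterally.
u = {("Top", "Left"): (1, 1), ("Top", "Right"): (0, 0),
     ("Bottom", "Left"): (0, 0), ("Bottom", "Right"): (1, 1)}

rows, cols = ("Top", "Bottom"), ("Left", "Right")

def is_stable(r, c):
    no_row_deviation = all(u[(r, c)][0] >= u[(t, c)][0] for t in rows)
    no_col_deviation = all(u[(r, c)][1] >= u[(r, t)][1] for t in cols)
    return no_row_deviation and no_col_deviation

stable = [(r, c) for r in rows for c in cols if is_stable(r, c)]
print(stable)  # [('Top', 'Left'), ('Bottom', 'Right')]
```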
Unlike in this game, players mostly have different preferences over the outcomes, inducing conflict. In the following game, which is known as the Battle of Sexes, conflict and the need for coordination are present together.
1\2       Left    Right
Top       2, 1    0, 0
Bottom    0, 0    1, 2
Here, once again players would like to coordinate on Top-Left or Bottom-Right, but
now Player 1 prefers to coordinate on Top-Left, while Player 2 prefers to coordinate on Bottom-Right. The stable outcomes are again Top-Left and Bottom-Right.
The above analysis assumes that players take their actions simultaneously, so that a player does not observe the actions taken by the others when choosing his own action. In general, a player may observe some of the actions of some other players. Such knowledge may have a dramatic impact on the outcome of the game. For an illustration,
in the Battle of Sexes, imagine that Player 2 knows what Player 1 does when she takes
her action. This can be formalized via the tree in Figure 1.1. Here, Player 1 first
chooses between Top and Bottom, and then Player 2 chooses between Left and Right,
[Figure 1.1: The game tree. Player 1 moves first; Player 2 then chooses between her two actions after observing Player 1's choice.]
knowing what Player 1 has chosen. Clearly, now Player 2 would choose Left if Player
1 plays Top, and choose Right if Player 1 plays Bottom. Knowing this, Player 1 would
play Top. Therefore, one can argue that the only reasonable outcome of this game is
Top-Left. (This kind of reasoning is called backward induction.)
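The backward-induction argument can be sketched in code (an illustration of mine, using the Battle of Sexes payoffs with Player 2 moving second):

```python
# Backward induction in the sequential Battle of Sexes: Player 2 observes
# Player 1's move and best-responds; Player 1 anticipates this.
u = {("Top", "Left"): (2, 1), ("Top", "Right"): (0, 0),
     ("Bottom", "Left"): (0, 0), ("Bottom", "Right"): (1, 2)}

def backward_induction():
    # For each move of Player 1, find Player 2's best response.
    best_reply = {a1: max(("Left", "Right"), key=lambda a2: u[(a1, a2)][1])
                  for a1 in ("Top", "Bottom")}
    # Player 1 picks the move whose anticipated continuation pays him most.
    a1 = max(("Top", "Bottom"), key=lambda a: u[(a, best_reply[a])][0])
    return a1, best_reply[a1]

print(backward_induction())  # ('Top', 'Left')
```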
When Player 2 is able to see what the other player does, she gets only 1, while Player 1 gets 2. (In the previous game, two outcomes were stable, in which Player 2 would get 1 or 2.) That is, Player 2 prefers that Player 1 have information about what Player 2 does, rather than having the information about what Player 1 does herself. When it is common knowledge whether a player has some information, the player may prefer not to have that information, a robust fact that we will see in various contexts.
Exercise 2 Clearly, this is generated by the fact that Player 1 knows that Player 2
will know what Player 1 does when she moves. Consider the situation that Player 1
thinks that Player 2 will know what Player 1 does only with probability π < 1, and this
probability does not depend on what Player 1 does. What will happen in a “reasonable”
equilibrium? [By the end of this course, hopefully, you will be able to formalize this
situation, and compute the equilibria.]
[Figure 1.2: Player 1 first chooses between Exit, which yields (3/2, 3/2), and Play, which leads to the Battle of Sexes.]
Exercise 3 Consider the following version of the last game: after learning what Player 2 does, Player 1 gets a chance to change his action; then the game ends. In other words, Player 1 chooses between Top and Bottom; knowing Player 1's choice, Player 2 chooses between Left and Right; knowing Player 2's choice, Player 1 decides whether to stay where he is or to change his position. What is the "reasonable" outcome? What would happen if changing his action cost Player 1 c utiles?
Imagine that, before playing the Battle of Sexes, Player 1 has the option of exiting,
in which case each player will get 3/2, or playing the Battle of Sexes. When asked to
play, Player 2 will know that Player 1 chose to play the Battle of Sexes, as depicted
in Figure 1.2. There are two “reasonable” equilibria (or stable outcomes). One is that
Player 1 exits, thinking that, if he plays the Battle of Sexes, they will play the Bottom-
Right equilibrium of the Battle of Sexes, yielding only 1 for Player 1. The second one is
that Player 1 chooses to Play the Battle of Sexes, and in the Battle of Sexes they play
Top-Left equilibrium.
Some would argue that the first outcome is not really reasonable, because, when asked to play, Player 2 will know that Player 1 has chosen to play the Battle of Sexes, forgoing the payoff of 3/2. She must therefore realize that Player 1 cannot possibly be planning to play Bottom, which yields a payoff of at most 1. That is, when asked to play, Player 2 should understand that Player 1 is planning to play Top, and thus she should play Left. Anticipating this, Player 1 should choose to play the Battle of Sexes, in which they play Top-Left. Therefore, the second outcome is the only reasonable one.
Prisoners’ Dilemma
1\2          Cooperate    Defect
Cooperate    5, 5         0, 6
Defect       6, 0         1, 1
This is a well-known game that most of you know. Two prisoners are arrested for a crime for which there is no firm evidence, and they are interrogated in separate rooms. Each prisoner can either cooperate with the other and not confess the crime, or defect and confess the crime. In this game, no matter what the other player does, each player would like to defect, confessing the crime. This yields (1, 1). If they had both cooperated and not confessed the crime, each would get a better payoff of 5.
Hawk-Dove game
1\2      Hawk                    Dove
Hawk     (V−C)/2, (V−C)/2        V, 0
Dove     0, V                    V/2, V/2
This is an important biological game, but it is also quite similar to many games in Economics and Political Science. V is the value of a resource that one of the players will enjoy. If they share the resource, each gets V/2. Hawk stands for a "tough" strategy, whereby the player does not give up the resource. However, if the other player also plays Hawk, they end up fighting and each incurs the cost C/2. On the other hand, a Hawk player gets the whole resource for itself when playing against a Dove. When V > C, this is a Prisoners' Dilemma game, yielding a fight.
When V < C, so that fighting is costly, this game is similar to another well-known
game, named “Chicken”, where two players driving towards a cliff have to decide whether
to stop or continue. The one who stops first loses face, but may save his life. More generally, a class of games called "wars of attrition" is used to model this type of situation. In this case, a player would like to play Hawk if his opponent plays Dove, and play Dove if his opponent plays Hawk.
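The two regimes can be checked mechanically; a small sketch of mine (the particular values of V and C below are arbitrary illustrations):

```python
# Hawk-Dove payoffs as a function of V and C, and a check of when Hawk
# strictly dominates Dove (which happens iff V > C, the fight regime).
def hawk_dove(V, C):
    return {("Hawk", "Hawk"): ((V - C) / 2, (V - C) / 2),
            ("Hawk", "Dove"): (V, 0),
            ("Dove", "Hawk"): (0, V),
            ("Dove", "Dove"): (V / 2, V / 2)}

def hawk_dominates(V, C):
    """True iff Hawk strictly dominates Dove for the row player."""
    g = hawk_dove(V, C)
    return (g[("Hawk", "Hawk")][0] > g[("Dove", "Hawk")][0]
            and g[("Hawk", "Dove")][0] > g[("Dove", "Dove")][0])

print(hawk_dominates(4, 2))  # True: V > C, a fight as in the Prisoners' Dilemma
print(hawk_dominates(2, 4))  # False: V < C, the Chicken-like regime
```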
An investment game:
Here, two parties simultaneously decide whether to invest; the investment is more valuable if the other party also invests (as in the coordination game). For example, consider a potential worker and a potential employer, the potential worker deciding whether to get an education (investing in his human capital) and the potential employer deciding whether to invest in a technology that would require human capital. (Think about what the reasonable outcomes are for various values of θ and c. How would you analyze this situation if the players did not know the actual values of these parameters, but had some private information about what these values could be?)
Chapter 2
Decision Theory
[x ≽ y and y ≽ z] ⇒ x ≽ z.
A relation is a preference relation if and only if it is complete and transitive. Given any preference relation ≽, we can define strict preference ≻ by
x ≻ y ⟺ [x ≽ y and not y ≽ x]
and indifference ∼ by
x ∼ y ⟺ [x ≽ y and y ≽ x].
¹This is a matter of modeling. For instance, if we have options Coffee and Tea, we define the alternatives as "Coffee but no Tea," "Tea but no Coffee," "Coffee and Tea," and "no Coffee and no Tea."
This statement can be spelled out as follows. First, if u(x) ≥ u(y), then the player finds alternative x at least as good as alternative y. Second, and conversely, if the player finds x at least as good as y, then u(x) must be at least as high as u(y). In other words, the player acts as if he is trying to maximize the value of u(·).
The following theorem states further that a relation needs to be a preference relation in order to be represented by a utility function.
Theorem 2.1 Let X be finite. A relation can be represented by a utility function if and only if it is complete and transitive. Moreover, if u : X → R represents ≽, and if f : R → R is a strictly increasing function, then f ∘ u also represents ≽.
By the last statement, such utility functions are called ordinal, i.e., only the order
information is relevant.
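The last statement of Theorem 2.1 can be illustrated numerically; the utility values and the increasing transformation below are arbitrary choices of mine, not from the notes:

```python
# Ordinality: composing a utility function with a strictly increasing f
# preserves the induced ranking of the alternatives.
import math

alternatives = ["x", "y", "z"]
u = {"x": 1.0, "y": 4.0, "z": 2.5}      # some utility function on X (assumed)
f = math.exp                             # a strictly increasing f : R -> R
v = {a: f(u[a]) for a in alternatives}   # the composition f o u

def ranking(util):
    """Alternatives ordered from best to worst under `util`."""
    return sorted(alternatives, key=lambda a: util[a], reverse=True)

print(ranking(u))  # ['y', 'z', 'x']
print(ranking(v))  # ['y', 'z', 'x']: f o u represents the same preference
```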
In order to use this ordinal theory of choice, we should know the player’s preferences
on the alternatives. As we have seen in the previous lecture, in game theory, a player
chooses between his strategies, and his preferences on his strategies depend on the strate-
gies played by the other players. Typically, a player does not know which strategies the
other players play. Therefore, we need a theory of decision-making under uncertainty.
[Figure 2.1: Lottery 1, which gives $10 with probability 1/2 and $0 with probability 1/2.]
1. U : P → R represents ≽ in the ordinal sense. That is, if U(p) ≥ U(q), then the player finds lottery p at least as good as lottery q. And conversely, if the player finds p at least as good as q, then U(p) must be at least as high as U(q).
2. The function U takes a particular form: for each lottery p, U(p) is the expected value of u under p. That is, U(p) ≡ Σ_{x∈X} u(x) p(x). In other words, the player acts as if he wants to maximize the expected value of u. For instance, the expected utility of Lottery 1 for the player is E(u(Lottery 1)) = (1/2)u(10) + (1/2)u(0).²
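As a small numerical illustration of this formula (the square-root utility function below is an arbitrary assumption of mine, not taken from the notes):

```python
# Expected utility of a lottery: U(p) = sum over prizes x of p(x) * u(x).
def expected_utility(lottery, u):
    """lottery: dict mapping prizes to their probabilities; u: utility on prizes."""
    return sum(prob * u(prize) for prize, prob in lottery.items())

u = lambda x: x ** 0.5            # an illustrative (concave) utility function
lottery1 = {10: 0.5, 0: 0.5}      # Lottery 1: $10 w.p. 1/2, $0 w.p. 1/2
print(expected_utility(lottery1, u))  # 0.5*sqrt(10) + 0.5*sqrt(0) ≈ 1.581
```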
In the sequel, I will describe the necessary and sufficient conditions for a representation as in (2.1). The first condition states that the relation is indeed a preference relation:
²If X were a continuum, like R, we would compute the expected utility of p by ∫ u(x) dp(x).
[Figure 2.2: Two lotteries, p and q.]
This is necessary by Theorem 2.1, for U represents ≽ in the ordinal sense. The second condition is called the independence axiom, and it states that a player's preference between two lotteries p and q does not change if we toss a coin and give him a fixed lottery r if "tail" comes up.
Axiom 2.2 For any p, q, r ∈ P, and any a ∈ (0, 1], a p + (1 − a) r ≻ a q + (1 − a) r ⟺ p ≻ q.
Let p and q be the lotteries depicted in Figure 2.2. Then, the lotteries a p + (1 − a) r and a q + (1 − a) r can be depicted as in Figure 2.3, where we toss a coin between a fixed lottery r and our lotteries p and q. Axiom 2.2 stipulates that the player would not change his mind after the coin toss. Therefore, the independence axiom can be taken as an axiom of "dynamic consistency" in this sense.
The third condition is purely technical and is called the continuity axiom. It states that there are no "infinitely good" or "infinitely bad" prizes.
Axiom 2.3 For any p, q, r ∈ P with p ≻ r, there exist a, b ∈ (0, 1) such that a p + (1 − a) q ≻ r and p ≻ b q + (1 − b) r.
Axioms 2.1 and 2.2 imply that, given any p, q, r ∈ P and any a ∈ [0, 1], if p ∼ q, then a p + (1 − a) r ∼ a q + (1 − a) r.
[Figure 2.3: The compound lotteries a p + (1 − a) r and a q + (1 − a) r.]
[Figure 2.4: Indifference curves on the space of lotteries, drawn in the (p(x1), p(x2)) plane.]
2. The indifference curves, which are straight lines, are parallel to each other.
where u(x0) = 0, u(x1) = 1, and u(x2) = 2.
In a game, when a player chooses his strategy, in principle, he does not know what the
other players play. That is, he faces uncertainty about the other players’ strategies.
Hence, in order to define the player’s preferences, one needs to define his preference
under such uncertainty. In general, this makes modeling a difficult task. Fortunately,
using the utility representation above, one can easily describe these preferences in a
compact way.
Consider two players Alice and Bob with strategy sets S_A and S_B. If Alice plays s_A and Bob plays s_B, then the outcome is (s_A, s_B). Hence, it suffices to take the set of outcomes X = S_A × S_B = {(s_A, s_B) | s_A ∈ S_A, s_B ∈ S_B} as the set of prizes. Consider Alice. When she chooses her strategy, she has a belief about the strategies of Bob, represented by a probability distribution p on S_B, where p(s_B) is the probability that Bob plays s_B, for any strategy s_B. Given such a belief, each strategy s_A induces a lottery, which yields the outcome (s_A, s_B) with probability p(s_B). Therefore, we can consider each of her strategies as a lottery.
Example 2.1 Let S_A = {T, B} and S_B = {L, R}. Then, the outcome set is X = {TL, TR, BL, BR}. Suppose that Alice assigns probability p(L) = 1/3 to L and p(R) = 2/3 to R. Then, under this belief, her strategies yield the following lotteries: T yields TL with probability 1/3 and TR with probability 2/3, while B yields BL with probability 1/3 and BR with probability 2/3.
On the other hand, if she assigns probability p(L) = 1/2 to L and p(R) = 1/2 to R, then her strategies yield the following lotteries: T yields TL and TR each with probability 1/2, while B yields BL and BR each with probability 1/2.
More generally, if she assigns probability p to L and 1 − p to R, then T yields TL with probability p and TR with probability 1 − p, while B yields BL with probability p and BR with probability 1 − p. (2.3)
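In code, the passage from a belief to the induced lottery in (2.3) can be sketched as follows (the dictionary representation is my own illustration):

```python
# A strategy of Alice plus a belief about Bob induces a lottery over outcomes:
# the outcome (s_A, s_B) gets the probability that Alice assigns to s_B.
def induced_lottery(own_strategy, belief):
    """belief: dict mapping Bob's strategies to probabilities summing to 1."""
    return {(own_strategy, s_bob): prob for s_bob, prob in belief.items()}

belief = {"L": 1/3, "R": 2/3}         # Alice's belief from Example 2.1
print(induced_lottery("T", belief))   # T yields TL w.p. 1/3 and TR w.p. 2/3
print(induced_lottery("B", belief))   # B yields BL w.p. 1/3 and BR w.p. 2/3
```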
2.3. MODELING STRATEGIC SITUATIONS 17
u_A : S_A × S_B → R and u_B : S_A × S_B → R.
In the example above, all we need to do is to find four numbers for each player. The preferences of Alice are described by u_A(T, L), u_A(T, R), u_A(B, L), and u_A(B, R).
Example 2.2 In the previous example, assume that, regarding the lotteries in (2.3), the preference relation of Alice is such that
T ≻ B if p > 1/4, (2.4)
T ∼ B if p = 1/4,
B ≻ T if p < 1/4,
and she is indifferent between the sure outcomes (B, L) and (B, R). Under Axioms 2.1-2.3, we can represent her preferences by
u_A(T, L) = 3
u_A(T, R) = −1
u_A(B, L) = 0
u_A(B, R) = 0.
The derivation is as follows. By using the fact that she is indifferent between (B, L) and (B, R), we reckon that u_A(B, L) = u_A(B, R). By the second part of Theorem 2.2, we can set u_A(B, L) = u_A(B, R) = 0 (or any other number you like)! Moreover, in (2.3), the strategy T yields
U(T) = p u_A(T, L) + (1 − p) u_A(T, R).
That is, indifference at p = 1/4 requires
(1/4) u_A(T, L) + (3/4) u_A(T, R) = 0
and T ≻ B for p > 1/4 requires
u_A(T, L) > 0 > u_A(T, R).
In other words, all we need to do is to find numbers u_A(T, L) > 0 and u_A(T, R) < 0 with u_A(T, L) = −3 u_A(T, R), as in our solution. (Why would any two such numbers yield the same preference relation?)
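The utilities found in this example can be checked numerically against the preferences in (2.4) (a quick sketch of mine):

```python
# Verify that u_A(T,L)=3, u_A(T,R)=-1, u_A(B,L)=u_A(B,R)=0 generate exactly
# the preferences in (2.4): T is preferred iff p > 1/4.
uA = {("T", "L"): 3, ("T", "R"): -1, ("B", "L"): 0, ("B", "R"): 0}

def U(strategy, p):
    """Expected utility when Alice believes Bob plays L with probability p."""
    return p * uA[(strategy, "L")] + (1 - p) * uA[(strategy, "R")]

assert U("T", 0.5) > U("B", 0.5)      # p > 1/4: T preferred
assert U("T", 0.25) == U("B", 0.25)   # p = 1/4: indifference
assert U("T", 0.1) < U("B", 0.1)      # p < 1/4: B preferred
print("preferences match (2.4)")
```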
This is true if and only if the utility function is linear, i.e., u(x) = ax + b for some real numbers a and b. Therefore, an agent is risk-neutral if and only if he has a linear Von Neumann-Morgenstern utility function.
A decision maker is strictly risk-averse if and only if he rejects all fair gambles, except for the gamble that gives 0 with probability 1. That is,
Σ_x p(x) u(x) < u(0) = u(Σ_x p(x) x).
2.4. ATTITUDES TOWARDS RISK 19
Here, the inequality states that he rejects the lottery, and the equality is by the fact that the lottery is a fair gamble. As in the case of risk neutrality, it suffices to consider binary lotteries (x1, x2; p), in which case the above inequality reduces to
p u(x1) + (1 − p) u(x2) < u(p x1 + (1 − p) x2)
for all p ∈ (0, 1). Therefore, strict risk-aversion is equivalent to having a strictly concave utility function. A decision maker is said to be risk-averse iff he has a concave utility function, i.e., u(p x + (1 − p) y) ≥ p u(x) + (1 − p) u(y) for each x, y, and p. Similarly, a decision maker is said to be (strictly) risk-seeking iff he has a (strictly) convex utility function.
Consider Figure 2.5. A risk-averse decision maker's expected utility is EU = p u(W1) + (1 − p) u(W2) if he holds a gamble that gives W1 with probability p and W2 with probability 1 − p. On the other hand, if he had the expected value pW1 + (1 − p)W2 for sure, his utility would be u(pW1 + (1 − p)W2). Hence, the chord AB is the utility difference that this risk-averse agent would lose by taking the gamble instead of its expected value. Likewise, the chord BC is the maximum amount that he is willing to pay in order to avoid taking the gamble instead of its expected value. For example, suppose that W2 is his wealth level, W2 − W1 is the value of his house, and p is the probability that the house burns down. In the absence of fire insurance, the expected utility of this individual is EU(gamble), which is lower than the utility of the expected value of the gamble.
[Figure 2.5: A concave utility function u, with the wealth levels W1, pW1 + (1 − p)W2, and W2 on the horizontal axis; point A is at u(pW1 + (1 − p)W2) and point B at EU(gamble).]
that the two agents form a mutual fund by pooling their assets, each agent owning half of the mutual fund. This mutual fund gives $200 with probability 1/4 (when both assets yield high dividends), $100 with probability 1/2 (when only one of the assets gives a high dividend), and $0 with probability 1/4 (when both assets yield low dividends). Thus, each agent's share in the mutual fund yields $100 with probability 1/4, $50 with probability 1/2, and $0 with probability 1/4. Therefore, his expected utility from the share in this mutual fund is EU = (1/4)√100 + (1/2)√50 + (1/4)√0 ≅ 6.0355. This is clearly larger than the expected utility from his own asset, which is only 5. Therefore, the above agents gain from sharing the risk in their assets.
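The computation can be replicated directly (a minimal sketch of the mutual-fund example above):

```python
# Risk sharing: pooling two i.i.d. assets and splitting the fund raises each
# agent's expected utility, because u(x) = sqrt(x) is concave.
from math import sqrt

own_asset = {100: 0.5, 0: 0.5}               # each agent's original asset
fund_share = {100: 0.25, 50: 0.5, 0: 0.25}   # half of the pooled payoffs

def EU(lottery):
    return sum(p * sqrt(x) for x, p in lottery.items())

print(EU(own_asset))   # 5.0
print(EU(fund_share))  # ≈ 6.0355, larger than 5
```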
2.4.2 Insurance
Imagine a world where, in addition to one of the agents above (with utility function u : x ↦ √x and a risky asset that gives $100 with probability 1/2 and $0 with probability 1/2), we have a risk-neutral agent with lots of money. We call this new agent the insurance company. The insurance company can insure the agent's asset by paying him $100 if the asset happens to yield $0. What premium P would the agent be willing to pay for this insurance? [A premium is an amount that is to be paid to the insurance company regardless of the outcome.]
If the risk-averse agent pays premium P and buys the insurance, his wealth will be $100 − P for sure. If he does not, then his wealth will be $100 with probability 1/2 and $0 with probability 1/2. Therefore, he is willing to pay P in order to get the insurance iff
u(100 − P) ≥ (1/2) u(0) + (1/2) u(100),
i.e., iff
√(100 − P) ≥ (1/2)√0 + (1/2)√100.
The above inequality is equivalent to
P ≤ 100 − 25 = 75.
That is, he is willing to pay up to 75 dollars of premium for the insurance. On the other hand, if the insurance company sells the insurance for premium P, it gets P for sure and pays $100 with probability 1/2. Therefore, it is willing to take the deal iff
P ≥ (1/2) · 100 = 50.
Therefore, both parties gain if the insurance company insures the asset for a premium P ∈ (50, 75), a deal both parties are willing to accept.
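The two premium bounds can be computed directly (a sketch of the calculation above):

```python
# Insurance: the agent accepts premium P iff sqrt(100 - P) >= 5, the expected
# utility of the uninsured asset; the risk-neutral insurer needs P >= 50.
from math import sqrt

eu_uninsured = 0.5 * sqrt(100) + 0.5 * sqrt(0)   # = 5
agent_max = 100 - eu_uninsured ** 2              # from sqrt(100 - P) >= 5
insurer_min = 0.5 * 100                          # the insurer's expected payout

print(agent_max, insurer_min)  # 75.0 50.0: any P in (50, 75) benefits both
```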
Exercise 2.1 Now consider the case in which we have two identical risk-averse agents as above, and the insurance company. The insurance company is to charge the same premium P for each agent, and the risk-averse agents have the option of forming a mutual fund. What is the range of premiums that are acceptable to all parties?
(a)
 2,−2    1,1    −3,7         12,−1    5,0    −3,2
 1,10    0,4     0,4          5,3     3,1     3,1
−2,1     1,7    −1,−5        −1,0     5,2     1,−2
(b)
 1,2     7,0     4,−1         1,5     7,1     4,−1
 6,1     2,2     8,4          6,3     2,4     8,8
 3,−1    9,2     5,0          3,−1    9,5     5,1
Solution: Recall from Theorem 2.2 that two utility functions represent the same preferences over lotteries if and only if one is an affine transformation of the other. That is, we must have v_i = a_i u_i + b_i for some a_i > 0 and b_i, where u_i and v_i are the utility functions of player i in the games on the left and on the right, respectively. In part (a), the preferences of Player 1 are different in the two games. To see this, note that there is a cell in which Player 1's payoff is 0 in the left game and 3 in the right game; hence, we must have b_1 = 3. Moreover, there is a cell with payoffs 1 and 5, respectively; hence, we must have a_1 = 2. But then, in the cell where Player 1's payoff in the left game is 2, a_1 · 2 + b_1 = 7 ≠ 12, showing that it is impossible to have an affine transformation. Similarly, one can check that the preferences of Player 2 are different in part (b).
2.5. EXERCISES WITH SOLUTION 23
Now, comparisons of the payoffs in two pairs of cells yield a_2 = 2 and b_2 = 1, but then the payoffs in a third cell do not match under the resulting transformation.
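The affine-transformation test used in this solution can be automated; a brute-force sketch of my own (the payoff lists are Player 1's payoffs in part (a), read cell by cell, left game then right game):

```python
# By Theorem 2.2, two payoff lists represent the same preferences over
# lotteries iff v = a*u + b cell by cell for some a > 0 and some b.
def affine_equivalent(u, v, tol=1e-9):
    """u, v: one player's payoffs in the two games, read cell by cell."""
    pairs = sorted(set(zip(u, v)))
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            (x1, y1), (x2, y2) = pairs[i], pairs[j]
            if x1 == x2:
                continue  # cannot fit a slope from equal u-values
            a = (y2 - y1) / (x2 - x1)
            b = y1 - a * x1
            if a > 0 and all(abs(a * x + b - y) < tol for x, y in zip(u, v)):
                return True
    return False

# Player 1's payoffs in part (a), left and right games, cell by cell:
u1_left  = [2, 1, -3, 1, 0, 0, -2, 1, -1]
u1_right = [12, 5, -3, 5, 3, 3, -1, 5, 1]
print(affine_equivalent(u1_left, u1_right))  # False, as the solution finds
```

The same function applied to Player 2's payoffs in part (b) also returns False, matching the argument above.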
2. [Homework 1, 2011] Alice and Bob want to meet in one of three places, namely Aquarium (denoted by A), Boston Commons (denoted by B), and a Celtics game (denoted by C). Each of them has strategies A, B, and C. If they both play the same strategy, then they meet at the corresponding place, and they end up at different places if their strategies do not match. You are asked to find a pair of utility functions to represent their preferences, assuming that they are expected utility maximizers.
Alice's preferences: She prefers any meeting to not meeting, and she is indifferent about where they end up if they do not meet. She is indifferent between a situation in which she will meet Bob at A, or B, or C, each with probability 1/3, and a situation in which she meets Bob at A with probability 1/2 and does not meet Bob with probability 1/2. If she believes that Bob goes to Boston Commons with probability p and to the Celtics game with probability 1 − p, she weakly prefers to go to Boston Commons if and only if p ≥ 1/3.
Bob's preferences: If he goes to the Celtics game, he is indifferent as to where Alice goes. If he goes to the Aquarium or Boston Commons, then he prefers any meeting to not meeting, and he is indifferent about where they end up in case they do not meet. He is indifferent between playing A, B, and C if he believes that Alice may choose any of her strategies with equal probabilities.
(a) Assuming that they are expected utility maximizers, find a pair of utility functions u_A : {A, B, C}² → R and u_B : {A, B, C}² → R that represent the preferences of Alice and Bob on the lotteries over {A, B, C}².
Solution: Alice's utility function is determined as follows. Since she is indifferent between any (x, y) with x ≠ y, by Theorem 2.2 one can normalize her payoff for any such strategy profile to u_A(x, y) = 0. Moreover, since she prefers meeting to not meeting, u_A(x, x) > 0 for all x ∈ {A, B, C}. By Theorem 2.2, one can also set u_A(C, C) = 1 by a normalization. The indifference condition in the question can then be written as
(1/3) u_A(A, A) + (1/3) u_A(B, B) + (1/3) u_A(C, C) = (1/2) u_A(A, A).
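A quick numerical check of this construction (the specific values below are one consistent solution implied by the stated conditions under the normalization u_A(C, C) = 1; they are not given in the text):

```python
# Hypothetical candidate utilities for Alice: u_A(A,A)=6, u_A(B,B)=2,
# u_A(C,C)=1, and 0 off the diagonal (values consistent with the conditions).
uA = {("A", "A"): 6, ("B", "B"): 2, ("C", "C"): 1}

# At p = 1/3 she is indifferent between going to B (meet w.p. p) and to C.
p = 1 / 3
assert abs(p * uA[("B", "B")] - (1 - p) * uA[("C", "C")]) < 1e-12

# Meeting at A, B, or C each w.p. 1/3 is as good as meeting at A w.p. 1/2.
lhs = sum(uA.values()) / 3
rhs = uA[("A", "A")] / 2
assert abs(lhs - rhs) < 1e-12
print("conditions satisfied")
```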
3. [Homework 1, 2011] In this question you are asked to price a simplified version of mortgage-backed securities. A banker lends money to n homeowners, where each homeowner signs a mortgage contract. According to the mortgage contract, the homeowner is to pay the lender 1 million dollars, but he may go bankrupt with probability π, in which case there will be no payment. There is also an investor who can buy a contract, in which case he would receive the payment from the homeowner who has signed the contract. The utility function of the investor is given by u(x) = −exp(−αx), where x is the net change in his wealth.
(a) How much is the investor willing to pay for a mortgage contract?
That is,
P ≤ P* ≡ −(1/α) ln(π + (1 − π) exp(−α)),
where P* is the maximum he is willing to pay.
(b) Now suppose that the banker can form "mortgage-backed securities" by pooling all the mortgage contracts and dividing them equally. A mortgage-backed security yields 1/n of the total payments by the homeowners; i.e., if k homeowners go bankrupt, a security pays (n − k)/n million dollars. Assume that the homeowners' bankruptcies are stochastically independent of each other. How much is the investor willing to pay for a mortgage-backed security? Assuming that n is large, find an approximate value for the price he is willing to pay. [Hint: for large n, approximately, the average payment is normally distributed with mean 1 − π (million dollars) and variance π(1 − π)/n. If X is normally distributed with mean μ and variance σ², then the expected value of exp(−αX) is exp(−α(μ − (1/2)ασ²)).] How much more can the banker raise by creating mortgage-backed securities? (Use the approximate values for large n.)
Solution: Writing C(n, k) for the number of combinations of k out of n, the probability that there are k bankruptcies is C(n, k) π^k (1 − π)^(n−k). If he pays P for a mortgage-backed security, his net revenue in the case of k bankruptcies is 1 − k/n − P. Hence, his expected payoff is
−Σ_{k=0}^{n} exp(−α(1 − k/n − P)) C(n, k) π^k (1 − π)^(n−k).
He is willing to pay P if the above amount is at least −1, the payoff from 0. Therefore, he is willing to pay at most
P* = 1 − (1/α) ln(Σ_{k=0}^{n} exp(αk/n) C(n, k) π^k (1 − π)^(n−k)).
For large n,
P* ≅ 1 − (1/α) ln(exp(α(π + απ(1 − π)/(2n)))) = 1 − π − απ(1 − π)/(2n).
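The exact and approximate prices can be compared numerically (a sketch of mine; the parameter values below are arbitrary illustrations):

```python
# Willingness to pay for a 1/n share of n independent mortgages: the exact
# binomial sum versus the large-n normal approximation derived above.
from math import comb, exp, log

def price_exact(n, pi, alpha):
    total = sum(comb(n, k) * exp(alpha * k / n) * pi**k * (1 - pi)**(n - k)
                for k in range(n + 1))
    return 1 - log(total) / alpha

def price_approx(n, pi, alpha):
    return 1 - pi - alpha * pi * (1 - pi) / (2 * n)

n, pi, alpha = 100, 0.1, 2.0
print(price_exact(n, pi, alpha), price_approx(n, pi, alpha))  # both ≈ 0.8991
```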
(c) Answer part (b) by assuming instead that the homeowners' bankruptcies are perfectly correlated: with probability π all homeowners go bankrupt, and with probability 1 − π none of them does. Briefly compare your answers to parts (b) and (c).
Solution: With perfect correlation, a mortgage-backed security is equivalent to a single contract, and hence he is willing to pay at most P*. In general, when there is a positive correlation between the bankruptcies of different homeowners (e.g., due to macroeconomic conditions), the value of mortgage-backed securities will be less than what it would have been under independence. Therefore, mortgage-backed securities that are priced under the erroneous assumption of independence would be overpriced.
2.6 Exercises
1. [Homework 1, 2000] Consider a decision maker with a Von Neumann-Morgenstern utility function u with u(x) = (x − 1)². Check whether the following VNM utility functions can represent this decision maker's preferences. (Provide the details.)
(a) u* : x ↦ x − 1;
(c) û : x ↦ −(x − 1)²;
(d) ũ : x ↦ 2(x − 1)² − 1.
2. [Homework 1, 2004] Which of the following pairs of games are strategically equiv-
alent, i.e., can be taken as two different representations of the same decision prob-
lem?
(a)
L R L R
T 2,2 4,0 T -6,4 0,0
B 3,3 1,0 B -3,6 -9,0
(b)
L R L R
T 2,2 4,0 T 4,4 16,0
B 3,3 1,0 B 9,9 1,0
(c)
L R L R
T 2,2 4,0 T 4,2 2,0
B 3,3 1,0 B 3,3 1,0
3. [Homework 1, 2001] We have two dates: 0 and 1. We have a security that pays a
single dividend, at date 1. The dividend may be either $100, or $50, or $0, each
with probability 1/3. Finally, we have a risk-neutral agent with a lot of money.
(The agent will learn the amount of the dividend at the beginning of date 1.)
(a) An agent is asked to decide whether to buy the security or not at date 0. If he
decides to buy, he needs to pay for the security only at date 1 (not immediately
at date 0). What is the highest price at which the risk-neutral agent is
willing to buy this security?
(b) Now consider an “option” that gives the holder the right (but not obligation)
to buy this security at a strike price at date 1 – after the agent learns
the amount of the dividend. If the agent buys this option, what would be the
agent’s utility as a function of the amount of the dividend?
(c) An agent is asked to decide whether to buy this option or not at date 0. If he
decides to buy, he needs to pay for the option only at date 1 (not immediately
at date 0). What is the highest price at which the risk-neutral agent is
willing to buy this option?
4. [Homework 1, 2001] Take X = R, the set of real numbers, as the set of alternatives. Define a relation ≽ on X by
x ≽ y ⟺ x ≥ y − 1/2 for all x, y ∈ X.
 ⇐⇒ [ º and º
6 ]
and
∼ ⇐⇒ [ º and º ]
Representation of Games
We are now ready to formally introduce games and some fundamental concepts, such as that of a strategy. In order to analyze a strategic situation, one needs to know
A game is just a formal representation of the above information. This is usually done
in one of the following two ways:
Both forms of representation are useful in their own way, and I will use both representations extensively throughout the course.
It is important to emphasize that, when describing what a player knows, one needs to
specify not only what he knows about external parameters, such as the payoffs, but also
what he knows about the other players’ knowledge and beliefs about these parameters,
30 CHAPTER 3. REPRESENTATION OF GAMES
as well as what he knows about the other players’ knowledge of his own beliefs, and so
on. In both representations such information is encoded in an economical manner. In the
first half of this course, we will focus on non-informational issues, by confining ourselves
to the games of complete information, in which everything that is known by a player is
known by everybody. In the second half, we will focus on informational issues, allowing
players to have asymmetric information, so that one may know a piece of information
that is not known by another.
The outline of this lecture is as follows. The first section is devoted to the extensive-
form representation of games. The second section is devoted to the concept of strategy.
The third section is devoted to the normal-form representation, and the equivalence
between the two representations. The final section contains exercises and some of their
solutions.
3. for any two nodes, there is a unique path that connects these two nodes.
For a visual aid, imagine the branches of a tree arising from its trunk. For example,
the graph in Figure 3.1 is a tree. There is a unique starting node, and it branches out
from there without forming a loop. It does look like a tree. On the other hand, the
graphs in Figure 3.2 are not trees. In the graph on the left-hand side, there are two
alternative paths to node A from the initial node, one through node B and one through
node C. This violates the third condition. (Here, the second condition in the definition is
also violated, as there are two incoming edges to node A.) On the right-hand side, there
is no path that connects the nodes x and y, once again violating the third condition.
(Once again, the second condition is also violated.)
[Figure 3.2: two graphs that are not trees, one with nodes A, B, and C and one with nodes x and y]
Note that edges (or arrows) come with labels, which can be the same for two different
arrows. In a game tree there are two types of nodes: terminal nodes, at which the game
ends, and non-terminal nodes, at which a player would need to make a further decision.
This is formally stated as follows.
Definition 3.2 The nodes that are not followed by another node are called terminal.
[Figure 3.3: the terminal and non-terminal nodes of the tree in Figure 3.1]
For example, the terminal and non-terminal nodes for the game tree in Figure 3.1
are as in Figure 3.3. There is no outgoing arrow at any terminal node, indicating that
the game has ended. A terminal node may also be referred to as an outcome in the
game. At such a node, we need to specify the players' payoffs towards describing their
preferences among the outcomes. On the other hand, there are some outgoing arrows at
any non-terminal node, indicating that some further decisions are to be made. In that
case, one needs to describe who makes a decision and what he knows at the time of the
decision. A game is formally defined accordingly: an extensive-form game consists of
• a set of players,
• a tree,
• an allocation of each non-terminal node to a player,
• an informational partition of the non-terminal nodes (described below), and
• a payoff for each player at each terminal node.
Players The set of players consists of the decision makers or actors who make some
decision during the course of the game. Some games may also contain a special player,
Nature (or Chance), that represents the uncertainty the players face, as will be explained
in Subsection 3.1.4. The set of players is often denoted by
N = {1, 2, . . . , n}.
Outcomes and Payoffs The set of terminal nodes is often denoted by Z. At a terminal
node, the game has ended, leading to some outcome. At that point, one specifies a
payoff, which is a real number, for each player i. The mapping
u_i : Z → R
that maps each terminal node to the payoff of player i at that node is the von Neumann
and Morgenstern utility function of player i. Recall from the previous chapter that this
means that player i tries to maximize the expected value of u_i. That is, given any two
lotteries p and q on Z, he prefers p to q if and only if p leads to a higher expected value for
the utility function u_i than q does, i.e., Σ_{z∈Z} p(z) u_i(z) ≥ Σ_{z∈Z} q(z) u_i(z). Recall also that these
preferences do not change if we multiply all payoffs by a fixed positive number or add a
fixed number to all payoffs. The preferences do change under any other transformation.
Moves come with their labels, and two different arrows can have the same label; in that
case, they represent the same move.
[Figure 3.4: Matching Pennies with perfect information: Player 1 chooses Head or Tail; observing this choice, Player 2 chooses head or tail]
Example 3.1 (Matching Pennies with Perfect Information) Consider the game
in Figure 3.4. The tree consists of 7 nodes. The first one is allocated to Player 1, and
the next two to Player 2. The four end-nodes have payoffs attached to them. Since
there are two players, payoff vectors have two elements. The first number is the payoff
of Player 1 and the second is the payoff of Player 2. These payoffs are von Neumann-
Morgenstern utilities. That is, each player tries to maximize the expected value of his
own payoffs given his beliefs about how the other players will play the game.
One also needs to describe what the player knows at the moment of his decision
making. This is formally done by information sets: an information set of a player is a
collection of nodes such that (1) the player is to move at each of these nodes, and (2)
the same moves are available at each of these nodes.
The meaning of an information set is that when the individual is in that information
set, he knows that one of the nodes in the information set is reached, but he cannot
rule out any of the nodes in the information set. Moreover, in a game, the information
set belongs to the player who is to move in the given information set, representing his
uncertainty. That is, the player who is to move at the information set is unable to
distinguish between the nodes in the information set, but is able to distinguish the
nodes outside the information set from those in it. Therefore, the above definition would
be meaningless without condition 1, while condition 2 requires that the player knows his
available choices. The latter condition can be taken as a simplifying assumption. I also
refer to information sets as histories and write h for a generic history at which player i
moves.
For an example, consider the game in Figure 3.5. Here, Player 2 knows that Player
1 has taken action T or B and not action X; but Player 2 cannot know for sure whether
Player 1 has taken T or B.1
[Figure 3.5: Player 1 chooses among T, B, and X; the nodes reached after T and after B lie in a single information set of Player 2, who then chooses L or R]
Example 3.2 (Matching Pennies with Perfect Information) In Figure 3.4, the
informational partition is very simple. Every information set has only one element.
Hence, there is no uncertainty regarding the previous play in the game.
1. Throughout the course, the information sets are depicted either by circles (as in sets) or by dashed
curves connecting the nodes in the information sets, depending on convenience. Moreover, the information
sets that contain only one node are not depicted in the figures. For example, in Figure 3.5, the initial
node is in an information set that contains only that node.
A game is said to have perfect information if every information set has only one
element. Recall that in a tree, each node is reached through a unique path. Hence,
in a perfect-information game, a player can reconstruct the previous play perfectly. For
instance, in Figure 3.4, Player 2 knows whether Player 1 chose Head or Tail. And Player
1 knows that when he plays Head or Tail, Player 2 will know what Player 1 has played.
[Figure 3.6: Nature chooses Head or Tail with probability 1/2 each; after Head, Player 1 chooses Left, yielding (5, 0), or Right, yielding (2, 2); after Tail, Player 2 chooses Left, yielding (3, 3), or Right, yielding (0, −5)]
The structure of a game is assumed to be known by all the players: all players know
the structure, all players know that all players know the structure, and so on. That is, in
a more formal language, the structure of the game is common knowledge.2 For example,
in the game of Figure 3.5, Player 1 knows that if he chooses T or B, Player 2 will know
that Player 1 has chosen one of these two actions without being able to rule out either
one. Moreover, Player 2 knows that Player 1 has the above knowledge, and Player 1
knows that Player 2 knows it, and so on. Using information sets and richer game trees,
one can model arbitrary information structures like this. For example, one could also
model a situation in which Player 1 does not know whether Player 2 can distinguish the
actions T and B. One could do that by having three information sets for Player 2: one
of them is reached only after T, one of them is reached only after B, and one of them
can be reached after both T and B. Towards modeling the uncertainty of Player 1, one
would further introduce a chance move, whose outcome leads either to the first two
information sets (the observable case) or to the last information set (the unobservable case).
Exercise 3.1 Write the variation of the game in Figure 3.5 in which Player 1 believes
that Player 2 can distinguish actions T and B with probability 1/3 and cannot distinguish
them with probability 2/3, and these beliefs are common knowledge.
To sum up: At any node, the following are known: which player is to move, which
moves are available to the player, and which information set contains the node, sum-
marizing the player’s information at the node. Of course, if two nodes are in the same
information set, the available moves in these nodes must be the same, for otherwise the
player could distinguish the nodes by the available choices. Again, all these are assumed
to be common knowledge.
2. Formally, a proposition P is said to be common knowledge if all of the following are true: P is
true; everybody knows that P is true; everybody knows that everybody knows that P is true; . . . ;
everybody knows that . . . everybody knows that P is true, ad infinitum.
3.2 Strategies
Definition 3.6 A strategy of a player is a complete contingent plan determining which
action he will take at each information set at which he is to move (including the information sets
that will not be reached according to this strategy). More mathematically, a strategy of
player i is a function s_i that maps every information set h of player i to an action that
is available at h. In particular, such a plan must satisfy the following conditions:
1. One must assign a move to every information set of the player. (If we omit to
assign a move for an information set, we would not know what the player would
have done when that information set is reached.)
2. The assigned move must be available at the information set. (If the assigned move
is not available at an information set, then the plan would not be feasible as it
could not be executed when that information set is reached.)
3. At all nodes in a given information set, the player plays the same move. (After
all, the player cannot distinguish those nodes from each other.)
Example 3.3 (Matching Pennies with Perfect Information) In Figure 3.4, Player
1 has only one information set. Hence, the set of strategies for Player 1 is {Head, Tail}.
On the other hand, Player 2 has two information sets. Hence, a strategy of Player 2
determines what to do at each information set, i.e., depending on what Player 1 does.
So, her strategies are HH, HT, TH, and TT, where the first letter is her move after Head
and the second letter is her move after Tail.
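The construction in this example can be sketched in code: since a strategy assigns one available action to each information set, the strategy set is the Cartesian product of the action sets. (A minimal sketch; the dictionary representation and the information-set names are illustrative assumptions, not notation from the text.)

```python
from itertools import product

# Player 2's information sets in Figure 3.4, with the actions available at each.
info_sets = {
    "after Head": ["h", "t"],
    "after Tail": ["h", "t"],
}

# A strategy maps every information set to an available action, so the
# strategy set is the Cartesian product of the action lists.
names = list(info_sets)
strategies = [dict(zip(names, choice))
              for choice in product(*(info_sets[n] for n in names))]

for s in strategies:
    print(s)
# Four strategies in total, corresponding to HH, HT, TH, and TT.
```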
Example 3.4 In Figure 3.5, both players have one information set. Hence, the sets of
strategies for Players 1 and 2 are
{T, B, X} and {L, R},
respectively. Although Player 2 moves at two different nodes, they are both in the same
information set. Hence, she needs to either play L at both nodes or play R at both nodes.
For certain purposes it might suffice to look at reduced-form strategies. A reduced-form
strategy is defined as an incomplete contingent plan that determines which
action the agent will take at each information set at which he is to move and that has not been
precluded by this plan. But for many other purposes we need to look at all the strategies,
and throughout the course we will consider all strategies.
What are the outcomes of the players' strategies? What are the payoffs generated by
those strategies? Towards answering these questions, we first need a bit of jargon.
Definition 3.7 In a game with players N = {1, . . . , n}, a strategy profile is a list
s = (s_1, . . . , s_n)
of strategies, one for each player.
Definition 3.8 In a game without Nature, each strategy profile s leads to a unique
terminal node z(s), called the outcome of s. The payoff vector from the strategy profile s is the
payoff vector at z(s).
Sometimes the outcome is also described by the resulting history, which is also
called the path of play.
Example 3.5 (Matching Pennies with Perfect Information) In Figure 3.4, if Player
1 plays Head and Player 2 plays HH, then the outcome is the path (Head, head),
and the payoff vector is (−1, 1). If Player 1 plays Head and Player 2 plays HT, the
outcome is the same, yielding the payoff vector (−1, 1). If Player 1 plays Tail and
Player 2 plays HT, then the outcome is now the path (Tail, tail),
but the payoff vector is (−1, 1) once again. Finally, if Player 1 plays Tail and Player 2
plays TH, then the outcome is the path (Tail, head),
and the payoff vector is (1, −1). One can compute the payoffs for the other strategy
profiles similarly.
In games with Nature, a strategy profile leads to a probability distribution on the set
of terminal nodes. The outcome of the strategy profile is then the resulting probability
distribution. The payoff vector from the strategy profile is the expected payoff vector
under the resulting probability distribution.
Example 3.6 (A game with Chance) In Figure 3.6, each player has two strategies,
Left and Right. The outcome of the strategy profile (Left, Left) is the lottery that
yields (5, 0) with probability 1/2 and (3, 3) with probability 1/2, so the payoff vector is
u(Left, Left) = (1/2)(5, 0) + (1/2)(3, 3) = (4, 3/2).
Sometimes, it suffices to summarize all of the information above by the sets of strategies
and the utility vectors from the strategy profiles, computed as above. Such a
summary representation is called the normal-form or strategic-form representation.
Formally, a normal-form game is a list
G = (S_1, . . . , S_n; u_1, . . . , u_n),
where, for each i ∈ N = {1, . . . , n}, S_i is the set of all strategies that are available to
player i, and
u_i : S_1 × · · · × S_n → R
is the payoff function of player i.
Notice that a player's utility depends not only on his own strategy but also on the
strategies played by the other players. Moreover, each player i tries to maximize the expected
value of u_i (where the expected values are computed with respect to his own beliefs); in
other words, u_i is a von Neumann-Morgenstern utility function. We will say that player
i is rational iff he tries to maximize the expected value of u_i (given his beliefs).
It is also assumed that it is common knowledge that the players are N = {1, . . . , n},
that the set of strategies available to each player i is S_i, and that each i tries to maximize
the expected value of u_i given his beliefs.
When there are only two players, we can represent the normal-form game by a
bimatrix (i.e., by two matrices combined in one table): each row corresponds to a
strategy of Player 1 and each column to a strategy of Player 2. In each box, the first
number is Player 1's payoff and the second one is Player 2's payoff.
Example 3.7 (Matching Pennies with Perfect Information) In Figure 3.4, based
on the earlier analysis, the normal or strategic form of the matching-pennies game
with perfect information is

1\2 HH HT TH TT
Head −1, 1 −1, 1 1, −1 1, −1
Tail 1, −1 −1, 1 1, −1 −1, 1
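The passage from the tree to this table can be sketched as follows: for each strategy profile, follow the selected moves to a terminal node and read off the payoffs there. (A hypothetical sketch; the payoff convention that Player 2 wins when the pennies match is taken from the table above.)

```python
# Deriving the normal form of Figure 3.4 from the tree: follow the path
# selected by each strategy profile to a terminal node and read off payoffs.
# Convention (consistent with the table in the text): if the pennies match,
# Player 2 wins, giving payoffs (-1, 1); otherwise (1, -1).

def payoff(s1, s2):
    # s2 is a two-letter strategy: (move after Head, move after Tail).
    move2 = s2[0] if s1 == "H" else s2[1]
    return (-1, 1) if s1 == move2 else (1, -1)

for s1 in ["H", "T"]:
    row = [payoff(s1, s2) for s2 in ["HH", "HT", "TH", "TT"]]
    print(s1, row)
```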
Information sets are very important. To see this, consider the following standard
matching-penny game. This game has imperfect information.
Example 3.8 (Matching Pennies) Consider the game in Figure 3.7. This is the
standard matching-pennies game, which has imperfect information as the players move
simultaneously. In this game, each player has only two strategies: Head and Tail. The
normal-form representation is

1\2 Head Tail
Head −1, 1 1, −1
Tail 1, −1 −1, 1
The two matching-pennies games may appear similar in extensive form, but they
correspond to two distinct situations. Under perfect information, Player 2 knows what
Player 1 has done, while under imperfect information neither player knows the other
player's move.
As mentioned above, when there are chance moves, one needs to compute the ex-
pected payoffs in order to obtain the normal-form representation. This is illustrated in
the next example.
Example 3.9 (A game with Nature) As mentioned, in Figure 3.6, each player has
two strategies, Left and Right. Following the earlier calculations, the normal-form
representation is

1\2 Left Right
Left 4, 3/2 5/2, −5/2
Right 5/2, 5/2 1, −3/2
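The expected payoffs under chance moves can be recomputed in a few lines, assuming the reconstruction of the chance game given earlier (Nature plays Head or Tail with probability 1/2; Player 1 moves after Head, Player 2 after Tail):

```python
from fractions import Fraction

half = Fraction(1, 2)

# Figure 3.6, as reconstructed in the text.
payoffs_after_head = {"Left": (5, 0), "Right": (2, 2)}   # chosen by Player 1
payoffs_after_tail = {"Left": (3, 3), "Right": (0, -5)}  # chosen by Player 2

def expected_payoff(s1, s2):
    # Expected payoff vector: 1/2 weight on each branch of the chance move.
    a = payoffs_after_head[s1]
    b = payoffs_after_tail[s2]
    return tuple(half * x + half * y for x, y in zip(a, b))

for s1 in ["Left", "Right"]:
    for s2 in ["Left", "Right"]:
        print((s1, s2), expected_payoff(s1, s2))
# (Left, Left) gives (4, 3/2), matching the calculation in the text.
```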
Definition 3.10 A mixed strategy of a player is a probability distribution over the set
of his strategies.
If player i has strategies S_i = {s_i1, s_i2, . . . , s_ik}, then a mixed strategy σ_i for player i
is a function σ_i on S_i such that 0 ≤ σ_i(s_ij) ≤ 1 for each j and
σ_i(s_i1) + σ_i(s_i2) + · · · + σ_i(s_ik) = 1.
There are many interpretations for mixed strategies, from deliberate randomization (as
in coin tossing) to heterogeneity of strategies in the population. In all cases, however,
they serve as a device to represent the uncertainty the other players face regarding the
strategy played by player i. Throughout the course, σ_i is interpreted as the other players'
beliefs about the strategy that player i plays.
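As a small sketch, a mixed strategy can be represented as a probability distribution over pure strategies, and the expected payoff of a pure reply against it computed directly. (The payoff numbers are the matching-pennies payoffs used above; the function names are illustrative.)

```python
# A mixed strategy as a probability distribution over pure strategies.

def is_mixed_strategy(sigma, tol=1e-9):
    # Valid iff every probability lies in [0, 1] and they sum to 1.
    ok_range = all(0.0 <= p <= 1.0 for p in sigma.values())
    return ok_range and abs(sum(sigma.values()) - 1.0) < tol

# Player 1's payoffs in matching pennies; Player 2 mixes half-half.
u1 = {("H", "H"): -1, ("H", "T"): 1, ("T", "H"): 1, ("T", "T"): -1}
sigma2 = {"H": 0.5, "T": 0.5}

def expected_u1(s1, sigma2):
    # Expected payoff of the pure strategy s1 against the belief sigma2.
    return sum(p * u1[(s1, s2)] for s2, p in sigma2.items())

assert is_mixed_strategy(sigma2)
print(expected_u1("H", sigma2), expected_u1("T", sigma2))  # both 0.0
```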
3.4 Exercises with Solutions
Solution: Player 1 has two information sets with two actions in each. Since a strategy
is a function that maps each information set to an available move, he has four strategies,
each assigning one move to the first information set and one move to the last information
set. On the other hand, Player 2 has only two strategies. Filling in the payoffs from
the tree, one obtains the following normal-form representation:

1\2
1, −5 5, 2
3, 3 5, 2
4, 4 4, 4
4, 4 4, 4
2. [Midterm 1, 2001] Find the normal-form representation of the game in Figure 3.9.
Figure 3.9:
Solution:

1\2
2, 1 2, 1 2, 1
2, 1 2, 1 2, 1
2, 1 2, 1 2, 1
2, 1 2, 1 2, 1
1, 2 3, 1 1, 3
1, 2 3, 1 3, 1
1, 2 1, 3 1, 3
1, 2 1, 3 3, 1
3. [Make up for Midterm 1, 2007] Write the game in Figure 3.10 in normal form.
Solution: The important point in this exercise is that Player 2 has to play the same
move at every node in a given information set. For example, she cannot play one move
on the left node and another on the right node of her second information set. Hence,
she has four strategies, one for each pair of moves at her two information sets.
3, 3 3, 3 0, 0 0, 0
3, 3 3, 3 0, 0 0, 0
0, 0 0, 0 3, 3 3, 3
0, 0 0, 0 3, 3 3, 3
2, 2 2, 2 1, −1 −1, 3
2, 2 2, 2 −1, 1 1, −1
Figure 3.10:
4. [Make up for Midterm 1, 2007] Write the following game in normal form, where
the first entry is the payoff of the student and the second entry is the payoff of the Prof.
[game tree: Nature chooses healthy or sick with probability .5 each; the student, knowing his health, chooses the regular exam or the make-up; after a make-up request, the Prof chooses between the same exam and a new one]
Figure 3.11:
Solution: Write the strategies of the student as RR, RM, MR, and MM, where RM
means Regular when Healthy and Make up when Sick, MR means Make up
when Healthy and Regular when Sick, etc. The normal-form game is as follows:
3.5 Exercises
1. [Midterm 1, 2010] Write the game in Figure 3.13 in normal form.
Figure 3.13: [game tree: Player 1 chooses A or D; then Player 2 chooses α or δ; then Player 1 chooses a or d; one of the terminal payoffs is (1, −5)]
Figure 3.14:
Figure 3.15:
Figure 3.16:
Dominance
The previous lectures focused on how to formally describe a strategic situation. We now
start analyzing strategic situations in order to find which outcomes are more reasonable
and likely to realize. In order to do that, we consider certain sets of assumptions about
the players’ beliefs and discover their implications on what they would play. Such analy-
ses will lead to solution concepts, which yield a set of strategy profiles.1 These are the
strategy profiles deemed to be possible by the solution concept. This lecture is devoted
to two solution concepts: dominant strategy equilibrium and rationalizability. These
solution concepts are based on the idea that a rational player does not play a strategy
that is dominated by another strategy.
1. A strategy profile is a list of strategies, prescribing a strategy for each player.
U_T = 2p − (1 − p) = 3p − 1,
U_M = 0,
U_B = −p + 2(1 − p) = 2 − 3p.

[Figure: the expected payoffs U_T, U_M, and U_B of Player 1's strategies plotted against the probability p that the opponent plays his first strategy]
Towards describing this idea more generally and formally, let us use the notation s_−i
to mean the list of strategies played by all the players other than i, i.e.,
s_−i = (s_1, . . . , s_{i−1}, s_{i+1}, . . . , s_n).
Definition 4.1 A strategy s∗_i strictly dominates a strategy s_i if and only if
u_i(s∗_i, s_−i) > u_i(s_i, s_−i) for all s_−i ∈ S_−i.
That is, no matter what the other players play, playing s∗_i is strictly better than
playing s_i for player i. In that case, if i is rational, he would never play the strictly
dominated strategy s_i. That is, there is no belief under which he would play s_i, for s∗_i
would always yield a higher expected payoff than s_i, no matter what player i believes
about the other players.2
A mixed strategy σ_i dominates a strategy s_i in a similar way: σ_i strictly dominates s_i
if and only if
σ_i(s_i1) u_i(s_i1, s_−i) + σ_i(s_i2) u_i(s_i2, s_−i) + · · · + σ_i(s_ik) u_i(s_ik, s_−i) > u_i(s_i, s_−i) for all s_−i ∈ S_−i.
Notice that neither of the pure strategies T, M, and B dominates any strategy.
Nevertheless, M is dominated by the mixed strategy σ_1 that puts probability 1/2
on each of T and B. For each p, the payoff from σ_1 is
U_σ1 = (1/2)(3p − 1) + (1/2)(2 − 3p) = 1/2,
which is larger than 0, the payoff from M. Recall that M is not a best response to any p.
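This domination can be verified numerically. Since expected payoffs are linear in the belief p, it is enough to check the two pure strategies of the opponent (the columns). A sketch, using the payoffs reconstructed above:

```python
# Player 1's payoffs (T, M, B) against the opponent's two pure strategies.
u1 = {"T": (2, -1), "M": (0, 0), "B": (-1, 2)}

def mixed_payoff(weights, col):
    # Expected payoff of a mixture of rows against pure column `col`.
    return sum(w * u1[s][col] for s, w in weights.items())

sigma1 = {"T": 0.5, "B": 0.5}
# sigma1 strictly dominates M: strictly better against every pure column,
# and hence (by linearity) against every belief p.
dominates = all(mixed_payoff(sigma1, col) > u1["M"][col] for col in (0, 1))
print(dominates)  # True: 1/2 > 0 in both columns
```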
This is indeed a general result. Towards stating the result, I introduce a couple of
basic concepts. Write
S_−i = ∏_{j≠i} S_j
for the set of the other players' strategy profiles, and define a belief of player i as a
probability distribution μ_−i on S_−i.
Definition 4.2 For any player i, a strategy s∗_i is a best response to a belief μ_−i if and only if
the expected payoff of s∗_i under μ_−i is at least as high as that of any other strategy; that is,
E_{μ_−i}[u_i(s∗_i, s_−i)] ≥ E_{μ_−i}[u_i(s_i, s_−i)] for all s_i ∈ S_i.
The concept of a best response is one of the main concepts in game theory, used
throughout the course. It is important to understand the definition well and to be able to
compute the best response in relatively simple games, such as those covered in this class. A
rational player can play a strategy under a belief only if it is a best response to that
belief.
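Computing a best response to a belief can be sketched directly from the definition, again with the illustrative T/M/B payoffs:

```python
# Best response to a belief: maximize expected payoff against a probability
# distribution over the opponent's strategies (illustrative payoffs).
u1 = {"T": (2, -1), "M": (0, 0), "B": (-1, 2)}

def best_responses(p):
    """p = probability that the opponent plays the first column."""
    exp = {s: p * a + (1 - p) * b for s, (a, b) in u1.items()}
    best = max(exp.values())
    return [s for s, v in exp.items() if v == best]

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, best_responses(p))
# M is never a best response, for any p -- in line with Theorem 4.1 below.
```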
Theorem 4.1 A strategy s_i is a best response to some belief μ_−i if and only if s_i is not
strictly dominated.3 Therefore, playing a strategy s_i is never rational if and only if s_i is strictly dominated
by a (mixed or pure) strategy.
To sum up: if one assumes that players are rational (and that the game is as
described), then one can conclude that no player plays a strategy that is strictly dominated
(by some mixed or pure strategy), and this is all one can conclude.
Although in general there are few strictly dominated strategies, and thus one can conclude
little from the assumption that players are rational, there are interesting
games in which this weak assumption can lead to counterintuitive conclusions. For
example, consider the well-known Prisoners' Dilemma game, introduced in Chapter 1:
1\2 Cooperate Defect
Cooperate 5, 5 0, 6
Defect 6, 0 1, 1
Clearly, Cooperate is strictly dominated by Defect, and hence we expect each player to
play Defect, assuming that the game is as described and players are rational. Some found
the conclusion counterintuitive because if both players play Cooperate, the outcome
would be much better for both players.
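The claim that Cooperate is strictly dominated by Defect amounts to a column-by-column comparison, which can be checked mechanically:

```python
# The Prisoners' Dilemma above; (row, col) -> (payoff of 1, payoff of 2).
u = {
    ("C", "C"): (5, 5), ("C", "D"): (0, 6),
    ("D", "C"): (6, 0), ("D", "D"): (1, 1),
}

def strictly_dominates(a, b):
    """Does row strategy a strictly dominate row strategy b for Player 1?"""
    return all(u[(a, c)][0] > u[(b, c)][0] for c in ("C", "D"))

print(strictly_dominates("D", "C"))  # True: 6 > 5 and 1 > 0
```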
3. If you like mathematical challenges, try to prove this statement.
4.2 Dominant-Strategy Equilibrium
Example:
In this game, Player 1 (the firm) has a strictly dominant strategy: "hire." Player 2 has
only a weakly dominant strategy: "shirk." If the players are rational, and in addition Player 2 is
cautious, then Player 1 hires and Player 2 shirks.
When every player has a dominant strategy, one can make a strong prediction about
the outcome. This case yields the first solution concept in the course.
Definition 4.6 A strategy profile s∗ = (s∗_1, s∗_2, . . . , s∗_n) is a dominant strategy equilibrium
if and only if, for each player i, s∗_i is a weakly dominant strategy.
1\2 Cooperate Defect
Cooperate 5, 5 0, 6
Defect 6, 0 1, 1
Defect is a strictly dominant strategy for both players, therefore (Defect, Defect) is a
dominant strategy equilibrium. Note that dominant strategy equilibrium only requires
weak dominance. For example, (hire, shirk) is a dominant strategy equilibrium in game
(4.2).
When it exists, the dominant strategy equilibrium has an obvious attraction. In
that case, rational cautious players will play the dominant strategy equilibrium. Unfor-
tunately, it does not exist in general. For example, consider the Battle of the Sexes
game:
1\2 opera football
opera 3, 1 0, 0
football 0, 0 1, 3
Clearly, no player has a dominant strategy: opera is a strict best reply to opera and
football is a strict best reply to football. Therefore, there is no dominant strategy
equilibrium.
4.3 Example: Second-Price Auction
The winner gets the object and pays the second-highest bid (the highest bid b_j with
j ≠ i∗). (If two or more buyers submit the highest bid, one of them is selected by a coin toss.)
Formally, the game is defined by the player set N = {1, 2}, the strategies b_i, and the
payoffs

u_i(b_1, b_2) =
  v_i − b_j       if b_i > b_j
  (v_i − b_j)/2   if b_i = b_j
  0               if b_i < b_j

where j ≠ i.
In this game, bidding his true valuation v_i is a dominant strategy for each player
i. To see this, consider the strategy of bidding some other value b′_i ≠ v_i. We want to
show that b′_i is weakly dominated by bidding v_i. Consider the case b′_i < v_i. If the other
player bids some b_j < b′_i, player i would get v_i − b_j under both strategies b′_i and v_i. If
the other player bids some b_j ≥ v_i, player i would get 0 under both strategies b′_i and v_i.
But if b_j = b′_i, bidding v_i yields v_i − b_j > 0, while b′_i yields only (v_i − b_j)/2. Likewise, if
b′_i < b_j < v_i, bidding v_i yields v_i − b_j > 0, while b′_i yields only 0. Therefore, bidding v_i
weakly dominates b′_i. The case b′_i > v_i is similar, except that when v_i < b_j < b′_i, bidding v_i
yields 0, while b′_i yields the negative payoff v_i − b_j < 0. Therefore, bidding v_i is a dominant
strategy. Since this is true for each player i, (v_1, v_2) is a dominant-strategy equilibrium.
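The weak-dominance argument can be checked numerically on a grid of bids. A sketch under the payoff rule above (the value v = 10 and the grid are arbitrary choices for illustration):

```python
# Payoff of a bidder with value v who bids my_bid against other_bid,
# under the second-price rule with coin-toss ties.
def u(v, my_bid, other_bid):
    if my_bid > other_bid:
        return v - other_bid
    if my_bid == other_bid:
        return (v - other_bid) / 2  # coin toss over the tie
    return 0

v = 10
grid = range(0, 21)  # a coarse grid of possible opposing bids
for b_alt in grid:   # any alternative bid b' on the grid
    # Truthful bidding does at least as well against every opposing bid.
    assert all(u(v, v, b2) >= u(v, b_alt, b2) for b2 in grid), b_alt
print("truthful bidding weakly dominates every grid bid")
```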
If p < min{b, b∗}, then under both bids b and b∗, the player wins the object and pays the price p,
enjoying the payoff level of v − p. If p > max{b, b∗}, then under both bids b and b∗, the player
loses the object and gets 0. Consider the case b < p < b∗. In that case, under
b∗, the player wins and gets v − p. Under b, he gets 0. But since b∗ = v and p < b∗, bid b∗
yields a higher payoff: v − p > 0. The cases of ties and b > p > b∗ are dealt with similarly.
3. For the following strategy spaces and utility functions, check whether a best response exists for
Player 1, and compute it when it exists.
Note: In general, a best response exists if S_1 is compact (i.e., closed and bounded, for
all practical purposes) and u_1 is continuous in s_1. In particular, it exists whenever
S_1 is finite. Fortunately, it may exist even if the above conditions fail.
1 = 2 2
One does not need to check the second-order condition because u_1 is concave.
(d) First-Price Auction: S_1 = S_2 = [0, ∞);

u_1(b_1, b_2) =
  v − b_1       if b_1 > b_2
  (v − b_1)/2   if b_1 = b_2
  0             otherwise
where v > 0.
Solution: Any b_1 ≤ v is a best response when b_2 = v; any b_1 < b_2 is a best
response when b_2 > v; and nothing is a best response when b_2 < v. Continuity
fails.
4.5 Exercises
1. Show that there cannot be a dominant strategy in mixed strategies.
Find the dominant strategy equilibrium; show that the strategies that you identify
are indeed dominant.
3. [Homework 1, 2006] There are n players and an object. The game is as follows:
• First, for each player i, Nature chooses a number v_i from {0, 1, 2, . . . , 99},
where each number is equally likely, and reveals v_i to player i and nobody
else. (v_i is the value of the object for player i.)
• Then the players simultaneously bid; the player who bids the highest number wins the object and pays p, where
p is the highest number bid by a player other than the winner. (If two or more
players bid the highest bid, the winner is determined by a coin toss among
the highest bidders.) The payoff of player i is (v_i − p) if he is the winner and
0 otherwise.
(a) Write this game in normal form. That is, determine the set of strategies for
each player, and the payoff of each player for each strategy profile.
(b) Show that there is a dominant strategy equilibrium. State the equilibrium.
4. [Homework 1, 2010] Alice, Bob, and Caroline are moving into a 3-bedroom apart-
ment (with rooms, named 1, 2, and 3). In this problem we want to help them to
select their rooms. Each roommate has a strict preference over the rooms. The
roommates simultaneously submit their preferences in an envelope, and then the
rooms are allocated according to one of the following mechanisms. For each mech-
anism, check whether submitting the true preferences is a dominant strategy for
each roommate.
Mechanism 1 First, Alice gets her top ranked room. Then, Bob gets his top
ranked room among the remaining two rooms. Finally, Caroline gets the
remaining room.
Mechanism 2 Alice, Bob, and Caroline have priority scores 0.3, 0, and −0.3,
respectively; the priority score of a roommate i is denoted by π_i. For each
roommate i and room j, let the rank r_ij be 3 if i ranks j highest, 2 if i ranks j
second highest, and 1 if i ranks j lowest. Write s_ij = π_i + r_ij for the aggregate
score. In the mechanism, Room 1 is given to the roommate with the highest
aggregate score s_i1. Then, among the remaining two, the one with the highest
aggregate score s_i2 gets Room 2, and the other gets Room 3.
Chapter 5
Rationalizability
Nevertheless, in the definition of a game, one assumes much more than the rationality of the
players. One further assumes that it is common knowledge that the players are rational.
That is, everybody is rational; everybody knows that everybody is rational; everybody
knows that everybody knows that everybody is rational ... up to infinity. If some of
these assumptions fail, then one would need to consider a different game, the game
that reflects the failure of those assumptions. This lecture explores the implications
of the common knowledge of rationality. These implications are precisely captured by
a solution concept called rationalizability, which is equivalent to iterative elimination
of strictly dominated strategies. In this way, rationalizability precisely captures the
implications of the assumptions embedded in the definition of the game.
General fact: If (1) every player is rational, (2) every player knows that every
player is rational, (3) every player knows that every player knows that every player is
rational, . . . , and (k) every player knows that every player knows that . . . every player is
rational, then every player must play a strategy that survives k-times iterated elimination
of strictly dominated strategies.
Caution: Two points are crucial for the elimination procedure:
1. One must eliminate only the strictly dominated strategies. One cannot eliminate
a strategy if it is weakly dominated but not strictly dominated. For example, in
the game

1\2 L R
T 1, 1 0, 0
B 0, 0 0, 0

(T, L) is a dominant strategy equilibrium, but no strategy is eliminated because T
does not strictly dominate B and L does not strictly dominate R.
2. One must eliminate the strategies that are strictly dominated by mixed strategies
(but not necessarily by pure strategies). For example, in the game in (4.1), M
must be eliminated although neither T nor B dominates M.
When there are only finitely many strategies, this elimination process must stop at
some round. That is, at some round there will be no dominated strategy to eliminate. In that
case, iterating the elimination further would not have any effect.
Definition 5.1 The elimination process that keeps iteratively eliminating all strictly
dominated strategies until there is no strictly dominated strategy is called Iterated Elim-
ination of Strictly Dominated Strategies; one eliminates indefinitely if the process does
not stop. A strategy is said to be rationalizable if and only if it survives iterated elimi-
nation of strictly dominated strategies.
As depicted in Figure 5.1, the procedure is as follows. Eliminate all the strictly
dominated strategies. In the resulting smaller game, some of the strategies may become
strictly dominated. Check for those strategies. If there is one, apply the procedure one
more time to the smaller game. This continues until there is no strictly dominated strat-
egy; the elimination continues indefinitely if the process does not stop. The remaining
strategies are called rationalizable. When the game is finite, the order of eliminations
does not matter for the resulting outcome. For example, even if one does not eliminate
a strictly dominated strategy at a given round, the eventual outcome is not affected by
such an omission. In that case, it is also okay to eliminate a strategy whenever it is
deemed to be strictly dominated.
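The elimination procedure can be sketched as follows for a two-player game. This minimal version checks domination by pure strategies only; as the caution above notes, a full implementation would also check domination by mixed strategies.

```python
# Iterated elimination of strictly dominated strategies (pure-strategy
# domination only) for a two-player game in normal form.

def iesds(rows, cols, u1, u2):
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in rows[:]:
            # r is eliminated if some other row is strictly better in
            # every remaining column.
            if any(all(u1[(r2, c)] > u1[(r, c)] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r); changed = True
        for c in cols[:]:
            if any(all(u2[(r, c2)] > u2[(r, c)] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c); changed = True
    return rows, cols

# Prisoners' Dilemma: Defect strictly dominates Cooperate for both players.
u1 = {("C", "C"): 5, ("C", "D"): 0, ("D", "C"): 6, ("D", "D"): 1}
u2 = {("C", "C"): 5, ("C", "D"): 6, ("D", "C"): 0, ("D", "D"): 1}
print(iesds(["C", "D"], ["C", "D"], u1, u2))  # (['D'], ['D'])
```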
Theorem 5.1 If it is common knowledge that every player is rational (and the game
is as described), then every player must play a rationalizable strategy. Moreover, any
rationalizable strategy is consistent with common knowledge of rationality.
A general problem with rationalizability is that there are usually too many rationalizable
strategies; the elimination process usually stops too early. In that case one cannot
make much of a prediction based on such analysis. For example, in the Matching Pennies
game

1\2 Head Tail
Head −1, 1 1, −1
Tail 1, −1 −1, 1

every strategy is rationalizable, and we cannot say what the players will do.
5.2 Example: Beauty Contest
Therefore, at the end of the first round, the set of surviving strategies is [0, x̄_1].
Now, suppose that at the end of round k, the set of surviving strategies is [0, x̄_k] for
some number x̄_k. By repeating the same analysis above with x̄_k instead of 100, we can
conclude that at the end of round k + 1, the set of surviving strategies is [0, x̄_{k+1}], where
x̄_{k+1} = [2(n − 1)/(3n − 2)] x̄_k.
The solution to this equation with x̄_0 = 100 is
x̄_k = [2(n − 1)/(3n − 2)]^k · 100.
Therefore, for each k, at the end of round k, a strategy x survives if and only if
0 ≤ x ≤ [2(n − 1)/(3n − 2)]^k · 100.
Since
lim_{k→∞} [2(n − 1)/(3n − 2)]^k · 100 = 0,
the only rationalizable strategy is x = 0.
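The iteration of the bound x̄_{k+1} = [2(n − 1)/(3n − 2)] x̄_k can be traced numerically (the group size n = 4 below is an arbitrary illustration):

```python
# Upper bounds on surviving strategies in the beauty contest, starting
# from xbar_0 = 100 and multiplying by the ratio 2(n - 1)/(3n - 2).
def bounds(n, rounds):
    ratio = 2 * (n - 1) / (3 * n - 2)
    x = 100.0
    out = []
    for _ in range(rounds):
        x *= ratio
        out.append(x)
    return out

print([round(x, 2) for x in bounds(4, 6)])
# The bound shrinks geometrically toward 0, so only x = 0 is rationalizable.
```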
Notice that the speed at which $\bar{x}^k$ goes to zero determines how fast we eliminate
the strategies. If the elimination is slow (e.g. when $2(n-1)/(3n-2)$ is large), then
many strategies are eliminated only at very high iterations. In that case, predictions based on
rationalizability rely heavily on strong assumptions about rationality, i.e., everybody
knows that everybody knows that ... everybody is rational. For example, if $n$ is
large or the ratio $2/3$ is replaced by a number close to 1, the elimination is slow and the
predictions of rationalizability are less reliable. On the other hand, if $n$ is small or the
ratio $2/3$ is replaced by a small number, the elimination is fast and the predictions of
rationalizability are more reliable. In particular, the predictions of rationalizability for
this game are more robust in a small group than in a larger group.
It is important that one analyzes the game that describes the actual situation. For
example, when the above game is played in classroom, there are often some students who
would rather move the mean in an unexpected direction and upset the other students
than get the prize of being closest to the two thirds of the average. Those students
bid 100 instead. In such experiments, the resulting outcome is often different from the
rationalizable solution of 0 for the above game, which does not take into account the
existence of such students. In fact, some students bid 0 the first time they play
the game and switch to relatively higher bids in the follow-up games. To analyze that
situation, consider the following variation.
For example, in the beauty contest game suppose that there are $m$ mischievous
students with utility function
$$u_i(x_1, \dots, x_n) = \left(x_i - \frac{x_1 + \cdots + x_n}{n}\right)^2.$$
The remaining $n - m$ students are as before. The best response of a mischievous student $i$
is 0 if the expected value of $\sum_{j \ne i} x_j/(n-1)$ is greater than 50, and it is 100 otherwise.
Hence, at the first round, all strategies other than 0 and 100 are eliminated for the
mischievous students.
For each round $k$ there are $\underline{x}^k$, $\bar{x}^k$ such that a strategy $x$ survives $k$ rounds of iterated elimination for
a regular student iff $\underline{x}^k \le x \le \bar{x}^k$. Note that for $k = 0$, $\underline{x}^0 = 0$ and $\bar{x}^0 = 100$. In the
earlier rounds, both 0 and 100 are available for mischievous students, and in that case
the lower bound remains $\underline{x}^k = 0$ because 0 is a best response to 0 for regular students.
To compute the upper bound, fix a regular student $i$. The expected value of $\sum_{j \ne i} x_j$
can take any value in $[0,\, 100m + (n-m-1)\bar{x}^{k-1}]$, where $100m + (n-m-1)\bar{x}^{k-1}$ is
obtained by taking the highest possible bid for each remaining student: the $m$ mischievous
students playing 100 and the $n-m-1$ regular students playing $\bar{x}^{k-1}$. The best reply to
this value gives us the upper bound:
$$\bar{x}^k = \frac{2}{3n-2}\left[100m + (n-m-1)\,\bar{x}^{k-1}\right] \tag{5.3}$$
which is obtained by substituting $100m + (n-m-1)\bar{x}^{k-1}$ for $\sum_{j \ne i} x_j$ in (5.2). As above,
all $x > \bar{x}^k$ are eliminated. Note that as $k \to \infty$, $\bar{x}^k$ converges to
$$\bar{x}^\infty = \frac{\frac{2}{3n-2}\cdot 100m}{1 - \frac{2(n-m-1)}{3n-2}} = \frac{200m}{n + 2m}. \tag{5.4}$$
(One can obtain $\bar{x}^\infty$ by substituting $\bar{x}^\infty$ for both $\bar{x}^k$ and $\bar{x}^{k-1}$ in (5.3).)
The lower bound depends on whether 0 remains a best response for a mischievous
student. This is the case when
$$\frac{\bar{x}^k(n-m) + 100(m-1)}{n-1} \ge 50.$$
If $4m \ge n$, then $\bar{x}^\infty$ satisfies the above inequality. In that case, all $\bar{x}^k$ satisfy the
inequality, and neither 0 nor 100 is eliminated for the mischievous students. In that case,
the rationalizable strategies are $\{0, 100\}$ for mischievous students and $[0,\, 200m/(n+2m)]$
for the regular students. If $4m < n$, then $\bar{x}^\infty$ fails the above inequality. Then, there
exists $k^*$ such that $\bar{x}^k$ fails the inequality for every $k \ge k^*$ and $\bar{x}^k$ satisfies the inequality
for all $k < k^*$. In that case, at round $k^* + 1$, 0 is eliminated for the mischievous students.
Consequently, at round $k = k^* + 2$ and after, for any regular student $i$, the lowest value
for $\sum_{j \ne i} x_j$ is $100m + (n-m-1)\,\underline{x}^{k-1}$. As in the above analysis, the best response to
this yields the lower bound at $k$:
$$\underline{x}^k = \frac{2}{3n-2}\left[100m + (n-m-1)\,\underline{x}^{k-1}\right]. \tag{5.5}$$
Of course, as $k \to \infty$, $\underline{x}^k$ converges to
$$\underline{x}^\infty = \bar{x}^\infty = \frac{200m}{n+2m}.$$
In that case, the unique rationalizable strategy is $200m/(n+2m)$ for regular students
and 100 for the mischievous students. The rationalizable strategy is plotted in Figure
2. Note that the mischievous students have a large impact. For example, when 10% of
the students are mischievous, the rationalizable strategy for regular students is $20/1.2 \cong 16.67$,
and the average rationalizable bid is 25.
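The recursion (5.3) (the lower-bound recursion (5.5) has the same form) and its limit can be checked numerically; $n = 100$ and $m = 10$ below give the 10%-mischievous illustration:

```python
# Iterating the upper bound (5.3) for the beauty contest with m
# mischievous students out of n; both bounds share the fixed point
# 200m/(n+2m). n = 100, m = 10 (10% mischievous) are illustrative.

def limit_bid(n, m):
    return 200 * m / (n + 2 * m)

def iterate_bound(n, m, rounds, start=100.0):
    x = start
    for _ in range(rounds):
        x = 2 / (3 * n - 2) * (100 * m + (n - m - 1) * x)
    return x

n, m = 100, 10
reg = limit_bid(n, m)                  # rationalizable bid of a regular student
avg = ((n - m) * reg + m * 100) / n    # average rationalizable bid
```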
3 1 0 2
0 0 3 1
One can easily see that the strategy and then are eliminated next, yielding
( ) as the only rationalizable strategy profile. Games with a unique rationalizable
strategy profile are called dominance-solvable; this game is one of them.
2. [Midterm 1, 2011] Compute the set of all rationalizable strategies in the following
game.
0,3 0,1 3,0 0,1
3,0 0,2 2,4 1,1
2,4 3,2 1,2 10,1
0,5 5,3 1,2 0,10
3. [Midterm 1, 2001] Find all the pure strategies that are consistent with the common
knowledge of rationality in the following game. (State the rationality/knowledge
assumptions corresponding to each operation.)
1\2
1 1 0 4 2 2
2 4 2 1 1 2
1 0 0 1 0 2
Solution: Clearly, one needs to compute rationalizable strategies and state the
underlying rationalizability assumptions along the way.
1\2
1 1 0 4 2 2
2 4 2 1 1 2
1\2
2 4 2 1
Round 4 Since Player 2 knows that Player 1 is rational, and that Player
1 knows that Player 2 is rational, and that Player 1 knows that
Player 2 knows that Player 1 is rational, he knows that Player 1 will
not play or . Given this, strictly dominates . Since Player 2 is
rational, he will not play , either. He will play .
1\2
2 4
Thus, the only strategies that are consistent with the common knowledge of
rationality are for Player 1 and for Player 2.
4. [Midterm 1, 2011] Compute the set of all rationalizable strategies in the following
game. Simultaneously, Alice and Bob select arrival times $a$ and $b$, respectively,
for their meeting, where $a, b \in \{0, 1, 2, \dots, 100\}$. The payoffs of Alice and Bob
are
$$u_A(a, b) = \begin{cases} 2b - (a-b)^2 & \text{if } a < b \\ b - (a-b)^2 & \text{otherwise,} \end{cases} \qquad
u_B(a, b) = \begin{cases} 2a - (a-b)^2 & \text{if } b < a \\ a - (a-b)^2 & \text{otherwise.} \end{cases}$$
Solution: If the set of remaining strategies from the earlier rounds is $\{0, \dots, t_{\max}\}$
for some $t_{\max} > 0$, then $t_{\max}$ is strictly dominated by $t_{\max} - 1$ and is eliminated.
(Proof: comparing the two cases against each remaining strategy of the other player
shows that $t_{\max} - 1$ yields a strictly higher payoff than $t_{\max}$ in every case,
showing that $t_{\max} - 1$ strictly dominates $t_{\max}$ for Alice. The same argument applies
for Bob.)
Therefore, we eliminate 100 in round 1, 99 in round 2, . . . , and 1 in round 100.
The set of rationalizable strategies is {0} for both players.
1\2 L R
T 1, 1 1, 0
B 0, 1 0, 10000
With tremble probability $\epsilon$, the induced payoff matrix is
1\2 L R
T $1-\epsilon,\ 1-\epsilon+10000\epsilon^2$ $\quad 1-\epsilon,\ \epsilon+10000\epsilon(1-\epsilon)$
B $\epsilon,\ 1-\epsilon+10000\epsilon(1-\epsilon)$ $\quad \epsilon,\ 10000(1-\epsilon)^2+\epsilon$
5.4. EXERCISES 77
To see how the payoffs are computed, consider (T, L). If this strategy profile
is intended, the outcome is (T, L) with probability $(1-\epsilon)^2$ [nobody trembles],
(T, R) with probability $(1-\epsilon)\epsilon$ [only Player 2 trembles], (B, L) with
probability $\epsilon(1-\epsilon)$ [only Player 1 trembles], and (B, R) with probability $\epsilon^2$
[everybody trembles]. We mix the payoff vectors with the above probabilities
to obtain the table. One can use the structure of payoffs to shorten the
calculations. For example, Player 1 gets 1 if he does not tremble and gets 0
otherwise, yielding $1 - \epsilon$.
To compute the rationalizable strategies, note that B is still dominated by T
and is eliminated in the first round. In the second round, we cannot eliminate
R, however. Indeed, the payoffs from L and R are approximately 1 and 10,
respectively. Hence, L is eliminated in the second round, yielding (T, R) as
the only rationalizable strategy profile.
This example shows that rationalizability may be sensitive to the possibility
of trembling, depending on the relative magnitudes of the trembling probabilities
and the payoff differences.
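The trembling computation can be reproduced mechanically. The payoff numbers below are the ones from the table as reconstructed above (the labels T, B, L, R are generic names), with an illustrative tremble probability of 0.001:

```python
# Computing the "trembling" payoffs: each player plays the intended
# strategy with prob 1-eps and the other strategy with prob eps,
# independently. Payoffs are the reconstructed 2x2 game above.

U1 = {('T','L'): 1, ('T','R'): 1, ('B','L'): 0, ('B','R'): 0}
U2 = {('T','L'): 1, ('T','R'): 0, ('B','L'): 1, ('B','R'): 10000}

def perturbed(U, intended, eps):
    """Expected payoff in U when `intended` is played but both tremble."""
    r, c = intended
    other = {'T': 'B', 'B': 'T', 'L': 'R', 'R': 'L'}
    probs = {(r, c): (1 - eps) ** 2, (r, other[c]): (1 - eps) * eps,
             (other[r], c): eps * (1 - eps), (other[r], other[c]): eps ** 2}
    return sum(pr * U[cell] for cell, pr in probs.items())

eps = 0.001
# Player 2's expected payoffs when Player 1 intends T:
payoff_L = perturbed(U2, ('T', 'L'), eps)   # about 1
payoff_R = perturbed(U2, ('T', 'R'), eps)   # about 10
```

With these numbers, R gives Player 2 roughly ten times the payoff of L once B is eliminated, which is why L goes in the second round.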
5.4 Exercises
(0, −1) (4, 4) (0, 0) (2, 0)
(0, 3) (0, 0) (4, 4) (1, 0)
(5, 2) (2, 0) (1, 3) (1, 3)
(4, 4) (1, 0) (0, 1) (0, 5)
(a) Iteratively eliminate all strictly dominated strategies; state the assumptions
necessary for each elimination.
4. [Homework 1, 2004] Consider the game depicted in Figure 5.3 in extensive form
(where the payoff of player 1 is written on top, and the payoff of 2 is on the
bottom).
Figure 5.3: [extensive-form game tree omitted]
of these numbers. The students who submit the number that is closest to ̄3 will
share the total payoff of 100, while the other students get 0. Everything described
above is common knowledge. (Bonus: would the answer change if the students did
not know $n$, but it were common knowledge that $n \ge 2$?)
(c) Answer part (b) assuming that there are ∈ (0 ( − 1) 2) mischievous
students with payoff ( − (1 ))2 .
8. [Midterm 1, 2005] Compute the set of all rationalizable strategies in the game in
Figure 3.14. (See Exercise 2 in Section 3.5.)
Figure 5.4: [extensive-form game tree omitted]
10. [Homework 1, 2002] Consider the game depicted in Figure 5.5 in extensive form.
Figure 5.5: [extensive-form game tree omitted]
Figure 5.6: [extensive-form game tree omitted]
weakly dominated in the remaining game; then, eliminate all the strategies
of player 1 that are weakly dominated in the remaining game, and so on?
11. [Homework 1, 2006] Consider the game depicted in Figure 5.6 in extensive form.
12. Consider any collection of sets $B_1 \subseteq S_1, \dots, B_n \subseteq S_n$ such that there exists no
$s_i \in B_i$ that is strictly dominated when the others' strategies are restricted to be
in $B_{-i}$. That is, for every $s_i \in B_i$ and every mixed strategy $\sigma_i$ of player $i$, there
exists a strategy profile $s_{-i}$ of the other players such that $s_j \in B_j$ for every $j \ne i$ and
$$u_i(s_i, s_{-i}) \ge \sum_{s_i' \in S_i} \sigma_i(s_i')\, u_i(s_i', s_{-i}).$$
13. Show that the set of rationalizable strategies satisfies the above property: no
rationalizable strategy is dominated when the others' strategies are restricted to be
rationalizable.
Chapter 6
Nash Equilibrium
84 CHAPTER 6. NASH EQUILIBRIUM
$$u_i(s_i^*, s_{-i}) \ge u_i(s_i, s_{-i}) \quad \forall s_i \in S_i.$$
Recall also that the definition of a best response differs from that of a dominant strategy
by requiring the above inequality only for a specific strategy profile $s_{-i}$ instead of requiring it
for all $s_{-i} \in S_{-i}$. If the inequality were true for all $s_{-i}$, then $s_i^*$ would also be a
dominant strategy, which is a stronger requirement than being a best response against
some strategy $s_{-i}$.
Definition 6.1 A strategy profile $s^* = (s_1^*, \dots, s_n^*)$ is a Nash equilibrium if and only if
$s_i^*$ is a best response to $s_{-i}^* = (s_1^*, \dots, s_{i-1}^*, s_{i+1}^*, \dots, s_n^*)$ for each $i$. That is, for all $i$,
$$u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*) \quad \forall s_i \in S_i.$$
and
$$u_2(\text{opera}, \text{opera}) = 1 > 0 = u_2(\text{opera}, \text{football}).$$
Likewise, (football, football) is also a Nash equilibrium. On the other hand, (opera,
football) is not a Nash equilibrium because Bob would like to go to opera instead.
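Definition 6.1 can be checked mechanically for pure strategy profiles. The sketch below uses the Battle of the Sexes payoffs that appear later in this chapter (assumed here to be 4 and 1 on the diagonal and 0 elsewhere):

```python
# Checking Definition 6.1 in pure strategies: (r, c) is a Nash
# equilibrium iff r is a best row against c and c is a best column
# against r.

def is_nash(U1, U2, r, c):
    best_row = all(U1[r][c] >= U1[r2][c] for r2 in range(len(U1)))
    best_col = all(U2[r][c] >= U2[r][c2] for c2 in range(len(U2[0])))
    return best_row and best_col

# Battle of the Sexes; index 0 = opera, 1 = football.
U1 = [[4, 0], [0, 1]]   # Alice
U2 = [[1, 0], [0, 4]]   # Bob
# (opera, opera) and (football, football) pass; (opera, football) fails.
```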
Proof. Let $s^*$ be a dominant strategy equilibrium. Take any player $i$. Since $s_i^*$ is a
dominant strategy for $i$, for any given $s_{-i}$, $u_i(s_i^*, s_{-i}) \ge u_i(s_i, s_{-i})$ for all $s_i$.
In particular,
$$u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*).$$
1\2 L R
T 1, 1 0, 0
B 0, 0 0, 0
In this game, (T, L) is a dominant strategy equilibrium, but (B, R) is also a Nash equilib-
rium.
This example also illustrates that a Nash equilibrium can be in weakly dominated
strategies. In that case, one can rule out some Nash equilibria by eliminating weakly
dominated strategies. While one may find such equilibria unreasonable and be willing to rule
out such equilibria, the next example shows that all Nash equilibria may need to be in weakly
dominated strategies in some games. (One then ends up ruling out all Nash equilibria.)
Example 6.1 Consider a two-player game in which each player selects a natural num-
ber $x_i \in \mathbb{N} = \{0, 1, 2, \dots\}$, and the payoff of each player is $x_1 x_2$. It is easy to check that
$(0, 0)$ is a Nash equilibrium, and there is no other Nash equilibrium. Nevertheless, all
strategies, including 0, are weakly dominated.
Theorem 6.2 If $s^*$ is a Nash equilibrium, then $s_i^*$ is rationalizable for every player $i$.
Proof. It suffices to show that none of the strategies $s_1^*, s_2^*, \dots, s_n^*$ is eliminated at any
round of the iterated elimination of strictly dominated strategies. Since these strategies
are all available at the beginning of the procedure, it suffices to show that if the strategies
$s_1^*, s_2^*, \dots, s_n^*$ are all available at round $k$, then they remain available at round $k+1$.
Indeed, since $s^*$ is a Nash equilibrium, for each $i$, $s_i^*$ is a best response to $s_{-i}^*$, which
is available at round $k$. Hence, $s_i^*$ is not strictly dominated at round $k$ and remains
available at round $k+1$.
The converse is not true. That is, there can be a rationalizable strategy that is not
played in any Nash equilibrium, as the next example illustrates.
1 −2 −2 1 0 0
−1 2 1 −2 0 0
0 0 0 0 0 0
(This game can be thought of as a matching pennies game with an outside option, which is
represented by the third strategy.) Note that ( ) is the only Nash equilibrium. In contrast, no
6.3. MIXED-STRATEGY NASH EQUILIBRIUM 87
strategy is strictly dominated (check that each strategy is a best response to some strategy
of the other player), and hence all strategies are rationalizable.
The condition for checking whether $\sigma^*$ is a Nash equilibrium is a mouthful.¹ Fortunately, there is a simpler
condition to check: for every $i$, if $\sigma_i^*(s_i) > 0$, then $s_i$ is a best response to $\sigma_{-i}^*$. That is,
$$\sum_{s_{-i}} u_i(s_i, s_{-i})\,\sigma_{-i}^*(s_{-i}) \ge \sum_{s_{-i}} u_i(s_i', s_{-i})\,\sigma_{-i}^*(s_{-i}) \quad \forall s_i \text{ with } \sigma_i^*(s_i) > 0,\ \forall s_i'.$$
Example (Battle of the Sexes) Consider the Battle of the Sexes again.
¹The original condition requires $u_i(\sigma_i^*, \sigma_{-i}^*) \ge u_i(\sigma_i, \sigma_{-i}^*)$ for every mixed strategy $\sigma_i$. It can be simplified because one does not need to check for all mixed
strategies $\sigma_i$; it suffices to check against the pure strategy deviations. That is, $\sigma^*$ is a Nash equilibrium
if and only if
$$\sum_{s} u_i(s)\prod_{j} \sigma_j^*(s_j) \ge \sum_{s_{-i}} u_i(s_i', s_{-i})\prod_{j \ne i} \sigma_j^*(s_j) \quad \forall i,\ \forall s_i' \in S_i.$$
We have already identified two pure strategy equilibria. In addition, there is a mixed
strategy equilibrium. To compute the equilibrium, write $p$ for the probability that Alice
goes to opera; with probability $1 - p$ she goes to the football game. Write also $q$ for the
probability that Bob goes to opera. For Alice, the expected payoff from opera is $4q$ and
the expected payoff from football is $1 - q$, so her expected payoff from mixing with probability $p$ is
$$U(p; q) = p \cdot 4q + (1-p)(1-q).$$
The payoff function $U(p; q)$ is strictly increasing in $p$ when $u(\text{opera}, q) > u(\text{football}, q)$.
This is the case when $4q > 1 - q$, or equivalently when $q > 1/5$. In that case, the unique
best response for Alice is $p = 1$, and she goes to opera for sure. Likewise, when $q < 1/5$,
$u(\text{opera}, q) < u(\text{football}, q)$, and her expected payoff $U(p; q)$ is strictly decreasing
in $p$. In that case, Alice's best response is $p = 0$, i.e., going to the football game for sure.
Finally, when $q = 1/5$, her expected payoff $U(p; q)$ does not depend on $p$, and any
$p \in [0, 1]$ is a best response. In other words, Alice chooses opera if her expected
utility from opera is higher, football if her expected utility from football is higher, and
can choose opera, football, or any randomization between them if she is indifferent
between the two.
Similarly, one can compute that $q = 1$ is the best response if $p > 4/5$; $q = 0$ is the best
response if $p < 4/5$; and any $q$ can be a best response if $p = 4/5$.
The best responses are plotted in Figure 6.1. The Nash equilibria are where these
best responses intersect. There is one at $(0, 0)$, when they both go to football; one at
$(1, 1)$, when they both go to opera; and one at $(4/5, 1/5)$, when Alice goes to
opera with probability $4/5$ and Bob goes to opera with probability $1/5$.
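The indifference conditions can be solved symbolically for any 2x2 game with a fully mixed equilibrium; here it is done in exact arithmetic for the Battle of the Sexes payoffs used above:

```python
# Mixed equilibrium of a 2x2 game from the indifference conditions.
# Payoffs: Battle of the Sexes; index 0 = opera, 1 = football.
from fractions import Fraction as F

U1 = [[F(4), F(0)], [F(0), F(1)]]   # Alice's payoffs
U2 = [[F(1), F(0)], [F(0), F(4)]]   # Bob's payoffs

# q: probability Bob plays column 0, chosen to make Alice indifferent
# between her rows; p: probability Alice plays row 0, chosen to make
# Bob indifferent between his columns.
q = (U1[1][1] - U1[0][1]) / (U1[0][0] - U1[0][1] - U1[1][0] + U1[1][1])
p = (U2[1][1] - U2[1][0]) / (U2[0][0] - U2[1][0] - U2[0][1] + U2[1][1])
# Mixed equilibrium: p = 4/5, q = 1/5, matching the text.
```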
Remark 6.1 The above example illustrates a way to compute the mixed strategy equi-
librium (for 2x2 games). Choose the mixed strategy of Player 1 in order to make Player
2 indifferent between her strategies, and choose the mixed strategy of Player 2 in order
to make Player 1 indifferent. This is a valid technique to compute a mixed strategy equi-
librium, provided that it is known which strategies are played with positive probabilities
in equilibrium. (Note that one must be indifferent between two strategies if he plays both
of them with positive probabilities.)
Exercise 6.1 Show that if $\sigma^*$ is a mixed strategy Nash equilibrium and $\sigma_i^*(s_i) > 0$, then
$s_i$ is rationalizable.
One can use the above fact in searching for a mixed strategy Nash equilibrium.
One can compute the rationalizable strategies first and search for a mixed strategy
equilibrium within the set of rationalizable strategies, which may be smaller than the
original set of strategies.
Games with a unique rationalizable strategy profile are called dominance-solvable.
Exercise 6.2 Show that in a dominance-solvable game, the unique rationalizable strat-
egy is the only Nash equilibrium.
Similarly, in order for Player 2 to play both hawk and dove with positive probabilities
(say $v/c$ and $1 - v/c$, respectively), it must be
that Player 1 plays hawk with probability $v/c$. Therefore, in the mixed-strategy Nash
equilibrium, each player plays hawk with probability $v/c$ and dove with probability
$1 - v/c$.
Now imagine an island where hawks and doves live together. Let there be $H_0$ hawks
and $D_0$ doves at the beginning, where both $H_0$ and $D_0$ are very large. Suppose that each
season the birds are randomly matched, and the number of offspring of a bird is given
by the payoff matrix above. That is, if a dove is matched to a dove as its neighbor, then
it will have $v/2$ offspring, and in the next generation we will have $1 + v/2$ doves in its
family. If a dove is matched with a hawk, then it will have zero offspring and its family
will have only one member, itself, in the next season. If two hawks are matched, then each
will have $(v - c)/2$ offspring, which is negative, reflecting the fact that the number
of hawks from such matches will decrease in the next season. Finally, if a hawk
meets a dove, it will have $v$ offspring, and there will be $1 + v$ hawks in its family in the
6.4. EVOLUTION OF HAWKS AND DOVES 91
next season. We want to know the ratio of hawks to doves on this island millions of
seasons later.
Let $H_t$ and $D_t$ be the number of hawks and doves, respectively, at season $t$. Define
$$p_t = \frac{H_t}{H_t + D_t} \quad\text{and}\quad d_t = \frac{D_t}{H_t + D_t}$$
as the ratios of hawks and doves at $t$. In accordance with the strong law of large numbers,
assume that the number of hawks that are matched to hawks is $p_t H_t$, and the number of
hawks that are matched to doves is $d_t H_t$.² Each hawk in the first group multiplies to
$1 + (v-c)/2$, and each hawk in the second group multiplies to $1 + v$. The number
of hawks in the next season will then be
$$H_{t+1} = H_t\left[1 + \tfrac{v-c}{2}\,p_t + v\,d_t\right]. \tag{6.2}$$
The number of doves that are matched to hawks is $p_t D_t$, and the number of doves that are
matched to doves is $d_t D_t$. Each dove in the first and the second group multiplies to 1
and $1 + v/2$, respectively. Hence, the number of doves in the next season will be
$$D_{t+1} = D_t\left[1 + \tfrac{v}{2}\,d_t\right]. \tag{6.3}$$
$$p_t = 0 \quad\text{and}\quad d_t = 1$$
is a stationary state, which can be reached if we start with all doves. In that case, by
(6.2), it will continue as "doves only." Similarly, another steady state is
$$p_t = 1 \quad\text{and}\quad d_t = 0,$$
which can be reached if we start with all hawks. Since we have started with both hawks
and doves, both $H_{t+1}$ and $D_{t+1}$ are positive. Hence, we can compute the steady states by
$$\frac{p_{t+1}}{d_{t+1}} = \frac{H_{t+1}}{D_{t+1}} = \frac{1 + \frac{v-c}{2}p_t + v\,d_t}{1 + \frac{v}{2}d_t}\cdot\frac{p_t}{d_t},$$
²The probabilities of matching to a hawk and to a dove are $p_t$ and $d_t$, respectively, and there are $H_t$
hawks.
where the last equality is due to (6.2) and (6.3). The steady-state equality $p_{t+1}/d_{t+1} = p_t/d_t$ holds if and only if
$$\frac{v-c}{2}\,p_t + v\,d_t = \frac{v}{2}\,d_t,$$
or equivalently
$$p_t = \frac{v}{c}.$$
This is the only steady state reached from a distribution with both hawks and doves. Notice
that it is the mixed strategy Nash equilibrium of the underlying game. This is a general
fact: if a population dynamic is as described in this section, then the steady states
reachable from a completely mixed distribution are symmetric Nash equilibria.
We will now see that when we start with both hawks and doves present, we necessarily
approach the last steady state, which is the mixed strategy Nash equilibrium.
Now $p_{t+1} < p_t$ whenever
$$\frac{p_{t+1}}{d_{t+1}} < \frac{p_t}{d_t},$$
which holds whenever
$$\frac{1 + \frac{v-c}{2}p_t + v\,d_t}{1 + \frac{v}{2}d_t} < 1,$$
as one can see from (6.2) and (6.3). The latter inequality is equivalent to $p_t > v/c$.
That is, if $p_t$ exceeds the equilibrium value, then it decreases towards the equilibrium
value. Similarly, if $p_t < v/c$, then $p_{t+1} > p_t$, and $p_t$ will increase towards the equilibrium.
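The dynamic (6.2)-(6.3) can be simulated by tracking the hawk ratio alone; $v = 2$ and $c = 3$ below are illustrative values, for which the steady state is $v/c = 2/3$:

```python
# Hawk-dove population dynamic: p is the hawk ratio. Each season the
# hawk and dove sub-populations grow by the factors in (6.2)-(6.3),
# and p is renormalized.

def step(p, v, c):
    d = 1 - p
    g_hawk = 1 + (v - c) / 2 * p + v * d   # per-hawk growth factor
    g_dove = 1 + v / 2 * d                 # per-dove growth factor
    return p * g_hawk / (p * g_hawk + d * g_dove)

v, c = 2.0, 3.0
p = 0.1                                    # start with 10% hawks
for _ in range(500):
    p = step(p, v, c)
# p approaches v/c = 2/3, the mixed-strategy equilibrium ratio.
```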
2. [Midterm 1, 2011] Compute the set of Nash equilibria in Exercise 2 of Section 5.3.
Solution: Recall that the set of Nash equilibria is invariant to the elimination
of non-rationalizable strategies. Hence, it suffices to compute the Nash equilibria
6.5. EXERCISES WITH SOLUTIONS 93
in the reduced game. Recall also from Section 5.3 that, after the elimination of
non-rationalizable strategies, the game reduces to
(0, 3∗) (3∗, 0)
(3∗, 0) (2, 4∗)
Here, the best responses (to the pure strategies) are indicated with asterisks. Since
the best responses do not intersect, there is no Nash equilibrium in pure strategies.
There is a unique mixed strategy Nash equilibrium $\sigma^*$. In order for Player 1 to
play a mixed strategy, he must be indifferent between his two rows against $\sigma_2^*$;
the left-hand side of the indifference condition is the expected payoff from the first row, and the right-hand side is
the expected payoff from the second row. The indifference condition yields
$$\sigma_2^* = (1/4,\ 3/4).$$
Since Player 2 is playing a mixed strategy, he must also be
indifferent between his two columns against $\sigma_1^*$; the analogous indifference
condition then pins down $\sigma_1^*$.
3. [Midterm 1, 2001] Find all the Nash equilibria in the following game:
1\2
(1, 0) (0, 1) (5, 0)
(0, 2) (2, 1) (1, 0)
4. [Make up for Midterm 1, 2007] Consider the game in Exercise 4 of Section 3.4.
where the payoffs from the strategies "same" and "new" are on the left and
right hand sides of the equation, respectively. The indifference condition yields
$$\sigma_1^*(\text{regular}) = \frac{1-2p}{1-p} \quad\text{and}\quad \sigma_1^*(\text{make-up}) = \frac{p}{1-p}.$$
Note that, in equilibrium, the Student takes the regular exam when he is healthy
and mixes between the regular exam and the make-up when he is sick.
6.6 Exercises
1. [Homework 2, 2007] Consider the following game:
L M N R
A (4 2) (0 0) (5 0) (0 0)
B (1 4) (1 4) (0 5) (−1 0)
C (0 0) (2 4) (1 2) (0 0)
D (0 0) (0 0) (0 −1) (0 0)
2. [Midterm 1, 2007] Consider the game in Exercise 3 in Section 3.5 and Exercise 3
in Section 5.4.
3. [Midterm 1, 2005] Find all the Nash equilibria in the following game. (Don’t forget
the mixed strategy equilibrium.)
1\2
(1, 0) (4, 1) (1, 0)
(2, 1) (3, 2) (0, 1)
(3, −1) (2, 0) (2, 2)
1\2
3 0 0 3 0
0 3 3 0 0
0 0
(b) For each equilibrium in part a, check if it remains a Nash equilibrium when
= 2.
5. [Homework 2, 2001] Compute all the Nash equilibria of the following game.
L M R
A (3 1) (0 0) (1 0)
B (0 0) (1 3) (1 1)
C (1 1) (0 1) (0 10)
6. [Homework 2, 2002] Compute all the Nash equilibria of the following game.
L M R
A (4 3) (0 0) (1 1)
B (0 1) (1 0) (10 0)
C (0 0) (3 4) (1 1)
D (−1 0) (3 1) (5 0)
1 (2, 2) (3, 0) (4, 0)
2 (3, 3) (2, 0) (1, 0)
3 (1, 3) (5, 5) (0, 2)
4 (1, 1) (1, 1) (2, 3)
(a) Iteratively eliminate all strictly dominated strategies; state the assumptions
necessary for each elimination.
8. [Homework 1, 2004] Consider the game in Exercise 1 of Section 5.4. What are the
Nash equilibria in pure strategies?
9. [Midterm 1, 2003] Find all the Nash equilibria in Exercise 3 of Section 5.4. (Don’t
forget the mixed-strategy equilibrium!)
10. [Homework 1, 2002] Consider the game in Exercise 10 of Section 5.4. What are
the Nash equilibria in pure strategies?
11. [Midterm 1 Make up, 2001] Compute all the Nash equilibria in the following game.
(3, 2) (4, 0) (0, 0)
(2, 0) (3, 3) (0, 0)
(0, 0) (0, 0) (3, 3)
12. [Homework 2, 2004] Compute all the Nash equilibria of the following games.
(a)
L M
T (2 1) (0 2)
B (0 1) (3 0)
(b)
L M R
A (4 2) (0 0) (1 1)
B (1 1) (3 4) (2 1)
C (0 0) (3 1) (1 0)
14. [Midterm 1, 2010] Compute a Nash equilibrium of the following game. (This is a
version of Rock-Scissors-Paper with preference for Paper.)
1\2
(0, 0) (2, −2) (−2, 3)
(−2, 2) (0, 0) (2, −1)
(3, −2) (−1, 2) (1, 1)
15. [Homework 2, 2006] There are $n$ players, $1, 2, \dots, n$, who bid for a painting in a
second-price auction. Each player $i$ bids $b_i$, and the bidder who bids highest buys
the painting at the highest price bid by the players other than himself. (If two
or more players bid the highest bid, the winner is decided by a coin toss.) The
value of the painting is $v_i$ for each player $i$, where $v_1 > v_2 > \cdots > v_n > 0$. Find a Nash
equilibrium of this game in which player $n$, who values the painting least, buys
the object for free (at price zero). Briefly discuss this result and compare it to the
answer of Exercise 4 in Section 4.5.
16. [Homework 2, 2006] Compute all the Nash equilibria of the following game.
L M R
A (4 2) (0 0) (2 1)
B (0 1) (3 4) (0 1)
C (1 5) (2 1) (1 4)
17. Assume that each strategy set $S_i$ is convex and each utility function $u_i$ is strictly
concave in own strategy $s_i$.³ Show that all Nash equilibria are in pure strategies.
³A set $S$ is convex if $\lambda x + (1-\lambda) y \in S$ for all $x, y \in S$ and all $\lambda \in [0, 1]$. A function $f : S \to \mathbb{R}$ is
strictly concave if
$$f(\lambda x + (1-\lambda) y) > \lambda f(x) + (1-\lambda) f(y)$$
for all distinct $x, y \in S$ and all $\lambda \in (0, 1)$.
Chapter 7
Application: Imperfect Competition
Some of the earliest applications of game theory are the analyses of imperfect competition
by Cournot (1838) and Bertrand (1883), a century before Nash (1950). This chapter
applies the solution concepts of rationalizability and Nash equilibrium to those models
of imperfect competition.
where
$$Q = q_1 + \cdots + q_n \tag{7.2}$$
is the total supply. Each firm maximizes the expected profit. Hence, the payoff of firm $i$
is
$$\pi_i = q_i\,(P - c), \tag{7.3}$$
where $P$ is the market price and $c$ is the constant marginal cost.
Assuming all of the above is commonly known, one can write this as a game in normal
form, by setting
102 CHAPTER 7. APPLICATION: IMPERFECT COMPETITION
• $S_i = [0, \infty)$ as the strategy space of player $i$, where a typical strategy $q_i$ is the quantity
produced by firm $i$, and
Best Response Throughout the course, it will be useful to know the best response of
a firm to the production levels of the other firms. (See also Exercise 3 in Section 4.4.)
Write
$$Q_{-i} = \sum_{j \ne i} q_j \tag{7.4}$$
for the total supply of the firms other than firm $i$. If $Q_{-i} \ge 1$, then the price is $P = 0$, and
the best firm $i$ can do is to produce zero and obtain zero profit. Now assume $Q_{-i} \le 1$.
For any $q_i \in (0, 1 - Q_{-i})$, the profit of the firm is
Figure 7.1: The profit $q_i(1 - q_i - Q_{-i} - c)$ of firm $i$, which is maximized at $q_i = (1 - Q_{-i} - c)/2$.
Figure 7.2: The best-response function $q_i = q_i^B(Q_{-i})$, which decreases from $(1-c)/2$ at $Q_{-i} = 0$ to 0 at $Q_{-i} = 1 - c$.
Figure 7.3: The best-response curves $q_1 = q_1^B(q_2)$ and $q_2 = q_2^B(q_1)$ intersect at the Nash equilibrium $q^*$.
showing that $\hat{q}_i$ is strictly dominated by $(1-c)/2$. We therefore eliminate all $\hat{q}_i > (1-c)/2$ for each player $i$, so that the surviving strategies form the interval $[0, (1-c)/2]$ for each player.
showing that $\hat{q}_i$ is strictly dominated by $(1-c)/4$. We will therefore eliminate all $\hat{q}_i$ with $\hat{q}_i < (1-c)/4$. The remaining strategies form the interval $[(1-c)/4, (1-c)/2]$ for each player.
Notice that the remaining game is a smaller replica of the original game. Applying
the same procedure repeatedly, one can eliminate all strategies except for the Nash
equilibrium. (After every two rounds, a smaller replica is obtained.) Therefore, the only
rationalizable strategy is the unique Nash equilibrium strategy:
$$q^* = (1-c)/3.$$
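The alternating elimination described above is easy to reproduce numerically ($c = 0.1$ is an illustrative cost):

```python
# Iterated elimination in the Cournot duopoly. Each round maps the
# surviving interval [lo, hi] to [B(hi), B(lo)], where
# B(q) = max((1 - q - c)/2, 0) is the best response.

def B(q, c):
    return max((1 - q - c) / 2, 0.0)

def surviving_interval(c, rounds):
    lo, hi = 0.0, 1.0          # quantities above 1 are never profitable
    for _ in range(rounds):
        lo, hi = B(hi, c), B(lo, c)
    return lo, hi

c = 0.1
lo, hi = surviving_interval(c, 100)
# Both endpoints converge to the rationalizable quantity (1 - c)/3 = 0.3.
```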
A more formal treatment One can prove this more formally by invoking the fol-
lowing lemma repeatedly:
Lemma 7.1 Given that $q_j \le \bar{q}$, every strategy $\hat{q}_i$ with $\hat{q}_i < q_i^B(\bar{q})$ is strictly dominated
by $q_i^B(\bar{q}) \equiv (1 - \bar{q} - c)/2$. Given that $q_j \ge \bar{q}$, every strategy $\hat{q}_i$ with $\hat{q}_i > q_i^B(\bar{q})$ is strictly
dominated by $q_i^B(\bar{q}) \equiv (1 - \bar{q} - c)/2$.
Proof. To prove the first statement, take any $q_j \le \bar{q}$. Note that $\pi_i(q_i; q_j)$ is strictly
increasing in $q_i$ at any $q_i < q_i^B(q_j)$. Since $\hat{q}_i < q_i^B(\bar{q}) \le q_i^B(q_j)$,² this implies that
$$\pi_i(\hat{q}_i; q_j) < \pi_i(q_i^B(\bar{q}); q_j),$$
showing that $\hat{q}_i$ is strictly dominated by $q_i^B(\bar{q})$; the second statement is proved symmetrically. Applying the lemma repeatedly generates the bounds
$$q^k = q_i^B(q^{k-1}) \equiv \left(1 - q^{k-1} - c\right)/2 = (1-c)/2 - q^{k-1}/2.$$
²This is because $q_i^B$ is decreasing.
7.1. COURNOT (QUANTITY) COMPETITION 107
$$q^0 = 0,$$
$$q^1 = \frac{1-c}{2},$$
$$q^2 = \frac{1-c}{2} - \frac{1-c}{4},$$
$$q^3 = \frac{1-c}{2} - \frac{1-c}{4} + \frac{1-c}{8},$$
$$\vdots$$
$$q^k = \frac{1-c}{2} - \frac{1-c}{4} + \frac{1-c}{8} - \cdots - (-1)^k\,\frac{1-c}{2^k}.$$
Theorem 7.1 The set of remaining strategies after any odd round $k$ ($k = 1, 3, \dots$) is
$[q^{k-1}, q^k]$. The set of remaining strategies after any even round $k$ ($k = 2, 4, \dots$) is
$[q^k, q^{k-1}]$. The set of rationalizable strategies is $\{(1-c)/3\}$.
Since $q^k \to (1-c)/3$, the intersection of the above intervals is $\{(1-c)/3\}$, which is the set of
rationalizable strategies.
Rationalizability In the first round, one can eliminate any strategy $q_i > (1-c)/2$,
using the same argument as in the case of duopoly. But in the second round, the maximum
possible total supply by the other firms is
$$(n-1)(1-c)/2 \ge 1 - c,$$
where $n \ge 3$ is the number of firms. The best response to this aggregate supply level is 0.
Hence, one cannot eliminate any strategy in round 2. The elimination process stops,
yielding $[0, (1-c)/2]$ as the set of rationalizable strategies. Since the set of rationalizable
strategies is large, rationalizability has weak predictive power in this game.
Nash Equilibrium While rationalizability has weak predictive power in that the set
of rationalizable strategies is large, Nash equilibrium retains strong predictive
power: there is a unique Nash equilibrium. Recall that $q^* = (q_1^*, q_2^*, \dots, q_n^*)$ is a Nash
equilibrium if and only if
$$q_i^* = q_i^B\Bigl(\sum_{j \ne i} q_j^*\Bigr) = \frac{1 - c - \sum_{j \ne i} q_j^*}{2}$$
for all $i$, where the second equality is by (7.6) and the fact that the firms
cannot have negative profits in equilibrium (i.e. $\sum_{j \ne i} q_j^* \le 1 - c$). Rewrite this equation
system more explicitly. For any $i$ and $j$, by subtracting the $j$th equation from the $i$th, one can obtain
$$q_i^* - q_j^* = 0.$$
7.2. BERTRAND (PRICE) COMPETITION 109
Hence,
$$q_1^* = q_2^* = \cdots = q_n^*.$$
Substituting into the best-response equation,
$$(n+1)\, q_1^* = 1 - c;$$
i.e.
$$q_1^* = q_2^* = \cdots = q_n^* = \frac{1-c}{n+1}.$$
Therefore, there is a unique Nash equilibrium, in which each firm produces $(1-c)/(n+1)$.
In the unique equilibrium, the total supply is
$$Q = \frac{n}{n+1}(1-c)$$
and the price is
$$P = c + \frac{1-c}{n+1}.$$
The profit level for each firm is
$$\pi = \left(\frac{1-c}{n+1}\right)^2.$$
As $n$ goes to infinity, the total supply converges to $1-c$, and the price converges to
$c$. These are the values at which the demand ($Q = \max\{1 - P, 0\}$) is equal to supply
($P = c$), which is called the (perfectly) competitive equilibrium. When there are few firms,
however, the price is significantly higher than the competitive price $c$, and the total
supply is significantly lower than the competitive supply $1 - c$. We will next consider
another model, in which two firms are enough for the competitive outcome.
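The equilibrium formulas and their competitive limit can be verified directly ($c = 0.1$ below is illustrative):

```python
# n-firm Cournot equilibrium: each firm produces (1-c)/(n+1); the total
# supply, price, and per-firm profit follow.

def cournot(n, c):
    q = (1 - c) / (n + 1)      # per-firm quantity
    Q = n * q                  # total supply
    P = 1 - Q                  # price; equals c + (1-c)/(n+1)
    profit = q * (P - c)       # equals ((1-c)/(n+1))**2
    return q, Q, P, profit

c = 0.1
for n in (2, 10, 1000):
    q, Q, P, profit = cournot(n, c)
# As n grows, Q -> 1 - c and P -> c: the competitive outcome.
```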
Assume that it costs nothing to produce the good (i.e. $c = 0$). Therefore, the profit of a
firm $i$ is
$$\pi_i(p_1, p_2) = \begin{cases} p_i(1 - p_i) & \text{if } p_i < p_j \\ p_i(1 - p_i)/2 & \text{if } p_i = p_j \\ 0 & \text{otherwise.} \end{cases}$$
Assuming all of the above is commonly known, one can write this formally as a game
in normal form by setting
• $S_i = [0, \infty)$ as the set of strategies for each $i$, with price $p_i$ a typical strategy,
Observe that when $p_j = 0$, $\pi_i(p_1, p_2) = 0$ for every $p_i$, and hence every $p_i$ is a best
response to $p_j = 0$. This has two important implications:
1. Every strategy is rationalizable (one cannot eliminate any strategy because each
of them is a best reply to zero).
2. In particular, $p_i = 0$ is a best response to $p_j = 0$, so $p^* = (0, 0)$ is a Nash equilibrium.
In the rest of the notes, I will first show that this is indeed the only Nash equilibrium.
In other words, even with two firms, when the firms compete by setting prices, the
competitive equilibrium will emerge. I will then show that if we modify the game
slightly by discretizing the set of allowable prices and putting a minimum price, then the
game becomes dominance-solvable, i.e., only one strategy remains rationalizable. In the
modified game, the minimum price is the only rationalizable strategy, as in competitive
equilibrium. Finally I will introduce small search costs on the part of consumers, who
are not modeled as players, and illustrate that the equilibrium behavior is dramatically
different from the equilibrium behavior in the original game and competitive equilibrium.
Proof. We have already seen that $p^* = (0, 0)$ is a Nash equilibrium. I will here show that
if $(p_1, p_2)$ is a Nash equilibrium, then $p_1 = p_2 = 0$. To do this, take any Nash equilibrium
$(p_1, p_2)$. I first show that $p_1 = p_2$. Towards a contradiction, suppose that $p_i < p_j$. If
$p_i = 0$, then $\pi_i(p_i, p_j) = 0$, while $\pi_i(p_j, p_j) = p_j(1 - p_j)/2 > 0$. That is, choosing $p_j$ is
a profitable deviation for firm $i$, showing that such a profile is not a Nash equilibrium.
Therefore, in order to have an equilibrium, it must be that $p_i > 0$. But then, firm $j$
has a profitable deviation: $\pi_j(p_i, p_j) = 0$ while $\pi_j(p_i, p_i) = p_i(1 - p_i)/2 > 0$. All in
all, this shows that one cannot have $p_i < p_j$ in equilibrium. Therefore, $p_1 = p_2$. But
if $p_1 = p_2$ in a Nash equilibrium, then it must be that $p_1 = p_2 = 0$. This is because if
$p_1 = p_2 > 0$, then Firm 1 would have a profitable deviation: $\pi_1(p_1, p_2) = (1 - p_1)\,p_1/2$,
while $\pi_1(p_1 - \epsilon, p_2) = (p_1 - \epsilon)(1 - p_1 + \epsilon)$, which is close to $(1 - p_1)\,p_1$ when $\epsilon$ is close
to zero.
A graphical proof for the above result is as follows. Recall that (1 2 ) is a Nash
equilibrium if and only if it is in the intersection of the best responses. Recall also from
Exercise 3.e of Section 4.4 that everything is a best response to $p_j = 0$ and nothing is a
best response to any $p_j \ne 0$. Hence, as shown in Figure ??, the best responses intersect
each other only at $(0, 0)$, showing that $(0, 0)$ is the only Nash equilibrium.
The important assumption here is that the minimum allowable price pmin = 0.01 yields
a positive profit. We will now see that the game is "dominance-solvable" under this
assumption. In particular, pmin is the only rationalizable strategy, and it is the only
Nash equilibrium strategy. Let us start with the first step.
Step 1: Any price p greater than the monopoly price pM = 0.5 is strictly dominated
by some strategy σε that assigns some probability ε > 0 to the price pmin = 0.01 and
probability 1 − ε to the price pM = 0.5.
Proof. Take any player i and any price p > pM. We want to show that the mixed
strategy σε with σε(pM) = 1 − ε and σε(pmin) = ε strictly dominates p for some ε > 0.
For any pj, we have

πi(p, pj) ≤ p(1 − p) ≤ 0.51 × 0.49 < 0.25,
112 CHAPTER 7. APPLICATION: IMPERFECT COMPETITION
where the first inequality is by definition and the last inequality is due to the fact that
p ≥ 0.51. On the other hand,

πi(σε, pj) ≥ (1 − ε) pM(1 − pM) + ε pmin(1 − pmin)
          > (1 − ε) pM(1 − pM)
          = 0.25(1 − ε),

which exceeds πi(p, pj) for sufficiently small ε > 0.
Round 1: By Step 1, all strategies with p > pM = 0.5 are eliminated. Moreover,
each p ≤ pM is a best reply to pj = p + 0.01, and is not eliminated. Therefore, the set
of remaining strategies is

S² = {0.01, 0.02, . . . , 0.5}.
In any subsequent round, let p̄ be the highest remaining price. Then, the strategy p̄ is
strictly dominated by the mixed strategy σε with σε(p̄ − 0.01) = 1 − ε and σε(pmin) = ε,
as we will see momentarily. We then eliminate the strategy p̄. There is no further
elimination in this round because each p < p̄ is a best reply to pj = p + 0.01.
To prove that p̄ is strictly dominated by σε, note that the profit from p̄ for player
i is

πi(p̄, pj) = p̄(1 − p̄)/2  if pj = p̄,
πi(p̄, pj) = 0            otherwise.

On the other hand,

πi(σε, p̄) = (1 − ε)(p̄ − 0.01)(1 − p̄ + 0.01) + ε pmin(1 − pmin)
          > (1 − ε)(p̄ − 0.01)(1 − p̄ + 0.01)
          = (1 − ε)[p̄(1 − p̄) − 0.01(1 − 2p̄) − 0.0001],

which exceeds p̄(1 − p̄)/2 for sufficiently small ε, while against any pj < p̄ the
comparison is immediate. This shows that σε strictly dominates p̄, completing the proof.
Therefore, the process continues until the set of remaining strategies is {pmin}, and
it stops there. Therefore, pmin is the only rationalizable strategy.
Since players can put positive probability only on rationalizable strategies in a Nash
equilibrium, the only possible Nash equilibrium is (pmin, pmin), which is clearly a Nash
equilibrium.
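The rounds above can be replayed numerically. Below is a small sketch (the grid encoding and helper names are my own) that starts from the cent grid, applies Step 1, and then eliminates the highest remaining price round by round, checking that the dominating mixture described above indeed beats it against every surviving opponent price:

```python
P_MIN, STEP = 0.01, 0.01

def profit(p_i, p_j):
    """Firm i's profit with demand 1 - p, split equally on ties."""
    if p_i < p_j:
        return p_i * (1 - p_i)
    if p_i == p_j:
        return p_i * (1 - p_i) / 2
    return 0.0

def mixed_profit(eps, top, p_j):
    """Payoff of the mixture: prob 1 - eps on (top - 0.01), prob eps on P_MIN."""
    return (1 - eps) * profit(round(top - STEP, 2), p_j) + eps * profit(P_MIN, p_j)

grid = [round(P_MIN + k * STEP, 2) for k in range(100)]   # 0.01, ..., 1.00
remaining = [p for p in grid if p <= 0.5]                  # Step 1: drop p > 0.5
eps = 1e-3
while len(remaining) > 1:
    top = remaining[-1]
    # strict dominance against every surviving opponent price
    assert all(mixed_profit(eps, top, q) > profit(top, q) for q in remaining)
    remaining.pop()

print(remaining)   # [0.01] -- the only rationalizable strategy
```

Each `assert` verifies strict dominance against every surviving opponent price, which is exactly what the round-by-round argument requires.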
For simplicity, allow only two prices: 3 and 5. Suppose that the demand for the good
comes from a single buyer, for whom the value of the good is 6. She needs only one unit
of the good. Unlike before, the buyer has a very small search cost c ∈ (0, 1). She can
check the prices by paying c.
The game is as follows:
• The two firms set prices p1 ∈ {3, 5} and p2 ∈ {3, 5}, and the consumer decides
whether to check the prices, all simultaneously.
• If she checks the prices, then she buys from the firm with the lower price. If she
decides not to check or if p1 = p2, then she buys from either of the firms with equal
probabilities. This behavior is fixed, so that the only strategies of the consumer are
"check" and "no check".
Formally, the payoffs are:

                check                                   no check
1\2      5               3                 1\2      5               3
5    5/2, 5/2, 1−c   0, 3, 3−c             5    5/2, 5/2, 1    5/2, 3/2, 2
3    3, 0, 3−c       3/2, 3/2, 3−c         3    3/2, 5/2, 2    3/2, 3/2, 3
Here, the first entry is the payoff of firm 1, the second entry is the payoff of firm 2,
and the final entry is the payoff of the buyer. Firm 1 chooses the row; firm 2 chooses
the column, and the buyer chooses the matrix. We computed the payoffs following the
fixed behavior above. For example, if the consumer doesn't check the prices, she buys
from either firm with probability 0.5. Hence, the payoff of firm i is pi/2, independent of
pj. The payoff of the buyer is

0.5(6 − p1) + 0.5(6 − p2) = 6 − (p1 + p2)/2.
7.2. BERTRAND (PRICE) COMPETITION 115
If the buyer checks and p1 = p2, then the payoffs are p1/2 to each firm and 6 − p1 − c to
the buyer. If the buyer checks and pi < pj, then the buyer buys one unit from firm i; the
payoff of firm i is pi, the payoff of firm j is 0, and the payoff of the buyer is 6 − pi − c.
A quick glance at the above table reveals that the only pure-strategy Nash equilibrium
is that both firms set the price 5 (p1 = p2 = 5) and the buyer does not check the prices.
This is clearly different from the previous games, where price competition pushes the
prices to the minimum.
It is easy to check that (p1 = p2 = 5; no check) is a Nash equilibrium: given "no
check", p = 5 dominates p = 3. Given that the prices are equal, the buyer saves c by not
checking.
It is also easy to check that this is the only Nash equilibrium in pure strategies.
If p1 = p2, the best response of the buyer is "no check". If the buyer doesn't check,
then the best reply of each firm is 5. Therefore, the only equilibrium with p1 = p2
is (p1 = p2 = 5; no check). On the other hand, there cannot be a Nash equilibrium with
p1 ≠ p2. To see this, suppose that pi = 5 and pj = 3. Then, the buyer gets 3 − c when
she checks and 2 when she does not. Her best reply is to check because c < 1. That
is, (pi = 5, pj = 3, no check) is not an equilibrium. In order to have an equilibrium, she
must check. But in that case, firm i gets 0, while it could get the higher payoff of 3/2 by
setting pi = 3. Therefore, (pi = 5, pj = 3, check) is not an equilibrium either.
There is also a symmetric Nash equilibrium in mixed strategies. To find the equi-
librium, let us write q for the probability that a firm sets p = 5 (the probabilities are
equal by assumption) and μ for the probability that the buyer checks. The expected payoff
from checking for the buyer is

U(check; q) = q² + 3(1 − q²) − c = 3 − 2q² − c.

This is because the buyer gets 1 − c when p1 = p2 = 5, which happens with probability
q², and 3 − c otherwise (with probability 1 − q²). If she doesn't check, her expected
payoff is

U(no check; q) = q + 3(1 − q) = 3 − 2q.

(Since she chooses the firm randomly without knowing the prices, the probability that
the price will be high is q.) We are looking for a mixed-strategy Nash equilibrium with
μ ∈ (0, 1), so the buyer must be indifferent: 3 − 2q² − c = 3 − 2q, i.e., q² − q + c/2 = 0.
That is,

q = (1 ∓ √(1 − 2c)) / 2.
On the other hand, given that the buyer checks with probability μ and the other firm
charges the high price with probability q, the payoff from p = 5 is

π(5; μ, q) = (1 − μ(1 − q)) · 5/2.

This is because the firm cannot sell if the buyer checks (probability μ) and the other firm
charges the low price (probability 1 − q); otherwise it sells with probability 0.5. In that
case, the payoff from p = 3 is

π(3; μ, q) = 3μq + (1 − μq) · 3/2.

This is because the firm will sell with probability 1, getting the payoff of 3, if the buyer
checks and the other firm sets the high price; otherwise the firm gets 3/2. Since q ∈ (0, 1),
the firm must be indifferent:

π(5; μ, q) = π(3; μ, q)
(1 − μ(1 − q)) · 5/2 = 3μq + (1 − μq) · 3/2.

That is,

μ = 2 / (5 − 2q).
There are two symmetric mixed-strategy equilibria:

(q, μ) = ( (1 + √(1 − 2c))/2 , 2/(4 − √(1 − 2c)) )

and

(q, μ) = ( (1 − √(1 − 2c))/2 , 2/(4 + √(1 − 2c)) ).
It is illustrative to plot the possible values of q as a function of c, including the pure-
strategy Nash equilibrium where q = 1.
7.3. EXERCISES WITH SOLUTIONS 117
[Figure: equilibrium values of q plotted against the search cost c.]
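The two branches of q in the plot can also be computed directly. A quick numerical check (the sample cost value is mine) confirms that both (q, μ) pairs satisfy the buyer's and the firms' indifference conditions:

```python
import math

def mixed_equilibria(c):
    """The two symmetric mixed equilibria (q, mu) for search cost c in (0, 1/2)."""
    s = math.sqrt(1 - 2 * c)
    return [((1 + s) / 2, 2 / (4 - s)), ((1 - s) / 2, 2 / (4 + s))]

c = 0.18
for q, mu in mixed_equilibria(c):
    # Buyer indifference: 3 - 2q^2 - c == 3 - 2q, i.e., q^2 - q + c/2 = 0
    assert abs((q * q - q) + c / 2) < 1e-12
    # Firm indifference: (1 - mu(1-q)) 5/2 == 3 mu q + (1 - mu q) 3/2
    lhs = (1 - mu * (1 - q)) * 5 / 2
    rhs = 3 * mu * q + (1 - mu * q) * 3 / 2
    assert abs(lhs - rhs) < 1e-12
```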
When c > 1/2, there is a unique Nash equilibrium, in which the firms charge high prices.
When c = 1/2, there is also a mixed-strategy Nash equilibrium, in which the firms
charge high and low prices with equal probabilities. As we decrease the cost further,
we have two mixed-strategy equilibria and a pure-strategy equilibrium, where the price
is high. The probability of charging a high price reacts to changes in c differently
in the two equilibria. In one equilibrium, as we decrease c to zero, the probability of
charging a high price also decreases to zero, so that in the limit the firms charge low
prices. In the other equilibrium, that probability increases to 1, so that in the limit the
firms charge high prices.
Solution: Suppose that m ≤ n firms produce some positive quantity and the remain-
ing firms produce 0. For any firm i with positive production qi∗,

qi∗ = (1 − Σ_{j≠i} qj∗ − c) / 2.

As in the usual Cournot model above, the unique solution to this equation system is

q∗ = (1 − c) / (m + 1).
Hence, the set of Nash equilibria is as follows: for any integer m in the range for which
the above profitability conditions hold, m firms produce (1 − c)/(m + 1) each, and the
remaining firms produce 0.
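A one-line check (my own, with an arbitrary sample cost) confirms that q∗ = (1 − c)/(m + 1) solves the first-order conditions above for any number m of active firms:

```python
def foc_gap(m, c):
    """Residual of q_i = (1 - sum of others' quantities - c)/2 at the symmetric solution."""
    q = (1 - c) / (m + 1)
    others = (m - 1) * q
    return q - (1 - others - c) / 2

for m in range(1, 6):
    assert abs(foc_gap(m, 0.2)) < 1e-12
```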
7.4 Exercises
1. [Midterm 1, 2007] Consider the Cournot duopoly with linear demand function
P = 1 − Q, where P is the price and Q = q1 + q2 is the total supply.³ Firm 1 has
zero marginal cost. Firm 2 has marginal cost c(q2) = q2, so that the total cost of
producing q2 is q2²/2.
2. Show that all Nash equilibria of the Cournot oligopoly game above are in pure
strategies (i.e., one does not need to check for mixed-strategy equilibria). (See Exercise
17 in Section 6.6.)
3. Can you find a mixed-strategy Nash equilibrium in the Bertrand game above?
³Recall that in Cournot duopoly Firms 1 and 2 simultaneously produce q1 and q2, and they sell at
price P.
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Chapter 8
Further Applications
This chapter is devoted to exercises that apply the ideas developed in previous chapters
to various real-world problems. All of the exercises come from past exams and homework
problems. The reader is recommended to solve them before studying the solutions.
Many of the games in this chapter are supermodular, a class of games for which there
are powerful general theorems. These theorems could be used to find the rationalizable
set relatively easily. I will not use these theorems. Instead, I will explicitly apply the
iterated elimination procedure and use the results from the previous chapters. This
will hopefully drill in the logic of rationalizability better. Moreover, knowledge of the
procedure clarifies the role of the knowledge and rationality assumptions, showing how
sensitive the solution can be to such assumptions.
8.1 Partnership
Consider an employer and a worker. The employer provides the capital K ≥ 0 (in terms
of investment in technology, etc.) and the worker provides the labor L ≥ 0 (in terms of
the investment in human capital) to produce

f(K, L) = K^α L^β

for some α, β ∈ (0, 1) with α + β < 1. They share the output equally. The parties
determine their investment levels (the employer's capital K and the worker's labor L)
simultaneously. The per-unit costs of capital and labor for the employer and the
worker are r > 0 and w > 0, respectively. The worker cannot put in more than some
fixed positive amount L̄. The payoffs for the employer and the worker are

U_E(K, L) = f(K, L)/2 − rK

and

U_W(K, L) = f(K, L)/2 − wL,
respectively. Everything described up to here is common knowledge.
Solution: The set of players is {E, W}, where E stands for the employer and W
stands for the worker. The sets of strategies are

S_E = [0, ∞)
S_W = [0, L̄].
Solution: Since U_E and U_W are strictly concave in the players' own strategies, all
Nash equilibria are in pure strategies. Towards finding the Nash equilibria, compute the
best-response functions for E and W as

K∗(L) = (αL^β / (2r))^{1/(1−α)}   and   L∗(K) = min{ (βK^α / (2w))^{1/(1−β)}, L̄ },

respectively. A Nash equilibrium is a point where the best responses intersect, i.e.,
K = K∗(L) and L = L∗(K). Clearly, (0, 0) is a Nash equilibrium. To find the other
possible equilibria, first consider the case in which the bound L̄ does not bind, the case
plotted in Figure 8.1. In that case, the non-zero solution to the above equation system is

K̂ = ( (α/(2r))^{1−β} (β/(2w))^{β} )^{1/(1−α−β)}
L̂ = ( (α/(2r))^{α} (β/(2w))^{1−α} )^{1/(1−α−β)}.
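The fixed point (K̂, L̂) can also be reached by iterating the best responses, which previews the rationalizability argument below. The following sketch (the parameter values are mine, chosen so that the bound L̄ does not bind) converges to the closed-form solution:

```python
a, b, r, w, Lbar = 0.3, 0.4, 1.0, 1.0, 10.0   # alpha, beta, unit costs, labor bound

def K_star(L):
    return (a * L**b / (2 * r)) ** (1 / (1 - a))

def L_star(K):
    return min((b * K**a / (2 * w)) ** (1 / (1 - b)), Lbar)

K, L = 1.0, 1.0
for _ in range(200):                   # best-response iteration from an interior point
    K, L = K_star(L), L_star(K)

K_hat = ((a / (2 * r)) ** (1 - b) * (b / (2 * w)) ** b) ** (1 / (1 - a - b))
L_hat = ((a / (2 * r)) ** a * (b / (2 * w)) ** (1 - a)) ** (1 / (1 - a - b))
assert abs(K - K_hat) < 1e-9 and abs(L - L_hat) < 1e-9
```

The iteration contracts because α + β < 1, so the best-response map is a contraction in logs.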
[Figure 8.1: the best-response curves K∗ and L∗ in the (L, K) plane.]
Rationalizable
Towards an induction, assume that at the end of round 2n, the remaining strategy
sets are [0, K∗((L∗ ∘ K∗)^{n−1}(L̄))] and [0, (L∗ ∘ K∗)^{n}(L̄)].¹ (This is the case for
n = 1.)

Round 2n + 1: Write

K̃ ≡ K∗((L∗ ∘ K∗)^{n−1}(L̄))

and

L̃ ≡ (L∗ ∘ K∗)^{n}(L̄).

Since L̃ < L̄, as one can see from the figure, K̃ > K̂ and L̃ > L̂, and hence
K∗(L̃) < K̃. Any strategy K > K∗(L̃) is dominated by K∗(L̃) and is eliminated
in this round. No other strategy is eliminated. The remaining strategy
sets are [0, K∗(L̃)] and [0, L̃].

Round 2n + 2: Since L̃ < L̄, as in the previous round, L∗(K∗(L̃)) < L̃.
Now, since K ∈ [0, K∗(L̃)], any L > L∗(K∗(L̃)) is strictly dominated by
L∗(K∗(L̃)) and is eliminated for the worker. As before, no other strategy
is eliminated in this round. The remaining strategy sets are [0, K∗(L̃)] and
[0, L∗(K∗(L̃))]. By substituting L̃ ≡ (L∗ ∘ K∗)^{n}(L̄), one can check that the
formulas in the inductive hypothesis are true for n + 1.

One can check that as n → ∞, (L∗ ∘ K∗)^{n}(L̄) → L̂. Therefore, the set of rational-
izable strategies is [0, K̂] for the employer and [0, L̂] for the worker.

¹For any function f, f^n(x) = f(f(· · · f(x) · · · )), where f is repeated n times.
8.2. COORDINATION IN SOFTWARE DEVELOPMENT 123
Note that a developer pays two costs: one for being away from his ideal specification,
and one for being away from the specifications of the other software (the cost of
incompatibility). Note also that in the normal-form game, the software developers are
the players; each chooses a strategy x_i from the real line, and the payoff function of
player i is u_i.
Solution: For each player i, the first-order condition for his best response is

2(θ_i − x_i) − (2/(n−1)) Σ_{j≠i} (x_i − x_j) = 0,

which simplifies to

2x_i = θ_i + (1/(n−1)) Σ_{j≠i} x_j.

To solve this equation system, sum it over i and obtain Σ_j x_j = Σ_j θ_j. Substituting
this into the above equation, obtain

x_i = ((n−1)/(2n−1)) θ_i + (1/(2n−1)) Σ_{j=1}^{n} θ_j.

In equilibrium, a software developer chooses, roughly, a weighted average of his own ideal
specification θ_i and the average (1/n) Σ_{j=1}^{n} θ_j of all ideal specifications, including
his own.
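The equilibrium formula can be verified by solving the linear system directly. A small sketch (the θ values and n = 4 are mine):

```python
import numpy as np

theta = np.array([0.0, 1.0, 2.0, 5.0])   # ideal specifications
n = len(theta)

# System: 2 x_i - (1/(n-1)) sum_{j != i} x_j = theta_i
A = 2 * np.eye(n) - (np.ones((n, n)) - np.eye(n)) / (n - 1)
x = np.linalg.solve(A, theta)

closed_form = ((n - 1) * theta + theta.sum()) / (2 * n - 1)
assert np.allclose(x, closed_form)
```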
x ∈ [0.001, 1] units for research and development, paying a cost of x/4. If a firm invests
x units and the other firm invests y units, the former wins with probability x/(x + y).
Therefore, the payoff of the former start-up is

x/(x + y) − x/4.

All of this is common knowledge.
Note that the leader gets 1 and the follower gets 0 in revenue. These numbers are
multiplied by the respective probabilities of winning, and the final payoff above is obtained
after subtracting the cost of research. Note also that in the normal-form game, the
players are Firm 1 and Firm 2, the strategy set of each player is [0.001, 1], and the payoff
function is as above.
i.e.,

x∗(y) = 2√y − y.

Note that x∗(y) > y whenever y < 1. Therefore, the graphs of the two best-response
functions intersect each other only at x = y = 1, as shown in Figure 8.2. Therefore, (1, 1)
is the only Nash equilibrium.
8.3. COMPETITION IN RESEARCH AND DEVELOPMENT 125
Solution: (1, 1) is the only rationalizable strategy profile. Since y ≥ x₀ ≡ 0.001,
any strategy x < x∗(x₀) is strictly dominated by x₁ ≡ x∗(x₀) and is therefore eliminated.
Now the remaining strategy space of each player is [x₁, 1]. Note that x₁ = x∗(x₀) >
0.001 = x₀. Now, similarly, one can eliminate any strategy x < x₂ ≡ x∗(x₁). Applying
this iteratively, after the nth elimination the remaining strategy space is [x_n, 1], where

x_n = 2√(x_{n−1}) − x_{n−1}

and x₀ = 0.001. It is clear from the figure that x_n → 1 as n → ∞. Hence, in the limit,
we are left with the strategy space {1}.
More formally,

x_n = 2√(x_{n−1}) − x_{n−1} > √(x_{n−1}) = x_{n−1}^{1/2},

since √(x_{n−1}) > x_{n−1}. Hence,

1 > x_n > x₀^{(1/2)^n}.

Of course, as n → ∞, (1/2)^n → 0, and hence x₀^{(1/2)^n} → 1. Therefore, x_n → 1.
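Iterating the bound numerically (the round count is mine) shows how fast the elimination closes in on 1:

```python
import math

x = 0.001                       # x_0, the minimum investment
for n in range(60):
    x = 2 * math.sqrt(x) - x    # x_n = 2*sqrt(x_{n-1}) - x_{n-1}

assert 1 - x < 1e-9             # the surviving strategy space shrinks to {1}
```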
x_A + c = (x_A + x_B + c)²   (8.2)

Comparing the two equations, we find that x_A = x_B. Substituting this equality into (8.1),
we find that, in equilibrium, x = x_A = x_B solves

x + c = (2x + c)²   (8.3)

which is equivalent to

4x² + (4c − 1)x + c² − c = 0.

There is only one non-negative solution to this quadratic equation:

x∗ = (1 − 4c + √(1 + 8c)) / 8,

which is close to 1/4 when c is close to zero. The unique Nash equilibrium is given by
x_A = x_B = x∗.
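As a quick check (the value of c is mine), x∗ is a fixed point of the best-response function x(y) = √(y + c) − (y + c) derived from the first-order conditions, and it solves the quadratic above:

```python
import math

def br(y, c):
    return math.sqrt(y + c) - (y + c)

c = 0.05
x_star = (1 - 4 * c + math.sqrt(1 + 8 * c)) / 8
assert abs(br(x_star, c) - x_star) < 1e-12          # symmetric fixed point
assert abs(4 * x_star**2 + (4 * c - 1) * x_star + c**2 - c) < 1e-12
```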
8.4. POLITICAL COMPETITION 127
Answer: From (8.1) and (8.2), the best-response functions of Alice and Bob are

x_A(x_B) = √(x_B + c) − (x_B + c)
x_B(x_A) = √(x_A + c) − (x_A + c).
[Figure: the best-response curves in the (x_A, x_B) plane, intersecting at x∗.]
Remember from the class that, since the utility functions are strictly concave (or "single-
peaked"), if x exceeds the best reply x_A(x_B) for every x_B, then x is strictly dominated.
For example, any x_A > 1/4 is strictly dominated by x_A = 1/4 (the best reply never
exceeds max_z(√z − z) = 1/4). Similarly, any x_B > 1/4 is strictly dominated by
x_B = 1/4. Hence, in the first round, we eliminate all such strategies:
[Figure: the best-response curves after the first round of elimination, restricted to [0, 1/4].]
In the next round, we eliminate the strategies with x < x_A(0) = √c − c. This
is because all such strategies are now dominated by x¹ ≡ x_A(0). We continue this elimination
iteratively. All we need to know is where the process stops. The answer is actually easy:
it will stop at x∗, and x_A = x_B = x∗ are the only rationalizable strategies.
Here is a mathematical proof. Since the game is symmetric, the set of rationaliz-
able strategies is the same for both players; call that set S. Recall that no rationalizable
strategy is strictly dominated when we restrict the other player's strategies to be ratio-
nalizable. That is, for each x ∈ S, there exists y ∈ S such that x = x_A(y).
Suppose that min S < 1/4 − c. Now, min S = x_A(y) for some y ∈ S. Since x_A is
"single-peaked", either y = min S or y = max S. But, since x∗ ≤ max S ≤ 1/4, as in
the figure, x_A(max S) ≥ 1/4 − c > min S, showing that y ≠ max S. Hence, y = min S, i.e.,
min S = x_A(min S). But this is a contradiction because it implies that (min S, min S)
is a Nash equilibrium. Therefore, min S ≥ 1/4 − c. Then, x_A is strictly decreasing on
S. That is, min S = x_A(max S) and max S = x_A(min S), i.e., (max S, min S) is a
Nash equilibrium, showing that max S = min S = x∗.
8.5 Exercises
1. [Homework 1, 2002] In the partnership game above, compute the rationalizable
strategies for the case
2. [Homework 2, 2006] Alice and Bob seek each other. Simultaneously, Alice puts in
effort a and Bob puts in effort b to search. The probability of meeting is ab; the
value of the meeting for each of them is v, and the search costs a³/3 to Alice and b³/3
to Bob.
3. [Homework 2, 2011] There are 3 partners, namely 1, 2, and 3. Simultaneously, each
partner i puts in effort e_i ∈ [0, 1], producing the output level e₁e₂e₃ and costing e_i²
to partner i. The partners share the output equally; the payoff of i is e₁e₂e₃/3 − e_i².
4. [Midterm 1 Make Up, 2002] Consider a two-player game in which each player's
strategy is a real number x ∈ [0, 1]. A player's payoff is

−(x − y/2 − 1/4)²,

where x is his own strategy and y is the strategy chosen by the other player.
5. In the software development game in Section 8.2, compute the set of rationalizable
strategies for the case
Chapter 9

Backward Induction
We now start analyzing the dynamic games with complete information. These notes
focus on perfect-information games, where each information set is a singleton, and
apply the notion of backward induction to these games. We will assume that the game
has "finite horizon", i.e., there can only be finitely many moves in any history of moves.
9.1 Definition
The concept of backward induction corresponds to the assumption that it is common
knowledge that each player will act rationally at each future node where he moves — even
if his rationality would imply that such a node will not be reached.1 (The assumption
that the player moves rationally at each information set at which he moves is called
sequential rationality.)
Mechanically, backward induction corresponds to the following procedure, depicted in
Figure 9.1. Consider any node that comes just before terminal nodes, that is, after each
move stemming from this node, the game ends. (Such nodes are called pen-terminal.) If
the player who moves at this node acts rationally, he chooses the best move for himself
at that node. Hence, select one of the moves that give this player the highest payoff.
Assigning the payoff vector associated with this move to the node at hand, delete all the
¹More precisely: at each node η, the player is certain that all the players will act rationally at all
nodes that follow node η; and at each node η, the player is certain that at each node η′ that follows
node η, the player who moves at η′ will be certain that all the players will act rationally at all nodes
that follow node η′, . . . ad infinitum.
moves stemming from this node so that we have a shorter game, where the above node
is a terminal node. Repeat this procedure until the origin is the only remaining node.
The outcome is the moves that are picked in this way. Since a move is picked at each
information set, the result is a strategy profile.
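The procedure can be written as a short recursion. The sketch below is my own minimal encoding (a terminal node is a payoff tuple, a decision node is a dict), applied to the three-day exit game discussed next, with the payoffs as in its figures:

```python
def backward_induction(node, strategy=None):
    """Return the backward-induction payoff vector at `node`, recording the
    selected move at every decision node in `strategy`."""
    if strategy is None:
        strategy = {}
    if isinstance(node, tuple):            # terminal node: payoff vector
        return node, strategy
    i = node["player"]
    best_move, best_pay = None, None
    for move, child in node["moves"].items():
        pay, _ = backward_induction(child, strategy)
        if best_pay is None or pay[i] > best_pay[i]:
            best_move, best_pay = move, pay
    strategy[id(node)] = best_move
    return best_pay, strategy

# The three-day relationship game: going down exits, going across continues.
game = {"player": 0, "moves": {
    "down": (1, 1),
    "across": {"player": 1, "moves": {
        "down": (0, 4),
        "across": {"player": 0, "moves": {
            "down": (3, 3),
            "across": (2, 5)}}}}}}

payoff, _ = backward_induction(game)
print(payoff)   # (1, 1): each player exits at the first opportunity
```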
For an illustration of the procedure, consider the game in the following figure. This
game describes a situation where it is mutually beneficial for both players to stay in a
relationship, while a player would like to exit the relationship if she knows that the
other player will exit on the next day.
1 2 1
• • • (2,5)
On the third day, Player 1 moves, choosing between going across or down. If
he goes across, he gets 2; if he goes down, he gets the higher payoff of 3.
Hence, according to the procedure, he goes down. Selecting this move for the node at
hand, one reduces the game as follows:
1 2
• • (3,3)
(1,1) (0,4)
Here, the part of the game that starts at the last decision node is replaced with the
payoff vector associated with the selected move, down, of the player at that decision node.
On the second day, Player 2 moves, choosing between going across or down.
If she goes across, she gets 3; if she goes down, she gets the higher payoff of 4. Hence,
according to the procedure, she goes down. Selecting this move for the node at hand,
one reduces the game further as follows:
1
• (0,4)
(1,1)
Once again, the part of the game that starts at the node at hand is replaced with
the payoff vector associated with the selected move, down. Now, Player 1 gets 0 if he goes
across and 1 if he goes down. Therefore, he goes down. The procedure
results in the following strategy profile:
That is, at each node, the player who is to move goes down, exiting the relationship.
Let’s go over the assumptions that we have made in constructing this strategy profile.
1 2 1
• • • (2,5)
We assumed that Player 1 will act rationally at the last date, when we reckoned that he
goes down. When we reckoned that Player 2 goes down in the second day, we assumed
that Player 2 assumes that Player 1 will act rationally on the third day, and also assumed
that she is rational, too. On the first day, Player 1 anticipates all these. That is, he is
assumed to know that Player 2 is rational, and that she will keep believing that Player
1 will act rationally on the third day.
This example also illustrates another notion associated with backward induction:
commitment (or the lack of it). Note that the outcomes on the third day
(i.e., (3,3) and (2,5)) are both strictly better than the equilibrium outcome (1,1). But
the players cannot reach these outcomes, because Player 2 cannot commit to going across,
and, anticipating that Player 2 will go down, Player 1 exits the relationship on the first day.
There is also a further commitment problem in this example. If Player 1 were able
to commit to going across on the third day, then Player 2 would definitely go across on
the second day. In that case, Player 1 would go across on the first day as well. Of course,
Player 1 cannot commit to going across on the third day, and the game ends on the first
day, yielding the low payoffs (1,1).
Proposition 9.1 In a game with finitely many nodes, backward induction always results
9.2. BACKWARD INDUCTION AND NASH EQUILIBRIUM 135
in a Nash equilibrium.
Proof. Let s∗ = (s∗1, . . . , s∗n) be the outcome of backward induction. Consider any
player i and any strategy si. To show that s∗ is a Nash equilibrium, we need to show
that

ui(s∗) ≥ ui(si, s∗−i),

where s∗−i = (s∗j)_{j≠i}. Take any node at which si differs from s∗i but si prescribes
the same moves as s∗i at every node of player i that comes after this node. (There is
always such a node; for example, the last node at which player i moves and deviates.)
Consider a new strategy s¹i according to which i plays everywhere according to si except
for the above node, where he plays according to s∗i. Under (si, s∗−i) or (s¹i, s∗−i), after
this node the play is as in (s∗i, s∗−i), the outcome of the backward induction. Moreover,
in the construction of s∗, we selected the best move for player i given this
continuation play. Therefore, the change from si to s¹i, which follows the backward-
induction recommendation, can only increase the payoff of i:

ui(s¹i, s∗−i) ≥ ui(si, s∗−i).

Applying the same procedure to s¹i, now construct a new strategy s²i that differs from
s¹i at only one node, where player i plays according to s∗i, and

ui(s²i, s∗−i) ≥ ui(s¹i, s∗−i).

Continuing in this fashion, one obtains a finite sequence si, s¹i, . . . , s^m_i = s∗i of
strategies with

ui(s∗) = ui(s^m_i, s∗−i) ≥ ui(s^{m−1}_i, s∗−i) ≥ · · · ≥ ui(s¹i, s∗−i) ≥ ui(si, s∗−i).
Since one takes his future moves as given and picks only a move for the node at hand,
choosing the best move at each given node does not, in general, necessarily lead to a best
response among all contingent plans.
Example 9.1 Consider a single player who chooses between good and bad every day,
forever. If he chooses good every day, he gets 1, and he gets 0 otherwise. Clearly, the
optimal plan is to play good every day, yielding 1. Now consider the strategy according to
which he plays bad at every node. This gives him 0. But this strategy satisfies the
condition of backward induction (although backward induction cannot be applied to this
game with no end node): at any node, given the moves selected in the future, he
gets zero regardless of what he does at the current node.
The above pathological case is a counterexample to the idea that if one is playing a
best move at every node, then his plan is a best response. The latter idea is a major
principle of dynamic optimization, called the Single-Deviation Principle. It applies in
most cases, except for pathological cases like the one above. The above proof shows that
the principle applies in games with finitely many moves. The Single-Deviation Principle
will be the main tool in the analyses of infinite-horizon games in upcoming chapters.
Studying the above proof is recommended.
But not all Nash equilibria can be obtained by backward induction. Consider the
following game of the Battle of the Sexes with perfect information, where Alice moves
first.
Alice
O F
Bob Bob
O F O F
In this game, backward induction leads to the strategy profile identified in the figure,
according to which Bob goes wherever Alice goes, and Alice goes to Opera. There is
another Nash equilibrium: Alice goes to Football game, and Bob goes to Football game
at both of his decision nodes. Let’s see why this is a Nash equilibrium. Alice plays a
best response to the strategy of Bob: if she goes to Football she gets 1, and if she goes
to Opera she gets 0 (as they do not meet). Bob’s strategy (FF) is also a best response
to Alice’s strategy: under this strategy he gets 2, which is the highest he can get in this
game.
One can, however, discredit the latter Nash equilibrium because it relies on a
sequentially irrational move at the node reached after Alice goes to the Opera. This node
is not reached according to Alice's strategy, and it is therefore ignored in Nash equilibrium.
Nevertheless, if Alice goes to the Opera, going to the Football game would be irrational for Bob,
and he would rationally go to Opera as well. And Alice should foresee this and go to
Opera. Sometimes, we say that this equilibrium is based on "an incredible threat", with
the obvious interpretation.
This example illustrates a shortcoming of the usual rationality condition, which re-
quires that one must play a best response (as a complete contingent plan) at the be-
ginning of the game. While this requires that the player plays a best response at the
nodes that he assigns positive probability, it leaves the player free to choose any move
at the nodes that he puts zero probability–because all the payoffs after those nodes are
multiplied by zero in the expected utility calculation. Since the likelihoods of the nodes
are determined as part of the solution, this may lead to somewhat erroneous solutions in
which a node is not reached because a player plays irrationally at the node, anticipating
that the node will not be reached, as in (Football, FF) equilibrium. Of course, this is
erroneous in that, when that node is reached, the player cannot pretend that the node
will not be reached: he will know that it is reached, by the definition of an information
set. Then, he must play a best response given that the node is reached.
9.3 Commitment
In this game, Alice can commit to going to a place, but Bob cannot. If we trust the
outcome of backward induction, this commitment helps Alice and hurts Bob. (Although
the game is symmetric, Alice gets the higher payoff.) It is tempting to conclude that the
ability to commit is always good. While this is true in many games, in some games it is not the
case. For example, consider the Matching Pennies with Perfect Information, depicted
in Figure 3.4. Let us apply backward induction. If Player 1 chooses Head, Player 2 will
play Head; and if Player 1 chooses Tail, Player 2 will prefer Tail, too. Hence, the game
is reduced to
(-1,1)
Head
Tail (-1,1)
In that case, Player 1 is indifferent between Head and Tail; choosing either of these
two options, or any randomization between the two acts, gives us an equilibrium with
backward induction. In every such equilibrium, Player 2 beats Player 1.
This example shows that backward induction can lead to multiple equilibria. Here, in
one equilibrium, Player 1 chooses Head, in another one Player 1 chooses Tail, and yet in
another mixed strategy equilibrium, he mixes between the two strategies. Each mixture
probability corresponds to a different equilibrium. In all of these equilibria, the payoffs
are the same. In general, however, backwards induction can lead to multiple equilibria
with quite different outcomes.
1 A 2 x 1 a
(1,1)
y
D d
z (0,2)
(1,1) a (2,2)
1 (0,1)
(1,0)
1 A 2 x
(2,2)
y
D
z (0,2)
(1,1) (1,0)
Figure 9.3:
(1,1)
The strategy selected for Player 1 depends on the choice of the mixing probability μ.
If some μ > 1/2 is selected for Player 2, Player 1 must choose A. This results in the
equilibrium in which Player 1 plays A and Player 2 plays x with probability μ and y
with probability 1 − μ. If μ < 1/2, Player 1 must choose D. In the resulting equilibrium,
Player 1 plays D and Player 2 plays x with probability μ and y with probability 1 − μ.
Finally, if μ = 1/2 is selected, then Player 1 is indifferent, and we can select any
randomization between A and D, each resulting in a different equilibrium.
ui(q1, q2) = qi(1 − (q1 + q2) − c)
9.5. EXAMPLE–STACKELBERG DUOPOLY 141
• At the initial node, firm 1 chooses an action q1; the set of allowable actions is
[0, ∞).
• After each action of firm 1, firm 2 moves and chooses an action q2; the set of allowable
actions is again [0, ∞).
• Each of these actions leads to a terminal node, at which the payoff vector is
(u1(q1, q2), u2(q1, q2)).
Notice that a strategy of firm 1 is a real number q1 from [0, ∞), and, more importantly,
a strategy of firm 2 is a function from [0, ∞) to [0, ∞), which assigns a production level
q2(q1) to each q1. These strategies, with the utility functions ui(q1, q2(q1)),
give us the normal form.
Let us apply backward induction. Given q1 ≤ 1 − c, the best production level for
the Follower is

q2∗(q1) = (1 − q1 − c)/2,

yielding the payoff vector²

(u1(q1, q2∗(q1)), u2(q1, q2∗(q1))) = ( (1/2)q1(1 − q1 − c), (1/4)(1 − q1 − c)² ).   (9.1)

By replacing the moves of firm 2 with the associated payoffs, we obtain a game in which
firm 1 chooses a quantity level q1, which leads to the payoff vector in (9.1). In this
game, firm 1 maximizes (1/2)q1(1 − q1 − c), choosing

q1∗ = (1 − c)/2,
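A brute-force check of the leader's problem (my own sketch, with a sample cost c) recovers q1∗ = (1 − c)/2:

```python
c = 0.1

def u1(q1):
    q2 = (1 - q1 - c) / 2              # follower's best reply
    return q1 * (1 - q1 - q2 - c)      # leader's profit against that reply

grid = [k / 10000 for k in range(10001)]
q1_star = max(grid, key=u1)
assert abs(q1_star - (1 - c) / 2) < 1e-3
```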
[Figure 9.4: a game tree in which Player 1 first chooses between X, ending the game
with payoff (2,1), and E; Player 2 then chooses among L, M, and R; after some of these
moves, Player 1 chooses between l and r, with the payoffs drawn from (1,2), (3,1), and
(1,3).]
[Figure 9.5: Player 1 chooses L or R. After L, Player 2 chooses between l1, yielding
(1,2), and r1, yielding (2,1). After R, Player 2 chooses between l2, yielding (0,3), and
r2, after which Player 1 chooses between l, yielding (2,2), and r, yielding (1,4).]
[Figure 9.6: the same game tree as in Figure 9.5.]
1\2     l1l2    l1r2    r1l2    r1r2
Ll      1, 2    1, 2    2, 1    2, 1
Lr      1, 2    1, 2    2, 1    2, 1
Rl      0, 3    2, 2    0, 3    2, 2
Rr      0, 3    1, 4    0, 3    1, 4
(c) Find all the rationalizable strategies in this game –use the normal form.
State the rationality/knowledge assumptions necessary for each elimination.
1\2     l1l2    l1r2    r1l2
Ll      1, 2    1, 2    2, 1
Lr      1, 2    1, 2    2, 1
Rl      0, 3    2, 2    0, 3
(If you found the pure-strategy equilibria (namely, (Ll, l1l2) and (Lr, l1l2)), you
will get most of the points.)
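These equilibria can be confirmed by brute force on the normal form (the payoff entries are transcribed from the table above; the strategy labels follow my reading of Figure 9.5):

```python
rows = ["Ll", "Lr", "Rl", "Rr"]
cols = ["l1l2", "l1r2", "r1l2", "r1r2"]
U = {  # (u1, u2) for each cell
    ("Ll", "l1l2"): (1, 2), ("Ll", "l1r2"): (1, 2), ("Ll", "r1l2"): (2, 1), ("Ll", "r1r2"): (2, 1),
    ("Lr", "l1l2"): (1, 2), ("Lr", "l1r2"): (1, 2), ("Lr", "r1l2"): (2, 1), ("Lr", "r1r2"): (2, 1),
    ("Rl", "l1l2"): (0, 3), ("Rl", "l1r2"): (2, 2), ("Rl", "r1l2"): (0, 3), ("Rl", "r1r2"): (2, 2),
    ("Rr", "l1l2"): (0, 3), ("Rr", "l1r2"): (1, 4), ("Rr", "r1l2"): (0, 3), ("Rr", "r1r2"): (1, 4),
}
# A cell is a pure Nash equilibrium iff each player best-responds to the other.
ne = [(r, c) for r in rows for c in cols
      if U[r, c][0] == max(U[rr, c][0] for rr in rows)
      and U[r, c][1] == max(U[r, cc][1] for cc in cols)]
print(ne)   # [('Ll', 'l1l2'), ('Lr', 'l1l2')]
```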
Assuming that 23 22 , use backward induction to compute a Nash equi-
librium of this game. (Note that Alice and Bob are the only players here because
the actions of the committee members are fixed already.) [Hint: Bob chooses not
to contribute when he is indifferent between contributing and not contributing at
all.]
Solution: Given any contribution vector (a1, a2, a3) by Alice, for each member i, write
p(i) = ai + λi for the "price" of member i for Bob. If the total price of the cheapest
two members exceeds v (i.e., Σj p(j) − maxj p(j) ≥ v), then Bob needs to pay at least
v to stop the bill, in which case he contributes 0 to each member. If the total
price of the cheapest two members is lower than v, then the only best response
for Bob is to pay exactly the cheapest two members their prices and pay nothing
to the remaining member, stopping the bill, which would otherwise have cost him v. In
sum, Bob's strategy is given by

bi∗(a1, a2, a3) = ai + λi   if Σj p(j) − maxj p(j) < v and i ≠ i∗,
bi∗(a1, a2, a3) = 0          otherwise,

where i∗ is the most expensive member, who is chosen randomly when there is
a tie.³
Answer: First consider the case v ≤ 2λ3. Then, Alice chooses a contribution vector
(a1, a2, 0) such that a1 + a2 + λ1 + λ2 = v, a1 + λ1 ≤ λ3, and a2 + λ2 ≤ λ3. Such
a vector is feasible because v ≤ 2λ3 and λ3 ≥ λ2 ≥ λ1 ≥ 0. The optimality of this
contribution is as before.
Now consider the case v > 2λ3. Now, Alice must contribute to all members in
order to pass the bill, and optimality requires that the prices of all members
be v/2 (as Bob buys the cheapest two). That is, she must contribute ai = v/2 − λi to
each member i. Since this costs Alice 3v/2 − (λ1 + λ2 + λ3), she makes such a
contribution to pass the bill if and only if 3v/2 ≤ w + (λ1 + λ2 + λ3), where w is her
value from passing the bill. Otherwise, she contributes (0, 0, 0) and the bill fails.
9.7 Exercises
1. In the Stackelberg duopoly example, for every q_1 ∈ (0, 1), find a Nash equilibrium in
which Firm 1 plays q_1.
[Game trees and payoff tables for the subsequent exercises; garbled in extraction.]
Figure 9.7: [game tree with moves L, M, R and responses; labels and payoffs garbled in extraction]
5. [Homework 2, 2002] Three gangsters armed with pistols, Al, Bob, and Curly, are
in a room with a suitcase of money. Al, Bob, and Curly have 20%, 40% and
70% chances of killing their target, respectively. Each has one bullet. First Al
shoots, targeting one of the other two gangsters. After Al, if alive, Bob shoots,
targeting one of the surviving gangsters. Finally, if alive, Curly shoots, targeting
again one of the surviving gangsters. The survivors split the money equally. Find
a subgame-perfect equilibrium.
6. [Midterm 1 Make Up, 2001] Find all pure-strategy Nash equilibria in Figure 9.8.
Which of these equilibria can be obtained by backward induction?
7. [Final Make up, 2000] Find the subgame-perfect equilibrium of the following 2-
person game. First, Player 1 picks an integer n_0 with 1 ≤ n_0 ≤ 10. Then, Player 2
picks an integer n_1 with n_0 + 1 ≤ n_1 ≤ n_0 + 10. Then, Player 1 picks an integer n_2
with n_1 + 1 ≤ n_2 ≤ n_1 + 10. In this fashion, they pick integers alternately. At
each turn, the player to move picks an integer by adding an integer between 1 and
10 to the number picked by the other player last time. Whoever picks 100 wins
the game and gets 100; the other loses the game and gets zero.
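The winning strategy in this exercise can be found with a small backward-induction (dynamic-programming) script; `win(k)` below asks whether the player about to move, with the count currently at k, can force a win:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def win(k):
    """True if the player to move (count currently at k) can force picking 100."""
    return any(m == 100 or not win(m) for m in range(k + 1, min(k + 10, 100) + 1))
```

The losing counts turn out to be 1, 12, 23, ..., 89 (those congruent to 1 mod 11), so the first player wins by picking 1 and then always restoring a count of the form 11k + 1.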
Figure 9.8: [game tree: Players 1 and 2 alternate between L and R; payoffs include (0,0), (1,3), (2,2), (−1,−1), (4,2), (3,3); structure lost in extraction]
Figure 9.9: [game tree: Player 1 chooses O or I, followed by choices of L and R; payoffs include (3,1), (0,0), (0,0), (1,3); structure lost in extraction]
(d) Which strategies are consistent with all of the following assumptions?
(i) Player 1 is rational.
(ii) Player 2 is sequentially rational.
(iii) At the node where she moves, Player 2 knows (i).
(iv) Player 1 knows (ii) and (iii).
9. [Final 2004] Use backward induction to find a Nash equilibrium for the following
game, which is a simplified version of a game called Weakest Link. There are 4
risk-neutral contestants, 1, 2, 3, and 4, with "values" v_1, ..., v_4, where v_1 > v_2 >
v_3 > v_4 > 0. The game has 3 rounds. At each round, an outside party adds the value
of each "surviving" contestant to a common account,⁴ and at the end of the third
round one of the contestants wins and gets the amount collected in the common
account. We call a contestant surviving at a round if he was not eliminated at
a previous round. At the end of rounds 1 and 2, the surviving contestants vote
out one of the contestants. The contestants vote sequentially in the order of their
indices (i.e., 1 votes before 2, 2 votes before 3, and so on), observing the previous
votes. The contestant who gets the most votes is eliminated; ties are broken
at random. At the end of the third round, contestant i wins the contest with
probability v_i/(v_i + v_j), where i and j are the surviving contestants at the third
round. (Be sure to specify which player will be eliminated for each combination of
surviving contestants, but you need not necessarily specify how every contestant
will vote at all contingencies.)
a lobbyist named Alice, stands to gain W from the passage of the bill, and the
file-sharing industry, represented by a lobbyist named Bob, stands to lose V from
the passage of the bill, where V, W > 0. Consider the following game.
Use backward induction to compute a Nash equilibrium of this game. (Note that
Alice and Bob are the only players here because the actions of the committee
members are fixed already.)
Chapter 10
Application: Negotiation
Negotiation is an essential aspect of social and economic interaction. The states negoti-
ate their borders with their neighbors; the legislators negotiate the laws that they make;
defendants negotiate a settlement with the prosecutors or the plaintiffs in the courts;
workers negotiate their salaries with their employers; the families negotiate their spend-
ing and maintenance of the household with each other, and even some students try to
negotiate their grades with their professor. Despite its central importance, negotiations
were presumed to be outside of the purview of economic analysis until the emergence of
game theory. Today there are many game theoretical models of bargaining. These notes
apply backward induction to three important bargaining games. The first one considers
congressional bargaining. It abstracts away from the back-room deals that lead to the
proposed bills and focuses on the way legislators vote among various alternatives. The
second model considers pretrial negotiation in law. The third one is a general model of
bargaining that can be applied to many different settings in economics.
[Figure 10.1: the binary agenda over alternatives x_0, x_1, x_2 described in the example below; tree lost in extraction.]
final bill. In the final vote, the final bill, which may not be the original one, passes, or
fails, in which case the status quo prevails. For example, if there is a bill, an amend-
ment, and the status quo, first they vote between the bill and the amendment, then they
vote between the winner of the previous vote and the status quo. These rules and the
available proposals lead to a "binary" agenda; it is binary because in any session two
alternatives are voted against each other.
Let {1, 2, ..., 2n + 1} be the set of players and {x_0, x_1, ..., x_k} be the set of alternatives.
Each player has a strict preference ordering over the set of alternatives. There is a fixed
binary agenda, and assume that all of these are commonly known.

To solve this game, we start from a last vote (a vote after which there is no further
voting). We assume that each player votes according to his preference. The alternative
that gets n + 1 or more votes wins. We then truncate the game by replacing the vote
with the winning alternative. We proceed in this way until there is only one alternative.
For example, consider three players, namely 1, 2, and 3, and three alternatives,
namely x_0, x_1, and x_2. The agenda is as in Figure 10.1. According to the agenda,
x_0 and x_1 are voted against each other first; the winner is voted against x_2 next. If
the winner defeats x_2 as well, then it is implemented; otherwise x_2 (the winner of the
second vote) is voted against the loser of the first vote, and the winner of this vote is
implemented.
10.1. CONGRESSIONAL BARGAINING–VOTING WITH A BINARY AGENDA
Player 1: x_0 ≻ x_1 ≻ x_2
Player 2: x_2 ≻ x_0 ≻ x_1
Player 3: x_1 ≻ x_2 ≻ x_0
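The truncation procedure can be scripted. A minimal sketch (helper names are illustrative), encoding the agenda described above (x_0 versus x_1 first, the winner against x_2, and, if x_2 wins that vote, x_2 against the loser of the first vote) together with the preference profile in the table:

```python
def outcome(node, prefs):
    """Backward-induction (sophisticated-voting) outcome of a binary agenda.
    A node is an alternative name, or (a, b, node_if_a_wins, node_if_b_wins);
    voters compare the eventual outcomes of the two continuations."""
    if isinstance(node, str):
        return node
    a, b, node_a, node_b = node
    oa, ob = outcome(node_a, prefs), outcome(node_b, prefs)
    votes_for_oa = sum(1 for p in prefs if p.index(oa) < p.index(ob))
    return oa if 2 * votes_for_oa > len(prefs) else ob

prefs = [["x0", "x1", "x2"],   # Player 1
         ["x2", "x0", "x1"],   # Player 2
         ["x1", "x2", "x0"]]   # Player 3

agenda = ("x0", "x1",
          ("x0", "x2", "x0", ("x2", "x1", "x2", "x1")),
          ("x1", "x2", "x1", ("x2", "x0", "x2", "x0")))
```

Here the sophisticated outcome is x_0, even though the profile is a Condorcet cycle (x_0 beats x_1, x_2 beats x_0, and x_1 beats x_2 in pairwise majority votes).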
But poison pills and killer amendments are frequently introduced, and sometimes they
defeat the original bill (and eventually are defeated by the status quo). A famous example
of this is the DePew amendment to the "17th amendment to the constitution" in 1912.
Here, the 17th amendment, x_1, requires that senators be elected by statewide
popular vote. This bill was supported by the (Southern) Democrats and half of the
Republicans, making up two thirds of the Congress. The DePew amendment, x_2,
required that these elections be monitored by the federal government. Each Republican
slightly prefers x_2 to x_1, so the proponent Republicans' ordering is x_2 ≻ x_1 ≻ x_0 and
the opposing Republicans' ordering is x_0 ≻ x_2 ≻ x_1, where x_0 is the status quo. But
the federal oversight of the state elections was unacceptable to the Southern Democrats
for obvious reasons: x_1 ≻ x_0 ≻ x_2. Notice that the "opposing Republicans" and the Democrats,
who together make up about two thirds of the legislators, prefer the status quo to the DePew
amendment. Hence, the DePew amendment is a killer amendment. According to our
analysis, it should be defeated in the first round, and the original bill, the 17th amendment,
should eventually pass. But this did not happen. The DePew amendment killed
the bill.
Why does this happen? It would be too naive to think that a legislator is so myopic
that he cannot see one step ahead and fails to recognize a killer amendment. Sometimes,
legislators might not know the preferences of the other legislators. After all, these
preferences are elicited in these elections. In that case, the backward induction analysis
above is not valid and needs to be modified. Of course, in that case, an amendment may
defeat the bill (because of the proponents who think that it has enough support for an
eventual passage) but later be defeated in the final vote because of the lack of sufficient
support (which was not known in the first vote). But mostly, the killer amendments
are introduced intentionally, and the legislators have a clear idea about the preferences.
Even in that case, a killer amendment can pass, not because of the stupidity of the
proponents of the original bill, but because their votes against the amendment can be
exploited by their opponents in the upcoming elections when the voters are not informed
about the details of these bills.
The moral of the story is that it is not enough that your analysis is correct. You
must also be analyzing the correct game. You will learn the first task in the Game
Theory class; for the second, and more important, task of considering the correct game,
you need to look at the underlying facts of the situation.
10.2 Pre-Trial Negotiations

• At each date t ∈ {1, 3, ..., 2n − 1}, if they have not yet settled, the Plaintiff offers
a settlement s_t, and the Defendant decides whether to accept or reject it. If she accepts,
the game ends with the Defendant paying s_t to the Plaintiff; the game continues otherwise.

• At each date t ∈ {2, 4, ..., 2n}, if they have not yet settled, the Defendant offers a
settlement s_t, and the Plaintiff decides whether to accept the offer, ending the game with
the Defendant paying s_t to the Plaintiff, or to reject it and continue.

• If they do not reach an agreement at the end of period t = 2n, they go to court,
and the Judge orders the Defendant to pay J to the Plaintiff.

The Plaintiff pays his lawyer c_P for each day they negotiate and an extra C_P if they
go to court. Similarly, the Defendant pays her lawyer c_D for each day they negotiate
and an extra C_D if they go to court. Each party tries to maximize the expected amount
of money he or she has at the end of the game.
The backward-induction analysis of the game is as follows. The payoff from going to
court for the Plaintiff is

J − C_P − 2n c_P.

If he accepts the settlement offer s_{2n} of the Defendant at date 2n, his payoff will be

s_{2n} − 2n c_P.

Hence, if s_{2n} > J − C_P, he must accept the offer, and if s_{2n} < J − C_P, he must reject
the offer. If s_{2n} = J − C_P, he is indifferent between accepting and rejecting the offer.
Assume that he accepts that offer, too.¹ To sum up, he accepts an offer s_{2n} if and only
if s_{2n} ≥ J − C_P.
What should the Defendant offer at date 2n? Given the behavior of the Plaintiff, her
payoff from offering s_{2n} is

U_D(s_{2n}) = { −s_{2n} − 2n c_D      if s_{2n} ≥ J − C_P,
             { −J − C_D − 2n c_D     if s_{2n} < J − C_P.

This is because, if the offer is rejected, they will go to court. Notice that when s_{2n} =
J − C_P, her payoff is −J + C_P − 2n c_D, and offering anything less would cause her to
lose C_P + C_D. Her payoff is plotted in Figure 10.2. Therefore, the Defendant offers

s_{2n} = J − C_P

at date 2n.
Now at date 2n − 1, the Plaintiff offers a settlement s_{2n−1} and the Defendant accepts
or rejects the offer. If she rejects the offer, she will get the payoff from settling for
s_{2n} = J − C_P at date 2n, which is

−J + C_P − 2n c_D.

If she accepts the offer, she will get

−s_{2n−1} − (2n − 1) c_D.

¹ In fact, he must accept s_{2n} = J − C_P in equilibrium. For, if he did not accept it, the best response
of the Defendant would be empty, inconsistent with an equilibrium. (Any offer s_{2n} = J − C_P + ε with
ε > 0 will be accepted. But for any offer s_{2n} = J − C_P + ε, there is a better offer s_{2n} = J − C_P + ε/2,
which will also be accepted.)
Figure 10.2: Payoff of the Defendant from her offer s at the last period. [Graph lost in extraction: the payoff is −s − 2n c_D for s ≥ J − C_P and drops by C_P + C_D below s = J − C_P.]
Hence, she will accept the offer if and only if the last expression is greater than or equal
to the previous one, i.e.,

s_{2n−1} ≤ J − C_P + c_D.

Then, the Plaintiff will offer the highest acceptable settlement (to the Defendant):

s_{2n−1} = J − C_P + c_D.

In summary, since the Plaintiff is making an offer, he offers the settlement amount of the
next date plus the cost of negotiating one more day for the Defendant.
Let us apply backward induction one more step. At date 2n − 2, the Defendant
offers a settlement s_{2n−2} and the Plaintiff accepts or rejects the offer. If he rejects the
offer, he will get the payoff from settling for s_{2n−1} = J − C_P + c_D at date 2n − 1, which
is

s_{2n−1} − (2n − 1) c_P = J − C_P + c_D − (2n − 1) c_P.

If he accepts the offer, he will get

s_{2n−2} − (2n − 2) c_P.

Hence, he will accept the offer if and only if the last expression is greater than or equal
to the previous one, i.e.,

s_{2n−2} ≥ s_{2n−1} − c_P = J − C_P + c_D − c_P.
Then, the Defendant offers the highest acceptable settlement (to the Plaintiff):

s_{2n−2} = s_{2n−1} − c_P = J − C_P + c_D − c_P.

In summary, since the Defendant is making an offer, she offers the settlement amount
of the next date minus the cost of negotiating one more day for the Plaintiff.
Now the pattern is clear. At any odd date t, the Defendant accepts an offer if and
only if s_t ≤ s_{t+1} + c_D, and the Plaintiff offers

s_t = s_{t+1} + c_D    (t is odd).

At any even date t, the Plaintiff accepts an offer if and only if s_t ≥ s_{t+1} − c_P, and the
Defendant offers

s_t = s_{t+1} − c_P    (t is even).

The solution to the above difference equation is

s_t = { J − C_P + (n − t/2)(c_D − c_P)              if t is even,
      { J − C_P + (n − (t + 1)/2)(c_D − c_P) + c_D   if t is odd.
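The difference equation and its closed-form solution can be cross-checked numerically, using the reconstructed symbols J, C_P, c_P, c_D (illustrative names):

```python
def settlement_path(n, J, CP, cP, cD):
    """Backward recursion: s[2n] = J - CP; the Plaintiff (odd t) asks for
    s[t+1] + cD, and the Defendant (even t) offers s[t+1] - cP."""
    s = {2 * n: J - CP}
    for t in range(2 * n - 1, 0, -1):
        s[t] = s[t + 1] + cD if t % 2 == 1 else s[t + 1] - cP
    return s

def closed_form(t, n, J, CP, cP, cD):
    """The piecewise solution of the difference equation given in the text."""
    if t % 2 == 0:
        return J - CP + (n - t // 2) * (cD - cP)
    return J - CP + (n - (t + 1) // 2) * (cD - cP) + cD
```

For example, with n = 5, J = 100, C_P = 10, c_P = 3, c_D = 2, the recursion and the closed form agree at every date, with s_{2n} = 90.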
Recall from the lecture that the solution is substantially different if the order of the
proposers is changed (see the slides). This is because on the last day, the cost of delaying
the agreement is quite high (the cost of going to court), and the party who accepts or
rejects the offer is willing to accept a wide range of offers. Hence, the last proposer has
a great advantage.
Player 2 accepts or rejects the offer. If he accepts the offer, the offer is implemented,
yielding payoffs (x_1, y_1). If he rejects the offer, then they wait until the next day, when
Player 2 makes an offer (x_2, y_2) ∈ Δ. Now, knowing what Player 2 has offered, Player
1 accepts or rejects the offer. If Player 1 accepts the offer, the offer is implemented,
yielding payoffs (δx_2, δy_2). If Player 1 rejects the offer, then the game ends, and they
lose the dollar and get payoffs (0, 0).
The backward induction analysis of this simplified model is as follows. On the second
day, if Player 1 rejects the offer, he gets 0. Hence, he accepts any offer that gives him
more than 0, and he is indifferent between accepting and rejecting any offer that gives
him 0. As we have seen in the previous section, he accepts the offer (0,1) in equilibrium.
Then, on the second day, Player 2 would offer (0,1), which is the best Player 2 can get.
Therefore, if they do not agree on the first day, then Player 2 takes the entire dollar on
the second day, leaving Player 1 nothing. The value of taking the dollar on the next
day for Player 2 is . Hence, on the first day, Player 2 accepts any offer that gives him
more than , rejects any offer that gives him less than , and he is indifferent between
accepting and rejecting any offer that gives him . As above, assume that Player 2
accepts the offer (1 − ). In that case, Player 1 offers (1 − ), which is accepted.
Could Player 1 receive more than 1 − ? If he offered anything that is better than 1 −
for himself, his offer would necessarily give less than to Player 2, and Player 2 would
reject the offer. In that case, the negotiations would continue to the next day and he
would receive 0, which is clearly worse than 1 − .
Now, consider the game in which the game above is repeated n times. That is, if
they have not yet reached an agreement by the end of the second day, on the third day,
Player 1 makes an offer (x_3, y_3) ∈ Δ. Then, knowing what has been offered, Player 2
accepts or rejects the offer. If he accepts the offer, the offer is implemented, yielding
payoffs (δ²x_3, δ²y_3). If he rejects the offer, then they wait until the next day, when
Player 2 makes an offer (x_4, y_4) ∈ Δ. Now, knowing what Player 2 has offered, Player
1 accepts or rejects the offer. If Player 1 accepts the offer, the offer is implemented,
yielding payoffs (δ³x_4, δ³y_4). If Player 1 rejects the offer, then they go to the 5th day...
And this goes on like this until the end of day 2n. If they have not yet agreed at the
end of that day, the game ends, they lose the dollar, and get payoffs (0, 0).
We can prove that this is indeed the equilibrium given by backward induction using mathematical
induction on k. (That is, we first prove that it is true for k = 0; then, assuming
that it is true for some k − 1, we prove that it is true for k.)
Proof. Note that for k = 0, we have the last two periods, identical to the 2-period
example we analyzed above. Letting k = 0, we can easily check that the behavior
described here is the same as the equilibrium behavior in the 2-period game. Now,
assume that, for some k − 1, the equilibrium is as described above. That is, at the
beginning of date t + 1 := 2n − 2(k − 1) − 1 = 2n − 2k + 1, Player 1 offers

(x_{t+1}, y_{t+1}) = ( (1 − δ^{2(k−1)+2})/(1 + δ), δ(1 + δ^{2(k−1)+1})/(1 + δ) )
                  = ( (1 − δ^{2k})/(1 + δ), δ(1 + δ^{2k−1})/(1 + δ) );
and his offer is accepted. At date t = 2n − 2k, Player 1 accepts an offer iff the offer
is at least as good as having (1 − δ^{2k})/(1 + δ) the next day, which is worth
δ(1 − δ^{2k})/(1 + δ). Therefore, he accepts an offer (x, y) iff

x ≥ δ(1 − δ^{2k})/(1 + δ);
10.4. EXERCISES WITH SOLUTIONS
as in the strategy profile above. In that case, the best Player 2 can do is to offer

(x_t, y_t) = ( δ(1 − δ^{2k})/(1 + δ), 1 − δ(1 − δ^{2k})/(1 + δ) )
           = ( δ(1 − δ^{2k})/(1 + δ), (1 + δ^{2k+1})/(1 + δ) ).

This is because any offer that gives Player 2 more than y_t will be rejected, in which case
Player 2 will get

δ y_{t+1} = δ²(1 + δ^{2k−1})/(1 + δ).
In summary, at t, Player 2 offers (x_t, y_t), and it is accepted. Consequently, at t − 1,
Player 2 accepts an offer (x, y) iff

y ≥ δ y_t = δ(1 + δ^{2k+1})/(1 + δ).

In that case, at t − 1, Player 1 offers

(x_{t−1}, y_{t−1}) ≡ (1 − δ y_t, δ y_t) = ( (1 − δ^{2k+2})/(1 + δ), δ(1 + δ^{2k+1})/(1 + δ) ),

completing the induction.
2. The selected player makes an offer (x, y) ∈ [0, 1]² such that x + y ≤ 1. Knowing
what has been offered, the other player accepts or rejects the offer. If the offer
(x, y) is accepted, the game ends, yielding payoff vector (δ^t x, δ^t y). If the offer
is rejected, we proceed to the next date, when the same procedure is repeated,
except for t = n − 1, after which the game ends, yielding (0, 0). The coin tosses
at different dates are stochastically independent. And everything described up to
here is common knowledge.
(a) Compute the subgame-perfect equilibrium for n = 1. What is the value
of playing this game for a player? (That is, compute the expected utility of
each player before the coin toss, given that they will play the subgame-perfect
equilibrium.)

(b) Compute the subgame-perfect equilibrium for n = 2. Compute the expected
utility of each player before the first coin toss, given that they will play the
subgame-perfect equilibrium.
Solution: In equilibrium, on the last day, they will act as in part (a). Hence,
on the first day, if a player rejects the offer, the expected payoff of each player
will be δ · (1/2) = δ/2. Thus, he will accept an offer if and only if it gives
him at least δ/2. Therefore, the selected player offers δ/2 to his opponent,
keeping 1 − δ/2 for himself, which is more than δ/2, his expected payoff if his
offer is rejected. Therefore, in any subgame-perfect equilibrium, the outcome
is (1 − δ/2, δ/2) if the coin comes up Heads, and (δ/2, 1 − δ/2) if it comes up Tails. The
Solution: Part (b) suggests that, if the expected payoff of each player at the
beginning of date t + 1 is δ^{t+1}/2, then the expected payoff of each player at the
beginning of date t will be δ^t/2. [Note that in terms of date-t dollars these numbers
correspond to δ/2 and 1/2, respectively.] Therefore, the equilibrium is as follows:
At any date t < n − 1, the selected player offers δ/2 to his opponent, keeping
1 − δ/2 for himself, and his opponent accepts an offer iff he gets at least δ/2;
and at date n − 1, a player accepts any offer, hence the selected player offers
0 to his opponent, keeping 1 for himself. [You should be able to prove this
using mathematical induction and the argument in part (b).]
2. [Midterm 1, 2002] Consider two players A and B, who own a firm and want to
dissolve their partnership. Each owns half of the firm. The values of the firm for
players A and B are v_A and v_B, respectively, where v_A > v_B > 0. Player A sets a
price p for half of the firm. Player B then decides whether to sell his share or to
buy A's share at this price p. If B decides to sell his share, then A owns the firm
and pays p to B, yielding payoffs v_A − p and p for players A and B, respectively.
If B decides to buy, then B owns the firm and pays p to A, yielding payoffs p
and v_B − p for players A and B, respectively. All these are common knowledge.
Applying backward induction, find a Nash equilibrium of this game.
Figure 10.3: [Player A's payoff U_A(p) as a function of the price p; graph lost in extraction]
which can be depicted as in Figure 10.3. Then, no price could maximize the payoff
of A, inconsistent with equilibrium (where A maximizes his payoff given what he
anticipates). Hence, the equilibrium strategy of B must be

buy   if p < v_B/2;
sell  if p ≥ v_B/2.
3. [Midterm 1, 2006] Paul has lost his left arm due to complications in a surgery. He
is suing the Doctor.

• The court date is set at date 2n + 1. It is known that if they go to court, the
judge will order the Doctor to pay J > 0 to Paul.
• They negotiate for a settlement before the court date. At each date t ∈ {1, 3, ..., 2n − 1},
if they have not yet settled, Paul offers a settlement s_t, and the Doctor decides
whether to accept or reject it. If she accepts, the game ends with the Doctor
paying s_t to Paul; the game continues otherwise. At dates t ∈ {2, 4, ..., 2n}, the
Figure 10.4: [U_A(p) as a function of p; graph lost in extraction]
Doctor offers a settlement s_t, and Paul decides whether to accept the offer,
ending the game with the Doctor paying s_t to Paul, or to reject it and continue.

• Paul pays his lawyer only a share of the money he gets from the Doctor. He
pays (1 − γ) s_t if they settle at date t, and (1 − γ) J if they go to court, where
0 < γ < 1. The Doctor pays her lawyer c for each day they negotiate
and an extra C if they go to court.

• Each party tries to maximize the expected amount of money he or she has at
the end of the game.
(a) (10 pts) For n = 2, apply backward induction to find an equilibrium of this
game. (If you answer part (b) correctly, you don't need to answer this part.)

(b) (15 pts) For any n, apply backward induction to find an equilibrium of this
game.
Answer: At date 2n + 1, Paul gets J from the Doctor and pays (1 − γ) J to his
lawyer, netting γJ. Now at date 2n, if he accepts s_{2n}, he will pay (1 − γ) s_{2n}
to his lawyer, receiving γ s_{2n}. Hence, he will accept s_{2n} iff s_{2n} ≥ J. The
Doctor will offer

s_{2n} = J,

as she would pay J to Paul the next day and an extra C to her lawyer. Paul
will then offer

s_{2n−1} = J + c,

and the Doctor will offer

s_{2n−2} = s_{2n−1}.

The pattern is now clear. At any odd date t, the Doctor accepts an offer iff
s_t ≤ s_{t+1} + c, and Paul offers

s_t = s_{t+1} + c    (t is odd).

At any even date t, Paul accepts an offer iff s_t ≥ s_{t+1}, and the Doctor offers

s_t = s_{t+1}    (t is even).
This much is more or less enough for an answer. To be complete, note that
the solution to the above equations is

s_t = { J + ((2n + 1 − t)/2) c   if t is odd,
      { J + ((2n − t)/2) c       if t is even.
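Again, the recursion is easy to check numerically, with J and c as reconstructed above (helper names are illustrative):

```python
def pauls_settlements(n, J, c):
    """s[2n] = J; Paul (odd t) asks for s[t+1] + c, the Doctor (even t) offers s[t+1]."""
    s = {2 * n: J}
    for t in range(2 * n - 1, 0, -1):
        s[t] = s[t + 1] + c if t % 2 == 1 else s[t + 1]
    return s

def closed_form(t, n, J, c):
    """The piecewise solution given in the text (integer dates)."""
    return J + (2 * n + 1 - t) // 2 * c if t % 2 == 1 else J + (2 * n - t) // 2 * c
```

For example, with n = 3, J = 100, c = 4, the recursion gives s_6 = 100, s_5 = s_4 = 104, s_3 = s_2 = 108, s_1 = 112, matching the closed form at every date.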
10.5 Exercises
1. [Final, 2000] Consider a legal case where a plaintiff files a suit against a defendant.
It is common knowledge that, if they go to court, the defendant will have to pay
$1,000,000 to the plaintiff, and $100,000 to the court. The court date is set 10 days
from now. Before the court date, the plaintiff and the defendant can settle privately,
in which case they do not go to court. Until the case is settled (whether in
court or privately), for each day, the plaintiff and the defendant pay $2,000
and $1,000, respectively, to their legal teams. To avoid all these costs, the plaintiff and
the defendant negotiate in the following way. On the first day, the plaintiff demands an
amount of money for the settlement. If the defendant accepts, then he pays the
amount and they settle. If he rejects, then he offers a new amount. If the plaintiff
accepts the offer, they settle for that amount; otherwise the next day the plaintiff
demands a new amount; and they make offers alternately in this fashion until
the court day. Players are risk-neutral and do not discount the future. Applying
backward induction, find a Nash equilibrium.
2. We have a Plaintiff and a Defendant, who is liable for a damage to the Plaintiff.
If they go to court, then with probability 0.1 the Plaintiff will win and get a
compensation of amount $100,000 from the Defendant; if he does not win, there
will be no compensation. Going to court is costly: if they go to court, each of the
Plaintiff and Defendant will pay $20,000 for the legal costs, independent of the
outcome in the court. Both the Plaintiff and the Defendant are risk-neutral, i.e.,
each maximizes the expected value of his wealth.
(a) Consider the following scenario: The Plaintiff first decides whether or not to
sue the defendant, by filing a case and paying a non-refundable filing fee of
$100. If he does not sue, the game ends and each gets 0. If he sues, then
he is to decide whether or not to offer a settlement of amount $25,000. If
he offers a settlement, then the Defendant either accepts the offer, in which
case the Defendant pays the settlement amount to the Plaintiff, or rejects
the offer. If the Defendant rejects the offer, or the Plaintiff does not offer a
settlement, the Plaintiff can either pursue the suit and go to court, or drop
the suit. Applying backward induction, find a Nash equilibrium.
(b) Now imagine that the Plaintiff has already paid his lawyer $20,000 for the
legal costs, and the lawyer is to keep the money if they do not go to court.
That is, independent of whether or not they go to court, the Plaintiff pays the
$20,000 of legal costs. Applying backward induction, find a Nash equilibrium
under the new scenario.
(a) Apply backward induction to find an equilibrium of this game. (Assume that
the Contestant accepts the offer whenever she is indifferent between accepting
or rejecting the offer. Solving the special case in part b first may be helpful.)
4. [Midterm 1, 2007] [Read the bonus note at the end before you answer
the question.] This question is about arbitration, a common dispute resolution
method in the US. We have a Worker, an Employer, and an Arbitrator. They
want to set the wage w. If they determine the wage at date t, the payoffs of the
Worker, the Employer, and the Arbitrator will be δ^t w, δ^t (1 − w), and δ^t (1 − w),
respectively, where δ ∈ (0, 1). The timeline is as follows:
• At t = 0,
• at t = 1,
Chapter 11

Subgame-Perfect Nash Equilibrium

Backward induction is a powerful solution concept with some intuitive appeal. Unfortunately, it can be applied only to perfect-information games with a finite horizon. Its
intuition, however, can be extended beyond these games through subgame perfection.
This chapter defines the concept of subgame-perfect equilibrium and illustrates how one
can check whether a strategy profile is a subgame perfect equilibrium.
• all the moves and information sets from that node on must remain in the subgame.
174 CHAPTER 11. SUBGAME-PERFECT NASH EQUILIBRIUM
Figure 11.1: [a centipede game with three decision nodes (Players 1, 2, 1), ending with payoff (2,5); diagram lost in extraction]
Consider, for instance, the centipede game in Figure 11.1, where the equilibrium is
drawn in thick lines. This game has three subgames. One of them is:
[diagram: the one-node subgame in which Player 1 chooses between payoffs (3,3) and (2,5); lost in extraction]

The second one is:

[diagram: the two-node subgame in which Player 2 moves, then Player 1, with payoffs (0,4), (3,3), and (2,5); lost in extraction]
The third subgame is the game itself. Note that, in each subgame, the equilibrium
computed via backward induction remains an equilibrium of the subgame.
Any subgame other than the entire game itself is called proper.
11.1. DEFINITION AND EXAMPLES
Now consider the matching-pennies game with perfect information in Figure 3.4. This
game has three subgames: one after Player 1 chooses Head, one after Player 1 chooses
Tail, and the game itself. Again, the equilibrium computed through backward induction
is a Nash equilibrium at each subgame.
[Figure 11.2: Player 1 first chooses between E and X, where X yields payoff (2,6); after E, Players 1 and 2 play a simultaneous-move game in which Player 1 chooses T or B and Player 2 chooses L or R. Payoffs lost in extraction.]
Now consider the game in Figure 11.2. One cannot apply backward induction in
this game because it is not a perfect-information game. One can compute the subgame-perfect equilibrium, however. This game has two subgames: one starts after Player 1
plays E; the second one is the game itself. The subgame-perfect equilibria are computed
as follows. First compute a Nash equilibrium of the subgame, then fixing the equilibrium
actions as they are (in this subgame), and taking the equilibrium payoffs in this subgame
as the payoffs for entering the subgame, compute a Nash equilibrium in the remaining
game.
The subgame has only one Nash equilibrium, as T dominates B and L dominates
R. In the unique Nash equilibrium, Player 1 plays T and Player 2 plays L, yielding the
Figure 11.3: Equilibrium in the subgame (Player 1 plays T, Player 2 plays L). The strategies are in thicker arrows. [diagram lost in extraction]
payoff vector (3,2), as illustrated in Figure 11.3. Given this, the game reduces to
[reduced game: Player 1 chooses between E, yielding (3,2), and X, yielding (2,6); diagram lost in extraction]
• Assign the payoff vector associated with this equilibrium to the starting node, and
eliminate the subgame.
• Iterate this procedure until a move is assigned at every contingency, when there
remains no subgame to eliminate.
As in backward induction, when there are multiple equilibria in the picked subgame,
one can choose any of the Nash equilibria, including ones in mixed strategies. Every
choice of equilibrium leads to a different subgame-perfect Nash equilibrium in the original
game. By varying the Nash equilibrium for the subgames at hand, one can compute all
subgame perfect Nash equilibria.
A subgame-perfect Nash equilibrium is a Nash equilibrium because the entire game
is also a subgame. The converse is not true. There can be a Nash Equilibrium that is not
subgame-perfect. For example, the above game has the following equilibrium: Player 1
plays X in the beginning, and they would have played (B, R) in the proper subgame, as
illustrated in Figure 11.5. You should be able to check that this is a Nash equilibrium.
But it is not subgame perfect: Player 2 plays a strictly dominated strategy in the proper
subgame.
Figure 11.5: [the Nash equilibrium in which Player 1 plays X; diagram lost in extraction]
the game. There is, however, a simple technique that can be used to check whether
a strategy profile is subgame-perfect in most games. The technique is called the single-deviation principle.
I will first describe the class of games for which it applies. In a game there may be
histories where all the previous actions are known but the players may move simultane-
ously. Such histories are called stages. For example, suppose that every day players play
the battle of the sexes, knowing what each player has played in each previous day. In
that case, at each day, after any history of play in the previous days, we have a stage at
which players move simultaneously, and a new subgame starts. Likewise, in Figure 11.2,
there are two stages. The first stage is where Player 1 chooses between E and X, and
the second stage is when they simultaneously play the 2x2 game. It is not a coincidence
that there are two subgames because each stage is the beginning of a subgame.
For another example, consider alternating-offer bargaining. At each round, at the
beginning of the round, the proposer knows all the previous offers, which have all been
rejected, and makes an offer. Hence, at the beginning we have a stage, where only the
proposer moves. Then, after the offer is made, the responder knows all the previous
offers, which have all been rejected, and the current offer that has just been made. This
is another stage, where only the responder moves. Therefore, in this game, each round
has two stages.
Such games are called multistage games.
In a multistage game, if two strategies prescribe the same behavior at all stages, then
they are identical strategies and yield the same payoff vector. Suppose that two strategies
are different but prescribe the same behavior over very long stretches of successive stages
(e.g., in bargaining they differ only after a billion rounds). Then, we would expect
the two strategies to yield very similar payoffs. If this is indeed the case, then we call
such games "continuous at infinity". (In this course, we will only consider games that
are continuous at infinity. For an example of a game that is not continuous at infinity,
see Example 9.1.) The single-deviation principle applies to multistage games that are
continuous at infinity.
Single-deviation test Consider a strategy profile s∗. Pick any stage (after any
history of moves), and assume that we are at that stage. Pick also a player i who moves at
that stage. Fix all the other players' moves as prescribed by the strategy profile s∗ at
180 CHAPTER 11. SUBGAME-PERFECT NASH EQUILIBRIUM
the current stage as well as in the rest of the game. Fix also the moves of player i at all
the future dates, but let his moves at the current stage vary. Can we find a move at the
current stage that gives player i a higher payoff than s∗ does, given all the moves that we
have fixed? If the answer is Yes, then s∗ fails the single-deviation test at that stage for player i.
Clearly, if s∗ fails the single-deviation test at any stage for any player i, then s∗
cannot be a subgame-perfect equilibrium. This is because s∗ does not lead to a Nash
equilibrium in the subgame that starts at that stage, as player i has an incentive to
deviate to the strategy according to which he plays the better move at the current stage
but follows s∗ in the remainder of the subgame. It turns out that in a multistage game
that is continuous at infinity, the converse is also true: if s∗ passes the single-deviation
test at every stage (after every history of previous moves) for every player, then it
is a subgame-perfect equilibrium.
This is a generalization of the fact that backward induction results in a Nash equi-
librium, as established in Proposition 9.1. For an illustration of the argument, see the
proof of Proposition 9.1; the proof in the general case considered here is similar.
Example 9.1 illustrates that the single-deviation principle need not apply when the game
is not continuous at infinity. Since all the games considered in this course are continuous
at infinity, you do not need to worry about that possibility.
Recall that, when the game automatically ends after 2n̄ periods, at any date t, the proposer
offers to take

(1 − (−δ)^(2n̄−t+1)) / (1 + δ)

for himself and leave the remaining

(δ + (−δ)^(2n̄−t+1)) / (1 + δ)

to the other player, and the other player accepts an offer if and only if his share is at
least as large as in this offer. When n̄ → ∞, the behavior is as follows:

s∗: at each history where player i makes an offer, offer to take 1/(1 + δ) and leave δ/(1 + δ)
to the other player, and at each history where player i responds to an offer, accept the
offer if and only if it gives him at least δ/(1 + δ).
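As a quick numerical sketch (δ = 0.9 is an assumed value for illustration), the finite-horizon shares above converge to the share 1/(1 + δ) used in s∗ as the horizon n̄ grows:

```python
# Proposer's share at date t when the game ends after 2*nbar periods
# (the formula above), compared with the infinite-horizon share 1/(1+delta).
def proposer_share(delta, t, nbar):
    return (1 - (-delta) ** (2 * nbar - t + 1)) / (1 + delta)

delta = 0.9
limit = 1 / (1 + delta)

# The date-0 proposer's share approaches 1/(1+delta) as nbar grows.
for nbar in (1, 5, 50):
    print(nbar, proposer_share(delta, 0, nbar))
print("limit:", limit)
```

The oscillating term (−δ)^(2n̄−t+1) vanishes geometrically, which is exactly why the finite-horizon behavior settles down to the stationary profile.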
Now, according to s∗, at the current stage, the responder i is to accept the offer. Accepting
gives him the payoff

δ^t x ≥ δ^(t+1)/(1 + δ),

where x is the share the offer gives him. If i deviates and rejects the offer, then, given the
fixed future behavior, he gets only δ^(t+1)/(1 + δ), and he has no incentive to deviate. Hence,
s∗ passes the single-deviation test at this stage for player i.
182 CHAPTER 11. SUBGAME-PERFECT NASH EQUILIBRIUM
Now, consider a stage as in (ii) for some t (for an arbitrary history of previous offers),
where the current offer gives x < δ/(1 + δ) to the responding player i. Fix the behavior of
the players at t + 1 and onwards as in s∗, so that, independent of what has happened so far,
at t + 1, player i offers to take 1/(1 + δ), which is accepted, yielding the payoff
δ^(t+1)/(1 + δ) to i. According to s∗, player i is to reject the current offer and hence get
this payoff. If he deviates and accepts the offer, he will get only

δ^t x < δ^(t+1)/(1 + δ).

Therefore, he has no incentive to deviate at this stage, and s∗ passes the single-deviation
test at this stage.
Now consider a stage as in (i) for some t (for an arbitrary history of previous offers),
where player i makes an offer. Fix the moves of the other player at t and onwards as in s∗,
and fix also the moves of i at t + 1 and onwards as in s∗. Given the fixed moves, if i offers
the other player a share y ≥ δ/(1 + δ), then the offer will be accepted, and i will obtain
the payoff δ^t (1 − y). If he offers y < δ/(1 + δ), then the offer will be rejected, and at
t + 1 the players will agree to a division in which i gets δ/(1 + δ). In that case, the payoff
of i will be

δ^(t+2)/(1 + δ),

which is less than the payoff δ^t/(1 + δ) from offering exactly y = δ/(1 + δ). Hence, s∗
passes the single-deviation test at this stage as well.
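The proposer's comparison can be verified numerically; the following sketch uses the assumed values δ = 0.9 and t = 0:

```python
# Deviation check for the proposer at a stage of type (i) under s*.
# delta = 0.9 and t = 0 are assumed values for illustration.
delta, t = 0.9, 0

offer = delta / (1 + delta)                 # share left to the responder under s*
eq_payoff = delta**t * (1 - offer)          # the offer is accepted immediately
lowball = delta**(t + 2) / (1 + delta)      # a smaller offer is rejected; agreement at t+1

# Offering exactly delta/(1+delta) beats making a rejected (lowball) offer.
print(eq_payoff, lowball)
```

With these numbers the equilibrium offer yields about 0.53 while a rejected offer yields only about 0.43, as the derivation predicts.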
In this game, at each stage only one player moves. In the following lectures, we
will study repeated games, where multiple players may move at a given stage. The
single-deviation principle will be very useful in those games as well.
11.4 Exercises with Solutions
Figure 11.7: The payoff of the proposer as a function of the offered share to the other party.

1. [An extensive-form game: Player 2 chooses between X, which yields the payoff vector (5/2, 5/2), and E; after E, Players 1 and 2 simultaneously choose between L and R and between l and r, respectively, with the payoffs given in the solution below.]
Solution: The only proper subgame starts after E. This subgame can be written
in normal form as

        l        r
L    3, 3    0, 2
R    2, 0    2, 2

It has three Nash equilibria: (L, l), (R, r), and the mixed-strategy Nash
equilibrium with σ1(L) = σ2(l) = 2/3. Since 3 > 5/2, (L, l) entices Player
2 to play E; this yields the first SPE. Similarly, the second SPE has (R, r)
played in the subgame: if one picks (R, r) in the subgame, the expected payoff
vector for the subgame is (2, 2), and Player 2 plays X. In the third SPE,
Player 2 plays X, and the mixed-strategy equilibrium would have been
played in the subgame otherwise.

[Figure 11.8: the extensive-form game for the next exercise.]

[Figure 11.9: the reduced form of the game in Figure 11.8.]
2. Solution: The game in Figure 11.8 can be written as

3/2, 3/2    3/2, 3/2
0, 0        1, 1
3, 3        0, 0

in normal form (rows for Player 1, columns a and b for Player 2). This game
does not have a proper subgame. The pure-strategy Nash equilibria are the
profile in which Player 2 plays a and Player 1 plays the row yielding (3, 3),
and the profile in which Player 2 plays b and Player 1 plays the row yielding
(3/2, 3/2). These result in subgame-perfect Nash equilibria in mixed strategies,
with the unreached moves mixed with probability 1/2 each. The reduced game
has yet another Nash equilibrium, in which Player 1 and Player 2 each put
equal probabilities on two of their moves. This leads to a third subgame-perfect
Nash equilibrium.
3. [Final 2002] Ashok and Beatrice would like to go on a date. They have two options:
a quick dinner at Wendy's, or dancing at Pravda. Ashok first chooses where to go,
and knowing where Ashok went, Beatrice also decides where to go. Ashok prefers
Wendy's, and Beatrice prefers Pravda. A player gets 3 out of his/her preferred date,
1 out of his/her unpreferred date, and 0 if they end up at different places. All
these are common knowledge.

ANSWER: SPE: Beatrice goes wherever Ashok goes, and Ashok goes to
Wendy's. The outcome is that both go to Wendy's. Non-subgame-perfect Nash
equilibrium: Beatrice goes to Pravda at any history, so Ashok goes to Pravda.
The outcome is that each goes to Pravda. This is not subgame-perfect because it
is not a Nash equilibrium in the subgame after Ashok goes to Wendy's.
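The backward-induction computation behind this answer can be sketched as follows (a minimal sketch; the payoff encoding is the one stated in the exercise):

```python
# Backward induction for the date game. 'W' = Wendy's, 'P' = Pravda.
def payoff(ashok, beatrice):
    if ashok != beatrice:
        return (0, 0)                        # they end up at different places
    return (3, 1) if ashok == 'W' else (1, 3)

def beatrice_best(place):
    # Beatrice observes Ashok's choice and best-responds to it.
    return max(['W', 'P'], key=lambda b: payoff(place, b)[1])

# Ashok anticipates Beatrice's response.
ashok_choice = max(['W', 'P'], key=lambda a: payoff(a, beatrice_best(a))[0])
print(ashok_choice, beatrice_best(ashok_choice))
```

Beatrice follows Ashok at every history, so Ashok picks his favorite place, Wendy's.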
(b) Modify the game a little bit: Beatrice does not automatically know where
Ashok went, but she can learn without any cost. (That is, now, without
knowing where Ashok went, Beatrice first chooses between Learn and Not-
Learn; if she chooses Learn, then she knows where Ashok went and then
decides where to go; otherwise, she chooses where to go without learning
where Ashok went. The payoffs depend only on where each player goes, as before.)
Figure 11.10: [Extensive form of the modified game: Ashok chooses Wendy's or Pravda; Beatrice, without observing this, chooses Learn or Don't, and then chooses Wendy's or Pravda.]
4. [Midterm 2, 2007] The players in the following game are Alice, who is an MIT senior
looking for a job, and Google. She has also received a wage offer from Yahoo, but
we do not consider Yahoo as a player. Alice and Google are negotiating. They use
alternating-offer bargaining, Alice offering at even dates t = 0, 2, 4, . . . and Google
offering at odd dates t = 1, 3, . . .. When Alice makes an offer w_t, Google either
accepts the offer, by hiring Alice at wage w_t and ending the bargaining, or rejects
the offer, and the negotiation continues. When Google makes an offer w_t, Alice

• either accepts the offer and starts working for Google for wage w_t, ending
the game,

• or rejects the offer and takes Yahoo's offer w, working for Yahoo for wage w
and ending the game.

If the game continues to date t̄ ≤ ∞ without an agreement, then the game ends with
zero payoffs for both players. If Alice takes Yahoo's offer at some date t, then the
payoff of Alice is δ^t w and the payoff of Google is 0, where δ ∈ (0, 1). If Alice starts
working for Google at date t for wage w_t, then Alice's payoff is δ^t w_t and Google's
payoff is δ^t (π − w_t), where π < 2w. (Note that she cannot work for both Yahoo
and Google.)
(a) Compute the subgame-perfect equilibrium for t̄ = 4. (There are four rounds
of bargaining.)

ANSWER: In the last round, Google offers w_3 = w, which Alice accepts. At
t = 2, Google accepts Alice's offer w_2 if and only if π − w_2 ≥ δ(π − w), so

w_2 = (1 − δ)π + δw.

Moreover,

δw_2 = δ(1 − δ)π + δ²w < w ⟺ δπ/(1 + δ) < w.

Since π < 2w and δ/(1 + δ) < 1/2, this implies that δw_2 < w. That is, Alice
prefers Yahoo's offer to continuing, and hence she will never reject an offer and
continue. Therefore, she must

accept Google's offer w_t if w_t ≥ w, and take Yahoo's offer otherwise,

so that w_1 = w_3 = w and w_0 = w_2 = (1 − δ)π + δw.

[In part (b), the most important cases are the acceptance/rejection cases, espe-
cially that of Alice. Many students skipped those cases and wrongly con-
cluded that a non-SPE profile is a SPE.]
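The computation can be sketched numerically. The numbers π = 1.5, w = 1.0, and δ = 0.9 below are assumptions chosen only to satisfy π < 2w; the variable names are mine, not the exam's:

```python
# Backward induction for the four-round game. pi is Google's product and w is
# Yahoo's wage offer; the values are assumed for illustration (pi < 2w holds).
pi, w, delta = 1.5, 1.0, 0.9

w_odd = w                              # Google's offers at t = 1, 3: accepted iff w_t >= w
w_even = (1 - delta) * pi + delta * w  # Alice's offers at t = 0, 2: Google accepts iff
                                       # pi - w_t >= delta * (pi - w)
assert delta * w_even < w              # Alice never rejects and continues
print(w_even, w_odd)
```

With these numbers, Alice's equilibrium offers are 1.05 and Google's are 1.0, and the acceptance check δw_2 < w holds, as in the derivation above.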
5. Consider the sequential bargaining game with n players and the same discount
factor δ as before. The only difference is that the proposer is selected randomly:
at any date t, each player i is selected as the proposer with probability p_i, and the
other players sequentially accept or reject in increasing order of their indices. The game ends
if all the responders accept. Compute the subgame-perfect Nash equilibria that
are stationary, in the sense that there exist divisions y^1, . . . , y^n such that each player i offers
y^i = (y^i_1, . . . , y^i_n) whenever he is the proposer (and the offer is accepted).
Solution: Write V_i for the expected share of player i before the proposer is
selected:

V_i = p_1 y^1_i + · · · + p_n y^n_i
    = p_i y^i_i + (1 − p_i) δV_i
    = p_i (1 − Σ_{j≠i} δV_j) + (1 − p_i) δV_i
    = p_i (1 − δ Σ_{j=1}^{n} V_j) + δV_i
    = p_i (1 − δ) + δV_i.

Here, the first equality is the definition; the second is because all the other
proposers offer the share δV_i to player i; the third is by substitution of the
value y^i_i = 1 − Σ_{j≠i} δV_j; the fourth is by simple algebra, and the last
equality is by the fact that all the expected shares add up to 1. Solving
for V_i, one obtains

V_i = p_i.
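One can check directly that V_i = p_i solves the equations above; a sketch with the assumed values p = (0.5, 0.3, 0.2) and δ = 0.9:

```python
# Check that V_i = p_i solves the stationary-equilibrium equations above.
# The recognition probabilities p and delta = 0.9 are assumed values.
delta = 0.9
p = [0.5, 0.3, 0.2]
V = p[:]  # candidate solution: V_i = p_i

for i in range(3):
    keep = 1 - sum(delta * V[j] for j in range(3) if j != i)  # proposer i's own share
    rhs = p[i] * keep + (1 - p[i]) * delta * V[i]
    assert abs(V[i] - rhs) < 1e-12
print(V)
```

Each player's expected share equals his recognition probability, independent of δ.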
6. [Final 2007] Three senators, namely Alice, Bob, and Colin, are in a committee
that determines the tax rate τ ∈ [0, 1]. Alice is a libertarian: her utility from
setting the tax rate at τ at date t is δ^t (1 − τ²). Bob is a moderate: his utility
is δ^t (1 − (τ − τ̄)²), where τ̄ ∈ (0, 1) is a known constant. Colin is a liberal: his
utility is δ^t (1 − (1 − τ)²). At each date, one of them randomly becomes the proposer,
each having a chance of 1/3. The proposer offers a tax rate, and the other two vote
Yes or No in alphabetical order. If at least one of them votes Yes, then the game
ends and the offered rate is set as the tax rate. If both say No, we continue to the next date.

(a) Find a subgame-perfect equilibrium of this game. (Hint: There exists a SPE
with values τ_A ≤ τ̄ ≤ τ_C such that Alice always offers τ_A, Bob always offers
τ̄, and Colin always offers τ_C.)
In order to complete the description of the strategy profile, one also needs to
find which offers are accepted by each senator. Clearly, Bob accepts an offer
τ if and only if τ ∈ [τ_A, τ_C]. The expected payoff of Alice at the beginning of
a period is

V_A = 1 − (τ_A² + τ̄² + τ_C²)/3,

and she must accept an offer τ iff τ ≤ τ̂_A, where 1 − τ̂_A² = δV_A, i.e.,

τ̂_A = √(1 − δ + (δ/3)(τ_A² + τ̄² + τ_C²)).

Similarly, Colin accepts an offer τ iff τ ≥ τ̂_C, where

τ̂_C = 1 − √(1 − δ + (δ/3)((1 − τ_A)² + (1 − τ̄)² + (1 − τ_C)²))

(which is obtained by replacing τ with 1 − τ). This completes the answer.
[It can be checked that τ̂_A + (1 − τ̂_C) > 1, so that at least one of Alice and
Colin accepts τ̄. This and the usual single-deviation arguments are
enough to verify that the above strategy profile is indeed a SPE. Also,
the above solution assumes that τ_A ≥ 0 and τ_C ≤ 1. If they turn out to be
out of bounds, one sets them to 0 and 1, respectively, and computes accordingly.]
(b) What happens as δ → 1? Briefly interpret.

Answer: As δ → 1,

τ_A → τ̄,  τ_C → τ̄,  τ̂_A → τ̄,  τ̂_C → τ̄.

That is, in the limit all players offer τ̄, and they accept an offer if and only if
the offer is at least as good for them as τ̄. That is, the moderate senator's preferences
dictate the outcome. (This is a version of the "median voter theorem" in
political science. The "theorem" states that the preferences of the voter who
is in the middle prevail. This emerges formally in models such as the example
here.)
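The limit can be illustrated numerically with the formula for τ̂_A above; the value τ̄ = 0.4 is an assumption, and τ_A, τ_C are set to their limiting value τ̄:

```python
import math

# tau_hat_A from the formula above. tau_bar = 0.4 is an assumed value, and
# tau_A, tau_C are set to tau_bar, their limiting values as delta -> 1.
def tau_hat_A(delta, tau_A, tau_bar, tau_C):
    return math.sqrt(1 - delta + (delta / 3) * (tau_A**2 + tau_bar**2 + tau_C**2))

tau_bar = 0.4
for delta in (0.9, 0.99, 0.999):
    print(delta, tau_hat_A(delta, tau_bar, tau_bar, tau_bar))
```

As δ approaches 1, the acceptance threshold τ̂_A collapses onto τ̄, the moderate's ideal rate.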
11.5 Exercises
1. [Homework 3, 2004] Compute the subgame-perfect Nash equilibria in Figure 11.11.
Figure 11.11: [A two-stage extensive-form game whose terminal payoff vectors are (3, 3), (0, 0), (0, 0), and (1, 1).]
2. Compute the subgame-perfect Nash equilibria in Figure 11.12.

Figure 11.12: [An extensive-form game with actions B, M, L, R, l, r, a, b, x, and y.]
3. Compute the subgame-perfect Nash equilibria in Figure 11.13.

Figure 11.13: [An extensive-form game with actions U, D, A, B, L, R, l, r, a, b, x, and y.]
4. [Midterm 2 Make Up, 2011] Find all the subgame-perfect Nash equilibria of the
following game.
[An extensive-form game with actions a, b, A, B, A', B', L, R, l, r, L', and R'.]
5. An employer and a worker interact as follows. The employer provides the capital K (in
terms of investment in technology, etc.) and the worker provides the labor L (in
terms of the investment in his human capital) to produce F(K, L) = √(KL), which
they share equally. The parties determine their investment levels (the employer's
capital K and the worker's labor L) simultaneously. The worker cannot invest
more than L̄, where L̄ is a very large number. Both capital and labor are costly,
so that the payoffs of the employer and the worker are

F(K, L)/2 − K

and

F(K, L)/2 − L²,

respectively. So far, the problem is the same as Exercise 1 in Section 8.5. The
present problem differs as follows. Before the worker joins the firm (in which case they
simultaneously choose K and L), the worker chooses between working for
this employer and working for another employer, who pays the worker a constant
wage w̃ > 0 and makes him work as much as L̃ = √(w̃/2). (If he works for the other
employer, the current employer gets 0.) Everything described up to here is common
knowledge.
6. [Homework 3, 2006] Alice and Bob are competing to play a game against Casey.
Alice and Bob simultaneously bid b_A and b_B, respectively. The one who bids
higher wins; if b_A = b_B, the winner is determined by a coin toss. The winner pays
his/her bid to Casey and plays the following game with Casey:
Winner\Casey L R
T 3,1 0,0
B 0,0 1,3
Find two pure strategy subgame-perfect equilibria of this game. Which of the
equilibria makes more sense to you?
7. [Midterm 1 Make Up, 2002] Consider the following game of coalition formation
in a parliamentary system. There are three parties, A, B, and C, who have just won
41, 35, and 25 seats, respectively, in a 101-seat parliament. In order to form
a government, a coalition (a subset of {A, B, C}) needs 51 seats in total. The
parties in the government enjoy a total of 1 unit of perks, which they can share in
any way they want. The parties outside the government get 0 units of perks, and
each party tries to maximize the expected value of its own perks. The process of
coalition formation is as follows. First, A is given the right to form a government. If
it fails, then B is given the right to form a government, and if B also fails, then C
is given the right to form a government. If C also fails, then the game ends and each party gets
0. The party that is given the right to form a government, say X, approaches one of
the other two parties, say Y, and offers it some share s ∈ [0, 1]. If Y accepts, then they
form the government, and X gets 1 − s while Y gets s units of perks. If Y rejects the
offer, then X fails to form a government (in which case, as described above, either
another party is given the right to form a government or the game will end with 0 payoffs).
Applying backward induction, find a Nash equilibrium of this game.
8. [A variation of Final Make Up, 2002] Consider the following game between two
firms. Firm 1 either stays out, in which case Firm 1 gets 2 and Firm 2 gets 3, or
enters the market where Firm 2 operates. If it enters, then the firms simultaneously
choose between two strategies: Hawk (an aggressive strategy) and Dove (a peaceful
strategy). In this subgame, if a firm plays Hawk and the other plays Dove, then
Hawk gets 3 and Dove gets 0; if both choose Hawk, then each gets −1; and if both play
Dove, then each gets 1.
(b) Which of the above equilibria is consistent with the assumption that Firm 2
continues to believe that Firm 1 is rational at Firm 2's information set?
10. Verify that the equilibrium identified in the random-proposer model of the previous
section is indeed a subgame-perfect Nash equilibrium.
12. [Final 2006] Alice and Bob own a dollar, which they need to share in order to
consume it. Alice makes an offer x ∈ X = {0.01, 0.02, . . . , 0.98, 0.99}; and, observing
the offer, Bob accepts it or rejects it. If Bob accepts the offer, Alice gets 1 − x and
Bob gets x. If he rejects, then each gets 0.
Repeated Games
In real life, most games are played within a larger context, and actions in a given situation
affect not only the present situation but also the future situations that may arise. When
a player acts in a given situation, he takes into account not only the implications of his
actions for the current situation but also their implications for the future. If the players
are patient and the current actions have significant implications for the future, then the
considerations about the future may take over. This may lead to a rich set of behaviors
that may seem irrational when one considers the current situation alone. Such
ideas are captured in repeated games, in which a "stage game" is played repeatedly.
The stage game is repeated regardless of what has been played in the previous rounds.
This chapter explores the basic ideas in the theory of repeated games and applies them
in a variety of economic problems. As it turns out, it is important whether the game is
repeated finitely or infinitely many times.
200 CHAPTER 12. REPEATED GAMES
played in each previous play. A strategy then prescribes what a player plays at each date
t as a function of the plays at dates 0, . . . , t − 1. More precisely, let us call the outcomes of the
previous stage games a history, which will be a sequence (x_0, . . . , x_{t−1}). A strategy in
the repeated game prescribes a strategy of the stage game for each history (x_0, . . . , x_{t−1})
at each date t.
For example, consider a situation in which two players play the Prisoners' Dilemma
game

        C       D
C     5, 5    0, 6
D     6, 0    1, 1          (12.1)

twice. In that case, T = {0, 1} and the stage game is the Prisoners' Dilemma game. The
repeated game can be represented in the extensive form as follows.
[Extensive-form tree of the twice-repeated Prisoners' Dilemma: each player chooses C or D at date 0; having observed the outcome, they choose again at date 1, and each terminal payoff is the sum of the two stage payoffs.]
A proper subgame starts after each history of plays in the initial round. For example,
after (C, C) in the initial round, we have the subgame

        C         D
C     10, 10    5, 11
D     11, 5     6, 6
where we add 5 to each player's payoffs, corresponding to the payoff that he gets from
playing (C, C) in the first round. Recall that adding a constant to a player's payoffs
does not change the preferences in a game, and hence the set of equilibria in this game
is the same as in the original Prisoners' Dilemma, which possesses the unique Nash
equilibrium (D, D). This equilibrium is depicted in the figure. Likewise, in each proper
subgame, we add some constant to the players' payoffs, and hence we have (D, D) as
the unique Nash equilibrium in each of these subgames.
Therefore, the actions in the last round are independent of what is played in the
initial round. Hence, the players will ignore the future and play the game as if there is
no future game, each playing D. Indeed, given the behavior in the last round, the game
in the initial round reduces to
        C       D
C     6, 6    1, 7
D     7, 1    2, 2
where we add 1 to each player's payoffs, accounting for his payoff in the last round. The
unique equilibrium of this reduced game is (D, D). This leads to a unique subgame-
perfect equilibrium: at each history, each player plays D.
What would happen for arbitrary T? The answer remains the same. On the last
day, independent of what has been played in the previous rounds, there is a unique
Nash equilibrium for the resulting subgame: each player plays D. Hence, the actions
on the previous day do not have any effect on what will be played on the last day. Then, we
can consider the subgame starting on the previous day as a separate game of the Prisoners'
Dilemma. Indeed, the reduced game for that day is
        C                                 D
C     5 + V_1 + 1, 5 + V_2 + 1      0 + V_1 + 1, 6 + V_2 + 1
D     6 + V_1 + 1, 0 + V_2 + 1      1 + V_1 + 1, 1 + V_2 + 1
where V_i is the sum of the payoffs of player i from the previous plays at dates 0, . . . , t − 2. Here,
we add V_i for these payoffs and 1 for the last-round payoff, all of which are independent
of what happens at date t − 1. This is another version of the Prisoners' Dilemma, which
has the unique Nash equilibrium (D, D). Proceeding in this way all the way back to
date 0, we find out that there is a unique subgame-perfect equilibrium: at each t and
for each history of previous plays, each player plays D.
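The unraveling argument can be automated: iterate backward, adding the (constant) continuation payoffs to the stage game and checking that (D, D) remains the unique pure Nash equilibrium at every round. A minimal sketch (T = 10 is an arbitrary choice):

```python
from itertools import product

# Stage game (12.1): payoffs[row][col] = (u1, u2), with actions 0 = C, 1 = D.
payoffs = [[(5, 5), (0, 6)], [(6, 0), (1, 1)]]

def pure_nash(g):
    return [(r, c) for r, c in product(range(2), range(2))
            if all(g[r][c][0] >= g[rr][c][0] for rr in range(2))
            and all(g[r][c][1] >= g[r][cc][1] for cc in range(2))]

T = 10
cont = (0, 0)  # continuation payoffs from the (already solved) later rounds
for _ in range(T):
    # Reduced game: stage payoffs plus the constant continuation payoffs.
    reduced = [[(u1 + cont[0], u2 + cont[1]) for (u1, u2) in row] for row in payoffs]
    eqs = pure_nash(reduced)
    assert eqs == [(1, 1)]        # unique equilibrium: both play D
    cont = reduced[eqs[0][0]][eqs[0][1]]

print(cont)  # total equilibrium payoffs over T rounds
```

Because the continuation value is the same after every first-round outcome, each reduced game is the stage game plus a constant, and defection is forced at every date.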
That is to say, although there are many repetitions in the game and the stakes in
the future may be high, any plan of actions other than playing myopically everywhere
unravels, as players cannot commit to any plan of action in the last round. This is
indeed a general result.
Theorem 12.1 Let T be finite and assume that the stage game has a unique subgame-perfect equi-
librium s∗. Then, the repeated game has a unique subgame-perfect equilibrium, and according to this
equilibrium, s∗ is played at each date, independent of the history of the previous plays.
The proof of this result is left as a straightforward exercise. The result can be
illustrated by another important example. Consider the following Entry-Deterrence
game, where an entrant (Player 1) decides whether to enter a market or not, and the
incumbent (Player 2) decides whether to fight or accommodate the entrant if he enters.
The entrant (Player 1) first chooses between X, which yields the payoff vector (0, 2), and
Enter; if he enters, the incumbent (Player 2) chooses between Acc., which yields (1, 1), and
Fight, which yields (−1, −1).   (12.2)
Consider the game where the Entry-Deterrence game is repeated twice, and all the
previous actions are observed. This game is depicted in the following figure.
12.1. FINITELY-REPEATED GAMES 203
[Extensive form of the twice-repeated Entry-Deterrence game: after each first-round outcome, namely X, (Enter, Acc.), or (Enter, Fight), the stage game is played again with the first-round payoffs added to the terminal payoffs.]
As depicted in the extensive form, in the repeated game at t = 1 there are three
possible histories: X, (Enter, Acc.), and (Enter, Fight). A strategy of Player 1 assigns
an action, which has to be either Enter or X, to be played at t = 0, and an action to be
played at t = 1 for each possible outcome at t = 0. In total, we need to determine 4
actions in order to define a strategy for Player 1. Similarly for Player 2.
Note that after each outcome of the first play, the Entry-Deterrence game is
played again, where the payoff from the first play is added to each outcome. Since a
player’s preferences do not change when we add a number to his utility function, each
of the three games played on the second “day” is the same as the stage game (namely,
the Entry-Deterrence game above). The stage game has a unique subgame perfect
equilibrium, where the incumbent accommodates the entrant and the entrant enters the
market. In that case, each of the three games played on the second day has only this
equilibrium as its subgame perfect equilibrium. This is depicted in the following.
[The same tree with the unique equilibrium, (Enter, Acc.), marked in each of the three second-round subgames.]
[Given the second-round behavior, the first-round game reduces to the Entry-Deterrence game with the continuation payoff (1, 1) added: X yields (1, 3), Enter followed by Acc. yields (2, 2), and Enter followed by Fight yields (0, 0). Hence, Player 1 enters and Player 2 accommodates in the first round as well.]
This can be generalized for arbitrary T as above. All these examples show that in
certain important games, no matter how high the stakes are in the future, the consid-
erations about the future will not affect the current actions, as the future outcomes do
not depend on the current actions. In the rest of the lectures we will show that these
are very peculiar examples. In general, in many subgame-perfect equilibria, the patient
players will take a long-term view, and their decisions will be determined mainly by the
future considerations.
Indeed, if the stage game has more than one equilibrium, then in the repeated game
we may have some subgame-perfect equilibria where, in some stages, players play some
actions that are not played in any subgame-perfect equilibrium of the stage game. This
is because the equilibrium to be played on the second day can be conditioned on the
play on the first day, in which case the "reduced game" for the first day is no longer
the same as the stage game and thus may have some different equilibria. I will now
illustrate this using an example in Gibbons. (See Exercises 1 and 2 at the end of the
chapter before proceeding.)
Take T = {0, 1} and let the stage game be

        L       M       R
L     1, 1    5, 0    0, 0
M     0, 5    4, 4    0, 0
R     0, 0    0, 0    3, 3
Notice that a strategy in this repeated game prescribes what the player plays at t = 0 and
what he plays at t = 1 conditional on the history of the play at t = 0. There are 9 such
histories, such as (L, L), (L, M), etc. A strategy of Player 1 is defined by determining
an action (L, M, or R) for t = 0 and an action for each of these histories at
t = 1 (there will be 10 actions in total). Consider the following strategy profile: each
player plays M at t = 0; at t = 1, each player plays R if (M, M) was played at t = 0
and plays L otherwise. Under this profile, the reduced game for t = 0 is

        L       M       R
L     2, 2    6, 1    1, 1
M     1, 6    7, 7    1, 1
R     1, 1    1, 1    4, 4
Here, we add 3 to the payoffs at (M, M) (for it leads to (R, R) in the second round) and
add 1 to the payoffs at the other strategy profiles, for they lead to (L, L) in the second
round. Clearly, (M, M) is a Nash equilibrium of the reduced game, showing that the
above strategy profile is a subgame-perfect Nash equilibrium. In summary, players can
coordinate on different equilibria in the second round conditional on the behavior in the
first round, and the players may play non-equilibrium (or even irrational) strategies in
the first round, if those strategies lead to a better equilibrium later.
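A brute-force check of the reduced game above confirms this; a sketch with actions encoded as 0 = L, 1 = M, 2 = R:

```python
from itertools import product

# Reduced game for t = 0 under the profile above.
reduced = [[(2, 2), (6, 1), (1, 1)],
           [(1, 6), (7, 7), (1, 1)],
           [(1, 1), (1, 1), (4, 4)]]

def pure_nash(g):
    n = len(g)
    return [(r, c) for r, c in product(range(n), range(n))
            if all(g[r][c][0] >= g[rr][c][0] for rr in range(n))
            and all(g[r][c][1] >= g[r][cc][1] for cc in range(n))]

# (M, M) is an equilibrium of the reduced game, alongside the two profiles
# corresponding to the stage-game equilibria (L, L) and (R, R).
print(pure_nash(reduced))
```

The profile (M, M), which is not an equilibrium of the stage game, survives in the reduced game precisely because it is rewarded with the better continuation equilibrium.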
When there are multiple subgame-perfect Nash equilibria in the stage game, a large
number of outcome paths can result in a subgame-perfect Nash equilibrium of the re-
peated game even if it is repeated just twice. But not all outcome paths can be a result
of a subgame-perfect Nash equilibrium. In the following, I will illustrate why some of
the paths can and some paths cannot emerge in an equilibrium in the above example.
Can ((M, M), (M, M)) be an outcome of a subgame-perfect Nash equilibrium? The
answer is No. This is because in any Nash equilibrium, the players must play a Nash
equilibrium of the stage game in the last period on the path of equilibrium. Since (M, M)
is not a Nash equilibrium of the stage game, ((M, M), (M, M)) cannot emerge in any Nash
equilibrium, let alone in a subgame-perfect Nash equilibrium.
Can ((M, M), (L, L)) be an outcome of a subgame-perfect Nash equilibrium in pure
strategies? The answer is No. Although (L, L) is a Nash equilibrium of the stage game,
in a subgame-perfect Nash equilibrium, a Nash equilibrium of the stage game must
be played after every play in the first round. In particular, after (L, M), the play is
either (L, L) or (R, R), yielding a total of 6 or 8, respectively, for Player 1. Since he gets only 5
from ((M, M), (L, L)), he has an incentive to deviate to L in the first period. (What
about if we consider mixed subgame-perfect Nash equilibria or non-subgame-perfect
Nash equilibria?)
Can ((M, L), (R, R)) be an outcome of a subgame-perfect Nash equilibrium in pure
strategies? As must be clear from the previous discussion, the answer would be Yes
if and only if (L, L) is played after every play of the first period except for (M, L),
after which (R, R) is played. In that case, the reduced game for the first period is

        L       M       R
L     2, 2    6, 1    1, 1
M     3, 8    5, 5    1, 1
R     1, 1    1, 1    4, 4

Since (M, L) is indeed a Nash equilibrium of the reduced game, the answer is Yes. It is
the outcome of the following subgame-perfect Nash equilibrium: play (M, L) in the first
round; in the second round, play (R, R) if (M, L) is played in the first round and play
(L, L) otherwise.
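Both claims, that (M, L) is a Nash equilibrium of this reduced game, and that the deviation payoffs 6 and 8 rule out ((M, M), (L, L)), can be checked directly (actions encoded as 0 = L, 1 = M, 2 = R):

```python
# Reduced game when (R, R) follows (M, L) and (L, L) follows every other play.
reduced = [[(2, 2), (6, 1), (1, 1)],
           [(3, 8), (5, 5), (1, 1)],
           [(1, 1), (1, 1), (4, 4)]]

# (M, L) is a Nash equilibrium of this reduced game:
r, c = 1, 0
assert all(reduced[r][c][0] >= reduced[rr][c][0] for rr in range(3))
assert all(reduced[r][c][1] >= reduced[r][cc][1] for cc in range(3))

# Player 1's total from ((M, M), (L, L)) is 4 + 1 = 5, while deviating to L in
# the first round yields 5 + 1 = 6 or 5 + 3 = 8 depending on the continuation.
assert 4 + 1 < 5 + 1 < 5 + 3
print("checks passed")
```

Note how Player 2's first-round move L, which is not a best reply in the stage game, is sustained by the promise of the (3, 3) continuation.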
12.2. INFINITELY REPEATED GAMES WITH OBSERVED ACTIONS 207
As an exercise, check also if (( ) ( )) or (( ) ( ) ( )) can be an
outcome of a subgame-perfect Nash equilibrium in pure strategies (in twice and thrice
repeated games, respectively).
The present value of a payoff stream π = (π_0, π_1, . . .) under the discount factor δ is

PV(π; δ) = Σ_{t=0}^{∞} δ^t π_t = π_0 + δπ_1 + · · · + δ^t π_t + · · · ,

and the average value is

(1 − δ) PV(π; δ) ≡ (1 − δ) Σ_{t=0}^{∞} δ^t π_t.
Hence, the analysis does not change whether one uses the present value or the average
value, but using the average value is often simpler. In the repeated games considered here, each
player maximizes the present value of the payoff stream he gets from the stage games, which will
be played indefinitely. Since the average value is simply a linear transformation of the present
value, one can also use average values instead of present values. Such a choice sometimes
simplifies the expressions without affecting the analysis.
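For instance, the constant stream (1, 1, 1, . . .) has present value 1/(1 − δ) and average value 1; a numerical sketch with δ = 0.9 and a long truncation:

```python
# Present value and average value of the constant stream (1, 1, 1, ...),
# truncated at a long horizon; delta = 0.9 is an assumed discount factor.
delta, horizon = 0.9, 10_000

pv = sum(delta**t for t in range(horizon))   # approximately 1 / (1 - delta) = 10
avg = (1 - delta) * pv                       # approximately 1

print(pv, avg)
```

The average value reports the stream in per-period units, which is why it often simplifies comparisons between strategies.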
In the repeated Prisoners' Dilemma, the possible histories are t-tuples of (C, C), (C, D), (D, C),
and (D, D), such as

((C, C), (C, D), (D, D), (D, D), . . . , (D, D)),

where t varies. A history at the beginning of date t is denoted by h^t = (x_0, . . . , x_{t−1}),
where x_{t'} is the outcome of the stage game in round t'; h^t is empty when t = 0. For example,
in the repeated Prisoners' Dilemma, ((C, C), (C, D)) is a history for t = 2. In the repeated
entry-deterrence game, (X, (Enter, Fight)) is a history for t = 2.
A strategy in a repeated game, once again, determines a strategy in the stage game
for each history and each date t. The important point is that the strategy played in the stage
game at a given date can vary across histories. Here are some possible strategies in the
repeated Prisoners' Dilemma game:
Naively Cooperate: Play C always (no matter what happened in the past).

Grim: Play C at t = 0; at each t > 0, play C if no player has ever played D, and
play D otherwise (the standard grim-trigger strategy).

Tit-for-Tat: Play C at t = 0, and at each t > 0, play whatever the other
player played at t − 1.
Note that the strategy profiles (Grim, Grim), (Naively Cooperate, Naively Cooperate),
and (Tit-for-Tat, Tit-for-Tat) all lead to the same outcome path:1 (C, C) is played at every date.
Nevertheless, they are quite distinct strategy profiles. Indeed, (Naively Cooperate,
Naively Cooperate) is not even a Nash equilibrium (why?), while (Grim, Grim) is a
subgame-perfect Nash equilibrium for large values of δ. On the other hand, while (Tit-
for-Tat, Tit-for-Tat) is a Nash equilibrium for large values of δ, it is not subgame-perfect.
All these will be clear momentarily.
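The following sketch implements the three strategies as functions of the players' histories and confirms that each symmetric profile generates the all-(C, C) path; the grim trigger is written out in its standard form, which is an assumption about the intended details:

```python
# The three strategies above as functions of the players' histories.
def naive(own_hist, other_hist):
    return 'C'

def tit_for_tat(own_hist, other_hist):
    return 'C' if not other_hist else other_hist[-1]

def grim(own_hist, other_hist):
    # Standard grim trigger: cooperate until anyone has ever defected.
    return 'C' if 'D' not in own_hist and 'D' not in other_hist else 'D'

def play(s1, s2, rounds):
    h1, h2 = [], []
    for _ in range(rounds):
        a1, a2 = s1(h1, h2), s2(h2, h1)
        h1.append(a1)
        h2.append(a2)
    return list(zip(h1, h2))

# Each symmetric profile generates the same path: (C, C) in every round.
for s in (naive, tit_for_tat, grim):
    print(play(s, s, 5))
```

The profiles differ only off the path: after a defection, Naively Cooperate keeps playing C, Tit-for-Tat echoes the last move, and Grim defects forever.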
where u_i(x) is the stage-game payoff of player i at outcome x in the original stage game, and
V_{i,t+1}(x, h | s∗) is the present value for player i at t + 1 of the payoff stream that results
1 Make sure that you can compute the outcome path for each strategy profile above.
when all players follow s∗ starting with the history (h, x) = (x_0, . . . , x_{t−1}, x), which is a
history at the beginning of date t + 1. Note that v_i(x | s∗) is the time-t present value
of the payoff stream that results when the outcome of the stage game is x in round t and
everybody sticks to the strategy profile s∗ from the next period on. Note also that the
only difference between the original stage game and the augmented stage game is that
the payoff in the augmented game is v_i(x | s∗) while the payoff in the original game is
u_i(x).
The single-deviation principle now states that a strategy profile in the repeated game is
subgame-perfect if it always yields a subgame-perfect Nash equilibrium in the augmented
stage game:
Note that s∗_i(h) is what player i is supposed to play in the stage game after history
h at date t according to s∗. Hence, s∗_i(h) is a strategy in the stage game as well as
a strategy in the augmented stage game. Therefore, (s∗_1(h), . . . , s∗_n(h)) is a strategy
profile in the augmented stage game, and a potential subgame-perfect Nash equilibrium.
Note also that, in order to show that s∗ is a subgame-perfect Nash equilibrium, one
must check for all histories h and all dates t that s∗ yields a subgame-perfect Nash equi-
librium in the augmented stage game. Conversely, in order to show that s∗ is not a
subgame-perfect Nash equilibrium, one only needs to find one history (and date) for
which s∗ does not yield a subgame-perfect Nash equilibrium in the augmented stage
game. Finally, although the above result considers a pure strategy profile s∗, the same
result is true for mixed strategies; the result is stated this way for clarity. The rest of
this section is devoted to illustrating the single-deviation principle in the infinitely repeated
Entry-Deterrence and Prisoners' Dilemma games.
At any given stage, the entrant enters the market if and only if the incum-
bent has accommodated the entrant sometime in the past. The incumbent
accommodates the entrant if and only if he has accommodated the entrant
before.2

Using the single-deviation principle, we will now show that for large values of δ, this is a
subgame-perfect Nash equilibrium. The strategy profile puts the histories into two groups:

1. the histories at which there was an entry that the incumbent accommodated,
i.e., the histories that contain an (Enter, Acc.), and

2. all the other histories, i.e., the histories that do not contain (Enter, Acc.) at any
date.
First, take a history h in the first group at the beginning of some date t. According to s∗,
the play will be (Enter, Acc.) at every date from t + 1 on, so each player gets the payoff
of 1 at each of those dates, whose present value at t + 1 is

V_A = 1 + δ + δ² + · · · = 1/(1 − δ).

That is, for every outcome x ∈ {X, (Enter, Fight), (Enter, Acc.)}, V_{i,t+1}(x, h | s∗) = V_A. Hence, the aug-
mented stage game for h and s∗ is the Entry-Deterrence game with the constant δV_A added to all
payoffs: Player 1 chooses between X, which yields (0 + δV_A, 2 + δV_A), and Enter; Player 2
then chooses between Acc., which yields (1 + δV_A, 1 + δV_A), and Fight, which yields
(−1 + δV_A, −1 + δV_A).
2
This is a switching strategy: initially the incumbent fights whenever there is an entry and the
entrant never enters. If the incumbent ever accommodates an entrant, they switch to a new
regime in which the entrant enters the market no matter what the incumbent does after the switch,
and the incumbent always accommodates the entrant.
212 CHAPTER 12. REPEATED GAMES
For example, if the incumbent accommodates the entrant at t, his present value (at
t) will be 1 + δV_A; and if he fights, his present value will be −1 + δV_A, and so on.
This is another version of the Entry-Deterrence game, where the constant δV_A is added
to the payoffs. The strategy profile s* yields (Enter, Accommodate) for round t at
h^t. According to the single-deviation principle, (Enter, Accommodate) must be a subgame-
perfect equilibrium of the augmented stage game here. This is indeed the case, and s*
passes the single-deviation test for such histories.
Now for some date t consider a history h^t = (a^0, ..., a^{t−1}) in the second group, where
the incumbent has never accommodated the entrant before, i.e., a^{t′} differs from (Enter, Acc.) for
all t′ < t. Towards constructing the augmented stage game for h^t, first consider the outcome
a = (Enter, Acc.) at t. In that case, at the beginning of t + 1, the history is (h^t, a), which
includes (Enter, Acc.) as in the previous paragraph. Hence, according to s*, Player 1 enters and
Player 2 accommodates at t + 1, yielding a history that contains (Enter, Acc.) for the next period.
Therefore, in the continuation game, all histories are in the first group (containing (Enter, Acc.)),
and the play is (Enter, Accommodate) at every t′ > t, resulting in the outcome path
((Enter, Acc.), (Enter, Acc.), ...). Starting from t + 1, each player gets 1 for each date, resulting in the
present value of V_{t+1}(s* | h^t, a) = V_A. Now consider another outcome a ∈ {(Enter, Fight), X}
in period t. The continuation play for these outcomes is quite different. At the
beginning of t + 1, the history (h^t, a) is either (h^t, (Enter, Fight)) or (h^t, X). Since h^t does not contain
(Enter, Acc.), neither does (h^t, a). Hence, according to s*, at t + 1, Player 1 exits, and Player 2
would have chosen Fight if there were an entry, yielding outcome X for period t + 1.
Consequently, at any t′ > t + 1, the history is (h^t, a, X, ..., X), and Player 1 chooses to
exit at t′ according to s*. This results in the outcome path (X, X, ...). Therefore,
starting from t + 1, Player 1 gets 0 and Player 2 gets 2 every day, yielding present values
of V_{1,t+1}(s* | h^t, a) = 0 and
V_{2,t+1}(s* | h^t, a) = V_F = 2 + 2δ + 2δ² + ··· = 2/(1 − δ).
12.2. INFINITELY REPEATED GAMES WITH OBSERVED ACTIONS 213
[Figure: the augmented stage game at a history in the second group. Player 1 chooses X, yielding (0 + δ·0, 2 + δV_F), or Enter; after Enter, Player 2 chooses Acc., yielding (1 + δV_A, 1 + δV_A), or Fight, yielding (−1 + δ·0, −1 + δV_F).]
At this history the strategy profile prescribes (X, Fight), i.e., the entrant does not
enter, and if he enters, the incumbent fights. The single-deviation principle then requires
that (X, Fight) is a subgame-perfect equilibrium of the above augmented stage game.
Since X is a best response to Fight, we only need to ensure that Player 2 weakly prefers
Fight to Accommodate after the entry in the above game. For this, we must have
−1 + δV_F ≥ 1 + δV_A.
Substitution of the definitions of V_F and V_A in this inequality shows that this is equivalent
to3
δ ≥ 2/3.
We have considered all possible histories, and when δ ≥ 2/3, the strategy profile
has passed the single-deviation test. Therefore, when δ ≥ 2/3, the strategy profile is a
subgame-perfect equilibrium.
On the other hand, when δ < 2/3, s* is not a subgame-perfect Nash equilibrium. To
show this, it suffices to consider one history at which s* fails the single-deviation test. For
a history in the second group, the augmented stage game is as above, and (X, Fight)
is not a subgame-perfect equilibrium of this game, as 1 + δV_A > −1 + δV_F.
When the stage game is a simultaneous-move game, the augmented stage game is also a
simultaneous-move game; hence, it suffices to check that s*(h^t) yields a Nash
equilibrium of the augmented stage game for h^t for every history h^t. This simplifies the
analysis substantially because one only needs to compute the payoffs without deviation
and with unilateral deviations in order to check whether the strategy profile is a Nash
equilibrium.
As an example, consider the infinitely repeated Prisoners' Dilemma game in (12.2).
Consider the strategy profile (Grim, Grim). There are two kinds of histories we need to
consider separately for this strategy profile:
1. Cooperation: Histories in which D has never been played by anyone.
2. Defection: Histories in which D has been played by someone at some date.
First consider a Cooperation history h^t for some t. Now if both players play C, then
according to (Grim, Grim), from t + 1 on each player will play C forever. This yields the
present value of
V_C = 5 + 5δ + 5δ² + ··· = 5/(1 − δ)
at t + 1. If any player plays D, then from t + 1 on, all the histories will be Defection
histories and each will play D forever. This yields the present value of
V_D = 1 + δ + δ² + ··· = 1/(1 − δ)
at t + 1. Now, at t, if they both play C, then the payoff of each player will be 5 + δV_C.
If Player 1 plays D while Player 2 is playing C, then Player 1 gets 6 + δV_D, and Player
2 gets 0 + δV_D. Hence, the augmented stage game at the given history is

        C                           D
C   5 + δV_C, 5 + δV_C      0 + δV_D, 6 + δV_D
D   6 + δV_D, 0 + δV_D      1 + δV_D, 1 + δV_D
To pass the single-deviation test, (C, C) must be a Nash equilibrium of this game.4 (That
is, we fix a player's action and check whether the other player has an incentive to deviate.)
4
It is important to note that we do not need to know all the payoffs in the reduced game. For
example, for this history we only need to check that (C, C) is a Nash equilibrium of the reduced game, and
hence we do not need to compute the payoffs from (D, D). In this example, it was easy to compute.
In general, it may be time consuming to compute the payoffs for all strategy profiles. In that case, it
will save a lot of time to ignore the strategy profiles in which more than one player deviates from the
prescribed behavior at h^t.
Substituting V_C = 5/(1 − δ) and V_D = 1/(1 − δ), the augmented stage game becomes

        C                                       D
C   5 + 5δ/(1−δ), 5 + 5δ/(1−δ)      0 + δ/(1−δ), 6 + δ/(1−δ)
D   6 + δ/(1−δ), 0 + δ/(1−δ)        1 + δ/(1−δ), 1 + δ/(1−δ)

(C, C) is a Nash equilibrium of this game if and only if 5 + 5δ/(1−δ) ≥ 6 + δ/(1−δ), i.e.,
δ ≥ 1/5.
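The δ ≥ 1/5 cutoff for (Grim, Grim) can be checked the same way; this is a minimal sketch with a function name of my choosing:

```python
def grim_passes_cooperation_test(delta):
    """Single-deviation test for (Grim, Grim) at a Cooperation history.

    V_C = 5/(1-delta): continuation value if both keep cooperating.
    V_D = 1/(1-delta): continuation value after any defection.
    (C, C) is a Nash equilibrium of the augmented stage game iff
    5 + delta*V_C >= 6 + delta*V_D.
    """
    V_C = 5.0 / (1.0 - delta)
    V_D = 1.0 / (1.0 - delta)
    return 5.0 + delta * V_C >= 6.0 + delta * V_D

print(grim_passes_cooperation_test(0.3))   # True: above the 1/5 cutoff
print(grim_passes_cooperation_test(0.1))   # False: below the 1/5 cutoff
```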
When the players are sufficiently patient, their long-term incentives take over, and a large set of behavior may result in
equilibrium. Indeed, for any given feasible and "individually rational" payoff vector v and
for sufficiently large values of δ, there exists some subgame-perfect equilibrium that
yields the payoff vector v as the average value of the payoff stream. This fact is called the
Folk Theorem. This section is devoted to presenting a basic version of the Folk Theorem and
illustrating its proof.
Throughout this section, it is assumed that the stage game is a simultaneous-action
game G = (N, S, u) where N = {1, ..., n} is the set of players, S = S_1 × ··· × S_n is a
finite set of strategy profiles, and u = (u_1, ..., u_n) : S → R^n is the profile of stage-game utility functions.
Let V denote the set of payoff vectors of the form

Σ_{s∈S} p(s) (u_1(s), ..., u_n(s))

for some probability distribution p : S → [0, 1] on S. Note that V is the smallest convex
set that contains all payoff vectors (u_1(s), ..., u_n(s)) from pure strategy profiles in the
stage game. A payoff vector v is said to be feasible iff v ∈ V. Throughout this section,
V is assumed to be n-dimensional.
For a visual illustration, consider the Prisoners' Dilemma game in (12.1). The set V
is plotted in Figure 12.1. Since there are two players, V contains pairs v = (v_1, v_2). The
payoff vectors from pure strategies are (1, 1), (5, 5), (6, 0), and (0, 6). The set V is the
diamond-shaped area that lies between the lines that connect these four points.
Note that for every strategy profile s in the repeated game, the average payoff vector
from s is in V.7 This also implies that the same is true for mixed strategy profiles in the
repeated game. Conversely, if the players can collectively randomize on strategy profiles
7
Indeed, the average payoff vector can be written as

Σ_{s∈S} λ(s) (u_1(s), ..., u_n(s)),
[Figure 12.1: The set V of feasible payoff vectors in the Prisoners' Dilemma; its vertices are (1, 1), (5, 5), (6, 0), and (0, 6).]
in the repeated game, all vectors v ∈ V could be obtained as average payoff vectors.
(See also the end of the section.)
where

λ(s) = (1 − δ) Σ_{t∈T_s} δ^t

and T_s is the set of dates at which s is played on the outcome path of the strategy profile. Clearly,

Σ_{s∈S} λ(s) = (1 − δ) Σ_{s∈S} Σ_{t∈T_s} δ^t = (1 − δ) Σ_{t=0}^∞ δ^t = 1.
12.3. FOLK THEOREM 219
Here, the other players try to minimize the payoff of player i by choosing a pure strategy
profile s_{−i} for themselves, knowing that player i will play a best response to s_{−i}. Then, the
harshest punishment they could inflict on i is the pure-strategy minmax payoff, denoted v̲_i^p. For example, in the prisoners' dilemma
game, v̲_i^p = 1 because i gets a maximum of 6 if the other player plays C and gets a maximum
of 1 if the other player plays D.
Observe that in any pure-strategy Nash equilibrium s* of the repeated game, the
average payoff of player i is at least v̲_i^p. To see this, suppose that the average payoff of
i is less than v̲_i^p in s*. Now consider the strategy ŝ_i, such that for each history h, ŝ_i(h)
is a stage-game best response to s*_{−i}(h), i.e.,

u_i(ŝ_i(h), s*_{−i}(h)) = max_{s_i∈S_i} u_i(s_i, s*_{−i}(h)).

Since

max_{s_i∈S_i} u_i(s_i, s*_{−i}(h)) ≥ v̲_i^p

for every h, this implies that the average payoff from (ŝ_i, s*_{−i}) is at least v̲_i^p, giving player
i an incentive to deviate.
A lower bound for the average payoff from a mixed-strategy Nash equilibrium is given
by the minmax payoff, defined as

v̲_i = min_{α_{−i}} max_{s_i∈S_i} Σ_{s_{−i}∈S_{−i}} u_i(s_i, s_{−i}) Π_{j≠i} α_j(s_j),   (12.4)

where α_j is a mixed strategy of player j in the stage game. Similarly to pure strategies, one can
show that the average payoff of player i is at least v̲_i in any Nash equilibrium (mixed
or pure). Note that, by definition, v̲_i ≤ v̲_i^p. The inequality can be strict. For example, in
the matching-pennies game

          Head        Tail
Head    −1, 1       1, −1
Tail     1, −1     −1, 1

the pure-strategy minmax payoff is 1 while the minmax payoff is 0. (This is obtained
when α_j(Head) = α_j(Tail) = 1/2.) For the sake of exposition, it is assumed that
(v̲_1, ..., v̲_n) ∈ V.
A payoff vector v is said to be individually rational iff v_i ≥ v̲_i for every i ∈ N.
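Both minmax notions can be computed directly. The following sketch (function names mine) computes the pure-strategy minmax exactly and approximates the mixed minmax of a two-column game, such as matching pennies, by a grid search over the opponent's mixture:

```python
def pure_minmax(payoffs):
    """Pure-strategy minmax for the row player of a two-player finite game.

    payoffs[i][j] is the row player's payoff at (row i, column j).
    The opponent picks the column that minimizes the row player's best response.
    """
    return min(max(row[j] for row in payoffs) for j in range(len(payoffs[0])))

def mixed_minmax_2col(payoffs, grid=10001):
    """Minmax against a mixing opponent in a game with two columns,
    by grid search over the probability placed on the first column."""
    best = None
    for k in range(grid):
        p = k / (grid - 1)
        # the row player's best-response value against the mixture (p, 1-p)
        val = max(p * row[0] + (1 - p) * row[1] for row in payoffs)
        best = val if best is None else min(best, val)
    return best

mp = [[-1, 1], [1, -1]]              # matching pennies, row player's payoffs
print(pure_minmax(mp))               # 1
print(round(mixed_minmax_2col(mp), 4))   # 0.0, attained at the 1/2-1/2 mixture
```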
Theorem 12.3 (Folk Theorem) Let v ∈ V be such that v_i > v̲_i for every player i.
Then, there exists δ̄ ∈ (0, 1) such that for every δ > δ̄ there exists a subgame-perfect
equilibrium of the repeated game under which the average value of each player i is v_i.
Moreover, if v_i > v̲_i^p for every i above, then the subgame-perfect equilibrium above is in
pure strategies.
The Folk Theorem states that any strictly individually rational and feasible payoff
vector can be supported in a subgame-perfect Nash equilibrium when the players are
sufficiently patient. Since all equilibrium payoff vectors need to be individually rational
and feasible, the Folk Theorem provides a rough characterization of the set of equilibrium
payoff vectors when players are patient: approximately the set of all feasible and individually
rational payoff vectors.
I will next illustrate the main idea of the proof for a special case. Assume that, in the
theorem, v = (u_1(s*), ..., u_n(s*)) for some s* ∈ S, and that there exists a Nash equilibrium
ŝ of the stage game such that v_i > u_i(ŝ) for every i. In the prisoners' dilemma example,
s* = (C, C), yielding v = (5, 5), and ŝ = (D, D), yielding the payoff vector (1, 1). Recall
that in that case one could obtain v from the strategy profile (Grim, Grim), which is a
subgame-perfect Nash equilibrium when δ ≥ 1/5. The main idea here is a generalization
of the Grim strategy. Consider the following strategy profile of the repeated game: play
s* until some player unilaterally deviates from s*; play ŝ forever thereafter. At a history
on the path of play, the average payoff of player i from any outcome s ≠ s* in the augmented
stage game is

(1 − δ) u_i(s) + δ u_i(ŝ)
because the players will switch to ŝ after any such play. Then, s* is a Nash equilibrium
of the augmented stage game if and only if

v_i ≥ (1 − δ) max_{s_i∈S_i} u_i(s_i, s*_{−i}) + δ u_i(ŝ)   (12.5)

for every player i.
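Solving (12.5) for δ gives the cutoff δ̄ = (d_i − v_i)/(d_i − u_i(ŝ)), where d_i = max_{s_i} u_i(s_i, s*_{−i}) is the best one-shot deviation payoff. A minimal sketch (the function name is mine):

```python
def grim_cutoff(v, best_dev, nash_payoff):
    """Smallest delta satisfying (12.5): v >= (1-delta)*best_dev + delta*nash_payoff.

    v           -- the target payoff u_i(s*) on the path
    best_dev    -- max over s_i of u_i(s_i, s*_{-i}), the best one-shot deviation
    nash_payoff -- u_i(s_hat), the stage Nash payoff used as punishment
    Requires best_dev > v > nash_payoff.
    """
    return (best_dev - v) / (best_dev - nash_payoff)

# Prisoners' dilemma: v = 5, best deviation 6, punishment payoff 1.
print(grim_cutoff(5, 6, 1))   # 0.2, recovering the 1/5 cutoff above
```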
[Outcome path omitted: a cycle over the pure strategy profiles in which each profile s is played with a frequency approximating the weight λ(s).]
Here, I approximated v by time averaging. When δ is large, one can obtain each v_i exactly
by time averaging.8
8
For mathematically oriented students: imagine writing each weight λ(s) ∈ [0, 1] in base 1/δ.
        A         B         C
A    3, 3      0, 0      0, 0
B    0, 0      2, 2      1, 0
C    0, 0      0, 1      0, 0
(a) Find a lower bound for the average payoff of each player in all pure-strategy
Nash equilibria. Prove indeed that the total payoff of a player is at least n·v̲_i^p in
every pure-strategy Nash equilibrium of the n-times repeated game.
Solution: Note that the pure-strategy minmax payoff of each player is 1.
Hence, the total payoff of a player cannot be less than n. Indeed, if a player
mirrors what the other player is supposed to play at any history at which the
other player plays A or B according to the equilibrium, and plays B if the other
player is supposed to play C at the history, then his payoff is at least
1 per period, i.e., at least n in total. Since he plays a best response in equilibrium, his payoff is at least that
amount. This lower bound is tight. For n = 2k, k ≥ 1, consider the strategy
profile
Play (C, C) for the first k periods and (B, B) for the last k periods; if any player
deviates from this path, play (C, C) forever.
Note that the payoff of each player from this strategy profile is n. To check that this is
a Nash equilibrium, note that the best possible deviation is to play B
forever, which yields n, giving no incentive to deviate. Note also that the
equilibrium here is not subgame-perfect.
Consider the outcome path on which (C, C) is played for the first n/2 periods and (B, B)
is played thereafter, except that (A, A) is played on the last day if n is even. Note that
the total payoff of each player from this path is n + 1.
Consider the following strategy profile.
Play according to the above path; if any player deviates from this path at
any t ≤ n/2 − 1, switch to s*[n − t − 1] for the remaining (n − t − 1)-times
repeated game; if any player deviates from this path at any t ≥ n/2, remain
on the path.
(Here s*[m] denotes the analogous strategy profile for the m-times repeated game; it
yields each player the payoff m + 1.)
This is a subgame-perfect Nash equilibrium. There are three classes of histo-
ries to check. First, consider a history in which some player deviated from the
path at some t′ < n/2. In that case, the strategy profile already prescribes
to follow the subgame-perfect Nash equilibrium s*[n − t′ − 1] of the subgame
that starts from t′ + 1, which remains subgame-perfect at the current sub-
game as well. Second, consider a history in which no player has deviated from
the path at any t′ < n/2, and take t ≥ n/2. In the continuation game, the
above strategy profile prescribes: play (B, B) every day if n is odd, and play
(B, B) every day but the last day and play (A, A) on the last day if n is even.
Since (A, A) and (B, B) are Nash equilibria of the stage game, this is clearly a
subgame-perfect equilibrium of the remaining game. Finally, take t < n/2
and consider any on-the-path history. Now, a player's payoff is n + 1 if he
follows the strategy profile. If he deviates at t, he gets at most 1 at t and
(n − t − 1) + 1 ≤ n from the next period on, where (n − t − 1) + 1 is his
payoff from s*[n − t − 1]. His total payoff cannot exceed n + 1, and he has
no incentive to deviate.
2. Consider the infinitely repeated prisoners' dilemma game of (12.1) with discount
factor δ = 0.999.
(a) Find a subgame-perfect Nash equilibrium in pure strategies under which the
average payoff of each player is between 1.1 and 1.2. Verify that your
strategy profile is indeed a subgame-perfect Nash equilibrium.
Solution: Take any t̂ with (1 − δ^t̂) + 5δ^t̂ = 1 + 4δ^t̂ ∈ (1.1, 1.2), e.g., any t̂
between 2994 and 3687. Consider the strategy profile
Play (D, D) at any t < t̂ and (C, C) at t̂ and thereafter. If any player deviates
from this path, play (D, D) forever.
Note that the average value of each player is (1 − δ^t̂) + 5δ^t̂ ∈ (1.1, 1.2). To
check that it is a subgame-perfect Nash equilibrium, first take any on-path
history with date t ≥ t̂. At that history, the average value of each player is
5. If a player deviates, then his average value is only 6(1 − δ) + δ = 1.005.
Hence, he has no incentive to deviate. For t < t̂, the average value is

(1 − δ^{t̂−t}) + 5δ^{t̂−t} ≥ (1 − δ^t̂) + 5δ^t̂ > 1.1,

while a deviation yields at most 6(1 − δ) + δ = 1.005 < 1.1. Hence, there is no
incentive to deviate at such histories either.
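The range of admissible t̂ in the solution can be recovered by solving 1 + 4δ^t̂ = 1.1 and 1 + 4δ^t̂ = 1.2, i.e., δ^t̂ = 0.025 and δ^t̂ = 0.05. A quick check (names mine):

```python
import math

delta = 0.999

def average_value(t_hat):
    """Average value from playing (D, D) for t_hat periods and (C, C) thereafter."""
    return (1 - delta**t_hat) * 1 + delta**t_hat * 5

# Solve delta**t = 0.05 (average value 1.2) and delta**t = 0.025 (average value 1.1).
t_lo = math.log(0.05) / math.log(delta)    # about 2994: average value 1.2
t_hi = math.log(0.025) / math.log(delta)   # about 3687: average value 1.1
print(round(t_lo), round(t_hi))            # 2994 3687
print(1.1 < average_value(3200) < 1.2)     # True for a t_hat inside the range
```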
3. [Midterm 2, 2006] Two firms, 1 and 2, play the following infinitely repeated game
in which all the previous plays are observed, and each firm tries to maximize
the discounted sum of its profits in the stage games, where the discount
factor is δ = 0.99. At each date t, simultaneously, each firm i selects a price
p_{i,t} ∈ {0.01, 0.02, ..., 0.99, 1}. If p_{1,t} = p_{2,t}, then each firm sells 1 unit of the good;
otherwise, the cheaper firm sells 2 units and the more expensive firm sells 0 units.
Producing the good does not cost anything to the firms. Find a subgame-perfect equi-
librium in which the average value of Firm 1 is at least 1.4. (Check that the
strategy profile you construct is indeed a subgame-perfect equilibrium.)
Solution: (There are several such strategy profiles; I will show one of them.) In
order for the average value to exceed 1.4, the present value must exceed 140. We
can get an average value of approximately 1.5 for Firm 1 by alternating between
(0.99, 1), which yields (1.98, 0), and (1, 1), which yields (1, 1). The average value
of that payoff stream for Firm 1 is

(1.98 + δ)/(1 + δ) ≅ 1.49.

Here is a SPE with such equilibrium play: At even dates play (0.99, 1) and at odd
dates play (1, 1); if any player ever deviates from this scheme, then play (0.01, 0.01)
forever.
We use the single-deviation principle to check that this is a SPE. First note that
in the "deviation" mode, they play a Nash equilibrium of the stage game forever, so
it passes the single-deviation test. Now, consider an even t and a history where
there has not been any deviation. Firm 1 has no incentive to deviate: if it follows
the strategy, it will get the payoff stream 1.98, 1, 1.98, 1, 1.98, ...; if it deviates,
it will get x, 0.01, 0.01, ... where x ≤ 1.96 (x = 1 for an upward deviation). For
Firm 2: if it plays according to the strategy, it will get the payoff stream of 0,
1, 0, 1, 0, 1, ... with present value of

δ/(1 − δ²) ≅ 49.75.

If it deviates, it will get x, 0.01, 0.01, ... where x ≤ 1.96. (The best deviation is
p_2 = 0.98.) This yields a present value of at most 1.96 + 0.01δ/(1 − δ) = 2.95, so
Firm 2 has no incentive to deviate either.
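The present values used in this argument can be verified with a short computation (function name mine):

```python
delta = 0.99

def pv_two_cycle(a, b):
    """Present value of the payoff stream a, b, a, b, ... discounted by delta."""
    return (a + delta * b) / (1 - delta**2)

v1 = pv_two_cycle(1.98, 1.0)    # Firm 1 on the path
v2 = pv_two_cycle(0.0, 1.0)     # Firm 2 on the path
v2_dev = 1.96 + delta * 0.01 / (1 - delta)   # undercut once, then 0.01 forever

print(round((1 - delta) * v1, 4))  # Firm 1's average value, about 1.4925
print(round(v2, 2))                # 49.75
print(v2_dev < v2)                 # True: Firm 2 has no incentive to deviate
```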
4. [Midterm 2, 2011] Alice and Bob are a couple, playing the infinitely repeated game
with the following stage game and discount factor δ. Every day, simultaneously,
Alice and Bob spend x_A ∈ [0, 1] and x_B ∈ [0, 1] fractions of their time on their
relationship, respectively, receiving the stage payoffs u_A = ln(x_A + x_B) + 1 − x_A
and u_B = ln(x_A + x_B) + 1 − x_B, respectively. (Alice and Bob are denoted by A
and B, respectively.) For each of the strategy profiles below, find the conditions
on the parameters for which the strategy profile is a subgame-perfect equilibrium.
(a) Both players spend all of their time on their relationship (i.e., x_A = x_B = 1)
until somebody deviates; the deviating player spends 1 and the other player
spends 0 thereafter. (Find the range of δ.)
Solution: Since (1, 0) and (0, 1) are Nash equilibria of the stage game, there
is no incentive to deviate at any history with a previous deviation by one player.
Now consider any other history, at which they both are supposed to spend 1.
If a player follows the strategy, his average payoff is ln 2. The best deviation
is to spend 0 today, which yields the average payoff of

(ln(1 + 0) + 1 − 0)(1 − δ) + δ · 0 = 1 − δ.

Hence, there is no incentive to deviate iff

ln 2 ≥ 1 − δ,

where the values on the left and right hand sides of the inequality are the average
values from following the strategy profile and the best deviation, respectively.
One can write this as a lower bound on the discount factor:

δ ≥ 1 − ln 2.
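The cutoff δ ≥ 1 − ln 2 ≈ 0.307 can be checked directly; in this sketch the function name is mine:

```python
import math

def no_deviation(delta):
    """Following forever gives average payoff ln 2; the best deviation spends 0
    today (stage payoff 1) and earns 0 thereafter, for an average of 1 - delta."""
    return math.log(2) >= 1 - delta

cutoff = 1 - math.log(2)
print(round(cutoff, 4))       # about 0.3069
print(no_deviation(0.4))      # True: above the cutoff
print(no_deviation(0.2))      # False: below the cutoff
```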
forever. (Find the set of inequalities that must be satisfied by the parameters
δ, x̂, x̃_A, and x̃_B.)
Hint: The following facts about the logarithm may be useful:

ln(e) = 1; ln(x) ≤ x − 1; ln(xy) = ln x + ln y.

Solution: Since (x̃_i, 1 − x̃_i) is a Nash equilibrium of the stage game, there
is no incentive to deviate at state D_i for any i ∈ {A, B}. In state M, the
average payoff from following the strategy profile is ln 2. If a player deviates
at state M, the next state is D_i (as in part (a)), which gives the average payoff
of 1 − x̃_i to i. Hence, as in part (a), the average payoff from the best deviation
is 1 − δ + δ(1 − x̃_i) = 1 − δx̃_i. Therefore, there is no incentive to deviate at
state M iff ln 2 ≥ 1 − δx̃_i, i.e.,

δx̃_i ≥ 1 − ln 2.   (12.6)
On the other hand, in state E, the average payoff from following the strategy
is

(1 − δ)û + δ ln 2,

where û = ln(2x̂) + 1 − x̂ is the stage payoff in state E. The best deviation at
state E yields the stage payoff x̂ today and the average payoff x̂ thereafter, so
there is no incentive to deviate at state E iff

(1 − δ)û + δ ln 2 ≥ (1 − δ)x̂ + δx̂,

which simplifies to

(1 − δ)û + δ ln 2 ≥ x̂.

By substituting the value of û, one can write this condition as

ln 2 + (1 − δ)(ln x̂ + 1 − x̂) ≥ x̂.   (12.7)
Remark 12.1 One can make the strategy profile above a subgame-perfect Nash
equilibrium by varying all four parameters x̂, x̃_1, x̃_2, and δ. For a fixed
(x̂, x̃_1, x̃_2), both conditions bound the discount factor from below, yielding

δ ≥ max{ (1 − ln 2)/x̃_1, (1 − ln 2)/x̃_2, 1 − (x̂ − ln 2)/(ln x̂ + 1 − x̂) }.

(To see this, observe that ln x̂ + 1 − x̂ < 0.) Of course, when δ is fixed, the
above conditions can also be interpreted as bounds on x̃_i and x̂. First, the
contribution of the guilty party in the divorce state cannot be too low:

x̃_i ≥ (1 − ln 2)/δ.

For otherwise, the parties would deviate and the marriage could not be sustained. Second,
the above lower bound on δ also gives an absolute upper bound on the effort
level during the engagement. Since δ < 1 and ln x̂ + 1 − x̂ < 0, the condition
on δ implies that

x̂ < ln 2 ≅ 0.693.

For otherwise, the lower bound on δ would exceed 1. That is, one must start
small, as the engagement may never turn into marriage otherwise. Of course,
one could also skip the engagement altogether.
5. [Final, 2001] This question is about a milkman and a customer. At any day, with
the given order,
(a) Assume that this is repeated for 100 days, and each player tries to maximize
the sum of his or her stage payoffs. Find all subgame-perfect equilibria of this
game.
His average value on the path is

V = p − a,

writing p for the price the customer pays and a for the cost of the milk the milkman
provides. The best deviation for him (at any history on the path of equilibrium play)
is to choose a = 0 (and not being able to sell thereafter). In that case, his
average value is

(1 − δ)p + δ · 0 = (1 − δ)p.

Hence, the milkman has no incentive to deviate iff p − a ≥ (1 − δ)p,
i.e.,

δp ≥ a.

In order for the customer to buy on the equilibrium path, it must also be true
that the price does not exceed the value v of the milk to him: p ≤ v. Therefore,

v ≥ p ≥ a/δ.
6. [Midterm 2 Make up, 2006] Since the British officer had a thick pen when he drew
the border, the border of Iraq and Kuwait is disputed. Unfortunately, the border
12.4. EXERCISES WITH SOLUTIONS 231
passes through an important oil field. In each year, simultaneously, each of these
countries decides whether to extract a high (H) or a low (L) amount of oil from this
field. Extracting a high amount of oil from the common field hurts the other country.
In addition, Iraq has the option of attacking Kuwait (W), which is costly for both
countries. The stage game is as follows:
         H           L
H     2, 2       4, 1
L     1, 4       3, 3
W   −1, −1    −1, −2

(Iraq is the row player.)
Consider the infinitely repeated game with this stage game and with discount
factor δ = 0.9.
(a) Find a subgame-perfect Nash equilibrium in which each country extracts a low
(L) amount of oil every year on the equilibrium path.9
Solution: Consider the strategy profile
Play (L, L) until somebody deviates, and play (H, H) thereafter.
This strategy profile is a subgame-perfect Nash equilibrium whenever δ ≥ 1/2.
(You should be able to verify this at this stage.)
(b) Find a subgame-perfect Nash equilibrium in which Iraq extracts a high (H)
amount of oil and Kuwait extracts a low (L) amount of oil every year on the
equilibrium path.
Solution: Consider the following ("Carrot and Stick") strategy profile10
There are two states: War and Peace. The game starts at state Peace. In
state Peace, they play (H, L); they remain in Peace if (H, L) is played and
switch to War otherwise. In state War, they play (W, H); they switch to
Peace if (W, H) is played and remain in War otherwise.
This strategy profile is a subgame-perfect Nash equilibrium whenever δ ≥ 3/5.
The vector of average values is (4, 1) in state Peace and (−1, −1)(1 − δ) +
δ(4, 1) = (5δ − 1, 2δ − 1) in War. Note that both countries strictly prefer
Peace to War.
9
That is, an outside observer would observe that each country extracts a low amount of oil every year.
10
See the next chapter for more on Carrot and Stick strategies.
In state Peace, Iraq plays a stage-game best response, so only Kuwait could
benefit from deviating; it has no incentive to do so iff

2(1 − δ) + δ[2δ − 1] ≤ 1,

i.e., δ ≥ 1/2, which is indeed the case. In state War, Kuwait clearly has no
incentive to deviate. In that state, Iraq could possibly benefit from deviating
to H, getting 2(1 − δ) + δ(5δ − 1). It does not have an incentive to deviate
iff

5δ − 1 ≥ 2(1 − δ) + δ(5δ − 1),

i.e.,

5δ − 1 ≥ 2.

This is equivalent to δ ≥ 3/5, which is clearly the case.
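The average values and both incentive constraints of the Carrot-and-Stick profile can be verified numerically (function names mine):

```python
def war_peace_values(delta):
    """Average values of (Iraq, Kuwait) implied by the Carrot-and-Stick profile.

    Peace plays (H, L) forever on the path: values (4, 1).
    War plays one round of (W, H), payoffs (-1, -1), then returns to Peace.
    """
    peace = (4.0, 1.0)
    war = ((1 - delta) * (-1) + delta * 4, (1 - delta) * (-1) + delta * 1)
    return peace, war

def iraq_obeys_in_war(delta):
    """Iraq plays W in War iff 5*delta - 1 >= 2*(1 - delta) + delta*(5*delta - 1)."""
    v_war = 5 * delta - 1
    return v_war >= 2 * (1 - delta) + delta * v_war

peace, war = war_peace_values(0.9)
print(tuple(round(x, 6) for x in war))   # (3.5, 0.8) at delta = 0.9
print(iraq_obeys_in_war(0.9))            # True: 0.9 >= 3/5
print(iraq_obeys_in_war(0.5))            # False
```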
7. [Selected from Midterms 2 in 2001 and 2002] Below are pairs of stage
games and strategy profiles. For each pair, check whether the strategy profile is a
subgame-perfect Nash equilibrium of the infinitely repeated game with the given
stage game and discount factor δ = 0.99.
Strategy profile: Until some player deviates, Player 1 plays and Player
2 plays . If anyone deviates, then each plays thereafter.
Solution: This is a subgame-perfect Nash equilibrium. After the deviation,
the players play a Nash equilibrium forever. Hence, we only need to check
that no player has any incentive to deviate on the path of equilibrium. Player
1 clearly has no incentive to deviate. If Player 2 deviates, he gets 2 in the
current period and gets zero thereafter. If he sticks to his equilibrium strategy,
then he gets 1 forever. The present value of this is 1/(1 − δ) > 2. Therefore,
Player 2 does not have any incentive to deviate, either.
(c) Stage Game:

        L          C          R
T    2, −1      0, 0     −1, 2
M    0, 0       0, 0      0, 0
B   −1, 2       0, 0      2, −1

Strategy profile: Until some player deviates, Player 1 plays T and Player 2
alternates between L and R. If anyone deviates, then they play (M, C) thereafter.
Solution: It is subgame-perfect. Since (M, C) is a Nash equilibrium of
the stage game, we only need to check if any player wants to deviate at a
history in which Player 1 plays T and Player 2 alternates between L and R
throughout. In such a history, the average value of Player 1 is

V_1 = (2 − δ)/(1 + δ), with 2 − δ = 1.01,

at dates where Player 2 plays L, and

V_1 = (2δ − 1)/(1 + δ), with 2δ − 1 = 0.98,

at dates where Player 2 plays R. Both values are positive, while a deviation
yields at most 2(1 − δ) = 0.02 now and 0 thereafter. Hence Player 1 has no
incentive to deviate, and by symmetry neither does Player 2.
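The two average values and the deviation bound can be confirmed with a short computation (names mine):

```python
delta = 0.99

def alternating_average(first, second):
    """Average value of the payoff stream first, second, first, second, ..."""
    return (first + delta * second) / (1 + delta)

v_at_L_dates = alternating_average(2, -1)   # stream 2, -1, 2, -1, ...
v_at_R_dates = alternating_average(-1, 2)   # stream -1, 2, -1, 2, ...
best_deviation = 2 * (1 - delta)            # grab at most 2 once, then 0 forever

print(round(v_at_L_dates, 4), round(v_at_R_dates, 4))   # 0.5075 0.4925
print(best_deviation < v_at_R_dates)                    # True: no profitable deviation
```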
12.5 Exercises
1. How many strategies are there in the twice-repeated prisoners' dilemma game?
2. Suppose that the stage game is a two-player game in which each player has m
strategies. How many strategies does each player have in an n-times repeated game?
4. Show that in any Nash equilibrium s* of the repeated game, the average payoff of
player i is at least the minmax payoff v̲_i.
5. [Homework 4, 2011] Consider the infinitely repeated game with discount factor
δ = 0.99 and the following stage game (in which the players are trading favors):

         Give        Keep
Give    1, 1       −1, 2
Keep    2, −1       0, 0

(a) Find a subgame-perfect equilibrium under which the average expected payoff
of Player 1 is at least 1.33. Verify that your strategy profile is indeed a
subgame-perfect Nash equilibrium.
(b) Find a subgame-perfect equilibrium under which the average expected payoff
of Player 1 is at least 1.49. Verify that your strategy profile is indeed a
subgame-perfect Nash equilibrium.
6. [Midterm 2, 2011] Consider the 100-times repeated game with the following stage
game:
[Figure: extensive-form stage game. Player 1 first chooses between I and X; after I, play continues through moves a and b, with Player 2 choosing between L and R; the terminal payoffs shown in the figure involve the parameter x.]

where x is either 0 or 6.
(a) Find the set of pure-strategy subgame-perfect equilibria of the stage game
for each x ∈ {0, 6}.
(b) Take x = 6. What is the highest payoff Player 2 can get in a subgame-perfect
equilibrium of the repeated game?
(c) Take x = 0. Find a subgame-perfect equilibrium of the repeated game in
which Player 2 gets more than 300 (i.e., more than 3 per day on average).
7. [Midterm 2, 2011] Consider an infinitely repeated game in which the stage game is
as in the previous problem. Take the discount factor δ = 0.99 and x = 6. For each
strategy profile below, check whether it is a subgame-perfect Nash equilibrium.
(a) They play ( ) every day until somebody deviates; they play ( ) thereafter.
(b) There are three states: , 1, and 2, where the play is ( ), ( ), and
( ), respectively. The game starts at state . After state , it switches
to state 1 if the play is ( ) and to state 2 if the play is ( ); it stays
8. [Midterm 2 Make Up, 2011] Consider an infinitely repeated game in which the
discount factor is δ = 0.9 and the stage game is

        x          y          z
a    4, 4      0, 5      0, 0
b    5, 0      3, 3     −1, 0
c    2, 2      1, 1     −2, 0
d    0, 0      0, −1    −3, −2
For each payoff vector (v_1, v_2) below, find a subgame-perfect equilibrium of the
repeated game in which the average discounted payoff is (v_1, v_2). Verify that the
strategy profile you identified is indeed a subgame-perfect equilibrium.
9. [Midterm 2 Make Up, 2011] Consider the infinitely repeated game with the stage
game in the previous problem and discount factor δ ∈ (0, 1). For each of the
strategy profiles below, find the conditions on the discount factor for which the
strategy profile is a subgame-perfect equilibrium.
(a) At t = 0, they play ( ). At each t ≥ 1, they play ( ) if the play at t − 1 is
( ) or if the play at t − 2 is not ( ). Otherwise, they play ( ).
(b) There are 4 states: ( ), ( ), ( ), and ( ). At each state (s_1, s_2), the
play is (s_1, s_2). The game starts at state ( ). For any t with play (s_1, s_2), the
state at t + 1 is
10. [Homework 4, 2011] Consider the n-times repeated game with the following stage
game.
[Figure: three-player stage game. Player A chooses between I and X; X yields the payoff vector (1, 0, 0); after I, players B and C choose between L and R.]
(a) For n = 2, what is the largest payoff A can get in a subgame-perfect Nash
equilibrium in pure strategies?
11. [Homework 4, 2011] Consider the infinitely repeated game with discount factor
δ ∈ (0, 1) and the stage game in the previous problem. For each of the strategy
profiles below, find the range of δ under which the strategy profile is a subgame-
perfect Nash equilibrium.
(a) A always plays . B and C both play until somebody deviates and play
thereafter.
(b) A plays I and B and C rotate between ( ), ( ), and ( ) until some-
body deviates; they play ( ) thereafter.
(Note that the outcome is ( ) ( ) ( ) ( ) ( ) .)
12. [Homework 4, 2007] Seagulls love shellfish. In order to break the shell, they need
to fly high up and drop the shellfish. The problem is that the other seagulls on the
beach are kleptoparasites, and they steal the shellfish if they can reach it first. This
question tells the story of two seagulls, named Irene and Jonathan, who live on a
crowded beach where it is impossible to drop the shellfish and get it before some
other gull steals it. The possible dates are t = 0, 1, 2, 3, ... with no upper bound.
Every day, simultaneously, Irene and Jonathan choose one of two actions: "Up"
or "Down". Up means to fly high up with the shellfish and drop it next to the
other seagull's nest, and Down means to stay down in the nest. Up costs c > 0,
but if the other seagull is down, it eats the shellfish, getting payoff v. That is,
we consider the infinitely repeated game with the following stage game

          Up         Down
Up     −c, −c      −c, v
Down    v, −c       0, 0

and discount factor δ ∈ (0, 1).11 For each strategy profile below, find the set of dis-
count factors under which the strategy profile is a subgame-perfect equilibrium.
(a) Irrespective of the history, Irene plays Up at even dates and Down at
odd dates; Jonathan plays Up at odd dates and Down at even dates.
(b) Irene plays Up at even dates and Down at odd dates while Jonathan
plays the other way around, until someone fails to go Up on a day he is
supposed to do so. They both stay Down thereafter.
(c) For k days Irene goes Up and Jonathan stays Down; in the next k days
Jonathan goes Up and Irene stays Down. This continues back and forth until
someone deviates. They both stay Down thereafter.
(d) Irene goes Up on "Sundays", i.e., at t = 0, 7, 14, 21, ..., and stays Down on
the other days, while Jonathan goes Up every day except for Sundays, when
he rests Down, until someone deviates; they both stay Down thereafter.
(e) At t = 0, Irene goes Up and Jonathan stays Down, and then they alternate.
If a seagull fails to go Up at a history when it is supposed to go Up, then
the next day it goes Up and the other seagull stays Down, and they keep
alternating thereafter until someone fails to go Up when it is supposed to do
so. (For example, given the history, if Irene is supposed to go Up at t but
11
Evolutionarily speaking, the discounted sum is the fitness of the genes, which determine the behavior.
13. [Homework 4, 2007] Consider the infinitely repeated game between Alice and Bob
with the following stage game:
[Figure: stage game. Alice chooses Hire or Fire; Fire yields (0, 0). If Alice Hires, Bob chooses Work, yielding (2, 2), or Shirk, yielding (−1, 3).]
The discount factor is δ = 0.9. (Fire does not mean that the game ends.) For each
strategy profile below, check if it is a subgame-perfect equilibrium. If it is not a
SPE for δ = 0.9, find the set of discount factors under which it is a SPE.
(a) Alice Hires if and only if there is no Shirk in the history. Bob Works if and
only if there is no Shirk in the history.
(b) Alice Hires unless Bob (was hired and) Shirked in the previous period, in
which case she Fires. Bob always Works.
(c) There are three states: Employment, Punishment for Alice, and Punishment
for Bob. In the Employment state, Alice Hires and Bob Works. In the
Punishment state for Alice, Alice Hires but Bob Shirks. In the Punishment
state for Bob, Alice Fires, and Bob would have worked if Alice Hired him. The
game starts in Employment state. At any state, if only one player fails to play
what s/he is supposed to play at that state, then we go to the Punishment
state for that player in the next period; otherwise we go to the Employment
state in the next period.
14. [Midterm 2, 2007] Consider the infinitely repeated game with the following stage
game

            Chicken      Lion
Chicken      3, 3        1, 4
Lion         4, 1        0, 0

and discount factor δ = 0.99. For each strategy profile below, check if it is a
subgame-perfect equilibrium. (You need to state your arguments clearly; you will
not get any points for Yes or No answers.)
(a) There are two states: Cooperation and Fight. The game starts in the Cooperation
state. In the Cooperation state, each player plays Chicken. If both players
play Chicken, then they remain in the Cooperation state; otherwise they go
to the Fight state in the next period. In the Fight state, both play Lion, and
they go back to the Cooperation state in the following period (regardless of
the actions).
(b) There are three states: Cooperation, P1 and P2. The game starts in the Co-
operation state. In the Cooperation state, each player plays Chicken. If they
play (Chicken, Chicken) or (Lion, Lion), then they remain in the Cooperation
state in the next period. If player i plays Lion while the other player plays
Chicken, then in the next period they go to state Pi. In state Pi, player i plays
Chicken while the other player plays Lion; they then go back to Cooperation
state (regardless of the actions).
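Strategy profiles like the ones above are finite-state machines, and the one-shot deviation principle turns the SPE check into a mechanical computation: solve for the average discounted value of each state, then compare it against every single-period deviation. The sketch below (an illustration in Python, not part of the original notes) runs this check on a standard Prisoners' Dilemma with assumed payoffs (4,4), (0,5), (5,0), (1,1) rather than on the exam games themselves; the stage game, state set, and transition rule are the pieces you would replace.

```python
def state_values(states, play, payoff, trans, delta, n, iters=20000):
    """Average discounted value of each automaton state, by value iteration:
    V_i(s) = (1 - delta) * u_i(a(s)) + delta * V_i(trans(s, a(s)))."""
    V = {s: [0.0] * n for s in states}
    for _ in range(iters):
        V = {s: [(1 - delta) * payoff[play[s]][i] + delta * V[trans(s, play[s])][i]
                 for i in range(n)]
             for s in states}
    return V

def passes_one_shot_deviation(states, play, payoff, actions, trans, delta, n=2):
    """Check that no player gains from deviating for a single period in any state."""
    V = state_values(states, play, payoff, trans, delta, n)
    for s in states:
        a = play[s]
        for i in range(n):
            for ai in actions[i]:
                dev = tuple(ai if j == i else a[j] for j in range(n))
                value = (1 - delta) * payoff[dev][i] + delta * V[trans(s, dev)][i]
                if value > V[s][i] + 1e-9:  # profitable one-shot deviation found
                    return False
    return True

# Illustration: grim trigger in a Prisoners' Dilemma (assumed payoffs, not from the text).
payoff = {("C", "C"): (4, 4), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
actions = [("C", "D"), ("C", "D")]
grim_play = {"coop": ("C", "C"), "punish": ("D", "D")}
def grim_trans(s, a):
    return "coop" if s == "coop" and a == ("C", "C") else "punish"

print(passes_one_shot_deviation(["coop", "punish"], grim_play, payoff, actions, grim_trans, 0.99))  # True
print(passes_one_shot_deviation(["coop", "punish"], grim_play, payoff, actions, grim_trans, 0.2))   # False
```

For this particular game, grim trigger passes the check exactly when δ ≥ 1/4, which the two printed cases bracket.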
15. [Midterm 2 Make Up, 2007] Alice has two sons, Bob and Colin. Every day, she is to
choose between letting them play with the toys ("Play") and making them visit their
grandmother ("Visit"). If she makes them visit their grandmother, each of them
gets 1. If she lets them play, then Bob and Colin simultaneously choose between
Grab and Share, which leads to the payoffs as in the following table, where the
third entry is the payoff of Alice:
Consider the infinitely repeated game in which the above game is the stage game
and the discount factor is δ = 0.9. For each strategy profile below check if it is a
subgame-perfect equilibrium. Show your work.
(a) There are three states: Share, P_Bob, and P_Colin. In the Share state, Alice lets
them play, and Bob and Colin both share. In the P_Bob state (resp. P_Colin
state), Alice lets them play, and Bob (resp. Colin) shares while the other
brother grabs. The game starts in the Share state. If Bob (resp. Colin) does
not play what he is supposed to play while the other player plays what he is
supposed to play, then the next day we go to the P_Bob (resp. P_Colin) state; we
go to Share state next day otherwise.
(b) There are two states: Play and Visit. The game starts in the Play state. In
the Play state, Alice lets them play, and both sons share. In Play state, if
everybody does what they are supposed to do, we remain in Play state; we
go to Visit state next day otherwise. In the Visit state, Alice makes them
visit their grandmother, and they would both Grab if she let them play. In
the Visit state, they automatically go back to Play state next day.
16. [Homework 4, 2006] Alice has a restaurant, and Bob is a potential customer. Each
day Alice is to decide whether to use high quality supply (High) or low quality
supply (Low) to make the food, and Bob is to decide whether to buy or not at
price p ∈ [1, 3]. (At the time Bob buys the food, he cannot tell if it is of high
quality, but after buying he knows whether it was high or low quality.) The payoffs
for a given day are as follows.
The discount rate is δ = 0.99. For each of the following strategy profiles, find the
range of p ∈ [1, 3] for which the strategy profile is a subgame-perfect equilibrium.
(a) There are two states: Trade and No-trade. The game starts at Trade state.
In Trade state, Alice uses High quality supply, and Bob Buys. If in the Trade
state Alice uses Low quality supply, then they go to the No-Trade state, in
242 CHAPTER 12. REPEATED GAMES
which for k days Alice uses Low quality supply and Bob Skips. At the end of
the k-th day, independent of what happens, they go back to the Trade state.
(b) Alice is to use High quality supply in the even days, t = 0, 2, 4, . . ., and Low
quality supply in the odd days, t = 1, 3, 5, . . .; Bob is to Buy every day. If
anyone deviates from this program, then in the rest of the game Alice uses
Low quality and Bob Skips.12
17. [Homework 4, 2006] In the game of the previous question, take p = 2, and check
whether each of the following is a subgame-perfect equilibrium. [We assume here that
Bob somehow can check whether the food was good in the previous day even if he did not buy it.]
(a) Everyday Alice uses High quality supply. Bob buys the product in the first
day. Afterwards, Bob buys the product if and only if Alice has used High
quality supply in the previous day.
(b) There are two states: Trade and Punishment. The game starts at Trade state.
In Trade state, Alice uses High quality supply, and Bob Buys. In Trade state
if Alice uses Low quality, then we go to Punishment state. In Punishment
state, Alice uses High quality supply, and Bob Skips. In Punishment state, if
Alice uses Low quality supply or Bob Buys, then we remain in the Punishment
state; otherwise we go to Trade state.
18. [Homework 4, 2006] In an eating club, there are n ≥ 2 members. Each day, each
member i is to decide how much to eat, denoted by x_i, and the payoff of i for that
day is
√x_i − (x_1 + · · · + x_n)/n.
For δ = 0.99, check if either of the following strategy profiles is a subgame-perfect
equilibrium. [If you solve the problem for n = 3, you will get 80%.]
(a) Each player eats x = 1/4 units until somebody eats more than 1/4; thereafter
each eats x = n²/4 units.
12
That is, at any t > 0, Alice will use Low quality supply and Bob will Skip in either of the following
cases: (i) Alice used Low quality supply at an even date s < t, or (ii) she used High quality supply at
an odd date s < t, or (iii) Bob Skipped at some date s < t.
(b) Each player eats x = 1/4 units until somebody eats more than 1/4; thereafter
each eats x = n² units.
19. [Homework 4, 2006] Each day Alice and Bob receive 1 dollar. Alice makes an offer
y to Bob, and Bob accepts or rejects the offer, where y ∈ {0.01, 0.02, . . . , 0.98, 0.99}.
If Bob accepts the offer, Alice gets 1 − y and Bob gets y. If Bob rejects the offer, then
they both get 0. Find the values of δ for which the following is a subgame-perfect
equilibrium, where ȳ ∈ {0.01, 0.02, . . . , 0.98, 0.99} is fixed.
At t = 0, Alice offers ȳ, and Bob accepts Alice's offer, y, if and only if y ≥ ȳ. They
keep doing this until Bob deviates from this program (i.e., until Bob accepts an
offer y < ȳ, or Bob rejects an offer y ≥ ȳ). Thereafter, Alice offers y = 0.01 and
Bob accepts any offer.
20. [Homework 3, 2004] Consider a Firm and a Worker. The firm first decides whether
to pay a wage w > 0 to the worker (hire him), and then the worker is to decide
whether to work, which costs him c > 0 and produces v for the firm, where v > w > c.
The payoffs are as follows:
                         Firm     Worker
pay, work                v − w    w − c
pay, shirk               −w       w
don't pay, work          v        −c
don't pay, shirk         0        0
(b) Now consider the game in which this stage game is repeated infinitely many times and
the players discount the future with discount factor δ. The following are strategy profiles
for this repeated game. For each of them, check if it is a subgame-perfect
Nash equilibrium for large values of δ, and if so, find the lowest discount factor
that makes the strategy profile a subgame-perfect equilibrium.
i. No matter what happens, the firm always pays and the worker works.
ii. At any time t, the worker works if he is paid at t, and the firm always
pays.
iii. At t = 0, the firm pays and the worker works. At any time t > 0, the
firm pays if and only if the worker worked at all previous dates, and the
worker works if and only if he has worked at all previous dates.
iv. At t = 0, the firm pays and the worker works. At any time t > 0, the
firm pays if and only if the worker worked at all previous dates at which
the firm paid, and the worker works if and only if he is paid at t and he
has worked at all previous dates at which he was paid.
v. There are two states: Employment, and Unemployment. The game starts
at Employment. In this state, the firm pays, and the worker works if and
only if he has been paid at this date. If the worker shirks we go to Un-
employment state; otherwise we stay in Employment. In Unemployment
the firm does not pay and the worker shirks. After k > 0 days of Unem-
ployment we always go back to Employment. (Your answer should cover
each k > 0.)
21. Stage Game: Alice and Bob simultaneously choose contributions x ∈ [0, 1] and
y ∈ [0, 1], respectively, and get payoffs u_A = 2y − x and u_B = 2x − y, respectively.
(a) (5 points) Find the set of rationalizable strategies in the Stage Game above.
(b) (10 points) Consider the infinitely repeated game with the Stage Game above
and with discount factor δ ∈ (0, 1). For each δ, find the maximum (x∗, y∗)
such that there exists a subgame-perfect equilibrium of the repeated game
in which Alice and Bob contribute x∗ and y∗, respectively, on the path of
equilibrium.
(c) (10 points) In part (b), now assume that at the beginning of each period t,
one of the players (Alice at periods t = 0, 2, 4, . . . and Bob at periods
t = 1, 3, 5, . . .) offers a stream of contributions x = (x_t, x_{t+1}, . . .) and y =
(y_t, y_{t+1}, . . .) for Alice and Bob, respectively, and the other player accepts or
rejects. If the offer is accepted, then the game ends, leading to the automatic
contributions x = (x_t, x_{t+1}, . . .) and y = (y_t, y_{t+1}, . . .) from period t on. If the
offer is rejected, they play the Stage Game and proceed to the next period.
Find (x_A, y_A), (x_B, y_B), and (x̂, ŷ) such that the following is a subgame-perfect
equilibrium:
s∗: When it is Alice's turn, Alice offers x = (x_A, x_A, . . .) and y = (y_A, y_A, . . .),
and Bob accepts an offer (x, y) if and only if
(1 − δ) [2x_t − y_t + δ (2x_{t+1} − y_{t+1}) + · · · ] ≥ 2x_B − y_B.
When it is Bob's turn, Bob offers x = (x_B, x_B, . . .) and y = (y_B, y_B, . . .),
and Alice accepts an offer (x, y) if and only if
(1 − δ) [2y_t − x_t + δ (2y_{t+1} − x_{t+1}) + · · · ] ≥ 2y_A − x_A.
If there is no agreement, in the stage game they play (x̂, ŷ).
Verify that s∗ is a subgame-perfect equilibrium for the values that you found.
(If you find it easier, you can consider only the constant streams of contribu-
tions x = (x, x, . . .) and y = (y, y, . . .).)
22. [Selected from Midterms 2 and make up exams in years 2002 and 2004] Below,
there are pairs of stage games and strategy profiles. For each pair, check whether
the strategy profile is a subgame-perfect equilibrium of the game in which the
stage game is repeated infinitely many times. Each agent tries to maximize the
discounted sum of his expected payoffs in the stage game, and the discount factor is
δ = 0.99. (Clearly explain your reasoning in each case.)
(a) Stage Game: There are n ≥ 2 players. Each player, simultaneously, decides
whether to contribute $1 to a public good production project. The amount
of public good produced is y = (x_1 + · · · + x_n)/2, where x_i ∈ {0, 1} is the
level of contribution for player i. The payoff of player i is y − x_i.
Strategy profile: Each player contributes, choosing x_i = 1, if and only if
the amount of public good produced at each previous date is greater than
n/4; otherwise each chooses x_i = 0. (According to this strategy profile, each
player contributes in the first period.)
23. [Midterm 2 Make Up, 2001] Consider the infinitely repeated game with the Pris-
oners' Dilemma game
        C        D
C       4, 4     0, 5
D       5, 0     1, 1
as its stage game. Each agent tries to maximize the discounted sum of his expected
payoffs in the stage game with discount factor δ.
(a) What is the lowest discount factor δ such that there exists a subgame-perfect
equilibrium in which each player plays C on the path of equilibrium play?
[Hint: Note that a player can always guarantee himself an average payoff of
1 by playing D forever.]
(b) For sufficiently large values of δ, construct a subgame-perfect equilibrium in
which any agent's action at any date t only depends on the play at dates t − 1
and t − 2, and in which each player plays C on the path of equilibrium play.
Chapter 13
Application: Implicit Cartels
Consider n firms, each with marginal cost c ≥ 0, that repeatedly play the following
Cournot stage game: each firm i supplies q_i ≥ 0, and the price is
P = max {1 − Q, 0},
where Q = q_1 + · · · + q_n is the total supply. In the repeated game, all the past production
levels of all firms are publicly observable, and each firm's utility function is the discounted
sum of its stage profits, where the discount factor is δ:
U_i = Σ_{t=0}^{∞} δ^t (P(q_{1,t} + · · · + q_{n,t}) − c) q_{i,t},
where q_{i,t} is the production level of firm i at time t. Sometimes it will be more convenient
to use the discounted average value, which is (1 − δ) U_i.
For any q, write π(q) for the (daily) profit of a firm when each firm produces q, and
π∗(q) = max_{q_i ≥ 0} (P(q_i + (n − 1)q) − c) q_i
      = (1 − (n − 1)q − c)²/4 if (n − 1)q + c ≤ 1, and 0 otherwise,   (13.2)
for the maximum profit of a firm from best responding when all the other firms produce
q.
The firms would like to produce the monopoly supply Q^M = (1 − c)/2
in total and divide the revenues according to their favored division rule, which could be
attained by assigning some production levels to the firms that add up to Q^M. For the
sake of simplicity, let us assume that they would like to divide it equally. Then, the
above outcome is attained by simply each firm producing
q^M = Q^M/n = (1 − c)/(2n).
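These benchmarks are easy to tabulate. The following sketch (illustrative Python, not part of the original notes) computes, for given n and c, the per-firm monopoly quantity q^M = (1 − c)/(2n), the Cournot-Nash quantity q^N = (1 − c)/(n + 1) used later in this chapter, and the corresponding daily profits under the linear demand P = max{1 − Q, 0}:

```python
def benchmarks(n, c):
    """Per-firm monopoly and Cournot-Nash quantities and daily profits
    for P = max(1 - Q, 0) with n symmetric firms and marginal cost c."""
    qM = (1 - c) / (2 * n)      # each firm's equal share of the monopoly supply
    qN = (1 - c) / (n + 1)      # Cournot-Nash quantity
    def profit(q):              # daily profit when every firm produces q
        price = max(1 - n * q, 0.0)
        return (price - c) * q
    return qM, qN, profit(qM), profit(qN)

qM, qN, piM, piN = benchmarks(n=2, c=0.0)
print(qM, qN)      # 0.25 0.3333...
print(piM, piN)    # 0.125 0.1111...: collusion beats competition
```

For n = 2 and c = 0 this recovers the familiar values: cartel profit 1/8 per firm versus Cournot profit 1/9.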
As it has been established by the Folk Theorem, when the discount factor is high, such
outcomes can be an outcome of a subgame-perfect equilibrium. In that case, the firms
can make some tacit informal plans that form a subgame-perfect equilibrium and yield
13.2. MONOPOLY PRODUCTION WITH PATIENT FIRMS 249
the desired outcome. Since the plan is a subgame-perfect equilibrium they may hope
that everybody will follow through in the absence of an official enforcement mechanism,
such as courts.
A simple strategy profile that leads to the above outcome is as follows: each firm
produces q^M until some firm deviates, and each firm produces the myopic Cournot-Nash
quantity q^N = (1 − c)/(n + 1) thereafter.
The above strategy profile yields each firm producing q^M forever, stipulating that
they would fall back to the myopic Nash equilibrium production if any firm deviates,
leading to the breakdown of the cartel. This strategy profile may or may not be a
subgame-perfect equilibrium, depending on the discount factor. This section is devoted
to determining the range of discount factors under which it is indeed a subgame-perfect
equilibrium.
Once a firm deviates and the cartel breaks down, the firms are playing the stage-game
Nash equilibrium regardless of what happens thereafter, which is a subgame-perfect Nash
equilibrium of the subgame after break down, as it has been established before. Hence,
by the single-deviation principle, it suffices to check whether a firm has an incentive to
deviate while the cartel is in place (i.e., no firm has deviated from producing q^M). In
that case, according to the single-deviation test, the average discounted value of
producing q^M for a firm is
V^M = π(q^M) = (1 − c)²/(4n).
A deviation of producing q ≠ q^M yields the average value of
V^d(q) = (1 − δ) q (1 − (n − 1)(1 − c)/(2n) − q − c) + δ π(q^N),
where the first term is the payoff from the current period, in which the other firms are
producing q^M each, and the second term δ π(q^N), with π(q^N) = (1 − c)²/(n + 1)², is the
value of the flow payoff of the Nash equilibrium, starting from the next day. The best
possible deviation payoff is
V^d∗ = max_{q ≠ q^M} V^d(q) = (1 − δ) π∗(q^M) + δ π(q^N),
where π∗(q^M) = ((n + 1)(1 − c)/(4n))² is the profit from best responding to q^M. The
firm does not have an incentive to deviate if and only if
V^M ≥ V^d∗,
[Figure 13.1: the cutoff discount factor δ̄ plotted against the number of firms n; δ̄
increases from roughly 0.5 at n = 2 toward 1 as n approaches 100.]
i.e.,
δ ≥ δ̄ ≡ (π∗(q^M) − π(q^M)) / (π∗(q^M) − π(q^N)).
Clearly, for any n, δ̄ is less than 1, and hence the simple trigger strategy profile above
is a subgame-perfect equilibrium when the discount factor is large (larger than δ̄).
As shown in Figure 13.1, for small n, δ̄ is reasonably small, and the monopoly prices
are maintained in the simple trigger strategy equilibrium for reasonable values of δ. On
the other hand, δ̄ is increasing in n, and δ̄ → 1 as n → ∞. Hence, for any given
discount factor, as the number of firms becomes very large, the simple trigger strategy
profile fails to be an equilibrium.
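The cutoff δ̄ can be computed directly from the three profit levels. Since π(q^M), π(q^N), and π∗(q^M) all scale with (1 − c)², the cutoff does not depend on the marginal cost. A small Python sketch (mine, not from the notes) reproduces the behavior shown in Figure 13.1:

```python
def delta_bar(n, c=0.0):
    """Cutoff discount factor for the simple trigger strategy with n firms.
    All three profits scale with (1 - c)**2, so the cutoff is independent of c."""
    piM = (1 - c) ** 2 / (4 * n)                  # cartel profit pi(q^M)
    piN = (1 - c) ** 2 / (n + 1) ** 2             # Cournot-Nash profit pi(q^N)
    piDev = ((n + 1) * (1 - c) / (4 * n)) ** 2    # best-response profit pi*(q^M)
    return (piDev - piM) / (piDev - piN)

for n in (2, 5, 20, 100):
    print(n, round(delta_bar(n), 4))
```

For n = 2 the cutoff is exactly 9/17 ≈ 0.53, and the printed values increase toward 1 as n grows, matching the figure.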
The remainder of this section is devoted to finding the optimal production level
supported by a simple trigger strategy. More precisely, for a fixed n and δ, consider the
following strategy profile:
Simple Trigger Strategy (q∗): Each firm is to produce q∗ until somebody deviates,
and produce q^N = (1 − c)/(n + 1) thereafter.
Note that in the outcome of this strategy profile each firm produces q∗ on each day,
yielding the average discounted value of
V(q∗) = π(q∗) = q∗ (1 − n q∗ − c)   (13.3)
to each firm. The main question is: which q∗ maximizes the firms' profits subject
to the constraint that the simple trigger strategy profile is a subgame-perfect Nash
equilibrium?
Once again, since the myopic Nash equilibrium is played after the breakdown of the
cartel, it suffices to check that there is no incentive to deviate on the path, in which
all firms produced q∗ at all times. At any such history, any unilateral deviation q ≠ q∗
yields the average discounted value of
V^d(q) = (1 − δ) q (1 − (n − 1) q∗ − q − c) + δ π(q^N)
to the deviating firm. To see this, note that in the first day, the firm's profit is
q (1 − (n − 1) q∗ − q − c) as it produces q and all the other firms produce q∗. This one-
time profit is multiplied by (1 − δ). After the deviation, the firm gets the myopic Nash
equilibrium profit of π(q^N) = (1 − c)²/(n + 1)² every day, which has the average dis-
counted value of π(q^N). Since the firm gets this starting the next day, it is multiplied
by δ. The simple trigger strategy profile above is a subgame-perfect Nash equilibrium if
and only if
V(q∗) ≥ V^d(q)   (∀q ≠ q∗).   (13.4)
Noting that max_q V^d(q) = (1 − δ) π∗(q∗) + δ π(q^N), the simple trigger strategy
profile is a subgame-perfect equilibrium if and only if (13.4) is satisfied. Hence, the
objective in this section is to maximize V(q∗) = π(q∗) in (13.3) subject to the constraint
π(q∗) ≥ (1 − δ) π∗(q∗) + δ π(q^N) in (13.4).
i.e., at the optimum,
q∗ (1 − n q∗ − c) = (1 − δ) (1 − (n − 1) q∗ − c)²/4 + δ (1 − c)²/(n + 1)².
The explicit solution to the above quadratic equation is not important. The effect of
the parameters on the solution can be gleaned from the equation. The left-hand side is
independent of the discount factor, while the expression on the other side is decreasing
in δ: the payoff from deviation, which is multiplied by (1 − δ), is larger than the myopic
Nash equilibrium payoff, which is multiplied by δ. Hence, as the discount factor
increases, the right-hand side goes down, decreasing q∗. This results in a lower amount
of production and higher profits, at the expense of the consumers: more patient firms
can maintain higher cartel prices without being tempted by short-term opportunities.
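Since π(q) and π∗(q) are quadratics in q, the binding constraint is a quadratic equation, and one of its roots is always q^N (the constraint holds with equality at the Nash quantity because π∗(q^N) = π(q^N)). The sketch below (illustrative Python, with the formulas above as its only input) expands the equation into standard form and returns the other, smaller root:

```python
def optimal_trigger_q(n, c, delta):
    """Smallest root of pi(q) = (1 - delta) * pi*(q) + delta * pi(q^N), i.e.
    the best symmetric quantity a simple trigger strategy can sustain when
    delta is below the cutoff. The other root is always q^N."""
    piN = (1 - c) ** 2 / (n + 1) ** 2
    # g(q) = q(1 - n q - c) - (1 - delta)(1 - (n - 1) q - c)^2 / 4 - delta * piN
    # expanded as A q^2 + B q + C:
    A = -n - (1 - delta) * (n - 1) ** 2 / 4
    B = (1 - c) + (1 - delta) * (n - 1) * (1 - c) / 2
    C = -(1 - delta) * (1 - c) ** 2 / 4 - delta * piN
    disc = B ** 2 - 4 * A * C
    roots = sorted((-B + s * disc ** 0.5) / (2 * A) for s in (+1, -1))
    return roots[0]   # smaller root; the larger one equals q^N
```

For n = 2, c = 0, and δ = 0.4 (below the cutoff 9/17) this gives q∗ = 35/129 ≈ 0.2713, strictly between q^M = 0.25 and q^N = 1/3, and q∗ falls to q^M exactly when δ reaches the cutoff.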
Carrot & Stick Strategy: There are two states: Carrot and Stick. Each
player plays q^C in the Carrot state and q^S in the Stick state. The game starts in the
Carrot state. At any t, if all players play what they are supposed to play,
they go to the Carrot state at t + 1; they go to the Stick state at t + 1 otherwise.
In a Carrot & Stick strategy, the Carrot state is used as a reward for following through,
and the Stick state is used as a punishment for deviation. Hence, the profit π(q^S) from
(q^S, . . . , q^S) is lower than the profit π(q^C) from (q^C, . . . , q^C). Note that punishment in the
Stick state can be costly for everyone including the other players who are punishing the
deviant player. They may then forgive the deviant in order to avoid the cost. In order
to deter them from failing to punish the deviant, equilibrium prescribes that they, too,
will be punished the next period if they fail to punish today.
The average discounted payoff from the Carrot state is
V^C = π(q^C),   (13.5)
and the average discounted payoff from the Stick state is
V^S = (1 − δ) π(q^S) + δ V^C.   (13.6)
The single-deviation principle yields two constraints under which the Carrot & Stick
strategy profile above is a subgame-perfect equilibrium. First, no player has an incentive
for a unilateral deviation in the Carrot state:
V^C ≥ (1 − δ) π∗(q^C) + δ V^S.   (13.7)
Here the first term π∗(q^C) is the profit from the most-profitable deviation, which is
multiplied by 1 − δ as it is a single period's profit, and the second term δ V^S is the average
discounted payoff from switching to the Stick state next day, which is multiplied by δ
because it starts the next day. By substituting the value of V^S in (13.6) into (13.7), one
can simplify (13.7) as
V^C = π(q^C) ≥ (1/(1 + δ)) π∗(q^C) + (δ/(1 + δ)) π(q^S).   (13.8)
This condition gives a lower bound on the average discounted payoff from Carrot: it
has to be at least as high as the daily profit from deviation, multiplied by 1/(1 + δ),
plus the daily profit at the Stick state, multiplied by δ/(1 + δ).
The second constraint is that no firm has an incentive to deviate unilaterally in the
Stick state:
V^S ≥ (1 − δ) π∗(q^S) + δ V^S.   (13.9)
That is, applying the possibly painful punishment at the Stick state must be at least
as good as deviating from this for one day and postponing it to the next period. This
constraint simplifies to
V^S ≥ π∗(q^S).   (13.10)
That is, the average discounted payoff in the Stick state is at least as high as the daily
profit from deviation at that state. By substituting the value of V^S from (13.6), one can
write this directly, again, as a lower bound on the equilibrium profit:
V^C = π(q^C) ≥ (π∗(q^S) − (1 − δ) π(q^S))/δ.   (13.11)
The Carrot & Stick strategy profile gives a subgame-perfect equilibrium if and only if
the simple constraints (13.8) and (13.11) are satisfied.
In general, one can obtain high values for V^C by selecting the punishment profit π(q^S) very
low, even negative. When the costs are zero (i.e., c = 0), since the price is non-negative,
the lowest payoff is also zero, and it is obtained by selecting q^S = 1/(n − 1). In
that case, π(q^S) = π∗(q^S) = 0, and the constraint (13.11) is satisfied for all q^C. Hence,
this value of q^S leads to a subgame-perfect equilibrium if and only if (13.8) is
satisfied:
π(q^C) ≥ (1/(1 + δ)) π∗(q^C).
When this inequality is satisfied at q^C = q^M, then an optimal Carrot & Stick strategy
for the firms is q^C = q^M = 1/(2n) and q^S = 1/(n − 1). This is the case when δ ≥
π∗(q^M)/π(q^M) − 1. Otherwise, an optimal Carrot & Stick strategy is given by q^S =
1/(n − 1) and q^C as the smallest solution to the quadratic equation (1 + δ) π(q^C) =
π∗(q^C).
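For c = 0 the condition δ ≥ π∗(q^M)/π(q^M) − 1 has a closed form: with π(q^M) = 1/(4n) and π∗(q^M) = ((n + 1)/(4n))², it simplifies to δ ≥ (n − 1)²/(4n), far below the simple-trigger cutoff for small n. A quick check (illustrative Python, not part of the notes):

```python
def carrot_stick_threshold(n):
    """Smallest delta for which q^C = q^M, q^S = 1/(n-1) is an optimal
    Carrot & Stick equilibrium when c = 0: delta >= pi*(q^M)/pi(q^M) - 1."""
    piM = 1 / (4 * n)                 # cartel profit at q^M with c = 0
    piDev = ((n + 1) / (4 * n)) ** 2  # best-response profit pi*(q^M)
    return piDev / piM - 1            # simplifies to (n - 1)**2 / (4 * n)

print(carrot_stick_threshold(2))      # 0.125, versus the trigger cutoff 9/17
```

Note that for n ≥ 6 the expression (n − 1)²/(4n) exceeds 1, so the monopoly quantity cannot be sustained this way and the quadratic case in the text applies.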
When the marginal cost is positive (i.e., c > 0), one can make π(q^S) negative and
as small as needed by selecting a large q^S. In that case, the firms can inflict arbitrar-
ily painful punishments on the deviating firm. They are willing to do so because a
failure to punish only delays the punishment and the subsequent reward by one more
period. Giving incentives for such punishment puts an upper bound on q^S through (13.11). This
13.5. PRICE WARS 255
upper bound is large when the marginal cost is small. I will next describe the optimal
strategy for small values of c, so that one can choose q^S > 1/(n − 1). In that case, in
the Stick state, the profit is π(q^S) = −c q^S, i.e., the firms simply incur the cost of
the production as a loss, and the optimal deviation is to avoid this loss by producing
nothing, i.e., π∗(q^S) = 0. Hence, the optimal Carrot & Stick strategy maximizes π(q^C)
subject to the constraints
π(q^C) ≥ (1/(1 + δ)) π∗(q^C) − (δ/(1 + δ)) c q^S,   (13.12)
δ π(q^C) ≥ (1 − δ) c q^S.   (13.13)
A careful reader can check that one can select the second weak inequality as equality.
(That inequality can be strict only when both inequalities are satisfied at the global
optimum q^M.) That is, one can select q^S = δ π(q^C)/((1 − δ) c). In that case, the first
inequality reduces to
π(q^C) ≥ (1 − δ) π∗(q^C).
Therefore, when π(q^M) ≥ (1 − δ) π∗(q^M), an optimal Carrot & Stick strategy is given by
q^C = q^M and q^S = δ π(q^M)/((1 − δ) c). The firms produce the monopoly outcome, and
any deviation leads to the production of q^S that offsets the gain from the optimal deviation.
When π(q^M) < (1 − δ) π∗(q^M), the constraint in the last displayed inequality is binding,
and the production q^C in the optimal Carrot & Stick strategy is the smallest solution to
the quadratic equation
π(q^C) = (1 − δ) π∗(q^C).
In a Carrot & Stick equilibrium, the firms produce large amounts, yielding very small
prices, in order to punish deviations from the equilibrium. For example, in the optimal
strategy above, the price becomes zero after a deviation. This can be viewed as a price war.
Price War: There are k + 1 states: Cartel, W_1, . . . , W_k. Each firm produces q^C
in the Cartel state and q^W = 1/(n − 1) in the states W_1, . . . , W_k. The game
starts at the Cartel state. If each firm produces the above amounts (q^C in the Cartel
state and 1/(n − 1) in the other states), then Cartel and W_k transition to Cartel,
and W_j transitions to W_{j+1} for all j < k. They go to W_1 in the next period
otherwise.
On the path of the above strategy profile, the firms produce the cartel production q^C
every day. Any deviation from this production level starts a price war that lasts k
days. During the price war, the price is 0. If a firm is to deviate at any date during the
punishment, the punishment starts all over again in order to punish the newly deviating
firm.
Note that the average discounted profit at the Cartel state is
V^C = π(q^C),
and the average discounted profit at the war state W_j is
V_j = −(1 − δ^{k−j+1}) c q^W + δ^{k−j+1} V^C,   (13.14)
where c is the marginal cost. Note that, assuming π(q^C) ≥ 0, the situation improves as
they leave more war dates in the past and get closer to the start date of the cartel with
positive payoffs:
V_k ≥ V_{k−1} ≥ · · · ≥ V_1.
In the Cartel state, the single-deviation test requires
V^C ≥ (1 − δ) π∗(q^C) + δ V_1,   (13.15)
i.e., the value of the cartel is higher than one period of optimal deviation plus the value
of starting a war the next day. As in the previous section, by substituting the value of V_1
from (13.14), one simplifies this constraint to
π(q^C) ≥ ((1 − δ)/(1 − δ^{k+1})) π∗(q^C) − (δ(1 − δ^k)/(1 − δ^{k+1})) c q^W.   (13.16)
In any war state W_j, the single-deviation test requires that a firm does not have an
incentive to deviate and start the war all over again:
V_j ≥ (1 − δ) · 0 + δ V_1.
That is, the value of being in the j-th day of the war is at least as good as not producing
at all, avoiding the cost of production of a good that sells at price zero for one day,
and starting the war all over again in the next period. Since V_j ≥ V_1 for each j, this
constraint is satisfied at each war period if it is satisfied at the first day of the war,
i.e.,
V_1 ≥ δ V_1.
Therefore, the single-deviation test in the war states yields a single constraint:
V_1 ≥ 0,
i.e.,
δ^k π(q^C) ≥ (1 − δ^k) c q^W.   (13.17)
In that case, from the equivalent form (13.15), one can see that the constraint (13.16)
reduces to
π(q^C) ≥ (1 − δ) π∗(q^C).
This is the same constraint as in the optimal Carrot & Stick equilibrium. As there, in
the optimal price-war equilibrium, one selects q^C = q^M when π(q^M) ≥ (1 − δ) π∗(q^M),
and q^C equal to the smallest solution to the quadratic equation
π(q^C) = (1 − δ) π∗(q^C)
otherwise.
13.6 Exercises with Solutions
1. [The statement of this exercise is not legible in this copy; only the closing inequalities
of its solution survive:]
(1 + δ) q∗ (1 − n q∗) ≥ (1 − (n − 1) q∗)²/4,
q∗ (1 − n q∗) ≥ 1/(4n²).
2. [Midterm 2, 2007] Consider the infinitely repeated game with the following stage
game (Linear Bertrand duopoly). Simultaneously, Firms 1 and 2 choose prices
p_1 ∈ [0, 1] and p_2 ∈ [0, 1], respectively. Firm i sells
q_i(p_1, p_2) = 1 − p_i        if p_i < p_j,
               (1 − p_i)/2    if p_i = p_j,
               0              if p_i > p_j,
13.6. EXERCISES WITH SOLUTIONS 259
units at price p_i, obtaining the stage payoff of p_i q_i(p_1, p_2). For each strategy
profile below, find the range of parameters under which the strategy profile is a
subgame-perfect equilibrium.
(a) They both charge p = 1/2 until somebody deviates; they both charge 0
thereafter.
Solution: After the switch, they charge 0 forever and the future moves do
not depend on the current actions. Hence, the reduced game is identical to
the original stage game. Since (0, 0) is a Nash equilibrium of the stage game, it
passes the single-deviation test at such a history. Before the switch, we need
to check that
V = 1/8 ≥ (1 − δ) · 1/4 + δ · 0,
i.e., δ ≥ 1/2. (Note that by undercutting, a firm can get 1/4 − ε for any ε > 0.)
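The arithmetic in this solution is easy to confirm numerically. The sketch below (illustrative Python, not part of the notes) evaluates the cartel value and the best undercutting payoff on a fine price grid and recovers the threshold δ = 1/2:

```python
def bertrand_trigger_ok(delta, grid=10000):
    """One-shot deviation check for the grim profile: charge 1/2, revert to 0.
    Cartel value is (1/2)(1/4) = 1/8 per day; a deviator undercuts to p < 1/2,
    sells 1 - p, and earns p(1 - p) once, then 0 forever."""
    cartel = 0.125
    best_dev = max(p / grid * (1 - p / grid)
                   for p in range(grid // 2))  # prices strictly below 1/2
    return cartel >= (1 - delta) * best_dev + delta * 0.0

print(bertrand_trigger_ok(0.51), bertrand_trigger_ok(0.49))  # True False
```

The best deviation payoff on the grid approaches 1/4 from below, so the check flips exactly around δ = 1/2, in line with the solution above.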
(b) There are k + 1 states: Cartel, W_1, . . . , W_k. Each firm charges p = 1/2 in the
Cartel state and p = p∗ in the War states W_1, . . . , W_k, where p∗ < 1/2. The
game starts at the Cartel state. If each firm charges the above prices (1/2 in the
Cartel state and p∗ in the War states), then Cartel and W_k transition to Cartel,
and W_j transitions to W_{j+1} for all j < k. They go to W_1 in the next period
otherwise.
Solution: As in the price war with Cournot oligopoly, there are two binding
conditions for SPE. In the Cartel state no firm should have an incentive to
undercut:
1/8 ≥ (1 − δ) · 1/4 + δ(1 − δ^k) p∗(1 − p∗)/2 + δ^{k+1} · 1/8,
i.e.,
(1 − δ^{k+1})/8 ≥ (1 − δ) · 1/4 + δ(1 − δ^k) p∗(1 − p∗)/2.   (13.18)
In the War states, no firm should have an incentive to undercut, and it suffices
to check this at W_1:
V_1 ≥ (1 − δ) p∗(1 − p∗) + δ V_1,
i.e.,
V_1 ≥ p∗(1 − p∗).
Here,
V_j = (1 − δ^{k−j+1}) p∗(1 − p∗)/2 + δ^{k−j+1} · 1/8
is the average discounted value at war state W_j; since p∗(1 − p∗)/2 ≤ 1/8, we
have V_j ≥ V_1 for each j, so the constraint at W_1 is indeed the binding one.
13.7 Exercises
1. [Homework 4, 2011] Consider the infinitely repeated game with the linear Cournot
oligopoly as the stage game and the discount factor δ. In the stage game, there are
n ≥ 2 firms with zero cost and the inverse-demand function P = max {1 − Q, 0}.
For each strategy profile below, find the range of δ under which the strategy profile
is a subgame-perfect Nash equilibrium.
(a) At each t, each firm produces 1/(2n) until some firm produces another
amount; each firm produces 1 thereafter.
(b) At each t, firms 1, . . . , n produce 1/2, 1/4, . . . , 1/2^n, respectively, until some
firm deviates (by not producing the amount that it is supposed to produce);
they all produce 1/(n + 1) thereafter.
2. [Midterm 2 Make Up, 2007] Consider the infinitely repeated game with discount
rate δ and the following stage game. Simultaneously, the Seller chooses quality q ∈
[0, ∞) of the product and the Customer decides whether to buy at a fixed price p.
The payoff vector is (p − q²/2, q − p) if the customer buys, and (−q²/2, 0) otherwise,
where the first entry is the payoff of the seller and p > 0 is a constant.
(a) Find the highest price p for which there is a SPE such that the customer buys on
the path every day.
(b) Find the set of parameters q̂, q̄, and δ for which the following is a SPE. We
have a Trade state and k Waste states (W_1, W_2, . . . , W_k). In the Trade state, the
seller chooses quality q = q̄, and the buyer buys. In any Waste state, the
seller chooses quality level q̂ and the buyer does not buy. If everybody does
what he is supposed to do, in the next period Trade leads to Trade, W_1 leads
to W_2, W_2 leads to W_3, . . . , W_{k−1} leads to W_k, and W_k leads to Trade. Any
deviation takes us to W_1. The game starts at the Trade state.
3. [Midterm 2, 2007] Consider the infinitely repeated game with the following stage
game (Linear Bertrand duopoly). Simultaneously, Firms 1 and 2 choose prices
p_1 ∈ [0, 1] and p_2 ∈ [0, 1], respectively. Firm i sells
q_i(p_1, p_2) = 1 − p_i        if p_i < p_j,
               (1 − p_i)/2    if p_i = p_j,
               0              if p_i > p_j,
units at price p_i, obtaining the stage payoff of p_i q_i(p_1, p_2). (All the previous prices
are observed, and each player maximizes the discounted sum of his stage payoffs
with discount factor δ ∈ (0, 1).) For each strategy profile below, find the range of
parameters under which the strategy profile is a subgame-perfect equilibrium.
(a) They both charge p = 1/2 until somebody deviates; they both charge 0
thereafter. (You need to find the range of δ.)
(b) There are k + 1 states: Collusion, the first day of war (W_1), the second day of
war (W_2), . . . , and the k-th day of war (W_k). The game starts in the Collusion
state. They both charge p = 1/2 in the Collusion state and p = p∗ in the
war states (W_1, . . . , W_k), where p∗ < 1/2. If both players charge what they
are supposed to charge, then the Collusion state leads to the Collusion state,
W_1 leads to W_2, W_2 leads to W_3, . . . , W_{k−1} leads to W_k, and W_k leads to the
Collusion state. If any firm deviates from what it is supposed to charge at
any state, then they go to W_1. (Every deviation takes us to the first day of a
new war.) (You need to find inequalities with δ, p∗, and k.)
4. [Selected from Midterms 2 and make up exams in years 2002 and 2004] Below,
there are pairs of stage games and strategy profiles. For each pair, check whether
the strategy profile is a subgame-perfect equilibrium of the game in which the
stage game is repeated infinitely many times. Each agent tries to maximize the
discounted sum of his expected payoffs in the stage game, and the discount factor is
δ = 0.99. (Clearly explain your reasoning in each case.)
(a) Stage Game: Linear Cournot Duopoly: There are two firms. Simultane-
ously, each firm i supplies q_i ≥ 0 units of a good, which is sold at price
P = max {1 − (q_1 + q_2), 0}. The cost is equal to zero.
Strategy profile: There are two states: Cartel and Competition. The game
starts at the Cartel state. In the Cartel state, each supplies q_i = 1/4. In the
Cartel state, if each supplies q_i = 1/4, they remain in the Cartel state in the
next period; otherwise they switch to the Competition state in the next period.
In the Competition state, each supplies q_i = 1/2. In the Competition state, they
automatically switch to the Cartel state in the next period.
to the Cartel state in the next period if and only if both supply q_i = 1/2; otherwise
they remain in the Competition state in the next period, too.
Chapter 14
Static Games with Incomplete Information
So far we have focused on games in which any piece of information that is known by
any player is known by all the players (and indeed common knowledge). Such games
are called games of complete information. Informational concerns do not play any
role in such games. In real life, players always have some private information that is not
known by other parties. For example, we can hardly know other players’ preferences and
beliefs as well as they do. Informational concerns play a central role in players’ decision
making in such strategic environments. In the rest of the course, we will focus on such
informational issues. We will consider cases in which a party may have some information
that is not known by some other party. Such games are called games of incomplete
information or asymmetric information. The informational asymmetries are modeled by
Nature’s moves. Some players can distinguish certain moves of nature while some others
cannot. Consider the following simple example, where a firm is contemplating the hiring
of a worker, without knowing how able the worker is.
Example 14.1 Consider the game in Figure 14.1. There are a Firm and a Worker.
Worker can be of High ability, in which case he would like to Work when he is hired, or
of Low ability, in which case he would rather Shirk. Firm would want to Hire the worker
that will work but not the worker that will shirk. Worker knows his ability level. Firm
does not know whether the worker is of high ability or low ability. Firm believes that the
worker is of high ability with probability p and low ability with probability 1 − p. Most
importantly, the firm knows that the worker knows his own ability level. To model this
situation, we let Nature choose between High and Low, with probabilities p and 1 − p,
respectively. We then let the worker observe the choice of Nature, but we do not let the
firm observe Nature’s choice.
[Figure 14.1: The game tree. Nature chooses High (probability p) or Low (probability 1 − p); the Worker observes Nature's choice, the Firm does not. Payoffs (Firm, Worker): High: Hire/Work (1, 2), Hire/Shirk (0, 1), Do not hire (0, 0); Low: Hire/Work (1, 1), Hire/Shirk (−1, 2), Do not hire (0, 0).]
A player’s private information is called his “type”. For instance, in the above example
Worker has two types: High and Low. Since Firm does not have any private information,
Firm has only one type. As in the above example, incomplete information is modeled
via imperfect-information games where Nature chooses each player’s type and privately
informs him. These games are called incomplete-information games or Bayesian games. First, Nature chooses a list t = (t1 , . . . , tn ) ∈ T of types according to a common prior probability distribution p; each player i observes his own type, but not the others'. Finally, players simultaneously choose their actions, each player knowing his own type. We write a = (a1 , a2 , . . . , an ) ∈ A for any list of actions taken by all the players, where ai ∈ Ai is the action taken by player i. The payoff of a player will now depend on players' types and actions; we write ui : T × A → R for the utility function of player i and u = (u1 , . . . , un ). Such a static game with incomplete information is denoted by (N, T, A, p, u) and is called a Bayesian game.
One can write the game in the example above as a Bayesian game by setting
• N = {Firm, Worker};
• TWorker = {High, Low}, with p (High) = p, while Firm has a single type;
• AFirm = {Hire, Do not hire} and AWorker = {Work, Shirk};
• the utility functions uFirm and uWorker are defined by the following tables, where the first entry is the payoff of the firm and the table on the left corresponds to t = High:

              Work    Shirk                    Work    Shirk
Hire          1, 2    0, 1        Hire         1, 1    −1, 2
Do not hire   0, 0    0, 0        Do not hire  0, 0    0, 0
It is very important to note that players’ types may be “correlated”, meaning that a
player “updates” his beliefs about the other players’ type when he learns his own type.
Since he knows his type when he takes his action, he maximizes his expected utility with
respect to the new beliefs he came to after “updating” his beliefs. We assume that he
updates his beliefs using Bayes’ Rule.
Bayes' Rule Let A and B be two events; the probability that A occurs conditional on B occurring is
Pr (A|B) = Pr (A ∩ B) /Pr (B) ,
where Pr (A ∩ B) is the probability that A and B occur simultaneously and Pr (B) is the (unconditional) probability that B occurs.
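As a quick sanity check, Bayes' Rule is a one-line computation; the numbers below are hypothetical:

```python
from fractions import Fraction

def bayes(p_a_and_b, p_b):
    """Pr(A | B) = Pr(A and B) / Pr(B)."""
    return p_a_and_b / p_b

# Hypothetical example: Pr(A and B) = 1/6 and Pr(B) = 1/2
print(bayes(Fraction(1, 6), Fraction(1, 2)))  # 1/3
```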
In static games of incomplete information, the application of Bayes’ Rule will often
be trivial, but a very good understanding of Bayes' Rule is necessary to follow the
treatment of the dynamic games of incomplete information later.
Let pi (t′−i | ti ) denote player i's belief that the types of all other players are t′−i = (t′1 , . . . , t′i−1 , t′i+1 , . . . , t′n ), given that his type is ti . [We may need to use Bayes' Rule if types across players are 'correlated'. But if they are independent, then life is simpler; players do not update their beliefs.] For example, for a two-player Bayesian game, let T1 = T2 = {x, y} and p (x, x) = p (x, y) = p (y, y) = 1/3 and p (y, x) = 0. This distribution is vividly tabulated as

          t2 = x   t2 = y
t1 = x     1/3      1/3
t1 = y      0       1/3

Now,
p1 (x|x) = p1 (y|x) = 1/2.
Similarly,
p1 (x|y) = Pr (t1 = y, t2 = x) /Pr (t1 = y) = p (y, x) / [p (y, x) + p (y, y)] = 0/ (0 + 1/3) = 0,
p1 (y|y) = Pr (t1 = y, t2 = y) /Pr (t1 = y) = p (y, y) / [p (y, x) + p (y, y)] = (1/3) / (0 + 1/3) = 1.
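These conditional beliefs can be computed mechanically from the joint distribution; a sketch (the type labels x and y are illustrative):

```python
from fractions import Fraction

# Joint distribution over (t1, t2): upper-left, upper-right and
# lower-right cells have probability 1/3, the lower-left cell 0.
p = {('x', 'x'): Fraction(1, 3), ('x', 'y'): Fraction(1, 3),
     ('y', 'x'): Fraction(0), ('y', 'y'): Fraction(1, 3)}

def belief(t1, t2):
    """Player 1's belief p1(t2 | t1), by Bayes' Rule."""
    marginal = sum(p[(t1, t)] for t in ('x', 'y'))
    return p[(t1, t2)] / marginal

print(belief('x', 'y'), belief('y', 'x'), belief('y', 'y'))  # 1/2 0 1
```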
A strategy in a Bayesian game is a complete contingent plan, mapping the player's types to his actions. For instance, in the example above, Worker has four
strategies: (Work,Work)–meaning that he will work regardless of whether he is of high
or low ability, (Work, Shirk)–meaning that he will work if he is of high ability and shirk
if he is of low ability, (Shirk, Work), and (Shirk, Shirk).
14.2 Bayesian Nash Equilibrium
When the probability of each type is positive according to p, any Nash equilibrium of a Bayesian game is called a Bayesian Nash equilibrium. In that case, in a Nash equilibrium,
for each type ti , player i plays a best reply to the others' strategies given his beliefs about the other players' types given ti . If the probability of Nature choosing some ti is zero, then any action at that type is possible according to an equilibrium (as his action at that type does not affect his expected payoff). In a Bayesian Nash equilibrium, we assume that for each type ti , player i plays a best reply to the others' strategies given his beliefs about the other players' types given ti , regardless of whether the probability of that type is positive.
Formally, a strategy profile s∗ = (s∗1 , . . . , s∗n ) is a Bayesian Nash equilibrium in an n-person static game of incomplete information if and only if for each player i and type ti ∈ Ti ,
s∗i (ti ) ∈ arg max_{ai ∈ Ai} Σ_{t−i} ui (ai , s∗−i (t−i ) ; ti , t−i ) · pi (t−i | ti ) ,
where ui is the utility of player i and ai denotes an action. That is, for each player and each possible type, the action chosen is optimal given the conditional beliefs of that type against the optimal strategies of all other players. Notice that the utility function of player i depends on both players' actions and types.1 Notice also that a Bayesian Nash
equilibrium is a Nash equilibrium of a Bayesian game with the additional property that
each type plays a best reply.2 For example, for p = 3/4, consider the Nash equilibrium of the game between the firm and the worker in which the firm hires and the worker works if and only if Nature chooses High. We can formally write this strategy profile as s∗ = (s∗F , s∗W ) with
s∗F = Hire, s∗W (High) = Work, s∗W (Low) = Shirk.
We check that this is a Bayesian Nash equilibrium as follows. First consider the firm.
1 The utility function ui does not depend on the whole profile of strategies s1 , . . . , sn , but the expected value of ui possibly does.
2 This property is necessarily satisfied in any Nash equilibrium if all types occur with positive probability.
At its only type, the firm's beliefs about the worker's types are p (High) = 3/4 and p (Low) = 1/4, so its expected payoff from Hire is
E [uF (Hire, s∗W )] = uF (Hire, s∗W (High) ; High) · (3/4) + uF (Hire, s∗W (Low) ; Low) · (1/4)
= uF (Hire, Work; High) · (3/4) + uF (Hire, Shirk; Low) · (1/4)
= 1 · (3/4) + (−1) · (1/4) = 1/2.
Its expected payoff from the action "Do not hire" is 0. Clearly, 1/2 > 0, so Hire is a best reply. Now consider the worker. Given that the firm hires, type High gets 2 from Work and 1 from Shirk. Clearly, 2 > 1, and Work is the best response to s∗F for type High. For type Low, the utility from Shirk is 2, while the utility from Work is 1. Hence, type Low also plays a best response. Therefore, we have checked that s∗ is a Bayesian Nash equilibrium.
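The verification can be mechanized by enumerating actions; a sketch with the payoffs read off Figure 14.1 (first entry: Firm):

```python
from fractions import Fraction

p = Fraction(3, 4)  # Pr(High)

# (firm, worker) payoffs by (type, firm action, worker action)
u = {('High', 'Hire', 'Work'): (1, 2), ('High', 'Hire', 'Shirk'): (0, 1),
     ('Low', 'Hire', 'Work'): (1, 1), ('Low', 'Hire', 'Shirk'): (-1, 2)}

s_worker = {'High': 'Work', 'Low': 'Shirk'}  # candidate worker strategy

# Firm's expected payoff from Hire against s_worker; Do-not-hire yields 0
hire = p * u[('High', 'Hire', s_worker['High'])][0] \
     + (1 - p) * u[('Low', 'Hire', s_worker['Low'])][0]
assert hire == Fraction(1, 2) and hire > 0  # Hire is a best reply

# Each worker type best-responds to Hire
for t in ('High', 'Low'):
    best = max(('Work', 'Shirk'), key=lambda a: u[(t, 'Hire', a)][1])
    assert best == s_worker[t]
print("s* is a Bayesian Nash equilibrium")
```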
Exercise 14.1 Formally, check that firm not hiring and worker shirking for each type
is also a Bayesian Nash equilibrium.
14.3 Example
Suppose that Player 1 chooses between actions U and D and Player 2 chooses between actions L and R. Player 1 privately knows a payoff parameter θ ∈ {0, 2}, Player 2 privately knows a payoff parameter t ∈ {1, 3}, and all pairs (θ, t) have probability 1/4.
Formally, the Bayesian game is defined by setting
• N = {1, 2};
• A1 = {U, D} and A2 = {L, R};
• T1 = {0, 2} and T2 = {1, 3};
• p (θ, t) = 1/4 for each (θ, t) ∈ T1 × T2 .
I next compute a Bayesian Nash equilibrium s∗ of this game. To do that, one needs to determine s∗1 (0) ∈ {U, D}, s∗1 (2) ∈ {U, D}, s∗2 (1) ∈ {L, R}, and s∗2 (3) ∈ {L, R}: four actions in total. First observe that when θ = 0, action U strictly dominates action D, i.e.,
u1 (U, a2 ; 0, t) > u1 (D, a2 ; 0, t)
for all actions a2 ∈ A2 and types t ∈ {1, 3} of Player 2. Hence, it must be that
s∗1 (0) = U.
Similarly, L strictly dominates R for type t = 3 of Player 2, so that
s∗2 (3) = L.
Now consider the type θ = 2 of Player 1. Since his payoff does not depend on t, observe that his payoff from U is 1 + q, where q is the probability that Player 2 plays L. Hence, U is a best response if and only if
q ≥ 1/4.
When q > 1/4, U is the only best response. Note however that type t = 3 must play L, and the probability of that type is 1/2. Therefore,
q ≥ 1/2 > 1/4,
and hence
s∗1 (2) = U.
Now consider t = 1. Given s∗1 , Player 2 knows that Player 1 plays U (regardless of his type). Hence, the payoff of t = 1 is t = 1 when he plays L and 2 when he plays R. Therefore,
s∗2 (1) = R.
To check that ∗ is indeed a Bayesian Nash equilibrium, one checks that each type
plays a best response.
Exercise 14.2 Verify that ∗ is indeed a Bayesian Nash equilibrium. Following the
analysis above, show that there is no other Bayesian Nash equilibrium.
14.4 Exercises with Solutions

1. The payoff matrices for θ = 0 and θ = 1 are

θ = 0:       L        R            θ = 1:       L        R
   U       1, −1    −1, 1             U       1, 1     −1, −1
   D       −1, 1    1, −1             D       −1, 1    1, −1
2. [Midterm 2, 2001] This question is about a thief and a policeman. The thief
has stolen an object. He can either hide the object INSIDE his car or in the
TRUNK. The policeman stops the thief. He can either check INSIDE the car or
the TRUNK, but not both. (He cannot let the thief go without checking, either.)
If the policeman checks the place where the thief hides the object, he catches the
thief, in which case the thief gets −1 and the police gets 1; otherwise, he cannot
catch the thief, and the thief gets 1, the police gets −1.
Policemen cannot distinguish the thieves from each other, nor can the thieves
distinguish the policemen from each other. Each thief has stolen an object,
hiding it either in the TRUNK or INSIDE the car. Then, each of them is
randomly matched to a policeman. Each matching is equally likely. Again,
a policeman can either check INSIDE the car or the TRUNK, but not both.
Write this game as a Bayesian game with two players, a thief and a policeman.
Compute a pure-strategy Bayesian Nash equilibrium of this game.
Solution: The type space is {t1 , . . . , t100 } × {τ1 , . . . , τ100 }, where each pair (ti , τj ) is equally likely. The payoff of a thief is his payoff from part (a) plus ti , depending on his own type; the payoff of a policeman is his payoff from part (a) plus τj , depending on his type. (Here ti is the thief's extra benefit from hiding in the trunk and τj is the policeman's extra benefit from checking the trunk, with half of the types of each player positive and half negative.)
A Bayesian Nash equilibrium: A thief of type ti hides the object in the
TRUNK if ti > 0,
INSIDE if ti < 0;
a policeman of type τj checks the
TRUNK if τj > 0,
INSIDE if τj < 0.
This is a Bayesian Nash equilibrium, because, from the thief’s point of view
the policeman is equally likely to check TRUNK or INSIDE the car, hence
it is the best response for him to hide in the trunk iff the extra benefit from
hiding in the trunk is positive. Similarly for the policemen.
Remark 14.1 Note that from the point of view of an outside observer, the mixed
strategy equilibrium of complete information game in part (a) and the pure strategy
Bayesian Nash equilibrium of the Bayesian game in part (b) are equivalent: in both
cases, the thief hides either inside the car or in the trunk and the policeman checks inside or the trunk, where the probability of each pair is 1/4. Moreover, in both games, the
players face the same uncertainty about the action of the other player, assigning
equal probabilities on both actions. The rationale for those beliefs is somewhat
different however. In the complete information game, a player thinks that the ac-
tions of the other player are equally likely because he does not know the strategy of
the other player, assigning equal probabilities on those strategies. In the Bayesian
game, however, he does know what the other player’s strategy is–as a function of
his type. Yet, he does not know which action the other player takes as he does not
know the other player's type. Therefore, the uncertainty about the strategies in the complete information game is replaced with uncertainty about the others' types. One
can always convert a mixed strategy Nash equilibrium to a pure strategy Bayesian
Nash equilibrium by introducing very small uncertainty about the players’ payoffs.
(This fact is known as Harsanyi’s Purification Theorem.) Hence, a mixed strategy
Nash equilibrium can be interpreted as coming from slight variations in players’
payoffs.
14.5 Exercises
1. [Midterm 2, 2011] Consider a two-player game with the payoff matrix
1 − 0
0 1
2. [Final 2010] Consider a two-player Bayesian game with the following payoff matrix:

(θ1 , θ2 )            (θ1 + 10, θ2 − 10)    (θ1 − 10, θ2 + 10)
(θ1 − 10, θ2 + 10)    (θ1 , θ2 )            (θ1 + 10, θ2 − 10)
(θ1 + 10, θ2 − 10)    (θ1 − 10, θ2 + 10)    (θ1 , θ2 )
3. [Midterm 2 Make up, 2002] Consider the incomplete information game with payoff
matrix
        O              B
O   2 + θ1 , 1       0, 0
B   0, 0             1, 2 + θ2

where θ1 and θ2 are the private information of players 1 and 2, respectively, and are identically and independently distributed with uniform distribution on [−1/3, 2/3].
(Here θi is the type of player i.) Find a Bayesian Nash equilibrium of this game in which for each action (O or B) there is a realization of θi at which player i plays that action.
2 2 0
0 1 1
Chapter 15

Static Applications with Incomplete Information

15.1 Cournot Duopoly with Incomplete Information

Two firms simultaneously choose quantities q1 and q2 of a good, which is sold at the price
P (Q) = a − Q,
where Q = q1 + q2 . Firm 1's marginal cost is zero, and this is common knowledge. Firm 2's marginal cost is its private information: it is cH with probability θ and cL with probability 1 − θ, where cH > cL .
Bayesian Nash Equilibrium A Bayesian Nash equilibrium is a triplet (q1∗ , q2∗ (cH ) , q2∗ (cL ))
of real numbers, where q1∗ is the production level of Firm 1, q2∗ (cH ) is the production
level of type cH of Firm 2, and q2∗ (cL) is the production level of type cL of Firm 2. In
equilibrium each type plays a best response. First consider the high-cost type cH of
Firm 2. In equilibrium, that type knows that Firm 1 produces q1∗ . Hence, its production
level, q2∗ (cH ), solves the maximization problem
max_{q2} [a − q1∗ − q2 − cH ] q2 .
Hence,
q2∗ (cH ) = (a − q1∗ − cH ) /2. (15.1)
Now consider the low-cost type cL of Firm 2. In equilibrium, that type also knows that
Firm 1 produces q1∗ . Hence, its production level, q2∗ (cL ), solves the maximization problem
max_{q2} [a − q1∗ − q2 − cL ] q2 .
Hence,
q2∗ (cL ) = (a − q1∗ − cL ) /2. (15.2)
The important point here is that both types consider the same q1∗ , as that is the strategy
of Firm 1, whose type is known by both types of Firm 2. Now consider Firm 1. It has
one type. Firm 1 knows the strategy of Firm 2, but since it does not know which type
of Firm 2 it faces, it does not know the production level of Firm 2. In Firm 1’s view,
the production level of Firm 2 is q2∗ (cH ) with probability θ and q2∗ (cL ) with probability
1 − θ. Hence, the expected profit of Firm 1 from production level q1 is
θ [a − q1 − q2∗ (cH )] q1 + (1 − θ) [a − q1 − q2∗ (cL )] q1 = [a − q1 − (θq2∗ (cH ) + (1 − θ) q2∗ (cL ))] q1 .
The equality is due to the fact that the production level q2 of Firm 2 enters the payoff [a − q1 − q2 ] q1 = [a − q1 ] q1 − q1 q2 of Firm 1 linearly. The term
θq2∗ (cH ) + (1 − θ) q2∗ (cL )
is the expected production level of Firm 2. Hence, the expected profit of Firm 1 is just his profit from the expected production level, and his best reply is
q1∗ = [a − (θq2∗ (cH ) + (1 − θ) q2∗ (cL ))] /2.
Combining this first-order condition with (15.1) and (15.2), the equilibrium solves the linear system
⎛ q1∗       ⎞   ⎡ 2   θ   1 − θ ⎤⁻¹ ⎛ a      ⎞
⎜ q2∗ (cH ) ⎟ = ⎢ 1   2     0   ⎥   ⎜ a − cH ⎟ ,
⎝ q2∗ (cL ) ⎠   ⎣ 1   0     2   ⎦   ⎝ a − cL ⎠
yielding
q2∗ (cH ) = (a − 2cH ) /3 + (1 − θ) (cH − cL ) /6,
q2∗ (cL ) = (a − 2cL ) /3 − θ (cH − cL ) /6,
q1∗ = [a + θcH + (1 − θ) cL ] /3.
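The linear system is easy to solve numerically and to check against the closed forms; a sketch with illustrative parameter values (a, cH , cL , θ below are not from the text):

```python
import numpy as np

a, cH, cL, theta = 2.0, 0.6, 0.2, 0.4  # illustrative parameter values

# Linear system stacking the three best-reply conditions
A = np.array([[2.0, theta, 1 - theta],
              [1.0, 2.0, 0.0],
              [1.0, 0.0, 2.0]])
b = np.array([a, a - cH, a - cL])
q1, q2H, q2L = np.linalg.solve(A, b)

# Closed-form solution from the text
q1_cf = (a + theta * cH + (1 - theta) * cL) / 3
q2H_cf = (a - 2 * cH) / 3 + (1 - theta) * (cH - cL) / 6
q2L_cf = (a - 2 * cL) / 3 - theta * (cH - cL) / 6

print(np.allclose([q1, q2H, q2L], [q1_cf, q2H_cf, q2L_cf]))  # True
```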
15.2 First-Price Auction

Two bidders, i and j, bid for an object. The valuation vi of each bidder is privately known and independently drawn from the uniform distribution on [0, 1]. The bidders simultaneously submit bids; the higher bidder wins the object and pays his own bid, so that the winner's payoff is vi − bi and the loser's payoff is zero. Each type vi maximizes his expected payoff over bi .
Next, we will compute the Bayesian Nash equilibria. First, we consider a special equilibrium. The technique we use here is a common technique for computing Bayesian Nash equilibria, so pay close attention to the steps.
Step 1 Posit that the equilibrium is symmetric and linear. Symmetric means that
b∗i (vi ) = b (vi )
for some function b from the type space to the action space, where b is the same function for all players. Linear means that b is an affine function of vi , i.e.,
b∗i (vi ) = a + cvi
for all types v1 and v2 , for some constants a and c that will be determined later. The important thing here is that the constants do not depend on the players or their types.
Step 2 Compute the best reply of each type. Fix some type vi . To compute her best reply, first note that c > 0. Then, for any fixed bid bi , bidder i wins whenever bi ≥ b∗j (vj ) = a + cvj , i.e., whenever vj ≤ vj (bi ) ≡ (bi − a) /c. Hence, the expected payoff of type vi from bidding bi is
(vi − bi ) Pr (vj ≤ vj (bi )) = (vi − bi ) vj (bi ) = (vi − bi ) (bi − a) /c,
where the middle equality is due to the fact that vj is distributed by the uniform distribution on [0, 1]. [If you are taking this course, the last step must be obvious to you!]
Alternatively, one can write the expected payoff as the integral3
∫0^{vj (bi )} (vi − bi ) dvj .
This is the area of the rectangle that is between 0 and vj (bi ) horizontally and between bi and vi vertically: (vi − bi ) vj (bi ).
To compute the best reply, we must maximize the last expression over bi . Taking the
derivative and setting it equal to zero yields
bi = (vi + a) /2. (15.6)
Remark 15.1 Note that we took an integral to compute the expected payoff and took a
derivative to compute the best response. Since differentiation is the inverse of integration, this
involves unnecessary calculations in general. In this particular example, the calculations
were simple. In general those unnecessary calculations may be the hardest step. Hence,
it is advisable that one leaves the integral as is and use Leibnitz rule4 to differentiate it
to obtain the first-order condition. Indeed, the graphical derivation above does this.
3 If vj were not uniformly distributed on [0, 1], then it would have been the integral
∫0^{vj (bi )} (vi − bi ) f (vj ) dvj = (vi − bi ) F (vj (bi )) ,
where f and F are the probability density and cumulative distribution functions of vj , respectively.
4 Leibnitz Rule:
d/dx ∫_{t=L(x,y)}^{U (x,y)} f (x, y, t) dt = (∂U/∂x) · f (x, y, U (x, y)) − (∂L/∂x) · f (x, y, L (x, y)) + ∫_{t=L(x,y)}^{U (x,y)} (∂f /∂x) dt.
Step 3 Verify that the best-reply functions are indeed affine, i.e., bi is of the form bi = a + cvi . Indeed, we rewrite (15.6) as
bi = (1/2) vi + a/2. (15.7)
Check that both 1/2 and a/2 are constants, i.e., they do not depend on vi , and they are the same for both players.
Step 4 Compute the constants a and c. To do this, observe that in order to have
an equilibrium, the best reply bi in (15.6) must be equal to b∗i (vi ) for each vi . That is,
(1/2) vi + a/2 = cvi + a
must be an identity, i.e., it must remain true for all values of vi . Hence, the coefficients of vi must be equal on both sides:
c = 1/2.
The intercepts must be the same on both sides, too:
a = a/2.
Thus,
a = 0.
This yields the symmetric, linear Bayesian Nash equilibrium:
b∗i (vi ) = vi /2.
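One can confirm numerically that bidding half one's valuation is a best reply when the opponent does the same; a grid-search sketch (the grid resolution is arbitrary):

```python
# Expected payoff of bid bi for type vi when the opponent bids vj/2
# and vj ~ U[0, 1]: win iff vj/2 <= bi, i.e. with probability min(2*bi, 1).
def expected_payoff(vi, bi):
    win_prob = min(max(2 * bi, 0.0), 1.0)
    return (vi - bi) * win_prob

for vi in [0.1, 0.3, 0.5, 0.9]:
    bids = [i / 10000 for i in range(10001)]
    best = max(bids, key=lambda bi: expected_payoff(vi, bi))
    assert abs(best - vi / 2) < 1e-3, (vi, best)
print("b(v) = v/2 is a best reply for each type checked")
```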
We now compute all symmetric Bayesian Nash equilibria in strictly increasing, differentiable strategies, repeating the same steps. In Step 1, posit a symmetric equilibrium b∗i (vi ) = b (vi ), where b is now only assumed to be strictly increasing and differentiable.
Step 2 Compute the best reply of each type, or compute the first-order condition that must be satisfied by the best reply. To this end, compute that, given that the other player j is playing according to equilibrium, the expected payoff of playing bi for type vi is
(vi − bi ) Pr (b (vj ) ≤ bi ) = (vi − bi ) Pr (vj ≤ b−1 (bi )) = (vi − bi ) b−1 (bi ) ,
where b−1 is the inverse of b. Here, the first equality holds because b is strictly increasing, and the last equality is by the fact that vj is uniformly distributed on [0, 1]. The first-order condition
is obtained by taking the partial derivative of the last expression with respect to bi and
setting it equal to zero. Then, the first-order condition is
−b−1 (b∗i (vi )) + (vi − b∗i (vi )) · (db−1 /dbi )|_{bi =b∗i (vi )} = 0.
Using the formula for the derivative of the inverse function, this can be written as
−b−1 (b∗i (vi )) + (vi − b∗i (vi )) · [1/b′ (v)]|_{b(v)=b∗i (vi )} = 0. (15.8)
Step 3 Identify the best reply with the equilibrium action, towards computing the equilibrium action. That is, set
b∗i (vi ) = b (vi ) .
Substituting this into (15.8) and multiplying through by b′ (vi ), one obtains
b′ (vi ) vi + b (vi ) = vi . (15.9)
Hence,
d [b (vi ) vi ] /dvi = vi .
Therefore,
b (vi ) vi = vi² /2 + const
for some constant const. Since the equality also holds at vi = 0, it must be that const = 0.
Therefore,
b (vi ) = vi /2.
In this case, we were lucky. In general, one obtains a differential equation as in (15.9),
but the equation is not easily solvable in general. Make sure that you understand well the steps leading to the differential equation.
More generally, if vj is distributed with cumulative distribution function F and density f , the first-order condition becomes
−F (b−1 (b∗i (vi ))) + (vi − b∗i (vi )) f (b−1 (b∗i (vi ))) · (db−1 /dbi )|_{bi =b∗i (vi )} = 0.
Using the formula for the derivative of the inverse function, this can be written as
−F (b−1 (b∗i (vi ))) + (vi − b∗i (vi )) f (b−1 (b∗i (vi ))) · [1/b′ (v)]|_{b(v)=b∗i (vi )} = 0.
In Step 3, towards identifying the best reply with the equilibrium action, one substitutes the equality b∗i (vi ) = b (vi ) in this equation and obtains
−F (vi ) + (vi − b (vi )) f (vi ) /b′ (vi ) = 0.
Arranging the terms, one can write this as a usual differential equation:
b′ (vi ) F (vi ) + b (vi ) f (vi ) = vi f (vi ) .
The same trick as in the case of the uniform distribution applies more generally. One can write the above differential equation as
d [b (vi ) F (vi )] /dvi = vi f (vi ) .
By integrating both sides, one then obtains the solution
b (vi ) = [∫0^{vi} vf (v) dv] / F (vi ) .
One can further simplify this solution by integrating the right-hand side by parts:
b (vi ) = [vi F (vi ) − ∫0^{vi} F (v) dv] / F (vi ) = vi − [∫0^{vi} F (v) dv] / F (vi ) .
That is, in equilibrium a bidder shades her bid down by an amount of [∫0^{vi} F (v) dv] / F (vi ).
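The shading formula can be illustrated for a concrete distribution. Below, F (v) = v² is an assumption chosen purely for illustration; for it the formula gives b(v) = v − (v³/3)/v² = 2v/3, which the numeric integration reproduces:

```python
# Bid-shading formula b(v) = v - (integral of F from 0 to v) / F(v),
# illustrated for F(v) = v^2 on [0, 1], where b(v) = 2v/3 in closed form.
def F(x):
    return x * x

def bid(v, steps=100000):
    # Midpoint-rule approximation of the integral of F over [0, v]
    integral = sum(F((k + 0.5) / steps * v) for k in range(steps)) * v / steps
    return v - integral / F(v)

for v in [0.2, 0.5, 1.0]:
    assert abs(bid(v) - 2 * v / 3) < 1e-6
print("shaded bids match the closed form 2v/3")
```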
15.3 Double Auction

A seller (Seller) owns an object that a buyer (Buyer) may purchase. They simultaneously announce prices ps and pb , respectively; if pb ≥ ps , they trade at the midpoint price (pb + ps ) /2, and otherwise there is no trade. The value of the object for Seller is vs and for Buyer is vb . Each player knows her own
valuation privately. Assume that vs and vb are independently and identically distributed
with uniform distribution on [0, 1]. [Recall from the first-price auction what this means.]
Then, the payoffs are
ub = vb − (pb + ps ) /2 if pb ≥ ps , and ub = 0 otherwise;
us = (pb + ps ) /2 − vs if pb ≥ ps , and us = 0 otherwise.
We will now compute Bayesian Nash equilibria. In an equilibrium, one must compute a price ps (vs ) for each type vs of the seller and a price pb (vb ) for each type vb of the buyer. In a Bayesian Nash equilibrium, pb (vb ) solves the maximization problem
max_{pb} E [vb − (pb + ps (vs )) /2 : pb ≥ ps (vs )] ,
and ps (vs ) solves the maximization problem
max_{ps} E [(ps + pb (vb )) /2 − vs : pb (vb ) ≥ ps ] ,
where E [x : A] is the "integral" of x on set A. ( Note that E [x : A] = E [x|A] Pr (A),
where E [x|A] is the conditional expectation of x given A. Make sure that you know all
these terms!!!)
In this game, there are many Bayesian Nash equilibria. For example, one equilibrium is given by
pb = X if vb ≥ X, and pb = 0 otherwise;
ps = X if vs ≤ X, and ps = 1 otherwise,
for any fixed number X ∈ [0, 1]. We will now consider the Bayesian Nash equilibrium with linear strategies.
Step 1 Posit linear strategies:
pb (vb ) = ab + cb vb
ps (vs ) = as + cs vs
for some constants ab , cb , as , and cs . Assume also that cb > 0 and cs > 0. [Notice that a
and c may be different for buyer and the seller.]
Step 2 Compute the best responses for all types. To do this, first note that
pb ≥ ps (vs ) = as + cs vs ⟺ vs ≤ (pb − as ) /cs (15.10)
and
ps ≤ pb (vb ) = ab + cb vb ⟺ vb ≥ (ps − ab ) /cb . (15.11)
To compute the best reply for a type vb , one first computes his expected payoff from his bid (leaving it in an integral form). As shown in Figure 15.2, the payoff of the buyer is
vb − (pb + ps (vs )) /2
when vs ≤ vs (pb ) = (pb − as ) /cs , and the payoff is zero otherwise. Hence, the expected payoff is
E [ub (pb , ps , vb , vs ) |vb ] = E [vb − (pb + ps (vs )) /2 : pb ≥ ps (vs )]
= ∫0^{(pb −as )/cs} [vb − (pb + ps (vs )) /2] dvs .
By substituting ps (vs ) = as + cs vs in this expression, obtain
E [ub (pb , ps , vb , vs ) |vb ] = ∫0^{(pb −as )/cs} [vb − (pb + as + cs vs ) /2] dvs .
Visually, this is the area of the trapezoid that lies between 0 and vs (pb ) horizontally and between the price (ps + pb ) /2 and vb vertically.5
5 The area is
E [ub (pb , ps , vb , vs ) |vb ] = [(pb − as ) /cs ] · [vb − (3pb + as ) /4] ,
but it is not needed for the final result.
[Figure 15.2: the trapezoid described in the text.]
To compute the best reply, take the derivative of the last expression with respect to pb and set it equal to zero:
0 = ∂E [ub (pb , ps , vb , vs ) |vb ] /∂pb = ∂/∂pb ∫0^{(pb −as )/cs} [vb − (pb + as + cs vs ) /2] dvs
= (vb − pb ) /cs − ∫0^{(pb −as )/cs} (1/2) dvs
= (vb − pb ) /cs − (1/2) (pb − as ) /cs .
Solving for pb yields the best reply
pb = (2/3) vb + (1/3) as . (15.12)
Now compute the best reply of a type vs . As before, his expected payoff from playing ps in equilibrium is
E [us (pb , ps , vb , vs ) |vs ] = E [(ps + pb (vb )) /2 − vs : pb (vb ) ≥ ps ]
= ∫_{(ps −ab )/cb}^{1} [(ps + ab + cb vb ) /2 − vs ] dvb ,
where the last equality is by (15.11) and pb (vb ) = ab + cb vb . Once again, in order to
compute the best reply, take the derivative of the last expression with respect to ps and
set it equal to zero:6
− (1/cb ) (ps − vs ) + ∫_{(ps −ab )/cb}^{1} (1/2) dvb = (1/2) [1 − (ps − ab ) /cb ] − (1/cb ) (ps − vs ) = 0.
Once again, a Δ increase in ps leads to a Δ/2 increase in the price, resulting in a gain of [1 − (ps − ab ) /cb ] Δ/2. It also leads to a Δ/cb decrease in the types of buyers who trade,
leading to a loss of (ps − vs ) Δ/cb . At the optimum, the gain and the loss must be equal,
yielding the above equality. Solving for ps , one can then obtain
ps = (2/3) vs + (ab + cb ) /3. (15.13)
Step 3 Verify that best replies are of the form that is assumed in Step 1. Inspecting
(15.12) and (15.13), one concludes that this is indeed the case. The important point
here is to check that in (15.12) the coefficient 2/3 and the intercept (1/3) as are constants,
independent of vb . Similarly for the coefficient and the intercept in (15.13).
Step 4 Compute the constants. To do this, we identify the coefficients and the
intercepts in the best replies with the relevant constants in the functional form in Step
1. Firstly, by (15.12) and pb (vb ) = pb , we must have the identity
ab + cb vb = (1/3) as + (2/3) vb .
6 One uses the Leibnitz rule. The derivative of the upper bound is zero, contributing zero to the derivative. The derivative of the lower bound is 1/cb , and this is multiplied by the expression in the integral at the lower bound, which is simply ps − vs . (Note that at the lower bound pb = ps , and hence the price is simply ps .) Finally, one adds the integral of the derivative of the expression inside the integral, which is simply 1/2.
That is,
ab = (1/3) as (15.14)
and
cb = 2/3. (15.15)
Similarly, by (15.13) and ps (vs ) = ps , we must have the identity
as + cs vs = (ab + cb ) /3 + (2/3) vs .
That is,
as = (ab + cb ) /3 (15.16)
and
cs = 2/3. (15.17)
Solving (15.14), (15.15), (15.16), and (15.17), we obtain ab = 1/12 and as = 1/4.
Therefore, the linear Bayesian Nash equilibrium is given by
pb (vb ) = (2/3) vb + 1/12 (15.18)
ps (vs ) = (2/3) vs + 1/4. (15.19)
In this equilibrium, the parties trade iff
pb (vb ) ≥ ps (vs )
i.e.,
(2/3) vb + 1/12 ≥ (2/3) vs + 1/4,
which can be written as
vb − vs ≥ (3/2) (1/4 − 1/12) = (3/2) (1/6) = 1/4.
Whenever vb > vs there is a positive gain from trade. When the gain from trade is lower than 1/4, the parties leave this gain from trade on the table. This is because of the incomplete information. The parties do not know that there is a positive gain from
trade. Even if they tried to find ingenious mechanisms to elicit the values, buyer would
have an incentive to understate vb and seller would have an incentive to overstate vs ,
and some gains from trade would not be realized.
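The linear equilibrium can be verified numerically from the buyer's side: against ps (vs ) = 2vs /3 + 1/4, a grid search over bids should recover pb (vb ) = 2vb /3 + 1/12. A brute-force sketch (grid resolutions are arbitrary):

```python
def seller_price(vs):
    return 2 * vs / 3 + 0.25  # ps(vs) from (15.19)

def buyer_payoff(vb, pb, steps=1000):
    """Buyer's expected payoff from bidding pb, with vs ~ U[0, 1]."""
    total = 0.0
    for k in range(steps):
        ps = seller_price((k + 0.5) / steps)
        if pb >= ps:  # trade occurs at the midpoint price
            total += vb - (pb + ps) / 2
    return total / steps

for vb in [0.4, 0.7, 1.0]:
    grid = [i / 1000 for i in range(1001)]
    best = max(grid, key=lambda pb: buyer_payoff(vb, pb))
    assert abs(best - (2 * vb / 3 + 1 / 12)) < 5e-3, (vb, best)
print("best replies match pb(vb) = 2vb/3 + 1/12")
```

The symmetric check for the seller works the same way with the roles of (15.18) and (15.19) reversed.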
15.4 Investment in a Joint Project

Two players simultaneously decide whether to invest in a joint project. The payoffs are

              Invest        Not Invest
Invest        θ, θ          θ − 1, 0
Not Invest    0, θ − 1      0, 0
Player 1 chooses between rows, and Player 2 chooses between columns. Here, the payoffs
from not investing is normalized to zero. If a player invests, his payoff depends on the
other player’s action. If the other player also invests, the project succeeds, and both
players get θ. If the other player does not invest, the project fails, and the investing
player incurs a cost, totalling a net benefit of θ − 1. The important thing here is that the return from investment is 1 utile more if the other party also invests.
Now suppose that θ is common knowledge. Consider first the case θ < 0. In that
case, the return from investment is so low that Invest is strictly dominated by Not Invest.
(I am sure you can imagine a case in which even if you learn everything the school is
offering and get the best job, it is still not worth studying.) Each player chooses not
to invest. Now consider the other extreme case: θ > 1. In that case, the return from
investment is so high that Invest strictly dominates Not Invest, and both parties invest
regardless of their expectations about the other. (For example, studying may be such a
fun that you would study the material even if you thought that it will not help you get
any job.) These are two extreme, uninteresting cases.
Now consider the more interesting and less extreme case of 0 < θ < 1. In that case,
there are two equilibria in pure strategies and one equilibrium in mixed strategies. In
the good equilibrium, anticipating that the other player invests, each player invests in
the project, and each gets the positive payoff of θ. In the bad equilibrium, each player
correctly anticipates that the other party will not invest, so that neither of them invest,
yielding zero payoff for both players.
It is tempting to explain the differences between developed and underdeveloped countries that have similar resources, or between successful and unsuccessful companies, by such a multiple-equilibria story. Indeed, many researchers have done so. We will next
consider the case with incomplete information and see that there are serious problems
with such explanations.
Now assume that players do not know θ, but each of them gets an arbitrarily precise
noisy signal about θ. In particular, each player i observes
xi = θ + ηi , (15.20)
where ηi is a small, independent noise term and θ is uniformly distributed on a large interval [−L, L]. One can then write the expected payoffs in terms of the types x1 and x2 , as
              Invest          Not Invest
Invest        x1 , x2        x1 − 1, 0
Not Invest    0, x2 − 1      0, 0
That is, the players do not know how much the other party values the investment,
but they know that the valuations are positively correlated. This is because they are
both estimates about the same thing. For example, if Player 1 finds out that investment
is highly valuable, i.e., x1 is high, then he will believe that Player 2 will also find out
that the investment is valuable, i.e., x2 is high. Because of the noise terms, he will
not know however what x2 is. In particular, for x1 ∈ [0, 1], he will find that the other
player’s signal is higher than his own with probability 1/2 and lower than his own with
probability 1/2:
Pr (xj < xi |xi ) = Pr (xj > xi |xi ) = 1/2. (15.21)
This is implied by the fact that θ is uniformly distributed and we are away from the
corners L and −L. [If you are mathematically inclined, you should prove this.]
We will now look for the symmetric Bayesian Nash equilibria in monotone (i.e. weakly
increasing) strategies. A monotone strategy si here is a strategy with a cutoff value x̂i
such that player invests if and only if his signal exceeds the cutoff:
si (xi ) = Invest if xi ≥ x̂i , and Not Invest if xi < x̂i .
Any symmetric Bayesian Nash equilibrium s∗ in monotone strategies has a cutoff value
x̂ such that
s∗i (xi ) = Invest if xi ≥ x̂, and Not Invest if xi < x̂.
Here, symmetry means that the cutoff value x̂ is the same for both players. In order to
identify such a strategy profile all we need to do is to determine a cutoff value.
We will now find the cutoff values x̂ that yield a Bayesian Nash equilibrium. Notice that the expected payoff from investment is
xi − Pr (sj (xj ) = Not Invest|xi ) .
The payoff from Not Invest is simply zero. Hence, a player invests as a best reply if and only if his signal is at least as high as the probability that the other player is not investing, i.e., xi ≥ Pr (sj (xj ) = Not Invest|xi ). Therefore, s∗ is a Bayesian Nash equilibrium iff
x̂ = Pr (xj < x̂|x̂) . (15.22)
Intuitively, when xi is slightly lower than x̂ we have xi ≤ Pr (xj < x̂|xi ), and when xi
is slightly higher than x̂ we have xi ≥ Pr (xj < x̂|xi ). Because of continuity we must
have equality at xi = x̂. Below, for those who want to see a rigorous proof, I make this argument more formal.
Proof. Since x̂ ∈ [0, 1], there are types xi > x̂, and all such types invest. Hence there
is a sequence of types xi → x̂ with xi > x̂. Since each xi invests, xi ≥ Pr (xj < x̂|xi ).
Moreover, Pr (xj < x̂|xi ) is continuous in xi . Hence, x̂ = lim xi ≥ lim Pr (xj < x̂|xi ) =
Pr (xj < x̂|x̂). Similarly, there are types xi < x̂, who do not invest, and considering such
types approaching x̂, we conclude that x̂ = lim xi ≤ lim Pr (xj < x̂|xi ) = Pr (xj < x̂|x̂).
Combining these two we obtain the equality.
Equation (15.22) shows that there is a unique symmetric Bayesian Nash equilibrium in monotone strategies, namely the one with cutoff x̂ = 1/2:
s∗i (xi ) = Invest if xi ≥ 1/2, and Not Invest if xi < 1/2.
Proof. By (15.22), we have x̂ = Pr (xj < x̂|x̂). But by (15.21), Pr (xj < x̂|x̂) = 1/2. Therefore, x̂ = 1/2.
8
For mathematically inclined students: This is because the game is supermodular: (i) the return
to investment increases with the investment of the other party and with one’s own type xi , and (ii)
the beliefs are increasing in the sense that Pr (xj ≥ a|xi ) is weakly increasing in xi . In that case, the
rationalizable strategies are bounded by symmetric Bayesian Nash equilibria in monotone strategies.
Clearly, when the latter is unique, there must be a unique rationalizable strategy.
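The fixed-point condition (15.22) can also be checked numerically: conditional on a signal xi away from the corners, θ is uniform on [xi − ε, xi + ε], so Pr (xj < xi |xi ) integrates to exactly 1/2. A sketch (ε = 0.1 is an illustrative value):

```python
def prob_other_below(x1, eps=0.1, steps=100001):
    """Pr(x2 < x1 | x1) when theta | x1 is uniform on [x1-eps, x1+eps]
    (valid for x1 away from the corners -L and L)."""
    total = 0.0
    for k in range(steps):
        theta = x1 - eps + 2 * eps * (k + 0.5) / steps
        # Pr(x2 < x1 | theta) = Pr(eta2 < x1 - theta), eta2 ~ U[-eps, eps]
        total += min(max((x1 - theta + eps) / (2 * eps), 0.0), 1.0)
    return total / steps

xhat = 0.5
assert abs(prob_other_below(xhat) - 0.5) < 1e-9
print("xhat = 1/2 satisfies xhat = Pr(xj < xhat | xhat)")
```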
15.5 Exercises with Solutions
Answer:
ui (p1 , . . . , pn ; c1 , . . . , cn ) = (pi − ci ) / |{j : pj = pi }| if pi ≤ pj for all j, and 0 otherwise.
(c) Find all symmetric Bayesian Nash equilibria in strictly increasing and differentiable strategies.
[Hint: Given any c̄ ∈ (0, 1), the probability that cj ≥ c̄ for all j ≠ i is (1 − c̄)^{n−1} .]
Answer: Given that the other players play according to p (·), the expected utility of firm i from charging pi at cost ci is
Ui = (pi − ci ) Pr (pj > pi for all j ≠ i) = (pi − ci ) [1 − p−1 (pi )]^{n−1} .
To see the last equality, note that for all j ≠ i, Pr (cj > p−1 (pi )) = 1 − p−1 (pi ). Since the types are independently distributed, we must multiply these probabilities over the n − 1 firms j ≠ i. The first-order condition is
∂Ui /∂pi = [1 − p−1 (pi )]^{n−1} − (n − 1) [1 − p−1 (pi )]^{n−2} (pi − ci ) · [1/p′ (c)]|_{p(c)=pi} = 0.
(If you obtain this differential equation, you will get 8 out of 10.) To solve it,
notice that the left-hand side is

d/dci [ (1 − ci)^(n−1) p (ci) ].

Therefore,

p (ci) = ( (n − 1) ci + 1 ) / n,

a weighted average of the firm's own cost and the highest possible cost. This price can be high. However, as n → ∞,
p (ci) → ci, and the firm with the lowest cost sells the good at its marginal
cost, as in the competitive equilibrium.
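As a sanity check, the equilibrium can be verified numerically. The sketch below assumes the solution p(c) = c + (1 − c)/n derived above and costs uniform on [0, 1]; it maximizes firm i's expected profit over a price grid and compares the maximizer with p(ci).

```python
import numpy as np

def expected_profit(p_i, c_i, n):
    # Rivals bid p(c) = c + (1 - c)/n, so p^{-1}(p_i) = (n*p_i - 1)/(n - 1).
    c_inv = np.clip((n * p_i - 1) / (n - 1), 0.0, 1.0)
    # Win against each of the n-1 rivals iff his cost exceeds p^{-1}(p_i).
    return (p_i - c_i) * (1.0 - c_inv) ** (n - 1)

n = 4
for c_i in (0.0, 0.3, 0.7):
    grid = np.linspace(c_i, 1.0, 200_001)
    best = grid[np.argmax(expected_profit(grid, c_i, n))]
    candidate = c_i + (1.0 - c_i) / n
    assert abs(best - candidate) < 1e-3
```

The check confirms that the numerical best reply against rivals using p coincides with p itself, which is exactly the fixed-point property of a symmetric equilibrium.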
Remark 15.2 The problem here can be viewed as a procurement auction, in which
the lowest bidder wins. This is closely related to the problem in which n buyers
with privately known values bid in a first-price auction.
2. [Final 2002] Two partners simultaneously invest in a project, where the level of
investment can be any non-negative real number. If partner i invests xi and the
other partner j invests xj, then the payoff of partner i is

θi xi xj − xi³.

Here, θi is privately known by partner i, and the other partner believes that θi is
uniformly distributed on [0, 1]. All these are common knowledge. Find a symmetric
Bayesian Nash equilibrium in which the investment of partner i is in the form of
xi = a + b √θi.
Solution: In this problem, all symmetric Bayesian Nash equilibria turn out to
be of the above form; the question hints at the form. I construct a Bayesian Nash
equilibrium (x∗1, x∗2), which will be in the form of x∗i (θi) = a + b √θi. The expected
payoff of i from investment xi is

θi xi E[x∗j] − xi³.

The first-order condition is θi E[x∗j] = 3xi², i.e.,

x∗i (θi) = √( θi E[x∗j] / 3 ) = √θi · √( E[x∗j] / 3 ).

That is, a = 0, and the equilibrium is in the form of x∗i (θi) = b √θi where

b = √( E[x∗j] / 3 ).
But x∗j = b √θj, hence

E[x∗j] = E[ b √θj ] = b E[ √θj ] = 2b/3.

Therefore,

b² = E[x∗j] / 3 = (2b/3) / 3 = 2b/9.
There are two solutions to this equality, each yielding a distinct Bayesian Nash
equilibrium. The first solution is b = 2/9, yielding

x∗i (θi) = (2/9) √θi.

The second solution is b = 0, yielding the Bayesian Nash equilibrium in which each
player invests 0 regardless of his type.
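The nonzero fixed point can be checked by Monte Carlo. This is a sketch under the exercise's assumptions (payoff θi xi xj − xi³, θ uniform on [0, 1]):

```python
import numpy as np

rng = np.random.default_rng(0)
b = 2.0 / 9.0
theta_j = rng.uniform(0.0, 1.0, 1_000_000)
E_xj = np.mean(b * np.sqrt(theta_j))      # should be close to 2b/3 = 4/27

# Best reply of type theta_i maximizes theta_i * x * E_xj - x**3;
# the FOC theta_i * E_xj = 3 x^2 gives x = sqrt(theta_i * E_xj / 3).
for theta_i in (0.1, 0.5, 0.9):
    x_star = np.sqrt(theta_i * E_xj / 3.0)
    # The best reply coincides with the conjectured strategy b * sqrt(theta_i).
    assert abs(x_star - b * np.sqrt(theta_i)) < 1e-3
```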
f (x) = 3x²  if x ∈ [0, 1];  0 otherwise.
bi = a + cvi
where c > 0. Then, the expected payoff from bidding bi with type vi is

U (bi; vi) = (vi − bi) ∏_{j≠i} Pr ( vj < (bi − a)/c )
           = (vi − bi) ∏_{j≠i} ((bi − a)/c)³
           = (vi − bi) ((bi − a)/c)^(3(n−1)).

The first-order condition is

∂U (bi; vi) /∂bi = − ((bi − a)/c)^(3(n−1)) + 3 (n − 1) (vi − bi) ((bi − a)/c)^(3(n−1)−1) (1/c) = 0;

i.e.,

− (bi − a)/c + 3 (n − 1) (vi − bi) (1/c) = 0;

i.e.,

bi = ( a + 3 (n − 1) vi ) / ( 3 (n − 1) + 1 ).
Since this is an identity, we must have

a = a / ( 3 (n − 1) + 1 ),

i.e., a = 0, and

c = 3 (n − 1) / ( 3 (n − 1) + 1 ).
In the limit as n → ∞, each bidder bids his valuation, and the seller extracts all the
gains from trade.
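Again the equilibrium can be verified numerically. The sketch assumes the density f(x) = 3x² above (so F(v) = v³) and the linear equilibrium bi = c·vi with c = 3(n − 1)/(3(n − 1) + 1):

```python
import numpy as np

def expected_payoff(b_i, v_i, n, c):
    # Rivals bid b(v) = c*v; values have CDF F(v) = v^3 on [0, 1],
    # so each rival bids below b_i with probability (b_i / c)^3.
    win = np.clip(b_i / c, 0.0, 1.0) ** 3
    return (v_i - b_i) * win ** (n - 1)

n = 3
c = 3 * (n - 1) / (3 * (n - 1) + 1)    # = 6/7 for n = 3
for v_i in (0.2, 0.6, 1.0):
    grid = np.linspace(0.0, v_i, 200_001)
    best = grid[np.argmax(expected_payoff(grid, v_i, n, c))]
    assert abs(best - c * v_i) < 1e-3   # best reply equals c * v_i
```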
4. [Midterm 2, 2002] Consider a game between two software developers, who sell
operating systems (OS) for personal computers. (There are also a PC maker and
the consumers, but their strategies are already fixed.) Each software developer
i, simultaneously offers “bribe” bi to the PC maker. (The bribes are in the form
of contracts.) Looking at the offered bribes b1 and b2 , the PC maker accepts the
highest bribe (and tosses a coin between them if they happen to be equal), and he
rejects the other. If a firm’s offer is rejected, it goes out of business, and gets 0.
Let i∗ denote the software developer whose bribe is accepted. Then, i∗ pays the
bribe bi∗ , and the PC maker develops its PC compatible only with the operating
system of i∗ . Then in the next stage, i∗ becomes the monopolist in the market for
operating systems. In this market the inverse demand function is given by
P = 1 − Q,
where P is the price of OS and Q is the demand for OS. The marginal cost of
producing the operating system for each software developer i is ci . The costs c1
and c2 are independently and identically distributed with the uniform distribution
on [0, 1], i.e.,

Pr (ci ≤ c) = 0 if c < 0;  c if c ∈ [0, 1];  1 otherwise.
The software developer i knows its own marginal cost, but the other firm does
not. Each firm tries to maximize its own expected profit. Everything described
so far is common knowledge.
Hence,

Ui (bi; ci) = (vi − bi) √( (bi − α)/γ ).

But maximizing Ui (bi; ci) is the same as maximizing

(vi − bi)² (bi − α),

whose derivative with respect to bi is −2 (vi − bi) (bi − α) + (vi − bi)²; setting it to zero,

2 (bi − α) + (bi − vi) = 0,

i.e.,

bi = (1/3) vi + (2/3) α = (1/12) (1 − ci)² + (2/3) α.

Therefore,

γ = 1/12 and α = (2/3) α =⇒ α = 0,

yielding

bi = (1/3) vi = (1/12) (1 − ci)².
(Check that the second derivative is 2 (3bi − 2vi), which equals −2vi < 0 at bi = vi /3.)
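A numerical sketch of this best-reply property, assuming the monopoly profit vi = (1 − ci)²/4 implied by the inverse demand P = 1 − Q and the rival strategy b(c) = (1 − c)²/12 derived above:

```python
import numpy as np

def expected_payoff(b_i, c_i):
    v_i = (1.0 - c_i) ** 2 / 4.0           # monopoly profit under P = 1 - Q
    # Rival bids b(c) = (1 - c)^2 / 12 with c uniform on [0, 1], so
    # Pr(win) = Pr(b_j < b_i) = sqrt(12 * b_i) for b_i <= 1/12.
    win = np.sqrt(np.clip(12.0 * b_i, 0.0, 1.0))
    return (v_i - b_i) * win

for c_i in (0.0, 0.4, 0.8):
    grid = np.linspace(0.0, 1.0 / 12.0, 200_001)
    best = grid[np.argmax(expected_payoff(grid, c_i))]
    assert abs(best - (1.0 - c_i) ** 2 / 12.0) < 1e-4
```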
(c) Considering that the demand for PCs and the demand for OSs must be the
same, should the PC maker accept the highest bribe? (Assume that the PC maker
also tries to maximize its own profit. Explain your answer.)
Answer: A low-cost monopolist will charge a lower price, increasing the
profit for the PC maker. Since low-cost software developers pay higher
bribes, it is in the PC maker's interest to accept the higher bribe. In that
case, he will get a higher bribe now and higher profits later.
15.6 Exercises
1. [Midterm 2 Make Up, 2011] There are n players in a town. Simultaneously each
player i contributes xi to a public project, yielding a public good of amount

y = x1 + · · · + xn.

The payoff of player i is

ui = y² − ci xi^γ,
where γ > 2 is a known parameter and the cost parameter ci ∈ {1, 2} of player i
is his private information. The costs (c1 , . . . , cn ) are independently and identically
distributed where the probability of ci = 1 is 1/2 for each player i.
2. [Homework 4, 2004] There are n people, who want to produce a common public
good through voluntary contributions. Simultaneously, every player i contributes
xi. The amount of public good produced is

y = x1 + x2 + · · · + xn.

The payoff of player i is

ui = θi y − y² − xi,
L R
X 3, θ 0, 0
Y 2, 2θ 2, θ
Z 0, 0 3, −θ
τ (x1 , . . . , xn ) = x1 + · · · + xn .
θi xi − xi τ (x1 , . . . , xn ) ,
f (x) = (α + 1) x^α  if x ∈ [0, 1];  0 otherwise.
6. [Midterm 2, 2001] Consider a game of public good provision in which two players
simultaneously choose whether to contribute, yielding the payoff matrix, where
the costs c1 and c2 are privately known by players 1 and 2, respectively. c1
and c2 are independently and identically distributed with uniform distribution on
[0, 2] (i.e., independent of his own cost, player 1 believes that c2 is uniformly
distributed on [0, 2], and vice versa). Compute a Bayesian Nash equilibrium
of this game.
7. [Final 2002 Make Up] We consider an "all-pay auction" between two bidders, who
bid for an object. The value of the object for each bidder i is vi, where v1 and v2
are identically and independently distributed with uniform distribution on [0, 1].
Each bidder simultaneously bids bi; the bidder who bids the highest bid gets the
object, and each bidder i pays his own bid bi. (If b1 = b2, then each gets the object
with probability 1/2.) The payoff of player i is

ui = vi − bi  if bi > bj;  vi /2 − bi  if bi = bj;  −bi  if bi < bj.
8. [Homework 6, 2006] (This question is also about a game that was played in the
class.) There are n students in the class. We have a certificate, whose value for
each student i is vi , where vi is privately known by student i and (v1 , . . . , vn ) are
independently and identically distributed with uniform distribution on [0, 100].
Simultaneously, each student i bids a real number bi. The player who bids the
highest number "wins" the certificate; if there is more than one highest bid, then
we determine the "winner" randomly among the highest bidders. The winner i gets
the certificate and pays bi to the professor. [Hint: Pr (max_{j≠i} vj ≤ x) = (x/100)^(n−1).]
(a) Find a symmetric, linear Bayesian Nash equilibrium, where bi (vi ) = a + cvi
for some constants a and c.
(c) Assume that n = 80. How much would a student with value vi be willing to
pay (in terms of lost opportunities and pain of sitting in the class) in order
to play this game? What is the payoff difference between the luckiest student
and the least lucky student?
9. [Homework 6, 2006] In a state, there are two counties, A and B. The state is to
dump the waste in one of the two counties. For a county i, the cost of having the
wasteland is ci , where cA and cB are independently and uniformly distributed on
[0, 1]. They decide where to dump the waste as follows. Simultaneously counties
A and B bid bA and bB , respectively. The waste is dumped in the county i who
bids lower, and the other county j pays bj to i. (We toss a coin if the bids are
equal.) The payoff of a county is the amount of money it has, minus the cost if it
contains the wasteland.
10. [Final, 2006] Alice and Bob have inherited a factory from their parents. The value
of the factory is vA for Alice and vB for Bob, where vA and vB are independently
and uniformly distributed over [0, 1], and each of them knows his or her own value.
Simultaneously, Alice and Bob bid bA and bB , respectively, and the highest bidder
wins the factory and pays the other sibling’s bid. (If the bids are equal, we toss a
coin to determine the winner.)
11. [Final 2007] There are n ≥ 2 siblings, who have inherited a factory from their
parents. The value of the factory is vi for sibling i, where (v1 , . . . , vn ) are inde
pendently and uniformly distributed over [0, 1], and each of them knows his or her
own value. Simultaneously, each i bids bi, and the highest bidder wins the factory
and pays his own bid to his siblings, who share it equally among themselves. (If
the bids are equal, the winner is determined by a lottery with equal probabilities
on the highest bidders.) Note that if i wins, i gets vi − bi and any other j gets
bi / (n − 1).
12. [Homework 6, 2006] There is a house on the market. There are n ≥ 2 buyers.
The value of the house for buyer i is vi (measured in million dollars) where v1 ,
v2 , . . . , vn are independently and identically distributed with uniform distribution
on [0, 1]. The house is to be sold via first-price auction. This question explores
whether various "incentives" can be effective in improving participation.
(a) Suppose that seller gives a discount to the winner, so that winner pays only λbi
for some λ ∈ (0, 1), where bi is his own bid. Compute the symmetric Bayesian
Nash equilibrium. (Throughout the question, you can assume linearity if you
want.) Compute the expected revenue of the seller in that equilibrium.
(b) Suppose that seller gives a prize α > 0 to the winner. Compute the symmetric
Bayesian Nash equilibrium. Compute the expected revenue of the seller in
that equilibrium.
(c) Consider three different scenarios:
• the seller does not give any incentive;
• the seller gives 20% discount (λ = 0.8);
• the seller gives $100,000 to the winner.
For each scenario, determine how much a buyer with value vi is willing to pay
in order to participate in the auction. Briefly discuss whether such incentives
can facilitate the sale of the house.
13. [Homework 6, 2006] We have a penalty kicker and a goalkeeper. Simultaneously,
the penalty kicker decides whether to send the ball to the Left or to the Right, and
the goalkeeper decides whether to cover the Left or the Right. The payoffs are as
follows (where the first entry is the payoff of the penalty kicker):
14. [Final 2010] There are two identical objects and three potential buyers, named
1, 2, and 3. Each buyer only needs one object and does not care which of the
identical objects he gets. The value of the object for buyer i is vi where (v1 , v2 , v3 )
are independently and uniformly distributed on [0, 1]. The objects are sold to two
of the buyers through the following auction. Simultaneously, each buyer i submits
a bid bi , and the buyers who bid one of the two highest bids buy the object and
pay their own bid. (The ties are broken by a coin toss.) That is, if bi > bj for
some j, i gets an object and pays bi , obtaining the payoff of vi − bi ; if bi < bj for
all j, the payoff of i is 0.
15. [Final 2010] A state government wants to construct a new road. There are n
construction firms. In order to decrease the cost of delay in completion of the
road, the government wants to divide the road into k < n segments and construct
the segments simultaneously using different firms. The cost of delay for the public
is Cp = K/k for some constant K > 0. The cost of constructing a segment for firm
i is ci /k where (c1 , . . . , cn ) are independently and uniformly distributed on [0, 1],
where ci is privately known by firm i. The government hires the firms through the
following procurement auction: simultaneously, each firm i submits a bid bi; the
firms with the k lowest bids win one segment each, and each winning firm is paid
the (k + 1)st lowest bid as the price for the construction of the segment. The ties
are broken by a coin toss.
The payoff of a winning firm is the price paid minus its cost of constructing a
segment, and the payoff of a losing firm is 0. For example, if k = 2 and the bids
are (0.1, 0.2, 0.3, 0.4), then firms 1 and 2 win and each is paid 0.3, resulting in
payoff vector (0.3 − c1 /2, 0.3 − c2 /2, 0, 0).
(a) (10 points) For a given fixed k, find a Bayesian Nash equilibrium of this game
in which no firm bids below its cost. Verify that it is indeed a Bayesian Nash
equilibrium.
(b) (10 points) Assume that each winning firm is to pay a β ∈ (0, 1) share of the
price to the local mafia. (In the above example it pays 0.3β to the mafia
and keeps 0.3 (1 − β) for itself.) For a given fixed k, find a Bayesian Nash
equilibrium of this game in which no firm bids below its cost. Verify that it
is indeed a Bayesian Nash equilibrium.
(c) (5 points) Assuming that the government minimizes the sum of Cp and the
total price it pays for the construction, find the condition for the optimal k
for the government in parts (a) and (b). Show that the optimal k in (b) is
weakly lower than the optimal k in (a). Briefly interpret the result. [Hint:
the expected value of the (k + 1)st lowest cost is (k + 1) / (n + 1).]
16. [Final 2011] There are k identical objects and n potential buyers where n > k >
1. Each buyer only needs one object and does not care which of the identical
objects he gets. The value of the object for buyer i is vi where (v1 , v2 , . . . , vn ) are
independently and uniformly distributed on [0, 1]. The objects are sold to k of
the buyers through the following auction. Simultaneously, each buyer i submits a
bid bi , and the buyers who bid one of the k highest bids buy the object and pay
their own bid. (The ties are broken by a coin toss.) That is, if bi > bj for at least
n − k bidders j, then i gets an object and pays bi , obtaining the payoff of vi − bi ;
if bi < bj for at least k bidders j, the payoff of i is 0.
(b) (20 points) Compute a symmetric Bayesian Nash equilibrium of this game in
increasing differentiable strategies. (You will receive 15 points if you derive
the correct equations without solving them.)
Hint: Let (x1, . . . , xm) be independently and uniformly distributed on [0, 1]
and let x(r) be the rth highest xi among (x1, . . . , xm). Then, the probability
density function of x(r) is

fm,r (x) = [ m! / ( (r − 1)! (m − r)! ) ] (1 − x)^(r−1) x^(m−r).
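The density in the hint can be sanity-checked numerically. Note that the (r − 1)! denominator used here is a correction of the r! printed in the source; the sketch below verifies that the corrected density integrates to one and matches simulated order statistics:

```python
import numpy as np
from math import factorial

# Density of the r-th highest of m i.i.d. U[0,1] draws.
def f(x, m, r):
    return (factorial(m) / (factorial(r - 1) * factorial(m - r))
            * (1 - x) ** (r - 1) * x ** (m - r))

m, r = 5, 2
xs = np.linspace(0.0, 1.0, 200_001)
mids, dx = (xs[:-1] + xs[1:]) / 2, xs[1] - xs[0]
assert abs(np.sum(f(mids, m, r)) * dx - 1.0) < 1e-6               # integrates to 1
assert abs(np.sum(mids * f(mids, m, r)) * dx - (m - r + 1) / (m + 1)) < 1e-6

# Monte Carlo cross-check: the mean of the 2nd highest of 5 uniforms is 4/6.
rng = np.random.default_rng(1)
sims = np.sort(rng.uniform(size=(200_000, m)), axis=1)[:, m - r]
assert abs(sims.mean() - (m - r + 1) / (m + 1)) < 5e-3
```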
17. [Final 2011] Consider the following charity auction. There are two bidders, namely
1 and 2. Each bidder i has a distinct favored charity. Simultaneously, each bidder
i contributes bi to the auction. The highest bidder wins, and the sum b1 + b2 goes
to the favored charity of the winner. The winner is determined by a coin toss in
case of a tie. The payoff of the bidder i is
18. [Homework 5, 2011] Consider an n-player game in which each player i selects a
search level si ∈ [0, 1] (simultaneously), receiving the payoff
where (θ1, . . . , θn) are independently and identically distributed on [0, ∞). Here, γ > 1 is commonly known and θi is privately known
by player i. (Denote the expected value of θi by θ̄ and the expected value of θi^α by
θ̄α for any α > 0.)
19. [Homework 5, 2011] Consider an n-player first price auction in which the value
of the object auctioned is vi for player i, where (v1 , . . . , vn ) are independently
and identically distributed with CDF F where F (v) = v α for some α > 0. The
value of vi is privately known by player i. Compute a symmetric Bayesian Nash
equilibrium.
20. [Homework 5, 2011] Consider an auction with two buyers where the value of the
object auctioned is vi for player i, where (v1 , v2 ) are independently and identically
distributed with uniform distribution on [0, 1]. The value of vi is privately known
by player i. In the auction, the buyers simultaneously bid b1 and b2 and the highest
bidder wins the object and pays the average bid (b1 + b2 ) /2 as the price. The ties
are broken with a coin toss. Compute a symmetric Bayesian Nash equilibrium.
Chapter 16

Dynamic Games with Incomplete Information
This chapter is devoted to the basic concepts in dynamic games with incomplete
information. As in the case of complete information, Bayesian Nash equilibrium allows
players to take suboptimal actions in information sets that are not reached in equilibrium.
This problem is addressed by sequential equilibrium, which explicitly requires that
the players play a best reply at every information set (sequential rationality) and that
the players' beliefs are "consistent" with the other players' strategies. Here, I will define
sequential equilibrium and apply it to some important games.
[Figure 16.1 (game tree): Nature chooses the worker W's type, High (probability .7) or Low (probability .3); the Firm chooses Hire or Do not hire (both players get 0 if it does not hire); if hired, W chooses Work or Shirk.]
Figure 16.1: A Bayesian Nash equilibrium in which player W plays a suboptimal action.
worker would shirk if he were hired, independent of whether he is hard working or lazy,
and anticipating this, the firm does not hire. Clearly, hard working worker’s shirking is
against his preferences (which were meant to model a worker who would rather work).
This is however consistent with Bayesian Nash equilibrium because every strategy of
the worker is a best reply to the "do not hire" strategy of the firm. (Worker gets 0 no
matter what strategy he plays.) In order to solve this problem, assume that players are
sequentially rational, i.e., they play a best reply at every information set, maximizing
their expected payoff conditional on being at that information set. That is, when
he is to move, the hard working worker would know that Nature has chosen "High" and
the firm has chosen "Hire", and he must play Work as the only best reply under that
knowledge. This would lead to the other equilibrium, in which firm hires and worker
works if he is hard working and shirks otherwise.
Notice that the latter equilibrium is the only subgame-perfect equilibrium in that
game. Since subgame perfection has been introduced as a remedy to the problem exhibited
in the former equilibrium, it is tempting to think that subgame perfection solves the
problem. As we have seen in the earlier lectures, it does not. For example, consider the
strategy profile in bold in Figure 16.2. This is a subgame-perfect equilibrium because
there is no proper subgame, and it is clearly a Nash equilibrium. Strategy L is a best reply
only to X. However, at the information set where Player 2 moves, she knows that Player 1 has
played either T or B. Given this knowledge, L could not be a best reply.
In order to formalize the idea of sequential rationality for general games, we need to
[Figure 16.2 (game tree): Player 1 chooses X, T, or B; X ends the game with payoffs (2, 6). After T or B, Player 2, without observing Player 1's move, chooses L or R. Payoffs (Player 1, Player 2): after T, L → (0, 1) and R → (3, 2); after B, L → (−1, 3) and R → (1, 5).]
define beliefs:
For any information set I, the player who moves at I believes that he is at node
n ∈ I with probability b (n|I). For example, for the game in Figure 16.2, in order to
define a belief assessment, we need to assign a probability μ on the node after T and
a probability 1 − μ on the node after B. (In information sets with single nodes, the
probability distribution is trivial, putting 1 on the sole node.) When Player 2 moves,
she believes that Player 1 played T with probability μ and B with probability 1 − μ.
We are now ready to define sequential rationality for a strategy profile:
Definition 16.2 For a given pair (s, b) of strategy profile s and belief assessment b,
strategy profile s is said to be sequentially rational iff, at each information set I, the
player who is to move at I maximizes his expected utility
1. given his beliefs b(·|I) at the information set (which imply that he is at information
set I), and
2. given that the players will play according to s in the continuation game.
For example, in Figure 16.2, for Player 2, given any belief μ, L yields
U2 (L; μ) = 1 · μ + 3 · (1 − μ)
[Figure 16.3 (game tree): Player 1 chooses T or B; Player 2, with beliefs .1 on the node after T and .9 on the node after B, chooses L or R. Payoffs (Player 1, Player 2): after T, L → (0, 10) and R → (3, 2); after B, L → (−1, 3) and R → (1, 5).]
while R yields
U2 (R; μ) = 2 · μ + 5 · (1 − μ) .
Hence, sequential rationality requires that Player 2 plays R. Given Player 2 plays R,
the only best reply for Player 1 is T . Therefore, for any belief assessment b, the only
sequentially rational strategy profile is (T, R).
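The comparison of U2(L; μ) and U2(R; μ) can be checked over a grid of beliefs, using the two payoff formulas just derived:

```python
import numpy as np

# Player 2's expected payoffs in Figure 16.2 as functions of mu = b(n_T | I):
mu = np.linspace(0.0, 1.0, 101)
u_L = 1 * mu + 3 * (1 - mu)
u_R = 2 * mu + 5 * (1 - mu)
assert np.all(u_R > u_L)    # R is sequentially rational for every belief
```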
In order to have an equilibrium, b must also be consistent with s. Roughly speaking,
consistency requires that players know which (possibly mixed) strategies are played by
the other players. For a motivation, consider Figure 16.3 and call the node on the left
nT and the node on the right nB . Given the beliefs b (nT |I2 ) = 0.1 and b (nB |I2 ) = 0.9,
strategy profile (T, R) is sequentially rational. Strategy T is a best response to R. To
check the sequential rationality of R, it suffices to note that, given the beliefs, L yields

(.1) (10) + (.9) (3) = 3.7,

while R yields

(.1) (2) + (.9) (5) = 4.7.
(Note that there is no continuation game.) But (T, R) is not even a Nash equilibrium
in this game. This is because in a Nash equilibrium each player knows the other players'
strategies. Player 2 would know that Player 1 plays T, and hence she would assign probability
1 to nT. In contrast, according to b, she assigns only probability 0.1 to nT.
In order to define consistency formally, we need to think more carefully about the
information sets that are reached with positive probability (the information sets that are "on the
path") and the ones that are not supposed to be reached ("off the path") according to
the strategy profile.
Definition 16.3 Given any (possibly mixed) strategy profile s, belief assessment b, and
any information set I that is reached with positive probability according to s, the beliefs
b (·|I) at I are said to be consistent with s iff b (·|I) is derived using the Bayes rule and
s. That is, for each node n in I,

b (n|I) = Pr (n|s) / Σ_{n′∈I} Pr (n′|s).

For example, in order for a belief assessment b to be consistent with (T, R), we need

μ = b (nT |I) = Pr (nT | (T, R)) / [ Pr (nT | (T, R)) + Pr (nB | (T, R)) ] = 1 / (1 + 0) = 1.
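The Bayes-rule computation in Definition 16.3 is mechanical; a minimal sketch (the node labels n_T, n_B are from the example above):

```python
# On-path consistency: beliefs are the Bayes-rule conditionals of the
# probabilities with which the strategy profile reaches each node.
def beliefs(reach_probs):
    total = sum(reach_probs.values())
    if total == 0:
        raise ValueError("off-path information set: Bayes rule does not apply")
    return {node: p / total for node, p in reach_probs.items()}

# Under (T, R), the node after T is reached with probability 1:
mu = beliefs({"n_T": 1.0, "n_B": 0.0})
assert mu == {"n_T": 1.0, "n_B": 0.0}
```

The zero-total branch is exactly the off-path case treated next, where the Bayes rule cannot be applied directly.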
In general, there can be information sets that are not supposed to be reached according
to the strategy profile. In that case the sum Σ_{n′∈I} Pr (n′|s) in the denominator
would be zero, and we cannot apply the Bayes rule (directly). For such information
sets, we perturb the strategy profile slightly, by assuming that players may "tremble",
and apply the Bayes rule using the perturbed strategy profile. To see the general idea,
consider the game in Figure 16.4. The information set of player 3 is off the path of the
strategy profile (X, T, L). Hence, we cannot apply the Bayes rule. But we can still see
that the beliefs in the figure are inconsistent. Let us perturb the strategies of players 1
and 2, assuming that players 1 and 2 tremble with probabilities ε1 and ε2, respectively,
where ε1 and ε2 are small but positive numbers. That is, we put probability ε1 on E
and 1 − ε1 on X (instead of 0 and 1, respectively) and 1 − ε2 on T and ε2 on B (instead
of 1 and 0, respectively). Under the perturbed strategies,

Pr (nT |I3, ε1, ε2) = ε1 (1 − ε2) / [ ε1 (1 − ε2) + ε1 ε2 ] = 1 − ε2,

where nT is the node that follows T. As ε2 → 0, Pr (nT |I3, ε1, ε2) → 1. Therefore, for
consistency, we need b (nT |I3) = 1.
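The tremble computation can be reproduced directly; note how ε1 cancels, so the limit belief depends only on Player 2's tremble:

```python
def perturbed_belief(eps1, eps2):
    # Player 1 trembles to E with prob eps1; Player 2 trembles to B with prob eps2.
    p_nT = eps1 * (1.0 - eps2)    # probability of reaching the node after T
    p_nB = eps1 * eps2            # probability of reaching the node after B
    return p_nT / (p_nT + p_nB)   # = 1 - eps2, independent of eps1

for eps2 in (0.1, 0.01, 0.001):
    assert abs(perturbed_belief(0.05, eps2) - (1.0 - eps2)) < 1e-12
```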
Definition 16.4 Given any (s, b), belief assessment b is consistent with s iff there exist
some trembling probabilities that go to zero such that the conditional probabilities derived
[Figure 16.4 (game tree): Player 1 chooses E or X; X ends the game (Player 1 receives 2). After E, Player 2 chooses T or B, and then Player 3, with beliefs 0.1 on the node after T and 0.9 on the node after B, chooses L or R. Payoffs (Players 1, 2, 3): after T, L → (1, 2, 1) and R → (3, 3, 3); after B, L → (0, 1, 2) and R → (0, 1, 1).]
by the Bayes rule with trembles converge to the probabilities given by b on all information sets
(on and off the path of s). That is, there exists a sequence (s^m, b^m) of assessments such
that

1. (s^m, b^m) → (s, b), and

2. each s^m is completely mixed (every action is taken with positive probability), and each b^m is derived from s^m using the Bayes rule.
Note that a sequential equilibrium is a pair, not just a strategy profile. Hence, in
order to identify a sequential equilibrium, one must identify a strategy profile s, which
describes what a player does at every information set, and a belief assessment b, which
describes what a player believes at every information set. In order to check that
(s, b) is a sequential equilibrium, one must check that
1. (Sequential Rationality) s is a best response to belief b (·|I) and the belief that
the other players will follow s in the continuation games in every information set
I, and
2. (Consistency) there exist trembling probabilities that go to zero such that the
conditional probabilities derived from Bayes rule under the trembles approach
b (·|I) at every information set I.
Example 16.1 In the game in Figure 16.4, the unique subgame-perfect equilibrium is
s∗ = (E, T, R). Let us check that (s∗, b∗) where b∗ (nT |I3) = 1 is a sequential equilibrium.
We need to check that

1. s∗ is sequentially rational given b∗, and

2. b∗ is consistent with s∗.
At the information set of player 3, given b∗ (nT |I3 ) = 1, action L yields 1 while
R yields 3, and hence R is sequentially rational. At the information set of Player 2,
given the other strategies, T and B yield 3 and 1, respectively, and hence playing T
is sequentially rational. At the information set of Player 1, E and X yield 3 and 2,
respectively, and hence playing E is again sequentially rational.
Since all the information sets are reached under s∗, we just need to use the Bayes
rule in order to check consistency:

Pr (nT |I3, s∗) = 1 / (1 + 0) = 1 = b∗ (nT |I3).
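The sequential-rationality checks of Example 16.1 can be mechanized. The payoff entries below are those recovered from the text (e.g., Player 3 gets 1 after (T, L) and 3 after (T, R)), so treat them as an assumption:

```python
# Terminal payoffs (Player 1, Player 2, Player 3) after E, as reconstructed:
payoffs = {("T", "L"): (1, 2, 1), ("T", "R"): (3, 3, 3),
           ("B", "L"): (0, 1, 2), ("B", "R"): (0, 1, 1)}
belief_nT = 1.0   # b*(n_T | I3)

# Player 3: expected payoff of L vs R given the belief at I3.
uL = belief_nT * payoffs[("T", "L")][2] + (1 - belief_nT) * payoffs[("B", "L")][2]
uR = belief_nT * payoffs[("T", "R")][2] + (1 - belief_nT) * payoffs[("B", "R")][2]
assert uR > uL                                    # R is sequentially rational

# Player 2, given Player 3 plays R: T yields 3, B yields 1.
assert payoffs[("T", "R")][1] > payoffs[("B", "R")][1]
```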
[Figure 16.5 (Beer–Quiche game tree): Nature chooses Player 1's type, strong ts (probability .9) or weak tw (probability .1). Each type orders beer or quiche; Player 2 observes only the breakfast and chooses duel or don't. Payoffs (Player 1, Player 2): for ts, (beer, don't) → (3, 1), (beer, duel) → (1, 0), (quiche, don't) → (2, 1), (quiche, duel) → (0, 0); for tw, (quiche, don't) → (3, 0), (quiche, duel) → (1, 1), (beer, don't) → (2, 0), (beer, duel) → (0, 1).]
Consider the game in Figure 16.5. Here Player 1 has two types: strong (ts ) and weak
(tw ). The strong type likes beer for breakfast, while the weak type likes quiche. Player
1 is ordering his breakfast, while Player 2, who is a bully, is watching and contemplating
whether to pick a fight with Player 1. Player 2 would like to pick a fight if Player 1
is weak but not fight if he is strong. His payoffs are such that if he assigns probability
more than 1/2 to weak, he prefers a fight, and if he assigns probability more than 1/2
to strong, then he prefers not to fight. Player 1 would like to avoid a fight: he gets 1
utile from the preferred breakfast and 2 utiles from avoiding the fight. Before observing
the breakfast, Player 2 finds it more likely that Player 1 is strong.
One sequential equilibrium, denoted by (s∗ , b∗ ), is depicted in Figure 16.6. Both
types of Player 1 order beer. If Player 2 sees Beer, he assigns probability 0.9 to strong
and does not fight; if he sees Quiche, he assigns probability 1 on weak and fights. Let
us check that this is indeed a sequential equilibrium.
We start with sequential rationality. Playing Beer is clearly sequentially rational for
the strong type because it leads to the highest payoff for ts . For tw , beer yields 2 (beer,
don’t) while quiche yields only 1 (quiche, duel). Hence beer is sequentially rational for
tw , too. After observing beer, the expected payoff of Player 2 from "duel" is
[Figure 16.6: The equilibrium (s∗, b∗): both types order beer; after beer, Player 2 assigns probability .9 to ts and plays don't; after quiche, he assigns probability 1 to tw and duels.]

(.1) (1) + (.9) (0) = .1,

while "don't" yields

(.1) (0) + (.9) (1) = .9,
and hence "don’t" is indeed sequentially rational. After observing quiche, the expected
payoff of Player 2 from duel is 1 (which is (1) (1) + (0) (0)) while his expected payoff
from "don’t" is 0. Hence, duel is sequentially rational at this information set.
To check consistency, we start with the information set after beer. This information set
is on the path, and hence we use the Bayes rule. Clearly,
Pr (ts |beer, s∗) = Pr (ts) Pr (beer|ts, s∗) / [ Pr (ts) Pr (beer|ts, s∗) + Pr (tw) Pr (beer|tw, s∗) ]
                 = (.9) (1) / [ (.9) (1) + (.1) (1) ] = .9
                 = b∗ (ts |beer),
showing that the beliefs are consistent after observing beer. Now consider the informa
tion set after quiche. This information set is off the path, and we cannot apply the Bayes
rule directly. In order to check consistency at this information set, we need to find some
trembling probabilities that would lead to probability 1 on weak in the limit. (Notice
that we don’t need all the trembles to lead to this probability in the limit. There could
[Figure: the pooling equilibrium in which both types order quiche: after beer (off the path), Player 2 assigns probability 1 to tw and duels; after quiche, his beliefs equal the prior (.1 on tw, .9 on ts) and he plays don't.]
be some other trembles that would lead to a different limit.) Suppose that the weak type
trembles with probability ε while the strong type trembles with probability zero. Then,

Pr (tw |quiche, ε) = (.1) ε / [ (.1) ε + (.9) (0) ] = 1.

As ε → 0, clearly, Pr (tw |quiche, ε) → 1 = b∗ (tw |quiche), showing that b∗ is consistent
with s∗.
The above equilibrium is intuitive. Since the weak type likes quiche, Player 2 takes ordering
quiche as a sign of weakness and fights. Anticipating this, neither type orders
quiche. There is also another sequential equilibrium in which both types order quiche,
as depicted in the figure above.
Exercise 16.1 Check that the strategy profile and the belief assessment in the figure above
form a sequential equilibrium.
Exercise 16.2 Find all sequential equilibria in the Beer and Quiche game. (Hint: Note
that there may be two different equilibria in which the strategy profiles are the same but the
beliefs are different.)
[Figure 16.7: a variant of the Beer–Quiche game in which the weak type really dislikes beer (his payoffs from ordering beer are lowered by 2).]
In a general signaling game, Player 1 privately knows his type, which is payoff
relevant. He takes an action (called a message). Player 2 observes Player 1's action
but not his type, and takes an action. Players' payoffs depend on both players' actions and
Player 1's type.
Both of the equilibria in the Beer and Quiche game are pooling equilibria. In a pooling
equilibrium, Player 2 does not learn anything from Player 1's actions on the path of
equilibrium (i.e., his beliefs at the information sets on the path are just his prior beliefs).
In some signaling games, different types may take different actions, and Player 2 may
learn Player 1's information from his actions:
Example 16.2 Consider the game in Figure 16.7, where weak type really dislikes beer.
In this game there is a unique sequential equilibrium, depicted in Figure 16.8. Since
weak type plays quiche and strong type plays beer, it is a separating equilibrium. Notice
that Player 2 assigns probability 1 to ts after beer and to tw after quiche.
[Figure 16.8: the unique sequential equilibrium of the game in Figure 16.7, a separating equilibrium: ts orders beer and tw orders quiche; Player 2 assigns probability 1 to ts after beer (and plays don't) and probability 1 to tw after quiche (and duels).]
Exercise 16.3 Check that the strategy profile and the belief assessment form a sequential
equilibrium. Show also that this is the only sequential equilibrium.
Let pB and pQ denote the probabilities that Player 2 duels after observing beer and
quiche, respectively, and let UB (t) and UQ (t) denote the expected payoffs from beer and
quiche for type t, respectively. Then,²

UB (ts) − UQ (ts) = 1 + 2 (pQ − pB)

and

UB (tw) − UQ (tw) = −1 + 2 (pQ − pB).

Hence,

UB (ts) − UQ (ts) = 2 + UB (tw) − UQ (tw) > UB (tw) − UQ (tw).     (16.1)
Now, if tw plays beer with positive probability, then for sequential rationality we must
have UB (tw ) ≥ UQ (tw ). Then (16.1) implies that UB (ts ) > UQ (ts ). In that case,
sequential rationality requires that ts must play beer with probability 1. Similarly, one
can conclude that if ts plays quiche with positive probability, then tw must play quiche
with probability 1. Therefore, in a sequential equilibrium, either (i) ts plays beer and tw
mixes, or (ii) ts mixes and tw plays quiche.
The case (ii) cannot happen in equilibrium. After beer, Player 2 must assign probability 1 on ts and not fight, i.e. pB = 0. Moreover, after quiche, he must assign
Pr (tw |quiche) = 0.8/ (0.8 + 0.2 Pr (quiche|ts )) ≥ 0.8
to the weak type and must fight, i.e. pQ = 1. In that case, UB (ts ) = 3 and UQ (ts ) = 0,
so the strong type must play beer with probability 1 (not mix), contradicting that ts mixes.
Therefore, in equilibrium, ts plays beer and tw mixes. By consistency, we must have
Pr (tw |quiche) = Pr (quiche|tw ) (0.8) / (Pr (quiche|tw ) (0.8) + 0 · 0.2) = 1.
By sequential rationality, Player 2 must fight after quiche:
pQ = 1.
Since tw mixes between beer and quiche, he must be indifferent, which yields
pB = 1/2.
That is, after observing beer, Player 2 strictly mixes between "duel" and "don't". For
sequential rationality, he must then be indifferent between them. This happens only
2
Notice that UB (ts ) = 1 + 2 (1 − pB ) and UQ (ts ) = 2 (1 − pQ ).
[Figure: the Beer and Quiche game with prior .8 on tw and .2 on ts, with the mixed sequential equilibrium marked: ts plays beer; tw plays beer with probability 1/4 and quiche with probability 3/4; Player 2 duels with probability 1/2 after beer and duels for sure after quiche.]
when
Pr (tw |beer) = 1/2.
By consistency,
1/2 = Pr (tw |beer) = Pr (beer|tw ) (0.8) / (Pr (beer|tw ) (0.8) + 1 · 0.2) ,
so that
Pr (beer|tw ) = 1/4.
We have identified a strategy profile and belief assessment, depicted in Figure 16.2. From
our derivation, one can check that this is indeed a sequential equilibrium.
Exercise 16.4 Check that the strategy profile and the belief assessment in Figure 16.2
form a sequential equilibrium.
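The consistency computations above are easy to verify with a few lines of arithmetic; the sketch below recomputes the Bayes-rule posteriors from the prior (.8 on tw) and the equilibrium strategies derived in the text (ts plays beer for sure, tw plays beer with probability 1/4).

```python
# Prior and equilibrium strategies from the text.
prior_w, prior_s = 0.8, 0.2        # Pr(tw), Pr(ts)
beer_w, beer_s = 1 / 4, 1.0        # Pr(beer | tw), Pr(beer | ts)

# Posterior on the weak type after observing beer (Bayes' rule).
p_w_beer = prior_w * beer_w / (prior_w * beer_w + prior_s * beer_s)

# Posterior on the weak type after observing quiche (ts never plays quiche).
quiche_w, quiche_s = 3 / 4, 0.0
p_w_quiche = prior_w * quiche_w / (prior_w * quiche_w + prior_s * quiche_s)

print(p_w_beer, p_w_quiche)  # 0.5 and 1.0
```

As required, beer leaves Player 2 with belief 1/2 on tw (so he can mix), and quiche reveals the weak type.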
[Figure 16.3: the game. Player 1 moves, Player 2 moves, and Player 1 moves again; Player 1 is "normal" with probability .9 and "crazy" with probability .1, and one terminal payoff vector is (1,-5).]
What happens in equilibrium if a player has a small amount of doubt about the other player's payoffs?
It turns out that in dynamic games such small changes may have profound effects on
the equilibrium behavior. The next example illustrates this fact. (It also illustrates how
one computes a mixed-strategy sequential equilibrium.)
Consider the game in Figure 16.3. In this game, Player 2 does not know the payoffs
of Player 1. She thinks at the beginning that his payoffs are as in the upper branch with
high probability 0.9, but she also assigns the small probability of 0.1 to the possibility
that he is averse to playing down and exiting the game. Call the first type of Player 1
the "normal" type and the second type the "crazy" type. If it were common knowledge that
Player 1 is "normal", then backward induction would yield the following: Player 1 goes
down in the last decision node; Player 2 goes across, and Player 1 goes down in the first
node.
What happens in the incomplete information game of Figure 16.3 in which the above
common knowledge assumption is relaxed? By sequential rationality, the "crazy" type
(in the lower branch) will always go across. In the last decision node, the normal type
again goes down. Can it be the case that the normal type goes down in his first decision
node, as in the complete information case? It turns out that the answer is No. If
in a sequential equilibrium "normal" type goes down in the first decision node, in her
information set, Player 2 must assign probability 1 to the crazy type. (By Bayes rule,
Pr (crazy|across) = 0.1/ (0.1 + (.9) (0)) = 1. This is required for consistency.) Given
this belief and the actions that are already determined, she gets −5 from going across
and 2 from going down, and she must go down for sequential rationality. But then
"normal" type should go across as a best reply, which contradicts the assumption that
he goes down.
Similarly, one can also show that there is no sequential equilibrium in which the
normal type goes across with probability 1. If that were the case, then by consistency,
Player 2 would assign 0.9 to normal type in her information set. Her best response would
be to go across for sure, and in that case the normal type would prefer to go down in
the first node.
In any sequential equilibrium, normal type must mix in his first decision node. Write
α = Pr (across|normal) and β for the probability of going across for Player 2. Write
also μ for the probability Player 2 assigns to the upper node (the normal type) in her
information set. Since normal type mixes (i.e. 0 < α < 1), he is indifferent. Across
yields
3β + 5 (1 − β) ;
equating this to the payoff from going down yields β = 1/2 (as depicted in Figure 16.3),
so that 0 < β < 1. Since 0 < β < 1, Player 2 must be indifferent between going down,
which yields 2 for sure, and going across, which yields the expected payoff of
3μ + (−5) (1 − μ) = 8μ − 5.
Equating this to 2 yields μ = 7/8, and consistency, μ = .9α/ (.9α + .1), then yields
α = 7/9. This completes the computation of the unique sequential equilibrium, which is
depicted in Figure 16.3.
Exercise 16.5 Verify that the pair of mixed strategy profile and the belief assessment is
indeed a sequential equilibrium.
16.4. BARGAINING WITH INCOMPLETE INFORMATION 327
[Figure 16.3 with the equilibrium marked: α = 7/9, β = 1/2, μ = 7/8.]
Notice that in sequential equilibrium, after observing that Player 1 goes across, Player
2 increases her probability for Player 1 being a crazy type who will go across, from 0.1
to 0.125. If she assigned 0 probability at the beginning she would not change her beliefs
after she observes that he goes across. In the latter case, Player 1 could never convince
her that he will go across (no matter how many times he goes across), and he would not
try. When that probability is positive (no matter how small it is), she will increase her
probability of him being crazy after she sees him going across, and Player 1 would try
to go across with some probability even when he is not crazy.
Exercise 16.6 In the above game, compute the sequential equilibrium for any initial
probability π ∈ (0, 1) of the crazy type (in the figure π = 0.1).
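For Exercise 16.6, the two conditions derived above can be solved for a general prior π on the crazy type; the sketch below assumes the interior mixed equilibrium obtains, as it does at π = 0.1.

```python
# Player 2's indifference 3*mu + (-5)*(1 - mu) = 2, i.e. 8*mu - 5 = 2,
# pins down her belief regardless of the prior.
mu = 7 / 8

def alpha(pi):
    # Consistency: mu = (1 - pi)*alpha / ((1 - pi)*alpha + pi) = 7/8,
    # so (1 - pi)*alpha = 7*pi.
    return 7 * pi / (1 - pi)

print(mu, alpha(0.1))  # 0.875 and approximately 7/9
```

Only α varies with the prior; β = 1/2 comes from the normal type's indifference, which does not involve π, and the interior solution requires α ≤ 1, i.e. π ≤ 1/8.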
whether to buy. If he buys, the payoffs of the seller and the buyer are δp1 and δ (v − p1 ),
respectively, where δ ∈ (0, 1). Otherwise, the game ends with zero payoffs.
Consider a sequential equilibrium with the following cutoff strategies.4 For any price
p0 and p1 there are cutoffs a (p0 ) and b (p1 ) such that at period 0, buyer buys if and only
if v ≥ a (p0 ) and at period 1, the buyer buys if and only if v ≥ b (p1 ).
At period 1, given any price p1 , the buyer gets δ (v − p1 ) if he buys and 0 otherwise.
Hence, by sequential rationality, he should buy if and only if v ≥ p1 . That is, b (p1 ) = p1 .
Now, given any p0 , if the buyer does not buy in period 0, then the seller knows, from the
strategy of the buyer, that v ≤ a (p0 ). That is, after the rejection of p0 , the seller
believes that v is uniformly distributed on [0, a (p0 )]. Given that the buyer buys iff v ≥ p1 ,
the expected payoff of the seller is
US (p1 |p0 ) = p1 (a (p0 ) − p1 ) /a (p0 ) .
For sequential rationality, after the rejection of p0 , the price p1 (p0 ) must maximize
US (p1 |p0 ). Therefore,
p1 (p0 ) = a (p0 ) /2. (16.2)
Now consider period 0. Given any price p0 , the types v ≥ a (p0 ) buy at price p0 at
period 0; the types v ∈ [a (p0 ) /2, a (p0 )) buy at price a (p0 ) /2 at period 1, and the other
types do not buy. For sequential rationality, the cutoff type a (p0 ) must be indifferent
between buying at the two dates:
a (p0 ) − p0 = δ (a (p0 ) − a (p0 ) /2) ,
so that a (p0 ) = p0 / (1 − δ/2). All that remains is to find the price the seller sets at
period 0. For any price p0 , he gets p0 from the types with v ≥ a (p0 ) and δp1 (p0 ) =
δa (p0 ) /2 from the types v ∈ [a (p0 ) /2, a (p0 )) ,
4
This is actually the only sequential equilibrium.
16.5. EXERCISES WITH SOLUTIONS 329
so his expected payoff is
US (p0 ) = p0 (1 − a (p0 )) + δ (a (p0 ) /2) (a (p0 ) − a (p0 ) /2) .
The first period price must maximize US (p0 ). By taking the derivative and setting it
equal to zero, we obtain
p0 = (1 − δ/2)^2 / (2 (1 − 3δ/4)) .
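The closed form for p0 can be checked numerically. The sketch below uses p1 (p0 ) = a (p0 ) /2 from (16.2) and assumes the cutoff type a (p0 ) is indifferent between buying now at p0 and buying next period at a (p0 ) /2, which gives a (p0 ) = p0 / (1 − δ/2).

```python
delta = 0.9

def seller_payoff(p0):
    # Cutoff from the indifference a - p0 = delta*(a - a/2).
    a = min(p0 / (1 - delta / 2), 1.0)
    # Types v >= a buy now at p0; types in [a/2, a) buy next period at a/2.
    return p0 * (1 - a) + delta * (a / 2) * (a - a / 2)

grid = [i / 100000 for i in range(100001)]
p0_star = max(grid, key=seller_payoff)
closed_form = (1 - delta / 2) ** 2 / (2 * (1 - 3 * delta / 4))
print(p0_star, closed_form)  # both approximately 0.4654 at delta = 0.9
```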
1. [Final 2007, Early exam] Find a sequential equilibrium of the following game:
[Figures: the game tree for this exercise (an initial chance move with three equally likely branches, Player 1 choosing among L1/R1, L2/R2, L3/R3, and Players 2 and 3 moving at information sets with actions a/b, x/y, w/z, and l/r), followed by the same tree with an equilibrium, including 1/2-1/2 mixing, marked.]
2. [Final 2007, Early exam] This question is about a game, called "Deal or No Deal".
The monetary unit is M$, which means million dollars. The players are a Banker
and a Contestant. There are 3 cases: 0,1, and 2. One of the cases contains 1M$
and all the other cases contain zero M$. All cases are equally likely to contain the
1M$ prize (with probability 1/3). Contestant owns Case 0. Banker offers a price
p0 , and Contestant accepts or rejects the offer. If she accepts, then Banker buys
the content of Case 0 for price p0 , ending the game. (Contestant gets p0 M$ and
Banker gets the content of the case, minus p0 M$.) If she rejects the offer, then we
open Case 1, revealing the content to both players. Banker again offers a price p1 ,
and Contestant accepts or rejects the offer. If she accepts, then Banker buys the
content of Case 0 for price p1 ; otherwise we open Case 2, and the game ends with
Contestant owning the content of Case 0 and Banker owning zero. The utility of
owning x M$ is x for the Banker and x^{1/α} for the Contestant, where α > 1.
Answer: If Case 1 contains 1M$, then in period 1 players know that Case
0 contains 0, and hence Contestant accepts any offer, and Banker offers 0. If
Case 1 contains 0M$, then players know that Case 0 contains 0 with probability 1/2
and 1M$ with probability 1/2. The expected payoff of Contestant
from rejecting an offer p1 is 1/2. Hence, she accepts the offer iff
p1^{1/α} ≥ 1/2, i.e., p1 ≥ 1/2^α .
Notice that, since α > 1, the value of the case for the banker is 1/2 > p1 , and
he is happy to make that offer.
Now consider period 0. If the offer p0 is rejected, then with probability 1/3
it will be revealed that Case 1 contains 1M$, and players will get (0,0), and
with probability 2/3 it will be revealed that Case 1 contains 0M$, and Banker
will get payoff of 1/2 − 1/2^α in expectation and Contestant will get payoff
of 1/2 (which is p1^{1/α} ). The expected values of these payoffs for Banker and
Contestant are 1/3 − 2/ (3 · 2^α ) and 1/3, respectively. Therefore, Contestant
will accept p0 iff
p0^{1/α} ≥ 1/3, i.e., p0 ≥ 1/3^α .
Notice that, since α > 1, 2/ (3 · 2^α ) > 1/3^α , and hence Banker would rather
offer p0 = 1/3^α and get 1/3 − 1/3^α , as opposed to making a rejected offer and
getting 1/3 − 2/ (3 · 2^α ) as a result.
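Both comparisons used in this answer are easy to check for sample values of α > 1:

```python
alphas = [1.5, 2.0, 5.0]

# The Banker is happy to offer p1 = 1/2**alpha: the case is worth 1/2 > p1.
banker_likes_p1 = all(1 / 2 > (1 / 2) ** a for a in alphas)

# Offering the accepted p0 = 1/3**alpha beats making a rejected offer
# iff 2/(3*2**alpha) > 1/3**alpha.
banker_likes_p0 = all(2 / (3 * 2 ** a) > (1 / 3) ** a for a in alphas)

print(banker_likes_p1, banker_likes_p0)  # True True
```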
(b) Now assume that Banker does not know α, i.e., α is private information of
Contestant, and Pr (1/2α ≤ x) = 2x for any x ≤ 1/2. Consider a strategy of
the Contestant with cutoffs α̂0 (p0 ) and α̂1 (p1 ) such that Contestant accepts
the first price p0 iff α ≥ α̂0 (p0 ) and, in the case the game proceeds to the next
stage, she accepts the second price p1 iff α ≥ α̂1 (p1 ). Find the necessary and
sufficient conditions on α̂0 (p0 ) and α̂1 (p1 ) under which the above strategy is
sequentially rational.
Answer: At the second stage, the Contestant accepts the price p1 iff p1 ≥ 1/2^α ,
i.e., iff α ≥ α̂1 (p1 ) where 1/2^{α̂1 (p1 )} = p1 . (Of course, α̂1 (p1 ) =
− log (p1 ) / log (2), but you do not need to obtain this explicit solution.)
Towards finding the equation for α̂0 , we need to find the price p1 (p0 ) that
will be offered in a sequential equilibrium. Given that p0 is rejected, Banker
knows that α < α̂0 (p0 ), or 1/2^α > 1/2^{α̂0 (p0 )} . Write y = 1/2^{α̂0 (p0 )} ,
so that, by Pr (1/2^α ≤ x) = 2x, the value 1/2^α is uniformly distributed on
(y, 1/2]. His expected utility from offering p1 (when Case 1 contains 0, so
that Case 0 is worth 1/2 to him) is
U (p1 ) = (1/2 − p1 ) (p1 − y) / (1/2 − y) ,
which is maximized at
p1 (p0 ) = (1 + 2y) /4.
Given p0 , the types α ≥ α̂0 (p0 ) prefer to trade at p0 rather than waiting
for p1 (p0 ) the next period, and the types α ∈ (α̂1 (p1 (p0 )) , α̂0 (p0 )) wait for
p1 (p0 ) (and trade at that price) rather than trading at p0 . As explained in
the class, this implies that the type α̂0 (p0 ) is indifferent between these two
options:
p0^{1/α̂0 (p0 )} = (2/3) (p1 (p0 ))^{1/α̂0 (p0 )} ,
where the left-hand side is the payoff from accepting p0 and the right-hand
side is the expected payoff from rejecting p0 and accepting p1 (p0 ) if Case 1
contains 0. By taking both sides to the power α̂0 (p0 ) and substituting the value of
p1 (p0 ), we obtain
p0 = (2/3)^{α̂0 (p0 )} p1 (p0 ) = (2/3)^{α̂0 (p0 )} (1 + 2/2^{α̂0 (p0 )} ) /4.
(You can simplify this equation a bit more if you want, but you are not asked
to do so. Also, note that we specified all the actions and beliefs except for the
value of the initial price, which will be the price that maximizes the expected
payoff of the banker given what we described so far.)
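The period-1 price in part (b) can also be checked numerically. The objective below is a reconstruction from the stated setup, not a formula given in the text: since Pr (1/2^α ≤ x) = 2x, rejection of p0 leaves 1/2^α uniform on (y, 1/2], the offer p1 is accepted with probability (p1 − y)/(1/2 − y), and the case is worth 1/2 to the Banker when Case 1 contains 0.

```python
def banker_payoff(p1, y):
    # (value of case - price) times acceptance probability,
    # with 1/2**alpha uniform on (y, 1/2] after p0 is rejected.
    accept = (min(max(p1, y), 0.5) - y) / (0.5 - y)
    return (0.5 - p1) * accept

ok = True
for y in [0.1, 0.2, 0.3]:
    grid = [y + i * (0.5 - y) / 100000 for i in range(100001)]
    best = max(grid, key=lambda p: banker_payoff(p, y))
    # The first-order condition gives p1 = (1 + 2*y)/4.
    ok = ok and abs(best - (1 + 2 * y) / 4) < 1e-3
print(ok)  # True
```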
•
ui (x1 , Q1 , x2 , Q2 ) = P (Qi (x1 , x2 )) Qi (x1 , x2 ) − xi if xi > xj ;
P (Qi (x1 , x2 )) Qi (x1 , x2 ) /2 − xi if xi = xj ; and −xi otherwise.
(b) (15) Find a symmetric Bayesian Nash equilibrium of the above game in which
each player’s investment is of the form xi = a (1 − ci )3 +b for some parameters
a and b. [If you can, you may want to solve part (c) first.]
ANSWER: See part (c).
(c) (10) Show that the equilibrium in part (b) is the only Bayesian Nash equi
librium in which both firms act sequentially rationally and in which xi is an
increasing, differentiable function of (1 − ci ) .
ANSWER: By sequential rationality, a monopolist with cost ci produces
Qi = (1 − ci ) /2,
obtaining the monopoly profit (1 − ci )^2 /4. Writing θi = 1 − ci , firm i's expected payoff is
E [ui ] = Pr (xi > xj ) θi^2 /4 − xi .
This is because with probability Pr (xi > xj ) the firm will become the monopolist
and get the monopoly profit θi^2 /4, and it will pay the investment cost xi with
probability 1. Since x is increasing, Pr (xi = xj ) = 0. Now,
Pr (xi > xj ) = Pr (xi > x (θj )) = Pr (θj < x^{−1} (xi )) = x^{−1} (xi ) .
Hence,
E [ui ] = (θi^2 /4) x^{−1} (xi ) − xi .
Therefore, the first-order condition for maximization is
0 = ∂E [ui ] /∂xi = (θi^2 /4) (1/x' (θi )) − 1,
showing that
x' (θi ) = θi^2 /4,
and therefore
x (θi ) = θi^3 /12 + const,
where the const = 0, so that x (0) = 0.
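One can verify numerically that x (θ) = θ^3 /12 is a best response to itself: using E [ui ] = (θi^2 /4) x^{−1} (xi ) − xi with x^{−1} (xi ) = (12 xi )^{1/3}, the optimal investment should equal θi^3 /12. A minimal sketch at θi = 0.5:

```python
def payoff(xi, theta):
    # E[u_i] = (theta**2/4) * x^{-1}(x_i) - x_i, with x(t) = t**3/12,
    # so x^{-1}(x_i) = (12*x_i)**(1/3).
    return (theta ** 2 / 4) * (12 * xi) ** (1 / 3) - xi

theta = 0.5
grid = [i / 1000000 for i in range(1, 100001)]
best = max(grid, key=lambda x: payoff(x, theta))
print(best, theta ** 3 / 12)  # both approximately 0.0104
```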
[Figure: the game tree for the next exercise: an initial chance move with probabilities .5, .1, and .4, moves by Players 1, 2, and 3, and payoff vectors including (2,1,0), (0,1,0), (0,0,1), (0,2,2), (1,1,3), and (3,3,3).]
1 = 0 · μ + 3 (1 − μ) ,
hence
μ = 2/3.
That is, Player 2 must mix on the center branch, and hence she must be indifferent,
i.e.,
1 = 2β.
That is,
β = 1/2.
[Figure: the equilibrium of the preceding game marked on the tree: α = 1/2, μ = 2/3, β = 1/2.]
5. [Final 2002] We have a Judge and a Plaintiff. The Plaintiff has been injured. Sever
ity of the injury, denoted by v, is the Plaintiff’s private information. The Judge
does not know v and believes that v is uniformly distributed on {0, 1, 2, . . . , 99} (so
that the probability that v = i is 1/100 for any i ∈ {0, 1, . . . , 99}). The Plaintiff
can verifiably reveal v to the Judge without any cost, in which case the Judge will
know v. The order of the events is as follows. First, the Plaintiff decides whether
to reveal v or not. Then, the Judge awards a compensation R. The payoff of the
Plaintiff is R − v, and the payoff of the Judge is − (v − R)^2 . Everything described
so far is common knowledge. Find a sequential equilibrium.
Answer: Since the Judge maximizes − (v − R)^2 , by sequential rationality the award is
R∗ (v) = v
when v is revealed, and
R∗ (N R) = E [v|N R] .
In equilibrium, the Plaintiff gives her best response to R∗ at each v. Hence, she
must reveal her type whenever v > R∗ (N R), and she must not reveal her type
whenever v < R∗ (N R). Suppose that R∗ (N R) > 0. Then, s∗ (0) = N R, and
hence N R is reached with positive probability. Thus,
R∗ (N R) = E [v|N R] < R∗ (N R)
(every non-revealing type has v ≤ R∗ (N R), and type v = 0 is among them), a
contradiction. Therefore R∗ (N R) = 0, and thus
s∗ (v) = v
for every v > 0.
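The unraveling argument can be illustrated with a one-line check: for any candidate award r = R∗ (N R) in {1, ..., 99}, the withholding types would be {0, ..., r}, whose mean r/2 is strictly below r, so no positive award is consistent with the Judge's beliefs.

```python
# Candidate no-reveal awards r for which E[v | v <= r] could equal r.
consistent = [r for r in range(1, 100)
              if sum(range(r + 1)) / (r + 1) >= r]
print(consistent)  # [] : only R*(NR) = 0 survives
```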
6. [Final 2001, Make Up] This question is about a game between a possible appli
cant (henceforth student) to a Ph.D. program in Economics and the Admission
Committee. Ex-ante, Admission Committee believes that with probability .9 the
student hates economics and with probability .1 he loves economics. After Nature
decides whether student loves or hates economics with the above probabilities and
reveals it to the student, the student decides whether or not to apply to the Ph.D.
program. If the student does not apply, both the student and the committee get
0. If student applies, then the committee is to decide whether to accept or reject
the student. If the committee rejects, then committee gets 0, and student gets -1.
If the committee accepts the student, the payoffs depend on whether the student
loves or hates economics. If the student loves economics, he will be successful and
the payoffs will be 20 for each player. If he hates economics, the payoffs for both
the committee and the student will be -10. Find a separating equilibrium and a
pooling equilibrium of this game.
A separating equilibrium:
[Figure: the Hate type (probability .9) does not apply, the Love type (probability .1) applies, the committee's beliefs at its information sets are {0} and {1}, and it accepts the applicant.]
A pooling equilibrium:
[Figure: neither type applies, the committee's off-path beliefs are {.9} on Hate and {.1} on Love, and it rejects any applicant.]
7. [Final 2001] We have an employer and a worker, who will work as a salesman.
The worker may be a good salesman or a bad one. In expectation, if he is a good
salesman, he will make $200,000 worth of sales, and if he is bad, he will make only
$100,000. The employer gets 10% of the sales as profit. The employer offers a wage
w. Then, the worker accepts or rejects the offer. If he accepts, he will be hired at
wage w. If he rejects the offer, he will not be hired. In that case, the employer will
get 0, the worker will get his outside option, which will pay $15,000 if he is good,
$8,000 if he is bad. Assume that all players are risk-neutral.
(a) Assume that the worker’s type is common knowledge, and compute the subgame
perfect equilibrium.
(b) Assume that the worker knows his type, but the employer does not. Employer
believes that the worker is good with probability 1/4. Find the sequential
equilibrium.
Solution: Again, a worker will accept an offer iff his wage is at least as high as
his outside option. Hence, if w ≥ 15, 000, the offer will be accepted by both
types, yielding
U (w) = (1/4) (.1) 200, 000 + (3/4) (.1) 100, 000 − w = 12, 500 − w < 0
as the profit for the employer. If 8, 000 ≤ w < 15, 000, then only the bad
worker will accept the offer, yielding
(3/4) ((.1) 100, 000 − w) = (3/4) (10, 000 − w)
as profit. If w < 8, 000, no worker will accept the offer, and the employer will get
0. Therefore, the employer will offer w = 8, 000, hiring the bad worker at
his outside option.
(c) Under the information structure in part (b), now consider the case that the
employer offers a share s in the sales rather than the fixed wage w. Compute
the sequential equilibrium.
Solution: Again, a worker will accept the share s iff his income is at least as
high as his outside option. That is, a bad worker will accept s iff s · 100, 000 ≥ 8, 000,
i.e.,
s ≥ sB = 8, 000/100, 000 = 8%.
A good worker will accept s iff s · 200, 000 ≥ 15, 000, i.e.,
s ≥ sG = 15, 000/200, 000 = 7.5%.
In that case, if s < sG no one will accept the offer, and the employer will get
0; if sG ≤ s < sB , the good worker will accept the offer and the employer will
get
(1/4) (10% − s) 200, 000 = 50, 000 (10% − s) ,
and if s ≥ sB , each type will accept the offer and the employer will get
(10% − s) [(1/4) 200, 000 + (3/4) 100, 000] = 125, 000 (10% − s) .
Since 125, 000 (10% − sB ) = (2%) 125, 000 = 2, 500 is larger than 50, 000 (10% − sG ) =
(2.5%) 50, 000 = 1, 250, he will offer s = sB , hiring both types.
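The profit comparisons in parts (b) and (c) can be reproduced in a few lines; the wage expressions weight each type by its probability, with the wage paid only if the offer is accepted.

```python
p_good, p_bad = 1 / 4, 3 / 4

# Part (b), fixed wages: hire both at w = 15,000 vs. only the bad type at w = 8,000.
profit_both_wage = p_good * 0.1 * 200_000 + p_bad * 0.1 * 100_000 - 15_000
profit_bad_wage = p_bad * (0.1 * 100_000 - 8_000)

# Part (c), profit shares: s_B = 8% hires both, s_G = 7.5% hires only the good type.
profit_share_both = 125_000 * (0.10 - 0.08)
profit_share_good = 50_000 * (0.10 - 0.075)

print(profit_both_wage, profit_bad_wage)     # approximately -2500 and 1500
print(profit_share_both, profit_share_good)  # approximately 2500 and 1250
```

So the employer offers w = 8,000 in part (b) and s = sB = 8% in part (c).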
8. [Final 2001, Make Up] As in the previous question, we have an employer and a
worker, who will work as a salesman. Now the market might be good or bad. In
expectation, if the market is good, the worker will make $200,000 worth of sales,
and if the market is bad, he will make only $100,000 worth of sales. The employer
gets 10% of the sales as profit. The employer offers a wage w. Then, the worker
accepts or rejects the offer. If he accepts, he will be hired at wage w. If he rejects
the offer, he will not be hired. In that case, the employer will get 0, the worker
will get his outside option, which will pay $12,000. Assume that all players are
risk-neutral.
(a) Assume that whether the market is good or bad is common knowledge, and
compute the subgame-perfect equilibrium.
(b) Assume that the employer knows whether the market is good or bad, but the
worker does not. The worker believes that the market is good with probability
1/4. Find the sequential equilibrium.
(c) Under the information structure in part (b), now consider the case that the
employer offers a share s in the sales rather than the fixed wage w. Compute
a sequential equilibrium.
ANSWER: Note that, since the return is 10% independent of whether the
market is good or bad, the employer will make positive profit iff s < 10%.
Hence, except for s = 10%, we must have a pooling equilibrium. Hence, at
any s, the worker’s income is
poor who has $0. For some reason, the wealthy entrepreneur cannot use his wealth
as an investment towards this project. There is also a bank that can lend money
with interest rate π. That is, if the entrepreneur borrows $100,000 to invest, after
the project is completed he will pay back $100, 000 (1 + π) – if he has that much
money. If his wealth is less than this amount at the end of the project, he will pay
all he has. The order of the events is as follows:
(a) Compute the subgame perfect equilibrium for the case when the wealth is
common knowledge.
ANSWER: The rich entrepreneur is always going to pay back the loan in
full amount, hence his expected payoff from investing (as a change from not
investing) is
(0.5)(300, 000) − 100, 000 (1 + π) ,
which is non-negative iff π ≤ 1/2. Hence, the highest interest rate at which the
rich type invests is
π R = 1/2.
The poor entrepreneur is going to pay back the loan only when the project
succeeds. Hence, his expected payoff from investing is
(0.5) (300, 000 − 100, 000 (1 + π)) ,
which is non-negative iff π ≤ 2. Hence, the highest interest rate at which the
poor type invests is
π P = 2.
(b) Now assume that the bank does not know the wealth of the entrepreneur.
The probability that the entrepreneur is rich is 1/4. Compute the sequential
equilibrium.
ANSWER: As in part (a), the rich type will invest iff π ≤ π R = .5, and the
poor type will invest iff π ≤ π P = 2. Now, if π ≤ π R , the bank's payoff is
U (π) = (1/4) 100, 000 (1 + π) + (3/4) ((1/2) 100, 000 (1 + π) + (1/2) · 0) − 100, 000
= (5/8) 100, 000 (1 + π) − 100, 000
≤ (5/8) 100, 000 (1 + π R ) − 100, 000
= (5/8) 100, 000 (3/2) − 100, 000 = − (1/16) 100, 000 < 0.
If π R < π ≤ π P , the bank's payoff is
U (π) = (3/4) ((1/2) 100, 000 (1 + π) + (1/2) · 0 − 100, 000)
= (3/8) 100, 000 (π − 1) ,
which is maximized at π P , yielding (3/8) 100, 000. If π > π P , U (π) = 0. Hence,
the bank will choose π = π P .
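The bank's problem can be written as a single payoff function of π and evaluated directly (prior 1/4 on the rich type, success probability 1/2, loan size 100,000):

```python
def bank_payoff(pi):
    if pi <= 0.5:   # both types borrow; the rich type always repays,
                    # the poor type repays only on success (probability 1/2)
        return (1 / 4) * 100_000 * (1 + pi) \
             + (3 / 4) * (1 / 2) * 100_000 * (1 + pi) - 100_000
    if pi <= 2.0:   # only the poor type borrows
        return (3 / 4) * ((1 / 2) * 100_000 * (1 + pi) - 100_000)
    return 0.0      # nobody borrows

print(bank_payoff(0.5), bank_payoff(2.0))  # -6250.0 and 37500.0
```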
16.6 Exercises
1. [Homework 5, 2011] In the following game, for each action of player 2, find a
sequential equilibrium in which player 2 plays that action:
[Figure: the game tree, with a chance move putting probability 3/4 on x and 1/4 on y, Player 1 choosing between in and out, and Player 2 choosing between L and R at an information set.]
16.6. EXERCISES 345
2. [Final 2011] Find a sequential equilibrium of the following game. Verify that you
have indeed a sequential equilibrium.
[Figure: the game tree, with three equally likely branches (probability 1/3 each), Player 1 choosing between a and b, Player 2 choosing between x and y, and payoffs including (1,1), (0,0), (2,-2), (-1,-1), and (2,2).]
3. [Final 2011] Consider the following version of Yankee Swap Game, played by Alice,
Bob, and Caroline. There are 3 boxes, namely A, B, and C, and three prizes x,
y, and z. The prizes are put in the boxes randomly, so that any combination of
prizes is equally likely, and the boxes are closed without showing their contents to
the players. First, Alice is to open box A, making its content observable. Then,
in the alphabetical order, Bob and Caroline are to open the box with their own
initial, making its content observable, and either keep the content as is or swap its
content with the content of a box that has been opened already. Finally, Alice is
given the option of swapping the content of her box with the content of any other
box, ending the game when each player gets the prize in their own box.
(a) Assume that it is commonly known that, for each player, the payoff from x, y,
and z are 3, 2, and 0, respectively. Find a subgame-perfect Nash equilibrium.
(b) Now assume that it is commonly known that the preferences of Bob and
Caroline are as in part (a), but the preferences of Alice are privately known
by herself. With probability 1/2, her utility function is as above, but with
probability 1/2 she gets payoffs of 2, 3, and 0 from x, y, and z, respectively.
Find a sequential equilibrium of this game.
4. [Figure: the game tree for this exercise, with moves E, X, A, D, a, and d, prior probabilities π and 1 − π, and payoffs including (0,2) and (-1,1).]
5. [Final 2005] The following game describes a situation in which Player 2 is not sure
that she is playing a game with Player 1, i.e., she is not sure that Player 1 exists.
[Figure: the game tree. With probability .8 Player 1 exists and chooses between A and D; Player 2 then chooses between a and d without knowing whether Player 1 exists (probability .2 that he does not); payoffs include (-1,3).]
(a) (20 points) Compute a perfect Bayesian Nash equilibrium of this game.
(b) (5 points) Briefly discuss the equilibrium in (a) from Player 2's point of view.
6. [Final 2005] We have two players, Host and Contestant. There are three doors, L,
M, and R.
• Nature puts a car behind one of these doors, and goats behind the others.
The probability of having the car is same for all doors. Host knows which
door, but Contestant does not.
• Then, Contestant selects a door.
• Then, Host must open one of the two doors that are not selected by Contestant
and show Contestant what Nature put behind that door.
• Then, Contestant chooses any of the three doors, and receives whatever is
behind that door.
Payoffs for Contestant and Host are (1,-1) if Contestant receives a car, and (0,0)
if he receives a goat. Compute a perfect Bayesian Nash equilibrium of this game.
Verify that this is indeed a PBE. [Hint: Any strategy for Host in which he never
shows the car is part of some PBE.]
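The hint can be illustrated by simulation: if Host plays a strategy that never shows the car, the Contestant's posterior on her original door stays 1/3, so switching wins about 2/3 of the time. (This is the classical Monty Hall computation, offered here as a sanity check rather than a full PBE verification.)

```python
import random

random.seed(0)
n, switch_wins, stay_wins = 100_000, 0, 0
for _ in range(n):
    car = random.randrange(3)    # Nature's placement
    pick = random.randrange(3)   # Contestant's first selection
    # Host opens a door that is neither the pick nor the car.
    opened = next(d for d in range(3) if d != pick and d != car)
    switched = next(d for d in range(3) if d != pick and d != opened)
    switch_wins += (switched == car)
    stay_wins += (pick == car)

print(switch_wins / n, stay_wins / n)  # approximately 2/3 and 1/3
```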
7. [Figure: the game tree for this exercise, with prior probabilities .4 and .6, Player 1 choosing between A and D (or a and d), and Player 2 choosing between α and δ.]
8. [Final 2004] A soda company, XC, introduces a new soda and wants to sell it
to a representative consumer. The soda may be either Good or Bad. The prior
probability that it is Good is 0.6. Knowing whether the soda is Good or Bad,
the soda company chooses an advertisement level for the product, which can be
either an Ad Blitz, which costs the company c, or No Advertisement, which does
not cost anything. Observing how strongly the company advertises the soda, but
without knowing whether the soda is Good or Bad, the representative consumer
decides whether or not to buy the product. After subtracting the price, the payoff
of representative consumer from buying the soda is 1 if it is Good and −1 if it
is Bad. His payoff is 0 if he does not buy the soda. If the soda is Good and
representative consumer buys it (and therefore learns that the soda is Good), then
the company sells the soda to other future consumers, enjoying a high revenue of
R. If the soda is Bad and the representative consumer buys it, the company will
have only a small revenue r. If the representative consumer does not buy the soda,
the revenue of the company is 0. Assume that 0 < r < c < R.
(a) Write this game as a signaling game. (Draw the game tree.)
(d) Find a sequential equilibrium for the case that the prior probability of Good
is 0.4.
(e) Find a sequential equilibrium for the case that 0 < c < r < R (and the prior
probability of Good is 0.6).
9. [Final 2004] In this question, you are asked to help me to determine the letter
grades! We have a professor and a potential student. There are two types of
students, H and L. The student knows his type, but the professor does not. The
prior probability of type H is π ∈ [0, 1]. The events take place in the following
order.
• Observing γ and his type, the student decides whether to take the class.
• If the student does not take the class, the game ends; the professor gets 0, and
the student gets Wt , where t ∈ {H, L} is his type and 0 < WL < WH < 100.
• If the student takes the class, then he chooses an effort level e and takes an
exam. His score in the exam is s = e if t = L and s = 2e if t = H; i.e., a high
type student scores higher for any effort level.
g = A if s ≥ γ, and g = B otherwise.
(a) Consider a prestigious institution with high standards, where π is high, and
WH is not too high. In particular, π > .5 (100 − WL ) / (100 − WH ) and WH <
(100 + WL ) /2. Compute a sequential equilibrium for this game.
(b) Consider a prestigious institution with spoiled kids, where both π and WH are
high. In particular, WH > (100 + WL ) /2 and π > 1−2 (100 − WH ) / (100 − WL ).
Compute a sequential equilibrium for this game.
(c) Consider a lower-tier college, where both π and WH are low; π < .5 (100 − WL ) / (100 − WH )
and WH < (100 + WL ) /2. Compute a sequential equilibrium for this game.
(d) Assuming that WL is the same at all three institutions, rank the exam scores
in (a), (b) and (c).
(e) (0 points) What cutoff value would you choose if you were a professor at
MIT?
10. [Figure: a signaling game with three types of prior probabilities {.5}, {.1}, and {.4}, the sender choosing between T and B, the receiver responding with L or R (and l or r), and payoffs including (1,1), (3,1), (2,1/2), and (0,1/2).]
(b) Find a sequential equilibrium in which for each signal there is a type who
sends that signal.
11. [Final 2002 Make Up] We have a Defendant and a Plaintiff; the Defendant has
injured the Plaintiff. If they go to court, the Defendant will pay a cost c ∈ (0, 1) to
the court and a reward d to the Plaintiff, depending on the severity of the injury.
[Here c and d are measured in utiles, where a utile is $1M.] The Plaintiff knows d
but the Defendant does not; she believes that d = 1 with probability π > c and
d = 2 with probability 1 − π. The Plaintiff asks for a settlement s, and the Defendant
either accepts, in which case she pays s (utiles) to the Plaintiff, or rejects, in which
case they go to court. Everything described up to here is common knowledge. Find
a sequential equilibrium.
12. [Final 2000] Consider the following private-value auction of a single object, whose
value for the seller is 0. There are two buyers, say 1 and 2. The value of the object
for each buyer i ∈ {1, 2} is vi so that, if i buys the object paying the price p, his
payoff is vi − p; if he doesn’t buy the object, his payoff is 0. We assume that
v1 and v2 are independently and identically distributed uniformly on [v, 1] where
0 ≤ v < 1.
(a) We use sealed-bid first-price auction, where each buyer i simultaneously bids
bi , and the one who bids the highest bid buys the object paying his own bid.
(b) Now assume that v1 and v2 are independently and identically distributed
uniformly on [0, 1]. Now, in order to enter the auction, a player must pay
an entry fee φ ∈ (0, 1). First, each buyer simultaneously decides whether
to enter the auction. Then, we run the sealed-bid auction as in part (a);
which players entered is now common knowledge. If only one player enters
the auction any bid b ≥ 0 is accepted. Compute the symmetric sequential
equilibrium where the buyers use the linear strategies in the auction if both
buyer enter the auction. Anticipating this equilibrium, which entry fee the
seller must choose? [Hint: In the entry stage, there is a cutoff level such that
a buyer enters the auction iff his valuation is at least as high as the cutoff
level.]
13. [Final 2000] Consider a worker and a firm. Worker can be of two types, High or
Low. The worker knows his type, while the firm believes that each type is equally
likely. Regardless of his type, a worker is worth 10 for the firm. The worker’s
reservation wage (the minimum wage that he is willing to accept) depends on his
type. If he is of high type his reservation wage is 5 and if he is of low type his
reservation wage is 0. First the worker demands a wage w0 ; if the firm accepts it,
then he is hired with wage w0 , when the payoffs of the firm and the worker are
10 − w0 and w0 , respectively. If the firm rejects it, in the next day, the firm offers
a new wage w1 . If the worker accepts the offer, he is hired at that wage, when
the payoffs of the firm and the worker are again 10 − w1 and w1 , respectively. If
the worker rejects the offer, the game ends, when the worker gets his reservation
wage and the firm gets 0. Find a perfect Bayesian equilibrium of this game.
14. [Homework 5, 2004] Compute all sequential equilibria of the following game.
15. [Homework 5, 2004] Consider the following general Beer-Quiche game, where the
value of avoiding a fight is α, and the ex-ante probability of strong type is p. For
each case below find a sequential equilibrium.
[Figures: the game for Exercise 14 (a two-player multi-stage game with chance probabilities .5 and .4 and payoffs including (1,-5), (-1,4), (0,2), (-1,3), and (-2,-5)) and the general Beer and Quiche game for Exercise 15, with prior 1 − p on the weak type tw and Player 2 choosing between duel and don't after beer or quiche.]
16. [Homework 5, 2004] Consider a buyer and a seller. The seller owns an object, whose
value for himself is c. The value of the object for the buyer is v. Each player knows
his own valuation not the other player’s valuation; v and c are independently and
identically distributed with uniform distribution on [0, 1]. We have two dates,
t = 0, 1. The players discount the future payoffs with δ = .9. Hence, if they trade
at t = 0 with price p, the payoffs of seller and the buyer are p − c and v − p,
respectively, while these payoffs would be 0.9 (p − c) and 0.9 (v − p), respectively,
if they traded at t = 1. If they do not trade at either of these dates, each gets 0. Find
a sequential equilibrium of the game in each of the following cases.
(a) At t = 0, the seller offers a price p0 . If the buyer accepts, trade occurs at
price p0 . If the offer is rejected, the game ends without the possibility of a trade
at t = 1.
(b) At t = 0, the seller offers a price p0 . If the buyer accepts, trade occurs at price
p0 . If the buyer rejects, at t = 1, the seller sets another price p1 . If the buyer
accepts the price, the trade occurs at price p1 ; otherwise they do not trade.
[Hint: There is an equilibrium in which there is a threshold a (p0 ) such that a
buyer buys at t = 0 if his valuation is above a (p0 ), and the threshold and the
seller's strategies are "linear," i.e., a (p0 ) = min {αp0 + β, 1} and p0 = Ac + B
for some parameters α, β, A, and B.]
17. [Final 2000, Make Up] Two players (say A and B) own a company, each of them
owning a half of the Company. They want to dissolve the partnership in the
following way. Player A sets a price p. Then, player B decides whether to buy
A's share or to sell his own share to A, in each case at price p. The values of the
Company for players A and B are vA and vB , respectively.
(a) Assume that the values vA and vB are commonly known. What would be the
price in the subgame-perfect equilibrium?
(b) Assume that the value of the Company for each player is his own private
information, and that these values are independently drawn from a uniform
distribution on [0,1]. Compute the sequential equilibrium.
[Figure: a game tree with a chance move (probability {0.4} on one branch), Player 1 choosing between L and R at two nodes, Player 2 responding, and payoffs including (2,1), (3,1), (0,0), (2,0), and (1,1).]