MIT Game Theory Lecture Notes

Chapter 1

Introduction

Game Theory is a misnomer for Multiperson Decision Theory. It develops tools, methods, and language that allow a coherent analysis of decision-making when there is more than one decision-maker and each player's payoff possibly depends on the actions taken by the other players. In this lecture, I will illustrate some of these methods with simple examples.
Note that, since a player’s preferences on his actions depend on which actions the
other parties take, his action depends on his beliefs about what the others do. Of course,
what the others do depends on their beliefs about what each player does. In this way, a
player’s action, in principle, depends on the actions available to each player, each player’s
preferences on the outcomes, each player’s beliefs about which actions are available to
each player and how each player ranks the outcomes, and further his beliefs about each
player’s beliefs, ad infinitum.
When players think through what the other players will do, taking what the other
players think about them into account, they may find a clear way to play the game.
Consider the following “game”:

1\2 L m R
T 1, 1 0, 2 2, 1
M 2, 2 1, 1 0, 0
B 1, 0 0, 0 −1, 1

Here, there are two players, namely Player 1 and Player 2. Player 1 has strategies T, M, and B, and Player 2 has strategies L, m, and R. They pick their strategies simultaneously.


Every pair of strategies leads to a payoff to each player, a payoff measured by a real
number. In each entry, the first number is the payoff of Player 1, and the second entry
is the payoff of Player 2. For instance, if Player 1 plays T and Player 2 plays R, then
Player 1 gets a payoff of 2 and Player 2 gets a payoff of 1. Let’s assume that each player
knows that these are the strategies and the payoffs, each player knows that each player
knows this, each player knows that each player knows that each player knows this,. . .
ad infinitum.
Now, Player 1 looks at his payoffs, and realizes that, no matter what the other player
plays, it is better for him to play M rather than B. That is, if Player 2 plays L, M
gives 2 and B gives 1; if Player 2 plays m, M gives 1, B gives 0; and if Player 2 plays
R, M gives 0, B gives −1. Therefore, he realizes that he should not play B. Now he
compares T and M . He realizes that, if Player 2 plays L or m, M is better than T , but
if she plays R, T is definitely better than M . Would Player 2 play R? What would she
play? To find an answer to these questions, Player 1 looks at the game from Player 2’s
point of view. He realizes that, for Player 2, there is no strategy that is outright better
than any other strategy. For instance, R is the best strategy if Player 1 plays B, but
otherwise it is strictly worse than m. Would Player 2 think that Player 1 would play
B? Well, she knows that Player 1 is trying to maximize his expected payoff, given by
the first entries as everyone knows. She must then deduce that Player 1 will not play B.
Therefore, Player 1 concludes, she will not play R (as it is worse than m in this case).
Ruling out the possibility that Player 2 plays R, Player 1 looks at his payoffs and sees
that M is now better than T , no matter what. On the other side, Player 2 goes through
similar reasoning, and concludes that Player 1 must play M , and therefore plays L.
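The elimination logic above can be carried out mechanically. The following Python sketch (my own illustration, not part of the original notes) iteratively removes strictly dominated pure strategies from the 3×3 game and recovers the (M, L) prediction:

```python
# Iterated elimination of strictly dominated pure strategies, illustrated on
# the 3x3 game from the text. payoffs[r, c] = (Player 1's payoff, Player 2's).
payoffs = {
    ("T", "L"): (1, 1), ("T", "m"): (0, 2), ("T", "R"): (2, 1),
    ("M", "L"): (2, 2), ("M", "m"): (1, 1), ("M", "R"): (0, 0),
    ("B", "L"): (1, 0), ("B", "m"): (0, 0), ("B", "R"): (-1, 1),
}

def dominated(rows, cols):
    """Yield ('row'/'col', strategy) pairs that are strictly dominated."""
    for r in rows:
        for r2 in rows:
            if r2 != r and all(payoffs[r2, c][0] > payoffs[r, c][0] for c in cols):
                yield ("row", r)
    for c in cols:
        for c2 in cols:
            if c2 != c and all(payoffs[r, c2][1] > payoffs[r, c][1] for r in rows):
                yield ("col", c)

rows, cols = ["T", "M", "B"], ["L", "m", "R"]
changed = True
while changed:
    changed = False
    for kind, s in list(dominated(rows, cols)):
        if kind == "row" and s in rows:
            rows.remove(s); changed = True
        elif kind == "col" and s in cols:
            cols.remove(s); changed = True

print(rows, cols)  # → ['M'] ['L']
```

The elimination order matches the text: B first, then R, then T, then m.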

Exercise 1 In the above analysis, players are assumed to make many assumptions about the other players' reasoning capabilities. What are these assumptions? How would the analysis change if these assumptions were changed, e.g., if players act rationally but assume that the other parties play a random strategy?

The kind of reasoning in the above analyses does not always yield such a clear
prediction. Imagine that you want to meet with a friend in one of two places, about
which you both are indifferent. Unfortunately, you cannot communicate with each other
until you meet. This situation is formalized in the following game, which is called a pure coordination game:

1 \2 Left Right
Top 1, 1 0, 0
Bottom 0, 0 1, 1

Here, Player 1 chooses between Top and Bottom rows, while Player 2 chooses between
Left and Right columns. In each box, the first and the second numbers denote the payoffs
of players 1 and 2, respectively. Note that Player 1 prefers Top to Bottom if he knows
that Player 2 plays Left; he prefers Bottom if he knows that Player 2 plays Right.
Similarly, Player 2 prefers Left if she knows that Player 1 plays Top. There is no clear
prediction about the outcome of this game.
One may look for stable outcomes (strategy profiles) in the sense that no player has an incentive to deviate if he knows that the other players play the prescribed strategies. (Such strategy profiles are called Nash equilibria, named after John Nash.) Here, Top-Left and Bottom-Right are such outcomes, while Bottom-Left and Top-Right are not stable in this sense. For instance, if Bottom-Left is known to be played, each player would like to deviate.
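The stability check can be automated for any two-player matrix game. Here is a small Python sketch (my own, not from the notes) that finds the stable outcomes of the coordination game by testing for profitable unilateral deviations:

```python
# Find stable outcomes (pure-strategy Nash equilibria) of the pure
# coordination game by checking for profitable unilateral deviations.
payoffs = {
    ("Top", "Left"): (1, 1), ("Top", "Right"): (0, 0),
    ("Bottom", "Left"): (0, 0), ("Bottom", "Right"): (1, 1),
}
rows, cols = ["Top", "Bottom"], ["Left", "Right"]

def is_stable(r, c):
    u1, u2 = payoffs[r, c]
    no_row_deviation = all(payoffs[r2, c][0] <= u1 for r2 in rows)
    no_col_deviation = all(payoffs[r, c2][1] <= u2 for c2 in cols)
    return no_row_deviation and no_col_deviation

equilibria = [(r, c) for r in rows for c in cols if is_stable(r, c)]
print(equilibria)  # → [('Top', 'Left'), ('Bottom', 'Right')]
```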
Unlike in this game, players typically have different preferences on the outcomes, inducing conflict. In the following game, which is known as the Battle of Sexes, conflict and the need for coordination are present together.

1 \ 2 Left Right
Top 2, 1 0, 0
Bottom 0, 0 1, 2

Here, once again the players would like to coordinate on Top-Left or Bottom-Right, but now Player 1 prefers to coordinate on Top-Left, while Player 2 prefers to coordinate on Bottom-Right. The stable outcomes are again Top-Left and Bottom-Right.
The above analysis assumes that players take their actions simultaneously, so that a player does not observe the actions taken by the others when choosing his own action.
In general, a player may observe some of the actions of some other players. Such a
knowledge may have a dramatic impact on the outcome of the game. For an illustration,
in the Battle of Sexes, imagine that Player 2 knows what Player 1 does when she takes
her action. This can be formalized via the tree in Figure 1.1. Here, Player 1 first
chooses between Top and Bottom, and then Player 2 chooses between Left and Right,

[Figure 1.1: Battle of Sexes with sequential moves. Player 1 moves first; Player 2 observes the choice and then moves, with payoffs (2,1), (0,0), (0,0), (1,2) at the four terminal nodes.]

knowing what Player 1 has chosen. Clearly, now Player 2 would choose Left if Player
1 plays Top, and choose Right if Player 1 plays Bottom. Knowing this, Player 1 would
play Top. Therefore, one can argue that the only reasonable outcome of this game is
Top-Left. (This kind of reasoning is called backward induction.)
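The backward-induction reasoning in this two-stage game can be sketched in a few lines of Python (my own illustration, using the Top/Bottom and Left/Right labels from the text):

```python
# Backward induction in the sequential Battle of Sexes: Player 2 observes
# Player 1's move and best-responds; Player 1 anticipates this.
payoffs = {
    ("Top", "Left"): (2, 1), ("Top", "Right"): (0, 0),
    ("Bottom", "Left"): (0, 0), ("Bottom", "Right"): (1, 2),
}

def p2_best_response(a1):
    # Player 2 maximizes her own (second) payoff given Player 1's action.
    return max(["Left", "Right"], key=lambda a2: payoffs[a1, a2][1])

# Player 1 maximizes his payoff, anticipating Player 2's reply.
a1 = max(["Top", "Bottom"], key=lambda a: payoffs[a, p2_best_response(a)][0])
a2 = p2_best_response(a1)
print(a1, a2, payoffs[a1, a2])  # → Top Left (2, 1)
```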
When Player 2 is able to observe what the other player does, she gets only 1, while Player 1 gets 2. (In the previous game, two outcomes were stable, in which Player 2 would get 1 or 2.) That is, Player 2 would prefer that Player 1 have information about what Player 2 does, rather than she herself having information about what Player 1 does. When it is common knowledge that a player has or lacks some information, the player may prefer not to have that information, a robust fact that we will see in various contexts.

Exercise 2 Clearly, this is generated by the fact that Player 1 knows that Player 2
will know what Player 1 does when she moves. Consider the situation that Player 1
thinks that Player 2 will know what Player 1 does only with probability π < 1, and this
probability does not depend on what Player 1 does. What will happen in a “reasonable”
equilibrium? [By the end of this course, hopefully, you will be able to formalize this
situation, and compute the equilibria.]

Another interpretation is that Player 1 can communicate to Player 2, who cannot communicate to Player 1. This enables Player 1 to commit to his actions, giving him a strong position in the relationship.

[Figure 1.2: Battle of Sexes with exit option. Player 1 first chooses between Exit, which yields (3/2, 3/2), and Play; if he plays, the Battle of Sexes with payoffs (2,1), (0,0), (0,0), (1,2) follows.]

Exercise 3 Consider the following version of the last game: after knowing what Player
2 does, Player 1 gets a chance to change his action; then, the game ends. In other words,
Player 1 chooses between Top and Bottom; knowing Player 1’s choice, Player 2 chooses
between Left and Right; knowing 2’s choice, Player 1 decides whether to stay where he
is or to change his position. What is the “reasonable” outcome? What would happen if
changing his action would cost player 1 c utiles?

Imagine that, before playing the Battle of Sexes, Player 1 has the option of exiting,
in which case each player will get 3/2, or playing the Battle of Sexes. When asked to
play, Player 2 will know that Player 1 chose to play the Battle of Sexes, as depicted
in Figure 1.2. There are two “reasonable” equilibria (or stable outcomes). One is that
Player 1 exits, thinking that, if he plays the Battle of Sexes, they will play the Bottom-
Right equilibrium of the Battle of Sexes, yielding only 1 for Player 1. The second one is
that Player 1 chooses to Play the Battle of Sexes, and in the Battle of Sexes they play
Top-Left equilibrium.
Some would argue that the first outcome is not really reasonable. When asked to play, Player 2 will know that Player 1 has chosen to play the Battle of Sexes, forgoing the payoff of 3/2. She must therefore realize that Player 1 cannot possibly be planning to play Bottom, which yields at most a payoff of 1. That is, when asked to play, Player 2 should understand that Player 1 is planning to play Top, and thus she should play Left. Anticipating this, Player 1 should choose to play the Battle of Sexes game, in which they play Top-Left. Therefore, the second outcome is the only reasonable one.

(This kind of reasoning is called Forward Induction.)


Here are some more examples of games that will be referred to frequently throughout
the course.

Prisoners’ Dilemma

1 \2 Cooperate Defect
Cooperate 5, 5 0, 6
Defect 6, 0 1, 1
This is a well-known game. Two prisoners are arrested for a crime for which there is no firm evidence, and they are interrogated in separate rooms. Each prisoner can either cooperate with the other and not confess the crime, or defect and confess it. In this game, no matter what the other player does, each player would like to defect, confessing the crime. This yields (1, 1). Had they both cooperated and not confessed, each would have received the better payoff of 5.

Hawk-Dove game

1 \ 2   Hawk                    Dove
Hawk    (V − C)/2, (V − C)/2    V, 0
Dove    0, V                    V/2, V/2

This is an important biological game, but it is also quite similar to many games in Economics and Political Science. V is the value of a resource that one of the players will enjoy. If they share the resource, each gets V/2. Hawk stands for a "tough" strategy, whereby the player does not give up the resource. However, if the other player is also playing Hawk, they end up fighting and incur the cost C/2 each. On the other hand, a Hawk player gets the whole resource for itself when playing against a Dove. When V > C, this is a Prisoners' Dilemma game, yielding a fight.
When V < C, so that fighting is costly, this game is similar to another well-known game, named "Chicken," where two players driving towards a cliff have to decide whether to stop or continue. The one who stops first loses face but may save his life. More generally, a class of games called "wars of attrition" is used to model this type of situation. In this case, a player would like to play Hawk if his opponent plays Dove, and play Dove if his opponent plays Hawk.
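The dependence of the stable outcomes on V and C can be checked numerically. Below is a Python sketch (my own illustration, not from the notes) that builds the Hawk-Dove matrix and finds its pure-strategy stable outcomes in the two cases discussed:

```python
def hawk_dove(V, C):
    """Payoff matrix of the Hawk-Dove game for resource value V, fight cost C."""
    return {
        ("Hawk", "Hawk"): ((V - C) / 2, (V - C) / 2),
        ("Hawk", "Dove"): (V, 0),
        ("Dove", "Hawk"): (0, V),
        ("Dove", "Dove"): (V / 2, V / 2),
    }

def pure_nash(payoffs):
    # A profile is stable if neither player gains by a unilateral deviation.
    acts = ["Hawk", "Dove"]
    return [(a, b) for a in acts for b in acts
            if all(payoffs[x, b][0] <= payoffs[a, b][0] for x in acts)
            and all(payoffs[a, y][1] <= payoffs[a, b][1] for y in acts)]

print(pure_nash(hawk_dove(4, 2)))  # V > C: both play Hawk, i.e. a fight
print(pure_nash(hawk_dove(2, 4)))  # V < C: one Hawk and one Dove, as in Chicken
```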

An investment game:

1 \ 2          Invest      Don't Invest
Invest         θ, θ        θ − c, 0
Don't Invest   0, θ − c    0, 0

Here, two parties simultaneously decide whether to invest; the investment is more valuable if the other party also invests (as in the coordination game). For example, consider a potential worker and a potential employer, the potential worker deciding whether to get an education (investing in his human capital), and the potential employer deciding whether to invest in a technology that would require human capital. (Think about what the reasonable outcomes are for various values of θ and c. How would you analyze this situation if the players do not know the actual values of these parameters, but have some private information about what these values could be?)
Chapter 2

Decision Theory

2.1 The basic theory of choice


We consider a set X of alternatives. Alternatives are mutually exclusive in the sense that one cannot choose two distinct alternatives at the same time. We also take the set of feasible alternatives to be exhaustive, so that a player's choice is always well-defined.¹
We are interested in a player's preferences on X. Such preferences are modeled through a relation ≽ on X, which is simply a subset of X × X. A relation ≽ is said to be complete if and only if, given any x, y ∈ X, either x ≽ y or y ≽ x. A relation ≽ is said to be transitive if and only if, given any x, y, z ∈ X,

[x ≽ y and y ≽ z] ⇒ x ≽ z.

A relation is a preference relation if and only if it is complete and transitive. Given any preference relation ≽, we can define strict preference ≻ by

x ≻ y ⟺ [x ≽ y and not y ≽ x],

and indifference ∼ by

x ∼ y ⟺ [x ≽ y and y ≽ x].

¹This is a matter of modeling. For instance, if we have options Coffee and Tea, we define the alternatives as "Coffee but no Tea," "Tea but no Coffee," "Coffee and Tea," and "no Coffee and no Tea."


A preference relation ≽ can be represented by a utility function u : X → R in the following sense:

x ≽ y ⟺ u(x) ≥ u(y) for all x, y ∈ X.

This statement can be spelled out as follows. First, if u(x) ≥ u(y), then the player finds alternative x as good as alternative y. Second, and conversely, if the player finds x at least as good as y, then u(x) must be at least as high as u(y). In other words, the player acts as if he is trying to maximize the value of u(·).
The following theorem states further that a relation needs to be a preference relation in order to be represented by a utility function.

Theorem 2.1 Let X be finite. A relation can be represented by a utility function if and only if it is complete and transitive. Moreover, if u : X → R represents ≽, and if f : R → R is a strictly increasing function, then f ∘ u also represents ≽.

By the last statement, such utility functions are called ordinal, i.e., only the order
information is relevant.
In order to use this ordinal theory of choice, we should know the player’s preferences
on the alternatives. As we have seen in the previous lecture, in game theory, a player
chooses between his strategies, and his preferences on his strategies depend on the strate-
gies played by the other players. Typically, a player does not know which strategies the
other players play. Therefore, we need a theory of decision-making under uncertainty.

2.2 Decision-making under uncertainty


Consider a finite set Z of prizes, and let P be the set of all probability distributions p : Z → [0, 1] on Z, where ∑_{z∈Z} p(z) = 1. We call these probability distributions lotteries. A lottery can be depicted by a tree. For example, in Figure 2.1, Lottery 1 depicts a situation in which the player gets $10 with probability 1/2 (e.g. if a coin toss results in Head) and $0 with probability 1/2 (e.g. if the coin toss results in Tail).
In the above situation, the probabilities are given, as in a casino, where the probabili-
ties are generated by a machine. In most real-world situations, however, the probabilities
are not given to decision makers, who may have an understanding of whether a given
event is more likely than another given event. For example, in a game, a player is not

[Figure 2.1: Lottery 1, which gives $10 with probability 1/2 and $0 with probability 1/2.]

given a probability distribution regarding the other players' strategies. Fortunately, Savage (1954) has shown that, under certain conditions, a player's beliefs can be represented by a (unique) probability distribution. Using these probabilities, one can represent the decision makers' acts by lotteries.
We would like to have a theory that constructs a player's preferences on the lotteries from his preferences on the prizes. There are many such theories. The most well-known, and also the most canonical and the most useful, is the theory of expected utility maximization by von Neumann and Morgenstern. A preference relation ≽ on P is said to be represented by a von Neumann-Morgenstern utility function u : Z → R if and only if

p ≽ q ⟺ U(p) ≡ ∑_{z∈Z} u(z)p(z) ≥ ∑_{z∈Z} u(z)q(z) ≡ U(q)    (2.1)

for each p, q ∈ P. This statement has two crucial parts:

1. U : P → R represents ≽ in the ordinal sense. That is, if U(p) ≥ U(q), then the player finds lottery p as good as lottery q. And conversely, if the player finds p at least as good as q, then U(p) must be at least as high as U(q).

2. The function U takes a particular form: for each lottery p, U(p) is the expected value of u under p. That is, U(p) ≡ ∑_{z∈Z} u(z)p(z). In other words, the player acts as if he wants to maximize the expected value of u. For instance, the expected utility of Lottery 1 for the player is U(Lottery 1) = (1/2)u(10) + (1/2)u(0).²
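The formula U(p) = ∑ u(z)p(z) can be evaluated directly. Below is a minimal Python sketch of my own (not from the notes); the utility function u(z) = √z is an arbitrary illustrative choice, since the text leaves u unspecified:

```python
import math

def expected_utility(lottery, u):
    """U(p) = sum of u(z) * p(z) over the prizes z in the lottery's support."""
    return sum(prob * u(z) for z, prob in lottery.items())

# Lottery 1 from Figure 2.1: $10 with probability 1/2, $0 with probability 1/2.
lottery1 = {10: 0.5, 0: 0.5}

# With the illustrative choice u(z) = sqrt(z) (an assumption, not from the
# text), U(Lottery 1) = 0.5*sqrt(10) + 0.5*sqrt(0) ≈ 1.5811.
print(expected_utility(lottery1, math.sqrt))
```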

In the sequel, I will describe the necessary and sufficient conditions for a representation as in (2.1). The first condition states that the relation is indeed a preference relation:

²If Z were a continuum, like R, we would compute the expected utility of p by ∫u(z)dp(z).

[Figure 2.2: Two lotteries, p and q.]

Axiom 2.1 ≽ is complete and transitive.

This is necessary by Theorem 2.1, for U represents ≽ in the ordinal sense. The second condition is called the independence axiom, stating that a player's preference between two lotteries p and q does not change if we toss a coin and give him a fixed lottery r if "tail" comes up.

Axiom 2.2 For any p, q, r ∈ P, and any a ∈ (0, 1], ap + (1 − a)r ≻ aq + (1 − a)r ⟺ p ≻ q.

Let  and  be the lotteries depicted in Figure 2.2. Then, the lotteries  + (1 − )
and  + (1 − ) can be depicted as in Figure 2.3, where we toss a coin between a
fixed lottery  and our lotteries  and . Axiom 2.2 stipulates that the player would not
change his mind after the coin toss. Therefore, the independence axiom can be taken as
an axiom of “dynamic consistency” in this sense.
The third condition is purely technical, and is called the continuity axiom. It states that there are no "infinitely good" or "infinitely bad" prizes.

Axiom 2.3 For any p, q, r ∈ P with p ≻ q ≻ r, there exist a, b ∈ (0, 1) such that ap + (1 − a)r ≻ q and q ≻ bp + (1 − b)r.

Axioms 2.1 and 2.2 imply that, given any p, q, r ∈ P and any a ∈ [0, 1],

if p ∼ q, then ap + (1 − a)r ∼ aq + (1 − a)r.    (2.2)

This has two implications:

1. The indifference curves on the lotteries are straight lines.



[Figure 2.3: Two compound lotteries, ap + (1 − a)r and aq + (1 − a)r.]

(2 )
6

1
@
@
@
@
H
 @
HH @
HH @
0
H HH@
HH H@H@
HH0 H
HH ¡@
H¡ @
¡H
0
HH @
¡ HH@
¡ @
HH
¡ @
H
¡ @
¡ @
¡ @
¡ @ - (1 )
 0 1

Figure 2.4: Indifference curves on the space of lotteries



2. The indifference curves, which are straight lines, are parallel to each other.

To illustrate these facts, consider three prizes z_0, z_1, and z_2, where z_2 ≻ z_1 ≻ z_0. A lottery p can be depicted on a plane by taking p(z_1) as the first coordinate (on the horizontal axis) and p(z_2) as the second coordinate (on the vertical axis). The remaining probability p(z_0) is 1 − p(z_1) − p(z_2). [See Figure 2.4 for an illustration.] Given any two lotteries p and q, the convex combinations ap + (1 − a)q with a ∈ [0, 1] form the line segment connecting p to q. Now, taking r = q, we can deduce from (2.2) that, if p ∼ q, then ap + (1 − a)q ∼ aq + (1 − a)q = q for each a ∈ [0, 1]. That is, the line segment connecting p to q is an indifference curve. Moreover, if the lines l and l′ are parallel, then α = |l′|/|l|, where |l| and |l′| are the distances of l and l′ to the origin, respectively. Hence, taking a = α, we compute that p′ = αp + (1 − α)p_0 and q′ = αq + (1 − α)p_0, where p_0 is the lottery at the origin, which gives z_0 with probability 1. Therefore, by (2.2), if l is an indifference curve, l′ is also an indifference curve, showing that the indifference curves are parallel.
Line  can be defined by equation 1  (1 ) + 2  (2 ) =  for some 1  2   ∈ R. Since
0 is parallel to , then 0 can also be defined by equation 1  (1 ) + 2  (2 ) = 0 for some
0 . Since the indifference curves are defined by equality 1  (1 ) + 2  (2 ) =  for various
values of , the preferences are represented by

 () = 0 + 1  (1 ) + 2  (2 )


≡ (0 )(0 ) + (1 ) (1 ) + (2 )(2 )

where

 (0 ) = 0
(1 ) = 1 
(2 ) = 2 

giving the desired representation.


This is true in general, as stated in the next theorem:

Theorem 2.2 A relation ≽ on P can be represented by a von Neumann-Morgenstern utility function u : Z → R as in (2.1) if and only if ≽ satisfies Axioms 2.1-2.3. Moreover, u and ũ represent the same preference relation if and only if ũ = au + b for some a > 0 and b ∈ R.

By the last statement in the theorem, this representation is "unique up to affine transformations." That is, a decision maker's preferences do not change when we change his von Neumann-Morgenstern (VNM) utility function by multiplying it with a positive number or adding a constant to it; but they do change when we transform it through a non-linear transformation. In this sense, this representation is "cardinal." Recall that, in the ordinal representation, the preferences wouldn't change even if the transformation were non-linear, so long as it was increasing. For instance, under certainty, v = √u and u would represent the same preference relation, while (when there is uncertainty) the VNM utility function v = √u represents a very different set of preferences over the lotteries than those represented by u.
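This contrast between ordinal and cardinal representations can be seen numerically. In the following Python sketch (my own illustration, not from the notes), the increasing transformation v = √u preserves all comparisons of sure prizes but changes the ranking of lotteries:

```python
import math

def eu(lottery, u):
    # Expected utility of a lottery given as a {prize: probability} dict.
    return sum(p * u(z) for z, p in lottery.items())

u = lambda z: z             # a linear utility function
v = lambda z: math.sqrt(z)  # v = sqrt(u): same ordinal ranking of sure prizes

# Under certainty, both agree: 100 is better than 36 either way.
assert u(100) > u(36) and v(100) > v(36)

# Under uncertainty they disagree: a gamble versus its expected value for sure.
gamble = {100: 0.5, 0: 0.5}  # expected value 50
sure_50 = {50: 1.0}

print(eu(gamble, u), eu(sure_50, u))  # 50.0 and 50.0: u is indifferent
print(eu(gamble, v), eu(sure_50, v))  # 5.0 and ≈7.071: v strictly prefers the sure amount
```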

2.3 Modeling Strategic Situations

In a game, when a player chooses his strategy, in principle, he does not know what the
other players play. That is, he faces uncertainty about the other players’ strategies.
Hence, in order to define the player’s preferences, one needs to define his preference
under such uncertainty. In general, this makes modeling a difficult task. Fortunately,
using the utility representation above, one can easily describe these preferences in a
compact way.
Consider two players, Alice and Bob, with strategy sets S_A and S_B. If Alice plays s_A and Bob plays s_B, then the outcome is (s_A, s_B). Hence, it suffices to take the set of outcomes Z = S_A × S_B = {(s_A, s_B) | s_A ∈ S_A, s_B ∈ S_B} as the set of prizes. Consider Alice. When she chooses her strategy, she has a belief about the strategies of Bob, represented by a probability distribution p_A on S_B, where p_A(s_B) is the probability that Bob plays s_B, for any strategy s_B. Given such a belief, each strategy s_A induces a lottery, which yields the outcome (s_A, s_B) with probability p_A(s_B). Therefore, we can consider each of her strategies as a lottery.

Example 2.1 Let S_A = {T, B} and S_B = {L, R}. Then, the outcome set is Z = {TL, TR, BL, BR}. Suppose that Alice assigns probability p_A(L) = 1/3 to L and p_A(R) = 2/3 to R. Then, under this belief, her strategies T and B yield the following lotteries:

T: TL with probability 1/3, TR with probability 2/3 (and BL, BR with probability 0);
B: BL with probability 1/3, BR with probability 2/3 (and TL, TR with probability 0).

On the other hand, if she assigns probability p_A(L) = 1/2 to L and p_A(R) = 1/2 to R, then her strategies T and B yield the following lotteries:

T: TL with probability 1/2, TR with probability 1/2;
B: BL with probability 1/2, BR with probability 1/2.

The objective of a game-theoretical analysis is to understand what players believe about the other players' strategies and what they would play. In other words, the players' beliefs, p_A and p_B, are determined at the end of the analysis, and we do not know them when we model the situation. Hence, in order to describe a player's preferences, we need to describe his preferences among all the lotteries as above for every possible belief he may hold. In the example above, we need to describe how Alice compares the lotteries

T: TL with probability p, TR with probability 1 − p;
B: BL with probability p, BR with probability 1 − p    (2.3)

for every p ∈ [0, 1]. That is clearly a challenging task.

Fortunately, under Axioms 2.1-2.3, which we will assume throughout the course, we can describe the preferences of Alice by a function

u_A : S_A × S_B → R.

Similarly, we can describe the preferences of Bob by a function

u_B : S_A × S_B → R.

In the example above, all we need to do is to find four numbers for each player. The preferences of Alice are described by u_A(T, L), u_A(T, R), u_A(B, L), and u_A(B, R).

Example 2.2 In the previous example, assume that, regarding the lotteries in (2.3), the preference relation of Alice is such that

T ≻ B if p > 1/4,    (2.4)
T ∼ B if p = 1/4,
B ≻ T if p < 1/4,

and she is indifferent between the sure outcomes (B, L) and (B, R). Under Axioms 2.1-2.3, we can represent her preferences by

u_A(T, L) = 3,
u_A(T, R) = −1,
u_A(B, L) = 0,
u_A(B, R) = 0.

The derivation is as follows. Using the fact that she is indifferent between (B, L) and (B, R), we reckon that u_A(B, L) = u_A(B, R). By the second part of Theorem 2.2, we can set u_A(B, L) = 0 (or any other number you like)! Moreover, in (2.3), the lottery T yields

U_A(T) = p · u_A(T, L) + (1 − p) · u_A(T, R),

and the lottery B yields

U_A(B) = p · u_A(B, L) + (1 − p) · u_A(B, R) = 0.

Hence, the condition (2.4) can be rewritten as

p · u_A(T, L) + (1 − p) · u_A(T, R) > 0 if p > 1/4,
p · u_A(T, L) + (1 − p) · u_A(T, R) = 0 if p = 1/4,
p · u_A(T, L) + (1 − p) · u_A(T, R) < 0 if p < 1/4.

That is,

(1/4) u_A(T, L) + (3/4) u_A(T, R) = 0

and

u_A(T, L) > u_A(T, R).

In other words, all we need to do is to find numbers u_A(T, L) > 0 and u_A(T, R) < 0 with u_A(T, L) = −3 u_A(T, R), as in our solution. (Why would any such two numbers yield the same preference relation?)
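The claimed representation can be verified numerically. A short Python check of my own (not part of the notes):

```python
# Verify that u_A(T,L)=3, u_A(T,R)=-1, u_A(B,L)=u_A(B,R)=0 reproduces
# Alice's preferences in (2.4): T is better than B exactly when p > 1/4.
uA = {("T", "L"): 3, ("T", "R"): -1, ("B", "L"): 0, ("B", "R"): 0}

def EU(action, p):
    # p is the probability Alice assigns to Bob playing L.
    return p * uA[action, "L"] + (1 - p) * uA[action, "R"]

for p in [0.1, 0.25, 0.5]:
    print(p, EU("T", p), EU("B", p))
# p=0.10: EU(T) = -0.6 < 0 = EU(B)   (B preferred)
# p=0.25: EU(T) =  0.0 = EU(B)       (indifferent)
# p=0.50: EU(T) =  1.0 > 0 = EU(B)   (T preferred)
```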

2.4 Attitudes Towards Risk


Here, we will relate the attitudes of an individual towards risk to the properties of his von Neumann-Morgenstern utility function. Towards this end, consider lotteries with monetary prizes and a decision maker with utility function u : R → R.

A lottery is said to be a fair gamble if its expected value is 0. For instance, consider a lottery that gives x with probability p and y with probability 1 − p; denote this lottery by G(x, y; p). Such a lottery is a fair gamble if and only if px + (1 − p)y = 0.

A decision maker is said to be risk-neutral if and only if he is indifferent between accepting and rejecting all fair gambles. Hence, a decision maker with utility function u is risk-neutral if and only if

∑ u(x)p(x) = u(0) whenever ∑ x p(x) = 0.

This is true if and only if the utility function u is linear, i.e., u(x) = ax + b for some real numbers a and b. Therefore, an agent is risk-neutral if and only if he has a linear von Neumann-Morgenstern utility function.

A decision maker is strictly risk-averse if and only if he rejects all fair gambles, except for the gamble that gives 0 with probability 1. That is,

∑ u(x)p(x) < u(0) = u(∑ x p(x)).

Here, the inequality states that he rejects the lottery p, and the equality follows from the fact that the lottery is a fair gamble. As in the case of risk neutrality, it suffices to consider the binary lotteries G(x, y; p), in which case the above inequality reduces to

p u(x) + (1 − p) u(y) < u(px + (1 − p)y).

This is a familiar inequality from calculus: a function u is said to be strictly concave if and only if

u(px + (1 − p)y) > p u(x) + (1 − p) u(y)

for all p ∈ (0, 1). Therefore, strict risk-aversion is equivalent to having a strictly concave utility function. A decision maker is said to be risk-averse iff he has a concave utility function, i.e., u(px + (1 − p)y) ≥ p u(x) + (1 − p) u(y) for each x, y, and p. Similarly, a decision maker is said to be (strictly) risk-seeking iff he has a (strictly) convex utility function.
Consider Figure 2.5. A risk-averse decision maker's expected utility is EU = p u(W_1) + (1 − p) u(W_2) if he holds a gamble that gives W_1 with probability p and W_2 with probability 1 − p. On the other hand, if he had the expected value pW_1 + (1 − p)W_2 for sure, his utility would be u(pW_1 + (1 − p)W_2). Hence, the chord AB is the utility difference that this risk-averse agent would lose by taking the gamble instead of its expected value. Likewise, the chord BC is the maximum amount that he is willing to pay in order to avoid taking the gamble instead of its expected value. For example, suppose that W_2 is his wealth level, W_2 − W_1 is the value of his house, and p is the probability that the house burns down. In the absence of fire insurance, the expected utility of this individual is EU(gamble), which is lower than the utility of the expected value of the gamble.

[Figure 2.5: the utility function of a risk-averse agent, with EU(Gamble) and u(pW_1 + (1 − p)W_2) marked at the wealth levels W_1, pW_1 + (1 − p)W_2, and W_2, and the chords AB and BC indicated.]

2.4.1 Risk sharing

Consider an agent with utility function u : x ↦ √x. He has a (risky) asset that gives $100 with probability 1/2 and $0 with probability 1/2. The expected utility of the asset for the agent is EU_0 = (1/2)√0 + (1/2)√100 = 5. Consider also another agent who is identical to this one, in the sense that he has the same utility function and an asset that pays $100 with probability 1/2 and $0 with probability 1/2. Assume throughout that what one asset pays is statistically independent from what the other asset pays. Imagine that the two agents form a mutual fund by pooling their assets, each agent owning half of the mutual fund. This mutual fund gives $200 with probability 1/4 (when both assets yield high dividends), $100 with probability 1/2 (when only one of the assets yields a high dividend), and $0 with probability 1/4 (when both assets yield low dividends). Thus, each agent's share in the mutual fund yields $100 with probability 1/4, $50 with probability 1/2, and $0 with probability 1/4. Therefore, his expected utility from the share in this mutual fund is EU_S = (1/4)√100 + (1/2)√50 + (1/4)√0 ≈ 6.0355. This is clearly larger than his expected utility from his own asset, which yields only 5. Therefore, the above agents gain from sharing the risk in their assets.
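The computation can be replicated in a few lines. Here is a Python sketch of my own (not from the notes) that enumerates the four equally likely outcomes of the pooled assets:

```python
import math
from itertools import product

sqrt = math.sqrt

# Each asset pays $100 or $0 with probability 1/2, independently.
# Standalone: EU_0 = 0.5*sqrt(0) + 0.5*sqrt(100) = 5.
eu_alone = 0.5 * sqrt(0) + 0.5 * sqrt(100)

# Pool the two assets; each agent owns half of the mutual fund.
eu_fund = 0.0
for a, b in product([0, 100], repeat=2):  # four equally likely outcomes
    eu_fund += 0.25 * sqrt((a + b) / 2)   # each agent's share is half the total

print(eu_alone, round(eu_fund, 4))  # → 5.0 6.0355
```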

2.4.2 Insurance

Imagine a world where, in addition to one of the agents above (with utility function u : x ↦ √x and a risky asset that gives $100 with probability 1/2 and $0 with probability 1/2), we have a risk-neutral agent with lots of money. We call this new agent the insurance company. The insurance company can insure the agent's asset by giving him $100 if his asset happens to yield $0. What premium P would the agent be willing to pay to get this insurance? [A premium is an amount that is to be paid to the insurance company regardless of the outcome.]
If the risk-averse agent pays premium  and buys the insurance, his wealth will be
$100 −  for sure. If he does not, then his wealth will be $100 with probability 1/2 and
$0 with probability 1/2. Therefore, he is willing to pay  in order to get the insurance
iff
1 1
 (100 −  ) ≥  (0) +  (100)
2 2
i.e., iff
√ 1√ 1√
100 −  ≥ 0+ 100
2 2
The above inequality is equivalent to

 ≤ 100 − 25 = 75

That is, he is willing to pay up to a 75-dollar premium for the insurance. On the other hand, if
the insurance company sells the insurance for premium P, it gets P for sure and pays
$100 with probability 1/2. Therefore, it is willing to take the deal iff

P ≥ (1/2) · 100 = 50.

Therefore, both parties would gain if the insurance company insures the asset for a
premium P ∈ (50, 75), a deal both parties are willing to accept.
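The two bounds can be recomputed in a few lines, under the same assumptions (u(x) = √x, an asset paying $100 or $0 with equal probability):

```python
import math

u = math.sqrt  # the risk-averse agent's utility

# Agent's side: u(100 - P) >= (1/2) u(0) + (1/2) u(100).
reservation_utility = 0.5 * u(0) + 0.5 * u(100)  # = 5.0
max_premium = 100 - reservation_utility ** 2     # = 75.0

# Insurer's side (risk-neutral): P >= (1/2) * 100.
min_premium = 0.5 * 100                          # = 50.0

print(min_premium, max_premium)  # 50.0 75.0
```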

Exercise 2.1 Now consider the case in which we have two identical risk-averse agents as
above, and the insurance company. The insurance company is to charge the same premium
P for each agent, and the risk-averse agents have the option of forming a mutual fund.
What is the range of premiums that are acceptable to all parties?

2.5 Exercises with Solution


1. [Homework 1, 2006] In which of the following pairs of games are the players' preferences
over lotteries the same?

(a) (The strategy labels were lost in extraction; each 3×3 bimatrix lists payoff pairs, Player 1's payoff first.)

Left game:                 Right game:
2, −2    1, 1    −3, 7     12, −1   5, 0    −3, 2
1, 10    0, 4    0, 4      5, 3     3, 1    3, 1
−2, 1    1, 7    −1, −5    −1, 0    5, 2    1, −2

(b)

Left game:                 Right game:
1, 2     7, 0    4, −1     1, 5     7, 1    4, −1
6, 1     2, 2    8, 4      6, 3     2, 4    8, 8
3, −1    9, 2    5, 0      3, −1    9, 5    5, 1

Solution: Recall from Theorem 2.2 that two utility functions represent the same
preferences over lotteries if and only if one is an affine transformation of the other.
That is, for each player i we must have u′_i = A u_i + B for some A > 0 and B, where u_i and
u′_i are player i's utility functions in the left and right games, respectively. In part (a), the
preferences of Player 1 are different in the two games. To see this, note that at a cell where
u_1 = 0 we have u′_1 = 3; hence, we must have B = 3. Moreover, at a cell where u_1 = 1 we have
u′_1 = 5; hence, we must have A = 2. But then, at the cell where u_1 = 2, we would need A u_1 + B = 7 ≠
12 = u′_1, showing that it is impossible to have an affine transformation.
Similarly, one can check that the preferences of Player 2 are different in part (b).

Indeed, comparing Player 2's payoffs at a cell where u_2 = 2 and u′_2 = 5 and at a cell where
u_2 = 0 and u′_2 = 1 yields A = 2 and B = 1; but then the cell where u_2 = 4 would require
u′_2 = 9 ≠ 8, so the payoffs do not match under the resulting transformation.
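The affine-transformation test applied in this solution can be automated. A sketch (the function name is mine), run on Player 1's payoffs from part (a):

```python
def is_affine_transformation(U, V):
    """Return True iff V = A*U + B entrywise for some A > 0 and B
    (Theorem 2.2's condition for identical preferences over lotteries).
    Assumes U is not constant, so two distinct entries exist."""
    pairs = [(u, v) for ru, rv in zip(U, V) for u, v in zip(ru, rv)]
    u0, v0 = pairs[0]
    # Find a second entry with a different U-value to pin down A and B.
    u1, v1 = next((u, v) for u, v in pairs if u != u0)
    A = (v1 - v0) / (u1 - u0)
    B = v0 - A * u0
    return A > 0 and all(abs(A * u + B - v) < 1e-9 for u, v in pairs)

# Player 1's payoffs in the left and right games of part (a).
U1 = [[2, 1, -3], [1, 0, 0], [-2, 1, -1]]
V1 = [[12, 5, -3], [5, 3, 3], [-1, 5, 1]]
print(is_affine_transformation(U1, V1))  # False
```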

2. [Homework 1, 2011] Alice and Bob want to meet in one of three places, namely the
Aquarium (denoted by A), Boston Commons (denoted by B), and a Celtics game
(denoted by C). Each of them has strategies A, B, C. If they both play the same
strategy, then they meet at the corresponding place, and they end up at different
places if their strategies do not match. You are asked to find a pair of utility
functions to represent their preferences, assuming that they are expected utility
maximizers.
Alice's preferences: She prefers any meeting to not meeting, and she is indifferent
about where they end up if they do not meet. She is indifferent between a
situation in which she will meet Bob at A, or B, or C, each with probability 1/3,
and a situation in which she meets Bob at A with probability 1/2 and does not
meet Bob with probability 1/2. If she believes that Bob goes to Boston Commons
with probability p and to the Celtics game with probability 1 − p, she weakly
prefers to go to Boston Commons if and only if p ≥ 1/3.
Bob's preferences: If he goes to the Celtics game, he is indifferent about where Alice
goes. If he goes to the Aquarium or Boston Commons, then he prefers any meeting to
not meeting, and he is indifferent about where they end up in the case they do
not meet. He is indifferent between playing A, B, and C if he believes that Alice
may choose any of her strategies with equal probabilities.

(a) Assuming that they are expected utility maximizers, find a pair of utility
functions u_A : {A, B, C}² → R and u_B : {A, B, C}² → R that represent the
preferences of Alice and Bob on the lotteries over {A, B, C}².
Solution: Alice's utility function is determined as follows. Since she is indifferent
between any (x, y) with x ≠ y, by Theorem 2.2 one can normalize
her payoff for any such strategy profile to u_A(x, y) = 0. Moreover, since
she prefers meeting to not meeting, u_A(x, x) > 0 for all x ∈ {A, B, C}.
By Theorem 2.2, one can also set u_A(C, C) = 1 by a normalization. The
indifference condition in the question can then be written as

(1/3) u_A(A, A) + (1/3) u_A(B, B) + (1/3) u_A(C, C) = (1/2) u_A(A, A).

The last preference in the question also leads to

(1/3) u_A(B, B) = (2/3) u_A(C, C).

From the last equality, u_A(B, B) = 2, and from the previous displayed equality, u_A(A, A) = 6.
Bob's utility function can be obtained similarly, by setting u_B(x, y) = 0
for any distinct x, y with x ∈ {A, B}. The first and the last indifference
conditions also imply that u_B(C, y) > 0, and hence one can set u_B(C, y) = 1
for all y ∈ {A, B, C} by the first indifference. The last indifference then
implies that

(1/3) u_B(A, A) = (1/3) u_B(B, B) = 1,

yielding u_B(A, A) = u_B(B, B) = 3.
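The derived payoffs can be checked against the three indifference conditions stated in the question. A small verification sketch (the dictionary encoding is mine):

```python
# Derived payoffs: u_A(x,x) for x = A, B, C is 6, 2, 1 (0 on a miss);
# u_B(C, y) = 1 for every y, u_B(A,A) = u_B(B,B) = 3 (0 otherwise).
places = "ABC"
uA = {(x, y): {"A": 6, "B": 2, "C": 1}[x] if x == y else 0
      for x in places for y in places}
uB = {(x, y): 1 if x == "C" else (3 if x == y else 0)
      for x in places for y in places}

# Alice: meeting at A, B, C each w.p. 1/3 vs. meeting at A w.p. 1/2.
assert (uA["A", "A"] + uA["B", "B"] + uA["C", "C"]) / 3 == uA["A", "A"] / 2

# Alice is indifferent between B and C exactly when Bob plays B w.p. 1/3.
assert abs((1 / 3) * uA["B", "B"] - (2 / 3) * uA["C", "C"]) < 1e-12

# Bob: indifferent among A, B, C against a uniformly mixing Alice.
eu = {x: sum(uB[x, y] for y in places) / 3 for x in places}
assert eu["A"] == eu["B"] == eu["C"] == 1

print("all three indifference conditions hold")
```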
(b) Find another representation of the same preferences.
Solution: By Theorem 2.2, we can find another pair of utility functions by
doubling all payoffs.
(c) Find a pair of utility functions that yield the same preferences as u_A and u_B
do among the sure outcomes but do not represent the preferences above.
Solution: Take u_A(A, A) = 60 and u_B(A, A) = u_B(B, B) = 30 while keeping
all other payoffs as before. By Theorem 2.1, the preferences among sure
outcomes do not change, but the preferences among some lotteries change by
Theorem 2.2, as the new functions are not affine transformations of the old ones.

3. [Homework 1, 2011] In this question you are asked to price a simplified version of
mortgage-backed securities. A banker lends money to n homeowners, where each
homeowner signs a mortgage contract. According to the mortgage contract, the
homeowner is to pay the lender 1 million dollars, but he may go bankrupt with
probability q, in which case there will be no payment. There is also an investor
who can buy a contract, in which case he receives the payment from the
homeowner who has signed the contract. The utility function u of the investor is
given by u(x) = −exp(−αx), where x is the net change in his wealth.

(a) How much is the investor willing to pay for a mortgage contract?

Solution: Writing Y for the payment from the contract (in million dollars), he pays a price P if and only if E[u(Y − P)] ≥ u(0), i.e.,

−(1 − q) exp(−α(1 − P)) − q exp(−α(0 − P)) ≥ −1.

That is,

P ≤ P* ≡ −(1/α) ln(q + (1 − q) exp(−α)),

where P* is the maximum he is willing to pay.
(b) Now suppose that the banker can form n "mortgage-backed securities" by
pooling all the mortgage contracts and dividing them equally. A mortgage-backed
security yields 1/n of the total payments by the n homeowners, i.e., if k
homeowners go bankrupt, a security pays (n − k)/n million dollars. Assume
that the homeowners' bankruptcies are stochastically independent from each other.
How much is the investor willing to pay for a mortgage-backed security?
Assuming that n is large, find an approximate value for the price he is willing
to pay. [Hint: for large n, approximately, the average payment is normally
distributed with mean 1 − q (million dollars) and variance q(1 − q)/n. If X
is normally distributed with mean μ and variance σ², the expected value of
exp(−αX) is exp(−α(μ − (1/2)ασ²)).] How much more can the banker raise by
creating mortgage-backed securities? (Use the approximate values for large
n.)
Solution: Writing C(n, k) for the number of k-combinations out of n, the probability
that there are k bankruptcies is C(n, k) q^k (1 − q)^(n−k). If he pays P for
a mortgage-backed security, his net revenue in the case of k bankruptcies is
1 − k/n − P. Hence, his expected payoff is

−Σ_{k=0}^{n} exp(−α(1 − k/n − P)) C(n, k) q^k (1 − q)^(n−k).

He is willing to pay P if the above amount is at least −1, the payoff from 0.
Therefore, he is willing to pay at most

P*_n = 1 − (1/α) ln( Σ_{k=0}^{n} exp(αk/n) C(n, k) q^k (1 − q)^(n−k) ).

For large n,

P*_n ≅ 1 − (1/α) ln(exp(α(q + αq(1 − q)/(2n)))) = 1 − q − αq(1 − q)/(2n).
 2

Note that he is asking a discount of αq(1 − q)/(2n) from the expected payoff 1 − q
against the risk, and behaves approximately risk-neutrally for large n. The
banker gets an extra revenue of n(P*_n − P*) from creating mortgage-backed
securities. (Check that P*_n − P* > 0.)
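The exact formula for P*_n and its large-n approximation can be compared numerically. A sketch with illustrative parameter values (q = 0.1, α = 1, and n = 100 are my choices, not from the homework):

```python
import math

def p_star(q, alpha):
    """Maximum price for a single contract, with u(x) = -exp(-alpha*x)."""
    return -math.log(q + (1 - q) * math.exp(-alpha)) / alpha

def p_n_star(q, alpha, n):
    """Maximum price for one of n securities (exact binomial sum)."""
    total = sum(math.comb(n, k) * q**k * (1 - q)**(n - k)
                * math.exp(alpha * k / n) for k in range(n + 1))
    return 1 - math.log(total) / alpha

def p_n_approx(q, alpha, n):
    """Large-n normal approximation: 1 - q - alpha*q*(1-q)/(2n)."""
    return 1 - q - alpha * q * (1 - q) / (2 * n)

q, alpha, n = 0.1, 1.0, 100  # illustrative values
print(p_star(q, alpha))       # about 0.84: well below 1 - q = 0.9
print(p_n_star(q, alpha, n))  # about 0.8995: nearly risk-neutral
print(p_n_approx(q, alpha, n))
```

Note how P*_n exceeds P*, which is the source of the banker's extra revenue in the solution above.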

(c) Answer part (b) by assuming instead that the homeowners' bankruptcies are
perfectly correlated: with probability q all homeowners go bankrupt, and with
probability 1 − q none of them goes bankrupt. Briefly compare your answers
for parts (b) and (c).
Solution: With perfect correlation, a mortgage-backed security is equivalent
to one contract, and hence he is willing to pay at most P*. In general, when
there is a positive correlation between the bankruptcies of different homeowners
(e.g., due to macroeconomic conditions), the value of mortgage-backed
securities will be less than what it would have been under independence.
Therefore, mortgage-backed securities that are priced under the erroneous
assumption of independence would be over-priced.

2.6 Exercises
1. [Homework 1, 2000] Consider a decision maker with von Neumann-Morgenstern
utility function u with u(x) = (x − 1)². Check whether the following VNM
utility functions can represent this decision maker's preferences. (Provide the
details.)

(a) u* : x ↦ x − 1;

(b) u** : x ↦ (x − 1)⁴;

(c) û : x ↦ −(x − 1)²;

(d) ũ : x ↦ 2(x − 1)² − 1.

2. [Homework 1, 2004] Which of the following pairs of games are strategically equivalent, i.e., can be taken as two different representations of the same decision problem?

(a)
L R L R
T 2,2 4,0 T -6,4 0,0
B 3,3 1,0 B -3,6 -9,0
(b)
L R L R
T 2,2 4,0 T 4,4 16,0
B 3,3 1,0 B 9,9 1,0
(c)
L R L R
T 2,2 4,0 T 4,2 2,0
B 3,3 1,0 B 3,3 1,0

3. [Homework 1, 2001] We have two dates: 0 and 1. We have a security that pays a
single dividend at date 1. The dividend may be either $100, $50, or $0, each
with probability 1/3. Finally, we have a risk-neutral agent with a lot of money.
(The agent will learn the amount of the dividend at the beginning of date 1.)

(a) An agent is asked to decide whether to buy the security or not at date 0. If he
decides to buy, he needs to pay for the security only at date 1 (not immediately
at date 0). What is the highest price at which the risk-neutral agent is
willing to buy this security?
(b) Now consider an "option" that gives the holder the right (but not the obligation)
to buy this security at a given strike price at date 1, after the agent learns
the amount of the dividend. If the agent buys this option, what would be the
agent's utility as a function of the amount of the dividend?
(c) An agent is asked to decide whether to buy this option or not at date 0. If he
decides to buy, he needs to pay for the option only at date 1 (not immediately
at date 0). What is the highest price at which the risk-neutral agent is
willing to buy this option?

4. [Homework 1, 2001] Take X = R, the set of real numbers, as the set of alternatives.
Define a relation ≽ on X by

x ≽ y ⟺ x ≥ y − 1/2 for all x, y ∈ X.

(a) Is ≽ a preference relation? (Provide a proof.)

(b) Define the relations ≻ and ∼ by

x ≻ y ⟺ [x ≽ y and not y ≽ x]

and

x ∼ y ⟺ [x ≽ y and y ≽ x],

respectively. Is ≻ transitive? Is ∼ transitive? Prove your claims.

(c) Would ≽ be a preference relation if we had X = N, where N = {0, 1, 2, . . .} is
the set of all natural numbers?
Chapter 3

Representation of Games

We are now ready to formally introduce games and some fundamental concepts, such as
a strategy. In order to analyze a strategic situation, one needs to know

• who the players are,

• which actions are available to them,

• how much each player values each outcome,

• what each player knows.

A game is just a formal representation of the above information. This is usually done
in one of the following two ways:

1. The extensive-form representation, in which the above information is explicitly described using game trees and information sets;

2. The normal-form (or strategic-form) representation, in which the above information is summarized by use of strategies.

Both forms of representation are useful in their own way, and I will use both representations
extensively throughout the course.
It is important to emphasize that, when describing what a player knows, one needs to
specify not only what he knows about external parameters, such as the payoffs, but also
what he knows about the other players’ knowledge and beliefs about these parameters,


as well as what he knows about the other players’ knowledge of his own beliefs, and so
on. In both representations such information is encoded in an economical manner. In the
first half of this course, we will focus on non-informational issues, by confining ourselves
to the games of complete information, in which everything that is known by a player is
known by everybody. In the second half, we will focus on informational issues, allowing
players to have asymmetric information, so that one may know a piece of information
that is not known by another.
The outline of this lecture is as follows. The first section is devoted to the extensive-
form representation of games. The second section is devoted to the concept of strategy.
The third section is devoted to the normal-form representation, and the equivalence
between the two representations. The final section contains exercises and some of their
solutions.

3.1 Extensive-form Representation


The extensive-form representation of a game contains all the information about the game
explicitly, by defining who moves when, what each player knows when he moves, what
moves are available to him, and where each move leads to, etc. This is done by use of a
game tree and information sets, as well as more basic information such as the players and
the payoffs.

3.1.1 Game Tree


Definition 3.1 A tree is a set of nodes and directed edges connecting these nodes such
that

1. there is an initial node, for which there is no incoming edge;

2. for every other node, there is exactly one incoming edge;

3. for any two nodes, there is a unique path that connects these two nodes.

For a visual aid, imagine the branches of a tree arising from its trunk. For example,
the graph in Figure 3.1 is a tree. There is a unique starting node, and it branches out
from there without forming a loop. It does look like a tree. On the other hand, the

graphs in Figure 3.2 are not trees. In the graph on the left-hand side, there are two
alternative paths to node A from the initial node, one through node B and one through
node C. This violates the third condition. (Here, the second condition in the definition is
also violated, as there are two incoming edges to node A.) On the right-hand side, there
is no path that connects the nodes x and y, once again violating the third condition.
(Once again, the second condition is also violated.)

Figure 3.1: A tree.

[Figure 3.2: Two graphs that are not a tree. In the left graph, node A can be reached from the initial node through both B and C; in the right graph, there is no path connecting nodes x and y.]
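Definition 3.1 translates directly into a check on a directed graph. A sketch (the graph encoding, node labels, and function name are mine), applied to the left graph of Figure 3.2:

```python
def is_tree(nodes, edges):
    """Check Definition 3.1 for a directed graph given as a set of
    (parent, child) edges. Conditions 1 and 2 plus reachability of
    every node from the root imply the unique-path condition 3."""
    incoming = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        incoming[child] += 1
        children[parent].append(child)
    roots = [n for n in nodes if incoming[n] == 0]
    if len(roots) != 1:                                        # condition 1
        return False
    if any(incoming[n] != 1 for n in nodes if n != roots[0]):  # condition 2
        return False
    # Every node must be reachable from the initial node.
    reached, stack = {roots[0]}, [roots[0]]
    while stack:
        for c in children[stack.pop()]:
            if c not in reached:
                reached.add(c)
                stack.append(c)
    return reached == set(nodes)

# Left graph of Figure 3.2: two paths lead from the initial node to A.
print(is_tree({"0", "B", "C", "A"},
              {("0", "B"), ("0", "C"), ("B", "A"), ("C", "A")}))  # False
```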

Note that edges (or arrows) come with labels, which can be the same for two different
arrows. In a game tree there are two types of nodes: terminal nodes, at which the game
ends, and non-terminal nodes, at which a player would need to make a further decision.
This is formally stated as follows.

Definition 3.2 The nodes that are not followed by another node are called terminal.

The other nodes are called non-terminal.

[Figure 3.3: Terminal and non-terminal nodes of the tree in Figure 3.1.]

For example, the terminal and non-terminal nodes for the game tree in Figure 3.1
are as in Figure 3.3. There is no outgoing arrow at any terminal node, indicating that
the game has ended. A terminal node may also be referred to as an outcome of the
game. At such a node, we need to specify the players' payoffs in order to describe their
preferences among the outcomes. On the other hand, there are outgoing arrows at
any non-terminal node, indicating that some further decisions are to be made. In that
case, one needs to describe who makes a decision and what he knows at the time of the
decision. A game is formally defined along these lines, next.

3.1.2 Games in Extensive Form


Definition 3.3 (Extensive form) A game consists of

• a set of players,

• a tree,

• an allocation of non-terminal nodes of the tree to the players,

• an informational partition of the non-terminal nodes (to be made precise in the next subsection), and

• payoffs for each player at each terminal node.

Players The set of players consists of the decision makers or actors who make some
decision during the course of the game. Some games may also contain a special player,
Nature (or Chance), that represents the uncertainty the players face, as will be explained
in Subsection 3.1.4. The set of players is often denoted by

N = {1, 2, . . . , n},

and i, j ∈ N are designated as generic players.

Outcomes and Payoffs The set of terminal nodes is often denoted by Z. At a terminal
node, the game has ended, leading to some outcome. At that point, one specifies a
payoff, which is a real number, for each player i. The mapping

u_i : Z → R

that maps each terminal node to the payoff of player i at that node is the von Neumann-Morgenstern
utility function of player i. Recall from the previous chapter that this
means that player i tries to maximize the expected value of u_i. That is, given any two
lotteries p and q on Z, he prefers p to q if and only if p leads to a higher expected value for
the function u_i than q does, i.e.,

Σ_{z∈Z} p(z) u_i(z) ≥ Σ_{z∈Z} q(z) u_i(z).

Recall also that these
preferences do not change if we multiply all payoffs by a fixed positive number or add a
fixed number to all payoffs. The preferences do change under any other transformation.
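The displayed expected-utility comparison can be sketched in code (the node names and payoff numbers are illustrative, not from the text):

```python
def prefers(p, q, u):
    """Player i prefers lottery p to q iff
    sum_z p(z) u_i(z) >= sum_z q(z) u_i(z)."""
    eu = lambda lot: sum(prob * u[z] for z, prob in lot.items())
    return eu(p) >= eu(q)

u = {"z1": 0, "z2": 1, "z3": 4}   # u_i at three terminal nodes
p = {"z1": 0.5, "z3": 0.5}        # expected utility 2.0
q = {"z2": 1.0}                   # expected utility 1.0
print(prefers(p, q, u))           # True

# An affine transformation 2*u + 3 represents the same preferences.
v = {z: 2 * x + 3 for z, x in u.items()}
print(prefers(p, q, v))           # True
```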

Decision Nodes In a non-terminal node, a new decision is to be made. Hence, in the


definition of a game, a player is assigned to each non-terminal node. This is the player
who will make the decision at that point. Towards describing the decision problem of
the player at the time, one defines the available choices to the player at the moment.
These are the outgoing arrows at the node, each of them leading to a different node.
Each of these choices is also called move or action (interchangeably). Note that the

moves come with their labels, and two different arrows can have the same label. In that
case, they are the same move.

[Figure 3.4: Matching Pennies with Perfect Information. Player 1 chooses Head or Tail; Player 2, at a separate node for each choice, then chooses head or tail. The payoff vectors are (−1, 1) after (Head, head), (1, −1) after (Head, tail), (1, −1) after (Tail, head), and (−1, 1) after (Tail, tail).]

Example 3.1 (Matching Pennies with Perfect Information) Consider the game
in Figure 3.4. The tree consists of 7 nodes. The first one is allocated to Player 1, and
the next two to Player 2. The four end-nodes have payoffs attached to them. Since
there are two players, payoff vectors have two elements. The first number is the payoff
of Player 1 and the second is the payoff of Player 2. These payoffs are von Neumann-
Morgenstern utilities. That is, each player tries to maximize the expected value of his
own payoffs given his beliefs about how the other players will play the game.

One also needs to describe what the player knows at the moment of his decision
making. This is formally done by information sets, as follows.

3.1.3 Information Sets


Definition 3.4 An information set is a collection of nodes such that

1. the same player i is to move at each of these nodes;

2. the same moves are available at each of these nodes.

Definition 3.5 An information partition is an allocation of each non-terminal node of the tree to an information set; the starting node must be "alone".

The meaning of an information set is that when the individual is at that information
set, he knows that one of the nodes in the information set has been reached, but he cannot
rule out any of the nodes in the information set. Moreover, in a game, an information
set belongs to the player who is to move at the given information set, representing his
uncertainty. That is, the player i who is to move at the information set is unable to
distinguish between the points in the information set, but is able to distinguish the
points outside the information set from those in it. Therefore, the above definition would
be meaningless without condition 1, while condition 2 requires that the player knows his
available choices. The latter condition can be taken as a simplifying assumption. I also
refer to information sets as histories and write h_i for a generic history at which player i
moves.
For an example, consider the game in Figure 3.5. Here, Player 2 knows that Player
1 has taken action T or B and not action X; but Player 2 cannot know for sure whether
Player 1 has taken T or B.¹

[Figure 3.5: A game. Player 1 chooses among T, B, and X; after T or B, Player 2, whose two nodes lie in a single information set, chooses L or R.]

Example 3.2 (Matching Pennies with Perfect Information) In Figure 3.4, the
informational partition is very simple. Every information set has only one element.
Hence, there is no uncertainty regarding the previous play in the game.
¹Throughout the course, the information sets are depicted either by circles (as in sets) or by dashed
curves connecting the nodes in the information sets, depending on convenience. Moreover, the information
sets with only one node in them are usually not depicted explicitly in the figures. For example, in Figure 3.5, the initial
node is in an information set that contains only that node.

A game is said to have perfect information if every information set has only one
element. Recall that in a tree, each node is reached through a unique path. Hence,
in a perfect-information game, a player can reconstruct the previous play perfectly. For
instance in Figure 3.4, Player 2 knows whether Player 1 chose Head or Tail. And Player
1 knows that when he plays Head or Tail, Player 2 will know what Player 1 has played.

3.1.4 Nature as a player and representation of uncertainty


The set of players includes the decision makers taking part in the game. However, in
many games there is room for chance, e.g. the throw of dice in backgammon or the card
draws in poker. More broadly, the players often face uncertainty about some relevant
fact, including what the other players know. In that case, once again chance plays a role
(as a representation). To represent these possibilities we introduce a fictional player:
Nature. There is no payoff for Nature at end nodes, and every time a node is allocated
to Nature, a probability distribution over the branches that follow needs to be specified,
e.g., Tail with probability of 1/2 and Head with probability of 1/2. Note that this is the
same as adding lotteries in the previous section to the game.
For an example, consider the game in Figure 3.6. In this game, a fair coin is tossed,
where the probability of Head is 1/2. If Head comes up, Player 1 chooses between Left
and Right; if Tail comes up, Player 2 chooses between Left and Right. The payoffs also
depend on the coin toss.

[Figure 3.6: A game with chance. Nature moves first: Head with probability 1/2, after which Player 1 chooses Left (payoffs (5, 0)) or Right (payoffs (2, 2)); Tail with probability 1/2, after which Player 2 chooses Left (payoffs (3, 3)) or Right (payoffs (0, −5)).]



3.1.5 Commonly Known Assumptions

The structure of a game is assumed to be known by all the players, and it is assumed
that all players know that the structure is known, and so on. That is, in a more formal language, the
structure of the game is common knowledge.² For example, in the game of Figure 3.5, Player
1 knows that if he chooses T or B, Player 2 will know that Player 1 has chosen one of these
actions without being able to rule out either one. Moreover, Player 2 knows that
Player 1 has the above knowledge, and Player 1 knows that Player 2 knows it, and so
on. Using information sets and richer game trees, one can model arbitrary information
structures like this. For example, one could also model a situation in which Player 1
does not know whether Player 2 can distinguish the actions T and B. One could do
that by having three information sets for Player 2: one of them is reached only after T,
one of them is reached only after B, and one of them can be reached after both T and B.
Towards modeling the uncertainty of Player 1, one would further introduce a chance move,
whose outcome leads either to the first two information sets (the observable case) or to the
last information set (the unobservable case).

Exercise 3.1 Write the variation of the game in Figure 3.5 in which Player 1 believes
that Player 2 can distinguish actions T and B with probability 1/3 and cannot distinguish
them with probability 2/3, and this belief is common knowledge.

To sum up: at any node, the following are known: which player is to move, which
moves are available to the player, and which information set contains the node, summarizing
the player's information at the node. Of course, if two nodes are in the same
information set, the available moves at these nodes must be the same, for otherwise the
player could distinguish the nodes by the available choices. Again, all of this is assumed
to be common knowledge.

²Formally, a proposition P is said to be common knowledge if all of the following are true: P is
true; everybody knows that P is true; everybody knows that everybody knows that P is true; . . . ;
everybody knows that . . . everybody knows that P is true, ad infinitum.

3.2 Strategies
Definition 3.6 A strategy of a player is a complete contingent plan determining which
action he will take at each information set at which he is to move (including the information sets
that will not be reached according to this strategy). More mathematically, a strategy of a
player i is a function s_i that maps every information set h_i of player i to an action that
is available at h_i.

It is important to note the following three subtleties in the definition.

1. One must assign a move to every information set of the player. (If we omit to
assign a move for an information set, we would not know what the player would
have done when that information set is reached.)

2. The assigned move must be available at the information set. (If the assigned move
is not available at an information set, then the plan would not be feasible as it
could not be executed when that information set is reached.)

3. At all nodes in a given information set, the player plays the same move. (After
all, the player cannot distinguish those nodes from each other.)

Example 3.3 (Matching Pennies with Perfect Information) In Figure 3.4, Player
1 has only one information set. Hence, the set of strategies for Player 1 is {Head, Tail}.
On the other hand, Player 2 has two information sets. Hence, a strategy of Player 2
determines what to do at each information set, i.e., depending on what Player 1 does.
So, her strategies are:

HH = Head if Player 1 plays Head, and Head if Player 1 plays Tail;
HT = Head if Player 1 plays Head, and Tail if Player 1 plays Tail;
TH = Tail if Player 1 plays Head, and Head if Player 1 plays Tail;
TT = Tail if Player 1 plays Head, and Tail if Player 1 plays Tail.

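Player 2's four strategies are exactly the functions from her information sets to her available moves, and they can be enumerated mechanically. A sketch (the labels are mine):

```python
from itertools import product

# Player 2's information sets in Figure 3.4 and her available moves.
info_sets = ["after Head", "after Tail"]
moves = ["Head", "Tail"]

# A strategy assigns one available move to every information set.
strategies = [dict(zip(info_sets, choice))
              for choice in product(moves, repeat=len(info_sets))]
for s in strategies:
    print(s)
# Four strategies, corresponding to HH, HT, TH, and TT in the text.
```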
Example 3.4 In Figure 3.5, both players have one information set. Hence, the sets of
strategies for Players 1 and 2 are

S1 = {T, B, X} and S2 = {L, R},


respectively. Although Player 2 moves at two different nodes, they are both in the same
information set. Hence, she needs to either play L at both nodes or play R at both nodes.

For certain purposes it might suffice to look at the reduced-form strategies. A
reduced-form strategy is defined as an incomplete contingent plan that determines which
action the agent will take at each information set at which he is to move and that has not been
precluded by this plan. But for many other purposes we need to look at all the strategies.
Throughout the course, we will consider all strategies.
What are the outcomes of the players' strategies? What are the payoffs generated by
those strategies? Towards answering these questions, we first need some terminology.

Definition 3.7 In a game with players N = {1, . . . , n}, a strategy profile is a list

s = (s_1, . . . , s_n)

of strategies, one for each player.

Definition 3.8 In a game without Nature, each strategy profile s leads to a unique
terminal node z(s), called the outcome of s. The payoff vector from strategy profile s is the
payoff vector at z(s).

Sometimes the outcome is also described by the resulting history, which may also be
called the path of play.

Example 3.5 (Matching Pennies with Perfect Information) In Figure 3.4, if Player
1 plays Head and Player 2 plays HH, then the outcome is

both players choose Head,

and the payoff vector is (−1, 1). If Player 1 plays Head and Player 2 plays HT, the
outcome is the same, yielding the payoff vector (−1, 1). If Player 1 plays Tail and
Player 2 plays HT, then the outcome is now

both players choose Tail,

but the payoff vector is (−1, 1) once again. Finally, if Player 1 plays Tail and Player 2
plays TH, then the outcome is

Player 1 chooses Tail and Player 2 chooses Head,


and the payoff vector is (1 −1). One can compute the payoffs for the other strategy
profiles similarly.
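The outcome computations in Example 3.5 can be expressed as a function from strategy profiles to payoff vectors. A sketch (the encoding is mine):

```python
def payoff(s1, s2):
    """Payoff vector in matching pennies with perfect information
    (Figure 3.4): s1 is Player 1's move; s2 maps Player 1's move to
    Player 2's response. Player 2 wins when the pennies match."""
    return (-1, 1) if s2[s1] == s1 else (1, -1)

# Player 2's strategies HT and TH, written as contingent plans.
HT = {"Head": "Head", "Tail": "Tail"}   # copy whatever Player 1 does
TH = {"Head": "Tail", "Tail": "Head"}   # do the opposite
print(payoff("Tail", HT))  # (-1, 1): both choose Tail
print(payoff("Tail", TH))  # (1, -1): Player 2 chooses Head
```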

In games with Nature, a strategy profile leads to a probability distribution on the set
of terminal nodes. The outcome of the strategy profile is then the resulting probability
distribution. The payoff vector from the strategy profile is the expected payoff vector
under the resulting probability distribution.

Example 3.6 (A game with Chance) In Figure 3.6, each player has two strategies,
Left and Right. The outcome of the strategy profile (Left, Left) is the lottery in which

Nature chooses Head and Player 1 plays Left

with probability 1/2 and

Nature chooses Tail and Player 2 plays Left

with probability 1/2. Hence, the expected payoff vector is

u(Left, Left) = (1/2)(5, 0) + (1/2)(3, 3) = (4, 3/2).

Sometimes, it suffices to summarize all of the information above by the sets of strategies
and the utility vectors from the strategy profiles, computed as above. Such a
summary representation is called the normal-form or strategic-form representation.

3.3 Normal form


Definition 3.9 (Normal form) A game is any list

G = (S_1, . . . , S_n; u_1, . . . , u_n),

where, for each i ∈ N = {1, . . . , n}, S_i is the set of all strategies that are available to
player i, and

u_i : S_1 × · · · × S_n → R

is player i's von Neumann-Morgenstern utility function.



Notice that a player's utility depends not only on his own strategy but also on the
strategies played by the other players. Moreover, each player i tries to maximize the expected
value of u_i (where the expected values are computed with respect to his own beliefs); in
other words, u_i is a von Neumann-Morgenstern utility function. We will say that player
i is rational iff he tries to maximize the expected value of u_i (given his beliefs).
It is also assumed that it is common knowledge that the players are N = {1, . . . , n},
that the set of strategies available to each player i is S_i, and that each i tries to maximize the
expected value of u_i given his beliefs.
When there are only two players, we can represent the normal-form game by a
bimatrix (i.e., by two matrices):

1\2      L       R
T      0, 2    1, 1
B      4, 1    3, 2

Here, Player 1 has strategies T and B, and Player 2 has the strategies L and
R. In each box, the first number is Player 1's payoff and the second one is Player 2's
payoff (e.g., u_1(T, L) = 0, u_2(T, L) = 2).

3.3.1 From Extensive Form to Normal Form


As described in detail in the previous section, in an extensive-form game, the
set of strategies is the set of all complete contingent plans, mapping information sets to
the available moves. Moreover, each strategy profile s leads to an outcome z(s), which
is in general a probability distribution on the set of terminal nodes. The payoff vector is
the expected payoff vector from z(s). One can always convert an extensive-form game
to a normal-form game in this way.

Example 3.7 (Matching Pennies with Perfect Information) In Figure 3.4, based on the earlier analyses, the normal or strategic form game corresponding to the matching penny game with perfect information is

1\2     HH       HT       TH       TT
Head    −1, 1    −1, 1    1, −1    1, −1
Tail    1, −1    −1, 1    1, −1    −1, 1

(Figure omitted: the game tree in which Player 1 chooses Head or Tail and Player 2, without observing this choice, chooses head or tail; the terminal payoffs are (−1, 1), (1, −1), (1, −1), (−1, 1).)

Figure 3.7: Matching Pennies Game

Information sets are very important. To see this, consider the following standard
matching-penny game. This game has imperfect information.

Example 3.8 (Matching Pennies) Consider the game in Figure 3.7. This is the
standard matching penny game, which has imperfect information as the players move
simultaneously. In this game, each player has only two strategies: Head and Tail. The
normal-form representation is

1\2     head     tail
Head    −1, 1    1, −1
Tail    1, −1    −1, 1

The two matching penny games may appear similar (in extensive form), but they
correspond to two distinct situations. Under perfect information Player 2 knows what
Player 1 has done, while neither player knows the other's move under the version with imperfect information.
As mentioned above, when there are chance moves, one needs to compute the ex-
pected payoffs in order to obtain the normal-form representation. This is illustrated in
the next example.

Example 3.9 (A game with Nature) As mentioned, in Figure 3.6, each player has
two strategies, Left and Right.

(Figure omitted: an extensive form in which Player 2 first chooses one of her four contingent plans HH, HT, TH, or TT, and Player 1, without observing this choice, chooses H or T at a single information set.)

Figure 3.8: A matching penny game with perfect information?

Following the earlier calculations, the normal-form representation is obtained as follows:

1\2      Left         Right
Left     4, 3/2       5/2, −5/2
Right    5/2, 5/2     1, −3/2
The payoff from (Left, Left) has been computed already. The payoff from (Left, Right) is computed as

(1/2) (5, 0) + (1/2) (0, −5) = (5/2, −5/2).
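This kind of expectation is easy to compute mechanically. The helper below is an illustrative sketch (not part of the notes) that averages terminal payoff vectors with their chance probabilities, using exact fractions:

```python
from fractions import Fraction

def expected_payoff(lottery):
    """Expected payoff vector of a lottery over terminal payoff vectors,
    given as a list of (probability, payoff_vector) pairs."""
    n = len(lottery[0][1])
    return tuple(sum(p * v[k] for p, v in lottery) for k in range(n))

half = Fraction(1, 2)
# (Left, Right) in Example 3.9 leads to (5, 0) or (0, -5), each with prob 1/2:
print(expected_payoff([(half, (5, 0)), (half, (0, -5))]))
# (Fraction(5, 2), Fraction(-5, 2))
```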
While there is a unique normal-form representation for any extensive-form game (up
to a relabeling of strategies), there can be many extensive-form games with the same
normal-form representation. After all, any normal-form game can also be represented
as a simultaneous action game in extensive form. For example, the normal-form game
of matching pennies with perfect information can also be represented as in Figure 3.8.

3.3.2 Mixed Strategies


In many cases a player may not be able to guess exactly which strategies the other players play. In order to cover these situations, we introduce mixed strategies:

Definition 3.10 A mixed strategy of a player is a probability distribution over the set
of his strategies.

If player i has strategies S_i = {s_i1, s_i2, ..., s_ik}, then a mixed strategy σ_i for player i is a function on S_i such that 0 ≤ σ_i(s_ij) ≤ 1 and

σ_i(s_i1) + σ_i(s_i2) + · · · + σ_i(s_ik) = 1.

There are many interpretations for mixed strategies, from deliberate randomization (as in coin tossing) to heterogeneity of strategies in the population. In all cases, however, they serve as a device to represent the uncertainty the other players face regarding the strategy played by player i. Throughout the course, σ_i is interpreted as the other players' belief about the strategy player i plays.
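As a small illustrative sketch (mine, not the notes'), a mixed strategy can be stored as a dictionary of probabilities, validated as a distribution, and used to compute an expected payoff against a fixed opponent strategy:

```python
def is_mixed_strategy(sigma, strategies, tol=1e-9):
    """Check that sigma is a probability distribution over the strategy set."""
    return (all(0.0 <= sigma.get(s, 0.0) <= 1.0 for s in strategies)
            and abs(sum(sigma.get(s, 0.0) for s in strategies) - 1.0) < tol)

def expected_utility(sigma, u_row):
    """Expected utility of a mixed strategy sigma, where u_row[s] is the
    payoff of pure strategy s against the opponent's fixed strategy."""
    return sum(p * u_row[s] for s, p in sigma.items())

sigma = {"Head": 0.5, "Tail": 0.5}
assert is_mixed_strategy(sigma, ["Head", "Tail"])
# In matching pennies, mixing half-half against `head` yields
# 0.5 * (-1) + 0.5 * 1 = 0:
print(expected_utility(sigma, {"Head": -1, "Tail": 1}))  # 0.0
```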

3.4 Exercises with Solutions

1. What is the normal-form representation for the game in Figure 3.12?

Solution: Player 1 has two information sets with two actions in each. Since the set of strategies is the set of functions that map information sets to the available moves, he has the following four strategies: Aa, Ad, Da, Dd. The meaning here is straightforward: Aa assigns A to the first information set and a to the last information set. On the other hand, Player 2 has only two strategies: α and δ. Filling in the payoffs from the tree, one obtains the following normal-form representation:

1\2    α        δ
Aa     1, −5    5, 2
Ad     3, 3     5, 2
Da     4, 4     4, 4
Dd     4, 4     4, 4

2. [Midterm 1, 2001] Find the normal-form representation of the game in Figure 3.9.

Figure 3.9:

Solution:

1\2
       2, 1    2, 1    2, 1
       2, 1    2, 1    2, 1
       2, 1    2, 1    2, 1
       2, 1    2, 1    2, 1
       1, 2    3, 1    1, 3
       1, 2    3, 1    3, 1
       1, 2    1, 3    1, 3
       1, 2    1, 3    3, 1

3. [Make up for Midterm 1, 2007] Write the game in Figure 3.10 in normal form.

Solution: The important point in this exercise is that Player 2 has to play the same move in a given information set. For example, she cannot play x on the left node and y on the right node of her second information set. Hence, her set of strategies is {ax, ay, bx, by}.

1\2    ax      ay      bx       by
AX     3, 3    3, 3    0, 0     0, 0
AY     3, 3    3, 3    0, 0     0, 0
BX     0, 0    0, 0    3, 3     3, 3
BY     0, 0    0, 0    3, 3     3, 3
CX     2, 2    2, 2    1, −1    −1, 3
CY     2, 2    2, 2    −1, 1    1, −1

(Figure omitted: Player 1 first chooses A, B, or C; Player 2, without observing this choice, chooses a or b; after C followed by b, Player 1 chooses X or Y and, simultaneously, Player 2 chooses x or y.)

Figure 3.10:

4. [Make up for Midterm 1, 2007] Write the following game in normal form, where the first entry is the payoff of the student and the second entry is the payoff of the Prof.

(Figure omitted: Nature first determines whether the student is healthy or sick, each with probability .5. The student, knowing his health, chooses between the regular exam and the make-up. The regular exam yields (2, 1) when healthy and (0, −1) when sick. After a make-up, the Prof chooses between the same exam, which yields (4, −1) when healthy and (4, 0) when sick, and a new exam, which yields (1, −c) in either case.)

Figure 3.11:

Solution: Write the strategies of the student as RR, RM, MR, and MM, where RM means Regular when Healthy and Make up when Sick, MR means Make up when Healthy and Regular when Sick, etc. The normal form game is as follows:

Student\Prof    same        new
RR              1, 0        1, 0
RM              3, 1/2      3/2, (1 − c)/2
MR              2, −1       1/2, −(1 + c)/2
MM              4, −1/2     1, −c

Here, the payoffs are obtained by taking expectations over whether the student is healthy or sick. For example, RR leads to (2, 1) and (0, −1) with equal probabilities, yielding (1, 0), regardless of the strategy of the Prof. On the other hand, (RM, new) leads to (2, 1) and (1, −c) with equal probabilities, yielding (3/2, (1 − c)/2).

5. [Midterm 2006] Write the game in Figure 3.11 in normal form.


Solution:

1\2
     1, 1    1, 1    1, 1    1, 1    1, 1     1, 1    1, 1     1, 1
     3, 2    3, 2    3, 2    3, 2    3, 1     3, 1    0, 0     0, 0
     0, 0    0, 0    0, 0    0, 0    0, −1    2, 3    0, −1    2, 3

3.5 Exercises
1. [Midterm 1, 2010] Write the game in Figure 3.13 in normal form.

(Figure omitted: Player 1 first chooses between D, which ends the game with payoffs (4, 4), and A; Player 2 then chooses between δ, ending the game with (5, 2), and α; finally, Player 1 chooses between d, yielding (3, 3), and a, yielding (1, −5).)

Figure 3.12: A centipede-like game

Figure 3.13:

2. [Midterm 1, 2005] Write the game in Figure 3.14 in normal form.

3. [Midterm 1, 2004] Write the game in Figure 3.15 in normal form.

Figure 3.14:

4. [Midterm 1, 2007] Write the following game in normal form.

Figure 3.15:

Figure 3.16:

5. [Homework 1, 2011] Find the normal-form representation of the extensive-form game in Figure 3.16.
Chapter 4

Dominance

The previous lectures focused on how to formally describe a strategic situation. We now
start analyzing strategic situations in order to find which outcomes are more reasonable
and likely to realize. In order to do that, we consider certain sets of assumptions about
the players’ beliefs and discover their implications on what they would play. Such analy-
ses will lead to solution concepts, which yield a set of strategy profiles1 . These are the
strategy profiles deemed to be possible by the solution concept. This lecture is devoted
to two solution concepts: dominant strategy equilibrium and rationalizability. These
solution concepts are based on the idea that a rational player does not play a strategy
that is dominated by another strategy.

4.1 Rationality and Dominance


A player is said to be rational if and only if he maximizes the expected value of his
payoffs (given his beliefs about the other players’ strategies). For example, consider the
following game.
1\2    L         R
T      2, 0      −1, 1
M      0, 10     0, 0      (4.1)
B      −1, −6    2, 0

1 A strategy profile is a list of strategies, prescribing a strategy for each player.


Consider Player 1. He is contemplating whether to play T, M, or B. A quick inspection of his payoffs reveals that his best play depends on what he thinks the other player does. Let us then write p for the probability he assigns to L (as Player 2's play). Then, his expected payoffs from playing T, M, and B are

U_T = 2p − (1 − p) = 3p − 1,
U_M = 0,
U_B = −p + 2 (1 − p) = 2 − 3p,

respectively. These values as a function of p are plotted in Figure 4.1. As it is clear from the graph, U_T is the largest when p > 1/2, and U_B is the largest when p < 1/2. At p = 1/2, U_T = U_B > 0. Hence, if Player 1 is rational, then he plays T when p > 1/2, B when p < 1/2, and T or B when p = 1/2. Notice that, if Player 1 is rational, then he never plays M, no matter what he believes about the strategy of Player 2. Therefore, if we assume that Player 1 is rational (and that the game is as it is described above), then we can conclude that Player 1 does not play M. This is because M is a strictly dominated strategy, a concept that we define now.

(Figure omitted: the three lines U_T = 3p − 1, U_M = 0, and U_B = 2 − 3p plotted against p ∈ [0, 1]; U_T and U_B cross at p = 1/2, above U_M.)

Figure 4.1: Expected payoffs in (4.1) as a function of the probability of L.
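The case analysis above can be verified numerically. The snippet below is an illustration (not part of the notes) that computes the expected payoffs in (4.1) and the set of best replies for a given p:

```python
def expected_payoffs(p):
    """Player 1's expected payoffs in game (4.1) when Player 2 plays L
    with probability p."""
    return {"T": 3 * p - 1, "M": 0.0, "B": 2 - 3 * p}

def best_replies(p, tol=1e-9):
    """All strategies attaining the maximal expected payoff at belief p."""
    u = expected_payoffs(p)
    m = max(u.values())
    return sorted(s for s, v in u.items() if v > m - tol)

print(best_replies(0.9))  # ['T']
print(best_replies(0.1))  # ['B']
print(best_replies(0.5))  # ['B', 'T'] -- M is never a best reply
```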

Towards describing this idea more generally and formally, let us use the notation s_{−i} to mean the list of strategies s_j played by all the players j other than i, i.e.,

s_{−i} = (s_1, ..., s_{i−1}, s_{i+1}, ..., s_n).

Definition 4.1 A strategy s_i* strictly dominates s_i if and only if

u_i(s_i*, s_{−i}) > u_i(s_i, s_{−i}), ∀s_{−i} ∈ S_{−i}.

That is, no matter what the other players play, playing s_i* is strictly better than playing s_i for player i. In that case, if i is rational, he would never play the strictly dominated strategy s_i. That is, there is no belief under which he would play s_i, for s_i* would always yield a higher expected payoff than s_i no matter what player i believes about the other players.2
A mixed strategy σ_i dominates a strategy s_i in a similar way: σ_i strictly dominates s_i if and only if

σ_i(s_i1) u_i(s_i1, s_{−i}) + σ_i(s_i2) u_i(s_i2, s_{−i}) + · · · + σ_i(s_ik) u_i(s_ik, s_{−i}) > u_i(s_i, s_{−i}), ∀s_{−i} ∈ S_{−i}.

Notice that none of the pure strategies T, M, and B dominates any other strategy. Nevertheless, M is dominated by the mixed strategy σ_1 that puts probability 1/2 on each of T and B. For each p, the payoff from σ_1 is

U_{σ_1} = (1/2) (3p − 1) + (1/2) (2 − 3p) = 1/2,

which is larger than 0, the payoff from M. Hence, M is not a best response to any belief p.
This is indeed a general result. Towards stating the result, I introduce a couple of basic concepts. Write

S_{−i} = ∏_{j≠i} S_j

for the set of other players' strategies, and define a belief of player i as a probability distribution μ_{−i} on S_{−i}.

Definition 4.2 For any player i, a strategy s_i is a best response to s_{−i} if and only if

u_i(s_i, s_{−i}) ≥ u_i(s_i′, s_{−i}), ∀s_i′ ∈ S_i.


2 As a simple exercise, prove this statement.

A strategy s_i is said to be a best response to a belief μ_{−i} if and only if playing s_i yields the highest expected payoff under μ_{−i}, i.e.,

∑_{s_{−i} ∈ S_{−i}} u_i(s_i, s_{−i}) μ_{−i}(s_{−i}) ≥ ∑_{s_{−i} ∈ S_{−i}} u_i(s_i′, s_{−i}) μ_{−i}(s_{−i}), ∀s_i′ ∈ S_i.

The concept of a best response is one of the main concepts in game theory, used
throughout the course. It is important to understand the definition well and be able to
compute the best response in relatively simple games, such as those covered in this class. A rational player can play a strategy under a belief only if it is a best response to that
belief.

Theorem 4.1 A strategy s_i is a best response to some belief if and only if s_i is not dominated.3 Therefore, playing strategy s_i is never rational if and only if s_i is dominated by a (mixed or pure) strategy.

To sum up: if one assumes that players are rational (and that the game is as
described), then one can conclude that no player plays a strategy that is strictly dominated
(by some mixed or pure strategy), and this is all one can conclude.
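Whether a strategy is strictly dominated by some mixture can be tested mechanically. The sketch below is mine; it scans mixture weights over a grid for simplicity, whereas a general test would solve a small linear program. It recovers the fact that M in (4.1) is dominated by a mixture of T and B while T itself is undominated:

```python
def dominating_mix(u, dominated, s_a, s_b, grid=1000):
    """Search for a weight a on s_a (and 1-a on s_b) such that the mixture
    strictly dominates `dominated` against every opponent strategy.
    u[s][t] is the row player's payoff from (s, t); returns a weight or None."""
    cols = u[dominated].keys()
    for k in range(grid + 1):
        a = k / grid
        if all(a * u[s_a][t] + (1 - a) * u[s_b][t] > u[dominated][t]
               for t in cols):
            return a
    return None

u1 = {"T": {"L": 2, "R": -1},
      "M": {"L": 0, "R": 0},
      "B": {"L": -1, "R": 2}}
print(dominating_mix(u1, "M", "T", "B"))  # some weight in (1/3, 2/3)
print(dominating_mix(u1, "T", "M", "B"))  # None -- T is not dominated
```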
Although there are typically only a few strictly dominated strategies–and thus one can conclude little from the assumption that players are rational–there are interesting games in which this weak assumption can lead to counterintuitive conclusions. For
example, consider the well-known Prisoners’ Dilemma game, introduced in Chapter 1:

1\2          Cooperate    Defect
Cooperate    5, 5         0, 6
Defect       6, 0         1, 1

Clearly, Cooperate is strictly dominated by Defect, and hence we expect each player to
play Defect, assuming that the game is as described and players are rational. Some found
the conclusion counterintuitive because if both players play Cooperate, the outcome
would be much better for both players.
3 If you like mathematical challenges, try to prove this statement.

4.2 Dominant-strategy equilibrium


This section introduces two concepts of dominance, one stronger than the other. It then uses the weaker one, weak dominance, to define dominant-strategy equilibrium.
Definition 4.3 A strategy s_i* is a strictly dominant strategy for player i if and only if s_i* strictly dominates all the other strategies of player i.

For example, in the prisoners' dilemma game, Defect strictly dominates the only other strategy, Cooperate. Hence, Defect is a strictly dominant strategy. If i is rational and has a strictly dominant strategy s_i*, then he will not play any other strategy. In that case, it is reasonable to expect that he will play s_i*.
The problem is that there are only a few interesting strategic situations in which players have strictly dominant strategies. Such situations can be analyzed as individual decision problems. A slightly weaker form of dominance is more common, especially in dynamic games (which we will analyze in the future) and in situations that arise in structured environments, such as under suitably designed trading mechanisms as in auctions. This weaker form is called weak dominance:
Definition 4.4 A strategy s_i* weakly dominates s_i if and only if

u_i(s_i*, s_{−i}) ≥ u_i(s_i, s_{−i}), ∀s_{−i} ∈ S_{−i},

and

u_i(s_i*, s_{−i}) > u_i(s_i, s_{−i})

for some s_{−i} ∈ S_{−i}.
That is, no matter what the other players play, playing s_i* is at least as good as playing s_i, and there are some contingencies in which playing s_i* is strictly better than s_i. In that case, if rational, i would play s_i only if he believes that these contingencies will never occur. If he is cautious in the sense that he assigns some positive probability to each contingency, then he will not play s_i. This weak dominance is used in the definition of a dominant strategy:
Definition 4.5 A strategy s_i* of a player i is a (weakly) dominant strategy if and only if s_i* weakly dominates all the other strategies of player i.
When there is a weakly dominant strategy, if the player is rational and cautious,
then he will play the dominant strategy.

Example:

1\2           work hard    shirk
hire          2, 2         1, 3      (4.2)
don't hire    0, 0         0, 0

In this game, Player 1 (the firm) has a strictly dominant strategy: "hire." Player 2 has only a weakly dominant strategy: "shirk." If players are rational, and in addition Player 2 is cautious, then Player 1 hires and Player 2 shirks.
When every player has a dominant strategy, one can make a strong prediction about
the outcome. This case yields the first solution concept in the course.

Definition 4.6 A strategy profile s* = (s_1*, s_2*, ..., s_n*) is a dominant strategy equilibrium if and only if for each player i, s_i* is a weakly dominant strategy.

As an example consider the Prisoner’s Dilemma.

1\2          Cooperate    Defect
Cooperate    5, 5         0, 6
Defect       6, 0         1, 1

Defect is a strictly dominant strategy for both players, therefore (Defect, Defect) is a
dominant strategy equilibrium. Note that dominant strategy equilibrium only requires
weak dominance. For example, (hire, shirk) is a dominant strategy equilibrium in game
(4.2).
When it exists, the dominant strategy equilibrium has an obvious attraction. In
that case, rational cautious players will play the dominant strategy equilibrium. Unfor-
tunately, it does not exist in general. For example, consider the Battle of the Sexes
game:
            opera    football
opera       3, 1     0, 0
football    0, 0     1, 3
Clearly, no player has a dominant strategy: opera is a strict best reply to opera and
football is a strict best reply to football. Therefore, there is no dominant strategy
equilibrium.
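These dominance checks are mechanical. The helper below is an illustrative sketch (not from the notes) that tests weak dominance pairwise and reports any weakly dominant strategies of the row player, confirming the two conclusions above:

```python
def weakly_dominates(u, s, t):
    """s weakly dominates t: at least as good everywhere, better somewhere.
    u[s][c] is the row player's payoff from (s, c)."""
    cols = u[s].keys()
    return (all(u[s][c] >= u[t][c] for c in cols)
            and any(u[s][c] > u[t][c] for c in cols))

def dominant_strategies(u):
    """Row strategies that weakly dominate every other row strategy."""
    return [s for s in u if all(weakly_dominates(u, s, t) for t in u if t != s)]

pd = {"Cooperate": {"Cooperate": 5, "Defect": 0},
      "Defect":    {"Cooperate": 6, "Defect": 1}}
bos = {"opera":    {"opera": 3, "football": 0},
       "football": {"opera": 0, "football": 1}}
print(dominant_strategies(pd))   # ['Defect']
print(dominant_strategies(bos))  # [] -- no dominant strategy
```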

4.3 Example: second-price auction


As already mentioned, under suitably designed trading mechanisms, it is possible to
have a dominant strategy equilibrium. Such mechanisms are desirable for they give
the economic agents strong incentive to play a particular strategy (which is presumably
preferred by the market designer) and eliminate the agents’ uncertainty about what the
other players play, as it becomes irrelevant for the agent what the other players are
doing. The most famous trading mechanism with dominant-strategy equilibrium is the
second-price auction.
There is an object to be sold through an auction. There are two buyers. The value of the object for any buyer i is v_i, which is known by buyer i. Each buyer i submits a bid b_i in a sealed envelope, simultaneously. Then, the envelopes are opened, and the buyer i* who submits the highest bid

b_{i*} = max {b_1, b_2}

gets the object and pays the second-highest bid (which is b_j with j ≠ i*). (If two or more buyers submit the highest bid, one of them is selected by a coin toss.)
more buyers submit the highest bid, one of them is selected by a coin toss.)
Formally, the game is defined by the player set N = {1, 2}, the strategies b_i, and the payoffs

u_i(b_1, b_2) = v_i − b_j        if b_i > b_j,
u_i(b_1, b_2) = (v_i − b_j)/2    if b_i = b_j,
u_i(b_1, b_2) = 0                if b_i < b_j,

where j ≠ i.
In this game, bidding his true valuation v_i is a dominant strategy for each player i. To see this, consider the strategy of bidding some other value b_i′ ≠ v_i. We want to show that b_i′ is weakly dominated by bidding v_i. Consider the case b_i′ < v_i. If the other player bids some b_j < b_i′, player i would get v_i − b_j under both strategies b_i′ and v_i. If the other player bids some b_j ≥ v_i, player i would get 0 under both strategies b_i′ and v_i. But if b_j = b_i′, bidding v_i yields v_i − b_j > 0, while b_i′ yields only (v_i − b_j)/2. Likewise, if b_i′ < b_j < v_i, bidding v_i yields v_i − b_j > 0, while b_i′ yields only 0. Therefore, bidding v_i weakly dominates b_i′. The case b_i′ > v_i is similar, except that when v_i < b_j < b_i′, bidding v_i yields 0, while b_i′ yields the negative payoff v_i − b_j < 0. Therefore, bidding v_i is a dominant strategy. Since this is true for each player i, (v_1, v_2) is a dominant-strategy equilibrium.
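The weak-dominance argument can be spot-checked numerically. The sketch below is mine; the valuation and the bid grid are arbitrary. It verifies that, against every opponent bid on the grid, no deviation ever beats bidding the true valuation:

```python
def u(v, my_bid, other_bid):
    """Two-bidder second-price auction payoff for a bidder with value v."""
    if my_bid > other_bid:
        return v - other_bid          # win, pay the other (second-highest) bid
    if my_bid == other_bid:
        return (v - other_bid) / 2    # tie broken by a fair coin
    return 0.0

v = 10
bids = [x / 2 for x in range(41)]     # grid of bids in [0, 20]
truthful_ok = all(u(v, v, other) >= u(v, b, other)
                  for b in bids for other in bids)
print(truthful_ok)  # True: bidding v is never beaten on this grid
```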

Exercise 4.1 Extend this to the n-buyer case.

4.4 Exercises with Solutions


1. [Homework 1, 2011] There are n students in a class. Simultaneously, each student i chooses an effort level e_i, incurring cost c e_i² for some c > 0. The student i receives an increase e_i in his grade from his own effort, but this also raises the curve and decreases the grade of every other student by b e_i for some b > 0. The resulting utility of player i is

u_i(e_1, ..., e_n) = e_i − b ∑_{j≠i} e_j − c e_i².

All of the above is common knowledge.

(a) Write this game in normal form.


Solution: The set of players is N = {1, ..., n}. For each i ∈ N, S_i = R, and u_i : R^n → R is given in the question.
(b) Is there a dominant strategy equilibrium? If so, compute the dominant strategy equilibrium.
Solution: For any e_{−i} = (e_1, ..., e_{i−1}, e_{i+1}, ..., e_n), the best response can be found from the first-order condition

∂u_i/∂e_i = 1 − 2c e_i = 0.

The solution to this equation is the unique best response:

e_i* = 1/(2c).

Since e_i* is a best response to every strategy, e_i* dominates any other strategy e_i:

u_i(e_i*, e_{−i}) > u_i(e_i, e_{−i}) (∀e_{−i}).

Therefore, (1/(2c), ..., 1/(2c)) is the dominant-strategy equilibrium.
(c) Compute the vector (e_1, ..., e_n) that maximizes the sum u_1(e_1, ..., e_n) + · · · + u_n(e_1, ..., e_n) of grades. Comparing your answers to (b) and (c), briefly discuss your findings.

Solution: The total utility is

U = ∑_i ( e_i − b ∑_{j≠i} e_j − c e_i² ) = (1 − (n − 1) b) ∑_i e_i − c ∑_i e_i².

The first-order condition is

∂U/∂e_i = 1 − (n − 1) b − 2c e_i = 0.

Therefore, U is maximized at

(e_1, ..., e_n) = ( (1 − (n − 1) b)/(2c), ..., (1 − (n − 1) b)/(2c) ).

Note that the dominant-strategy equilibrium corresponds to the case b = 0, ignoring the negative impact on the other students' grades. The dominant-strategy equilibrium always yields a higher effort than the socially optimal level that maximizes U. This is a version of the commons problem, a generalization of the Prisoners' Dilemma game. In the commons problem, the players' efforts have a positive impact on the others' payoffs, as they produce some public good. In that problem, equilibrium effort is lower than the optimal one. Here, the impact is negative, and students work harder than is socially optimal. (Professors want them to work even harder!)
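For concreteness, one can plug in numbers and verify both formulas by brute force. The parameter values below are mine and purely illustrative; c is the cost parameter and b the externality parameter from the problem:

```python
n, c, b = 30, 1.0, 0.02   # illustrative values with (n - 1) b < 1

e_eq = 1 / (2 * c)                    # dominant-strategy effort
e_opt = (1 - (n - 1) * b) / (2 * c)   # sum-maximizing (socially optimal) effort

# e_i enters u_i only through e_i - c e_i^2, so the best reply maximizes that:
grid = [k / 100000 for k in range(100001)]
best = max(grid, key=lambda e: e - c * e * e)
assert abs(best - e_eq) < 1e-4
print(e_eq, e_opt)  # equilibrium effort (0.5) exceeds the optimum (~0.21)
```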

2. [Homework 1, 2010] Consider an auction in which k identical objects are sold to n > k bidders. Each bidder i needs only one object and has a valuation v_i for the object. In the auction, simultaneously, every bidder i bids b_i. The highest k bidders win. Each winner gets one object and pays the (k + 1)st highest bid (i.e., the price p is the highest bid among the bidders who do not get an object). (The ties are broken by a coin toss.) Each of the losing bidders gets a gift of value g for their participation. (The winners do not get a gift.) Show that the game has a dominant-strategy equilibrium, and compute the equilibrium.

Solution: The dominant-strategy equilibrium is (v_1 − g, v_2 − g, ..., v_n − g). To show that b_i* = v_i − g is a dominant strategy, consider any b_i ≠ b_i*. Consider the case b_i < b_i*. Towards showing that b_i* weakly dominates b_i, take any bid profile b_{−i} by the others. Relabeling the players, one can take j ≠ i and b_1 ≥ b_2 ≥ · · · ≥ b_{n−1}. If b_i > b_k, then under both bids b_i and b_i*, i wins the object and pays the price p = b_k, enjoying the payoff level v_i − b_k. If b_i* < b_k, then under both bids b_i and b_i*, i loses the object and gets g. Consider the case b_i < b_k < b_i*. In that case, under b_i*, i wins and gets v_i − b_k. Under b_i, he gets g. But, since b_k < b_i* = v_i − g, bid b_i* yields a higher payoff: v_i − b_k > g. The cases of ties and b_i > b_i* are dealt with similarly.

3. For the following strategy space and utility pairs, check whether a best response exists for player 1, and compute it when it exists.
Note: In general, a best response exists if S_1 is compact (i.e., closed and bounded for all practical purposes) and u_1 is continuous in s_1. In particular, it exists whenever S_1 is finite. Fortunately, it may exist even if the above conditions fail.

(a) S_1 = [0, 1]; u_1(s_1) = s_1 if s_1 < 1, and u_1(1) = 0.

Solution: Clearly, there is no best response. Plot a graph for illustration. (Continuity fails here.)
(b) S_1 = S_2 = [0, ∞); u_1(s_1, s_2) = s_1 s_2.
Solution: Everything is a best response when s_2 = 0, and nothing is a best response when s_2 ≠ 0. Compactness fails. This also shows that there can be more than one best response.
(c) Partnership Game: S_1 = S_2 = [0, ∞); u_1(s_1, s_2) = s_1 s_2 − c s_1², where c > 0.
Solution: A best response exists although S_1 is not compact. Take the partial derivative with respect to s_1 and set it equal to zero in order to obtain the "first-order condition" for a maximum:

∂u_1/∂s_1 = s_2 − 2c s_1 = 0.

That is, the best response is

s_1 = s_2/(2c).

One does not need to check the second-order condition because u_1 is concave.
(d) First-Price Auction: S_1 = S_2 = [0, ∞);

u_1(b_1, b_2) = v − b_1        if b_1 > b_2,
u_1(b_1, b_2) = (v − b_1)/2    if b_1 = b_2,
u_1(b_1, b_2) = 0              otherwise,

where v > 0.
Solution: Any b_1 ≤ v is a best response when b_2 = v; any b_1 < b_2 is a best response when b_2 > v; and nothing is a best response when b_2 < v. Continuity fails.

(e) Price Competition: S_1 = S_2 = [0, ∞);

u_1(p_1, p_2) = (1 − p_1) p_1        if p_1 < p_2,
u_1(p_1, p_2) = (1 − p_1) p_1 / 2    if p_1 = p_2,
u_1(p_1, p_2) = 0                    otherwise.

Solution: Everything is a best response when p_2 = 0; nothing is a best response when 0 < p_2 ≤ 1/2; and when p_2 > 1/2, the monopoly price p_1 = 1/2 is the best response. Continuity fails.

(f) Quantity Competition: S_1 = S_2 = [0, ∞); u_1(q_1, q_2) = (1 − q_1 − q_2) q_1 − c q_1.

Solution: There is a unique best response. As in part (c), the first-order condition is

∂u_1/∂q_1 = 1 − 2 q_1 − q_2 − c = 0,

yielding

q_1 = (1 − q_2 − c)/2.
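Part (f) is easy to sanity-check by brute force (an illustrative sketch with arbitrary parameter values, not part of the notes):

```python
def profit(q1, q2, c):
    """u1(q1, q2) = (1 - q1 - q2) q1 - c q1."""
    return (1 - q1 - q2) * q1 - c * q1

def best_response(q2, c):
    """From the first-order condition 1 - 2 q1 - q2 - c = 0 (u1 is concave),
    truncated at zero when q2 or c is large."""
    return max((1 - q2 - c) / 2, 0.0)

c, q2 = 0.1, 0.3
q1_star = best_response(q2, c)        # (1 - 0.3 - 0.1) / 2 = 0.3
grid = [k / 10000 for k in range(10001)]
assert all(profit(q, q2, c) <= profit(q1_star, q2, c) + 1e-9 for q in grid)
print(q1_star)  # approximately 0.3
```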

4.5 Exercises
1. Show that there cannot be a dominant strategy in mixed strategies.

2. [Homework 1, 2007] The Federal Government is to decide whether to construct a road between the towns of Arlington and Belmont. The values of the road for Arlington and Belmont are v_A ≥ 0 and v_B ≥ 0, respectively. The cost of constructing the road is c > 0. The Federal Government wants to construct the road if and only if v_A + v_B ≥ c. The values v_A and v_B are known by the towns, but not by the government; c is known by everybody. To learn these values, the government asks each town to submit the value of the road for the town. Given the submitted valuations x_A and x_B, which need to be non-negative, the government constructs the road if and only if x_A + x_B ≥ c and taxes Arlington and Belmont t_A(x_A, x_B) and t_B(x_A, x_B), respectively, where

t_A(x_A, x_B) = c − x_B if x_A + x_B ≥ c and x_B < c, and t_A(x_A, x_B) = 0 otherwise;
t_B(x_A, x_B) = c − x_A if x_A + x_B ≥ c and x_A < c, and t_B(x_A, x_B) = 0 otherwise.

Find the dominant-strategy equilibrium; show that the strategies that you identify are indeed dominant.

3. [Homework 1, 2006] There are n players and an object. The game is as follows:

• First, for each player i, Nature chooses a number v_i from {0, 1, 2, ..., 99}, where each number is equally likely, and reveals v_i to player i and nobody else. (v_i is the value of the object for player i.)

• Then, each player i simultaneously bids a number b_i.

• The player who bids the highest number wins the object and pays p, where p is the highest number bid by a player other than the winner. (If two or more players bid the highest bid, the winner is determined by a coin toss among the highest bidders.) The payoff of player i is (v_i − p) if he is the winner and 0 otherwise.

(a) Write this game in normal form. That is, determine the set of strategies for
each player, and the payoff of each player for each strategy profile.

(b) Show that there is a dominant strategy equilibrium. State the equilibrium.

4. [Homework 1, 2010] Alice, Bob, and Caroline are moving into a 3-bedroom apart-
ment (with rooms, named 1, 2, and 3). In this problem we want to help them to
select their rooms. Each roommate has a strict preference over the rooms. The
roommates simultaneously submit their preferences in an envelope, and then the
rooms are allocated according to one of the following mechanisms. For each mech-
anism, check whether submitting the true preferences is a dominant strategy for
each roommate.

Mechanism 1 First, Alice gets her top ranked room. Then, Bob gets his top
ranked room among the remaining two rooms. Finally, Caroline gets the
remaining room.
Mechanism 2 Alice, Bob, and Caroline have priority scores 0.3, 0, and −0.3, respectively; the priority score of a roommate i is denoted by π_i. For each roommate i and room k, let rank_ik be 3 if i ranks k highest, 2 if i ranks k second highest, and 1 if i ranks k lowest. Write s_ik = π_i + rank_ik for the aggregate score. In the mechanism, Room 1 is given to the roommate i with the highest aggregate score s_i1. Then, among the remaining two, the one with the highest aggregate score s_j2 gets Room 2, and the other gets Room 3.
Chapter 5

Rationalizability

A player is said to be rational if he maximizes the expected value of his utility function, as


described in the game. The previous lecture explored the implications of rationality. This
was captured by dominance. In natural strategic environments, this often yields weak
predictions. Moreover, the games in which dominance alone leads to a sharp prediction
(e.g. the games with a dominant strategy equilibrium) are not interesting for game
theory because in such a game each player’s decision can be analyzed separately without
requiring a game theoretical analysis.

Nevertheless, in definition of a game, one assumes much more than rationality of the
players. One further assumes that it is common knowledge that the players are rational.
That is, everybody is rational; everybody knows that everybody is rational; everybody
knows that everybody knows that everybody is rational ... up to infinity. If some of
these assumptions fail, then one would need to consider a different game, the game
that reflects the failure of those assumptions. This lecture explores the implications
of the common knowledge of rationality. These implications are precisely captured by
a solution concept called rationalizability, which is equivalent to iterative elimination
of strictly dominated strategies. In this way, rationalizability precisely captures the
implications of the assumptions embedded in the definition of the game.


5.1 Definition and Illustration


It is useful to illustrate the solution concept on the leading example of the previous section: (4.1). We have seen there that strategy M is strictly dominated (by a mixture of T and B) and hence it cannot be a best response to any belief. Hence, rationality of Player 1 implies that Player 1 does not play M. No other strategy is strictly dominated. For example, for Player 2, both of her strategies can be a best reply. If she thinks that Player 1 is not likely to play M, then she must play R, and if she thinks that it is very likely that Player 1 will play M, then she must play L. Hence, rationality of Player 2 does not put any restriction on her behavior. But what if she thinks that it is very likely that Player 1 is rational (and that his payoffs are as in (4.1))? In that case, since a rational Player 1 does not play M, she must assign very small probability to Player 1 playing M. In fact, if she knows that Player 1 is rational, then she must be sure that he will not play M. In that case, being rational, she must play R. In summary, if Player 2 is rational and she knows that Player 1 is rational, then she must play R.
Notice that we first eliminated all of the strategies that are strictly dominated (namely M), and then, taking the resulting game, we eliminated again all of the strategies that are strictly dominated (namely L). This is called twice-iterated elimination of strictly dominated strategies. In general, if a player is rational and knows that the other players are also rational (and the payoffs are as given), then he must play a strategy that survives twice-iterated elimination of strictly dominated strategies.
Under further rationality assumptions, one can further iteratively eliminate strictly dominated strategies (if any remain). In example (4.1), recall that rationality of Player 1 requires him to play T or B, and knowledge of the fact that Player 2 is also rational does not put any restriction on his behavior–as rationality itself does not restrict Player 2's behavior. Now, assume that Player 1 also knows that Player 2 is rational and that Player 2 knows that Player 1 is rational (and that the game is as in (4.1)). Then, as the above analysis shows, Player 1 must know that Player 2 will play R. In that case, being rational, he must play B.
This analysis yields a mechanical procedure to analyze games, k-times iterated elimination of strictly dominated strategies: eliminate all the strictly dominated strategies, and iterate this k times.

General fact: If (1) every player is rational, (2) every player knows that every player is rational, (3) every player knows that every player knows that every player is rational, . . . and (k) every player knows that every player knows that . . . every player is rational, then every player must play a strategy that survives k-times iterated elimination of strictly dominated strategies.
Caution: Two points are crucial for the elimination procedure:

1. One must eliminate only the strictly dominated strategies. One cannot eliminate a strategy if it is weakly dominated but not strictly dominated. For example, in the game

        L       R
   T    1, 1    0, 0
   B    0, 0    0, 0

(T, L) is a dominant strategy equilibrium, but no strategy is eliminated because T does not strictly dominate B and L does not strictly dominate R.

2. One must eliminate the strategies that are strictly dominated by mixed strategies (but not necessarily by any pure strategy). For example, in the game in (4.1), there is a strategy that must be eliminated although no pure strategy dominates it.

When there are only finitely many strategies, this elimination process must stop after finitely many rounds. That is, at some round there will be no dominated strategy to eliminate. In that case, iterating the elimination further would not have any effect.

Definition 5.1 The elimination process that keeps iteratively eliminating all strictly dominated strategies until there is no strictly dominated strategy is called Iterated Elimination of Strictly Dominated Strategies; one eliminates indefinitely if the process does not stop. A strategy is said to be rationalizable if and only if it survives iterated elimination of strictly dominated strategies.

As depicted in Figure 5.1, the procedure is as follows. Eliminate all the strictly dominated strategies. In the resulting smaller game, some of the strategies may become strictly dominated. Check for those strategies. If there is one, apply the procedure one more time to the smaller game. This continues until there is no strictly dominated strategy; the elimination continues indefinitely if the process does not stop. The remaining strategies are called rationalizable.

[Figure 5.1: Algorithm for rationalizability — eliminate all strictly dominated strategies; if any strategy is dominated in the resulting game, repeat; otherwise the surviving strategies are the rationalizable ones.]

When the game is finite, the order of eliminations
does not matter for the resulting outcome. For example, even if one does not eliminate
a strictly dominated strategy at a given round, the eventual outcome is not affected by
such an omission. In that case, it is also okay to eliminate a strategy whenever it is
deemed to be strictly dominated.

Theorem 5.1 If it is common knowledge that every player is rational (and the game
is as described), then every player must play a rationalizable strategy. Moreover, any
rationalizable strategy is consistent with common knowledge of rationality.

A general problem with rationalizability is that there are usually too many rationalizable strategies; the elimination process usually stops too early. In that case, one cannot make much of a prediction based on such analysis. For example, in the Matching Pennies game

1\2        Head        Tail
Head      −1, 1       1, −1
Tail       1, −1     −1, 1

every strategy is rationalizable, and we cannot say what the players will do.
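The algorithm in Figure 5.1 is easy to sketch in code. The snippet below is an illustrative Python implementation, not part of the original text: the search for mixed dominators scans only two-strategy mixtures over a weight grid (a heuristic, not a full linear program), and the strategy labels are ours. It confirms that nothing is eliminated in Matching Pennies, and it solves the 3×3 game from the 2001 midterm exercise in Section 5.3.

```python
from itertools import combinations

def strictly_dominated(s, own, opp, u):
    """True if s is strictly dominated, with opponents restricted to `opp`.
    Mixed dominators are scanned over two-strategy mixtures on a weight
    grid -- a heuristic that suffices for these examples, not a full LP."""
    rest = [t for t in own if t != s]
    mixes = [(a, b, w / 100) for a, b in combinations(rest, 2)
             for w in range(101)]
    mixes += [(a, a, 1.0) for a in rest]  # pure-strategy dominators
    return any(all(w * u(a, t) + (1 - w) * u(b, t) > u(s, t) for t in opp)
               for a, b, w in mixes)

def rationalizable(rows, cols, u1, u2):
    """Iterated elimination of strictly dominated strategies."""
    while True:
        new_rows = [r for r in rows if not strictly_dominated(
            r, rows, cols, lambda s, t: u1[s][t])]
        new_cols = [c for c in cols if not strictly_dominated(
            c, cols, rows, lambda s, t: u2[t][s])]
        if (new_rows, new_cols) == (rows, cols):
            return rows, cols
        rows, cols = new_rows, new_cols

# Matching Pennies: nothing is eliminated, every strategy is rationalizable.
mp1 = {"Head": {"Head": -1, "Tail": 1}, "Tail": {"Head": 1, "Tail": -1}}
mp2 = {"Head": {"Head": 1, "Tail": -1}, "Tail": {"Head": -1, "Tail": 1}}
print(rationalizable(["Head", "Tail"], ["Head", "Tail"], mp1, mp2))

# The 3x3 midterm game from Section 5.3 (labels T/M/B and L/m/R):
u1 = {"T": {"L": 1, "m": 0, "R": 2},
      "M": {"L": 2, "m": 2, "R": 1},
      "B": {"L": 1, "m": 0, "R": 0}}
u2 = {"T": {"L": 1, "m": 4, "R": 2},
      "M": {"L": 4, "m": 1, "R": 2},
      "B": {"L": 0, "m": 1, "R": 2}}
print(rationalizable(["T", "M", "B"], ["L", "m", "R"], u1, u2))
```

Because the weight grid contains the 1/2–1/2 mixture of L and m, the column R falls in the second round, matching the hand computation in the midterm solution.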

5.2 Example: Beauty Contest


Consider an n-player game in which each player i has strategies x_i ∈ [0, 100] and payoff

$$u_i(x_1, \dots, x_n) = -\left(x_i - \frac{2}{3}\cdot\frac{x_1 + \dots + x_n}{n}\right)^2.$$

Notice that, in this game, each player tries to play a strategy that is equal to two thirds of the average strategy, which is also affected by his own strategy. Each player is therefore interested in guessing the other players' average strategy, which depends on the other players' estimates of the average strategy.
One iteratively eliminates strictly dominated strategies as follows. First, since each strategy must be less than or equal to 100, the average cannot exceed 100, and hence any strategy x_i > 200/3 is strictly dominated by 200/3. Indeed, any strategy x_i > x̄¹ is strictly dominated by x̄¹, where¹

$$\bar{x}^1 = \frac{2(n-1)}{3n-2}\,100.$$

To show that any x_i > x̄¹ is strictly dominated by x̄¹, we fix any (x_1, …, x_{i−1}, x_{i+1}, …, x_n) and show that

$$u_i(x_1, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_n) < u_i(x_1, \dots, x_{i-1}, \bar{x}^1, x_{i+1}, \dots, x_n). \tag{5.1}$$

By taking the derivative of u_i with respect to x_i, we obtain

$$\frac{\partial u_i}{\partial x_i} = -2\left(1 - \frac{2}{3n}\right)\left(x_i - \frac{2}{3}\cdot\frac{x_1 + \dots + x_n}{n}\right).$$

Clearly, ∂u_i/∂x_i > 0 if

$$x_i - \frac{2}{3}\cdot\frac{x_1 + \dots + x_n}{n} < 0,$$

which would be the case if

$$x_i < \frac{2}{3n-2}\sum_{j\ne i} x_j \equiv x_i^*. \tag{5.2}$$

Hence, u_i is strictly increasing when x_i < x_i^* and strictly decreasing when x_i > x_i^*. On the other hand, since each x_j ≤ 100, the sum Σ_{j≠i} x_j is less than or equal to (n−1)100. Hence, it suffices that

$$x_i > \frac{2}{3n-2}(n-1)\,100 = \bar{x}^1.$$

¹Here x̄¹ is just a real number, where the superscript 1 indicates that we are in Round 1.

Therefore, for any x_i > x̄¹, we have x_i^* ≤ x̄¹ < x_i. Since we have established that u_i is a strictly decreasing function of x_i in this region, this proves that (5.1) is satisfied. This shows that all the strategies x_i > x̄¹ are eliminated in the first round. On the other hand, each x_i ≤ x̄¹ is a best response to some (x_1, …, x_{i−1}, x_{i+1}, …, x_n) with

$$x_i = \frac{2}{3n-2}\sum_{j\ne i} x_j.$$

Therefore, at the end of the first round the set of surviving strategies is [0, x̄¹].
£ ¤
Now, suppose that at the end of round k, the set of surviving strategies is [0, x̄ᵏ] for some number x̄ᵏ. By repeating the same analysis above with x̄ᵏ instead of 100, we can conclude that at the end of round k+1, the set of surviving strategies is [0, x̄ᵏ⁺¹], where

$$\bar{x}^{k+1} = \frac{2(n-1)}{3n-2}\,\bar{x}^k.$$

The solution to this equation with x̄⁰ = 100 is

$$\bar{x}^k = \left[\frac{2(n-1)}{3n-2}\right]^k 100.$$

Therefore, for each k, at the end of round k, a strategy x_i survives if and only if

$$0 \le x_i \le \left[\frac{2(n-1)}{3n-2}\right]^k 100.$$

Since

$$\lim_{k\to\infty}\left[\frac{2(n-1)}{3n-2}\right]^k 100 = 0,$$

the only rationalizable strategy is x_i = 0.
Notice that the speed at which x̄ᵏ goes to zero determines how fast we eliminate the strategies. If the elimination is slow (e.g., when 2(n−1)/(3n−2) is close to 1), then many strategies are eliminated only at very high iterations. In that case, predictions based on rationalizability rely heavily on strong assumptions about rationality, i.e., everybody knows that everybody knows that ... everybody is rational. For example, if n is large or the ratio 2/3 is replaced by a number close to 1, the elimination is slow and the predictions of rationalizability are less reliable. On the other hand, if n is small or the ratio 2/3 is replaced by a small number, the elimination is fast and the predictions of rationalizability are more reliable. In particular, the predictions of rationalizability for this game are more robust in a small group than in a larger group.
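The speed of convergence is easy to tabulate. The sketch below iterates the recursion derived above (the function name and the generalization to an arbitrary ratio r are ours):

```python
def rounds_until(n, r=2/3, target=1.0):
    """Rounds of iterated elimination until the surviving interval
    [0, xbar_k] shrinks below `target`, where xbar_0 = 100 and
    xbar_{k+1} = (r * (n - 1) / (n - r)) * xbar_k.
    For r = 2/3 this factor equals 2(n-1)/(3n-2)."""
    factor = r * (n - 1) / (n - r)
    xbar, k = 100.0, 0
    while xbar >= target:
        xbar *= factor
        k += 1
    return k

# Elimination is fast in small groups and with small ratios, slow otherwise.
for n in (2, 10, 100):
    print(n, rounds_until(n))       # more students -> more rounds needed
print(rounds_until(10, r=0.9))      # a ratio near 1 -> much slower
```

With n = 2 the factor is 1/2, so only seven rounds are needed; with a large class or a ratio near 1, dozens of rounds (hence many layers of mutual knowledge of rationality) are required.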

It is important that one analyzes the game that describes the actual situation. For example, when the above game is played in a classroom, there are often some students who would rather move the mean in an unexpected direction and upset the other students than get the prize for being closest to two thirds of the average. Those students bid 100 instead. In such experiments, the resulting outcome is often different from the rationalizable solution of 0 for the above game, which does not take into account the existence of such students. In fact, some students bid 0 the first time they play the game and switch to relatively higher bids in the follow-up games. To analyze that situation, consider the following variation.
For example, in the beauty contest game suppose that there are m mischievous students with utility function

$$u_i(x_1, \dots, x_n) = \left(\frac{x_1 + \dots + x_n}{n} - x_i\right)^2.$$

The remaining n − m students are as before. The best response of a mischievous student is 0 if the expected value of Σ_{j≠i} x_j / (n−1) is greater than 50, and it is 100 otherwise. Hence, at the first round, all strategies other than 0 and 100 are eliminated for the mischievous students.
For each round k there are bounds such that x_i survives k rounds of iterated elimination for a regular student iff x̱ᵏ ≤ x_i ≤ x̄ᵏ. Note that for k = 0, x̱⁰ = 0 and x̄⁰ = 100. In the earlier rounds, both 0 and 100 are available for the mischievous students, and in that case the lower bound remains x̱ᵏ = 0 because 0 is a best response to 0 for regular students. To compute the upper bound, fix a regular student i. The expected value of Σ_{j≠i} x_j can take any value in [0, 100m + (n−m−1)x̄ᵏ⁻¹], where 100m + (n−m−1)x̄ᵏ⁻¹ is obtained by taking the highest possible bid for each remaining student: the m mischievous students playing 100 and the (n−m−1) regular students playing x̄ᵏ⁻¹. The best reply to this value gives us the upper bound:

$$\bar{x}^k = \frac{2}{3n-2}\left[100m + (n-m-1)\bar{x}^{k-1}\right], \tag{5.3}$$

which is obtained by substituting 100m + (n−m−1)x̄ᵏ⁻¹ for Σ_{j≠i} x_j in (5.2). As above, all x_i > x̄ᵏ are eliminated. Note that as k → ∞, x̄ᵏ converges to

$$\bar{x}^\infty = \frac{\frac{2}{3n-2}\cdot 100m}{1 - \frac{2(n-m-1)}{3n-2}} = \frac{200m}{n+2m}. \tag{5.4}$$

(One can obtain x̄^∞ by substituting x̄^∞ for both x̄ᵏ and x̄ᵏ⁻¹ in (5.3).)

The lower bound x̱ᵏ depends on whether 0 remains a best response for a mischievous student. This is the case when

$$\frac{\bar{x}^k (n-m) + 100(m-1)}{n-1} \ge 50.$$

If m is large enough (roughly m ≥ n/4), then x̄^∞ satisfies the above inequality. In that case, all x̄ᵏ satisfy the inequality, and neither 0 nor 100 is eliminated for the mischievous students. In that case, the rationalizable strategies are {0, 100} for the mischievous students and [0, 200m/(n+2m)] for the regular students. If m is smaller (roughly m < n/4), then x̄^∞ fails the above inequality. Then, there exists k* such that x̄ᵏ fails the inequality for every k ≥ k*, and x̄ᵏ satisfies the inequality for all k < k*. In that case, at round k* + 1, 0 is eliminated for the mischievous students. Consequently, at round k = k* + 2 and after, for any regular student i, the lowest possible value for Σ_{j≠i} x_j is 100m + (n−m−1)x̱ᵏ⁻¹. As in the above analysis, the best response to this yields the lower bound at k:

$$\underline{x}^k = \frac{2}{3n-2}\left[100m + (n-m-1)\underline{x}^{k-1}\right]. \tag{5.5}$$

Of course, as k → ∞, x̱ᵏ converges to

$$\underline{x}^\infty = \bar{x}^\infty = \frac{200m}{n+2m}.$$

In that case, the unique rationalizable strategy is 200m/(n+2m) for the regular students and 100 for the mischievous students. The rationalizable strategy is plotted in Figure 5.2. Note that the mischievous students have a large impact. For example, when 10% of the students are mischievous, the rationalizable strategy for regular students is 200/12 ≅ 16.67, and the average rationalizable bid is 25.

5.3 Exercises with Solution


1. [Homework 2, 2011] Compute the set of rationalizable strategies in the following game.

         a        b        c        d
   A    3, 1     1, 0     0, 2     1, 1
   B    1, 0     0, 10    1, 0     0, 10
   C    2, 1     1, 0     0, 0     0, 0
   D    0, 0     12, 0    3, 1     0, 0

[Figure 5.2: Rationalizable strategy as a function of the fraction of the mischievous students.]

Solution: For Player 1, strategy B is dominated by a mixed strategy that puts probability 1/2 on A and probability 1/2 on D, so B is eliminated in the first round. After the elimination of B, strategies b and d become dominated; both b and d are dominated by any strategy that puts positive probabilities on a and c and zero probability on b and d. Strategies b and d are eliminated in the second round. In the next round, C is eliminated because it becomes dominated by a mixed strategy on A and D (e.g., probability 3/4 on A and probability 1/4 on D). The eliminations so far leave the following strategies:

         a        c
   A    3, 1     0, 2
   D    0, 0     3, 1

One can easily see that strategy a and then strategy A are eliminated next, yielding (D, c) as the only rationalizable strategy profile. Games with a unique rationalizable strategy profile are called dominance-solvable. We got one of them here.

2. [Midterm 1, 2011] Compute the set of all rationalizable strategies in the following game.

         a        b        c        d
   A    0, 3     0, 1     3, 0     0, 1
   B    3, 0     0, 2     2, 4     1, 1
   C    2, 4     3, 2     1, 2     10, 1
   D    0, 5     5, 3     1, 2     0, 10

(a) Solution: Strategy b is strictly dominated by the mixed strategy σ₂ with σ₂(a) ∈ (1/3, 1/2) and σ₂(c) = 1 − σ₂(a). In the first round, b is therefore eliminated. (No other strategy is eliminated in that round.) In the second round, D is strictly dominated by B and eliminated. In the third round, d is strictly dominated by σ₂ above and eliminated. In the fourth round, C is strictly dominated by B and eliminated. There are no other eliminations, and the set of rationalizable strategies is {A, B} × {a, c}.

3. [Midterm 1, 2001] Find all the pure strategies that are consistent with the common
knowledge of rationality in the following game. (State the rationality/knowledge
assumptions corresponding to each operation.)

1\2      L        m        R
 T      1, 1     0, 4     2, 2
 M      2, 4     2, 1     1, 2
 B      1, 0     0, 1     0, 2

Solution: Clearly, one needs to compute the rationalizable strategies and state the underlying rationality assumptions along the way.

Round 1 For Player 1, M strictly dominates B. Since Player 1 is rational, he will not play B, and we eliminate this strategy:

1\2      L        m        R
 T      1, 1     0, 4     2, 2
 M      2, 4     2, 1     1, 2

Round 2 Since Player 2 knows that Player 1 is rational, he knows that Player 1 will not play B. Given this, the mixed strategy that assigns probability 1/2 to each of the strategies L and m strictly dominates R. Since Player 2 is rational, in that case, he will not play R. We eliminate this strategy:

1\2      L        m
 T      1, 1     0, 4
 M      2, 4     2, 1
Round 3 Since Player 1 knows that Player 2 is rational and that Player 2 knows that Player 1 is rational, he knows that Player 2 will not play R. Given this, M strictly dominates T. Since Player 1 is rational, he will not play T, either. We are left with

1\2      L        m
 M      2, 4     2, 1

Round 4 Since Player 2 knows that Player 1 is rational, and that Player 1 knows that Player 2 is rational, and that Player 1 knows that Player 2 knows that Player 1 is rational, he knows that Player 1 will not play T or B. Given this, L strictly dominates m. Since Player 2 is rational, he will not play m, either. He will play L.

1\2      L
 M      2, 4

Thus, the only strategies that are consistent with the common knowledge of rationality are M for Player 1 and L for Player 2.

4. [Midterm 1, 2011] Compute the set of all rationalizable strategies in the following game. Simultaneously, Alice and Bob select arrival times a and b, respectively, for their meeting, where a, b ∈ {0, 1, 2, …, 100}. The payoffs of Alice and Bob are

$$u_A(a, b) = \begin{cases} 2 - (a-b)^2 & \text{if } a < b \\ -(a-b)^2 & \text{otherwise,} \end{cases}$$

$$u_B(a, b) = \begin{cases} 2 - (a-b)^2 & \text{if } b < a \\ -(a-b)^2 & \text{otherwise,} \end{cases}$$

respectively. [Note that a and b are integers between 0 and 100.]



Solution: If the set of strategies remaining from the earlier rounds is {0, …, t_max} for some t_max > 0, then t_max is strictly dominated by t_max − 1 and is eliminated. (Proof: For b = t_max,

$$u_A(t_{\max} - 1, t_{\max}) = 2 - 1 = 1 > 0 = u_A(t_{\max}, t_{\max}),$$

and for any b < t_max,

$$u_A(t_{\max} - 1, b) = -(t_{\max} - 1 - b)^2 > -(t_{\max} - b)^2 = u_A(t_{\max}, b),$$

showing that t_max − 1 strictly dominates t_max for Alice. The same argument applies for Bob.)

Therefore, we eliminate 100 in round 1, 99 in round 2, …, and 1 in round 100. The set of rationalizable strategies is {0} for both players.
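The elimination can be verified by brute force. The sketch below uses the reconstructed payoffs above and truncates the strategy set to {0, …, 30} just to keep the search fast; the argument is unchanged for {0, …, 100}:

```python
def uA(a, b):
    """Alice's payoff: a bonus of 2 for arriving strictly earlier,
    minus the squared gap between the arrival times."""
    return (2 if a < b else 0) - (a - b) ** 2

# By symmetry uB(a, b) = uA(b, a), so one elimination run covers both players.
S = list(range(31))
while True:
    surviving = [a for a in S
                 if not any(all(uA(a2, b) > uA(a, b) for b in S)
                            for a2 in S if a2 != a)]
    if surviving == S:
        break
    S = surviving
print(S)  # only the arrival time 0 survives
```

Exactly one strategy (the current maximum) drops out in each round, mirroring the hand proof.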

5. [Midterm 1 make up, 2007] Consider the following game:

1\2      L           R
 T      1, 1        1, 0
 B      0, 1        0, 10000

(a) Compute the rationalizable strategies.

Solution: First B and then R are eliminated. The rationalizable strategies are T for Player 1 and L for Player 2.
(b) Now assume that players can tremble: when a player intends to play a strategy s, with probability ε = 0.001, Nature switches it to the other strategy s′. For instance, if Player 2 plays L (or intends to play L), then with probability ε, R is played, and with probability 1 − ε, L is played. Assume that the trembling probabilities are independent. Compute the rationalizable strategies for this new game.
Solution: Taking Nature's moves into account, the new game is as follows in normal form:

1\2        L                                    R
 T      1−ε,  1−ε+10000ε²                1−ε,  ε+10000ε(1−ε)
 B      ε,  1−ε+10000ε(1−ε)              ε,  ε+10000(1−ε)²

To see how the payoffs are computed, consider (T, R). If this strategy profile is intended, the outcome is (T, R) with probability (1−ε)² [nobody trembles], (T, L) with probability (1−ε)ε [only Player 2 trembles], (B, R) with probability ε(1−ε) [only Player 1 trembles], and (B, L) with probability ε² [everybody trembles]. We mix the payoff vectors with the above probabilities to obtain the table. One can use the structure of the payoffs to shorten the calculations. For example, Player 1 gets 1 if he does not tremble and 0 otherwise, yielding 1 − ε.
To compute the rationalizable strategies, note that B is still dominated by T and is eliminated in the first round. In the second round, however, we cannot eliminate R. Indeed, Player 2's payoffs from L and R are now approximately 1 and 10, respectively. Hence, L is eliminated in the second round, yielding (T, R) as the only rationalizable strategy profile. This example shows that rationalizability may be sensitive to the possibility of trembling, depending on the relative magnitudes of the trembling probabilities and the payoff differences.
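The trembling payoffs in the table can be checked mechanically (strategy labels T/B and L/R as above; the helper function is ours):

```python
eps = 0.001
# Original payoffs of the reconstructed 2x2 game.
u1 = {("T", "L"): 1, ("T", "R"): 1, ("B", "L"): 0, ("B", "R"): 0}
u2 = {("T", "L"): 1, ("T", "R"): 0, ("B", "L"): 1, ("B", "R"): 10000}

def trembled(u, s1, s2):
    """Expected payoff when each intended strategy is independently
    switched to the other one with probability eps."""
    flip1 = {"T": "B", "B": "T"}
    flip2 = {"L": "R", "R": "L"}
    total = 0.0
    for a1, p1 in ((s1, 1 - eps), (flip1[s1], eps)):
        for a2, p2 in ((s2, 1 - eps), (flip2[s2], eps)):
            total += p1 * p2 * u[(a1, a2)]
    return total

# Against T, Player 2's payoffs from L and R are about 1 and 10:
# the rare tremble to B makes the payoff of 10000 worth chasing with R.
print(trembled(u2, "T", "L"), trembled(u2, "T", "R"))
```

The numbers come out to 1.009 and 9.991, confirming that L, not R, is eliminated once trembles are taken into account.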

5.4 Exercises

1. [Homework 1, 2004] Consider the following game in normal form.

         a        b        c        d
   A    0, −1    4, 4     0, 0     2, 0
   B    0, 3     0, 0     4, 4     1, 0
   C    5, 2     2, 0     1, 3     1, 3
   D    4, 4     1, 0     0, 1     0, 5

(a) Iteratively eliminate all strictly dominated strategies; state the assumptions
necessary for each elimination.

(b) What are the rationalizable strategies?



2. Compute the set of rationalizable strategies in the following game:

         a        b        c        d
   A    2, 0     2, 4     0, 0     0, −1
   B    1, −2   −2, −2    4, 2     0, 1
   C    1, 3     0, 0     1, 3     5, 2
   D    0, 5    −1, 0     0, 1     4, 4

3. [Midterm 1, 2000] Consider the following game.

1\2      L        m        R
 T      3, 2     4, 0     1, 1
 M      2, 0     3, 3     0, 0
 B      1, 1     0, 2     2, 3

(a) Iteratively eliminate all the strictly dominated strategies.


(b) State the rationality/knowledge assumptions corresponding to each elimina-
tion.
(c) What are the rationalizable strategies?

4. [Homework 1, 2004] Consider the game depicted in Figure 5.3 in extensive form
(where the payoff of player 1 is written on top, and the payoff of 2 is on the
bottom).

(a) Write this game in strategic form.


(b) What are the strategies that survive the iterative elimination of weakly-
dominated strategies in the following order: first eliminate all weakly-dominated
strategies of player 1; then, eliminate all the strategies of player 2 that are
weakly dominated in the remaining game; then, eliminate all the strategies
of player 1 that are weakly dominated in the remaining game, and so on?

5. [Homework 1, 2001] Compute the set of rationalizable strategies in the following game that is played in a class of n students, where n ≥ 2: Without discussing it with anyone, each student i is to write down a real number x_i ∈ [0, 100] on a paper and submit it to the TA. The TA will then compute the average

$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n}$$

[Figure 5.3: extensive-form game for Exercise 4.]

of these numbers. The students who submit the number that is closest to x̄/3 will share the total payoff of 100, while the other students get 0. Everything described above is common knowledge. (Bonus: would the answer change if the students did not know n, but it were common knowledge that n ≥ 2?)

6. [Homework 2, 2011] There are n students. Simultaneously, each student i submits a real number x_i ∈ [0, 100], and each student receives the payoff of

$$u_i(x_1, \dots, x_n) = 100 - \left(x_i - \frac{2}{3}\,m(x_1, \dots, x_n)\right)^2,$$

where m finds the median.

(a) Write this game formally in normal form.

(b) Compute the sets of rationalizable strategies and Nash equilibria.

(c) Answer part (b) assuming that there are k ∈ (0, (n−1)/2) mischievous students with payoff (x_i − m(x_1, …, x_n))².

(d) Bonus: Answer part (c) for k ∈ (n/2, n).



7. [Midterm 1, 2007] Compute the set of all rationalizable strategies in Exercise 4 in Section 3.5.

8. [Midterm 1, 2005] Compute the set of all rationalizable strategies in the game in
Figure 3.14. (See Exercise 2 in Section 3.5.)

9. [Homework 1, 2001] Consider the game in Figure 5.4.


[Figure 5.4: extensive-form game for Exercise 9; terminal payoffs include (2,2), (0,0), (1,3), (3,1), (3,3), and (1,1).]

(a) Write this game in the strategic form.


(b) What are the strategies that survive the iterative elimination of weakly-
dominated strategies in the following order: first eliminate all weakly-dominated
strategies of player 1; then, eliminate all the strategies of player 2 that are
weakly dominated in the remaining game; then, eliminate all the strategies
of player 1 that are weakly dominated in the remaining game, and so on?

10. [Homework 1, 2002] Consider the game depicted in Figure 5.5 in extensive form.

(a) Write this game in strategic form.

(b) What are the strategies that survive the iterative elimination of weakly-dominated strategies in the following order: first eliminate all weakly-dominated strategies of player 1; then, eliminate all the strategies of player 2 that are weakly dominated in the remaining game; then, eliminate all the strategies of player 1 that are weakly dominated in the remaining game, and so on?

[Figure 5.5: extensive-form game for Exercise 10.]

[Figure 5.6: extensive-form game for Exercise 11.]

11. [Homework 1, 2006] Consider the game depicted in Figure 5.6 in extensive form.

(a) Write this game in strategic form.


(b) Iteratively eliminate all weakly dominated strategies.
(c) What are the rationalizable strategies?

12. Consider any collection of sets B₁ ⊆ S₁, …, Bₙ ⊆ Sₙ such that there exists no s_i ∈ B_i that is strictly dominated when the others' strategies are restricted to be in B_{−i}. That is, for every s_i ∈ B_i and every mixed strategy σ_i of player i, there exists a strategy profile s_{−i} of the other players such that s_j ∈ B_j for every j ≠ i and

$$u_i(s_i, s_{-i}) \ge \sum_{s_i' \in S_i} \sigma_i(s_i')\, u_i(s_i', s_{-i}).$$

Show that each s_i ∈ B_i is rationalizable.

13. Show that the set of rationalizable strategies satisfies the above property: no rationalizable strategy is dominated when the others' strategies are restricted to be rationalizable.
Chapter 6

Nash Equilibrium

6.1 Introduction and Definition

Both dominant-strategy equilibrium and rationalizability are well-founded solution concepts. If players are rational and cautious in the sense that they assign positive probability to each of the other players' strategies, then we would expect the players to play according to the dominant-strategy equilibrium whenever such an equilibrium exists. On the other hand, rationalizability describes exactly what is implied by the definition of the game (i.e., common knowledge of rationality). If it is common knowledge that the players are rational (i.e., they maximize the expected values of their utility functions), then each player must be playing a rationalizable strategy. Moreover, every rationalizable strategy can be rationalized, in the sense that a player can play that strategy and still believe that it is common knowledge that players are rational.
Unfortunately, these solution concepts are not useful in most situations in economics.
Except for the games that are specifically designed, as in the second-price auction, there
is often no dominant-strategy equilibrium. The set of rationalizable strategies tends to
be large in games analyzed in economics (and in this course). In that case, one can make
only weak predictions about the outcome using rationalizability.
This lecture introduces a new solution concept: Nash equilibrium. It assumes that the players correctly guess the other players' strategies. This assumption may be reasonable when there is a long prior interaction that leads players to form opinions about how the other players play. It may also be reasonable when there is a social convention adhered to by the other players.


Towards defining Nash equilibrium, consider the Battle of the Sexes game

Alice\Bob     opera       football
opera         4, 1        0, 0          (6.1)
football      0, 0        1, 4

In this game, there is no dominant strategy, and everything is rationalizable. Suppose


Alice plays opera. Then, the best thing Bob can do is to play opera, too. Thus opera is
a best response for Bob against Alice playing opera. Similarly, opera is a best response
for Alice against opera. Thus, at (opera, opera), neither party wants to take a different
action. This is a Nash Equilibrium.
Towards formalizing this idea for general games, recall that, for any player i, a strategy s_i* is a best response to s_{−i} if and only if

$$u_i(s_i^*, s_{-i}) \ge u_i(s_i, s_{-i}) \quad \forall s_i \in S_i.$$

Recall also that the definition of a best response differs from that of a dominant strategy by requiring the above inequality only for a specific strategy s_{−i}, instead of requiring it for all s_{−i} ∈ S_{−i}. If the inequality were true for all s_{−i}, then s_i* would also be a dominant strategy, which is a stronger requirement than being a best response against some strategy s_{−i}.

Definition 6.1 A strategy profile s* = (s₁*, …, sₙ*) is a Nash equilibrium if and only if s_i* is a best response to s_{−i}* = (s₁*, …, s_{i−1}*, s_{i+1}*, …, sₙ*) for each i. That is, for all i,

$$u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*) \quad \forall s_i \in S_i.$$

In other words, no player would have an incentive to deviate, if he correctly guesses


the other players’ strategies. If one views a strategy profile as a social convention, then
being a Nash equilibrium is tied to being self-enforcing, that is, nobody wants to deviate
when they think that the others will follow the convention.
For example, in the Battle of the Sexes game (6.1), (opera, opera) is a Nash equilibrium because

$$u_{Alice}(\text{opera}, \text{opera}) = 4 > 0 = u_{Alice}(\text{football}, \text{opera})$$

and

$$u_{Bob}(\text{opera}, \text{opera}) = 1 > 0 = u_{Bob}(\text{opera}, \text{football}).$$

Likewise, (football, football) is also a Nash equilibrium. On the other hand, (opera, football) is not a Nash equilibrium because Bob would like to go to the opera instead:

$$u_{Bob}(\text{opera}, \text{opera}) = 1 > 0 = u_{Bob}(\text{opera}, \text{football}).$$

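The inequalities of Definition 6.1 can be checked mechanically for a 2×2 game; a small sketch (indexing 0 = opera, 1 = football is our convention):

```python
# Battle of the Sexes payoffs from (6.1): uA[r][c], uB[r][c].
uA = [[4, 0], [0, 1]]
uB = [[1, 0], [0, 4]]

def is_nash(r, c):
    """True if neither player has a profitable unilateral pure-strategy
    deviation from the profile (r, c)."""
    return (all(uA[r][c] >= uA[r2][c] for r2 in (0, 1)) and
            all(uB[r][c] >= uB[r][c2] for c2 in (0, 1)))

equilibria = [(r, c) for r in (0, 1) for c in (0, 1) if is_nash(r, c)]
print(equilibria)  # [(0, 0), (1, 1)]: (opera, opera) and (football, football)
```

The two coordination profiles pass the test; the mismatched profiles fail it, exactly as computed above.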
6.2 Relation to Earlier Solution Concepts


Nash Equilibrium v. Dominant-strategy Equilibrium Every dominant strategy
equilibrium is also a Nash equilibrium, but the reverse is not true.

Theorem 6.1 If s* is a dominant strategy equilibrium, then s* is a Nash equilibrium.

Proof. Let s* be a dominant strategy equilibrium. Take any player i. Since s_i* is a dominant strategy for i, for any given s_i,

$$u_i(s_i^*, s_{-i}) \ge u_i(s_i, s_{-i}) \quad \forall s_{-i} \in S_{-i}.$$

In particular,

$$u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*).$$

Since i and s_i are arbitrary, this shows that s* is a Nash equilibrium.


To see that the converse is not true, consider the Battle of the Sexes. In this game, both (opera, opera) and (football, football) are Nash equilibria, but neither is a dominant strategy equilibrium. Furthermore, there can be at most one dominant strategy equilibrium, but, as the Battle of the Sexes shows, a Nash equilibrium is not unique in general.
There can also be other Nash equilibria when there is a dominant strategy equilibrium. For an example, consider the game

         L        R
   T    1, 1     0, 0
   B    0, 0     0, 0

In this game, (T, L) is a dominant strategy equilibrium, but (B, R) is also a Nash equilibrium.

This example also illustrates that a Nash equilibrium can be in weakly dominated strategies. In that case, one can rule out some Nash equilibria by eliminating weakly dominated strategies. While one may find such equilibria unreasonable and be willing to rule them out, the next example shows that in some games all Nash equilibria may need to be in weakly dominated strategies. (One then ends up ruling out all Nash equilibria.)

Example 6.1 Consider a two-player game in which each player i selects a natural number x_i ∈ N = {0, 1, 2, …}, and the payoff of each player is x₁x₂. It is easy to check that (0, 0) is a Nash equilibrium and that there is no other Nash equilibrium. Nevertheless, all strategies, including 0, are weakly dominated.

Nash Equilibrium v. Rationalizability If a strategy is played in a Nash equilibrium, then it is rationalizable, but there may be rationalizable strategies that are not played in any Nash equilibrium.

Theorem 6.2 If s* is a Nash equilibrium, then s_i* is rationalizable for every player i.

Proof. It suffices to show that none of the strategies s₁*, s₂*, …, sₙ* is eliminated at any round of the iterated elimination of strictly dominated strategies. Since these strategies are all available at the beginning of the procedure, it suffices to show that if the strategies s₁*, s₂*, …, sₙ* are all available at round k, then they remain available at round k+1. Indeed, since s* is a Nash equilibrium, for each i, s_i* is a best response to s_{−i}*, which is available at round k. Hence, s_i* is not strictly dominated at round k, and it remains available at round k+1.
The converse is not true. That is, there can be a rationalizable strategy that is not
played in any Nash equilibrium, as the next example illustrates.

Example 6.2 Consider the following game:

          Head        Tail        Out
Head     1, −2       −2, 1       0, 0
Tail    −1, 2        1, −2       0, 0
Out      0, 0        0, 0        0, 0

(This game can be thought of as a matching-pennies game with an outside option, which is represented by the strategy Out.) Note that (Out, Out) is the only Nash equilibrium. In contrast, no strategy is strictly dominated (check that each strategy is a best response to some strategy of the other player), and hence all strategies are rationalizable.

6.3 Mixed-strategy Nash equilibrium


The definition above covers only pure strategies. We can define Nash equilibrium for mixed strategies by replacing the pure strategies with mixed strategies. Again, given the mixed strategies of the others, each agent maximizes his expected payoff over his own (mixed) strategies.

Definition 6.2 A mixed-strategy profile σ* = (σ₁*, …, σₙ*) is a Nash equilibrium if and only if, for every player i, σ_i* is a best response to σ_{−i}*.

The condition for checking whether σ* is a Nash equilibrium is a mouthful.¹ Fortunately, there is a simpler condition to check: for every i, if σ_i*(s_i) > 0, then s_i is a best response to σ_{−i}*. That is,

$$\sum_{s_{-i}} u_i(s_i, s_{-i})\,\sigma_{-i}^*(s_{-i}) \ \ge\ \sum_{s_{-i}} u_i(s_i', s_{-i})\,\sigma_{-i}^*(s_{-i}) \qquad \forall s_i \text{ with } \sigma_i^*(s_i) > 0,\ \forall s_i',$$

where σ_{−i}*(s_{−i}) = σ₁*(s₁) · ⋯ · σ_{i−1}*(s_{i−1}) · σ_{i+1}*(s_{i+1}) · ⋯ · σₙ*(sₙ).

Example — Battle of the Sexes Consider the Battle of the Sexes again.

Alice\Bob     opera       football
opera         4, 1        0, 0
football      0, 0        1, 4
¹The condition is

$$\sum_{(s_1,\dots,s_n)} u_i(s_1,\dots,s_n)\,\sigma_i^*(s_i)\prod_{j\ne i}\sigma_j^*(s_j) \ \ge\ \sum_{(s_1,\dots,s_n)} u_i(s_1,\dots,s_n)\,\sigma_i(s_i)\prod_{j\ne i}\sigma_j^*(s_j)$$

for every mixed strategy σ_i. It can be simplified because one does not need to check it for all mixed strategies σ_i; it suffices to check against the pure-strategy deviations. That is, σ* is a Nash equilibrium if and only if

$$\sum_{(s_1,\dots,s_n)} u_i(s_1,\dots,s_n)\,\sigma_i^*(s_i)\prod_{j\ne i}\sigma_j^*(s_j) \ \ge\ \sum_{s_{-i}} u_i(s_i', s_{-i})\prod_{j\ne i}\sigma_j^*(s_j)$$

for every pure strategy s_i'.



We have identified two pure strategy equilibria, already. In addition, there is a mixed
strategy equilibrium. To compute the equilibrium, write  for the probability that Alice
goes to opera; with probability 1 −  she goes to football game. Write also  for the
probability that Bob goes to opera. For Alice, the expected payoff from opera is

 (opera,) =  (opera,opera) + (1 − )  (opera,football) = 4

and the expected payoff from football is

 (football,) =  (football,opera) + (1 − )  (football,football) = 1 − 

Her expected payoff from the mixed strategy is

 (; ) =  (opera,) + (1 − )  (football,)


= [4] + (1 − ) [1 − ] 

The payoff function  (; ) is strictly increasing with  when  (opera,)   (football ).
This is the case when 4  1 −  or equivalently when   15. In that case, the unique
best response for Alice is  = 1, and she goes to opera for sure. Likewise, when   15,
 (opera,)   (football ), and her expected payoff  (; ) is strictly decreasing
with . In that case, Alice’s best response is  = 0, i.e., going to football game for sure.
Finally, when  = 15, her expected payoff  (; ) does not depend on , and any
 ∈ [0 1] is a best response. In other words, Alice would choose opera if her expected
utility from opera is higher, football if her expected utility from football is higher, and
can choose either opera or football or any randomization between them if she is indif-
ferent between the two.
Similarly, one can compute that  = 1 is best response if   45;  = 0 is best
response if   45; and any  can be best response if  = 45.
The best responses are plotted in Figure 6.1. The Nash equilibria are where these
best responses intersect. There is one at (0 0), when they both go to football, one at
(1 1), when they both go to opera, and there is one at (45 15), when Alice goes to
opera with probability 45, and Bob goes to opera with probability 15.

Figure 6.1: The best responses in the Battle of the Sexes; they cross at $q = 1/5$ and $p = 4/5$.

Remark 6.1 The above example illustrates a way to compute the mixed-strategy equilibrium (for 2×2 games). Choose the mixed strategy of Player 1 so as to make Player 2 indifferent between her strategies, and choose the mixed strategy of Player 2 so as to make Player 1 indifferent. This is a valid technique for computing a mixed-strategy equilibrium, provided it is known which strategies are played with positive probability in equilibrium. (Note that a player must be indifferent between two strategies if he plays both of them with positive probability.)
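The indifference method of Remark 6.1 can be written down once for all 2×2 games. The sketch below is a minimal illustration (it assumes a fully mixed equilibrium exists, so the denominators are nonzero) and is not part of the original notes:

```python
def mixed_2x2(u1, u2):
    """Fully mixed equilibrium of a 2x2 game by the indifference method:
    p = prob. of row 0, chosen to make Player 2 indifferent;
    q = prob. of column 0, chosen to make Player 1 indifferent."""
    # Player 1 indifferent: q*u1[0][0] + (1-q)*u1[0][1] = q*u1[1][0] + (1-q)*u1[1][1].
    q = (u1[1][1] - u1[0][1]) / (u1[0][0] - u1[0][1] - u1[1][0] + u1[1][1])
    # Player 2 indifferent: p*u2[0][0] + (1-p)*u2[1][0] = p*u2[0][1] + (1-p)*u2[1][1].
    p = (u2[1][1] - u2[1][0]) / (u2[0][0] - u2[1][0] - u2[0][1] + u2[1][1])
    return p, q

# Battle of the Sexes, rows and columns ordered (opera, football):
u1 = [[4, 0], [0, 1]]  # Alice
u2 = [[1, 0], [0, 4]]  # Bob
print(mixed_2x2(u1, u2))  # (0.8, 0.2), i.e. (4/5, 1/5) as in the text
```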

Exercise 6.1 Show that if $\sigma^*$ is a mixed-strategy Nash equilibrium and $\sigma^*_i(s_i) > 0$, then $s_i$ is rationalizable.

One can use the above fact in searching for a mixed strategy Nash equilibrium.
One can compute the rationalizable strategies first and search for a mixed strategy
equilibrium within the set of rationalizable strategies, which may be smaller than the
original set of strategies.
Games with a unique rationalizable strategy profile are called dominance solvable.

Exercise 6.2 Show that in a dominance-solvable game, the unique rationalizable strat-
egy is the only Nash equilibrium.

6.4 Evolution of Hawks and Doves


Consider the game

            Hawk                                Dove
Hawk    $\frac{V-C}{2}, \frac{V-C}{2}$      $V,\ 0$
Dove    $0,\ V$                              $\frac{V}{2}, \frac{V}{2}$

(played by the genes), where $V$ is the value of the contested resource and $C$ is the cost of a fight. Assume that $C > V$, so that the payoffs are negative when two
hawks meet. One can easily check that there are two Nash equilibria in pure strategies: (Hawk, Dove) and (Dove, Hawk). There is also a mixed-strategy equilibrium where both strategies are played with positive probability. Let $p$ be the probability of Player 2 playing Hawk, and $1-p$ the probability that he plays Dove. Since Player 1 plays both strategies with positive probability, he must be indifferent between them:

$$\frac{V-C}{2}\cdot p + V\cdot(1-p) = \frac{V}{2}\cdot(1-p),$$

where the left-hand side is the expected payoff from Hawk and the right-hand side is the expected payoff from Dove. The solution to this equation is

$$p = V/C.$$

Similarly, in order for Player 2 to play both Hawk and Dove with positive probabilities (namely $V/C$ and $1 - V/C$, respectively), it must be that Player 1 plays Hawk with probability $V/C$. Therefore, in the mixed-strategy Nash equilibrium, each player plays Hawk with probability $V/C$ and Dove with probability $1 - V/C$.
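The indifference condition can be checked mechanically; the following sketch (the values $V = 2$, $C = 6$ are illustrative assumptions, not from the text) solves it with exact rational arithmetic:

```python
from fractions import Fraction

def hawk_dove_mix(V, C):
    """Equilibrium Hawk probability from the indifference condition
    p*(V-C)/2 + (1-p)*V = (1-p)*V/2, which rearranges to p = V/C."""
    return Fraction(V, C)

V, C = 2, 6
p = hawk_dove_mix(V, C)
hawk = p * Fraction(V - C, 2) + (1 - p) * V   # expected payoff of Hawk
dove = (1 - p) * Fraction(V, 2)               # expected payoff of Dove
print(p, hawk == dove)  # 1/3 True
```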
Now imagine an island where hawks and doves live together. Let there be $H_0$ hawks and $D_0$ doves at the beginning, where both $H_0$ and $D_0$ are very large. Suppose that each season the birds are randomly matched, and the number of offspring of a bird is given by the payoff matrix above. That is, if a dove is matched to a dove as its neighbor, then it will have $V/2$ offspring, and in the next generation we will have $1 + V/2$ doves in its family. If a dove is matched with a hawk, then it will have zero offspring and its family will have only one member, itself, in the next season. If two hawks are matched, then each will have $(V-C)/2$ offspring, which is negative, reflecting that the number of hawks from such matches will decrease in the next season. Finally, if a hawk meets a dove, it will have $V$ offspring, and there will be $1 + V$ hawks in its family in the

next season. We want to know the ratio of hawks and doves in this island millions of
seasons later.
Let $H_t$ and $D_t$ be the number of hawks and doves, respectively, at season $t$. Define

$$\rho_t = \frac{H_t}{H_t + D_t} \quad\text{and}\quad \delta_t = \frac{D_t}{H_t + D_t}$$

as the ratios of hawks and doves at $t$. In accordance with the strong law of large numbers, assume that the number of hawks that are matched to hawks is $\rho_t H_t$, and the number of hawks that are matched to doves is $\delta_t H_t$.² Each hawk in the first group multiplies to $1 + (V-C)/2$, and each hawk in the second group multiplies to $1 + V$. The number of hawks in the next season will then be

$$H_{t+1} = \left(1 + \frac{V-C}{2}\right)\rho_t H_t + (1 + V)\,\delta_t H_t = \left(1 + \frac{V-C}{2}\rho_t + V\delta_t\right) H_t. \tag{6.2}$$

The number of doves matched to hawks is $\rho_t D_t$, and the number of doves matched to doves is $\delta_t D_t$. Each dove in the first and the second group multiplies to $1$ and $1 + V/2$, respectively. Hence, the number of doves in the next season will be

$$D_{t+1} = (1 + 0)\,\rho_t D_t + (1 + V/2)\,\delta_t D_t = (1 + V\delta_t/2)\,D_t. \tag{6.3}$$

It is easy to find the steady states of the ratio $\rho_t$ (and $\delta_t$), defined by

$$\rho_{t+1} = \rho_t \quad\text{and}\quad \delta_{t+1} = \delta_t.$$

From (6.2) and (6.3) it is clear that

$$\rho = 0 \quad\text{and}\quad \delta = 1$$

is a stationary state, which can be reached if we start with all doves. In that case, by (6.2), it will continue as "doves only." Similarly, another steady state is

$$\rho = 1 \quad\text{and}\quad \delta = 0,$$

which can be reached if we start with all hawks. Since we have started with both hawks and doves, both $\rho_t$ and $\rho_{t+1}$ are positive. Hence, we can compute the remaining steady states from

$$\frac{\rho_t}{\delta_t} = \frac{\rho_{t+1}}{\delta_{t+1}} = \frac{H_{t+1}}{D_{t+1}} = \frac{1 + \frac{V-C}{2}\rho_t + V\delta_t}{1 + \frac{V}{2}\delta_t}\cdot\frac{H_t}{D_t},$$

²The probabilities of being matched to a hawk and to a dove are $\rho_t$ and $\delta_t$, respectively, and there are $H_t$ hawks.

where the last equality is due to (6.2) and (6.3). The equality holds if and only if

$$\frac{V-C}{2}\rho_t + V\delta_t = \frac{V}{2}\delta_t,$$

or equivalently

$$\rho_t = V/C.$$
This is the only steady state reached from a distribution with both hawks and doves. Notice that it is the mixed-strategy Nash equilibrium of the underlying game. This is a general fact: if a population dynamic is as described in this section, then the steady states reachable from a completely mixed distribution are symmetric Nash equilibria.

We will now see that when we start with both hawks and doves present, we necessarily approach the last steady state, which is the mixed-strategy Nash equilibrium. Now $\rho_{t+1} < \rho_t$ whenever

$$\frac{\rho_{t+1}}{\delta_{t+1}} < \frac{\rho_t}{\delta_t},$$

which holds whenever

$$\frac{1 + \frac{V-C}{2}\rho_t + V\delta_t}{1 + \frac{V}{2}\delta_t} < 1,$$

as one can see from (6.2) and (6.3). The latter inequality is equivalent to

$$\rho_t > V/C.$$

That is, if $\rho_t$ exceeds the equilibrium value, then it decreases towards the equilibrium value. Similarly, if $\rho_t < V/C$, then $\rho_{t+1} > \rho_t$, and $\rho_t$ will increase towards the equilibrium.

6.5 Exercises with Solutions


1. [Homework 2, 2011] Compute the set of Nash equilibria in Exercise 1 of Section
5.3.
Solution: Since Nash equilibrium strategies put positive probability only on rationalizable strategies, it suffices to consider the rationalizable set. But there is only one rationalizable strategy profile; that profile is therefore the only Nash equilibrium.

2. [Midterm 1, 2011] Compute the set of Nash equilibria in Exercise 2 of Section 5.3.
Solution: Recall that the set of Nash equilibria is invariant to the elimination
of non-rationalizable strategies. Hence, it suffices to compute the Nash equilibria

in the reduced game. Recall also from Section 5.3 that, after the elimination of non-rationalizable strategies, the game reduces to

        L        R
a     0, 3*    3*, 0
b     3*, 0    2, 4*

Here, the best responses (to the pure strategies) are indicated with asterisks. Since the best responses do not intersect, there is no Nash equilibrium in pure strategies. There is a unique mixed-strategy Nash equilibrium $\sigma^*$. In order for Player 1 to play a mixed strategy, he must be indifferent between a and b against $\sigma^*_2$:

$$3\sigma^*_2(R) = 2 + (1 - \sigma^*_2(R)).$$

Here the left-hand side is the expected payoff from a, and the right-hand side is the expected payoff from b. The indifference condition yields

$$\sigma^*_2(R) = 3/4.$$

Of course, $\sigma^*_2(L) = 1/4$. Since Player 2 is playing a mixed strategy, he must be indifferent between playing L and R against $\sigma^*_1$:

$$3\sigma^*_1(a) = 4\,(1 - \sigma^*_1(a)).$$

Here the left-hand side is the expected payoff from L, and the right-hand side is the expected payoff from R. The indifference condition yields

$$\sigma^*_1(a) = 4/7 \quad\text{and}\quad \sigma^*_1(b) = 3/7.$$

3. [Midterm 1, 2001] Find all the Nash equilibria in the following game:

1\2     L        M        R
T     1, 0     0, 1     5, 0
B     0, 2     2, 1     1, 0

Solution: By inspection, there is no pure-strategy equilibrium in this game. There is one mixed-strategy equilibrium. Since R is strictly dominated (by M), Player 2 assigns zero probability to R. Let $p$ and $q$ be the equilibrium probabilities of strategies T and L, respectively; the probabilities of B and M are $1-p$ and $1-q$, respectively. If Player 1 plays T, his expected payoff is $1\cdot q + 0\cdot(1-q) = q$. If he plays B, his expected payoff is $2(1-q)$. Since he assigns positive probabilities to both T and B, he must be indifferent between T and B. Hence, $q = 2(1-q)$, i.e., $q = 2/3$. Similarly, for Player 2, the expected payoffs from playing L and M are $2(1-p)$ and $1$, respectively. Hence, $2(1-p) = 1$, i.e., $p = 1/2$.

4. [Make-up for Midterm 1, 2007] Consider the game in Exercise 4 of Section 3.4, where the Student is sick with probability $p$.

(a) Assuming $p > 1/2$, find a Nash equilibrium.

Solution: It is easier to compute a Nash equilibrium from the normal-form representation. Recall from the solution to Exercise 4 of Section 3.4 that the normal-form representation of the game is

Student\Prof     same             new
a              1, 0             1, 0
b              3, 1/2           3/2, (1-p)/2
c              2, -1/2          1/2, -(1+p)/2
d              4, -1/2          1, -p

When $p > 1/2$, strategy "same" weakly dominates "new", with equality only against a. Since a is not a best response to "new", there cannot be a Nash equilibrium in which "new" is played with positive probability. (Why?) Hence, in any Nash equilibrium the Prof plays "same". The Student's best response to "same" is d. This yields (d, same) as the unique Nash equilibrium.

(b) Assuming $p \in (0, 1/2)$, find a Nash equilibrium.

In order to find all Nash equilibria for $p < 1/2$, it is useful to find the rationalizable strategies:

Student\Prof     same             new
b              3, 1/2*          3/2*, (1-p)/2
d              4*, -1/2         1, -p*

where the best responses are indicated by asterisks. Clearly, there is no pure-strategy Nash equilibrium. The only Nash equilibrium $\sigma^*$ is in mixed strategies. Towards computing $\sigma^*$, the indifference condition for the Student yields

$$3/2 + (3/2)\,\sigma^*_2(\text{same}) = 1 + 3\,\sigma^*_2(\text{same}),$$

where the payoffs from b and d are on the left- and right-hand sides of the equation, respectively. Therefore,

$$\sigma^*_2(\text{same}) = 1/3 \quad\text{and}\quad \sigma^*_2(\text{new}) = 2/3.$$

The indifference condition for the Prof yields

$$\sigma^*_1(b) - 1/2 = \frac{1+p}{2}\,\sigma^*_1(b) - p,$$

yielding

$$\sigma^*_1(b) = \frac{1-2p}{1-p} \quad\text{and}\quad \sigma^*_1(d) = \frac{p}{1-p}.$$

Note that, in equilibrium, the Student takes the regular exam when he is healthy and mixes between the regular exam and the make-up when he is sick.

6.6 Exercises
1. [Homework 2, 2007] Consider the following game:

L M N R
A (4 2) (0 0) (5 0) (0 0)
B (1 4) (1 4) (0 5) (−1 0)
C (0 0) (2 4) (1 2) (0 0)
D (0 0) (0 0) (0 −1) (0 0)

(a) Compute the set of rationalizable strategies.

(b) Find all Nash equilibria (including those in mixed strategies).

2. [Midterm 1, 2007] Consider the game in Exercise 3 in Section 3.5 and Exercise 3
in Section 5.4.

(a) Find all pure strategy Nash Equilibria.

(b) Compute a mixed strategy Nash equilibrium.



3. [Midterm 1, 2005] Find all the Nash equilibria in the following game. (Don’t forget
the mixed strategy equilibrium.)

1\2     L        M        R
T     1, 0     4, 1     1, 0
M     2, 1     3, 2     0, 1
B     3, -1    2, 0     2, 2

4. [Midterm 1, 2004] Consider the following game:

1\2     L        M        R
T     3, 0     0, 3     0, x
M     0, 3     3, 0     0, x
B     x, 0     x, 0     x, x

(a) Compute two Nash equilibria for $x = 1$.

(b) For each equilibrium in part (a), check if it remains a Nash equilibrium when $x = 2$.

5. [Homework 2, 2001] Compute all the Nash equilibria of the following game.

L M R
A (3 1) (0 0) (1 0)
B (0 0) (1 3) (1 1)
C (1 1) (0 1) (0 10)

6. [Homework 2, 2002] Compute all the Nash equilibria of the following game.

L M R
A (4 3) (0 0) (1 1)
B (0 1) (1 0) (10 0)
C (0 0) (3 4) (1 1)
D (−1 0) (3 1) (5 0)

7. [Homework 1, 2001] Consider the following game in normal form.

        L        M        R
1     2, 2     3, 0     4, 0
2     3, 3     2, 0     1, 0
3     1, 3     5, 5     0, 2
4     1, 1     1, 1     2, 3

(a) Iteratively eliminate all strictly dominated strategies; state the assumptions
necessary for each elimination.

(b) What are the rationalizable strategies?

(c) What are the pure-strategy Nash equilibria?

8. [Homework 1, 2004] Consider the game in Exercise 1 of Section 5.4. What are the
Nash equilibria in pure strategies?

9. [Midterm 1, 2003] Find all the Nash equilibria in Exercise 3 of Section 5.4. (Don’t
forget the mixed-strategy equilibrium!)

10. [Homework 1, 2002] Consider the game in Exercise 10 of Section 5.4. What are
the Nash equilibria in pure strategies?

11. [Midterm 1 Make-up, 2001] Compute all the Nash equilibria in the following game.

        L        M        R
T     3, 2     4, 0     0, 0
M     2, 0     3, 3     0, 0
B     0, 0     0, 0     3, 3

12. [Homework 2, 2004] Compute all the Nash equilibria of the following games.

(a)
L M
T (2 1) (0 2)
B (0 1) (3 0)

(b)
L M R
A (4 2) (0 0) (1 1)
B (1 1) (3 4) (2 1)
C (0 0) (3 1) (1 0)

13. [Homework 2, 2001] A group of $n$ students go to a restaurant. It is common knowledge that each student will simultaneously choose his own meal, but all students will share the total bill equally. If a student gets a meal of price $p$ and contributes $x$ towards paying the bill, his payoff will be $\sqrt{p} - x$. Compute the Nash equilibrium. Discuss the limiting cases $n = 1$ and $n \to \infty$.

14. [Midterm 1, 2010] Compute a Nash equilibrium of the following game. (This is a
version of Rock-Scissors-Paper with preference for Paper.)

1\2     R         S         P
R     0, 0      2, -2    -2, 3
S    -2, 2      0, 0      2, -1
P     3, -2    -1, 2      1, 1

15. [Homework 2, 2006] There are $n$ players, $1, 2, \dots, n$, who bid for a painting in a second-price auction. Each player $i$ bids $b_i$, and the bidder who bids highest buys the painting at the highest price bid by the players other than himself. (If two or more players bid the highest bid, the winner is decided by a coin toss.) The value of the art is $v_i$ for each player $i$, where $v_1 > v_2 > \cdots > v_n > 0$. Find a Nash equilibrium of this game in which player $n$, who values the painting least, buys the object for free (at price zero). Briefly discuss this result and compare it to the answer of Exercise 4 in Section 4.5.

16. [Homework 2, 2006] Compute all the Nash equilibria of the following game.

L M R
A (4 2) (0 0) (2 1)
B (0 1) (3 4) (0 1)
C (1 5) (2 1) (1 4)

17. Assume that each strategy set $S_i$ is convex and each utility function $u_i$ is strictly concave in own strategy $s_i$.³ Show that all Nash equilibria are in pure strategies.

³A set $S$ is convex if $\lambda x + (1-\lambda)y \in S$ for all $x, y \in S$ and all $\lambda \in [0,1]$. A function $f : S \to \mathbb{R}$ is strictly concave if

$$f(\lambda x + (1-\lambda)y) > \lambda f(x) + (1-\lambda) f(y)$$

for all $x \neq y \in S$ and $\lambda \in (0,1)$.


Chapter 7

Application: Imperfect Competition

Some of the earliest applications of game theory are the analyses of imperfect competition by Cournot (1838) and Bertrand (1883), a century before Nash (1950). This chapter applies the solution concepts of rationalizability and Nash equilibrium to those models of imperfect competition.

7.1 Cournot (Quantity) Competition


Consider $n$ firms. Each firm $i$ produces $q_i \ge 0$ units of a good at marginal cost $c \ge 0$ and sells it at price

$$P = \max\{1 - Q, 0\}, \tag{7.1}$$

where

$$Q = q_1 + \cdots + q_n \tag{7.2}$$

is the total supply. Each firm maximizes its expected profit. Hence, the payoff of firm $i$ is

$$\pi_i = q_i (P - c). \tag{7.3}$$

Assuming all of the above is commonly known, one can write this as a game in normal
form, by setting

• $N = \{1, 2, \dots, n\}$ as the set of players,

• $S_i = [0, \infty)$ as the strategy space of player $i$, where a typical strategy is the quantity $q_i$ produced by firm $i$, and

• $\pi_i : S_1 \times \cdots \times S_n \to \mathbb{R}$ as the payoff function.

Best Response Throughout the course, it will be useful to know the best response of a firm $i$ to the production levels of the other firms. (See also Exercise 3 in Section 4.4.) Write

$$Q_{-i} = \sum_{j \neq i} q_j \tag{7.4}$$

for the total supply of the firms other than firm $i$. If $Q_{-i} > 1$, then the price is $P = 0$, and the best firm $i$ can do is to produce zero and obtain zero profit. Now assume $Q_{-i} \le 1$. For any $q_i \in (0, 1 - Q_{-i})$, the profit of firm $i$ is

$$\pi_i(q_i, Q_{-i}) = q_i\,(1 - q_i - Q_{-i} - c). \tag{7.5}$$

(The profit is negative if $q_i > 1 - Q_{-i} - c$.) By setting the derivative of $\pi_i$ with respect to $q_i$ to zero,¹ one can obtain the best production level

$$q_i^B(Q_{-i}) = \frac{1 - Q_{-i} - c}{2}. \tag{7.6}$$

The profit function is plotted in Figure 7.1. The best-response function is plotted in Figure 7.2.
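The best-response formula (7.6), truncated at zero, reads as follows in code (a small illustration, not from the notes):

```python
def best_response(Q_other, c):
    """Cournot best reply (7.6) to aggregate rival output Q_other, with
    inverse demand P = max(1 - Q, 0) and marginal cost c; never negative."""
    return max((1 - Q_other - c) / 2, 0.0)

print(best_response(0.25, 0.25))  # 0.25
print(best_response(1.5, 0.25))   # 0.0 -- rivals already flood the market
```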

7.1.1 Cournot Duopoly


Now, consider the case of two firms. In that case, $Q_{-i} = q_j$ for $j \neq i$.

Nash Equilibrium Any Nash equilibrium $(q_1^*, q_2^*)$ must satisfy

$$q_1^* = q_1^B(q_2^*) \equiv \frac{1 - q_2^* - c}{2}$$

and

$$q_2^* = q_2^B(q_1^*) \equiv \frac{1 - q_1^* - c}{2}.$$
2
¹I.e.,

$$\frac{\partial \pi_i}{\partial q_i} = 1 - 2q_i - Q_{-i} - c = 0.$$


Figure 7.1: The profit function $q_i(1 - q_i - q_j - c)$, which is maximized at $q_i = (1 - q_j - c)/2$, crosses zero at $q_i = 1 - q_j - c$, and tends to $-cq_i$ once the price hits zero.

Figure 7.2: The best-response function $q_i = q_i^B(Q_{-i})$, which starts at $(1-c)/2$ when $Q_{-i} = 0$ and reaches zero at $Q_{-i} = 1 - c$.

Figure 7.3: The best-response functions $q_1 = q_1^B(q_2)$ and $q_2 = q_2^B(q_1)$; their unique intersection $q^*$ is the Nash equilibrium.

Solving these two equations simultaneously, one can obtain

$$q_1^* = q_2^* = \frac{1-c}{3}$$
as the only Nash equilibrium. Graphically, as in Figure 7.3, one can plot the best
response functions of each firm and identify the intersections of the graphs of these
functions as Nash equilibria. In this case, there is a unique intersection, and therefore
there is a unique Nash equilibrium.
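Numerically, the equilibrium can also be found by iterating the two best-response maps from any starting point; this sketch (illustrative, with the assumed cost $c = 0.1$) converges to $(1-c)/3$ for each firm:

```python
def duopoly_equilibrium(c, rounds=100):
    """Iterate q1 -> (1 - q2 - c)/2 and q2 -> (1 - q1 - c)/2; the unique
    fixed point is the Cournot-Nash output (1 - c)/3 for each firm."""
    q1 = q2 = 0.0
    for _ in range(rounds):
        q1, q2 = (1 - q2 - c) / 2, (1 - q1 - c) / 2
    return q1, q2

c = 0.1
q1, q2 = duopoly_equilibrium(c)
print(round(q1, 6), round(q2, 6), round((1 - c) / 3, 6))  # 0.3 0.3 0.3
```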

Rationalizability The (linear) Cournot duopoly game considered here is dominance solvable; that is, there is a unique rationalizable strategy. Let us first consider the first couple of rounds of elimination to see this intuitively. I will then show mathematically that this is indeed the case.

Round 1 Notice that a strategy $\hat{q}_i > (1-c)/2$ is strictly dominated by $(1-c)/2$. To see this, consider any $q_j$. As in Figure 7.1, $\pi_i(q_i, q_j)$ is strictly increasing until $q_i = (1 - q_j - c)/2$ and strictly decreasing thereafter. In particular,

$$\pi_i\!\left(\frac{1 - q_j - c}{2}, q_j\right) \ge \pi_i\!\left(\frac{1-c}{2}, q_j\right) > \pi_i(\hat{q}_i, q_j),$$


showing that $\hat{q}_i$ is strictly dominated by $(1-c)/2$. We therefore eliminate all $\hat{q}_i > (1-c)/2$ for each player $i$. The remaining strategy profiles form the square $[0, (1-c)/2]^2$ (in the original figure, the eliminated region of the $(q_1, q_2)$ square is shaded).

Round 2 In the remaining game, $q_j \le (1-c)/2$. Consequently, any strategy $\hat{q}_i < (1-c)/4$ is strictly dominated by $(1-c)/4$. To see this, take any $q_j \le (1-c)/2$ and recall from Figure 7.1 that $\pi_i$ is strictly increasing until $q_i = (1 - q_j - c)/2$, which is greater than or equal to $(1-c)/4$. Hence,

$$\pi_i(\hat{q}_i, q_j) < \pi_i\!\left(\frac{1-c}{4}, q_j\right) \le \pi_i\!\left(\frac{1 - q_j - c}{2}, q_j\right),$$

showing that $\hat{q}_i$ is strictly dominated by $(1-c)/4$. We therefore eliminate all $\hat{q}_i$ with $\hat{q}_i < (1-c)/4$. The remaining strategy profiles form the square $[(1-c)/4, (1-c)/2]^2$.

Notice that the remaining game is a smaller replica of the original game. Applying
the same procedure repeatedly, one can eliminate all strategies except for the Nash
equilibrium. (After every two rounds, a smaller replica is obtained.) Therefore, the only
rationalizable strategy is the unique Nash equilibrium strategy:

∗ = (1 − ) 3.

A more formal treatment One can prove this more formally by invoking the fol-
lowing lemma repeatedly:

Lemma 7.1 Given that $q_j \le \bar{q}$, every strategy $\hat{q}_i$ with $\hat{q}_i < q^B(\bar{q})$ is strictly dominated by $q^B(\bar{q}) \equiv (1 - \bar{q} - c)/2$. Given that $q_j \ge \bar{q}$, every strategy $\hat{q}_i$ with $\hat{q}_i > q^B(\bar{q})$ is strictly dominated by $q^B(\bar{q})$.

Proof. To prove the first statement, take any $q_j \le \bar{q}$. Note that $\pi_i(q_i; q_j)$ is strictly increasing in $q_i$ at any $q_i < q^B(q_j)$. Since $\hat{q}_i < q^B(\bar{q}) \le q^B(q_j)$,² this implies that

$$\pi_i(\hat{q}_i, q_j) < \pi_i\!\left(q^B(\bar{q}), q_j\right).$$

That is, $\hat{q}_i$ is strictly dominated by $q^B(\bar{q})$.

To prove the second statement, take any $q_j \ge \bar{q}$. Note that $\pi_i(q_i; q_j)$ is strictly decreasing in $q_i$ at any $q_i > q^B(q_j)$. Since $q^B(q_j) \le q^B(\bar{q}) < \hat{q}_i$, this implies that

$$\pi_i(\hat{q}_i, q_j) < \pi_i\!\left(q^B(\bar{q}), q_j\right).$$

That is, $\hat{q}_i$ is strictly dominated by $q^B(\bar{q})$.


Now, define a sequence $q^0, q^1, q^2, \dots$ by $q^0 = 0$ and

$$q^m = q^B(q^{m-1}) \equiv \frac{1 - q^{m-1} - c}{2} = \frac{1-c}{2} - \frac{q^{m-1}}{2}$$

²This is because $q^B$ is decreasing.

for all $m > 0$. That is,

$$q^0 = 0,\quad q^1 = \frac{1-c}{2},\quad q^2 = \frac{1-c}{2} - \frac{1-c}{4},\quad q^3 = \frac{1-c}{2} - \frac{1-c}{4} + \frac{1-c}{8},\quad \dots,$$

$$q^m = \frac{1-c}{2} - \frac{1-c}{4} + \frac{1-c}{8} - \cdots - (-1)^m\,\frac{1-c}{2^m}.$$


Theorem 7.1 The set of remaining strategies after any odd round $m$ ($m = 1, 3, \dots$) is $[q^{m-1}, q^m]$. The set of remaining strategies after any even round $m$ ($m = 2, 4, \dots$) is $[q^m, q^{m-1}]$. The set of rationalizable strategies is $\{(1-c)/3\}$.

Proof. We use mathematical induction on $m$. For $m = 1$, we have already proven the statement. Assume that the statement is true for some odd $m$. Then, for any $q_j$ available at even round $m+1$, we have $q^{m-1} \le q_j \le q^m$. Hence, by Lemma 7.1, any $\hat{q}_i < q^B(q^m) = q^{m+1}$ is strictly dominated by $q^{m+1}$ and eliminated. That is, if $q_i$ survives round $m+1$, then $q^{m+1} \le q_i \le q^m$. On the other hand, every $q_i \in [q^{m+1}, q^m] = \left[q^B(q^m), q^B(q^{m-1})\right]$ is a best response to some $q_j$ with $q^{m-1} \le q_j \le q^m$, and it is not eliminated. Therefore, the set of strategies that survive the even round $m+1$ is $[q^{m+1}, q^m]$.

Now, assume that the statement is true for some even $m$. Then, for any $q_j$ available at odd round $m+1$, we have $q^m \le q_j \le q^{m-1}$. Hence, by Lemma 7.1, any $\hat{q}_i > q^B(q^m) = q^{m+1}$ is strictly dominated by $q^{m+1}$ and eliminated. Moreover, every $q_i \in [q^m, q^{m+1}] = \left[q^B(q^{m-1}), q^B(q^m)\right]$ is a best response to some $q_j$ with $q^m \le q_j \le q^{m-1}$, and it is not eliminated. Therefore, the set of strategies that survive the odd round $m+1$ is $[q^m, q^{m+1}]$.

Finally, notice that

$$\lim_{m \to \infty} q^m = \frac{1-c}{3}.$$

Therefore, the intersection of the above intervals is $\{(1-c)/3\}$, which is the set of rationalizable strategies.
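The bounds $q^m$ of Theorem 7.1 are easy to tabulate; the sketch below (with the illustrative value $c = 0.1$) shows the alternating sequence closing in on $(1-c)/3$:

```python
def q_sequence(c, m_max):
    """The elimination bounds of Theorem 7.1: q^0 = 0 and
    q^m = (1 - q^{m-1} - c) / 2."""
    q = [0.0]
    for _ in range(m_max):
        q.append((1 - q[-1] - c) / 2)
    return q

c = 0.1
qs = q_sequence(c, 20)
print(round(qs[1], 3), round(qs[2], 3))  # 0.45 0.225 -- the first two rounds
print(round(qs[-1], 6))                  # 0.3, the limit (1 - c)/3
```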

7.1.2 Cournot Oligopoly


We will now consider the case of three or more firms. When there are three or more firms, rationalizability does not help: one cannot eliminate any strategy less than the monopoly production $q^1 = (1-c)/2$.

Rationalizability In the first round, one can eliminate any strategy $q_i > (1-c)/2$, using the same argument as in the duopoly case. But in the second round, the maximum possible total supply of the other firms is

$$(n-1)\,\frac{1-c}{2} \ge 1 - c,$$

where $n \ge 3$ is the number of firms. The best response to this aggregate supply level is 0. Hence, one cannot eliminate any strategy in round 2. The elimination process stops, yielding $[0, (1-c)/2]$ as the set of rationalizable strategies. Since the set of rationalizable strategies is large, rationalizability has weak predictive power in this game.

Nash Equilibrium While rationalizability has weak predictive power in that the set of rationalizable strategies is large, Nash equilibrium retains strong predictive power: there is a unique Nash equilibrium. Recall that $q^* = (q_1^*, q_2^*, \dots, q_n^*)$ is a Nash equilibrium if and only if

$$q_i^* = q_i^B\!\left(\sum_{j \neq i} q_j^*\right) = \frac{1 - \sum_{j \neq i} q_j^* - c}{2}$$

for all $i$, where the second equality is by (7.6) and the fact that the firms cannot have negative profits in equilibrium (i.e. $\sum_{j \neq i} q_j^* \le 1 - c$). Rewrite this equation system more explicitly:

$$2q_1^* + q_2^* + \cdots + q_n^* = 1 - c$$
$$q_1^* + 2q_2^* + \cdots + q_n^* = 1 - c$$
$$\vdots$$
$$q_1^* + q_2^* + \cdots + 2q_n^* = 1 - c.$$

For any $i$ and $j$, by subtracting the $j$th equation from the $i$th, one can obtain

$$q_i^* - q_j^* = 0.$$

Hence,

$$q_1^* = q_2^* = \cdots = q_n^*.$$

Substituting this into the first equation, one then obtains

$$(n+1)\,q_1^* = 1 - c;$$

i.e.,

$$q_1^* = q_2^* = \cdots = q_n^* = \frac{1-c}{n+1}.$$

Therefore, there is a unique Nash equilibrium, in which each firm produces $(1-c)/(n+1)$. In the unique equilibrium, the total supply is

$$Q = \frac{n}{n+1}\,(1-c)$$

and the price is

$$P = c + \frac{1-c}{n+1}.$$

The profit level for each firm is

$$\pi_i = \left(\frac{1-c}{n+1}\right)^2.$$

As $n$ goes to infinity, the total supply $Q$ converges to $1-c$, and the price $P$ converges to $c$. These are the values at which the demand ($Q = \max\{1 - P, 0\}$) is equal to supply ($P = c$), which is called the (perfectly) competitive equilibrium. When there are few firms, however, the price is significantly higher than the competitive price $c$, and the total supply is significantly lower than the competitive supply $1-c$. We will next consider another model, in which two firms are enough for the competitive outcome.
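The comparative statics in $n$ can be tabulated directly (a small illustration with the assumed cost $c = 0.1$):

```python
def cournot(n, c):
    """Symmetric n-firm Cournot equilibrium: per-firm output, total
    supply, and price."""
    q = (1 - c) / (n + 1)
    return q, n * q, c + (1 - c) / (n + 1)

c = 0.1
for n in (2, 10, 1000):
    q, Q, P = cournot(n, c)
    print(n, round(Q, 4), round(P, 4))
# As n grows, Q approaches 1 - c = 0.9 and P approaches c = 0.1.
```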

7.2 Bertrand (Price) Competition


Consider two firms. Simultaneously, each firm $i$ sets a price $p_i$. The firm $i$ with the lower price $p_i < p_j$ sells $1 - p_i$ units, and the other firm cannot sell any. If the firms set the same price, the demand is divided between them equally. That is, the amount of sales for firm $i$ is

$$Q_i(p_1, p_2) = \begin{cases} 1 - p_i & \text{if } p_i < p_j \\ \frac{1 - p_i}{2} & \text{if } p_i = p_j \\ 0 & \text{otherwise.} \end{cases}$$

Assume that it costs nothing to produce the good (i.e. $c = 0$). Therefore, the profit of firm $i$ is

$$\pi_i(p_1, p_2) = p_i\,Q_i(p_1, p_2) = \begin{cases} p_i(1 - p_i) & \text{if } p_i < p_j \\ \frac{p_i(1 - p_i)}{2} & \text{if } p_i = p_j \\ 0 & \text{otherwise.} \end{cases}$$
Assuming all of the above is commonly known, one can write this formally as a game
in normal form by setting

• $N = \{1, 2\}$ as the set of players,

• $S_i = [0, \infty)$ as the set of strategies for each $i$, with price $p_i$ a typical strategy,

• $\pi_i$ as the utility function.

Observe that when $p_j = 0$, $\pi_i(p_i, p_j) = 0$ for every $p_i$, and hence every $p_i$ is a best response to $p_j = 0$. This has two important implications:

1. Every strategy is rationalizable (one cannot eliminate any strategy, because each of them is a best reply to zero).

2. $p_1^* = p_2^* = 0$ is a Nash equilibrium.

In the rest of the notes, I will first show that this is indeed the only Nash equilibrium. In other words, even with two firms, when the firms compete by setting prices, the competitive equilibrium will emerge. I will then show that if we modify the game slightly by discretizing the set of allowable prices and putting a minimum price, then the game becomes dominance-solvable, i.e., only one strategy remains rationalizable. In the modified game, the minimum price is the only rationalizable strategy, as in competitive equilibrium. Finally, I will introduce small search costs on the part of consumers, who are now modeled as players, and illustrate that the equilibrium behavior is dramatically different from the equilibrium behavior in the original game and from the competitive equilibrium.

7.2.1 Nash Equilibrium


Theorem 7.2 In Bertrand competition, the only Nash equilibrium is $p^* = (0, 0)$.

Proof. We have already seen that $p^* = (0, 0)$ is a Nash equilibrium. I will here show that if $(p_1, p_2)$ is a Nash equilibrium, then $p_1 = p_2 = 0$. To do this, take any Nash equilibrium $(p_1, p_2)$. I first show that $p_1 = p_2$. Towards a contradiction, suppose that $p_i > p_j$. If $p_j = 0$, then $\pi_j(p_j, p_i) = 0$, while $\pi_j(p_i, p_i) = p_i(1 - p_i)/2 > 0$. That is, choosing $p_i$ is a profitable deviation for firm $j$, showing that $p_i > p_j = 0$ is not a Nash equilibrium. Therefore, in order for $p_i > p_j$ to be an equilibrium, it must be that $p_j > 0$. But then firm $i$ has a profitable deviation: $\pi_i(p_i, p_j) = 0$, while $\pi_i(p_j, p_j) = p_j(1 - p_j)/2 > 0$. All in all, this shows that one cannot have $p_i > p_j$ in equilibrium. Therefore, $p_1 = p_2$. But if $p_1 = p_2$ in a Nash equilibrium, then it must be that $p_1 = p_2 = 0$. This is because if $p_1 = p_2 > 0$, then Firm 1 would have a profitable deviation: $\pi_1(p_1, p_2) = (1 - p_1)\,p_1/2$, while $\pi_1(p_1 - \varepsilon, p_2) = (1 - p_1 + \varepsilon)(p_1 - \varepsilon)$, which is close to $(1 - p_1)\,p_1$ when $\varepsilon$ is close to zero.

A graphical proof of the above result is as follows. Recall that $(p_1, p_2)$ is a Nash equilibrium if and only if it is in the intersection of the best responses. Recall also from Exercise 3.e of Section 4.4 that everything is a best response to $p_j = 0$ and nothing is a best response to any $p_j \neq 0$. Hence, the best responses intersect each other only at $(0, 0)$, showing that $(0, 0)$ is the only Nash equilibrium.

7.2.2 Rationalizability with discrete prices


Now suppose that the firms have to set prices as multiples of pennies, and they cannot charge a zero price. That is, the set of allowable prices is

$$P = \{0.01, 0.02, 0.03, \dots\}.$$

The important assumption here is that the minimum allowable price $p^{\min} = 0.01$ yields a positive profit. We will now see that the game is "dominance-solvable" under this assumption. In particular, $p^{\min}$ is the only rationalizable strategy, and it is the only Nash equilibrium strategy. Let us start with the first step.

Step 1: Any price $p_i$ greater than the monopoly price $p^M = 0.5$ is strictly dominated by a mixed strategy that assigns some probability $\varepsilon > 0$ to the price $p^{\min} = 0.01$ and probability $1 - \varepsilon$ to the price $p^M = 0.5$.

Proof. Take any player $i$ and any price $p_i > p^M$. We want to show that the mixed strategy $\sigma_i$ with $\sigma_i(p^M) = 1 - \varepsilon$ and $\sigma_i(p^{\min}) = \varepsilon$ strictly dominates $p_i$ for some $\varepsilon > 0$.

Take any strategy $p_j > p^M$ of the other player $j$. We have

$$\pi_i(p_i, p_j) \le p_i(1 - p_i) \le 0.51 \cdot 0.49 = 0.2499,$$

where the first inequality is by definition and the last inequality is due to the fact that $p_i \ge 0.51$. On the other hand,

$$\pi_i(\sigma_i, p_j) = (1 - \varepsilon)\,p^M(1 - p^M) + \varepsilon\,p^{\min}(1 - p^{\min}) > (1 - \varepsilon)\,p^M(1 - p^M) = 0.25\,(1 - \varepsilon).$$

Thus, $\pi_i(\sigma_i, p_j) > 0.2499 \ge \pi_i(p_i, p_j)$ whenever $0 < \varepsilon \le 0.0004$. Choose $\varepsilon = 0.0004$.

Now, pick any $p_j \le p^M$. Since $p_i > p^M \ge p_j$, we now have $\pi_i(p_i, p_j) = 0$. On the other hand,

$$\pi_i(\sigma_i, p_j) = (1 - \varepsilon)\,p^M(1 - p^M)/2 + \varepsilon\,p^{\min}(1 - p^{\min})$$

when $p_j = p^M$, and

$$\pi_i(\sigma_i, p_j) = \varepsilon\,p^{\min}(1 - p^{\min})$$

when $p^{\min} < p_j < p^M$ (and half of that amount when $p_j = p^{\min}$). In every case,

$$\pi_i(\sigma_i, p_j) > 0 = \pi_i(p_i, p_j).$$

Therefore, $\sigma_i$ strictly dominates $p_i$.
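Step 1 can also be confirmed numerically: on a penny grid (capped at 1.00 for the check, an assumption of this sketch), the mixture beats the price 0.51 against every opposing price:

```python
def profit(p_i, p_j):
    """Bertrand profit with demand 1 - p and zero marginal cost."""
    if p_i < p_j:
        return p_i * (1 - p_i)
    if p_i == p_j:
        return p_i * (1 - p_i) / 2
    return 0.0

def mix_profit(eps, p_j):
    """Expected profit of the mixture: 0.5 w.p. 1 - eps, 0.01 w.p. eps."""
    return (1 - eps) * profit(0.5, p_j) + eps * profit(0.01, p_j)

eps = 0.0004
grid = [round(0.01 * k, 2) for k in range(1, 101)]  # 0.01, ..., 1.00
print(all(mix_profit(eps, p_j) > profit(0.51, p_j) for p_j in grid))  # True
```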

Step 1 yields the eliminations in Round 1.

Round 1 By Step 1, all strategies $p_i$ with $p_i > p^M = 0.5$ are eliminated. Moreover, each $p_i \le p^M$ is a best reply to $p_j = p_i + 0.01$, and is not eliminated. Therefore, the set of remaining strategies is

$$P^2 = \{0.01, 0.02, \dots, 0.5\}.$$

Round m Suppose that the set of strategies remaining at Round $m$ is

$$P^m = \{0.01, 0.02, \dots, \bar{p}\}.$$

Then, the strategy $\bar{p}$ is strictly dominated by the mixed strategy $\sigma_i^m$ with $\sigma_i^m(\bar{p} - 0.01) = 1 - \varepsilon$ and $\sigma_i^m(p^{\min}) = \varepsilon$, as we will see momentarily. We

then eliminate the strategy $\bar{p}$. There is no further elimination, because each $p_i < \bar{p}$ is a best reply to $p_j = p_i + 0.01$.

To prove that $\bar{p}$ is strictly dominated by $\sigma_i^m$, note that the profit from $\bar{p}$ for player $i$ is

$$\pi_i(\bar{p}, p_j) = \begin{cases} \bar{p}(1 - \bar{p})/2 & \text{if } p_j = \bar{p} \\ 0 & \text{otherwise.} \end{cases}$$

On the other hand,

$$\pi_i(\sigma_i^m, \bar{p}) = (1 - \varepsilon)(\bar{p} - 0.01)(1 - \bar{p} + 0.01) + \varepsilon\,p^{\min}(1 - p^{\min}) > (1 - \varepsilon)(\bar{p} - 0.01)(1 - \bar{p} + 0.01).$$

Then, $\pi_i(\sigma_i^m, \bar{p}) > \pi_i(\bar{p}, \bar{p})$ whenever

$$\varepsilon \le 1 - \frac{\bar{p}(1 - \bar{p})/2}{(\bar{p} - 0.01)(1 - \bar{p} + 0.01)}.$$

But $\bar{p} \ge 0.02$, hence $(\bar{p} - 0.01)(1 - \bar{p} + 0.01) > \bar{p}(1 - \bar{p})/2$, and thus the right-hand side is greater than 0. Choose

$$\varepsilon = 1 - \frac{\bar{p}(1 - \bar{p})/2}{(\bar{p} - 0.01)(1 - \bar{p} + 0.01)} > 0,$$

so that $\pi_i(\sigma_i^m, \bar{p}) > \pi_i(\bar{p}, \bar{p})$. Moreover, for any $p_j < \bar{p}$,

$$\pi_i(\sigma_i^m, p_j) \ge \varepsilon\,p^{\min}(1 - p^{\min})/2 > 0 = \pi_i(\bar{p}, p_j),$$

showing that $\sigma_i^m$ strictly dominates $\bar{p}$, and completing the proof.
© ª
Therefore, the process continues until the set of remaining strategies is min and
it stops there. Therefore, min is the only rationalizable strategy.
Since players can put positive probability only on rationalizable strategies in a Nash
¡ ¢
equilibrium, the only possible Nash equilibrium is min  min , which is clearly a Nash
equilibrium.

7.2.3 Price competition with search costs


This section illustrates that the equilibrium behavior is quite different when the consumers need to engage in a costly search in order to find the prices offered by the firms, regardless of how small these costs are.

For simplicity, allow only two prices: 3 and 5. Suppose that the demand for the good comes from a single buyer, for whom the value of the good is 6. She needs only one unit of the good. Unlike before, the buyer has a very small search cost c ∈ (0, 1). She can check the prices by paying c.
The game is as follows:

• The two firms set prices 1 ∈ {3 5} and 2 ∈ {3 5} and the consumer decides
whether to check the prices, all simultaneously.

• If she checks the prices, then she buys from the firm with the lower price. If she decides not to check, or if p_1 = p_2, then she buys from either of the firms with equal probabilities. This behavior is fixed, so that the consumer's only strategies are "check" and "no check".

Formally,

• N = {1, 2, b} is the set of players, where b denotes the buyer;

• S_1 = S_2 = {3, 5} and S_b = {check, no check} are the strategy sets; and

• the payoffs are as in the following tables:

check
1\2   5                3
5     5/2, 5/2, 1−c    0, 3, 3−c
3     3, 0, 3−c        3/2, 3/2, 3−c

no check
1\2   5                3
5     5/2, 5/2, 1      5/2, 3/2, 2
3     3/2, 5/2, 2      3/2, 3/2, 3

Here, the first entry is the payoff of firm 1, the second entry is the payoff of firm 2, and the final entry is the payoff of the buyer. Firm 1 chooses the row, firm 2 chooses the column, and the buyer chooses the matrix. The payoffs are computed following the fixed behavior above. For example, if the consumer does not check the prices, she buys from either firm with probability 0.5. Hence, the payoff of firm i is p_i/2, independent of p_j. The payoff of the buyer is

0.5(6 − p_1) + 0.5(6 − p_2) = 6 − (p_1 + p_2)/2.

If the buyer checks and p_1 = p_2, then the payoffs are p_1/2 to each firm and 6 − p_1 − c to the buyer. If the buyer checks and p_i < p_j, then the buyer buys one unit from firm i; the payoff of firm i is p_i, the payoff of firm j is 0, and the payoff of the buyer is 6 − p_i − c.
A quick glance at the table reveals that the only pure-strategy Nash equilibrium is the one in which both firms set the price 5 (p_1 = p_2 = 5) and the buyer does not check the prices. This is clearly different from the previous games, where price competition pushes the prices to the minimum.

It is easy to check that (p_1 = p_2 = 5; no check) is a Nash equilibrium: given "no check", p_i = 5 dominates p_i = 3, and given that the prices are equal, the buyer saves c by not checking.

It is also easy to check that this is the only Nash equilibrium in pure strategies. If p_1 = p_2, the best response of the buyer is "no check". If the buyer does not check, then the best reply of each firm is 5. Therefore, the only equilibrium with p_1 = p_2 is (p_1 = p_2 = 5; no check). On the other hand, there cannot be a Nash equilibrium with p_1 ≠ p_2. To see this, suppose that p_i = 5 and p_j = 3. The buyer gets 3 − c when she checks and 2 when she does not, so her best reply is to check, because c < 1. That is, (p_i = 5, p_j = 3, no check) is not an equilibrium. In order to have an equilibrium, she must check. But in that case, firm i gets 0, while it could get the higher payoff of 3/2 by setting p_i = 3. Therefore, (p_i = 5, p_j = 3, check) is not an equilibrium either.
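The uniqueness claim can also be verified by brute force over the eight pure strategy profiles; a sketch (the payoff rules are as stated above; c = 0.5 is an illustrative value in (0, 1)):

```python
from itertools import product

c = 0.5  # any search cost in (0, 1) gives the same answer

def payoffs(p1, p2, check):
    """(firm 1, firm 2, buyer) payoffs, following the fixed buyer behavior."""
    if not check:
        return (p1 / 2, p2 / 2, 6 - (p1 + p2) / 2)
    if p1 == p2:
        return (p1 / 2, p2 / 2, 6 - p1 - c)
    lo = min(p1, p2)
    return (p1 if p1 < p2 else 0, p2 if p2 < p1 else 0, 6 - lo - c)

def is_nash(p1, p2, check):
    u = payoffs(p1, p2, check)
    return (all(payoffs(d, p2, check)[0] <= u[0] for d in (3, 5))
            and all(payoffs(p1, d, check)[1] <= u[1] for d in (3, 5))
            and all(payoffs(p1, p2, d)[2] <= u[2] for d in (True, False)))

nash = [s for s in product((3, 5), (3, 5), (False, True)) if is_nash(*s)]
assert nash == [(5, 5, False)]  # p1 = p2 = 5, buyer does not check
```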
There is also a symmetric Nash equilibrium in mixed strategies. To find the equilibrium, write q for the probability that a firm sets p = 5 (the probabilities are equal in a symmetric equilibrium) and r for the probability that the buyer checks. The expected payoff from checking for the buyer is

U(check; q) = q² + 3(1 − q²) − c = 3 − 2q² − c.

This is because the buyer gets 1 − c when p_1 = p_2 = 5, which happens with probability q², and 3 − c otherwise (with probability 1 − q²). If she does not check, her expected payoff is

U(no check; q) = q + 3(1 − q) = 3 − 2q.

(Since she chooses the firm randomly without knowing the prices, the probability that the price will be high is q.) We are looking for a mixed-strategy Nash equilibrium with 0 < r < 1. In that case, the buyer must be indifferent:

U(check; q) = U(no check; q),

i.e.,

2q(1 − q) = c.   (7.7)

That is,

q = (1 ∓ √(1 − 2c))/2.
On the other hand, given that the buyer checks with probability r and the other firm charges the high price with probability q, the payoff from p = 5 is

U(5; q, r) = (1 − r(1 − q)) 5/2.

This is because the firm cannot sell only when the buyer checks (probability r) and the other firm charges the low price (probability 1 − q); otherwise it sells with probability 0.5. The payoff from p = 3 is

U(3; q, r) = 3rq + (1 − rq) 3/2.

This is because the firm sells with probability 1, getting the payoff of 3, if the buyer checks and the other firm sets the high price; otherwise the firm gets 3/2. Since 0 < q < 1, each firm must be indifferent:

U(5; q, r) = U(3; q, r),
(1 − r(1 − q)) 5/2 = 3rq + (1 − rq) 3/2.

That is,

r = 2/(5 − 2q).

There are two symmetric mixed-strategy equilibria:

(q, r) = ((1 + √(1 − 2c))/2, 2/(4 − √(1 − 2c)))

and

(q, r) = ((1 − √(1 − 2c))/2, 2/(4 + √(1 − 2c))).

It is illustrative to plot the possible values of q as a function of c, including the pure-strategy Nash equilibrium where q = 1.
[Figure: the equilibrium values of q plotted against the search cost c, both axes running from 0 to 1: the two mixed-equilibrium branches, which merge at c = 1/2, and the pure-equilibrium line q = 1.]

When c > 1/2, there is a unique Nash equilibrium, in which the firms charge the high price. When c = 1/2, there is also a mixed-strategy Nash equilibrium, in which the firms charge the high and the low price with equal probabilities. As we decrease the cost c further, we have two mixed-strategy equilibria in addition to the pure-strategy equilibrium with the high price. The probability of charging the high price reacts to changes in c differently in the two mixed equilibria. In one of them, as we decrease c to zero, the probability of charging the high price also decreases to zero, so that in the limit the firms charge the low price. In the other, that probability increases to 1, so that in the limit the firms charge the high price.
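The two mixed equilibria can be verified directly against both indifference conditions; a sketch (the formulas are from this subsection; c = 0.32 is an illustrative value below the threshold 1/2):

```python
from math import sqrt

def mixed_equilibria(c):
    """The two symmetric mixed equilibria (q, r) for a search cost c < 1/2."""
    out = []
    for sign in (1, -1):
        q = (1 + sign * sqrt(1 - 2 * c)) / 2   # firms' probability of p = 5
        r = 2 / (5 - 2 * q)                    # buyer's probability of checking
        out.append((q, r))
    return out

c = 0.32
for q, r in mixed_equilibria(c):
    assert abs(2 * q * (1 - q) - c) < 1e-9     # buyer indifference (7.7)
    u5 = (1 - r * (1 - q)) * 5 / 2             # firm's payoff from p = 5
    u3 = 3 * r * q + (1 - r * q) * 3 / 2       # firm's payoff from p = 3
    assert abs(u5 - u3) < 1e-9                 # firm indifference
```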

7.3 Exercises with Solutions


1. [Homework 2, 2001] Compute the pure-strategy Nash equilibria in the following linear Cournot oligopoly with an arbitrary number n of firms: each firm has marginal cost c ∈ (0, 1) and a fixed cost F > 0, which it needs to incur only if it produces a positive amount; the inverse-demand function is given by P(Q) = max{1 − Q, 0}, where Q is the total supply.

Solution: Suppose that m ≤ n firms produce a positive quantity and the remaining firms produce 0. For any firm i with positive production q_i*,

q_i* = (1 − c − Σ_{j≠i} q_j*)/2.
As in the usual Cournot model above, the unique solution to this equation system is

q* = (1 − c)/(m + 1)

for each firm with positive production. The resulting profit is

π* = ((1 − c)/(m + 1))² − F.

Of course, each firm also has the option of not producing at all and obtaining zero profit. Hence, π* ≥ 0, i.e.,

m ≤ (1 − c)/√F − 1 ≡ m*.

Moreover, if m < n, a firm that produces 0 must not profit from deviating to a positive production. The best such deviation is to produce (1 − c − m q*)/2, so the condition is

((1 − c − m q*)/2)² − F ≤ 0.

Since 1 − c − m q* = (1 − c)/(m + 1) = q*, this simplifies to

F ≥ (q*/2)².

Hence, the set of Nash equilibria is as follows: for any integer m ≤ min{m*, n} with m ≥ (m* − 1)/2 (the lower bound being required only when m < n), m firms produce (1 − c)/(m + 1) each, and the remaining firms produce 0.
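For concrete parameters, the two equilibrium conditions can be checked by a simple enumeration; a sketch (the break-even and no-entry conditions are from the solution above; the values of n, c, F below are illustrative):

```python
from math import sqrt

def equilibrium_sizes(n, c, F, tol=1e-12):
    """Numbers m of active firms consistent with a pure Nash equilibrium."""
    sizes = []
    for m in range(1, n + 1):
        q = (1 - c) / (m + 1)                  # each active firm's output
        if q * q - F < -tol:                   # active firms must break even
            continue
        if m < n and (q / 2) ** 2 - F > tol:   # inactive firms must not enter
            continue
        sizes.append(m)
    return sizes

# With c = 0.1 and F = 0.01: m* = (1 - c)/sqrt(F) - 1 = 8
m_star = (1 - 0.1) / sqrt(0.01) - 1
assert round(m_star) == 8
assert equilibrium_sizes(10, 0.1, 0.01) == [4, 5, 6, 7, 8]
```

In this example the lower bound (m* − 1)/2 = 3.5 rules out m ≤ 3, matching the enumeration.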

7.4 Exercises
1. [Midterm 1, 2007] Consider the Cournot duopoly with linear demand function P = 1 − Q, where P is the price and Q = q_1 + q_2 is the total supply.³ Firm 1 has zero marginal cost. Firm 2 has marginal cost c(q_2) = q_2, so that the total cost of producing q_2 is q_2²/2.

(a) (10 points) Compute all the Nash equilibria.


(b) (15 points) Compute the set of all rationalizable strategies. Explain your
steps.

2. Show that all Nash equilibria of Cournot oligopoly game above are in pure strate-
gies (i.e., one does not need to check for mixed strategy equilibria). (See exercise
17 in Section 6.6.)

3. Can you find a mixed-strategy Nash equilibrium in the Bertrand game above?

³ Recall that in a Cournot duopoly, Firms 1 and 2 simultaneously produce the quantities q_1 and q_2, and they sell at the price P.
MIT OpenCourseWare
http://ocw.mit.edu

14.12 Economic Applications of Game Theory


Fall 2012

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Chapter 8

Further Applications

This chapter is devoted to exercises that apply the ideas developed in previous chapters
to various real-world problems. All of the exercises come from past exams and homework
problems. The reader is recommended to solve them before studying the solutions.
Many of the games in this chapter are supermodular, a class of games for which there
are powerful general theorems. These theorems could be used to find the rationalizable
set relatively easily. I will not use these theorems. Instead, I will explicitly apply the
iterated elimination procedure and use the results from the previous chapters. This
will hopefully drill the logic of rationalizability better. Moreover, knowledge of the procedure makes clear the role of the knowledge and rationality assumptions, and hence how sensitive the solution can be to them.

8.1 Partnership
Consider an employer and a worker. The employer provides the capital K ≥ 0 (in terms of investment in technology, etc.) and the worker provides the labor L ≥ 0 (in terms of investment in human capital) to produce

f(K, L) = K^α L^β,

for some α, β ∈ (0, 1) with α + β < 1. They share the output equally. The parties determine their investment levels (the employer's capital K and the worker's labor L) simultaneously. The per-unit costs of capital and labor are r > 0 and w > 0, respectively. The worker cannot supply more than some fixed positive amount L̄. The payoffs of the employer and the worker are

U_E(K, L) = f(K, L)/2 − rK

and

U_W(K, L) = f(K, L)/2 − wL,

respectively. Everything described up to here is common knowledge.

Exercise 8.1 Write this formally as a game in normal form.

Solution: The set of players is {E, W}, where E stands for the employer and W stands for the worker. The sets of strategies are

S_E = [0, ∞),
S_W = [0, L̄].

The payoff functions are U_E and U_W.

Exercise 8.2 Compute the set of Nash equilibria.

Solution: Since U_E and U_W are strictly concave in a player's own strategy, all Nash equilibria are in pure strategies. Towards finding the Nash equilibria, compute the best-response functions of E and W as

K*(L) = (αL^β/(2r))^(1/(1−α))  and  L*(K) = min{(βK^α/(2w))^(1/(1−β)), L̄},

respectively. A Nash equilibrium is an intersection of the best responses, i.e., K = K*(L) and L = L*(K). Clearly, (0, 0) is a Nash equilibrium. To find the other possible equilibria, first consider the case L*(K) < L̄, the case plotted in Figure 8.1. In that case, the non-zero solution to the above equation system is

K̂ = [(1/2)(α/r)^(1−β)(β/w)^β]^(1/(1−α−β)),
L̂ = [(1/2)(α/r)^α(β/w)^(1−α)]^(1/(1−α−β)).

[Figure: the best-response curves K*(L) and L*(K), intersecting at (0, 0) and at (K̂, L̂); the rationalizable strategies form the rectangle [0, K̂] × [0, L̂].]

Figure 8.1: Rationalizability and Nash equilibrium in partnership game.

When L̂ ≤ L̄, the only non-zero equilibrium is (K̂, L̂), yielding the equilibria in Figure 8.1. When L̂ > L̄, the constraint on labor binds, and the non-zero equilibrium is (K*(L̄), L̄).

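The interior equilibrium can also be found by iterating the two best-response maps; a sketch (the closed forms are from the solution above; the parameter values are illustrative and chosen so that L̂ < L̄):

```python
# Illustrative parameters with alpha + beta < 1 and a slack labor cap L_bar:
alpha, beta, r, w, L_bar = 0.3, 0.4, 1.0, 1.0, 10.0

def K_star(L):
    return (alpha * L ** beta / (2 * r)) ** (1 / (1 - alpha))

def L_star(K):
    return min((beta * K ** alpha / (2 * w)) ** (1 / (1 - beta)), L_bar)

K, L = 1.0, 1.0
for _ in range(200):                  # best-response iteration
    K, L = K_star(L), L_star(K)

# Closed-form interior equilibrium (K_hat, L_hat):
e = 1 / (1 - alpha - beta)
K_hat = (0.5 * (alpha / r) ** (1 - beta) * (beta / w) ** beta) ** e
L_hat = (0.5 * (alpha / r) ** alpha * (beta / w) ** (1 - alpha)) ** e
assert abs(K - K_hat) < 1e-9 and abs(L - L_hat) < 1e-9
```

The iteration converges because, in logarithms, the composed best-response map is an affine contraction when α + β < 1.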
Exercise 8.3 Find all the rationalizable strategies.

Solution: Iterated dominance is applied as follows.

Round 1 Since L ∈ [0, L̄], any K > K*(L̄) is strictly dominated by K*(L̄). (The proof is similar to the Cournot duopoly case in the previous chapter, and is left as an easy exercise.) Therefore, any such strategy is eliminated for the employer. No other strategy is eliminated in this round, because any K ∈ [0, K*(L̄)] is a best response to some L ∈ [0, L̄] and any L ∈ [0, L̄] is a best response to some K ≥ 0. The remaining strategy sets are [0, K*(L̄)] and [0, L̄].

The subsequent rounds depend on whether L̂ ≥ L̄.


Round 2 (L̂ ≥ L̄) Since L̂ ≥ L̄, as is clear from Figure 8.1, L*(K*(L̄)) = L̄, and hence any L ∈ [0, L̄] is a best response to some K ∈ [0, K*(L̄)]. Moreover, it has already been established that any K ∈ [0, K*(L̄)] is a best response to some L ∈ [0, L̄]. Therefore, no strategy is eliminated at this stage, and the elimination procedure stops here. The set of rationalizable strategies is [0, K*(L̄)] for the employer and [0, L̄] for the worker.
¡ ¡ ¢¢ ¡ ¢
Round 2 (L̂ < L̄) Since L̂ < L̄, as is clear from Figure 8.1, L*(K*(L̄)) = (β(K*(L̄))^α/(2w))^(1/(1−β)) < L̄. Now, since K ∈ [0, K*(L̄)], any L > L*(K*(L̄)) is strictly dominated by L*(K*(L̄)), and is eliminated for the worker. As before, no other strategy is eliminated in this round. The remaining strategy sets are [0, K*(L̄)] and [0, L*(K*(L̄))].

Towards an induction, assume that at the end of round 2n, the remaining strategy sets are [0, (K* ∘ (L* ∘ K*)^(n−1))(L̄)] and [0, (L* ∘ K*)^n(L̄)].¹ (This is the case for n = 1.)

ˆ  )
Round 2 + 1 ( ¯ Write
¡ ¢
̃ ≡  ∗ ◦ (∗ ◦  ∗ )−1 ̄

and
¡ ¢
̃ ≡ (∗ ◦  ∗ ) ̄ 
³ ´
ˆ  ,
Since  ˜ 
¯ as one can see from the figure,  ˜
ˆ and  ˆ . Hence,  ∗ 
˜ =
³ ³ ´´ ³ ´ ³ ´
 ∗ ∗ ̃  ̃. Any strategy    ∗ ̃ is dominated by  ∗ ̃ and
eliminated
h in this
³ ´iround. No
h other
i strategy is eliminated. The remaining strategy

sets are 0  ̃ and 0 ̃ .
³ ³ ´´
Round 2 + 2 ( ˆ  )
¯ Since  ˆ˜  ̄, as in the previous round, ∗  ∗  ˜ ˜
 .
h ³ ´i ³ ³ ´´
Now, since  ∈ 0  ∗ ̃ , any   ∗  ∗ ̃ is strictly dominated by
³ ³ ´´
∗  ∗  ˜ , and is eliminated for the worker. As before, no other strategy
h ³ ´i
is eliminated in this round. The remaining strategy sets are 0  ∗ ̃ and
h ³ ³ ´´i ¡ ¢
0 ∗  ∗ ˜ . By substituting ̃ ≡ (∗ ◦  ∗ )  ¯ one can check that the
formulas in the inductive hypothesis is true for  + 1.
∗ ∗ 
¡ ¢
One can check that
h as
i  → ∞, ( ◦  ) h̄ →
i ̂. Therefore, the set of rational-
izable strategies is 0 ̂ for the employer and 0 ̂ for the worker.
1
For any function  ,   () =  (    ( ())) where  is repeated  times.

8.2 Coordination in Software Development


[Midterm 1, 2011] There are n software developers, named i = 1, 2, ..., n. Each software developer i has an "ideal specification" a_i for his product, a real number, but would also like his product to be compatible with the other software. Simultaneously, each i selects a specification parameter s_i, which is a real number. The payoff of player i is

u_i(s_1, ..., s_n) = 100 − (s_i − a_i)² − (1/(n−1)) Σ_{j≠i} (s_i − s_j)².

Note that a developer pays two costs: one for being away from his ideal specification, and one for being away from the specifications of the other software (the cost of incompatibility). Note also that in the normal-form game, the software developers are the players; each chooses a strategy s_i from the real line, and the payoff function of player i is u_i.

Exercise 8.4 Compute a Nash equilibrium.

Solution: For each player i, the first-order condition for his best response is

2(s_i − a_i) − (2/(n−1)) Σ_{j≠i} (s_j − s_i) = 0,

which simplifies to

2s_i = a_i + (1/(n−1)) Σ_{j≠i} s_j.

To solve this equation system, sum it over i and obtain Σ_i s_i = Σ_i a_i. Substituting this into the above equation, obtain

s_i = ((n−1)/(2n−1)) a_i + (1/(2n−1)) Σ_{j=1}^{n} a_j.

In equilibrium, a software developer chooses, roughly, the average of his own ideal specification and the average (1/n) Σ_{j=1}^{n} a_j of all the ideal specifications, including his own.
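One can verify numerically that this profile satisfies all of the first-order conditions; a sketch (the ideal specifications below are made-up numbers):

```python
a = [0.0, 1.0, 4.0, 7.0]      # made-up ideal specifications a_1, ..., a_n
n = len(a)
total = sum(a)

# Equilibrium profile: s_i = ((n-1) a_i + sum_j a_j) / (2n - 1)
s = [((n - 1) * a_i + total) / (2 * n - 1) for a_i in a]

for i in range(n):
    rhs = a[i] + sum(s[j] for j in range(n) if j != i) / (n - 1)
    assert abs(2 * s[i] - rhs) < 1e-9   # first-order condition for player i
assert abs(sum(s) - total) < 1e-9       # sum of choices equals sum of ideals
```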

8.3 Competition in Research and Development


[Midterm 1, 2002] Two start-ups, named Firm 1 and Firm 2, are competing for leadership in a software market. The leader wins, and the other loses. Each firm can invest some amount x ∈ [0.001, 1] in research and development by paying the cost x/4. If a firm invests x units and the other firm invests y units, the former wins with probability x/(x + y). Therefore, the payoff of the former start-up is

x/(x + y) − x/4.

All of this is common knowledge.

Note that the leader gets revenue 1 and the follower gets revenue 0. These numbers are multiplied by the respective probabilities, and the final payoff above is obtained after subtracting the cost of research. Note also that in the normal-form game, the players are Firm 1 and Firm 2, the strategy set of each player is [0.001, 1], and the payoff functions are as above.

Exercise 8.5 Compute all pure strategy Nash equilibria.

Solution: Firm 1 maximizes

x/(x + y) − x/4

over x, and Firm 2 maximizes

y/(x + y) − y/4

over y. The best response of Firm 1 to y is given by the first-order condition

0 = ∂/∂x [x/(x + y) − x/4] = 1/(x + y) − x/(x + y)² − 1/4 = y/(x + y)² − 1/4,

i.e.,

x*(y) = 2√y − y.

Similarly, the best-response function of Firm 2 is

y*(x) = 2√x − x.

Note that x*(y) > y whenever y < 1. Therefore, the graphs of x* and y* intersect each other only at x = y = 1, as shown in Figure 8.2. Therefore, (1, 1) is the only Nash equilibrium.

Figure 8.2: Best Response functions in R&D example.

Exercise 8.6 Compute all rationalizable strategies.

Solution: (1 1) is the only rationalizable strategy profile. Since  ≥ 0 ≡ 0001, then
any strategy   ∗ (0 ) is strictly dominated by 1 = ∗ (0 ), and therefore eliminated.
Write also 0 = 0 and 1 = 1 . Now, the remaining strategy space of each player is
[1  1]. Note that 1 = ∗ (001)  0001 = 0 . Now, similarly, one can eliminate any
strategy   2 ≡ ∗ (1 ). Applying this iteratively, after th elimination, the remaining
strategy space is [  1] where


 = 2 −1 − −1

and 0 = 001. It is clear from the figure that  → 1 as  → ∞. Hence in the limit we
are left with strategy space {1}.
More formally,
√ √ 12
 = 2 −1 − −1  −1 = −1 

Hence,
(12)−1
1    0 

(12)−1
Of course, as  → ∞, (12)−1 → 0, and hence 0 → 1. Therefore,  → 1.
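The recursion converges to 1 very quickly; a sketch of the computation:

```python
from math import sqrt

x = 0.001                      # x_0, the lowest feasible investment
for n in range(20):
    x = 2 * sqrt(x) - x        # x_n = 2*sqrt(x_{n-1}) - x_{n-1}
# x_1 = 0.0622..., x_2 = 0.4366..., x_3 = 0.885..., x_4 = 0.9966..., ...
assert 1 - x < 1e-9            # the surviving interval [x_n, 1] collapses onto 1
```

Near 1 the convergence is quadratic: writing x = 1 − δ, one step maps δ to δ²/4.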

8.4 Political Competition


[Midterm 1, 2006] Two candidates, Alice and Bob, are running for a political office. Simultaneously, Alice and Bob invest x_A ≥ 0 and x_B ≥ 0 in their campaigns, respectively. Alice wins with probability

p_A(x_A, x_B) = x_A/(x_A + x_B + y);

Bob wins with probability

p_B(x_A, x_B) = x_B/(x_A + x_B + y);

and with the remaining probability y/(x_A + x_B + y) a third-party candidate wins, where y > 0 is a fixed small number. For each candidate, the value of winning is 1 and the cost of investing x is x, so that the expected payoffs of Alice and Bob are p_A(x_A, x_B) − x_A and p_B(x_A, x_B) − x_B, respectively.

Exercise 8.7 Compute the Nash equilibria of this game.

Answer: The first-order condition for x_A being a best response to x_B is

∂/∂x_A [p_A(x_A, x_B) − x_A] = 1/(x_A + x_B + y) − x_A/(x_A + x_B + y)² − 1 = (x_B + y)/(x_A + x_B + y)² − 1 = 0;

i.e.,

x_B + y = (x_A + x_B + y)².   (8.1)

Similarly, the first-order condition for Bob is

x_A + y = (x_A + x_B + y)².   (8.2)

Comparing the two equations, we find that x_A = x_B. Substituting this equality into (8.1), we find that, in equilibrium, x_A solves

x_A + y = (2x_A + y)²,   (8.3)

which is equivalent to

4x_A² + (4y − 1)x_A + y² − y = 0.

There is only one non-negative solution to this quadratic equation:

x* = (1 − 4y + √(1 + 8y))/8 ≅ 1/4.

The unique Nash equilibrium is given by x_A = x_B = x*.

Exercise 8.8 Compute the set of rationalizable strategies.

Answer: From (8.1) and (8.2), the best response functions of Alice and Bob are


 ( ) =  +  − ( + )


 ( ) =  +  − ( + ) 

The best-response functions look like:


xB

1/4
x*

xA

1/4 -  1/4

Remember from the class that, since the utility functions are strictly concave (or "single-
peaked"), if   
 ( ) for each  , then  is strictly dominated. For example,
any   14 is strictly dominated by  = 14. Similarly, any   14 is strictly
dominated by  = 14. Hence, in the first round, we eliminate all such strategies:

[Figure: the same best-response diagram after the first round of elimination; the surviving strategies of each candidate lie in [0, 1/4], and the curves cross at x*.]


In the next round, we eliminate the strategies with x_A < B_A(0) = √y − y ≡ x_1. This is because all such strategies are now strictly dominated by x_1. We continue this elimination iteratively. All we need to know is where the process stops. The answer is actually easy: it stops at x*, and x_A = x_B = x* are the only rationalizable strategies.

Here is a mathematical proof. Since the game is symmetric, the set of rationalizable strategies is the same for both players; call that set S, and write B for the common best-response function. Recall that no rationalizable strategy is strictly dominated when we restrict the other player's strategies to be rationalizable. That is, for each x ∈ S, there exists x' ∈ S such that x = B(x'). Suppose that min S < 1/4 − y. Now, min S = B(x') for some x' ∈ S. Since B is "single-peaked", either x' = min S or x' = max S. But since x* ≤ max S ≤ 1/4, as in the figure, B(max S) > 1/4 − y, showing that x' ≠ max S. Hence, x' = min S, i.e., min S = B(min S). But this is a contradiction, because it implies that (min S, min S) is a Nash equilibrium. Therefore, min S ≥ 1/4 − y. Then B is strictly decreasing on S. That is, min S = B(max S) and max S = B(min S), i.e., (max S, min S) is a Nash equilibrium, showing that max S = min S = x*.
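Both the quadratic solution and the fixed-point property can be checked numerically; a sketch (equations (8.1) and (8.3) from above; y = 0.01 is an illustrative value):

```python
from math import sqrt

y = 0.01                                     # illustrative third-party weight
x_star = (1 - 4 * y + sqrt(1 + 8 * y)) / 8

# x* solves (8.3): x + y = (2x + y)^2
assert abs((x_star + y) - (2 * x_star + y) ** 2) < 1e-12

def best_response(x):
    """From (8.1): the best reply to an opponent spending x."""
    return sqrt(x + y) - (x + y)

# x* is a fixed point of the best response, and iterating from 0 reaches it:
x = 0.0
for _ in range(50):
    x = best_response(x)
assert abs(x - x_star) < 1e-12
```

The iteration converges fast because the slope of the best response at x* is close to zero.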

8.5 Exercises
1. [Homework 1, 2002] In the partnership game above, compute the rationalizable strategies for the case

(a) α = β = 1/2, r > 1/4, w > 1/4;

(b) α = β = 1/2, r = w = 1/4;

(c) α = β = 1/2, r < 1/4, w < 1/4.

2. [Homework 2, 2006] Alice and Bob seek each other. Simultaneously, Alice puts in effort a and Bob puts in effort b to search. The probability of their meeting is ab; the value of the meeting for each of them is v, and the search costs are a³/3 to Alice and b³/3 to Bob.

(a) Find the Nash equilibria of this game.

(b) How do the equilibrium search efforts change when we increase v?

(c) Take v = 1 and compute all rationalizable strategies.



3. [Homework 2, 2011] There are 3 partners, namely 1, 2, and 3. Simultaneously, each partner i puts in effort e_i ∈ [0, 1], producing the output level e_1 e_2 e_3 and incurring the cost c e_i². The partners share the output equally; the payoff of partner i is e_1 e_2 e_3/3 − c e_i².

(a) Write this game formally in normal form.

(b) For c > 1/6, compute the sets of rationalizable strategies and Nash equilibria.

(c) For c ∈ (0, 1/6), compute the sets of rationalizable strategies and Nash equilibria.

4. [Midterm 1 Make-Up, 2002] Consider a two-player game in which each player's strategy is a real number x ∈ [0, 1]. A player's payoff is

−(x − y/2 − 1/4)²,

where x is his own strategy and y is the strategy chosen by the other player.

(a) Find all Nash equilibria.

(b) Compute all rationalizable strategies.

5. In the software development game in Section 8.2, compute the set of rationalizable strategies for the case in which

(a) the set of strategies is the set of all real numbers;

(b) the set of strategies is [0, 1] and each ideal specification a_i ∈ (0, 1).

6. Redo the analysis in Section 8.4 for the easier case of y = 0.


Chapter 9

Backward Induction

We now start analyzing dynamic games with complete information. These notes focus on perfect-information games, where each information set is a singleton, and apply the notion of backward induction to these games. We will assume that the game has a "finite horizon", i.e., there can be only finitely many moves in any history of moves.

9.1 Definition
The concept of backward induction corresponds to the assumption that it is common
knowledge that each player will act rationally at each future node where he moves — even
if his rationality would imply that such a node will not be reached.¹ (The assumption that a player moves rationally at each information set at which he moves is called sequential rationality.)
Mechanically, backward induction corresponds to the following procedure, depicted in
Figure 9.1. Consider any node that comes just before terminal nodes, that is, after each
move stemming from this node, the game ends. (Such nodes are called pen-terminal.) If
the player who moves at this node acts rationally, he chooses the best move for himself
at that node. Hence, select one of the moves that give this player the highest payoff.
Assigning the payoff vector associated with this move to the node at hand, delete all the
¹ More precisely: at each node, the player who moves is certain that all the players will act rationally at all of the nodes that follow; and at each node, the player is certain that at each following node the player who moves there will be certain that all the players will act rationally at all the nodes that follow that node, ... ad infinitum.


[Figure 9.1 — flowchart:
1. Take any pen-terminal node.
2. Pick one of the payoff vectors (moves) that gives "the mover" at the node the highest payoff.
3. Assign this payoff to the node at hand; eliminate all the moves and the terminal nodes following the node.
4. If any non-terminal node remains, return to step 1; otherwise, the picked moves are the result.]

Figure 9.1: Algorithm for backward induction

moves stemming from this node so that we have a shorter game, where the above node
is a terminal node. Repeat this procedure until the origin is the only remaining node.
The outcome is the moves that are picked in the way. Since a move is picked at each
information set, the result is a strategy profile.
For an illustration of the procedure, consider the game in the following figure. This game describes a situation in which it is mutually beneficial for all players to stay in a relationship, while a player would like to exit the relationship if she knows that the other player will exit on the next day.

1 ─→ 2 ─→ 1 ─→ (2,5)
↓    ↓    ↓
(1,1) (0,4) (3,3)



On the third day, Player 1 moves, choosing between going across or down. If he goes across, he gets 2; if he goes down, he gets the higher payoff of 3. Hence, according to the procedure, he goes down. Selecting the move "down" for this node, one reduces the game as follows:

1 ─→ 2 ─→ (3,3)
↓    ↓
(1,1) (0,4)

Here, the part of the game that starts at the last decision node is replaced with the payoff vector associated with the selected move, "down", of the player at this decision node.

On the second day, Player 2 moves, choosing between going across or down. If she goes across, she gets 3; if she goes down, she gets the higher payoff of 4. Hence, according to the procedure, she goes down. Selecting the move "down" for this node, one reduces the game further as follows:

1 ─→ (0,4)
↓
(1,1)

Once again, the part of the game that starts at the node at hand is replaced with the payoff vector associated with the selected move, "down". Now, Player 1 gets 0 if he goes across and 1 if he goes down. Therefore, he goes down. The procedure results in the following strategy profile, with "down" selected at every decision node:

1 ─→ 2 ─→ 1 ─→ (2,5)
↓    ↓    ↓
(1,1) (0,4) (3,3)

We assumed that Player 1 will act rationally on the last day, when we reckoned that he goes down. When we reckoned that Player 2 goes down on the second day, we assumed that Player 2 assumes that Player 1 will act rationally on the third day, and we also assumed that she is rational herself. On the first day, Player 1 anticipates all of this. That is, he is assumed to know that Player 2 is rational, and that she will keep believing that Player 1 will act rationally on the third day.

This example also illustrates another notion associated with backward induction — commitment (or the lack of commitment). Note that the outcomes on the third day (i.e., (3,3) and (2,5)) are both strictly better than the equilibrium outcome (1,1). But the players cannot reach these outcomes, because Player 2 cannot commit to going across, and, anticipating that Player 2 will go down, Player 1 exits the relationship on the first day.

There is also a further commitment problem in this example. If Player 1 were able to commit to going across on the third day, then Player 2 would definitely go across on the second day. In that case, Player 1 would go across on the first day. Of course, Player 1 cannot commit to going across on the third day, and the game ends on the first day, yielding the low payoffs (1,1).
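The procedure of Figure 9.1 can be sketched as a short recursive routine and run on the three-day exit game above (the tree encoding, the labels, and the function name below are ours):

```python
def backward_induct(node, plan=None):
    """Backward induction on a finite perfect-information game tree.

    A terminal node is a payoff tuple; a decision node is a triple
    (label, player, {move_name: child}).  Returns the payoff vector
    reached and a dict assigning a move to every decision node.
    """
    if plan is None:
        plan = {}
    if not isinstance(node[-1], dict):       # terminal node: a payoff tuple
        return node, plan
    label, player, moves = node
    best_move, best_payoffs = None, None
    for name, child in moves.items():        # solve every subtree first
        payoffs, _ = backward_induct(child, plan)
        if best_payoffs is None or payoffs[player] > best_payoffs[player]:
            best_move, best_payoffs = name, payoffs
    plan[label] = best_move                  # pick the mover's best move here
    return best_payoffs, plan

# The exit game above, with players indexed 0 (Player 1) and 1 (Player 2):
game = ("day 1", 0,
        {"across": ("day 2", 1,
                    {"across": ("day 3", 0,
                                {"across": (2, 5), "down": (3, 3)}),
                     "down": (0, 4)}),
         "down": (1, 1)})

payoffs, plan = backward_induct(game)
assert payoffs == (1, 1)
assert plan == {"day 3": "down", "day 2": "down", "day 1": "down"}
```

Note that the routine assigns a move at every decision node, including nodes that the resulting play never reaches, exactly as the procedure in the text requires.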

9.2 Backward Induction and Nash Equilibrium


Careful readers must have noticed that the strategy profile resulting from the backward induction above is a Nash equilibrium. (If you have not noticed this, check that it is indeed a Nash equilibrium.) This is not a coincidence:

Proposition 9.1 In a game with finitely many nodes, backward induction always results in a Nash equilibrium.

Proof. Let s* = (s*_1, ..., s*_n) be the outcome of backward induction. Consider any player i and any strategy s_i. To show that s* is a Nash equilibrium, we need to show that

u_i(s*) ≥ u_i(s_i, s*_{−i}),

where s*_{−i} = (s*_j)_{j≠i}. Take any node such that

• player i moves at the node, and

• s*_i and s_i prescribe the same moves for player i at every node that comes after this node.

(There is always such a node; for example, the last node at which player i moves.) Consider a new strategy s¹_i according to which i plays everywhere according to s_i, except at the above node, where he plays according to s*_i. Under either (s¹_i, s*_{−i}) or (s_i, s*_{−i}), the play after this node is as in (s*_i, s*_{−i}), the outcome of the backward induction. Moreover, in the construction of s*, we selected the best move for player i at this node given this continuation play. Therefore, the change from s_i to s¹_i, which follows the backward-induction recommendation, can only increase the payoff of i:

u_i(s¹_i, s*_{−i}) ≥ u_i(s_i, s*_{−i}).

Applying the same procedure to s¹_i, now construct a new strategy s²_i that differs from s¹_i only at one node, where player i plays according to s*_i, and

u_i(s²_i, s*_{−i}) ≥ u_i(s¹_i, s*_{−i}).

Repeat this procedure, producing a sequence of strategies s_i, s¹_i, s²_i, ..., s^m_i, .... Since the game has finitely many nodes, and we are always changing moves to those of s*_i, there is some m such that s^m_i = s*_i. By construction, we have

u_i(s*) = u_i(s^m_i, s*_{−i}) ≥ u_i(s^{m−1}_i, s*_{−i}) ≥ ··· ≥ u_i(s¹_i, s*_{−i}) ≥ u_i(s_i, s*_{−i}),

completing the proof.


It is tempting to conclude that backward induction results in Nash equilibrium be-
cause one plays a best response at every node, finding the above proof unnessarily long.
136 CHAPTER 9. BACKWARD INDUCTION

Since one takes his future moves given and picks only a move for the node at hand,
chhosing the best moves at the given nodes does not necessarily lead to a best response
among all contingent plans in general.

Example 9.1 Consider a single player who chooses between good and bad every day, forever. If he chooses good every day, he gets 1, and he gets 0 otherwise. Clearly, the optimal plan is to play good every day, yielding 1. Now consider the strategy according to which he plays bad at every node. This gives him 0. But the strategy satisfies the condition of backward induction (although backward induction cannot be applied to this game, which has no end node): at any node, given the moves selected for the future, he gets zero regardless of what he does at the current node.

The above pathological case is a counterexample to the idea that if one is playing a
best move at every node, his plan is a best response. The latter idea is a major principle
of dynamic optimization, called the Single-Deviation Principle. It applies in most cases, except for pathological cases like the one above. The above proof shows that the principle applies in games with finitely many moves. The Single-Deviation Principle will be the main tool in the analyses of the infinite-horizon games in upcoming chapters. Studying the above proof is recommended.
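The backward-induction computation used throughout this chapter is mechanical enough to automate. Below is a minimal sketch (the tree encoding and function name are my own, not notation from the text) that returns a backward-induction payoff and plan for any finite perfect-information game tree; it is illustrated on the Matching Pennies with Perfect Information game discussed later in this chapter. When a player is indifferent, the routine keeps the first action examined, so it returns one of the possibly many backward-induction solutions.

```python
def backward_induction(node, path=()):
    """Backward induction on a finite perfect-information game tree.

    Terminal nodes are payoff tuples; decision nodes are dicts with a
    'player' index and a 'moves' map from actions to subtrees. Returns
    the induced payoff vector and a plan mapping each decision node
    (identified by the action path reaching it) to the move selected there.
    """
    if isinstance(node, tuple):                      # terminal node: payoffs
        return node, {}
    i = node["player"]
    best_action, best_payoff, plan = None, None, {}
    for action, subtree in node["moves"].items():
        payoff, subplan = backward_induction(subtree, path + (action,))
        plan.update(subplan)
        if best_payoff is None or payoff[i] > best_payoff[i]:
            best_action, best_payoff = action, payoff
    plan[path] = best_action                         # move chosen at this node
    return best_payoff, plan

# Matching Pennies with perfect information: Player 0 picks Head or Tail,
# then Player 1 observes the choice and picks his own.
mp = {"player": 0, "moves": {
    "Head": {"player": 1, "moves": {"Head": (-1, 1), "Tail": (1, -1)}},
    "Tail": {"player": 1, "moves": {"Head": (1, -1), "Tail": (-1, 1)}}}}
payoff, plan = backward_induction(mp)
print(payoff)   # (-1, 1): Player 1 matches, so Player 0 loses either way
```

The recursion mirrors the proof above: at each node only the current move is optimized, with the continuation play already fixed by the deeper calls.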
But not all Nash equilibria can be obtained by backward induction. Consider the
following game of the Battle of the Sexes with perfect information, where Alice moves
first.
Alice

O F

Bob Bob

O F O F

(2,1) (0,0) (0,0) (1,2)

In this game, backward induction leads to the strategy profile identified in the figure,
according to which Bob goes wherever Alice goes, and Alice goes to Opera. There is
another Nash equilibrium: Alice goes to Football game, and Bob goes to Football game

at both of his decision nodes. Let’s see why this is a Nash equilibrium. Alice plays a
best response to the strategy of Bob: if she goes to Football she gets 1, and if she goes
to Opera she gets 0 (as they do not meet). Bob’s strategy (FF) is also a best response
to Alice’s strategy: under this strategy he gets 2, which is the highest he can get in this
game.
One can, however, discredit the latter Nash equilibrium because it relies on a sequentially irrational move at the node reached after Alice goes to Opera. This node is not reached according to Alice's strategy, and it is therefore ignored in Nash equilibrium.
Nevertheless, if Alice goes to Opera, going to football game would be irrational for Bob,
and he would rationally go to Opera as well. And Alice should foresee this and go to
Opera. Sometimes, we say that this equilibrium is based on "an incredible threat", with
the obvious interpretation.
This example illustrates a shortcoming of the usual rationality condition, which re-
quires that one must play a best response (as a complete contingent plan) at the be-
ginning of the game. While this requires that the player plays a best response at the
nodes that he assigns positive probability, it leaves the player free to choose any move
at the nodes that he puts zero probability–because all the payoffs after those nodes are
multiplied by zero in the expected utility calculation. Since the likelihoods of the nodes
are determined as part of the solution, this may lead to somewhat erroneous solutions in
which a node is not reached because a player plays irrationally at that node, anticipating that the node will not be reached, as in the (Football, FF) equilibrium. Of course, this is erroneous in that when that node is reached, the player cannot pretend that it will not be reached, as he will know that it is reached, by the definition of an information set. Then, he must play a best response given that the node is reached.
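One can check these claims mechanically by enumerating the pure strategy profiles of the induced normal form. In the sketch below (the encodings are my own), Bob's strategy is a complete plan: a pair giving his move after Opera and his move after Football. The enumeration confirms that (F, FF) is a Nash equilibrium even though it prescribes an irrational off-path move; it also turns up (O, OO), which relies on an off-path move in the same way.

```python
from itertools import product

# Payoffs of the Battle of the Sexes outcomes, listed as (Alice, Bob).
TABLE = {("O", "O"): (2, 1), ("O", "F"): (0, 0),
         ("F", "O"): (0, 0), ("F", "F"): (1, 2)}

def outcome(alice, bob_plan):
    # bob_plan = (move after O, move after F); only the on-path move matters.
    bob = bob_plan[0] if alice == "O" else bob_plan[1]
    return TABLE[(alice, bob)]

alice_strats = ["O", "F"]
bob_strats = list(product("OF", repeat=2))        # OO, OF, FO, FF

nash = [(a, b) for a, b in product(alice_strats, bob_strats)
        if all(outcome(a, b)[0] >= outcome(a2, b)[0] for a2 in alice_strats)
        and all(outcome(a, b)[1] >= outcome(a, b2)[1] for b2 in bob_strats)]
print(nash)
```

Only (O, OF), in which Bob matches Alice's move, survives backward induction.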

9.3 Commitment
In this game, Alice can commit to going to a place, but Bob cannot. If we trust the outcome of backward induction, this commitment helps Alice and hurts Bob. (Although the game is symmetric, Alice gets a higher payoff.) It is tempting to conclude that the ability to commit is always good. While this is true in many games, in some games it is not the case. For example, consider the Matching Pennies with Perfect Information, depicted

in Figure 3.4. Let us apply backward induction. If Player 1 chooses Head, Player 2 will
play Head; and if Player 1 chooses Tail, Player 2 will prefer Tail, too. Hence, the game
is reduced to

Head → (−1, 1)
Tail → (−1, 1)

In that case, Player 1 will be indifferent between Head and Tail; choosing either of these two options or any randomization between them will give us an equilibrium with backward induction. In either equilibrium, Player 2 beats Player 1.

9.4 Multiple Solutions

This example shows that backward induction can lead to multiple equilibria. Here, in
one equilibrium, Player 1 chooses Head, in another one Player 1 chooses Tail, and yet in
another mixed strategy equilibrium, he mixes between the two strategies. Each mixture
probability corresponds to a different equilibrium. In all of these equilibria, the payoffs
are the same. In general, however, backward induction can lead to multiple equilibria with quite different outcomes.

Example–Multiple Solutions Consider the game in Figure 9.2. According to backward induction, at his nodes on the right and at the bottom, Player 1 goes down at both nodes. This leads to the reduced game in Figure 9.3. Clearly, in the reduced game, both $x$ and $y$ yield 2 for Player 2, while $z$ yields only 1. Hence, she must choose either $x$ or $y$ or any randomization between the two. In other words, for any $p \in [0,1]$, the mixed strategy that puts probability $p$ on $x$, $1-p$ on $y$, and 0 on $z$ can be selected by the backward induction. Select such a strategy. Then, the payoff vector associated with

[Game tree: Player 1 first chooses A or D, with D yielding (1, 1); after A, Player 2 chooses x, y, or z; x and y each lead to a further move by Player 1.]

Figure 9.2: A game with multiple backward induction solutions.

[Reduced game: Player 1 chooses A or D; D yields (1, 1); after A, Player 2 chooses among x, y, and z, with x yielding (2, 2) and y yielding (0, 2).]

Figure 9.3:

the decision of Player 2 is $(2p, 2)$. The game reduces to


A → (2p, 2)
D → (1, 1)

The strategy selected for Player 1 depends on the choice of $p$. If some $p > 1/2$ is selected for Player 2, Player 1 must choose A. This results in the equilibrium in which Player 1 plays A and Player 2 plays $x$ with probability $p$ and $y$ with probability $1-p$. If $p < 1/2$, Player 1 must choose D. In the resulting equilibrium, Player 1 plays D and Player 2 plays $x$ with probability $p$ and $y$ with probability $1-p$. Finally, if $p = 1/2$ is selected, then Player 1 is indifferent, and we can select any randomization between A and D, each resulting in a different equilibrium.

9.5 Example–Stackelberg duopoly


In the Cournot duopoly, we assume that the firms set their quantities simultaneously. This reflects the assumption that no firm can commit to a quantity level. Sometimes a firm may be able to commit to a quantity level. For example, a firm may already be in the market, having constructed its factory, warehouses, etc., so that its production level is fixed. The other firm enters the market later, knowing the production level of the first firm. We will consider such a situation, which is called Stackelberg duopoly. There are two firms. The first firm is called the Leader, and the second firm is called the Follower. As before, we take the marginal cost $c$ to be constant.

• The Leader first chooses its production level $q_1$.

• Then, knowing $q_1$, the Follower chooses its own production level $q_2$.

• Each firm $i$ sells its quantity $q_i$ at the realized market price
$$P(q_1 + q_2) = \max\{1 - (q_1 + q_2), 0\},$$
yielding the payoff of
$$\pi_i(q_1, q_2) = q_i \left( P(q_1 + q_2) - c \right).$$

Notice that this defines an extensive-form game:

• At the initial node, firm 1 chooses an action $q_1$; the set of allowable actions is $[0, \infty)$.

• After each action of firm 1, firm 2 moves and chooses an action $q_2$; the set of allowable actions is again $[0, \infty)$.

• Each of these actions leads to a terminal node, at which the payoff vector is $(\pi_1(q_1, q_2), \pi_2(q_1, q_2))$.

Notice that a strategy of firm 1 is a real number $q_1$ from $[0, \infty)$, and, more importantly, a strategy of firm 2 is a function $s_2$ from $[0, \infty)$ to $[0, \infty)$, which assigns a production level $s_2(q_1)$ to each $q_1$. These strategies with the utility functions $U_i(q_1, s_2) = \pi_i(q_1, s_2(q_1))$ give us the normal form.
Let us apply backward induction. Given $q_1 \le 1 - c$, the best production level for the Follower is
$$q_2^*(q_1) = \frac{1 - q_1 - c}{2},$$
yielding the payoff vector²
$$\begin{pmatrix} \pi_1(q_1, q_2^*(q_1)) \\ \pi_2(q_1, q_2^*(q_1)) \end{pmatrix} = \begin{pmatrix} \frac{1}{2} q_1 (1 - q_1 - c) \\ \frac{1}{4} (1 - q_1 - c)^2 \end{pmatrix}. \quad (9.1)$$
By replacing the moves of firm 2 with the associated payoffs, we obtain a game in which firm 1 chooses a quantity level $q_1$, which leads to the payoff vector in (9.1). In this game, firm 1 maximizes $\frac{1}{2} q_1 (1 - q_1 - c)$, choosing
$$q_1^* = (1 - c)/2,$$
the quantity that it would choose if it were alone in the market.


You should also check that there is a Nash equilibrium of this game in which the Follower produces the Cournot quantity irrespective of what the Leader produces, and the Leader produces the Cournot quantity. Of course, this is not consistent with backward induction: when the Follower knows that the Leader has produced the Stackelberg quantity, he will change his mind and produce a lower quantity, the quantity computed in the backward induction.
¡ 1−1 −
¢
2
Note that 1 1 − 1 − 2 −  = 21 1 (1 − 1 − ) 
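The two-step computation above is easy to verify numerically. The sketch below (function names mine) encodes the Follower's reaction and the Leader's induced profit, and confirms by grid search that the Leader's optimum is $q_1^* = (1-c)/2$, with the Follower then producing $(1-c)/4$.

```python
def follower_br(q1, c):
    """Follower's best response q2*(q1) = (1 - q1 - c)/2, truncated at 0."""
    return max((1 - q1 - c) / 2, 0.0)

def leader_profit(q1, c):
    """Leader's profit once the Follower's reaction is folded in."""
    q2 = follower_br(q1, c)
    price = max(1 - q1 - q2, 0.0)
    return q1 * (price - c)

c = 0.2
grid = [i / 10000 for i in range(10001)]              # q1 in [0, 1]
q1_star = max(grid, key=lambda q1: leader_profit(q1, c))
q2_star = follower_br(q1_star, c)
print(round(q1_star, 6), round(q2_star, 6))   # 0.4 0.2, i.e. (1-c)/2 and (1-c)/4
```

Note that the Leader produces more, and the Follower less, than the Cournot quantity $(1-c)/3$: commitment helps the Leader here.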

9.6 Exercises with Solutions


1. [Midterm 1, 2001] Using Backward Induction, compute an equilibrium of the game
in Figure 3.9.
Solution: See Figure 9.4.

[Game tree for the solution, with the backward-induction moves marked; terminal payoffs include (2,1), (1,2), (3,1), (1,3), (1,3), (3,1).]
Figure 9.4:

2. Consider the game in Figure 9.5.

[Game tree: Player 1 chooses L or R. After L, Player 2 chooses l1 → (1,2) or r1 → (2,1). After R, Player 2 chooses l2 → (0,3) or r2, after which Player 1 chooses l → (2,2) or r → (1,4).]

Figure 9.5:

(a) Apply backward induction in this game. State the rationality/knowledge


assumptions necessary for each step in this process.

[The same game tree, with the backward-induction moves L, l1, l2, and l marked.]

Figure 9.6:

Solution: The backward induction outcome is as below. First, eliminate action $r_1$ for Player 2, by assuming that Player 2 is sequentially rational and hence will not play $r_1$, which is conditionally dominated by $l_1$. Also eliminate action $r$ for Player 1, assuming that Player 1 is sequentially rational. This is because $r$ is conditionally dominated by $l$. Second, eliminate $r_2$, assuming that Player 2 is sequentially rational and that Player 2 knows that Player 1 will be sequentially rational at future nodes. This is because, believing that Player 1 will be sequentially rational in the future, Player 2 would believe that Player 1 will not play $r$, and hence $r_2$ would lead to a payoff of 2. Being sequentially rational, she must play $l_2$. Finally, eliminate $R$, assuming that (i) Player 1 is sequentially rational, (ii) Player 1 knows that Player 2 is sequentially rational, and (iii) Player 1 knows that Player 2 knows that Player 1 will be sequentially rational in the future. This is because (ii) and (iii) lead Player 1 to conclude that Player 2 will play $l_1$ and $l_2$, and thus by (i) he plays $L$. The solution is as in Figure 9.6.

(b) Write this game in normal-form.



Solution: Each player has 4 strategies (named by the actions to be chosen).

        l1l2    l1r2    r1l2    r1r2
Ll      1,2     1,2     2,1     2,1
Lr      1,2     1,2     2,1     2,1
Rl      0,3     2,2     0,3     2,2
Rr      0,3     1,4     0,3     1,4

(c) Find all the rationalizable strategies in this game –use the normal form.
State the rationality/knowledge assumptions necessary for each elimination.

Solution: First, Rr is strictly dominated by the mixed strategy that puts probability .5 on each of Ll and Rl. Assuming that Player 1 is rational, we conclude that he would not play Rr. We eliminate Rr, so the game is reduced to

        l1l2    l1r2    r1l2    r1r2
Ll      1,2     1,2     2,1     2,1
Lr      1,2     1,2     2,1     2,1
Rl      0,3     2,2     0,3     2,2

Now r1r2 is strictly dominated by l1l2. Hence, assuming that (i) Player 2 is rational, and that (ii) Player 2 knows that Player 1 is rational, we eliminate r1r2. This is because, by (ii), Player 2 knows that Player 1 will not play Rr, and hence by (i) she would not play r1r2. The game is reduced to

        l1l2    l1r2    r1l2
Ll      1,2     1,2     2,1
Lr      1,2     1,2     2,1
Rl      0,3     2,2     0,3

There is no strictly dominated strategy in the remaining game. Therefore, all the remaining strategies are rationalizable.
(d) Comparing your answers to parts (a) and (c), briefly discuss whether or how
the rationality assumptions for backward induction and rationalizability dif-
fer.

Solution: Backward induction gives us a much sharper prediction compared


to that of rationalizability. This is because the notion of sequential rationality
is much stronger than rationality itself.
(e) Find all the Nash equilibria in this game.
Solution: The only Nash equilibria are the strategy profiles in which Player 1 mixes between the strategies Ll and Lr, and Player 2 mixes between l1l2 and l1r2, playing l1l2 with probability at least 1/2:
$$NE = \left\{ (\sigma_1, \sigma_2) \mid \sigma_1(Ll) + \sigma_1(Lr) = 1,\ \sigma_2(l_1 l_2) + \sigma_2(l_1 r_2) = 1,\ \sigma_2(l_1 r_2) \le 1/2 \right\}.$$
(If you found the pure-strategy equilibria (namely, (Ll, l1l2) and (Lr, l1l2)), you will get most of the points.)

3. [Midterm 1, 2011] A committee of three members, namely $i = 1, 2, 3$, is to decide on a new bill that would make file sharing more difficult. The value of the bill to member $i$ is $v_i$, where $v_3 > v_2 > v_1 > 0$. The music industry, represented by a lobbyist named Alice, stands to gain $A$ from the passage of the bill, and the file-sharing industry, represented by a lobbyist named Bob, stands to lose $B$ from the passage of the bill, where $A > B > 0$. Consider the following game.

• First, Alice promises non-negative contributions $a_1$, $a_2$, and $a_3$ to the members 1, 2, and 3, respectively, where $a_i$ is to be paid to member $i$ by Alice if the bill passes.

• Then, observing $(a_1, a_2, a_3)$, Bob promises non-negative contributions $b_1$, $b_2$, and $b_3$ to the members 1, 2, and 3, respectively, where $b_i$ is to be paid to member $i$ by Bob if the bill does not pass.

• Finally, each member $i$ votes, voting for the bill if $v_i + a_i > b_i$ and against the bill otherwise. The bill passes if and only if at least two members vote for it.

• The payoff of Alice is $A - (a_1 + a_2 + a_3)$ if the bill passes and zero otherwise. The payoff of Bob is $-B$ if the bill passes and $-(b_1 + b_2 + b_3)$ otherwise.

Assuming that $2v_3 > B > 2v_2$, use backward induction to compute a Nash equilibrium of this game. (Note that Alice and Bob are the only players here because the actions of the committee members are fixed already.) [Hint: Bob chooses not to contribute when he is indifferent between contributing and not contributing at all.]

Solution: Given any $(a_1, a_2, a_3)$ by Alice, for each $i$, write $p_i(a_i) = v_i + a_i$ for the "price" of member $i$ for Bob. If the total price of the cheapest two members exceeds $B$ (i.e., $\sum_i p_i(a_i) - \max_i p_i(a_i) \ge B$), then Bob needs to pay at least $B$ to stop the bill, in which case he contributes 0 to each member. If the total price of the cheapest two members is lower than $B$, then the only best response for Bob is to pay exactly the cheapest two members their price and pay nothing to the remaining member, stopping the bill, which would otherwise have cost him $B$. In sum, Bob's strategy is given by
$$b_i^*(a_1, a_2, a_3) = \begin{cases} v_i + a_i & \text{if } \sum_{i'} p_{i'}(a_{i'}) - \max_{i'} p_{i'}(a_{i'}) < B \text{ and } i \ne i^*, \\ 0 & \text{otherwise,} \end{cases}$$
where $i^*$ is the most expensive member, which is chosen randomly when there is a tie.³

Given $b^*$, as a function of $(a_1, a_2, a_3)$, Alice's payoff is
$$U_A(a_1, a_2, a_3) = \begin{cases} A - (a_1 + a_2 + a_3) & \text{if } \sum_i p_i(a_i) - \max_i p_i(a_i) \ge B, \\ 0 & \text{otherwise.} \end{cases}$$
Clearly, this is maximized either at some $(a_1, a_2, a_3)$ with $\sum_i p_i(a_i) - \max_i p_i(a_i) = B$ (i.e., the cheapest two members cost exactly $B$ to Bob) or at $(0, 0, 0)$. Since $2v_3 > B > 2v_2$, Alice can set the prices of members 1 and 2 to $B/2$ by contributing $(B/2 - v_1, B/2 - v_2, 0)$, which yields her $A - B + v_1 + v_2 > 0$ as $A > B$. Her strategy is
$$a^* = (B/2 - v_1, B/2 - v_2, 0).$$

Bonus: Use backward induction to compute a Nash equilibrium of this game without assuming $2v_3 > B > 2v_2$.

³Those who wrote Bob's strategy wrongly as $(0,0,0)$ or any other fixed vector of numbers will lose 7 points for that. Clearly, $(0,0,0)$ cannot be a strategy for Bob in this game, showing a colossal lack of understanding of the subject.

Answer: First consider the case $B \le 2v_3$. Then, Alice chooses a contribution vector $(a_1, a_2, 0)$ such that $a_1 + a_2 + v_1 + v_2 = B$, $a_1 + v_1 \le v_3$, and $a_2 + v_2 \le v_3$. Such a vector is feasible because $B \le 2v_3$ and $v_3 > v_2 > v_1 > 0$. Optimality of this contribution is as before.
Now consider the case $B > 2v_3$. Now, Alice must contribute to all members in order to pass the bill, and optimality requires that the prices of all members are $B/2$ (as Bob buys the cheapest two). That is, she must contribute
$$a^{**} = (B/2 - v_1, B/2 - v_2, B/2 - v_3).$$
Since this costs Alice $3B/2 - (v_1 + v_2 + v_3)$, she makes such a contribution to pass the bill if and only if $3B/2 \le A + (v_1 + v_2 + v_3)$. Otherwise, she contributes $(0, 0, 0)$ and the bill fails.

9.7 Exercises
1. In the Stackelberg duopoly example, for every $q_1 \in (0, 1)$, find a Nash equilibrium in which Firm 1 plays $q_1$.

2. Apply backward induction to the "sequential Stackelberg oligopoly" with $n$ firms: Firm 1 chooses $q_1$ first, firm 2 chooses $q_2$ second, firm 3 chooses $q_3$ third, ..., and firm $n$ chooses $q_n$ last.

3. [Homework 2, 2011] Use backward induction to compute a Nash equilibrium of the


following game.

[Game tree with an initial chance move (probabilities 1/2, 1/2) followed by moves of Players 1 and 2.]

[Game tree in which Player 1 first chooses among L, M, and R, followed by moves of Player 2 and further moves of Player 1.]
Figure 9.7:

4. [Homework 2, 2002] Apply backward induction in the game of Figure 9.7.

5. [Homework 2, 2002] Three gangsters armed with pistols, Al, Bob, and Curly, are in a room with a suitcase of money. Al, Bob, and Curly have 20%, 40%, and 70% chances of killing their target, respectively. Each has one bullet. First, Al shoots, targeting one of the other two gangsters. After Al, if alive, Bob shoots, targeting one of the surviving gangsters. Finally, if alive, Curly shoots, again targeting one of the surviving gangsters. The survivors split the money equally. Find a subgame-perfect equilibrium.

6. [Midterm 1 Make Up, 2001] Find all pure-strategy Nash equilibria in Figure 9.8. Which of these equilibria can be obtained by backward induction?

7. [Final Make Up, 2000] Find the subgame-perfect equilibrium of the following 2-person game. First, player 1 picks an integer $n_0$ with $1 \le n_0 \le 10$. Then, player 2 picks an integer $n_1$ with $n_0 + 1 \le n_1 \le n_0 + 10$. Then, player 1 picks an integer $n_2$ with $n_1 + 1 \le n_2 \le n_1 + 10$. In this fashion, they pick integers alternately. At each turn, the player who moves picks an integer by adding an integer between 1 and 10 to the number picked by the other player last time. Whoever picks 100 wins the game and gets 100; the other loses the game and gets zero.

8. [Final, 2001] Consider the extensive form game in Figure 9.9.



[Game tree with alternating L/R moves by Players 1 and 2; terminal payoffs include (0,0), (1,3), (2,2), (-1,-1), (4,2), (3,3).]

Figure 9.8:

[Game tree: Player 1 chooses O, yielding (2,2), or I; after I, Players 1 and 2 choose L or R without observing each other, with payoffs (3,1), (0,0), (0,0), (1,3).]
Figure 9.9:

(a) Find the normal form representation of this game.

(b) Find all rationalizable pure strategies.

(c) Find all pure strategy Nash equilibria.

(d) Which strategies are consistent with all of the following assumptions?

(i) 1 is rational.
(ii) 2 is sequentially rational.
(iii) at the node she moves, 2 knows (i).
(iv) 1 knows (ii) and (iii).

9. [Final 2004] Use backward induction to find a Nash equilibrium for the following game, which is a simplified version of a game called Weakest Link. There are 4 risk-neutral contestants, 1, 2, 3, and 4, with "values" $v_1, \ldots, v_4$, where $v_1 > v_2 > v_3 > v_4 > 0$. The game has 3 rounds. At each round, an outside party adds the value of each "surviving" contestant to a common account,⁴ and at the end of the third round one of the contestants wins and gets the amount collected in the common account. We call a contestant surviving at a round if he was not eliminated at a previous round. At the end of rounds 1 and 2, the surviving contestants vote out one of the contestants. The contestants vote sequentially in the order of their indices (i.e., 1 votes before 2; 2 votes before 3, and so on), observing the previous votes. The contestant who gets the most votes is eliminated; ties are broken at random. At the end of the third round, a contestant $i$ wins the contest with probability $v_i/(v_i + v_j)$, where $i$ and $j$ are the surviving contestants at the third round. (Be sure to specify which player will be eliminated for each combination of surviving contestants, but you need not necessarily specify how every contestant will vote at all contingencies.)

10. [Midterm 1, 2011] A committee of three members, namely $i = 1, 2, 3$, is to decide on a new bill that would make file sharing more difficult. The value of the bill to member $i$ is $v_i$, where $v_3 > v_2 > v_1 > 0$. The music industry, represented by
⁴For example, if contestant 2 is eliminated in the first round and contestant 4 is eliminated in the second round, the total amount in the account is $(v_1 + v_2 + v_3 + v_4) + (v_1 + v_3 + v_4) + (v_1 + v_3)$ at the end of the game.

a lobbyist named Alice, stands to gain $A$ from the passage of the bill, and the file-sharing industry, represented by a lobbyist named Bob, stands to lose $B$ from the passage of the bill, where $A > B > 0$. Consider the following game.

• First, Alice promises non-negative contributions $a_1$, $a_2$, and $a_3$ to the members 1, 2, and 3, respectively, where $a_i$ is to be paid to member $i$ by Alice if the bill passes.

• Then, observing $(a_1, a_2, a_3)$, Bob promises non-negative contributions $b_1$, $b_2$, and $b_3$ to the members 1, 2, and 3, respectively, where $b_i$ is to be paid to member $i$ by Bob if the bill does not pass.

• Finally, each member $i$ votes, voting for the bill if $v_i + a_i > b_i$ and against the bill otherwise. The bill passes if and only if at least two members vote for it.

• The payoff of Alice is $A - (a_1 + a_2 + a_3)$ if the bill passes and zero otherwise. The payoff of Bob is $-B$ if the bill passes and $-(b_1 + b_2 + b_3)$ otherwise.

Use backward induction to compute a Nash equilibrium of this game. (Note that
Alice and Bob are the only players here because the actions of the committee
members are fixed already.)
Chapter 10

Application: Negotiation

Negotiation is an essential aspect of social and economic interaction. The states negoti-
ate their borders with their neighbors; the legislators negotiate the laws that they make;
defendants negotiate a settlement with the prosecutors or the plaintiffs in the courts;
workers negotiate their salaries with their employers; families negotiate their spending and maintenance of the household with each other; and even some students try to
negotiate their grades with their professor. Despite its central importance, negotiations
were presumed to be outside of the purview of economic analysis until the emergence of
game theory. Today there are many game theoretical models of bargaining. These notes
apply backward induction to three important bargaining games. The first one considers
congressional bargaining. It abstracts away from the back-room deals that lead to the
proposed bills and focuses on the way legislators vote between various alternatives. The
second model considers pretrial negotiation in law. The third one is a general model of
bargaining that can be applied to many different settings in economics.

10.1 Congressional Bargaining–Voting with a Bi-


nary Agenda
In the US Congress, when a new bill is introduced, there are often other alternative proposals, such as amendments, amendments to amendments, substitute bills, amendments
to substitute bills, etc. There are rules of the Congress that determine the order in which
these proposals, or "alternatives", are voted against each other, eventually leading to a

153

[Binary agenda: first x0 vs. x1; the winner is voted against x2; if x2 wins that vote, x2 is voted against the loser of the first vote.]
Figure 10.1: A binary agenda

final bill. In the final vote, the final bill, which may not be the original one, either passes or fails; if it fails, the status quo prevails. For example, if there is a bill, an amendment, and the status quo, first they vote between the bill and the amendment, then they
vote between the winner of the previous vote and the status quo. These rules and the
available proposals lead to a "binary" agenda; it is binary because in any session two
alternatives are voted against each other.

Let $\{1, \ldots, 2n+1\}$ be the set of players and $\{x_0, \ldots, x_m\}$ be the set of alternatives. Each player has a strict preference ordering over the set of alternatives. There is a fixed binary agenda, and we assume that all of these are commonly known.

To solve this game, we start from a last vote (a vote after which there is no further voting). We assume that each player votes according to his preference. The alternative that gets $n+1$ or more votes wins. We then truncate the game by replacing the vote with the winning alternative. We proceed in this way until there is only one alternative.

For example, consider three players, namely 1, 2, and 3, and three alternatives, namely $x_0$, $x_1$, and $x_2$. The agenda is as in Figure 10.1. According to the agenda, $x_0$ and $x_1$ are voted against each other first; the winner is voted against $x_2$ next. If the winner defeats $x_2$ as well, then it is implemented; otherwise $x_2$ (the winner of the second vote) is voted against the loser of the first vote, and the winner of this vote is implemented.

The preference ordering of the three players is as follows:

      1     2     3
     x0    x2    x1
     x1    x0    x2
     x2    x1    x0

where the higher-ranked alternatives are placed in the higher rows.


Consider the branch on the left first. In the last vote, which is between $x_1$ and $x_2$, every player votes for his preferred alternative according to the table. Players 1 and 3 vote for $x_1$, and Player 2 votes for $x_2$. In this vote, $x_1$ beats $x_2$. Now consider the preceding vote, between $x_0$ and $x_2$. Now, everyone knows that if $x_2$ wins, in the next round $x_1$ will be implemented. Hence, a vote for $x_2$ is simply a vote for $x_1$. Hence, in the backward induction, the final vote is replaced with its winner, namely $x_1$. Those who prefer $x_0$ to $x_1$, who are Players 1 and 2, vote for $x_0$, and the other player, who prefers $x_1$ to $x_0$, votes for $x_2$. In this vote, $x_0$ wins.
Now consider the right branch. In the last round, between $x_0$ and $x_2$, Player 1 votes for $x_0$, and Players 2 and 3 vote for $x_2$, resulting in the winning of $x_2$. Hence, in the backward induction, the last round is replaced by $x_2$. In the previous vote between $x_1$ and $x_2$, if $x_1$ wins it is implemented, and if $x_2$ wins it will be implemented (after defeating $x_0$, which will happen). Then, each player votes according to his true preference: Players 1 and 3 for $x_1$, and Player 2 for $x_2$. Alternative $x_1$ wins. Therefore, on the right branch, $x_1$ wins.
Finally, at the very first vote, between $x_0$ and $x_1$, the players know that the winning alternative will be implemented in the future. Hence, everybody votes according to his original preferences, and $x_0$ wins.
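Backward induction on a binary agenda ("sophisticated voting") is also easy to automate. In the sketch below (the encoding is my own), an agenda is either a single alternative or a pair of sub-agendas voted against each other, and each voter compares the eventual outcomes of the two branches; with the preferences from the table above, it reproduces the winner x0.

```python
# Preference orderings from the table: higher-ranked alternatives first.
prefs = {1: ["x0", "x1", "x2"],
         2: ["x2", "x0", "x1"],
         3: ["x1", "x2", "x0"]}

def solve(agenda):
    """Winner of a binary agenda under sophisticated (backward-induction) voting."""
    if isinstance(agenda, str):                 # a single alternative
        return agenda
    left, right = solve(agenda[0]), solve(agenda[1])
    votes_for_left = sum(1 for p in prefs.values() if p.index(left) < p.index(right))
    return left if votes_for_left >= 2 else right   # majority of three voters

# Figure 10.1's agenda: first x0 vs x1; the winner meets x2; if x2 wins that
# vote, it then meets the loser of the first vote.
agenda = (("x0", ("x2", "x1")),     # continuation if x0 wins the first vote
          ("x1", ("x2", "x0")))     # continuation if x1 wins the first vote
print(solve(agenda))   # x0
```

Each recursive call replaces a vote by its winner, exactly as in the truncation procedure described above.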
An interesting phenomenon is the killer amendment, also called a poison pill. Suppose that we have a bill $x_1$ that is preferred by a majority of the legislators to the status quo, $x_0$. If the bill is voted against the status quo, it will pass. A poison pill or a killer amendment is an amendment $x_2$ that is worse than the status quo, $x_0$, according to a majority. Recall that the amendment $x_2$ is first voted against the bill $x_1$, and the winner is finally voted against the status quo $x_0$. If the amendment passes, then it will fail in the last round, and the status quo will be kept. Hence the term killer amendment.
Note that according to backward induction, a killer amendment is defeated in the first round (assuming that a majority prefers $x_1$ to $x_0$). This is because if $x_2$ defeats $x_1$, in the next round $x_0$ will be implemented. Hence, in the first round a vote for $x_2$ is a vote for the status quo, $x_0$. Then, the players who prefer the status quo, $x_0$, to the bill will vote for the amendment, $x_2$, and the players who prefer the bill, $x_1$, to the status quo will vote for the bill. Since the latter group is a majority, $x_1$ defeats the amendment in the first round.

But poison pills and killer amendments are frequently introduced, and sometimes they defeat the original bill (and eventually are defeated by the status quo). A famous example is the DePew amendment to the 17th Amendment to the Constitution in 1912. Here, the 17th Amendment, $x_1$, required the senators to be elected by statewide popular vote. This bill was supported by the (Southern) Democrats and half of the Republicans, making up two thirds of the Congress. The DePew amendment, $x_2$, required that these elections be monitored by the federal government. Each Republican slightly prefers $x_2$ to $x_1$, so the proponent Republicans' ordering is $x_2 \succ x_1 \succ x_0$ and the opposing Republicans' ordering is $x_0 \succ x_2 \succ x_1$, where $x_0$ is the status quo. But federal oversight of the state elections was unacceptable to the Southern Democrats for obvious reasons: $x_1 \succ x_0 \succ x_2$. Notice that the "opposing Republicans" and the Democrats, who together make up about two thirds of the legislators, prefer the status quo to the DePew amendment. Hence, the DePew amendment is a killer amendment. According to our analysis it should be defeated in the first round, and the original bill, the 17th Amendment, should eventually pass. But this did not happen. The DePew amendment killed the bill.

Why does this happen? It would be too naive to think that a legislator is so myopic
that he cannot see one step ahead and fails to recognize a killer amendment. Sometimes,
legislators might not know the preferences of the other legislators. After all, these
preferences are elicited in these elections. In that case, the backward induction analysis
above is not valid and needs to be modified. Of course, in that case, an amendment may
defeat the bill (because of the proponents who think that it has enough support for an
eventual passage) but later be defeated in the final vote because of the lack of sufficient
support (which was not known in the first vote). But mostly, the killer amendments
are introduced intentionally, and the legislators have a clear idea about the preferences.
Even in that case, a killer amendment can pass, not because of the stupidity of the

proponents of the original bill, but because their votes against the amendment can be
exploited by their opponents in the upcoming elections when the voters are not informed
about the details of these bills.
The moral of the story is that it is not enough that your analysis is correct. You
must also be analyzing the correct game. You will learn the first task in the Game
Theory class; for the second, and more important, task of considering the correct game,
you need to look at the underlying facts of the situation.

10.2 Pre-trial Negotiations


Consider two players: a Plaintiff and a Defendant. The Plaintiff suffers a loss due to
the negligence of the Defendant, and he is suing her now. The court date is set at date
2 + 1. It is known that if they go to court, the Judge will order the Defendant to pay
  0 to the Plaintiff. But the litigation is very costly. For example, in the US, 95% of
cases are settled without going to court. In order to avoid the legal costs, the Plaintiff
and the Defendant are also negotiating an out of court settlement. The negotiation
follows the following protocol.

• At each date t ∈ {1, 3, . . . , 2n − 1}, if they have not yet settled, the Plaintiff offers
a settlement s_t,

• and the Defendant decides whether to accept or reject it. If she accepts, the game
ends with the Defendant paying s_t to the Plaintiff; the game continues otherwise.

• At dates t ∈ {2, 4, . . . , 2n}, the Defendant offers a settlement s_t,

• and the Plaintiff decides whether to accept the offer, ending the game with the
Defendant paying s_t to the Plaintiff, or to reject it and continue.

• If they do not reach an agreement at the end of period t = 2n, they go to court,
and the Judge orders the Defendant to pay J > 0 to the Plaintiff.

The Plaintiff pays his lawyer c_P for each day they negotiate and an extra c_P if they
go to court. Similarly, the Defendant pays her lawyer c_D for each day they negotiate
and an extra c_D if they go to court. Each party tries to maximize the expected amount
of money he or she has at the end of the game.
The backward induction analysis of the game is as follows. The payoff from going to
court for the Plaintiff is

J − c_P − 2n·c_P.

If he accepts the settlement offer s_{2n} of the Defendant at date 2n, his payoff will be

s_{2n} − 2n·c_P.

Hence, if s_{2n} > J − c_P, he must accept the offer, and if s_{2n} < J − c_P, he must reject
the offer. If s_{2n} = J − c_P, he is indifferent between accepting and rejecting the offer.
Assume that he accepts that offer, too.¹ To sum up, he accepts an offer s_{2n} if and only
if s_{2n} ≥ J − c_P.
What should the Defendant offer at date 2n? Given the behavior of the Plaintiff, her
payoff from offering s_{2n} is

−s_{2n} − 2n·c_D      if s_{2n} ≥ J − c_P,
−J − c_D − 2n·c_D    if s_{2n} < J − c_P.

This is because, if the offer is rejected, they will go to court. Notice that when s_{2n} =
J − c_P, her payoff is −J + c_P − 2n·c_D, and offering anything less would cause her to
lose c_P + c_D. Her payoff is plotted in Figure 10.2. Therefore, the Defendant offers

s_{2n} = J − c_P

at date 2n.
Now at date 2n − 1, the Plaintiff offers a settlement s_{2n−1} and the Defendant accepts
or rejects the offer. If she rejects the offer, she will get the payoff from settling for
s_{2n} = J − c_P at date 2n, which is

−J + c_P − 2n·c_D.

If she accepts the offer, she will get

−s_{2n−1} − (2n − 1)·c_D.
¹In fact, he must accept s_{2n} = J − c_P in equilibrium. For, if he doesn't accept it, the best response
of the Defendant will be empty, inconsistent with an equilibrium. (Any offer s_{2n} = J − c_P + ε with
ε > 0 will be accepted. But for any offer s_{2n} = J − c_P + ε, there is a better offer s_{2n} = J − c_P + ε/2,
which will also be accepted.)
[Figure: the Defendant's payoff U_D as a function of her offer s, dropping by c_P + c_D for offers below J − c_P.]

Figure 10.2: Payoff of Defendant from her offer at the last period

Hence, she will accept the offer if and only if the last expression is greater than or equal
to the previous one, i.e.,

s_{2n−1} ≤ J − c_P + c_D.

Then, the Plaintiff will offer the highest settlement acceptable to the Defendant:

s_{2n−1} = J − c_P + c_D.

In summary, since the Plaintiff is making an offer, he offers the settlement amount of the
next date plus the cost of negotiating one more day for the Defendant.
Let us apply backward induction one more step. At date 2n − 2, the Defendant
offers a settlement s_{2n−2} and the Plaintiff accepts or rejects the offer. If he rejects the
offer, he will get the payoff from settling for s_{2n−1} = J − c_P + c_D at date 2n − 1, which
is

s_{2n−1} − (2n − 1)·c_P = J − c_P + c_D − (2n − 1)·c_P.

If he accepts the offer, he will get

s_{2n−2} − (2n − 2)·c_P.

Hence, he will accept the offer if and only if the last expression is greater than or equal
to the previous one, i.e.,

s_{2n−2} ≥ s_{2n−1} − c_P = J − c_P + c_D − c_P.
Then, the Defendant offers the highest settlement acceptable to the Plaintiff:

s_{2n−2} = s_{2n−1} − c_P = J − c_P + c_D − c_P.

In summary, since the Defendant is making an offer, she offers the settlement amount
of the next date minus the cost of negotiating one more day for the Plaintiff.
Now the pattern is clear. At any odd date t, the Defendant accepts an offer s_t if and
only if s_t ≤ s_{t+1} + c_D, and the Plaintiff offers

s_t = s_{t+1} + c_D      (t is odd).

At any even date t, the Plaintiff accepts an offer s_t if and only if s_t ≥ s_{t+1} − c_P, and
the Defendant offers

s_t = s_{t+1} − c_P      (t is even).

The solution to the above difference equation is

s_t = J − c_P + (n − t/2)(c_D − c_P)                 if t is even,
s_t = J − c_P + (n − (t + 1)/2)(c_D − c_P) + c_D     if t is odd.
Recall from the lecture that the solution is substantially different if the order of the
proposers is changed (see the slides). This is because on the last day the cost of delaying
the agreement is quite high (the cost of going to court), and the party who accepts or
rejects the offer is willing to accept a wide range of offers. Hence, the last proposer has
a great advantage.
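The difference equation and its solution can also be checked numerically. The following sketch (parameter names and values are illustrative, not from the text) computes the offers by backward induction and compares them with the closed form above.

```python
# Backward-induction settlement offers in the pre-trial negotiation game.
# Illustrative sketch: J is the court award, cP and cD the daily lawyer
# fees of the Plaintiff and the Defendant, and the court date is 2n + 1.

def settlement_offers(J, cP, cD, n):
    """Return {t: s_t} for t = 1, ..., 2n, computed backwards."""
    s = {2 * n: J - cP}                # the Defendant's last offer
    for t in range(2 * n - 1, 0, -1):
        if t % 2 == 1:                 # Plaintiff proposes at odd dates
            s[t] = s[t + 1] + cD
        else:                          # Defendant proposes at even dates
            s[t] = s[t + 1] - cP
    return s

def closed_form(J, cP, cD, n, t):
    if t % 2 == 0:
        return J - cP + (n - t / 2) * (cD - cP)
    return J - cP + (n - (t + 1) / 2) * (cD - cP) + cD

J, cP, cD, n = 100.0, 2.0, 3.0, 5
s = settlement_offers(J, cP, cD, n)
assert all(abs(s[t] - closed_form(J, cP, cD, n, t)) < 1e-9 for t in s)
```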

10.3 Sequential Bargaining


Imagine that two players own a dollar, which they can use only after they decide how
to divide it. Each player is risk-neutral and discounts the future exponentially. That
is, if a player gets x dollars at day t, his payoff is δ^t x for some δ ∈ (0, 1). The set of all
feasible divisions is D = {(x, y) ∈ [0, 1]² | x + y ≤ 1}. The players are bargaining over the
division of the dollar by making offers and counteroffers, as will become clear momentarily.
We want to apply backward induction to this game in order to understand when the
parties will reach an agreement and what the terms of the agreement will be.
First consider the following simplified model with only two rounds of negotiations. On
the first day, Player 1 makes an offer (x_1, y_1) ∈ D. Then, knowing what has been offered,
Player 2 accepts or rejects the offer. If he accepts the offer, the offer is implemented,
yielding payoffs (x_1, y_1). If he rejects the offer, then they wait until the next day, when
Player 2 makes an offer (x_2, y_2) ∈ D. Now, knowing what Player 2 has offered, Player
1 accepts or rejects the offer. If Player 1 accepts the offer, the offer is implemented,
yielding payoffs (δx_2, δy_2). If Player 1 rejects the offer, then the game ends, they
lose the dollar, and both get payoff 0.

The backward induction analysis of this simplified model is as follows. On the second
day, if Player 1 rejects the offer, he gets 0. Hence, he accepts any offer that gives him
more than 0, and he is indifferent between accepting and rejecting any offer that gives
him 0. As we have seen in the previous section, he accepts the offer (0, 1) in equilibrium.
Then, on the second day, Player 2 would offer (0, 1), which is the best Player 2 can get.
Therefore, if they do not agree on the first day, then Player 2 takes the entire dollar on
the second day, leaving Player 1 nothing. The value of taking the dollar on the next
day for Player 2 is δ. Hence, on the first day, Player 2 accepts any offer that gives him
more than δ, rejects any offer that gives him less than δ, and he is indifferent between
accepting and rejecting any offer that gives him δ. As above, assume that Player 2
accepts the offer (1 − δ, δ). In that case, Player 1 offers (1 − δ, δ), which is accepted.
Could Player 1 receive more than 1 − δ? If he offered anything better than 1 − δ
for himself, his offer would necessarily give less than δ to Player 2, and Player 2 would
reject the offer. In that case, the negotiations would continue to the next day and he
would receive 0, which is clearly worse than 1 − δ.

Now, consider the game in which the game above is repeated n times. That is, if
they have not yet reached an agreement by the end of the second day, on the third day,
Player 1 makes an offer (x_3, y_3) ∈ D. Then, knowing what has been offered, Player 2
accepts or rejects the offer. If he accepts the offer, the offer is implemented, yielding
payoffs (δ²x_3, δ²y_3). If he rejects the offer, then they wait until the next day, when
Player 2 makes an offer (x_4, y_4) ∈ D. Now, knowing what Player 2 has offered, Player
1 accepts or rejects the offer. If Player 1 accepts the offer, the offer is implemented,
yielding payoffs (δ³x_4, δ³y_4). If Player 1 rejects the offer, then they go to the 5th day...
And this goes on like this until the end of day 2n. If they have not yet agreed by the
end of that day, the game ends, they lose the dollar, and get payoffs (0, 0).

Application of backward induction to this game results in the following strategy
profile. At any day t = 2n − 2k (k a non-negative integer), Player 1 accepts any offer
(x, y) with

x ≥ δ(1 − δ^{2k})/(1 + δ)

and rejects any offer (x, y) with

x < δ(1 − δ^{2k})/(1 + δ).

Player 2 offers

(x_t, y_t) = (δ(1 − δ^{2k})/(1 + δ), 1 − δ(1 − δ^{2k})/(1 + δ)) = (δ(1 − δ^{2k})/(1 + δ), (1 + δ^{2k+1})/(1 + δ)).

And at any day t − 1 = 2n − 2k − 1, Player 2 accepts an offer (x, y) iff

y ≥ δ(1 + δ^{2k+1})/(1 + δ),

and Player 1 offers

(x_{t−1}, y_{t−1}) = (1 − δ(1 + δ^{2k+1})/(1 + δ), δ(1 + δ^{2k+1})/(1 + δ)) = ((1 − δ^{2k+2})/(1 + δ), δ(1 + δ^{2k+1})/(1 + δ)).

We can prove that this is indeed the equilibrium given by backward induction using
mathematical induction on k. (That is, we first prove that it is true for k = 0; then,
assuming that it is true for some k − 1, we prove that it is true for k.)
Proof. Note that for k = 0, we have the last two periods, identical to the 2-period
example we analyzed above. Letting k = 0, we can easily check that the behavior
described here is the same as the equilibrium behavior in the 2-period game. Now,
assume that, for some k − 1, the equilibrium is as described above. That is, at the
beginning of date t + 1 := 2n − 2(k − 1) − 1 = 2n − 2k + 1, Player 1 offers

(x_{t+1}, y_{t+1}) = ((1 − δ^{2(k−1)+2})/(1 + δ), δ(1 + δ^{2(k−1)+1})/(1 + δ)) = ((1 − δ^{2k})/(1 + δ), δ(1 + δ^{2k−1})/(1 + δ)),

and his offer is accepted. At date t = 2n − 2k, Player 1 accepts an offer iff the offer
is at least as good as having (1 − δ^{2k})/(1 + δ) the next day, which is worth
δ(1 − δ^{2k})/(1 + δ). Therefore, he accepts an offer (x, y) iff

x ≥ δ(1 − δ^{2k})/(1 + δ),
as in the strategy profile above. In that case, the best Player 2 can do is to offer

(x_t, y_t) = (δ(1 − δ^{2k})/(1 + δ), 1 − δ(1 − δ^{2k})/(1 + δ)) = (δ(1 − δ^{2k})/(1 + δ), (1 + δ^{2k+1})/(1 + δ)).

This is because any offer that gives Player 2 more than y_t will be rejected, in which case
Player 2 will get

δ·y_{t+1} = δ²(1 + δ^{2k−1})/(1 + δ).

In summary, at t, Player 2 offers (x_t, y_t), and it is accepted. Consequently, at t − 1,
Player 2 accepts an offer (x, y) iff

y ≥ δ·y_t = δ(1 + δ^{2k+1})/(1 + δ).

In that case, at t − 1, Player 1 offers

(x_{t−1}, y_{t−1}) = (1 − δ·y_t, δ·y_t) = ((1 − δ^{2k+2})/(1 + δ), δ(1 + δ^{2k+1})/(1 + δ)),

completing the proof.


Now, let n → ∞. At any odd date t, Player 1 will offer

(x^∞, y^∞) = lim_{k→∞} ((1 − δ^{2k+2})/(1 + δ), δ(1 + δ^{2k+1})/(1 + δ)) = (1/(1 + δ), δ/(1 + δ)),

and at any even date t, Player 2 will offer

(x^∞, y^∞) = lim_{k→∞} (δ(1 − δ^{2k})/(1 + δ), (1 + δ^{2k+1})/(1 + δ)) = (δ/(1 + δ), 1/(1 + δ)).

The offers are barely accepted.
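These formulas are easy to verify numerically. The sketch below (assuming the reconstructed setup: 2n rounds, discount factor δ, and the dollar lost after a final rejection) computes the current proposer's equilibrium share by backward induction and checks it against the closed form and the n → ∞ limit.

```python
# Proposer's share in the 2n-round alternating-offer bargaining game.

def proposer_share(rounds_left, delta):
    """Share the current proposer keeps, computed by backward induction;
    with one round left the proposer keeps everything, since rejection
    then yields (0, 0)."""
    share = 1.0
    for _ in range(rounds_left - 1):
        # The responder could propose next round and keep `share`,
        # worth delta * share today; the proposer keeps the rest.
        share = 1.0 - delta * share
    return share

delta, n = 0.9, 30
x1 = proposer_share(2 * n, delta)                    # Player 1's share on day 1
assert abs(x1 - (1 - delta ** (2 * n)) / (1 + delta)) < 1e-12
assert abs(x1 - 1 / (1 + delta)) < delta ** (2 * n)  # close to the limit share
```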

10.4 Exercises with Solutions


1. Consider two agents {1, 2} owning one dollar which they can use only after they
divide it. Each player's utility of getting x dollars at date t is δ^t x for δ ∈ (0, 1). Given
any n > 0, consider the following n-period symmetric, random bargaining model.
At any date t ∈ {0, 1, . . . , n − 1}, we toss a fair coin; if it comes Head (which
comes with probability 1/2), we select Player 1; if it comes Tail, we select Player
2. The selected player makes an offer (x, y) ∈ [0, 1]² such that x + y ≤ 1. Knowing
what has been offered, the other player accepts or rejects the offer. If the offer
(x, y) is accepted, the game ends, yielding payoff vector (δ^t x, δ^t y). If the offer
is rejected, we proceed to the next date, when the same procedure is repeated,
except for t = n − 1, after which the game ends, yielding (0, 0). The coin tosses
at different dates are stochastically independent. And everything described up to
here is common knowledge.

(a) Compute the subgame perfect equilibrium for n = 1. What is the value
of playing this game for a player? (That is, compute the expected utility of
each player before the coin toss, given that they will play the subgame-perfect
equilibrium.)

Solution: If a player rejects an offer, he will get 0; hence he will accept
any offer that gives him at least 0. (He is indifferent between accepting
and rejecting an offer that gives him exactly 0, but rejecting such an offer
is inconsistent with an equilibrium.) Hence, the selected player offers 0 to
his opponent, taking the entire dollar for himself, and his offer is accepted.
Therefore, in any subgame perfect equilibrium, the outcome is (1, 0) if the coin
comes Head, and (0, 1) if it comes Tail. The expected payoffs are

V = (1/2)(1, 0) + (1/2)(0, 1) = (1/2, 1/2).

(b) Compute the subgame perfect equilibrium for n = 2. Compute the expected
utility of each player before the first coin toss, given that they will play the
subgame-perfect equilibrium.

Solution: In equilibrium, on the last day, they will act as in part (a). Hence,
on the first day, if a player rejects the offer, the expected payoff of each player
will be δ · (1/2) = δ/2. Thus, he will accept an offer if and only if it gives
him at least δ/2. Therefore, the selected player offers δ/2 to his opponent,
keeping 1 − δ/2 for himself, which is more than δ/2, his expected payoff if his
offer is rejected. Therefore, in any subgame perfect equilibrium, the outcome
is (1 − δ/2, δ/2) if the coin comes Head, and (δ/2, 1 − δ/2) if it comes Tail. The
expected payoff of each player before the first coin toss is

(1/2)(1 − δ/2) + (1/2)(δ/2) = 1/2.
(c) What is the subgame perfect equilibrium for n ≥ 3?

Solution: Part (b) suggests that, if the expected payoff of each player at the
beginning of date t + 1 is δ^{t+1}/2, the expected payoff of each player at the
beginning of t will be δ^t/2. [Note that in terms of date-t dollars these numbers
correspond to δ/2 and 1/2, respectively.] Therefore, the equilibrium is as follows:
at any date t < n − 1, the selected player offers δ/2 to his opponent, keeping
1 − δ/2 for himself, and his opponent accepts an offer iff he gets at least δ/2;
at date n − 1, a player accepts any offer, hence the selected player offers
0 to his opponent, keeping 1 for himself. [You should be able to prove this
using mathematical induction and the argument in part (b).]
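The claim that the game is worth 1/2 to each player at every date can also be checked numerically. A minimal sketch, with assumed names and values:

```python
# Value of the random-proposer bargaining game at each date, in
# current (date-t) dollars, computed by backward induction.

def value_per_date(delta, n):
    values = [0.0] * n
    v_next = 0.0                 # continuation value after date n - 1: dollar lost
    for t in range(n - 1, -1, -1):
        offer = delta * v_next   # the responder accepts iff offered at least this
        # Each player proposes with probability 1/2 (keeping 1 - offer)
        # and responds with probability 1/2 (receiving offer).
        values[t] = 0.5 * (1 - offer) + 0.5 * offer
        v_next = values[t]
    return values

vals = value_per_date(delta=0.8, n=6)
assert all(abs(v - 0.5) < 1e-12 for v in vals)
```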

2. [Midterm 1, 2002] Consider two players A and B, who own a firm and want to
dissolve their partnership. Each owns half of the firm. The values of the firm for
players A and B are v_A and v_B, respectively, where v_A > v_B > 0. Player A sets a
price p for half of the firm. Player B then decides whether to sell his share or to
buy A's share at this price p. If B decides to sell his share, then A owns the firm
and pays p to B, yielding payoffs v_A − p and p for players A and B, respectively.
If B decides to buy, then B owns the firm and pays p to A, yielding payoffs p
and v_B − p for players A and B, respectively. All these are common knowledge.
Applying backward induction, find a Nash equilibrium of this game.

Solution: Given any price p, the best response of B is

buy          if v_B − p > p, i.e., if p < v_B/2;
sell         if p > v_B/2;
{buy, sell}  if p = v_B/2.

In equilibrium, B must be selling at price p = v_B/2. This is because, if he were
buying at that price, then the payoff of A as a function of p would be

p        if p ≤ v_B/2;
v_A − p  if p > v_B/2,
[Figure: A's payoff U_A(p) when B buys at p = v_B/2; the payoff jumps up just past v_B/2, so no price maximizes it.]

Figure 10.3:

which can be depicted as in Figure 10.3. Then, no price could maximize the payoff
of A, inconsistent with equilibrium (where A maximizes his payoff given what he
anticipates). Hence, the equilibrium strategy of B must be

buy   if p < v_B/2;
sell  if p ≥ v_B/2.

In that case, the payoff of A as a function of p would be

p        if p < v_B/2;
v_A − p  if p ≥ v_B/2,

which can be depicted as in Figure 10.4. This function is maximized at p = v_B/2.
Player A sets the price as p = v_B/2.

3. [Midterm 1, 2006] Paul has lost his left arm due to complications in a surgery. He
is suing the Doctor.

• The court date is set at date 2n + 1. It is known that if they go to court, the
judge will order the Doctor to pay J > 0 to Paul.

• They negotiate for a settlement before the court date. At each date t ∈ {1, 3, . . . , 2n − 1},
if they have not yet settled, Paul offers a settlement s_t, and the Doctor decides
whether to accept or reject it. If she accepts, the game ends with the Doctor
paying s_t to Paul; the game continues otherwise. At dates t ∈ {2, 4, . . . , 2n}, the
[Figure: A's payoff U_A(p) given B's equilibrium strategy, increasing up to p = v_B/2 and decreasing thereafter, so it is maximized at p = v_B/2.]

Figure 10.4:

Doctor offers a settlement s_t, and Paul decides whether to accept the offer,
ending the game with the Doctor paying s_t to Paul, or to reject it and continue.

• Paul pays his lawyer only a share of the money he gets from the Doctor. He
pays (1 − α)s_t if they settle at date t, and (1 − β)J if they go to court, where
0 < β < α < 1. The Doctor pays her lawyer c for each day they negotiate
and an extra c if they go to court.

• Each party tries to maximize the expected amount of money he or she has at
the end of the game.

(a) (10 pts) For n = 2, apply backward induction to find an equilibrium of this
game. (If you answer part (b) correctly, you don't need to answer this part.)

(b) (15 pts) For any n, apply backward induction to find an equilibrium of this
game.
Answer: At date 2n + 1, Paul gets J from the Doctor and pays (1 − β)J to his
lawyer, netting βJ. Now at date 2n, if he accepts s_{2n}, he will pay (1 − α)s_{2n}
to his lawyer, receiving αs_{2n}. Hence, he will accept s_{2n} iff s_{2n} ≥ (β/α)J. The
Doctor will offer

s_{2n} = (β/α)J

instead of going to court and paying J > (β/α)J to Paul and an extra c to
her lawyer. Now, at 2n − 1, the Doctor will accept s_{2n−1} iff s_{2n−1} ≤ (β/α)J + c,
as she would pay (β/α)J to Paul the next day and an extra c to her lawyer. Paul
will then offer

s_{2n−1} = (β/α)J + c,

as the settlement will be only (β/α)J the next day. He nets αs_{2n−1} = βJ + αc
for himself. Now, at 2n − 2, Paul will accept an offer s_{2n−2} iff s_{2n−2} ≥ s_{2n−1} =
(β/α)J + c, for he could settle for s_{2n−1} the next day. (Note that the offer gives
him αs_{2n−2} and rejection gives him αs_{2n−1}.) Therefore, the Doctor would offer him

s_{2n−2} = s_{2n−1}.

The pattern is now clear. At any odd date t, the Doctor accepts an offer iff
s_t ≤ s_{t+1} + c, and Paul offers

s_t = s_{t+1} + c      (t is odd).

At any even date t, Paul accepts an offer iff s_t ≥ s_{t+1}, and the Doctor offers

s_t = s_{t+1}      (t is even).

This much is more or less enough for an answer. To be complete, note that
the solution to the above equations is

s_t = (β/α)J + ((2n + 1 − t)/2)·c      if t is odd,
s_t = (β/α)J + ((2n − t)/2)·c          if t is even.

At the beginning, Paul offers s_1 = (β/α)J + nc, which is barely accepted by
the Doctor.
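The recursion and its closed-form solution can be checked with a short sketch (parameter values illustrative; alpha and beta are Paul's settlement and court shares, c the Doctor's daily fee):

```python
# Backward-induction offers in the Paul-vs-Doctor negotiation.

def offers(J, alpha, beta, c, n):
    s = {2 * n: (beta / alpha) * J}        # the Doctor's last offer
    for t in range(2 * n - 1, 0, -1):
        # Paul extracts the Doctor's daily cost c at odd dates; at even
        # dates the Doctor concedes nothing extra, since waiting one day
        # costs Paul nothing relative to the next settlement.
        s[t] = s[t + 1] + (c if t % 2 == 1 else 0.0)
    return s

J, alpha, beta, c, n = 200.0, 0.7, 0.4, 1.5, 4
s = offers(J, alpha, beta, c, n)
for t, v in s.items():
    k = (2 * n + 1 - t) // 2 if t % 2 == 1 else (2 * n - t) // 2
    assert abs(v - ((beta / alpha) * J + k * c)) < 1e-9
assert abs(s[1] - ((beta / alpha) * J + n * c)) < 1e-9   # Paul's first offer
```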
(c) (10 pts) Suppose now that with probability 1/2 the Judge may become sick
on the court date and a Substitute Judge decides the case in the court. The
Substitute Judge is sympathetic to doctors and will dismiss the case. In that
case, the Doctor does not pay anything to Paul. (With probability 1/2, the
Judge will order the Doctor to pay J to Paul.) How would your answer to
part (b) change?

Answer: The expected payment in the court is now

J′ = (1/2)·J + (1/2)·0 = J/2.
Hence, we simply replace J with J/2. That is,

s_t = (β/α)(J/2) + ((2n + 1 − t)/2)·c      if t is odd,
s_t = (β/α)(J/2) + ((2n − t)/2)·c          if t is even.

10.5 Exercises
1. [Final, 2000] Consider a legal case where a plaintiff files a suit against a defendant.
It is common knowledge that, if they go to court, the defendant will have to pay
$1,000,000 to the plaintiff and $100,000 to the court. The court date is set 10 days
from now. Before the court date the plaintiff and the defendant can settle privately,
in which case they do not go to court. Until the case is settled (whether in court
or privately), for each day the plaintiff and the defendant pay $2,000 and $1,000,
respectively, to their legal teams. To avoid all these costs the plaintiff and
the defendant are negotiating in the following way. On the first day, the plaintiff
demands an amount of money for the settlement. If the defendant accepts, then he
pays the amount and they settle. If he rejects, then he offers a new amount. If the
plaintiff accepts the offer, they settle for that amount; otherwise the next day the
plaintiff demands a new amount; and they alternate offers in this fashion until the
court day. Players are risk neutral and do not discount the future. Applying
backward induction, find a Nash equilibrium.

2. We have a Plaintiff and a Defendant, who is liable for damage done to the Plaintiff.
If they go to court, then with probability 0.1 the Plaintiff will win and get a
compensation of amount $100,000 from the Defendant; if he does not win, there
will be no compensation. Going to court is costly: if they go to court, each of the
Plaintiff and Defendant will pay $20,000 for the legal costs, independent of the
outcome in the court. Both the Plaintiff and the Defendant are risk-neutral, i.e.,
each maximizes the expected value of his wealth.

(a) Consider the following scenario: The Plaintiff first decides whether or not to
sue the Defendant, by filing a case and paying a non-refundable filing fee of
$100. If he does not sue, the game ends and each gets 0. If he sues, then
he is to decide whether or not to offer a settlement of amount $25,000. If
he offers a settlement, then the Defendant either accepts the offer, in which
case the Defendant pays the settlement amount to the Plaintiff, or rejects
the offer. If the Defendant rejects the offer, or the Plaintiff does not offer a
settlement, the Plaintiff can either pursue the suit and go to court, or drop
the suit. Applying backward induction, find a Nash equilibrium.

(b) Now imagine that the Plaintiff has already paid his lawyer $20,000 for the
legal costs, and the lawyer is to keep the money if they do not go to court.
That is, independent of whether or not they go to court, the Plaintiff pays the
$20,000 of legal costs. Applying backward induction, find a Nash equilibrium
under the new scenario.

3. [Homework 2, 2006] This question is about a TV game show called Deal or No Deal.
There are two players: Banker and Contestant. There are n cash prizes, p_1, . . . , p_n,
which are randomly put in n cases, 1, . . . , n. Each permutation is equally likely.
Neither player knows which prize is in which case. The Contestant owns Case
1. There are n − 1 periods. At each period, the Banker makes a cash offer b. The
Contestant is to accept ("Deal") or reject ("No Deal") the offer. If she accepts
the offer, the Banker buys the case from the Contestant at price b and the game
ends. (The Banker gets the prize in Case 1 minus b, and the Contestant gets b.) If she
rejects the offer, then one of the remaining cases is opened to reveal its content to
the players, and we proceed to the next period. When all the cases 2, . . . , n are
opened, the game automatically ends; the Banker gets 0 and the Contestant gets
the prize in Case 1. Assume that the utility of having x dollars is x for the Banker
and √x for the Contestant. Everything described is common knowledge.

(a) Apply backward induction to find an equilibrium of this game. (Assume that
the Contestant accepts the offer whenever she is indifferent between accepting
and rejecting the offer. Solving the special case in part (b) first may be helpful.)

(b) What would your answer be if n = 3, p_1 = 1, p_2 = 100, and p_3 = 10000?

4. [Midterm 1, 2007] [Read the bonus note at the end before you answer
the question.] This question is about arbitration, a common dispute-resolution
method in the US. We have a Worker, an Employer, and an Arbitrator. They
want to set the wage w. If they determine the wage w at date t, the payoffs of the
Worker, the Employer, and the Arbitrator will be δ^t w, δ^t (1 − w), and w(1 − w),
respectively, where δ ∈ (0, 1). The timeline is as follows:

• At t = 0,

— the Worker offers a wage w_0;
— the Employer accepts or rejects the offer;
— if she accepts the offer, then the wage is set at w_0 and the game ends;
otherwise we proceed to the next date;

• at t = 1,

— the Employer offers a wage w_1;
— the Worker accepts or rejects the offer;
— if he accepts the offer, then the wage is set at w_1 and the game ends;
otherwise we proceed to the next date;

• at t = 2, the Arbitrator sets a wage w_2 ∈ [0, 1] and the game ends.

Compute an equilibrium of this game using backward induction.

Bonus: If you solve the following variation instead, then you will get 10 extra
points (45 points instead of 35). Final Offer Arbitration: At t = 2, the
Arbitrator sets a wage w_2 ∈ {w_0, w_1}, i.e., the Arbitrator has to choose one of the
offers made by the parties.
Chapter 11

Subgame-Perfect Nash Equilibrium

Backward induction is a powerful solution concept with some intuitive appeal. Unfortunately,
it can be applied only to perfect-information games with a finite horizon. Its
intuition, however, can be extended beyond these games through subgame perfection.
This chapter defines the concept of subgame-perfect equilibrium and illustrates how one
can check whether a strategy profile is a subgame-perfect equilibrium.

11.1 Definition and Examples


An extensive-form game can contain a part that could be considered a smaller game in
itself; such a smaller game embedded in a larger game is called a subgame. A
main property of backward induction is that, when restricted to a subgame of the game,
the equilibrium computed using backward induction remains an equilibrium (computed
again via backward induction) of the subgame. Subgame perfection generalizes this
notion to general dynamic games:

Definition 11.1 A Nash equilibrium is said to be subgame perfect if and only if it is a
Nash equilibrium in every subgame of the game.

A subgame must be a well-defined game when it is considered separately. That is,

• it must contain an initial node, and

• all the moves and information sets from that node on must remain in the subgame.


1  2  1 
• • • (2,5)

  

(1,1) (0,4) (3,3)

Figure 11.1: A Centipede Game

Consider, for instance, the centipede game in Figure 11.1, where the equilibrium is
drawn in thick lines. This game has three subgames. One of them is:

[Subgame: Player 1 chooses between continuing, which yields (2,5), and exiting, which yields (3,3).]

Here is another subgame:

[Subgame: Player 2 moves, then Player 1; continuing twice yields (2,5), Player 2's exit yields (0,4), and Player 1's exit yields (3,3).]

The third subgame is the game itself. Note that, in each subgame, the equilibrium
computed via backward induction remains an equilibrium of the subgame.
Any subgame other than the entire game itself is called proper.
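Backward induction on the centipede game of Figure 11.1 can be written as a short recursion. A sketch, using a hypothetical tree encoding of my own ('d' for decision nodes, 't' for the terminal payoff):

```python
# Backward induction on the centipede game of Figure 11.1. A decision
# node is ('d', player, exit_payoffs, continuation); the terminal node
# after three "continue" moves is ('t', (2, 5)).

def solve(node):
    """Return (payoff vector, planned move at each decision node)."""
    if node[0] == 't':
        return node[1], []
    _, player, exit_pay, cont = node
    cont_pay, cont_moves = solve(cont)
    if exit_pay[player] >= cont_pay[player]:   # exit when at least as good
        return exit_pay, ['exit'] + cont_moves
    return cont_pay, ['continue'] + cont_moves

centipede = ('d', 0, (1, 1),                   # Player 1 (index 0) moves first
             ('d', 1, (0, 4),
              ('d', 0, (3, 3),
               ('t', (2, 5)))))

payoffs, moves = solve(centipede)
assert payoffs == (1, 1)                       # Player 1 exits immediately
assert moves == ['exit', 'exit', 'exit']       # every node plans to exit
```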
Now consider the matching-pennies game with perfect information in Figure 3.4. This
game has three subgames: one after Player 1 chooses Head, one after Player 1 chooses
Tail, and the game itself. Again, the equilibrium computed through backward induction
is a Nash equilibrium in each subgame.

[Figure: Player 1 first chooses E or X; X ends the game with payoffs (2,6). After E, Player 1 chooses T or B and Player 2 simultaneously chooses L or R, with payoffs (T,L) = (0,1), (T,R) = (3,2), (B,L) = (−1,3), (B,R) = (1,5).]

Figure 11.2: An imperfect-information game

Now consider the game in Figure 11.2. One cannot apply backward induction in
this game because it is not a perfect-information game. One can compute the subgame-perfect
equilibrium, however. This game has two subgames: one starts after Player 1
plays E; the second one is the game itself. The subgame-perfect equilibria are computed
as follows. First compute a Nash equilibrium of the subgame; then, fixing the equilibrium
actions as they are (in this subgame) and taking the equilibrium payoffs in this subgame
as the payoffs for entering the subgame, compute a Nash equilibrium in the remaining
game.

The subgame has only one Nash equilibrium, as T dominates B and R dominates
L. In the unique Nash equilibrium, Player 1 plays T and Player 2 plays R, yielding the
[Figure: the subgame after E, with the equilibrium strategies T and R marked by thicker arrows.]

Figure 11.3: Equilibrium in the subgame. The strategies are in thicker arrows.

payoff vector (3,2), as illustrated in Figure 11.3. Given this, the game reduces to
one in which Player 1 chooses between E, yielding (3,2), and X, yielding (2,6).
Player 1 chooses E in this reduced game. Therefore, the subgame-perfect equilibrium is
as in Figure 11.4. First, Player 1 chooses E, and then they play (T, R) simultaneously.

[Figure: the full game with the subgame-perfect equilibrium marked: Player 1 plays E, and then (T, R) is played in the subgame.]

Figure 11.4: Subgame-perfect Nash equilibrium

The above example illustrates a technique to compute the subgame-perfect equilibria
in finite games:
[Figure: the full game with a non-subgame-perfect Nash equilibrium marked: Player 1 plays X, and (B, L) would be played in the subgame.]

Figure 11.5: A non-subgame-perfect Nash equilibrium

• Pick a subgame that does not contain any other subgame.

• Compute a Nash equilibrium of this subgame.

• Assign the payoff vector associated with this equilibrium to the starting node, and
eliminate the subgame.

• Iterate this procedure until a move is assigned at every contingency, when there
remains no subgame to eliminate.
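The procedure can be carried out mechanically on the game of Figure 11.2. A sketch (payoffs copied from the figure; helper names are mine):

```python
from itertools import product

# Payoffs (u1, u2) in the subgame reached after E: rows T/B, columns L/R.
sub = {('T', 'L'): (0, 1), ('T', 'R'): (3, 2),
       ('B', 'L'): (-1, 3), ('B', 'R'): (1, 5)}

def pure_nash(payoffs, rows=('T', 'B'), cols=('L', 'R')):
    """All pure-strategy Nash equilibria of a 2x2 game."""
    eqs = []
    for r, c in product(rows, cols):
        u1, u2 = payoffs[(r, c)]
        if (all(payoffs[(r2, c)][0] <= u1 for r2 in rows)
                and all(payoffs[(r, c2)][1] <= u2 for c2 in cols)):
            eqs.append((r, c))
    return eqs

eq = pure_nash(sub)
assert eq == [('T', 'R')]              # unique equilibrium of the subgame
sub_value = sub[eq[0]]                 # (3, 2): payoff for entering the subgame

# Reduced game: Player 1 chooses E (worth sub_value) or X (worth (2, 6)).
reduced = {'E': sub_value, 'X': (2, 6)}
choice = max(reduced, key=lambda a: reduced[a][0])
assert choice == 'E'                   # subgame-perfect play: E, then (T, R)
```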

As in backward induction, when there are multiple equilibria in the picked subgame,
one can choose any of the Nash equilibria, including one in mixed strategies. Every
choice of equilibrium leads to a different subgame-perfect Nash equilibrium in the original
game. By varying the Nash equilibria chosen for the subgames at hand, one can compute
all subgame-perfect Nash equilibria.

A subgame-perfect Nash equilibrium is a Nash equilibrium because the entire game
is also a subgame. The converse is not true: there can be a Nash equilibrium that is not
subgame-perfect. For example, the above game has the following equilibrium: Player 1
plays X in the beginning, and they would have played (B, L) in the proper subgame, as
illustrated in Figure 11.5. You should be able to check that this is a Nash equilibrium.
But it is not subgame-perfect: Player 2 plays a strictly dominated strategy in the proper
subgame.
[Figure: the same game with Player 1 moving only once, choosing among X (yielding (2,6)), T, and B; Player 2 then chooses L or R without observing Player 1's choice between T and B. The marked equilibrium is (X, L).]

Figure 11.6: A subgame-perfect Nash equilibrium

Sometimes the subgame-perfect equilibrium can be highly sensitive to the way we model
the situation. For example, consider the game in Figure 11.6. This is essentially the
same game as above. The only difference is that Player 1 makes his choices here at
once. One would have thought that such a modeling choice should not make a difference
in the solution of the game. It does make a huge difference for subgame-perfect Nash
equilibrium nonetheless. In the new game, the only subgame of this game is itself; hence
any Nash equilibrium is subgame-perfect. In particular, the non-subgame-perfect Nash
equilibrium of the game above is subgame-perfect here. In the new game, it is formally
written as the strategy profile (X, L) and takes the form indicated by the thicker
arrows in Figure 11.6. Clearly, one could have used the idea of sequential rationality
to solve this game. That is, by sequential rationality of Player 2 at her information
set, she must choose R. Knowing this, Player 1 must choose T. Therefore, subgame-perfect
equilibrium does not fully formalize the idea of sequential rationality. It does
yield reasonable solutions in many games, and it is widely used in game theory. It will
also be used in this course frequently. We will later consider some other, more refined
solution concepts that seem more reasonable.

11.2 Single-deviation Principle


It may be difficult to check whether a strategy profile is a subgame-perfect equilibrium
in infinite-horizon games, where some paths in the game can go forever without ending
the game. There is, however, a simple technique that can be used to check whether
a strategy profile is subgame-perfect in most games. This technique is called the
single-deviation principle.
I will first describe the class of games to which the principle applies. In a game, there
may be histories after which all the previous actions are known but the players may move
simultaneously. Such histories are called stages. For example, suppose that every day the
players play the battle of the sexes, knowing what each player has played on each previous
day. In that case, each day, after any history of play in the previous days, we have a stage
at which the players move simultaneously, and a new subgame starts. Likewise, in Figure
11.2, there are two stages: the first stage is where Player 1 moves alone, and the second
stage is where the players simultaneously play the 2×2 game. It is no coincidence that
there are also two subgames: each stage is the beginning of a subgame.
For another example, consider alternating-offer bargaining. At each round, at the
beginning of the round, the proposer knows all the previous offers, which have all been
rejected, and makes an offer. Hence, at the beginning we have a stage, where only the
proposer moves. Then, after the offer is made, the responder knows all the previous
offers, which have all been rejected, and the current offer that has just been made. This
is another stage, where only the responder moves. Therefore, in this game, each round
has two stages.
Such games are called multi-stage games.
In a multistage game, if two strategies prescribe the same behavior at all stages, then
they are identical strategies and yield the same payoff vector. Suppose that two strategies
are different, but they prescribe the same behavior for very, very long successive stages,
e.g., in bargaining they differ only after a billion rounds. Then, we would expect that
the two strategies yield very similar payoffs. If this is indeed the case, then we call
such games "continuous at infinity". (In this course, we will only consider games that
are continuous at infinity. For an example of a game that is not continuous at infinity
see Example 9.1.) The single-deviation principle applies to multistage games that are
continuous at infinity.

Single-deviation test Consider a strategy profile s∗. Pick any stage (after any
history of moves), and assume that we are at that stage. Pick also a player i who moves
at that stage. Fix all the other players' moves as prescribed by s∗, at the current stage
as well as in the remainder of the game. Fix also the moves of player i at all future
dates, but let his move at the current stage vary. Can we find a move at the current
stage that gives i a higher payoff than s∗ does, given all the moves that we have fixed?
If the answer is Yes, then s∗ fails the single-deviation test at that stage for player i.

Clearly, if s∗ fails the single-deviation test at any stage for any player i, then s∗
cannot be a subgame-perfect equilibrium. This is because s∗ does not induce a Nash
equilibrium in the subgame that starts at that stage: player i has an incentive to
deviate to the strategy according to which he plays the better move at the current stage
but follows s∗ in the remainder of the subgame. It turns out that in a multistage game
that is continuous at infinity, the converse is also true: if s∗ passes the single-deviation
test at every stage (after every history of previous moves) for every player, then it
is a subgame-perfect equilibrium.
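To make the test concrete, here is a sketch (my own construction, not from the text) that runs the single-deviation test mechanically on a small finite example: a twice-repeated Prisoners' Dilemma with assumed stage payoffs (C,C)→(5,5), (C,D)→(0,6), (D,C)→(6,0), (D,D)→(1,1), checking that the "always defect" profile passes at every stage for every player.

```python
# Single-deviation test on a twice-repeated Prisoners' Dilemma (assumed payoffs).
from itertools import product

PAYOFF = {('C', 'C'): (5, 5), ('C', 'D'): (0, 6),
          ('D', 'C'): (6, 0), ('D', 'D'): (1, 1)}
ACTIONS = 'CD'
T = 2                                        # two repetitions

def strategy(player, history):
    """The profile being tested: always defect, at every history."""
    return 'D'

def total_payoffs(history, current_profile):
    """Payoff vector if current_profile is played now and everyone
    follows strategy() in all later stages."""
    path = list(history) + [current_profile]
    while len(path) < T:
        h = tuple(path)
        path.append(tuple(strategy(i, h) for i in range(2)))
    totals = [0, 0]
    for prof in path:
        u = PAYOFF[prof]
        totals[0] += u[0]
        totals[1] += u[1]
    return totals

def passes_single_deviation_test():
    stages = [()] + [(prof,) for prof in product(ACTIONS, repeat=2)]
    for h in stages:                         # every stage, after every history
        prescribed = tuple(strategy(i, h) for i in range(2))
        base = total_payoffs(h, prescribed)
        for i in range(2):                   # every player, every one-shot deviation
            for a in ACTIONS:
                deviated = list(prescribed)
                deviated[i] = a
                if total_payoffs(h, tuple(deviated))[i] > base[i]:
                    return False
    return True

print(passes_single_deviation_test())        # -> True
```

Always-defect passes every check here; a profile such as "cooperate at the first stage, then defect" would fail the test at the initial stage, since deviating to D there raises the deviator's total payoff.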

Theorem 11.1 (Single-deviation Principle) In a multistage game that is continuous
at infinity, a strategy profile is a subgame-perfect Nash equilibrium if and only if it
passes the single-deviation test at every stage for every player.

This is a generalization of the fact that backward induction results in a Nash equi­
librium, as established in Proposition 9.1. For an illustration of the proof, see the proof
of Proposition 9.1; the proof of the general case considered in the theorem here is similar.
Example 9.1 illustrates that the single-deviation principle need not apply when the game
is not continuous at infinity. Since all the games considered in this course are continuous
at infinity, you do not need to worry about that possibility.

11.3 Application: Infinite-Horizon Bargaining


This section illustrates how to apply the single-deviation principle to the infinite-horizon
bargaining game with alternating offers. The game is the same as the one analyzed in
Section 10.3, except that there is no end date: if an offer is rejected, we always proceed
to the next date, at which the other player makes an offer. Note that the game is
continuous at infinity, for if two strategies prescribe the same behavior in the first t
periods, the payoff difference under the two strategies cannot exceed δ^t, which goes to
zero as t goes to ∞.

Recall that, when the game automatically ends after 2n̄ periods, at any date t the proposer
offers to take

   (1 − (−δ)^(2n̄−t+1)) / (1 + δ)

for himself and leave the remaining

   (δ + (−δ)^(2n̄−t+1)) / (1 + δ)

to the other player, and the other player accepts an offer if and only if his share is at
least as large as in this offer. As n̄ → ∞, the behavior converges to the following:

s∗: at each history where i makes an offer, offer to take 1/(1 + δ) and leave δ/(1 + δ)
   to the other player; at each history where i responds to an offer, accept the
   offer if and only if it gives i at least δ/(1 + δ).
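As a quick numerical sanity check (the discount factor value is my own assumption), the finite-horizon proposer share converges to the stationary share 1/(1 + δ) prescribed by s∗:

```python
# Convergence of the finite-horizon proposer share to 1/(1 + delta).
delta = 0.9    # assumed discount factor
t = 1          # current period
limit = 1 / (1 + delta)
for n in (1, 5, 50):                         # horizon: game ends after 2n periods
    share = (1 - (-delta) ** (2 * n - t + 1)) / (1 + delta)
    print(n, share)
print(limit)   # proposer's share under s*: 1/(1 + delta)
```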

We will now use the single-deviation principle to check that s∗ is a subgame-perfect
equilibrium. There are two kinds of stages: (i) a player i makes an offer, and (ii) a
player i responds to an offer.

First consider a stage as in (ii) at some date t [for an arbitrary history of previous
offers], where the current offer gives x ≥ δ/(1 + δ) to player i. Fix the strategy of player i
from this stage on as in s∗, i.e., from t + 1 on, player i accepts an offer iff his share is
at least δ/(1 + δ), and he offers δ/(1 + δ) to the other player whenever he is to make
an offer. Similarly, fix the strategy of player j from date t + 1 on as in s∗, so that at
t + 1 and thereafter j offers δ/(1 + δ) to i and accepts an offer if and only if j gets at
least δ/(1 + δ). According to the fixed behavior, at t + 1, i offers to take 1/(1 + δ) for
himself, leaving δ/(1 + δ) to j, and the offer is accepted; the payoff of i associated with
this outcome is

   δ^(t+1) · 1/(1 + δ) = δ^(t+1)/(1 + δ).

Now, according to s∗, at the current stage i is to accept the offer. This gives i the payoff
of

   δ^t x ≥ δ^(t+1)/(1 + δ).

If i deviates and rejects the offer, then according to the fixed behavior he gets only
δ^(t+1)/(1 + δ), so he has no incentive to deviate. Hence, s∗ passes the single-deviation
test at this stage for player i.

Now consider a stage as in (ii) at some date t [for an arbitrary history of previous offers],
where the current offer gives x < δ/(1 + δ) to player i. Fix the behavior of the players
at t + 1 and onwards as in s∗, so that, independently of what has happened so far, at
t + 1 player i offers to take 1/(1 + δ), which is accepted by j, yielding the payoff of
δ^(t+1)/(1 + δ) to i. According to s∗, player i is to reject the current offer and hence
obtain this payoff. If he deviates and accepts the offer, he gets

   δ^t x < δ^(t+1)/(1 + δ).

Therefore, he has no incentive to deviate at this stage, and s∗ passes the single-deviation
test at this stage.

Now consider a stage as in (i) at some date t [for an arbitrary history of previous offers].
Fix the moves of j at t and onwards as in s∗; fix also the moves of i at t + 1 and onwards
as in s∗. Given the fixed moves, if i offers j some y ≥ δ/(1 + δ), then the offer is
accepted, and i obtains the payoff of (1 − y) δ^t. If he offers y < δ/(1 + δ), then the
offer is rejected, and at t + 1 the players agree to a division in which i gets δ/(1 + δ).
In that case, the payoff of i is

   δ^(t+2)/(1 + δ).

The payoff of i as a function of y is as in Figure 11.7. According to s∗, at this stage
i offers δ/(1 + δ) to the other player, which yields δ^t/(1 + δ), and clearly any other
offer gives a lower payoff to i; he has no incentive to deviate at this stage. Therefore,
s∗ passes the single-deviation test at this stage. We have covered all possible stages, and
s∗ has passed the single-deviation test at every stage. Therefore, s∗ is a subgame-perfect
equilibrium.
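The three kinds of stages can also be checked numerically; the sketch below normalizes to t = 0 and uses an assumed value of δ:

```python
# Single-deviation checks for s*, normalized to t = 0; delta is assumed.
d = 0.8
thr = d / (1 + d)                  # threshold share delta/(1 + delta)

# (ii) offered x >= thr: accepting (x) beats rejecting (d/(1+d))
for x in (thr, 0.6, 0.9):
    assert x >= d / (1 + d)

# (ii) offered x < thr: rejecting (d/(1+d)) beats accepting (x)
for x in (0.0, thr - 0.01):
    assert x < d / (1 + d)

# (i) proposing: the best accepted offer y = thr yields 1/(1+d),
# which beats any rejected offer, worth only d**2/(1+d)
assert abs((1 - thr) - 1 / (1 + d)) < 1e-12
assert 1 / (1 + d) > d ** 2 / (1 + d)
print("s* passes all three checks")
```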

In this game, only one player moves at each stage. In the following lectures we
will study repeated games, where multiple players may move at a given stage. The
single-deviation principle will be very useful in those games as well.

Figure 11.7: The payoff of the proposer as a function of the offered share to the other
party

11.4 Exercises with Solutions


1. [Midterm 2, 2001] Compute all subgame-perfect Nash equilibria of the following
   game:

   [Extensive form: Player 2 first chooses X, which ends the game with payoffs
   (5/2, 5/2), or E; after E, Player 1 chooses L or R, and then Player 2 chooses
   l or r, with the payoffs given in the normal form below.]

   Solution: The only proper subgame starts after E. In normal form, this subgame
   can be written as

            l      r
      L   3, 3   0, 2
      R   2, 0   2, 2

   It has three Nash equilibria: (L, l), (R, r), and the mixed-strategy Nash
   equilibrium σ with σ1(L) = σ2(l) = 2/3. Since 3 > 5/2, (L, l) entices Player

Figure 11.8: [Player 1 chooses L, M, or R; after M, the players play a matching-pennies
subgame in which Player 1 chooses x or y and Player 2 chooses l or r; after L and R,
Player 2 chooses a or b in a single information set.]

Figure 11.9: [The reduced game after replacing the matching-pennies subgame with its
equilibrium value (3/2, 3/2): Player 1 chooses L, M, or R; after L and R, Player 2
chooses a or b.]

   2 to play E, resulting in an SPE in which Player 2 plays E and the subgame play
   is (L, l). Under (R, r), the subgame payoff vector is (2, 2), and since 2 < 5/2,
   Player 2 plays X; this is the second SPE. If one picks σ in the subgame, the
   expected payoff vector for the subgame is also (2, 2), and Player 2 plays X. Thus,
   in the third SPE, Player 2 plays X, and σ would have been played in the subgame
   otherwise.
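A computational cross-check of this solution, using the subgame payoffs as read off the figure (so it is only as reliable as that reading):

```python
# Exercise 1: solve the subgame, then Player 2's root choice vs. 5/2.
U1 = {('L', 'l'): 3, ('L', 'r'): 0, ('R', 'l'): 2, ('R', 'r'): 2}
U2 = {('L', 'l'): 3, ('L', 'r'): 2, ('R', 'l'): 0, ('R', 'r'): 2}

# pure Nash equilibria of the subgame
pure = [(a, b) for a in 'LR' for b in 'lr'
        if U1[(a, b)] == max(U1[(x, b)] for x in 'LR')
        and U2[(a, b)] == max(U2[(a, y)] for y in 'lr')]
print(pure)                                  # -> [('L', 'l'), ('R', 'r')]

# mixed equilibrium: p = Pr[L] makes Player 2 indifferent between l and r,
# q = Pr[l] makes Player 1 indifferent between L and R
p = (U2[('R', 'r')] - U2[('R', 'l')]) / (
    U2[('L', 'l')] - U2[('L', 'r')] - U2[('R', 'l')] + U2[('R', 'r')])
q = (U1[('R', 'r')] - U1[('L', 'r')]) / (
    U1[('L', 'l')] - U1[('R', 'l')] - U1[('L', 'r')] + U1[('R', 'r')])
print(p, q)                                  # both equal 2/3

# Player 2 plays E at the root only if the subgame is worth more than 5/2
for label, value_for_2 in [("(L,l)", 3), ("(R,r)", 2), ("mixed", 2)]:
    print(label, 'E' if value_for_2 > 5 / 2 else 'X')
```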

2. [Homework 2, 2002] Compute two subgame-perfect equilibria of the game in Figure 11.8.


   Solution: The only proper subgame starts after Player 1 plays M. The subgame
   is a matching-pennies game, which has a unique Nash equilibrium: each player
   puts equal weight on his two moves. The expected payoff vector in this equilibrium
   is (3/2, 3/2). After fixing the payoffs of the subgame this way, the game reduces
   to the game in Figure 11.9, which can be written in normal form as

             a          b
      M   3/2, 3/2   3/2, 3/2
      L    0, 0        1, 1
      R    3, 3        0, 0

   This game has no proper subgame. The pure-strategy Nash equilibria of the
   reduced game are (R, a) and (M, b). These yield subgame-perfect Nash equilibria
   in mixed strategies: Player 1 plays R (respectively M) and mixes 1/2 x + 1/2 y
   in the subgame, while Player 2 plays a (respectively b) and mixes 1/2 l + 1/2 r.
   The reduced game has yet another Nash equilibrium in which both players mix;
   this leads to a third subgame-perfect Nash equilibrium.

3. [Final 2002] Ashok and Beatrice would like to go on a date. They have two options:
   a quick dinner at Wendy's, or dancing at Pravda. Ashok first chooses where to go,
   and knowing where Ashok went, Beatrice also decides where to go. Ashok prefers
   Wendy's, and Beatrice prefers Pravda. A player gets 3 out of a date at his/her
   preferred place, 1 out of a date at his/her unpreferred place, and 0 if they end up
   at different places. All of this is common knowledge.

   (a) Find a subgame-perfect Nash equilibrium. Find also a non-subgame-perfect
   Nash equilibrium with a different outcome.

   ANSWER: SPE: Beatrice goes wherever Ashok goes, and Ashok goes to Wendy's;
   the outcome is that both go to Wendy's. Non-subgame-perfect Nash equilibrium:
   Beatrice goes to Pravda at any history, so Ashok goes to Pravda; the outcome is
   that both go to Pravda. This is not subgame perfect because it does not induce a
   Nash equilibrium in the subgame after Ashok goes to Wendy's.
(b) Modify the game a little bit: Beatrice does not automatically know where
Ashok went, but she can learn without any cost. (That is, now, without
knowing where Ashok went, Beatrice first chooses between Learn and Not-
Learn; if she chooses Learn, then she knows where Ashok went and then
decides where to go; otherwise she chooses where to go without learning
   where Ashok went. The payoffs depend only on where each player goes, as

Figure 11.10: [Ashok chooses Wendy's or Pravda; without observing his choice, Beatrice
chooses Learn or Don't. After Learn, she observes Ashok's choice and then picks a place;
after Don't, she picks a place without observing his choice. Payoffs: (3, 1) if both go to
Wendy's, (1, 3) if both go to Pravda, and (0, 0) if they go to different places.]

   before.) Find a subgame-perfect equilibrium of this new game in which the
   outcome is the same as the outcome of the non-subgame-perfect equilibrium
   in part (a). (That is, each player goes to the same place in these two
   equilibria.)

   ANSWER: The extensive-form game is as in Figure 11.10. Consider the
   strategy profile plotted in thicker arrows: Ashok plays Pravda, and Beatrice plays
   Don't and goes to Pravda; had she played Learn, she would have gone to
   Wendy's if Ashok played Wendy's and to Pravda if Ashok played Pravda. As in
   the non-subgame-perfect equilibrium, both go to Pravda at the end. In the new
   game, however, this is a subgame-perfect equilibrium. The only proper
   subgames are the two decision nodes at which Beatrice moves after learning
   where Ashok went, and she plays a best response at each of these nodes, yielding
   a Nash equilibrium in these little subgames. As in the original game, the strategy
   profile is a Nash equilibrium of the whole game. Therefore, it is a subgame-
   perfect Nash equilibrium.

4. [Midterm 2, 2007] The players in the following game are Alice, who is an MIT senior
   looking for a job, and Google. Alice has also received a wage offer w from Yahoo, but
   we do not consider Yahoo as a player. Alice and Google negotiate via
   alternating-offer bargaining, Alice offering at even dates t = 0, 2, 4, … and Google
   offering at odd dates t = 1, 3, …. When Alice makes an offer W, Google either
   accepts the offer, hiring Alice at wage W and ending the bargaining, or rejects
   the offer, in which case the negotiation continues. When Google makes an offer W, Alice

   • either accepts the offer W and starts working for Google at wage W, ending
     the game,

   • or rejects the offer W and takes Yahoo's offer, working for Yahoo at wage
     w and ending the game,

   • or rejects the offer W, in which case the negotiation continues.

   If the game continues to date t̄ ≤ ∞ without agreement, then the game ends with
   zero payoffs for both players. If Alice takes Yahoo's offer at t < t̄, then the payoff
   of Alice is δ^t w and the payoff of Google is 0, where δ ∈ (0, 1). If Alice starts
   working for Google at t < t̄ for wage W, then Alice's payoff is δ^t W and Google's
   payoff is δ^t (v − W), where

      δ²v < w < v.

   (Note that she cannot work for both Yahoo and Google.)

   (a) Compute the subgame-perfect equilibrium for t̄ = 4. (There are four rounds
   of bargaining.)
   ANSWER:

   • Consider t = 3. Alice will get W if she accepts Google's offer W, w if she
     takes Yahoo's offer, and 0 if she rejects and continues. Thus, she must accept
     W if W ≥ w and take Yahoo's offer otherwise. Given this, Google gets 0 if
     W < w and v − W if W ≥ w. Therefore, it must offer

        W3 = w.

   • Consider t = 2. Google will get v − W if it accepts an offer W by Alice,
     and v − W3 the next day if it rejects the offer. Hence Google must

        accept iff v − W ≥ δ(v − W3), i.e., W ≤ v(1 − δ) + δw.

     The best reply for Alice is to offer

        W2 = v(1 − δ) + δw.

   • [This is the most important step.] Consider t = 1 and Alice's decision.
     Alice will get W if she accepts Google's offer, w if she takes Yahoo's offer,
     and δW2 if she rejects and continues. One must check whether she prefers
     Yahoo's offer to continuing. Note that

        w > δW2 = δ(v(1 − δ) + δw) ⟺ w > δ(1 − δ)v/(1 − δ²) = δv/(1 + δ).

     Since w > δ²v > δv/(1 + δ), this implies that w > δW2. That is, Alice
     prefers Yahoo's offer to continuing, and hence she will never reject and
     continue. Therefore, at t = 1 she must behave as at t = 3: accept W if
     W ≥ w and take Yahoo's offer otherwise. Google then must offer W1 = w.

   • Consider t = 0. It must be obvious now that this is the same as t = 2:
     Google accepts iff W ≤ W2, and Alice offers

        W0 = W2 = v(1 − δ) + δw.

   (b) Take t̄ = ∞. Conjecture a subgame-perfect equilibrium and check that the
   conjectured strategy profile is indeed a subgame-perfect equilibrium.

   ANSWER: From part (a), it is easy to conjecture that the following is an SPE:

   s∗: At an odd date, Alice accepts an offer W iff W ≥ w, and otherwise takes
       Yahoo's offer; Google offers W_G = w. At an even date, Alice offers
       W_A = v(1 − δ) + δw, and Google accepts an offer W iff W ≤ W_A.

   Use the single-deviation principle to check that s∗ is indeed an SPE. There are
   four major cases to check:

   • Consider the case in which Alice is offered W.

     — Suppose that W ≥ w = W_G. Alice is supposed to accept and receive
       W today. If she deviates by rejecting W and taking Yahoo's offer, she
       gets w, which is not better than W. If she deviates by rejecting and
       continuing, she will offer W_A the next day, which will be accepted;
       the present value of this is δW_A = δ(v(1 − δ) + δw) < w ≤ W, i.e.,
       this deviation yields an even lower payoff.
     — Suppose that W < w = W_G. Alice is supposed to reject it and take
       Yahoo's offer, with payoff w. If she deviates by accepting W, she
       gets the lower payoff of W < w. If she deviates by rejecting and
       continuing, she will get W_A the next day, with the lower present value
       δW_A = δ(v(1 − δ) + δw) < w.

   • Consider a stage at which Google offers W. If W ≥ w, the offer is accepted,
     yielding a payoff of v − W to Google. If W < w, then Alice goes to Yahoo,
     with a payoff of 0 to Google. Therefore, the best response is to offer
     W = w, with payoff v − w > 0, as in s∗. There is no profitable (single)
     deviation.

   • Consider the case in which Google is offered W.

     — Suppose that W ≤ W_A. If Google deviates and rejects, it will pay w
       tomorrow, with present value δ(v − w) = v − W_A, which is not better
       than v − W.
     — Suppose that W > W_A. If Google deviates and accepts, it gets only
       v − W, while it would get the present value δ(v − w) = v − W_A by
       rejecting the offer.

   • Consider a stage at which Alice makes an offer. Google will accept iff
     W ≤ W_A. If she offers W > W_A, the offer is rejected, and she gets w the
     next day, with present value δw < W_A. Therefore, the best reply is to
     offer W = W_A, and there is no profitable deviation.

   [In part (b), the most important cases are the acceptance/rejection cases,
   especially Alice's. Many students skipped those cases and wrongly con­
   cluded that a non-SPE profile is an SPE.]
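The acceptance/rejection comparisons can be confirmed numerically (parameter values are again my own assumptions):

```python
# Single-deviation checks for the conjectured SPE with an infinite horizon.
delta, v, w = 0.9, 1.0, 0.85
WA = v * (1 - delta) + delta * w    # Alice's equilibrium offer

# Alice offered w by Google: accepting (w) beats waiting (delta * WA)
assert w >= delta * WA

# Google offered WA by Alice: v - WA equals the value of rejecting,
# delta * (v - w), so accepting is (weakly) optimal
assert abs((v - WA) - delta * (v - w)) < 1e-9

# Alice offering above WA is rejected and worth only delta * w < WA
assert delta * w < WA
print("no profitable single deviation in any of the checked cases")
```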

5. Random-Proposer Model: Consider the n-player version of the game in Section
   11.3. The players again have one dollar to share, and each is risk neutral with
   discount factor δ, as before. The only difference is that the proposer is selected
   randomly: at each t, each player i is selected as the proposer with probability
   p_i, and the other players sequentially accept or reject in increasing order of
   their indices. The game ends if all the responders accept. Compute the
   subgame-perfect Nash equilibria that are stationary, in the sense that there
   exist divisions y^1, …, y^n such that each player i offers y^i = (y^i_1, …, y^i_n)
   whenever he is the proposer (and the offer is accepted).
   Solution: Write z_i for the expected share of player i before the proposer is
   selected:

      z_i = p_1 y^1_i + · · · + p_n y^n_i.

   At t, if a player j offers y^j = (y^j_1, …, y^j_n) and the offer is rejected, the
   payoff of i is δ^(t+1) z_i; his payoff is δ^t y^j_i if the offer is accepted. Hence,
   i accepts an offer y^j iff y^j_i ≥ δz_i. Hence the proposer j ≠ i offers y^j such
   that y^j_i = δz_i, and keeps y^j_j = 1 − δ Σ_{i≠j} z_i for himself. Substituting
   these values into z_i = p_1 y^1_i + · · · + p_n y^n_i, one obtains

      z_i = p_i y^i_i + (1 − p_i) δz_i
          = p_i (1 − δ Σ_{j≠i} z_j) + (1 − p_i) δz_i
          = p_i (1 − δ Σ_{j=1}^n z_j) + δz_i
          = p_i (1 − δ) + δz_i.

   Here, the first equality holds because all the other proposers offer the same
   share δz_i to i; the second equality is by substitution of the values; the third
   equality is by simple algebra; and the last equality uses the fact that the
   shares z_1 + · · · + z_n add up to 1. Solving for z_i, one obtains

      z_i = p_i.

   SPE: Each player j offers δp_i to every i ≠ j, keeping 1 − δ(1 − p_j) for
   himself, and accepts an offer y = (y_1, …, y_n) iff y_j ≥ δp_j.
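One can verify numerically that z_i = p_i is consistent with the stationary strategies (the discount factor and recognition probabilities below are my own example):

```python
# Random-proposer model: check that z = p is a stationary fixed point.
d = 0.95                            # assumed discount factor
p = [0.5, 0.3, 0.2]                 # assumed recognition probabilities
n = len(p)
z = p[:]                            # conjectured expected shares

for i in range(n):
    # proposer i keeps 1 - delta * (sum of the others' shares)
    keep = 1 - d * sum(z[j] for j in range(n) if j != i)
    # i's expected share: propose with prob p_i, otherwise receive delta*z_i
    expected = p[i] * keep + sum(p[j] * d * z[i] for j in range(n) if j != i)
    assert abs(expected - z[i]) < 1e-12
print("z_i = p_i is consistent with the stationary strategies")
```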

6. [Final 2007] Three senators, namely Alice, Bob, and Colin, are in a committee
   that determines the tax rate τ ∈ [0, 1]. Alice is a libertarian: her utility from
   setting the tax rate τ at date t is δ^t (1 − τ²). Bob is a moderate: his utility
   is δ^t (1 − (τ − τ̄)²), where τ̄ ∈ (0, 1) is a known constant. Colin is a liberal:
   his utility is δ^t (1 − (1 − τ)²). At each date, one of them randomly becomes
   the proposer, each with probability 1/3. The proposer offers a tax rate τ, and
   the other two vote Yes or No in alphabetical order. If at least one of them votes
   Yes, then the game ends and τ is set as the tax rate. If both say No, play
   continues to the next date.

   (a) Find a subgame-perfect equilibrium of this game. (Hint: there exists an SPE
   with values τ_A ≤ τ̄ ≤ τ_C such that Alice always offers τ_A, Bob always offers
   τ̄, and Colin always offers τ_C.)

   Answer: Construct an equilibrium as in the hint. Note that when Alice
   makes an offer, she needs the vote of Bob, because whenever Bob rejects
   Alice's offer, so does the more liberal Colin; and she does not need Colin to
   vote Yes. Hence, she will offer the lowest tax rate accepted by Bob, and that
   offer will make Bob indifferent between Yes and No. Similarly, Colin will
   make Bob indifferent between Yes and No. Write V for the expected value of
   Bob at the beginning of a date, before we know who the proposer is. If Bob
   says No, he gets δV; therefore, by indifference, his payoffs from the offers of
   Alice and Colin are also δV. Moreover, when he makes an offer, he offers τ̄,
   and it is accepted by one of the other two senators, yielding him a payoff of
   1. Therefore, his payoff at the beginning of the period is

      V = (2/3) δV + (1/3) · 1,

   and hence

      V = 1/(3 − 2δ).

   But he is indifferent between τ_A, τ_C, and the continuation payoff δV:

      1 − (τ_A − τ̄)² = 1 − (τ_C − τ̄)² = δ/(3 − 2δ),

   i.e.,

      (τ_A − τ̄)² = (τ_C − τ̄)² = 3(1 − δ)/(3 − 2δ).

   Therefore,

      τ_A = τ̄ − √(3(1 − δ)/(3 − 2δ)),
      τ_C = τ̄ + √(3(1 − δ)/(3 − 2δ)).

   In order to complete the description of the strategy profile, one also needs to
   determine which offers each senator accepts. Clearly, Bob accepts an offer τ
   if and only if τ ∈ [τ_A, τ_C]. The expected payoff of Alice at the beginning of
   a period is

      V_A = 1 − (τ_A² + τ̄² + τ_C²)/3,

   and she must accept an offer τ iff τ ≤ τ̂_A, where 1 − τ̂_A² = δV_A, i.e.,

      τ̂_A = √(1 − δ + δ(τ_A² + τ̄² + τ_C²)/3).

   Similarly, Colin accepts an offer τ iff τ ≥ τ̂_C, where

      τ̂_C = 1 − √(1 − δ + δ((1 − τ_A)² + (1 − τ̄)² + (1 − τ_C)²)/3)

   (which is obtained by replacing τ with 1 − τ). This completes the answer.
   [It can be checked that τ̂_A + (1 − τ̂_C) ≥ 1, so that at least one of Alice and
   Colin accepts τ̄. This and the usual single-deviation arguments are enough
   to verify that the above strategy profile is indeed an SPE. Also, the above
   solution assumes that τ_A ≥ 0 and τ_C ≤ 1; if they turn out to be out of
   bounds, one sets them to 0 and 1, respectively, and computes V accordingly.]
   (b) What happens as δ → 1? Briefly interpret.

   Answer: As δ → 1,

      τ_A → τ̄;  τ_C → τ̄;  τ̂_A → τ̄;  τ̂_C → τ̄.

   That is, in the limit, all players offer τ̄, and they accept an offer if and only
   if the offer is at least as good as τ̄. That is, the moderate senator's preferences
   dictate the outcome. (This is a version of the "median voter theorem" in
   political science. The "theorem" states that the preferences of the voter who
   is in the middle prevail. This emerges formally in models as in the example
   here.)
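A small numerical sketch of the offers and the δ → 1 limit (the δ and τ̄ values are my own choices):

```python
import math

# Stationary SPE offers for the three senators; Bob's ex-ante value is
# V = 1/(3 - 2*delta). For small delta, tau_A may fall below 0, in which
# case the bracketed remark above about corner values applies.
def offers(d, tau_bar):
    gap = math.sqrt(3 * (1 - d) / (3 - 2 * d))
    return tau_bar - gap, tau_bar + gap       # tau_A, tau_C

for d in (0.5, 0.9, 0.999):
    tau_A, tau_C = offers(d, tau_bar=0.4)
    # Bob indifferent between tau_A (or tau_C) and continuing: payoff d*V
    assert abs((1 - (tau_A - 0.4) ** 2) - d / (3 - 2 * d)) < 1e-12
    print(d, tau_A, tau_C)                    # the gap shrinks as d -> 1
```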

11.5 Exercises
1. [Homework 3, 2004] Compute the subgame-perfect Nash equilibria in Figure 11.11.

Figure 11.11: [extensive-form game; the terminal payoffs include (3, 3), (0, 0), (0, 0),
and (1, 1)]

Figure 11.12: [extensive-form game; Player 1 first chooses B or M, leading to subgames
with moves L, R, l, r, x, y, a, and b]

Figure 11.13: [extensive-form game; Player 1 first chooses U or D, followed by moves
L and R, A and B (or a and b), and further moves by Player 2]

2. [Homework 3, 2006] Compute the subgame-perfect Nash equilibria in Figure 11.12.

3. [Midterm 1, 2006] Compute all the subgame-perfect equilibria in pure strategies
   in Figure 11.13.

4. [Midterm 2 Make Up, 2011] Find all the subgame-perfect Nash equilibria of the
   following game.

   [Extensive-form game: the first move is between a and b, followed by moves L and
   R by Player 2, A and B (or A′ and B′) by Player 1, and l and r (or L′ and R′)
   by Player 2.]

5. [Homework 3, 2004] Find all subgame-perfect equilibria in the following game.
   Consider an employer and a worker. The employer provides the capital K (in
   terms of investment in technology, etc.) and the worker provides the labor L (in
   terms of investment in human capital) to produce the output f(K, L), which
   they share equally. The parties determine their investment levels (the employer's
   capital K and the worker's labor L) simultaneously. The worker cannot invest
   more than L̄, where L̄ is a very large number. Both capital and labor are costly,
   so that the payoffs of the employer and the worker are

      (1/2) f(K, L) − K

   and

      (1/2) f(K, L) − L²,

   respectively. So far the problem is the same as Exercise 1 in Section 8.5. The
   present problem differs as follows. Before the worker joins the firm (in which they
   simultaneously choose K and L), the worker is to choose between working for
   this employer or working for another employer, who pays the worker a constant
   wage w̃ > 0 and makes him work l̃ = √(w̃/2). (If he works for the other
   employer, the current employer gets 0.) Everything described up to here is common
   knowledge.

6. [Homework 3, 2006] Alice and Bob are competing to play a game against Casey.
   Alice and Bob simultaneously bid b_A and b_B, respectively. The one who bids
   higher wins; if b_A = b_B, the winner is determined by a coin toss. The winner pays
   his/her bid to Casey and plays the following game with Casey:

   Winner\Casey   L     R
   T             3,1   0,0
   B             0,0   1,3

   Find two pure-strategy subgame-perfect equilibria of this game. Which of the
   equilibria makes more sense to you?

7. [Midterm 1 Make Up, 2002] Consider the following game of coalition formation
   in a parliamentary system. There are three parties A, B, and C, which have just
   won 41, 35, and 25 seats, respectively, in a 101-seat parliament. In order to form
   a government, a coalition (a subset of {A, B, C}) needs 51 seats in total. The
   parties in the government enjoy a total of 1 unit of perks, which they can share in
   any way they want. The parties outside the government get 0 units of perks, and
   each party tries to maximize the expected value of its own perks. The process of
   coalition formation is as follows. First, A is given the right to form a government.
   If it fails, then B is given the right to form a government, and if B also fails,
   then C is given the right. If C also fails, then the game ends and each party gets
   0. The party that is given the right to form a government, say X, approaches one
   of the other two parties, say Y, and offers some x ∈ [0, 1]. If Y accepts, then they
   form the government, with X getting 1 − x and Y getting x units of perks. If Y
   rejects the offer, then X fails to form a government (in which case, as described
   above, either another party is given the right to form a government or the game
   ends with 0 payoffs). Applying backward induction, find a Nash equilibrium of
   this game.

8. [A variation of Final Make Up, 2002] Consider the following game between two
   firms. Firm 1 either stays out, in which case Firm 1 gets 2 and Firm 2 gets 3, or
   enters the market where Firm 2 operates. If it enters, then the firms simultaneously
   choose between two strategies: Hawk (an aggressive strategy) and Dove (a peaceful
   strategy). In this subgame, if one firm plays Hawk and the other plays Dove, then
   Hawk gets 3 and Dove gets 0; if both choose Hawk, then each gets −1; and if both
   play Dove, then each gets 1.

   (a) Compute the set of subgame-perfect Nash equilibria.

   (b) Which of the above equilibria is consistent with the assumption that Firm 2
   continues to believe that Firm 1 is rational at Firm 2's information set?

9. [Homework 3, 2004] Consider a two-player bargaining game with alternating offers,
   where the players try to divide a dollar (as in class). Assume that the discount
   factor of player i is δ_i ∈ (0, 1), where δ_1 ≠ δ_2. Using the single-deviation
   principle, check that the following is a subgame-perfect equilibrium: at any history
   where i makes an offer, he offers (1 − δ_j)/(1 − δ_1 δ_2) to himself, leaving the
   rest to the other player j, and at any history where he responds to an offer, he
   accepts the offer if and only if his share is at least δ_i (1 − δ_j)/(1 − δ_1 δ_2),
   where j ≠ i.

10. Verify that the equilibrium identified in the random-proposer model of the previous
    section is indeed a subgame-perfect equilibrium.

11. Can you find a different subgame-perfect equilibrium in the random-proposer
    model above?

12. [Final 2006] Alice and Bob own a dollar, which they need to share in order to
    consume. Alice makes an offer x ∈ X = {0.01, 0.02, …, 0.98, 0.99}; observing the
    offer, Bob accepts it or rejects it. If Bob accepts the offer, Alice gets 1 − x and
    Bob gets x. If he rejects, then each gets 0.

    (a) Compute all the subgame-perfect equilibria in pure strategies.

    (b) Now suppose that their cousin Carol sells a contract for $0.01. The contract
    requires that Bob pay 1 dollar to Carol if Bob accepts an offer x that
    is less than x̄, where x̄ ∈ X is chosen by Bob at the time of purchase of the
    contract. In particular, consider the following time-line:

    • Bob decides whether to buy a contract from Carol and determines x̄ if
      he chooses to buy;

    • Alice observes Bob's decision (i.e., whether he buys the contract, and x̄ if
      he buys);

    • then they play the bargaining game above, where Bob pays Carol 1
      dollar if he accepts an offer x < x̄.

    Find all the subgame-perfect equilibria in pure strategies.

    (c) In part (b), assume that Alice cannot observe whether Bob buys a contract
    (and, in particular, the value of x̄ if he buys). Find all the subgame-perfect
    equilibria in pure strategies.

    (d) In part (b), assume that Alice observes whether Bob buys a contract but does
    not observe the value of x̄ if he buys. Find all the subgame-perfect equilibria
    in pure strategies.
Chapter 12

Repeated Games

In real life, most games are played within a larger context, and actions in a given situation
affect not only the present situation but also the future situations that may arise. When
a player acts in a given situation, he takes into account not only the implications of his
actions for the current situation but also their implications for the future. If the players
are patient and the current actions have significant implications for the future, then the
considerations about the future may take over. This may lead to a rich set of behavior
that may seem to be irrational when one considers the current situation alone. Such
ideas are captured in repeated games, in which a "stage game" is played repeatedly.
The stage game is repeated regardless of what has been played in the previous rounds.
This chapter explores the basic ideas in the theory of repeated games and applies them
in a variety of economic problems. As it turns out, it is important whether the game is
repeated finitely or infinitely many times.

12.1 Finitely-repeated games


Let  = {0 1     } be the set of all possible dates. Consider a game in which at each
 ∈  players play a "stage game" , knowing what each player has played in the past.
Assume that the payoff of each player in this larger game is the sum of the payoffs that
he obtains in the stage games. Denote the larger game by  .
Note that a player simply cares about the sum of his payoffs at the stage games. Most
importantly, at the beginning of each repetition each player recalls what each player has


played in each previous play. A strategy then prescribes what the player plays at each t as
a function of the plays at dates 0, …, t − 1. More precisely, let us call the outcomes of the
previous stage games a history, which is a sequence (a^0, …, a^(t−1)). A strategy in
the repeated game prescribes a strategy of the stage game for each history (a^0, …, a^(t−1))
at each date t.
For example, consider a situation in which two players play the Prisoners’ Dilemma
game,

          C          D
  C      5, 5       0, 6          (12.1)
  D      6, 0       1, 1
twice. In that case, T = {0, 1} and G is the Prisoners' Dilemma game. The repeated
game, G^T, can be represented in extensive form as follows.

[Extensive form of the twice-repeated Prisoners' Dilemma: Player 1 chooses C or D, then Player 2 chooses C or D, and after each of the four first-round outcomes the stage game is played again; each terminal payoff is the sum of the two stage payoffs.]

Now at date t = 1, a history is a strategy profile of the Prisoners' Dilemma game,
indicating what has been played at t = 0. There are four histories at t = 1: (C, C),
(C, D), (D, C), and (D, D). A strategy describes what the player plays at t = 0 and
what he plays at each of these four histories. (There are 5 actions to be determined.)
This is rather clear in the extensive-form game above.
Let us compute the subgame-perfect equilibrium of G^T; the equilibrium is depicted
in the figure. G^T has four proper subgames, each corresponding to the last-round game

after a history of plays in the initial round. For example, after ( ) in the initial
round, we have subgame
1
C D
2

C D C D

10 5 11 6
10 11 5 6

where we add 5 to each player's payoffs, corresponding to the payoff that he gets from
playing (C, C) in the first round. Recall that adding a constant to a player's payoffs
does not change the preferences in a game, and hence the set of equilibria in this game
is the same as in the original Prisoners' Dilemma game, which possesses the unique Nash
equilibrium (D, D). This equilibrium is depicted in the figure. Likewise, in each proper
subgame, we add some constant to the players' payoffs, and hence we have (D, D) as
the unique Nash equilibrium in each of these subgames.
Therefore, the actions in the last round are independent of what is played in the
initial round. Hence, the players will ignore the future and play the game as if there is
no future game, each playing D. Indeed, given the behavior in the last round, the game
in the initial round reduces to

          C          D
  C      6, 6       1, 7
  D      7, 1       2, 2
where we add 1 to each player's payoffs, accounting for his payoff in the last round. The
unique equilibrium of this reduced game is (D, D). This leads to a unique subgame-perfect
equilibrium: at each history, each player plays D.
What would happen for arbitrary n? The answer remains the same. On the last
day, n, independent of what has been played in the previous rounds, there is a unique
Nash equilibrium for the resulting subgame: each player plays D. Hence, the actions
at day n − 1 do not have any effect on what will be played on the next day. Then, we
can consider the subgame as a separate game of the Prisoners' Dilemma. Indeed, the

reduced game for any subgame starting at n − 1 is

          C                                D
  C   5 + 1 + U_1, 5 + 1 + U_2      0 + 1 + U_1, 6 + 1 + U_2
  D   6 + 1 + U_1, 0 + 1 + U_2      1 + 1 + U_1, 1 + 1 + U_2

where U_i is the sum of the payoffs of player i from the previous plays at dates 0, . . . , n − 2. Here
we add U_i for these past payoffs and 1 for the last-round payoff, all of which are independent
of what happens at date n − 1. This is another version of the Prisoners' Dilemma, which
has the unique Nash equilibrium (D, D). Proceeding in this way all the way back to
date 0, we find that there is a unique subgame-perfect equilibrium: at each t and
for each history of previous plays, each player plays D.
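The backward-induction argument can be checked mechanically. The following sketch (in Python; not part of the original notes) iterates the reduction for the stage game (12.1): at every round the reduced game is the stage game plus a history-independent constant, so its unique Nash equilibrium stays (D, D).

```python
# Sketch: backward induction in the n-times repeated Prisoners' Dilemma (12.1).
# Adding a history-independent continuation constant leaves the stage-game
# Nash equilibria unchanged, so (D, D) is played at every round.

U = {('C', 'C'): (5, 5), ('C', 'D'): (0, 6),
     ('D', 'C'): (6, 0), ('D', 'D'): (1, 1)}

def pure_nash(u):
    """Pure-strategy Nash equilibria of a 2x2 bimatrix game."""
    acts = ['C', 'D']
    eqs = []
    for a1 in acts:
        for a2 in acts:
            best1 = all(u[(a1, a2)][0] >= u[(b, a2)][0] for b in acts)
            best2 = all(u[(a1, a2)][1] >= u[(a1, b)][1] for b in acts)
            if best1 and best2:
                eqs.append((a1, a2))
    return eqs

n = 5
cont = (0, 0)  # continuation payoffs from later rounds (same after every history)
for t in reversed(range(n)):
    reduced = {a: (U[a][0] + cont[0], U[a][1] + cont[1]) for a in U}
    eqs = pure_nash(reduced)
    assert eqs == [('D', 'D')]   # unique equilibrium at every round
    cont = reduced[eqs[0]]       # value of playing the unique SPE from here on

print(cont)  # SPE payoffs of the 5-times repeated game: (5, 5), i.e., five rounds of (D, D)
```

The invariant `cont` is the same after every history at a given date, which is exactly why the future drops out of the players' first-round incentives.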
That is to say, although there are many repetitions in the game and the stakes in
the future may be high, any plan of action other than playing D myopically everywhere
unravels, as players cannot commit to any plan of action in the last round. This is
indeed a general result.

Theorem 12.1 Let T be finite and assume that G has a unique subgame-perfect equi-
librium s*. Then, G^T has a unique subgame-perfect equilibrium, and according to this
equilibrium s* is played at each date, independent of the history of the previous plays.

The proof of this result is left as a straightforward exercise. The result can be
illustrated by another important example. Consider the following Entry-Deterrence
game, where an entrant (Player 1) decides whether to enter a market or not, and the
incumbent (Player 2) decides whether to fight or accommodate the entrant if he enters.

[Entry-Deterrence game: Player 1 chooses between X, with payoffs (0, 2), and Enter; after Enter, Player 2 chooses between Acc., with payoffs (1, 1), and Fight, with payoffs (−1, −1).]   (12.2)

Consider the game where the Entry-Deterrence game is repeated twice, and all the
previous actions are observed. This game is depicted in the following figure.

[Extensive form of the twice-repeated Entry-Deterrence game: after each of the three first-round outcomes (X; Enter, Acc.; Enter, Fight) the Entry-Deterrence game is played again, with the first-round payoffs added to each terminal payoff.]

As depicted in the extensive form, in the repeated game, at t = 1, there are three
possible histories: X, (Enter, Acc.), and (Enter, Fight). A strategy of Player 1 assigns
an action, which has to be either Enter or X, to be played at t = 0, and an action to be
played at t = 1 for each possible outcome at t = 0. In total, we need to determine 4
actions in order to define a strategy for Player 1; similarly for Player 2.
Note that after each outcome of the first play, the Entry-Deterrence game is
played again, where the payoff from the first play is added to each outcome. Since a
player's preferences do not change when we add a number to his utility function, each
of the three games played on the second "day" is the same as the stage game (namely,
the Entry-Deterrence game above). The stage game has a unique subgame-perfect
equilibrium, in which the entrant enters the market and the incumbent accommodates.
Hence, each of the three games played on the second day has only this
equilibrium as its subgame-perfect equilibrium. This is depicted in the following.

[The twice-repeated Entry-Deterrence game with the unique equilibrium (Enter, Acc.) marked in each second-day subgame.]

Using backward induction, we therefore reduce the game to the following.



[Reduced game: Player 1 chooses between X, with payoffs (1, 3), and Enter; after Enter, Player 2 chooses between Acc., with payoffs (2, 2), and Fight, with payoffs (0, 0).]

Notice that we simply added the unique subgame-perfect equilibrium payoff of 1
from the second day to each payoff in the stage game. Again, adding a constant to a
player's payoffs does not change the game, and hence the reduced game possesses the
subgame-perfect equilibrium of the stage game as its unique subgame-perfect equilibrium.
Therefore, the unique subgame-perfect equilibrium is as depicted below.

[The unique subgame-perfect equilibrium of the twice-repeated Entry-Deterrence game: the entrant enters and the incumbent accommodates at every history.]

This can be generalized to arbitrary T as above. All these examples show that in
certain important games, no matter how high the stakes in the future are, considerations
about the future will not affect the current actions, as the future outcomes do
not depend on the current actions. In the rest of the lectures we will show that these
are very peculiar examples. In general, in many subgame-perfect equilibria, patient
players will take a long-term view, and their decisions will be determined mainly by
future considerations.
Indeed, if the stage game has more than one equilibrium, then in the repeated game
we may have some subgame-perfect equilibria where, in some stages, players play some
actions that are not played in any subgame-perfect equilibrium of the stage game. This
is because the equilibrium to be played on the second day can be conditioned on the
play on the first day, in which case the "reduced game" for the first day is no longer
the same as the stage game, and thus may admit some different equilibria. I will now
illustrate this using an example from Gibbons. (See Exercises 1 and 2 at the end of the
chapter before proceeding.)
Take T = {0, 1} and let the stage game G be

          L          M          R
  L      1, 1       5, 0       0, 0
  M      0, 5       4, 4       0, 0
  R      0, 0       0, 0       3, 3

Notice that a strategy in this repeated game prescribes what the player plays at t = 0 and
what he plays at t = 1 conditional on the history of the play at t = 0. There are 9 such
histories, such as (L, L), (L, M), etc. A strategy of Player 1 is defined by determining
an action (L, M, or R) for t = 0, and determining an action for each of these histories at
t = 1. (There will be 10 actions in total.) Consider the following strategy profile:

Player 1: play M at t = 0; at t = 1, play R if (M, M) was played at t = 0, and
play L otherwise.

Player 2: play M at t = 0; at t = 1, play R if (M, M) was played at t = 0,
and play L otherwise.

According to this equilibrium, the players play (M, M) at t = 0 even though (M, M) is
not a Nash equilibrium of the stage game. Notice that in order for a strategy profile
to be subgame perfect, we must have a Nash equilibrium at t = 1 after each history.
Since (L, L) and (R, R) are both Nash equilibria of the stage game, this is in fact the
case. Given this behavior, the first-round game reduces to

          L          M          R
  L      2, 2       6, 1       1, 1
  M      1, 6       7, 7       1, 1
  R      1, 1       1, 1       4, 4

Here, we add 3 to the payoffs at (M, M), for it leads to (R, R) in the second round, and
add 1 to the payoffs at the other strategy profiles, for they lead to (L, L) in the second
round. Clearly, (M, M) is a Nash equilibrium of the reduced game, showing that the
above strategy profile is a subgame-perfect Nash equilibrium. In summary, players can
coordinate on different equilibria in the second round conditional on the behavior in the
first round, and the players may play non-equilibrium (or even irrational) strategies in
the first round, if those strategies lead to a better equilibrium later.
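This computation is easy to automate. A minimal sketch (mine; it assumes the payoff matrix above, and the function names are not from the notes) verifies that (M, M) fails the equilibrium test in the stage game but passes it in the reduced game:

```python
# Sketch: verifying the two-stage equilibrium in the Gibbons example.
# Continuation: (R, R), worth 3 to each player, rewards (M, M);
# (L, L), worth 1 to each, follows every other first-round outcome.

STAGE = {('L', 'L'): (1, 1), ('L', 'M'): (5, 0), ('L', 'R'): (0, 0),
         ('M', 'L'): (0, 5), ('M', 'M'): (4, 4), ('M', 'R'): (0, 0),
         ('R', 'L'): (0, 0), ('R', 'M'): (0, 0), ('R', 'R'): (3, 3)}
ACTS = ['L', 'M', 'R']

def is_nash(u, prof):
    """Check whether prof is a pure Nash equilibrium of the bimatrix game u."""
    a1, a2 = prof
    ok1 = all(u[prof][0] >= u[(b, a2)][0] for b in ACTS)
    ok2 = all(u[prof][1] >= u[(a1, b)][1] for b in ACTS)
    return ok1 and ok2

def reduced(cont):
    """First-round game with the continuation value cont(outcome) added."""
    return {a: (STAGE[a][0] + cont(a)[0], STAGE[a][1] + cont(a)[1])
            for a in STAGE}

R1 = reduced(lambda a: (3, 3) if a == ('M', 'M') else (1, 1))
print(is_nash(STAGE, ('M', 'M')))  # False: not an equilibrium of the stage game
print(is_nash(R1, ('M', 'M')))     # True: but it is one of the reduced game
```

Swapping in other continuation rules (e.g., (R, R) only after (M, L)) reproduces the reduced games discussed below.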
When there are multiple subgame-perfect Nash equilibria in the stage game, a large
number of outcome paths can arise in a subgame-perfect Nash equilibrium of the
repeated game, even if it is repeated just twice. But not every outcome path can result
from a subgame-perfect Nash equilibrium. In the following, I will illustrate why some
paths can and some paths cannot emerge in an equilibrium in the above example.
Can ((M, M), (M, M)) be an outcome of a subgame-perfect Nash equilibrium? The
answer is No. This is because in any Nash equilibrium, the players must play a Nash
equilibrium of the stage game in the last period on the path of equilibrium. Since (M, M)
is not a Nash equilibrium of the stage game, ((M, M), (M, M)) cannot emerge in any Nash
equilibrium, let alone in a subgame-perfect Nash equilibrium.
Can ((M, M), (L, L)) be an outcome of a subgame-perfect Nash equilibrium in pure
strategies? The answer is No. Although (L, L) is a Nash equilibrium of the stage game,
in a subgame-perfect Nash equilibrium, a Nash equilibrium of the stage game must
be played after every play in the first round. In particular, after (L, M), the play is
either (L, L) or (R, R), yielding a total of 6 or 8, respectively, for Player 1. Since he gets only 5
from ((M, M), (L, L)), he has an incentive to deviate to L in the first period. (What
about if we consider mixed subgame-perfect Nash equilibria or non-subgame-perfect
Nash equilibria?)
Can ((M, L), (R, R)) be an outcome of a subgame-perfect Nash equilibrium in pure
strategies? As must be clear from the previous discussion, the answer is Yes
if and only if (L, L) is played after every play of the first period except for (M, L). In that
case, the reduced game for the first period is

          L          M          R
  L      2, 2       6, 1       1, 1
  M      3, 8       5, 5       1, 1
  R      1, 1       1, 1       4, 4

Since (M, L) is indeed a Nash equilibrium of the reduced game, the answer is Yes. It is
the outcome of the following subgame-perfect Nash equilibrium: play (M, L) in the first
round; in the second round, play (R, R) if (M, L) is played in the first round and play
(L, L) otherwise.

As an exercise, check which other outcome paths can be the outcome of a
subgame-perfect Nash equilibrium in pure strategies (in the twice- and thrice-repeated
games, respectively).

12.2 Infinitely repeated games with observed actions


Now consider the infinitely repeated games where all the previous moves are common
knowledge at the beginning of each stage. That is, in the previous section take T =
{0, 1, 2, . . .}, the set of natural numbers, instead of T = {0, 1, . . . , n}. The game
continues indefinitely regardless of what players play along the way.
For the technically oriented students, the following must be noted. It is implicitly
assumed throughout the chapter that in the stage game, either the strategy sets are
all finite, or the strategy sets are convex subsets of a Euclidean space and the utility
functions are continuous in all strategies and quasiconcave in players' own strategies.

12.2.1 Present Value calculations


In an infinitely repeated game, one cannot simply add the payoffs of each stage, as
the sum becomes infinite. For these games, assume instead that players maximize the
discounted sum of the payoffs from the stage games. The present value of any given
payoff stream π = (π_0, π_1, . . . , π_t, . . .) is computed by

PV(π; δ) = Σ_{t=0}^∞ δ^t π_t = π_0 + δπ_1 + · · · + δ^t π_t + · · · ,

where δ ∈ (0, 1) is the discount factor. The average value is simply

(1 − δ)PV(π; δ) ≡ (1 − δ) Σ_{t=0}^∞ δ^t π_t.

Note that, for a constant payoff stream (i.e., π_0 = π_1 = · · · = π_t = · · · ), the average
value is simply the stage payoff (namely, π_0). The present and the average values can
also be computed with respect to the current date. That is, given any t, the present value at
t is

PV_t(π; δ) = Σ_{s=t}^∞ δ^{s−t} π_s = π_t + δπ_{t+1} + · · · + δ^{s−t} π_s + · · · ,

and the average value at t is (1 − δ)PV_t(π; δ). Clearly,

PV(π; δ) = π_0 + δπ_1 + · · · + δ^{t−1} π_{t−1} + δ^t PV_t(π; δ).

Hence, the analysis does not change whether one uses PV or PV_t, but using PV_t is often
simpler. In the repeated games considered here, each player maximizes the present value
of the payoff stream he gets from the stage games, which are played indefinitely.
Since the average value is simply a linear transformation of the present value, one can
also use average values instead of present values. Such a choice sometimes simplifies the
expressions without affecting the analyses.
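These formulas can be illustrated numerically. The sketch below (mine, not from the notes) truncates the infinite stream at a long finite horizon so that the geometric tail is negligible, and checks the constant-stream property stated above:

```python
# Sketch: present and average values of a payoff stream, truncated at a
# horizon long enough that the discarded geometric tail is negligible.

def pv(stream, delta, t=0):
    """Present value at date t of the stream (pi_t, pi_{t+1}, ...)."""
    return sum(delta ** (s - t) * stream[s] for s in range(t, len(stream)))

def avg(stream, delta):
    """Average value: (1 - delta) times the present value at date 0."""
    return (1 - delta) * pv(stream, delta)

delta = 0.9
constant = [5] * 500                  # the stream (5, 5, 5, ...), truncated
print(round(pv(constant, delta), 6))  # close to 5/(1 - 0.9) = 50
print(round(avg(constant, delta), 6)) # close to the stage payoff, 5
```

The decomposition PV = π_0 + δπ_1 + · · · + δ^{t−1}π_{t−1} + δ^t PV_t can be checked with the same functions.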

12.2.2 Histories and strategies


Once again, in a repeated game, a history at the beginning of a given date t is the
sequence of the outcomes of the play at dates 0, . . . , t − 1. For example, in the Entry-
Deterrence game, the possible outcomes of the stage game are X, EA = (Enter, Acc.),
and EF = (Enter, Fight), and the possible histories are t-tuples of these three outcomes
for each t. Examples of histories are

(X, X, . . . , X) and (EA, EF, X, . . . , X).

In the repeated Prisoners' Dilemma, the possible histories are t-tuples of (C, C), (C, D), (D, C),
and (D, D), such as

((C, C), (C, D), (D, D), (C, C), . . . , (D, D)),

where t varies. A history at the beginning of date t is denoted by h = (g^0, . . . , g^{t−1}),
where g^s is the outcome of the stage game in round s; h is empty when t = 0. For example,
in the repeated Prisoners' Dilemma, ((C, C), (D, D)) is a history for t = 2. In the repeated
Entry-Deterrence game, (X, EA) is a history for t = 2.
A strategy in a repeated game, once again, determines a strategy in the stage game
for each history and each t. The important point is that the strategy in the stage
game at a given date can vary with the history. Here are some possible strategies in the
repeated Prisoners' Dilemma game:

Grim: Play C at t = 0; thereafter, play C if the players have always played
(C, C) in the past, and play D otherwise (i.e., if anyone ever played D in the
past).

Naively Cooperate: Play C always (no matter what happened in the past).

Tit-for-Tat: Play C at t = 0, and at each t > 0, play whatever the other
player played at t − 1.

Note that the strategy profiles (Grim, Grim), (Naively Cooperate, Naively Cooperate),
and (Tit-for-Tat, Tit-for-Tat) all lead to the same outcome path:¹

((C, C), (C, C), (C, C), . . .).

Nevertheless, they are quite distinct strategy profiles. Indeed, (Naively Cooperate,
Naively Cooperate) is not even a Nash equilibrium (why?), while (Grim, Grim) is a
subgame-perfect Nash equilibrium for large values of δ. On the other hand, while (Tit-
for-Tat, Tit-for-Tat) is a Nash equilibrium for large values of δ, it is not subgame-perfect.
All of this will become clear momentarily.
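One way to see that these distinct strategies induce the same path is to simulate them. A sketch (mine; histories are encoded from each player's own viewpoint as (my action, other's action) pairs):

```python
# Sketch: the three Prisoners' Dilemma strategies as functions of the
# history of past outcomes, and a simulator that plays them against each other.

def grim(hist):
    # Cooperate only if every past round was mutual cooperation.
    return 'C' if all(pair == ('C', 'C') for pair in hist) else 'D'

def naive(hist):
    return 'C'  # cooperate no matter what happened in the past

def tit_for_tat(hist):
    return 'C' if not hist else hist[-1][1]  # copy opponent's last action

def play(s1, s2, rounds=6):
    """Outcome path when strategy s1 faces strategy s2."""
    h1, h2, path = [], [], []
    for _ in range(rounds):
        a1, a2 = s1(h1), s2(h2)
        h1.append((a1, a2)); h2.append((a2, a1))
        path.append((a1, a2))
    return path

for s in (grim, naive, tit_for_tat):
    print(play(s, s))  # all three profiles yield (C, C) in every round
```

Playing Grim against an unconditional defector (not one of the profiles above) shows where the strategies differ: after one round of (C, D), Grim switches to D forever.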

12.2.3 Single-deviation principle


In an infinitely repeated game, one uses the single-deviation principle to check
whether a strategy profile is a subgame-perfect Nash equilibrium. In such a game, the single-
deviation principle takes a simple form and is applied through augmented stage games.
Here, "augmented" refers to the fact that one simply augments the payoffs in the stage
game by adding the present value of future payoffs under the purported equilibrium. One
may also use the term reduced game instead of augmented stage game, interchangeably.

Augmented Stage Game (aka Reduced Game) Formally, consider a strategy
profile s* = (s*_1, s*_2, . . . , s*_n) in the repeated game. Consider any date t and any history
h = (g^0, . . . , g^{t−1}), where g^s is the outcome of the play at date s. The augmented stage game
for s* and h is the same game as the stage game in the repeated game, except that the
payoff of each player i from each terminal history z of the stage game is

u_i(z | s*, h) = u_i(z) + δPV_{i,t+1}(z, s*),

where u_i(z) is the stage-game payoff of player i at z in the original stage game, and
PV_{i,t+1}(z, s*) is the present value of player i at t + 1 from the payoff stream that results
when all players follow s* starting with the history (h, z) = (g^0, . . . , g^{t−1}, z), which is a
history at the beginning of date t + 1. Note that u_i(z | s*, h) is the time-t present value
of the payoff stream that results when the outcome of the stage game is z in round t and
everybody sticks to the strategy profile s* from the next period on. Note also that the
only difference between the original stage game and the augmented stage game is that
the payoff in the augmented game is u_i(z | s*, h) while the payoff in the original game is
u_i(z).

¹Make sure that you can compute the outcome path for each strategy profile above.
Single-deviation principle now states that a strategy profile in the repeated game is
subgame-perfect if it always yields a subgame-perfect Nash equilibrium in the augmented
stage game:

Theorem 12.2 (Single-Deviation Principle) A strategy profile s* is a subgame-perfect
Nash equilibrium of the repeated game if and only if (s*_1(h), . . . , s*_n(h)) is a subgame-
perfect Nash equilibrium of the augmented stage game for s* and h, for every date t and
every history h at the beginning of t.

Note that s*_i(h) is what player i is supposed to play in the stage game after history
h at date t according to s*. Hence, s*_i(h) is a strategy in the stage game as well as
a strategy in the augmented stage game. Therefore, (s*_1(h), . . . , s*_n(h)) is a strategy
profile in the augmented stage game, and a potential subgame-perfect Nash equilibrium.
Note also that, in order to show that s* is a subgame-perfect Nash equilibrium, one
must check for all histories h and dates t that s* yields a subgame-perfect Nash equi-
librium in the augmented stage game. Conversely, in order to show that s* is not a
subgame-perfect Nash equilibrium, one only needs to find one history (and date) for
which s* does not yield a subgame-perfect Nash equilibrium in the augmented stage
game. Finally, although the above result considers a pure strategy profile s*, the same
result is true for mixed strategies; the result is stated this way for clarity. The rest of
this section is devoted to illustrating the single-deviation principle in the infinitely repeated
Entry-Deterrence and Prisoners' Dilemma games.

Infinitely Repeated Entry Deterrence To illustrate the single-deviation
principle when the stage game is dynamic, consider the infinitely repeated Entry-Deterrence
game in (12.2). Consider the following strategy profile.

At any given stage, the entrant enters the market if and only if the incum-
bent has accommodated the entrant sometime in the past. The incumbent
accommodates the entrant if and only if he has accommodated the entrant
before.²

Using the single-deviation principle, we will now show that for large values of δ, this is a
subgame-perfect Nash equilibrium. The strategy profile puts the histories into two groups:

1. the histories at which there was an entry and the incumbent has accommodated,
i.e., the histories that contain the outcome EA; and

2. all the other histories, i.e., the histories that do not contain the outcome EA at any
date.

Consequently, in the application of the single-deviation principle, one puts histories into
the above two groups, depending on whether the incumbent has ever accommodated
an entrant. First take any date t and any history h = (g^0, . . . , g^{t−1}) in the first group,
where the incumbent has accommodated some entrant. Now, independent of what happens
at t, the histories at t + 1 and later will contain a past instance of accommodation EA
(before t), and according to the strategy profile, from t + 1 on, the entrant will always
enter and the incumbent will accommodate, each player getting the constant stream of 1s.
The present value of this at t + 1 is

V_A = 1 + δ + δ² + · · · = 1/(1 − δ).

That is, for every outcome z ∈ {X, EA, EF}, PV_{i,t+1}(z, s*) = V_A. Hence, the aug-
mented stage game for h and s* is

[Entry-Deterrence game with augmented payoffs: X yields (0 + δV_A, 2 + δV_A); (Enter, Acc.) yields (1 + δV_A, 1 + δV_A); (Enter, Fight) yields (−1 + δV_A, −1 + δV_A).]
²This is a switching strategy: initially the incumbent fights whenever there is an entry and the
entrant never enters. If the incumbent happens to accommodate an entrant, they switch to a new
regime in which the entrant enters the market no matter what the incumbent does after the switch,
and the incumbent always accommodates the entrant.

For example, if the incumbent accommodates the entrant at t, his present value (at
t) will be 1 + δV_A; and if he fights, his present value will be −1 + δV_A, and so on.
This is another version of the Entry-Deterrence game, where the constant δV_A is added
to the payoffs. The strategy profile s* yields (Enter, Accommodate) for round t at
h. According to the single-deviation principle, (Enter, Accommodate) must be a subgame-
perfect equilibrium of the augmented stage game here. This is indeed the case, and s*
passes the single-deviation test for such histories.

Now, for some date t, consider a history h = (g^0, . . . , g^{t−1}) in the second group, where
the incumbent has never accommodated the entrant before, i.e., g^s differs from EA for
all s. Towards constructing the augmented stage game for h, first consider the outcome
z = EA at t. In that case, at the beginning of t + 1, the history is (h, EA), which
includes EA as in the previous paragraph. Hence, according to s*, Player 1 enters and
Player 2 accommodates at t + 1, yielding a history that contains EA for the next period.
Therefore, in the continuation game, all histories are in the first group (containing EA),
and the play is (Enter, Accommodate) at every s > t, resulting in the outcome path
(EA, EA, . . .). Starting from t + 1, each player gets 1 at each date, resulting in the
present value PV_{i,t+1}(EA, s*) = V_A. Now consider another outcome z ∈ {X, EF}
in period t. The continuation play for these outcomes is quite different. At the
beginning of t + 1, the history (h, z) is either (h, X) or (h, EF). Since h does not contain
EA, neither does (h, z). Hence, according to s*, at t + 1, Player 1 exits, and Player 2
would have chosen Fight if there were an entry, yielding outcome X for period t + 1.
Consequently, at any s > t + 1, the history is (h, z, X, . . . , X), and Player 1 chooses to
exit at s according to s*. This results in the outcome path (X, X, . . .). Therefore,
starting from t + 1, Player 1 gets 0 and Player 2 gets 2 every day, yielding present values
of PV_{1,t+1}(z, s*) = 0 and

PV_{2,t+1}(z, s*) = V_F = 2 + 2δ + 2δ² + · · · = 2/(1 − δ),

respectively. Therefore, the augmented stage game for h and s* is now

[Entry-Deterrence game with augmented payoffs: X yields (0, 2 + δV_F); (Enter, Acc.) yields (1 + δV_A, 1 + δV_A); (Enter, Fight) yields (−1, −1 + δV_F).]

At this history the strategy profile prescribes (X, Fight), i.e., the entrant does not
enter, and if he enters, the incumbent fights. The single-deviation principle then requires
that (X, Fight) is a subgame-perfect equilibrium of the above augmented stage game.
Since X is a best response to Fight, we only need to ensure that Player 2 weakly prefers
Fight to Accommodate after the entry in the above game. For this, we must have

−1 + δV_F ≥ 1 + δV_A.

Substituting the definitions of V_A and V_F into this inequality shows that it is equivalent
to³

δ ≥ 2/3.

We have considered all possible histories, and when δ ≥ 2/3, the strategy profile
passes the single-deviation test. Therefore, when δ ≥ 2/3, the strategy profile is a
subgame-perfect equilibrium.
On the other hand, when δ < 2/3, s* is not a subgame-perfect Nash equilibrium. To
show this, it suffices to consider one history at which s* fails the single-deviation test. For
a history h in the second group, the augmented stage game is as above, and (X, Fight)
is not a subgame-perfect equilibrium of this game, as 1 + δV_A > −1 + δV_F.
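The threshold can be confirmed numerically. A sketch (mine), assuming the continuation values V_A = 1/(1 − δ) and V_F = 2/(1 − δ) derived above:

```python
# Sketch: the incumbent's single-deviation check after a no-accommodation
# history. Fight is optimal iff -1 + d*V_F >= 1 + d*V_A, i.e., iff d >= 2/3.

def fight_is_optimal(d):
    V_A = 1 / (1 - d)   # continuation value after accommodation: 1 per period
    V_F = 2 / (1 - d)   # incumbent's continuation value of X forever: 2 per period
    return -1 + d * V_F >= 1 + d * V_A

print(fight_is_optimal(0.7))   # True: the reputation for fighting is sustainable
print(fight_is_optimal(0.6))   # False: the profile fails the deviation test
```

The inequality simplifies to δ/(1 − δ) ≥ 2, which is exactly footnote 3.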

Infinitely Repeated Prisoners' Dilemma When the stage game is a simultaneous-
action game, there is no distinction between a subgame-perfect Nash equilibrium and a Nash
equilibrium of the augmented stage game. Hence, for the single-deviation test, one simply checks whether s*(h) is a Nash

³The inequality is δ(V_F − V_A) ≥ 2. Substituting the values of V_A and V_F, we obtain δ/(1 − δ) ≥ 2,
i.e., δ ≥ 2/3.

equilibrium of the augmented stage game for h, for every history h. This simplifies the
analysis substantially, because one only needs to compute the payoffs without deviation
and with unilateral deviations in order to check whether the strategy profile is a Nash
equilibrium.
As an example, consider the infinitely repeated Prisoners' Dilemma game in (12.1).
Consider the strategy profile (Grim, Grim). There are two kinds of histories we need to
consider separately for this strategy profile.

1. Cooperation histories: histories in which D has never been played by any player.

2. Defection histories: histories in which D has been played by someone at some date.

First consider a Cooperation history at some date t. If both players play C at t, then
according to (Grim, Grim), from t + 1 on each player will play C forever. This yields the
present value

V_C = 5 + 5δ + 5δ² + · · · = 5/(1 − δ)

at t + 1. If any player plays D, then from t + 1 on all the histories will be Defection
histories and each player will play D forever. This yields the present value

V_D = 1 + δ + δ² + · · · = 1/(1 − δ)

at t + 1. Now, at t, if both play C, then the payoff of each player will be 5 + δV_C.
If Player 1 plays D while Player 2 plays C, then Player 1 gets 6 + δV_D, and Player
2 gets 0 + δV_D. Hence, the augmented stage game at the given history is

          C                                 D
  C   5 + δV_C, 5 + δV_C         0 + δV_D, 6 + δV_D
  D   6 + δV_D, 0 + δV_D         1 + δV_D, 1 + δV_D

To pass the single-deviation test, (C, C) must be a Nash equilibrium of this game.⁴ (That
is, we fix a player's action at C and check whether the other player has an incentive to deviate.)

⁴It is important to note that we do not need to know all the payoffs in the reduced game. For
example, for this history we only need to check whether (C, C) is a Nash equilibrium of the reduced game, and
hence we do not need to compute the payoffs from (D, D). In this example, they were easy to compute.
In general, it may be time-consuming to compute the payoffs for all strategy profiles. In that case, it
saves a lot of time to ignore the strategy profiles in which more than one player deviates from the
prescribed behavior at h.

This is the case if and only if

5 + δV_C ≥ 6 + δV_D,

i.e.,

δ ≥ 1/5.

We also need to consider Defection histories. Consider a Defection history at
some date t. Now, independent of what is played at t, according to (Grim, Grim), from t + 1
on we will have Defection histories and each player will play D forever. The present
value of payoffs from t + 1 on will always be V_D. Then, the augmented stage game at
this history is

          C                                 D
  C   5 + δV_D, 5 + δV_D         0 + δV_D, 6 + δV_D
  D   6 + δV_D, 0 + δV_D         1 + δV_D, 1 + δV_D

The single-deviation test for (Grim, Grim) requires that (D, D) is a Nash equilibrium of this
game,⁵ and in fact (D, D) is its only Nash equilibrium.
Since (Grim, Grim) passes the single-deviation test at each history, it is a subgame-
perfect Nash equilibrium when δ ≥ 1/5.⁶
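The same check in code form (a sketch of mine, using V_C = 5/(1 − δ) and V_D = 1/(1 − δ) from above):

```python
# Sketch: the single-deviation test for (Grim, Grim) at a Cooperation history.
# Cooperating is optimal iff 5 + d*V_C >= 6 + d*V_D, which reduces to d >= 1/5.

def grim_passes(d):
    V_C = 5 / (1 - d)   # continuation value of (C, C) forever
    V_D = 1 / (1 - d)   # continuation value of (D, D) forever
    return 5 + d * V_C >= 6 + d * V_D

print(grim_passes(0.25))  # True: cooperation is sustainable
print(grim_passes(0.10))  # False: the one-shot gain from D dominates
```

At Defection histories no such condition on δ arises: D is a dominant action of the augmented game there, since the continuation value δV_D is the same in every cell.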
We will now use the same technique to show that (Tit-for-Tat, Tit-for-Tat) is not a
subgame-perfect Nash equilibrium (except in the degenerate case δ = 1/5). Tit-for-Tat
at t + 1 depends only on what is played at t, not on any previous play. If (C, C)
is played at t, then starting at t + 1 we will have (C, C) throughout, and hence the
vector of present values at t + 1 will be

(5/(1 − δ), 5/(1 − δ)) = (5, 5) + δ(5, 5) + δ²(5, 5) + · · · .

If (C, D) is played at t, then according to (Tit-for-Tat, Tit-for-Tat) the sequence of plays
starting at t + 1 will be

(D, C), (C, D), (D, C), (C, D), . . . ,

with (t + 1)-present value of

(6/(1 − δ²), 6δ/(1 − δ²)) = (6, 0) + δ(0, 6) + δ²(6, 0) + δ³(0, 6) + δ⁴(6, 0) + · · · .

⁵Once again, to check this, we do not need to know the payoffs for (C, C).
⁶Once again, it is not a subgame-perfect Nash equilibrium when δ < 1/5. In that case, it suffices to
show that (C, C) is not a Nash equilibrium of the augmented stage game for a Cooperation history.

Similarly, if (D, C) is played at t, then the (t + 1)-present value will be

(6δ/(1 − δ²), 6/(1 − δ²)).

After (D, D) at t, we will have (D, D) throughout, yielding the (t + 1)-present value
(1/(1 − δ), 1/(1 − δ)).
In order to show that (Tit-for-Tat, Tit-for-Tat) is not a subgame-perfect Nash equilib-
rium, we will consider two histories. [To show that a strategy profile is not subgame-perfect,
one only needs to find a case where it fails the single-deviation principle.] Given the
above continuation values, the reduced game at any t, for any previous history, is

          C                                                          D
  C   5 + 5δ/(1 − δ), 5 + 5δ/(1 − δ)              0 + 6δ/(1 − δ²), 6 + 6δ²/(1 − δ²)
  D   6 + 6δ²/(1 − δ²), 0 + 6δ/(1 − δ²)           1 + δ/(1 − δ), 1 + δ/(1 − δ)

1. Consider t = 0, when (Tit-for-Tat, Tit-for-Tat) prescribes (C, C). The single-deviation
principle then requires that (C, C) is a Nash equilibrium of the reduced game
above. That is, we must have

5 + 5δ/(1 − δ) ≥ 6 + 6δ²/(1 − δ²).

2. Consider a history in which (D, C) is played at t − 1. Now, according to (Tit-for-
Tat, Tit-for-Tat), we must have (C, D) at t. The single-deviation principle now requires
that (C, D) is a Nash equilibrium of the above game. That is, we must have

6 + 6δ²/(1 − δ²) ≥ 5 + 5δ/(1 − δ),

the opposite of the previous requirement.

Hence, (Tit-for-Tat, Tit-for-Tat) is not a subgame-perfect Nash equilibrium, unless

6 + 6δ²/(1 − δ²) = 5 + 5δ/(1 − δ),

or equivalently δ = 1/5.
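The two conditions can be tabulated exactly. A sketch (mine) using exact rational arithmetic, so that the knife-edge case δ = 1/5 is not obscured by floating-point rounding:

```python
# Sketch: the two single-deviation conditions for (Tit-for-Tat, Tit-for-Tat).
# They point in opposite directions and can only meet at d = 1/5.

from fractions import Fraction as F

def lhs(d):   # value of complying with (C, C): 5 + d*5/(1 - d)
    return 5 + d * 5 / (1 - d)

def rhs(d):   # value of the alternating deviation cycle: 6 + d^2*6/(1 - d^2)
    return 6 + d ** 2 * 6 / (1 - d ** 2)

# On the path, (C, C) requires lhs >= rhs; after (D, C), the prescribed
# (C, D) requires the reverse for Player 2: rhs >= lhs.
for d in (F(1, 10), F(1, 5), F(1, 2)):
    print(d, lhs(d) >= rhs(d), rhs(d) >= lhs(d))
# Only d = 1/5 satisfies both, and there the two sides are equal (25/4).
```

With `Fraction` both sides at δ = 1/5 evaluate to exactly 25/4 = 6.25, confirming the degenerate-case claim.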

12.3 Folk Theorem


A main objective of studying repeated games is to explore the relation between short-
term incentives (within a single period) and long-term incentives (within the broader
repeated game). Conventional wisdom in game theory suggests that when players are
patient, their long-term incentives take over, and a large set of behavior may result in
equilibrium. Indeed, for any given feasible and "individually rational" payoff vector and
for sufficiently large values of δ, there exists some subgame-perfect equilibrium that
yields that payoff vector as the average value of the payoff stream. This fact is called the
Folk Theorem. This section is devoted to presenting a basic version of the Folk Theorem and
illustrating its proof.
Throughout this section, it is assumed that the stage game is a simultaneous-action game $(N, S, u)$, where $N = \{1, \dots, n\}$ is the set of players, $S = S_1 \times \cdots \times S_n$ is a finite set of strategy profiles, and $u : S \to \mathbb{R}^n$ is the profile of stage-game utility functions.

12.3.1 Feasible Payoffs


Imagine that the players collectively randomize over stage-game strategy profiles $s \in S$. Which payoff vectors could they obtain if they could choose any probability distribution $p : S \to [0,1]$ on $S$? (Recall that $\sum_{s \in S} p(s) = 1$.) The answer is the set $V$ of payoff vectors $v = (v_1, \dots, v_n)$ such that
$$v = \sum_{s \in S} p(s)\,\left(u_1(s), \dots, u_n(s)\right)$$
for some probability distribution $p : S \to [0,1]$ on $S$. Note that $V$ is the smallest convex set that contains all payoff vectors $(u_1(s), \dots, u_n(s))$ from pure strategy profiles in the stage game. A payoff vector $v$ is said to be feasible iff $v \in V$. Throughout this section, $V$ is assumed to be $n$-dimensional.
For a visual illustration, consider the Prisoners' Dilemma game in (12.1). The set $V$ is plotted in Figure 12.1. Since there are two players, $V$ contains pairs $v = (v_1, v_2)$. The payoff vectors from pure strategies are $(1,1)$, $(5,5)$, $(6,0)$, and $(0,6)$. The set $V$ is the diamond-shaped area that lies between the lines connecting these four points.
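Concretely, feasibility can be verified by exhibiting the weights of the convex combination. The following Python sketch (an illustration; the particular weight vector is our choice) shows that $(2,2)$ is feasible in the prisoners' dilemma:

```python
# Pure-profile payoff vectors of the prisoners' dilemma:
# (D,D), (C,C), (D,C), (C,D).
pure_payoffs = [(1, 1), (5, 5), (6, 0), (0, 6)]

def mix(weights):
    """Payoff vector induced by a probability distribution over pure profiles."""
    assert abs(sum(weights) - 1) < 1e-12, "weights must form a distribution"
    return tuple(sum(w * u[k] for w, u in zip(weights, pure_payoffs))
                 for k in range(2))

print(mix([0.75, 0.25, 0.0, 0.0]))  # -> (2.0, 2.0), so (2, 2) is feasible
```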
Note that for every strategy profile $s$ in the repeated game, the average payoff vector from $s$ is in $V$.7 This also implies that the same is true for mixed strategy profiles in the repeated game. Conversely, if the players can collectively randomize on strategy profiles

7 Indeed, the average payoff vector can be written as
$$U(s) = \sum_{a \in S} \lambda(a)\,\left(u_1(a), \dots, u_n(a)\right),$$

Figure 12.1: Feasible payoffs in the Prisoners' Dilemma

in the repeated game, all vectors $v \in V$ could be obtained as average payoff vectors. (See also the end of the section.)

12.3.2 Individual Rationality: Minmax Payoffs


There is a lower bound on how much a player gets in equilibrium. For example, in the repeated prisoners' dilemma, if a player keeps playing defect every day no matter what happens, he gets at least 1 every day, netting an average payoff of 1 or more. Then, he must get at least 1 in any Nash equilibrium, because otherwise he could profitably deviate to the above strategy.
Towards finding a lower bound on the payoffs from pure-strategy Nash equilibria, for

where
$$\lambda(a) = (1-\delta)\sum_{t \in T_a} \delta^t$$
and $T_a$ is the set of dates at which $a$ is played on the outcome path of $s$. Clearly,
$$\sum_{a \in S} \lambda(a) = (1-\delta)\sum_{a \in S}\sum_{t \in T_a} \delta^t = (1-\delta)\sum_{t} \delta^t = 1.$$

each player $i$, define the pure-strategy minmax payoff as
$$\underline{v}^{\,p}_i = \min_{s_{-i} \in S_{-i}} \max_{s_i \in S_i} u_i(s_i, s_{-i}). \tag{12.3}$$
Here, the other players try to minimize the payoff of player $i$ by choosing a pure strategy profile $s_{-i}$ for themselves, knowing that player $i$ will play a best response to $s_{-i}$. The harshest punishment they can inflict on $i$ is $\underline{v}^{\,p}_i$. For example, in the prisoners' dilemma game, $\underline{v}^{\,p}_i = 1$: player $i$ gets a maximum of 6 if the other player plays C and a maximum of 1 if the other player plays D.
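Since (12.3) is a finite min-max, it can be computed by brute force. A minimal Python sketch for player $i$'s payoffs in the prisoners' dilemma:

```python
# Player i's stage payoffs in the prisoners' dilemma (own action listed first).
u_i = {("C", "C"): 5, ("C", "D"): 0, ("D", "C"): 6, ("D", "D"): 1}

def pure_minmax(payoff, own_actions, other_actions):
    """Min over the opponent's pure actions of i's best-response payoff."""
    return min(max(payoff[(a, b)] for a in own_actions)
               for b in other_actions)

print(pure_minmax(u_i, ["C", "D"], ["C", "D"]))  # -> 1
```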
Observe that in any pure-strategy Nash equilibrium $s^*$ of the repeated game, the average payoff of player $i$ is at least $\underline{v}^{\,p}_i$. To see this, suppose that the average payoff of $i$ is less than $\underline{v}^{\,p}_i$ in $s^*$. Now consider the strategy $\hat{s}_i$ such that, for each history $h$, $\hat{s}_i(h)$ is a stage-game best response to $s^*_{-i}(h)$, i.e.,
$$u_i\left(\hat{s}_i(h), s^*_{-i}(h)\right) = \max_{s_i \in S_i} u_i\left(s_i, s^*_{-i}(h)\right).$$
Since
$$\max_{s_i \in S_i} u_i\left(s_i, s^*_{-i}(h)\right) \ge \underline{v}^{\,p}_i$$
for every $h$, the average payoff from $\left(\hat{s}_i, s^*_{-i}\right)$ is at least $\underline{v}^{\,p}_i$, giving player $i$ an incentive to deviate.
A lower bound for the average payoff from a mixed-strategy Nash equilibrium is given by the minmax payoff, defined as
$$\underline{v}_i = \min_{\alpha_{-i}} \max_{s_i \in S_i} \sum_{s_{-i} \in S_{-i}} \left(\prod_{j \neq i} \alpha_j(s_j)\right) u_i(s_i, s_{-i}), \tag{12.4}$$
where each $\alpha_j$ is a mixed strategy of player $j$ in the stage game. Similarly to pure strategies, one can show that the average payoff of player $i$ is at least $\underline{v}_i$ in any Nash equilibrium (mixed or pure). Note that, by definition, $\underline{v}_i \le \underline{v}^{\,p}_i$. The inequality can be strict. For example, in the matching-pennies game

1\2      Head      Tail
Head    −1, 1     1, −1
Tail     1, −1   −1, 1

the pure-strategy minmax payoff $\underline{v}^{\,p}_i$ is 1, while the minmax payoff $\underline{v}_i$ is 0. (The latter is obtained when $\alpha_j(\text{Head}) = \alpha_j(\text{Tail}) = 1/2$.) For the sake of exposition, it is assumed that $(\underline{v}_1, \dots, \underline{v}_n) \in V$.
A payoff vector $v$ is said to be individually rational iff $v_i \ge \underline{v}_i$ for every $i \in N$.
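For matching pennies, the inner maximization can be written against the opponent's probability $q$ of Head, and the outer minimization approximated on a grid. The sketch below is only an approximation scheme (an exact computation would use the linear program behind the minimax theorem):

```python
def best_response_value(q):
    """Row player's best-response payoff when column plays Head with prob q."""
    return max(-q + (1 - q), q - (1 - q))

# Grid search over q: the minimum is attained at q = 1/2, where the value is 0.
values = [best_response_value(k / 1000) for k in range(1001)]
print(min(values), best_response_value(0.0))  # -> 0.0 1.0
```

Away from $q = 1/2$ the row player can exploit the imbalance, which is why the pure-strategy minmax (where $q$ is forced to be 0 or 1) is strictly higher.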

12.3.3 Folk Theorem


I will next present a general folk theorem and illustrate the main idea of the proof for a
special case.

Theorem 12.3 (Folk Theorem) Let $v \in V$ be such that $v_i > \underline{v}_i$ for every player $i$. Then, there exists $\bar{\delta} \in (0,1)$ such that for every $\delta > \bar{\delta}$ there exists a subgame-perfect equilibrium of the repeated game under which the average value of each player $i$ is $v_i$. Moreover, if $v_i > \underline{v}^{\,p}_i$ for every $i$ above, then the subgame-perfect equilibrium above is in pure strategies.

The Folk Theorem states that any strictly individually rational and feasible payoff vector can be supported in a subgame-perfect Nash equilibrium when the players are sufficiently patient. Since all equilibrium payoff vectors must be individually rational and feasible, the Folk Theorem provides a rough characterization of the set of equilibrium payoff vectors when players are patient: the set of all feasible and individually rational payoff vectors.
I will next illustrate the main idea of the proof for a special case. Assume that, in the theorem, $v = (u_1(s^*), \dots, u_n(s^*))$ for some $s^* \in S$, and that there exists a Nash equilibrium $\hat{s}$ of the stage game such that $v_i > u_i(\hat{s})$ for every $i$. In the prisoners' dilemma example, $s^* = (C,C)$, yielding $v = (5,5)$, and $\hat{s} = (D,D)$, yielding the payoff vector $(1,1)$. Recall that in that case one could obtain $v$ from the strategy profile (Grim, Grim), which is a subgame-perfect Nash equilibrium when $\delta \ge 1/5$. The main idea here is a generalization of the Grim strategy. Consider the following strategy profile $\sigma^*$ of the repeated game:

Play $s^*$ until somebody deviates, and play $\hat{s}$ thereafter.

Clearly, under $\sigma^*$, the average value of each player $i$ is $u_i(s^*) = v_i$. Moreover, $\sigma^*$ is a subgame-perfect Nash equilibrium when $\delta$ is large. To see this, note that $\sigma^*$ passes the single-deviation test at histories with a previous deviation, because $\hat{s}$ is a Nash equilibrium of the stage game. Now consider a history in which $s^*$ has been played throughout. In the augmented stage game (with average payoffs), the payoff from $s^*$ is $v_i$, because the players will keep playing $s^*$ forever after that play. The payoff from any other $s \in S$ is
$$(1-\delta)\,u_i(s) + \delta\,u_i(\hat{s}),$$

because the players will switch to $\hat{s}$ after any such play. Then, $s^*$ is a Nash equilibrium of the augmented stage game if and only if
$$v_i \ge (1-\delta)\max_{s_i \in S_i} u_i\left(s_i, s^*_{-i}\right) + \delta\,u_i(\hat{s}) \tag{12.5}$$
for every player $i$. Let
$$\delta_i = \frac{\max_{s_i \in S_i} u_i\left(s_i, s^*_{-i}\right) - v_i}{\max_{s_i \in S_i} u_i\left(s_i, s^*_{-i}\right) - u_i(\hat{s})}$$
be the discount factor for which (12.5) holds with equality; such a $\delta_i < 1$ exists because $\max_{s_i} u_i\left(s_i, s^*_{-i}\right) \ge u_i(s^*) = v_i > u_i(\hat{s})$. Take $\bar{\delta} = \max\{\delta_1, \dots, \delta_n\}$. Then, for every $\delta > \bar{\delta}$, inequality (12.5) holds, and hence $s^*$ is a Nash equilibrium of the augmented stage game. Therefore, $\sigma^*$ is a subgame-perfect Nash equilibrium whenever $\delta > \bar{\delta}$. Note that in the case of the prisoners' dilemma, $\bar{\delta} = (6-5)/(6-1) = 1/5$.
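The threshold formula can be evaluated directly. In the sketch below (an illustration), the three arguments are the best one-shot deviation payoff, the target payoff, and the stage-equilibrium (punishment) payoff:

```python
def threshold(best_deviation, target, punishment):
    """Discount factor at which (12.5) holds with equality."""
    return (best_deviation - target) / (best_deviation - punishment)

# Prisoners' dilemma: deviation payoff 6, target 5, punishment 1.
print(threshold(6, 5, 1))  # -> 0.2, the bound 1/5 found for Grim
```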
In the above illustration, the vector $v$ is obtained by playing the same $s^*$ in every period. What if this is not possible, i.e., $v$ is a convex combination of payoff vectors from profiles in $S$ but $v \neq u(s)$ for any $s \in S$? In that case, one can use time averaging to obtain $v$ from a pure strategy profile in the repeated game. For an illustration, consider $(2,2)$ in the repeated Prisoners' Dilemma game. Note that
$$(2,2) = \frac{1}{4}(5,5) + \frac{3}{4}(1,1) = (1,1) + \frac{1}{4}(4,4).$$
We could obtain average payoff vectors near $(2,2)$ in various ways. For example, consider the path
$$(C,C), (D,D), (D,D), (D,D), (C,C), (D,D), (D,D), (D,D), \dots$$
on which $(C,C)$ is played at every date divisible by 4. The average value of each player from this path is
$$1 + 4\,\frac{1-\delta}{1-\delta^{4}} = 1 + \frac{4}{1+\delta+\delta^{2}+\delta^{3}}.$$
As $\delta \to 1$, this value approaches 2. Another way to approximate $(2,2)$ is first to play $(D,D)$ and then switch to $(C,C)$. For example, let $t^*$ be the smallest integer for which $\delta^{t^*} \le 1/4$. Note that when $\delta$ is large, $\delta^{t^*} \cong 1/4$. Now consider the path on which $(D,D)$ is played at every $t < t^*$ and $(C,C)$ is played at every $t \ge t^*$. The average value is
$$\left(1-\delta^{t^*}\right)\cdot 1 + \delta^{t^*}\cdot 5 \cong 2.$$
Here, I approximated $v$ by time averaging. When $\delta$ is large, one can obtain each $v$ exactly by time averaging.8
8 For mathematically oriented students: imagine writing each weight $p(s) \in [0,1]$ in base $1/\delta$.
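Both approximations are easy to check numerically. In the Python sketch below, $\delta = 0.999$ is an illustrative choice:

```python
import math

def cycle_average(d):
    """Average value of the path with (C,C) every fourth period: 1 every
    period plus an extra 4 at t = 0, 4, 8, ..."""
    return 1 + 4 * (1 - d) / (1 - d ** 4)

def switch_average(d):
    """Average value of (D,D) until t*, then (C,C), with d**t* close to 1/4."""
    t_star = math.ceil(math.log(0.25) / math.log(d))  # smallest t with d**t <= 1/4
    return (1 - d ** t_star) * 1 + d ** t_star * 5

print(cycle_average(0.999), switch_average(0.999))  # both are close to 2
```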

12.4 Exercises with Solutions


1. Consider the $n$-times repeated game with the following stage game, in which each player's actions are labeled A, B, and C:

1\2      A       B       C
A      3, 3    0, 0    0, 0
B      0, 0    2, 2    1, 0
C      0, 0    0, 1    0, 0

(a) Find a lower bound for the total payoff of each player in all pure-strategy Nash equilibria. Prove that the payoff of a player is indeed at least this bound in every pure-strategy Nash equilibrium.
Solution: Note that the pure-strategy minmax payoff of each player in the stage game is 1. Hence, the total payoff of a player cannot be less than $n$. Indeed, suppose a player mirrors what the other player is supposed to play at any history at which the other player plays A or B according to the equilibrium, and plays B at any history at which the other player is supposed to play C. Then his payoff is at least 1 in every period, hence at least $n$ in total. Since he plays a best response in equilibrium, his equilibrium payoff is at least that amount. This lower bound is tight. For $n = 2k > 1$, consider the strategy profile:
Play (C,C) for the first $k$ periods and (B,B) for the last $k$ periods; if any player deviates from this path, play C forever.
Note that the total payoff of each player from this strategy profile is $2k = n$. To check that this is a Nash equilibrium, note that the best possible deviation is to play B forever, which also yields $n$, giving no incentive to deviate. Note also that the equilibrium here is not subgame-perfect.

(b) Construct a pure-strategy subgame-perfect Nash equilibrium in which the total payoff of each player is at most $n+1$. Verify that the strategy profile is indeed a subgame-perfect Nash equilibrium.
Solution: Recall that the set of dates is $\{0, 1, \dots, n-1\}$. For $n = 1$, (B,B) is the desired equilibrium. Towards a mathematical induction, now take any $n > 1$ and assume that, for every $m < n$, the $m$-times repeated game has a pure-strategy subgame-perfect Nash equilibrium $\sigma^*[m]$ in which each player gets $m+1$. For the

$n$-times repeated game, consider the path
$$\underbrace{(C,C), \dots, (C,C)}_{(n-1)/2 \text{ times}},\ \underbrace{(B,B), \dots, (B,B)}_{(n+1)/2 \text{ times}}$$
if $n$ is odd, and the path
$$\underbrace{(C,C), \dots, (C,C)}_{n/2 \text{ times}},\ \underbrace{(B,B), \dots, (B,B)}_{n/2 - 1 \text{ times}},\ (A,A)$$
if $n$ is even. Note that the total payoff of each player from either path is $n+1$. Consider the following strategy profile:
Play according to the above path; if any player deviates from this path at any $t \le n/2 - 1$, switch to $\sigma^*[n-t-1]$ for the remaining $(n-t-1)$-times repeated game; if any player deviates from this path at any $t > n/2 - 1$, remain on the path.
This is a subgame-perfect Nash equilibrium. There are three classes of histories to check. First, consider a history in which some player deviated from the path at some $t' \le n/2 - 1$. In that case, the strategy profile already prescribes following the subgame-perfect Nash equilibrium $\sigma^*[n-t'-1]$ of the subgame that starts at $t'+1$, which remains subgame-perfect in the current subgame as well. Second, consider a history in which no player has deviated from the path at any $t' \le n/2 - 1$, and take $t > n/2 - 1$. In the continuation game, the above strategy profile prescribes: play (B,B) every day if $n$ is odd; play (B,B) every day but the last day, and play (A,A) on the last day, if $n$ is even. Since (B,B) and (A,A) are Nash equilibria of the stage game, this is clearly a subgame-perfect equilibrium of the remaining game. Finally, take $t \le n/2 - 1$ and consider any on-the-path history. Now, a player's total payoff is $n+1$ if he follows the strategy profile. If he deviates at $t$, he gets at most 1 at $t$ and $(n-t-1)+1 \le n$ from the next period on, where $(n-t-1)+1$ is his payoff from $\sigma^*[n-t-1]$. His total payoff therefore cannot exceed $n+1$, and he has no incentive to deviate.
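The induction rests on the path totals adding up to $n+1$; with the per-period payoffs of the table above (as labeled in this reconstruction), this can be checked mechanically:

```python
def path_total(n):
    """Each player's total payoff on the prescribed path of the n-period game.
    Per-period payoffs: (C,C) gives 0, (B,B) gives 2, (A,A) gives 3."""
    if n % 2 == 1:
        # (C,C) for (n-1)/2 periods, then (B,B) for (n+1)/2 periods.
        return 0 * ((n - 1) // 2) + 2 * ((n + 1) // 2)
    # (C,C) n/2 times, (B,B) n/2 - 1 times, then (A,A) once.
    return 2 * (n // 2 - 1) + 3

print([path_total(n) for n in range(1, 7)])  # -> [2, 3, 4, 5, 6, 7]
```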

2. Consider the infinitely repeated prisoners' dilemma game of (12.1) with discount factor $\delta = 0.999$.

(a) Find a subgame-perfect Nash equilibrium in pure strategies under which the average payoff of each player is between 1.1 and 1.2. Verify that your strategy profile is indeed a subgame-perfect Nash equilibrium.
Solution: Take any $\hat{t}$ with $\left(1-\delta^{\hat{t}}\right) + 5\delta^{\hat{t}} = 1 + 4\delta^{\hat{t}} \in (1.1, 1.2)$, e.g., any $\hat{t}$ between 2995 and 3687. Consider the strategy profile:
Play (D,D) at any $t < \hat{t}$, and (C,C) at $\hat{t}$ and thereafter. If any player deviates from this path, play (D,D) forever.
Note that the average value of each player is $\left(1-\delta^{\hat{t}}\right) + 5\delta^{\hat{t}} \in (1.1, 1.2)$. To check that it is a subgame-perfect Nash equilibrium, first take any on-path history with date $t \ge \hat{t}$. At that history, the average value of each player is 5. If a player deviates, his average value is only $6(1-\delta) + \delta = 1.005$. Hence, he has no incentive to deviate. For $t < \hat{t}$, the average value is
$$\left(1-\delta^{\hat{t}-t}\right) + 5\delta^{\hat{t}-t} \ge \left(1-\delta^{\hat{t}}\right) + 5\delta^{\hat{t}} > 1.1.$$
If the player deviates, his average value is only $\delta = 0.999$. Therefore, he does not have an incentive to deviate, once again. Since the players play the static Nash equilibrium after a switch, there is no incentive to deviate at such a history, either. Therefore, the strategy profile above is a subgame-perfect Nash equilibrium.
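The bounds in this solution are easy to verify for a concrete $\hat{t}$ (here $\hat{t} = 3000$, one admissible choice):

```python
delta = 0.999
t_hat = 3000  # any integer between 2995 and 3687 works

avg = 1 + 4 * delta ** t_hat          # average payoff on the path
dev_late = 6 * (1 - delta) + delta    # deviation value once (C,C) has started
dev_early = delta                     # deviation value during the (D,D) phase

print(round(avg, 4), 5 >= dev_late, avg > dev_early)
```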
(b) Find a subgame-perfect Nash equilibrium in pure strategies under which the average payoff of Player 1 is at least 5.7. Verify that your strategy profile is indeed a subgame-perfect Nash equilibrium.
Solution: Take any $\hat{t}$ with $\left(1-\delta^{\hat{t}}\right)6 + 5\delta^{\hat{t}} = 6 - \delta^{\hat{t}} \in (5.7, 5.8)$, i.e., $\delta^{\hat{t}} \in (0.2, 0.3)$. The possible values for $\hat{t}$ are the natural numbers from 1204 to 1608. Consider the strategy profile:
Play (D,C) at any $t < \hat{t}$, and (C,C) at $\hat{t}$ and thereafter. If any player deviates from this path, play (D,D) forever.
Note that the average value of Player 1 is $6 - \delta^{\hat{t}}$, taking values between $6 - 0.999^{1204} \cong 5.7002$ and $6 - 0.999^{1608} \cong 5.7999$. Note also that the strategy profile coincides with the one in part (a) at all off-the-path histories and at all on-the-path histories with $t \ge \hat{t}$. Hence, to check whether it is a subgame-perfect Nash equilibrium, it suffices to check the on-the-path histories with $t < \hat{t}$. At any such history, Player 1 clearly does not have an incentive to deviate (as in part (a)). For Player 2, the average value is
$$5\delta^{\hat{t}-t} \ge 5\delta^{\hat{t}} \ge 5 \cdot 0.999^{1608} \cong 1.0006.$$
If he deviates, his average value is only 1 (he gets 1 instead of 0 on the first day, and 1 forever thereafter). Therefore, he does not have an incentive to deviate, and the strategy profile above is a subgame-perfect Nash equilibrium.
(c) Can you find a subgame-perfect Nash equilibrium under which the average payoff of Player 1 is more than 5.8?
Answer: No. While the average payoff of Player 1 can be as high as 5.7999, it cannot be higher than 5.8. This is because $v_2 < 1$ for any feasible $v$ with $v_1 > 5.8$. Such an individually irrational payoff cannot result in equilibrium, because Player 2 could do better by simply playing D at every history (as discussed in the text).

3. [Midterm 2, 2006] Two firms, 1 and 2, play the following infinitely repeated game in which all previous plays are observed, and each firm tries to maximize the discounted sum of its profits from the stage games, where the discount factor is $\delta = 0.99$. At each date $t$, simultaneously, each firm $i$ selects a price $p_i \in \{0.01, 0.02, \dots, 0.99, 1\}$. If $p_1 = p_2$, then each firm sells 1 unit of the good; otherwise, the cheaper firm sells 2 units and the more expensive firm sells 0 units. Producing the good does not cost the firms anything. Find a subgame-perfect equilibrium in which the average value of Firm 1 is at least 1.4. (Check that the strategy profile you construct is indeed a subgame-perfect equilibrium.)
Solution: (There are several such strategy profiles; I will show one of them.) In order for the average value to exceed 1.4, the present value must exceed 140. We can get an average value of approximately 1.5 for Firm 1 by alternating between $(0.99, 1)$, which yields the payoff vector $(1.98, 0)$, and $(1, 1)$, which yields $(1, 1)$. The average value of that payoff stream for Firm 1 is
$$\frac{1.98 + \delta}{1+\delta} \cong 1.49.$$
Here is an SPE with such equilibrium play: at even dates play $(0.99, 1)$ and at odd dates play $(1, 1)$; if any player ever deviates from this scheme, then play $(0.01, 0.01)$ forever.

We use the single-deviation principle to check that this is an SPE. First note that in the "deviation" mode, the firms play a Nash equilibrium of the stage game forever, so those histories pass the single-deviation test. Now consider an even date $t$ and a history in which there has been no deviation. Firm 1 has no incentive to deviate: if it follows the strategy, it gets the payoff stream 1.98, 1, 1.98, 1, 1.98, ...; if it deviates, it gets $\pi$, 0.01, 0.01, ..., where $\pi \le 1.96$ ($\pi = 1$ for an upward deviation). For Firm 2: if it plays according to the strategy, it gets the payoff stream 0, 1, 0, 1, 0, 1, ..., with present value
$$\frac{\delta}{1-\delta^{2}} \cong 49.75.$$
If it deviates, it gets $\pi$, 0.01, 0.01, ..., where $\pi \le 1.96$. (The best deviation is $p_2 = 0.98$.) This yields a present value of
$$\pi + 0.01\,\frac{\delta}{1-\delta} \cong \pi + 1 \le 2.96 \ll 49.75.$$
So Firm 2 has no incentive to deviate. We also need to check an odd date $t$ with no previous deviation. Now the best deviation is to set the price 0.99, getting 1.98 today and 0.01 forever after, which yields a present value of about 2.98. This is clearly lower than what each firm gets by sticking to its strategy (about 148.5 for Firm 1, and 50.25 for Firm 2).
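These present values can be reproduced directly ($\pi = 1.96$ is the best downward deviation identified above):

```python
delta = 0.99

pv_firm2 = delta / (1 - delta ** 2)               # on-path stream 0, 1, 0, 1, ...
pv_deviation = 1.96 + 0.01 * delta / (1 - delta)  # 1.96 today, then 0.01 forever
avg_firm1 = (1.98 + delta) / (1 + delta)          # Firm 1's average value on the path

print(round(pv_firm2, 2), round(pv_deviation, 2), round(avg_firm1, 2))
# -> 49.75 2.95 1.49
```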

4. [Midterm 2, 2011] Alice and Bob are a couple playing the infinitely repeated game with the following stage game and discount factor $\delta$. Every day, simultaneously, Alice and Bob spend fractions $x_A \in [0,1]$ and $x_B \in [0,1]$ of their time on their relationship, respectively, receiving the stage payoffs $u_A = \ln(x_A + x_B) + 1 - x_A$ and $u_B = \ln(x_A + x_B) + 1 - x_B$, respectively. (Alice and Bob are denoted by $A$ and $B$, respectively.) For each of the strategy profiles below, find the conditions on the parameters under which the strategy profile is a subgame-perfect equilibrium.

Solution: It is useful to note that $(x, 1-x)$ is a Nash equilibrium of the stage game for every $x \in [0,1]$.

(a) Both players spend all of their time on the relationship (i.e., $x_A = x_B = 1$) until somebody deviates; thereafter, the deviating player spends 1 and the other player spends 0. (Find the range of $\delta$.)
Solution: Since $(1,0)$ and $(0,1)$ are Nash equilibria of the stage game, there is no incentive to deviate at any history with a previous deviation by one player. Now consider any other history, at which both players are supposed to spend 1. If a player $i$ follows the strategy, his average payoff is
$$\ln 2.$$
Suppose he deviates and spends $x_i < 1$. Then, since the other player is supposed to spend 1, in the continuation game player $i$ spends 1 and the other player spends 0. This yields 0 for player $i$. Hence, the average value of player $i$ from the deviation is
$$\left(\ln(1+x_i) + 1 - x_i\right)(1-\delta).$$
The best possible deviation is $x_i = 0$, yielding the payoff
$$1-\delta.$$
Hence, the strategy profile is a subgame-perfect Nash equilibrium iff
$$\ln 2 \ge 1-\delta,$$
where the values on the left- and right-hand sides of the inequality are the average values from following the strategy profile and from the best deviation, respectively. One can write this as a lower bound on the discount factor:
$$\delta \ge 1 - \ln 2.$$

(b) There are four states: $E$ (namely, Engagement), $M$ (namely, Marriage), $D_A$, and $D_B$. The game starts at state $E$, in which each player spends $\hat{x} \in (0,1)$. If both spend $\hat{x}$, they switch to state $M$; they remain in state $E$ otherwise. In state $M$, each spends 1. They remain in state $M$ until one player $i \in \{A, B\}$ spends less than 1 while the other player spends 1, in which case they switch to state $D_i$. In state $D_i$, player $i$ spends $\tilde{x}_i$ and the other player spends $1-\tilde{x}_i$ forever. (Find the set of inequalities that must be satisfied by the parameters $\delta$, $\hat{x}$, $\tilde{x}_A$, and $\tilde{x}_B$.)
Hint: The following facts about the logarithm may be useful:
$$\ln(e) = 1; \qquad \ln(x) \le x-1; \qquad \ln(xy) = \ln x + \ln y.$$

Solution: Since $(\tilde{x}_i, 1-\tilde{x}_i)$ is a Nash equilibrium of the stage game, there is no incentive to deviate at state $D_i$ for any $i \in \{A, B\}$. In state $M$, the average payoff from following the strategy profile is $\ln 2$. If a player $i$ deviates at state $M$, the next state is $D_i$ (as in part (a)), which gives the average payoff $1-\tilde{x}_i$ to $i$. Hence, as in part (a), the average payoff from the best deviation is $1-\delta+\delta\left(1-\tilde{x}_i\right) = 1-\delta\tilde{x}_i$. Therefore, there is no incentive to deviate at state $M$ iff $\ln 2 \ge 1-\delta\tilde{x}_i$, i.e.,
$$\delta\tilde{x}_i \ge 1-\ln 2. \tag{12.6}$$
On the other hand, in state $E$, the average payoff from following the strategy is
$$V_E = (1-\delta)\left(\ln(2\hat{x}) + 1-\hat{x}\right) + \delta\ln 2 = \ln 2 + (1-\delta)\left(\ln\hat{x} + 1-\hat{x}\right).$$
By deviating and playing $x_i \neq \hat{x}$, player $i$ can get
$$(1-\delta)\left(\ln(\hat{x}+x_i) + 1-x_i\right) + \delta V_E.$$
The best deviation is $x_i = 1-\hat{x}$, and it yields the maximum average payoff
$$(1-\delta)\hat{x} + \delta V_E.$$
There is no incentive to deviate at $E$ iff
$$V_E \ge (1-\delta)\hat{x} + \delta V_E,$$
which simplifies to
$$V_E \ge \hat{x}.$$
By substituting the value of $V_E$, one can write this condition as
$$\ln 2 + (1-\delta)\left(\ln\hat{x} + 1-\hat{x}\right) \ge \hat{x}. \tag{12.7}$$
The strategy profile is an SPE iff (12.6) and (12.7) are satisfied.



Remark 12.1 One can make the strategy profile above a subgame-perfect Nash equilibrium by varying the parameters $\hat{x}$, $\tilde{x}_A$, $\tilde{x}_B$, and $\delta$. For a fixed $(\hat{x}, \tilde{x}_A, \tilde{x}_B)$, both conditions bound the discount factor from below, yielding
$$\delta \ge \max\left\{\frac{1-\ln 2}{\tilde{x}_A},\ \frac{1-\ln 2}{\tilde{x}_B},\ 1 - \frac{\hat{x}-\ln 2}{\ln\hat{x}+1-\hat{x}}\right\}.$$
(To see this, observe that $\ln\hat{x}+1-\hat{x} < 0$.) Of course, when $\delta$ is fixed, the above conditions can also be interpreted as bounds on $\tilde{x}_i$ and $\hat{x}$. First, the contribution of the guilty party $i$ in the divorce state $D_i$ cannot be too low:
$$\tilde{x}_i \ge \frac{1-\ln 2}{\delta}.$$
Otherwise, the parties deviate, and the marriage cannot be sustained. Second, the above lower bound on $\delta$ also gives an absolute upper bound on the effort level during the engagement. Since $\delta < 1$ and $\ln\hat{x}+1-\hat{x} < 0$, the condition on $\delta$ implies that
$$\hat{x} < \ln 2 \cong 0.693.$$
Otherwise, the lower bound on $\delta$ would exceed 1. That is, one must start small, as the engagement may never turn into marriage otherwise. Of course, one could also skip the engagement altogether.

5. [Final, 2001] This question is about a milkman and a customer. On any day, in the given order:

• The milkman puts $m \in [0,1]$ liters of milk and $1-m$ liters of water in a container and closes the container, incurring cost $cm$ for some $c > 0$;

• The customer, without knowing $m$, decides whether or not to buy the liquid at some price $p$. If she buys, her payoff is $m-p$ and the milkman's payoff is $p-cm$. If she does not buy, she gets 0, and the milkman gets $-cm$. If she buys, then she learns $m$.

(a) Assume that this is repeated for 100 days, and each player tries to maximize the sum of his or her stage payoffs. Find all subgame-perfect equilibria of this game.
Solution: The stage game has a unique Nash equilibrium, in which $m = 0$ and the customer does not buy. Therefore, the finitely repeated game has a unique subgame-perfect equilibrium, in which the stage equilibrium is repeated in every period.
(b) Now consider the infinitely repeated game with the above stage game and with discount factor $\delta \in (0,1)$. What is the range of prices $p$ for which there exists a subgame-perfect equilibrium such that, every day, the milkman chooses $m = 1$ and the customer buys on the path of equilibrium play?
Solution: The milkman can guarantee himself 0 by always choosing $m = 0$. Hence, his continuation value at any history must be at least 0. Thus, in the worst equilibrium, if he deviates, the customer never buys milk again, giving the milkman exactly 0 as his continuation value. The SPE we are looking for is therefore: the milkman always chooses $m = 1$ and the customer buys until anyone deviates; the milkman chooses $m = 0$ and the customer does not buy thereafter. If the milkman does not deviate, his average value is
$$V = p - c.$$
The best deviation for him (at any history on the path of equilibrium play) is to choose $m = 0$ (and not to be able to sell thereafter). In that case, his average value is
$$V^{dev} = p(1-\delta) + \delta \cdot 0 = p(1-\delta).$$
For this to be an equilibrium, we must have $V \ge V^{dev}$, i.e.,
$$p - c \ge p(1-\delta),$$
i.e., $\delta p \ge c$. In order for the customer to buy on the equilibrium path, it must also be true that $p \le 1$. Therefore,
$$1 \ge p \ge c/\delta.$$

6. [Midterm 2 Make-up, 2006] Since the British officer had a thick pen when he drew the border, the border between Iraq and Kuwait is disputed. Unfortunately, the border passes through an important oil field. Each year, simultaneously, each of these countries decides whether to extract a high ($H$) or low ($L$) amount of oil from this field. Extracting a high amount of oil from the common field hurts the other country. In addition, Iraq has the option of attacking Kuwait ($W$), which is costly for both countries. The stage game is as follows, with Iraq as the row player and Kuwait as the column player:

           H          L
H       2, 2      4, 1
L       1, 4      3, 3
W     −1, −1    −1, −2

Consider the infinitely repeated game with this stage game and with discount factor $\delta = 0.9$.

(a) Find a subgame-perfect Nash equilibrium in which each country extracts a low ($L$) amount of oil every year on the equilibrium path.9
Solution: Consider the strategy profile:
Play (L,L) until somebody deviates, and play (H,H) thereafter.
This strategy profile is a subgame-perfect Nash equilibrium whenever $\delta \ge 1/2$. (You should be able to verify this at this stage.)

(b) Find a subgame-perfect Nash equilibrium in which Iraq extracts a high ($H$) amount of oil and Kuwait extracts a low ($L$) amount of oil every year on the equilibrium path.
Solution: Consider the following ("carrot and stick") strategy profile:10
There are two states: War and Peace. The game starts at state Peace. In state Peace, they play (H,L); they remain in Peace if (H,L) is played, and switch to War otherwise. In state War, they play (W,H); they switch to Peace if (W,H) is played, and remain in War otherwise.
This strategy profile is a subgame-perfect Nash equilibrium whenever $\delta \ge 3/5$. The vector of average values is $(4,1)$ in state Peace and $(-1,-1)(1-\delta) + \delta(4,1) = (5\delta-1,\ 2\delta-1)$ in state War. Note that both countries strictly prefer

9 That is, an outside observer would observe each country extracting a low amount of oil every year.
10 See the next chapter for more on carrot-and-stick strategies.

Peace to War. In state Peace, Iraq clearly has no incentive to deviate. On the other hand, Kuwait gets $2(1-\delta) + \delta(2\delta-1)$ from deviating to $H$. Hence, it has no incentive to deviate iff
$$2(1-\delta) + \delta(2\delta-1) \le 1,$$
i.e., $\delta \ge 1/2$, which is indeed the case. In state War, Kuwait clearly has no incentive to deviate. In that state, Iraq could possibly benefit from deviating to $H$, getting $2(1-\delta) + \delta(5\delta-1)$. It does not have an incentive to deviate iff
$$5\delta-1 \ge 2(1-\delta) + \delta(5\delta-1),$$
i.e.,
$$(5\delta-1)(1-\delta) \ge 2(1-\delta),$$
i.e., $5\delta-1 \ge 2$. This is equivalent to $\delta \ge 3/5$, which is clearly the case.
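The two incentive constraints can be verified at $\delta = 0.9$ with the state values derived above:

```python
delta = 0.9
peace_kuwait = 1             # Kuwait's average value in Peace
war_iraq = 5 * delta - 1     # = 3.5, Iraq's average value in War
war_kuwait = 2 * delta - 1   # = 0.8, Kuwait's average value in War

# Kuwait's deviation in Peace: 2 today, then the War value.
kuwait_ok = 2 * (1 - delta) + delta * war_kuwait <= peace_kuwait
# Iraq's deviation in War: 2 today, then the War value again.
iraq_ok = war_iraq >= 2 * (1 - delta) + delta * war_iraq

print(kuwait_ok, iraq_ok)  # -> True True
```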

7. [Selected from Midterms 2 in 2001 and 2002] Below are pairs of stage games and strategy profiles. For each pair, check whether the strategy profile is a subgame-perfect Nash equilibrium of the infinitely repeated game with the given stage game and discount factor $\delta = 0.99$.

(a) Stage Game:

1\2      C       D
C      6, 6    0, 4
D      4, 0    4, 4

Strategy profile: Each player plays C in the first round; in the following rounds he plays what the other player played in the previous round (i.e., at each $t > 0$, he plays what the other player played at $t-1$).
Solution: This is a version of Tit-for-tat; it is not a subgame-perfect Nash equilibrium. (Make sure that you can show this quickly at this point.)
(b) Stage Game:

1\2      L        C        R
T      3, 1     0, 0    −1, 2
M      0, 0     0, 0     0, 0
B     −1, 2     0, 0    −1, 2

Strategy profile: Until some player deviates, Player 1 plays T and Player 2 plays L. If anyone deviates, then each player plays his middle action ($M$ and $C$) thereafter.
Solution: This is a subgame-perfect Nash equilibrium. After a deviation, the players play a Nash equilibrium of the stage game forever. Hence, we only need to check that no player has an incentive to deviate on the path of equilibrium play. Player 1 clearly has no incentive to deviate. If Player 2 deviates, he gets 2 in the current period and zero thereafter. If he sticks to his equilibrium strategy, he gets 1 forever, with present value $1/(1-\delta) = 100 > 2$. Therefore, Player 2 does not have an incentive to deviate, either.
(c) Stage Game:

1\2      L        C        R
T      2, −1    0, 0    −1, 2
M      0, 0     0, 0     0, 0
B     −1, 2     0, 0     2, −1

Strategy profile: Until some player deviates, Player 1 plays T and Player 2 alternates between L and R. If anyone deviates, then each player plays his middle action thereafter.
Solution: It is subgame-perfect. Since the middle-action profile is a Nash equilibrium of the stage game, we only need to check whether any player wants to deviate at a history in which Player 1 has played T and Player 2 has alternated between L and R throughout. At such a history, the average value of Player 1 is
$$V_1 = \frac{2-\delta}{1+\delta} \cong 0.51$$
if Player 2 is to play L, and
$$V_1 = \frac{2\delta-1}{1+\delta} \cong 0.49$$
if Player 2 is to play R. In the case Player 2 is to play L, Player 1 cannot gain by deviating. In the case Player 2 is to play R, Player 1 can get at most
$$2(1-\delta) + \delta \cdot 0 = 0.02$$
by deviating to B. Since $0.02 < 0.49$, he has no incentive to deviate. The only possible profitable deviation for Player 2 is to play R when he is supposed to play L. In that contingency, if he follows the strategy he gets $(2\delta-1)/(1+\delta) \cong 0.49$; if he deviates, he gets only $2(1-\delta) + \delta \cdot 0 = 0.02$.

(d) Stage Game:

1\2      L       R
T      2, 2    1, 3
B      3, 1    0, 0

Strategy profile: The play depends on three states. In state $s_0$, each player plays his first action (T or L); in states $s_1$ and $s_2$, each player plays his second action (B or R). The game starts at state $s_0$. In state $s_0$, if each player plays his first action or each plays his second action, they stay at $s_0$; but if a player $i$ plays his second action while the other plays his first, they switch to state $s_i$. At any $s_i$, if player $i$ plays his second action, they switch to state $s_0$; otherwise they stay at state $s_i$.
Solution: It is not subgame-perfect. At state $s_2$, Player 2 is to play R, and the state in the next round is $s_0$ no matter what Player 1 plays. In that case, Player 1 would gain by deviating and playing T (in state $s_2$): he gets 1 instead of 0 today, with the same continuation.

12.5 Exercises

1. How many strategies are there in the twice-repeated prisoners' dilemma game?

2. Suppose that the stage game is a two-player game in which each player $i$ has $m_i$ strategies. How many strategies does each player have in an $n$-times repeated game?

3. Prove Theorem 12.1.

4. Show that in any Nash equilibrium $\sigma^*$ of the repeated game, the average payoff of player $i$ is at least the minmax payoff $\underline{v}_i$.

5. [Homework 4, 2011] Consider the infinitely repeated game with discount factor $\delta = 0.99$ and the following stage game (in which the players are trading favors):

1\2        Give       Keep
Give      1, 1      −1, 2
Keep      2, −1      0, 0

(a) Find a subgame-perfect equilibrium under which the average expected payoff of Player 1 is at least 1.33. Verify that your strategy profile is indeed a subgame-perfect Nash equilibrium.

(b) Find a subgame-perfect equilibrium under which the average expected payoff
of Player 1 is at least 1.49. Verify that your strategy profile is indeed a
subgame-perfect Nash equilibrium.
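For exercises of this kind, the discounted average payoff of a candidate periodic play path can be computed with a short helper; a sketch (the two-period cycle below is only an illustration, not a solution):

```python
def discounted_avg(cycle, delta=0.99):
    """Discounted average payoff (1-d) * sum_t d**t u_t of repeating `cycle`."""
    k = len(cycle)
    one_cycle = sum((delta ** t) * u for t, u in enumerate(cycle))
    return (1 - delta) * one_cycle / (1 - delta ** k)  # geometric repetition

# Player 1's stage payoffs on the cycle (Keep, Give), (Give, Give): 2, then 1.
print(round(discounted_avg([2, 1]), 3))  # 1.503
```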

6. [Midterm 2, 2011] Consider the 100-times repeated game with the following stage
game: Player 1 first chooses between I and X. Choosing X ends the stage game
with payoffs (2, 0). If Player 1 chooses I, then Player 1 picks a or b while
Player 2 simultaneously picks L or R, with payoffs

            L       R
    a     5, 1    0, x
    b     x, 0    1, 6

where x is either 0 or 6.

(a) Find the set of pure-strategy subgame-perfect equilibria of the stage game
for each x ∈ {0, 6}.
(b) Take x = 6. What is the highest payoff Player 2 can get in a subgame-perfect
equilibrium of the repeated game?
(c) Take x = 0. Find a subgame-perfect equilibrium of the repeated game in
which Player 2 gets more than 300 (i.e., more than 3 per day on average).

7. [Midterm 2, 2011] Consider an infinitely repeated game in which the stage game is
as in the previous problem. Take the discount factor δ = 0.99 and x = 6. For each
strategy profile below, check whether it is a subgame-perfect Nash equilibrium.

(a) They play (a, L) every day until somebody deviates; they play (b, R) thereafter.

(b) There are three states: e, p1, and p2, where the play is (a, L), (a, R), and
(b, L), respectively. The game starts at state e. After state e, it switches
to state p1 if the play is (b, L) and to state p2 if the play is (a, R); it stays
in state e otherwise. After states p1 and p2, it switches back to state e
regardless of the play.

8. [Midterm 2 Make Up, 2011] Consider an infinitely repeated game in which the
discount factor is δ = 0.9 and the stage game is

            X         Y          Z
    A     4, 4      0, 5       0, 0
    B     5, 0      3, 3      −1, 0
    C     2, 2      1, 1      −2, 0
    D     0, 0      0, −1     −3, −2

For each payoff vector (v1, v2) below, find a subgame-perfect equilibrium of the
repeated game in which the average discounted payoff is (v1, v2). Verify that the
strategy profile you identified is indeed a subgame-perfect equilibrium.

(a) (v1, v2) = (4, 4).

(b) (v1, v2) = (2, 2).

9. [Midterm 2 Make Up, 2011] Consider the infinitely repeated game with the stage
game in the previous problem and the discount factor δ ∈ (0, 1). For each of the
strategy profiles below, find the conditions on the discount factor for which the
strategy profile is a subgame-perfect equilibrium.

(a) At t = 0, they play (A, X). At each t, they play (A, X) if the play at t − 1 is
(A, X) or if the play at t − 2 is not (A, X). Otherwise, they play (B, Y).

(b) There are 4 states: (A, X), (A, Z), (D, X), and (D, Z). At each state (s1, s2), the
play is (s1, s2). The game starts at state (A, X). For any t with state (s1, s2), the
state at t + 1 is

(A, X) if the play at t is (s1, s2),
(D, X) if the play at t is (s1, s2') for some s2' ≠ s2,
(A, Z) if the play at t is (s1', s2) for some s1' ≠ s1,
(D, Z) if the play at t is (s1', s2') for some s1' ≠ s1 and s2' ≠ s2.

10. [Homework 4, 2011] Consider the n-times repeated game with the following stage
game: Player A first chooses between I and X. Choosing X ends the stage game
with payoffs (1, 0, 0). If A chooses I, then B picks L or R while C simultaneously
picks L or R, with payoffs (listed in the order A, B, C)

    B\C        L             R
     L     (0, 2, 2)     (5, 0, 0)
     R     (5, 0, 0)     (2, 1, 1)

(a) For n = 2, what is the largest payoff A can get in a subgame-perfect Nash
equilibrium in pure strategies?

(b) For n > 2, find a subgame-perfect Nash equilibrium in which the payoff of A
is at least 5n − 6.

11. [Homework 4, 2011] Consider the infinitely repeated game with discount factor
δ ∈ (0, 1) and the stage game in the previous problem. For each of the strategy
profiles below, find the range of δ under which the strategy profile is a subgame-perfect
Nash equilibrium.

(a) A always plays X. B and C both play L until somebody deviates, and they
both play R thereafter.

(b) A plays I, and B and C rotate between (L, R), (R, L), and (L, L) until somebody
deviates; they play (X, R, R) thereafter.
(Note that the outcome is (I, L, R), (I, R, L), (I, L, L), (I, L, R), (I, R, L), . . . .)

12. [Homework 4, 2007] Seagulls love shellfish. In order to break the shell, they need
to fly high up and drop the shellfish. The problem is that the other seagulls on the
beach are kleptoparasites, and they steal the shellfish if they can reach it first. This
question tells the story of two seagulls, named Irene and Jonathan, who live in a
crowded beach where it is impossible to drop the shellfish and get it before some
other gull steals it. The possible dates are t = 0, 1, 2, 3, . . . , with no upper bound.
Every day, Irene and Jonathan simultaneously choose one of two actions: "Up"
or "Down". Up means to fly high up with the shellfish and drop it next to the
other seagull's nest, and Down means to stay down in the nest. Up costs c > 0,
but if the other seagull is down, it eats the shellfish, getting payoff v > c. That is,
we consider the infinitely repeated game with the following stage game

              Up         Down
    Up     −c, −c       −c, v
    Down    v, −c        0, 0

and discount factor δ ∈ (0, 1).¹¹ For each strategy profile below, find the set of
discount factors δ under which the strategy profile is a subgame-perfect equilibrium.

(a) Irrespective of the history, Irene plays Up on the even dates and Down on the
odd dates; Jonathan plays Up on the odd dates and Down on the even dates.
(b) Irene plays Up on the even dates and Down on the odd dates while Jonathan
plays the other way around, until someone fails to go Up on a day that he is
supposed to do so. They both stay Down thereafter.
(c) For k days Irene goes Up and Jonathan stays Down; in the next k days
Jonathan goes Up and Irene stays Down. This continues back and forth until
someone deviates. They both stay Down thereafter.
(d) Irene goes Up on "Sundays", i.e., at t = 0, 7, 14, 21, . . . , and stays Down on
the other days, while Jonathan goes Up every day except for Sundays, when
he rests Down, until someone deviates; they both stay Down thereafter.
(e) At t = 0, Irene goes Up and Jonathan stays Down, and then they alternate.
If a seagull fails to go Up at a history when it is supposed to go Up, then
the next day it goes Up and the other seagull stays Down, and they keep
alternating thereafter until someone fails to go Up when it is supposed to do
so. (For example, given the history, if Irene is supposed to go Up at t but
stays Down, then Irene goes Up at t + 1, Jonathan goes Up at t + 2, and so
on. If Irene stays Down again at t + 1, then she is supposed to go Up at t + 2,
and Jonathan is supposed to go Up at t + 3, etc.)

¹¹ Evolutionarily speaking, the discounted sum is the fitness of the genes, which determine the behavior.

13. [Homework 4, 2007] Consider the infinitely repeated game, between Alice and Bob,
with the following stage game: Alice first chooses between Hire and Fire. Fire
ends the stage game with payoffs (0, 0). If Alice Hires, then Bob chooses between
Work, which yields payoffs (2, 2), and Shirk, which yields (−1, 3), where the first
entry is Alice's payoff.

The discount factor is δ = 0.9. (Fire does not mean that the game ends.) For each
strategy profile below, check if it is a subgame-perfect equilibrium. If it is not a
SPE for δ = 0.9, find the set of discount factors δ under which it is a SPE.

(a) Alice Hires if and only if there is no Shirk in the history. Bob Works if and
only if there is no Shirk in the history.

(b) Alice Hires unless Bob (was hired and) Shirked in the previous period, in
which case she Fires. Bob always Works.

(c) There are three states: Employment, Punishment for Alice, and Punishment
for Bob. In the Employment state, Alice Hires and Bob Works. In the
Punishment state for Alice, Alice Hires but Bob Shirks. In the Punishment
state for Bob, Alice Fires, and Bob would have worked if Alice Hired him. The
game starts in Employment state. At any state, if only one player fails to play
what s/he is supposed to play at that state, then we go to the Punishment
state for that player in the next period; otherwise we go to the Employment
state in the next period.

14. [Midterm 2, 2007] Consider the infinitely repeated game with the following stage
game

               Chicken     Lion
    Chicken     3, 3       1, 4
    Lion        4, 1       0, 0

and discount factor δ = 0.99. For each strategy profile below, check if it is a
subgame-perfect equilibrium. (You need to state your arguments clearly; you will
not get any points for Yes or No answers.)

(a) There are two states: Cooperation and Fight. The game starts in the Cooper-
ation state. In Cooperation state, each player plays Chicken. If both players
play Chicken, then they remain in the Cooperation state; otherwise they go
to the Fight state in the next period. In the Fight state, both play Lion, and
they go back to the Cooperation state in the following period (regardless of
the actions).

(b) There are three states: Cooperation, P1 and P2. The game starts in the
Cooperation state. In the Cooperation state, each player plays Chicken. If they
play (Chicken, Chicken) or (Lion, Lion), then they remain in the Cooperation
state in the next period. If player i plays Lion while the other player plays
Chicken, then in the next period they go to the Pi state. In the Pi state, player i
plays Chicken while the other player plays Lion; they then go back to the
Cooperation state (regardless of the actions).
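Finite-state profiles like these can be checked mechanically with the single-deviation principle: compute each state's average discounted value, then test every one-period deviation in every state. A sketch for part (a), with the payoffs and transitions encoded from the text:

```python
from itertools import product

DELTA = 0.99
ACTS = ["Chicken", "Lion"]
U = {("Chicken", "Chicken"): (3, 3), ("Chicken", "Lion"): (1, 4),
     ("Lion", "Chicken"): (4, 1), ("Lion", "Lion"): (0, 0)}

# Part (a): two states; Fight always returns to Cooperation next period.
PLAY = {"Coop": ("Chicken", "Chicken"), "Fight": ("Lion", "Lion")}

def trans(state, play):
    if state == "Coop":
        return "Coop" if play == ("Chicken", "Chicken") else "Fight"
    return "Coop"

# State values V[s][i]: iterate V = (1-d)*u + d*V[next] to convergence.
V = {s: (0.0, 0.0) for s in PLAY}
for _ in range(5000):
    V = {s: tuple((1 - DELTA) * U[PLAY[s]][i]
                  + DELTA * V[trans(s, PLAY[s])][i] for i in range(2))
         for s in PLAY}

def one_shot_deviation_proof():
    for s, prof in PLAY.items():
        for i, dev in product(range(2), ACTS):
            d = list(prof)
            d[i] = dev
            val = (1 - DELTA) * U[tuple(d)][i] + DELTA * V[trans(s, tuple(d))][i]
            if val > V[s][i] + 1e-9:
                return False
    return True

print(one_shot_deviation_proof())  # False: in Fight, Chicken is profitable
```

The check fails because the transition out of the Fight state does not depend on play there, so each player myopically best responds with Chicken against Lion.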

15. [Midterm 2 Make Up, 2007] Alice has two sons, Bob and Colin. Every day, she is to
choose between letting them play with the toys ("Play") or making them visit their
grandmother ("Visit"). If she makes them visit their grandmother, each of them
gets 1. If she lets them play, then Bob and Colin simultaneously choose between
Grab and Share, which leads to the payoffs in the following table, where the
third entry is the payoff of Alice:

    Bob\Colin        Grab           Share
    Grab        −1, −1, −1      3, −2, −2
    Share       −2, 3, −2        2, 2, 2

Consider the infinitely repeated game with the above game as the stage game and
discount factor δ = 0.9. For each strategy profile below, check if it is a
subgame-perfect equilibrium. Show your work.

(a) There are three states: Share, PBob, and PColin. In the Share state, Alice lets
them play, and Bob and Colin both Share. In the PBob state (resp. PColin
state), Alice lets them play, and Bob (resp. Colin) Shares while the other
brother Grabs. The game starts in the Share state. If Bob (resp. Colin) does
not play what he is supposed to play while the other player plays what he is
supposed to play, then the next day we go to the PBob (resp. PColin) state; we
go to the Share state the next day otherwise.
(b) There are two states: Play and Visit. The game starts in the Play state. In
the Play state, Alice lets them play, and both sons Share. In the Play state, if
everybody does what they are supposed to do, we remain in the Play state; we
go to the Visit state the next day otherwise. In the Visit state, Alice makes them
visit their grandmother, and they would both Grab if she let them play. From
the Visit state, they automatically go back to the Play state the next day.

16. [Homework 4, 2006] Alice has a restaurant, and Bob is a potential customer. Each
day Alice is to decide whether to use high quality supply (High) or low quality
supply (Low) to make the food, and Bob is to decide whether to buy or not at
price p ∈ [1, 3]. (At the time Bob buys the food, he cannot tell if it is of high
quality, but after buying he knows whether it was high or low quality.) The payoffs
for a given day are as follows.

    Alice\Bob         Buy          Skip
    High         p − 1, 3 − p    −1, 0
    Low             p, −p         0, 0

The discount factor is δ = 0.99. For each of the following strategy profiles, find the
range of p ∈ [1, 3] for which the strategy profile is a subgame-perfect equilibrium.

(a) There are two states: Trade and No-trade. The game starts at the Trade state.
In the Trade state, Alice uses High quality supply, and Bob Buys. If in the Trade
state Alice uses Low quality supply, then they go to the No-trade state, in
which for k days Alice uses Low quality supply and Bob Skips. At the end of
the k-th day, independent of what happens, they go back to the Trade state.

(b) Alice is to use High quality supply on the even days, t = 0, 2, 4, . . . , and Low
quality supply on the odd days, t = 1, 3, 5, . . . ; Bob is to Buy every day. If
anyone deviates from this program, then in the rest of the game Alice uses
Low quality and Bob Skips.¹²

17. [Homework 4, 2006] In the previous problem, take p = 2, and check whether each of the
following is a subgame-perfect equilibrium. [We assume here that Bob can somehow
check whether the food was good in the previous day even if he did not buy it.]

(a) Every day Alice uses High quality supply. Bob Buys the product on the first
day. Afterwards, Bob Buys the product if and only if Alice used High
quality supply in the previous day.

(b) There are two states: Trade and Punishment. The game starts at Trade state.
In Trade state, Alice uses High quality supply, and Bob Buys. In Trade state
if Alice uses Low quality, then we go to Punishment state. In Punishment
state, Alice uses High quality supply, and Bob Skips. In Punishment state, if
Alice uses Low quality supply or Bob Buys, then we remain in the Punishment
state; otherwise we go to Trade state.

18. [Homework 4, 2006] In an eating club, there are n > 2 members. Each day, each
member i is to decide how much to eat, denoted by x_i, and the payoff of i for that
day is

    √x_i − (x_1 + · · · + x_n)/n.

For δ = 0.99, check if either of the following strategy profiles is a subgame-perfect
equilibrium. [If you solve the problem for n = 3, you will get 80%.]

(a) Each player eats x_i = 1/4 units until somebody eats more than 1/4; thereafter
each eats x_i = n²/4 units.
¹² That is, at any t', Alice will use Low quality supply and Bob will Skip in either of the following
cases: (i) Alice used Low quality supply at an even date t < t', or (ii) she used High quality supply at
an odd date t < t', or (iii) Bob Skipped at some date t < t'.

(b) Each player eats x_i = 1/4 units until somebody eats more than 1/4; thereafter
each eats x_i = 2 units.

19. [Homework 4, 2006] Each day Alice and Bob receive 1 dollar. Alice makes an offer x
to Bob, and Bob accepts or rejects the offer, where x ∈ {0.01, 0.02, . . . , 0.98, 0.99}.
If Bob accepts the offer, Alice gets 1 − x and Bob gets x. If Bob rejects the offer, then
they both get 0. Find the values of δ for which the following is a subgame-perfect
equilibrium, where x̄ ∈ {0.01, 0.02, . . . , 0.98, 0.99} is fixed.

At t = 0, Alice offers x̄, and Bob accepts an offer x if and only if x ≥ x̄. They
keep doing this until Bob deviates from this program (i.e., until Bob accepts an
offer x < x̄, or Bob rejects an offer x ≥ x̄). Thereafter, Alice offers x = 0.01 and
Bob accepts any offer.

20. [Homework 3, 2004] Consider a Firm and a Worker. The firm first decides whether
to pay a wage w > 0 to the worker (hire him), and then the worker decides
whether to work, which costs him c > 0 and produces v for the firm, where v > w > c.
The payoffs are as follows:

                            Firm      Worker
    pay, work              v − w      w − c
    pay, shirk              −w          w
    don't pay, work          v         −c
    don't pay, shirk         0          0

(a) Find all Nash equilibria.

(b) Now consider the game in which this stage game is repeated infinitely many times and
the players discount the future with δ. The following are strategy profiles
for this repeated game. For each of them, check if it is a subgame-perfect
Nash equilibrium for large values of δ, and if so, find the lowest discount factor
that makes the strategy profile a subgame-perfect equilibrium.

i. No matter what happens, the firm always pays and the worker works.
ii. At any time t, the worker works if he is paid at t, and the firm always
pays.

iii. At  = 0, the firm pays and the worker works. At any time   0, the
firm pays if and only if the worker worked at all previous dates, and the
worker works if and only if he has worked at all previous dates.
iv. At  = 0, the firm pays and the worker works. At any time   0, the
firm pays if and only if the worker worked at all previous dates at which
the firm paid, and the worker works if and only if he is paid at  and he
has worked at all previous dates at which he was paid.
v. There are two states: Employment, and Unemployment. The game starts
at Employment. In this state, the firm pays, and the worker works if and
only if he has been paid at this date. If the worker shirks we go to Un-
employment state; otherwise we stay in Employment. In Unemployment
the firm does not pay and the worker shirks. After   0 days of Unem-
ployment we always go back to Employment. (Your answer should cover
each   0.)

21. Stage Game: Alice and Bob simultaneously choose contributions a ∈ [0, 1] and
b ∈ [0, 1], respectively, and get payoffs u_A = 2b − a and u_B = 2a − b, respectively.

(a) (5 points) Find the set of rationalizable strategies in the Stage Game above.
(b) (10 points) Consider the infinitely repeated game with the Stage Game above
and with discount factor δ ∈ (0, 1). For each δ, find the maximum (a*, b*)
such that there exists a subgame-perfect equilibrium of the repeated game
in which Alice and Bob contribute a* and b*, respectively, on the path of
equilibrium.
(c) (10 points) In part (b), now assume that at the beginning of each period
t one of the players (Alice at periods t = 0, 2, 4, . . . and Bob at periods
t = 1, 3, 5, . . .) offers a stream of contributions a = (a_t, a_{t+1}, . . .) and b =
(b_t, b_{t+1}, . . .) for Alice and Bob, respectively, and the other player accepts or
rejects. If the offer is accepted, then the game ends, leading to the automatic
contributions a = (a_t, a_{t+1}, . . .) and b = (b_t, b_{t+1}, . . .) from period t on. If the
offer is rejected, they play the Stage Game and proceed to the next period.
Find (a_A, b_A), (a_B, b_B), and (â, b̂) such that the following is a subgame-perfect
equilibrium:

s*: When it is Alice's turn, Alice offers (a_A, b_A), and Bob accepts an offer
(a, b) if and only if (1 − δ)[2a_t − b_t + δ(2a_{t+1} − b_{t+1}) + · · · ] ≥ 2â − b̂.
When it is Bob's turn, Bob offers (a_B, b_B), and Alice accepts an offer (a, b)
if and only if (1 − δ)[2b_t − a_t + δ(2b_{t+1} − a_{t+1}) + · · · ] ≥ 2b̂ − â. If
there is no agreement, in the stage game they play (â, b̂).

Verify that s* is a subgame-perfect equilibrium for the values that you found.
(If you find it easier, you can consider only the constant streams of
contributions a = (a, a, . . .) and b = (b, b, . . .).)

22. [Selected from Midterm 2 and make-up exams in 2002 and 2004] Below are
pairs of stage games and strategy profiles. For each pair, check whether
the strategy profile is a subgame-perfect equilibrium of the game in which the
stage game is repeated infinitely many times. Each agent tries to maximize the
discounted sum of his expected payoffs in the stage game, and the discount factor is
δ = 0.99. (Clearly explain your reasoning in each case.)

(a) Stage Game: There are n > 2 players. Each player, simultaneously, decides
whether to contribute $1 to a public good production project. The amount
of public good produced is y = (c_1 + · · · + c_n)/2, where c_i ∈ {0, 1} is the
level of contribution for player i. The payoff of a player i is y − c_i.
Strategy profile: Each player contributes, choosing c_i = 1, if and only if
the amount of public good produced at each previous date is greater than
n/4; otherwise each chooses c_i = 0. (According to this strategy profile, each
player contributes in the first period.)

(b) Stage Game:

         C        D
    C   6, 6     0, 4
    D   4, 0     4, 4

Strategy profile: Each player plays C until someone deviates. If a player
deviates, then he is to keep playing C while the other player plays D forever.

(c) Stage Game:

         C        D
    C   6, 6     0, 4
    D   4, 0     4, 4

Strategy profile: Each player plays C until someone deviates. If a player
deviates, then each player plays D forever.
(d) Stage Game: Player 1 decides whether to give $100 to Player 2. If Player
1 gives $100, then Player 2 decides whether to provide a service to Player 1,
which is worth $200 to Player 1 and costs $50 to Player 2.
Strategy Profile: There are two states: Trade and No-trade. The game
starts in the Trade state. If Player 1 pays $100 and Player 2 does not provide
the service, then they go to the No-trade state and stay there for two periods.
In the No-trade state, Player 1 does not give any money, and Player 2 does not
provide the service (if Player 1 pays him $100).

23. [Midterm 2 Make Up, 2001] Consider the infinitely repeated game with the
Prisoners' Dilemma game

         C        D
    C   4, 4     0, 5
    D   5, 0     1, 1

as its stage game. Each agent tries to maximize the discounted sum of his expected
payoffs in the stage game with discount factor δ.

(a) What is the lowest discount factor δ such that there exists a subgame-perfect
equilibrium in which each player plays C on the path of equilibrium play?
[Hint: Note that a player can always guarantee himself an average payoff of
1 by playing D forever.]
(b) For sufficiently large values of δ, construct a subgame-perfect equilibrium in
which any agent's action at any date t depends only on the play at dates t − 1
and t − 2, and in which each player plays C on the path of equilibrium play.
Chapter 13

Application: Implicit Cartels

This chapter discusses several important subgame-perfect equilibrium strategies for
implicit cartels, using the linear Cournot oligopoly as the stage game. For game theory,
they provide many applications of the single-deviation principle in repeated games. The
first strategy is the simple trigger strategy, which switches to the myopic Nash equilibrium
forever after any deviation. I first characterize the range of discount factors under which
the monopoly prices can be supported by such a subgame-perfect equilibrium. Then, I
find the optimal production level supported by such a subgame-perfect equilibrium for any
given discount factor. Next, I study the Carrot & Stick strategies, which reward good
behavior by switching to the Carrot state and punish bad behavior by switching to the
Stick state. Here, in the Stick state, the firms can inflict painful punishments, which can
be costly to themselves, fearing that the failure to punish will prolong the punishment
and delay the reward at the end. Finally, I consider a variation of the Carrot & Stick
strategy to discuss price wars.

13.1 Infinitely Repeated Cournot Oligopoly

I will use the infinitely repeated linear Cournot oligopoly as the main model of a cartel.
There are n firms, each with marginal cost c ∈ (0, 1). In the stage game, each firm i
simultaneously produces q_i units of a good and sells it at price

    P = max {1 − Q, 0},

where Q = q_1 + · · · + q_n is the total supply. In the repeated game, all the past production
levels of all firms are publicly observable, and each firm's utility function is the discounted
sum of its stage profits, where the discount factor is δ:

    U_i = Σ_{t=0}^∞ δ^t q_{i,t} (P(q_{1,t} + · · · + q_{n,t}) − c),

where q_{i,t} is the production level of firm i at time t. Sometimes it will be more convenient
to use the discounted average value, which is (1 − δ) U_i.

For any q, write

    π(q) = q (P(nq) − c) = q (max {1 − nq, 0} − c)        (13.1)

for the (daily) profit of a firm when each firm produces q, and

    π*(q) = max_{q' ≥ 0} q' (P(q' + (n − 1)q) − c) = { (1 − (n − 1)q − c)²/4  if (n − 1)q ≤ 1 − c;  0  otherwise }        (13.2)

for the maximum profit of a firm from best responding when all the other firms produce q.

13.2 Monopoly Production with Patient Firms

If it is possible to enforce, it is in the firms' best interest to produce the monopoly
production level

    Q^M = (1 − c)/2

in total and divide the revenues according to their favored division rule, which could be
attained by assigning some production levels to the firms that add up to Q^M. For the
sake of simplicity, let us assume that they would like to divide it equally. Then, the
above outcome is attained simply by each firm producing

    q^M = Q^M/n = (1 − c)/(2n).

As established by the Folk Theorem, when the discount factor is high, such
outcomes can be the outcome of a subgame-perfect equilibrium. In that case, the firms
can make some tacit informal plans that form a subgame-perfect equilibrium and yield
the desired outcome. Since the plan is a subgame-perfect equilibrium, they may hope
that everybody will follow through in the absence of an official enforcement mechanism,
such as courts.

A simple strategy profile that leads to the above outcome is as follows:

Simple Trigger Strategy: Each firm is to produce q^M until somebody
deviates, and produce q^N = (1 − c)/(n + 1) thereafter.

The above strategy profile yields each firm producing q^M forever, stipulating that
they would fall back to the myopic Nash equilibrium production q^N if any firm deviates,
leading to the breakdown of the cartel. This strategy profile may or may not be a
subgame-perfect equilibrium, depending on the discount factor. This section is devoted
to determining the range of discount factors under which it is indeed a subgame-perfect
equilibrium.

Once a firm deviates and the cartel breaks down, the firms play the stage-game
Nash equilibrium regardless of what happens thereafter, which is a subgame-perfect Nash
equilibrium of the subgame after the breakdown, as established before. Hence,
by the single-deviation principle, it suffices to check whether a firm has an incentive to
deviate while the cartel is in place (i.e., no firm has deviated from producing q^M). In that
case, according to the single-deviation test, the average discounted value of producing
q^M for a firm i is

    V^M = π(q^M) = (1 − c)²/(4n).

A deviation of producing q ≠ q^M yields the average value of

    V(q) = (1 − δ) q (1 − (n − 1) q^M − q − c) + δ π(q^N),

where the first term is the payoff from the current period, in which the other firms are
producing q^M each, and the second term π(q^N) = (1 − c)²/(n + 1)² is the value of the
flow payoff of the Nash equilibrium, starting from the next day. The best possible deviation
payoff is

    V* = max_{q ≠ q^M} V(q) = (1 − δ) π*(q^M) + δ π(q^N),

where π*(q^M) = ((n + 1)(1 − c)/(4n))² is the profit from best responding to q^M. The firm does not
have an incentive to deviate if and only if

    V^M ≥ V*,
[Figure 13.1: δ** as a function of n. The threshold rises from about 0.53 at n = 2 toward 1 as n grows to 100.]

i.e.,

    δ ≥ δ** ≡ (π*(q^M) − π(q^M)) / (π*(q^M) − π(q^N)).

Clearly, for any n, δ** is less than 1, and hence the simple trigger strategy profile above
is a subgame-perfect equilibrium when the discount factor is large (larger than δ**).
As shown in Figure 13.1, for small n, δ** is reasonably small, and the monopoly prices
are maintained in the simple trigger strategy equilibrium for reasonable values of δ. On
the other hand, δ** is increasing in n, and δ** → 1 as n → ∞. Hence, for any given
discount factor, as the number of firms becomes very large, the simple trigger strategy
profile fails to be an equilibrium.
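The threshold δ** can be computed and the comparative statics in n verified numerically; a sketch (note that the (1 − c)² factor cancels from every profit term, so δ** does not depend on c):

```python
def delta_threshold(n, c=0.0):
    """Threshold (pi*(q^M) - pi(q^M)) / (pi*(q^M) - pi(q^N)) for the trigger."""
    qM, qN = (1 - c) / (2 * n), (1 - c) / (n + 1)
    piM = qM * (1 - n * qM - c)                 # monopoly-share profit
    piN = qN * (1 - n * qN - c)                 # Cournot-Nash profit
    piS = (1 - (n - 1) * qM - c) ** 2 / 4       # best response to q^M
    return (piS - piM) / (piS - piN)

print(round(delta_threshold(2), 3))  # 0.529 (= 9/17)
# Increasing in n, as in Figure 13.1:
assert all(delta_threshold(n) < delta_threshold(n + 1) for n in range(2, 100))
```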

13.3 Optimal Production Level with a Fixed δ

For fixed n and δ with δ < δ**, the simple trigger strategy above is not an equilibrium
when the firms try to maintain the monopoly prices on the path. Such a plan would likely
tempt the firms to overproduce, breaking the cartel and resulting in
a highly competitive outcome with low prices and profits. The firms may instead want to target a
lower profit that can be supported by a simple trigger strategy equilibrium. This section
is devoted to finding the optimal production level supported by a simple trigger strategy.
More precisely, for fixed n and δ, consider the following strategy profile:

Simple Trigger Strategy (q*): Each firm is to produce q* until somebody deviates,
and produce q^N = (1 − c)/(n + 1) thereafter.

Note that in the outcome of this strategy profile each firm produces q* each day,
yielding the average discounted value of

    V(q*) = π(q*) = q* (1 − n q* − c)        (13.3)

to each firm. The main question is: which q* maximizes the firms' profits subject
to the constraint that the simple trigger strategy profile is a subgame-perfect Nash
equilibrium?

Once again, since the myopic Nash equilibrium is played after the breakdown of the
cartel, it suffices to check that there is no incentive to deviate on the path, on which
all firms have produced q* at all times. At any such history, any unilateral deviation q ≠ q*
yields the average discounted value of

    V(q) = (1 − δ) q (1 − (n − 1) q* − q − c) + δ π(q^N)

to the deviating firm. To see this, note that on the first day, the firm's profit is
q(1 − (n − 1)q* − q − c), as it produces q and all the other firms produce q*. This one-time
profit is multiplied by (1 − δ). After the deviation, the firm gets the myopic Nash
equilibrium profit of π(q^N) = (1 − c)²/(n + 1)² every day, which has the average
discounted value of π(q^N). Since the firm gets this starting from the next day, it is multiplied
by δ. The simple trigger strategy profile above is a subgame-perfect Nash equilibrium if
and only if

    V(q*) ≥ V(q)    (∀q ≠ q*).

This constraint reduces to

    V(q*) ≥ max_{q ≠ q*} V(q) = (1 − δ) π*(q*) + δ π(q^N);        (13.4)

the simple trigger strategy profile is a subgame-perfect equilibrium if and only if (13.4)
is satisfied. Hence, the objective in this section is to maximize V(q*) = π(q*) in (13.3)
subject to the constraint V(q*) ≥ (1 − δ) π*(q*) + δ π(q^N) in (13.4).

When δ ≥ δ**, the monopoly production q^M is an equilibrium value for q*. (After
all, it has been shown in the previous section that the simple trigger strategy for q* = q^M
is a subgame-perfect equilibrium if and only if δ ≥ δ**.) In that case, the optimal value
for q* is q^M. When δ < δ**, q^M is not an equilibrium value for q*. In that case, the
minimum allowable value for q* is optimal, which is given by the equality

    V(q*) = (1 − δ) π*(q*) + δ π(q^N),

i.e.,

    q* (1 − n q* − c) = (1 − δ) (1 − (n − 1) q* − c)²/4 + δ (1 − c)²/(n + 1)².

The explicit solution to the above quadratic equation is not important; the effect of
the parameters on the solution can be gleaned from the equation itself. The left-hand side is
independent of the discount factor, while the right-hand side is decreasing
in δ. This is because the payoff from deviation, which is multiplied by (1 − δ), is larger
than the myopic Nash equilibrium payoff, which is multiplied by δ. Hence, as the
discount factor increases, the right-hand side goes down, decreasing q*. This results
in a lower amount of production and higher profits, at the expense of the
consumers: more patient firms can maintain higher cartel prices without
being tempted by the short-term opportunities.
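The binding value of q* can be found numerically from the equality above; a sketch using bisection between the monopoly share q^M and the Cournot output q^N (at q^N the constraint holds with equality, since the best response to q^N is q^N itself):

```python
def optimal_q(n, c, delta):
    """Smallest q* satisfying pi(q*) >= (1-d)*pi_star(q*) + d*pi(q^N)."""
    qM, qN = (1 - c) / (2 * n), (1 - c) / (n + 1)
    piN = qN * (1 - n * qN - c)

    def slack(q):   # payoff from conforming minus payoff from best deviation
        return q * (1 - n * q - c) - (
            (1 - delta) * (1 - (n - 1) * q - c) ** 2 / 4 + delta * piN)

    if slack(qM) >= 0:           # patient firms: monopoly share is sustainable
        return qM
    lo, hi = qM, qN              # slack < 0 at qM, slack = 0 at qN
    for _ in range(100):         # bisect for the lowest sustainable output
        mid = (lo + hi) / 2
        if slack(mid) >= 0:
            hi = mid
        else:
            lo = mid
    return hi

print(round(optimal_q(2, 0.0, 0.4), 4))  # between q^M = 0.25 and q^N = 1/3
print(optimal_q(2, 0.0, 0.6))            # 0.25 = q^M, since 0.6 > 9/17
```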

13.4 Reward and Punishment: Carrot-Stick Strategies

In the above strategy profiles, the level of equilibrium quantities is limited by the
fact that the punishment after a deviation resorts to the Nash equilibrium of the stage
game, which bounds the deviators' payoffs from below. In many games, like the Cournot
oligopoly, the average payoff of a player in the repeated game can be lower than his lowest
equilibrium payoff in the stage game. Using such low SPE payoffs after a deviation, one
can maintain even higher equilibrium payoffs in a SPE. Such equilibria are of course
more sophisticated than the simple trigger strategies employed in the previous section.
Among them, a relatively simple Carrot & Stick strategy plays a central role.
This section is devoted to constructing such a Carrot & Stick strategy in the Cournot
oligopoly.

Carrot & Stick Strategy: There are two states: Carrot and Stick. Each
player plays  in Carrot state and  in Stick state. The game starts in
Carrot state. At any , if all players play what they are supposed to play,
they go to Carrot state at  + 1; they go to Stick state at  + 1 otherwise.

In a Carrot & Stick strategy, the Carrot state is used as a reward for following through,
and the Stick state is used as a punishment for deviation. Hence, the profit from
$(q^S, \ldots, q^S)$ is lower than the profit from $(q^C, \ldots, q^C)$. Note that the punishment in the
Stick state can be costly for everyone, including the other players who are punishing the
deviant player. They may then forgive the deviant in order to avoid the cost. In order
to deter them from failing to punish the deviant, the equilibrium prescribes that they, too,
will be punished the next period if they fail to punish today.
The average discounted payoff from the Carrot state is

$$V^C = \pi\!\left(q^C\right), \qquad (13.5)$$

and the average discounted payoff from the Stick state is

$$V^S = (1-\delta)\,\pi\!\left(q^S\right) + \delta V^C = (1-\delta)\,\pi\!\left(q^S\right) + \delta\,\pi\!\left(q^C\right). \qquad (13.6)$$

The single-deviation principle yields two constraints under which the Carrot & Stick
strategy profile above is a subgame-perfect equilibrium. First, no player has an incentive
to deviate unilaterally in the Carrot state:

$$V^C \geq \max_{q_i}\,(1-\delta)\left[q_i P\!\left(q_i + (n-1)q^C\right) - cq_i\right] + \delta V^S = (1-\delta)\,\pi^d\!\left(q^C\right) + \delta V^S. \qquad (13.7)$$

Here the first term $\pi^d\!\left(q^C\right)$ is the profit from the most profitable deviation, which is
multiplied by $1-\delta$ as it is a single period's profit, and the second term $V^S$ is the average
discounted payoff from switching to the Stick state the next day, which is multiplied by $\delta$
because it starts the next day. By substituting the value of $V^S$ in (13.6) into (13.7), one
can simplify (13.7) as

$$V^C = \pi\!\left(q^C\right) \geq \frac{1}{1+\delta}\,\pi^d\!\left(q^C\right) + \frac{\delta}{1+\delta}\,\pi\!\left(q^S\right). \qquad (13.8)$$

This condition gives a lower bound on the average discounted payoff $V^C$ from the Carrot state: it
has to be at least as high as the daily profit from deviation, multiplied by $1/(1+\delta)$,
plus the daily profit at the Stick state, multiplied by $\delta/(1+\delta)$.

The second constraint is that no firm has an incentive to deviate unilaterally in the
Stick state:

$$V^S \geq \max_{q_i}\,(1-\delta)\left[q_i P\!\left(q_i + (n-1)q^S\right) - cq_i\right] + \delta V^S = (1-\delta)\,\pi^d\!\left(q^S\right) + \delta V^S. \qquad (13.9)$$

That is, applying the possibly painful punishment in the Stick state must be at least
as good as deviating from it for one day and postponing it to the next period. This
constraint simplifies to

$$V^S \geq \pi^d\!\left(q^S\right). \qquad (13.10)$$

That is, the average discounted payoff in the Stick state is at least as high as the daily
profit from deviation at that state. By substituting the value of $V^S$ from (13.6), one can
write this directly, again, as a lower bound on the equilibrium profit:

$$\pi\!\left(q^C\right) \geq \pi^d\!\left(q^S\right)/\delta - (1-\delta)\,\pi\!\left(q^S\right)/\delta. \qquad (13.11)$$

The Carrot & Stick strategy profile gives a subgame-perfect equilibrium if and only if the simple
constraints (13.8) and (13.11) are satisfied.
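These two conditions are easy to check numerically. The following sketch (not part of the original notes) assumes the linear inverse demand $P = \max\{1-Q, 0\}$ with zero marginal cost and illustrative parameter values, and tests (13.8) and (13.11) directly:

```python
# Check the Carrot & Stick SPE conditions (13.8) and (13.11) numerically
# for the linear Cournot oligopoly with P = max(1 - Q, 0) and c = 0.

def pi(q, n):
    """Per-firm profit when every firm produces q."""
    return q * max(1.0 - n * q, 0.0)

def pi_dev(q, n):
    """Best one-shot deviation profit when the other n - 1 firms produce q."""
    residual = max(1.0 - (n - 1) * q, 0.0)
    return residual ** 2 / 4.0          # optimal deviation: q_i = residual / 2

def carrot_stick_spe(qC, qS, n, delta):
    carrot_ok = pi(qC, n) >= (pi_dev(qC, n) + delta * pi(qS, n)) / (1 + delta)  # (13.8)
    stick_ok = pi(qC, n) >= (pi_dev(qS, n) - (1 - delta) * pi(qS, n)) / delta   # (13.11)
    return carrot_ok and stick_ok

n, delta = 2, 0.6
qM = 1.0 / (2 * n)     # each firm's share of the monopoly output
qS = 1.0 / (n - 1)     # flooding the market drives the price (and profit) to 0
print(carrot_stick_spe(qM, qS, n, delta))   # → True
```

For $n = 2$, the check confirms that the monopoly share $q^M = 1/4$, paired with the flooding quantity $q^S = 1$, passes both conditions at $\delta = 0.6$.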
In general, one can obtain high equilibrium values by selecting the punishment profit $\pi\!\left(q^S\right)$ very
low, even negative. When the costs are zero (i.e., $c = 0$), since the price is non-negative,
the lowest payoff is also zero, and it is obtained by selecting $q^S = 1/(n-1)$. In
that case, $\pi\!\left(q^S\right) = \pi^d\!\left(q^S\right) = 0$, and the constraint (13.11) is satisfied for all $\delta$. Hence,
this value of $q^S$ leads to a subgame-perfect equilibrium if and only if (13.8) is
satisfied:

$$\pi\!\left(q^C\right) \geq \frac{1}{1+\delta}\,\pi^d\!\left(q^C\right).$$

When this inequality is satisfied at $q^C = q^M$, an optimal Carrot & Stick strategy
for the firms is $q^C = q^M = 1/(2n)$ and $q^S = 1/(n-1)$. This is the case when $\delta \geq \pi^d\!\left(q^M\right)/\pi\!\left(q^M\right) - 1$. Otherwise, an optimal Carrot & Stick strategy is given by $q^S = 1/(n-1)$ and $q^C$ as the smallest solution to the quadratic equation $(1+\delta)\,\pi\!\left(q^C\right) = \pi^d\!\left(q^C\right)$.
When the marginal cost is positive (i.e., $c > 0$), one can make $\pi\!\left(q^S\right)$ negative, and
as small as needed, by selecting a large $q^S$. In that case, the firms can inflict arbitrarily
painful punishments on the deviating firm. They do so fearing that a failure to
punish only delays the punishment, and the subsequent reward, one more period.
Giving incentives for such punishment puts an upper bound on $q^S$ through (13.11). This
upper bound is large when the marginal cost $c$ is small. I will next describe the optimal
strategy for small values of $c$, so that one can choose $q^S > 1/(n-1)$. In that case, in
the Stick state, the profit is $\pi\!\left(q^S\right) = -cq^S$; i.e., the firms simply incur the cost of
production as a loss, and the optimal deviation is to avoid this loss by producing
nothing, i.e., $\pi^d\!\left(q^S\right) = 0$. Hence, the optimal Carrot & Stick strategy maximizes $\pi\!\left(q^C\right)$
subject to the constraints

$$\pi\!\left(q^C\right) \geq \frac{1}{1+\delta}\,\pi^d\!\left(q^C\right) - \frac{\delta}{1+\delta}\,cq^S, \qquad (13.12)$$

$$\pi\!\left(q^C\right) \geq (1-\delta)\,cq^S/\delta. \qquad (13.13)$$

A careful reader can check that one can take the second weak inequality as an equality.
(That inequality can be strict only when both inequalities are satisfied at the global
optimum $q^M$.) That is, one can select $cq^S = \delta\,\pi\!\left(q^C\right)/(1-\delta)$. In that case, the first
inequality reduces to

$$\pi\!\left(q^C\right) \geq (1-\delta)\,\pi^d\!\left(q^C\right).$$

Therefore, when $\delta \geq 1 - \pi\!\left(q^M\right)/\pi^d\!\left(q^M\right)$, an optimal Carrot & Stick strategy is given by
$q^C = q^M$ and $q^S = \delta\,\pi\!\left(q^M\right)/(c(1-\delta))$. The firms produce the monopoly outcome, and
any deviation leads to the production of a $q^S$ that exactly offsets the gain from the optimal deviation.
When $\delta < 1 - \pi\!\left(q^M\right)/\pi^d\!\left(q^M\right)$, the constraint in the last displayed inequality is binding,
and the production $q^C$ in the optimal Carrot & Stick strategy is the smallest solution to
the quadratic equation

$$\pi\!\left(q^C\right) = (1-\delta)\,\pi^d\!\left(q^C\right).$$
In a Carrot & Stick equilibrium, the firms produce large amounts, yielding very low
prices, in order to punish deviations from the equilibrium. For example, in the optimal
strategy above, the price becomes zero after a deviation. This can be viewed as a price war.

13.5 Price Wars


The price wars in the Carrot & Stick strategies above last only one period. In
general, price wars can last much longer in other forms of equilibria, in which there
are multiple war states. This section is devoted to the analysis of such subgame-perfect
equilibria.

Price War: There are $k+1$ states: Cartel, $W_1, \ldots, W_k$. Each firm produces
$q^C$ in the Cartel state and $q^W \geq 1/(n-1)$ in the states $W_1, \ldots, W_k$. The game
starts at the Cartel state. If each firm produces the above amounts ($q^C$ in the Cartel
state and $q^W$ in the other states), then Cartel and $W_k$ transition to Cartel,
and $W_m$ transitions to $W_{m+1}$ for all $m < k$. They go to $W_1$ in the next period
otherwise.

On the path of the above strategy profile, the firms produce the cartel quantity
$q^C$ every day. Any deviation from this production level starts a price war that lasts $k$
days. During the price war, the price is 0. If a firm deviates at any date during the
punishment, the punishment starts all over again, in order to punish the newly deviating
firm.
Note that the average discounted profit at the Cartel state is

$$V^C = \pi\!\left(q^C\right),$$

and the average discounted profit at the state $W_m$ is

$$V_m = -\left(1-\delta^{k-m+1}\right)cq^W + \delta^{k-m+1}\,\pi\!\left(q^C\right), \qquad (13.14)$$

where $c$ is the marginal cost. Note that, assuming $\pi\!\left(q^C\right) \geq 0$, the situation improves as
the firms leave more war dates in the past and get closer to the start date of the cartel with
positive payoffs:

$$V_k \geq V_{k-1} \geq \cdots \geq V_1.$$

In order to check that this is a subgame-perfect equilibrium, one needs to apply
the single-deviation test at each state, leading to $k+1$ constraints. First, the single-deviation
test at the Cartel state requires that the firms do not have an incentive to deviate
in the Cartel state and start a price war:

$$\pi\!\left(q^C\right) \geq (1-\delta)\,\pi^d\!\left(q^C\right) + \delta V_1; \qquad (13.15)$$

i.e., the value of the cartel is higher than one period of optimal deviation plus the value of
starting a war the next day. As in the previous section, by substituting the value of $V_1$ from
(13.14), one simplifies this constraint to

$$\pi\!\left(q^C\right) \geq \frac{1-\delta}{1-\delta^{k+1}}\,\pi^d\!\left(q^C\right) - \frac{\delta\left(1-\delta^k\right)}{1-\delta^{k+1}}\,cq^W. \qquad (13.16)$$

In any war state $W_m$, the single-deviation test requires that a firm does not have an
incentive to deviate and start the war all over again:

$$V_m \geq \delta V_1.$$

That is, the value of being in the $m$th day of the war is at least as good as producing
nothing for one day (thereby avoiding the cost of producing a good that sells at price zero)
and starting the war all over again in the next period. Since $V_m \geq V_1$ for each $m$, this
constraint is satisfied at each war period $m$ if it is satisfied on the first day of the war,
i.e.,

$$V_1 \geq \delta V_1.$$

Therefore, the single-deviation test in the war states yields a single constraint:

$$V_1 \geq 0,$$

i.e.,

$$\pi\!\left(q^C\right) \geq \left(1-\delta^k\right)cq^W/\delta^k. \qquad (13.17)$$

In summary, the price war strategies above form a subgame-perfect equilibrium if
and only if the constraints (13.16) and (13.17) are satisfied.
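The values (13.14) and the constraints above can be verified numerically. The sketch below (illustrative parameters, not from the notes; linear demand $P = \max\{1-Q,0\}$ with marginal cost $c$) computes the $V_m$'s by backward recursion and applies the two tests:

```python
# Price-war values and SPE check for linear Cournot with P = max(1 - Q, 0).
# V_m follows the recursion V_m = (1 - delta) * (-c * qW) + delta * V_{m+1},
# with V_{k+1} = pi(qC): during the war the price is 0 and each firm loses c*qW.

def war_values(qC, qW, n, c, delta, k):
    piC = qC * max(1.0 - n * qC, 0.0) - c * qC
    V = [0.0] * (k + 2)
    V[k + 1] = piC                      # after the war, back to the cartel
    for m in range(k, 0, -1):           # each war day: lose c * qW, then continue
        V[m] = (1 - delta) * (-c * qW) + delta * V[m + 1]
    return V, piC

def price_war_spe(qC, qW, n, c, delta, k):
    V, piC = war_values(qC, qW, n, c, delta, k)
    pi_dev = max(0.0, 1.0 - (n - 1) * qC - c) ** 2 / 4.0
    cartel_ok = piC >= (1 - delta) * pi_dev + delta * V[1]   # (13.15)
    war_ok = V[1] >= 0.0                                     # (13.17)
    return cartel_ok and war_ok

n, c, delta, k = 2, 0.1, 0.8, 3
qM = (1.0 - c) / (2 * n)    # each firm's share of the monopoly output
print(price_war_spe(qM, 1.0, n, c, delta, k))   # → True
```

The recursion reproduces the closed form (13.14), and making $q^W$ larger eventually violates (13.17), since the firms would then rather skip the painful war day.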
What is the optimal price war strategy profile for the firms? To answer this question,
note that in the optimal equilibrium one selects $V_1 = 0$ (i.e., (13.17) is satisfied with
equality) in order to provide the maximal deterrence in the Cartel state:

$$cq^W = \delta^k\,\pi\!\left(q^C\right)/\left(1-\delta^k\right).$$

In that case, from the equivalent form (13.15), one can see that the constraint (13.16)
reduces to

$$\pi\!\left(q^C\right) \geq (1-\delta)\,\pi^d\!\left(q^C\right).$$

This is the same constraint as in the optimal Carrot & Stick equilibrium. As there, in
the optimal price war equilibrium, one selects $q^C = q^M$ when $\delta \geq 1 - \pi\!\left(q^M\right)/\pi^d\!\left(q^M\right)$,
and $q^C$ equal to the smallest solution to the quadratic equation

$$\pi\!\left(q^C\right) = (1-\delta)\,\pi^d\!\left(q^C\right)$$

otherwise.

13.6 Exercises with Solutions


1. [2010 Midterm 2] Consider the linear Cournot oligopoly above with $c = 0$. For
each of the following strategy profiles, find the parameter values under which the
strategy profile is a subgame-perfect equilibrium.

(a) Each firm is to produce $q^*$ until somebody deviates, and to produce $q^{NE} = 1/(n+1)$ thereafter.

Solution: Just take $c = 0$ in Section 13.3. The condition is

$$\pi\!\left(q^*\right) \geq (1-\delta)\,\pi^d\!\left(q^*\right) + \delta\,\pi\!\left(q^{NE}\right),$$

where $\pi\!\left(q^*\right) = q^*\left(1-nq^*\right)$, $\pi^d\!\left(q^*\right) = \left(1-(n-1)q^*\right)^2/4$, and $\pi\!\left(q^{NE}\right) = 1/(n+1)^2$.

(b) There are two states: Cartel and War. The game starts in the Cartel state.
In the Cartel state, each firm produces $q^*$. In the Cartel state, if each firm
produces $q^*$, they remain in the Cartel state in the next period, too; otherwise
they switch to the War state in the next period. In the War state, each firm
produces $1/n$. In the War state, if each firm produces $1/n$, they switch to the
Cartel state in the next period; otherwise they remain in the War state in the
next period, too.

Solution: This is a price war strategy with one war period or, equivalently,
a Carrot & Stick strategy with $q^C = q^*$ and $q^S = 1/n$. The necessary and
sufficient conditions for this to be a SPE are (13.8) and (13.11). Since $\pi\!\left(q^S\right) = 0$ and $\pi^d\!\left(q^S\right) = 1/\left(4n^2\right)$, these conditions simplify to

$$(1+\delta)\,q^*\left(1-nq^*\right) \geq \left(1-(n-1)q^*\right)^2/4,$$

$$q^*\left(1-nq^*\right) \geq \frac{1}{4n^2\delta}.$$
2. [Midterm 2, 2007] Consider the infinitely repeated game with the following stage
game (linear Bertrand duopoly). Simultaneously, Firms 1 and 2 choose prices
$p_1 \in [0,1]$ and $p_2 \in [0,1]$, respectively. Firm $i$ sells

$$Q_i(p_1, p_2) = \begin{cases} 1-p_i & \text{if } p_i < p_j, \\ (1-p_i)/2 & \text{if } p_i = p_j, \\ 0 & \text{if } p_i > p_j \end{cases}$$

units at price $p_i$, obtaining the stage payoff of $p_i Q_i(p_1, p_2)$. For each strategy
profile below, find the range of parameters under which the strategy profile is a
subgame-perfect equilibrium.

(a) They both charge $p = 1/2$ until somebody deviates; they both charge 0
thereafter.

Solution: After the switch, they charge 0 forever, and the future moves do
not depend on the current actions. Hence, the reduced game is identical to
the original stage game. Since $(0,0)$ is a Nash equilibrium of the stage game, the profile passes the
single-deviation test at such histories. Before the switch, we need to check
that

$$V^C = 1/8 \geq (1-\delta)\cdot 1/4 + \delta\cdot 0,$$

i.e., $\delta \geq 1/2$. (Note that by undercutting, a firm can get $1/4 - \varepsilon$ for any $\varepsilon > 0$.)
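The cutoff can be confirmed directly (a trivial check, included only for completeness):

```python
# Part (a): collusion at p = 1/2 is sustainable iff 1/8 >= (1 - delta)/4,
# i.e. iff delta >= 1/2.
def grim_trigger_ok(delta):
    return 1 / 8 >= (1 - delta) / 4

print(grim_trigger_ok(0.5), grim_trigger_ok(0.49))   # → True False
```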
(b) There are $k+1$ states: Cartel, $W_1, \ldots, W_k$. Each firm charges $p = 1/2$ in the
Cartel state and $p = p^*$ in the War states $W_1, \ldots, W_k$, where $p^* < 1/2$. The
game starts at the Cartel state. If each firm charges the above prices ($1/2$ in the
Cartel state and $p^*$ in the War states), then Cartel and $W_k$ transition to Cartel,
and $W_m$ transitions to $W_{m+1}$ for all $m < k$. They go to $W_1$ in the next period
otherwise.

Solution: As in the price war with the Cournot oligopoly, there are two binding
conditions for a SPE. In the Cartel state, no firm should have an incentive to
undercut:

$$1/8 \geq (1-\delta)/4 + \delta\left(1-\delta^k\right)p^*\left(1-p^*\right)/2 + \delta^{k+1}/8,$$

i.e.,

$$\left(1-\delta^{k+1}\right)/8 \geq (1-\delta)/4 + \delta\left(1-\delta^k\right)p^*\left(1-p^*\right)/2. \qquad (13.18)$$

Second, on the first day of the War, there is no incentive to deviate:

$$V_1 \geq (1-\delta)\,p^*\left(1-p^*\right) + \delta V_1,$$

i.e.,

$$V_1 \geq p^*\left(1-p^*\right).$$

Here,

$$V_m = \left(1-\delta^{k-m+1}\right)p^*\left(1-p^*\right)/2 + \delta^{k-m+1}/8$$

is the average discounted payoff at $W_m$. The condition deters the
deviations in which a firm charges slightly less and gets all of the demand
for a day. By substituting the value of $V_1$ from the last equality, one can simplify
this condition as

$$1/4 \geq \left(1+\delta^{-k}\right)p^*\left(1-p^*\right). \qquad (13.19)$$

Since $V_m \geq V_1$, this condition further implies that there is no incentive to
deviate at the other war states:

$$V_m \geq (1-\delta)\,p^*\left(1-p^*\right) + \delta V_1.$$

Therefore, the conditions are (13.18) and (13.19).
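A numerical check of (13.18) and (13.19) for sample parameter values (illustrative, not from the original solution):

```python
# Bertrand price-war SPE conditions (13.18) and (13.19).
def bertrand_war_spe(p_star, delta, k):
    pw = p_star * (1 - p_star)      # a war day's industry profit; each firm gets pw / 2
    cond_cartel = ((1 - delta ** (k + 1)) / 8
                   >= (1 - delta) / 4 + delta * (1 - delta ** k) * pw / 2)   # (13.18)
    cond_war = 1 / 4 >= (1 + delta ** (-k)) * pw                             # (13.19)
    return cond_cartel and cond_war

print(bertrand_war_spe(p_star=0.1, delta=0.9, k=2))   # → True
```

Lowering $\delta$ breaks the cartel condition, while raising $p^*$ toward $1/2$ makes the war too comfortable and breaks the war condition.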

13.7 Exercises
1. [Homework 4, 2011] Consider the infinitely repeated game with the linear Cournot
oligopoly as the stage game and the discount factor $\delta$. In the stage game, there are
$n > 2$ firms with zero cost and the inverse-demand function $P = \max\{1-Q, 0\}$.
For each strategy profile below, find the range of $\delta$ under which the strategy profile
is a subgame-perfect Nash equilibrium.

(a) At each $t$, each firm produces $1/(2n)$ until some firm produces another
amount; each firm produces $1/n$ thereafter.

(b) At each $t$, firms $1, \ldots, n$ produce $1/2$, $1/4$, \ldots, $1/2^n$, respectively, until some
firm deviates (by not producing the amount that it is supposed to produce);
they all produce $1/(n+1)$ thereafter.

(c) There are $k+1$ states: Cartel, $W_1$, \ldots, $W_k$. Each firm produces $1/(2n)$
in the Cartel state and $1/n$ in the states $W_1, \ldots, W_k$. The game starts at the
Cartel state. If each firm produces what it is supposed to produce in any
given state, then Cartel leads to Cartel in the next period, $W_m$ leads to $W_{m+1}$
in the next period for each $m < k$, and $W_k$ leads to Cartel. In any state, if
any player deviates from what it is supposed to produce, they go to $W_1$ in
the next period.

2. [Midterm 2 Make-Up, 2007] Consider the infinitely repeated game with discount
rate $\delta$ and the following stage game. Simultaneously, the Seller chooses a quality $q \in [0,\infty)$ for the product, and the Customer decides whether to buy at a fixed price $p$.
The payoff vector is $\left(p - q^2/2,\; vq - p\right)$ if the customer buys, and $\left(-q^2/2,\; 0\right)$ otherwise,
where the first entry is the payoff of the seller and $v > 0$ is a constant.

(a) Find the highest price $p$ for which there is a SPE such that the customer buys on
the path every day.

(b) Find the set of parameters $\hat{q}$, $q$, $p$, and $\delta$ for which the following is a SPE. We
have a Trade state and $k$ Waste states ($W_1, W_2, \ldots, W_k$). In the Trade state, the
seller chooses quality $q$, and the buyer buys. In any Waste state, the
seller chooses the quality level $\hat{q}$, and the buyer does not buy. If everybody does
what he is supposed to do, in the next period Trade leads to Trade, $W_1$ leads
to $W_2$, $W_2$ leads to $W_3$, \ldots, $W_{k-1}$ leads to $W_k$, and $W_k$ leads to Trade. Any
deviation takes us to $W_1$. The game starts at the Trade state.

3. [Midterm 2, 2007] Consider the infinitely repeated game with the following stage
game (linear Bertrand duopoly). Simultaneously, Firms 1 and 2 choose prices
$p_1 \in [0,1]$ and $p_2 \in [0,1]$, respectively. Firm $i$ sells

$$Q_i(p_1, p_2) = \begin{cases} 1-p_i & \text{if } p_i < p_j, \\ (1-p_i)/2 & \text{if } p_i = p_j, \\ 0 & \text{if } p_i > p_j \end{cases}$$

units at price $p_i$, obtaining the stage payoff of $p_i Q_i(p_1, p_2)$. (All the previous prices
are observed, and each player maximizes the discounted sum of his stage payoffs
with discount factor $\delta \in (0,1)$.) For each strategy profile below, find the range of
parameters under which the strategy profile is a subgame-perfect equilibrium.

(a) They both charge $p = 1/2$ until somebody deviates; they both charge 0
thereafter. (You need to find the range of $\delta$.)

(b) There are $k+1$ states: Collusion, the first day of war ($W_1$), the second day of
war ($W_2$), ..., and the $k$th day of war ($W_k$). The game starts in the Collusion
state. They both charge $p = 1/2$ in the Collusion state and $p = p^*$ in the
war states ($W_1, \ldots, W_k$), where $p^* < 1/2$. If both players charge what they
are supposed to charge, then the Collusion state leads to the Collusion state,
$W_1$ leads to $W_2$, $W_2$ leads to $W_3$, \ldots, $W_{k-1}$ leads to $W_k$, and $W_k$ leads to the
Collusion state. If any firm deviates from what it is supposed to charge at
any state, then they go to $W_1$. (Every deviation takes us to the first day of a
new war.) (You need to find inequalities in $\delta$, $p^*$, and $k$.)

4. [Selected from Midterm 2 and make-up exams in 2002 and 2004] Below
are pairs of stage games and strategy profiles. For each pair, check whether
the strategy profile is a subgame-perfect equilibrium of the game in which the
stage game is repeated infinitely many times. Each agent tries to maximize the
discounted sum of his expected payoffs in the stage game, and the discount rate is
$\delta = 0.99$. (Clearly explain your reasoning in each case.)

(a) Stage game (linear Cournot duopoly): There are two firms. Simultaneously,
each firm $i$ supplies $q_i \geq 0$ units of a good, which is sold at the price
$P = \max\{1-(q_1+q_2), 0\}$. The cost is equal to zero.

Strategy profile: There are two states: Cartel and Competition. The game
starts at the Cartel state. In the Cartel state, each supplies $q_i = 1/4$. In the Cartel state,
if each supplies $q_i = 1/4$, they remain in the Cartel state in the next period;
otherwise they switch to the Competition state in the next period. In the Competition state, each supplies $q_i = 1/2$. In the Competition state, they automatically
switch to the Cartel state in the next period.

(b) Stage game: the linear Cournot duopoly of part (a).

Strategy profile: There are two states: Cartel and Competition. The game
starts at the Cartel state. In the Cartel state, each supplies $q_i = 1/4$. In the Cartel
state, if each supplies $q_i = 1/4$, they remain in the Cartel state in the next
period; otherwise they switch to the Competition state in the next period. In the
Competition state, each supplies $q_i = 1/2$. In the Competition state, they switch
to the Cartel state in the next period if and only if both supply $q_i = 1/2$; otherwise
they remain in the Competition state in the next period, too.
Chapter 14

Static Games with Incomplete Information

So far, we have focused on games in which any piece of information that is known by
any player is known by all the players (and is indeed common knowledge). Such games
are called games of complete information. Informational concerns do not play any
role in such games. In real life, players always have some private information that is not
known by other parties. For example, we can hardly know other players' preferences and
beliefs as well as they do. Informational concerns play a central role in players' decision
making in such strategic environments. In the rest of the course, we will focus on such
informational issues. We will consider cases in which a party may have some information
that is not known by some other party. Such games are called games of incomplete
information, or of asymmetric information. The informational asymmetries are modeled by
Nature's moves: some players can distinguish certain moves of Nature while some others
cannot. Consider the following simple example, where a firm is contemplating the hiring
of a worker without knowing how able the worker is.

Example 14.1 Consider the game in Figure 14.1. There are a Firm and a Worker.
The Worker can be of High ability, in which case he would like to Work when he is hired, or
of Low ability, in which case he would rather Shirk. The Firm would want to Hire the worker
who will work but not the worker who will shirk. The Worker knows his ability level. The Firm
does not know whether the worker is of high ability or low ability. The Firm believes that the
worker is of high ability with probability $p$ and of low ability with probability $1-p$. Most
importantly, the Firm knows that the Worker knows his own ability level. To model this
situation, we let Nature choose between High and Low, with probabilities $p$ and $1-p$,
respectively. We then let the Worker observe the choice of Nature, but we do not let the
Firm observe Nature's choice.

[Figure 14.1 shows the extensive form: Nature chooses High (probability $p$) or Low (probability $1-p$); the Firm, without observing Nature's choice, decides whether to Hire; if hired, the Worker (who observes his type) chooses Work or Shirk. Payoffs (Firm, Worker): after High: Hire & Work $(1, 2)$, Hire & Shirk $(0, 1)$, Do not hire $(0, 0)$; after Low: Hire & Work $(1, 1)$, Hire & Shirk $(-1, 2)$, Do not hire $(0, 0)$.]

Figure 14.1: A game on employment decisions with incomplete information

A player's private information is called his "type". For instance, in the above example,
the Worker has two types: High and Low. Since the Firm does not have any private information,
the Firm has only one type. As in the above example, incomplete information is modeled
via imperfect-information games in which Nature chooses each player's type and privately
informs him of it. These games are called incomplete-information games, or Bayesian games.

14.1 Bayesian Games


Formally, a static game with incomplete information is as follows. First, Nature chooses
some $t = (t_1, t_2, \ldots, t_n) \in T$, where each $t \in T$ is selected with probability $p(t)$.
Here, $t_i \in T_i$ is the type of player $i \in N = \{1, 2, \ldots, n\}$. Then, each player observes
his own type, but not the others'. Finally, the players simultaneously choose their actions,
each player knowing his own type. We write $a = (a_1, a_2, \ldots, a_n) \in A$ for any list of
actions taken by all the players, where $a_i \in A_i$ is the action taken by player $i$. The payoff
of a player now depends on the players' types as well as their actions; we write $u_i : T \times A \to \mathbb{R}$
for the utility function of player $i$, and $u = (u_1, \ldots, u_n)$. Such a static game with incomplete
information is denoted by $(N, A, T, p, u)$. Such a game is called a Bayesian game.

One can write the game in the example above as a Bayesian game by setting

• $N = \{F, W\}$;

• $T_F = \{t_F\}$ (a singleton) and $T_W = \{High, Low\}$;

• $p(t_F, High) = p$ and $p(t_F, Low) = 1-p$;

• $A_F = \{Hire, Don't\}$ and $A_W = \{Work, Shirk\}$;

• and the utility functions $u_F$ and $u_W$ defined by the following tables, where
the first entry is the payoff of the Firm and the table on the left corresponds to
$t = (t_F, High)$:

$t = (t_F, High)$:
          Work    Shirk
Hire      1, 2    0, 1
Don't     0, 0    0, 0

$t = (t_F, Low)$:
          Work    Shirk
Hire      1, 1    -1, 2
Don't     0, 0    0, 0

It is very important to note that the players' types may be "correlated", meaning that a
player "updates" his beliefs about the other players' types when he learns his own type.
Since he knows his type when he takes his action, he maximizes his expected utility with
respect to the new beliefs he comes to hold after "updating" his beliefs. We assume that he
updates his beliefs using Bayes' rule.

Bayes' Rule: Let $A$ and $B$ be two events. The probability that $A$ occurs conditional
on $B$ occurring is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$

where $P(A \cap B)$ is the probability that $A$ and $B$ occur simultaneously and $P(B)$ is the
(unconditional) probability that $B$ occurs.

In static games of incomplete information, the application of Bayes' rule is often
trivial, but a very good understanding of Bayes' rule is necessary to follow the
treatment of dynamic games of incomplete information later.

Let $p_i\!\left(t'_{-i} \mid t_i\right)$ denote player $i$'s belief that the types of all the other players are
$t'_{-i} = \left(t'_1, \ldots, t'_{i-1}, t'_{i+1}, \ldots, t'_n\right)$, given that his own type is $t_i$. [We may need to use Bayes' rule if the types across players are
"correlated". If they are independent, then life is simpler: players do not update
their beliefs.] For example, for a two-player Bayesian game, let $T_1 = T_2 = \{a, b\}$ with
$p(a,a) = p(a,b) = p(b,b) = 1/3$ and $p(b,a) = 0$. This distribution is tabulated as

            $t_2 = a$   $t_2 = b$
$t_1 = a$     1/3         1/3
$t_1 = b$      0          1/3

Now,

$$p_1(a \mid a) = \frac{\Pr(t_1 = t_2 = a)}{\Pr(t_1 = a)} = \frac{p(a,a)}{p(a,a) + p(a,b)} = \frac{1/3}{1/3 + 1/3} = 1/2.$$

Similarly,

$$p_1(b \mid a) = 1/2,$$

$$p_1(a \mid b) = \frac{\Pr(t_1 = b,\, t_2 = a)}{\Pr(t_1 = b)} = \frac{p(b,a)}{p(b,a) + p(b,b)} = \frac{0}{0 + 1/3} = 0,$$

$$p_1(b \mid b) = \frac{\Pr(t_1 = b,\, t_2 = b)}{\Pr(t_1 = b)} = \frac{p(b,b)}{p(b,a) + p(b,b)} = \frac{1/3}{0 + 1/3} = 1.$$

14.2 Bayesian Nash Equilibrium


As usual, a strategy of a player determines which action he will take at each of his information
sets. Here, the information sets are identified with the types $t_i \in T_i$. Hence, a strategy of
player $i$ is a function

$$s_i : T_i \to A_i,$$

mapping his types to his actions. For instance, in the example above, the Worker has four
strategies: (Work, Work), meaning that he will work regardless of whether he is of high
or low ability; (Work, Shirk), meaning that he will work if he is of high ability and shirk
if he is of low ability; (Shirk, Work); and (Shirk, Shirk).

When the probability of each type is positive according to $p$, any Nash equilibrium of
a Bayesian game is called a Bayesian Nash equilibrium. In that case, in a Nash equilibrium,
for each type $t_i$, player $i$ plays a best reply to the others' strategies given his beliefs about
the other players' types conditional on $t_i$. If the probability of Nature choosing some $t_i$ is zero,
then any action at that type is possible in a Nash equilibrium (as his action at that
type does not affect his expected payoff). In a Bayesian Nash equilibrium, we require
that for each type $t_i$, player $i$ plays a best reply to the others' strategies given his beliefs
about the other players' types conditional on $t_i$, regardless of whether the probability of that type
is positive.

Formally, a strategy profile $s^* = \left(s_1^*, \ldots, s_n^*\right)$ is a Bayesian Nash equilibrium in an
$n$-person static game of incomplete information if and only if for each player $i$ and type
$t_i \in T_i$,

$$s_i^*(t_i) \in \arg\max_{a_i \in A_i} \sum_{t_{-i}} u_i\!\left(a_i, s_{-i}^*(t_{-i});\, t_i, t_{-i}\right) p_i\!\left(t_{-i} \mid t_i\right),$$

where $u_i$ is the utility function of player $i$ and $a_i$ denotes an action of player $i$. That is, for each player $i$ and each
possible type, the action chosen is optimal given the conditional beliefs of that type
against the optimal strategies of all the other players. Notice that the utility function $u_i$ of
player $i$ depends on both players' actions and types.¹ Notice also that a Bayesian Nash
equilibrium is a Nash equilibrium of a Bayesian game with the additional property that
each type plays a best reply.² For example, for $p = 3/4$, consider the Nash equilibrium of
the game between the Firm and the Worker in which the Firm hires, and the Worker works if and
only if Nature chooses High. We can formally write this strategy profile as $s^* = \left(s_F^*, s_W^*\right)$
with

$$s_F^*(t_F) = Hire, \qquad s_W^*(High) = Work, \qquad s_W^*(Low) = Shirk.$$

We check that this is a Bayesian Nash equilibrium as follows. First consider the Firm.

¹The utility function $u_i$ does not depend on the whole of the strategies $s_1, \ldots, s_n$, but the expected value of $u_i$ possibly does.

²This property is necessarily satisfied in any Nash equilibrium if all types occur with positive probability.

At his only type $t_F$, his beliefs about the other player's types are

$$p_F(High \mid t_F) = 3/4 \quad\text{and}\quad p_F(Low \mid t_F) = 1/4.$$

His expected utility from the action Hire is

$$E\left[u_F\!\left(Hire, s_W^*\right) \mid t_F\right] = u_F\!\left(Hire, s_W^*(High); High\right) p_F(High \mid t_F) + u_F\!\left(Hire, s_W^*(Low); Low\right) p_F(Low \mid t_F)$$
$$= u_F(Hire, Work; High)\cdot\tfrac{3}{4} + u_F(Hire, Shirk; Low)\cdot\tfrac{1}{4} = 1\cdot\tfrac{3}{4} + (-1)\cdot\tfrac{1}{4} = \tfrac{1}{2}.$$

His expected payoff from the action Don't is

$$E\left[u_F\!\left(Don't, s_W^*\right) \mid t_F\right] = u_F(Don't, Work; High)\cdot\tfrac{3}{4} + u_F(Don't, Shirk; Low)\cdot\tfrac{1}{4} = 0\cdot\tfrac{3}{4} + 0\cdot\tfrac{1}{4} = 0.$$

Since $E\left[u_F\!\left(Hire, s_W^*\right) \mid t_F\right] \geq E\left[u_F\!\left(Don't, s_W^*\right) \mid t_F\right]$, Hire is a best response. Now consider
the Worker. He has two types. We need to check whether he plays a best response at
each of these types. Consider the type $t_W = High$. Of course, $p_W(t_F \mid High) = 1$. Hence, his
utility from Work is

$$E\left[u_W\!\left(s_F^*, Work\right) \mid High\right] = u_W(Hire, Work; High) = 2.$$

His utility from Shirk is

$$E\left[u_W\!\left(s_F^*, Shirk\right) \mid High\right] = u_W(Hire, Shirk; High) = 1.$$

Clearly, $2 > 1$, and Work is the best response to $s_F^*$ for the type High. For the type $t_W = Low$,
we check that his utility from Shirk,

$$E\left[u_W\!\left(s_F^*, Shirk\right) \mid Low\right] = u_W(Hire, Shirk; Low) = 2,$$

is greater than his utility from Work,

$$E\left[u_W\!\left(s_F^*, Work\right) \mid Low\right] = u_W(Hire, Work; Low) = 1.$$

Hence, the type $t_W = Low$ also plays a best response. Therefore, we have checked that
$s^*$ is a Bayesian Nash equilibrium.

Exercise 14.1 Formally check that the Firm not hiring and the Worker shirking for each type
is also a Bayesian Nash equilibrium.
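Best-reply checks of this kind can be automated. The sketch below (not part of the notes; the action labels are illustrative names) enumerates all pure strategy profiles of the firm-worker game for $p = 3/4$:

```python
# Enumerate all pure-strategy BNE of the firm-worker game (p = 3/4).
from itertools import product

P_HIGH = 0.75
# U[(a_F, a_W, t_W)] = (Firm's payoff, Worker's payoff)
U = {("Hire", "Work", "High"): (1, 2), ("Hire", "Shirk", "High"): (0, 1),
     ("Hire", "Work", "Low"): (1, 1),  ("Hire", "Shirk", "Low"): (-1, 2),
     ("Dont", "Work", "High"): (0, 0), ("Dont", "Shirk", "High"): (0, 0),
     ("Dont", "Work", "Low"): (0, 0),  ("Dont", "Shirk", "Low"): (0, 0)}

def firm_payoff(a_F, s_W):
    return (P_HIGH * U[(a_F, s_W["High"], "High")][0]
            + (1 - P_HIGH) * U[(a_F, s_W["Low"], "Low")][0])

def is_bne(a_F, s_W):
    # The Firm best-replies given beliefs (3/4, 1/4); each Worker type best-replies.
    if any(firm_payoff(a, s_W) > firm_payoff(a_F, s_W) for a in ("Hire", "Dont")):
        return False
    return all(U[(a_F, s_W[t], t)][1] >= U[(a_F, a, t)][1]
               for t in ("High", "Low") for a in ("Work", "Shirk"))

bne = [(a_F, sH, sL) for a_F in ("Hire", "Dont")
       for sH, sL in product(("Work", "Shirk"), repeat=2)
       if is_bne(a_F, {"High": sH, "Low": sL})]
print(bne)   # → [('Hire', 'Work', 'Shirk'), ('Dont', 'Shirk', 'Shirk')]
```

The search returns exactly two pure-strategy equilibria; the second one is the equilibrium of Exercise 14.1.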

14.3 Example
Suppose that the payoffs are given by the table

        $L$               $R$
$T$    $\theta$, $t$      1, 2
$B$    $-1$, $t$          $\theta$, 0

where $\theta \in \{0, 2\}$ is known by Player 1, $t \in \{1, 3\}$ is known by Player 2, and all pairs
$(\theta, t)$ have probability $1/4$.

Formally, the Bayesian game is defined as

• $N = \{1, 2\}$;

• $T_1 = \{0, 2\}$, $T_2 = \{1, 3\}$;

• $p(0,1) = p(0,3) = p(2,1) = p(2,3) = 1/4$;

• $A_1 = \{T, B\}$, $A_2 = \{L, R\}$; and

• $u_1$ and $u_2$ are defined by the table above, e.g., $u_1(T, L) = u_1(B, R) = \theta$,
$u_1(T, R) = 1$, and $u_1(B, L) = -1$.

I next compute a Bayesian Nash equilibrium $s^*$ of this game. To do that, one needs
to determine $s_1^*(0) \in \{T, B\}$, $s_1^*(2) \in \{T, B\}$, $s_2^*(1) \in \{L, R\}$, and $s_2^*(3) \in \{L, R\}$:
four actions in total. First observe that when $\theta = 0$, action $T$ strictly dominates action
$B$, i.e.,

$$u_1(T, a_2; \theta = 0, t) > u_1(B, a_2; \theta = 0, t)$$

for all actions $a_2 \in A_2$ and all types $t \in \{1, 3\}$ of Player 2. Hence, it must be that

$$s_1^*(0) = T.$$

Similarly, when $t = 3$, action $L$ strictly dominates action $R$, and hence

$$s_2^*(3) = L.$$

Now consider the type $\theta = 2$ of Player 1. Since his payoff does not depend on $t$,
observe that his payoff from $T$ is $1 + \mu$, where $\mu$ is the probability that Player 2 plays
$L$. His payoff from $B$ is $2(1-\mu) - \mu$, which is equal to $2 - 3\mu$. Hence, for $\theta = 2$, $T$
is a best response if

$$1 + \mu \geq 2 - 3\mu,$$

i.e.,

$$\mu \geq 1/4.$$

When $\mu > 1/4$, $T$ is the only best response. Note, however, that the type $t = 3$ must play $L$,
and the probability of that type is $1/2$. Therefore,

$$\mu \geq 1/2 > 1/4.$$

Since $s_1^*(2)$ must be a best response for $\theta = 2$, it follows that

$$s_1^*(2) = T.$$

Now consider $t = 1$. Given $s_1^*$, Player 2 knows that Player 1 plays $T$ (regardless of
his type). Hence, the payoff of the type $t = 1$ is $t = 1$ when he plays $L$ and $2$ when he plays $R$.
Therefore,

$$s_2^*(1) = R.$$

To check that $s^*$ is indeed a Bayesian Nash equilibrium, one checks that each type
plays a best response.

Exercise 14.2 Verify that $s^*$ is indeed a Bayesian Nash equilibrium. Following the
analysis above, show that there is no other Bayesian Nash equilibrium.
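The uniqueness claim can be confirmed by enumerating all $2^4$ pure strategy profiles. Since the types are independent here, each type's conditional belief about the opponent's type is uniform, so expected payoffs can be compared via unweighted sums (a sketch, not part of the notes):

```python
# Brute-force search over pure strategy profiles of the example's Bayesian game.
from itertools import product

def u1(a1, a2, theta):
    return {("T", "L"): theta, ("T", "R"): 1, ("B", "L"): -1, ("B", "R"): theta}[(a1, a2)]

def u2(a1, a2, t):
    return {("T", "L"): t, ("T", "R"): 2, ("B", "L"): t, ("B", "R"): 0}[(a1, a2)]

def is_bne(s1, s2):
    for th in (0, 2):   # each type of Player 1 must best-reply
        payoff = lambda a: sum(u1(a, s2[t], th) for t in (1, 3))
        if payoff(s1[th]) < max(payoff(a) for a in "TB"):
            return False
    for t in (1, 3):    # each type of Player 2 must best-reply
        payoff = lambda a: sum(u2(s1[th], a, t) for th in (0, 2))
        if payoff(s2[t]) < max(payoff(a) for a in "LR"):
            return False
    return True

bne = [(a, b, c, d) for a, b, c, d in product("TB", "TB", "LR", "LR")
       if is_bne({0: a, 2: b}, {1: c, 3: d})]
print(bne)   # → [('T', 'T', 'R', 'L')]
```

The only profile that survives is exactly the $s^*$ computed above.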

14.4 Exercises with Solutions


1. [Final, 2006] Consider a two-player game in which the payoffs, which depend on $\theta$,
and the actions are as in the following tables:

$\theta = 0$:
        $L$        $R$
$T$    1, -1     -1, 1
$B$    -1, 1     1, -1

$\theta = 1$:
        $L$        $R$
$T$    1, 1      -1, -1
$B$    -1, 1     1, -1

where $\Pr(\theta = 0) = \Pr(\theta = 1) = 1/2$. Only Player 2 knows whether $\theta = 0$ or $\theta = 1$.

(a) Write this as a Bayesian game.

Answer: A Bayesian game can be written as a list

$$G = \left(N;\, A_1, \ldots, A_n;\, T_1, \ldots, T_n;\, p;\, u_1, \ldots, u_n\right).$$

In this problem,

• the set of players is $N = \{1, 2\}$;

• the sets of actions are $A_1 = \{T, B\}$ and $A_2 = \{L, R\}$;

• the sets of types are $T_1 = \{t_1\}$ (a singleton) and $T_2 = \{0, 1\}$ (the possible values of $\theta$);

• the beliefs are given by $p(t_1, 0) = p(t_1, 1) = 1/2$ (one can alternatively
define the conditional beliefs of the types, which makes no difference
in this problem);

• the utility functions $u_1(a_1, a_2; t_1, t_2)$ and $u_2(a_1, a_2; t_1, t_2)$ are given by the
matrices above.

(b) Find a Bayesian Nash equilibrium of this game.

Answer: I will find a BNE in pure strategies. Note that a pure strategy of
Player 1 is an action $s_1(t_1) \in A_1$, and a pure strategy of Player 2 is a pair
$\left(s_2(0), s_2(1)\right) \in A_2 \times A_2$, assigning an action to each type of that player.
To find an equilibrium, I guess, and eventually verify, that there exists a BNE
in which Player 1's strategy is $s_1(t_1) = T$. Player 2's best response to this
strategy is $s_2(0) = R$ and $s_2(1) = L$. Now we need to verify that $s_1(t_1) = T$
is a best response to the strategy of Player 2 with $s_2(0) = R$ and $s_2(1) = L$.
To do that, compute that the expected payoff of Player 1 from $T$ is

$$U_1(T) = u_1\!\left(T, s_2(0); \theta = 0\right)\Pr(\theta = 0) + u_1\!\left(T, s_2(1); \theta = 1\right)\Pr(\theta = 1)$$
$$= u_1(T, R; \theta = 0)\cdot\tfrac{1}{2} + u_1(T, L; \theta = 1)\cdot\tfrac{1}{2} = -1\cdot\tfrac{1}{2} + 1\cdot\tfrac{1}{2} = 0,$$

and the expected utility from $B$ is

$$U_1(B) = u_1(B, R; \theta = 0)\cdot\tfrac{1}{2} + u_1(B, L; \theta = 1)\cdot\tfrac{1}{2} = 1\cdot\tfrac{1}{2} + (-1)\cdot\tfrac{1}{2} = 0.$$

Hence, 1 ( ) ≥ 1 (), showing that  is a best response. Therefore, the


strategy profile (1 (1 ) = ; 2 (0) =  2 (1) = ) is a Bayesian Nash equi-
librium.
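The verification above can also be done by brute force. In the sketch below, the two actions of each player are labeled U, D and L, R for concreteness (the labels are placeholders), and the payoff numbers follow the tables of the exercise:

```python
# Brute-force check that (s1 = U; s2(0) = R, s2(1) = L) is a Bayesian Nash
# equilibrium. payoffs[theta][(a1, a2)] = (u1, u2).
payoffs = {
    0: {('U', 'L'): (1, -1), ('U', 'R'): (-1, 1),
        ('D', 'L'): (-1, 1), ('D', 'R'): (1, -1)},
    1: {('U', 'L'): (1, 1),  ('U', 'R'): (-1, -1),
        ('D', 'L'): (-1, 1), ('D', 'R'): (1, -1)},
}
prob = {0: 0.5, 1: 0.5}

def U1(a1, s2):
    # Player 1's expected payoff from a1 against s2: theta -> action of Player 2.
    return sum(prob[t] * payoffs[t][(a1, s2[t])][0] for t in (0, 1))

def is_BNE(a1, s2):
    # Player 1 (uninformed) must best-respond in expectation over theta...
    if U1(a1, s2) < max(U1(b, s2) for b in ('U', 'D')):
        return False
    # ...and each type theta of Player 2 must best-respond given a1.
    return all(payoffs[t][(a1, s2[t])][1] ==
               max(payoffs[t][(a1, b)][1] for b in ('L', 'R'))
               for t in (0, 1))

assert is_BNE('U', {0: 'R', 1: 'L'})
```

Both of Player 1's actions yield expected payoff 0 here, so U is (weakly) a best response, exactly as in the computation above.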

2. [Midterm 2, 2001] This question is about a thief and a policeman. The thief has stolen an object. He can either hide the object INSIDE his car or in the TRUNK. The policeman stops the thief. He can either check INSIDE the car or the TRUNK, but not both. (He cannot let the thief go without checking, either.) If the policeman checks the place where the thief hides the object, he catches the thief, in which case the thief gets −1 and the police gets 1; otherwise, he cannot catch the thief, and the thief gets 1, the police gets −1.

(a) Compute all the Nash equilibria.


Solution: This is a matching-pennies game. There is a unique Nash equi-
librium, in which Thief hides the object INSIDE or the TRUNK with equal
probabilities, and the Policeman checks INSIDE or the TRUNK with equal
probabilities.
(b) Now imagine that there are 100 thieves and 100 policemen, indexed by i = 1, …, 100 and j = 1, …, 100. In addition to their payoffs above, each thief i gets extra payoff ei from hiding the object in the TRUNK, and each policeman j gets extra payoff fj from checking the TRUNK, where

    e1 > e2 > · · · > e50 > 0 > e51 > · · · > e100,
    f1 > f2 > · · · > f50 > 0 > f51 > · · · > f100.

Policemen cannot distinguish the thieves from each other, nor can the thieves
distinguish the policemen from each other. Each thief has stolen an object,
hiding it either in the TRUNK or INSIDE the car. Then, each of them is
randomly matched to a policeman. Each matching is equally likely. Again,
a policeman can either check INSIDE the car or the TRUNK, but not both.
Write this game as a Bayesian game with two players, a thief and a policemen.
Compute a pure-strategy Bayesian Nash equilibrium of this game.
Solution: The type space is {1, …, 100} × {1, …, 100}, where each pair (i, j) is equally likely. The payoff of thief i is his payoff from part (a) plus ei (if he hides in the TRUNK), depending on his own type. The payoff of policeman j is his payoff from part (a) plus fj (if he checks the TRUNK), depending on his type.
A Bayesian Nash equilibrium: A thief of type ei hides the object

    INSIDE if ei < 0,
    in the TRUNK if ei > 0;

a policeman of type fj checks

    INSIDE if fj < 0,
    the TRUNK if fj > 0.

This is a Bayesian Nash equilibrium because, from the thief's point of view, the policeman is equally likely to check the TRUNK or INSIDE the car; hence it is a best response for him to hide in the trunk iff the extra benefit from hiding in the trunk is positive. Similarly for the policemen.

Remark 14.1 Note that from the point of view of an outside observer, the mixed-strategy equilibrium of the complete-information game in part (a) and the pure-strategy Bayesian Nash equilibrium of the Bayesian game in part (b) are equivalent: in both cases, the thief hides either inside the car or in the trunk and the policeman checks inside or the trunk, where the probability of each pair is 1/4. Moreover, in both games, the players face the same uncertainty about the action of the other player, assigning equal probabilities to both actions. The rationale for those beliefs is somewhat different, however. In the complete-information game, a player thinks that the actions of the other player are equally likely because he does not know the strategy of the other player, assigning equal probabilities to those strategies. In the Bayesian game, however, he does know the other player's strategy, as a function of his type. Yet he does not know which action the other player takes, as he does not know the other player's type. Therefore, the uncertainty about strategies in the complete-information game is replaced with uncertainty about the others' types. One can always convert a mixed-strategy Nash equilibrium to a pure-strategy Bayesian Nash equilibrium by introducing very small uncertainty about the players' payoffs. (This fact is known as Harsanyi's Purification Theorem.) Hence, a mixed-strategy Nash equilibrium can be interpreted as coming from slight variations in players' payoffs.
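The observational equivalence described in Remark 14.1 can be illustrated by simulation: drawing the thief's and the policeman's types uniformly and applying the cutoff strategies of part (b) makes each action pair occur with probability about 1/4. A sketch:

```python
# Simulation of the pure-strategy BNE of part (b): thieves with index i <= 50
# (positive extra payoff from the TRUNK) hide in the trunk, and policemen
# with index j <= 50 check the trunk.
import random
from collections import Counter

random.seed(0)
N = 100_000
counts = Counter()
for _ in range(N):
    i = random.randint(1, 100)               # thief's type, equally likely
    j = random.randint(1, 100)               # policeman's type, equally likely
    hide = 'TRUNK' if i <= 50 else 'INSIDE'
    check = 'TRUNK' if j <= 50 else 'INSIDE'
    counts[(hide, check)] += 1

for pair in sorted(counts):
    print(pair, round(counts[pair] / N, 3))  # each pair occurs with prob ~1/4
```

To an outside observer this is indistinguishable from both players mixing 50-50, as the remark explains.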

14.5 Exercises
1. [Midterm 2, 2011] Consider a two-player game with the payoff matrix

 
 1  − 0
  0 1 

where  ∈ {−2 2} is privately known by Player 1, and Pr ( = −2) = 08. (There


is no other private information.)

(a) Write this formally as a Bayesian game.


(b) Find a Bayesian Nash equilibrium of this game. Verify that the strategy
profile you identified is indeed a Bayesian Nash equilibrium.

2. [Final 2010] Consider a two player Bayesian game with the following payoff matrix

  
  (1 )   (2 )  (1 ) + 10  (2 ) − 10  (1 ) − 10  (2 ) + 10
  (1 ) − 10  (2 ) + 10  (1 )   (2 )  (1 ) + 10  (2 ) − 10
  (1 ) + 10  (2 ) − 10  (1 ) − 10  (2 ) + 10  (1 )   (2 )

where  ∈ {0 1 2} is privately known by player  and  (0) = 1,  (1) =  (2) = 0,


 (1) = 1,  (0) =  (2) = 0,  (2) = 1, and  (0) =  (1) = 0. The functions , ,
and  are known and each pair (1  2 ) has probability 1/9.

(a) Write this as a Bayesian game.


(b) Find a Bayesian Nash equilibrium of this game. Verify that the strategy
profile you identified is indeed a Bayesian Nash equilibrium.

3. [Midterm 2 Make up, 2002] Consider the incomplete information game with payoff
matrix
O B
O 2 + 1  1 1  2
B 0 0 1 2 + 2
where 1 and 2 are the private information of players 1 and 2, respectively, and are
identically and independently distributed with uniform distribution on [−13 23].
14.5. EXERCISES 277

(Here  is the type of player .) Find a Bayesian Nash equilibrium of this game in
which for each action (O or B) there is a realization of  at which player  plays
that action.

4. [Homework 4, 2004] Consider a two player game with payoff matrix

 
 2 2 0 
  0 1 1

where  ∈ {0 3} is a parameter known by Player 1. Player 2 believes that  = 0


with probability 1/2 and  = 3 with probability 1/2. Everything above is common
knowledge.

(a) Write this game formally as a Bayesian game.


(b) Compute two Bayesian Nash equilibria of this game.
Chapter 15

Static Applications with Incomplete Information

This chapter is devoted to economic applications with incomplete information. They are meant to illustrate the common techniques in computing Bayesian Nash equilibria in static games of incomplete information. There are four applications. The first application is Cournot duopoly, where I illustrate how to compute the Bayesian Nash equilibria when there is a continuum of actions but finitely many types. The next two applications are the first-price auction and the double auction. In these applications, there are a continuum of actions and a continuum of types. In that case, it is not easy to compute all equilibria, and one often considers equilibria with certain functional forms. Here, the focus will be on (i) symmetric, linear equilibrium, (ii) symmetric but not necessarily linear equilibrium, and (iii) linear but not necessarily symmetric equilibrium. I will explain what symmetry and linearity mean when we come there. Finally, I will consider coordination games with incomplete information. With complete information, these games often have multiple equilibria. When there is enough incomplete information, the multiplicity of equilibria disappears. I will illustrate this using "monotone" equilibria, in which there is a cutoff value such that players play one action below the cutoff and the other action above the cutoff. My technical objective in the last example is to illustrate how to compute monotone equilibria (when they exist).


15.1 Cournot Duopoly with incomplete information


Consider a Cournot duopoly with inverse-demand function

P (Q) = a − Q

where Q = q1 + q2 . The marginal cost of Firm 1 is c = 0, and this is common knowledge.


Firm 2’s marginal cost c2 is its own private information. It can take values of

cH with probability θ, and


cL with probability 1 − θ.

Each firm maximizes its expected profit.


Here, Firm 1 has just one type, and Firm 2 has two types: cH and cL . Hence, a
strategy of Firm 1 is a real number q1 , while a strategy of Firm 2 is two real numbers
q2 (cH ) and q2 (cL ), one for when the cost is cH and one for when the cost is cL .

Bayesian Nash Equilibrium A Bayesian Nash equilibrium is a triplet (q1∗ , q2∗ (cH ) , q2∗ (cL ))
of real numbers, where q1∗ is the production level of Firm 1, q2∗ (cH ) is the production
level of type cH of Firm 2, and q2∗ (cL) is the production level of type cL of Firm 2. In
equilibrium each type plays a best response. First consider the high-cost type cH of
Firm 2. In equilibrium, that type knows that Firm 1 produces q1∗ . Hence, its production
level, q2∗ (cH ), solves the maximization problem

    max_{q2} (P − cH) q2 = max_{q2} [a − q1∗ − q2 − cH] q2.

Hence,

    q2∗(cH) = (a − q1∗ − cH)/2.   (15.1)
Now consider the low-cost type cL of Firm 2. In equilibrium, that type also knows that
Firm 1 produces q1∗ . Hence, its production level, q2∗ (cL ), solves the maximization problem

    max_{q2} [a − q1∗ − q2 − cL] q2.

Hence,
    q2∗(cL) = (a − q1∗ − cL)/2.   (15.2)

The important point here is that both types consider the same q1∗ , as that is the strategy
of Firm 1, whose type is known by both types of Firm 2. Now consider Firm 1. It has
one type. Firm 1 knows the strategy of Firm 2, but since it does not know which type
of Firm 2 it faces, it does not know the production level of Firm 2. In Firm 1’s view,
the production level of Firm 2 is q2∗ (cH ) with probability θ and q2∗ (cL ) with probability
1 − θ. Hence, the expected profit of Firm 1 from production level q1 is

U1 (q1 , q2∗ ) = θ [a − q1 − q2∗ (cH )] q1 + (1 − θ) [a − q1 − q2∗ (cL )] q1


= {a − q1 − [θq2∗ (cH ) + (1 − θ)q2∗ (cL )]} q1 .

The equality is due to the fact that the production level q2 of Firm 2 enters the payoff
[a − q1 − q2 ] q1 = [a − q1 ] q1 − q1 q2 of Firm 1 linearly. The term

E [q2 ] = θq2∗ (cH ) + (1 − θ)q2∗ (cL )

is the expected production level of Firm 2. Hence, the expected profit of Firm 1 is just its profit from the expected production level:

U (q1 , q2∗ ) = (a − q1 − E [q2 ]) q1 .

Its strategy q1∗ solves the maximization problem

    max_{q1} U(q1, q2∗).

In this particular case, it is a best response to the expected production level:


    q1∗ = (a − E[q2])/2 = (a − [θq2∗(cH) + (1 − θ)q2∗(cL)])/2.   (15.3)
It is important to note that the equilibrium action is a best response to the expected strategy of the other player when and only when the actions of the other players affect the payoff of the player linearly, as in this case.1 In particular, when the other players' actions have a non-linear effect on the payoff of a player, his action may not be a best response to the expected action of the others. It is a common mistake to take a player's action as a best response to the expected action of others; you must avoid it.
To compute the Bayesian Nash equilibrium, one simply needs to solve the three linear
equations (15.1), (15.2), and (15.3) for q1∗ , q2∗ (cL ), q2∗ (cH ). Write
1
To be more precise, when ∂Ui /∂qi is linear in qj .

    ⎛ q1∗      ⎞   ⎡ 2   θ   1−θ ⎤⁻¹ ⎛ a      ⎞
    ⎜ q2∗(cH)  ⎟ = ⎢ 1   2    0  ⎥   ⎜ a − cH ⎟ ,
    ⎝ q2∗(cL)  ⎠   ⎣ 1   0    2  ⎦   ⎝ a − cL ⎠
yielding

    q2∗(cH) = (a − 2cH)/3 + (1 − θ)(cH − cL)/6,
    q2∗(cL) = (a − 2cL)/3 − θ(cH − cL)/6,
    q1∗ = (a + θcH + (1 − θ)cL)/3.
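As a numerical check of the solution above, one can iterate the best-response equations (15.1)-(15.3) to their fixed point and compare with the closed-form expressions. The parameter values in the sketch below are illustrative, not from the text:

```python
# Numerical check of the Cournot solution with illustrative parameters.
a, cH, cL, theta = 10.0, 2.0, 1.0, 0.4

# Iterate the best-response equations (15.1)-(15.3) to a fixed point.
q1 = q2H = q2L = 0.0
for _ in range(200):
    q2H = (a - q1 - cH) / 2
    q2L = (a - q1 - cL) / 2
    q1 = (a - (theta * q2H + (1 - theta) * q2L)) / 2

# Closed-form solution derived above.
q1_star = (a + theta * cH + (1 - theta) * cL) / 3
q2H_star = (a - 2 * cH) / 3 + (1 - theta) * (cH - cL) / 6
q2L_star = (a - 2 * cL) / 3 - theta * (cH - cL) / 6

assert abs(q1 - q1_star) < 1e-9
assert abs(q2H - q2H_star) < 1e-9
assert abs(q2L - q2L_star) < 1e-9
```

The iteration is a contraction here, so it converges to the unique solution of the three linear equations.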

15.2 First-price Auction


There is an object to be sold. Two bidders want to buy it through an auction. Simultaneously, each bidder i submits a bid bi ≥ 0. Then, the highest bidder wins the object
and pays her bid. If they bid the same number, then the winner is determined by a
coin toss. The value of the object for bidder i is vi , which is privately known by bidder
i. That is, vi is the type of bidder i. Assume that v1 and v2 are "independently and
identically distributed" with uniform distribution over [0, 1]. This precisely means that
knowing her own value vi , bidder i believes that the other bidder’s value vj is distributed
with uniform distribution over [0, 1], and the type space of each player is [0, 1]. Recall
that the beliefs of a player about the other player’s types may depend on the player’s
own type. Independence assumes that it doesn’t.
Formally, the Bayesian game is as follows. Actions are bids bi, coming from the action spaces [0, ∞); types are values vi, coming from the type spaces [0, 1]; beliefs are uniform distributions over [0, 1] for each type, and the utility functions are given by

    ui(b1, b2, v1, v2) =  vi − bi       if bi > bj,
                          (vi − bi)/2   if bi = bj,
                          0             if bi < bj.

In a Bayesian Nash equilibrium, each type vi maximizes the expected payoff


    E[ui(b1, b2, v1, v2)|vi] = (vi − bi) Pr{bi > bj(vj)} + (1/2)(vi − bi) Pr{bi = bj(vj)}   (15.4)

over bi .
Next, we will compute the Bayesian Nash equilibria. First, we consider a special equilibrium. The technique we will use here is a common one for computing Bayesian Nash equilibria, so pay close attention to the steps.

15.2.1 Symmetric, linear equilibrium


This section is devoted to the computation of a symmetric, linear equilibrium. Symmetric means that the equilibrium action bi(vi) of each type vi is given by

    bi(vi) = b(vi)

for some function b from the type space to the action space, where b is the same function for all players. Linear means that b is an affine function of vi, i.e.,

    bi(vi) = a + c vi.

To compute symmetric, linear equilibrium, one follows the following steps.

Step 1 Assume a symmetric linear equilibrium:

b∗1 (v1 ) = a + cv1


b∗2 (v2 ) = a + cv2

for all types v1 and v2, for some constants a and c that will be determined later. The important thing here is that the constants do not depend on the players or their types.

Step 2 Compute the best reply function of each type. Fix some type vi . To compute
her best reply, first note that c > 0.2 Then, for any fixed value bi ,

Pr{bi = b∗j (vj )} = 0, (15.5)

as b∗j is strictly increasing in vj by Step 1. It is also true that a ≤ b∗i(vi) ≤ vi. [You need to figure this out!] Hence,
2
If c = 0, both bidders bid a independent of their type. Then, bidding 0 is a better response for a
type vi < a; a type vi > a also has an incentive to deviate by increasing her bid slightly.

[Figure 15.1: Payoff as a function of bid in the first-price auction]

    E[ui(bi, b∗j, v1, v2)|vi] = (vi − bi) Pr{bi ≥ a + c vj}
                              = (vi − bi) Pr{vj ≤ (bi − a)/c}
                              = (vi − bi) · (bi − a)/c.

Here, the first equality is obtained simply by substituting (15.5) into (15.4). The second equality is simple algebra, and the third equality is due to the fact that vj is distributed with uniform distribution on [0, 1]. [If you are taking this course, the last step must be obvious to you!]

For a graphical derivation, consider Figure 15.1. The payoff of i is vi − bi when vj ≤ vj(bi) = (bi − a)/c and is zero otherwise. Hence, his expected payoff is the integral3

    ∫0^{vj(bi)} (vi − bi) dvj.

This is the area of the rectangle that is between 0 and vj(bi) horizontally and between bi and vi vertically: (vi − bi) vj(bi).
To compute the best reply, we must maximize the last expression over bi . Taking the
derivative and setting equal to zero yields

    bi = (vi + a)/2.   (15.6)

Graphically, as plotted in Figure 15.1, when bi is increased by an amount Δ, vj(bi) increases by Δ/c. This adds to the expected payoff a rectangle of size (vi − bi − Δ)Δ/c, which is approximately (vi − bi)Δ/c when Δ is small, and subtracts a rectangle of size vj(bi)Δ. At the optimum these two must be equal:

    (vi − bi)Δ/c = vj(bi)Δ,

yielding (15.6) above.

Remark 15.1 Note that we took an integral to compute the expected payoff and then took a derivative to compute the best response. Since differentiation is the inverse of integration, this involves unnecessary calculations in general. In this particular example the calculations were simple, but in general those unnecessary calculations may be the hardest step. Hence, it is advisable to leave the integral as is and use the Leibniz rule4 to differentiate it and obtain the first-order condition. Indeed, the graphical derivation above does this.
3 If vj were not uniformly distributed on [0, 1], then it would have been the integral

    ∫0^{vj(bi)} (vi − bi) f(vj) dvj = (vi − bi) F(vj(bi)),

where f and F are the probability density and cumulative distribution functions of vj, respectively.
4 Leibniz Rule:

    ∂/∂x ∫_{t=L(x,y)}^{U(x,y)} f(x, y, t) dt = (∂U/∂x) · f(x, y, U(x, y)) − (∂L/∂x) · f(x, y, L(x, y)) + ∫_{t=L(x,y)}^{U(x,y)} (∂f/∂x)(x, y, t) dt.

Step 3 Verify that the best-reply functions are indeed affine, i.e., bi is of the form bi = a + c vi. Indeed, we rewrite (15.6) as

    bi = (1/2) vi + a/2.   (15.7)

Check that both 1/2 and a/2 are constants, i.e., they do not depend on vi, and they are the same for both players.

Step 4 Compute the constants a and c. To do this, observe that in order to have an equilibrium, the best reply bi in (15.6) must be equal to b∗i(vi) for each vi. That is,

    (1/2) vi + a/2 = c vi + a

must be an identity, i.e., it must remain true for all values of vi. Hence, the coefficient of vi must be the same on both sides:

    c = 1/2.

The intercept must be the same on both sides, too:

    a = a/2.

Thus, a = 0. This yields the symmetric, linear Bayesian Nash equilibrium:

    bi(vi) = vi/2.
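One can check this equilibrium numerically: against an opponent whose value is uniform on [0, 1] and who bids half her value, a grid search over bids should return approximately vi/2 for every type. A sketch:

```python
# Numerical check that b(v) = v/2 is a best reply in the first-price auction
# when the opponent's value is uniform on [0, 1] and she bids half her value.
def expected_payoff(v_i, b_i):
    # Opponent bids b_j = v_j / 2, so Pr{win} = Pr{v_j <= 2 b_i} = min(2 b_i, 1).
    return (v_i - b_i) * min(2 * b_i, 1.0)

grid = [k / 10000 for k in range(10001)]
for v_i in (0.2, 0.5, 0.9):
    best = max(grid, key=lambda b: expected_payoff(v_i, b))
    assert abs(best - v_i / 2) < 1e-3, (v_i, best)
```

The payoff (vi − b) · 2b is a downward parabola in b with peak at b = vi/2, which is exactly what the grid search recovers.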

15.2.2 Any symmetric equilibrium


I now compute a symmetric Bayesian Nash equilibrium without assuming that b is linear.
Assume that b is strictly increasing and differentiable.

Step 1 Assume a Bayesian Nash equilibrium of the form

b∗1 (v1 ) = b (v1 )


b∗2 (v2 ) = b (v2 )

for some increasing, differentiable function b.



Step 2 Compute the best reply of each type, or compute the first-order condition that must be satisfied by the best reply. To this end, compute that, given that the other player j is playing according to the equilibrium, the expected payoff of playing bi for type vi is

    E[ui(bi, b∗j, v1, v2)|vi] = (vi − bi) Pr{bi ≥ b(vj)}
                             = (vi − bi) Pr{vj ≤ b⁻¹(bi)}
                             = (vi − bi) b⁻¹(bi),

where b⁻¹ is the inverse of b. Here, the first equality holds because b is strictly increasing; the second equality is obtained by again using the fact that b is increasing, and the last equality is by the fact that vj is uniformly distributed on [0, 1]. The first-order condition is obtained by taking the partial derivative of the last expression with respect to bi and setting it equal to zero:

    −b⁻¹(b∗i(vi)) + (vi − b∗i(vi)) · (db⁻¹/dbi)|_{bi = b∗i(vi)} = 0.

Using the formula for the derivative of the inverse function, this can be written as

    −b⁻¹(b∗i(vi)) + (vi − b∗i(vi)) · 1/b′(v)|_{b(v) = b∗i(vi)} = 0.   (15.8)

Step 3 Identify the best reply with the equilibrium action, towards computing the equilibrium action. That is, set

    b∗i(vi) = b(vi).

Substituting this in (15.8), obtain

    −vi + (vi − b(vi)) · 1/b′(vi) = 0.   (15.9)

Most of the time such a differential equation does not have a closed-form solution, and one has to settle for analyzing it. Luckily, in this case the differential equation can be solved easily. By simple algebra, we rewrite it as

    b′(vi) vi + b(vi) = vi.

Hence,

    d[b(vi) vi]/dvi = vi.

Therefore,

    b(vi) vi = vi²/2 + const

for some constant const. Since the equality also holds at vi = 0, it must be that const = 0. Therefore,

    b(vi) = vi/2.

In this case, we were lucky. In general, one obtains a differential equation as in (15.9), but the equation is not easily solvable. Make sure that you understand the steps up to finding the differential equation well.
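The same conclusion can be reached numerically. The following sketch integrates the differential equation above via the substitution g(v) = b(v)v (so that g′(v) = v with g(0) = 0) and compares the recovered b with the closed-form solution v/2:

```python
# Numerically integrate d[b(v) v]/dv = v using g(v) = b(v) v with g(0) = 0
# (midpoint rule), and compare b(v) = g(v)/v with the closed form v/2.
n = 1000
h = 1.0 / n
g = 0.0
max_err = 0.0
for k in range(1, n + 1):
    g += ((k - 0.5) * h) * h     # g'(v) = v, stepped at the interval midpoint
    v = k * h
    b = g / v                    # recover b(v) = g(v)/v
    max_err = max(max_err, abs(b - v / 2))

assert max_err < 1e-12           # matches b(v) = v/2 up to roundoff
```

The midpoint rule is exact for a linear integrand, so the error here is pure floating-point roundoff.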

15.2.3 General Case


I have so far assumed that the types are uniformly distributed. Assume now instead that the types are independently and identically distributed with probability density function f and cumulative distribution function F. (In the uniform case, f is 1 and F is the identity on [0, 1].) To compute the symmetric equilibria in increasing differentiable strategies, observe that the expected payoff in Step 2 is

    E[ui(bi, b∗j, v1, v2)|vi] = (vi − bi) Pr{vj ≤ b⁻¹(bi)} = (vi − bi) F(b⁻¹(bi)).

The first-order condition for the best reply is then

    −F(b⁻¹(b∗i(vi))) + (vi − b∗i(vi)) f(b⁻¹(b∗i(vi))) · (db⁻¹/dbi)|_{bi = b∗i(vi)} = 0.

Using the formula for the derivative of the inverse function, this can be written as

    −F(b⁻¹(b∗i(vi))) + (vi − b∗i(vi)) f(b⁻¹(b∗i(vi))) · 1/b′(v)|_{b(v) = b∗i(vi)} = 0.

In Step 3, towards identifying the best reply with the equilibrium action, one substitutes the equality b∗i(vi) = b(vi) in this equation and obtains

    −F(vi) + (vi − b(vi)) f(vi) · 1/b′(vi) = 0.

Arranging the terms, one can write this as a usual differential equation:

    b′(vi) F(vi) + b(vi) f(vi) = vi f(vi).

The same trick as in the case of the uniform distribution applies more generally. One can write the above differential equation as

    d[b(vi) F(vi)]/dvi = vi f(vi).

By integrating both sides, one then obtains the solution

    b(vi) = (∫0^{vi} v f(v) dv) / F(vi).

One can further simplify this solution by integrating the right-hand side by parts:

    b(vi) = (vi F(vi) − ∫0^{vi} F(v) dv) / F(vi) = vi − (∫0^{vi} F(v) dv) / F(vi).

That is, in equilibrium a bidder shades her bid down by an amount of (∫0^{vi} F(v) dv) / F(vi).
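The general formula can be evaluated numerically for any distribution. The sketch below uses F(v) = v² on [0, 1] as an illustrative non-uniform example (this choice is mine, not from the text); for this F the formula yields b(v) = v − (v³/3)/v² = 2v/3:

```python
# Evaluate the general bid formula b(v) = v - (∫_0^v F(t) dt) / F(v).
def b_general(v, F, n=100_000):
    h = v / n
    integral = sum(F((k + 0.5) * h) for k in range(n)) * h   # midpoint rule
    return v - integral / F(v)

F = lambda t: t * t              # illustrative CDF on [0, 1]
for v in (0.3, 0.7, 1.0):
    assert abs(b_general(v, F) - 2 * v / 3) < 1e-6
```

With F(t) = t (the uniform case) the same function returns v/2, consistent with the earlier sections.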

15.3 Double Auction


In many trading mechanisms, both buyers and the sellers submit bids (although the
price submitted by the seller is often referred to as "ask" rather than "bid"). Such
mechanisms are called double auctions, where the name emphasizes that both sides of
the market are competing. This section is devoted to the case when there is only one
buyer and one seller. (This case is clearly about bilateral bargaining, rather than general
auctions.)
Consider a Seller, who owns an object, and a Buyer. They want to trade the object
through the following mechanism. Simultaneously, Seller names ps and Buyer names pb .

• If pb < ps , then there is no trade;

• if pb ≥ ps, then they trade at price p = (pb + ps)/2.

The value of the object for Seller is vs and for Buyer is vb . Each player knows her own
valuation privately. Assume that vs and vb are independently and identically distributed
with uniform distribution on [0, 1]. [Recall from the first-price auction what this means.]
Then, the payoffs are

    ub =  vb − (pb + ps)/2   if pb ≥ ps,
          0                  otherwise;

    us =  (pb + ps)/2 − vs   if pb ≥ ps,
          0                  otherwise.
We will now compute Bayesian Nash equilibria. In an equilibrium, one must specify a price ps(vs) for each type vs of the seller and a price pb(vb) for each type vb of the buyer. In a Bayesian Nash equilibrium, pb(vb) solves the maximization problem

    max_{pb} E[ vb − (pb + ps(vs))/2 : pb ≥ ps(vs) ],

and ps(vs) solves the maximization problem

    max_{ps} E[ (ps + pb(vb))/2 − vs : pb(vb) ≥ ps ],

where E[x : A] is the "integral" of x on the set A. (Note that E[x : A] = E[x|A] Pr(A), where E[x|A] is the conditional expectation of x given A. Make sure that you know all these terms!!!)
In this game, there are many Bayesian Nash equilibria. For example, one equilibrium is given by

    pb =  X   if vb ≥ X,
          0   otherwise;

    ps =  X   if vs ≤ X,
          1   otherwise,

for any fixed number X ∈ [0, 1]. We will now consider the Bayesian Nash equilibrium with linear strategies.

15.3.1 Equilibrium with linear strategies


Consider an equilibrium where the strategies are affine functions of valuation, but they
are not necessarily symmetric.

Step 1 Assume that there is an equilibrium with linear strategies:

pb (vb ) = ab + cb vb
ps (vs ) = as + cs vs

for some constants ab, cb, as, and cs. Assume also that cb > 0 and cs > 0. [Notice that a and c may differ between the buyer and the seller.]

Step 2 Compute the best responses for all types. To do this, first note that

    pb ≥ ps(vs) = as + cs vs  ⟺  vs ≤ (pb − as)/cs,   (15.10)

and

    ps ≤ pb(vb) = ab + cb vb  ⟺  vb ≥ (ps − ab)/cb.   (15.11)
To compute the best reply for a type vb, one first computes his expected payoff from his bid (leaving it in integral form). As shown in Figure 15.2, the payoff of the buyer is

    vb − (pb + ps(vs))/2

when vs ≤ vs(pb) = (pb − as)/cs, and the payoff is zero otherwise. Hence, the expected payoff is

    E[ub(pb, ps, vb, vs)|vb] = E[ vb − (pb + ps(vs))/2 : pb ≥ ps(vs) ]
                             = ∫0^{(pb − as)/cs} [vb − (pb + ps(vs))/2] dvs.

By substituting ps(vs) = as + cs vs in this expression, obtain

    E[ub(pb, ps, vb, vs)|vb] = ∫0^{(pb − as)/cs} [vb − (pb + as + cs vs)/2] dvs.
Visually, this is the area of the trapezoid that lies between 0 and vs(pb) horizontally and between the price (ps + pb)/2 and vb vertically.5

5 The area is

    E[ub(pb, ps, vb, vs)|vb] = ((pb − as)/cs) (vb − (3pb + as)/4),

but it is not needed for the final result.

[Figure 15.2: Payoff of a buyer in double auction as his bid changes]

To compute the best reply, take the derivative of the last expression with respect to pb and set it equal to zero:

    0 = ∂E[ub(pb, ps, vb, vs)|vb]/∂pb
      = ∂/∂pb ∫0^{(pb − as)/cs} [vb − (pb + as + cs vs)/2] dvs
      = (vb − pb)/cs − ∫0^{(pb − as)/cs} (1/2) dvs
      = (vb − pb)/cs − (1/2) · (pb − as)/cs.

Solving for pb, obtain

    pb = (2/3) vb + (1/3) as.   (15.12)
Graphically, a Δ amount of increase in pb has two impacts on the expected payoff. First, it causes a Δ/cs amount of increase in vs(pb), adding the shaded rectangular area of size (vb − pb)Δ/cs in Figure 15.2. It also increases the price by Δ/2, subtracting the shaded trapezoidal area of approximate size vs(pb)Δ/2. At the optimum the two amounts must be equal, yielding the above equality.

Now compute the best reply of a type vs. As before, his expected payoff from playing ps in equilibrium is

    E[us(pb, ps, vb, vs)|vs] = E[ (ps + pb(vb))/2 − vs : pb(vb) ≥ ps ]
                             = ∫_{(ps − ab)/cb}^{1} [(ps + ab + cb vb)/2 − vs] dvb,

where the last equality is by (15.11) and pb (vb ) = ab + cb vb . Once again, in order to
compute the best reply, take the derivative of the last expression with respect to ps and
set it equal to zero:6

    −(1/cb)(ps − vs) + ∫_{(ps − ab)/cb}^{1} (1/2) dvb = (1/2)(1 − (ps − ab)/cb) − (1/cb)(ps − vs) = 0.
Once again, a Δ increase in ps leads to a Δ/2 increase in the price, resulting in a gain of (1 − (ps − ab)/cb)Δ/2. It also leads to a Δ/cb decrease in the types of buyers who trade, leading to a loss of (ps − vs)Δ/cb. At the optimum, the gain and the loss must be equal, yielding the above equality. Solving for ps, one can then obtain

    ps = (2/3) vs + (ab + cb)/3.   (15.13)

Step 3 Verify that the best replies are of the form assumed in Step 1. Inspecting (15.12) and (15.13), one concludes that this is indeed the case. The important point here is to check that in (15.12) the coefficient 2/3 and the intercept (1/3)as are constants, independent of vb; similarly for the coefficient and the intercept in (15.13).

Step 4 Compute the constants. To do this, we identify the coefficients and the intercepts in the best replies with the corresponding constants in the functional form of Step 1. First, the best reply (15.12) must coincide with pb(vb) = ab + cb vb, giving the identity

    ab + cb vb = (1/3) as + (2/3) vb.
6 One uses the Leibniz rule. The derivative of the upper bound is zero, contributing zero to the derivative. The derivative of the lower bound is 1/cb, and this is multiplied by the expression in the integral at the lower bound, which is simply ps − vs. (Note that at the lower bound pb = ps, and hence the price is simply ps.) Finally, one adds the integral of the derivative of the expression inside the integral, which is simply 1/2.

That is,

    ab = (1/3) as   (15.14)

and

    cb = 2/3.   (15.15)

Similarly, the best reply (15.13) must coincide with ps(vs) = as + cs vs, giving the identity

    as + cs vs = (ab + cb)/3 + (2/3) vs.

That is,

    as = (ab + cb)/3   (15.16)

and

    cs = 2/3.   (15.17)
Solving (15.14), (15.15), (15.16), and (15.17), we obtain ab = 1/12 and as = 1/4. Therefore, the linear Bayesian Nash equilibrium is given by

    pb(vb) = (2/3) vb + 1/12,   (15.18)
    ps(vs) = (2/3) vs + 1/4.   (15.19)
In this equilibrium, the parties trade iff

    pb(vb) ≥ ps(vs),

i.e.,

    (2/3) vb + 1/12 ≥ (2/3) vs + 1/4,

which can be written as

    vb − vs ≥ (3/2)(1/4 − 1/12) = (3/2)(1/6) = 1/4.
Whenever vb > vs, there is a positive gain from trade. When the gain from trade is lower than 1/4, the parties leave this gain on the table. This is because of the incomplete information: the parties do not know that there is a positive gain from trade. Even if they tried to find ingenious mechanisms to elicit the values, the buyer would have an incentive to understate vb and the seller would have an incentive to overstate vs, and some gains from trade would not be realized.
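As a numerical check of the linear equilibrium (15.18)-(15.19), the following sketch searches over the buyer's offers against the seller's equilibrium strategy and confirms that the best offer is (approximately) pb(vb) = (2/3)vb + 1/12:

```python
# Check that p_b(v_b) = (2/3) v_b + 1/12 is a best reply to
# p_s(v_s) = (2/3) v_s + 1/4, with v_s uniform on [0, 1].
A_S, C_S = 1 / 4, 2 / 3             # seller's intercept and slope

def buyer_payoff(p_b, v_b):
    # Trade occurs iff v_s <= (p_b - A_S) / C_S (the marginal seller type).
    v_hat = min(max((p_b - A_S) / C_S, 0.0), 1.0)
    if v_hat == 0.0:
        return 0.0
    # E[p_s | v_s <= v_hat] = A_S + C_S * v_hat / 2, so the expected payoff is:
    return (v_b - (p_b + A_S + C_S * v_hat / 2) / 2) * v_hat

grid = [k / 2000 for k in range(2001)]
for v_b in (0.4, 0.7, 1.0):
    best = max(grid, key=lambda p: buyer_payoff(p, v_b))
    assert abs(best - ((2 / 3) * v_b + 1 / 12)) < 1e-3, (v_b, best)
```

An analogous search over the seller's asks (not shown) recovers (15.19); the derivation is symmetric.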

15.4 Investment in a Joint Project


In real life, the success of a project often requires investment by several independent parties. In a firm, the production function exhibits synergies between capital investment and labor. A successful product development requires input from both the R&D department, who will develop the new prototype, and the marketing department, who will do market research and advertisement. At a more macro level, we need both entrepreneurs investing in new business ideas and "workers" investing in their human capital (when they are students).
In all these examples, the return from investment for one party is increasing in the investment level of the other. For example, if R&D does not put effort into developing a good product, the market research and advertisement will all be wasted. Likewise, if the marketing department does not do a good job, the R&D will not be useful: the firm will either develop the wrong product (failure in the market research) or the product will not sell because of bad advertisement. Similarly, as a student, in order to invest in your human capital (by studying rather than partying), you should anticipate that there will be jobs that will pay for your human capital; and in order to create skill-oriented jobs, entrepreneurs should anticipate that there will be skilled people to hire. The firms or the countries in which such investments take place prosper while the others remain poor.
I will now illustrate some of the issues related to this coordination problem with a simple example. There are two players, 1 and 2, and a potential project. Each player may either invest in the project or not. If they both invest, the project will succeed; otherwise it will fail, costing money to any party who invests. The payoffs are given by the following matrix:

Invest Not Invest


Invest θ, θ θ − 1, 0
Not Invest 0, θ − 1 0, 0

Player 1 chooses between rows, and Player 2 chooses between columns. Here, the payoff
from not investing is normalized to zero. If a player invests, his payoff depends on the
other player’s action. If the other player also invests, the project succeeds, and both
players get θ. If the other player does not invest, the project fails, and the investing
player incurs a cost, totalling a net benefit of θ − 1.⁷ The important point here is that
the return from investment is 1 utile higher if the other party also invests.
Now suppose that θ is common knowledge. Consider first the case θ < 0. In that
case, the return from investment is so low that Invest is strictly dominated by Not Invest.
(I am sure you can imagine a case in which even if you learn everything the school is
offering and get the best job, it is still not worth studying.) Each player chooses not
to invest. Now consider the other extreme case: θ > 1. In that case, the return from
investment is so high that Invest strictly dominates Not Invest, and both parties invest
regardless of their expectations about the other. (For example, studying may be such
fun that you would study the material even if you thought that it would not help you get
any job.) These are two extreme, uninteresting cases.
Now consider the more interesting and less extreme case of 0 < θ < 1. In that case,
there are two equilibria in pure strategies and one equilibrium in mixed strategies. In
the good equilibrium, anticipating that the other player invests, each player invests in
the project, and each gets the positive payoff of θ. In the bad equilibrium, each player
correctly anticipates that the other party will not invest, so that neither of them invest,
yielding zero payoff for both players.
It is tempting to explain the differences between developed and underdeveloped
countries that have similar resources, or between successful and unsuccessful companies,
by such a multiple-equilibria story. Indeed, many researchers have done so. We will next
consider the case with incomplete information and see that there are serious problems
with such explanations.
Now assume that players do not know θ, but each of them gets an arbitrarily precise
noisy signal about θ. In particular, each player i observes

xi = θ + ηi,                                  (15.20)

where ηi is a noise term, uniformly distributed on [−ε, ε], and ε ∈ (0, 1) is a scalar
that measures the level of uncertainty players face. Assume also that θ is distributed
uniformly on a large interval [−L, L] where L ≫ 1 + ε. Finally, assume that (θ, η1, η2)
are independently distributed. We take the payoff matrix, which depends on the players’
types x1 and x2, as

1\2          Invest        Not Invest
Invest       x1, x2        x1 − 1, 0
Not Invest   0, x2 − 1     0, 0

⁷ This payoff structure corresponds to investing in a project that yields 1 if the project
succeeds and 0 if it fails. A player has to incur a cost c when he invests in the project.
Writing θ = 1 − c for the net return from the project, we obtain the payoff structure above.

That is, the players do not know how much the other party values the investment,
but they know that the valuations are positively correlated. This is because they are
both estimates about the same thing. For example, if Player 1 finds out that investment
is highly valuable, i.e., x1 is high, then he will believe that Player 2 will also find out
that the investment is valuable, i.e., x2 is high. Because of the noise terms, however, he
will not know exactly what x2 is. In particular, for x1 ∈ [0, 1], he will find that the other
player’s signal is higher than his own with probability 1/2 and lower than his own with
probability 1/2:
Pr (xj < xi |xi ) = Pr (xj > xi |xi ) = 1/2. (15.21)

This is implied by the fact that θ is uniformly distributed and we are away from the
corners L and −L. [If you are mathematically inclined, you should prove this.]
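For readers who prefer simulation to proof, here is a rough Monte Carlo check of (15.21). It is my own sketch, not part of the text, and the parameter values ε = 0.1 and L = 10 are arbitrary illustrative choices:

```python
# Monte Carlo check of (15.21): conditional on x1 near a point inside [0, 1],
# the opponent's signal x2 falls below x1 about half of the time.
import random

random.seed(1)
eps, L = 0.1, 10.0
target, band = 0.5, 0.02          # condition on x1 close to 0.5
below = total = 0
for _ in range(1_000_000):
    theta = random.uniform(-L, L)
    x1 = theta + random.uniform(-eps, eps)
    if abs(x1 - target) < band:   # keep only draws with x1 near the target
        x2 = theta + random.uniform(-eps, eps)
        total += 1
        below += x2 < x1
print(total, below / total)       # the ratio should be close to 1/2
```

The estimated conditional probability comes out near 1/2, as (15.21) asserts for signals away from the corners.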
We will now look for the symmetric Bayesian Nash equilibria in monotone (i.e. weakly
increasing) strategies. A monotone strategy si here is a strategy with a cutoff value x̂i
such that the player invests if and only if his signal is at least the cutoff:

           ⎧ Invest       if xi ≥ x̂i,
si (xi) =  ⎨
           ⎩ Not Invest   if xi < x̂i.

Any symmetric Bayesian Nash equilibrium s∗ in monotone strategies has a cutoff value
x̂ such that

            ⎧ Invest       if xi ≥ x̂,
s∗i (xi) =  ⎨
            ⎩ Not Invest   if xi < x̂.
Here, symmetry means that the cutoff value x̂ is the same for both players. In order to
identify such a strategy profile all we need to do is to determine a cutoff value.
We will now find the cutoff value x̂ that yields a Bayesian Nash equilibrium. Notice
that the payoff from investment is

Ui (Invest, sj |xi) = Pr (sj (xj) = Invest|xi) xi + Pr (sj (xj) = Not Invest|xi) (xi − 1)
                   = xi − Pr (sj (xj) = Not Invest|xi).
The payoff from Not Invest is simply zero. Hence, a player invests as a best reply if
and only if his signal is at least as high as the probability that the other player is
not investing, i.e., xi ≥ Pr (sj (xj) = Not Invest|xi). Therefore, s∗ is a Bayesian Nash
equilibrium iff

(∀xi ≥ x̂)   xi ≥ Pr (s∗j (xj) = Not Invest|xi) = Pr (xj < x̂|xi),
(∀xi < x̂)   xi ≤ Pr (s∗j (xj) = Not Invest|xi) = Pr (xj < x̂|xi).

Proof. Consider xi ≥ x̂. According to s∗, player i plays Invest at xi, with expected payoff
xi − Pr (s∗j (xj) = Not Invest|xi) = xi − Pr (xj < x̂|xi). In a Bayesian Nash equilibrium
he has no incentive to deviate to Not Invest, i.e., xi − Pr (xj < x̂|xi) ≥ 0, or equivalently
xi ≥ Pr (xj < x̂|xi). Similarly, when xi < x̂, according to s∗, player i plays Not Invest,
getting 0; he has no incentive to deviate to Invest, which yields xi − Pr (xj < x̂|xi), iff
xi ≤ Pr (xj < x̂|xi).
Now observe that if xi ≥ 1, then xi ≥ 1 ≥ Pr (xj < x̂|xi), and hence s∗i (xi) = Invest.
On the other hand, if xi < 0, then xi < 0 ≤ Pr (xj < x̂|xi), and hence s∗i (xi) = Not
Invest. Therefore, x̂ ∈ [0, 1].
Most importantly, at the cutoff value the player must be indifferent between investing
and not investing:
x̂ = Pr (xj < x̂|x̂) . (15.22)

Intuitively, when xi is slightly lower than x̂ we have xi ≤ Pr (xj < x̂|xi ), and when xi
is slightly higher than x̂ we have xi ≥ Pr (xj < x̂|xi ). Because of continuity we must
have equality at xi = x̂. Below, for those who want to see a rigorous proof, I make this
argument more formally.
Proof. Since x̂ ∈ [0, 1], there are types xi > x̂, and all such types invest. Hence there
is a sequence of types xi → x̂ with xi > x̂. Since each such xi invests, xi ≥ Pr (xj < x̂|xi).
Moreover, Pr (xj < x̂|xi) is continuous in xi. Hence, x̂ = lim xi ≥ lim Pr (xj < x̂|xi) =
Pr (xj < x̂|x̂). Similarly, there are types xi < x̂, who do not invest, and considering such
types approaching x̂, we conclude that x̂ = lim xi ≤ lim Pr (xj < x̂|xi) = Pr (xj < x̂|x̂).
Combining these two, we obtain the equality.
Equation (15.22) shows that there is a unique symmetric Bayesian Nash equilibrium
in monotone strategies.
Proposition 15.1 There is a unique symmetric Bayesian Nash equilibrium in monotone
strategies:

            ⎧ Invest       if xi ≥ 1/2,
s∗i (xi) =  ⎨
            ⎩ Not Invest   if xi < 1/2.

Proof. By (15.22), we have x̂ = Pr (xj < x̂|x̂). But by (15.21), Pr (xj < x̂|x̂) = 1/2.
Therefore,

x̂ = Pr (xj < x̂|x̂) = 1/2.

We have shown that there is a unique symmetric Bayesian Nash equilibrium in
monotone strategies. It is beyond the scope of this course, but this also implies that
the symmetric Bayesian Nash equilibrium is the only rationalizable strategy (with the
exception of what to play at the cutoff 1/2).⁸ That is, the game with incomplete
information has a unique solution, as opposed to the multiple equilibria in the case of
complete information.
The unique solution has intuitive properties. Firstly, the investment becomes more
likely when it is more valuable. This is because, as we increase θ, the probability
Pr (xi ≥ 1/2|θ) also increases. (That probability is (θ + ε − 1/2) / (2ε) when it is in the
interior (0, 1).) That is, the outcome is determined by the underlying payoff parameters
in an intuitive way. Secondly, the cutoff value 1/2 is also intuitive. Suppose that ε is
very small so that x1 ≈ x2 ≈ θ. Let us say that Invest is risk dominant if it is a best
reply to the belief that the other player invests with probability 1/2 and does not invest
with probability 1/2. Such beliefs are meant to be completely uninformative. Note
that Invest is risk dominant if and only if xi ≥ 1/2. That is, the players play the risk-
dominant action under incomplete information.

⁸ For mathematically inclined students: This is because the game is supermodular: (i) the return
to investment increases with the investment of the other party and with one’s own type xi , and (ii)
the beliefs are increasing in the sense that Pr (xj ≥ a|xi ) is weakly increasing in xi . In that case, the
rationalizable strategies are bounded by symmetric Bayesian Nash equilibria in monotone strategies.
Clearly, when the latter is unique, there must be a unique rationalizable strategy.
15.5 Exercises with Solution


1. [Final, 2007, early exam] A consumer needs 1 unit of a good. There are n firms
who can supply the good. The cost of producing the good for firm i is ci , which
is privately known by i, and (c1, c2, . . . , cn) are independently and uniformly
distributed on [0, 1]. Simultaneously, each firm i sets a price pi , and the consumer
buys from the firm with the lowest price. (If k > 1 firms charge the lowest price,
he buys from one of those firms randomly, each selling with probability 1/k.) The
payoff of i is pi − ci if it sells and 0 otherwise.

(a) Write this as a Bayesian game.

Answer:

• The set of players: N = {1, . . . , n}, the set of firms;


• the set of types of i: Ti = [0, 1], the set of possible costs ci ;
• the set of actions of i: Ai = [0, ∞), the set of possible prices pi ;
• the utility of i:

                                        ⎧ (pi − ci) / |{j : pj = pi}|   if pi ≤ pj for all j,
  ui (p1, . . . , pn; c1, . . . , cn) = ⎨
                                        ⎩ 0                             otherwise;

• the beliefs: conditional on ci, the costs (cj)j≠i are i.i.d. with the uniform
  distribution on [0, 1].

(b) Compute a symmetric, linear Bayesian Nash equilibrium. What happens as


n → ∞? Briefly interpret.

Answer: See part (c)

(c) Find all symmetric Bayesian Nash equilibria in strictly increasing and
differentiable strategies.
[Hint: Given any c̄ ∈ (0, 1), the probability that cj ≥ c̄ for all j ≠ i is
(1 − c̄)^(n−1).]

Answer: We are looking for an equilibrium in which each player i plays


p, which is an increasing differentiable function that maps ci to p (ci ). Now,
given that the other players play p, the expected utility of firm i from charging
pi at cost ci is

Ui (pi, ci) = Pr (p (cj) > pi for all j ≠ i) (pi − ci)
            = Pr (cj > p⁻¹ (pi) for all j ≠ i) (pi − ci)
            = (1 − p⁻¹ (pi))^(n−1) (pi − ci).

To see the last equality, note that for all j ≠ i, Pr (cj > p⁻¹ (pi)) = 1 − p⁻¹ (pi).
Since the types are independently distributed, we must multiply these
probabilities over all j ≠ i, that is, n − 1 times. The first-order condition is

∂Ui/∂pi = (1 − p⁻¹ (pi))^(n−1) − (n − 1) (1 − p⁻¹ (pi))^(n−2) (pi − ci) · (1/p′(c))|p(c)=pi = 0.

This equation must be satisfied at pi = p (ci):

(1 − ci)^(n−1) − (n − 1) (1 − ci)^(n−2) (p (ci) − ci) · (1/p′(ci)) = 0.

One can rewrite this as a differential equation:

(1 − ci)^(n−1) p′(ci) − (n − 1) (1 − ci)^(n−2) p (ci) = −ci (n − 1) (1 − ci)^(n−2).

(If you obtain this differential equation, you will get 8 out of 10.) To solve it,
notice that the left-hand side is

(d/dci) [(1 − ci)^(n−1) p (ci)].

Therefore,

(1 − ci)^(n−1) p (ci) = ∫ −ci (n − 1) (1 − ci)^(n−2) dci + const
                      = (1 − ci)^(n−1) − ((n − 1)/n) (1 − ci)^n + const,

which is obtained by changing the variable to v = 1 − c. To have the equality at
ci = 1, the constant must be zero. Therefore,

p (ci) = 1 − ((n − 1)/n) (1 − ci) = 1/n + ((n − 1)/n) ci.
This is also the symmetric linear BNE in part (b). Here, with incomplete
information, the equilibrium price is a weighted average of the firm's own cost
and the highest possible cost. This price can be high. However, as n → ∞,
p (ci) → ci, and the firm with the lowest cost sells the good at its marginal
cost, as in the competitive equilibrium.
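The derived bidding function can be sanity-checked numerically. In the sketch below (mine, not the exam's; the values n = 4 and ci = 0.3 are arbitrary test choices), a grid search confirms that when the other firms follow p, charging p(ci) is a best reply:

```python
# Grid-check that p(c) = 1/n + (n-1)c/n is a best reply in the procurement game.

def expected_profit(p_i, c_i, n):
    """(1 - p^{-1}(p_i))^{n-1} * (p_i - c_i) under the lowest-price-wins rule."""
    c_inv = (n * p_i - 1) / (n - 1)              # opponent cost type that bids p_i
    win_prob = max(0.0, min(1.0, 1.0 - c_inv)) ** (n - 1)
    return win_prob * (p_i - c_i)

n, c_i = 4, 0.3
p_star = 1 / n + (n - 1) * c_i / n               # claimed equilibrium price
grid = [1 / n + k * (1 - 1 / n) / 10_000 for k in range(10_001)]
p_best = max(grid, key=lambda p: expected_profit(p, c_i, n))
print(p_star, p_best)                            # the two prices should agree
```

The grid maximizer coincides with p(ci) up to the grid resolution, consistent with the first-order condition derived above.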

Remark 15.2 The problem here can be viewed as a procurement auction, in which
the lowest bidder wins. This is closely related to the problem in which n buyers
with privately known values bid in a first-price auction.

2. [Final 2002] Two partners simultaneously invest in a project, where the level of
investment can be any non-negative real number. If partner i invests xi and the
other partner j invests xj, then the payoff of partner i is

θi xi xj − xi³.

Here, θi is privately known by partner i, and the other partner believes that θi is
uniformly distributed on [0, 1]. All these are common knowledge. Find a symmetric
Bayesian Nash equilibrium in which the investment of partner i is in the form of

xi = a + b √θi.
Solution: In this problem, all symmetric Bayesian Nash equilibria turn out to
be of the above form; the question hints at the form. I construct a Bayesian Nash
equilibrium (x∗1, x∗2), which will be in the form of x∗i (θi) = a + b √θi. The expected
payoff of i from investment xi is

U (xi; θi) = E [θi xi x∗j − xi³] = θi xi E [x∗j] − xi³.

Of course, x∗i (θi) satisfies the first-order condition

0 = ∂U (xi; θi) /∂xi |x∗i(θi) = θi E [x∗j] − 3 (x∗i (θi))²,

i.e.,

x∗i (θi) = √(θi E [x∗j] /3) = √θi · √(E [x∗j] /3).

That is, a = 0, and the equilibrium is in the form of x∗i (θi) = b √θi where

b = √(E [x∗j] /3).

But x∗j = b √θj, hence

E [x∗j] = E [b √θj] = b E [√θj] = 2b/3.

Substituting this into the previous equation, we obtain

b² = E [x∗j] /3 = (2b/3) /3 = 2b/9.

There are two solutions to this equality, each yielding a distinct Bayesian Nash
equilibrium. The first solution is

b = 2/9,

yielding the Bayesian Nash equilibrium

x∗i (θi) = (2/9) √θi.

The second solution is b = 0, yielding the Bayesian Nash equilibrium in which each
player invests 0 regardless of his type.
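Both steps of the computation are easy to verify numerically. The sketch below (an illustration of mine, not part of the original solution) checks that E[√θj] ≈ 2/3 by simulation and that b = 2/9 is the stable fixed point of b = √(2b/9):

```python
# Verify E[sqrt(theta)] = 2/3 for theta ~ U[0,1], and the fixed point b = 2/9.
import random

random.seed(0)
n = 1_000_000
mean_sqrt = sum(random.random() ** 0.5 for _ in range(n)) / n
print(mean_sqrt)                  # close to 2/3

b = 0.5                           # iterate b <- sqrt(2b/9)
for _ in range(200):
    b = (2 * b / 9) ** 0.5
print(b)                          # converges to 2/9
```

Starting from any b > 0, the iteration converges to the nonzero solution 2/9; starting exactly at 0 it stays at the degenerate no-investment equilibrium.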

3. [Midterm 2, 2001] Consider the following first-price, sealed-bid auction where an


indivisible good is sold. There are n ≥ 2 buyers indexed by i = 1, 2, . . . , n.
Simultaneously, each buyer i submits a bid bi ≥ 0. The agent who submits the
highest bid wins. If there are k > 1 players submitting the highest bid, then the
winner is determined randomly among these players – each has probability 1/k of
winning. The winner i gets the object and pays his bid bi , obtaining payoff vi − bi ,
while the other buyers get 0, where v1 , . . . , vn are independently and identically
distributed with probability density function f where

         ⎧ 3x²   if x ∈ [0, 1],
f (x) =  ⎨
         ⎩ 0     otherwise.

(a) Compute the symmetric, linear Bayesian Nash equilibrium.


Answer: We look for an equilibrium of the form

bi = a + cvi
where c > 0. Then, the expected payoff from bidding bi with type vi is

U (bi; vi) = (vi − bi) Pr (bi > a + cvj  ∀j ≠ i)
           = (vi − bi) ∏j≠i Pr (bi > a + cvj)
           = (vi − bi) ∏j≠i Pr (vj < (bi − a)/c)
           = (vi − bi) ∏j≠i ((bi − a)/c)³
           = (vi − bi) ((bi − a)/c)^(3(n−1))

for bi ∈ [a, a + c]. The first-order condition is

∂U (bi; vi)/∂bi = −((bi − a)/c)^(3(n−1)) + 3 (n − 1) (vi − bi) ((bi − a)/c)^(3(n−1)−1) · (1/c) = 0;

i.e.,

−(bi − a)/c + 3 (n − 1) (vi − bi) · (1/c) = 0;

i.e.,

bi = (a + 3 (n − 1) vi) / (3 (n − 1) + 1).

Since this is an identity, we must have

a = a / (3 (n − 1) + 1),

i.e., a = 0, and

c = 3 (n − 1) / (3 (n − 1) + 1).

(b) What happens as n → ∞?


Answer: As n → ∞,
bi → vi.

In the limit, each bidder bids his valuation, and the seller extracts all the
gains from trade.
[Hint: Since v1 , v2 , . . . , vn are independently distributed, for any w1 , w2 , . . . , wk , we


have

Pr(v1 ≤ w1 , v2 ≤ w2 , . . . , vk ≤ wk ) = Pr(v1 ≤ w1 ) Pr(v2 ≤ w2 ) . . . Pr(vk ≤ wk ).]
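The equilibrium in part (a) can also be checked by grid search. In this sketch (my own; n = 3 and vi = 0.8 are arbitrary test values), no bid beats c·vi against rivals who bid c·vj under the density f(x) = 3x² (CDF x³):

```python
# With m = 3(n-1) and rivals bidding c*v_j, c = m/(m+1), check on a grid
# that b = m*v/(m+1) maximizes (v - b) * Pr(win) = (v - b) * (b/c)^m.

def exp_payoff(b, v, n):
    m = 3 * (n - 1)
    c = m / (m + 1)                  # claimed equilibrium slope
    win = min(1.0, b / c) ** m       # each rival's vj < b/c w.p. (b/c)^3
    return (v - b) * win

n, v = 3, 0.8
m = 3 * (n - 1)
b_star = m * v / (m + 1)             # claimed equilibrium bid
grid = [k / 10_000 for k in range(10_001)]
b_best = max(grid, key=lambda b: exp_payoff(b, v, n))
print(b_star, b_best)                # the two bids should agree
```

The grid maximizer matches 3(n − 1)vi / (3(n − 1) + 1) up to the grid resolution.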

4. [Midterm 2, 2002] Consider a game between two software developers, who sell
operating systems (OS) for personal computers. (There are also a PC maker and
the consumers, but their strategies are already fixed.) Each software developer
i, simultaneously offers “bribe” bi to the PC maker. (The bribes are in the form
of contracts.) Looking at the offered bribes b1 and b2 , the PC maker accepts the
highest bribe (and tosses a coin between them if they happen to be equal), and he
rejects the other. If a firm’s offer is rejected, it goes out of business, and gets 0.
Let i∗ denote the software developer whose bribe is accepted. Then, i∗ pays the
bribe bi∗ , and the PC maker develops its PC compatible only with the operating
system of i∗ . Then in the next stage, i∗ becomes the monopolist in the market for
operating systems. In this market the inverse demand function is given by

P = 1 − Q,

where P is the price of OS and Q is the demand for OS. The marginal cost of
producing the operating system for each software developer i is ci . The costs c1
and c2 are independently and identically distributed with the uniform distribution
on [0, 1], i.e.,
               ⎧ 0   if c < 0,
Pr (ci ≤ c) =  ⎨ c   if c ∈ [0, 1],
               ⎩ 1   otherwise.

The software developer i knows its own marginal costs, but the other firm does not
know. Each firm tries to maximize its own expected profit. Everything described
so far is common knowledge.

(a) What quantity would software developer i produce if it became the monopolist?
What would be its profit?
Solution: Quantity is

qi = (1 − ci) / 2

and the profit is

vi = ((1 − ci) / 2)².
(b) Compute a symmetric Bayesian Nash equilibrium in which each firm’s bribe
is in the form of bi = α + γ (1 − ci)².
Solution: We have a first-price auction where the valuation of buyer i, who
is the software developer i, is vi = (1 − ci)²/4. His payoff from paying bribe
bi is

Ui (bi; ci) = (vi − bi) Pr (bj < bi),

where

Pr (bj < bi) = Pr (α + γ (1 − cj)² < bi) = Pr ((1 − cj)² < (bi − α)/γ)
             = Pr (1 − cj < √((bi − α)/γ)) = Pr (cj > 1 − √((bi − α)/γ))
             = 1 − Pr (cj ≤ 1 − √((bi − α)/γ)) = 1 − [1 − √((bi − α)/γ)]
             = √((bi − α)/γ).

Hence,

Ui (bi; ci) = (vi − bi) √((bi − α)/γ).

But maximizing Ui (bi; ci) is the same as maximizing

γ Ui (bi; ci)² = (vi − bi)² (bi − α).

The first-order condition yields

2 (bi − vi) (bi − α) + (bi − vi)² = 0,

i.e.,

2 (bi − α) + (bi − vi) = 0,

i.e.,

bi = (1/3) vi + (2/3) α = (1/12) (1 − ci)² + (2/3) α.

Therefore,

γ = 1/12   and   α = (2/3) α ⟹ α = 0,

yielding

bi = (1/3) vi = (1/12) (1 − ci)².

(Check that the second derivative is 2 (3bi − 2vi) = −2vi < 0.)
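A quick grid search (my own check, not part of the exam solution; ci = 0.4 is an arbitrary test value) confirms that against a rival bribing (1 − cj)²/12, the payoff (vi − bi)·√(12 bi) is maximized at bi = (1 − ci)²/12:

```python
# Check the bribe equilibrium b = (1 - c)^2 / 12 by maximizing over a grid.
import math

def exp_payoff(b, c_i):
    v_i = (1 - c_i) ** 2 / 4              # monopoly profit from part (a)
    win = min(1.0, math.sqrt(12 * b))     # Pr(b_j < b), valid for b in [0, 1/12]
    return (v_i - b) * win

c_i = 0.4
b_star = (1 - c_i) ** 2 / 12              # claimed equilibrium bribe
grid = [k * (1 / 12) / 50_000 for k in range(50_001)]
b_best = max(grid, key=lambda b: exp_payoff(b, c_i))
print(b_star, b_best)                     # the two bribes should agree
```

The grid maximizer coincides with vi/3 = (1 − ci)²/12, matching the first-order condition above.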
(c) Considering that the demand for PCs and the demand for OSs must be the
same, should the PC maker accept the highest bribe? (Assume that the PC
maker also tries to maximize its own profit. Explain your answer.)
Answer: A low-cost monopolist will charge a lower price, increasing the
profit for the PC maker. Since low-cost software developers pay higher
bribes, it is in the PC maker’s interest to accept the higher bribe. In that
case, he will get a higher bribe now and higher profits later.

15.6 Exercises
1. [Midterm 2 Make Up, 2011] There are n players in a town. Simultaneously each
player i contributes xi to a public project, yielding a public good of amount

y = x1 + · · · + xn ,

where xi is any real number. The payoff of each player i is

ui = y² − ci xi^γ

where γ > 2 is a known parameter and the cost parameter ci ∈ {1, 2} of player i
is his private information. The costs (c1 , . . . , cn ) are independently and identically
distributed where the probability of ci = 1 is 1/2 for each player i.

(a) Write this formally as a Bayesian game.


(b) Find a Bayesian Nash equilibrium of this game. Verify that the strategy
profile you identified is indeed a Bayesian Nash equilibrium. (If you solve this
part for n = 2 and γ = 3, you will get 75% of the credit.)

2. [Homework 4, 2004] There are n people, who want to produce a common public
good through voluntary contributions. Simultaneously, every player i contributes
xi . The amount of public good produced is

y = x1 + x2 + · · · + xn.

The payoff of each player i is

ui = θi y − y² − xi,
where θi is a parameter privately known by player i, and θ1 , θ2 , . . . , θn are indepen­


dently and identically distributed with uniform distribution on [1, 2]. Assume that
xi can be positive or negative. Compute a symmetric Bayesian Nash equilibrium.
[Hint: symmetric means that xi (θi ) = xj (θj ) when θi = θj . The equilibrium will
be linear, in the form of xi (θi ) = aθi + b.]

3. [Homework 5, 2005] Consider a two player game with payoff matrix

1\2    L        R
X      3, θ     0, 0
Y      2, 2θ    2, θ
Z      0, 0     3, −θ

where θ ∈ {−1, 1} is a parameter known by Player 2. Player 1 believes that θ = −1


with probability 1/2 and θ = 1 with probability 1/2. Everything above is common
knowledge.

(a) Write this game formally as a Bayesian game.


(b) Compute the Bayesian Nash equilibrium of this game.
(c) What would be the Nash equilibria in pure strategies (i) if it were common
knowledge that θ = −1, or (ii) if it were common knowledge that θ = 1?

4. [Homework 5, 2005] In a college there are n students. They are simultaneously


sending data over the college’s data network. Let xi ≥ 0 be the size of the data sent by
student i. Each student i chooses xi himself or herself. The speed of the network is
inversely proportional to the total size of the data, so that it takes xi τ (x1 , . . . , xn )
minutes to send the message where

τ (x1 , . . . , xn ) = x1 + · · · + xn .

The payoff of student i is

θi xi − xi τ (x1 , . . . , xn ) ,

where θi ∈ {1, 2} is a payoff parameter of player i, privately known by himself or


herself. For each j ≠ i, independent of θj, player j assigns probability 1/2 to θi = 1
and probability 1/2 to θi = 2. Everything described so far is common knowledge.
(a) Write this game formally as a Bayesian game.


(b) Compute the symmetric Bayesian Nash equilibrium of this game.
Hint: symmetric means that xi (θi ) = xj (θj ) when θi = θj . In the symmetric
equilibrium one of the types will choose zero, i.e., for some θ ∈ {1, 2}, xi (θi ) =
0 whenever θi = θ. The expected value E [x1 + · · · + xn ] of x1 + · · · + xn is
E [x1 ] + · · · + E [xn ].

5. [Midterm 2, 2001] Consider the following first-price, sealed-bid auction where an


indivisible good is sold. There are n ≥ 2 buyers indexed by i = 1, 2, . . . , n.
Simultaneously, each buyer i submits a bid bi ≥ 0. The agent who submits the
highest bid wins. If there are k > 1 players submitting the highest bid, then the
winner is determined randomly among these players – each has probability 1/k of
winning. The winner i gets the object and pays his bid bi , obtaining payoff vi − bi ,
while the other buyers get 0, where v1 , . . . , vn are independently and identically
distributed with probability density function f where

         ⎧ (α + 1) x^α   if x ∈ [0, 1],
f (x) =  ⎨
         ⎩ 0             otherwise

for some α > 0.

(a) Compute the symmetric, linear Bayesian Nash equilibrium.


(b) What happens as n → ∞, or as α → ∞? Give an economic explanation for
each limit.

[Hint: Since v1 , v2 , . . . , vn are independently distributed, for any w1 , w2 , . . . , wk , we


have

Pr(v1 ≤ w1 , v2 ≤ w2 , . . . , vk ≤ wk ) = Pr(v1 ≤ w1 ) Pr(v2 ≤ w2 ) . . . Pr(vk ≤ wk ).]

6. [Midterm 2, 2001] Consider a game of public good provision in which two players
simultaneously choose whether to contribute yielding payoff matrix

1\2          Contribute        Don’t
Contribute   1 − c1, 1 − c2    1 − c1, 1
Don’t        1, 1 − c2         0, 0
where the costs c1 and c2 are privately known by players 1 and 2, respectively. c1
and c2 are independently and identically distributed with uniform distribution on
[0, 2] (i.e., independent of his own cost, player 1 believes that c2 is uniformly
distributed on [0, 2], and vice versa). Compute a Bayesian Nash equilibrium
of this game.

7. [Final 2002 Make Up] We consider an “all-pay auction” between two bidders, who
bid for an object. The value of the object for each bidder i is vi , where v1 and v2
are identically and independently distributed with uniform distribution on [0, 1].
Each bidder simultaneously bids bi ; the bidder who bids the highest bid gets the
object, and each bidder i pays his own bid bi . (If b1 = b2 , then each gets the object
with probability 1/2.) The payoff of player i is


     ⎧ vi − bi     if bi > bj,
ui = ⎨ vi/2 − bi   if bi = bj,
     ⎩ −bi         if bi < bj.

Find a symmetric Bayesian Nash equilibrium in the form of bi = a + c vi².

8. [Homework 6, 2006] (This question is also about a game that was played in the
class.) There are n students in the class. We have a certificate, whose value for
each student i is vi , where vi is privately known by student i and (v1 , . . . , vn ) are
independently and identically distributed with uniform distribution on [0, 100].
Simultaneously, each student i bids a real number bi . The player who bids the
highest number "wins" the certificate; if there are more than one highest bids, then
we determine the "winner" randomly among the highest bidders. The winner i gets
the certificate and pays bi to the professor. [Hint: Pr (maxj≠i vj ≤ x) = (x/100)^(n−1)
for any x ∈ [0, 100].]

(a) Find a symmetric, linear Bayesian Nash equilibrium, where bi (vi ) = a + cvi
for some constants a and c.

(b) What is the equilibrium payoff of a student with value vi ?

(c) Assume that n = 80. How much would a student with value vi be willing to
pay (in terms of lost opportunities and pain of sitting in the class) in order
to play this game? What is the payoff difference between the luckiest student
and the least lucky student?

9. [Homework 6, 2006] In a state, there are two counties, A and B. The state is to
dump the waste in one of the two counties. For a county i, the cost of having the
wasteland is ci , where cA and cB are independently and uniformly distributed on
[0, 1]. They decide where to dump the waste as follows. Simultaneously counties
A and B bid bA and bB , respectively. The waste is dumped in the county i who
bids lower, and the other county j pays bj to i. (We toss a coin if the bids are
equal.) The payoff of a county is the amount of money it has, minus the cost if it
contains the wasteland.

(a) Write this as a Bayesian game.


(b) Find all the symmetric equilibria where the bid is a strictly increasing dif­
ferentiable function of the cost. [If you can find a differential equation that
characterizes the symmetric equilibria, you will get 80% of this part.]

10. [Final, 2006] Alice and Bob have inherited a factory from their parents. The value
of the factory is vA for Alice and vB for Bob, where vA and vB are independently
and uniformly distributed over [0, 1], and each of them knows his or her own value.
Simultaneously, Alice and Bob bid bA and bB , respectively, and the highest bidder
wins the factory and pays the other sibling’s bid. (If the bids are equal, we toss a
coin to determine the winner.)

(a) (5pts) Write this game as a Bayesian game.


(b) (10 pts) Find a symmetric, linear Bayesian Nash equilibrium of this game.
(c) (10pts) Find all symmetric Bayesian Nash equilibria of this game in strictly
increasing differentiable strategies.

11. [Final 2007] There are n ≥ 2 siblings, who have inherited a factory from their
parents. The value of the factory is vi for sibling i, where (v1 , . . . , vn ) are inde­
pendently and uniformly distributed over [0, 1], and each of them knows his or her
own value. Simultaneously, each i bids bi , and the highest bidder wins the factory
and pays his own bid to his siblings, who share it equally among themselves. (If
the bids are equal, the winner is determined by a lottery with equal probabilities
on the highest bidders.) Note that if i wins, i gets vi − bi and any other j gets
bi / (n − 1).

(a) (5 points) Write this as a Bayesian game.


(b) (10 points) Compute a symmetric, linear Bayesian Nash equilibrium. What
happens as n → ∞? Briefly interpret.
(c) (10 points) Find all symmetric Bayesian Nash equilibrium in strictly increas­
ing and differentiable strategies.

12. [Homework 6, 2006] There is a house on the market. There are n ≥ 2 buyers.
The value of the house for buyer i is vi (measured in million dollars) where v1 ,
v2 , . . . , vn are independently and identically distributed with uniform distribution
on [0, 1]. The house is to be sold via first-price auction. This question explores
whether various "incentives" can be effective in improving participation.

(a) Suppose that seller gives a discount to the winner, so that winner pays only λbi
for some λ ∈ (0, 1), where bi is his own bid. Compute the symmetric Bayesian
Nash equilibrium. (Throughout the question, you can assume linearity if you
want.) Compute the expected revenue of the seller in that equilibrium.
(b) Suppose that seller gives a prize α > 0 to the winner. Compute the symmetric
Bayesian Nash equilibrium. Compute the expected revenue of the seller in
that equilibrium.
(c) Consider three different scenarios:
• the seller does not give any incentive;
• the seller gives 20% discount (λ = 0.8);
• the seller gives $100,000 to the winner.
For each scenario, determine how much a buyer with value vi is willing to pay
in order to participate in the auction. Briefly discuss whether such incentives
can facilitate the sale of the house.

13. [Homework 6, 2006] We have a penalty kicker and a goal keeper. Simultaneously,
penalty kicker decides whether to send the ball to the Left or to the Right, and
the goal keeper decides whether to cover the Left or the Right. The payoffs are as
follows (where the first entry is the payoff of penalty kicker):

PK\GK Left Right


Left x − 1, y + 1 x + 1, −1
Right 1, y − 1 −1, 1
Here, x and y are independently and uniformly distributed on [−1, 1]; the penalty
kicker knows x, and the goal keeper knows y. Find a Bayesian Nash equilibrium.

14. [Final 2010] There are two identical objects and three potential buyers, named
1, 2, and 3. Each buyer only needs one object and does not care which of the
identical objects he gets. The value of the object for buyer i is vi where (v1 , v2 , v3 )
are independently and uniformly distributed on [0, 1]. The objects are sold to two
of the buyers through the following auction. Simultaneously, each buyer i submits
a bid bi , and the buyers who bid one of the two highest bids buy the object and
pay their own bid. (The ties are broken by a coin toss.) That is, if bi > bj for
some j, i gets an object and pays bi , obtaining the payoff of vi − bi ; if bi < bj for
all j, the payoff of i is 0.

(a) (5 points) Write this as a Bayesian game.


(b) (20 points) Compute a symmetric Bayesian Nash equilibrium of this game in
increasing differentiable strategies. (You will receive 15 points if you derive
the correct equations without solving them.)

15. [Final 2010] A state government wants to construct a new road. There are n
construction firms. In order to decrease the cost of delay in completion of the
road, the government wants to divide the road into k < n segments and construct
the segments simultaneously using different firms. The cost of delay for the public
is Cp = K/k for some constant K > 0. The cost of constructing a segment for firm
i is ci /k where (c1 , . . . , cn ) are independently and uniformly distributed on [0, 1],
where ci is privately known by firm i. The government hires the firms through the
following procurement auction.

k + 1st-price Procurement Auction Simultaneously, each firm i submits a bid


bi and each of the firms with the lowest k bids wins one of the segments. Each
winning firm is paid the k + 1st lowest bid as the price for the construction
of the segment. The ties are broken by a coin toss.

The payoff of a winning firm is the price paid minus its cost of constructing a
segment, and the payoff of a losing firm is 0. For example, if k = 2 and the bids
are (0.1, 0.2, 0.3, 0.4), then firms 1 and 2 win and each is paid 0.3, resulting in
payoff vector (0.3 − c1 /2, 0.3 − c2 /2, 0, 0).

(a) (10 points) For a given fixed k, find a Bayesian Nash equilibrium of this game
in which no firm bids below its cost. Verify that it is indeed a Bayesian Nash
equilibrium.

(b) (10 points) Assume that each winning firm is to pay β ∈ (0, 1) share of the
price to the local mafia. (In the above example it pays 0.3β to the mafia
and keep 0.3 (1 − β) for itself.) For a given fixed k, find a Bayesian Nash
equilibrium of this game in which no firm bids below its cost. Verify that it
is indeed a Bayesian Nash equilibrium.

(c) (5 points) Assuming that the government minimizes the sum of Cp and the
total price it pays for the construction, find the condition for the optimal k
for the government in parts (a) and (b). Show that the optimal k in (b) is
weakly lower than the optimal k in (a). Briefly interpret the result. [Hint:
the expected value of the (k + 1)st lowest cost is (k + 1) / (n + 1).]

16. [Final 2011] There are k identical objects and n potential buyers where n > k >
1. Each buyer only needs one object and does not care which of the identical
objects he gets. The value of the object for buyer i is vi where (v1 , v2 , . . . , vn ) are
independently and uniformly distributed on [0, 1]. The objects are sold to k of
the buyers through the following auction. Simultaneously, each buyer i submits a
bid bi , and the buyers who bid one of the k highest bids buy the object and pay
their own bid. (The ties are broken by a coin toss.) That is, if bi > bj for at least
n − k bidders j, then i gets an object and pays bi , obtaining the payoff of vi − bi ;
if bi < bj for at least k bidders j, the payoff of i is 0.

(a) (5 points) Write this as a Bayesian game.



(b) (20 points) Compute a symmetric Bayesian Nash equilibrium of this game in
increasing differentiable strategies. (You will receive 15 points if you derive
the correct equations without solving them.)
Hint: Let (x1, . . . , xm) be independently and uniformly distributed on [0, 1]
and let x(r) be the rth highest xi among (x1, . . . , xm). Then, the probability
density function of x(r) is

fm,r(x) = m! / ((r − 1)! (m − r)!) · (1 − x)^(r−1) x^(m−r).
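As a numerical sanity check of the hint (a quick sketch; the function names are ours), the density should integrate to 1 over [0, 1], and its mean should match the standard order-statistic formula E[x(r)] = (m − r + 1)/(m + 1):

```python
import math

def f(m, r, x):
    # Density of the r-th highest of m i.i.d. Uniform[0,1] draws:
    # f_{m,r}(x) = m! / ((r-1)! (m-r)!) * (1-x)**(r-1) * x**(m-r)
    c = math.factorial(m) // (math.factorial(r - 1) * math.factorial(m - r))
    return c * (1 - x) ** (r - 1) * x ** (m - r)

def integrate(g, n=100_000):
    # midpoint Riemann sum on [0, 1]
    return sum(g((i + 0.5) / n) for i in range(n)) / n

m, r = 5, 2  # second-highest of five draws
print(round(integrate(lambda x: f(m, r, x)), 6))      # total mass, ~1.0
print(round(integrate(lambda x: x * f(m, r, x)), 6))  # mean, ~ (m-r+1)/(m+1) = 2/3
```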

17. [Final 2011] Consider the following charity auction. There are two bidders, namely
1 and 2. Each bidder i has a distinct favored charity. Simultaneously, each bidder
i contributes bi to the auction. The highest bidder wins, and the sum b1 + b2 goes
to the favored charity of the winner. The winner is determined by a coin toss in
case of a tie. The payoff of bidder i is

ui(b1, b2, θi) = θi(b1 + b2) − bi^γ if i wins, and −bi^γ otherwise,

where γ > 1 is a known parameter, θi is privately known by player i, and θ1 and θ2
are independently and uniformly distributed on [0, 1]. Find a differential equation
that must be satisfied by strategies in a symmetric Bayesian Nash equilibrium.
(Assume that the equilibrium strategies are increasing and differentiable.)

18. [Homework 5, 2011] Consider an n-player game in which each player i selects a
search level si ∈ [0, 1] (simultaneously), receiving the payoff

ui(s1, . . . , sn, θ1, . . . , θn) = θi s1 · · · sn − si^γ / γ,

where (θ1, . . . , θn) are independently and identically distributed on [0, ∞). Here,
γ > 1 is commonly known and θi is privately known by player i. (Denote the expected
value of θi by θ̄ and the expected value of θi^α by θ̄α for any α > 0.)

(a) For γ = 2, find the symmetric linear Bayesian Nash equilibria.

(b) For n = γ = 2, find the symmetric Bayesian Nash equilibria.


310CHAPTER 15. STATIC APPLICATIONS WITH INCOMPLETE INFORMATION

19. [Homework 5, 2011] Consider an n-player first price auction in which the value
of the object auctioned is vi for player i, where (v1 , . . . , vn ) are independently
and identically distributed with CDF F where F(v) = v^α for some α > 0. The
value of vi is privately known by player i. Compute a symmetric Bayesian Nash
equilibrium.

20. [Homework 5, 2011] Consider an auction with two buyers where the value of the
object auctioned is vi for player i, where (v1 , v2 ) are independently and identically
distributed with uniform distribution on [0, 1]. The value of vi is privately known
by player i. In the auction, the buyers simultaneously bid b1 and b2 and the highest
bidder wins the object and pays the average bid (b1 + b2 ) /2 as the price. The ties
are broken with a coin toss. Compute a symmetric Bayesian Nash equilibrium.
Chapter 16

Dynamic Games with Incomplete Information

This chapter is devoted to the basic concepts in dynamic games with incomplete
information. As in the case of complete information, Bayesian Nash equilibrium allows
players to take suboptimal actions at information sets that are not reached in equilibrium.
This problem is addressed by sequential equilibrium, which explicitly requires that
the players play a best reply at every information set (sequential rationality) and that
the players' beliefs are "consistent" with the other players' strategies. Here, I will define
sequential equilibrium and apply it to some important games.

Remark 16.1 Sequential equilibrium is closely related to another solution concept,
called perfect Bayesian Nash equilibrium. Sequential equilibrium is a better-defined
solution concept and easier to understand. The two solution concepts are equivalent
in the games considered here. Hence, you should apply sequential equilibrium
in past exam questions regarding perfect Bayesian Nash equilibrium.

16.1 Sequential Equilibrium


Consider the game in Figure 16.1. This game is meant to describe a situation in which a
firm does not know whether a worker is hard working, in the sense of preferring to work
rather than shirk, or lazy, in the sense of wanting to shirk. The worker is likely to be hard
working. However, there is a Bayesian Nash equilibrium, plotted in bold lines, in which


[Figure 16.1 game tree: Nature chooses High (prob. .7) or Low (prob. .3); the firm,
not observing Nature's choice, decides Hire or Do not hire; if hired, the worker W
chooses Work or Shirk. Payoffs (Firm, W): High — Work (1, 2), Shirk (0, 1);
Low — Work (1, 1), Shirk (−1, 2); Do not hire gives (0, 0) in both branches.
In the equilibrium in bold, the firm does not hire and both types of worker shirk.]

Figure 16.1: A Bayesian Nash equilibrium in which player W plays a suboptimal action.

the worker would shirk if he were hired, independent of whether he is hard working or lazy,
and, anticipating this, the firm does not hire. Clearly, the hard-working worker's shirking
is against his preferences (which were meant to model a worker who would rather work).
This is nevertheless consistent with Bayesian Nash equilibrium because every strategy of
the worker is a best reply to the "do not hire" strategy of the firm. (The worker gets 0 no
matter what strategy he plays.) In order to solve this problem, assume that players are
sequentially rational, i.e., they play a best reply at every information set, maximizing
their expected payoff conditional on having reached that information set. That is, when
he is to move, the hard-working worker would know that Nature has chosen "High" and
the firm has chosen "Hire", and he must play Work as the only best reply under that
knowledge. This would lead to the other equilibrium, in which the firm hires, and the
worker works if he is hard working and shirks otherwise.
Notice that the latter equilibrium is the only subgame-perfect equilibrium of that
game. Since subgame perfection was introduced as a remedy to the problem exhibited
in the former equilibrium, it is tempting to think that subgame perfection solves the
problem here as well. As we have seen in earlier lectures, it does not. For example, consider
the strategy profile in bold in Figure 16.2. This is a subgame-perfect equilibrium because
there is no proper subgame, and it is clearly a Nash equilibrium. Strategy L is a best reply
only to X. However, at the information set where Player 2 moves, she knows that Player 1
has played either T or B. Given this knowledge, L could not be a best reply.
In order to formalize the idea of sequential rationality for general games, we need to

[Figure game tree: Player 1 chooses X, ending the game with payoffs (2, 6), or T or B;
Player 2, not observing whether T or B was chosen, picks L or R. Payoffs
(Player 1, Player 2): (T, L) = (0, 1), (T, R) = (3, 2), (B, L) = (−1, 3),
(B, R) = (1, 5). The profile in bold is (X, L).]

Figure 16.2: A SPE in which player 2 plays a sequentially irrational strategy.

define beliefs:

Definition 16.1 A belief assessment is a list b of probability distributions on
information sets; for each information set I, b gives a probability distribution b(·|I) on I.

For any information set I, the player who moves at I believes that he is at node
n ∈ I with probability b (n|I). For example, for the game in Figure 16.2, in order to
define a belief assessment, we need to assign a probability μ on the node after T and
a probability 1 − μ on the node after B. (In information sets with single nodes, the
probability distribution is trivial, putting 1 on the sole node.) When Player 2 moves,
she believes that Player 1 played T with probability μ and B with probability 1 − μ.
We are now ready to define sequential rationality for a strategy profile:

Definition 16.2 For a given pair (s, b) of strategy profile s and belief assessment b,
strategy profile s is said to be sequentially rational iff, at each information set I, the
player who is to move at I maximizes his expected utility

1. given his beliefs b(·|I) at the information set (which imply that he is at information
set I), and

2. given that the players will play according to s in the continuation game.

For example, in Figure 16.2, for Player 2, given any belief μ, L yields

U2 (L; μ) = 1 · μ + 3 · (1 − μ)

[Figure game tree: Player 1 chooses T or B; Player 2, with beliefs .1 on the node
after T and .9 on the node after B, picks L or R. Payoffs (Player 1, Player 2):
(T, L) = (0, 10), (T, R) = (3, 2), (B, L) = (−1, 3), (B, R) = (1, 5).]

Figure 16.3: An inconsistent belief assessment

while R yields
U2 (R; μ) = 2 · μ + 5 · (1 − μ) .

Hence, sequential rationality requires that Player 2 plays R. Given Player 2 plays R,
the only best reply for Player 1 is T . Therefore, for any belief assessment b, the only
sequentially rational strategy profile is (T, R).
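Since both expected payoffs are linear in the belief μ, it is easy to confirm numerically that R beats L for every belief (a quick sketch; the payoff numbers are read off Figure 16.2, and the function name is ours):

```python
# Player 2's expected payoff in Figure 16.2 as a function of the belief mu
# that she is at the node after T.
def u2(action, mu):
    payoffs = {"L": (1, 3), "R": (2, 5)}  # (payoff at node after T, at node after B)
    after_T, after_B = payoffs[action]
    return after_T * mu + after_B * (1 - mu)

# R yields a strictly higher expected payoff than L for every belief mu in [0, 1]
for mu in [i / 100 for i in range(101)]:
    assert u2("R", mu) > u2("L", mu)
print(u2("L", 0.5), u2("R", 0.5))  # 2.0 3.5
```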
In order to have an equilibrium, b must also be consistent with s. Roughly speaking,
consistency requires that players know which (possibly mixed) strategies are played by
the other players. For a motivation, consider Figure 16.3 and call the node on the left
nT and the node on the right nB. Given the beliefs b(nT|I2) = 0.1 and b(nB|I2) = 0.9,
strategy profile (T, R) is sequentially rational. Strategy T is a best response to R. To
check the sequential rationality of R, it suffices to note that, given the beliefs, L yields

(.1)(10) + (.9)(3) = 3.7

while R yields

(.1)(2) + (.9)(5) = 4.7.

(Note that there is no continuation game.) But (T, R) is not even a Nash equilibrium
in this game. This is because in a Nash equilibrium each player knows the other players'
strategies. Player 2 would know that Player 1 plays T, and hence she would assign
probability 1 to nT. In contrast, according to b, she assigns only probability 0.1 to nT.
In order to define consistency formally, we need to think more carefully about the
information sets that are reached with positive probability (the information sets that
are "on the path") and the ones that are not supposed to be reached ("off the path")
according to the strategy profile.

Definition 16.3 Given any (possibly mixed) strategy profile s, belief assessment b, and
any information set I that is reached with positive probability according to s, the beliefs
b(·|I) at I are said to be consistent with s iff b(·|I) is derived from s using the Bayes
rule. That is, for each node n in I,

b(n|I) = Pr(n|s) / Σ_{n′∈I} Pr(n′|s),

where Pr(n|s) is the probability that we reach node n according to s.

For example, in order for a belief assessment b to be consistent with (T, R), we need

μ = b(nT|I) = Pr(nT|(T, R)) / [Pr(nT|(T, R)) + Pr(nB|(T, R))] = 1 / (1 + 0) = 1.
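The Bayes-rule computation is mechanical; a minimal helper makes the on-path case explicit (a sketch; the function name is ours, not from the text):

```python
def consistent_beliefs(reach_probs):
    """Bayes rule: b(n|I) = Pr(n|s) / sum of Pr(n'|s) over nodes n' in I."""
    total = sum(reach_probs.values())
    if total == 0:
        # off-path information set: Bayes rule does not apply directly
        # (trembles are needed, as discussed below)
        raise ValueError("information set is off the path")
    return {node: p / total for node, p in reach_probs.items()}

# Under (T, R) in Figure 16.3, the node after T is reached with probability 1:
print(consistent_beliefs({"nT": 1.0, "nB": 0.0}))  # {'nT': 1.0, 'nB': 0.0}
```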
In general, there can be information sets that are not supposed to be reached according
to the strategy profile. In that case the sum Σ_{n′∈I} Pr(n′|s) in the denominator
would be zero, and we cannot apply the Bayes rule (directly). For such information
sets, we perturb the strategy profile slightly, by assuming that players may "tremble",
and apply the Bayes rule using the perturbed strategy profile. To see the general idea,
consider the game in Figure 16.4. The information set of player 3 is off the path of the
strategy profile (X, T, L). Hence, we cannot apply the Bayes rule. But we can still see
that the beliefs in the figure are inconsistent. Let us perturb the strategies of players 1
and 2, assuming that players 1 and 2 tremble with probabilities ε1 and ε2, respectively,
where ε1 and ε2 are small but positive numbers. That is, we put probability ε1 on E
and 1 − ε1 on X (instead of 0 and 1, respectively) and 1 − ε2 on T and ε2 on B (instead
of 1 and 0, respectively). Under the perturbed strategies,

Pr(nT|I3, ε1, ε2) = ε1(1 − ε2) / [ε1(1 − ε2) + ε1 ε2] = 1 − ε2,

where nT is the node that follows T. As ε2 → 0, Pr(nT|I3, ε1, ε2) → 1. Therefore, for
consistency, we need b(nT|I3) = 1.
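The limit can also be seen numerically; a sketch of the tremble computation for Figure 16.4 (the function name is ours, and note that ε1 cancels out):

```python
def belief_nT(eps1, eps2):
    # Perturbation of (X, T, L): Player 1 plays E with prob eps1,
    # Player 2 plays B with prob eps2.
    reach_nT = eps1 * (1 - eps2)   # reach the node after T
    reach_nB = eps1 * eps2         # reach the node after B
    return reach_nT / (reach_nT + reach_nB)  # equals 1 - eps2: eps1 cancels

for eps in (0.1, 0.01, 0.001):
    print(belief_nT(eps, eps))  # approaches 1 as the trembles vanish
```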

Definition 16.4 Given any (s, b), belief assessment b is consistent with s iff there exist
some trembling probabilities that go to zero such that the conditional probabilities derived

[Figure game tree: Player 1 chooses E or X, where X ends the game with payoff 2 for
Player 1; after E, Player 2 chooses T or B; Player 3, at an information set with the
beliefs shown (0.1 on the node after T, 0.9 on the node after B), chooses L or R.
Payoffs (Player 1, Player 2, Player 3): (T, L) = (1, 2, 1), (T, R) = (3, 3, 3),
(B, L) = (0, 1, 2), (B, R) = (0, 1, 1).]

Figure 16.4: A belief assessment that is inconsistent off the path

by Bayes rule with trembles converge to probabilities given by b on all information sets
(on and off the path of s). That is, there exists a sequence (σ^m, b^m) of assessments
such that

1. (σ^m, b^m) → (σ, b),

2. σ^m is "completely mixed" for every m, and

3. b^m is derived from σ^m using the Bayes rule.

Here, note that σ^m and σ prescribe probability distributions σ^m_i(·|I) and σ_i(·|I) on
the available moves at every information set I of every player i. Likewise, b^m and b
prescribe probability distributions b^m(·|I) and b(·|I) on every information set I. The
first condition states that lim_{m→∞} σ^m_i(a|I) = σ_i(a|I) and lim_{m→∞} b^m(n|I) = b(n|I)
for every i, every I, all nodes n ∈ I, and all available moves a at I. The second condition
requires that σ^m_i(a|I) > 0 everywhere (i.e. every available move is played with positive
probability). Under any such strategy profile, every information set is reached with
positive probability, and hence one can apply the Bayes rule to obtain the beliefs.
Sequential equilibrium is defined as an assessment that is sequentially rational and
consistent:

Definition 16.5 A pair (s, b) of a strategy profile s and a belief assessment b is said to
be a sequential equilibrium if (s, b) is sequentially rational and b is consistent with s.

Note that a sequential equilibrium is a pair, not just a strategy profile. Hence, in
order to identify a sequential equilibrium, one must identify a strategy profile s, which
describes what a player does at every information set, and a belief assessment b, which
describes what a player believes at every information set. In order to check that
(s, b) is a sequential equilibrium, one must check that

1. (Sequential Rationality) s is a best response to belief b (·|I) and the belief that
the other players will follow s in the continuation games in every information set
I, and

2. (Consistency) there exist trembling probabilities that go to zero such that the
conditional probabilities derived from Bayes rule under the trembles approach
b (·|I) at every information set I.

Example 16.1 In the game in Figure 16.4, the unique subgame-perfect equilibrium is
s∗ = (E, T, R). Let us check that (s∗ , b∗ ) where b∗ (nT |I3 ) = 1 is a sequential equilibrium.
We need to check that

1. s∗ is sequentially rational (at all information sets) under b∗ , and

2. b∗ is consistent with s∗ .

At the information set of player 3, given b∗ (nT |I3 ) = 1, action L yields 1 while
R yields 3, and hence R is sequentially rational. At the information set of Player 2,
given the other strategies, T and B yield 3 and 1, respectively, and hence playing T
is sequentially rational. At the information set of Player 1, E and X yield 3 and 2,
respectively, and hence playing E is again sequentially rational.
Since all the information sets are reached under s∗ , we just need to use the Bayes
rule in order to check consistency:

Pr(nT|I3, s∗) = 1 / (1 + 0) = 1 = b∗(nT|I3).

[Figure: the Beer & Quiche signaling game. Nature draws the weak type tw with
probability .1 and the strong type ts with probability .9; Player 1 chooses beer or
quiche; Player 2, observing only the breakfast, chooses duel or don't. Payoffs
(Player 1, Player 2): for tw — (beer, duel) = (0, 1), (beer, don't) = (2, 0),
(quiche, duel) = (1, 1), (quiche, don't) = (3, 0); for ts — (beer, duel) = (1, 0),
(beer, don't) = (3, 1), (quiche, duel) = (0, 0), (quiche, don't) = (2, 1).]

Figure 16.5: Beer & Quiche game

16.2 Sequential equilibrium in Beer and Quiche Game

Consider the game in Figure 16.5. Here Player 1 has two types: strong (ts ) and weak
(tw ). The strong type likes beer for breakfast, while the weak type likes quiche. Player
1 is ordering his breakfast, while Player 2, who is a bully, is watching and contemplating
whether to pick a fight with Player 1. Player 2 would like to pick a fight if Player 1
is weak but not to fight if he is strong. His payoffs are such that if he assigns probability
more than 1/2 to weak, he prefers a fight, and if he assigns probability more than 1/2
to strong, then he prefers not to fight. Player 1 would like to avoid a fight: he gets 1
utile from his preferred breakfast and 2 utiles from avoiding the fight. Before observing
the breakfast, Player 2 finds it more likely that Player 1 is strong.
One sequential equilibrium, denoted by (s∗ , b∗ ), is depicted in Figure 16.6. Both
types of Player 1 order beer. If Player 2 sees beer, he assigns probability 0.9 to strong
and does not fight; if he sees quiche, he assigns probability 1 to weak and fights. Let
us check that this is indeed a sequential equilibrium.
We start with sequential rationality. Playing Beer is clearly sequentially rational for
the strong type because it leads to the highest payoff for ts . For tw , beer yields 2 (beer,
don’t) while quiche yields only 1 (quiche, duel). Hence beer is sequentially rational for
tw , too. After observing beer, the expected payoff of Player 2 from "duel" is

(.9) (0) + (.1) (1) = .1



[Figure: the Beer & Quiche game with the pooling equilibrium in bold. Both types
order beer; after beer, Player 2 assigns probability .9 to ts and .1 to tw and plays
don't; after quiche, he assigns probability 1 to tw and duels.]

Figure 16.6: A PBE in Beer and Quiche game

while his payoff from "don’t" is

(.9) (1) + (.1) (0) = .9,

and hence "don’t" is indeed sequentially rational. After observing quiche, the expected
payoff of Player 2 from duel is 1 (which is (1) (1) + (0) (0)) while his expected payoff
from "don’t" is 0. Hence, duel is sequentially rational at this information set.
To check consistency, we start with the information set after beer. This information set
is on the path, and hence we use the Bayes rule. Clearly,

Pr(ts|beer, s∗) = Pr(ts) Pr(beer|ts, s∗) / [Pr(ts) Pr(beer|ts, s∗) + Pr(tw) Pr(beer|tw, s∗)]
= (.9)(1) / [(.9)(1) + (.1)(1)] = .9 = b∗(ts|beer),

showing that the beliefs are consistent after observing beer. Now consider the informa­
tion set after quiche. This information set is off the path, and we cannot apply the Bayes
rule directly. In order to check consistency at this information set, we need to find some
trembling probabilities that would lead to probability 1 on weak in the limit. (Notice
that we don’t need all the trembles to lead to this probability in the limit. There could

[Figure: the other pooling equilibrium of the Beer & Quiche game. Both types order
quiche; after quiche, Player 2 assigns probability .9 to ts and .1 to tw and plays
don't; after beer, he assigns probability 1 to tw and duels.]

be some other trembles that would lead to a different limit.) Suppose that the weak type
trembles with probability ε while the strong type trembles with probability zero. Then,

Pr(tw|quiche, ε) = (.1)ε / [(.1)ε + (.9)(0)] = 1.

As ε → 0, clearly, Pr(tw|quiche, ε) → 1 = b∗(tw|quiche), showing that b∗ is consistent
with s∗.1
The above equilibrium is intuitive. Since the weak type likes quiche, Player 2 takes
ordering quiche as a sign of weakness and fights. Anticipating this, neither type orders
quiche. There is also another sequential equilibrium in which both types order quiche,
as depicted in the figure above.

Exercise 16.1 Check that the strategy profile and the belief assessment in the figure
above form a sequential equilibrium.

Exercise 16.2 Find all sequential equilibria in Beer and Quiche game. (Hint: Note
that there may be two different equilibria in which the strategy profiles are same but the
beliefs are different.)

The Beer and Quiche game is representative of an important class of games, called
signaling games. In these games, Player 1 has several types, i.e. he knows something
1 We could also allow both types to tremble. For example, we can take tremble probability ε for the
weak type and ε² for the strong type. The conditional probability would be

(.1)ε / [(.1)ε + (.9)ε²] = .1 / (.1 + .9ε) → 1.

[Figure: the Beer & Quiche game of Figure 16.5, modified so that the weak type
strongly dislikes beer: for tw, (beer, duel) now gives (−2, 1) and (beer, don't)
gives (0, 0); all other payoffs are as in Figure 16.5.]

Figure 16.7: A revised version of Beer and Quiche game

relevant. He takes an action (called a message). Player 2 observes Player 1's action
but not his type, and then takes an action himself. Players' payoffs depend on both
players' actions and on Player 1's type.

Definition 16.6 In a signaling game, a pooling equilibrium is a sequential equilibrium
in which all types of Player 1 play the same action.

Both of the equilibria in the Beer and Quiche game are pooling equilibria. In a pooling
equilibrium, Player 2 does not learn anything from Player 1's action on the path of
equilibrium (i.e. his beliefs at the information set on the path are just his prior beliefs).
In some signaling games, different types may take different actions, and Player 2 may
learn Player 1’s information from his actions:

Definition 16.7 In a signaling game, a separating equilibrium is a sequential
equilibrium in which every type of Player 1 plays a different action.

Notice that if a type t∗ plays action a∗ in a separating equilibrium, then by consistency
Player 2 assigns probability 1 to t∗ when he observes a∗. Therefore, after Player 1 takes
his action, Player 2 learns his type (putting probability 1 on the correct type).

Example 16.2 Consider the game in Figure 16.7, where the weak type really dislikes beer.
In this game there is a unique sequential equilibrium, depicted in Figure 16.8. Since
the weak type plays quiche and the strong type plays beer, it is a separating equilibrium.
Notice that Player 2 assigns probability 1 to ts after beer and to tw after quiche.
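A quick check of the separating logic under Player 2's strategy (don't after beer, duel after quiche), with Player 1's payoffs as read from the revised game of Figure 16.7 (variable names are ours):

```python
# Player 1's payoffs given Player 2's equilibrium responses
u_ts = {"beer": 3, "quiche": 0}  # strong: beer -> don't = 3, quiche -> duel = 0
u_tw = {"beer": 0, "quiche": 1}  # weak: beer -> don't = 0 (he dislikes beer), quiche -> duel = 1

assert max(u_ts, key=u_ts.get) == "beer"    # ts separates to beer
assert max(u_tw, key=u_tw.get) == "quiche"  # tw separates to quiche
```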

[Figure: the separating equilibrium of the revised game in bold. The strong type ts
orders beer and the weak type tw orders quiche; Player 2 assigns probability 1 to ts
after beer and plays don't, and probability 1 to tw after quiche and duels.]

Figure 16.8: A separating equilibrium

Exercise 16.3 Check that the strategy profile and the belief assessment form a sequential
equilibrium. Show also that this is the only sequential equilibrium.

Sequential equilibrium in mixed strategies In some games the only sequential
equilibrium is in mixed strategies. For example, in the original Beer and Quiche game
(of Figure 16.5), take the probability of the weak type tw to be 0.8 instead of 0.1, so
that, before the bully observes what Player 1 has for breakfast, he finds Player 1
more likely to be weak. In that case neither of the pooling equilibria survives as a
sequential equilibrium. For example, in the one in which both types play beer, Player
2 must assign probability 0.8 to the weak type after observing beer, and he must fight by
sequential rationality. In that case, tw must play quiche as a best reply. One can also
check that there is no separating equilibrium. For example, if the strong type has beer
and the weak type has quiche, then Player 2 would learn the player's type after the choice
of breakfast and would fight only after quiche. In that case, the weak type would like to
deviate. Therefore in a sequential equilibrium, at least one of the types must be playing
a mixed strategy.
In order to find the equilibrium, let us write pB and pQ for the probabilities of "don't"
(i.e. "don't duel") after beer and quiche, respectively. Write UB(t) and UQ(t) for the
expected payoffs from beer and quiche for type t, respectively. Then,2

UB(ts) − UQ(ts) = 1 + 2(pB − pQ)

and

UB(tw) − UQ(tw) = −1 + 2(pB − pQ).

Hence,

UB(ts) − UQ(ts) = 2 + UB(tw) − UQ(tw) > UB(tw) − UQ(tw).   (16.1)

Now, if tw plays beer with positive probability, then for sequential rationality we must
have UB(tw) ≥ UQ(tw). Then (16.1) implies that UB(ts) > UQ(ts). In that case,
sequential rationality requires that ts play beer with probability 1. Similarly, one
can conclude that if ts plays quiche with positive probability, then tw must play quiche
with probability 1. Therefore, in a sequential equilibrium, either (i) ts plays beer and tw
mixes, or (ii) ts mixes and tw plays quiche.

Case (ii) cannot happen in equilibrium. After beer, Player 2 would have to assign
probability 1 to ts and not fight, i.e. pB = 1. Moreover, after quiche, he would have to assign

Pr(tw|quiche) = 0.8 / [0.8 + 0.2 Pr(quiche|ts)] ≥ 0.8

to the weak type and must fight, i.e. pQ = 0. In that case, UB(ts) = 3 and UQ(ts) = 0,
so the strong type would have to play beer with probability 1 (not mixing), a contradiction.
Therefore, in equilibrium, ts plays beer and tw mixes. By consistency, we must have

Pr(tw|quiche) = Pr(quiche|tw)(0.8) / [Pr(quiche|tw)(0.8) + 0 · 0.2] = 1.

By sequential rationality, Player 2 must fight after quiche:

pQ = 0.

Since tw mixes, it must be that 0 = UB(tw) − UQ(tw) = −1 + 2(pB − pQ). Therefore,

pB = 1/2.

That is, after observing beer, Player 2 strictly mixes between "duel" and "don't". For
sequential rationality, he must then be indifferent between them. This happens only
2
Notice that UB (ts ) = 1 + 2pB and UQ (ts ) = 2pQ .

[Figure: the mixed-strategy sequential equilibrium when Pr(tw) = .8. The strong type
orders beer; the weak type orders beer with probability 1/4 and quiche with probability
3/4; after beer, Player 2 assigns probability 1/2 to tw and plays don't with probability
1/2; after quiche, he assigns probability 1 to tw and duels.]

when3

Pr(tw|beer) = 1/2.

Finally, for consistency after beer, we must have

1/2 = Pr(tw|beer) = Pr(beer|tw)(0.8) / [Pr(beer|tw)(0.8) + 1 · 0.2].

By solving for Pr(beer|tw), we obtain

Pr(beer|tw) = 1/4.

We have identified a strategy profile and belief assessment, depicted in the figure above.
From our derivation, one can check that this is indeed a sequential equilibrium.
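The derived numbers can be verified directly (a sketch of the indifference and consistency checks; variable names are ours):

```python
pB, pQ = 0.5, 0.0   # probability of "don't" after beer and after quiche
q = 0.25            # Pr(beer | tw)

# Weak type indifferent: U_B(tw) = 2*pB equals U_Q(tw) = 1 + 2*pQ
assert 2 * pB == 1 + 2 * pQ
# Strong type strictly prefers beer: U_B(ts) = 1 + 2*pB > U_Q(ts) = 2*pQ
assert 1 + 2 * pB > 2 * pQ
# Consistency after beer: Pr(tw|beer) = .8q / (.8q + .2) = 1/2,
# which makes Player 2 indifferent between duel and don't
mu_w = 0.8 * q / (0.8 * q + 0.2)
assert abs(mu_w - 0.5) < 1e-12
```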

Exercise 16.4 Check that the strategy profile and the belief assessment in the figure
above form a sequential equilibrium.

16.3 A Simple example of Reputation Formation


In a complete information game, it is assumed that the players know exactly what other
players’ payoffs are. In real life this assumption almost never holds. What would happen
3
Payoff from duel is Pr (tw |beer) while the payoff from "don’t" is Pr (ts |beer).

[Figure: a three-move game with an unknown type of Player 1. With probability .9,
Player 1 is "normal": at his first node he goes down for (4, 4) or across to Player 2;
Player 2 goes down for (5, 2) or across to Player 1's last node, where down gives (3, 3)
and across gives (1, −5). With probability .1, Player 1 is "crazy": the structure is the
same, but down at the three nodes gives (−1, 4), (0, 2), and (−1, 3), respectively, and
across at the end gives (0, −5). Player 2 does not observe Player 1's type; her two
decision nodes form one information set.]

in equilibrium if a player has a small amount of doubt about the other player's payoffs?
It turns out that in dynamic games such small changes may have profound effects on
the equilibrium behavior. The next example illustrates this fact. (It also illustrates how
one computes a mixed-strategy sequential equilibrium.)

Consider the game in the figure above. In this game, Player 2 does not know the payoffs
of Player 1. She thinks at the beginning that his payoffs are as in the upper branch with
high probability 0.9, but she also assigns the small probability 0.1 to the possibility
that he is averse to playing down and exiting the game. Call the first type of Player 1 the
"normal" type and the second type of Player 1 the "crazy" type. If it were common knowledge
that Player 1 is "normal", then backward induction would yield the following: Player 1 goes
down at the last decision node; Player 2 goes across; and Player 1 goes down at the first
node.
What happens in the incomplete-information game above, in which this common
knowledge assumption is relaxed? By sequential rationality, the "crazy" type
(in the lower branch) will always go across. At the last decision node, the normal type
again goes down. Can it be the case that the normal type goes down at his first decision
node, as in the complete-information case? It turns out that the answer is no. If
in a sequential equilibrium the "normal" type goes down at the first decision node, then at
her information set Player 2 must assign probability 1 to the crazy type. (By the Bayes rule,
Pr(crazy|across) = 0.1 / (0.1 + (.9)(0)) = 1. This is required for consistency.) Given
this belief and the actions that are already determined, she gets −5 from going across
and 2 from going down, so she must go down for sequential rationality. But then the
"normal" type should go across as a best reply, which contradicts the assumption that
he goes down.
Similarly, one can also show that there is no sequential equilibrium in which the
normal type goes across with probability 1. If that were the case, then by consistency,
Player 2 would assign 0.9 to normal type in her information set. Her best response would
be to go across for sure, and in that case the normal type would prefer to go down in
the first node.
In any sequential equilibrium, the normal type must mix at his first decision node. Write
α = Pr(across|normal) and β for the probability that Player 2 goes across. Write
also μ for the probability Player 2 assigns to the upper node (the normal type) at her
information set. Since the normal type mixes (i.e. 0 < α < 1), he is indifferent. Across
yields

3β + 5(1 − β)

while down yields 4. For indifference, the equality 3β + 5(1 − β) = 4 must therefore
hold, yielding

β = 1/2.

Since 0 < β < 1, Player 2 must be indifferent between going down, which yields 2 for
sure, and going across, which yields the expected payoff of

3μ + (−5)(1 − μ) = 8μ − 5.

That is, 8μ − 5 = 2, and

μ = 7/8.

But this belief must be consistent:

7/8 = μ = 0.9α / (0.9α + .1).

Therefore,

α = 7/9.

This completes the computation of the unique sequential equilibrium, which is depicted
in the figure below.
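The three equilibrium numbers solve the two indifference conditions and the consistency condition exactly, which is easy to confirm with exact rational arithmetic:

```python
from fractions import Fraction as F

alpha, beta, mu = F(7, 9), F(1, 2), F(7, 8)

# Normal type indifferent at his first node: across = 3*beta + 5*(1-beta), down = 4
assert 3 * beta + 5 * (1 - beta) == 4
# Player 2 indifferent: across = 3*mu - 5*(1-mu) = 8*mu - 5, down = 2
assert 8 * mu - 5 == 2
# Consistency: mu = 0.9*alpha / (0.9*alpha + 0.1)
assert mu == F(9, 10) * alpha / (F(9, 10) * alpha + F(1, 10))
```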

Exercise 16.5 Verify that the pair of mixed strategy profile and the belief assessment is
indeed a sequential equilibrium.

[Figure: the unique sequential equilibrium of the reputation game. The normal type
goes across at his first node with probability α = 7/9 (and down at his last node);
the crazy type always goes across; Player 2 goes across with probability β = 1/2,
holding belief μ = 7/8 on the normal type at her information set.]

Notice that in the sequential equilibrium, after observing that Player 1 goes across, Player
2 increases her probability that Player 1 is a crazy type who will go across, from 0.1
to 0.125. If she assigned probability 0 at the beginning, she would not change her beliefs
after observing that he goes across. In that case, Player 1 could never convince
her that he will go across (no matter how many times he goes across), and he would not
try. When the initial probability is positive (no matter how small), she increases her
probability of him being crazy after she sees him going across, and Player 1 goes
across with some probability even when he is not crazy.

Exercise 16.6 In the above game, compute the sequential equilibrium for any initial
probability π ∈ (0, 1) of the crazy type (in the figure, π = 0.1).

16.4 Bargaining with Incomplete Information


We will now analyze a relatively simple bargaining game with incomplete information.
A seller has an object, whose value for him is 0. There is also a buyer. The value of the
object for the buyer is v, where v is uniformly distributed on [0, 1]. The buyer knows v,
but the seller does not. There are two periods, 0 and 1. At period 0, seller sets a price
p0 and the buyer decides whether to buy the object at price p0 . If he buys, the payoffs
of the seller and the buyer are p0 and v − p0 , respectively. Otherwise, they proceed
to the next period. In period 1, the seller set again a price p1 and the buyer decides
328 CHAPTER 16. DYNAMIC GAMES WITH INCOMPLETE INFORMATION

whether to buy. If he buys, the payoffs of the seller and the buyer are δp1 and δ (v − p1),
respectively, where δ ∈ (0, 1). Otherwise, the game ends with zero payoffs.
Consider a sequential equilibrium with the following cutoff strategies.⁴ For any price
p0 and p1 there are cutoffs a (p0 ) and b (p1 ) such that at period 0, buyer buys if and only
if v ≥ a (p0 ) and at period 1, the buyer buys if and only if v ≥ b (p1 ).
At period 1, given any price p1 , buyer gets δ (v − p1 ) if he buys and 0 otherwise.
Hence, by sequential rationality, he should buy if and only if v ≥ p1. That is, b (p1) = p1.
Now, given any p0 , if the buyer does not buy in period 0, then seller knows, from the
strategy of the buyer, that v ≤ a (p0 ). That is, after the rejection of p0 , the seller
believes that v is uniformly distributed on [0, a (p0 )]. Given that buyer buys iff v ≥ p1 ,
the expected payoff of the seller is

US (p1 |p0 ) = p1 Pr (p1 ≤ v|v ≤ a (p0 )) = p1 (a (p0 ) − p1 ) /a (p0 ) .

For sequential rationality, after the rejection of p0 , the price p1 (p0 ) must maximize
US (p1 |p0 ). Therefore,
p1 (p0 ) = a (p0 ) /2. (16.2)

Now consider period 0. Given any price p0 , the types v ≥ a (p0 ) buy at price p0 at
period 0; the types v ∈ [a (p0 ) /2, a (p0 )) buy at price a (p0 ) /2 at period 1, and the other
types do not buy. For sequential rationality, we must have

v − p0 ≥ δ (v − p1 (p0 )) for v ≥ a (p0 )


v − p0 ≤ δ (v − p1 (p0 )) for v ∈ [a (p0 ) /2, a (p0 )).

By continuity, this implies that we have equality at v = a (p0 ):

a (p0 ) − p0 = δ (a (p0 ) − p1 (p0 )) = δa (p0 ) /2,

where the last equality is by (16.2). Therefore,


a (p0) = p0 / (1 − δ/2) .

All we need to do now is to find the price the seller sets at period 0. For any price p0, he
gets p0 from types with v ≥ a (p0), δp1 (p0) = δa (p0) /2 from types v ∈ [a (p0) /2, a (p0))
⁴ This is actually the only sequential equilibrium.

and zero from the rest. His expected payoff is

US (p0) = p0 · (1 − a (p0)) + δ (a (p0) /2) · (a (p0) − a (p0) /2)
        = p0 (1 − p0 / (1 − δ/2)) + δ (p0 / (2 − δ))² .

The first period price must maximize US (p0 ). By taking the derivative and setting it
equal to zero, we obtain

p0 = (1 − δ/2)² / (2 (1 − 3δ/4)) .
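A numeric spot-check of this derivation (a sketch, not from the text; the discount factor δ = 0.5 is an illustrative choice): grid-searching the seller's period-0 payoff US (p0) recovers the closed-form price.

```python
delta = 0.5  # illustrative discount factor, not from the text

def seller_payoff(p0):
    # US(p0) = p0 (1 − a(p0)) + δ (a(p0)/2)(a(p0) − a(p0)/2), with a(p0) = p0/(1 − δ/2).
    a = p0 / (1 - delta / 2)
    return p0 * (1 - a) + delta * (a / 2) * (a - a / 2)

# Closed-form first-period price from the first-order condition above.
p0_star = (1 - delta / 2) ** 2 / (2 * (1 - 3 * delta / 4))

# Grid search over prices with a(p0) ≤ 1, i.e. p0 ≤ 1 − δ/2.
grid = [i / 10000 for i in range(7501)]
best = max(grid, key=seller_payoff)

assert abs(best - p0_star) < 1e-3
print(p0_star)  # 0.45 for δ = 0.5
```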

16.5 Exercises with Solutions

1. [Final 2007, Early exam] Find a sequential equilibrium of the following game:

[Figure: game tree; Nature chooses among three branches A, B, C with probability 1/3 each; the rest of the tree is not legible in this transcription.]

Answer: The following is the unique sequential equilibrium:

[Figure: the game tree above with the unique sequential equilibrium marked; not fully legible in this transcription.]

2. [Final 2007, Early exam] This question is about a game, called "Deal or No Deal".
The monetary unit is M$, which means million dollars. The players are a Banker
and a Contestant. There are 3 cases: 0,1, and 2. One of the cases contains 1M$
and all the other cases contain zero M$. All cases are equally likely to contain the
1M$ prize (with probability 1/3). Contestant owns Case 0. Banker offers a price
p0 , and Contestant accepts or rejects the offer. If she accepts, then Banker buys
the content of Case 0 for price p0 , ending the game. (Contestant gets p0 M$ and
Banker gets the content of the case, minus p0 M$.) If she rejects the offer, then we
open Case 1, revealing the content to both players. Banker again offers a price p1 ,
and Contestant accepts or rejects the offer. If she accepts, then Banker buys the
content of Case 0 for price p1 ; otherwise we open Case 2, and the game ends with
Contestant owning the content of Case 0 and Banker owning zero. The utility of
owning x M$ is x for the Banker and x^(1/α) for the Contestant, where α > 1.

(a) (10 points) Assuming α is commonly known, apply backward induction to



find a subgame-perfect equilibrium.

Answer: If Case 1 contains 1M$, then in period 1 players know that Case
0 contains 0, and hence Contestant accepts any offer, and Banker offers 0. If
Case 1 contains 0M$, then players know that Case 0 contains 0 with probability
1/2 and 1M$ with probability 1/2. The expected payoff of Contestant
from rejecting an offer p1 is 1/2. Hence, she accepts the offer iff
p1^(1/α) ≥ 1/2, i.e., p1 ≥ 1/2^α .

Therefore, Banker offers


p1 = 1/2^α .

Notice that, since α > 1, the value of the case for the banker is 1/2 > p1 , and
he is happy to make that offer.
Now consider period 0. If the offer p0 is rejected, then with probability 1/3
it will be revealed that Case 1 contains 1M$, and players will get (0,0), and
with probability 2/3 it will be revealed that Case 1 contains 0M$, and Banker
will get payoff of 1/2 − 1/2^α in expectation and Contestant will get payoff
of 1/2 (which is p1^(1/α)). The expected value of these payoffs for Banker and
Contestant are 1/3 − 2/ (3 · 2^α) and 1/3, respectively. Therefore, Contestant
will accept p0 iff
p0^(1/α) ≥ 1/3, i.e., p0 ≥ 1/3^α .

Therefore, Banker will offer


p0 = 1/3^α .

Notice that, since α > 1, 2/ (3 · 2^α) > 1/3^α, and hence Banker would rather
offer p0 and get 1/3 − 1/3^α than make a rejected offer and get
1/3 − 2/ (3 · 2^α) as a result.
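These prices and inequalities can be spot-checked numerically (a sketch, with α = 2 as an illustrative risk parameter):

```python
alpha = 2  # illustrative risk parameter (any α > 1 works)

p1 = 1 / 2 ** alpha  # period-1 offer 1/2^α
p0 = 1 / 3 ** alpha  # period-0 offer 1/3^α

# Contestant is exactly indifferent at each offer: p^(1/α) equals the continuation value.
assert abs(p1 ** (1 / alpha) - 1 / 2) < 1e-9
assert abs(p0 ** (1 / alpha) - 1 / 3) < 1e-9

# Banker prefers the accepted offer p0 to making a rejected one.
payoff_offer = 1 / 3 - p0                       # 1/3 − 1/3^α
payoff_rejected = 1 / 3 - 2 / (3 * 2 ** alpha)  # 1/3 − 2/(3 · 2^α)
assert payoff_offer > payoff_rejected
```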
(b) Now assume that Banker does not know α, i.e., α is private information of
Contestant, and Pr (1/2^α ≤ x) = 2x for any x ≤ 1/2. Consider a strategy of
the Contestant with cutoffs α̂0 (p0 ) and α̂1 (p1 ) such that Contestant accepts
the first price p0 iff α ≥ α̂0 (p0 ) and, in the case the game proceeds to the next
stage, she accepts the second price p1 iff α ≥ α̂1 (p1 ). Find the necessary and
sufficient conditions on α̂0 (p0 ) and α̂1 (p1 ) under which the above strategy is

played by the contestant in a sequential equilibrium. (You need to find two
equations, one containing only α̂0 (p0 ) and p0 and the other containing only
α̂1 (p1 ) and p1 as variables.)
[Hint: Some of the following equalities may be useful: for any x ≥ y,
Pr (1/2^α ≤ x | 1/2^α > y) = 2 (x − y) / (1 − 2y); for any a ≥ 1, Pr (α ≤ a) =
1 − (1/2)^(a−1); and for any a ≥ b ≥ 1, Pr (α ≥ b | α ≤ a) = (1/2^(b−1) − 1/2^(a−1)) / (1 − 1/2^(a−1)).]

Answer: As in part (a), if Case 1 contains 1M$, then Contestant accepts


any offer and Banker offers 0, each getting 0. If Case 1 contains 0M$, then
again Contestant accepts p1 iff

p1 ≥ 1/2^α,

i.e., iff α ≥ α̂1 (p1 ) where

p1 = 1/2^(α̂1 (p1 )) .

(Of course, α̂1 (p1 ) = − log (p1 ) / log (2), but you do not need to obtain this
explicit solution.)
Towards finding the equation for α̂0 , we need to find the price p1 (p0 ) that
will be offered in a sequential equilibrium. Given that p0 is rejected, Banker
knows that α < α̂0 (p0 ), or 1/2^α > 1/2^(α̂0 (p0 )). Write y = 1/2^(α̂0 (p0 )). His
expected utility from offering p1 is

UB (p1 |p0 ) = Pr (1/2^α ≤ p1 | 1/2^α > y) (1/2 − p1 )
            = (2 (p1 − y) / (1 − 2y)) (1/2 − p1 ) ,
which is maximized at

p1 (p0 ) = 1/4 + y/2 = 1/4 + 1/2^(α̂0 (p0 )+1) .

Given p0 , the types α ≥ α̂0 (p0 ) prefer to trade at p0 rather than waiting
for p1 (p0 ) the next period, and the types α ∈ (α̂1 (p1 (p0 )) , α̂0 (p0 )) wait for
p1 (p0 ) (and trade at that price) rather than trading at p0 . As explained in
the class, this implies that the type α̂0 (p0 ) is indifferent between these two
options:
p0^(1/α̂0 (p0 )) = (2/3) (p1 (p0 ))^(1/α̂0 (p0 )) ,

where the left-hand side is the payoff from accepting p0 and the right-hand
side is the expected payoff from rejecting p0 and accepting p1 (p0 ) if Case 1
contains 0. By taking the powers on both sides and substituting the value of
p1 (p0 ), we obtain

p0 = (2/3)^(α̂0 (p0 )) p1 (p0 )
   = (2/3)^(α̂0 (p0 )) (1/4 + 1/2^(α̂0 (p0 )+1)) .

(You can simplify this equation a bit more if you want, but you are not asked
to do so. Also, note that we specified all the actions and beliefs except for the
value of the initial price, which will be the price that maximizes the expected
payoff of the banker given what we described so far.)
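As an illustration (a sketch, not part of the exam solution), the cutoff α̂0 (p0) can be recovered numerically from the equation above by bisection, here at the illustrative price p0 = 0.25; the map a ↦ (2/3)^a (1/4 + 1/2^(a+1)) is decreasing, with values 1/3 at a = 1 and 1/6 at a = 2.

```python
def price_of_cutoff(a):
    # The equation above: p0 = (2/3)^a · (1/4 + 1/2^(a+1)).
    return (2 / 3) ** a * (1 / 4 + 1 / 2 ** (a + 1))

p0 = 0.25  # illustrative first-period price

# Bisection on the decreasing map price_of_cutoff over [1, 2].
lo, hi = 1.0, 2.0
for _ in range(60):
    mid = (lo + hi) / 2
    if price_of_cutoff(mid) > p0:
        lo = mid
    else:
        hi = mid
a_hat = (lo + hi) / 2

assert abs(price_of_cutoff(a_hat) - p0) < 1e-9
p1 = 1 / 4 + 1 / 2 ** (a_hat + 1)  # implied second-period price p1(p0)
```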

3. [Midterm 2, 2004] Consider two pharmaceutical companies, who are competing


to develop a new drug, called Xenodyne. Simultaneously, each firm i invests xi
amount of money in R&D. The firm that invests more develops the drug first; if
they invest equal amounts, then each firm is equally likely to develop the drug first.
(The probability that they develop the drug at the same time is zero.) The firm
that develops the drug first obtains a patent for the drug and becomes a monopolist
in the market for Xenodyne. The other firm ceases to exist, obtaining the payoff of
zero, minus its investment in R&D. The monopolist then produces Q ≥ 0 units of
Xenodyne at marginal cost ci and sells it at price P = max {1 − Q, 0}, obtaining
payoff of (P − ci ) Q, minus its investment in R&D, where Q is chosen by the
monopolist. Here, ci is privately known by firm i, and c1 and c2 are independently
and identically distributed by uniform distribution on [0, 1].

(a) (10) Write this game formally as a static Bayesian game.


ANSWER:

• Type space: T1 = T2 = [0, 1].


• Action space: A1 = A2 = [0, ∞) × [0, ∞)^([0,∞)×[0,∞)) , where an action is a
pair (xi , Qi ), and Qi is a function of x1 and x2 .

• Payoffs:

ui (x1 , Q1 , x2 , Q2 ) = (P (Qi (x1 , x2 )) − ci ) Qi (x1 , x2 ) − xi      if xi > xj ,
                          (P (Qi (x1 , x2 )) − ci ) Qi (x1 , x2 ) /2 − xi   if xi = xj ,
                          −xi                                               otherwise.

• Beliefs: p(cj |ci ) is the uniform distribution on [0, 1] (the types are independent).

(b) (15) Find a symmetric Bayesian Nash equilibrium of the above game in which
each player’s investment is of the form xi = a (1 − ci )³ + b for some parameters
a and b. [If you can, you may want to solve part (c) first.]
ANSWER: See part (c).
(c) (10) Show that the equilibrium in part (b) is the only Bayesian Nash equi­
librium in which both firms act sequentially rationally and in which xi is an
increasing, differentiable function of (1 − ci ) .
ANSWER: By sequential rationality, a monopolist produces

Qi = (1 − ci ) /2

in order to maximize its profit, obtaining payoff of

(1 − ci )² /4,

minus the investment in R&D. Define new variable

θi = 1 − ci

which is also independently and identically distributed with uniform distribution
on [0, 1]. Let x be the strategy played in a symmetric equilibrium, so
that x1 = x (θ1 ) and x2 = x (θ2 ). Now, the expected payoff of firm i is

E [ui ] = Pr (xi > xj ) · θi² / 4 − xi .
This is because with probability Pr (xi > xj ) the firm will become monopolist
and get the monopoly profit θi² /4 and will pay the investment cost xi with
probability 1. Since x is increasing, Pr (xi = xj ) = 0. Now,
Pr (xi > xj ) = Pr (xi > x (θj )) = Pr (θj < x⁻¹ (xi )) = x⁻¹ (xi ) .

Hence,
E [ui ] = (θi² / 4) · x⁻¹ (xi ) − xi .
Therefore, the first-order condition for maximization is

0 = ∂E [ui ] /∂xi = (θi² / 4) · (1 / x' (θi )) − 1,
showing that
x' (θi ) = θi² / 4,
and therefore

x (θi ) = θi³ / 12 + const,
where the const = 0, so that x (0) = 0.
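So the equilibrium investment is x (θi) = θi³/12, i.e., a = 1/12 and b = 0 in the form given in part (b). A numeric best-response check (a sketch; θ = 0.8 is an illustrative type, and the rival is assumed to follow x):

```python
def x(theta):
    # Candidate symmetric equilibrium strategy derived above.
    return theta ** 3 / 12

def win_prob(xi):
    # Pr(x_i > x(θ_j)) = x⁻¹(x_i) for θ_j uniform on [0, 1], capped at 1.
    return min((12 * xi) ** (1 / 3), 1.0)

theta = 0.8  # illustrative type

def expected_payoff(xi):
    # E[u_i] = Pr(win) · θ²/4 − x_i.
    return win_prob(xi) * theta ** 2 / 4 - xi

grid = [i / 100000 for i in range(1, 10000)]
best = max(grid, key=expected_payoff)

# The grid optimum agrees with x(θ) = θ³/12 up to the grid resolution.
assert abs(best - x(theta)) < 1e-3
```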

4. [Final 2002] Find a sequential equilibrium in the following game.

[Figure: three-player game tree; Nature's three branches have probabilities .5, .4, and .1.]

Solution: There is a unique sequential equilibrium in this game. Clearly, 1 must


exit at the beginning and 2 has to go in on the right branch as he does not have
any choice. The behavior at the nodes in the bottom layer is given by sequential
rationality as in the figure below. Write α for the probability that 2 goes in in the
center branch, β for the probability that 3 goes right, and μ for the probability
3 assigns to the center branch. In equilibrium, 3 must mix (i.e., β ∈ (0, 1)).
[Because if 3 goes left, then 2 must exit at the center branch, hence 3 must assign
probability 1 to the node at the right (i.e., μ = 0), and hence she should play
right, a contradiction. Similarly, if 3 plays right, then 2 must go in at the center
branch. Given his prior beliefs (.4 and .1), μ = 4/5, hence 3 must play left, a
contradiction again.] For 3 to mix, she must be indifferent, i.e.,

1 = 0μ + 3 (1 − μ) ,

hence
μ = 2/3.

By Bayes’ rule, we must have

μ = .4α / (.4α + .1) = 2/3,
i.e.,
α = 1/2.

That is, Player 2 must mix at the center branch, and hence he must be indifferent,
i.e.,
1 = 2β.

That is,
β = 1/2.

The equilibrium is depicted in the following figure.

[Figure: the equilibrium depicted on the game tree, with α = 1/2, β = 1/2, and belief μ = 2/3 at Player 3's information set.]
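A quick exact-arithmetic check of these values (not part of the original solution):

```python
from fractions import Fraction

alpha, beta, mu = Fraction(1, 2), Fraction(1, 2), Fraction(2, 3)

# Player 3's indifference: 1 = 0·μ + 3(1 − μ).
assert 0 * mu + 3 * (1 - mu) == 1

# Bayes' rule: μ = .4α / (.4α + .1).
p_center, p_right = Fraction(4, 10), Fraction(1, 10)
assert p_center * alpha / (p_center * alpha + p_right) == mu

# Player 2's indifference at the center branch: 1 = 2β.
assert 2 * beta == 1
```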

5. [Final 2002] We have a Judge and a Plaintiff. The Plaintiff has been injured. Severity
of the injury, denoted by v, is the Plaintiff’s private information. The Judge
does not know v and believes that v is uniformly distributed on {0, 1, 2, . . . , 99} (so
that the probability that v = i is 1/100 for any i ∈ {0, 1, . . . , 99}). The Plaintiff
can verifiably reveal v to the Judge without any cost, in which case the Judge will
know v. The order of the events is as follows. First, the Plaintiff decides whether
to reveal v or not. Then, the Judge awards a compensation R. The payoff of the
Plaintiff is R − v, and the payoff of the Judge is − (v − R)2 . Everything described
so far is common knowledge. Find a sequential equilibrium.

Solution: Consider a sequential equilibrium with strategy profile (s∗ , R∗ ), where


s∗ (v) ∈ {v, N R} determines whether the Plaintiff of type v reveals v or does Not
Reveal, and R∗ determines the reward as a function on {N R, 0, 1, . . . , 99}.
Given the Judge’s preferences, if the Plaintiff reveals her type v, the Judge will
choose the reward as
R∗ (v) = v

and
R∗ (N R) = E [v|N R] .

In equilibrium, the Plaintiff gives her best response to R∗ at each v. Hence, she
must reveal her type whenever v > R∗ (N R), and she must not reveal her type
whenever v < R∗ (N R). Suppose that R∗ (N R) > 0. Then, s∗ (0) = N R, and
hence N R is reached with positive probability. Thus,

R∗ (N R) = E [v|s∗ (v) = N R] ≤ E [v|v ≤ R∗ (N R)] ≤ R∗ (N R) /2,

which could be true only when R∗ (N R) = 0, a contradiction. Therefore, we must


have
R∗ (N R) = 0,

and thus
s∗ (v) = v

at each v > 0. There are two equilibria (more or less equivalent).



• s∗ (v) = v for all v; R∗ (v) = v; R∗ (N R) = 0, and the Judge puts probability


1 to v = 0 whenever the Plaintiff does not reveal her type.

• s∗ (0) = N R; s∗ (v) = v for all v > 0; R∗ (v) = v; R∗ (N R) = 0, and the


Judge puts probability 1 to v = 0 whenever the Plaintiff does not reveal her
type.
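The unraveling step above relies on E[v | v ≤ R] ≤ R/2 for the uniform prior on {0, 1, . . . , 99}, which is easy to confirm directly (a sketch, not part of the original solution):

```python
# For each candidate reward R > 0, the mean of the pooled low types is at most R/2,
# so R*(NR) = E[v | pool] <= R*(NR)/2 forces R*(NR) = 0.
for R in range(1, 100):
    pool = [v for v in range(100) if v <= R]
    assert sum(pool) / len(pool) <= R / 2
```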

6. [Final 2001, Make Up] This question is about a game between a possible applicant
(henceforth student) to a Ph.D. program in Economics and the Admission
Committee. Ex-ante, Admission Committee believes that with probability .9 the
student hates economics and with probability .1 he loves economics. After Nature
decides whether student loves or hates economics with the above probabilities and
reveals it to the student, the student decides whether or not to apply to the Ph.D.
program. If the student does not apply, both the student and the committee get
0. If student applies, then the committee is to decide whether to accept or reject
the student. If the committee rejects, then committee gets 0, and student gets -1.
If the committee accepts the student, the payoffs depend on whether the student
loves or hates economics. If the student loves economics, he will be successful and
the payoffs will be 20 for each player. If he hates economics, the payoffs for both
the committee and the student will be -10. Find a separating equilibrium and a
pooling equilibrium of this game.

Solution: A separating equilibrium:



[Figure: separating equilibrium: the Hate type (prob. .9) does not apply, the Love type (prob. .1) applies; upon an application the committee assigns probability 1 to Love and accepts.]

A pooling equilibrium:

[Figure: pooling equilibrium: neither type applies; after an (off-path) application the committee keeps the prior beliefs (.9, .1) and rejects.]

7. [Final 2001] We have an employer and a worker, who will work as a salesman.
The worker may be a good salesman or a bad one. In expectation, if he is a good
salesman, he will make $200,000 worth of sales, and if he is bad, he will make only
$100,000. The employer gets 10% of the sales as profit. The employer offers a wage

w. Then, the worker accepts or rejects the offer. If he accepts, he will be hired at
wage w. If he rejects the offer, he will not be hired. In that case, the employer will
get 0, the worker will get his outside option, which will pay $15,000 if he is good,
$8,000 if he is bad. Assume that all players are risk-neutral.

(a) Assume that the worker’s type is common knowledge, and compute the
subgame-perfect equilibrium.

Solution: A worker will accept a wage iff it is at least as high as his outside
option, and the employer will offer the outside option, as he still makes a
profit. That is, 15,000 for the good worker and 8,000 for the bad.

(b) Assume that the worker knows his type, but the employer does not. Employer
believes that the worker is good with probability 1/4. Find the sequential
equilibrium.

Solution: Again, a worker will accept an offer iff his wage is at least as high as
his outside option. Hence, if w ≥ 15, 000 the offer will be accepted by both
types, yielding

U (w) = (1/4) (.1) 200, 000 + (3/4) (.1) 100, 000 − w = 12, 500 − w < 0

as the profit for the employer. If 8, 000 ≤ w < 15, 000, then only the bad
worker will accept the offer, yielding

U (w) = (3/4) [(.1) 100, 000 − w] = (3/4) [10, 000 − w]

as profit. If w < 8, 000, no worker will accept the offer, and the employer will get
0. Therefore, the employer will offer w = 8, 000, hiring the bad worker at
his outside option.

(c) Under the information structure in part (b), now consider the case that the
employer offers a share s in the sales rather than the fixed wage w. Compute
the sequential equilibrium.

Solution: Again a worker will accept the share s iff his income is at least as
high as his outside option. That is, a bad worker will accept s iff

100, 000s ≥ 8, 000



i.e.,
s ≥ sB = 8, 000 / 100, 000 = 8%.
A good worker will accept s iff
s ≥ sG = 15, 000 / 200, 000 = 7.5%.
In that case, if s < sG no one will accept the offer, and the employer will get
0; if sG ≤ s < sB , the good worker will accept the offer and the employer will
get
(1/4) (10% − s) 200, 000 = 50, 000 (10% − s) ,

and if s ≥ sB , each type will accept the offer and the employer will get

(10% − s) [(1/4) 200, 000 + (3/4) 100, 000] = 125, 000 (10% − s) .

Since 125, 000 (10% − sB ) = 2% · 125, 000 = 2, 500 is larger than 50, 000 (10% − sG ) =
2.5% · 50, 000 = 1, 250, he will offer s = sB , hiring both types.
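The comparison can be spelled out in exact arithmetic (a sketch of the computation above):

```python
from fractions import Fraction

s_B = Fraction(8_000, 100_000)   # 8%: bad worker's acceptance threshold
s_G = Fraction(15_000, 200_000)  # 7.5%: good worker's acceptance threshold
margin = Fraction(1, 10)         # employer keeps 10% of sales

profit_both = 125_000 * (margin - s_B)  # offer s_B: both types accept
profit_good = 50_000 * (margin - s_G)   # offer s_G: only the good type accepts

assert profit_both == 2_500
assert profit_good == 1_250
assert profit_both > profit_good  # so the employer offers s = s_B
```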

8. [Final 2001, Make Up] As in the previous question, we have an employer and a
worker, who will work as a salesman. Now the market might be good or bad. In
expectation, if the market is good, the worker will make $200,000 worth of sales,
and if the market is bad, he will make only $100,000 worth of sales. The employer
gets 10% of the sales as profit. The employer offers a wage w. Then, the worker
accepts or rejects the offer. If he accepts, he will be hired at wage w. If he rejects
the offer, he will not be hired. In that case, the employer will get 0, the worker
will get his outside option, which will pay $12,000. Assume that all players are
risk-neutral.

(a) Assume that whether the market is good or bad is common knowledge, and
compute the subgame-perfect equilibrium.

ANSWER: A worker will accept a wage iff it is at least as high as his outside


option, 12,000. If the market is good, the employer will offer the outside option
w = 12, 000, and make 20, 000 − 12, 000 = 8, 000 profit. If the market is bad,
the return 10,000 is lower than the worker’s outside option, and the worker
will not be hired.

(b) Assume that the employer knows whether the market is good or bad, but the
worker does not. The worker believes that the market is good with probability
1/4. Find the sequential equilibrium.

ANSWER: As in part (a). [We will have a separating equilibrium.]

(c) Under the information structure in part (b), now consider the case that the
employer offers a share s in the sales rather than the fixed wage w. Compute
a sequential equilibrium.

ANSWER: Note that, since the return is 10% independent of whether the
market is good or bad, the employer will make positive profit iff s < 10%.
Hence, except for s = 10%, we must have a pooling equilibrium. Hence, at
any s, the worker’s income is

[(1/4) 200, 000 + (3/4) 100, 000] s = 125, 000s.

This will be at least as high as his outside option iff


s ≥ s∗ = 12, 000 / 125, 000 = 9.6% < 10%.
Hence an equilibrium: the worker will accept an offer s iff s ≥ s∗ , and the
employer will offer s∗ . The worker’s beliefs at any offer s are that the market is
good with probability 1/4. [Note that this is an inefficient equilibrium. When
the market is bad, the gains from trade are less than the outside option.]
There are other inefficient equilibria in which there is no trade (i.e., the worker
is never hired). In any such equilibrium, the worker takes any high offer as a
sign that the market is bad, and does not accept an offer s unless s ≥
12, 000/100, 000 = 12%, while the employer offers less than 12%. When the
market is good, in any such pure-strategy equilibrium, he must in fact be
offering less than s∗ . (Why?) For instance, the employer offers s = 0 independent
of the market, and the worker accepts s iff s > 12%.

9. [Final 2001] A risk-neutral entrepreneur has a project that requires $100,000 as an


investment, and will yield $300,000 with probability 1/2, $0 with probability 1/2.
There are two types of entrepreneurs: rich who has a wealth of $1,000,000, and

poor who has $0. For some reason, the wealthy entrepreneur cannot use his wealth
as an investment towards this project. There is also a bank that can lend money
with interest rate π. That is, if the entrepreneur borrows $100,000 to invest, after
the project is completed he will pay back $100, 000 (1 + π) – if he has that much
money. If his wealth is less than this amount at the end of the project, he will pay
all he has. The order of the events is as follows:

• First, bank posts π.


• Then, entrepreneur decides whether to borrow ($100,000) and invest.
• Then, uncertainty is resolved.

(a) Compute the subgame perfect equilibrium for the case when the wealth is
common knowledge.
ANSWER: The rich entrepreneur is always going to pay back the loan in
full amount, hence his expected payoff from investing (as a change from not
investing) is
(0.5)(300, 000) − 100, 000 (1 + π) .

Hence, he will invest iff this amount is non-negative, i.e.,

π ≤ 1/2.

Thus, the bank will set the interest rate at

π R = 1/2.

The poor entrepreneur is going to pay back the loan only when the project
succeeds. Hence, his expected payoff from investing is

(0.5)(300, 000 − 100, 000 (1 + π)).

He will invest iff this amount is non-negative, i.e.,

π ≤ 2.

Thus, the bank will set the interest rate at

π P = 2.

(b) Now assume that the bank does not know the wealth of the entrepreneur.
The probability that the entrepreneur is rich is 1/4. Compute the sequential
equilibrium.
ANSWER: As in part (a), the rich type will invest iff π ≤ π R = .5, and the
poor type will invest iff π ≤ π P = 2. Now, if π ≤ π R , the bank’s payoff is
U (π) = (1/4) · 100, 000 (1 + π) + (3/4) · [(1/2) · 100, 000 (1 + π) + (1/2) · 0] − 100, 000
      = (5/8) · 100, 000 (1 + π) − 100, 000
      ≤ (5/8) · 100, 000 (1 + π R ) − 100, 000
      = (5/8) · 100, 000 (1 + 1/2) − 100, 000 = − (1/16) · 100, 000 < 0.
If π R < π ≤ π P , the bank’s payoff is
U (π) = (3/4) · [(1/2) · 100, 000 (1 + π) + (1/2) · 0 − 100, 000]
      = (3/8) · 100, 000 (π − 1) ,
which is maximized at π P , yielding (3/8) · 100, 000. If π > π P , U (π) = 0. Hence,
the bank will choose π = π P .
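The bank's payoff comparison can be double-checked in exact arithmetic (a sketch of the computation above):

```python
from fractions import Fraction

L = 100_000  # loan size
pi_R, pi_P = Fraction(1, 2), Fraction(2)

def bank_payoff(pi):
    if pi <= pi_R:
        # Both types borrow; the poor type (prob 3/4) repays only on success.
        return Fraction(1, 4) * L * (1 + pi) \
            + Fraction(3, 4) * (Fraction(1, 2) * L * (1 + pi) + Fraction(1, 2) * 0) - L
    if pi <= pi_P:
        # Only the poor type borrows.
        return Fraction(3, 4) * (Fraction(1, 2) * L * (1 + pi) - L)
    return Fraction(0)

assert bank_payoff(pi_R) == -Fraction(L, 16)    # −6,250 < 0
assert bank_payoff(pi_P) == Fraction(3, 8) * L  # 37,500
assert bank_payoff(pi_P) > bank_payoff(pi_R)    # so the bank sets π = π_P = 2
```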

16.6 Exercises
1. [Homework 5, 2011] In the following game, for each action of player 2, find a
sequential equilibrium in which player 2 plays that action:

[Figure: game tree; Nature chooses x (prob. 3/4) or y (prob. 1/4), Player 1 chooses in or out, and Player 2 chooses L or R; payoffs not fully legible in this transcription.]

2. [Final 2011] Find a sequential equilibrium of the following game. Verify that you
have indeed a sequential equilibrium.

[Figure: game tree; Nature's three branches have probability 1/3 each; Player 1 chooses a or b, Player 2 chooses x or y.]

3. [Final 2011] Consider the following version of Yankee Swap Game, played by Alice,
Bob, and Caroline. There are 3 boxes, namely A, B, and C, and three prizes x,
y, and z. The prizes are put in the boxes randomly, so that any combination of
prizes is equally likely, and the boxes are closed without showing their contents to
the players. First, Alice is to open box A, making its content observable. Then,
in the alphabetical order, Bob and Caroline are to open the box with their own
initial, making its content observable, and either keep the content as is or swap its
content with the content of a box that has been opened already. Finally, Alice is
given the option of swapping the content of her box with the content of any other
box, ending the game when each player gets the prize in their own box.

(a) Assume that it is commonly known that, for each player, the payoff from x, y,
and z are 3, 2, and 0, respectively. Find a subgame-perfect Nash equilibrium.

(b) Now assume that it is commonly known that the preferences of Bob and
Caroline are as in part (a), but the preferences of Alice are privately known
by herself. With probability 1/2, her utility function is as above, but with
probability 1/2 she gets payoffs of 2, 3, and 0 from x, y, and z, respectively.
Find a sequential equilibrium of this game.

4. [Final 2006] Consider the following game

[Figure: game tree with moves E/X and a/d for Player 1 and A/D for Player 2; Nature selects the lower branch with probability π.]

where π is the probability that Nature selects the lower branch.

(a) (10 pts) Find a sequential equilibrium for π = 3/4.


(b) (15 pts) Find a sequential equilibrium for π = 1/4.

5. [Final 2005] The following game describes a situation in which Player 2 is not sure
that she is playing a game with Player 1, i.e., she is not sure that Player 1 exists.

[Figure: game tree; Nature's branches have probabilities .8 and .2; moves A/D and F/P for Player 1, a/d for Player 2.]

(a) (20 points) Compute a perfect Bayesian Nash equilibrium of this game.
(b) (5 points) Briefly discuss the equilibrium in (a) from Player 2’s point of view.

6. [Final 2005] We have two players, Host and Contestant. There are three doors, L,
M, and R.

• Nature puts a car behind one of these doors, and goats behind the others.
The probability of having the car is the same for all doors. Host knows which
door, but Contestant does not.
• Then, Contestant selects a door.
• Then, Host must open one of the two doors that are not selected by Contestant
and show Contestant what Nature put behind that door.
• Then, Contestant chooses any of the three doors, and receives whatever is
behind that door.

Payoffs for Contestant and Host are (1,-1) if Contestant receives a car, and (0,0)
if he receives a goat. Compute a perfect Bayesian Nash equilibrium of this game.
Verify that this is indeed a PBE. [Hint: Any strategy for Host in which he never
shows the car is part of some PBE.]

7. [Final 2004] Find a sequential equilibrium of the following game.

[Figure: game tree; Nature's branches have probabilities .4 and .6; Player 1 moves A/D and a/d, Player 2 moves α/δ.]

8. [Final 2004] A soda company, XC, introduces a new soda and wants to sell it
to a representative consumer. The soda may be either Good or Bad. The prior
probability that it is Good is 0.6. Knowing whether the soda is Good or Bad,

the soda company chooses an advertisement level for the product, which can be
either an Ad Blitz, which costs the company c, or No Advertisement, which does
not cost anything. Observing how strongly the company advertises the soda, but
without knowing whether the soda is Good or Bad, the representative consumer
decides whether or not to buy the product. After subtracting the price, the payoff
of representative consumer from buying the soda is 1 if it is Good and −1 if it
is Bad. His payoff is 0 if he does not buy the soda. If the soda is Good and
representative consumer buys it (and therefore learns that the soda is Good), then
the company sells the soda to other future consumers, enjoying a high revenue of
R. If the soda is Bad and the representative consumer buys it, the company will
have only a small revenue r. If the representative consumer does not buy the soda,
the revenue of the company is 0. Assume that 0 < r < c < R.

(a) Write this game as a signaling game. (Draw the game tree.)

(b) Find a separating equilibrium. (Verify that it is a sequential equilibrium.)

(c) Find a pooling equilibrium. (Verify that it is a sequential equilibrium.)

(d) Find a sequential equilibrium for the case that the prior probability of Good
is 0.4.

(e) Find a sequential equilibrium for the case that 0 < c < r < R (and the prior
probability of Good is 0.6).

9. [Final 2004] In this question, you are asked to help me to determine the letter
grades! We have a professor and a potential student. There are two types of
students, H and L. The student knows his type, but the professor does not. The
prior probability of type H is π ∈ [0, 1]. The events take place in the following
order.

• First, the professor determines a cutoff value γ ∈ [0, 100].

• Observing γ and his type, the student decides whether to take the class.

• If the student does not take the class, the game ends; the professor gets 0, and
the student gets Wt , where t ∈ {H, L} is his type and 0 < WL < WH < 100.

• If the student takes the class, then he chooses an effort level e and takes an
exam. His score in the exam is s = e if t = L and s = 2e if t = H; i.e., a high
type student scores higher for any effort level.

• The student gets a letter grade

g = A if s ≥ γ, and g = B otherwise.

• The student’s payoff is 100 − e/2 if he gets g = A, and −e/2 if he gets B.


The professor’s payoff is s.

(a) Consider a prestigious institution with high standards, where π is high, and
WH is not too high. In particular, π > .5 (100 − WL ) / (100 − WH ) and WH <
(100 + WL ) /2. Compute a sequential equilibrium for this game.

(b) Consider a prestigious institution with spoiled kids, where both π and WH are
high. In particular, WH > (100 + WL ) /2 and π > 1−2 (100 − WH ) / (100 − WL ).
Compute a sequential equilibrium for this game.

(c) Consider a lower-tier college, where both π and WH are low; π < .5 (100 − WL ) / (100 − WH )
and WH < (100 + WL ) /2. Compute a sequential equilibrium for this game.

(d) Assuming that WL is the same at all three institutions, rank the exam scores
in (a), (b) and (c).

(e) (0 points) What cutoff value would you choose if you were a professor at
MIT?

10. [Final 2002 Make Up] Consider the following game.



[Figure: signaling game; the sender's three types have prior probabilities .5, .1, and .4; the sender chooses L or R, and the receiver responds with T or B.]

(a) Find a pooling sequential equilibrium.

(b) Find a sequential equilibrium in which, for each signal, there is a type who
sends that signal.

11. [Final 2002 Make Up] We have a Defendant and a Plaintiff, who was injured by the
Defendant. If they go to court, the Defendant will pay a cost c ∈ (0, 1) to the court and
a reward d to the Plaintiff, depending on the severity of the injury. [Here c and d
are measured in terms of utiles, where a utile is $1M.] The Plaintiff knows d but
the Defendant does not; she believes that d = 1 with probability π > c and d = 2
with probability 1 − π. The Plaintiff ask a settlement s, and the Defendant either
accepts, in which case she pays s (utile) to the Plaintiff, or rejects in which case
they go to court. Everything described up to here is common knowledge. Find a
sequential equilibrium.

12. [Final 2000] Consider the following private-value auction of a single object, whose
value for the seller is 0. There are two buyers, say 1 and 2. The value of the object
for each buyer i ∈ {1, 2} is vi so that, if i buys the object paying the price p, his
payoff is vi − p; if he doesn’t buy the object, his payoff is 0. We assume that
v1 and v2 are independently and identically distributed uniformly on [v, 1] where
0 ≤ v < 1.

(a) We use sealed-bid first-price auction, where each buyer i simultaneously bids
bi , and the one who bids the highest bid buys the object paying his own bid.

Compute the symmetric Bayesian Nash equilibrium in linear strategies, where
bi = a + cvi . Compute the expected utility of a buyer for whom the value of
the object is v.
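Before deriving the coefficients a and c analytically, one can guess and verify numerically. The sketch below is a hypothetical check, not the requested derivation; it tests whether the candidate linear strategy b(vi) = (v + vi)/2, with v the lower bound of the value distribution, is a best response to itself.

```python
# Hypothetical check: is b(v_i) = (v_low + v_i)/2 a best response to itself in
# the two-bidder first-price auction with values uniform on [v_low, 1]?

def expected_payoff(b, v, v_low):
    """Expected payoff of a value-v bidder bidding b, when the rival bids
    (v_low + v_j)/2 with v_j uniform on [v_low, 1]."""
    # The rival's bid is uniform on [v_low, (v_low + 1)/2], so the win
    # probability is linear in b, clipped to [0, 1].
    p_win = min(max(2 * (b - v_low) / (1 - v_low), 0.0), 1.0)
    return (v - b) * p_win

v_low, v = 0.2, 0.7
b_star = (v_low + v) / 2                       # candidate equilibrium bid: 0.45
grid = [v_low + k * (v - v_low) / 1000 for k in range(1001)]
b_best = max(grid, key=lambda b: expected_payoff(b, v, v_low))
print(b_star, b_best)   # the grid maximizer should agree with b_star
```

The grid search recovers the candidate bid, consistent with a = v/2 and c = 1/2 in the linear form bi = a + cvi.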

(b) Now assume that v1 and v2 are independently and identically distributed
uniformly on [0, 1]. Now, in order to enter the auction, a player must pay
an entry fee φ ∈ (0, 1). First, each buyer simultaneously decides whether
to enter the auction. Then, we run the sealed-bid auction as in part (a);
which players entered is now common knowledge. If only one player enters
the auction, any bid b ≥ 0 is accepted. Compute the symmetric sequential
equilibrium where the buyers use the linear strategies in the auction if both
buyers enter the auction. Anticipating this equilibrium, which entry fee should
the seller choose? [Hint: In the entry stage, there is a cutoff level such that
a buyer enters the auction iff his valuation is at least as high as the cutoff
level.]

13. [Final 2000] Consider a worker and a firm. Worker can be of two types, High or
Low. The worker knows his type, while the firm believes that each type is equally
likely. Regardless of his type, a worker is worth 10 for the firm. The worker’s
reservation wage (the minimum wage that he is willing to accept) depends on his
type. If he is of high type his reservation wage is 5 and if he is of low type his
reservation wage is 0. First the worker demands a wage w0 ; if the firm accepts it,
then he is hired with wage w0 , when the payoffs of the firm and the worker are
10 − w0 and w0 , respectively. If the firm rejects it, in the next day, the firm offers
a new wage w1 . If the worker accepts the offer, he is hired with that wage, when
the payoffs of the firm and the worker are again 10 − w1 and w1 , respectively. If
the worker rejects the offer, the game ends, when the worker gets his reservation
wage and the firm gets 0. Find a perfect Bayesian equilibrium of this game.

14. [Homework 5, 2004] Compute all sequential equilibria of the following game.

15. [Homework 5, 2004] Consider the following general Beer-Quiche game, where the
value of avoiding a fight is α, and the ex-ante probability of strong type is p. For
each case below find a sequential equilibrium.

[Figure: a signaling game in which nature chooses among three types with probabilities .5, .1, and .4; the payoffs are as in the original figure.]

[Figure: the general Beer-Quiche game. Nature makes player 1 strong (ts) with probability p and weak (tw) with probability 1 − p; player 1 orders beer or quiche; player 2 then chooses whether to duel or not; avoiding a duel is worth α, and the payoffs are as in the original figure.]

(a) p = 0.4, and α = 2.


(b) p = 0.8, and α = 2.
(c) p = 0.8, and α = 1/2.

16. [Homework 5, 2004] Consider a buyer and a seller. The seller owns an object, whose
value for himself is c. The value of the object for the buyer is v. Each player knows
his own valuation but not the other player’s; v and c are independently and
identically distributed with uniform distribution on [0, 1]. We have two dates,
t = 0, 1. The players discount the future payoffs with δ = .9. Hence, if they trade
at t = 0 with price p, the payoffs of seller and the buyer are p − c and v − p,
respectively, while these payoffs would be 0.9 (p − c) and 0.9 (v − p), respectively,
if they traded at t = 1. If they do not trade at either date, each gets 0. Find
a sequential equilibrium of the game in each of the following cases.

(a) At t = 0, the seller offers a price p0 . If the buyer accepts, trade occurs at
price p0 . If the offer is rejected, the game ends without the possibility of trade
at t = 1.
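Part (a) reduces to a take-it-or-leave-it price. Under the assumption (to be verified in your answer) that the buyer accepts iff v ≥ p0, the seller with cost c maximizes (p − c)(1 − p). A minimal grid-search sketch:

```python
# Hypothetical sketch for part (a): the buyer is assumed to accept iff v >= p0,
# so the seller with cost c maximizes (p - c) * (1 - p) over p in [0, 1].

def seller_payoff(p, c):
    return (p - c) * (1 - p)     # prob. of sale is P(v >= p) = 1 - p

c = 0.5
grid = [k / 10000 for k in range(10001)]
p_star = max(grid, key=lambda p: seller_payoff(p, c))
print(p_star)   # grid maximizer; the FOC gives p* = (1 + c)/2 = 0.75
```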

(b) At t = 0, the seller offers a price p0 . If the buyer accepts, trade occurs at price
p0 . If the buyer rejects, at t = 1, the seller sets another price p1 . If the buyer
accepts the price, the trade occurs at price p1 ; otherwise they do not trade.
[Hint: There is an equilibrium in which there is a threshold a (p0 ) such that a
buyers buys at t = 0 if his valuation is above a (p0 ), and the threshold and the
sellers strategies are "linear," i.e., a (p0 ) = min {αp0 + β, 1} and p0 = Ac + B
for some parameters α, β, A, and B.]

17. [Final 2000, Make Up] Two players (say A and B) own a company, each of them
owning a half of the Company. They want to dissolve the partnership in the
following way. Player A sets a price p. Then, player B decides whether to buy
A’s share or to sell his own share to A, in each case at price p. The values of the
Company for players A and B are vA and vB , respectively.

(a) Assume that the values vA and vB are commonly known. What would be the
price in the subgame-perfect equilibrium?

(b) Assume that the value of the Company for each player is his own private
information, and that these values are independently drawn from a uniform
distribution on [0,1]. Compute the sequential equilibrium.
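One way to explore part (b) numerically, under the assumption that B buys A's share iff vB > 2p (buying yields vB − p while selling yields p), is to grid-search A's price against vB uniform on [0, 1]. This is an exploratory sketch, not the requested equilibrium derivation.

```python
# Hypothetical exploration of part (b): A's optimal price when B buys iff
# v_B > 2p and v_B is uniform on [0, 1].

def a_payoff(p, v_a):
    """Player A's expected payoff from naming price p."""
    q_buy = min(max(1 - 2 * p, 0.0), 1.0)       # P(v_B > 2p): B buys, A receives p
    return p * q_buy + (v_a - p) * (1 - q_buy)  # otherwise A buys B's share at p

v_a = 0.5
grid = [k / 10000 for k in range(5001)]          # search p in [0, 0.5]
p_star = max(grid, key=lambda p: a_payoff(p, v_a))
print(p_star)   # grid maximizer; the FOC gives p* = (1 + 2*v_a)/8 = 0.25
```

Note that the resulting price depends on vA, which is the sense in which A's price can signal his value in the sequential equilibrium.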

18. [Final 2000, Make Up] Consider the following game.



[Figure: a signaling game in which nature picks player 1’s type with probabilities 0.6 and 0.4, player 1 chooses L or R, and player 2 responds; the payoffs are as in the original figure.]

(a) Find a separating equilibrium.


(b) Find a pooling equilibrium.
(c) Find an equilibrium in which a type of player 1 plays a (completely) mixed
strategy.
