6.254: Game Theory With Engineering Applications Lecture 15: Repeated Games
Asu Ozdaglar
MIT
April 1, 2010
Introduction
Outline
Folk Theorems
Reference:
Fudenberg and Tirole, Section 5.1.
Prisoners' Dilemma
How can cooperation be sustained in society?
Recall the prisoners' dilemma, the canonical game for
understanding incentives to defect instead of cooperating.
             Cooperate    Defect
Cooperate    1, 1         -1, 2
Defect       2, -1        0, 0
Recall that the strategy profile (D, D) is the unique NE. In fact, D
strictly dominates C, and thus (D, D) is the dominant strategy equilibrium.
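The dominance and equilibrium claims can be checked by brute force. The following is a minimal sketch, assuming the payoffs (1, 1), (-1, 2), (2, -1), (0, 0) for the profiles (C, C), (C, D), (D, C), (D, D); the function names are illustrative and not part of the lecture.

```python
from itertools import product

# Assumed stage payoffs: (row action, column action) -> (payoff to 1, payoff to 2).
PAYOFFS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (-1, 2),
    ("D", "C"): (2, -1),
    ("D", "D"): (0, 0),
}
ACTIONS = ("C", "D")


def strictly_dominates(a, b):
    """True if row action a pays player 1 strictly more than b against every column action."""
    return all(PAYOFFS[(a, c)][0] > PAYOFFS[(b, c)][0] for c in ACTIONS)


def pure_nash_equilibria():
    """All pure profiles from which neither player gains by a unilateral deviation."""
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        best_1 = all(PAYOFFS[(a1, a2)][0] >= PAYOFFS[(d, a2)][0] for d in ACTIONS)
        best_2 = all(PAYOFFS[(a1, a2)][1] >= PAYOFFS[(a1, d)][1] for d in ACTIONS)
        if best_1 and best_2:
            equilibria.append((a1, a2))
    return equilibria


print(strictly_dominates("D", "C"))
print(pure_nash_equilibria())
```

Only player 1's dominance is checked explicitly; the game is symmetric, so the same holds for player 2.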
In society, we have many situations of this form, but we often observe
some amount of cooperation.
Why?
Repeated Games
In many strategic situations, players interact repeatedly over time.
Perhaps repetition of the same game might foster cooperation.
By repeated games, we refer to a situation in which the same stage game
(strategic form game) is played at each date for some duration of T periods.
Such games are also sometimes called supergames.
We will assume that overall payoff is the sum of discounted payoffs at each
stage.
Future payoffs are discounted and are thus less valuable (e.g., money
in the future is less valuable than money now because of positive
interest rates; consumption in the future is less valuable than
consumption now because of time preference).
We will see in this lecture how repeated play of the same strategic game
introduces new (desirable) equilibria by allowing players to condition their
actions on the way their opponents played in the previous periods.
Discounting
We will model time preferences by assuming that future payoffs are
discounted proportionately ("exponentially") at some rate $\delta \in [0, 1)$,
called the discount rate.
For example, in a two-period game with stage payoffs given by $u^1$ and
$u^2$, overall payoffs will be
$$U = u^1 + \delta u^2.$$
With the interest rate interpretation, we would have
$$\delta = \frac{1}{1 + r},$$
where r is the interest rate.
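As a small numerical illustration of discounting, the sketch below converts an interest rate into a discount rate and evaluates a two-period payoff. The interest rate of 25% and the stage payoffs of 10 are made-up values chosen only to keep the arithmetic simple.

```python
def discount_factor(r):
    """Discount rate delta implied by a per-period interest rate r >= 0."""
    return 1.0 / (1.0 + r)


def discounted_sum(stage_payoffs, delta):
    """Sum of delta**t * u_t over the given stage payoffs, with t starting at 0."""
    return sum(delta ** t * u for t, u in enumerate(stage_payoffs))


delta = discount_factor(0.25)            # hypothetical interest rate of 25%
total = discounted_sum([10, 10], delta)  # hypothetical stage payoffs u^1 = u^2 = 10
print(delta, total)
```

A higher interest rate gives a smaller delta, so the second-period payoff counts for less.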
Mathematical Model
More formally, imagine that I players play a strategic form game
$G = \langle \mathcal{I}, (A_i)_{i \in \mathcal{I}}, (g_i)_{i \in \mathcal{I}} \rangle$ for T periods.
At each period, the outcomes of all past periods are observed by all players
(perfect monitoring).
Let us start with the case in which T is finite, but we will be particularly
interested in the case in which $T = \infty$.
Here $A_i$ denotes the set of actions at each stage, and
$$g_i : A \to \mathbb{R},$$
where $A = A_1 \times \cdots \times A_I$.
That is, $g_i(a_i^t, a_{-i}^t)$ is the stage payoff to player i when action profile
$a^t = (a_i^t, a_{-i}^t)$ is played.
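The payoff to player i in the T-period repeated game is the discounted sum of
stage payoffs:
$$u_i(a) = \sum_{t=0}^{T} \delta^t g_i(a_i^t, a_{-i}^t).$$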
             Cooperate    Defect
Cooperate    1, 1         -1, 2
Defect       2, -1        0, 0
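By backward induction: in the last period T, D is strictly dominant, so the
play will be (D, D) regardless of the history. At T - 1 we then know that at
T, no matter what, the play will be (D, D). Given this, the subgame starting
at T - 1 again has (D, D) as its dominant strategy equilibrium, and the same
reasoning applies to every earlier period.
With this argument, we have that there exists a unique SPE: (D, D)
at each date.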
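The backward induction argument can be sketched in code. The version below assumes the prisoners' dilemma payoffs (1, 1), (-1, 2), (2, -1), (0, 0), searches pure equilibria only, and relies on the fact that later play does not depend on the current action, so the continuation value enters each period as a constant. The function names are illustrative.

```python
from itertools import product

# Assumed stage payoffs: (row action, column action) -> (payoff to 1, payoff to 2).
STAGE = {("C", "C"): (1, 1), ("C", "D"): (-1, 2), ("D", "C"): (2, -1), ("D", "D"): (0, 0)}
ACTIONS = ("C", "D")


def pure_nash(game):
    """Pure profiles of a two-player game from which no unilateral deviation pays."""
    found = []
    for a1, a2 in product(ACTIONS, repeat=2):
        ok_1 = all(game[(a1, a2)][0] >= game[(d, a2)][0] for d in ACTIONS)
        ok_2 = all(game[(a1, a2)][1] >= game[(a1, d)][1] for d in ACTIONS)
        if ok_1 and ok_2:
            found.append((a1, a2))
    return found


def backward_induction(periods, delta):
    """Solve the finitely-repeated game from the last period back to the first."""
    continuation = (0.0, 0.0)  # nothing is played after the last period
    path = []
    for _ in range(periods):   # the first iteration is the last period
        # Future play is history-independent, so it adds the same constant to every cell.
        augmented = {
            profile: (g1 + delta * continuation[0], g2 + delta * continuation[1])
            for profile, (g1, g2) in STAGE.items()
        }
        equilibria = pure_nash(augmented)
        assert len(equilibria) == 1  # uniqueness is what makes the induction go through
        path.append(equilibria[0])
        continuation = augmented[equilibria[0]]
    path.reverse()
    return path


print(backward_induction(5, 0.9))
```

Adding a constant to every cell leaves best responses unchanged, which is why each period reduces to the one-shot game.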
Infinitely-Repeated Games
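Now consider the infinitely-repeated game $G^\infty$, i.e., players play the
game repeatedly at times t = 0, 1, ....
The notation $a = \{a^t\}_{t=0}^{\infty}$ denotes the infinite sequence of
action profiles.
A period-t history is $h^t = \{a^0, \ldots, a^{t-1}\}$ (action profiles at all
periods before t), and the set of all period-t histories is $H^t$.
A pure strategy for player i is $s_i = \{s_i^t\}$, where $s_i^t : H^t \to A_i$.
The payoff to player i for the entire repeated game is then
$$u_i(a) = (1 - \delta) \sum_{t=0}^{\infty} \delta^t g_i(a_i^t, a_{-i}^t),$$
where the factor $(1 - \delta)$ normalizes the sum so that stage and repeated
game payoffs are measured in the same units.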
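To see what the normalization does, the sketch below approximates the infinite sum by truncating it at a long finite horizon. The truncation length, the discount rate 0.9 and the two payoff streams are assumptions made for illustration only.

```python
def normalized_payoff(stream, delta, horizon=2000):
    """Approximate (1 - delta) * sum_t delta**t * stream(t) by truncating at `horizon`."""
    return (1 - delta) * sum(delta ** t * stream(t) for t in range(horizon))


# A constant stage payoff of 1 in every period has normalized value close to 1.
constant = normalized_payoff(lambda t: 1.0, 0.9)

# A payoff of 2 in period 0 and 0 afterwards has normalized value close to (1 - delta) * 2.
one_shot = normalized_payoff(lambda t: 2.0 if t == 0 else 0.0, 0.9)

print(constant, one_shot)
```

With the normalization, receiving a stage payoff x in every period is worth x in the repeated game, which makes the two directly comparable.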
Trigger Strategies
In infinitely-repeated games we can consider trigger strategies.
A trigger strategy essentially threatens other players with a "worse,"
punishment action if they deviate from an implicitly agreed action profile.
A non-forgiving trigger strategy (or grim trigger strategy) $s$ would involve
this punishment forever after a single deviation.
A non-forgiving trigger strategy (for player i) takes the following form:
$$a_i^t = \begin{cases} \bar{a}_i & \text{if } a^\tau = \bar{a} \text{ for all } \tau < t, \\ \underline{a}_i & \text{if } a^\tau \ne \bar{a} \text{ for some } \tau < t. \end{cases}$$
Here $\bar{a}$ is the implicitly agreed action profile and $\underline{a}_i$ is the punishment action.
This strategy is non-forgiving since a single deviation from $\bar{a}$ induces player i
to switch to $\underline{a}_i$ forever.
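A grim trigger strategy is simply a function from histories to actions. The sketch below writes it that way for the prisoners' dilemma, assuming (C, C) is the agreed profile and D is the punishment action; `simulate` and `defect_once_at` are hypothetical helpers added for the illustration.

```python
def grim_trigger(history, agreed=("C", "C"), agreed_action="C", punishment="D"):
    """Play the agreed action while every past profile equals `agreed`; otherwise punish."""
    if all(profile == agreed for profile in history):
        return agreed_action
    return punishment


def simulate(strategy_1, strategy_2, periods):
    """Generate the path of play; each strategy sees the full history of action profiles."""
    history = []
    for _ in range(periods):
        history.append((strategy_1(history), strategy_2(history)))
    return history


def defect_once_at(t_dev):
    """Hypothetical deviator: follows grim trigger except for playing D in period t_dev."""
    def strategy(history):
        if len(history) == t_dev:
            return "D"
        return grim_trigger(history)
    return strategy


print(simulate(grim_trigger, grim_trigger, 4))
print(simulate(grim_trigger, defect_once_at(2), 5))
```

In the second run, the single deviation in period 2 moves both players to D in every later period, which is the non-forgiving feature of the strategy.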
             Cooperate    Defect
Cooperate    1, 1         -1, 2
Defect       2, -1        0, 0
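Whether grim trigger can sustain (C, C) in this game depends on the discount rate. The following sketch assumes both players use grim trigger with (C, C) as the agreed profile, uses normalized payoffs, and compares conforming forever with the best one-period deviation (playing D against C once, then facing (D, D) forever). The constant names are illustrative.

```python
from fractions import Fraction

# Assumed payoffs to the deviating player: g(C, C), g(D, C) and g(D, D).
COOPERATE, TEMPTATION, PUNISH = 1, 2, 0


def cooperation_is_sustainable(delta):
    """Compare the normalized value of conforming with that of a one-period deviation."""
    conform = Fraction(COOPERATE)                        # (C, C) in every period
    deviate = (1 - delta) * TEMPTATION + delta * PUNISH  # 2 today, then 0 forever
    return conform >= deviate


for delta in (Fraction(2, 5), Fraction(1, 2), Fraction(9, 10)):
    print(delta, cooperation_is_sustainable(delta))
```

Under these assumptions the comparison reduces to $1 \ge 2(1 - \delta)$, so cooperation survives exactly when $\delta \ge 1/2$.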
Remarks
       A         B         C
A      2, 2      2, 1      0, 0
B      1, 2      1, 1      -1, 0
C      0, 0      0, -1     -1, -1
For the game defined above, the action A strictly dominates B and C for both
players; therefore the unique Nash equilibrium of the stage game is (A, A).
If $\delta \ge 1/2$, this game has an SPE in which (B, B) is played in every period.
It is supported by a slightly more complicated strategy than grim trigger:
I. Play B in every period unless someone deviates, then go to II.
II. Play C. If no one deviates go to I. If someone deviates stay in II.
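The two-phase strategy can be checked with the one-shot deviation principle: in each phase, no single-period deviation followed by a return to the prescribed strategy should be profitable. The sketch below assumes the symmetric payoffs with (B, C) = (-1, 0), (C, B) = (0, -1) and (C, C) = (-1, -1), and uses normalized payoffs; the function names are illustrative.

```python
from fractions import Fraction

# Assumed payoff to a player choosing `own` when the opponent chooses `other`.
G1 = {
    ("A", "A"): 2, ("A", "B"): 2, ("A", "C"): 0,
    ("B", "A"): 1, ("B", "B"): 1, ("B", "C"): -1,
    ("C", "A"): 0, ("C", "B"): 0, ("C", "C"): -1,
}
ACTIONS = "ABC"


def phase_values(delta):
    """Normalized continuation values at the start of phase I and of phase II."""
    v1 = Fraction(1)                                 # (B, B) in every period
    v2 = (1 - delta) * G1[("C", "C")] + delta * v1   # (C, C) once, then back to phase I
    return v1, v2


def is_spe(delta):
    """One-shot deviation check; any unilateral deviation sends play to phase II."""
    v1, v2 = phase_values(delta)
    ok_phase_1 = all(
        v1 >= (1 - delta) * G1[(d, "B")] + delta * v2 for d in ACTIONS if d != "B"
    )
    ok_phase_2 = all(
        v2 >= (1 - delta) * G1[(d, "C")] + delta * v2 for d in ACTIONS if d != "C"
    )
    return ok_phase_1 and ok_phase_2


for delta in (Fraction(2, 5), Fraction(1, 2), Fraction(9, 10)):
    print(delta, is_spe(delta))
```

Both phases have to be checked: phase II is where the punishers themselves might be tempted to deviate, and this is the step that grim trigger in the Nash folk theorem does not require.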
Folk Theorems
In fact, it has long been a "folk theorem" that one can support
cooperation in the repeated prisoners' dilemma, and other
"non-one-stage-equilibrium" outcomes in infinitely-repeated games
with sufficiently high discount factors.
These results are referred to as folk theorems since they were
believed to be true before they were formally proved.
Here we will see a relatively strong version of these folk theorems.
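Feasible Payoffs
Consider stage game $G = \langle \mathcal{I}, (A_i)_{i \in \mathcal{I}}, (g_i)_{i \in \mathcal{I}} \rangle$ and infinitely-repeated
game $G^\infty(\delta)$.
The set of feasible payoffs is the convex hull of the payoff vectors that can
be obtained by some action profile:
$$V = \mathrm{Conv}\{ v \in \mathbb{R}^I \mid \text{there exists } a \in A \text{ such that } g(a) = v \}.$$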
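Minmax Payoffs
Minmax payoff of player i: the lowest payoff that player i's opponents
can hold him to:
$$\underline{v}_i = \min_{\alpha_{-i}} \left[ \max_{\alpha_i} g_i(\alpha_i, \alpha_{-i}) \right].$$
Minmax strategy profile against i:
$$m_{-i}^i = \arg\min_{\alpha_{-i}} \left[ \max_{\alpha_i} g_i(\alpha_i, \alpha_{-i}) \right].$$
Let $m_i^i$ denote a strategy of player i such that
$$g_i(m_i^i, m_{-i}^i) = \underline{v}_i.$$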
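The min-max structure can be sketched directly in code. The version below restricts the punisher to pure actions, which in general only gives an upper bound on the minmax payoff, since mixing can sometimes hold a player lower. It is applied to the prisoners' dilemma with assumed payoffs 1, -1, 2, 0; the helper name `pure_minmax` is illustrative.

```python
def pure_minmax(payoff, own_actions, opponent_actions):
    """min over opponent actions of (max over own actions of payoff(own, opponent))."""
    return min(
        max(payoff(own, opp) for own in own_actions)
        for opp in opponent_actions
    )


# Assumed prisoners' dilemma payoffs to the player choosing the first action.
PD = {("C", "C"): 1, ("C", "D"): -1, ("D", "C"): 2, ("D", "D"): 0}

value = pure_minmax(lambda own, opp: PD[(own, opp)], "CD", "CD")
print(value)
```

The inner maximum is the punished player's best reply; the outer minimum picks the opponent action that makes that best reply as unrewarding as possible.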
Example
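Consider
       L         R
U      -2, 2     1, -2
M      1, -2     -2, 2
D      0, 1      0, 1
To compute $\underline{v}_1$, let q denote the probability that player 2 plays L.
Player 1's payoffs from U, M, and D are then $1 - 3q$, $-2 + 3q$, and $0$, respectively.
Therefore, we have
$$\underline{v}_1 = \min_{0 \le q \le 1} \max\{ 1 - 3q,\ -2 + 3q,\ 0 \} = 0,$$
and $m_2^1 \in [\frac{1}{3}, \frac{2}{3}]$.
Similarly, one can show that $\underline{v}_2 = 0$, and $m_1^2 = (1/2, 1/2, 0)$ is the
unique minimax profile.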
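The minmax value for player 1 in this example can be reproduced numerically. The sketch below assumes player 1's payoffs are -2, 1 for U, 1, -2 for M and 0, 0 for D against (L, R), and evaluates player 1's best-reply payoff on an exact rational grid of values of q. The grid size of 300 is an arbitrary choice that happens to contain 1/3 and 2/3.

```python
from fractions import Fraction

# Assumed payoffs to player 1 for each row against L and R.
G1 = {"U": {"L": -2, "R": 1}, "M": {"L": 1, "R": -2}, "D": {"L": 0, "R": 0}}


def best_reply_value(q):
    """Player 1's best expected payoff when player 2 plays L with probability q."""
    return max(q * row["L"] + (1 - q) * row["R"] for row in G1.values())


grid = [Fraction(k, 300) for k in range(301)]
v1 = min(best_reply_value(q) for q in grid)
minimizers = [q for q in grid if best_reply_value(q) == v1]

print(v1, minimizers[0], minimizers[-1])
```

Because the objective is piecewise linear in q, a grid that contains the kink points gives the exact minimum here; for a general game a linear program would be the more robust choice.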
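Minmax payoffs are lower bounds on equilibrium payoffs:
In any (possibly mixed) Nash equilibrium $\alpha$ of $G$: $g_i(\alpha) \ge \underline{v}_i$.
In any (possibly mixed) Nash equilibrium $\sigma$ of $G^\infty(\delta)$: $u_i(\sigma) \ge \underline{v}_i$.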
Definition
A payoff vector $v \in \mathbb{R}^I$ is strictly individually rational if $v_i > \underline{v}_i$ for all i.
Theorem
(Nash Folk Theorem) If $(v_1, \ldots, v_I)$ is feasible and strictly individually
rational, then there exists some $\underline{\delta} < 1$ such that for all $\delta > \underline{\delta}$, there is a
Nash equilibrium of $G^\infty(\delta)$ with payoffs $(v_1, \ldots, v_I)$.
Proof
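Suppose for simplicity that there exists an action profile
$a = (a_1, \ldots, a_I)$ s.t. $g_i(a) = v_i$ [otherwise, we have to consider
mixed strategies, which is a little more involved].
Let $m_{-i}^i$ be the minimax strategy of the opponents of i and $m_i^i$ be i's
best response to $m_{-i}^i$.
Consider the following grim trigger strategy for player i: play
$(a_1, \ldots, a_I)$ as long as no one deviates; if some player j deviates,
play $m_i^j$ thereafter.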
i
Proof (continued)
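If i deviates from the strategy in some period t, then denoting
$\bar{v}_i = \max_a g_i(a)$, the most that player i could get is given by:
$$(1 - \delta) \left[ v_i + \delta v_i + \cdots + \delta^{t-1} v_i + \delta^t \bar{v}_i + \delta^{t+1} \underline{v}_i + \delta^{t+2} \underline{v}_i + \cdots \right].$$
Hence, following the suggested strategy will be optimal if
$$\frac{v_i}{1 - \delta} \ge \frac{1 - \delta^t}{1 - \delta} v_i + \delta^t \bar{v}_i + \frac{\delta^{t+1}}{1 - \delta} \underline{v}_i,$$
thus if
$$v_i \ge (1 - \delta^t) v_i + \delta^t (1 - \delta) \bar{v}_i + \delta^{t+1} \underline{v}_i.$$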
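Subtracting $(1 - \delta^t) v_i$ from both sides of the last inequality and dividing by $\delta^t$ leaves $v_i \ge (1 - \delta) \bar{v}_i + \delta \underline{v}_i$, which rearranges to $\delta \ge (\bar{v}_i - v_i) / (\bar{v}_i - \underline{v}_i)$ and no longer depends on t. The sketch below evaluates this threshold and the right-hand side of the inequality for one player; the numbers $v_i = 1$, $\bar{v}_i = 2$, $\underline{v}_i = 0$ are an assumed prisoners' dilemma illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def critical_discount_factor(v, v_bar, v_min):
    """Smallest delta with v >= (1 - delta) * v_bar + delta * v_min (needs v_bar > v_min)."""
    return Fraction(v_bar - v, v_bar - v_min)


def deviation_bound(v, v_bar, v_min, delta, t):
    """Right-hand side of the last inequality for a deviation in period t."""
    return (1 - delta ** t) * v + delta ** t * (1 - delta) * v_bar + delta ** (t + 1) * v_min


threshold = critical_discount_factor(1, 2, 0)
print(threshold)

# At the threshold no deviation date is profitable; below it, deviating at once is.
print(all(deviation_bound(1, 2, 0, threshold, t) <= 1 for t in range(6)))
print(deviation_bound(1, 2, 0, Fraction(2, 5), 0) > 1)
```

For the theorem one would take the largest such threshold over all players, which is strictly below 1 whenever the target payoff is strictly individually rational.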
The Nash folk theorem states that essentially any feasible and strictly
individually rational payoff can be obtained as a Nash equilibrium when
players are patient enough.
However, the corresponding strategies involve non-forgiving
punishments, which may be very costly for the punisher to carry out
(i.e., they represent non-credible threats).
This implies that the strategies used may not be subgame perfect.
The next example illustrates this fact.
       L (q)     R (1 - q)
U      6, 6      0, -100
D      7, 1      0, -100
The unique NE in this game is (D, L). It can also be seen that the
minmax payoffs are given by $\underline{v}_1 = 0$, $\underline{v}_2 = 1$.
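The equilibrium and minmax values of this example can be checked by enumeration, which also makes the cost of punishing visible. The sketch below assumes player 2 receives -100 whenever R is played and restricts attention to pure actions; in this particular game the minimizing strategies happen to be pure, so nothing is lost. The function names are illustrative.

```python
from itertools import product

# Assumed payoffs: (row action, column action) -> (payoff to 1, payoff to 2).
G = {("U", "L"): (6, 6), ("U", "R"): (0, -100), ("D", "L"): (7, 1), ("D", "R"): (0, -100)}
ROWS, COLS = ("U", "D"), ("L", "R")


def pure_nash():
    """Pure profiles from which neither player gains by deviating unilaterally."""
    found = []
    for r, c in product(ROWS, COLS):
        ok_1 = all(G[(r, c)][0] >= G[(d, c)][0] for d in ROWS)
        ok_2 = all(G[(r, c)][1] >= G[(r, d)][1] for d in COLS)
        if ok_1 and ok_2:
            found.append((r, c))
    return found


def minmax_player_1():
    """Player 2 picks the column that minimizes player 1's best-reply payoff."""
    return min(max(G[(r, c)][0] for r in ROWS) for c in COLS)


def minmax_player_2():
    """Player 1 picks the row that minimizes player 2's best-reply payoff."""
    return min(max(G[(r, c)][1] for c in COLS) for r in ROWS)


print(pure_nash(), minmax_player_1(), minmax_player_2())

# Holding player 1 to 0 requires player 2 to play R, which costs player 2 dearly.
print([G[(r, "R")][1] for r in ROWS])
```

Player 1 is held to 0 only if player 2 plays R, and R gives player 2 a payoff of -100 in either row, which is why carrying out the punishment is not credible.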
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.