Lecture 17: Repeated Games
Mauricio Romero

Repeated Games
I Repeated games are often useful for modeling relationships between two or more
agents that interact in a strategic situation not just once but over a long period of
time
I Agents may cooperate with one another, through a system of rewards and
punishments, even though cooperation may not be in their best interest in the
short run
1. In period 1, players simultaneously play the game G .
2. Players observe the actions chosen by the players in period 1. Then in period 2,
players simultaneously play the game G .
Consider the following two-player game:
I Working (ei = 1) incurs a cost of 1 but increases the utility of the other player −i by 2
I Thus,
ui (ei , e−i ) = 2e−i − ei .

Prisoner’s Dilemma (Game G ):

         e2 = 1    e2 = 0
e1 = 1   1, 1      −1, 2
e1 = 0   2, −1     0, 0
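Under the payoff function above, shirking is dominant, so (e1 = 0, e2 = 0) is the
unique pure NE. A minimal sketch of the best-response check (the helper names are
mine, not from the lecture):

```python
# Stage game G: each player chooses effort e in {0, 1}.
# Payoff from the lecture: u_i(e_i, e_-i) = 2*e_-i - e_i.

def u(e_own, e_other):
    return 2 * e_other - e_own

def pure_nash(actions=(0, 1)):
    """Find pure Nash equilibria by checking all unilateral deviations."""
    equilibria = []
    for e1 in actions:
        for e2 in actions:
            br1 = all(u(e1, e2) >= u(d, e2) for d in actions)
            br2 = all(u(e2, e1) >= u(d, e1) for d in actions)
            if br1 and br2:
                equilibria.append((e1, e2))
    return equilibria

print(pure_nash())  # [(0, 0)] — both players shirk
```

The deviation check rules out (1, 1): each player gains 1 by switching to ei = 0.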
I What happens when T = 1?
Imagine players are engaged in a long-run relationship that lasts more than just playing
the game once: (G , 2)
1. In period 1, the two players simultaneously play G .
2. Both players observe the actions chosen by the two players. Then they play G
again.
3. Then payoffs are realized as the discounted sum of the utilities of the actions in
each period with discount factor δ ∈ (0, 1].
Suppose that the two players chose (e1 = 1, e2 = 1) in the first period
In the second period, they chose (e1 = 0, e2 = 1)
u1 = 1 + δ · 2
u2 = 1 + δ · (−1).
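These discounted sums can be verified numerically; a small sketch (function names
are mine, with an illustrative δ):

```python
def u(e_own, e_other):
    # Stage payoff from the lecture: u_i = 2*e_-i - e_i
    return 2 * e_other - e_own

def repeated_payoffs(history, delta):
    """Discounted sum of stage payoffs over a list of (e1, e2) profiles."""
    u1 = sum(delta**t * u(e1, e2) for t, (e1, e2) in enumerate(history))
    u2 = sum(delta**t * u(e2, e1) for t, (e1, e2) in enumerate(history))
    return u1, u2

# History from the slide: (1, 1) in period 1, then (0, 1) in period 2.
delta = 0.9
print(repeated_payoffs([(1, 1), (0, 1)], delta))  # (1 + 2δ, 1 − δ)
```

With δ = 0.9 this gives u1 = 2.8 and u2 = 0.1, matching the formulas above.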
I We will solve for the set of pure SPNE of this game.
I A pure strategy for player 1 must specify what he does in each of his information
sets: his period-1 choice and, after each of the four possible period-1 action
profiles, his period-2 choice
[Figure: extensive form of the two-period game — in each period, player 1 chooses
1 or 0 and player 2, without observing player 1’s choice, chooses 1 or 0.]
I The first subgame that we will analyze is the one that the players encounter after
having played (e11 = 0, e21 = 0) in T = 1. Its payoffs are:
         e2 = 1    e2 = 0
e1 = 1   δ, δ      −δ, 2δ
e1 = 0   2δ, −δ    0, 0
I This game has a unique Nash equilibrium in which the players play
(e12 = 0, e22 = 0)
I Therefore after having observed (e11 = 0, e21 = 0) in the first period, both players
will play (e12 = 0, e22 = 0) in period 2
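The subgame payoff matrices that follow can all be generated the same way: add the
(sunk) period-1 payoff to δ times the period-2 stage payoff. A sketch (helper names
are mine; exact fractions avoid rounding):

```python
from fractions import Fraction

def u(e_own, e_other):
    # Stage payoff from the lecture: u_i = 2*e_-i - e_i
    return 2 * e_other - e_own

def subgame_matrix(history, delta):
    """Total payoffs in the period-2 subgame after a given period-1 profile."""
    h1, h2 = history
    sunk1, sunk2 = u(h1, h2), u(h2, h1)  # period-1 payoffs are sunk
    return {
        (e1, e2): (sunk1 + delta * u(e1, e2), sunk2 + delta * u(e2, e1))
        for e1 in (1, 0) for e2 in (1, 0)
    }

d = Fraction(1, 2)  # an illustrative δ
# After (0, 0): payoffs (δ, δ), (−δ, 2δ), (2δ, −δ), (0, 0), as in the matrix above.
print(subgame_matrix((0, 0), d))
# After (1, 0): e.g. (1, 1) yields (−1 + δ, 2 + δ), matching the next slide.
print(subgame_matrix((1, 0), d))
```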
Consider the subgame following a play of (e11 = 1, e21 = 0) in the first period. The
payoffs of this subgame are:
         e2 = 1            e2 = 0
e1 = 1   −1 + δ, 2 + δ     −1 − δ, 2 + 2δ
e1 = 0   −1 + 2δ, 2 − δ    −1, 2
I (e1 = 0, e2 = 0) is the unique Nash equilibrium
I In any SPNE, (e12 = 0, e22 = 0) must be played after observing (e11 = 1, e21 = 0)
I We can go through the remaining smaller subgames after the observation of
(e11 = 0, e21 = 1) and after the observation of (e11 = 1, e21 = 1)
I The idea is that payoffs that have accrued in period 1 are essentially sunk, and
have no influence on incentives in period 2
To see this consider the normal form representation in the subgame after the
observation of (e11 = 1, e21 = 0)
         e2 = 1            e2 = 0
e1 = 1   −1 + δ, 2 + δ     −1 − δ, 2 + 2δ
e1 = 0   −1 + 2δ, 2 − δ    −1, 2
I We can subtract off the payoff that player 1 received in period 1 and divide
player 1’s payoffs through by δ
I We can do the same thing for player 2’s payoffs and get the payoff matrix

         e2 = 1    e2 = 0
e1 = 1   1, 1      −1, 2
e1 = 0   2, −1     0, 0
I We’ve just performed affine transformations of each player’s utility function
I This payoff matrix is strategically equivalent to the original normal form
I Thus the set of Nash equilibria will remain unchanged after these transformations

I So what have we learned?
I After any history, the normal form of the final-period subgame is essentially the
same as the original prisoner’s dilemma
I Both players play (e12 = 0, e22 = 0) after any information set in the last period
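The invariance claim can be spot-checked: best responses depend only on payoff
differences, which shifting by a sunk constant and dividing by δ > 0 preserve. A small
sketch (helper names are mine):

```python
def best_responses(payoff, actions=(0, 1)):
    """Player 1's best response to each action of player 2,
    given payoff[(e1, e2)] = player 1's payoff."""
    return {
        e2: max(actions, key=lambda e1: payoff[(e1, e2)])
        for e2 in actions
    }

# Player 1's payoffs in the subgame after (e1 = 1, e2 = 0), with δ = 0.5:
delta, sunk = 0.5, -1
original = {(e1, e2): sunk + delta * (2 * e2 - e1)
            for e1 in (0, 1) for e2 in (0, 1)}
# Affine transformation: subtract the sunk payoff, divide by δ.
transformed = {k: (v - sunk) / delta for k, v in original.items()}

# Best responses (and hence Nash equilibria) coincide.
print(best_responses(original) == best_responses(transformed))  # True
```

Here `transformed` is exactly player 1’s stage-game payoff, so shirking is his best
response either way.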
I Now let us see what must be played in the first period by the two players
I Both players anticipate that (e12 = 0, e22 = 0) will be played after any chosen
action profile in the first period
         e2 = 1    e2 = 0
e1 = 1   1, 1      −1, 2
e1 = 0   2, −1     0, 0
The unique Nash equilibrium of the above normal form game is (e11 = 0, e21 = 0)
Therefore the unique SPNE is:
(e11 = 0, and e12 = 0 after every first-period history; e21 = 0, and e22 = 0 after
every first-period history)
I Here the unique SPNE requires all players to play ei = 0 at all periods and all
information sets
I Thus, the equilibrium outcome is simply the repetition of the unique NE of the
stage game
I This holds more generally when the stage game has a unique NE
I Whenever the stage game has a unique NE, then the only SPNE of a finite
horizon repeated game with that stage game is the repetition of the stage game
NE
Theorem
Suppose that the stage game G has exactly one NE, (a1∗ , a2∗ , . . . , an∗ ). Then for any
δ ∈ (0, 1] and any T , the T -times repeated game has a unique SPNE in which all
players i play ai∗ at all information sets.
I The basic idea of the proof for this proposition is exactly the same that we saw in
the repeated prisoner’s dilemma
I All past payoffs are sunk
I In the last period, the incentives of all players are exactly the same as if the game
were being played once
I Thus all players must play the stage game Nash equilibrium action regardless of
the history of play up to that point
I But then we can induct
I Knowing that the stage game Nash equilibrium is going to be played tomorrow, at
any information set, we can ignore the past payoffs
I We concentrate just on the payoffs in the future. Thus in period T − 1, player i
simply wants to maximize:
max_{ai ∈ Ai} δ^(T−2) ui (ai , a−i^(T−1)) + δ^(T−1) ui (a∗ ).
I What player i plays today has no consequences for what happens in period T
since we saw that all players will play a∗ no matter what happens in period T − 1
I Thus again, for this to be a Nash equilibrium, we need a1T −1 = a1∗ , . . . , anT −1 = an∗ .
I Following exactly this induction, we can conclude that every player must play ai∗
at all times and all histories
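The induction can be illustrated numerically: in each period, every outcome is
followed by the same continuation value, which therefore cannot affect incentives, so
the stage-game NE is played at every step of the backward induction. A sketch under
the lecture’s payoffs (function names are mine):

```python
def u(e_own, e_other):
    # Stage payoff from the lecture: u_i = 2*e_-i - e_i
    return 2 * e_other - e_own

def stage_nash(continuation=0.0, actions=(0, 1)):
    """Unique pure NE of the stage game when every outcome is followed by the
    same continuation value (adding a constant cannot change best responses)."""
    nash = [
        (e1, e2)
        for e1 in actions for e2 in actions
        if all(u(e1, e2) + continuation >= u(d, e2) + continuation for d in actions)
        and all(u(e2, e1) + continuation >= u(d, e1) + continuation for d in actions)
    ]
    assert len(nash) == 1
    return nash[0]

def backward_induction(T, delta):
    """SPNE action profile in each period of the T-times repeated game.
    The continuation value is history-independent at every step."""
    play, continuation = [], 0.0
    for _ in range(T):  # work backwards from period T
        e1, e2 = stage_nash(continuation)
        continuation = u(e1, e2) + delta * continuation
        play.append((e1, e2))
    return list(reversed(play))

print(backward_induction(5, 0.9))  # (0, 0) in every period
```

For any T and δ the output is the repetition of the stage NE, as the theorem states.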