Repeated Game: 1 Prisoner's Dilemma
20 October 2008
• Nobel prize winner George Stigler of the Chicago school argues that a cartel is
not the only means by which firms can collude to restrict output and raise price.
Instead, such collusion could be a natural outcome of market interaction among
firms. Stigler calls this tacit collusion: collusion without explicit agreement
or cooperation among firms. In this lecture, we study how cooperation can emerge
as an equilibrium outcome in a multi-period dynamic game.
1 Prisoner’s dilemma
• Consider the well-known Prisoner’s dilemma in game theory. The story is that
two criminals, who have committed a serious crime together, are caught by the
police. Each is interrogated separately, and they cannot communicate with one
another in the interim. If both refuse to confess, there will only be enough
evidence to charge each with a lesser crime, and both will be given a light
sentence of 1 month. If one confesses and the other does not, the one who
confesses will be let go as a reward for his cooperation, while there will be
enough evidence to charge the other with the serious crime, sending him to jail
for 9 months for non-repentance. Finally, if both confess, both will be charged
with the serious crime and sent to jail for 6 months.
• There are two players in this game: prisoners 1 and 2, and each may choose
between the strategies: {not confess, confess} .
                                  Prisoner 2
                           not confess    confess
Prisoner 1   not confess     -1, -1       -9,  0
             confess          0, -9       -6, -6
• The first numbers in the cells denote the payoff to prisoner 1 and the second
numbers the payoffs to prisoner 2. For example, if prisoner 1 chooses “not
confess”, and prisoner 2 chooses “confess”, prisoner 1 gets a 9 month sentence,
while prisoner 2 can go free.
• The Nash equilibrium (NE) is both prisoners choosing “confess”, each getting a
payoff of -6.
• If only they could cooperate, each choosing “not confess”, they would both be
better off, as each would receive a payoff of -1. Yet “not confess” is not an
equilibrium strategy. Whatever prisoner 2 chooses, it is in prisoner 1’s best
interest to choose “confess”. Though collectively the players are better off
cooperating, each benefits from cheating on the cooperative agreement.
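• The equilibrium and dominance claims can be checked by brute force. The
following is a minimal sketch, assuming the payoffs in the table above; the
function names are illustrative and not part of the lecture.

```python
from itertools import product

ACTIONS = ("not confess", "confess")

# PAYOFFS[(a1, a2)] = (payoff to prisoner 1, payoff to prisoner 2),
# taken from the payoff table (months in jail, with a minus sign).
PAYOFFS = {
    ("not confess", "not confess"): (-1, -1),
    ("not confess", "confess"): (-9, 0),
    ("confess", "not confess"): (0, -9),
    ("confess", "confess"): (-6, -6),
}


def pure_nash_equilibria(payoffs, actions=ACTIONS):
    """Return every action pair from which neither player gains by deviating alone."""
    equilibria = []
    for a1, a2 in product(actions, repeat=2):
        u1, u2 = payoffs[(a1, a2)]
        # Player 1 varies his own action while player 2's action is held fixed.
        best1 = all(payoffs[(d, a2)][0] <= u1 for d in actions)
        # Player 2 varies his own action while player 1's action is held fixed.
        best2 = all(payoffs[(a1, d)][1] <= u2 for d in actions)
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


def is_dominant_for_player1(action, payoffs, actions=ACTIONS):
    """True if `action` beats every alternative against every action of player 2."""
    return all(
        payoffs[(action, a2)][0] > payoffs[(other, a2)][0]
        for a2 in actions
        for other in actions
        if other != action
    )


print(pure_nash_equilibria(PAYOFFS))
print(is_dominant_for_player1("confess", PAYOFFS))
```

The only pair that survives the check is (confess, confess), and “confess” is
strictly dominant for prisoner 1. By symmetry the same holds for prisoner 2.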
• Now suppose the game is played twice. With a second stage still to come, there
could well be a payoff to cooperating in the first stage. In game theory, this
is called a repeated game.
• If player 2 chooses “not confess”, player 1’s payoff is higher whichever action
player 1 himself adopts. A player may therefore be rewarded for cooperating in
the first stage of the game if the other player promises, in return, to choose
“not confess” in the second stage of the game.
• Suppose player 1 adopts the following strategy. Play “not confess” in the 1st
stage. If player 2 cooperated in the 1st stage (a2 = not confess), play “not
confess” in the 2nd stage. If player 2 did not cooperate but played “confess”
in the 1st stage, play “confess” in the 2nd stage.
• Can this strategy adopted by player 1 which has the purpose of inviting player
2 to cooperate in the first stage to the mutual benefits of both players actually
work?
• If player 2 cooperates in the 1st stage and plays “not confess”, his payoff is
-1. In the 2nd stage, his payoff is highest if he plays “confess”, irrespective
of the action taken by player 1. Given that player 1 will choose “not confess”,
his payoff is then 0. The sum of payoffs in the two stages is -1.
• If instead player 2 snubs player 1’s invitation to cooperate and plays
“confess” in the 1st stage, he earns a higher first-stage payoff of 0. Player 1
then retaliates in the 2nd stage and plays “confess”. Playing “confess” as well,
which is player 2’s dominant action, player 2 earns -6 in the 2nd stage. The sum
of payoffs in the two stages is -6.
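• The two sums can be reproduced by enumerating player 2’s possible plans
against player 1’s announced strategy. This is a sketch only: it takes player
1’s announced second-stage action at face value, and the helper names are
hypothetical.

```python
NC, C = "not confess", "confess"

# PAYOFFS[(a1, a2)] = (payoff to player 1, payoff to player 2), from the table.
PAYOFFS = {
    (NC, NC): (-1, -1),
    (NC, C): (-9, 0),
    (C, NC): (0, -9),
    (C, C): (-6, -6),
}


def player1_announced(stage, history):
    """Player 1's announced strategy: cooperate first, then mirror player 2's first move."""
    if stage == 0:
        return NC
    first_stage_action_of_player2 = history[0][1]
    return NC if first_stage_action_of_player2 == NC else C


def player2_total(plan):
    """Undiscounted two-stage payoff to player 2 from plan = (stage-1 action, stage-2 action)."""
    history = []
    total = 0
    for stage in range(2):
        a1 = player1_announced(stage, history)
        a2 = plan[stage]
        total += PAYOFFS[(a1, a2)][1]
        history.append((a1, a2))
    return total


totals = {plan: player2_total(plan) for plan in [(NC, NC), (NC, C), (C, NC), (C, C)]}
for plan, total in totals.items():
    print(plan, total)
```

Cooperating first and confessing second gives -1, while confessing in both
stages gives -6, matching the sums in the text. The other two plans, which the
text does not discuss, give -2 and -9.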
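• Since -1 exceeds -6, player 2’s best response to player 1’s strategy is to
cooperate in the 1st stage, so it would seem that cooperation can arise in the
repeated game. Before reaching such an important conclusion, however, we should
be a trifle more careful and check whether the analysis is sound.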
• Even though player 2’s best response to player 1’s strategy of inviting collusion
is indeed to cooperate, we have not examined whether player 1’s strategy is a
credible strategy. That is, we have not verified if player 1’s promise to play the
cooperative action of “not confess” in the 2nd stage should be believed.
• When the play reaches the 2nd stage, what is player 1’s best action? By
playing “not confess”, player 1 earns a payoff of -9 when player 2 chooses
“confess”. If player 1 plays “confess” instead, his payoff will be -6. Player 1
is obviously better off playing “confess”. In fact, irrespective of player 2’s
action, playing “confess” yields player 1 a higher payoff in the 2nd stage.
• Thus, no matter what actually transpired in the 1st stage of the game, player
1’s best interest lies in adopting the non-cooperative action “confess” in the
2nd stage of the game. Player 1 would not keep his promise to play the
cooperative action “not confess” in the 2nd stage unless he is irrational
(crazy). The only equilibrium outcome in the 2nd stage of the game is both
players adopting the non-cooperative action “confess”, which is the NE of the
static one-shot game.
• The reason that cooperation cannot be an equilibrium in the 1st stage is that
any promise (threat) to reward (punish) good (bad) behaviour in the 2nd stage is
not credible. What if the play is extended to 3 periods?
• With 3 periods, the 3rd stage is the last, and play there reverts to the
one-shot NE whatever happened earlier. In the 2nd stage, then, the players
should really just concentrate on what they can get out of this stage. The NE
is both playing the non-cooperative action “confess”. This means that any
promises made in the 1st stage as to behaviour to be expected in the 2nd stage
cannot be credible. The play in the 1st stage can only be the NE of
non-cooperation.
• Hence the assumption that the game is only played twice is not restrictive.
We reach the same conclusion that no cooperation is possible if the analysis is
extended to a game that is repeated three times. In fact, by the same logic,
the same conclusion of no cooperation applies in a game that is repeated any
number of times. As long as there is a last stage of play, the play in that very
last stage will revert to the static one-shot NE. Backing up stage by stage, the
plays in all previous stages unravel to the same static one-shot NE.
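• The unravelling argument can be sketched as a backward-induction loop. The
code below is an illustration under two assumptions not spelled out in the
text: payoffs are summed without discounting, and the continuation payoff is
the same whatever is played today, because later play is already pinned down.
All names are hypothetical.

```python
from itertools import product

ACTIONS = ("not confess", "confess")

# PAYOFFS[(a1, a2)] = (payoff to player 1, payoff to player 2), from the table.
PAYOFFS = {
    ("not confess", "not confess"): (-1, -1),
    ("not confess", "confess"): (-9, 0),
    ("confess", "not confess"): (0, -9),
    ("confess", "confess"): (-6, -6),
}


def stage_nash(continuation):
    """Pure NE of one stage after adding a fixed continuation payoff to every cell."""
    c1, c2 = continuation
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        u1 = PAYOFFS[(a1, a2)][0] + c1
        u2 = PAYOFFS[(a1, a2)][1] + c2
        best1 = all(PAYOFFS[(d, a2)][0] + c1 <= u1 for d in ACTIONS)
        best2 = all(PAYOFFS[(a1, d)][1] + c2 <= u2 for d in ACTIONS)
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


def backward_induction(num_stages):
    """Solve from the last stage back to the first; return (path of play, total payoffs)."""
    continuation = (0, 0)  # nothing follows the last stage
    path = []
    for _ in range(num_stages):
        equilibria = stage_nash(continuation)
        if len(equilibria) != 1:
            raise ValueError("stage equilibrium is not unique")
        play = equilibria[0]
        path.append(play)
        u1, u2 = PAYOFFS[play]
        continuation = (continuation[0] + u1, continuation[1] + u2)
    path.reverse()  # stages were solved last-to-first
    return path, continuation


path, totals = backward_induction(3)
print(path)
print(totals)
```

Adding the same constant to every cell never changes a best response, which is
why each stage collapses to the one-shot NE however many stages are stacked.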
• A repeated game has no last stage if the game is repeated indefinitely, until
eternity. To proceed, assume the prisoner’s dilemma game between prisoners 1
and 2 is repeated indefinitely, and each chooses a strategy to maximize the sum
of the present values (PV) of his payoffs, discounting each future period by a
factor δ, with 0 < δ < 1.
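• Consider the following trigger strategy that may be adopted by the players:
1. Play the non-cooperative action “confess”, if any previous play was not
the cooperative outcome.
2. Play the cooperative action “not confess”, if all previous plays were
cooperation (this includes the first period, when there is no previous play).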
• Under this strategy, the player rewards the rival’s good behaviour by playing the
cooperative action in the future. We call this the trigger strategy in the sense
that the rival’s non-cooperation would trigger similar non-cooperation by the
given player. The cooperation may be sustained through the threat to punish
non-cooperation by playing the non-cooperative action in the future.
• Is both players adopting the trigger strategy a NE?
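• Assume that player 1 has chosen the trigger strategy. What is player 2’s best
response? At the beginning of each period, player 2 faces one of two possible
contingencies: (1) at least one previous stage was not full cooperation, or
(2) all previous stages have been the cooperative outcome.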
• Suppose all previous stages have been the cooperative outcome. If player 2
cooperates and plays “not confess”, his current payoff is -1. With both players
choosing the cooperative action, player 1 will continue to cooperate, in
accordance with his trigger strategy. At the beginning of the next period,
player 2 will face the same decision between cooperating and not. If it is
optimal for player 2 to cooperate this period, it must then be optimal for him
to cooperate in the coming period as well. The sum of the discounted payoffs is
$$V_c = -1 - \delta - \delta^2 - \delta^3 - \dots = -\frac{1}{1-\delta}.$$
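• The comparison player 2 faces can be worked out from the payoff table. The
derivation below is a reconstruction, not the lecture’s own text: it assumes
that a deviation earns 0 today and is followed by mutual “confess” in every
later period, as the trigger strategy dictates, and it writes V_d for the
deviation payoff, a symbol introduced here.

```latex
\begin{align*}
V_d &= 0 - 6\delta - 6\delta^2 - 6\delta^3 - \dots = -\frac{6\delta}{1-\delta}, \\
V_c \ge V_d
  &\iff -\frac{1}{1-\delta} \ge -\frac{6\delta}{1-\delta}
  \iff 1 \le 6\delta
  \iff \delta \ge \frac{1}{6}.
\end{align*}
```

Under these assumptions, cooperation is at least as good as deviating whenever
the players are patient enough, that is, whenever δ is at least 1/6.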
• If player 2’s discounted payoff from cooperating, Vc, is at least as large as
what he can earn by deviating, we have player 2’s best response to player 1’s
trigger strategy as follows:
1. Play the non-cooperative action “confess”, if any previous play was not
the cooperative outcome.
2. Play the cooperative action “not confess”, if all previous plays were
cooperation.
• Yet this is identical to player 1’s strategy. Hence, in response to player 1’s
trigger strategy, player 2’s best response is the very same trigger strategy.
If that is true, player 1’s best response to player 2’s trigger strategy must
similarly be the same trigger strategy. The pair of trigger strategies
constitutes a NE.
• In this NE, the 2 players always play cooperation. The punishment of playing
non-cooperation will never actually be exercised. The threat to punish suffices
to deter either player from deviating from cooperation.
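• A small simulation illustrates why the threat alone suffices. It is a
sketch under stated assumptions: the infinite game is truncated at 500 periods
(the tail is negligible for the discount factors used), δ = 0.9 and δ = 0.1 are
illustrative values, and all function names are hypothetical.

```python
NC, C = "not confess", "confess"

# PAYOFFS[(a1, a2)] = (payoff to player 1, payoff to player 2), from the table.
PAYOFFS = {
    (NC, NC): (-1, -1),
    (NC, C): (-9, 0),
    (C, NC): (0, -9),
    (C, C): (-6, -6),
}


def trigger(history):
    """Cooperate as long as every past period was mutual 'not confess'; otherwise confess."""
    return NC if all(play == (NC, NC) for play in history) else C


def deviate_once_at(period):
    """Trigger strategy, except that the player confesses in the given period."""
    def strategy(history):
        if len(history) == period:
            return C
        return trigger(history)
    return strategy


def discounted_payoffs(strategy1, strategy2, delta, periods=500):
    """Discounted payoff sums for both players over a truncated horizon."""
    history = []
    totals = [0.0, 0.0]
    for t in range(periods):
        play = (strategy1(history), strategy2(history))
        u1, u2 = PAYOFFS[play]
        totals[0] += delta ** t * u1
        totals[1] += delta ** t * u2
        history.append(play)
    return tuple(totals)


for delta in (0.9, 0.1):
    cooperate = discounted_payoffs(trigger, trigger, delta)[1]
    deviate = discounted_payoffs(trigger, deviate_once_at(0), delta)[1]
    print(delta, round(cooperate, 4), round(deviate, 4))
```

With δ = 0.9, player 2 earns about -10 by cooperating and about -54 by
deviating in the first period, so the punishment is never triggered. With
δ = 0.1 the ranking reverses, which shows that the result depends on the
players being sufficiently patient.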
• That cooperation can be a NE is not actually news to us. In finitely repeated
games, cooperation can possibly be a NE as well. We conclude that cooperation
cannot arise in finitely repeated games because the strategies supporting it
there are not credible. The challenge is to verify that the cooperative outcome
here does not involve any non-credible threats or promises.
• In a finitely repeated game, we use backward induction, solving the game
starting from the final stage, to eliminate non-credible strategies. Backward
induction cannot be used in the infinitely repeated game, as the game has no
final stage. How can we be assured then that the trigger strategy NE has no
non-credible strategies?
• Contingency (1): when at least one stage before is not full cooperation.
1
The concept of subgame perfect NE was first proposed by the German economist
Reinhard Selten in the 1960s. Selten subsequently won the Nobel prize in
economics for his work on game theory. The definition of subgame perfect NE we
adopt here is not the usual definition you see in game theory textbooks. The
formal definition is that a NE is subgame perfect if it is an equilibrium in any
and all subgames of the supergame. To understand this definition, we need to
define what a supergame and a subgame are, respectively. Rather than delve into
the formalities of game theory, we adopt the more “down-to-earth” definition in
the text. The two definitions are almost equivalent.
— If the other player has adopted the trigger strategy, he will play non-
cooperation in each period into the infinite future.
— In this case, the player’s best response is clearly to play non-cooperation
into the infinite future too.
— The player would certainly like to follow through the actions stipulated in
the trigger strategy in this contingency.
— This part of the trigger strategy has no non-credible threats or promises.
• Contingency (2): when all previous stages have been the cooperative outcome.
• In either contingency, the action dictated by the trigger strategy is the best
response to the trigger strategy adopted by the other player. The NE that is
made up of the players choosing the trigger strategy has no non-credible threats
or promises in any circumstances. It must then be subgame perfect.
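• The check in the two contingencies can be sketched numerically. The code
below is an illustration, not the lecture’s own method: it compares following
the trigger strategy with a single-period deviation, relying on the standard
result (not stated in the text) that single-period deviations are enough to
test in a discounted repeated game. The function names and the values of δ are
assumptions.

```python
def follow_vs_deviate(contingency, delta):
    """Return (payoff from following the trigger strategy, payoff from deviating once).

    'punishment' is contingency (1): some earlier stage was not full cooperation.
    'cooperation' is contingency (2): every earlier stage was the cooperative outcome.
    """
    # Value today of earning -6 in every later period (mutual "confess" forever).
    punish_forever = -6 * delta / (1 - delta)
    if contingency == "punishment":
        follow = -6 + punish_forever   # "confess" against "confess" today
        deviate = -9 + punish_forever  # "not confess" against "confess" today
    elif contingency == "cooperation":
        follow = -1 / (1 - delta)      # mutual "not confess" forever
        deviate = 0 + punish_forever   # "confess" today, punished ever after
    else:
        raise ValueError("unknown contingency: " + contingency)
    return follow, deviate


def trigger_is_credible(delta):
    """True if following the trigger strategy is at least as good in both contingencies."""
    for contingency in ("punishment", "cooperation"):
        follow, deviate = follow_vs_deviate(contingency, delta)
        if follow < deviate:
            return False
    return True


print(trigger_is_credible(0.5))
print(trigger_is_credible(0.1))
```

In contingency (1), following the strategy is better for any δ between 0 and 1.
In contingency (2), it is better only when the players are patient enough, so
the check passes for δ = 0.5 and fails for δ = 0.1.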
• We will in the next lecture apply this analysis to firms colluding to raise price
and restrict output in oligopoly.