
17.810/17.811 – Game Theory

Lecture 5: Repeated Games

Asya Magazinnik

MIT
Where We Are/Where We Are Headed

• We have developed a notion of dynamic games of complete information in which players make multiple, sequential moves
• We will now consider a special form of such games: repeated games, in which players repeat the same game structure again and again
• We will study finitely and infinitely repeated games
Reading

These slides will focus on the following readings:

• Finitely Repeated Games
  • Gibbons, 2.3A
• Infinitely Repeated Games
  • Gibbons, 2.3B
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Repeated Games

• The most interesting conceptual issue in repeated games is the extent to which repetition creates the opportunity to sustain more behavior (as Nash equilibria) than is possible in single-shot games.
• The set of Nash equilibria is much larger in repeated games than in the corresponding static versions.
• Repeated games have a different problem: the proliferation of equilibria is so great that generating precise predictions becomes difficult.
Some Details

Definition (Stage Game)
Let $G = \{A_1, \ldots, A_n;\, u_1, \ldots, u_n\}$ denote a static game of complete information in which players 1 through $n$ simultaneously choose actions $a_1$ through $a_n$ from the action spaces $A_1$ through $A_n$, respectively, and payoffs are $u_1(a_1, \ldots, a_n)$ through $u_n(a_1, \ldots, a_n)$. The game $G$ will be called the stage game of a repeated game.

Given a stage game $G$, let $G(T)$ denote the finitely repeated game in which $G$ is played $T$ times, with the outcomes of all preceding plays observed before the next play begins. The payoffs for $G(T)$ are simply the sum of the payoffs from the $T$ stage games.
Some Details

As before:

• A history is a sequence of play defining a path through the game tree, which is also a record of prior actions and stage game outcomes for all previous interactions.
• A strategy is a complete, contingent plan that tells the player what to do in every situation, that is, at every possible history.
• The equilibrium path is the sequence of outcomes determined in each stage game that results from the interaction of the players’ equilibrium strategies at each moment in time.
An Example: Subgame Perfect Nash Equilibria

1/2        L         M         R
T        8, 8      0, 0     1, 9∗
M        0, 0     5∗, 5∗     0, 0
B       9∗, 1      0, 0    3∗, 3∗

• If this game is played once there are two Nash equilibria: (M, M) and (B, R).
• Although the strategy profile (T, L) provides the highest aggregate payoff, it is not a Nash equilibrium: Player 1 would unilaterally deviate to B and Player 2 would unilaterally deviate to R.
• What happens if this game is played twice, with players caring about their combined two-period payoffs? Can players ever get the (T, L) payoff?
Subgame Perfect Nash Equilibria

1/2        L         M         R
T        8, 8      0, 0     1, 9∗
M        0, 0     5∗, 5∗     0, 0
B       9∗, 1      0, 0    3∗, 3∗

• Consider the following strategies:
  • Player 1: Play T in period 1; if Player 2 plays L in period 1, play M in period 2; otherwise play B in period 2.
  • Player 2: Play L in period 1; if Player 1 plays T in period 1, play M in period 2; otherwise play R in period 2.
• Both players’ equilibrium payoff is 8 + 5 = 13. (Check that there is no deviation that leaves either player better off.) In fact, these strategies constitute a subgame perfect Nash equilibrium.
• Because (M, M) and (B, R) are Nash equilibria of the one-shot game, playing them in the proper subgames is consistent with subgame perfection.
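The deviation check can be verified mechanically. Below is a minimal sketch (not from the original slides) that enumerates Player 1’s first-period deviations under the strategies above; the payoff dictionary mirrors the game matrix.

```python
# Payoffs are (row player, column player).
payoffs = {
    ("T", "L"): (8, 8), ("T", "M"): (0, 0), ("T", "R"): (1, 9),
    ("M", "L"): (0, 0), ("M", "M"): (5, 5), ("M", "R"): (0, 0),
    ("B", "L"): (9, 1), ("B", "M"): (0, 0), ("B", "R"): (3, 3),
}

# Equilibrium path: (T, L) in period 1, then the reward equilibrium (M, M);
# any first-period deviation is followed by the punishment equilibrium (B, R).
eq_payoff = payoffs[("T", "L")][0] + payoffs[("M", "M")][0]  # 8 + 5 = 13

for a1 in ["M", "B"]:  # Player 1's possible first-period deviations
    dev_payoff = payoffs[(a1, "L")][0] + payoffs[("B", "R")][0]
    print(a1, dev_payoff, "<" if dev_payoff < eq_payoff else ">=", eq_payoff)
# Best deviation: B, yielding 9 + 3 = 12 < 13, so Player 1 has no profitable
# deviation; Player 2's case is symmetric. Second-period deviations are
# unprofitable because (M, M) and (B, R) are stage-game Nash equilibria.
```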
The Repeated Prisoner’s Dilemma

• One of the most studied games is the repeated prisoner’s dilemma.
• Consider an application focused on trade policy.
• Suppose the world economy performs better when all nations agree to free trade, but that individual countries prefer to protect their domestic economies.
• Given this tension, how are free trade regimes sustained?
• One answer is that free trade can be supported as an equilibrium in a repeated game where a trade war begins whenever a major country defects from the trade agreement.
Free Trade

Free Trade Game

US / EU        Free Trade    Protect
Free Trade       10, 10      1, 12∗
Protect          12∗, 1      4∗, 4∗

• Obviously, if the game is played once, the unique Nash equilibrium is the strategy profile (Protect, Protect).
Free Trade

If it is played twice, then the strategy set for each player is:

{FFF, FFP, FPF, FPP, PFF, PFP, PPF, PPP}

where FFP means “play Free Trade in period 1; play Free Trade in period 2 if the other country plays Free Trade in period 1, otherwise play Protect.”

• Note that a complete, contingent plan conditions strategies on prior histories.
Normal Form

Two-Period Free Trade Game

US / EU   FFF     FFP     FPF     FPP     PFF     PFP     PPF     PPP
FFF      20,20   20,20   11,22   11,22   11,22   11,22    2,24    2,24
FFP      20,20   20,20   11,22   11,22   13,13   13,13    5,16    5,16
FPF      22,11   22,11   14,14   14,14   11,22   11,22    2,24    2,24
FPP      22,11   22,11   14,14   14,14   13,13   13,13    5,16    5,16
PFF      22,11   13,13   22,11   13,13   14,14    5,16   14,14    5,16
PFP      22,11   13,13   22,11   13,13   16,5     8,8    16,5     8,8
PPF      24,2    16,5    24,2    16,5    14,14    5,16   14,14    5,16
PPP      24,2    16,5    24,2    16,5    16,5     8,8    16,5     8,8
Nash Equilibria

Unlike the first example, repeating the game once does not achieve cooperation, as (PPP, PPP) is the only Nash equilibrium. This result can be generalized to any finite number of periods:

• In the last period, each country protects.
• This is known in the penultimate period. Thus, each country has an incentive to protect in this period as well.
• This process unravels until each country is protecting in every period.

Why could we induce first-period cooperation in the first example?

• Because first-period behavior helps coordinate between multiple equilibria in the second period.
Nash Equilibria

In the first example, the good equilibrium is used as a reward whereas the bad equilibrium is used as a punishment.

Because the Prisoner’s Dilemma has only one Nash equilibrium, it is impossible to encourage cooperation with the promise of coordinating on a good equilibrium or the threat of coordinating on a bad one.

We can generalize this result:

Proposition
If the stage game $G$ has a unique Nash equilibrium then, for any finite $T$, the repeated game $G(T)$ has a unique subgame-perfect outcome: the Nash equilibrium of $G$ is played in every stage.
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Payoffs in Repeated Games: Discounting

If we discount our payoffs, that means payoffs received today are more valuable than payoffs received in the future.

Why?
• Impatience, inflation, death, game form may change, preferences may change, game may end...
Simple Payoffs

So if the payoffs to the stage game on the equilibrium path are π1, π2, π3, ..., then the present value of this infinite series of payoffs is:

$$u_i(s_i, s_{-i}) = \pi_1 + \delta \pi_2 + \delta^2 \pi_3 + \cdots = \sum_{t=1}^{\infty} \delta^{t-1} \pi_t$$

Since the discount factor satisfies 0 ≤ δ < 1, this is a convergent geometric series (see next slide).

We will show that:

$$\sum_{t=1}^{\infty} \delta^{t-1} \pi = \frac{\pi}{1-\delta}$$
A Quick Convergence Proof

$$\begin{aligned}
y &= \pi + \delta \pi + \delta^2 \pi + \cdots \\
y &= \pi + \delta(\pi + \delta \pi + \delta^2 \pi + \cdots) \\
y &= \pi + \delta y \\
y - \delta y &= \pi \\
y(1-\delta) &= \pi \\
y &= \frac{\pi}{1-\delta}
\end{aligned}$$
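A quick numerical sanity check of the formula (a sketch; the values of π and δ are arbitrary):

```python
# Truncated geometric sum vs. the closed form pi / (1 - delta).
pi, delta = 10.0, 0.9
present_value = sum(delta ** (t - 1) * pi for t in range(1, 10_000))
print(present_value, pi / (1 - delta))  # both approximately 100.0
```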
Some Other Useful Information

• The continuation value is the payoff stream starting from some time τ onward and is given by:

$$\sum_{t=\tau}^{\infty} \delta^t \pi = \delta^\tau \frac{\pi}{1-\delta}$$

• Given the discount factor δ, the average payoff of the infinite sequence of payoffs π1, π2, π3, ... is:

$$(1-\delta) \sum_{t=1}^{\infty} \delta^{t-1} \pi_t$$

Since the average payoff is just a rescaling of the present value, maximizing the two quantities is equivalent.
Other Preliminaries

We are now ready to restate our familiar notions of strategies, Nash equilibrium, subgames, and subgame perfection in the context of infinitely repeated games.

But first, let’s define an infinitely repeated game.

Definition
Given a stage game $G$, let $G(\infty, \delta)$ denote the infinitely repeated game in which $G$ is repeated forever and the players share the discount factor $\delta$. For each $t$, the outcomes of the $t-1$ preceding plays of the stage game are observed before the $t$-th stage begins. Each player’s payoff in $G(\infty, \delta)$ is the present value of the player’s payoffs from the infinite sequence of stage games.
Strategies in Infinitely Repeated Games

Strategies are exactly the same as in finitely repeated games:

Definition
In the infinitely repeated game G (∞, δ), a player’s strategy
specifies the action the player will take in each stage, for each
possible history of play through the previous stage.

This is an infinite number of strategies for us to consider!

We won’t have to enumerate them all. Rather, we will consider


strategy types in our analysis.
Nash Equilibrium in Infinitely Repeated Games

Our general notion of Nash equilibrium also remains the same:


given the other players are playing their best response in every
period, player i has no incentive to unilaterally deviate from their
strategy in any period.

Subgames and Subgame Perfection

Definition
In the infinitely repeated game G (∞, δ), each subgame beginning
at stage t + 1 is identical to the original game G (∞, δ). As in
the finite-horizon case, there are as many subgames beginning at
stage t + 1 of G (∞, δ) as there are possible histories of play
through stage t.

Definition
A Nash equilibrium is subgame-perfect if the players’ strategies
constitute a Nash equilibrium in every subgame.

The One-Shot Deviation Principle

But how do we check every subgame in an infinitely repeated game? Here we are helped by the one-shot deviation principle.

It turns out that to find an SPNE, it suffices to compare playing your equilibrium strategy to any one-shot deviation of the form:

• Playing your equilibrium strategy up to period t − 1
• Deviating to something else in period t
• Returning to your equilibrium strategy in period t + 1

This allows us to consider the finitely many types of subgames where we might end up, and one-shot deviations at some arbitrary period t.
The One-Shot Deviation Principle

Definition (One-shot deviation principle)
A strategy profile of an extensive-form game is a subgame-perfect equilibrium (SPE) if and only if there is no profitable one-shot deviation for any player in any subgame.

In an infinite-horizon game where the discount factor is less than 1, a strategy profile is a subgame perfect equilibrium if and only if it satisfies the one-shot deviation principle.

Note: More broadly, Nash equilibria have no profitable one-shot deviations on their equilibrium paths, but may have profitable one-shot deviations off their equilibrium paths.
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Back to PD: The Grim Trigger

• Consider the following strategy: “Play Free Trade in every period until the other country protects. If the other country protects, then protect forever after.”
• This is known as the grim trigger strategy, because any failure to cooperate leads to the non-cooperative equilibrium in all future periods.
• Does (grim trigger, grim trigger) constitute a Nash equilibrium? Does it constitute a subgame perfect Nash equilibrium?
• The answer to both questions is yes, under some conditions.
Nash Equilibrium: Infinitely Repeated PD

Free Trade Game

US / EU        Free Trade    Protect
Free Trade       10, 10      1, 12
Protect          12, 1       4, 4

To show that (grim trigger, grim trigger) is a Nash equilibrium, we must show that neither player has a profitable deviation along the equilibrium path.

• If each country plays grim trigger, both receive 10 in every period
• If both countries discount the future at a common factor of δ, the total utility of this strategy is 10/(1 − δ)
Nash Equilibrium: Infinitely Repeated PD

Now let’s try to formulate the strongest possible alternative strategy for the US, given that the EU is playing grim trigger.

• Suppose the US defects in some period t.
• Since the EU is playing grim trigger, it protects forever starting in t + 1.
• Then the US should also protect forever starting in t + 1.
• Thus we consider the strategies:

        1  2  3  ...  t   t+1  t+2  ...
US      F  F  F  ...  P   P    P    ...
EU      F  F  F  ...  F   P    P    ...

Note: this is not a one-shot deviation; we’re not yet solving for SPNE. (What would a one-shot deviation look like for the US?)
Nash Equilibrium: Infinitely Repeated PD

• This strategy yields for the US a payoff of 12 in the first defection period (t) and a stream of 4 forever after.
• The US is better off playing grim trigger if and only if:

$$\begin{aligned}
\frac{10}{1-\delta} &\ge 12 + \frac{4\delta}{1-\delta} \\
10 &\ge 12(1-\delta) + 4\delta \\
8\delta &\ge 2 \\
\delta &\ge \frac{1}{4}
\end{aligned}$$

• Thus, as long as the players are sufficiently patient (δ is sufficiently large), both players playing the grim trigger strategy is a Nash equilibrium.
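A sketch that checks the threshold numerically, using the Free Trade Game payoffs from the slides:

```python
# Compare grim trigger to the best deviation for discount factors
# around the derived threshold delta = 1/4.
def grim_trigger_value(delta):
    return 10 / (1 - delta)              # free trade (10) every period

def deviation_value(delta):
    return 12 + 4 * delta / (1 - delta)  # 12 once, then 4 forever

for delta in [0.20, 0.25, 0.30]:
    print(delta, grim_trigger_value(delta) >= deviation_value(delta))
# Prints False at 0.20, True at 0.25 and 0.30, matching delta >= 1/4.
```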
Generalized Prisoner’s Dilemma

Generalized Prisoner’s Dilemma

1/2                Cooperate    Don’t cooperate
Cooperate            a, a            d, c
Don’t cooperate      c, d            b, b

where c > a > b > d. Using exactly the same arguments, the grim trigger strategy is a Nash equilibrium if and only if

$$\frac{a}{1-\delta} \ge c + \frac{\delta b}{1-\delta}.$$

Rearranging yields the condition:

$$\delta \ge \frac{c-a}{c-b}.$$
Generalized Prisoner’s Dilemma

Thus, cooperation is harder to sustain (requires a higher discount factor) when:

1. c is large relative to a and b
2. a and b are roughly equal

(Both push the critical value (c − a)/(c − b) toward 1.)
Generalized Prisoner’s Dilemma: SPNE

We have derived the condition on δ under which (grim trigger, grim trigger) is a Nash equilibrium. But is it an SPNE?

We now have to check every subgame, including ones off the equilibrium path. Fortunately, we only have to check one-shot deviations.

Only one additional type of subgame is relevant to us: one in which Player 1 has defected at time t − 1, which is off the equilibrium path. Then both players playing grim trigger would dictate:

            1  2  ...  t-1  t  t+1  t+2  ...
Player 1    C  C  ...   D   C   D    D   ...
Player 2    C  C  ...   C   D   D    D   ...
Generalized Prisoner’s Dilemma: SPNE

Is defecting forever a credible threat for Player 2?

• Her payoffs (starting from period t) from sticking to grim trigger are:

$$c + \delta b + \delta^2 b + \delta^3 b + \cdots = c + \frac{\delta b}{1-\delta}$$

• But what if she just turned a blind eye to Player 1’s defection and cooperated in period t instead?

            1  2  ...  t-1  t  t+1  t+2  ...
Player 1    C  C  ...   D   C   C    C   ...
Player 2    C  C  ...   C   C   C    C   ...

This would yield the payoff stream a/(1 − δ).
Generalized Prisoner’s Dilemma: SPNE

Player 2 is better off sticking to grim trigger if and only if:

$$\begin{aligned}
c + \frac{\delta b}{1-\delta} &\ge \frac{a}{1-\delta} \\
c(1-\delta) + \delta b &\ge a \\
\delta(b-c) &\ge a-c \\
\delta &\le \frac{a-c}{b-c}
\end{aligned}$$

where we have flipped the inequality because b − c < 0. Since (a − c)/(b − c) = (c − a)/(c − b), this is the reverse of the condition for Nash equilibrium. Thus, (grim trigger, grim trigger) is subgame perfect only under the knife-edge condition δ = (c − a)/(c − b), since the previously derived condition for Nash equilibrium also has to hold.
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Tit-for-Tat Strategies

• The grim trigger strategy is not the only equilibrium of the infinitely repeated prisoner’s dilemma that sustains the cooperative outcome.
• The grim trigger equilibrium may be undesirable because cooperation disappears forever following a single defection.
  • It is not robust to mistakes by the players.
  • Following a breakdown of cooperation, the players cannot renegotiate to return to the cooperative phase, something they clearly have an incentive to do.
• An alternative Nash equilibrium is based on “tit-for-tat” strategies of the form: “cooperate in the first period; then, in any subsequent period, play the action that the other player chose in the previous period.”
Tit-for-Tat Strategies: Subgame Perfection

There are two types of subgames to consider:

1. A subgame where, following cooperation in the previous period, both players are expected to cooperate in the future. This is the cooperation phase.
  • We compare the utility of tit-for-tat to the utility of a unilateral, one-period deviation to not cooperating, then returning to tit-for-tat.
2. A subgame following cooperation by one player and defection by the other. This is the punishment phase.
  • We compare the utility of tit-for-tat to the utility of not punishing and cooperating in the period immediately following defection, then returning to tit-for-tat.
Tit-for-Tat Strategies: Subgame Perfection

1. The cooperation phase. Consider Player 1:
  • Following tit-for-tat yields the present value utility a/(1 − δ).
  • A one-shot defection in the cooperation phase (while Player 2 adheres to tit-for-tat) looks like:

            1  2  ...  t-1  t  t+1  t+2  t+3  ...
Player 1    C  C  ...   C   D   C    D    C   ...
Player 2    C  C  ...   C   C   D    C    D   ...

This payoff stream yields the present value utility:

$$c + \delta d + \delta^2 c + \delta^3 d + \delta^4 c + \delta^5 d + \cdots = (c + \delta d) + \delta^2 (c + \delta d) + \delta^4 (c + \delta d) + \cdots = \frac{c + \delta d}{1-\delta^2}$$
Tit-for-Tat Strategies: Subgame Perfection

(Referring back to the generalized payoff matrix:)

Generalized Prisoner’s Dilemma

1/2                Cooperate    Don’t cooperate
Cooperate            a, a            d, c
Don’t cooperate      c, d            b, b

where c > a > b > d.

Thus, a deviation in the cooperation phase is unprofitable if and only if:

$$\frac{a}{1-\delta} \ge \frac{c + \delta d}{1-\delta^2}$$

Rearranging allows us to express this as a condition on the discount rate.
Tit-for-Tat Strategies

$$\begin{aligned}
\frac{a}{1-\delta} &\ge \frac{c + \delta d}{1-\delta^2} \\
a &\ge \frac{c + \delta d}{1+\delta} \\
(1+\delta)a &\ge c + \delta d \\
\delta a - \delta d &\ge c - a \\
\delta &\ge \frac{c-a}{a-d}
\end{aligned}$$
Subgame Perfection

2. The punishment phase. Again consider it from Player 1’s point of view:
  • After Player 2 has defected, both players playing tit-for-tat looks like:

            1  2  ...  t-1  t  t+1  t+2  t+3  ...
Player 1    C  C  ...   C   D   C    D    C   ...
Player 2    C  C  ...   D   C   D    C    D   ...

which, as above, yields the payoff stream (c + δd)/(1 − δ²).

Alternatively, Player 1 could cooperate in period t and get a in every period. Player 1 prefers to play tit-for-tat as long as:

$$\frac{c + \delta d}{1-\delta^2} \ge \frac{a}{1-\delta} \quad\Rightarrow\quad \delta \le \frac{c-a}{a-d}$$
Subgame Perfection

Thus, tit-for-tat is only a subgame perfect Nash equilibrium if:

1. $\delta \ge \frac{c-a}{a-d}$, and
2. $\delta \le \frac{c-a}{a-d}$

This is satisfied only when $\delta = \frac{c-a}{a-d}$, a very knife-edge condition.
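A sketch evaluating both conditions at illustrative payoffs c > a > b > d (the values 4, 3, 2, 1 are assumptions, not from the slides):

```python
# Both tit-for-tat conditions hold simultaneously only at the knife edge.
c, a, b, d = 4.0, 3.0, 2.0, 1.0
knife_edge = (c - a) / (a - d)  # = 0.5 here

def coop_ok(delta):    # no profitable deviation in the cooperation phase
    return a / (1 - delta) >= (c + delta * d) / (1 - delta ** 2)

def punish_ok(delta):  # punishing beats forgiving in the punishment phase
    return (c + delta * d) / (1 - delta ** 2) >= a / (1 - delta)

for delta in [0.3, knife_edge, 0.7]:
    print(delta, coop_ok(delta), punish_ok(delta))
# Below 0.5 only punish_ok holds, above 0.5 only coop_ok holds,
# and at delta = 0.5 both hold with equality.
```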
Subgame Perfect Tit for Tat (“Adjusted Tit-for-Tat”)

An alternative version of tit-for-tat avoids the problem of oscillation in the punishment phase. Milgrom, North and Weingast (1990) argue for the strategy:

• Start out playing Cooperate
• Always play Cooperate at time t unless these two conditions both hold:
  1. The other player defected in t − 1
  2. You cooperated in t − 2

This strategy punishes defection but does not punish the punisher, allowing players to get back to a cooperative equilibrium:

            1  2  ...  t-2  t-1  t  t+1  ...
Player 1    C  C  ...   C    C   D   C   ...
Player 2    C  C  ...   C    D   C   C   ...
Subgame Perfect Tit for Tat (“Adjusted Tit-for-Tat”)

Again, we need to check two subgames: the cooperation and punishment phases.

1. In a cooperation phase, the condition under which a player continues to cooperate is:

$$\frac{a}{1-\delta} \ge c + \delta d + \frac{\delta^2 a}{1-\delta}$$

2. In a punishment phase, we have to check that the defector wants to return to cooperating rather than defect another period:

$$d + \frac{\delta a}{1-\delta} \ge b + \delta d + \frac{\delta^2 a}{1-\delta}$$

and that the punisher is better off punishing:

$$c + \frac{\delta a}{1-\delta} \ge \frac{a}{1-\delta}$$

(which is true by definition, since c > a).
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Intermediate Punishment Strategies

• The grim trigger and tit-for-tat strategies represent just two of the possible strategies that sustain cooperative outcomes.
• These strategies can be generalized to include strategies that involve punishment phases of intermediate length.
Two Examples

1. Similar to grim trigger: Cooperate until your opponent defects. If your opponent defects, do not cooperate for the next k periods but then return to cooperation; if you defect, do not cooperate for the next k periods but then return to cooperation. Once you have returned to cooperation, cooperate until a defection occurs.

2. Similar to tit-for-tat: Cooperate until your opponent defects. If your opponent defects, do not cooperate for k periods. If she cooperates in any of the k periods, return to cooperation, ending the punishment phase. If she fails to cooperate in any period of the punishment phase, then the punishment phase starts over, i.e., do not cooperate for k more periods. If your own failure to cooperate caused the punishment phase, then cooperate during the punishment phase.
Strategy 1

• The payoff stream from defecting from mutual cooperation (and then defecting while you’re being punished) consists of the one-period gain from defecting, b for k periods, and an infinite stream of a beginning k + 1 periods in the future.
• Recall that the present value of an infinite payoff stream of π beginning at a future time τ is:

$$\frac{\delta^\tau \pi}{1-\delta}$$

• Thus the utility of defecting after cooperation is:

$$c + \frac{\delta - \delta^{k+1}}{1-\delta}\, b + \frac{\delta^{k+1}}{1-\delta}\, a.$$
Strategy 1

• Consequently, sustaining cooperation requires that:

$$\frac{a}{1-\delta} \ge c + \frac{\delta - \delta^{k+1}}{1-\delta}\, b + \frac{\delta^{k+1}}{1-\delta}\, a$$

Multiplying through by 1 − δ yields the condition:

$$a\left(1 - \delta^{k+1}\right) \ge (1-\delta)\, c + \left(\delta - \delta^{k+1}\right) b$$

• We cannot generate a closed form for the critical value of δ, but we can rewrite this expression as:

$$\delta > \frac{c-a}{c-b} + \delta^{k+1}\, \frac{a-b}{c-b}.$$

• The first term on the right side is the critical value for the grim trigger strategy, and the second term is positive for any k. Thus it is harder to sustain cooperation with a finite punishment phase. But if players make mistakes, this equilibrium may be preferred to the grim trigger.
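Although there is no closed form, the critical value is easy to find numerically. A sketch using plain bisection, with illustrative payoffs (the values 4, 3, 2 are assumptions):

```python
# Find the smaller root of  delta = A + B * delta**(k+1),
# where A = (c - a)/(c - b) and B = (a - b)/(c - b); delta = 1 always
# solves the equation, so we search strictly below 1.
c, a, b = 4.0, 3.0, 2.0

def critical_delta(k, tol=1e-10):
    A = (c - a) / (c - b)   # grim trigger threshold (the k -> infinity limit)
    B = (a - b) / (c - b)
    f = lambda delta: delta - A - B * delta ** (k + 1)
    lo, hi = A, 1.0 - 1e-9
    if f(hi) <= 0:          # no root below 1: this k cannot sustain cooperation
        return None
    while hi - lo > tol:    # bisection: f < 0 below the root, f > 0 above it
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return round((lo + hi) / 2, 6)

for k in [1, 2, 5, 20]:
    print(k, critical_delta(k))
# With these payoffs: k = 1 fails entirely, and the threshold falls from
# ~0.618 (k = 2) toward the grim trigger value 0.5 as k grows.
```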
Strategy 2

• A defection from the cooperation phase generates a payoff consisting of a one-period benefit c, a punishment payoff of d for k periods, and a return to cooperative payoffs a at the end of the punishment.
• Summing these up generates:

$$c + \frac{\delta - \delta^{k+1}}{1-\delta}\, d + \frac{\delta^{k+1}}{1-\delta}\, a.$$

• This payoff is lower than the payoff from defection in the adjusted tit-for-tat equilibrium ($c + \delta d + \frac{\delta^2 a}{1-\delta}$) by $\frac{\delta^2 - \delta^{k+1}}{1-\delta}(a-d)$. Thus, increasing the length of the punishment phase decreases the incentive to defect from the cooperative phase.
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
The Folk Theorem

• A common theme of these examples is that, so long as the agents are sufficiently patient, outcomes that are not Nash equilibria of static games may be Nash equilibria or subgame perfect equilibria of infinitely repeated games.
• In fact, any feasible payoff vector of an infinitely repeated game that satisfies individual rationality can be sustained as an SPNE, so long as agents are sufficiently patient.
• This result has been around a long time in many different forms, so it is called a folk theorem.
Individually Rational Payoffs

Definition
The payoff vector $v = (v_1, \ldots, v_i, \ldots, v_n)$ is individually rational if $v_i \ge \min_{s_{-i}} \{\max_{s_i} u_i(s_i, s_{-i})\}$ for each $i \in N$.

The value $\min_{s_{-i}} \{\max_{s_i} u_i(s_i, s_{-i})\}$ is the minimum stage game utility that player $i$ attains from any strategy profile in which she plays a best response to $s_{-i}$. This value is identified by letting players $-i$ select $s_{-i}$ so as to minimize the utility to $i$ of playing a best response to $s_{-i}$.
Feasibility

Definition
The payoff vector $v = (v_1, \ldots, v_i, \ldots, v_n)$ is feasible if there is some pure strategy profile $s$ such that for each $i \in N$, $u_i(s) = (1 - \delta_i)\, v_i$.

Recall that $(1-\delta)\, v_i$ can be understood as the discounted average of a stream of payoffs from a repeated game.

Alternatively, we call the payoffs in some stage game $G$ feasible if they are a convex combination (i.e., a weighted average) of the pure-strategy payoffs of $G$ (where the weights are non-negative and sum to one). We call this the convex hull of the pure-strategy payoffs.
Individual Rationality ∩ Feasibility

[Figure: the feasible set V, the convex hull of the pure-strategy payoffs (3,3), (0,4), (4,0), and (1,1), with the individually rational portion V^IR ∩ V shaded — the payoff vectors giving each player at least 1, the minmax value of this game.]

For the stage game:

1/2      c       d
c      3, 3    0, 4
d      4, 0    1, 1
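A sketch computing the minmax values for the stage game above, confirming that the individually rational region is bounded below by (1, 1):

```python
# Each player's minmax value: the opponent commits to the action that
# minimizes that player's best-response payoff.
actions = ["c", "d"]
u = {("c", "c"): (3, 3), ("c", "d"): (0, 4),
     ("d", "c"): (4, 0), ("d", "d"): (1, 1)}  # (row payoff, column payoff)

def minmax(i):
    def best_response_payoff(opp):
        profiles = ([(own, opp) for own in actions] if i == 0
                    else [(opp, own) for own in actions])
        return max(u[pr][i] for pr in profiles)
    return min(best_response_payoff(opp) for opp in actions)

print(minmax(0), minmax(1))  # 1 1
```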
The Folk Theorem

Theorem
For every feasible and individually rational payoff vector $v$ there is a vector of discount rates $\delta^0$ (i.e., one $\delta_i^0$ for each player) such that the payoff vector $v$ occurs in a Nash equilibrium of the repeated game if $\delta_i \ge \delta_i^0$ for all $i$.
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Bargaining Theory

• If political science is the study of “who gets what, when and how,” then bargaining theory lies at its foundation.
• Legislators and executives bargain over budgets and new legislation.
• States bargain to reach new international agreements and to settle crises (e.g., refugee resettlement, climate accords...).
• Political parties bargain over coalition governments.
• ...
Bargaining Model of War

Two states are in conflict over a unit good.

• The good is divisible: an area of territory or an allocation of resources.
• Country 1 presents country 2 with a proposal to share the resource, (x, 1 − x).
• Country 2 can accept this offer (leading to peace) or reject this offer (leading to war).
War in the Bargaining Model of War

The expected payoff to war depends on the probability that a country will win, the utility of victory and defeat, and the inefficiencies of fighting.

(Normalized) payoffs for war:
• victory = 1
• defeat = 0
• cost of war ci > 0
• pi, probability of i winning: p2 = 1 − p1
• ui(xi) is increasing and weakly concave (weakly risk-averse)

Expected utility of war to state i:

$$p_i(1) + (1 - p_i)(0) - c_i = p_i - c_i.$$
Extensive Form of One-Period Bargaining

[Game tree: Player 1 proposes x; Player 2 then chooses Accept or Reject.]

• Accept: payoffs (u1(x), u2(1 − x))
• Reject (war): payoffs (p − c1, 1 − p − c2), writing p for p1
Nash Equilibria of the Bargaining Game

What are the Nash equilibria of the one-shot bargaining game?

Anything goes.

• Pick any cutoff strategy for Player 2 (i.e., accept if x ≤ x̂ and reject otherwise), for a cutoff at which both sides prefer settling to war.
• Have Player 1 propose the cutoff value.
• This is an equilibrium.
Complete Information Equilibrium: Bargaining and War

Proposition
In the unique subgame perfect equilibrium of this game the
probability of war is zero.

Proof

• In the final stage, Player 2 will accept any offer such that:

$$u_2(1-x) \ge 1 - p - c_2.$$

• In the first stage, country 1 chooses among all the x. It knows that for any x, it gets:
  1. either u1(x) if Player 2 accepts
  2. or p − c1 if Player 2 rejects
• Of all the possible offers, the one that makes Player 1’s payoff largest satisfies:

$$u_2(1-x) = 1 - p - c_2 \quad\Rightarrow\quad x = 1 - u_2^{-1}(1 - p - c_2)$$

or, with linear utility in shares, x = p + c2.
• This offer is always accepted and there is no war in equilibrium.
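A worked numerical check of the proof with linear utility in shares (the parameter values are illustrative assumptions, not from the slides):

```python
# One-shot war-bargaining equilibrium under linear utility: x = p + c2.
p, c1, c2 = 0.5, 0.25, 0.25

x = p + c2                       # Player 1's equilibrium proposal
accepts = (1 - x) >= 1 - p - c2  # Player 2 weakly prefers accepting to war
surplus = x - (p - c1)           # Player 1's gain over fighting: c1 + c2
print(x, accepts, surplus)       # 0.75 True 0.5
```

Note how the proposer captures the entire surplus c1 + c2 that fighting would have destroyed.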
Things to Observe from Bargaining

• A conflict of interest is not sufficient for conflict.
• All else equal, stronger countries (higher p) get better deals.
• When preferences are known and “fixed,” bargaining produces efficient outcomes (Coase theorem).
• Bargaining power is a function of payoffs to war and the process by which agreements are reached.
What About Infinite-Horizon Bargaining Models?

But there are many questions one might have:

1. What if Player 2 got to make a counter-offer instead of rejecting?
2. What if players have varying degrees of patience when it comes to long, drawn-out bargaining?
3. Does bargaining ever end exogenously, and if not, what will agreements look like?
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Rubinstein Bargaining

• Suppose that two players try to decide how to divide $1.
• The players take turns making offers: Player 1 proposes in periods 0, 2, 4, etc., and Player 2 makes proposals in the other periods.
• The game continues (possibly infinitely) until a proposal is accepted by the other player.
• In each period that she is the proposer, Player 1 makes an offer (x1, x2), where x1 is Player 1’s share and x2 is Player 2’s share, with x1 + x2 ≤ 1.
• If Player 2 accepts, the game ends and the dollar is divided accordingly.
Rubinstein Bargaining

• If Player 2 rejects, then she gets to make an offer (x1, x2) with x1 + x2 ≤ 1, and the game continues if Player 1 rejects.
• To simplify matters, assume both players have linear utility functions u1(x1, x2) = x1 and u2(x1, x2) = x2.
• Each player has a discount factor δi; players value a proposal (x1, x2) accepted t periods in the future at (δ1^t x1, δ2^t x2).
• A strategy has to consist of (1) the offers you accept when the other player proposes, and (2) the offers you make when you are the proposer.
Subgame Perfect Equilibria

• Rubinstein shows that there is a unique SPNE to this game, based on playing the following strategies in every period:

1. Player 1 proposes $\left(\frac{1-\delta_2}{1-\delta_1\delta_2},\; \frac{\delta_2(1-\delta_1)}{1-\delta_1\delta_2}\right)$ and accepts Player 2’s offer if and only if $x_1 \ge \frac{\delta_1(1-\delta_2)}{1-\delta_1\delta_2}$.
2. Player 2 proposes $\left(\frac{\delta_1(1-\delta_2)}{1-\delta_1\delta_2},\; \frac{1-\delta_1}{1-\delta_1\delta_2}\right)$ and accepts Player 1’s offer if and only if $x_2 \ge \frac{\delta_2(1-\delta_1)}{1-\delta_1\delta_2}$.

• Prove that these strategies constitute a subgame perfect Nash equilibrium to the alternating offers bargaining game.
Computing the Equilibrium

• Let v1 and v2 be the utilities of Players 1 and 2 for subgames in which they are the proposer.
• Given that the postulated strategies are the same in every period, these values are independent of t.
• These are continuation values because they also reflect the utility of rejecting a proposal and moving to the next subgame.
Computing the Equilibrium

• Consider a subgame where Player 1 is the proposer.
• She must offer Player 2 at least δ2 v2, Player 2’s discounted continuation value.
• She keeps the rest for herself: x1 = 1 − δ2 v2. Because this offer is accepted, x1 is exactly Player 1’s continuation value: v1 = x1 = 1 − δ2 v2.
• Consider a subgame where Player 2 is the proposer. She must offer at least δ1 v1, so that v2 = 1 − δ1 v1.
• Now we simply solve this system of equations:

$$\begin{aligned}
v_1 = 1 - \delta_2 v_2 \quad&\text{and}\quad v_2 = 1 - \delta_1 v_1 \\
\Rightarrow\; v_2 &= 1 - \delta_1 (1 - \delta_2 v_2) \\
v_2 (1 - \delta_1 \delta_2) &= 1 - \delta_1 \\
v_2 &= \frac{1-\delta_1}{1-\delta_1\delta_2}
\end{aligned}$$
Computing the Equilibrium

Solving for v1,

$$v_1 = 1 - \delta_2\left(\frac{1-\delta_1}{1-\delta_1\delta_2}\right) = 1 - \frac{\delta_2 - \delta_1\delta_2}{1-\delta_1\delta_2} = \frac{1-\delta_2}{1-\delta_1\delta_2}$$

Thus we have:

$$v_1 = \frac{1-\delta_2}{1-\delta_1\delta_2} \quad\text{and}\quad v_2 = \frac{1-\delta_1}{1-\delta_1\delta_2}.$$

Plugging in these values yields the SPNE above:

• Player 1 proposes (v1, 1 − v1) and accepts if x1 ≥ δ1 v1
• Player 2 proposes (1 − v2, v2) and accepts if x2 ≥ δ2 v2
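A sketch verifying the continuation-value system numerically (the discount factors are illustrative):

```python
# Solve the two-equation system and confirm immediate, efficient agreement.
d1, d2 = 0.9, 0.8   # illustrative discount factors

v1 = (1 - d2) / (1 - d1 * d2)   # proposer value for Player 1
v2 = (1 - d1) / (1 - d1 * d2)   # proposer value for Player 2

assert abs(v1 - (1 - d2 * v2)) < 1e-12   # v1 = 1 - d2 * v2
assert abs(v2 - (1 - d1 * v1)) < 1e-12   # v2 = 1 - d1 * v1
print(v1, d2 * v2, v1 + d2 * v2)  # period-0 split: shares sum to 1.0
```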
Implications

• The model suggests a very simple path of play: in period zero, Player 1 proposes $\left(\frac{1-\delta_2}{1-\delta_1\delta_2},\; \frac{\delta_2(1-\delta_1)}{1-\delta_1\delta_2}\right)$, Player 2 accepts, and the game ends.
• Because the whole dollar is allocated and there is no delay, the subgame perfect Nash equilibrium is efficient.
• If both players have the same discount factor, there is a first-mover advantage because $\frac{1-\delta}{1-\delta^2} > \frac{\delta(1-\delta)}{1-\delta^2}$. Intuitively, because Player 2 discounts the future, Player 1 only needs to offer her a fraction of what she gets for being the proposer next period. Because both players are identical, Player 2 is getting only a fraction of what Player 1 gets.
Implications

• We can also compute a comparative static: how do equilibrium outcomes change as a function of key parameters of interest?
  • How does your equilibrium offer change as a function of your discount factor? Your opponent’s discount factor?
• To answer this question, simply take the first derivative of the equilibrium outcome with respect to the parameter of interest.

$$\frac{\partial}{\partial\delta_1}\left(\frac{1-\delta_2}{1-\delta_1\delta_2}\right) = \frac{(1-\delta_1\delta_2)(0) - (1-\delta_2)(-\delta_2)}{(1-\delta_1\delta_2)^2} = \frac{\delta_2 - \delta_2^2}{(1-\delta_1\delta_2)^2}$$

This is positive, so Player 1 extracts a bigger share of the dollar the more patient she is.
Implications

What about Player 1’s equilibrium share with respect to Player 2’s discount rate?

$$\frac{\partial}{\partial\delta_2}\left(\frac{1-\delta_2}{1-\delta_1\delta_2}\right) = \frac{(1-\delta_1\delta_2)(-1) - (1-\delta_2)(-\delta_1)}{(1-\delta_1\delta_2)^2} = \frac{\delta_1 - 1}{(1-\delta_1\delta_2)^2}$$

This is negative, so Player 1 extracts a smaller share of the dollar the more patient Player 2 is.

If δ1 = δ2 = δ, then both players’ shares converge to 1/2 as δ converges to 1. As both players become perfectly patient, they are less willing to accept offers that are less than what they can get as the proposer next period. In the limit, they demand exactly what they expect to get next period.
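Both derivatives can be verified symbolically, e.g. with sympy (a sketch; the printed expressions may appear in an algebraically equivalent form):

```python
import sympy as sp

d1, d2 = sp.symbols("delta1 delta2", positive=True)
share1 = (1 - d2) / (1 - d1 * d2)        # Player 1's equilibrium share

print(sp.simplify(sp.diff(share1, d1)))  # = d2*(1 - d2)/(1 - d1*d2)**2 > 0
print(sp.simplify(sp.diff(share1, d2)))  # = (d1 - 1)/(1 - d1*d2)**2    < 0
print(share1.subs({d1: sp.Symbol("d"), d2: sp.Symbol("d")})
            .limit(sp.Symbol("d"), 1))   # shares converge to 1/2
```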
Finitely Repeated Games

Infinitely Repeated Games


Discounting and Definitions
The Grim Trigger Strategy
Tit-for-Tat Strategy
Intermediate Punishment Strategies
Folk Theorem

Examples
Example 1: Bargaining Model of War
Example 2: Rubinstein Bargaining
Example 3: Bargaining Under a Closed Rule
Majority Rule Bargaining Under A Closed Rule

• A key feature of the Rubinstein model is that unanimous consent is required to reach an agreement on the allocation.
• This rules out a number of important political settings where only a simple or supermajority is required for agreement.
• Baron and Ferejohn (1989) have extended Rubinstein’s model to simple majority rule with more than two bargainers.
• Suppose that there are N (odd) players bargaining and any proposal requires n = (N + 1)/2 votes.
• Instead of assuming alternating offers, Baron and Ferejohn consider a bargaining protocol with a random recognition rule.
• According to this protocol, in each period, every player is chosen to make a proposal with equal probability (1/N).
Majority Rule Bargaining Under A Closed Rule

• We focus on bargaining under a closed rule, where the proposer makes a take-it-or-leave-it offer for the current legislative session.
• The proposer in each period makes an offer (x1, x2, ..., xN) such that xi is the share for player i.
• Feasibility requires that $\sum_i x_i \le 1$.
• If this proposal is rejected, the session ends, discounting occurs, and a new proposer is randomly chosen at the beginning of the next session.
• To simplify, we assume that each player has the same discount factor δ.
• This game has lots of subgame perfect equilibria. In fact, for large enough N and δ, there is an SPNE that can support any feasible division of the dollar.
Majority Rule Bargaining Under A Closed Rule

• These strategies require, however, that each player know the whole (possibly infinite) history of the game in order to know which actions are consistent with the prescribed punishment.
• Following Baron and Ferejohn, we analyze only stationary equilibria, meaning those in which:
  1. A proposer proposes the same division every time she is recognized, regardless of the history of the game.
  2. Voters vote only on the basis of the current proposal and expectations about future proposals, not on prior histories.
• Does there exist an equilibrium with this property?
Majority Rule Bargaining Under A Closed Rule

• Let vi be the continuation value (i.e., the discounted expected utility from playing the rest of the game) for player i.
• We focus on symmetric equilibria (in which every player is playing the same strategy), so that vi = v for all i.
• Any voter who gets xi ≥ δv votes in favor of the proposal, whereas any voter who receives less than δv votes against it.
• Given these voting strategies, an optimal proposer must propose:
  • δv to n − 1 other players
  • 0 to the rest of the other players
  • the remainder, z = 1 − (n − 1)δv, to herself
Majority Rule Bargaining Under A Closed Rule

• In this class of stationary symmetric equilibria, the proposer chooses her coalition partners randomly.
• The continuation value is then:

$$v = \underbrace{\frac{1}{N}\, z}_{\text{being proposer}} + \underbrace{\frac{n-1}{N}\, \delta v}_{\text{being in winning coalition}} + \underbrace{\frac{N-n}{N}\, (0)}_{\text{being left out}}$$

• Substituting for z and simplifying yields:

$$v = \frac{1 - (n-1)\delta v}{N} + \frac{(n-1)\delta v}{N} = \frac{1}{N}$$

• The continuation value is a proportional share of the dollar.
Majority Rule Bargaining Under A Closed Rule

Finally, given our solution for v, we have to compute the proposer’s share and make sure it makes the proposer better off than punting (i.e., making a proposal that won’t be accepted) to get to the next period.

Recalling that z = 1 − (n − 1)δv and plugging in v = 1/N,

$$z = 1 - (n-1)\frac{\delta}{N}$$

Let’s put n in terms of N to make the analysis clearer:

$$z = 1 - \left(\frac{N+1}{2} - 1\right)\frac{\delta}{N} = 1 - \frac{(N-1)\delta}{2N}$$
Majority Rule Bargaining Under A Closed Rule

For z to be optimal for the proposer, we must check that:

$$\begin{aligned}
1 - \frac{(N-1)\delta}{2N} &\ge \frac{\delta}{N} \\
1 &\ge \frac{2\delta + (N-1)\delta}{2N} \\
2N &\ge \delta(N+1) \\
(2-\delta)N &\ge \delta \\
N &\ge \frac{\delta}{2-\delta}
\end{aligned}$$

The right-hand side is maximized at 1 when δ = 1. Thus, this condition is always satisfied for more than one player.
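A sketch computing the stationary equilibrium quantities for an illustrative legislature and confirming the optimality condition:

```python
# Closed-rule majority bargaining: continuation value, proposer share,
# and the check that proposing beats punting. N and delta are assumptions.
N, delta = 101, 0.95          # legislature size (odd) and patience
n = (N + 1) // 2              # simple majority

v = 1 / N                           # stationary continuation value
z = 1 - (n - 1) * delta * v         # proposer's own share
proposal_power = z - delta * v      # premium over waiting for next session
print(round(z, 4), round(proposal_power, 4), z >= delta * v)
# ~0.5297 ~0.5203 True: the proposer keeps over half the dollar.
```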
Some Takeaways

• Because v is also the expected utility of the game, this result implies that bargaining is efficient: the sum of player utilities is maximized.
• We can compute a measure of proposal power: the difference between the utility of being the proposer (z) and the discounted continuation value (δv):

$$\pi = z - \delta v = 1 - \frac{(N-1)\delta}{2N} - \frac{\delta}{N} = 1 - \frac{(N+1)\delta}{2N}$$

Comparative statics: How does proposal power vary with δ and N?
Comparative Statics

Proposal power decreases as players become more patient:

$$\frac{\partial\pi}{\partial\delta} = -\frac{N+1}{2N}$$

What about as N grows large?

$$\frac{\partial\pi}{\partial N} = -\frac{2N\delta - 2(N+1)\delta}{4N^2} = \frac{2\delta}{4N^2} = \frac{\delta}{2N^2}$$

Since this is positive, proposal power grows as the size of the legislature increases.
Supermajority Rule (Try this at home)

• Now assume that k > n votes are required.
• Repeating the steps above, you can easily derive the proposer’s share as:

$$z = 1 - (k-1)\delta v$$

and the continuation values as:

$$v = \frac{z}{N} + \frac{k-1}{N}\, \delta v$$

• Algebra reveals that once again $v = \frac{1}{N}$.
• The proposer’s equilibrium share is now lowered to $z = 1 - \frac{\delta(k-1)}{N}$ (from $z = 1 - \frac{\delta(n-1)}{N}$). Thus, going from majority to supermajority rule mitigates the proposer’s advantage.
Implications for Institutional Design

• Proposal power can be interpreted as committee membership → how does the game change when proposal power is non-random?
• What features control proposal power?
  • Bigger legislature → more proposal power
  • Supermajoritarian rules → less proposal power
  • The probability of being recognized is another lever
• Do we want to increase or decrease proposal power from a normative standpoint?
  • More unequal offers may be less fair
  • But high payoffs to committee membership may incentivize good policy
MIT OpenCourseWare
https://ocw.mit.edu

17.810 / 17.811 Game Theory


Spring 2021

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.
