Pset 4
Due on 11/15. If you are working with a partner, you and your partner may turn in a single copy
of the problem set. Please show your work and acknowledge any additional resources consulted.
Questions marked with an (∗) are intended for math-and-game-theory-heads who are interested
in deeper, formal exploration, perhaps as preparation for grad school. The questions typically
demonstrate the robustness of the results from class or other problems, and the answers do not
change the interpretation of those results. Moreover, this material will not play a large role on the
exam and tends to be worth relatively little on the problem sets. Some folks might consequently
prefer to skip these problems.
                Cooperate          Defect
Cooperate    (b − c, b − c)       (−c, b)
Defect       (b, −c)              (0, 0)
The repeated prisoner’s dilemma¹ is built out of several stages, each of which is a copy of the
above game. At the end of each stage, the two players repeat the prisoner’s dilemma again with
probability δ, where 0 ≤ δ ≤ 1. A strategy in the repeated prisoner’s dilemma is a rule which
determines whether a player will cooperate or defect in each given stage. This rule may depend on
which round it is, and on either player’s actions in previous rounds.
For example, the grim trigger strategy is described by the following rule: cooperate if both
players have never defected, and defect otherwise. The goal of this problem is to show that the
strategy pair in which both players play grim trigger is a Nash equilibrium if δ > c/b.
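If it helps to see the rule in action, here is a minimal Python sketch of the repeated game with both players following grim trigger. It is illustrative only: the numeric values of b and c and all function names are assumptions, not part of the problem.

    import random

    # Stage-game payoffs: cooperating costs the cooperator c and gives the partner b.
    B, C_COST = 3.0, 1.0              # illustrative values, not from the problem

    def stage_payoffs(a1, a2):
        """Payoffs (to player 1, player 2) for one stage; actions are 'C' or 'D'."""
        p1 = (B if a2 == 'C' else 0.0) - (C_COST if a1 == 'C' else 0.0)
        p2 = (B if a1 == 'C' else 0.0) - (C_COST if a2 == 'C' else 0.0)
        return p1, p2

    def grim_trigger(history):
        """Cooperate iff neither player has ever defected."""
        return 'C' if all(a == ('C', 'C') for a in history) else 'D'

    def play_repeated_game(strat1, strat2, delta, rng):
        """After each stage, continue to another stage with probability delta."""
        history, total1, total2 = [], 0.0, 0.0
        while True:
            a1, a2 = strat1(history), strat2(history)
            p1, p2 = stage_payoffs(a1, a2)
            total1 += p1
            total2 += p2
            history.append((a1, a2))
            if rng.random() > delta:   # the game ends
                return total1, total2

    print(play_repeated_game(grim_trigger, grim_trigger, delta=0.9, rng=random.Random(0)))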
(a) Suppose that player 1 and player 2 are both following the grim trigger strategy. What actions
will be played in each stage of the repeated game? What are the payoffs to players 1 and 2 in
each stage?
(b) Using your result from part a, write down the expected payoff to player 1 from the entire
repeated prisoner’s dilemma in terms of c, b, and δ.
Hint: Remember that, if |δ| < 1:
a + aδ² ... wait
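As a quick sanity check on this identity (not part of the problem), a few lines of Python with arbitrary values of a and δ:

    # Verify numerically that a + a*d + a*d**2 + ... = a / (1 - d) when |d| < 1.
    a, d = 2.0, 0.9                              # arbitrary illustrative values
    partial_sum = sum(a * d**k for k in range(1000))
    print(partial_sum, a / (1 - d))              # both are approximately 20.0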
¹ Please consult Section 5 of the Game Theory handout on Repeated Games for details.
(c) Now we will check whether player 1 can improve his payoff by deviating from the grim trigger
strategy. Argue that we only need to check the case where player 1 plays all-D, that is, player
1 defects in every round.
(d) Suppose that player 2 plays grim trigger and player 1 deviates from grim trigger and plays
all-D. What is the total payoff to player 1 from the entire repeated prisoner’s dilemma?
(e) For grim trigger to be a Nash equilibrium, we need that the payoff to player 1 from playing
grim trigger is greater than or equal to the payoff to player 1 from playing all-D, assuming
player 2’s strategy is fixed.
Using your results from parts b and d, write down an inequality that must be satisfied in order
for grim trigger to be a Nash equilibrium. Simplify this inequality to obtain the condition
δ > c/b.
(f) (∗) - 10 points. Show that the Grim Trigger is a Subgame Perfect equilibrium in addition to
being a Nash equilibrium [Hint: use the one-stage deviation principle]. For a formal discussion
of subgame perfection, see the Game Theory Handout.
So far we have focused on the Grim Trigger because it is a relatively simple strategy to
understand, but not necessarily because we think it is used in practice. Importantly, many of the
insights we have learned from studying the Grim Trigger generalize to any Nash equilibrium.
(g) (∗) - 10 points. Show that in any Nash equilibrium in which both players play C in every period,
player 2 must cooperate less in the future if player 1 were to deviate and play D in some period
instead of C. Interpret this result in terms of ‘reciprocity,’ as discussed in lecture.
We now show that cooperation cannot be sustained in equilibrium when δ < c/b.
(a) Suppose that the strategy pair (s1, s2) is a Nash equilibrium, and let U1(s1, s2) and U2(s1, s2)
be the payoffs to players 1 and 2, respectively. Show that U1(s1, s2) ≥ 0 and U2(s1, s2) ≥ 0.
(b) Notice that, in each round of the prisoner’s dilemma, the sum of the payoffs to players 1 and
2 is either 2(b − c), b − c, or 0. Show that, if s1 and s2 are any two strategies, then
U1(s1, s2) + U2(s1, s2) ≤ 2(b − c)/(1 − δ).
(c) Now assume δ < c/b. Using your results from part b, show that U1(s1, s2) + U2(s1, s2) < 2b for
any strategy pair (s1, s2). Use this to conclude that, if (s1, s2) is a Nash equilibrium, at least
one player receives total payoff less than b.
(d) Suppose that, when players 1 and 2 play s1 and s2 , both players cooperate in some round k.
Without loss of generality, we may assume that k = 1 (otherwise we apply the argument from
parts a-c to the subgame starting at round k, introducing a factor of δ^(k−1)). Using your result
from part c, show that one of the players can improve his payoff by deviating.
(e) Next we need to rule out the possibility of a round in which one player cooperates and the
other defects. Repeat the argument of part b, using the additional result, established in part d,
that players 1 and 2 never simultaneously cooperate (so the sum of their payoffs in a given round
is either b − c or 0). Show that U1(s1, s2) + U2(s1, s2) ≤ (b − c)/(1 − δ).
(f) Again assume that δ < c/b. Use your results from parts a and e to conclude that each player’s
payoff is less than b; that is, U1(s1, s2) < b and U2(s1, s2) < b.
(g) Now suppose that, in the first round, player 1 cooperates and player 2 defects. By your
reasoning from part (f), player 2 receives total payoff less than b. Show that player 2 can
improve his payoff by deviating, so that (s1 , s2 ) is not a Nash equilibrium.
Using this proof by contradiction, you have shown that a strategy pair (s1, s2) which involves
cooperation in any period cannot be a Nash equilibrium if δ < c/b. It follows that (all-D, all-D) is
the only equilibrium in this case.
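As a numeric illustration of the δ = c/b cutoff (the payoff values below are assumed, not from the problem): when δ < c/b, the total value of cooperating forever, (b − c)/(1 − δ), falls below b, the payoff a player can secure in a single round of defection against a cooperator.

    # Compare (b - c)/(1 - delta) with b on either side of delta = c/b.
    b, c = 3.0, 1.0                          # assumed values; threshold c/b = 1/3
    for delta in (0.2, 0.5):                 # one value below and one above the threshold
        forever_cooperate = (b - c) / (1 - delta)
        print(delta, forever_cooperate, forever_cooperate >= b)
    # Output: 0.2 -> 2.5 (< 3), 0.5 -> 4.0 (>= 3)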
“Shunners contribute to the collective action and then try to help those needy indi-
viduals who have good reputations during the mutual aid game, but mistakenly fail
owing to errors with probability e. . . Shunners never help needy recipients who are in
bad standing.”
(a) The authors argue that (shunner, shunner, . . . , shunner) is an equilibrium iff
((n − 1)/n) · ((1 − e)/(1 − w)) · (b − c)(1 − we) > C. Argue that this is the case, even if one
considers deviations to all possible
c)(1 − we) > C. Argue that this is the case, even if one considers deviations to all possible
strategies in this game, and not just to the two other strategies described by the authors. To
simplify the math, feel free to set e = 0.
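For intuition, here is a quick numeric evaluation of the condition in part (a) in Python; the parameter values are made up for illustration and are not taken from the reading.

    # Left-hand side of ((n-1)/n) * ((1-e)/(1-w)) * (b - c) * (1 - w*e) > C,
    # evaluated at made-up parameter values.
    n, b, c, C, w, e = 10, 2.0, 1.0, 0.5, 0.9, 0.05
    lhs = ((n - 1) / n) * ((1 - e) / (1 - w)) * (b - c) * (1 - w * e)
    print(lhs, lhs > C)                      # roughly 8.17, so the condition holds here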
(b) (∗) When is contribution to the public good sustained as part of a Nash equilibrium? What
property must any strategy that sustains contributions to the public good have?
(a) A state-dependent strategy is one in which a player chooses one action in some states of the
world, and another in other states. Identify a valid, state-dependent strategy for player 1. Do
         L    R
    L    a    b
    R    c    d

where a > c and d > b, and p = (d − b)/((d − b) + (a − c)).
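As an aside, the following Python lines (with made-up payoffs satisfying a > c and d > b, and reading the table as the row player’s payoffs) illustrate what p is: the probability with which the opponent must play L to make the row player exactly indifferent between L and R.

    # Indifference check for p = (d - b) / ((d - b) + (a - c)).
    a, b, c, d = 3.0, 0.0, 1.0, 2.0          # made-up values with a > c, d > b
    p = (d - b) / ((d - b) + (a - c))        # here p = 0.5
    payoff_L = p * a + (1 - p) * b           # expected payoff of L if opponent plays L w.p. p
    payoff_R = p * c + (1 - p) * d           # expected payoff of R
    print(p, payoff_L, payoff_R)             # 0.5 1.5 1.5 -- indifferent at p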
[Figure: states of the world with prior probabilities q, r, s, and t.]
(b) Consider the strategy pair where player 1 plays “A when red and B when blue” (a.k.a. “A iff
red”) and player 2 plays “B when green and A when orange” (a.k.a. “A iff orange”). What
is each player’s expected payoff?
(c) A Bayesian Nash equilibrium (BNE) is simply a pair of strategies such that neither player gains
a higher expected payoff by unilaterally deviating to a different strategy. This can easily be
checked by showing that for neither player is there a color such that, by playing a different action
when seeing that color, the player would receive a higher payoff. Show that when r/(r + s) < p and
r/(q + r) < 1 − p, the strategy pair “A iff red; A iff green” is a BNE. Note: when at least one
of the players is playing a state-dependent strategy in equilibrium, we call this an equilibrium
with state-dependent strategies (ESDS).
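Since the figure for this game is not reproduced above, here is a sketch of the color-by-color check in Python under one information structure consistent with the conditions in part (c): three states with prior probabilities q, r, s; player 1 sees red in the first state and blue in the other two; player 2 sees green in the first two states and orange in the third. This structure and the numbers below are assumptions for illustration, not the problem’s actual figure.

    # Color-by-color check of "A iff red; A iff green" under the ASSUMED structure
    # described above. A is a best response iff the opponent plays A with
    # posterior probability at least p.
    q, r, s, p = 0.4, 0.1, 0.5, 0.5          # made-up numbers satisfying the conditions

    # Player 1 when blue: player 2 plays A (sees green) only in the middle state.
    belief_1_blue = r / (r + s)
    # Player 2 when green: player 1 plays A (sees red) only in the first state.
    belief_2_green = q / (q + r)

    print(belief_1_blue < p)       # True -> playing B when blue is optimal for player 1
    print(belief_2_green >= p)     # True -> playing A when green is optimal for player 2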
In this problem, the players again play the coordination game presented in Fig. 2. That is, first,
nature randomly draws a ball according to the prior probabilities. Each player sees the color that
their own partition assigns to that state.
Then, each player plays an action in the coordination game and receives payoffs according to their
actions, as presented in the payoff matrix.
(a) In this example, at the rightmost state both players know that the state is not the leftmost
state, and vice versa. Assume r/(r + s) > p and s/(s + t) > p. Show that there is no ESDS where both play
A in the leftmost state and B in the rightmost state. To do this:
(i) Suppose that both play A in the leftmost state. Player 2 must play A one state to the
right, too, since this state is also green. Show that player 1 will maximize his payoffs by
playing A when blue.
(ii) Show that player 2 will then maximize his payoffs by playing A when orange.
(iii) Argue that player 1 will maximize his payoffs by playing A when yellow.
(b) What is the relationship to higher order beliefs? To answer this, first suppose that 2 plays A
when he sees green. Then answer each of the following:
(i) Suppose player 2 sees orange. What does player 2 think about what 1 sees?
(ii) What does player 2 think that 1 thinks about what 2 sees?
(iii) What does player 2 think that 1 thinks that 2 will do?
(iv) What does player 2 think that 1 will do in response?
(v) How should player 2 respond?