Pset 4

Problem Set 4

GTSB Fall 2015

Due on 11/15. If you are working with a partner, you and your partner may turn in a single copy
of the problem set. Please show your work and acknowledge any additional resources consulted.
Questions marked with an (∗) are intended for math-and-game-theory-heads who are interested
in deeper, formal exploration, perhaps as preparation for grad school. The questions typically
demonstrate the robustness of the results from class or other problems, and the answers do not
change the interpretation of those results. Moreover, this material will not play a large role on the
exam and tends to be worth relatively little on the problem sets. Some folks might consequently
prefer to skip these problems.

1 Grim Trigger in the Repeated Prisoner’s Dilemma (70 points)


In one instance of the prisoner’s dilemma, each player chooses whether to pay some cost c > 0 in
order to confer a benefit b > c onto the other player. The payoffs from a single iteration of this
prisoner’s dilemma are therefore:

             Cooperate        Defect
Cooperate    (b − c, b − c)   (−c, b)
Defect       (b, −c)          (0, 0)

The repeated prisoner’s dilemma [1] is built out of several stages, each of which is a copy of the
above game. At the end of each stage, the two players repeat the prisoner’s dilemma again with
probability δ, where 0 ≤ δ ≤ 1. A strategy in the repeated prisoner’s dilemma is a rule which
determines whether a player will cooperate or defect in each given stage. This rule may depend on
which round it is, and on either player’s actions in previous rounds.
For example, the grim trigger strategy is described by the following rule: cooperate if neither
player has ever defected, and defect otherwise. The goal of this problem is to show that the
strategy pair in which both players play grim trigger is a Nash equilibrium if δ > c/b.
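
As a rough illustration of what "a rule that may depend on previous rounds" looks like, the sketch below encodes strategies as Python functions of the two action histories and adds up stage payoffs, weighting round t by the probability δ^t that it is reached. The function names, the truncation horizon, and the values b = 3, c = 1, δ = 0.5 are assumptions chosen for illustration; they are not part of the problem set.

```python
# Illustrative sketch only: names, horizon, and parameter values are assumptions.

def grim_trigger(own_history, other_history):
    """Cooperate ("C") unless someone defected ("D") in an earlier round."""
    return "D" if "D" in own_history or "D" in other_history else "C"

def all_d(own_history, other_history):
    """Defect in every round."""
    return "D"

def stage_payoff(own_action, other_action, b, c):
    """Pay c when cooperating; receive b when the other player cooperates."""
    return (b if other_action == "C" else 0.0) - (c if own_action == "C" else 0.0)

def expected_payoffs(strategy1, strategy2, b, c, delta, horizon=200):
    """Expected total payoffs, truncated after `horizon` rounds.

    Round t (t = 0, 1, ...) is reached with probability delta**t.
    """
    history1, history2 = [], []
    total1 = total2 = 0.0
    for t in range(horizon):
        action1 = strategy1(history1, history2)
        action2 = strategy2(history2, history1)
        weight = delta ** t
        total1 += weight * stage_payoff(action1, action2, b, c)
        total2 += weight * stage_payoff(action2, action1, b, c)
        history1.append(action1)
        history2.append(action2)
    return total1, total2

b, c, delta = 3.0, 1.0, 0.5
both_grim = expected_payoffs(grim_trigger, grim_trigger, b, c, delta)
deviate = expected_payoffs(all_d, grim_trigger, b, c, delta)
print(both_grim, deviate)
```

Changing `delta` and rerunning gives a quick numerical check on the closed-form answers you derive by hand.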

(a) Suppose that player 1 and player 2 are both following the grim trigger strategy. What actions
will be played in each stage of the repeated game? What are the payoffs to players 1 and 2 in
each stage?

(b) Using your result from part a, write down the expected payoff to player 1 from the entire
repeated prisoner’s dilemma in terms of c, b, and δ.
Hint: Remember that, if |δ| < 1:

a + aδ + aδ² + aδ³ + . . . = a/(1 − δ)

[1] Please consult Section 5 of the Game Theory handout on Repeated Games for details.

(c) Now we will check whether player 1 can improve his payoff by deviating from the grim trigger
strategy. Argue that we only need to check the case where player 1 plays all-D, that is, player
1 defects in every round.

(d) Suppose that player 2 plays grim trigger and player 1 deviates from grim trigger and plays
all-D. What is the total payoff to player 1 from the entire repeated prisoner’s dilemma?

(e) For grim trigger to be a Nash equilibrium, we need that the payoff to player 1 from playing
grim trigger is greater than or equal to the payoff to player 1 from playing all-D, assuming
player 2’s strategy is fixed.
Using your results from parts b and d, write down an inequality that must be satisfied in order
for grim trigger to be a Nash equilibrium. Simplify this inequality to obtain the condition
δ > c/b.

(f) (∗) - 10 points. Show that the Grim Trigger is a Subgame Perfect equilibrium in addition to
being a Nash equilibrium [Hint: use the one-stage deviation principle]. For a formal discussion
of subgame perfection, see the Game Theory Handout.
So far we have focused on the Grim Trigger because it is a relatively simple strategy to
understand, but not necessarily because we think it is used in practice. Importantly, many of the
insights we have learned from studying the Grim Trigger generalize to any Nash equilibrium.

(g) (∗) - 10 points. Show that in any Nash equilibrium in which both players play C at each period,
player 2 must cooperate less in the future if player 1 were to deviate and play D at any period
instead of C. Interpret this result in terms of ‘reciprocity,’ as discussed in lecture.
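
The geometric-series hint in part (b) and the comparison in part (e) can be spot-checked numerically. The sketch below is only a sanity check under assumed values (a = 2, b = 4, c = 1, and a grid of 1000 values of δ); the function names are hypothetical. It compares a term-by-term sum with a/(1 − δ), then scans for the smallest grid value of δ at which receiving b − c in every round is worth at least a single payoff of b.

```python
# Illustrative sketch only: function names and parameter values are assumptions.

def partial_geometric_sum(a, delta, terms):
    """a + a*delta + ... + a*delta**(terms - 1), summed term by term."""
    return sum(a * delta ** t for t in range(terms))

def closed_form(a, delta):
    """Limit of the series when |delta| < 1."""
    return a / (1 - delta)

def smallest_supporting_delta(b, c, steps=1000):
    """Smallest grid value of delta at which a stream of (b - c) per round
    is worth at least a one-off payoff of b followed by zeros."""
    for i in range(steps):
        delta = i / steps
        if closed_form(b - c, delta) >= b:
            return delta
    return None

series_value = partial_geometric_sum(2.0, 0.5, 60)
limit_value = closed_form(2.0, 0.5)
threshold = smallest_supporting_delta(b=4.0, c=1.0)
print(series_value, limit_value, threshold)
```

With these numbers c/b = 0.25, so the scan should stop at or very near 0.25. The weak inequality used in the scan mirrors the "greater than or equal to" wording of part (e).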

2 No Cooperation for Small δ (50 points)


In lecture, we argued that cooperative equilibria exist in the repeated prisoner’s dilemma if and
only if δ > c/b. In problem 1, you showed that we can have a Nash equilibrium in which both players
always cooperate (specifically, the equilibrium in which both players play grim trigger) if δ > c/b. In
this problem, we will show that if δ < c/b, then the only Nash equilibrium is (all-D, all-D). That is,
cooperative equilibria exist only if δ > c/b. Combined, your responses to these two questions thus
provide a complete proof of our claim from lecture.

(a) Suppose that the strategy pair (s1, s2) is a Nash equilibrium, and let U1(s1, s2) and U2(s1, s2)
be the payoffs to players 1 and 2, respectively. Show that U1(s1, s2) ≥ 0 and U2(s1, s2) ≥ 0.

(b) Notice that, in each round of the prisoner’s dilemma, the sum of the payoffs to players 1 and
2 is either 2(b − c), b − c, or 0. Show that, if (s1, s2) is any strategy pair, then
U1(s1, s2) + U2(s1, s2) ≤ 2(b − c)/(1 − δ).

(c) Now assume δ < c/b. Using your results from part b, show that U1(s1, s2) + U2(s1, s2) < 2b for
any strategy pair (s1, s2). Use this to conclude that, if (s1, s2) is a Nash equilibrium, at least
one player receives total payoff less than b.

(d) Suppose that, when players 1 and 2 play s1 and s2, both players cooperate in some round k.
Without loss of generality, we may assume that k = 1 (otherwise we apply the argument from
parts a-c to the subgame starting at round k, introducing a factor of δ^(k−1)). Using your result
from part c, show that one of the players can improve his payoff by deviating.

(e) Next we need to rule out the possibility of a round in which one player cooperates and the
other defects. Repeat the argument of part b using the additional result that players 1 and 2
never simultaneously cooperate (so the sum of their payoffs in a given round is either b − c or
0). Show that U1(s1, s2) + U2(s1, s2) ≤ (b − c)/(1 − δ).

(f) Again assume that δ < c/b. Use your results from parts a and e to conclude that each player’s
payoff is less than b; that is, U1(s1, s2) < b and U2(s1, s2) < b.

(g) Now suppose that, in the first round, player 1 cooperates and player 2 defects. By your
reasoning from part (f), player 2 receives total payoff less than b. Show that player 2 can
improve his payoff by deviating, so that (s1, s2) is not a Nash equilibrium.

Using this proof by contradiction, you have shown that a strategy pair (s1, s2) which involves
cooperation in any period cannot be a Nash equilibrium if δ < c/b. It follows that (all-D, all-D) is
the only equilibrium in this case.
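
The bounds in parts (b) and (c) can be spot-checked by brute force. The sketch below draws random paths of action pairs, computes both players' total payoffs with round t weighted by δ^t, and compares the largest joint payoff found with 2(b − c)/(1 − δ) and with 2b. The sampling scheme, the function names, and the values b = 3, c = 2, δ = 0.5 (so that δ < c/b) are assumptions; a random search illustrates the inequality but does not prove it.

```python
import random

# Illustrative sketch only: names, sample sizes, and parameter values are assumptions.

def total_payoffs(path, b, c, delta):
    """Total payoffs for a fixed path of action pairs, e.g. [("C", "D"), ...]."""
    u1 = u2 = 0.0
    for t, (a1, a2) in enumerate(path):
        weight = delta ** t
        u1 += weight * ((b if a2 == "C" else 0.0) - (c if a1 == "C" else 0.0))
        u2 += weight * ((b if a1 == "C" else 0.0) - (c if a2 == "C" else 0.0))
    return u1, u2

def max_joint_payoff(b, c, delta, samples=2000, rounds=50, seed=0):
    """Largest U1 + U2 found over randomly drawn action paths."""
    rng = random.Random(seed)
    best = 0.0  # the all-D path yields a joint payoff of 0
    for _ in range(samples):
        path = [(rng.choice("CD"), rng.choice("CD")) for _ in range(rounds)]
        u1, u2 = total_payoffs(path, b, c, delta)
        best = max(best, u1 + u2)
    return best

b, c, delta = 3.0, 2.0, 0.5  # delta < c/b = 2/3
bound = 2 * (b - c) / (1 - delta)
observed = max_joint_payoff(b, c, delta)
always_cooperate = sum(total_payoffs([("C", "C")] * 50, b, c, delta))
print(observed, always_cooperate, bound, 2 * b)
```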

3 Panchanathan and Boyd (2004)


Recall the model presented in Panchanathan and Boyd (2004):

“. . . we consider a large population subdivided into randomly formed social groups of
size n. Social life consists of two stages. First, individuals decide whether or not
to contribute to a one-shot collective action game at a net personal cost C in order to
create a benefit B shared equally amongst the n−1 other group members, where B > C.
Second, individuals engage in a multi-period ‘mutual aid game’. . . In each period of the
mutual aid game, one randomly selected individual from each group is ‘needy’. Each of
his n − 1 neighbours can help him an amount b at a personal cost c, where b > c > 0.
Each individual’s behavioural history is known to all group members. This assumption
is essential because it is known that indirect reciprocity cannot evolve when information
quality is poor. The mutual aid game repeats with probability w and terminates with
probability 1 − w, thus lasting for 1/(1 − w) periods on average.”

Recall, also, the “shunner” strategy:

Figure 1: An Abstract Information Structure (three states with priors q, r, and s)

“Shunners contribute to the collective action and then try to help those needy indi-
viduals who have good reputations during the mutual aid game, but mistakenly fail
owing to errors with probability e. . . Shunners never help needy recipients who are in
bad standing.”
(a) The authors argue that (shunner, shunner, . . . , shunner) is an equilibrium iff
((n − 1)/n) · ((1 − e)/(1 − w)) · (b − c)(1 − we) > C. Argue that this is the case, even if one
considers deviations to all possible strategies in this game, and not just to the two other
strategies described by the authors. To simplify the math, feel free to set e = 0.

(b) (∗) When is contribution to the public good sustained as part of a Nash equilibrium? What
property must any strategy that sustains contributions to the public good have?
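
For the simplified case e = 0, the trade-off behind part (a) can be put into numbers. The sketch below rests on stated assumptions: every group member helps every needy member, each member is needy with probability 1/n in a given period, and the game lasts 1/(1 − w) periods on average, as in the quoted model. It further assumes that a member who skips the collective action earns nothing from mutual aid. The function names and the values n = 5, b = 2, c = 1, w = 0.9 are hypothetical.

```python
# Illustrative sketch only: names and parameter values are assumptions (e = 0).

def mutual_aid_value(n, b, c, w):
    """Expected total mutual-aid payoff to one member when all n members help
    every needy individual and no errors occur."""
    # Needy with probability 1/n: receive b from each of the n - 1 neighbours.
    # Otherwise (probability (n - 1)/n): pay c to help the needy member.
    per_period = (1 / n) * (n - 1) * b - ((n - 1) / n) * c
    expected_periods = 1 / (1 - w)
    return per_period * expected_periods

def contribution_is_worthwhile(n, b, c, w, C):
    """Compare the value of staying in good standing with the contribution cost C."""
    return mutual_aid_value(n, b, c, w) > C

value = mutual_aid_value(n=5, b=2.0, c=1.0, w=0.9)
print(value)
print(contribution_is_worthwhile(5, 2.0, 1.0, 0.9, C=5.0))
print(contribution_is_worthwhile(5, 2.0, 1.0, 0.9, C=10.0))
```

The per-period term simplifies to ((n − 1)/n)(b − c), which is the e = 0 version of the left-hand side in part (a).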

4 Introduction to Information Structures


Consider the information structure presented in Fig. 1. Recall, an information structure has three
components. The first component is a set of states of the world. In this information structure, there
are three states, each represented by a ball. The second component is the priors: the probability
with which each state occurs. In Fig. 1, the priors are presented below the balls: state 1 occurs with
probability q, state 2 occurs with probability r, and state 3 occurs with probability s = 1 − (q + r).
The third component is a partition of the states for each player that identifies states that the
player can and cannot distinguish. In Fig. 1, player 1 (top) cannot distinguish states 2 and 3 (blue)
from each other, but can distinguish state 1 (red) from states 2 and 3. Player 2 (bottom) cannot
distinguish states 1 and 2 (green) from each other, but can distinguish state 3 (orange) from states 1
and 2.
Suppose the two players whose information structure is represented in Fig. 1 are playing the
coordination game presented in Fig. 2. That is, first, nature randomly draws a ball according to
the prior probabilities. Each player sees the color associated with that state by their partition.
Then, each player plays an action in the coordination game and receives payoffs according to their
actions, as presented in the payoff matrix.
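
One way to make the three components concrete is to write them down as data. The sketch below stores the priors as a dictionary and each player's partition as a list of sets of states, following the description of Fig. 1 above, and computes what a player believes after seeing only the block of their partition that contains the true state. The numerical priors (q, r, s) = (1/2, 1/4, 1/4) and the function names are assumptions for illustration.

```python
from fractions import Fraction

# Illustrative sketch only: names and the numerical priors are assumptions.

def cell_of(partition, state):
    """The block of the partition that contains `state`."""
    for cell in partition:
        if state in cell:
            return cell
    raise ValueError(f"state {state} is not covered by the partition")

def posterior(priors, cell):
    """Beliefs over states after learning only that the state lies in `cell`."""
    total = sum(priors[s] for s in cell)
    return {s: priors[s] / total for s in cell}

priors = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}  # q, r, s
player1_partition = [frozenset({1}), frozenset({2, 3})]  # state 1 | states 2 and 3
player2_partition = [frozenset({1, 2}), frozenset({3})]  # states 1 and 2 | state 3

belief1 = posterior(priors, cell_of(player1_partition, 2))
belief2 = posterior(priors, cell_of(player2_partition, 2))
print(belief1)
print(belief2)
```

When the true state is 2, player 1 puts weight r/(r + s) on state 2 and player 2 puts weight r/(q + r) on it. These are the same ratios that appear in the conditions of the questions below.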

(a) A state-dependent strategy is one in which a player chooses one action in some states of the
world, and another in other states. Identify a valid, state-dependent strategy for player 1. Do
the same for player 2.

        L    R
   L    a    b
   R    c    d

a > c, d > b
p = (d − b)/(d − b + a − c)

Figure 2: The Coordination Game

Figure 3: An Information Structure Illustrating the Importance of Higher-Order Beliefs (four states with priors q, r, s, and t)

(b) Consider the strategy pair where player 1 plays “A when red and B when blue” (a.k.a. “A iff
red”) and player 2 plays “B when green and A when orange” (a.k.a. “A iff orange”). What
are each player’s payoffs?

(c) A Bayesian Nash equilibrium (BNE) is simply a pair of strategies such that neither player gains
a higher expected payoff by unilaterally deviating to a different strategy. This can easily be
checked by showing that for no player is there a color such that, by playing a different action
in that color, the player would receive a higher payoff. Show that when r/(r + s) < p and
r/(q + r) < 1 − p, the strategy pair “A iff red; A iff green” is a BNE. Note: when at least one
of the players is playing a state-dependent strategy in equilibrium, we call this an equilibrium
with state-dependent strategies (ESDS).
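
The color-by-color check described in part (c) can be sketched in code. The example rests on several assumptions: actions A and B are taken to correspond to L and R in Fig. 2, and "A iff red; A iff green" is read as player 1 playing A only in state 1 and player 2 playing A in states 1 and 2. The payoffs a = 2, b = 0, c = 0, d = 1 (so p = 1/3), both sets of priors, and all function names are hypothetical.

```python
from fractions import Fraction

# Illustrative sketch only: names, payoffs, and priors are assumptions.

def payoff(own, other, a, b, c, d):
    """Payoff from Fig. 2 to a player choosing `own` against `other`
    (assumes A and B correspond to L and R in the figure)."""
    return {("A", "A"): a, ("A", "B"): b, ("B", "A"): c, ("B", "B"): d}[(own, other)]

def cell_value(action, cell, other_strategy, priors, game):
    """Expected payoff of `action` on `cell`, scaled by the cell's probability."""
    return sum(priors[s] * payoff(action, other_strategy[s], *game) for s in cell)

def is_bne(strategy1, strategy2, partition1, partition2, priors, game):
    """True if no player gains by switching the action in any single cell."""
    for own, other, partition in ((strategy1, strategy2, partition1),
                                  (strategy2, strategy1, partition2)):
        for cell in partition:
            current = own[next(iter(cell))]
            alternative = "B" if current == "A" else "A"
            if (cell_value(alternative, cell, other, priors, game)
                    > cell_value(current, cell, other, priors, game)):
                return False
    return True

game = (2, 0, 0, 1)  # a, b, c, d with a > c and d > b, so p = 1/3
partition1 = [{1}, {2, 3}]
partition2 = [{1, 2}, {3}]
a_iff_red = {1: "A", 2: "B", 3: "B"}
a_iff_green = {1: "A", 2: "A", 3: "B"}

# r/(r + s) = 1/5 < p and r/(q + r) = 1/6 < 1 - p
priors_ok = {1: Fraction(1, 2), 2: Fraction(1, 10), 3: Fraction(2, 5)}
# r/(r + s) = 5/9 > p, so the first condition fails
priors_bad = {1: Fraction(1, 10), 2: Fraction(1, 2), 3: Fraction(2, 5)}

result_ok = is_bne(a_iff_red, a_iff_green, partition1, partition2, priors_ok, game)
result_bad = is_bne(a_iff_red, a_iff_green, partition1, partition2, priors_bad, game)
print(result_ok, result_bad)
```

Comparing payoffs summed over a cell rather than conditional expectations gives the same answer, because both sides are scaled by the same positive probability of the cell.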

5 Using Information Structures to Understand Higher-Order Beliefs


Next, consider the information structure presented in Fig. 3.
Again, suppose the two players whose information structure is represented in Fig. 3 are playing
the coordination game presented in Fig. 2. That is, first, nature randomly draws a ball according
to the prior probabilities. Each player sees the color associated with that state by their partition.
Then, each player plays an action in the coordination game and receives payoffs according to their
actions, as presented in the payoff matrix.

(a) In this example, when the state is the rightmost one, both players know that it is not the
leftmost one, and vice versa. Assume r/(r + s) > p and s/(s + t) > p. Show there is no ESDS where
both play A in the leftmost state and B in the rightmost state. To do this:

(i) Suppose that both play A in the leftmost state. Player 2 must play A one state to the
right, too, since this state is also green. Show that player 1 will maximize his payoffs by
playing A when blue.
(ii) Show that player 2 will then maximize his payoffs by playing A when orange.
(iii) Argue that player 1 will maximize his payoffs by playing A when yellow.

(b) What is the relationship to higher order beliefs? To answer this, first suppose that 2 plays A
when he sees green. Then answer each of the following:

(i) Suppose player 2 sees orange. What does player 2 think about what 1 sees?
(ii) What does player 2 think that 1 thinks about what 2 sees?
(iii) What does player 2 think that 1 thinks that 2 will do?
(iv) What does player 2 think that 1 will do in response?
(v) How should player 2 respond?
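
The chain of reasoning in parts (a) and (b) can be traced step by step with a small best-response loop. The partitions below are inferred from the colors named in part (a): player 2 sees green in states 1 and 2 and orange in states 3 and 4, while player 1 sees blue in states 2 and 3 and yellow in state 4. Since Fig. 3 is not reproduced here, that reading is an assumption. Equal priors of 1/4, the threshold p = 1/3, and the function name are hypothetical, and A is assumed to be a best response exactly when the other player plays A with conditional probability at least p.

```python
from fractions import Fraction

# Illustrative sketch only: partitions, priors, and p are assumptions.

def best_response(partition, other_strategy, priors, p):
    """Play A in a cell when the conditional probability that the other
    player plays A is at least p, otherwise play B."""
    strategy = {}
    for cell in partition:
        mass = sum(priors[s] for s in cell)
        mass_a = sum(priors[s] for s in cell if other_strategy[s] == "A")
        action = "A" if mass_a / mass >= p else "B"
        for s in cell:
            strategy[s] = action
    return strategy

priors = {s: Fraction(1, 4) for s in (1, 2, 3, 4)}  # q, r, s, t
p = Fraction(1, 3)             # r/(r + s) = s/(s + t) = 1/2 > p
partition1 = [{1}, {2, 3}, {4}]
partition2 = [{1, 2}, {3, 4}]

strategy2 = {1: "A", 2: "A", 3: "B", 4: "B"}  # player 2 starts with A when green only
for step in range(3):
    strategy1 = best_response(partition1, strategy2, priors, p)
    strategy2 = best_response(partition2, strategy1, priors, p)
    print(step, strategy1, strategy2)
```

Each pass through the loop pushes A one block further to the right, which is the contagion that steps (i) to (iii) of part (a) ask you to establish by hand.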
