Lecture 04
1 Examples
1.1 Extensive form games with perfect information
• Some players are active (have at least two actions) on some nodes, and these nodes are called
moving nodes.
• Actions of players are represented by branches.
• In an extensive form game of perfect information, each player observes all actions that have been
chosen by others before she moves.
• Nodes that some player cannot distinguish are in the same information set of that player.
• In an EFG with perfect information, each information set contains exactly one moving node.
• In an EFG with imperfect information, some information sets contain at least two moving nodes.
2 Theory
2.1 Representation of an extensive form game
• A game tree consists of a collection of nodes, X, and a binary relation ≻ such that, for any
x, y ∈ X, x ≻ y means “x comes after y”.
– The initial node of the tree is called root.
– For example, recall the bargaining game, X = {1, 2, 3, 4, 5, 6, 7}.
– 2 ≻ 1 , 6 ≻ 3, and 6 ≻ 1, but not 6 ≻ 2.
• The relation ≻ satisfies the following
– asymmetry: there is no x, y ∈ X such that x ≻ y and y ≻ x
– transitivity: for any x, y, z ∈ X, if x ≻ y and y ≻ z, then x ≻ z
– common predecessor for non-initial nodes: for any two nodes y1 , y2 ∈ X, if there exist
nodes x1 and x2 such that y1 ≻ x1 and y2 ≻ x2 , then there exists a node z ∈ X such that
y1 ≻ z and y2 ≻ z.
– and, if x ≺ y and z ≺ y, then either x ≻ z or z ≻ x.
• Using this relation, we further define
– set of predecessors of node x ∈ X by P (x) = {y ∈ X|x ≻ y}.
∗ x ∈ X is the root of X iff P (x) = ∅.
– set of successors of node x ∈ X by S(x) = {y ∈ X|y ≻ x}.
– The set of terminal nodes is denoted by Z = {x ∈ X|S(x) = ∅}, i.e. nodes without succes-
sors.
– The set of moving nodes is Y = X\Z = {x ∈ X|S(x) ̸= ∅}. {Y, Z} is a partition of X.
∗ A partition of a set A is a collection of subsets {B1 , . . . , Bn } such that (a) ∀i, Bi ⊆ A,
(b) ∀i ̸= j, Bi ∩ Bj = ∅, (c) B1 ∪ · · · ∪ Bn = A
– A path of a terminal node z is Path(z) = P (z) ∪ {z}, i.e. the nodes that precede z together
with z itself.
∗ For example, P (6) = {1, 3}, S(2) = {4, 5}, Z = {4, 5, 6, 7}, Y = {1, 2, 3}, Path(4) =
{1, 2, 4}
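The relations above can be sketched in a few lines of Python; the parent map below encodes the immediate-predecessor structure of the example tree (node labels are the ones assumed in the notes):

```python
# Immediate predecessor of each node; None marks the root.
parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
X = set(parent)

def P(x):
    """Set of predecessors of x: every node on the way back to the root."""
    preds, y = set(), parent[x]
    while y is not None:
        preds.add(y)
        y = parent[y]
    return preds

def S(x):
    """Set of successors of x: every node that comes after x."""
    return {y for y in X if x in P(y)}

Z = {x for x in X if not S(x)}   # terminal nodes: no successors
Y = X - Z                        # moving nodes

def path(z):
    """Path of a terminal node z: its predecessors together with z itself."""
    return P(z) | {z}

print(P(6), S(2), Z, Y, path(4))
```

Running this reproduces the sets listed in the example.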
• Let N be the set of players.
• Let (Yi )i∈N be a moving partition of Y such that Yi contains all the nodes at which player i
chooses actions.1
• Let Ui be the information partition of player i. That is, Ui is a partition of Yi and each ui ∈ Ui
is an information set of player i, i.e. i does not know exactly at which node in ui she is.
– e.g. N = {B, S}, YB = {1} and YS = {2, 3}, UB = {{1}}, US = {{2}, {3}} (with perfect
information), US = {{2, 3}} (with imperfect information: because {2, 3} contains two
nodes, player S does not know at which one exactly she is).
• Let Au be the set of actions at information set u ∈ Ui of player i. (Why do nodes in the same
information set have the same set of actions?)
• e.g. A{1} = {100, 500}, A{2} = A{3} = {A, R} (perfect information), A{2,3} = {A, R}
1 We consider extensive form games such that there is only one player choosing action at each moving node.
2
• A pure strategy of player i is defined by a function si that assigns an action si (u) ∈ Au to every information set u ∈ Ui .
Let Si denote the set of pure strategies of player i, and the set of strategy profiles S = ×i∈N Si .
– For example, in the perfect information bargaining game, (100, RA) means that sB ({1}) = 100,
sS ({2}) = R, sS ({3}) = A; here SB = {100, 500}, SS = {AA, AR, RA, RR}, and S = SB × SS
– With imperfect information, (500, A) means that sB ({1}) = 500, sS ({2, 3}) = A, and SS =
{A, R}
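As a quick illustration, the seller's strategy set SS can be enumerated as the set of all functions from information sets to actions (the labels follow the example above):

```python
from itertools import product

# Seller's information sets and the actions available at each one
# (perfect-information bargaining example).
info_sets = ["{2}", "{3}"]
actions = {"{2}": ["A", "R"], "{3}": ["A", "R"]}

# A pure strategy picks one action at every information set, so the
# strategy set is the Cartesian product of the action sets:
S_S = [dict(zip(info_sets, combo))
       for combo in product(*(actions[u] for u in info_sets))]
print(S_S)   # the four strategies AA, AR, RA, RR, as dicts
```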
• The payoff function vi : Z → R is defined on the outcome space, i.e. the set of terminal nodes, for
each player i.
• Γ = ⟨N, Y, Z, ≻, (Yi , Ui , (Au )u∈Ui , vi )i∈N ⟩ completely describes an extensive form game
• Every strategy profile s = (s1 , . . . , sn ) must lead to some terminal node, which we call the
terminal node induced by s, denoted by z = ζ(s), where ζ : S → Z is the outcome function.
– This ζ is determined by ⟨N, Y, Z, ≻, (Yi , Ui , (Au )u∈Ui )i∈N ⟩.
• The payoff function in terms of strategy profiles is ui = vi ◦ ζ, i.e. ui (s) = vi (ζ(s)).
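A sketch of the outcome function ζ for the perfect-information example, assuming the node labeling above (node 1 is the root, B's offer leads to node 2 or 3, and the seller's A/R choice leads to a terminal node):

```python
# Action -> child-node map and the mover at each non-terminal node
# (labels assumed to match the bargaining example).
child = {(1, 100): 2, (1, 500): 3,
         (2, "A"): 4, (2, "R"): 5,
         (3, "A"): 6, (3, "R"): 7}
mover = {1: "B", 2: "S", 3: "S"}

def zeta(s):
    """s maps player -> (node -> action); follow the prescribed actions
    from the root until a terminal node is reached."""
    x = 1
    while x in mover:
        x = child[(x, s[mover[x]][x])]
    return x

# The profile (100, RA): s_B(1) = 100, s_S(2) = R, s_S(3) = A.
s = {"B": {1: 100}, "S": {2: "R", 3: "A"}}
print(zeta(s))   # terminal node 5 under this labeling
```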
• The reduced strategic form game (RSFG) of the bargaining game with imperfect information is
• A strategic form game can always be represented as an extensive form game with imperfect infor-
mation.
• For example, the strategic form game
can be represented as an extensive form game: the bargaining game with imperfect information.
• There can be various ways to represent a strategic form game by an extensive form game.
2.3 Equilibrium
2.3.1 Nash equilibrium
Definition 2.2. A Nash Equilibrium of an extensive form game Γ is the Nash equilibrium of its
reduced strategic form game G(Γ).
• The bargaining game with imperfect information has a unique NE (100, A).
• The bargaining game with perfect information has three subgames, two of which are proper subgames:
– The bargaining game with imperfect information has only one subgame and no proper subgame.
Definition 2.4. A strategy profile s is a subgame perfect equilibrium of an extensive form game Γ if
it is a Nash equilibrium in all subgames of Γ.
• The entry game Γ has two subgames: the whole game Γ, and the part in red frame gred .
Theorem 2.1 (Zermelo’s Theorem). Every finite extensive form game of perfect information has a
subgame perfect equilibrium, and hence a Nash equilibrium, in pure strategies. Moreover, all SPEs can
be obtained by the backward induction procedure.
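The backward induction procedure behind Zermelo's theorem can be sketched generically; the tree and payoffs below are hypothetical (an entry-game-like example with made-up payoffs), not taken from the notes:

```python
# A decision node is (player, {action: subtree}); a leaf is a payoff
# tuple (u1, u2). The payoffs here are illustrative assumptions.
tree = (1, {"Out":   (0, 2),
            "Enter": (2, {"Accommodate": (1, 1),
                          "Fight":       (-1, -1)})})

def solve(node, plan=None):
    """Return the SPE payoff profile of the subtree rooted at `node`,
    recording (player, chosen action) for every decision node in `plan`."""
    if plan is None:
        plan = []
    if isinstance(node[1], dict):                  # decision node
        player, moves = node
        values = {a: solve(sub, plan)[0] for a, sub in moves.items()}
        best = max(values, key=lambda a: values[a][player - 1])
        plan.append((player, best))
        return values[best], plan
    return node, plan                              # leaf: a payoff tuple

payoffs, plan = solve(tree)
print(payoffs, plan)
```

Under these assumed payoffs, player 2 accommodates off the path, so player 1 enters.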
2.4 Randomized strategies
• A mixed strategy σi is simply a probability distribution on Si , i.e. σi ∈ ∆(Si )
– In the entry game, Coke has four pure strategies, {OA, OT, EA, ET }, and it may play the
mixed strategy (0.1, 0.2, 0.3, 0.4).
• A behavioral strategy βi assigns, to every information set u ∈ Ui , a probability distribution
βi (u) ∈ ∆(Au ) on the actions available at u.
– Mixing when a player makes decision at a specific information set, instead of mixing at the
beginning of the game
– In the entry game, a behavioral strategy is {(0.3, 0.7), (3/7, 4/7)}, i.e. at the root, Coke
chooses O w/ prob. 0.3 and E w/ prob. 0.7; if Coke enters the market, it chooses A w/
prob. 3/7 and T w/ prob. 4/7
• What is the relationship between mixed and behavioral strategies?
– First, it is always easy to obtain a mixed strategy σi from a behavioral strategy βi , say

σi (si ) = ∏u∈Ui βi (u)(si (u))    (3)
– Second, let Si (u) = {si ∈ Si |∃s−i , ∃x ∈ u, x ≺ ζ(si , s−i )} be the set of strategies under which
player i could reach the information set u, and let Si (u, ai ) = {si ∈ Si (u)|si (u) = ai }. Then we
say that βi is consistent with σi if, for any u ∈ Ui and any ai ∈ Au ,

σi (Si (u)) > 0 =⇒ βi (u)(ai ) = σi (Si (u, ai )) / σi (Si (u))    (4)
– Consider the entry game: if Coke’s behavioral strategy {(0.3, 0.7), (3/7, 4/7)} is given,
it is equivalent to the mixed strategy (9/70, 12/70, 0.3, 0.4). The other way around, if
Coke’s mixed strategy (0.1, 0.2, 0.3, 0.4) is given, then there is a behavioral strategy
{(0.3, 0.7), (3/7, 4/7)} that induces the same outcome (distribution) as the mixed strategy.
∗ S1 (1) = {OA, OT, EA, ET }, S1 (1, O) = {OA, OT }, S1 (1, E) = {EA, ET }, so β1 (1)(O) =
0.3 and β1 (1)(E) = 0.7
∗ S1 (2) = {EA, ET }, S1 (2, A) = {EA}, S1 (2, T ) = {ET }, so β1 (2)(A) = 0.3/0.7 = 3/7
and β1 (2)(T ) = 0.4/0.7 = 4/7
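The two conversions, formulas (3) and (4), can be checked numerically for the entry game; exact fractions avoid rounding:

```python
from itertools import product
from fractions import Fraction as F

# Coke's pure strategies: an action at each of its two information sets
# (info set 1: O or E at the root; info set 2: A or T after entry).
strategies = [a + b for a, b in product("OE", "AT")]   # OA, OT, EA, ET

def to_mixed(beta):
    """Formula (3): sigma(s) is the product of beta(u)(s(u)) over info sets."""
    return {s: beta[0][s[0]] * beta[1][s[1]] for s in strategies}

def to_behavioral(sigma):
    """Formula (4): beta(u)(a) = sigma(S(u, a)) / sigma(S(u)). Info set 2
    is reached only by strategies choosing E, so S(2) = {EA, ET}."""
    pr_E = sigma["EA"] + sigma["ET"]
    return [{"O": sigma["OA"] + sigma["OT"], "E": pr_E},
            {"A": sigma["EA"] / pr_E, "T": sigma["ET"] / pr_E}]

beta = [{"O": F(3, 10), "E": F(7, 10)}, {"A": F(3, 7), "T": F(4, 7)}]
mixed = to_mixed(beta)            # OA: 9/70, OT: 12/70, EA: 3/10, ET: 2/5

sigma = {"OA": F(1, 10), "OT": F(2, 10), "EA": F(3, 10), "ET": F(4, 10)}
induced = to_behavioral(sigma)    # O: 3/10, E: 7/10; A: 3/7, T: 4/7
print(mixed, induced)
```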
• The set of mixed strategies is larger than the set of those obtained from behavioral strategies
– (0.1, 0.2, 0.3, 0.4) is a mixed strategy but cannot be obtained from any behavioral strategy.
• But they are somehow “observationally equivalent”.
Theorem 2.2. (Kuhn) For any profile of mixed and behavioral strategies, σ and β, the following hold:
(a) For any player i ∈ N , if σi and βi satisfy either (3) or (4), then Pr(z|σi , s−i ) = Pr(z|βi , s−i )
for any terminal node z ∈ Z.
(b) For all players i ∈ N , if σi and βi satisfy either (3) or (4), then Pr(z|σ) = Pr(z|β) for any
terminal node z ∈ Z.
Theorem 2.3. (Kuhn) Every finite extensive form game has a subgame perfect equilibrium in behav-
ioral strategies.
• There is a mixed (behavioral) equilibrium in the entry game: {((1, 0), (2/3, 1/3)) , (1/2, 1/2)}
Theorem 2.4 (One-deviation principle). Let Γ be a finite horizon extensive form game with perfect
information. The strategy profile s∗ is a SPE of Γ if and only if, for every player i ∈ N and for every
subgame g of Γ where player i moves at the initial node of g, there exists no profitable deviation by
player i which differs from s∗i only in the action specified at the initial node of g.
Remark 2.1. The one-deviation principle holds for infinite horizon games if certain regularity conditions
hold (e.g., continuity at infinity (Fudenberg and Tirole, 1991, p.110), which means that what happens in
the distant future has little impact on the payoff). Such conditions hold in all games with compact
action spaces and continuous payoff functions.
3 Applications
3.1 Pirate game
There are five rational pirates (in strict order of seniority A, B, C, D and E) who found 100 gold coins.
They must decide how to distribute them.
The pirate world’s rules of distribution say that the most senior pirate first proposes a plan of
distribution. The pirates, including the proposer, then vote on whether to accept this distribution. If
the majority accepts the plan, the coins are dispersed and the game ends. In case of a tie vote, the
proposer has the casting vote. If the majority rejects the plan, the proposer is thrown overboard from
again. The process repeats until a plan is accepted or if there is one pirate left.
Pirates base their decisions on three factors. First of all, each pirate wants to survive. Second,
given survival, each pirate wants to maximize the number of gold coins he receives. Third, each pirate
would prefer to throw another overboard, if all other results would otherwise be equal.
Solution:
• The final possible scenario would have all the pirates except D and E thrown overboard. Since
D is senior to E, they have the casting vote; so, D would propose to keep 100 for themself and 0
for E.
• If there are three left (C, D and E), C knows that D will offer E 0 in the next round; therefore,
C has to offer E one coin in this round to win E’s vote. Therefore, when only three are left the
allocation is C:99, D:0, E:1.
• If B, C, D and E remain, B can offer 1 to D; because B has the casting vote, only D’s vote is
required. Thus, B proposes B:99, C:0, D:1, E:0.
• With this knowledge, A can count on C and E’s support for the following allocation, which is
the final solution: A: 98, B:0, C:1, D:0, E:1
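The backward induction above can be automated; solve(n) reproduces the argument for any number of pirates under the stated preferences:

```python
# Backward induction for the pirate game. solve(n) returns the accepted
# allocation when n pirates remain, listed from the current proposer
# (most senior) down to the most junior.
COINS = 100

def solve(n):
    if n == 1:
        return [COINS]                 # the last pirate keeps everything
    future = solve(n - 1)              # allocations if the proposer dies
    # With n voters and the proposer's casting vote on ties, the proposer
    # needs (n + 1) // 2 yes votes, i.e. (n + 1) // 2 - 1 besides his own.
    needed = (n + 1) // 2 - 1
    # A pirate votes yes only if strictly better off than under `future`
    # (bloodthirst makes indifference a no vote), so the cheapest votes to
    # buy belong to those who would get the least after the proposer dies.
    cheapest = sorted(range(n - 1), key=lambda j: future[j])[:needed]
    alloc = [0] * n
    for j in cheapest:
        alloc[j + 1] = future[j] + 1   # one coin more than the alternative
    alloc[0] = COINS - sum(alloc)      # the proposer keeps the rest
    return alloc

print(solve(5))   # A: 98, B: 0, C: 1, D: 0, E: 1
```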
3.2 Race between two firms
• In subgames Gi (ki , k−i ) with ki , k−i ≤ 2, the unique SPE outcome is that firm i takes ki steps
to win the competition, and the corresponding payoff to firm i is 7 − c(ki ) and to firm −i is 0.
• In subgames Gi (ki , k−i ) with ki ≤ 2 and k−i > 2, the unique SPE outcome is that firm i takes 1
step in each of its turns to win the competition, and the corresponding payoff to firm i is 7 − ki
and to firm −i is 0.
• In subgames Gi (ki , k−i ) with ki > 2 and k−i ≤ 2, the unique SPE outcome is that firm i takes no
steps and lets firm −i win the competition with 1 step in each of its turns, and the corresponding
payoff to firm i is 0 and to firm −i is 7 − k−i .
Now consider the game G1 (3, 3). Firm 1 can take one step into the game G2 (3, 2), in which firm 1 wins,
with cost 1. Firm 1 can also take two steps into the game G2 (3, 1) with cost 4. Clearly, in SPE, firm
1 takes one step. Similarly,
• in the unique SPE of the game Gi (3, k−i ) with k−i > 2, firm i takes one step and wins the
competition with a payoff of 4
• in the unique SPE of the game Gi (4, k−i ) with k−i ∈ {3, 4}, firm i takes two steps and wins the
competition with a payoff of 1.
• in the unique SPE of the game Gi (4, k−i ) with k−i ≥ 5, firm i takes one step and wins the
competition with a payoff of 3
Then consider the game G2 (5, 3). Firm 2 can take one step into the game G1 (3, 4) or two steps into
the game G1 (3, 3), but it loses either way. Hence, in the SPE, firm 2 takes no step and expects a
payoff of 0, while firm 1 expects a payoff of 4. A similar argument applies to G2 (5, 4), in whose SPE
firm 2’s payoff is 0 and firm 1’s is 3.
In the game G1 (5, 5), the one we ultimately want to solve, firm 1 takes one step into the game
G2 (5, 4) at a cost of 1, and its final payoff is 3 − 1 = 2.
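These SPE payoffs can be reproduced by memoized backward induction. The rules below are reconstructed from the notes' numbers: the prize is worth 7, one step costs 1, two steps in one turn cost 4, a firm may pass, and (as an assumption to close the model) the game ends with zero payoffs after two consecutive passes:

```python
from functools import lru_cache

PRIZE = 7
COST = {0: 0, 1: 1, 2: 4}   # cost of taking 0, 1 or 2 steps in one turn

@lru_cache(maxsize=None)
def value(mover, k_m, k_o, passes):
    """SPE continuation payoffs (to the mover, to the other firm) in the
    subgame G_mover(k_m, k_o); `passes` counts consecutive passes so far."""
    best = None
    for m in (0, 1, 2):
        if m > k_m:
            continue                          # cannot overshoot the finish
        if m == 0 and passes == 1:
            out = (0, 0)                      # both firms passed: game ends
        elif m == k_m and m > 0:
            out = (PRIZE - COST[m], 0)        # mover reaches the finish
        else:
            nxt = value(1 - mover, k_o, k_m - m, 1 if m == 0 else 0)
            out = (nxt[1] - COST[m], nxt[0])  # roles swap next turn
        if best is None or out[0] > best[0]:
            best = out
    return best

print(value(0, 5, 5, 0))   # G1(5, 5): firm 1 wins with payoff 2
```

The solver also matches the other subgame values quoted above, e.g. Gi(2, 2) gives 3 = 7 − c(2) and Gi(4, 4) gives 1.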
3.3 Bargaining
3.3.1 Nash bargaining
Nash (1950) considered a bargaining problem and took a cooperative approach, showing that there is a
unique solution that satisfies certain desirable properties. Two people, A and B, are bargaining over
a set of possible outcomes, denoted by S ⊆ R2+ . If the individuals fail to reach an agreement they
both receive outcome zero (0, 0), which is called the disagreement point. Nash looked for solutions that
satisfy the following properties:
Axiom 1 (Pareto efficiency (PAR)). No one can improve upon the solution without making the other
person worse off.
Axiom 2 (Symmetry (SYM)). Both individuals receive the same outcome if the bargaining set is
symmetric.
Axiom 3 (Invariance (INV)). If the bargaining set is contracted or expanded by some factor, the shares
are also contracted or expanded by the same factor.
Axiom 4 (Independence of Irrelevant Alternatives (IIA)). Adding alternatives to the bargaining set
that have not been chosen does not change the solution.
Nash’s theorem states that there exists a unique solution that satisfies these properties and that it
is given by
(π∗A , π∗B ) = arg max(πA ,πB )∈S πA πB
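A small numerical check that the Nash product is maximized at the equal split when S = {(πA , πB ) ∈ R2+ : πA + πB ≤ 1}:

```python
# Grid search for the maximizer of pi_A * pi_B on the Pareto frontier
# pi_A + pi_B = 1 (the maximizer cannot lie strictly inside S).
best, argbest = -1.0, None
steps = 10_000
for i in range(steps + 1):
    pi_A = i / steps
    pi_B = 1.0 - pi_A
    if pi_A * pi_B > best:
        best, argbest = pi_A * pi_B, (pi_A, pi_B)
print(argbest)   # the equal split (0.5, 0.5)
```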
3.3.2 Alternating offers bargaining
The game:
Two players, A and B, bargain over a cake of size 1. At time 0 player A makes an offer xA ∈ [0, 1]
to player B. If player B accepts the offer, agreement is reached and player A receives xA and player
B receives 1 − xA . If player B rejects the offer, she makes a counteroffer xB ∈ [0, 1] at time 1. If this
counteroffer is accepted by A, then B receives xB and A receives 1 − xB . Otherwise, player A makes
another offer at time 2. This process continues indefinitely until a player accepts an offer.
If the players reach an agreement at time t on a partition that gives player i a share xi of the cake,
then player i’s payoff is δi^t xi , where δi ∈ (0, 1) is player i’s discount factor. If the players never
reach an agreement, then each player’s payoff is zero.
The solution:
Suppose that (x∗A , x∗B ) is an equilibrium offer, then it must satisfy the following properties:
1. (No delay) Equilibrium offers are accepted in all the subgames.
2. (Stationarity) Each player makes the same offer in every period of the equilibrium.
Therefore, the current value of B rejecting the offer x∗A is δB x∗B , and, in equilibrium,
1 − x∗A = δB x∗B .
Similarly,
1 − x∗B = δA x∗A .
The unique solution to these equations is

x∗A = (1 − δB ) / (1 − δA δB )
x∗B = (1 − δA ) / (1 − δA δB )
Thus, the following strategy profile (s∗A , s∗B ) constitutes an SPE:
• Player A always offers x∗A and accepts any xB with 1 − xB ≥ δA x∗A
• Player B always offers x∗B and accepts any xA with 1 − xA ≥ δB x∗B .
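The two stationarity equations can be solved and verified numerically; δA = 0.9 and δB = 0.8 are arbitrary illustrative values:

```python
# Closed-form solution of 1 - x_A = delta_B * x_B and 1 - x_B = delta_A * x_A.
def rubinstein_shares(delta_A, delta_B):
    x_A = (1 - delta_B) / (1 - delta_A * delta_B)
    x_B = (1 - delta_A) / (1 - delta_A * delta_B)
    return x_A, x_B

x_A, x_B = rubinstein_shares(0.9, 0.8)
# Both stationarity equations hold:
assert abs((1 - x_A) - 0.8 * x_B) < 1e-12
assert abs((1 - x_B) - 0.9 * x_A) < 1e-12
print(x_A, x_B)   # 5/7 and 5/14
```

As δA = δB = δ → 1, both offers converge to the equal split, matching the Nash bargaining solution.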
Proposition 3.1. One-deviation principle holds for the Rubinstein bargaining game.
Proposition 3.2. (s∗A , s∗B ) is a SPE of the alternating offers bargaining game.
Proof. Consider any period (subgame) in which A has to make an offer. Her payoff to s∗A is x∗A . If,
instead, A offers xA < x∗A , then B accepts the offer and A obtains a payoff xA < x∗A from this deviation.
If she offers xA > x∗A , B will reject the offer and counteroffer x∗B , leaving A with 1 − x∗B = δA x∗A one
period later; A accepts and obtains a present-value payoff of δA² x∗A < x∗A . Hence, we conclude that there
is no profitable one-shot deviation. By a symmetric argument, there is no profitable one-shot deviation
for player B either.
The SPE has the following property:
• Unique (try to prove it)
• Efficient (Pareto optimal)
• In the unique SPE, the equilibrium payoff of player A is πA = x∗A = (1 − δB )/(1 − δA δB ) and that
of player B is πB = 1 − x∗A = δB (1 − δA )/(1 − δA δB ). The share of each player is increasing in her
own discount factor and decreasing in her opponent’s. Suppose that δA = δB = δ → 1; then the
Rubinstein bargaining outcome is πA = πB = 1/2, which coincides with the Nash bargaining
solution with S = {(πA , πB ) ∈ R2+ : πA + πB ≤ 1}.
4 Perfect Bayesian equilibrium
To study extensive form games with incomplete information, we can consider an EFG with a dummy
player, Nature. Nature’s choice is random and may or may not be observable, and Nature is indifferent
between outcomes, i.e. its payoffs are the same at all terminal nodes. An extensive form game with
incomplete information can then be considered as an EFG with Nature in which Nature’s actions are
unobservable. It is possible to define subgame perfect equilibrium or Bayesian-Nash equilibrium for
such games, but they are not adequate.
The following example is an EFG with complete information, however, it illustrates the idea that
sometimes SPE’s may be unreasonable.
Example 4.1. Consider the following game:
There are two SPEs: (T, L) and (O, R). However, (O, R) is unreasonable: if player 2 knows that she is
in the information set I, she should never choose R, since it is dominated by L.
Thus, we further require that players are rational in every continuation game. A continuation game
may start from an information set; in the above example, it consists of the information set I and the
nodes that follow from it. In general, analyzing a player’s decision at an information set requires him
to form beliefs regarding which decision node he is at.
To specify a perfect Bayesian equilibrium, we consider an assessment (µ, β) consisting of
• a belief system µ : X → [0, 1] that assigns a probability to each node in each information set,
satisfying, for every information set u ∈ Ui of some player i, Σx∈u µ(x) = 1; and
• a behavioral strategy profile β = (βi )i∈N .
An assessment (µ, β) is a perfect Bayesian equilibrium (PBE) if
1. (Consistency) the beliefs µ are derived from the strategies β by Bayes rule whenever possible; and
2. (Sequential rationality) given belief µ and subsequent strategies, the action chosen at each infor-
mation set is optimal.
In the above example, let µ ∈ [0, 1] be the probability assigned to the node following T . A PBE of this
game is (µ = 1, (T, L)). That is, player 2 believes that the probability of the node following T equals
1 if he is at the information set I, player 1 always chooses T , and player 2 always chooses L.
Given the belief µ = 1 (in fact, for any belief µ ∈ [0, 1]), choosing L is optimal for player 2 in the
continuation game. Given that β1 (T ) = 1 and β1 (B) = β1 (O) = 0, by Bayes rule,

µ = β1 (T ) / (β1 (T ) + β1 (B)) = 1.
5 Signaling games
5.1 The example
• The game starts with a “decision” by Nature, which determines whether player 1 is of type I or
II, with probability p = Pr(I) = 1/2 (in this example)
• Player 1 decides whether to play L or R conditional on his type. Hence, his strategy is defined
as a mapping from types to actions:
• After player 1 moves, player 2 observes the action taken by player 1 but not player 1’s type.
Hence, her decision is conditional on the action observed, and her strategy is defined as a
mapping from player 1’s actions to her own actions:
• There are no subgames, but there are continuation games. Player 2 forms a belief about the type
of player 1 given the action observed.
– In the example, her beliefs are specified by r = Pr(I|L) and q = Pr(I|R)
– In equilibrium, we require beliefs to be consistent, meaning that they satisfy Bayes rule given
the strategies. In the example, let a = Pr(L|I) and b = Pr(L|II). By Bayes rule, if ap + b(1 − p) >
0,

r = Pr(I|L) = Pr(L|I) Pr(I) / (Pr(L|I) Pr(I) + Pr(L|II) Pr(II)) = ap / (ap + b(1 − p))    (5)
and, if (1 − a)p + (1 − b)(1 − p) > 0,

q = Pr(I|R) = (1 − a)p / ((1 − a)p + (1 − b)(1 − p))

– In the case that ap + b(1 − p) = 0 (resp. (1 − a)p + (1 − b)(1 − p) = 0), r (resp. q) can be
anything in [0, 1].
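The consistency requirement, formula (5) and its counterpart for q, amounts to the following computation (with None standing for an unrestricted off-path belief):

```python
# Bayes rule for the belief r = Pr(I | L), given a = Pr(L | I),
# b = Pr(L | II) and the prior p = Pr(I).
def belief(a, b, p):
    denom = a * p + b * (1 - p)
    if denom == 0:
        return None        # L is never played: any r in [0, 1] is allowed
    return a * p / denom

print(belief(1.0, 0.0, 0.5))   # separating, only type I plays L: r = 1
print(belief(1.0, 1.0, 0.5))   # pooling on L: r equals the prior 1/2
print(belief(0.0, 0.0, 0.5))   # off the path: belief unrestricted (None)
```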
In separating equilibria, player 1’s types choose different actions.
Step 1: consider a candidate strategy of player 1,

σ1 : I 7→ L; II 7→ R
Step 2: compute the belief of player 2 with the candidate strategy specified in step 1. In this case,
simply, r = 1 and q = 0.
Step 3: find player 2’s best response given her beliefs. In this case,
σ2 : L 7→ U ; R 7→ U.
Step 4: given player 2’s strategy in step 3, verify that the candidate strategy in step 1 is optimal
for player 1. In practice, we verify whether player 1 has incentive to deviate. In this case, if player 1
of type I deviates from L to R, his payoff changes from 4 to 0, so the strategy is optimal for type I. If
type II deviates (from R to L), his payoff, again, changes from 4 to 0; so the strategy is also optimal
for type II. Then, we can conclude that there is a separating equilibrium that consists of strategy
profile (LR, U U ) and beliefs (1, 0).
We can also check whether there is another separating equilibrium in which player 1 plays
σ1 : I 7→ R; II 7→ L.
Then, player 2’s beliefs are simply r = 0 and q = 1, and her best response, given these beliefs, is
σ2 : L 7→ D; R 7→ D.
Finally, it is easy to check that player 1 has no incentive to deviate given player 2’s strategy. Hence,
the assessment (RL, DD) and (0, 1) is another separating equilibrium.
In pooling equilibria, player 1 of different types chooses the same action. We follow the same steps
as before.
Step 1: consider a candidate strategy of player 1,
σ1 : I 7→ L; II 7→ L.
Step 2: Unlike the case of separating equilibria, where we compute both r and q exactly, here we
can only pin down r, which equals 1/2; q can be any value in [0, 1].
Step 3: Given her belief r = 1/2, player 2’s best response when she observes action L is D. When
player 2 observes action R, her choice depends on her belief q, so we consider two cases: (case 1) if
q ≤ 1/3, player 2’s best response is U ; (case 2) otherwise, she chooses D.
Step 4: Now we need to verify that, in two assessments ((LL, DU ), (1/2, q)) with q ≤ 1/3 and
((LL, DD), (1/2, q)) with q > 1/3, player 1’s strategy is optimal.
(4.1) In case 1, if player 1 of type I deviates from L to R, his payoff changes from 0 to 0, so the
strategy is optimal for type I; if type II deviates, his payoff changes from 8 to 4, so the strategy is
optimal for type II. Hence, ((LL, DU ), (1/2, q)) with q ≤ 1/3 is a PBE.
(4.2) In case 2, if type I deviates, his payoff increases from 0 to 8. Thus, he has incentives to
deviate and this case cannot be an equilibrium.
We can also check whether there is another pooling equilibrium with player 1’s strategy being
σ1 : I 7→ R; II 7→ R.
Following similar steps, the assessment ((RR, U D), (r, 1/2)) with r ≥ 2/3 is another pooling equilib-
rium.
6 Repeated games
6.1 Preliminaries
• Let G = ⟨N, (Ai , ui )i∈N ⟩ be an n-player strategic form game, and call it a stage game.
– For example, G can be the following prisoner’s dilemma (PD):
• A period t history ht records all action profiles chosen before period t:
h1 = ∅
ht = (a1 , . . . , at−1 ), for t = 2, 3, . . .
– e.g. a possible fifth period history in PD is ((C, C), (C, C), (C, D), (D, D))
– The set of period t histories is denoted by H t , i.e. H 1 = {∅} and H t = At−1 for t ≥ 2,
where A = ×i∈N Ai .
• A pure strategy of player i is a sequence si = (s1i , s2i , . . .), where s1i ∈ Ai and sti : H t → Ai for
t ≥ 2, and the set of strategies is denoted by Si for player i.
• For example, a grim trigger strategy in PD is as follows: for player i = 1, 2,

s1i = C
sti (ht ) = C if aτ−i = C for all τ ≤ t − 1, and sti (ht ) = D otherwise
– start with playing C and switch to D if the opponent has played D in the past
– defection is triggered by the opponent’s defection
– grim as punishment lasts forever
• The continuation payoff, for some strategy profile s, after some history ht is

Ui (s|ht ) = (1 − δ) Σ_{τ=t}^{∞} δ^{τ−t} ui (aτ )

where at = st (ht ) and aτ = sτ (hτ ) with hτ = (hτ−1 , aτ−1 ) for all τ ≥ t + 1.
6.2 Equilibria
Definition 6.1. The strategy profile s is a Nash equilibrium of the repeated game Gδ if for all i ∈ N
Ui (si , s−i ) ≥ Ui (s′i , s−i ), ∀s′i ∈ Si .
We may want to refine Nash equilibrium to subgame perfect equilibrium in dynamic games.
Definition 6.2. The strategy profile s is a subgame perfect equilibrium of the repeated game Gδ if for
all i ∈ N and all ht ∈ H t
Ui (si , s−i |ht ) ≥ Ui (s′i , s−i |ht ), ∀s′i ∈ Si .
Claim 6.1. The grim trigger strategy profile is a Nash equilibrium of the repeated PD if δ ≥ 1/2.
• Suppose player 2 plays the grim trigger strategy and check that player 1 has no incentive to deviate.
• There are two classes of possible deviations:
– responding with C in every period, in which case the expected payoff is 2
– responding with D at some period; we show that such a deviation is not profitable.
• Let T + 1, T = 0, 1, . . ., be the first period at which player 1 defects. Then the best (in what
sense?) deviation must generate the sequence of action profiles (why?)

(C, C), . . . , (C, C) [T times], (D, C), (D, D), (D, D), . . .
– But consider a one-shot deviation: s22 ((C, D)) = D (recall that, in the grim trigger strategy,
s22 ((C, D)) = C). Then she will get payoff 1 and the following sequence of action profiles
realizes after h2

(D, D), (D, D), (D, D), . . .
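The claim can be checked numerically; the PD stage payoffs below (u(C, C) = 2, u(D, C) = 3, u(D, D) = 1) are an assumption consistent with the numbers in the notes:

```python
# Discounted average payoff of the best deviation against grim trigger.
def deviation_payoff(delta, T):
    """Defect first in period T + 1: T periods of (C,C), one period of
    (D,C), then (D,D) forever."""
    coop = sum((1 - delta) * delta**t * 2 for t in range(T))
    cheat = (1 - delta) * delta**T * 3
    punish = delta**(T + 1) * 1    # = (1 - delta) * sum_{t > T} delta^t
    return coop + cheat + punish

# Cooperating forever yields 2; deviating pays off iff delta < 1/2.
for delta in (0.4, 0.5, 0.6):
    profitable = any(deviation_payoff(delta, T) > 2 for T in range(20))
    print(delta, profitable)
```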
• The modified grim trigger strategy profile is a SPE, which can be verified by the one-deviation
principle:

s1i = C, and sti (ht ) = C if ht = ((C, C), . . . , (C, C)), sti (ht ) = D otherwise
6.3 Equilibrium payoffs
Definition 6.3. The set of feasible payoff profiles of a strategic game is the set of all weighted averages
of payoff profiles in the game.
• What are the possible average discounted payoff pairs in a NE besides (1, 1) and (2, 2)?
• Consider the outcome path (b1 , b2 , . . .) that consists of repetitions of the sequence (a1 , . . . , ak )
– Let the average payoff of the sequence (a1 , . . . , ak ) be x.
• Consider the following strategy profile, for i = 1, 2,

s1i = b1i , and sti (ht ) = bti if hr−i = br−i for all r = 1, . . . , t − 1, sti (ht ) = D otherwise
– If xi < 1 for some i, then player i has incentive to deviate by playing D.
– If xi > 1 for all i, then the strategy profile is a NE when δ is close to 1.
6.4 General repeated games and folk theorems
• The basic idea behind the infinitely repeated PD is extended to general infinitely repeated games:
– If players cooperate, everyone gets a payoff higher than some “minimum” payoff
– A deviation triggers each player to begin an indefinite “punishment” of the deviant
Theorem 6.1 (Nash folk theorem). Let G be a strategic form game and Gδ be the infinitely repeated
game with discount factor δ.
• For any discount factor δ the discounted average payoff of every player in any Nash equilibrium
of Gδ is at least her minmax payoff in G.
• Let w be a feasible payoff profile of G for which each player’s payoff exceeds her minmax payoff.
Then for all ϵ > 0 there exists δ̄ < 1 such that if the discount factor exceeds δ̄ then Gδ has a
Nash equilibrium whose discounted average payoff profile w′ satisfies |w − w′ | < ϵ.
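The minmax payoff in the theorem can be computed directly, here restricted to pure strategies (which suffices for the PD, whose assumed payoffs appear below):

```python
# Minmax payoff of player 1 in a two-player strategic form game, over pure
# strategies: minimize over the opponent's actions the best-response payoff.
# The PD payoffs are an assumption: u(C,C)=2, u(C,D)=0, u(D,C)=3, u(D,D)=1.
u1 = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 1}

def minmax(u, own_actions, opp_actions):
    return min(max(u[(a, b)] for a in own_actions) for b in opp_actions)

print(minmax(u1, "CD", "CD"))   # the folk theorem bound for the PD is 1
```

By the theorem, every Nash equilibrium of the repeated PD gives each player a discounted average payoff of at least this value.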
Theorem 6.2 (Subgame perfect folk theorem for two-player games). Let G be a two-player strategic
form game and Gδ be the infinitely repeated game with discount factor δ.
• For any discount factor δ the discounted average payoff of every player in any subgame perfect
equilibrium of Gδ is at least her minmax payoff in G.
• Let w be a feasible payoff profile of G for which each player’s payoff exceeds her minmax payoff.
Then for all ϵ > 0 there exists δ̄ < 1 such that if the discount factor exceeds δ̄ then Gδ has a
subgame perfect equilibrium whose discounted average payoff profile w′ satisfies |w − w′ | < ϵ.