Extensive Form Games: Backward Induction and Imperfect Information Games

The document summarizes a lecture on extensive form games, backward induction, and imperfect information games. It recaps the concept of Nash equilibrium and introduces extensive form games represented as game trees. It then discusses backward induction, which finds subgame perfect equilibria by analyzing games backwards from the end. Finally, it introduces imperfect information extensive form games, where players may have different information sets, and the concept of perfect recall.

Recap Backward Induction Imperfect-Information Extensive-Form Games Perfect Recall

Extensive Form Games: Backward Induction and Imperfect Information Games

CPSC 532A Lecture 10

Extensive Form Games: Backward Induction and Imperfect Information Games CPSC 532A Lecture 10, Slide 1

Lecture Overview

1 Recap

2 Backward Induction

3 Imperfect-Information Extensive-Form Games

4 Perfect Recall


I promised to revisit this

Question: is there a problem with having $\forall a_i, a'_i \in A_i$ in the constraint? That is, are we requiring that the constraint hold in both directions?

$$\sum_{a \in A \mid a_i \in a} p(a)\, u_i(a) \;\ge\; \sum_{a \in A \mid a_i \in a} p(a)\, u_i(a'_i, a_{-i}) \qquad \forall i \in N,\ \forall a_i, a'_i \in A_i$$

$$p(a) \ge 0 \qquad \forall a \in A$$

$$\sum_{a \in A} p(a) = 1$$

Answer: yes, it was wrong. The earlier version took the second sum over $a \in A \mid a'_i \in a$; the version above fixes the problem, changing the second sum so that it's identical to the first.
Note that the constraint can equivalently be written as

$$\sum_{a \in A \mid a_i \in a} \left[ u_i(a) - u_i(a'_i, a_{-i}) \right] p(a) \ge 0.$$
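The corrected constraints can be checked mechanically for a small game. The sketch below is only an illustration: the function name `is_correlated_equilibrium`, the dictionary layout, and the Battle of the Sexes payoffs are assumptions, not part of the lecture.

```python
def is_correlated_equilibrium(actions, utils, p, tol=1e-9):
    """Check the constraints above.

    actions: list with one list of actions per player
    utils:   dict mapping every action profile (tuple) to a tuple of utilities
    p:       dict mapping action profiles to probabilities (missing = 0)
    """
    # p(a) >= 0 and sum_a p(a) = 1
    if any(q < -tol for q in p.values()) or abs(sum(p.values()) - 1.0) > tol:
        return False
    for i, acts in enumerate(actions):
        for a_i in acts:            # recommended action
            for dev in acts:        # deviation a'_i
                gain = 0.0
                for a, prob in p.items():
                    if a[i] != a_i:
                        continue    # both sums range over profiles containing a_i
                    deviated = a[:i] + (dev,) + a[i + 1:]
                    gain += (utils[a][i] - utils[deviated][i]) * prob
                if gain < -tol:
                    return False
    return True


# Hypothetical example: Battle of the Sexes with illustrative payoffs.
ACTIONS = [["B", "F"], ["B", "F"]]
UTILS = {("B", "B"): (2, 1), ("F", "F"): (1, 2),
         ("B", "F"): (0, 0), ("F", "B"): (0, 0)}

coin_flip = {("B", "B"): 0.5, ("F", "F"): 0.5}   # public coin picks an outcome
miscoordinate = {("B", "F"): 1.0}                # always recommend (B, F)
```

Here `coin_flip` satisfies every constraint, while `miscoordinate` fails because player 1 gains by deviating from B to F.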


Introduction

The normal form game representation does not incorporate any notion of sequence, or time, of the actions of the players.
The extensive form is an alternative representation that makes the temporal structure explicit.
Two variants:
  perfect-information extensive-form games
    a "game tree" consisting of choice nodes and terminal nodes
    choice nodes labeled with players, and each outgoing edge labeled with an action for that player
    terminal nodes labeled with utilities
  imperfect-information extensive-form games
    we'll get to this today


Pure Strategies

Overall, a pure strategy for a player in a perfect-information game is a complete specification of which deterministic action to take at every node belonging to that player.
Definition
Let $G = (N, A, H, Z, \chi, \rho, \sigma, u)$ be a perfect-information extensive-form game. Then the pure strategies of player $i$ consist of the cross product
$$\prod_{h \in H,\ \rho(h) = i} \chi(h)$$

Using this definition, we recover the old definitions of mixed strategies, best response, Nash equilibrium, ...

Induced Normal Form

We can "convert" an extensive-form game into normal form.

Figure 5.2 A perfect-information game in extensive form. [Game tree: player 1 chooses A or B at the root. After A, player 2 chooses C, giving (3,8), or D, giving (8,3). After B, player 2 chooses E, giving (5,5), or F; after F, player 1 chooses G, giving (2,10), or H, giving (1,0).]

Induced normal form (rows: pure strategies of player 1; columns: pure strategies of player 2):

        CE       CF       DE       DF
AG     3, 8     3, 8     8, 3     8, 3
AH     3, 8     3, 8     8, 3     8, 3
BG     5, 5     2, 10    5, 5     2, 10
BH     5, 5     1, 0     5, 5     1, 0
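The conversion to normal form can be sketched in a few lines: enumerate each player's pure strategies as a cross product over that player's choice nodes, then play out every pair. This is a minimal sketch, assuming a nested-tuple encoding of the Figure 5.2 tree; the node names (`root`, `after_A`, `after_B`, `after_F`) and helper names are made up for illustration.

```python
from itertools import product

# Internal nodes: (name, player, {action: child}); leaves: payoff tuples.
TREE = ("root", 1, {
    "A": ("after_A", 2, {"C": (3, 8), "D": (8, 3)}),
    "B": ("after_B", 2, {"E": (5, 5),
                         "F": ("after_F", 1, {"G": (2, 10), "H": (1, 0)})}),
})


def is_leaf(node):
    return not isinstance(node[-1], dict)


def choice_nodes(node, player):
    """Yield (name, actions) for each choice node of `player`, depth-first."""
    if is_leaf(node):
        return
    name, owner, children = node
    if owner == player:
        yield name, list(children)
    for child in children.values():
        yield from choice_nodes(child, player)


def pure_strategies(player):
    """Cross product of the action sets at the player's choice nodes."""
    nodes = list(choice_nodes(TREE, player))
    names = [name for name, _ in nodes]
    return [dict(zip(names, combo))
            for combo in product(*(acts for _, acts in nodes))]


def play(s1, s2):
    """Follow both strategies from the root down to a leaf payoff."""
    node = TREE
    while not is_leaf(node):
        name, owner, children = node
        node = children[(s1 if owner == 1 else s2)[name]]
    return node


# Induced normal form, keyed by labels such as ("AG", "CF").
table = {("".join(s1.values()), "".join(s2.values())): play(s1, s2)
         for s1 in pure_strategies(1) for s2 in pure_strategies(2)}
```

Each player has two choice nodes with two actions each, so the table has 4 x 4 = 16 cells even though the tree has only five leaves.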

Subgame Perfection

Define subgame of G rooted at h:
  the restriction of G to the descendants of h.
Define set of subgames of G:
  subgames of G rooted at nodes in G

s is a subgame perfect equilibrium of G iff for any subgame G' of G, the restriction of s to G' is a Nash equilibrium of G'
Notes:
  since G is its own subgame, every SPE is a NE.
  this definition rules out "non-credible threats"
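To see what subgame perfection rules out, it helps to list the pure-strategy Nash equilibria of the induced normal form of the Figure 5.2 game first. The sketch below hard-codes that 4 x 4 payoff table; the names `ROWS`, `COLS`, `PAYOFFS`, and `pure_nash_equilibria` are illustrative assumptions.

```python
ROWS = ["AG", "AH", "BG", "BH"]   # pure strategies of player 1
COLS = ["CE", "CF", "DE", "DF"]   # pure strategies of player 2
PAYOFFS = [
    [(3, 8), (3, 8), (8, 3), (8, 3)],
    [(3, 8), (3, 8), (8, 3), (8, 3)],
    [(5, 5), (2, 10), (5, 5), (2, 10)],
    [(5, 5), (1, 0), (5, 5), (1, 0)],
]


def pure_nash_equilibria():
    """Return cells where neither player gains by a unilateral deviation."""
    equilibria = []
    for r, row in enumerate(ROWS):
        for c, col in enumerate(COLS):
            u1, u2 = PAYOFFS[r][c]
            row_is_best = all(PAYOFFS[r2][c][0] <= u1 for r2 in range(len(ROWS)))
            col_is_best = all(PAYOFFS[r][c2][1] <= u2 for c2 in range(len(COLS)))
            if row_is_best and col_is_best:
                equilibria.append((row, col))
    return equilibria
```

The search finds three equilibria: (AG, CF), (AH, CF), and (BH, CE). In the subgame that starts after F, player 1 gets 2 from G and 1 from H, so any profile in which player 1 plans H fails to be a Nash equilibrium of that subgame. Only (AG, CF) remains, and the plan to play H is exactly the kind of non-credible threat the definition excludes.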


Centipede Game

1 --A-- 2 --A-- 1 --A-- 2 --A-- 1 --A-- (3,5)
|       |       |       |       |
D       D       D       D       D
|       |       |       |       |
(1,0)   (0,2)   (3,1)   (2,4)   (4,3)

Figure 5.9 The centipede game

Play this as a fun game...
Computing Subgame Perfect Equilibria

Idea: Identify the equilibria in the bottom-most trees, and adopt these as one moves up the tree

  function BACKWARDINDUCTION(node h) returns u(h)
    if h ∈ Z then
      return u(h)                          // h is a terminal node
    best_util ← −∞
    forall a ∈ χ(h) do
      util_at_child ← BACKWARDINDUCTION(σ(h, a))
      if util_at_child_ρ(h) > best_util_ρ(h) then
        best_util ← util_at_child
    return best_util

Figure 5.6: Procedure for finding the value of a sample (subgame-perfect) Nash equilibrium of a perfect-information extensive-form game.

util_at_child is a vector denoting the utility for each player at the child node; util_at_child_ρ(h) denotes the element of this vector corresponding to the utility for player ρ(h) (the player who gets to move at node h). Similarly, best_util is a vector giving utilities for each player.
The procedure doesn't return an equilibrium strategy, but rather labels each node with a vector of real numbers.
  This labeling can be seen as an extension of the game's utility function to the non-terminal nodes
The equilibrium strategies: take the best action at each node.

Good news: not only are we guaranteed to find a subgame-perfect equilibrium (rather than possibly finding a Nash equilibrium that involves non-credible threats) but also this procedure is computationally simple. In particular, it can be implemented as a single depth-first traversal of the game tree, and thus requires time linear in the size of the game representation. Recall in contrast that the best known methods for finding Nash equilibria of general games require time exponential in the size of the normal form; remember as well that the induced normal form of an extensive-form game is exponentially larger than the original representation.

For zero-sum games, BackwardInduction has another name: the minimax algorithm.
  Here it's enough to store one number per node.
  It's possible to speed things up by pruning nodes that will never be reached in play: "alpha-beta pruning".
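The BackwardInduction procedure can be sketched as a short recursive function. This is one possible rendering under stated assumptions, not the textbook's code: trees are nested tuples, players are numbered from 1, and the function additionally records the chosen action at each history (keyed by the tuple of actions leading to that node) so that the equilibrium strategies can be read off. Ties are broken in favour of the first action listed.

```python
def backward_induction(node, history=(), plan=None):
    """Return (utility vector, plan) for the subtree rooted at `node`.

    Internal nodes are (player, {action: child}); leaves are payoff tuples.
    `plan` maps each history (tuple of actions) to the action chosen there.
    """
    if plan is None:
        plan = {}
    if not isinstance(node[1], dict):        # terminal node: return u(h)
        return node, plan
    player, children = node
    best_util, best_action = None, None
    for action, child in children.items():
        util_at_child, _ = backward_induction(child, history + (action,), plan)
        # compare only the component belonging to the player who moves here
        if best_util is None or util_at_child[player - 1] > best_util[player - 1]:
            best_util, best_action = util_at_child, action
    plan[history] = best_action
    return best_util, plan


# The perfect-information game of Figure 5.2.
FIG_5_2 = (1, {
    "A": (2, {"C": (3, 8), "D": (8, 3)}),
    "B": (2, {"E": (5, 5), "F": (1, {"G": (2, 10), "H": (1, 0)})}),
})

# The centipede game of Figure 5.9.
CENTIPEDE = (1, {"D": (1, 0), "A":
            (2, {"D": (0, 2), "A":
            (1, {"D": (3, 1), "A":
            (2, {"D": (2, 4), "A":
            (1, {"D": (4, 3), "A": (3, 5)})})})})})

value, plan = backward_induction(FIG_5_2)
centipede_value, centipede_plan = backward_induction(CENTIPEDE)
```

On Figure 5.2 the root is labeled (3, 8) and the plan is A at the root, C after A, F after B, and G after F. On the centipede game every node chooses D, so the root is labeled (1, 0).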

Backward Induction
1 --A-- 2 --A-- 1 --A-- 2 --A-- 1 --A-- (3,5)
|       |       |       |       |
D       D       D       D       D
|       |       |       |       |
(1,0)   (0,2)   (3,1)   (2,4)   (4,3)

Figure 5.9 The centipede game

What happens when we use this procedure on Centipede?
In the only equilibrium, player 1 goes down in the first move.
However, this outcome is Pareto-dominated by all but one other outcome.
Two considerations:
  practical: human subjects don't go down right away
  theoretical: what should you do as player 2 if player 1 doesn't go down?
    SPE analysis says to go down. However, that same analysis says that P1 would already have gone down. How do you update your beliefs upon observation of a measure zero event?
    but if player 1 knows that you'll do something else, it is rational for him not to go down anymore... a paradox
    there's a whole literature on this question

In other words, you have reached a state to which your analysis has given a probability of zero. How should you amend your beliefs and course of action based on this measure-zero event? It turns out this seemingly small inconvenience actually raises a fundamental problem in game theory. We will not develop the subject further here, but let us only mention that there exist different accounts of this situation, and they depend on the probabilistic assumptions made, on what is common knowledge (in particular, whether there is common knowledge of rationality), and on exactly how one revises one's beliefs in the face of measure zero events. The last question is intimately related to the subject of belief revision discussed in Chapter 2.
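The Pareto-dominance claim about the centipede game is easy to verify directly. The snippet below is a small check over the six terminal payoffs of Figure 5.9; the helper name `pareto_dominates` is an illustrative choice.

```python
# Terminal payoffs of the centipede game, left to right.
OUTCOMES = [(1, 0), (0, 2), (3, 1), (2, 4), (4, 3), (3, 5)]


def pareto_dominates(x, y):
    """x dominates y if nobody is worse off and somebody is strictly better off."""
    return (all(a >= b for a, b in zip(x, y))
            and any(a > b for a, b in zip(x, y)))


equilibrium_outcome = (1, 0)   # player 1 goes down immediately
dominators = [o for o in OUTCOMES if pareto_dominates(o, equilibrium_outcome)]
```

Four of the five other outcomes dominate (1, 0); the exception is (0, 2), where player 1 is worse off.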
Intro

Up to this point, in our discussion of extensive-form games we have allowed players to specify the action that they would take at every choice node of the game.
This implies that players know the node they are in and all the prior choices, including those of other agents.
We may want to model agents needing to act with partial or no knowledge of the actions taken by others, or even themselves.
This is possible using imperfect-information extensive-form games.
  each player's choice nodes are partitioned into information sets
  if two choice nodes are in the same information set then the agent cannot distinguish between them.


Formal definition

Definition
An imperfect-information game (in extensive form) is a tuple $(N, A, H, Z, \chi, \rho, \sigma, u, I)$, where
  $(N, A, H, Z, \chi, \rho, \sigma, u)$ is a perfect-information extensive-form game, and
  $I = (I_1, \ldots, I_n)$, where $I_i = (I_{i,1}, \ldots, I_{i,k_i})$ is an equivalence relation on (that is, a partition of) $\{h \in H : \rho(h) = i\}$ with the property that $\chi(h) = \chi(h')$ and $\rho(h) = \rho(h')$ whenever there exists a $j$ for which $h \in I_{i,j}$ and $h' \in I_{i,j}$.
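The two requirements in the definition (each $I_i$ partitions player i's choice nodes, and all nodes in one information set offer the same actions) can be checked in code. The sketch below uses a small hypothetical tree; the node names, the dictionary layout, and the function name are assumptions made for illustration.

```python
def is_valid_information_partition(nodes, info_sets):
    """nodes:     dict node -> (player, frozenset of available actions)
    info_sets: dict player -> list of sets of nodes (the cells I_{i,j})
    """
    players = {owner for owner, _ in nodes.values()}
    for player in players:
        cells = info_sets.get(player, [])
        owned = sorted(h for h, (owner, _) in nodes.items() if owner == player)
        covered = sorted(h for cell in cells for h in cell)
        # Partition: non-empty, disjoint cells covering exactly the player's nodes.
        if covered != owned or any(not cell for cell in cells):
            return False
        # chi(h) = chi(h') whenever h and h' share an information set.
        for cell in cells:
            if len({nodes[h][1] for h in cell}) != 1:
                return False
    return True


# Hypothetical tree: player 1 moves, then player 2, then player 1 again
# without observing player 2's move.
NODES = {
    "root": (1, frozenset({"L", "R"})),
    "after_L": (2, frozenset({"A", "B"})),
    "after_LA": (1, frozenset({"l", "r"})),
    "after_LB": (1, frozenset({"l", "r"})),
}
GOOD = {1: [{"root"}, {"after_LA", "after_LB"}], 2: [{"after_L"}]}
BAD = {1: [{"root", "after_LA"}, {"after_LB"}], 2: [{"after_L"}]}
```

`GOOD` passes both checks. `BAD` fails because `root` and `after_LA` offer different action sets, so the player could tell them apart.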

Example

Figure 5.10 An imperfect-information game. [Game tree: player 1 chooses L or R at the root; R gives (1,1). After L, player 2 chooses A or B. Player 1 then chooses ℓ or r at one of two bottom nodes that form a single information set. After A, ℓ gives (0,0) and r gives (2,4); after B, ℓ gives (2,4) and r gives (0,0).]

If I ∈ I_i is an equivalence class, we can unambiguously use the notation χ(I) to denote the set of actions available to player i at any node in information set I.

What are the equivalence classes for each player?
What are the pure strategies for each player?
  choice of an action in each equivalence class.
  Formally, the pure strategies of player i consist of the cross product $\prod_{I_{i,j} \in I_i} \chi(I_{i,j})$.

Consider the imperfect-information extensive-form game shown in Figure 5.10. In this game, player 1 has two information sets: the set including the top choice node, and the set including the bottom choice nodes. Note that the two bottom choice nodes in the second information set have the same set of possible actions. We can regard player 1 as not knowing whether player 2 chose A or B when she makes her choice.
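The cross product over information sets can be enumerated directly. This is a minimal sketch for the Figure 5.10 game; the information-set names (`top`, `bottom`, `after_L`) are invented labels, and `l` stands in for ℓ.

```python
from itertools import product

# chi(I) for each information set of each player in Figure 5.10.
INFO_SET_ACTIONS = {
    1: {"top": ["L", "R"], "bottom": ["l", "r"]},
    2: {"after_L": ["A", "B"]},
}


def pure_strategies(player):
    """One action per information set: the cross product of the chi(I)."""
    sets = INFO_SET_ACTIONS[player]
    return [dict(zip(sets, combo)) for combo in product(*sets.values())]


player_1 = pure_strategies(1)
player_2 = pure_strategies(2)
```

Player 1 has 2 x 2 = 4 pure strategies even though she owns three choice nodes, because the two bottom nodes share one information set. Player 2 has two.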

Normal-form games
We can represent any normal form game.

Figure 5.11 The Prisoner's Dilemma game in extensive form. [Game tree: player 1 chooses C or D at the root. Player 2, whose two choice nodes form a single information set, then chooses c or d. (C, c) gives (-1,-1), (C, d) gives (-4,0), (D, c) gives (0,-4), and (D, d) gives (-3,-3).]

Note that it would also be the same if we put player 2 at the root node.

Recall that perfect-information games were not expressive enough to capture the Prisoner's Dilemma game and many other ones.

Induced Normal Form

Same as before: enumerate pure strategies for all agents
Mixed strategies are just mixtures over the pure strategies as before.
Nash equilibria are also preserved.
Note that we've now defined both a mapping from NF games to IIEF games and a mapping from IIEF games to NF games.
  what happens if we apply each mapping in turn?
  we might not end up with the same game, but we do get one with the same strategy spaces and equilibria.


Randomized Strategies

It turns out there are two meaningfully different kinds of randomized strategies in imperfect-information extensive-form games
  mixed strategies
  behavioral strategies
Mixed strategy: randomize over pure strategies
Behavioral strategy: independent coin toss every time an information set is encountered

Randomized strategies example

Notice that the definition contains a subtlety. An agent's strategy requires a decision at each choice node, regardless of whether or not it is possible to reach that node given the other choice nodes. In the Sharing game above the situation is straightforward: player 1 has three pure strategies, and player 2 has eight (why?). But now consider the game shown in Figure 5.2.

Figure 5.2 A perfect-information game in extensive form. [Game tree: player 1 chooses A or B at the root. After A, player 2 chooses C, giving (3,8), or D, giving (8,3). After B, player 2 chooses E, giving (5,5), or F; after F, player 1 chooses G, giving (2,10), or H, giving (1,0).]

In order to define a complete strategy for this game, each of the players must choose an action at each of his two choice nodes. Thus we can enumerate the pure strategies of the players as follows.
S1 = {(A, G), (A, H), (B, G), (B, H)}
S2 = {(C, E), (C, F), (D, E), (D, F)}
It is important to note that we have to include the strategies (A, G) and (A, H), even though once A is chosen the G-versus-H choice is moot.
The definitions of best response and Nash equilibrium in this game are exactly as they are for normal form games. Indeed, this example illustrates how every perfect-information game can be converted to an equivalent normal form game. For example, the perfect-information game of Figure 5.2 can be converted into the normal form image of the game, shown in Figure 5.3.

Multi Agent Systems, draft of September 19, 2006

Give an example of a behavioral strategy:
  A with probability .5 and G with probability .3
Give an example of a mixed strategy that is not a behavioral strategy:
  (.6(A, G), .4(B, H)) (why not?)
In this game every behavioral strategy corresponds to a mixed strategy...
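The two examples for player 1 in the Figure 5.2 game can be compared numerically. A behavioral strategy randomizes independently at each choice node, so the mixed strategy it induces is the product of the per-node distributions. The sketch below is a hedged illustration of that one point: the node names and function names are assumptions, and `is_product_distribution` only handles the two-node case needed here.

```python
from fractions import Fraction
from itertools import product


def behavioral_to_mixed(behavioral):
    """behavioral: dict node -> dict action -> probability.
    Returns the induced mixed strategy: pure-strategy tuple -> probability."""
    nodes = list(behavioral)
    mixed = {}
    for combo in product(*(behavioral[n] for n in nodes)):
        prob = Fraction(1)
        for node, action in zip(nodes, combo):
            prob *= behavioral[node][action]
        mixed[combo] = prob
    return mixed


def is_product_distribution(mixed):
    """True if a two-node mixed strategy equals the product of its marginals."""
    first, second = {}, {}
    for (a, b), p in mixed.items():
        first[a] = first.get(a, 0) + p
        second[b] = second.get(b, 0) + p
    return all(mixed.get((a, b), 0) == first[a] * second[b]
               for a in first for b in second)


# Slide example: A with probability .5 and G with probability .3.
behavioral = {
    "root": {"A": Fraction(1, 2), "B": Fraction(1, 2)},
    "after_F": {"G": Fraction(3, 10), "H": Fraction(7, 10)},
}
induced = behavioral_to_mixed(behavioral)

# Slide example of a mixed strategy: (.6(A, G), .4(B, H)).
correlated = {("A", "G"): Fraction(3, 5), ("B", "H"): Fraction(2, 5)}
```

The induced mixed strategy puts 3/20 on (A, G), 7/20 on (A, H), 3/20 on (B, G), and 7/20 on (B, H). The second example is not of product form, since it ties the choice at the second node to the choice at the root, which independent coin tosses cannot do.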

Games of imperfect recall


Imagine that player 1 sends two proxies to the game with the same
strategies. When one arrives, he doesn’t know if the other has
arrived before him, or if he’s the first one.
5.2 Imperfect-information extensive-form games 121

1 sH
 HH
L  HH R
 HH

s Hs 2
 HH

T T
 T U  T D
L  T R  T
 T  T
 T  T
1,0 100,100 5,1 2,2

Figure 5.12 A game with imperfect recall

What is the space


librium. of pure
Note in particular strategies
that in a mixed strategy, agent 1in this
decides game?
probabilistically
whether to play L or R in his information set, but once he decides he plays that pure
strategy consistently. Thus the payoff of 100 is irrelevant in the context of mixed strate-
gies. On the other hand, with behavioral strategies agent 1 gets to randomize afresh
each time he finds himself in the information set. Noting that the pure strategy D is
weakly dominant for agent 2 (and in fact is the unique best response to all strategies of
agent 1 other than the pure strategy L), agent 1 computes the best response to D as fol-
lows. If he uses the behavioral strategy (p, 1 − p) (that is, choosing L with probability
p each time he finds himself in the information set), his expected payoff is
1 ∗ p2 + 100 ∗ p(1 − p) + 2 ∗ (1 − p)
The expression simplifies to −99p2 + 98p + 2, whose maximum is obtained at p =
Extensive Form Games: Backward Induction and Imperfect Information Games CPSC 532A Lecture 10, Slide 20
98/198. Thus (R,D) = ((0, 1), (0, 1)) is no longer an equilibrium in behavioral strate-
Recap Backward Induction Imperfect-Information Extensive-Form Games Perfect Recall

Games of imperfect recall


Imagine that player 1 sends two proxies to the game with the same
strategies. When one arrives, he doesn’t know if the other has
arrived before him, or if he’s the first one.
5.2 Imperfect-information extensive-form games 121

1 sH
 HH
L  HH R
 HH

s Hs 2
 HH

T T
 T U  T D
L  T R  T
 T  T
 T  T
1,0 100,100 5,1 2,2

Figure 5.12 A game with imperfect recall

What is the space


librium. of pure
Note in particular strategies
that in a mixed strategy, agent 1in this
decides game?
probabilistically
whether to play L or R in his information set, but once he decides he plays that pure
1: (L,strategy 2: (U,Thus
R);consistently. D)
the payoff of 100 is irrelevant in the context of mixed strate-
gies. On the other hand, with behavioral strategies agent 1 gets to randomize afresh
each time he finds himself in the information set. Noting that the pure strategy D is
weakly dominant for agent 2 (and in fact is the unique best response to all strategies of
agent 1 other than the pure strategy L), agent 1 computes the best response to D as fol-
lows. If he uses the behavioral strategy (p, 1 − p) (that is, choosing L with probability
p each time he finds himself in the information set), his expected payoff is
1 ∗ p2 + 100 ∗ p(1 − p) + 2 ∗ (1 − p)
The expression simplifies to −99p2 + 98p + 2, whose maximum is obtained at p =
Extensive Form Games: Backward Induction and Imperfect Information Games CPSC 532A Lecture 10, Slide 20
98/198. Thus (R,D) = ((0, 1), (0, 1)) is no longer an equilibrium in behavioral strate-
Recap Backward Induction Imperfect-Information Extensive-Form Games Perfect Recall

Games of imperfect recall


Imagine that player 1 sends two proxies to the game with the same
strategies. When one arrives, he doesn’t know if the other has
arrived before him, or if he’s the first one.
5.2 Imperfect-information extensive-form games 121

1 sH
 HH
L  HH R
 HH

s Hs 2
 HH

T T
 T U  T D
L  T R  T
 T  T
 T  T
1,0 100,100 5,1 2,2

Figure 5.12 A game with imperfect recall

What is the space


librium. of pure
Note in particular strategies
that in a mixed strategy, agent 1in this
decides game?
probabilistically
whether to play L or R in his information set, but once he decides he plays that pure
1: (L,strategy 2: (U,Thus
R);consistently. D)
the payoff of 100 is irrelevant in the context of mixed strate-
gies. On the other hand, with behavioral strategies agent 1 gets to randomize afresh
What is the mixed strategy equilibrium?
each time he finds himself in the information set. Noting that the pure strategy D is
weakly dominant for agent 2 (and in fact is the unique best response to all strategies of
agent 1 other than the pure strategy L), agent 1 computes the best response to D as fol-
lows. If he uses the behavioral strategy (p, 1 − p) (that is, choosing L with probability
p each time he finds himself in the information set), his expected payoff is
1 ∗ p2 + 100 ∗ p(1 − p) + 2 ∗ (1 − p)
The expression simplifies to −99p2 + 98p + 2, whose maximum is obtained at p =
Extensive Form Games: Backward Induction and Imperfect Information Games CPSC 532A Lecture 10, Slide 20
98/198. Thus (R,D) = ((0, 1), (0, 1)) is no longer an equilibrium in behavioral strate-
Recap Backward Induction Imperfect-Information Extensive-Form Games Perfect Recall

Games of imperfect recall

Imagine that player 1 sends two proxies to the game with the same
strategies. When one arrives, he doesn’t know if the other has
arrived before him, or if he’s the first one.

[Figure 5.12: A game with imperfect recall. Player 1 moves at the root, choosing L or R.
After L, player 1 moves again in the same information set: L yields payoffs (1, 0) and
R yields (100, 100). After R, player 2 chooses U, yielding (5, 1), or D, yielding (2, 2).]

What is the space of pure strategies in this game?
  1: (L, R); 2: (U, D)

What is the mixed strategy equilibrium?
  Observe that D is dominant for 2. R, D is better for 1 than L, D, so R, D is an equilibrium.
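The pure-strategy claim above can be checked by brute force. The following is a minimal sketch, assuming the payoffs read off Figure 5.12; the names `PAYOFFS` and `pure_nash_equilibria` are illustrative and not part of the lecture.

```python
from itertools import product

# Payoffs (u1, u2) induced by each pure-strategy profile of Figure 5.12.
# Player 1 has a single information set, so a pure strategy picks the same
# action at both of his nodes. The (100, 100) leaf needs L followed by R
# and is therefore unreachable in pure strategies.
PAYOFFS = {
    ("L", "U"): (1, 0),
    ("L", "D"): (1, 0),
    ("R", "U"): (5, 1),
    ("R", "D"): (2, 2),
}
P1_ACTIONS = ("L", "R")
P2_ACTIONS = ("U", "D")


def pure_nash_equilibria():
    """Return every pure profile from which no player gains by deviating."""
    equilibria = []
    for s1, s2 in product(P1_ACTIONS, P2_ACTIONS):
        u1, u2 = PAYOFFS[(s1, s2)]
        p1_ok = all(PAYOFFS[(d, s2)][0] <= u1 for d in P1_ACTIONS)
        p2_ok = all(PAYOFFS[(s1, d)][1] <= u2 for d in P2_ACTIONS)
        if p1_ok and p2_ok:
            equilibria.append((s1, s2))
    return equilibria


print(pure_nash_equilibria())
```

Under these assumptions the only surviving profile is (R, D), which matches the slide. The check covers pure strategies only.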


Games of imperfect recall



What is an equilibrium in behavioral strategies?


Note in particular that in a mixed strategy, agent 1 decides probabilistically
whether to play L or R in his information set, but once he decides he plays that pure
strategy consistently. Thus the payoff of 100 is irrelevant in the context of mixed
strategies. On the other hand, with behavioral strategies agent 1 gets to randomize afresh
each time he finds himself in the information set. Noting that the pure strategy D is
weakly dominant for agent 2 (and in fact is the unique best response to all strategies of
agent 1 other than the pure strategy L), agent 1 computes the best response to D as
follows. If he uses the behavioral strategy (p, 1 − p) (that is, choosing L with probability
p each time he finds himself in the information set), his expected payoff is
$$1 \cdot p^2 + 100 \cdot p(1 - p) + 2 \cdot (1 - p).$$
The expression simplifies to $-99p^2 + 98p + 2$, whose maximum is obtained at
$p = 98/198$. Thus (R, D) = ((0, 1), (0, 1)) is no longer an equilibrium in behavioral
strategies, and instead we get the equilibrium ((98/198, 100/198), (0, 1)).
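The best-response computation against D can be verified with exact arithmetic. This is a small sketch rather than part of the original material; `payoff_vs_D`, `p_star` and the grid resolution are illustrative choices.

```python
from fractions import Fraction


def payoff_vs_D(p):
    """Agent 1's expected payoff when he plays L with probability p at every
    visit to his information set and agent 2 plays D."""
    return 1 * p * p + 100 * p * (1 - p) + 2 * (1 - p)


# Vertex of the parabola -99p^2 + 98p + 2 (a = -99, b = 98): p* = -b / (2a).
p_star = Fraction(98, 198)

# The expanded and unexpanded forms agree at a few exact test points.
for p in (Fraction(0), Fraction(1, 3), p_star, Fraction(1)):
    assert payoff_vs_D(p) == -99 * p * p + 98 * p + 2

# A coarse grid search as a sanity check that no grid point beats p*.
grid = [Fraction(k, 1000) for k in range(1001)]
assert all(payoff_vs_D(g) <= payoff_vs_D(p_star) for g in grid)

print("p* =", p_star, "payoff at p* =", float(payoff_vs_D(p_star)))
print("payoff of pure R against D =", payoff_vs_D(Fraction(0)))
```

The comparison with p = 0 (the pure strategy R) shows why (R, D) stops being an equilibrium once agent 1 may randomize afresh at each visit: pure R earns 2 against D, while the randomized strategy earns strictly more.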
again, D is weakly dominant for 2
if 1 uses the behavioral strategy (p, 1 − p), his expected utility is
  $1 \cdot p^2 + 100 \cdot p(1 - p) + 2 \cdot (1 - p)$
this simplifies to $-99p^2 + 98p + 2$
the maximum is at $p = 98/198$
thus the equilibrium is ((98/198, 100/198), (0, 1))

Thus, we can have behavioral strategies that are different
from mixed strategies.

There is, however, a broad class of imperfect-information games in which the
expressive power of mixed and behavioral strategies coincides. This is the class of games
of perfect recall. Intuitively speaking, in these games no player forgets any information
he knew about moves made so far; in particular, he remembers precisely all his own
moves.

Lecture Overview

1 Recap

2 Backward Induction

3 Imperfect-Information Extensive-Form Games

4 Perfect Recall


Perfect Recall: mixed and behavioral strategies coincide


No player forgets anything he knew about moves made so far.

Definition
Player i has perfect recall in an imperfect-information game G if
for any two nodes $h, h'$ that are in the same information set for
player i, for any path $h_0, a_0, h_1, a_1, h_2, \ldots, h_n, a_n, h$ from the root
of the game to $h$ (where the $h_j$ are decision nodes and the $a_j$ are
actions) and any path $h_0, a'_0, h'_1, a'_1, h'_2, \ldots, h'_m, a'_m, h'$ from the
root to $h'$ it must be the case that:
  1. $n = m$.
  2. For all $0 \le j \le n$, $h_j$ and $h'_j$ are in the same equivalence class
     for player i.
  3. For all $0 \le j \le n$, if $\rho(h_j) = i$ (that is, $h_j$ is a decision node
     of player i), then $a_j = a'_j$.
G is a game of perfect recall if every player has perfect recall in it.
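The three conditions can be sketched as a check on one pair of root paths. This is a minimal illustration, assuming the caller supplies, for every node on a path, a label for its equivalence class for player i. The `Step` type, the labels "I1" and "root", and the second example game are hypothetical and not taken from the lecture.

```python
from typing import NamedTuple, Sequence


class Step(NamedTuple):
    node_class: str  # assumed label of h_j's equivalence class for player i
    owner: int       # rho(h_j), the player who moves at h_j
    action: str      # a_j, the action taken at h_j


def paths_consistent(i: int, path: Sequence[Step], path_prime: Sequence[Step]) -> bool:
    """Check conditions 1-3 for the root paths of two nodes h and h'
    that lie in the same information set of player i."""
    if len(path) != len(path_prime):                          # condition 1
        return False
    for step, step_p in zip(path, path_prime):
        if step.node_class != step_p.node_class:              # condition 2
            return False
        if step.owner == i and step.action != step_p.action:  # condition 3
            return False
    return True


# Figure 5.12, player 1: the root and the node reached after L share one
# information set. The root's path is empty; the other node's path has one step.
root_path = []
after_L_path = [Step("I1", 1, "L")]
fig_5_12_ok = paths_consistent(1, root_path, after_L_path)

# Hypothetical contrast: player 2 cannot observe whether player 1 chose A or B.
after_A = [Step("root", 1, "A")]
after_B = [Step("root", 1, "B")]
simultaneous_ok = paths_consistent(2, after_A, after_B)

print(fig_5_12_ok, simultaneous_ok)
```

In the Figure 5.12 game the two paths have different lengths, so condition 1 fails and player 1 does not have perfect recall. In the contrast case the differing actions belong to player 1, so condition 3 places no constraint on player 2.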

Perfect Recall

Clearly, every perfect-information game is a game of perfect recall.

Theorem (Kuhn, 1953)


In a game of perfect recall, any mixed strategy of a given agent
can be replaced by an equivalent behavioral strategy, and any
behavioral strategy can be replaced by an equivalent mixed
strategy. Here two strategies are equivalent in the sense that they
induce the same probabilities on outcomes, for any fixed strategy
profile (mixed or behavioral) of the remaining agents.

Corollary
In games of perfect recall the set of Nash equilibria does not
change if we restrict ourselves to behavioral strategies.
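One direction of the theorem can be illustrated on a toy case. The sketch below assumes a hypothetical single-player fragment with perfect recall (choose A or B at information set I1; after A, choose C or D at I2) and builds a mixed strategy by drawing each information set's action independently. It does not cover the converse direction or games with other agents.

```python
from fractions import Fraction
from itertools import product

# Hypothetical behavioral strategy: one distribution per information set.
behavioral = {
    "I1": {"A": Fraction(1, 3), "B": Fraction(2, 3)},
    "I2": {"C": Fraction(1, 4), "D": Fraction(3, 4)},
}


def to_mixed(b):
    """Mixed strategy over pure strategies (action at I1, action at I2),
    obtained by multiplying the per-information-set probabilities."""
    info_sets = sorted(b)
    mixed = {}
    for actions in product(*(b[name] for name in info_sets)):
        prob = Fraction(1)
        for name, action in zip(info_sets, actions):
            prob *= b[name][action]
        mixed[actions] = prob
    return mixed


def outcome(pure):
    """Leaf reached by a pure strategy; I2 is never reached after B."""
    a1, a2 = pure
    return "B" if a1 == "B" else a1 + a2


def outcome_dist_mixed(mixed):
    dist = {}
    for pure, prob in mixed.items():
        leaf = outcome(pure)
        dist[leaf] = dist.get(leaf, Fraction(0)) + prob
    return dist


def outcome_dist_behavioral(b):
    return {
        "AC": b["I1"]["A"] * b["I2"]["C"],
        "AD": b["I1"]["A"] * b["I2"]["D"],
        "B": b["I1"]["B"],
    }


print(outcome_dist_mixed(to_mixed(behavioral)) == outcome_dist_behavioral(behavioral))
```

The two outcome distributions coincide here, which is the sense of equivalence used in the theorem. In the imperfect-recall game of Figure 5.12 no such conversion exists, because no mixed strategy reaches the (100, 100) leaf.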


Computing Equilibria of Games of Perfect Recall

How can we find an equilibrium of an imperfect-information extensive-form game?
  One idea: convert to normal form, and use techniques described earlier.
  Problem: exponential blowup in game size.

Alternative (at least for perfect recall): sequence form
  for zero-sum games, computing equilibrium is polynomial in the size of the
  extensive-form game
    exponentially faster than the LP formulation we saw before
  for general-sum games, can compute equilibrium in time exponential in the size of
  the extensive-form game
    again, exponentially faster than converting to normal form
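The blowup can be made concrete by counting. The slides do not define the sequence form, so the sketch below relies on an assumption from the standard treatment: under perfect recall, a player's sequences are the empty sequence plus one sequence per action at each of his information sets. The function names and the example player are illustrative.

```python
from math import prod


def normal_form_size(actions_per_info_set):
    """Number of pure strategies: one action chosen at every information set."""
    return prod(actions_per_info_set)


def sequence_form_size(actions_per_info_set):
    """Number of sequences under perfect recall (assumed standard count):
    the empty sequence plus one sequence per action at each information set."""
    return 1 + sum(actions_per_info_set)


for k in (5, 10, 20):
    info_sets = [2] * k  # hypothetical player with k binary information sets
    print(k, normal_form_size(info_sets), sequence_form_size(info_sets))
```

For this hypothetical player the pure-strategy count doubles with every added information set while the sequence count grows by two, which is the gap the slide refers to.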
