Non-zero-sum Game Theory, Auctions and Negotiation
…Well what’s it worth to you, eh?
Andrew W. Moore
Associate Professor
School of Computer Science
Carnegie Mellon University
www.cs.cmu.edu/~awm
[email protected]
412-268-7599

Note to other teachers and users of these slides. Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew’s tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and corrections gratefully received.
                  B Cooperates    B Defects
A Cooperates       -1 , -1         -9 , 0
A Defects           0 , -9         -6 , -6
(each cell: A’s payoff , B’s payoff)
C -1 , -1 -9 , 0
D 0 , -9 -6 , -6
n=2
S1 = {C,D}
S2 = {C,D}
u1 (C,C) = -1 u2 (C,C) = -1
u1 (C,D) = -9 u2 (C,D) = 0
u1 (D,C) = 0 u2 (D,C) = -9
u1 (D,D) = -6 u2 (D,D) = -6
What would you do if you were Player A?
Player A: Assuming B plays “D”, what, oh what, should I do?
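Player A's question can be checked mechanically. The following is a minimal sketch, not part of the original slides: it stores the payoffs u1 and u2 listed above in a dictionary and looks up A's best replies to a fixed choice by B. The names `PAYOFFS`, `u` and `best_replies_for_a` are made up for illustration.

```python
# Payoffs from the slides: (A's strategy, B's strategy) -> (u1, u2).
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-9, 0),
    ("D", "C"): (0, -9),
    ("D", "D"): (-6, -6),
}
STRATEGIES = ("C", "D")


def u(player, s1, s2):
    """Payoff to `player` (1 or 2) when A plays s1 and B plays s2."""
    return PAYOFFS[(s1, s2)][player - 1]


def best_replies_for_a(b_strategy):
    """All of A's strategies that maximise u1 against a fixed B strategy."""
    best = max(u(1, s, b_strategy) for s in STRATEGIES)
    return [s for s in STRATEGIES if u(1, s, b_strategy) == best]


if __name__ == "__main__":
    for b in STRATEGIES:
        print(f"If B plays {b}, A's best replies are {best_replies_for_a(b)}")
```

Under these payoffs the lookup returns "D" whether B plays "C" or "D", since 0 > -1 and -6 > -9.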
                    PLAYER B
                    C          D
PLAYER A    C    -1 , -1    -9 , 0
            D     0 , -9    -6 , -6
If one of a player’s strategies is …
In some cases (e.g. prisoner’s dilemma) this
means, if players are “rational” we can predict the
outcome of the game.
Removing strictly dominated strategies, one step at a time:

                    PLAYER B
                    C          D
PLAYER A    D     0 , -9    -6 , -6

                    PLAYER B
                    D
PLAYER A    D    -6 , -6
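The removal steps can be sketched as a small loop. This is an illustrative implementation under the assumption of a two-player game stored as a payoff dictionary; the function names are invented here and do not come from the slides.

```python
def strictly_dominated(payoffs, rows, cols, player):
    """Return one strictly dominated strategy of `player` (0 = row, 1 = column), or None."""
    own, other = (rows, cols) if player == 0 else (cols, rows)

    def pay(mine, theirs):
        # The payoff table is always keyed by (row strategy, column strategy).
        cell = (mine, theirs) if player == 0 else (theirs, mine)
        return payoffs[cell][player]

    for s in own:
        for t in own:
            # s is strictly dominated if some t does strictly better against everything.
            if t != s and all(pay(t, o) > pay(s, o) for o in other):
                return s
    return None


def iterated_elimination(payoffs, rows, cols):
    """Repeatedly delete strictly dominated strategies until none remain."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for player in (0, 1):
            s = strictly_dominated(payoffs, rows, cols, player)
            if s is not None:
                (rows if player == 0 else cols).remove(s)
                changed = True
    return rows, cols


PRISONERS_DILEMMA = {
    ("C", "C"): (-1, -1), ("C", "D"): (-9, 0),
    ("D", "C"): (0, -9), ("D", "D"): (-6, -6),
}

if __name__ == "__main__":
    print(iterated_elimination(PRISONERS_DILEMMA, ["C", "D"], ["C", "D"]))
```

For the prisoner's dilemma payoffs this leaves only "D" for each player, matching the final reduced table.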
Copyright © 2001, Andrew W. Moore Non-Zero-Sum Game Theory: Slide 10
Strict Domination Removal Example
                    Player B
                    I      II     III    IV
Player A    I      3,1    4,1    5,9    2,6
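          Ib       IIb      IIIb
Ia       0 , 4    4 , 0    5 , 3
IIa      4 , 0    0 , 4    5 , 3
IIIa     3 , 5    3 , 5    6 , 6

(IIIa,IIIb) is a N.E. because
$$u_1(\mathrm{III}_a, \mathrm{III}_b) = \max\{\, u_1(\mathrm{I}_a, \mathrm{III}_b),\ u_1(\mathrm{II}_a, \mathrm{III}_b),\ u_1(\mathrm{III}_a, \mathrm{III}_b) \,\}$$
AND
$$u_2(\mathrm{III}_a, \mathrm{III}_b) = \max\{\, u_2(\mathrm{III}_a, \mathrm{I}_b),\ u_2(\mathrm{III}_a, \mathrm{II}_b),\ u_2(\mathrm{III}_a, \mathrm{III}_b) \,\}$$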
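The two max conditions can be tested cell by cell. The sketch below is an illustration rather than the slides' own method: it transcribes the 3x3 payoffs (first number is u1, second is u2) and keeps every cell where neither player gains by deviating alone. `pure_nash_equilibria` is a hypothetical helper name.

```python
ROWS = ("Ia", "IIa", "IIIa")    # Player A's strategies
COLS = ("Ib", "IIb", "IIIb")    # Player B's strategies

# (A's strategy, B's strategy) -> (u1, u2), transcribed from the table.
PAYOFFS = {
    ("Ia", "Ib"): (0, 4),   ("Ia", "IIb"): (4, 0),   ("Ia", "IIIb"): (5, 3),
    ("IIa", "Ib"): (4, 0),  ("IIa", "IIb"): (0, 4),  ("IIa", "IIIb"): (5, 3),
    ("IIIa", "Ib"): (3, 5), ("IIIa", "IIb"): (3, 5), ("IIIa", "IIIb"): (6, 6),
}


def pure_nash_equilibria(payoffs, rows, cols):
    """Return every (row, col) cell where each strategy is a best reply to the other."""
    equilibria = []
    for r in rows:
        for c in cols:
            u1, u2 = payoffs[(r, c)]
            a_cannot_improve = u1 == max(payoffs[(r2, c)][0] for r2 in rows)
            b_cannot_improve = u2 == max(payoffs[(r, c2)][1] for c2 in cols)
            if a_cannot_improve and b_cannot_improve:
                equilibria.append((r, c))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(PAYOFFS, ROWS, COLS))
```

On this table the only cell that passes both checks is (IIIa, IIIb).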
F 0 0 2 1
• Two Nash Equilibria.
• How useful is Game Theory in this case??
• Why this example is troubling…
INTERMISSION
(Why) are Nash Equilibria useful for
A.I. researchers?
Tragedy of the Commons
YE OLDE COMMONS
An outcome of
( g1 , g2 , g3 ·· , gn )
will pay how much to the i’th farmer?
$$g_i \sqrt{36 - \sum_{j=1}^{n} g_j}$$
Let’s Assume a pure Nash Equilibrium exists.
Call it $(g_1^*, g_2^*, \ldots, g_n^*)$
Write $G_{-i}^* = \sum_{j \neq i} g_j^*$
THEN
$$g_i^* = \arg\max_{g_i} \left[\, g_i \sqrt{36 - g_i - G_{-i}^*} \,\right]$$
Clearly all the gi*’s are the same (Proof by “it’s bloody obvious”)
Write g* = g1* = ··· = gn*
Solution to g* = 24 – (2/3)(n-1)g* is: g* = 72/(2n+1)
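The step from the arg max to the equation g* = 24 – (2/3)(n-1)g* is not spelled out on the slide. The following is one way to fill it in, assuming the farmer's payoff is g_i times the square root of (36 minus the total number of goats), which is the reading consistent with the stated solution. The shorthand X is introduced here only for the derivation.

```latex
\begin{align*}
X &= 36 - g_i - G_{-i}^* \\
\frac{\partial}{\partial g_i}\left[ g_i \sqrt{X} \right]
  &= \sqrt{X} - \frac{g_i}{2\sqrt{X}} = 0
  \;\Longrightarrow\; 2X = g_i \\
2\,(36 - g_i - G_{-i}^*) &= g_i
  \;\Longrightarrow\; g_i^* = 24 - \tfrac{2}{3}\, G_{-i}^* \\
\text{With } g_1^* = \cdots = g_n^* = g^*:\quad
G_{-i}^* &= (n-1)\, g^*
  \;\Longrightarrow\; g^* = 24 - \tfrac{2}{3}(n-1)\, g^* \\
g^* \cdot \frac{2n+1}{3} &= 24
  \;\Longrightarrow\; g^* = \frac{72}{2n+1}
\end{align*}
```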
Consequences
At the Nash Equilibrium a rational farmer grazes 72/(2n+1) goats.
How many goats in general will be grazed? Trivial algebra gives: 36 – 36/(2n+1) goats total being grazed
[as n --> infinity , #goats --> 36]
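These numbers can be sanity-checked numerically. The sketch below assumes the payoff to a farmer is g_i * sqrt(36 - total goats), the reading consistent with g* = 72/(2n+1). It uses a brute-force grid search for the best reply, a stand-in chosen for illustration and not the method used on the slides.

```python
import math


def payoff(g_i, others_total):
    """Assumed payoff to a farmer grazing g_i goats when the others graze others_total."""
    remaining = 36 - g_i - others_total
    return g_i * math.sqrt(remaining) if remaining > 0 else 0.0


def equilibrium_goats(n):
    """Per-farmer Nash equilibrium herd size stated on the slide: 72 / (2n + 1)."""
    return 72 / (2 * n + 1)


def total_goats(n):
    """Total goats grazed at equilibrium; equals 36 - 36 / (2n + 1)."""
    return n * equilibrium_goats(n)


def best_reply_by_grid(others_total, steps=36000):
    """Brute-force best reply over a grid of herd sizes between 0 and 36."""
    candidates = [36 * k / steps for k in range(steps + 1)]
    return max(candidates, key=lambda g: payoff(g, others_total))


if __name__ == "__main__":
    for n in (1, 2, 5, 50):
        g_star = equilibrium_goats(n)
        reply = best_reply_by_grid((n - 1) * g_star)
        print(f"n={n}: g*={g_star:.3f}, grid best reply={reply:.3f}, total={total_goats(n):.3f}")
```

If the assumed payoff is right, the grid best reply to everyone else playing g* should land on g* itself, up to the grid spacing.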
Randy
Explode
Do Nothing
Repeated Games
                    Player B                                  Player B
                    C          D                              C          D
Player A    C    -1 , -1    -9 , 0        Player A    C    -1 , -1    -9 , 0
            D     0 , -9    -6 , -6                   D     0 , -9    -6 , -6
Idea 1
Player A has four pure strategies
C then C
C then D
D then C
D then D
Ditto for B
Is Idea 1 correct?
Important Theoretical Result:
Assuming Implausible Threats, if the
game G has a unique N.E. (s1* ,·· sn*)
then the new game of repeating G T
times, and adding payouts, has a
unique N.E. of repeatedly choosing the
original N.E. (s1* ,·· sn*) in every game.
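As a small illustration of "repeating G T times, and adding payouts", the sketch below sums stage payoffs for the prisoner's dilemma when both players choose the stage-game equilibrium (D, D) in every round. The helper names are hypothetical, and the code only tallies payoffs; it does not prove the uniqueness claim.

```python
# Stage game: (A's move, B's move) -> (A's payoff, B's payoff).
STAGE_PAYOFFS = {
    ("C", "C"): (-1, -1), ("C", "D"): (-9, 0),
    ("D", "C"): (0, -9), ("D", "D"): (-6, -6),
}


def total_payoffs(history):
    """Add up each player's stage payoffs over a list of (A's move, B's move) pairs."""
    total_a = sum(STAGE_PAYOFFS[moves][0] for moves in history)
    total_b = sum(STAGE_PAYOFFS[moves][1] for moves in history)
    return total_a, total_b


def repeat_stage_equilibrium(stage_equilibrium, rounds):
    """History in which the same stage-game profile is played in every round."""
    return [stage_equilibrium] * rounds


if __name__ == "__main__":
    history = repeat_stage_equilibrium(("D", "D"), rounds=3)
    print(total_payoffs(history))
```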
S2
0 ? 6 ?
With 2/3 chance:                          With 1/3 chance:
              Chris                                     Chris
              H        F                                H        F
Pat    H    2 , 2    0 , 0                Pat    H    2 , 1    0 , 0
       F    0 , 0    1 , 1                       F    0 , 0    1 , 2
In a Bayesian Game each player is given a type. All
players know their own types but only a prob. dist. for their
opponent’s types
An n-player Bayesian Game has
a set of action spaces A1 ·· An
a set of type spaces T1 ·· Tn
a set of beliefs P1 ·· Pn
a set of payoff functions u1 ·· un
Pi(t-i|ti) is the prob. dist. of the types for the other players,
given player i has type ti .
ui(a1 , a2 ··· an , ti ) is the payout to player i if player j
chooses action aj (with aj ∈ Aj ) (for all j=1,2,···n) and if
player i has type ti ∈ Ti
Bayesian Games: Who Knows What?
We assume that all players enter knowing the full
information about the Ai’s , Ti’s, Pi’s and ui’s
The i’th player knows ti, but not t1 t2 t3 ·· ti-1 ti+1 ·· tn
All players know that all other players know the
above
And they know that they know that they know, ad
infinitum
Definition: A strategy Si(ti) in a Bayesian Game is a
mapping from Ti→Ai : a specification of what action
would be taken for each type
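Since a strategy is just a mapping from types to actions, a player with |Ti| types and |Ai| actions has |Ai|^|Ti| pure strategies. The sketch below enumerates them as dictionaries. The function name `pure_strategies` is invented for illustration, and the example labels "Hlove", "Flove", "H" and "F" are borrowed from the Example slide.

```python
from itertools import product


def pure_strategies(types, actions):
    """Every mapping type -> action, each represented as a dict."""
    return [dict(zip(types, choice)) for choice in product(actions, repeat=len(types))]


if __name__ == "__main__":
    for strategy in pure_strategies(("Hlove", "Flove"), ("H", "F")):
        print(strategy)
```

With two types and two actions this yields four mappings, one for each way of assigning an action to each type.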
Example
A1 = {H,F} A2 = {H,F}
T1 = {Hlove, Flove} T2 = {Hlove, Flove}
P1 (t2 = Hlove | t1 = Hlove) = 2/3
P1 (t2 = Flove | t1 = Hlove) = 1/3
P1 (t2 = Hlove | t1 = Flove) = 2/3
P1 (t2 = Flove | t1 = Flove) = 1/3
P2 (t1 = Hlove | t2 = Hlove) = 1
P2 (t1 = Flove | t2 = Hlove) = 0
P2 (t1 = Hlove | t2 = Flove) = 1
P2 (t1 = Flove | t2 = Flove) = 0
u1 (H,H,Hlove) = 2 u2 (H,H,Hlove) = 2
u1 (H,H,Flove) = 1 u2 (H,H,Flove) = 1
u1 (H,F,Hlove) = 0 u2 (H,F,Hlove) = 0
u1 (H,F,Flove) = 0 u2 (H,F,Flove) = 0
u1 (F,H,Hlove) = 0 u2 (F,H,Hlove) = 0
u1 (F,H,Flove) = 0 u2 (F,H,Flove) = 0
u1 (F,F,Hlove) = 1 u2 (F,F,Hlove) = 1
u1 (F,F,Flove) = 2 u2 (F,F,Flove) = 2
(GASP, SPLUTTER)
Bayesian Nash Equilibrium
The set of strategies (s1* ,s2* ··· sn*) is a
Pure Strategy Bayesian Nash Equilibrium
iff for each player i, and for each possible type of i : ti ∈ Ti
$$s_i^*(t_i) = \arg\max_{a_i \in A_i} \sum_{t_{-i} \in T_{-i}} u_i\big(s_1^*(t_1), \ldots, s_{i-1}^*(t_{i-1}),\ a_i,\ s_{i+1}^*(t_{i+1}), \ldots, s_n^*(t_n),\ t_i\big)\, P_i(t_{-i} \mid t_i)$$
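The definition can be checked by brute force on the two-player H/F example. The sketch below is an illustration under stated assumptions, not the slides' own procedure. It reads the beliefs as: player 1 thinks player 2 is Hlove with probability 2/3 whatever player 1's own type, and player 2 is certain player 1 is Hlove. It takes the payoffs from the u1/u2 list in the example. All function names are hypothetical.

```python
from itertools import product

ACTIONS = ("H", "F")
TYPES = ("Hlove", "Flove")

# BELIEFS[player][own_type][other_type] = P_player(other_type | own_type)  (assumed reading)
BELIEFS = {
    1: {t: {"Hlove": 2 / 3, "Flove": 1 / 3} for t in TYPES},
    2: {t: {"Hlove": 1.0, "Flove": 0.0} for t in TYPES},
}


def utility(a1, a2, own_type):
    """Payoff to a player of type own_type when player 1 plays a1 and player 2 plays a2."""
    if a1 != a2:
        return 0
    if a1 == "H":
        return 2 if own_type == "Hlove" else 1
    return 1 if own_type == "Hlove" else 2


def expected_utility(player, action, own_type, other_strategy):
    """Sum over the opponent's types of payoff times belief, as in the definition."""
    total = 0.0
    for other_type, prob in BELIEFS[player][own_type].items():
        other_action = other_strategy[other_type]
        a1, a2 = (action, other_action) if player == 1 else (other_action, action)
        total += prob * utility(a1, a2, own_type)
    return total


def is_bayesian_ne(s1, s2):
    """True if every type of every player is playing an expected-payoff-maximising action."""
    for player, own, other in ((1, s1, s2), (2, s2, s1)):
        for t in TYPES:
            best = max(expected_utility(player, a, t, other) for a in ACTIONS)
            if expected_utility(player, own[t], t, other) < best - 1e-12:
                return False
    return True


def all_pure_strategies():
    """Every mapping from type to action."""
    return [dict(zip(TYPES, acts)) for acts in product(ACTIONS, repeat=len(TYPES))]


def pure_bayesian_equilibria():
    """All pure-strategy pairs (s1, s2) that satisfy the equilibrium condition."""
    return [(s1, s2)
            for s1 in all_pure_strategies()
            for s2 in all_pure_strategies()
            if is_bayesian_ne(s1, s2)]


if __name__ == "__main__":
    for s1, s2 in pure_bayesian_equilibria():
        print("player 1:", s1, "| player 2:", s2)
```

Under the assumed beliefs, the search finds the profiles where everyone of every type plays H and where everyone of every type plays F.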
ub(Ps,Pb,Vb) = What?
[Figure: trade region; horizontal axis Vb → with ticks 0, ¼, ½, ¾, 1]
Prob(Trade Happens) = 1/2 x (3/4)^2 = 9/32
Value of Trade
[Figure: trade region; horizontal axis Vb → with ticks 0, 1/4, 1; vertical axis Vs ↑ with ticks 3/4, 1]
E[Vs|Trade Occurs] = 1/3 x 3/4 = 1/4
E[Vb|Trade Occurs] = 1/4 + 2/3 x 3/4 = 3/4
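The three numbers on these slides (9/32, 1/4, 3/4) can be reproduced with exact arithmetic. The sketch rests on two assumptions not stated in the surviving text. First, Vs and Vb are independent and uniform on [0, 1]. Second, trade happens exactly when Vb ≥ Vs + 1/4, so the trade region is the triangle with corners (Vb, Vs) = (1/4, 0), (1, 0), (1, 3/4), as the axis ticks suggest. Under a uniform density the probability is the triangle's area and the conditional means are its centroid.

```python
from fractions import Fraction as Fr

# Assumed trade region in (Vb, Vs) coordinates: Vb >= Vs + 1/4 inside the unit square.
TRADE_TRIANGLE = [(Fr(1, 4), Fr(0)), (Fr(1), Fr(0)), (Fr(1), Fr(3, 4))]


def triangle_area(points):
    """Area of a triangle from its three corners (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = points
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2


def centroid(points):
    """Mean of the corners, which is the mean of a uniform distribution on a triangle."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


if __name__ == "__main__":
    mean_vb, mean_vs = centroid(TRADE_TRIANGLE)
    print("Prob(trade)   =", triangle_area(TRADE_TRIANGLE))
    print("E[Vs | trade] =", mean_vs)
    print("E[Vb | trade] =", mean_vb)
```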
• Why?