Lesson 35 Game Theory and Linear Programming
Announcements
Outline
Recap
Definitions
Examples
Fundamental Theorem
Games we can solve so far
GT problems as LP problems
From the continuous to the discrete
Standardization
Rock/Paper/Scissors again
The row player's LP problem
Definition
A zero-sum game is defined by a payoff matrix $A$, where $a_{ij}$
represents the payoff to the row player R if R chooses option $i$ and
the column player C chooses option $j$.
The row player chooses from the rows of the matrix, and the
column player from the columns.
Definition
A strategy for a player consists of a probability vector representing
the portion of time each option is employed.
Definition
The expected value of row and column strategies $p$ and $q$ is the
scalar
$$E(p, q) = \sum_{i,j=1}^{n} p_i a_{ij} q_j = pAq$$
Probabilistically, this is the average amount the row player receives per play
(or, if it is negative, the amount the column player receives) when the
players employ these strategies.
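The double sum in the definition can be evaluated directly. The following is a minimal sketch, assuming a hypothetical 2×2 payoff matrix and hypothetical strategies that do not come from the lesson; `expected_value` is an illustrative name.

```python
from fractions import Fraction

def expected_value(p, A, q):
    """Return E(p, q) = sum over i, j of p[i] * A[i][j] * q[j]."""
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(p))
               for j in range(len(q)))

# Hypothetical payoff matrix and strategies, chosen only for illustration.
A = [[3, -1],
     [-2, 4]]
p = [Fraction(1, 2), Fraction(1, 2)]   # row player's mixed strategy
q = [Fraction(1, 4), Fraction(3, 4)]   # column player's mixed strategy

value = expected_value(p, A, q)
print(value)  # 5/4
```

A positive result means the row player gains on average under these two strategies.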
Rock/Paper/Scissors
Example
What is the payoff matrix for Rock/Paper/Scissors?
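Solution
The payoff matrix (rows and columns ordered rock, paper, scissors) is
$$A = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix}.$$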
Example
Consider a new game: players R and C each choose a number 1,
2, or 3. If they choose the same thing, C pays R that amount. If
they choose differently, R pays C the amount that C has chosen.
What is the payoff matrix?
Solution
$$A = \begin{pmatrix} 1 & -2 & -3 \\ -1 & 2 & -3 \\ -1 & -2 & 3 \end{pmatrix}$$
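The rules of this game translate directly into a formula for the entries: R receives $i$ when both choose $i$, and pays $j$ when C chooses a different number $j$. A small sketch, with `number_game_payoff` as a hypothetical helper name:

```python
def number_game_payoff(n=3):
    """Payoff to R: +i if both players pick i, otherwise -j where j is C's pick."""
    return [[i if i == j else -j for j in range(1, n + 1)]
            for i in range(1, n + 1)]

A = number_game_payoff()
for row in A:
    print(row)
# [1, -2, -3]
# [-1, 2, -3]
# [-1, -2, 3]
```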
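Fundamental Theorem
There exist optimal strategies $p^*$ and $q^*$ such that, for all strategies $p$ and $q$,
$$E(p^*, q) \ge E(p^*, q^*) \ge E(p, q^*).$$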
Games we can solve so far
Strictly-determined games
2×2 non-strictly-determined games
Theorem
Let $A$ be a payoff matrix. If $a_{rs}$ is a saddle point, then $e_r'$ is an
optimal strategy for R and $e_s$ is an optimal strategy for C. Also
$v = E(e_r', e_s) = a_{rs}$.
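A saddle point is an entry that is simultaneously the minimum of its row and the maximum of its column, so strictly-determined games can be detected by a direct search. The sketch below assumes a hypothetical 3×3 matrix for illustration; `saddle_points` is not a name from the lesson.

```python
def saddle_points(A):
    """Return (r, s, value) for each entry that is a row minimum and a column maximum."""
    points = []
    for r, row in enumerate(A):
        for s, entry in enumerate(row):
            column = [A[i][s] for i in range(len(A))]
            if entry == min(row) and entry == max(column):
                points.append((r, s, entry))
    return points

# Hypothetical strictly-determined game (0-based indices in the output).
A = [[4, 2, 5],
     [3, 1, 6],
     [0, -1, 7]]
print(saddle_points(A))  # [(0, 1, 2)]

# Rock/Paper/Scissors has no saddle point, so it is not strictly determined.
rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
print(saddle_points(rps))  # []
```

In the first game R should always play the first row and C the second column, and the value of the game is 2.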
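For a 2×2 non-strictly-determined game, write $\Delta = a_{11} + a_{22} - a_{12} - a_{21}$. Then
$$p = \frac{1}{\Delta}\begin{pmatrix} a_{22} - a_{21} & a_{11} - a_{12} \end{pmatrix}, \qquad q = \frac{1}{\Delta}\begin{pmatrix} a_{22} - a_{12} \\ a_{11} - a_{21} \end{pmatrix}, \qquad v = \frac{|A|}{\Delta}.$$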
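The closed form for 2×2 games comes from choosing each player's mix so that the opponent's two pure options pay the same. The sketch below uses the standard formula with $\Delta = a_{11} + a_{22} - a_{12} - a_{21}$; the matrix is a hypothetical example, and the function assumes the game has no saddle point (otherwise the "probabilities" it returns may fall outside $[0, 1]$).

```python
from fractions import Fraction

def solve_2x2(A):
    """Optimal (p, q, v) for a 2x2 game without a saddle point."""
    (a, b), (c, d) = A
    delta = a + d - b - c
    if delta == 0:
        raise ValueError("formula does not apply: a11 + a22 - a12 - a21 = 0")
    p = [Fraction(d - c, delta), Fraction(a - b, delta)]   # row player's mix
    q = [Fraction(d - b, delta), Fraction(a - c, delta)]   # column player's mix
    v = Fraction(a * d - b * c, delta)                     # value = det(A) / delta
    return p, q, v

# Hypothetical non-strictly-determined game.
A = [[3, -1],
     [-2, 4]]
p, q, v = solve_2x2(A)
print(p, q, v)
```

For this matrix the result is $p = (3/5, 2/5)$, $q = (1/2, 1/2)$ and $v = 1$; against $p$, both columns pay exactly 1, which is what makes C indifferent.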
GT problems as LP problems
Lemma
Regardless of $q$, we have
$$\max_p pAq = \max_{1 \le i \le m} e_i' A q$$
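Proof.
Let $i_0$ be an index for which $e_{i_0}' A q = \max_{1 \le i \le m} e_i' A q$. Each $e_i'$ is itself a strategy, so $\max_p pAq \ge e_{i_0}' A q$. Conversely, for any strategy $p$,
$$pAq = \sum_{i=1}^m p_i e_i' A q \le \sum_{i=1}^m p_i e_{i_0}' A q = \left(\sum_{i=1}^m p_i\right) e_{i_0}' A q = e_{i_0}' A q.$$
Thus
$$\max_p pAq \le e_{i_0}' A q.$$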
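The lemma says that against a fixed $q$, no mixed strategy beats the best pure one. A numerical illustration (not a proof), assuming the Rock/Paper/Scissors matrix and an arbitrary hypothetical $q$, compares the best pure payoff with the best payoff over a grid of mixed strategies:

```python
from fractions import Fraction

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]                # Rock/Paper/Scissors
q = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]    # hypothetical column strategy

# Payoff of each pure row strategy e_i against q: the entries of Aq.
pure_payoffs = [sum(a * qj for a, qj in zip(row, q)) for row in A]
best_pure = max(pure_payoffs)

# All mixed strategies whose probabilities are multiples of 1/N.
N = 20
grid = [(Fraction(i, N), Fraction(j, N), Fraction(N - i - j, N))
        for i in range(N + 1) for j in range(N + 1 - i)]
# pAq equals the p-weighted average of the pure payoffs.
best_mixed = max(sum(pi * u for pi, u in zip(p, pure_payoffs)) for p in grid)

print(pure_payoffs, best_pure, best_mixed)
```

Here the pure payoffs are $-1/6$, $1/3$, $-1/6$, and the grid maximum is also $1/3$, attained at the pure strategy "paper".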
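By the lemma, the column player's problem of minimizing $\max_p pAq$ becomes: minimize $v$ subject to
$$v \ge e_i' A q, \qquad i = 1, 2, \dots, m$$
$$q_j \ge 0, \qquad j = 1, 2, \dots, n$$
$$\sum_{j=1}^n q_j = 1$$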
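Resolution:
Let $x_j = q_j / v$ (this requires $v > 0$, which holds when every entry of $A$ is positive). Then
$$\sum_{j=1}^n x_j = \frac{1}{v} \sum_{j=1}^n q_j = \frac{1}{v},$$
so minimizing $v$ is the same as maximizing $\sum_{j=1}^n x_j$, and the constraints $v \ge e_i' A q$ become $e_i' A x \le 1$.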
Upshot
Theorem
Consider a game with payoff matrix $A$, where each entry of $A$ is
positive. The column player's optimal strategy $q$ is
$$q = \frac{x}{x_1 + \cdots + x_n},$$
where $x \ge 0$ solves the LP problem of maximizing $x_1 + \cdots + x_n$
subject to the constraints $Ax \le \mathbf{1}$.
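Once an optimal $x$ is known, the theorem recovers the strategy and the value by rescaling: $q = x / \sum_j x_j$ and $v = 1 / \sum_j x_j$. The sketch below assumes a hypothetical positive 2×2 matrix whose LP solution $x = (1/5, 1/5)$ was worked out by hand; the helper checks feasibility only, not optimality.

```python
from fractions import Fraction

def strategy_from_lp(A, x):
    """Check x >= 0 and Ax <= 1, then return (q, v) with q = x/sum(x), v = 1/sum(x)."""
    if any(xj < 0 for xj in x):
        raise ValueError("x must be nonnegative")
    for row in A:
        if sum(a * xj for a, xj in zip(row, x)) > 1:
            raise ValueError("x violates a constraint of Ax <= 1")
    total = sum(x)
    return [xj / total for xj in x], 1 / total

# Hypothetical game with positive entries; x is assumed to be its LP optimum.
A = [[4, 1],
     [2, 3]]
x = [Fraction(1, 5), Fraction(1, 5)]
q, v = strategy_from_lp(A, x)
print(q, v)
```

For this example both constraints hold with equality, $q = (1/2, 1/2)$, and $v = 5/2$.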
Rock/Paper/Scissors
$$A = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix}.$$
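We can add 2 to every entry to make all entries positive:
$$\tilde{A} = \begin{pmatrix} 2 & 1 & 3 \\ 3 & 2 & 1 \\ 1 & 3 & 2 \end{pmatrix}.$$
This does not change the optimal strategies; it only raises the value of the game by 2.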
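Adding a constant $c$ to every entry changes $E(p, q)$ by exactly $c$, because the probabilities in $p$ and in $q$ each sum to 1. A quick check with hypothetical strategies (any probability vectors would do):

```python
from fractions import Fraction

def expected_value(p, A, q):
    """E(p, q) = sum over i, j of p[i] * A[i][j] * q[j]."""
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(p)) for j in range(len(q)))

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
A_shifted = [[a + 2 for a in row] for row in A]

# Hypothetical strategies, chosen only to illustrate the shift.
p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
q = [Fraction(1, 5), Fraction(2, 5), Fraction(2, 5)]

diff = expected_value(p, A_shifted, q) - expected_value(p, A, q)
print(A_shifted, diff)
```

Since every expected payoff moves by the same amount, the ranking of strategies is unchanged, which is why the shifted game has the same optimal strategies.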
Convert to LP
The problem is to maximize $x_1 + x_2 + x_3$ subject to the constraints
$$2x_1 + x_2 + 3x_3 \le 1$$
$$3x_1 + 2x_2 + x_3 \le 1$$
$$x_1 + 3x_2 + 2x_3 \le 1.$$
We introduce slack variables $y_1$, $y_2$, and $y_3$, so the constraints now
become
$$2x_1 + x_2 + 3x_3 + y_1 = 1$$
$$3x_1 + 2x_2 + x_3 + y_2 = 1$$
$$x_1 + 3x_2 + 2x_3 + y_3 = 1.$$
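The initial tableau is

        x1     x2     x3     y1     y2     y3     z  |  value
  y1     2      1      3      1      0      0     0  |   1
  y2     3      2      1      0      1      0     0  |   1
  y3     1      3      2      0      0      1     0  |   1
  z     -1     -1     -1      0      0      0     1  |   0

We pivot on the x1 entry of the second row, which has the smallest ratio (1/3). Scaling that row by 1/3 gives

        x1     x2     x3     y1     y2     y3     z  |  value
  y1     2      1      3      1      0      0     0  |   1
  y2     1    2/3    1/3      0    1/3      0     0  |   1/3
  y3     1      3      2      0      0      1     0  |   1
  z     -1     -1     -1      0      0      0     1  |   0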
Then we use row operations to zero out the rest of column one:
        x1     x2     x3     y1     y2     y3     z  |  value
  y1     0   -1/3    7/3      1   -2/3      0     0  |   1/3
  x1     1    2/3    1/3      0    1/3      0     0  |   1/3
  y3     0    7/3    5/3      0   -1/3      1     0  |   2/3
  z      0   -1/3   -2/3      0    1/3      0     1  |   1/3
Pivoting next on the x3 entry of the first row gives

        x1     x2     x3     y1     y2     y3     z  |  value
  x3     0   -1/7      1    3/7   -2/7      0     0  |   1/7
  x1     1    5/7      0   -1/7    3/7      0     0  |   2/7
  y3     0   18/7      0   -5/7    1/7      1     0  |   3/7
  z      0   -3/7      0    2/7    1/7      0     1  |   3/7
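Finally, pivoting on the x2 entry of the third row gives

        x1     x2     x3     y1     y2     y3     z  |  value
  x3     0      0      1   7/18  -5/18   1/18     0  |   1/6
  x1     1      0      0   1/18   7/18  -5/18     0  |   1/6
  x2     0      1      0  -5/18   1/18   7/18     0  |   1/6
  z      0      0      0    1/6    1/6    1/6     1  |   1/2

The objective row has no negative entries, so this tableau is optimal: $x = (1/6, 1/6, 1/6)$ and $x_1 + x_2 + x_3 = 1/2$. Hence $q = (1/3, 1/3, 1/3)$, the shifted game has value 2, and the original game has value 0.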
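The pivots can be replayed in exact arithmetic. This is a minimal tableau-simplex sketch, not a general-purpose solver: it assumes $b \ge 0$, has no anti-cycling rule, picks the most negative objective entry (lowest index on ties), and omits the $z$ column since it never changes. The name `simplex_max` is hypothetical.

```python
from fractions import Fraction

def simplex_max(A, b, c):
    """Maximize c.x subject to Ax <= b, x >= 0, assuming b >= 0.

    Returns (x, objective value, final tableau without the z column).
    """
    m, n = len(A), len(c)
    # Constraint rows are [A | I | b]; the last row is [-c | 0 | 0].
    T = [[Fraction(a) for a in A[i]]
         + [Fraction(int(i == k)) for k in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    T.append([Fraction(-cj) for cj in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]            # slack variables start basic

    while True:
        objective_row = T[-1][:-1]
        if min(objective_row) >= 0:
            break                                 # no negative entries: optimal
        col = objective_row.index(min(objective_row))
        # Minimum ratio test over rows with a positive entry in the pivot column.
        ratios = [(T[i][-1] / T[i][col], i) for i in range(m) if T[i][col] > 0]
        if not ratios:
            raise ValueError("LP is unbounded")
        _, row = min(ratios)
        pivot = T[row][col]
        T[row] = [t / pivot for t in T[row]]      # scale the pivot row
        for i in range(m + 1):                    # clear the rest of the column
            if i != row and T[i][col] != 0:
                factor = T[i][col]
                T[i] = [t - factor * s for t, s in zip(T[i], T[row])]
        basis[row] = col

    x = [Fraction(0)] * n
    for i, var in enumerate(basis):
        if var < n:
            x[var] = T[i][-1]
    return x, T[-1][-1], T

A_shifted = [[2, 1, 3], [3, 2, 1], [1, 3, 2]]
x, z, T = simplex_max(A_shifted, [1, 1, 1], [1, 1, 1])
q = [xj / z for xj in x]     # column player's optimal strategy
v = 1 / z - 2                # undo the shift to get the original game's value
print(x, z, q, v)
```

With these pivoting rules the sketch visits the same sequence of pivots as the tableaux above (x1 in row 2, x3 in row 1, x2 in row 3).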
The row player's LP problem
Now let's think about the problem from the row player's
perspective. If R chooses strategy $p$, and C knew it, C would
choose $q$ to minimize the payoff $pAq$. Thus the row player wants
to maximize that minimum. That is, R's objective is realized when
the payoff is
$$E = \max_p \min_q pAq.$$
Lemma
Regardless of $p$, we have
$$\min_q pAq = \min_{1 \le j \le n} pAe_j$$
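So the row player's problem is to maximize $v$ subject to
$$v \le pAe_j, \qquad j = 1, 2, \dots, n$$
$$p_i \ge 0, \qquad i = 1, 2, \dots, m$$
$$\sum_{i=1}^m p_i = 1$$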
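Let $y = \frac{1}{v} p'$, that is, $y_i = p_i / v$. Then
$$\sum_{i=1}^m y_i = \frac{1}{v},$$
so maximizing $v$ is the same as minimizing $\sum_{i=1}^m y_i$, and the constraints $v \le pAe_j$ become $A'y \ge \mathbf{1}$.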
Upshot
Theorem
Consider a game with payoff matrix $A$, where each entry of $A$ is
positive. The row player's optimal strategy $p$ is
$$p = \frac{y'}{y_1 + \cdots + y_m},$$
where $y \ge 0$ solves the LP problem of minimizing
$y_1 + \cdots + y_m = \mathbf{1}'y$ subject to the constraints $A'y \ge \mathbf{1}$.
Theorem
The row player's LP problem is the dual of the column player's LP
problem.
        x1     x2     x3     y1     y2     y3     z  |  value
  x3     0      0      1   7/18  -5/18   1/18     0  |   1/6
  x1     1      0      0   1/18   7/18  -5/18     0  |   1/6
  x2     0      1      0  -5/18   1/18   7/18     0  |   1/6
  z      0      0      0    1/6    1/6    1/6     1  |   1/2
The entries in the objective row below the slack variables are the
solutions to the dual problem! In this case, we have the same
values, which means R has the same strategy as C . This reflects
the symmetry of the original game.
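The duality claim can be checked by hand or in a few lines: a feasible $x$ for the column player's LP and a feasible $y$ for the row player's LP with equal objective values are both optimal. The sketch below takes $x$ from the value column and $y$ from the objective-row entries under the slack variables of the final tableau, and uses the shifted Rock/Paper/Scissors matrix.

```python
from fractions import Fraction

A = [[2, 1, 3], [3, 2, 1], [1, 3, 2]]     # shifted Rock/Paper/Scissors matrix
x = [Fraction(1, 6)] * 3                   # primal solution (value column)
y = [Fraction(1, 6)] * 3                   # dual solution (objective row under y1..y3)

# Primal feasibility: every row of Ax is at most 1.
primal_feasible = all(sum(a * xj for a, xj in zip(row, x)) <= 1 for row in A)
# Dual feasibility: every row of A'y (every column of A weighted by y) is at least 1.
dual_feasible = all(sum(A[i][j] * y[i] for i in range(3)) >= 1 for j in range(3))

same_objective = sum(x) == sum(y)
p = [yi / sum(y) for yi in y]              # row player's optimal strategy
print(primal_feasible, dual_feasible, same_objective, p)
```

All three checks pass, and rescaling $y$ gives $p = (1/3, 1/3, 1/3)$, the same mix as the column player's.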
Example
Consider the game: players R and C each choose a number 1, 2,
or 3. If they choose the same thing, C pays R that amount. If
they choose differently, R pays C the amount that C has chosen.
What should each do?
Answer.

  Choice      R        C
  1         22.7%    54.5%
  2         36.4%    27.3%
  3         40.9%    18.2%
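An exact check of this game is short. The sketch assumes the payoff matrix built from the stated rules ($a_{ii} = i$, $a_{ij} = -j$ for $i \ne j$) and candidate strategies found by equalizing payoffs: $p = (5/22, 4/11, 9/22)$ for R and $q = (6/11, 3/11, 2/11)$ for C. If the payoff R can guarantee with $p$ equals the cap C can enforce with $q$, both strategies are optimal.

```python
from fractions import Fraction as F

A = [[1, -2, -3],
     [-1, 2, -3],
     [-1, -2, 3]]
p = [F(5, 22), F(4, 11), F(9, 22)]   # candidate strategy for R (assumed)
q = [F(6, 11), F(3, 11), F(2, 11)]   # candidate strategy for C (assumed)

# Worst case for R over C's pure choices, and worst case for C over R's.
row_guarantee = min(sum(p[i] * A[i][j] for i in range(3)) for j in range(3))
column_cap = max(sum(A[i][j] * q[j] for j in range(3)) for i in range(3))

print(row_guarantee, column_cap)
print([round(float(t) * 100, 1) for t in p])
print([round(float(t) * 100, 1) for t in q])
```

Both quantities equal $-6/11$, so the game favors C by about 0.55 per play, and the two percentage lists are R's and C's optimal mixes over the choices 1, 2, 3.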