Chapter 5 - Game Theory: History
History
• The study of game theory dates back to 1944, when John von Neumann and Oskar
Morgenstern published their classic book, Theory of Games and Economic Behavior. Since
then, game theory has been used by army generals to plan war strategies, by union
negotiators and managers in collective bargaining, and by businesses of all types to
determine the best strategies given a competitive business environment.
• Game theory continues to be important today. In 1994, John Harsanyi, John Nash, and
Reinhard Selten jointly received the Nobel Prize in Economics from the Royal Swedish
Academy of Sciences. In their classic work, these individuals developed the notion of
noncooperative game theory. After the work of John von Neumann and Oskar Morgenstern,
Nash developed the concepts of Nash equilibrium and the Nash bargaining problem, which
are the cornerstones of modern game theory.
Reference: H.A. Taha, Operations Research: An Introduction, 8th Edition, Prentice Hall, 2007.
AMA484 Decision Analysis 2
What Is a Game?
In a game, there are three elements,
• a payoff function.
Suppose we have
(1) a topological tree Γ with a distinguished vertex A (called the starting point); and
(2) a function, called the payoff function, which assigns an n-vector $(p_1, p_2, \ldots, p_n)$
to each terminal vertex of Γ in an n-player game.
Example
• Player 1 chooses heads (H) or tails (T );
• Player 2, not knowing player 1’s choice, chooses heads or tails;
• if the two choose alike, then Player 2 wins a cent from Player 1; otherwise, Player
1 wins a cent from Player 2;
• in the tree, vectors at the terminal vertices represent the payoff function;
• the numbers next to the other vertices denote the player whose move it is.
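Game tree (root: Player 1; each arrow is labelled with the move chosen):
    1 --H--> 2 --H--> (−1, 1)
    1 --H--> 2 --T--> (1, −1)
    1 --T--> 2 --H--> (1, −1)
    1 --T--> 2 --T--> (−1, 1)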
Zero-sum Game
In matching pennies, the payoff function takes the values
\[ (p_1, p_2) \in \{(-1, 1),\ (1, -1)\}. \]
A game Γ is said to be zero-sum if, at each terminal vertex, the payoff function
$(p_1, \ldots, p_n)$ satisfies
\[ \sum_{i=1}^{n} p_i = 0, \]
where n is the number of players (an n-person game).
Each terminal vertex corresponds to one combination of strategies chosen by all the players.
Normal Form
Here we consider a game with two players only.
Normal form:
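The normal form is a matrix whose rows correspond to the strategies of Player I and whose
columns correspond to the strategies of Player II. The entry $a_{ij}$ is the payoff when
Player I uses strategy i and Player II uses strategy j.
That is, $a_{ij}$ is the amount which Player I receives from Player II.
Let
\[ A = [a_{ij}], \]
where A is called the payoff matrix.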
Example
In the game of matching pennies, each player has two strategies, heads and tails. The
normal form of the game is
\[ A = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}. \]
This is a zero-sum game, since one player’s loss is the other player’s gain.
Example
• Two countries, I and II, are at war.
• Country II has two airfields and can defend one but not both:
βj = defend airfield j, j = 1, 2.
• Country I can attack only one of the airfields:
θi = attack airfield i, i = 1, 2.
• If I attacks a defended one, it withdraws immediately with no loss.
• If I attacks an undefended airfield, the airfield will be destroyed.
• Airfield 1 has value 1 and airfield 2 has value 2. The payoff matrix (payoff to Country I) is then
          β1    β2
    θ1     0     1
    θ2     2     0
This is a zero-sum game, provided the values to I of the destruction of the airfields are
the same as the values of the airfields to II.
Example
A manufacturer plays a game against Nature (or fate). Each of the
players has the choice of two moves: the manufacturer chooses between actions
β1 (expand his plant now) and β2 (delay expansion), and Nature controls the choice
between θ1 (economic conditions remain good) and θ2 (recession). Depending on the choice of
moves, the payoffs are shown in the following table:
                        Player II (Manufacturer)
                        β1             β2
    Player I    θ1      L(β1, θ1)      L(β2, θ1)
    (Nature)    θ2      L(β1, θ2)      L(β2, θ2)
The amounts L(β1, θ1), L(β2, θ1), L(β1, θ2), L(β2, θ2) are referred to as the values of
the loss function that characterizes the particular game.
In other words, L(βj, θi) is the loss of Player II (the amount he has to pay Player I)
when he chooses alternative βj and Player I chooses alternative θi.
Although it does not really matter, we shall assume here that these amounts are in
dollars. In actual practice, they can also be expressed in terms of any goods or services,
in units of utility (desirability or satisfaction), and even in terms of life or death.
Let us also assume that each player must choose his strategy without knowing what
his opponent is going to do, and that once a player has made his choice it cannot be
changed.
The objectives of the theory of games are to determine optimum strategies (i.e.,
strategies that are most profitable to the respective players) and the corresponding
payoff, which is called the value of the game.
Example 1
Given the 2 × 2 zero-sum two-person game
                      Player II
                      β1     β2
    Player I    θ1     7     −4
                θ2     8     10
find the optimum strategies of Players I and II and the value of the game.
SOLUTION.
For Player I, Strategy θ2 yields more than Strategy θ1 regardless of the choice made
by Player II (8 > 7 and 10 > −4).
In a situation like this we say that Strategy θ2 dominates Strategy θ1, and the dominated
strategy can be discarded.
If we do this here, we find that Player I's optimum strategy is Strategy θ2, the only one
left, and that Player II's optimum strategy is Strategy β1, since a loss of $8 is obviously
preferable to a loss of $10.
Also, the value of the game, the payoff corresponding to Strategies θ2 and β1, is $8.
Example 2
Given the 2 × 3 zero-sum two-person game
                      Player II
                      β1     β2     β3
    Player I    θ1    −4      1      7
                θ2     4      3      5
find the optimum strategies of Players I and II and the value of the game.
SOLUTION.
In this game, Player I does not have a dominant strategy, but the third strategy of Player II is
dominated by each of the other two strategies. (Why?)
Clearly, for Player II, a profit of $4 or a loss of $1 is preferable to a loss of $7, and a
loss of $4 or a loss of $3 is preferable to a loss of $5. Thus, we can discard the third
column of the payoff matrix and study the 2 × 2 game
                      Player II
                      β1     β2
    Player I    θ1    −4      1
                θ2     4      3
where Strategy θ2 of Player I is a dominant strategy. Thus, the optimum choice of Player
I is Strategy θ2, the optimum choice of Player II is Strategy β2, and the value of the
game is $3.
Consider the game
                      Player II
                      β1     β2     β3
    Player I    θ1    −1      6     −2
                θ2     2      4      6
                θ3    −2     −6     12
First, if Player II chooses Strategy β1, β2, or β3, the worst that can happen is that he
loses $2, $6, or $12, respectively. Thus, he can minimize his maximum loss by choosing Strategy β1.
For Player I, if she chooses Strategy θ1, θ2, or θ3, the worst that can happen is that she
loses $2, wins $2, or loses $6, respectively. Thus, she can minimize her maximum loss
by choosing Strategy θ2.
Strategies β1 and θ2 are called minimax strategies.
By choosing Strategy β1, Player II makes sure that his opponent can win at most $2,
and by choosing Strategy θ2, Player I makes sure that she will actually win this amount.
In our example, even if Player I (II) announced publicly that she (he) will choose Strategy
θ2 (β1), it would still be best for Player II (I) to choose Strategy β1 (θ2).
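The row-by-row and column-by-column reasoning above can be checked mechanically. The following is a minimal Python sketch, assuming the matrix lists payoffs to Player I with her strategies as rows; the function name `pure_minimax` is hypothetical and not part of the course material.

```python
def pure_minimax(A):
    """Pure minimax choices for a payoff matrix A (payoffs to the row player).

    Returns (i, row_worst, j, col_worst) with 0-based indices, where
    row_worst is the largest row minimum and col_worst is the smallest
    column maximum.
    """
    row_mins = [min(row) for row in A]          # worst case for Player I, per row
    col_maxs = [max(col) for col in zip(*A)]    # worst case for Player II, per column
    i = row_mins.index(max(row_mins))
    j = col_maxs.index(min(col_maxs))
    return i, row_mins[i], j, col_maxs[j]


A = [[-1, 6, -2],
     [2, 4, 6],
     [-2, -6, 12]]
print(pure_minimax(A))  # (1, 2, 0, 2): theta_2 and beta_1, both worst cases equal 2
```

When the two worst-case values coincide, as they do here, neither player gains by deviating from the minimax choice.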
Saddle point
A strategy pair (i, j) is in equilibrium (a saddle point) if the element $a_{ij}$ is
both the largest in its column and the smallest in its row (LCSR, in short).
The value $a_{ij}$ is called the optimal payoff.
For example,
(ii) The game matrix below does not have a saddle point:
\[ \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}. \]
The saddle point may not be unique, but the optimal payoff is unique. For example, in
\[ \begin{pmatrix} 4 & 3 & 5 \\ 0 & 1 & 0 \\ 6 & 3 & 9 \end{pmatrix} \]
both (1, 2) and (3, 2) are saddle points with the optimal payoff 3. How do we find a saddle
point, supposing that one exists?
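One direct way to look for saddle points is to test the LCSR condition on every entry. The sketch below does exactly that; it is a brute-force illustration, and the function name `saddle_points` is an assumption of this example rather than notation from the slides.

```python
def saddle_points(A):
    """Return all 1-based pairs (i, j) such that A[i][j] is the smallest
    entry in its row and the largest entry in its column (LCSR)."""
    cols = list(zip(*A))
    return [(i + 1, j + 1)
            for i, row in enumerate(A)
            for j, a in enumerate(row)
            if a == min(row) and a == max(cols[j])]


print(saddle_points([[4, 3, 5], [0, 1, 0], [6, 3, 9]]))  # [(1, 2), (3, 2)]
print(saddle_points([[-1, 1], [1, -1]]))                 # []
```

The first call reproduces the two saddle points of the 3 × 3 matrix above; the second confirms that matching pennies has none.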
Gain Floor
Let us consider the game without a saddle point
Player I:
\[ \begin{array}{l|cc} \text{Strategy } \theta_1 & 4 & 2 \\ \text{Strategy } \theta_2 & 1 & 3 \end{array} \]
• Suppose Player II is not only unpredictable but omniscient (he will guess correctly
whatever Player I decides).
• Player I wins at least 2 units with Strategy θ1, and at least 1 unit with Strategy θ2.
• This certain win of at least two units is Player I's gain-floor, and we shall
denote it by $v_I$:
\[ v_I = \max_i \{ \min_j a_{ij} \}. \]
That is, $v_I = 2$.
Loss Ceiling
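For the same game,
\[ \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix}, \]
• Player II loses at most 4 units with his first strategy (column 1) and at most 3 units
with his second strategy (column 2);
• the minimum of these two values is defined to be the loss-ceiling of Player II,
which is 3 units:
\[ v_{II} = \min_j \{ \max_i a_{ij} \} = 3. \]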
Minimax Inequality
It can be easily shown that
\[ v_I \le v_{II}, \]
or equivalently
\[ \max_i \min_j a_{ij} \le \min_j \max_i a_{ij}. \]
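For completeness, here is one short way to see why the inequality holds. It is a standard argument written out as a sketch; the indices $i'$ and $j'$ are introduced only for this derivation.

```latex
% For any fixed row i' and column j', the entry a_{i'j'} lies between
% the minimum of its row and the maximum of its column:
\min_j a_{i'j} \;\le\; a_{i'j'} \;\le\; \max_i a_{ij'} .
% The left side does not depend on j' and the right side does not depend on i'.
% Taking the maximum over i' on the left and the minimum over j' on the right gives
\max_{i'} \min_j a_{i'j} \;\le\; \min_{j'} \max_i a_{ij'} ,
\qquad\text{that is,}\qquad v_I \le v_{II} .
```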
Show that the minimax strategies of Players I and II are not spyproof in the following
game:
                      Player II
                      β1     β2
    Player I    θ1     8     −5
                θ2     2      6
SOLUTION.
Player I can minimize her maximum loss by choosing Strategy θ2 .
Player II can minimize his maximum loss by choosing Strategy β2 .
However, if Player II knew that Player I was going to base her choice on the minimax
criterion, he could switch to Strategy β1 and thus reduce his loss from $6 to $2.
Of course, if Player I discovered that Player II would try to outsmart her, she could in
turn switch to Strategy θ1 and increase her gain to $8.
In any case, the minimax strategies of the two players are not spyproof.
Clearly, there is no saddle point in this example, since the smallest value of each row is
also the smallest value of its column, and so cannot be the largest value of its column.
Therefore not all games are spyproof.
In general, if a game has a saddle point, it is said to be strictly determined, and the
strategies corresponding to the saddle point are spyproof (optimum) minimax
strategies.
If a game does not have a saddle point, minimax strategies are not spyproof, and each
player can outsmart the other if he knows how the opponent will react in a given
situation. To avoid this possibility, each player should somehow mix up his or her
behavior patterns intentionally, and the best way of doing this is by introducing an
element of chance into the selection of strategies.
Mixed Strategy
A mixed strategy for a player is a probability distribution on the set of his pure
strategies.
If a player has only a finite number, m, of pure strategies, a mixed strategy
reduces to an m-vector, $x = (x_1, \ldots, x_m)$, satisfying
\[ x_i \ge 0, \qquad \sum_{i=1}^{m} x_i = 1. \]
We shall denote the set of all mixed strategies for Player I by X, and the set of all
mixed strategies for Player II by Y:
\[ X = \Big\{ x = (x_1, \ldots, x_m) : x_i \ge 0,\ \sum_{i=1}^{m} x_i = 1 \Big\}, \]
\[ Y = \Big\{ y = (y_1, \ldots, y_n) : y_j \ge 0,\ \sum_{j=1}^{n} y_j = 1 \Big\}. \]
Expected Payoff
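Let us suppose that Players I and II are playing the matrix game A. If Player I chooses
the mixed strategy x and Player II chooses y, then the expected payoff is
computed by weighting each entry $a_{ij}$ by the probabilities $x_i$ and $y_j$:
\[
\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}
\longrightarrow
\begin{pmatrix} x_1 a_{11} y_1 & \cdots & x_1 a_{1n} y_n \\ \vdots & \ddots & \vdots \\ x_m a_{m1} y_1 & \cdots & x_m a_{mn} y_n \end{pmatrix}
\]
That is,
\[ A(x, y) = \sum_{i=1}^{m} \sum_{j=1}^{n} x_i a_{ij} y_j. \]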
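The double sum can be evaluated directly. The sketch below uses exact fractions and the matching-pennies matrix as a test case; the helper names `is_mixed_strategy` and `expected_payoff` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction


def is_mixed_strategy(p):
    """A mixed strategy has nonnegative entries that sum to 1."""
    return all(pi >= 0 for pi in p) and sum(p) == 1


def expected_payoff(A, x, y):
    """A(x, y) = sum_i sum_j x_i * a_ij * y_j for mixed strategies x and y."""
    if not (is_mixed_strategy(x) and is_mixed_strategy(y)):
        raise ValueError("x and y must be probability vectors")
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))


A = [[-1, 1], [1, -1]]  # matching pennies
half = Fraction(1, 2)
print(expected_payoff(A, [half, half], [half, half]))                 # 0
print(expected_payoff(A, [1, 0], [Fraction(1, 4), Fraction(3, 4)]))   # 1/2
```

A pure strategy is the special case in which one probability equals 1, as in the second call.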
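If Player I uses the mixed strategy x, the least expected payoff she can be held to is
\[ v(x) = \min_{y \in Y} A(x, y) = \min_j\, x A_{\bullet j} \]
(the minimum will be attained by a pure strategy j; $A_{\bullet j}$ is the jth column of the matrix
A). Hence Player I should choose x so as to maximize v(x):
\[ v_I = \max_{x \in X} \min_j\, x A_{\bullet j}. \]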
Minimax Theorem
Thus we obtain the two numbers $v_I$ and $v_{II}$. These numbers are called the values of
the game to Players I and II, respectively. The minimax theorem states that, when mixed
strategies are allowed,
\[ v_I = v_{II}. \]
This theorem is the most important in game theory. It says that every finite two-person
zero-sum game has optimal (mixed) strategies.
Row Domination
General case:
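In a matrix A, we say the ith row dominates the kth row if
\[ a_{ij} \ge a_{kj} \quad \text{for all } j, \]
and
\[ a_{ij} > a_{kj} \quad \text{for at least one } j. \]
For example, in
\[ \begin{pmatrix} 2 & 0 & 1 & 4 \\ 6 & 2 & 5 & 3 \\ 4 & 1 & 3 & 2 \end{pmatrix} \]
the second row dominates the third row, so the third row is removed.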
Column Domination
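Similarly, we say that the jth column dominates the lth column if
\[ a_{ij} \le a_{il} \quad \text{for all } i, \]
and
\[ a_{ij} < a_{il} \quad \text{for at least one } i. \]
For example, in
\[ \begin{pmatrix} 2 & 0 & 1 & 4 \\ 1 & 2 & 5 & 3 \\ 4 & 1 & 3 & 2 \end{pmatrix} \]
the second column dominates the fourth column.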
Theorem
Example
Consider the game with matrix
\[ \begin{pmatrix} 2 & 0 & 1 & 4 \\ 1 & 2 & 5 & 3 \\ 4 & 1 & 3 & 2 \end{pmatrix}. \]
It is seen that the second column dominates the fourth column, so column 4 is deleted:
\[ \begin{pmatrix} 2 & 0 & 1 \\ 1 & 2 & 5 \\ 4 & 1 & 3 \end{pmatrix}. \]
In this new matrix, we find that the third row dominates the first row, so row 1 is deleted:
\[ \begin{pmatrix} 1 & 2 & 5 \\ 4 & 1 & 3 \end{pmatrix}, \]
and in this new matrix, the third column is dominated by the second column. Hence
the matrix is reduced to
\[ \begin{pmatrix} 1 & 2 \\ 4 & 1 \end{pmatrix}, \]
and we now look for optimal strategies for the small 2 × 2 matrix game.
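The reduction can be automated by repeatedly deleting one dominated row or column until none is left. The following is a sketch under the domination definitions used in this chapter (a row dominates if it is at least as large everywhere, a column dominates if it is at least as small everywhere, with strict inequality somewhere). The function names are hypothetical, and the code may delete lines in a different order from the worked example, although it ends at the same 2 × 2 game here.

```python
def row_dominates(A, i, k, cols):
    """Row i dominates row k over the surviving columns."""
    return (all(A[i][j] >= A[k][j] for j in cols)
            and any(A[i][j] > A[k][j] for j in cols))


def col_dominates(A, j, l, rows):
    """Column j dominates column l over the surviving rows."""
    return (all(A[i][j] <= A[i][l] for i in rows)
            and any(A[i][j] < A[i][l] for i in rows))


def reduce_by_dominance(A):
    """Return (surviving rows, surviving columns, reduced matrix), 0-based."""
    rows = list(range(len(A)))
    cols = list(range(len(A[0])))
    changed = True
    while changed:
        changed = False
        for k in rows:  # look for one dominated row
            if any(row_dominates(A, i, k, cols) for i in rows if i != k):
                rows.remove(k)
                changed = True
                break
        if changed:
            continue
        for l in cols:  # otherwise look for one dominated column
            if any(col_dominates(A, j, l, rows) for j in cols if j != l):
                cols.remove(l)
                changed = True
                break
    return rows, cols, [[A[i][j] for j in cols] for i in rows]


A = [[2, 0, 1, 4],
     [1, 2, 5, 3],
     [4, 1, 3, 2]]
print(reduce_by_dominance(A))  # ([1, 2], [0, 1], [[1, 2], [4, 1]])
```

The surviving rows are the second and third, and the surviving columns are the first and second, which matches the reduced game above.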
2 × 2 games
Theorem 3 Let A be a 2 × 2 matrix game. If A does not have a saddle point, its
unique optimal strategies and optimal expected payoff are given by
\[ x = \frac{J A^\star}{J A^\star J^T}, \]
\[ y = \frac{J (A^\star)^T}{J A^\star J^T}, \]
\[ v = \frac{|A|}{J A^\star J^T}, \]
where $A^\star$ is the adjoint of A, |A| is the determinant of A, and J is the vector (1, 1).
Adjoint
Let
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]
Then the adjoint of A is
\[ A^\star = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]
Counterexample
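If there is a saddle point, the denominator may be zero:
\[ J A^\star J^T = 0. \]
For example, for
\[ A = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}, \]
\[ J A^\star J^T = (1, 1) \begin{pmatrix} 4 & -3 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = (2, -2) \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 0. \]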
Example Continued
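We have
\[ A = \begin{pmatrix} 1 & 2 \\ 4 & 1 \end{pmatrix}. \]
Thus,
\[ A^\star = \begin{pmatrix} 1 & -2 \\ -4 & 1 \end{pmatrix}, \qquad |A| = -7, \]
\[ J A^\star J^T = (-3, -1) \begin{pmatrix} 1 \\ 1 \end{pmatrix} = -4, \]
\[ x = \frac{J A^\star}{J A^\star J^T} = \frac{(-3, -1)}{-4} = \left( \frac{3}{4}, \frac{1}{4} \right), \]
\[ y = \frac{J (A^\star)^T}{J A^\star J^T} = \frac{(1, 1) \begin{pmatrix} 1 & -4 \\ -2 & 1 \end{pmatrix}}{-4} = \left( \frac{1}{4}, \frac{3}{4} \right), \]
\[ v = \frac{|A|}{J A^\star J^T} = \frac{-7}{-4} = \frac{7}{4}. \]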
Example
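Solve the matrix game
\[ A = \begin{pmatrix} 1 & 0 \\ -1 & 2 \end{pmatrix}. \]
It is easy to check that the game does not have a saddle point. The adjoint matrix
$A^\star$ of A is
\[ A^\star = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}, \]
and $|A| = 2$, $J A^\star = (3, 1)$, $J (A^\star)^T = (2, 2)$, and $J A^\star J^T = 4$. Thus we have
\[ x = \left( \frac{3}{4}, \frac{1}{4} \right), \qquad y = \left( \frac{1}{2}, \frac{1}{2} \right), \qquad v = \frac{1}{2}. \]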
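The formulas of Theorem 3 are easy to evaluate exactly. The sketch below assumes integer (or rational) payoffs and takes the last example's matrix to be [[1, 0], [−1, 2]], which is the matrix consistent with the adjoint and determinant stated there. The function name `solve_2x2` is hypothetical, and the saddle-point check is an addition of this example so that the formula is not applied outside the theorem's hypothesis.

```python
from fractions import Fraction


def solve_2x2(A):
    """Optimal (x, y, v) for a 2 x 2 matrix game without a saddle point,
    using x = JA*/(JA*J^T), y = J(A*)^T/(JA*J^T), v = |A|/(JA*J^T)."""
    (a, b), (c, d) = A
    gain_floor = max(min(a, b), min(c, d))
    loss_ceiling = min(max(a, c), max(b, d))
    if gain_floor == loss_ceiling:
        raise ValueError("saddle point present; the formula does not apply")
    adj = [[d, -b], [-c, a]]                                  # adjoint A*
    det = a * d - b * c                                       # |A|
    JA = [adj[0][0] + adj[1][0], adj[0][1] + adj[1][1]]       # J A*      (column sums)
    JAT = [adj[0][0] + adj[0][1], adj[1][0] + adj[1][1]]      # J (A*)^T  (row sums)
    denom = sum(JA)                                           # J A* J^T
    x = tuple(Fraction(t, denom) for t in JA)
    y = tuple(Fraction(t, denom) for t in JAT)
    return x, y, Fraction(det, denom)


print(solve_2x2([[1, 2], [4, 1]]))    # x = (3/4, 1/4), y = (1/4, 3/4), v = 7/4
print(solve_2x2([[1, 0], [-1, 2]]))   # x = (3/4, 1/4), y = (1/2, 1/2), v = 1/2
```

Calling `solve_2x2([[1, 3], [2, 4]])` raises `ValueError`, since that matrix has a saddle point and its denominator is zero.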
Statistical Games
In statistical inference, decisions about populations are based on sample data, and it is
natural to look upon such an inference as a game between Nature (which controls the
relevant feature(s) of the population) and the person who must arrive at some decision
about Nature's choice.
For instance, if we want to estimate the mean, µ of a normal population on the basis of
a random sample of size n, we could say that Nature has control over the true value of
µ.
On the other hand, we might estimate µ in terms of the value of the sample mean or
median, and presumably there is some penalty that depends on the size of our error.
(1) Statistical games treat Nature as a rational opponent rather than a malevolent
opponent.
(2) In a statistical game, the statistician is supplied with sample data that provide him
with some information about Nature’s choice. This also complicates matters, but it
merely amounts to the fact that we are dealing with more complicated kinds of
games.
For example, we are told that a coin is either balanced with heads on one side and tails
on the other or two-headed. We cannot inspect the coin, but we can flip it once and
observe whether it comes up heads or tails. Then we must decide whether or not it is
two-headed, keeping in mind that there is a penalty of $1 if our decision is wrong and
no penalty if our decision is right. If we ignored the fact that we can observe one flip of
the coin, we could treat the problem as the following game:
                        Player II (Statistician)
                        β1                β2
    Player I    θ1      L(β1, θ1) = 0     L(β2, θ1) = 1
    (Nature)    θ2      L(β1, θ2) = 1     L(β2, θ2) = 0
Here θ1 and θ2 are the "states of Nature" that the coin is two-headed and balanced
(heads and tails), respectively; β1 and β2 are the statistician's decisions that the coin is
two-headed and balanced, respectively. The entries in the table are the corresponding
values of the given loss function.
Now suppose that Player II knows the result of the flip of the coin, i.e., that a random
variable X has taken on the value x = 0 (heads) or x = 1 (tails). Since we shall want to
make use of this information in choosing between β1 and β2, we need a function, a decision
function, that tells us what action to take when x = 0 and when x = 1. We can express this
by writing, for instance,
\[ d_1(x) = \begin{cases} \beta_1, & \text{if } x = 0, \\ \beta_2, & \text{if } x = 1. \end{cases} \]
The purpose of the subscript is to distinguish this decision function from others. In
all, there are four different decision functions here; the other three are
\[ d_2(0) = \beta_1, \quad d_2(1) = \beta_1, \]
\[ d_3(0) = \beta_2, \quad d_3(1) = \beta_2, \]
\[ d_4(0) = \beta_2, \quad d_4(1) = \beta_1. \]
To compare the merits of all these decision functions, let us first determine the
expected losses to which they lead for the various strategies of Nature, that is, the
values of the risk function
\[ R(d_i, \theta_j) = E[L(d_i(X), \theta_j)], \]
where the expectation is taken with respect to the random variable X.
We have thus arrived at the following 2 × 4 zero-sum two-person game, in which the
payoffs are the corresponding values of the risk function:
                        Player II (Statistician)
                        d1      d2      d3      d4
    Player I    θ1      0       0       1       1
    (Nature)    θ2      1/2     1       0       1/2
Since d2 is dominated by d1 and d4 is dominated by d3, these two columns can be discarded.
This leaves us with the 2 × 2 zero-sum two-person game
                        Player II (Statistician)
                        d1      d3
    Player I    θ1      0       1
    (Nature)    θ2      1/2     0
It can be verified that if Nature is looked upon as a malevolent opponent, the optimum
strategy is to randomize between d1 and d3 with respective probabilities of 2/3 and 1/3,
and the value of the game is 1/3 of a dollar.
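The risk table can be recomputed from the loss function and the distribution of X under each state of Nature. The sketch below encodes the coin example with x = 0 for heads and x = 1 for tails; the dictionary layout and the names `prob`, `decisions`, `loss`, and `risk` are assumptions of this illustration.

```python
from fractions import Fraction

# P(X = x | theta), with x = 0 for heads and x = 1 for tails
prob = {
    "theta1": {0: Fraction(1), 1: Fraction(0)},         # two-headed coin
    "theta2": {0: Fraction(1, 2), 1: Fraction(1, 2)},   # balanced coin
}
correct = {"theta1": "beta1", "theta2": "beta2"}        # the right decision per state
decisions = {
    "d1": {0: "beta1", 1: "beta2"},
    "d2": {0: "beta1", 1: "beta1"},
    "d3": {0: "beta2", 1: "beta2"},
    "d4": {0: "beta2", 1: "beta1"},
}


def loss(action, theta):
    """Penalty of $1 for a wrong decision, nothing for a right one."""
    return 0 if action == correct[theta] else 1


def risk(d, theta):
    """R(d, theta) = E[L(d(X), theta)], expectation over X given theta."""
    return sum(p * loss(decisions[d][x], theta) for x, p in prob[theta].items())


table = {theta: [risk(d, theta) for d in decisions] for theta in prob}
print(table["theta1"])  # risks of d1..d4 under theta1: 0, 0, 1, 1
print(table["theta2"])  # risks of d1..d4 under theta2: 1/2, 1, 0, 1/2

# Randomizing between d1 and d3 with probabilities 2/3 and 1/3 equalizes the risk.
q = Fraction(2, 3)
mixed = {theta: q * risk("d1", theta) + (1 - q) * risk("d3", theta) for theta in prob}
print(mixed)  # both states give 1/3
```

The equal risk of 1/3 under both states is the value of the game quoted above.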
If Nature is a rational opponent rather than a malevolent one, the problem is better stated
abstractly. We formulated this problem with reference to a two-headed coin and an ordinary
coin; in general, we must decide on the basis of a single observation whether the random
variable has the Bernoulli distribution with the parameter θ = 0 or the parameter θ = 1/2.
Example Continued
A random variable has the uniform density
\[ f(x) = \begin{cases} \dfrac{1}{\theta}, & \text{for } 0 < x < \theta, \\ 0, & \text{otherwise}, \end{cases} \]
and we want to estimate the parameter θ (the move of Nature) on the basis of a single
observation. If the decision function is to be of the form d(x) = kx, where k ≥ 1, and
the losses are proportional to the absolute value of the errors, that is,
\[ L(kx, \theta) = c\,|kx - \theta|, \qquad c > 0, \]
find the value of k that will minimize the risk.
SOLUTION.
For the risk function we get
\[ R(d, \theta) = \int_0^{\theta/k} c(\theta - kx)\,\frac{1}{\theta}\,dx + \int_{\theta/k}^{\theta} c(kx - \theta)\,\frac{1}{\theta}\,dx = c\theta \left( \frac{k}{2} - 1 + \frac{1}{k} \right), \]
and there is nothing we can do about the factor θ, but it can easily be verified that
$k = \sqrt{2}$ will minimize $\frac{k}{2} - 1 + \frac{1}{k}$.
Thus, if we actually took the observation and got x = 5, our estimate of θ would be
$5\sqrt{2}$.
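The minimization over k is a one-line calculus exercise, written out here as a worked check. The name $g$ is introduced only for this derivation.

```latex
\begin{align*}
g(k) &= \frac{k}{2} - 1 + \frac{1}{k}, \qquad k \ge 1, \\
g'(k) &= \frac{1}{2} - \frac{1}{k^{2}} = 0 \;\Longrightarrow\; k = \sqrt{2}, \\
g''(k) &= \frac{2}{k^{3}} > 0 \quad \text{(so the stationary point is a minimum)}, \\
g(\sqrt{2}) &= \frac{\sqrt{2}}{2} - 1 + \frac{1}{\sqrt{2}} = \sqrt{2} - 1,
\qquad\text{hence}\qquad \min_k R(d, \theta) = c\,\theta\,(\sqrt{2} - 1).
\end{align*}
```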
Decision Criteria
In the above example, we were able to find a decision function that minimized the risk
regardless of the true state of Nature, but this is the exception rather than the rule. Had
we not limited ourselves to decision functions of the form d(x) = kx, then the decision
function given by d(x) = θi would be best when θ happens to equal θi and it is
obvious that there can be no decision function that is best for all values of θ.
In general, we thus have to be satisfied with decision functions that are best only with
respect to some criterion, such as
(1) the minimax criterion, according to which we choose the decision function d for
which R(d, θ), maximized with respect to θ, is a minimum;
(2) the Bayes criterion, according to which we choose the decision function d for
which the Bayes risk E[R(d, θ)] is a minimum, where the expectation is taken
with respect to θ. This requires that we look upon θ as a random variable having a
given distribution.
Example
Use the minimax criterion to estimate the parameter θ of a binomial distribution on the
basis of the random variable X, the observed number of successes in n trials, when the
decision function is of the form
\[ d(x) = \frac{x + a}{n + b}, \]
where a and b are constants, and the loss function is given by
\[ L\left( \frac{x + a}{n + b}, \theta \right) = c \left( \frac{x + a}{n + b} - \theta \right)^{2}, \]
where c is a positive constant.
SOLUTION.
The problem is to find the values of a and b that will minimize the corresponding risk
function after it has been maximized with respect to θ.
After all, we have control over the choice of a and b, while Nature (our presumed
opponent) has control over the choice of θ.
Since $E(X) = n\theta$ and $E(X^2) = n\theta(1 - \theta + n\theta)$, it follows that
\[ R(d, \theta) = E\left[ c \left( \frac{X + a}{n + b} - \theta \right)^{2} \right] = \frac{c}{(n + b)^2} \left[ \theta^2 (b^2 - n) + \theta (n - 2ab) + a^2 \right], \]
and we could find the value of θ that maximizes R(d, θ) and then find the values
of a and b that minimize this maximum.
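The intermediate algebra behind the risk function is short enough to write out. The abbreviation $m = a - (n+b)\theta$ is introduced only for this derivation.

```latex
\begin{align*}
(n+b)^2 \, \frac{R(d,\theta)}{c}
  &= E\big[(X + m)^2\big], \qquad m = a - (n+b)\theta, \\
  &= E(X^2) + 2m\,E(X) + m^2 \\
  &= n\theta(1 - \theta + n\theta) + 2n\theta\,\big(a - (n+b)\theta\big) + \big(a - (n+b)\theta\big)^2 \\
  &= \theta^2\big(-n + n^2 - 2n^2 - 2nb + n^2 + 2nb + b^2\big) + \theta\big(n + 2an - 2an - 2ab\big) + a^2 \\
  &= \theta^2 (b^2 - n) + \theta (n - 2ab) + a^2 .
\end{align*}
```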
To simplify the work in a problem of this kind, we can often use the equalizer
principle, according to which (under fairly general conditions) the risk function of a
minimax decision rule is a constant; for instance, it tells us that the risk function should
not depend on the value of θ.
To make the risk function independent of θ, the coefficients of θ and θ² must both
equal 0 in the expression for R(d, θ). This yields $b^2 - n = 0$ and $n - 2ab = 0$, and,
hence, $a = \frac{1}{2}\sqrt{n}$ and $b = \sqrt{n}$. Thus, the minimax decision function is given by
\[ d(x) = \frac{x + \frac{1}{2}\sqrt{n}}{n + \sqrt{n}}, \]
and if we actually obtained 39 successes in 100 trials, we would estimate the parameter
θ of this binomial distribution as
\[ d(39) = \frac{39 + \frac{1}{2}\sqrt{100}}{100 + \sqrt{100}} = 0.40. \]
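The equalizer property can be checked numerically by summing over the binomial distribution directly. This is a sketch that assumes n is a perfect square, so that √n is an exact integer and all arithmetic stays in fractions; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, isqrt


def exact_risk_over_c(n, a, b, theta):
    """E[((X + a)/(n + b) - theta)^2] for X ~ Binomial(n, theta), i.e. R(d, theta)/c,
    computed by direct summation over x = 0, ..., n."""
    return sum(comb(n, x) * theta**x * (1 - theta)**(n - x)
               * (Fraction(x + a) / (n + b) - theta)**2
               for x in range(n + 1))


def minimax_estimate(x, n):
    """d(x) = (x + sqrt(n)/2) / (n + sqrt(n)); assumes n is a perfect square."""
    root = isqrt(n)
    return (x + Fraction(root, 2)) / (n + root)


# With n = 4 the minimax choice is a = 1, b = 2; the risk is the same for every theta.
risks = {exact_risk_over_c(4, Fraction(1), 2, Fraction(t, 10)) for t in range(11)}
print(risks)                               # a single value, 1/36
print(float(minimax_estimate(39, 100)))    # 0.4
```

The constant value 1/36 agrees with the closed form $a^2/(n+b)^2$ once the coefficients of θ and θ² vanish.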
xi ≥ 0, i = 1, 2, · · · , m.
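Now let
\[ v = \min \left\{ \sum_{i=1}^{m} a_{i1} x_i,\ \sum_{i=1}^{m} a_{i2} x_i,\ \ldots,\ \sum_{i=1}^{m} a_{in} x_i \right\}. \]
The equation implies that
\[ \sum_{i=1}^{m} a_{ij} x_i \ge v, \qquad j = 1, 2, \ldots, n. \]
Player I's problem can thus be written as
\[
\begin{array}{ll}
\text{maximize} & z = v, \\
\text{subject to} & v - \sum_{i=1}^{m} a_{ij} x_i \le 0, \quad j = 1, 2, \ldots, n, \\
& x_1 + x_2 + \cdots + x_m = 1, \\
& x_i \ge 0, \quad i = 1, 2, \ldots, m, \\
& v \text{ unrestricted.}
\end{array}
\]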
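\[ y_1 + y_2 + \cdots + y_n = 1, \]
\[ y_j \ge 0, \qquad j = 1, 2, \ldots, n. \]
Using a procedure similar to that of Player I, Player II's problem reduces to
\[
\begin{array}{ll}
\text{minimize} & w = v, \\
\text{subject to} & v - \sum_{j=1}^{n} a_{ij} y_j \ge 0, \quad i = 1, 2, \ldots, m, \\
& y_1 + y_2 + \cdots + y_n = 1, \\
& y_j \ge 0, \quad j = 1, 2, \ldots, n, \\
& v \text{ unrestricted.}
\end{array}
\]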
Example
Solve the following game by linear programming.
          β1    β2    β3
    θ1     3    −1    −3
    θ2    −2     4    −1
    θ3    −5    −6     2
Solving by a Computer
Player I’s Linear Program
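\[
\begin{array}{ll}
\text{maximize} & z = v, \\
\text{subject to} & v - 3x_1 + 2x_2 + 5x_3 \le 0, \\
& v + x_1 - 4x_2 + 6x_3 \le 0, \\
& v + 3x_1 + x_2 - 2x_3 \le 0, \\
& x_1 + x_2 + x_3 = 1, \\
& x_1, x_2, x_3 \ge 0, \\
& v \text{ unrestricted.}
\end{array}
\]
Player II's Linear Program
\[
\begin{array}{ll}
\text{minimize} & w = v, \\
\text{subject to} & v - 3y_1 + y_2 + 3y_3 \ge 0, \\
& v + 2y_1 - 4y_2 + y_3 \ge 0, \\
& v + 5y_1 + 6y_2 - 2y_3 \ge 0, \\
& y_1 + y_2 + y_3 = 1, \\
& y_1, y_2, y_3 \ge 0, \\
& v \text{ unrestricted.}
\end{array}
\]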
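The constraint rows of both linear programs follow mechanically from the payoff matrix, which is useful when preparing input for whatever LP solver is at hand. The sketch below only builds the coefficient rows, with the variable order (v, then the probabilities), and does not call any solver; the function names are hypothetical.

```python
def player1_lp_rows(A):
    """Coefficient rows over (v, x1, ..., xm) for the constraints
    v - sum_i a_ij * x_i <= 0, one row per column j of A."""
    m, n = len(A), len(A[0])
    return [[1] + [-A[i][j] for i in range(m)] for j in range(n)]


def player2_lp_rows(A):
    """Coefficient rows over (v, y1, ..., yn) for the constraints
    v - sum_j a_ij * y_j >= 0, one row per row i of A."""
    return [[1] + [-a for a in row] for row in A]


A = [[3, -1, -3],
     [-2, 4, -1],
     [-5, -6, 2]]
for row in player1_lp_rows(A):
    print(row)   # [1, -3, 2, 5], then [1, 1, -4, 6], then [1, 3, 1, -2]
for row in player2_lp_rows(A):
    print(row)   # [1, -3, 1, 3], then [1, 2, -4, 1], then [1, 5, 6, -2]
```

Each printed row matches one inequality of the corresponding program above; the probability constraint and the sign restrictions are the same for every game and are added separately.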