Imitation in Large Games: Soumya Paul R. Ramanujam


Imitation in Large Games

Soumya Paul
The Institute of Mathematical Sciences
Chennai, India - 600 113

R. Ramanujam
The Institute of Mathematical Sciences
Chennai, India - 600 113

In games with a large number of players where players may have overlapping objectives, the analysis
of stable outcomes typically depends on player types. A special case is when a large part of the player
population consists of imitation types: players who imitate the choices of other (optimizing) types.
Game theorists typically study the evolution of such games in dynamical systems with imitation rules.
In the setting of games of infinite duration on finite graphs with preference orderings on outcomes for
player types, we explore the possibility of imitation as a viable strategy. In our setup, the optimising
players play bounded memory strategies and the imitators play according to specifications given by
automata. We present algorithmic results on the eventual survival of types.

1 Summary

Imitation is an important heuristic studied by game theorists in the analysis of large games, in both
extensive form games with considerable structure, and repeated normal form games with a large number
of players. One reason for this is that notions of rationality underlying solution concepts are justified by
players' assumptions about how other players play, iteratively. In such situations, players' knowledge
of the types of other players alters game dynamics. Skilled players can then be imitated by less skilled
ones, and the former can then strategize about how the latter might play. In games with a large number
of players, both strategies and outcomes are studied using distributions of player types.
The dynamics of imitation, and strategizing of optimizers in the presence of imitators can give rise to
interesting consequences. For instance, in the game of chess, if the player playing white somehow knows
that her opponent will copy her move for move then the following simple sequence of moves allows her
to checkmate her opponent¹:
1.e3 e6 2.Qf3 Qf6 3.Qg3 Qg6 4.Nf3 Nf6 5.Kd1 Kd8 6.Be2 Be7 7.Re1 Re8
8.Nc3 Nc6 9.Nb5 Nb4 10.Qxc7#
On the other hand, we can have the scenario where every player is imitating someone or the other
and the equilibrium attained may be highly inefficient. This is usually referred to as herd behaviour and
has been studied for instance in [3].
In an ideal world, where players have unbounded resources and computational ability, each of them
can compute their optimal strategies and play accordingly and thus we can predict optimal play. But
in reality, this is seldom the case. Players are limited in their resources, in computational ability and
their knowledge of the game. Hence, in large games it is not possible for such players to compute their
optimal strategies beforehand by considering all possible scenarios that may arise during play. Rather,
they observe the outcome of the game and then strategise dynamically. In such a setting again, imitation
types make sense.
A resource bounded player may attach some cost to strategy selection. For such a player, imitating
another player who has been doing extensive research and computation may well be worthwhile, even if
her own outcomes are less than optimal. What is lost in sub-optimal outcomes may be gained in avoiding
expensive strategisation.

¹ This is called monkey-chess in chess parlance.

Angelo Montanari, Margherita Napoli, Mimmo Parente (Eds.)
Proceedings of GandALF 2010
EPTCS 25, 2010, pp. 162–172, doi:10.4204/EPTCS.25.16

Thus, in a large population of players, where resources and computational abilities are asymmetrically distributed, it is natural to consider a population where the players are predominantly of two kinds:
optimisers and imitators.² Asymmetry in resources and abilities can then lead to different types of imitation and thus ensure that we do not end up with herd behaviour of the kind referred to above. The mutual
reasoning and strategising process between optimisers and imitators leads to interesting questions for
game dynamics in these contexts.
Imitation is typically modelled in the dynamical systems framework in game theory. Schlag ([12])
studies a model of repeated games where a player in every round samples one other player according
to some sampling procedure and then either imitates this player or sticks to her own move. He shows
that the strategy where a player imitates the sampled player with a probability that is proportional to the
difference in their payoffs, is the one that attains the maximum average payoff in the model. He also
gives a simple counterexample to show that the naïve strategy of 'imitate if better' may not always be
improving. Banerjee ([3]) studies a sequential decision model where each decision maker may look at
the decisions made by the previous decision makers and imitate them. He shows that the decision rules
that are chosen by optimising individuals are characterised by herd behaviour, i.e., people do what others
are doing rather than using their own information. He also shows that such an equilibrium is inefficient.
Levine and Pesendorfer ([7]) study a model where existing strategies are more likely to be imitated than
new strategies are to be introduced.
The common framework in all of the above studies is repeated non-zero-sum normal form games
where the questions asked of the model are somewhat different from standard ones on equilibria. Since
all players are not optimizers, we do not speak of equilibrium profiles as such but optimal strategies for
optimizers and possibly suboptimal outcomes for imitators. In the case of imitators, since they keep
switching (imitate i for 2 moves, j for 3 moves, then again i for 1 move, etc.) studies consider stability
of imitation patterns, what types of imitation survive eventually, since these would in turn determine play
by optimizers and thus stable subgames, thus determining stable outcomes. Note that, as in the example
of chess above, imitation and hence the study of system dynamics of this kind, makes equal sense in
large turn based extensive form games among resource bounded players as well.
For finitely presented infinite games the stability questions above can be easily posed and answered
in automata theoretic ways, since typically bounded memory strategies suffice for optimal solutions, and
stable imitation patterns can be analysed algorithmically. Indeed, this also provides a natural model for
resource bounded players as finite state automata.
With this motivation, we consider games of unbounded duration on finite graphs among players with
overlapping objectives where the population is divided into players who optimise and others who imitate.
Unbounded play is natural in the study of imitation as a heuristic, since losses incurred per move may be
amortised away and need not affect eventual outcomes very much. Imitator types specify how and whom
to imitate and are given using finite state transducers. Since plays eventually settle down to connected
components, players' preferences are given using orderings on Muller sets [11]. In this work, we study
turn-based games so as to use the set of techniques already available for the analysis of such games.
In this setting we address the following questions and present algorithmic results:
• If the optimisers and the imitators play according to certain specifications, is a global outcome
  eventually attained?
• What sort of imitative behaviour (subtypes) eventually survives in the game?
• How much worse off are the imitators compared with an equilibrium outcome?

² There would also be a third kind of players, randomisers, who play any random strategy, but we do not consider such
players in this exposition.

Infinite two-player turn-based games on finite graphs have been extensively studied in the literature.
A seminal result by Büchi and Landweber [4] showed that for Muller objectives, winning strategies with
bounded memory exist and can be effectively synthesised. Martin [8] showed that such games
with Borel winning conditions are sure-determined (one of the players always has a winning strategy
from every vertex). Zielonka [13] gave an insightful analysis of Muller games and provided an elegant
algorithm to compute bounded memory winning strategies.
For concurrent-move games, sure determinacy does not hold; optimal value determinacy (the
values of both the players at every vertex sum to 1) for concurrent-move games with Borel objectives
was proved in [9]. Concurrent games with qualitative reachability and more general parity objectives
have been studied in [2, 1]. Such games have also been extended to the multiplayer setting where the
objectives of the players are allowed to overlap. [5, 6] show that when the objectives are win-lose Borel,
subgame perfect equilibria exist. [11] shows that bounded memory equilibrium tuples exist in turn based
games even when the objectives are not win-lose but every player has preferences over the various Muller
sets.

2 Games, Strategies and Objectives

The model of games we present is the standard model of turn based games of unbounded duration on
finite graphs. For any positive integer $n$, let $[n] = \{1, \ldots, n\}$.
Definition 1 Let $n \in \mathbb{N}$, $n > 1$. An $n$-player game arena is a directed graph $\mathcal{G} = (V_1, \ldots, V_n, A, E)$, where
$V_i$ are finite sets of game positions with $V_i \cap V_j = \emptyset$ for $i \neq j$, $V = \bigcup_{i \in [n]} V_i$, $A$ is a finite set of moves, and
$E \subseteq (V \times A \times V)$ is the move relation that satisfies the following conditions:
1. For every $v, v_1, v_2 \in V$ and $a, b \in A$, if $(v, a, v_1) \in E$ and $(v, b, v_2) \in E$ with $v_1 \neq v_2$, then $a \neq b$.
2. For every $v \in V$, there exist $a \in A$ and $v' \in V$ such that $(v, a, v') \in E$.
When an initial position $v_0 \in V$ is specified, we call $(\mathcal{G}, v_0)$ an initialised arena or just an arena.
In this model, we assume for convenience that the moves of all players are the same. When $v \in V_i$,
we say that player $i$ owns the vertex $v$. A game arena is thus a finite graph with nodes labelled by players
and edges labelled by moves such that no two edges out of a vertex share a common label and there are
no dead ends. For a vertex $v \in V$, let $vE$ denote its set of neighbours: $vE = \{v' \mid (v, a, v') \in E \text{ for some } a \in A\}$.
For $v \in V$ and $a \in A$, let $v[a] = \{v' \mid (v, a, v') \in E\}$; $v[a]$ is either empty or the singleton $\{v'\}$. In
the latter case, we say $a$ is enabled at $v$ and write $v[a] = v'$. For $u \in A^*$, we can similarly speak of $u$ being
enabled at $v$ and define $v[u]$ so that when $v[u] = \{v'\}$, there is a path in the graph from $v$ to $v'$ such that $u$
is the sequence of move labels of edges along that path. Given $v \in V$ and $u \in A^*$, if any $u$-labelled path
from $v$ exists in the graph, it is unique. On the other hand, given any sequence of vertices that correspond to a
path in the graph, there may be more than one sequence of moves that label that path.
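As a rough illustration of the arena and of the notation v[a] and v[u], the following Python sketch stores the move relation as a dictionary keyed by (vertex, move), which makes the "no two edges out of a vertex share a label" condition hold by construction. The toy arena, the vertex names and the function names are assumptions made for this example and do not come from the paper.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical toy arena (not from the paper): owner[v] is the player who
# owns vertex v, and edges[(v, a)] is the unique a-successor of v.
owner: Dict[str, int] = {"p": 1, "q": 2, "r": 1}
edges: Dict[Tuple[str, str], str] = {
    ("p", "a"): "q",
    ("p", "b"): "r",
    ("q", "a"): "p",
    ("r", "a"): "r",
    ("r", "b"): "p",
}


def enabled(v: str) -> Set[str]:
    """The set of moves enabled at vertex v."""
    return {a for (w, a) in edges if w == v}


def step(v: str, a: str) -> Optional[str]:
    """v[a]: the unique a-successor of v, or None if a is not enabled at v."""
    return edges.get((v, a))


def run(v: str, u: str) -> Optional[str]:
    """v[u]: the end of the unique u-labelled path from v, or None if none exists.

    Moves are single characters here, so a move sequence u is a string.
    """
    for a in u:
        nxt = step(v, a)
        if nxt is None:
            return None
        v = nxt
    return v


def has_no_dead_ends() -> bool:
    """Condition 2 of the arena definition: every vertex has an enabled move."""
    return all(enabled(v) for v in owner)
```

For instance, `run("p", "ba")` follows the b-edge from p to r and then the a-loop at r, while `run("p", "ab")` is undefined because b is not enabled at q.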
A play in $(\mathcal{G}, v_0)$ is an infinite path $v_0 \xrightarrow{a_1} v_1 \xrightarrow{a_2} \cdots$, such that $v_i \xrightarrow{a_{i+1}} v_{i+1}$ for $i \in \mathbb{N}$. We often speak of
$a_1 a_2 \ldots \in A^\omega$ as the play to denote this path. The game starts by placing a token at $v_0 \in V_i$. Player $i$
chooses an action $a \in A$ enabled at $v_0$ and the token moves along the edge labelled $a$ to a neighbouring
vertex $v_1 \in V_j$. Player $j$ chooses an action $a' \in A$ enabled at $v_1$, the token moves along the edge labelled
$a'$ to a neighbouring vertex and so on. Note that since there are no dead ends, any player whose turn it is
to move has some available move.
Given a path $\rho = v_0 \xrightarrow{a_1} v_1 \cdots \xrightarrow{a_k} v_k$, we call $a_\ell$ $(1 \le \ell \le k)$ the last $i$-move in $\rho$, if $v_{\ell-1} \in V_i$ and for all
$\ell' : \ell < \ell' \le k$, $v_{\ell'-1} \notin V_i$.

2.1 Objectives
The game arena describes only legal plays, and the game itself is defined by specifying outcomes and
players' preferences on outcomes. Since each play results in an outcome for each player, players' preferences are on plays. This can be specified finitely, as every infinite play on a finite graph settles down
to a strongly connected component.
For a play $u \in A^\omega$ let $\inf(u)$ be the set of vertices that appear infinitely often in the play given by $u$.
With each player $i$, we associate a total pre-order $\sqsubseteq_i \subseteq (2^V \times 2^V)$. This induces a total pre-order on plays
as follows: $u \sqsubseteq_i u'$ iff $\inf(u) \sqsubseteq_i \inf(u')$.
Thus an $n$-player game is given by a tuple $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$, consisting of an $n$-player game arena
and players' preferences.
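To make the set inf(u) and the induced preference on plays concrete, here is a small Python sketch. It only handles ultimately periodic plays (a finite prefix followed by a cycle repeated forever), and it encodes a total pre-order on Muller sets by integer ranks. Both the toy arena and the rank encoding are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical toy arena: edges[(v, a)] is the a-successor of v.
edges: Dict[Tuple[str, str], str] = {
    ("p", "a"): "q", ("p", "b"): "r",
    ("q", "a"): "p",
    ("r", "a"): "r", ("r", "b"): "p",
}


def visited(v: str, moves: str) -> List[str]:
    """Vertices reached after each move of `moves`, starting from v."""
    out = []
    for a in moves:
        v = edges[(v, a)]
        out.append(v)
    return out


def inf_set(v0: str, prefix: str, cycle: str) -> FrozenSet[str]:
    """inf(u) for the ultimately periodic play u = prefix . cycle^omega.

    Assumes `cycle` is non-empty and leads back to the vertex it starts from.
    """
    start = visited(v0, prefix)[-1] if prefix else v0
    loop = visited(start, cycle)
    assert loop[-1] == start, "cycle must return to its start vertex"
    return frozenset(loop)


# A total pre-order on Muller sets, encoded (as an assumption) by integer
# ranks: a higher rank is more preferred, and unlisted sets get rank 0.
rank_1: Dict[FrozenSet[str], int] = {
    frozenset({"p", "q"}): 2,
    frozenset({"r"}): 1,
}


def weakly_prefers(rank: Dict[FrozenSet[str], int],
                   f1: FrozenSet[str], f2: FrozenSet[str]) -> bool:
    """True if the player likes Muller set f2 at least as much as f1."""
    return rank.get(f1, 0) <= rank.get(f2, 0)
```

With these definitions, comparing two plays reduces to comparing their `inf_set` values with `weakly_prefers`, which mirrors how the pre-order on Muller sets is lifted to plays.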

2.2 Strategies
Players strategise to achieve desired outcomes. Formally, a strategy $\sigma_i$ for player $i$ is a partial function

$\sigma_i : V A^* \rightharpoonup A$

where $\sigma_i(vu)$ is defined if $v[u]$ is defined and $v[u] \in V_i$, in which case $(v[u])[\sigma_i(vu)]$ is defined, that is, the chosen move is enabled.
A strategy $\sigma_i$ of player $i$ is said to be bounded memory if there exists a finite state transducer (FST)
$\mathcal{A}_\sigma = (M, \delta, g, m_0)$ where $M$ (the memory of the strategy) is a finite set of states, $m_0 \in M$ is the initial
state of the memory, $\delta : A \times M \to M$ is the memory update function, and $g : V_i \times M \to A$ is the move
function such that for all $v \in V_i$ and $m \in M$, $g(v, m)$ is enabled at $v$ and the following condition holds:
given $v \in V_i$, when $u = a_1 \ldots a_k \in A^*$ is a partial play from $v$ such that $\sigma_i(vu)$ is defined, $\sigma_i(vu) = g(v[u], m_k)$,
where $m_k$ is determined by: $m_{j+1} = \delta(a_{j+1}, m_j)$ for $0 \le j < k$.
A strategy is said to be memoryless or positional if $M$ is a singleton. That is, the moves depend only
on the current position.
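A bounded memory strategy can be sketched in Python as a small record holding the memory update function, the move function and the initial memory. The class name, field names and the two example strategies below are hypothetical; the sketch simply replays the history to obtain the current vertex and memory state, and it is only meaningful at vertices owned by the player using the strategy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Tuple

# Hypothetical toy arena: edges[(v, a)] is the a-successor of v.
# Vertices p and r are assumed to belong to player 1, q to player 2.
edges: Dict[Tuple[str, str], str] = {
    ("p", "a"): "q", ("p", "b"): "r",
    ("q", "a"): "p",
    ("r", "a"): "r", ("r", "b"): "p",
}


@dataclass
class BoundedMemoryStrategy:
    """A finite state transducer: memory update, move function, initial memory."""
    update: Callable[[str, Hashable], Hashable]  # (move, memory) -> memory
    move: Callable[[str, Hashable], str]         # (vertex, memory) -> move
    initial: Hashable

    def choose(self, v0: str, history: str) -> str:
        """The move prescribed after the partial play `history` from v0."""
        memory, v = self.initial, v0
        for a in history:
            memory = self.update(a, memory)
            v = edges[(v, a)]
        action = self.move(v, memory)
        assert (v, action) in edges, "the move function must pick an enabled move"
        return action


# Player 1 remembers the last move seen and plays the other one.
alternate = BoundedMemoryStrategy(
    update=lambda a, m: a,
    move=lambda v, m: "b" if m == "a" else "a",
    initial="b",
)

# A positional strategy: the memory is a single state and is never consulted.
positional = BoundedMemoryStrategy(
    update=lambda a, m: m,
    move=lambda v, m: "a",
    initial=None,
)
```

The positional strategy shows the singleton-memory case: its choice depends on the current vertex alone, whereas `alternate` needs two memory states.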
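Definition 2 Given a strategy profile $\sigma = (\sigma_1, \ldots, \sigma_n)$ for $n$ players let $\rho_\sigma$ denote the unique play in
$(\mathcal{G}, v_0)$ conforming to $\sigma$. A profile $\sigma$ is called a Nash equilibrium in $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$ if for every
player $i$ and for every other strategy $\sigma'_i$ of player $i$, $\inf(\rho_{(\sigma_{-i}, \sigma'_i)}) \sqsubseteq_i \inf(\rho_\sigma)$.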

3 Specification of Strategies

We now describe how the strategies of the imitator and optimiser types are specified.

3.1 Imitator Types


An imitator type is again specified by a finite state transducer which advises the imitator whom to imitate
and when, using memory states for switching between imitating one player or another. When deciding not to
imitate any other player, we assume that the type advises what to play using a memoryless strategy.
An imitator type $\tau_j$ for player $j$ is a tuple $(M, \delta, \pi, \mu, m_0)$ where $M$ is the finite set denoting the
memory of the strategy, $m_0 \in M$ is the initial memory, $\delta : A \times M \to M$ is the memory update function,
$\pi : V \to A$ is a positional strategy such that for any $v \in V$, $\pi(v)$ is enabled at $v$, and $\mu : M \to [n]$ is the
imitation map.
Given $\tau_j$ as above, define a strategy $\sigma_j$ for player $j$ as follows. Let $v \in V$ and let $u = a_1 \ldots a_k \in A^*$ be
a partial play from $v$ such that $v[u]$ is defined and $v[u] \in V_j$. Let $m_{i+1} = \delta(a_{i+1}, m_i)$ for $0 \le i < k$. Then
$\sigma_j(vu) = a_\ell$, if $a_\ell$ is the last $\mu(m_k)$-move in the given play and $a_\ell$ is enabled at $v[u]$, and $\sigma_j(vu) = \pi(v[u])$,
otherwise.
Note that the type specification only specifies whom to imitate, and how it decides whom to imitate
but is silent on the rationale for imitating a player or switching from imitating x to imitating y. In general
an imitator would have a set of observables, and based on observations of game states made during the
course of play, would decide on whom to imitate when. Thus imitator specifications could be given by
a past-time formula in a simple propositional modal logic. With any such formula we can associate an
imitation type transducer as defined above, so we do not pursue that approach here. See, for instance,
[10] for more along that direction.
The following are some examples of imitating strategies that can be expressed using such automata:
1. Imitate player 1 for 3 moves and then keep imitating player 4 forever.
2. Imitate player 2 till she receives the highest payoff. Otherwise switch to imitating player 3.
3. Nondeterministically imitate player 4 or 5 forever.
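The first kind of example above can be sketched in Python on a small scale. The sketch below is a hypothetical three-player variant: player 3 imitates player 1 while fewer than three moves of the play have been made, and imitates player 2 from then on. The arena, the reading of "for 3 moves" as three moves of the play as a whole, and the fallback move "a" are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical arena: players 1, 2 and 3 move in turn, and both moves
# "a" and "b" are enabled at every vertex.
owner: Dict[str, int] = {"x1": 1, "x2": 2, "x3": 3}
edges: Dict[Tuple[str, str], str] = {
    (v, a): nxt
    for v, nxt in [("x1", "x2"), ("x2", "x3"), ("x3", "x1")]
    for a in "ab"
}


def last_move_of(player: int, v0: str, history: str) -> Optional[str]:
    """The last move made by `player` in the partial play, if there is one."""
    v, last = v0, None
    for a in history:
        if owner[v] == player:
            last = a
        v = edges[(v, a)]
    return last


def imitator_move(v0: str, history: str) -> str:
    """Move of the imitator (player 3) after `history`, played from v0."""
    memory, v = 0, v0
    for a in history:
        memory = min(memory + 1, 3)      # memory update: count moves, capped at 3
        v = edges[(v, a)]
    target = 1 if memory < 3 else 2      # imitation map: whom to imitate now
    a = last_move_of(target, v0, history)
    if a is not None and (v, a) in edges:
        return a                         # copy the target's last move if enabled
    return "a"                           # otherwise fall back to a positional move
```

After the history `"ab"` the imitator still follows player 1 and copies `"a"`; after `"abaab"` the counter has reached its cap, so it copies player 2's last move `"b"`.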
For convenience of the subsequent technical analysis, we assume that an imitator type
$\tau = (M, \delta, \pi, \mu, m_0)$ is presented as a finite state transducer $R_\tau = (M', \delta', g', m_I)$ where
• $M' = V \times M \times A^{[n]}$.
• $\delta' : A \times M' \to M'$ such that $\delta'(a, \langle v, m, (a_1, \ldots, a_n) \rangle) = \langle v', m', (a_1, \ldots, a_{i-1}, a, a_{i+1}, \ldots, a_n) \rangle$ such
  that $v \xrightarrow{a} v'$, $\delta(a, m) = m'$ and $v \in V_i$.
• $g' : V \times M' \to A$ such that $g'(v, \langle v, m, (a_1, \ldots, a_n) \rangle) = a_i$ iff $\mu(m) = i$ and $a_i$ is enabled at $v$. Otherwise
  $g'(v, \langle v, m, (a_1, \ldots, a_n) \rangle) = \pi(v)$.
• $m_I = \langle v_0, m_0, (a_1, \ldots, a_n) \rangle$ for some $(a_1, \ldots, a_n) \in A^{[n]}$.
Figure 1 depicts an imitator strategy where a player imitates player 1 for two moves and then
player 2 for one move and then again player 1 for two moves and so on. She just plays the last move
of the player she is currently imitating. Suppose there are a total of $p$ actions, that is, $|A| = p$. She
remembers the last move of the player she is imitating in the states $m_1$ to $m_p$, and when it is her turn to
move, plays the corresponding action.
Given an FST $R_\tau$ for an imitator type $\tau$, we call a strongly connected component of $R_\tau$ a subtype of
$R_\tau$. We will often refer to the strategy $\sigma_j$ induced by the imitator type $R_\tau$ for player $j$ as $R_\tau$, when the
context is clear.
We define the notion of an imitation equilibrium which is a tuple of strategies for the optimisers such
that none of the optimisers can do better by unilaterally deviating from it given that the imitators stick to
their specifications.
Definition 3 In the game $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$, given that the imitators $r + 1, \ldots, n$ play strategies
$\tau_{r+1}, \ldots, \tau_n$, a profile of strategies $\sigma = (\sigma_1, \ldots, \sigma_r)$ of the optimisers is called an imitation equilibrium if
for every optimiser $i$ and for every other strategy $\sigma'_i$ of $i$, $\inf(\rho_{(\sigma_{-i}, \sigma'_i)}) \sqsubseteq_i \inf(\rho_\sigma)$.
Remark Note that an imitation equilibrium may be quite different from a Nash equilibrium of the
game $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$ when restricted to the first $r$ components. In a Nash equilibrium the imitators
are not restricted to play according to the given specifications unlike in an imitation equilibrium. In the
latter case, the optimisers, in certain situations, may be able to exploit these restrictions imposed on the
imitators (as in the example of monkey-chess discussed in Section 1).

Figure 1: An imitator strategy

3.2 Optimiser Specifications


One of the motivations for an imitator to imitate an optimiser is the fact that an optimiser plays to get
best results. To an imitator, an optimiser appears to have the necessary resources to compute and play
the best strategy and hence by imitating such a player she cannot be much worse off. But what kind of
strategies do the optimisers play on their part?
In the next section, we show that if the optimisers know the types (the FSTs) of each of the imitators,
then it suffices for them to play bounded memory strategies. Of course, this depends on the solution
concept: Nash equilibrium is defined for full strategy profiles, so we need to specialise it to apply
only to the optimisers.
Thus in the treatment below, we consider only bounded memory strategies for the optimisers.

4 Results

In this section, we first show that it suffices to consider bounded memory strategies for the optimisers.
Then we go on to address the questions raised towards the end of Section 1.
First we define a product operation between an arena and a bounded memory strategy.

4.1 Product Operation


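Let $(\mathcal{G}, v_0)$ be an arena and $\sigma$ be a bounded memory strategy given by the FST $\mathcal{A}_\sigma = (M, \delta, g, m_I)$. We
define $\mathcal{G} \times \mathcal{A}_\sigma$ to be the graph $(\mathcal{G}', v'_0)$ where $\mathcal{G}' = (V', E')$ such that
• $V' = V \times M$
• $v'_0 = (v_0, m_I)$
• If $g(v, m)$ is defined then $(v, m) \xrightarrow{a} (v', m')$ iff $\delta(a, m) = m'$, $v \xrightarrow{a} v'$ and $g(v, m) = a$.
• If $g(v, m)$ is not defined then $(v, m) \xrightarrow{a} (v', m')$ iff $\delta(a, m) = m'$ and $v \xrightarrow{a} v'$.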
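The product construction can be sketched in Python as a graph search. Two simplifications are assumed here and are not part of the definition above: only the part of the product reachable from the initial pair is built (rather than all of V × M), and "g(v, m) is defined" is modelled as "v is owned by the strategy's player". The toy arena and the example strategy are likewise hypothetical.

```python
from typing import Callable, Dict, Hashable, Set, Tuple

# Hypothetical toy arena: owner[v] owns vertex v, edges[(v, a)] is the a-successor.
owner: Dict[str, int] = {"p": 1, "q": 2, "r": 1}
edges: Dict[Tuple[str, str], str] = {
    ("p", "a"): "q", ("p", "b"): "r",
    ("q", "a"): "p",
    ("r", "a"): "r", ("r", "b"): "p",
}

State = Tuple[str, Hashable]


def product(v0: str, player: int,
            delta: Callable[[str, Hashable], Hashable],
            g: Callable[[str, Hashable], str],
            m0: Hashable):
    """Reachable part of the product of the arena with one player's transducer."""
    start: State = (v0, m0)
    prod_edges: Dict[Tuple[State, str], State] = {}
    seen: Set[State] = {start}
    stack = [start]
    while stack:
        v, m = stack.pop()
        for (w, a), v_next in edges.items():
            if w != v:
                continue
            # At the player's own vertices keep only the move chosen by g.
            if owner[v] == player and g(v, m) != a:
                continue
            nxt: State = (v_next, delta(a, m))
            prod_edges[((v, m), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen, prod_edges, start


# Example strategy for player 1: remember the last move and play the other one.
def remember_last(a: str, m: Hashable) -> Hashable:
    return a


def play_other(v: str, m: Hashable) -> str:
    return "b" if m == "a" else "a"


states, prod_edges, start = product("p", 1, remember_last, play_other, "b")
```

In this example every product state keeps at least one outgoing edge, and the states whose vertex belongs to player 1 keep exactly one, which is the pruning behaviour the construction is meant to capture.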

Proposition 1 Let $(\mathcal{G}, v_0)$ be an arena and $\sigma$ be a bounded memory strategy. Then $\mathcal{G} \times \mathcal{A}_\sigma$ is an arena,
that is, there are no dead ends.
Proof Let $(\mathcal{G}', v'_0) = \mathcal{G} \times \mathcal{A}_\sigma$. $\delta : A \times M \to M$ being a function, $\delta(a, m)$ is defined for every $a \in A$ and
$m \in M$. Also by the definition of $\mathcal{G}$, for every vertex $v \in V$ there exists an action $a \in A$ enabled at $v$ and
a vertex $v' \in V$ such that $v \xrightarrow{a} v'$. Thus for every vertex $(v, m) \in V'$,
• if $g(v, m)$ is not defined then corresponding to every enabled action $a \in A$ there exists $(v', m') \in V'$
  such that $(v, m) \xrightarrow{a} (v', m')$,
• if $g(v, m)$ is defined then by definition the unique action $a = g(v, m)$ is enabled at $v$. Hence, there
  exists $(v', m') \in V'$ such that $(v, m) \xrightarrow{a} (v', m')$. □
Thus taking the product of the arena with a bounded memory strategy $\sigma_i$ of player $i$ does the following. For a vertex $v \in V_i$, it retains only the outgoing edge that is labelled with the action specified by the
corresponding memory state of $\sigma_i$. For all other vertices $v \notin V_i$, it retains all the outgoing edges.
Proposition 2 Let $(\mathcal{G}, v_0)$ be an arena and $\sigma_1, \ldots, \sigma_n$ be bounded memory strategies. Then $\mathcal{G} \times \mathcal{A}_{\sigma_1} \times
\cdots \times \mathcal{A}_{\sigma_n}$ is an arena, that is, there are no dead ends.

4.2 Equilibrium
Of the $n$ players let the first $r$ be optimisers and the rest $n - r$ be imitators. Let $\tau_{r+1}, \ldots, \tau_n$ be the
types of the imitators $r + 1, \ldots, n$. We transform the game $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$ with $n$ players to a game
$(\mathcal{G}', v'_0, \sqsubseteq'_1, \ldots, \sqsubseteq'_{r+1})$ with $r + 1$ players in the following steps:
1. Construct the graph $(\mathcal{G}', v'_0) = ((V', E'), v'_0)$ as $\mathcal{G}' = \mathcal{G} \times R_{\tau_{r+1}} \times \cdots \times R_{\tau_n}$.
2. Let $V' = V'_1 \cup \ldots \cup V'_r \cup V'_{r+1}$ such that for $i : 1 \le i \le r$, $(v, m_{r+1}, \ldots, m_n) \in V'_i$ iff $v \in V_i$. And
$(v, m_{r+1}, \ldots, m_n) \in V'_{r+1}$ iff $v \in V_{r+1} \cup \ldots \cup V_n$. Let there be $r + 1$ players such that the vertex set
$V'_i$ belongs to player $i$. Thus we introduce a dummy player, the $(r+1)$th player, who owns all the
vertices $(v, m_{r+1}, \ldots, m_n) \in V'$ such that $v$ was originally an imitator vertex in $V$. By construction,
we know that every vertex $(v, m_{r+1}, \ldots, m_n) \in V'_{r+1}$ has a unique outgoing edge $(v, m_{r+1}, \ldots, m_n) \xrightarrow{a}
(v', m'_{r+1}, \ldots, m'_n)$. Thus the dummy player $r + 1$ has no choice but to play this edge always. He has
a unique strategy in the arena $\mathcal{G}'$: at every vertex of $V'_{r+1}$, play the unique outgoing edge.
3. Lift the preference orders of the players 1 to $r$ to subsets of $V'$ as follows. A subset $W$ of $V'$
corresponds to the Muller set $F(W) = \{v \mid (v, m_{r+1}, \ldots, m_n) \in W\}$ of $\mathcal{G}$. For every player $i : 1 \le
i \le r$, for $W, W' \subseteq V'$, $W \sqsubseteq'_i W'$ if and only if $F(W) \sqsubseteq_i F(W')$.
Since the player $r + 1$ has a unique strategy and plays it always, his preference ordering doesn't
matter in the game. However, for consistency, we assign the preference of an arbitrary imitator (say
imitator $n$) in the game $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$ to the $(r+1)$th player in the game $(\mathcal{G}', v'_0, \sqsubseteq'_1, \ldots, \sqsubseteq'_{r+1})$.
That is, for $W, W' \subseteq V'$, $W \sqsubseteq'_{r+1} W'$ if and only if $F(W) \sqsubseteq_n F(W')$.
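The lifting of preferences in step 3 amounts to forgetting the memory components before comparing. A minimal Python sketch, assuming product vertices are tuples whose first entry is the arena vertex and that a preference is given as a Boolean comparison function on Muller sets (both are conventions of this example, not of the paper):

```python
from typing import Callable, FrozenSet, Hashable, Iterable, Tuple

MullerSet = FrozenSet[Hashable]


def F(W: Iterable[Tuple]) -> MullerSet:
    """Project product vertices (v, memory components...) onto the arena vertex v."""
    return frozenset(w[0] for w in W)


def lift(leq: Callable[[MullerSet, MullerSet], bool]):
    """Lift a pre-order on Muller sets of the arena to sets of product vertices."""
    return lambda W1, W2: leq(F(W1), F(W2))


# Hypothetical preference used only for illustration:
# larger Muller sets are weakly preferred.
def by_size(F1: MullerSet, F2: MullerSet) -> bool:
    return len(F1) <= len(F2)


lifted = lift(by_size)
```

Two sets of product vertices that differ only in their memory components project to the same Muller set, so the lifted order treats them as equivalent.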

The game $(\mathcal{G}', v'_0, \sqsubseteq'_1, \ldots, \sqsubseteq'_{r+1})$ is a turn based game with $r + 1$ players (the optimisers and the
dummy) such that each player $i$ has a preference ordering $\sqsubseteq'_i$ over the Muller sets of $V'$. Such a game
was called a generalised Muller game in [11].
Let $L$ be the set

$L = \{ l \in (V' \cup \{\sharp\})^{|V'|+1} \mid |l|_\sharp = 1 \wedge \forall v \in V' \, (|l|_v = 1) \}$

where $|l|_v$ denotes the number of occurrences of $v$ in $l$. We have
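Theorem 1 ([11]) The game $(\mathcal{G}', v'_0, \sqsubseteq'_1, \ldots, \sqsubseteq'_{r+1})$ has a Nash equilibrium in bounded memory strategies, the memory being $L$.
Now let $\sigma' = (\sigma'_1, \ldots, \sigma'_r, \sigma'_{r+1})$ be a Nash equilibrium tuple for $r + 1$ players in the game $(\mathcal{G}', v'_0,
\sqsubseteq'_1, \ldots, \sqsubseteq'_{r+1})$. We now construct a bounded memory imitation equilibrium tuple $\sigma$ for the $r$ optimisers in
the game $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$.
For the optimiser $i : 1 \le i \le r$, let $\sigma'_i = (L, \delta', g', l'_I)$. Define $\sigma_i = (M, \delta, g, l_I)$ to be a bounded memory
strategy in the game $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$ as
• $M = M_{r+1} \times \cdots \times M_n \times L$ where $M_j$, $r + 1 \le j \le n$, is the memory of strategy $\sigma_j$ of imitator $j$.
• $\delta : A \times M \to M$ such that $\delta(a, \langle m_{r+1}, \ldots, m_n, l \rangle) = \langle m'_{r+1}, \ldots, m'_n, \delta'(a, l) \rangle$ where $m'_j = \delta_j(a, m_j)$,
  $r + 1 \le j \le n$, such that $\delta_j$ is the memory update of strategy $\sigma_j$.
• $g : V \times M \to A$ such that $g(v, \langle m_{r+1}, \ldots, m_n, l \rangle) = g'(\langle v, m_{r+1}, \ldots, m_n \rangle, l)$.
• $l_I = \langle m_I^{r+1}, \ldots, m_I^n, l'_I \rangle$ where $m_I^j$, $r + 1 \le j \le n$, is the initial memory of strategy $\sigma_j$.
We then have: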
We then have:
Theorem 2 $\sigma = (\sigma_1, \ldots, \sigma_r)$ is an imitation equilibrium in $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$.
Proof Suppose not and suppose player $i$ has an incentive to deviate to a strategy $\nu$ in $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$.
Let $u \in A^\omega$ be the unique play consistent with the tuple $\sigma$ where the imitators stick to their strategy tuple
$(\tau_{r+1}, \ldots, \tau_n)$. Let $u' \in A^\omega$ be the unique play consistent with the tuple $(\sigma_{-i}, \nu)$ (that is when player $i$ has
deviated to the strategy $\nu$) where again the imitators stick to their strategy tuple $(\tau_{r+1}, \ldots, \tau_n)$. Let $l$ be
the first index such that $u(l) \neq u'(l)$. Then, $v_0[u_{l-1}] \in V_i$ (where $u_{l-1}$ is the length $l - 1$ prefix of $u$). That
is, the vertex $v_0[u_{l-1}]$ belongs to optimiser $i$ since everyone else sticks to her strategy.
Now consider what happens in the game $(\mathcal{G}', v'_0, \sqsubseteq'_1, \ldots, \sqsubseteq'_{r+1})$ when all the optimisers except $i$ play
the strategies $\sigma'_1, \ldots, \sigma'_{i-1}, \sigma'_{i+1}, \ldots, \sigma'_r$ and the imitators stick to their strategy tuple $(\tau_{r+1}, \ldots, \tau_n)$. If
the optimiser $i$ mimics strategy $\nu$ for $l - 1$ moves in the game then the play is exactly $u_{l-1}$ and reaches a
vertex $(v, m_{r+1}, \ldots, m_n) \in V'_i$ where $v = v_0[u_{l-1}]$. By construction of the product, all the actions enabled
at $v$ in the arena $\mathcal{G}$ are also enabled in the arena $\mathcal{G}'$. Hence the optimiser $i$ can play $u'(l)$. By similar
arguments, optimiser $i$ can mimic the strategy $\nu$ in the arena $\mathcal{G}'$ forever.
Thus by mimicking $\nu$ in the game $(\mathcal{G}', v'_0, \sqsubseteq'_1, \ldots, \sqsubseteq'_{r+1})$, the optimiser $i$ can force a more preferable
Muller set. But this contradicts the fact that $\sigma'$ is an equilibrium tuple in the game $(\mathcal{G}', v'_0, \sqsubseteq'_1, \ldots, \sqsubseteq'_{r+1})$. □

4.3 Stability
Finally, we address the questions asked in Section 1. Given a game $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$ with optimisers and
imitators where the optimisers play bounded memory strategies and the imitators play imitative strategies
specified by $k$ finite state transducers we wish to find out:
• Whether a certain strongly connected component $W$ of $\mathcal{G}$ is where the play eventually settles down.
• What subtypes eventually survive.
• How much worse off imitator $i$ is compared with an equilibrium outcome.
We have the following theorem:
Theorem 3 Let $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$ be a game with $n$ players where the first $r$ are optimisers playing bounded memory strategies $\sigma_1, \ldots, \sigma_r$ and the rest $n - r$ are imitators playing imitative strategies
$\tau_{r+1}, \ldots, \tau_n$ where every such strategy is among $k$ different types. Let $W$ be a strongly connected component of $\mathcal{G}$. The following questions are decidable:
(i) Does the game eventually settle down to $W$?
(ii) What subtypes of the $k$ types eventually survive?
(iii) How much worse off is imitator $i$ compared with an equilibrium outcome?
Proof Construct the arena $(\mathcal{G}', v'_0) = \mathcal{G} \times \mathcal{A}_{\sigma_1} \times \cdots \times \mathcal{A}_{\sigma_r} \times R_{\tau_{r+1}} \times \cdots \times R_{\tau_n}$.
(i) For the strongly connected component $S'$ in $(\mathcal{G}', v'_0)$ that is reachable from $v'_0$, let $S$ be the subgraph
induced by the set $\{v \mid (v, m_1, \ldots, m_n) \in S'\}$. Collapse the vertices of $S$ that have the same name
and call the resulting graph $S''$. Check if $S''$ is the same as $W$ and output YES if so.
(ii) For the strongly connected component $S'$ in $(\mathcal{G}', v'_0)$ that is reachable from $v'_0$ do the following:
• For $i : r + 1 \le i \le n$ take the restriction of $S'$ to the $i$th component for every $(v, m_1, \ldots, m_n) \in S'$.
  Let $S_i$ denote this restriction.
• Collapse vertices with the same name in $S_i$. Let $S'_i$ be this new graph.
• Check if $S'_i$ is a subtype of $\tau_i$. If so output $S'_i$.
(iii) Compute a Nash equilibrium $\sigma^*$ of the game $(\mathcal{G}, v_0, \sqsubseteq_1, \ldots, \sqsubseteq_n)$ using the procedure described in
[11]. Let $S'$ be the reachable strongly connected component of the arena $(\mathcal{G}', v'_0)$. Restrict $S'$ to the
first component and call it $S$. Let $F = \mathrm{occ}(S)$. Compare $F$ with $\inf(\rho_{\sigma^*})$ according to the preference
ordering $\sqsubseteq_i$ of imitator $i$. □
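Part (i) of the proof can be illustrated with a small simulation. Once every player's transducer is fixed, each state of the full product has exactly one successor, so the play is a lasso and the component it settles into is the set of arena vertices on the loop. The Python sketch below follows that idea under several assumptions: the toy arena is hypothetical, strategies are given as (update, move, initial memory) triples, and the update function also receives the current vertex, a simplification standing in for the way the imitator transducer tracks the vertex in its state.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Tuple

# Hypothetical arena with three players; both moves are enabled everywhere.
owner: Dict[str, int] = {"p": 1, "q": 2, "r": 3}
edges: Dict[Tuple[str, str], str] = {
    ("p", "a"): "q", ("p", "b"): "r",
    ("q", "a"): "p", ("q", "b"): "r",
    ("r", "a"): "p", ("r", "b"): "q",
}

# A strategy is (update, move, initial memory); update sees (vertex, move, memory).
Strategy = Tuple[Callable[[str, str, Hashable], Hashable],
                 Callable[[str, Hashable], str],
                 Hashable]


def always(action: str) -> Strategy:
    """A positional strategy that plays the same move everywhere."""
    return (lambda v, a, m: m, lambda v, m: action, None)


def imitate(target: int, fallback: str = "a") -> Strategy:
    """Remember the target's last move and replay it when it is enabled."""
    def update(v: str, a: str, m: Hashable) -> Hashable:
        return a if owner[v] == target else m

    def move(v: str, m: Hashable) -> str:
        return m if (v, m) in edges else fallback

    return (update, move, fallback)


def settles_to(v0: str, strategies: Dict[int, Strategy]) -> FrozenSet[str]:
    """Arena vertices on the loop of the unique play under the given strategies."""
    players = sorted(strategies)
    v = v0
    mem = tuple(strategies[p][2] for p in players)
    first_seen: Dict[Tuple[str, tuple], int] = {}
    visited = []
    while (v, mem) not in first_seen:
        first_seen[(v, mem)] = len(visited)
        visited.append(v)
        mover = owner[v]
        a = strategies[mover][1](v, mem[players.index(mover)])
        mem = tuple(strategies[p][0](v, a, m) for p, m in zip(players, mem))
        v = edges[(v, a)]
    return frozenset(visited[first_seen[(v, mem)]:])
```

Checking whether the game settles down to a given set W then reduces to comparing `settles_to(...)` with W. In this toy arena, the loop changes depending on whether the imitator copies player 1 or player 2.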

4.4 An Example
Let us look at an example illustrating the concepts of the previous section. Consider 3 firms A, B and
C. Each firm has a choice of producing 2 products, product a or product b repeatedly, i.e., potentially
infinitely often. In every batch each of them can decide to produce either of the products.
Now firm A is a large firm with all the technical knowhow and infrastructure and it can change
between its choice of production in consecutive batches without much increase in cost. On the other
hand, the firms B and C are small. For either of them, if in any successive batch it decides to change
from producing a to b or vice-versa, there is a high cost incurred in setting up the necessary infrastructure.
Whereas, if it sticks to the product of the previous batch, the infrastructure cost is negligible. Thus in
the case where it switches between products in consecutive batches, it is forced to set the price of its
product high. This actually favours firm A as it can always set its product at a reasonable price since it is
indifferent between producing either of the two products in any batch.
The demand in the market for a and b keeps changing. Firm A being the bigger firm has the resources and knowhow to analyse the market and anticipate the current demand and then produce a or b
accordingly. Also assume that firm A is the first to put its product out in the market. Thus it is tempting
for firms B and C to imitate A. But in doing so they run the risk of setting the prices of their products too
high and incurring a loss.

Figure 2: The arena G

We model this situation in the form of the arena $\mathcal{G}$ shown in Figure 2 where the nodes of firm A,
B and C are denoted by three distinct node symbols (□ for firm B). The preferences of each of the firms for the relevant
connected components when the market demand is low are given as:
$\{1, 2, 3, 4, 5, 6\} >_A X$, for $X \subsetneq \{1, 2, 3, 4, 5, 6\}$
$\{1, 3, 5\} >_B \{1, 4, 5\} >_B \{1, 3, 5, 4\} >_B \{2, 3, 6, 4\} >_B Y$, for any other $Y \subsetneq \{1, 2, 3, 4, 5, 6\}$
$\{1, 3, 5\} >_C \{1, 4, 5\} >_C \{2, 3, 6, 4\} >_C \{1, 3, 5, 4\} >_C Z$, for any other $Z \subsetneq \{1, 2, 3, 4, 5, 6\}$
Thus firm A prefers the larger set $\{1, 2, 3, 4, 5, 6\}$ to the smaller ones while B and C prefer the smaller sets.
But when the market demand is high their preferences are given as:
$\{1, 2, 3, 4, 5, 6\} >_i X$, for $X \subsetneq \{1, 2, 3, 4, 5, 6\}$ and $i \in \{A, B, C\}$
That is, all of them prefer the larger set.
Now if A produces a and b in alternate batches and B and C imitate A, then we end up in the
component $\{1, 2, 3, 4, 5, 6\}$ which is profitable for A but less so for B and C when the market demand is
not so high. But when the demand is high, the component $\{1, 2, 3, 4, 5, 6\}$ is quite profitable even for B
and C and thus in this case, imitation is a viable strategy for them.

5 Discussion

The model that we have presented here is far from definitive, but we see these results as early reports in a
larger programme of studying games with player types. The model requires modification and refinement
in many directions, being addressed in related on-going work. In games with a large number of players,
outcomes are typically associated not with player profiles but with the distribution of types in the population.
Imitation crucially affects such dynamics. Our model can be easily modified to incorporate distributions
but the analysis is considerably more complicated. Further, it is natural to consider this model in the
context of repeated normal form games, but in such contexts almost-sure winning randomized strategies
are more natural. A more critical notion required is that of type based reduction of games, so that analysis
of large games can be reduced to that of interaction between player types.

Acknowledgement
We thank the anonymous referees for their helpful comments and suggestions. The second author thanks
NIAS (http://nias.knaw.nl) for support when he was working on this paper.

References
[1] L. de Alfaro & T. A. Henzinger (2000): Concurrent omega-regular Games. In: LICS 2000: 15th International
IEEE Symposium on Logic in Computer Science, IEEE Press, pp. 141–154.
[2] L. de Alfaro, T. A. Henzinger & O. Kupferman (1998): Concurrent Reachability. In: FOCS 98, IEEE, pp.
564–575.
[3] Abhijit V. Banerjee (1992): A Simple Model of Herd Behaviour. The Quarterly Journal of Economics 107(3),
pp. 797–817.
[4] J. R. Büchi & L. H. Landweber (1969): Solving Sequential Conditions by Finite-State Strategies. Transactions
of the American Mathematical Society 138, pp. 295–311.
[5] K. Chatterjee, M. Jurdziński & R. Majumdar (2004): On Nash equilibria in stochastic games. In: Proceedings of the 13th Annual Conference of the European Association for Computer Science Logic, LNCS 3210,
Springer-Verlag, pp. 26–40.
[6] E. Grädel & M. Ummels (2008): Solution Concepts and Algorithms for Infinite Multiplayer Games. In:
New Perspectives on Games and Interaction, Texts in Logic and Games 4, Amsterdam University Press, pp.
151–178.
[7] David K. Levine & Wolfgang Pesendorfer (2007): The Evolution of Cooperation Through Imitation. Games
and Economic Behaviour 58(2), pp. 293–315.
[8] D. A. Martin (1975): Borel Determinacy. Annals of Mathematics 102, pp. 363–371.
[9] D. A. Martin (1998): The Determinacy of Blackwell Games. The Journal of Symbolic Logic 63(4), pp.
1565–1581.
[10] Soumya Paul, R. Ramanujam & Sunil Simon (2009): Stability under Strategy Switching. In: Benedikt Löwe,
Klaus Ambos-Spies & Wolfgang Merkle, editors: Proceedings of the 5th Conference on Computability in
Europe (CiE), LNCS 5635, pp. 389–398.
[11] Soumya Paul & Sunil Simon (2009): Nash equilibrium in generalised Muller games. In: Proceedings of the
Conference on Foundation of Software Technology and Theoretical Computer Science, FSTTCS, Leibniz
International Proceedings in Informatics (LIPIcs) 4, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, pp.
335–346.
[12] Karl S. Schlag (1998): Why Imitate, and if so, How? A Boundedly Rational Approach to Multi-armed
Bandits. Journal of Economic Theory, pp. 130–156.
[13] W. Zielonka (1998): Infinite Games on Finitely Coloured Graphs with Applications to Automata on Infinite
Trees. Theoretical Computer Science 200(1-2), pp. 135–183.
