Biased Games

Ioannis Caragiannis, University of Patras, [email protected]

David Kurokawa, Carnegie Mellon University, [email protected]

Ariel D. Procaccia, Carnegie Mellon University, [email protected]

Abstract

We present a novel extension of normal form games that we call biased games. In these games, a player's utility is influenced by the distance between his mixed strategy and a given base strategy. We argue that biased games capture important aspects of the interaction between software agents. Our main result is that biased games satisfying certain mild conditions always admit an equilibrium. We also tackle the computation of equilibria in biased games.

Introduction

Since the seminal work of Nash (1950), the notion of Nash equilibrium has been the cornerstone of game theory (and its interaction with AI, and computer science more broadly). Nash equilibrium is most commonly employed as a solution concept for normal form games, which are defined by a set of $n$ players, their sets of pure strategies, and their payoff functions that map profiles of pure strategies (reflecting the choice of each player) to the players' payoffs. In a Nash equilibrium, each player's choice of strategy is a best response to the other players' strategies, in the sense that the player cannot secure a higher payoff by deviating to a different strategy. Nash's key mathematical contribution was to show that if players have the option of playing mixed strategies (probability distributions over pure strategies), then a Nash equilibrium always exists.

More specifically, the payoffs under a mixed strategy profile are simply the expected payoffs. To compute this expectation, note that the probability of each possible pure strategy profile $(s_1, \ldots, s_n)$ is the product of the probabilities of each player $i$ playing strategy $s_i$, and the payoffs for this pure strategy profile are given by the players' payoff functions. Importantly, even when players play mixed strategies, their utilities are completely determined by the payoff functions, which only take pure strategies as input.
In this paper, we are interested in a fundamentally different way in which a player's mixed strategy can directly affect his utility. Specifically, the player may be biased towards (or away from) a specific base strategy, so his utility may also depend on the distance between his mixed strategy and the base strategy.

We believe that these issues have been largely overlooked, as normal form games are typically seen as one-shot interactions: the mixed strategy is only important insofar as it selects a pure strategy. However, the mixed strategy itself can play a more significant role in some settings, justifying the preceding notions of bias:
- In computational environments (e.g., networks), the mixed strategies of software agents can be encoded as programs that are submitted to a server, and therefore the mixed strategies themselves are visible to certain parties. Such game-theoretic settings were nicely motivated in the work of Rozenfeld and Tennenholtz (2007), and their justification is also implicit in the earlier work of Tennenholtz (2004). Once mixed strategies are visible, bias towards certain mixed strategies can arise due to social norms (agents are expected to play certain strategies), mediation (agents are told to play certain strategies; see Monderer and Tennenholtz 2009; Rozenfeld and Tennenholtz 2007), and privacy (certain mixed strategies reveal more about the agent's preferences than others).
- In other settings, mixed strategies may be instantiated multiple times before the agents actually interact. For example, security games (Tambe 2012) are 2-player games played by a defender and an attacker. The defender's strategy specifies a random allocation of security resources to potential targets, and the attacker's strategy pinpoints a target that will be attacked. It is typically assumed that the defender would play a mixed strategy for a period of time before the attacker makes his move. The crux of this example is that redeploying security resources (such as boats, in the case of the US Coast Guard) is costly, and different instantiations of a mixed strategy lead to different deployments. This can bias the defender, say, towards pure strategies, or away from high-entropy strategies. While security games are often viewed as Stackelberg games (where the defender moves first), they have also been studied as normal form games that are solved using Nash equilibrium (Korzhyk et al. 2011).

1.1 Overview of Our Model and Results

Our definition of a biased game starts from a normal form game, but lets the utility of player $i$ for a strategy profile $p = (p_1, \ldots, p_n)$ be his (expected) payoff in the normal form game, minus a bias term of the form $f_i(\|p_i - \bar{p}_i\|)$. Here, $f_i(\cdot)$ is some real-valued function (possibly 0 everywhere), $\|\cdot\|$ is a norm, and $\bar{p}_i$ is the base strategy of $i$. Given a biased game that includes all these components, we can define an equilibrium in the same way that the Nash equilibrium is defined (each strategy is a best response to the other strategies).
We remark that a biased game and its equilibria cannot be captured as a normal form game and its Nash
equilibria. From a mathematical viewpoint, we believe
that biased games are a very natural (strict) generalization of normal form games. And from a conceptual
viewpoint, as we have argued above, they provide a new
perspective on important, AI-related problems.
In Section 3 we prove our main result: the existence of equilibria in biased games. Specifically, we show that an equilibrium always exists if each $f_i$ is a non-decreasing continuous convex function, and the norms are all $L_p$ norms. We also construct a biased game that satisfies these conditions, except for having decreasing $f_i$ functions, in which equilibria do not exist.

In Section 4 we make some progress on the computation of equilibria in biased games, by (significantly) generalizing a basic algorithm for Nash equilibrium computation.

1.2 Related Work

A line of work that is somewhat related to ours includes the classic paper on psychological games (Geanakoplos, Pearce, and Stacchetti 1989), and influential papers that extended these ideas (see, e.g., (Battigalli and Dufwenberg 2009)). Psychological games are extensions of normal form games, where each player's utility depends on: (i) the mixed strategies played by all players (as usual), and (ii) his beliefs. A psychological game also includes, for each player, a coherent response mapping, which maps the beliefs held by others and the strategies played by others to the player's belief. A psychological equilibrium consists of a vector of beliefs and a vector of strategies, such that the beliefs are coherent and, given these beliefs, the strategies form a Nash equilibrium.
Of course, in order to obtain technical results (such as equilibrium existence), one must restrict the structure of these various components of the game. One may wonder, though, whether our bias term can be encoded in a player's beliefs. This is not the case, because in psychological games the utility functions are defined in a way that, when the beliefs are fixed, we obtain a normal form game. That is, the beliefs only play a role to the extent that they determine the payoffs in the induced normal form game. Therefore, the psychological games framework cannot model our novel notion of bias.

Our Model

Before we begin, it will be helpful to establish some notation which we will use throughout the paper.

- For a positive integer $m$: $[m] = \{1, 2, \ldots, m\}$.
- $P_n = \{x \in \mathbb{R}^n \mid \sum_{i=1}^n x_i = 1,\ \forall i : x_i \ge 0\}$ is the space of allowed probability vectors in $n$ dimensions.
- Given a vector $x \in \mathbb{R}^n$, we define $(y, x_{-i})$ as $(x_1, x_2, \ldots, x_{i-1}, y, x_{i+1}, x_{i+2}, \ldots, x_n)$.
We begin with the definition of the centerpiece of study in game theory.

Definition 1. A normal form game with $n$ players is given by:

- An $n$-tuple of finite strategy spaces $(S_1, S_2, \ldots, S_n)$.
- An $n$-tuple of multilinear payoff functions $(T_1, T_2, \ldots, T_n)$, where $T_i : P_{|S_1|} \times P_{|S_2|} \times \cdots \times P_{|S_n|} \to \mathbb{R}$ is linear in each of its $n$ parameters for all $i$.

In such a game, player $i$ plays a mixed strategy, which is a probability distribution $p_i \in P_{|S_i|}$ over his finite strategy space $S_i$, and receives an (expected) utility of $T_i(p)$, where $p = (p_1, p_2, \ldots, p_n)$.
Alternatively, one can think of the payoff functions as assigning a payoff to every profile of pure strategies in $S_1 \times \cdots \times S_n$. These functions induce payoffs for a mixed strategy profile by calculating expected payoffs. We will find it more convenient, though, to think of the domain of the functions $T_i$ as mixed strategy profiles.

Importantly, when using a normal form game to analyze or predict the actions of agents, it is sensible to assume they act rationally: they will maximize their utility to the best of their ability. As a result, the equilibrium notion where each agent $i$ best responds to the others' strategies by choosing a distribution $p_i$ such that $T_i(p) \ge \max_{p_i'} T_i(p_i', p_{-i})$ (i.e., such that no other mixed strategy gives a higher payoff) is a natural focal point of study. This is known as a Nash equilibrium of a normal form game, and is so named due to Nash's proof of its existence in any such game (Nash 1950).
The main focus of our paper is on an extension of normal form games that we call biased games. Informally, a biased game is a normal form game in which the utilities of the players are determined by a multilinear payoff function (similarly to the standard setting), minus a function of the distance of the player's strategy from some base strategy.
Definition 2. A biased game with $n$ players includes an $n$-tuple of finite strategy spaces $(S_1, S_2, \ldots, S_n)$. For all $i$, player $i$'s utility for a mixed strategy profile is given by

$$u_i(p) := T_i(p) - f_i(\|p_i - \bar{p}_i\|),$$

where:

- $p = (p_1, p_2, \ldots, p_n)$.
- The $T_i$ are multilinear payoff functions, as in the classic setting.
- The $f_i$ are real-valued functions.
- The norms $\|\cdot\|$ are any valid norms (possibly different for different players). We do not attach an index to the norm to avoid confusion, as we later use subscripts for $L_p$ norms and superscripts for exponents.
- The $\bar{p}_i$ describe base strategies in $\mathbb{R}^{|S_i|}$; it is not necessary, but intuitive, that they be in $P_{|S_i|}$.

For ease of exposition we refer to $f_i(\|p_i - \bar{p}_i\|)$ as the bias term.
Notice that despite the name biased game, we allow all $f_i = 0$, which immediately reduces to the definition of a standard game.

Definition 3. An equilibrium of a biased game is a strategy profile $p$ such that $u_i(p) \ge u_i(q_i, p_{-i})$ for all $i \in [n]$ and all $q_i \in P_{|S_i|}$.
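To make Definition 2 concrete, here is a minimal sketch (our own illustration, not code from the paper) of evaluating a biased utility for two players; the function name and the example parameters are hypothetical.

```python
import numpy as np

# A minimal sketch (ours) of Definition 2 for two players:
# utility = bilinear payoff minus the bias term f_i(||p_i - pbar_i||).

def biased_utility(T, p_own, p_other, f, pbar, norm):
    """u_i(p) = T_i(p) - f_i(||p_i - pbar_i||), with T the payoff matrix of
    player i (rows indexed by i's own pure strategies)."""
    return float(p_own @ T @ p_other) - f(norm(p_own - pbar))

# A player biased toward the uniform base strategy with f(t) = t^2, L2 norm:
T = np.array([[1.0, 0.0], [0.0, 1.0]])
p1, p2 = np.array([0.7, 0.3]), np.array([0.5, 0.5])
u1 = biased_utility(T, p1, p2, lambda t: t ** 2,
                    np.array([0.5, 0.5]), np.linalg.norm)
print(u1)  # 0.5 - 0.08 = 0.42 (up to floating point)
```

Taking $f_i = 0$ recovers the ordinary expected payoff, matching the remark above.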
Below we illustrate these definitions using a simple
example.
Example 1. Let us consider a security game with two
players, the defender and the attacker, and two targets.
The defender has one resource that can cover a single
target. He receives a utility of 1 if he catches the attacker and 0 otherwise. Similarly, the attacker receives
a utility of 1 if he attacks an undefended target and 0 otherwise. Now suppose the defender has a base strategy of
[3/4, 1/4] (for pedagogical purposes). The utilities may
then be described by:

$$u_1(p_1, p_2) = p_1^\top \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} p_2 - 2 \left\| p_1 - \begin{pmatrix} 3/4 \\ 1/4 \end{pmatrix} \right\|_2^2,$$

$$u_2(p_1, p_2) = p_1^\top \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} p_2.$$

If we let

$$p_1 = \begin{pmatrix} x \\ 1-x \end{pmatrix}, \qquad p_2 = \begin{pmatrix} y \\ 1-y \end{pmatrix},$$

then simple analysis shows that simultaneously maximizing $u_1$ w.r.t. $x$ and $u_2$ w.r.t. $y$ is equivalent to:

$$x = \frac{2y + 5}{8}, \qquad y = \begin{cases} 1 & \text{if } x < 1/2 \\ 0 & \text{if } x > 1/2 \\ \text{anything} & \text{otherwise.} \end{cases}$$

We can then see that the only equilibrium is $x = 5/8$ and $y = 0$.

Existence of Equilibria

In this section we examine the question of existence of equilibria in biased games, and prove our main result:

Theorem 1. Every biased game in which $f_i$ is a non-decreasing continuous convex function for all $i \in [n]$, and the norms are all $L_p$ norms, has an equilibrium.

Proof. Consider the function $h(p) = (q_1, q_2, \ldots, q_n)$, where $p_i, q_i \in P_{|S_i|}$ are probability distributions over $i$'s actions and $q_i = \arg\max_{p_i'} v_i(p_i', p)$ with

$$v_i(p_i', p) = u_i(p_i', p_{-i}) - \|p_i' - p_i\|_2^2.$$

We first show that this function is a well-defined continuous function and thus, because it acts upon a convex compact set, must have a fixed point by Brouwer's theorem. We then proceed to show that any fixed point of $h$ must be an equilibrium.

Lemma 1. $h$ is well-defined; that is, the $\arg\max$ produces exactly one probability distribution.

Proof. Let $i$ be given. We will show that $q_i$ is well-defined. Since $v_i(p_i', p)$, as a function of $p_i'$, is a continuous function on a compact space, it must achieve its maximum. It therefore suffices to show that there exists a unique maximizer of $v_i$. Suppose for the purposes of contradiction there exist two such $q_i$, denoted $x$ and $y$. Then let $\lambda \in (0, 1)$ and $z = \lambda x + (1 - \lambda) y$. Now consider the value $v_i$ would achieve at $z$:

$$v_i(z, p) = u_i(z, p_{-i}) - \|z - p_i\|_2^2 = T_i(z, p_{-i}) - f_i(\|z - \bar{p}_i\|) - \|z - p_i\|_2^2.$$

Specifically, let us consider each term separately:

$$T_i(z, p_{-i}) = T_i(\lambda x + (1 - \lambda) y, p_{-i}) = \lambda T_i(x, p_{-i}) + (1 - \lambda) T_i(y, p_{-i}).$$

$$f_i(\|z - \bar{p}_i\|) = f_i(\|\lambda x + (1 - \lambda) y - \lambda \bar{p}_i - (1 - \lambda) \bar{p}_i\|) = f_i(\|\lambda (x - \bar{p}_i) + (1 - \lambda)(y - \bar{p}_i)\|) \le f_i(\lambda \|x - \bar{p}_i\| + (1 - \lambda) \|y - \bar{p}_i\|) \le \lambda f_i(\|x - \bar{p}_i\|) + (1 - \lambda) f_i(\|y - \bar{p}_i\|),$$

where we have used the triangle inequality (together with the fact that $f_i$ is non-decreasing) and the definition of convexity.

$$\|z - p_i\|_2^2 = \|\lambda x + (1 - \lambda) y - p_i\|_2^2 = \|\lambda (x - p_i) + (1 - \lambda)(y - p_i)\|_2^2 \le (\lambda \|x - p_i\|_2 + (1 - \lambda) \|y - p_i\|_2)^2 \le \lambda \|x - p_i\|_2^2 + (1 - \lambda) \|y - p_i\|_2^2,$$

where we have again used the triangle inequality and the definition of convexity. Importantly, there are two critical differences between this term and the previous one.

First, in this case the first inequality (created due to the use of the triangle inequality of $\|\cdot\|_2$) achieves equality if and only if $x - p_i = \alpha (y - p_i)$ for some $\alpha \ge 0$. This is a well-known fact of $L_p$ norms (it is due to the biconditional conditions for equality in Minkowski's inequality), and intuitively the condition is equivalent to the two vectors facing the same direction.

Second, in this case the second inequality (created due to the use of convexity) is over a strictly convex function. We thus have equality if and only if $\|x - p_i\|_2 = \|y - p_i\|_2$. That is, the two vectors have the same magnitude.

From these two observations, we see that we have equality if and only if $x - p_i = y - p_i$, or more explicitly, $x = y$. As we are assuming this is not the case, we have strict inequality. With this analysis of each term in separation, we get the following:

$$v_i(z, p) = T_i(z, p_{-i}) - f_i(\|z - \bar{p}_i\|) - \|z - p_i\|_2^2 > \lambda T_i(x, p_{-i}) + (1 - \lambda) T_i(y, p_{-i}) - \lambda f_i(\|x - \bar{p}_i\|) - (1 - \lambda) f_i(\|y - \bar{p}_i\|) - \lambda \|x - p_i\|_2^2 - (1 - \lambda) \|y - p_i\|_2^2 = \lambda v_i(x, p) + (1 - \lambda) v_i(y, p) = \lambda v_i(x, p) + (1 - \lambda) v_i(x, p) = v_i(x, p).$$

Thus, $z$ is strictly better than the assumed maximizers $x$ and $y$! This is a clear contradiction. (Lem 1)
Lemma 2. $h$ is continuous.

Proof. We first establish some notation which we will use throughout the proof.

- $P$ will be the set of values that $p$ can be drawn from.
- $\|\cdot\|$ will be the $L_2$ norm for vectors in $\mathbb{R}^n$, and for $p \in P$, $\|p\|^2 = \sum_{i=1}^n \|p_i\|^2$. The induced metrics of these norms will allow us to formally discuss continuity.
- $p \in P$ will denote the arbitrary input point to $h$ for which we wish to show continuity.

Now it suffices to show that $q_i$ is continuous at $p$ for arbitrary $i$. Suppose $\varepsilon > 0$ is given. We wish to show that there exists a $\delta$ such that $\|p - p'\| < \delta$ implies $\|q_i - q_i'\| < \varepsilon$, where $q_i = \arg\max_x v_i(x, p)$ and $q_i' = \arg\max_x v_i(x, p')$.

Let $X = \{x \mid \|x - q_i\| < \varepsilon\}$ and assume that $X^c$ is nonempty (if this is not the case, decrease the size of $\varepsilon$ until this is the case). Furthermore, as $X^c$ is a closed subset of a compact set, it is compact as well. Now note that $v_i(y, p)$ is a continuous function of $y$ and therefore achieves its maximum on the compact set $X^c$, denoted $M_\varepsilon$. Importantly, Lemma 1 shows that $M_\varepsilon \neq M$ (and hence $M_\varepsilon < M$), where

$$M = \max_x v_i(x, p) = v_i(q_i, p).$$

We now further observe that $v_i(x, q)$ is a continuous function of $q$ on a compact set and thus, by the Heine-Cantor theorem, is uniformly continuous. We therefore have that there exists some $\delta > 0$ such that if $\|p - p'\| < \delta$ then

$$|v_i(x, p) - v_i(x, p')| < \frac{M - M_\varepsilon}{2}$$

for all $x$. Two important consequences of this are:

$$v_i(q_i, p') = v_i(q_i, p) - v_i(q_i, p) + v_i(q_i, p') \ge v_i(q_i, p) - |v_i(q_i, p) - v_i(q_i, p')| > M - \frac{M - M_\varepsilon}{2} = \frac{M + M_\varepsilon}{2}.$$

For all $y \in X^c$:

$$v_i(y, p') = v_i(y, p) - v_i(y, p) + v_i(y, p') \le v_i(y, p) + |v_i(y, p) - v_i(y, p')| < M_\varepsilon + \frac{M - M_\varepsilon}{2} = \frac{M + M_\varepsilon}{2}.$$

Together, these imply that $v_i(q_i, p') > v_i(y, p')$ for all $y \in X^c$, and so $\arg\max_x v_i(x, p') \in X$. Thus, $\|q_i - q_i'\| < \varepsilon$. (Lem 2)
We have shown that $h$ is indeed well-defined and continuous. By Brouwer's fixed point theorem, it must therefore have a fixed point. Our next lemma demonstrates that such fixed points are equilibria.

Lemma 3. Any fixed point of $h$ represents an equilibrium in the biased game.

Proof. Suppose for the purposes of contradiction we have a fixed point of $h$ that is not an equilibrium. That is, suppose $h(p) = p$ and there exists some $i$ such that player $i$ gains by the unilateral deviation from $p_i$ to $p_i'$. Now let $\lambda \in [0, 1]$ and $z = \lambda p_i' + (1 - \lambda) p_i$. Then we have:

$$v_i(z, p) = u_i(z, p_{-i}) - \|z - p_i\|_2^2 = T_i(z, p_{-i}) - f_i(\|z - \bar{p}_i\|) - \|z - p_i\|_2^2.$$

Similarly to the proof of Lemma 1 we find that

$$T_i(z, p_{-i}) = \lambda T_i(p_i', p_{-i}) + (1 - \lambda) T_i(p_i, p_{-i})$$

and

$$f_i(\|z - \bar{p}_i\|) \le \lambda f_i(\|p_i' - \bar{p}_i\|) + (1 - \lambda) f_i(\|p_i - \bar{p}_i\|).$$

Moreover,

$$\|z - p_i\|_2^2 = \|\lambda p_i' + (1 - \lambda) p_i - p_i\|_2^2 = \|\lambda (p_i' - p_i)\|_2^2 = \lambda^2 \|p_i' - p_i\|_2^2.$$

We therefore find that:

$$v_i(z, p) = T_i(z, p_{-i}) - f_i(\|z - \bar{p}_i\|) - \|z - p_i\|_2^2 \ge \lambda T_i(p_i', p_{-i}) + (1 - \lambda) T_i(p_i, p_{-i}) - \lambda f_i(\|p_i' - \bar{p}_i\|) - (1 - \lambda) f_i(\|p_i - \bar{p}_i\|) - \lambda^2 \|p_i' - p_i\|_2^2 = \lambda u_i(p_i', p_{-i}) + (1 - \lambda) u_i(p_i, p_{-i}) - \lambda^2 \|p_i' - p_i\|_2^2.$$

Now let us consider this lower bound, call it LB. For $\lambda = 0$ we have $\mathrm{LB} = u_i(p_i, p_{-i})$, which is tight (as $z = p_i$ in this case); that is, we have equality. Moreover, we have

$$\frac{\partial\,\mathrm{LB}}{\partial \lambda}\bigg|_{\lambda = 0} = u_i(p_i', p_{-i}) - u_i(p_i, p_{-i}).$$

Importantly, we assumed that it was advantageous for player $i$ to deviate from $p_i$ to $p_i'$, and thus we have that $u_i(p_i', p_{-i}) - u_i(p_i, p_{-i}) > 0$. Therefore, there exists a small $\lambda$ such that

$$v_i(z, p) > u_i(p_i, p_{-i}) = u_i(p_i, p_{-i}) - \|p_i - p_i\|_2^2 = v_i(p_i, p).$$

This is a clear contradiction, since $h(p) = p$ means $p_i$ maximizes $v_i(\cdot, p)$. (Lem 3)

To conclude, we have argued that $h$ has a fixed point, and every fixed point is an equilibrium of the biased game. This completes the proof of the theorem. (Thm 1)
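To illustrate the map $h$ from the proof, consider the biased security game of Example 1. There each $v_i$ is a concave quadratic in the deviating player's own probability, so $h$ has a closed form (the unconstrained stationary point, clipped to $[0,1]$; the formulas below are our own derivation for this specific game, not part of the proof).

```python
# Numerical illustration (ours) of the fixed-point map h on Example 1.
# v1(x', p) = x'y + (1-x')(1-y) - 4(x' - 3/4)^2 - 2(x' - x)^2 and
# v2(y', p) = x(1-y') + (1-x)y' - 2(y' - y)^2 are concave quadratics,
# so each argmax is the stationary point clipped to [0, 1].

def clip01(t):
    return min(1.0, max(0.0, t))

def h(x, y):
    # d v1 / dx' = (2y - 1) - 8(x' - 3/4) - 4(x' - x) = 0
    x_new = clip01((2 * y + 5 + 4 * x) / 12)
    # d v2 / dy' = (1 - 2x) - 4(y' - y) = 0
    y_new = clip01(y + (1 - 2 * x) / 4)
    return x_new, y_new

x, y = 0.5, 0.5
for _ in range(2000):
    x, y = h(x, y)

print(x, y)  # reaches the fixed point x = 5/8 = 0.625, y = 0
```

Brouwer's theorem only guarantees that a fixed point of $h$ exists; iterating $h$ need not converge in general, but for this game it happens to reach $(5/8, 0)$, matching the equilibrium computed in Example 1.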
Let us now examine the assumptions that Theorem 1 makes. The assumption that the $f_i$ are continuous and convex, and that the norms are $L_p$ norms, can be seen as mild technical assumptions. But the assumption that the $f_i$ are non-decreasing has more conceptual bite.

What would a biased game with decreasing $f_i$ look like? For example, recall our discussion of security games, where a mixed strategy equates to a potentially costly redeployment of resources. One of the possible interpretations is to set the base strategy to be the uniform distribution (that is, a uniformly random assignment of resources), and let the defender be biased away from this strategy; that is, the $f_i$ functions are negative and strictly decreasing.

Unfortunately, it turns out that Theorem 1 is simply not true for decreasing $f_i$ functions: a biased game may not admit equilibria even when all the other assumptions hold, i.e., the $f_i$ are continuous convex functions and the norms are $L_p$ norms. Below we construct such an example.
Example 2. Suppose we have a two-player game with

$$T_1 = \begin{pmatrix} 4 & 0 \\ 0 & 0 \end{pmatrix}, \qquad T_2 = \begin{pmatrix} 0 & 4 \\ 0 & 0 \end{pmatrix},$$

$$B_1\begin{pmatrix} x \\ 1-x \end{pmatrix} = -\left\| \begin{pmatrix} x \\ 1-x \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right\|_2^2, \qquad B_2\begin{pmatrix} y \\ 1-y \end{pmatrix} = -\left\| \begin{pmatrix} y \\ 1-y \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\|_2^2,$$

where the $B_i$ describe the bias terms and, as in Example 1, $p_1 = (x, 1-x)^\top$ and $p_2 = (y, 1-y)^\top$. Then the utility of the (row) player 1 as a function of $x$ and $y$ is $u_1(x, y) = 2(x^2 + 2(y - 1)x + 1)$. Similarly, the utility of the (column) player 2 is $u_2(x, y) = 2(y^2 - 2xy + 2x)$.

Now note that as $u_1$ is an upward-facing parabola in $x$, its maximum over the set $x \in [0, 1]$ is reached at one of the endpoints (i.e., $x \in \{0, 1\}$). So let us consider these two cases.

Suppose first that $x = 0$. Then $u_2(x, y) = u_2(0, y) = 2y^2$, and so $u_2$ is maximized for $y \in [0, 1]$ when $y = 1$. However, this implies that $u_1(x, y) = u_1(x, 1) = 2x^2 + 2$, and thus $u_1$ is maximized when $x = 1$; a contradiction.

Now suppose instead that $x = 1$. Then $u_2(x, y) = u_2(1, y) = 2(y^2 - 2y + 2)$, and so $u_2$ is maximized for $y \in [0, 1]$ when $y = 0$. However, this implies that $u_1(x, y) = u_1(x, 0) = 2(x^2 - 2x + 1)$, and thus $u_1$ is maximized when $x = 0$; a contradiction.

Computation of Equilibria

In this section we investigate the computation of equilibria in biased games. From the outset our expectations
are quite low, as even in normal form games, computing

a Nash equilibrium is computationally hard (Daskalakis,


Goldberg, and Papadimitriou 2009; Conitzer and Sandholm 2008). However, there are various equilibrium computation algorithms that work well on average or in special cases, such as the famous LemkeHowson (1964) Algorithm.
A major challenge in our (much more general) setting
is that algorithms for computing Nash equilibria in normal form games typically rely on the property that if p
is a Nash equilibrium and si is a pure strategy in the support of pi , then si is itself a best response to pi . This
is clearly not the case when it comes to biased games
and their equilibria. This can be seen in Example 1,
where pure strategies in the support of p1 are never best
responses to p2 .
To further illustrate the difficulties that one encounters in our setting, we note that in normal form games
with two players and rational payoffs, Nash equilibria
are defined by rational probabilities (Nash 1950), and
this is, of course, true if there is only one player. In
contrast, below we give an example of a biased game
with one player and rational parameters, but irrational
equilibrium.
Example 3. Consider the game where the sole player
has two strategies with payoffs 1 and 0, and the bias term
is kp [1/2, 1/2]| k42 . A simple analysis then yields that
the sole equilibrium is to play the first strategy with
1
probability 12 +
3 .
2 2
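The "simple analysis" amounts to maximizing $u(x) = x - 4(x - 1/2)^4$, which is concave on $[0,1]$. The following sketch (ours) recovers the irrational maximizer numerically and compares it to the closed form.

```python
# Numeric check (ours) of Example 3. The single player's utility is
#   u(x) = x - ||(x, 1-x) - (1/2, 1/2)||_2^4 = x - 4 (x - 1/2)^4,
# which is concave on [0, 1]; ternary search locates its maximizer, and the
# closed-form maximizer is the irrational x* = 1/2 + 1/(2 * 2**(1/3)).

def u(x):
    return x - 4 * (x - 0.5) ** 4

lo, hi = 0.0, 1.0
for _ in range(200):  # ternary search on a unimodal function
    m1 = lo + (hi - lo) / 3
    m2 = hi - (hi - lo) / 3
    if u(m1) < u(m2):
        lo = m1
    else:
        hi = m2

x_star = 0.5 + 1 / (2 * 2 ** (1 / 3))
print(abs((lo + hi) / 2 - x_star))  # negligibly small
```

Setting $u'(x) = 1 - 16(x - 1/2)^3 = 0$ gives $x = 1/2 + 16^{-1/3} = 1/2 + 1/(2\sqrt[3]{2})$, which is what the search converges to.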

Due to these difficulties, we focus on certain subsets of biased games (which, in particular, circumvent Example 3). Specifically, we consider the two-player (and later, more generally the $n$-player) setting with a bias term of the form $c\|\cdot\|_1$ or $c\|\cdot\|_2^2$, where $c \ge 0$ is some constant. Crucially, this still generalizes the classic setting. Our goal is to generalize the (extremely simple) support enumeration algorithm for computing Nash equilibria (see, e.g., (von Stengel 2007, Algorithm 3.4)).

Let us first consider the $L_2$ case: player $i$ has a bias term of the form $c_i\|\cdot\|_2^2$, where $c_i \ge 0$. Recall that for each player $i$, if the strategy of the other player is fixed, then $i$ simply wishes to maximize his utility $u_i(p_i, p_{-i})$. That is, for every player $i$, we wish to have that $u_i(p_i, p_{-i})$ is maximized subject to the constraints that the entries of $p_i$ are nonnegative and sum to one. The Karush-Kuhn-Tucker (KKT) conditions on a player's utility then give necessary and sufficient conditions for maximization; sufficiency is due to the concavity of the objective and the affine nature of the constraints. Thus, equilibrium computation is equivalent to solving the following system. For all $i$ and pure strategies $j$ of $i$:

$$p_{i,j} \ge 0, \qquad \mu_{i,j} \ge 0, \qquad \mu_{i,j}\, p_{i,j} = 0, \qquad p_i^\top \vec{1} = 1,$$

$$\mathrm{STANDARD}(i, j) - 2\,\mathrm{BIAS}(i, j) + \lambda_i + \mu_{i,j} = 0,$$

where

$$\mathrm{STANDARD}(i, j) = \frac{\partial}{\partial p_{i,j}} T_i(p_1, p_2)$$

and

$$\mathrm{BIAS}(i, j) = c_i (p_{i,j} - \bar{p}_{i,j}).$$

Crucially, aside from the $\mu_{i,j}\, p_{i,j} = 0$ conditions, the complete characterization is then a linear feasibility program. We can thus consider the $2^{|S_1| + |S_2|}$ possibilities (recall that $|S_i|$ is the number of pure strategies of player $i$) of which one of $\mu_{i,j}$ and $p_{i,j}$ is zero to find the equilibria. That is, for every player $i$ and strategy $j$ of $i$, we set one of $\mu_{i,j}$ and $p_{i,j}$ to zero and solve the resulting linear program. This computes an equilibrium exactly (albeit in exponential time).
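The $L_2$ procedure is easy to prototype. The following sketch (ours, with illustrative variable names) runs it on the biased security game of Example 1, enumerating the $2^{|S_1|+|S_2|} = 16$ complementarity choices and solving each resulting square linear system directly; inconsistent systems are rejected via the least-squares residual, which is a crude stand-in for a proper LP feasibility check.

```python
import itertools

import numpy as np

# Sketch (ours) of support enumeration for the L2 case, on Example 1:
# player 1 has bias 2 * ||p1 - (3/4, 1/4)||_2^2 (c1 = 2), player 2 is
# unbiased (c2 = 0). For two players the KKT system is linear, so each
# complementarity choice (p_{i,j} = 0 or mu_{i,j} = 0) gives a linear system.

T1 = np.array([[1.0, 0.0], [0.0, 1.0]])
T2 = np.array([[0.0, 1.0], [1.0, 0.0]])
c = [2.0, 0.0]
pbar = [np.array([0.75, 0.25]), np.array([0.5, 0.5])]  # pbar[1] is irrelevant

# Unknowns: p1 (cols 0-1), p2 (2-3), lam1 (4), lam2 (5), mu1 (6-7), mu2 (8-9).

def solve(choice):
    """choice[k] = 0 sets the k-th probability to zero, 1 sets its mu to zero."""
    A, b = np.zeros((10, 10)), np.zeros(10)
    for j in range(2):  # player 1: (T1 p2)_j - 2 c1 (p1j - pbar1j) + lam1 + mu1j = 0
        A[j, 2:4] = T1[j]
        A[j, j] = -2 * c[0]
        A[j, 4] = 1.0
        A[j, 6 + j] = 1.0
        b[j] = -2 * c[0] * pbar[0][j]
    for j in range(2):  # player 2: (p1^T T2)_j - 2 c2 (p2j - pbar2j) + lam2 + mu2j = 0
        A[2 + j, 0:2] = T2[:, j]
        A[2 + j, 2 + j] = -2 * c[1]
        A[2 + j, 5] = 1.0
        A[2 + j, 8 + j] = 1.0
        b[2 + j] = -2 * c[1] * pbar[1][j]
    A[4, 0:2] = 1.0; b[4] = 1.0  # p1 sums to one
    A[5, 2:4] = 1.0; b[5] = 1.0  # p2 sums to one
    for k in range(4):  # complementarity: zero out p_k (cols 0-3) or mu_k (cols 6-9)
        A[6 + k, k if choice[k] == 0 else 6 + k] = 1.0
    v = np.linalg.lstsq(A, b, rcond=None)[0]
    if np.linalg.norm(A @ v - b) > 1e-9:  # inconsistent system: reject
        return None
    if (v[0:4] < -1e-9).any() or (v[6:10] < -1e-9).any():  # p or mu negative
        return None
    return v[0:2], v[2:4]

sol = None
for choice in itertools.product([0, 1], repeat=4):
    sol = solve(choice)
    if sol is not None:
        break

print(sol)  # the feasible system recovers p1 = (5/8, 3/8), p2 = (0, 1)
```

By KKT sufficiency (concave objective, affine constraints), any feasible system found this way is an equilibrium; here it is the one computed analytically in Example 1.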
Dealing with bias terms of the form $c_i\|\cdot\|_1$, where $c_i \ge 0$, is largely analogous. The important difference appears due to the discontinuity of the derivative of the $L_1$ norm. Via a simple case analysis, which we omit here, we see that for all $i$ and pure strategies $j$ of $i$:

$$p_{i,j} \ge 0, \qquad \mu_{i,j} \ge 0, \qquad \mu_{i,j}\, p_{i,j} = 0, \qquad p_i^\top \vec{1} = 1,$$

and at least one of two discontinuity-caused requirements must be satisfied: either

$$\mathrm{STANDARD}(i, j) - \mathrm{BIAS}(i, j) + \lambda_i + \mu_{i,j} = 0 \quad \text{and} \quad \mathrm{BIAS}(i, j) \ge 0,$$

or

$$\mathrm{STANDARD}(i, j) + \mathrm{BIAS}(i, j) + \lambda_i + \mu_{i,j} = 0 \quad \text{and} \quad \mathrm{BIAS}(i, j) \le 0.$$

Therefore, to compute an equilibrium when all players have $c\|\cdot\|_1$ bias terms, we can consider the $4^{|S_1| + |S_2|}$ linear programs that arise. That is, for every player $i$ who has a $c\|\cdot\|_1$ bias term, and strategy $j$ of $i$, we set one of $\mu_{i,j}$ and $p_{i,j}$ to zero (as before) and also determine which of the two discontinuity-caused requirements we will choose to include in our linear program; the players who have $c\|\cdot\|_2^2$ bias terms are dealt with as before.
The next statement summarizes the discussion above.

Theorem 2. Given a biased game with two players and bias terms of the form $c_i\|p_i - \bar{p}_i\|_2^2$ or $c_i\|p_i - \bar{p}_i\|_1$, an equilibrium can be computed by solving at most $4^{|S_1| + |S_2|}$ linear programs.

Although we have discussed the algorithm only for the two-player setting, this approach can be extended to multiple players at the cost of solving multilinear programs instead of linear ones. The discrepancy is caused by the $\mathrm{STANDARD}(i, j)$ term: the derivative in its expression is multilinear for $n \ge 3$. This is a generalization of the sequential approach discussed by Widger and Grosu (2009) for normal form games.

Discussion

In our model of biased games, a player's bias term is $f_i(\|p_i - \bar{p}_i\|)$. Consider a more general definition of the bias term,

$$f_i\left( \left\| p_i - \left( \bar{p}_i + \sum_{j \in [n]} \alpha_{i,j}\, p_j \right) \right\| \right),$$

where the $\alpha_{i,j}$ are real weights such that $\alpha_{i,j} = 0$ if players $i$ and $j$ do not have the same number of pure strategies. The new term $\sum_{j \in [n]} \alpha_{i,j}\, p_j$ captures a different type of bias: towards playing a mixed strategy that is similar to, or different from, those played by other players. This extension can help capture well-known social phenomena, such as homophily (the propensity of individuals to bond with others that are similar to themselves). Interestingly, all our results (existence and computation) also hold for the more general definition of biased games. However, it seems poorly motivated in the context of the settings discussed in Section 1, which is why we chose to simplify the presentation by focusing on the more restricted definition of biased games.
We feel that our existence result (Theorem 1) is rather general, but the computation of equilibria in biased games poses difficult problems even for zero-sum games. Still, we have been able to identify a rather interesting property of two-player zero-sum biased games with bias terms of the form $c\|p_i - d\,\bar{p}_i\|_2^2$ for some constants $c > 0$ and $d$. Based on quite a bit of linear algebra, we can show that each of the $2^{|S_1| + |S_2|}$ linear programs in the support enumeration machinery yields at most one solution. This stands in contrast to more general biased games that may have infinitely many equilibria. For example, it can easily be seen that in the two-player game with

$$T_1 = T_2 = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$$

and the same bias term $\|p_i - (\frac{1}{2}, \frac{1}{2})^\top\|_2^2$ for both players, every pair of identical mixed strategies is an equilibrium.
The analytical uniqueness result mentioned above, together with preliminary experimental results, leads us to the following conjecture.

Conjecture 1. In two-player zero-sum biased games with bias terms of the form $c\|p_i - d\,\bar{p}_i\|_2^2$, where $c > 0$ and $d$ are constants, there exists exactly one equilibrium.
Of course, the conditions of Conjecture 1 are quite
restrictive. In order to tackle equilibrium computation
in biased games under assumptions that are as general
as those of Theorem 1, fundamentally new ideas are required.

References
Battigalli, P., and Dufwenberg, M. 2009. Dynamic
psychological games. Journal of Economic Theory
144(1):135.
Conitzer, V., and Sandholm, T. 2008. Complexity results about Nash equilibria. Games and Economic Behavior 63(2):621641.
Daskalakis, C.; Goldberg, P. W.; and Papadimitriou,
C. H. 2009. The complexity of computing a Nash equilibrium. SIAM Journal on Computing 39(1):195259.
Geanakoplos, J.; Pearce, D.; and Stacchetti, E. 1989.
Psychological games and sequential rationality. Games
and Economic Behavior 1:6079.
Korzhyk, D.; Yin, Z.; Kiekintveld, C.; Conitzer, V.; and
Tambe, M. 2011. Stackelberg vs. Nash in security
games: An extended investigation of interchangeability, equivalence, and uniqueness. Journal of Artificial
Intelligence Research 41:297327.
Lemke, C. E., and Howson, J. T. 1964. Equilibrium
points of bimatrix games. SIAM Journal on Applied
Mathematics 12(2):413423.
Monderer, D., and Tennenholtz, M. 2009. Strong mediated equilibrium. Artificial Intelligence 173(1):180
195.
Nash, J. F. 1950. Equilibrium points in N-person games.
Proceedings of the National Academy of Sciences of the
United States of America 36:4849.
Rozenfeld, O., and Tennenholtz, M. 2007. Routing mediators. In Proceedings of the 20th International Joint
Conference on Artificial Intelligence (IJCAI), 1488
1493.
Tambe, M. 2012. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press.
Tennenholtz, M. 2004. Program equilibrium. Games
and Economic Behavior 49(2):363373.
von Stengel, B. 2007. Equilibrium computation for twoplayer games in strategic and extensive form. In Nisan,
N.; Roughgarden, T.; Tardos, E.; and Vazirani, V., eds.,
Algorithmic Game Theory. Cambridge University Press.
chapter 9.
Widger, J., and Grosu, D. 2009. Parallel computation
of Nash equilibria in N-player games. In Proceedings
of the 12th International Conference on Computational
Science and Engineering (CSE), 209215.
