Biased Games
Ioannis Caragiannis
University of Patras
David Kurokawa
Ariel D. Procaccia
Carnegie Mellon University
Abstract
We present a novel extension of normal form games that
we call biased games. In these games, a player's utility
is influenced by the distance between his mixed strategy and a given base strategy. We argue that biased
games capture important aspects of the interaction between software agents. Our main result is that biased
games satisfying certain mild conditions always admit
an equilibrium. We also tackle the computation of equilibria in biased games.
Introduction
We believe that these issues have been largely overlooked, as normal form games are typically seen as one-shot interactions: the mixed strategy is only important
insofar as it selects a pure strategy. However, the mixed
strategy itself can play a more significant role in some
settings, justifying the preceding notions of bias:
In computational environments (e.g., networks), the mixed strategies of software agents can be encoded as programs that are submitted to a server, and therefore the mixed strategies themselves are visible to certain parties. Such game-theoretic settings were nicely motivated in the work of Rozenfeld and Tennenholtz (2007), and their justification is also implicit in the earlier work of Tennenholtz (2004). Once mixed strategies are visible, bias towards certain mixed strategies can arise due to social norms (agents are expected to play certain strategies), mediation (agents are told to play certain strategies; see Monderer and Tennenholtz 2009; Rozenfeld and Tennenholtz 2007), and privacy (certain mixed strategies reveal more about an agent's preferences than others).
In other settings, mixed strategies may be instantiated
multiple times before the agents actually interact. For
example, security games (Tambe 2012) are 2-player
games played by a defender and an attacker. The defender's strategy specifies a random allocation of security resources to potential targets, and the attacker's strategy pinpoints a target that will be attacked. It
is typically assumed that the defender would play a
mixed strategy for a period of time before the attacker
makes his move. The crux of this example is that
redeploying security resources (such as boats, in the
case of the US Coast Guard) is costly, and different
instantiations of a mixed strategy lead to different deployments. This can bias the defender, say, towards
pure strategies, or away from high-entropy strategies.
While security games are often viewed as Stackelberg games (where the defender moves first), they
have also been studied as normal form games that are
solved using Nash equilibrium (Korzhyk et al. 2011).
Related Work

Our Model
p_1 = \begin{pmatrix} x \\ 1-x \end{pmatrix}, \qquad p_2 = \begin{pmatrix} y \\ 1-y \end{pmatrix},

then simple analysis shows that simultaneously maximizing u_1 with respect to x and u_2 with respect to y is equivalent to:

x = \frac{2y + 5}{8}, \qquad y = \begin{cases} 1 & \text{if } x < 1/2 \\ 0 & \text{if } x > 1/2 \\ \text{anything} & \text{otherwise.} \end{cases}
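As a sanity check, the best-response conditions above can be resolved mechanically. A minimal sketch (the helper names br1 and br2 are illustrative, and the conditions are hard-coded from the display above):

```python
# Sketch: numerically resolve the mutual best-response conditions above.
# br1 encodes x = (2y + 5)/8; br2 encodes the case analysis for y.
def br1(y):
    return (2 * y + 5) / 8

def br2(x):
    if x < 0.5:
        return 1.0
    if x > 0.5:
        return 0.0
    return None  # at x = 1/2, any y is a best response

# try both candidate values for y and keep the self-consistent branch
solutions = [(br1(y0), y0) for y0 in (0.0, 1.0) if br2(br1(y0)) == y0]
print(solutions)  # the pair (x, y) = (5/8, 0) is mutually consistent
```

Under these conditions, only the branch y = 0 is self-consistent: it yields x = 5/8 > 1/2, which in turn makes y = 0 a best response.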
We first show that this function is a well-defined continuous function and thus, because it acts upon a convex compact set, must have a fixed point by Brouwer's theorem. We then proceed to show that any fixed point of h must be an equilibrium.
Lemma 1. h is well-defined.
Proof. Let i be given. We will show that q_i is well-defined. Since v_i(p'_i, p) as a function of p'_i is a continuous function on a compact space, it must achieve its maximum. It therefore suffices to show that there exists a unique maximizer of v_i. Suppose for the purposes of contradiction that there exist two such q_i, denoted x and y. Then let \lambda \in (0, 1) and z = \lambda x + (1 - \lambda) y. Now consider the value v_i would achieve at z:

v_i(z, p) = u_i(z, p_{-i}) - \| z - p_i \|_2^2
          = T_i(z, p_{-i}) - f_i(\| z - \hat{p}_i \|) - \| z - p_i \|_2^2,

where \hat{p}_i denotes the base strategy of player i. Specifically, let us consider each term separately:

T_i(z, p_{-i}) = T_i(\lambda x + (1 - \lambda) y, p_{-i})
               = \lambda T_i(x, p_{-i}) + (1 - \lambda) T_i(y, p_{-i}).

f_i(\| z - \hat{p}_i \|) = f_i(\| \lambda x + (1 - \lambda) y - \lambda \hat{p}_i - (1 - \lambda) \hat{p}_i \|)
  = f_i(\| \lambda (x - \hat{p}_i) + (1 - \lambda)(y - \hat{p}_i) \|)
  \le f_i(\lambda \| x - \hat{p}_i \| + (1 - \lambda) \| y - \hat{p}_i \|)
  \le \lambda f_i(\| x - \hat{p}_i \|) + (1 - \lambda) f_i(\| y - \hat{p}_i \|),

where we have used the triangle inequality (together with the fact that f_i is nondecreasing) and the definition of convexity.

\| z - p_i \|_2^2 = \| \lambda x + (1 - \lambda) y - p_i \|_2^2
  = \| \lambda (x - p_i) + (1 - \lambda)(y - p_i) \|_2^2
  \le (\lambda \| x - p_i \|_2 + (1 - \lambda) \| y - p_i \|_2)^2
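The derivation is cut off at this point; for completeness, a sketch of how the bounds combine, using the strict convexity of the squared Euclidean norm with u = x - p_i and v = y - p_i:

```latex
\lambda \|u\|_2^2 + (1-\lambda)\|v\|_2^2 - \|\lambda u + (1-\lambda) v\|_2^2
  = \lambda(1-\lambda)\|u - v\|_2^2 > 0 \quad \text{for } u \neq v,
```

so the last bound is strict when x \neq y, and combining the three estimates gives v_i(z, p) > \lambda v_i(x, p) + (1-\lambda) v_i(y, p) = v_i(x, p), contradicting the assumption that x and y are both maximizers.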
Existence of Equilibria
B_1 = \left\| \begin{pmatrix} x \\ 1-x \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right\|_2^2 \quad \text{and} \quad B_2 = \left\| \begin{pmatrix} y \\ 1-y \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\|_2^2,

where the B_i describe the bias terms. Then the utility of the (row) player 1 as a function of x and y is u_1(x, y) = 2(x^2 + 2(y - 1)x + 1). Similarly, the utility of the (column) player 2 is u_2(x, y) = 2(y^2 - 2xy + 2x).
Now note that as u_1 is an upward-facing parabola in x, its maximum over the set x ∈ [0, 1] is reached at one of the endpoints (i.e., x ∈ {0, 1}). So let us consider these two cases.

Suppose first that x = 0. Then u_2(x, y) = u_2(0, y) = 2y^2, and so u_2 is maximized for y ∈ [0, 1] when y = 1. However, this implies that u_1(x, y) = u_1(x, 1) = 2x^2 + 2, and thus u_1 is maximized when x = 1, a contradiction.

Now suppose instead that x = 1. Then u_2(x, y) = u_2(1, y) = 2(y^2 - 2y + 2), and so u_2 is maximized for y ∈ [0, 1] when y = 0. However, this implies that u_1(x, y) = u_1(x, 0) = 2(x^2 - 2x + 1), and thus u_1 is maximized when x = 0, a contradiction.
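This cycling argument is easy to verify by brute force. The sketch below (the helper names gain and grid are illustrative) checks on a grid that every strategy profile leaves some player a strictly positive unilateral improvement:

```python
import numpy as np

# utilities from the example above (both upward-facing parabolas in the
# player's own variable, so optimal deviations lie at the endpoints {0, 1})
def u1(x, y):
    return 2 * (x**2 + 2 * (y - 1) * x + 1)

def u2(x, y):
    return 2 * (y**2 - 2 * x * y + 2 * x)

def gain(x, y):
    """Largest unilateral improvement available at the profile (x, y)."""
    g1 = max(u1(0, y), u1(1, y)) - u1(x, y)   # player 1 deviates to an endpoint
    g2 = max(u2(x, 0), u2(x, 1)) - u2(x, y)   # player 2 deviates to an endpoint
    return max(g1, g2)

grid = np.linspace(0, 1, 101)
min_gain = min(gain(x, y) for x in grid for y in grid)
print(min_gain)  # strictly positive: no profile on the grid is an equilibrium
```

Restricting deviations to the endpoints is without loss here, since each utility is convex in the player's own variable.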
Computation of Equilibria
In this section we investigate the computation of equilibria in biased games. From the outset our expectations are quite low, as even in normal form games, computing a Nash equilibrium is PPAD-complete (Daskalakis, Goldberg, and Papadimitriou 2009).
Due to these difficulties, we focus on certain subsets of biased games (which, in particular, circumvent Example 3). Specifically, we consider the two-player (and later, more generally, the n-player) setting with a bias term of the form c\|\cdot\|_1 or c\|\cdot\|_2^2, where c \ge 0 is some constant. Crucially, this still generalizes the classic setting. Our goal is to generalize the (extremely simple) support enumeration algorithm for computing Nash equilibria (see, e.g., von Stengel 2007, Algorithm 3.4).
Let us first consider the L2 case: player i has a bias term of the form c_i \|\cdot\|_2^2, where c_i \ge 0. Recall that for each player i, if the strategy of the other player is fixed, then i simply wishes to maximize his utility u_i(p_i). That is, for every player i, we wish to have that u_i(p_i) is maximized subject to the constraints that the entries of p_i are nonnegative and sum to one. The Karush-Kuhn-Tucker (KKT) conditions on a player's utility then give necessary and sufficient conditions for maximization (sufficiency is due to the concavity of the objective and the affine nature of the constraints). Thus, equilibrium computation is equivalent to solving the following system.
For all i and pure strategies j of i:

p_{i,j} \ge 0
\mu_{i,j} \ge 0
\mu_{i,j} \, p_{i,j} = 0
p_i^\top \vec{1} = 1
\text{STANDARD}(i, j) - 2\,\text{BIAS}(i, j) + \lambda_i + \mu_{i,j} = 0,

where

\text{STANDARD}(i, j) = \frac{d}{d p_{i,j}} T_i(p_1, p_2) \quad \text{and} \quad \text{BIAS}(i, j) = c_i (p_{i,j} - \hat{p}_{i,j}),

with \hat{p}_i denoting the base strategy of player i, \lambda_i the multiplier of the simplex constraint, and \mu_{i,j} the multipliers of the nonnegativity constraints.
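As a sanity check on the stationarity condition (writing \hat{p}_i for the base strategy of player i, the same symbol as in BIAS), differentiating the penalized objective reproduces the factor of 2:

```latex
\frac{\partial}{\partial p_{i,j}} \Big[ T_i(p_1, p_2) - c_i \| p_i - \hat{p}_i \|_2^2 \Big]
  = \mathrm{STANDARD}(i, j) - 2 c_i \big( p_{i,j} - \hat{p}_{i,j} \big)
  = \mathrm{STANDARD}(i, j) - 2\,\mathrm{BIAS}(i, j).
```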
Crucially, aside from the \mu_{i,j} \, p_{i,j} = 0 conditions, the complete characterization is then a linear feasibility program. We can thus consider the 2^{|S_1| + |S_2|} possibilities (recall that |S_i| is the number of pure strategies of player i) of which one of \mu_{i,j} and p_{i,j} is zero to find the equilibria. That is, for every player i and strategy j of i, we set one of \mu_{i,j} and p_{i,j} to zero and solve the resulting linear program. This computes an equilibrium exactly (albeit in exponential time).
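The enumeration just described translates directly into code. Below is a minimal two-player sketch, assuming SciPy is available; the function name biased_equilibria and the variable layout are illustrative, with payoff matrices A1, A2, base strategies b1, b2, and bias coefficients c1, c2 as inputs:

```python
import itertools

import numpy as np
from scipy.optimize import linprog

def biased_equilibria(A1, A2, b1, b2, c1, c2):
    """Enumerate the 2^(|S1|+|S2|) complementarity patterns of the KKT
    system for a two-player biased game with bias terms c_i*||p_i - b_i||_2^2,
    solving a linear feasibility program for each pattern."""
    n1, n2 = A1.shape
    # variable layout: [p1 (n1) | p2 (n2) | lam1 | lam2 | mu1 (n1) | mu2 (n2)]
    N = 2 * (n1 + n2) + 2
    equilibria = []
    for pattern in itertools.product([0, 1], repeat=n1 + n2):
        rows, rhs = [], []
        def eq(r, b):
            rows.append(r); rhs.append(b)
        # stationarity, player 1: (A1 p2)_j - 2 c1 (p1_j - b1_j) + lam1 + mu1_j = 0
        for j in range(n1):
            r = np.zeros(N)
            r[n1:n1 + n2] = A1[j]
            r[j] = -2 * c1
            r[n1 + n2] = 1
            r[n1 + n2 + 2 + j] = 1
            eq(r, -2 * c1 * b1[j])
        # stationarity, player 2: (A2^T p1)_j - 2 c2 (p2_j - b2_j) + lam2 + mu2_j = 0
        for j in range(n2):
            r = np.zeros(N)
            r[:n1] = A2[:, j]
            r[n1 + j] = -2 * c2
            r[n1 + n2 + 1] = 1
            r[n1 + n2 + 2 + n1 + j] = 1
            eq(r, -2 * c2 * b2[j])
        # each mixed strategy sums to one
        r = np.zeros(N); r[:n1] = 1; eq(r, 1)
        r = np.zeros(N); r[n1:n1 + n2] = 1; eq(r, 1)
        # complementarity: for each (i, j), force either mu = 0 or p = 0
        for k, bit in enumerate(pattern):
            r = np.zeros(N)
            r[k if bit else n1 + n2 + 2 + k] = 1
            eq(r, 0)
        bounds = ([(0, 1)] * (n1 + n2)        # probabilities
                  + [(None, None)] * 2        # lam_i are free
                  + [(0, None)] * (n1 + n2))  # mu_{i,j} >= 0
        res = linprog(np.zeros(N), A_eq=np.array(rows), b_eq=np.array(rhs),
                      bounds=bounds, method="highs")
        if res.status == 0:  # feasible pattern -> KKT point -> equilibrium
            equilibria.append((res.x[:n1], res.x[n1:n1 + n2]))
    return equilibria
```

For example, on matching pennies with both players biased toward the uniform strategy, the enumeration recovers the uniform equilibrium; note that several feasible patterns may yield the same equilibrium, so deduplication of the output may be desirable.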
Dealing with bias terms of the form c_i \|\cdot\|_1, where c_i \ge 0, is largely analogous. The important difference appears due to the discontinuity of the derivative of the L1 norm. Via a simple case analysis, which we omit here, we see that for all i and pure strategies j of i:

p_{i,j} \ge 0
\mu_{i,j} \ge 0
\mu_{i,j} \, p_{i,j} = 0
p_i^\top \vec{1} = 1.
Discussion
References
Battigalli, P., and Dufwenberg, M. 2009. Dynamic psychological games. Journal of Economic Theory 144(1):1–35.
Conitzer, V., and Sandholm, T. 2008. New complexity results about Nash equilibria. Games and Economic Behavior 63(2):621–641.
Daskalakis, C.; Goldberg, P. W.; and Papadimitriou, C. H. 2009. The complexity of computing a Nash equilibrium. SIAM Journal on Computing 39(1):195–259.
Geanakoplos, J.; Pearce, D.; and Stacchetti, E. 1989. Psychological games and sequential rationality. Games and Economic Behavior 1:60–79.
Korzhyk, D.; Yin, Z.; Kiekintveld, C.; Conitzer, V.; and Tambe, M. 2011. Stackelberg vs. Nash in security games: An extended investigation of interchangeability, equivalence, and uniqueness. Journal of Artificial Intelligence Research 41:297–327.
Lemke, C. E., and Howson, J. T. 1964. Equilibrium points of bimatrix games. SIAM Journal on Applied Mathematics 12(2):413–423.
Monderer, D., and Tennenholtz, M. 2009. Strong mediated equilibrium. Artificial Intelligence 173(1):180–195.
Nash, J. F. 1950. Equilibrium points in N-person games. Proceedings of the National Academy of Sciences of the United States of America 36:48–49.
Rozenfeld, O., and Tennenholtz, M. 2007. Routing mediators. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), 1488–1493.
Tambe, M. 2012. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press.
Tennenholtz, M. 2004. Program equilibrium. Games and Economic Behavior 49(2):363–373.
von Stengel, B. 2007. Equilibrium computation for two-player games in strategic and extensive form. In Nisan, N.; Roughgarden, T.; Tardos, É.; and Vazirani, V., eds., Algorithmic Game Theory. Cambridge University Press. Chapter 9.
Widger, J., and Grosu, D. 2009. Parallel computation of Nash equilibria in N-player games. In Proceedings of the 12th International Conference on Computational Science and Engineering (CSE), 209–215.