Nash Equilibria Detection For Multi-Player Games: Rodica Ioana Lung, Tudor Dan Mihoc, D. Dumitrescu
Abstract: One of the main issues in computational game theory is equilibria detection in multi-player games. This problem is approached using a generative relation for strategy profiles and two different search operators: crowding-based differential evolution and a simple stepping stone reinforcing search algorithm. A probabilistic generative relation is also derived in order to tackle a higher number of players. Each approach has advantages and disadvantages, illustrated by numerical experiments for games involving from two to a hundred players.

I. INTRODUCTION

The intrinsic similarities between mathematical games and multiobjective optimization problems are evident and have long been exploited. A game is defined as the ensemble formed by a set of players, a set of actions available to each player and, most important, a set of payoff functions that each player wishes to maximize. The multi-objective (or many-objective) [1] optimization problem consists of a set of functions, most often conflicting, to be maximized (minimized) subject to some constraints.

The solution concepts for the two problems are, however, different. While the Pareto optimal solution has been widely addressed from an evolutionary point of view, mostly through the Pareto domination relation, the Nash equilibrium of a non-cooperative game was until recently inaccessible because of the lack of a similar relation for strategy profiles.

This problem has been overcome by the introduction of a generative relation [2] that enabled Nash equilibria detection by using an adapted version of an evolutionary algorithm designed for multiobjective optimization [3], [2].

Increasing the number of players is one of the challenges facing algorithmic game theory, just as increasing the number of objectives is for multiobjective optimization. The long popular NSGA2 initially used has been shown to be unable to cope with many-objective optimization problems.

In order to deal with multiple players, two main approaches are studied. A probabilistic version of the generative relation, aiming at reducing the number of fitness evaluations during the run of a search operator, is proposed. An evolutionary algorithm originally designed for multimodal optimization, Crowding Differential Evolution [4], and a simple stepping stone reinforcing search algorithm are tested using the two versions of the generative relation.

II. GENERATIVE RELATIONS FOR NASH EQUILIBRIA

A generative relation for Nash equilibria is a relation between two strategy profiles that enables their comparison with respect to the Nash solution concept, i.e. it evaluates which one is "closer" to equilibrium. Such a generative relation has been introduced in [2]. It is shown that the solutions that are nondominated/ascended with respect to this relation are exactly the Nash equilibria of the game. A probabilistic variant of this relation is introduced in subsection II-B.

A. Nash ascendancy relation

A finite strategic game is defined by $\Gamma = ((N, S_i, u_i), i = 1, \ldots, n)$ where:
• $N$ represents the set of players, $N = \{1, \ldots, n\}$, and $n$ is the number of players;
• for each player $i \in N$, $S_i$ represents the set of actions available to him, $S_i = \{s_{i1}, s_{i2}, \ldots, s_{im_i}\}$, where $m_i$ represents the number of strategies available to player $i$, and $S = S_1 \times S_2 \times \ldots \times S_n$ is the set of all possible situations of the game;
• for each player $i \in N$, $u_i : S \to \mathbb{R}$ represents the payoff function.

Denote by $(s_{ij}, s^*_{-i})$ the strategy profile obtained from $s^*$ by replacing the strategy of player $i$ with $s_{ij}$, i.e.

$$(s_{ij}, s^*_{-i}) = (s^*_1, s^*_2, \ldots, s^*_{i-1}, s_{ij}, s^*_{i+1}, \ldots, s^*_n).$$

The most common solution concept for a non-cooperative game is the Nash equilibrium [5], [6]. A collective strategy $s \in S$ for the game $\Gamma$ represents a Nash equilibrium if no player has anything to gain by changing only his own strategy.

Several methods to compute the NE of a game have been developed. For a review of computing techniques for the NE see [5].

Consider two strategy profiles $x$ and $y$ from $S$. An operator $k : S \times S \to \mathbb{N}$ that associates the cardinality of the set

$$k(x, y) = |\{ i \in \{1, \ldots, n\} \mid u_i(y_i, x_{-i}) \geq u_i(x),\ y_i \neq x_i \}|$$

to the pair $(x, y)$ is introduced.

This set consists of the players $i$ who would benefit if, given the strategy profile $x$, they changed their strategy from $x_i$ to $y_i$, i.e.

$$u_i(y_i, x_{-i}) \geq u_i(x).$$
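The operator k(x, y) defined above can be illustrated with a minimal Python sketch. The function names `deviate` and `k`, the representation of payoff functions as a list of callables, and the two-player Prisoner's Dilemma payoff table are assumptions made here for illustration only; they are not taken from the paper.

```python
from typing import Callable, Dict, List, Tuple

Profile = Tuple[int, ...]                 # one strategy index per player
Payoff = Callable[[Profile], float]       # u_i : S -> R


def deviate(x: Profile, i: int, s: int) -> Profile:
    """Return (s, x_{-i}): profile x with player i's strategy replaced by s."""
    return x[:i] + (s,) + x[i + 1:]


def k(x: Profile, y: Profile, payoffs: List[Payoff]) -> int:
    """Count players i with y_i != x_i and u_i(y_i, x_{-i}) >= u_i(x)."""
    count = 0
    for i, u_i in enumerate(payoffs):
        if y[i] != x[i] and u_i(deviate(x, i, y[i])) >= u_i(x):
            count += 1
    return count


# Hypothetical example game: Prisoner's Dilemma (0 = cooperate, 1 = defect).
# The payoff values are illustrative, not from the paper.
table: Dict[Profile, Tuple[float, float]] = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
payoffs: List[Payoff] = [lambda s, i=i: table[s][i] for i in range(2)]

print(k((0, 0), (1, 1), payoffs))  # 2: both players gain by switching to defect
print(k((1, 1), (0, 0), payoffs))  # 0: no player gains by leaving (defect, defect)
```

In this toy game, no player can improve by unilaterally moving from (1, 1) towards (0, 0), which is consistent with (1, 1) being its Nash equilibrium. Each call to `k` costs at most 2n payoff evaluations for n players.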
C. Numerical results

Numerical results present the average and standard deviation of the minimum distance to the NE and the number of payoff function evaluations for each operator, with different values for the number of players chosen in the probabilistic Nash ascendancy relation.

a) Nash ascendancy: q = 100%: Results regarding the Nash ascendancy relation are presented in Table I. Because SSRS stopped before the $2 \cdot 10^7$ payoff evaluations allowed, the number of payoff function evaluations used is indicated. The results show that SSRS outperforms CrDE in all instances, considering both the distance to NE and the number of evaluations needed.

TABLE I
q = 100%

No. of players    10          20          50          100
CrDE avg          1.76        7.2         6.29        6.85
CrDE stdev        0.6         1.36        1.12        0.04
SSRS avg          0.08        0.11        0.17        0.26
SSRS stdev        0.02        0.03        0.03        0.05
Avg fitness       2.57E+05    5.79E+05    1.71E+06    3.71E+06
Stdev fitness     3.03E+04    7.17E+04    1.14E+05    1.98E+05

d) q = 10%: Table IV shows the same decreasing trend of the average distance with the increase in the number of players for both methods.

TABLE IV
q = 10%

No. of players    10          20          50           100
CrDE avg          40.22       22.18       10.94        7.56
CrDE stdev        5.37        3.16        0.8          0.38
SSRS avg          32.16       25.34       15.68        5.22
SSRS stdev        2.74        2.72        0.74         2.54
Avg fitness       816.20      7436.00     344300.00    8266500.00
Stdev fitness     463.62      3733.58     230622.89    3656538.32

e) Discussion: Box-plots representing the distance to Nash equilibria for the different values of q for SSRS and CrDE are shown in Figures 1 and 2. The box-plots also indicate a better performance in the case of 100 individuals for SSRS when using the probabilistic ascendancy relation. CrDE also exhibits a better performance for a higher number of players for different values of q.

Figures 3 and 4 represent the distance to Nash equilibria for SSRS versus the number of payoff function evaluations for 100 players and 50 players, respectively. The behavior of CrDE illustrated
[Figure 1: box plots for SSRS. Vertical axis: distance to Nash equilibrium. Horizontal axis: number of players/payoff functions - q parameter, with labels from 10-100 to 100-10.]
Fig. 1. SSRS results: box-plot labels represent the number of players and the q parameter used. For example, 10-100 indicates 10 players and q = 100%.
[Figure 2: box plots for CrDE. Vertical axis: distance to Nash equilibrium. Horizontal axis: number of players/payoff functions - q parameter, with labels from 10-100 to 100-10.]
Fig. 2. CrDE results: box-plot labels represent the number of players and the q parameter used. For example, 10-100 indicates 10 players and q = 100%.
[Figure 3: plot for SSRS. Vertical axis: distance to Nash equilibrium. Horizontal axis: number of payoff evaluations.]
Fig. 3. SSRS results: 100 players; distance to NE versus number of fitness evaluations.

[Figure 4: plot for SSRS. Vertical axis: distance to Nash equilibrium. Horizontal axis: number of payoff evaluations.]
Fig. 4. SSRS results: 50 players; distance to NE versus number of fitness evaluations.
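The summary statistics reported in the tables (average and standard deviation of the minimum distance to the NE) could be computed along the following lines. This is only a sketch under stated assumptions: the paper excerpt does not specify the distance measure or the standard deviation estimator, so Euclidean distance and the sample standard deviation are assumed here, and the names `min_distance_to_ne` and `summarize_runs` as well as the toy data are hypothetical.

```python
import math
import statistics
from typing import List, Sequence, Tuple

Point = Sequence[float]  # a strategy profile with real-valued strategies


def min_distance_to_ne(population: List[Point], ne: Point) -> float:
    """Smallest Euclidean distance (an assumption) from any individual to the NE."""
    return min(math.dist(individual, ne) for individual in population)


def summarize_runs(final_populations: List[List[Point]], ne: Point) -> Tuple[float, float]:
    """Average and sample standard deviation of the per-run minimum distance."""
    distances = [min_distance_to_ne(pop, ne) for pop in final_populations]
    return statistics.mean(distances), statistics.stdev(distances)


# Toy data: two runs of a search operator on a two-player game whose
# (hypothetical) Nash equilibrium is at the origin.
runs = [
    [(3.0, 4.0), (6.0, 8.0)],   # closest individual is at distance 5
    [(0.0, 1.0)],               # closest individual is at distance 1
]
avg, stdev = summarize_runs(runs, ne=(0.0, 0.0))
print(avg, stdev)
```

`statistics.stdev` needs at least two runs; with a single run a population standard deviation (`statistics.pstdev`) would be the only defined choice.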
REFERENCES