
Nash Equilibria Detection for Multi-Player Games

Rodica Ioana Lung, Tudor Dan Mihoc, D. Dumitrescu

Rodica Ioana Lung is with the Faculty of Economics and Business Administration, Babeş-Bolyai University of Cluj Napoca, Romania (phone: +40264 418654; email: [email protected]). Tudor Dan Mihoc and D. Dumitrescu are with the Department of Computer Science, Babeş-Bolyai University of Cluj Napoca, Romania (phone: +40264405300; email: [email protected], [email protected]).

Abstract— One of the main issues in computational game theory is equilibria detection in multi-player games. This problem is approached using a generative relation for strategy profiles and two different search operators: crowding based differential evolution and a simple stepping-stone reinforcing search algorithm. A probabilistic generative relation is also derived in order to tackle a higher number of players. Each approach has advantages and disadvantages, illustrated by numerical experiments for games involving from two to a hundred players.

I. INTRODUCTION

The intrinsic similarities between mathematical games and multiobjective optimization problems are evident and have long been exploited. A game is defined as the ensemble formed by a set of players, a set of actions available to each player and, most important, a set of payoff functions that each player wishes to maximize. The multi-objective (or many-objective) [1] optimization problem consists of a set of functions - most often conflicting - to be maximized (or minimized) subject to some constraints.

The solution concepts of the two problems are, however, different. While the Pareto optimal solution has been widely addressed from an evolutionary point of view, mostly through the Pareto domination relation, the Nash equilibria of a non-cooperative game were until recently inaccessible because of the lack of a similar relation for strategy profiles. This problem has been surpassed by the introduction of a generative relation [2] that enabled Nash equilibria detection by using an adapted version of an evolutionary algorithm designed for multiobjective optimization [3], [2].

Increasing the number of players is one of the challenges facing algorithmic game theory, just as increasing the number of objectives is for multiobjective optimization. The long popular NSGA2 initially used has been proved unable to cope with many-objective optimization problems.

In order to deal with multiple players, two main approaches are studied. A probabilistic version of the generative relation, aiming at reducing the number of fitness evaluations during the run of a search operator, is proposed. An evolutionary algorithm - Crowding Differential Evolution [4] - originally designed for multimodal optimization, and a simple stepping-stone reinforcing search algorithm, are tested using the two versions of the generative relation.

II. GENERATIVE RELATIONS FOR NASH EQUILIBRIA

A generative relation for Nash equilibria is a relation between two strategy profiles that enables their comparison with respect to the Nash solution concept, i.e. it evaluates which one is "closer" to equilibrium. In [2] such a generative relation has been introduced. It is shown that the solutions that are nondominated/non-ascended with respect to this relation are exactly the Nash equilibria of the game. A probabilistic variant of this relation is introduced in subsection II-B.

A. Nash ascendancy relation

A finite strategic game is defined by Γ = ((N, S_i, u_i), i = 1, ..., n), where:

• N represents the set of players, N = {1, ..., n}, n is the number of players;
• for each player i ∈ N, S_i represents the set of actions available to him, S_i = {s_{i1}, s_{i2}, ..., s_{im_i}}, where m_i represents the number of strategies available to player i, and S = S_1 × S_2 × ... × S_N is the set of all possible situations of the game;
• for each player i ∈ N, u_i : S → R represents the payoff function.

Denote by (s_{ij}, s*_{-i}) the strategy profile obtained from s* by replacing the strategy of player i with s_{ij}, i.e.

(s_{ij}, s*_{-i}) = (s*_1, s*_2, ..., s*_{i-1}, s_{ij}, s*_{i+1}, ..., s*_n).

The most common solution concept for a non-cooperative game is the Nash equilibrium [5], [6]. A collective strategy s ∈ S for the game Γ represents a Nash equilibrium if no player has anything to gain by changing only his own strategy. Several methods to compute the NE of a game have been developed; for a review of computing techniques for the NE see [5].

Consider two strategy profiles x and y from S. An operator k : S × S → N that associates to the pair (x, y) the cardinality of the set

k(x, y) = |{i ∈ {1, ..., n} | u_i(y_i, x_{-i}) ≥ u_i(x), y_i ≠ x_i}|

is introduced. This set is composed of the players i that would benefit if - given the strategy profile x - they changed their strategy from x_i to y_i, i.e.

u_i(y_i, x_{-i}) ≥ u_i(x).

Let x, y ∈ S. We say that the strategy profile x Nash ascends the strategy profile y, and we write x ≺ y, if the inequality

k(x, y) < k(y, x)

holds. Thus a strategy profile x ascends a strategy profile y if there are fewer players that can increase their payoffs by switching their strategy from x_i to y_i than vice versa. It can be said that strategy profile x is more stable (closer to equilibrium) than strategy profile y.

Two strategy profiles x, y ∈ S may thus be in one of the following relations:
1) either x dominates y, x ≺ y (k(x, y) < k(y, x));
2) either y dominates x, y ≺ x (k(x, y) > k(y, x));
3) or k(x, y) = k(y, x), and x and y are considered indifferent (neither x dominates y nor y dominates x).

The strategy profile s* ∈ S is called non-ascended in Nash sense (NAS) if

∄ s ∈ S, s ≠ s*, such that s ≺ s*.

In [2] it is shown that all non-ascended strategies are NE and also that all NE are non-ascended strategies. Thus the Nash ascendancy relation can be used to characterize the equilibria of a game.
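For concreteness, a minimal Python sketch of this test is given below (it is not part of the original paper). It assumes that a strategy profile is stored as a numeric array and that a callable payoffs(s) returns the payoff vector (u_1(s), ..., u_n(s)); both names are illustrative.

```python
import numpy as np

def k_operator(x, y, payoffs):
    """The k(x, y) operator: count the players i with y_i != x_i whose payoff
    does not decrease when they unilaterally switch from x_i to y_i while the
    remaining components of x are kept fixed."""
    ux = payoffs(x)                      # payoff vector of profile x
    count = 0
    for i in range(len(x)):
        if x[i] == y[i]:
            continue
        deviated = np.copy(x)
        deviated[i] = y[i]               # the profile (y_i, x_-i)
        if payoffs(deviated)[i] >= ux[i]:
            count += 1
    return count

def nash_ascends(x, y, payoffs):
    """True if x Nash ascends y, i.e. k(x, y) < k(y, x)."""
    return k_operator(x, y, payoffs) < k_operator(y, x, payoffs)
```

Written this way, one comparison requires on the order of 2N payoff evaluations (up to N per direction), which is what motivates the probabilistic variant introduced next.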
B. Probabilistic Nash ascendancy relation

When evaluating the Nash ascendancy relation for two strategy profiles, 2N payoff functions have to be computed. For a large number of players this increases the computational complexity of the algorithm by increasing the number of fitness function evaluations. One way to reduce this number is to consider only a subset of players when computing the k operator. This subset can be randomly chosen from the set of players, and its size can be constant or it can vary.

Thus, we may consider a subset I ⊆ N composed of a percentage q of randomly chosen players from N. The operator κ_q : S × S → N can be defined as

κ_q(x, y) = |{i ∈ I | u_i(y_i, x_{-i}) ≥ u_i(x), y_i ≠ x_i}|.

κ_q(x, y) counts the number of players from the set I that benefit from changing their strategies from x_i to y_i, i ∈ I, while the others keep theirs unchanged. Thus only the players selected in I participate in the evaluation of the ascendancy relation, reducing the number of payoff function evaluations to 2qN/100.

A first insight into the effects of reducing the number of players involved in evaluating the ascendancy relation is given by the preliminary numerical experiments presented in Section IV.
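Under the same assumptions as the previous sketch, the probabilistic operator can be written as below. The text does not fix whether the same subset I should be reused when evaluating κ_q(y, x) for the reverse comparison, so this version simply draws a fresh subset on every call.

```python
import numpy as np

def kappa_q(x, y, payoffs, q, rng=np.random.default_rng()):
    """Probabilistic variant kappa_q(x, y): the k operator restricted to a
    random subset I containing roughly q percent of the players, which cuts
    the number of payoff evaluations accordingly."""
    n = len(x)
    size = max(1, int(round(q / 100.0 * n)))
    subset = rng.choice(n, size=size, replace=False)   # the random subset I
    ux = payoffs(x)
    count = 0
    for i in subset:
        if x[i] == y[i]:
            continue
        deviated = np.copy(x)
        deviated[i] = y[i]
        if payoffs(deviated)[i] >= ux[i]:
            count += 1
    return count
```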
III. SEARCH OPERATORS

Including the generative relation into the search operator of an evolutionary algorithm would guide it towards the Nash equilibria of the game. A simple stepwise search procedure called Stepping-Stone Reinforcing Search (SSRS) is proposed to illustrate this claim. Also, an evolutionary algorithm for multimodal optimization, Crowding Differential Evolution [4], is adapted in order to detect multiple equilibria of a game.

A. Stepping-Stone Reinforcing Search

An initial strategy profile x is randomly generated. A mutation operator that modifies a randomly chosen position x_i by a value ±ε (the sign is also chosen at random) is considered. Thus x'_i = x_i ± ε, where x' denotes the potential offspring.

At each step, values of i are randomly generated until an offspring that Nash ascends the parent is produced using the mutation operator described above. This is equivalent to randomly searching for a player that improves its payoff when modifying its strategy by ε (either +ε or −ε). In this case the offspring becomes the parent. This step is aimed at reinforcing the Nash 'characteristics' of the current strategy profile.

The search ends when no such i is found for the current parent, i.e. within the current strategy profile no player can improve its payoff by modifying its strategy by ε.
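The procedure can be sketched as follows, assuming the same payoffs callable as before. The bounds on the strategies, the evaluation budget and the exhaustive stopping scan are assumptions made to obtain a runnable sketch; they are one possible reading of "no such i is found", not necessarily the exact implementation used by the authors.

```python
import numpy as np

def ssrs(payoffs, n_players, eps=0.05, bounds=(0.0, 100.0),
         max_steps=100000, rng=np.random.default_rng()):
    """Sketch of Stepping-Stone Reinforcing Search: shift one coordinate of
    the current profile by +/-eps, keep the move when the corresponding
    player strictly improves its payoff, and stop when a full scan over
    players and both signs finds no improving move or the budget runs out."""
    low, high = bounds
    x = rng.uniform(low, high, size=n_players)      # random initial profile
    for _ in range(max_steps):
        improved = False
        for i in rng.permutation(n_players):        # random order over players
            for step in (eps, -eps):
                child = np.copy(x)
                child[i] = np.clip(child[i] + step, low, high)
                if child[i] != x[i] and payoffs(child)[i] > payoffs(x)[i]:
                    x = child                        # offspring becomes the parent
                    improved = True
                    break
            if improved:
                break
        if not improved:
            break                                    # no player can improve: stop
    return x
```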
B. Crowding based Differential Evolution

Crowding Differential Evolution (CrDE) [4] extends the Differential Evolution (DE) algorithm [7] with a crowding scheme. The only modification to conventional DE concerns the individual (parent) being replaced: usually the parent producing the offspring is substituted, whereas in CrDE the offspring replaces the most similar individual in the population if it Nash ascends it. A DE/rand/1/exp [8] scheme is used.
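A sketch of one CrDE generation under the same conventions is given below. The ascends argument stands for either the deterministic or the probabilistic ascendancy test from the earlier sketches, the exponential crossover of [8] is replaced here by the simpler binomial scheme for brevity, and pc is interpreted as the crossover probability CR; all of these are simplifying assumptions rather than the authors' exact implementation.

```python
import numpy as np

def crde_generation(pop, payoffs, ascends, F=0.5, CR=0.9,
                    rng=np.random.default_rng()):
    """One generation of a crowding-based DE sketch: for each parent a
    DE/rand/1 trial vector is built; the trial then replaces the most
    similar (Euclidean-nearest) population member if it ascends it."""
    pop_size, dim = pop.shape
    for p in range(pop_size):
        candidates = [j for j in range(pop_size) if j != p]
        r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
        mutant = pop[r1] + F * (pop[r2] - pop[r3])
        cross = rng.random(dim) < CR
        cross[rng.integers(dim)] = True           # at least one gene from mutant
        trial = np.where(cross, mutant, pop[p])
        # crowding: compete against the nearest individual, not the parent
        nearest = int(np.argmin(np.linalg.norm(pop - trial, axis=1)))
        if ascends(trial, pop[nearest], payoffs):
            pop[nearest] = trial
    return pop
```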
IV. NUMERICAL EXPERIMENTS

Numerical experiments aim at illustrating how NEs can be computed by means of evolutionary algorithms based on appropriate generative relations for the Cournot oligopoly model [9]. The results illustrate the use of the two generative relations and search operators for different numbers of players.

A. Cournot oligopoly

Let q_i, i = 1, ..., N, denote the quantities of a homogeneous product produced by the N companies, respectively, and let Q = Σ_{i=1}^{N} q_i be the aggregate quantity on the market. The market clearing price is

P(Q) = a − Q, for Q < a, and P(Q) = 0, for Q ≥ a.

Let us assume that the total cost for company i of producing quantity q_i is C(q_i) = c q_i. Therefore there are no fixed costs and the marginal cost c is constant, c < a. Suppose that the companies choose their quantities simultaneously. The payoff of company i is its profit, which can be expressed as

u_i(q_1, q_2, ..., q_N) = q_i P(Q) − C(q_i).

With these assumptions the Cournot oligopoly has one Nash equilibrium, which can be computed as

q_i = (a − c) / (N + 1), ∀ i ∈ {1, ..., N}.

Apart from its applications in economics, the Cournot oligopoly model can be used to test the behavior of different evolutionary approaches computing Nash equilibria for a large number of players.
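The model translates directly into a payoff routine that can be plugged into the sketches above; the constants a and c below are illustrative, since the paper does not report the values used in the experiments.

```python
import numpy as np

def cournot_payoffs(q, a=100.0, c=10.0):
    """Payoff vector of the Cournot oligopoly: u_i = q_i * P(Q) - c * q_i,
    with the clearing price P(Q) = max(a - Q, 0)."""
    q = np.asarray(q, dtype=float)
    price = max(a - float(np.sum(q)), 0.0)
    return q * price - c * q

def cournot_equilibrium(n_players, a=100.0, c=10.0):
    """Analytical Nash equilibrium q_i = (a - c) / (N + 1) for every player."""
    return np.full(n_players, (a - c) / (n_players + 1))
```

A profile returned by SSRS or CrDE can then be compared against cournot_equilibrium(N) to judge how close the search came to the analytical solution.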

B. Parameter settings

All operators with all variants of the generative relations are tested for 10, 20, 50 and 100 players. Because the number of payoff functions is equal to the number of players, this setting creates the equivalent of four many-objective optimization problems, which are known to be difficult to solve by evolutionary algorithms.

SSRS uses only one parameter, ε, set to 0.05. CrDE uses a population of 100 individuals. The DE parameters are F = 0.5 and pc = 0.9, as used in [4]. The probabilistic ascendancy relation was tested for q = 10%, 30%, 50% and 100%; when q = 100% we obtain the standard Nash ascendancy relation.

Each operator was run 10 times, and the average and standard deviation of the minimum distance to the NE over the 10 runs are computed. The stopping criterion is a maximum number of payoff function evaluations of 2·10^7. SSRS stops either when its own criterion is fulfilled or when the maximum number of payoff evaluations is reached.
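The performance measure can be sketched as the distance between a profile returned by a search operator and the analytical Cournot equilibrium; the Euclidean norm and the illustrative a, c values below are assumptions, as the paper does not state which norm it uses.

```python
import numpy as np

def distance_to_ne(profile, a=100.0, c=10.0):
    """Distance between a found strategy profile and the analytical Cournot
    equilibrium q_i = (a - c) / (N + 1)."""
    profile = np.asarray(profile, dtype=float)
    n = len(profile)
    ne = np.full(n, (a - c) / (n + 1))
    return float(np.linalg.norm(profile - ne))
```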

C. Numerical results

The numerical results present the average and standard deviation of the minimum distance to the NE, and the number of payoff function evaluations, for each operator and for the different values of the parameter q used in the probabilistic Nash ascendancy relation.

a) Nash ascendancy, q = 100%: Results regarding the Nash ascendancy relation are presented in Table I. Because SSRS stopped before reaching the allowed 2·10^7 payoff evaluations, the number of payoff function evaluations actually used is also indicated. The results show that SSRS outperforms CrDE in all instances, considering both the distance to the NE and the number of evaluations needed.

TABLE I
q = 100%

Results / no. players    10          20          50          100
CrDE avg                 1.76        7.2         6.29        6.85
CrDE stdev               0.6         1.36        1.12        0.04
SSRS avg                 0.08        0.11        0.17        0.26
SSRS stdev               0.02        0.03        0.03        0.05
Avg fitness              2.57E+05    5.79E+05    1.71E+06    3.71E+06
Stdev fitness            3.03E+04    7.17E+04    1.14E+05    1.98E+05

b) q = 50%: When considering the probabilistic Nash ascendancy relation with q = 50% (Table II), the main observation is that the average distance to the equilibrium decreases when the number of players increases, for both methods.

TABLE II
q = 50%

Results / no. players    10          20          50          100
CrDE avg                 17.38       9.79        12.38       8.39
CrDE stdev               5.08        1.85        1           0.64
SSRS avg                 26.64       8.76        2.90        1.57
SSRS stdev               6.45        4.35        1.09        0.24
Avg fitness              73275       617952      2185920     5051820
Stdev fitness            50755.15    237464.57   191832.46   382394.12

c) q = 30%: The same decreasing trend as in the previous case is noticed for q = 30% (Table III). The trend is not only decreasing; the results obtained for 100 players are significantly better than those obtained for 10 players.

TABLE III
q = 30%

Results / no. players    10          20          50          100
CrDE avg                 25.18       13.94       7.71        6.59
CrDE stdev               4.92        2.35        1.2         0.29
SSRS avg                 33.61       17.34       2.69        1.39
SSRS stdev               4.23        6.11        1.16        0.49
Avg fitness              4781.40     327111.20   2770859.00  6585618.00
Stdev fitness            4025.90     172198.72   313516.29   595918.17

d) q = 10%: Table IV shows the same decreasing trend of the average distance with the increase in the number of players, for both methods.

TABLE IV
q = 10%

Results / no. players    10          20          50          100
CrDE avg                 40.22       22.18       10.94       7.56
CrDE stdev               5.37        3.16        0.8         0.38
SSRS avg                 32.16       25.34       15.68       5.22
SSRS stdev               2.74        2.72        0.74        2.54
Avg fitness              816.20      7436.00     344300.00   8266500.00
Stdev fitness            463.62      3733.58     230622.89   3656538.32
e) Discussion: Box-plots representing the distance to Nash equilibria for the different values of q for SSRS and CrDE are shown in Figures 1 and 2. The box-plots also indicate a better performance for SSRS in the 100-player case when using the probabilistic ascendancy relation. CrDE also exhibits a better performance for a higher number of players, for the different values of q.

Figures 3 and 4 represent the distance to Nash equilibria for SSRS versus the number of payoff function evaluations for 50 players and 100 players. The behavior of CrDE, illustrated in Figures 5 and 6, is different: CrDE is more affected by the use of the probabilistic ascendancy relation.

Although SSRS's accuracy is better than CrDE's, its main disadvantage is that, in its current form, SSRS is only capable of detecting one NE at a time.

Fig. 1. SSRS results (box-plots of the distance to Nash equilibrium): labels represent the number of players and the q parameter used; for example, 10-100 indicates 10 players and q = 100%.

Fig. 2. CrDE results (box-plots of the distance to Nash equilibrium): labels represent the number of players and the q parameter used; for example, 10-100 indicates 10 players and q = 100%.

Fig. 3. SSRS results: 100 players; distance to NE versus number of fitness evaluations.

Fig. 4. SSRS results: 50 players; distance to NE versus number of fitness evaluations.

Fig. 5. CrDE results: 50 players; distance to NE versus number of fitness evaluations.

Fig. 6. CrDE results: 100 players; distance to NE versus number of fitness evaluations.

V. CONCLUSION AND FURTHER WORK

The use of a probabilistic generative relation versus the deterministic one for Nash equilibria detection in non-cooperative games is studied for several Cournot oligopoly models. The probabilistic relation is introduced in order to reduce the computational complexity of the search.

Two methods, a Crowding based Differential Evolution algorithm and a Stepping-Stone Reinforcing Search algorithm, are used for the numerical experiments.

The numerical results are most interesting. Both proposed methods, when using the probabilistic ascendancy relation, are able to cope with a higher number of payoffs (100) better than with fewer (10). This indicates the potential of the proposed approach in surpassing the problem of solving games with a large number of players.

Further work includes a hybridization of the two algorithms in order to enhance their capabilities and to solve multi-player games presenting a set of multiple Nash equilibria.

REFERENCES

[1] H. Ishibuchi, N. Tsukamoto, and Y. Nojima, "Evolutionary many-objective optimization," in Genetic and Evolving Systems, 2008 (GEFS 2008), 3rd International Workshop on, March 2008, pp. 47-52.
[2] R. I. Lung and D. Dumitrescu, "Computing Nash equilibria by means of evolutionary computation," Int. J. of Computers, Communications & Control, vol. III, suppl. issue, pp. 364-368, 2008.
[3] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, "A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II," Springer, 2000, pp. 849-858.
[4] R. Thomsen, "Multimodal optimization using crowding-based differential evolution," in Proceedings of the 2004 IEEE Congress on Evolutionary Computation, Portland, Oregon: IEEE Press, 20-23 June 2004, pp. 1382-1389.
[5] R. D. McKelvey and A. McLennan, "Computation of equilibria in finite games," in Handbook of Computational Economics, H. M. Amman, D. A. Kendrick, and J. Rust, Eds. Elsevier, 1996, vol. 1, ch. 2, pp. 87-142.
[6] J. F. Nash, "Non-cooperative games," Annals of Mathematics, vol. 54, pp. 286-295, 1951.
[7] R. Storn and K. Price, "Differential evolution - a simple and efficient adaptive scheme for global optimization over continuous spaces," Berkeley, CA, Tech. Rep. TR-95-012, 1995. [Online]. Available: citeseer.ist.psu.edu/article/storn95differential.html
[8] R. Storn and K. Price, "Differential evolution - a simple evolution strategy for fast optimization," Dr. Dobb's Journal of Software Tools, vol. 22, no. 4, pp. 18-24, 1997.
[9] A. F. Daughety, Ed., Cournot Oligopoly: Characterization and Applications. Cambridge: Cambridge University Press, 1988.
