Adaptive Dynamics of Memory-One Strategies in The Repeated Donation Game
RESEARCH ARTICLE
Introduction
Evolution of cooperation is of considerable interest, because it demonstrates that natural selection does not only lead to selfish, brutish behavior red in tooth and claw [1, 2]. Yet in the absence of a mechanism for its evolution, natural selection opposes cooperation. A mechanism for the evolution of cooperation is an interaction structure that allows natural selection to favor cooperation over defection [3]. Direct reciprocity is one such mechanism [4–8]. It is based on repeated interactions among the same individuals. In a repeated interaction, individuals can condition their decisions on their co-player's previous behavior. By being more cooperative towards other cooperators, they can generate a favorable social environment for the evolution of cooperation.
The most basic model to illustrate reciprocity is the repeated donation game [1]. This game takes place between two players, who interact for many rounds. Each round, players independently decide whether to cooperate or defect. Cooperation implies a cost c for the donor and generates a benefit b for the recipient. Defection implies no cost and confers no benefit. Both players decide simultaneously. If they both cooperate, each of them gets payoff b - c. If both players defect, each of them gets payoff 0. If one player cooperates while the other defects, the cooperator's payoff is -c while the defector's is b. The donation game is a special case of a prisoner's dilemma if b > c > 0, which is assumed throughout.
If the donation game is played for a single round, players can only choose between the two possible strategies of cooperation and defection. Based on the game's payoffs, each player prefers to defect, creating the dilemma. In contrast, in the repeated donation game, infinitely many strategies are available. For example, players may choose to cooperate if and only if their co-player cooperated in the previous round. This is the well-known strategy Tit-for-tat [5, 9]. Alternatively, players may wish to occasionally forgive a defecting opponent, as captured by Generous Tit-for-tat [10, 11]. Against each of these strategies, unconditional defection is no longer the best response. Instead, mutual cooperation is now in the co-player's best interest.
During the past decades, there has been a considerable effort to explore whether conditionally cooperative behaviors would emerge naturally (e.g., [12–24]). To this end, researchers study the dynamics in evolving populations, in which strategies are transmitted either by biological or cultural evolution (by inheritance or imitation). For such an analysis, it is useful to restrict the space of strategies that individuals can choose from. The strategy space ought to be small enough for a systematic analysis, yet large enough to capture the most interesting behaviors.
One frequently used subspace is the set of memory-one strategies [24–32]. Players with memory-one strategies respond to the outcome of the previous round only. Such strategies can be written as a vector p = (pCC, pCD, pDC, pDD) in the 4-dimensional cube [0, 1]^4. Each entry pij reflects the player's conditional cooperation probability, depending on the four possible outcomes of the previous round, CC, CD, DC, DD (the first letter is the focal player's action, the second letter is the co-player's action). Despite their simplicity, memory-one strategies can capture many different behavioral archetypes. They include always defect, ALLD = (0, 0, 0, 0), always cooperate, ALLC = (1, 1, 1, 1), Tit-for-tat, TFT = (1, 0, 1, 0) [5, 9], Generous Tit-for-tat, GTFT = (1, x, 1, x) with 0 < x < 1 [10, 11], and Win-stay, Lose-shift, WSLS = (1, 0, 0, 1) [25, 33]. The sixteen corner points of the cube are the pure strategies; the interior points of the cube are stochastic strategies. The center of the cube is the random strategy (1/2, 1/2, 1/2, 1/2) [5].
Conditionally cooperative strategies have been of particular interest in the study of human behavior. For example, there is evidence for the intuitive expectation that people tend to cooperate more if their co-player was cooperative in the past, or if they expect their co-player to cooperate in the future [34–36]. The concept of conditionally cooperative strategies is quite broad and includes strategies such as Tit-for-two-tats, which cannot be realized as a memory-one strategy. In this paper we consider only conditionally cooperative strategies which can be realized as memory-one strategies, such as TFT, GTFT, and nearby strategies. However, it is hoped that techniques similar to the ones used in this paper can be used to study more general strategy spaces.
When both players adopt memory-one strategies, there is an explicit formula for their average payoffs (as described in the next section). Based on this formula, it is possible to characterize all Nash equilibria among the memory-one strategies [37–42]. In general, however, the payoff formula yields a complex expression in the players' conditional cooperation probabilities pij. As a result, it is difficult to characterize the dynamics of evolving populations, in which players switch strategies depending on the payoffs they yield. Most previous work had to resort to individual-based simulations. Only in special cases has an analytical description been feasible (for example, based on differential equations). One special case arises when individuals are restricted to use reactive strategies [43–48]. Reactive strategies only depend on the co-player's previous move. Within the memory-one strategies, they correspond to the 2-dimensional subset with pCC = pDC and pCD = pDD. In addition, there has been work on the replicator dynamics among three strategies [15, 49], and on the dynamics among transformed memory-one strategies [50, 51]. Here, we wish to explore the dynamics among memory-one strategies directly, using adaptive dynamics [52, 53].
We begin by describing two interesting mathematical results. First, we show that under adaptive dynamics, the 4-dimensional space of memory-one strategies contains an invariant 3-dimensional subset. This subset comprises all "counting strategies". These strategies only depend on the number of cooperators in the previous round. They correspond to memory-one strategies with pCD = pDC. Second, we find that for the donation game, the adaptive dynamics exhibits an interesting symmetry between orbits forward in time and backward in time. We use these mathematical results to partially characterize the adaptive dynamics among memory-one strategies, and to fully characterize the dynamics among memory-one counting strategies.
Model
We study the infinitely repeated donation game between two players. Each round, each player has the option to cooperate (C) or to defect (D). Players make their choices independently, not knowing their co-player's choice in that round. Payoffs in each round are given by the matrix

        C      D
  C ( b - c   -c )        (1)
  D (   b      0 )

The entries correspond to the payoff of the row player, with b and c being the benefit and cost of cooperation, respectively. We assume b > c > 0 throughout. The above payoff matrix is an instance of the general payoff matrix of a symmetric 2 x 2 game,

        C   D
  C (   R   S )        (2)
  D (   T   P )

The payoff matrix (1) of the donation game satisfies the typical inequalities of a prisoner's dilemma, T > R > P > S and 2R > T + S. Moreover, it satisfies the condition of 'equal gains from switching',

  R + P = T + S        (3)

For the donation game, this is immediate: R + P = (b - c) + 0 and T + S = b + (-c), both equal to b - c. This condition ensures that if players interact repeatedly, their overall payoffs only depend on how often each player cooperates, independent of the timing of cooperation.
In the following we focus on repeated games among players with memory-one strategies. Each player's decision is determined by a four-tuple p = (pCC, pCD, pDC, pDD). Depending on the outcome of the previous round, CC, CD, DC, or DD, the focal player responds by cooperating with probability pCC, pCD, pDC, or pDD, respectively.
Strategies with large pCC exhibit a high frequency of mutual cooperation and will receive relatively large payoffs in the donation game. We note that in games with other payoff matrices (2), it may be beneficial in the long run for players to take turns, where one player cooperates while the other defects. This behavior is called ST-reciprocity, because players alternately receive payoffs S and T rather than R in every round. ST-reciprocity becomes superior to R-reciprocity in terms of payoffs when S + T > 2R, and it can be achieved by memory-one strategies such as (p1, 0, 1, p4) with small but positive p1, p4. For an account of ST- and R-reciprocity in other 2 x 2 games such as the Chicken or Snowdrift game, see [54, 55]. For the donation game, where S + T = R < 2R, we are primarily interested in the evolution of mutual cooperation CC.
We refer to a memory-one strategy as a counting strategy if it satisfies pCD = pDC. A counting strategy only reacts to the number of cooperators in the previous round. If both players cooperated in the previous round, they cooperate with probability pCC. If exactly one of the players cooperated, they cooperate with probability pCD = pDC, irrespective of whether the outcome was CD or DC. If no one cooperated, the cooperation probability is pDD. Memory-one counting strategies include all unconditional strategies (such as ALLC and ALLD), as well as the strategies GRIM = (1, 0, 0, 0) and WSLS = (1, 0, 0, 1).
If the two players employ memory-one strategies p = (pCC, pCD, pDC, pDD) and p' = (p'CC, p'CD, p'DC, p'DD), then their behavior generates a Markov chain with transition matrix

      ( pCC p'CC   pCC (1 - p'CC)   (1 - pCC) p'CC   (1 - pCC)(1 - p'CC) )
  M = ( pCD p'DC   pCD (1 - p'DC)   (1 - pCD) p'DC   (1 - pCD)(1 - p'DC) )        (4)
      ( pDC p'CD   pDC (1 - p'CD)   (1 - pDC) p'CD   (1 - pDC)(1 - p'CD) )
      ( pDD p'DD   pDD (1 - p'DD)   (1 - pDD) p'DD   (1 - pDD)(1 - p'DD) )

That is, if s(n) = (sCC(n), sCD(n), sDC(n), sDD(n)), and sij(n) is the probability that the p-player chooses i and the p'-player chooses j in round n, then s(n + 1) = s(n) M. For p, p' ∈ (0, 1)^4, the Markov chain has a unique invariant distribution v = (vCC, vCD, vDC, vDD). This distribution v corresponds to the left eigenvector of M with respect to the eigenvalue 1, normalized such that the entries of v sum up to one. The entries of v can be interpreted as the average frequency of the four possible outcomes over the course of the game. Therefore we can compute the average payoff of the p-player as

  A(p, p') = R vCC + S vCD + T vDC + P vDD = (b - c) vCC - c vCD + b vDC        (5)

For a more explicit representation of the players' payoffs, one can use the determinant formula by [56], which is shown in Methods.
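To make this computation concrete, the following Python sketch (our illustration, continuing the snippet above) builds the transition matrix (4), extracts the stationary distribution v, and evaluates the payoff (5):

```python
import numpy as np

def transition_matrix(p, q):
    """Transition matrix (4) on the states CC, CD, DC, DD.
    In state CD the p-player reacts to CD while the q-player reacts to DC,
    and vice versa in state DC."""
    pCC, pCD, pDC, pDD = p
    qCC, qCD, qDC, qDD = q
    rows = [(pCC, qCC), (pCD, qDC), (pDC, qCD), (pDD, qDD)]
    return np.array([[x * y, x * (1 - y), (1 - x) * y, (1 - x) * (1 - y)]
                     for x, y in rows])

def payoff(p, q, b=1.0, c=0.1):
    """Average payoff A(p, q) of the p-player in the donation game, Eq (5)."""
    M = transition_matrix(p, q)
    # Stationary distribution: left eigenvector of M for eigenvalue 1.
    w, V = np.linalg.eig(M.T)
    v = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    v = v / v.sum()
    return float(v @ np.array([b - c, -c, b, 0.0]))

# WSLS against itself sustains mutual cooperation: payoff b - c = 0.9.
print(payoff((1, 0, 0, 1), (1, 0, 0, 1)))
```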
To explore how players adapt their strategies over time, we use adaptive dynamics [52, 53]. Adaptive dynamics is a method to study deterministic evolutionary dynamics in a continuous strategy space. The idea is that the population is (mostly) homogeneous at any given time. Mutations generate a small ensemble of possible invaders, which are very close to the resident in strategy space. These invaders can take over the population if they receive a higher payoff against the resident than the resident achieves against itself. In the limit of infinitesimally small variation between resident and invader, we obtain an ordinary differential equation. For memory-one strategies this differential equation takes the form

  ṗij = ∂A(p, p') / ∂pij |_(p = p'),   with i, j ∈ {C, D}        (6)

That is, populations evolve in the direction of the payoff gradient. We derive an explicit representation of this differential equation in Methods. The resulting expression defines a flow on the cube [0, 1]^4. Our aim is to understand the properties of this flow.
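Continuing the sketch above, the right-hand side of Eq (6) can be approximated numerically by central finite differences with respect to the mutant strategy (again our own illustration, under the assumption that numerical gradients suffice for exploration):

```python
def adaptive_dynamics(p, b=1.0, c=0.1, eps=1e-6):
    """Approximate the right-hand side of Eq (6): the gradient of the
    mutant payoff A(., p), evaluated at the resident strategy p."""
    p = np.asarray(p, dtype=float)
    grad = np.zeros(4)
    for k in range(4):
        up, dn = p.copy(), p.copy()
        up[k] += eps
        dn[k] -= eps
        grad[k] = (payoff(up, p, b, c) - payoff(dn, p, b, c)) / (2 * eps)
    return grad

# Local direction of selection at the random strategy:
print(adaptive_dynamics([0.5, 0.5, 0.5, 0.5]))
```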
Results
Structural properties of adaptive dynamics
We begin by describing two general properties of adaptive dynamics in the cube [0, 1]^4 of memory-one strategies. The first property is an invariance result. As we prove in Methods, the subspace of counting strategies is left invariant under adaptive dynamics. That is, if the initial population p(0) satisfies pCD(0) = pDC(0) and p(t) is a solution of the dynamics (6), then pCD(t) = pDC(t) for all times t. Therefore, if initially all population members only care about the number of cooperators, then the same is true for all future population members. This result does not require the specific payoffs of the donation game. Instead it is true for all symmetric 2 x 2 games. The result is useful because it allows us to decompose the space of memory-one strategies into three invariant sets: the set of strategies with pCD > pDC, with pCD = pDC, and with pCD < pDC. Each of these invariant subsets can be studied in isolation. In a subsequent section, we provide such an analysis for the counting strategies (with pCD = pDC) specifically.
As a second property, we observe an interesting symmetry between different orbits of adaptive dynamics. Specifically, if (pCC, pCD, pDC, pDD)(t) is a solution to (6) on some interval t ∈ (a, b), then so is (1 - pDD, 1 - pDC, 1 - pCD, 1 - pCC)(-t) on the interval t ∈ (-b, -a). This property implies that for every orbit forward in time, there is an associated orbit backward in time that exhibits the same dynamics. This result is specific to the donation game (or more precisely, to games with equal gains from switching). The formal proof of this symmetry is in Methods. In the following we provide an intuitive argument. To this end, consider the following series of transformations applied to the payoff matrix of a 2 x 2 game with equal gains from switching:

      C  D                                 C   D
  C ( R  S )   -- negating payoffs -->  C ( -R  -S )
  D ( T  P )                            D ( -T  -P )

               -- adding a constant --> C ( -R+(R+P)  -S+(S+T) )  =  C ( P  T )        (7)
                                        D ( -T+(S+T)  -P+(R+P) )     D ( S  R )

               -- exchanging C and D -->   C ( R  S )
                                           D ( T  P )

Notice that we started and ended at the same game; this property is equivalent to equal gains from switching. But now it is easy to see that solutions to the associated ordinary differential equation transform correspondingly as follows,

  (pCC, pCD, pDC, pDD)(t)   -- negating payoffs -->   (pCC, pCD, pDC, pDD)(-t)
                            -- adding a constant -->  (pCC, pCD, pDC, pDD)(-t)        (8)
                            -- exchanging C and D --> (1 - pDD, 1 - pDC, 1 - pCD, 1 - pCC)(-t)
The upshot of this duality is that solutions to adaptive dynamics come in related pairs. We
will see expressions of this duality in several of the figures below.
Fig 1. Local adaptive dynamics for memory-one strategies. For a 9 x 9 x 9 x 9-grid (= 6561 points) we show the direction of change in terms of the sign of each component of (ṗCC, ṗCD, ṗDC, ṗDD) as given by Eq (6). The possibilities are shown on the right. We observe that for 1424 points all four components are positive, ++++. For 3269 points all four components are negative, ----. Seven combinations do not occur. These combinations fall into one or both of the following categories: (i) ṗCC is negative and ṗDC is positive, and (ii) ṗDD is negative and ṗCD is positive. Both combinations are forbidden. Because of the symmetry (8) there are three pairs where each combination occurs as often as its partner. One such pair is ++-+ and +-++ (each occurring 353 times). The configuration +--+ is its own mirror image and therefore a singleton (occurring 536 times). The reason for the symmetry in the plot is explained in the main text. Let σ: [0, 1]^4 → [0, 1]^4 be defined by σ(pCC, pCD, pDC, pDD) = (1 - pDD, 1 - pDC, 1 - pCD, 1 - pCC). If abcd are the signs at p, then dcba are the signs at σ(p). σ acts by reflection about the dotted diagonal line shown. Finally, eight points are critical points with (ṗCC, ṗCD, ṗDC, ṗDD) = (0, 0, 0, 0). Two points are zero in one but not all of the four components. The graph is created for c = 0.1.
https://doi.org/10.1371/journal.pcbi.1010987.g001
As we show in Methods, the interior critical points of (6) are exactly the strategies that satisfy

  b (pCC - pCD) - c (1 - pCC + pDC) = 0   and   pCC + pDD = pCD + pDC        (9)

In particular, the set of interior critical points forms a two-dimensional plane within the four-dimensional cube. As we will show in Methods, (9) implies certain bounds on pCC and pDD among the interior critical points: pCC > c/b and pDD < 1 - c/b.
By definition, critical points satisfy a local condition, ṗij = 0 for all i, j ∈ {C, D}. However, it turns out that the critical points identified above have a shared global property. The points that satisfy (9) coincide with the equalizer strategies that have been described earlier [56, 57]. An equalizer is a strategy p such that A(p', p) is constant, irrespective of p'. Every such strategy must be a critical point of adaptive dynamics. Our result shows that the converse is also true: every interior critical point of the system (6) is an equalizer.
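This equivalence is easy to probe numerically. The sketch below (ours, continuing the earlier snippets) constructs an interior critical point from chosen values of pCC and pDD, using the explicit solution for pCD and pDC derived in Methods (Eq 27), and confirms the defining equalizer property that A(p', p) does not depend on p':

```python
def equalizer(pCC, pDD, b=1.0, c=0.1):
    """Interior critical point with prescribed pCC and pDD, cf. Eq (27) in Methods."""
    pCD = (b * pCC - c * (1 + pDD)) / (b - c)
    pDC = (c * (1 - pCC) + b * pDD) / (b - c)
    return (pCC, pCD, pDC, pDD)

p_eq = equalizer(0.8, 0.3)  # respects the bounds pCC > c/b and pDD < 1 - c/b
rng = np.random.default_rng(0)
# Five random opponents all receive the same payoff against the equalizer.
print([round(payoff(q, p_eq), 6) for q in rng.random((5, 4))])
```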
We can also examine what happens on the boundary of the strategy space. For our analysis, we define the boundary B([0, 1]^4) to be the set of all points p ∈ [0, 1]^4 with exactly one entry pij ∈ {0, 1}. That is, we exclude corner and edge points. What remains is a set of eight 3-dimensional cubes. We call a point p ∈ B([0, 1]^4) saturated if pij = 0 implies ṗij ≤ 0 and pij = 1 implies ṗij ≥ 0. A point is called strictly saturated if the above inequalities are strict. A point is unsaturated if it is not saturated. Orbits that start at an unsaturated point move into the interior of the strategy space. Conversely, every strictly saturated point is the limit, forward in time, of some trajectory in the interior.
For memory-one strategies, all eight boundary faces contain both saturated and unsaturated points for some values of 0 < c < b (Fig 2). In the following, we discuss in more detail the boundary face for which mutual cooperation is absorbing (that is, the boundary face with pCC = 1). On this boundary face, the population obtains the socially optimal payoff of b - c, irrespective of the specific values of pCD, pDC, pDD. As a result, we show in Methods that the time derivatives with respect to these components vanish, ṗCD = ṗDC = ṗDD = 0. The saturated points on the face pCC = 1 are exactly those that satisfy ṗCC ≥ 0, which yields the condition
    (1 - pCD) (1 - (1 - pDC)(pCD - pDD) - (pDC - pDD)^2)
  --------------------------------------------------------------------------  ≥  c / b        (10)
  (1 - pCD)^2 (1 - pDC) + pDC pDD (2 - pDD) + (1 - pCD)(1 - pDC)(pDC + pDD)
This set of saturated points contains all cooperative memory-one Nash equilibria, which have been characterized by [38] as the set of all strategies p that satisfy pCC = 1 and

  (1 - pCD) / pDD  ≥  c / (b - c)   and   (1 - pCD) / pDC  ≥  c / b        (11)
We note that the conditions (11) are stricter than the conditions (10). Put another way, a boundary point can be a local maximum of the payoff function against itself without being a global maximum.
In a similar way, one can also characterize the saturated points on the boundary face with pDD = 0, where mutual defection is absorbing. We depict the set of saturated points on this face in the bottom row of Fig 2, together with the previously discussed set of saturated points with pCC = 1 in the top row. As the figure suggests, the two sets exactly complement each other. For every point that is strictly saturated on the boundary face pCC = 1 there is a corresponding point on the face pDD = 0 that is unsaturated. Of course, this correspondence is again a consequence of the symmetry described earlier.
After describing the critical points in the interior and the saturated points on the boundary, we explore the 'typical' behavior of interior trajectories. To this end, we record the end behavior of solutions p(t) to Eq (6) beginning at various initial conditions p(0). Dynamics are assumed to cease at the boundary of the strategy space. This behavior can be calculated numerically. The results, for a 9 x 9 x 9 x 9 grid of initial conditions and cost-to-benefit ratio c/b = 0.1, are shown in Fig 3. There are 6561 initial conditions. Out of those, 1835 points are observed to end at full cooperation (pCC = 1), 1375 points at full defection (pDD = 0), 2964 points at other places on the boundary, and 387 at interior critical points (equalizers). Unlike in Fig 1, we do not observe the symmetry described in Eqs (7) and (8). The choice of depicting the forward direction of time breaks the symmetry.
Fig 2. Saturated points on the boundary of memory-one strategies. The boundary of the set of memory-one strategies consists of eight three-dimensional faces with pij = 0 or pij = 1 for exactly one pair of i, j ∈ {C, D}. We omit points (pCC, pCD, pDC, pDD) for which more than one pij is 0 or 1. Thus, the eight boundary faces do not intersect. A point p on the boundary is saturated if the payoff gradient does not point into the interior of the cube. We show the set of saturated points on all eight boundary faces. Because of the symmetry described by Eqs (7) and (8), these eight sets of points fit together in four complementary pairs, like the curved pieces of a three-dimensional puzzle. The boundary face pij = 0 is paired with the face pīj̄ = 1, where a bar refers to the opposite action (C̄ = D and D̄ = C). The paired boundary faces fit together after a rotation of one of them by 180° about the line parameterized by (t, 1/2, 1 - t). Parameter c = 0.1.
https://doi.org/10.1371/journal.pcbi.1010987.g002
Fig 3. Long-time limits of adaptive dynamics of memory-one strategies. For a 9 x 9 x 9 x 9-grid of starting points (= 6561 points), we show the limit lim_{t→∞} p(t) of a solution p(t) to Eq (6). Dynamics are assumed to cease at the boundary of the strategy space. Generically, there are 4 possibilities, as shown in the legend. For 1835 points, the trajectory p(t) evolves to full cooperation, defined by pCC = 1 (blue). For 1375 points, the trajectory p(t) evolves to full defection, defined by pDD = 0 (red). The remaining points either evolve into other regions of the boundary (green) or approach interior critical points, which are equalizers (yellow). The symmetry described in the main text does not manifest in this plot, but reappears when we juxtapose the plot with the corresponding plot for reversed time. Parameter c = 0.1.
https://doi.org/10.1371/journal.pcbi.1010987.g003
Adaptive dynamics of counting strategies
We now consider the space of counting strategies in its own right. We write a counting strategy as a vector q = (q2, q1, q0) ∈ [0, 1]^3, where qi is the probability to cooperate if i of the two players cooperated in the previous round; the embedding into the memory-one strategies is (q2, q1, q0) ↦ (q2, q1, q1, q0). Analogous to (6), the adaptive dynamics in this space is

  q̇i = ∂A(q, q') / ∂qi |_(q = q'),   with i ∈ {0, 1, 2}        (12)

This dynamics among counting strategies is not identical to the previously considered dynamics among memory-one strategies, even when the starting population is taken from the invariant subset with pCD = pDC. Instead, differences arise because the embedding [0, 1]^3 → [0, 1]^4 is not distance-preserving with the standard metric on each space. As a result, the gradient of the payoff function is computed slightly differently in the two spaces: specifically, the memory-one adaptive dynamics (6) restricted to the subspace of counting strategies differs from the dynamics (12) by a factor of 2 in q̇1(t). The analysis in this section is thus not meant to characterize the orbits of the invariant subspace of counting strategies within the memory-one strategies. Rather, we consider the space of counting strategies [0, 1]^3 as an interesting space in its own right, which we analyze in the following.
In a first step, we reproduce Fig 1 for the case of counting strategies. In Fig 1, counting strategies correspond to the points on the diagonal pCD = pDC of each subpanel. Fig 4 is the analog of Fig 1 for counting strategies, where we plot the signs of the components of (q̇2, q̇1, q̇0) at each counting strategy. As one may expect, these combinations again come in pairs, where abc is paired with cba. Some combinations, such as +++, are self-paired.
Similar to the memory-one strategies, we also want to characterize the set of interior critical points of the system (12). In Methods, we show that these points can be parametrized by

  ( t + c/(b+c), t, t - c/(b+c) ),   with t ∈ ( c/(b+c), b/(b+c) )        (13)

Hence the set of interior critical points forms a straight line segment. The boundary points of this line segment are

  ( 2c/(b+c), c/(b+c), 0 )   and   ( 1, b/(b+c), (b-c)/(b+c) )        (14)

The length of this line segment is √3 (b-c)/(b+c), which ranges from √3 (the diagonal of the cube) to 0, as c/b ranges from 0 to 1. We can classify the stability of the critical points by finding their associated eigenvalues. The complete results are shown in Fig 5. Five generic types of critical points are present as we vary the cost-to-benefit ratio: source, spiral source, spiral sink, sink, and saddle.
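The line (13) can be checked numerically by restricting the payoff computation to the counting space. The sketch below (our illustration, continuing the earlier snippets) embeds q = (q2, q1, q0) as the memory-one strategy (q2, q1, q1, q0), takes the gradient in the three q-coordinates, and verifies that a point on the line (13) is critical; a numerical Jacobian at such points would reproduce the eigenvalue classification of Fig 5.

```python
def payoff_counting(q, qp, b=1.0, c=0.1):
    """A(q, q') for counting strategies, via the embedding
    (q2, q1, q0) -> (q2, q1, q1, q0) into the memory-one strategies."""
    return payoff((q[0], q[1], q[1], q[2]),
                  (qp[0], qp[1], qp[1], qp[2]), b, c)

def counting_dynamics(q, b=1.0, c=0.1, eps=1e-6):
    """Approximate (qdot2, qdot1, qdot0) of the counting dynamics (12)."""
    q = np.asarray(q, dtype=float)
    grad = np.zeros(3)
    for k in range(3):
        up, dn = q.copy(), q.copy()
        up[k] += eps
        dn[k] -= eps
        grad[k] = (payoff_counting(up, q, b, c)
                   - payoff_counting(dn, q, b, c)) / (2 * eps)
    return grad

b, c, t = 1.0, 0.1, 0.5
q_star = (t + c / (b + c), t, t - c / (b + c))  # a point on the line (13)
print(counting_dynamics(q_star))                # approximately (0, 0, 0)
```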
In addition to these interior critical points, Fig 6 also depicts the critical points on the boundary faces B([0, 1]^3). Using the terminology of the previous section, these critical points are saturated without being strictly saturated. On each boundary face, the respective curve thus separates the region of strictly saturated points from the unsaturated points. Because of the aforementioned symmetry of solutions, the set of boundary critical points is symmetric under the transformation (x, y, z) ↦ (1 - z, 1 - y, 1 - x). We note that counting strategies have
Fig 4. Local adaptive dynamics for counting strategies. On a 9 x 9 x 9 x 9-grid representing the space of memory-one strategies, we depict the 729 points which are counting strategies (defined by pCD = pDC). They are colored according to their direction of change in terms of the sign of each component of (q̇2, q̇1, q̇0). Generically, there are eight possibilities as shown in the legend. We observe that for 156 points all three components are positive, +++, while for 373 points all three components are negative, ---. Three combinations do not occur: -+-, -++, and ++-. These are combinations in which q̇2 or q̇0 is negative while q̇1 is positive; such combinations are forbidden. Because of the symmetry derived in the main text there is a symmetric pair, +-- and --+, each occurring 29 times. The configuration +-+ is its own mirror image and therefore a singleton (occurring 142 times). Parameter c = 0.1.
https://doi.org/10.1371/journal.pcbi.1010987.g004
boundary properties unshared by memory-one strategies. For example, every boundary point
with q1 = 0 is saturated. Conversely, every boundary point with q1 = 1 is unsaturated.
To explore the dynamics in the interior, Fig 7 depicts the end behavior of solutions q(t) to Eq (12) with initial conditions on an evenly spaced grid (analogous to Fig 3). Again, dynamics are assumed to cease at the boundary. We observe that out of 729 initial points, 190 evolve to full cooperation, 140 evolve to full defection, 229 evolve to other places on the boundary, and 170 evolve to interior critical points. The overall abundance of the four outcomes is thus similar to the respective numbers in the space of all memory-one strategies, with the exception that now more orbits converge to interior critical points.
Fig 5. Classification of interior critical points in the space of counting strategies. We show the line of interior critical points in the space of counting strategies for five values of c. The line is colored according to the type of each critical point, which is determined by the eigenvalues of the linearization of the system (12) at this point. We observe all five generic types: source, spiral source, sink, spiral sink, and saddle. The complete classification is shown in the lower right panel. Each interior critical point is an equalizer (see main text). The line is parameterized by (t + c/(1 + c), t, t - c/(1 + c)) as t ranges over the interval (c/(1 + c), 1/(1 + c)) (setting b = 1). The symmetry described in the main text is manifest in this figure. The transformation σ: (x, y, z) ↦ (1 - z, 1 - y, 1 - x) carries the line of critical points to itself. It exchanges sinks and sources, spiral sinks and spiral sources, and saddle points with other saddle points.
https://doi.org/10.1371/journal.pcbi.1010987.g005
Fig 6. Interior and boundary critical points in the space of counting strategies. For four values of c, we show the line of interior critical points
(green) and the boundary critical points (black) in the space of counting strategies. The boundary critical points consist of three pieces: the edge defined
by q0 = 0 and q2 = 1 (i.e. the intersection of full cooperation and full defection) and two separate curves on the faces q0 = 0 and q2 = 1. For example, the
strategy GRIM = (1, 0, 0) is a boundary critical point. The symmetry described in the main text is visible in the rotational symmetry of the set of critical
points.
https://doi.org/10.1371/journal.pcbi.1010987.g006
Fig 7. Long-time limits of adaptive dynamics of counting strategies. On a 9 x 9 x 9 x 9-grid representing the space of memory-one strategies, we depict the 729 points which are counting strategies (defined by pCD = pDC). They are colored according to the limit lim_{t→∞} q(t) of a solution q(t) to Eq (12), with starting value q(0) in the grid. Dynamics are assumed to cease at the boundary of the strategy space. Generically, there are 4 possibilities as shown in the legend. For 190 points the trajectory q(t) evolves to full cooperation, defined by q2 = 1 (blue). For 140 points the trajectory q(t) evolves to full defection, defined by q0 = 0 (red). The remaining points either evolve into other regions of the boundary (green) or approach interior critical points, which are equalizers (yellow). This figure is not a simple restriction of Fig 3 because the restriction of Eq (6) differs from Eq (12) by a factor of 2. Parameter c = 0.1.
https://doi.org/10.1371/journal.pcbi.1010987.g007
Fig 8. Trajectories of adaptive dynamics of counting strategies. We consider four different initial conditions. We
plot the solutions q(t) to Eq (12) on the left, colored by hue and marked with arrowheads to indicate the direction of
evolution in the strategy space. On the right, we plot the cooperation rate C(q(t)), which is a real number between zero
(full defection) and one (full cooperation). Each of the initial conditions leads to a different behavior. In the first row,
for an initial condition q(0) = (1, 1, 0.8), the cooperation rate decreases monotonically from one to zero. In the second
row, for q(0) = (0.6833, 0.85, 0), the cooperation rate increases monotonically from zero to one. In the third row, for
q(0) = (0.6, 0.5, 0), the cooperation rate increases from zero to an intermediate value before decreasing and then
increasing again to one. Finally, in the last row, for q(0) = (0.6667, 0.75, 0), the cooperation rate increases from zero
before oscillating and converging to an intermediate value. The last two orbits loop around the line of interior critical
points, shown in black. Parameter c = 0.1.
https://doi.org/10.1371/journal.pcbi.1010987.g008
We can also plot a few solutions q(t) of Eq (12) in three dimensions to give an idea of the possible behaviors. Four types of behavior are shown in Fig 8. Alongside plots of the trajectory q(t) we depict the cooperation rate C(q(t)), defined as the average rate of cooperation in a large population playing the respective strategy. Previous studies show that these cooperation rates change monotonically when players are restricted to use reactive strategies (those with pCC = pDC and pCD = pDD, see [1]). Within the counting strategies, this monotonicity is violated in the third and fourth examples, and the fourth converges to intermediate cooperation rather than full cooperation or full defection.
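The trajectories of Fig 8 can be reproduced qualitatively by integrating the gradient sketched earlier with an off-the-shelf ODE solver (our illustration; the initial condition is nudged slightly off the face q0 = 0 so that the stationary distribution remains unique):

```python
from scipy.integrate import solve_ivp

def cooperation_rate(q, b=1.0, c=0.1):
    """Self-cooperation rate C(q) = A(q, q) / (b - c)."""
    return payoff_counting(q, q, b, c) / (b - c)

# Third initial condition of Fig 8, nudged slightly into the interior.
sol = solve_ivp(lambda t, q: counting_dynamics(q),
                (0.0, 3000.0), [0.6, 0.5, 1e-3], max_step=10.0)
samples = range(0, sol.y.shape[1], 50)
print([round(cooperation_rate(sol.y[:, i]), 3) for i in samples])
```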
Discussion
Moreover, as our analysis of the saturated boundary points makes apparent, there is a strictly larger set of memory-one strategies that can maintain cooperation.
We believe these results give a more rigorous understanding of the properties of memory-one strategies. At the same time we hope that similar techniques can be used to explore other games and more general strategy spaces.
Methods
Adaptive dynamics of memory-one strategies
Derivation of the adaptive dynamics. In the main text, we have described how to define the payoff of two players with memory-one strategies by representing the game as a Markov chain. However, to derive the adaptive dynamics, it is useful to start with an alternative representation of the payoffs. As shown by [56], the payoff expression (5) can be rewritten as
                 ( -1 + pCC p'CC   -1 + pCC   -1 + p'CC   R )
             det ( pCD p'DC        -1 + pCD   p'DC        S )
                 ( pDC p'CD        pDC        -1 + p'CD   T )
                 ( pDD p'DD        pDD        p'DD        P )
  A(p, p') = -------------------------------------------------        (15)
                 ( -1 + pCC p'CC   -1 + pCC   -1 + p'CC   1 )
             det ( pCD p'DC        -1 + pCD   p'DC        1 )
                 ( pDC p'CD        pDC        -1 + p'CD   1 )
                 ( pDD p'DD        pDD        p'DD        1 )
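As a cross-check of this formula (our own sketch, continuing the earlier snippets), the determinant quotient can be transcribed directly and compared against the stationary-distribution payoff:

```python
def payoff_det(p, q, b=1.0, c=0.1):
    """Payoff A(p, q) via the determinant formula (15)."""
    pCC, pCD, pDC, pDD = p
    qCC, qCD, qDC, qDD = q

    def D(f):  # f is the last column: the p-player's one-round payoffs
        return np.linalg.det(np.array([
            [-1 + pCC * qCC, -1 + pCC, -1 + qCC, f[0]],
            [pCD * qDC, -1 + pCD, qDC, f[1]],
            [pDC * qCD, pDC, -1 + qCD, f[2]],
            [pDD * qDD, pDD, qDD, f[3]]]))

    return D([b - c, -c, b, 0.0]) / D([1.0, 1.0, 1.0, 1.0])

rng = np.random.default_rng(1)
p, q = rng.random(4), rng.random(4)
print(payoff_det(p, q), payoff(p, q))  # the two values agree
```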
Using this representation, we can write out the expression for adaptive dynamics (6) in full. To this end, it is convenient to multiply the resulting system by the common denominator (1 - pCD + pDC) r(pCC, pCD, pDC, pDD)^2. This denominator is positive in the interior (0, 1)^4 of the strategy space. Hence, multiplying by the denominator only affects the timescale of evolution, but not the direction of the trajectories. After applying this modification to the system (6), the dynamics among the memory-one strategies of the donation game takes the following form,
  ṗCC = f1(pCD, pDC, pDD) · [ b·g1(pCC, pCD, pDC, pDD) + c·h1(pCC, pCD, pDC, pDD) ]
  ṗCD = f2(pCC, pDD) · [ b·g2(pCC, pCD, pDC, pDD) + c·h2(pCC, pCD, pDC, pDD) ]        (17)
  ṗDC = f3(pCC, pDD) · [ b·g3(pCC, pCD, pDC, pDD) + c·h3(pCC, pCD, pDC, pDD) ]
  ṗDD = f4(pCC, pCD, pDC) · [ b·g4(pCC, pCD, pDC, pDD) + c·h4(pCC, pCD, pDC, pDD) ]
Here, the auxiliary functions fi, gi, hi for i ∈ {1, 2, 3, 4} are defined as follows. The functions for i ∈ {3, 4} are obtained from those for i ∈ {1, 2} via

  f3(x, w) = f2(1 - w, 1 - x)
  g3(x, y, z, w) = g2(1 - w, 1 - z, 1 - y, 1 - x)
  h3(x, y, z, w) = h2(1 - w, 1 - z, 1 - y, 1 - x)
  f4(x, y, z) = f1(1 - z, 1 - y, 1 - x)        (18)
  g4(x, y, z, w) = g1(1 - w, 1 - y, 1 - z, 1 - x)
  h4(x, y, z, w) = h1(1 - w, 1 - z, 1 - y, 1 - x)
Note that we can write fi, gi, hi for i ∈ {3, 4} in terms of the same functions for i ∈ {1, 2}. This is a consequence of the symmetry we discuss later.
Invariance of counting strategies. Using the representation (17) and (18), it becomes straightforward to show that the space of memory-one counting strategies remains invariant under adaptive dynamics.
Proposition 1. Let C denote the three-dimensional subspace of counting strategies among the memory-one strategies,

  C := { p ∈ [0, 1]^4 | pCD = pDC }        (19)

Then C is invariant under adaptive dynamics. That is, if p(t) is a solution of Eq (17) with p(0) ∈ C, then p(t) ∈ C for all t.
Proof. Let d := pCD - pDC. Along solutions of (17),

  ḋ = ṗCD - ṗDC = f2(pCC, pDD) (b - c) (1 - pCD + pDC) (pCD - pDC) (pCC - pCD - pDC + pDD)        (21)

Since the right-hand side is proportional to d = pCD - pDC, the set where d = 0 is invariant.
That is, if one takes the line segment between p and p̃ = (1 - pDD, 1 - pDC, 1 - pCD, 1 - pCC), then the midpoint of this line segment is in P. The plane P is exactly the set of points that are mapped onto themselves. Every point is mapped onto itself if the transformation is applied twice. It can be checked directly that the transformation p ↦ p̃ maps critical points to critical points (see next subsection), and the previous proposition means that it interchanges points which are limits forward in time and points which are limits backward in time.
Characterization of the interior critical points. Proposition 3. A strategy p in the interior (0, 1)^4 is a critical point of the system (17) if and only if

  b (pCC - pCD) - c (1 - pCC + pDC) = 0   and   pCC + pDD = pCD + pDC        (23)

Proof. (⟹) At an interior critical point all ṗij vanish. Because the functions fi do not vanish in the interior, the following combinations must vanish as well:

  0 = ṗCC / f1(pCD, pDC, pDD) + ṗCD / f2(pCC, pDD)

  0 = ṗCC / f1(pCD, pDC, pDD) + ṗDC / f3(pCC, pDD)
    = (b - c) (pCC - pCD) (pCC + pDD - pCD - pDC) (1 - pCD + pDC)        (24)

  0 = ṗCC / f1(pCD, pDC, pDD) - ṗDD / f4(pCC, pCD, pDC)
Since 1 - pCD + pDC > 0 for pCD, pDC ∈ (0, 1), either pCC = pCD = pDC = pDD or pCC + pDD = pCD + pDC must hold. Note that if pCC = pCD = pDC = pDD, then pCC + pDD = pCD + pDC holds trivially. Hence, in both cases we have the identity pDD = pCD + pDC - pCC, which we can plug into ṗCC / f1(pCD, pDC, pDD) to get

  ṗCC / f1(pCD, pDC, pDD) = ( b (pCD - pCC) + c (1 - pCC + pDC) ) · ( -1 + (pCD - pCC)^2 + (pDC - pCC)^2 )        (25)

It is verified without too much difficulty that whenever the second factor vanishes in (0, 1)^3, then pCD + pDC - pCC ∉ (0, 1). Any interior critical point of (17) thus needs to satisfy
  b (pCC - pCD) - c (1 - pCC + pDC) = 0   and   pCC + pDD = pCD + pDC        (26)
(⟸) If a strategy satisfies the conditions (26), we can express pCD and pDC in terms of pCC and pDD,

  pCD = ( b pCC - c (1 + pDD) ) / (b - c)   and   pDC = ( c (1 - pCC) + b pDD ) / (b - c)        (27)

Inserting these expressions into the system (17) yields, after some algebraic manipulations, ṗCC = ṗCD = ṗDC = ṗDD = 0.
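Numerically, this direction of the argument amounts to a one-line check with the earlier sketches: the gradient (6) vanishes at any strategy constructed via (27).

```python
# The construction (27) indeed yields critical points of the dynamics (6):
print(adaptive_dynamics(equalizer(0.8, 0.3)))  # approximately (0, 0, 0, 0)
```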
Solving the two conditions (26) instead for pCC and pDD gives

  pCC = ( c + b pCD + c pDC ) / (b + c)   and   pDD = ( -c + c pCD + b pDC ) / (b + c)        (28)

Using (28), the constraint pDD > 0 becomes pDC > (c/b)(1 - pCD). When we plug this back into the expression for pCC and use the fact that pCD > 0, we get pCC > c/b. Similarly, the constraints pCC < 1 and pDC < 1 lead to pDD < 1 - c/b. The result is that we have two useful bounds, pCC > c/b and pDD < 1 - c/b, among the interior critical points.
We now relate the interior critical points to the equalizer strategies discussed by [57] and [56].
Definition. An equalizer is a strategy p for which A(p', p) is a constant function of p'.
It follows from the definition that every equalizer strategy is a critical point of the dynamics (17). In the interior (0, 1)^4, the converse is also true. That is,
Proposition 4. Every interior critical point of the system (17) is an equalizer.
Proof. Our condition for critical points (27) coincides with the expression for equalizers, Eq. (8) in [56], when using the payoffs of the donation game.
As shown by [39], equalizers are the only Nash equilibria among the stochastic memory-one strategies. Thus our above results can be summarized as follows. In the donation game, an interior point is a critical point of adaptive dynamics if and only if it is a Nash equilibrium (such a result does not need to hold in general, because strategies might be locally stable critical points of adaptive dynamics without being global best responses to themselves, see [50]).
Analysis of the boundary faces. In the main text, we define the boundary of the strategy space [0, 1]^4 as the set of all (pCC, pCD, pDC, pDD) for which exactly one entry is in {0, 1}. Therefore there are eight different boundary faces. One particularly important face is the one with pCC = 1, which corresponds to a fully cooperative population. It follows from Eq (18) that on this boundary face f2(pCC, pDD) = f3(pCC, pDD) = f4(pCC, pCD, pDC) = 0. By Eq (17) we can then conclude that ṗCD = ṗDC = ṗDD = 0. A point p on this boundary face is saturated if and only if ṗCC ≥ 0. By Eq (17) and because f1(pCD, pDC, pDD) > 0, this condition is equivalent to b·g1(1, pCD, pDC, pDD) ≥ -c·h1(1, pCD, pDC, pDD), which yields condition (10).
The boundary face with pDD = 0 can be analyzed analogously.
Adaptive dynamics of counting strategies
One can compute the payoff of a q-player against a q'-player using the payoff formula (15), which yields

                 ( -1 + q2 q'2   -1 + q2   -1 + q'2   b - c )
             det ( q1 q'1        -1 + q1   q'1        -c    )
                 ( q1 q'1        q1        -1 + q'1   b     )
                 ( q0 q'0        q0        q'0        0     )
  A(q, q') = -------------------------------------------------        (29)
                 ( -1 + q2 q'2   -1 + q2   -1 + q'2   1 )
             det ( q1 q'1        -1 + q1   q'1        1 )
                 ( q1 q'1        q1        -1 + q'1   1 )
                 ( q0 q'0        q0        q'0        1 )
In the following we study the adaptive dynamics of counting strategies. Again, we consider a homogeneous population with strategy q, evolving in the direction of the gradient of the payoff function, now calculated in [0, 1]^3. Evolution in the space of counting strategies is thus given by

  q̇i = ∂A(q, q') / ∂qi |_(q = q')        (30)
To write out the adaptive dynamics Eq (30) in full, it is again convenient to multiply the equations by the common denominator r(q2, q1, q0)^2, with

  r(x, y, z) = (-1 + x)(-1 + y + (1 - 2y)(y - x)) + (2 - 2x^2 + 2y^2) z + (-1 + 2x - 2y) z^2        (31)

This denominator is nonzero in the interior (0, 1)^3 of the strategy space. After this rescaling, the system of Eq (30) becomes

  q̇2 = f2(q1, q0) · [ b·g2(q2, q1, q0) + c·h2(q2, q1, q0) ]
  q̇1 = f1(q2, q0) · [ b·g1(q2, q1, q0) + c·h1(q2, q1, q0) ]        (32)
  q̇0 = f0(q2, q1) · [ b·g0(q2, q1, q0) + c·h0(q2, q1, q0) ]
The functions for i = 0 are obtained from those for i = 2 via

  f0(x, y) = f2(1 - y, 1 - x)
  g0(x, y, z) = g2(1 - z, 1 - y, 1 - x)
  h0(x, y, z) = h2(1 - z, 1 - y, 1 - x)
Proof. Because f2, f1, f0 do not vanish in the interior of the strategy space (0, 1)^3, we can compute

  q̇1 / f1(q2, q0) + q̇0 / f0(q2, q1) = (b - c) (q0 - q1) (q2 - 2 q1 + q0),
                                                                                (35)
  q̇2 / f2(q1, q0) - q̇0 / f0(q2, q1) = (b - c) (q2 - q0) (q2 - 2 q1 + q0)

At a critical point we have q̇2 = q̇1 = q̇0 = 0, so the expressions on the right hand side must vanish. This implies q2 - 2 q1 + q0 = 0 or q2 = q1 = q0 (in which case q2 - 2 q1 + q0 = 0 holds trivially). So q1 = (q2 + q0)/2 is a necessary condition for the strategy q to be a critical point. To obtain a condition that is also sufficient, we take this expression for q1 and plug it into

  4 q̇1(q2, (q2 + q0)/2, q0) / f1(q2, q0) = ( b (q0 - q2) + c (2 + q0 - q2) ) · ( 2 - (q2 - q0)^2 )        (36)
This expression only vanishes when q2 - q0 = 2c/(b + c). The solutions to the conditions

  q2 + q0 = 2 q1   and   q2 - q0 = 2c/(b + c)        (37)

are parameterized by

  ( t + c/(b+c), t, t - c/(b+c) ),   t ∈ ( c/(b+c), b/(b+c) )        (38)
Conversely, it is easily checked that all of these strategies are critical points of (32).
Thus the interior critical points form a straight line segment in the interior of the cube, with boundary points (2c/(b+c), c/(b+c), 0) and (1, b/(b+c), (b-c)/(b+c)) and length √3 (b-c)/(b+c), which ranges from √3 (the diagonal of the cube) to 0 as c/b ranges from 0 to 1. We can classify the stability of these critical points by finding their associated eigenvalues. The results are complicated, but they are shown in Fig 5.
Restricted to the reactive strategies, the symmetry turns out to associate each trajectory to itself. That is, trajectories for reactive strategies do not come in pairs, as they do in the larger spaces of memory-one, memory-one counting, and higher-memory strategies.
In Fig 9, we plot the cooperative region for memory-one strategies (the region in which the self-cooperation rate is locally increasing). The corresponding region for reactive strategies is straightforward to describe [43]: If (pC, pD) is a player's probability to cooperate depending on the co-player's previous action (C or D), then the cooperative region consists of all points with pC - pD > c/b.
In a first step, we verify the invariance of counting strategies at the level of the Markov chain (4). Consider the variation δp = ε (eCD - eDC) of the focal player's strategy; the corresponding first-order variation of the transition matrix is

       (  0          0                0           0              )
  δM = (  ε p'DC     ε (1 - p'DC)     -ε p'DC     -ε (1 - p'DC)  )        (39)
       (  -ε p'CD    -ε (1 - p'CD)    ε p'CD      ε (1 - p'CD)   )
       (  0          0                0           0              )

Now suppose p and p' are equal and furthermore that pCD = pDC. Then vCD = vDC by symmetry, and vδM manifestly vanishes. It follows from the above that δvM = δv. Then δv is proportional to v by uniqueness of the stationary distribution. But we are also demanding that the sum of components of v + δv is 1. Thus δv = 0 and there is no variation in the payoff π(v). No player gains from deviating infinitesimally off the hypersurface pCD = pDC in adaptive dynamics, i.e. from departing the space C.
In a second step, we ask whether a similar invariance result applies to memory-n strategies.
With an argument similar to the one above, we can show that it applies at least in a restricted
way.
Our notation for memory-n strategies is best introduced by example: the component p(CDC; DDC) of a memory-3 strategy of player 1 denotes the probability of cooperation if the outcomes of the most recent three rounds were CD, DD, CC, in that order. Here the first index CDC lists player 1's own moves in these three rounds, and the second index DDC lists the co-player's moves.
Fig 9. Cooperative region for adaptive dynamics of memory-one strategies. For a 9 x 9 x 9 x 9-grid (= 6561 points) we show the points for which the cooperativity, or rate of self-cooperation, is locally increasing under the dynamics (6). The rate of self-cooperation of a strategy p can be calculated as A(p, p)/(b - c) using formula (15). We find that for 1876 points cooperativity is locally increasing; for 4677 points cooperativity is decreasing; and eight points are critical points with (ṗCC, ṗCD, ṗDC, ṗDD) = (0, 0, 0, 0). Note that, unlike the corresponding region for reactive strategies, trajectories beginning in the cooperative region can leave this region, and trajectories beginning outside of the cooperative region can enter it. We show examples of this in Fig 8. The graph is created for c = 0.1.
https://doi.org/10.1371/journal.pcbi.1010987.g009
Proposition 7. Consider the adaptive dynamics for memory-n strategies p and let s be a fixed arbitrary sequence of n - 1 moves for one player. Then the condition

  p(Cs; Ds) = p(Ds; Cs)        (40)

is preserved under adaptive dynamics whenever it holds for all such sequences s. To verify this, consider the variation

  δp = ε e(Cs; Ds) - ε e(Ds; Cs)        (41)
We can compute

  vδM = ε [  v(Cs; Ds) p'(Ds; Cs) - v(Ds; Cs) p'(Cs; Ds) ] e(sC; sC)
      + ε [  v(Cs; Ds) (1 - p'(Ds; Cs)) - v(Ds; Cs) (1 - p'(Cs; Ds)) ] e(sC; sD)        (43)
      + ε [ -v(Cs; Ds) p'(Ds; Cs) + v(Ds; Cs) p'(Cs; Ds) ] e(sD; sC)
      + ε [ -v(Cs; Ds) (1 - p'(Ds; Cs)) + v(Ds; Cs) (1 - p'(Cs; Ds)) ] e(sD; sD)
Now (40) applied to p', along with (44), imply that the right hand side of (43) vanishes. Since vδM = 0, our initial discussion means that δvM = δv. Therefore δv is proportional to v by uniqueness of the stationary distribution. Because the sum of components of v + δv is 1, we conclude that δv = 0. Hence there is no variation in the payoff π(v). No player gains from making the infinitesimal variation (41).
Author Contributions
Conceptualization: Philip LaPorte, Christian Hilbe, Martin A. Nowak.
Formal analysis: Philip LaPorte.
Supervision: Christian Hilbe, Martin A. Nowak.
Validation: Christian Hilbe, Martin A. Nowak.
Visualization: Philip LaPorte, Martin A. Nowak.
Writing – original draft: Philip LaPorte, Christian Hilbe, Martin A. Nowak.
Writing – review & editing: Philip LaPorte, Christian Hilbe, Martin A. Nowak.
References
1. Sigmund K. The Calculus of Selfishness. Princeton, NJ: Princeton Univ. Press; 2010.
2. Nowak MA. Evolutionary dynamics. Cambridge MA: Harvard University Press; 2006.
3. Nowak MA. Five rules for the Evolution of Cooperation. Science. 2006; 314:1560–1563. https://doi.org/
10.1126/science.1133755 PMID: 17158317
4. Trivers RL. The evolution of reciprocal altruism. The Quarterly Review of Biology. 1971; 46:35–57.
https://doi.org/10.1086/406755
5. Axelrod R, Hamilton WD. The evolution of cooperation. Science. 1981; 211:1390–1396. https://doi.org/
10.1126/science.7466396 PMID: 7466396
6. García J, van Veelen M. No strategy can win in the repeated prisoner's dilemma: Linking game theory and computer simulations. Frontiers in Robotics and AI. 2018; 5:102. https://doi.org/10.3389/frobt.2018.00102 PMID: 33500981
7. Hilbe C, Chatterjee K, Nowak MA. Partners and rivals in direct reciprocity. Nature Human Behaviour.
2018; 2(7):469–477. https://doi.org/10.1038/s41562-018-0342-3 PMID: 31097794
8. Glynatsi NE, Knight VA. A bibliometric study of research topics, collaboration and centrality in the field
of the Iterated Prisoner’s Dilemma. Humanities and Social Sciences Communications. 2021; 8:45.
https://doi.org/10.1057/s41599-021-00718-9
9. Rapoport A. Prisoner’s Dilemma. In: Eatwell J, Milgate M, Newman P, editors. Game Theory. Palgrave
Macmillan UK; 1989. p. 199–204.
10. Molander P. The optimal level of generosity in a selfish, uncertain environment. Journal of Conflict Res-
olution. 1985; 29:611–618. https://doi.org/10.1177/0022002785029004004
11. Nowak MA, Sigmund K. Tit for tat in heterogeneous populations. Nature. 1992; 355:250–253. https://
doi.org/10.1038/355250a0
12. Hauert C, Schuster HG. Effects of increasing the number of players and memory size in the iterated
Prisoner’s Dilemma: a numerical approach. Proceedings of the Royal Society B. 1997; 264:513–519.
https://doi.org/10.1098/rspb.1997.0073
13. Szabó G, Antal T, Szabó P, Droz M. Spatial evolutionary prisoner’s dilemma game with three strategies
and external constraints. Physical Review E. 2000; 62:1095–1103. https://doi.org/10.1103/PhysRevE.
62.1095 PMID: 11088565
14. Killingback T, Doebeli M. The continuous Prisoner’s Dilemma and the evolution of cooperation through
reciprocal altruism with variable investment. The American Naturalist. 2002; 160(4):421–438. https://
doi.org/10.1086/342070 PMID: 18707520
15. Grujic J, Cuesta JA, Sanchez A. On the coexistence of cooperators, defectors and conditional coopera-
tors in the multiplayer iterated prisoner’s dilemma. Journal of Theoretical Biology. 2012; 300:299–308.
https://doi.org/10.1016/j.jtbi.2012.02.003 PMID: 22530239
16. van Veelen M, García J, Rand DG, Nowak MA. Direct reciprocity in structured populations. Proceedings of the National Academy of Sciences USA. 2012; 109:9929–9934. https://doi.org/10.1073/pnas.1206694109 PMID: 22665767
17. van Segbroeck S, Pacheco JM, Lenaerts T, Santos FC. Emergence of fairness in repeated group inter-
actions. Physical Review Letters. 2012; 108:158104. https://doi.org/10.1103/PhysRevLett.108.158104
PMID: 22587290
18. García J, Traulsen A. The Structure of Mutations and the Evolution of Cooperation. PLoS One. 2012; 7:e35287. https://doi.org/10.1371/journal.pone.0035287 PMID: 22563381
19. Szolnoki A, Perc M. Defection and extortion as unexpected catalysts of unconditional cooperation in
structured populations. Scientific Reports. 2014; 4:5496. https://doi.org/10.1038/srep05496 PMID:
24975112
20. Szolnoki A, Perc M. Evolution of extortion in structured populations. Physical Review E. 2014;
89:022804. https://doi.org/10.1103/PhysRevE.89.022804
21. Yi SD, Baek SK, Choi JK. Combination with anti-tit-for-tat remedies problems of tit-for-tat. Journal of
Theoretical Biology. 2017; 412:1–7. https://doi.org/10.1016/j.jtbi.2016.09.017 PMID: 27670803
22. Knight V, Harper M, Glynatsi NE, Campbell O. Evolution reinforces cooperation with the emergence of
self-recognition mechanisms: An empirical study of strategies in the Moran process for the iterated pris-
oner’s dilemma. PLoS One. 2018; 13(10):e0204981. https://doi.org/10.1371/journal.pone.0204981
PMID: 30359381
23. Li J, Zhao X, Li B, Rossetti CSL, Hilbe C, Xia H. Evolution of cooperation through cumulative reciprocity.
Nature Computational Science. 2022; 2:677–686. https://doi.org/10.1038/s43588-022-00334-w
24. Murase Y, Hilbe C, Baek SK. Evolution of direct reciprocity in group-structured populations. Scientific
Reports. 2022; 12(1):18645. https://doi.org/10.1038/s41598-022-23467-4 PMID: 36333592
25. Nowak MA, Sigmund K. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s
Dilemma game. Nature. 1993; 364:56–58. https://doi.org/10.1038/364056a0 PMID: 8316296
26. Brauchli K, Killingback T, Doebeli M. Evolution of Cooperation in Spatially Structured Populations. Jour-
nal of Theoretical Biology. 1999; 200:405–417. https://doi.org/10.1006/jtbi.1999.1000 PMID: 10525399
27. Martinez-Vaquero LA, Cuesta JA, Sanchez A. Generosity pays in the presence of direct reciprocity: A
comprehensive study of 2x2 repeated games. PLoS ONE. 2012; 7(4):E35135. https://doi.org/10.1371/
journal.pone.0035135 PMID: 22529982
28. Stewart AJ, Plotkin JB. From extortion to generosity, evolution in the Iterated Prisoner’s Dilemma. Pro-
ceedings of the National Academy of Sciences USA. 2013; 110(38):15348–15353. https://doi.org/10.
1073/pnas.1306246110 PMID: 24003115
29. Glynatsi NE, Knight VA. Using a theory of mind to find best responses to memory-one strategies. Scien-
tific Reports. 2020; 10(1):1–9. https://doi.org/10.1038/s41598-020-74181-y PMID: 33057134
30. Schmid L, Hilbe C, Chatterjee K, Nowak MA. Direct reciprocity between individuals that use different
strategy spaces. PLoS Computational Biology. 2022; 18(6):e1010149. https://doi.org/10.1371/journal.
pcbi.1010149 PMID: 35700167
31. McAvoy A, Kates-Harbeck J, Chatterjee K, Hilbe C. Evolutionary instability of selfish learning in
repeated games. PNAS Nexus. 2022; 1(4):pgac141. https://doi.org/10.1093/pnasnexus/pgac141
PMID: 36714856
32. Montero-Porras E, Grujić J, Fernández Domingos E, Lenaerts T. Inferring strategies from observations
in long iterated prisoner’s dilemma experiments. Scientific Reports. 2022; 12:7589. https://doi.org/10.
1038/s41598-022-11654-2
33. Kraines DP, Kraines VY. Learning to cooperate with Pavlov: an adaptive strategy for the iterated prisoner's dilemma with noise. Theory and Decision. 1993; 35:107–150. https://doi.org/10.1007/BF01074955
34. Fischbacher U, Gächter S, Fehr E. Are people conditionally cooperative? Evidence from a public goods
experiment. Economic Letters. 2001; 71:397–404. https://doi.org/10.1016/S0165-1765(01)00394-9
35. Fischbacher U, Gächter S. Social preferences, beliefs, and the dynamics of free riding in public goods
experiments. American Economic Review. 2010; 100(1):541–556. https://doi.org/10.1257/aer.100.1.
541
36. Grujic J, Gracia-Lázaro C, Milinski M, Semmann D, Traulsen A, Cuesta JA, et al. A comparative analy-
sis of spatial Prisoner’s Dilemma experiments: Conditional cooperation and payoff irrelevance. Scien-
tific Reports. 2014; 4:4615. https://doi.org/10.1038/srep04615 PMID: 24722557
37. Akin E. What you gotta know to play good in the iterated prisoner’s dilemma. Games. 2015; 6(3):175–
190. https://doi.org/10.3390/g6030175
38. Akin E. The iterated prisoner’s dilemma: Good strategies and their dynamics. In: Assani I, editor. Ergo-
dic Theory, Advances in Dynamics. Berlin: de Gruyter; 2016. p. 77–107.
39. Stewart AJ, Plotkin JB. Collapse of cooperation in evolving games. Proceedings of the National Acad-
emy of Sciences USA. 2014; 111(49):17558–17563. https://doi.org/10.1073/pnas.1408618111 PMID:
25422421
40. Hilbe C, Traulsen A, Sigmund K. Partners or rivals? Strategies for the iterated prisoner’s dilemma. Games
and Economic Behavior. 2015; 92:41–52. https://doi.org/10.1016/j.geb.2015.05.005 PMID: 26339123
41. Donahue K, Hauser OP, Nowak MA, Hilbe C. Evolving cooperation in multichannel games. Nature
Communications. 2020; 11:3885. https://doi.org/10.1038/s41467-020-17730-3 PMID: 32753599
42. Park PS, Nowak MA, Hilbe C. Cooperation in alternating interactions with memory constraints. Nature
Communications. 2022; 13:737. https://doi.org/10.1038/s41467-022-28336-2 PMID: 35136025
43. Nowak MA, Sigmund K. The evolution of stochastic strategies in the prisoner’s dilemma. Acta Applican-
dae Mathematicae. 1990; 20:247–265. https://doi.org/10.1007/BF00049570
44. Imhof LA, Nowak MA. Stochastic evolutionary dynamics of direct reciprocity. Proceedings of the Royal
Society B. 2010; 277:463–468. https://doi.org/10.1098/rspb.2009.1171 PMID: 19846456
45. Allen B, Nowak MA, Dieckmann U. Adaptive dynamics with interaction structure. American Naturalist.
2013; 181(6):E139–E163. https://doi.org/10.1086/670192 PMID: 23669549
46. Reiter JG, Hilbe C, Rand DG, Chatterjee K, Nowak MA. Crosstalk in concurrent repeated games
impedes direct reciprocity and requires stronger levels of forgiveness. Nature Communications. 2018;
9:555. https://doi.org/10.1038/s41467-017-02721-8 PMID: 29416030
47. McAvoy A, Nowak MA. Reactive learning strategies for iterated games. Proceedings of the Royal Soci-
ety A. 2019; 475:20180819. https://doi.org/10.1098/rspa.2018.0819 PMID: 31007557
48. Chen X, Fu F. Outlearning extortioners by fair-minded unbending strategies. arXiv:2201.04198 [preprint]. 2022.
49. Brandt H, Sigmund K. The good, the bad and the discriminator—Errors in direct and indirect reciprocity.
Journal of Theoretical Biology. 2006; 239:183–194. https://doi.org/10.1016/j.jtbi.2005.08.045 PMID:
16257417
50. Stewart AJ, Plotkin JB. The evolvability of cooperation under local and non-local mutations. Games.
2015; 6(3):231–250. https://doi.org/10.3390/g6030231
51. Chen X, Wang L, Fu F. The intricate geometry of zero-determinant strategies underlying evolutionary
adaptation from extortion to generosity. New Journal of Physics. 2022; 24:103001. https://doi.org/10.
1088/1367-2630/ac932d
52. Geritz SAH, Metz JAJ, Kisdi E, Meszéna G. Dynamics of Adaptation and Evolutionary Branching. Phys-
ical Review Letters. 1997; 78(10):2024–2027. https://doi.org/10.1103/PhysRevLett.78.2024
53. Hofbauer J, Sigmund K. Evolutionary Games and Population Dynamics. Cambridge, UK: Cambridge
University Press; 1998.
54. Wakiyama M, Tanimoto J. Reciprocity phase in various 2×2 games by agents equipped with two-mem-
ory length strategy encouraged by grouping for interaction and adaptation. Biosystems. 2011; 103
(1):93–104. https://doi.org/10.1016/j.biosystems.2010.10.009 PMID: 21035518
55. Miyaji K, Tanimoto J, Wang Z, Hagishima A, Ikegaya N. Direct reciprocity in spatial populations enhances R-reciprocity as well as ST-reciprocity. PLoS ONE. 2013; 8:e71961. https://doi.org/10.1371/journal.pone.0071961 PMID: 23951272
56. Press WH, Dyson FJ. Iterated Prisoner's Dilemma contains strategies that dominate any evolutionary opponent. PNAS. 2012; 109:10409–10413. https://doi.org/10.1073/pnas.1206569109 PMID: 22615375
57. Boerlijst MC, Nowak MA, Sigmund K. Equal pay for all prisoners. American Mathematical Monthly.
1997; 104:303–307. https://doi.org/10.1080/00029890.1997.11990641