Complex Physics
Jan O. Haerter, Kim Sneppen

4 Networks
4.1 Introduction
4.1.1 When Networks are useful
4.1.2 Basic concepts
4.1.3 Amplification factor
4.1.4 Adjacency matrix
4.1.5 “Scale free” networks
4.1.6 Amplification of “epidemic” signals
4.2 Analyzing Network Topologies
4.2.1 Randomization: Constructing a proper null model
4.2.2 Algorithm generating a synthetic scale-free network

6 Econophysics
6.1 Analysis of a Time Series
6.2 Fear-Factor model
6.3 Models of economic time-series
6.3.1 A model of Economic Bubbles
6.4 Bet hedging
6.4.1 Bet hedging in random walk markets
6.4.2 Bet-hedging with occasional catastrophes
Perspectives
introduce the concept of fractals and the relation between fractal dimensions and power laws.
Homework exercises will be assigned as the course progresses. They will usually
be listed at the end of the respective chapters in these lecture notes. Note that,
in some cases, two similarly titled versions of an exercise will be available, one of
which carries a plus (+) symbol. The exercises labeled with a “+” are more
open versions of the alternative ones, but lead you to similar results. For
completion of the course, you can go with the more detailed ones, but if you
get bored, the “+” version will simply be more challenging, as less explicit
guidance is available in those variants. So it is entirely up to you which
one you work on: pick one, and try to get through it. Additionally, group
work on homework problems is explicitly encouraged. You should always work
through all problems and, if you get stuck, discuss with your classmates or
visit me in my office. Please work through the assignments already at home,
before you come to the exercise sessions. This will help you get the most out of
the tutorials we offer.
Please do the computer exercises: they are a very important part of the course
and a key methodology, absolutely needed for succeeding in complex systems
science!
Mini tutorials (marked green) are interspersed throughout the text. They are
intended to quickly raise a thought that can be recapitulated after reading a
section. They should usually be quick to answer and not difficult after having
read the previous paragraphs.
Overall, with the lectures and exercises we intend to provide the student with
knowledge of a set of model types and simulation algorithms that are useful in
understanding our surrounding world. They aim to give the student a feeling of
playfulness when thinking about putting “Life, the Universe and Everything”
on a computer:
“The popular view that scientists proceed inexorably from well-established fact to well-
established fact, never being influenced by any unproved conjecture, is quite mistaken.
Provided it is made clear which are proved facts and which are conjectures, no harm
can result. Conjectures are of great importance since they suggest useful lines of
research.”
- Alan Turing; The Enigma
“The sciences do not try to explain, they hardly even try to interpret, they mainly
make models. By a model is meant a mathematical construct which, with the addition
of certain verbal interpretations, describes observed phenomena. The justification of
such a mathematical construct is solely and precisely that it is expected to work, that
is, correctly to describe phenomena from a reasonably wide area.”
- John von Neumann
“It’s fun to invent systems and meanings and then poke holes in them.”
- Marty Rubin
“I’m all in favour of the democratic principle that one idiot is as good as one genius,
but I draw the line when someone takes the next step and concludes that two idiots
are better than one genius. ”
- Leo Szilard
“With four parameters I can fit an elephant, and with five I can make him wiggle
his trunk.”
- John von Neumann
“Simplicity is a great virtue but it requires hard work to achieve it and education to
appreciate it. And to make matters worse: complexity sells better.”
- Edsger W. Dijkstra
“Don’t be fooled by the many books on complexity or by the many complex and arcane
algorithms you find in this book or elsewhere. Although there are no textbooks on
simplicity, simple systems work and complex don’t.”
- Jim Gray
“Truth is ever to be found in simplicity, and not in the multiplicity and confusion
of things.”
- Isaac Newton
Chapter 1
I ≡ N_S!/(n_1! n_2! · · · n_q!) .   (1.1)
I simply measures the multiplicity of the outcome, where the states are pop-
ulated according to the numbers ni . The aim is to find the numbers ni such
that ignorance is maximized while satisfying a constraint.
We are not forced to maximize I, instead, we could just as well maximize
any monotonically increasing function of I. Instead of the ignorance I, for
mathematical reasons it is convenient to use a different quantity S, the entropy,
defined as
S ∼ (1/N_S) ln(I) .   (1.2)
When I is maximized, so is S. We use the proportionality symbol to indicate
that one has the liberty to choose the proportionality constant as one pleases;
the important aspect about S is that it is an increasing function of I. The use
of S is convenient, as for large numbers N_S and n_i, Stirling's approximation
can be used to transition from the discrete factorial to continuous functions:
[Figure: N_S systems, each occupying one of q states.]

ln N! = N ln N − N + (1/2) ln(2πN) + . . . ,   (1.3)
where for large N retaining only the first two terms of the RHS is a very good
approximation1 . Using only these, for NS → ∞, which we sometimes call the
thermodynamic limit, the entropy S in Eq. 1.2 can be re-written as
S ∼ (1/N_S) ( N_S ln N_S − Σ_i n_i ln n_i − N_S + Σ_i n_i + . . . )   (1.4)
  ∼ − Σ_i p_i ln p_i .   (1.5)
The probabilities pi ≡ ni /NS thereby denote the likelihood that a given system
is in state i. The proportionality allows free choice of a constant, and for
physical systems
S = −k_B Σ_i p_i ln p_i ,   (1.6)

with k_B = 1.38 × 10⁻²³ J K⁻¹, is conventional; hence entropy has units of energy
divided by temperature. We note that it would be equally reasonable to absorb
k_B into the definition of temperature T, to be defined below. In that case,
temperature would simply be measured in units of energy and entropy would
remain dimensionless.
There are several aspects to point out about Eq. 1.5: if there is only one
possible state, then n_1 = N_S and the entropy S vanishes. For cases with
more than one occupied state, 0 ≤ p_i < 1 ∀i, and entropy is always positive.
As a measure of “disorder”, entropy S is the fundamental quantity of
statistical physics.
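The passage from the multiplicity (Eq. 1.1) to the entropy formula (Eq. 1.5) is easy to check numerically. The following is a minimal sketch (the two-state example and the use of Python's math.lgamma for ln N! are our own choices here):

    import math

    # Two states (q = 2), populated as n = (n1, n2) with N_S = n1 + n2.
    N_S = 10_000
    for p1 in (0.1, 0.3, 0.5):
        n1 = int(p1 * N_S)
        n2 = N_S - n1
        # Exact (1/N_S) ln I, with I = N_S!/(n1! n2!), via the log-Gamma function.
        ln_I = math.lgamma(N_S + 1) - math.lgamma(n1 + 1) - math.lgamma(n2 + 1)
        S_exact = ln_I / N_S
        # Thermodynamic-limit result, Eq. 1.5: -sum_i p_i ln p_i.
        S_limit = -sum(p * math.log(p) for p in (n1 / N_S, n2 / N_S) if p > 0)
        print(p1, S_exact, S_limit)   # the two values agree to ~1e-4

For N_S → ∞ the two columns converge, which is precisely the content of the thermodynamic limit invoked above.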
[Figure: contour of the entropy S in the (p_1, p_2) plane together with a constraint curve C; at the constrained maximum the gradients align, grad S = λ grad C.]

∇S = λ∇C ,

∂/∂λ (S − λC) = 0 .
where Ni are the particle numbers for the different states i and α is the La-
grange multiplier for particle number. For practical purposes, α is often re-
expressed as α = −µ/kB T , where µ is the “chemical potential” and T temper-
ature.
Mini Tutorial: If l (out of q) distinct states have the same energy, what does
this do to the joint probability weight corresponding to this energy?
where we note that the opposite sign is conventional as compared to Eq. 1.19.
In the microcanonical case, the potential is just the entropy.
Once the partition function is known, observables can be evaluated by
taking appropriate derivatives, e.g. the internal energy
⟨E⟩ = −∂ ln Z/∂β ,   (1.21)
Figure 1.3: Free energy and possible derivatives. Here, a magnetic sys-
tem is assumed, where M represents total magnetization, H is the external
magnetic field and χ is the magnetic susceptibility.
specific heat

C_H ≡ (∂⟨E⟩/∂T)_H = (∂⟨E⟩/∂β)(dβ/dT) = −(1/k_B T²) ∂⟨E⟩/∂β = (1/k_B T²) ∂² ln Z/∂β² ,   (1.22)
magnetization

M = −(∂F/∂H)_T ,   (1.25)
or susceptibilities

χ_T = −(∂²F/∂H²)_T ,   (1.26)
where the subscript T denotes that temperature is held constant while evalu-
ating the derivative.
Mini Tutorial: Show how the canonical free energy relates to the microcanon-
ical one. (Hint: Again use a change of variables and start by defining a micro-
canonical free energy.)
1.1.7 Exercises
1. Two electrons.
Consider two single-particle levels with energies −ε and ε. In these levels place
two electrons (no more than one electron of the same spin per level). As a
function of T find: (a) the partition function; (b) the average energy; (c) the
entropy; (d) for microcanonical ensembles corresponding to each system energy
level, compute the entropy; (e) for (a)-(c), discuss the limits T = 0 and T → ∞.
2. Fluctuations.²
(i) Verify that

⟨(M − ⟨M⟩)²⟩ = ⟨M²⟩ − ⟨M⟩² = k_B² T² ∂² ln Z/∂H² = k_B T χ_T .
(ii) Show in a similar way that the fluctuations in the energy are related to
the specific heat at constant volume by
Use this equation to argue that ∆E ∼ N^{1/2}, where N is the number of particles
in the system.
² Yeomans, Problem 2.1.
The Ising model in a nutshell. In this text we will repeatedly make use of
the spin-1/2 Ising model as our “canonical” example. Spin-1/2 means that there
are two states for each spin. We will discuss the Ising model in more detail
in Sec. 1.4, but here briefly introduce the model for the sake of being able to
work directly with the Monte Carlo method.
The Ising model is defined as

H = −J Σ_{⟨ij⟩} s_i s_j − h Σ_i s_i ,
where si can take the values +1 or −1 and represents the spin at site i, J is the
coupling between neighboring spins and h is an external magnetic field. The
bracket specifies that sites i and j only interact if they are nearest neighbors,
i.e., the sum is carried out over all possible pairs of neighboring sites. In a
two-dimensional square lattice with N sites, there will be 2N such pairs to
sum over.
For J > 0, spins will minimize the energy when aligned (same sign), while for
J < 0 energy will be lowered if spins are anti-aligned (opposite sign). Similar
considerations go for the external magnetic field, which will tend to align spins
when sufficiently strong.
[Figure: magnetization m vs. number of Monte Carlo updates; after a transient period n_0 the time series is sampled until n_max. Far from T_c the asymptotic behavior is reached quickly, close to T_c the approach is slow.]
⟨H⟩ = Σ_{{s}} H exp(−βH) / Σ_{{s}} exp(−βH) .
One could hence imagine simply summing over all configurations and obtaining
an exact number for the expectation value of interest. However, even on
modern-day computers, summing over the 2^N configurations of an N-site
lattice of more than a few dozen sites is prohibitively costly.
But do we really need to sample the entire space of configurations to get
a reasonable estimate of the expectation values? The idea of Monte Carlo
techniques is to sample mainly those configurations that are likely to occur,
while ensuring that each state is represented as much as it probabilistically
should be. Take intermediate temperatures, where J/k_BT ∼ 1. Further, take
the external field to be absent, i.e. h = 0. When all N spins are aligned,
the contribution to the partition function is exp(Nz/2), where z is the
coordination number. Conversely, a state where all spins are anti-aligned gives a
contribution of exp(−Nz/2). Notably, the former configuration is exp(Nz)
times more likely than the latter, an enormous number even for modest N.
when continuing the Markov chain infinitely long. Here l labels a given spin
configuration and P_l^eq is the equilibrium probability of this configuration. This
requirement puts constraints on the transition probabilities between states.
The probability to reside in state l at time t + 1 is

P_l(t + 1) = P_l(t) (1 − Σ_{m≠l} w_{l→m}) + Σ_{m≠l} P_m(t) w_{m→l} ,   (1.27)

where w_{i→j} labels the transition probability from configuration i to j and the
summations are carried out to include all possible configurations m. It is
further useful to define w_{l→l} = 1 − Σ_{m≠l} w_{l→m}, i.e. the probability to remain
in configuration l. Note also that Σ_m w_{l→m} = 1, hence compactly

P_l(t + 1) = Σ_m P_m(t) w_{m→l} .
In equilibrium, stationarity is required,

P_l(t + 1) = P_l(t) ,

hence

Σ_m [P_l^eq w_{l→m} − P_m^eq w_{m→l}] = 0 .   (1.28)
A simple way to achieve the condition in Eq. 1.28 is to ensure that every term
vanishes, i.e.

P_l^eq w_{l→m} = P_m^eq w_{m→l} ,

i.e.

w_{l→m}/w_{m→l} = P_m^eq/P_l^eq = e^{−β(E_m−E_l)} ≡ e^{−β∆E} .   (1.29)
Mini tutorial: The detailed balance condition (Eq. 1.28) is a sufficient starting
point to ensure importance sampling. But is it strictly necessary?
Writing the acceptance probability as a function of the Boltzmann factor,

a_{l→m} = F(e^{−β∆E}) ,

by symmetry hence

a_{m→l} = F(1/e^{−β∆E}) ≡ F(1/x) ,

with x ≡ e^{−β∆E}. It results that

a_{l→m}/a_{m→l} = F(x)/F(1/x) = x ,   (1.30)

whereby it must be ensured that 0 ≤ F(x) ≤ 1 for meaningful transition
probabilities. The choice of F(x) is not unique. Popular choices are the
“Metropolis” rule F(x) = min(1, x) and the “heat bath” rule F(x) = x/(1 + x),
illustrated in the figure below.
Mini tutorial: As the function F(x), required for obtaining consistency with
Boltzmann equilibrium statistics, is not unique, which feature then changes
when a different choice is made for F(x)?
[Figure: acceptance probability vs. energy difference ∆E; the “Metropolis” rule saturates at 1 for ∆E ≤ 0, while the “Heat Bath” rule passes through 1/2 at ∆E = 0.]
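That both rules satisfy Eq. 1.30 is quickly confirmed numerically (a minimal sketch; the function names are ours):

    # Two common acceptance functions F(x), with x = exp(-beta * dE).
    def metropolis(x):
        return min(1.0, x)

    def heat_bath(x):
        return x / (1.0 + x)

    for F in (metropolis, heat_bath):
        for x in (0.1, 0.5, 2.0, 10.0):
            assert abs(F(x) / F(1.0 / x) - x) < 1e-12   # Eq. 1.30
    print("both choices obey F(x)/F(1/x) = x")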
Difficulties usually lie in the proper choice of system size N , the choice of the
transient period n0 , and the duration of the sampling period nmax . Generally,
fluctuations increase in the vicinity of Tc and achieving robust results may
require increasing both system size and sampling time.
Fig. 1.2.4 shows an example of a Monte Carlo simulation of the spin-1/2 Ising
model on a 400 × 400 lattice. The time series show the average magnetization
m(t) = N⁻¹ Σ_i s_i(t) for two different values of temperature. Note that, for
the lower temperature (T = 2.2 J/k_B), a relatively steady value of m(t) is
approached rapidly. At the higher temperature (T = 2.3 J/k_B), which is very
close to T_c ≈ 2.27 J/k_B, compare Sec. 1.7.3, the approach is much slower and
substantial temporal fluctuations remain. The spatial plots show the state of
Figure 1.6: Performing the Monte Carlo step. Example of a spin flip (the
four surrounding bond energies ±J change sign) by which the change of energy
is negative for J > 0 and h > 0. In the Metropolis algorithm this step would
always be accepted. Imagine going in the opposite direction: the energy
difference would be ∆E = +4J + 2h. The probability of accepting this move
would then become exp[−β(4J + 2h)]. Note also that ∆E can be computed
entirely locally, since the remainder of the lattice maintains its energy.
Exploiting this speeds up the computation enormously.
the system near the end of the simulation (i.e., after ≈ 10⁴ system updates).
Note the substantial spatial fluctuations, especially for the higher value of T ,
and the absence of any typical scale for the clusters shown.
[Figure: time series of the magnetization m(t) over 10⁴ Monte Carlo updates for T = 2.2 J/k_B and T = 2.3 J/k_B, and corresponding spatial snapshots (s = +1 vs. s = −1) of the 400 × 400 lattice near the end of the simulation.]
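To make the procedure concrete, the following is a minimal sketch of a single-spin-flip Metropolis simulation (our own toy implementation, not the code behind the figures; h = 0, periodic boundaries, and the local computation of ∆E follow the discussion around Fig. 1.6):

    import numpy as np

    rng = np.random.default_rng(1)
    L, J, T = 50, 1.0, 2.2        # lattice size, coupling, temperature (k_B = 1)
    beta = 1.0 / T
    s = rng.choice([-1, 1], size=(L, L))

    def sweep(s):
        # One Monte Carlo update: L*L attempted single-spin flips.
        for _ in range(L * L):
            i, j = rng.integers(L, size=2)
            # Sum over the four nearest neighbors (periodic boundaries).
            nb = s[(i+1) % L, j] + s[(i-1) % L, j] + s[i, (j+1) % L] + s[i, (j-1) % L]
            dE = 2.0 * J * s[i, j] * nb          # energy change if s[i, j] flips
            if dE <= 0 or rng.random() < np.exp(-beta * dE):   # Metropolis rule
                s[i, j] = -s[i, j]

    for t in range(200):
        sweep(s)
    print("m =", s.mean())        # magnetization after 200 updates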
n and k are “times” measured in units of Monte Carlo updates, where one
Monte Carlo update represents the attempted update of N sites (N is the
system size). In general, C_A(t) ∼ exp(−t/τ_auto), hence τ_auto can be estimated
and one can decide how many steps are required for a large enough,
independent, sample. For more details on Monte Carlo methods the reader is
referred to the literature [3].
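A rough estimate of τ_auto from a recorded time series could look as follows (a sketch; reading off where the normalized autocorrelation drops below 1/e is just one of several common estimators):

    import numpy as np

    def tau_auto(m):
        # Estimate the autocorrelation time of the 1D time series m.
        dm = m - m.mean()
        c = np.correlate(dm, dm, mode="full")[len(m) - 1:]   # C_A(t), unnormalized
        c = c / c[0]
        below = np.where(c < np.exp(-1.0))[0]
        return below[0] if below.size else len(m)   # first drop below 1/e

    # Test on an artificial series with known correlation time of ~20 updates.
    rng = np.random.default_rng(0)
    x = np.zeros(5000)
    for t in range(1, len(x)):
        x[t] = 0.95 * x[t - 1] + rng.normal()
    print("tau_auto ≈", tau_auto(x))   # AR(1): tau = -1/ln(0.95) ≈ 19.5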
1.2.6 Exercises
1A. Monte Carlo Simulation. +
Consider a 2D spin-1/2 Ising model, define appropriate neighborhoods for all
sites and implement a Monte Carlo method to compute the internal energy
and magnetization as a function of temperature. Are you able to find the
critical temperature and estimate any critical exponent? Discuss differences
for finite/zero magnetic field h. Investigate the robustness of your results
by modifying appropriate parameters of your simulation and by increasing the
number of replicas, i.e. using a set of different initial conditions or random
number seeds. Discuss various lattice geometries, in particular a change of
dimension. Make a literature search for available exact results and compare
your findings with these. Consider the Wolff cluster update procedure to speed
up your simulation near criticality [2].
4. Obtain the expectation values for various temperatures and plot them
as a function of k_BT/J.
5. Repeat the simulation several times for each observable and temperature
to obtain a distribution of results for that data point. Use the distribution
to quantify the sampling error and plot the error bars.
8. Repeat for a larger system size, make notes of your findings for n_0
and n_max, and discuss (qualitatively) how these values and the error bars
depend on the reduced temperature t.
[Figure: order parameters. Magnetic system: the magnetization M(T) vanishes at T_c, separating the ferromagnet from the paramagnet. Gas: the density difference ∆ρ(T) between ρ_liq(T) and ρ_gas(T) vanishes at T_c, where both branches meet at the critical density ρ_c.]
where si is the “spin” at a lattice site i and can take one of the values ±1. h is
the external magnetic field and J is the coupling parameter. For a ferromagnet,
which we qualitatively discuss here, J > 0, i.e. energy is minimized when spins
have the same sign.
³ First studied by Lenz and Ising in 1925; see Brush [4] and Wolf [5] for reviews on the
model and its vast applications.
Figure 1.9: Illustration of spin ordering. Cartoon of spin ordering for the
Ising model for increasing temperature (top to bottom); the panels are labeled
T = T_c (M = 0) and T > T_c (M = 0).
We qualitatively sketch some limits (Fig. 1.9): for very high temperature,
spins are randomly oriented, all order disappears and there is no net
magnetization. As temperature is lowered, the correlation length increases,
i.e. the length over which spins are correlated and point in the same direction;
“patchiness” increases. At the so-called critical temperature T_c, the correlation
length “diverges”: there are patches of correlated spins of all sizes. When an
external field is absent (h = 0) there is, however, still no net magnetization.
As temperature is lowered below T_c, nonzero magnetization emerges
spontaneously, i.e. the system breaks the symmetry w.r.t. positive and negative
spin, with one orientation dominating randomly. At T = 0, all spins are
entirely aligned.
Mini tutorial: What is the entropy (per site) for the Ising model in the limit
of infinite temperature?
[Figure: qualitative behavior near T_c. Free energy F vs. T and vs. H for T < T_c, T = T_c, T > T_c; magnetization M vs. T (for H = 0 and H > 0) and vs. H; susceptibility χ vs. T (for H = 0 and H ≠ 0) and vs. H.]
where s_i is the value of the spin at lattice position r_i, and ⟨...⟩ denotes the
ensemble average.
For translationally invariant systems, ⟨s_i⟩ = ⟨s_j⟩ ≡ ⟨s⟩ and therefore the
correlation function only depends on the distance vector between the two spins.
It simplifies to

Γ(r_i − r_j) ≡ Γ_ij = ⟨s_i s_j⟩ − ⟨s⟩² .   (1.34)
Away from the critical temperature T_c, spins tend to be uncorrelated at large
separation, i.e.

Γ(r) ∼ exp(−r/ξ) ,   (1.35)

with ξ the correlation length. At T_c, correlations instead decay only
algebraically,

Γ(r) ∼ 1/r^{d−2+η} .   (1.36)

In this equation, η depends on some of the system properties and is an example
of a so-called critical exponent.
χ_T = k_B T ∂² ln Z/∂H²   (1.37)
    = (1/k_B T) (⟨M²⟩ − ⟨M⟩²)   (1.38)
    = (1/k_B T) ⟨(M − ⟨M⟩)²⟩   (1.39)
    = (1/k_B T) ⟨Σ_i (s_i − ⟨s_i⟩) Σ_j (s_j − ⟨s_j⟩)⟩   (1.40)
    = (1/k_B T) Σ_{ij} Γ_{ij} ,   (1.41)
where the total magnetization M was written as the sum over all spins. For
the translationally invariant lattice, Σ_{ij} Γ_{ij} = N Σ_j Γ_{0j}, which can be
approximated by an integral near criticality, where the lattice structure is
unimportant:

N Σ_j Γ_{0j} ∼ N ∫ dr Γ(r) r^{d−1} ∼ χ_T .   (1.42)
Hence, for the susceptibility to diverge, one needs to require η < 2. Overall,
a divergent susceptibility (a macroscopic quantity) implies divergence also in
the fluctuations of magnetization (a microscopic property).
1945 (Fig. 1.11). The diagram shown is the actually measured version of
the one in Fig. 1.8. When rescaling temperature by the respective critical
temperature of each of the fluids (T/T_c), and similarly for density (ρ/ρ_c), the
measurements all collapse on a single line. This line can be well approximated
by a cubic equation, which requires only a single critical exponent β. This
result shows that, by knowing the critical behavior of one of the fluids, one can
deduce the behavior of all the others, and this holds remarkably far away from
the critical temperature (compare Fig. 1.11)!
In terms of theoretical modeling, the notion that only the symmetry of the
order parameter and the dimensionality of the system matter means that very
simple models can be chosen to describe the critical behavior of systems which,
a priori, entail much more complicated microscopic interactions. This is why
some of these models are so heavily discussed in seemingly unrelated contexts,
even though it often seems that they constitute oversimplifications. Concerning
critical behavior, this is not so.
Several basic models. For a brief overview, we list several common models
along with their critical exponents (Tab. 1.1). Note that the critical exponents
for the mean field system can be thought of as corresponding to four-dimensional
space, i.e. when the exponents for lower dimensions are known,
the mean field exponents in some sense are an “extrapolation” to the next
dimension. In fact, in 4D the mean field critical exponents are exact; d = 4
is therefore sometimes called the upper critical dimension. The mean field
approach does not consider the dimensionality of the problem; only the
coordination number of the lattice, i.e. the number of nearest neighbors, enters into
the calculation. The mean field assumption implies that neighboring spins are
uncorrelated. This assumption becomes more and more reasonable as the
number of nearest neighbors increases with the dimensionality of the problem.
Therefore it seems intuitive that most critical exponents of the 3D Ising model
(numerical solution) agree better with the mean field exponents than those of
the 2D Ising model.
1.3.5 Exercises
1A. Paramagnet +
A paramagnetic solid contains a large number of non-interacting spin-1/2 par-
ticles. This substance is placed in a uniform magnetic field.
Obtain and sketch the magnetization and magnetic response function as well
as the entropy of the paramagnet in the field.
Check appropriately chosen limiting behavior for the external variables. Do
the limits make physical sense?
1B. Paramagnet⁴
A paramagnetic solid contains a large number N of non-interacting, spin-1/2
particles, each of magnetic moment µ on fixed lattice sites. This substance is
placed in a uniform magnetic field H.
(i) Write down an expression for the partition function of the solid, neglecting
lattice vibrations, in terms of x = µH/k_BT.
(ii) Find the magnetization M, the susceptibility χ, and the entropy S of the
paramagnet in the field H.
(iii) Check that your expressions have sensible limiting forms for x ≪ 1 and
x ≫ 1. Describe the microscopic spin configuration in each of these limits.
(iv) Sketch M, χ, and S as a function of x.
(Answers: (i) Z = (2 cosh x)^N; (ii) M = Nµ tanh x, χ = Nµ²/(k_BT cosh² x),
S = Nk_B(ln 2 + ln(cosh x) − x tanh x).)
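These closed forms are easy to sanity-check numerically, e.g. by comparing M = −(∂F/∂H)_T, with F = −k_BT ln Z, against Nµ tanh x via a finite difference (a small sketch with k_B = 1 and illustrative parameter values of our choosing):

    import math

    N, mu, T = 100, 1.0, 2.0      # illustrative values, k_B = 1

    def free_energy(H):
        x = mu * H / T
        return -T * N * math.log(2.0 * math.cosh(x))   # F = -k_B T ln Z

    H, dH = 0.7, 1e-6
    M_numeric = -(free_energy(H + dH) - free_energy(H - dH)) / (2.0 * dH)
    M_exact = N * mu * math.tanh(mu * H / T)
    print(M_numeric, M_exact)     # agree to high accuracy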
2. Critical exponents.⁵ Determine the critical exponents λ for the following
functions as t → 0:
• f(t) = At^{1/2} + Bt^{1/4} + Ct
• f(t) = A ln(exp(1/t⁴) − 1)

H ∼ aM (t + bM²)^θ ,   (1.51)

where 1 < θ < 2 and a, b > 0 near the critical point. Find the exponents β, γ,
and δ and check whether they obey the inequality γ ≥ β(δ − 1) as an equality.
3. Rushbrooke inequality. As the different observables are not independent,
the corresponding critical exponents are also related to one another.
Consider the specific heats at constant field H and constant magnetization M,
respectively:

C_H ≡ T (∂S/∂T)_H ,   (1.52)

C_M ≡ T (∂S/∂T)_M .   (1.53)

Consider now the entropy S = S(T, H) and the total derivative dS. Use the
Maxwell relation (∂S/∂H)_T = (∂M/∂T)_H and the chain rule
(∂z/∂x)_y (∂x/∂y)_z (∂y/∂z)_x = −1 to obtain a relation between the above
observables:

χ_T (C_H − C_M) = T [(∂M/∂T)_H]² .   (1.55)
Using the definitions of the critical exponents for these observables, verify the
Rushbrooke inequality α + 2β + γ ≥ 2.
where ⟨ij⟩ denotes that a sum is to be carried out over all nearest-neighbor
pairs of sites i and j, and J is the coupling between these neighboring sites.
The quantity h represents an external magnetic field which interacts with
the magnetic moment s_i. The magnetization is then defined as the system's
macroscopic magnetic moment M = Σ_i s_i.
[Figure: bond energies on a triangular plaquette. For J > 0 (ferromagnet) all three bonds can be satisfied, E = −3J; for J < 0 (anti-ferromagnet) at most two bonds can be satisfied, E = −J.]
It is easy to verify that the ground state of such systems is far from unique
and that the number of ground states increases with system size. In other
words, the ground state entropy per site (S/N) is finite for an antiferromagnetic
triangular lattice, while it is zero for the ferromagnetic case. This is an
example of ground state entropy (see Exercises).
At high temperatures, spins fluctuate thermally and order is generally
destroyed. The macroscopic magnetic moment will vanish. This phase is referred
to as the paramagnetic phase. Note that the situation already becomes more
complicated when the lattice is not square, i.e. when a simple anti-ferromagnetic
order of the checkerboard type is not possible. Consider a triangular lattice,
where two neighboring sites may have a common neighbor. In this case,
anti-alignment is not consistently possible, a case referred to as a frustrated
spin system. We will however focus primarily on the square lattice geometry
or one-dimensional systems.
To give an overview, whether an analytical solution exists for the Ising
model depends on the dimension of the lattice. In 1D, an analytical solution
exists, which we will discuss in Sec. 1.6. In 2D, Lars Onsager in 1944 obtained
an analytical solution, which is however very technical. In 3D, no analytical
solution exists to date. In 4D, it has been shown that the exact solution is
identical to the mean-field solution, which we will discuss in Sec. 1.5.
Lattice gas. One intriguing variant is that of the lattice gas, where particles
occupy the sites of a lattice, with at most one particle per site.
Figure 1.13: Exact mappings of the Ising model. Lattice gas (left) and
incompressible binary mixture of two chemical species (right); the nearest-neighbor
interaction energies ±ε are indicated.
The goal of the lattice gas model is that the occupation density (average num-
ber of particles per site) can vary at a fixed number of lattice sites. This means
that, as temperature is lowered, one might find that the system spontaneously
chooses a particular configuration of density, such that the respective ther-
modynamic potential is minimized. Since particle number is now not fixed,
one must consider the Gibbs free energy and the grand canonical partition
function (see Sec. 1.1.5). This brings in another Lagrange multiplier, namely
the chemical potential µ, which is conjugate to the total particle number. We
will later see that µ can be associated with the external magnetic field of the
Ising model, and that the grand canonical partition function is associated with
the canonical one in the case of the Ising model. This means that, below Tc ,
small changes in chemical potential can bring about a first order transition
in density, moving between the “liquid” (i.e. density ρliq ) and the “gas” (i.e.
density ρgas ) state abruptly. At temperatures close to Tc but below this value
the system will spontaneously collapse to either ρliq or ρgas .
The fortunate feature of the lattice gas model is that it maps exactly onto
the Ising model, hence, finding a solution for one means having the solution for
the other — there is no further approximation. How does this mapping work?
For each site (“cell”), we define the spin as s_i = +1 (occupied) respectively
s_i = −1 (empty), and set

n_i = (s_i + 1)/2 ,
where ni now measures the number of particles at the site, i.e. either 0 or 1.
For a lattice of N sites, the total number of particles is
N_p = (1/2) Σ_i (s_i + 1) = (1/2) Σ_i s_i + N/2 .
The interaction energy of a bond ⟨ij⟩ is

ε_ij = −ε if s_i = s_j = +1 ,
ε_ij = 0 otherwise.
In its original form the lattice gas model requires a grand canonical ensemble,
since the total number of particles Np may be varied (Sec. 1.1.5). The grand
partition function then reads
Z_G = Σ_{{s}} exp(βµN_p − βE_p) .   (1.57)
where

E_eff = −(ε/4) Σ_{⟨ij⟩} s_i s_j − (µ/2 + εz/4) Σ_i s_i − εzN/8 − µN/2 .   (1.59)

This effective energy now highlights the correspondence between the two models:
• J corresponds to ε/4
• h corresponds to µ/2 + εz/4
• M corresponds to the density ρ ≡ N_p/N
Note that the symmetry of the Ising model regarding ±h is not preserved
regarding occupied/unoccupied sites in the lattice gas model, i.e. there is no
symmetry ρ ↔ (1 − ρ).
Mini Tutorial: What can a lattice gas model teach us about a liquid-gas phase
transition (at least, as a metaphor)?
1.4.3∗ Models related to the Ising model
A number of models are related to the Ising model, although they cannot be
transformed into the Ising model (see Sec. 1.4.2). It is useful to know about
these models, to be able to compare them with the Ising model solution, which
is often known or more easily available (e.g. by a simple computation).
Potts model.
One simple extension of the Ising model are the so-called Potts models, where
s_i can take more than two values, but the energy only changes when neighboring
sites have the same value, i.e.

H = −J Σ_{⟨ij⟩} δ_{s_i − s_j} ,

where δ_x specifies the delta function which is unity when x = 0 and zero
otherwise. Such Potts models can describe e.g. opinion dynamics in a population,
where “agreement” of opinion could cause a negative value of energy.
The spins si could also take vectorial values, such that si · sj would become
the inner product of two vectors. This model is called the Heisenberg model
and describes isotropic magnetic moments in a lattice, i.e. moments that are
not confined to one of the crystal axes. The Heisenberg model can also be
applied to quantum spins, such that the vectors are interpreted as quantum
spin operators ŝi .
Often, more complex lattices are introduced with non-trivial unit cells, e.g.
fcc, bcc, or hexagonal lattices. These introduce further complications, which
are often necessary when describing metals more quantitatively. Realistic de-
scriptions generally also demand inclusion of further-neighbor interactions, be-
yond the range of nearest neighbors.
When the coupling parameter J is made entirely random, then so-called
“glassy” materials can be described, e.g. spin-glasses. Also time-dependent
models are possible, leading altogether away from equilibrium statistical physics.
1. Copy their friends: Buy if your friends buy, sell if your friends sell.
where

m(t) = (1/N) Σ_k s_k   (1.61)

is the average magnetization at time t.
The dynamics proceeds as heat bath dynamics, where one selects a random
site i and sets

s_i(t + 1) = +1   (1.62)

with probability given by the equilibrium expectation

p = exp(β h_i)/[exp(β h_i) + exp(−β h_i)] = 1/[1 + exp(−2β h_i)] ,   (1.63)
and otherwise sets s_i(t + 1) = −1. The model is always considered for
sub-critical β, where the spins tend to align for α = 0. In the absence of global
coupling it will accordingly give large positive or large negative magnetization.
With global coupling α > 0, a switch to negative spin will be favored if the
spin is positive (s_i > 0). The switching probability for this opposing move is
larger if the absolute average spin |m(t)| is large. The α = 0 version of the
model is the two-dimensional Ising model. For larger α the coupling to the
total magnetization makes the individual spin tend to take values opposite to
itself (not to the overall magnetization). Thus, if magnetization deviates
substantially from zero, the individual agents tend to shift all the time,
introducing an increased volatility in the market that slowly tends to drive
the market back.
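A minimal simulation sketch of such a market model is given below. Since the definition of the local field h_i is not fully reproduced above, we assume the Bornholdt-type form h_i = J Σ_{⟨j⟩} s_j − α s_i |m(t)|, which matches the verbal description; treat that choice, and all parameter values, as assumptions:

    import numpy as np

    rng = np.random.default_rng(0)
    L, J, beta, alpha = 32, 1.0, 0.7, 2.0
    s = rng.choice([-1, 1], size=(L, L))
    m_series = []

    for _ in range(500):                 # 500 Monte Carlo updates
        m = s.mean()
        for _ in range(L * L):
            i, j = rng.integers(L, size=2)
            nb = s[(i+1) % L, j] + s[(i-1) % L, j] + s[i, (j+1) % L] + s[i, (j-1) % L]
            # Assumed local field: neighbor coupling minus a global coupling term.
            h_i = J * nb - alpha * s[i, j] * abs(m)
            # Heat bath rule, Eq. 1.63: set the spin to +1 with probability p.
            p = 1.0 / (1.0 + np.exp(-2.0 * beta * h_i))
            s[i, j] = 1 if rng.random() < p else -1
        m_series.append(s.mean())

    m_series = np.array(m_series)
    print("mean |m| =", np.abs(m_series).mean())

Plotting m_series over time should show the interplay described above: stretches of large |m| trigger enhanced switching, i.e. volatile periods.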
1.4.4 Exercises
1. Ground state for simple models. The ground state of a system (stable
state at T = 0) often serves as the starting point for finite temperature inves-
tigations, e.g. the low-temperature expansion technique (later in the course).
This is because it can dominate the partition function, even at T > 0. It is
therefore important to develop some intuition for the ground state of simple
models. Find the ground state for the following systems:
(i) The 1d Ising model with first and second neighbor interactions

H = −J_1 Σ_i s_i s_{i+1} − J_2 Σ_i s_i s_{i+2} ,   (1.64)

where both positive and negative values of the exchange parameters should be
considered.
(ii) The 1d p-state chiral clock model

H = −J Σ_i cos(2π(n_i − n_{i+1} + ∆)/p) ,   (1.65)

with J > 0. Find the ground state energy and a possible representation of it.
3. Volatility model.
(a) In the model in Sec. 1.4.3 we use the heat bath method; therefore first repeat
a simulation of the Ising model for a 10 × 10 system as a function of inverse
temperature β and plot the energy and average magnetization as a function
of temperature. Confirm that it works. (b) Simulate the above Ising-inspired
model for volatility in a market using an N = 10 × 10 system with β = 0.7
and α = 1, respectively α = 2 and 5. Confirm that the volatile periods are
associated with periods where |M(t)| is high (when α ≫ 1).
4. Self-organized criticality.
In self-organized criticality, the system itself controls the tuning parameter,
that is, for the case of the Ising model, the internal state will feed back onto
the temperature. This feedback is performed in a way such that the system
always remains close to the critical value of temperature. Can you find a way to
modify your Monte Carlo simulation of the Ising model, such that the system
becomes self-organized critical?
m = (1/N) Σ_{j=1}^N ⟨s_j⟩ .

Each spin is then written as the mean value plus a fluctuation,

s_i = m + (s_i − m) ,
and, neglecting terms quadratic in the fluctuations, the energy becomes

E_MF = −J Σ_{⟨ij⟩} (−m² + m(s_i + s_j)) − h Σ_i s_i .   (1.69)
We have hence replaced the (microscopic) interaction between each spin and
each neighbor by an average magnetic field, produced jointly by all the
neighbors. Eq. 1.69 can be simplified by noting that −J Σ_{⟨ij⟩} (−m²) = (JNz/2) m²,
where z is the coordination number, i.e. the number of nearest neighbors, of the
lattice; further, Σ_{⟨ij⟩} (s_i + s_j) = z Σ_j s_j, since all spins are equivalent. The
mean field energy then is

E_MF = (JNz/2) m² − (Jzm + h) Σ_{j=1}^N s_j .
Mini Tutorial: Consider the mean field energy above and discuss its depen-
dence on dimensionality and geometry of the lattice.
= exp(−β NJz m²/2) [2 cosh(Jzmβ + hβ)]^N .   (1.71)
It is now straightforward to compute the magnetization per site,

m = (1/N) Σ_{j=1}^N ⟨s_j⟩ = ⟨s⟩ ,

which, for h = 0, yields the self-consistency condition

m = tanh(Jzmβ) .   (1.73)
Inspecting the plot (Fig. 1.5.2), we note that this equation can have either one
or three solutions. In the case of a single solution, only m = 0 is possible. We
further note that the transition to the case of three solutions depends on the
value of Jzβ. Considering that J and z are constants, but β can be varied,
we ask at which value of β the transition occurs. This is easy to find when
noting that the slopes of the two curves shown must align at this specific β.
Hence, we demand that

dm/dm |_{m=0} = 1 = d tanh(Jzmβ)/dm |_{m=0} = Jzβ + O(m²) .

A transition between the single- and three-solution case will hence occur at the
critical β_c ≡ (Jz)⁻¹. Alternatively, the mean field critical temperature is

T_c ≡ Jz/k_B .
Notably, the critical temperature T_c increases with the coupling J and the
number of neighbors z, but does not depend on the dimension or the geometry
of the lattice.

[Figure: graphical solution of the self-consistency condition: the straight line m and the curve tanh(Jzm/k_BT), shown for T < T_c, T = T_c, and T > T_c.]
Mini Tutorial: There are two solutions to Eq. 1.73 with h = 0. What is the
difference between the two? What would happen to these two if h ≠ 0?
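Numerically, the self-consistency condition is easily solved, e.g. by fixed-point iteration (a minimal sketch; z = 4 and the temperature values are illustrative, k_B = 1):

    import math

    def mf_magnetization(T, J=1.0, z=4, tol=1e-12):
        # Solve m = tanh(J*z*m/T) by fixed-point iteration (k_B = 1).
        m = 1.0                    # start from the fully ordered state
        for _ in range(10_000):
            m_new = math.tanh(J * z * m / T)
            if abs(m_new - m) < tol:
                break
            m = m_new
        return m

    # Mean field: T_c = J*z/k_B = 4 for z = 4.
    for T in (2.0, 3.0, 3.9, 4.1, 5.0):
        print(f"T = {T:.1f}   m = {mf_magnetization(T):.4f}")

Below T_c = 4 the iteration settles at a finite m, above it the magnetization collapses to zero, in line with the graphical argument above.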
f_MF/zJ = m²/2 − θ ln 2 − θ ln cosh((m + h′)/θ) .   (1.76)
f_MF/zJ = m²/2 − θ ln 2 − θ ln(1 + m²/(2θ²) + m⁴/(24θ⁴) + O(m⁶/θ⁶)) .   (1.77)
Using the additional expansion ln(1 + x) = x − x²/2 + . . . , the final logarithm in
Eq. 1.77 simplifies to yield

f_MF/zJ = m²/2 − θ ln 2 − θ (m²/(2θ²) − m⁴/(12θ⁴)) + O(m⁶/θ⁶)
        = (m²/2)(1 − 1/θ) − θ ln 2 + m⁴/(12θ³) + O(m⁶/θ⁶) .   (1.78)
We are now in a position to ask for extrema of f_MF regarding m, which
require that we set the derivative ∂f_MF/∂m = 0, hence

(1/zJ) ∂f_MF/∂m = m (1 − 1/θ) + (1/3) m³/θ³ = 0 .   (1.79)
For later use we also compute the second derivative w.r.t. m, namely:

(1/zJ) ∂²f_MF/∂m² = 1 − 1/θ + m²/θ³ .   (1.80)
Apart from the solution m = 0, Eq. 1.79 leads to

m² = 3θ²(1 − θ) = −3θ² t ,   (1.81)

with t ≡ θ − 1. We can now evaluate the second derivative (Eq. 1.80) for any
of the real solutions, and find that for t < 0 the second derivative is positive
for nonzero m and negative for m = 0, while for t > 0 the only solution m = 0
gives a positive value of the second derivative. This finding indicates that
minima of f_MF are expected at nonzero average magnetization for t < 0 and
at vanishing magnetization for t > 0.
Mean-field critical exponent for the zero-field magnetization: m ∼ (−t)^β.
Note that θ² in the vicinity of T_c acts as a constant, since the relevant
variations occur in the variable t (1 − θ, not θ, is the small quantity). Eq. 1.81
delivers the mean field critical exponent β = 1/2, and further allows us to
substitute m² back into the free energy f_MF. This yields
f_MF/zJ = −(3/2) θ (1 − θ)² − θ ln 2 + (3/4) θ (1 − θ)² + higher order terms.

Substituting “solitary” appearances of θ by unity (we are very close to T_c),
we have

f_MF/zJ = −(3/4) (1 − θ)² − θ ln 2 + higher order terms ,

and can check that, for t < 0, the free energy is actually reduced as compared
to the symmetric choice m = 0; hence, free energy is minimized, not maximized,
for our choice of m².
Mean-field critical exponent for the zero-field specific heat: C ∼ |t|^{−α}.
It is also possible to obtain the exponent α corresponding to the mean field
specific heat c_H (in the absence of a magnetic field). We will find that c_H has
a jump discontinuity as T_c is crossed. Since we have already computed the
mean field free energy f_MF, we are now in a position to proceed with the
entropy S and the specific heat c_H as derivatives of the free energy, i.e.

S = −∂f_MF/∂T = −(∂f_MF/∂θ)(∂θ/∂T) = −k_B [(3/2)(1 − θ) − ln 2] .

The specific heat becomes

c_H = T ∂S/∂T = θ ∂S/∂θ = (3/2) k_B θ .

Taking the limit T → T_c is now simply the statement θ = 1, hence c_H = (3/2) k_B.
Since c_H = const, the specific heat critical exponent α_MF = 0.
For T > T_c, m = 0 and the paramagnetic free energy only depends linearly
on temperature (f_MF = −k_BT ln 2), yielding the constant entropy S_para = k_B ln 2.
Hence, also for T > T_c, the specific heat critical exponent α_MF = 0.
The self-consistency condition in a field reads

m = tanh β(Jzm + h) ,   (1.82)

which we expand using the Taylor series

tanh(x) = x − x³/3 + O(x⁵) ,   (1.84)
where the third order term ensures that finite magnetization is possible. Re-
member now that m vanishes for T > Tc , whereas it is finite for T < Tc . It is
thus important to distinguish the former (super-critical) case from the latter
(sub-critical) one.
Sub-critical case: T < T_c. To third order in m, by inserting into Eq. 1.83,
we obtain

m = βJzm − (1/3)(βJzm)³ + hβ − hβ(βJzm)² .   (1.85)
When applying the derivative w.r.t. h on both sides of the equation and
taking the limit h → 0, we have

χ = ∂m/∂h |_{h=0}   (1.86)
  = βJzχ − (βJz)³ m² χ + β − β(βJz)² m² ,   (1.87)
which yields

lim_{t→0, t>0} χ = 1/(k_B T_c t) = 1/(Jz t) .   (1.90)
From Eqs. 1.89 and 1.90 it is now easy to read off the critical exponent γ
corresponding to the temperature dependence of χ near the critical temperature.
In both cases χ ∼ |t|⁻¹, hence γ = 1: χ indeed diverges at T = T_c, and the
scaling of χ near the critical temperature does not depend on the sign of t.
Mini Tutorial: To what extent would the results obtained above differ if the
lattice were a 1D ring?
F ≤ Φ = F_0 + ⟨H − H_0⟩_0 ,   (1.92)

where F is the true free energy, F_0 is the free energy obtained from a trial
Hamiltonian H_0, and ⟨...⟩_0 denotes the expectation value computed in the
ensemble defined by H_0. The trial Hamiltonian H_0 thereby depends on a
parameter h_0, which is then used to minimize Φ. A common choice is to take
H_0 as the free Hamiltonian, i.e.

H_0 ≡ −h_0 Σ_i s_i ;

note the difference between the symbols H_0 and h_0. Using translational
invariance of the lattice (all sites are equivalent), we have

⟨H − H_0⟩_0 = −JzN ⟨s⟩_0²/2 + N h_0 ⟨s⟩_0 .
The approximate free energy then is

Φ(h_0) = −N k_B T ln(2 cosh βh_0) − (JzN/2) tanh²(βh_0) + N h_0 tanh(βh_0) ,

with derivative

∂Φ/∂h_0 = N/(k_B T cosh² βh_0) · (h_0 − Jz tanh βh_0) .

Minimization requires the last factor to vanish, i.e. h_0 = Jz⟨s⟩_0. This gives a
condition for the mean field magnetization (compare Fig. 1.5.2):

⟨s⟩_0 = tanh(βJz⟨s⟩_0) .   (1.93)
Inserting this into the approximate free energy Φ yields the mean field free
energy:

Φ_mf = −N k_B T ln(2 cosh βJz⟨s⟩_0) + (JzN/2) ⟨s⟩_0² .   (1.94)
Finding the critical temperature T_c. Recall that the critical temperature
is defined as the temperature where a transition from a ferromagnetic to a
paramagnetic phase is observed. The expression for the mean field
magnetization (Eq. 1.93) implicitly defines ⟨s⟩_0, even though an analytical solution
does not exist. However, one does not need an explicit expression for ⟨s⟩_0 if one
is only interested in the transition temperature T_c, i.e. where magnetization
just barely becomes finite.
A practical way to obtain this transition is to just plot both sides of Eq. 1.93
as a function of ⟨s⟩_0. The LHS just gives a straight line of slope unity, while
the RHS gives a monotonically increasing concave function whose slope
depends on temperature. The concavity guarantees that, if the slope is less
than unity at ⟨s⟩_0 = 0, there will be no further intersections at positive (or
negative, by symmetry) magnetization.
All we need to do is hence to look for an argument of the tanh that gives

lim_{⟨s⟩_0→0} ∂ tanh(βJz⟨s⟩_0)/∂⟨s⟩_0 ≈ βJz = 1 ,

i.e. the critical temperature T_c becomes

T_c = Jz/k_B ,
a quantity that notably only depends on the coordination number z, that is,
the number of nearest neighbors of each site, but not on the dimensionality of
the lattice.
Now that T_c is known, we can define the dimensionless temperature

t ≡ (T − T_c)/T_c ,

i.e. T = T_c(t + 1) = (Jz/k_B)(t + 1).
How does magnetization scale as we approach the critical point, i.e. what
is the exponent β in M ∼ (−t)^β? Since this still only requires small deviations
from T_c and small values of ⟨s⟩_0, it is sufficient to expand the tanh in a Taylor
series:

⟨s⟩_0 = tanh(βJz⟨s⟩_0) = tanh(⟨s⟩_0/(t + 1)) ,

yielding

⟨s⟩_0 = ⟨s⟩_0/(1 + t) − ⟨s⟩_0³/(3(1 + t)³) + O(⟨s⟩_0⁵/(1 + t)⁵)
      = ⟨s⟩_0 (1 − t) − ⟨s⟩_0³/3 + O(⟨s⟩_0 t², ⟨s⟩_0³ t, ⟨s⟩_0⁴) .
Hence, −t = ⟨s⟩_0²/3, or

⟨s⟩_0 = √3 (−t)^{1/2} ,   (1.95)

i.e. β_mf = 1/2; that is, when temperature is lowered from T_c, the magnitude
of the total magnetization increases as the square root of the temperature
difference T_c − T.
Similarly, we can now evaluate the specific heat critical exponent α in
C_H ∼ |t|^{−α}:

C_H = T (∂S/∂T)_h ,

where the external field h is held fixed and the entropy is S = −∂F/∂T
(exercise). As a result, it is found that

T < T_c :  C_H = (3/2) N k_B + O(t) ,
T > T_c :  C_H = 0 ,
hence, the specific heat has a jump discontinuity at Tc , but is otherwise con-
stant. Therefore, αmf = 0.
It is also possible to compute the critical isotherm exponent δ, which is
defined at t = 0 via h ∼ |M|^δ sgn(M). This requires the addition of a small
magnetic field h to the Ising Hamiltonian:

H = −J Σ_{⟨ij⟩} s_i s_j − h Σ_i s_i ,

hence,

⟨s⟩_0 = tanh(β(Jz⟨s⟩_0 + h)) .

With T = T_c, βJz = 1, hence ⟨s⟩_0 = tanh(⟨s⟩_0 + h/Jz), which can now be
expanded for small ⟨s⟩_0 and h, yielding

⟨s⟩_0 = ⟨s⟩_0 + h/Jz − ⟨s⟩_0³/3 + O(⟨s⟩_0² h, ⟨s⟩_0 h², h³, ⟨s⟩_0⁵) .

Therefore, h ∼ ⟨s⟩_0³ and δ_mf = 3.
In analogous ways, the susceptibility exponent γ in χ_T ∼ |t|^{−γ} can be
computed, yielding γ_mf = 1 (left as an exercise).
Mini tutorial: Based on physical plausibility, could you suggest further phe-
nomenological expressions for the free energy?
dF(m)/dm = 0 = 2ã_2 t m + 4a_4 m³ = m (2ã_2 t + 4a_4 m²) ,

yielding m = 0 and

|m| = (−t ã_2/(2a_4))^{1/2} ,   (1.97)
hence β = 1/2. Notably, this is the same critical exponent which we previously
obtained within the explicit mean field derivation.
The specific heat critical exponent is obtained by differentiating F twice
w.r.t. t. Using Eq. 1.97 in the free energy (Eq. 1.96), we have for t < 0

F = F_0 − ã_2² t²/(4a_4) + O(t³) ,   (1.98)
[Figure: Landau free energy F(m) − F_0 vs. m in the paramagnetic phase (single minimum at m = 0), at the transition, and in the ferromagnetic phase (double well).]
1.5.7 Exercises
1. Specific heat and susceptibility in the mean field approximation.
(reproducing results above)
Follow the steps in Sec. 1.5.4 to obtain the mean field specific heat and
susceptibility near t = 0. Hence, starting from the mean field free energy with
h = 0, expand to fourth order in m and find the minimum in free energy to
obtain the mean field magnetization as a function of temperature (it is useful
to introduce the dimensionless temperature θ ≡ k_BT/(Jz), where J is the
coupling and z the coordination number of the lattice, i.e. the number of
neighbors of a site). The free energy is now only a function of temperature. By
taking the derivative w.r.t. temperature, obtain the entropy S. Differentiating
again w.r.t. temperature, obtain c_H. Discuss the difference of c_H for t > 0
and t < 0 near t = 0. What is the critical exponent α (in c_H ∼ |t|^{−α})?
By using the self-consistency expression

m = tanh(β(Jzm + h))
where the notation {s} means that all configurations of the different s_i are
summed over in Eq. 1.101, J is again the nearest-neighbor coupling and h the
external magnetic field. The idea is now to break down the partition function
into pairs of neighboring spins, yielding

Z_N = Σ_{{s}} exp(βJs_0s_1 + βh(s_0 + s_1)/2) exp(βJs_1s_2 + βh(s_1 + s_2)/2) · · ·   (1.102)
      × exp(βJs_{N−1}s_0 + βh(s_{N−1} + s_0)/2) .   (1.103)
Figure 1.19: 1D Ising model. Each of the N sites i has two nearest neighbors.
Note the periodic boundary conditions, which are enforced by demanding
S_0 = S_N, i.e. the 1D system becomes a closed loop of N sites.
Noticeably, each of the factors in the argument of exp in Eq. 1.103 can take one
of four values, depending on the configuration of the two spins involved. It is
more convenient to collect these four terms as the coefficients of a 2×2 matrix,
the transfer matrix

T = ( e^{β(J+h)}  e^{−βJ} ; e^{−βJ}  e^{β(J−h)} ) ,

whose eigenvalues λ follow from

det(T − λ1) = 0 .

This gives

(e^{β(J+h)} − λ)(e^{β(J−h)} − λ) − e^{−2βJ} = 0 ,

yielding

λ_{1/2} = e^{βJ} cosh βh ± (e^{2βJ} sinh² βh + e^{−2βJ})^{1/2} .
Figure 1.20: Spin-spin correlation function. Spin chain with two spins
separated by R sites; tanh x, with x = βJ, is the correlation function between
two neighboring spins.
The free energy per site follows as

f = −k_B T lim_{N→∞} (1/N) ln Z_N
  = −k_B T lim_{N→∞} (1/N) ln(λ_1^N + λ_2^N)
  = −k_B T lim_{N→∞} (1/N) ln[λ_1^N (1 + (λ_2/λ_1)^N)] .

In the thermodynamic limit, the ratio (λ_2/λ_1)^N → 0 and the free energy per
site is just

f = −k_B T ln λ_1
  = −k_B T ln[e^{βJ} cosh βh + (e^{2βJ} sinh² βh + e^{−2βJ})^{1/2}] .
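The eigenvalue result is easy to cross-check against a brute-force evaluation of Z_N for a small ring (a minimal sketch; parameter values are illustrative, k_B = 1):

    import itertools
    import numpy as np

    J, h, beta, N = 1.0, 0.3, 0.5, 10

    # Transfer matrix of the 1D Ising chain.
    T = np.array([[np.exp(beta * (J + h)), np.exp(-beta * J)],
                  [np.exp(-beta * J),      np.exp(beta * (J - h))]])
    lam2, lam1 = np.linalg.eigvalsh(T)          # ascending order
    Z_tm = lam1**N + lam2**N                    # Z_N = Tr T^N

    # Brute force: sum over all 2^N configurations of the periodic chain.
    Z_bf = 0.0
    for s in itertools.product([-1, 1], repeat=N):
        E = -J * sum(s[i] * s[(i + 1) % N] for i in range(N)) - h * sum(s)
        Z_bf += np.exp(-beta * E)

    print(Z_tm, Z_bf)                           # identical up to rounding
    print("f per site:", -np.log(lam1) / beta)  # thermodynamic limit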
For two neighboring spins, the correlation function (at h = 0) is

Γ(1) = ⟨s_i s_{i+1}⟩ = sinh βJ/cosh βJ = tanh βJ ,
a result that is simply obtained by summing over the two different values of
the bond energy between sites i and i + 1, namely ±J. Now consider two spins
that are separated by a distance of R lattice sites instead,

Γ(R) = ⟨s_i s_{i+R}⟩ ,   (1.105)

and realize that the expectation value in Eq. 1.105 will not change by inserting
products s_j s_j = 1, hence

Γ(R) = ⟨s_i s_{i+1} s_{i+1} s_{i+2} · · · s_{i+R−1} s_{i+R}⟩ .

Note that the expectation value now factorizes into a product of bond
expectation values, when one simply considers the energy of each bond, not the
sites themselves, i.e.

Γ(R) = tanh^R βJ .
Note that as R increases, the correlation falls off exponentially with distance.
There is hence no long-ranged order in the 1D Ising model, which would require
a power-law dependence on distance.
1.6.5 Exercises
1. Pair correlation function in 1D. (repetition of notes above)
The 1D Ising model has the advantage of admitting an exact solution, but has
the disadvantage that it has no finite critical temperature, i.e. T_c = 0.
To see a manifestation of this, consider now the spin-spin correlation function

Γ(1) = ⟨s_i s_{i+1}⟩

for two neighboring spins in the absence of an external magnetic field (h = 0).
Can you compute the correlation function of two spins separated by a distance
R, i.e.

Γ(R) = ⟨s_i s_{i+R}⟩ ,

by making use of Γ(1)?
Show that Γ(R) ∼ p(T)^R with p(T) < 1, i.e. that correlations decay
exponentially at any finite T > 0. What about T = 0?
(Hint: Make use of multiple insertions of unity and think of summations over
bonds rather than sites.)
2A. No finite temperature phase transition in 1D Ising model. +
Using the free energy and the concept of domain walls (Fig. 1.6.5), show that
domain walls are favored at small but finite temperatures in 1D chains, but
disfavored in 2D systems. Argue for a crude lower bound on the critical
temperature in 2D.
2B. No finite temperature phase transition in 1D Ising model.
The free energy of the h = 0 ferromagnetic Ising model for a state of fixed
where B denotes the total number of bonds on the lattice, i.e. for a 2D square
lattice of N sites, B = 2N. Since v = tanh βJ is the parameter that is small at
high temperature, including more orders of v means approaching lower and
lower temperatures.
Mini tutorial: Consider a triangle consisting of three sites and work out Z.
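A brute-force enumeration is a handy way to check your triangle result (a small sketch; it simply sums the Boltzmann weights of all 2³ configurations and compares with the loop-expansion form):

    import itertools
    import math

    K = 0.4                              # K = beta*J, illustrative value
    bonds = [(0, 1), (1, 2), (2, 0)]     # the three bonds of a triangle

    Z = 0.0
    for s in itertools.product([-1, 1], repeat=3):
        Z += math.exp(K * sum(s[i] * s[j] for i, j in bonds))

    # Loop expansion: Z = 2^3 cosh^3(K) (1 + v^3), v = tanh K; the single
    # closed loop of the triangle contributes the v^3 term.
    v = math.tanh(K)
    print(Z, 8 * math.cosh(K)**3 * (1 + v**3))   # the two values coincide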
The counting rests on the identity

Σ_{{s}} s_i^{k_i} · · · s_j^{k_j} = 2^N if all powers k are even ,
                                 = 0 otherwise .   (1.106)

Here, N denotes the total number of spins on the lattice, as before. The result
in Eq. 1.106 means that only closed loops contribute to the sum.
[Figure: graphs on the lattice; open paths contain sites with odd spin powers and give no contribution, while closed loops contribute with the factor 2^N.]
ln(cosh² K) = ln(1 + v² + v⁴ + . . . ) = v² + v⁴/2 + v⁶/3 + v⁸/4 + v¹⁰/5 + O(v¹²) .
In this process, it turns out that terms involving powers greater than linear
in N drop out, as they should, since the free energy is extensive (i.e. ∼ N).
The high-temperature free energy finally is

F = −N k_B T (ln 2 + v² + (3/2) v⁴ + (7/3) v⁶ + (19/4) v⁸ + (61/5) v¹⁰ + O(v¹²)) .
where B = N − 1 is the number of bonds in the open 1D chain. For the
periodic chain (B = N), one closed loop of length N is possible, i.e. a single
contribution arises from v^N, and

Z_1D^closed = 2^N cosh^B βJ (1 + v^N) ,

where the contribution from v^N scales to zero at any finite T in the
thermodynamic limit, meaning that the boundary condition does not alter the result.
Note also that it is straightforward to compute spin-spin correlation functions
⟨s_m s_n⟩ using the same formalism, when considering that the product s_m s_n
just acts as an additional factor in any of the products of bonds in the
expansion.
Writing the sum over graphs as 1 + X, with

X ≡ N v⁴ + 2N v⁶ + (1/2) N(N + 9) v⁸ + 2N(N + 6) v¹⁰ + . . . ,

and making use of the power series ln(1 + X) = X − X²/2 + O(X³) to second order, we
have

ln(1 + X) = X − (N² v⁸/2 + 2N² v¹⁰) + O(v¹²)
          = N v⁴ + 2N v⁶ + (9/2) N v⁸ + 12N v¹⁰ ,

where we note that the two terms quadratic in N have dropped out, hence all remaining
terms are linear in system size (∼ N), as we expect for physical reasons for the free energy.
Putting it all together, we can proceed and write down the free energy.
Figure 1.23: Internal energy for the zero-field triangular lattice.
Comparison of a Monte Carlo simulation (red solid curve), a high-temperature
expansion to 6th order in v = tanh βJ (green dashed curve), and low-temperature
expansions to 12th and 14th order; ⟨E⟩/JN is plotted against k_BT/J.
⟨s_m s_n⟩ = Σ_{{s}} Π_{⟨ij⟩} e^{βJ s_i s_j} s_m s_n / Z
          = [2^N cosh^B βJ Σ_{graphs w. even powers except at m and n} v^{# bonds}] /
            [2^N cosh^B βJ Σ_{all graphs} v^{# bonds}] .
In other words, m and n should be the endpoints of lines on the lattice. In the
1D lattice this is just the path connecting the points m and n, and

⟨s_m s_n⟩ = v^{|m−n|} = exp(−|m − n|/ξ) ,

with the correlation length ξ ≡ −1/(ln tanh βJ). Hence, correlations decay
exponentially in one dimension — a feature that could also be shown using an
explicit solution of the 1D Ising model (refer to Sec. 1.6 for a derivation using
transfer matrices).
For the low-temperature expansion we write instead

Z = e^{−E_0/k_B T} (1 + Σ_{n=1}^∞ ∆Z_N^{(n)}) .

E_0 thereby denotes the ground state energy and the ∆Z_N^{(n)} are all Boltzmann
factors corresponding to excitations relative to the ground state. The label
(n) indicates that n spins were flipped relative to the ground state.
For example, if one bond is anti-aligned, this leads to an energy “cost” of
2J, yielding a Boltzmann factor x = e^{−2J/k_B T} = e^{−2K}. A single spin flip
in a 2D square lattice hence requires a factor x⁴.
When two spins are flipped, one needs to distinguish two cases: either these
particular spins are neighbors, in which case six bonds become anti-aligned
and the energy cost is 12J; or the spins are not neighbors, in which case 8
bonds are anti-aligned and the cost is 16J. Again, one needs to keep track
of multiplicities: in the first case, there are 2N ways to choose neighboring
spins; in the latter, there are N ways to choose the first spin and N − 5
to choose the second, so that the two are not neighbors. To avoid double
counting, an additional factor of 1/2 needs to be applied, yielding N(N − 5)/2
configurations.
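These multiplicities are easily verified by brute force on a small periodic lattice (a sketch; we count pairs of sites on an L×L torus):

    import itertools

    L = 6
    N = L * L
    sites = list(itertools.product(range(L), repeat=2))

    def neighbors(a, b):
        # True if sites a and b are nearest neighbors on the periodic lattice.
        dx = min((a[0] - b[0]) % L, (b[0] - a[0]) % L)
        dy = min((a[1] - b[1]) % L, (b[1] - a[1]) % L)
        return dx + dy == 1

    pairs = list(itertools.combinations(sites, 2))
    nn = sum(1 for a, b in pairs if neighbors(a, b))
    print(nn, "== 2N =", 2 * N)                                 # neighboring pairs
    print(len(pairs) - nn, "== N(N-5)/2 =", N * (N - 5) // 2)   # distant pairs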
Figure 1.25: Energy cost relative to the ground state for a single
spin flip. 2D square lattice Ising ferromagnet without external field; each of
the four broken bonds costs 2J. This corresponds to a total Boltzmann factor
of x⁴ = e^{−8J/k_B T}.
Z = e^{−E_0/k_B T} (1 + N x⁴ + 2N x⁶ + (1/2) N(N + 9) x⁸ + 2N(N + 6) x¹⁰ + O(x¹²)) .   (1.108)
−F/(N k_B T) = (ln Z)/N = −E_0/(N k_B T) + g(x) = ln(2 cosh² K) + g(v) ,   (1.109)

where g(x) and g(v) are infinite power series in their respective arguments.
From other arguments (e.g. a Monte Carlo simulation or the mean field
approximation) we might suspect a (single) critical temperature somewhere
between the lowest and highest temperatures, i.e. T = 0 and T → ∞. If this is
so, then the singular contribution, i.e. that which leads to divergences at T_c,
should match for the two expansions at hand. Since E_0/(N k_B T) and
ln(2 cosh² K) are both perfectly “well-behaved” functions for T > 0, we are not
concerned with these and focus only on the correspondence between g(x) and
g(v). Even without knowing all the terms in these functions, it is possible to
exploit the topological fact that they both contain the same type of terms. If
we can ensure that g(x) = g(v) at some “transition temperature”, located in
between the
Mini tutorial: Think about the duality seen in the above derivation. Would
such a symmetry-based approach also allow calculating Tc for other lattices,
say, the 2D triangular lattice? Give arguments for why you think it would/would
not.
1.7.4 Exercises
A Triangular lattice +
By using a high-temperature expansion to sufficient order, obtain the free
energy for a 2D triangular lattice. Discuss its scaling with system size and
check that it makes physical sense. Consider also the low-temperature
expansion. Compare results for the internal energy with a variant of your Monte
Carlo exercise from Sec. 1.2 by making appropriate plots.
B Triangular lattice
[Figure: renormalization of a 1D chain: a partial trace over every second site, followed by coarse graining, maps the chain of sites 1, 2, 3, … onto a new chain with half as many sites.]
The aim is to find the fixed points of the renormalization procedure, i.e.
the parameter values where

H̃′ = H̃ ≡ H̃* .   (1.111)
Mini tutorial: Remind yourself of the condition for linear stability for a fixed
point of a dynamical system.
The order parameter is now defined as the probability P_∞(p) of any given
site to belong to the “spanning cluster,” that is, the cluster that spans from
one end to the other. Below the percolation transition, that is, when p < p_c,
no such path exists, and the order parameter is zero. You can pick any site,
and it will never be able to belong to the spanning cluster. As the transition is
crossed (p = p_c), the probability changes from zero to finite values, as now at
least some sites will belong to the spanning cluster. As the number of occupied
sites increases further (p > p_c), eventually nearly every site will be part of the
spanning cluster.
[Figure: (a)-(c) blocking transformation: blocks of linear size ba are replaced by single sites, occupied with probability R_b(p).]

[Figure: the rescaling function R_b(p) vs. p on the unit interval; fixed points lie where R_b(p) = p.]
Hence, if one can find the rescaling function T_b(p), one can determine the
critical exponent. Since T_b(p_c) = p_c, Eq. 1.115 can be written more compactly
as

ν = log b / log(|T_b(p) − T_b(p_c)|/|p − p_c|)   (1.116)
  = log b / log(dT_b/dp |_{p_c}) .   (1.117)
The above equations simply transfer all complications to the rescaling function
T_b(p), which does not automatically simplify the problem, since this function
is not easy to determine. To make progress, we here introduce an approximate
rescaling function R_b(p), which has the effect of “coarse graining” all
fluctuations on small length scales. The fixed point p* corresponding to R_b(p)
(R_b(p*) = p*) is not necessarily the same as the exact fixed point p_c, and there
can be many ways to carry out the approximate rescaling R_b(p). The coarse
graining has the effect of “smearing out” all fluctuations on length scales less
than b and simultaneously reduces the number of degrees of freedom in the
system: for a system of dimension d, each rescaling reduces the number of
degrees of freedom by b^d, and the original N degrees of freedom are reduced
to N/b^d. Information is lost upon rescaling and the procedure is therefore not
invertible.
We return to the question of percolation, where the transition to a "spanning cluster" is characterized by a path existing from one side to the other. A reasonable choice for the rescaling R_b(p) could hence be to consider whether a path exists within the "microsystem" of size b × b. In one dimension, the rescaling is very simple: divide the lattice into blocks of size ba, where a is the lattice constant of the lattice before the rescaling operation (Fig. 1.29). All b sites within each block are then replaced by a single block site. Using the "spanning cluster" rule for the coarsening procedure, the block sites are occupied with probability R_b(p) = p^b, since the block is only spanned if every site within it is occupied; any unoccupied site would break the spanning property. Finally, all length scales are reduced by the factor b to make the block size identical to the original lattice spacing. The fixed point equation for the one-dimensional system hence is

R_b(p*) = (p*)^b = p* ,    (1.118)

yielding the two fixed points p* = 0 and p* = 1, which correspond to the empty and entirely occupied lattices. For any initial occupation probability p < 1, repeated rescaling will drive the value of p towards the lower, stable fixed point p* = 0; p* = 1 is the unstable, nontrivial fixed point (Fig. 1.30).
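As a quick numerical illustration (a minimal sketch, not part of the original notes), one can iterate R_b(p) = p^b and watch the flow towards the fixed points:

# Sketch: iterate the 1D rescaling R_b(p) = p**b (here b = 2) and watch any
# p < 1 flow towards the stable fixed point p* = 0, while p = 1 stays put.

def R(p, b=2):
    return p**b

for p0 in (0.99, 0.9, 0.5, 1.0):
    p, flow = p0, []
    for _ in range(6):                 # six successive rescalings
        p = R(p)
        flow.append(round(p, 6))
    print(p0, "->", flow)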
For percolation, it is useful to define the correlation function Γ(r_i, r_j) between two sites r_i and r_j as the probability for the two sites to belong to the same finite cluster. For the one-dimensional system, the correlation function is Γ(r_i, r_j) = Γ(r) = p^r = exp(r ln p) = exp(−r/ξ), where r ≡ |r_i − r_j| is simply the distance between the sites at positions r_i and r_j. The result is easy to see, since each site between r_i and r_j must be occupied, and the probability for this to be the case is p^r. This formula allows us to identify the correlation length as

ξ(p) = −1/ln p .    (1.119)
In the limit of p → 1⁻, this can be expanded as⁷

ξ(p) = −1/ln p = −1/ln(1 − [1 − p]) → (1 − p)^{−1} .    (1.120)
This expansion allows us to read off the critical exponent ν = 1. Conversely, evaluating ξ(p) for the rescaled lattice,

ξ(R_b(p)) = −1/ln R_b(p)    (1.121)
          = −1/ln p^b    (1.122)
          = ξ(p)/b ,    (1.123)

proves that, indeed, the correlation length decreases by the factor b under rescaling, as it should.

⁷ Using the Taylor expansion ln(1 − x) → −x for x → 0.
To determine ν, we compute the derivative of R_b w.r.t. p near the nontrivial fixed point:

dR_b/dp |_{p*=1} = b p^{b−1} |_{p*=1} = b ,    (1.124)

which yields

ν = log b / log( dR_b/dp |_{p*=1} ) = 1 .    (1.125)
two-dimensional lattice Ising model (Sec. 1.2). In the current context of percolation, the actual criterion, which connects to the ability of the lattice to "conduct" a signal from one end to the other, should be reflected more directly in the coarse-graining rule. Consider therefore the following.
[Figure: rescaling functions R_b(p) versus p for the triangular lattice, R_b(p) = p^3 + 3p^2(1 − p), and for the square lattice.]
Given the crude approximation made, this value lies remarkably close to the exact value ν = 4/3 [9, 10]. However, we may ask why the renormalization group transformation is not exact. One reason is that sites that are connected in the original lattice might no longer be connected after renormalizing; this argument also goes the other way around, as the renormalization introduces bonds that are actually not there in the original lattice. There is an analogy to the renormalization of the two-dimensional Ising model, discussed in Sec. 1.8.6.
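As a numerical sketch (assuming the three-site majority/spanning rule with R_b(p) = p^3 + 3p^2(1 − p) and b = √3, as in the figure above), the fixed point and the estimate of ν can be reproduced as follows:

import math

# Sketch, assuming the triangular-lattice rule R_b(p) = p^3 + 3 p^2 (1-p)
# with length rescaling b = sqrt(3).

def R(p):
    return p**3 + 3 * p**2 * (1 - p)

# Find the nontrivial fixed point R(p*) = p* by bisection on (0, 1):
# R(p) - p is negative below the fixed point and positive above it.
lo, hi = 0.01, 0.99
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if R(mid) - mid > 0:
        hi = mid
    else:
        lo = mid
p_star = 0.5 * (lo + hi)

# nu = log b / log R'(p*), with R'(p) = 6 p - 6 p^2.
b = math.sqrt(3)
dR = 6 * p_star - 6 * p_star**2
nu = math.log(b) / math.log(dR)
print(f"p* = {p_star:.4f} (exact: 1/2), nu = {nu:.4f} (exact: 4/3 = 1.3333)")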
When we consider only sites of even index 2i, each of these sites is only connected to odd-index neighbors. The goal is to evaluate the possible configurations for the even sites and perform a partial sum. Doing this, in terms of these three constants, the partition function is

Z(N, K, h) = Σ_{s} [ e^{(K + h/2)(S_1 + S_3) + h} + e^{(−K + h/2)(S_1 + S_3) − h} ] × ... .    (1.130)
The goal is now to put the resulting partition function in a form that resembles the original partition function, but where the even sites are left out. To achieve this, the constants N, K, and h are allowed to be rescaled:

Z(N, K, h) = e^{N g(K,h)} Z(N/2, K', h')
           = e^{N g} Σ_{s} e^{−H'} ,    (1.131)
The equations in Eq. 1.132 are recursive relations and specify the fixed points and flow diagram of the system. The action of an iteration is to remove half the degrees of freedom, by which the number of sites becomes N' = N/b with b = 2. The lattice spacing is increased to a' = b a. Other quantities depend on the lattice spacing and are correspondingly rescaled, e.g. the correlation length ξ' = ξ/b. The spins remaining in the new Hamiltonian interact through the rescaled coupling K' and act under the rescaled field h'. Notably, the renormalization for the 1D Ising model is exact: the resulting coarse-grained Hamiltonian looks exactly like the original, in the sense that no new terms are generated, e.g. interactions between three particles. The only thing that is necessary is to define how the parameters of the system "scale" as the transformation is performed. This exactness is the crucial difference between the 1D and the 2D Ising model. In the latter, such an exact mapping is not possible; the transformation always produces additional terms that are not present in the original Hamiltonian. The additional challenge in 2D hence becomes to discard some of those additional terms, to be able to derive a self-consistent renormalization (Sec. 1.8.6).
x' = x (1 + y)^2 / [(x + y)(1 + xy)] ,
y' = y (x + y) / (1 + xy) ,
z' = z^2 x y^2 / [(x + y)(1 + xy)(1 + y)^2] .    (1.133)
The first two equations do not depend on z, which means that the singular behavior of the free energy does not depend on a shift in the energy scale. Investigating the fixed points in the x-y plane, one first sees that x = 1 is always a fixed point, irrespective of y, i.e. for any 0 ≤ y ≤ 1. These fixed points are infinite-temperature sinks.
Notably, the equations in Eq. 1.133 constitute a dynamical system in the 3D parameter space. The linear stability in the vicinity of any fixed point X* can be assessed by the Jacobian

J ≡ ∂q'/∂q |_{X*} ,    (1.134)

where q is any of the three variables and q' represents the primed variables, i.e. the LHS of Eq. 1.133. At the ferromagnetic fixed point X* = {x*, y*} = {0, 1} (where the continuous phase transition occurs, and in which we are therefore interested), the Jacobian (Eq. 1.134) turns out to already be diagonal, and gives
∂x'(x, y)/∂x |_{X*} = 4 ,
∂y'(x, y)/∂y |_{X*} = 2 ,    (1.135)
hence x' ≈ 4x and ε' ≈ 2ε near the fixed point X*, where ε ≡ y − 1 is a small parameter proportional to the magnetic field near y = 1. As the Jacobian J is diagonal, the coefficients in Eq. 1.135 represent the eigenvalues of the Jacobian. Notably, if we are working in the range where a linearized description of the transformation in Eq. 1.133 is appropriate, i.e. sufficiently close to a fixed point, then a repeated application of the transformation will just lead to an additional scaling of the type in Eq. 1.135. This means that the eigenvalues can be written in the form

λ_i = b^{y_i} ,

where y_i is a coefficient. Notably, for x and ε these coefficients are different: the coupling x scales with y_1 = 2, while the field has y_2 = 1. Hence, as one zooms out of the lattice, the "temperature" variable increases quadratically with the rescaling, while the "field" grows only linearly.
Even though the exact form of the free energy might not be known at T_c, knowing the scaling of f^{(s)} and its parameters we should be in a position to analyze its scaling and that of its derivatives, e.g. the specific heat coefficient at zero magnetic field,

C ∼ ∂²f^{(s)}/∂t² |_{h=0} ≡ f^{(s)}_{tt}(h = 0) ∼ |t|^{−α} .
Why not choose b such that

b = |t|^{−1/y_1} ,

i.e. make the first argument in f^{(s)}_{tt} become a constant? With this choice, the prefactor in Eq. 1.138 becomes a function of t, and one reads off

α = 2 − d/y_1 .

An analogous argument for the derivative with respect to the field gives the magnetization exponent,

β = (d − y_2)/y_1 .

Combining the exponents obtained in this way yields the scaling relation

α + 2β + γ = 2 ,

i.e. the one we obtained from the Rushbrooke inequality in Sec. 1.3, as well as

γ = β(δ − 1) .
Another consequence is that the scaling above and below the critical temper-
ature should be the same, which is easily seen by inspecting Eq. 1.139.
When re-considering the pair-correlation function (Eq. 1.33), one obtains

Γ(r, t) = c²(b) Γ(r/b, b^{y_1} t) ,    (1.140)

where it was used that all spatial scales are diminished by the factor b^{−1} at each renormalization step, and c(b) is some function of the spatial rescaling only, which however remains to be specified. At zero field (h = 0), one can employ a similar "trick" as before, setting b ∼ |t|^{−1/y_1}, and obtains

Γ(r, t) ∼ c²(|t|^{−1/y_1}) Γ(|t|^{1/y_1} r, ±1) ,

therefore the critical exponent is ν = 1/y_1.
To obtain c(b) one can now set both t and h to zero in Eq. 1.140 and remember Eq. 1.36, where

Γ(r) ∼ r^{−(d−2+η)} .    (1.141)

Using our previous equation

χ_T ∼ N ∫ Γ(r) r^{d−1} dr ,    (1.142)
Figure 1.35: RG flow for the 1D Ising model in the plane spanned by x = exp(−4K) (horizontal) and y = exp(−2h) (vertical). The flow always goes towards x = 1, i.e. the limit of K = βJ = 0, hence infinite temperature (T → ∞). The horizontal axis (y = 0) corresponds to h → ∞, and along it the flow is towards increasing h. The vertical axis (x = 0) corresponds to T = 0.
where K ≡ −βJ and h ≡ −βH, and ⟨ij⟩ denotes nearest-neighbor sites i and j. The lattice is now broken down into triangular blocks of three sites each (Fig. 1.8.6), and we define the block spin of each triangle I by a "majority rule"

S_I ≡ sign{S_1^I + S_2^I + S_3^I} .    (1.147)

By this definition of block spins the lattice constant has been enlarged by a factor l = √3.
As a first step, we want to express the original Hamiltonian by a formally exact Hamiltonian using, however, the block spins. For this purpose we first define the collection of spins which constitute one triangle I as σ_I ≡ {S_1^I, S_2^I, S_3^I}. The goal is to approximate the renormalized Hamiltonian H'. To this end, we break H down into the interactions within blocks, H_0, and those between blocks, V,

H = H_0 + V .

The Hamiltonian H_0 is

H_0 = K Σ_I Σ_{i,j∈I} S_i S_j ,
Notably, since for the second factor on the RHS all blocks are independent, this factor can be evaluated to give

Σ_{σ_I} e^{H_0{S_I, σ_I}} = Z_0(K)^M ,
Notably, the value of Z_0(K) only depends on bond configurations, i.e. the relative orientation of neighboring spins; the overall orientation S_I is hence irrelevant. There is only one way to obtain a bond sum of 3K, but three configurations where two bonds are frustrated, i.e. two spins are anti-aligned, each with energy −K, hence

Z_0(K) = 3e^{−K} + e^{3K} .
[Schematic: two neighboring three-spin blocks I and J (spins 1, 2, 3 each); the inter-block interaction couples spin 3 of block J to spins 1 and 2 of block I.]
⟨e^V⟩_0 = ⟨1 + V + V²/2 + ...⟩_0
        = 1 + ⟨V⟩_0 + ⟨V²⟩_0/2 + ... .
Notably, we consider the "perturbation" V to be small, and accordingly we will neglect higher-order terms in V. Using

log(1 + x) = x − x²/2 + O(x³) ,
we have

log⟨e^V⟩_0 = ⟨V⟩_0 + (1/2)[⟨V²⟩_0 − ⟨V⟩_0²] + O(V³) .
Re-exponentiating, we have

⟨e^V⟩_0 = exp( ⟨V⟩_0 + (1/2)[⟨V²⟩_0 − ⟨V⟩_0²] + O(V³) ) ,

and hence

H' = M log Z_0(K) + ⟨V⟩_0 + (1/2)[⟨V²⟩_0 − ⟨V⟩_0²] + O(V³) ,
where

V_IJ = K S_3^J (S_1^I + S_2^I) ,

thus ⟨V_IJ⟩_0 = 2K⟨S_3^J S_1^I⟩_0 (compare the schematic, Fig. 1.8.6). Since H_0 does not couple different blocks, i.e. cannot induce any correlations between spins on different blocks, the expectation value factorizes, giving

⟨V_IJ⟩_0 = 2K⟨S_3^J⟩_0 ⟨S_1^I⟩_0 .
But

⟨S_3^J⟩_0 = (1/Z_0) Σ_{σ_J} S_3^J e^{K[S_1^J S_2^J + S_2^J S_3^J + S_3^J S_1^J]} .

For S_J = 1 we have

⟨S_3^J⟩_0 = (e^{3K} + e^{−K}) / (e^{3K} + 3e^{−K}) ,

while for S_J = −1 we have

⟨S_3^J⟩_0 = −(e^{3K} + e^{−K}) / (e^{3K} + 3e^{−K}) ,
hence the expectation value of V within the unperturbed Hamiltonian becomes

⟨V⟩_0 = 2K Φ(K)² Σ_{⟨IJ⟩} S_I S_J ,

with Φ(K) ≡ (e^{3K} + e^{−K}) / (e^{3K} + 3e^{−K}). In total, the effective Hamiltonian to first order in V is

H'{S_I} = M log Z_0(K) + K' Σ_{⟨IJ⟩} S_I S_J + O(V²) ,

where K' = 2KΦ(K)². We have hence achieved the goal of deriving an RG transformation that allows a rough approximation to the recursion relation for the coupling constant K.
Fixed points and critical exponents. What are the fixed points of the recursion relation we just obtained? Fixed points satisfy

K* = 2K* Φ(K*)² ,

which has three solutions:

K* = 0 ,
K* = ∞ ,
Φ(K*) = 1/√2 .

Using x ≡ exp(4K), the latter relation can be inverted, giving a nontrivial fixed point

K_c = (1/4) log(1 + 2√2) ≈ 0.34 ,

whereas the exact result (Onsager) is K_c = (log 3)/4 ≈ 0.27.
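A minimal numerical check of this recursion (a sketch, not part of the original derivation):

import math

# Sketch: recursion K' = 2 K Phi(K)^2 with Phi(K) = (e^{3K}+e^{-K})/(e^{3K}+3e^{-K}).

def phi(K):
    return (math.exp(3*K) + math.exp(-K)) / (math.exp(3*K) + 3*math.exp(-K))

def K_next(K):
    return 2 * K * phi(K)**2

# Locate the nontrivial fixed point by bisection: K' - K changes sign there.
lo, hi = 0.1, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if K_next(mid) - mid > 0:
        hi = mid
    else:
        lo = mid
print(f"K_c (RG estimate) = {0.5*(lo+hi):.4f}")
print(f"0.25*log(1+2*sqrt(2)) = {0.25*math.log(1 + 2*math.sqrt(2)):.4f}")

# Flow: K below K_c shrinks towards 0 (high T), K above K_c grows (low T).
for K0 in (0.25, 0.45):
    K = K0
    for _ in range(5):
        K = K_next(K)
    print(f"K0 = {K0} -> after 5 steps: {K:.4f}")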
1.8.7 Exercises
1. Real-Space renormalization group transformation on a square
lattice. (compare: Fig. 1.34).
1. Define and outline the procedure of real-space renormalization transfor-
mation applied to site percolation.
3. Find the fixed points for the real-space renormalization group transfor-
mation in the above equation and comment on their nature. What are
the correlation lengths ξ associated with the respective fixed points?
Discuss the concept of flow in p-space associated with the real-space
renormalization group transformation Rb .
2. Scaling relations. Using the scaling form of the singular free energy, f^{(s)}(t, h) = b^{−d} f^{(s)}(b^{y_1} t, b^{y_2} h), where t ≡ (T − T_c)/T_c and h = h_0/k_B T, show that α = 2 − d/y_1, β = (d − y_2)/y_1, γ = (2y_2 − d)/y_1 and δ = y_2/(d − y_2), and hence confirm that α + 2β + γ = 2 and γ = β(δ − 1).
3. Numerical renormalization.
Set T = T_c in your 2D square lattice zero-field Monte Carlo simulation and make the lattice size N sufficiently large (say, N ∼ 100 × 100). For a snapshot of your simulation near equilibrium, perform a "numerical renormalization", where you apply the following majority rule: for any square consisting of 2 × 2 sites, color this square by the majority of spins, i.e. if more are pointing up than down, the "block spin" will point up. For a "tie", choose randomly between up and down for the block spin. The resulting lattice will then only have N/4 sites and represent a zoomed-out version of the original. Repeat this procedure several times and observe the patterns obtained for the various iterations. If you are close to T_c you should observe that patches at different scales remain, even when you rescale several times.
⁸ Yeomans, problem 8.2.
Now repeat this exercise for a temperature slightly below T_c, e.g. T = 0.99 T_c. Now you should find that the resulting patterns "flow" towards one of the polarized extremes, either all spins pointing up or all down.
Repeat again for T slightly larger than T_c, say T = 1.01 T_c. Now the result should be that the patterns become random: you will end up with a featureless mix of up and down spins.
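A minimal sketch of the 2 × 2 majority-rule blocking step (assuming the snapshot is stored as a ±1 numpy array of even linear size):

import numpy as np

# Sketch: one majority-rule blocking step for a +/-1 spin snapshot
# (2x2 blocks; ties are broken randomly), as described in the exercise.

def block_spin(spins, rng):
    L = spins.shape[0]                                       # assumes even L
    s = spins.reshape(L//2, 2, L//2, 2).sum(axis=(1, 3))     # block sums in {-4,...,4}
    ties = (s == 0)
    s[ties] = rng.choice([-1, 1], size=ties.sum())           # random tie-breaking
    return np.sign(s).astype(int)

# Example: start from a random (T -> infinity) snapshot and block twice.
rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=(100, 100))
once = block_spin(spins, rng)
twice = block_spin(once, rng)
print(spins.shape, once.shape, twice.shape)   # (100, 100) (50, 50) (25, 25)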
Chapter 2
Scaling: Percolation,
Self-Organization, Fracture
There is nothing insignificant in the world. It all depends on the point of view.
P(a) = 1/e
P(2a) = 1/e^2 → P(2a)/P(a) = 1/e
P(10a) = 1/e^10
P(20a) = 1/e^20 → P(20a)/P(10a) = 1/e^10
and the probability for finding a system of size, say, 100a or 200a becomes very small, the latter exceedingly smaller than the first. In equilibrium, this type of distribution arises when the probability to concentrate one more unit of energy on a degree of freedom is 1/e, and this remains true independently of how much energy one has already concentrated. In any case, the exponential distribution makes unusually large concentrations of energy essentially impossible.
Exponential distributions and systems with characteristic sizes are indeed often seen in the real world. Think for example about clusters of connected identical spins in the Ising model, which will all be of similar size except when the system is near the critical point. This point is critical in the sense that the exponential associated with the binding energy of the ordered state exactly balances the exponential associated with the entropy of the disordered state (compare Fig. 1.6.5 and the corresponding exercise). But this is a special situation, because the temperature has to be tuned to be at the critical point.
However, there also exist many real-world systems that have no characteristic scale or size. Examples include:
• Earthquakes can be very, very large, even if most of the recorded earthquakes are so small that we wouldn't even feel them without sensitive equipment.
• Solar flares can be huge, although most solar activity is fairly limited. (The "Carrington event" in 1859 gave rise to aurora all over the USA, at a level where the illuminated night sky allowed people to read.) The energy distribution is P(E) ∼ 1/E^{1.52...1.65} for solar x-ray bursts [11]. In the same data set, both the duration of solar flares and the time between flares scaled with exponent ∼ −2.
• Most people have a few hundred followers on their internet activity, but some have millions.
• Turbulent liquids can have large vortices with huge velocity gradients, even if most of the liquid is locally laminar.
• Financial crashes can become very big, even if typical day-to-day stock market fluctuations are small.
Figure 2.2: Comparison between a power law with exponent 2, that is, ∼ 1/x², the Poisson distribution (p(x) ∝ a^x/x!) and an exponential distribution p(x) ∝ exp(−x/a). Notice that both the x and y axes are logarithmic.
In many of the above cases the distribution P(k) of sizes k of the relevant observable is a power law:

P(k) = 1/k^γ ,    (2.2)
with an exponent γ of about two. To give some examples: for energy releases in earthquakes γ is about 1.7; for the number of links in networks γ ∼ 2.2; the largest exponent is found in stock market crashes, where it is presumably about γ ∼ 4.5. This type of distribution is shown in Fig. 2.2. The exponent γ = 2 corresponds to the famous Zipf distribution, found for the frequencies of the distinct words used in books. This is also true for the distribution of the number of people in cities: there are half as many cities with more than 2 million inhabitants as cities with more than 1 million.
Often the Zipf distribution is plotted in a slightly more complicated way, using the rank distribution of sizes in a group/population, see Fig. 2.3. In rank distributions, along the x-axis one plots the rank, where 1 is the largest, 2 the second largest, and so forth. On the y-axis one plots the size corresponding to each rank. Thus the rank distribution is the reciprocal of the cumulative distribution (a mirror-symmetric plot along x = y in the log-log plot). The exponent of the Zipf distribution is one divided by the exponent of the cumulative size distribution (i.e., it equals 1/(γ − 1)).
[Figure 2.3: Rank distribution of word frequencies (how many times a word was used in Wikipedia) versus rank, with the most abundant word at rank 1; the classical Zipf distribution corresponds to abundance ∝ 1/rank.]
To prove this, assume first that the distribution is scale free, hence

P(s·k)/P(k) = P(s)/P(1)
⇒ log(P(s·k)/P(1)) = log(P(k)/P(1)) + log(P(s)/P(1))
⇒ f(s·k) = f(s) + f(k) ,

where f(k) = log(P(k)/P(1)) and f(1) = 0. Thus f(k) is proportional to the logarithm of k:

log(P(k)/P(1)) = γ·log(k) ⇒ P(k) ∝ k^γ .    (2.3)

Conversely, if one assumes that n(k) ∝ k^γ, "scale-freeness" is proven by multiplying the argument k by a factor s and observing that the frequency n changes by the same factor s^γ for all values (scales) of k.
To compare a distribution containing a scale (s), say, an exponential, with a scale-free one, compare:

P_scale(k) = exp(−k/s)  with  P_powerlaw(k) = k^{−γ}
P_scale(s) = exp(−1)  with  P_powerlaw(s) = s^{−γ}
P_scale(100·s)/P_scale(50·s) = exp(−50)  with  P_powerlaw(100·s)/P_powerlaw(50·s) = 2^{−γ}

Thus, extreme events are much more likely in the case of power laws than when there is a characteristic scale.
In these lectures we will repeatedly return to scale-free behavior, and most often it is caused by some sort of non-equilibrium dynamics. As an introduction, this chapter will present three very different ways to obtain scale-free behavior:
• Percolation, a static problem with some analogy to the Ising model, with power laws close to a critical point.
• Fragmentation, a sudden process that gives power laws (in fragment sizes).
• Self-organized critical systems, which organize themselves towards a critical state that is only maintained when there is a separation of timescales.
In subsequent chapters we will see other mechanisms that generate power laws, including in particular "rich get richer" and merging processes (both introduced in the network chapter, chapter 3).
Questions:
2.1) Consider the distribution [12, 13, 14]

p(s) ∝ 1/s^τ

as a distribution for wealth in human society (with s larger than or equal to a lower cutoff fortune of unity). Argue that the situation where τ ≤ 2 makes for a fundamentally different society than when τ > 2. (Hint: Consider the contribution to the average wealth.) Notice that, as mentioned, τ = 2 is the famous Zipf distribution observed for example for word frequencies in books [15].
Qlesson: Exponent 2 is special, and it is also the most common in nature.
2.2 Percolation
Percolation deals with the problem of connecting/percolating a path across
a heterogeneous material, which can be thought of as partially insulating,
partially conducting, and the path must be taken through the conducting part.
This type of problem is found within many fields of study, including physics,
geology, epidemics and sociology. Imagine a glass jar filled with beads, some
of which are made of glass and thus insulating, and some are metal and thus
conduct electricity. One may thus ask at which density of metal balls the
mixed system will be able to conduct a current. And one may be interested
in how the conductivity changes as one approaches this critical density. This
and analogous questions are formally addressed by studying percolation.
Let us first consider a simple percolation example on a two-dimensional square lattice (Fig. 2.5). In this simulation we first occupy each site with probability p (conducting) and leave it empty (or insulating) with probability 1 − p. We then allow bonds between all nearest neighbors which are both occupied. This allows us to define clusters, consisting of sites which are directly or indirectly connected by bonds. Each of these clusters is colored with a different color. The cluster size s is defined as the number of sites of equal color. Clearly, for larger values of p the probability of finding larger clusters will increase. In the first exercise session we will repeat the simulation in Fig. 2.5 using python, which in turn will allow us to gain some intuition for this type of problem.
Figure 2.5: Clusters in the site percolation model. In the example shown,
sites are organized on a two-dimensional square lattice and each site is occupied
with probability 0.55. Neighboring sites of equal color belong to the same
cluster. From Malte Sørensen’s lecture notes on percolation, Springer.
Questions:
2.2) Make a two-dimensional percolation program on a square lattice and identify the critical point p_c, or percolation threshold, that is, the probability for conducting sites at which a current could flow across the system. Obtain the cluster size distribution close to this critical point p_c. In matlab a two-dimensional square (matrix) of dimension 100 is generated and plotted by the following sequence of orders: L = 100; r = rand(L, L); p = 0.6; z = r < p; [lw, num] = bwlabel(z, 4); img = label2rgb(lw); image(img). (Hint: p_c = 0.59275 and the cluster size distribution at criticality should be n(s) ∝ 1/s^{187/91}.) It can be useful to either use logarithmic binning or plot the cumulative distribution.
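A python sketch equivalent to the matlab orders above (scipy.ndimage.label uses 4-connectivity by default in 2D, matching bwlabel(z, 4)):

import numpy as np
from scipy.ndimage import label
import matplotlib.pyplot as plt

# Sketch: python analogue of the matlab orders above.
L, p = 100, 0.6
z = np.random.rand(L, L) < p          # occupied sites
lw, num = label(z)                    # cluster labels, 4-connected by default
print(f"{num} clusters")
plt.imshow(lw, cmap="nipy_spectral")  # each cluster gets its own color
plt.show()

# Cluster sizes for the size distribution (label 0 counts the empty sites).
sizes = np.bincount(lw.ravel())[1:]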
Figure 2.6: Bethe lattice with occupied (black) and empty sites (green). The black sites form clusters that may percolate to infinity if the density of the black sites (the probability of being occupied) is sufficiently large (larger than 1/2 in the example shown). The gray-shaded area marks the start of a large cluster. This type of network was introduced by H. A. Bethe, "Statistical theory of super-lattices", Proc. Roy. Soc. London Ser. A 150: 552-575 (1935).
N(≤ x) = 1 + 3·(1 + 2 + 2² + ... + 2^{x−1}) = 1 + 3 Σ_{i=0}^{x−1} 2^i = 3·2^x − 2 ,    (2.4)
where we have used that Σ_{i=0}^{x−1} 2^i = 2^x − 1 (prove that). If one instead considers a lattice in D spatial dimensions, the number of sites within a distance x would scale as

N_D(≤ x) ∝ x^D .    (2.5)

Because 2^x > x^D for large enough x, the Bethe lattice will have a larger dimension than any chosen dimension D. Thus, the Bethe lattice is formally infinite-dimensional, a feature that it shares with most real-world networks.
Mini tutorial:
How many directed paths are there between two points in a Bethe lattice?
Mini tutorial:
When a quantity Q scales as Q ∼ (p_c − p)^{−γ}, does a larger value of γ imply that Q tends to become relatively larger or smaller when one approaches the critical point p_c, that is, when the limit lim_{p→p_c} Q is approached?
First consider the scaling of the mean cluster size S(p) ∝ |p_c − p|^{−γ} to which an occupied site belongs (scaling for p → p_c, with p in the vicinity of p_c, p < p_c). S(p) can be calculated by starting at an occupied site and then adding the contribution from each of the three sub-branches:

S(p) = 1 + 3·T .    (2.7)

Here T is the average contribution from one of the sub-branches. This contribution can be determined self-consistently from

T = p·(1 + 2T) ,    (2.8)

because T only receives a contribution from the first site of the sub-branch if this is occupied (with probability p), multiplied by the contribution from the two subsequent sub-branches when the first site is occupied. Solving for T gives

T = p/(1 − 2p)  for p < p_c = 1/2 .    (2.9)
Inserting into Eq. 2.7, we therefore obtain the mean cluster size as

S(p) = 1 + 3T = (1 + p)/(2(p_c − p)) ∝ (p_c − p)^{−1} ,    (2.10)

which gives the critical exponent γ = 1 for the Bethe lattice. Thus, each time the distance to the critical point is halved, the average cluster size S(p) is doubled.
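A quick Monte Carlo check of Eq. 2.10 (a sketch: clusters are grown branch by branch on the z = 3 Bethe lattice, starting from an occupied root site):

import random

# Sketch: grow clusters on the z = 3 Bethe lattice from an occupied root and
# compare the mean cluster size with S(p) = (1+p)/(1-2p) from Eq. 2.10.
rng = random.Random(1)

def cluster_size(p, cap=10**6):
    size, branches = 1, 3           # occupied root with 3 sub-branches to explore
    while branches and size < cap:
        branches -= 1
        if rng.random() < p:        # first site of this sub-branch is occupied ...
            size += 1
            branches += 2           # ... and opens two further sub-branches
    return size

for p in (0.3, 0.4, 0.45):
    n = 20000
    mean = sum(cluster_size(p) for _ in range(n)) / n
    print(f"p = {p}: simulated S = {mean:.2f}, theory (1+p)/(1-2p) = {(1+p)/(1-2*p):.2f}")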
The cluster size distribution n_s(p) can be written as

n_s(p) = Σ_t g_{s,t} p^s (1 − p)^t ,    (2.11)

where g_{s,t} is the number of different lattice configurations with size s and perimeter number t. The sum runs over all perimeter sizes t and weights each by the corresponding number of different configurations g_{s,t}.
For a Bethe lattice with z = 3, the perimeter number is t = 2 + s. This may be seen by induction: start with a cluster of size s = 1 that obviously has three perimeter sites (t = 3). For each added site, one loses this site's contribution to the perimeter, but gains two new perimeter sites further out in the lattice. Thus, each site added to the cluster will yield a net contribution of one added perimeter site, and

t = 2 + s .    (2.12)

Accordingly, the configuration count g_{s,t} in the above sum only contains non-zero values when t = 2 + s:

n_s(p) = g_s p^s (1 − p)^{2+s} .    (2.13)
We here want to use a Taylor expansion around the critical point p_c = 1/2 to obtain the leading-order contribution in the small parameter p_c − p (compare Sec. 1.3.4). Denote the argument of the natural logarithm in Eq. 2.14 as f(p) ≡ 4p − 4p². Note that f(p_c) = 1, df/dp(p_c) = 0 and d²f/dp² = −8, giving f(p) ≈ 1 − 4·(p − p_c)². We therefore get

n_s(p)/n_s(p_c) ∼ ((1 − p)/(1 − p_c))² · exp[s·ln(1 − 4·(p − p_c)²)]
              ∼ ((1 − p)/(1 − p_c))² · exp(−s/s_Δ) .    (2.14)

Thus

s_Δ ≡ −1/ln(1 − 4·(p − p_c)²) ∼ (1/4)·(p − p_c)^{−2} ∝ |p − p_c|^{−1/σ} .    (2.15)
We refer to s∆ as the “cut-off cluster size” for p ∼ pc , as s∆ sets a scale in the
exponential in Eq. 2.14. Cluster sizes in excess of s∆ will virtually never be
observed. The exponent σ = 1/2 characterizes the scaling of the cluster size
cut-off for percolation in the Bethe lattice for p in the vicinity of pc .
To summarize, we have learned that the mean cluster size S(p) increases proportionally to 1/(p_c − p) near the critical point, whereas the cut-off cluster size s_Δ increases much faster, namely as 1/(p_c − p)². As p_c − p decreases, the maximal size of clusters deviates more and more from the average cluster size. One therefore has to come up with a distribution that accommodates a diverging ratio between the maximal and the average size as p gets closer to p_c: if the maximal size divided by the mean is a factor 10 at one value of p_c − p, it will be a factor 100 for a ten times smaller p_c − p. To bridge these diverging scales we need a scale-free distribution of the cluster sizes.
Mini tutorial:
How much easier is it to find a cluster of size s than a cluster of size 1?
While we have now computed the maximum cluster size, we have not yet obtained the shape of the cluster size distribution. To make progress, consider therefore the actual distribution of cluster sizes for p very close to p_c = 1/2, which we assume takes the form

n_s(p) ∝ s^{−τ} · exp(−s/s_Δ) .    (2.16)

We have hence incorporated the cutoff and assume a power law distribution for all cluster sizes below this cutoff. That is, we assume that the only relevant scale is the cutoff scale set by p − p_c.
In other words, we assume that n_s(p_c) is proportional to 1/s^τ. This power law form is consistent with the fact that clusters can become very large when p → p_c. Now assume the above power law with cutoff s_Δ. We again consider the average size of a cluster starting from a random site. The chance to select a cluster of size s is proportional to s·n_s(p). When summing over all clusters, the average cluster size becomes:
S(p) ∝ Σ_{s=1}^{∞} s² n_s(p) ∝ ∫_1^∞ s^{2−τ} · exp(−s/s_Δ) · ds
     ≈ s_Δ^{3−τ} ∫_0^∞ z^{2−τ} · exp(−z) · dz = const · s_Δ^{3−τ} ,    (2.17)
where we use the continuous limit because we are anyway concerned with big clusters. Using the cutoff scaling we already deduced,

s_Δ ∝ (p − p_c)^{−2} ,

we get that

S(p) ∝ (p − p_c)^{2τ−6} .

From earlier we know that this should scale as (p − p_c)^{−1} (see Eq. 2.10). Therefore

2τ − 6 = −1 → τ = 5/2 ,    (2.18)

thus obtaining the cluster size distribution:

n_s(p) ∝ (1/s^{5/2}) · exp(−s·|p − p_c|²) .    (2.19)
The above procedure, deducing relations between the cluster size exponent τ and the two exponents for, respectively, the average size (γ = 1) and the cut-off size (1/σ = 2), can be generalized to percolation clusters in two or three dimensions. Then

τ = 3 − σ·γ    (2.20)

is obtained. In general, the scaling of cluster sizes (Eq. 2.19) is one example of a power law distribution augmented by a cut-off function that defines the behaviour at and above a certain scale, set by the distance from the chosen p to the critical p_c. The shape of the cutoff function f could be determined numerically.
Mini tutorial:
What is the cluster size distribution when selecting random points and then counting the cluster sizes associated with each of these points? (That is, argue why there is an additional factor s for the hereby selected cluster sizes.)
Mini tutorial:
Map the cluster size distribution for the critical Bethe lattice to the first return of a random walker.
[Figure: strength P(p) = p·(1 − (1/p − 1)³) of the infinite cluster versus occupation probability p (left, linear axes) and versus p − 0.5 (right, double-logarithmic axes).]
Figure 2.7: Scaling for the percolation problem. Strength P(p) of the infinite cluster in the Bethe lattice with z = 3, where P is the fraction of sites contained in the infinite cluster and p the occupation probability. The right-hand side shows a typical scaling plot (note the double-logarithmic axes), allowing one to extract the behavior as one moves very close to p_c = 0.5 (for question 2.2).
Beyond the critical exponents discussed above, there are further critical exponents in percolation, most importantly the correlation length exponent ν. The correlation function g(r) is the probability that one occupied point is within the same cluster as another point at linear distance r,

g(r) ∝ exp(−r/R_corr) ,

where, for pedagogical reasons, "exp" is used as a name for a function with a scale (in fact the actual cutoff function has another form). The correlation length scales as

R_corr ∝ |p − p_c|^{−ν} .    (2.24)

The exponent is ν = 4/3 for two-dimensional percolation, implying that the linear dimension across the largest cluster grows quite fast as one approaches the critical point from below.
Exponent | Ising 2-dim | Perc. 2-dim | Perc. 3-dim | Bethe lattice
γ        | 7/4         | 43/18       | 1.8         | 1
β        | 1/8         | 5/36        | 0.41        | 1
σ        |             | 36/91       | 0.44        | 1/2
τ        |             | 187/91      | 2.19        | 5/2
ν        | 1           | 4/3         | 0.88        | 1/2

Table 2.1: Exponents for the Ising model in two dimensions and for percolation in two, three and infinite dimensions (Bethe lattice).
Notice that for the Ising model the exponents are relative to varying |T_c − T|, whereas |p_c − p| is the variable in percolation. In the Ising model the order parameter was the magnetization; in percolation it is the probability to belong to the largest cluster (thereby the order parameter in the Ising model is finite for T < T_c, whereas the order parameter in percolation is > 0 for p > p_c). Chin-Kun Hu (1984) suggested that the Ising model is related to bond percolation with bond probability p = 1 − exp(−2J/k_B T). For the Bethe lattice, ν is assumed to be the same as for high-dimensional percolation; see also the argument by [16].
Questions:
2.3) Go through the following argument associated with the effective order parameter P(p) for percolation on a Bethe lattice. That is, we are now above p_c and want to explore how the infinite cluster gets denser as we move into the high density region at p > p_c. This would be analogous to exploring the order parameter (magnetization) in the Ising model as we lower T below T_c.
The strength of the infinite cluster P(p) is the probability that an arbitrary point belongs to it. The critical exponent β is defined by P(p) ∝ |p − p_c|^β for p close to but above p_c.
For p > p_c the largest cluster spans across the system and P(p) is finite (that is, larger than zero). We want to calculate how P(p) vanishes as one approaches the critical point p_c from above.
M(l) ∝ l^D    (2.26)
Let us now return to the percolation problem and the largest cluster, discussed in the previous section. If we were to describe the dimension of this largest cluster, we might investigate larger and larger lattices and explore the density of the percolating cluster. That is, count how the fraction of lattice sites it occupies decreases as the lattice area is increased (since the dimension of the cluster is smaller than 2, the density will decrease).
For real-world fractals, one measures the object with a smaller and smaller "measuring stick" ε → 0. If the measured size diverges as the measuring stick gets smaller, the object is a fractal. Or, more precisely, if the number of boxes needed to cover the object grows with some non-integer power law as a function of 1/ε, then it is a fractal.
Figure 2.9: Koch curve. Illustration of how the length of the line increases as one considers finer and finer detail: when the "measuring stick" is three times smaller, the total length is four times larger. Thus the dimension is D = −ln(4)/ln(1/3) = ln 4/ln 3 ≈ 1.26.
Consider now percolation in two dimensions. The mass of the largest cluster becomes larger as we approach p_c, scaling with the distance p_c − p to the critical point as

M ∝ (p_c − p)^{−1/σ} .    (2.29)

Similarly, the linear dimension of the largest cluster is given by its correlation length,

l ∼ R ∝ (p_c − p)^{−ν} ⇒ (p_c − p) ∝ l^{−1/ν} .    (2.30)

Thus we get the mass of the largest cluster in terms of its linear extension:

M ∝ l^{1/(νσ)} .    (2.31)
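A sketch for measuring this fractal dimension numerically: box-count the largest cluster at p_c (the exact two-dimensional value is D = 1/(νσ) = 91/48 ≈ 1.9):

import numpy as np
from scipy.ndimage import label

# Sketch: estimate the fractal dimension of the largest cluster at p_c
# by box counting (exact 2D value: D = 91/48 ~ 1.896).
rng = np.random.default_rng(2)
L, pc = 512, 0.59275
z = rng.random((L, L)) < pc
lw, num = label(z)
largest = (lw == np.argmax(np.bincount(lw.ravel())[1:]) + 1)

for b in (2, 4, 8, 16, 32):
    # number of b x b boxes containing at least one site of the cluster
    boxes = largest.reshape(L//b, b, L//b, b).any(axis=(1, 3)).sum()
    print(f"b = {b:2d}: N(b) = {boxes}")
# A fit of log N(b) versus log(1/b) gives a slope close to D ~ 1.9.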
Mini tutorial:
What is the density of the percolating cluster at pc in an infinitely large two-
dimensional system?
The mass of the largest cluster M connects the exponent β to the dimension of the largest cluster. That is, the mass should be calculated for distances up to the size set by the correlation length, with densities deduced from the exponent β. Thus

M ∝ l^{d − β/ν} , i.e. D = d − β/ν .
Mini tutorial:
Given that β = 5/36 and σ = 36/91 for two-dimensional percolation, calculate
ν.
carry a current if one applied a voltage drop over the full cluster; this sub-part (called the "backbone") would have dimension D_backbone = 1.13 (see S. Havlin's lecture notes). An even smaller subset of the largest cluster consists of the sites along this backbone that, if broken, would break the whole infinite cluster up. These are called red bonds, and they have dimension less than 1.²
Mini tutorial: Assume the 1.5 dimensional coastline of Norway comes about as
the intersection of a 2 dimensional water surface with a mountain range with
2 1
The number of these so-called red bonds scales as nred ∝ p−p c
f or p > pc (Coniglio,
−ν
1982) and thus scale with correlation length R ∝ (p − pc ) as nred ∝ R1/ν , i.e. Dred =
1/ν = 3/4 for two-dimensional percolation.
³ In general a colored random walk in one dimension with Hurst exponent H (see the econophysics chapter) has dimension D = 2 − H. Thus, a walk with Hurst exponent 1 corresponds to a ballistic walk and has dimension D = 1. A walk with pink noise (1/f power spectrum) has Hurst exponent H = 0 and dimension D = 2.
2.3.2 Fragmentation
Let us now consider the above equations in light of a simple model for fragmentation. In some approximation this can be rephrased as an initial impact creating a lot of cracks, followed by subsequent merging of straight cracks, reflecting the fact that when two cracks meet, the first crack stops the propagation of the second one. This is illustrated in Fig. 2.10, with similar dynamics in some cellular automata models, see Fig. 2.11.
Imagine that a two-dimensional square object is excited at one of its 1-dimensional surfaces, and cracks spread inwards. That is, when a pane of glass hits the floor, a lot of cracks start along the impact surface. When two cracks meet, one will have arrived first and the later one will be annihilated (it cannot crack across the other crack, because the glass is disconnected). We now want to calculate the resulting size distribution of fragments.
where we assumed that only the upper end of the integral counted, i.e. that τ < 2. If τ is larger, one instead has to focus on the lower end of the integral.
Noticeably, meteors are distributed with a power law of about 1/M^{1.8}, not far from the above fragmentation exponent. To put this in perspective, the probability that Earth is hit by a meteor of mass larger than M scales as P(> M) ∝ 1/M^{0.8}. The exponent then implies that we should expect a meteor of more than 10% of the diameter of the famous 10 km diameter meteor from Yucatan every ∼ 64,000,000 · (1/1000)^{0.8} ≈ 250,000 years or so (assuming that the meteor 64 million years ago was a typical event on that timescale). Notice that the estimate uses the cumulative distribution.
Mini tutorial: Why should one use the cumulative and not the differential estimate above? How does one use the differential distribution?
Mini tutorial: Estimate how often a meteor of diameter larger than 100 m hits Earth. And larger than 100 km?
Questions:
2.5) Consider dust on a line, with points distributed with dimension D_num = D_dust. Show that the distribution of lengths between the dust follows n(l) ∝ 1/l^{1+D_dust}.
Qlesson: The larger the dimension of the dust, the narrower the distribution of intervals between it. Notice that the average length between dust particles diverges with system size (explain that).
2.6) What is the fractal dimension of the intersection of a line and a two-dimensional percolating cluster at the critical value p_c?
Qlesson: The fractal dimension is obviously (?) smaller than 1. Why?
2.7) Formulate an automaton that would mimic the crack annihilation model above. Simulate it, and calculate the fragment size distribution starting from random crack initiation at one surface. (Hint: Use three numbers, one to give direction.)
Qlesson: Cellular automata can be used in many problems. Can you find a continuum equation that describes the dynamics of the crack propagation? (I could not.)
2.8) Consider the dimension equation for τ > 2, where the integral in L^{D_num} · ∫_1^{L^D} s · n(s) ds is dominated by the lower limit. Argue for the identity

L^{D_tot} − L^D = L^{D_num} ∫ s^{1−τ} f(s/L^D) ds
dn/dt = d²n(x, t)/dx² + b·n(x, t) − d·n(x, t)² + η(x, t) ,    (2.42)

where the rates of birth (b > 0) and death (d > 0) are both positive, and where the noise term, with ⟨η(x, t)η(x', t')⟩ = n(x, t)·δ(x − x')·δ(t − t'), is uncorrelated in space and time and only takes values where there are already some active sites, n(x, t) > 0. Thus there is no noise if all is dead; only life creates life. The above equation has an absorbing state at n = 0, a spreading of activity through the diffusion term (d²/dx²), and further inhibits replication when the local density becomes large, that is, when d·n² > b·n. It will have a transition analogous to directed percolation for a critical value of b (the replication rate).
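A crude integration sketch of Eq. 2.42 (assuming an Euler scheme with the multiplicative noise approximated as √n times a Gaussian; the parameter values are purely illustrative):

import numpy as np

# Sketch: crude Euler integration of Eq. 2.42 with periodic boundaries.
# The noise amplitude is sqrt(n); n is clipped at zero so the absorbing
# state n = 0 is respected.
rng = np.random.default_rng(3)
Lx, dx, dt, steps = 200, 1.0, 0.01, 20000
b, d = 2.0, 1.0                      # illustrative birth and death rates
n = np.zeros(Lx)
n[Lx // 2] = 1.0                     # activity seeded at a single site

for _ in range(steps):
    lap = (np.roll(n, 1) + np.roll(n, -1) - 2 * n) / dx**2
    noise = np.sqrt(np.maximum(n, 0) * dt / dx) * rng.standard_normal(Lx)
    n = np.maximum(n + dt * (lap + b * n - d * n**2) + noise, 0.0)

print(f"mean activity after {steps} steps: {n.mean():.4f}")
# For b below a critical value the activity dies out; above it, it survives.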
Mini tutorial:
What could be the biological reason for the term −n² in the above equation?
Figure 2.14: Directed percolation on a square lattice; each site can give offspring to itself and its two nearest neighbors.
An exponent β = 0.27 for the infinite cluster means that, when we are at some value p > p_c, the chance that a site is alive within the branching tree of directed percolation scales as (p − p_c)^{0.27}. Thus if one moves p a factor 16 closer to p_c from above, the chance that the site is alive becomes about a factor 2 smaller.
Notice that directed percolation cannot be solved analytically; the above scaling exponents were all obtained by extensive numerical simulations.
Mini tutorial:
What does a smaller β mean in terms of the density of the infinite cluster?
Mini tutorial:
What would the iterated version of rule 255 look like, in the formalism of Fig. 2.15?
Questions:
2.9) Simulate directed percolation in 1+1 dimensions. Estimate the critical value of p = p_c. Consider p < p_c and determine the distributions of the size (number of accumulated live sites) of branching trees starting from a single site at time zero. See how the size distribution changes as p becomes closer to p_c.
Qlesson: Life-death processes are also critical, with a cluster size distribution 1/s^τ.
[Figure 2.15: Rule numbering for the cellular automaton: each local configuration contributes one binary digit, e.g. Rule number = 1·2⁴ + 0·2³ + 0·2² + 1·2¹ + 0·2⁰ = 18. The automaton is iterated in time (downwards) along the space axis.]
arguments for some relevant dimensions. The scaling of the mass m of the infinite cluster up to a correlation length ℓ_∥ ∼ ε^{−ν_∥} is m ∝ ε^{β−ν_∥−ν_⊥} ∼ ℓ_∥^{(ν_∥+ν_⊥−β)/ν_∥}. Thus the dimension counted with a length measured longitudinally is 1 − β/ν_∥ + χ, with χ ≡ ν_⊥/ν_∥. Similarly, the transverse dimension measure is 1 − β/ν_⊥ + 1/χ = 2.33.
Qlesson: This question emphasizes that some exponents can be deduced from the basic correlation length (ν's) and density (β) exponents.
Lessons:
• Power laws are a way to quantify the many scale-free phenomena in our surrounding world.
• This chapter presented two possible schemes for obtaining power laws: fine-tuning p → p_c, in analogy with the Ising model, and the fast process of fragmentation.
Supplementary reading:
Christensen, Kim, and Nicholas R. Moloney. Complexity and criticality. Vol.
1. World Scientific Publishing Company, 2005.
Thus the sum of things is ever being reviewed, and mortals dependent one
upon another. Some nations increase, others diminish, and in a short space
the generations of living creatures are changed and like runners pass on the
torch of life.
– Lucretius, 94 BC - 55 BC
⟨x²⟩ = ⟨ Σ_{i=1}^{t/τ} η(i) Σ_{j=1}^{t/τ} η(j) ⟩
     = Σ_{i,j=1}^{t/τ} ⟨η(i)η(j)⟩
     = Σ_{i=1}^{t/τ} ⟨η(i)η(i)⟩
     = Σ_{i=1}^{t/τ} ℓ² = (t/τ)ℓ² = (ℓ²/τ)·t = 2·D·t ,    (3.2)
where we use that steps at different times are uncorrelated (for example ⟨η(1)η(2)⟩ = ((+1)·(+1) + (+1)·(−1) + (−1)·(+1) + (−1)·(−1))/4 = 0). Here D = ℓ²/(2τ) represents the diffusion constant: up to the factor 2, the step size squared divided by the time τ that it takes to move one step (or a velocity times the length travelled before the velocity is randomized). See Fig. 3.2, where the walkers are walking up or down along the x-axis. Importantly, the dimension of the diffusion constant is

[D] = meter²/second ,    (3.3)

reflecting that it results from multiplying a velocity with the distance travelled between the times when the direction of movement can be changed.
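A quick numerical check (a sketch) of ⟨x²⟩ = 2Dt for unit steps, where D = 1/2:

import numpy as np

# Sketch: many random walkers with step length l = 1 per time tau = 1,
# checking <x^2> = 2 D t with D = l^2 / (2 tau) = 1/2.
rng = np.random.default_rng(4)
walkers, T = 100000, 100
steps = rng.choice([-1, 1], size=(walkers, T))
x = steps.cumsum(axis=1)
for t in (10, 50, 100):
    print(f"t = {t:3d}: <x^2> = {np.mean(x[:, t-1]**2):6.1f}, 2Dt = {t}")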
The current of particles across position x during the time τ is

J = (1/2)[ρ(x − ℓ/2)·(ℓ/τ) − ρ(x + ℓ/2)·(ℓ/τ)]
  = (ℓ/2τ)[(ρ(x) − (ℓ/2)·dρ(x)/dx) − (ρ(x) + (ℓ/2)·dρ(x)/dx)]
  = −D·dρ(x)/dx ,

with D = ℓ²/(2τ). Here the factor 1/2 appears because only half of the particles at position x − ℓ/2 move forward (and only half of the ones at position x + ℓ/2 move backward). The change in density is subsequently given by the difference between what moves in and what moves out:

dρ/dt = −dJ/dx = d/dx (D dρ/dx) .    (3.4)
A particle that starts at x = 0 at time t = 0 will at time t be found at position x with probability

ρ(x, t) = (1/√(4πtD)) · exp(−x²/(4Dt)) ,    (3.5)

with mean square displacement ⟨x²⟩ = 2Dt.¹ The diffusion equation has the property that a Gaussian stays a Gaussian, but with an ever-increasing width.
¹ The diffusion equation can also be derived directly from considering many non-interacting random walkers, each performing steps of length δl = 1 during time δt = 1. Then the density distribution obeys

ρ(x, t+1) − ρ(x, t) = (1/2)ρ(x−1, t) − (1/2)ρ(x, t) − (1/2)ρ(x, t) + (1/2)ρ(x+1, t) ,    (3.6)

where we at position x add and subtract contributions according to the exchange of particles with the neighboring positions. Thus

∂ρ/∂t = D ∂²ρ/∂x² ,    (3.7)

with diffusion constant D = (1/2)δl²/δt (D = 1/2 in the above example with a random walk of unit step and unit time).
Formal proof: Consider the first passage time for a random walker, defined as the time when the first visit to position x occurs, starting at position 0 at t = 0. Characterizing the walk by the diffusion constant D, the distribution First_x(t) of the first passage time t to position x is:

First_x(t) = (x/(√(4πD)·t^{3/2})) · exp(−x²/(4Dt)) .    (3.9)
We here prove this equation based on a derivation that was presented in [19]. Consider the cumulative probability

P_x(t) = ∫_0^t First_x(t') dt' ,    (3.10)

giving the probability that the random walk reached x at least once before the time t. This probability is also equal to the probability that the walk passed x at some time prior to t. Thus, it is equal to the probability that the maximal excursion of the walk in the time interval [0; t] is larger than x.
The distribution of the “max” of the walk up to time t is related to the
distribution of the end-point at time t:
[Figure 3.3: Reflection argument. For each walk that ends at a position > x at time t there are exactly two walks that passed x during [0; t]: the walk itself and its mirror walk, obtained by reflecting the path at x after the first passage. All walks that have passed x have their first passage in [0; t].]
• For each walk that ends at a point y > x, there are exactly two walks that go beyond x at some time before t.
These two paths are respectively the original path and its mirror path, defined as the path that follows the original path until the first passage of x, but thereafter is reflected in x. This is illustrated in Fig. 3.3. Also notice that, because all paths that pass x have a mirror path, there is a one-to-one mapping between all paths that end at a position greater than x at time t and all pairs of paths that pass x somewhere before t. There are no paths that pass x that are not included here. The first of the two paths ends at a position greater than x; the mirror path does not. Therefore
the probability that the random walker exceeds x at some time before t is:

P_x(t) = 2 ∫_x^∞ (1/√(4πDt)) e^{−y²/(4Dt)} dy .    (3.11)
The actual probability that it exceeds x for the first time between t and t + dt is First_x(t) = dP_x(t)/dt. Substituting v ≡ 2Dt·x²/y², we have

dv = −(4Dtx²/y³)·dy ⇒ dy/√t = −(y³/(4Dx² t^{3/2}))·dv = −(x√(D/2))·dv/v^{3/2} ∝ −dv/v^{3/2} .

The minus sign means that the substitution (y → v) leads to a swap of the upper and lower boundaries. Further, the boundary y = ∞ is changed to v = 0, whereas the x-boundary is changed from y = x to v = 2Dt.
First_x(t) ∝ d/dt ∫_0^{2Dt} (dv/v^{3/2})·e^{−x²/(2v)} ∝ (1/t^{3/2})·exp(−x²/(4Dt)) ,

where we differentiated the integral with respect to its upper boundary. Overall, for x ≪ √(4Dt), this provides us with the famous first-return scaling, valid for times large compared to the typical time required to reach x:

First_{x∼0}(t) ∝ 1/t^{3/2} ,    (3.13)

which is also equal to the distribution of times at which the random walk first returns to x = 0. This is then called the distribution of first return.
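The t^{−3/2} law is easy to verify numerically; a sketch (walks not returning within the cutoff are discarded, which slightly biases the largest times):

import numpy as np

# Sketch: sample first-return times of a +/-1 random walk and check the
# t^(-3/2) law of Eq. 3.13 via the survival probability P(t > T) ~ T^(-1/2).
rng = np.random.default_rng(5)

def first_return(max_t=10**4):
    x, t = 0, 0
    while t < max_t:
        x += 1 if rng.random() < 0.5 else -1
        t += 1
        if x == 0:
            return t
    return None                       # censored: did not return in time

times = np.array([t for t in (first_return() for _ in range(10000)) if t is not None])
for T in (10, 100, 1000):
    print(f"P(t > {T}) * sqrt(T) = {np.mean(times > T) * T**0.5:.3f}  (roughly constant)")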
Super-critical chain reaction: each reaction gives rise to > 1 new reaction.
Critical chain reaction: each reaction gives rise to exactly one new reaction.
Sub-critical chain reaction: each reaction gives rise to < 1 new reaction.
Let us define the size s of a branching process as the total number of activated sites which are involved at any time during the process. The probability p(s) of having a tree of size s must fulfil a recursion obtained by partitioning the tree into two sub-trees at the root. We start with one node, s = 1, which can then split into two branches or, with equal probability, terminate. In the latter case we end with one node, so p(1) = 1/2. Summing over all partitions of a tree of size s > 1,

p(s) = (1/2) Σ_{k=1}^{s−1} p(k)·p(s − k) ,    (3.14)

corresponding to all possible sizes k of the left tree, the additional requirement that the corresponding right tree has the remaining size s − k, and the probability 1/2 that the root branches at all.
The top of Fig. 3.5 shows one such possible partition of a total tree. The recursion relation defined by Eq. 3.14 can be solved using generating functions, and we will do so later (at the end of the network chapter). For now, we will instead use the mapping to a random walk of the number of live branches, counted in terms of subsequent branch point decisions (see Fig. 3.5),

p(s) ∝ 1/s^{3/2} ,    (3.15)

as it simply reflects the first return of a random walker.
Noticeably, one may also consider the distribution of survival times for the critical branching trees in Fig. 3.5. This would be different from the size distribution 1/s^{3/2}, because several sites can branch at the same time t (see Fig. 3.5). This distribution would be the survival time distribution of directed percolation in high enough dimensions.
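A sketch of the corresponding simulation: critical binary branching, where each node terminates or produces two offspring with probability 1/2 each, should reproduce P(s > S) ∝ S^{−1/2}, consistent with p(s) ∝ s^{−3/2}:

import random

# Sketch: critical binary branching; each node terminates or branches into
# two offspring with probability 1/2 each. Trees are capped at 10^4 nodes.
rng = random.Random(6)

def tree_size(cap=10**4):
    size, live = 0, 1
    while live and size < cap:
        live -= 1                 # process one live node ...
        size += 1
        if rng.random() < 0.5:    # ... which branches into two offspring
            live += 2
    return size

sizes = [tree_size() for _ in range(20000)]
for S in (10, 100, 1000):
    frac = sum(s > S for s in sizes) / len(sizes)
    print(f"P(s > {S}) * sqrt(S) = {frac * S**0.5:.3f}  (roughly constant)")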
Previous discussions about critical behavior in the Ising model (Sec. 1.4) and in percolation (Sec. 2.2) raise the question of why one should bother with properties at a critical point; after all, these properties hold only around the critical point, which represents a very small part of parameter space. However, there are reasons to believe that parts of nature tend to organize towards a critical point by themselves. Some open, driven systems tend to be pushed towards larger and larger features, until they just marginally start to break down. The canonical model for this type of phenomenon, which is termed self-organized criticality (SOC), was suggested by Bak, Tang and Wiesenfeld in 1987, and is illustrated in Fig. 3.6.
The canonical version of SOC takes place on a two-dimensional square lattice consisting of N = L × L sites. Each site i can take integer values h_i = 0, 1, 2, 3, 4, 5, .... All sites with h_i ≥ 4 are considered unstable and topple simultaneously. When they topple, they distribute one unit to each of their four nearest neighbors. Thus, the sum Σ_i h_i is conserved away from the boundaries. However, any site at the boundary distributes a unit out to imaginary neighbors outside of the system; these units are lost. When all sites i have h_i < 4, a new grain is added at a random site, and the above procedure is repeated.
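A minimal sketch of this sandpile (threshold 4, open boundaries, grain additions followed by relaxation):

import numpy as np

# Sketch: the Bak-Tang-Wiesenfeld sandpile as described above.
rng = np.random.default_rng(7)
L = 50
h = rng.integers(0, 4, size=(L, L))   # start from a stable random configuration

def add_grain_and_relax(h):
    i, j = rng.integers(0, L, size=2)
    h[i, j] += 1
    topplings = 0
    while True:
        unstable = np.argwhere(h >= 4)
        if len(unstable) == 0:
            return topplings
        for i, j in unstable:
            h[i, j] -= 4
            topplings += 1
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < L and 0 <= nj < L:   # grains beyond the edge are lost
                    h[ni, nj] += 1

avalanches = [add_grain_and_relax(h) for _ in range(20000)]
# After a transient, avalanche sizes follow an approximate power law (Eq. 3.16).
print("largest avalanche:", max(avalanches), " mean:", np.mean(avalanches))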
The model can be visualized as the activity in a huge square office, where bureaucrats do exactly nothing unless they get 4 or more assignments. When they get that many assignments, they get frustrated and push the assignments on to their neighbors. Bureaucrats next to windows just throw their assignments out of the window. This version of the model is illustrated in Fig. 3.6.
Mini tutorial:
What would happen to the dynamics if one closed all windows in the bureaucracy model above (i.e. papers bounce back to the sending bureaucrat at the edge)?
The key observable is the sequence of topplings that take place when one new grain is added, until the system has settled (see Fig. 3.7). This constitutes an avalanche, which is measured by the sum of all activity until all sites are below the threshold. The distribution of the sizes of these avalanches turns out to be power law distributed,

P(s) ∝ 1/s^τ , with 1 < τ < 3/2 .    (3.16)
Mini tutorial:
Consider a random walker placed in the center of a line of length L. If it steps
right or left with equal probability, then how many steps typically pass before
it reaches one of the ends?
[Figure 3.7: Sandpile simulation: a time series (panel A) and 50 × 50 lattice snapshots (panels B and C).]
The steady state condition means that, on average, one avalanche has to provide sufficiently many topplings to bring one grain out of the system. If grains are added randomly, the average distance to the boundary is ∝ L, and since grains topple in random directions, the added grain has to participate in ∼ L² steps before reaching the boundary. Therefore, for τ < 2, the steady state implies (inserting the cut-off function at the upper end of the integral)

L² = ∫_0^∞ s·P(s) ds ∼ ∫_0^{L^D} s^{1−τ} ds = L^{D·(2−τ)} ,    (3.18)

and hence 2 = D·(2 − τ).
Mini tutorial:
Given τ = 5/4, what is D for avalanches in the sandpile model?
Notice that if we instead excite the system by adding grains only at the boundary, then on average it only takes L steps for a grain to leave the system (instead of L²). This is because one is much closer to the exit than if grains were added in the bulk. This can be proven by considering the first returns (to the boundary) of added grains:

average exit time from boundary = ∫^{L²} (t·dt/t^{3/2}) ∝ L ,    (3.20)

where the upper boundary is set by the time it takes to cross the system and exit on the other side (when this happens it is not a first return of the random walk, because grains are lost on both sides). The excitation at the boundary would give

1 = D·(2 − τ) ⇒ τ = 2 − 1/D ,    (3.21)

which is substantially steeper than when grains are added in the bulk (large avalanches are less likely).
Directed Sandpile
A solvable version of an SOC model was suggested by Deepak Dhar (Physical Review Letters 63 (16), 1659), who assigned a critical threshold of two and distributed units from any position (x, y) in an L × L lattice to positions (x − 1/2, y + 1) and (x + 1/2, y + 1) (periodic boundaries in the x direction, while y effectively acts as a time coordinate). This model and a corresponding avalanche are illustrated in Fig. 3.9.
The critical state of this directed sandpile model is one in which half of the sites have value zero, the other half have value unity, and these zeroes and ones are randomly distributed. To see this, one first has to realize that each avalanche is compact: any point inside the avalanche will receive two grains and thus topples with certainty at the next step. Also, by nature, no site will topple more than once during an avalanche. Finally, consider for example the edge of the avalanche in the figure. It will expand if the site at the boundary holds one grain; it will contract if that site holds zero grains. Thus, if the probability to hold one grain is exactly 1/2, the boundary will perform a random walk. This is exactly the criterion for a critical avalanche, with a size distribution that becomes power law distributed.
Each avalanche, that is, a set of contiguous sites in space and time, will consist of sites that topple at most once, and further the avalanche area (see the figure) will be “compact” in the sense that there are no islands within the avalanche area that do not topple. In fact, inside the avalanche each site receives two grains and is thus certain to topple. As a consequence, the size distribution of avalanches is given by the random-walk movement of its boundaries (compare Fig. 3.9): when these two boundaries merge, the avalanche terminates. Thus the duration of an avalanche is given by the point where the two random walks meet each other. As the difference between two
Figure 3.9: Directed sandpile. Each site can contain zero, one, or more sand grains. If a site contains two or more grains it topples and delivers one grain to each of the two sites below it (downwards in the figure, see green arrows). The critical state contains zeroes and ones with equal probability 1/2 across the entire x–y space. The figure shows an avalanche that involves all sites between the two outer boundaries (solid points); the avalanche has a duration of 22 (layers) and a size of 64. The boundaries of the avalanche perform a random walk, as highlighted by solid lines.
random walks is again a random walk, the avalanche duration will be power
law distributed as the first return of a random walk (eq. 3.13):
P_{duration}(t) \propto \frac{1}{t^{3/2}} \, ,   (3.22)
and the chance for an avalanche to propagate more than ℓ steps along the y-axis would be P_{duration}(t > ℓ) ∝ 1/ℓ^{1/2}. The size of the avalanche is its length (duration) times its width, which for a random walk gives the size s = ℓ × √ℓ = ℓ^{3/2}. Conversely, an avalanche of size s has length ℓ = s^{2/3}. As a consequence, one obtains a power-law size distribution which, for a finite system, is cut off at large sizes:

P(s) \sim \frac{1}{s^{3/2}} \cdot \exp(-s/N) \, .   (3.24)
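A minimal simulation of the directed model can be sketched as follows (assumptions: width L with periodic boundaries in x, one grain added at a random site, avalanches propagating downward and leaving at y = L; the half-integer offsets of the model are mapped to the two next-row neighbours x ± 1 mod L):

```python
# Sketch of the directed sandpile: threshold two, each toppling sends
# one grain to each of two sites in the next row down.
import numpy as np

rng = np.random.default_rng(0)
L = 100
z = rng.integers(0, 2, (L, L))          # heights 0 or 1: near-critical start

def avalanche():
    size = 0
    x0, y0 = rng.integers(L, size=2)
    incoming = {int(x0): 1}             # grains arriving in the current row
    for y in range(int(y0), L):         # propagate downward; exit at y = L
        nxt = {}
        for x, g in incoming.items():
            z[y, x] += g
            if z[y, x] >= 2:            # threshold two: topple once
                z[y, x] -= 2
                size += 1
                for xn in ((x - 1) % L, (x + 1) % L):
                    nxt[xn] = nxt.get(xn, 0) + 1
        if not nxt:
            break
        incoming = nxt
    return size

sizes = [avalanche() for _ in range(50000)]  # then histogram the sizes
```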
Figure 3.10: Life & history in terms of the Ammonite family tree. Reproduced from Ref. [20]. Ammonites lived in water and left a highly diverse fossil record with ∼ 7000 species, from 400 to 66 million years ago. Notice the intermittent dynamics, with calm periods interrupted by coherent extinction/speciation events.
• That large and small events may be associated with similar types of underlying dynamics. If extinctions were always externally driven by events such as asteroid impacts [23], one would expect a peak at the large events.
Figure 3.11: Origination and extinction. The graph shows the existence times of 35,000 genera in the Phanerozoic [21], as visualized by [22]. Every event is quantified by a number of genera, each defining a group of closely related species. The vertical distance from a point to the diagonal measures the residence time of a genus. Notice the many points located close to the diagonal, reflecting the fact that most genera exist for less than the overall average of about 30 million years. Notice also the division of life before and after the Permian extinction 250 million years before present.
Mini Tutorial: Draw ten real numbers from a uniform distribution between zero
and one. Eliminate the smallest and replace it with another number drawn from the same distribution. What functional form would the final distribution converge towards? Notice that it in fact does not matter from which distribution we draw the numbers.
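A few lines of code settle the question numerically (a sketch of the toy process just described):

```python
# Sketch: repeatedly replace the smallest of ten uniform numbers.
import numpy as np

rng = np.random.default_rng(0)
B = rng.random(10)
for _ in range(100000):
    B[np.argmin(B)] = rng.random()   # eliminate the minimum, redraw
print(np.sort(B))   # all ten values end up crowded just below 1
```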
Bak-Sneppen model: For simplicity, let us first assume that the numbers B_i, i = 1, 2, ..., N are placed on a line, mimicking a one-dimensional model ecosystem. At each time step one changes the least stable of these species. As stability is defined within the context of a given species, the fitness of a given species is a function of the species it interacts with, and accordingly the neighbor species will also change their stabilities (B values). The co-evolutionary updating rule for the agent-based model then reads [25, 26]:

• At each step, the smallest of the {B_i}, i = 1, ..., N, is identified. For this site as well as its nearest neighbors one replaces the B_i's by new random numbers in [0, 1].
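A minimal sketch of this update rule might look as follows (assumptions: periodic boundaries and N = 100 species):

```python
# Sketch of the Bak-Sneppen update: replace the minimal B and its
# two nearest neighbors by fresh uniform random numbers.
import numpy as np

rng = np.random.default_rng(0)
N = 100
B = rng.random(N)                    # initial stabilities

for t in range(1000000):
    i = int(np.argmin(B))            # least stable species
    for j in ((i - 1) % N, i, (i + 1) % N):
        B[j] = rng.random()          # co-evolutionary replacement
# After a transient, B_min stays below a self-organized threshold B_c
```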
[Figure 3.13: A) barrier values B_i versus species number at times 15,000 and 15,100; B) space-time map of the minimal-B site (species number versus time, 15,000–25,000).]
the system evolves. One observes highly correlated activity, where sites that are updated are close to sites that were updated in the previous timestep.
The right panel of Fig. 3.13 shows a “space-time” map of the minimal-B sites in the time interval considered. Whenever the lowest barrier is found among the three sites that were updated at and around the previous minimum, the active site performs a random walk. The figure shows that such small steps are what happens most frequently. When the site of lowest barrier value moves by more than one lattice spacing, it most frequently backtracks in subsequent updates. Importantly, activity tends to stay localized and form a sequence of changes in the same region of the model ecosystem. Thereby evolution is reinforced locally, bridging punctuated equilibrium in single-species evolution [27, 28] to larger-scale evolution and the origination of new taxonomic groups [29, 30].
Punctuated equilibrium is a concept from paleontology coined by Gould and Eldredge, stating that most changes take place on such fast timescales that one often does not find intermediates between two species, one of which evolved from the other. On larger scales, quantum evolution, coined by Simpson, refers to the larger-scale punctuations observed when one paleontological period terminates and is replaced by another one with substantial differences in species composition.
That is, the distribution of “species” with B < B_c sits on a fractal in both space and time. This “attractor” state is therefore critical, and the algorithm is one of a class of models that let a system self-organize towards such criticality.
[Figure: the barrier distribution P(B) at time 0 (uniform) and after long times (step function at B_c = 0.5).]
Figure 3.15: Evolution of the number of sites with B < 1/2 in the random-neighbor version of the Bak-Sneppen model. At any timestep there is equal probability to increase or decrease the number of active species a, thus defining a random walk of this number. The first return of the random walk to zero defines an avalanche, which terminates when all B_i > 1/2. When this occurs the system is so stable that the subsequent change will occur only rarely, but anywhere along the 1-d ecosystem.
and the other B above B_c (in the infinite-system-size limit, where a vanishing fraction of sites are below B_c). As the two newly assigned B's are given uniform random values in [0, 1], the condition for a statistically stationary number of species in the interval [0, B_c] is:

-1 + 2 B_c = 0 \;\Rightarrow\; B_c = \frac{1}{2}   (3.25)
Notice that one could in principle select a B_min slightly above B_c, i.e. B = B_c + 1/N. But then the chance to again select a subsequent one above this number would be smaller than 1/2, and the chance to select a B_min substantially above B_c decays exponentially with both system size and distance from B_c.
The time series of the minimal B exhibits correlations. An avalanche is defined as the number of steps s between two subsequent selections of a minimal B > B_t. The number n of B's below B_t = B_c performs a random walk, and the size of the avalanche is determined by the number of updates s before this random walk returns to zero: P(s) ∝ s^{-3/2}. This is the famous distribution of waiting times in the Gambler's Ruin problem.
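This first-return statistics is easy to verify numerically. The sketch below follows only the number n of barriers below B_t = 1/2 in the random-neighbor version: each update removes the minimum (one below-threshold barrier) and adds two fresh uniform numbers.

```python
# Sketch: avalanche sizes as first returns of a random walk in the
# number n of barriers below Bc = 1/2 (random-neighbor version).
import numpy as np

rng = np.random.default_rng(0)

def avalanche_size(cap=10**7):
    n, s = 1, 0
    while n > 0 and s < cap:
        n += -1 + int((rng.random(2) < 0.5).sum())  # -1 plus Binomial(2,1/2)
        s += 1
    return s

sizes = [avalanche_size() for _ in range(20000)]
# a histogram of sizes follows P(s) ~ s^(-3/2), as for the Gambler's Ruin
```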
Here, the final approximation uses the fact that barriers below B_c are scarce.
Apart from integrating small and large extinction events into one combined
framework, the model predicts
Questions:
3.1) The sum of N random numbers selected uniformly between 0 and 1 will provide
a good fit to a Gaussian (for N sufficiently large). Given that we want a Gaussian
with spread 1, what should we select as N . Qlesson: Add 12 such numbers and
subtract 6, then you get a Gaussian with mean zero and standard deviation unity.
This is a handy and simple way to make it.
3.2) Simulate the standard SOC model for a N = 50 × 50 system, and plot the
avalanche size distribution.
Qlesson: See that it actually gives a power law. Notice that it takes time before
large avalanches appear, i.e., there is a long transient.
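For reference, a minimal (unoptimized) sketch of such a simulation, assuming the standard rules (critical height 4, grains added at random sites, open boundaries):

```python
# Sketch of the standard (BTW) sandpile on an N x N lattice.
import numpy as np

rng = np.random.default_rng(0)
N = 50
z = np.zeros((N, N), dtype=int)

def add_grain():
    z[rng.integers(N), rng.integers(N)] += 1
    size = 0
    while True:
        unstable = np.argwhere(z >= 4)
        if len(unstable) == 0:
            return size
        for x, y in unstable:
            z[x, y] -= 4                      # topple: one grain per neighbor
            size += 1
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                if 0 <= x + dx < N and 0 <= y + dy < N:
                    z[x + dx, y + dy] += 1    # grains at the edge fall off

sizes = [add_grain() for _ in range(100000)]  # then histogram the sizes
```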
3.3) After reaching the steady state, then restrict additions to one corner of the
N = 50 × 50 system and plot the avalanche size distribution.
Qlesson: The avalanche distribution gets steeper when only adding in a corner. Try
to explain why.
3.4) Always add grains to position (x, y) = (25, 25) and plot the heights on the lattice after a long time.
Qlesson: A fractal pattern organizes. Try a larger lattice to get a more extended fractal; you can also play with the boundary conditions.
3.5) Simulate the Deepak Dhar sandpile model on an N = 100 × 100 system. Confirm the scaling exponents mentioned in the text.
Qlesson: Observe that the avalanches are compact. Explain why.
3.6) Simulate a one-dimensional sandpile with critical height two and a random redistribution rule (Manna model). That is, at each toppling one distributes two grains, but each is randomly put on the left or the right neighbor (and sometimes out of the system, when a site at either end topples).
Qlesson: There can be critical behavior in a dynamic model in one dimension. This is not possible in equilibrium models; the 1d Ising model, for example, has no critical point.
3.7) Simulate the evolution model for 100 species placed along a line, in a variant of the model where only one of the neighbors is updated at each step. Plot the selected B_min as a function of time, as well as the maximum of all previously selected B_min's. How does the minimum of B change as time progresses toward the steady state (look at the envelope, defined as the maximum over all earlier B_min)?
Qlesson: Self-organization towards the critical attractor can be followed by tracking the maximum of all previous minima.
3.8) Repeat the assessment of the above model, but now simulated with a finite mutation rate µ = 0.05. At each step allow all sites to change with probability p_i ∝ exp(−B_i/µ), and then also update one of the neighbours of each changed site. Plot the space-time evolution of the system. Redo the simulation for µ = 0.03. (Hint: one may speed up the simulation by using an event-driven simulation (Gillespie algorithm, see a later chapter), where one updates one site at a time, selecting the next change as the one with the smallest value of t_i ∝ −ln(ran) · exp(B_i/µ), where ran is a random number between 0 and 1.) Qlesson: Extremal dynamics and self-organized criticality are obtained in the limit of infinitely small mutation rate, corresponding to an extreme separation of timescales.
Lessons:
Supplementary reading:
Christensen, Kim, and Nicholas R. Moloney. Complexity and criticality. Vol.
1. World Scientific Publishing Company, 2005.
Bak, Per. How nature works: the science of self-organized criticality. Springer
Science & Business Media, 2013.
Chapter 4
Networks
4.1 Introduction
4.1.1 When Networks are useful
Networks are a widespread concept in both popular and scientific literature.
Networks are used to characterize the organization of a system of heterogeneous
components (the nodes), each interacting with a small subset of the other
components. These interactions are described by links. Networks are very much about history, as real-life networks evolved through a sequence of events that took place on a much longer time-scale than the dynamical processes taking place on the network. The concept of networks may accordingly be useful for systems with
• Heterogeneity: Systems with distinctly different components
The factor 2 comes about because each link has two ends, contributing with
2 to the connectivities. Thus, 2L is the number of link ends in the network,
distributed among N nodes.
The degree distribution (the probability that a given node has k links to the remaining N − 1 nodes) is the binomial

P(k) = \frac{(N-1)!}{k!\,(N-1-k)!} \cdot p^k \cdot (1-p)^{N-1-k} \, ,   (4.2)

which for large N and small p approaches the Poisson distribution

P(k) = \frac{\langle k \rangle^k}{k!}\, e^{-\langle k \rangle} \, ,   (4.3)

with an average degree

\langle k \rangle = p \cdot (N-1)   (4.4)
Figure 4.2: Basic network definitions. A) Network with nodes and links,
and each node characterized by its degree, which is the number of links associ-
ated to the node. B) The degree distribution n(k), k = 1, 2... for the network
in panel A).
and variance var = ⟨k²⟩ − ⟨k⟩² = ⟨k⟩, and thus ⟨k²⟩ = ⟨k⟩ · (⟨k⟩ + 1), an equation that will be useful in discussions of signal amplification.
Note that when the average degree ⟨k⟩ is high, and thus the spread σ = √var = √⟨k⟩ is much smaller than the mean, the Poisson distribution will approach a Gaussian distribution with mean and variance given by the above equations. If the spread of degree between nodes is comparable to the mean degree, then the Gaussian approximation is poor, in part because it would predict a negative number of links (as it is symmetric w.r.t. its peak).
A network is said to be connected if there exists a path between any pair of nodes in the network, see Fig. 4.4A). The distance between two nodes in a network is defined as the minimal number of links that connects these nodes¹. For a connected network one defines its diameter as the maximum distance between any two nodes. The diameter thus sets an upper scale for distances in the system.
Imagine a disease that spreads from a node in a random network where all nodes have equal connectivity k. One step away, d = 1, there are k new neighbors. A further step away, each of these newly visited nodes gives access to k − 1 new nodes, see Fig. 4.4B). If we ignore overlap, i.e. links between the neighbors of a node, then
¹To calculate the distance from a node i to all other nodes in a connected network, one first makes a list of all the neighbor nodes of i and assigns them a distance d = 1. Subsequently one adds new layers of nodes to the list from all neighbors of already included nodes, provided that these neighbors are not already in the list. The distance to newly added nodes is calculated from the distance of the neighbor nodes already present in the list.
\text{number(nodes within } d) = \sum_{d'=1}^{d} k \cdot (k-1)^{d'-1} \sim k \cdot (k-1)^{d-1} \, .   (4.5)
Therefore the number of visited nodes grows exponentially for any k > 2, see Fig. 4.4. For a more random graph, where the degree k may differ between the nodes, the disease will visit the entire network after a number of iterations d given by the slightly more complicated expression

\left( \frac{\langle (k-1) k \rangle}{\langle k \rangle} \right)^d \approx N \, ,   (4.6)

which we will derive shortly through the “amplification factor” expression below. Importantly, eq. 4.6 takes into account that each subsequent node is reached with a probability proportional to its degree. This is because a node sits at the end of a link, and when we follow a link we are therefore twice as likely to find a node with, say, 10 links as one with 5 links.
We now use the above considerations to estimate the scale at which the signals that amplify and spread from each node start interfering with each other. This scale is set by the point at which the signal has reached a large fraction of the network. Assuming that there is no overlap between the different signaling pathways before this upper scale is reached, we estimate the diameter to be about

\mathrm{Diam} \sim d \sim \frac{\log(N)}{\log\left( \langle k^2 \rangle / \langle k \rangle - 1 \right)} \, .   (4.7)
The main lesson resulting from eq. 4.7 is that the diameter of a random
network only grows very slowly with network size N .
The amplification factor A is given by the weighted average

A = \frac{\int k(k-1)\, n(k)\, dk}{\int k\, n(k)\, dk} = \frac{\langle k(k-1) \rangle}{\langle k \rangle} = \frac{\langle k^2 \rangle}{\langle k \rangle} - 1 \, .   (4.8)
Obviously this equation assumes that signals can spread both ways across a link. Where this is not the case, that is, where signals only transmit one way along each link, the network is directed. Directed networks are not considered in these notes, but they play an important role in, for example, biological regulation or in hierarchical organizations.
Equation 4.8 implicitly assumes that there is no correlation between the connectivity of one node and the connectivity of a neighbor node. When A > 1, “disease-like” signals tend to be exponentially amplified, and will therefore spread across the entire network. For A = 1, on the other hand, perturbations spread marginally: some will spread and others will “die out”.
To have marginal spreading, one input signal should on average lead to one output through a new link. A network with Poisson-distributed degrees and ⟨k⟩ = 1 will have ⟨k²⟩ = 2 and A = 1. Such a network will consist of multiple clusters, with the power-law distribution of cluster sizes shown in Fig. 4.12B). We will return to the amplification factor later, as it is interesting for setting thresholds, e.g. the fraction of vaccinations needed to stop an epidemic. Thus A plays a role in determining the fraction of nodes that need to be susceptible for a disease to transmit, in analogy with the number 2 for the Bethe lattice where each node has 3 neighbors (the percolation threshold is 1/A instead of the 1/2 of that Bethe lattice).
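As a sketch of how one might estimate A in practice (assuming the networkx package is installed), one can read off the degree sequence of a generated network:

```python
# Sketch: amplification factor A = <k^2>/<k> - 1 from a degree sequence.
import networkx as nx
import numpy as np

G = nx.gnp_random_graph(10000, 1.0 / 9999)   # ER network with <k> = 1
k = np.array([d for _, d in G.degree()])
A = (k**2).mean() / k.mean() - 1
print(A)   # close to 1: marginal spreading, cf. the discussion above
```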
Mini Tutorial: Can you give a heuristic argument for why the number of 3-loops should be independent of network size? (There are ∼ N attempts to make loops, but each attempt has only a ∼ 1/N probability to succeed.)
ranks nodes according to how centrally they are placed in the network. There are indications that proteins with high betweenness centrality in molecular networks tend to be more important than proteins at more peripheral network locations [45, 46, 47].
Mini Tutorial: If A is the adjacency matrix for a network without self-interactions, what is on the diagonal of the matrix A · A = A²? What does the trace of A² represent for the network?
cluster). A modular network is one where one allows a few links between clusters, but where links between pairs of nodes in different clusters are much less likely than for pairs that both lie within the same cluster [48, 49, 50, 51]. A modular network corresponds to a nearly block-diagonal matrix, where the blocks along the diagonal are supplemented with a few non-zero entries elsewhere in the matrix.
Transfer matrix: the matrix T with entries T_{ij} = A_{ij}/k_j gives the probability that a random walker sitting on node j steps to node i. For the case where A_{12} = A_{21} = 1 and A_{13} = A_{31} = 1 (node 1 linked to nodes 2 and 3, so k_1 = 2, k_2 = k_3 = 1):

T = \begin{pmatrix} 0 & k_2^{-1} & k_3^{-1} \\ k_1^{-1} & 0 & 0 \\ k_1^{-1} & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 \\ 1/2 & 0 & 0 \\ 1/2 & 0 & 0 \end{pmatrix} \, ,

so that, for example, T (1, 0, 0)^T = (0, k_1^{-1}, k_1^{-1})^T = (0, 1/2, 1/2)^T.
Notice that the matrix representation opens for simple manipulations. The
number of triangles in a non-directed network without self-links is
n(\triangle) = \frac{1}{6} \cdot \mathrm{trace}(A^3) \, ,   (4.12)

where the factor 6 = 2 · 3 comes from “going” around each triangle in the clockwise, respectively counter-clockwise, direction (a factor of 2), and from the 3 contributions associated with the fact that each of the three nodes in the triangle contributes to the count. Please notice that the adjacency matrix in this case should only include links between distinct nodes, and no self-links (trace(A) = 0).
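A toy example (a hypothetical 4-node network) makes both matrix statements concrete:

```python
# Sketch: diag(A^2) holds the degrees; trace(A^3)/6 counts triangles.
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
print(np.diag(A @ A))             # degrees: [2 2 3 1]
print(np.trace(A @ A @ A) // 6)   # 1: the single triangle 0-1-2
```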
Networks can have many features that extend beyond simple connectivity and small loops. One of these is modules: clusters of nodes that are more connected within themselves than between the modules. Examples of modules are shown in Figs. 4.8 and 4.9. Fig. 4.8 shows a network constructed from the transmission of Twitter messages (L. Weng et al., Scientific Reports 2012).
Mini Tutorial: Inspect the CEO network in Fig. 3.8. Is the number of triangle loops larger or smaller than randomly expected?
Notably, such a scale-free degree distribution is far from the Poisson distribution of the previous sub-chapter (eq. 4.3). That is, if one assigns links completely randomly between nodes, one would in practice never obtain a scale-free distribution. Scale-free distributions are beyond simple randomness, although the way they appear does involve randomness. We will discuss various history-dependent processes for obtaining scale-free distributions at the end of this chapter.
Inserting a scale-free degree distribution n(k) ∝ k^{-γ} into eq. 4.8,

A = \frac{\langle k^2 \rangle}{\langle k \rangle} - 1 = \frac{\int_1^N k^2\, dk / k^{\gamma}}{\int_1^N k\, dk / k^{\gamma}} - 1 \sim N^{3-\gamma} \, ,   (4.15)

for γ ∈ ]2; 3[. In that case the denominator becomes independent of the system size N, whereas the numerator increases with N. Thus, for γ < 3, A depends on the upper cut-off of the integral, which represents the node with the highest connectivity.
\kappa = A + 1 = \frac{\langle k^2 \rangle}{\langle k \rangle} = \frac{\mathrm{Var}(k)}{\langle k \rangle} + \langle k \rangle
The Internet
When a random fraction f of the nodes is removed, the amplification factor is reduced by the remaining fraction, 1 − f:

A \rightarrow A' = A \cdot (1-f)   (4.16)

This comes about because each remaining node will lose each of its links with probability f' = f ⁴).
The network remains super-critical when A · (1 − f) > 1, or

(1-f) > \frac{1}{A} = \frac{1}{\langle k^2 \rangle / \langle k \rangle - 1} \, .   (4.17)

That is, the percolation threshold for the network is 1/A: this is the fraction of nodes that need to be conducting to make an infinite cluster.
Conversely, the critical fraction for vaccination against disease spreading [58] is

f_c = 1 - \frac{1}{\langle k^2 \rangle / \langle k \rangle - 1} \, .   (4.18)

This threshold is close to unity for scale-free networks with degree exponent γ < 3, as is also seen in Fig. 4.12A. For narrower degree distributions the critical threshold clearly separates from 1, with the value f_c = 1 − 1/⟨k⟩ for ER networks when one simply uses the Poisson-distribution property ⟨k²⟩ = ⟨k⟩² + ⟨k⟩ ⁵).
⁴) Following a signal that enters a node, see Fig. 4.4, each of its remaining k − 1 links has probability (1 − f) to survive the pruning. Thus the local amplification (k − 1) → (k − 1) · (1 − f), and the global amplification factor A is reduced by the factor (1 − f).
⁵) After a fraction f is removed from an ER network, the fraction F in the largest cluster obeys F = (1 − f) · (1 − e^{−⟨k⟩·F}) [58]: that is, the probability that a given remaining node is not connected to the largest cluster is exp(−⟨k⟩ · F). Therefore, any of the remaining (1 − f) nodes will belong to this cluster with probability (1 − e^{−⟨k⟩·F}). At critical conditions the largest cluster collapses and conforms to the scaling of the other clusters, see Fig. 4.12B.
Questions
4.1) What is the minimal number of links needed to connect 100 nodes in one large component (a collection of nodes that are directly or indirectly connected to each other)? Hint: just think, no equations or simulation needed.
Qlesson: Think about the maximum and minimum number of links in a network, and the feature that most networks are closer to the lower limit (a matrix with mostly zeroes and a few links, i.e. sparse).
4.2) What is the largest diameter one can have in a network with 100 nodes? Hint: just think, no equations or simulation needed.
Qlesson: Think about how to separate nodes from each other while still leaving a path between them. Perhaps networks are not only about maximizing the ease of contact, but also about local protection from nonsense.
4.3) How does the diameter of a network scale with the number of nodes N, when these are organized on square/cubic/... lattices in d dimensions? Consider, for example, 4096 nodes, organized in 1-d, 2-d and 3-d lattices. The nodes are thereby placed on the lattice sites, and links are assigned between nearest neighbors along each of the lattice axes. Determine the diameter of the 4096-node network when instead organized as an Erdös-Rényi network with an average connectivity of 6 (the same number of neighbors as in a 3-dimensional cubic lattice). Convince yourself that an Erdös-Rényi network has infinite dimension. Hint: just think, no equations or simulation needed.
Qlesson: Infinite dimension is easily realized, and nodes are close, but it's easy to get lost anyway.
Figure 4.12: Weight of the largest cluster. A) The fraction of nodes in the largest cluster (F) as a function of the fraction of nodes removed [58]. The blue curve refers to a network with a relatively narrow degree distribution, 1/k^{3.5}, which exhibits a critical threshold similar to ER networks. The red curves show the behavior of networks where ⟨k²⟩ is dominated by the largest hub in the network. The latter case is also simulated for different system sizes, demonstrating that very large networks remain connected until nearly all nodes are removed. B) Cluster size distribution in a critical ER network, P(n) ∝ 1/n^{2.5} [59], implying that a random node will be in a cluster with size distribution n · P(n) ∝ 1/n^{1.5}. This critical distribution is also obtained as one removes random links from a well-connected ER network until A = 1 and ⟨k⟩ = 1.
4.4) Generate Erdös-Rényi networks with p = 3/(N − 1) (3 neighbors per node on average, and not allowing for self-interactions) for N = 10, N = 100 and N = 1000, and count how many triangles there are at the various network sizes. How does the number of triangles change with N for fixed average connectivity (fixed connectivity implies that p decreases with system size)?
4.5) Visualize the above network for N = 100 using, for example, Cytoscape (download from the web), the Python package networkx, or the matlab functions B = graph(A) and plot(B), where A is the adjacency matrix. Cytoscape does not read the matrix, but instead an edge list: a sequence of lines, each containing two nodes that are connected by a link.
Qlesson: Networks can be visually nice.
4.6) Consider a non-directed network which consists of only one large component. Prove that a random walker, after infinitely long time, will visit each node with probability proportional to its degree. Notice that random walks on networks are at the core of search engines such as Google, where however the walkers also make other moves to deal with properties of directed networks (e.g., to not get “trapped” indefinitely). Hint: consider a steady-state flux between two connected nodes with different degrees.
Qlesson: An application of the detailed balance that was introduced with the Metropolis algorithm.
4.7) Construct a network of N = 100 nodes subdivided into 10 different classes with 10 nodes in each. Generate a random network where pairs of nodes from different classes are connected with probability 0.01, and nodes within the same class have probability 0.5 to be connected (remember that the generated matrices have to be symmetric, and that the diagonal elements have to be zero). Calculate the number of loops (= triangles), and compare this with the number of triangles when all links are randomized (i.e., when one distributes the roughly 5 + 1 links per node randomly across the network).
Qlesson: There are many more triangles in the modular network.
4.8) Generate a random network of size N = 100 with 150 links (average degree ⟨k⟩ = 3) and monitor the size of the largest component as nodes are removed one by one. Do the same when removing links one by one, maintaining all nodes.
4.9) Use the following equation for the fraction F of the nodes that remain in the largest cluster of an Erdös-Rényi network after a fraction f is removed [58]:

F = (1-f) \cdot (1 - e^{-\langle k \rangle \cdot F}) \, .   (4.19)

f = f' \cdot \langle k^2 \rangle / \langle k \rangle   (4.20)

of the links. Argue for this equation, and express the vaccination fraction f needed to stop epidemics on a scale-free network with N = 10000 nodes and degree distribution n(k) ∝ k^{−2.5}.
Qlesson: Your neighbor has, on average, more links than you do.
[Figure: link randomization by “switching partners”: the links A–B and C–D are rewired to A–D and C–B, preserving all node degrees.]
all nodes in the network are assigned new random links [63, 64]. For a system with L links, after t link swaps the probability that a given link is unchanged is

\mathrm{Fraction(unchanged\ links)} = \left(1 - \frac{2}{L}\right)^t \approx e^{-2t/L} \, ,   (4.21)

which becomes insignificant when the number of “swaps” t becomes substantially larger than L. Notice that one cannot allow all random swaps: if there is already a link between two nodes, then an attempted assignment of a second link should be aborted. Also note that one may keep the network connected in one big component by simply allowing only “swaps” that maintain overall connectedness.
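In practice such degree-preserving swaps are available in networkx as double_edge_swap; a sketch (assuming networkx is installed):

```python
# Sketch: randomize a network by t >> L pairwise link swaps.
import networkx as nx

G = nx.barabasi_albert_graph(1000, 2)
deg_before = sorted(d for _, d in G.degree())
L = G.number_of_edges()
nx.double_edge_swap(G, nswap=10 * L, max_tries=100 * L)
assert deg_before == sorted(d for _, d in G.degree())  # degrees unchanged
```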
Given an adequately randomized network, or better, a sample of about 1000 independent random networks, the significance of any quantifiable measure Q is given by the probability that a random network has the same value of Q as the real network.
Q could, for example, be the number of short loops, that is, triangles in the network. Alternatively, Q could be the number of links between nodes with connectivity 10 and nodes with connectivity 20.
The excess ratio of a quantity Q is quantified by

R(\mathrm{pattern}) = \frac{N(Q)}{\langle N_{\mathrm{random}}(Q) \rangle} \, ,   (4.22)

where N_random(Q) is the number of times the pattern occurs in the randomized network. Here

\sigma_{\mathrm{random}}^2(Q) = \langle N_{\mathrm{random}}(Q)^2 \rangle - \langle N_{\mathrm{random}}(Q) \rangle^2   (4.24)
Figure 4.16: Excess ratio R (eq. 4.22). The analysis considers two artificial networks; in both cases it quantifies correlations between the connectivities of pairs of nodes that are directly linked to each other. In the left panel we consider a network generated by rewiring the hardwired internet (which originally had 6,584 cycles of length three) to a network without triangles. The dark region in the upper right corner illustrates that this network disfavours connections between highly connected nodes, compared to a randomized version that preserves the degree distribution. In the right panel we rewire the hardwired internet to maximize the number of triangles (obtaining a network with 59,144 cycles of length three). In that case one obtains a network where high-degree nodes are more likely to be connected.
a random network with whatever degree distribution, which will work if it is at all possible to find enough partners.
To generate a network of N nodes with a degree distribution n(k) ∝ k^{−γ}, with a maximum of one link between each pair, one first assigns each node i a degree k_i from this distribution. That is, one selects a random number r ∈ ]0, 1[ and solves for K:

\frac{\int_K^N dk/k^{\gamma}}{\int_1^N dk/k^{\gamma}} = r \, ,   (4.26)

where K = K_i then is the selected number of links for node number i.
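Solving eq. 4.26 for K gives a closed-form inverse transform; a sketch, assuming γ = 2.5:

```python
# Sketch: draw node degrees from n(k) ~ k^(-gamma) via eq. 4.26.
import numpy as np

rng = np.random.default_rng(0)
N, gamma = 1000, 2.5

def sample_degree():
    r = rng.random()
    # (K^(1-gamma) - N^(1-gamma)) / (1 - N^(1-gamma)) = r, solved for K:
    K = (r * (1 - N**(1 - gamma)) + N**(1 - gamma))**(1 / (1 - gamma))
    return max(1, round(K))

degrees = [sample_degree() for _ in range(N)]
```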
After assigning a number to each node we need to link these nodes up,
where each node should have its assigned number of Ki links. This is done
from the top down, starting with the node which should be assigned the largest
number of links. Thus one starts with the node at highest degree and connects
it to other nodes, linking it to the node of next largest degree and subsequently
connecting lower nodes until all links for this high-degree node are assigned
[69]. Subsequently, lower degree nodes are assigned neighbors in the same
orderly way, until all nodes have their assigned degree.
The network is now extremely ordered: each node is linked to nodes of higher degree. In fact, such a network, where high-degree nodes are preferentially connected to high-degree nodes, is called assortative. Thus the generated network is scale-free, but it is not randomly put together.
To obtain a random network, the network has to be randomized, using a procedure that accomplishes the pairwise link swapping described in the previous section. Importantly, one needs to perform a large number of edge swaps, of order L · ln(L), where L is the total number of links in the system.
[Figure 4.18 plot: the hierarchy measure H versus the degree exponent γ (2.0–3.0, from “many hubs” to “few hubs”), with data points for the e-mail, Internet, CEO and yeast networks compared with random networks.]
Figure 4.18: Topological hierarchy. Reproduced from [69]. The left part of the figure illustrates maximally hierarchical (top) and anti-hierarchical (bottom) networks of size N = 400 with a 1/k^{2.5} degree distribution. The main figure shows how H depends on the degree distribution in random scale-free networks with distribution f(k) ∝ 1/k^{γ}. As the degree distribution narrows, the hubs tend to separate, and for γ > 3 the hubs are distributed along a “stringy” network that is dominated by nodes of low degree. The network examples are the hardwired network of internet routers, an email network, the network of board members of American companies, and the yeast protein-protein interaction network.
often not directly connected to each other, and even the random network is
anti-hierarchical.
The figure also compares with a few real-world networks, leaving us again with the challenge to properly define what a random network actually is. Notice that H ∼ 1 for γ ∼ 2, whereas H decreases to 0 for somewhat larger γ, reflecting that the hubs then have too few links to connect directly to each other. This is also reflected in the behavior of the probability that none of the K neighbors of a degree-K node has a degree higher than K (see Fig. 4.19):

P(K \text{ node is local top}) = \left(1 - \frac{\int_K^N k^{1-\gamma}\, dk}{\int_1^N k^{1-\gamma}\, dk}\right)^K \sim (1 - K^{2-\gamma})^K \sim \exp(-K^{3-\gamma}) \, ,   (4.27)

where we use that, when γ > 2, the integral is dominated by its lower boundary, and subsequently that 1 − K^{2−γ} ∼ exp(−K^{2−γ}).
Thereby P (K node is local top) becomes large for γ ∼ 3, even when K is
rather small.
hierarchical paths, whereas biological networks form rough landscapes with several mountains and broken hierarchical paths. To quantify the topology and make it possible to compare different networks, one can measure the typical width of individual mountains and the separation between different mountains (Fig. 4.20).
In particular, in Fig. 4.21 we complement the methods to generate ran-
dom networks (random one-mountain landscapes) [63] with preserved degree
sequences, to generate ridge landscapes. In its simplest implementation, we
assign a random rank to every node in a network, and organize the nodes hi-
erarchically based on their rank. This method creates non-random networks,
distinguished by a separation of hubs (leftmost network). Alternatively one
may assign each node a number equal to its rank, and thereby generate the
highly centralized system in the rightmost panel.
Thus, Fig. 4.21 shows topologies that all originate from a random scale-
free network (shown in Fig. 4.21e) with degree distribution P (k) ∝ k −2.5
and system size N = 400. The extreme networks, the perfect random-rank
hierarchy in Fig. 4.21(c) and the perfect degree-rank hierarchy Fig. 4.21(g)
(ε = 0), surround the networks with increasing error rate towards the random
scale-free network with ε = 1 in the middle (Fig. 4.21(e)).
The intermediate networks are generated by a rewiring rule where links are also sometimes, with probability ε > 0, re-shuffled randomly. Notice in particular that already a small perturbation of the stringy network in the left panel makes the diameter of the network collapse, as seen in Fig. 4.21(d). Note that the color gradient indicates that the random-rank hierarchy is still intact at this stage, and that the hubs (“mountain tops”) are separated.
Both when we organize the network according to random numbers and when we organize it according to degree, we obtain higher clustering (meaning more triangles) than in the completely randomized network (not shown). This clustering is expected, as organization along any coordinate tends to make friends of friends more alike. The effect is stronger in the degree-rank hierarchy, since the clustering automatically increases further when the hubs, with their many links, are connected.
Questions
4.11) Generate a random network with N = 1000 nodes and with power law dis-
tributed connectivity (that is, degree): n(k) ∝ 1/k 2.5 , for k = 1, 2, . . . , N . Compute
the number of nodes that have no neighbors with higher degree (number of local
maxima, i.e., “tops”). Rewire the network such that only moves which lower the
degree difference between nodes are allowed. Compute again the number of local
“tops”. Finally, try to rewire node links, where one only allows moves that increase
degree differences, and then compute the number of “tops”.
Qlesson: Network topology is much more than its degree distribution.
[Figure 4.21 panels: (a), (b); (c) Random rank, ε = 0, F = 0.13(5); (d) Random rank, ε = 0.1, F = 0.51(5); (e) Random, ε = 1, F = 0.83(5); (f) Degree rank, ε = 0.5, F = 0.96(3); (g) Degree rank, ε = 0, F = 1.00(0).]
Figure 4.22: Citations per article: Notice that as the average of the field
changes, so does the tail. The distribution is not really a power law but closer
to a log-normal.
Mini Tutorial: Imagine that you jump from article to article, following random entries in the reference lists. At a random time you then make a reference to one of the articles you visit. What is then the probability to reference an article as a function of its previous number of citations?
We will now introduce two ways to generate close-to-scale-free networks. As you will see, the two ways are fundamentally different, and in fact they also differ conceptually from any fine-tuning to a critical point. Both methods use time development and certain dynamical rules to obtain interesting networks.
But while one model is closely associated with eternal growth (it becomes boring in steady state), the other obtains its pattern in an ongoing steady state of network re-wirings, and in this sense has some similarity with SOC. As nearly all studies of networks are essentially snapshots of only one instance, we at present have no real way to judge the dominating dynamics in any real system. In any case, each of the approaches has its correspondence in other problems from physics, complex systems, and social science.
which gives:

2t \, \frac{f'(t)}{f(t)} = -\frac{1}{N} \frac{d(kN)}{dk} \, ,   (4.31)

where f(t) ∝ t, since f(t) = \sum_k k N(k) = 2t. Inserting f'/f = 1/t in the above equation,

-1 - \frac{d \ln(N)}{d \ln(k)} = 2 \, .   (4.32)

From this one obtains N ∝ k^{−3}:

n(k) \propto \frac{1}{k^3} \, .   (4.33)
Notice that if one instead added two links for each new node, each attached to the new node, then the numerator and denominator in eq. 4.28 would both double; in the end the scaling behaviour would be the same. If, however, we also added links preferentially without adding new nodes, the result could be different. If such links are assigned to existing nodes preferentially at both ends, they contribute twice to the numerator (once for each link end), but only 2 to the denominator. This peculiar growth would make the ratio smaller and change the scaling law from 1/k³ to 1/k^{γ} with γ ∈ [2; 3], see Fig. 4.23.
In fact, γ = 2/(2 − p) + 1, where p is the probability to add a new node with a link, and 1 − p is the probability to add a new link with both ends attached preferentially to already existing nodes. (This is proved by noting that a link with both ends preferentially attached contributes double to the numerator in the above equation. Thus, the numerator is multiplied by p + 2 · (1 − p) = 2 − p, whereas the denominator remains 2 · t.)
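A quick numerical check of this exponent can be run by storing link ends in a list, so that sampling a random end is automatically preferential (a sketch, with p = 0.66 so that γ = 1 + 2/(2 − p) ≈ 2.5):

```python
# Sketch: mixed growth -- with prob. p a new node with one preferential
# link, with prob. 1-p an extra link preferential at both ends.
import numpy as np

rng = np.random.default_rng(0)

def grow(p, steps=200000):
    ends = [0, 1]                       # link-end list: nodes 0 and 1 linked
    n = 2
    for _ in range(steps):
        if rng.random() < p:            # new node attaches preferentially
            ends += [n, ends[rng.integers(len(ends))]]
            n += 1
        else:                           # extra link, both ends preferential
            ends += [ends[rng.integers(len(ends))],
                     ends[rng.integers(len(ends))]]
    return np.bincount(ends)            # degree = number of link ends

k = grow(0.66)   # a log-binned histogram of k gives n(k) ~ 1/k^2.5
```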
The preferential growth model was originally proposed in an entirely different context, relating to the modeling of human behavior exhibiting skew distributions in a wide variety of aspects. Yule (Philosophical Transactions of the Royal Society B 213 (1924) 21–87), Pareto (“Cours d'economie politique”, Journal of Political Economy 6 (1898)) and Zipf observed that family sizes of taxonomic species, fortunes of humans, and the number of times particular words are used all tend to be distributed as power laws, n(s) ∝ 1/s^{τ}. Here, n(s) could for example denote the probability to have a word repeated s times. This distribution is marginal in the sense that the average
\langle s \rangle = \int_{\mathrm{min}}^{\mathrm{max}} \frac{s\, ds}{s^{\tau}}   (4.35)
receives a substantial contribution from the upper cut-off of the integral. That is, for power laws wider than 1/s², say 1/s^{1.5}, a huge fraction of the probability
[Figure 4.23: degree distributions p(k) versus k (log-log) for the preferential growth model with p = 1.00, 0.66 and 0.10, together with normal preferential attachment; lower p gives a wider power law.]
mass is bound relatively close to the upper cutoff. On the other hand, a narrower scaling like 1/s^{2.5} has an average that is independent of the upper cutoff. Thus, in the case where s denotes resources or money, social systems should become unstable when the exponent τ becomes less than 2. Popularly speaking, the rich then become so rich that, by confiscating their fortunes, society could increase the wealth of the rest by a substantial amount.
H. Simon (1955) suggested that the 1/s^{τ} behavior reflects a human tendency to preferentially give to those that already have. As H. Spencer stated already in 1855, the human perception of the importance of a particular subject is proportional to how often one has heard about this subject, an observation that relates to the absurdity of much public debate.
For networks, a feature of this history-dependent mechanism of positive feedback is that the most connected nodes are also the oldest. This property can sometimes be tested, and it often fails to be fulfilled. Another feature is that in steady state, supplementing preferential attachment with random elimination of nodes in fact breaks the scale-free degree distribution (because removal of any node preferentially reduces the number of links of the high-degree nodes).
Scale-free behavior obtained by preferential attachment in networks thus depends on the ongoing growth process, as seen in Fig. ??. For networks, the removal of small-degree nodes has the side effect that it preferentially removes links from the high-degree nodes, and thus limits their continued growth. Ongoing growth is, however, often not that realistic.
[Figure: growth rules: with probability q both ends of a new link attach preferentially; with probability 1 − q only one end is preferential.]
Mini Tutorial: What would happen if one only merged, and did not inject any new particles, in the above merging and creation model?
The above model persistently injects one new unit into the system, because
of the assignment sj = 1. The merging-injection model gives a distribution
p(s, t) that develops towards a steady state distribution p(s) as given by the
Figure 4.25: Basic merging and creation process (a,c), also implemented while allowing different signs (b,d). Panels c,d show cumulative plots at different times for an N = 1,000 system. From Minnhagen et al., Physica A 340, 725 (2004).
dynamical equation

p(s, t+1) - p(s, t) = \sum_{u=1}^{s-1} p(s-u, t) \cdot p(u, t) - 2 \cdot p(s, t)

p(s, t+1) - p(s, t) = 0 \;\Rightarrow\; \sum_{u=1}^{s-1} p(s-u) \cdot p(u) = 2 \cdot p(s) \, .   (4.36)
The first term represents the sum of all combinations of merging that result
in a size-s cluster. The second (loss) term 2 p(s, t) comes from selecting two
numbers, which each could be of size s with probability p(s, t).
When a number of size s is selected, it will be merged and thereby become larger than s, thus surely being removed from the bin of clusters of size s. The final steady-state equation is only true for the clusters that can be in steady state. The largest cluster cannot: at any time, the largest cluster s = s_max will occasionally be merged with another, smaller one, and thus it can only grow. The system relies on a steady injection of small clusters, s = 1, and its overall mass will grow as the largest cluster grows and separates from the power law that governs the remaining population of clusters.
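The steady-state power law is easy to reproduce numerically; a sketch of the merging-injection update (s_i, s_j) → (s_i + s_j, 1):

```python
# Sketch: merging with injection of unit-size clusters.
import numpy as np

rng = np.random.default_rng(0)
s = np.ones(1000, dtype=np.int64)
for t in range(2000000):
    i, j = rng.integers(1000, size=2)
    if i != j:
        s[i], s[j] = s[i] + s[j], 1      # merge into i, re-inject at j
# a histogram of s (minus the one runaway cluster) follows ~ s^(-3/2)
```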
Figure 4.26: Illustration of random walks. Emphasis is on the first-return distribution p(s), where s is the time axis. The second-return distribution of a random walker must fulfil p_{second}(s) = \sum_{u=1}^{s-1} p(u)\, p(s-u) in terms of first returns. Further, we make the reasonable assumption that second returns scale proportionally to first returns, p_{second}(s) ∝ p(s) for large s. Then \sum p(u)\, p(s-u) = const · p(s) must hold (see also eq. 3.14 and Fig. 3.5). Thus, the first return of a random walk should fulfil the recursion relation (apart from a constant that can be absorbed in a pre-factor). First returns of random walkers scale as p(s) ∝ s^{-3/2}, a scaling which should therefore also solve the merging-injection model.
In steady state, after a long time, the gain of aggregates of size s should equal the chance that a cluster of size s is selected and merged with another cluster (the factor 2 comes from the fact that we select two clusters at each time-step). The above steady-state relation is fulfilled by a power law, with the probability to have a number of size s scaling as

p(s) \propto \frac{1}{s^{3/2}}   (4.37)

(\sum_u u^{-3/2} (s-u)^{-3/2} \approx 5.22 \cdot s^{-3/2}). Notice that the prefactor of 5.22 can be absorbed by setting p(s) = (2/5.22)/s^{3/2}. A further argument for the scaling can be obtained by using that eq. 4.36 is fulfilled by the first returns of random walkers, as illustrated in Fig. 4.26. This figure shows two of the terms in the above sum, with a thin line marking the random walk for the corresponding contribution: the one where the first “first return” happens after one step, and the one where it happens after two steps. Obviously, all second returns at “time” s come about from walks where the first “first return” is somewhere between 1 and s − 1.
The power law requires constant injection of “mass” at s = 1 (Takayasu et al. 1988), making p(1) a fixed finite number (= 1/2, see the appendix to this chapter). In the random-walk picture this secures that we start the walk. For a simulation, see Fig. 4.25c,d. As already mentioned, the model also generates one very large cluster (or number) that constantly grows beyond any size (as the injected mass ultimately ends up in this aggregate).
The merging and creation model was first suggested as a model for aggregation supplemented by ongoing injection of new dust in some region of interstellar space (Field & Saslaw (1965)).
Remarkably, if one changes the model slightly by allowing evaporation, i.e. removal, the scaling changes. To accomplish this, at each step select two numbers, merge them, but then remove 1 from the aggregate: (s_i, s_j) → (s_i + s_j − 1, r), where r now is some random injection with mean ⟨r⟩ = 1. This model, which amounts to a “mass conserving” version of the above model, predicts a markedly steeper scaling, p(s) ∝ 1/s^{5/2}, as was also shown by solving the model analytically (Minnhagen et al., Physica A 340, 725, 2004). Hence, objects are merged and a small random quantity is emitted into the list of other objects. In any case, mass conservation changes the power law to the steeper exponent 5/2.
[Figure 4.27 plot: cumulative degree distribution P(> k) versus k (log-log), following 1/k^{1.2} when merging is followed by addition of a new node with 3 links. Annotation: merging shortens signaling pathways.]
Figure 4.27: Merging and creation model of Kim et al. (2003). In addition to the
merging step shown, a steady state network demands addition of a node for each
merging. After a transient this evolutionary algorithm generates networks with
scale-free degree distributions, as illustrated in right panel. The scaling exponent
for the steady state distribution depends weakly on the average number of links that
a new node attaches to the older ones.
At each time step one selects a random node and one of its neighbors. These are merged into one node, see Fig. 4.27a). Subsequently, one adds a new node to the network and links it to a few randomly selected nodes. This merging model partly corresponds to the above merging model with evaporation, thus suggesting an exponent of 1/k^{5/2}.
For networks, the merging-creation dynamics generates a nearly scale-free network with exponent γ ∼ 2.2, see Fig. 4.27b). On another note, consider companies that may merge with others. The justification for merging two companies could be efficiency: to shorten communication pathways. Creation, on the other hand, reflects the introduction of new companies with their few start-up relations.
In contrast to the preferential attachment model in the previous sub-section, the merging/creation model does not demand persistent growth. Instead, it suggests an ongoing dynamics of an evolving network which at any time has a scale-free degree distribution.
There is also a merging-creation model of evolving networks that has some potential relevance to solar flare dynamics. Solar flares are associated with eruptions from the solar corona, which in turn are associated with the complex phenomenon of turbulence in magneto-hydrodynamics. That is, there are strong magnetic fields on the sun, and the corona consists of charged particles, protons and electrons. Occasionally magnetic field lines converge into bundles and make sunspots with magnetic north poles or south poles. These field lines may merge and grow, or annihilate each other, depending on their directions.
The above “mess” in the solar atmosphere inspires a model with donor (q > 0) and acceptor (q < 0) nodes connected via directed links, see Fig. 4.28(a) and (b). Here q would be the number of magnetic field lines in the sunspot, with the sign of q marking its direction, i.e. whether it is a north or a south pole. As links will always be between positive and negative nodes, the network is bipartite: there are only links between the two types of nodes, and none between nodes of the same type.
[Figure 4.28: a), b) illustrations of the two merging rules; c) time series; d), e) cumulative distributions P(> s) of event sizes s and Δs (log-log), with slopes ∼ 1/s.]
Each node may have a different number of links, but at any time a given node cannot be both donor and acceptor. Further, we allow several parallel links between pairs of nodes, representing the number of field lines that connect them. Thus we are dealing with a network model where some pairs of nodes are connected by stronger links than others.
At each time step, two nodes i and j are chosen at random. The update is then:
• 1) Merge i and j. There are two possibilities:
– a) If i and j have the same sign, all the links from i and j are assigned to the merged node. Thereby, the merged node has the same neighbors as i and j had together prior to the merging, see Fig. 4.28(a).
– b) If i and j are of opposite sign, the resulting vertex is assigned the sign of the sum q_i + q_j. Thereby, a number max{|q_i|, |q_j|} − |q_i + q_j| of links is annihilated, in such a way that only the two merging nodes change their number of links. This is done by reconnecting donor nodes of incoming links to acceptor nodes of outgoing links, see Fig. 4.28(b).
• 2) One new vertex of random sign is created, with one edge connected to a randomly chosen vertex.
This bipartite network model predicts power laws associated with the dynamics of reconnections between the nodes (these power laws, and a more geometric version of the above model, were first studied in the solar flare model of Hughes et al. (2003)). In this regard it is interesting that the number of links per node, k, is distributed with scaling P(k) ∝ 1/k². This was also obtained for the “number of loops at foot-point” in Hughes et al. In addition, the distribution of re-connection events Δk, counted as the reduction of k when two nodes of different sign merge, is distributed as P_Δ(Δk) ∝ 1/Δk³. This is in fact similar to the distribution of “flare energies” in the solar flare model of Hughes et al.
This suggests a simple perspective on solar corona dynamics: perhaps ongoing merging is a main reason for the scale-free behavior of magnetic activity in the solar atmosphere. The bipartite model has its analogy in a scalar model with matter/antimatter, as studied in the early 1990s by Krapivsky (1993).
Questions:
4.12) Simulate preferential growth in its original “rich gets richer” version, solved by Herbert Simon in 1955. That is, at each step add $1. With probability p this amount is added to an already existing person, chosen with probability proportional to the wealth of this person. Otherwise, that is with probability 1 − p, introduce a new person with a fortune of 1. Explore the behavior for p close to 1, that is, in the limit where one very rarely gives money to the people who have nothing.
Qlesson: Preferential growth indeed gives scaling. p = 1/2 corresponds to the preferential-attachment growth model for networks.
4.13) Repeat the above, but now supplemented by the rule of removing a random person each time the number of persons in the system exceeds 1,000.
Qlesson: Here, the rich-gets-richer dynamics remains robust to removal. This is because the rich are not affected by other people's elimination.
4.14) Consider the preferential network growth model and let n(k, t) be the number of nodes with connectivity k at time t. Find the analytical expression for the steady-state distribution n(k) for different probabilities p of adding new nodes (with 1 − p being the probability of adding new links).
Qlesson: In this case, each time one adds a link, the denominator in eq. 4.28 grows by 2 but the numerator by less. Thereby γ = 1 + 2/(2 − p), which approaches 2 when 1 − p ∼ 1.
4.15) Let’s, for the time being, ignore the network aspect and simply simulate the
merging/creation model in terms of a set of integer numbers ki , i = 1, 2, . . . , n, (with
n = 1, 000) which are updated according to
ki , kj → ki = ki + kj and kj = 1 . (4.39)
Show numerically that this generates a steady state distribution of the sizes of the
numbers in the set n(k) ∝ k1τ with τ ∼ 1.5.
Qlesson: Merging with constant influx can indeed generate power laws.
4.16) Simulate the merging/creation model in terms of a set of integer numbers k_i,

k_i, k_j \rightarrow k_i = k_i + k_j - 1 \;\text{ and }\; k_j = \delta \, ,   (4.40)

where δ is 0, 1 or 2 with equal probability, and we only allow updates where all k_i ≥ 0. Thus, some of the 1,000 numbers can be zero, later to be merged and replaced with nonzero numbers from elsewhere in the system. Show that this procedure generates a steady-state distribution of the sizes of the numbers, p(k) ∝ 1/k^{τ} with τ ∼ 2.5.
Qlesson: In merging dynamics, the resulting distributions depend strongly on a finite but persistent loss term.
Lessons:
• Networks are both about connecting large systems and about keeping individual nodes isolated from most other nodes.
• The amplification factor A = ⟨k²⟩/⟨k⟩ − 1 is a quantity that can be used to estimate the critical point for signal transmission across the network (percolation threshold).
Supplementary reading:
Newman, Mark. Networks. Oxford University Press, 2018.
Appendix: the merging-injection model solved by a generating function. Define

G(x) = \sum_{i=1}^{\infty} p_i x^i \, ,   (4.41)

where the variable x is not in itself interesting, but is rather there to allow us to calculate the p_i's by differentiating with respect to x. We see directly that G(0) = 0 and G(1) = \sum_i p_i = 1. Now express
G^2(x) = \sum_{i=1}^{\infty} p_i x^i \sum_{j=1}^{\infty} p_j x^j = \sum_{k=2}^{\infty} \sum_{i=1}^{k-1} p_i\, p_{k-i}\, x^k = \sum_{k=2}^{\infty} 2 p_k x^k \, ,   (4.42)
k=2
where we in the last equation use the basic equation defining pk , i.e. the steady
state equation for the merging-creation process,
X
pi pk−i = 2pk (4.43)
Now the sum from k = 2 to ∞ can be written as the whole sum from k = 1 to ∞ minus the contribution from k = 1, i.e. minus p_1:

G^2(x) = 2 \cdot (G(x) - p_1 \cdot x) \;\Rightarrow\; G(x) = 1 \pm \sqrt{1 - 2 p_1 x} \, .   (4.44)

Thus, with the help of the basic recursion equation for p_k, we got an expression for the generating function G. The sign choice ± is fixed by the constraint that G(0) = 0. Thus,

G(x) = 1 - \sqrt{1 - 2 p_1 x} \, ,   (4.45)

and normalization, G(1) = 1, fixes p_1 = 1/2, so that G(x) = 1 − √(1 − x).
We can now Taylor expand this expression for the generating function, yielding

G(x) = G(0) + G'(0)\, x + \dots + \frac{1}{i!} G^{(i)}(0)\, x^i + \dots \, ,   (4.46)

with the i'th derivative equal to

G^{(i)}(x) = \frac{1 \cdot 3 \cdot 5 \cdots (2i-3)}{2^i} \cdot \frac{1}{(1-x)^{(2i-1)/2}} = \frac{(2i)!}{4^i \cdot i!\,(2i-1)} \cdot \frac{1}{(1-x)^{(2i-1)/2}} \, ,

where we express the i'th order derivative in terms of factorials, thus allowing later use of approximate equations for these. The generating function is:

G(x) = \sum_{i=1}^{\infty} \frac{1}{4^i} \cdot \frac{(2i)!}{i! \cdot i!} \cdot \frac{1}{2i-1} \cdot x^i   (4.47)
Now we identify each order in x between this expression and the defining equation for the generating function G(x):

p_n = \frac{1}{4^n} \cdot \frac{(2n)!}{n! \cdot n!} \cdot \frac{1}{2n-1}   (4.48)
Using Stirling's approximation n! ∼ \sqrt{2\pi n} \cdot (n/e)^n:

p_n \propto \frac{1}{4^n} \cdot \frac{\sqrt{2n}\,(2n/e)^{2n}}{n\,(n/e)^{2n}} \cdot \frac{1}{2n-1} \sim \frac{1}{\sqrt{n}\,(2n-1)} \sim 1/n^{3/2} \, ,   (4.49)

which is indeed the scaling guessed from assuming that the second returns scale as the first returns of a random walker.
Chapter 5
Agent-based models
5.1 Introduction
In this chapter we attempt to understand aspects of our social/biological world, assuming that it is built of many entities or agents that, by repeated actions, allow larger-scale organizations to form.
Agent-based models are entering the mainstream of computational approaches to economic and social systems. By simulating complicated economic relationships in terms of many different types of agents, as explored recently in the economic literature by, e.g., LeBaron, it is hoped that one can obtain realistic models of societal dynamics. Here, we instead advocate another type of agent-based model, which describes “bottom-up” self-organization of complex systems. The models we describe have only a few rules for simple agents, which are then iterated billions of times. Heterogeneity, diversity and complex behavior should be an emergent property, not an input. We term this class of models the “bottom-down” approach, with the value of a model being greatest when it is built on only a few assumptions and parameters. A clearly “wrong” model is often more useful than an unclear model.
[Figure 5.2: A) average number of similar neighbors versus updates/agent for the thresholds “move when < 2”, “< 3” (configuration C), “< 4” (configuration D) and “< 6”; B) the random starting configuration; C), D) segregated configurations on the 20 × 20 lattice.]
• Select one agent and let this agent move if her number of neighbors of the
same color is smaller than a certain threshold. When moving, we simply
swap the agent's position with that of another randomly selected agent. This swap
takes place irrespective of whether either agent gains or loses by it;
the move is driven only by current stress.
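A minimal sketch of this swap dynamics in Python (assuming two colors, a 40 × 40 periodic lattice and threshold 3; all parameter choices are illustrative, not the exact settings behind Fig. 5.2):

    import numpy as np

    rng = np.random.default_rng(0)
    L, T, steps = 40, 3, 200_000
    grid = rng.integers(0, 2, size=(L, L))     # two colors, random start

    def similar(g, x, y):
        # count the 8 (periodic) neighbors sharing the color of site (x, y)
        c, n = g[x, y], 0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx or dy:
                    n += g[(x + dx) % L, (y + dy) % L] == c
        return n

    for _ in range(steps):
        x, y = rng.integers(L, size=2)
        if similar(grid, x, y) < T:            # agent is "stressed": swap with a
            u, v = rng.integers(L, size=2)     # random agent, regardless of who
            grid[x, y], grid[u, v] = grid[u, v], grid[x, y]   # gains or loses

    print("mean similar neighbors:",
          np.mean([similar(grid, i, j) for i in range(L) for j in range(L)]))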
The central parameter is the threshold of equally-colored neighbors. A simulation
of the model is shown in Fig. 5.2. From panel A) one observes that
the system segregates. This segregation occurs to an extent that depends on
the threshold used: if an agent is required to move already when she has fewer
than three neighbors of the same color as herself (green curve in Fig. 5.2A), the
system evolves to a state where agents on average have 6.5 neighbors of the
same color, that is, a strongly spatially-segregated system. This outcome is illustrated
in panel C). In fact, if one also moves when there are exactly three neighbors
of the same color, nearly the same degree of separation is obtained (Fig. 5.2D).
Thus, segregation can be obtained with a relatively moderate racial preference,
which is the main result of Schelling. Even if one only moves when one
has 0 or 1 neighbors of the same color, there is noticeable segregation. Conversely,
if the requirement is such that one is pushed out already by relatively few agents of
opposing color, segregation is weakened, as people are pushed to move all the
time, and boundaries tend to break up.
Note that the above model is slightly simpler than the original model,
where there were also empty spaces and people moved to these empty spaces.
Our moves simply make two agents swap positions, rather than letting one agent
move into an empty space. Further, we do not ask whether the replaced agent
gains by the move. If you move when you are frustrated because you have only
3 neighbors of the same color as yourself, then the agent of opposing color who
takes your place will automatically be satisfied, as it will have 8 − 3 = 5 neighbors
of its own color, satisfying the threshold. However, for a threshold that forces you to move even when
you have 5 neighbors of your own color, our version of the model will force
a differently colored neighbor into a frustrated situation. Nevertheless, the result
for a threshold of six in Fig. 5.2A remains qualitatively similar.
Segregation models have been suggested to be important for a range of
biological problems, ranging from evolution of toxin producing bacteria [79] to
spatial sorting of different cell types into tissues, using the differential adhesion
hypothesis [80, 81]. The differential adhesion hypothesis assumes that similar
types of cells attract each other more strongly than cells from different tissues.
A value γ ∼ 0 means that production cost does not diminish much with
company size, whereas a lower γ reflects a stronger effect of economies of
scale. γ < −1 is not realistic, as it would mean that the total production
cost of s product units, namely s · s^γ = s^{1+γ}, becomes cheaper than that of
just one unit. However, γ ∼ −1 may be realistic for the software or movie
industries, whereas traditional factory production may have a moderately
negative γ closer to 0.
The parameter σ quantifies a transportation cost that, for simplicity, is
assumed to be linear in the distance between the position x of the consumer
and the position y of the producer. Notice that this proportional dependence
is markedly different from the exponential "iceberg cost" assumed in the economic
literature [90]. In fact, one may even expect modern shipping costs to
increase more slowly than proportionally to distance; however, for simplicity, we
keep the linear dependence here. Finally, one may supplement the model with
a tariff parameter β, quantifying a customs barrier between position x
and position y, which would be added to equation (5.1).
The model is executed in time steps, where each time unit consists of τ
trading updates as defined above. During these τ steps the value of s does
not change. After these τ updates, new company sizes s(y) are assigned
according to the orders accumulated at site y during these updates.
There is no direct memory of the earlier size of a company but, nevertheless,
companies tend to remain localized because of the sensitivity of accumulated orders
to the production capacity of the previous production period.
Altogether, the model has three parameters γ, σ, and τ. In addition,
tariffs may be added for externally imposed customs, and the model could
in principle be extended to include pre-factors in front of s^γ, in order to take
into account the variation in labor costs considered in model descendants of
[86]. That is, products with small γ would presumably have a large cost of
producing already the first unit (as all subsequent units are nearly free).
The system size L is irrelevant, as long as it is much larger than the domain
scale set by the other parameters. γ and σ quantify the incremental production
cost and the transportation cost for a unit of product, whereas τ is proportional
to the time it takes to rebuild the production apparatus for the considered
product type.
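A minimal sketch of one possible implementation (assuming, as suggested by the text, that a consumer at x orders from the site y minimizing the unit price s(y)^γ + σ·d(x, y) on a ring, and that sizes are reset to the orders accumulated during the τ updates; all parameter values are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    N, gamma, sigma, tau, periods = 100, -0.5, 0.1, 10, 100
    s = np.ones(N)                       # company sizes (production capacities)
    sites = np.arange(N)

    for _ in range(periods):
        orders = np.zeros(N)
        for _ in range(tau * N):         # tau trading updates per agent
            x = rng.integers(N)
            d = np.minimum(np.abs(sites - x), N - np.abs(sites - x))  # ring distance
            price = np.full(N, np.inf)
            active = s > 0
            price[active] = s[active]**gamma + sigma * d[active]
            orders[np.argmin(price)] += 1
        s = orders                       # size = orders of the previous period

    print("surviving production centers:", np.count_nonzero(s))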
Fig. 5.3 explores the dynamics of the one-dimensional model with periodic
boundary conditions (a ring of sites), using an intermediate economies-of-scale
exponent γ = −0.5. The first three panels illustrate the emergence of production
centres (denoted "companies" in the following), reflecting the positive
feedback between consumers and the economies of scale.
Fig. 5.3A illustrates that a given manufacturer may collapse while others
emerge. Notice further that the emergence of new companies often occurs close
to the positions of previously collapsed ones. This inheritance reflects the
memory associated with the geography of surrounding companies that survive
the collapse of the one in question. In other words, when a company disappears,
it leaves vacant a wide business niche, because of the cost associated with the
distance for local customers to deal with companies farther away in the larger
neighborhood.
Comparing Fig. 5.3A and B, one notices that lowering τ destabilizes companies.
Remember that low values of τ (as in Fig. 5.3A) correspond to the
case where it only costs a few product units to build a new production facility.
Therefore, a higher start-up cost will tend to stabilize existing production
centers. Comparing Fig. 5.3C with Fig. 5.3A, one sees that lower distribution
costs may stabilize even a product with low τ.
Fig. 5.3D introduces another limit on the availability of products in terms of
an "information horizon". With this, agents are only allowed to explore
prices of, say, the nearest h = 10 neighbor agents in search of the lowest price
(modeling a traditional "offline" economy). Fig. 5.3D uses the same other
parameters as Fig. 5.3C and illustrates that a low information horizon has an
effect comparable to a larger transportation cost (compare with Fig. 5.3A).
For relatively small transportation cost, the production centers become
large, whereas the noise in allocating customers in the time interval τ becomes
small. In this limit an "equilibrium" production center should supply customers
up to a distance x where the gain from economies of scale, (d/dx) s^γ ∝ (d/dx) x^{Dγ},
balances the increase in transport cost σ from increasing x further. Differentiating
the cost of products originating from a region with radius x in D
dimensions, cost(x) = x^{Dγ} + σ · x:

(d/dx) cost(x) = 0  ⇒  x^{Dγ−1} ∝ σ  ⇒  x ∝ (1/σ)^{1/(1−γD)} .   (5.3)
Here, we only include the overall scaling, and not the tendency that small-γ
products have larger costs for the first product (i.e. σ should be
interpreted as the transportation cost relative to the production cost of the first product).
Overall, the size of a production center, or its associated customer base, is
governed by the balance between the positive feedback of economies of scale
and the negative feedback set by transport. Within this analogy, the patterns
in Fig. 5.3 are reminiscent of the ones found in reaction-diffusion systems
[91], where a local positive feedback is combined with a spatially extended
negative feedback.
• At each update a new word is initiated at the domain center with prob-
ability fword /N , and assigned a birth time according to a time counter.
Figure 5.4: Swear word dynamics in Japan. The left panel shows the
distribution of swear words as measured in the 1980's. The geographical
distribution of concentric circles around Kyoto is the result of 600 years of history.
The right panel shows a snapshot of a simulation of the spatial dynamics of
word spreading over the Japanese mainland. Blue and red circles show two
examples where the same word form is found symmetrically on either side of
Kyoto. The graph in the upper left corner shows the mean distance between
two adjacent fronts (averaged over many runs) as a function of distance from
Kyoto. The orange broken circle belongs to a word which is only present on
Kyoto's east side. The probability that a word coexists on both sides decays
with distance away from Kyoto. The inset of Panel B shows this in terms
of the width of the respective word regions: as the distance from Kyoto increases,
the width of the regions of surviving words tends to increase. Figure reproduced from ref.
[93].
• If the word is younger than the word already present on the chosen site,
it overwrites the older word at this site. If the word is transmitted
to a site where an even newer version exists, the older word is ignored.
If a word is transmitted to a site where it is already present, the
system is in effect unchanged.
As words spread, they always retain their original birth time, assigned at
origination at the center. In Fig. 5.4 the 2-d lattice is constrained within the
land borders of Japan, thus allowing us to include the simplest geographic
features.
The simulations depend on the frequency fword of new words originating
from Kyoto, and also on the size of the squares into which we coarse-grain space.
With larger but fewer patches, fluctuations increase and the likelihood that a
word dies out becomes larger. Figure 5.4 shows the case where each "agent"
represents a lattice square of size ∆ = 30 km, and where the frequency of
new words was calibrated such that about 20 words remain simultaneously on
the main island, as can be counted from the data in panel A).
The size of ∆ was adjusted to fit the increasing distances between words as
one moves out from the center, see Fig. 5.4. Importantly, the words are quite
different from the center to the periphery of Japan.
The language-spreading model, with the basic assumption that "new" overrules
"old", resembles a minimal disease-spreading model, where people get
infected and subsequently become immune to each disease. In this process, subsequent
waves of emerging new diseases become possible [94, 95].
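A minimal one-dimensional sketch of the "new overwrites old" dynamics (essentially the setting of exercise 5.3 below; the innovation frequency and system size are illustrative):

    import numpy as np

    rng = np.random.default_rng(2)
    N, f_word, sweeps = 200, 0.02, 2000
    birth = np.zeros(N, dtype=int)       # birth time of the word at each site
    clock = 0

    for _ in range(sweeps * N):
        clock += 1
        if rng.random() < f_word:        # a new word is born at the center
            birth[N // 2] = clock
        i = rng.integers(N - 1)          # pick the neighboring pair (i, i+1)
        newer = max(birth[i], birth[i + 1])
        birth[i] = birth[i + 1] = newer  # the younger word replaces the older

    print("words currently surviving:", len(np.unique(birth)))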
Questions:
5.1) Simulate a Schelling-like model in two dimensions for a 40 × 40 site system
where each site has 8 neighbors, and where only one color is allowed to move away from
its present location. Set the threshold for moving to having fewer than 3 neighbors of the
same color as yourself. Simulate this for a 3-color system, only allowing one
color to make active moves. Simulate the system for long times to verify coarsening.
Use periodic boundary conditions.
Qlesson: One can get segregation driven by only one race. In a 3-race system it should
be possible to distinguish the actively moving color.
5.2) Simulate the globalization model in one dimension, akin to Fig. 5.3A. Plot the
average size of companies as a function of the transportation cost and the time-scale τ
for a fixed value of the economies-of-scale exponent γ = −0.5. Simulate an N = 100 system placed
on a line, for at least 2000 updates per agent in the system.
Qlesson: The distribution of company sizes is quite independent of τ.
5.3) Simulate and visualize the spreading of signals along a one-dimensional line, with
new words appearing at position x = N/2 with high frequency (for example, each
time every agent has been involved in one word exchange). At each step, select
two neighbors, and let the youngest word spread to replace the oldest word. Also
simulate the model when new words are inserted more rarely. Qlesson: With fast
word innovation, the words on the right and left sides of the system will rarely be
the same, because survival on the two sides is exposed to big fluctuations around
the insertion point.
Mini Tutorial: Suggest positive feedback mechanisms that could favor main-
tenance of high connectivity/central hubs in social networks.
The memory in (1a) and (1b) is a map that is updated based on the principle
that new information is better than old, whereas the priority in (1c) is only
used in the more complicated version that gives segregation.
The 3 vectors above are updated with each encounter. Thus people just
talk about people. In the simplest model, people only update their knowledge
about the overall direction towards other people, and how new their knowledge
is.
The network model is executed in time steps, each consisting of one of the
two events (see Fig. 5.6):
• Communication (C): Choose a random link and let the two agents
connected by the link communicate about a third agent selected by one
of them. The two agents also update their information about each other.
The communication step C is executed much more frequently than the rewiring
step R, thereby allowing each agent to build a reliable contemporary map of
where other agents are. R defines a network dynamics where individuals change
their neighborhood by gradually climbing socially to friends of friends 4).
The ABM is first simulated such that agents prioritize each other equally all
the time. This mimics the case where "broadly minded" agents simply hunt for
new information about everybody. The network then develops the hierarchical
structure shown in Fig. 5.7A). This hierarchical structure reflects the positive
feedback between being central and having access to new information about
other agents. That is, agents with new information are attractive targets in the R-move
of other agents. As a consequence, central agents tend to gain links, and
the networking reinforces a hierarchy based on information access.
The behavior of the ABM changes dramatically when agents prioritize
their "information hunt" based on what they are used to hear, see Fig. 5.7B).
4)
In case an agent is completely disconnected from the system, it reconnects on the basis
of a weighted choice from its own prioritized list.
The model is defined on a 2-d square lattice of L×L sites, each occupied by
an agent. Each agent i can be assigned a number ri which can take any integer
value. This number plays the role of a particular idea, concept, or opinion. At
any time-step one random agent i is selected, and the following two actions
are attempted:
• One of the nearest neighbors j of agent i is selected. Denoting by n_j
the total number of agents with integer value equal to that of j, we let
agent i change its integer value to that of its neighbor j with probability
n_j/N, provided that i never assumed that particular integer value
before. In case it had, no update is made.
• With a small probability α (the innovation rate), agent i instead adopts
an entirely new integer value, one never before present in the system.
In practice, an agent need only remember integer values while they remain present in the
system. This is because an integer that is no longer on the lattice would not be
distinguished from a new number by the model. Another feature of the above
model is the factor n_j/N, which implies that a minority concept has more
difficulty in spreading than a more widespread idea. This particular feature
represents cooperative effects in social systems, and is nearly the same as
selecting two agents to jointly influence a third. The feature is included in a
way that 1) allows cooperativity to act over long distances while at the same
time restricting propagation to spreading on the 2-d plane, 2) avoids discussion of
detailed neighborhood updates related to where the two agents are located,
and 3) allows a single idea to nucleate from one person (with probability 1/N).
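A minimal sketch of these rules (assuming a separate innovation move occurring with rate α per update, and a per-agent memory of all previously held values; lattice size and α are illustrative):

    import numpy as np

    rng = np.random.default_rng(3)
    L, alpha, sweeps = 20, 1e-4, 500
    N = L * L
    state = np.zeros((L, L), dtype=int)
    memory = [{0} for _ in range(N)]           # values each agent has held
    next_id = 1
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    for _ in range(sweeps * N):
        i, j = rng.integers(L, size=2)
        if rng.random() < alpha:               # innovation: a brand-new idea
            state[i, j] = next_id
            memory[i * L + j].add(next_id)
            next_id += 1
            continue
        di, dj = moves[rng.integers(4)]        # random nearest neighbor j
        v = state[(i + di) % L, (j + dj) % L]
        n_v = np.count_nonzero(state == v)     # current followers of idea v
        if rng.random() < n_v / N and v not in memory[i * L + j]:
            state[i, j] = v                    # adopt; never re-adopted later
            memory[i * L + j].add(v)

    _, counts = np.unique(state, return_counts=True)
    print("largest community fraction:", counts.max() / N)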
Mini Tutorial: What would happen if there was no memory in the above
model, that is, that every idea could spread proportional to the number of
current followers, irrespective of history?
Figure 5.9: Three time series of the sizes of the dominant states of
the system, at α = 0.4 × 10^{−6}, α = 25 × 10^{−6}, and α = 400 × 10^{−6},
respectively. Time is measured in units of sweeps (updates per agent). Notice
that the length of each period does not change substantially with α, and in
fact becomes more regular with larger α.
Due to the cooperative spreading factor n/N, a new idea with n followers initially grows as

dn/dt = n²/N ∝ n² ,   (5.4)

with solution

n(t) ∝ 1/(t_c − t) ,   (5.5)

that is, it is divergent (reaching system size) at some finite time t_c. Thus, the
start of new paradigms is slow, but the final rise is fast.
Fig. 5.8 shows 12 subsequent states of a system driven by the model. The
snapshots reflect states of the simulation shown in Fig. 5.9b, starting at time
t = 62000. The first panel shows the system shortly after a new idea swept the
system, leaving the system in a coherent state dominated by this particular
idea. A few agents have different colors (i.e., ideas), representing the effect of a
finite innovation rate α over the short time interval after the dominating idea
took over. The second and third panels show the system closer to the next
transition (at t ∼ 68,000), where several ideas have nucleated some sizeable
clusters of coherent colors. Panels 4 and 5 correspond to the spike at t ∼ 68,000
which subsequently leaves the system with two mutually coexisting coherent
states that persist until they are erased by a new “avalanche” (panels 8 and
9). Finally panels 9-12 describe the evolution of the system from t = 82, 000
to t = 93, 000. This period is characterized by the dominance, erosion and
subsequent replacement of one state with another.
Figure 5.9 shows three time series for the rise and fall of different leading
communities, illustrating the behavior at low, intermediate, and high values
of the “innovation” rate α. In all cases one sees a sharp growth of the dom-
inating community, followed by a slower decline. Remarkably, the lengths of
domination periods are quite insensitive to α. However, as seen from Fig. 5.9
a-c, the nature of the decline of the dominating state depends on α:
• For low α, the dominating state remains nearly intact until it is replaced
by a rare single nucleation event that suddenly replaces the old state with
a new one. As a consequence, low noise only rarely leads to situations
where more than one state nucleates at the same time.
The model provides a new frame for looking at the interplay between the dominance
of prevailing concepts supported by a large number of followers, and
the striking inability of these concepts to defend themselves against new ideas
when the situation is prone to takeover. The increased vulnerability of a dominating
idea or paradigm with age is in our model seen as a steady increase
in the number of competing ideas, and a parallel decrease in its support. For
intermediate or large innovation rates, the takeover is a chaotic process with
multiple new states competing on a short time scale. The final takeover happens on a
much shorter time scale than the decline. Existing paradigms are eroded in a
pre-paradigm phase for the next paradigm, much as envisioned by Kuhn (T.
S. Kuhn, "The Structure of Scientific Revolutions", 1st ed., Chicago: Univ.
of Chicago Pr., 1962). New paradigms are born fast, ideally aggregating in
a real scientific competition between the many random ideas that emerged in
the pre-paradigm phase.
Questions:
5.4) Consider the paradigm model with cooperative idea spreading. When you are
the first to get an idea in a system of N = L × L agents, what is the average time until the
idea has spread to two persons? And given that n persons have the idea, what is the time
until it spreads to one more person? Simulate a 10 × 10 system where each agent can get
a new idea with probability 0.01 each time an agent tries to transmit a message to
a neighbor. Plot the popularity (number of followers) of some ideas in the simulation.
Notice that each agent needs a memory of the last 100 ideas it was exposed to, and
is not allowed to adopt any idea in this list.
Qlesson: It is by far hardest to spread the idea the first time.
Figure 5.10: Schematics showing the SIR, Voter, and epigenetics models.
The three models of this chapter, built on interacting populations.
Straight arrows show possible transitions, curved arrows show facilitating interactions.
The upper panel shows the SIR model, the middle panel the Voter
model, and the lower panel the two-step Voter model with bi-stability.
Figure 5.11: Interacting species in real life. It often appears that a lynx
population grows when a rabbit population is large, suggesting the famous
Lotka-Volterra coupling between their populations: dx/dt = a · x − η · x · y and
dy/dt = β · x · y − δ · y where x is the prey and y the predator population (and
the parameter β/η the amount of predator that is produced for each prey that
is eaten).
Epidemic models are very similar to models for spreading information and ideas.
The classical epidemic model is the SIR (Susceptible-Infectious-Recovered) model,
which divides the population into fractions of susceptible individuals (S), infectious
individuals (I) and recovered individuals (R). The latter may be dead or
immunized, and are thereby removed from further spreading of the disease.
The earliest mathematical treatment of disease infection, for malaria, was done
by Ross [?], who also introduced mosquito nets to reduce malaria. The simplified
mass-action kinetics for a well-mixed population that includes the recovery/removed
state can be found in (Kermack, W. O. and McKendrick, A. G. "A Contribution to
the Mathematical Theory of Epidemics." Proc. Roy. Soc. Lond. A 115, 700-721,
1927):
dS/dt = −λ · S · I   (5.7)
dI/dt = λ · S · I − γ · I   (5.8)
dR/dt = γ · I   (5.9)
where the parameter γ decides for how long individuals are infectious (time ∼ 1/γ),
and the ratio λ/γ subsequently determines how many individuals each infected
person can infect. A central parameter is the so-called R_0-factor:

R_0 = λ/γ ,   (5.10)
[Figure panels: the SIR model with dS/dt = −3·I·S, dI/dt = 3·I·S − I, dR/dt = I (top); a variant with a latent state L between infection and infectiousness (middle); and the same latency divided into 10 smaller steps (bottom). Population fractions are plotted against time, in units of the infectious time.]
Figure 5.12: Simulating epidemics. The classical SIR model, and a variant
where one includes a latency time during which people are infected but cannot yet
infect others.
which is central in thinking about how widespread the disease becomes before herd
immunity sets in. R_0 is the number of infections that each infected individual causes
at the beginning of the epidemic (when few individuals have been infected).
As the disease spreads, the amplification number tends to decrease because people
get immunized (or die), leaving fewer susceptible individuals. R_0 is very large
for measles (∼ 10 to 15), about two for ebola, and about 1.3 for common influenza.
For Covid-19 it is estimated to be between 2 and 3.
Dividing the first by the last equation, one obtains:

d ln(S)/dR = −λ/γ = −R_0  ⇒  S(∞) = e^{−R_0 (1−S(∞))}   (5.11)
where we use that S + I + R = 1, S(0) = 1, and that R(∞) = 1 − S(∞), since there
are no infected individuals after the epidemic has died out. For an illustration of
the solution see Fig. ??. Thus, if an epidemic with R_0 ≫ 1 really followed
the SIR model, then S(∞) ≪ 1 and the number of "survivors", that is, those never
infected, declines with R_0 as

S(∞) ∼ e^{−R_0} .

This is a very small number, and much smaller than the so-called herd immunity
limit.
Herd immunity is instead calculated from the size of S when the disease stops
growing exponentially, i.e. when dI/dt = 0 → S = 1/R_0, which indeed is much
larger than e^{−R_0}. Herd immunity is the level at which the epidemic stops if one
avoids "overshooting", i.e. avoids that the many infected continue to infect after S has
decreased to 1/R_0. Herd immunity is also what one wants to obtain with vaccination
strategies, since vaccinating a fraction > 1 − 1/R_0 of the population would ensure
that dI/dt < 0, and thus that an epidemic could not propagate.
If R_0 < 1 then the disease cannot spread, corresponding to a percolation that is
limited to a finite cluster. In that case S(∞) = 1 is the only solution to eq. (5.11). When
R_0 > 1, on the other hand, the disease indeed spreads. If a fraction q is immune to
the disease, then effectively S → S · (1 − q) and the real spreading will occur with
an effective R = R_0 · (1 − q), which becomes smaller than 1 when

R_0 · (1 − q) < 1  →  q > 1 − 1/R_0 .   (5.12)

Thus, for an R_0 factor of 10 one needs to vaccinate more than 90% of the population.
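A minimal numerical check of eqs. (5.7)-(5.9) and the final-size relation (5.11), using simple Euler steps (step size and parameters are our own illustrative choices):

    import math

    lam, gam = 3.0, 1.0                 # R0 = lam/gam = 3
    S, I = 0.999, 0.001
    dt = 0.001
    for _ in range(int(100 / dt)):      # integrate 100 infectious periods
        dS = -lam * S * I * dt
        dI = (lam * S * I - gam * I) * dt
        S, I = S + dS, I + dI

    print("S(infinity) from integration:", S)
    print("residual of eq. (5.11):", S - math.exp(-(lam / gam) * (1 - S)))

For R_0 = 3 this gives S(∞) ≈ 0.06, far below the herd-immunity level 1/R_0 ≈ 0.33, illustrating the overshoot discussed above.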
Recovered individuals cannot be re-infected in the SIR model. Further, the
model assumes that one becomes infectious immediately after infection. Figure 5.13
also explores the effect of relaxing these conditions, thus having a latency
period between infection and being infectious, and becoming susceptible again after
some longer time interval. The latency period effectively delays the progress of the
disease, but does not change the long-term fraction of infected people needed to
obtain herd immunity. In Fig. 5.14 we show data for recurrent epidemics of a
disease (influenza).
[Figure panels: the SIRS model with dS/dt = −3·I·S + R/50, dI/dt = 3·I·S − I, dR/dt = I − R/50, approaching an endemic state (top), and the same model with an added latent state L (bottom). Population fractions are plotted against time, in units of the infectious time.]
Figure 5.13: Model including reinfections. The SIRS model, where recovered
people can get reinfected after a longer time interval. This model is
relevant if the immunity obtained has a time limit. Alternatively, a steady state
can be obtained if new individuals are born without immunity. The simulations
are started with a fraction of 0.001 individuals infected and the remaining
S = 0.999 susceptible.
Agent-based models, however, are very well-suited to investigate the
role of superspreaders (Fig. 5.15). Like standard compartmental SEIR models, they can easily
reproduce the epidemic curves observed in a population. Unlike purely compartmental
models, however, agent-based models can adjust individual infectivity and
mimic repeated social interactions within defined groups. In an agent-based model,
an agent goes to the same workplace in the morning and home to the same household
at night. In contrast, inhabitants of standard compartmental models go to a
new workplace and home to a new family in every time step.
[Figure annotations: R_0 = 3, R_e = 1. With R_0 = 3 and a 5-day infectious period, a 1-hour contact has < 10% chance of causing infection. Without superspreaders, infections depend mainly on the duration of contacts; with superspreaders, they depend more on the number of contacts.]
Figure 5.16: Agent-based model for the Covid-19 epidemic: The top panel shows the
progress of the disease in each infected person. The bottom panel shows the social
structure, with each person's contacts divided into three different social circles.
that the agent has one chance of transmitting the virus at a given contact. A chosen
proportion of agents were designated as superspreaders, with s_i = 50. Simulations
were run in a population of 1 million, seeded with 100 infected agents. In defined
time steps ∆t within an agent's infectious period, each infected agent was chosen for
a contact with an age-dependent probability. For each chosen agent we assigned a
contact in one of three sectors: home, work/school, or other. The sectors were selected
with probabilities such that contacts occur in a ratio of 1:1:1 across the population,
i.e. 1/3 in each of the three social sectors, resembling
social science data from Mossong et al. (2008).
[Figure panels: cases/1000 and ICU beds/100,000 over time, in simulations with and without superspreaders.]
Figure 5.17: Agent-based model for the Covid-19 epidemic: The top panel shows a
simulation without superspreaders, the lower panel a simulation with superspreaders.
[Figure panels a)–b): model vs. Swedish data, Covid-19 deaths per day as a function of time (days).]
Figure 5.18: Agent-based model fit to Swedish Covid-19 data: The figure
shows that the model with superspreaders provides a more plausible fit to the Swedish
mortality data for Covid-19 than a model without superspreaders
(rightmost panel). The shown data cover the record until 1 July 2020.
The benefit of reducing "other" contacts, randomly chosen from the population, became much greater
than the corresponding benefit when we did not include superspreaders (Fig. 5.18 c, f). The
projected numbers of cases were also substantially smaller when superspreaders were
included in the model.
Questions:
5.5) Simulate the SIR model with γ = 1 and λ = 10, starting with I = 0.000001
and S = 1. Assume that all individuals in R are dead, but that there is a birth process,
included by adding a term +0.1 · S · (1 − S) to the equation for S. Simulate the long-time
dynamics of this disease.
5.6) Formulate an extended SIR model where there are two populations, and infection
from one to the other occurs (but not the reverse). Assume equal population
sizes and the same parameters γ and λ for all allowed infections. Compare the population
collapse in the two populations for γ = 1 and λ = 5.
5.7) Simulate the SIRS model with λ = 5, γ = 1, starting with I = 0.0001
and S = 1 (corresponding to a population of 10,000). Assume, in addition to the
standard SIR model, that R is converted to S with rate 0.01. Simulate the long-time
dynamics of this disease. Then simulate the long-time dynamics of the disease with a
Gillespie algorithm for a population of size 10,000. Assume that there is always one
infecting individual (out of the total population of 10,000; this prevents extinction of
the disease).
5.8) Construct an agent-based model for an epidemic where 10% of the agents do all the infections,
but all are equally susceptible. Assume a normal SIR framework, with an infectious
period of 10 days, and an infection rate such that each of the superspreaders
can infect 30 other persons at the beginning of the epidemic. Consider a society
with 10,000 persons.
a) Assume first that persons contact each other randomly across the population,
and follow the epidemic trajectory starting with 1% infected.
b) Assume instead that each person is embedded in an Erdos-Renyi network with
average connectivity k = 5. Use the same infection parameters as before and calculate
the epidemic trajectory starting with 1% of the population infected.
Mini tutorial: Can you mention any example of meta-stable systems in physics/your
surroundings?
Competing states are part of society, where opinions spread through social contacts
[104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 98, 99, 93, 114]. Heavily studied
systems are the "voter models" [107, 108], where agents take one of two opinions,
+1 or −1, and update these by repeatedly setting the states of pairs of agents to
be equal. Fig. 5.19 shows coarsening in a voter model that starts with five different
states assigned randomly to a number of agents on a one-dimensional line. In fact,
the Voter model will always coarsen to a state where eventually only one opinion
survives, and all agree on everything. However, if one adds some external noise to
the model, then the dynamics will stop this coarsening, and the system instead stabilizes
at a finite level of coarsening set by the level of the noise. Other interesting
approaches include the Axelrod model [104], where opinions are multidimensional
[Figure panels: two snapshots of coarsening voter-model configurations (space vs. updates/agent) at different system sizes, and the number of boundaries as a function of updates per site on a log-log scale.]
Figure 5.19: Dynamics of the Voter model. At each step one selects
one site and then sets its state equal to that of a neighbor. In the simulation
above we assume five different states. The system always coarsens, and the
boundaries perform random walks. Coarsening happens when two
random-walking boundaries meet and annihilate the opinion between them. Notice the
self-similarity of the two coarsening pictures, where a five times larger system
coarsens to a similar number of patches after a 25-fold longer time. Because
all boundaries perform random walks, the domain sizes grow as √time and the
number of boundaries decays as 1/√time, see the rightmost panel for simulations
of an L = 100,000 system.
(and agents only communicate to the extent that they at least share some opinions
with each other). This model also coarsens, but without noise it will freeze into a
state of non-communicating clusters. Allowing for noise in the communication rule,
where also agents without anything in common sometimes communicate, ultimately
leads to a uniform state.
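A minimal sketch of the voter dynamics of Fig. 5.19 (one dimension, five initial opinions; system size and times are illustrative):

    import numpy as np

    rng = np.random.default_rng(4)
    N, sweeps = 1000, 200
    state = rng.integers(0, 5, size=N)          # five random initial opinions

    for t in range(1, sweeps + 1):
        for _ in range(N):                      # one sweep = N updates
            i = rng.integers(N)
            j = (i + (1 if rng.random() < 0.5 else -1)) % N
            state[i] = state[j]                 # copy a random neighbor
        if t in (10, 50, 200):
            nb = np.count_nonzero(state != np.roll(state, 1))
            print(f"after {t} sweeps: {nb} boundaries")

The number of boundaries should decay roughly as one over the square root of time, as stated in the caption of Fig. 5.19.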
Figure 5.20: Models for bi-stability. The three-state model (upper panel)
does not require cooperativity. The schematic of the two-state model (lower
panel) indicates that two representatives of R are needed to convert one L
into an R. This process represents cooperativity. Each site represents an agent
that can be in either the R or the L state. Transitions between these two
states are partly random and partly recruited: at each update of an agent i,
with probability β the agent i is set to the opposite state. Subsequently, two
other agents are chosen, and if these two are in the same state, then the state
of another random agent conforms to this state.
dr/dt = (r² · (1 − r) − r · (1 − r)²) − β · r + β · (1 − r) + ξ(t) .   (5.13)

Here, the noise has average ⟨ξ⟩ = 0 and variance ⟨ξξ⟩_t ∝ 1/N. The above
equation can be rewritten as

dr/dt = r(1 − r)(2r − 1) + β · (1 − 2r) + ξ = (r(1 − r) − β) · (2r − 1) + ξ   (5.14)

The above equation has one steady-state solution (dr/dt = 0) when β > 1/4, and
3 solutions for β < β_c = 1/4. For small β there are therefore two stable solutions:
one at low, another at high r, separated by a barrier at the unstable state with
r = 1/2.
[Plot: desertification proxy (AU, dust in sediments) as a function of time (years before present, −25,000 to 0), switching between a "green Sahara" state and a desert state.]
Figure 5.22: Bistable vegetation model. Switches to and from the desert state
in the Western Sahara, measured by dust in sediments off the coast. 6000 years
ago the "green state" terminated.
Figure 5.23: Model for a micro-climate. Imagine a lattice where each site
can be either tree (T), grass (G) or desert (D). All states tend to spread with
some rate to neighboring sites, with cattle being responsible for the removal
of grass and forest, whereas random seeding (β) accomplishes the opposite. The
extreme states (tree and desert) influence the conversion of each other through
their influence on water drainage and wind erosion. The overall state of the
system can be perturbed by external drivers, like rain (favoring growth) or
cattle (destroying vegetation).
To select when an event actually occurs, the exponential decay specifies the cumulative
probability that the event occurs at times larger than ∆t. One should select
a uniform random number ran ∈ [0, 1] and find the ∆t which solves exp(−r∆t) =
ran. The ∆t selected in this way will be exponentially distributed. Accordingly, if
the current time is t, then the next event should be assigned to occur at t + ∆t with

∆t = −(1/r) · ln(ran) .   (5.16)

Here, 1/r is the average time to the next event.
[Schematic: selecting event times by inverse transform sampling. Left: a general probability distribution p(t) with ∫₀^∞ p(t′)dt′ = 1 and cumulative P(t) = ∫₀^t p(t′)dt′; a uniform random number selects t via P(t) = random. Right: the exponential distribution p(t) = r·exp(−r·t) with P(t) = 1 − exp(−r·t), giving t = −ln(random)/r.]
In the typical case where several competing events can take place, one needs at any
time a list containing the time at which each of these events
should next occur. These could, for example, be events where a variable increases,
and competing events where the same variable decreases.
One update consists of selecting the first (earliest) event in this list, and then
making the change specified by the chosen reaction. Subsequently, the rates for
many of the other events may change, which then serves as new input for
the next event. Also, one should keep track of the total time during the simulation,
always updating this time with a step size given by the time to the selected
event.
Mini tutorial: If doubling the event sizes, how should one scale the rates to maintain
the same average behavior?
After initialization with start numbers for all variables, an update step in the
event-driven algorithm reads:
1 Monte Carlo step: Generate random numbers ran_i to determine the time-step
∆t_i = −(1/r_i) · ln(ran_i) for all potential events i, and select the earliest event for
updating.
2 Update: Increase the time by the generated time-step (from Step 1) and update
the variables with the change associated to the selected event.
The key assumption is that we consider systems without memory. That implies
that any event only depends on the quantified state of the system, and that its
occurrence is independent of how much time has passed since the system last
changed.
Mini tutorial: How could the event sizes be larger than one unit in a system?
Mini tutorial: If event size is doubled for one of the processes, what would that
mean for the resulting noise?
The step sizes can also be different for each of the terms in the differential
equation. For example, the increase could occur by making 2 units at each
event, while the decay term still changes 1 unit at a time. This all depends on
the underlying "physics" of the problem. If each term is changed in steps of ∆_i, the
simulation of a dynamics with i = 1, 2, . . . different processes should proceed as:
1 Monte Carlo step: Generate random numbers ran_i to determine the time-step
∆t_i = −(∆_i/r_i) · ln(ran_i) for all potential events i, and select the earliest event for
updating.
2 Update: Increase the time by the generated time-step (from Step 1) and
update the variable x with the change ∆i associated to the selected event.
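A minimal sketch of this event-driven scheme for the birth-death example dx/dt = 12 − x (the system of exercise 5.10 below), with unit step sizes assumed for both processes:

    import math, random

    random.seed(5)
    x, t = 0, 0.0
    for _ in range(10_000):
        rates = {+1: 12.0, -1: float(x)}             # two competing events
        waits = {dx: -math.log(random.random()) / r  # eq. (5.16) per event
                 for dx, r in rates.items() if r > 0}
        dx, dt = min(waits.items(), key=lambda kv: kv[1])
        x += dx                                      # apply the earliest event
        t += dt                                      # advance the global clock

    print(f"t = {t:.1f}, x = {x} (fluctuates around 12)")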
Questions:
5.9) Draw 100,000 random numbers ran(i) uniformly between 0 and 1, and for each
number set xi = − ln(ran(i)). Plot the histogram of xi . Fit an exponential function
to this histogram.
Qlesson: It is simple to simulate exponential distributions.
5.10) Use the Gillespie algorithm to simulate the dynamics of dx/dt = 12 − x, with x
changing in steps of 1. Plot over 1,000 events. Then redo the simulation with the production
of x changing in steps of 4, while removal still changes in steps of one. Compare
the variation (standard deviation) in the time series of x in the two cases.
5.11) Draw 100,000 pairs of random numbers t1(i), t2(i), each from the distribution ∝ exp(−t²/2)
with t > 0, that is, the right half of a Gaussian with center 0 and standard deviation
one, and compare the distribution of t2 − t1, for all cases where t2 > t1, with
the distribution of t2.
Hint: if one draws 12 numbers uniformly between 0 and 1, their sum is approximately Gaussian
with standard deviation 1 and mean 6.
Plot the histogram of dt = t2(i) − t1(i) for all pairs where t2(i) > t1(i).
Qlesson: The distribution of second events, given that the first has occurred, is different
when using Gaussian distributions (or any distribution other than an exponential).
5.12) Make a Gillespie simulation of the 2-state recruitment model with cooperativity,
as formulated in terms of the different processes in eq. (5.13). Let r vary between
0 and 1 in steps of 0.04 and set β = 0.1. Change the step size to 0.03 and check how
the stability of one of the states increases.
Qlesson: The step size 0.04 should correspond to an agent-based model with N = 25.
Lessons:
Supplementary reading:
Van Kampen, Nicolaas Godfried. Stochastic processes in physics and chemistry.
Vol. 1. Elsevier, 1992.
Farmer, J. Doyne, and Duncan Foley. ”The economy needs agent-based modelling.”
Nature 460.7256 (2009): 685-686.
5.7 Appendix
It may sometimes be useful to perform stochastic dynamics using a Langevin equation,
where one follows the development of a variable in small, fixed time steps, or
perhaps even in the Fokker-Planck formalism, where one propagates a whole ensemble
of systems.
By using physical insight, the two terms in the Langevin equation can be recast in
terms of loss and gain of particles within [x, x + dx], P(x, t) · dx. P(x, t) will evolve
according to the Fokker-Planck equation:
dP(x, t)/dt = −dJ/dx   (5.21)

where the current J is given by

J = −µ · P · dV/dx − D · dP(x, t)/dx   (5.22)
At equilibrium, dP/dt = 0, implying (with no net flux) that J = 0 and thus

P(x) ∝ e^{−(µ/D)·V(x)} ,

which can only be ∝ e^{−V(x)/k_BT} if µ = D/(k_BT). This famous equation by Einstein
states that the mobility µ is proportional to the diffusion constant D. The diffusion
constant D has the dimension of a mean free path times a typical (thermal) velocity.
The diffusive part of the Fokker-Planck equation describes how an initially localized
particle spreads out, in flat potentials to a Gaussian with spread σ ∝ √(Dt), which
simply follows from the central limit theorem. The convective part of the Fokker-Planck
equation states that the particle moves downhill, with a speed proportional
to both dV/dx and the mobility µ.
[Schematic: a potential V(x) with a well at A containing n particles and a transition state C at barrier height ∆V; the escape flux scales as n · exp(−∆V/k_BT).]
The value of P_A can be estimated by using a harmonic approximation around the
minimum A, setting P_A equal to the maximum of the corresponding
Gaussian density profile. Thus, around point A:

V(x) ≈ V_A + (1/2) · (d²V/dx²) · (x − x_A)² = V_A + (1/2) · k · (δx)²   (5.28)

For a particle with mass m this is a harmonic oscillator with frequency ω = √(k/m).
The peak density P_A is given by normalization of exp(−k·δx²/(2k_BT)) (everything
has to be counted as if there is one particle in the potential well that can escape).
The escape rate (J per particle) is:

r = (ω_A · τ/m) · √(m·k_BT/(2π)) · e^{V_A/k_BT} / ∫_A^B e^{V/k_BT} dx   (5.29)

The remaining integral is calculated by a saddle-point approximation around its maximum, i.e.
around the barrier top at point C:

∫_A^B e^{V/k_BT} dx = e^{V_C/k_BT} · ∫_{−∞}^{∞} e^{−k_C·δx²/(2k_BT)} dδx = (√(2π·k_BT)/√(k_C)) · e^{V_C/k_BT}   (5.30)

which with k_C = m·ω_C² gives the final escape rate for overdamped motion:

r = (ω_A/2π) · ω_C·τ · exp(−(V_C − V_A)/k_BT)   (5.31)
This equation can be interpreted as a product involving a number of attempted
climbs,

number of attempts = ω_A/2π ,   (5.32)

multiplied by the fraction of these climbs that can reach C, simply given by the
Boltzmann weight e^{−(V_C−V_A)/k_BT}. Finally, just because a climb reaches the saddle
point, it is not given that it will pass. The chance that it will pass is ω_C · τ, which
is equal to one divided by the width of the saddle, in units of the steps defined
by the random kicking frequency 1/τ. I.e., imagine that the saddle is replaced by a
plateau of w = 1/(ω_C·τ) steps, and that we enter the first (leftmost) of these steps. We
then perform a random walk over the plateau, with absorbing boundaries on both
sides. As this is equivalent to a fair game, the chance to escape on the right-hand
side is 1/w. Thus one may interpret the overdamped escape as:

r = (attempts to climb) · (chance to reach top given an attempt) · (chance to pass top given it reached the top)   (5.33)
and one immediately notices that a higher viscosity, meaning a lower τ, implies that
the escape rate diminishes. This is not surprising, as a higher viscosity means that
everything proceeds correspondingly slower, and therefore also the escape.
Lesson: The escape from a potential well is exponentially difficult in the inverse
temperature, where T ∝ D from Einstein's relation. In our agent-based models,
the effective temperature would be the variance of the noise term. When the noise in
our Gillespie simulation is twice as big, the escape proceeds as if the "temperature"
were four times bigger.
Chapter 6
Econophysics
For then, since gold was soft and blunted easily, man would deem
it useless, but bronze was a metal held in high esteem.
Now the opposite: bronze is held cheap, while gold is prime.
And so the seasons of all things roll with the round of time:
What once was valuable, at length is held of no account,
while yet the worth of which was despised begin to mount.
Lucretius, De Rerum Natura, Book 5 (ca. 60 BC)
market. Time-series analysis of stock prices in part reflects the ancient dream of
predicting the future from the past in order to make a profit. Much effort is put into
the analysis of time series, especially of stocks, even though, as we will see, they
are inherently unpredictable. We will here outline some of the simplest measures.
Figure 6.2: Dow Jones: An index following the average of the major shares in
the USA. The index increases by about a factor 4,000. For comparison, the US public
debt changed from ∼ 10^8 $ in the period 1800-1850 to ∼ 5 × 10^{12} $ in year 2000.
Fig. 6.2 shows a stock market index during a 200-year period. The index is
calculated as the average of many shares, and should thus in principle be much less
variable than individual shares. In spite of this, there are indeed wild fluctuations,
with occasional collapses where the overall value of all stocks drops by a factor 10
over a relatively short period. In fact, when one inspects stock markets across the
world, nearly all of them have had about one reduction by a factor 10 during
the last century. Value is dynamic.
To a first approximation the market exhibits a biased random walk. More precisely,
de-trending for the overall increase due to general growth of the economy and inflation,
log(price) follows a random walk. In Fig. 6.3 we show the de-trended Dow-Jones
index, removing trends longer than about 5 years. The random walk
hypothesis was first put forward more than a century ago by Bachelier [116], and has
been supported recently by analyzing the price fluctuations W(T) as a function of the time interval T:

W(T) = ⟨(s(t + T) − s(t))²⟩_t^{1/2} ,

where the average is taken over all starting times t of intervals of duration T in the
available time series.
For a random walk W(T) ∝ T^{0.5}, whereas most stock markets show W(T) ∝
T^{0.55...0.65}, with the lowest values of the Hurst exponent for the oldest markets. Notice
that one can define the Hurst exponent in terms of the variance of prices over a
time interval of length T, or instead just in terms of the variation after a
time interval T. In both cases it involves sampling a lot of different starting points!
Figure 6.4: Hurst exponent simplified. The scaling of the spread in
s = ln(v) when measured over different times T. Thus the spread in ∆s =
s(t + T) − s(t) is a function of T.
To characterize the stochastic dynamics of a time series one uses the Hurst
exponent. The Hurst exponent H is defined by the scaling of the typical change in
price over a time interval of length T,

W(T) = ⟨(s(t + T) − s(t))²⟩_t^{1/2} ∝ T^H ,

where one normally follows the logarithm of the price, s(t) = log(v(t)). This measurement
is performed by averaging over all starting points t in a given time series,
using the prescription shown in Fig. 6.4.
In economic time series one follows the logarithm of the price because it is the
relative change in price that actually matters: this determines how much
your investment returns. Thus, if a share changes value from 10 to 11, or
from 100 to 110, it is the same relative change, and the same change in ∆log(v). The
scaling assumption in the above equation reflects the near-random-walk behaviour
of the market, where deviations grow with time with some exponent that in fact is
close to that of a random walk.
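A minimal sketch of this measurement, applied here to a synthetic random walk where the estimate should come out near H = 0.5 (series length and lags are illustrative):

    import numpy as np

    rng = np.random.default_rng(6)
    s = np.cumsum(rng.standard_normal(100_000))    # log-price: a random walk

    Ts = np.array([1, 2, 4, 8, 16, 32, 64, 128, 256])
    W = np.array([np.sqrt(np.mean((s[T:] - s[:-T])**2)) for T in Ts])
    H = np.polyfit(np.log(Ts), np.log(W), 1)[0]    # slope of log W vs log T
    print("estimated Hurst exponent:", round(H, 3))

Applied to a real, de-trended log-price series instead of the synthetic walk, the same prescription yields the market Hurst exponents quoted above.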
The correlation between the past and the future is related to the Hurst exponent
H. Consider the variation around the present time, t_0 = x, with the forecast at a time
T in the future being ∆s(T) = s(x + T) − s(x), whereas the historical counterpart is
∆s(−T) = s(x) − s(x − T). Thus, we want to calculate the correlation between
past and future,

C = ⟨∆s(−T) · ∆s(T)⟩_x / ⟨(∆s(T))²⟩_x ,   (6.3)

where we use the assumption that an average over all starting time points x makes
⟨∆s(T)²⟩_x and ⟨∆s(−T)²⟩_x equal (stationarity).
The numerator in eq. (6.3) can be re-expressed using (s(x + T) − s(x − T))² = (∆s(T) + ∆s(−T))², which scales as (2T)^{2H}:

⟨∆s(−T) · ∆s(T)⟩_x = (⟨(s(x + T) − s(x − T))²⟩_x − 2 · ⟨∆s(T)²⟩_x)/2 ∝ ((2T)^{2H} − 2 · T^{2H})/2 ,   (6.4)

so that

C = 2^{2H−1} − 1 .   (6.5)
Figure 6.6: Past → future. The left panel shows an example of a time series with Hurst
exponent H = 0.40, generated by a wavelet method (not part of the curriculum). The right
panel examines the average return on investment as a function of H, where one
buys according to the trend [117]. The red curve shows the profit when one buys
on the way up and sells on the way down in H > 0.5 markets, and oppositely in
H < 0.5 markets. The two other curves invest proportionally to the size of the past
price change (ν = 1), respectively to this change squared (ν = 2); thus, weighting
the trend pays off even more. All returns are measured in units of the spread
in volatility during the time interval considered, and the curves in fact scale
proportionally to this as the investment horizon T increases.
Thus, an ordinary random walk with H = 1/2 has C = 0, whereas an H > 1/2
walk implies that the direction of the past price difference ∆s(−T) = s(0) − s(−T) is most likely
maintained in ∆s(T) = s(T) − s(0). That is, if the price increased during the past
month, then on average it will also increase during the next month. In contrast, in
an H < 1/2 market the price fluctuations will tend to revert.
To get an interpretation of the above correlation, consider a stock that on a time
scale T follows the trend with probability p and reverses it with probability 1 − p.
The variance for one step of this walk is ⟨(∆s(T))²⟩_t = 1, while the numerator in eq.
(6.5) is given by the sum of two contributions, one for following and one for
reversing the trend: p − (1 − p) = 2p − 1.
Accordingly, using eq. (6.5) one finds that p is related to the Hurst exponent by
2p − 1 = 2^{2H−1} − 1. This is the probability to follow the trend:

p = 4^{H−1} ,
Figure 6.7: Following the trend. When H > 0.5 there is more than
a 50% chance that the next move is in the same direction as the previous move.
However, the reverse is not true! A tendency to follow the trend typically
implies a random walk with a longer "persistence length," i.e. a longer time
before the walk changes direction. On this longer timescale the walk will still
be a random walk.
a statement that qualitatively should be true for all time intervals where the walk
can be characterized by H (see also Fig. 6.7). In particular, for H = 1/2 the
above probability equals 1/2, reflecting a truly unbiased event. In the questions
we will try to use this to gain profit in correlated markets.
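A minimal sketch of such a trend-following walk (p = 0.75, which by p = 4^{H−1} corresponds to an effective H ≈ 0.79 on short time scales):

    import numpy as np

    rng = np.random.default_rng(7)
    p, n = 0.75, 100_000
    steps = np.empty(n)
    steps[0] = 1.0
    for t in range(1, n):               # follow the previous step with prob. p
        steps[t] = steps[t - 1] if rng.random() < p else -steps[t - 1]
    s = np.cumsum(steps)

    for T in (2, 8, 32, 128, 512, 2048):
        W = np.sqrt(np.mean((s[T:] - s[:-T])**2))
        print(f"T = {T:5d}   W(T) = {W:9.2f}")

The slope of log W(T) vs. log T should approach 1/2 once T exceeds the persistence time, as anticipated in the caption of Fig. 6.7.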
In the H > 1/2 case, a winning strategy is to "bet" on the trend: buy when it
is a bull market, and sell when it becomes a bear market [117]. Thus, for H > 1/2 one
should buy when the price has just moved up and sell when it has just moved down,
whereas this strategy should be reversed in an H < 0.5 market, see Fig. 6.6. Notably,
electricity markets have H = 0.40 [119, 120]. Again we emphasize that this
buy-sell strategy would work with trading intervals anywhere inside the time scale
where the walk is characterized by the Hurst exponent H.
Finally, as a small note, a walk with Hurst exponent H has a fractal
dimension 2 − H when one considers the position as a function of time (the walk
embedded in 2-dimensional space-time). From this one can calculate the first-return
statistics for various H-walks. Do this! (For help, see chapter 2.)
• Summary: In spite of all the people thinking, talking and dealing, the resulting
market behaves nearly as a random particle exposed to Brownian noise.
Second-order correlations presumably reflect crowd panic.
Questions:
6.1) Simulate a walk where the logarithm of a price, s, moves one step up or one
step down at each time-step. Let the probability to continue in the same direction
as in the previous step be p = 0.75. Investigate the Hurst exponent of this walk
numerically. Redo the simulation for p = 0.99. Hint: just calculate the variance
for one hundred simulated time series of length 100, one hundred of length
1,000, and one hundred of length 10,000. Plot the variance of the end points on a
log-log scale. (You can equivalently use one very long time series and extract various
segments from it.)
[Plot: stock market fluctuations at different timescales, shown on linear and log scales as a function of deviation/spread, exhibiting fat tails.]
Figure 6.8: Fat tails: The distribution of short-time-scale fluctuations exhibits fat tails.
The left panel shows short-timescale fluctuations of an index, re-scaled with the
timescale over which one examines the fluctuations. In the right panel, the red and blue
curves have the same variance but different kurtosis. Kurtosis quantifies the 4th moment,
normalized by the second moment squared. It is more sensitive to the tails of a distribution
than the second moment, and would thus be divergent when p(tail) ∝ 1/∆s^τ, with
τ ≤ 5.
Qlesson: Any finite p > 1/2 still leads to a random walk, just with a correlation
time related to ln(1 − p). (What is the pre-factor? What would happen
for p < 1/2?)
6.2) Simulate a random walk of uncorrelated up and down movements of s, where
step sizes δ are chosen from the fat-tailed distribution P(δ) ∝ 1/δ³. Visualize the
walk. Calculate the Hurst exponent by simulation.
Qlesson: Notice that the mean squared displacement diverges.
6.3) Plot eq. (6.5) as a function of the Hurst exponent H, and interpret it in terms
of the profit of a sensible strategy. Devise an investment strategy and calculate the
maximum average profit per investment step for an H = 0.4 market.
Qlesson: Act as if tomorrow will be the opposite of today.
6.4) Generate a market profile from the upper envelope of directed percolation, using
a critical value of p (and restarting with a new seed at the last present seed when all live
sites in the DP die out). That is, when the upper branch dies out, one experiences
a sudden collapse. Analyze the Hurst exponent of this market. Try to devise an
investment strategy to make money in this market, and simulate the investment
strategy assuming that it is the logarithm of the price that follows this trajectory.
Qlesson: This is a persistent walk (exponent 0.63) with occasional collapses that
can be very large. Follow the trend, but bet hedge (see later).
"In economics, the majority is always wrong." (John Kenneth Galbraith.) This
classic quote can in fact be quantified by considering the coordinated movement of
many stocks. To explore economic time series we now consider inverse statistics
[?, 121]. In turbulence one often measures velocity differences as a function of distance,
and obtains the famous Kolmogorov scaling. However, one could also consider
the inverse statistics that measure the time, or the distance, until the next large fluctuation
in relative velocity occurs. Thus, the inverse statistics focus attention on
the laminar/calm regions of the fluid [?], with large distances corresponding to large
laminar regions. In economics, the corresponding measure is associated with the time
it takes before one obtains a given return on an investment. This will take a long
time when stocks are calm, or when fluctuations are in the direction opposing the
one aimed at.
Let v(t) denote the asset price at time t. The logarithmic return at time t,
calculated over a time interval T, is defined as ∆s(T) = s(t + T) − s(t), where
s(t′) = log v(t′). We consider a situation in which an investor aims at a given return
level, ρ, that may be positive (being "long" on the market) or negative (being "short"
on the market). If the investment is made at time t, then the inverse statistics, also
known as the "investment horizon," is defined as the shortest time interval τ(t) = T
fulfilling the inequality ∆s(T) ≥ ρ, given that ρ ≥ 0. For losses, ρ < 0, one similarly
defines the first time T where ∆s(T) ≤ ρ. The inverse statistics histogram or,
in economics, the "investment horizon distribution", p(τ_ρ), is the distribution of
waiting times T for obtaining the strike price. It is obtained by averaging over all
initiation times t in the available time series.
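A minimal sketch of this measurement, applied here to a synthetic random-walk log-price with 1% daily moves (ρ, series length and the subsampling of start days are illustrative):

    import numpy as np

    rng = np.random.default_rng(8)
    s = np.cumsum(0.01 * rng.standard_normal(200_000))   # synthetic log-price
    rho, Tmax = 0.05, 1000
    horizons = []
    for t in range(0, len(s) - Tmax, 50):                # subsample start days
        hit = np.nonzero(s[t + 1:t + Tmax + 1] - s[t] >= rho)[0]
        if hit.size:                                     # first T with ds >= rho
            horizons.append(hit[0] + 1)

    h = np.bincount(horizons, minlength=Tmax + 1)
    print("most probable investment horizon:", h[1:].argmax() + 1, "days")

For this symmetric random walk the histograms for ρ and −ρ coincide, so the gain-loss asymmetry discussed next is a genuine market feature.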
The data set used is the daily close of the DJIA, covering its entire history
from 1896 until today. Fig. 6.10 depicts the empirical inverse statistics histograms
for the investment horizon distributions. The distributions are shown for a return
of 0.05 with open blue circles and a return of −0.05 with open red squares. The
histograms possess well-defined and pronounced maxima, the optimal investment
horizons, followed by long 1/t^{3/2} power-law tails.
Remarkably, the optimal investment horizons for return levels of equal magnitude
but opposite sign are different. Thus, the market as a whole, monitored by the
DJIA, exhibits a fundamental gain-loss asymmetry. As mentioned above, other
stock indexes, including the S&P 500 and NASDAQ, also show this asymmetry,
while, for instance, foreign exchange data on currencies do not.
It is even more surprising that a similarly well-pronounced asymmetry is not found
for any of the individual stocks constituting the DJIA. This can be observed from
the inset of the figure, which shows the results of applying the same procedure to
the individual stocks.
Figure 6.10: Inverse statistics and Fear-factor model. The upper two
panels show the definition of the “strike price” and the distribution as measured
from the de-trended Dow Jones index. The blue curves show the number of
days until the price first exceeds the current price by 5%, the red until it
first lies 5% below the current price (the inset shows corresponding distributions
for individual companies). To read the curves, the x-axis labels the day following
the investment and the y-axis the probability that the price first reaches the
5% deviation on that day. The lower panels define the model and show predicted
strike-price distributions.
where δ > 0 denotes the common fixed log-price increment (by assumption), and

(1 − p) · q = p + (1 − p) · (1 − q) ,    (6.10)

q > 1/2 is a “compensating” drift that governs the non-synchronized periods¹. From
the price realizations of the N single stocks, one may construct the corresponding
price-weighted index, as in the DJIA, according to

I(t) = (1/N) Σ_{i=1}^{N} v_i(t)    (6.12)
and investigate inverse statistics for this index (Fig. 6.10). The overall result: the
DJIA is reproduced with one collective fear event that occurs with probability
p = 0.05 per day, corresponding to roughly one panic event per month. The other
parameter is ρ = 5 · σ, where σ is the standard deviation of the daily index movement
(the average stock movement), and we use an index of N = 30 shares. For the DJIA
the typical daily fluctuations have σ = 1%.
¹ Note that there are only solutions when p < 0.5. For larger p the market is doomed, as
it is not possible to compensate for the collective downward moves.
We conclude that the asymmetric synchronous market model captures basic
characteristic properties of the day-to-day variations in stock markets. The agree-
ment between the empirically observed data, here exemplified by the DJIA index,
and the parallel results obtained for the model lends credibility to the view that the
presence of a “fear-factor” is a fundamental social ingredient in the dynamics of the
overall market (see also the cartoon in Fig. 6.11).
• Summary: Crowd behavior and panic, even on the relatively small scale of a
once-a-month event, can be detected using inverse statistics.
Questions:
6.5) Consider the fear-factor model with 10 stocks that move one step up or down,
all starting at 1,000. With probability p = 0.05 all stocks move down simultaneously.
What should the probabilities for the other up, respectively down, movements be in
order to let individual stocks perform an unbiased random walk? Simulate the system
and plot the time series for the average stock price (a sketch follows below).
Qlesson: Ups and downs of the average are asymmetric, but the average change is
zero, with Hurst exponent 1/2.
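A sketch of the simulation asked for in exercise 6.5, with an illustrative log-price step size δ; the compensating probability q follows directly from eq. 6.10, since unbiasedness requires P(down) = p + (1 − p)(1 − q) to equal P(up) = (1 − p)q, giving q = 1/(2(1 − p)):

    import numpy as np

    rng = np.random.default_rng(3)
    N, T, p, delta = 10, 10_000, 0.05, 0.01
    q = 1.0 / (2.0 * (1.0 - p))           # compensating drift from eq. 6.10
    s = np.full(N, np.log(1000.0))        # log prices, all starting at 1,000
    index = np.empty(T)

    for t in range(T):
        if rng.random() < p:              # collective fear: synchronized drop
            s -= delta
        else:                             # non-synchronized day with drift q
            s += delta * np.where(rng.random(N) < q, 1.0, -1.0)
        index[t] = np.exp(s).mean()       # price-weighted index, eq. 6.12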
Figure 6.12: Market model for volatility. A) M(t + 1) − M(t). The volatility
is supposed to reproduce the data shown in B), illustrating daily returns,
(S(t) − S(t − 1))/S(t − 1), for the Dow Jones stock market index. Fluctuations
are correlated: when variations on one day are large, they most likely
are large again the next day [123]. The directions of these fluctuations are
uncorrelated! Volatility clustering is sometimes also discussed in terms of the
GARCH model (Tim Bollerslev, 1986).
Figure 6.13: Herding. A market driven by herding can give huge volatility
(Kalton, in the Economist)
Second, there is a globally shared memory of how much each product was traded
during the last τ · N timesteps.
[Figure 6.14: the two feedback loops; “+” increases M, “−” increases S.]
• Local Perception: Agent i has memory “slots” m_ij, j = 1, . . . , µ, each of which
can be assigned one of the D product types.
• Global Perception: All agents have access to the information of the common
global trading activity S_j, j = 1, . . . , N · τ, associated with the N · τ most recent
trading events. Each of these global trading positions is assigned one of the D
product types.
The model is executed in steps. At each step, two random agents i and j are
picked. Agent i now selects the good k that he/she considers to have the highest
relative value:
p_{ik} = (1 + Σ_{l=1}^{µ} δ(M_{il} − k)) / (Σ_{l=1}^{N·τ} δ(S_l − k)) = (1 + M_i(k)) / S(k) ,    (6.13)
and decides to share it with agent j. Here the δ-function equals 1 if the corresponding
memory slot refers to product type k, and 0 otherwise. The sum in the numerator
thereby counts the number of times M_i(k) that product k occurs in the memory of
agent i. The added 1 in the numerator avoids absorbing states where a product is
absent from the memory of all agents. When several products have the same maximal
p_{ik}, one of them is randomly chosen to be the active one.
The exchange causes the following changes in the memory lists (a simulation sketch
is given below):
• First, one adjusts the local memory by inserting the chosen product k at one
randomly chosen place x, M_{jx} = k, in the receiving agent j.
• Second, one adjusts the shared memory S by inserting the chosen product k
at one randomly chosen place y, S_y = k, in the global memory.
Figure 6.15: A-C) Attention M(k) (colors) for each of D = 3 different products in
an economy with N = 10 agents, each having µ = 10 memory slots. Grey dots show
the global market share G of each product. A) τ = 0.5 · µ, B) τ = µ, C) τ = 2 · µ.
D) Coefficient of variation of the total memory M(k) of a product. The simulation
uses an individual memory µ = 20 and an economy with D = 20 products. E)
Spread (square root of variance) in M(k) as a function of the time separation ∆t
between measurement points. Parameters: µ = 200, D = 100 with N = 100 agents.
τ is 100 (yellow), 200 (red) and 400 (blue), respectively. Grey shaded areas are
bounded by slopes H = 1 and H = 0.5, respectively.
When the new memory is inserted, an old memory “bit” is discarded. Thereby, τ
defines the characteristic time for adjustment of the global market, whereas µ is the
lifetime of the individual memory.
The global trade activity of a product k, S(k) = Σ_{l=1}^{N·τ} δ(S_l − k), reflects the
common history shared among all agents. When S(k) of a product k is large, it means
that the product has been traded a lot in the past. Increases in S(k) could for
example reflect production of the “traded” product, with an increase in this number
making each copy less valuable.
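A minimal sketch of the update rules defined above (eq. 6.13 plus the two memory insertions), under stated assumptions: random initial memories, uniform random replacement positions, and S(k) = 0 treated as infinite relative value so such a product is immediately re-traded; the parameters correspond roughly to Fig. 6.15A:

    import numpy as np

    rng = np.random.default_rng(4)
    N, D, mu, tau, T = 10, 3, 10, 5, 50_000
    M = rng.integers(0, D, size=(N, mu))      # local memory slots M_ij
    S = rng.integers(0, D, size=N * tau)      # global trading record S_j

    for step in range(T):
        i, j = rng.choice(N, size=2, replace=False)
        m_count = np.bincount(M[i], minlength=D).astype(float)
        s_count = np.bincount(S, minlength=D).astype(float)
        with np.errstate(divide="ignore"):    # S(k) = 0 -> infinite value
            p = np.where(s_count > 0, (1 + m_count) / s_count, np.inf)  # eq. 6.13
        k = rng.choice(np.flatnonzero(p == p.max()))   # break ties at random
        M[j, rng.integers(mu)] = k            # update receiver's local memory
        S[rng.integers(N * tau)] = k          # update the shared global memory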
Fig. 6.14 highlights the two feedback mechanisms in the model: a positive
feedback of fashions/viral marketing on a peer-to-peer scale, and a negative feedback
that acts through the dynamics of a common marketplace. The lower panel in Fig.
6.14 shows the total local memory, M(k) = Σ_{i=1}^{N} M_i(k), that is allocated to a
particular good k. The simulation was done for a small economy where the common
history has a relatively long lifetime, τ > µ.
Fig. 6.15 explores the impact of the length of the global memory, with the common
market G adjusting faster than (A), as fast as (B), or slower than (C) the memory of
the agents.
[Figure 6.16, panels A)-C): M(k)/M and G(k)/G versus time (window 5000-6000).]
Figure 6.16: Simulation with N = 1 (memory just copied from one position to
another). All panels show 3 products out of D = 10. Grey dots show S(blue)/S.
A) Standard model with µ = τ = 100. B) Linear model where the copied memory
is selected proportionally to (1 + M(k))/S(k); again µ = τ = 100. C) Linear model
as in B) but with µ = 100 and τ = 1000.
There is a shift from random fluctuations around an equilibrium value
(= 1/D), to persistent fluctuations, to oscillation-like behaviour. The latter pattern
of alternating dominance (D) has a resemblance to the oscillations obtained with
frustrated bi-stability in some biological circuits [136, 137].
The stochastic oscillations in Fig. 6.15C) are characterized by a shifting dominance
of products in an order that is partly random. That is, when diversity is high, the
products that are not dominating tend to adjust toward each other, and the next
winner is selected by chance/contingency.
Fig. 6.15D) explores the variation in the attention M(k) allocated to a given product.
The panel shows that large variations occur when the length of the global memory τ
exceeds a threshold. This threshold increases with the local memory µ and grows
proportionally to the logarithm of the number of agents N. Thus the requirement
(on τ/µ) for emergent fashions grows only slowly with N.
Fig. 6.15E) explores the variation in total memory M as a function of the time
interval ∆t over which these variations are measured. For guidance we also mark
the behaviour corresponding to Hurst exponents H = 0.5 and H = 1.0, respectively.
There is a limited scaling regime, with an apparent exponent that varies between a
random walk (H = 0.5) and a fully persistent walk (H = 1, as in Fig. 2C).
Fig. 6.16 investigates the N = 1 case, corresponding to a direct copy of memory from
one position to another in both the M-list and the S-list. Panel A shows that even
similar lengths of the two memories, τ = µ, lead to large bubbles. Comparing the
blue curve M(k = 1) with the corresponding grey dots S(k = 1), one sees that the
two memory lists are nearly in equilibrium. Noticeably, their difference is enough to
drive quite large fashions.
Fig. 6.16B,C) relaxes the assumption of copying the product with maximal (M(k) +
1)/S(k). Instead, at each step we select a random product to be copied with prob-
ability P(k) ∝ (M(k) + 1)/S(k). One sees that the variations are smaller (panel B,
same parameters as panel A). However, increasing τ to τ ≫ µ again leads to large
variations in value. Thus the soft proportional selection can still cause “bubbles” if
the two timescales are widely separated.
Mini tutorial: Consider a model where one selects with probability ∝ ((M(k) +
1)/S(k))². How would that behave compared to the two models explored in the
text above?
The presented model only considered two feedback loops, while real products
in their complicated reality would be exposed to history on a variety of timescales:
production, production facilities, education, common sense, fundamental user value,
saleability, and social euphoria. In addition, Fisher [130] suggested feedback associ-
ated with the accumulation and spreading of debt as a driver for depressions. Some
of the above feedbacks can fuel a cycle of reinforcement, and bubbles may emerge
when these are relatively fast. In any case, we here focus on cases where these
“bubbles” ultimately become unstable, reflected in the assumption of a negative
feedback on the longest timescales, associated with some sort of slower-acting reality.
This perspective is partly compatible with the phenomenological description of the
waves of war presented in “War and Peace” by Tolstoy [142].
Questions:
6.6) In the above model we use the heat-bath method; therefore repeat a simulation
of the Ising model for a 10 × 10 system as a function of β (inverse temperature) and
plot the energy and average magnetization as functions of temperature (a sketch
follows below).
Qlesson: It works.
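A minimal sketch of heat-bath dynamics for a 10 × 10 Ising model with periodic boundaries, assuming J = 1, zero field, and illustrative sweep counts; for a smooth plot one would average over many configurations rather than the single final one used here:

    import numpy as np

    rng = np.random.default_rng(5)
    L = 10

    def heat_bath(beta, sweeps=1000):
        s = rng.choice([-1, 1], size=(L, L))
        for _ in range(sweeps * L * L):
            i, j = rng.integers(L, size=2)
            h = (s[(i + 1) % L, j] + s[(i - 1) % L, j]
                 + s[i, (j + 1) % L] + s[i, (j - 1) % L])
            # heat-bath rule: set spin to +1 with its equilibrium probability
            s[i, j] = 1 if rng.random() < 1.0 / (1.0 + np.exp(-2.0 * beta * h)) else -1
        E = -sum(np.sum(s * np.roll(s, 1, axis=a)) for a in (0, 1))
        return E / L**2, abs(s.mean())

    for beta in np.linspace(0.1, 1.0, 10):
        E, m = heat_bath(beta)
        print(f"beta = {beta:.2f}: E per spin = {E:.3f}, |m| = {m:.3f}")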
6.7) Simulate the above Ising-inspired model for volatility in a market model, using
an N = 10 × 10 system with β = 0.7 and α = 1, respectively α = 2 and 5 (a hedged
sketch follows below).
Qlesson: By coupling agents to their overall average in some sort of frustrated way,
an extended system can exhibit irregular dynamics. Perhaps this can be done more
elegantly than with the Ising model, using the recruitment models from chapter 5...
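A hedged sketch for exercise 6.7, assuming a Bornholdt-type spin market (cf. [133]) in which spin i feels its nearest neighbours minus α · s_i · |M|, so that agreement with the global magnetization M is frustrated; the precise coupling is an assumption, not a definition taken from the text:

    import numpy as np

    rng = np.random.default_rng(6)
    L, beta, alpha, T = 10, 0.7, 2.0, 2000     # try alpha = 1, 2 and 5
    s = rng.choice([-1, 1], size=(L, L))
    M_t = np.empty(T)

    for t in range(T):
        M = s.mean()
        for _ in range(L * L):                 # one sweep of heat-bath updates
            i, j = rng.integers(L, size=2)
            h = (s[(i + 1) % L, j] + s[(i - 1) % L, j]
                 + s[i, (j + 1) % L] + s[i, (j - 1) % L]) - alpha * s[i, j] * abs(M)
            s[i, j] = 1 if rng.random() < 1.0 / (1.0 + np.exp(-2.0 * beta * h)) else -1
        M_t[t] = s.mean()
    # "returns" M(t+1) - M(t) should show volatility clustering (cf. Fig. 6.12A)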
6.8) Simulate the N = 1 version of the fashion model above, using D = 5 products
and µ = 20, with τ = 10, τ = 20 and τ = 50, respectively. Estimate the Hurst
exponent.
Qlesson: Notice the sensitivity of the results to τ.
6.9) For the D = 2 case the “bubble model” only contains the variables m1 = M(1)/µ
and s1 = S(1)/σ and can be studied through the equations:

dm1/dt = ((m1 + γ)/s1) · (1 − m1) − ((m2 + γ)/s2) · (1 − m2)
θ · ds1/dt = ((m1 + γ)/s1) · (1 − s1) − ((m2 + γ)/s2) · (1 − s2)    (6.14)
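A minimal integration sketch for eq. 6.14, assuming the D = 2 closure m2 = 1 − m1 and s2 = 1 − s1 and purely illustrative (hypothetical) values for γ and θ:

    import numpy as np
    from scipy.integrate import solve_ivp

    gamma, theta = 0.01, 10.0      # hypothetical values: try several

    def rhs(t, y):
        m1, s1 = y
        m2, s2 = 1.0 - m1, 1.0 - s1          # assumed D = 2 closure
        dm1 = (m1 + gamma) / s1 * (1 - m1) - (m2 + gamma) / s2 * (1 - m2)
        ds1 = ((m1 + gamma) / s1 * (1 - s1)
               - (m2 + gamma) / s2 * (1 - s2)) / theta
        return [dm1, ds1]

    sol = solve_ivp(rhs, (0, 500), [0.6, 0.5], max_step=0.1)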
dv = µ · v · dt + σ · v · dw (6.15)
which for small 1 − x will be positive if µ > 0. The optimal investment fraction can
then be found by differentiation:

1 − x = µ/σ²    (6.23)
which can in principle exceed 1, reflecting a situation where you can borrow (here
assumed at interest 0). Notice that this bet-hedging requires that you always keep a
fraction x aside: if the market goes down, you need to take from your safe money
to keep the fraction constant; conversely, if the market goes up, you sell assets
to keep your fraction of safe money equal to x. A sketch of such rebalancing is
given below. A more extensive discussion of the bet-hedging associated with this
equation is given in Namiko Mitarai's course in block 3.
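A sketch of constant-fraction rebalancing in the random-walk market of eq. 6.15, with Euler steps of size dt and illustrative (assumed) values for µ and σ; the same noise realization is reused for each strategy so the invested fractions f = 1 − x are directly comparable, and the log-growth rate should peak near f = µ/σ², in line with eq. 6.23:

    import numpy as np

    rng = np.random.default_rng(7)
    mu, sigma, dt, T = 0.05, 0.3, 0.01, 200_000
    dw = rng.normal(0.0, np.sqrt(dt), T)          # shared Wiener increments

    for f in (0.0, 0.3, mu / sigma**2, 1.0):      # optimum at f = mu/sigma^2
        logW = np.sum(np.log1p(f * (mu * dt + sigma * dw)))
        print(f"f = {f:.2f}: log-growth rate = {logW / (T * dt):.4f}")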
Figure 6.17: Bet hedging. The equation that relates risk (volatility, = σ²)
and average return µ is a cornerstone in valuing the future price of stocks,
where one for example can buy the right to sell a given share at a given price
half a year from now (without buying the share). This has become a huge
market, with potential instabilities.
which in particular is not true if there are occasionally very big changes (then the
variation on short time intervals can be large, and does not approach zero as the
above equation predicts).
These ideas were originally developed to describe how gamblers betting
on horse races may optimize their portfolio [143], and have lately been applied
to biology by, for example, Bergstrom and Lachmann [144] and Kussell and Leibler
[145, 146]. Here we use the simplified formulation of [147], directly applicable to
simple win-lose games.
Consider a game with two outcomes: a good event where everything invested
gets amplified by a factor Ω, and alternatively a bad outcome where all invested
capital is lost. The probability to lose is set to p, and the probability that the “bet”
is successful is 1 − p. Assume for example the quit-and-double game, Ω = 2 and
p = 1/2, which may be modified if one for example plays with a false coin, or if one
has additional information.
Given that you have a capital K, one may ask two questions:
• What is the optimal investment fraction when one plays the game a single time?
• What is the optimal investment fraction when one can play the game many
times, but only using whatever is left of the original capital?
For one round of the game, the average outcome of an invested capital of one unit
of money is

(1 − p) · Ω + p · 0 = (1 − p) · Ω    (6.25)

When this product exceeds 1, one apparently obtains the maximum average gain by
investing everything.
When playing the game many times, one accordingly also maximizes the average
return by investing everything at each round. However, for you as a single player to
have any money left after t bets requires t wins in a row, the chance of which becomes
exponentially small as the number of bets progresses. That is, after many time-steps
the chance of having anything left is near zero, but if you are lucky, your capital is
near infinite.
Therefore it is wrong to try to optimize the average outcome. In repeated games,
one should instead try to optimize the typical outcome. That is, in a p = 1/2
game one will on average win half the games and lose the other half. The typical gain
after two games will therefore be the product of the returns of a won game and a
lost game:

Capital ∝ Win · Lose    (6.27)

and if Lose = 0 you will typically have zero money after an equal number of wins
and losses.
To be more quantitative, we now allow the player to keep a fraction x of their
capital in a safe, playing only with the remaining fraction 1 − x at each round.
The fraction x is held constant throughout all repeated rounds, and thus specifies a
strategy.
In the above scenario with p = 1/2 and, say, Ω = 3, corresponding to a 50% chance
of tripling your fortune, the typical fortune after two games will be

Capital ∝ ((1 − x) · 3 + x) · x = (3 − 2x) · x ,

which is maximal for x = 3/4, and one should thus play with 1/4 of the capital at
each round.
The average capital after t rounds can be written as a sum over the number b of bad
events,

N(t) = Σ_{b=0}^{t} C(t, b) (1 − p)^{t−b} p^b · ((1 − x)Ω + x)^{t−b} · x^b ,

where C(t, b) is the number of ways b bad events can be distributed among t total
events. Optimizing the above N(t) would be optimizing the average. Instead we will
look at the typical contribution to N, that is, where the binomial part of the above
sum contributes most. The binomial factor C(t, b)(1 − p)^{t−b} p^b has an expected
number of bad events

b = p · t    (6.30)

This leads to optimization of

N(t) ∝ ((1 − x)Ω + x)^{t−pt} · x^{pt} = (win^{1−p} · lose^p)^t
     = (((1 − x)Ω + x)^{1−p} · x^p)^t = e^{tΛ(x)}    (6.31)
where the average long-term growth rate is

Λ(x) = (1 − p) · log(Ω(1 − x) + x) + p · log(x) ,

which should then be optimized with respect to the fraction x kept in the safe
“bank”. The first term in the growth rate is the logarithmic growth rate under
good conditions, where the invested fraction 1 − x is multiplied by Ω while the
reserve x remains unchanged. The second term is the logarithmic growth rate
when the bet is lost. The expected value of the (logarithmic) growth rate Λ is then
given by the mean trajectory, with an average of 1 − p good events and p losses per
time unit.
We emphasize that the growth rate weights the logarithms of the multiplicative
growth factors of the entire capital under the two conditions with their respective
probabilities of occurrence. Maximization of Λ with respect to x secures the long-
term optimal growth rate [143].
In contrast to its short-term counterpart, the long-term logarithmic growth rate
Λ(x) usually reaches its maximum at some x∗ between 0 and 1. In the economics
literature this is denoted the Kelly-optimal investment ratio [143]. It describes the
optimal fraction of capital that a prudent long-term investor should keep in relatively
safe financial assets such as bonds while investing the rest in more risky assets such
as stocks [148]. At the Kelly optimum the derivative should be zero:

dΛ(x)/dx |_{x*} = −(1 − p) · (Ω − 1)/(Ω(1 − x*) + x*) + p · (1/x*) = 0
⇒ x* = p · Ω/(Ω − 1)

Hence for very large potential profit (Ω ≫ 1), the optimal strategy is to maintain
a safe fraction equal to the probability that you lose.
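A sketch of the repeated game behind Fig. 6.18: Ω = 3, loss probability p = 0.1, and a constant fraction x kept in the bank each round. A single shared sequence of outcomes makes the strategies directly comparable, and the Kelly-optimal x* = p · Ω/(Ω − 1) = 0.15 should give the steepest typical growth:

    import numpy as np

    rng = np.random.default_rng(8)
    Omega, p, T = 3.0, 0.1, 80
    x_kelly = p * Omega / (Omega - 1.0)       # = 0.15
    wins = rng.random(T) > p                  # one shared sequence of outcomes

    for x in (0.001, 0.01, x_kelly, 0.5):
        # win: capital multiplied by (1 - x) * Omega + x; lose: reduced to x
        logK = np.sum(np.log(np.where(wins, (1 - x) * Omega + x, x)))
        print(f"x = {x:5.3f}: log capital after {T} bets = {logK:.1f}")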
[Figure 6.18: log of capital versus time for a game with Ω = 3 and collapse
probability p = 0.1; curves show x = 0.15, 0.5, 0.02 and 0.001.]
Figure 6.18: Dynamics of capital. One can play a game where one wins
by a factor Ω = 3 with probability 1 − p = 0.9, and loses all invested money with
probability p = 0.1. The blue curve is the growth of the Kelly-optimal strategy
with a fraction x* = p · Ω/(Ω − 1) = 0.15 in the bank. The orange and red
curves show sub-optimal strategies with x = 0.01 and x = 0.001. Conversely,
the cyan trajectory simulates an over-cautious strategy with x = 0.5.
Put to practical use: imagine that an investment agent suggests to you an
investment at 12% per annum over 20 years. Your return after 20 years is then
Ω ∼ 10 fold. You have to decide how big a fraction of your capital you are going
to invest. In practice Ω/(Ω − 1) ∼ 1. Thus you should keep a bigger fraction of
your capital in your bank account than the probability that the investment agent
is a crook. If this probability is 50%, keep at least 50% of your money in a safe
deposit.
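As a quick check of these numbers (a worked example under the stated 12%-per-annum assumption):

Ω = 1.12^{20} = e^{20 · ln 1.12} ≈ e^{2.27} ≈ 9.6 ≈ 10 ,
x* = p · Ω/(Ω − 1) ≈ p · 10/9 ≈ 1.1 · p ,

so the safe fraction is indeed only slightly larger than the probability p of losing everything.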
Questions:
6.10) Consider a game where your invested money is multiplied by ω < 1 when you
lose and by a factor Ω > 1 when you win. Reconsider the above equations, and derive
the optimal bet-hedging strategy. Discuss the derived equation in the limit where
Ω ≫ 1, p ≪ 1 and ω ≪ 1.
Qlesson: In that limit you should bet hedge with a fraction of the money equal to
the difference between the probability that things go bad and the loss fraction when
they do, i.e. it is the difference between a probability and a fraction.
6.11) Simulate the long-time (500 updates) development of a capital that grows
by a factor Ω = 2 during good times, but is exposed to catastrophic events with
probability p = 0.1, in which everything invested is lost. Simulate the development
of an initial capital of 1 when using the Kelly-optimal value of x. Also simulate the
development with other values of x, e.g. x = 0.01 and x = 0.9, and compare the
outcomes. Repeat the simulation with finite disasters, where bad events instead
reduce the invested fortune by a factor ω = 10^{-2}, respectively ω = 0.5.
Hint: simulate the development of the log of the capital (where each event amounts
to adding the log of the multiplicative change).
Qlesson: There is an optimum, but the gain varies only softly around that optimum.
6.12) Consider the “Trimurti model” (Maslov and Sneppen, PLoS Computational
Biology (2015)) based on:
• Exponential growth (dC_i/dt ∝ C_i)
• Finite world (Σ_i C_i < 1)
Lessons:
• Stocks are more correlated when they fall than when they increase in value.
• Bet hedging is a way to deal with an unknown future, and is associated with the
fact that a 50% downturn and a 50% upturn do not balance, i.e. 0.5 · 1.5 = 0.75 < 1.
Supplementary reading:
Farmer, J. Doyne, Eric Smith, and Martin Shubik. “Economics: the next physical
science?” arXiv preprint physics/0506086 (2005).
Peters, Ole. “The ergodicity problem in economics.” Nature Physics 15.12 (2019):
1216–1221.
Bibliography
[1] Edwin T Jaynes. Information theory and statistical mechanics. Physical Re-
view, 106(4):620, 1957.
[2] Ulli Wolff. Collective Monte Carlo updating for spin systems. Physical Review
Letters, 62(4):361, 1989.
[4] Stephen G Brush. History of the Lenz-Ising model. Reviews of Modern Physics,
39(4):883, 1967.
[5] WP Wolf. The Ising model and real magnetic materials. Brazilian Journal of
Physics, 30(4):794–810, 2000.
[8] Taisei Kaizoji, Stefan Bornholdt, and Yoshi Fujiwara. Dynamics of price and
trading volume in a spin model of stock markets with heterogeneous agents.
Physica A: Statistical Mechanics and its Applications, 316(1-4):441–452, 2002.
[9] MPM Den Nijs. A relation between the temperature exponents of the eight-
vertex and q-state Potts model. Journal of Physics A: Mathematical and
General, 12(10):1857, 1979.
[10] Bernard Nienhuis. Exact critical point and critical exponents of O(n) models
in two dimensions. Physical Review Letters, 49(15):1062, 1982.
[15] G. K. Zipf. Human Behavior and the Principle of Least Effort. Addison-
Wesley, Cambridge, Massachusetts, 1949.
[16] Takashi Hara and Gordon Slade. Mean-field critical behaviour for percolation
in high dimensions. Communications in Mathematical Physics, 128(2):333–
391, 1990.
[17] Greg Huber, Mogens H Jensen, and Kim Sneppen. Distributions of self-
interactions and voids in (1+1)-dimensional directed percolation. Physical
Review E, 52(3):R2133, 1995.
[18] Lene Oddershede, Peter Dimon, and Jakob Bohr. Self-organized criticality in
fragmenting. Physical Review Letters, 71(19):3107, 1993.
[19] K. Jacobs. Stochastic Processes for Physicists. Cambridge University Press,
2010.
[20] N. Eldredge. Life Pulse, Episodes from the story of the fossil record. Facts on
File Publications (New York), New York, 1987.
[21] J. J. Sepkoski. Ten years in the library: new data confirm paleontological
patterns. Paleobiology, 19:43–51, 1993.
[22] S. Bornholdt, K. Sneppen, and H. Westphal. Longevity of orders is related to
the longevity of their constituent genera rather than genus richness. Theory
in Biosciences, 2009.
[23] L. W. Alvarez. Mass extinctions caused by large solid impacts. Physics Today,
pages 24–33, 1987.
[24] K. J. Kauffman, P. Prakash, and J. S. Edwards. Advances in flux balance
analysis. Current Opinion in Biotechnology, 14:491–496, 2003.
[25] P. Bak and K. Sneppen. Punctuated equilibrium and criticality in a simple
model of evolution. Phys. Rev. Lett., 71:4083, 1993.
[26] K. Sneppen, P. Bak, H. Flyvbjerg, and M. H. Jensen. Evolution as a Self-
Organized Critical Phenomenon. Proc. Natl. Acad. Sci. USA, 92:5209–5213,
1995.
[27] N. Eldredge and S. J. Gould. Punctuated equilibria: An alternative to
phyletic gradualism. In T. J. M. Schopf, editor, Models in Paleobiology.
Freeman, Cooper & Co., San Francisco, 1972.
[28] S. J. Gould and N. Eldredge. Punctuated equilibrium comes of age. Nature,
366:223–227, 1993.
[29] G. G. Simpson. Tempo and Mode in Evolution. Columbia Univ. Press, New
York, 1944.
[30] G. G. Simpson. The Major Features of Evolution. Columbia Univ. Press, New
York, 1953.
[31] H. Flyvbjerg, K. Sneppen, and P. Bak. Mean field model for a simple model
of evolution. Phys. Rev. Lett., 71:4087, 1993.
[35] Y. G. Jin et al. Pattern of marine mass extinction near the Permian-Triassic
boundary in South China. Science, 289:432–436, 2000.
[38] P. Erdős and A. Rényi. On the evolution of random graphs. Publ. Math. Inst.
Hung. Acad. Sci., 5:17–60, 1960.
[52] J. D. Noh and H. Rieger. Random walks on complex networks. Phys. Rev.
Lett., 92:118701, 2004.
[55] H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A.-L. Barabasi. The large
scale organization of metabolic networks. Nature, 407:651–654, 2000.
[58] R. Cohen, K. Erez, D. ben Avraham, and S. Havlin. Resilience of the internet
to random breakdowns. Phys. Rev. Lett., 85:4626–4628, 2000.
[67] R. Milo, S.S. Shen-Orr, S. Itzkovitz, N. Kashtan, and U. Alon. Network motifs:
Simple building blocks of complex networks. Science, 298:824–827, 2002.
[68] S. Mangan and U. Alon. Structure and function of the feed-forward loop
network motif. Proc. Natl. Acad. Sci. USA, 100:11980–11985, 2003.
[71] J. von Neumann. The general and logical theory of automata. In L. A. Jeffress,
editor, Cerebral Mechanisms in Behavior: The Hixon Symposium, pages 1–31,
New York, 1951. John Wiley and Sons.
[78] J. M. Greenberg and S. P. Hastings. Spatial patterns for discrete models of
diffusion in excitable media. SIAM Journal on Applied Mathematics, 34:515–
523, 1978.
[81] P. B. Armstrong. Cell sorting out: The self-assembly of tissues in vitro. Critical
Reviews in Biochemistry and Molecular Biology, 24:119–149, 1989.
[82] Dave Donaldson. Railroads of the raj: Estimating the impact of transportation
infrastructure. American Economic Review, 108(4-5):899–934, 2018.
[83] Marc Levinson. The Box: How the Shipping Container Made the World
Smaller and the World Economy Bigger, with a new chapter by the author.
Princeton University Press, 2016.
[84] Masahisa Fujita, Paul R Krugman, and Anthony J Venables. The spatial
economy: Cities, regions, and international trade. MIT press, 2001.
[85] Paul Krugman. Increasing returns and economic geography. Journal of polit-
ical economy, 99(3):483–499, 1991.
[86] Paul Krugman and Anthony J Venables. Globalization and the inequality of
nations. The quarterly journal of economics, 110(4):857–880, 1995.
[87] Frances Cairncross. The death of distance: How the communications revolu-
tion will change our lives. 1997.
[88] Kenneth L Kraemer, Jennifer Gibbs, and Jason Dedrick. Impacts of globaliza-
tion on e-commerce use and firm performance: A cross-country investigation.
The Information Society, 21(5):323–340, 2005.
[90] Philip McCann. Transport costs and new economic geography. Journal of
Economic Geography, 5(3):305–318, 2005.
[101] Louise H Taylor, Sophia M Latham, and Mark EJ Woolhouse. Risk factors for
human disease emergence. Philosophical Transactions of the Royal Society of
London B: Biological Sciences, 356(1411):983–989, 2001.
[102] Mark EJ Woolhouse and Sonya Gowtage-Sequeria. Host range and emerging
and reemerging pathogens. In Ending the War Metaphor: The Changing
Agenda for Unraveling the Host-Microbe Relationship. Workshop Summary,
volume 192, 2006.
[103] Nathan D Wolfe, Claire Panosian Dunavan, and Jared Diamond. Origins of
major human infectious diseases. Nature, 447(7142):279–283, 2007.
[111] P. Chen and S. Redner. Majority rule dynamics in finite dimensions. Phys.
Rev. E, 71:036101, 2005.
[115] H.A. Kramers. Brownian motion in a field of force and the diffusion model of
chemical reactions. Physica, 7:284–304, 1940.
[117] Ingve Simonsen and Kim Sneppen. Profit profiles in correlated markets. Phys-
ica A: Statistical Mechanics and its Applications, 316(1):561–567, 2002.
[118] Jens Feder. Fractals. Springer Science & Business Media, 2013.
[119] Rafal Weron. Modeling and forecasting electricity loads and prices: a statistical
approach, volume 403. John Wiley & Sons, 2007.
[120] Ingve Simonsen. Volatility of power markets. Physica A: Statistical Mechanics
and its Applications, 355(1):10–20, 2005.
[121] Mogens H Jensen, Anders Johansen, and Ingve Simonsen. Inverse statistics
in economics: the gain–loss asymmetry. Physica A: Statistical Mechanics and
its Applications, 324(1):338–343, 2003.
[122] Raul Donangelo, Mogens H Jensen, Ingve Simonsen, and Kim Sneppen. Syn-
chronization model for stock market asymmetry. Journal of Statistical Me-
chanics: Theory and Experiment, 2006(11):L11001, 2006.
[123] Benoit B Mandelbrot. The variation of certain speculative prices. The Journal
of Business, 36(4):394–419, 1963.
[124] Robert P Flood and Peter M Garber. Market fundamentals versus price-level
bubbles: the first tests. Journal of political economy, 88(4):745–770, 1980.
[125] Eng-Tuck Cheah and John Fry. Speculative bubbles in bitcoin markets? an
empirical investigation into the fundamental value of bitcoin. Economics Let-
ters, 130:32–36, 2015.
[126] Ch Baek and M Elbeck. Bitcoins as an investment or speculative vehicle? a
first look. Applied Economics Letters, 22(1):30–34, 2015.
[127] W Brian Arthur. Competing technologies, increasing returns, and lock-in by
historical events. The economic journal, 99(394):116–131, 1989.
[128] Jay R Ritter. Behavioral finance. Pacific-Basin finance journal, 11(4):429–437,
2003.
[129] Jianjun Miao. Introduction to economic theory of bubbles. Journal of Math-
ematical Economics, 53:130–136, 2014.
[130] Irving Fisher. The debt-deflation theory of great depressions. Econometrica:
Journal of the Econometric Society, pages 337–357, 1933.
[131] Ayumu Yasutomi. The emergence and collapse of money. Physica D: Nonlinear
Phenomena, 82(1-2):180–194, 1995.
[132] Raul Donangelo and Kim Sneppen. Self-organization of value and demand.
Physica A: Statistical Mechanics and its Applications, 276(3-4):572–580, 2000.
[133] Stefan Bornholdt. Expectation bubbles in a spin model of markets: Intermit-
tency from frustration across scales. International Journal of Modern Physics
C, 12(05):667–674, 2001.
[134] Jean-Philippe Bouchaud. Crises and collective socio-economic phenomena:
simple models and challenges. Journal of Statistical Physics, 151(3-4):567–
606, 2013.
[135] Herbert Spencer. Railway Morals & Railway Policy, volume 65. Longman,
Brown, Green & Longmans, 1855.
[136] Tony Yu-Chen Tsai, Yoon Sup Choi, Wenzhe Ma, Joseph R Pomerening,
Chao Tang, and James E Ferrell. Robust, tunable biological oscillations from
interlinked positive and negative feedback loops. Science, 321(5885):126–129,
2008.
[137] Sandeep Krishna, S Semsey, and Mogens Høgh Jensen. Frustrated bistability
as a means to engineer oscillations in biological systems. Physical biology,
6(3):036009, 2009.
[138] Marco A Janssen and Wander Jager. Fashions, habits and changing prefer-
ences: Simulation of psychological factors affecting market dynamics. Journal
of economic psychology, 22(6):745–772, 2001.
[139] Serge Galam and Annick Vignes. Fashion, novelty and optimality: an appli-
cation from physics. Physica A: Statistical Mechanics and its Applications,
351(2-4):605–619, 2005.
[140] Kim Sneppen and Namiko Mitarai. Multistability with a metastable mixed
state. Physical review letters, 109(10):100602, 2012.
[142] Leo Tolstoy. War and Peace. 1869. English translation by Rosemary Edmonds,
1957.
[143] J.L. Kelly. A new interpretation of information rate. Bell System Technical
Journal, 35:917–926, 1956.
[146] E. Kussell and S. Leibler. Phenotypic diversity, population growth, and infor-
mation in fluctuating environments. Science, 309:2075–2078, 2005.
[148] S. Maslov and Y.-C. Zhang. Optimal investment strategy for risky assets. In-
ternational Journal of Theoretical and Applied Finance, 1:377–387, 1998.