Hs = [ 6 × 6 real symmetric matrix; the numerical entries are garbled in this copy ] , (1.2)

whose six eigenvalues are now all real:

{−2.49316, −1.7534, 0.33069, 1.44593, 2.38231, 3.42944} . (1.3)
Congratulations! You have produced your first random matrix drawn from the so-called GOE (Gaussian
Orthogonal Ensemble)... a classic - more on this name later.
You can now do several things: for example, you can make the entries complex or quaternionic instead
of real. In order to have real eigenvalues, the corresponding matrices need to be hermitian and self-dual
respectively3 - better have a look at one example of the former, for N as small as N = 2
Hher = [ 0.3252, 0.3077 + 0.2803i ; 0.3077 − 0.2803i, −1.7115 ] . (1.4)
You have just met the Gaussian Unitary (GUE) and Gaussian Symplectic (GSE) ensembles, respectively
- and are surely already wondering who invented these names.
We will deal with this jargon later. Just remember: the Gaussian Orthogonal Ensemble does not contain
orthogonal matrices - but real symmetric matrices instead (and similarly for the others).
Although single instances can sometimes also be useful, exploring the statistical properties of an ensemble typically requires collecting data from multiple samples. We can indeed now generate T such matrices,
collect the N (real) eigenvalues for each of them, and then produce a normalized histogram of the full set
of N × T eigenvalues. With the code [♠ Gaussian_Ensembles_Density.m], you may get a plot like
Fig. 1.1 for T = 50000 and N = 8.
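The code [♠ Gaussian_Ensembles_Density.m] is a MATLAB script; if you prefer Python, the following NumPy sketch (our own - the function names and the smaller T are arbitrary choices) runs the same experiment:

```python
import numpy as np

rng = np.random.default_rng(0)

def goe(N):
    # real symmetric: diagonal entries ~ N(0,1), off-diagonal ~ N(0,1/2)
    A = rng.standard_normal((N, N))
    return (A + A.T) / 2

def gue(N):
    # complex hermitian: real diagonal ~ N(0,1), Re/Im of off-diagonal ~ N(0,1/2)
    A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    return (A + A.conj().T) / 2

def gse(N):
    # quaternion self-dual, in the 2N x 2N complex representation of footnote 3
    X = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    Y = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    A = np.block([[X, Y], [-Y.conj(), X.conj()]])
    return (A + A.conj().T) / 2

N, T = 8, 2000   # T = 50000 as in the text works too, it just takes longer
evs = {name: np.concatenate([np.linalg.eigvalsh(f(N)) for _ in range(T)])
       for name, f in [("GOE", goe), ("GUE", gue), ("GSE", gse)]}

# the GSE spectrum comes in doubly degenerate pairs (footnote 6)
ev_gse = np.linalg.eigvalsh(gse(N))

# normalized histogram of the GOE eigenvalues (plot it to reproduce Fig. 1.1)
hist, edges = np.histogram(evs["GOE"], bins=60, density=True)
```

Plot the three histograms with your favourite plotting library and compare with Fig. 1.1.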
Roughly half of the eigenvalues collected in total are positive, and half negative - this is evident from
the symmetry of the histograms. These histograms are concentrated (significantly nonzero) over the region
of the real axis enclosed by (for N = 8)
• ±√(2N) ≈ ±4 (GOE),
• ±√(4N) ≈ ±5.65 (GUE),
3 Hermitian matrices have real elements on the diagonal, and complex conjugate off-diagonal entries. Quaternion self-dual matrices are 2N × 2N matrices constructed as A=[X Y; -conj(Y) conj(X)]; A=(A+A')/2, where X and Y are complex matrices, while conj denotes complex conjugation of all entries.
Page 7 of 111
Giacomo Livan, Marcel Novaes, Pierpaolo Vivo
Figure 1.1: Histograms of GOE, GUE and GSE eigenvalues (N = 8 and T = 50000 samples).
• ±√(8N) ≈ ±8 (GSE).
You can directly jump to the end of Chapter 5 to see what these histograms look like for big matrices.
Question. Can I compute analytically the shape of these histograms? And what
happens if N becomes very large?
▸ Yes, you can. In Chapters 10 and 12, we will set up a formalism to compute exactly these shapes for any finite N. In Chapter 5, instead, we will see that
for large N the histograms approach a limiting shape, called Wigner’s semicircle
law.
1.1 One-pager on random variables
The statement that a random variable X is neither random, nor a variable4 is attributed to Gian-Carlo Rota.
Whatever it is, it can take values in a discrete alphabet (like the outcome of tossing a die, {1, 2, 3, 4, 5, 6})
or on an interval σ (possibly unbounded) of the real line. In the latter case, we say that ρ(x) is the probability density function5 (pdf) of X if ∫_a^b dx ρ(x) is the probability that X takes a value in the interval (a, b) ⊆ σ.
A die will not blow up and disintegrate in the air. One of the six numbers will eventually come up. So
the sum of probabilities of the outcomes should be 1 (= 100%). People call this property normalization,
4 In the following we may use both upper and lower case to denote a random variable.
5 For example, for the GOE matrix (1.2) the diagonal entries were sampled from the Gaussian (or normal) pdf ρ(x) = exp(−x²/2)/√(2π). We will denote the normal pdf with mean µ and variance σ² as N(µ, σ²) in the following.
which for continuous variables just means ∫_σ dx ρ(x) = 1.
All this in theory.
In practice, sample your random variable many times and produce a normalized histogram of the out-
comes. The pdf ρ(x) is nothing but the histogram profile as the number of samples gets sufficiently large.
The average of X is ⟨X⟩ = ∫ dx ρ(x) x and higher moments are defined as ⟨X^n⟩ = ∫ dx ρ(x) x^n. The variance is Var(X) = ⟨X²⟩ − (⟨X⟩)², a measure of how broadly the pdf is spread around the mean.
The cumulative distribution function F(x) is the probability that X is smaller than or equal to x, F(x) = ∫_{−∞}^{x} dy ρ(y). Clearly, F(x) → 0 as x → −∞ and F(x) → 1 as x → +∞.
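All these definitions are easy to poke at numerically; here is a minimal sanity check on a standard normal sample (our own sketch - the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(200_000)    # X ~ N(0, 1)

mean = x.mean()                     # estimates <X>; exact value 0
second_moment = (x**2).mean()       # estimates <X^2>; exact value 1
variance = second_moment - mean**2  # Var(X) = <X^2> - <X>^2; exact value 1
F_at_0 = (x <= 0).mean()            # empirical cdf F(0); exact value 1/2
```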
If we have two (continuous) random variables X1 and X2, they must be described by a joint probability density function (jpdf) ρ(x1, x2). Then, the quantity ∫_a^b dx1 ∫_c^d dx2 ρ(x1, x2) gives the probability that the first variable X1 is in the interval (a, b) and the other X2 is, simultaneously, in the interval (c, d).
When the jpdf is factorized, i.e. is the product of two density functions, ρ(x1 , x2 ) = ρ1 (x1 )ρ2 (x2 ),
the variables are said to be independent, otherwise they are dependent. When, in addition, we also have
ρ1 (x) = ρ2 (x), the random variables are called i.i.d. (independent and identically distributed). In any case,
ρ(x1) = ∫ dx2 ρ(x1, x2) is the marginal pdf of X1 when considered independently of X2.
The above discussion can be generalized to an arbitrary number N of random variables. Given the jpdf
ρ(x1 , . . . , xN ), the quantity ρ(x1 , . . . , xN )dx1 · · · dxN is the probability that we find the first variable in the
interval [x1 , x1 + dx1 ], the second in the interval [x2 , x2 + dx2 ], etc. The marginal pdf ρ(x) that the first
variable will be in the interval [x, x + dx] (ignoring the others) can be computed as
ρ(x) = ∫ · · · ∫ dx2 · · · dxN ρ(x, x2, . . . , xN) . (1.5)
Question. What is the jpdf ρ[H] of the N² entries {H11, . . . , HNN} of the matrix H in (1.1)?
▸ The entries in H are independent Gaussian variables, hence the jpdf is factorized as ρ[H] ≡ ρ(H11, . . . , HNN) = ∏_{i,j=1}^{N} exp(−Hij²/2)/√(2π).
If a set of random variables is a function of another set, xi = xi(y), there is a relation between the jpdfs of the two sets:

ρ(x1, . . . , xN) dx1 · · · dxN = ρ(x1(y), . . . , xN(y)) |J(x → y)| dy1 · · · dyN ≡ ρ̂(y1, . . . , yN) dy1 · · · dyN , (1.6)

where J is the Jacobian of the transformation, given by J(x → y) = det(∂xi/∂yj). We will use this property in Chapter 6.
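A one-dimensional warm-up of (1.6), with a toy example of our own choosing: if y is uniform on (0, 1) and x(y) = −ln y, then ρ(x(y)) |dx/dy| = e^{ln y} · (1/y) = 1, the uniform pdf - so x must be a standard exponential variable. Numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.uniform(size=200_000)   # rho_hat(y) = 1 on (0, 1)
x = -np.log(y)                  # x(y) = -ln(y), with |dx/dy| = 1/y

# If x is standard exponential, rho(x) = exp(-x), then
# rho(x(y)) * |dx/dy| = exp(ln y) / y = 1: consistent with (1.6).
p_below_1 = (x < 1).mean()      # target: 1 - exp(-1) ~ 0.632
```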
Page 9 of 111
Giacomo Livan, Marcel Novaes, Pierpaolo Vivo
Question. What is the jpdf of the N(N + 1)/2 independent entries in the upper triangle (diagonal included) of the symmetric matrix Hs in (1.2)?
▸ For Hs, you need to consider the diagonal and the off-diagonal entries separately: the diagonal entries are (Hs)ii = Hii, while the off-diagonal entries are (Hs)ij = (Hij + Hji)/2. As a result,

ρ((Hs)11, . . . , (Hs)NN) = ∏_{i=1}^{N} [ exp(−(Hs)ii²/2)/√(2π) ] ∏_{i<j} [ exp(−(Hs)ij²)/√π ] , (1.7)
i.e. the variance of off-diagonal entries is 1/2 of the variance of diagonal entries.
Make sure you understand why this is the case. This factor 2 has very important
consequences (see the last Question in Chapter 3). From now on, for a real symmetric Hs we will denote the jpdf of the N(N + 1)/2 independent entries in the upper triangle (diagonal included) by ρ[H] - dropping the subscript 's' when there is no risk of confusion.
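The factor 1/2 is easy to verify numerically (a quick sketch of ours, with an arbitrary sample size):

```python
import numpy as np

rng = np.random.default_rng(3)
M = 200_000
H12 = rng.standard_normal(M)    # independent N(0,1) entries of H
H21 = rng.standard_normal(M)
Hs12 = (H12 + H21) / 2          # off-diagonal entry of Hs
Hs11 = rng.standard_normal(M)   # the diagonal is untouched: (Hs)_ii = H_ii

var_diag = Hs11.var()           # should be close to 1
var_off = Hs12.var()            # should be close to 1/2
```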
Chapter 2
Value the eigenvalue
In this Chapter, we start discussing the eigenvalues of random matrices.
2.1 Appetizer: Wigner’s surmise
Consider a 2 × 2 GOE matrix Hs = [ x1, x3 ; x3, x2 ], with x1, x2 ∼ N(0, 1) and x3 ∼ N(0, 1/2). What is the pdf p(s) of the spacing s = λ2 − λ1 between its two eigenvalues (λ2 > λ1)?
The two eigenvalues are random variables, given in terms of the entries by the roots of the characteristic
polynomial
λ² − Tr(Hs) λ + det(Hs) = 0 , (2.1)

therefore λ1,2 = [ x1 + x2 ± √((x1 − x2)² + 4x3²) ]/2 and s = √((x1 − x2)² + 4x3²).
By definition, we have
p(s) = ∫_{−∞}^{∞} dx1 dx2 dx3 (e^{−x1²/2}/√(2π)) (e^{−x2²/2}/√(2π)) (e^{−x3²}/√π) δ( s − √((x1 − x2)² + 4x3²) ) . (2.2)
Changing variables as
x1 − x2 = r cos θ            x1 = (ψ + r cos θ)/2
2x3     = r sin θ     ⇒      x2 = (ψ − r cos θ)/2      (2.3)
x1 + x2 = ψ                  x3 = (r sin θ)/2
and computing the corresponding Jacobian
J = det [ ∂x1/∂r  ∂x1/∂θ  ∂x1/∂ψ ; ∂x2/∂r  ∂x2/∂θ  ∂x2/∂ψ ; ∂x3/∂r  ∂x3/∂θ  ∂x3/∂ψ ]
  = det [ cos θ/2  −r sin θ/2  1/2 ; −cos θ/2  r sin θ/2  1/2 ; sin θ/2  r cos θ/2  0 ] = −r/4 , (2.4)
one obtains
p(s) = (1/(8π^{3/2})) ∫_0^{2π} dθ ∫_{−∞}^{∞} dψ ∫_0^{∞} dr r δ(s − r) e^{−(1/2)[ ((r cos θ + ψ)/2)² + ((−r cos θ + ψ)/2)² + r² sin²θ/2 ]}

     = (√(4π) s/(8π^{3/2})) ∫_0^{2π} dθ e^{−(1/2)[ s² cos²θ/2 + s² sin²θ/2 ]} = (s/2) e^{−s²/4} . (2.5)
Note that we used cos²θ + sin²θ = 1 to achieve this very simple result: however, we could only enjoy this massive simplification because the variance of the off-diagonal elements was 1/2 of the variance of diagonal elements - try to redo the calculation assuming a different ratio. Observe also that this pdf is correctly normalized, ∫_0^∞ ds p(s) = 1.
It is often convenient to rescale this pdf and define p̄(s) = ⟨s⟩ p(⟨s⟩ s), where ⟨s⟩ = ∫_0^∞ ds p(s) s is the mean level spacing. Upon this rescaling, ∫_0^∞ p̄(s) ds = ∫_0^∞ s p̄(s) ds = 1. For the GOE as above, show that p̄(s) = (πs/2) exp(−πs²/4), which is called Wigner's surmise1, whose plot is shown in Fig. 2.1.
In spite of its simplicity, this is actually a quite deep result: it tells us that the probability of sampling
two eigenvalues ’very close’ to each other (s → 0) is very small: it is as if each eigenvalue ’felt’ the presence
of the other and tried to avoid it (but not too much)! A bit like birds perching on an electric wire, or parked
cars on a street: not too close, not too far apart. If this metaphor does not win you over, check this out [1].
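Before moving on, you can check (2.5) by brute force - sample the three entries, build the spacing, and compare with the exact pdf (our own sketch; the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
M = 200_000
x1 = rng.standard_normal(M)                 # diagonal entries ~ N(0, 1)
x2 = rng.standard_normal(M)
x3 = rng.standard_normal(M) * np.sqrt(0.5)  # off-diagonal ~ N(0, 1/2)

s = np.sqrt((x1 - x2)**2 + 4 * x3**2)       # spacing, no diagonalization needed

mean_s = s.mean()               # exact: <s> = sqrt(pi) for p(s) = (s/2) e^{-s^2/4}
p_below_1 = (s < 1).mean()      # exact: 1 - exp(-1/4)

# cross-check the explicit spacing formula against numpy's eigensolver
H = np.array([[x1[0], x3[0]], [x3[0], x2[0]]])
lam = np.linalg.eigvalsh(H)
```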
2.2 Eigenvalues as correlated random variables
In the previous Chapter, we met the N real eigenvalues {x1 , . . . , xN } of a random matrix H. These eigenvalues
are random variables described by a jpdf2 ρ(x1 , . . . , xN ).
Question. What does the jpdf of eigenvalues ρ(x1 , . . . , xN ) of a random matrix
ensemble look like?
▸ We will give it in Eq. (2.15) for the Gaussian ensemble. The jpdf of eigenvalues is not known for every ensemble.
The important (generic) feature is that the {xi }’s are not independent: their jpdf does not in general
factorize. The most striking incarnation of this property is the so-called level repulsion (as in Wigner’s
surmise): the eigenvalues of random matrices generically repel each other, while independent variables do
not - as we show in the following section.
2.3 Compare with the spacings between i.i.d.’s
It is useful at this stage to consider the statistics of gaps between adjacent i.i.d. random variables. In this
case, we will not see any repulsion.
1 Why is it called a 'surmise'? After all, it is the result of an exact calculation! The story goes as follows: at a conference on
Neutron Physics by Time-of-Flight, held at the Oak Ridge National Laboratory in 1956, people asked a question about the possible
shape of the distribution of the spacings of energy levels in a heavy nucleus. E. P. Wigner, who was in the audience, walked up to
the blackboard and guessed (= surmised) the answer given above.
2 We will use the same symbol ρ for both the jpdf of the entries in the upper triangle and of the eigenvalues.
Figure 2.1: Plot of Wigner’s surmise.
Consider i.i.d. real random variables {X1 , . . . , XN } drawn from a parent pdf pX (x) defined over a support
σ . The corresponding cdf is F(x). The labelling is purely conventional, and we do not assume that the
variables are sorted in any order.
We wish to compute the conditional probability density function pN (s|X j = x) that, given that one of the
random variables X j takes a value around x, there is another random variable Xk (k 6= j) around the position
x + s, and no other variables lie in between. In other words, a gap of size s exists between two random
variables, one of which sits around x.
The claim is
pN (s|X j = x) = pX (x + s) [1 + F(x) − F(x + s)]N−2 . (2.6)
The reasoning goes as follows: one of the variables sits around x already, so we have N − 1 variables
left to play with. One of these should sit around x + s, and the pdf for this event is pX (x + s). The remaining
N − 2 variables need to sit either to the left of x - and this happens with probability F(x) - or to the right of
x + s - and this happens with probability 1 − F(x + s).
Now, the probability of a gap s between two adjacent particles, conditioned on the position x of one
variable, but irrespective of which variable this is, is obtained by the law of total probability:

pN(s|any X = x) = ∑_{j=1}^{N} pN(s|Xj = x) Prob(Xj = x) = N pN(s|Xj = x) pX(x) , (2.7)
where one uses the fact that the variables are i.i.d. and thus the probability that the particle X j lies around x
is the same for every particle, and given by pX (x).
To obtain the probability of a gap s between any two adjacent random variables, no longer conditioned
on the position of one of the variables, we should simply integrate over x
pN(s) = ∫_σ dx pN(s|any X = x) = N ∫_σ dx pN(s|Xj = x) pX(x) . (2.8)
As an exercise, let us verify that pN(s) is correctly normalized, namely ∫_0^∞ ds pN(s) = 1. We have

∫_0^∞ ds pN(s) = N ∫_0^∞ ds ∫_σ dx pX(x + s) [1 + F(x) − F(x + s)]^{N−2} pX(x) . (2.9)
Changing variables F(x + s) = u in the s-integral, and using F(+∞) = 1 and du = F′(x + s) ds = pX(x + s) ds, we get

∫_0^∞ ds pN(s) = N ∫_σ dx pX(x) ∫_{F(x)}^{1} du [1 + F(x) − u]^{N−2} , (2.10)

where the u-integral equals [1 − F(x)^{N−1}]/(N − 1).
Setting now F(x) = v and using dv = F′(x) dx = pX(x) dx, we have

∫_0^∞ ds pN(s) = N/(N − 1) ∫_0^1 dv (1 − v^{N−1}) = 1 , (2.11)
as required.
As there are N variables, it makes sense to perform the ’local’ change of variables s = ŝ/(N pX (x)) and
consider the limit N → ∞. The reason for choosing the scaling factor N pX (x) is that their typical spacing
around the point x will be precisely of order ∼ 1/(N pX (x)): increasing N, more and more variables need
to occupy roughly the same space, therefore their typical spacing goes down. The same happens locally
around points x where there is a higher chance to find variables, i.e. for a higher pX (x).
We thus have
pN( s = ŝ/(N pX(x)) | Xj = x ) = pX( x + ŝ/(N pX(x)) ) [1 + F(x) − F( x + ŝ/(N pX(x)) )]^{N−2} , (2.12)

which for large N and ŝ ∼ O(1) can be approximated as

pN( s = ŝ/(N pX(x)) | Xj = x ) ≈ pX(x) e^{−ŝ} , (2.13)
therefore, using (2.8) together with ds/dŝ = 1/(N pX(x)),

lim_{N→∞} p̂N(ŝ) := lim_{N→∞} pN( s = ŝ/(N pX(x)) ) ds/dŝ = N × ∫_σ dx pX(x) pX(x) e^{−ŝ} × 1/(N pX(x)) = ∫_σ dx pX(x) e^{−ŝ} = e^{−ŝ} , (2.14)

the exponential law for the spacing of a Poisson process. From this, one deduces easily that i.i.d. variables
do not repel, but rather attract: the probability of vanishing gaps, ŝ → 0, does not vanish, as it does in the case of RMT eigenvalues!
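You can watch the exponential law emerge numerically (a sketch of ours; for uniform variables pX(x) = 1 on (0, 1), so the rescaling is simply ŝ = Ns):

```python
import numpy as np

rng = np.random.default_rng(5)
N, T = 500, 2000
gaps = []
for _ in range(T):
    x = np.sort(rng.uniform(size=N))   # i.i.d. uniforms: p_X(x) = 1
    gaps.append(np.diff(x))            # gaps between adjacent variables
s_hat = N * np.concatenate(gaps)       # rescaled spacings s_hat = N * s

mean_gap = s_hat.mean()                # Exp(1) limit: mean 1
p_small = (s_hat < 0.1).mean()         # ~ 1 - exp(-0.1) > 0: no repulsion
```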
2.4 Jpdf of eigenvalues of Gaussian matrices
The jpdf of eigenvalues of a N × N Gaussian matrix is given by3
ρ(x1, . . . , xN) = (1/Z_{N,β}) e^{−(1/2) ∑_{i=1}^N xi²} ∏_{j<k} |xj − xk|^β , (2.15)
where

Z_{N,β} = (2π)^{N/2} ∏_{j=1}^{N} Γ(1 + jβ/2)/Γ(1 + β/2) (2.16)

is a normalization constant4, enforcing ∫_{R^N} dx ρ(x1, . . . , xN) = 1, and β = 1, 2, 4 is called the Dyson index5. Henceforth, dx = ∏_{j=1}^N dxj. Note that the eigenvalues are considered to be unordered here.
This jpdf corresponds exactly to eigenvalues6 generated according to the algorithm in Chapter 17 , and
provided in the code [♠ Gaussian_Ensembles_Density.m].
Where does (2.15) come from? Let us postpone the proof for a while and draw some conclusions by just
staring at it for a few minutes.
The Gaussian factor e^{−(1/2) ∑_{i=1}^N xi²} kills any configuration of eigenvalues {x} where some xj's are "big" (far from zero, in absolute value): the eigenvalues do not like to stay too far from the origin. On the other hand, the term ∏_{j<k} |xj − xk|^β kills configurations where two eigenvalues get "too close" to each other.
The "repulsion" factor ∏_{j<k} |xj − xk|^β has another effect: it makes the eigenvalues strongly non-independent!
Every eigenvalue feels the presence of all the others, and the jpdf (2.15) does not factorize at all. Hence, the
classical tools for independent random variables are of little use here. We will use (2.15) in the next Chapter
to deduce Wigner’s semicircle law in a few simple steps.
This interplay between confinement and repulsion is the physical mechanism at the heart of many results
in RMT.
3 This jpdf goes back to the prehistory of RMT. It is an immediate consequence of Theorem 2 in [2], a 1939 statistics paper
published in the journal Annals of Eugenics (a rather scary title, isn’t it?). In its full glory, it appeared explicitly for the first time in
[3].
4 It can be computed via the so-called Mehta’s integral, a close relative of the celebrated Selberg’s integral [5].
5 The Dyson index is equal to the number of real variables needed to specify one entry of your matrix: 1 for real, 2 for complex
and 4 for quaternions. This is usually referred to as Dyson’s threefold way. For the Gaussian ensemble, then, GOE corresponds to
β = 1, GUE to β = 2 and GSE to β = 4.
6 For β = 4, each matrix has 2N eigenvalues that are two-fold degenerate.
7 Quite often, however, you find in the literature a Gaussian weight including extra factors, such as exp(−(β/2) ∑_i xi²) or exp(−(N/2) ∑_i xi²). One then needs to be very careful when comparing theoretical results (obtained with such conventions) to numerical simulations - in particular, a rescaling of the numerical eigenvalues by √β or √N before histogramming is essential in these two modified scenarios.
As a final remark, go back to the spacing pdf in Eq. (2.5), which was obtained for N = 2 and β = 1 (a
2 × 2 GOE matrix). Armed with (2.15) one may redo the calculation as
p(s) = ∫_{−∞}^{∞} dx1 dx2 ρ(x1, x2) δ(s − |x2 − x1|) . (2.17)
Try to compute this integral, and recover Eq. (2.5).
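If the analytical route looks daunting, you can at least confirm (2.17) numerically: the delta function reduces the double integral to a single integral over x1 with x2 = x1 ± s, and from (2.16) the normalization is Z_{2,1} = 4√π (a sketch of ours, with an arbitrary quadrature grid):

```python
import numpy as np

Z21 = 4 * np.sqrt(np.pi)   # Z_{N,beta} of (2.16) for N = 2, beta = 1

def p_spacing(s):
    # delta(s - |x2 - x1|) picks the branches x2 = x1 + s and x2 = x1 - s,
    # which contribute equally by the symmetry of rho(x1, x2); hence the 2.
    x1 = np.linspace(-10.0, 10.0, 4001)
    integrand = 2 * np.exp(-(x1**2 + (x1 + s)**2) / 2) * s / Z21
    # trapezoidal rule over the grid
    return float(np.sum((integrand[:-1] + integrand[1:]) / 2 * np.diff(x1)))

s_vals = np.array([0.5, 1.0, 2.0])
numeric = np.array([p_spacing(s) for s in s_vals])
exact = s_vals / 2 * np.exp(-s_vals**2 / 4)   # Eq. (2.5)
```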
Chapter 3
Classified Material
In this Chapter, we continue setting up the formalism and provide a simple classification of matrix models.
3.1 Count on Dirac
Question. From the jpdf of eigenvalues ρ(x1, . . . , xN), how do I compute the shape of the histograms of the N × T eigenvalues as in Fig. 1.1, for T sufficiently large?
▸ To cut a long story short, all you have to do is to take the marginal

ρ(x) = ∫ · · · ∫ dx2 · · · dxN ρ(x, x2, . . . , xN) , (3.1)

and this function will reproduce the histogram profile you are after for any finite N. Note that ρ(x) is correctly normalized to 1, as your histogram is.
Let us prove (3.1).
Take a single, fixed matrix H with real eigenvalues - no randomness in here - and perform the following task: define a counting function n(x) such that ∫_a^b n(x′) dx′ gives the fraction of eigenvalues xi between a and b.
The way to define it is to set1

n(x) = (1/N) ∑_{i=1}^{N} δ(x − xi) , (3.2)
the (normalized) sum of a set of “spikes” at the location xi of each eigenvalue. Using the following property
of the delta function
∫_I dx δ(x − x0) f(x) = f(x0) if x0 ∈ I and 0 otherwise, (3.3)
1 As we know, the Dirac delta function (or rather distribution) δ(x) is basically an extremely peaked function at the point x = 0, like the limit of a Gaussian pdf as its variance goes to zero, δ(x) = lim_{ε→0⁺} e^{−x²/(4ε)}/(2√(πε)).
we can show that indeed (3.2) does the job properly2 .
If H is now a random matrix, the function n(x) becomes a random measure on the real line - a function
of x that changes from one realization of H to another. The average of it over the set of random eigenvalues
{x1 , . . . , xN } becomes interesting now3
⟨n(x)⟩ := ∫ · · · ∫ dx ρ(x1, . . . , xN) n(x) = (1/N) ∑_{i=1}^{N} ∫ · · · ∫ dx ρ(x1, . . . , xN) δ(x − xi) = ρ(x) , (3.5)
where ρ(x) = ∫ · · · ∫ dx2 · · · dxN ρ(x, x2, . . . , xN) is the marginal density of ρ. Try to prove the last equality in (3.5) using the properties of the delta function, and the fact that ρ(x1, . . . , xN) is symmetric under the exchange xi ↔ xj. This is indeed the case for the Gaussian jpdf (2.15) and will remain generally true.
The quantity hn(x)i = ρ(x) has many names: most often, it is called the (average) spectral density.
Fig. 3.1 helps you visualize how T = 4 sets of N = 8 randomly located “spikes” conspire to produce the
continuous shape ρ(x) = hn(x)i.
Question. If N becomes very large, what does the spectral density ρ(x) for the Gaussian ensemble look like?
▸ For the jpdf ρ(x1, . . . , xN) given in (2.15), the precise statement for the spectral density ρ(x) = ∫ dx2 · · · dxN ρ(x, x2, . . . , xN) is

lim_{N→∞} √(βN) ρ(√(βN) x) = ρSC(x) , (3.6)

where ρSC(x) = (1/π)√(2 − x²) has a semicircular - or rather, semielliptical - shape. This is called Wigner's semicircle law.
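A numerical preview (our own sketch; N and T are arbitrary choices): rescale GOE eigenvalues by √(βN) and compare with ρSC.

```python
import numpy as np

rng = np.random.default_rng(6)
N, T, beta = 200, 50, 1
evs = []
for _ in range(T):
    A = rng.standard_normal((N, N))
    H = (A + A.T) / 2                        # GOE, entries as in (1.2)
    evs.append(np.linalg.eigvalsh(H))
x = np.concatenate(evs) / np.sqrt(beta * N)  # rescale as in (3.6)

# the semicircle predicts a fraction of eigenvalues in |x| < 1 equal to
# integral of (1/pi) sqrt(2 - x^2) over (-1, 1) = 1/2 + 1/pi ~ 0.818
frac_bulk = (np.abs(x) < 1).mean()
edge = np.max(np.abs(x))                     # should hover near sqrt(2)
```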
2 Compute

N ∫_a^b n(x) dx = ∑_{i=1}^{N} ∫_a^b δ(x − xi) dx = ∑_{i=1}^{N} χ_{[a,b]}(xi) , (3.4)

where the indicator function χ_{[a,b]}(z) is equal to 1 if z ∈ (a, b) and 0 otherwise. This is by definition the number of eigenvalues between a and b, as it should.
3 We use again the shorthand dx = ∏_{j=1}^N dxj.
Figure 3.1: Sets of N = 8 randomly located “spikes”. A histogram of how many spikes occur around a given
region of the real line is nothing but the average spectral density there.
Figure 3.2: Rescaled densities for N = 8 (GOE,GUE,GSE).
Question. What is the meaning of the unexpected rescaling factor √(βN)?
▸ This means that the histograms of eigenvalues for larger and larger N become concentrated over the interval [−√(2βN), √(2βN)], in agreement with our numerical findings in Fig. 1.1. The points ±√(2βN) are called (spectral) edges.
Note that:
1. The edges grow with √N - bigger matrices have a wider range of eigenvalues, can you explain why? To get histograms that do not become wider and wider with N, we need to divide each eigenvalue by √(βN) before histogramming. This is what we do in Fig. 3.2, using the very same eigenvalues collected to produce Fig. 1.1. You can see that the histograms for different βs nicely collapse on top of each other, reproducing an almost perfect semielliptical shape between −√2 and √2.

2. The edges are at ±√(2βN) for the jpdf ρ(x1, . . . , xN) given in (2.15). If you put ad hoc extra factors in the exponential, like exp(−(β/2) ∑_i xi²) or exp(−(N/2) ∑_i xi²), as you sometimes find in the literature, this is tantamount to rescaling the eigenvalues by an appropriate factor. For example, for the choice exp(−(N/2) ∑_i xi²), the edges are fixed - they do not grow with N - at ±√(2β).

3. The edges of the semicircle are called soft: for large but finite N, there is always a nonzero probability of sampling eigenvalues exceeding the edge points. For example, for a 10 × 10 GOE matrix, you have a tiny but nonzero probability to sample eigenvalues larger than √(2βN) ≈ 4.47... . Other ensembles have spectral densities with hard edges - this means impenetrable walls, which the eigenvalues can never cross.
3.2 Layman’s classification
We deal here with ensembles of square matrices with real eigenvalues (the entries can be real, complex or
quaternionic random variables). Can we classify these ensembles according to simple features?
A useful scheme (covering several scenarios encountered in real life) is the following (see Fig. 3.3):
1. Independent entries: the first group on the left gathers matrix models whose entries are independent
random variables - modulo the symmetry requirements. Random matrices of this kind are usually
called Wigner matrices.
Examples: in this category, you may find adjacency matrices of random graphs [6], or matrices with
independent power-law entries (so-called Lévy matrices [7]), and power-law banded matrices [8]
among others. Take a moment to download and read these papers - remember the following sen-
tence, found on Richard Feynman’s blackboard at the time of his death: “Know how to solve every
problem that has been solved”.
2. Rotational invariance: the second group on the right is characterized by the so-called rotational invariance. In essence, this property means that any two matrices that are related via a similarity transformation4 H′ = UHU⁻¹ occur in the ensemble with the same probability

ρ[H] dH11 · · · dHNN = ρ[H′] dH′11 · · · dH′NN . (3.7)
This requires the following two conditions:
• ρ[H] = ρ[UHU −1 ]. This means that the jpdf of the entries retains the same functional form
before and after the transformation. This imposes a severe constraint on the allowable functional
forms thanks to Weyl’s lemma [9], which states that ρ[H] can only be a function of the traces of
the first N powers of H,
ρ[H] = ϕ( Tr H, Tr H², . . . , Tr H^N ) . (3.8)
Since Tr H n = Tr (UHU −1 )n by the cyclic property of the trace, the ⇐ implication is trivial.
• dH11 · · · dHNN = dH′11 · · · dH′NN, i.e. the flat Lebesgue measure is invariant under conjugation by U. This is a classical result.
The rotational invariance property in essence means that the eigenvectors are not that important, as
we can rotate our matrices as freely as we wish, and still leave their statistical weight unchanged.
Examples: you may find in this category the Wishart-Laguerre (Chapter 13) and Jacobi classical
ensembles, the so-called “weakly-confined” ensembles [10] and many others. The same advice
(“download-and-study”) applies here.
3. What about the intersection between the two classes? It turns out that it contains only the Gaussian
ensemble5 .
This is a consequence of a theorem by Porter and Rosenzweig [3]. And this is bad news, isn't it? We have
to make a choice: if we insist that the ensemble has independent entries, then eigenvectors do matter.
If we require a high level of rotational symmetry, then the entries get necessarily correlated. No free
lunch (beyond the Gaussian)!
4U is orthogonal/unitary/symplectic if H is real symmetric/complex hermitian/quaternion self-dual, respectively. You surely
have noticed that this is precisely the origin of the names given to the ensembles: Orthogonal, Unitary and Symplectic.
5 In its three incarnations: GOE, GUE and GSE.
[Diagram: two overlapping classes - "Independent entries", with ρ[H] ∝ ∏_{i=1}^N fi(Hii) ∏_{i<j} fij(Hij), and "Rotational invariance", with ρ[H] = ρ[UHU⁻¹] - whose intersection contains the Gaussian ensembles.]
Figure 3.3: Visualization of the layman’s classification of random matrix ensembles.
Question. I can see that the Gaussian ensemble has independent entries. But I do not easily see that it has this "rotational invariance".
▸ This can be seen from the jpdf of entries in the upper triangle (1.7). Show that you can rewrite this jpdf as

ρ[Hs] ∝ exp( −(1/2) Tr(Hs²) ) , (3.9)

where Tr(·) is the matrix trace (the sum of diagonal elements). For example, for the 2 × 2 real symmetric matrix Hs = [ a, b ; b, c ], the trace of Hs is a + c, and the trace of Hs² is a² + c² + 2b². You can actually rewrite (1.7) as (3.9) only thanks to that factor 2...check this! Now, from (3.9), the rotational invariance property is much easier to see: for a similarity transformation Hs′ = UHsU⁻¹, one has Tr(Hs′²) = Tr(Hs²) (cyclic property of the trace).
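Both identities in this Question are easy to check numerically (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(7)
N = 6
A = rng.standard_normal((N, N))
Hs = (A + A.T) / 2

# the exponent in (1.7): -1/2 sum_i (Hs)_ii^2 - sum_{i<j} (Hs)_ij^2
# should equal -1/2 Tr(Hs^2), thanks to the factor 2 off the diagonal
diag_part = np.sum(np.diag(Hs)**2)
off_part = sum(Hs[i, j]**2 for i in range(N) for j in range(i + 1, N))
lhs = -0.5 * diag_part - off_part
rhs = -0.5 * np.trace(Hs @ Hs)

# rotational invariance: conjugate by a random orthogonal matrix
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
Hs_rot = Q @ Hs @ Q.T                 # U Hs U^{-1} with U = Q orthogonal
tr_before = np.trace(Hs @ Hs)
tr_after = np.trace(Hs_rot @ Hs_rot)  # unchanged by the cyclic property
```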
3.3 To know more...
1. Anything worth mentioning beyond the above classification? One important class is represented by
the biorthogonal ensembles: these are non-invariant, with non-independent entries, and yet their jpdf
of eigenvalues is known in terms of the product of two determinants. Check these papers out [11, 12]
for further information.
2. We suggest the following paper [13] about “histogramming without histogramming”. Solid maths and
an insightful and unconventional perspective on RMT spectra.
3. For a proof of the Porter-Rosenzweig theorem in the simplified 2 × 2 case, as well as for a nice and
pedagogical introduction to the Gaussian ensembles, we highly recommend the review [51].
4. For the mathematically oriented reader, who is looking for more formal classifications of random
matrix models, we recommend the mini-review [15] and references therein.
Chapter 4
The fluid semicircle
In this Chapter, we set up a statistical mechanics formalism to compute Wigner’s semicircle law for Gaussian
matrices. You will learn here the so-called “Coulomb gas technique”.
4.1 Coulomb gas
The Coulomb gas (or fluid) technique is usually attributed to Dyson [16]. Actually, a few years before,
Wigner had already used it for the derivation of the semicircle law [17].
Take the jpdf for the Gaussian ensemble (2.15)

ρ(x1, . . . , xN) = (1/Z_{N,β}) e^{−(1/2) ∑_{i=1}^N xi²} ∏_{j<k} |xj − xk|^β , (4.1)

and rescale the eigenvalues as xi → xi √(βN).
The normalization constant now reads (set C_{N,β} = (√(βN))^{N + βN(N−1)/2})

Z_{N,β} = C_{N,β} ∫_{R^N} ∏_{j=1}^{N} dxj e^{−(βN/2) ∑_{i=1}^N xi²} ∏_{j<k} |xj − xk|^β = C_{N,β} ∫_{R^N} ∏_{j=1}^{N} dxj e^{−βN² V[x]} , (4.2)
where the energy term in the exponent is

V[x] = (1/(2N)) ∑_i xi² − (1/(2N²)) ∑_{i≠j} ln|xi − xj| . (4.3)
The factor 1/2 in front of the logarithmic term is due to the symmetrization from i < j to i 6= j.
Stare at (4.2) intensely.
We have just exponentiated the product ∏_{j<k}, and obtained a canonical partition function1!
1 We are integrating the Gibbs-Boltzmann weight e^{−βN² V[x]} over all possible positions of the particles.
The Gibbs-Boltzmann weight e^{−βN² V[x]} corresponds to a thermodynamical fluid of particles with positions {x1, . . . , xN} on a line, in equilibrium at "inverse temperature" β under the effect of competing interactions: a quadratic (single-particle) potential (see Fig. 4.1), and a repulsive (all-to-all) logarithmic term. The fluid is "static", as there is no kinetic term in V[x].
Figure 4.1: Sketch of the quadratic confining potential, which prevents the particles from escaping towards
±∞.
The presence of the pre-factor βN² shows - at least formally - that the limit N → ∞ is a simultaneous
thermodynamic and zero-temperature limit. A standard thermodynamic argument tells us how to find the
equilibrium positions at zero temperature of the particles (eigenvalues) under such interactions: all we need
to do is to minimize the free energy F = −(1/β ) ln ZN,β of this system. The calculation greatly simplifies
in the limit N → ∞.
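You can even watch the fluid settle: plain gradient descent on V[x] (our own sketch - N, the step size and the iteration count are arbitrary choices) drives the particles to an equilibrium configuration whose support approaches [−√2, √2]:

```python
import numpy as np

N = 20
x = np.linspace(-2.0, 2.0, N)    # initial positions, spread out on a line

def grad(x):
    # N * dV/dx_k = x_k - (1/N) * sum_{j != k} 1/(x_k - x_j), from (4.3)
    D = x[:, None] - x[None, :]
    np.fill_diagonal(D, np.inf)  # drop the j = k term (1/inf = 0)
    return x - (1.0 / D).sum(axis=1) / N

for _ in range(5000):
    x = x - 0.005 * grad(x)      # small steps, so particles never cross

residual = np.max(np.abs(grad(x)))   # ~0 once the forces balance
support = (x.min(), x.max())         # confined inside [-sqrt(2), sqrt(2)]
```

At equilibrium each particle satisfies x_k = (1/N) ∑_{j≠k} 1/(x_k − x_j): confinement and repulsion balance exactly, the mechanism described above.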
Question. Why is this called a “Coulomb” gas?
▸ Because we have a logarithmic interaction among charged particles. More
precisely, we have a 2D “fluid” of charges constrained to a line. We know that
in 2D the electrostatic potential generated by a point charge is proportional to
the logarithm of the distance from it - while in 3D, this potential is inversely
proportional to the distance, and in 1D is proportional to the distance. Therefore, a
2D charged fluid confined to a line is not quite the same as a 1D fluid!
A simple way to see this is by using Gauss's law, with a single charge q sitting at the origin on a 2D plane. If we enclose the charge in a 1-sphere S (i.e. a circle), then we must have ∮_S E · n ∝ q, where n is the normal vector to the circle. If you assume that the electric field E is rotationally symmetric, i.e. E = E(r) r̂, this turns into E(r) 2πr ∝ q, implying that E(r) ∝ q/r. Integrating a field that goes like 1/r gives you a logarithmic potential.
4.2 Do it yourself (before lunch)
So, our goal is to find the free energy F = −(1/β ) ln ZN,β for a large number of particles N → ∞. As in
many branches of physics, “larger is easier”.
We now provide a “continuum” description of the fluid, based on the following steps.
1. Introduce a counting function
Define first a normalized one-point counting function
n(x) = (1/N) ∑_{i=1}^N δ(x − x_i) .   (4.4)

This is a random function, satisfying ∫_R dx n(x) = 1 and n(x) ≥ 0 everywhere. For finite N, this is just a collection of "spikes" at the location of each eigenvalue. However, for large N, it is natural to assume that it will become a smooth function of x. We will always work under this assumption2.
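The regularization in the footnote below is easy to play with: replace each spike by a narrow Gaussian of width ∼ √ε and you get a smooth, normalized profile. A minimal Python sketch (the companion codes of the book are MATLAB; the positions x_i below are made-up numbers, not actual eigenvalues):

```python
import math

def n_eps(x, xs, eps):
    """Smoothed counting function n_ε(x): a normalized sum of Gaussian
    'nascent deltas' of width ~ sqrt(eps) centered at the points xs."""
    N = len(xs)
    return sum(math.exp(-(x - xi) ** 2 / (4 * eps)) / (2 * math.sqrt(math.pi * eps))
               for xi in xs) / N

xs = [-1.2, -0.3, 0.1, 0.8, 1.5]   # placeholder "eigenvalues"
eps = 0.05

# n_eps is non-negative and integrates to 1 (Simpson's rule on a wide window)
a, b, m = -20.0, 20.0, 20000
h = (b - a) / m
total = n_eps(a, xs, eps) + n_eps(b, xs, eps)
for k in range(1, m):
    total += (4 if k % 2 else 2) * n_eps(a + k * h, xs, eps)
integral = total * h / 3
print(integral)  # ≈ 1
```

As ε → 0⁺ the profile degenerates back into the collection of spikes (4.4).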
2. Coarse-graining procedure
Instead of directly summing - or rather integrating - over all configurations of eigenvalues {x1 , . . . , xN },
which in stat-mech we would call microstates of our fluid, we first fix a certain one-point profile n(x) (non-
negative, smooth and normalized).
Sketch your favorite function over R and call it n(x) - whatever you like, really, provided it is non-negative, smooth and normalized. Then, we sum over all microstates {x_1, . . . , x_N} compatible with your sketch n(x) - in a sense to be made clearer. Finally, we sum over all possible (non-negative, smooth and normalized) n(x) you might have come up with in the first place.
2 It may be helpful to think that n(x) is nothing but the limit for ε → 0⁺ of a nascent delta function n_ε(x) = (1/N) ∑_{i=1}^N e^{−(x−x_i)²/(4ε)} / (2√(πε)), where the limit ε → 0⁺ is taken at the very end (after the limit N → ∞).
This coarse-graining procedure can be put on slightly cleaner grounds by introducing the following representation of unity as a functional integral

1 = ∫ D[n(x)] δ[ n(x) − (1/N) ∑_{i=1}^N δ(x − x_i) ] ,   (4.5)

which enforces the definition (4.4). The functional integral runs (so to speak) over all possible normalized, non-negative and smooth functions n(x). See [18] for more details on functional integrations.
Inserting this representation of unity inside the multiple integral (4.2) and exchanging the order of integrations, we end up with
Z_{N,β} = C_{N,β} ∫ D[n(x)] ∫_{R^N} ∏_{j=1}^N dx_j e^{−βN² V[x]} δ[ n(x) − (1/N) ∑_{i=1}^N δ(x − x_i) ] .   (4.6)
3. Convert sums into integrals
Using the identities3
∑_{i=1}^N f(x_i) = N ∫_R n(x) f(x) dx ,   (4.7)

∑_{i,j=1}^N g(x_i, x_j) = N² ∫∫_{R²} dx dx' n(x) n(x') g(x, x') ,   (4.8)
we can rewrite the two terms in the energy (4.3) as
(1/2N) ∑_{i=1}^N x_i² = (1/2N) × N ∫_R n(x) x² dx ,   (4.9)

(1/2N²) ∑_{i≠j} ln|x_i − x_j| = (1/2N²) [ ∑_{i,j} ln|x_i − x_j| − ∑_i ln Δ(x_i) ]

= (1/2N²) × N² ∫∫_{R²} dx dx' n(x) n(x') ln|x − x'| − (1/2N²) × N ∫_R dx n(x) ln Δ(x) ,   (4.10)
where ∆(x) is a position-dependent short-distance cutoff. What does this mean?
Note that in the limit ε → 0⁺, the double integral ∫∫_{R²} dx dx' n_ε(x) n_ε(x') ln|x − x'| is divergent. This physically corresponds to the infinite-energy contribution originated by two neighboring charges getting "too close" to each other (the term i = j in the sum ∑_{i,j} ln|x_i − x_j|). The term ∫_R dx n_ε(x) ln Δ(x) for ε → 0⁺
3 Prove them by inserting the definition of n(x) into the integrals and using properties of the delta function.
“renormalizes” the divergence and produces a finite result. More on how to plausibly fix ∆(x) later.
4. V[x] → V[n(x)]
Note that in (4.9) and (4.10) the sums over eigenvalues {x1 , . . . , xN } have been expressed through the
counting function n(x), which - with a slight abuse of notation - will denote from now on its smooth limit as
N → ∞.
Therefore we can write
Z_{N,β} = C_{N,β} ∫ D[n(x)] e^{−βN² V[n(x)]} ∫_{R^N} ∏_{j=1}^N dx_j δ[ n(x) − (1/N) ∑_{i=1}^N δ(x − x_i) ] ,   (4.11)

where the last multiple integral (over the x_j) is the object we will call I_N[n(x)].
The functional V[n(x)] reads
V[n(x)] = (1/2) ∫_R dx x² n(x) − (1/2) ∫∫_{R²} dx dx' n(x) n(x') ln|x − x'| + (1/2N) ∫_R dx n(x) ln Δ(x) .   (4.12)
5. Evaluate the integral IN [n(x)] for large N
We now have to evaluate
I_N[n(x)] = ∫_{R^N} ∏_{j=1}^N dx_j δ[ n(x) − (1/N) ∑_{i=1}^N δ(x − x_i) ]   (4.13)
in the limit N → ∞.
It is quite easy to give a physical interpretation of this multiple integral. It is basically counting how
many microstates - microscopic configurations of the fluid charges - are compatible with a given macrostate
- the density profile n(x). We know from standard statistical mechanics arguments that the logarithm of this
number should be proportional to the entropy of the fluid. Let us see how.
Introducing a ’functional’ analogue of the standard integral representation for the delta function [19],
we can write
I_N[n(x)] = ∫ D[n̂(x)] ∫_{R^N} ∏_{j=1}^N dx_j exp[ iN ∫ dx n(x) n̂(x) − i ∫ dx n̂(x) ∑_{i=1}^N δ(x − x_i) ]

= ∫ D[n̂(x)] exp[ iN ∫ dx n(x) n̂(x) ] ( ∫_R dy e^{−i ∫ dx n̂(x) δ(x−y)} )^N

= ∫ D[n̂(x)] e^{N S[n̂(x)|n(x)]} ,   (4.14)
Page 28 of 111
Giacomo Livan, Marcel Novaes, Pierpaolo Vivo
where
S[n̂(x)|n(x)] = i ∫ dx n(x) n̂(x) + Log ∫_R dy e^{−i n̂(y)} .   (4.15)
This type of integral is music to the statistical physicist's ears! It is of the form ∫ d(·) exp[Λ f(·)], with Λ ≡ N a very large parameter. Hence it can be evaluated with a Laplace (or saddle-point) approximation [20].
Finding the critical point of the action S[n̂(x)|n(x)]
0 = δS/δn̂(x) = i n(x) − i e^{−i n̂(x)} / ∫_R dy e^{−i n̂(y)} ,   (4.16)
from which we obtain
e^{−i n̂(x)} = n(x) ∫_R dy e^{−i n̂(y)}   ⇒   i n̂(x) = − ln n(x) − Log ∫_R dy e^{−i n̂(y)} ,   (4.17)
where we ignore spurious phases (recall that in the complex field Log exp(z) may not just be equal to z!)
that would make the action evaluated at the saddle-point complex. Substituting in (4.15), we obtain
I_N[n(x)] ∼ exp[ −N ∫ dx n(x) ln n(x) ] ,   (4.18)
to leading order in N. As expected, the term inside square brackets has precisely the form of the Shannon
entropy of the density n(x).
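The entropy interpretation of (4.18) can be probed against an elementary counting argument: if N particles are dropped into bins with occupation fractions p_k, the logarithm of the number of compatible microstates, ln(N!/∏_k (N p_k)!), approaches −N ∑_k p_k ln p_k by Stirling's formula. A quick Python check (the bin fractions below are arbitrary illustrative numbers):

```python
import math

def log_microstates(N, probs):
    """ln of the multinomial coefficient N! / prod (N*p_k)! ,
    i.e. the log-number of microstates with occupations N*p_k."""
    counts = [round(N * p) for p in probs]
    assert sum(counts) == N
    return math.lgamma(N + 1) - sum(math.lgamma(c + 1) for c in counts)

def shannon(N, probs):
    """Leading-order estimate: N times the Shannon entropy −∑ p ln p."""
    return -N * sum(p * math.log(p) for p in probs)

N, probs = 500, [0.2, 0.3, 0.5]
exact, leading = log_microstates(N, probs), shannon(N, probs)
print(exact, leading)  # agree to about 1% already at N = 500
```

The agreement improves as N grows, which is exactly the statement that the multiplicity of a macrostate is exp[N × (Shannon entropy)] to leading order.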
6. Evaluate ∆(x)
Look back again at (4.12). The short-distance cutoff ∆(x) is yet to be fixed.
A standard, physically motivated argument - going back to Dyson for charges on a ring - posits that ∆(x)
- the so-called self-energy term - should be taken of the form
Δ(x) ≈ c / (N n(x)) ,   (4.19)
as the higher the density of particles around x, the smaller the average distance between them4. Also, N charges spread over a distance of O(1) have a mean spacing ∼ O(1/N), which justifies the 1/N factor. This argument, however plausible, does not seem to have been made rigorous yet. Note, in particular, that the constant c in (4.19) cannot be fixed by this simple heuristic argument. While conceptually quite important (see e.g. [21]), this missing bit will prove rather inconsequential in the following.
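The scaling Δ(x) ∼ 1/(N n(x)) is easy to test on a deterministic stand-in for a typical large-N configuration: place N points at the quantiles of the semicircle density (a smooth proxy, not an actual random sample - the helper names below are ours) and compare consecutive spacings with 1/(N ρ_SC(x)). A Python sketch:

```python
import math

def rho_sc(x):
    """Wigner semicircle density on (-sqrt(2), sqrt(2))."""
    return math.sqrt(max(2.0 - x * x, 0.0)) / math.pi

def quantile_points(N):
    """Points x_i with CDF(x_i) = (i - 1/2)/N, found by bisection."""
    def cdf(x):
        # closed form of the semicircle CDF
        return 0.5 + (x * math.sqrt(2 - x * x)
                      + 2 * math.asin(x / math.sqrt(2))) / (2 * math.pi)
    pts = []
    for i in range(1, N + 1):
        target, lo, hi = (i - 0.5) / N, -math.sqrt(2), math.sqrt(2)
        for _ in range(80):
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if cdf(mid) < target else (lo, mid)
        pts.append(0.5 * (lo + hi))
    return pts

N = 400
xs = quantile_points(N)
# in the bulk, spacing × N × density ≈ 1, i.e. mean spacing ≈ 1/(N ρ(x))
ratios = [(xs[i + 1] - xs[i]) * N * rho_sc(0.5 * (xs[i] + xs[i + 1]))
          for i in range(N // 4, 3 * N // 4)]
print(min(ratios), max(ratios))  # both close to 1
```

Of course this only illustrates the scaling with N and with the local density; the constant c requires genuinely probabilistic input.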
4 We have already met a similar argument in section 2.3.
7. Final expression
Combining (4.11), (4.12), (4.18) and (4.19), the partition function eventually reads
Z_{N,β} ≃ C_{N,β} ∫ D[n(x)] e^{ −βN² F_0[n(x)] + (β/2) N ln N + ((β/2) − 1) N F_1[n(x)] − (β/2) N ln c + o(N) } ,   (4.20)

where

F_0[n(x)] = (1/2) ∫ dx x² n(x) − (1/2) ∫∫ dx dx' n(x) n(x') ln|x − x'| ,   (4.21)

F_1[n(x)] = ∫ dx n(x) ln n(x) .   (4.22)
Note that the term (β/2)N ln N is essentially independent of the potential, and can be absorbed into the overall normalization constant. The O(N) contribution is composed of i) the self-energy term, ii) the entropic term, and iii) a contribution coming from the unknown constant c in (4.19).
8. Flash-forward: cross-check with finite-N result
We now cheat a bit.
Let us use some information we will actually prove later, namely that the equilibrium density of the fluid is Wigner's semicircle law n⋆(x) ≡ ρ_SC(x) = (1/π) √(2 − x²).
Inserting the semicircle law into (4.21) and (4.22) - and evaluating the corresponding integrals - we obtain

F_0[n⋆(x)] = 3/8 + (ln 2)/4 ,   (4.23)

F_1[n⋆(x)] = (1/2) (1 − ln 2 − 2 ln π) .   (4.24)
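The value (4.24) is straightforward to confirm numerically. A Python check of F_1[ρ_SC] = ∫ ρ_SC ln ρ_SC dx (substituting x = √2 sin θ to tame the endpoints):

```python
import math

def f1_semicircle(m=20000):
    """Evaluate F1 = ∫ ρ_sc(x) ln ρ_sc(x) dx numerically, with
    x = sqrt(2) sin(theta), dx = sqrt(2) cos(theta) dtheta (Simpson's rule)."""
    a, b = -math.pi / 2, math.pi / 2
    h = (b - a) / m
    def integrand(theta):
        rho = math.sqrt(2) * math.cos(theta) / math.pi   # ρ_sc at x(θ)
        if rho <= 0.0:
            return 0.0                                   # ρ ln ρ → 0 at the edges
        return rho * math.log(rho) * math.sqrt(2) * math.cos(theta)
    total = integrand(a) + integrand(b)
    for k in range(1, m):
        total += (4 if k % 2 else 2) * integrand(a + k * h)
    return total * h / 3

print(f1_semicircle())                                   # ≈ -0.9913
print(0.5 * (1 - math.log(2) - 2 * math.log(math.pi)))   # (4.24): same value
```

The check for F_0 works the same way, except that the double (logarithmic) integral in (4.21) needs a little more numerical care.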
Therefore, the partition function (4.20) reads for large N
Z_{N,β} ≃ C_{N,β} ∫ D[n(x)] e^{ −βN² F_0[n(x)] + (β/2) N ln N + ((β/2) − 1) N F_1[n(x)] − (β/2) N ln c + o(N) }

≈ C_{N,β} e^{ −βN² F_0[n⋆(x)] + (β/2) N ln N + ((β/2) − 1) N F_1[n⋆(x)] − (β/2) N ln c + o(N) }

≈ exp[ (β/4) N² ln N + a_β N² + (1/2)(1 + β/2) N ln N + b_β N + o(N) ] ,   (4.25)
where we used the easy asymptotics
ln C_{N,β} ∼ (β/4) N² ln N + (β/4)(ln β) N² + ((1 − β/2)/2) N ln N + ((1 − β/2)/2)(ln β) N .   (4.26)
The constants aβ and bβ are given as follows:
a_β = (β/4) ln β − β F_0[n⋆(x)] ,   (4.27)

b_β = ((β/2) − 1) F_1[n⋆(x)] + ((1 − β/2)/2) ln β − (β/2) ln c .   (4.28)
Can we check that this result is plausible?
Note that for β = 2, the partition function ZN,β =2 from (2.16) has a particularly simple expression at
finite N,
Z_{N,β=2} = (2π)^{N/2} G(N + 2) ,   (4.29)
where G(x) is a Barnes G-function5 . Hence, if everything was done correctly, the large-N asymptotics of
(4.29) should precisely match the large-N behavior (4.25).
Let us check.
Using known asymptotics of the Barnes G-function, we deduce that
ln Z_{N,β=2} ∼ (1/2) N² ln N − (3/4) N² + N ln N + N (ln(2π) − 1) + O(1) ,   (4.30)
which coincides (up to the term N ln N included) with the asymptotics of ZN,β in (4.25) once β is set to 2.
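Since G(N + 2) = ∏_{k=1}^{N} k! (by the recursion in the footnote), the exact ln Z_{N,β=2} can be computed in a few lines and compared with the asymptotics (4.30). A Python check, using math.lgamma for the factorials:

```python
import math

def ln_Z_exact(N):
    """ln Z_{N,beta=2} = (N/2) ln(2π) + ln G(N+2), using
    G(N+2) = prod_{k=1}^{N} k!  (from G(z+1) = Gamma(z) G(z), G(1) = 1)."""
    lnG = sum(math.lgamma(k + 1) for k in range(1, N + 1))
    return 0.5 * N * math.log(2 * math.pi) + lnG

def ln_Z_asym(N):
    """The large-N behavior (4.30), up to O(1) terms."""
    return (0.5 * N * N * math.log(N) - 0.75 * N * N
            + N * math.log(N) + N * (math.log(2 * math.pi) - 1))

for N in (50, 100, 200):
    print(N, ln_Z_exact(N) - ln_Z_asym(N))  # small and slowly growing (O(ln N))
```

The difference stays tiny compared to ln Z itself and grows only logarithmically, confirming that (4.30) captures everything down to the O(N) term.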
This check should convince you that the “mean-field” approach - based on a continuum description of
the charged fluid of eigenvalues - is indeed capable of capturing the first three terms of the free energy,
and only fails at the level of O(N) contributions - as the renormalized self-energy term cannot be precisely
determined by a simple-minded scaling argument.
9. What’s next?
Let us recap what we have done so far. The normalization constant ZN,β of the Gaussian model has
been re-interpreted as the canonical partition function of a 2D static fluid of charged particles confined on a
line, in equilibrium at inverse temperature β . For a large number of particles, among all possible configura-
tions, the fluid will choose the one that minimizes its free energy, i.e. the logarithm of this partition function.
The partition function has been written as a functional integral over the space of normalized counting
functions n(x), see (4.20). For large N, it lends itself to a saddle-point evaluation, which will be carried out
in the next Chapter.
5 The Barnes G-function is defined via the recursion G(z + 1) = Γ(z)G(z), with G(1) = 1.
Chapter 5
Saddle-point-of-view
Let us continue the study of the Coulomb gas method for large random matrices.
5.1 Saddle-point. What’s the point?
Earlier we showed that the partition function for the Gaussian model could be represented as
Z_{N,β} ≃ C_{N,β} ∫ D[n(x)] e^{ −βN² F_0[n(x)] + (β/2) N ln N + ((β/2) − 1) N F_1[n(x)] − (β/2) N ln c + o(N) } ,   (5.1)

where

F_0[n(x)] = (1/2) ∫ dx x² n(x) − (1/2) ∫∫ dx dx' n(x) n(x') ln|x − x'| ,   (5.2)

F_1[n(x)] = ∫ dx n(x) ln n(x) .   (5.3)
Quite interestingly, the leading term in the exponential is of order ∼ O(N²) and not ∼ O(N) as in standard short-range models. As a consequence of the all-to-all coupling between the charged particles, the free energy per particle is dominated by the "energetic" component at the expense of the "entropic" part (sub-leading for large N).
Recall now that the functional integral runs over functions n(x) that are normalized, i.e. ∫_R dx n(x) = 1. We can enforce this constraint introducing another delta function

δ( ∫_R dx n(x) − 1 ) = ∫_R (dk/2π) e^{ ik ( ∫_R dx n(x) − 1 ) } .   (5.4)
Rescaling ik → βN²κ and ignoring sub-leading terms, you end up with the truly appealing representation

Z_{N,β} ≈ C_{N,β} ∫ D[n(x)] ∫_R dκ e^{ −βN² S[n(x),κ] + O(N) } ,   (5.5)

where the action is

S[n(x), κ] = F_0[n(x)] − κ ( ∫ dx n(x) − 1 ) .   (5.6)
A saddle-point evaluation yields1
Z_{N,β} ≈ exp( −βN² S[n⋆(x), κ⋆] ) .   (5.7)
Here, n⋆(x) is the minimizer of the functional (5.2) in the space of normalizable and non-negative functions n(x).
We set up the minimization problem by searching for the critical points2
0 = δS[n(x), κ]/δn(x) |_{n=n⋆, κ=κ⋆} = x²/2 − ∫_R dx' n⋆(x') ln|x − x'| − κ⋆ ,

0 = ∂S[n(x), κ]/∂κ |_{n=n⋆, κ=κ⋆} ⇒ ∫_R dx n⋆(x) = 1 ,   (5.8)

for x in the support of n⋆(x).
Effectively, κ⋆ (hereafter renamed κ for simplicity) is just a Lagrange multiplier enforcing the normalization ∫_R dx n⋆(x) = 1.
What is then the intensive free energy

f = −(1/(βN²)) ln Z_{N,β}   (5.9)

of our Coulomb gas for N → ∞? It is just given by f = S[n⋆(x), κ] ≡ F_0[n⋆(x)] - the action evaluated at the saddle-point density.
To summarize, the main task is now to find the solution of the integral equation (5.8)
x²/2 − ∫_R dx' n⋆(x') ln|x − x'| − κ = 0 ,   (5.10)

satisfying n⋆(x) ≥ 0 everywhere, and ∫_R n⋆(x) dx = 1.
R
5.2 Disintegrate the integral equation
...or (in more academic terms), solve it.
As a preliminary observation, note that the support of n⋆(x) (i.e. the set of x-values for which n⋆(x) > 0) cannot be the full real line. In the limit x → ∞, the integral term

∫_R dx' n⋆(x') ln|x − x'| ∼ ln x ∫_R dx' n⋆(x') = ln x ,   (5.11)
1 The pre-factor C_{N,β} has the large-N behavior (4.26), whose logarithm is ∼ O(N² ln N) and thus strictly speaking leading with respect to N². However, it is just an overall constant term, and the 'dynamical' part of the free energy is of ∼ O(N²).
2 Note that the factor 1/2 in front of the double integral disappears because the functional differentiation picks up two counting
functions, as in the integrand we have n(x)n(x0 ). An interesting account on functional differentiation can be found at [22].
- where we used normalization of the density - which is clearly incompatible with the behavior ∼ x²/2 of the known term in the equation3.
Therefore, we need to look for a solution over an interval (a, b) of the real line. Indeed, a rather amusing
feature of this type of integral equations - of the Carleman class - is that the support over which the solution
is to be found is itself unknown, and part of the problem!
The solution n⋆ ≡ n⋆(x; a, b) we find will then be a parametric function of a, b. We will then fix the 'optimal' a, b by requiring that the resulting free energy f in (5.9) is minimized - i.e. any other choice of the support (a, b) for a normalized and non-negative function ñ(x) ≠ n⋆(x), once inserted into (5.9), would produce a larger value for the free energy.
Let us now first convert the integral equation into a “simpler” one.
5.3 Better weak than nothing
The solution to the integral equation (5.10) can be obtained by first differentiating both sides with respect to x. Since ln|x − x'| is not (strictly speaking) differentiable at x = x', we consider the derivative in the weak sense.
Let u be a function in L1 ([a, b]). We say that v ∈ L1 ([a, b]) is a weak derivative of u if
∫_a^b u(x) φ'(x) dx = − ∫_a^b v(x) φ(x) dx   (5.12)
for all infinitely differentiable functions φ with φ(a) = φ(b) = 0. The notion of weak derivative extends the standard (strong) derivative to functions that are not differentiable, but integrable in [a, b]. Also, if u is differentiable in the standard sense, then its weak and strong derivatives coincide - as one can check using integration by parts.
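Definition (5.12) is easy to probe numerically. For instance, u(x) = |x| on [−1, 1] is not differentiable at 0, but v(x) = sign(x) is its weak derivative; both sides of (5.12) can be compared for a test function vanishing at the endpoints - our arbitrary choice here is φ(x) = (1 − x²)²(x + 2):

```python
import math

def simpson(f, a, b, m=20000):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    h = (b - a) / m
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, m))
    return s * h / 3

u = abs                                          # not differentiable at 0
v = lambda x: math.copysign(1.0, x)              # its weak derivative, sign(x)
phi = lambda x: (1 - x * x) ** 2 * (x + 2)       # smooth, vanishes at ±1
dphi = lambda x: (1 - x * x) * (1 - 8 * x - 5 * x * x)   # phi'(x)

lhs = simpson(lambda x: u(x) * dphi(x), -1, 1)   # ∫ u φ'
rhs = -simpson(lambda x: v(x) * phi(x), -1, 1)   # −∫ v φ
print(lhs, rhs)  # both ≈ -1/3, as (5.12) requires
```

Swapping v for anything else (say, v ≡ 0) breaks the equality, which is the whole content of the definition.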
Setting u(x) = ∫_R dx' n⋆(x') ln|x − x'|, we can write

∫ φ'(x) [ ∫ dx' n⋆(x') ln|x − x'| ] dx = lim_{ε→0} (1/2) ∫ φ'(x) [ ∫ dx' n⋆(x') ln((x − x')² + ε²) ] dx

= − lim_{ε→0} ∫ φ(x) dx ∫ dx' n⋆(x') (x − x') / ((x − x')² + ε²) = − ∫ φ(x) dx Pr ∫ dx' n⋆(x') / (x − x') ,   (5.13)

where Pr stands for Cauchy's principal value4.
where Pr stands for Cauchy’s principal value4 .
Comparing with (5.12), we obtain that the weak derivative of u(x) is Pr ∫ dx' n⋆(x')/(x − x'), therefore the new
3 This is true in general for potentials growing super-logarithmically at infinity - not just for the quadratic potential corresponding
to Gaussian ensembles.
4 This means precisely the limit lim_{ε→0} [ ∫_{−∞}^{x−ε} F(x') dx' + ∫_{x+ε}^{+∞} F(x') dx' ], if x is a singular point of F(x).
(singular) integral equation to be solved now is
Pr ∫ dx' n⋆(x') / (x − x') = x .   (5.14)
To solve (5.14), we invoke a theorem by Tricomi [23], stating that

Pr ∫_a^b dx' f(x') / (x − x') = g(x)   ⇒   f(x) = (1 / (π √((x − a)(b − x)))) [ C − (1/π) Pr ∫_a^b dt √((t − a)(b − t)) g(t) / (x − t) ] ,   (5.15)

provided that [a, b] is a single (compact) support and C is an arbitrary constant.
Question. Who tells me that the optimal counting function n⋆(x) is supported on a
single interval [a, b]?
I There is some nice physical intuition behind this. The "thermodynamical" in-
terpretation of the eigenvalues implies that the gas of particles is confined by a
quadratic well with a single minimum (see Fig. 4.1). It is then physically reason-
able to foresee that the particles will fill the single minimum of the potential. If
a potential has many minima, then it is possible that n⋆(x) "splits" into as many
connected components as the number of minima of the potential. Any attempt to
use (5.15) in these multiple-support cases will produce unphysical solutions.
Evaluating the principal value integral with g(t) = t and imposing the normalization ∫_a^b dx n⋆(x) = 1, we get

n⋆(x) = (1 / (π √((x − a)(b − x)))) [ 1 − x² + (a + b) x / 2 + (b − a)² / 8 ] .   (5.16)
Note that the density in (5.16) is a solution of the integral equation (5.14) between a and b for any choice
of a and b. How to fix the “optimal” a and b will be the subject of the next sections.
[Of course, do not even consider trusting us on this. You are not allowed to proceed until you have
derived (5.16) yourself. Sorry.]
5.4 Smart tricks
Now, stare at (5.16) intensely. As promised, the function n⋆(x) (defined for x ∈ (a, b)) indeed depends on two free parameters a and b.
We need now to compute the intensive free energy

f = F_0[n⋆(x)] .   (5.17)
It will of course depend as well on the two free parameters a and b, which arose as a Phoenix from the ashes
of the integral equation (5.14).
A couple of smart tricks will make our life easier. First, we would really like to get rid of the double
integral in
f ≡ F_0[n⋆(x)] = (1/2) ∫ dx x² n⋆(x) − (1/2) ∫∫ dx dx' n⋆(x) n⋆(x') ln|x − x'| .   (5.18)
To do that, we multiply the saddle point equation (5.10)
x²/2 − ∫ dx' n⋆(x') ln|x − x'| − κ = 0   (5.19)
by n? (x) and integrate over x. This way we obtain
∫∫ dx dx' n⋆(x) n⋆(x') ln|x − x'| = (1/2) ∫ dx n⋆(x) x² − κ ,   (5.20)

where we used ∫ n⋆(x) dx = 1.
Next, we fix the Lagrange multiplier κ by setting x = a in (5.19). We obtain κ = a²/2 − ∫_a^b dx n⋆(x) ln(x − a). Combining everything, we get

f ≡ F_0[n⋆(x)] = (1/4) ∫_a^b dx n⋆(x) x² + a²/4 − (1/2) ∫_a^b dx n⋆(x) ln(x − a) .   (5.21)
No more κ, and no more double integrals. Nice, uh?
5.5 The final touch
Inserting (5.16) into (5.21) and computing the integrals with the help of an abacus5 , we obtain
f ≡ f(a, b) = (1/512) [ −9a⁴ + 4a³b + 2a²(5b² + 48) + 4ab(b² + 16) − 256 ln(b − a) − 9b⁴ + 96b² + 512 ln 2 ] .   (5.22)
We now have our (quite ugly) intensive free energy: In the code [♠ integral_check.m] we provide
a simple numerical confirmation that the above result is equivalent to (5.21).
All we need to do is to minimize it with respect to a and b - the (soft) edge points of the support of n⋆(x).
5 It may be useful to first change variables z = (x − a)/(b − a). The resulting integrals can then be handled by most symbolic
computation programs.
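A short Python stand-in for the MATLAB script integral_check.m: the closed form (5.22) evaluates to 3/8 + (ln 2)/4 at a = −√2, b = √2 - the value (4.23) - and both partial derivatives vanish there, so it is a stationary point of f(a, b):

```python
import math

def f(a, b):
    """Intensive free energy (5.22) as a function of the support edges."""
    return (-9 * a**4 + 4 * a**3 * b + 2 * a**2 * (5 * b**2 + 48)
            + 4 * a * b * (b**2 + 16) - 256 * math.log(b - a)
            - 9 * b**4 + 96 * b**2 + 512 * math.log(2)) / 512

a0, b0 = -math.sqrt(2), math.sqrt(2)
print(f(a0, b0), 3 / 8 + math.log(2) / 4)   # identical

# (a0, b0) is a stationary point: central-difference gradient ≈ 0
h = 1e-5
df_da = (f(a0 + h, b0) - f(a0 - h, b0)) / (2 * h)
df_db = (f(a0, b0 + h) - f(a0, b0 - h)) / (2 * h)
print(df_da, df_db)  # both ≈ 0
```

Of course the true optimization is constrained: a and b must be such that n⋆(x; a, b) in (5.16) stays non-negative on its support.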
If you do that, you will obtain the solution6 a = −√2 and b = √2, which imply for n⋆(x) from (5.16) the following form

n⋆(x) ≡ ρ_SC(x) = (1/π) √(2 − x²) ,   (5.23)

the famous Wigner's semicircle law. Very appropriate name, given that it is not the equation of a semicircle, but rather of a semi-ellipse. The code [♠ Tricomi_check.m] offers a numerical verification that the semicircle indeed solves equation (5.14) for a = −b = −√2.
How to show this analytically, though?
We need to prove that

Pr ∫_{−√2}^{√2} dx' √(2 − x'²) / (π (x − x')) = x .   (5.24)
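Before attacking the primitive below, (5.24) can be checked numerically by subtracting the singularity: Pr ∫ ρ_SC(x')/(x−x') dx' = ∫ [ρ_SC(x') − ρ_SC(x)]/(x−x') dx' + ρ_SC(x) ln((x+√2)/(√2−x)), where the first integrand is regular at x' = x. A Python sketch (our own regularization choice, not the book's Tricomi_check.m):

```python
import math

SQ2 = math.sqrt(2)

def rho(t):
    """Semicircle density."""
    return math.sqrt(max(2 - t * t, 0.0)) / math.pi

def pv_integral(x, m=100000):
    """Pr ∫_{-√2}^{√2} ρ(t)/(x-t) dt via singularity subtraction:
    ∫ [ρ(t)-ρ(x)]/(x-t) dt  +  ρ(x) ln((x+√2)/(√2-x))  (midpoint rule)."""
    h = 2 * SQ2 / m
    total = 0.0
    for k in range(m):                      # midpoints avoid t = x exactly
        t = -SQ2 + (k + 0.5) * h
        total += (rho(t) - rho(x)) / (x - t) * h
    return total + rho(x) * math.log((x + SQ2) / (SQ2 - x))

for x in (-0.9, 0.0, 0.7):
    print(x, pv_integral(x))  # ≈ x in each case
```

The identity Pr ∫_a^b dt/(x−t) = ln((x−a)/(b−x)) for a < x < b supplies the second term.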
The primitive of the integrand is - ignoring an additive constant -

F(y) = [ √(2 − x²) ln(φ(x, y)) − √(2 − x²) ln(x − y) + x arcsin(y/√2) − √(2 − y²) ] / π ,   (5.25)

where

φ(x, y) = √(2 − x²) √(2 − y²) − xy + 2 .   (5.26)
Hence all you have to show is

lim_{ε→0⁺} [ F(x − ε) − F(−√2) + F(√2) − F(x + ε) ] = x ,   −√2 ≤ x ≤ √2 .   (5.27)
Have a go at it!
5.6 Epilogue
What is again the interpretation of the "semicircular" n⋆(x)? It is just the equilibrium profile of a gas of many charged particles on a line, which minimizes the free energy of the gas. In the "eigenvalue" language, it represents the normalized histogram of the N eigenvalues of a single (very big!) instance of the Gaussian ensemble. The property that this object also faithfully represents the spectrum averaged over many samples (i.e. n⋆(x) = ⟨n(x)⟩ = ρ(x)) is called self-averaging and we will assume it to hold.
The code [♠ Coulomb_gas.m] provides a numerical verification of what we worked on in this Chapter and the previous one. It simulates the Coulomb gas through a simple Monte Carlo procedure, which produces the equilibrium density for long enough times. Also, a numerical check of the semicircle distribution can be performed directly, i.e. through the numerical diagonalization of random matrices, with the code [♠ Gaussian_finite_N_rescaled.m] (see Fig. 5.1).
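Besides diagonalization, a cheap moment-based sanity check needs no eigensolver at all: Tr H² = ∑_{ij} H_ij², so with the entry variances of (6.5) (unit variance on the diagonal, 1/2 off it) one expects ⟨Tr H²⟩ = N + N(N−1)/2 for the GOE. A Python sketch (a stand-in for the MATLAB codes, using only the standard library):

```python
import random

random.seed(7)

def goe_trace_sq(N):
    """Sample Tr H^2 = sum_ij H_ij^2 for a GOE matrix with
    diagonal variance 1 and off-diagonal variance 1/2 (cf. (6.5))."""
    total = sum(random.gauss(0, 1) ** 2 for _ in range(N))          # diagonal
    total += 2 * sum(random.gauss(0, 0.5 ** 0.5) ** 2
                     for _ in range(N * (N - 1) // 2))              # i < j, counted twice
    return total

N, T = 8, 4000
mean = sum(goe_trace_sq(N) for _ in range(T)) / T
print(mean, N + N * (N - 1) / 2)  # Monte Carlo average ≈ 36 vs exact 36
```

This is only the second spectral moment, of course; the full histogram comparison requires diagonalization as in the figure.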
6 The fact that the soft edges are symmetrically located around the origin is a consequence of the symmetry of the confining
potential under the exchange x → −x.
Figure 5.1: Numerical check of the semicircle law for GOE: ρ_SC(x) against rescaled eigenvalue histograms for N = 10, 10² and 10³. Increasing the value of N, after a suitable rescaling, the eigenvalue histograms collapse on top of the semicircle curve.
Note that, at the very beginning of the derivation of (5.23), we rescaled the eigenvalues by √(βN) (Eq. (4.2)). Therefore, in the simulations we need to perform the same rescaling of our eigenvalues by √(βN) before comparing the histogram to the theoretical semicircle. This is in agreement with the precise statement we made in the second Question in Chapter 3, namely

lim_{N→∞} √(βN) ρ( √(βN) x ) = ρ_SC(x) ,   (5.28)

where the function ρ_SC(x) = (1/π) √(2 − x²) ≡ n⋆(x) is β-independent.
As a final remark, what happens if the confining potential is not quadratic? In general, if our invariant
ensemble is characterized by a joint probability density of the entries of the form
ρ[H] ∝ exp[ −Tr V(H) ] ,   (5.29)
then the joint law of the eigenvalues is of the form
ρ(x_1, . . . , x_N) ∝ exp[ − ∑_{i=1}^N V(x_i) ] ∏_{j<k} |x_j − x_k|^β   (5.30)
and the analogue of the Tricomi equation for the spectral density is
Pr ∫ dx' n⋆(x') / (x − x') = V'(x) .   (5.31)
Try to solve for n⋆(x) in the case V(x) = x − α ln x (x > 0). This will correspond to the Wishart-Laguerre
ensemble of random matrices, which will be extensively discussed in Chapter 13.
Question. Do all existing random matrix ensembles have the semicircle as their
average spectral density?
I Certainly not! The spectral density is highly non-universal - i.e. it strongly
depends on the ensemble you consider. This said, it is true that many ensembles
share it as their spectral density for large N. This is the case for instance of Wigner
ensembles (non-invariant), when the distribution of entries decays sufficiently fast
at infinity (see [24]).
Question. What are the moments of the semicircle law?
I They are given by the so-called Catalan numbers. More precisely, defining

⟨Tr X^k⟩ = ∫ dx_1 · · · dx_N ρ(x_1, . . . , x_N) ∑_{i=1}^N x_i^k = N ∫ dx x^k ρ(x) ,   (5.32)

where ρ(x_1, . . . , x_N) is the jpdf for the Gaussian ensemble (2.15) and ρ(x) its one-
point marginal for finite N, we have the relation

lim_{N→∞} (1/(β^n N^{n+1})) ⟨Tr X^{2n}⟩ = ∫_{−√2}^{√2} dy y^{2n} (1/π) √(2 − y²) = C_n / 2^n ,   (5.33)

where C_n = (1/(n+1)) (2n choose n) is the nth Catalan number. Catalan numbers occur in a variety
of combinatorial problems, for example C_n is the number of ways to correctly match
n pairs of brackets.
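The moment integral in (5.33) is quick to verify. A Python check of ∫_{−√2}^{√2} y^{2n} ρ_SC(y) dy = C_n/2^n for the first few n (substituting y = √2 sin θ):

```python
import math

def semicircle_moment(n, m=20000):
    """∫ y^(2n) ρ_sc(y) dy with y = sqrt(2) sin(theta) (Simpson's rule):
    the integrand becomes (2^(n+1)/π) sin(theta)^(2n) cos(theta)^2."""
    a, b = -math.pi / 2, math.pi / 2
    h = (b - a) / m
    g = lambda th: (2 ** (n + 1) / math.pi) * math.sin(th) ** (2 * n) * math.cos(th) ** 2
    s = g(a) + g(b) + sum((4 if k % 2 else 2) * g(a + k * h) for k in range(1, m))
    return s * h / 3

def catalan(n):
    """C_n = (1/(n+1)) * binomial(2n, n)."""
    return math.comb(2 * n, n) // (n + 1)

for n in range(5):
    print(n, semicircle_moment(n), catalan(n) / 2 ** n)  # the pairs agree
```

For n = 0 and n = 1 this reproduces the normalization and the variance 1/2 of the semicircle.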
Question. I see that the Coulomb gas treatment is insensitive to the precise value of
β . But is it possible to construct an explicit random matrix ensemble ρ[H], whose
eigenvalues are distributed according to a Coulomb gas with β 6= 1, 2, 4?
I Yes! This has been achieved by Dumitriu and Edelman [25], who produced
ensembles of tridiagonal matrices - hence non-invariant - with independent
but not identically distributed nonzero entries, whose jpdf of eigenvalues can be
nevertheless computed analytically. This jpdf turns out to be equal to the Gaussian
or Wishart-Laguerre ones, albeit with a continuous Dyson index β > 0 (it enters
as a parameter of the distribution of the nonzero entries). These ensembles are
very useful also on the numerical side: they provide a much faster way to sample
GXE-distributed eigenvalues (with X=O,U,S), without having to diagonalize full
Gaussian matrices!
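A minimal Python sketch of the tridiagonal construction (we assume here the standard Dumitriu-Edelman normalization - independent N(0,1) entries on the diagonal and χ_{β(N−k)}/√2 entries on the off-diagonal, which at β = 1 reproduces the Gaussian weight of (6.4); check [25] for the precise conventions). Only sampling, no diagonalization:

```python
import math
import random

random.seed(42)

def chi(df):
    """Sample a chi-distributed variable with (possibly non-integer) df,
    using chi^2(df) = 2 * Gamma(df/2, scale 1)."""
    return math.sqrt(2 * random.gammavariate(df / 2, 1.0))

def tridiagonal_beta(N, beta):
    """Diagonal d and off-diagonal e of the Hermite beta-ensemble matrix."""
    d = [random.gauss(0, 1) for _ in range(N)]
    e = [chi(beta * (N - k)) / math.sqrt(2) for k in range(1, N)]
    return d, e

# sanity check on the second moment: E[Tr H^2] = N + beta*N*(N-1)/2
N, beta, T = 6, 2.5, 4000
acc = 0.0
for _ in range(T):
    d, e = tridiagonal_beta(N, beta)
    acc += sum(x * x for x in d) + 2 * sum(x * x for x in e)
print(acc / T, N + beta * N * (N - 1) / 2)  # both ≈ 43.5
```

Note that β = 2.5 is a perfectly legal Dyson index here - something no invariant ensemble provides.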
Question. If I drop the symmetry requirements on the entries of the ensemble
(H_ij ≠ H_ji), what is the resulting analogue of the semicircle law for complex
eigenvalues?
I This is called the Girko-Ginibre (or circular) law. In essence, for any sequence
of random N × N matrices whose entries are i.i.d. random variables, all
with mean zero and variance equal to 1/N, the limiting spectral density is the
uniform distribution over the unit disc in the complex plane.
5.7 To know more...
1. The Gaussian ensemble for β = 2. The eigenvalues can be interpreted as the positions of fermions in
a harmonic trap. To understand this mapping, have a look at [26] and references therein.
2. Recently, the Coulomb gas technique has been improved and modified to tackle a wealth of different
problems. It all started with a beautiful calculation on the following problem: what is the probability
that all the eigenvalues of a Gaussian matrix are negative? Check this paper out [27].
3. The Gaussian ensembles can also come in a variant called fixed-trace: this means that one multiplies the jpdf (4.1) by δ( ∑_{i=1}^N x_i² − t ), which fixes the squared trace to the value t (see [28, 29] for details).
4. The normalization constant ZN,β for the Gaussian ensemble can be computed for finite N, with simple
algebraic manipulations on the so called Selberg integral
∫_0^1 · · · ∫_0^1 dx |Δ_N(x)|^β ∏_{i=1}^N x_i^{a−1} (1 − x_i)^{b−1} .   (5.34)

It was computed by the Norwegian mathematician A. Selberg, who showed that, when it exists, it is given by

∏_{j=1}^N [ Γ(1 + βj/2) Γ(a + (N − j)β/2) Γ(b + (N − j)β/2) ] / [ Γ(1 + β/2) Γ(a + b + β(2N − j − 1)/2) ] .   (5.35)
To know more about recent developments in the beautiful theory of Selberg integrals, have a look at
[5].
Chapter 6
Time for a change
In this Chapter, we show how to compute the jpdf of eigenvalues for random matrix models - whenever
possible.
6.1 Intermezzo: a simpler change of variables
Suppose we have to compute the following double integrals

I_1 = ∫_{R²} dx dy ρ_1(x, y) ,   I_2 = ∫_{R²} dx dy ρ_2(x, y) ,   (6.1)
with ρ1 (x, y) = f (x2 + y2 ) and ρ2 (x, y) = x f (x2 + y2 ). Here, f (t) is a function of your choice that makes
both integrals convergent.
A good strategy is to make the “polar” change of variables {x, y} = {r cos θ , r sin θ } to write
I_1 = ∫_0^∞ dr ∫_0^{2π} dθ ρ̂_1(r, θ) ,   I_2 = ∫_0^∞ dr ∫_0^{2π} dθ ρ̂_2(r, θ) ,   (6.2)
where ρ̂_1(r, θ) = r f(r²) and ρ̂_2(r, θ) = r² cos θ f(r²). Obviously, we had to include here the extra Jacobian factor

J(r, θ) = det( ∂x/∂r  ∂x/∂θ ; ∂y/∂r  ∂y/∂θ ) = r .   (6.3)
Therefore, we can formally write ρ1 (x, y)dxdy = ρ̂1 (r, θ )drdθ (and similarly for ρ2 ), meaning that the two
expressions give the same result once integrated over “corresponding” domains (e.g. R2 → (0, ∞) × (0, 2π)).
This is all trivial and easy. But together with the following two remarks, it is all you need to know to
fully understand what happens in the RMT case, with jpdf of entries and eigenvalues all over the place.
1. ρ̂1 (r, θ ) (the new integrand) is nothing but ρ1 (r cos θ , r sin θ ) × |J(r, θ )| (the old integrand, written in
terms of the new variables, times the Jacobian factor) - and similarly for ρ̂2 .
2. The marginal ρ̂_1(r) = ∫_0^{2π} dθ ρ̂_1(r, θ) is easier to compute than the corresponding ρ̂_2(r). This for two reasons: i) the original ρ_1(x, y), once expressed in the new polar variables, no longer depends on one of them (θ), and ii) also the Jacobian does not depend on θ. So the integration in θ becomes trivial and gives just a constant factor 2π.
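For a concrete instance, take f(t) = e^{−t}: then I_1 = ∫∫ e^{−(x²+y²)} dx dy = π, and the polar form ∫_0^∞ 2πr e^{−r²} dr gives the same number. A quick Python check of the two routes:

```python
import math

def simpson(g, a, b, m=2000):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    h = (b - a) / m
    s = g(a) + g(b) + sum((4 if k % 2 else 2) * g(a + k * h) for k in range(1, m))
    return s * h / 3

# Cartesian route: the integrand e^{-(x^2+y^2)} factorizes into two 1D integrals
gauss_1d = simpson(lambda x: math.exp(-x * x), -10, 10)
I1_cartesian = gauss_1d ** 2

# polar route: ∫ dr ∫ dθ  r e^{-r^2}  =  2π ∫_0^∞ r e^{-r^2} dr
I1_polar = 2 * math.pi * simpson(lambda r: r * math.exp(-r * r), 0, 10)

print(I1_cartesian, I1_polar, math.pi)  # all ≈ π
```

The factor r in the polar integrand is exactly the Jacobian (6.3); forget it and the two routes disagree.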
6.2 ...that is the question
Take the case of real symmetric matrices for simplicity - call them H instead of Hs from now on.
Look again at the jpdf of eigenvalues (2.15) for the GOE ensemble (β = 1)

ρ(x_1, . . . , x_N) = (1/Z_{N,β=1}) e^{ −(1/2) ∑_{i=1}^N x_i² } ∏_{j<k} |x_j − x_k| .   (6.4)

We gave it without proof.
How to obtain it from the jpdf of entries in the upper triangle, ρ[H]

ρ[H] = ∏_{i=1}^N ( e^{−H_ii²/2} / √(2π) ) ∏_{i<j} ( e^{−H_ij²} / √π ) ?   (6.5)
In this Chapter, we provide an answer to this outstanding question.
6.3 Keep your volume under control
A real symmetric matrix can be diagonalized by an orthogonal matrix O as H = OXO^T, with X = diag(x_1, . . . , x_N).
Orthogonal N × N matrices are characterized by the property that OO^T = 1, where 1 is the identity matrix. As a subspace of R^{N²}, these matrices form a sub-manifold V_N of dimension N(N − 1)/2, called the Stiefel manifold. dO is precisely its "volume element" - the analog of dθ in the warm-up example above.
We know that ∫_0^{2π} dθ = 2π. It is perhaps intuitive to give this number 2π the meaning of "volume" occupied while dθ spans the entire one-dimensional manifold (the circumference of the unit circle). What is, then, the "volume" occupied by orthogonal matrices in R^{N²}?
A relatively simple calculation [30] shows that
Vol(V_N) = ∫_{V_N} dO = 2^N π^{N²/2} / Γ_N(N/2) ,   (6.6)

where

Γ_m(a) = π^{m(m−1)/4} ∏_{i=1}^m Γ(a − (i − 1)/2) .   (6.7)
We will use this result in a minute.
If we call

DO = dO / Vol(V_N) ,   (6.8)
this defines the so-called Haar measure on the orthogonal group. The Haar measure is invariant under
orthogonal conjugation, and defines a probability space on orthogonal matrices. For further information,
consult [30, 31, 32, 33].
6.4 For doubting Thomases...
Let us compute the volume Vol(V2 ) for 2 × 2 orthogonal matrices “from first principles”1 .
Let

O = ( o_11  o_12 ; o_21  o_22 ) .   (6.9)

The {o_ij} are real variables. The volume we are after is

Vol(V_2) = ∫ ∏_{i,j=1}^2 do_ij δ( √(o_11² + o_21²) − 1 ) δ( √(o_12² + o_22²) − 1 ) δ( o_11 o_12 + o_21 o_22 ) ,   (6.10)
where the delta functions enforce the constraints on the columns of O being orthogonal with each other, and
each having unit norm.
Changing to polar coordinates, we get

Vol(V_2) = ∫_0^{2π} dθ ∫_0^{2π} dφ ∫_0^∞ dr r δ(r − 1) ∫_0^∞ dR R δ(R − 1) δ( rR cos(θ − φ) )

= ∫_0^{2π} dθ ∫_0^{2π} dφ δ( cos(θ − φ) ) = 4π ,   (6.11)

in agreement with (6.6) for N = 2, as it should.
in agreement with (6.6) for N = 2 as it should.
6.5 Jpdf of eigenvalues and eigenvectors
As in section 6.1 - but this time with more variables - we are after the change of variables H → {xx, O}
N
ρ(H11 , . . . , HNN ) ∏ dHi j = ρ(H11 (xx, O), . . . , HNN (xx, O)) J(H → {xx, O}) dO ∏ dxi . (6.12)
i≤ j i=1
ρ̂(x1 ,...,xN ,O)
| {z }
On the left hand side, the jpdf of the N(N + 1)/2 entries of H in the upper triangle, including the diagonal. On the right hand side, the jpdf ρ̂ of both eigenvalues (N) and independent eigenvector components (N(N − 1)/2, the dimension of the Stiefel manifold spanned by the orthogonal group over the reals). The number of "degrees of freedom" is OK, thanks to the mind-wrecking and highly nontrivial identity N(N + 1)/2 = N + N(N − 1)/2.
1 Alternatively, one may notice that the elements of V_2 can be written either in the form ( cos θ  sin θ ; −sin θ  cos θ ) (rotations in the plane by an angle θ) or in the form ( cos θ  sin θ ; sin θ  −cos θ ) (rotations followed by a reflection). That is, this group has two disconnected components. Clearly, each of these components has a volume 2π, so the volume of V_2 is 4π.
Clearly, on the right hand side we had to include the Jacobian of the change of variables, which we are
going to compute below. While in principle this Jacobian could depend on the full set of variables {x, O}, it turns out that it only depends on the eigenvalues {x}, exactly as it happens for the change to polar coordinates (6.3).
In our RMT case, this Jacobian is precisely the so-called Vandermonde determinant2 ,
J(H → {x, O}) = ∏_{j>k} (x_j − x_k) .   (6.13)
This can be generalized to the hermitian and quaternion self-dual cases. The only difference is that the
Vandermonde is then raised to the power β = 2, 4 respectively. We will prove this in the next Chapter.
6.6 Leave the eigenvalues alone
Now, stare at the right hand side of (6.12) carefully.
The joint probability density of eigenvalues and eigenvectors ρ̂(x1 , . . . , xN , O) is the product of two
terms: the jpdf of entries - written as a function of eigenvalues and eigenvectors - times the Jacobian - which
is a function of the eigenvalues alone.
Then the next question is: how can I get the jpdf of eigenvalues alone? Well, you will need to integrate
out the eigenvector components {O} in (6.12). More precisely
\hat\rho(x_1, \ldots, x_N)\, dx = dx \int_{V_N} dO\, \hat\rho(x_1, \ldots, x_N, O) ,   (6.14)
exactly as we did earlier on to find ρ̂1,2 (r) from ρ̂1,2 (r, θ ). And exactly as in that case, this integration over
VN may or may not be easy/possible to perform explicitly.
It is certainly possible when the original jpdf of entries, once expressed in terms of eigenvalues and
eigenvector components, is itself independent of eigenvectors - in complete analogy with our previous ex-
ample with r and θ . In this case, we would get
\hat\rho(x_1, \ldots, x_N, O) \equiv \rho(H_{11}(x, \cancel{O}), \ldots, H_{NN}(x, \cancel{O}))\, J(H \to \{x, \cancel{O}\}) = \text{function of } x \text{ alone} ,   (6.15)

and all that is left to do in (6.14) is the “volume” integral \int_{V_N} dO, yielding the simple constant in (6.6) - much
like 2π in the warm-up example with r, θ above.

2 Why this is indeed a determinant in disguise will become clearer very shortly.
The prototypes of this favorable case are the rotationally invariant ensembles, see the next section3 .
6.7 For invariant models...
We can now formulate a cute little theorem for invariant models [31]. The proof is given below.
Let the real symmetric N × N matrix H have a jpdf of entries ρ[H] = φ(\mathrm{Tr}\, H, \ldots, \mathrm{Tr}\, H^N), which is
evidently invariant under orthogonal similarity transformations⁴. Then the jpdf of the N ordered eigenvalues
of H (x_1 ≥ x_2 ≥ \cdots ≥ x_N) is

\rho_{\mathrm{ord}}(x_1, \ldots, x_N) = \frac{\pi^{N^2/2}}{\Gamma_N(N/2)}\, φ\!\left( \sum_i x_i, \ldots, \sum_i x_i^N \right) \prod_{i<j} (x_i - x_j) .   (6.16)
Note that there is no absolute value around the Vandermonde, as the eigenvalues are ordered.
Let us see how this theorem works in practice for the GOE case. We have
\rho[H] = \prod_{i=1}^N \frac{e^{-H_{ii}^2/2}}{\sqrt{2\pi}} \prod_{i<j} \frac{e^{-H_{ij}^2}}{\sqrt{\pi}} = \frac{1}{(2\pi)^{N/2}\, \pi^{\frac{N^2-N}{4}}} \exp\left( -\frac{1}{2} \mathrm{Tr}\, H^2 \right) .   (6.17)
Therefore, applying the theorem above
\rho_{\mathrm{ord}}(x_1, \ldots, x_N) = \underbrace{\frac{\pi^{N^2/2}}{\Gamma_N(N/2)} \times \frac{1}{(2\pi)^{N/2}\, \pi^{\frac{N^2-N}{4}}}}_{1/Z^{(\mathrm{ord})}_{N,\beta=1}}\, e^{-\frac{1}{2}\sum_{i=1}^N x_i^2} \prod_{i<j} (x_i - x_j) ,   (6.18)
which needs to be compared with Eq. (2.15) for β = 1 - given without proof at the time
\rho(x_1, \ldots, x_N) = \frac{1}{Z_{N,\beta=1}}\, e^{-\frac{1}{2}\sum_{i=1}^N x_i^2} \prod_{j<k} |x_j - x_k| ,   (6.19)

with

Z_{N,\beta=1} = (2\pi)^{N/2} \prod_{j=1}^N \frac{\Gamma(1 + j/2)}{\Gamma(3/2)} .   (6.20)
Do the two equations (6.18) and (6.19) indeed agree, as they should? Almost.
Notice that (6.18) holds for ordered eigenvalues, while (6.19) holds for unordered eigenvalues (hence
the need to include the absolute value). The two normalization constants indeed differ by a factor N!:

Z_{N,\beta=1} = N!\, Z^{(\mathrm{ord})}_{N,\beta=1} .   (6.21)

3 Instead, for models with independent entries, the jpdf of entries cannot be - in general - written in terms of the eigenvalues
alone. For such models, the jpdf of eigenvalues is therefore not generally known.
4 Recall Weyl's lemma (3.8).
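The normalization (6.20) is easy to sanity-check numerically for small N. The sketch below (Python with NumPy/SciPy - an illustration, not one of the book's ♠ MATLAB codes) compares a direct quadrature of the N = 2, β = 1 normalization integral of (6.19) with the closed form:

```python
import numpy as np
from scipy import integrate
from scipy.special import gamma

# N = 2, beta = 1: Z = ∫∫ exp(-(x1^2 + x2^2)/2) |x1 - x2| dx1 dx2, cf. (6.19)
Z_num, _ = integrate.dblquad(
    lambda x1, x2: np.exp(-0.5 * (x1**2 + x2**2)) * abs(x1 - x2),
    -np.inf, np.inf, -np.inf, np.inf)

# Closed form (6.20): Z = (2*pi)^(N/2) * prod_{j=1}^N Gamma(1 + j/2) / Gamma(3/2)
N = 2
Z_formula = (2 * np.pi)**(N / 2) * np.prod(
    [gamma(1 + j / 2) / gamma(3 / 2) for j in range(1, N + 1)])

print(Z_num, Z_formula)  # both come out to 4*sqrt(pi) ≈ 7.0898
```

For N = 2 both routes give 4√π, and the same comparison can be repeated for any small N where the multiple integral is still numerically feasible.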
6.8 The proof
Where does the normalizing factor

\frac{\pi^{N^2/2}}{\Gamma_N(N/2)}   (6.22)

in (6.16) come from? It is instructive to look at this derivation more closely.
Recall from (6.14) and (6.15) that (for the favorable case where one can integrate out the eigenvectors)
jpdf eigenv. = jpdf entries (as function of eigenv. alone) × |Vandermonde| × \int_{V_N} dO .   (6.23)

This means that morally the normalizing factor (6.22) should correspond to the volume integral \int_{V_N} dO as
in (6.6) (for β = 1, or \int_{\tilde V_N} dU over unitary matrix elements for β = 2 etc.).
There is a subtlety though: the change of variables between entries and eigenvalues (H = OXO^T) must
be one-to-one. But eigenvectors are defined up to a phase, e.g. if v is a real eigenvector, so is −v. To
guarantee the uniqueness of the eigen-decomposition, it is sufficient to fix the sign of the first row of the
matrix O, or the phases of the first row of the matrix U. This reduces the volume integral \int_{V_N} dO by a factor
2^N in the orthogonal case, and the volume integral \int_{\tilde V_N} dU by (2π)^N in the unitary case. And the proof is
complete.
Chapter 7
Meet Vandermonde
The “repulsive” term between eigenvalues of invariant models ∏i< j (x j − xi ) can be written as a determinant,
called Vandermonde in honor of the French mathematician Alexandre-Théophile Vandermonde (who never
wrote it [34]).
7.1 The Vandermonde determinant
We have the following identity
\Delta_N(x) := \prod_{i<j} (x_j - x_i) = \det(x_i^{j-1}) = \det \begin{pmatrix} 1 & \ldots & 1 \\ x_1 & \ldots & x_N \\ \vdots & & \vdots \\ x_1^{N-1} & \ldots & x_N^{N-1} \end{pmatrix} .   (7.1)
The Vandermonde is clearly a completely anti-symmetric polynomial in N variables: take for example
N = 3. We have \Delta_3(x) = (x_2 - x_1)(x_3 - x_1)(x_3 - x_2). Now, exchange any two x_j's: for example, x_3 ↔ x_2. We
get -\Delta_3(x) (we pick up a minus sign any time we make any exchange of two x_j's).
The Vandermonde has a quite funny property: we can understand it already on a 2 × 2 matrix. Take
\det \begin{pmatrix} 1 & 1 \\ x_1 & x_2 \end{pmatrix} = x_2 - x_1 \qquad \det \begin{pmatrix} 1 & 1 \\ 3x_1 + 17 & 3x_2 + 17 \end{pmatrix} = 3(x_2 - x_1) .   (7.2)
Stare at these two determinants carefully. We have just replaced the second row of the first matrix (con-
taining first powers of x1 and x2 ) with a first degree polynomial. The result is just 3 times the Vandermonde
on the left. The 17 has disappeared altogether! This means that you have a lot of freedom in devising a
matrix whose determinant gives the Vandermonde.
More formally, the entries x_i^k in the (k + 1)-th row can be replaced, up to a constant factor a_0 a_1 \cdots a_{N-1},
by a polynomial of degree k of the form π_k(x_i) = a_k x_i^k + \cdots, where we omit terms of lower order in x_i. The
important point is that these lower order terms can be absolutely anything. The result is that:
\Delta_N(x) = \frac{1}{a_0 a_1 \cdots a_{N-1}} \det \begin{pmatrix} \pi_0(x_1) & \ldots & \pi_0(x_N) \\ \pi_1(x_1) & \ldots & \pi_1(x_N) \\ \vdots & & \vdots \\ \pi_{N-1}(x_1) & \ldots & \pi_{N-1}(x_N) \end{pmatrix} .   (7.3)
Orthogonal polynomials are an important class of polynomials πk (x) that can be especially useful to play
this trick. We will discuss in Chapter 10 how this simple property can actually turn seemingly impossible
calculations into feasible ones.
For instance, let us show how the Hermite and Laguerre orthogonal polynomials can be used to express
the Vandermonde. For N = 3 it is easy to see that
\det \begin{pmatrix} H_0(x_1) & H_0(x_2) & H_0(x_3) \\ H_1(x_1) & H_1(x_2) & H_1(x_3) \\ H_2(x_1) & H_2(x_2) & H_2(x_3) \end{pmatrix} = \det \begin{pmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ x_1^2 - 1 & x_2^2 - 1 & x_3^2 - 1 \end{pmatrix} = \Delta_3(x) ,   (7.4)

and that

-\Delta_3(x)/2 = \det \begin{pmatrix} L_0(x_1) & L_0(x_2) & L_0(x_3) \\ L_1(x_1) & L_1(x_2) & L_1(x_3) \\ L_2(x_1) & L_2(x_2) & L_2(x_3) \end{pmatrix} = \det \begin{pmatrix} 1 & 1 & 1 \\ -x_1 + 1 & -x_2 + 1 & -x_3 + 1 \\ \frac{x_1^2}{2} - 2x_1 + 1 & \frac{x_2^2}{2} - 2x_2 + 1 & \frac{x_3^2}{2} - 2x_3 + 1 \end{pmatrix} .   (7.5)
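Identities like (7.4) and (7.5) are easy to verify numerically. In the sketch below (Python/NumPy, used here for illustration; the three evaluation points are arbitrary), `hermeval` evaluates the "probabilists-like" Hermite polynomials He_k, which coincide with the monic rows written above:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval   # He_0 = 1, He_1 = x, He_2 = x^2 - 1
from numpy.polynomial.laguerre import lagval      # L_0 = 1, L_1 = 1 - x, L_2 = x^2/2 - 2x + 1

x = np.array([0.3, 1.1, 2.5])                     # three arbitrary points
vdm = (x[1] - x[0]) * (x[2] - x[0]) * (x[2] - x[1])   # Delta_3(x)

# Row k holds the k-th polynomial evaluated at the three points
H = np.array([hermeval(x, [0] * k + [1]) for k in range(3)])
L = np.array([lagval(x, [0] * k + [1]) for k in range(3)])

print(np.linalg.det(H), vdm)        # equal: the He_k are monic (all a_k = 1)
print(np.linalg.det(L), -vdm / 2)   # equal: a_0 a_1 a_2 = 1 * (-1) * (1/2) = -1/2
```

The lower-order terms of the polynomials drop out of the determinant, exactly as the "funny property" above promises.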
7.2 Do it yourself
We now derive the nontrivial relation (6.13) J(H → {x, O}) = \Delta_N(x) for real symmetric matrices H. We
stress that this proof does not require any assumption on the rotational invariance of the ensemble.
These can be diagonalized through an orthogonal transformation H = OXO^T, where X = diag(x_1, . . . , x_N).
To find the Jacobian, we formally differentiate¹ H,

δH = (δO)XO^T + O(δX)O^T + OX(δO^T) ,   (7.6)

and use δO^T = -O^T(δO)O^T, which follows from OO^T = 1. We get

δH = (δO)XO^T + O(δX)O^T - OXO^T(δO)O^T .   (7.7)
1 The matrix element H_{ij} can be written as H_{ij} = \sum_{\ell,m} O_{i\ell} X_{\ell m} O_{jm} = \sum_\ell O_{i\ell}\, x_\ell\, O_{j\ell}. The infinitesimal matrix δH has entries
(δH)_{ij} = dH_{ij} given by the differential of H_{ij}. Eq. (7.6) is a shorthand of this explicit differentiation w.r.t. O_{i\ell}, x_\ell and O_{j\ell}.
Pulling out a factor O to the left and O^T to the right we obtain δH = O(δĤ)O^T, where

δĤ = (δΩ)X - X(δΩ) + δX .   (7.8)
Here, δΩ = O^T δO is an antisymmetric matrix². Since δH and δĤ are related via an orthogonal transformation, we only have to find the Jacobian of δĤ → {δX, δΩ}.
Noting that δX is diagonal, we can write

dĤ_{ij} = dΩ_{ij}(x_j - x_i) + dx_i δ_{ij} .   (7.9)

This is equivalent to the following differential relations:

\frac{dĤ_{ij}}{dx_k} = δ_{ij} δ_{ik} ,  \frac{dĤ_{ij}}{dΩ_{k\ell}} = δ_{ik} δ_{j\ell} (x_j - x_i) .   (7.10)
Don't you see the Vandermonde trying hard to crop up here ☺?
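Relation (7.8) can be checked by finite differences. In the sketch below (Python/NumPy plus SciPy's `expm`; an illustration with arbitrary random data, not one of the book's ♠ codes) we set O(t) = exp(t δΩ) and X(t) = X + t δX, so that the derivative of H(t) = O(t)X(t)O(t)^T at t = 0 must equal (δΩ)X − X(δΩ) + δX:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
N, s = 4, 1e-6

X = np.diag(rng.normal(size=N))          # diagonal matrix of "eigenvalues"
A = rng.normal(size=(N, N))
dOmega = (A - A.T) / 2                   # antisymmetric, as delta Omega must be
dX = np.diag(rng.normal(size=N))         # diagonal perturbation

def H(t):
    O = expm(t * dOmega)                 # orthogonal, since dOmega is antisymmetric
    return O @ (X + t * dX) @ O.T

dH_numeric = (H(s) - H(0)) / s           # finite-difference derivative at t = 0
dH_theory = dOmega @ X - X @ dOmega + dX # right-hand side of (7.8)
print(np.max(np.abs(dH_numeric - dH_theory)))   # tiny (finite-difference error)
```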
Let us now construct the Jacobian matrix J for a concrete 3 × 3 case. The generalization to the N × N
case will then appear obvious. The matrix J has dimension N(N+1)/2, so it is a 6 × 6 matrix for N = 3. We
parametrize the antisymmetric matrix δΩ as follows:

δΩ = \begin{pmatrix} 0 & Ω_{12} & Ω_{13} \\ -Ω_{12} & 0 & Ω_{23} \\ -Ω_{13} & -Ω_{23} & 0 \end{pmatrix} .   (7.11)
Then the Jacobian matrix becomes:

J = \begin{pmatrix}
\frac{d\hat H_{11}}{d\Omega_{12}} & \frac{d\hat H_{11}}{d\Omega_{13}} & \frac{d\hat H_{11}}{d\Omega_{23}} & \frac{d\hat H_{11}}{dx_1} & \frac{d\hat H_{11}}{dx_2} & \frac{d\hat H_{11}}{dx_3} \\
\frac{d\hat H_{12}}{d\Omega_{12}} & \frac{d\hat H_{12}}{d\Omega_{13}} & \frac{d\hat H_{12}}{d\Omega_{23}} & \frac{d\hat H_{12}}{dx_1} & \frac{d\hat H_{12}}{dx_2} & \frac{d\hat H_{12}}{dx_3} \\
\frac{d\hat H_{13}}{d\Omega_{12}} & \frac{d\hat H_{13}}{d\Omega_{13}} & \frac{d\hat H_{13}}{d\Omega_{23}} & \frac{d\hat H_{13}}{dx_1} & \frac{d\hat H_{13}}{dx_2} & \frac{d\hat H_{13}}{dx_3} \\
\frac{d\hat H_{22}}{d\Omega_{12}} & \frac{d\hat H_{22}}{d\Omega_{13}} & \frac{d\hat H_{22}}{d\Omega_{23}} & \frac{d\hat H_{22}}{dx_1} & \frac{d\hat H_{22}}{dx_2} & \frac{d\hat H_{22}}{dx_3} \\
\frac{d\hat H_{23}}{d\Omega_{12}} & \frac{d\hat H_{23}}{d\Omega_{13}} & \frac{d\hat H_{23}}{d\Omega_{23}} & \frac{d\hat H_{23}}{dx_1} & \frac{d\hat H_{23}}{dx_2} & \frac{d\hat H_{23}}{dx_3} \\
\frac{d\hat H_{33}}{d\Omega_{12}} & \frac{d\hat H_{33}}{d\Omega_{13}} & \frac{d\hat H_{33}}{d\Omega_{23}} & \frac{d\hat H_{33}}{dx_1} & \frac{d\hat H_{33}}{dx_2} & \frac{d\hat H_{33}}{dx_3}
\end{pmatrix}
= \begin{pmatrix}
0 & 0 & 0 & 1 & 0 & 0 \\
x_2 - x_1 & 0 & 0 & 0 & 0 & 0 \\
0 & x_3 - x_1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & x_3 - x_2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix} .   (7.12)
Swapping rows and columns, it is possible to bring this to a diagonal form, so that the determinant becomes
trivial to compute. In the general N case, one has:

|\det J| = \prod_{j<k} |x_j - x_k| ,   (7.13)

2 Obviously, you need to prove it before proceeding.
as expected. The proof in the complex hermitian and quaternion self-dual cases is analogous and is left as
an exercise.
For a nice numerical test of the Jacobian identity (7.13), we refer to [35], Section 3.2, while for a “back-
of-the-envelope” derivation based on counting degrees of freedom, see [36].
We will make extensive use of the Vandermonde determinant and its properties in Chapter 10.
Chapter 8
Resolve(nt) the semicircle
In this Chapter, we introduce the so called resolvent, a complex function from which the spectral density¹
can be calculated. The advantage of the resolvent approach is that one has to solve an algebraic equation
(like ax² + bx + c = 0) instead of a (singular) integral equation (like \mathrm{Pr} \int dx' \frac{n^\star(x')}{x - x'} = x, see (5.14)). The
disadvantage is that you need to know a bit of complex analysis.
8.1 A bit of theory
We introduce the complex function G_N(z), with z ∈ C \ {x_i}

G_N(z) = \frac{1}{N} \mathrm{Tr} \frac{1}{z - H} = \frac{1}{N} \sum_{i=1}^N \frac{1}{z - x_i} ,   (8.1)

where the notation 1/(z − H) means the matrix inverse of z1 − H, and 1 is the N × N identity matrix.
If H is a random matrix, then GN (z) is a random complex function that has poles at the locations xi of
each eigenvalue.
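For a single sampled matrix, (8.1) can be computed either as the trace of the matrix inverse or as a sum over eigenvalues; the following sketch (Python/NumPy, purely illustrative) confirms the two coincide:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
A = rng.normal(size=(N, N))
H = (A + A.T) / 2                       # a real symmetric (GOE-like) matrix
x = np.linalg.eigvalsh(H)               # its N real eigenvalues

z = 1.5 + 0.5j                          # any z away from the eigenvalues
G_trace = np.trace(np.linalg.inv(z * np.eye(N) - H)) / N
G_sum = np.mean(1.0 / (z - x))
print(G_trace, G_sum)                   # identical up to round-off
```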
The second ingredient we need is the Sokhotski-Plemelj formula

\lim_{ε→0^+} \frac{1}{y ± iε} = \mathrm{Pr} \frac{1}{y} ∓ iπδ(y) ,   (8.2)

which should be interpreted as the integral relation (for a real-valued test function ϕ(x) such that the integrals
make sense)

\lim_{ε→0^+} \left( \int_{-∞}^{-ε} + \int_{ε}^{∞} \right) dy\, \frac{ϕ(y)}{y} ∓ iπϕ(0) = \lim_{ε→0^+} \int_{-∞}^{∞} dy\, \frac{ϕ(y)}{y ± iε} .   (8.3)
For a one-liner proof, see below (around (8.6)).
1 Unless otherwise stated, we will no longer make a distinction between n^⋆(x), ⟨n(x)⟩ and ρ(x).
Question. What is the point of introducing this identity?
► First, stare at (8.2) carefully. You see that, on the right hand side, the
imaginary part is just a delta function. So, this identity is (yet another) way of
representing a delta function, as the imaginary part of a rational function (the left
hand side). Knowing that the spectral density is defined in terms of a delta function
ρ(x) = ⟨(1/N) \sum_{i=1}^N δ(x - x_i)⟩, you should be spotting an interesting connection
here. More on this later.
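The delta function on the right hand side of (8.2) can be watched emerging numerically: integrate a smooth test function against the Lorentzian ε/(y² + ε²) and shrink ε. A short sketch (Python/SciPy; the test function is an arbitrary choice made here for illustration):

```python
import numpy as np
from scipy import integrate

phi = lambda y: np.exp(-y**2) * np.cos(y)      # smooth, rapidly decaying test function

for eps in (1e-1, 1e-2, 1e-3):
    val, _ = integrate.quad(lambda y: phi(y) * eps / (y**2 + eps**2),
                            -50, 50, points=[-eps, 0.0, eps], limit=200)
    print(eps, val / np.pi)                    # -> phi(0) = 1 as eps -> 0+
```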
8.2 Averaging
Imagine now taking the limit N → ∞ of ⟨G_N(z)⟩, where we average over the distribution of the matrix H.
This average is called resolvent, or Green's function, or Stieltjes transform. It is natural to assume (and can
be mathematically justified) that:
• the sum in (8.1)

G_N(z) = \frac{1}{N} \mathrm{Tr} \frac{1}{z - H} = \frac{1}{N} \sum_{i=1}^N \frac{1}{z - x_i}   (8.4)

gets converted into an integral,
• the poles at xi merge into a continuous “cut” on the real line,
• we have to “weigh” the integrand with the average density of eigenvalues ρ(x) at point x.
The cut on the real line is therefore nothing but the support of the spectral density, and the average
resolvent is defined for all complex values z outside this cut (for example, outside the interval [-√2, √2] on
the real line for the Gaussian ensemble).
In formulae

⟨G_N(z)⟩ → G_∞^{(av)}(z) = \int dx' \frac{ρ(x')}{z - x'} , for N → ∞ .   (8.5)
If you are inclined to believe that (8.5) is very plausible (to say the least), we can now proceed smoothly.
Compute now G_∞^{(av)}(z) (the averaged resolvent in the large N limit) at z = x − iε. Carrying out this
herculean task, we get

G_∞^{(av)}(x - iε) = \int dx' \frac{ρ(x')}{x - iε - x'} = \int dx' \frac{ρ(x')(x - x')}{(x - x')^2 + ε^2} + i \int dx' \frac{ρ(x')\, ε}{(x - x')^2 + ε^2} ,   (8.6)

where we have multiplied up and down by x − x' + iε and separated the real and imaginary parts.
Sending now ε → 0^+, we are basically proving the Sokhotski-Plemelj formula: the real part becomes a
principal value integral (the so called Hilbert transform), \mathrm{Pr} \int dx' \frac{ρ(x')}{x - x'}, while the imaginary part (with the sign
reversed with respect to the argument of G_∞^{(av)}, ± → ∓) becomes πρ(x), using the following representation
for the delta function

δ(x) = \frac{1}{π} \lim_{ε→0^+} \frac{ε}{x^2 + ε^2} .   (8.7)
In summary

ρ(x) = \frac{1}{π} \lim_{ε→0^+} \mathrm{Im}\, G_∞^{(av)}(x - iε) .   (8.8)

So, if you know (or can calculate) the resolvent in the complex plane, you can deduce the spectral
density from it.
All this in theory. Practice in the next section.
Are there important properties of the resolvent G_∞^{(av)}(z) in (8.5) that are worth
remembering?
► First of all, if you send |z| → ∞ in (8.5), you get

G_∞^{(av)}(z) = \int dx' \frac{ρ(x')}{z - x'} ≈ \frac{1}{z} \int dx'\, ρ(x') = \frac{1}{z} + \ldots ,   (8.9)

where we have used normalization of ρ(x). This asymptotic ∼ 1/z behavior can
be important in applications.
Next, expanding the denominator in (8.5) to all orders, we observe that the resolvent
is the generating function of the moments μ_k = ⟨\mathrm{Tr}(H^k)⟩ = \int dx\, ρ(x)\, x^k

G_∞^{(av)}(z) = \int dx' \frac{ρ(x')}{z - x'} = \frac{1}{z} \int dx' \frac{ρ(x')}{1 - x'/z} = \frac{1}{z} \sum_{k=0}^∞ \int dx'\, ρ(x') \left( \frac{x'}{z} \right)^k = \sum_{k=0}^∞ \frac{μ_k}{z^{k+1}} ,   (8.10)

with μ_0 = 1.
8.3 Do it yourself
We propose here a truly elementary derivation of the algebraic equation satisfied by the resolvent for the
Gaussian ensemble.
Consider the partition function of the standard Gaussian ensemble, after a further rescaling x_i → x_i √N
and ignoring prefactors

Z_{N,β} ∝ \int_{\mathbb{R}^N} \prod_{j=1}^N dx_j\, e^{-\frac{βN}{2} \sum_{i=1}^N x_i^2} \prod_{j<k} |x_j - x_k|^β = \int_{\mathbb{R}^N} \prod_{j=1}^N dx_j\, e^{-βN V[x]} ,   (8.11)

with

V[x] = \frac{1}{2} \sum_i x_i^2 - \frac{1}{2N} \sum_{i≠j} \ln|x_i - x_j| .   (8.12)
Compared to our earlier Coulomb gas treatment, we have pulled out a factor N (not N 2 ), so that the
xi are now of O(1) for large N. Instead of introducing a continuous counting function n(x) (as we did in
Chapter 4), we can directly perform the saddle point evaluation of the N-fold integral (8.11), obtaining for
each variable xi the equation
\frac{∂V[x]}{∂x_i} = 0 ⇒ x_i = \frac{1}{N} \sum_{j≠i} \frac{1}{x_i - x_j} .   (8.13)
Multiplying (8.13) by \frac{1}{N(z - x_i)} and summing over i, we get:

\frac{1}{N} \sum_i \frac{x_i}{z - x_i} = \frac{1}{N^2} \sum_i \sum_{j≠i} \frac{1}{x_i - x_j}\, \frac{1}{z - x_i} .   (8.14)
Adding and subtracting z in the numerator, the left-hand-side L becomes

L = \frac{1}{N} \sum_i \frac{x_i - z + z}{z - x_i} = -1 + z\, \frac{1}{N} \sum_i \frac{1}{z - x_i} = -1 + z G_N(z) .   (8.15)
As for the right-hand-side, let us define R = \frac{1}{N^2} \sum_{i=1}^N \sum_{j≠i} \frac{1}{z - x_i}\, \frac{1}{x_i - x_j}. Writing

\frac{1}{(z - x_i)(x_i - x_j)} = \frac{1}{z - x_j} \left( \frac{1}{z - x_i} + \frac{1}{x_i - x_j} \right) ,   (8.16)
one obtains the following self-consistency equation for R

R = G_N^2(z) + \frac{1}{N} G_N'(z) - R ⇒ R = \frac{1}{2} G_N^2(z) + \frac{1}{2N} G_N'(z) .   (8.17)
Equating L to R, we obtain as promised that the saddle-point condition (8.13) gets converted into an
equation for the resolvent

-1 + z G_N(z) = \frac{1}{2} G_N^2(z) + \frac{1}{2N} G_N'(z) .   (8.18)
This is good, but it is still a differential equation for G_N(z), while we promised an even simpler algebraic
equation. It is actually easy to get rid of the differential term in (8.18) by noticing that, with x_i of O(1), the
resolvent as defined in (8.1) is itself of O(1) and therefore the term \frac{1}{2N} G_N'(z) is subleading for large N.
Taking the average, the surviving algebraic (at long last!) equation for N → ∞ reads

G_∞^{(av)2}(z) - 2z G_∞^{(av)}(z) + 2 = 0 .   (8.19)
It is instructive to solve (8.19) directly as a quadratic equation (recall that quadratic equations for
complex variables admit the same solving formula as their real counterparts), yielding

G_∞^{(av)}(z) = z ± \sqrt{z^2 - 2} .   (8.20)
Setting now z = x − iε, we obtain G_∞^{(av)}(x - iε) = x - iε ± \sqrt{(x^2 - ε^2 - 2) + i(-2xε)}. The square root
(with positive real part) of a complex number a + ib can be written as [37] \sqrt{a + ib} = p + iq, with

p = \frac{1}{\sqrt 2} \sqrt{\sqrt{a^2 + b^2} + a} ,  q = \frac{\mathrm{sign}(b)}{\sqrt 2} \sqrt{\sqrt{a^2 + b^2} - a} ,   (8.21)

where sign(x) = 1 if x > 0 and = −1 if x < 0.
Hence we obtain (recalling (8.8))

\frac{1}{π} \mathrm{Im}\, G_∞^{(av)}(x - iε) = -\frac{ε}{π} ± \frac{\mathrm{sign}(-2xε)}{π \sqrt 2} \sqrt{\sqrt{(x^2 - ε^2 - 2)^2 + 4x^2 ε^2} - x^2 + ε^2 + 2}
\xrightarrow{ε→0^+} ± \frac{\mathrm{sign}(-x)}{π \sqrt 2} \sqrt{|x^2 - 2| - x^2 + 2} .   (8.22)
From this expression, you see that i) for |x| > √2 you obtain that the density is 0, and ii) for |x| < √2,
you need to select the (−) or (+) sign in front, according to whether x > 0 or x < 0 respectively. After
choosing the right sign, you get ρ(x) = (1/π)\sqrt{2 - x^2} as expected.
To double-check this result, we can insert the semicircle ρ_{SC}(x) = \frac{1}{π} \sqrt{2 - x^2} back into the definition
(8.5) and perform the numerical integration for z = x − iε, with ε a small positive number and separately
for the two cases, 0 < x < √2 and −√2 < x < 0. This is done with the code [♠ check_resolvent.m],
where the results are compared with the two choices of sign in (8.20). You see that the (+) choice in (8.20)
only works with −√2 < x < 0, and the (−) choice in (8.20) only works with 0 < x < √2.
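The ♠ code above is MATLAB; an equivalent sketch in Python/SciPy (an illustration, with ε and the two test points chosen arbitrarily) compares the integral definition (8.5) against the two branches of (8.20):

```python
import numpy as np
from scipy import integrate

sq2 = np.sqrt(2.0)
rho_sc = lambda t: np.sqrt(max(2.0 - t * t, 0.0)) / np.pi   # semicircle density

def G_def(z, x0):
    """Resolvent from the definition (8.5), by numerical quadrature."""
    f = lambda t: rho_sc(t) / (z - t)
    re, _ = integrate.quad(lambda t: f(t).real, -sq2, sq2, points=[x0], limit=200)
    im, _ = integrate.quad(lambda t: f(t).imag, -sq2, sq2, points=[x0], limit=200)
    return re + 1j * im

eps = 1e-3
for x in (0.7, -0.7):                        # one point on each side of zero
    z = x - 1j * eps
    G_minus = z - np.sqrt(z * z - 2)         # (-) branch of (8.20)
    G_plus = z + np.sqrt(z * z - 2)          # (+) branch of (8.20)
    print(x, abs(G_def(z, x) - G_minus), abs(G_def(z, x) - G_plus))
    # x > 0: the (-) branch matches; x < 0: the (+) branch matches
```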
8.4 Localize the resolvent
Let us now take a step back and “unpack” the definition of the resolvent in equation (8.1).
We wrote that definition as the trace of the matrix (z − H)^{−1}, with z ∈ C \ {x_i}, the x_i's being the eigenvalues
of H. If we write the trace explicitly, i.e. as a sum over the diagonal elements of the resolvent, we have
G_N(z) = (1/N) \sum_{i=1}^N G_{N,ii}(z), where

G_{N,ii}(z) = [(z - H)^{-1}]_{ii} = \sum_{j=1}^N \frac{(c_i^j)^2}{z - x_j} ,   (8.23)

and c_i^j is the i-th component of the normalized eigenvector associated with the j-th eigenvalue of H.
So, you may now ask, what should I make of these matrix elements? Well, it turns out that they contain
precious information about the localization properties of the matrix ensemble they are associated with.
But, first of all, what is localization?
Simply put, the term localization refers to how “spread out” over their components the eigenvectors of a
matrix are. Let us define the inverse participation ratio (IPR) of a normalized eigenvector as

I_{N,j} = \sum_{i=1}^N (c_i^j)^4 .   (8.24)
Now, when the eigenvector's components are all roughly of the same magnitude, then we must have
c_i^j ≃ 1/√N, ∀ i, due to normalization. Hence, we will have I_{N,j} ≃ 1/N, and the IPR will vanish in the large
N limit.
If, on the other hand, the eigenvector is significantly different from zero only on a number s of sites, then
for those sites we will have c_i^j ≃ 1/√s, and the IPR will remain roughly equal to 1/s in the large N limit.
So, all in all, the IPR is a handy tool that tells us whether certain eigenvectors of a matrix are extended (i.e.
have an extensive number of non zero components) or instead localized on a finite number of sites.
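The 1/N scaling is easy to observe by direct sampling. In this sketch (Python/NumPy, illustrative; the overall normalization of the GOE entries is irrelevant, since only the eigenvectors enter) the mean IPR of GOE eigenvectors, multiplied by N, settles near 3, the value dictated by the Gaussian statistics of the components:

```python
import numpy as np

rng = np.random.default_rng(2)

def goe(N):
    A = rng.normal(size=(N, N))
    return (A + A.T) / 2

for N in (100, 400):
    _, vecs = np.linalg.eigh(goe(N))     # columns are normalized eigenvectors
    ipr = np.sum(vecs**4, axis=0)        # one IPR per eigenvector, eq. (8.24)
    print(N, N * ipr.mean())             # ≈ 3: extended states, IPR ~ 1/N
```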
Although this may sound like a mathematical curiosity, the localization properties of matrix ensembles
are related to a number of relevant features of the physical systems they describe. In particular, it is often
crucial to detect the so called mobility edge, i.e. the critical eigenvalue that separates the part of the spectrum
associated with extended states from the one associated with localized states. For example, it has famously
been shown that the mobility edge determines the Anderson transition in electronic systems [38].
All in all, it should by now be clear that having analytical access to the distributional properties of the IPRs
corresponding to different segments of a given ensemble’s eigenvalue spectrum is a valuable thing. Luckily,
this is where our diagonal elements (8.23) come to the rescue. Indeed, it has been shown in [39] that the
average value P(x) of IPRs associated with states whose corresponding eigenvalues lie between x and x + dx
can be written in the large N limit as
P(x) = \lim_{ε→0^+} \lim_{N→∞} \frac{ε}{πNρ(x)} \sum_{i=1}^N \left| G_{N,ii}(x - iε) \right|^2 = \lim_{ε→0^+} \frac{ε}{πρ(x)} \left| G_∞^{(av)}(x - iε) \right|^2 .   (8.25)
8.5 To know more...
1. The saddle-point evaluation (8.13) based on the partition function (8.11) is clearly valid when the
neglected terms in the exponent are indeed subleading (O(N)). There are models - rotationally in-
variant by construction - where the Dyson index β is allowed to scale with N [40, 41]. These models
provide explicit realizations of invariant β -ensembles, for which the resolvent equation is necessarily
more involved. Ref. [41] is also suggested for an elementary derivation of this “improved” resolvent
equation in the presence of a hard wall in the spectrum.
2. Matrix models such as the Gaussian can be constructed by introducing a fictitious (stochastic) time
evolution of the entries. In this case, it is possible to show that the resolvent satisfies a partial differential
equation of the Burgers type (see the beautiful paper [42]).
3. The equation for the resolvent can be given a pretty interpretation in terms of planar diagrams. Dia-
grammatic methods are at the heart of many beautiful results in RMT (see [43] and [44]).
Chapter 9
One pager on eigenvectors
Take the GUE ensemble of N × N hermitian matrices. Any given matrix in the ensemble will have unit-norm
eigenvectors having in general complex components. What is the statistics of such components?
Since eigenvalues and eigenvectors of invariant matrix models are decoupled, the only constraint on the
N components of an eigenvector is that its norm must be one, therefore their jpdf reads

P_{GUE}(c) = C_N\, δ\!\left( 1 - \sum_{n=1}^N |c_n|^2 \right) ,   (9.1)
where CN is a normalization constant.
It is convenient to compute the marginal distribution of a single component, say |c_1|^2, given by

P_{GUE}(y) = \int d^2c_1 \cdots d^2c_N\, δ(y - |c_1|^2)\, P_{GUE}(c) .   (9.2)
Similarly, we can compute the jpdf of eigenvector components (this time all real numbers) of a GOE
matrix.
The calculation in (9.2) is carried out by first defining an auxiliary object

P_{GUE}(y; t) = \int d^2c_1 \cdots d^2c_N\, δ(y - |c_1|^2)\, C_N\, δ\!\left( t - \sum_{n=1}^N |c_n|^2 \right) ,   (9.3)
such that P_{GUE}(y) = P_{GUE}(y; 1). Then, taking the Laplace transform with respect to t to kill the delta function
in (9.3)

\int_0^∞ dt\, e^{-st} P_{GUE}(y; t) = C_N \int d^2c_1\, δ(y - |c_1|^2)\, e^{-s|c_1|^2} \left( \int d^2c\, e^{-s|c|^2} \right)^{N-1} ,   (9.4)
and finally converting the 2d integrals in polar coordinates

\int_0^∞ dt\, e^{-st} P_{GUE}(y; t) = \hat C_N \int_0^∞ dr\, r\, δ(y - r^2)\, e^{-sr^2} \left( \int_0^∞ dρ\, ρ\, e^{-sρ^2} \right)^{N-1} ∝ \frac{e^{-sy}}{s^{N-1}} ,   (9.5)
where we have absorbed the angular constants in the overall normalization.
Inverting the Laplace transform, we obtain

P_{GUE}(y; t) ∝ (t - y)^{N-2}\, θ(t - y) ,   (9.6)

where θ(z) is the Heaviside step function. Setting t = 1 and normalizing, we obtain

P_{GUE}(y) = (N - 1)(1 - y)^{N-2} for 0 ≤ y ≤ 1 .   (9.7)
Similarly, for the GOE one obtains

P_{GOE}(y) = \frac{Γ(N/2)}{\sqrt{π}\, Γ((N - 1)/2)} \frac{(1 - y)^{(N-3)/2}}{\sqrt{y}} for 0 ≤ y ≤ 1 .   (9.8)
Computing the average ⟨y⟩ in both cases

⟨y⟩_{GOE} = \int_0^1 dy\, y\, P_{GOE}(y) = \frac{Γ(N/2)}{2Γ(1 + N/2)} ∼ \frac{1}{N} ,  ⟨y⟩_{GUE} = \int_0^1 dy\, y\, P_{GUE}(y) = \frac{1}{N} ,   (9.9)
leads us to consider the scaled variable η = yN and take the limit N → ∞. This produces the scaled densities

P_{GOE}(η) = \lim_{N→∞} \frac{1}{N} P_{GOE}\!\left( \frac{η}{N} \right) = \frac{e^{-η/2}}{\sqrt{2πη}} ,   (9.10)

P_{GUE}(η) = \lim_{N→∞} \frac{1}{N} P_{GUE}\!\left( \frac{η}{N} \right) = e^{-η} .   (9.11)
The first of these densities is called the Porter-Thomas distribution [45, 46]. Note also that the Gaussian
nature of the matrix ensembles has not been used anywhere in the derivation (the same densities would be
obtained for any orthogonal or unitary ensemble).
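These limiting laws are easy to see in a simulation. The sketch below (Python/NumPy, illustrative; matrix size and number of samples are arbitrary) collects η = N|c₁|² over the eigenvectors of sampled GUE matrices and checks the first two moments against the exponential density (9.11):

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 64, 200

def gue(N):
    A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    return (A + A.conj().T) / 2

eta = []
for _ in range(T):
    _, V = np.linalg.eigh(gue(N))
    eta.extend(N * np.abs(V[0, :])**2)   # first component of every eigenvector
eta = np.array(eta)

print(eta.mean(), eta.var())             # both ≈ 1 for the exponential density e^{-eta}
```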
The study of eigenvectors of random matrices has been recently revived due to their importance in
quantum systems (see, e.g., [47, 48, 49, 50]).
Chapter 10
Finite N
Look back at Chapter 1, where we constructed Gaussian matrices and histogrammed their eigenvalues. For
N → ∞, we showed in various ways that the average spectral density converges to the semicircle law. But
what happens for finite N? Can we compute analytically the shape of the histogram for, say, a 13 × 13
Gaussian matrix? The answer is Yes - and not only for Gaussian matrices, but for any rotationally invariant
ensemble! This is done here. We start from the case β = 2, as it is much easier.
10.1 β = 2 is easier
Already in Chapter 7, we mentioned that the Vandermonde determinant has some funny properties: in par-
ticular, each row in the Vandermonde matrix can be replaced by a polynomial of suitable degree, with many
a priori unspecified coefficients. The freedom in choosing these polynomials is enormous. A judicious
choice is the key to the celebrated orthogonal polynomial technique.
Take the jpdf of the N real eigenvalues of a rotationally invariant ensemble with β = 2

ρ(x_1, \ldots, x_N) = \frac{1}{Z_N} \prod_{i=1}^N e^{-V(x_i)}\, |∆_N(x)|^2 ,   (10.1)

which is written in the ‘potential’ form (see eq. (5.30)). For example, for the Gaussian ensemble V(x) = x^2/2.
What is the goal then? To compute the average spectral density for finite N, i.e. the (N − 1)-fold integral

ρ(x_1) = \int dx_2 \cdots dx_N\, ρ(x_1, x_2, \ldots, x_N) = \frac{1}{Z_N} \int dx_2 \cdots dx_N \prod_{i=1}^N e^{-V(x_i)}\, |∆_N(x)|^2 ,   (10.2)

where the partition function is Z_N = \int dx_1 \cdots dx_N \prod_{i=1}^N e^{-V(x_i)}\, |∆_N(x)|^2.
Note that in (10.2) we are integrating over all variables but one. These integrals are nasty, though! The
integrand does not factorize at all, so we need to find some smart trick to carry out the integration. It took
a while even for the pioneers of these calculations (for instance, Gaudin and Mehta) to figure out how to
proceed. The steps are as follows:
Step 1:
Rewrite the Vandermonde ∆_N(x) as a determinant of the matrix A, whose entries are polynomials π_k(x)
(to be determined), as in (7.3)

∆_N(x) = \frac{1}{a_0 a_1 \cdots a_{N-1}} \det \begin{pmatrix} π_0(x_1) & \ldots & π_0(x_N) \\ π_1(x_1) & \ldots & π_1(x_N) \\ \vdots & & \vdots \\ π_{N-1}(x_1) & \ldots & π_{N-1}(x_N) \end{pmatrix} .   (10.3)
Step 2:
Use the general relation¹

(\det A)^2 = \det(A^T A) = \det\left( \sum_{j=1}^N A_{ji} A_{jk} \right) ,   (10.4)
applied to the matrix A from step 1, to write

∆_N^2(x) = \frac{1}{(\prod_{j=0}^{N-1} a_j)^2} \det\left( \sum_{j=1}^N π_{j-1}(x_i)\, π_{j-1}(x_k) \right) .   (10.5)
Step 3:
Pull the weight exp(−\sum_i V(x_i)) inside the determinant² and shift the index j → j − 1, to write eventually

ρ(x_1, \ldots, x_N) = \frac{1}{Z_N (\prod_{j=0}^{N-1} a_j)^2} \det\left( \sum_{j=0}^{N-1} φ_j(x_i)\, φ_j(x_k) \right) = \frac{\det(K_N(x_i, x_k))}{Z_N (\prod_{j=0}^{N-1} a_j)^2} ,   (10.6)
where φ_i(x) = e^{-V(x)/2} π_i(x) and

K_N(x, x') = e^{-\frac{1}{2}(V(x) + V(x'))} \sum_{j=0}^{N-1} π_j(x)\, π_j(x') ,   (10.7)

which is a central object in RMT: the kernel.
1 Hereafter, inside a determinant the indices of the entries will run from 1 to N.
2 Use (\prod_\ell α_\ell) \det(f(i, j)) = \det(\sqrt{α_i α_j}\, f(i, j)).
Step 4:
Choose judiciously the (so far undetermined) polynomials π_j(x). A great choice is to pick them
orthonormal with respect to the weight³ exp(−V(x))

\int e^{-V(x)}\, π_i(x)\, π_j(x)\, dx = δ_{ij} .   (10.8)

For instance, for the Gaussian (unitary) ensemble (V(x) = x^2/2) the corresponding orthonormal polynomials are

π_j(x) = \frac{H_j(x/\sqrt 2)}{\sqrt{\sqrt{2π}\, 2^j\, j!}} ,   (10.9)

if H_j(x) are Hermite polynomials satisfying \int_{-∞}^∞ dx\, H_j(x) H_k(x)\, e^{-x^2} = \sqrt{π}\, 2^j\, j!\, δ_{jk}.
Question. What is the advantage of choosing polynomials with this “orthonormal-
ity” property?
► Well, the reason is that the kernel K_N(x, x') in (10.7), if the polynomials
are chosen this way, satisfies a quite amazing “reproducing” property

\int dy\, K_N(x, y)\, K_N(y, x') = K_N(x, x') .   (10.10)

The proof is very simple: just insert (10.7) into (10.10) and use the orthonormality
relation (10.8). This property has a quite unexpected consequence, which eventu-
ally allows to carry out the multiple integrations in (10.2) in a very elegant, iterative
way. Another ingredient is necessary, though, and is presented in the next Section.
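The reproducing property can be verified numerically for the Gaussian case. In this sketch (Python with NumPy/SciPy, illustrative; the evaluation points 0.5 and −0.3 are arbitrary) the kernel (10.7) is assembled from the orthonormal polynomials (10.9), and both (10.10) and the trace identity ∫ K_N(x, x) dx = N are checked:

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval
from scipy import integrate

N = 4

def phi(j, x):
    """phi_j(x) = e^{-V(x)/2} pi_j(x), with V(x) = x^2/2 and pi_j from (10.9)."""
    norm = math.sqrt(math.sqrt(2 * np.pi) * 2**j * math.factorial(j))
    return np.exp(-x**2 / 4) * hermval(x / np.sqrt(2), [0] * j + [1]) / norm

def K(x, y):                             # the kernel (10.7)
    return sum(phi(j, x) * phi(j, y) for j in range(N))

lhs, _ = integrate.quad(lambda y: K(0.5, y) * K(y, -0.3), -np.inf, np.inf)
print(lhs, K(0.5, -0.3))                 # reproducing property (10.10)

q, _ = integrate.quad(lambda y: K(y, y), -np.inf, np.inf)
print(q)                                 # = N
```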
10.2 Integrating inwards
Summarizing, we have to carry out the multiple integration in (10.2) over a jpdf, which can be written as the
determinant of a kernel (see (10.6)), something like

\int dx_2 \cdots dx_N\, \det(K_N(x_j, x_k)) = ?   (10.11)
In normal situations, this would seem a rather hopeless task. But the reproducing property of the kernel
offers an unexpected way around.
First, an illuminating 2 × 2 example, and then the full-fledged (though dry) theory. Imagine the following
2 × 2 matrix J_2(x), depending on the vector x = {x_1, x_2} through a function f(x, y) as follows

J_2(x) = \begin{pmatrix} f(x_1, x_1) & f(x_1, x_2) \\ f(x_2, x_1) & f(x_2, x_2) \end{pmatrix} .   (10.12)
3 Note that there is a factor (1/2) multiplying V (x) in the kernel (10.7), while there is none in the weight function of the
orthonormal polynomials in (10.8).
Suppose now that the function f satisfies the “reproducing” property (10.10), namely \int f(x, y)\, f(y, z)\, dμ(y) =
f(x, z) for a certain measure μ(y). What happens to the following integral

\int dμ(x_2)\, \det(J_2(x)) ?   (10.13)
Well, we have

\int dμ(x_2)\, \det(J_2(x)) = \int dμ(x_2) \left[ f(x_1, x_1) f(x_2, x_2) - f(x_1, x_2) f(x_2, x_1) \right] = q\, f(x_1, x_1) - f(x_1, x_1) = (q - 1)\, f(x_1, x_1) ,   (10.14)

where q = \int dμ(x_2)\, f(x_2, x_2). We used the reproducing property to evaluate the second integral.
Maybe this short calculation is not particularly revealing, but it can actually be extended to the N × N
case as follows: let J_N(x) be an N × N matrix whose entries depend on a real vector x = (x_1, x_2, \ldots, x_N)
and have the form J_{ij} = f(x_i, x_j), where f is some function satisfying the “reproducing kernel” property
\int f(x, y)\, f(y, z)\, dμ(y) = f(x, z), for some measure dμ(y). Then the following holds:

\int \det[J_N(x)]\, dμ(x_N) = [q - (N - 1)]\, \det(J_{N-1}(\tilde x)) ,   (10.15)

where q = \int f(x, x)\, dμ(x), and the matrix J_{N-1} has the same functional form as J_N with x replaced by
\tilde x = (x_1, x_2, \ldots, x_{N-1}). A friendly proof can be found in [51].
This is a quite spectacular result, which is commonly referred to as the Dyson-Gaudin integration lemma⁴.
First of all, note that the 2 × 2 result (10.14) is in agreement with the general statement. Second, comparing
(10.11) and (10.15), we see that this lemma actually allows one to integrate \det(K_N(x_j, x_k)) (essentially, the jpdf)
over the last variable x_N, producing as a result a determinant of a smaller kernel matrix

\int \det\left( K_N(x_i, x_j) \right)_{1≤i,j≤N} dx_N = \det\left( K_N(x_i, x_j) \right)_{1≤i,j≤N-1} ,   (10.16)

where we have used q = \int dx\, K_N(x, x) = N (immediate from the definition of the kernel (10.7)). Basically,
the reproducing property carries over from the kernel to the determinant of the kernel!
Therefore, we can iterate the process N − k times, killing one integral at a time and reducing the dimension
of the determinant by one, with a remarkable domino effect

\int \cdots \int \det\left( K_N(x_i, x_j) \right)_{1≤i,j≤N} dx_{k+1} \cdots dx_N = (N - k)!\, \det\left( K_N(x_i, x_j) \right)_{1≤i,j≤k} .   (10.17)
4 The most accurate reference seems however to be [52].
In particular, setting k = 0 we can normalize the jpdf (10.1) as

1 = \int dx\, ρ(x) = \frac{1}{Z_N (\prod_{j=0}^{N-1} a_j)^2} \int dx\, \det\left( K_N(x_i, x_k) \right) ⇒ Z_N \left( \prod_{j=0}^{N-1} a_j \right)^2 = N! ,   (10.18)
so that the two-point marginal ρ(x_1, x_2) reads

ρ(x_1, x_2) = \int \prod_{j=3}^N dx_j\, ρ(x_1, \ldots, x_N) = \frac{1}{N(N - 1)} \det \begin{pmatrix} K_N(x_1, x_1) & K_N(x_1, x_2) \\ K_N(x_2, x_1) & K_N(x_2, x_2) \end{pmatrix} ,   (10.19)

(where one uses (N − 2)!/N! = 1/[N(N − 1)]), while the one-point marginal (the average spectral density)
is simply

ρ(x_1) = \int dx_2 \cdots dx_N\, ρ(x_1, \ldots, x_N) = \frac{1}{N} K_N(x_1, x_1) .   (10.20)
And the problem is solved not just for the one-point marginal, but for any $k$-point correlation function - once the kernel is built out of suitable polynomials, orthonormal with respect to the weight $\mathrm{e}^{-V(x)}$. The fact that all such functions can be expressed in terms of determinants is usually referred to as the determinantal structure of the unitarily invariant ensembles.
For modern extensions of the “integrate-out” lemma and applications, have a look at [53, 54].
10.3 Do it yourself
Let us apply the general formalism to the GUE case, for which the orthonormal polynomials are $\pi_j(x) = H_j(x/\sqrt{2})/\sqrt{\sqrt{2\pi}\, 2^j j!}$, where $H_j(x)$ are Hermite polynomials. Then, we obtain immediately the spectral density at finite $N$ as
$$\rho(x) = \frac{1}{N\sqrt{2\pi}}\, \mathrm{e}^{-x^2/2} \sum_{j=0}^{N-1} \frac{H_j^2(x/\sqrt{2})}{2^j j!} \ . \qquad (10.21)$$
In fig. 12.1 of Chapter 12 we show a comparison between a numerically generated histogram of GUE
eigenvalues, and the corresponding theoretical result in (10.21).
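The finite-$N$ formula (10.21) is also easy to check directly. Below is a minimal Python sketch (the companion codes of these notes are in MATLAB; numpy/scipy are assumed here) that evaluates (10.21) and verifies that the density integrates to one:

```python
import numpy as np
from scipy.special import eval_hermite, gammaln
from scipy.integrate import trapezoid

def gue_density(x, N):
    """Finite-N GUE spectral density, Eq. (10.21)."""
    x = np.asarray(x, dtype=float)
    s = np.zeros_like(x)
    for j in range(N):
        Hj = eval_hermite(j, x / np.sqrt(2.0))
        s += Hj**2 * np.exp(-(j * np.log(2.0) + gammaln(j + 1)))  # H_j^2 / (2^j j!), log-stable
    return np.exp(-x**2 / 2.0) * s / (N * np.sqrt(2.0 * np.pi))

x = np.linspace(-10, 10, 4001)
print(trapezoid(gue_density(x, 8), x))  # ~ 1.0 (the density is normalized)
```

The log-space evaluation of $2^j j!$ avoids overflow when $N$ grows moderately large.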
Question. If I send $N \to \infty$ in (10.21), shouldn't I recover the semicircle? I do not see how.
I Yes, you should, and you will! The precise statement is
$$\lim_{N\to\infty} \sqrt{2N}\, \rho\!\left(z\sqrt{2N}\right) = \frac{1}{\pi}\sqrt{2 - z^2} \ , \qquad \text{for } -\sqrt{2} < z < \sqrt{2} \ , \qquad (10.22)$$
which requires a bit of work on the asymptotics of Hermite polynomials. We will give a flavor of the steps you need just below.
10.4 Recovering the semicircle
First, one injects the so-called Christoffel-Darboux formula [55] into the game, a quite spectacular relation that hugely simplifies sums of orthogonal polynomials. Specialized to the Hermite polynomials, it reads
$$\sum_{k=0}^{n} \frac{H_k(x)H_k(y)}{k!\, 2^k} = \frac{1}{n!\, 2^{n+1}}\, \frac{H_{n+1}(x)H_n(y) - H_n(x)H_{n+1}(y)}{x - y} \ . \qquad (10.23)$$
With an eye towards (10.21), with a few manipulations and taking the limit $x \to y$, we obtain the relation
$$\sum_{k=0}^{N-1} \pi_k^2(x) = \sqrt{N}\left[\pi_{N-1}(x)\,\pi_N'(x) - \pi_N(x)\,\pi_{N-1}'(x)\right] \ , \qquad (10.24)$$
where the orthonormal polynomials with respect to the Gaussian weight were defined in (10.9).
After huge simplifications, the GUE spectral density for finite $N$ - suitably rescaled - can be rewritten in the form
$$\sqrt{2N}\, \rho\!\left(z\sqrt{2N}\right) = \frac{2\, \mathrm{e}^{-Nz^2}}{\sqrt{\pi N}\; 2^N\, \Gamma(N)}\left[N\, H_{N-1}^2(z\sqrt{N}) - (N-1)\, H_N(z\sqrt{N})\, H_{N-2}(z\sqrt{N})\right] \ . \qquad (10.25)$$
We should now analyze (10.25) in the limit $N \to \infty$ for $z \sim O(1)$. To do so, we need to use the following asymptotic formula for Hermite polynomials in the bulk⁵
$$H_{N+m}\!\left(X\sqrt{2N}\right) = \sqrt{\frac{2}{\pi}}\; 2^{N}\, (N!)^{1/2}\, \mathrm{e}^{NX^2}\, g_{m,N}(X)\left[1 + O\!\left(\frac{1}{N}\right)\right] \ , \qquad (10.26)$$
This can be proved by just expanding the left hand side as a double sum over permutations, performing the
integrals and then folding the result back into a single sum. Try to prove it yourself - for example, right now.
If you think about it for a second, this identity seems too good to be true. On the left hand side, you
have, say, a 20-fold integral of a truly nasty object, and on the right hand side a 20 × 20 determinant, which
can be easily handled by any scientific software - when not explicitly computable in closed form!
This identity is especially useful for unitary invariant ensembles ($\beta = 2$), because there you can write the square of the Vandermonde determinant as $\prod_{j<k}(x_j - x_k)^2 = \det(x_j^{k-1})\det(x_j^{k-1})$. For example, the partition
function of the GUE can be written as a determinant
$$Z_{N,\beta=2} = \int_{(-\infty,\infty)^N} \prod_{j=1}^{N} \mathrm{d}x_j\, \mathrm{e}^{-\frac{1}{2}x_j^2}\, \prod_{j<k}(x_j - x_k)^2 = N!\, \det\left[\int_{-\infty}^{\infty} \mathrm{d}x\, \mathrm{e}^{-\frac{1}{2}x^2}\, x^{j+k-2}\right]$$
$$= N!\, \det\left[2^{\frac{1}{2}(j+k-3)}\left((-1)^{j+k} + 1\right)\Gamma\!\left(\frac{j+k-1}{2}\right)\right] \ , \qquad (11.2)$$
which can also be evaluated in closed form as a Selberg-like integral [5].
A nice feature of the final determinant - and this happens for all $\beta = 2$ calculations - is that it is of the form $\det(M_{i+j})$, i.e. it is a Hankel determinant (the matrix $M$ is constant along the skew-diagonals). This happens because the two determinants in the integrand on the left hand side are equal, and this produces a factor $x^{j-1}\, x^{k-1}$ on the right hand side.
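As a sanity check of (11.2), the Hankel determinant can be evaluated numerically and compared with the closed-form value $Z_{N,\beta=2} = (2\pi)^{N/2}\prod_{j=1}^{N} j!$ (a standard Selberg/Mehta result). A Python sketch (scipy assumed):

```python
import math
import numpy as np
from scipy.special import gamma

def gue_partition_hankel(N):
    """Z_{N,beta=2} via the Hankel determinant of Gaussian moments, Eq. (11.2)."""
    moment = lambda j, k: 2**((j + k - 3) / 2) * ((-1)**(j + k) + 1) * gamma((j + k - 1) / 2)
    M = np.array([[moment(j, k) for k in range(1, N + 1)] for j in range(1, N + 1)])
    return math.factorial(N) * np.linalg.det(M)

# cross-check against the closed form (2 pi)^(N/2) prod_{j=1}^N j!
for N in (1, 2, 3, 4):
    closed = (2 * np.pi)**(N / 2) * np.prod([math.factorial(j) for j in range(1, N + 1)])
    print(N, gue_partition_hankel(N) / closed)  # ~ 1.0 for each N
```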
There are two other identities that are similar in spirit to the Andréief identity (the de Bruijn identities [57]). They read as follows:
$$\int_{x_1\le x_2\le \ldots\le x_N} \mathrm{d}\mu(\mathbf{x})\, \det[\phi_i(x_j)] = \mathrm{Pf}\left[\iint \mathrm{sign}(x-y)\, \phi_i(x)\, \phi_j(y)\, \mathrm{d}\mu(x)\, \mathrm{d}\mu(y)\right] \ , \qquad (11.3)$$
where $i$ and $j$ run from 1 to $N$, and
$$\int \mathrm{d}\mu(\mathbf{x})\, \det\left[\phi_i(x_j)\ \ \psi_i(x_j)\right] = (2N)!\; \mathrm{Pf}\left[\int \mathrm{d}\mu(x)\left(\phi_i(x)\psi_j(x) - \phi_j(x)\psi_i(x)\right)\right] \ , \qquad (11.4)$$
where $i$ and $j$ run from 1 to $2N$.
In both the equations above, Pf denotes a Pfaffian. Just like the determinant can be written as a sum over
permutations, a Pfaffian is written as a sum over pairings.
Given a set S with an even number of elements, {1, ..., 2n}, a pairing of S is a collection of n pairs of ele-
ments from S. For instance, the set {1, 2, 3, 4} has three possible pairings: {{1, 2}, {3, 4}}, {{1, 3}, {2, 4}},
and {{1, 4}, {2, 3}}.
We can realize pairings as permutations acting on the trivial pairing $\{\{1,2\},\{3,4\}\}$. The previous pairings then correspond to the identity permutation, the transposition (23) and the cycle (243).
In terms of these permutations, we have
$$\mathrm{Pf}(A) = \sum_{P} s(P) \prod_{j=1}^{n} A_{P(2j-1),P(2j)} \ , \qquad (11.5)$$
where $s(P)$ is the signature of the permutation and $A$ is an even-dimensional skew-symmetric matrix.
For example, take $n = 2$ and $A$ the following $4 \times 4$ matrix
$$A = \begin{pmatrix} 0 & A_{12} & A_{13} & A_{14} \\ -A_{12} & 0 & A_{23} & A_{24} \\ -A_{13} & -A_{23} & 0 & A_{34} \\ -A_{14} & -A_{24} & -A_{34} & 0 \end{pmatrix} \ . \qquad (11.6)$$
We have $\mathrm{Pf}(A) = A_{12}A_{34} - A_{13}A_{24} + A_{14}A_{23}$ (compare with the pairings listed above for the set $\{1,2,3,4\}$).
Note also that $\mathrm{Pf}(A) = \sqrt{\det(A)}$.
We will encounter Pfaffians again in Chapter 12.
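The sum over pairings in (11.5) is straightforward to implement for small matrices. A brute-force Python sketch (numpy assumed), which also checks $\mathrm{Pf}(A)^2 = \det(A)$ on a random skew-symmetric matrix:

```python
import itertools
import math
import numpy as np

def signature(perm):
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def pfaffian(A):
    """Pfaffian via the sum over pairings of Eq. (11.5); brute force, small matrices only."""
    n2 = A.shape[0]
    total = 0.0
    for perm in itertools.permutations(range(n2)):
        # keep a single canonical permutation per pairing: each pair internally
        # ordered, and pairs sorted by their first elements
        if any(perm[2 * j] > perm[2 * j + 1] for j in range(n2 // 2)):
            continue
        if any(perm[2 * j] > perm[2 * j + 2] for j in range(n2 // 2 - 1)):
            continue
        total += signature(perm) * math.prod(A[perm[2 * j], perm[2 * j + 1]] for j in range(n2 // 2))
    return total

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B - B.T                                          # random skew-symmetric matrix
print(np.isclose(pfaffian(A)**2, np.linalg.det(A)))  # True: Pf(A)^2 = det(A)
```

For the $4 \times 4$ case the three surviving pairings reproduce exactly $A_{12}A_{34} - A_{13}A_{24} + A_{14}A_{23}$.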
11.2 Do it yourself
Let us see a simple example where the Andréief formula turns a nasty problem into a doable one.
Question: what is the probability that a 9 × 9 GUE matrix has N+ = 7 positive eigenvalues? From first
principles, we have in general
$$P_N(N_+ = n) = \int \mathrm{d}x_1 \cdots \mathrm{d}x_N\, \rho(x_1, \ldots, x_N)\, \delta\!\left(n - \sum_{i=1}^{N}\theta(x_i)\right) \ , \qquad (11.7)$$
where $\theta(x)$ is the Heaviside step function, $= 1$ if $x > 0$ and $0$ otherwise.
Note that the delta function in (11.7) is more correctly a Kronecker delta $\delta_{n,\sum_{i=1}^N \theta(x_i)}$. We can introduce the generating function
$$\varphi_N(z) = \sum_{n=0}^{N} P_N(N_+ = n)\, z^n = \frac{1}{Z_{N,\beta=2}} \int_{-\infty}^{\infty} \prod_{j=1}^{N} \mathrm{d}x_j\, \mathrm{e}^{-\frac{1}{2}\sum_{i=1}^{N} x_i^2 + (\ln z)\sum_{i=1}^{N}\theta(x_i)}\, \prod_{j<k}(x_j - x_k)^2 \ . \qquad (11.8)$$
This multiple integral seems hopelessly complicated. But spotting that $\prod_{j<k}(x_j - x_k)^2 = \det(x_i^{j-1})\det(x_i^{j-1})$, we can use the Andréief formula to write
$$\varphi_N(z) = \frac{\det\left[\int_{-\infty}^{\infty}\mathrm{d}x\, \mathrm{e}^{-\frac{1}{2}x^2 + (\ln z)\theta(x)}\, x^{i+j-2}\right]}{\det\left[\int_{-\infty}^{\infty}\mathrm{d}x\, \mathrm{e}^{-\frac{1}{2}x^2}\, x^{i+j-2}\right]} = \frac{\det\left[\left((-1)^{i+j} + z\right)c_{i+j}\right]}{\det\left[\left((-1)^{i+j} + 1\right)c_{i+j}\right]} \ , \qquad (11.9)$$
where $c_k = 2^{\frac{k-3}{2}}\, \Gamma\!\left(\frac{k-1}{2}\right)$. We have used Andréief also to express $Z_{N,\beta=2}$ as a determinant, and erased a common $N!$ factor.
Evaluating the integrals, we get a ratio of Hankel determinants, which can easily be evaluated exactly with symbolic software.
Note that $\varphi_N(1) = 1$, as it should by normalization of $P_N(N_+ = n)$ (see (11.8)). The probabilities $P_N(N_+ = n)$ can then be reconstructed by differentiation
$$P_N(N_+ = n) = \frac{1}{n!}\, \partial_z^{(n)}\, \varphi_N(z)\Big|_{z \to 0} \ . \qquad (11.10)$$
Carrying out this program, we may find that for a $9 \times 9$ GUE matrix,
$$P_{N=9}(N_+ = 7) = \frac{161229045760 - 20942589825\,\pi^2 - 9172989000\,\pi^3 + 3386880000\,\pi^4}{48168960000\,\pi^4} \simeq 5.67686 \times 10^{-6} \ . \qquad (11.11)$$
Note that the evaluation is exact, and can be extended to many other values of n, N. Of course, it would be
desirable to have an exact and explicit formula for these probabilities at arbitrary n, N (see [58]).
For a numerical check of (11.10), see [♠ Andreief_check.m].
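The same check can be sketched in Python with sympy (the notes' companion code is MATLAB): the determinants in (11.9) are built symbolically, and the probabilities are read off as the coefficients of $\varphi_N(z)$.

```python
import sympy as sp

z = sp.symbols('z')

def phi(N):
    """Generating function of (11.9): a ratio of N x N determinants."""
    c = lambda k: 2**sp.Rational(k - 3, 2) * sp.gamma(sp.Rational(k - 1, 2))
    num = sp.Matrix(N, N, lambda i, j: ((-1)**(i + j) + z) * c(i + j + 2))
    den = sp.Matrix(N, N, lambda i, j: ((-1)**(i + j) + 1) * c(i + j + 2))
    return sp.expand(sp.cancel(num.det() / den.det()))

# P_N(N_+ = n) are the coefficients of z^n, cf. Eq. (11.10); N = 3 shown for speed
p = sp.Poly(phi(3), z).all_coeffs()[::-1]
print(float(sum(p)))          # ~ 1.0 (normalization, phi_N(1) = 1)
print([float(q) for q in p])  # the four probabilities, symmetric under n -> N - n
```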
11.3 To know more...
1. There is an interesting connection between Hankel determinants, the so-called Toda equation on a semi-infinite lattice, and Painlevé functions. Define
$$\tau_n = \det(a_{i+j-2})_{i,j=1,\ldots,n} \ , \qquad (11.12)$$
with "initial conditions" $\tau_{-1} = 0$, $\tau_0 = 1$ and $\tau_1 = a_0$. Imagine that the entries $a_k$ of this Hankel matrix are functions of $x$ (and so is $\tau_n$, for any fixed $n$). If the $a_k$ satisfy the relation $a_k = a_{k-1}'$ (where $'$ denotes differentiation with respect to $x$), then the following hierarchy of equations holds
$$\tau_n''\, \tau_n - (\tau_n')^2 = \tau_{n+1}\, \tau_{n-1} \ . \qquad (11.13)$$
In [♠ Toda.m] we test this property for n = 3.
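A quick symbolic test of (11.12)-(11.13) can also be done in Python/sympy (the polynomial seed $a_0(x)$ below is an arbitrary choice for illustration; any smooth seed with $a_k = a_{k-1}'$ works):

```python
import sympy as sp

x = sp.symbols('x')
a0 = x**7 + x**4 + x + 1                   # arbitrary polynomial seed for the test
a = [sp.diff(a0, x, k) for k in range(7)]  # enforce a_k = a_{k-1}'

def tau(n):
    """Hankel determinant tau_n = det(a_{i+j-2})_{i,j=1..n}, Eq. (11.12)."""
    return sp.Integer(1) if n == 0 else sp.Matrix(n, n, lambda i, j: a[i + j]).det()

for n in (1, 2, 3):
    lhs = sp.diff(tau(n), x, 2) * tau(n) - sp.diff(tau(n), x)**2
    print(n, sp.expand(lhs - tau(n + 1) * tau(n - 1)) == 0)  # True for each n
```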
Quite amazingly, the same Toda lattice equation is obeyed by so-called τ-functions, which arise in
the Hamiltonian formulation of the six Painlevé equations (PI - PVI), other fundamental objects in the
theory of nonlinear integrable systems [59].
These deep connections between Andréief evaluations, Hankel determinants, Toda lattice and Painlevé
functions are at the root of quite spectacular results (see e.g. [60] and [61]).
Chapter 12
Finite N is not finished
In this short Chapter, we compute in the quickest way the spectral density for the GOE (β = 1) and GSE
(β = 4). The symmetry classes beyond the Unitary have a reputation for being “unfriendly”. We do not aim
at giving the most general treatment of correlation functions for such cases. The goal of this Chapter is just
to provide a smooth and gentle appetizer, allowing you to tackle the nastier bits with your back covered.
12.1 β = 1
Let us assume N is even for simplicity. Indices i, j run from 1 to N, while k, ` run from 0 to N − 1.
Suppose the jpdf of eigenvalues is given by
$$\rho(x_1, \ldots, x_N) = \frac{1}{Z}\, |\Delta_N(\mathbf{x})| \prod_{i=1}^{N} w(x_i) \ . \qquad (12.1)$$
For w(x) = exp(−x2 /2), we recover the jpdf for the GOE.
Let's compute the normalization factor, a.k.a. the partition function,
$$Z = \int \mathrm{d}\mathbf{x}\, |\Delta_N(\mathbf{x})| \prod_{i=1}^{N} w(x_i) = |\hat{a}_N| \int \mathrm{d}\mathbf{x}\, \left|\det\left(R_{j-1}(x_i)\, w(x_i)\right)\right| \ , \qquad (12.2)$$
where $R_k(x) = a_k x^k + \cdots$ is a family of polynomials through which the Vandermonde materializes, and
$$\hat{a}_N = \left(\prod_{k=0}^{N-1} a_k\right)^{-1} \ . \qquad (12.3)$$
To get rid of the absolute value, we restrict integration to the domain where the variables are ordered:
$$Z = N!\, |\hat{a}_N| \int_{-\infty < x_1 < x_2 < \cdots < \infty} \mathrm{d}\mathbf{x}\, \det\left(R_{j-1}(x_i)\, w(x_i)\right) \ . \qquad (12.4)$$
We may now use the de Bruijn identity (11.3) to get
$$Z = N!\, |\hat{a}_N|\, \mathrm{Pf}(A_{i,j}) \ , \qquad (12.5)$$
where
$$A_{i,j} = \int_{-\infty}^{\infty}\mathrm{d}x \int_{-\infty}^{\infty}\mathrm{d}y\, R_{i-1}(y)\, R_{j-1}(x)\, w(x)\, w(y)\, \mathrm{sign}(x-y) \ , \qquad (12.6)$$
and Pf denotes the Pfaffian of the skew-symmetric matrix $A_{ij}$.
Let us now stop for a second to check on a $2 \times 2$ example that, indeed, the expressions in (12.4) and (12.5) coincide. Starting from the integral in (12.4), specialized to a $2 \times 2$ case, we have
$$\int_{-\infty}^{+\infty}\mathrm{d}y \int_{-\infty}^{y}\mathrm{d}x\, R_0(x)R_1(y)w(x)w(y) - \int_{-\infty}^{+\infty}\mathrm{d}y \int_{-\infty}^{y}\mathrm{d}x\, R_0(y)R_1(x)w(x)w(y)$$
$$= \int_{-\infty}^{+\infty}\mathrm{d}x \int_{-\infty}^{x}\mathrm{d}y\, R_0(y)R_1(x)w(x)w(y) - \int_{-\infty}^{+\infty}\mathrm{d}y \int_{-\infty}^{y}\mathrm{d}x\, R_0(y)R_1(x)w(x)w(y) \ , \qquad (12.7)$$
where we have simply renamed the variables $x \to y$ and $y \to x$ in the first integral.
If we now expand the Pfaffian in equation (12.5) we obtain
$$\mathrm{Pf}(A_{i,j}) = \int_{-\infty}^{+\infty}\mathrm{d}y \int_{y}^{+\infty}\mathrm{d}x\, R_0(y)R_1(x)w(x)w(y) - \int_{-\infty}^{+\infty}\mathrm{d}y \int_{-\infty}^{y}\mathrm{d}x\, R_0(y)R_1(x)w(x)w(y)$$
$$= \int_{-\infty}^{+\infty}\mathrm{d}x \int_{-\infty}^{x}\mathrm{d}y\, R_0(y)R_1(x)w(x)w(y) - \int_{-\infty}^{+\infty}\mathrm{d}y \int_{-\infty}^{y}\mathrm{d}x\, R_0(y)R_1(x)w(x)w(y) \ , \qquad (12.8)$$
where in the first integral we have simply rewritten the integration domain as $-\infty < y < x < \infty$. The above expression coincides with (12.7).
Now, stare at (12.6) for a few seconds. To simplify the notation slightly, we may define the following skew-symmetric inner product
$$\langle f, g\rangle_1 = \frac{1}{2}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(y)\, g(x)\, w(x)\, w(y)\, \mathrm{sign}(x-y)\, \mathrm{d}x\, \mathrm{d}y \ , \qquad (12.9)$$
so that we can write $A_{i,j} = 2\langle R_{i-1}, R_{j-1}\rangle_1$. Note that in general $\langle f, g\rangle_1 = -\langle g, f\rangle_1$ - we wouldn't call it skew-symmetric otherwise, would we?
Now, in complete analogy with what we did for β = 2 - identifying specific polynomials, orthonormal
with respect to the given weight - we may choose the polynomials R that “behave nicely” with respect to
this inner product. The nice properties we require are: evens and odds are orthogonal among themselves,
$$\langle R_{2k}, R_{2\ell}\rangle_1 = \langle R_{2k+1}, R_{2\ell+1}\rangle_1 = 0 \ , \qquad (12.10)$$
and evens are orthogonal to odds unless they are adjacent,
$$\langle R_{2k}, R_{2\ell+1}\rangle_1 = -\langle R_{2\ell+1}, R_{2k}\rangle_1 = \delta_{k\ell} \ . \qquad (12.11)$$
With this particular choice, the $R$'s are called skew-orthogonal polynomials. The matrix $A$ in (12.6) acquires a simple form,
$$A = \begin{pmatrix} 0 & 2 & & & \\ -2 & 0 & & & \\ & & 0 & 2 & \\ & & -2 & 0 & \\ & & & & \ddots \end{pmatrix} \ , \qquad (12.12)$$
and the expression for $Z$ drastically simplifies: the determinant of $A$ becomes simply $2^N$, hence its Pfaffian becomes $2^{N/2}$, and all the information about the specific weight function $w(x)$ is contained in $\hat{a}_N$. As a consequence, from (12.5) we get for the partition function in (12.2) $Z = N!\, |\hat{a}_N|\, 2^{N/2}$.
Let us now generalize this calculation slightly. Consider the quantity
$$Z[f] = |\hat{a}_N| \int \mathrm{d}\mathbf{x}\, \left|\det\left(R_{j-1}(x_i)\, w(x_i)\, f(x_i)\right)\right| = Z[f=1] \int \mathrm{d}\mathbf{x}\, \rho(x_1, \ldots, x_N) \prod_{i=1}^{N} f(x_i) \ , \qquad (12.13)$$
where we introduced an arbitrary function $f(x)$ in the game, such that the integral is convergent. Note that $Z[f=1]$ coincides with $Z$.
From this new partition function, we can recover the density of eigenvalues for finite $N$
$$\rho(x) = \int \mathrm{d}x_2\, \mathrm{d}x_3 \cdots \mathrm{d}x_N\, \rho(x, x_2, \ldots, x_N) \qquad (12.14)$$
by means of a functional derivative. This is the operator $\frac{\delta}{\delta f}$, which satisfies all the properties of a derivative, plus the condition
$$\frac{\delta}{\delta f(x)}\, f(y) = \delta(y - x) \ . \qquad (12.15)$$
Then,
$$\frac{\delta}{\delta f(x)}\, Z[f]\bigg|_{f=1} = Z[f=1] \int \mathrm{d}\mathbf{x}\, \rho(x_1, \ldots, x_N) \sum_{i=1}^{N} \delta(x_i - x) = N\, Z[f=1]\, \rho(x) \ . \qquad (12.16)$$
Following a calculation perfectly analogous to the previous one, we arrive at
$$Z[f] = N!\, |\hat{a}_N|\, \mathrm{Pf}\left(A_{ij}[f]\right) \ , \qquad (12.17)$$
where
$$A_{ij}[f] = \int_{-\infty}^{\infty}\mathrm{d}x \int_{-\infty}^{\infty}\mathrm{d}y\, R_{i-1}(y)\, R_{j-1}(x)\, w(x)\, w(y)\, f(x)\, f(y)\, \mathrm{sign}(x-y) \ . \qquad (12.18)$$
Computing the functional derivative, and recalling the definition of the Pfaffian in equation (11.5), we have
$$\frac{\delta}{\delta f(x)}\, Z[f]\bigg|_{f=1} = N!\, |\hat{a}_N| \sum_{P} s(P)\, \frac{\delta}{\delta f(x)}\left[\prod_{k=1}^{N/2} A_{P(2k-1),P(2k)}[f]\right]_{f=1} \ . \qquad (12.19)$$
When we apply the product rule for the derivative, and set $f = 1$, for each term in the sum over permutations we get
$$\frac{\delta}{\delta f(x)} A_{P(1),P(2)}[f]\bigg|_{f=1}\, A_{P(3),P(4)}[1] \cdots A_{P(N-1),P(N)}[1]$$
$$+\; A_{P(1),P(2)}[1]\, \frac{\delta}{\delta f(x)} A_{P(3),P(4)}[f]\bigg|_{f=1} \cdots A_{P(N-1),P(N)}[1] + \ldots$$
$$+\; A_{P(1),P(2)}[1] \cdots \frac{\delta}{\delta f(x)} A_{P(N-1),P(N)}[f]\bigg|_{f=1} \ . \qquad (12.20)$$
The orthogonality relations (12.10), (12.11) imply that the products in the above expressions are different
from zero only when P is the identity permutation, i.e. when P(2 j − 1) and P(2 j) are adjacent numbers for
each matrix element in the product.
Hence, the sum in (12.19) reduces to the expression in (12.20) with $P(j) = j$, $\forall j$. Each element $A_{P(2j-1),P(2j)}$ yields a factor 2 from the matrix $A$ in (12.12), so that each product in (12.20) reduces to an expression of the type $2^{N/2-1}\left[\frac{\delta}{\delta f(x)} A_{2k-1,2k}[f]\right]_{f=1}$. Comparing (12.16) and (12.19), we eventually find that
$$\rho(x) = \frac{N!\, |\hat{a}_N|\, 2^{N/2-1}}{N\, Z[f=1]} \sum_{k=1}^{N/2} \frac{\delta}{\delta f(x)}\, A_{2k-1,2k}[f]\bigg|_{f=1} \ . \qquad (12.21)$$
Making the result of the functional differentiation of (12.18) explicit, and rearranging indices, we finally obtain
$$\rho(x) = \frac{1}{2N} \sum_{k=0}^{N/2-1} w(x)\left[R_{2k}(x)\, \Phi_{2k+1}(x) - R_{2k+1}(x)\, \Phi_{2k}(x)\right] \ , \qquad (12.22)$$
where
$$\Phi_k(x) = \int_{-\infty}^{\infty} \mathrm{d}y\, R_k(y)\, w(y)\, \mathrm{sign}(x - y) \ . \qquad (12.23)$$
For the Gaussian case, it can be shown [77, 78] that we can choose
$$R_{2k}(x) = \frac{\sqrt{2}}{\pi^{1/4}\, 2^{k}\, (2k)!!}\, H_{2k}(x) \ , \qquad (12.24)$$
$$R_{2k+1}(x) = \frac{\sqrt{2}}{\pi^{1/4}\, 2^{k+2}\, (2k-1)!!}\left[-H_{2k+1}(x) + 4k\, H_{2k-1}(x)\right] \ , \qquad (12.25)$$
where the $H_k(x)$ are Hermite polynomials. This gives, for example,
$$R_0(x) = \frac{\sqrt{2}}{\pi^{1/4}} \ , \quad R_1(x) = -\frac{\sqrt{2}\, x}{2\pi^{1/4}} \ , \quad R_2(x) = \frac{\sqrt{2}\,(2x^2 - 1)}{2\pi^{1/4}} \ , \quad R_3(x) = -\frac{\sqrt{2}\, x(2x^2 - 5)}{2\pi^{1/4}} \ . \qquad (12.26)$$
Since the leading coefficient of $H_k(x)$ is $2^k$, we have
$$\hat{a}_N = \frac{(-1)^{\frac{N}{2}}\, 2^{\frac{N(N-2)}{4}}}{\pi^{\frac{N}{4}}\, \prod_{k=0}^{N/2-1} (2k)!} \ , \qquad (12.27)$$
even though this quantity has completely dropped out from the final expression for the density (12.22).
For a numerical check of (12.22), see Fig. 12.1 below, which was obtained with the
code [♠ Gaussian_finite_density_check.m].
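In the same spirit as that MATLAB check, here is a Python sketch (scipy assumed) that assembles (12.22)-(12.25) by direct quadrature for the smallest nontrivial case $N = 2$, and verifies that the density integrates to one:

```python
import numpy as np
from scipy.special import eval_hermite
from scipy.integrate import quad

def dfact(n):
    """Double factorial, with (-1)!! = 0!! = 1."""
    return 1 if n <= 0 else n * dfact(n - 2)

def R(k, x):
    """Skew-orthogonal polynomials (12.24)-(12.25) for the GOE weight w(x) = exp(-x^2/2)."""
    if k % 2 == 0:
        m = k // 2
        return np.sqrt(2) / (np.pi**0.25 * 2**m * dfact(2 * m)) * eval_hermite(2 * m, x)
    m = (k - 1) // 2
    corr = 4 * m * eval_hermite(2 * m - 1, x) if m > 0 else 0.0
    return np.sqrt(2) / (np.pi**0.25 * 2**(m + 2) * dfact(2 * m - 1)) * (-eval_hermite(2 * m + 1, x) + corr)

w = lambda x: np.exp(-x**2 / 2)

def Phi(k, x):
    """Phi_k(x) of Eq. (12.23): integral of R_k w against sign(x - y)."""
    return quad(lambda y: R(k, y) * w(y), -np.inf, x)[0] - quad(lambda y: R(k, y) * w(y), x, np.inf)[0]

def rho_goe(x, N):
    """Finite-N GOE spectral density, Eq. (12.22) (N even)."""
    return w(x) / (2 * N) * sum(R(2 * k, x) * Phi(2 * k + 1, x) - R(2 * k + 1, x) * Phi(2 * k, x)
                                for k in range(N // 2))

print(quad(lambda t: rho_goe(t, 2), -8, 8)[0])  # ~ 1.0 (normalization)
```

The nested quadrature is slow but transparent; for larger $N$ one would precompute the $\Phi_k$ on a grid.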
12.2 β = 4
Now
$$\rho(x_1, \ldots, x_N) = \frac{1}{Z}\, |\Delta_N(\mathbf{x})|^4 \prod_{i=1}^{N} w(x_i) \ . \qquad (12.28)$$
Start by writing $|\Delta_N(\mathbf{x})|^4$ as a determinant of size $2N$, in which two columns depend on each variable. This is
$$|\Delta_N(\mathbf{x})|^4 = \det\left[x_i^k \ \ k x_i^{k-1}\right] \ , \qquad (12.29)$$
where $1 \le i \le N$ and $0 \le k \le 2N-1$. For instance, for $N = 2$ we have
$$(x_2 - x_1)^4 = \det \begin{pmatrix} 1 & 0 & 1 & 0 \\ x_1 & 1 & x_2 & 1 \\ x_1^2 & 2x_1 & x_2^2 & 2x_2 \\ x_1^3 & 3x_1^2 & x_2^3 & 3x_2^2 \end{pmatrix} \ , \qquad (12.30)$$
which can be verified if you have 10 minutes to spare.
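Or, if you have 10 seconds instead: a sympy one-liner confirming (12.30):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
M = sp.Matrix([[1,      0,        1,      0],
               [x1,     1,        x2,     1],
               [x1**2,  2*x1,     x2**2,  2*x2],
               [x1**3,  3*x1**2,  x2**3,  3*x2**2]])
print(sp.factor(M.det()))  # (x1 - x2)**4
```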
We can change $x_i^k$ to any family of polynomials $Q_k(x_i) = b_k x_i^k + \cdots$ that produce the Vandermonde, and $k x_i^{k-1}$ to its derivative $Q_k'(x_i)$. So
$$|\Delta_N(\mathbf{x})|^4 = |\hat{b}_N|\, \det\left[Q_{j-1}(x_i) \ \ Q_{j-1}'(x_i)\right] \ , \qquad (12.31)$$
where $1 \le j \le 2N$ and
$$\hat{b}_N = \left(\prod_{k=0}^{N-1} b_k^2\right)^{-1} \ . \qquad (12.32)$$
In this case, Eq. (12.17) gets modified as
$$Z[a] = (2N)!\, |\hat{b}_N|\, \mathrm{Pf}\left(B_{i,j}[a]\right) \ , \qquad (12.33)$$
where
$$B_{i,j}[a] = \int_{-\infty}^{\infty}\mathrm{d}x\left[Q_{i-1}(x)\, Q_{j-1}'(x) - Q_{i-1}'(x)\, Q_{j-1}(x)\right] w(x)\, a(x) \ . \qquad (12.34)$$
We have used here the second de Bruijn identity (11.4).
We may consider the above integral as another skew-symmetric inner product
$$\langle f, g\rangle_4 = \frac{1}{2}\int_{-\infty}^{\infty} \mathrm{d}x\left[f(x)\, g'(x) - f'(x)\, g(x)\right] w(x) \ , \qquad (12.35)$$
and we may choose the polynomials $Q$ to be skew-orthogonal with respect to it: evens and odds are orthogonal among themselves,
$$\langle Q_{2k}, Q_{2\ell}\rangle_4 = \langle Q_{2k+1}, Q_{2\ell+1}\rangle_4 = 0 \ , \qquad (12.36)$$
and evens are orthogonal to odds unless they are adjacent,
$$\langle Q_{2k}, Q_{2\ell+1}\rangle_4 = -\langle Q_{2\ell+1}, Q_{2k}\rangle_4 = \delta_{k\ell} \ . \qquad (12.37)$$
Computing the functional derivative as before, we have
$$\rho(x) = \frac{1}{2N} \sum_{k=0}^{N-1} w(x)\left[Q_{2k}(x)\, Q_{2k+1}'(x) - Q_{2k+1}(x)\, Q_{2k}'(x)\right] \ . \qquad (12.38)$$
In the Gaussian case, we can choose
$$Q_{2k}(x) = \frac{\sqrt{2}}{\pi^{1/4}\, 2^{k}\, (2k)!!}\left[4k\, H_{2k-2}(x\sqrt{2}) + H_{2k}(x\sqrt{2})\right] \ , \qquad Q_{2k+1}(x) = \frac{\sqrt{2}}{\pi^{1/4}\, 2^{k+1}\, (2k+1)!!}\, H_{2k+1}(x\sqrt{2}) \ . \qquad (12.39)$$
This gives, for example,
$$Q_0 = \frac{\sqrt{2}}{\pi^{1/4}} \ , \quad Q_1(x) = \frac{2x}{\pi^{1/4}} \ , \quad Q_2(x) = \frac{\sqrt{2}\,(4x^2 + 1)}{2\pi^{1/4}} \ , \quad Q_3(x) = \frac{2x(4x^2 - 3)}{3\pi^{1/4}} \ . \qquad (12.40)$$
Also, we have
$$\hat{b}_N = \frac{2^{\left\lfloor \frac{N}{2}\right\rfloor}\, 2^{N(N-1)}}{\pi^{\frac{N}{2}}\, \prod_{k=0}^{N-1} (k!!)^2} \ . \qquad (12.41)$$
Again, for a numerical check of (12.38), see Fig. 12.1, which was obtained with the
code [♠ Gaussian_finite_density_check.m].
Figure 12.1: Comparison between numerically generated eigenvalue histograms of 50000 matrices of size
N = 8 belonging to the Gaussian ensembles (GOE on the left, GUE in the middle, and GSE on the right)
and the corresponding theoretical densities.
Chapter 13
Classical Ensembles: Wishart-Laguerre
In this Chapter, we present one of the “classical” examples of rotationally invariant models: the Wishart-
Laguerre (WL) ensemble.
13.1 Wishart-Laguerre ensemble
Historically, one of the earliest appearances of a random matrix ensemble¹ occurred in 1928, when the Scottish mathematician John Wishart published a paper on multivariate data analysis in the journal Biometrika [64].
Wishart matrices are square N × N matrices W with correlated entries. They are constructed as2 W =
HH † , where H is a N × M matrix (M ≥ N) filled with i.i.d. Gaussian entries3 . These entries may be real,
complex or quaternion (we shall use again the Dyson index β = 1, 2, 4 for the three cases, respectively), and
$\dagger$ stands for the transpose or hermitian conjugate of the matrix $H$. For example, for a $2 \times 3$ complex matrix $H$
$$W = \begin{pmatrix} x_{11}+iy_{11} & x_{12}+iy_{12} & x_{13}+iy_{13} \\ x_{21}+iy_{21} & x_{22}+iy_{22} & x_{23}+iy_{23} \end{pmatrix} \begin{pmatrix} x_{11}-iy_{11} & x_{21}-iy_{21} \\ x_{12}-iy_{12} & x_{22}-iy_{22} \\ x_{13}-iy_{13} & x_{23}-iy_{23} \end{pmatrix} \ . \qquad (13.1)$$
Work out the matrix product, and convince yourself that W is hermitian, therefore has real eigenvalues.
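Both claims - hermiticity and (as discussed just below) positive semidefiniteness - are also immediate to check numerically; a Python/numpy sketch for $\beta = 2$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 2, 3
# complex Gaussian rectangular matrix H (beta = 2)
H = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
W = H @ H.conj().T

print(np.allclose(W, W.conj().T))        # True: W is hermitian
print(np.linalg.eigvalsh(W).min() >= 0)  # True: W is positive semidefinite
```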
The Wishart ensemble is also referred to as “Laguerre”, since its spectral properties involve Laguerre
polynomials, and also “chiral” in the context of applications to Quantum Chromodynamics (QCD) [65].
They are often called LOE, LUE and LSE, for β = 1, 2, 4, respectively.
While the Gaussian eigenvalues can in principle be anywhere on the real axis, Wishart matrices have
N non-negative eigenvalues, {x1 , x2 , . . . , xN }. Indeed, Wishart matrices W are positive semidefinite. This
1 In Mathematics, however, the 1897 work by Hurwitz on the volume form of a general unitary matrix is of historical significance
[62, 63].
2 Sometimes you find a normalized version of it, with a 1/M factor in front.
3 The notation W (N, M) is also used.
means that (e.g. for $\beta = 2$) $u^{\star} W u \ge 0$ for all nonzero column vectors $u$ of $N$ complex numbers. The proof is not hard, have a go at it!
Question. What is the jpdf of the entries of WL ensemble W ?
I With some effort (see below), it can be computed as
$$\rho[W] \propto \mathrm{e}^{-\frac{\beta}{2}\mathrm{Tr}\, W}\, (\det W)^{\frac{\beta}{2}(M-N+1)-1} \ , \qquad (13.2)$$
from which you immediately see that i) the entries are correlateda (the determi-
nant easily kills any hope of factorizing this jpdf), and ii) the model is rotationally
invariantb .
a Except for specific combinations of β, M, N for which the determinant disappears.
b The jpdf (13.2) is not in contrast with Weyl’s lemma (see Eq. (3.8)). The determinant of a N × N
matrix W can be indeed written as a function of the traces of the first N powers of W (see [66]).
From (13.2), the jpdf of eigenvalues can be written down immediately (just express everything in terms of the eigenvalues and append a Vandermonde at the end)
$$\rho(x_1, \ldots, x_N) = \frac{1}{Z_{N,\beta}^{(L)}}\, \mathrm{e}^{-\frac{1}{2}\sum_{i=1}^{N} x_i}\, \prod_{i=1}^{N} x_i^{\alpha\beta/2}\, \prod_{j<k} |x_j - x_k|^{\beta} \ , \qquad (13.3)$$
where $\alpha = (1 + M - N) - 2/\beta$ and the normalization constant $Z_{N,\beta}^{(L)}$ can be computed again using modifications of the Selberg integral⁴ [5]. As for the Gaussian ensembles, one may sometimes find in the literature an extra factor $\beta$ in the exponential.
The confining potential for the Wishart-Laguerre ensemble is thus $V(x) = \frac{1}{2}x - \frac{\alpha}{2}\ln x$, and this clearly motivates the use of (associated) Laguerre polynomials $L_n^{(\alpha)}(x)$, which are orthogonal with respect to this precise weight (after a simple rescaling),
$$\int_0^{\infty} \mathrm{d}x\, x^{\alpha}\, \mathrm{e}^{-x}\, L_n^{(\alpha)}(x)\, L_m^{(\alpha)}(x) = \frac{\Gamma(n + \alpha + 1)}{n!}\, \delta_{m,n} \ . \qquad (13.4)$$
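The orthogonality relation (13.4) is easy to verify by quadrature; a Python sketch using scipy's `eval_genlaguerre` (the parameter value $\alpha = 1.5$ is an arbitrary choice for the test):

```python
import numpy as np
from scipy.special import eval_genlaguerre, gamma
from scipy.integrate import quad

def overlap(n, m, alpha):
    """Left-hand side of (13.4), computed by quadrature."""
    f = lambda x: x**alpha * np.exp(-x) * eval_genlaguerre(n, alpha, x) * eval_genlaguerre(m, alpha, x)
    return quad(f, 0, np.inf)[0]

alpha = 1.5  # arbitrary test value of the parameter
print(overlap(3, 3, alpha) * 6 / gamma(3 + alpha + 1))  # ~ 1.0  (n = 3, so n! = 6)
print(abs(overlap(3, 2, alpha)) < 1e-8)                 # True   (orthogonality)
```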
Question. What happens if I take M < N?
I This situation defines the so-called Anti-Wishart ensemble W̃ . In this
case, one can show that N − M eigenvalues are exactly 0. The jpdf is similar to
(13.3), but some of the matrix elements of W̃ are non-random and deterministically
related to the first M rows of W̃ [67].
The code [♠ Wishart_check.m] produces instances of Wishart matrices for different β s, as well as
normalized histograms of their eigenvalues. You can start having a look at it now, but please come back to
it after reading the next chapter.
4 Note that while for Wishart matrices M − N is a non-negative integer and β = 1, 2 or 4, the jpdf in (13.3) is well defined for
any β > 0 and any α > −2/β (this last condition is necessary to ensure that the jpdf is normalizable). When these parameters take
continuous values, this jpdf defines the so-called β -Laguerre ensemble.
Question. What is the limiting spectral density of the WL ensembles for N → ∞?
I It is called the Marčenko-Pastur density [68], which is superimposed on the histograms produced with the code above in Fig. 14.1 of the next chapter. We are going to derive it using the resolvent method very shortly.
13.2 Jpdf of entries: matrix deltas...
The calculation of the jpdf of entries (13.2) proceeds through a few simple steps. Set for simplicity β = 2
(hermitian matrices). We can formally write
$$\rho[W] = \int \mathrm{d}H\, \rho[H]\, \delta\!\left(W - HH^{\dagger}\right) \ . \qquad (13.5)$$
As usual, the measure dH means that we are integrating over the 2NM degrees of freedom (dof)5 of H: each
entry of the N × M matrix H is a complex number, so it is parametrized by two real numbers6 . Therefore,
$\mathrm{d}H = \prod_{i=1}^{N}\prod_{j=1}^{M} \mathrm{d}\mathrm{Re}[H_{ij}]\, \mathrm{d}\mathrm{Im}[H_{ij}]$.
The matrix delta $\delta(W - HH^{\dagger})$ enforces the constraint that a certain matrix $W$ must be equal to another matrix $HH^{\dagger}$. We do have an integral representation for the scalar delta function, which does the same job for real numbers. It should then be easy to work out the corresponding integral representation for the delta function of, say, a $N \times N$ hermitian matrix $K$ - after all, it will just be the product of scalar deltas, one for each of the real dof
$$\delta(K) = \prod_{i=1}^{N}\delta(K_{ii}) \prod_{i=1}^{N}\prod_{j>i}\delta\!\left(K_{ij}^{(R)}\right)\delta\!\left(K_{ij}^{(I)}\right) = \int\!\cdots\!\int \frac{\mathrm{d}T_{11}}{2\pi}\cdots\frac{\mathrm{d}T_{NN}}{2\pi}\, \exp\!\left(i\sum_{i=1}^{N}T_{ii}K_{ii}\right) \int \prod_{i=1}^{N}\prod_{j>i}\frac{\mathrm{d}T_{ij}^{(R)}\,\mathrm{d}T_{ij}^{(I)}}{(2\pi)^2}\; \mathrm{e}^{\,i\sum_{i=1}^{N}\sum_{j>i}\left[T_{ij}^{(R)}K_{ij}^{(R)} + T_{ij}^{(I)}K_{ij}^{(I)}\right]} \ , \qquad (13.6)$$
where we have introduced a set of $N(N+1)/2$ parameters $\{T\}$, one for each delta.
Arranging the parameters $\{T\}$ into a hermitian matrix, try to show that the ugly expression in (13.6) can be recast in the more elegant form
$$\delta(K) = \frac{1}{2^N \pi^{N^2}} \int \mathrm{d}T\, \mathrm{e}^{\,i\mathrm{Tr}[TK]} \ . \qquad (13.7)$$
5 With "degrees of freedom" we mean the independent real parameters that are necessary to define a matrix. For example, a hermitian matrix has N² degrees of freedom - the N real entries on the diagonal, and the real and imaginary parts of the entries in the upper triangle.
6 Note, in particular, that for N = M the square matrix H is not hermitian, and has 2N² "degrees of freedom" (dof) instead of N².
We can now perform the multiple integral in (13.5), with Gaussian distributed dof of $H$
$$\rho[H] \equiv \rho\!\left(H_{11}^{(R)}, H_{11}^{(I)}, \ldots, H_{NM}^{(R)}, H_{NM}^{(I)}\right) = \prod_{i=1}^{N}\prod_{j=1}^{M} \frac{1}{2\pi}\, \exp\!\left(-\frac{1}{2}H_{ij}^{(R)\,2} - \frac{1}{2}H_{ij}^{(I)\,2}\right) = \frac{1}{(2\pi)^{NM}}\, \mathrm{e}^{-\frac{1}{2}\mathrm{Tr}(HH^{\dagger})} \ , \qquad (13.8)$$
where $(R)$ and $(I)$ denote the real and imaginary part of each of the $NM$ entries of $H$.
Combining (13.5), (13.7) and (13.8) we have
$$\rho[W] = \frac{1}{2^N \pi^{N^2}}\, \frac{1}{(2\pi)^{NM}} \int \mathrm{d}T \int \mathrm{d}H\, \mathrm{e}^{-\frac{1}{2}\mathrm{Tr}(HH^{\dagger}) + i\mathrm{Tr}\left[T\left(W - HH^{\dagger}\right)\right]} \ . \qquad (13.9)$$
Dividing all the dof of the hermitian matrix $T$ by 2 (i.e. changing variables $T \to T/2$), we obtain
$$\rho[W] = \frac{1}{2^N \pi^{N^2}}\, \frac{1}{(2\pi)^{NM}}\, \frac{1}{2^{N^2}} \int \mathrm{d}T \int \mathrm{d}H\, \mathrm{e}^{-\frac{1}{2}\mathrm{Tr}(HH^{\dagger}) + \frac{i}{2}\mathrm{Tr}\left[T\left(W - HH^{\dagger}\right)\right]} \ . \qquad (13.10)$$
13.3 ...and matrix integrals
Next, we use the following identity for $N \times N$ hermitian matrices $T$
$$\left[\det(\mu \mathbf{1} - T)\right]^{-M} = \frac{1}{(4\pi i)^{MN}} \int \prod_{k=1}^{M} \mathrm{d}^2 s_k\, \exp\left\{\frac{i}{2}\,\mu \sum_{k=1}^{M} s_k^{\dagger} s_k - \frac{i}{2} \sum_{k=1}^{M} s_k^{\dagger} T s_k\right\} \ , \qquad (13.11)$$
where $\mathbf{1}$ is the $N \times N$ identity matrix, the $s_k$ (with $k = 1, \ldots, M$) are complex (column) vectors, so that $\mathrm{d}^2 s_k = \prod_{i=1}^{N} \mathrm{d}s_{k,i}\, \mathrm{d}s_{k,i}^{\star}$, and $\mu$ is such that $\mathrm{Im}[\mu] > 0$.
Question. Any hint on how to prove it?
I Just write $T = U^{\dagger}\Lambda U$, with $U$ the unitary matrix diagonalizing $T$, and $\Lambda$ the diagonal matrix of eigenvalues $\lambda_i$. Then make the change of variables $U s_k \to \tilde{s}_k$, which is unitary and thus has Jacobian equal to 1. The resulting integral factorizes as
$$\int \prod_{k=1}^{M} \mathrm{d}^2 \tilde{s}_k \to \left[\prod_{\ell=1}^{N} 2 \int \mathrm{d}x\, \mathrm{d}y\, \mathrm{e}^{\frac{i}{2}(x-iy)(\mu - \lambda_\ell)(x+iy)}\right]^{M} \ , \qquad (13.12)$$
where $x, y$ are the real and imaginary parts of the $\ell$th entry of $\tilde{s}_k$, and the factor of 2 is the Jacobian of the change of variables $\{\tilde{s}_{k,i}, \tilde{s}_{k,i}^{\star}\} \to \{x, y\}$. The integral in $\{x, y\}$ yields $2\pi i/(\mu - \lambda_\ell)$, from which the claim is immediate.
We can now perform the H integral in (13.10). How? Just imagine that the kth vector sk is constructed
as sk = (H1k , . . . , HNk )T , i.e. it is basically the kth column of the rectangular matrix H.
Hence, note the identity $-\frac{1}{2}\mathrm{Tr}(HH^{\dagger}) = \frac{i}{2}\mu \sum_{k=1}^{M} s_k^{\dagger} s_k$, with $\mu = i$. Finally, we have to calculate the Jacobian of the change of variables $\{H_{ik}^{(R)}, H_{ik}^{(I)}\} \to \{s_{k,i}, s_{k,i}^{\star}\}$. For each entry of $H$, we have
$$\begin{cases} s_{k,i} = H_{ik}^{(R)} + iH_{ik}^{(I)} \\ s_{k,i}^{\star} = H_{ik}^{(R)} - iH_{ik}^{(I)} \ . \end{cases} \qquad (13.13)$$
The Jacobian from $s \to H$ is
$$\det\begin{pmatrix} \frac{\partial s_{k,i}}{\partial H_{ik}^{(R)}} & \frac{\partial s_{k,i}}{\partial H_{ik}^{(I)}} \\[1mm] \frac{\partial s_{k,i}^{\star}}{\partial H_{ik}^{(R)}} & \frac{\partial s_{k,i}^{\star}}{\partial H_{ik}^{(I)}} \end{pmatrix} = \det\begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix} = -2i \ . \qquad (13.14)$$
Thus, the Jacobian from $H \to s$ (the one we need) is (in absolute value) equal to $1/2$ for each entry. In total, $(1/2)^{NM}$.
Therefore, using (13.11)
$$\rho[W] = \frac{1}{2^N \pi^{N^2}}\, \frac{1}{(2\pi)^{NM}}\, \frac{1}{2^{N^2+NM}}\, (4\pi i)^{MN} \int \mathrm{d}T\, \mathrm{e}^{\frac{i}{2}\mathrm{Tr}(TW)}\left[\det\left(i\mathbf{1} - T\right)\right]^{-M} \ . \qquad (13.15)$$
We now need another matrix integral, with the pompous name "Ingham-Siegel integral of the second type" [69], whose general formula reads (see Appendix A in [70])
$$J_{N,M}(Q, \mu) = \int \mathrm{d}T\, \mathrm{e}^{i\mathrm{Tr}(TQ)}\left[\det(T - \mu\mathbf{1})\right]^{-M} = C_{M,N}\, (\det Q)^{M-N}\, \mathrm{e}^{i\mu \mathrm{Tr}\, Q} \ , \qquad (13.16)$$
with $C_{M,N} = 2^N \pi^{N(N+1)/2}\, i^{NM}\big/\prod_{j=M-N+1}^{M}\Gamma(j)$, and the matrix $Q$ is hermitian and positive definite, while $T$ is just hermitian. Both are $N \times N$. We also require $\mathrm{Im}(\mu) > 0$ to ensure convergence, and $M \ge N$.
To use this integral (13.16), we need to multiply back again all the degrees of freedom of the matrix $T$ by 2, and pull out a factor $(-2)$ from the determinant, resulting in
$$\rho[W] = 2^{-N(1+M)}\, \pi^{-N^2}\, i^{-MN} \int \mathrm{d}T\, \mathrm{e}^{i\mathrm{Tr}(TW)}\left[\det\left(T - \frac{i}{2}\mathbf{1}\right)\right]^{-M} \ , \qquad (13.17)$$
which can be evaluated using (13.16) as
$$\rho[W] = 2^{-N(1+M)}\, \pi^{-N^2}\, i^{-MN} \times \frac{2^N\, \pi^{N(N+1)/2}\, i^{NM}}{\prod_{j=M-N+1}^{M}\Gamma(j)}\, (\det W)^{M-N}\, \mathrm{e}^{-(1/2)\mathrm{Tr}\, W}$$
$$= \frac{1}{2^{NM}\, \pi^{\frac{N(N-1)}{2}}\, \prod_{j=M-N+1}^{M}\Gamma(j)}\, (\det W)^{M-N}\, \mathrm{e}^{-(1/2)\mathrm{Tr}\, W} \ , \qquad (13.18)$$
i.e. the jpdf of the entries of Wishart matrices for $\beta = 2$, with the correct normalization⁷ (note that all the imaginary factors have correctly disappeared). Well done!
7 A reliable source for such normalizations is [31].
13.4 To know more...
1. The spectral densities of the Wishart-Laguerre ensemble for finite N and β = 1, 2, 4 have been given
explicitly in [71], together with numerical checks.
2. The large-N behavior of the spectral density and two-point function for the Wishart-Laguerre ensemble is determined by the asymptotics of Laguerre polynomials (in complete analogy with the Gaussian case). These are explicitly given in [72].
3. Non-hermitian analogues of the Wishart-Laguerre ensemble can also be defined (see [73] for a nice
review).
4. Readers interested in the diagrammatic approach to fluctuations in the Wishart ensemble should have
a look at [43].
5. For a nice review on the usefulness of the Wishart-Laguerre ensemble in physics, see [75]. For specific applications to QCD, see [65, 76].
Chapter 14
Meet Marčenko and Pastur
In this Chapter, we investigate the average spectral density for the Wishart-Laguerre ensemble.
14.1 The Marčenko-Pastur density
The average density of eigenvalues has the following scaling form for $N, M \to \infty$ (such that $c = N/M \le 1$ is kept fixed)
$$\rho(x) \to \frac{1}{\beta N}\, \rho_{\mathrm{MP}}\!\left(\frac{x}{\beta N}\right) \ , \qquad (14.1)$$
where the Marčenko-Pastur scaling function (the analogue of the semicircle $\rho_{\mathrm{SC}}(x)$ in Eq. (3.6) for the Gaussian ensemble) is independent of $\beta$ and given by [64]
$$\rho_{\mathrm{MP}}(y) = \frac{1}{2\pi y}\sqrt{(y - \zeta_-)(\zeta_+ - y)} \ , \qquad (14.2)$$
for $y \in [\zeta_-, \zeta_+]$. The edge points $\zeta_{\pm}$ are given by $\zeta_- = (1 - c^{-1/2})^2$ and $\zeta_+ = (1 + c^{-1/2})^2$.
This scaling function ρMP (y) has a compact support on the positive semi-axis for c < 1 (with two soft
edges), but becomes singular at the origin if c → 1 (and the origin becomes a hard edge). This means that
Wishart matrices constructed from square matrices H exhibit an accumulation of eigenvalues very close to
zero.
It is worth stressing that the typical scale of an eigenvalue is $\sim O(N)$ in the WL case, as opposed to the scale $\sim O(\sqrt{N})$ for the Gaussian ensemble.
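Before deriving it, the Marčenko-Pastur law (14.1)-(14.2) can be previewed with a quick Monte Carlo experiment. A Python sketch for $\beta = 2$, using the convention of (13.8) in which each real dof of $H$ has unit variance:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, beta = 100, 200, 2
c = N / M
zm, zp = (1 - c**-0.5)**2, (1 + c**-0.5)**2     # edge points zeta_-, zeta_+

eigs = []
for _ in range(100):
    H = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
    eigs.append(np.linalg.eigvalsh(H @ H.conj().T) / (beta * N))  # rescaling of (14.1)
eigs = np.concatenate(eigs)

def rho_mp(y):
    """Marcenko-Pastur scaling function, Eq. (14.2)."""
    return np.sqrt(np.clip((y - zm) * (zp - y), 0, None)) / (2 * np.pi * y)

hist, edges = np.histogram(eigs, bins=40, range=(zm, zp), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
print(eigs.mean())                          # ~ 1/c = 2, the mean of rho_MP
print(np.mean(np.abs(hist - rho_mp(mid))))  # small histogram-vs-theory deviation
```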
14.2 Do it yourself: the resolvent method
Let us now derive the Marčenko-Pastur density using the resolvent (or Stieltjes transform) method. The partition function (normalization constant) for the Wishart-Laguerre ensemble reads (after a rescaling $x_i \to \beta N x_i$)
$$Z_{N,\beta}^{(L)} \propto \int_0^{\infty} \prod_{j=1}^{N} \mathrm{d}x_j\, \mathrm{e}^{-\frac{\beta N}{2}\sum_{i=1}^{N} x_i}\, \prod_{i=1}^{N} x_i^{\alpha\beta/2}\, \prod_{j<k}|x_j - x_k|^{\beta} = \int_0^{\infty} \prod_{j=1}^{N} \mathrm{d}x_j\, \mathrm{e}^{-\beta N V[\mathbf{x}]} \ , \qquad (14.3)$$
with
$$V[\mathbf{x}] = \frac{1}{2}\sum_i x_i + \left(\frac{2/\beta - 1}{2N} + \frac{1}{2} - \frac{M}{2N}\right)\sum_i \ln x_i - \frac{1}{2N}\sum_{i\neq j}\ln|x_i - x_j| \ . \qquad (14.4)$$
As in Chapter 8, the xi are now of O(1) for large N. We can again perform the saddle point evaluation
of the N-fold integral (14.3), but this time there is an additional subtlety which, if overlooked, leads straight
to a nonsensical answer.
The subtlety is that the minimization of the exponent should be carried out within the set of positive
x . In other words, on top of the saddle-point equation, there is an inequality constraint to satisfy as well,
xi > 0 ∀i.
One way to handle this constraint is to introduce a penalty function $-\mu \sum_i \ln(x_i)$ in the "action" $V[\mathbf{x}]$, with a Lagrange multiplier $\mu$. Since $-\ln(t) \to \infty$ for $t \to 0$, it acts as if each particle felt an extra "infinite wall"-type of repulsion while approaching the origin, and thus helps confine the eigenvalues on the positive semi-axis. The extra wall is then "gently" removed ($\mu \to 0$) at the end of the calculation.
The saddle-point equations now read for any $i$ (and for $N \gg 1$ and $N/M = c \le 1$)
$$\frac{1}{2} + \left(\frac{1}{2} - \frac{1}{2c} - \mu\right)\frac{1}{x_i} = \frac{1}{N}\sum_{j\neq i}\frac{1}{x_i - x_j} \ . \qquad (14.5)$$
Multiplying (14.5) by $\frac{1}{N(z - x_i)}$ and summing over $i$, we get in analogy with Eq. (8.18)
1 1 1 1 1 1
G (z) . (14.6)
1 0
GN (z) + = GN2 (z) +
2 2 2c N∑ 2 2N N
− −µ
i xi (z − xi )
The second term can be expressed in terms of G_N(z) using
(1/N) ∑_i 1/(x_i(z − x_i)) = (1/(zN)) ∑_i ( 1/x_i + 1/(z − x_i) ) = (K + G_N(z))/z ,   (14.7)
and taking the average G_∞^(av)(z) = ⟨G_N(z)⟩ in the limit N → ∞, we obtain
(1/2) G_∞^(av)(z) + ( 1/2 − 1/(2c) − µ ) (K + G_∞^(av)(z))/z = (1/2) [G_∞^(av)(z)]^2 .   (14.8)
Here K is a constant that we assume finite (by derivation, we have K = ∫ dx ρ(x)/x).
Note that, had we not included the penalty function parametrized by µ from the beginning, we would have landed for c = 1 on the equation (1/2) G_∞^(av)(z) = (1/2) [G_∞^(av)(z)]^2, from which no sensible spectral density could be extracted! This is because the Wishart eigenvalues cannot equilibrate on the entire real line under a potential V(x) = x (which is not confining for x → −∞).
It is convenient to set γ = (1 − c)/c > 0. Solving now the quadratic equation (14.8) for µ → 0, we get
G_∞^(av)(z) = (1/2) [ 1 − γ/z ± √(γ^2 − 4γKz + z^2 − 2γz)/z ] .   (14.9)
Setting now z = x − iε, multiplying up and down by x + iε and using the real and imaginary parts p and q of the square root as in (8.3), we obtain
(1/π) Im G_∞^(av)(x − iε) = (−εx ± qx)/(2π(x^2 + ε^2)) → √((x − x₋(γ,K))(x₊(γ,K) − x))/(2πx)  as ε → 0⁺ ,   (14.10)
where it is understood that the (±) sign in (14.9) is to be chosen differently in different x-intervals, in analogy with the Gaussian case. Of course, the right hand side of (14.10) is only valid for x such that the square root exists. The constants are x_±(γ, K) = γ ( 2K + 1 ± 2√(K^2 + K) ).
We now have to fix the constant K by requiring normalization of ρ(x). Using the integral (for b > a)
∫_a^b √((x − a)(b − x))/(2πx) dx = (1/4)( a + b − 2√(ab) ) ,   (14.11)
all we have to do is to assign a ← x₋(γ, K) and b ← x₊(γ, K), and to solve (1/4)(a + b − 2√(ab)) = 1 for K. This gives K = 1/γ.
And for this value of K, the edge points become x_±(γ, 1/γ) → (1 ± 1/√c)^2, which means that we have recovered the MP law (14.2) using the resolvent method. Congratulations!
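The MP law can also be checked by direct sampling. A minimal Python sketch of ours (the book's companion codes are in MATLAB; sizes, sample counts and the normalization W = XXᵀ/N, chosen here so that the edges sit at (1 ± 1/√c)^2 and the mean eigenvalue is 1/c, are our own choices):

```python
import numpy as np

def wishart_eigenvalues(N, M, T, rng):
    """Eigenvalues of T Wishart (beta = 1) matrices W = X X^T / N."""
    evs = []
    for _ in range(T):
        X = rng.standard_normal((N, M))
        evs.append(np.linalg.eigvalsh(X @ X.T / N))
    return np.concatenate(evs)

N, M = 100, 200                       # rectangularity ratio c = N/M = 1/2
c = N / M
rng = np.random.default_rng(0)
evs = wishart_eigenvalues(N, M, 50, rng)

# Marcenko-Pastur edges in this normalization
x_minus, x_plus = (1 - 1 / np.sqrt(c))**2, (1 + 1 / np.sqrt(c))**2

inside = np.mean((evs > x_minus - 0.3) & (evs < x_plus + 0.3))
print(inside)          # ~ 1: essentially all eigenvalues within the MP support
print(evs.mean())      # ~ 1/c = 2, the mean of the MP law
```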
You can now fully enjoy Fig. 14.1, where we show a comparison between the Marčenko-Pastur density
and the histograms obtained by numerical diagonalization of WL random matrices for different values of β.
Question. Wait a second... In the derivation, we said that we had to assume K finite and equal to K = ∫ dx ρ(x)/x (because the constant K arises as the average ⟨(1/N) ∑_i 1/x_i⟩). Shouldn't we check that this is consistent with the final result?
▶ Yes, we should! The integral ∫ dx ρ(x)/x amounts to computing the following
∫_a^b √((x − a)(b − x))/(2πx^2) dx = ( a + b − 2√(ab) )/( 4√(ab) ) ,   (14.12)
and setting a ← (1 − 1/√c)^2 and b ← (1 + 1/√c)^2. This gives c/(1 − c), which is precisely equal to K = 1/γ. Bingo!
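The normalization (14.11) and the value K = 1/γ can also be confirmed by brute-force quadrature (a sketch of ours; the grid resolution is an arbitrary choice):

```python
import numpy as np

def trap(y, x):
    """Simple trapezoidal rule."""
    return float(np.dot((y[:-1] + y[1:]) / 2, np.diff(x)))

c = 0.5                                # rectangularity ratio, gamma = (1-c)/c = 1
xm, xp = (1 - 1 / np.sqrt(c))**2, (1 + 1 / np.sqrt(c))**2

x = np.linspace(xm, xp, 400001)
rho = np.sqrt(np.maximum((x - xm) * (xp - x), 0.0)) / (2 * np.pi * x)

print(trap(rho, x))          # normalization, cf. (14.11): ~ 1
print(trap(rho / x, x))      # K = int dx rho(x)/x, cf. (14.12): ~ c/(1-c) = 1
```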
Figure 14.1: Comparison between the Marčenko-Pastur density for two different values of the rectangularity
ratio c and the corresponding histograms obtained from the numerical diagonalization of random Wishart
matrices (for all possible values of β ). All histograms are obtained from 5000 Wishart matrices of size
N = 100.
Question. What if I wanted to use the Coulomb gas technique to derive the Marčenko-Pastur law?
▶ The partition function (normalization constant) for the WL ensemble, after the rescaling x_i → x_i β, reads
Z_{N,β}^(L) = C_{N,β}^(L) ∫_{(0,∞)^N} ∏_{j=1}^N dx_j e^{−β V[x]} ,   (14.13)
where the energy is given this time by V[x] = (1/2) ∑_i x_i − (α/2) ∑_i ln x_i − (1/2) ∑_{i≠j} ln |x_i − x_j|.
In the WL case, the gas is in equilibrium under the competing effect of a linear+logarithmic confining potential, and the 2D electrostatic repulsion.
Following the same procedure as in Chapter 4 (but with the rescaling n(x) → (1/N) n(x/N)), we obtain for the energy functional V[n(x)] = N^2 V̂[n(x)], with
V̂[n(x)] = ∫ dx v(x) n(x) − (1/2) ∫∫ dx dx′ n(x) n(x′) ln |x − x′| ,   (14.14)
with v(x) = x/2 − (1/2)(1/c − 1) ln x.
The singular integral equation for the equilibrium density readily follows
Pr ∫ dx′ n⋆(x′)/(x − x′) = 1/2 − (1/2)(1/c − 1)(1/x) .   (14.15)
Try to apply Tricomi's formula (5.15) - assuming a single-support solution on [a, b] - and then determine a and b in such a way that the free energy is minimized. You will discover that n⋆(x) = ρ_MP(x), as it should.
14.3 Correlations in the real world and a quick example: financial correlations
A huge number of scientific disciplines, ranging from Physics to Economics, often need to deal with statistical systems described by a large number of degrees of freedom. Thus, understanding and describing the collective behavior of a large number of random variables is one of the most fundamental issues in
multivariate Statistics. More often than not, the problem can be addressed in terms of correlations.
Suppose we are interested in understanding the correlation structure of a system described in terms of N random variables {x_1, . . . , x_N}, drawn from a - potentially unknown, but not changing in time - jpdf p(x). In
order to do so, one of the most obvious operations to perform is to collect, if possible, as many “experimental
observations” of such variables. Such observations can then be used to compute empirical time averages of
quantities expressed in terms of the random variables. So, let us assume we have collected M observations
- say, equally spaced in time - for each variable. Quite straightforwardly, one can collect all these numbers in an N × M matrix X whose entries are x_i^t (i = 1, . . . , N, t = 1, . . . , M).
Assuming all variables x_i^t are adjusted so that their sample mean¹ is zero and their sample variance
is 1, then the quantity
c_ij = (1/M) ∑_{t=1}^M x_i^t x_j^t   (14.16)
yields the well known Pearson estimator for the correlation between variables xi and x j . This is an estimator
of the true (or population) correlation2 c̃i j , which would be measured exactly for M → ∞, i.e. as more and
more observations are added to the data. However, real life practice always entails working with finite-sized
datasets (i.e. with finite M), which introduces some degree of measurement error.
The estimators for each pair of variables in the system can be collected into a single N × N matrix
C = XX T /M, known as the sample correlation matrix of the data in X, whose entries are given by Eq.
(14.16). These amount to N(N − 1)/2 real numbers (diagonal entries are equal to one), which for a large
system represent a whopping amount of information to process. So, what should we make of all this? Well,
a reasonable first step could be to compare the empirical correlation matrix of the system we are interested
in with the prediction of a suitably defined null hypothesis. In the first instance, we could for example look
for a null model describing uncorrelated Gaussian random data and see how our empirical data differ from it.
By any chance, do we know a random matrix ensemble from which we can draw this kind of random
correlation matrices? Well, of course we do! It is precisely the Wishart-Laguerre ensemble. As we dis-
cussed, the density of eigenvalues is well known for this ensemble, and it is given by the Marčenko-Pastur
law (14.2). This means that a zero-th order assessment of the statistical significance of the correlations in
a large system can be obtained from the comparison of the empirical eigenvalue spectrum of its correlation
matrix with the Marčenko-Pastur law for a system with the same rectangularity ratio N/M.
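As a sketch of this zero-th order test on synthetic uncorrelated data (our own illustration; note that for a sample correlation matrix normalized as C = XXᵀ/M the null eigenvalues fall asymptotically within [(1 − √(N/M))^2, (1 + √(N/M))^2]):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 50, 500                         # N variables, M observations

# Uncorrelated Gaussian null data, standardized variable by variable
X = rng.standard_normal((N, M))
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

C = X @ X.T / M                        # Pearson estimator, Eq. (14.16)
evs = np.linalg.eigvalsh(C)

q = N / M
lam_minus, lam_plus = (1 - np.sqrt(q))**2, (1 + np.sqrt(q))**2
print(evs.min(), evs.max())            # close to the null (MP) bounds
n_outliers = int(np.sum(evs > lam_plus + 0.1))
print(n_outliers)                      # 0: no "market modes" in pure noise
```

Empirical stock-return correlation matrices typically do produce eigenvalues far above lam_plus, and those are the statistically significant ones.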
A prime example of the procedure outlined above is the analysis of financial correlations. Suppose you
want to invest your money in N stocks by forming an investment portfolio. As the old saying goes, “don’t
put your eggs in one basket”, which in financial terms translates into “don’t invest all your money in a
portfolio of highly correlated stocks” - not the most effective punchline, admittedly. Hence, distinguishing
signal from noise within financial correlation matrices is of paramount importance to build a well diversified
portfolio, where the possible losses due to the adverse movement of a group of stocks can be offset by other
groups of stocks.
When an empirical financial correlation matrix is diagonalized, one usually finds that several eigenvalues
are much larger than the expected upper bound of the Marčenko-Pastur law. The information contained in
the associated eigenvectors typically shows that these are due to the co-movements of groups of highly
correlated stocks belonging to well defined market sectors (e.g. pharmaceutical, financial, etc). This kind
1 The sample mean is x̄_i = (1/M) ∑_{t=1}^M x_i^t, not to be confused with the true mean ⟨x_i⟩_{p(x)}, which is a property of the jpdf p(x).
2 The true correlation c̃_ij is a property of the jpdf p(x) of the random variables {x_1, . . . , x_N}.
of random matrix approach to financial correlations was initiated in [79, 80] and since then a considerable
number of papers has been devoted to it (see [81] for a recent account).
Chapter 15
Replicas...
In this Chapter, we add one more powerful tool to our arsenal: the Edwards-Jones formula, used in conjunction with the celebrated replica trick.
15.1 Meet Edwards and Jones
The Edwards-Jones formula [82] allows us to write down a formal expression for the average spectral density ρ(x) of a completely generic ensemble of real symmetric random matrices H, taking as a starting point just the jpdf of the entries in the upper triangle, ρ[H].
The formula reads
ρ(x) = −(2/(πN)) lim_{ε→0⁺} Im ∂/∂x ⟨Log Z(x)⟩ ,   (15.1)
where
Z(x) = ∫_{R^N} dy exp( −(i/2) yᵀ(x_ε 1 − H) y ) ,   (15.2)
and x_ε = x − iε. The average ⟨·⟩ is taken with respect to ρ[H], i.e. ⟨·⟩ = ∫ dH_11 · · · dH_NN ρ[H] (·).
This formula is remarkable: it allows us to compute the spectral density - the marginal of the jpdf of the eigenvalues - without knowing the jpdf of the eigenvalues! Only the information about the entries is required as input.
While the formula (15.1) is in principle valid for any finite N, in practice the calculations can be carried
out until the end only in the limit N → ∞, where several simplifications take place.
15.2 The proof
The proof is not complicated - even though there are several subtleties. Recall from Chapter 2 how the average spectral density is defined: ρ(x) = ⟨(1/N) ∑_{i=1}^N δ(x − x_i)⟩.
Recall also the Sokhotski-Plemelj identity: as ε → 0⁺,
1/(x ± iε) → Pr(1/x) ∓ iπδ(x) .   (15.3)
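A quick numerical illustration of the δ-piece of (15.3): smear a smooth test function against the Lorentzian Im[1/(x − iε)] = ε/(x^2 + ε^2) (the test function and ε values are arbitrary choices of ours):

```python
import numpy as np

f = lambda x: np.exp(-x**2)            # smooth test function, f(0) = 1

x = np.linspace(-50.0, 50.0, 2_000_001)
dx = x[1] - x[0]
vals = []
for eps in (1e-1, 1e-2, 1e-3):
    lorentz = eps / (x**2 + eps**2)    # Im[1/(x - i*eps)]
    vals.append(float(np.sum(f(x) * lorentz) * dx / np.pi))

print(vals)                            # -> f(0) = 1 as eps -> 0+
```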
This equation provides an interesting identity for the delta function, which we already used in Chapter
8. We can therefore write
ρ(x) = (1/(πN)) lim_{ε→0⁺} Im ⟨∑_{i=1}^N 1/(x − iε − x_i)⟩ = −(1/(πN)) lim_{ε→0⁺} Im ⟨∑_{i=1}^N 1/(x_i + iε − x)⟩ ,   (15.4)
where Im stands for the imaginary part, and we changed a sign for later convenience.
Next, we write the denominator in the sum as the derivative of a logarithm. But the denominator is a complex number: and the logarithms of complex numbers are nasty beasts¹. Anyway, we can choose the principal branch of the logarithm - and denote it by Log - to write
ρ(x) = (1/(πN)) lim_{ε→0⁺} Im ∂/∂x ⟨∑_{i=1}^N Log(x_i + iε − x)⟩ .   (15.5)
Next, we use the identity
Z(x) = (2π)^{N/2} exp[ −(1/2) ∑_{i=1}^N Log(x_i + iε − x) + iNπ/4 ] ,   (15.6)
where Z(x) is given by the multiple integral in (15.2). You can check this identity with the code [♠ Zmultiple.m].
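The MATLAB snippet is not reproduced here, but the identity is easy to check numerically, e.g. in Python for the simplest case N = 1, H = h (a sketch of ours; the values of x, ε and h are arbitrary):

```python
import numpy as np
from scipy.integrate import quad

x, eps, h = 0.3, 0.3, -0.7
w = (x - 1j * eps) - h                 # x_eps - h; eps > 0 makes Z converge

# Left-hand side: Z(x) = int dy exp(-(i/2) w y^2)
integrand = lambda y: np.exp(-0.5j * w * y * y)
re, _ = quad(lambda y: integrand(y).real, -30, 30, limit=400)
im, _ = quad(lambda y: integrand(y).imag, -30, 30, limit=400)
lhs = re + 1j * im

# Right-hand side of (15.6) with N = 1 (np.log is the principal branch Log)
rhs = np.sqrt(2 * np.pi) * np.exp(-0.5 * np.log(h + 1j * eps - x) + 1j * np.pi / 4)

print(abs(lhs - rhs))                  # ~ 0
```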
Now, compare the last two equations. Clearly, the final formula would be easily established if we could replace ∑_{i=1}^N Log(x_i + iε − x) in (15.5) with something related to Z(x), using (15.6).
To extract ∑_{i=1}^N Log(x_i + iε − x) from (15.6), we should take the logarithm on both sides. There is a small glitch, though, due to another mind-boggling feature of complex logarithms. Namely, Log(exp(z)) may not just be equal to z, for z ∈ C!²
However, we can still write
∑_{i=1}^N Log(x_i + iε − x) = −2 Log Z(x) + terms that are killed by ∂/∂x .   (15.7)
Inserting (15.7) into (15.5), we establish the final formula (15.1).
1 For example, Log(z1 z2 ) may not be equal to Log(z1 ) + Log(z2 )!
2 For instance, if z = 0.2 − 4.4i, then Log(exp(z)) = 0.2 + 1.88319i.
15.3 Averaging the logarithm
The Edwards-Jones formula (15.1) thus requires computing ⟨Log Z(x)⟩, where the average is taken over several realizations of the matrix H.
This means that we should compute
⟨Log Z(x)⟩ = ∫ dH_11 · · · dH_NN ρ[H] Log ∫_{R^N} dy exp( −(i/2) yᵀ(x_ε 1 − H) y ) ,   (15.8)
which is very annoying: the logarithm is right in the way!
We would really need to exchange the order of integrals to perform the average over H before the average over y - otherwise we would be running the Edwards-Jones formula backwards and gain nothing!
There are two strategies to circumvent this obstacle, each with their own subtleties. To know more about
the replica method and its applications to spin glass theory see [83, 84].
15.4 Quenched vs. Annealed
Calling the quantity in (15.2) Z(x) is intentional: we wish to interpret it as the partition function of an associated stat-mech model in the canonical ensemble. The logarithm of Z will then be the free energy of this model.
Looking again at the multiple integral defining Z(x), Z(x) = ∫_{R^N} dy exp( −(i/2) yᵀ(x_ε 1 − H) y ), we see that it encodes two different 'levels' of randomness: i) the random matrix H - the so called disorder - and ii) the dynamical variables y, which morally³ follow a Gibbs-Boltzmann distribution P(y_1, . . . , y_N) = (1/Z(x)) exp(−H(y; H, x)).
Computing now ⟨Log Z(x)⟩ - as we should - assumes that the two levels of randomness are unfolding on different timescales: first, the dynamical variables y need to equilibrate according to the Gibbs-Boltzmann distribution for a fixed instance of the random matrix H - and only afterwards the free energy is averaged over the disorder (different realizations of H).
For these reasons, the disorder is called quenched4 : it is there, but it acts slowly. It only kicks in after
the y ’s have thermalized.
Computing a quenched disorder average is difficult, but can be attempted - in the limit N → ∞ - using the
so called replica trick, which gets rid of the logarithm inside the integral in (15.8) and allows the integrations
over H and y to be interchanged. More on this later.
A second strategy - which simplifies the calculations considerably - is to cheat a bit and treat the disorder
as annealed instead.
3 “Morally”, since the “Hamiltonian” H is actually complex, so P(y1 , . . . , yN ) is not a proper distribution.
4 Quenched adj. made less severe or intense; subdued or overcome; allayed; squelched.
This means that the associated stat-mech model is described in terms of the joint set of dynamical variables {y, H}, leading to a partition function Z^(ann)(x) = ∫ dH dy (· · ·).
The dynamical variables y are no longer integrated over at fixed value of the disorder H, but rather H and
y fluctuate and thermalize together. A questionable but widespread way to describe in words an annealed average is: instead of computing the quenched average ⟨Log Z(x)⟩ - as we should - move the average inside the logarithm⁵, Log⟨Z(x)⟩.
Clearly, this slick maneuver forces the logarithm out of the integrals, and allows for a much quicker -
even though not entirely justifiable - computation.
In the following section, we present the annealed calculation to obtain the semicircle law for the GOE6 .
5 For the annealed average, we should more properly write Log Z (ann) (x) - with no further average over H.
6 This is only for training purposes. There is no need to use Edwards-Jones when the jpdf of eigenvalues is known!
Chapter 16
Replicas for GOE
In this Chapter, we apply the Edwards-Jones formula to compute the average spectral density of the GOE
ensemble.
16.1 Wigner’s semicircle for GOE: annealed calculation
The jpdf of entries in the upper triangle of a GOE is
ρ[H] = ∏_{i=1}^N [ exp(−N H_ii^2/2) / √(2π/N) ] ∏_{i<j} [ exp(−N H_ij^2) / √(π/N) ] ,   (16.1)
where we have already rescaled the unit variance by a factor 1/N. This has the net effect of rescaling the eigenvalues by 1/√N (why?), so the corresponding spectral density will have edges between −√2 and √2 - not growing with N.
For the annealed calculation, we need to compute
Z^(ann)(x) = ∫_{R^N} dy ∫ ∏_{i≤j} dH_ij ρ[H] exp( −(i/2) yᵀ(x_ε 1 − H) y ) .   (16.2)
Separating diagonal and off-diagonal elements, and using the notation ⟨(·)⟩ = ∫ ∏_{i≤j} dH_ij ρ[H] (·), we can write
Z^(ann)(x) ∝ ∫_{R^N} dy exp[ −(i/2) x_ε ∑_{i=1}^N y_i^2 ] ⟨exp[ (i/2) ∑_{i=1}^N H_ii y_i^2 ]⟩ ⟨exp[ i ∑_{i<j} H_ij y_i y_j ]⟩ ,   (16.3)
where we neglect some overall constant terms.
Expanding e^z ≈ 1 + z + z^2/2 + . . . and using the fact that the entries of H are independent with ⟨H_ij⟩ = 0 and ⟨H_ij^2⟩ = 1/(N(2 − δ_ij)), we can write
⟨exp[ (i/2) ∑_{i=1}^N H_ii y_i^2 ]⟩ = ∏_{i=1}^N ⟨1 + (i/2) H_ii y_i^2 − (1/8) H_ii^2 y_i^4 + . . .⟩ = ∏_{i=1}^N ( 1 − (1/(8N)) y_i^4 + . . . ) ,   (16.4)
⟨exp[ i ∑_{i<j} H_ij y_i y_j ]⟩ = ∏_{i<j} ⟨1 + i H_ij y_i y_j − (1/2) H_ij^2 y_i^2 y_j^2 + . . .⟩ = ∏_{i<j} ( 1 − (1/(4N)) y_i^2 y_j^2 + . . . ) .   (16.5)
Re-exponentiating, we can write
Z^(ann)(x) ∝ ∫_{R^N} dy exp[ −(i/2) x_ε ∑_{i=1}^N y_i^2 − (1/(8N)) ∑_{i=1}^N y_i^4 − (1/(4N)) ∑_{i<j} y_i^2 y_j^2 ] = ∫_{R^N} dy exp[ −(i/2) x_ε ∑_{i=1}^N y_i^2 − (1/(8N)) (∑_{i=1}^N y_i^2)^2 ] .   (16.6)
Using now the Hubbard-Stratonovich (Gaussian linearization) identity
e^{−γ^2/(4α)} = √(α/π) ∫_{−∞}^∞ dq e^{−αq^2 + iγq}   (16.7)
with γ = ∑_{i=1}^N y_i^2 and α = 2N yields
Z^(ann)(x) ∝ ∫_{−∞}^∞ dq e^{−2Nq^2} ∫_{R^N} dy exp[ −(i/2) x_ε ∑_{i=1}^N y_i^2 + iq ∑_{i=1}^N y_i^2 ]
= ∫_{−∞}^∞ dq e^{−2Nq^2} [ ∫_R dy exp( −(1/2) ε y^2 − (i/2)(x − 2q) y^2 ) ]^N ,   (16.8)
where the y-integral is convergent as ε > 0. Writing X^N = exp[N Log X], we have
Z^(ann)(x) ∝ ∫_{−∞}^∞ dq exp[ N φ_x(q) ] ,  φ_x(q) = −2q^2 + (1/2) Log( 2π/(ε + i(x − 2q)) ) .   (16.9)
A saddle-point evaluation of (16.9) for large N leads to the stationarity condition 8q^2 − 4x_ε q + 1 = 0, i.e. q⋆ = (x_ε ± √(x_ε^2 − 2))/4. Writing the square root of a complex number as √(a + ib) = p + iq, with
p = (1/√2) √( √(a^2 + b^2) + a ) ,  q = (sign(b)/√2) √( √(a^2 + b^2) − a ) ,   (16.15)
using this with a = x^2 − ε^2 − 2 and b = −2εx, and choosing the sign in order to get a physical solution, we obtain
ρ(x) = (1/π) √( (1/2)( |x^2 − 2| − x^2 + 2 ) ) ,   (16.16)
which is indeed zero outside [−√2, √2] and equal to Wigner's semicircle ρ(x) = (1/π) √(2 − x^2) inside, as it should.
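The annealed prediction can be compared with brute-force sampling of GOE matrices scaled as in (16.1) (a sketch of ours; sizes and sample counts are arbitrary):

```python
import numpy as np

def goe(N, rng):
    """GOE with <H_ii^2> = 1/N and <H_ij^2> = 1/(2N): edges at +-sqrt(2)."""
    A = rng.standard_normal((N, N))
    return (A + A.T) / (2 * np.sqrt(N))

N, T = 200, 100
rng = np.random.default_rng(2)
evs = np.concatenate([np.linalg.eigvalsh(goe(N, rng)) for _ in range(T)])

inside = np.mean(np.abs(evs) < np.sqrt(2) + 0.1)
second_moment = float(np.mean(evs**2))
print(inside)            # ~ 1: the spectrum is confined to [-sqrt(2), sqrt(2)]
print(second_moment)     # ~ 1/2, the second moment of rho(x) = sqrt(2-x^2)/pi
```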
In the next section, we embark on the tougher task of using Edwards-Jones in the correct (quenched) version (without shortcuts). This will require the use of the celebrated replica trick.
16.2 Wigner’s semicircle: quenched calculation
We now use Edwards-Jones in the full-fledged form
ρ(x) = −(2/(πN)) lim_{ε→0⁺} Im ∂/∂x ⟨Log ∫_{R^N} dy exp( −(i/2) yᵀ(x_ε 1 − H) y )⟩ ,   (16.17)
where the average ⟨·⟩ is taken again with respect to ρ[H], i.e. ⟨·⟩ = ∫ dH_11 · · · dH_NN ρ[H] (·), and x_ε = x − iε.
Recall that we cannot perform the y -integral before taking the average over H, otherwise we would be
running the Edwards-Jones formula backwards! On the other hand, we cannot exchange the two integrations
as they stand, due to the logarithm standing right in the middle. How to proceed then?
Using the replica identity in the form
⟨Log Z(x)⟩ = lim_{n→0} (1/n) Log ⟨Z(x)^n⟩ ,   (16.18)
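Before trusting the replica identity on a functional integral, one can convince oneself that it works on a scalar toy model, e.g. a lognormal "partition function" (an illustration of ours; the distribution and the value of n are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
g = rng.normal(loc=0.5, scale=0.7, size=2_000_000)
Z = np.exp(g)                          # toy random "partition function"

quenched = float(np.mean(np.log(Z)))   # <Log Z> (= 0.5 exactly, here)
n = 1e-3
replica = float(np.log(np.mean(Z**n)) / n)   # (1/n) Log <Z^n>

print(quenched, replica)               # replica -> quenched as n -> 0
```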
we replicate the y -integral n (integer) times, and we blindly hope that the analytical continuation to n in the
vicinity of zero makes sense. The formalism and notation we shall use in the following are similar to those
introduced first in [85].
Using again
ρ[H] = ∏_{i=1}^N [ exp(−N H_ii^2/2) / √(2π/N) ] ∏_{i<j} [ exp(−N H_ij^2) / √(π/N) ] ,   (16.19)
we want to compute the replicated partition function
⟨Z(x)^n⟩ = ∫ ∏_{i≤j} dH_ij ∏_{i=1}^N [ e^{−N H_ii^2/2} / √(2π/N) ] ∏_{i<j} [ e^{−N H_ij^2} / √(π/N) ] ×
× ∫_{R^{Nn}} ( ∏_{a=1}^n dy_a ) exp[ −(i/2) ∑_{i,j=1}^N ∑_{a=1}^n y_ia (x_ε δ_ij − H_ij) y_ja ] .   (16.20)
Now that the innermost integral has been “replicated” n times, we can exchange the order of integrations to get
⟨Z(x)^n⟩ = ∫_{R^{Nn}} ( ∏_{a=1}^n dy_a ) e^{−i(x_ε/2) ∑_{i=1}^N ∑_{a=1}^n y_ia^2} ( ∫ ∏_{i=1}^N [dH_ii/√(2π/N)] e^{−N ∑_i H_ii^2/2 + (i/2) ∑_i H_ii ∑_a y_ia^2} ) ×
× ( ∫ ∏_{i<j} [dH_ij/√(π/N)] e^{−N ∑_{i<j} H_ij^2 + i ∑_{i<j} ∑_a y_ia H_ij y_ja} ) .   (16.21)
Neglecting constants, we can perform the two multiple Gaussian integrals involving H using (16.7) repeatedly, with α = N/2 (or N) and γ = (1/2) ∑_{a=1}^n y_ia^2 (or γ = ∑_{a=1}^n y_ia y_ja), to get
⟨Z(x)^n⟩ = ∫_{R^{Nn}} ( ∏_{a=1}^n dy_a ) exp[ −i(x_ε/2) ∑_{i=1}^N ∑_{a=1}^n y_ia^2 − (1/(8N)) ∑_{i=1}^N (∑_{a=1}^n y_ia^2)^2 − (1/(4N)) ∑_{i<j} (∑_{a=1}^n y_ia y_ja)^2 ] ,   (16.22)
which can be more compactly rewritten as
⟨Z(x)^n⟩ = ∫_{R^{Nn}} ( ∏_{a=1}^n dy_a ) exp[ −i(x_ε/2) ∑_{i=1}^N ∑_{a=1}^n y_ia^2 − (1/(8N)) ∑_{i,j=1}^N (∑_{a=1}^n y_ia y_ja)^2 ] .   (16.23)
In order to proceed further, we introduce the following normalized density
µ(→y) = (1/N) ∑_{i=1}^N ∏_{a=1}^n δ(y_a − y_ia) ,   (16.24)
where →y = (y_1, . . . , y_n) is an n-dimensional vector.
You can now check by direct substitution that the second term in the exponential in (16.23) can be rewritten as
−(1/(8N)) ∑_{i,j=1}^N (∑_{a=1}^n y_ia y_ja)^2 = −(N/8) ∫∫ d→y d→w µ(→y) µ(→w) (∑_{a=1}^n y_a w_a)^2 ,   (16.25)
where d→y = ∏_{a=1}^n dy_a.
We can enforce the definition (16.24) using the following functional-integral representation of the identity
1 = ∫ Dµ Dµ̂ exp[ −i ∫ d→y µ̂(→y) ( N µ(→y) − ∑_i ∏_a δ(y_a − y_ia) ) ] ,   (16.26)
which leads to
⟨Z(x)^n⟩ = ∫ Dµ Dµ̂ exp[ −iN ∫ d→y µ(→y) µ̂(→y) − (N/8) ∫∫ d→y d→w µ(→y) µ(→w) (∑_{a=1}^n y_a w_a)^2 ] ×
× ∫_{R^{Nn}} ( ∏_{a=1}^n dy_a ) exp[ −i(x_ε/2) ∑_{i=1}^N ∑_{a=1}^n y_ia^2 + i ∑_i ∫ d→y µ̂(→y) ∏_a δ(y_a − y_ia) ] .   (16.27)
In the above equations DµDµ̂ denotes again functional integration, which was already used in Chapter
4. If you want to know more on this, see [18].
The multiple integral ∫_{R^{Nn}} ( ∏_{a=1}^n dy_a ) (· · ·) is just a collection of N identical copies of a single integral, hence
∫_{R^{Nn}} ( ∏_{a=1}^n dy_a ) exp[ −i(x_ε/2) ∑_{i=1}^N ∑_{a=1}^n y_ia^2 + i ∑_i ∫ d→y µ̂(→y) ∏_a δ(y_a − y_ia) ]
= { ∫_{R^n} d→y exp[ −i(x_ε/2) ∑_{a=1}^n y_a^2 + i ∫ d→w µ̂(→w) ∏_a δ(w_a − y_a) ] }^N
= { ∫_{R^n} d→y exp[ −i(x_ε/2) ∑_{a=1}^n y_a^2 + i µ̂(→y) ] }^N ,   (16.28)
where in the last line we used the n delta functions to kill the multiple integral.
Exponentiating the last line of (16.28), we can eventually write
⟨Z(x)^n⟩ = ∫ Dµ Dµ̂ exp{ N S_n[µ, µ̂; x] } ,   (16.29)
where the action is given by
S_n[µ, µ̂; x] = −i ∫ d→y µ(→y) µ̂(→y) − (1/8) ∫∫ d→y d→w µ(→y) µ(→w) (∑_{a=1}^n y_a w_a)^2
+ Log[ ∫_{R^n} d→y exp( −i(x_ε/2) ∑_{a=1}^n y_a^2 + i µ̂(→y) ) ] .   (16.30)
The expression (16.29) lends itself to a nice saddle-point evaluation for N → ∞. The only catch is that
in doing so we would reverse the right order of limits: instead of taking n → 0 first, and N → ∞ afterwards,
we are going to do the opposite! This procedure is not mathematically justified, but we will proceed as if it
were.
16.2.1 Critical points
Finding the critical points of this action yields the two equations
δS/δµ = 0 ⇒ −i µ̂⋆(→y) = (1/4) ∫ d→w µ⋆(→w) (∑_{a=1}^n y_a w_a)^2 ,   (16.31)
δS/δµ̂ = 0 ⇒ µ⋆(→y) = exp[ −i(x_ε/2) ∑_{a=1}^n y_a^2 + i µ̂⋆(→y) ] / ∫_{R^n} d→y′ exp[ −i(x_ε/2) ∑_{a=1}^n y′_a^2 + i µ̂⋆(→y′) ] .   (16.32)
Inserting (16.32) into (16.31), we get
−i µ̂⋆(→y) = (1/4) [ ∫ d→w exp( −i(x_ε/2) ∑_{a=1}^n w_a^2 + i µ̂⋆(→w) ) (→y · →w)^2 ] / [ ∫ d→w exp( −i(x_ε/2) ∑_{a=1}^n w_a^2 + i µ̂⋆(→w) ) ] ,   (16.33)
where both integrals on the r.h.s. run over R^n.
In order to proceed, we have to make assumptions on the behavior of µ⋆ and µ̂⋆ upon permutation of replica indices. There is a good body of research - although not yet a formal proof - pointing to the exactness of the replica-symmetric high-temperature solution, i.e. the one preserving permutation-symmetry among replicas, and rotational symmetry in the space of replicas.
This simply means that we should look for a solution of (16.31) and (16.32) in the form µ⋆(→y) = µ⋆(y), with y = |→y|, and similarly for µ̂⋆.
Introducing n-dimensional spherical coordinates, we can rewrite (16.33) under the replica-symmetric assumption as
assumption as
4 0 dω ω n−1 exp[− 2i xε ω 2 + iµ̂ ? (ω)]ω 2 0 dφ (sin φ )n−2 (cos φ )2
−iµ̂ ? (y) = , (16.34)
dφ (sin φ )n−2
y2 R ∞ Rπ
0 dω ω n−1 exp[− 2i xε ω 2 + iµ̂ ? (ω)] 0
R∞ Rπ
where φ is taken as the angle between →y and →w, and the other angular integrals cancel out between numerator and denominator.
Performing the remaining angular integrals, and after an integration by parts in the denominator, we get
i µ̂⋆(y) = [ n Γ(n/2) / (2 Γ(1 + n/2)) ] (y^2/4) [ ∫_0^∞ dω ω^{n+1} G(ω) ] / [ ∫_0^∞ dω ω^n G′(ω) ] ,   (16.35)
where G(ω) := exp[ −(i/2) x_ε ω^2 + i µ̂⋆(ω) ]. In the replica limit n → 0, we obtain
i µ̂⋆(y) = (y^2/4) [ ∫_0^∞ dω ω G(ω) ] / [ ∫_0^∞ dω G′(ω) ] = C(x) y^2 ,   (16.36)
where C(x) can be determined self-consistently using
[ ∫_0^∞ dω ω G(ω) ] / [ ∫_0^∞ dω G′(ω) ] = [ ∫_0^∞ dω ω e^{(−(i/2) x_ε + C(x)) ω^2} ] / [ ∫_0^∞ dω 2ω (−(i/2) x_ε + C(x)) e^{(−(i/2) x_ε + C(x)) ω^2} ] = 1/( 2(−(i/2) x_ε + C(x)) ) ,   (16.37)
so that
C(x) = 1/( 8(−(i/2) x_ε + C(x)) )  ⇒  C(x) = (1/4) ( i x_ε ± √(2 − x_ε^2) ) .   (16.38)
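A two-line check that both roots in (16.38) solve the self-consistency condition (the values of x and ε are arbitrary choices of ours):

```python
import numpy as np

x, eps = 0.7, 1e-3
xe = x - 1j * eps                      # x_eps

for sign in (+1.0, -1.0):
    C = 0.25 * (1j * xe + sign * np.sqrt(2 - xe**2 + 0j))
    residual = C - 1 / (8 * (-0.5j * xe + C))
    print(sign, abs(residual))         # ~ 0 for both branches
```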
16.2.2 One step back: summarize and continue
Let us now pause for a second and recap what we are doing. We started from the Edwards-Jones identity
ρ(x) = −(2/(πN)) lim_{ε→0⁺} Im ∂/∂x ⟨Log Z(x)⟩ ,   (16.39)
where
Z(x) = ∫_{R^N} dy exp( −(i/2) yᵀ(x_ε 1 − H) y ) ,   (16.40)
and x_ε = x − iε.
The average of the logarithm is performed by using the replica identity
⟨Log Z(x)⟩ = lim_{n→0} (1/n) Log ⟨Z(x)^n⟩ ,   (16.41)
which in turn (for large N) can be approximated via a saddle-point evaluation from (16.29) as
⟨Z(x)^n⟩ = ∫ Dµ Dµ̂ exp{ N S_n[µ, µ̂; x] } ∼ exp[ N S_n[µ⋆, µ̂⋆; x] ] .   (16.42)
Combining (16.39), (16.41) and (16.42), we obtain
ρ(x) = −(2/π) lim_{ε→0⁺} Im lim_{n→0} (1/n) ∂/∂x S_n[µ⋆, µ̂⋆; x] .   (16.43)
The derivative with respect to x only acts on the last term in the action (16.30), because x appears explicitly (not through µ⋆ or µ̂⋆) only there, and the action is stationary at the saddle point. Taking the derivative and
writing the integral in spherical n-dimensional coordinates, we obtain
ρ(x) = −(2/π) lim_{ε→0⁺} Im lim_{n→0} (1/n) [ −(i/2) ∫_0^∞ dy y^{n+1} e^{−i(x_ε/2) y^2 + C(x) y^2} ] / [ ∫_0^∞ dy y^{n−1} e^{−i(x_ε/2) y^2 + C(x) y^2} ] .   (16.44)
Performing the integrals and simplifying,
ρ(x) = (1/π) lim_{ε→0⁺} Re[ 1/(−2C(x) + i x_ε) ] .   (16.45)
Recalling that C(x) = (1/4)( i x_ε ± √(2 − x_ε^2) ) and x_ε = x − iε, we can first extract the real and imaginary part of C(x) using the lemma in (16.15). Therefore we can write
C(x) = P_ε(x) + i Q_ε(x) ,   (16.46)
with
P_ε(x) = ε/4 ± (1/(4√2)) √( (2 − x^2 + ε^2) + √((2 − x^2 + ε^2)^2 + (2εx)^2) ) ,   (16.47)
Q_ε(x) = x/4 ± (sign(2εx)/(4√2)) √( √((2 − x^2 + ε^2)^2 + (2εx)^2) − (2 − x^2 + ε^2) ) .   (16.48)
Hence
Re[ 1/(−2C(x) + i x_ε) ] = ( ε − 2P_ε(x) ) / ( (ε − 2P_ε(x))^2 + (x − 2Q_ε(x))^2 ) .   (16.49)
In the limit ε → 0⁺ and for −√2 < x < √2, P_ε and Q_ε converge to
P_0(x) = ± √(2 − x^2)/4 ,   (16.50)
Q_0(x) = x/4 ,   (16.51)
from which (choosing the minus sign in (16.50), so that the density is non-negative)
ρ(x) = (1/π) √(2 − x^2) ,   (16.52)
i.e. Wigner's semicircle law as expected.
Chapter 17
Born to be free
We have so far dealt with the spectral properties of individual random matrix ensembles. You may have
been wondering (or not) what happens when you sum or multiply random matrices belonging to different
ensembles. In this Chapter we present an overview of the rather complicated tool you will need to tackle
this problem: free probability theory [86, 88].
17.1 Things about probability you probably already know
Two random variables X1 and X2 , with pdfs ρ1 and ρ2 , are said to be statistically independent when the
combined random variable (X1 , X2 ) has a factorized jpdf of the form
ρ1,2 (x1 , x2 ) = ρ1 (x1 )ρ2 (x2 ) . (17.1)
Statistical independence means that averages factorize as well (hX1 X2 i = hX1 ihX2 i), which in turn means
that their covariance is zero, and is key to finding the distribution of the sum of random variables. Let us
consider a random variable X with pdf ρ(x). Its characteristic function ϕ(t) is defined as
ϕ(t) = ⟨e^{itX}⟩ = ∫ dx ρ(x) e^{itx} ,   (17.2)
i.e. it is the Fourier transform of its pdf.
You should easily realize that the factorized jpdf in equation (17.1) implies that characteristic func-
tions are multiplicative upon the addition of statistically independent random variables, i.e. ϕ1,2 (t1 ,t2 ) =
ϕ1 (t1 )ϕ2 (t2 ). Even more simply, we can introduce the logarithm of the characteristic function h(t) =
log ϕ(t), the so called cumulant generating function, which is obviously additive upon the addition of ran-
dom variables:
h1,2 (t1 ,t2 ) = h1 (t1 ) + h2 (t2 ) . (17.3)
Therefore, the problem of finding the pdf of the sum of two independent random variables X1 and X2 reduces to a simple “algorithm”: compute the characteristic functions of X1 and X2 from their pdfs, form the cumulant generating function of the sum X1 + X2 via the additive law (17.3), compute the corresponding characteristic function via exponentiation, and eventually compute the pdf of the sum X1 + X2 via inverse Fourier transform.
17.2 Freeness
So, is there a generalization of statistical independence that will allow us to compute the eigenvalue spec-
trum of sums of random matrices? At first it might be tempting to guess that the statistical independence
of two scalar random variables could be straightforwardly generalized to the case of two random matrices
X1 and X2 by merely requiring the mutual independence of all entries. Unfortunately, this is not the case,
as independent entries are not enough to destroy all possible angular correlations between the eigenbases of
two matrices.
The property that generalizes statistical independence to random matrices is that of freeness. The theory of free probability was initiated in the 1980s by the pioneering works of Voiculescu and Speicher as an abstract approach to von Neumann algebras, and only later was it shown to have a concrete realization in terms of random matrices.
Here is how freeness works. Let us consider two N × N random matrices X1 and X2, and let us introduce the following operator:
τ(X) = lim_{N→∞} (1/N) Tr(X) .   (17.4)
The two matrices X1 and X2 are said to be free if for all integers n1, m1, n2, m2, . . . ≥ 1 we have
τ[ (X1^{n1} − τ(X1^{n1})) (X2^{m1} − τ(X2^{m1})) (X1^{n2} − τ(X1^{n2})) (X2^{m2} − τ(X2^{m2})) · · · ] =
τ[ (X2^{n1} − τ(X2^{n1})) (X1^{m1} − τ(X1^{m1})) (X2^{n2} − τ(X2^{n2})) (X1^{m2} − τ(X1^{m2})) · · · ] = 0 .   (17.5)
Not so straightforward, is it?
It might help to put the above definition into words. Two random matrices are free if the traces of all non-
commutative products of matrix polynomials, whose traces are zero, are zero. Still not very intuitive, right?
Well, unfortunately it does not get much better than that, but some intuition can be gained by exploring some
concrete examples of the above definition. For example, equation (17.5) reduces to τ(X1 X2) = τ(X1)τ(X2) when n1 = m1 = 1, and it reduces to τ(X1^2 X2^2) = τ(X1^2)τ(X2^2) when n1 = m1 = 2. As you should quickly realize,
these equations generalize the moment factorization rules for statistically independent variables, and you
can verify that all such relations for higher order moments can be obtained from equation (17.5).
However, the interesting part comes into play when we explore cases in which matrix non-commutativity kicks in. For example, you can easily work out the following result from (17.5) for n1 = n2 = m1 = m2 = 1:
τ(X1 X2 X1 X2) = τ^2(X1) τ(X2^2) + τ(X1^2) τ^2(X2) − τ^2(X1) τ^2(X2) .   (17.6)
This result has no counterpart in “conventional” probability theory. Hopefully, this will convince you that
freeness essentially represents a generalization of moment factorization.
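Asymptotic freeness can be probed numerically: a GOE matrix and an independently rotated diagonal matrix become free as N → ∞, so (17.6) should hold up to O(1/N) corrections (a sketch of ours; sizes and the spectrum of X2 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 600

def tau(X):
    return float(np.trace(X).real) / N

A = rng.standard_normal((N, N))
X1 = (A + A.T) / (2 * np.sqrt(N))              # GOE, edges at +-sqrt(2)

Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
D = np.diag(rng.uniform(0.0, 2.0, size=N))
X2 = Q @ D @ Q.T                               # randomly rotated spectrum

lhs = tau(X1 @ X2 @ X1 @ X2)
rhs = (tau(X1)**2 * tau(X2 @ X2)
       + tau(X1 @ X1) * tau(X2)**2
       - tau(X1)**2 * tau(X2)**2)
print(lhs, rhs)                                # agree up to O(1/N)
```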
17.3 Free addition
Let us now put freeness to work.
Suppose we want to compute the average spectral density of the sum of large (i.e. N → ∞) random
matrices belonging to two different ensembles.
The first ingredient we need is our old friend the resolvent, which we introduced in Chapter 8. Now, given the resolvent G_∞^(av)(z) of a given ensemble, let us introduce its functional inverse B(z):
G_∞^(av)(B(z)) = B(G_∞^(av)(z)) = z .   (17.7)
The above function is known as the Blue function [87]. In case you are wondering: yes, it is called Blue
because it is the inverse of the Green’s function.
The last ingredient we need is the so-called R-transform. Blue functions usually display a singular
behavior at the origin, and the R-transform is defined as the Blue function minus its singular part:

R(z) = B(z) − 1/z . (17.8)
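As a sanity check on these two definitions, we can run them symbolically on the one resolvent we know in closed form: the semicircle. The following Python/sympy sketch (the branch choice and sample point are our own) verifies the inversion (17.7) for the unit-variance semicircle law and then strips the 1/z pole as in (17.8), leaving a linear R-transform:

```python
import sympy as sp

z = sp.symbols('z', positive=True)

# Resolvent of the unit-variance semicircle law (support [-2, 2]),
# on the branch where G(x) ~ 1/x at infinity
def G(x):
    return (x - sp.sqrt(x**2 - 4)) / 2

# Candidate Blue function, i.e. the functional inverse of G (cf. (17.7))
B = z + 1/z

# Verify G(B(z)) = z at a sample point inside the branch's domain (0 < z < 1)
z0 = sp.Rational(1, 3)
assert sp.simplify(G(B.subs(z, z0)) - z0) == 0

# R-transform via (17.8): subtract the singular part 1/z
R = sp.simplify(B - 1/z)
print(R)  # R(z) = z: linear, as expected for a semicircular distribution
```

A semicircle of variance σ^2 has R(z) = σ^2 z, which is consistent with the R-transform of the GOE quoted later in this chapter once the GOE variance is accounted for.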
We are all set now. Let us consider random matrices X1 and X2 belonging to ensembles characterized
by the resolvents G∞,1^(av)(z) and G∞,2^(av)(z), respectively. Let us form, through equations (17.7) and (17.8), the
corresponding R-transforms R1 and R2. The R-transform of the sum X = X1 + X2 is then simply given by
the sum of the two R-transforms:
R(z) = R1 (z) + R2 (z) . (17.9)
The above addition rule is the free counterpart of (17.3) for the moment generating functions of statisti-
cally independent random variables. Just like in that case, it provides a simple addition "algorithm" for free
random matrices: once the R-transform of the sum has been computed, the corresponding Blue function and
resolvent can be recovered through equations (17.8) and (17.7), and the eigenvalue density then follows from
the resolvent via equation (8.8).
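The whole algorithm fits in a few symbolic lines. As an illustration (a Python/sympy sketch; we pick the free sum of two unit-variance semicircular matrices as our example), the pipeline R1 + R2 → B → G → ρ can be run end to end:

```python
import sympy as sp

z, g = sp.symbols('z g')

# Steps 1-2: the R-transforms of two unit-variance semicircles add up, eq. (17.9)
R = z + z
# Step 3: back to the Blue function, eq. (17.8) read backwards
B = R + 1/z
# Step 4: the resolvent satisfies B(G(z)) = z, eq. (17.7); here 2g^2 - z*g + 1 = 0
G_sols = sp.solve(sp.Eq(B.subs(z, g), z), g)
# Step 5: density via rho(x) = Im G(x)/pi, picking the branch with Im G > 0
x0 = 0
rho0 = max(sp.im(s.subs(z, x0)) for s in G_sols) / sp.pi
print(rho0)  # sqrt(2)/(2*pi)
```

The density at x = 0 comes out as √2/(2π), exactly the value at the origin of a semicircle of variance 2: freely adding two semicircles gives a wider semicircle, a fact we will meet again at the end of the chapter.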
17.4 Do it yourself
Enough with theory now: let us see free calculus at work on a concrete example.
All we need are large Hermitian random matrix ensembles whose spectral properties we know well. So,
how about the eigenvalue density of the free sum of some of our usual suspects? For example, let us consider
a mixture of a GOE matrix H and a Wishart matrix W,
S = pH + (1 − p)W , (17.10)
where p ∈ [0, 1].
For both ensembles we have already computed the resolvents (equations (8.20) and (14.9) with K =
1/α). The functional inverses of those functions yield the Blue functions via equation (17.7), and the R-
transforms are then immediately obtained via equation (17.8). Please verify that, for the GOE and Wishart
ensembles respectively, they are given by

R_GOE(z) = z/2 ,    R_W(z) = 1/(α(1 − z)) . (17.11)
Using the R-transform's scaling property R_cH(z) = c R_H(cz) (see the box below), we can adapt the addition
rule (17.9) to the present problem as follows:

R_S(z) = p R_GOE(pz) + (1 − p) R_W((1 − p)z) . (17.12)
Plugging the functions in (17.11) into the equation above gives

R_S(z) = p^2 z/2 + (1 − p)/(α(1 − (1 − p)z)) , (17.13)
and the corresponding resolvent is obtained from B_S(G_S(z)) = z, where B_S(z) = R_S(z) + 1/z is the Blue func-
tion. The equation for the resolvent reads

z = p^2 G_S(z)/2 + (1 − p)/(α(1 − (1 − p)G_S(z))) + 1/G_S(z) . (17.14)
This is a third-degree equation in G_S(z), yielding, for a given fixed z, one real solution and two complex
conjugate ones in general. The relationship between the eigenvalue density and the resolvent is the one in
equation (8.8), which tells us to select the solution with a positive imaginary part. All this is done, and
numerically verified, in the code [♠ GOE_Wishart_Sum.m]. An example of the output is shown in
Fig. 17.1.
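For readers without MATLAB, the same computation can be sketched in a few lines of Python/NumPy. This is only an illustration under stated assumptions: we read the Wishart R-transform as R_W(z) = 1/(α(1 − z)), the cubic coefficients come from clearing denominators in (17.14), and the parameter values p = α = 1/2 are our own choice:

```python
import numpy as np

# Illustrative parameters (our choice): mixture weight p and Wishart parameter alpha
p, alpha = 0.5, 0.5
q, c = 1.0 - p, 1.0 / alpha

xs = np.linspace(-2.0, 6.0, 1601)
rho = np.zeros_like(xs)
for i, x in enumerate(xs):
    # Clearing denominators in (17.14) gives a cubic in g = G_S(x):
    # -(p^2 q / 2) g^3 + (p^2/2 + x q) g^2 + (q c - q - x) g + 1 = 0
    coeffs = [-p**2 * q / 2, p**2 / 2 + x * q, q * c - q - x, 1.0]
    roots = np.roots(coeffs)
    g = roots[np.argmax(roots.imag)]   # select the branch with Im g > 0
    rho[i] = max(g.imag, 0.0) / np.pi  # density via eq. (8.8)

dx = xs[1] - xs[0]
print(np.sum(rho) * dx)        # ~1: the density is correctly normalized
print(np.sum(xs * rho) * dx)   # ~(1 - p)/alpha: the mean inherited from (1 - p)W
```

The two printed checks are the natural sanity tests: the resulting density integrates to one, and its mean equals (1 − p)/α, the contribution of the Wishart part of the mixture (the GOE part has zero mean).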
Our goal in this Chapter was just to provide you with a short overview of the powerful tools free proba-
bility has to offer. There are plenty of papers out there if you’d like to know more. For example, you might
have a look at the nice review article in [89], which also details some of the many applications that free
random matrices have in quantitative finance.
[Plot: ρ(x) versus x for p = 0.3, 0.5 and 0.7]
Figure 17.1: Numerical check of the density obtained for the free addition of GOE and Wishart random
matrices from the solution of Eq. (17.14). The examples shown refer to different values of the parameter p
that quantifies the relative weight between the two ensembles (see Eq. (17.10)).
Question. Where does the scaling property of the R-transform come from?
▸ It is inherited from the scaling property of our good old friend the resol-
vent. Indeed, multiplying a matrix H by a constant c rescales its eigenvalues by
the same factor c. Hence, from Eqs. (8.4) and (8.5) it is easy to prove that the two
corresponding resolvents are related to each other through the simple relationship
G∞,cH^(av)(z) = G∞,H^(av)(z/c)/c. We can then write the equation for the Blue function B_cH:

z = G∞,cH^(av)(B_cH(z)) = (1/c) G∞,H^(av)(B_cH(z)/c) , (17.15)

which shows that B_cH(z) = c B_H(cz). We then have the following for the corre-
sponding R-transforms:

c R_H(cz) = c B_H(cz) − 1/z = B_cH(z) − 1/z = R_cH(z) . (17.16)
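If you want to double-check the scaling argument without pen and paper, here is a Python/sympy sketch for the semicircle case (the candidate Blue function and the sample evaluation point are our own choices):

```python
import sympy as sp

z, c = sp.symbols('z c', positive=True)

# Resolvents of H (unit-variance semicircle) and of cH (variance c^2),
# on the branch behaving as 1/x at infinity
G_H  = lambda x: (x - sp.sqrt(x**2 - 4)) / 2
G_cH = lambda x: (x - sp.sqrt(x**2 - 4*c**2)) / (2*c**2)

B_H  = z + 1/z               # Blue function of H
B_cH = c * B_H.subs(z, c*z)  # the scaling rule predicts B_cH(z) = c*B_H(cz)

# Check G_cH(B_cH(z)) = z at a sample point in the branch's domain
sample = {z: sp.Rational(1, 3), c: 2}
assert sp.simplify(G_cH(B_cH).subs(sample) - sample[z]) == 0

# Hence R_cH(z) = B_cH(z) - 1/z = c^2*z = c*R_H(c*z), as claimed
R_cH = sp.simplify(B_cH - 1/z)
print(R_cH)
```

Since R_H(z) = z for the unit-variance semicircle, the printed result c^2 z is exactly c R_H(cz), confirming (17.16) in this special case.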
Question. We know that the sum of two Gaussian scalar random variables is again
Gaussian distributed. Is there an equivalent statement for the free addition of
Gaussian random matrices?
▸ Given the tools provided in this Chapter, you should be able to show that
the semicircle distribution is stable under free addition: if you freely sum M
matrices, each having a semicircular spectral density, you still end up with a
matrix whose spectral density is a semicircle.
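A quick numerical experiment supports this (a Python/NumPy sketch; the GOE sampler and the moment-ratio test are our own choices). A semicircle of variance σ^2 has fourth moment 2σ^4, so the ratio m4/m2^2 equals 2 for any semicircle, regardless of its width:

```python
import numpy as np

rng = np.random.default_rng(7)
N, M = 400, 4

def goe(n):
    """A GOE-like real symmetric matrix with eigenvalues of order one."""
    A = rng.standard_normal((n, n))
    return (A + A.T) / (2.0 * np.sqrt(n))

# Free-sum M independent semicircular matrices, rescaled to keep the variance fixed
S = sum(goe(N) for _ in range(M)) / np.sqrt(M)

lam = np.linalg.eigvalsh(S)
m2, m4 = np.mean(lam**2), np.mean(lam**4)
print(m4 / m2**2)  # ~2: the moment-ratio fingerprint of a semicircular density
```

Independent GOE matrices are asymptotically free, so the rescaled sum S should again be semicircular, and the measured moment ratio indeed stays close to 2.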
Bibliography
[1] P. Šeba, J. Stat. Mech. L10002 (2009).
[2] P.L. Hsu, Ann. Hum. Genet. 9, 250 (1939).
[3] C.E. Porter and N. Rosenzweig, Ann. Acad. Sci. Fennicae, Serie A VI Physica 6, 44 (1960), reprinted
in [4].
[4] C.E. Porter, Statistical Theories of Spectra: Fluctuations (Academic Press, New York, 1965).
[5] P.J. Forrester and S.O. Warnaar, Bull. Amer. Math. Soc. (N.S.) 45, 489 (2008).
[6] R. Kühn, J. Phys. A: Math. Theor., 41, 295002 (2008).
[7] P. Cizeau and J.P. Bouchaud, Phys. Rev. E, 50, 1810 (1994).
[8] A.D. Mirlin, Y.V. Fyodorov, F.-M. Dittes, J. Quezada, and T.H. Seligman, Phys. Rev. E 54, 3221
(1996).
[9] H. Weyl, Classical Groups (Princeton Univ. Press, Princeton, 1946).
[10] K.A. Muttalib, Y. Chen, M.E.H. Ismail, and V.N. Nicopoulos, Phys. Rev. Lett. 71, 471 (1993).
[11] A. Borodin, Nuclear Physics B 536, 704 (1998).
[12] P. Desrosiers and P.J. Forrester, J. Approx. Theory 152, 167 (2008).
[13] J.T. Albrecht, C.P. Chan, and A. Edelman, Found. Comput. Math. 9, 461 (2008).
[14] Y.V. Fyodorov, Introduction to the Random Matrix Theory: Gaussian Unitary Ensemble and Beyond
(2004), https://arxiv.org/pdf/math-ph/0412017.pdf .
[15] M. Zirnbauer, Symmetry classes in random matrix theory (2004), https://arxiv.org/pdf/
math-ph/0404058.pdf .
[16] F.J. Dyson, J. Math. Phys. 3, 140 (1962).
[17] E. Wigner, in Statistical properties of real symmetric matrices with many dimensions, Canadian Math-
ematical Congress Proceedings (University of Toronto Press, Toronto, 1957), p. 174.
[18] R. MacKenzie, Path integral methods and applications (2000), https://arxiv.org/abs/
quant-ph/0004090.
[19] J. Rammer, Quantum field theory of non-equilibrium states (Cambridge University Press, 2007).
[20] R. Wong, Asymptotic approximation of integrals: Computer Science and Scientific Computing (Aca-
demic Press, 1989).
[21] E. Sandier and S. Serfaty, Annals Probab. 43, 2026 (2015).
[22] E. Engel and R.M. Dreizler, Density Functional Theory: An Advanced Course (Springer, Heidelberg,
2011).
[23] F.G. Tricomi, Integral Equations (Dover publications, 1985).
[24] L. Erdős, Russ. Math. Surv. 66, 507 (2011).
[25] I. Dumitriu and A. Edelman, J. Math. Phys. 43, 5830 (2002).
[26] R. Marino, S.N. Majumdar, G. Schehr, and P. Vivo, Phys. Rev. Lett. 112, 254101 (2014).
[27] D.S. Dean and S.N. Majumdar, Phys. Rev. E 77, 041108 (2008).
[28] G. Akemann, G. M. Cicuta, L. Molinari, and G. Vernizzi, Phys. Rev. E 59, 1489 (1999).
[29] G. Akemann, G. M. Cicuta, L. Molinari, and G. Vernizzi, Phys. Rev. E 60, 5287 (1999).
[30] R.J. Muirhead, Aspects of multivariate statistical theory (John Wiley & Sons, 2009).
[31] A. Edelman, Eigenvalues and Condition Numbers of Random Matrices, MIT Ph.D. Dissertation
(1989).
[32] A. Edelman, Finite random matrix theory. Jacobians of matrix transforms (without wedge products)
(2005), http://web.mit.edu/18.325/www/handouts/handout2.pdf.
[33] A.M. Mathai, Jacobians of matrix transformations and functions of matrix arguments (World Scien-
tific Publishing Co Inc, 1997).
[34] B. Ycart, Revue d’Histoire des Mathématiques 9, 43 (2013).
[35] A. Edelman and N. Raj Rao, Acta Numerica 14, 233 (2005).
[36] A. Zee, Quantum Field Theory in a nutshell, 2nd edn. (Princeton Univ. Press, Princeton, 2010).
[37] S. Rabinowitz, Mathematics and Informatics Quarterly 3, 54-56 (1993).
[38] R. Abou-Chacra, P.W. Anderson, and D.J. Thouless, J. Phys. C 6, 1734 (1973).
[39] F.L. Metz, I. Neri, and D. Bollé, Phys. Rev. E 82, 031135 (2010).
[40] R. Allez, J.-P. Bouchaud, and A. Guionnet, Phys. Rev. Lett. 109, 094102 (2012).
[41] R. Allez, J.-P. Bouchaud, S.N. Majumdar, and P. Vivo, J. Phys. A: Math. Theor. 46, 015001 (2013).
[42] J.-P. Blaizot and M.A. Nowak, Phys. Rev. E 82, 051115 (2010).
[43] J. Jurkiewicz, G. Łukaszewski, and M.A. Nowak, Acta Phys. Pol. B 39, 799 (2008).
[44] J. Bouttier, Matrix integrals and enumeration of maps, in The Oxford Handbook of Random Matrix
Theory, ed. by G. Akemann, J. Baik, and P. Di Francesco (Oxford Univ. Press, Oxford, 2011).
[45] C.E. Porter and R.G. Thomas, Phys. Rev. 104, 483 (1956).
[46] T.A. Brody, J. Flores, J.B. French, P.A. Mello, A. Pandey, and S.S.M. Wong, Rev. Mod. Phys. 53, 385
(1981).
[47] J.T. Chalker and B. Mehlig, Phys. Rev. Lett. 81, 3367 (1998).
[48] B. Mehlig and J.T. Chalker, J. Math. Phys. 41, 3233 (2000).
[49] Y.V. Fyodorov, On statistics of bi-orthogonal eigenvectors in real and complex Ginibre ensem-
bles: Combining partial Schur decomposition with supersymmetry (2017), https://arxiv.
org/abs/1710.04699.
[50] K. Truong and A. Ossipov, Europhys. Lett. 116, 37002 (2016).
[51] Y.V. Fyodorov, Introduction to the Random Matrix Theory: Gaussian Unitary Ensemble and Beyond
(2004), https://arxiv.org/pdf/math-ph/0412017.pdf.
[52] G. Mahoux and M.L. Mehta, J. Phys. I France 1, 1093 (1991).
[53] E. Kanzieper and G. Akemann, Phys. Rev. Lett. 95, 230201 (2005); G. Akemann and E. Kanzieper,
J. Stat. Phys. 129, 1159 (2007).
[54] G. Akemann and L. Shifrin, J. Phys. A : Math. Gen. 40, F785 (2007).
[55] W. Van Assche, Asymptotics for orthogonal polynomials (Springer, 2006).
[56] C. Andréief, Mem. de la Soc. Sci. de Bordeaux 2, 1 (1883).
[57] N.G. de Bruijn, J. Indian Math. Soc. 19, 133 (1955).
[58] N.S. Witte and P.J. Forrester, Random Matrices: Theory Appl. 01, 1250010 (2012).
[59] P.J. Forrester and N.S. Witte, Commun. Math. Phys. 219, 357 (2001).
[60] E. Kanzieper, Phys. Rev. Lett. 89, 250201 (2002).
[61] E. Kanzieper, Constructive Approximation 41, 615 (2015).
[62] A. Hurwitz, Nachr. Ges. Wiss. Göttingen, 71 (1897).
[63] P. Diaconis and P.J. Forrester, Random Matrices: Theory Appl. 06, 1730001 (2017).
[64] J. Wishart, Biometrika 20A, 32 (1928).
[65] J.J.M. Verbaarschot, Applications of Random Matrix Theory to QCD, in The Oxford Handbook of
Random Matrix Theory, edited by G. Akemann, J. Baik, and P. Di Francesco (Oxford Univ. Press,
Oxford, 2011).
[66] F. Kleefeld and M. Dillig, Trace evaluation of matrix determinants and inversion of 4 × 4 matrices in
terms of Dirac covariants (1998), https://arxiv.org/pdf/hep-ph/9806415.pdf.
[67] R.A. Janik and M.A. Nowak, J. Phys. A: Math. Gen. 36, 3629 (2003).
[68] V.A. Marčenko and L.A. Pastur, Math. USSR-Sb 1, 457 (1967).
[69] Y.V. Fyodorov, Nucl. Phys. B 621, 643 (2002).
[70] E. Kanzieper and N. Singh, J. Math. Phys. 51, 103510 (2010).
[71] G. Livan and P. Vivo, Acta Phys. Pol. B 42, 1081 (2011).
[72] P.J. Forrester, N.E. Frankel, and T.M. Garoni, J. Math. Phys. 47, 023301 (2006).
[73] G. Akemann, Acta Phys. Pol. B 42, 0901 (2011).
[74] J. Jurkiewicz, G. Łukaszewski, and M. Nowak, Acta Phys. Pol. B 39, 799 (2008).
[75] S.N. Majumdar, Extreme Eigenvalues of Wishart Matrices: Application to Entangled Bipartite Sys-
tem, in The Oxford Handbook of Random Matrix Theory, edited by G. Akemann, J. Baik, and P. Di
Francesco (Oxford Univ. Press, Oxford, 2011).
[76] K. Splittorff and J.J.M. Verbaarschot, Lessons from Random Matrix Theory for QCD at Finite Density
(2008), https://arxiv.org/pdf/0809.4503.pdf.
[77] S. Ghosh and A. Pandey, Phys. Rev. E 65, 046221 (2002).
[78] M.L. Mehta, Random matrices (Academic press, 1967).
[79] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters, Phys. Rev. Lett. 83, 1467 (1999).
[80] V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral, and H.E. Stanley, Phys. Rev. Lett. 83, 1471
(1999).
[81] J. Bun, J.-P. Bouchaud, and M. Potters, Phys. Rep. 666, 1 (2017).
[82] S.F. Edwards and R.C. Jones, J. Phys. A: Math. Gen. 9, 1595 (1976).
[83] T. Castellani and A. Cavagna, J. Stat. Mech., P05012 (2005).
[84] F. Zamponi, Mean field theory of spin glasses (2014), https://arxiv.org/pdf/1008.
4844.pdf.
[85] G.J. Rodgers and A. J. Bray, Phys. Rev. B 37, 3557 (1988).
[86] D.V. Voiculescu, J. Oper. Theory 18, 223 (1987).
[87] A. Zee, Nucl. Phys. B 474, 726 (1996).
[88] A. Nica and R. Speicher, Duke Math. J. 92, 553 (1998).
[89] Z. Burda, A. Jarosz, M. A. Nowak, J. Jurkiewicz, G. Papp, and I. Zahed, Quant. Fin. 11, 1103 (2011).