A First Course in Random Matrix Theory
Physicists and engineers perceive the real world through data, models and algorithms. Data is noisy by nature, and classical statistical tools have so far been successful in dealing with relatively small levels of randomness. The recent emergence of Big Data, and of the computing power required to analyze it, has rendered classical tools outdated and insufficient. Tools such as random matrix theory and the study of large sample covariance matrices can efficiently process these big datasets and help make sense of modern deep learning algorithms. Presenting an introductory calculus course for random matrices, the book focuses on modern concepts in matrix theory, generalizing the standard concept of probabilistic independence to non-commuting random variables. Concretely worked-out examples and applications to financial engineering and portfolio construction make this unique book an essential tool for physicists, engineers, data analysts and economists.
Marc Potters
Capital Fund Management, Paris
Jean-Philippe Bouchaud
Capital Fund Management, Paris
www.cambridge.org
Information on this title: www.cambridge.org/9781108488082
DOI: 10.1017/9781108768900
© Cambridge University Press 2021
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2021
Printed in the United Kingdom by TJ Books Limited
A catalogue record for this publication is available from the British Library.
Library of Congress Cataloging-in-Publication Data
Names: Potters, Marc, 1969– author. | Bouchaud, Jean-Philippe, 1962– author.
Title: A first course in random matrix theory : for physicists, engineers
and data scientists / Marc Potters, Jean-Philippe Bouchaud.
Description: Cambridge ; New York, NY : Cambridge University Press, 2021. |
Includes bibliographical references and index.
Identifiers: LCCN 2020022793 (print) | LCCN 2020022794 (ebook) |
ISBN 9781108488082 (hardback) | ISBN 9781108768900 (epub)
Subjects: LCSH: Random matrices.
Classification: LCC QA196.5 .P68 2021 (print) | LCC QA196.5 (ebook) |
DDC 512.9/434–dc23
LC record available at https://lccn.loc.gov/2020022793
LC ebook record available at https://lccn.loc.gov/2020022794
ISBN 978-1-108-48808-2 Hardback
Additional resources for this title at www.cambridge.org/potters
Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
Contents
Preface page ix
List of Symbols xiv
Part I Classical Random Matrix Theory 1
1 Deterministic Matrices 3
1.1 Matrices, Eigenvalues and Singular Values 3
1.2 Some Useful Theorems and Identities 9
2 Wigner Ensemble and Semi-Circle Law 15
2.1 Normalized Trace and Sample Averages 16
2.2 The Wigner Ensemble 17
2.3 Resolvent and Stieltjes Transform 19
3 More on Gaussian Matrices* 30
3.1 Other Gaussian Ensembles 30
3.2 Moments and Non-Crossing Pair Partitions 36
4 Wishart Ensemble and Marčenko–Pastur Distribution 43
4.1 Wishart Matrices 43
4.2 Marčenko–Pastur Using the Cavity Method 48
5 Joint Distribution of Eigenvalues 58
5.1 From Matrix Elements to Eigenvalues 58
5.2 Coulomb Gas and Maximum Likelihood Configurations 64
5.3 Applications: Wigner, Wishart and the One-Cut Assumption 69
5.4 Fluctuations Around the Most Likely Configuration 73
5.5 An Eigenvalue Density Saddle Point 78
6 Eigenvalues and Orthogonal Polynomials* 83
6.1 Wigner Matrices and Hermite Polynomials 83
6.2 Laguerre Polynomials 87
6.3 Unitary Ensembles 91
Index 347
Preface
Physicists have always approached the world through data and the models inspired by it. They build models from data and confront these models with the data generated by new experiments or observations. But real data is by nature noisy and, until recently, classical statistical tools have been successful in dealing with this randomness. The recent emergence of very large datasets, together with the computing power to analyze them, has created a situation where not only the number of data points is large but so is the number of variables under study. Classical statistical tools are inadequate in this situation, called the large dimension limit (or the Kolmogorov limit). Random matrix theory, and in particular the study of large sample covariance matrices, can help make sense of these big datasets, and is in fact also becoming a useful tool for understanding deep learning. Random matrix theory is also linked to many modern problems in statistical physics, such as the spectral theory of random graphs, interaction matrices of spin glasses, non-intersecting random walks, many-body localization, compressed sensing and many more.
This book can be considered one more book on random matrix theory, but our aim was to keep it purposely introductory and informal. As an analogy, high school seniors and college freshmen are typically taught both calculus and analysis. In analysis one learns how to write rigorous proofs and how to properly define a limit and a derivative. At the same time, in calculus one learns how to compute complicated derivatives and multi-dimensional integrals, and how to solve differential equations, relying only on intuitive definitions (with precise rules) of these concepts. This book proposes a “calculus” course for random matrices, based in particular on the relatively new concept of “freeness”, which generalizes the standard concept of probabilistic independence to non-commuting random variables.
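To give a first taste of what freeness buys, here is a minimal numerical sketch of our own (not a construction taken from the book): two independent Wigner matrices are asymptotically free, and the spectrum of their sum is the free convolution of two semicircles, which is again a semicircle whose variance is the sum of the two variances.

```python
import numpy as np

def wigner(n, sigma, rng):
    """Symmetric Gaussian (Wigner) matrix; its spectrum converges to a
    semicircle of radius 2*sigma, i.e. spectral variance sigma**2."""
    h = rng.normal(scale=sigma, size=(n, n)) / np.sqrt(n)
    return (h + h.T) / np.sqrt(2)

rng = np.random.default_rng(0)
n = 2000
a = wigner(n, 1.0, rng)  # semicircle with variance 1
b = wigner(n, 2.0, rng)  # semicircle with variance 4

# Free addition in action: the spectrum of a + b is again a semicircle,
# with variance 1 + 4 = 5 and edges at +/- 2*sqrt(5) ~ 4.47.
eigs = np.linalg.eigvalsh(a + b)
print("largest eigenvalue:", eigs.max())        # close to 4.47
print("spectral variance :", np.mean(eigs**2))  # close to 5
```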
Rather than making statements about the most general case, we define concepts under strong hypotheses (e.g. Gaussian entries, real symmetric matrices) in order to simplify the computations and favor understanding. Precise notions of norm, topology, convergence and exact domains of application are left out, again to favor intuition over rigor. There are many good, mathematically rigorous books on the subject (see references below), and our hope is that this book will allow the interested reader to approach them guided by his or her newly built intuition.
Readership
The book was initially conceived as a textbook for a standard 30-hour graduate course in random matrix theory, for physicists or applied mathematicians, given by one of us (MP) during a sabbatical at UCLA in 2017–2018. As the book evolved, many new developments, special topics and applications were included. Lecturers can thus customize their course by complementing the first few essential chapters with their own choice of chapters or sections from the rest of the book.
Another group of potential readers are seasoned researchers analyzing large datasets, who have heard that random matrix theory may help them distinguish signal from noise in singular value decompositions or in the eigenvalues of sample covariance matrices. They have heard of the Marčenko–Pastur distribution but do not know how to extend it to more realistic settings where they might face non-Gaussian noise, true outliers, temporal (sample) correlations, etc. They need formulas to compute null hypotheses, and they want to understand where these formulas come from intuitively, without requiring fully precise mathematical proofs.
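As a concrete illustration, here is a minimal sketch of our own (assuming unit-variance Gaussian noise, not a recipe from the book) of the simplest such null hypothesis: for a pure-noise sample covariance matrix with aspect ratio q = N/T, the Marčenko–Pastur law confines all eigenvalues to the interval [(1 − √q)², (1 + √q)²], so empirical eigenvalues beyond the upper edge are candidate signals.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 500, 2000             # number of variables, sample size
q = N / T                    # aspect ratio q = N/T

# Pure-noise data: T samples of N iid unit-variance variables.
x = rng.normal(size=(N, T))
E = x @ x.T / T              # sample covariance (true covariance = identity)
eigs = np.linalg.eigvalsh(E)

# Marcenko-Pastur bulk edges for unit-variance noise.
lam_minus, lam_plus = (1 - np.sqrt(q))**2, (1 + np.sqrt(q))**2
print(f"MP edges        : [{lam_minus:.3f}, {lam_plus:.3f}]")
print(f"empirical range : [{eigs.min():.3f}, {eigs.max():.3f}]")
# Eigenvalues above lam_plus would be flagged as potential signal.
```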
The reader is assumed to have a background in the undergraduate mathematics taught in science and engineering: linear algebra, complex variables and probability theory. Important results from probability theory are recalled in the book (addition of independent variables, law of large numbers and central limit theorem, etc.), while stochastic calculus and Bayesian estimation are not assumed to be known. Familiarity with physics approximation techniques (Taylor expansions, saddle point approximations) is helpful.
There are many mathematically rigorous books on random matrix theory, for example Blower [2009], Anderson, Guionnet and Zeitouni [2010], Bai and Silverstein [2010], Pastur and Scherbina [2010], Tao [2012], Erdős and Yau [2017], Mingo and Speicher [2017]. These books are often too technical for the intended readership of the present book. We nevertheless rely on them to extract some material relevant for our purpose.
Acknowledgments
The two of us want to warmly thank the research team at CFM with whom we have had
many illuminating discussions on these topics over the years, and in particular Jean-Yves
Audibert, Florent Benaych-Georges, Raphael Bénichou, Rémy Chicheportiche, Stefano
Ciliberti, Sungmin Hwang, Vipin Kerala Varma, Laurent Laloux, Eric Lebigot, Thomas
Madaule, Iacopo Mastromatteo, Pierre-Alain Reigneron, Adam Rej, Jacopo Rocchi,
Emmanuel Sérié, Konstantin Tikhonov, Bence Toth and Dario Vallamaina.
We also want to thank our academic colleagues for numerous, very instructive interactions, collaborations and comments, including Gerard Ben Arous, Michael Benzaquen, Giulio Biroli, Edouard Brézin, Zdzisław Burda, Benoît Collins, David Dean, Bertrand Eynard, Yan Fyodorov, Thomas Guhr, Alice Guionnet, Antti Knowles, Reimer Kuehn, Pierre Le Doussal, Fabrizio Lillo, Satya Majumdar, Marc Mézard, Giorgio Parisi, Sandrine Péché, Marco Tarzia, Matthieu Wyart, Francesco Zamponi and Tony Zee.
We want to thank some of our students and post-docs for their invaluable contribution
to some of the topics covered in this book, in particular Romain Allez, Joel Bun, Tristan
Gautié and Pierre Mergny. We also thank Pierre-Philippe Crépin, Théo Dessertaine, Tristan
Gautié, Armine Karami and José Moran for carefully reading the manuscript.
Finally, Marc Potters wants to thank Fan Yang, who typed up the original hand-written
notes. He also wants to thank Andrea Bertozzi, Stanley Osher and Terence Tao, who
welcomed him for a year at UCLA. During that year he had many fruitful discussions with
members and visitors of the UCLA mathematics department and with participants of the
IPAM long program in quantitative linear algebra, including Alice Guionnet, Horng-Tzer
Yau, Jun Yin and more particularly with Nicholas Cook, David Jekel, Dimitri Shlyakhtenko
and Nikhil Srivastava.
Bibliographical Notes
Here is a list of books on random matrix theory that we have found useful.
• Books for mathematicians
– G. Blower. Random Matrices: High Dimensional Phenomena. Cambridge University
Press, Cambridge, 2009,
– G. W. Anderson, A. Guionnet, and O. Zeitouni. An Introduction to Random Matrices.
Cambridge University Press, Cambridge, 2010,
– Z. Bai and J. W. Silverstein. Spectral Analysis of Large Dimensional Random Matrices. Springer-Verlag, New York, 2010,
– L. Pastur and M. Scherbina. Eigenvalue Distribution of Large Random Matrices.
American Mathematical Society, Providence, Rhode Island, 2010,
– T. Tao. Topics in Random Matrix Theory. American Mathematical Society, Providence, Rhode Island, 2012,
– L. Erdős and H.-T. Yau. A Dynamical Approach to Random Matrix Theory. American
Mathematical Society, Providence, Rhode Island, 2017,
– J. A. Mingo and R. Speicher. Free Probability and Random Matrices. Springer, New
York, 2017.
• Books for physicists and mathematical physicists
– M. L. Mehta. Random Matrices. Academic Press, San Diego, 3rd edition, 2004,
– B. Eynard, T. Kimura, and S. Ribault. Random Matrices. Preprint arXiv:1510.04430, 2015,
– P. J. Forrester. Log-Gases and Random Matrices. Princeton University Press, Princeton, NJ, 2010,
– G. Akemann, J. Baik, and P. Di Francesco. The Oxford Handbook of Random Matrix Theory. Oxford University Press, Oxford, 2011,
– E. Brézin and S. Hikami. Random Matrix Theory with an External Source. Springer,
New York, 2016,
– G. Schehr, A. Altland, Y. V. Fyodorov, N. O’Connell, and L. F. Cugliandolo, editors. Stochastic Processes and Random Matrices, Les Houches Summer School. Oxford University Press, Oxford, 2017,
– G. Livan, M. Novaes, and P. Vivo. Introduction to Random Matrices: Theory and
Practice. Springer, New York, 2018.
• More “applied” books
– A. M. Tulino and S. Verdú. Random Matrix Theory and Wireless Communications. Now Publishers, Hanover, Mass., 2004,
– R. Couillet and M. Debbah. Random Matrix Methods for Wireless Communications.
Cambridge University Press, Cambridge, 2011.
• Our own review paper on the subject, with significant overlap with this book
– J. Bun, J.-P. Bouchaud, and M. Potters. Cleaning large correlation matrices: Tools
from random matrix theory. Physics Reports, 666:1–109, 2017.
Symbols
Abbreviations
bbp: Baik, Ben Arous, Péché
cdf: Cumulative Distribution Function
clt: Central Limit Theorem
dbm: Dyson Brownian Motion
ema: Exponential Moving Average
fid: Free Identically Distributed
goe: Gaussian Orthogonal Ensemble
gse: Gaussian Symplectic Ensemble
gue: Gaussian Unitary Ensemble
hciz: Harish-Chandra–Itzykson–Zuber
iid: Independent Identically Distributed
lln: Law of Large Numbers
map: Maximum A Posteriori
mave: Mean Absolute Value
mmse: Minimum Mean Square Error
pca: Principal Component Analysis
pde: Partial Differential Equation
pdf: Probability Distribution Function
rie: Rotationally Invariant Estimator
rmt: Random Matrix Theory
scm: Sample Covariance Matrix
sde: Stochastic Differential Equation
svd: Singular Value Decomposition
Conventions
0+ : infinitesimal positive quantity
1: identity matrix
∼: scales as, of the order of, also, for random variables, drawn from
≈: approximately equal to (mathematically or numerically)
∝: proportional to
:= : equal by definition to
≡: identically equal to
⊞: free sum
⊠: free product
E[·]: mathematical expectation
V[·]: mathematical variance
.: empirical average
[x]: dimension of x
i: √−1
Re: real part
Im: imaginary part
⨍: principal value integral
±√·: special square root, Eq. (4.56)
AT : matrix transpose
supp(ρ): domain where ρ(·) is non-zero
Note: most of the time f (t) means that t is a continuous variable, and ft means that t is
discrete.
Roman Symbols
A: generic constant, or generic free variable
A: generic matrix
a: generic coefficient, as in the gamma distribution, Eq. (4.17), or in the free
log-normal, Eq. (16.15)
B: generic free variable
B: generic matrix
b: generic coefficient, as in the gamma distribution, Eq. (4.17), or in the free
log-normal, Eq. (16.15)
C: generic coefficient
Ck : Catalan numbers
C: often population, or “true” covariance matrix, sometimes C = A + B
C: total investable capital or cross-covariance matrix
ck : cumulant of order k
d: distance between eigenvalues
dB: Wiener noise
E: sample, or empirical matrix; matrix corrupted by noise
E: error
e: normalized vector of 1’s, e = (1,1, . . . ,1)T/√N
L: log-likelihood
M: generic matrix
Wp: inverse-Wishart matrix with coefficient p
mk : moment of order k
N: size of the matrix, number of variables
N(μ,σ²): Gaussian distribution with mean μ and variance σ²
N(µ,C): multivariate Gaussian distribution with mean µ and covariance C
n: number of eigenvalues in some interval, or number of replicas
O(N): orthogonal group in N dimensions
O: generic orthogonal matrix
P (·): generic probability distribution function defined by its argument: P (x) is
a short-hand for the probability density of variable x
Pγ (·): gamma distribution
P> (·): complementary cumulative distribution function
P0 (·): prior distribution in a Bayesian framework
Pi (t): probability to be in state i at time t
Pn : Legendre polynomial
P (x|y): conditional distribution of x knowing y
Pn^(a,b): Jacobi polynomial
P: rank-1 (projector) matrix
P[X]: probability of event X
p: variance of the inverse-Wishart distribution, or quantile value
p(x): generic polynomial of x
p(A): generic matrix polynomial of A
pN (·): generic monic orthogonal polynomial
p(y,t|x): propagator of Brownian motion, with initial position x at t = 0
Q(·): generic polynomial
QN (·): (expected) characteristic polynomial
q: ratio of size of matrix N to size of sample T : q = N/T
q∗: effective size ratio q∗ = N/T∗
q(A): generic matrix polynomial of A
qN (·): normalized characteristic polynomial
R: circle radius
RA (·): R-transform of the spectrum of A
R: portfolio risk, or error
r: signal-to-noise ratio; also an auxiliary variable in the spin-glass section
ri,t : return of asset i at time t
SA (·): S-transform of the spectrum of A
S: diagonal matrix of singular values
s: generic singular value
T: size of the sample
Greek Symbols
α: generic scale factor
β: effective inverse “temperature” for the Coulomb gas, defining the symmetry
class of the matrix ensemble (β = 1 for orthogonal, β = 2 for unitary,
β = 4 for symplectic)
βi : exposure of asset i to the common (market) factor
β: vector of βi ’s
Γ: gamma function
ΓN: multivariate gamma function
γ: generic parameter or generic quantity
γc : inverse correlation time
φ: angle
φ(·): effective two-body potential in Matytsin’s formalism
ϕ(·): generating function, or Fourier transform of a probability distribution
Φ(λ,λ′): scaled squared overlap between the eigenvectors of two sample matrices
Ψ(·), ψ(·): auxiliary functions
ψ: auxiliary integration vector
Ω: generic rotation matrix
ω(·): generic eigenvalue field in a large deviation formalism (cf. Section 5.5)