This document provides an introduction to the interface between asymptotic geometric analysis and quantum information theory. It begins with notation and basic concepts in quantum information theory. It then covers elementary convex analysis, the mathematics of quantum information theory including states on multipartite Hilbert spaces, superoperators and quantum channels, and cones in QIT. It concludes with an introduction to quantum mechanics for mathematicians. The document aims to bridge asymptotic geometric analysis and quantum information theory.

Alice and Bob Meet Banach

The Interface of Asymptotic Geometric Analysis and Quantum Information Theory

Guillaume Aubrun

Stanisław J. Szarek

2010 Mathematics Subject Classification. Primary 46Bxx, 52Axx, 81Pxx, 46B07,
46B09, 52C17, 60B20, 81P40

To Aurélie and Margaretmary

Contents

List of Tables xiii
List of Figures xv
Preface xvii
Part 1. Alice and Bob. Mathematical Aspects of Quantum Information Theory 1
Chapter 0. Notation and Basic Concepts 3
0.1. Asymptotic and non-asymptotic notation 3
0.2. Euclidean and Hilbert spaces 3
0.3. Bra-ket notation 4
0.4. Tensor products 6
0.5. Complexification 6
0.6. Matrices vs. operators 7
0.7. Block matrices vs. operators on bipartite spaces 8
0.8. Operators vs. tensors 8
0.9. Operators vs. superoperators 8
0.10. States, classical and quantum 8
Chapter 1. Elementary Convex Analysis 11
1.1. Normed spaces and convex sets 11
1.1.1. Gauges 11
1.1.2. First examples: $\ell_p$-balls, simplex, polytopes, and convex hulls 12
1.1.3. Extreme points, faces 13
1.1.4. Polarity 15
1.1.5. Polarity and the facial structure 17
1.1.6. Ellipsoids 18
1.2. Cones 18
1.2.1. Cone duality 19
1.2.2. Nondegenerate cones and facial structure 21
1.3. Majorization and Schatten norms 22
1.3.1. Majorization 22
1.3.2. Schatten norms 23
1.3.3. Von Neumann and Rényi entropies 27
Notes and Remarks 28
Chapter 2. The Mathematics of Quantum Information Theory 31
2.1. On the geometry of the set of quantum states 31
2.1.1. Pure and mixed states 31
2.1.2. The Bloch ball $\mathrm{D}(\mathbb{C}^2)$ 32
2.1.3. Facial structure 33
2.1.4. Symmetries 34
2.2. States on multipartite Hilbert spaces 35
2.2.1. Partial trace 35
2.2.2. Schmidt decomposition 36
2.2.3. A fundamental dichotomy: separability vs. entanglement 37
2.2.4. Some examples of bipartite states 39
2.2.5. Entanglement hierarchies 41
2.2.6. Partial transposition 41
2.2.7. PPT states 43
2.2.8. Local unitaries and symmetries of Sep 46
2.3. Superoperators and quantum channels 47
2.3.1. The Choi and Jamiołkowski isomorphisms 47
2.3.2. Positive and completely positive maps 48
2.3.3. Quantum channels and Stinespring representation 50
2.3.4. Some examples of channels 52
2.4. Cones of QIT 55
2.4.1. Cones of operators 55
2.4.2. Cones of superoperators 56
2.4.3. Symmetries of the PSD cone 58
2.4.4. Entanglement witnesses 60
2.4.5. Proofs of Størmer’s theorem 61
Notes and Remarks 63
Chapter 3. Quantum Mechanics for Mathematicians 67
3.1. Simple-minded quantum mechanics 67
3.2. Finite vs. infinite dimension, projective spaces and matrices 68
3.3. Composite systems and quantum marginals; mixed states 68
3.4. The partial trace; purification of mixed states 70
3.5. Unitary evolution and quantum operations; the completely positive maps 71
3.6. Other measurement schemes 73
3.7. Local operations 74
3.8. Spooky action at a distance 74
Notes and Remarks 75
Part 2. Banach and his Spaces. Asymptotic Geometric Analysis Miscellany 77
Chapter 4. More Convexity 79
4.1. Basic notions and operations 79
4.1.1. Distances between convex sets 79
4.1.2. Symmetrization 80
4.1.3. Zonotopes and zonoids 81
4.1.4. Projective tensor product 82
4.2. John and Löwner ellipsoids 84
4.2.1. Definition and characterization 84
4.2.2. Convex bodies with enough symmetries 89
4.2.3. Ellipsoids and tensor products 91
4.3. Classical inequalities for convex bodies 91
4.3.1. The Brunn–Minkowski inequality 91
4.3.2. log-concave measures 93
4.3.3. Mean width and the Urysohn inequality 94
4.3.4. The Santaló and the reverse Santaló inequalities 98
4.3.5. Symmetrization inequalities 98
4.3.6. Functional inequalities 101
4.4. Volume of central sections and the isotropic position 101
Notes and Remarks 103
Chapter 5. Metric Entropy and Concentration of Measure in Classical Spaces 107
5.1. Nets and packings 107
5.1.1. Definitions 107
5.1.2. Nets and packings on the Euclidean sphere 108
5.1.3. Nets and packings in the discrete cube 113
5.1.4. Metric entropy for convex bodies 114
5.1.5. Nets in Grassmann manifolds, orthogonal and unitary group 116
5.2. Concentration of measure 117
5.2.1. A prime example: concentration on the sphere 119
5.2.2. Gaussian concentration 121
5.2.3. Concentration tricks and treats 124
5.2.4. Geometric and analytic methods. Classical examples 129
5.2.5. Some discrete settings 136
5.2.6. Deviation inequalities for sums of independent random variables 139
Notes and Remarks 141
Chapter 6. Gaussian Processes and Random Matrices 149
6.1. Gaussian processes 149
6.1.1. Key example and basic estimates 150
6.1.2. Comparison inequalities for Gaussian processes 152
6.1.3. Sudakov and dual Sudakov inequalities 154
6.1.4. Dudley’s inequality and the generic chaining 157
6.2. Random matrices 160
6.2.1. $\infty$-Wasserstein distance 161
6.2.2. The Gaussian Unitary Ensemble 162
6.2.3. Wishart matrices 166
6.2.4. Real RMT models and Chevet–Gordon inequalities 173
6.2.5. A quick initiation to free probability 176
Notes and Remarks 178
Chapter 7. Some Tools from Asymptotic Geometric Analysis 181
7.1. $\ell$-position, $K$-convexity and the $MM^*$-estimate 181
7.1.1. $\ell$-norm and $\ell$-position 181
7.1.2. $K$-convexity and the $MM^*$-estimate 182
7.2. Sections of convex bodies 186
7.2.1. Dvoretzky’s theorem for Lipschitz functions 186
7.2.2. The Dvoretzky dimension 189
7.2.3. The Figiel–Lindenstrauss–Milman inequality 193
7.2.4. The Dvoretzky dimension of standard spaces 195
7.2.5. Dvoretzky’s theorem for general convex bodies 200
7.2.6. Related results 201
7.2.7. Constructivity 205
Notes and Remarks 207

Part 3. The Meeting: AGA and QIT 211

Chapter 8. Entanglement of Pure States in High Dimensions 213
8.1. Entangled subspaces: qualitative approach 213
8.2. Entropies of entanglement and additivity questions 215
8.2.1. Quantifying entanglement for pure states 215
8.2.2. Channels as subspaces 216
8.2.3. Minimal output entropy and additivity problems 216
8.2.4. On the $1 \to p$ norm of quantum channels 217
8.3. Concentration of $E_p$ for $p > 1$ and applications 218
8.3.1. Counterexamples to the multiplicativity problem 218
8.3.2. Almost randomizing channels 220
8.4. Concentration of von Neumann entropy and applications 222
8.4.1. The basic concentration argument 222
8.4.2. Entangled subspaces of small codimension 224
8.4.3. Extremely entangled subspaces 224
8.4.4. Counterexamples to the additivity problem 228
8.5. Entangled pure states in multipartite systems 228
8.5.1. Geometric measure of entanglement 228
8.5.2. The case of many qubits 229
8.5.3. Multipartite entanglement in real Hilbert spaces 231
Notes and Remarks 231
Chapter 9. Geometry of the Set of Mixed States 235
9.1. Volume and mean width estimates 236
9.1.1. Symmetrization 236
9.1.2. The set of all quantum states 236
9.1.3. The set of separable states (the bipartite case) 238
9.1.4. The set of block-positive matrices 240
9.1.5. The set of separable states (multipartite case) 242
9.1.6. The set of PPT states 244
9.2. Distance estimates 245
9.2.1. The Gurvits–Barnum theorem 246
9.2.2. Robustness in the bipartite case 247
9.2.3. Distances involving the set of PPT states 248
9.2.4. Distance estimates in the multipartite case 249
9.3. The super-picture: classes of maps 250
9.4. Approximation by polytopes 252
9.4.1. Approximating the set of all quantum states 252
9.4.2. Approximating the set of separable states 256
9.4.3. Exponentially many entanglement witnesses are necessary 258
Notes and Remarks 260
Chapter 10. Random Quantum States 263
10.1. Miscellaneous tools 263
10.1.1. Majorization inequalities 263
10.1.2. Spectra and norms of unitarily invariant random matrices 264
10.1.3. Gaussian approximation to induced states 266
10.1.4. Concentration for gauges of induced states 268
10.2. Separability of random states 268
10.2.1. Almost sure entanglement for low-dimensional environments 268
10.2.2. The threshold theorem 269
10.3. Other thresholds 272
10.3.1. Entanglement of formation 272
10.3.2. Threshold for PPT 273
Notes and Remarks 273
Chapter 11. Bell Inequalities and the Grothendieck–Tsirelson Inequality 275
11.1. Isometrically Euclidean subspaces via Clifford algebras 275
11.2. Local vs. quantum correlations 276
11.2.1. Correlation matrices 276
11.2.2. Bell correlation inequalities and the Grothendieck constant 279
11.3. Boxes and games 283
11.3.1. Bell inequalities as games 283
11.3.2. Boxes and the nonsignaling principle 285
11.3.3. Bell violations 289
Notes and Remarks 294
Chapter 12. POVMs and the Distillability Problem 299
12.1. POVMs and zonoids 299
12.1.1. Quantum state discrimination 299
12.1.2. Zonotope associated to a POVM 300
12.1.3. Sparsification of POVMs 300
12.2. The distillability problem 301
12.2.1. State manipulation via LOCC channels 301
12.2.2. Distillable states 302
12.2.3. The case of two qubits 302
12.2.4. Some reformulations of distillability 304
Notes and Remarks 305
Appendix A. Gaussian measures and Gaussian variables 307
A.1. Gaussian random variables 307
A.2. Gaussian vectors 308
Notes and Remarks 309
Appendix B. Classical groups and manifolds 311
B.1. The unit sphere $S^{n-1}$ or $S_{\mathbb{C}^d}$ 311
B.2. The projective space 312
B.3. The orthogonal and unitary groups $O(n)$, $U(n)$ 312
B.4. The Grassmann manifolds $\mathrm{Gr}(k, \mathbb{R}^n)$, $\mathrm{Gr}(k, \mathbb{C}^n)$ 314
B.5. The Lorentz group $O(1, n-1)$ 318
Notes and Remarks 319
Appendix C. Extreme maps between Lorentz cones and the S-lemma 321
Notes and Remarks 324
Appendix D. Polarity and the Santaló point via duality of cones 325
Appendix E. Hints to exercises 329
Appendix. Bibliography 375
Websites 398
Appendix F. Notation 399
General notation 399
Convex geometry 399
Linear algebra 400
Probability 401
Geometry and asymptotic geometric analysis 402
Quantum information theory 403
Appendix. Index 405

List of Tables

2.1 Cones of operators and their duals 55
2.2 Cones of superoperators 57
3.1 Spooky action at a distance: outcome distribution for a 2-qubit measurement experiment 75
4.1 Radii, volume radii, and widths for standard convex bodies in $\mathbb{R}^n$ 96
5.1 Covering numbers of classical manifolds 116
5.2 Constants and exponents in subgaussian concentration inequalities 118
5.3 Optimal bounds on Ricci curvature of classical manifolds 131
5.4 log-Sobolev and Poincaré constants for classical manifolds 134
7.1 Derandomization/randomness reduction for Euclidean sections of $B_1^n$ 207
9.1 Radii, volume radii, and widths for sets of quantum states 235
9.2 References for proofs of the results from Table 9.1 236
9.3 Volume estimates for bases of cones of superoperators 251
9.4 Verticial and facial dimensions for sets of quantum states 253
11.1 The magic square game 294

List of Figures

1.1 Gauge of a convex body 12
1.2 A polytope and its polar 17
1.3 A cone and its dual cone 20
2.1 The set of quantum states and the set of separable states 38
2.2 The set of PPT states 44
4.1 Symmetrizations of a convex body 80
4.2 An equilateral triangle in Löwner position 85
4.3 Width and half-width of a convex body 94
5.1 A net and a packing for an equilateral triangle 108
5.2 Upper-bounding the volume of a spherical cap 109
5.3 Volume growth on the sphere $S^2$ as a function of geodesic distance 130
6.1 Empirical eigenvalue distribution of a GUE matrix 164
6.2 Marčenko–Pastur densities 167
7.1 Low-dimensional illustration of Dvoretzky’s theorem 190
11.1 Diagrammatic representation of a quantum game 284
D.1 Changing the center of polarity and duality of cones 326
D.2 The Santaló point via duality of cones 327
E.1 An example of an extreme point which is not exposed 329
E.2 Schatten unit balls in $2 \times 2$ real self-adjoint matrices 333
E.3 Sharper upper and lower bounds of the volume of spherical caps 345

Preface

The quest to build a quantum computer is arguably one of the major scientific and technological challenges of the 21st century, and quantum information theory (QIT) provides the mathematical framework for that quest. Over the last dozen or so years, it has become clear that quantum information theory is closely linked to geometric functional analysis (Banach space theory, operator spaces, high-dimensional probability), a field also known as asymptotic geometric analysis (AGA). In a nutshell, asymptotic geometric analysis investigates quantitative properties of convex sets, or other geometric structures, and their approximate symmetries as the dimension becomes large. This makes it especially relevant to quantum theory, where systems consisting of just a few particles naturally lead to models whose dimension is in the thousands, or even in billions.
While the idea for this book materialized after we independently taught graduate courses directed primarily at students interested in functional analysis (at the University Lyon 1 and at the University Pierre et Marie Curie-Paris 6 in the spring of 2010), the final product goes well beyond enhanced lecture notes. The book is aimed at multiple audiences connected through their interest in the interface of QIT and AGA: at quantum information researchers who want to learn AGA or to apply its tools; at mathematicians interested in learning QIT, or at least the part of QIT that is relevant to functional analysis/convex geometry/random matrix theory and related areas; and at beginning researchers in either field. We have tried to make the book as user-friendly as possible, with numerous tables, explicit estimates, and reasonable constants when possible, so as to make it a useful reference even for established mathematicians generally familiar with the subject.


The first four chapters are of introductory nature. Chapter 0 outlines the basic notation and conventions with emphasis on those that are field-specific to AGA or to physics and may therefore need to be clarified for readers that were educated in the other culture. It should be read lightly and used later as a reference. Chapter 1 introduces basic notions from convexity theory that are used throughout the book, notably duality of convex bodies or of convex cones and Schatten norms. Chapter 2 goes over a selection of mathematical concepts and elementary results that are relevant to quantum theory. It is aimed primarily at newcomers to the area, but other readers may find it useful to read it lightly and selectively to familiarize themselves with the “spirit” of the book. Chapter 3 may be helpful to mathematicians with limited background in physics; it shows why various mathematical concepts appear in quantum theory. It could also help in understanding physicists talking about the subject and in seeing the motivation behind their enquiries. The choice of topics largely reflects the aspects of the field that we ourselves found not-immediately-obvious when encountering them for the first time.

Chapters 4 through 7 include the background material from the widely understood AGA that is either already established to be, directly or indirectly, relevant to QIT, or that we consider to be worthwhile making available to the QIT community. Even though most of this material can be found in existing books or surveys, many items are difficult to locate in the literature and/or are not readily accessible to outsiders. Here we have organized our exposition of AGA so that the applications follow as seamlessly as possible. Our presentation of some aspects of the theory is nonstandard. For example, we exploit the interplay between polarity and cone duality (outlined in Chapter 1 and with a sample application in Appendix D) to give novel and potentially useful insights. Chapters 4 (More convexity) and 5 (Metric entropy and concentration of measure) can be read independently of each other, but Chapters 6 and 7 depend on the preceding ones.

Chapters 8 through 12 discuss topics from the QIT proper, mostly via application of tools from the prior chapters. These chapters can largely be read independently of each other. For the most part, they present results previously published in journal articles, often (but not always) by the authors and their collaborators, most notably Cécilia Lancien, Elisabeth Werner, Deping Ye, Karol Życzkowski, and The Horodecki Group. A few results are byproducts of the work on this book (e.g., those in Section 9.4). The book also contains several new proofs. Some of them could arguably qualify as “proofs from The Book,” for example the first proof of Størmer’s Theorem 2.36 (Section 2.4.5) or the derivation of the sharp upper bound for the expected value of the norm of the complex Wishart matrix (Proposition 6.31).
Some statements are explicitly marked as “not proved here”; in that case the references (to the original source and/or to a more accessible presentation) are indicated in the “Notes and Remarks” section at the end of the chapter. Otherwise, the proof can be found either in the main text or in the exercises. There are over 400 exercises that form an important part of the book. They are diverse and aim at multiple audiences. Some are simple and elementary complements to the text, while others allow the reader to explore more advanced topics at their own pace. Still others explore details of the arguments that we judged to be too technical to be included in the main text, but worthwhile to be outlined for those who may need sharp versions of the results and/or to “reverse engineer” the proofs. All but the simplest exercises come with hints, collected in Appendix E. Appendices A to D contain material, generally of reference character, that would disrupt the narrative if included in the main text.


The back matter of the book contains material designed to simplify the task of the reader wanting to use the book as a reference: a guide to notation and a keyword index. The bibliography likewise contains back-references displaying page(s) where a given item is cited. For additional information and updates on or corrections to this book, we refer the reader to the associated blog at https://aliceandbobmeetbanach.wordpress.com. At the same time, we encourage (or even beg) the readers to report typos, errors, improvements, solutions to problems and the like to the blog. (An alternative path to the online post-publication material is by following the link given on the back cover of the book.)
While the initial impulse for the book was a teaching experience, it has not been designed, in its ultimate form, with a specific course or courses in mind. For starters, the quantity of material exceeds by far what can be covered in a single semester. However, a graduate course centered on the main theme of the book (the interface of QIT and AGA) can be easily designed around selected topics from Chapters 4–7, followed by selected applications from Chapters 8–12. While we assume at least a cursory familiarity with functional analysis (normed and inner product spaces, and operators on them, duality, Hahn–Banach-type separation theorems etc.), real analysis ($L_p$-spaces), and probability, deep results from these fields appear only occasionally and, when they do, an attempt is made to “soften the blow” by presenting some background via appropriately chosen exercises. Alternatively, most chapters could serve as a core for an independent study course. Again, this would be greatly facilitated by the numerous exercises and, mathematical maturity being more critical than extensive knowledge, the text will be accessible to sufficiently motivated advanced undergraduates.

Acknowledgements. This book has been written over several years; during this period the project benefited greatly from the joint stays of the authors at the Isaac Newton Institute in Cambridge, Mathematisches Forschungsinstitut Oberwolfach (within the framework of its Research in Pairs program), and the Instituto de Ciencias Matemáticas in Madrid. We are grateful to these institutions and their staff for their support and hospitality. We are indebted to the many colleagues and students who helped us bring this book into being, either by reading and commenting on specific chapters, or by sharing with us their expertise and/or providing us with references. We thank in particular Dominique Bakry, Andrew Blasius, Michał Horodecki, Cécilia Lancien, Imre Leader, Ben Li, Harsh Mathur, Mark Meckes, Emanuel Milman, Ion Nechita, David Reeb. We also thank the anonymous referees for many suggestions which helped to improve the quality of the text. We are especially grateful to Gaëlle Jardine for careful proofreading of parts of the manuscript. We acknowledge Aurélie Garnier, who created the comic strip. Thanks are also due to Sergei Gelfand of the American Mathematical Society’s Editorial Division, who guided this project from the conception to its conclusion and whose advice and prodding were invaluable. Finally, we would like to thank our families for their support, care, and patience throughout the years.


While working on the book the authors benefited from partial support of the Agence Nationale de la Recherche (France), grants OSQPI (2011-BS01-008-02, GA and SJS) and StoQ (2014-CE25-0003, GA), and of the National Science Foundation (U.S.A.), awards DMS-0801275, DMS-1246497, and DMS-1600124 (all SJS).


Part 1

Alice and Bob

Mathematical Aspects of Quantum Information Theory

CHAPTER 0

Notation and Basic Concepts

0.1. Asymptotic and non-asymptotic notation

The letters $C, c, c', c_0, \ldots$ denote absolute numerical constants, independent of the instance of the problem at hand. However, the actual values corresponding to the same symbol may change from place to place. Such constants are always assumed to be positive. Usually $C$ or $C'$ stands for a large (but finite) number, while $c$ or $c_0$ denotes a small (but nonzero) number. If a constant is allowed to depend on a parameter (say $n$, or $\varepsilon$), we use expressions such as $C_n$ or $c(\varepsilon)$.

When $A, B$ are quantities depending on the dimension (and/or perhaps on some other parameters), the notation $A = O(B)$ means that there exists an absolute constant $C > 0$ such that the inequality $A \le CB$ holds in every dimension. Similarly, $A = \Omega(B)$ means that $B = O(A)$, and $A = \Theta(B)$ means both $A = O(B)$ and $B = O(A)$. We emphasize that these are non-asymptotic relations; they are supposed to hold universally, in every instance of the problem, independently of any other parameters that may be involved, and not just in the limit. We also write $A \lesssim B$, $A \gtrsim B$ and $A \simeq B$ as alternative notation for $A = O(B)$, $A = \Omega(B)$ and $A = \Theta(B)$ respectively. However, sometimes we will want to indicate relations that have an asymptotic flavor. For example, $A \sim B$ will mean that $A/B \to 1$ as the dimension tends to $\infty$ (or as some other relevant parameter tends to its limiting value), and both $A = o(B)$ and $A \ll B$ mean that $A/B \to 0$. If we want to indicate or emphasize that a dependence (of either kind) is not necessarily uniform in some of the parameters, we may write, for example, $c(\alpha)$ or $A = O_\varepsilon(B)$ to identify the parameter(s) on which the relation in question does or may depend, and similarly for $A \sim_p B$ (asymptotic equivalence for fixed $p$). Note that if there is only one parameter involved (say, the dimension $n$), then $A \sim B$ implies $A \simeq B$; however, $A \sim_p B$ does not necessarily entail $A \simeq B$.
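
The distinction in the last remark can be seen in a small worked example. The quantities $A$ and $B$ below are hypothetical, chosen only for illustration and not taken from the book; they satisfy $A \sim_p B$ but not $A \simeq B$.

```latex
% Illustrative example: n >= 1 is the dimension, p >= 1 an auxiliary parameter.
\[
  A(n,p) = n + p, \qquad B(n,p) = n, \qquad \frac{A}{B} = 1 + \frac{p}{n}.
\]
% For each fixed p, A/B -> 1 as n -> infinity, so A \sim_p B.
% However, A/B is unbounded over all pairs (n,p): take n = 1 and let p grow.
% Hence no absolute constant C gives A <= C B, so A = O(B) fails and A \simeq B fails.
% What does hold is a bound with a constant depending on p:
\[
  B \le A \le (1+p)\,B, \qquad \text{i.e. } A = O_p(B) \text{ with } C_p = 1 + p.
\]
```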



0.2. Euclidean and Hilbert spaces


Throughout this book, virtually all the normed spaces we consider will be finite-dimensional (most concepts do extend to infinite-dimensional spaces, but we do not dwell on this). In the case of real or complex Hilbert spaces, we denote by $\langle\psi, \chi\rangle$ the inner product of two vectors $\psi, \chi$, and by $|\psi| = \sqrt{\langle\psi, \psi\rangle}$ the corresponding Hilbert space norm. For a complex Hilbert space $\mathcal{H}$, we use the convention that the inner product is conjugate linear in the first argument and linear in the second argument: if $\psi, \chi \in \mathcal{H}$ and $\lambda \in \mathbb{C}$, then
$$\langle\lambda\psi, \chi\rangle = \bar{\lambda}\langle\psi, \chi\rangle \quad\text{and}\quad \langle\psi, \lambda\chi\rangle = \lambda\langle\psi, \chi\rangle.$$
This convention is common in physics literature, but differs from the one usually employed in mathematics.
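
The convention can be checked numerically on a small example. The following is a minimal sketch in plain Python, assuming vectors in $\mathbb{C}^2$ represented as lists of complex numbers; the helper names `inner` and `norm` and the sample data are ours, introduced only for illustration.

```python
def inner(psi, chi):
    """Return <psi, chi>: conjugate linear in psi, linear in chi."""
    return sum(a.conjugate() * b for a, b in zip(psi, chi))


def norm(psi):
    """Hilbert space norm |psi| = sqrt(<psi, psi>)."""
    return inner(psi, psi).real ** 0.5


# Hypothetical sample data in C^2, chosen only for illustration.
psi = [1 + 2j, 3 - 1j]
chi = [2 - 1j, 1j]
lam = 0.5 + 1.5j

# <lam psi, chi> should equal conj(lam) <psi, chi>.
first_slot = inner([lam * a for a in psi], chi)
# <psi, lam chi> should equal lam <psi, chi>.
second_slot = inner(psi, [lam * b for b in chi])

print(abs(first_slot - lam.conjugate() * inner(psi, chi)) < 1e-12)
print(abs(second_slot - lam * inner(psi, chi)) < 1e-12)
```

Under the convention more common in mathematics, the conjugate would be placed on the second argument instead, that is, on `b` rather than `a` inside `inner`.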


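When $\mathcal{H}, \mathcal{H}'$ are (real or complex) finite-dimensional Hilbert spaces, we denote by $B(\mathcal{H}', \mathcal{H})$ the space of operators (= linear maps) from $\mathcal{H}'$ to $\mathcal{H}$, and $B(\mathcal{H}) = B(\mathcal{H}, \mathcal{H})$. The adjoint of an operator $A \in B(\mathcal{H}', \mathcal{H})$ is the unique operator $A^\dagger \in B(\mathcal{H}, \mathcal{H}')$ satisfying the property
$$\langle\psi, A\psi'\rangle = \langle A^\dagger\psi, \psi'\rangle \tag{0.1}$$
for any $\psi \in \mathcal{H}$, $\psi' \in \mathcal{H}'$. We denote by $B^{sa}(\mathcal{H})$ the space of self-adjoint operators satisfying $A^\dagger = A$; $B^{sa}(\mathcal{H})$ is a real (but not complex) vector subspace of $B(\mathcal{H})$. The dependence $A \mapsto A^\dagger$ is conjugate linear. A simple but important instance of this operation is when $\mathcal{H}' = \mathbb{C}$: if we identify $\varphi \in \mathcal{H}$ with an operator $z \mapsto z\varphi$ belonging to $B(\mathbb{C}, \mathcal{H})$, then the adjoint of that operator is $\varphi^\dagger = \langle\varphi, \cdot\rangle \in B(\mathcal{H}, \mathbb{C}) = \mathcal{H}^*$.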
The notation $B(\cdot, \cdot)$ will be occasionally used for the corresponding concepts in the category of normed (or just vector) spaces. Note that while $B$ stands for “bounded,” in the finite-dimensional setting all linear operators are bounded and so, if minimal care is exercised, this will not introduce ambiguity. On the other hand, the notation $\dagger$ will be reserved for operators acting between Hilbert spaces; in other contexts we will use the usual functional analytic notation $T^*$ for the adjoint of a linear map $T$.
If $\mathcal{H}$ is a complex Hilbert space, we denote by $\overline{\mathcal{H}}$ the Hilbert space which coincides with $\mathcal{H}$ as far as the additive structure is concerned, but with multiplication defined as $(\lambda, x) \mapsto \bar{\lambda}x$. Again, the identity map $\mathcal{H} \ni \psi \mapsto \psi \in \overline{\mathcal{H}}$ is $\mathbb{R}$-linear, but not $\mathbb{C}$-linear. Still, the Hilbert spaces $\mathcal{H}$ and $\overline{\mathcal{H}}$ are isomorphic. Explicit isomorphisms can be constructed as follows: if $(e_j)$ is an orthonormal basis in $\mathcal{H}$ and $\psi = \sum_j \lambda_j e_j \in \mathcal{H}$, we denote by $\overline{\psi}$ the vector $\sum_j \overline{\lambda_j} e_j$; then the map $\psi \mapsto \overline{\psi}$ is a Hilbert-space isomorphism between $\mathcal{H}$ and $\overline{\mathcal{H}}$. However, this identification between $\mathcal{H}$ and $\overline{\mathcal{H}}$ is not canonical since it depends on the choice of a basis. (In general, a mathematical procedure/construction/morphism is said to be canonical when it depends only on the underlying structure of the object(s) at hand and does not involve any additional arbitrary choices. An identification between two spaces is canonical when there is only one natural candidate for an isomorphism. In the setting of vector spaces, “canonical” is roughly the same as “can be defined in a coordinate-free way.”) In our context, it is the dual space $\mathcal{H}^* = B(\mathcal{H}, \mathbb{C})$ which identifies canonically with $\overline{\mathcal{H}} = \overline{B(\mathbb{C}, \mathcal{H})}$ via the map $\mathcal{H}^* \ni \psi^\dagger \leftrightarrow \psi \in \overline{\mathcal{H}}$. This subtlety does not arise in the real case since the map $\psi \mapsto \psi^\dagger$ is $\mathbb{R}$-linear and so the dual space $\mathcal{H}^* = B(\mathcal{H}, \mathbb{R})$ identifies canonically with $\mathcal{H}$.
Here is some more notation: $S_{\mathcal{H}}$ is the sphere of a real or complex Hilbert space $\mathcal{H}$, and $S^{n-1} = S_{\mathbb{R}^n}$. We denote by $\mathrm{vol}$ the Lebesgue measure on a finite-dimensional Euclidean space, and occasionally by $\mathrm{vol}_n$ the Lebesgue measure on $\mathbb{R}^n$ if we want to emphasize the dimension. If $H$ is a linear or affine subspace, we denote by $\mathrm{vol}_H$ the Lebesgue measure on $H$. We also denote by $\sigma$ the Lebesgue measure on $S^{n-1}$, normalized so that $\sigma(S^{n-1}) = 1$ (see Appendix B.1).

0.3. Bra-ket notation


When working with objects related to Hilbert spaces, particularly the complex ones, we use throughout the book Dirac’s bra-ket notation. It resembles the convention, which may be familiar to some readers and that is commonly used, usually in the real setting, in linear programming/optimization. In that convention, $x \in \mathbb{R}^m$ is a column vector (an $m \times 1$ matrix, which can also be identified with an operator from $\mathbb{R}$ to $\mathbb{R}^m$); the transposition $x^T$ is a row vector, or a linear functional on $\mathbb{R}^m$; $xy^T$ is the outer product of column vectors $x$ and $y$, while $x^T y$ is their inner (scalar) product, defined if $x$ and $y$ have the same dimension.
The Dirac notation has a very similar structure, the differences being that it is (at least a priori) coordinate-free, that the primary operation is $\dagger$ rather than $T$, and that the identification of a given object as a vector or as a functional is intrinsic in the notation. “Standard” vectors in $\mathcal{H}$ are written as $|\psi\rangle$ (a ket vector). The same vector, but thought of as an element of $\mathcal{H}^* \leftrightarrow \overline{\mathcal{H}}$, is identified with $|\psi\rangle^\dagger$ and written as $\langle\psi|$ (a bra vector). The bra-ket notation works seamlessly with standard operations on Hilbert spaces. The action of a functional $\langle\psi|$ on a vector $|\chi\rangle$ is $\langle\psi|\chi\rangle$, an alternative notation for the scalar product $\langle\psi, \chi\rangle$. If $A \in B(\mathcal{H})$ and $\psi \in \mathcal{H}$, then we have $A|\psi\rangle = |A\psi\rangle$ and $\langle A\psi| = (A|\psi\rangle)^\dagger = \langle\psi|A^\dagger$. Consequently, the quantity $\langle\psi'|A|\psi\rangle$ can be read as $\langle\psi', A\psi\rangle$ or as $\langle A^\dagger\psi', \psi\rangle$, the equality of which is a restatement of the definition (0.1).
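
The two readings of $\langle\psi'|A|\psi\rangle$ can be compared numerically. The sketch below is ours and assumes $\mathcal{H} = \mathbb{C}^2$, with operators stored as lists of rows; the helper names `inner`, `apply` and `dagger` and the sample data are hypothetical, introduced only for illustration.

```python
def inner(psi, chi):
    """Return <psi, chi>: conjugate linear in psi, linear in chi."""
    return sum(a.conjugate() * b for a, b in zip(psi, chi))


def apply(A, psi):
    """Return the vector A|psi> for a matrix A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, psi)) for row in A]


def dagger(A):
    """Return the conjugate transpose of A."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


# Hypothetical sample data on C^2, chosen only for illustration.
A = [[1 + 1j, 2j], [3 + 0j, 1 - 2j]]
psi = [1 - 1j, 2 + 0j]
psi_prime = [1j, 1 + 1j]

left = inner(psi_prime, apply(A, psi))            # <psi', A psi>
right = inner(apply(dagger(A), psi_prime), psi)   # <A^dagger psi', psi>

print(abs(left - right) < 1e-12)
```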

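Let $\mathcal{H}_1, \mathcal{H}_2$ be real or complex Hilbert spaces, and let $\psi_1, \psi_2$ be vectors in $\mathcal{H}_1, \mathcal{H}_2$ respectively. Then the operator $|\psi_1\rangle\langle\psi_2| : \mathcal{H}_2 \to \mathcal{H}_1$ acts on $\chi \in \mathcal{H}_2$ as follows
$$|\chi\rangle \mapsto |\psi_1\rangle\langle\psi_2|\chi\rangle = \langle\psi_2|\chi\rangle\,|\psi_1\rangle$$
or, in the standard notation, $\chi \mapsto \langle\psi_2, \chi\rangle\psi_1$. This operator has rank one unless one of the vectors $\psi_1, \psi_2$ is zero.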
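
In coordinates, $|\psi_1\rangle\langle\psi_2|$ is the matrix with entries $(\psi_1)_i\,\overline{(\psi_2)_j}$. The following minimal sketch (ours, with hypothetical helper names and sample data, assuming $\mathcal{H}_1 = \mathbb{C}^3$ and $\mathcal{H}_2 = \mathbb{C}^2$) checks the action described above.

```python
def inner(psi, chi):
    """Return <psi, chi>: conjugate linear in psi, linear in chi."""
    return sum(a.conjugate() * b for a, b in zip(psi, chi))


def ketbra(psi1, psi2):
    """Matrix of |psi1><psi2| : H2 -> H1, entry (i, j) = psi1[i] * conj(psi2[j])."""
    return [[a * b.conjugate() for b in psi2] for a in psi1]


def apply(A, psi):
    """Return the vector A|psi> for a matrix A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, psi)) for row in A]


# Hypothetical sample data, chosen only for illustration.
psi1 = [1 + 0j, 1j, 2 - 1j]   # vector in H1 = C^3
psi2 = [1 - 1j, 2 + 0j]       # vector in H2 = C^2
chi = [3j, 1 + 1j]            # vector in H2 = C^2

image = apply(ketbra(psi1, psi2), chi)            # |psi1><psi2|chi>
expected = [inner(psi2, chi) * a for a in psi1]   # <psi2, chi> psi1

print(all(abs(x - y) < 1e-12 for x, y in zip(image, expected)))
```

Every row of the matrix is a multiple of the same row vector, which is the coordinate form of the rank-one statement.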
In some mathematical circles, the operator $|\psi_1\rangle\langle\psi_2|$ is sometimes denoted $\psi_1 \otimes \psi_2$ or $\psi_2 \otimes \psi_1$, or even $\psi_1 \otimes \overline{\psi_2}$. However, such notation is inconvenient and often ambiguous, and it becomes unmanageable when the Hilbert spaces, in which $\psi_1$ and $\psi_2$ live, are themselves equipped with a tensor product structure.


When $E \subset \mathcal{H}$ is a linear subspace, we denote by $P_E$ the orthogonal projection onto $E$. When $E$ is 1-dimensional, we have $P_E = |x\rangle\langle x|$ for any unit vector $x \in E$.


We denote the standard basis of $\mathbb{C}^d$ by $(|1\rangle, \dots, |d\rangle)$. (Note that while $(|j\rangle)$ is just one of many orthonormal bases of $\mathbb{C}^d$, it becomes canonical if we take into account the lattice structure.) However, sometimes we will employ the enumeration $(|0\rangle, |1\rangle, \dots, |d-1\rangle)$, particularly for $d = 2$, where we will follow the traditional convention from computer science and use $(|0\rangle, |1\rangle)$. Either way, we will refer to this basis as the computational basis. (As explained in Section 3.1, the designation "computational basis" may have an operational meaning, but such subtleties will normally be beyond the scope of our analysis.) Nevertheless, in some cases, particularly in the real context, we will use the notation $e_1, e_2, \dots, e_d$ that is more common in the mathematical literature.


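Exercise 0.1. Check the following properties, where $\psi_1, \chi_1 \in \mathcal{H}_1$, $\psi_2, \chi_2 \in \mathcal{H}_2$, $\chi_3 \in \mathcal{H}_3$, and $A \in B(\mathcal{H}_1, \mathcal{H}_2)$.
(i) product/composition: $|\psi_1\rangle\langle\psi_2| \circ |\chi_2\rangle\langle\chi_3| = \langle\psi_2, \chi_2\rangle\,|\psi_1\rangle\langle\chi_3|$.
(ii) adjoint: $(|\psi_1\rangle\langle\psi_2|)^\dagger = |\psi_2\rangle\langle\psi_1|$.
(iii) trace: $\operatorname{Tr} |\psi_1\rangle\langle\chi_1| = \langle\chi_1, \psi_1\rangle$, $\operatorname{Tr}\big(A|\psi_1\rangle\langle\psi_2|\big) = \langle\psi_2|A|\psi_1\rangle$.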

0.4. Tensor products


Whenever $(\mathcal{H}_i)_{1 \le i \le k}$ are real or complex finite-dimensional Hilbert spaces, we consider the tensor product (over the real or complex field, respectively)
$$\mathcal{H} = \bigotimes_{i=1}^{k} \mathcal{H}_i = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \cdots \otimes \mathcal{H}_k, \tag{0.2}$$
which is often called a multipartite Hilbert space (or bipartite when $k = 2$). The space $\mathcal{H}$ carries a natural Hilbert space structure given by the inner product defined for product vectors by
$$\langle \psi_1 \otimes \cdots \otimes \psi_k, \chi_1 \otimes \cdots \otimes \chi_k \rangle = \prod_{i=1}^{k} \langle \psi_i, \chi_i \rangle$$
and extended to $\mathcal{H}$ by multilinearity. There are canonical identifications
$$B\Big(\bigotimes_{i=1}^{k} \mathcal{H}_i\Big) \longleftrightarrow \bigotimes_{i=1}^{k} B(\mathcal{H}_i),$$
where the tensor products are over the real or complex field, respectively. In the complex case only, another canonical identification is
$$B^{\mathrm{sa}}\Big(\bigotimes_{i=1}^{k} \mathcal{H}_i\Big) \longleftrightarrow \bigotimes_{i=1}^{k} B^{\mathrm{sa}}(\mathcal{H}_i), \tag{0.3}$$
where the tensor products are over the complex field on the left-hand side and over the real field on the right-hand side. Except in the trivial cases, the analogue of (0.3) is false in the setting of real Hilbert spaces: e.g., $B^{\mathrm{sa}}(\mathbb{R}^2) \otimes B^{\mathrm{sa}}(\mathbb{R}^2)$ is a proper subspace of $B^{\mathrm{sa}}(\mathbb{R}^2 \otimes \mathbb{R}^2)$, which can be easily seen by comparing the dimensions.
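The dimension count behind the last remark is short enough to spell out. The sketch below assumes the standard facts that real symmetric $n \times n$ matrices form a space of dimension $n(n+1)/2$ and that complex Hermitian $n \times n$ matrices form a real space of dimension $n^2$; the function names are illustrative only.

```python
def dim_sa_real(n):
    """Dimension of the space of real symmetric n x n matrices."""
    return n * (n + 1) // 2

def dim_sa_complex(n):
    """Real dimension of the space of complex Hermitian n x n matrices."""
    return n * n

# Real case with both factors equal to R^2: the dimensions do not match.
lhs_real = dim_sa_real(2) * dim_sa_real(2)    # B^sa(R^2) (x) B^sa(R^2)
rhs_real = dim_sa_real(2 * 2)                 # B^sa(R^2 (x) R^2)

# Complex case with factors C^2 and C^3: the dimensions agree.
lhs_complex = dim_sa_complex(2) * dim_sa_complex(3)
rhs_complex = dim_sa_complex(2 * 3)
```

Here `lhs_real` is 9 while `rhs_real` is 10, whereas `lhs_complex` and `rhs_complex` are both 36, consistent with (0.3) holding in the complex case only.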


While it is occasionally computationally convenient to allow some of the factors in (0.2) to be 1-dimensional, such factors may simply be dropped and so, when referring to a multipartite Hilbert space, we will normally assume that all the factors are of dimension at least 2.

We often work with concrete spaces such as $(\mathbb{C}^2)^{\otimes k}$, which corresponds to $k$ qubits. In that case the computational basis consists of the $2^k$ vectors of the form $|i_1\rangle \otimes \cdots \otimes |i_k\rangle$, where $(i_1, \dots, i_k) \in \{0, 1\}^k$. It is customary to drop the tensor product sign: for example, the computational basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$ consists of the 4 vectors $|00\rangle, |01\rangle, |10\rangle, |11\rangle$.
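The labelling of the computational basis of $k$ qubits can be generated mechanically. The following sketch is illustrative only: the function names are ours, and the ordering of coordinates (the first tensor factor is the most significant bit) is an assumption, namely the usual Kronecker-product convention.

```python
from itertools import product

def computational_basis_labels(k):
    """Labels i1...ik of the 2^k computational basis vectors of (C^2)^{(x)k}."""
    return ["".join(bits) for bits in product("01", repeat=k)]

def basis_vector(label):
    """Coordinates of |i1...ik>, reading the label as a binary number."""
    vec = [0] * (2 ** len(label))
    vec[int(label, 2)] = 1
    return vec

labels = computational_basis_labels(2)   # ['00', '01', '10', '11']
ket_10 = basis_vector("10")              # [0, 0, 1, 0]
```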


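We also point out that tensor products commute with the operation of taking the dual, i.e., there is a canonical identification
$$(\mathcal{H}_1 \otimes \mathcal{H}_2)^* \leftrightarrow \mathcal{H}_1^* \otimes \mathcal{H}_2^*.$$

Exercise 0.2. Let $\mathcal{H}_1, \mathcal{H}_2$ be complex Hilbert spaces, and consider vectors $x_1, y_1 \in \mathcal{H}_1$ and $x_2, y_2 \in \mathcal{H}_2$. Write explicitly the operator $|x_1 \otimes x_2 + y_1 \otimes y_2\rangle\langle x_1 \otimes x_2 + y_1 \otimes y_2| \in B^{\mathrm{sa}}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ as a linear combination of operators of the form $|z\rangle\langle z| \otimes |z'\rangle\langle z'|$, with $z \in \mathcal{H}_1$ and $z' \in \mathcal{H}_2$.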

0.5. Complexification
Let $V$ be a real vector space. The complexification of $V$ is the vector space $V^{\mathbb{C}} = V \otimes \mathbb{C}$ (the tensor product is over the reals). Elements of $V^{\mathbb{C}}$ are of the form $x \otimes 1 + y \otimes i$ (for $x, y \in V$), which we write $x + iy$ for short.

Note that the complexification of $B^{\mathrm{sa}}(\mathbb{C}^n)$ is canonically isomorphic to $B(\mathbb{C}^n)$.

Note also that for real spaces $V, W$, the spaces $(V \otimes_{\mathbb{R}} W)^{\mathbb{C}}$ and $V^{\mathbb{C}} \otimes_{\mathbb{C}} W^{\mathbb{C}}$ are canonically isomorphic.

Similarly, if $f : V \to W$ is a linear map between real vector spaces, the map $x + iy \mapsto f(x) + i f(y)$ defines canonically a $\mathbb{C}$-linear map (the complexification of $f$) from $V^{\mathbb{C}}$ to $W^{\mathbb{C}}$.

An operation that goes in the opposite direction to complexification is that of dropping the complex structure, i.e., considering a complex space as a real space, so that for example $\mathbb{C}^n$ is treated as $\mathbb{R}^{2n}$. In the abstract setting, if the original complex space was endowed with a scalar product $\langle\cdot,\cdot\rangle$, the corresponding real scalar product is $\operatorname{Re}\langle\cdot,\cdot\rangle$. While this is frequently a useful point of view, particularly in geometric considerations (see Section 1.1), some caution is needed as this operation is not as sound functorially as complexification. For example, $\mathbb{C} \otimes_{\mathbb{C}} \mathbb{C} = \mathbb{C}$ identifies this way with $\mathbb{R}^2$, even though $\mathbb{R}^2 \otimes_{\mathbb{R}} \mathbb{R}^2$ is 4-dimensional.

0.6. Matrices vs. operators

We denote by $M_{m,n}$ the space of $m \times n$ matrices, either real or complex, and by $M_n$ if $m = n$. The entries of a matrix $M \in M_{m,n}$ are denoted by $(m_{ij})_{1 \le i \le m,\, 1 \le j \le n}$. We denote by $M^\dagger$ the Hermitian conjugate of $M$, i.e., $(m_{ij})^\dagger = (\overline{m_{ji}})$. We will denote by $M_m^{\mathrm{sa}} := \{M \in M_m : M = M^\dagger\}$ the subspace of $M_m$ consisting of Hermitian (or self-adjoint) matrices. For matrices with real entries, "self-adjoint" simply means "symmetric."

As a default, we identify complex $m \times n$ matrices with operators from $\mathbb{C}^n$ to $\mathbb{C}^m$ and write $M_{m,n} = B(\mathbb{C}^n, \mathbb{C}^m)$, and similarly $M_n = B(\mathbb{C}^n)$, $M_n^{\mathrm{sa}} = B^{\mathrm{sa}}(\mathbb{C}^n)$. The preceding definitions ensure that the above notion of $\dagger$ is consistent with that introduced in Section 0.2, and that the operator composition is consistent with matrix multiplication. Again, this is fully parallel to the conventions in linear analysis/optimization in the real setting.

More generally, $M_{m,n}$ and $M_n$ can (and often will) be identified with operators on/between any Hilbert spaces of the appropriate dimensions. However, such identification requires specifying bases in the spaces in question and, consequently, is not canonical.

In the real case, $M_n$ is a vector space of dimension $n^2$, and $M_n^{\mathrm{sa}}$ is a subspace of dimension $n(n+1)/2$. In the complex case, $M_n$ is a complex vector space of complex dimension $n^2$, while $M_n^{\mathrm{sa}}$ is a real vector space of real dimension $n^2$.

A natural inner product on $M_{m,n}$ is given by the trace duality: if $M, N \in M_{m,n}$, then
$$\langle M, N \rangle = \operatorname{Tr} M^\dagger N. \tag{0.4}$$
(Recall that we use the "physics" convention for sesquilinear forms, as explained in Section 0.3.) The Euclidean structure on $M_{m,n}$ induced by this inner product is called the Hilbert–Schmidt Euclidean structure, and the corresponding norm is the Hilbert–Schmidt norm $\|M\|_{\mathrm{HS}} = \sqrt{\operatorname{Tr} M^\dagger M}$. (In linear algebra the more commonly used name is Frobenius.) Note that in the complex case the inner product will, in general, not be real. However, if $M, N \in M_m^{\mathrm{sa}}$, then $\langle M, N \rangle = \operatorname{Tr} MN$ is real (even if some of the entries of $M, N$ are complex).
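The last remark is easy to test on an example. The sketch below is a plain-Python illustration with matrices stored as nested lists; the helpers `dagger`, `matmul`, `trace`, `hs_inner` and `hs_norm` are hypothetical names introduced here, not notation from the text.

```python
import math

def dagger(M):
    """Hermitian conjugate (conjugate transpose) of a matrix given as nested lists."""
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def hs_inner(M, N):
    """<M, N> = Tr(M^dagger N), conjugate-linear in M."""
    return trace(matmul(dagger(M), N))

def hs_norm(M):
    """Hilbert-Schmidt (Frobenius) norm."""
    return math.sqrt(hs_inner(M, M).real)

# Two Hermitian matrices with some non-real entries.
M = [[1 + 0j, 2 - 1j], [2 + 1j, 3 + 0j]]
N = [[0j, 1j], [-1j, 1 + 0j]]
value = hs_inner(M, N)   # has zero imaginary part
```

For these two matrices the inner product equals 1, and $\|M\|_{\mathrm{HS}} = \sqrt{20}$, the square root of the sum of the squared moduli of the entries.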

0.7. Block matrices vs. operators on bipartite spaces


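It is convenient to identify operators on $\mathbb{C}^m \otimes \mathbb{C}^n$ with elements of $M_{mn}$ having a block structure. More precisely, to each operator $A \in B(\mathbb{C}^m \otimes \mathbb{C}^n)$ there corresponds the block matrix
$$M = \begin{bmatrix} M_{11} & \cdots & M_{1m} \\ \vdots & & \vdots \\ M_{m1} & \cdots & M_{mm} \end{bmatrix} \tag{0.5}$$
where, for each $i, j \in \{1, \dots, m\}$, the matrix $M_{ij} \in M_n$ is defined as
$$M_{ij} = \begin{bmatrix} (\langle i| \otimes \langle 1|) A (|j\rangle \otimes |1\rangle) & \cdots & (\langle i| \otimes \langle 1|) A (|j\rangle \otimes |n\rangle) \\ \vdots & & \vdots \\ (\langle i| \otimes \langle n|) A (|j\rangle \otimes |1\rangle) & \cdots & (\langle i| \otimes \langle n|) A (|j\rangle \otimes |n\rangle) \end{bmatrix}. \tag{0.6}$$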
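In coordinates, (0.5) and (0.6) amount to slicing an $mn \times mn$ matrix into $n \times n$ blocks. The sketch below assumes that the basis vector $|i\rangle \otimes |k\rangle$ is stored at position $(i-1)n + (k-1)$; this ordering, and the helper names `block` and `kron`, are our assumptions for illustration.

```python
def block(A, m, n, i, j):
    """Block M_ij (an n x n matrix) of the mn x mn matrix A of an operator on C^m (x) C^n.

    The indices i, j run over 1..m as in (0.6); the argument m is kept only
    to document the shape of A.
    """
    return [[A[(i - 1) * n + k][(j - 1) * n + l] for l in range(n)] for k in range(n)]

def kron(X, Y):
    """Kronecker product of two square matrices given as nested lists."""
    p, q = len(X), len(Y)
    return [[X[a // q][b // q] * Y[a % q][b % q] for b in range(p * q)]
            for a in range(p * q)]

X = [[1, 2], [3, 4]]                     # an operator on C^2 (m = 2)
Y = [[0, 1, 0], [1, 0, 0], [0, 0, 5]]    # an operator on C^3 (n = 3)
A = kron(X, Y)                           # the matrix of X (x) Y
M21 = block(A, 2, 3, 2, 1)               # equals 3 * Y
```

For a product operator $X \otimes Y$ the block $M_{ij}$ is simply $x_{ij} Y$, which gives a quick sanity check of the indexing.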

0.8. Operators vs. tensors

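Let $\mathcal{H}_1, \mathcal{H}_2$ be complex Hilbert spaces. The map $u \otimes v \mapsto |v\rangle\langle u|$ induces a canonical identification between the spaces $\overline{\mathcal{H}_1} \otimes \mathcal{H}_2$ and $B(\mathcal{H}_1, \mathcal{H}_2)$. Recall from Section 0.2 that $\overline{\mathcal{H}_1}$ identifies canonically with $\mathcal{H}_1^*$.

As explained in Section 0.2, the use of the complex conjugacy can be avoided if we agree to work with specified bases. Fix bases $(e_i)_{i \in I}$ in $\mathcal{H}_1$ and $(f_j)_{j \in J}$ in $\mathcal{H}_2$. Define a map $\operatorname{vec} : B(\mathcal{H}_1, \mathcal{H}_2) \to \mathcal{H}_2 \otimes \mathcal{H}_1$ as follows: for $i \in I$ and $j \in J$, set $\operatorname{vec}(|f_j\rangle\langle e_i|) = f_j \otimes e_i$ and extend the definition by $\mathbb{C}$-linearity. In other words, for $\psi_1 \in \mathcal{H}_1$, $\psi_2 \in \mathcal{H}_2$ we have $\operatorname{vec} |\psi_2\rangle\langle\psi_1| = \psi_2 \otimes \overline{\psi_1}$, where conjugacy is taken with respect to the basis $(e_i)$.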
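In coordinates the map $\operatorname{vec}$ is just a row-by-row flattening of the matrix of the operator. The sketch below assumes that an operator from $\mathcal{H}_1$ to $\mathcal{H}_2$ is stored as a $\dim\mathcal{H}_2 \times \dim\mathcal{H}_1$ nested list in the bases $(e_i)$, $(f_j)$, and that $f_j \otimes e_i$ sits at position $j \cdot \dim\mathcal{H}_1 + i$ (zero-based); these conventions and the function names are ours.

```python
def vec(A):
    """Flatten the matrix of an operator H1 -> H2 row by row.

    The coefficient of |f_j><e_i| is A[j][i]; it becomes the coefficient
    of f_j (x) e_i in the output.
    """
    return [entry for row in A for entry in row]

def tensor(u, v):
    """Coordinates of u (x) v in the product basis."""
    return [a * b for a in u for b in v]

psi1 = [1 + 1j, 2 + 0j]     # a vector in H1 = C^2
psi2 = [1j, 0j, 3 + 0j]     # a vector in H2 = C^3

# Matrix of the rank-one operator |psi2><psi1| : H1 -> H2.
op = [[b * a.conjugate() for a in psi1] for b in psi2]
conj_psi1 = [a.conjugate() for a in psi1]
```

With these conventions `vec(op)` coincides with `tensor(psi2, conj_psi1)`, which is the coordinate form of $\operatorname{vec} |\psi_2\rangle\langle\psi_1| = \psi_2 \otimes \overline{\psi_1}$.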

0.9. Operators vs. superoperators


It is convenient to use the term superoperator to denote maps acting between spaces of operators, or between spaces of matrices. The distinction between operators and superoperators may seem rather arbitrary since, as we noted earlier, $B(\mathcal{H})$ and $M_{m,n}$ carry a natural Hilbert space structure. However, it helps to organize one's thinking and is widely used in quantum information theory.

Accordingly, we use two different types of notation to denote the identity map: the identity operator on a Hilbert space $\mathcal{H}$ is denoted by $I_{\mathcal{H}}$ (or $I_n$ if $\mathcal{H} = \mathbb{C}^n$ or $\mathbb{R}^n$, or even simply $I$ if there is no ambiguity), while the identity superoperator on $B(\mathcal{H})$ is denoted by $\mathrm{Id}_{B(\mathcal{H})}$ (or simply $\mathrm{Id}$).



0.10. States, classical and quantum


The concept that plays a central role throughout this book is that of a quantum state.

We start by introducing the classical analogue: given a finite set $S$, a classical state on $S$ is simply a probability measure on $S$ (or, equivalently, a probability mass function indexed by $s \in S$). We denote by $\Delta_n$ the set of classical states on $\{0, 1, \dots, n\}$. Geometrically, $\Delta_n$ is an $n$-dimensional simplex; we shall return to this circle of ideas in Chapter 1.

Let $\mathcal{H}$ be a complex finite-dimensional Hilbert space. A quantum state (or simply a state) on $\mathcal{H}$ is a positive self-adjoint operator of trace one. We denote by $D(\mathcal{H})$ the set of states on $\mathcal{H}$ (the letter D stands for density matrix, which is an alternative terminology for states). If $\mathcal{H} = \mathbb{C}^{n+1}$, the subset of $D(\mathcal{H})$ consisting of diagonal operators identifies naturally with $\Delta_n$ (and similarly for operators diagonal with respect to any fixed basis in any finite-dimensional Hilbert space).

In functional analysis, a state on a $C^*$-algebra is, by definition, a positive linear functional of norm 1. This is consistent with the definitions of classical and quantum states introduced above. Indeed, given a finite set $S$, states on the commutative $C^*$-algebra $\mathbb{C}^S$ correspond to classical states on $S$. Similarly, given a finite-dimensional complex Hilbert space $\mathcal{H}$, the states on the $C^*$-algebra $B(\mathcal{H})$ can be identified with elements of $D(\mathcal{H})$ via trace duality (0.4).

CHAPTER 1

Elementary Convex Analysis

In this chapter we present an overview of basic properties of convex sets and convex cones. Unless stated explicitly otherwise, we shall assume that the base field is $\mathbb{R}$ and that all the objects involved are finite-dimensional. However, notions for complex spaces will be important and even indispensable in some settings. They are typically introduced by repeating mutatis mutandis the definitions of their real counterparts. At the same time, one can always consider them as real spaces by ignoring the complex structure.

If $V$ is an $n$-dimensional vector space over $\mathbb{R}$, we will usually assume that $V$ is identified with $\mathbb{R}^n$. This implies in particular that there is a distinguished Euclidean structure (i.e., a scalar product) in $V$, so that $V$ is also identified with its dual $V^*$.
1.1. Normed spaces and convex sets
1.1.1. Gauges. We start with a simple proposition which characterizes the subsets of $\mathbb{R}^n$ that can be the unit balls for some norm. A subset $K \subset \mathbb{R}^n$ is a convex body if it is convex, compact, and has non-empty interior. We similarly define convex bodies in linear (or affine) subspaces of $\mathbb{R}^n$. We will call $K$ symmetric (or 0-symmetric if there is an ambiguity) if it is centrally symmetric with respect to the origin, i.e., $K = -K$.

Proposition 1.1 (easy). Let $K$ be a subset of $\mathbb{R}^n$. The following are equivalent:
(1) $K$ is a symmetric convex body.
(2) There is a norm on $\mathbb{R}^n$ for which $K$ is the unit ball.

Given $K$, the corresponding norm can be retrieved by considering the gauge of $K$, also called the Minkowski functional of $K$, which is defined for $x \in \mathbb{R}^n$ by
$$\|x\|_K := \inf\{t \ge 0 : x \in tK\}, \tag{1.1}$$
where $tK = \{tx : x \in K\}$ (see Figure 1.1). If $X$ is a normed space (most often, $X = (\mathbb{R}^n, \|\cdot\|)$), we will denote its unit ball $\{x : \|x\| \le 1\}$ by $B_X$. (However, to lighten the notation, we will use specialized symbols for various "common" spaces.) The correspondence $X \mapsto B_X$ is the inverse of the correspondence $K \mapsto \|\cdot\|_K$.

In the complex case, the analogue of symmetry is circledness. A convex body $K \subset \mathbb{C}^n$ is said to be circled if for every $\theta \in \mathbb{R}$ and $x \in K$ we have $e^{i\theta} x \in K$. Circled convex bodies are exactly the unit balls of norms in $\mathbb{C}^n$.

Equation (1.1) will also be used to define the gauge of a not necessarily symmetric convex set $K$. However, in order for the gauge to take only finite values and to avoid other degeneracies, we will usually insist that $K$ contain the origin in its interior and that $K$ be closed. We will still denote by $\|\cdot\|_K$ the gauge of such a convex set, and we will still have the (essentially tautological) relation
$$K = \{x : \|x\|_K \le 1\}. \tag{1.2}$$
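Definition (1.1) can be turned directly into a numerical procedure when $K$ is available through a membership test. The following is a minimal sketch, assuming $K$ is a convex body containing 0 in its interior; the function `gauge`, its parameters, and the two example bodies are illustrative choices of ours.

```python
def gauge(x, contains, iterations=100):
    """Approximate ||x||_K = inf{t >= 0 : x in tK} by bisection.

    `contains(y)` is a membership test for K. Since K is convex and
    contains 0, the condition "x in tK" is monotone in t.
    """
    if all(c == 0 for c in x):
        return 0.0
    hi = 1.0
    # Grow the upper bound until x lies in hi * K.
    while not contains([c / hi for c in x]):
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if contains([c / mid for c in x]):
            hi = mid
        else:
            lo = mid
    return hi

def interval(y):
    """The non-symmetric body K = [-1, 2] in R."""
    return -1.0 <= y[0] <= 2.0

def l1_ball(y):
    """The unit ball of the l_1 norm."""
    return sum(abs(c) for c in y) <= 1.0
```

For the interval $K = [-1, 2]$ one finds `gauge([1.0], interval)` close to 0.5 and `gauge([-1.0], interval)` close to 1, a small instance of a gauge that takes different values at $x$ and $-x$ when $K$ is not symmetric. For the $\ell_1$ ball, `gauge([3.0, -4.0], l1_ball)` is close to 7.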
Figure 1.1. Gauge of a convex body (the points $0$, $x$ and $x/\|x\|_K$ are marked along a ray from the origin).

(Observe that if $K$ is closed, the infimum in (1.1) is always attained.) However, if $K$ is not assumed to be symmetric, we should note that in general $\|x\|_K \ne \|-x\|_K$.

We point out that the correspondence between convex bodies and their gauges is order-reversing: $K \subset L$ if and only if $\|\cdot\|_K \ge \|\cdot\|_L$. In the same vein, we have $\|\cdot\|_{tK} = t^{-1} \|\cdot\|_K$ for $t > 0$.
1.1.2. First examples: $\ell_p$-balls, simplex, polytopes, and convex hulls. For $1 \le p \le +\infty$, we denote by $\|\cdot\|_p$ the $\ell_p$-norm, defined for $x \in \mathbb{R}^n$ via
$$\|x\|_p = \Big(\sum_{k=1}^{n} |x_k|^p\Big)^{1/p}, \tag{1.3}$$
where the limit case $p = +\infty$ should be understood as $\|x\|_\infty = \max\{|x_k| : 1 \le k \le n\}$. Recall also that $\|\cdot\|_2$ will usually be denoted by $|\cdot|$. The $\ell_p$-norms satisfy the following inequalities: if $1 \le p \le q \le \infty$ and $x \in \mathbb{R}^n$, then
$$\|x\|_q \le \|x\|_p \le n^{1/p - 1/q} \|x\|_q. \tag{1.4}$$
The normed space $(\mathbb{R}^n, \|\cdot\|_p)$ is denoted by $\ell_p^n$ and its unit ball by $B_p^n$.
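The comparison $\|x\|_q \le \|x\|_p \le n^{1/p-1/q}\|x\|_q$ between $\ell_p$-norms can be checked numerically on examples. The sketch below is illustrative only; the function names and the small floating-point tolerance are our choices.

```python
def lp_norm(x, p):
    """l_p norm of x for 1 <= p <= infinity (p = float('inf') gives the max norm)."""
    if p == float("inf"):
        return max(abs(c) for c in x)
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def check_lp_comparison(x, p, q):
    """Check ||x||_q <= ||x||_p <= n^(1/p - 1/q) ||x||_q for 1 <= p <= q <= infinity."""
    n = len(x)
    inv_p = 0.0 if p == float("inf") else 1.0 / p
    inv_q = 0.0 if q == float("inf") else 1.0 / q
    norm_q, norm_p = lp_norm(x, q), lp_norm(x, p)
    eps = 1e-12 * max(1.0, norm_p)   # tolerance for rounding errors
    return norm_q <= norm_p + eps and norm_p <= n ** (inv_p - inv_q) * norm_q + eps
```

The vector $x = (1, 1, 1, 1)$ with $p = 1$, $q = \infty$ shows that the constant $n^{1/p-1/q}$ cannot be improved: there $\|x\|_1 = 4 = 4^{1} \|x\|_\infty$.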
If $A \subset \mathbb{R}^n$, we denote by $\operatorname{conv} A$ the convex hull of $A$, i.e., the set of all convex combinations of elements of $A$, which is also the smallest convex set containing $A$. The following theorem bounds the length of convex combinations needed to generate the convex hull.

Theorem 1.2 (Carathéodory's theorem, see Exercise 1.1). Let $A \subset \mathbb{R}^n$. Then $\operatorname{conv}(A)$ is the set of all convex combinations of at most $n + 1$ elements of $A$. The same assertion holds if $A \subset H$, where $H$ is an $n$-dimensional affine subspace of $\mathbb{R}^m$ for some $m > n$.

A convex body is a polytope if it is the convex hull of finitely many points. The simplest polytope is the simplex, which is the convex hull of $n + 1$ affinely independent points in $\mathbb{R}^n$. This is the prototypical example of a non-symmetric convex body (for $n \ge 2$). Note that Carathéodory's theorem implies that when $K = \operatorname{conv} A$, then $K$ is the union of all simplices with vertices in $A$ (the dimension of each simplex being equal to $\dim K$).

A simplex is regular if all the pairwise distances between the $n + 1$ vertices are equal. A convenient representation of a regular simplex is as follows: consider the affine hyperplane $H \subset \mathbb{R}^{n+1}$ formed by all vectors whose coordinates add up to 1, and denote by $\Delta_n$ the convex hull of the vectors from the canonical basis in $\mathbb{R}^{n+1}$. Note that $\Delta_n$ is a convex body in $H$, but only a convex subset of $\mathbb{R}^{n+1}$. The simplex $\Delta_n$ corresponds to the set of classical states, i.e., probability measures on $\{0, \dots, n\}$.
Exercise 1.1 (Carathéodory's theorem). Let $A \subset \mathbb{R}^n$, $x \in \operatorname{conv} A$ and consider a decomposition $x = \sum_{i=1}^{N} \lambda_i x_i$ (where $(\lambda_i)$ is a convex combination and $x_i \in A$) of minimal length $N$. Show that the points $(x_i)$ must be affinely independent, and conclude that $N \le n + 1$.

Exercise 1.2. Let $A \subset \mathbb{R}^n$ be a compact set. Show that $\operatorname{conv} A$ is compact.
1.1.3. Extreme points, faces. Let $K \subset \mathbb{R}^n$ be a convex set. A point $x \in K$ is said to be extreme if it cannot be written in a nontrivial way as a convex combination of points of $K$, i.e., if the equality $x = ty + (1 - t)z$ for $t \in (0, 1)$ and $y, z \in K$ implies that $x = y = z$. The following fundamental theorem asserts that, in a sense, all information about a convex body is contained in its extreme points.

Theorem 1.3 (Krein–Milman theorem, see Exercise 1.6). Let $K \subset \mathbb{R}^n$ be a convex body. Then $K$ is the convex hull of its extreme points.

Let $F, K$ be closed convex sets with $F \subset K$. Then $F$ is called a face of $K$ if every segment contained in $K$ whose (relative) interior intersects $F$ is entirely contained in $F$. If $F \ne \emptyset$ and $F \ne K$, $F$ is said to be a proper face. Note that a singleton $\{x\}$ is a face if and only if $x$ is an extreme point. If $F$ is a face of $K$ with $\dim F = \dim K - 1$, then $F$ is called a facet.

A frequently encountered setting in convex or functional analysis is that of two convex sets $K, L$ and a linear or affine map $u$ such that $u(L) \subset K$. For example, if $X, Y$ are normed spaces and $u : X \to Y$ is a linear operator, then $u$ is a contraction iff $u(B_X) \subset B_Y$. The following elementary observation makes it possible to use the facial structure of the sets in question to study these kinds of situations.

Proposition 1.4 (Affine maps preserve faces, see Exercise 1.4). Let $K, L$ be closed convex sets, let $x$ be a point in the relative interior of $L$, and let $u : L \to K$ be an affine map. If $F$ is a face of $K$ such that $u(x) \in F$, then $u(L) \subset F$.


Finally, we introduce some more vocabulary. Let $K \subset \mathbb{R}^n$ be a closed convex set. An affine hyperplane $H \subset \mathbb{R}^n$ is said to be a supporting hyperplane for $K$ if $H \cap \partial K \ne \emptyset$ and $K$ is entirely contained in one of the closed half-spaces delimited by $H$. Note that for any $x \in \partial K$, there is at least one supporting hyperplane for $K$ which contains $x$. A proper subset $F \subset K$ is an exposed face if it is the intersection of $K$ with a supporting hyperplane. We say then that $H$ isolates $F$ (as a face of $K$). Similarly, a point $x \in K$ is an exposed point if $\{x\}$ is an exposed face, i.e., if there exists a vector $y \in \mathbb{R}^n$ such that the linear functional $\langle y, \cdot\rangle$ attains its maximum on $K$ only at $x$. These notions are studied in Exercise 1.5.
Exercise 1.3. Show that the (relative) boundary of a closed convex set is a union of exposed faces.

Exercise 1.4. Prove Proposition 1.4.

Exercise 1.5 (Extreme vs. exposed points, faces vs. exposed faces). Let $K \subset \mathbb{R}^n$ be a closed convex set.
(a) Show that every exposed face $F$ of a closed convex set $K$ is indeed a face of $K$, which is necessarily proper (i.e., $F \ne K, \emptyset$).
(b) Show that the relation "$F$ is a face of $G$" is transitive.
(c) Show that every maximal proper face of a closed convex set $K$ is exposed. Deduce that every facet of $K$ (i.e., a face of dimension $\dim K - 1$) is exposed.
(d) By (a), any exposed point is extreme. Give an example of a convex body $K \subset \mathbb{R}^2$ with an extreme point which is not exposed. (However, a theorem by Straszewicz states that any extreme point is a limit of exposed points; see Theorem 18.6 in [Roc70].) Deduce that the relation "$F$ is an exposed face of $G$" is not transitive.
(e) More generally, for $k \le n - 2$, give an example of a convex body $L \subset \mathbb{R}^n$ with a $k$-dimensional face which is not exposed.
(f) Show that $F$ is a face of $K$ if and only if there exists a sequence $F = F_0 \subset F_1 \subset \dots \subset F_s = K$ such that $F_{i-1}$ is an exposed face of $F_i$ for $i = 1, \dots, s$.
(g) If every point in the (relative) boundary of a convex set $K$ is extreme, $K$ is called strictly convex. Show that, in that case, every point of the boundary is an exposed point.

Exercise 1.6. Prove the Krein–Milman Theorem 1.3 by induction with respect to $n$. (Start by showing that any convex body has at least one extreme point.)

Exercise 1.7. Show that the extreme points of the set of quantum states $D(\mathcal{H})$ are operators of the form $|\psi\rangle\langle\psi|$, where $\psi \in \mathcal{H}$ is a norm one vector (i.e., rank one orthogonal projections).

Exercise 1.8. Show that every face of a polytope is a polytope.

Exercise 1.9. Show that every proper face of a polytope is exposed.

Exercise 1.10. Find the extreme points of $B_p^n$ for $1 \le p \le \infty$.
Exercise 1.11 (Hanner's inequalities and uniform convexity). The goal of this exercise is to prove Hanner's inequalities about the geometry of the $p$-norm, which lead to precise quantitative statements about convexity and smoothness of balls in $L_p$-spaces.
(i) Let $p \in (1, 2]$. For $t > 0$, set $\alpha(t) = (1 + t)^{p-1} + |1 - t|^{p-1} \operatorname{sign}(1 - t)$. Show that for $a, b \in \mathbb{R}$, we have $|a + b|^p + |a - b|^p = \sup\{\alpha(t)|a|^p + \alpha(1/t)|b|^p : t > 0\}$.
(ii) Let $p \in (1, 2]$. Show that for $x, y \in \mathbb{R}^n$,
$$\|x + y\|_p^p + \|x - y\|_p^p \ge (\|x\|_p + \|y\|_p)^p + \big|\|x\|_p - \|y\|_p\big|^p. \tag{1.5}$$
Show also that, for $p \in [2, \infty)$, (1.5) holds with $\le$ instead of $\ge$.
(iii) Let $p \in (1, 2]$. Prove also that for $x, y \in \mathbb{R}^n$,
$$\Big(\frac{\|x + y\|_p^p + \|x - y\|_p^p}{2}\Big)^{2/p} \ge \|x\|_p^2 + (p - 1)\|y\|_p^2. \tag{1.6}$$
(iv) Fix $p \in (1, \infty)$. Show that for any $\varepsilon > 0$ there exists $\delta > 0$ such that whenever $x, y \in B_p^n$ satisfy $\|x - y\|_p \ge \varepsilon$, then $\big\|\frac{x + y}{2}\big\|_p \le 1 - \delta$. (This property of $B_p^n$ is a quantitative version of strict convexity and is called uniform convexity.)
Exercise 1.12 (A Borel selection theorem). Let $K \subset \mathbb{R}^n$ be a convex body. Show that there is a Borel map $\Theta : \mathbb{R}^n \to K$ with the property that for every $x \in \mathbb{R}^n$ we have $\langle \Theta(x), x \rangle = \max\{\langle z, x \rangle : z \in K\}$.

1.1.4. Polarity. This section and the next one will present elements of convex analysis. Readers not familiar with the subject are encouraged to go over the suggested exercises, which are generally simple and elementary, but often contain facts not included in standard texts.

Since norms on $\mathbb{R}^n$ are in one-to-one correspondence with symmetric convex bodies, the notion of duality between normed spaces induces a duality for convex bodies, which is called polarity. Its explicit definition is as follows: if $A \subset \mathbb{R}^n$, the polar of $A$ is
$$A^\circ := \{y \in \mathbb{R}^n : \langle x, y \rangle \le 1 \text{ for all } x \in A\}. \tag{1.7}$$
In particular (cf. (1.2) and Exercise 1.13)
$$\|y\|_{A^\circ} = \sup_{x \in A \cup \{0\}} \langle x, y \rangle. \tag{1.8}$$
The key example is $A = B_X$ (the unit ball of $X$); we have then $A^\circ = B_{X^*}$, the unit ball with respect to the dual norm, the duality being induced by the standard Euclidean structure. For example, duality of $\ell_p$-norms translates into
$$(B_p^n)^\circ = B_q^n, \tag{1.9}$$
where $1/p + 1/q = 1$.
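When $A$ is a finite set, membership in $A^\circ$ and the gauge of $A^\circ$ reduce to finitely many inner products, since the polar of $A$ coincides with the polar of $\operatorname{conv} A$. The sketch below illustrates (1.7), (1.8) and the case $p = 1$, $q = \infty$ of (1.9); the function names are hypothetical and chosen for this example.

```python
def support(points, y):
    """h_A(y) = max over x in A of <x, y>, for a finite set A."""
    return max(sum(a * b for a, b in zip(x, y)) for x in points)

def in_polar(points, y):
    """Membership in the polar of A: <x, y> <= 1 for all x in A, cf. (1.7)."""
    return support(points, y) <= 1.0

def gauge_of_polar(points, y):
    """Gauge of the polar: sup over x in A united with {0} of <x, y>, cf. (1.8)."""
    return max(0.0, support(points, y))

# Extreme points of the unit ball of l_1^3: the vectors +e_i and -e_i.
cross_polytope = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]]
```

Here `gauge_of_polar(cross_polytope, [0.5, -2, 1])` equals 2, the $\ell_\infty$-norm of the vector, as predicted by $(B_1^3)^\circ = B_\infty^3$.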
A larger important class of sets is that of convex bodies containing 0 in the interior; it is stable under the operation of polarity. While most of the properties of the operation $K \mapsto K^\circ$ listed below hold for more general sets, this last class is sufficient for most applications (with the notable exception of cones, see Section 1.2).

Because of the inequality appearing in the definition (1.7), the concept of polarity a priori makes sense only in the category of real Euclidean spaces. We exemplify adjustments needed to make it work in the complex setting in Section 1.3.2, where that setting is at times indispensable.

Since the notion of polarity appeals to the Euclidean structure on $\mathbb{R}^n$, it is not immediately canonical in the category of vector spaces. Equivalently, it depends on how we identify the vector space $\mathbb{R}^n$ with its dual. One useful way to describe this dependence is as follows: if $u \in GL(n, \mathbb{R})$, then
$$(uA)^\circ = (u^T)^{-1}(A^\circ). \tag{1.10}$$
(The dependence of polarity on translation is somewhat less transparent; one promising approach to its description is explored in Appendix D.) A way to make polarity canonical is to consider the polar $K^\circ$ as a subset of $V^*$, the dual of the ambient space $V$ containing $K$. Basically, all the formulas remain the same, except that if $x \in V$ and $x^* \in V^*$, then $\langle x^*, x \rangle$ needs to be understood as $x^*(x)$. This approach is occasionally useful, but is normally avoided since it requires considering twice as many spaces as the other one.

A fundamental result from convex analysis is that if $K$ is closed, convex and contains the origin, then
$$(K^\circ)^\circ = K \tag{1.11}$$
(see also Exercise 1.15). This is the bipolar theorem, a baby version of the Hahn–Banach theorem. When $K$ is a symmetric convex body, this is just saying that a finite-dimensional normed space is reflexive (i.e., canonically isomorphic to its double dual, see [Fol99]).
At the functional-analytic level, the duality exchanges the operations of taking a subspace and taking a quotient. Geometrically, this translates into the fact that polarity exchanges the projection and the section operations. Here is a more precise statement: if $K \subset \mathbb{R}^n$, then, for every linear subspace $E \subset \mathbb{R}^n$,
$$(P_E K)^\circ = E \cap K^\circ, \tag{1.12}$$
where $P_E$ denotes the orthogonal projection onto $E$. Moreover, if $K$ is a convex set containing 0 in the interior, then
$$(K \cap E)^\circ = P_E(K^\circ). \tag{1.13}$$
Note that in the left-hand sides in (1.12) and (1.13), the polars are taken inside $E$, equipped with the induced inner product.

Another pair of simple but useful relations involving polars is
$$(K \cup L)^\circ = K^\circ \cap L^\circ \tag{1.14}$$
for any $K, L \subset \mathbb{R}^n$ and
$$(K \cap L)^\circ = \overline{\operatorname{conv}}(K^\circ \cup L^\circ) \tag{1.15}$$
if $K, L$ are closed, convex and contain the origin.

Exercise 1.13. Find a gap in the following argument. Since $\|y\|_{A^\circ} \le 1$ iff $y \in A^\circ$ iff $\sup_{x \in A} \langle x, y \rangle \le 1$, it follows by homogeneity that $\|y\|_{A^\circ} = \sup_{x \in A} \langle x, y \rangle$.

Exercise 1.14 (Stability properties of polarity). Show that $K \subset \mathbb{R}^n$ is bounded iff $K^\circ$ contains 0 in the interior. Similarly, if $K$ is convex, then it contains 0 in its interior iff $K^\circ$ is bounded.

Exercise 1.15 (The general bipolar theorem). Show that if $K \subset \mathbb{R}^n$ is an arbitrary subset, then $(K^\circ)^\circ = \overline{\operatorname{conv}}(K \cup \{0\})$. (This holds even if $K = \emptyset$, if one applies reasonable conventions.) The bipolar theorem (1.11) is a special case of this statement.

Exercise 1.16 (Polar of a projection). Prove (1.12).

Exercise 1.17 (Polar of a section). The following argument seems to prove that $(K \cap E)^\circ \subset P_E(K^\circ)$, whenever $K$ is an arbitrary convex body containing the origin.
We will represent any point in $\mathbb{R}^n$ as $(x, x')$, where $x \in E$, $x' \in E^\perp$. The condition $y \in (K \cap E)^\circ \subset E$ means that $\langle x, y \rangle \le 1$ for $x \in K \cap E$. In other words, the functional $x \mapsto \langle x, y \rangle$ defined on $E$ is dominated by $\|\cdot\|_K$, and so, by the Hahn–Banach theorem, it extends to a linear functional on $\mathbb{R}^n$ also dominated by $\|\cdot\|_K$. That extension must be of the form $(x, x') \mapsto \langle (x, x'), (y, y') \rangle$ for some $y' \in E^\perp$, and the domination by $\|\cdot\|_K$ means that $(y, y') \in K^\circ$. In particular, $y \in P_E(K^\circ)$.
Find an error. Fix it and complete the proof of (1.13) (under the assumptions stated there). Give an example of $K$ with 0 on the boundary such that (1.13) fails.

Exercise 1.18 (Polars of unions and intersections). Prove (1.14) and (1.15). For the latter, show by examples that each of the hypotheses and the closure on the right-hand side may be needed.

Exercise 1.19 (Polars of polytopes). Show that the polar of a polytope $K \subset \mathbb{R}^n$ is a polytope if and only if $\dim K = n$ and 0 is an interior point of $K$.
1.1.5. Polarity and the facial structure. If $K \subset \mathbb{R}^n$ is a closed convex set containing 0 in the interior and $F$ is an exposed face of $K$, let us define
$$\nu_K(F) := \{y \in K^\circ : \langle y, x \rangle = 1 \text{ for all } x \in F\}. \tag{1.16}$$
Then (see Exercise 1.20) $\nu_K(F)$ is an exposed face of $K^\circ$. Moreover, $F \mapsto \nu_K(F)$ is an injective order-reversing (with respect to inclusion) map between the corresponding sets of exposed faces. If $K$ is a convex body (and so $\nu_{K^\circ}$ is also defined), then $\nu_{K^\circ}\big(\nu_K(F)\big) = F$ for any exposed face $F$ of $K$.

Figure 1.2. A polytope $K$ and its polar $K^\circ$. The reader is encouraged to visualize the bijection $\nu_K$ between vertices (resp. edges, facets) of $K$ and facets (resp. edges, vertices) of $K^\circ$. The map $\nu_K$ is vaguely related to the Gauss map from differential geometry.
If $K$ is a polytope, then the action of $\nu_K$ is very regular: every vertex is mapped to a facet and vice versa, and, more generally, every $k$-dimensional face is mapped to an $(n - k - 1)$-dimensional face (see Figure 1.2).

The situation gets more complicated when dealing with general convex bodies: if $F$ is a maximal face (necessarily exposed, see Exercise 1.5), then $\nu_K(F)$ is a minimal exposed face (not necessarily a minimal face, and certainly not necessarily an extreme point of $K^\circ$). However, it is still possible to retrieve all maximal faces of $K$ from extreme points of $K^\circ$. We have

Proposition 1.5. Let $K \subset \mathbb{R}^n$ be a convex body containing 0 in the interior. For $y \in \partial K^\circ$ we define
$$F_y := \{x \in K : \langle y, x \rangle = 1\}. \tag{1.17}$$
Then $F_y$ is an exposed face of $K$. Moreover, the family
$$\{F_y : y \text{ is an extreme point of } K^\circ\}$$
contains the family of maximal faces of $K$.

The proof of the Proposition is outlined in Exercise 1.21 (see also Exercise 1.22).
Exercise 1.20. Prove the properties of $\nu_K$ listed in the paragraph following its definition in (1.16).

Exercise 1.21 (Extreme points and maximal faces). Prove Proposition 1.5. How does the assertion need to be modified if $K$ is only a closed convex set containing 0 in the interior (i.e., not necessarily bounded)?

Exercise 1.22 (A dual Krein–Milman theorem). Let $K \subset \mathbb{R}^n$ be a closed convex set containing 0 in the interior, let $F_y$ be defined by (1.17), and let $E$ be the set of extreme points of $K^\circ$. Show that the formula $\bigcup_{y \in E} F_y = \partial K$ is a dual restatement of the Krein–Milman theorem (Theorem 1.3).

Exercise 1.23. Give an example of a body $K \subset \mathbb{R}^2$ (containing 0 in the interior) with a maximal face $F$ such that $\nu_K(F)$ is not a minimal face.

Exercise 1.24. Give an example of a body $K \subset \mathbb{R}^2$ (with 0 in the interior) and $y$, an extreme point of $K^\circ$, such that the face $F_y$ given by (1.17) is not maximal.
1.1.6. Ellipsoids. A convex body $K \subset \mathbb{R}^n$ is an ellipsoid if it is the image of $B_2^n$ under an affine transformation. In particular, 0-symmetric ellipsoids are exactly the unit balls of Euclidean norms on $\mathbb{R}^n$ (i.e., norms induced by an inner product). Given a 0-symmetric ellipsoid $\mathcal{E} \subset \mathbb{R}^n$, we denote by $\langle\cdot,\cdot\rangle_{\mathcal{E}}$ the inner product associated to $\mathcal{E}$. Note also that given a 0-symmetric ellipsoid $\mathcal{E}$, there is a unique positive invertible matrix $T$ such that $\mathcal{E} = T(B_2^n)$.

As explained in Section 0.4, there is a canonical notion of tensor product within the category of Euclidean spaces. Accordingly, given two 0-symmetric ellipsoids $\mathcal{E} \subset \mathbb{R}^n$ and $\mathcal{E}' \subset \mathbb{R}^{n'}$, we denote by $\mathcal{E} \otimes_2 \mathcal{E}' \subset \mathbb{R}^n \otimes \mathbb{R}^{n'}$ the resulting ellipsoid, which satisfies
$$\langle x \otimes x', y \otimes y' \rangle_{\mathcal{E} \otimes_2 \mathcal{E}'} = \langle x, y \rangle_{\mathcal{E}} \langle x', y' \rangle_{\mathcal{E}'}$$
for $x, y \in \mathbb{R}^n$ and $x', y' \in \mathbb{R}^{n'}$. An alternative presentation is to say that if $T$ (resp., $T'$) is a linear transformation on $\mathbb{R}^n$ (resp., on $\mathbb{R}^{n'}$) such that $\mathcal{E} = T(B_2^n)$ (resp., such that $\mathcal{E}' = T'(B_2^{n'})$), then
$$\mathcal{E} \otimes_2 \mathcal{E}' = (T \otimes T')(B_2^{nn'}),$$
where we identified $\mathbb{R}^n \otimes \mathbb{R}^{n'}$ with $\mathbb{R}^{nn'}$.
Exercise 1.25 (Spherical sections of ellipsoids). Show that any $(2n - 1)$-dimensional ellipsoid $\mathcal{E}$ admits an $n$-dimensional central section which is a Euclidean ball.

Exercise 1.26 (Polar of an ellipsoid is an ellipsoid). Follow the outline below to give an elementary proof of the fact that the polar of an ellipsoid $\mathcal{E} \subset \mathbb{R}^n$ containing 0 in its interior is again an ellipsoid, and that among translates of a given ellipsoid the volume of the polar is minimized iff the translate is 0-symmetric. (See Exercise D.3 for a computation-free proof.)
(a) Show, by direct calculation, that if $0 \le a < 1$ and $D_a \subset \mathbb{R}^2$ is the disk of unit radius and center at $(a, 0)$, then $D_a^\circ$ is an ellipse with center at $\big(-\frac{a}{1 - a^2}, 0\big)$ and principal semi-axes of length $\frac{1}{1 - a^2}$ and $\frac{1}{\sqrt{1 - a^2}}$. In particular the area of $D_a^\circ$ is minimal iff $a = 0$.
(b) Infer similar statements for the $n$-dimensional Euclidean ball, and then deduce the desired conclusion.

1.2. Cones
A nonempty closed convex subset $C$ of $\mathbb{R}^n$ (or of any real vector space) is called
a cone if whenever $x \in C$ and $t \ge 0$, then $tx \in C$. An equivalent definition: $C$ is a
closed set such that $x, x' \in C$ and $t, t' \ge 0$ imply $tx + t'x' \in C$. Examples of cones
include:
(1) the cone of elements of $\mathbb{R}^n$ with nonnegative coordinates (the positive orthant
$\mathbb{R}^n_+$),
(2) the Lorentz cone $\mathcal{L}_n = \{(x_0, x_1, \dots, x_{n-1}) : x_0 \ge 0, \ \sum_{k=1}^{n-1} x_k^2 \le x_0^2\} \subset \mathbb{R}^n$
for $n \ge 2$,
(3) the cone $\mathrm{PSD} = \mathrm{PSD}(\mathbb{C}^n) \subset \mathsf{M}_n^{\mathrm{sa}}$ of complex positive semi-definite matrices.

1.2.1. Cone duality. The dual cone $C^*$ is defined via

(1.18)    $C^* := \{x \in \mathbb{R}^n : \forall y \in C \ \ \langle x, y \rangle \ge 0\}$.

As was the case with the polarity (see Section 1.1.4), the notion of the dual cone is
not canonical in the category of vector spaces since it appeals to the scalar product.
This can be again circumvented by considering $C^*$ as a subset of the vector space
that is dual to the one containing $C$. We will present some advantages of this point
of view in Appendix D, but will otherwise stick to the more familiar Euclidean
setting.
It is readily checked that the cones $\mathbb{R}^n_+$, $\mathcal{L}_n$ and $\mathrm{PSD}$ defined in the preamble
to Section 1.2 have the remarkable property of being self-dual, i.e., verify $C^* = C$.
(For $C = \mathrm{PSD}$, extend the definition (1.18) mutatis mutandis to the setting of
arbitrary real inner product spaces and use trace duality (0.4).)
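
The self-duality of the Lorentz cone can be probed numerically. The following is a minimal sketch, not part of the original text: the helper names (`in_lorentz`, `random_lorentz_point`) and the sampling scheme are illustrative assumptions. It checks that sampled points of $\mathcal{L}_4$ pair nonnegatively with each other, and that a point outside the cone is separated from it by some element of the cone.

```python
import math
import random

def in_lorentz(x, tol=1e-12):
    """Membership test for L_n: x_0 >= 0 and x_1^2 + ... + x_{n-1}^2 <= x_0^2."""
    return x[0] >= -tol and math.fsum(c * c for c in x[1:]) <= x[0] ** 2 + tol

def dot(x, y):
    """Standard inner product on R^n."""
    return math.fsum(a * b for a, b in zip(x, y))

def random_lorentz_point(n, rng):
    """Sample a point of L_n: pick the last n-1 coordinates, then a large enough x_0."""
    tail = [rng.uniform(-1.0, 1.0) for _ in range(n - 1)]
    radius = math.sqrt(math.fsum(c * c for c in tail))
    return [radius + rng.uniform(0.0, 1.0)] + tail

rng = random.Random(0)  # fixed seed, so the run is reproducible
points = [random_lorentz_point(4, rng) for _ in range(200)]

# L_n is contained in its dual cone: all pairwise inner products are nonnegative.
min_pairing = min(dot(x, y) for x in points for y in points)

# Conversely, a point outside L_n pairs negatively with some element of L_n.
outside = [1.0, 2.0, 0.0, 0.0]
witness = [1.0, -1.0, 0.0, 0.0]  # lies on the boundary of L_4
```

Sampling only illustrates the inclusion $\mathcal{L}_n \subset \mathcal{L}_n^*$; the argument behind it is the Cauchy–Schwarz inequality applied to the last $n-1$ coordinates.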
Not surprisingly, the notion of cone duality is strongly related to that of polarity.
First, a simple argument shows that if $C$ is a (closed convex) cone, then $C^* = -C^\circ$
and, therefore, by (1.11),

(1.19)    $(C^*)^* = C$.

Similarly, for two closed convex cones $C_1, C_2$,

(1.20)    $(C_1 \cap C_2)^* = \overline{C_1^* + C_2^*}$
by (1.15). However, we also have another link to polarity of convex bodies, which is
less obvious. To point out that link, let us first define a base of a closed convex cone
$C \subset \mathbb{R}^n$ to be a closed convex set $K \subset C$ such that (1) the affine space generated by
$K$ does not contain the origin and (2) $K$ generates $C$, i.e., $C = \overline{\mathbb{R}_+ K}$. An alternative
description (which is equivalent, see Exercise 1.27) is as follows: fix a distinguished
nonzero vector $e \in \mathbb{R}^n$ and the corresponding affine hyperplane

(1.21)    $H_e := \{x \in \mathbb{R}^n : \langle x, e \rangle = |e|^2\}$,

in which $e$ is the point closest to the origin. If $C \subset \mathbb{R}^n$ is a closed convex cone such
that $e \in C^* \setminus C^\perp$, the set $C^b$ defined as

(1.22)    $C^b = C \cap H_e$

is then a base of $C$ (that is, $C$ is the smallest closed cone containing $C^b$, see Exercise
1.28). In particular, knowing $C^b$ allows one to reconstruct $C$.
As was to be expected, natural set-theoretic and algebraic operations on cones
induce analogous operations on bases of cones. Sometimes this is as trivial as
$(C_1 \cap C_2)^b = C_1^b \cap C_2^b$, or as simple as $(C_1 + C_2)^b = \mathrm{conv}(C_1^b \cup C_2^b)$. In fact, if we
want to stay in the class of closed cones, the more appropriate form of the latter
formula would be

(1.23)    $(\overline{C_1 + C_2})^b = \overline{\mathrm{conv}(C_1^b \cup C_2^b)}$

(see Exercise 1.30; however, such adjustments are not needed under some natural
nondegeneracy assumptions, which we will describe later in Section 1.2.2).
What is more interesting, and somewhat surprising, is that the duality of
cones likewise carries over to a precise duality of bases in the following sense (see
Figure 1.3; see also Lemma D.1 in Appendix D).
Lemma 1.6. Let $C \subset \mathbb{R}^n$ be a closed convex cone and let $e \in C \cap C^*$ be a nonzero
vector. Let $C^b = C \cap H_e$ and $(C^*)^b = C^* \cap H_e$ be the corresponding bases of $C$ and
$C^*$. Then

(1.24)    $(C^*)^b = \{y \in H_e : \forall x \in C^b \ \ \langle -(y - e), x - e \rangle \le |e|^2\}$.

In other words, if we think of $H_e$ as a vector space with the origin at $e$, and of $C^b$
and $(C^*)^b$ as subsets of that vector space, then $(C^*)^b = -|e|^2 (C^b)^\circ$.

Figure 1.3. A cone and its dual cone. Up to a reflection, the
bases $C^b$ and $(C^*)^b$ are polar to each other with respect to $e$.

Proof. If $\langle x, e \rangle = \langle y, e \rangle = |e|^2$, then $\langle -(y - e), x - e \rangle = -\langle y, x \rangle + |e|^2$ and
so the condition from (1.24) can be restated as "$\forall x \in C^b \ \ -\langle y, x \rangle + |e|^2 \le |e|^2$" or,
more simply, "$\forall x \in C^b \ \ \langle y, x \rangle \ge 0$." Since $C^b$ generates $C$ (see Exercise 1.28), the
latter condition is further equivalent to "$\langle y, x \rangle \ge 0$ for all $x \in C$," i.e., to "$y \in C^*$,"
as required. $\square$
Here are two important classical examples where Lemma 1.6 applies.
(1) The positive orthant $\mathbb{R}^{n+1}_+ \subset \mathbb{R}^{n+1}$. Take $e = \left(\frac{1}{n+1}, \dots, \frac{1}{n+1}\right)$, so that $H_e$ is
given by the equation $x_0 + \cdots + x_n = 1$. Then $(\mathbb{R}^{n+1}_+)^b = \Delta_n$, the set of classical
states. Since $\mathbb{R}^{n+1}_+$ is self-dual, it follows from Lemma 1.6 that

(1.25)    $\Delta_n^\circ = -(n+1)\Delta_n$.

Note that the prefactor is $-(n+1)$ and not $-n$ because the $n$-dimensional ball
circumscribed around $\Delta_n$ is not of unit radius.
(2) The cone $\mathrm{PSD}(\mathbb{C}^n) \subset \mathsf{M}_n^{\mathrm{sa}}$. Take $e = \mathrm{I}/n$ (the maximally mixed state), so
that $H_e$ is the hyperplane of trace one matrices. Then $\mathrm{PSD}^b = \mathrm{D}(\mathbb{C}^n)$, the set of
quantum states. Since $\mathrm{PSD}$ is self-dual, it follows from Lemma 1.6 that

(1.26)    $\mathrm{D}(\mathbb{C}^n)^\circ = -n\,\mathrm{D}(\mathbb{C}^n)$.

The bases of the Lorentz cones $\mathcal{L}_n$ relative to the natural choice $e = e_0$ are
Euclidean balls, so applying Lemma 1.6 just tells us that the Lorentz cone is self-dual
(a property which is easy to verify directly). However, other choices of $e$
lead to nontrivial consequences, see Exercise D.3. Another simple but important
observation is that since $\mathrm{D}(\mathbb{C}^2)$ is a 3-dimensional Euclidean ball (the Bloch ball),
the cone $\mathrm{PSD}(\mathbb{C}^2)$ is isomorphic (or even isometric in the appropriate sense) to the
Lorentz cone $\mathcal{L}_4$ (see Section 2.1.2).

Exercise 1.27. Let $K$ be a base of a closed convex cone $C$, and $H$ the affine
space generated by $K$. Show that $K = C \cap H$.
Exercise 1.28 (Bases generate cones). Show that if $e \in \mathbb{R}^n$ and a closed convex
cone $C \subset \mathbb{R}^n$ are such that $e \in C^* \setminus C^\perp$, and if $C^b$ is defined by (1.21) and (1.22),
then $\overline{\mathbb{R}_+ C^b} = C$. Give an example showing that the closure is needed.
Exercise 1.29 (Nontrivial cones admit bases). Let $C \subset \mathbb{R}^n$ be a closed convex
cone. Show that $C$ admits a base iff $C$ is not a linear subspace iff $C \ne -C$.
Exercise 1.30. Give an example of closed cones $C_1, C_2$ in $\mathbb{R}^3$ such that the
cone $C_1 + C_2$ is not closed.
Exercise 1.31 (Time dilation and the Lorentz cone). Consider the cone
$C_y = \{x \in \mathbb{R}^n : |x| \le \langle x, y \rangle\}$ where $y \in \mathbb{R}^n$ satisfies $|y| > 1$. Show that $C_y^* = C_z$ for
$z = y / \sqrt{|y|^2 - 1}$.

1.2.2. Nondegenerate cones and facial structure. We will be mostly
dealing with (closed convex) cones $C \subset \mathbb{R}^n$ verifying (i) $C \cap (-C) = \{0\}$ and (ii)
$C - C = \mathbb{R}^n$; we will call such cones nondegenerate. The properties (i) and (ii) are
often referred to as $C$ being respectively pointed and full. They are dual to each
other, i.e., $C$ verifies (i) iff $C^*$ verifies (ii), and vice versa; the reader may explore
them further in Exercise 1.32. Here we note the following
Lemma 1.7. Let $C \subset \mathbb{R}^n$ be a closed convex cone. Then $y$ is an interior point
of $C^*$ iff $\langle y, x \rangle > 0$ for every $x \in C \setminus \{0\}$.
Proof. Let $x \in C$. If $B(y, \varepsilon) \subset C^*$ for some $\varepsilon > 0$, then

(1.27)    $\langle y + u, x \rangle \ge 0$ for any $|u| < \varepsilon$.

Since $\inf_{|u| < \varepsilon} \langle y + u, x \rangle = \langle y, x \rangle - \varepsilon |x|$, this is only possible if either $\langle y, x \rangle > 0$ or
$|x| = 0$. This proves the "only if" part (see also Exercise 1.34). For the "if" part, we
note that $B(y, \varepsilon) \subset C^*$ follows if (1.27) holds for $x$ in $C \cap S^{n-1} =: A$. This could be
ensured by choosing $\varepsilon = \inf_{x \in A} \langle y, x \rangle$, which is strictly positive since the continuous
function $\langle y, \cdot \rangle$ is pointwise positive on the compact set $A$. $\square$

Corollary 1.8. If $C$ is a closed convex cone which is pointed, then 0 is an
exposed point of $C$. If, moreover, $C \ne \{0\}$, then $C$ admits a compact base.
Proof. Since $C$ is pointed, $C^*$ has nonempty interior. If $y$ is any interior point
of $C^*$, Lemma 1.7 says that the hyperplane $H = \{x \in \mathbb{R}^n : \langle y, x \rangle = 0\}$ isolates 0 as
an exposed point of $C$, and it readily follows that the base of $C$ induced by $e = y$
is compact. In fact, all the three properties stated in the Corollary are equivalent
(see Exercise 1.32). $\square$
We are now ready to state the main observation of this section. Once made, it
is fairly straightforward to show.
Proposition 1.9 (Faces of cones and faces of bases, see Exercise 1.35). Let
$C \subset \mathbb{R}^n$ be a closed convex cone with a compact base $C^b$. When we exclude the
exposed point 0 of $C$, there is a one-to-one correspondence between faces of $C^b$ and
those of $C$ given by $F \mapsto \mathbb{R}_+ F$. Moreover, this correspondence preserves the exposed
(or non-exposed) character of each face.
An important special case is when $x$ is an extreme (or exposed) point of $C^b$; the
corresponding face of $C$ is then the ray $\mathbb{R}_+ x$, called an extreme ray (or an exposed
ray). The Krein–Milman theorem (see Section 1.1.2) implies then that $C$ is the
convex hull of its extreme rays. We also note for future reference the following
consequence of Proposition 1.9 (for the second part, appeal to Exercise 1.7).
Corollary 1.10. All extreme rays of $\mathrm{PSD}(\mathbb{C}^n)$ are of the form $\mathbb{R}_+ |\psi\rangle\langle\psi|$,
where $\psi \in S_{\mathbb{C}^n}$. All rays contained in the boundary of the Lorentz cone $\mathcal{L}_n$ are
extreme.

Exercise 1.32 (Full cones and pointed cones). Let $C \subset \mathbb{R}^n$ be a closed convex
cone, $C \ne \{0\}$. Show that the following conditions are equivalent:
(a) $C$ is pointed (i.e., $C \cap (-C) = \{0\}$),
(b) $C^*$ is full (i.e., $C^* - C^* = \mathbb{R}^n$),
(c) 0 is an exposed point of $C$,
(d) $C$ does not contain a line,
(e) $C$ admits a compact base,
(f) $\dim C^* = n$,
(g) $\mathrm{span}\, C^* = \mathbb{R}^n$.
Exercise 1.33 (Structure theorem for a general cone). If $C \subset \mathbb{R}^n$ is a closed
convex cone, then there exists a vector subspace $V \subset \mathbb{R}^n$ and a pointed cone
$C' \subset V^\perp$ such that $C = V + C'$ (a direct Minkowski sum).
Exercise 1.34. Deduce the "only if" part of Lemma 1.7 from Proposition 1.4.
Exercise 1.35. Prove Proposition 1.9 relating faces of cones to those of their
bases.
Exercise 1.36. Show that if the cones $C_1^*, C_2^*$ are pointed with the same isolating
hyperplane, then the closure on the right-hand side of (1.20) is not needed.

1.3. Majorization and Schatten norms


1.3.1. Majorization. If $x \in \mathbb{R}^n$, we denote by $x^\downarrow \in \mathbb{R}^n$ the non-increasing
rearrangement of $x$, i.e., the coordinates of $x^\downarrow$ are equal to the coordinates of $x$ up
to permutation, and $x_1^\downarrow \ge \cdots \ge x_n^\downarrow$.
Definition 1.11. If $x, y \in \mathbb{R}^n$ with $\sum_{i=1}^n x_i = \sum_{i=1}^n y_i$, we say that $x$ is
majorized by $y$, and write $x \prec y$, if

(1.28)    $\sum_{j=1}^k x_j^\downarrow \le \sum_{j=1}^k y_j^\downarrow$ for any $k \in \{1, 2, \dots, n\}$.

Note that, by hypothesis, (1.28) becomes an equality for $k = n$.
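
Definition 1.11 translates directly into a test on partial sums. The sketch below is an illustration only; the function name `is_majorized` and the numerical tolerance are assumptions, not notation from the text.

```python
from itertools import accumulate

def is_majorized(x, y, tol=1e-12):
    """Return True if x is majorized by y in the sense of Definition 1.11."""
    if len(x) != len(y) or abs(sum(x) - sum(y)) > tol:
        return False  # the definition presupposes equal coordinate sums
    xs = sorted(x, reverse=True)  # non-increasing rearrangement of x
    ys = sorted(y, reverse=True)  # non-increasing rearrangement of y
    # Compare the partial sums from (1.28) for k = 1, ..., n.
    return all(a <= b + tol for a, b in zip(accumulate(xs), accumulate(ys)))

uniform = [1 / 3, 1 / 3, 1 / 3]
point_mass = [1.0, 0.0, 0.0]
u = [0.6, 0.2, 0.2]
v = [0.5, 0.5, 0.0]
```

With these inputs, `uniform` is majorized by every probability vector of length 3 and `point_mass` majorizes all of them, while `u` and `v` are incomparable, which shows that majorization is only a partial order.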


The majorization property will be a crucial tool in Chapter 10. As a warm-up,
we will use it in the next section to prove the Davis convexity theorem and various
properties of Schatten norms (non-commutative $\ell_p$-norms).
There are several equivalent reformulations of the majorization property. We
gather some of them in the following proposition.
Proposition 1.12. For $x, y \in \mathbb{R}^n$ with $\sum x_i = \sum y_i$, the following conditions
are equivalent.
(i) $x \prec y$.
(ii) $x$ can be written as a convex combination of coordinatewise permutations
of $y$.
(iii) There is an $n \times n$ bistochastic matrix $B$ such that $x = By$ (a matrix is
bistochastic if its entries are non-negative, and add up to 1 in each row
and each column).
(iv) Whenever $\varphi$ is a permutationally invariant convex function on $\mathbb{R}^n$, then
$\varphi(x) \le \varphi(y)$.
(v) For every $t \in \mathbb{R}$, we have $\sum_{i=1}^n |x_i - t| \le \sum_{i=1}^n |y_i - t|$.
(vi) For every $t \in \mathbb{R}$, we have $\sum_{i=1}^n (x_i - t)_+ \le \sum_{i=1}^n (y_i - t)_+$, where
$x_+ = \max(x, 0)$.

Sketch of the proof. Fix $y \in \mathbb{R}^n$, and consider the non-empty convex compact
set
    $K_y = \{x \in \mathbb{R}^n : x \prec y\}$.
It is easily checked that $x$ is an extreme point of $K_y$ if and only if $x^\downarrow = y^\downarrow$, and
it follows from the Krein–Milman theorem that (i) is equivalent to (ii). Similarly,
the classical Birkhoff theorem, which asserts that extreme points of the set of
bistochastic matrices are exactly permutation matrices, gives the equivalence of (ii)
and (iii). The implications (ii) $\Rightarrow$ (iv) $\Rightarrow$ (v) are obvious. We check that (v) and
(vi) are equivalent since $|x| = 2x_+ - x$ (using the fact that $\sum x_i = \sum y_i$). Finally,
for $t = y_k^\downarrow$, we compute
    $\sum_{i=1}^n (y_i - t)_+ = \sum_{i=1}^k (y_i^\downarrow - t) = \sum_{i=1}^k y_i^\downarrow - kt$,
    $\sum_{i=1}^n (x_i - t)_+ = \sum_{i=1}^n (x_i^\downarrow - t)_+ \ge \sum_{i=1}^k (x_i^\downarrow - t)_+ \ge \sum_{i=1}^k (x_i^\downarrow - t) = \sum_{i=1}^k x_i^\downarrow - kt$.
Therefore, the inequality from (vi) implies that $\sum_{i=1}^k x_i^\downarrow \le \sum_{i=1}^k y_i^\downarrow$, hence
$x \prec y$. $\square$
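
Several of the equivalent conditions of Proposition 1.12 can be compared on a small example. The following sketch builds a vector as a convex combination of permutations of $y$ (condition (ii)) and then checks condition (i) and, on a few sample values of $t$, condition (vi). The helper names and the particular weights are hypothetical choices made for illustration.

```python
from itertools import accumulate

def is_majorized(x, y, tol=1e-9):
    """Condition (i): equal totals and ordered partial sums of sorted coordinates."""
    if abs(sum(x) - sum(y)) > tol:
        return False
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    return all(a <= b + tol for a, b in zip(accumulate(xs), accumulate(ys)))

def mix_permutations(y, weights):
    """Condition (ii): convex combination of coordinate permutations of y.

    `weights` maps a permutation (tuple of indices) to its nonnegative coefficient;
    the coefficients are assumed to add up to 1.
    """
    return [sum(w * y[perm[i]] for perm, w in weights.items()) for i in range(len(y))]

def hinge_sum(v, t):
    """The quantity sum_i (v_i - t)_+ appearing in condition (vi)."""
    return sum(max(c - t, 0.0) for c in v)

y = [4.0, 1.0, 0.0]
weights = {(0, 1, 2): 0.5, (1, 2, 0): 0.3, (2, 0, 1): 0.2}
x = mix_permutations(y, weights)

sample_thresholds = [-1.0, 0.0, 0.5, 1.0, 2.0, 4.0]
hinge_ok = all(hinge_sum(x, t) <= hinge_sum(y, t) + 1e-9 for t in sample_thresholds)
```

Checking finitely many thresholds is of course not a proof of (vi); by the computation in the sketch of the proof, the values $t = y_k^\downarrow$ are the ones that matter for deducing (i).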
Exercise 1.37. Show that, in the statement of Proposition 1.12, we have to
assume the hypothesis $\sum x_i = \sum y_i$ only in (vi); in (ii)–(v) this property follows
formally.
Exercise 1.38 (Submajorization). Given $x, y \in \mathbb{R}^n$, we say that $x$ is submajorized
by $y$ and write $x \prec_w y$ if (1.28) holds (the difference with majorization is
that we do not assume $\sum x_i = \sum y_i$). Show that $x \prec_w y$ if and only if there exists
$u \in \mathbb{R}^n$ such that $u \prec y$ and $x_k \le u_k$ for every $1 \le k \le n$.

1.3.2. Schatten norms. Recall that the space $\mathsf{M}_{m,n}$ of (real or complex) $m \times n$
matrices carries a Euclidean structure given by the Hilbert–Schmidt inner product
(see Section 0.6). The Hilbert–Schmidt norm is a special case of the Schatten
$p$-norms, which are the non-commutative analogues of the $\ell_p$-norms. If $M \in \mathsf{M}_{m,n}$,
define $|M| := (M^\dagger M)^{1/2}$, and for $1 \le p \le \infty$,
    $\|M\|_p := \left(\mathrm{Tr}\, |M|^p\right)^{1/p}$.
Note that $\|\cdot\|_{\mathrm{HS}} = \|\cdot\|_2$. The case $p = \infty$ should be interpreted as the limit $p \to \infty$
of the above, and corresponds to the usual operator norm
    $\|M\|_\infty = \|M\|_{\mathrm{op}} := \sup_{|x| \le 1} |Mx|$.
The quantity $\|M\|_1 = \mathrm{Tr}\, |M|$ is called the trace norm of $M$. Occasionally we will
loosely refer to various matrix spaces endowed with Schatten norms as Schatten
spaces or $p$-Schatten spaces.
There is ambiguity in the notation $\|\cdot\|_p$ in that it has two possible meanings:
the Schatten $p$-norm on $\mathsf{M}_{m,n}$ (matrices) and the usual $\ell_p$-norm on $\mathbb{R}^n$ or $\mathbb{C}^n$
(sequences). However, it will always be clear from the context which of the two is the
intended one.
If $M \in \mathsf{M}_{m,n}$, and if we denote by $s(M) = (s_1(M), \dots, s_n(M))$ the singular
values of $M$ (i.e., the eigenvalues of $|M|$) arranged in the non-increasing order,
then for any $p$,

(1.29)    $\|M\|_p = \|s(M)\|_p$.

The following lemma allows us to reduce the study of Schatten norms to the case
of self-adjoint matrices.
Lemma 1.13. Let $M \in \mathsf{M}_{m,n}$, and $\tilde{M} \in \mathsf{M}_{m+n}$ be the self-adjoint matrix defined
by
    $\tilde{M} = \begin{bmatrix} 0 & M \\ M^\dagger & 0 \end{bmatrix}$.
Then we have $\|\tilde{M}\|_p = 2^{1/p} \|M\|_p$ for $1 \le p \le \infty$. Similarly, if $M, N \in \mathsf{M}_{m,n}$, then
$\mathrm{Tr}\, \tilde{M} \tilde{N} = 2\, \mathrm{Re}\, \mathrm{Tr}\, M^\dagger N$.
Proof. For the first assertion, it suffices to notice that the nonzero eigenvalues of $\tilde{M}$
are equal to $\pm s_i(M)$. The second assertion is verified by direct calculation. $\square$

The next lemma shows how the concept of majorization relates to
eigenvalues/singular values of a matrix.

Lemma 1.14 (Spectrum majorizes the diagonal). Let $M \in \mathsf{M}_n$ be a self-adjoint
matrix, let $d(M) = (m_{ii}) \in \mathbb{R}^n$ be the vector of diagonal entries of $M$, and let
$\mathrm{spec}(M) = (\lambda_i) \in \mathbb{R}^n$ be the vector of eigenvalues of $M$, arranged in non-increasing
order. Then $d(M) \prec \mathrm{spec}(M)$.
Proof. First, it is known from linear algebra that $\sum_i m_{ii} = \sum_i \lambda_i$, so
majorization is in principle possible. Write $M$ as $M = U \Lambda U^\dagger$, where $\Lambda$ is a diagonal
matrix whose entries are the eigenvalues of $M$, and $U$ is a unitary matrix. We then
have
    $m_{ii} = \sum_j u_{ij} \lambda_j \overline{u_{ij}} = \sum_j |u_{ij}|^2 \lambda_j$.
Since the matrix with entries $|u_{ij}|^2$ is bistochastic, the assertion follows from
Proposition 1.12 (iii). $\square$
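
For a $2 \times 2$ real symmetric matrix the statement of Lemma 1.14 can be checked by hand, since the eigenvalues have a closed form. The sketch below is only an illustration under that restriction; the function name `eigenvalues_sym2` and the sample entries are assumptions.

```python
import math

def eigenvalues_sym2(a, b, c):
    """Eigenvalues, in non-increasing order, of the real symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)  # half the gap between the two eigenvalues
    return [mean + radius, mean - radius]

a, b, c = 2.0, 1.5, -1.0
diagonal = sorted([a, c], reverse=True)  # d(M), rearranged in non-increasing order
spectrum = eigenvalues_sym2(a, b, c)     # spec(M)

# Majorization for n = 2: equal sums, and the largest diagonal entry does not
# exceed the largest eigenvalue.
same_trace = abs(sum(diagonal) - sum(spectrum)) < 1e-12
top_entry_dominated = diagonal[0] <= spectrum[0] + 1e-12
```

Since the off-diagonal entry $b$ only enters through `radius`, the sketch also makes visible that the diagonal and the spectrum coincide exactly when $b = 0$.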

We now state the Davis convexity theorem, which gives a characterization of
all convex functions $f$ on $\mathsf{M}_m^{\mathrm{sa}}$ that are unitarily invariant.

Proposition 1.15 (Davis convexity theorem). Let $f : \mathsf{M}_m^{\mathrm{sa}} \to \mathbb{R}$ be a function
which is unitarily invariant, i.e., such that $f(UAU^\dagger) = f(A)$ for any self-adjoint
matrix $A$ and any unitary matrix $U$. Then $f$ is convex if and only if the restriction
of $f$ to the subspace of diagonal matrices is convex.
Proof. Assume that the restriction of $f$ to diagonal matrices is convex (the
converse implication being obvious). This restriction, when considered as a function
on $\mathbb{R}^m$, is permutationally invariant, as can be checked by choosing for $U$ a
permutation matrix. Given $0 < \lambda < 1$ and $A, B \in \mathsf{M}_m^{\mathrm{sa}}$, we need to show that

(1.30)    $f(\lambda A + (1 - \lambda)B) \le \lambda f(A) + (1 - \lambda) f(B)$.

Since $f$ is unitarily invariant, we may assume that the matrix $\lambda A + (1 - \lambda)B$ is
diagonal. Denoting by $\mathrm{diag}\, A$ the matrix obtained from a matrix $A$ by changing all
its off-diagonal elements to 0, the hypothesis on $f$ implies
    $f(\lambda A + (1 - \lambda)B) \le \lambda f(\mathrm{diag}\, A) + (1 - \lambda) f(\mathrm{diag}\, B)$.
Using Lemma 1.14 and Proposition 1.12 (iv), it follows that $f(\mathrm{diag}\, A) \le f(A)$ and
$f(\mathrm{diag}\, B) \le f(B)$, showing (1.30). $\square$

An immediate consequence of the Davis convexity theorem is that the Schatten
$p$-norms satisfy the triangle inequality.
Proposition 1.16. For $1 \le p \le \infty$, if $M, N \in \mathsf{M}_{m,n}$, we have
    $\|M + N\|_p \le \|M\|_p + \|N\|_p$.
Proof. By the first assertion of Lemma 1.13, it is enough to consider the case
of $m = n$ and self-adjoint $M, N$. We now use Proposition 1.15 for the unitarily
invariant function $f(\cdot) = \|\cdot\|_p$. The restriction of $\|\cdot\|_p$ to the subspace of diagonal
matrices identifies with the usual (commutative) $\ell_p$-norm on $\mathbb{R}^m$, and hence, by
Proposition 1.15, the function $\|\cdot\|_p$ is convex on $\mathsf{M}_m^{\mathrm{sa}}$. Since it is also positively
homogeneous, the triangle inequality follows. $\square$


Obviously, the Schatten $p$-norms of a given matrix satisfy the same inequalities
as the $\ell_p$-norms: if $1 \le p \le q \le \infty$, and $M$ is an $m \times n$ matrix (with $m \le n$; what
is important is that the rank of $M$ is at most $m$), then

(1.31)    $\|M\|_q \le \|M\|_p \le m^{1/p - 1/q} \|M\|_q$.
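
By (1.29), Schatten norms depend only on the singular values, so the inequalities in (1.31) can be illustrated without any matrix computation. The sketch below takes a list of singular values as given (computing them from a matrix is outside its scope); the helper names are hypothetical.

```python
import math

def schatten_norm(singular_values, p):
    """Schatten p-norm computed from the singular values, following (1.29)."""
    if p == math.inf:
        return max(singular_values)
    return math.fsum(s ** p for s in singular_values) ** (1.0 / p)

def inverse(p):
    """1/p with the convention 1/infinity = 0."""
    return 0.0 if p == math.inf else 1.0 / p

def satisfies_1_31(singular_values, p, q, tol=1e-12):
    """Check ||M||_q <= ||M||_p <= m^(1/p - 1/q) ||M||_q for p <= q."""
    m = len(singular_values)  # plays the role of the bound on the rank
    norm_p = schatten_norm(singular_values, p)
    norm_q = schatten_norm(singular_values, q)
    return norm_q <= norm_p + tol and norm_p <= m ** (inverse(p) - inverse(q)) * norm_q + tol

s = [3.0, 2.0, 1.0, 0.0]  # singular values of a hypothetical 4 x n matrix
exponent_pairs = [(1, 2), (1, math.inf), (2, math.inf), (1.5, 3)]
all_hold = all(satisfies_1_31(s, p, q) for p, q in exponent_pairs)
```

The right-hand inequality becomes an equality when all singular values coincide, for example for `s = [1.0, 1.0, 1.0, 1.0]`.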
Duality between Schatten $p$-norms holds as in the commutative case.

Proposition 1.17 (The non-commutative Hölder inequality). Let $1 \le p, q \le \infty$
such that $1/p + 1/q = 1$, and $M \in \mathsf{M}_{m,n}$, $N \in \mathsf{M}_{n,m}$. We have

(1.32)    $|\mathrm{Tr}\, MN| \le \|M\|_p \|N\|_q$.

As a consequence, the Schatten $p$-norm and $q$-norm are dual to each other. This
holds in all settings: for rectangular matrices (real or complex), for Hermitian
matrices, and for real symmetric matrices.
As in the case of $\ell_p^n$-spaces, the above duality relation can be equivalently
expressed in terms of polars. Denote by $S_p^{m,n}$ the unit ball associated to the Schatten
norm $\|\cdot\|_p$ on $\mathsf{M}_{m,n}$ and $S_p^{m,\mathrm{sa}} := S_p^{m,m} \cap \mathsf{M}_m^{\mathrm{sa}}$. (Again, there are two settings, real
and complex, and some care needs to be exercised as minor subtleties occasionally
arise.) We then have
Corollary 1.18. If $1 \le p, q \le \infty$ with $1/p + 1/q = 1$, then

(1.33)    $S_q^{m,n} = \{A \in \mathsf{M}_{m,n} : |\langle X, A \rangle| \le 1 \text{ for all } X \in S_p^{m,n}\}$
(1.34)    $\phantom{S_q^{m,n}} = \{A \in \mathsf{M}_{m,n} : \mathrm{Re}\, \langle X, A \rangle \le 1 \text{ for all } X \in S_p^{m,n}\}$,
(1.35)    $S_q^{m,\mathrm{sa}} = \left(S_p^{m,\mathrm{sa}}\right)^\circ$,
where $\langle \cdot, \cdot \rangle$ and $^\circ$ are meant in the sense of trace duality (0.4).
While (1.33) and (1.35) are simply straightforward reformulations of duality
relations from Proposition 1.17, the equality in (1.34) needs to be justified (only the
inclusion "$\subset$" is immediate). Given $A \in \mathsf{M}_{m,n}$ and $X \in S_p^{m,n}$ such that $|\langle X, A \rangle| > 1$,
let $\xi = \frac{\langle X, A \rangle}{|\langle X, A \rangle|}$. Then, setting $X' = \bar{\xi} X$, we see that $X' \in S_p^{m,n}$, while
$\mathrm{Re}\, \langle X', A \rangle = |\langle X, A \rangle| > 1$, which yields the other inclusion "$\supset$" in (1.34). The expression in
(1.34) can be thought of as a definition of the polar $\left(S_p^{m,n}\right)^\circ$ by "dropping the
complex structure"; see Exercise 1.48 for the general principle. Another potential
complication is that, in the complex setting, the identification with the dual space
is anti-linear, see Section 0.2. Note that no issues of such nature arise in defining
the polar of $S_p^{m,\mathrm{sa}}$, as that set "lives" in a real inner product space irrespectively of
the setting.
Proof of Proposition 1.17. Consider first the Hermitian case. By unitary
invariance, we may assume that $M$ is diagonal. We then have
    $|\mathrm{Tr}(MN)| = \left|\sum_i m_{ii} n_{ii}\right| \le \|(m_{ii})\|_p \, \|(n_{ii})\|_q \le \|M\|_p \|N\|_q$,
where we used the commutative Hölder inequality, Lemma 1.14, and Proposition
1.12 (iv).
In the general case, Lemma 1.13 and the Hermitian case of (1.32) shown above
imply that, for all $M, N \in \mathsf{M}_{n,m}$,
    $\mathrm{Re}\, \mathrm{Tr}\, M^\dagger N \le \|M\|_p \|N\|_q$,
and the same bound for $|\mathrm{Tr}(MN)|$ (or $|\mathrm{Tr}(M^\dagger N)|$) follows by the same trick as
the one used to establish equality in (1.34) (see the paragraph following Corollary
1.18).
As in the commutative case, Hölder's inequality constitutes "the hard part"
of the duality assertion, such as the inclusion $S_q^{m,\mathrm{sa}} \subset \left(S_p^{m,\mathrm{sa}}\right)^\circ$ in (1.35). "The
easy part" involves establishing that for every $M$, there is $N \ne 0$ such that we
have equality in (1.32). In the Hermitian case, this follows readily by restricting
attention to matrices that diagonalize in the same orthonormal basis as $M$ and by
appealing to the analogous statement for the usual $\ell_p$-norm. In the general case
one considers similarly the singular value decomposition of $M$. $\square$
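
The "easy part" of the duality reduces to the equality case of the commutative Hölder inequality, which can be made explicit. The sketch below works with vectors, i.e., with the diagonals of simultaneously diagonal matrices, and assumes $1 < p < \infty$; the choice $n_i = \mathrm{sign}(m_i)\,|m_i|^{p-1}$ is the standard extremal one, and the helper names are illustrative.

```python
import math

def lp_norm(v, p):
    """The l_p-norm of a real vector, for finite p >= 1."""
    return math.fsum(abs(c) ** p for c in v) ** (1.0 / p)

def pairing(m, n):
    """sum_i m_i n_i, i.e., Tr(MN) for diagonal matrices M and N."""
    return math.fsum(a * b for a, b in zip(m, n))

def extremal_partner(m, p):
    """For 1 < p < infinity, a vector n with <m, n> = ||m||_p ||n||_q, 1/p + 1/q = 1."""
    return [math.copysign(abs(c) ** (p - 1.0), c) for c in m]

p = 3.0
q = p / (p - 1.0)  # conjugate exponent
m = [3.0, -1.0, 2.0]
n_generic = [1.0, 1.0, 1.0]
n_extremal = extremal_partner(m, p)

generic_gap = lp_norm(m, p) * lp_norm(n_generic, q) - abs(pairing(m, n_generic))
extremal_gap = lp_norm(m, p) * lp_norm(n_extremal, q) - pairing(m, n_extremal)
```

For a Hermitian $M$ with eigenvalues `m`, placing `n_extremal` on the diagonal in an eigenbasis of $M$ gives a matrix $N$ attaining equality in (1.32), which is the construction alluded to in the proof.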


Exercise 1.39 (Davis convexity theorem, the real case). State and prove a real
version of Proposition 1.15, i.e., for functions defined on the set of real symmetric
matrices.
Exercise 1.40 (Klein's lemma). Show that if the function $\varphi : \mathbb{R} \to \mathbb{R}$ is convex,
then $X \mapsto \mathrm{Tr}\, \varphi(X)$ is convex on the set of self-adjoint matrices, and similarly for
$\varphi : I \to \mathbb{R}$ and the set of self-adjoint matrices with spectrum in $I$, where $I \subset \mathbb{R}$ is
an interval.
Exercise 1.41. Show that the function $X \mapsto \log \mathrm{Tr} \exp(X)$ is convex on the
set of self-adjoint matrices.
Exercise 1.42 (Log-concavity of the determinant). Show that the function
$\log \det$ is strictly concave on the interior of $\mathrm{PSD}$.
Exercise 1.43. Show that if a function $X \mapsto \Phi(X)$ is convex on $\mathsf{M}_n$ and
unitarily invariant, then $\Phi(\mathrm{diag}\, X) \le \Phi(X)$ for any $X \in \mathsf{M}_n$ (and similarly for
$\mathsf{M}_n^{\mathrm{sa}}$ in place of $\mathsf{M}_n$). If $\Phi$ is strictly convex and $X$ is not diagonal, then the
inequality is strict.
Exercise 1.44 (Extreme points of Schatten unit balls). What are the extreme
points of $S_1^{m,n}$? Of $S_1^{m,\mathrm{sa}}$? Of $S_\infty^{m,n}$? $S_\infty^{m,\mathrm{sa}}$? For the latter, how many connected
components does the set of extreme points have?
Exercise 1.45 (Spectral theorem and SVD vs. Carathéodory's theorem). Let
$K$ be one of $S_\infty^n$, $S_1^n$, $S_\infty^{n,\mathrm{sa}}$, $S_1^{n,\mathrm{sa}}$. Show that every element of $K$ can be written as a
convex combination of $n + 1$ extreme points of $K$. Compare this fact with what one
obtains by a direct application of Carathéodory's Theorem 1.2 in the respective
matrix space.
Exercise 1.46 (The real Schatten balls). In the real case, the space $\mathsf{M}_2^{\mathrm{sa}}$ is
3-dimensional. Which familiar solids are $S_1^{2,\mathrm{sa}}$ and $S_\infty^{2,\mathrm{sa}}$?
Exercise 1.47 (Characterization of unitarily invariant norms). Let $m \le n$,
and $\|\cdot\|$ be a norm on $\mathbb{R}^m$ such that
    $\|(\varepsilon_1 x_{\sigma(1)}, \dots, \varepsilon_m x_{\sigma(m)})\| = \|(x_1, \dots, x_m)\|$
for any $x \in \mathbb{R}^m$, $\varepsilon \in \{-1, 1\}^m$ and $\sigma \in \mathfrak{S}_m$. (We call such norms permutationally
symmetric.) Show that $M \mapsto \|s(M)\|$ is a norm on $\mathsf{M}_{m,n}$ and that every norm
which is bi-unitarily invariant (i.e., verifying $\|UMV\| = \|M\|$ for $U \in \mathsf{U}(m)$ and
$V \in \mathsf{U}(n)$) can be defined in this way.
Exercise 1.48 (Polarity in the complex setting). If $\mathcal{H}$ is a complex Hilbert
space and $K$ a closed convex subset, the polar of $K$ can be defined via
$K^\circ := \{y \in \mathcal{H} : \mathrm{Re}\, \langle x, y \rangle \le 1 \text{ for all } x \in K\}$, i.e., by dropping the complex structure, as
described in Section 0.5. Show that $K^\circ = \{y \in \mathcal{H} : |\langle x, y \rangle| \le 1 \text{ for all } x \in K\}$ if
and only if $K$ is circled.

1.3.3. Von Neumann and Rényi entropies. Let $\mathrm{D}(\mathbb{C}^d)$ be the set of quantum
states on $\mathbb{C}^d$ (see Section 0.10) and $\sigma \in \mathrm{D}(\mathbb{C}^d)$. The von Neumann entropy of
$\sigma$ is defined as

(1.36)    $S(\sigma) = -\mathrm{Tr}(\sigma \log \sigma)$,

where $\log$ is the natural logarithm. (Note that many texts use base 2 logarithm to
define entropy, see Notes and Remarks.)
Proposition 1.19. The von Neumann entropy $S$ satisfies the following properties:
(i) it is a concave function from $\mathrm{D}(\mathbb{C}^d)$ onto $[0, \log d]$,
(ii) for $\sigma \in \mathrm{D}(\mathbb{C}^d)$, we have $S(\sigma) = 0$ if and only if $\sigma$ is pure (i.e., has rank 1),
(iii) for $\sigma \in \mathrm{D}(\mathbb{C}^d)$, we have $S(\sigma) = \log d$ if and only if $\sigma = \mathrm{I}/d$,
(iv) if $\sigma \in \mathrm{D}(\mathbb{C}^d)$ and $U \in \mathsf{U}(d)$, then $S(\sigma) = S(U \sigma U^\dagger)$,
(v) if $\sigma \in \mathrm{D}(\mathbb{C}^d)$ and $\tau \in \mathrm{D}(\mathbb{C}^n)$, then $S(\sigma \otimes \tau) = S(\sigma) + S(\tau)$.
Proof. All these properties are straightforward to show, except perhaps the
concavity which follows from the concavity of $x \mapsto -x \log x$, together with Klein's
lemma (Exercise 1.40). $\square$
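
By property (iv), the von Neumann entropy depends only on the spectrum, so properties (ii), (iii) and (v) can be illustrated on lists of eigenvalues. The sketch below assumes the spectra are given (diagonalization is not performed), uses the convention $0 \log 0 = 0$, and relies on the fact that the eigenvalues of $\sigma \otimes \tau$ are the pairwise products of those of $\sigma$ and $\tau$. The function names are hypothetical.

```python
import math

def entropy(spectrum):
    """S = -sum_i l_i log l_i (natural logarithm), with the convention 0 log 0 = 0."""
    return -math.fsum(l * math.log(l) for l in spectrum if l > 0.0)

def tensor_spectrum(spec_a, spec_b):
    """Eigenvalues of a tensor product: all pairwise products of eigenvalues."""
    return [a * b for a in spec_a for b in spec_b]

pure = [1.0, 0.0, 0.0]                    # spectrum of a pure state on C^3
maximally_mixed = [0.25, 0.25, 0.25, 0.25]  # spectrum of I/4
sigma = [0.5, 0.3, 0.2]
tau = [0.7, 0.3]

additivity_gap = entropy(tensor_spectrum(sigma, tau)) - (entropy(sigma) + entropy(tau))
```

Replacing `math.log` by `math.log2` would give entropies in bits; as pointed out in Notes and Remarks, the choice of base does not affect identities such as additivity.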
The following lemma quantifies the fact that very mixed states have large
entropy.
Lemma 1.20. Let $\rho \in \mathrm{D}(\mathbb{C}^d)$ be a state with spectrum in the interval
$\left[\frac{1-\varepsilon}{d}, \frac{1+\varepsilon}{d}\right]$ for some $\varepsilon \in [0, 1]$. Then $S(\rho) \ge \log d - h(\varepsilon)$, where
    $h(\varepsilon) = \frac{1+\varepsilon}{2} \log(1 + \varepsilon) + \frac{1-\varepsilon}{2} \log(1 - \varepsilon)$.
Note that $h(\varepsilon) \sim \varepsilon^2/2$ as $\varepsilon$ goes to 0.
Proof. Assume that $d$ is even and consider a state $\sigma \in \mathrm{D}(\mathbb{C}^d)$ with $d/2$
eigenvalues equal to $(1 + \varepsilon)/d$ and $d/2$ eigenvalues equal to $(1 - \varepsilon)/d$. One checks directly
from the definition of majorization that $\mathrm{spec}(\rho) \prec \mathrm{spec}(\sigma)$. It follows then from
Proposition 1.12 (iv) that
    $S(\rho) \ge S(\sigma) = \log(d) - h(\varepsilon)$.
If $d$ is odd, a similar argument applies where $\sigma$ has $(d - 1)/2$ eigenvalues equal to
$(1 \pm \varepsilon)/d$ and one eigenvalue equal to $1/d$. One checks by direct computation that
$S(\sigma) > \log(d) - h(\varepsilon)$. $\square$
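
The quantities in Lemma 1.20 are easy to evaluate numerically. The following sketch computes $h(\varepsilon)$, checks that the extremal spectrum used in the proof (for even $d$) has entropy exactly $\log d - h(\varepsilon)$, and tests the bound on one hand-picked spectrum. The function names and the sample spectrum are assumptions made for illustration.

```python
import math

def h(eps):
    """h(eps) = (1+eps)/2 log(1+eps) + (1-eps)/2 log(1-eps), for 0 <= eps <= 1."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    total = 0.5 * (1.0 + eps) * math.log1p(eps)
    if eps < 1.0:  # at eps = 1 the second term is 0 by the convention 0 log 0 = 0
        total += 0.5 * (1.0 - eps) * math.log1p(-eps)
    return total

def entropy(spectrum):
    """Entropy of a spectrum, natural logarithm, with 0 log 0 = 0."""
    return -math.fsum(l * math.log(l) for l in spectrum if l > 0.0)

def extremal_spectrum(d, eps):
    """Spectrum of the state sigma from the proof; d is assumed to be even."""
    half = d // 2
    return [(1.0 + eps) / d] * half + [(1.0 - eps) / d] * half

d, eps = 6, 0.3
extremal_gap = entropy(extremal_spectrum(d, eps)) - (math.log(d) - h(eps))

# A spectrum on C^4 contained in [(1 - 0.5)/4, (1 + 0.5)/4] = [0.125, 0.375].
rho_spectrum = [0.375, 0.3, 0.2, 0.125]
bound_holds = entropy(rho_spectrum) >= math.log(4) - h(0.5)
```

Using `math.log1p` keeps the evaluation accurate for small $\varepsilon$, where $h(\varepsilon)$ is close to $\varepsilon^2/2$.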

Remark 1.21. Note that while the entropy of (normalized) quantum states
(i.e., $\rho \in \mathrm{D}$) is of primary physical interest, the definition makes sense for, and most
properties generalize to $\rho \in \mathrm{PSD}$.
Let $\sigma$ be a state on $\mathbb{C}^d$, and $p \in (0, \infty)$. The $p$-Rényi entropy of $\sigma$ is

(1.37)    $S_p(\sigma) = \frac{1}{1-p} \log \mathrm{Tr}(\sigma^p)$.

The definition for $p = 1$ should be understood as the limit as $p \to 1$. We then recover
the von Neumann entropy, so that $S_1 = S$. Other limit cases are $p \to 0$, which gives
$S_0(\sigma) = \log \mathrm{rank}\, \sigma$, and $p \to \infty$, which gives $S_\infty(\sigma) = -\log \|\sigma\|_\infty$. When $p > 1$,
the Rényi entropy is connected to the Schatten $p$-norm by the formula
$S_p(\sigma) = \frac{p}{1-p} \log \|\sigma\|_p$. Just like the von Neumann entropy is a generalization of Shannon
entropy, defined for classical states (probability mass functions) $p = (p_k) \in \Delta_n$ by

(1.38)    $H(p) := -\sum_k p_k \log p_k$,

the Rényi entropy may be thought of as a generalization of the $\ell_p$-norm (up to
logarithmic change of variables and rescaling; it also has a classical variant defined
via $H_p(q) := \frac{p}{1-p} \log \|q\|_p$).
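
The classical Rényi entropy $H_p(q)$ and its limit cases can be explored with a few lines of code. The sketch below treats $p = 0$, $p = 1$ and $p = \infty$ through the limit formulas quoted above; the function names and the sample distribution are illustrative assumptions.

```python
import math

def shannon(q):
    """Shannon entropy (1.38), natural logarithm, with 0 log 0 = 0."""
    return -math.fsum(c * math.log(c) for c in q if c > 0.0)

def renyi(q, p):
    """Classical p-Renyi entropy of a probability vector q, for p in [0, infinity]."""
    if p == 1:
        return shannon(q)                            # limit p -> 1
    if p == 0:
        return math.log(sum(1 for c in q if c > 0.0))  # log of the support size
    if p == math.inf:
        return -math.log(max(q))                     # limit p -> infinity
    return math.log(math.fsum(c ** p for c in q if c > 0.0)) / (1.0 - p)

q = [0.5, 0.3, 0.2]
orders = [0, 0.5, 1, 2, math.inf]
values = [renyi(q, p) for p in orders]

# Relation with the l_p-norm for p = 2: H_2(q) = (2 / (1 - 2)) * log ||q||_2.
norm_2 = math.sqrt(math.fsum(c * c for c in q))
norm_formula_gap = renyi(q, 2) - (2.0 / (1.0 - 2.0)) * math.log(norm_2)
```

On this example the list `values` is non-increasing in the order $p$, in line with Exercise 1.51, and `renyi(q, p)` approaches `shannon(q)` as `p` tends to 1.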


Exercise 1.49 (Properties of Rényi entropies). Verify that, for $p \in (0, \infty]$, $S_p$
satisfies properties (i)–(v) from Proposition 1.19. Note that (iii) fails for $p = 0$.
Exercise 1.50 (Entropy of the state vs. entropy of the diagonal). Show that,
for any $\rho \in \mathrm{D}$, $S(\mathrm{diag}\, \rho) \ge S(\rho)$, with equality only if $\rho$ is diagonal.
Exercise 1.51 (Monotonicity of Rényi entropies). Show that $S_p(\sigma)$ and $H_p(q)$
are non-increasing in $p$ for fixed $\sigma$, $q$.

Notes and Remarks


A presentation of convex analysis oriented towards applications (notably to
computer science) can be found in [Bar02]. An older but still valuable reference is
the book [Roc70].
Section 1.1. Following the customary usage in functional analysis, we name
Theorem 1.3 after Krein–Milman. However, it should be pointed out that the main
contribution by Krein–Milman is an extension to infinite-dimensional locally convex
spaces; the finite-dimensional case, which is presented here, is due to Minkowski
[Min11].
The inequality (1.5) proved in Exercise 1.11 is due to Hanner [Han56]; it
belongs to the family of inequalities (including the earlier Clarkson inequalities
[Cla36]) that degenerate into the parallelogram identity when $p = 2$. The inequality
(1.6) is the so-called "2-uniform convexity" of the $p$-norm for $p \in (1, 2]$. For
$p \ge 2$, the inequality is reversed (2-uniform smoothness); for $p = 1$, it degenerates
into the triangle inequality. One establishes similarly $p$-uniform convexity for
$p \in [2, \infty)$ and $p$-uniform smoothness for $p \in (1, 2]$.
It is natural to ask whether these inequalities remain valid for the Schatten
$p$-norm, i.e., when $x, y$ are matrices. This is known to be true for inequality (1.6)
when $1 \le p \le 2$ (and for its reversed form when $p \ge 2$). However, the stronger
Hanner inequality (1.5) for matrices has been proved only in the range $1 \le p \le 4/3$
(or, for the reversed inequality, in the range $p \ge 4$). For proofs and references, see
[BCL94, CL06].
Section 1.2. Lemma 1.6 seems to be a folklore result, but does not appear in
standard references for convexity (the best source we were pointed to after consulting
specialists was Exercise 6, §3.4 of [Grü03]). However, once stated, the Lemma
is straightforward to prove.
Convex cones play a fundamental role in the theory of convex optimization
and in linear and semi-definite programming, all of which have their own links to
quantum information. We do not develop any of these areas or connections here.
We refer the interested reader to the books [BV04] and [BTN01a], the survey
[Nem07], and, for sample links, to [Rei08, KL09, BH13, HNW15].
Section 1.3. A comprehensive reference for majorization and for connections
to matrix inequalities is the book [Bha97]. Klein's lemma originates from [Kle32].
The Davis convexity theorem appears in [Dav57]. Early references for Schatten norms
include [Sch50, Sch70].
The concept of von Neumann entropy is crucial in quantum information theory
and quantum Shannon theory. A reason for this is that von Neumann entropy
and its variants (quantum relative entropy, quantum mutual information) have
several operational interpretations, i.e., quantify the rate at which basic information
processing tasks (transmission, encoding, decoding) can be performed. This point
of view is hardly mentioned in this book. For an accessible introduction to quantum
Shannon theory we refer to [Wil17]. Interestingly, the concept of von Neumann
entropy appears already in [von27, von32] (see [Pet01] for historical background)
and predates the development of its classical counterpart, the Shannon entropy
which, like much of modern information theory, has its roots in the 1948 two-part
article by Claude Shannon [Sha48].
Many texts use base 2 logarithm to define entropy. While using the natural
logarithm simplifies some calculations, the choice of the base is immaterial in our
context; as a rule, the stated identities and estimates typically hold for any base, as
long as one is consistent. The few exceptions to this principle are clearly marked.
CHAPTER 2

The Mathematics of Quantum Information Theory

This chapter puts into mathematical perspective some basic concepts of quantum
information theory. (For a physically motivated approach, see Chapter 3.) We
discuss the geometry of the set of quantum states, the entanglement vs. separability
dichotomy, and introduce completely positive maps and quantum channels. All
these concepts will be extensively used in Chapters 8–12.

2.1. On the geometry of the set of quantum states

2.1.1. Pure and mixed states. In this section we take a closer look at the
set $\mathrm{D}(\mathcal{H})$ (or simply $\mathrm{D}$) of quantum states on a finite-dimensional complex Hilbert
space $\mathcal{H}$. By definition (see Section 0.10), we have

(2.1)    $\mathrm{D}(\mathcal{H}) = \{\rho \in B^{\mathrm{sa}}(\mathcal{H}) : \rho \ge 0, \ \mathrm{Tr}\, \rho = 1\}$.

If $\mathcal{H} = \mathbb{C}^d$, the definition (2.1) simply says that $\mathrm{D}(\mathbb{C}^d)$ is the base of the positive
semi-definite cone $\mathrm{PSD}(\mathbb{C}^d)$ defined by the hyperplane $H_1 \subset \mathsf{M}_d^{\mathrm{sa}}$ of trace one
Hermitian matrices (cf. (1.22)). The (real) dimension of the set $\mathrm{D}(\mathbb{C}^d)$ equals
$d^2 - 1$: it has non-empty interior inside $H_1$. (This follows from $\mathrm{PSD}(\mathbb{C}^d)$ being a
full cone.)
A state $\rho \in \mathrm{D}(\mathcal{H})$ is called pure if it has rank 1, i.e., if there is a unit vector
$\psi \in \mathcal{H}$ such that
    $\rho = |\psi\rangle\langle\psi|$.

Note that |ψyxψ| is the orthogonal projection onto the (complex) line spanned by
lu

ψ. We sometimes use the terminology “consider a pure state ψ” (such language is


prevalent in physics literature). What we mean is that ψ is a unit vector and we con-
na

sider the corresponding pure state |ψyxψ|. We use the terminology of mixed states
when we want to emphasize that we consider the set of all states, not necessarily
so

pure.
Let ψ, χ be unit vectors in H. Then the pure states |ψyxψ| and |χyxχ| coincide
r

if and only if there is a complex number λ with |λ| “ 1 such that χ “ λψ. Therefore
Pe

the set of pure states identifies with PpHq, the projective space on H. (See Appendix
B.2; note that the space PpCd q is more commonly denoted by CPd´1 .)
The set $\mathrm{D}(\mathcal{H})$ is a compact convex set, and it is easily checked that the extreme points of $\mathrm{D}(\mathcal{H})$ are exactly the pure states (cf. Proposition 1.9 and Corollary 1.10).
It follows from general convexity theory (Krein–Milman and Carathéodory’s theorems) that any state is a convex combination of at most $(\dim \mathcal{H})^2$ pure states. However, using the spectral theorem instead tells us more: any state is a convex combination of at most $\dim \mathcal{H}$ pure states $|\psi_i\rangle\langle\psi_i|$, where $(\psi_i)$ are pairwise orthogonal unit vectors (cf. Exercise 1.45). A fundamental consequence is that whenever we want to maximize a convex function (or minimize a concave function) over the set $\mathrm{D}(\mathcal{H})$, the extremum is achieved on a pure state, which significantly reduces the dimension of the problem.
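The spectral-theorem statement above can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the function name `spectral_pure_decomposition` and the random test state are illustrative choices, not part of the text.

```python
import numpy as np

def spectral_pure_decomposition(rho, tol=1e-12):
    """Return weights p_i > 0 and orthonormal columns psi_i with
    rho = sum_i p_i |psi_i><psi_i| (at most dim H terms)."""
    eigvals, eigvecs = np.linalg.eigh(rho)  # rho is assumed Hermitian
    keep = eigvals > tol                    # drop the kernel of rho
    return eigvals[keep], eigvecs[:, keep]

# Hypothetical example: a random state on C^3.
rng = np.random.default_rng(0)
G = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = G @ G.conj().T
rho /= np.trace(rho).real

weights, vectors = spectral_pure_decomposition(rho)
# Rebuild rho as a convex combination of orthogonal pure states.
rebuilt = sum(p * np.outer(v, v.conj()) for p, v in zip(weights, vectors.T))
```

The weights are the nonzero eigenvalues of the state, so they are nonnegative and sum to one, and the number of terms never exceeds the dimension.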
As opposed to pure states, which are extremal, the “most central” element in $\mathrm{D}(\mathcal{H})$ is the state $\mathrm{I}/\dim \mathcal{H}$, which is called the maximally mixed state, and denoted by $\rho_*$ when there is no ambiguity. We also note that the set of states on $\mathcal{H}$ which are diagonal with respect to a given orthonormal basis $(e_i)_{i \in I}$ naturally identifies with the set of classical states on $I$.
Exercise 2.1. Describe states which belong to the boundary of $\mathrm{D}(\mathcal{H})$.
Exercise 2.2 (Every state is an average of pure states). Show that every state $\rho \in \mathrm{D}(\mathbb{C}^d)$ can be written as $\frac{1}{d}\big(|\psi_1\rangle\langle\psi_1| + \cdots + |\psi_d\rangle\langle\psi_d|\big)$ for some unit vectors $\psi_1, \dots, \psi_d$ in $\mathbb{C}^d$.

2.1.2. The Bloch ball $\mathrm{D}(\mathbb{C}^2)$. The situation for $d = 2$ is very special. Let $\rho \in M_2^{\mathrm{sa}}$, with $\mathrm{Tr}\,\rho = 1$. Then $\rho$ has two eigenvalues, which can be written as $1/2 - \lambda$ and $1/2 + \lambda$ for some $\lambda \in \mathbb{R}$. Moreover, $\rho \geq 0$ if and only if $|\lambda| \leq 1/2$. On the other hand, we have
$\|\rho - \rho_*\|_{\mathrm{HS}} = \sqrt{2}\,|\lambda|.$
Therefore, $\rho$ is a state if and only if $\|\rho - \rho_*\|_{\mathrm{HS}} \leq 1/\sqrt{2}$. What we have proved is that, inside the space of trace one self-adjoint operators, the set of states is a Euclidean ball centered at $\rho_*$ and with radius $1/\sqrt{2}$. This ball is called the Bloch ball and its boundary is called the Bloch sphere. Once we introduce the Pauli matrices
(2.2)  $\sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \sigma_y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \quad \sigma_z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix},$
a convenient orthonormal basis (with respect to the Hilbert–Schmidt inner product) in $M_2^{\mathrm{sa}}$ is
(2.3)  $\Big( \tfrac{1}{\sqrt{2}}\,\mathrm{I},\ \tfrac{1}{\sqrt{2}}\,\sigma_x,\ \tfrac{1}{\sqrt{2}}\,\sigma_y,\ \tfrac{1}{\sqrt{2}}\,\sigma_z \Big).$
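As a quick numerical sanity check of the two facts just stated, the sketch below (assuming NumPy; all helper names are illustrative) verifies that the basis (2.3) is orthonormal for the Hilbert–Schmidt inner product and compares the ball criterion with positivity on two trace-one matrices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def hs_inner(a, b):
    """Hilbert-Schmidt inner product Tr(a^dagger b)."""
    return np.trace(a.conj().T @ b)

# Gram matrix of the basis (2.3); it should be the 4x4 identity.
basis = [m / np.sqrt(2) for m in (I2, sigma_x, sigma_y, sigma_z)]
gram = np.array([[hs_inner(a, b) for b in basis] for a in basis])

def in_bloch_ball(rho, tol=1e-12):
    """Ball criterion for a trace-one Hermitian 2x2 matrix
    (np.linalg.norm of a matrix is the Frobenius, i.e. HS, norm)."""
    return np.linalg.norm(rho - I2 / 2) <= 1 / np.sqrt(2) + tol

def is_psd(rho, tol=1e-12):
    """Positivity check via the smallest eigenvalue."""
    return np.linalg.eigvalsh(rho).min() >= -tol

pure = np.array([[1, 0], [0, 0]], dtype=complex)  # lies on the Bloch sphere
outside = I2 / 2 + 0.6 * sigma_z                  # eigenvalues 1.1 and -0.1
```

For trace-one Hermitian inputs the two predicates `in_bloch_ball` and `is_psd` should agree, which is exactly the content of the argument above.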
A very useful consequence of $\mathrm{D}(\mathbb{C}^2)$ being a ball is the fact, mentioned already in Section 1.2.1, that the cone $\mathrm{PSD}(\mathbb{C}^2)$ is isomorphic (or even isometric in the appropriate sense) to the Lorentz cone $L_4$. A popular explicit isomorphism, inducing the so-called spinor map (see Appendix C), is given by
(2.4)  $\mathbb{R}^4 \ni \mathbf{x} = (t, x, y, z) \mapsto \begin{bmatrix} t + z & x - iy \\ x + iy & t - z \end{bmatrix} = X \in M_2^{\mathrm{sa}}.$
The formula for $X$ can be rewritten in terms of the Pauli matrices (2.2) as
(2.5)  $X = t\,\mathrm{I} + x\sigma_x + y\sigma_y + z\sigma_z,$
and so a convenient expression for it is $X = \mathbf{x}\cdot\sigma$, where $\sigma$ is a shorthand for $(\mathrm{I}, \sigma_x, \sigma_y, \sigma_z)$, and “$\cdot$” is a “formal dot product.” Since $\{\mathrm{I}, \sigma_x, \sigma_y, \sigma_z\}$ is a multiple of the orthonormal basis (2.3) of $M_2^{\mathrm{sa}}$, it follows that the map given by (2.4) is likewise a multiple of isometry (with respect to the Euclidean metric in the domain and the Hilbert–Schmidt metric in the range). Next, it is readily verified that
(2.6)  $\tfrac{1}{2}\,\mathrm{Tr}\,X = t, \qquad \det X = t^2 - x^2 - y^2 - z^2 =: q(\mathbf{x}),$
where $q$ is the quadratic form of the Minkowski spacetime, which confirms that $X \in \mathrm{PSD}(\mathbb{C}^2)$ iff $\mathbf{x} \in L_4$. The isomorphism $\mathbf{x} \mapsto \mathbf{x}\cdot\sigma$ will be useful in understanding automorphisms of the cones $L_4$ and $\mathrm{PSD}(\mathbb{C}^2)$, and when proving Størmer’s theorem in Section 2.4.5.
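The identities (2.4)–(2.6) are easy to test numerically. The following sketch assumes NumPy; the function names are hypothetical and the Lorentz cone is taken to be $\{t \geq \sqrt{x^2+y^2+z^2}\}$.

```python
import numpy as np

def spinor_matrix(t, x, y, z):
    """The matrix X of (2.4), i.e. t*I + x*sigma_x + y*sigma_y + z*sigma_z."""
    return np.array([[t + z, x - 1j * y],
                     [x + 1j * y, t - z]], dtype=complex)

def minkowski_q(t, x, y, z):
    """Quadratic form q of (2.6)."""
    return t * t - x * x - y * y - z * z

def in_lorentz_cone(t, x, y, z):
    """Membership in L_4, assumed here to mean t >= |(x, y, z)|."""
    return t >= np.sqrt(x * x + y * y + z * z)

def is_psd(X, tol=1e-12):
    return np.linalg.eigvalsh(X).min() >= -tol

point = (1.0, 0.3, -0.4, 0.5)   # inside the cone: |(x, y, z)| is about 0.707
X = spinor_matrix(*point)
```

For this point one expects `np.trace(X)` to equal `2 * t`, `np.linalg.det(X)` to equal `minkowski_q(*point)`, and `is_psd(X)` to agree with `in_lorentz_cone(*point)`. The Hilbert–Schmidt norm of `X` is $\sqrt{2}$ times the Euclidean norm of the point, which is the "multiple of isometry" statement.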
When $d > 2$, the set $\mathrm{D}(\mathbb{C}^d)$ is no longer a ball, but rather the non-commutative analogue of a simplex. Its symmetrization (see Section 4.1.2), namely
$\mathrm{conv}\big(\mathrm{D}(\mathbb{C}^d) \cup -\mathrm{D}(\mathbb{C}^d)\big) = \{A \in M_d^{\mathrm{sa}} : \|A\|_1 \leq 1\},$
is $S_1^{d,\mathrm{sa}}$, the unit ball of the self-adjoint part of the 1-Schatten space (see Section 1.3.2).
One way to quantify the fact that the set $\mathrm{D}(\mathbb{C}^d)$ is different from a ball when $d > 2$ is to compute the radii of its inscribed and circumscribed Hilbert–Schmidt balls. The former equals $1/\sqrt{d(d-1)}$ while the latter is $\sqrt{(d-1)/d}$ (the same values as for the set $\Delta_{d-1}$ of classical states on $\{1, \dots, d\}$, and for the same reasons). In other words, if we denote by $B(\rho_*, r)$ the ball centered at $\rho_*$ and with Hilbert–Schmidt radius $r$ inside the hyperplane $H_1 = \{\mathrm{Tr}(\cdot) = 1\} \subset M_d^{\mathrm{sa}}$, we have
(2.7)  $B\Big(\rho_*, \frac{1}{\sqrt{d(d-1)}}\Big) \subset \mathrm{D}(\mathbb{C}^d) \subset B\Big(\rho_*, \sqrt{\frac{d-1}{d}}\Big),$
and these values, differing by the factor of $d - 1$, are the best possible.
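The two radii in (2.7) are attained at explicit states, which can be checked numerically. The sketch below (assuming NumPy; the choice $d = 4$ and the variable names are illustrative) computes the distance from $\rho_*$ to a pure state and to the center of the opposite maximal face. It exhibits witnesses for optimality only and does not prove the inclusions.

```python
import numpy as np

def hs_dist(a, b):
    """Hilbert-Schmidt distance (Frobenius norm of the difference)."""
    return np.linalg.norm(a - b)

d = 4
rho_star = np.eye(d) / d

# A pure state: an extreme point, at maximal distance from rho_star.
pure = np.zeros((d, d))
pure[0, 0] = 1.0

# Maximally mixed state on the orthogonal complement of that pure state:
# a boundary point (it is singular) closest to rho_star.
face_center = (np.eye(d) - pure) / (d - 1)

outer_radius = hs_dist(pure, rho_star)          # sqrt((d - 1) / d)
inner_radius = hs_dist(face_center, rho_star)   # 1 / sqrt(d (d - 1))
```

The ratio `outer_radius / inner_radius` equals $d - 1$, in agreement with the remark following (2.7).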
Exercise 2.3 (The Bloch sphere is a sphere). Show that the matrix $X$ given by (2.5) has eigenvalues $1$ and $-1$ if and only if $t = 0$ and $x^2 + y^2 + z^2 = 1$.
Exercise 2.4 (Composition rules for Pauli matrices). Verify the composition rules for Pauli matrices. (i) $\sigma_a^2 = \mathrm{I}$. (ii) If $a, b, c$ are all different, then $\sigma_a\sigma_b = i\varepsilon\sigma_c$, where $\varepsilon = \pm 1$ is the sign of the permutation $(x, y, z) \mapsto (a, b, c)$; in particular, if $a \neq b$, then $\sigma_a\sigma_b = -\sigma_b\sigma_a$.

2.1.3. Facial structure.


Proposition 2.1 (Characterization of faces of D). There is a one-to-one correspondence between nontrivial subspaces of $\mathbb{C}^d$ and proper faces of $\mathrm{D}(\mathbb{C}^d)$. Given a subspace $\{0\} \subsetneq E \subsetneq \mathbb{C}^d$, the corresponding face $\mathrm{D}(E)$ is the set of states whose range is contained in $E$:
$\mathrm{D}(E) = \{\rho \in \mathrm{D}(\mathbb{C}^d) : \rho(\mathbb{C}^d) \subset E\}.$
In particular, pure states (extreme points, i.e., minimal, 0-dimensional faces) correspond to the case $\dim E = 1$. In the direction opposed to a pure state $|x\rangle\langle x|$ lies a face which corresponds to all states with a range orthogonal to $x$; these are maximal proper faces.
Remark 2.2. All faces of $\mathrm{D}(\mathbb{C}^d)$ are exposed (as defined in Exercise 1.5) since $\mathrm{D}(E)$ is the intersection of $\mathrm{D}(\mathbb{C}^d)$ with the hyperplane $\{X : \mathrm{Tr}(X P_E) = 1\}$.
Proof of Proposition 2.1. Denote by $\mathrm{range}(\rho) = \rho(\mathbb{C}^d)$ the range of a state $\rho \in \mathrm{D}(\mathbb{C}^d)$. We use the following observation: if $\rho, \sigma \in \mathrm{D}(\mathbb{C}^d)$ and $\lambda \in (0, 1)$, then
(2.8)  $\mathrm{range}(\lambda\rho + (1 - \lambda)\sigma) = \mathrm{range}(\rho) + \mathrm{range}(\sigma).$
We first check that, for any nontrivial subspace $E \subset \mathbb{C}^d$, $\mathrm{D}(E)$ is a face of $\mathrm{D}(\mathbb{C}^d)$. Indeed, if $\rho \in \mathrm{D}(E)$ can be written as $\lambda\rho_1 + (1 - \lambda)\rho_2$ for $\rho_1, \rho_2 \in \mathrm{D}(\mathbb{C}^d)$ and $\lambda \in (0, 1)$, then (2.8) implies that $\mathrm{range}(\rho_1) \subset E$ and $\mathrm{range}(\rho_2) \subset E$.
Conversely, let $F \subset \mathrm{D}(\mathbb{C}^d)$ be a proper face. Define $E = \bigcup\{\mathrm{range}(\rho) : \rho \in F\}$. It follows, from (2.8) and from the fact that $F$ is convex, that $E$ is actually a subspace and that $F$ contains an element $\rho$ such that $\mathrm{range}(\rho) = E$. We now claim that $F = \mathrm{D}(E)$. The direct inclusion is obvious. Conversely, consider $\sigma \in \mathrm{D}(E)$. For $\lambda > 0$ small enough the operator $\tau = \frac{1}{1-\lambda}(\rho - \lambda\sigma)$ is a state. Since $\rho = \lambda\sigma + (1 - \lambda)\tau$, we conclude that the segment joining $\sigma$ and $\tau$ is contained in $F$; in particular $\sigma \in F$. □
Exercise 2.5. Show directly (i.e., without appealing to Proposition 2.1) that any exposed face of $\mathrm{D}(\mathbb{C}^d)$ has the form $\mathrm{D}(E)$ for some subspace $E \subset \mathbb{C}^d$.
2.1.4. Symmetries. We now describe the symmetries of $\mathrm{D}(\mathbb{C}^d)$. This is closely related to the famous theorem of Wigner that characterizes the isometries of complex projective space as a metric space. Recall (see Appendix B.2) that $[\psi]$ denotes the equivalence class in $P(\mathbb{C}^d)$ of a unit vector $\psi \in S_{\mathbb{C}^d}$.
Theorem 2.3 (Wigner’s theorem). Denote by $P(\mathbb{C}^d)$ the projective space over $\mathbb{C}^d$, equipped with the Fubini–Study metric (B.5). A map $f : P(\mathbb{C}^d) \to P(\mathbb{C}^d)$ is an isometry if and only if there is a map $U$ on $\mathbb{C}^d$ which is either unitary or anti-unitary such that, for any unit vector $\psi$,
(2.9)  $f([\psi]) = [U(\psi)].$
A map $U : \mathbb{C}^d \to \mathbb{C}^d$ is anti-unitary if it is the composition of a unitary map with complex conjugation.
Proof. We outline the proof of Wigner’s theorem for $d = 2$. Since the projective space over $\mathbb{C}^2$ identifies with the Bloch sphere, its group of isometries is given by the orthogonal group $\mathrm{O}(3)$, and splits into direct isometries (rotations, or $\mathrm{SO}(3)$) and indirect isometries.
Let $f$ be a direct isometry of the Bloch ball. It has two opposite fixed points $[\varphi_1]$ and $[\varphi_2]$, with $\varphi_1 \perp \varphi_2$, and is a rotation of angle $\theta$ in the plane $\{[\tfrac{1}{\sqrt{2}}(\varphi_1 + e^{i\alpha}\varphi_2)] : \alpha \in \mathbb{R}\}$. One checks that (2.9) is satisfied when $U$ is given by $U(\varphi_1) = \varphi_1$ and $U(\varphi_2) = e^{i\theta}\varphi_2$. Note that $U$ is determined up to a global phase. In particular, if we insist on having $U \in \mathrm{SU}(2)$, we are led to the choice $U(\varphi_1) = e^{-i\theta/2}\varphi_1$ and $U(\varphi_2) = e^{i\theta/2}\varphi_2$ involving the half-angle. (We point out the isomorphism $\mathrm{PSU}(2) \leftrightarrow \mathrm{SO}(3)$, see Exercise B.4.)
The complex conjugation with respect to an orthonormal basis $(\psi_1, \psi_2)$ in $\mathbb{C}^2$ induces on the Bloch ball the reflection $R$ in the plane $\{[\cos\theta\,\psi_1 + \sin\theta\,\psi_2] : \theta \in \mathbb{R}\}$. Since any indirect isometry of the Bloch ball is the composition of $R$ with a direct isometry, the result follows.
The case $d > 2$ can be deduced from the $d = 2$ case; we do not include the argument here (see Notes and Remarks). □
When $P(\mathbb{C}^d)$ is identified with the set of pure states on $\mathbb{C}^d$, the isometries from Theorem 2.3 act as $\rho \mapsto U\rho U^\dagger$ or $\rho \mapsto U\rho^T U^\dagger$ for $U \in \mathrm{U}(d)$. Here $\rho^T$ denotes the transposition of a state $\rho$ with respect to a distinguished basis (since $\rho = \rho^\dagger$, $\rho^T$ is also the complex conjugate of $\rho$ with respect to that basis).
Theorem 2.4 (Kadison’s theorem). Affine maps preserving globally $\mathrm{D}(\mathbb{C}^d)$ are of the form $\rho \mapsto U\rho U^\dagger$ or $\rho \mapsto U\rho^T U^\dagger$ for $U \in \mathrm{U}(d)$. In particular, they are isometries with respect to the Hilbert–Schmidt distance.
Proof. Let $\Phi$ be an affine map on $M_d^{\mathrm{sa}}$ such that $\Phi(\mathrm{D}(\mathbb{C}^d)) = \mathrm{D}(\mathbb{C}^d)$. Then $\Phi$ preserves the set of faces of $\mathrm{D}(\mathbb{C}^d)$, which are described in Proposition 2.1. In particular, $\Phi$ preserves the set of minimal faces, which identify with pure states. Therefore $\Phi$ induces a bijection on $P(\mathbb{C}^d)$. We claim that $\Phi$ is an isometry with respect to the Fubini–Study distance (B.5), which is equivalent to
$\mathrm{Tr}\big(\Phi(|\psi\rangle\langle\psi|) \cdot \Phi(|\varphi\rangle\langle\varphi|)\big) = |\langle\psi, \varphi\rangle|^2$
for $\psi, \varphi \in \mathbb{C}^d$. If $[\psi] = [\varphi]$, this is clear. Otherwise, let $M \subset \mathbb{C}^d$ be the 2-dimensional subspace generated by $\psi$ and $\varphi$. By Proposition 2.1, the set $\mathrm{D}(M)$ canonically identifies with a (3-dimensional) face of $\mathrm{D}(\mathbb{C}^d)$. Consequently, $\Phi(\mathrm{D}(M))$ is also a face, which identifies with $\mathrm{D}(M')$ for some 2-dimensional subspace $M' \subset \mathbb{C}^d$. Since $\mathrm{D}(M)$ and $\mathrm{D}(M')$ are Bloch balls, the map $\Phi$ restricted to $\mathrm{D}(M)$ must be an isometry (affine maps preserving $S^2$ are isometries). We may now apply Wigner’s theorem: there is $U \in \mathrm{U}(d)$ such that either $\Phi(\rho) = U\rho U^\dagger$ whenever $\rho$ is a pure state, or $\Phi(\rho) = U\rho^T U^\dagger$ for all pure states $\rho$. Since $\Phi$ is affine, one of the two formulas is valid for all $\rho \in \mathrm{D}(\mathbb{C}^d)$. □
Although for $d > 2$ the set $\mathrm{D}(\mathbb{C}^d)$ is not centrally symmetric, we may argue that the maximally mixed state $\rho_*$ plays the role of a center. In particular, we have
Proposition 2.5. Let $\rho \in \mathrm{D}(\mathbb{C}^d)$ be a state which is fixed by all the isometries of $\mathrm{D}(\mathbb{C}^d)$ (with respect to the Hilbert–Schmidt distance). Then $\rho = \rho_*$.
Proof. We have $U\rho U^\dagger = \rho$ for every unitary matrix $U$. Since $\mathrm{U}(d)$ spans $M_d$ as a vector space, $\rho$ commutes with any matrix, therefore it equals $\alpha\,\mathrm{I}$ for some $\alpha \in \mathbb{C}$, and the trace constraint forces $\alpha = 1/d$. □
One consequence of Proposition 2.5 is that $\rho_*$ is the centroid of $\mathrm{D}(\mathbb{C}^d)$. Kadison’s theorem also implies that $\mathrm{D}$ has enough symmetries in the sense of Section 4.2.2 (see Exercise 4.25). Another consequence of Kadison’s Theorem 2.4 is a characterization of affine automorphisms of the cone of positive semi-definite matrices, which will be presented in Proposition 2.29.
Exercise 2.6. Show that the affine automorphisms of $\mathrm{D}(\mathbb{C}^2)$ form a group which is isomorphic to $\mathrm{O}(3)$.
Exercise 2.7. Show that the affine automorphisms of $\mathrm{D}(\mathbb{C}^d)$ form a group which is isomorphic to the semidirect product of $\mathrm{PSU}(d)$ and $\mathbb{Z}_2$ with respect to the action of $\mathbb{Z}_2$ on $\mathrm{PSU}(d)$ induced by the complex conjugation.
Exercise 2.8. State and prove the real version of Wigner’s theorem.
Exercise 2.9. Let $\rho$ be a state which is invariant under transposition with respect to any basis. Show that $\rho = \rho_*$.

2.2. States on multipartite Hilbert spaces


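2.2.1. Partial trace. A fundamental concept in quantum information theory is the partial trace (for a physically motivated approach, see Section 3.4). Let $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ be a bipartite Hilbert space. The partial trace over $\mathcal{H}_2$ is the map (or the superoperator, see Section 0.9) $\mathrm{Tr}_{\mathcal{H}_2} : B(\mathcal{H}_1) \otimes B(\mathcal{H}_2) \to B(\mathcal{H}_1)$ defined as $\mathrm{Id}_{B(\mathcal{H}_1)} \otimes \mathrm{Tr}$. Its action on product operators is given by
$\mathrm{Tr}_{\mathcal{H}_2}(A \otimes B) = (\mathrm{Tr}\,B)\,A$
for $A \in B(\mathcal{H}_1)$, $B \in B(\mathcal{H}_2)$. Similarly, the partial trace with respect to $\mathcal{H}_1$ is defined as $\mathrm{Tr}_{\mathcal{H}_1} = \mathrm{Tr} \otimes \mathrm{Id}_{B(\mathcal{H}_2)}$.
In particular, if $\rho$ is a state on $\mathcal{H}_1 \otimes \mathcal{H}_2$, then $\mathrm{Tr}_{\mathcal{H}_1}\rho$ is a state on $\mathcal{H}_2$, and $\mathrm{Tr}_{\mathcal{H}_2}\rho$ is a state on $\mathcal{H}_1$. Note also the formulas $\mathrm{Tr}_{\mathcal{H}_1}(\rho_1 \otimes \rho_2) = \rho_2$ and $\mathrm{Tr}_{\mathcal{H}_2}(\rho_1 \otimes \rho_2) = \rho_1$ for states $\rho_1 \in \mathrm{D}(\mathcal{H}_1)$, $\rho_2 \in \mathrm{D}(\mathcal{H}_2)$.
We sometimes write $\mathrm{Tr}_1$ for $\mathrm{Tr}_{\mathcal{H}_1}$ and $\mathrm{Tr}_2$ for $\mathrm{Tr}_{\mathcal{H}_2}$. The definition of partial trace extends naturally to the multipartite setting: if $\mathcal{H} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_k$, then for $1 \leq i \leq k$ we denote by $\mathrm{Tr}_{\mathcal{H}_i}$ or $\mathrm{Tr}_i$ the operation
$\mathrm{Id}_{B(\mathcal{H}_1)} \otimes \cdots \otimes \mathrm{Id}_{B(\mathcal{H}_{i-1})} \otimes \mathrm{Tr} \otimes \mathrm{Id}_{B(\mathcal{H}_{i+1})} \otimes \cdots \otimes \mathrm{Id}_{B(\mathcal{H}_k)}.$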
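In coordinates, the bipartite partial trace amounts to contracting one pair of indices. Here is a minimal sketch, assuming NumPy and the Kronecker-product convention in which the basis vector $e_i \otimes f_j$ has index $i\,d_2 + j$ (the convention of `np.kron`); the function name `partial_trace` is illustrative.

```python
import numpy as np

def partial_trace(rho, d1, d2, over):
    """Partial trace of an operator on C^d1 (x) C^d2.

    over=2 traces out the second factor (result is d1 x d1),
    over=1 traces out the first factor (result is d2 x d2).
    """
    r = rho.reshape(d1, d2, d1, d2)   # r[i, j, k, l] = <e_i f_j| rho |e_k f_l>
    if over == 2:
        return np.einsum('ijkj->ik', r)
    if over == 1:
        return np.einsum('ijil->jl', r)
    raise ValueError("over must be 1 or 2")

# Check the defining formula on a product operator A (x) B.
rng = np.random.default_rng(2)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
AB = np.kron(A, B)
```

With these definitions, `partial_trace(AB, 2, 3, over=2)` should equal `np.trace(B) * A`, and `partial_trace(AB, 2, 3, over=1)` should equal `np.trace(A) * B`.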

2.2.2. Schmidt decomposition. We recall the singular value decomposition (SVD) for matrices: any real or complex matrix $A \in M_{k,d}$ can be decomposed as $A = U\Sigma V^\dagger$, where $U$ and $V$ are unitary matrices of sizes $k$ and $d$ respectively, and $\Sigma = (\Sigma_{ij}) \in M_{k,d}$ is a “rectangular diagonal” (i.e., such that $\Sigma_{ij} = 0$ whenever $i \neq j$) nonnegative matrix. Moreover, up to permutation, the “diagonal” elements of $\Sigma$ are uniquely determined by $A$ and are called the singular values of $A$. We often denote the singular values of $A$ by $s_1(A) \geq \cdots \geq s_{\min(k,d)}(A)$. The singular values of $A$ coincide with the eigenvalues of $(AA^\dagger)^{1/2}$ when $k \leq d$, and with the eigenvalues of $(A^\dagger A)^{1/2}$ when $k \geq d$. Note that, in any case, $AA^\dagger$ and $A^\dagger A$ share the same nonzero eigenvalues.
An equivalent presentation of the SVD is as follows: there exist orthonormal sequences $(u_i)$ (in $\mathbb{R}^k$ or $\mathbb{C}^k$, depending on the context) and $(v_i)$ (in $\mathbb{R}^d$ or $\mathbb{C}^d$), and a non-increasing sequence of nonnegative scalars $(s_i)$ such that
(2.10)  $A = \sum_i s_i\, |u_i\rangle\langle v_i|.$
When translated into the language of tensors (see Section 0.4), the singular value decomposition becomes the Schmidt decomposition, which is widely used in quantum information. We note that, outside the bipartite situation, there is no analogue of the Schmidt decomposition in multipartite Hilbert spaces.
Proposition 2.6 (easy). Let $\psi$ be a vector in a (real or complex) bipartite Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2$, with $d_1 = \dim \mathcal{H}_1$ and $d_2 = \dim \mathcal{H}_2$. Set $d := \min(d_1, d_2)$. Then there exist nonnegative scalars $(\lambda_i)_{1 \leq i \leq d}$, and orthonormal vectors $(\chi_i)_{1 \leq i \leq d}$ in $\mathcal{H}_1$ and $(\varphi_i)_{1 \leq i \leq d}$ in $\mathcal{H}_2$, such that
(2.11)  $\psi = \sum_{i=1}^{d} \lambda_i\, \chi_i \otimes \varphi_i.$
The numbers $(\lambda_1, \dots, \lambda_d)$ are uniquely determined if we require that $\lambda_1 \geq \cdots \geq \lambda_d$ and are called the Schmidt coefficients of $\psi$.
Note that $\lambda_1^2 + \cdots + \lambda_d^2 = |\psi|^2$. We may write $\lambda_i(\psi)$ instead of $\lambda_i$ to emphasize the dependence on $\psi$. The largest $r$ such that $\lambda_r(\psi) > 0$ is called the Schmidt rank of $\psi$. If $\psi \in \mathbb{C}^k \otimes \mathbb{C}^d$ is identified with a matrix $M \in M_{k,d}$ as in Section 0.8, then
(2.12)  $\mathrm{Tr}_{\mathbb{C}^d}\, |\psi\rangle\langle\psi| = MM^\dagger.$
Via this identification, Schmidt coefficients of $\psi$ coincide with singular values of $M$, and the Schmidt rank of $\psi$ coincides with the rank of $M$. States of Schmidt rank 1 are exactly product vectors. The largest and the smallest Schmidt coefficients of $\psi \in \mathcal{H}_1 \otimes \mathcal{H}_2$ are also given by the variational formulas
(2.13)  $\lambda_1(\psi) = \max\{|\langle\psi, \chi \otimes \varphi\rangle| : \chi \in \mathcal{H}_1,\ \varphi \in \mathcal{H}_2,\ |\chi| = |\varphi| = 1\},$
often referred to as the maximal overlap with a product vector, and
(2.14)  $\lambda_d(\psi) = \min_{\chi \in \mathcal{H}_1,\, |\chi| = 1}\ \max_{\varphi \in \mathcal{H}_2,\, |\varphi| = 1} |\langle\psi, \chi \otimes \varphi\rangle|.$
The above are fully analogous to the (special cases of) Courant–Fischer variational formulas for singular values of a matrix.
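The passage from the SVD to the Schmidt decomposition can be sketched in a few lines. The code below assumes NumPy and identifies $\psi \in \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ with the $d_1 \times d_2$ matrix of its coefficients in the product basis (row-major reshape); this coordinate convention and the function name are assumptions made for the illustration and may differ from the identification of Section 0.8 by a complex conjugation.

```python
import numpy as np

def schmidt_decomposition(psi, d1, d2):
    """Return (lambdas, chis, phis) with psi = sum_i lambdas[i] chis[i] (x) phis[i].

    chis[i] and phis[i] are rows; each family is orthonormal and the
    lambdas are nonnegative and non-increasing.
    """
    M = psi.reshape(d1, d2)                      # psi = sum_ab M[a, b] e_a (x) f_b
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    return s, U.T, Vh                            # M[a, b] = sum_i U[a, i] s[i] Vh[i, b]

# Hypothetical example: a random unit vector in C^2 (x) C^3.
rng = np.random.default_rng(3)
psi = rng.normal(size=6) + 1j * rng.normal(size=6)
psi /= np.linalg.norm(psi)

lambdas, chis, phis = schmidt_decomposition(psi, 2, 3)
rebuilt = sum(l * np.kron(c, p) for l, c, p in zip(lambdas, chis, phis))

# Reduced state on the first factor, as in (2.12): its spectrum is lambdas**2.
M = psi.reshape(2, 3)
reduced_spectrum = np.sort(np.linalg.eigvalsh(M @ M.conj().T))[::-1]
```

The Schmidt rank is the number of entries of `lambdas` above a numerical tolerance, and `lambdas[0]` is the maximal overlap with a product vector from (2.13).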
2.2.3. A fundamental dichotomy: separability vs. entanglement. We now introduce a fundamental concept: the dichotomy between separability and entanglement for quantum states. Let $\mathcal{H}$ be a complex Hilbert space admitting a tensor decomposition
(2.15)  $\mathcal{H} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_k.$
Recall that since 1-dimensional factors may be dropped, we may (and usually will) assume that all the factors are of dimension at least 2.
Definition 2.7. A pure state $\rho = |\chi\rangle\langle\chi|$ on $\mathcal{H}$ is said to be pure separable if the unit vector $\chi$ is a product vector, i.e., if there exist unit vectors $\chi_1, \dots, \chi_k$ such that $\chi = \chi_1 \otimes \cdots \otimes \chi_k$. In that case,
(2.16)  $\rho = |\chi_1\rangle\langle\chi_1| \otimes \cdots \otimes |\chi_k\rangle\langle\chi_k|.$
Extending the definition of separability to mixed states requires considering convex combinations (we study in detail the convex hull operation $A \mapsto \mathrm{conv}(A)$ in Section 1.1.2).
Definition 2.8. A mixed state $\rho$ on $\mathcal{H}$ is said to be separable if it can be written as a convex combination of pure separable states. We denote by $\mathrm{Sep}(\mathcal{H})$ (or simply by $\mathrm{Sep}$) the set of separable states on $\mathcal{H}$. We have
(2.17)  $\mathrm{Sep}(\mathcal{H}) = \mathrm{conv}\{|\chi_1 \otimes \cdots \otimes \chi_k\rangle\langle\chi_1 \otimes \cdots \otimes \chi_k| : \chi_1 \in \mathcal{H}_1, \dots, \chi_k \in \mathcal{H}_k\}.$
States which are not separable are called entangled. Since pure states are the extreme points even of the larger set $\mathrm{D}(\mathcal{H})$ (Proposition 2.1), it follows that the pure separable states (i.e., those given by (2.16)) are exactly the extreme points of $\mathrm{Sep}(\mathcal{H})$. Since there are vectors that are not product vectors, the set $\mathrm{Sep}(\mathcal{H})$ is a proper subset of $\mathrm{D}(\mathcal{H})$. A schematic representation of the inclusion $\mathrm{Sep} \subset \mathrm{D}$ and of the corresponding extreme points can be found in Figure 2.1.
An alternative description of the set $\mathrm{Sep}(\mathcal{H})$ is the following: it is the convex hull of product states,
(2.18)  $\mathrm{Sep}(\mathcal{H}) = \mathrm{conv}\{\rho_1 \otimes \cdots \otimes \rho_k : \rho_1 \in \mathrm{D}(\mathcal{H}_1), \dots, \rho_k \in \mathrm{D}(\mathcal{H}_k)\}.$
It is noteworthy that $\mathrm{Sep}(\mathcal{H})$ and $\mathrm{D}(\mathcal{H})$ have the same dimension. This can be seen from the following observation. Let $V_1, \dots, V_k$ be real or complex vector spaces and, for each $i$, let $F_i$ be a family of linearly independent vectors in $V_i$. Then the family
$\bigotimes F_i = \{f_1 \otimes \cdots \otimes f_k : f_i \in F_i\}$
is linearly independent in $\bigotimes V_i$. We apply the observation with $V_i = B^{\mathrm{sa}}(\mathcal{H}_i)$ and with $F_i$ being a basis of $B^{\mathrm{sa}}(\mathcal{H}_i)$ consisting of states. This way, we obtain a family of $(\dim \mathcal{H})^2$ linearly independent product states which are elements of $\mathrm{Sep}(\mathcal{H})$. This shows that $\mathrm{Sep}(\mathcal{H})$ has dimension $(\dim \mathcal{H})^2 - 1$. Note that this argument uses the fact that the field is $\mathbb{C}$: in real quantum mechanics, the set of separable states has empty interior (cf. Section 0.4).
[Figure 2.1 (image not reproduced). Labels in the picture: pure states; D = conv{pure states}; pure product states; Sep = conv{pure product states}; $\rho_* = \mathrm{I}/d^2$.]
Figure 2.1. The sets of states ($\mathrm{D}$) and of separable states ($\mathrm{Sep}$) on $\mathbb{C}^d \otimes \mathbb{C}^d$. Pure product states have measure zero inside the set of pure states; however both convex hulls have the same dimension. The picture does not respect convexity of $\mathrm{Sep}$, but it is supposed to reflect the relative rarity of separability.
A deeper result asserts that, in the bipartite case, not only do $\mathrm{Sep}$ and $\mathrm{D}$ have the same dimension, they also have the same inradius. This may look surprising since $\mathrm{Sep}$ is defined as the convex hull of a very small subset of the set of extreme points of $\mathrm{D}$. This remarkable fact was discovered by Gurvits and Barnum and will be proved later (see Theorem 9.15).
It is often useful to consider the cone
$\mathrm{SEP}(\mathcal{H}) = \{\lambda\rho : \lambda \geq 0,\ \rho \in \mathrm{Sep}(\mathcal{H})\}$
of separable operators; we will return to this in Section 2.4.
We emphasize that the notion of separability depends crucially on the tensor decomposition (2.15) of $\mathcal{H}$. As a concrete example, consider a tripartite space $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3$. There are several different notions of separability on $\mathcal{H}$: separability with respect to the tripartition $\mathcal{H}_1 : \mathcal{H}_2 : \mathcal{H}_3$, and separability with respect to each of the three bipartitions $\mathcal{H}_1 : \mathcal{H}_2 \otimes \mathcal{H}_3$, $\mathcal{H}_2 : \mathcal{H}_1 \otimes \mathcal{H}_3$ and $\mathcal{H}_3 : \mathcal{H}_1 \otimes \mathcal{H}_2$ or combinations thereof. Moreover, some authors introduce the concept of “absolute” properties. For example, a state $\rho \in \mathrm{D}(\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_k)$ is absolutely separable if $U\rho U^\dagger$ is separable for any unitary operator $U$ on $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_k$. However, in this book we will focus primarily on the setting in which all partitions are fixed.
Although the extreme points of $\mathrm{Sep}$ are very easy to describe (as noted earlier, they are precisely the pure product states), there is no simple description of the facial structure of $\mathrm{Sep}$ available (compare with Proposition 2.1, which describes all the faces of $\mathrm{D}$). The complexity of the facial structure of $\mathrm{Sep}$ can be related to the fact that deciding whether a state is separable is known to be, in the general setting, NP-hard. This makes calculating some parameters of $\mathrm{Sep}$ highly nontrivial; we will run into this problem in Chapter 9 (see, e.g., Theorem 9.6). Finally, in view of the dual formulation of the problem of describing faces of a convex body (see Section 1.1.5, and particularly Proposition 1.5), characterizing maximal faces of $\mathrm{Sep}$ is essentially equivalent to describing extreme points of the object dual to $\mathrm{Sep}$ (see (2.47)), which are well understood only for very small dimensions. (Appendix C discusses closely related issues.)
Exercise 2.10 (The length of separable representations). (i) Using Carathéodory’s theorem (see Section 1.1.2), show that any separable state on $\mathbb{C}^d \otimes \mathbb{C}^d$ can be written as the convex combination of at most $d^4$ pure product states. (ii) Using a dimension-counting argument, prove that there exist separable states on $\mathbb{C}^d \otimes \mathbb{C}^d$ which cannot be written as a convex combination of fewer than $cd^3$ pure product states, for some constant $c > 0$.
Exercise 2.11 (Edges of Sep). Let $d_1, d_2 \geq 2$. Show that $\mathrm{Sep}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$ has a face (as defined in Section 1.1.3) which is 1-dimensional.
2.2.4. Some examples of bipartite states. We now present some examples of states on $\mathbb{C}^d \otimes \mathbb{C}^d$ that are widely used in quantum information theory.
2.2.4.1. Maximally entangled states. A pure state on $\mathbb{C}^d \otimes \mathbb{C}^d$ is called maximally entangled if it has the form $\rho = |\psi\rangle\langle\psi|$ with
(2.19)  $\psi = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} e_i \otimes f_i,$
where $(e_i)_{1 \leq i \leq d}$ and $(f_i)_{1 \leq i \leq d}$ are two orthonormal bases in $\mathbb{C}^d$. Such a vector $\psi$ is called a maximally entangled vector.
In the special case of $d = 2$, i.e., for systems formed of 2 qubits, the maximally entangled states are called Bell states. Many quantum information protocols, such as quantum teleportation, use Bell states as a fundamental resource.
If we identify vectors and matrices as explained in Section 0.8, the set of all maximally entangled vectors on $\mathbb{C}^d \otimes \mathbb{C}^d$ (or, more precisely, on Cd b Cd ) identifies with the unitary group $\mathrm{U}(d) \subset M_d$.
Exercise 2.12 (Maximally entangled states and trace duality). Let $\psi$ be the maximally entangled state given by (2.19), with $(e_i)$ and $(f_i)$ both equal to the canonical basis $(|i\rangle)_{1 \leq i \leq d}$, and let $\rho = |\psi\rangle\langle\psi|$. Show that $\mathrm{Tr}\big(\rho\,(X \otimes Y)\big) = \frac{1}{d}\,\mathrm{Tr}(XY^T)$ for any $X, Y \in B(\mathbb{C}^d)$.
Exercise 2.13 (Maximal entanglement and the distance to Seg). Let $\psi$ be a unit vector in $\mathbb{C}^d \otimes \mathbb{C}^d$ and $\mathrm{Seg} \subset S_{\mathbb{C}^d \otimes \mathbb{C}^d}$ the set of unit product vectors (see (B.6)). Show that $|\psi\rangle\langle\psi|$ is maximally entangled if and only if $\mathrm{dist}(\psi, \mathrm{Seg})$ is maximal. For extensions to the multipartite case, see Section 8.5.
2.2.4.2. Isotropic states. Isotropic states are states which are a convex (or affine) combination of the maximally mixed state and a maximally entangled state. They have the form
(2.20)  $\rho_\beta = \beta\,|\psi\rangle\langle\psi| + (1 - \beta)\,\frac{\mathrm{I}}{d^2},$
where $\psi$ is as in (2.19) and $-\frac{1}{d^2 - 1} \leq \beta \leq 1$.
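The admissible range of $\beta$ in (2.20) is exactly the range for which $\rho_\beta$ is positive: its eigenvalues are $\beta + (1-\beta)/d^2$ (once) and $(1-\beta)/d^2$ (with multiplicity $d^2 - 1$). A minimal numerical sketch, assuming NumPy and taking both bases in (2.19) to be the canonical one (an illustrative choice):

```python
import numpy as np

def maximally_entangled(d):
    """The vector (2.19) with e_i = f_i = canonical basis, as a length d*d array."""
    return np.eye(d).reshape(d * d) / np.sqrt(d)

def isotropic_state(beta, d):
    """The operator rho_beta of (2.20); a state only for -1/(d^2-1) <= beta <= 1."""
    psi = maximally_entangled(d)
    return beta * np.outer(psi, psi.conj()) + (1 - beta) * np.eye(d * d) / d**2

d = 3
beta_min = -1.0 / (d**2 - 1)
min_eig_at_boundary = np.linalg.eigvalsh(isotropic_state(beta_min, d)).min()
min_eig_below = np.linalg.eigvalsh(isotropic_state(beta_min - 0.05, d)).min()
```

At `beta_min` the smallest eigenvalue vanishes (up to rounding), and just below it the operator is no longer positive.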
2.2.4.3. Werner states. Consider the flip operator $F \in B^{\mathrm{sa}}(\mathbb{C}^d \otimes \mathbb{C}^d)$ defined on pure tensors by $F(x \otimes y) = y \otimes x$ and extended by linearity. Its eigenspaces are the symmetric subspace
$\mathrm{Sym}_d = \{\psi \in \mathbb{C}^d \otimes \mathbb{C}^d : F(\psi) = \psi\}$
and the antisymmetric subspace
$\mathrm{Asym}_d = \{\psi \in \mathbb{C}^d \otimes \mathbb{C}^d : F(\psi) = -\psi\}.$
The corresponding projectors are $P_{\mathrm{Sym}_d} = \frac{1}{2}(\mathrm{I} + F)$ and $P_{\mathrm{Asym}_d} = \frac{1}{2}(\mathrm{I} - F)$. We need to know that the symmetric and antisymmetric subspaces are irreducible for the action $U \mapsto U \otimes U$ of the unitary group.
Proposition 2.9 (see Exercise 2.15). Let $E \subsetneq \mathbb{C}^d \otimes \mathbb{C}^d$ be a nonzero subspace such that for every $U \in \mathrm{U}(d)$ and $\psi \in E$, we have $(U \otimes U)\psi \in E$. Then either $E = \mathrm{Sym}_d$ or $E = \mathrm{Asym}_d$.
Note that $\dim \mathrm{Sym}_d = d(d+1)/2$ while $\dim \mathrm{Asym}_d = d(d-1)/2$. The symmetric and antisymmetric states are defined respectively as
$\pi_s = \frac{2}{d(d+1)}\,P_{\mathrm{Sym}_d} \quad\text{and}\quad \pi_a = \frac{2}{d(d-1)}\,P_{\mathrm{Asym}_d}.$
For $\lambda \in [0, 1]$, consider the state $w_\lambda$ (called the Werner state) obtained as a convex combination of these two projectors
(2.21)  $w_\lambda = \lambda\pi_s + (1 - \lambda)\pi_a.$
Another equivalent expression is
(2.22)  $w_\lambda = \frac{1}{d^2 - d\alpha}\,(\mathrm{I} - \alpha F),$
where
(2.23)  $\alpha = \frac{1 + d(1 - 2\lambda)}{1 + d - 2\lambda} \in [-1, 1].$
When $d = 2$, the space $\mathrm{Asym}_2$ has dimension one, and Werner states are then a special case of isotropic states.
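The equivalence of (2.21) and (2.22)–(2.23) follows by comparing the coefficients of $\mathrm{I}$ and $F$, and it can also be confirmed numerically. The sketch below assumes NumPy; the function names are illustrative.

```python
import numpy as np

def flip(d):
    """Matrix of the flip operator F(x (x) y) = y (x) x in the product basis."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0   # e_j (x) e_i  ->  e_i (x) e_j
    return F

def werner_convex(lam, d):
    """w_lambda as in (2.21)."""
    I, F = np.eye(d * d), flip(d)
    pi_s = (I + F) / (d * (d + 1))   # 2/(d(d+1)) * (I + F)/2
    pi_a = (I - F) / (d * (d - 1))   # 2/(d(d-1)) * (I - F)/2
    return lam * pi_s + (1 - lam) * pi_a

def werner_flip(lam, d):
    """w_lambda as in (2.22), with alpha given by (2.23)."""
    alpha = (1 + d * (1 - 2 * lam)) / (1 + d - 2 * lam)
    return (np.eye(d * d) - alpha * flip(d)) / (d**2 - d * alpha)

w = werner_convex(0.3, 3)
```

For every $\lambda \in [0,1]$ and $d \geq 2$ the two constructions should produce the same matrix, with trace one.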


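Exercise 2.14 (Polarization formulas in $\mathrm{Sym}_d$ and $\mathrm{Asym}_d$). Prove that $\mathrm{Sym}_d = \mathrm{span}\{x \otimes x : x \in \mathbb{C}^d\}$ and $\mathrm{Asym}_d = \mathrm{span}\{x \otimes y - y \otimes x : x, y \in \mathbb{C}^d\}$.
Exercise 2.15 (Irreducibility of $\mathrm{Sym}_d$ and $\mathrm{Asym}_d$). Denote by $\mathcal{A} = \mathrm{span}\{U \otimes U : U \in \mathrm{U}(d)\}$.
(i) Prove that for every subspace $E \subset \mathbb{C}^d$, $P_E \otimes P_E \in \mathcal{A}$.
(ii) Show that for every nonzero vectors $\varphi, \psi \in \mathrm{Sym}_d$, there is $V \in \mathcal{A}$ such that $\langle\varphi|V|\psi\rangle \neq 0$.
(iii) Show that for every nonzero vectors $\varphi, \psi \in \mathrm{Asym}_d$, there is $V \in \mathcal{A}$ such that $\langle\varphi|V|\psi\rangle \neq 0$.
(iv) Deduce Proposition 2.9.
Exercise 2.16 (The twirling channel and Werner states).
(i) Show that a state $\rho \in \mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$ satisfies $(V \otimes V)\rho(V \otimes V)^\dagger = \rho$ for all $V \in \mathrm{U}(d)$ if and only if it is a Werner state.
(ii) Show that if $U$ is chosen at random with respect to the Haar measure on $\mathrm{U}(\mathbb{C}^d)$, then for any $\rho \in \mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$, $\mathbf{E}\,(U \otimes U)\rho(U \otimes U)^\dagger = w_\lambda$ with $\lambda = \mathrm{Tr}(\rho P_{\mathrm{Sym}_d})$. (The map $\rho \mapsto \mathbf{E}\,(U \otimes U)\rho(U \otimes U)^\dagger$ is called the twirling channel.)
(iii) Show that if $\psi \in S_{\mathbb{C}^d}$ is chosen uniformly at random, then $\mathbf{E}\,|\psi \otimes \psi\rangle\langle\psi \otimes \psi| = \pi_s$.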
2.2.5. Entanglement hierarchies.
2.2.5.1. k-extendible states. Consider a bipartite Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2$ and $k \geq 2$. For $i \in \{1, \dots, k\}$, we denote by
$\mathrm{Tr}_{\text{all but } i} : B(\mathcal{H}_1 \otimes \mathcal{H}_2^{\otimes k}) \to B(\mathcal{H}_1 \otimes \mathcal{H}_2)$
the partial trace with respect to all copies of $\mathcal{H}_2$, except for the $i$th. A state $\rho \in \mathrm{D}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ is said to be k-extendible (with respect to $\mathcal{H}_2$) if there exists a state $\rho_k \in \mathrm{D}(\mathcal{H}_1 \otimes \mathcal{H}_2^{\otimes k})$ with the property that
$\mathrm{Tr}_{\text{all but } i}\,\rho_k = \rho$
for every $i \in \{1, \dots, k\}$. The state $\rho_k$ is called a k-extension of $\rho$. The main result regarding k-extendible states is the following theorem.
Theorem 2.10 (not proved here). A quantum state on $\mathcal{H}_1 \otimes \mathcal{H}_2$ is separable if and only if it is k-extendible for every $k \geq 2$.
The “only if” direction is easy (see Exercise 2.17), while the “if” direction relies on the quantum de Finetti theorem and is beyond the scope of this book.
Exercise 2.17. For $k \geq 2$, denote by $k\text{-}\mathrm{Ext}$ the set of k-extendible states on $\mathcal{H}_1 \otimes \mathcal{H}_2$. Show that $k\text{-}\mathrm{Ext}$ is convex and check the inclusions $\mathrm{Sep} \subset l\text{-}\mathrm{Ext} \subset k\text{-}\mathrm{Ext}$ for $k \leq l$.
Exercise 2.18 (2-extendibility of pure states). (i) Let $\rho \in \mathrm{D}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ be a state such that $\mathrm{Tr}_{\mathcal{H}_2}\rho = |\psi\rangle\langle\psi|$ for some $\psi \in \mathcal{H}_1$. Show that $\rho = |\psi\rangle\langle\psi| \otimes \sigma$ for some $\sigma \in \mathrm{D}(\mathcal{H}_2)$. (ii) Let $\chi \in \mathcal{H}_1 \otimes \mathcal{H}_2$ be a unit vector. Show that $|\chi\rangle\langle\chi|$ is 2-extendible if and only if $\chi$ is a product vector.
2.2.5.2. k-entangled states. A quantum state on $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ is said to be k-entangled if it can be written as a convex combination
$\sum_i \lambda_i\, |\psi_i\rangle\langle\psi_i|$
where each unit vector $\psi_i \in \mathcal{H}_1 \otimes \mathcal{H}_2$ has Schmidt rank at most $k$. Note that separable states are exactly 1-entangled states.
2.2.6. Partial transposition. Let $\mathcal{H}$ be a complex Hilbert space, and let $(e_j)$ be an orthonormal basis in $\mathcal{H}$. We can identify $B(\mathcal{H})$ with the set of $n \times n$ matrices by associating a matrix $(a_{ij})$ with the operator
$\sum_{i,j} a_{ij}\, |e_i\rangle\langle e_j|.$
Once the basis is fixed, it makes sense to consider the transposition $T : B(\mathcal{H}) \to B(\mathcal{H})$ with respect to that basis, defined as
$T\Big(\sum_{i,j} a_{ij}\, |e_i\rangle\langle e_j|\Big) = \sum_{i,j} a_{ij}\, |e_j\rangle\langle e_i|.$
We will sometimes use the alternative notation $A^T = T(A)$. Note that $T$ is not canonical and depends on the choice of the basis in $\mathcal{H}$. The standard usage in linear algebra refers to the transposition with respect to the standard basis $(|j\rangle)_{j=1}^{\dim \mathcal{H}}$.
We now define the partial transposition: if $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ is a bipartite Hilbert space, and if $T$ denotes the transposition on $B(\mathcal{H}_1)$ (with respect to a specified basis) and $\mathrm{Id}$ is the identity operation of $B(\mathcal{H}_2)$, then the partial transposition (or partial transpose) is the operation
$\Gamma = T \otimes \mathrm{Id} : B(\mathcal{H}_1 \otimes \mathcal{H}_2) \to B(\mathcal{H}_1 \otimes \mathcal{H}_2).$
The partial transposition of a state $\rho \in \mathrm{D}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ is denoted by $\rho^\Gamma = \Gamma(\rho)$. What we have defined is actually the partial transposition with respect to the first factor. The partial transposition with respect to the second factor is defined by switching the roles of $\mathcal{H}_1$ and $\mathcal{H}_2$.
Partial transposition applies nicely to states represented as block matrices (see Section 0.7): if $\rho \in \mathrm{D}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ corresponds to the block operator $(A_{ij})$, with $A_{ij} \in B(\mathcal{H}_2)$, then $\rho^\Gamma$ corresponds to the block operator $(A_{ji})$. Similarly, partial transposition of $\rho$ with respect to the second factor corresponds to the block operator $(A_{ij}^T)$. We illustrate this by computing the partial transposition of the (maximally entangled) Bell state: if $\psi = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$, then (assuming transposition is taken with respect to the canonical basis of $\mathbb{C}^2$)
(2.24)  $|\psi\rangle\langle\psi| = \frac{1}{2}\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}, \qquad |\psi\rangle\langle\psi|^\Gamma = \frac{1}{2}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$
As for the usual transposition, the partial transposition depends on a choice of basis. However, we have the following result.
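The block-matrix description of $\Gamma$ translates directly into an index permutation. The following sketch (assuming NumPy and the canonical product basis with `np.kron` ordering; the function name is illustrative) reproduces the Bell-state computation (2.24).

```python
import numpy as np

def partial_transpose(rho, d1, d2, factor=1):
    """Partial transposition on C^d1 (x) C^d2 with respect to the canonical basis.

    factor=1 transposes the first tensor factor (swaps the blocks A_ij and A_ji),
    factor=2 transposes the second one (transposes each block).
    """
    r = rho.reshape(d1, d2, d1, d2)          # r[i, j, k, l] = <ij| rho |kl>
    if factor == 1:
        r = r.transpose(2, 1, 0, 3)          # swap i and k
    elif factor == 2:
        r = r.transpose(0, 3, 2, 1)          # swap j and l
    else:
        raise ValueError("factor must be 1 or 2")
    return r.reshape(d1 * d2, d1 * d2)

# Bell state (|00> + |11>)/sqrt(2) and its partial transpose, cf. (2.24).
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
bell = np.outer(psi, psi)
bell_gamma = partial_transpose(bell, 2, 2, factor=1)
spectrum = np.sort(np.linalg.eigvalsh(bell_gamma))
```

The matrix `bell_gamma` is half of the flip operator on $\mathbb{C}^2 \otimes \mathbb{C}^2$, so its eigenvalues are $-1/2$ (once) and $1/2$ (three times); in particular it is not positive.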
Proposition 2.11. The eigenvalues of the partial transposition of an operator do not depend on a choice of basis.
Proof. Let $(e_i)$ and $(e'_i)$ be two orthonormal bases in $\mathcal{H}_1$, and $T$ and $T'$ denote the transpositions with respect to each basis. Let $U$ be the unitary transformation such that $e'_j = U(e_j)$. We claim that, for every operator $X \in B(\mathcal{H}_1)$,
(2.25)  $T'(X) = V\,T(X)\,V^\dagger,$
where $V = U\,T(U)$. By linearity, it is enough to check (2.25) when $X = |e'_i\rangle\langle e'_j|$, in which case $T'(X) = |e'_j\rangle\langle e'_i|$. On the other hand, since $X = U|e_i\rangle\langle e_j|U^\dagger$, we then have
$T(X) = T(U^\dagger)\,|e_j\rangle\langle e_i|\,T(U) = T(U^\dagger)\,U^\dagger\,|e'_j\rangle\langle e'_i|\,U\,T(U) = T(U)^\dagger\,U^\dagger\,|e'_j\rangle\langle e'_i|\,U\,T(U) = V^\dagger\,T'(X)\,V,$
which is equivalent to (2.25) because $V$ is unitary. This shows that the partial transpositions with respect to the two bases are conjugated via the unitary transformation $V \otimes \mathrm{I}$, and the claim follows since unitary conjugation preserves the spectrum. □

Partial transposition naturally extends to the multipartite setting: if H “


H1 b ¨ ¨ ¨ b Hk , then for any i P t1, . . . , ku we may define the partial transposition
with respect to the ith factor as
Γi :“ IdBpH1 q b ¨ ¨ ¨ b IdBpHi´1 q bT b IdBpHi`1 q b ¨ ¨ ¨ b IdBpHk q .
Exercise 2.19 (Eigenvalues of the partial transpose of a pure state). Find
all eigenvalues of the partial transpose of a pure state in terms of the Schmidt
coefficients of that state.
Exercise 2.20 (Partial transpose and the flip operator). Let
$\psi = \frac{1}{\sqrt{d}} \sum_{i=1}^d e_i \otimes e_i$ be a maximally entangled state on $\mathbb{C}^d \otimes \mathbb{C}^d$ and assume
that partial transposition is computed with respect to the basis $(e_i)$. Show that
$|\psi\rangle\langle\psi|^\Gamma = \frac{1}{d} F$, where $F : x \otimes y \mapsto y \otimes x$ is the flip operator.
Exercise 2.21. Find an error in the following argument that purports to mimic
the proof of Proposition 2.11 to show that the partial transpose of any state is
positive.

If $X \in B^{sa}(H_1)$, then $T(X)$ (with respect to some fixed basis) has the same spectrum
as $X$ and so there is a unitary operator $V$ such that $T(X) = V^\dagger X V$. This shows
that the partial transpose with respect to the same basis is given by conjugation
by the unitary transformation $V \otimes I$. Since such conjugation preserves spectra, it
follows that the partial transpose of any state is positive.

2.2.7. PPT states.

Definition 2.12. A state $\rho \in D(H_1 \otimes H_2)$ is said to have a positive partial
transpose (or to be PPT) if the operator $\rho^\Gamma$ is positive. We denote by $\mathrm{PPT}(H_1 \otimes H_2)$,
or simply PPT, the set of PPT states (note that this set is convex).
Proposition 2.11 implies that the definition of PPT states is basis-independent.
Similarly, we do not need to specify whether we apply the partial transposition to
the first or the second factor; one passes from one to the other by applying the full
transposition, which is a spectrum-preserving operation.
Let $\rho$ be a state on $H_1 \otimes H_2$. Since the partial transposition preserves the trace,
we have $\operatorname{Tr} \rho^\Gamma = 1$, and therefore $\rho$ is PPT if and only if $\rho^\Gamma$ is a state. Geometrically,
the set of PPT states can therefore be described as an intersection
\[
\mathrm{PPT} = D \cap \Gamma(D). \tag{2.26}
\]
The map $\Gamma$ is a linear map which preserves the Hilbert–Schmidt norm, and
therefore behaves as an isometry (see Exercise 2.22). This map is not a canonical
object and depends on the choice of a basis. However, the intersection $D \cap \Gamma(D)$
does not depend on the particular basis used.

The next proposition lies at the root of the relevance of the concept of PPT
states to quantum information theory.

Proposition 2.13 (Peres–Horodecki criterion). Let $\rho$ be a state on $H_1 \otimes H_2$.
If $\rho$ is separable, then $\rho$ is PPT. In other words, we have the inclusion
\[
\mathrm{Sep}(H_1 \otimes H_2) \subset \mathrm{PPT}(H_1 \otimes H_2). \tag{2.27}
\]

Proof. Since the set PPT is convex, it suffices to show that the extreme points
of $\mathrm{Sep}(H_1 \otimes H_2)$ are PPT. The extreme points of $\mathrm{Sep}(H_1 \otimes H_2)$ are pure product
states, i.e., states of the form
\[
\rho = |\psi_1 \otimes \psi_2\rangle\langle\psi_1 \otimes \psi_2| = |\psi_1\rangle\langle\psi_1| \otimes |\psi_2\rangle\langle\psi_2|
\]
for unit vectors $\psi_1 \in H_1$, $\psi_2 \in H_2$. The partial transpose of such a state is
\[
\rho^\Gamma = |\psi_1\rangle\langle\psi_1|^T \otimes |\psi_2\rangle\langle\psi_2| = |\overline{\psi_1}\rangle\langle\overline{\psi_1}| \otimes |\psi_2\rangle\langle\psi_2|,
\]
where $\overline{\psi_1}$ is the vector obtained by applying the complex conjugation to each coordinate
of $\psi_1$. It follows that $\rho^\Gamma$ is positive, hence $\rho$ is PPT. □
[Figure 2.2: nested convex regions labeled Sep, PPT = D ∩ Γ(D), and Γ(D).]

Figure 2.2. An illustration of the inclusion $\mathrm{Sep} \subset \mathrm{PPT} = D \cap \Gamma(D)$.
The inclusion is strict if and only if $\dim H_1 \dim H_2 > 6$,
see Theorem 2.15. The set Sep is not a polytope, but the set of
its extreme points is much “thinner” than those of $D$ and of PPT
if the dimension is large.

The Peres–Horodecki criterion (or the PPT criterion) is shown in action in
(2.24), where it certifies non-separability of the Bell state: the partial transpose
$|\psi\rangle\langle\psi|^\Gamma$ is clearly non-positive (it has the eigenvalue $-1/2$). However, positivity
of $\rho^\Gamma$ is, in general, only a necessary condition for separability of $\rho$ as, without
additional assumptions, the inclusion (2.27) is strict. Still, there are two important
cases where PPT states are guaranteed to be separable: pure states and states in
low dimensions, specifically in $\mathbb{C}^2 \otimes \mathbb{C}^2$ and $\mathbb{C}^2 \otimes \mathbb{C}^3$.

Lemma 2.14. A pure state is PPT if and only if it is separable.

Proof. Let $\rho = |\psi\rangle\langle\psi|$ be a pure state, and let $\psi = \sum_i \lambda_i \chi_i \otimes \psi_i$ be a Schmidt
decomposition. If we compute the partial transposition with respect to a basis
including $(\chi_i)$, we obtain
\[
\rho^\Gamma = \sum_{i,j} \lambda_i \lambda_j |\chi_i \otimes \psi_j\rangle\langle\chi_j \otimes \psi_i|. \tag{2.28}
\]
Suppose there exist two non-zero Schmidt coefficients (say, $\lambda_i$ and $\lambda_j$ with $i \neq j$).
Then one checks from (2.28) that the restriction of $\rho^\Gamma$ to $\mathrm{span}\{\chi_i \otimes \psi_j, \chi_j \otimes \psi_i\}$ is
not positive. It follows that $\rho$ is PPT if and only if only one Schmidt coefficient
of $\psi$ is nonzero, which means that $\psi$ is a product vector and, consequently, $\rho$ is
separable. (See Exercise 2.19 for a complete description of the spectrum of $\rho^\Gamma$.) □
Theorem 2.15 (Størmer–Woronowicz theorem, see Section 2.4.5 for the $2 \otimes 2$
case; the $2 \otimes 3$ case is not proved here). If $H = \mathbb{C}^2 \otimes \mathbb{C}^2$ or $H = \mathbb{C}^3 \otimes \mathbb{C}^2$ or
$H = \mathbb{C}^2 \otimes \mathbb{C}^3$, then every PPT state on $H$ is separable.
Examples of entangled PPT states are known for any other (nontrivial) pairs
of dimensions.
Besides pure and low-dimensional states, another family of states for which
separability and the PPT property are equivalent is that of Werner states. We have

Proposition 2.16 (Separability of Werner states). For $\lambda \in [0, 1]$, let $w_\lambda$ be the
Werner state on $H = \mathbb{C}^d \otimes \mathbb{C}^d$ as defined in (2.21). The following are equivalent:
(i) $w_\lambda$ is separable,
(ii) $w_\lambda$ is PPT,
(iii) $\operatorname{Tr} w_\lambda F \geq 0$,
(iv) $\lambda \geq 1/2$.
Proof. The equivalence (iii) $\iff$ (iv) is a straightforward calculation (we have
$\operatorname{Tr} w_\lambda F = 2\lambda - 1$). To show that (ii) $\iff$ (iv), we compute the partial transpose of
Werner states in the form (2.22) to obtain (see also Exercise 2.20)
\[
w_\lambda^\Gamma = \frac{1}{d^2 - d\alpha} \bigl( I - \alpha d\, |x\rangle\langle x| \bigr),
\]
where $x$ is the maximally entangled vector in the canonical basis $(|i\rangle)_{1 \leq i \leq d}$. It
follows that $w_\lambda^\Gamma \geq 0 \iff \alpha \leq 1/d \iff \lambda \geq 1/2$ (see (2.23) for the second
equivalence). It remains to prove that (iv) implies (i); since Sep is convex, it is
enough to establish that $w_1$ and $w_{1/2}$ are separable. The separability of $w_1 = \pi_s$ is
clear from part (iii) of Exercise 2.16. To show that $w_{1/2}$ is separable, we proceed
as follows. For $j \neq k$ and a complex number $\xi$ with modulus one, denote
$v^\pm = |j\rangle \pm \xi|k\rangle$. Next, think of $\xi$ as a random variable uniformly distributed on the unit
circle. The operator $\mathbf{E}\, |v^+\rangle\langle v^+| \otimes |v^-\rangle\langle v^-|$ belongs to the separable cone SEP. We
compute
\[
\mathbf{E}\, |v^+ v^-\rangle\langle v^+ v^-| = |jj\rangle\langle jj| + |kk\rangle\langle kk| + |jk\rangle\langle jk| + |kj\rangle\langle kj| - |jk\rangle\langle kj| - |kj\rangle\langle jk|,
\]
where we omitted the symbols $\otimes$ to reduce the clutter. Summing over $j \neq k$, we
obtain that
\[
A := 2d \sum_j |j\rangle\langle j| \otimes |j\rangle\langle j| + 2 \sum_{j \neq k} |j\rangle\langle j| \otimes |k\rangle\langle k| - 2F \in \mathrm{SEP}.
\]
The separability of $w_{1/2}$ follows now from the identity
\[
w_{1/2} = \frac{1}{d(d^2 - 1)} (d\, I - F) = \frac{1}{d(d^2 - 1)} \Bigl( \frac{A}{2} + (d - 1) \sum_{j \neq k} |j\rangle\langle j| \otimes |k\rangle\langle k| \Bigr),
\]
where the first equality is just (2.22) (note that $\lambda = 1/2$ implies $\alpha = 1/d$ by
(2.23)). □
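The threshold in Proposition 2.16 can be observed numerically. The sketch below assumes NumPy and assumes that (2.22), which is not reproduced in this section, parametrizes Werner states as $(I - \alpha F)/(d^2 - d\alpha)$ with $-1 \leq \alpha \leq 1$; this form is inferred from the formulas in the proof. The function names are ours.

```python
import numpy as np

def partial_transpose(rho, d):
    """Transpose the first tensor factor of an operator on C^d (x) C^d."""
    t = rho.reshape(d, d, d, d)
    return t.transpose(2, 1, 0, 3).reshape(d * d, d * d)

def flip(d):
    """Flip operator F: x (x) y -> y (x) x on C^d (x) C^d."""
    f = np.zeros((d * d, d * d))
    for j in range(d):
        for k in range(d):
            f[j * d + k, k * d + j] = 1.0
    return f

def werner(alpha, d):
    """Assumed form of (2.22): (I - alpha F) / (d^2 - d alpha)."""
    return (np.eye(d * d) - alpha * flip(d)) / (d * d - d * alpha)

d = 3
for alpha in (-1.0, 0.0, 1.0 / d, 0.5, 1.0):
    w = werner(alpha, d)
    lam = (1 + np.trace(w @ flip(d))) / 2        # uses Tr(w F) = 2*lambda - 1
    min_eig = np.linalg.eigvalsh(partial_transpose(w, d)).min()
    print(f"alpha={alpha:+.3f}  lambda={lam:.3f}  min eig of w^Gamma={min_eig:+.4f}")
```

The smallest eigenvalue of $w^\Gamma$ changes sign exactly at $\alpha = 1/d$, i.e., at $\lambda = 1/2$.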
Exercise 2.22 (Partial transposition as a reflection). Find a subspace
$E \subset B^{sa}(H_1 \otimes H_2)$ such that $\Gamma = 2P_E - \mathrm{Id}$, where $P_E$ denotes the orthogonal projection
onto $E$. Geometrically, $\Gamma$ identifies with the reflection with respect to $E$.

Exercise 2.23 (Separability of isotropic states). For $-\frac{1}{d^2 - 1} \leq \beta \leq 1$, let
$\rho_\beta \in D(\mathbb{C}^d \otimes \mathbb{C}^d)$ be the isotropic state as defined in (2.20). Show that $\rho_\beta$ is
separable if and only if $\beta \leq \frac{1}{d+1}$.
Exercise 2.24 (The realignment criterion). The realignment
$A^R \in B(\mathbb{C}^{d_2} \otimes \mathbb{C}^{d_2}, \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_1})$ of an operator $A \in B(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$ is defined as follows: the map
$A \mapsto A^R$ is $\mathbb{C}$-linear, and $|ij\rangle\langle kl|^R = |ik\rangle\langle jl|$.
(i) Let $\rho \in D(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$ be a separable state. Show that $\|\rho^R\|_1 \leq 1$. (The trace
norm $\|\cdot\|_1$ is defined in Section 1.3.2.)
(ii) Let $\rho \in D(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$ be a pure entangled state. Show that $\|\rho^R\|_1 > 1$.
The condition $\|\rho^R\|_1 \leq 1$ is usually called the realignment criterion. Just as for
the PPT criterion, this is a necessary (but generally not sufficient) condition for
separability.
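The realignment map is easy to experiment with numerically. The following sketch (NumPy assumed; `realign` and `trace_norm` are our own helper names) implements the index rearrangement $|ij\rangle\langle kl| \mapsto |ik\rangle\langle jl|$ and evaluates the criterion on one product state and on the Bell state. It illustrates the statement of the exercise and does not replace a proof.

```python
import numpy as np

def realign(a, d1, d2):
    """Realignment |ij><kl| -> |ik><jl|, returned as a (d1*d1) x (d2*d2) matrix."""
    t = a.reshape(d1, d2, d1, d2)            # t[i, j, k, l] = <ij|A|kl>
    return t.transpose(0, 2, 1, 3).reshape(d1 * d1, d2 * d2)

def trace_norm(m):
    """Sum of the singular values."""
    return np.linalg.svd(m, compute_uv=False).sum()

# Product state |0><0| (x) |+><+| on C^2 (x) C^2
e0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
prod = np.kron(np.outer(e0, e0), np.outer(plus, plus))

# Bell state
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
bell = np.outer(psi, psi)

print(trace_norm(realign(prod, 2, 2)))   # 1 for this pure product state
print(trace_norm(realign(bell, 2, 2)))   # 2 > 1: the criterion detects entanglement
```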
2.2.8. Local unitaries and symmetries of Sep. Let us state an analogue
of Kadison’s theorem (Theorem 2.4), which characterizes affine maps preserving
the set Sep. This can be seen as a motivation for the study of partial transposition.
Theorem 2.17 (not proved here). Let $H = \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_k}$ be a multipartite
Hilbert space. An affine map $\Phi : B^{sa}(H) \to B^{sa}(H)$ satisfies $\Phi(\mathrm{Sep}) = \mathrm{Sep}$ if and
only if it can be written as the composition of maps of the following forms:
(i) local unitaries
\[
\rho \mapsto (U_1 \otimes \cdots \otimes U_k)\,\rho\,(U_1 \otimes \cdots \otimes U_k)^\dagger
\]
for $U_i \in U(d_i)$,
(ii) partial transpositions
\[
\rho_1 \otimes \cdots \otimes \rho_i \otimes \cdots \otimes \rho_k \mapsto \rho_1 \otimes \cdots \otimes \rho_i^T \otimes \cdots \otimes \rho_k,
\]
for some $i \in \{1, \dots, k\}$,
(iii) swaps
\[
\rho_1 \otimes \cdots \otimes \rho_i \otimes \cdots \otimes \rho_j \otimes \cdots \otimes \rho_k \mapsto \rho_1 \otimes \cdots \otimes \rho_j \otimes \cdots \otimes \rho_i \otimes \cdots \otimes \rho_k,
\]
for some $i < j$ such that $d_i = d_j$.
All these maps are also isometries with respect to the Hilbert–Schmidt distance.
Although $\mathrm{Sep}(H)$ has a much smaller group of isometries than $D(H)$, the
conclusion of Proposition 2.5 still holds for Sep: the only fixed point is $\rho_*$. This implies
for example that $\rho_*$ is the centroid of Sep.

Proposition 2.18. Consider $H = H_1 \otimes \cdots \otimes H_k$, and let $A \in B^{sa}(H)$ be an
operator which is invariant under local unitaries, i.e., such that
\[
A = (U_1 \otimes \cdots \otimes U_k)\,A\,(U_1 \otimes \cdots \otimes U_k)^\dagger
\]
for any unitary matrices $U_i$ on $H_i$. Then $A$ is a multiple of identity. In particular,
if $A$ is a state, then $A = \rho_*$.

Proof. We use the following elementary fact: an operator $A_j \in B(H_j)$ which
commutes with any unitary operator actually commutes with any operator and is
therefore a multiple of identity. We can write $A$ as a linear combination of product
operators
\[
A = \sum_i c_i\, A_1^{(i)} \otimes \cdots \otimes A_k^{(i)},
\]
where $A_j^{(i)} \in B^{sa}(H_j)$. Let $U = U_1 \otimes \cdots \otimes U_k$, where $(U_j)$ are random unitary
matrices, independent and Haar-distributed on the corresponding unitary groups.
By the translation-invariance of the Haar measure (see Appendix B.3), the operator
$\mathbf{E}\, U_j A_j^{(i)} U_j^\dagger$ commutes with any unitary operator on $H_j$ and therefore (by the
preceding fact) equals $\alpha_{i,j} I_{H_j}$ for some $\alpha_{i,j} \in \mathbb{R}$. By independence, it follows that
\[
\mathbf{E}\bigl(U A U^\dagger\bigr) = \sum_i c_i\, \mathbf{E}\bigl(U_1 A_1^{(i)} U_1^\dagger \otimes \cdots \otimes U_k A_k^{(i)} U_k^\dagger\bigr)
= \sum_i c_i\, \bigl(\mathbf{E}\, U_1 A_1^{(i)} U_1^\dagger\bigr) \otimes \cdots \otimes \bigl(\mathbf{E}\, U_k A_k^{(i)} U_k^\dagger\bigr)
= \Bigl(\sum_i c_i \prod_{j=1}^k \alpha_{i,j}\Bigr) I_H.
\]
Since $U A U^\dagger = A$, the conclusion follows. □

However, the group of local unitaries does not act irreducibly: there are
nontrivial invariant subspaces which are described by the following lemma.
Lemma 2.19 (not proved here). Let $H = \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_k}$ be a multipartite
Hilbert space, and
\[
G = \{U_1 \otimes \cdots \otimes U_k : U_i \in U(d_i)\}
\]
be the group of local unitaries. For $1 \leq i \leq k$, write $M_{d_i}^{sa} = V_i^1 \oplus V_i^2$, where $V_i^1$
denotes the hyperplane of trace zero Hermitian matrices, and $V_i^2 = \mathbb{R}\, I$.
A subspace $E \subset B^{sa}(H)$ is invariant under $G$ if and only if it can be decomposed
as a direct sum of subspaces of the form
\[
V_1^{\alpha_1} \otimes \cdots \otimes V_k^{\alpha_k}
\]
for some choice $(\alpha_1, \dots, \alpha_k) \in \{1, 2\}^k$.

2.3. Superoperators and quantum channels

We now turn our attention to maps acting between spaces of operators, whence
the name superoperators. Other terms that will be used to describe these objects
are quantum maps and quantum operations. The crucial observation is that with
any such map one can naturally associate usual operators acting on larger Hilbert
spaces.

2.3.1. The Choi and Jamiołkowski isomorphisms. As usual, let $H_1$ and
$H_2$ denote complex (finite-dimensional) Hilbert spaces. Recall (see Sections 0.4 and
0.8) the canonical isomorphisms $(H_1 \otimes H_2)^* \leftrightarrow H_1^* \otimes H_2^*$ and
\[
H_1^* \otimes H_2 \leftrightarrow B(H_1, H_2). \tag{2.29}
\]
It follows that there is a canonical isomorphism
\[
B(H_1, H_2)^* \leftrightarrow B(H_2, H_1).
\]
This isomorphism can be seen more concretely via trace duality: a map
$S \in B(H_2, H_1)$ is identified with the linear form on $B(H_1, H_2)$ defined by $T \mapsto \operatorname{Tr} ST$.
By iterating (2.29), we deduce that there is a canonical isomorphism
\[
J : B(B(H_1), B(H_2)) \longrightarrow B(H_2 \otimes H_1)
\]
(both spaces being canonically isomorphic to $H_1 \otimes H_1^* \otimes H_2 \otimes H_2^*$), which is called
the Jamiołkowski isomorphism. A concrete representation of the Jamiołkowski
isomorphism is as follows: fix any basis $(e_i)$ in $H_1$ and denote by $E_{ij}$ the operator
$|e_i\rangle\langle e_j| \in B(H_1)$. Then $J$ is described as
\[
J : B(B(H_1), B(H_2)) \longrightarrow B(H_2 \otimes H_1), \qquad \Phi \longmapsto \sum_{i,j} \Phi(E_{ij}) \otimes E_{ji}. \tag{2.30}
\]
It turns out that there is another related isomorphism, called the Choi isomorphism,
which is often more useful. Once a basis in $H_1$ is fixed, the Choi isomorphism is
the $\mathbb{C}$-linear bijective map
\[
C : B(B(H_1), B(H_2)) \longrightarrow B(H_2 \otimes H_1), \qquad \Phi \longmapsto \sum_{i,j} \Phi(E_{ij}) \otimes E_{ij}. \tag{2.31}
\]
We call $C(\Phi)$ the Choi matrix of $\Phi$. Note that the Choi isomorphism is
basis-dependent, whereas the Jamiołkowski isomorphism is not. The relation between
the isomorphisms $J$ and $C$ is given by the partial transposition: if $\Gamma$ denotes the
partial transposition on $H_2 \otimes H_1$ with respect to $H_1$, then $C = \Gamma \circ J$.
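Formulas (2.30) and (2.31) translate directly into code. The sketch below (NumPy assumed; all function names and the example map $\Phi(X) = AXA^\dagger$ are illustrative choices) builds both matrices in the canonical basis and checks the relation $C = \Gamma \circ J$.

```python
import numpy as np

def unit(d, i, j):
    """Matrix unit E_ij = |e_i><e_j| in the canonical basis of C^d."""
    e = np.zeros((d, d), dtype=complex)
    e[i, j] = 1.0
    return e

def choi(phi, d1):
    """C(Phi) = sum_ij Phi(E_ij) (x) E_ij, an operator on H2 (x) H1."""
    return sum(np.kron(phi(unit(d1, i, j)), unit(d1, i, j))
               for i in range(d1) for j in range(d1))

def jamiolkowski(phi, d1):
    """J(Phi) = sum_ij Phi(E_ij) (x) E_ji."""
    return sum(np.kron(phi(unit(d1, i, j)), unit(d1, j, i))
               for i in range(d1) for j in range(d1))

def transpose_second(m, d2, d1):
    """Partial transposition on H2 (x) H1 with respect to the factor H1."""
    t = m.reshape(d2, d1, d2, d1)
    return t.transpose(0, 3, 2, 1).reshape(d2 * d1, d2 * d1)

# Illustrative map Phi(X) = A X A^dagger with a random A: C^2 -> C^3
rng = np.random.default_rng(1)
a = rng.normal(size=(3, 2)) + 1j * rng.normal(size=(3, 2))
phi = lambda x: a @ x @ a.conj().T

c_mat = choi(phi, 2)
j_mat = jamiolkowski(phi, 2)
print(np.allclose(c_mat, transpose_second(j_mat, 3, 2)))   # C = Gamma o J
```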

Here is a simple lemma which identifies the elements in $B(B(H_1), B(H_2))$ that
correspond to rank 1 operators under the Choi isomorphism.

Lemma 2.20. Given $A, B \in B(H_1, H_2)$, consider the map $\Phi : B(H_1) \to B(H_2)$
defined by
\[
\Phi(X) = A X B^\dagger
\]
for $X \in B(H_1)$. Then $C(\Phi) = |a\rangle\langle b|$, where $a = \mathrm{vec}(A)$ and $b = \mathrm{vec}(B)$ are the
vectors in $H_2 \otimes H_1$ associated to the operators $A$ and $B$ (see Section 0.8). Note
also that $A$ has rank 1 if and only if $a$ is a product vector.
Proof. Both sides are linear in $A$ and antilinear in $B$, so it is enough to consider
$A = |\psi\rangle\langle e_j|$ and $B = |\chi\rangle\langle e_i|$ for some $\psi, \chi \in H_2$ and some basis vectors $e_i, e_j \in H_1$.
A simple computation shows that then $C(\Phi) = |\psi\rangle\langle\chi| \otimes E_{ji}$, while $a = \psi \otimes e_j$ and
$b = \chi \otimes e_i$, and the Lemma follows. □
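Lemma 2.20 can be checked numerically. The sketch below assumes NumPy and assumes the convention $\mathrm{vec}(|\psi\rangle\langle e_j|) = \psi \otimes e_j$ used in the proof, which in the canonical basis corresponds to row-major flattening of the matrix; the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(2)
d1, d2 = 2, 3
a_op = rng.normal(size=(d2, d1)) + 1j * rng.normal(size=(d2, d1))
b_op = rng.normal(size=(d2, d1)) + 1j * rng.normal(size=(d2, d1))

# Choi matrix of Phi(X) = A X B^dagger, built term by term from (2.31)
choi = np.zeros((d2 * d1, d2 * d1), dtype=complex)
for i in range(d1):
    for j in range(d1):
        e_ij = np.zeros((d1, d1))
        e_ij[i, j] = 1.0
        choi += np.kron(a_op @ e_ij @ b_op.conj().T, e_ij)

# Assumed convention: vec is row-major flattening, so that vec(|psi><e_j|) = psi (x) e_j
a_vec = a_op.reshape(-1)
b_vec = b_op.reshape(-1)
print(np.allclose(choi, np.outer(a_vec, b_vec.conj())))   # C(Phi) = |a><b|
```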
Finally, let us mention a connection with the notion of realignment defined in
Exercise 2.24. If $\Phi : B(\mathbb{C}^{d_1}) \to B(\mathbb{C}^{d_2})$ is a superoperator, the matrix of $\Phi$ with
respect to the bases $(E_{ij})_{1 \leq i,j \leq d_1}$ and $(E_{kl})_{1 \leq k,l \leq d_2}$ is given by the realigned Choi
matrix $C(\Phi)^R$.

2.3.2. Positive and completely positive maps. A map $\Phi : B(H_1) \to B(H_2)$
is called self-adjointness-preserving if $\Phi(B^{sa}(H_1)) \subset B^{sa}(H_2)$. It is easily
checked that the following are equivalent:
(1) $\Phi$ is self-adjointness-preserving,
(2) $\Phi(X^\dagger) = (\Phi(X))^\dagger$ for any $X \in B(H_1)$,
(3) $J(\Phi) \in B^{sa}(H_2 \otimes H_1)$,
(4) $C(\Phi) \in B^{sa}(H_2 \otimes H_1)$.

An elegant way to rewrite the definition (2.31) of Choi’s matrix is as follows.
\[
C(\Phi) = \bigl(\Phi \otimes \mathrm{Id}_{B(H_1)}\bigr)(|\chi\rangle\langle\chi|), \tag{2.32}
\]
where $\chi = \sum_i e_i \otimes e_i \in H_1 \otimes H_1$ is (a multiple of) a maximally entangled vector.
(Recall that we fixed a basis $(e_i)$ in $H_1$ when defining the Choi isomorphism.) We
also note that there is a one-to-one correspondence between
(a) self-adjointness-preserving $\mathbb{C}$-linear maps $\Phi : B(H_1) \to B(H_2)$ and
(b) $\mathbb{R}$-linear maps $\Psi : B^{sa}(H_1) \to B^{sa}(H_2)$.
The correspondence is straightforward: $\Psi$ is obtained from $\Phi$ by restriction, whereas
$\Phi$ is obtained from $\Psi$ by complexification (see Section 0.5).
In the sequel we will occasionally refer to maps of the form $\Phi \otimes \mathrm{Id}_{B(H_1)}$ as
extensions of $\Phi$ (not to be confused with $k$-extensions of states defined in
Section 2.2.5.1). As an example, the partial transposition $\Gamma$ is an extension of the
transposition $T$.
Throughout this section, we consider a self-adjointness-preserving linear map
$\Phi : B(H_1) \to B(H_2)$. The adjoint of $\Phi$ is the unique map $\Phi^* : B(H_2) \to B(H_1)$
such that
\[
\operatorname{Tr}(X \Phi(Y)) = \operatorname{Tr}(\Phi^*(X) Y)
\]
for any $X \in B(H_2)$ and $Y \in B(H_1)$. Note that $\Phi^*$ is automatically
self-adjointness-preserving if $\Phi$ is.
The map $\Phi$ is said to be positivity preserving (shortened to positive when this
does not lead to ambiguity) if the image of every positive operator is a positive
operator. The map $\Phi$ is said to be $n$-positive if
$\Phi \otimes \mathrm{Id} : B^{sa}(H_1 \otimes \mathbb{C}^n) \to B^{sa}(H_2 \otimes \mathbb{C}^n)$ is positive. (Note that $n$-positivity formally implies $k$-positivity for any $k < n$.)
Finally, the map $\Phi$ is said to be completely positive if it is $n$-positive for every integer
$n$. (However, only $n = \min(\dim H_1, \dim H_2)$ needs to be checked, see Exercise
2.28.) We denote by $CP(H_1, H_2)$ the set of completely positive maps from $B(H_1)$
to $B(H_2)$. It is immediate from the definition that $CP(H_1, H_2)$ is a convex cone;
more about this aspect of the theory in Section 2.4.
The transposition is an example of a map which is positive but not 2-positive;
this can be seen, e.g., from (2.24) in Section 2.2.6 or from Exercise 2.32. Here is an
important structure theorem concerning completely positive maps.

Theorem 2.21 (Choi’s theorem). Let $\Phi : B(H_1) \to B(H_2)$ be
self-adjointness-preserving. The following are equivalent:
(1) the map $\Phi$ is completely positive,
(2) the Choi matrix $C(\Phi)$ is positive semi-definite,
(3) there exist finitely many operators $A_1, \dots, A_N \in B(H_1, H_2)$ such that, for
any $X \in B(H_1)$,
\[
\Phi(X) = \sum_{i=1}^N A_i X A_i^\dagger. \tag{2.33}
\]
A decomposition of $\Phi$ in the form (2.33) is called a Kraus decomposition of $\Phi$.
The smallest integer $N$ such that a Kraus decomposition is possible is called the
Kraus rank of $\Phi$. As will be clear from the proof, the Kraus rank of $\Phi$ is the same
as the rank of $C(\Phi)$ in the usual (linear algebra) sense. In particular, it will follow
that the Kraus rank of $\Phi : B(H_1) \to B(H_2)$ is at most $\dim H_1 \dim H_2$.

Proof. It is easily checked that (3) implies (1). The implication (1) $\Rightarrow$ (2)
follows from the representation (2.32) of the Choi matrix. We now prove (2) $\Rightarrow$ (3).
By the spectral theorem, there exist vectors $a_i \in H_2 \otimes H_1$ such that
\[
C(\Phi) = \sum_i |a_i\rangle\langle a_i|. \tag{2.34}
\]
By Lemma 2.20, $|a_i\rangle\langle a_i|$ is the Choi matrix of the map $X \mapsto A_i X A_i^\dagger$, where
$A_i \in B(H_1, H_2)$ is associated to $a_i$ via the relation $a_i = \mathrm{vec}(A_i)$. A representation of
type (3) follows now from the linearity of the Choi isomorphism. □
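The step (2) ⇒ (3) of the proof is constructive. The sketch below (NumPy assumed; the function name, the tolerance, and the example are illustrative, and vec is again taken to be row-major flattening) extracts Kraus operators from the spectral decomposition of a positive semi-definite Choi matrix.

```python
import numpy as np

def kraus_from_choi(c, d1, d2, tol=1e-12):
    """Kraus operators A_i: C^d1 -> C^d2 from a positive semi-definite Choi matrix."""
    vals, vecs = np.linalg.eigh(c)
    # a_i = sqrt(eigenvalue) * eigenvector; undo the row-major vec to recover A_i
    return [np.sqrt(v) * vecs[:, k].reshape(d2, d1)
            for k, v in enumerate(vals) if v > tol]

# Example: the map X -> diag(X) on M_2, whose Choi matrix is sum_i E_ii (x) E_ii
d = 2
c = np.zeros((d * d, d * d))
for i in range(d):
    c[i * d + i, i * d + i] = 1.0

kraus = kraus_from_choi(c, d, d)
x = np.array([[1.0, 2.0], [3.0, 4.0]])
out = sum(a @ x @ a.conj().T for a in kraus)

print(len(kraus), np.linalg.matrix_rank(c))   # 2 2: Kraus rank equals rank of C(Phi)
print(out)                                    # the diagonal part of x
```

Kraus operators obtained this way are not unique: any other decomposition of $C(\Phi)$ as in (2.34) yields an equally valid family.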
There is a simple relation between Kraus decompositions of a completely
positive map and of its adjoint: if $\Phi$ is given by (2.33), then for any $Y \in B(H_2)$,
\[
\Phi^*(Y) = \sum_{i=1}^N A_i^\dagger Y A_i. \tag{2.35}
\]
It is clear from the above analysis that $\Phi^*$ is completely positive if and only if
$\Phi$ is. It is also readily checked that $\Phi^*$ is positivity-preserving if and only if $\Phi$ is;
this and related properties are explored in Exercises 2.25–2.33, and discussed in a
more general setting in Section 2.4.
Exercise 2.25. Let $\Phi : B(H_1) \to B(H_2)$ be self-adjointness-preserving. Show
that $\Phi^*$ is positive if and only if $\Phi$ is positive, and that for any $n$, $\Phi^*$ is $n$-positive
if and only if $\Phi$ is $n$-positive.
Exercise 2.26. Show that if $\Phi$ and $\Psi$ are completely positive, so are $\Phi \otimes \Psi$
and $\Phi \circ \Psi$ (the composition, assuming it is defined).
Exercise 2.27. Show that any self-adjointness-preserving map $\Phi : B(H_1) \to B(H_2)$
is the difference of two completely positive maps.
Exercise 2.28. Show that the assertions of Theorem 2.21 are also equivalent
to the fact that $\Phi$ is $n$-positive, with $n = \min(\dim H_1, \dim H_2)$.
Exercise 2.29. Let $k < n$ be integers. Show that the map $\Phi : M_n \to M_n$
defined by $\Phi(X) = k \operatorname{Tr}(X)\, I - X$ is $k$-positive but not $(k + 1)$-positive.

2.3.3. Quantum channels and Stinespring representation. Consider a
self-adjointness-preserving map $\Phi : B(H_1) \to B(H_2)$. We say that $\Phi$ is unital
if $\Phi(I_{H_1}) = I_{H_2}$. We say that $\Phi$ is trace-preserving if $\operatorname{Tr} \Phi(X) = \operatorname{Tr} X$ for any
$X \in B(H_1)$. It is easily checked that these properties are dual to each other:
\[
\Phi \text{ is unital} \iff \Phi^* \text{ is trace-preserving}. \tag{2.36}
\]
We now introduce a fundamental concept in quantum information theory:

Definition 2.22. A quantum channel $\Phi : B(H_1) \to B(H_2)$ is a completely
positive and trace-preserving map.

The reasons why we require quantum channels to be positivity- and
trace-preserving are clear: since $\Phi$ is supposed to represent some physically possible
process, we want states to be mapped to states. (The motivation behind the
complete positivity condition is more subtle; we attempt to explain it in Section
3.5.) A channel that is additionally unital (i.e., if both $\Phi$ and $\Phi^*$ are channels)
is called doubly stochastic or bistochastic. Clearly, such channels exist only if
$\dim H_1 = \dim H_2$. (However, see Proposition 2.32 for a notion that makes sense
also when $\dim H_1 \neq \dim H_2$.)
Remark 2.23. It follows immediately from the relation (2.33) that the condition
$\sum_{i=1}^N A_i A_i^\dagger = I_{H_2}$ is equivalent to $\Phi(I_{H_1}) = I_{H_2}$, i.e., to $\Phi$ being unital. It
is less obvious, but easily checked, that $\sum_{i=1}^N A_i^\dagger A_i = I_{H_1}$ is equivalent to $\Phi$ being
trace-preserving. Indeed, if the condition holds, then, for any $\xi \in H_1$,
\[
\operatorname{Tr}(|\xi\rangle\langle\xi|) = \operatorname{Tr}\Bigl(\sum_{i=1}^N A_i^\dagger A_i |\xi\rangle\langle\xi|\Bigr) = \operatorname{Tr}\Bigl(\sum_{i=1}^N A_i |\xi\rangle\langle\xi| A_i^\dagger\Bigr).
\]
In other words, $\operatorname{Tr} \Phi(X) = \operatorname{Tr} X$ if $X = |\xi\rangle\langle\xi|$ and hence, by linearity, for any
$X \in B^{sa}(H_1)$. Furthermore, the argument is clearly reversible, so we have equivalence.
We now state the Stinespring representation theorem, which plays a fundamental
role in understanding the structure of quantum maps.
Theorem 2.24 (Stinespring theorem). Let $\Phi : B(H_1) \to B(H_2)$ be a completely
positive map. Then there exist a finite-dimensional Hilbert space $H_3$ and an
embedding $V : H_1 \to H_2 \otimes H_3$ such that, for any $X \in B(H_1)$,
\[
\Phi(X) = \operatorname{Tr}_{H_3} V X V^\dagger. \tag{2.37}
\]
Moreover, $\Phi$ is a quantum channel if and only if $V$ is an isometry. Conversely, for
any isometric embedding $V$, the map $\Phi$ defined via (2.37) is a quantum channel.

The proof shows that the smallest possible dimension for $H_3$ equals the Kraus
rank of $\Phi$; in particular we can require that $\dim(H_3) \leq \dim(H_1) \dim(H_2)$.

Proof. Start from a Kraus decomposition (2.33) for $\Phi$. Set $H_3 := \mathbb{C}^N$, and
let $(|i\rangle)_{1 \leq i \leq N}$ be its canonical basis. Define $V$ by the formula
\[
V|\psi\rangle = \sum_{i=1}^N A_i|\psi\rangle \otimes |i\rangle \quad \text{for } \psi \in H_1. \tag{2.38}
\]
We claim that, for any $X \in B(H_1)$,
\[
V X V^\dagger = \sum_{i,j=1}^N A_i X A_j^\dagger \otimes |i\rangle\langle j|.
\]
As in Remark 2.23, this follows by linearity from the special case $X = |\psi\rangle\langle\psi|$. This
implies the identity (2.37). We also see from (2.38) that $V^\dagger V = \sum_{i=1}^N A_i^\dagger A_i$. By
Remark 2.23 it follows that $\Phi$ is a quantum channel if and only if $V^\dagger V = I_{H_1}$, which
is equivalent to $V$ being an isometry. Finally, the last assertion is straightforward:
complete positivity follows from (the easy direction of) Choi’s Theorem 2.21 and
the trace preserving property is immediate. □
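The construction (2.38) can be carried out explicitly. The following sketch (NumPy assumed; the helper names and the example Kraus operators are illustrative) assembles $V$ from a Kraus decomposition, then checks that $V$ is an isometry and that (2.37) holds.

```python
import numpy as np

def stinespring(kraus):
    """V = sum_i A_i (x) |i>, a (d2*N) x d1 matrix, following (2.38)."""
    n = len(kraus)
    return sum(np.kron(a, np.eye(n)[:, [i]]) for i, a in enumerate(kraus))

def trace_out_second(m, d2, n):
    """Partial trace over the second factor of an operator on C^d2 (x) C^n."""
    return np.trace(m.reshape(d2, n, d2, n), axis1=1, axis2=3)

# Illustrative Kraus operators on C^2: identity with weight 1-p, sigma_z with weight p
p = 0.3
kraus = [np.sqrt(1 - p) * np.eye(2), np.sqrt(p) * np.diag([1.0, -1.0])]

v = stinespring(kraus)
x = np.array([[1.0, 2.0], [3.0, 4.0]])
lhs = trace_out_second(v @ x @ v.conj().T, 2, len(kraus))
rhs = sum(a @ x @ a.conj().T for a in kraus)

print(np.allclose(v.conj().T @ v, np.eye(2)))   # V is an isometry
print(np.allclose(lhs, rhs))                    # identity (2.37)
```

Here the ancilla dimension equals the number of Kraus operators supplied, which may exceed the Kraus rank if the decomposition is not minimal.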
When $H_1 = H_2$, the Stinespring theorem can be reformulated as follows: any
quantum channel can be lifted to a unitary transformation using some ancillary
Hilbert space.

Theorem 2.25. Let $\Phi : B(H) \to B(H)$ be a quantum channel. Then there
exist a finite-dimensional Hilbert space $H'$, a unit vector $\psi \in H'$ and a unitary
transformation $U$ on $H \otimes H'$ such that, for any $X$ in $B(H)$,
\[
\Phi(X) = \operatorname{Tr}_{H'} U (X \otimes |\psi\rangle\langle\psi|) U^\dagger. \tag{2.39}
\]
Proof. Let $V : H \to H \otimes H'$ be given by Theorem 2.24 (with $H' = H_3$).
Choose any unit vector $\psi \in H'$. The map $\varphi \otimes \psi \mapsto V(\varphi)$ (defined on the subspace
$H \otimes \psi \subset H \otimes H'$) is an isometry, and therefore can be extended to a unitary $U$ on
$H \otimes H'$. One checks easily that (2.39) holds. □
We mention in passing that a popular way to quantify how different two
quantum channels are is the diamond norm. For a self-adjointness-preserving map
$\Phi : B(H_1) \to B(H_2)$, define
\[
\|\Phi\|_\diamond = \sup_{k \in \mathbb{N}} \ \sup_{\rho \in D(H_1 \otimes \mathbb{C}^k)} \|(\Phi \otimes \mathrm{Id}_{B(\mathbb{C}^k)})(\rho)\|_1.
\]
Exercise 2.30. Show that any positive unital map $\Phi : M_m^{sa} \to M_n^{sa}$ is a
contraction with respect to the operator norm $\|\cdot\|_\infty$.
Exercise 2.31. Show that any positive trace-preserving map $\Phi : M_m^{sa} \to M_n^{sa}$
is a contraction with respect to the trace norm $\|\cdot\|_1$ (cf. Proposition 8.4).
Exercise 2.32. (i) Let $\Phi : M_m^{sa} \to M_n^{sa}$ be a trace preserving map. Show that
$\Phi$ is $k$-positive if and only if $\Phi \otimes \mathrm{Id} : B^{sa}(\mathbb{C}^m \otimes \mathbb{C}^k) \to B^{sa}(\mathbb{C}^n \otimes \mathbb{C}^k)$ is a contraction
with respect to the trace norm $\|\cdot\|_1$. (ii) Let $T : M_n \to M_n$ be the transposition
map. Calculate the norm of $T \otimes \mathrm{Id}$ considered as a map on $\bigl(B^{sa}(\mathbb{C}^n \otimes \mathbb{C}^2), \|\cdot\|_1\bigr)$
and give an example of an operator on which that norm is attained. (iii) Same
question for the operator norm $\|\cdot\|_\infty$.
Exercise 2.33. Show that any positive, unital, and trace-preserving map
$\Phi : M_n^{sa} \to M_n^{sa}$ is rank non-decreasing, i.e., $\operatorname{rank} \Phi(\rho) \geq \operatorname{rank} \rho$ for any $\rho \in D(\mathbb{C}^n)$.

2.3.4. Some examples of channels. In this section we list some important
classes and examples of quantum channels or, more generally, of superoperators.
(Sometimes it is convenient to drop the trace-preserving constraint.)
2.3.4.1. Unitary channels. Unitary channels are the completely positive
isometries of the set of states identified in Theorem 2.4, i.e., the maps that are of the
form $\rho \mapsto U \rho U^\dagger$ for some $U \in U(d)$.
2.3.4.2. Mixed-unitary channels. A mixed-unitary channel $\Phi : B(\mathbb{C}^d) \to B(\mathbb{C}^d)$
is a channel which is a convex combination of unitary channels, i.e., is of the form
\[
\Phi(\rho) = \sum_{i=1}^N \lambda_i U_i \rho U_i^\dagger, \tag{2.40}
\]
where the $\lambda_i$ are nonnegative weights summing to 1 and $U_i \in U(d)$. Such channels are
automatically unital. A remarkable fact is that the converse is true when $d = 2$.

Proposition 2.26 (see Exercise 2.34). Let $\Phi : B(\mathbb{C}^2) \to B(\mathbb{C}^2)$ be a unital
quantum channel. Then $\Phi$ is mixed-unitary.

Exercise 2.34 (Proof of Proposition 2.26). (i) Argue that it is enough to prove
Proposition 2.26 for channels which are diagonal with respect to the basis of Pauli
matrices (2.2).
(ii) Given real numbers $a, b, c$, check that the superoperator
\[
\frac12 \bigl( |I\rangle\langle I| + a|\sigma_x\rangle\langle\sigma_x| + b|\sigma_y\rangle\langle\sigma_y| + c|\sigma_z\rangle\langle\sigma_z| \bigr)
\]
is completely positive if and only if $(a + b)^2 \leq (1 + c)^2$ and $(a - b)^2 \leq (1 - c)^2$.
(iii) Rewrite the conditions from part (ii) as a system of four linear inequalities and
conclude the proof.
Exercise 2.35. Show that any mixed-unitary channel $\Phi : B(\mathbb{C}^d) \to B(\mathbb{C}^d)$ can
be expressed as in (2.40) with $N \leq d^4 - 2d^2 + 2$. Note that the argument from
Exercise 2.34 gives $N \leq 4$ (which is optimal) for $d = 2$.
2.3.4.3. Depolarizing and dephasing channels. The completely depolarizing (or
completely randomizing) channel is the channel $R : B(\mathbb{C}^d) \to B(\mathbb{C}^d)$ defined as
$R(X) = \operatorname{Tr} X \, \frac{I}{d}$. It maps every state to the maximally mixed state. The completely
dephasing channel is the channel $D : B(\mathbb{C}^d) \to B(\mathbb{C}^d)$ that maps any operator to
its diagonal part (with respect to a fixed basis).

Exercise 2.36 (Depolarizing channels and isotropic states). The family of
depolarizing channels is defined as $R_\lambda = \lambda\, \mathrm{Id} + (1 - \lambda) R$ for $-\frac{1}{d^2 - 1} \leq \lambda \leq 1$. Check
that the Choi matrix of $R_\lambda$ is $d \rho_\lambda$, where $\rho_\lambda$ is the isotropic state defined in (2.20).
Exercise 2.37. Show that the completely depolarizing and completely
dephasing channels are mixed-unitaries (see also Exercise 8.6).
2.3.4.4. POVMs, quantum-classical channels. A POVM (Positive
Operator-Valued Measure) on $H$ is a finite family of positive operators $(M_i)_{1 \leq i \leq N}$ with the
property that $\sum_i M_i = I$. Given a POVM, we can associate to it a quantum channel
(called sometimes a quantum-classical or q-c channel) $\Phi : B(H) \to B(\mathbb{C}^N)$ defined
as
\[
\Phi(\rho) = \sum_{i=1}^N |i\rangle\langle i| \operatorname{Tr}(M_i \rho). \tag{2.41}
\]
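Formula (2.41) is straightforward to implement. The sketch below (NumPy assumed; the function name and the particular two-outcome POVM are illustrative choices) applies a q-c channel to a qubit state and returns the diagonal matrix of outcome probabilities.

```python
import numpy as np

def qc_channel(povm, rho):
    """Phi(rho) = sum_i |i><i| Tr(M_i rho), following (2.41)."""
    return np.diag([np.trace(m @ rho) for m in povm])

# Illustrative POVM on C^2: a noisy measurement in the canonical basis
m0 = np.diag([0.9, 0.2])
m1 = np.eye(2) - m0
rho = np.array([[0.5, 0.5], [0.5, 0.5]])   # the pure state |+><+|

out = qc_channel([m0, m1], rho)
print(out)             # diag(0.55, 0.45): the outcome probabilities
print(np.trace(out))   # the probabilities sum to 1 because M_0 + M_1 = I
```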

The dual concept is the notion of a classical-quantum or c-q channel
$\Psi : B(\mathbb{C}^N) \to B(H)$. This is a channel of the form
\[
\Psi(\rho) = \sum_{i=1}^N \rho_i \langle i|\rho|i\rangle,
\]
where $(\rho_i)$ are states on $H$.
Exercise 2.38 (Duality between c-q and q-c channels). Let $\Phi$ be a q-c channel
of the form (2.41). Under what condition on $(M_i)$ is $\Phi$ unital? When this condition
is satisfied, show that the dual map $\Phi^*$ is a c-q channel.

2.3.4.5. Entanglement-breaking maps. A map $\Phi \in CP(H^{in}, H^{out})$ is said to
be entanglement-breaking if, for any integer $d$ and for any positive operator
$X \in B^{sa}(H^{in} \otimes \mathbb{C}^d)$, the operator $(\Phi \otimes \mathrm{Id}_{M_d})(X)$ belongs to the cone $\mathrm{SEP}(H^{out} \otimes \mathbb{C}^d)$
of separable operators. Here are equivalent descriptions of entanglement-breaking
maps:

Lemma 2.27 (Characterization of entanglement-breaking maps, see Exercise
2.39). Let $\Phi : B(H^{in}) \to B(H^{out})$ be completely positive. The following are
equivalent:
(i) $\Phi$ is entanglement-breaking,
(ii) the Choi matrix $C(\Phi)$ lies in the separable cone $\mathrm{SEP}(H^{out} \otimes H^{in})$,
(iii) there is a Kraus decomposition of $\Phi$ (2.33) where all the Kraus operators $A_i$
have rank 1.

Entanglement-breaking quantum channels are sometimes called q-c-q channels.
This reflects the fact that a quantum channel $\Phi$ is entanglement-breaking if and
only if it can be written as the composition of a q-c channel with a c-q channel.
Exercise 2.39. Prove Lemma 2.27.
Exercise 2.40 (Once broken, always broken). Let $\Phi, \Psi$ be two completely
positive maps, with one of them being entanglement-breaking. Show that
$(\Phi \otimes \Psi)(X) \in \mathrm{SEP}$ for any positive operator $X$.
2.3.4.6. PPT-inducing maps. A map $\Phi \in CP(H^{in}, H^{out})$ is said to be
PPT-inducing if for any integer $d$ and any positive operator $X \in B^{sa}(H^{in} \otimes \mathbb{C}^d)$, the
operator $(\Phi \otimes \mathrm{Id}_{M_d})(X)$ has positive partial transpose.
Lemma 2.28 (Characterization of PPT-inducing maps, see Exercise 2.41). A
completely positive map $\Phi$ is PPT-inducing if and only if $J(\Phi) = C(\Phi)^\Gamma$ is positive
semi-definite.
Exercise 2.41. Prove Lemma 2.28.

2.3.4.7. Schur channels. Given matrices $A, B \in M_d$, their Schur product $A \odot B$
is defined as the entrywise product: $(A \odot B)_{ij} = A_{ij} B_{ij}$. Given $A \in M_d$, the map
$\Theta_A : M_d \to M_d$ defined as $\Theta_A(X) = A \odot X$ is called a Schur multiplier. When $A$
is positive with $A_{ii} = 1$ for all $i$, the map $\Theta_A$ is a quantum channel called a Schur
channel.
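As an illustration, the Choi matrix of a Schur multiplier can be written down directly: $C(\Theta_A) = \sum_{i,j} A_{ij}\, E_{ij} \otimes E_{ij}$. The sketch below (NumPy assumed; the function name and the sample matrix $A$ are our own choices) checks positivity of the Choi matrix and trace preservation for one positive matrix with unit diagonal.

```python
import numpy as np

def schur_choi(a):
    """Choi matrix of Theta_A (entrywise product with A): sum_ij A_ij E_ij (x) E_ij."""
    d = a.shape[0]
    c = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            c[i * d + i, j * d + j] = a[i, j]   # E_ij (x) E_ij = |ii><jj|
    return c

# A positive definite matrix with unit diagonal (illustrative choice)
a = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.5],
              [0.2, 0.5, 1.0]])

c = schur_choi(a)
x = np.arange(9.0).reshape(3, 3)

print(np.linalg.eigvalsh(c).min() >= -1e-12)       # Choi matrix is PSD: complete positivity
print(np.isclose(np.trace(a * x), np.trace(x)))    # trace is preserved since A_ii = 1
```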
Exercise 2.42 (Positivity of Schur multipliers). Let $A \in M_d$. Show that the
following are equivalent:
(i) $A$ is positive semi-definite,
(ii) $\Theta_A$ is positive,
(iii) $\Theta_A$ is completely positive.
Exercise 2.43 (Kraus decompositions of Schur channels). Let $\Phi : M_d \to M_d$
be a quantum channel. Show that $\Phi$ is a Schur channel if and only if it admits a
Kraus decomposition (2.33) where $A_i$ are diagonal operators.

2.3.4.8. Separable and LOCC superoperators. We now assume that $H^{in}$ and
$H^{out}$ are bipartite spaces, say $H^{in} = H_1^{in} \otimes H_2^{in}$ and $H^{out} = H_1^{out} \otimes H_2^{out}$. A
map $\Phi \in CP(H^{in}, H^{out})$ is called separable if it admits a Kraus decomposition
involving product operators, i.e., if there exist operators $A_i^{(1)} : H_1^{in} \to H_1^{out}$ and
$A_i^{(2)} : H_2^{in} \to H_2^{out}$ such that for any $X \in B(H^{in})$,
\[
\Phi(X) = \sum_{i=1}^N \bigl(A_i^{(1)} \otimes A_i^{(2)}\bigr) X \bigl(A_i^{(1)} \otimes A_i^{(2)}\bigr)^\dagger.
\]
A widely used class is the class of LOCC channels (LOCC standing for “Local
Operations and Classical Communication”). Without defining this class, we simply
note that any LOCC channel is separable, and that any convex combination of
product channels (of the form $\Phi_1 \otimes \Phi_2$) is an LOCC channel. (Note that these
notions are not all equivalent, see Exercise 2.44.) More properties of this class will
be presented in Section 12.2.

Exercise 2.44. Consider the following operators on $\mathbb{C}^2 \otimes \mathbb{C}^2$:
\[
A_1 = |0\rangle\langle 0| \otimes |0\rangle\langle 0|, \quad A_2 = |0\rangle\langle 0| \otimes |0\rangle\langle 1|, \quad A_3 = |1\rangle\langle 1| \otimes |1\rangle\langle 1|, \quad A_4 = |1\rangle\langle 1| \otimes |1\rangle\langle 0|.
\]
Show that the channel on $B(\mathbb{C}^2 \otimes \mathbb{C}^2)$ defined as $\Phi(X) = \sum_{i=1}^4 A_i X A_i^\dagger$ is a separable
channel which cannot be written as a convex combination of product channels.

2.3.4.9. Direct sums. Let $\Phi_1 : B(H_1^{in}) \to B(H_1^{out})$ and $\Phi_2 : B(H_2^{in}) \to B(H_2^{out})$
be two quantum channels. Their direct sum
\[
\Phi_1 \oplus \Phi_2 : B(H_1^{in} \oplus H_2^{in}) \to B(H_1^{out} \oplus H_2^{out})
\]
is the quantum channel defined by its action on block operators as
\[
(\Phi_1 \oplus \Phi_2)\left( \begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \right) = \begin{bmatrix} \Phi_1(X_{11}) & 0 \\ 0 & \Phi_2(X_{22}) \end{bmatrix}. \tag{2.42}
\]
Exercise 2.45. Describe the Kraus operators of $\Phi_1 \oplus \Phi_2$ in terms of the Kraus
operators of $\Phi_1$ and $\Phi_2$.

2.4. Cones of QIT

In this section we will review some of the cones used commonly in quantum
information theory. We will distinguish between cones of operators and cones of
superoperators, and emphasize the distinction by using two different fonts: $\mathcal{C}$ denotes
a generic cone of operators and $\mathbf{C}$ a generic cone of superoperators.

2.4.1. Cones of operators. We start by describing some cones of operators
and by identifying their bases and their dual cones (Table 2.1). We work in a
Hilbert space $H$ and the corresponding space $B^{sa}(H)$ of self-adjoint operators. The
vector $e$ chosen to define the base in (1.22) is the maximally mixed state. Here
and in what follows, we assume that separability and the PPT property are defined
with respect to a fixed bipartition $H = H_1 \otimes H_2$. However, most considerations
extend to multipartite variants and settings allowing flexibility in the choice of the
partition. In order to lighten the notation, we often write PSD and SEP instead of
$\mathrm{PSD}(H)$ and $\mathrm{SEP}(H_1 \otimes H_2)$ unless this may cause ambiguity.
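Table 2.1. List of cones of operators. All cones live in $B^{\mathrm{sa}}(\mathcal{H})$, the space of self-adjoint operators on a bipartite Hilbert space $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ with dimension $n = \dim \mathcal{H}$. The base is taken with respect to the distinguished vector $e = \mathrm{I}/n$. The cones $\mathcal{C}$ are listed in decreasing order (with respect to inclusion) from top to bottom and, consequently, the dual cones $\mathcal{C}^*$ are in increasing order from top to bottom. Most inclusions/duality relations are straightforward and/or were pointed out earlier in this chapter; the remaining few are clarified in this subsection.

\begin{tabular}{llll}
Cone of operators & $\mathcal{C}$ & base $\mathcal{C}^b$ & dual cone $\mathcal{C}^*$ \\
\hline
Block-positive & $\mathcal{BP}$ & $\mathrm{BP}$ & $\mathcal{SEP}$ \\
Decomposable & $\mathrm{co}\text{-}\mathcal{PSD} + \mathcal{PSD}$ & $\operatorname{conv}(\mathrm{D} \cup \Gamma(\mathrm{D}))$ & $\mathcal{PPT}$ \\
Positive & $\mathcal{PSD}$ & $\mathrm{D}$ & $\mathcal{PSD}$ \\
Pos. partial transpose & $\mathcal{PPT}$ & $\mathrm{PPT}$ & $\mathrm{co}\text{-}\mathcal{PSD} + \mathcal{PSD}$ \\
Separable & $\mathcal{SEP}$ & $\mathrm{Sep}$ & $\mathcal{BP}$ \\
\end{tabular}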

In the same way that PSD is associated with its base D, the set of separable
states Sep gives rise to the separable cone SEP, and the set PPT of states with
positive partial transpose leads to the PPT cone. Another example is the cone
of k-entangled matrices (cf. Section 2.2.5). In general, whenever a definition of a
set of matrices involves linear matrix inequalities and a trace constraint, dropping
that constraint gives us a cone. When the original set of matrices is compact, the
resulting cone is pointed, with the hyperplane of trace zero matrices isolating 0 as
an exposed point (cf. Corollary 1.8). All the cones cataloged in this section have
this property and are in fact nondegenerate.

One more convenient concept is that of co-PSD matrices
\[
\mathrm{co}\text{-}\mathcal{PSD} := \Gamma(\mathcal{PSD}) = \{\rho \in \mathsf{M}_n^{\mathrm{sa}} : \rho^{\Gamma} \in \mathcal{PSD}\} \tag{2.43}
\]
where $\Gamma$ is the partial transpose defined in Section 2.2.6. It allows a compact description of the cone dual to $\mathcal{PPT}$: since $\mathcal{PPT} = \mathrm{co}\text{-}\mathcal{PSD} \cap \mathcal{PSD}$, it follows from (1.20) (see also Exercise 1.36) that
\[
\mathcal{PPT}^* = \mathrm{co}\text{-}\mathcal{PSD} + \mathcal{PSD}, \tag{2.44}
\]
the cone of decomposable matrices. Note that, except in trivial cases, this cone is strictly larger than $\mathcal{PSD}$ and so its base contains matrices that are not states.
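The last remark can be illustrated numerically. In the sketch below, the function `partial_transpose` is our own helper and transposes the first tensor factor; which factor the book's $\Gamma$ transposes is fixed in Section 2.2.6, and we treat our choice as an assumption (it does not affect membership in co-PSD). Taking $\rho$ to be the maximally entangled state on $\mathbb{C}^2 \otimes \mathbb{C}^2$, the matrix $\rho^\Gamma$ has unit trace and lies in co-PSD, hence in the decomposable cone, but it has a negative eigenvalue and is therefore not a state.

```python
import numpy as np

def partial_transpose(rho, m, n):
    """Transpose the first tensor factor of an operator on C^m (x) C^n."""
    return rho.reshape(m, n, m, n).transpose(2, 1, 0, 3).reshape(m * n, m * n)

# Maximally entangled state (|00> + |11>)/sqrt(2) on C^2 (x) C^2.
chi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho = np.outer(chi, chi.conj())

sigma = partial_transpose(rho, 2, 2)      # an element of co-PSD with unit trace
print(np.trace(sigma))                    # trace is preserved by the partial transpose
print(np.linalg.eigvalsh(sigma))          # one eigenvalue is negative
```

Since $\rho$ is positive semi-definite while $\rho^\Gamma$ is not, the same computation shows that $\rho$ is not PPT.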
To conclude the review of the standard cones, we will identify the cone $\mathcal{SEP}^*$. To that end, it is convenient to think of operators on a composite Hilbert space $\mathbb{C}^m \otimes \mathbb{C}^n$ as block matrices $M = (M_{jk})_{j,k=1}^{m}$, where $M_{jk} \in \mathsf{M}_n$ (see Section 0.7). Since the extreme rays of $\mathcal{SEP}$ are generated by pure separable states $|\xi \otimes \eta\rangle\langle \xi \otimes \eta|$ (see Section 2.2.3), we have
\begin{align}
M \in \mathcal{SEP}^* &\iff \forall \xi \in \mathbb{C}^m,\ \forall \eta \in \mathbb{C}^n, \quad \operatorname{Tr}\bigl( M\, |\xi \otimes \eta\rangle\langle \xi \otimes \eta| \bigr) \geq 0 \tag{2.45} \\
&\iff \forall \xi \in \mathbb{C}^m, \quad \sum_{j,k=1}^{m} \xi_j \overline{\xi_k}\, M_{jk} \in \mathcal{PSD}(\mathbb{C}^n). \tag{2.46}
\end{align}
The condition in (2.46) is usually referred to as $M = (M_{jk})$ being block-positive. (We note that the definition treats $m$ and $n$ symmetrically, even though this is not apparent in (2.46).) In other words, the dual to the cone of separable matrices is that of block-positive matrices, denoted by $\mathcal{BP}$. As a consequence, the polar of $\mathrm{Sep}$ can be identified: we obtain from Lemma 1.6 that
\[
\mathrm{Sep}^{\circ} = -d^2\, \mathrm{BP}, \tag{2.47}
\]
where $\mathrm{BP}$ denotes the set of block-positive matrices with unit trace and the minus sign stands for the point reflection with respect to the appropriately normalized identity matrix.
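Condition (2.46) lends itself to a quick numerical experiment. The sketch below uses our own helpers `block_form` and `compressed` (hypothetical names) and takes as test matrix the flip operator $F$ on $\mathbb{C}^d \otimes \mathbb{C}^d$, defined by $F(x \otimes y) = y \otimes x$. Random sampling of $\xi$ can only fail to find a violation of (2.46); it is evidence of block-positivity, not a proof. For the flip operator, block-positivity can be checked by hand, while $F$ itself has the eigenvalue $-1$.

```python
import numpy as np

def block_form(M, m, n):
    """Return the array B with B[j, k] equal to the n x n block M_jk of M."""
    return M.reshape(m, n, m, n).transpose(0, 2, 1, 3)

def compressed(M, xi, n):
    """Return sum_{j,k} xi_j conj(xi_k) M_jk, the operator appearing in (2.46)."""
    m = xi.shape[0]
    return np.einsum("j,k,jkab->ab", xi, xi.conj(), block_form(M, m, n))

d = 3
flip = np.zeros((d * d, d * d))
for j in range(d):
    for a in range(d):
        flip[j * d + a, a * d + j] = 1.0   # F (e_a (x) e_j) = e_j (x) e_a

# Sample random vectors xi and record the smallest eigenvalue of the operator in (2.46).
rng = np.random.default_rng(1)
min_eigs = []
for _ in range(200):
    xi = rng.normal(size=d) + 1j * rng.normal(size=d)
    min_eigs.append(np.linalg.eigvalsh(compressed(flip, xi, d)).min())

print(min(min_eigs))                      # nonnegative up to rounding
print(np.linalg.eigvalsh(flip).min())     # F is not positive semi-definite
```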

2.4.2. Cones of superoperators. We next turn our attention to the classes of superoperators considered in Section 2.3.2. We consider superoperators acting from $B^{\mathrm{sa}}(\mathcal{H})$ to $B^{\mathrm{sa}}(\mathcal{K})$ and denote the corresponding cones as $\boldsymbol{C}(\mathcal{H}, \mathcal{K})$, or as $\boldsymbol{C}(\mathcal{H})$ when $\mathcal{H} = \mathcal{K}$, or simply as $\boldsymbol{C}$ when there is no ambiguity. The cones we consider most frequently are gathered in Table 2.2. (See Exercise 2.48 for a discussion of identification and duality relations for $k$-positive superoperators and $k$-entangled states.)
In the language of cones, a positivity-preserving superoperator $\Phi : B^{\mathrm{sa}}(\mathcal{H}) \to B^{\mathrm{sa}}(\mathcal{K})$ may be defined via the condition $\Phi\bigl(\mathcal{PSD}(\mathcal{H})\bigr) \subset \mathcal{PSD}(\mathcal{K})$. It is readily seen that the set of positivity-preserving maps is itself a cone (which we will denote by $\boldsymbol{P}(\mathcal{H}, \mathcal{K})$) in the space $B\bigl(B^{\mathrm{sa}}(\mathcal{H}), B^{\mathrm{sa}}(\mathcal{K})\bigr)$.
As was noted in Section 2.3.2, $\Phi \in \boldsymbol{P}(\mathcal{H}, \mathcal{K})$ iff $\Phi^* \in \boldsymbol{P}(\mathcal{K}, \mathcal{H})$. As we shall see, it would be erroneous to take this to mean that $\boldsymbol{P}$ is self-dual. Instead, this is a special case of a very general elementary fact: If $V_1, V_2$ are vector spaces, if $\mathcal{C}_1 \subset V_1$, $\mathcal{C}_2 \subset V_2$ are closed convex cones, and if $\Phi : V_1 \to V_2$ is linear, then $\Phi(\mathcal{C}_1) \subset \mathcal{C}_2$ iff $\Phi^*(\mathcal{C}_2^*) \subset \mathcal{C}_1^*$.
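Table 2.2. Cones of superoperators. To each cone $\boldsymbol{C}$ from the first (double) column we associate a cone $\mathcal{C}$ which consists of Choi matrices of elements from $\boldsymbol{C}$. They are connected by the relation $\Phi \in \boldsymbol{C} \iff C(\Phi) \in \mathcal{C}$. We note that $\boldsymbol{C}$ is a subset of $B(B^{\mathrm{sa}}(\mathcal{H}), B^{\mathrm{sa}}(\mathcal{K}))$ while $\mathcal{C}$ is a subset of $B^{\mathrm{sa}}(\mathcal{K} \otimes \mathcal{H})$. The cones $\boldsymbol{C}$ and $\mathcal{C}$ are in decreasing order from top to bottom and the dual cones $\mathcal{C}^*$ and $\boldsymbol{C}^*$ are in increasing order from top to bottom.

\begin{tabular}{lllll}
Cone of superoperators & $\boldsymbol{C}$ & $\mathcal{C}$ & $\mathcal{C}^*$ & $\boldsymbol{C}^*$ \\
\hline
Positivity-preserving & $\boldsymbol{P}$ & $\mathcal{BP}$ & $\mathcal{SEP}$ & $\boldsymbol{EB}$ \\
Decomposable & $\boldsymbol{DEC}$ & $\mathrm{co}\text{-}\mathcal{PSD} + \mathcal{PSD}$ & $\mathcal{PPT}$ & $\boldsymbol{PPT}$ \\
Completely positive & $\boldsymbol{CP}$ & $\mathcal{PSD}$ & $\mathcal{PSD}$ & $\boldsymbol{CP}$ \\
PPT-inducing & $\boldsymbol{PPT}$ & $\mathcal{PPT}$ & $\mathrm{co}\text{-}\mathcal{PSD} + \mathcal{PSD}$ & $\boldsymbol{DEC}$ \\
Entanglement-breaking & $\boldsymbol{EB}$ & $\mathcal{SEP}$ & $\mathcal{BP}$ & $\boldsymbol{P}$ \\
\end{tabular}

The most important cone of superoperators is arguably that of completely positive maps, denoted by $\boldsymbol{CP}$. By Choi's Theorem 2.21, $\Phi \in \boldsymbol{CP}$ iff the Choi matrix $C(\Phi)$ is positive semi-definite. In other words, $\boldsymbol{CP}(\mathbb{C}^m, \mathbb{C}^n)$ is isomorphic to $\mathcal{PSD}(\mathbb{C}^n \otimes \mathbb{C}^m)$. This means that, with proper identifications (see Exercise 2.47), the cone $\boldsymbol{CP}$ is self-dual. Choi's correspondence $\Phi \mapsto C(\Phi)$ relates similarly the cone $\boldsymbol{EB}(\mathbb{C}^m, \mathbb{C}^n)$ of entanglement-breaking maps from $\mathsf{M}_m^{\mathrm{sa}}$ to $\mathsf{M}_n^{\mathrm{sa}}$ to $\mathcal{SEP}(\mathbb{C}^n \otimes \mathbb{C}^m)$, as well as the cone $\boldsymbol{PPT}(\mathbb{C}^m, \mathbb{C}^n)$ of PPT-inducing maps to $\mathcal{PPT}(\mathbb{C}^n \otimes \mathbb{C}^m)$.
A map $\Phi : \mathsf{M}_m^{\mathrm{sa}} \to \mathsf{M}_n^{\mathrm{sa}}$ is said to be co-completely positive if $C(\Phi) \in \mathrm{co}\text{-}\mathcal{PSD}$. Similarly, one says that $\Phi$ is decomposable if it can be represented as a sum of a completely positive map and a co-completely positive map. It follows that the correspondence $\Phi \mapsto C(\Phi)$ relates the cone $\boldsymbol{DEC}(\mathbb{C}^n, \mathbb{C}^m)$ of decomposable maps to the cone of decomposable matrices.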
Interestingly, $\mathcal{SEP}(\mathbb{C}^n \otimes \mathbb{C}^m)^*$ identifies with $\boldsymbol{P}(\mathbb{C}^m, \mathbb{C}^n)$. This last identification is in fact easy to see directly from (2.45)–(2.46). Indeed, $C(\Phi) = (M_{jk})$ means that $M_{jk} = \Phi(|e_j\rangle\langle e_k|)$ and hence if $\xi = (\xi_j)_{j=1}^{m} \in \mathbb{C}^m$, then $\Phi(|\xi\rangle\langle\xi|) = \sum_{j,k=1}^{m} \xi_j \overline{\xi_k}\, M_{jk}$. Consequently,
\begin{align*}
C(\Phi) \in \mathcal{SEP}(\mathbb{C}^n \otimes \mathbb{C}^m)^* &\iff \Phi(|\xi\rangle\langle\xi|) \in \mathcal{PSD}(\mathbb{C}^n) \ \text{ for } \xi \in \mathbb{C}^m \\
&\iff \Phi \in \boldsymbol{P},
\end{align*}
which is the claimed identification. The first equivalence is simply (2.45)–(2.46) for the choice $M = C(\Phi)$, whereas the second one reflects the fact that the property of “preserving positivity” needs to be checked only on the extreme rays of the $\mathcal{PSD}$ cone, i.e., on operators of the form $|\xi\rangle\langle\xi|$. (See Section 1.2.2 and particularly Corollary 1.10.)
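To make the correspondence $\Phi \mapsto C(\Phi)$ concrete, here is a small sketch that assembles the block matrix $(M_{jk})$ with $M_{jk} = \Phi(|e_j\rangle\langle e_k|)$ and tests positive semi-definiteness. The helpers `choi` and `is_psd` and the matrix `V` are our own illustrative choices, and the block ordering used here is an assumption (positivity of the Choi matrix does not depend on it). The conjugation map $\rho \mapsto V\rho V^\dagger$ passes the test of Choi's theorem, while the transposition, which is positivity-preserving, does not.

```python
import numpy as np

def choi(phi, m):
    """Block matrix (M_jk) with M_jk = phi(|e_j><e_k|), for a map phi defined on m x m matrices."""
    rows = []
    for j in range(m):
        row = []
        for k in range(m):
            E = np.zeros((m, m), dtype=complex)
            E[j, k] = 1.0
            row.append(phi(E))
        rows.append(row)
    return np.block(rows)

def is_psd(M, tol=1e-10):
    """Check positive semi-definiteness of (the Hermitian part of) M."""
    return np.linalg.eigvalsh((M + M.conj().T) / 2).min() >= -tol

V = np.array([[1.0, 2.0], [0.0, 1.0j]])              # an arbitrary invertible matrix
conjugation = lambda rho: V @ rho @ V.conj().T       # completely positive
transposition = lambda rho: rho.T                    # positive, but not completely positive

print(is_psd(choi(conjugation, 2)))      # Choi matrix is positive semi-definite
print(is_psd(choi(transposition, 2)))    # Choi matrix is the flip operator, which is not

# Positivity of the transposition on a rank-one operator |xi><xi|.
xi = np.array([1.0 + 2.0j, -0.5j])
print(is_psd(transposition(np.outer(xi, xi.conj()))))
```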

Exercise 2.46 (Composition rules for maps). Show that a composition of two co-completely positive maps is completely positive. Similarly, show that a composition of a co-completely positive map and a completely positive map is co-completely positive.
Exercise 2.47 (The completely positive cone is self-dual). Show that
\[
\boldsymbol{CP}(\mathbb{C}^n, \mathbb{C}^m) = \{\Psi \in B(\mathsf{M}_n^{\mathrm{sa}}, \mathsf{M}_m^{\mathrm{sa}}) : \operatorname{Tr}(\Psi \circ \Phi) \geq 0 \ \ \forall \Phi \in \boldsymbol{CP}(\mathbb{C}^m, \mathbb{C}^n)\},
\]
where $\operatorname{Tr}$ denotes the trace on $B(\mathsf{M}_n^{\mathrm{sa}})$.

Exercise 2.48 ($k$-positive superoperators and $k$-entangled states). Let $1 \leq k \leq \min(m, n)$ and $\Phi : \mathsf{M}_n \to \mathsf{M}_m$ be self-adjointness-preserving. Show that the following are equivalent
(1) $\Phi$ is $k$-positive,
(2) for every $x \in \mathbb{C}^m \otimes \mathbb{C}^n$ with Schmidt rank at most $k$, we have $\langle x | C(\Phi) | x \rangle \geq 0$,
(3) for every $A \in \mathsf{M}_{k,m}$ and $B \in \mathsf{M}_{k,n}$, the operator $(A \otimes B)^\dagger C(\Phi) (A \otimes B)$ is positive.
In words, the cone of Choi matrices of $k$-positive superoperators is dual to the cone generated by the set of $k$-entangled states (as defined in Section 2.2.5).
2.4.3. Symmetries of the PSD cone. The results of Section 2.1.4 allow us to deduce a description of the groups of affine automorphisms of some of the cones cataloged in the present section. The argument is based on the following two simple observations: first, since affine automorphisms preserve facial structure, and since $0$ is the only extreme point of all the cones considered above, any affine automorphism must be linear. Next, if $\Phi : \mathsf{M}_m^{\mathrm{sa}} \to \mathsf{M}_n^{\mathrm{sa}}$ is such that $A = \Phi(\mathrm{I})$ is positive definite, then $\Psi$ defined by $\Psi(\rho) = A^{-1/2} \Phi(\rho) A^{-1/2}$ is unital, and its adjoint, $\Psi^*$, is trace-preserving (see (2.36)). This often allows one to reduce the analysis of general maps to that of unital or trace-preserving maps. As an example of such reduction we will prove the following statement.

Proposition 2.29 (Characterization of automorphisms of the PSD cone). Let $\Phi : \mathsf{M}_n^{\mathrm{sa}} \to \mathsf{M}_n^{\mathrm{sa}}$ be an affine map which satisfies $\Phi(\mathcal{PSD}(\mathbb{C}^n)) = \mathcal{PSD}(\mathbb{C}^n)$. Then $\Phi$ is a linear automorphism of $\mathcal{PSD}(\mathbb{C}^n)$ and is of one of two possible forms: $\Phi(\rho) = V \rho V^\dagger$ or $\Phi(\rho) = V \rho^T V^\dagger$, for some $V \in \mathsf{GL}(n, \mathbb{C})$. In the first case $\Phi$ is completely positive, whereas in the second case $\Phi$ is co-completely positive.

Proof. Since $\operatorname{rank} \Phi \geq \dim \mathcal{PSD}(\mathbb{C}^n) = \dim \mathsf{M}_n^{\mathrm{sa}}$, it follows that $\Phi$ is surjective and hence injective, so it is indeed an automorphism of $\mathcal{PSD}(\mathbb{C}^n)$ (and, consequently, so is $\Phi^{-1}$). By the earlier remark, $\Phi$ must be linear. Since the adjoint of a positive map is positive (see Section 2.3.2), it follows that $\Phi^*$ and $(\Phi^*)^{-1} = (\Phi^{-1})^*$ are positive. Hence they are both automorphisms of $\mathcal{PSD}(\mathbb{C}^n)$.
Let $A = \Phi^*(\mathrm{I}) \in \mathcal{PSD}(\mathbb{C}^n)$. We claim that $A$ belongs to the interior of $\mathcal{PSD}(\mathbb{C}^n)$ and, consequently, is positive definite (and invertible). This follows from topological considerations, but can also be deduced from Proposition 1.4: if $A = \Phi^*(\mathrm{I})$ lay on the boundary of $\mathcal{PSD}(\mathbb{C}^n)$, we would have $A \in F$ for some proper face $F$ of $\mathcal{PSD}(\mathbb{C}^n)$, which would imply $\Phi^*(\mathcal{PSD}(\mathbb{C}^n)) \subset F$, contradicting injectivity of $\Phi^*$. Having established the claim, we set $\Psi(\sigma) = A^{-1/2} \Phi^*(\sigma) A^{-1/2}$, so that $\Psi$ is a unital automorphism of $\mathcal{PSD}(\mathbb{C}^n)$. Consequently, $\Psi^*$ is a trace-preserving automorphism of $\mathcal{PSD}(\mathbb{C}^n)$, which is only possible if $\Psi^*(\mathrm{D}) = \mathrm{D}$. It now follows from Kadison's Theorem 2.4 that, for some $U \in \mathsf{U}(n)$, either (i) $\Psi^*(\tau) = U \tau U^\dagger$ or (ii) $\Psi^*(\tau) = U \tau^T U^\dagger$ (for all $\tau \in \mathsf{M}_n^{\mathrm{sa}}$). The rest of the argument is just bookkeeping. First, the definition of $\Psi$ (and that of an adjoint map) imply that $\Psi^*$ is given by the formula $\Psi^*(\tau) = \Phi(A^{-1/2} \tau A^{-1/2})$. In case (i), this shows that $\Phi(A^{-1/2} \tau A^{-1/2}) = U \tau U^\dagger$ or, substituting $\rho = A^{-1/2} \tau A^{-1/2}$, $\Phi(\rho) = U A^{1/2} \rho A^{1/2} U^\dagger = V \rho V^\dagger$, where $V = U A^{1/2}$, as needed. The fact that $\Phi$ is then completely positive is the easy implication of Choi's Theorem 2.21. Case (ii) is handled in the same way. $\square$
We have an immediate
Corollary 2.30. Completely positive automorphisms of the cone $\mathcal{PSD}(\mathbb{C}^n)$, all of which are of the form $\Phi_V(\rho) = V \rho V^\dagger$ for some $V \in \mathsf{GL}(n, \mathbb{C})$, act transitively on the interior of that cone.
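Transitivity can be made explicit: given positive definite $P$ and $Q$, one possible choice (ours, and certainly not the only one) is $V = Q^{1/2} P^{-1/2}$, for which $\Phi_V(P) = Q$. The sketch below checks this on random positive definite matrices; `psd_power` and `random_pd` are hypothetical helper names.

```python
import numpy as np

def psd_power(M, p):
    """Return M**p for a positive definite Hermitian matrix M, via its eigendecomposition."""
    w, U = np.linalg.eigh(M)
    return (U * w**p) @ U.conj().T

def random_pd(n, rng):
    """A random positive definite n x n matrix (G G^dagger + I)."""
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return G @ G.conj().T + np.eye(n)

rng = np.random.default_rng(2)
P, Q = random_pd(3, rng), random_pd(3, rng)

V = psd_power(Q, 0.5) @ psd_power(P, -0.5)   # invertible, so Phi_V is an automorphism
image = V @ P @ V.conj().T                   # Phi_V(P)
print(np.allclose(image, Q))
```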

For future reference, we state here a slightly more general form of the principle that is implicit in the proof of Proposition 2.29.
Lemma 2.31. If $\Phi : \mathsf{M}_m^{\mathrm{sa}} \to \mathsf{M}_n^{\mathrm{sa}}$ is a positivity-preserving linear map such that $A = \Phi(\mathrm{I})$ is positive definite, then $\tilde{\Phi}$ defined by $\tilde{\Phi}(\rho) = A^{-1/2} \Phi(\rho) A^{-1/2}$ is unital and positivity-preserving. Similarly, if $\Psi$ is a positivity-preserving linear map such that $\Psi(\rho) \neq 0$ for $\rho \in \mathcal{PSD}(\mathbb{C}^m) \setminus \{0\}$, then $\tilde{\Psi}(\rho) = \Psi(B^{-1/2} \rho B^{-1/2})$ is trace-preserving and positivity-preserving, where $B = \Psi^*(\mathrm{I})$ (necessarily positive definite).

We emphasize that the map $\Phi$ in Lemma 2.31 is not assumed to be an automorphism of the $\mathcal{PSD}$ cone (as was the case in Proposition 2.29), only positivity-preserving. Moreover, we also allow the dimensions in the domain and in the range to be different. Finally, recall that, by Lemma 1.7, the properties “$\Phi(\mathrm{I})$ is positive definite” and “$\Psi(\rho) \neq 0$ for $\rho \in \mathcal{PSD}(\mathbb{C}^m) \setminus \{0\}$” are dual to each other.
In view of the above result, it is natural to wonder when a positivity-preserving map is equivalent, in the sense of Lemma 2.31, to a map which is both unital and trace-preserving. (Of course if the dimensions in the domain and in the range are different, this is only possible if we use the normalized trace or, alternatively, if we ask that the maximally mixed state be mapped to the maximally mixed state.) It turns out that this can be ensured if just a little more regularity is assumed. (See Exercise 2.52 for examples exploring the necessity of the stronger hypothesis.) We have
Proposition 2.32 (Sinkhorn's normal form for positive maps). Let $\Phi : \mathsf{M}_m^{\mathrm{sa}} \to \mathsf{M}_n^{\mathrm{sa}}$ be a linear map which belongs to the interior of $\boldsymbol{P}$, the cone of positivity-preserving maps. Then there exist positive operators $A \in \mathcal{PSD}(\mathbb{C}^n)$ and $B \in \mathcal{PSD}(\mathbb{C}^m)$ such that the map $\tilde{\Phi}(\rho) = A \Phi(B \rho B) A$ is trace-preserving and maps the maximally mixed state to the maximally mixed state (and is necessarily positivity-preserving).
Proof. Let us first focus on the case $m = n$. Given positive definite $A, B$, let $\tilde{\Phi}$ be given by the formula from the Proposition. Then
\[
\tilde{\Phi} \text{ is unital} \iff A \Phi(B^2) A = \mathrm{I} \iff \Phi(B^2) = A^{-2} \iff \Phi(B^2)^{-1} = A^2. \tag{2.48}
\]
We next note that, in the notation of Corollary 2.30, $\tilde{\Phi} = \Phi_A \circ \Phi \circ \Phi_B$ and so $\tilde{\Phi}^* = \Phi_B \circ \Phi^* \circ \Phi_A$ (this uses the identity $\Phi_M^* = \Phi_M$, valid when $M$ is self-adjoint). Accordingly, by (2.36),
\[
\tilde{\Phi} \text{ is trace-preserving} \iff \tilde{\Phi}^* \text{ is unital} \iff B \Phi^*(A^2) B = \mathrm{I} \iff \Phi^*(A^2) = B^{-2}. \tag{2.49}
\]
Solving the last equation in (2.49) for $B^2$ and substituting it in (2.48) we are led to a system of equations
\[
B^2 = \Phi^*(A^2)^{-1} \quad \text{and} \quad \Phi\bigl(\Phi^*(A^2)^{-1}\bigr)^{-1} = A^2. \tag{2.50}
\]
The second equation in (2.50) says that $S = A^2$ is a fixed point of the function
\[
S \mapsto f(S) := \Phi\bigl(\Phi^*(S)^{-1}\bigr)^{-1}. \tag{2.51}
\]
Conversely, if $S$ is a positive definite fixed point of $f$, then $A = S^{1/2}$ and $B = \Phi^*(A^2)^{-1/2}$ (i.e., $B$ defined so that the first equation in (2.50) holds) satisfy (2.48) and (2.49) and yield $\tilde{\Phi}$ that is unital and trace-preserving. (The hypothesis “$\Phi$ belongs to the interior of $\boldsymbol{P}$” guarantees that all the inverses and negative powers above make sense, and that $f$ is well-defined and continuous on $\mathcal{PSD} \setminus \{0\}$, see Exercises 2.50 and 2.51.)
To find a fixed point of $f$ we want to use Brouwer's fixed-point theorem, which requires a (continuous) function that is a self-map of a compact convex set. One way to arrive at such setting is to consider $f_1 : \mathrm{D}(\mathbb{C}^n) \to \mathrm{D}(\mathbb{C}^n)$ defined by
\[
f_1(\sigma) = \frac{f(\sigma)}{\operatorname{Tr} f(\sigma)}. \tag{2.52}
\]
It then follows that there is $\sigma_0 \in \mathrm{D}(\mathbb{C}^n)$ such that $f_1(\sigma_0) = \sigma_0$ and hence $f(\sigma_0) = t \sigma_0$, where $t = \operatorname{Tr} f(\sigma_0) > 0$. The final step is to note that if we choose, as before, $A = \sigma_0^{1/2}$ and $B = \Phi^*(A^2)^{-1/2}$, then the corresponding $\tilde{\Phi}$ is trace-preserving and satisfies $\tilde{\Phi}(\mathrm{I}) = t^{-1} \mathrm{I}$. If $m = n$, this is only possible if $t = 1$. In other words, $\sigma_0$ is a fixed point of $f$ that we needed in order to conclude the argument. In the general case, the same argument yields $t = n/m$, which translates to $\tilde{\Phi}(\mathrm{I}/m) = \mathrm{I}/n$, again as needed. $\square$
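The proof above is an existence argument via Brouwer's theorem. For experimentation one may instead simply iterate the normalized map $f_1$ from (2.52); this is a numerical heuristic in the spirit of Sinkhorn-type iterative scalings, not the method of the proof, and we make no claim here about convergence in general. The sketch below does this for a hypothetical example map $\Phi(\rho) = K \rho K^\dagger + (\operatorname{Tr} \rho)\, C$ on $\mathsf{M}_2$ (with $C$ positive definite, so that $\Phi$ lies in the interior of $\boldsymbol{P}$); the matrices `K` and `C` and all helper names are our own choices.

```python
import numpy as np

def psd_power(M, p):
    """Return M**p for a positive definite Hermitian matrix M."""
    w, U = np.linalg.eigh(M)
    return (U * w**p) @ U.conj().T

# Hypothetical example map: Phi(rho) = K rho K^dagger + Tr(rho) C.
K = np.array([[1.0, 0.5], [0.0, 1.0]])
C = np.diag([1.0, 2.0])
I2 = np.eye(2)

def phi(rho):
    return K @ rho @ K.conj().T + np.trace(rho) * C

def phi_adj(sigma):
    """Adjoint of phi with respect to the Hilbert-Schmidt inner product."""
    return K.conj().T @ sigma @ K + np.trace(C @ sigma) * I2

def f(S):
    """The function (2.51): S -> Phi(Phi^*(S)^{-1})^{-1}."""
    return np.linalg.inv(phi(np.linalg.inv(phi_adj(S))))

# Iterate the normalized map f_1 of (2.52), starting from the maximally mixed state.
sigma = I2 / 2
for _ in range(500):
    fs = f(sigma)
    sigma = fs / np.trace(fs).real

t = np.trace(f(sigma)).real          # should be close to 1 when m = n
A = psd_power(sigma, 0.5)            # A = sigma_0^{1/2}
B = psd_power(phi_adj(sigma), -0.5)  # B = Phi^*(A^2)^{-1/2}

def phi_tilde(rho):
    return A @ phi(B @ rho @ B) @ A

def phi_tilde_adj(s):
    return B @ phi_adj(A @ s @ A) @ B

print(t)
print(phi_tilde(I2))        # approximately the identity (unital)
print(phi_tilde_adj(I2))    # approximately the identity (trace-preserving)
```

Note that trace preservation holds by construction of $B$, whereas unitality relies on the iteration having reached a fixed point of $f_1$.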

Exercise 2.49. Show that $\Phi \in \boldsymbol{P}(\mathbb{C}^n)$ is an automorphism of $\mathcal{PSD}(\mathbb{C}^n)$ if and only if it is rank-preserving.
Exercise 2.50 (Descriptions of the interior of the positive cone). Show that $\Phi$ belongs to the interior of $\boldsymbol{P}(\mathbb{C}^n)$ iff $\Phi$ maps $\mathcal{PSD}(\mathbb{C}^n) \setminus \{0\}$ to the interior of $\mathcal{PSD}(\mathbb{C}^n)$ iff there exists $\delta > 0$ such that $\Phi(\rho) \geq \delta (\operatorname{Tr} \rho)\, \mathrm{I}$ for all $\rho \in \mathcal{PSD}$.

Exercise 2.51 (Interior of the positive cone is self-dual). Show that $\Phi$ verifies $\Phi(\rho) \geq \delta (\operatorname{Tr} \rho)\, \mathrm{I}$ (for all $\rho \in \mathcal{PSD}$) iff $\Phi^*$ does.

Exercise 2.52 (Discussion of the necessity of the hypothesis of Proposition 2.32). Give examples of $\Phi, \Psi \in \boldsymbol{P}(\mathbb{C}^2)$ such that (a) $\Phi(\mathrm{I})$ and $\Phi^*(\mathrm{I})$ are positive definite, but $\Phi$ is not equivalent (in the sense of Proposition 2.32) to a unital, trace-preserving map, and (b) $\Psi$ is unital and trace-preserving, but $\Psi \in \partial \boldsymbol{P}$.

Exercise 2.53 (Rank nondecreasing and Sinkhorn's normal form). Give an example of a map $\Phi \in \boldsymbol{P}(\mathbb{C}^2, \mathbb{C}^2)$ which is rank nondecreasing (i.e., verifies $\operatorname{rank} \Phi(\rho) \geq \operatorname{rank} \rho$ for any $\rho \in \mathrm{D}(\mathbb{C}^2)$), but which does not satisfy the conclusion of Proposition 2.32.

2.4.4. Entanglement witnesses. The formalism of cones and their duality allows us to conveniently discuss the concept of entanglement witnesses. We start with the following simple observation, which is a direct consequence of the identifications of the dual cone $\mathcal{SEP}^*$ as $\mathcal{BP}$ (see Table 2.1 in Section 2.4), and of the corresponding cone of superoperators as $\boldsymbol{P}$ (Table 2.2).

Proposition 2.33 (Entanglement witnesses, take #1). Let $\mathcal{H} = \mathbb{C}^m \otimes \mathbb{C}^n$ and let $\rho$ be a state on $\mathcal{H}$. Then the following conditions are equivalent:
(i) $\rho$ is entangled,
(ii) there exists $\sigma \in \mathcal{SEP}(\mathcal{H})^* = \mathcal{BP}$ such that $\langle \sigma, \rho \rangle_{\mathrm{HS}} = \operatorname{Tr}(\sigma \rho) < 0$,
(iii) there exists a positivity-preserving linear map $\Psi : \mathsf{M}_n^{\mathrm{sa}} \to \mathsf{M}_m^{\mathrm{sa}}$ such that $\operatorname{Tr}(C(\Psi) \rho) < 0$.
The next result is a simple corollary of the above observation, but it goes well beyond a straightforward reformulation.

Theorem 2.34 (Horodecki's entanglement witness theorem). Let $\mathcal{H} = \mathbb{C}^m \otimes \mathbb{C}^n$ and let $\rho$ be a state on $\mathcal{H}$. Then $\rho$ is entangled iff there exists a positivity-preserving map $\Phi : \mathsf{M}_m^{\mathrm{sa}} \to \mathsf{M}_n^{\mathrm{sa}}$ such that the operator $(\Phi \otimes \mathrm{Id}_{\mathsf{M}_n^{\mathrm{sa}}}) \rho$ is not positive semi-definite.
In the setting of Proposition 2.33 and Theorem 2.34, the operator $\sigma$ or the map $\Phi$ are said to witness the entanglement present in $\rho$, hence the term “entanglement witnesses.”
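Proof of Theorem 2.34. The sufficiency is obvious: if $\rho = \tau \otimes \tau'$ is a product state and $\Phi$ is positivity-preserving, then $(\Phi \otimes \mathrm{Id}) \rho = \Phi(\tau) \otimes \tau'$, which is clearly positive; the case of convex combinations of product states easily follows. To show necessity, let $\Psi : \mathsf{M}_n^{\mathrm{sa}} \to \mathsf{M}_m^{\mathrm{sa}}$ be the positivity-preserving map given by Proposition 2.33. If $\chi \in \mathbb{C}^n \otimes \mathbb{C}^n$ is the maximally entangled vector as in (2.32), then
\begin{align*}
0 > \operatorname{Tr}(C(\Psi) \rho) = \langle C(\Psi), \rho \rangle_{\mathrm{HS}}
&= \bigl\langle (\Psi \otimes \mathrm{Id}_{\mathsf{M}_n^{\mathrm{sa}}}) |\chi\rangle\langle\chi|, \rho \bigr\rangle_{\mathrm{HS}} \\
&= \bigl\langle |\chi\rangle\langle\chi|, (\Psi^* \otimes \mathrm{Id}_{\mathsf{M}_n^{\mathrm{sa}}}) \rho \bigr\rangle_{\mathrm{HS}}
= \langle \chi | (\Psi^* \otimes \mathrm{Id}_{\mathsf{M}_n^{\mathrm{sa}}}) \rho | \chi \rangle,
\end{align*}
which implies that $\bigl(\Psi^* \otimes \mathrm{Id}_{\mathsf{M}_n^{\mathrm{sa}}}\bigr) \rho$ is not positive. Given that $\Psi^*$ is positivity-preserving if and only if $\Psi$ is (see Section 2.3.2), the choice of $\Phi = \Psi^*$ works as needed. $\square$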
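A standard illustration of both kinds of witnesses is given by the singlet state on $\mathbb{C}^2 \otimes \mathbb{C}^2$. In the sketch below (our own example, with hypothetical helper names), the witness operator is the flip $F$, which is block-positive, and the witness map is the transposition, so that $(\Phi \otimes \mathrm{Id})\rho$ is the partial transpose of $\rho$. For comparison we also evaluate the witness on a product state, where no negativity can occur.

```python
import numpy as np

def partial_transpose_first(rho, m, n):
    """Apply (T (x) Id) to an operator on C^m (x) C^n, where T is the transposition."""
    return rho.reshape(m, n, m, n).transpose(2, 1, 0, 3).reshape(m * n, m * n)

# Singlet state (|01> - |10>)/sqrt(2) on C^2 (x) C^2.
psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
rho = np.outer(psi, psi.conj())

# Witness operator: the flip F, with F (x (x) y) = y (x) x.
flip = np.zeros((4, 4))
for j in range(2):
    for a in range(2):
        flip[2 * j + a, 2 * a + j] = 1.0

witness_value = np.trace(flip @ rho).real                                  # condition (ii)
min_eig = np.linalg.eigvalsh(partial_transpose_first(rho, 2, 2)).min()     # Theorem 2.34

# A product state |0><0| (x) I/2 for comparison.
product = np.kron(np.diag([1.0, 0.0]), np.diag([0.5, 0.5]))
product_value = np.trace(flip @ product).real

print(witness_value, min_eig, product_value)
```

Both `witness_value` and `min_eig` are negative, certifying entanglement of the singlet, while `product_value` is nonnegative.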
Remark 2.35. It follows from general considerations that the entanglement witnesses $\sigma$, $\Phi$ may be required to satisfy various additional properties. First, one may include a normalizing condition such as $\operatorname{Tr} \sigma = 1$ or $\operatorname{Tr} \Phi(\mathrm{I}) = 1$, which reduces the search for a witness to a convex compact set. Next, since linear functions (restricted to compact sets) attain extreme values on extreme points, one may insist that $\sigma$ or $\Phi$ belong to an extreme ray of the respective cone (or even, by a density argument, to an exposed ray; cf. Exercise 1.5). Finally, another acceptable normalizing condition is to require that $\Phi$ be unital or trace-preserving. To see that $\Phi$ can be assumed unital, we note first that by a density argument the operator $\Phi(\mathrm{I})$ may be assumed to be positive definite, in which case Lemma 2.31 applies. The case of the trace-preserving restriction is slightly more involved and requires increasing the dimension of the range of $\Phi$. We relegate the details of the arguments to Exercises 2.54 and 2.55.

Exercise 2.54 (Unital witnesses suffice). Show that in Theorem 2.34 one can require that $\Phi$ be unital.

Exercise 2.55 (Trace-preserving witnesses suffice). Show that in Theorem 2.34 one can require that $\Phi$ be trace-preserving, at the cost of allowing the range of $\Phi$ to be $\mathsf{M}_{m+n}^{\mathrm{sa}}$.

Exercise 2.56 (Optimal entanglement witnesses). We work in the Hilbert space $\mathcal{H} = \mathbb{C}^m \otimes \mathbb{C}^n$. For $\sigma \in \mathcal{BP}$, we denote by $E(\sigma) = \{\rho \in \mathrm{D} : \operatorname{Tr}(\rho \sigma) < 0\}$ the set of states detected to be entangled by $\sigma$. We say that $\sigma$ is an optimal entanglement witness if $E(\sigma)$ is maximal (i.e., whenever $E(\sigma) \subset E(\tau)$ for $\tau \in \mathcal{BP}$, then $E(\sigma) = E(\tau)$). Use the S-lemma (Lemma C.4) to show that if $\sigma$ lies on an extreme ray of $\mathcal{BP}$ and $\sigma \notin \mathcal{PSD}$, then $\sigma$ is an optimal entanglement witness.
2.4.5. Proofs of Størmer's theorem. In this section we will present two rather different proofs of the $\mathbb{C}^2 \otimes \mathbb{C}^2$ case of Theorem 2.15, which we state here in a slightly more general form. (See Notes and Remarks for comments regarding the $\mathbb{C}^2 \otimes \mathbb{C}^3$ case.)

Theorem 2.36 (Størmer's theorem). If $\mathcal{H} = \mathbb{C}^2 \otimes \mathbb{C}^2$, then the separable cone $\mathcal{SEP}(\mathcal{H})$ and the cone $\mathcal{PPT}(\mathcal{H})$ coincide. Equivalently, $\boldsymbol{P}(\mathbb{C}^2) = \boldsymbol{DEC}(\mathbb{C}^2)$.
The equivalence of the two assertions of the Theorem follows from Choi's correspondence and duality (see Section 2.4 and particularly Table 2.2). We will focus on the second assertion. Since the inclusion $\boldsymbol{DEC}(\mathcal{H}) \subset \boldsymbol{P}(\mathcal{H})$ always holds, we only need to establish that every positivity-preserving map on $\mathsf{M}_2^{\mathrm{sa}}$ is decomposable.
In a nutshell, the first proof depends on noticing that Proposition 2.32 effectively reduces the general case to that of unital, trace-preserving maps, which in turn follows easily from very classical facts. The second proof handles first the maps generating extreme rays of $\boldsymbol{P}(\mathbb{C}^2)$, and concludes via the Krein–Milman theorem. Here are the details.
Proof #1 of Theorem 2.36. The crucial observation is that it suffices to show that the interior of $\boldsymbol{P}(\mathbb{C}^2)$ is contained in $\boldsymbol{DEC}(\mathbb{C}^2)$. The needed inclusion $\boldsymbol{P}(\mathbb{C}^2) \subset \boldsymbol{DEC}(\mathbb{C}^2)$ follows then from both cones being closed, and being the closures of their interiors.
To that end, suppose that $\Phi$ belongs to the interior of $\boldsymbol{P}(\mathbb{C}^2)$. Proposition 2.32 implies then that there exist positive operators $A, B \in \mathsf{M}_2^{\mathrm{sa}}$ and a positivity-preserving, unital and trace-preserving map $\tilde{\Phi} : \mathsf{M}_2^{\mathrm{sa}} \to \mathsf{M}_2^{\mathrm{sa}}$ such that $\Phi(\rho) = A^{-1} \tilde{\Phi}\bigl(B^{-1} \rho B^{-1}\bigr) A^{-1}$ for all $\rho \in \mathsf{M}_2^{\mathrm{sa}}$. In other words, $\Phi = \Phi_{A^{-1}} \circ \tilde{\Phi} \circ \Phi_{B^{-1}}$, where $\Phi_M(\rho) := M \rho M^\dagger$. Since every $\Phi_M$ is completely positive, the composition rules for completely positive and co-completely positive maps (see Exercises 2.26 and 2.46) show that the problem reduces to establishing decomposability of $\tilde{\Phi}$.
Up to now, the argument worked in any dimension; presently, we will exploit the special features of dimension 2. Since $\tilde{\Phi}$ is an affine self-map of the Bloch ball that preserves the center, it may be thought of as a linear map $R \in B(\mathbb{R}^3)$ with $\|R\|_\infty \leq 1$. Such maps are convex combinations of elements of $\mathsf{O}(3)$ (cf. Exercises 1.44 and 1.45), which in turn correspond to maps of the form (i) $\rho \mapsto U \rho U^\dagger$ or (ii) $\rho \mapsto U \rho^T U^\dagger$ for some $U \in \mathsf{U}(2)$ (depending on whether the said element of $\mathsf{O}(3)$ belongs to $\mathsf{SO}(3)$ or not). This is a very special and elementary case of Kadison's Theorem 2.4, and was explained in the proof of Wigner's Theorem 2.3 (see also Exercise B.4 for the isomorphism $\mathsf{PSU}(2) \leftrightarrow \mathsf{SO}(3)$). It remains to recall that the maps of form (i) are completely positive and those of form (ii) are co-completely positive. $\square$
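The passage from a unital, trace-preserving map on $\mathsf{M}_2^{\mathrm{sa}}$ to a matrix $R \in B(\mathbb{R}^3)$ can be computed explicitly in the Pauli basis, via $R_{jk} = \tfrac{1}{2} \operatorname{Tr}(\sigma_j \Phi(\sigma_k))$. The sketch below is our own illustration of this correspondence (the normalization and the names `PAULI`, `bloch_matrix` are assumptions, not notation from the text): a unitary conjugation yields an element of $\mathsf{SO}(3)$, the transposition yields an element of $\mathsf{O}(3)$ with determinant $-1$, and a convex combination of the two yields a contraction.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def bloch_matrix(phi):
    """R[j, k] = Tr(sigma_j phi(sigma_k)) / 2 for a unital, trace-preserving map phi on M_2."""
    return np.array([[np.trace(sj @ phi(sk)).real / 2 for sk in PAULI] for sj in PAULI])

theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]], dtype=complex)

R_unitary = bloch_matrix(lambda rho: U @ rho @ U.conj().T)     # form (i)
R_transpose = bloch_matrix(lambda rho: rho.T)                  # form (ii) with U = I
R_mix = bloch_matrix(lambda rho: 0.5 * (U @ rho @ U.conj().T) + 0.5 * rho.T)

print(np.linalg.det(R_unitary), np.linalg.det(R_transpose))
print(np.linalg.norm(R_mix, 2))    # operator norm of the mixture
```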
Remark 2.37. The above argument, when combined with the result from Exercise 1.45, shows that every $\Phi \in \boldsymbol{P}(\mathbb{C}^2)$ can be represented as $\Phi = \sum_j \Phi_{A_j} + \sum_k \Phi_{B_k} \circ T$ so that the total number of terms does not exceed 4.

Proof #2 of Theorem 2.36. Again, we will prove the inclusion $\boldsymbol{P}(\mathbb{C}^2) \subset \boldsymbol{DEC}(\mathbb{C}^2)$. Since $\boldsymbol{P}(\mathbb{C}^2)$ is convex and nondegenerate, it is enough to verify that its extreme rays consist of decomposable maps (see the comment following Proposition 1.9). The following characterization of such extreme rays comes in handy.
Proposition 2.38 (see Appendix C). Let $\Phi : \mathsf{M}_2^{\mathrm{sa}} \to \mathsf{M}_2^{\mathrm{sa}}$ be a map which generates an extreme ray of $\boldsymbol{P}(\mathbb{C}^2)$. Then either $\Phi$ is an automorphism of $\mathcal{PSD}(\mathbb{C}^2)$, in which case it is described by Proposition 2.29, or $\Phi$ is of rank one, in which case it is of the form $\Phi(\rho) = \operatorname{Tr}(\rho |\varphi\rangle\langle\varphi|)\, |\psi\rangle\langle\psi| = |\psi\rangle\langle\varphi| \rho |\varphi\rangle\langle\psi|$ for some $\varphi, \psi \in \mathbb{C}^2 \setminus \{0\}$.
Proposition 2.38 is a special case of the characterization of the extreme rays of the cone of maps preserving the Lorentz cone $\mathcal{L}_n$ (remember that the cone $\mathcal{PSD}(\mathbb{C}^2)$ is isomorphic to the Lorentz cone $\mathcal{L}_4$) that will be proved in Appendix C. The proof is based on the so-called S-lemma, a well-known fact from control theory and quadratic/semi-definite programming.
Once we assume the above Proposition, concluding the proof is easy. Indeed, if $\Phi$ is an automorphism of $\mathcal{PSD}(\mathbb{C}^2)$, then, by Proposition 2.29, it is either completely positive or co-completely positive, so a fortiori decomposable. On the other hand, if $\Phi$ is of rank one and $\Phi(\rho) = |\psi\rangle\langle\varphi| \rho |\varphi\rangle\langle\psi|$, then $\Phi$ is clearly completely positive with Kraus rank one and the single Kraus operator $A = |\psi\rangle\langle\varphi|$ (see Choi's Theorem 2.21; actually, since $A$ is itself of rank one, it follows that $C(\Phi)$ is in fact separable and hence that $\Phi$ is entanglement-breaking, see Lemmas 2.20 and 2.27). $\square$

Notes and Remarks

Classical references for the mathematical aspects of quantum information theory are [NC00, Hol12, Wil17]. We also recommend [Wat].
Section 2.1. A general reference for the geometry of quantum states is the book [BŻ06]. Wigner's theorem appears in [Wig59] and Kadison's theorem in [Kad65] in a broader context. Elementary proofs can be found in [Hun72, Sim76] and recent generalizations in [SCM16, Stø16].
Section 2.2. The definition of separability for mixed states was introduced in [Wer89]. The NP-hardness of deciding whether a state is separable was shown in [Gur03]. The argument sketched in Exercise 2.10 about the number of product vectors needed to represent any separable state is from [CĐ13].
Werner states were introduced in [VW01], where the question of their separability (Proposition 2.16) is also discussed.
Theorem 2.10 was proved in [DPS04]. For more information about $k$-extendibility and the symmetric subspace (also in the multipartite setting) we refer to the survey [Har13]. An early reference for $k$-entangled states is [TH00]. See Notes and Remarks on Chapter 9 for quantitative results about the hierarchies defined in Section 2.2.5.
The observation that non-PPT states are entangled (Peres–Horodecki criterion, Proposition 2.13) goes back to [Per96], see also [HHH96].
It was observed in [HHH96] that Theorem 2.15 is a consequence of results by Størmer [Stø63] and Woronowicz [Wor76]. See Notes and Remarks on Section 2.4 for more information.
For examples of PPT entangled states in $\mathbb{C}^3 \otimes \mathbb{C}^3$ or $\mathbb{C}^2 \otimes \mathbb{C}^4$, see [Hor97]; an early result going in the same direction can be found in [Cho75b]. Less ad hoc examples (in higher dimensions) are presented, e.g., in [BDM+99]. A geometric (non-constructive) argument is given in Chapter 9 (see Propositions 9.18 and 9.20; this approach works if the dimension is sufficiently large).
The realignment criterion to detect entanglement (also called cross-norm criterion) presented in Exercise 2.24 is from [CW03, Rud05]. It is neither weaker nor stronger than the PPT criterion. For more separability criteria, see the survey [HHHH09].
Theorem 2.17 was proved in [AS10] in the bipartite case and in [FLPS11] in the general case.
The geometry of the set of absolutely separable states is poorly understood. By definition, whether a state $\rho$ is absolutely separable depends only on its spectrum. An explicit description is known for $\mathbb{C}^2 \otimes \mathbb{C}^2$: a state $\rho$ with eigenvalues $\lambda_1 \geq \lambda_2 \geq \lambda_3 \geq \lambda_4$ is absolutely separable if and only if $\lambda_1 \leq \lambda_3 + 2\sqrt{\lambda_2 \lambda_4}$ [VADM01].
Similarly to absolute separability, one may say that a state $\rho \in \mathcal{H}_1 \otimes \mathcal{H}_2$ is absolutely PPT if $U \rho U^\dagger$ is PPT for any unitary $U$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$. An intriguing open problem is whether every absolutely PPT state is absolutely separable; see [AJR15].
Lemma 2.19 can be proved via elementary representation theory; see, e.g., Appendix C in [ASY14].

Section 2.3. The Jamiołkowski isomorphism can be traced to [Jam72]. Choi's and Jamiołkowski's isomorphisms are seldom distinguished in the literature; a discussion of the difference between the two appears in [LS13].
Choi's Theorem 2.21 as stated was proved in [Cho75a], which also contains a description of extreme completely positive unital maps. Closely related statements (including variants of Stinespring's Theorem 2.24) varying by the level of abstractness were arrived at (largely) independently by various authors, see, e.g., [Sti55, Kra71, Kra83].
Proposition 2.26 is from [LS93] and the argument from Exercise 2.34 is based on more general results from [RSW02] which give various descriptions of all quantum channels between qubits and of extreme points of the set of such channels.
For elementary properties of the diamond norm, see Section 3.3.4 in [Wat] (where it is studied under the name completely bounded trace norm). Entanglement-breaking channels were studied in detail in [HSR03].
The example from Exercise 2.29 is from [Tom85]. Exercise 2.44 is from [Wat], to which we also refer for a discussion of the class of LOCC channels.
Section 2.4. Proposition 2.29 is a folklore result which appears explicitly in [Sch65]. Many similar results involve classification of “linear preservers”, i.e., linear maps on $\mathsf{M}_d$ which preserve some property of matrices. Here is a typical statement due to Frobenius: a linear map $\Phi : \mathsf{M}_d \to \mathsf{M}_d$ satisfies the equation $\det \Phi(X) = \det X$ if and only if it has the form $X \mapsto AXB$ or $X \mapsto AX^T B$ for $A, B \in \mathsf{M}_d$ with $\det(AB) = 1$. For a survey on linear preserver problems, see [LT92].
The result from Proposition 2.32 and its derivation from Brouwer's fixed-point theorem appear in [Ide13, Ide16, AS15]. A similar statement (proved via an iterative construction) appeared in [Gur03] for positive maps $\Phi$ which are “rank non-decreasing” (however, not all such maps satisfy the conclusion of Proposition 2.32, see Exercise 2.53). The validity of Proposition 2.32 for completely positive maps is simpler and well known, see for example [GGHE08] and its references. The original Sinkhorn's theorem (for matrices, or for maps preserving the positive orthant in $\mathbb{R}^n$) goes back to [Sin64]; see [Ide16] for an extensive survey of related topics.
Theorem 2.34 is from [HHH96]. The concept of optimal entanglement witness which appears in Exercise 2.56 was investigated in [LKCH00].
Størmer's Theorem 2.36 was initially proved in [Stø63]; the original formulation involved the second of the two statements. The first proof presented here seems to be new and was a byproduct of the work on this book [AS15]. The scheme behind the second proof was apparently folklore for some time; it was documented in [MO15]. The novelty of its current presentation, if any, consists in streamlining of the proof of Proposition 2.38. (For more background information on Proposition 2.38, see Appendix C.) Other proofs (of either of the two versions given in Theorem 2.36) appeared in [KCKL00, VDD01, LMO06, KVSW09, Stø13]. A recent study of positivity-preserving maps on $\mathsf{M}_3$ can be found in [MO16]. While [MO16] is focused on the unital trace-preserving case, it is likely that (particularly when combined with our Proposition 2.32) it may provide a clear picture of the more general setting. In particular, it may lead to a simple and transparent proof of the $\mathbb{C}^2 \otimes \mathbb{C}^3$ case of Theorem 2.15 (Woronowicz's Theorem).

CHAPTER 3

Quantum Mechanics for Mathematicians

This section is addressed primarily to mathematicians who are new to quantum information theory. Its purpose is to indicate why various mathematical concepts enter the theory, and to give an idea of their physical meaning or interpretation. We make no attempt at being comprehensive; our attention is restricted to the constructs that play a central role in this book and that we ourselves have found (and still find) puzzling, such as mixed states and completely positive maps. In any case, neither of the authors being a physicist, the scope (and the depth) of the presentation will necessarily be limited.
This section is designed to be essentially independent of the rest of the book. The only “non-mainstream” technical device that is indispensable for following it is the Dirac bra-ket notation (see Section 0.3). The discussion will be occasionally informal in order for the readers to acquaint themselves with concepts that are presented more rigorously elsewhere in the book.

3.1. Simple-minded quantum mechanics


The state of a physical system (say, a particle) is described by a wave function
$\psi \in L^2(\mathbb{R}^3)$, which is generally time-dependent and complex-valued. Its dependence
on time is governed by some evolution equation (for example, the Schrödinger equation)
and is necessarily unitary: given $t > 0$, there is a unitary operator $U_t$ such
that if the state of the particle at time 0 is described by $\psi_0$ (a priori unknown),
its state at time $t$ will be $U_t \psi_0$. The probability of finding the particle at $x \in \mathbb{R}^3$
(assuming the appropriate measurement is performed) is given by the probability
density function $|\psi(x)|^2$, according to the Copenhagen interpretation. This forces
wave functions to be normalized in $L^2$ and justifies the postulate of unitary evolution.
Other physical properties of the particle are exhibited similarly. In particular,
if a given physical quantity is discrete, then there is an orthonormal sequence (or
basis) $(u_j)$, indexed by possible values of the quantity in question, such that the
probability of obtaining the $j$th value during measurement is $|\langle \psi, u_j \rangle|^2$. This is the
simplest case of the so-called Born rule. In a way, the actual values of the physical
quantity are of secondary importance and one simply says that “a measurement was
performed in the basis $(u_j)$” or that “$(u_j)$ is the computational basis” for this
particular measuring/experimental setup. (We will briefly discuss other, more general
measurement schemes in Section 3.6.)
It should be emphasized that it is possible for measurement results to be
deterministic. If the basis $(u_j)$ is such that $\psi = u_{j_0}$ for some $j_0$, then measuring $\psi$
in the basis $(u_j)$ will yield the $j_0$th outcome with probability 1. For the same reason,
two states $\psi$ and $\varphi$ are in principle perfectly distinguishable if (and only if) they
are orthogonal; one then “merely” needs to arrange a measurement in a basis that
contains both $\psi$ and $\varphi$.
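The Born rule for a discrete measurement can be illustrated on a qubit. The following is only a minimal sketch in plain Python: the state `psi`, the two bases and all function names are choices made for this illustration, not notation from the text.

```python
import math

def inner(psi, u):
    """<psi, u>; the conjugate is put on u here, which does not affect |<psi, u>|."""
    return sum(a * b.conjugate() for a, b in zip(psi, u))

def born_probabilities(psi, basis):
    """Probabilities |<psi, u_j>|^2 of the outcomes of a measurement in `basis`."""
    return [abs(inner(psi, u)) ** 2 for u in basis]

s = 1 / math.sqrt(2)
computational = [[1 + 0j, 0j], [0j, 1 + 0j]]      # |0>, |1>
diagonal = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]  # (|0> + |1>)/sqrt 2, (|0> - |1>)/sqrt 2

psi = [0.6 + 0j, 0.8j]                            # unit vector: 0.36 + 0.64 = 1

p_computational = born_probabilities(psi, computational)  # about [0.36, 0.64]
p_diagonal = born_probabilities(psi, diagonal)            # about [0.5, 0.5]

# A state that belongs to the measurement basis gives a deterministic outcome.
p_deterministic = born_probabilities(diagonal[0], diagonal)  # about [1.0, 0.0]
```

The same vector `psi` gives different outcome statistics in different bases, and the last line reproduces the deterministic case discussed above.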

3.2. Finite vs. infinite dimension, projective spaces and matrices


In the previous section the “state space” is the infinite-dimensional Hilbert space
$\mathcal{H} = L^2(\mathbb{R}^3)$. However, if the number of possible values of a physical quantity is
finite (and it may be argued that this is always the case, the “infinite” being just
a useful abstraction of “large”), the interesting part of the Hilbert space is
finite-dimensional and, consequently, may be identified with $\mathbb{C}^d$ for some $d \in \mathbb{N}$ (a $d$-level
system). A state is then simply a unit vector $\psi \in \mathbb{C}^d$. A priori $d$ may be very
large, but even the simple case of $d = 2$ (a qubit) is of interest: it may describe for
example the spin of an electron or the polarization of a photon.
Next, it is apparent from the discussion in Section 3.1 that no measurement
can distinguish between the wave functions $\psi$ and $\omega\psi$, where $\omega \in \mathbb{C}$ with $|\omega| = 1$,
and so the “true” state space is the complex projective space $P(\mathbb{C}^d)$ (or $\mathbb{CP}^{d-1}$) for
$d$-level systems. Another mathematical scheme that conveniently disregards scalar
factors is to consider not a unit vector $\psi \in \mathbb{C}^d$, but the orthogonal projection onto
$\mathbb{C}\psi$ or, in the language of matrices, the outer product $\rho = |\psi\rangle\langle\psi| \in M_d$. In that
language, when a measurement is performed in some basis $(u_j)$, the probability of
the $j$th outcome is
(3.1)    $|\langle \psi, u_j \rangle|^2 = \langle u_j, \psi \rangle \langle \psi, u_j \rangle = \langle u_j|\rho|u_j\rangle = \mathrm{Tr}\bigl(\rho\,|u_j\rangle\langle u_j|\bigr).$
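Both claims of this section, that $\psi$ and $\omega\psi$ give the same matrix $\rho$ and that (3.1) agrees with the vector form of the Born rule, are easy to check numerically. The sketch below uses plain Python lists; the particular vector, the phase `0.7` and the helper names are illustrative assumptions.

```python
import cmath

def outer(psi):
    """The matrix |psi><psi| as a nested list."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def trace_of_product(a, b):
    """Tr(ab) for two square matrices of the same size."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

psi = [0.6 + 0j, 0.8j]
omega = cmath.exp(0.7j)                   # an arbitrary scalar with |omega| = 1
rho = outer(psi)
rho_rotated = outer([omega * a for a in psi])   # same matrix up to rounding

u = [1 + 0j, 0j]                          # the basis vector u_j = |0>
p_from_vector = abs(sum(a.conjugate() * b for a, b in zip(psi, u))) ** 2
p_from_matrix = trace_of_product(rho, outer(u)).real   # Tr(rho |u><u|)
```

Here `p_from_vector` and `p_from_matrix` coincide (both are about 0.36), and `rho_rotated` equals `rho` entrywise up to floating-point error.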
3.3. Composite systems and quantum marginals; mixed states
This section gives motivation for the definition of (mixed) quantum states which
appeared in Section 2.1.1.
For classical systems, the state space of a system consisting of several components is
the Cartesian product of the corresponding state spaces. In the quantum setting, if
the state spaces of the components (subsystems, particles, . . . ) are Hilbert spaces
$\mathcal{H}_1, \ldots, \mathcal{H}_m$, the state space of the composite system is the tensor product
$\mathcal{K} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_m$. However, the Cartesian product of orthonormal bases of the $\mathcal{H}_k$’s
does yield (by tensoring the basis vectors) an orthonormal basis of $\mathcal{K}$. This is as far
as the similarities to the classical case go.
Consider now a bipartite system $\mathcal{K} = \mathcal{H} \otimes \mathcal{E}$ and assume that we have access
only to the $\mathcal{H}$ part. (This may be the case when $\mathcal{H}$ describes the state inside an
apparatus in a laboratory and $\mathcal{E}$ the environment, or if we decide to focus only on
the first subsystem.) Suppose that our system is in the state described by $\psi \in \mathcal{K}$
and let us try to figure out the $\mathcal{H}$-marginal of $\psi$, i.e., the state on $\mathcal{H}$, measurements
of which “within $\mathcal{H}$” are consistent with hypothetical measurements of the complete
state $\psi$.
If $\psi = \xi \otimes \eta$ (a product vector), the result is as expected: the $\mathcal{H}$-marginal of $\psi$ is
$\xi$. To check this, we note that if we measure $\xi$ in some basis $(u_j)$ of $\mathcal{H}$, we obtain the
$j$th outcome with probability $p_j = |\langle \xi, u_j \rangle|^2$. For a different point of view, suppose
that we have access to the entire system and that we perform a measurement in the
basis $(u_j \otimes v_k)_{j,k}$, where $(v_k)$ is some basis of $\mathcal{E}$. The probability of obtaining the
$(j, k)$th outcome is then $q_{jk} = |\langle \xi \otimes \eta, u_j \otimes v_k \rangle|^2 = |\langle \xi, u_j \rangle|^2 \cdot |\langle \eta, v_k \rangle|^2$. Summing
over $k$ and using $\sum_k |\langle \eta, v_k \rangle|^2 = 1$, we again find that the probability of the $j$th
outcome on the first component is $|\langle \xi, u_j \rangle|^2 = p_j$. This is simply a verification that
the probability distribution $(p_j)$ is the (first) marginal of $(q_{jk})$ and that, moreover,
product vectors lead to product distributions, or to independent random variables.
Another way to express this marginal probability is $p_j = \mathrm{Tr}\,\rho P_{u_j}$, where
$\rho = |\psi\rangle\langle\psi|$, and where $P_u = |u\rangle\langle u| \otimes I_{\mathcal{E}}$ is the orthogonal projection onto the
subspace $u \otimes \mathcal{E}$ of $\mathcal{H} \otimes \mathcal{E}$. This calculation makes perfect sense even if $\psi$ is not a
product vector, and it makes clear that $p_j$ does not depend on, say, the choice of the
basis of $\mathcal{E}$.
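Consider now $\psi \in \mathcal{H} \otimes \mathcal{E}$, which is not a product vector. Let
(3.2)    $\psi = \sum_{i=1}^{r} a_i\, \xi_i \otimes \eta_i$
be its Schmidt decomposition (see Section 2.2.2), necessarily with $r \geq 2$. Since
the $\mathcal{H}$-marginal of $\xi_i \otimes \eta_i$ is $\xi_i$, it is tempting to guess that the $\mathcal{H}$-marginal of $\psi$ is
$\sum_{i=1}^{r} a_i \xi_i$. However, one should immediately become suspicious: for any choice of
(complex) signs $\omega_i$, the vector $\sum_{i=1}^{r} a_i \omega_i \xi_i$ is an equally valid candidate, and while
the state remains unchanged if you multiply a vector by a complex number $\omega$ with
$|\omega| = 1$, it may change radically if you multiply different (non-zero) components
by different numbers. A more careful analysis is needed, and it turns out that the
proper language to describe marginals is that of matrices. In the notation of the
preceding paragraph we have
         $p_j = \mathrm{Tr}\bigl(|\psi\rangle\langle\psi| P_{u_j}\bigr) = \mathrm{Tr}\Bigl[\Bigl(\sum_{i,l=1}^{r} a_i \bar{a}_l\, |\xi_i\rangle\langle\xi_l| \otimes |\eta_i\rangle\langle\eta_l|\Bigr)\bigl(|u_j\rangle\langle u_j| \otimes I_{\mathcal{E}}\bigr)\Bigr]$
         $\phantom{p_j} = \sum_{i,l=1}^{r} a_i \bar{a}_l\, \mathrm{Tr}\bigl[\bigl(|\xi_i\rangle\langle\xi_l|\bigr)\bigl(|u_j\rangle\langle u_j|\bigr)\bigr]\, \mathrm{Tr}\bigl(|\eta_i\rangle\langle\eta_l|\bigr)$
         $\phantom{p_j} = \mathrm{Tr}\Bigl[\Bigl(\sum_{i=1}^{r} |a_i|^2\, |\xi_i\rangle\langle\xi_i|\Bigr) |u_j\rangle\langle u_j|\Bigr]$
(3.3)    $\phantom{p_j} = \langle u_j| \Bigl(\sum_{i=1}^{r} |a_i|^2\, |\xi_i\rangle\langle\xi_i|\Bigr) |u_j\rangle,$
where the third equality uses $\mathrm{Tr}\bigl(|\eta_i\rangle\langle\eta_l|\bigr) = \delta_{il}$, the vectors $\eta_i$ being orthonormal.
In other words, the probability that a measurement performed in a basis $(u_j)$ yields
the $j$th outcome is $\langle u_j|\rho_{\mathcal{H}}|u_j\rangle = \mathrm{Tr}\bigl(\rho_{\mathcal{H}} |u_j\rangle\langle u_j|\bigr)$, where
(3.4)    $\rho_{\mathcal{H}} = \sum_{i=1}^{r} |a_i|^2\, |\xi_i\rangle\langle\xi_i|.$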
So the mixed state $\rho_{\mathcal{H}}$ fits the role of the $\mathcal{H}$-marginal of the “global” state
$\rho = \rho_{\mathcal{H}\mathcal{E}} = |\psi\rangle\langle\psi|$. Therefore, while in principle the state of a quantum system is described
by a vector (or a rank one projection, or an element of a projective space, or a
wave function), i.e., by a pure state, we seldom, if ever, will be able to perform a
measurement in a global basis, and we therefore have to rely on mixed states for
modeling such systems. To use the Platonic analogy, a mixed state is “the shadow
on the wall” of our cave, comprising all the features of the “idea” (or “form”) $\psi$ that
are accessible to our perception.
A more heuristic explanation of the formula for the marginal is that from the
perspective of $\mathcal{H}$ the state of our system is $\xi_i$ with probability $p_i = |a_i|^2$, and so we
need to compute the weighted average of probabilities corresponding to $\rho = |\xi_i\rangle\langle\xi_i|$.
Since the expression $\mathrm{Tr}\bigl(\rho\, |u\rangle\langle u|\bigr)$ is linear in $\rho$, the average can be performed inside
the trace, whence the formula for $\rho_{\mathcal{H}}$. We encourage readers who are not used to
the bra-ket formalism to work out the details of several variants of this calculation
outlined in Exercise 3.1.
The key features of the marginal $\rho_{\mathcal{H}}$ are that it is canonical (for example, it
does not depend on the basis $(u_j)$ of $\mathcal{H}$ in which the measurement is performed)
and that it encodes all the information that can be obtained about the global state
by measurements inside $\mathcal{H}$. In particular, if $\rho_{\mathcal{H}}$ is truly mixed (i.e., not pure, with
$r \geq 2$ in (3.2) or in (3.4)), then there are no measurements inside $\mathcal{H}$ that are
deterministic.
A simple but spectacular demonstration of this phenomenon is provided by the Bell states
on $\mathbb{C}^2 \otimes \mathbb{C}^2$: $\rho = |\psi\rangle\langle\psi|$ with $\psi$ being (for example) one of the four Bell vectors
         $\varphi^{\pm} = \frac{1}{\sqrt{2}} \bigl(|00\rangle \pm |11\rangle\bigr), \qquad \psi^{\pm} = \frac{1}{\sqrt{2}} \bigl(|01\rangle \pm |10\rangle\bigr),$
where $|0\rangle, |1\rangle$ is the canonical basis of $\mathbb{C}^2$ (recall that $|00\rangle$ stands for $|0\rangle \otimes |0\rangle$). It
is easily seen that in each case the marginal of $\rho$ on either $\mathbb{C}^2$ factor is
$(|0\rangle\langle 0| + |1\rangle\langle 1|)/2 = I/2$. Consequently, when measuring in any basis $(u_1, u_2)$ (of, say, the
first factor), each of the two outcomes occurs with probability $1/2$, and so the
results of such measurements, in and of themselves, tell us nothing. In particular,
they cannot help us distinguish between $\varphi^{+}$, $\varphi^{-}$, $\psi^{+}$, $\psi^{-}$, even though a global
measurement performed in the basis consisting of these four vectors would tell
them apart perfectly.
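The claim about the marginals of the Bell states can be verified directly. The sketch below is an illustration only: it stores a vector of $\mathbb{C}^2 \otimes \mathbb{C}^2$ as a flat list (an assumed convention, with `psi[j * d_e + k]` the coefficient of $|j\rangle \otimes |k\rangle$) and the helper name `marginal_first` is hypothetical.

```python
import math

def marginal_first(psi, d_h, d_e):
    """Marginal on the first factor of a vector psi in C^{d_h} (x) C^{d_e}.

    Entry (j, l) is sum_k psi[(j, k)] * conj(psi[(l, k)]).
    """
    return [[sum(psi[j * d_e + k] * psi[l * d_e + k].conjugate() for k in range(d_e))
             for l in range(d_h)] for j in range(d_h)]

s = 1 / math.sqrt(2)
bell = {
    "phi+": [s, 0, 0, s],
    "phi-": [s, 0, 0, -s],
    "psi+": [0, s, s, 0],
    "psi-": [0, s, -s, 0],
}
marginals = {name: marginal_first([complex(c) for c in v], 2, 2)
             for name, v in bell.items()}        # each is close to I/2

# For contrast, a product vector |0> (x) (|0> + |1>)/sqrt 2 has a pure marginal.
product_marginal = marginal_first([complex(c) for c in [s, s, 0, 0]], 2, 2)  # |0><0|
```

All four entries of `marginals` agree, which is the numerical form of the statement that local measurements cannot tell the four Bell states apart.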

Exercise 3.1. Perform alternative calculations of the probabilities from (3.3)
according to the following outline. Consider a product basis $(u_j \otimes v_k)_{j,k}$ of $\mathcal{H} \otimes \mathcal{E}$.
If $\rho = |\psi\rangle\langle\psi|$ with $\psi = \sum_{i=1}^{r} a_i\, \xi_i \otimes \eta_i$, the probability of the $(j, k)$th outcome will
be, by (3.1),
         $q_{jk} = \Bigl|\Bigl\langle \sum_{i=1}^{r} a_i\, \xi_i \otimes \eta_i,\ u_j \otimes v_k \Bigr\rangle\Bigr|^2 = \mathrm{Tr}\,\rho\,\bigl(|u_j\rangle\langle u_j| \otimes |v_k\rangle\langle v_k|\bigr).$
Finally, retrieve $p_j = \sum_k q_{jk}$ by expanding either the second or the third expression
in the above.

3.4. The partial trace; purification of mixed states


The discussion in the previous section shows that, in some cases, a natural way
of modeling the state of a subsystem of a quantum system is to consider operators
rather than unit vectors. An elegant way to describe quantum marginals is via the
concept of partial trace, which is defined as follows (see also Section 2.2.1). First,
for any operator (self-adjoint or not) on a composite Hilbert space $\mathcal{H} \otimes \mathcal{E}$ which is
a tensor product of operators, we define its partial trace with respect to $\mathcal{E}$ as
         $\mathrm{Tr}_{\mathcal{E}}(\sigma \otimes \tau) = \mathrm{Tr}(\tau)\,\sigma.$
Next, we extend this operation to all operators by linearity (which is possible
because of the universal property of the tensor product). Clearly, if $\xi \in \mathcal{H}$, $\eta \in \mathcal{E}$ are
unit vectors, then
         $\mathrm{Tr}_{\mathcal{E}}\bigl(|\xi \otimes \eta\rangle\langle\xi \otimes \eta|\bigr) = \mathrm{Tr}_{\mathcal{E}}\bigl(|\xi\rangle\langle\xi| \otimes |\eta\rangle\langle\eta|\bigr) = |\xi\rangle\langle\xi|.$
Similarly, if $\psi = \sum_{i=1}^{r} a_i\, \xi_i \otimes \eta_i$ is a Schmidt decomposition, then
         $\mathrm{Tr}_{\mathcal{E}}\bigl(|\psi\rangle\langle\psi|\bigr) = \sum_{i=1}^{r} |a_i|^2\, |\xi_i\rangle\langle\xi_i|.$
In other words, $\mathrm{Tr}_{\mathcal{E}}(\rho) = \rho_{\mathcal{H}}$, the $\mathcal{H}$-marginal of $\rho$ defined by (3.4). The notation
may be a little confusing since in order to find the $\mathcal{H}$-marginal we need to
calculate the partial trace with respect to $\mathcal{E}$, but it is generally accepted. It simply
corresponds to the following fact from elementary probability: given two random
variables $X, Y$ with joint density $f(x, y)$, the marginal density of $X$ is obtained by
integrating $f$ with respect to $y$.
Another point which needs to be clarified is that the set of mixed states on $\mathcal{H}$
that may be obtained as $\mathcal{H}$-marginals of pure states on composite systems $\mathcal{H} \otimes \mathcal{E}$
(for some auxiliary space $\mathcal{E}$) is exactly the set $\mathrm{D}(\mathcal{H})$ of positive semi-definite trace
one operators (usually referred to as density matrices, particularly if $\mathcal{H} = \mathbb{C}^d$). This
is the consequence of the following computation: if $\rho \in \mathrm{D}(\mathcal{H})$, and $\rho = \sum_i \lambda_i |\xi_i\rangle\langle\xi_i|$
is its spectral decomposition, then choosing $\mathcal{E} = \mathcal{H}$ and $\psi = \sum_i \sqrt{\lambda_i}\, \xi_i \otimes \xi_i$ ensures
that $\mathrm{Tr}_{\mathcal{E}}\bigl(|\psi\rangle\langle\psi|\bigr) = \rho$. We say that $|\psi\rangle\langle\psi|$ (or simply $\psi$) is a purification of $\rho$.
Clearly, the Schmidt rank of $\psi$ (always) equals $\operatorname{rank} \rho =: r$. Moreover, the minimal
dimension of $\mathcal{E}$ for which a purification of $\rho$ exists in $\mathcal{H} \otimes \mathcal{E}$ is also equal to $r$. Even
though this construction is abstract, it is canonical in the following sense: if $\rho$ is
a physical state on $\mathcal{H}$ that is the $\mathcal{H}$-marginal of a physical pure state $\psi \in \mathcal{H} \otimes \mathcal{E}$
(where $\mathcal{E}$ is the environment relative to $\mathcal{H}$), then we must have $\psi = \sum_{i=1}^{r} \sqrt{\lambda_i}\, \xi_i \otimes \eta_i$
for some basis $(\eta_i)$ of $\mathcal{E}$. (The only catch is that $(\eta_i)$ may not be the most natural
basis of $\mathcal{E}$.)
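The purification recipe $\psi = \sum_i \sqrt{\lambda_i}\, \xi_i \otimes \xi_i$ can be tested on a small example. This is a sketch under stated assumptions: the spectral data (eigenvalues `0.75`, `0.25` and the two eigenvectors) are made up for the illustration and supplied by hand rather than computed, and the helper names are hypothetical.

```python
import math

def marginal_first(psi, d_h, d_e):
    """Partial trace over the second factor of |psi><psi| (psi stored as a flat list)."""
    return [[sum(psi[j * d_e + k] * psi[l * d_e + k].conjugate() for k in range(d_e))
             for l in range(d_h)] for j in range(d_h)]

def purification(eigenvalues, eigenvectors):
    """psi = sum_i sqrt(lambda_i) xi_i (x) xi_i, as a flat list of length d * d."""
    d = len(eigenvectors[0])
    psi = [0j] * (d * d)
    for lam, xi in zip(eigenvalues, eigenvectors):
        for j in range(d):
            for k in range(d):
                psi[j * d + k] += math.sqrt(lam) * xi[j] * xi[k]
    return psi

s = 1 / math.sqrt(2)
eigenvalues = [0.75, 0.25]
eigenvectors = [[s + 0j, s * 1j], [s + 0j, -s * 1j]]   # an orthonormal basis of C^2

d = 2
# rho = sum_i lambda_i |xi_i><xi_i|, here [[0.5, -0.25i], [0.25i, 0.5]]
rho = [[sum(lam * xi[j] * xi[l].conjugate() for lam, xi in zip(eigenvalues, eigenvectors))
        for l in range(d)] for j in range(d)]

psi = purification(eigenvalues, eigenvectors)
rho_recovered = marginal_first(psi, d, d)               # agrees with rho up to rounding
```

Note that no complex conjugate is needed on the second tensor factor: orthonormality of the $\xi_i$ alone makes the cross terms vanish under the partial trace.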

3.5. Unitary evolution and quantum operations; the completely
positive maps
As mentioned earlier, the evolution of a quantum system is unitary, i.e., if
$t_0 < t_1$, then there is a unitary operator $U$ such that if the state of the system at
time $t_0$ (the initial state) is described by a vector $\psi$ (which is a priori general and/or
unknown), then its state at time $t_1$ (the terminal state) will be $U\psi$. ($U$ depends
on the physical laws governing the evolution, and we may be able to control some
of its parameters, but it is independent of $\psi$.) If we switch to the language of
density matrices, the formula $\psi \mapsto U\psi$ becomes $|\psi\rangle\langle\psi| = \rho \mapsto U \rho\, U^{\dagger}$. (These are
the unitary channels defined in Section 2.3.4.)
We now want to understand how the formalism needs to be adapted to describe
subsystems, i.e., when we pass to the more general context of mixed states. Assume
that our evolution operator $U$ acts on a composite space $\mathcal{H} \otimes \mathcal{E}$ and, to begin
with, takes the form $V \otimes W$, where $V$ and $W$ are unitary operators on $\mathcal{H}$ and
$\mathcal{E}$ respectively. If $\psi = \xi \otimes \eta$ is also a product vector, then the evolution of the
subsystem $\mathcal{H}$ is clearly given by $\xi \mapsto V\xi$, or by $\sigma \mapsto V \sigma V^{\dagger}$ in the language of
density matrices. The latter formula remains valid if $\psi \in \mathcal{H} \otimes \mathcal{E}$ is an arbitrary
(unit) vector, and $\sigma = \mathrm{Tr}_{\mathcal{E}}\bigl(|\psi\rangle\langle\psi|\bigr)$ is the corresponding $\mathcal{H}$-marginal. (This follows
from the identity $V\, \mathrm{Tr}_{\mathcal{E}}(\rho)\, V^{\dagger} = \mathrm{Tr}_{\mathcal{E}}(U \rho\, U^{\dagger})$, valid for $U = V \otimes W$ and for any
matrix $\rho$.)
The situation becomes more complicated in a case where the evolution of the
subsystem $\mathcal{H}$ and the environment $\mathcal{E}$ are not decoupled, i.e., where $U$ is not a
product of two unitaries. Even if the initial state of the system is a product vector
$\psi = \xi \otimes \eta$, there is no reason why the terminal state $U\psi$, which can a priori be
arbitrary, should be of that form. In other words, even if the initial $\mathcal{H}$-marginal
$\sigma = |\xi\rangle\langle\xi|$ is pure, the terminal marginal may be mixed. In particular, the evolution
of the marginal is not necessarily unitary. Moreover, for fixed $\xi$, different values
of the initial $\mathcal{E}$-marginal $\eta$ may result in radically different values of the terminal
$\mathcal{H}$-marginal.
However, this is neither surprising nor fatal. First, if there is interaction
between our subsystem $\mathcal{H}$ and the environment $\mathcal{E}$, it is to be expected that the terminal
state of $\mathcal{H}$ possibly depends on the state of $\mathcal{E}$. Second, while we may not know what
the initial state of $\mathcal{E}$ is, we can simply think of it as an external parameter affecting
the evolution of our subsystem $\mathcal{H}$, which is the only one we can manipulate, control
and measure.
We now want to come up with a formula that generalizes the unitary evolution
$\rho \mapsto U \rho\, U^{\dagger}$ or, more precisely, that is the “shadow on the wall of our cave” of the
unitary evolution. Let us start again with the global initial state being a product
vector $\psi = \xi \otimes \eta$; the terminal state is then represented by the vector $U(\xi \otimes \eta)$.
Since $\eta$ is assumed to be fixed, we can omit the dependence on $\eta$ in the description
and simply talk about an (a priori arbitrary) isometry $\xi \mapsto V\xi \in \mathcal{H} \otimes \mathcal{E}$. (Of course,
since by definition $V\xi = U(\xi \otimes \eta)$, $V$ does implicitly depend on $\eta$.) In the language
of density matrices, the evolution of the $\mathcal{H}$-marginal is then given by
(3.5)    $\sigma \mapsto \mathrm{Tr}_{\mathcal{E}}\, V \sigma V^{\dagger},$
where $\sigma = |\xi\rangle\langle\xi|$ is the initial marginal (cf. Theorem 2.24).
If we want to give a description of the evolution that is intrinsic to $\mathcal{H}$, we may
proceed as follows. Let $(v_i)$ be an orthonormal basis of $\mathcal{E}$. The isometry $V$ can be
represented as $V\xi = \sum_i (A_i \xi) \otimes v_i$ for some operators $A_i \in B(\mathcal{H})$. Consequently,
         $V \sigma V^{\dagger} = \sum_{i,j} |A_i \xi\rangle\langle A_j \xi| \otimes |v_i\rangle\langle v_j| = \sum_{i,j} \bigl(A_i |\xi\rangle\langle\xi| A_j^{\dagger}\bigr) \otimes |v_i\rangle\langle v_j|$
and further,
         $\mathrm{Tr}_{\mathcal{E}}\, V \sigma V^{\dagger} = \sum_{i,j} A_i |\xi\rangle\langle\xi| A_j^{\dagger}\ \mathrm{Tr}\bigl(|v_i\rangle\langle v_j|\bigr) = \sum_i A_i |\xi\rangle\langle\xi| A_i^{\dagger}.$
Accordingly, an alternative description of the evolution is
(3.6)    $\sigma \mapsto \sum_i A_i \sigma A_i^{\dagger}.$
This is a description intrinsic to $\mathcal{H}$, since $A_i \in B(\mathcal{H})$. Moreover, according to
Choi’s Theorem (Theorem 2.21), the evolution described by (3.5)–(3.6) is given by
a completely positive map on $B(\mathcal{H})$. The operators $A_i$ aren’t completely arbitrary,
since the resulting map $\xi \mapsto V\xi = \sum_i (A_i \xi) \otimes v_i$ needs to be an isometry. For this
to happen we must have, for every $\xi \in \mathcal{H}$,
         $\langle \xi, \xi \rangle = \langle V\xi, V\xi \rangle = \sum_{i,j} \langle A_i \xi, A_j \xi \rangle \langle v_i, v_j \rangle = \sum_i \langle A_i \xi, A_i \xi \rangle = \langle \xi| \sum_i A_i^{\dagger} A_i |\xi\rangle.$
Given that for self-adjoint operators $A, B \in B(\mathcal{H})$ the condition $\langle\xi|A|\xi\rangle = \langle\xi|B|\xi\rangle$
for all $\xi \in \mathcal{H}$ implies $A = B$, it follows that $V$ being an isometry is equivalent to
(3.7)    $\sum_i A_i^{\dagger} A_i = I_{\mathcal{H}},$
which in turn (see Remark 2.23) is equivalent to the map given by (3.6) being
trace-preserving. This should not come as a surprise, since we want the evolution
equation to map density matrices to density matrices, which for linear evolutions
is equivalent to preserving the trace.
To summarize, under the hypothesis of unitary evolution of the global system
$\mathcal{H} \otimes \mathcal{E}$, the relationship $\sigma \mapsto \Phi(\sigma)$ between the initial state $\sigma$ of subsystem $\mathcal{H}$
(the initial $\mathcal{H}$-marginal) and its terminal state $\Phi(\sigma)$ is described by a completely
positive trace-preserving (CPTP) map $\Phi$ acting on $B(\mathcal{H})$. CPTP maps are also
called quantum channels.
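The passage from a global unitary $U$ and a fixed $\eta$ to the operators $A_i$ of (3.6) can be made concrete. In the sketch below the choice of $U$ (a CNOT gate with $\mathcal{H}$ as control and $\mathcal{E}$ as target), the choice $\eta = |0\rangle$, the flat indexing `j * d_e + k` and all function names are assumptions made for the illustration; $(v_i)$ is taken to be the computational basis of $\mathcal{E}$.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kraus_operators(u, eta, d_h, d_e):
    """Operators A_i on H with U(xi (x) eta) = sum_i (A_i xi) (x) v_i.

    Rows and columns of u are indexed by j * d_e + k; (v_i) is the computational basis.
    """
    return [[[sum(u[j * d_e + i][jp * d_e + k] * eta[k] for k in range(d_e))
              for jp in range(d_h)] for j in range(d_h)] for i in range(d_e)]

def apply_channel(kraus, sigma):
    """sigma -> sum_i A_i sigma A_i^dagger, as in (3.6)."""
    out = [[0j] * len(sigma) for _ in sigma]
    for a in kraus:
        out = add(out, matmul(matmul(a, sigma), dagger(a)))
    return out

# Illustrative global unitary: CNOT, which is not of the form V (x) W.
cnot = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]
eta = [1, 0]                                   # initial state |0> of E

kraus = kraus_operators(cnot, eta, 2, 2)       # A_0 = |0><0|, A_1 = |1><1|

completeness = [[0, 0], [0, 0]]
for a in kraus:
    completeness = add(completeness, matmul(dagger(a), a))   # identity, cf. (3.7)

sigma = [[0.5, 0.5], [0.5, 0.5]]               # the pure state |+><+|
terminal = apply_channel(kraus, sigma)         # I/2, a mixed state
```

In this example a pure initial marginal evolves into the maximally mixed state, which illustrates why the evolution of the marginal cannot be unitary in general.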
We derived the above characterization of quantum evolution maps $\Phi$ under the
assumption that the initial global state was given by a product vector $\psi = \xi \otimes \eta$, with
the map $\Phi$ on $B(\mathcal{H})$ depending on the (a priori unknown, but specific) $\mathcal{E}$-marginal described
by $\eta$. One could ask whether a similar (or some other) characterization can be
derived in a more general case where the initial state, while still a vector, is no
longer separable. However, there appears to be no straightforward way to produce
a canonical map in that setting. One natural approach would be to try to associate
an evolution map $\Phi : B(\mathcal{H}) \to B(\mathcal{H})$, acting in a consistent manner on $\mathcal{H}$-marginals,
to a given global unitary evolution induced by $U$ and a given $\mathcal{E}$-marginal $\tau \in B(\mathcal{E})$.
However, while knowing $\mathcal{H}$- and $\mathcal{E}$-marginals of a pure state tells us a lot about
the structure of that state, it still leaves a lot of uncertainty. For example, $\mathcal{H}$- and
$\mathcal{E}$-marginals of all four Bell states $\varphi^{+}$, $\varphi^{-}$, $\psi^{+}$, $\psi^{-}$ on $\mathbb{C}^2 \otimes \mathbb{C}^2$ are identical: they are
maximally mixed states $\frac{1}{2} I_{\mathbb{C}^2}$. On the other hand, in the absence of some strong
restrictions on the form of the global unitary evolution $U$, there is no reason to
expect the $\mathcal{H}$-marginals of $U\varphi^{+}$, $U\varphi^{-}$, $U\psi^{+}$, $U\psi^{-}$ to be the same. (In fact, various
quantum algorithms exploit the fact that those marginals may be quite different.)
In other words, such a map $\Phi$ cannot be consistently defined.
In physics texts this characterization, and specifically the postulate of complete
positivity, is usually arrived at in a somewhat different way. First, it is noted that
a quantum evolution map (or a quantum operation) $\Phi : B(\mathcal{H}) \to B(\mathcal{H})$ should
map density matrices to density matrices. Under the assumption of linearity, this
is equivalent to $\Phi$ being positive and trace-preserving (see Section 2.3.2). Second,
when $\Phi$ is coupled with an identity map on the environment $\mathcal{E}$, then the resulting
map $\Phi \otimes \mathrm{Id}_{B(\mathcal{E})}$ should also be an allowed quantum operation and in particular,
it should be positive. If $\dim \mathcal{E}$ is at least as large as $\dim \mathcal{H}$, this is equivalent
to complete positivity of $\Phi$. The argument presented earlier in this section is
substantially more involved, but seems to us more physically natural (and less
formal).

3.6. Other measurement schemes


Throughout our discussion we assumed that a measurement is performed in
some basis $(u_j)$ of the entire space, or of the space corresponding to the accessible
subsystem, with the probability of the $j$th outcome being either $|\langle \psi, u_j \rangle|^2$ or
$\langle u_j|\rho|u_j\rangle = \mathrm{Tr}\bigl(\rho\, |u_j\rangle\langle u_j|\bigr)$ (depending on whether the state of the system is pure
or mixed). A slightly more general scheme is that of a projective measurement,
where the measuring apparatus is modeled by a sequence of mutually orthogonal
projections $(P_i)$ and the probability of the $i$th outcome is
(3.8)    $|P_i \psi|^2 = \langle\psi|P_i|\psi\rangle = \mathrm{Tr}\,\rho P_i.$
However, this is barely more general: we can think of the instrument as being
related to a basis $(u_j)$, but as providing only a coarse-grained view, where some of
the basis elements $u_j$ are merged into one projection $P_i$.
A more substantive generalization is derived from basis/projective measurements
in much the same way as CPTP maps were derived from unitary operations.
Suppose that a projective measurement $(P_i)$ on $\mathcal{H} \otimes \mathcal{E}$ (rank one or not) is
performed and consider the effects of applying it to a product state $\psi = \xi \otimes \eta$. The
probability of the $i$th outcome is then
(3.9)    $p_i = \langle\psi|P_i|\psi\rangle = \mathrm{Tr}\bigl(|\psi\rangle\langle\psi| P_i\bigr) = \mathrm{Tr}\bigl((|\xi\rangle\langle\xi| \otimes |\eta\rangle\langle\eta|) P_i\bigr) = \mathrm{Tr}\bigl(|\xi\rangle\langle\xi|\ \mathrm{Tr}_{\mathcal{E}}\bigl[(I \otimes |\eta\rangle\langle\eta|) P_i\bigr]\bigr).$
In the last equality we used the identity
(3.10)   $\mathrm{Tr}\bigl((\tau \otimes I) X\bigr) = \mathrm{Tr}\bigl(\tau\, \mathrm{Tr}_{\mathcal{E}} X\bigr),$
which is easily verified if $X$ is a product operator and follows by linearity for
arbitrary $X$. In other words, there are operators $(M_i)$ on $\mathcal{H}$, namely
$M_i = \mathrm{Tr}_{\mathcal{E}}\bigl[(I \otimes |\eta\rangle\langle\eta|) P_i\bigr]$, such that
(3.11)   $p_i = \mathrm{Tr}\bigl(|\xi\rangle\langle\xi| M_i\bigr).$
Varying $\xi$ and using the fact that $\sum_i P_i = I_{\mathcal{H} \otimes \mathcal{E}}$ we deduce that
(3.12)   $\sum_i M_i = I_{\mathcal{H}}$
and that $M_i$ is positive for each $i$. Even though Born’s rule (3.11) was derived for
a pure state $\rho = |\xi\rangle\langle\xi|$, it extends by linearity to a general (possibly mixed)
state $\rho$ on $\mathcal{H}$ via the formula
(3.13)   $p_i = \mathrm{Tr}(\rho M_i).$
A system $(M_i)$ of positive operators satisfying the condition (3.12) is called a positive
operator-valued measure (POVM) and the associated measurement scheme a POVM measurement.
The reason for invoking the term “measure” is that there are also continuous
variants, namely operator-valued measures integrating to identity.
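The derivation of a POVM from a projective measurement on $\mathcal{H} \otimes \mathcal{E}$ can be replayed numerically. The sketch below assumes, purely for illustration, that the global measurement is the one in the Bell basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$ and that $\eta = (0.6, 0.8i)$; it uses the entrywise form $M_{jl} = \sum_{k,m} \bar{\eta}_k\, P_{(j,k),(l,m)}\, \eta_m$ of $\mathrm{Tr}_{\mathcal{E}}[(I \otimes |\eta\rangle\langle\eta|) P]$, with hypothetical helper names.

```python
import math

def povm_element(p, eta, d_h, d_e):
    """M = Tr_E[(I (x) |eta><eta|) P], computed entrywise."""
    return [[sum(eta[k].conjugate() * p[j * d_e + k][l * d_e + m] * eta[m]
                 for k in range(d_e) for m in range(d_e))
             for l in range(d_h)] for j in range(d_h)]

def projector(v):
    """Rank-one projection |v><v| for a unit vector v."""
    return [[a * b.conjugate() for b in v] for a in v]

s = 1 / math.sqrt(2)
bell_basis = [[s, 0, 0, s], [s, 0, 0, -s], [0, s, s, 0], [0, s, -s, 0]]
projections = [projector([complex(c) for c in v]) for v in bell_basis]

eta = [0.6 + 0j, 0.8j]                      # an arbitrary unit vector in E
povm = [povm_element(p, eta, 2, 2) for p in projections]

# (3.12): the operators M_i sum to the identity on H (up to rounding).
total = [[sum(m[j][l] for m in povm) for l in range(2)] for j in range(2)]

# (3.11): outcome probabilities for the pure state xi = (|0> + |1>)/sqrt 2.
xi = [s + 0j, s + 0j]
probabilities = [sum(xi[j].conjugate() * m[j][l] * xi[l]
                     for j in range(2) for l in range(2)).real for m in povm]
```

In this example each $M_i$ has rank one and trace $1/2$, so none of them is a projection: a qubit measurement with four outcomes arises that is not a projective measurement on $\mathcal{H}$ alone.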
3.7. Local operations
This short section aims at explaining the meaning of the word “local,” which is
often used in quantum information theory. Up to now we have focused on a Hilbert
space denoted $\mathcal{H}$. Moreover, the standard framework of quantum information
theory assumes that $\mathcal{H}$ is endowed with a tensor decomposition $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ (or a
multipartite variant), where $\mathcal{H}_A$ is the Hilbert space of Alice’s system and $\mathcal{H}_B$ is
the Hilbert space of Bob’s system. The usual assumption is that Alice and Bob are
surrogates for two distant experimentalists who share a quantum system $\mathcal{H}$.
In this context, operations that can be performed “privately” by Alice and Bob
are called local operations. For example, local unitaries on $\mathcal{H}$ are unitary operators
of the form $U = U_A \otimes U_B$, where $U_A$ (resp., $U_B$) is a unitary operator on $\mathcal{H}_A$ (resp.,
on $\mathcal{H}_B$). Similarly, local POVMs on $\mathcal{H}$ are of the form $(M_i \otimes N_j)$, where $(M_i)$ is
a POVM on $\mathcal{H}_A$ and $(N_j)$ is a POVM on $\mathcal{H}_B$. A local channel $\Phi : B(\mathcal{H}) \to B(\mathcal{H})$
is of the form $\Phi_A \otimes \Phi_B$, where $\Phi_A : B(\mathcal{H}_A) \to B(\mathcal{H}_A)$ and $\Phi_B : B(\mathcal{H}_B) \to B(\mathcal{H}_B)$ are
quantum channels.
A related concept is the class of LOCC operations, which are obtained by
combining Local Operations with Classical Communication between Alice and Bob.
The precise mathematical definition of LOCC operations is actually quite intricate
(see Section XI in [HHHH09]). We consider some aspects of LOCC operations in
Section 12.2.1.

3.8. Spooky action at a distance


We conclude this chapter by presenting a baby version of Einstein’s “spooky
action at a distance” consequence of a quantum description of the physical reality.
Suppose that each of two distant experimentalists, Alice and Bob, has in their
lab a particle that they can locally measure, and that each particle can be in one
of two possible states, $|0\rangle$ or $|1\rangle$. Suppose further that, as a system, the two
particles are in the Bell state $\psi^{+} = \frac{1}{\sqrt{2}} \bigl(|01\rangle + |10\rangle\bigr)$ (on the Hilbert space
$\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B = \mathbb{C}^2 \otimes \mathbb{C}^2$). As described in Section 3.3, independently of the
choice of measurement bases in $\mathcal{H}_A$ and $\mathcal{H}_B$, both outcomes of Alice’s (resp., Bob’s)
measurement will be equally likely. However, some combinations of the outcomes
are more likely than others. For example, suppose that each of them performs
the measurement in their computational basis $(|0\rangle, |1\rangle)$, which, in the terminology
of Section 3.7, corresponds to a local POVM with $(M_i) = (N_j) = (|0\rangle\langle 0|, |1\rangle\langle 1|)$.
Table 3.1 shows the resulting joint probability distribution.

Table 3.1. Joint probability distribution of Alice’s and Bob’s
measurement outcomes.

                         Bob: $|0\rangle$    Bob: $|1\rangle$
    Alice: $|0\rangle$        $0$             $1/2$
    Alice: $|1\rangle$        $1/2$           $0$

Note that Alice’s and Bob’s outcomes are always different. This is not immediately
fatal as it may just be the case that (perhaps because of some conservation law in
their interaction in the past) the two particles are in opposite states, we just don’t
know which. However, on further reflection, this indicates that either the description
of the reality given by $\psi^{+}$ is incomplete, with some other hidden variable controlling
the outcomes of measurements, or that the fact of Alice’s performing her experiment
instantaneously affects the particle that is in Bob’s possession.
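The joint distribution of Table 3.1 follows from the formula $q_{jk} = |\langle \psi, u_j \otimes v_k \rangle|^2$ of Section 3.3. The following sketch computes it in plain Python; the function name and the flat indexing `psi[2 * a + b]` are assumptions of the illustration, and the second computation (both parties measuring in the basis $(|0\rangle \pm |1\rangle)/\sqrt{2}$) is an extra example that does not appear in the text.

```python
import math

def joint_distribution(psi, basis_a, basis_b):
    """q[j][k] = |<psi, u_j (x) v_k>|^2 for psi in C^2 (x) C^2 stored as psi[2 * a + b]."""
    def amplitude(u, v):
        return sum(psi[2 * a + b] * (u[a] * v[b]).conjugate()
                   for a in range(2) for b in range(2))
    return [[abs(amplitude(u, v)) ** 2 for v in basis_b] for u in basis_a]

s = 1 / math.sqrt(2)
psi_plus = [0j, s + 0j, s + 0j, 0j]                       # (|01> + |10>)/sqrt 2
computational = [[1 + 0j, 0j], [0j, 1 + 0j]]
diagonal = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]

# Rows: Alice's outcome, columns: Bob's outcome; about [[0, 0.5], [0.5, 0]].
table_3_1 = joint_distribution(psi_plus, computational, computational)

# About [[0.5, 0], [0, 0.5]]: in this basis the two outcomes always agree.
rotated = joint_distribution(psi_plus, diagonal, diagonal)
```

In both cases each party individually sees a fair coin, while the pair of outcomes is perfectly correlated (or anti-correlated), whatever the distance between the two laboratories.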
Moreover, this phenomenon is just a harbinger of more involved schemes, but
based on very similar principles, which lead to effects that cannot be explained by
a hidden variable model, and to phenomena such as pseudotelepathy or quantum
teleportation. We will briefly explore some of these examples later on, mostly in
Chapter 11.

Exercise 3.2. Verify the details of the calculation of probabilities in Table 3.1.
lu

Notes and Remarks


There are many books which present quantum mechanics for specific audiences.
In addition to the references given at the end of Chapter 2, we point out [Mer07]
(mostly directed at computer scientists) and [RP11]. Other references targeting
mathematicians are [Tak08] and [Sha08].

Part 2

Banach and his Spaces

Asymptotic Geometric Analysis Miscellany

CHAPTER 4

More Convexity

This chapter focuses on concepts, invariants and operations related to
finite-dimensional convex bodies. The primary objectives are to be able to describe,
tell apart, and measure the size of such bodies. While some of the results are
relatively new, they all have roots in classical convex geometry and, most notably,
in the work of Hermann Minkowski in the late 19th and early 20th century. Other,
more modern aspects of the theory of convex bodies will be addressed in Chapters
5 and 7.

4.1. Basic notions and operations

4.1.1. Distances between convex sets. A natural way to quantify how
different two subsets of a metric space are is the Hausdorff distance. When we
consider convex bodies $K, L \subset \mathbb{R}^n$ containing the origin in their interiors, and
identified when related by a homothetic transformation, a more relevant notion is
often their geometric distance, defined as
(4.1)    $d_g(K, L) = \inf\{\alpha\beta : \alpha, \beta > 0,\ K \subset \alpha L,\ L \subset \beta K\}.$
Equivalently,
         $d_g(K, L) = \sup_{x \in \mathbb{R}^n,\, x \neq 0} \frac{\|x\|_K}{\|x\|_L} \times \sup_{x \in \mathbb{R}^n,\, x \neq 0} \frac{\|x\|_L}{\|x\|_K}.$
This “distance” satisfies the multiplicative version of the triangle inequality
         $d_g(K, M) \leq d_g(K, L)\, d_g(L, M).$
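As a quick sanity check of the two formulas for $d_g$, here is a worked example that is not taken from the text: take $K = B_1^n$ and $L = B_\infty^n$, the unit balls of the $\ell_1$ and $\ell_\infty$ norms on $\mathbb{R}^n$, so that $\|x\|_K = \|x\|_1$ and $\|x\|_L = \|x\|_\infty$.

```latex
\[
  \sup_{x \neq 0} \frac{\|x\|_1}{\|x\|_\infty} = n
  \quad (\text{attained at } x = (1, \dots, 1)),
  \qquad
  \sup_{x \neq 0} \frac{\|x\|_\infty}{\|x\|_1} = 1
  \quad (\text{attained at } x = e_1),
\]
\[
  \text{so that} \qquad d_g(B_1^n, B_\infty^n) = n \times 1 = n .
\]
```

This agrees with (4.1): the inclusions $B_1^n \subset \alpha B_\infty^n$ and $B_\infty^n \subset \beta B_1^n$ hold exactly when $\alpha \geq 1$ and $\beta \geq n$, and the smallest product $\alpha\beta$ is $n$.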


If we want to consider the family of $n$-dimensional convex bodies up to affine
transformations, the proper tool is the Banach–Mazur distance
(4.2)    $d_{BM}(K, L) = \inf\{d_g(K + a, TL + b) : T \in GL(n, \mathbb{R}),\ a, b \in \mathbb{R}^n\}.$
In the case where $K$ and $L$ are symmetric (i.e., 0-symmetric), which is the setting
most frequently encountered in the literature, we can restrict the infimum in (4.2)
to $a = b = 0$. In either case, we are led to a compact set (see Exercise 4.3),
usually called the Banach–Mazur compactum (or Minkowski compactum). As a
consequence of the compactness, whenever a (reasonable) functional $f(K)$ defined
on convex bodies in $\mathbb{R}^n$ (or on symmetric convex bodies) has the property that
it is affine-invariant, it attains its extreme values on specific equivalence classes of
convex bodies. It is sometimes challenging to identify those extremal bodies.
Exercise 4.1 (Two hyperplane sections are close). Let $H \subset \mathbb{R}^n$ be a
hyperplane, and $K, L$ two symmetric convex bodies such that $K \cap H = L \cap H$. Show
that $d_{BM}(K, L) \leq C$ for some absolute constant $C$. Deduce that if $H_1, H_2$ are two
linear hyperplanes, then $d_{BM}(K \cap H_1, K \cap H_2) \leq C$. (We tacitly identify $H_1$ and
$H_2$ with $\mathbb{R}^{n-1}$.)

Exercise 4.2 (Boundedness of the space of convex bodies). Let $K \subset \mathbb{R}^n$ be a
convex body and let $\Delta$ be a simplex of largest volume contained in $K$. Show that
if 0 is the centroid of $\Delta$, then $K \subset -n\Delta \subset n^2 \Delta$ and $K \subset (n + 2)\Delta$. In particular,
if $\Delta_n$ is the regular $n$-dimensional simplex, then $d_{BM}(K, \Delta_n) \leq n + 2$.
Exercise 4.3 (Compactness of the space of convex bodies). Deduce from the
previous exercise that the set of convex bodies in $\mathbb{R}^n$ (up to identification via
invertible affine transformation), equipped with the distance $\log d_{BM}$, is a compact
metric space.

4.1.2. Symmetrization. If $K \subset \mathbb{R}^n$ is a non-symmetric convex body
containing 0, there are several symmetric convex bodies that can be associated with
$K$ (see Figure 4.1). Such symmetrization operations are useful because symmetric
convex bodies are often easier to deal with, whereas the symmetrized set still
“remembers” many features of $K$.

Figure 4.1. A convex body $K \subset \mathbb{R}^2$ (top left) and its four kinds
of symmetrizations $K_{\cup}$ (top right), $K_{\cap}$ (middle left), $(K - K)/2$
(middle right) and $K_{⌭}$ (bottom).

We may define the following convex bodies
(4.3)    $K_{\cup} = \mathrm{conv}(K \cup (-K)).$
If $K$ also contains 0 in its interior, we may also consider
(4.4)    $K_{\cap} = K \cap (-K).$
These operations are dual to each other since we have, by the bipolar theorem
(1.11),
(4.5)    $K \cap (-K) = \mathrm{conv}(K^{\circ}, -K^{\circ})^{\circ}.$
Still another possible symmetrization is $(K - K)/2 := \{(x - y)/2 : x, y \in K\}$
(cf. the definitions (4.7) below). This choice is appealing since it is invariant under
translations of $K$ and makes sense even if $0 \notin K$. However, the description of
the polar of $(K - K)/2$ is somewhat awkward. The set $K - K$ is often called
in the literature the difference body. Obviously if $K$ is already 0-symmetric then
$K_{\cap} = K_{\cup} = (K - K)/2 = K$.
Several examples of n-dimensional convex bodies naturally lie inside an affine
hyperplane in Rn`1 . This is the case for the regular simplex (the set of classical

ion
states) and for the set of quantum states (see Section 0.10). In this situation
still another symmetrization is useful. If H Ă Rn`1 is an affine hyperplane not
containing 0, and K is a convex body in H (so that K is n-dimensional), one may

ut
consider

rib
(4.6) K “ convpK Y p´Kqq.
The symbol depicts a cylinder. This is motivated by the observation that

ist
when K is a Euclidean disk, the resulting body K is a cylinder. It coincides with
what is commonly called a generalized cylinder if K is centrally symmetric.

rd
The set K is an pn ` 1q-dimensional convex body, so while formula (4.6) is
identical to (4.3), we distinguish the two operations since they will be applied in
fo
different contexts (for a description of pK q˝ , see Exercise 4.5). For example, if
K “ ∆n is the regular simplex defined as the convex hull of the canonical basis in
Rn`1 , the convex body obtained after symmetrization is p∆n q “ B1n`1 .
All these symmetrizations turn a non-symmetric convex body into a centrally
symmetric convex body. The word “symmetrization” is also used to describe
operations for which the output has some other symmetry properties. One example
of such an operation is the Steiner symmetrization as described in Exercise 4.31.
One of its important features is that for any convex body there is a sequence of
successive Steiner symmetrizations converging to a Euclidean ball, which is very
handy for proving geometric inequalities. For other examples of a similar nature, see
Notes and Remarks on Section 5.2.
Exercise 4.4 (Origin shifting and symmetrization). Show that for any convex
body $K \subset \mathbb{R}^n$ and $a, b \in K$,
$d_{BM}\big((K-a)_\cup, (K-b)_\cup\big) \le 4.$
Exercise 4.5 (The polar of cylindrical symmetrization). Let $H_e$ be defined as in
(1.21), let $K$ be a convex body inside $H_e$, and let $K_{⌭}$ be its symmetrization
defined as in (4.6). Denote by $C = \mathbb{R}_+ K$ the cone generated by $K$, and show that
$(K_{⌭})^\circ = \left(\frac{e}{|e|^2} - C^*\right) \cap \left(-\frac{e}{|e|^2} + C^*\right).$
If we write $x \le y$ when $y - x \in C^*$, this is the “interval”
$\{x \in \mathbb{R}^n : -e/|e|^2 \le x \le e/|e|^2\}$ in the order induced by $C^*$.
4.1.3. Zonotopes and zonoids. A crucial notion in convex geometry is that
of Minkowski operations on sets. If $A, B \subset \mathbb{R}^n$ and $t \in \mathbb{R}$, we set
(4.7)  $A + B := \{x + y : x \in A,\ y \in B\}, \qquad tA := \{tx : x \in A\}.$
The definition of the Minkowski sum extends to the case of finitely many convex
bodies.
82 4. MORE CONVEXITY

A convex body $K \subset \mathbb{R}^n$ is called a zonotope if it is the sum of finitely many
segments. For example, the cube $[-1,1]^n$ is a zonotope since
$[-1,1]^n = [-e_1, e_1] + \cdots + [-e_n, e_n],$
where $[-e_i, e_i]$ denotes the segment joining the $i$th canonical basis vector and its
opposite.
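The cube decomposition above can be checked mechanically on vertex sets. The sketch below is only an illustration: the helper names are hypothetical, and it relies on the standard fact that the Minkowski sum of convex hulls is the convex hull of the Minkowski sum of the generating point sets, so summing segment endpoints is enough to recover the vertices of the cube.

```python
from itertools import product

def minkowski_sum(A, B):
    """Minkowski sum of two finite point sets (points are tuples)."""
    return {tuple(a + b for a, b in zip(x, y)) for x in A for y in B}

def segment_endpoints(i, n):
    """Endpoints of the segment [-e_i, e_i] in R^n."""
    e = tuple(1 if k == i else 0 for k in range(n))
    return {e, tuple(-c for c in e)}

def cube_vertices_via_segments(n):
    """Sum the endpoint sets of [-e_1, e_1], ..., [-e_n, e_n]."""
    total = {tuple([0] * n)}
    for i in range(n):
        total = minkowski_sum(total, segment_endpoints(i, n))
    return total

# The summed endpoints are exactly the vertices of [-1, 1]^n.
n = 3
assert cube_vertices_via_segments(n) == set(product((-1, 1), repeat=n))
```

The same routine applied to arbitrary generators $u_1, \dots, u_p$ yields a finite point set whose convex hull is the zonotope $[-u_1,u_1] + \cdots + [-u_p,u_p]$; in general not all of these points are vertices.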
A convex body $K \subset \mathbb{R}^n$ is called a zonoid if it can be written as a limit
of zonotopes (in the Hausdorff distance). Note that the class of zonotopes (or
zonoids) is invariant under affine transformations, so we could alternatively use the
Banach–Mazur distance instead of the Hausdorff distance.
Observe that zonotopes and zonoids are automatically centrally symmetric. We
will usually assume that the center of symmetry is at the origin. Here is a useful
characterization of zonoids as polars of unit balls of subspaces of $L_1$.

Proposition 4.1 (not proved here). Let $K \subset \mathbb{R}^n$ be a symmetric convex body.
The following are equivalent.
(i) $K$ is a zonoid.
(ii) There is a positive Borel measure $\mu_K$ on $S^{n-1}$ such that, for any $x \in \mathbb{R}^n$,
(4.8)  $\|x\|_{K^\circ} = \int_{S^{n-1}} |\langle x, \theta\rangle| \, d\mu_K(\theta).$
We emphasize that $\mu_K$ is not assumed to be a probability measure.
It follows in particular that every ellipsoid is a zonoid (use $\mu_K = \sigma$ in (4.8), then
affine equivalence). Note also that, for a given zonoid $K \subset \mathbb{R}^n$, the Borel measure
$\mu_K$ on $S^{n-1}$ satisfying (4.8) is unique if we additionally require it to be even (i.e.,
to verify $\mu_K(-B) = \mu_K(B)$ for every Borel set $B \subset S^{n-1}$).
Exercise 4.6 (A formula for $\mu_K$). Let $K = [-u_1, u_1] + \cdots + [-u_p, u_p]$ be a
zonotope, where $u_1, \dots, u_p$ are vectors in $\mathbb{R}^n$. What is the measure $\mu_K$
appearing in (4.8)?
Exercise 4.7 (Planar zonotopes and zonoids). Show that every centrally symmetric
polygon is a zonotope, and that any centrally symmetric convex body
$K \subset \mathbb{R}^2$ is a zonoid.
Exercise 4.8 (Octahedron is not a zonotope). Show that $B_1^3$ is not a zonotope.
Exercise 4.9. Let $K_1, K_2$ be convex bodies in $\mathbb{R}^n$ such that $K_1 + K_2 = B_2^n$.
Does it follow that $K_1, K_2$ are Euclidean balls?

4.1.4. Projective tensor product. If $K$ and $K'$ are closed convex sets in
$\mathbb{R}^n$ and $\mathbb{R}^{n'}$ respectively, their projective tensor product is the closed
convex set $K \hat\otimes K'$ in $\mathbb{R}^n \otimes \mathbb{R}^{n'} \leftrightarrow \mathbb{R}^{nn'}$ defined as follows:
(4.9)  $K \hat\otimes K' = \overline{\operatorname{conv}}\{x \otimes x' : x \in K,\ x' \in K'\}.$
This terminology is motivated by the fact that when $K$ and $K'$ are unit balls
with respect to some norms, the set $K \hat\otimes K'$ is the unit ball of the corresponding
projective tensor product norm on $\mathbb{R}^n \otimes \mathbb{R}^{n'}$. Recall that given two
finite-dimensional normed spaces $(V, \|\cdot\|)$ and $(V', \|\cdot\|)$, their projective tensor
product (denoted by $V \hat\otimes V'$) is the space $V \otimes V'$ equipped with the norm
$\|z\|_\wedge = \inf\left\{\sum \|x_i\|\,\|y_i\| : z = \sum x_i \otimes y_i\right\}.$
It is easily checked that $B_1^m \hat\otimes B_1^n$ identifies with $B_1^{mn}$ when the space
$\mathbb{R}^m \otimes \mathbb{R}^n$ is identified with $\mathbb{R}^{mn}$ (see also Exercise 4.16), and that
$B_2^m \hat\otimes B_2^n$ identifies with $S_1^{m,n}$ when $\mathbb{R}^m \otimes \mathbb{R}^n$ is identified
with $M_{m,n}$.
There is a dual notion to the projective tensor product, which is called the
injective tensor product. It can be defined via polarity: if $K \subset \mathbb{R}^n$ and
$K' \subset \mathbb{R}^{n'}$ are convex bodies containing 0 in the interior, their injective tensor
product is the convex body $K \check\otimes K'$ in $\mathbb{R}^n \otimes \mathbb{R}^{n'} \leftrightarrow \mathbb{R}^{nn'}$
defined as follows:
(4.10)  $K \check\otimes K' = \big(K^\circ \hat\otimes (K')^\circ\big)^\circ.$
This definition does not depend on the particular choice of Euclidean structures on
$\mathbb{R}^n$ and $\mathbb{R}^{n'}$, provided one considers the Euclidean structure on
$\mathbb{R}^n \otimes \mathbb{R}^{n'}$ obtained as their Hilbertian tensor product.
The relevance of the above notions to the information-theoretic context, quantum
or classical, is evident. The set of separable states is the projective tensor product
of the sets of states on factor spaces. More precisely, if
$\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$, then
(4.11)  $\mathrm{Sep}(\mathcal{H}) = D(\mathcal{H}_1) \hat\otimes D(\mathcal{H}_2).$
(These objects were defined in Section 2.2.) Similarly, for classical states, the
projective tensor product $\Delta_{m-1} \hat\otimes \Delta_{n-1}$ identifies with $\Delta_{mn-1}$.
The definition of $K \hat\otimes K'$ (similarly to other definitions and comments of this
section) immediately generalizes to tensor products of any finite number of factors.
However, for the sake of transparency we shall concentrate in this section on the
case of two convex bodies. We also point out that the definition (4.9) makes sense
when $K, K'$ are subsets of complex spaces.


It is easy to see that the operation $\hat\otimes$ commutes with some of the
symmetrizations we introduced earlier, e.g.,
(4.12)  $K_\cup \hat\otimes K'_\cup = (K \hat\otimes K')_\cup$
and
(4.13)  $K_{⌭} \hat\otimes K'_{⌭} = (K \hat\otimes K')_{⌭}.$
To check that (4.13) makes sense, we note that if $K$ (resp., $K'$) is a convex body in
the affine hyperplane $H_e \subset \mathbb{R}^n$ (resp., $H_{e'} \subset \mathbb{R}^{n'}$) defined as in (1.21),
then $K \hat\otimes K'$ is a convex body in the affine hyperplane
$H_{e \otimes e'} \subset \mathbb{R}^n \otimes \mathbb{R}^{n'}$ (cf. Exercises 4.13 and 4.15).
A specific situation where (4.13) holds, which will be fundamental in Chapter
9, is when $K$ is the set of quantum states on a Hilbert space. Since
$D(\mathbb{C}^d)_{⌭} = S_1^{d,\mathrm{sa}}$, it follows that
(4.14)  $\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^{d'})_{⌭} = S_1^{d,\mathrm{sa}} \hat\otimes S_1^{d',\mathrm{sa}}.$
To put it in words, the symmetrization of the set of separable states is canonically
identified with the projective tensor product of two copies of the self-adjoint part of
the unit ball for the trace norm and, consequently, is the unit ball in the projective
tensor product norm of (the self-adjoint parts of) two 1-Schatten spaces.
Exercise 4.10 (Projective tensor product and compactness). Show that if
$K, K'$ are compact convex sets, then $\operatorname{conv}\{x \otimes x' : x \in K,\ x' \in K'\}$ is
compact and hence equal to $K \hat\otimes K'$. Give an example of closed convex sets
$K, K'$ such that the set $\operatorname{conv}\{x \otimes x' : x \in K,\ x' \in K'\}$ is not closed.
Exercise 4.11 (Linear invariance of projective tensor product). Let
$K_i \subset \mathbb{R}^{n_i}$ and let $T_i : \mathbb{R}^{n_i} \to \mathbb{R}^{m_i}$ be linear maps, $i = 1, 2$.
Show that $(T_1 \otimes T_2)\big(K_1 \hat\otimes K_2\big) = (T_1 K_1) \hat\otimes (T_2 K_2)$.
Exercise 4.12 (Projective tensor product with a linear subspace). Let
$K \subset \mathbb{R}^n$ be a closed convex set, let $V = \operatorname{span} K$, and let
$V' \subset \mathbb{R}^{n'}$ be a vector subspace. Show that
$K \hat\otimes V' = V \hat\otimes V' = V \otimes V'$.
Exercise 4.13 (Projective tensor product of affine subspaces). Let
$V_i \subset \mathbb{R}^{n_i}$ be affine subspaces for $i = 1, 2$. Show that $V_1 \hat\otimes V_2$ is an
affine subspace of $\mathbb{R}^{n_1} \otimes \mathbb{R}^{n_2}$ and find its dimension.
Exercise 4.14 (Projective tensor product of cones). Show that if $C$ and $C'$ are
closed convex cones, then the set $\operatorname{conv}\{x \otimes x' : x \in C,\ x' \in C'\}$ is a closed
convex cone and in particular equals $C \hat\otimes C'$.
Exercise 4.15 (Projective tensor products of bodies are bodies). Show that if
$K_i \subset \mathbb{R}^{n_i}$ are convex bodies, then $K_1 \hat\otimes K_2$ is a convex body in
$\mathbb{R}^{n_1} \otimes \mathbb{R}^{n_2}$. Similarly, if each $K_i$ is a convex body in an affine subspace
$V_i \subset \mathbb{R}^{n_i}$, then $K_1 \hat\otimes K_2$ is a convex body in $V_1 \hat\otimes V_2$.
Exercise 4.16 (Projective tensor product with $B_1^n$). Let $K$ be a symmetric
convex body in $\mathbb{R}^m$. (i) What is then $B_1^k \hat\otimes K$? (ii) Show that
$\operatorname{vol}(B_1^k \hat\otimes K) = \frac{(m!)^k}{(km)!} \operatorname{vol}(K)^k.$
Exercise 4.17 (Extreme points of projective tensor products). If $K$ and $K'$ are
symmetric convex bodies, show that the set of extreme points of $K \hat\otimes K'$ is exactly
the set of elements $x \otimes x'$, where $x$ is an extreme point of $K$ and $x'$ is an extreme
point of $K'$. Show that this may be false if either $K$ or $K'$ is not symmetric.
Exercise 4.18 (Injective tensor products and bilinear forms). If $K = B_X$ and
$K' = B_{X'}$, show that $K \check\otimes K'$ identifies with the set of bilinear maps
$F : X \times X' \to \mathbb{R}$ such that $|F(x, x')| \le \|x\| \cdot \|x'\|$ for all $x, x'$ (i.e., with the
unit ball in the space of bilinear maps).

4.2. John and Löwner ellipsoids

4.2.1. Definition and characterization. We start with the following proposition.
Proposition 4.2. For every convex body $K \subset \mathbb{R}^n$
(i) there is a unique ellipsoid $\mathcal{E} \subset \mathbb{R}^n$ with maximal volume under the constraint
$\mathcal{E} \subset K$ and
(ii) there is a unique ellipsoid $\mathcal{F} \subset \mathbb{R}^n$ with minimal volume under the constraint
$\mathcal{F} \supset K$.
The ellipsoid $\mathcal{E}$ appearing in (i) is called the John ellipsoid of $K$ and denoted
by $\mathrm{John}(K)$. The ellipsoid $\mathcal{F}$ appearing in (ii) is called the Löwner ellipsoid of $K$
and denoted by $\mathrm{Löw}(K)$. By a compactness argument, the existence of an ellipsoid
of maximal/minimal volume is clear in (i) and (ii). Note also that these ellipsoids
are affine invariants: for any affine map $T$, we have $\mathrm{John}(TK) = T\,\mathrm{John}(K)$ and
$\mathrm{Löw}(TK) = T\,\mathrm{Löw}(K)$. We say that $K$ is in John position if $\mathrm{John}(K) = B_2^n$, and
that $K$ is in Löwner position if $\mathrm{Löw}(K) = B_2^n$.
4.2. JOHN AND LÖWNER ELLIPSOIDS 85

Uniqueness deserves an argument (the proof will be elementary, but to show
part (ii) in full generality we will need a trick implicit in Proposition 4.4). For
(i) this is fairly straightforward: assume that $\mathcal{E} \ne \mathcal{E}'$ are two distinct ellipsoids of
maximal volume contained in $K$, then write $\mathcal{E} = S(B_2^n) + x$ and
$\mathcal{E}' = S'(B_2^n) + x'$ for $S, S' \in \mathrm{PSD}$ and $x, x' \in \mathbb{R}^n$. Since $\mathcal{E} \ne \mathcal{E}'$, we
necessarily have $(S, x) \ne (S', x')$. By linear invariance, we may assume that $S = I$,
which implies that $\det(S') = 1$. If $S' = I$, then $\mathcal{E}$ and $\mathcal{E}'$ are two distinct balls of
radius 1, and it is easy to see that $\operatorname{conv}(\mathcal{E}, \mathcal{E}')$ (and hence $K$) contains an
ellipsoid centered at $\frac{x + x'}{2}$ of volume larger than $\operatorname{vol}(\mathcal{E})$, a contradiction.
If $S' \ne I$, then $K$ contains the ellipsoid $T(B_2^n) + y$ with $T = \frac{I + S'}{2}$ and
$y = \frac{x + x'}{2}$. Since $\det T > 1$ (see Exercise 1.42), this ellipsoid has volume
greater than $\operatorname{vol}(\mathcal{E})$, also a contradiction.
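The key step in the case $S' \ne I$ is the strict inequality $\det\frac{I+S'}{2} > 1$ for a positive definite $S' \ne I$ with $\det S' = 1$, which the text delegates to Exercise 1.42. As a sanity check only, here is a minimal numerical instance in dimension 2; the particular matrix is an arbitrary choice made for this illustration and is not taken from the text.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c

def midpoint_with_identity(s):
    """Return T = (I + S) / 2 for a 2x2 matrix S."""
    return [[(s[0][0] + 1) / 2, s[0][1] / 2],
            [s[1][0] / 2, (s[1][1] + 1) / 2]]

# Symmetric, positive definite (positive trace and determinant), det = 1, not the identity.
S_prime = [[2.0, 1.0], [1.0, 1.0]]
T = midpoint_with_identity(S_prime)

print(det2(S_prime))  # 1.0
print(det2(T))        # 1.25, strictly larger than 1
```

So the averaged ellipsoid $T(B_2^2) + y$ has area $1.25$ times that of the unit disk, even though both original ellipsoids have the area of the unit disk.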

The uniqueness in (ii) follows by duality when $K$ is centrally symmetric. Indeed,
the minimization problem in (ii) can be restricted in that case to 0-symmetric
ellipsoids (by essentially the same argument as in the case of $S' = I$ above). Since
for a 0-symmetric ellipsoid $\mathcal{F}$ we have
$\operatorname{vol}(\mathcal{F}) \operatorname{vol}(\mathcal{F}^\circ) = \operatorname{vol}(B_2^n)^2$ by (1.10), and
since $K \subset \mathcal{F} \iff \mathcal{F}^\circ \subset K^\circ$, the uniqueness follows, together with the
relation $\mathrm{Löw}(K) = \mathrm{John}(K^\circ)^\circ$.

Figure 4.2. An equilateral triangle $K$ in Löwner position. The
polar body $K^\circ$ is in John position. The contact points $x, y, z$ satisfy
the relations $x + y + z = 0$ and
$\frac23 |x\rangle\langle x| + \frac23 |y\rangle\langle y| + \frac23 |z\rangle\langle z| = I$ as
in Definition 4.5.

The uniqueness in (ii) in the general case is not obvious at this point; we
postpone its justification until after Proposition 4.4.
We will now present a general trick that makes it possible to reduce the search
for the Löwner ellipsoid of not-necessarily-symmetric bodies to the symmetric
case. To that end, fix $h > 0$ and consider the affine hyperplane
$H := \{(h, x) : x \in \mathbb{R}^n\} \subset \mathbb{R}^{n+1}.$
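To each ellipsoid $\mathcal{E} \subset H$ we associate the symmetrization
$\mathcal{E}_{⌭} = \operatorname{conv}(-\mathcal{E} \cup \mathcal{E})$,
which is an ellipsoidal cylinder in $\mathbb{R}^{n+1}$. The following lemma describes the Löwner
ellipsoid of $\mathcal{E}_{⌭}$.
Lemma 4.3. Let $S \in GL(n, \mathbb{R})$ and $a \in \mathbb{R}^n$, and consider the ellipsoid
$\mathcal{E} = \{(h, Sx + a) : x \in B_2^n\} \subset H.$
Then $\mathrm{Löw}(\mathcal{E}_{⌭}) = T(B_2^{n+1})$, where
$T = \begin{bmatrix} \sqrt{n+1}\,h & 0 \\ \sqrt{n+1}\,a & \sqrt{1 + \frac1n}\,S \end{bmatrix}.$
In particular,
(4.15)  $\operatorname{vol}(\mathrm{Löw}(\mathcal{E}_{⌭})) = c_n\, h \operatorname{vol}(\mathcal{E})$
for some constant $c_n$ depending only on $n$.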

Proof. Consider first the special case (denoted by $\mathcal{E}_0$) where $S = I$, $h = 1$,
and $a = 0$. It follows from the uniqueness, which has already been fully proved
in the symmetric case, that $\mathrm{Löw}((\mathcal{E}_0)_{⌭})$ inherits all the symmetries of
$(\mathcal{E}_0)_{⌭}$ and therefore has the form $T_0(B_2^{n+1})$, where $T_0$ is a diagonal matrix
with coefficients $(\alpha, \beta, \dots, \beta)$, with $\alpha, \beta > 0$ to be determined. Since
$(\mathcal{E}_0)_{⌭} \subset T_0(B_2^{n+1})$ if and only if
$\frac{1}{\alpha^2} + \frac{1}{\beta^2} \le 1$ and
$\operatorname{vol}(T_0(B_2^{n+1})) = \alpha\beta^n \operatorname{vol}(B_2^{n+1})$, the minimization problem
yields the values $\alpha = \sqrt{n+1}$, $\beta = \sqrt{1 + 1/n}$, as needed.
For the general case, note that $\mathcal{E}_{⌭} = A(\mathcal{E}_0)_{⌭}$, where
$A = \begin{bmatrix} h & 0 \\ a & S \end{bmatrix} \in M_{n+1}.$
Since $\mathrm{Löw}(\mathcal{E}_{⌭}) = \mathrm{Löw}(A(\mathcal{E}_0)_{⌭}) = A\,\mathrm{Löw}((\mathcal{E}_0)_{⌭})$ by
invariance, it follows that $T = AT_0$ as claimed. The relation (4.15) follows by
expressing $\det T$ in terms of $\det S$. □
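The one-variable minimization in the proof of Lemma 4.3 (minimize $\alpha\beta^n$ subject to $\alpha^{-2} + \beta^{-2} = 1$) can be checked numerically. The sketch below is an illustration only: it substitutes $u = \alpha^{-2}$, so that $\beta^{-2} = 1 - u$, and runs a plain grid search; the function names and the grid size are choices made for this example.

```python
import math

def volume_factor(u, n):
    """alpha * beta**n, where 1/alpha**2 = u and 1/beta**2 = 1 - u."""
    return u ** -0.5 * (1.0 - u) ** (-n / 2)

def best_alpha_beta(n, steps=100_000):
    """Grid search over u in (0, 1) for the minimizer of alpha * beta**n."""
    u = min((k / steps for k in range(1, steps)),
            key=lambda t: volume_factor(t, n))
    return 1 / math.sqrt(u), 1 / math.sqrt(1 - u)

# Compare the numerical minimizer with the closed form from the proof.
for n in (1, 2, 5):
    alpha, beta = best_alpha_beta(n)
    print(n, alpha, math.sqrt(n + 1), beta, math.sqrt(1 + 1 / n))
```

Analytically, minimizing $u^{-1/2}(1-u)^{-n/2}$ amounts to maximizing $u(1-u)^n$, which gives $u = 1/(n+1)$, in agreement with $\alpha = \sqrt{n+1}$ and $\beta = \sqrt{1 + 1/n}$.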

Proposition 4.4. Let $K \subset H$ be a convex body and $\mathcal{E} \subset H$ an ellipsoid. The
following are equivalent:
(i) $\mathcal{E}$ is a minimal volume ellipsoid containing $K$.
(ii) $\mathrm{Löw}(\mathcal{E}_{⌭}) = \mathrm{Löw}(K_{⌭})$.
Since $\mathcal{E} = \mathrm{Löw}(\mathcal{E}_{⌭}) \cap H$, Proposition 4.4 implies in particular uniqueness of
the Löwner ellipsoid for not-necessarily-symmetric convex bodies, completing the
proof of Proposition 4.2.
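Proof of Proposition 4.4. Assuming (i), let $\mathcal{F} = \mathrm{Löw}(K_{⌭}) \cap H$. Since $\mathcal{F}$
is an ellipsoid containing $K$, we have $\operatorname{vol}(\mathcal{F}) \ge \operatorname{vol}(\mathcal{E})$, which by (4.15)
implies $\operatorname{vol}(\mathrm{Löw}(\mathcal{F}_{⌭})) \ge \operatorname{vol}(\mathrm{Löw}(\mathcal{E}_{⌭}))$. Next, since
$K_{⌭} \subset \mathcal{F}_{⌭} \subset \mathrm{Löw}(K_{⌭})$, it follows that
$\mathrm{Löw}(K_{⌭}) = \mathrm{Löw}(\mathcal{F}_{⌭})$. Given that $\mathrm{Löw}(\mathcal{E}_{⌭})$ is an ellipsoid
containing $K_{⌭}$ with volume not exceeding the minimum possible, it must coincide
with $\mathrm{Löw}(K_{⌭})$.
Assume now (ii), and let $\mathcal{F}$ be an ellipsoid containing $K$. Since $\mathrm{Löw}(\mathcal{F}_{⌭})$
contains $K_{⌭}$, it follows that
$\operatorname{vol}(\mathrm{Löw}(\mathcal{F}_{⌭})) \ge \operatorname{vol}(\mathrm{Löw}(K_{⌭})) = \operatorname{vol}(\mathrm{Löw}(\mathcal{E}_{⌭}))$. By
(4.15), this means that $\operatorname{vol}(\mathcal{F}) \ge \operatorname{vol}(\mathcal{E})$, as needed. □
The following concept will be useful for our purposes.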
Definition 4.5. A resolution of identity in $\mathbb{R}^n$ is a finite family $(x_i, c_i)_{i \in I}$,
where $(x_i)_{i \in I}$ belong to $S^{n-1}$ and $(c_i)_{i \in I}$ are positive numbers, such that
(4.16)  $\sum_i c_i\, |x_i\rangle\langle x_i| = I_n.$
A resolution is called unbiased if, additionally,
(4.17)  $\sum_i c_i x_i = 0.$
If $K$ is a convex body in $\mathbb{R}^n$ and all points $x_i$ belong to $\partial K$, we will say that
$(x_i, c_i)_{i \in I}$ is associated to $K$. Note that if, additionally, $K \subset B_2^n$ or
$B_2^n \subset K$ (which will usually be the case), then all points $x_i$ are contact points of
$K$ and the unit sphere, i.e., such that $\|x_i\|_K = \|x_i\|_{K^\circ} = |x_i|$.
Taking the trace of both sides in condition (4.16), we see that necessarily
$\sum c_i = n$. More generally, if $T \in B(\mathbb{R}^n)$, then
(4.18)  $\operatorname{Tr} T = \sum_i c_i \langle T x_i, x_i \rangle$
(see Exercise 4.19). Note also that condition (4.17) is redundant for symmetric
convex bodies, since one can always enforce it by replacing every couple $(c_i, x_i)$ in
the decomposition by two couples $(\frac12 c_i, x_i)$ and $(\frac12 c_i, -x_i)$.
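As a concrete instance of Definition 4.5, the contact points of the equilateral triangle from Figure 4.2 can be checked numerically. This is a minimal sketch under the assumption that the triangle is inscribed in the unit circle with one vertex at angle $\pi/2$; the helper names are hypothetical.

```python
import math

def weighted_sum_of_projectors(points, weights):
    """Matrix of sum_i c_i |x_i><x_i| for points in R^n."""
    n = len(points[0])
    return [[sum(c * x[i] * x[j] for x, c in zip(points, weights))
             for j in range(n)] for i in range(n)]

def weighted_barycenter(points, weights):
    """Vector sum_i c_i x_i."""
    n = len(points[0])
    return [sum(c * x[i] for x, c in zip(points, weights)) for i in range(n)]

# Vertices of an equilateral triangle on the unit circle, each with weight 2/3.
angles = [math.pi / 2 + 2 * math.pi * k / 3 for k in range(3)]
points = [(math.cos(t), math.sin(t)) for t in angles]
weights = [2 / 3] * 3

F = weighted_sum_of_projectors(points, weights)   # should be close to I_2, cf. (4.16)
b = weighted_barycenter(points, weights)          # should be close to 0, cf. (4.17)
print(F, b, sum(weights))                         # the weights sum to n = 2
```

Up to rounding, `F` is the $2 \times 2$ identity and `b` vanishes, so the family is an unbiased resolution of identity, and the weights sum to $n = 2$ as the trace argument predicts.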
The following pair of propositions characterizes John and Löwner positions via
resolutions of identity. The presentations of these results that are easily available in
the literature focus on the class of symmetric bodies and we will assume henceforth
that they are both known to be true in that setting (for a reference, see Theorem
2.1.15 in [AAGM15] or Theorem 3.1 in [Bal97]). It is also easy to see that in
the symmetric case the two statements are formally equivalent by duality (i.e., by
passing to polars).
Proposition 4.6. Let $K$ be a convex body in $\mathbb{R}^n$. The following are equivalent.
(i) $K$ is in Löwner position.
(ii) $K \subset B_2^n$ and there exists an unbiased resolution of identity associated to $K$.
Proposition 4.7. Let $K$ be a convex body in $\mathbb{R}^n$. The following are equivalent.
(i) $K$ is in John position.
(ii) $K \supset B_2^n$ and there exists an unbiased resolution of identity associated to $K$.

Proof of Proposition 4.6 (assuming the symmetric case). To a convex
body $K$ we associate
$\widetilde{K} = \left\{ \left( \frac{1}{\sqrt{n+1}}, \sqrt{\frac{n}{n+1}}\, x \right) : x \in K \right\} \subset \mathbb{R}^{n+1}.$
It follows from Lemma 4.3 that $B_2^{n+1} = \mathrm{Löw}\big((\widetilde{B_2^n})_{⌭}\big)$. In view of
Proposition 4.4, we have the equivalence
$K$ is in Löwner position $\iff$ $\widetilde{K}_{⌭}$ is in Löwner position.
Consequently, our task is reduced to showing that $K$ has an unbiased resolution of
identity (in $\mathbb{R}^n$) if and only if $\widetilde{K}_{⌭}$ has a resolution of identity (in
$\mathbb{R}^{n+1}$). To that end, let $e_0 = (1, 0, \dots, 0) \in \mathbb{R}^{n+1}$ and let $(x_i, c_i)$ be a
resolution of identity for $\widetilde{K}_{⌭}$. The points $x_i$ are extreme points of
$\widetilde{K}_{⌭}$, and since we have freedom to replace $x_i$ by $-x_i$, we may assume that each
$x_i$ has the form $x_i = \left( \frac{1}{\sqrt{n+1}}, \sqrt{\frac{n}{n+1}}\, y_i \right)$ with
$y_i \in K \cap S^{n-1}$. Setting $z = \sum_i c_i y_i$, we have
$I_{n+1} = \sum_i c_i\, |x_i\rangle\langle x_i|$
$\quad = \sum_i c_i \left| \tfrac{1}{\sqrt{n+1}}\, e_0 + \sqrt{\tfrac{n}{n+1}}\,(0, y_i) \right\rangle \left\langle \tfrac{1}{\sqrt{n+1}}\, e_0 + \sqrt{\tfrac{n}{n+1}}\,(0, y_i) \right|$
$\quad = |e_0\rangle\langle e_0| + \frac{\sqrt{n}}{n+1} \big( |e_0\rangle\langle (0, z)| + |(0, z)\rangle\langle e_0| \big) + \frac{n}{n+1} \sum_i c_i\, |(0, y_i)\rangle\langle (0, y_i)|,$
where in the last equality we used the fact that $\sum_i c_i = n + 1$. By applying this
operator equality to the vector $e_0$, we obtain $z = 0$. Thus the middle term in the last
line above vanishes, which easily implies that $\big(y_i, \frac{n}{n+1} c_i\big)$ is an unbiased
resolution of identity for $K$. The reverse argument simply retraces the above
calculation backwards; the reader is encouraged to verify the details. (Note that
$z = 0$ then follows from the hypothesis.) □

Proof of Proposition 4.7. Assume that $K$ is in John position. We claim
that $K^\circ$ is in Löwner position. To check this, let $\mathcal{E}$ be an ellipsoid containing
$K^\circ$. We then have $\mathcal{E}^\circ \subset K$. We know from Exercise 1.26 (or from Exercise
D.3, which outlines a simpler but less elementary proof) that $\mathcal{E}^\circ$ is an ellipsoid
and that $\operatorname{vol}(\mathcal{E}) \operatorname{vol}(\mathcal{E}^\circ) \ge \operatorname{vol}(B_2^n)^2$, with equality iff $\mathcal{E}$ is
0-symmetric. Since $\operatorname{vol}(\mathcal{E}^\circ) \le \operatorname{vol}(B_2^n)$ by definition of the John ellipsoid,
it follows that $\operatorname{vol}(\mathcal{E}) \ge \operatorname{vol}(B_2^n)$, showing that $K^\circ$ is in Löwner position. By
Proposition 4.6, $K^\circ$ admits an unbiased resolution of identity, and so does $K$.
Conversely, suppose that $B_2^n \subset K$ and that $(x_i, c_i)$ is an unbiased resolution
of identity for $K$. We note that $K$ is contained in $\bigcap_i \{\langle \cdot\,, x_i \rangle \le 1\}$
(indeed, since $x_i \in \partial K \cap S^{n-1}$, the support hyperplane for $K$ at $x_i$ is necessarily
orthogonal to $x_i$). Let $\mathcal{E} \subset K$ be an ellipsoid. Write $\mathcal{E} = S(B_2^n) + a$ for
$S \in \mathrm{PSD}$ and $a \in \mathbb{R}^n$. Since $S x_i + a \in \mathcal{E} \subset K$, we have
$\langle S x_i + a, x_i \rangle \le 1$ for all $i$. Since $\sum c_i x_i = 0$, this shows that
$n = \sum_i c_i \ge \sum_i c_i \langle S x_i + a, x_i \rangle = \sum_i c_i \langle S x_i, x_i \rangle = \operatorname{Tr} S,$
the last equality following from (4.18). The AM/GM inequality now implies that
$\det S \le 1$, and hence that $\operatorname{vol}(\mathcal{E}) \le \operatorname{vol}(B_2^n)$. Since $\mathcal{E} \subset K$ was arbitrary,
this shows that $K$ is in John position. □


John’s theorem implies estimates on the diameter of the Banach–Mazur
compactum which are essentially sharp in the symmetric case only (see Exercises
4.20–4.21, and Notes and Remarks for further comments).
Exercise 4.19. Prove identity (4.18).
Exercise 4.20 (The diameter of Banach–Mazur compactum). Let $K \subset B_2^n$
(resp., $K \supset B_2^n$) be a symmetric convex body and assume that there exists a
resolution of identity associated to $K$. Show that $K \supset \frac{1}{\sqrt{n}} B_2^n$ (resp.,
$K \subset \sqrt{n}\, B_2^n$) and so, in particular, $d_g(B_2^n, K) \le \sqrt{n}$. Conclude that any pair
$K, L$ of symmetric convex bodies in $\mathbb{R}^n$ satisfies $d_{BM}(K, L) \le n$.
Exercise 4.21 (Bounds on the diameter of Banach–Mazur compactum, the
non-symmetric case). Let $K \subset B_2^n$ (resp., $K \supset B_2^n$) be a convex body and assume
that there exists an unbiased resolution of identity associated to $K$. Show that
$K \supset \frac{1}{n} B_2^n$ (resp., $K \subset n B_2^n$). Conclude that any pair $K, L$ of convex bodies
in $\mathbb{R}^n$ satisfies $d_{BM}(K, L) \le n^2$.
Exercise 4.22 (The length of resolutions of identity). Show that in
Propositions 4.6 and 4.7, the length of the resolution of identity associated to $K$ can
be assumed to be at most $\frac{n(n+3)}{2}$ in the general case, and at most $\frac{n(n+1)}{2}$ if
$K$ is symmetric.
Exercise 4.23 (The radius of Banach–Mazur compactum). Show that the first
estimate from Exercise 4.20 is optimal by verifying that
$d_{BM}(B_2^n, B_\infty^n) = \sqrt{n}$.

4.2.2. Convex bodies with enough symmetries. In this section we describe
a class of convex bodies “with enough symmetries,” which in particular admit
a unique Euclidean structure compatible with those symmetries. These properties
force the John and Löwner ellipsoids (or any other ellipsoids “functorially
associated” with such bodies) to be balls with respect to that Euclidean structure.
Let $K \subset \mathbb{R}^n$ be a convex body. We consider symmetries of $K$, i.e., invertible
affine maps $T : \mathbb{R}^n \to \mathbb{R}^n$ such that $T(K) = K$. We start by making two
observations. First, such maps necessarily fix the centroid of $K$. If the centroid is at the
origin (which may be assumed by translating $K$), the set of symmetries becomes
a subgroup of $GL(n, \mathbb{R})$. Second, since this subgroup is compact, it must preserve
a scalar product (consider any scalar product and average it with respect to the
Haar measure on the group of symmetries). Equivalently, by replacing $K$ with a
linear image we may ensure that all symmetries of $K$ are (Euclidean) isometries;
in virtually all applications this property will be automatically satisfied. This is
tacitly assumed in what follows, although the definitions and the proposition can
be easily rephrased to make sense and/or hold without that assumption.
We therefore consider $K \subset \mathbb{R}^n$ a convex body with centroid at the origin. An
isometry of $K$ is an orthogonal transformation $O \in O(n)$ such that $O(K) = K$.
The isometries of $K$ form a subgroup of $O(n)$, which will be called the isometry
group of $K$ and denoted by $\mathrm{Iso}(K)$. This definition extends mutatis mutandis to
convex bodies $K \subset \mathbb{C}^n$; in that case $\mathrm{Iso}(K)$ is a subgroup of $U(n)$.
We say that $K$ has enough symmetries if $\mathrm{Iso}(K)' = \mathbb{R}\, I$ (or $\mathbb{C}\, I$ in the complex
case). Here $G'$ denotes the commutant of $G$, i.e., the set of linear maps $S$ such that
$SO = OS$ for every $O \in G$.
There is a closely related notion (and possibly a source of confusion): one says
that $\mathrm{Iso}(K)$ acts irreducibly if any $\mathrm{Iso}(K)$-invariant subspace is either $\{0\}$ or
$\mathbb{R}^n$ (or $\mathbb{C}^n$ in the complex case; a subspace $E$ is $G$-invariant if $O(E) = E$ for any
$O \in G$). One checks that $\mathrm{Iso}(K)$ acts irreducibly if and only if $\mathrm{Iso}(K)'$ contains no
nontrivial orthogonal projection, and also if and only if
$\mathrm{Iso}(K)' \cap M_n^{\mathrm{sa}} = \mathbb{R}\, I$; this idea is also used in Proposition 4.8.
It is immediate that when $K$ has enough symmetries, $\mathrm{Iso}(K)$ acts irreducibly.
In the complex case, the reverse implication also holds (this is the content of Schur’s
lemma) and both notions are equivalent. In the real case, the notions are different
(see Exercise 4.26).
The following proposition shows that ellipsoids associated to a convex body
in a “functorial” way (such as the John and Löwner ellipsoids, or the $\ell$-ellipsoid
introduced in Section 7.1) inherit its symmetries.
Proposition 4.8. Let $K \subset \mathbb{R}^n$ be a convex body and let $\mathcal{E}$ be an ellipsoid such
that $O(\mathcal{E}) = \mathcal{E}$ for any $O \in \mathrm{Iso}(K)$. Then there exist pairwise orthogonal subspaces
$E_1, \dots, E_k$, which are invariant under $\mathrm{Iso}(K)$, and positive numbers
$\lambda_1, \dots, \lambda_k$ such that
$\mathcal{E} = T B_2^n, \quad \text{where } T = \lambda_1 P_{E_1} + \cdots + \lambda_k P_{E_k}.$
In particular, when $\mathrm{Iso}(K)$ acts irreducibly, $\mathcal{E}$ is a Euclidean ball.
Proof. Let $T$ be the unique positive matrix such that $\mathcal{E} = T(B_2^n)$. For every
$O \in \mathrm{Iso}(K)$, we have $\mathcal{E} = O(\mathcal{E}) = O T O^\dagger (B_2^n)$, thus $O T O^\dagger = T$. Write
$T = \sum_i \lambda_i P_i$, where $\lambda_i > 0$ are distinct positive numbers and $P_i$ pairwise
orthogonal projectors. From the relation $O T O^\dagger = T$ we deduce that, for every $i$,
we have $O P_i O^\dagger = P_i$ for all $O \in \mathrm{Iso}(K)$, and therefore that the range of $P_i$ is
invariant under $\mathrm{Iso}(K)$. □
We conclude this section with two examples of groups of symmetries of $\mathbb{R}^n$ (or
$\mathbb{C}^n$) which play an important role in geometric functional analysis:
(4.19)  $G_{\mathrm{unc}} := \{(x_1, \dots, x_n) \mapsto (\varepsilon_1 x_1, \dots, \varepsilon_n x_n) : |\varepsilon_j| = 1\}$
(4.20)  $G_{\mathrm{sym}} := \{(x_1, \dots, x_n) \mapsto (\varepsilon_1 x_{\pi(1)}, \dots, \varepsilon_n x_{\pi(n)}) : |\varepsilon_j| = 1\},$
where $\varepsilon_1, \dots, \varepsilon_n$ are scalars and $\pi \in \mathfrak{S}_n$, the group of permutations. A convex
body $K$ (resp., the norm or the space, for which $K$ is the unit ball) is called
unconditional (with respect to the standard basis) if $\mathrm{Iso}(K) \supset G_{\mathrm{unc}}$ and, similarly,
permutationally symmetric if $\mathrm{Iso}(K) \supset G_{\mathrm{sym}}$. Bodies of the second kind have
enough symmetries, but bodies of the first kind not necessarily; see Exercise 4.24.
(In functional analysis, the standard terminology for permutationally symmetric
bodies is “symmetric,” but we prefer to avoid the confusion with the notion of being
centrally symmetric.) More generally, one may consider bodies (or norms) that are
unconditional (resp., permutationally symmetric) with respect to some other basis
$(u_j)$, i.e., invariant under maps of the form $u_j \mapsto \varepsilon_j u_j$, $j = 1, \dots, n$ (resp.,
$u_j \mapsto \varepsilon_j u_{\pi(j)}$, $j = 1, \dots, n$). The basis $(u_j)$ is then called unconditional (resp.,
permutationally symmetric), and the property of having a basis of either kind is a
linear invariant.
Exercise 4.24 (Permutationally symmetric or unconditional vs. enough
symmetries). Show that every permutationally symmetric convex body has enough
symmetries. Give an example of an unconditional body which does not have enough
symmetries.
Exercise 4.25 (Examples of bodies with enough symmetries). Let $1 \le p \le \infty$,
$p \ne 2$. For each convex body in the following list, determine if it has enough
symmetries.
(i) The $\ell_p$ ball $B_p^n$,
(ii) its non-commutative analogues $S_p^{n,m}$,
(iii) the self-adjoint version $S_p^{n,\mathrm{sa}}$, and its intersection with the hyperplane of trace
0 matrices,
(iv) the regular simplex,
(v) the set $D(\mathbb{C}^n)$ of quantum states.
4.3. CLASSICAL INEQUALITIES FOR CONVEX BODIES 91

Exercise 4.26 (Enough symmetries vs. irreducible action). (i) Let $R \in SO(2)$
be the rotation of angle $2\pi/p$ for an integer $p \ge 3$. Construct a convex body
$K \subset \mathbb{R}^2$ whose isometry group is exactly $\{R^k : 0 \le k \le p - 1\}$. Show that $K$
does not have enough symmetries although $\mathrm{Iso}(K)$ acts irreducibly.
(ii) For any $n$, give an example of a convex body $L \subset \mathbb{R}^{2n}$ without enough
symmetries although $\mathrm{Iso}(L)$ acts irreducibly.
Exercise 4.27 (Projective tensor product and enough symmetries). Let
$K \subset \mathbb{R}^m$ and $L \subset \mathbb{R}^n$ be convex bodies with enough symmetries. Show that
$K \hat\otimes L$ has enough symmetries.
4.2.3. Ellipsoids and tensor products. It turns out that Löwner ellipsoids
behave well with respect to the projective tensor product, as the following lemma
shows. Note that the analogous statement does not hold for the John ellipsoid (see
Exercise 4.28).
Lemma 4.9. Let $K \subset \mathbb{R}^n$ and $K' \subset \mathbb{R}^{n'}$ be two convex bodies and assume that
the ellipsoids $\mathrm{Löw}(K)$ and $\mathrm{Löw}(K')$ are 0-symmetric. Then the Löwner ellipsoid
of their projective tensor product is the Hilbertian tensor product of the respective
Löwner ellipsoids.
In terms of scalar products, for every $x, y$ in $\mathbb{R}^n$ and $x', y'$ in $\mathbb{R}^{n'}$, we have
$\langle x \otimes x', y \otimes y' \rangle_{\mathrm{Löw}(K \hat\otimes K')} = \langle x, y \rangle_{\mathrm{Löw}(K)}\, \langle x', y' \rangle_{\mathrm{Löw}(K')}.$
Proof. First suppose that $\mathrm{Löw}(K) = B_2^n$ and $\mathrm{Löw}(K') = B_2^{n'}$. By Proposition
4.6, there exist unbiased resolutions of identity for $K$ and $K'$, respectively $(x_i, c_i)$
and $(x'_j, c'_j)$. We easily check that
$K \hat\otimes K' \subset B_2^{nn'} = B_2^n \otimes_2 B_2^{n'}$. We may verify
that $(x_i \otimes x'_j, c_i c'_j)$ is an unbiased resolution of identity for $K \hat\otimes K'$ by writing
$\sum_i \sum_j c_i c'_j\, x_i \otimes x'_j = \Big( \sum_i c_i x_i \Big) \otimes \Big( \sum_j c'_j x'_j \Big) = 0,$
$\sum_i \sum_j c_i c'_j\, |x_i \otimes x'_j\rangle\langle x_i \otimes x'_j| = \Big( \sum_i c_i |x_i\rangle\langle x_i| \Big) \otimes \Big( \sum_j c'_j |x'_j\rangle\langle x'_j| \Big) = I.$
It follows from Proposition 4.6 that $\mathrm{Löw}(K \hat\otimes K') = B_2^{nn'}$.
For the general case, let $T$ and $T'$ be linear maps such that $T\,\mathrm{Löw}(K) = B_2^n$ and
$T'\,\mathrm{Löw}(K') = B_2^{n'}$. Using the elementary identities
$\mathrm{Löw}(TK) = T\,\mathrm{Löw}(K)$ and $(T \otimes T')(K \hat\otimes K') = (TK) \hat\otimes (T'K')$,
the result follows from the previous special case. □
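The tensorization step in the proof of Lemma 4.9 can be illustrated numerically. The sketch below is an assumption-laden example, not part of the text's argument: it takes both factors to be the equilateral-triangle resolution in $\mathbb{R}^2$ with weights $2/3$, identifies $\mathbb{R}^2 \otimes \mathbb{R}^2$ with $\mathbb{R}^4$ via Kronecker products, and checks conditions (4.16) and (4.17) for the product family.

```python
import math
from itertools import product

def kron(u, v):
    """Coordinates of the elementary tensor u (x) v in the product basis."""
    return [a * b for a in u for b in v]

def frame_operator(points, weights):
    """Matrix of sum_i c_i |x_i><x_i|."""
    n = len(points[0])
    return [[sum(c * x[i] * x[j] for x, c in zip(points, weights))
             for j in range(n)] for i in range(n)]

def barycenter(points, weights):
    """Vector sum_i c_i x_i."""
    n = len(points[0])
    return [sum(c * x[i] for x, c in zip(points, weights)) for i in range(n)]

def is_unbiased_resolution(points, weights, tol=1e-12):
    """Check (4.16) and (4.17) up to a numerical tolerance."""
    n = len(points[0])
    F = frame_operator(points, weights)
    identity_ok = all(abs(F[i][j] - (i == j)) < tol
                      for i in range(n) for j in range(n))
    centered_ok = all(abs(v) < tol for v in barycenter(points, weights))
    return identity_ok and centered_ok

# Unbiased resolution of identity in R^2 (equilateral triangle, weights 2/3).
tri = [(math.cos(t), math.sin(t))
       for t in (math.pi / 2 + 2 * math.pi * k / 3 for k in range(3))]
w = [2 / 3] * 3

# Product family (x_i (x) x'_j, c_i c'_j) in R^4, as in the proof.
tensor_points = [kron(x, y) for x, y in product(tri, tri)]
tensor_weights = [c * d for c, d in product(w, w)]

print(is_unbiased_resolution(tri, w))
print(is_unbiased_resolution(tensor_points, tensor_weights))
```

Both checks succeed, and the nine product weights sum to $4 = nn'$, as the trace identity requires.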
Exercise 4.28 (Projective tensor product and the John ellipsoid). Compare
$\mathrm{John}(K \hat\otimes L)$ and $\mathrm{John}(K) \hat\otimes \mathrm{John}(L)$ when $K = L = B_2^n$ and when
$K = L = \sqrt{n}\, B_1^n$.
4.3. Classical inequalities for convex bodies

In this section we review classical inequalities involving various geometric
invariants of convex bodies, most notably the volume and the mean width. We use
the Minkowski operations defined in (4.7).
4.3.1. The Brunn–Minkowski inequality. The Brunn–Minkowski inequality
is a fundamental inequality which governs the behavior of the volume of sets
under operations related to convexity. It asserts that the volume (the Lebesgue
measure on $\mathbb{R}^n$) is log-concave with respect to Minkowski operations, in the
following sense.
Theorem 4.10 (Brunn–Minkowski, not proved here). Let $K, L \subset \mathbb{R}^n$ be Borel
sets and $\lambda \in [0, 1]$. Then
(4.21)  $\operatorname{vol}(\lambda K + (1 - \lambda) L) \ge \operatorname{vol}(K)^\lambda \operatorname{vol}(L)^{1-\lambda}.$
Another formulation of the Brunn–Minkowski inequality can be given (see
Exercise 4.30) as follows: under the same assumptions,
(4.22)  $\operatorname{vol}(K + L)^{1/n} \ge \operatorname{vol}(K)^{1/n} + \operatorname{vol}(L)^{1/n}.$
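For axis-parallel boxes the inequality (4.22) can be checked by hand, since the Minkowski sum of two such boxes is again a box whose side lengths add up. The following sketch illustrates this special case only; the function names are hypothetical.

```python
import math

def box_volume(sides):
    """Volume of an axis-parallel box with the given side lengths."""
    return math.prod(sides)

def brunn_minkowski_gap(a, b):
    """vol(K+L)^(1/n) - vol(K)^(1/n) - vol(L)^(1/n) for boxes K, L
    with side lengths a and b; (4.22) says this is nonnegative."""
    n = len(a)
    sum_sides = [s + t for s, t in zip(a, b)]  # K + L is a box with these sides
    return (box_volume(sum_sides) ** (1 / n)
            - box_volume(a) ** (1 / n)
            - box_volume(b) ** (1 / n))

print(brunn_minkowski_gap([1, 4], [4, 1]))   # strictly positive: the boxes are not homothetic
print(brunn_minkowski_gap([1, 2], [2, 4]))   # approximately 0: L = 2K, the equality case
```

In the box case, (4.22) reduces to the arithmetic-geometric mean inequality applied to the ratios $a_i/(a_i+b_i)$ and $b_i/(a_i+b_i)$.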
The Brunn–Minkowski inequality implies the famous isoperimetric inequality
in $\mathbb{R}^n$: among sets of given volume, the balls have the smallest surface area. If
$K \subset \mathbb{R}^n$ is sufficiently regular, the surface area can be defined as the first-order
variation of the volume of the “enlarged” set $K + \varepsilon B_2^n$ when $\varepsilon$ goes to 0:
(4.23)  $\operatorname{area}(K) := \lim_{\varepsilon \to 0} \frac{\operatorname{vol}(K + \varepsilon B_2^n) - \operatorname{vol}(K)}{\varepsilon}.$
Note that for a general subset $K \subset \mathbb{R}^n$, some care is needed in defining area since
the limit in (4.23) may not exist or may not coincide with other notions of surface
area. However, such problems do not arise for convex sets.

A convenient formulation of the isoperimetric inequality uses the concept of
volume radius. Given a bounded measurable $K \subset \mathbb{R}^n$, its volume radius
$\operatorname{vrad}(K)$ is defined as
(4.24)  $\operatorname{vrad}(K) := \left( \frac{\operatorname{vol}(K)}{\operatorname{vol}(B_2^n)} \right)^{1/n}.$
In words, the volume radius of $K$ is the radius of the Euclidean ball which has the
same volume as $K$. A standard computation shows that
(4.25)  $\operatorname{vol}(B_2^n) = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}.$
Notice that, as a function of $n$, $\operatorname{vol}(B_2^n)$ decreases super-exponentially fast to 0 as
$n \to \infty$. In particular, $\operatorname{vol}(B_2^n)^{1/n}$ is equivalent to $\sqrt{2\pi e/n}$ as $n$ tends to
infinity. When $K \subset \mathbb{R}^n$ is a convex body containing 0 in the interior, another useful
formula for the volume radius of $K$ (proved via integrating in spherical coordinates) is
(4.26)  $\operatorname{vrad}(K) = \left( \int_{S^{n-1}} \|\theta\|_K^{-n} \, d\sigma(\theta) \right)^{1/n}.$
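Formula (4.25) and the asymptotic equivalent $\sqrt{2\pi e/n}$ are easy to explore numerically. The sketch below uses only the Python standard library; working with `math.lgamma` (the logarithm of the Gamma function) is a choice made here to avoid overflow for large $n$, and the function names are hypothetical.

```python
import math

def log_ball_volume(n):
    """log vol(B_2^n) computed from (4.25)."""
    return (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)

def ball_volume(n):
    """vol(B_2^n)."""
    return math.exp(log_ball_volume(n))

def volume_root_ratio(n):
    """vol(B_2^n)^(1/n) divided by sqrt(2*pi*e/n); tends to 1 as n grows."""
    return math.exp(log_ball_volume(n) / n) / math.sqrt(2 * math.pi * math.e / n)

def vrad_from_volume(volume, n):
    """Volume radius (4.24) of a set of the given volume in R^n."""
    return math.exp((math.log(volume) - log_ball_volume(n)) / n)

for n in (2, 3, 10, 100, 10_000):
    print(n, ball_volume(n), volume_root_ratio(n))
```

For instance, `vrad_from_volume(2.0 ** n, n)` gives the volume radius of the cube $[-1,1]^n$, which by the asymptotics above grows like $\sqrt{2n/(\pi e)}$.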
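Here is the statement of the isoperimetric inequality in $\mathbb{R}^n$ employing the notion
of volume radius.
Proposition 4.11 (Isoperimetric inequality). Let $K \subset \mathbb{R}^n$ be bounded and
denote $r = \operatorname{vrad}(K)$. Then, for every $\varepsilon > 0$,
(4.27)  $\operatorname{vol}(K + \varepsilon B_2^n) \ge \operatorname{vol}(r B_2^n + \varepsilon B_2^n)$
or, equivalently, $\operatorname{vrad}(K + \varepsilon B_2^n) \ge \operatorname{vrad}(K) + \varepsilon$. Consequently, whenever the
limit in (4.23) exists, we have $\operatorname{area}(K) \ge \operatorname{area}(r B_2^n)$.
Proof. It follows from the Brunn–Minkowski inequality (4.22) that
$\operatorname{vol}(K + \varepsilon B_2^n)^{1/n} \ge \operatorname{vol}(K)^{1/n} + \operatorname{vol}(\varepsilon B_2^n)^{1/n}$
$\quad = (r + \varepsilon) \operatorname{vol}(B_2^n)^{1/n}$
$\quad = \operatorname{vol}(r B_2^n + \varepsilon B_2^n)^{1/n}.$ □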
Exercise 4.29 (Superadditivity of the volume radius). Show that the
Brunn–Minkowski inequality can be restated as
$\operatorname{vrad}(K + L) \ge \operatorname{vrad}(K) + \operatorname{vrad}(L)$.
Exercise 4.30 (Superadditivity and log-concavity). Show that the inequalities
(4.21) and (4.22) are formally globally equivalent.
Exercise 4.31 (Steiner-like symmetrizations). Show that the following statement
is equivalent to the Brunn–Minkowski inequality for convex bodies. Let
$K \subset \mathbb{R}^n$ be a convex body and $E \subset \mathbb{R}^n$ a k-dimensional subspace with $0 < k < n$.
Define a set $L \subset E \times E^\perp$ by the following (where $x \in E$, $y \in E^\perp$):
$(x, y) \in L \iff |x| \le \mathrm{vrad}(K \cap (E + y)),$
where the volume radius is measured in $E + y$. Then L is convex. (When E is a
hyperplane, the map $K \mapsto L$ defined above is called Steiner symmetrization.)

Exercise 4.32. Let E Ă Rm and F Ă Rn be two 0-symmetric ellipsoids. Show
the formula vradpE b2 F q “ vradpE q vradpF q.

4.3.2. log-concave measures. Closely related to the Brunn–Minkowski
inequality is the concept of a log-concave measure. In our setting, log-concave
measures appear as (limits of) marginals of uniform measures on convex sets.
Let µ be a measure on $\mathbb{R}^n$ with density f with respect to the Lebesgue measure.
We say that µ is log-concave if log f is a concave function. Similarly, given $\alpha > 0$,
we say that µ is α-concave if the function $f^\alpha$ is concave when restricted to the
support of µ. We now state basic facts about log- and α-concave measures and
relegate the proofs to exercises.


Lemma 4.12 (see Exercise 4.34). Let µ be a finite log-concave measure on $\mathbb{R}^n$.
Then there is a sequence $(\mu_s)_{s \in \mathbb{N}}$ of measures on $\mathbb{R}^n$ converging weakly to µ, and
such that $\mu_s$ is 1/s-concave.

Lemma 4.13 (see Exercise 4.35). Let µ be a measure on $\mathbb{R}^n$, and $s \in \mathbb{N}$. The
following are equivalent.
(1) The measure µ is 1/s-concave.
(2) There is a closed convex set $K \subset \mathbb{R}^n \times \mathbb{R}^s$ such that µ is the marginal over
$\mathbb{R}^s$ of the Lebesgue measure restricted to K, i.e., such that, for any Borel
set $B \subset \mathbb{R}^n$,
$\mu(B) = \mathrm{vol}_{n+s}((B \times \mathbb{R}^s) \cap K).$



As a corollary to Lemmas 4.12 and 4.13, we obtain the following characterization
of log-concave measures.

Proposition 4.14 (Characterization of log-concave measures, see Exercise
4.36). Let µ be a finite and absolutely continuous measure on $\mathbb{R}^n$. The following
are equivalent.
(1) The measure µ is log-concave.
(2) The measure µ satisfies the following analogue of (4.21): for any Borel
sets $K, L \subset \mathbb{R}^n$ and $\lambda \in [0, 1]$,
(4.28)    $\mu(\lambda K + (1 - \lambda) L) \ge \mu(K)^{\lambda} \mu(L)^{1-\lambda}.$
To summarize, log-concave measures on $\mathbb{R}^n$ are uniform measures on convex
bodies, marginals of uniform measures on convex bodies in $\mathbb{R}^N$ for $N > n$ (see
Exercise 4.33), and their limits. Archetypical examples of log-concave measures
include the standard Gaussian measure $\gamma_n$ or any Gaussian measure (see Appendix
A.2 and Notes and Remarks on Section 4.3).
Exercise 4.33 (α-concavity and log-concavity). Check that an α-concave mea-
sure is log-concave, and also β-concave for any β P p0, αs.
Exercise 4.34 (More on α-concavity vs. log-concavity). Prove Lemma 4.12.

Exercise 4.35 (α-concavity and marginals). Deduce Lemma 4.13 from the
Brunn–Minkowski inequality (4.22) applied in Rs .

Exercise 4.36 (Characterization of log-concave measures). Deduce Proposi-
tion 4.14 from Lemmas 4.12 and 4.13.

4.3.3. Mean width and the Urysohn inequality. Given a nonempty and
bounded set $K \subset \mathbb{R}^n$ and a vector $u \in \mathbb{R}^n$, we define the quantity

(4.29)    $w(K, u) := \sup_{x \in K} \langle u, x \rangle.$

In the particular case when K is a convex body containing 0 in the interior, we have
$w(K, u) = \|u\|_{K^\circ}$ (see (1.8)). If $|u| = 1$, then $w(K, u)$ is called the support function
of K in direction u. (An alternative notation for the support function, widely used
in convex geometry, is $h_K(u)$.) Geometrically, $w(K, u)$ is then the distance from
the origin to the hyperplane tangent to K in the direction u (that is, with u being
normal to the hyperplane, and outer to K). In particular, $w(K, u) + w(K, -u)$ is
the width of the smallest strip in the direction orthogonal to u which contains K (see
Figure 4.3).

Figure 4.3. If $|u| = 1$, then $w(K, u) + w(K, -u)$ is the width of
K in the direction of u.

For a nonempty bounded subset $K \subset \mathbb{R}^n$, we may define the mean width of K
as the average of $w(K, \cdot)$ over the unit sphere
(4.30)    $w(K) := \int_{S^{n-1}} w(K, u) \, d\sigma(u),$
where σ is the Lebesgue measure on the sphere, normalized so that σpS n´1 q “ 1.
Although the definition makes sense for every bounded set K, we mostly consider
the case where K is also closed and convex. This is not really a restriction since
wpK, ¨q “ wpconv K, ¨q.

From the geometric point of view, it might have been more accurate to call
$w(K)$ the mean half-width (or, as some authors do, to include an additional factor
2 in the definition; observe that $w(K)$ is half of the average of $w(K, u) + w(K, -u)$).
However, we opted for simplicity. Note that, under our convention, one
has $w(B_2^n) = 1$, and that if K is a convex body which contains the origin in the
interior, then
$w(K) = \int_{S^{n-1}} \|u\|_{K^\circ} \, d\sigma(u).$

It is often convenient to consider the Gaussian variant of the mean width.
Let G be a standard Gaussian vector in $\mathbb{R}^n$, i.e., an $\mathbb{R}^n$-valued random variable
whose coordinates in any orthonormal basis are independent and follow the $N(0, 1)$
distribution (see Appendix A). For any nonempty bounded set $K \subset \mathbb{R}^n$, we define
the Gaussian mean width of K as
(4.31)    $w_G(K) := \mathbf{E}\, w(K, G) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \sup_{x \in K} \langle u, x \rangle \exp(-|u|^2/2) \, du.$
Using (A.7), one checks that
(4.32)    $w_G(K) = \kappa_n\, w(K),$
where $\kappa_n$ depends only on n and is of order $\sqrt{n}$ (more precise estimates appear in
Proposition A.1). We take the convention that whenever we write $w(K)$ or $w_G(K)$
for a set $K \subset \mathbb{R}^n$, it is tacitly assumed that K is nonempty.
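
To get a feeling for the constant $\kappa_n$, note that (4.32) applied to $K = B_2^n$ gives $\kappa_n = w_G(B_2^n) = \mathbf{E}|G|$. The sketch below assumes the closed form $\mathbf{E}|G| = \sqrt{2}\,\Gamma((n+1)/2)/\Gamma(n/2)$, the mean of the chi distribution with n degrees of freedom; this formula is a standard fact that is not quoted from the text, and the function name is illustrative.

```python
import math


def kappa(n: int) -> float:
    """kappa_n = E|G| for a standard Gaussian vector G in R^n.

    Assumption: closed form sqrt(2) * Gamma((n+1)/2) / Gamma(n/2),
    evaluated through lgamma for numerical stability.
    """
    return math.sqrt(2) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 1000):
        # kappa_n stays below sqrt(n) and the ratio tends to 1.
        print(n, kappa(n), math.sqrt(n), kappa(n) / math.sqrt(n))
```

The output illustrates that $\kappa_n$ is of order $\sqrt{n}$, so that $w_G(K)$ and $\sqrt{n}\, w(K)$ are interchangeable up to a factor close to 1 in high dimension.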

Given bounded subsets K, L in $\mathbb{R}^n$ and a vector u, one checks that $w(K + L, u) = w(K, u) + w(L, u)$.
Integration yields
(4.33)    $w(K + L) = w(K) + w(L),$
and similarly for $w_G$. In the special case when L is a singleton, this shows that the
mean width (Gaussian or not) is translation-invariant.

An advantage of the Gaussian mean width is that it does not depend on the
ambient dimension. Indeed, suppose that K is a bounded subset in a subspace
$E \subset \mathbb{R}^n$. Then the value of $w_G(K)$ does not depend on whether it is computed in
E or in $\mathbb{R}^n$, while the value of $w(K)$ does depend.


The following result, known as the Urysohn inequality, asserts that among sets
of given volume, the mean width is minimized for Euclidean balls.
Proposition 4.15 (Urysohn’s inequality, see Exercise 4.49). Let K Ă Rn be a
bounded Borel set. Then
(4.34) vradpKq ď wpKq.
The Urysohn inequality can be seen as a consequence of the Brunn–Minkowski
inequality, see Exercise 4.49. Among closed sets, the Urysohn inequality is an
equality if and only if K is a Euclidean ball.

Define the outradius of a bounded set K Ă Rn as the smallest radius (denoted


outradpKq) of a Euclidean ball that contains K (such a ball is unique, see Exercise
4.41), and the inradius of a convex body K Ă Rn as the largest radius (denoted
inradpKq) of a Euclidean ball contained in K. (Such a ball is not necessarily unique;
however, when K is symmetric, the inradius is witnessed by Euclidean balls centered
at the origin.) We have the chain of inequalities
(4.35) inradpKq ď vradpKq ď wpKq ď outradpKq.

For a longer chain of inequalities which includes also dual quantities, see Exercise
4.51. It is instructive to compare in Table 4.1 the values of these quantities for
the most standard examples of convex bodies. For a derivation, see Exercises 4.38
and 6.6 (we postpone the nontrivial mean width computations to Chapter 6, where
they fit more naturally).
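
The chain (4.35) can be observed numerically on the square $[-1, 1]^2$. The following minimal sketch computes the mean width (4.30) by averaging the support function (4.29) over equally spaced directions on the circle; the discretization and all function names are illustrative choices, not taken from the text.

```python
import math


def support(vertices, u):
    """w(K, u) = sup over K of <u, x>; for a polygon the sup is attained at a vertex."""
    return max(u[0] * x + u[1] * y for x, y in vertices)


def mean_width(vertices, samples=3600):
    """Average of w(K, u) over equally spaced unit vectors u in the plane."""
    total = 0.0
    for j in range(samples):
        t = 2 * math.pi * j / samples
        total += support(vertices, (math.cos(t), math.sin(t)))
    return total / samples


square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
inrad = 1.0                        # largest centered disc inside [-1, 1]^2
vrad = math.sqrt(4 / math.pi)      # disc with the same area 4
w = mean_width(square)             # numerically close to 4 / pi
outrad = math.sqrt(2)              # distance from the center to a vertex

if __name__ == "__main__":
    print(inrad, vrad, w, outrad)
```

The four printed values are increasing, in agreement with (4.35), and the computed mean width matches the perimeter divided by 2π, as in Exercise 4.44.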

Table 4.1. Radii for standard convex bodies in $\mathbb{R}^n$. Quantities
in each row are non-decreasing from left to right, see (4.35) and
Exercise 4.51. The simplex K is normalized to be a regular simplex
inscribed in the Euclidean ball of radius $\sqrt{n}$ centered at the origin.
This normalization is appealing since it has the property that $K^\circ = -K$.
When compared to the simplex $\Delta_n$ as defined in Section 1.1.2,
K is congruent to $\sqrt{n+1}\, \Delta_n$.

K             inrad(K)       w(K°)^{-1}                    vrad(K)                  w(K)                           outrad(K)
$B_2^n$       1              1                             1                        1                              1
$B_1^n$       $1/\sqrt{n}$   $\sim \sqrt{\pi/2n}$          $\sim \sqrt{2e/\pi n}$   $\sim \sqrt{2\log n}/\sqrt{n}$  1
$B_\infty^n$  1              $\sim \sqrt{n}/\sqrt{2\log n}$  $\sim \sqrt{2n/\pi e}$   $\sim \sqrt{2n/\pi}$           $\sqrt{n}$
simplex       $1/\sqrt{n}$   $\sim 1/\sqrt{2\log n}$       $\sim \sqrt{e/2\pi}$     $\sim \sqrt{2\log n}$          $\sqrt{n}$

We check in Table 4.1 that for all these basic examples of convex bodies, the
volume radius and the mean width are of comparable order of magnitude, at least
up to a logarithmic factor. This cannot be true for general convex bodies (see
Exercise 4.42), but a convex body such that $\mathrm{vrad}(K)$ is much smaller than $w(K)$
has to be strongly “non-isotropic,” cf. Corollary 7.11.
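
The volume radius column of Table 4.1 can be reproduced numerically for $B_1^n$ and $B_\infty^n$. This sketch assumes the elementary volume formulas $\mathrm{vol}(B_1^n) = 2^n/n!$ and $\mathrm{vol}(B_\infty^n) = 2^n$ together with (4.25); the function names are illustrative.

```python
import math


def log_vol_euclidean_ball(n: int) -> float:
    """log vol(B_2^n), from formula (4.25)."""
    return (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)


def vrad_from_log_volume(log_vol: float, n: int) -> float:
    """Volume radius (4.24) computed from the logarithm of the volume."""
    return math.exp((log_vol - log_vol_euclidean_ball(n)) / n)


def vrad_cross_polytope(n: int) -> float:
    """vrad(B_1^n), using vol(B_1^n) = 2^n / n!."""
    return vrad_from_log_volume(n * math.log(2) - math.lgamma(n + 1), n)


def vrad_cube(n: int) -> float:
    """vrad(B_infty^n), using vol(B_infty^n) = 2^n."""
    return vrad_from_log_volume(n * math.log(2), n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              vrad_cross_polytope(n) / math.sqrt(2 * math.e / (math.pi * n)),
              vrad_cube(n) / math.sqrt(2 * n / (math.pi * math.e)))
```

Both printed ratios approach 1 as n grows, matching the entries $\sqrt{2e/\pi n}$ and $\sqrt{2n/\pi e}$ of the table.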

The Urysohn inequality has a “dual” version, which is actually easier to prove
since it depends only on the Hölder inequality.

Proposition 4.16. For every convex body $K \subset \mathbb{R}^n$ containing the origin in its
interior, we have
(4.36)    $\mathrm{vrad}(K) \ge w(K^\circ)^{-1}.$


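Proof. This follows from Hölder’s inequality
$1 = \int_{S^{n-1}} \|\theta\|_K^{\frac{n}{n+1}} \|\theta\|_K^{-\frac{n}{n+1}} \, d\sigma(\theta)$
$\le \left( \int_{S^{n-1}} \|\theta\|_K \, d\sigma(\theta) \right)^{\frac{n}{n+1}} \cdot \left( \int_{S^{n-1}} \|\theta\|_K^{-n} \, d\sigma(\theta) \right)^{\frac{1}{n+1}}$
$= \left( w(K^\circ)\, \mathrm{vrad}(K) \right)^{\frac{n}{n+1}},$
where we used formula (4.26) to compute the volume radius. □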

Exercise 4.37 (The mean width of the polar). Let K Ă Rn be a convex body.
Show that wpKqwpK ˝ q ě 1.
Exercise 4.38. Derive the estimates about inradius, volume radius and out-
radius in Table 4.1. For the mean width, see Exercise 6.6.
Exercise 4.39 (Rough bounds on volume radius of $B_p^n$). Use the inequalities
(1.4) between $\ell_p$-norms and the information on volume radii from Table 4.1 (or
direct calculations) to conclude that $\mathrm{vrad}(B_p^n) \simeq n^{1/2 - 1/p}$ for $1 \le p \le \infty$.

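Exercise 4.40 (Volume of $B_p^n$). Let $1 \le p \le \infty$. By calculating $\int_{\mathbb{R}^n} e^{-\|x\|_p^p} \, dx$
in two different ways, show that $\mathrm{vol}(B_p^n) = \left( 2\Gamma(1 + \frac{1}{p}) \right)^n / \Gamma(1 + \frac{n}{p})$. Deduce that,
for large n, $\mathrm{vrad}(B_p^n) \sim 2\Gamma(1 + \frac{1}{p}) (pe)^{1/p} n^{1/2 - 1/p} / \sqrt{2\pi e}$ (the normalization by $\sqrt{2\pi e}$ is the one consistent with (4.25) and with Table 4.1).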

Exercise 4.41 (Uniqueness of outradius witness). Show that there is a unique
Euclidean ball of minimal radius containing a given set K Ă Rn .

Exercise 4.42 (The gap in Urysohn’s inequality). Give examples of convex
bodies $K \subset \mathbb{R}^2$ such that the ratio $w(K)/\mathrm{vrad}(K)$ is arbitrarily large.

Exercise 4.43 (The mean width and the diameter). Show that for a convex
body $K \subset \mathbb{R}^n$, $w(K) \ge \frac{1}{2} \frac{\kappa_1}{\kappa_n} \operatorname{diam} K$.
Exercise 4.44 (The mean width and the perimeter). For a convex body $K \subset \mathbb{R}^2$,
show that $w(K)$ is equal to $(2\pi)^{-1}$ times the perimeter of K. For convex planar
sets, the Urysohn inequality is therefore equivalent to the isoperimetric inequality.

Exercise 4.45 (The mean width of a projection). Let $K \subset \mathbb{R}^n$ be bounded and
$P_E$ be the orthogonal projection onto a subspace $E \subset \mathbb{R}^n$. Show that $w_G(P_E K) \le w_G(K)$.

Exercise 4.46 (The mean width of an affine contraction). Let A : Rn Ñ Rn


be an affine contraction (i.e., such that |Ax ´ Ay| ď |x ´ y| for every x, y P Rn ).
Show that for every bounded set K Ă Rn , we have wG pAKq ď wG pKq.

Exercise 4.47 (The mean width of a union). If K, L are convex bodies in $\mathbb{R}^n$
with $K \cap L \neq \emptyset$, then $w(K \cup L) \le w(K) + w(L)$. For an improvement on this, see
Exercise 5.28.

Exercise 4.48 (Geometric mean width). Prove the following strengthening
of the inequality from Proposition 4.16: $\exp\left( \int_{S^{n-1}} \log \|\theta\|_K \, d\sigma(\theta) \right) \ge \mathrm{vrad}(K)^{-1}$.
In other words, the “geometric mean” of $\|\cdot\|_K$ is at least as large as $\mathrm{vrad}(K)^{-1}$,
while inequality (4.36) asserts the same only about the “arithmetic mean”
$w(K^\circ) = \int_{S^{n-1}} \|\theta\|_K \, d\sigma(\theta)$.
Exercise 4.49 (A proof of Urysohn’s inequality). (i) Explain in which sense
the following generalization of the Brunn–Minkowski inequality holds and prove it: if $(\Omega, \mathcal{F}, \mu)$
is a measure space and $K_t \subset \mathbb{R}^n$ a convex body depending in a measurable way on
a parameter $t \in \Omega$, then
(4.37)    $\int_\Omega \mathrm{vol}(K_t)^{1/n} \, d\mu(t) \le \left( \mathrm{vol}\left( \int_\Omega K_t \, d\mu(t) \right) \right)^{1/n}.$
(ii) Fix a convex body $K \subset \mathbb{R}^n$. By choosing $(\Omega, \mu)$ to be the orthogonal group
$O(n)$ equipped with the Haar measure, and $K_t = t(K)$ for $t \in O(n)$, prove (4.34).

4.3.4. The Santaló and the reverse Santaló inequalities. When dealing
with convex bodies, it is often convenient to consider the dual picture, involving the
polar bodies. It turns out that the volume is especially well behaved with respect
to the polar operation. This is the content of the Santaló and reverse Santaló
inequalities.
Theorem 4.17 (Santaló and reverse Santaló inequalities, not proved here, but
see Exercise 7.33). There is a constant c ą 0 such that the following holds: for any
n P N and for any symmetric convex body K Ă Rn , we have

(4.38) c ď vradpKq vradpK ˝ q ď 1.

For a non-symmetric convex body $K \subset \mathbb{R}^n$, the product $\mathrm{vrad}(K)\, \mathrm{vrad}(K^\circ)$ may
be arbitrarily large (and even infinite, if 0 belongs to the boundary of K). The correct
version of the theorem in that context is as follows: any convex body $K \subset \mathbb{R}^n$ can
be translated so that (4.38) holds. Moreover, it is known (see Proposition D.2 in
Appendix D) that among the translates of K, the minimum of the volume of the
polar (and hence of the product of the volume radii) occurs when the polar has
centroid at 0. Such a point is unique and called the Santaló point of K.
The upper bound in (4.38) is also known as the Blaschke–Santaló inequality
and can be proved through a symmetrization procedure. Note that a 0-symmetric
ellipsoid $\mathcal{E} \subset \mathbb{R}^n$ satisfies $\mathrm{vrad}(\mathcal{E})\, \mathrm{vrad}(\mathcal{E}^\circ) = 1$ and no other bodies saturate the
upper bound. Concerning the lower bound, the best constants to date are $c = 1/2$
in the symmetric case and $c = 1/4$ in the general case (cf. Exercise 4.57).
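
As an illustration of (4.38), one can evaluate the product $\mathrm{vrad}(K)\, \mathrm{vrad}(K^\circ)$ for the dual pair $K = B_1^n$, $K^\circ = B_\infty^n$. The sketch below assumes the volume formulas $\mathrm{vol}(B_1^n) = 2^n/n!$ and $\mathrm{vol}(B_\infty^n) = 2^n$ together with (4.25); the function names are hypothetical.

```python
import math


def log_vol_euclidean_ball(n: int) -> float:
    """log vol(B_2^n), from formula (4.25)."""
    return (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)


def santalo_product_cross_polytope(n: int) -> float:
    """vrad(B_1^n) * vrad(B_infty^n), the product in (4.38) for K = B_1^n."""
    log_vol_b1 = n * math.log(2) - math.lgamma(n + 1)   # vol(B_1^n) = 2^n / n!
    log_vol_cube = n * math.log(2)                      # vol(B_infty^n) = 2^n
    return math.exp((log_vol_b1 + log_vol_cube - 2 * log_vol_euclidean_ball(n)) / n)


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100, 1000):
        # Values lie between 1/2 and 1 and drift towards 2/pi.
        print(n, santalo_product_cross_polytope(n))
```

The product equals 1 for n = 1 and decreases towards 2/π ≈ 0.6366, which stays above the constant c = 1/2 quoted for the symmetric case.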

Exercise 4.50 (Santaló implies Urysohn). Using the Santaló inequality, deduce
the Urysohn inequality (4.34) from its dual version (Proposition 4.16).

Exercise 4.51 (Inequalities between various radii). Show that if $K \subset \mathbb{R}^n$ is a
symmetric convex body, then
$\mathrm{inrad}(K) \le w(K^\circ)^{-1} \le \mathrm{vrad}(K) \le \mathrm{vrad}(K^\circ)^{-1} \le w(K) \le \mathrm{outrad}(K).$
Show that these inequalities also hold if K is a convex body such that the only
fixed point of $\mathrm{Iso}(K)$ is 0.



Exercise 4.52 (Minimizers in the reverse Santaló inequality). Show that we
have $\mathrm{vrad}(K)\, \mathrm{vrad}(K^\circ) = \mathrm{vrad}(B_1^6)\, \mathrm{vrad}(B_\infty^6)$ when $K = B_1^3 \times B_1^3 \subset \mathbb{R}^6$. This
exemplifies non-uniqueness of the conjectured extremal case in the reverse Santaló
inequality, or (the symmetric version of) the Mahler conjecture (see Notes and
Remarks).

4.3.5. Symmetrization inequalities. We described in Section 4.1.2 several
natural ways to construct a symmetric convex body associated to a given
(non-symmetric) convex body. In each case, it is possible to control the volume of the
symmetric body in terms of the volume of the initial body.
4.3.5.1. Milman–Pajor inequality.
Proposition 4.18. Let K, L be two convex bodies in Rn with the same centroid.
We have
volpKq volpLq ď volpK X Lq volpK ´ Lq.
In particular, if K Ă Rn is a convex body with centroid at the origin, then
(4.39) volpKX q ě 2´n volpKq

Recall that $K_\cap = (-K) \cap K$. The factor $2^{-n}$ may appear small, but remember
that it is the n-th root of the volume that is the relevant quantity. In particular, in
terms of volume radii, the conclusion of the second part of Proposition 4.18 simply
becomes $\mathrm{vrad}(K_\cap) \ge \frac{1}{2} \mathrm{vrad}(K)$. It is natural to conjecture that among convex
bodies of fixed volume with centroid at the origin, the volume of $K_\cap$ is minimized
when K is a simplex. This would lead to a constant $(2/e + o(1))^n$ instead of $2^{-n}$
in (4.39).
To prove Proposition 4.18, we use the following lemma (which is much simpler
to prove for symmetric convex bodies, see Exercise 4.53).
Lemma 4.19 (Spingarn inequality). Let $K \subset \mathbb{R}^n$ be a convex body with centroid
at the origin. If $E \subset \mathbb{R}^n$ is a (vector) subspace and $F = E^\perp$, we have the inequality
$\mathrm{vol}(K) \le \mathrm{vol}_E(K \cap E)\, \mathrm{vol}_F(P_F K).$
Recall that $\mathrm{vol}_H$ refers to the Lebesgue measure on an affine subspace $H \subset \mathbb{R}^n$.

Proof of Lemma 4.19. Define a function $\Phi : P_F K \to \mathbb{R}_+$ by
$\Phi(x) = \mathrm{vol}_{E+x}(K \cap (E + x))^{1/k},$
where $k = \dim E$. The Brunn–Minkowski inequality (4.22) implies that the function
Φ is concave (see Exercise 4.31). Since concave functions can be realized as minima
of affine functions, there exists a $y \in F$ such that for any $x \in P_F K$,
(4.40)    $\Phi(x) \le \langle x, y \rangle + \Phi(0).$

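By the Fubini–Tonelli theorem and the Hölder inequality, we have
(4.41)    $\mathrm{vol}(K) = \int_{P_F K} \Phi(x)^k \, dx \le \mathrm{vol}_F(P_F K)^{\frac{1}{k+1}} \left( \int_{P_F K} \Phi(x)^{k+1} \, dx \right)^{\frac{k}{k+1}}.$
Next, by (4.40),
(4.42)    $\int_{P_F K} \Phi(x)^{k+1} \, dx \le \int_{P_F K} \Phi(x)^k \left( \langle x, y \rangle + \Phi(0) \right) dx.$
Since 0 is the centroid of K, we have $\int_{P_F K} \Phi(x)^k \langle x, y \rangle \, dx = 0$. Consequently,
combining (4.41) and (4.42) we are led to
$\mathrm{vol}(K) \le \mathrm{vol}_F(P_F K)^{\frac{1}{k+1}}\, \Phi(0)^{\frac{k}{k+1}}\, \mathrm{vol}(K)^{\frac{k}{k+1}}.$
Since $\Phi(0)^k = \mathrm{vol}_E(K \cap E)$, the inequality follows. □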



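Proof of Proposition 4.18. We may assume, by translating them if necessary,
that K and L have centroid at the origin. We apply Lemma 4.19 to the
convex body $K \times L \subset \mathbb{R}^n \times \mathbb{R}^n \simeq \mathbb{R}^{2n}$ and to the subspaces $E = \{(x, x) : x \in \mathbb{R}^n\}$
and $F = \{(x, -x) : x \in \mathbb{R}^n\}$. We note that $\mathrm{vol}_{2n}(K \times L) = \mathrm{vol}_n(K)\, \mathrm{vol}_n(L)$,
$\mathrm{vol}_n((K \times L) \cap E) = 2^{n/2}\, \mathrm{vol}_n(K \cap L)$ and $\mathrm{vol}_n(P_F (K \times L)) = 2^{-n/2}\, \mathrm{vol}_n(K - L)$. The
conclusion follows. □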

Exercise 4.53 (Spingarn inequality for symmetric bodies). Why is Lemma


4.19 very simple to prove when K is centrally symmetric?

4.3.5.2. Rogers–Shephard inequalities. There is a converse to Lemma 4.19 which
is simpler since it does not require any hypothesis on the centroid.
Lemma 4.20. Let $K \subset \mathbb{R}^n$ be a convex body. If $E \subset \mathbb{R}^n$ is an affine subspace
of dimension k and $F = E^\perp$, we have the inequality
$\mathrm{vol}(K) \ge \binom{n}{k}^{-1} \mathrm{vol}_E(K \cap E)\, \mathrm{vol}_F(P_F K).$
Proof. Let $\Phi : P_F K \to \mathbb{R}_+$ be as in the proof of Lemma 4.19. The function Φ is
concave and vanishes on the boundary of $P_F K$, therefore, for any $x \in P_F K$,
$\Phi(x) \ge \Phi(0)\left( 1 - \|x\|_{P_F K} \right).$
It follows that
$\mathrm{vol}(K) = \int_{P_F K} \Phi(x)^k \, dx \ge \mathrm{vol}_E(K \cap E) \int_{P_F K} \left( 1 - \|x\|_{P_F K} \right)^k dx$
and the last integral reduces to a Beta integral and equals $\mathrm{vol}_F(P_F K) \binom{n}{k}^{-1}$. □

Lemma 4.20 implies a series of inequalities, all due to Rogers and Shephard,
stating that the simplex is the convex body for which the volume increase is the
largest after symmetrization. Their proofs are relegated to exercises.
Theorem 4.21 (see Exercise 4.54). If $K \subset \mathbb{R}^n$ is a convex body,
(4.43)    $\mathrm{vol}(K) \le \mathrm{vol}((K - K)/2) \le 2^{-n} \binom{2n}{n} \mathrm{vol}(K).$
As a consequence
(4.44)    $\mathrm{vrad}(K) \le \mathrm{vrad}((K - K)/2) \le 2\, \mathrm{vrad}(K).$
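
Both extremes in (4.43) can be observed in the plane. The following minimal sketch computes the difference body $K - K$ of a convex polygon as the convex hull of all differences of vertices, and compares areas for a triangle (where the upper bound is attained) and a square (where the lower bound is attained). The convex hull routine is a standard monotone chain implementation; all function names are illustrative.

```python
import math


def cross(o, a, b):
    """Signed area of the parallelogram spanned by a - o and b - o."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone chain convex hull, vertices returned in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for a polygon given by its ordered vertices."""
    s = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2


def symmetrization_ratio(vertices):
    """vol((K - K)/2) / vol(K) for a convex polygon K in the plane."""
    differences = [(a[0] - b[0], a[1] - b[1]) for a in vertices for b in vertices]
    area_difference_body = polygon_area(convex_hull(differences))
    return (area_difference_body / 4) / polygon_area(convex_hull(vertices))


triangle = [(0, 0), (1, 0), (0, 1)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

if __name__ == "__main__":
    print(symmetrization_ratio(triangle), math.comb(4, 2) / 2 ** 2)  # upper bound in (4.43), n = 2
    print(symmetrization_ratio(square))                              # lower bound in (4.43)
```

For the triangle the ratio equals $2^{-2}\binom{4}{2} = 3/2$, while for the square (which is symmetric up to translation) it equals 1.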



Theorem 4.22 (see Exercise 4.55). Let H be an affine hyperplane in $\mathbb{R}^{n+1}$,
not containing the origin, and $h > 0$ be the distance between H and the origin. Let
K be a convex body in H. We have the following inequalities
(4.45)    $2h\, \mathrm{vol}_H(K) \le \mathrm{vol}_{n+1}(K_\cup) \le \frac{2^n}{n+1}\, 2h\, \mathrm{vol}_H(K).$
If $0 \in K$, then $K_\cup \subset K - K$ and so, by (4.43), $\mathrm{vol}(K_\cup) \le \binom{2n}{n} \mathrm{vol}(K) \le 4^n\, \mathrm{vol}(K)$.
However, the constant 4 can be improved to the optimal value of 2.

Theorem 4.23 (see Exercise 4.56). If $K \subset \mathbb{R}^n$ is a convex body with $0 \in K$,
then
$\mathrm{vol}(K_\cup) \le 2^n\, \mathrm{vol}(K).$

Exercise 4.54. Deduce Theorem 4.21 from Lemma 4.20.


Exercise 4.55. Deduce Theorem 4.22 from Lemma 4.20.
Exercise 4.56. Deduce Theorem 4.23 from Theorem 4.22 and Lemma 4.20.
Exercise 4.57 (Symmetric vs. non-symmetric reverse Santaló inequality).
Show that whenever the reverse Santaló inequality (the lower bound in Theorem
4.17) holds with a constant c ą 0 for symmetric convex bodies, it holds with
constant c{2 for all convex bodies.

4.3.6. Functional inequalities. Most classical inequalities for convex bodies


described in this section admit functional variants. As an example, we will state
the Prékopa–Leindler inequality, which is a generalization of the Brunn–Minkowski
inequality.
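Theorem 4.24 (Prékopa–Leindler inequality, not proved here, but see Exercise
4.58). Let $\lambda \in (0, 1)$ and let f, g, h be nonnegative integrable functions on $\mathbb{R}^n$ such
that
(4.46)    $h(\lambda x + (1 - \lambda) y) \ge f(x)^{\lambda} g(y)^{1 - \lambda}$
for all $x, y \in \mathbb{R}^n$. Then
(4.47)    $\int_{\mathbb{R}^n} h(x) \, dx \ge \left( \int_{\mathbb{R}^n} f(x) \, dx \right)^{\lambda} \left( \int_{\mathbb{R}^n} g(x) \, dx \right)^{1 - \lambda}.$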
The Brunn–Minkowski inequality in the form (4.21) follows immediately from
Theorem 4.24 applied with $f = \mathbf{1}_K$, $g = \mathbf{1}_L$, and $h = \mathbf{1}_{\lambda K + (1 - \lambda) L}$ (the indicator
functions of K, L, and $\lambda K + (1 - \lambda) L$). See Notes and Remarks for pointers to
other functional inequalities.

Exercise 4.58. Using induction on the dimension, derive the general Prékopa–
Leindler inequality from the case n “ 1.
4.4. Volume of central sections and the isotropic position
Let $K \subset \mathbb{R}^n$ be a convex body with centroid at the origin. The inertia matrix
of K is defined as
$I_K = \frac{1}{\mathrm{vol}\, K} \int_K |x\rangle\langle x| \, dx.$
Note that $I_K$ is invertible (because it is positive definite). One says that K is
isotropic (or is in the isotropic position) if $I_K$ is a multiple of the identity.

If $T \in GL(n, \mathbb{R})$, one checks that $I_{TK} = T I_K T^\dagger$. It follows that any convex
body with centroid at the origin has a linear image which is isotropic. Moreover,
this position is unique in the following sense: if both K and TK are isotropic for
some $T \in GL(n, \mathbb{R})$, then T is a multiple of an orthogonal matrix. In particular, we
have the following.


Proposition 4.25 (easy). Convex bodies with enough symmetries are isotropic.

Isotropic convex bodies have the remarkable property that all their central
hyperplane sections have comparable volumes.


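Proposition 4.26 (see Exercise 4.59). Let $K \subset \mathbb{R}^n$ be a convex body with
centroid at the origin, and assume that $I_K = \lambda^2 I$ for some $\lambda > 0$. Then, for any
linear hyperplane $H \subset \mathbb{R}^n$,
(4.48)    $c\, \frac{\mathrm{vol}_n(K)}{\lambda} \le \mathrm{vol}_{n-1}(K \cap H) \le C\, \frac{\mathrm{vol}_n(K)}{\lambda},$
where $c = \frac{1}{2\sqrt{3}}$ and $C = \frac{1}{\sqrt{2}}$.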
A very important open problem is how the two parameters λ and voln pKq
appearing in (4.48) are related. The hyperplane conjecture postulates that, for every
convex body K with voln pKq “ 1 and IK “ λ2 I, we have λ ď C0 for an absolute
constant C0 ; see Notes and Remarks for more background on this conjecture.
For some special bodies much more precise estimates are available.

Proposition 4.27 (Sections of the cube, not proved here). Let H be a
k-codimensional vector subspace of $\mathbb{R}^n$. Then
(4.49)    $1 \le \mathrm{vol}_{n-k}\left( \tfrac{1}{2} B_\infty^n \cap H \right) \le 2^{k/2}.$
We conclude the section by presenting a statement in the spirit of Proposition
4.26 for the volume radius. Since the volume radius is a more robust parameter
than the volume itself, it allows one to infer in many situations (including non-isotropic
convex bodies) that the volume radius of a convex set is comparable to the volume
radius of sections through its centroid. (The reader who wonders why such
relationships may be relevant in the context of this book may check Section 9.3.)

Proposition 4.28. Let K be an n-dimensional convex body with centroid at a,
and let H be a k-codimensional affine subspace passing through a. Denote $\theta = k/n$
and let r and R be the inradius and outradius of K with respect to a. Then
(4.50)    $R^{-\theta}\, b(n, k) \le \frac{\mathrm{vrad}(K \cap H)^{1 - \theta}}{\mathrm{vrad}(K)} \le r^{-\theta}\, b(n, k) \binom{n}{k}^{1/n},$
where
(4.51)    $b(n, k) := \left( \frac{\mathrm{vol}_n(B_2^n)}{\mathrm{vol}_k(B_2^k)\, \mathrm{vol}_{n-k}(B_2^{n-k})} \right)^{1/n} = \left( \frac{\Gamma(\frac{k}{2} + 1)\, \Gamma(\frac{n-k}{2} + 1)}{\Gamma(\frac{n}{2} + 1)} \right)^{1/n}.$
Proof. We may assume that $a = 0$ (otherwise consider $K - a$). By hypothesis,
we have then
(4.52)    $r B_2^n \subset K \subset R B_2^n,$
where $B_2^n$ is the n-dimensional unit Euclidean ball. For a subspace E, denote by
$P_E$ the orthogonal projection onto E. Then, by Lemma 4.19,
(4.53)    $\mathrm{vol}_n(K) \le \mathrm{vol}_s(K \cap H)\, \mathrm{vol}_k(P_{H^\perp} K),$
where $H^\perp$ is the k-dimensional space orthogonal to H and $s = n - k$. Therefore
$\frac{\mathrm{vol}_n(K)}{\mathrm{vol}_n(B_2^n)} \le \frac{\mathrm{vol}_s(K \cap H)}{\mathrm{vol}_s(B_2^s)} \cdot \frac{\mathrm{vol}_k(P_{H^\perp} K)}{\mathrm{vol}_k(B_2^k)} \cdot \frac{\mathrm{vol}_s(B_2^s)\, \mathrm{vol}_k(B_2^k)}{\mathrm{vol}_n(B_2^n)}.$
Hence, using (4.52),
$\mathrm{vrad}(K)^n \le \mathrm{vrad}(K \cap H)^s\, R^k\, \frac{\mathrm{vol}_s(B_2^s)\, \mathrm{vol}_k(B_2^k)}{\mathrm{vol}_n(B_2^n)},$
which is the first inequality in (4.50). For the second inequality, we note that by
Lemma 4.20, which does not even require that H passes through the centroid of K,
(4.54)    $\mathrm{vol}_n(K) \ge \binom{n}{k}^{-1} \mathrm{vol}_s(K \cap H)\, \mathrm{vol}_k(P_{H^\perp} K).$
As earlier, this can be rewritten in terms of volume radii as
$\mathrm{vrad}(K)^n \ge \binom{n}{k}^{-1} \mathrm{vrad}(K \cap H)^s\, r^k\, \frac{\mathrm{vol}_s(B_2^s)\, \mathrm{vol}_k(B_2^k)}{\mathrm{vol}_n(B_2^n)},$
which is the second inequality in (4.50). □

Remark 4.29. Although the argument that led to bounds (4.50) looks rough,
we note that we always have (see Exercise 4.60)
(4.55)    $\frac{1}{\sqrt{2}} < b(n, k) < 1 < b(n, k) \binom{n}{k}^{1/n} < \sqrt{2}.$
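
The bounds (4.55) are easy to test numerically from the Gamma-function expression in (4.51). The sketch below is only a numerical sanity check for small dimensions, not a proof; the function names and the range of (n, k) are illustrative choices.

```python
import math


def b(n: int, k: int) -> float:
    """The constant b(n, k) from (4.51), evaluated via log-Gamma."""
    log_b = (math.lgamma(k / 2 + 1) + math.lgamma((n - k) / 2 + 1)
             - math.lgamma(n / 2 + 1)) / n
    return math.exp(log_b)


def b_times_binomial_root(n: int, k: int) -> float:
    """The quantity b(n, k) * binom(n, k)^(1/n) from the right side of (4.50)."""
    return b(n, k) * math.comb(n, k) ** (1 / n)


def bounds_hold(n: int, k: int) -> bool:
    """Check the chain (4.55) for a given pair 0 < k < n."""
    return (1 / math.sqrt(2) < b(n, k) < 1
            < b_times_binomial_root(n, k) < math.sqrt(2))


if __name__ == "__main__":
    print(all(bounds_hold(n, k) for n in range(2, 61) for k in range(1, n)))
    print(b(2, 1), b_times_binomial_root(2, 1))
```

In particular, the two sides of (4.50) differ by a factor of at most 2 coming from these constants, the remaining discrepancy being governed by $(R/r)^{\theta}$.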
Exercise 4.59 (Isotropic position and central sections). (i) Let $f : \mathbb{R} \to \mathbb{R}_+$
be an even function such that log f is concave and $\int f(x) \, dx = 1$. Show that
$\frac{1}{12 f(0)^2} \le \int x^2 f(x) \, dx \le \frac{1}{2 f(0)^2}$. (This conclusion also holds if the assumption “f is even” is
replaced by “$\int x f(x) \, dx = 0$,” but the proof is more involved, see [Fra99].)
(ii) Use (i) to prove Proposition 4.26.

Exercise 4.60. Prove the bounds (4.55).

Notes and Remarks

A comprehensive reference for geometry and for convex bodies focusing on the
issues related to the Brunn–Minkowski inequality is the book [Sch14].

Section 4.1. The Banach–Mazur distance is most frequently defined in the
category of normed spaces with
$d(X, Y) := \inf\{ \|T\| \cdot \|T^{-1}\| : T : X \to Y \text{ an isomorphism} \}.$
This corresponds to definition (4.2) with K, L being 0-symmetric (and, consequently,
$a = b = 0$).

It is shown in [GLMP04] that $d_{BM}(K, \Delta_n) \le n$ for every convex body $K \subset \mathbb{R}^n$.
This was known to Grünbaum for $n = 2$. It would be nice to have a simple
proof for $n > 2$ (cf. Exercise 4.2).
ly.

The question of computing the diameter of (various versions of) the Banach–Mazur
compactum has attracted a lot of attention. It follows from Exercise 4.20
that the diameter is at most n. In an important and short paper [Glu81], Gluskin
showed that this estimate is asymptotically sharp via the probabilistic method.
A variant of his argument shows that if we denote by $K_n, K_n'$ two randomly and
independently chosen n-dimensional sections of the 3n-dimensional cube, then with
large probability $d_{BM}(K_n, K_n') \gtrsim n$. Remarkably, no explicit example of a pair
of convex bodies more than $C\sqrt{n}$ apart is known. It is proved in [Sza90] that
$d_{BM}(K_n, B_\infty^n) \gtrsim \sqrt{n} \log n$ for some randomly constructed $K_n$.
In the non-symmetric case, the order of growth of the diameter of the Banach–Mazur
compactum is not known, and determining it is an important open problem.
It is clearly $\Omega(n)$, and we do not know whether this inequality is strict. Conversely,
an upper bound of $C n^{4/3} \log^C n$ was shown in [Rud00], which improves on the
trivial bound $O(n^2)$ (see also [BLPS99]).


For more information and references on the Banach–Mazur distance and the
Banach–Mazur compactum see the website [@3].
For more information on zonotopes and zonoids, we refer to the surveys [SW83,
GW93].
We also point out that while the definition of the projective tensor product
appears to be well-adapted to 0-symmetric sets and cones, with linear maps as
morphisms, the projective tensor product is not invariant under affine maps. We
refer to [Sve81], Chapter 2, for a discussion of related categorical issues and to
[DF93] for exhaustive treatment of tensor products of normed spaces.

The result from Exercise 4.17 appears in [Ceı̆76].


Note that, in general, the Minkowski sum of Borel sets does not need to be
Borel [ES70]. However, it is always measurable [Kec95]. The Minkowski sum also
behaves strangely with respect to smoothness: for example the Minkowski sum of
two planar convex bodies with real-analytic boundary is always of class $C^6$ but
possibly not of class $C^7$ [Kis87]. (See also [Bom90b, Bom90a].)

Section 4.2. John’s theorem was first proved (in a slightly different form) in
[Joh48]. We refer to [Bal97] for a modern proof (arguments already appeared
in [Bal92a]) and to [Hen12] for historical aspects. The reduction of the general
setting to the symmetric case presented here (Proposition 4.4, and the proofs of
Propositions 4.6 and 4.7) appears to be new.
The concept of convex bodies with “enough symmetries” was defined in [GG71];
see also Chapter 16 in [TJ89].
The affinity between projective tensor products and Löwner ellipsoids (Lemma
4.9) was noted in [Sza05, AS06].

Section 4.3. The Brunn–Minkowski inequality (4.22) was first proved in
dimensions 2 and 3 by Brunn and extended by Minkowski to higher dimensions. The
equality case is known: when K, L are convex bodies and $0 < \lambda < 1$, the inequality
(4.21) is an equality if and only if K and L are homothetic. The equality case was
extended by Lusternik to the general case and is essentially the same up to null sets; for
precise statements, and for a panorama of inequalities connected to the isoperimetric
inequalities, we refer to the survey [Gar02]. Far-reaching generalizations of the
Brunn–Minkowski inequality are the Alexandrov–Fenchel inequalities, for which we
refer to [Sch14].

The two sides of the inequality (4.22) can be very different; for example, if
K and L are perpendicular segments in $\mathbb{R}^2$ (hence of volume 0), $K + L$ is a rectangle,
and this behavior can be approximated in the category of convex bodies
by replacing segments with narrow rectangles. It is therefore surprising that the
Brunn–Minkowski inequality admits, after some tweaking, a reverse: any two
n-dimensional convex bodies have affine images (of the same dimension), for which
(4.22) can be reversed, up to a universal constant (see (7.32) in Notes and Remarks
on Section 7.2). A vaguely similar reverse of Urysohn inequality (4.34) can be found
in Chapter 7 (Corollary 7.11).


Another variant of (4.22) that has information-theoretic links is the restricted
Brunn–Minkowski inequality [SV96, SV00]. It asserts that when $K, L \subset \mathbb{R}^n$ satisfy
some minimal non-degeneracy assumptions and $\Theta \subset K \times L \subset \mathbb{R}^{2n}$ is not too small
(e.g., $\mathrm{vol}_{2n}(\Theta) \ge c\, \mathrm{vol}_n(K)\, \mathrm{vol}_n(L)$ for appropriate universal constant $c \in (0, 1)$),
then $\mathrm{vol}(K +_\Theta L)^{2/n} \ge \mathrm{vol}(K)^{2/n} + \mathrm{vol}(L)^{2/n}$, where $K +_\Theta L := \{x + y : (x, y) \in \Theta\}$
is the restricted (to Θ) Minkowski sum.
The characterization of log-concave measures (Proposition 4.14) holds without
the absolute continuity assumption: by a result of Borell [Bor75a], any Radon
measure on Rn which satisfies part (2) of Proposition 4.14 necessarily has a density
with respect to the Lebesgue measure on some affine subspace, and this density is
a log-concave function.
The upper bound (known as the Blaschke–Santaló inequality) in Theorem 4.17
was proved by Blaschke in dimensions 2 and 3 and by Santaló in any dimension. The

first proof of the lower bound is due to Bourgain and Milman [BM87]. Other—
quite different—proofs were given later by Kuperberg [Kup08] (which gives the
values of c quoted in the text) and Nazarov [Naz12] (we recommend the notes
[RZ14] for a detailed presentation of Nazarov’s argument). However, no elementary
proof is known (a simple argument giving a lower bound vradpKq vradpK ˝ q Á
1{ log n appears in [Kup92]).
It is conjectured that the product $\mathrm{vrad}(K)\, \mathrm{vrad}(K^\circ)$ in (4.38) is minimized for
the pair $(B_1^n, B_\infty^n)$ (and for the family of Hanner polytopes, defined as the smallest
class of polytopes containing $[-1, 1]$ and stable under the operations $K \mapsto K^\circ$ and
$(K, L) \mapsto K \times L$; cf. Exercise 4.52) and, in the non-symmetric case, for $K = \Delta_n$
(the minimum being then conjectured to be unique). This is the content of the
so-called Mahler conjecture.

Several inequalities, for which the Euclidean ball is the extremal case, such as
the isoperimetric inequality, the Urysohn inequality and the Santaló inequality (the
upper bound in (4.38)), can be proved using symmetrizations. For example one may
consider the Steiner symmetrizations as defined in Exercise 4.31. A useful result
is then the fact that, given any convex body $K \subset \mathbb{R}^n$, there is a choice of successive
Steiner symmetrizations that converge to a Euclidean ball of radius $\mathrm{vrad}(K)$ (see,
e.g., Theorem 1.1.16 in [AAGM15] for a sketch of proof).
Proposition 4.18 appears in [MP00] and Lemma 4.19 in [Spi93]. Lemma 4.20
is from [RS58]; a simpler proof can be found in [Cha67].
Theorem 4.24 was shown in [Lei72] and [Pré71, Pré73], see also [BL75,
BL76]. A complete compact proof can be found in [AAGM15] or [Gar02], the
latter of which also sketches historical background and contains many further ref-
erences.
Other functional versions of inequalities presented in this section include analogues
of the Santaló inequality that can be traced to K. Ball’s Ph.D. thesis [Bal86]
(see also [AAKM04]), and of its reverse [KM05]; see also [AAS15] and [CFG+16]
for more recent contributions and references.
Functional versions of Rogers–Shephard inequalities were considered starting
from [Col06], see also [AGMJV16].

Section 4.4. A very complete reference about the geometry of convex bodies in
isotropic position (including the most recent developments) is the book [BGVV14].
Proposition 4.26 was proved by Hensley [Hen80] for symmetric convex bodies and
the symmetry assumption was removed in [Fra99].
The hyperplane conjecture (also known as the “slicing problem”) asserts that
any convex body of volume 1 in $\mathbb{R}^n$ admits a hyperplane section of volume larger
than $c_0$, for some absolute constant $c_0 > 0$. This is equivalent to the statement
mentioned in the text: if an isotropic convex body $K$ satisfies $\mathrm{vol}(K) = 1$ and
$I_K = \lambda^2 I$, does $\lambda \le C_0$ for some absolute constant $C_0$? (It is even conceivable that
the above are true with $c_0 = C_0 = 1$.) The answer is known to be positive for many
natural classes of bodies; of those that are particularly relevant to the subject of
this book we mention unit balls in Schatten p-norms, see [KMP98]. However, the
best known estimate in the general case is only $\lambda = O(n^{1/4})$ [Kla06]; we refer to
[BGVV14] for more references and an extensive discussion of related questions.
The hyperplane conjecture can be seen as an isomorphic version of the classical
(now fully solved) Busemann–Petty problem, which asks the following: if two symmetric
convex bodies $K, L \subset \mathbb{R}^n$ satisfy $\mathrm{vol}_{n-1}(K \cap H) \le \mathrm{vol}_{n-1}(L \cap H)$ for every
hyperplane $H$ containing the origin, can we conclude that $\mathrm{vol}_n(K) \le \mathrm{vol}_n(L)$? It
is known that the answer is affirmative when $n \le 4$ and negative when $n \ge 5$ (see
[Kol05] for references).
Proposition 4.27 is due to Vaaler ([Vaa79], the lower bound) and Ball ([Bal89],
the upper bound).
Proposition 4.28 is from [SWŻ08]. It is instructive to compare Propositions
4.26 and 4.28. The first one gives very precise estimates for volumes of hyperplane
sections in the isotropic position, while the second one deals with sections of proportional
(or subproportional) codimension, but only at the level of the volume
radius, that is, after raising the volumes to the power of 1 over the dimension.

CHAPTER 5

Metric Entropy and Concentration of Measure in Classical Spaces

This chapter presents two fundamental concepts which will be applied in later
chapters: the metric entropy (a.k.a. packing and covering) and the concentration of
measure. Their conjunction leads to the Dvoretzky theorem, which will be presented
in Chapter 7.

5.1. Nets and packings

We will now introduce the complementary concepts of covering numbers (also
called metric entropy) and packing numbers, which quantify the complexity of a
given compact metric set. It will turn out that these parameters are closely related
to the volume and the mean width considered in the preceding chapter.
We first analyze the special but fundamental cases of the sphere and the discrete
cube. We subsequently discuss classical groups and manifolds, and general convex
bodies.

5.1.1. Definitions. If $K$ is a compact subset of a metric space $(M, d)$, a finite
subset $N \subset K$ is called an $\varepsilon$-net of $K$ if, for every $x \in K$, $\mathrm{dist}(x, N) \le \varepsilon$. Since this
is equivalent to the union of the corresponding balls containing $K$, an alternative
terminology is that of a covering, see Figure 5.1. We denote by $N(K, \varepsilon)$ (or by
$N(K, d, \varepsilon)$, if there is an ambiguity as to the choice of the metric) the minimal
cardinality of an $\varepsilon$-net in $K$.
A subset $P \subset K$ is called $\varepsilon$-separated if any pair $(x, y)$ of distinct elements
from $P$ satisfies $d(x, y) > \varepsilon$. This property implies that the balls of radius $\varepsilon/2$
centered at elements of $P$ are disjoint (a configuration usually referred to as packing,
whence the usage of the letter $P$; see Figure 5.1), and in most contexts the two
properties are essentially equivalent. We denote by $P(K, \varepsilon)$ or $P(K, d, \varepsilon)$ the largest
cardinality of an $\varepsilon$-separated set in $K$. The quantities $N(K, \varepsilon)$ and $P(K, \varepsilon)$ are
called, respectively, covering numbers and packing numbers. The function $\varepsilon \mapsto
N(K, d, \varepsilon)$, and its various generalizations, is also often referred to as the metric
entropy of $(K, d)$.
For any compact metric space $K$, the following two relations between nets and
packings are fundamental. First, if $P$ is a $2\varepsilon$-separated set and $N$ is an $\varepsilon$-net, then
the closed balls of radius $\varepsilon$ centered at elements from $N$ cover $K$, and each ball
contains at most one element of $P$. Second, an $\varepsilon$-separated set which is maximal
(with respect to inclusion) is an $\varepsilon$-net (the reader not familiar with this circle of
ideas is encouraged to check these elementary facts). It follows that we have the
inequalities
(5.1) $P(K, 2\varepsilon) \le N(K, \varepsilon) \le P(K, \varepsilon)$.
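The two elementary facts behind (5.1) can be checked on a finite example. The following is a minimal sketch, assuming a hypothetical finite set of random points in the unit square with the Euclidean metric; the helper names `greedy_separated`, `is_net` and `is_separated` are illustrative and not taken from the text.

```python
import math
import random


def greedy_separated(points, eps, dist):
    """Greedily pick points that are pairwise more than eps apart.

    The result is eps-separated and maximal within `points`: every rejected
    point lies within eps of some chosen point.
    """
    chosen = []
    for p in points:
        if all(dist(p, q) > eps for q in chosen):
            chosen.append(p)
    return chosen


def is_net(candidates, points, eps, dist):
    """True if every point is within eps of some candidate."""
    return all(any(dist(p, q) <= eps for q in candidates) for p in points)


def is_separated(candidates, eps, dist):
    """True if distinct candidates are pairwise more than eps apart."""
    return all(
        dist(p, q) > eps
        for i, p in enumerate(candidates)
        for q in candidates[i + 1:]
    )


# Hypothetical finite "compact set": 400 random points in the unit square.
random.seed(0)
K = [(random.random(), random.random()) for _ in range(400)]
eps = 0.2

net = greedy_separated(K, eps, math.dist)         # maximal eps-separated set
coarse = greedy_separated(K, 2 * eps, math.dist)  # some 2*eps-separated set

# A 2*eps-separated set can never be larger than an eps-net.
print(len(coarse), "<=", len(net))
```

The greedy set is both ε-separated and an ε-net, which is the second fact; comparing it with any 2ε-separated set illustrates the first.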

Figure 5.1. A net (left) and a packing (right) for an equilateral
triangle (with the Euclidean metric in $\mathbb{R}^2$). For optimal packings
or coverings with few “classical” convex bodies in the plane (squares,
circles or triangles), see the website [@1].

Packings and coverings have been extensively studied, particularly for “standard”
metric spaces. In various applications it is useful to know that there exist
“large” packings and/or “small” nets, and often to be able to exhibit them in a constructive
manner. By (5.1), both notions are equivalent whenever the resolution
parameter $\varepsilon$ is specified only up to a multiplicative constant. On the other hand, for
some applications, such as coding theory, very precise results are in high demand.
In many situations the isometry group of $K$ acts transitively and preserves a
natural probability measure $\mu$. In particular, all balls of radius $\varepsilon$ have then the
same measure, denoted by $V(\varepsilon)$, and we have the simple inequalities
(5.2) $\frac{1}{V(\varepsilon)} \le N(K, \varepsilon) \le P(K, \varepsilon) \le \frac{1}{V(\varepsilon/2)}$.
Exercise 5.1. Here, we introduce variations on the definitions and check their
equivalence. Let $M$ be a metric space and $K$ a compact subset. Denote by $N'(K, \varepsilon)$
the smallest cardinality of a family of closed balls of radius $\varepsilon$ in $M$ whose union
contains $K$ (the difference with the definition of $N(K, \varepsilon)$ is that the centers are not
required to be in $K$). It is sometimes more convenient to allow sets of diameter
$\le 2\varepsilon$ in place of balls of radius $\varepsilon$; call the resulting quantity $N''(K, \varepsilon)$. Let also
$P'(K, \varepsilon)$ be the largest cardinality of a family of disjoint open balls of radius $\varepsilon/2$
with centers in $K$. Check the inequalities
$N''(K, \varepsilon) \le N'(K, \varepsilon) \le N(K, \varepsilon) \le P(K, \varepsilon) \le N''(K, \varepsilon/2)$
and
$P(K, \varepsilon) \le P'(K, \varepsilon) \le N(K, \varepsilon/2)$.
Give examples showing that inequalities may be strict (see also Exercise 5.16).
5.1.2. Nets and packings on the Euclidean sphere. We first consider the
specific case of the sphere $S^{n-1}$ for $n \ge 2$; denote by $g$ the geodesic distance and by
$\sigma$ the normalized Haar measure. In some cases, it is more appropriate to consider
the extrinsic distance inherited from $\mathbb{R}^n$. However, any result about one distance
transfers automatically to the other distance (see Appendix B.1 for details). We
give a brief overview of known estimates for packing and covering numbers for the
sphere. The first order of business will be a discussion of volumes of spherical caps,
which enter the subject via (5.2).
5.1.2.1. Estimates on volumes of spherical caps. Given $x_0 \in S^{n-1}$, let $C(x_0, \varepsilon)$
be the cap of center $x_0$ and geodesic radius $\varepsilon$, and denote $V(\varepsilon) = \sigma(C(x_0, \varepsilon))$
($\varepsilon \in [0, \pi]$ is tacitly assumed). We have
(5.3) $V(\varepsilon) = \frac{\int_0^\varepsilon \sin^{n-2}\theta \, d\theta}{\int_0^\pi \sin^{n-2}\theta \, d\theta}$.
The denominator on the right-hand side of (5.3) (Wallis integral) equals $\sqrt{2\pi}/\kappa_{n-1}$.
Note that $V(\pi - \varepsilon) = 1 - V(\varepsilon)$, in particular $V(\pi/2) = 1/2$. For fixed $0 < \varepsilon < \pi/2$,
$V(\varepsilon)$ tends to 0 exponentially fast in the dimension: one has $V(\varepsilon)^{1/n} \sim \sin(\varepsilon)$. The
following proposition gives elementary but reasonably precise bounds. The first one
is sharp when the radius is small, and the second one for a radius slightly smaller
than $\pi/2$.
Proposition 5.1. If $0 \le t \le \pi/2$, then $V(t) \le \frac{1}{2} \sin^{n-1}(t)$. More precisely
(5.4) $(\sqrt{2\pi}\,\kappa_n)^{-1} (\sin t)^{n-1} \le V(t) \le (\sqrt{2\pi}\,\kappa_n \cos t)^{-1} (\sin t)^{n-1}$,
where $\kappa_n \sim \sqrt{n}$ is given by (A.8). Moreover, if $n > 2$, then
(5.5) $V(\pi/2 - t) \le \frac{1}{2} \exp(-nt^2/2)$.

Figure 5.2. Proof that $V(t) \le \frac{1}{2} \sin^{n-1}(t)$. The surface area of
$C(x, t)$ (bold) does not exceed the surface area of a half-sphere of
radius $\sin t$ (dashed).

A proof of (5.4) is sketched in Exercise 5.4. It is based on the fact that, for
convex sets, surface area is monotone with respect to inclusion (Exercise 5.2). The
inequality (5.5) is from [Jen13] (see also [JS]); a version with $n - 1$ instead of $n$
in the exponent is proved in Exercise 5.3.
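The bounds of Proposition 5.1 can be compared with a direct numerical evaluation of (5.3). The sketch below is illustrative only: it assumes that the constant $\kappa_n$ of (A.8) is the expected Euclidean norm of a standard Gaussian vector in $\mathbb{R}^n$, i.e. $\kappa_n = \sqrt{2}\,\Gamma((n+1)/2)/\Gamma(n/2)$ (an assumption consistent with the value $\sqrt{2\pi}/\kappa_{n-1}$ of the Wallis integral quoted above), and the function names are hypothetical.

```python
import math


def kappa(n):
    """kappa_n = sqrt(2) * Gamma((n+1)/2) / Gamma(n/2); assumed form of (A.8)."""
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))


def cap_measure(n, eps, steps=4000):
    """V(eps) from (5.3): midpoint rule for the numerator, closed form below it."""
    h = eps / steps
    numerator = h * sum(math.sin((i + 0.5) * h) ** (n - 2) for i in range(steps))
    denominator = math.sqrt(2.0 * math.pi) / kappa(n - 1)  # Wallis integral
    return numerator / denominator


def bounds_5_4(n, t):
    """Lower and upper bounds on V(t) from (5.4), for 0 < t < pi/2."""
    lower = math.sin(t) ** (n - 1) / (math.sqrt(2.0 * math.pi) * kappa(n))
    return lower, lower / math.cos(t)


n = 10
for t in (0.3, 0.8, 1.2):
    lo, hi = bounds_5_4(n, t)
    print(f"t={t}: {lo:.3e} <= V(t)={cap_measure(n, t):.3e} <= {hi:.3e}")
```

For small $t$ the lower and upper bounds nearly coincide, in line with the remark that (5.4) is sharp when the radius is small.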
The following fact is only marginally used in what follows, but we include it
since we did not encounter it in the convexity/functional analysis literature.
Proposition 5.2 (Concavity properties of $V(\cdot)$, see Exercise 5.5). If $V(r)$ is the
measure of a spherical cap of radius $r$, then the function $t \mapsto \log V(e^t)$ is concave.
A fortiori, the function $r \mapsto \log V(r)$ is strictly concave on $[0, \pi]$.
A consequence of Proposition 5.2 is that, for $0 \le s \le t \le \pi$,
(5.6) $V(t) \le \left(\frac{t}{s}\right)^{n-1} V(s)$.
Inequality (5.6) is a well-known fact in differential geometry; for example, it consti-
tutes the trivial case of the Gromov–Bishop comparison theorem. It is very likely
that Proposition 5.2 also follows from similar general results.
Exercise 5.2 (Surface area is monotone with respect to inclusion). Show that
if $K \subset L$ are convex bodies, then $\mathrm{area}(K) \le \mathrm{area}(L)$.
Exercise 5.3. Using Exercise 5.2, show that for $t \in [0, \pi/2]$, we have $V(t) \le
\frac{1}{2} \sin^{n-1}(t)$. Conclude that
$V(\pi/2 - t) \le \frac{1}{2} (\cos t)^{n-1} \le \frac{1}{2} \exp(-(n-1)t^2/2)$.
This is only slightly weaker than the bound (5.5) and sharper than the estimates
typically cited in the literature.
Exercise 5.4 (Sharp bounds for volumes of caps). Using Exercise 5.2, show the
inequalities (5.4). Then strengthen the lower bound to $(\sqrt{2\pi}\,\kappa_n \cos(t/2))^{-1} \sin^{n-1} t$.
Exercise 5.5 (Concavity properties of $V(\cdot)$). Prove Proposition 5.2 and derive
the inequality (5.6).
5.1.2.2. Nets in the sphere. If $\varepsilon \in [\pi/2, \pi)$, we clearly have $N(S^{n-1}, g, \varepsilon) = 2$.
The interesting case is when $\varepsilon \in (0, \pi/2)$. In that range, the proportion $V(\varepsilon)$ of the
sphere covered by a cap of geodesic radius $\varepsilon$ decays exponentially with $n$. It follows
that the cardinality of $\varepsilon$-nets also grows exponentially fast. For example, the first
estimate from Proposition 5.1 implies that, for $\varepsilon \in (0, \pi/2)$,
(5.7) $N(S^{n-1}, g, \varepsilon) \ge V(\varepsilon)^{-1} \ge \frac{2}{\sin^{n-1} \varepsilon}$.
A basic and extremely useful bound for $\varepsilon$-nets (formulated in the extrinsic distance)
is the following.
Lemma 5.3. For every dimension $n$ and every $\varepsilon \le 1$, there is an $\varepsilon$-net in
$(S^{n-1}, |\cdot|)$ with fewer than $(2/\varepsilon)^n$ elements. In other words, $N(S^{n-1}, |\cdot|, \varepsilon) \le (2/\varepsilon)^n$.
The standard and often quoted volumetric argument (which is a special case
of Lemma 5.8 below) gives a slightly worse bound $(1 + 2/\varepsilon)^n$. The improved bound
$(2/\varepsilon)^n$ can be achieved by a finer analysis combining a version (based on [Dum07])
of Proposition 5.4 below with the use of explicit nets in lower dimensions, see [Swe].
We also note that there exist simple explicit $\varepsilon$-nets in $S^{n-1}$ with cardinality at most
$(C/\varepsilon)^n$ (see Exercise 5.22).
To discuss finer results it is more convenient to switch to the geodesic distance.
We know from the volume argument (5.2) that $N(S^{n-1}, g, \varepsilon) \ge V(\varepsilon)^{-1}$. It turns out
that this trivial estimate is remarkably sharp: an almost-matching upper estimate
is provided by an elegant random covering argument due to Rogers.
Proposition 5.4 (Random covering bound). For every $0 < \eta < \theta$, we have
$N(S^{n-1}, g, \theta + \eta) \le \left\lceil \frac{1}{V(\theta)} \log \frac{V(\theta)}{V(\eta)} \right\rceil + \frac{1}{V(\theta)}$.
Proof. Let $N = \lceil \frac{1}{V(\theta)} \log(V(\theta)/V(\eta)) \rceil$. Choose $(x_i)_{1 \le i \le N}$ randomly, independently
according to $\sigma$, and denote $A = \bigcup \{C(x_i, \theta) : 1 \le i \le N\}$. The expected
proportion of the sphere missed by $A$ can be computed using the Fubini–Tonelli
theorem
(5.8) $\mathbf{E}\,\sigma(S^{n-1} \setminus A) = (1 - V(\theta))^N \le \exp(-N V(\theta)) \le \frac{V(\eta)}{V(\theta)}$.
In particular, there exist $(x_i)$ such that $\sigma(S^{n-1} \setminus A) \le V(\eta)/V(\theta)$. Let $\{C(y_j, \eta) :
1 \le j \le M\}$ be a maximal family of disjoint balls of radius $\eta$ contained in $S^{n-1} \setminus A$.
Since these balls are disjoint and each has measure $V(\eta)$, we have $M V(\eta) \le V(\eta)/V(\theta)$,
so it follows from (5.8) that $M \le 1/V(\theta)$. By construction, $S^{n-1}$ is covered by the
family
$\{B(x_i, \theta + \eta) : 1 \le i \le N\} \cup \{B(y_j, 2\eta) : 1 \le j \le M\}$. □
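The arithmetic in the proof can be traced on a concrete case. The following sketch assumes the two-dimensional sphere $S^2$ (that is, $n = 3$), where (5.3) reduces to the closed form $V(t) = (1 - \cos t)/2$; the values of $\theta$ and $\eta$ and the function names are arbitrary illustrative choices.

```python
import math


def cap_measure_s2(t):
    """Normalized measure of a cap of geodesic radius t on S^2 (n = 3 in (5.3))."""
    return (1.0 - math.cos(t)) / 2.0


def random_covering_bound(theta, eta, V):
    """Return (N, bound): N as in the proof, bound = right-hand side of Prop. 5.4."""
    v_theta, v_eta = V(theta), V(eta)
    N = math.ceil(math.log(v_theta / v_eta) / v_theta)
    return N, N + 1.0 / v_theta


theta, eta = 0.3, 0.1
v_theta, v_eta = cap_measure_s2(theta), cap_measure_s2(eta)
N, upper = random_covering_bound(theta, eta, cap_measure_s2)

expected_missed = (1.0 - v_theta) ** N     # left-hand side of (5.8)
lower = 1.0 / cap_measure_s2(theta + eta)  # volumetric lower bound from (5.2)

print(f"N = {N}, expected missed fraction = {expected_missed:.4f}")
print(f"{lower:.1f} <= N(S^2, g, {theta + eta:.1f}) <= {upper:.1f}")
```

In such a low dimension the two bounds differ only by a moderate factor; the interest of Proposition 5.4 is that this factor stays polynomial in $n$ while both bounds grow exponentially.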

Corollary 5.5 (Neat random covering bound, see Exercise 5.8). For every
$0 < \varepsilon < \pi/2$, we have
(5.9) $N(S^{n-1}, g, \varepsilon) \le C n \log n \, V(\varepsilon)^{-1}$
for some absolute constant $C$.
It follows from (5.7), (5.9) and (5.4) that, for a fixed $\varepsilon \in (0, \pi/2)$, we have
(5.10) $\lim_{n \to \infty} \frac{1}{n} \log N(S^{n-1}, g, \varepsilon) = -\log(\sin \varepsilon)$.
We note for future reference the following fact.
Proposition 5.6. Let $P \subset \mathbb{R}^n$ be a polytope such that $d_{BM}(P, B_2^n) \le \lambda$. Then
$P$ has at least $2 \exp((n-1)/(2\lambda^2))$ vertices and at least $2 \exp((n-1)/(2\lambda^2))$ facets.
Proof. Consider first the statement about vertices. Without loss of generality
we may assume that $\lambda^{-1} B_2^n \subset P \subset B_2^n$, and that the vertices of $P$ are unit vectors.
Let $V$ be the set of vertices of $P$. The hypothesis is equivalent to saying that $V$
is a $\theta$-net in $(S^{n-1}, g)$ for $\cos\theta = 1/\lambda$ (see Exercise 5.7). Using (5.7), it follows
that $\mathrm{card}\, V \ge 2(\sin\theta)^{-(n-1)} \ge 2 \exp((n-1)/(2\lambda^2))$, where we used the inequality
$\sin \arccos t \le \exp(-t^2/2)$ for $0 \le t \le 1$. Since $d_{BM}(P, B_2^n) = d_{BM}(P^\circ, B_2^n)$, and
since vertices of $P^\circ$ are in bijection with facets of $P$, the statement about facets
follows. □

We also point out that it is possible to approximate the sphere by polytopes with
at most exponentially many vertices and, simultaneously, at most exponentially
many facets (see Exercise 7.22).


Exercise 5.6. Check that the constant 2 cannot be replaced by a smaller
number in the statement of Lemma 5.3.
Exercise 5.7 (Nets and convex hulls). Let $N \subset S^{n-1}$ and $\theta \in (0, \pi/2)$. Prove
that $N$ is a $\theta$-net in $(S^{n-1}, g)$ if and only if $(\cos\theta) B_2^n \subset \mathrm{conv}\, N$.
Exercise 5.8 (Proof of the neat random covering bound). Deduce Corollary
5.5 from Proposition 5.4.
Exercise 5.9 (On the optimality of Corollary 5.5). Let $C_n$ be the smallest
number such that the inequality $N(S^{n-1}, g, \varepsilon) \le C_n V(\varepsilon)^{-1}$ holds for any $\varepsilon > 0$.
By considering $\varepsilon$ slightly smaller than $\pi/2$, show that $C_n \ge \frac{n+1}{2}$. A less trivial fact
is that $C_n = \Omega(n)$ is also witnessed by taking $\varepsilon$ very close to 0, see [CFR59] and
Notes and Remarks.
Exercise 5.10 (Nets in the projective space). Prove the following result, which
will be useful in Sections 8.1 and 9.4. Let $\varepsilon \in (0, \pi/2)$. If $N$ is an $\varepsilon$-net in
the projective space $\mathbf{P}(\mathbb{C}^d)$ (equipped with the Fubini–Study metric (B.5)), then
$\mathrm{card}\, N \ge (c/\varepsilon)^{2d-2}$ for some absolute positive constant $c$. In the opposite direction,
there exists an $\varepsilon$-net of cardinality not exceeding $(C/\varepsilon)^{2d-2}$.
Exercise 5.11 (Volume of balls in $\mathbf{P}(\mathbb{C}^d)$). Consider the projective space
$\mathbf{P}(\mathbb{C}^d)$ equipped with the Fubini–Study metric (B.5) and the invariant probability
measure. If $\varepsilon \in (0, \pi/2]$, then the measure of any ball of radius $\varepsilon$ in $\mathbf{P}(\mathbb{C}^d)$ is
$\sin^{2d-2} \varepsilon$.
5.1.2.3. Packing on the sphere. Recall that $P(S^{n-1}, g, \varepsilon)$ is the maximal number
of disjoint caps of geodesic radius $\varepsilon/2$. The exact value is known for $\pi/2 \le \varepsilon < \pi$
(we have $P(S^{n-1}, g, \pi/2) = 2n$, see Exercise 5.12) and so we restrict our discussion
to the range $0 < \varepsilon < \pi/2$.
Packing problems are usually harder than covering problems. For example, as
opposed to (5.10), the exponential rate at which packing numbers increase, i.e., the
value of
$p(\varepsilon) = \limsup_{n \to \infty} \frac{1}{n} \log P(S^{n-1}, g, \varepsilon)$
is not known for $\varepsilon \in (0, \pi/2)$. We know from (5.2) that $V(\varepsilon)^{-1} \le P(S^{n-1}, g, \varepsilon) \le
V(\varepsilon/2)^{-1}$, and therefore
(5.11) $-\log \sin(\varepsilon) \le p(\varepsilon) \le -\log \sin(\varepsilon/2)$.
In this context the lower bound is known as the Chabauty–Shannon–Wyner bound
and actually corresponds to using the trivial algorithm to produce packings: pick
separated points, no matter how, as long as you can. It is an amazing fact that
the lower bound $p(\varepsilon) \ge -\log \sin \varepsilon$ has never been improved: nobody knows how to
substantially beat the worst possible choices!
On the other hand, the upper bound in (5.11) has received various improvements.
It has been shown by Rankin that for $\varepsilon \in (0, \pi/2)$
$p(\varepsilon) \le -\log(\sqrt{2} \sin(\varepsilon/2))$
which matches the lower bound from (5.11) as $\varepsilon$ increases to $\pi/2$. For small $\varepsilon$,
further improvements due to Kabatjanskiı̆–Levenšteı̆n are based on the so-called
linear programming bound (see Notes and Remarks).


Exercise 5.12 (Packing large caps on the sphere). Suppose that $(x_i)$ are $N$
points in $S^{n-1}$ such that $\langle x_i, x_j \rangle \le t$ for $i \ne j$.
(i) Show that $N \le 1 - 1/t$ if $t < 0$.
(ii) Show that $N \le 2n$ if $t = 0$.
If $t > 0$ is fixed, we know from (5.11) that exponentially many points in the sphere
may have pairwise inner products at most $t$. The situation when $t$ tends to zero
with $n$ is investigated in the following exercise.
Exercise 5.13 (Coarse approximation of $B_2^n$ by polytopes with few vertices).
Suppose that $(x_i)$ are $N$ points in $S^{n-1}$ such that $|\langle x_i, x_j \rangle| \le t$ whenever $i \ne j$, for
some $t > 0$.
(i) If $t < 1/\sqrt{n}$, show that $N \le n/(1 - nt^2)$.
(ii) By considering the family $(x_i^{\otimes k})_{1 \le i \le N}$ for a suitable large $k$, show that if $t \le 1/2$,
then $N \le (C/t)^{Ct^2 n}$ for some absolute constant $C$.
(iii) Deduce that, for $r \ge 2$, there is a polytope $P$ with at most $(Cr)^{Cn/r^2}$ vertices
such that $d_g(P, B_2^n) \le r$.
5.1.3. Nets and packings in the discrete cube. Although the discussion
from the previous sections dealt specifically with spheres, some ideas carry over
directly to other settings. As an illustration we consider the case of the discrete
cube $\{0, 1\}^n$ (a.k.a. Boolean cube) equipped with the normalized Hamming distance
(5.12) $d_H(x, y) = \frac{1}{n} \mathrm{card}\{i : x_i \ne y_i\}$.
We denote by $V(t)$ the volume (i.e., the cardinality) of a ball of radius $t \in (0, 1)$.
We have
$V(t) = \mathrm{card}\{y \in \{0, 1\}^n : d_H(x, y) \le t\} = \sum_{k=0}^{\lfloor tn \rfloor} \binom{n}{k}$.
The quantity $V(t)$ is governed by the binary entropy function $H$ defined for $x \in (0, 1)$
by $H(x) = -x \log_2 x - (1 - x) \log_2(1 - x)$. For $t \le 1/2$ such that $tn$ is an integer,
we have (see Exercise 5.15)
(5.13) $\frac{1}{n+1} 2^{nH(t)} \le V(t) \le 2^{nH(t)}$.
Related estimates will be used when discussing concentration of measure, see (5.59).
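The entropy bounds (5.13) are easy to check by brute force for moderate $n$. The sketch below is only an illustration; to avoid rounding issues it parametrizes the ball by the integer radius $r = tn$ rather than by $t$, and the function names are hypothetical.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), extended by H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def hamming_ball_volume(n, r):
    """Number of points of {0,1}^n differing from a fixed point in at most r coordinates."""
    return sum(math.comb(n, k) for k in range(r + 1))


def entropy_bounds(n, r):
    """Lower and upper bounds of (5.13) for t = r / n."""
    upper = 2.0 ** (n * binary_entropy(r / n))
    return upper / (n + 1), upper


n = 20
for r in (2, 5, 10):
    lo, hi = entropy_bounds(n, r)
    print(f"t={r / n:.2f}: {lo:.1f} <= V(t)={hamming_ball_volume(n, r)} <= {hi:.1f}")
```

The two bounds differ by the factor $n + 1$, which is negligible on the exponential scale relevant for (5.14) and (5.15).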
As in the case of the sphere, the covering problem is simpler than the packing
problem (at least in some asymptotic regimes). In particular (see Exercise 5.14), a
random covering argument similar to Proposition 5.4, in combination with (5.13),
implies that, for $0 < \varepsilon < 1/2$,
(5.14) $\lim_{n \to \infty} \frac{1}{n} \log_2 N(\{0, 1\}^n, d_H, \varepsilon) = 1 - H(\varepsilon)$.
On the other hand, the corresponding limit for packing is unknown; we only
get from (5.2) the asymptotic bounds
(5.15) $1 - H(\varepsilon) \le \limsup_{n \to \infty} \frac{1}{n} \log_2 P(\{0, 1\}^n, d_H, \varepsilon) \le 1 - H(\varepsilon/2)$
for $0 < \varepsilon < 1/2$. As in the case of the sphere, the lower bound from (5.15) (known
in this context as the Gilbert–Varshamov bound) has not been improved, while the
upper bound has been subject to various enhancements.
For the $q$-ary version of the cube, i.e., the space $\{0, \dots, q - 1\}^n$ (also equipped
with normalized Hamming distance), the entropy function has to be replaced by
$H_q(x) := -x \log_q x - (1 - x) \log_q(1 - x) + x \log_q(q - 1)$.
Indeed, if $V_q(t)$ denotes the cardinality of a ball of radius $t$ in $\{0, \dots, q - 1\}^n$, for
$t \in (0, 1 - 1/q)$ such that $tn$ is an integer, then
(5.16) $\frac{1}{n+1} q^{nH_q(t)} \le V_q(t) \le q^{nH_q(t)}$.
Estimates about the $q$-ary cube are useful when one wants to construct nets or
separated sets in products of metric spaces. The following specific fact, which is an
easy consequence of (5.16) and (5.1), will be used later.
Proposition 5.7. Let $(K, d)$ be a metric space such that $P(K, d, \varepsilon) \ge q$. Given
an integer $n \in \mathbb{N}$, equip $K^n$ with the distance
$d_n((x_1, \dots, x_n), (y_1, \dots, y_n)) = d(x_1, y_1) + \cdots + d(x_n, y_n)$.
Then, for $t \in (0, 1 - 1/q)$,
(5.17) $P(K^n, d_n, t\varepsilon n) \ge P(\{0, \dots, q - 1\}^n, d_H, t) \ge \frac{q^n}{V_q(t)} \ge q^{n(1 - H_q(t))}$.
Exercise 5.14 (Efficient random nets of the Boolean cube). Show (5.14) by
adapting the random covering argument from Proposition 5.4.
Exercise 5.15 (Volume of balls in the $q$-ary discrete cube). Show (5.16) (which
specialized to $q = 2$ gives (5.13)).
5.1.4. Metric entropy for convex bodies. If the metric space $(M, d)$ is
actually a normed space with a unit ball $B$, we write $N(K, B, \varepsilon)$ or $N(K, \varepsilon B)$
instead of $N(K, d, \varepsilon)$. It is possible to come up with an alternative definition which
does not refer to the norm, by saying that $N(K, B, \varepsilon)$ is the minimum number $N$
such that there exist $x_1, \dots, x_N$ in $K$ with
(5.18) $K \subset \bigcup_{i=1}^{N} (x_i + \varepsilon B)$.
This alternative definition does not require the set $B$ to be symmetric, or even
convex, or to have nonempty interior, even though that is usually the case. In
our context, the minimal reasonable hypothesis appears to be asking that $B$ be
star-shaped with respect to the origin, i.e., that $tB \subset B$ for $t \in [0, 1]$.
The technology for estimating covering/packing numbers of subsets (particularly
convex subsets) of normed spaces is quite well-developed and frequently rather
sophisticated. We quote here a simple well-known result that expresses $N(\cdot, \cdot)$ in
terms of a “volume ratio.”

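Lemma 5.8. Let $L$ be a symmetric convex body in $\mathbb{R}^n$ and let $K \subset \mathbb{R}^n$ be a
Borel set. Then, for any $\varepsilon > 0$,
(5.19) $\left(\frac{1}{\varepsilon}\right)^n \frac{\mathrm{vol}(K)}{\mathrm{vol}(L)} \le N(K, L, \varepsilon) \le \left(\frac{2}{\varepsilon}\right)^n \frac{\mathrm{vol}(K + \frac{\varepsilon}{2} L)}{\mathrm{vol}(L)}$.
Proof. If $(x_i)$ is an $\varepsilon$-net in $K$ with respect to $\|\cdot\|_L$, then the union of the sets
$x_i + \varepsilon L$ contains $K$, and the left-hand side inequality in (5.19) follows from volume
comparison. Consider now a family $(x_i)$ of $N$ elements of $K$ which is $\varepsilon$-separated
for $\|\cdot\|_L$. This means that the sets $x_i + \frac{\varepsilon}{2} L$ have disjoint interiors. Since they are
all included in $K + \frac{\varepsilon}{2} L$, we have $N \,\mathrm{vol}(\frac{\varepsilon}{2} L) \le \mathrm{vol}(K + \frac{\varepsilon}{2} L)$. Together with (5.1),
this implies the right-hand side inequality in (5.19). □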
When $K$ is convex and the “regularizing” trick implicit in Exercise 5.17 below is
applied, the lower and upper bounds are often as close as one can expect provided
$K$ and $L$ are in the $M$-position (see Notes and Remarks). The case $K = L$ in
Lemma 5.8 is related to the approximation of convex bodies by polytopes.
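As a worked instance of the case $K = L$, here is how the upper bound in (5.19) specializes when $K$ is a symmetric convex body; this short computation is a sketch of the standard volumetric argument and recovers the bound $(1 + 2/\varepsilon)^n$ quoted after Lemma 5.3.

```latex
% Take L = K in (5.19), with K a symmetric convex body, so that
% K + (\varepsilon/2) K = (1 + \varepsilon/2) K by convexity.
N(K, K, \varepsilon)
  \le \left(\frac{2}{\varepsilon}\right)^{n}
      \frac{\mathrm{vol}\bigl((1 + \tfrac{\varepsilon}{2}) K\bigr)}{\mathrm{vol}(K)}
  = \left(\frac{2}{\varepsilon}\right)^{n} \left(1 + \frac{\varepsilon}{2}\right)^{n}
  = \left(1 + \frac{2}{\varepsilon}\right)^{n}.
```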
Lemma 5.9. Let $0 < \varepsilon < 1$, $K \subset \mathbb{R}^n$ be a symmetric convex body and $N$ be an
$\varepsilon$-net in $K$ with respect to $\|\cdot\|_K$. Then $\mathrm{conv}\, N \supset (1 - \varepsilon)K$.
Proof. Let $P = \mathrm{conv}\, N$ and denote $A = \sup\{\|y\|_P : y \in K\}$. One checks
that $P$ contains 0 in the interior, so that $A < \infty$. Given $x \in K$, there is $x' \in N$
such that $\|x - x'\|_K \le \varepsilon$, and therefore $\|x\|_P \le \|x'\|_P + \|x - x'\|_P \le 1 + \varepsilon A$. Taking
supremum over $x$ gives $A \le 1 + \varepsilon A$, so that $A \le (1 - \varepsilon)^{-1}$, which is equivalent to
the inclusion $P \supset (1 - \varepsilon)K$. □

The following is an immediate consequence of Lemmas 5.8 and 5.9.
Corollary 5.10. Let $\varepsilon \in (0, 1)$. Any symmetric convex body in $\mathbb{R}^n$ is $(1 - \varepsilon)^{-1}$-close,
in the Banach–Mazur distance, to a polytope with at most $(1 + 2/\varepsilon)^n$ vertices.
For an extension of Lemma 5.9 and Corollary 5.10 to not-necessarily-symmetric convex
bodies, see Exercises 5.18–5.20. Note that the dependence on $\varepsilon$ in Corollary 5.10 is
not sharp (see Notes and Remarks). For the special case $K = B_2^n$, the conclusion
of Lemma 5.9 can be easily improved to $\mathrm{conv}\, N \supset (1 - \varepsilon^2/2)K$, see Exercise 5.7.
Exercise 5.16 (Covering with balls whose centers lie outside of the set). For
convex bodies $K, L$ in $\mathbb{R}^n$, let $N'(K, L)$ be the smallest number $N$ such that there
exist $x_1, \dots, x_N$ in $\mathbb{R}^n$ with $K \subset \bigcup_{1 \le i \le N} (x_i + L)$ (the difference with $N(K, L)$ is
that $x_i$ are not required to belong to $K$). Give an example with $L$ symmetric for
which $N'(K, L) < N(K, L)$. Can we have such an example with also $K$ symmetric?
Exercise 5.17 (A regularizing trick). Let $K, L$ be convex bodies in $\mathbb{R}^n$, with
$0 \in L$. Show that $N(K, \varepsilon L) = N(K, (K - K) \cap \varepsilon L)$.
Exercise 5.18 (Approximating by polytopes with few vertices). Let $K \subset \mathbb{R}^n$
be a convex body with centroid at the origin ($K$ is not assumed to be symmetric).
Using Lemma 5.8 and Proposition 4.18, show that for every $\varepsilon \in (0, 1)$ we have
$N(K, \varepsilon K) \le (2 + 4/\varepsilon)^n$, where $N(K, \varepsilon K) = N(K, K, \varepsilon)$ is defined as in (5.18). By
arguing as in the proof of Lemma 5.9, conclude that there exists a polytope $P$ with
at most $(2 + 4/\varepsilon)^n$ vertices such that $(1 - \varepsilon)K \subset P \subset K$.
Exercise 5.19 (Approximating by polytopes with few facets). Let $\varepsilon \in (0, 1)$
and $K \subset \mathbb{R}^n$ be a convex body with centroid at the origin. Show that there exists
a polytope $Q$ with at most $(2 + 4/\varepsilon)^n$ facets such that $(1 - \varepsilon)Q \subset K \subset Q$.


Exercise 5.20 (Approximating by polytopes and the Santaló inequality). Let
$K$ be a convex body in $\mathbb{R}^n$ and let $\kappa = \mathrm{vrad}(K)\,\mathrm{vrad}(K^\circ) < \infty$ (i.e., $K$ satisfies
approximately the Santaló inequality, see Theorem 4.17 and the comments following
it). If $\varepsilon \in (0, 1)$, then $K$ can be approximated up to $\varepsilon$ (in the sense of Exercises
5.18 and 5.19) by a polytope $P$ with at most $(C\kappa/\varepsilon)^n$ vertices (resp., facets).
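Exercise 5.21 (Duality of metric entropy for ellipsoids). Let $\mathcal{E}$ and $\mathcal{F}$ be 0-symmetric
ellipsoids in $\mathbb{R}^n$. Check that for every $\varepsilon > 0$, $N(\mathcal{E}, \mathcal{F}, \varepsilon) = N(\mathcal{F}^\circ, \mathcal{E}^\circ, \varepsilon)$.
Exercise 5.22 (Explicit nets in $S^{n-1}$). Here is an explicit construction of an
$\varepsilon$-net in $S^{n-1}$ with at most $(C/\varepsilon)^n$ elements, for some (suboptimal) constant $C$.
(i) Show that, if $N$ is an $\varepsilon$-net in $B_2^n$ (with $0 < \varepsilon < 1$), then the set $\{x/|x| : x \in N\}$
is an $\eta$-net in $(S^{n-1}, |\cdot|)$ for $\eta = \sqrt{2 - 2\sqrt{1 - \varepsilon^2}}$.
(ii) Let $N = B_2^n \cap \frac{\varepsilon}{\sqrt{n}} \mathbb{Z}^n$. Show that $N$ is an $\varepsilon$-net in $B_2^n$ and that $\mathrm{card}\, N \le (C/\varepsilon)^n$.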
5.1.5. Nets in Grassmann manifolds, orthogonal and unitary groups.
We now extend the results given for the sphere to other classical manifolds, including
unitary and orthogonal groups and Grassmann manifolds (which are introduced
in Appendix B). Metric structures on such manifolds are induced by unitarily invariant
norms on the corresponding matrix spaces, with Schatten $p$-norms being the
most popular choices. While there are several natural ways (also discussed in detail
in Appendix B) to define a metric on a manifold starting from a given Schatten
norm, all such metrics, for a fixed $p$, differ by at most a multiplicative factor
of $\pi/2$. Accordingly, the behavior of covering numbers in all such situations can be
subsumed in the following single statement.

Theorem 5.11 (not proved here, but see Exercise 5.23). Let $M$ be either $\mathrm{SO}(n)$,
$\mathrm{U}(n)$, $\mathrm{SU}(n)$, $\mathrm{Gr}(k, \mathbb{R}^n)$ or $\mathrm{Gr}(k, \mathbb{C}^n)$, equipped with a metric generated by the Schatten
norm $\|\cdot\|_p$ for some $1 \le p \le \infty$. Then for any $\varepsilon \in (0, \mathrm{diam}\, M]$,
(5.20) $\left(\frac{c \,\mathrm{diam}\, M}{\varepsilon}\right)^{\dim M} \le N(M, \varepsilon) \le \left(\frac{C \,\mathrm{diam}\, M}{\varepsilon}\right)^{\dim M}$,
where $C, c > 0$ are universal constants (independent of $n$, $k$, $p$ and $\varepsilon$), $\dim M$ is the
real dimension of $M$, and $\mathrm{diam}\, M$ the diameter of $M$ with respect to the corresponding
metric.
For easy reference, we list in Table 5.1 some of the values of the parameters
(dimensions, diameters) that appear in (5.20).

Table 5.1. Real dimensions and diameters from the bounds (5.20)
for covering numbers of a selection of classical manifolds. The distances
used on $\mathrm{SO}(n)$ and $\mathrm{U}(n)$ are the extrinsic metrics obtained
from the Schatten $p$-norm on $M_n$, and the distances on Grassmann
manifolds are the corresponding quotient metrics. The restriction
$k \le n/2$ is imposed to reduce clutter (note that $\mathrm{Gr}(k, \mathbb{R}^n)$ and
$\mathrm{Gr}(n - k, \mathbb{R}^n)$ are isometric).

M                               dim M          diam M                   comments
$\mathrm{SO}(n)$                $n(n-1)/2$     $2n^{1/p}$
$\mathrm{U}(n)$                 $n^2$          $2n^{1/p}$
$\mathrm{Gr}(k, \mathbb{R}^n)$  $k(n-k)$       $2^{1/2}(2k)^{1/p}$      $k \le n/2$
$\mathrm{Gr}(k, \mathbb{C}^n)$  $2k(n-k)$      $2^{1/2}(2k)^{1/p}$      $k \le n/2$

Exercise 5.23 (Metric entropy of classical groups and manifolds). Prove Theorem
5.11 for $M = \mathrm{U}(n)$, $M = \mathrm{SU}(n)$ or $M = \mathrm{SO}(n)$ and for $p = \infty$, by appealing to
Lipschitz properties of the exponential map with matrix argument (Exercise B.8).
Exercise 5.24. Derive the formula for the diameter of $\mathrm{Gr}(k, \mathbb{R}^n)$ in Table 5.1.
Exercise 5.25 (Volume of balls in classical groups and manifolds). Let $M$
be either $\mathrm{SO}(n)$, $\mathrm{U}(n)$ or $\mathrm{Gr}(k, \mathbb{R}^n)$, equipped with a metric as in Theorem 5.11.
Denoting by $\sigma$ the Haar probability measure on $M$, deduce from Theorem 5.11 a
two-sided estimate for $\sigma(B(x, \varepsilon))$, where $B(x, \varepsilon)$ denotes the ball of radius $\varepsilon$ centered
at $x \in M$.
5.2. Concentration of measure


The classical isoperimetric inequality in $\mathbb{R}^n$ (Eq. (4.27), also known as Dido’s
problem) states that among all sets of given volume, the Euclidean balls have the
smallest surface area. As we already noticed in the setting of $\mathbb{R}^n$ in Section 4.3.1,
an alternative methodology is to consider, instead of the surface area, the family of
$\varepsilon$-enlargements of a given set. The latter approach makes sense in any metric space
$X$ equipped with a measure $\mu$ (a metric measure space, or a metric probability space
if $\mu(X) = 1$, which will be assumed as a default): for a subset $A \subset X$ and $\varepsilon > 0$,
we define
$A_\varepsilon = \{x \in X : \mathrm{dist}(x, A) \le \varepsilon\}$.
The two viewpoints are roughly equivalent since the “surface area” relative to $\mu$ can
be retrieved (when that makes sense) as the first-order variation of $\mu(A_\varepsilon)$ when $\varepsilon$
goes to 0, cf. (4.23) and, conversely, the growth of the function $\varepsilon \mapsto \mu(A_\varepsilon)$ on the
macroscopic scale can be recovered from the knowledge of its derivative. However,
the enlargement-based approach seems simpler (a more flexible definition) and is
often more fruitful since some otherwise useful bounds on $\mu(A_\varepsilon)$ may be meaningless
for small $\varepsilon$, and/or may be available in absence of any clue with regard to the nature
of extremal sets.
Lower bounds for $\mu(A_\varepsilon)$ can be rephrased as deviation inequalities for Lipschitz
functions. This leads, in some settings, to a remarkable phenomenon: every
Lipschitz function concentrates strongly around some “central value.” Statements
to such and similar effect will be the focus of our presentation. Specifically, we will
look for estimates of the form
(5.21) $\mu(f > M_f + t) \le C e^{-\lambda t^2}$
and
(5.22) $\mu(f > \mathbf{E} f + t) \le C e^{-\lambda t^2}$,
to be valid for any real-valued 1-Lipschitz function on $X$ and all $t > 0$, where $M_f$
and $\mathbf{E} f$ are the median and the expected value of $f$ calculated with respect to $\mu$.
(A number $M$ is said to be a median for a random variable $X$ if $\mathbf{P}(X \ge M) \ge 1/2$
and $\mathbf{P}(X \le M) \ge 1/2$.) Clearly, (5.21) and (5.22) formally imply then similar two-sided
estimates for $\mu(|f - M_f| > t)$ and $\mu(|f - \mathbf{E} f| > t)$ with $C$ replaced by $2C$.
Concentration of this type is referred to as subgaussian (more on this terminology
in Section 5.2.6). For the convenience of a casual reader, and for easy reference,
we list in Table 5.2 the constants and the exponents that appear in subgaussian
concentration inequalities for a selection of classical objects.
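An estimate of the form (5.21) can be probed by simulation. The sketch below is a rough Monte Carlo illustration, not a proof: it takes the 1-Lipschitz function $f(x) = x_1$ on $S^{n-1}$ (whose median is 0 by symmetry) and compares its empirical upper tail with $C e^{-\lambda t^2}$ for the sphere values $C = \frac{1}{2}$, $\lambda = \frac{n}{2}$ listed in Table 5.2. The sample size, the seed and the function names are arbitrary assumptions.

```python
import math
import random


def random_unit_vector(n, rng):
    """Uniform point on S^{n-1}, obtained by normalizing a standard Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def empirical_upper_tail(n, t, samples, rng):
    """Fraction of samples for which f(x) = x_1 exceeds its median 0 by more than t."""
    hits = sum(1 for _ in range(samples) if random_unit_vector(n, rng)[0] > t)
    return hits / samples


def subgaussian_bound(t, C, lam):
    """Right-hand side C * exp(-lam * t^2) of (5.21)."""
    return C * math.exp(-lam * t * t)


rng = random.Random(0)
n, samples = 30, 5000
for t in (0.1, 0.2, 0.3):
    tail = empirical_upper_tail(n, t, samples, rng)
    print(f"t={t}: empirical {tail:.4f}, bound {subgaussian_bound(t, 0.5, n / 2):.4f}")
```

The bound is not tight for this particular function; its strength is that it holds uniformly over all 1-Lipschitz functions.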

Remark 5.12. We point out that if a function $f$ is such that one of the inequalities (5.21) or (5.22) holds (for all $t > 0$) with constants $C, \lambda$, then the other inequality similarly holds (for the same function) with some other constants. For example, if (5.22) holds with $C \ge \frac12$ and $\lambda$, then (5.21) holds with $2C^2$ and $\lambda/2$; if (5.21) holds with $C \ge e^{-1/3} \approx 0.717$ and $\lambda$, then (5.22) holds with $eC^2$ and $\lambda/2$ (see Proposition 5.29 and Remarks 5.30, 5.31). Sharper results of this nature (i.e., with better dependence on $C, \lambda$) can sometimes be obtained if we assume that (5.21) (or (5.22)) holds for all real-valued 1-Lipschitz functions on $X$; some questions in that spirit are considered in [Led01] (see, e.g., Exercise 5.48).
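The bookkeeping in Remark 5.12 can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and the conversion rules are the two examples quoted in the remark (mean to median: $(C, \lambda) \mapsto (2C^2, \lambda/2)$ for $C \ge \frac12$; median to mean: $(C, \lambda) \mapsto (eC^2, \lambda/2)$ for $C \ge e^{-1/3}$). The sample value of $n$ is an arbitrary assumption.

```python
import math
from typing import Tuple


def mean_to_median(C: float, lam: float) -> Tuple[float, float]:
    """From (5.22) with (C, lam), C >= 1/2, to (5.21) with (2*C^2, lam/2)."""
    if C < 0.5:
        raise ValueError("the remark states this conversion only for C >= 1/2")
    return 2 * C ** 2, lam / 2


def median_to_mean(C: float, lam: float) -> Tuple[float, float]:
    """From (5.21) with (C, lam), C >= e^(-1/3), to (5.22) with (e*C^2, lam/2)."""
    if C < math.exp(-1 / 3):
        raise ValueError("the remark states this conversion only for C >= e^(-1/3)")
    return math.e * C ** 2, lam / 2


n = 12  # hypothetical dimension, chosen only for illustration
# Mean-type constants (1, n/12), as listed for U(n) in Table 5.2.
print(mean_to_median(1.0, n / 12))
```

Applied to the mean-type pair $(1, n/12)$, the first rule returns $(2, n/24)$, which is how the entries marked (f) in Table 5.2 appear to be obtained.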
Table 5.2. Constants and exponents in subgaussian concentration inequalities for a selection of classical objects. When applicable, the reference measure is the canonical invariant measure on the object in question. We made an effort to come up with reasonable values of constants/exponents, and some of them are optimal. Unless indicated otherwise, the metric used for manifolds is the Riemannian geodesic distance. $d_H$ stands for the normalized Hamming distance (5.12). References: (a) Theorem 5.24. (b) Log-Sobolev inequality (LSI), see Table 5.4. (c) Corollary 5.17. (d) Proposition 5.20; what follows from the LSI is $\lambda = \frac{n-1}{2}$. (e) Ricci curvature, see Table 5.3. (f) Remark 5.12. (g) Corollary 5.52. (h) LSI on the discrete cube, see Theorem 5.1 and Exercise 5.5 in [BLM13]. (i) Theorem 5.54; convex or concave functions only. (j) The constant in the exponent is $\frac18$ and not $\frac12$ due to rescaling ($\{-1, 1\}$ vs. $\{0, 1\}$). (k) Theorem 5.56; convex functions only. (l) Theorem 5.38. (m) Theorem 5.39. (n) $(C, \lambda) = (2, \frac14)$ if $n = 2$; Remark 5.12. (o) Exercise 5.54. (p) Remark 5.19. (q) If we use instead the non-Riemannian metric (B.11), the parameter $\lambda$ needs to be multiplied by 2 in view of (B.12). (r) Remark 5.53.

Object | $C, \lambda$ in (5.21), median | $C, \lambda$ in (5.22), mean | Comments
Gauss space $(\mathbb{R}^n, |\cdot|, \gamma_n)$ | $\frac12, \frac12$ (a) | $1, \frac12$ (b) |
Gauss space $(\mathbb{C}^n, |\cdot|, \gamma_n^{\mathbb{C}})$ | $\frac12, 1$ (a) | $1, 1$ (b) |
$(S^{n-1}, g)$ or $(S^{n-1}, |\cdot|)$ | $\frac12, \frac{n}{2}$ (c) | $1, \frac{n}{2}$ (d) | $n > 2$ for $(S^{n-1}, g)$ (p)
$\mathrm{SO}(n)$ | $\frac12, \frac{n-1}{8}$ (e) | $1, \frac{n-1}{8}$ (b) | metric (B.8)
$\mathrm{SU}(n)$ | $\frac12, \frac{n}{4}$ (e) | $1, \frac{n}{4}$ (b) | metric (B.8)
$\mathrm{U}(n)$ | $2, \frac{n}{24}$ (f) | $1, \frac{n}{12}$ (b) | metric (B.8)
$\mathrm{Gr}(k, \mathbb{R}^n)$ | $\frac12, \frac{n-2}{4}$ (e) | $1, \frac{n-2}{4}$ (b) | metric (B.10) (q)
$\mathrm{Gr}(k, \mathbb{C}^n)$ | $\frac12, \frac{n}{2}$ (e) | $1, \frac{n}{2}$ (b) | metric (B.10) (q)
$(\{-1, 1\}^n, d_H)$ | $1, 2n$ (g) | $1, 2n$ (h) | $n \ge 3$ (r)
$(\{-1, 1\}^n, |\cdot|)$ | $2, \frac18$ (i)(j) | $1, \frac18$ (k)(j) | appropriate convexity hypotheses
Ricci curvature $\ge c$ | $\frac12, \frac{c}{2}$ (l) | $1, \frac{c}{2}$ (b) |
LSI with constant $\le \alpha$ | $2, \frac{1}{4\alpha}$ (f) | $1, \frac{1}{2\alpha}$ (m) |
$(S^{n-1})^k$ | $\frac12, \frac{n-2}{2}$ (e)(n) | $1, \frac{n-1}{2}$ (b) | $\ell_2$ product metric
$[0, 1]^k$ | $\frac12, \pi$ (o) | $1, \frac{\pi^2}{2}$ (b) | $\ell_2$ product metric
In the next two subsections we will exemplify the concentration phenomenon and related techniques in the case of the Euclidean sphere and the Gaussian space. In subsequent subsections we will survey some general methods for proving isoperimetric/concentration results and present a selection of examples, in particular those listed in Table 5.2. We will concentrate on the objects that exhibit subgaussian concentration; more general settings will be addressed briefly in exercises and in Notes and Remarks (an exception is Section 5.2.6 which treats sums of independent subexponential random variables). A comprehensive presentation of diverse aspects and manifestations of the concentration phenomenon is beyond the scope of this work; we refer the interested reader to the monographs [Led01, BLM13] and/or to other sources listed in Notes and Remarks. Here we restrict our attention to highlighting several central techniques and, subsequently, to going over examples that appear to be of relevance to the quantum theory.
5.2.1. A prime example: concentration on the sphere. The settings of the Euclidean sphere and of the projective space are directly relevant to quantum information theory since the latter identifies canonically with the set of pure states. In the language of enlargements, the isoperimetric inequality on the sphere can be stated as follows.
Theorem 5.13 (Spherical isoperimetric inequality, not proved here). Equip the unit sphere $S^{n-1} \subset \mathbb{R}^n$ with the geodesic distance $g$ and the uniform probability measure $\sigma$. If $A \subset S^{n-1}$ and if $C \subset S^{n-1}$ is a spherical cap such that $\sigma(A) = \sigma(C)$, then, for any $\varepsilon > 0$,
(5.23) $\sigma(A_\varepsilon) \ge \sigma(C_\varepsilon)$.
Recall that the spherical cap with center $x \in S^{n-1}$ and radius $\varepsilon$ is the set
$C(x, \varepsilon) = \{y \in S^{n-1} : g(x, y) \le \varepsilon\}$.
Note that the class of spherical caps is stable under enlargements and that we have
(5.24) $C(x, \varepsilon)_\delta = C(x, \varepsilon + \delta)$ for any $\delta, \varepsilon > 0$.


In view of the simple relationship between $g$ and the extrinsic (or chordal) distance inherited from the ambient Euclidean space (see Appendix B.1), Theorem 5.13 is valid also for the latter. However, it is traditionally stated for the geodesic distance. Also, the formula (5.24) for $C(x, \varepsilon)_\delta$ stated above would be more complicated if we used $|\cdot|$ to define caps.
The usefulness of Theorem 5.13 comes from the fact that there are explicit integral formulas and sharp bounds for the measure of spherical caps, which were explored in Section 5.1.2. However, while in the study of packing and covering small caps seemed most interesting, in the present context of concentration the radii close to $\pi/2$ are most relevant. This is because arguably the most useful instance of Theorem 5.13 is $\sigma(A) = \frac12$, in which case the radius of the corresponding cap $C$ is $\pi/2$ and the radius of its $\varepsilon$-enlargement, $C_\varepsilon$, is $\pi/2 + \varepsilon$. Taking into account the bound (5.5) leads then to
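Corollary 5.14. If $n > 2$ and if $A \subset S^{n-1}$ with $\sigma(A) \ge \frac12$ and $\varepsilon > 0$, then
(5.25) $\sigma(A_\varepsilon) \ge \sigma\big(C(x, \frac{\pi}{2} + \varepsilon)\big) \ge 1 - \frac12 e^{-n\varepsilon^2/2}$.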
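The second inequality in (5.25) can be checked numerically. The Python sketch below is an illustration only: it assumes the standard fact that the polar angle of a uniform point on $S^{n-1}$ has density proportional to $\sin^{n-2}\theta$, and it uses a plain midpoint rule; the function names and the step count are our choices.

```python
import math


def cap_measure(n: int, radius: float, steps: int = 4000) -> float:
    """Normalized measure of a geodesic cap of the given radius on S^{n-1}.

    Integrates the polar-angle density sin(theta)^(n-2) with the midpoint rule.
    """
    def integral(upper: float) -> float:
        h = upper / steps
        return h * sum(math.sin((i + 0.5) * h) ** (n - 2) for i in range(steps))

    radius = min(max(radius, 0.0), math.pi)  # caps of radius >= pi are the whole sphere
    return integral(radius) / integral(math.pi)


def corollary_bound(n: int, eps: float) -> float:
    """Right-hand side of (5.25): 1 - (1/2) exp(-n eps^2 / 2)."""
    return 1.0 - 0.5 * math.exp(-n * eps ** 2 / 2)


for n in (3, 10, 100):
    for eps in (0.1, 0.5, 1.0):
        print(n, eps, cap_measure(n, math.pi / 2 + eps), corollary_bound(n, eps))
```

For $n = 3$ the cap measure has the closed form $(1 - \cos r)/2$, which gives an independent check of the quadrature.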
There is no simple proof of the isoperimetric inequality on the sphere (Theorem 5.13) that we know of. However, a result just slightly weaker than Corollary 5.14 follows easily from the Brunn–Minkowski inequality (4.21). We have the following
Proposition 5.15. If $\varepsilon \in (0, \pi/2]$ and $K, L \subset S^{n-1}$ are such that $\mathrm{dist}(K, L) \ge \varepsilon$ (in the geodesic distance), then $\sigma(K)\sigma(L) \le e^{-n\varepsilon^2/4}$. In particular, if $\sigma(K) \ge 1/2$, then $\sigma(K_\varepsilon) \ge 1 - 2e^{-n\varepsilon^2/4}$.
Proof. The second statement follows by applying the first one with $L = K_\varepsilon^c$. It thus remains to prove the first statement.
Define $K' \subset B_2^n$ via $K' := \{tx : x \in K,\ t \in [0, 1]\}$ and similarly for $L'$. Then $\mathrm{vol}(K') = \sigma(K)\,\mathrm{vol}(B_2^n)$ and $\mathrm{vol}(L') = \sigma(L)\,\mathrm{vol}(B_2^n)$. Consequently, by the Brunn–Minkowski inequality in the form (4.21),
$\mathrm{vol}\big(\frac{K' + L'}{2}\big) \ge \sqrt{\mathrm{vol}(K')\,\mathrm{vol}(L')} = \sqrt{\sigma(K)\sigma(L)}\ \mathrm{vol}(B_2^n)$.
On the other hand, if $x, y \in S^{n-1}$ and the angle between $x$ and $y$ is at least $\varepsilon$, then $|(x + y)/2| \le \cos(\varepsilon/2)$. If $\varepsilon \le \pi/2$ (and so $\langle x, y\rangle \ge 0$), a simple calculation shows that the same is true if we replace $x$ and $y$ by $x' = sx$ and $y' = ty$, where $s, t \in [0, 1]$ (in fact this is even true if $\varepsilon \le 2\pi/3$). This means that we have then $\frac{K' + L'}{2} \subset \cos(\varepsilon/2) B_2^n$ and so $\sqrt{\sigma(K)\sigma(L)} \le (\cos(\varepsilon/2))^n$. It remains to appeal to the (subtle but elementary) inequality $\cos u \le e^{-u^2/2}$ (see Exercise 5.3). □
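The last step of the proof can be sanity-checked numerically. The following Python sketch is an illustration and not a proof: it evaluates $\cos u - e^{-u^2/2}$ on a grid over $[0, \pi/2]$ and compares the bound $\cos(\varepsilon/2)^{2n}$ produced by the argument with the stated bound $e^{-n\varepsilon^2/4}$. The function names and the grid size are our choices.

```python
import math


def max_gap_cos_vs_gaussian(points: int = 10001) -> float:
    """Largest value of cos(u) - exp(-u^2/2) on a uniform grid over [0, pi/2]."""
    step = (math.pi / 2) / (points - 1)
    return max(math.cos(i * step) - math.exp(-(i * step) ** 2 / 2)
               for i in range(points))


def product_bound_from_proof(n: int, eps: float) -> float:
    """The bound cos(eps/2)^(2n) on sigma(K) * sigma(L) obtained in the proof."""
    return math.cos(eps / 2) ** (2 * n)


def product_bound_stated(n: int, eps: float) -> float:
    """The bound exp(-n eps^2 / 4) stated in Proposition 5.15."""
    return math.exp(-n * eps ** 2 / 4)


print(max_gap_cos_vs_gaussian())  # should not exceed 0, up to rounding
print(product_bound_from_proof(10, 1.0), product_bound_stated(10, 1.0))
```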
Remark 5.16. (1) Proposition 5.15 holds actually for the entire nontrivial range of $\varepsilon$, which is $[0, \pi]$; this follows a posteriori from the estimate in Lévy’s lemma (see Exercise 5.26). The above proof fails for large $\varepsilon$; however, only the range $[0, \pi/2]$ is relevant to the second statement and to Corollary 5.14: if $\sigma(K) \ge 1/2$, then no point $x$ can verify $\mathrm{dist}(x, K) > \pi/2$.
(2) The estimate in the Proposition is pretty tight: if $K, L$ are opposite (i.e., $K = -L$) caps with $\mathrm{dist}(K, L) = 2\varepsilon$, we conclude from the Proposition that $\sigma(K) \le e^{-n\varepsilon^2/2}$. This compares fairly well with the bound $\frac12 e^{-n\varepsilon^2/2}$ implicit in (5.25).
Corollary 5.14 readily implies a concentration result for Lipschitz functions, which is often referred to in quantum information circles as Lévy’s lemma.
Corollary 5.17 (Lévy’s lemma). Let $n > 2$. If $f : (S^{n-1}, g) \to \mathbb{R}$ is an $L$-Lipschitz function and if $M_f$ is a median for $f$, then, for any $t > 0$,
(5.26) $\sigma(f > M_f + t) \le \frac12 \exp(-nt^2/2L^2)$,
and therefore
(5.27) $\sigma(|f - M_f| > t) \le \exp(-nt^2/2L^2)$.


Proof. Let $A = \{x \in S^{n-1} : f(x) \le M_f\}$ and set $\varepsilon = t/L$. Since $f \le M_f$ on $A$ and since $f$ is $L$-Lipschitz (i.e., $|f(x) - f(y)| \le L\,g(x, y)$ for $x, y \in S^{n-1}$), it follows that for any $y \in S^{n-1}$ we have $f(y) \le M_f + L\,g(y, A)$. In particular, if $y \in A_\varepsilon$, then $g(y, A) \le \varepsilon$ and so $f(y) \le M_f + L\varepsilon = M_f + t$. In other words, we proved that $A_\varepsilon \subset \{f \le M_f + t\} = \{f > M_f + t\}^c$. The first inequality in Corollary 5.17 follows now by observing that, by the definition of the median, $\sigma(A) \ge \frac12$ and by appealing to Corollary 5.14.
The second inequality follows from the first one combined with an identical bound on $\sigma(f < M_f - t)$, which is shown either by the same argument applied to $A = \{x \in S^{n-1} : f(x) \ge M_f\}$, or by appealing to the first inequality with $f$ replaced by $-f$. □
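A small Monte Carlo experiment illustrates (5.26). The sketch below is only an illustration under our own assumptions: the test function $f(x) = \max_i |x_i|$ (which is 1-Lipschitz for the Euclidean distance, hence for $g$), the dimension, the sample size, the seed, and the use of the empirical median in place of $M_f$ are all choices made here, and uniform points on the sphere are generated by normalizing Gaussian vectors.

```python
import math
import random
from typing import List, Sequence, Tuple


def uniform_sphere_point(n: int, rng: random.Random) -> List[float]:
    """A uniform random point on S^{n-1}, obtained by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def levy_check(n: int = 50, samples: int = 4000,
               ts: Sequence[float] = (0.05, 0.1, 0.2),
               seed: int = 0) -> List[Tuple[float, float, float]]:
    """Rows (t, empirical tail of f above its empirical median + t, bound (5.26))."""
    rng = random.Random(seed)
    values = sorted(max(abs(x) for x in uniform_sphere_point(n, rng))
                    for _ in range(samples))
    median = values[samples // 2]  # empirical stand-in for M_f
    rows = []
    for t in ts:
        tail = sum(v > median + t for v in values) / samples
        bound = 0.5 * math.exp(-n * t * t / 2)  # (5.26) with L = 1
        rows.append((t, tail, bound))
    return rows


for row in levy_check():
    print(row)
```

For this particular function the empirical tails are expected to sit well below the bound, since (5.26) has to cover every 1-Lipschitz function at once.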
Remark 5.18. Both parts of the above proof are quite general. First, any lower bounds on measures of enlargements of sets of measure $\frac12$ imply (in fact are equivalent to, see Exercise 5.27) bounds for deviation of Lipschitz functions from their medians. Second, any one-sided bound for deviation from the median (or the expected value, or any other “symmetric” parameter) implies a two-sided bound, at the cost of a factor of 2.
Remark 5.19. In Corollaries 5.14 and 5.17 we have to assume that $n > 2$ because the bound (5.5) is not valid in the entire nontrivial range $0 \le t \le \pi/2$. If $n = 2$, one needs to replace the function $\frac12 e^{-nt^2/2}$ by $\max\{\frac12 - \frac{t}{\pi}, 0\}$. However, no modifications are needed if the enlargements or the Lipschitz constants are calculated with respect to the ambient space metric, or if only small values of $\varepsilon$ or $t$ are of interest, say, $\varepsilon \le 1$ or $t \le L$.
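The exceptional role of $n = 2$ can be seen on a concrete value of $t$. In the Python sketch below (an illustration; the function names and the sample point $t = 1.1$ are our choices), `arc_tail` is the measure of the complement of the $t$-enlargement of a half-circle, and `gaussian_type_bound` is the expression $\frac12 e^{-nt^2/2}$.

```python
import math


def arc_tail(t: float) -> float:
    """Measure of the complement of the t-enlargement of a half-circle on S^1."""
    return max(0.5 - t / math.pi, 0.0)


def gaussian_type_bound(t: float, n: int = 2) -> float:
    """The expression (1/2) exp(-n t^2 / 2) from Corollaries 5.14 and 5.17."""
    return 0.5 * math.exp(-n * t * t / 2)


# For t slightly above 1 the arc measure exceeds the Gaussian-type expression.
print(arc_tail(1.1), gaussian_type_bound(1.1))
# For t <= 1 the Gaussian-type expression still dominates on this grid.
print(all(arc_tail(i / 100) <= gaussian_type_bound(i / 100) + 1e-12
          for i in range(101)))
```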

Concentration around the median follows naturally from the isoperimetric inequality. As we mentioned in Remark 5.12, this implies formally concentration around the expectation with altered constants. In some situations, it is possible to obtain good constants with extra work.
Proposition 5.20 (Lévy’s lemma for the mean, not proved here). Let $n > 2$. If $f : (S^{n-1}, g) \to \mathbb{R}$ is a 1-Lipschitz function, then for any $t > 0$,
(5.28) $\sigma(f > \mathbf{E}f + t) \le \exp(-nt^2/2)$.
As mentioned in Remark 5.18, the inequality $\sigma(|f - \mathbf{E}f| > t) \le 2\exp(-nt^2/2)$ follows formally, but is probably not optimal. See Problem 5.26 for questions about possible better bounds in this and similar settings.


Exercise 5.26 (Proposition 5.15 holds for the full range of $\varepsilon$). Show that it follows a posteriori from Theorem 5.13 and the bound (5.5) that, for $n > 2$, in the notation and under the hypotheses of Proposition 5.15, we have $\sigma(K)\sigma(L) \le \frac14 e^{-n\varepsilon^2/4}$. For $n = 2$, the optimal inequality is $\sigma(K)\sigma(L) \le \frac14\big(1 - \frac{\varepsilon}{\pi}\big)^2$ (cf. Remark 5.19).
Exercise 5.27 (Concentration implies isoperimetry). Show that, for a metric probability space $(X, \mu)$, concentration implies isoperimetry in the following sense: if $\mu(f > M_f + t) \le \alpha$ for any 1-Lipschitz function $f$, then $\mu(A_t) \ge 1 - \alpha$ for any $A \subset X$ with $\mu(A) = \frac12$.
Exercise 5.28 (A finer bound for the mean width of a union). Let K, L be two
bounded sets in Rn , b
and R the outradius of K Y L. Show that wpconvpK Y Lqq ď

maxpwpKq, wpLqq ` n R.
5.2.2. Gaussian concentration. Another classical setting where isoperimetry and concentration have been widely studied is the Gaussian space $(\mathbb{R}^n, |\cdot|, \gamma_n)$, where $\gamma_n$ is the standard Gaussian measure on $\mathbb{R}^n$ (see Appendix A.2 for the notation, basic properties and relevant facts). It turns out that the extremal sets for the isoperimetric problem are then half-spaces, and since their enlargements are also half-spaces, the solution to the problem can be expressed simply in terms of the cumulative distribution function of an $N(0, 1)$ variable, i.e., in terms of $\Phi(x) := \gamma_1((-\infty, x])$. We have
Theorem 5.21 (Gaussian isoperimetric inequality, see Exercise 5.30). Let $A \subset \mathbb{R}^n$, and let $a \in \mathbb{R}$ be defined by $\gamma_1((-\infty, a]) = \gamma_n(A)$. Then, for any $\varepsilon > 0$,
(5.29) $\gamma_n(A_\varepsilon) \ge \gamma_1((-\infty, a + \varepsilon])$
or, equivalently,
(5.30) $\Phi^{-1}(\gamma_n(A_\varepsilon)) \ge \Phi^{-1}(\gamma_n(A)) + \varepsilon$.
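As a one-dimensional illustration of (5.29), one can compare a symmetric interval $A = [-r, r]$, whose enlargement is $[-r - \varepsilon, r + \varepsilon]$, with the half-line of the same Gaussian measure. The Python sketch below uses `statistics.NormalDist` from the standard library for $\Phi$ and $\Phi^{-1}$; the choice of the set $A$ and the function names are ours.

```python
from statistics import NormalDist

STD = NormalDist()  # the standard Gaussian measure gamma_1


def interval_measure(r: float) -> float:
    """gamma_1([-r, r])."""
    return STD.cdf(r) - STD.cdf(-r)


def halfline_bound(measure: float, eps: float) -> float:
    """Right-hand side of (5.29): Phi(Phi^{-1}(measure) + eps)."""
    return STD.cdf(STD.inv_cdf(measure) + eps)


for r in (0.5, 1.0, 2.0):
    for eps in (0.1, 0.5, 1.0):
        enlarged = interval_measure(r + eps)           # gamma_1(A_eps)
        extremal = halfline_bound(interval_measure(r), eps)
        print(r, eps, enlarged, extremal)
```

In each row the enlarged interval should carry at least as much mass as the enlarged half-line, in line with half-spaces being extremal.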
The solution to the Gaussian isoperimetric problem (Theorem 5.21) was originally derived from the spherical isoperimetric inequality (Theorem 5.13) via the following classical fact.
Theorem 5.22 (Poincaré’s lemma, see Exercise 5.29). For $n, N \in \mathbb{N}$ with $N \ge n$, we consider $\mathbb{R}^n$ to be a subspace of $\mathbb{R}^N$. Next, fix $n$ and let $\nu_N$ be the pushforward to $\mathbb{R}^n$, via the orthogonal projection, of the normalized uniform measure on $\sqrt{N} S^{N-1}$. Then, as $N \to \infty$, $(\nu_N)$ converges to $\gamma_n$, the standard Gaussian measure on $\mathbb{R}^n$.
The convergence in Theorem 5.22 holds in a very strong sense, e.g., in total variation, or in uniform convergence of densities.
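For $n = 1$ the convergence in Poincaré’s lemma can be observed directly. The Python sketch below is an illustration under an assumption not stated in the text: it uses the classical fact that one coordinate of a uniform point on $\sqrt{N} S^{N-1}$ has density proportional to $(1 - x^2/N)^{(N-3)/2}$ on $[-\sqrt{N}, \sqrt{N}]$ (for $N \ge 3$). The midpoint rule, the evaluation point and the function names are our choices.

```python
import math
from statistics import NormalDist


def projected_cdf(N: int, a: float, steps: int = 20000) -> float:
    """CDF at a of one coordinate of a uniform point on sqrt(N) * S^{N-1}, N >= 3."""
    R = math.sqrt(N)

    def density(x: float) -> float:
        # Unnormalized density; the clamp guards against tiny negative rounding.
        return max(0.0, 1.0 - x * x / N) ** ((N - 3) / 2)

    def integral(lo: float, hi: float) -> float:
        h = (hi - lo) / steps
        return h * sum(density(lo + (i + 0.5) * h) for i in range(steps))

    return integral(-R, a) / integral(-R, R)


def cdf_error(N: int, a: float = 1.0) -> float:
    """Distance between the projected CDF and the Gaussian CDF at the point a."""
    return abs(projected_cdf(N, a) - NormalDist().cdf(a))


for N in (5, 50, 500):
    print(N, cdf_error(N))
```

The printed errors should decrease as $N$ grows, consistently with the strong modes of convergence mentioned above.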
Another derivation of the Gaussian isoperimetric inequality is based on the following analogue of the Brunn–Minkowski inequality in the Gaussian setting.
Theorem 5.23 (Ehrhard’s inequality, not proved here). Let $A, B$ be Borel subsets of $\mathbb{R}^n$ and let $\lambda \in [0, 1]$. Then
(5.31) $\Phi^{-1}(\gamma_n((1 - \lambda)A + \lambda B)) \ge (1 - \lambda)\Phi^{-1}(\gamma_n(A)) + \lambda\Phi^{-1}(\gamma_n(B))$.
Ehrhard’s inequality is stronger than log-concavity of the Gaussian measure (Section 4.3.2), see Exercise 5.31. Assuming Ehrhard’s inequality, the derivation of the Gaussian isoperimetric inequality goes as follows. Fix $A, \varepsilon$ and let $\lambda \in (0, 1)$. Since $A_\varepsilon = A + \varepsilon B_2^n = (1 - \lambda)(1 - \lambda)^{-1}A + \lambda\varepsilon\lambda^{-1}B_2^n$, we have, by (5.31),
(5.32) $\Phi^{-1}(\gamma_n(A_\varepsilon)) \ge (1 - \lambda)\Phi^{-1}(\gamma_n((1 - \lambda)^{-1}A)) + \lambda\Phi^{-1}(\gamma_n(\varepsilon\lambda^{-1}B_2^n))$.
We now let $\lambda \to 0^+$. The first term on the right-hand side of (5.32) converges clearly to $\Phi^{-1}(\gamma_n(A))$, while the second term converges to $\varepsilon$ (this is a little harder, but elementary, see Exercise 5.32), and so we proved the Gaussian isoperimetric inequality in the form (5.30).
The next theorem follows from Theorem 5.21 according to the general scheme indicated in Remark 5.18, with the explicit exponential bound being a consequence of Exercise A.1.
Theorem 5.24. If $f : \mathbb{R}^n \to \mathbb{R}$ is $L$-Lipschitz and $M_f$ denotes its median (with respect to $\gamma_n$), then for any $t > 0$
(5.33) $\gamma_n(f > M_f + t) \le \gamma_1((t/L, \infty)) \le \frac12 e^{-t^2/2L^2}$,
$\gamma_n(|f - M_f| > t) \le e^{-t^2/2L^2}$.
As we already noted in the setting of the sphere, concentration around the median formally implies similar concentration around the mean (see Remark 5.12). However, this approach leads to suboptimal constants. A more precise technique relies on the log-Sobolev inequality from Section 5.2.4.2, which, specialized to the Gaussian setting, yields the following.
Theorem 5.25 (see Theorem 5.39 and Proposition 5.42). If $f : \mathbb{R}^n \to \mathbb{R}$ is $L$-Lipschitz and $\mathbf{E}f$ is the mean of $f$ (with respect to $\gamma_n$), then for any $t > 0$
(5.34) $\max\{\gamma_n(f > \mathbf{E}f + t),\ \gamma_n(f < \mathbf{E}f - t)\} \le e^{-t^2/2L^2}$.
There is some numerical evidence that the assertion of Theorem 5.25 can be further strengthened. We pose
Problem 5.26. If $f : \mathbb{R}^n \to \mathbb{R}$ is 1-Lipschitz and $\mathbf{E}f$ denotes its average with respect to $\gamma_n$, is it true that $\gamma_n(|f - \mathbf{E}f| > t) \le e^{-t^2/2}$? The case $n = 1$ implies the general case and is probably not that hard to settle. Similarly, is it true that $\sigma(|f - \mathbf{E}f| > t) \le \exp(-nt^2/2)$ if $f : (S^{n-1}, g) \to \mathbb{R}$ is a 1-Lipschitz function (and $n > 2$; see Remark 5.19 for comments on peculiarities of the case $n = 2$)?
An example of a function for which Theorem 5.24 is meaningful is the Euclidean norm, which is trivially 1-Lipschitz. This gives the following (see also Exercise 5.37).
Corollary 5.27. Let $G$ be a standard Gaussian vector in $\mathbb{R}^n$. Then, for any $t > 0$,
$\mathbf{P}\big(|G| \ge \sqrt{n} + t\big) \le \frac12 e^{-t^2/2}$ and $\mathbf{P}\big(|G| \le \sqrt{n - \tfrac23} - t\big) \le \frac12 e^{-t^2/2}$.
The distribution of $|G|^2$ is commonly known as $\chi^2(n)$, the chi-squared distribution with $n$ degrees of freedom. Denoting by $m_n$ the median of $|G|$, what is required to deduce Corollary 5.27 from Theorem 5.24 are the inequalities $\sqrt{n - \tfrac23} \le m_n \le \sqrt{n}$. The lower bound is proved in Exercise 5.34 and the upper bound follows from Proposition 5.34: we have $m_n \le \kappa_n \le \sqrt{n}$.
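For even $n$ the tails in Corollary 5.27 can be computed exactly, which allows a direct comparison with the stated bounds. The Python sketch below is an illustration; it relies on the classical closed form $\mathbf{P}(\chi^2(2k) > y) = e^{-y/2}\sum_{j<k} (y/2)^j/j!$, which is an assumption brought in here and not taken from the text, and the function name is ours.

```python
import math


def norm_survival(n: int, r: float) -> float:
    """P(|G| >= r) for a standard Gaussian vector G in R^n, with n even."""
    if n % 2:
        raise ValueError("the closed form used here needs an even n")
    if r <= 0:
        return 1.0
    x = r * r / 2
    term, total = 1.0, 1.0
    for j in range(1, n // 2):
        term *= x / j          # term equals x^j / j!
        total += term
    return math.exp(-x) * total


n, t = 10, 1.0
upper = norm_survival(n, math.sqrt(n) + t)
lower = 1.0 - norm_survival(n, math.sqrt(n - 2 / 3) - t)
print(upper, lower, 0.5 * math.exp(-t * t / 2))
# Median bracket: sqrt(n - 2/3) <= m_n <= sqrt(n).
print(norm_survival(n, math.sqrt(n)), norm_survival(n, math.sqrt(n - 2 / 3)))
```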
Exercise 5.29 (Weak convergence in Poincaré’s lemma). In the context of Poincaré’s lemma (Theorem 5.22), show without any computation that the sequence $(\nu_N)$ converges weakly towards $\gamma_n$.
Exercise 5.30 (Gaussian isoperimetric inequality via Poincaré lemma). Derive the Gaussian isoperimetric inequality (5.29) from the Poincaré lemma (Theorem 5.22) and the spherical isoperimetric inequality (Theorem 5.13).
Exercise 5.31 (Ehrhard’s inequality implies log-concavity). Show that Theorem 5.23 (Ehrhard’s inequality) formally implies that the Gaussian measure $\gamma_n$ satisfies the log-concavity inequality (4.28).
Exercise 5.32 (Gaussian measure of large balls). Show that
$\lim_{r \to +\infty} \frac{\Phi^{-1}\big(\gamma_n(rB_2^n)\big)}{r} = 1$.
Exercise 5.33 (Ehrhard-like (a-)symmetrization). Show that the following statement is equivalent to the validity of Ehrhard’s inequality for convex bodies. Let $K \subset \mathbb{R}^n$ be a convex body and let $E \subset \mathbb{R}^n$ be a $k$-dimensional subspace with $0 < k < n$. Identify $E$ and $E^\perp$ with, respectively, $\mathbb{R}^k$ and $\mathbb{R}^{n-k}$ and define a set $L \subset \mathbb{R}^{k+1}$ by
$(x, s) \in L \iff s \le \Phi^{-1}\big(\gamma_{n-k}(\{y \in E^\perp : (x, y) \in K\})\big)$,
where $x \in E$, $s \in \mathbb{R}$. Then $L$ is convex.
In the case when $E = u^\perp$ is a hyperplane (i.e., $k = n - 1$) the transformation $K \mapsto L$ is called Ehrhard (a-)symmetrization in direction $u$.
Exercise 5.34 (Median of the chi-squared distribution, based on [CR86]). Let $X$ be a random variable with distribution $\chi^2(n)$, and $V = \big(\frac{X}{n - 2/3}\big)^{1/3}$. Show that the density $h$ of $V$ satisfies the inequality $h(1 - t) \le h(1 + t)$ for $t \in [0, 1]$, and conclude that the median of $V$ is greater than 1, therefore the median of $X$ is larger than $n - 2/3$. Higher order two-sided bounds for the median can be found in [BS].
5.2.3. Concentration tricks and treats. This section contains a selection of largely elementary facts related to the concentration phenomenon. It supplies a set of tools allowing for flexible applications of concentration results. As a rule, the facts are well known to experts in the area and are included here for future reference. Proofs are relegated to exercises.
5.2.3.1. Laplace transform. We mostly restrict ourselves to settings where concentration exhibits a subgaussian behaviour as in (5.21) or (5.22). Such behaviour can be proved via estimating the bilateral Laplace transform, using the exponential Markov inequality $\mathbf{P}(X > t) \le e^{-st}\,\mathbf{E}\exp(sX)$ for $s > 0$.
Lemma 5.28 (Laplace transform method). Let $X$ be a random variable such that $\mathbf{E}\exp(sX) \le A\exp(\beta s^2)$ for every $s \in \mathbb{R}$. Then, for every $t > 0$,
$\max(\mathbf{P}(X > t), \mathbf{P}(-X > t)) \le A\exp(-t^2/4\beta)$.
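The mechanism behind Lemma 5.28 is to minimize the exponential Markov bound $e^{-st} A e^{\beta s^2}$ over $s > 0$; the minimum is attained at $s = t/2\beta$. The Python sketch below illustrates this numerically for a standard Gaussian variable, for which $\mathbf{E}\exp(sX) = e^{s^2/2}$, i.e., $A = 1$ and $\beta = \frac12$. The grid search, its parameters and the function names are our choices, made only for illustration.

```python
import math


def chernoff_on_grid(t: float, A: float, beta: float,
                     grid: int = 2000, s_max: float = 20.0) -> float:
    """Minimum over a grid of s in (0, s_max] of exp(-s t) * A * exp(beta s^2)."""
    return min(A * math.exp(beta * s * s - s * t)
               for s in (s_max * (i + 1) / grid for i in range(grid)))


def closed_form(t: float, A: float, beta: float) -> float:
    """The bound A exp(-t^2 / (4 beta)) of Lemma 5.28."""
    return A * math.exp(-t * t / (4 * beta))


def gaussian_tail(t: float) -> float:
    """Exact P(X > t) for a standard Gaussian X."""
    return 0.5 * math.erfc(t / math.sqrt(2))


for t in (0.5, 1.0, 2.0, 3.0):
    print(t, gaussian_tail(t), chernoff_on_grid(t, 1.0, 0.5), closed_form(t, 1.0, 0.5))
```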
Exercise 5.35. Prove Lemma 5.28 about the Laplace transform method.
Exercise 5.36. Prove Hoeffding’s lemma: if $X$ is a mean zero random variable taking values in an interval $[a, b]$, then $\mathbf{E}\exp(sX) \le \exp\big(\frac18 s^2(b - a)^2\big)$ for any $s \in \mathbb{R}$.
Exercise 5.37 (A large deviation bound for chi-squared variable, based on [Vem04]). Let $X$ be a random variable with distribution $\chi^2(n)$, for example $X = |G|^2$ where $G$ is a standard Gaussian vector in $\mathbb{R}^n$. Show that $\mathbf{E}\exp(sX) = (1 - 2s)^{-n/2}$ for any $s < 1/2$. Conclude that $\mathbf{P}(X \ge (1 + \varepsilon)n) \le \big((1 + \varepsilon)\exp(-\varepsilon)\big)^{n/2}$ for any $\varepsilon > 0$ and that $\mathbf{P}(X \le (1 - \varepsilon)n) \le \big((1 - \varepsilon)\exp(\varepsilon)\big)^{n/2}$ for $\varepsilon \in (0, 1]$. (We know from Cramér’s large deviations theorem that these bounds are sharp.) Conclude that
(5.35) $\mathbf{P}(|X - n| \ge \varepsilon n) \le 2\exp\Big(-\frac{n\varepsilon^2}{4 + 8\varepsilon/3}\Big)$.
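The chain of bounds in Exercise 5.37 can be compared with exact tails when $n$ is even. The Python sketch below is an illustration and not a solution of the exercise; it again assumes the classical closed form for the survival function of $\chi^2(2k)$, and the function names are ours.

```python
import math


def chi2_survival(n: int, y: float) -> float:
    """P(X >= y) for X ~ chi^2(n) with n even (closed form via a finite sum)."""
    if n % 2:
        raise ValueError("the closed form used here needs an even n")
    if y <= 0:
        return 1.0
    x = y / 2
    term, total = 1.0, 1.0
    for j in range(1, n // 2):
        term *= x / j
        total += term
    return math.exp(-x) * total


def upper_chernoff(n: int, eps: float) -> float:
    """((1 + eps) exp(-eps))^(n/2), the bound on P(X >= (1 + eps) n)."""
    return ((1 + eps) * math.exp(-eps)) ** (n / 2)


def lower_chernoff(n: int, eps: float) -> float:
    """((1 - eps) exp(eps))^(n/2), the bound on P(X <= (1 - eps) n)."""
    return ((1 - eps) * math.exp(eps)) ** (n / 2)


def two_sided_bound(n: int, eps: float) -> float:
    """Right-hand side of (5.35)."""
    return 2 * math.exp(-n * eps ** 2 / (4 + 8 * eps / 3))


n, eps = 10, 0.5
exact_upper = chi2_survival(n, (1 + eps) * n)
exact_lower = 1.0 - chi2_survival(n, (1 - eps) * n)
print(exact_upper, upper_chernoff(n, eps))
print(exact_lower, lower_chernoff(n, eps))
print(exact_upper + exact_lower, two_sided_bound(n, eps))
```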
5.2.3.2. Central values. Once we know that a function is concentrated around some value, we can a posteriori infer that it also concentrates around the mean or the median, or any other particular quantile. This can be formalized by the concept of a central value. If $Y$ is a real random variable, we will say that $M$ is a central value of $Y$ if $M$ is either the mean of $Y$, or any number between the 1st and the 3rd quartile of $Y$ (i.e., if $\min\{\mathbf{P}(Y \ge M), \mathbf{P}(Y \le M)\} \ge \frac14$; this happens in particular if $M$ is the median of $Y$). The numbers $\frac14$ and $\frac34$ play no special role and can be changed to other numbers from $(0, 1)$ at the cost of deteriorating (or improving) the constants in the statements that follow (see, e.g., Remark 5.31).
Proposition 5.29 (see Exercises 5.38–5.40). Let $Y$ be a real random variable and let $M$ be any central value for $Y$. Let $a \in \mathbb{R}$ and let constants $A \ge \frac12$, $\lambda > 0$ be such that, for any $t > 0$,
(5.36) $\max\{\mathbf{P}(Y > a + t), \mathbf{P}(Y < a - t)\} \le A\exp(-\lambda t^2)$.
Then $|M - a| \le \sqrt{\log(4A)}\,\lambda^{-1/2}$. Consequently, for any $t \ge \sqrt{\log(4A)}\,\lambda^{-1/2}$,
(5.37) $\max\{\mathbf{P}(Y > M + t), \mathbf{P}(Y < M - t)\} \le 4A^2\exp(-\lambda t^2/2)$.
Remark 5.30 (Improvements to Proposition 5.29). The expressions $\sqrt{\log(4A)}$ and $4A^2$ in the assertion of Proposition 5.29 can be replaced by $\sqrt{\log(\kappa A)}$ and $\kappa A^2$, where $\kappa = 2$ when $M$ is the median of $Y$ and $\kappa = e$ when $M$ is the expectation of $Y$; see Exercises 5.38, 5.39 and 5.40.
Remark 5.31 (On the necessity of restrictions on $t$ in Proposition 5.29). We point out that the bound on the first (resp., the second) probability appearing in (5.37) is valid under the formally weaker restriction $t > (M - a)^+$ (resp., $t > (M - a)^-$). The restriction $t \ge \sqrt{\log(4A)}\,\lambda^{-1/2}$, while annoying, cannot be completely avoided if we want to keep full generality because the hypothesis (5.36) does not necessarily supply any information about the probabilities appearing in the assertion if $t$ is small. However, this is only a minor inconvenience since for such $t$ the upper bound in (5.37) is never small and often holds for trivial reasons. In particular, (5.37) holds for all $t > 0$ if $M$ is the mean or any quantile between the 27th and 73rd percentile, or if $A \ge 3^{2/3}/4 \approx 0.52$, and always if we replace the factor $4A^2$ by $3\sqrt{2}A^2$. If $M$ is the median, we can go even further: no restrictions on $t$ are needed even if we replace $4A^2$ by $2A^2$ on the right hand side of (5.37); if $M$ is the mean, similar improvement (i.e., $eA^2$ on the right hand side) is possible when $A \ge e^{-1/3} \approx 0.717$ (these last observations were used in Remark 5.12).
ot
Corollary 5.32 (Lévy’s lemma for central values). Let f : pS n´1 , gq Ñ R be
N

an L-Lipschitz function and let M be any central value for f . Then |M ´ Mf | ď


?
2 log 2 n´1{2 and, for any ε ą 0,
ly.

´ nε2 ¯
(5.38) Ppf ě M ` εq ď exp ´ .
4L2
We sketch proofs and give more precise bounds and/or variations on the above results in Exercises 5.38–5.48. Note that while (5.38) follows from Proposition 5.29 and Corollary 5.17 for $n > 2$ and for $\varepsilon$ not-too-small, a separate argument is needed to cover the remaining cases (cf. Remark 5.31). We also point out that while Proposition 5.29 is meant to give reasonably good estimates valid in the most general setting when concentration is present, better bounds are available in specific instances. For example, Corollary 5.32 can be improved when $M$ is the mean (see Table 5.2 and Exercise 5.44), and similarly in the Gaussian case.
The heuristics behind Corollary 5.32 is as follows: if we know that all sets of measure at least $\frac12$ have large enlargements, then approximately the same is true for all sets of measure at least $\frac14$. Actually, almost the same is true for much smaller sets; here is a sample result.
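Proposition 5.33 (see Exercise 5.49). Let $(X, d, \mu)$ be a metric probability space and let $\varepsilon > 0$. Suppose that any set $A \subset X$ with $\mu(A) \ge \frac12$ verifies $\mu(A_\varepsilon) \ge 1 - Ce^{-\lambda\varepsilon^2}$. Then $\mu(B_{2\varepsilon}) \ge 1 - Ce^{-\lambda\varepsilon^2}$ for any set $B \subset X$ with $\mu(B) \ge Ce^{-\lambda\varepsilon^2}$.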
A common feature of concentration inequalities presented up to now is that in order to translate them to concrete bounds for concrete functions, we need to calculate, or at least reasonably estimate, the medians or expected values, or similar parameters of the functions under consideration. A selection of tools, some of them quite sharp, to handle expected values will be described in Section 6.1. The preceding three results tell us that it doesn’t really matter which central value we employ, as long as we are willing to pay a small penalty in the form of an additional multiplicative constant in the exponent and in front of the exponential. The following observation shows that, in the Gaussian context, sometimes no penalty is needed at all.
Proposition 5.34 (see Exercise 5.50). Let $f : \mathbb{R}^n \to \mathbb{R}$ be a convex function. Denote by $M_f$ (resp., $\mathbf{E}f$) the median (resp., the expectation) of $f$ with respect to the standard Gaussian measure $\gamma_n$. Then $M_f \le \mathbf{E}f$.

Exercise 5.38. Show that a random variable $Y_0$ such that $\mathbf{P}(Y_0 > t) \le A\exp(-t^2)$ for $t > 0$ must verify $\mathbf{E}Y_0 \le \mathbf{E}Y_0^+ \le \min\{A\sqrt{\pi}/2,\ \sqrt{1 + \log^+ A}\}$. Deduce the first assertion of Proposition 5.29 and the corresponding improvement from Remark 5.30 if $M$ is the mean of $Y$.
Exercise 5.39. Show that if $Y_0$ is a random variable such that $\mathbf{P}(Y_0 > t) \le A\exp(-t^2)$ for $t > 0$ and if $M_{3/4}$ is its 3rd quartile, then $M_{3/4} \le \sqrt{\log^+(4A)}$. Deduce the first assertion of Proposition 5.29 if $M$ is between the 1st and the 3rd quartile of $Y$, and the strengthening from Remark 5.30: $|M - a| \le \sqrt{\log^+(2A)}\,\lambda^{-1/2}$ if $M$ is the median of $Y$.
Exercise 5.40. Prove the inequality $e^{-s^2} \le e^{\delta^2} e^{-(s + \delta)^2/2}$ for $s, \delta \in \mathbb{R}$. Use it and the last two exercises to show the second assertion of Proposition 5.29, and its strengthenings stated in Remark 5.30 when $M$ is the median or the mean of $Y$.
Exercise 5.41. Verify the assertions in the last two sentences of Remark 5.31.
Exercise 5.42. Given $\alpha \in (0, 1)$, prove a version of (5.37) with the right-hand side of the form $B\exp(-\alpha\lambda t^2)$, where $B$ depends only on $A$ and $\alpha$ (and on $\kappa$ from Remark 5.30, if applicable).
Exercise 5.43 (Lévy’s lemma for central values). Let $n > 2$. Use Exercise 5.26 to derive Corollary 5.32 for any quantile between the 1st and the 3rd quartile.
Exercise 5.44 (The median and the mean on the sphere). Let $f$ be a 1-Lipschitz function on $(S^{n-1}, g)$ with $n > 2$. Show that the median and the mean of $f$ differ at most by $\sqrt{\pi/8n}$ and describe the extremal function.
Exercise 5.45 (Variance of a Lipschitz function on the sphere). Let $f$ be a 1-Lipschitz function on $(S^{n-1}, g)$ with $n > 1$. Show that $\mathrm{Var}(f) \le \frac{2}{n}$ and give an example with $\mathrm{Var}(f) \ge \frac{1}{n}$. What function gives the maximal variance?
Exercise 5.46 (Concentration around $L_2$ average). Let $f$ be a 1-Lipschitz and positive function on $(S^{n-1}, g)$ with $n > 1$. Set $q = (\mathbf{E}f^2)^{1/2}$. Show that for any $t > 0$, $\mathbf{P}(f \ge q + t) \le \exp(-nt^2/2)$ and $\mathbf{P}(f \le q - t) \le e\exp(-nt^2/2)$.
Exercise 5.47 (The case of $S^1$). Using directly the solution to the isoperimetric problem on $S^1$, show that Corollary 5.32 holds also for $n = 2$.
Exercise 5.48. Let $(X, d, \mu)$ be a metric probability space and let $\alpha : [0, \infty) \to [0, \infty)$ be such that $\mu(f \ge \mathbf{E}f + t) \le \alpha(t)$ for any bounded 1-Lipschitz function $f : X \to \mathbb{R}$ and for all $t > 0$. Then, for any such function $f$ and for any $t > 0$, $\mu(f \ge M_f + t) \le \alpha(t/2)$. Equivalently, $\mu(A_\varepsilon) \ge 1 - \alpha(\varepsilon/2)$ for any $A \subset X$ with $\mu(A) \ge 1/2$ and any $\varepsilon > 0$. The preceding argument can be iterated, see (1.18) in [Led01].
Exercise 5.49. Prove Proposition 5.33 about enlargements of fairly small sets.
Exercise 5.50 (Median vs. mean for convex functions of Gaussian variables). Prove Proposition 5.34 by showing first that the function $g : t \mapsto \Phi^{-1}(\gamma_n(\{f \le t\}))$ is concave.
Exercise 5.51. Show that the following statement is a consequence of Proposition 5.34. If $(X_1, \ldots, X_N)$ are jointly Gaussian random variables and $f : \mathbb{R}^N \to \mathbb{R}$ is a convex function, then the median of the random variable $f(X_1, \ldots, X_N)$ does not exceed its expectation.
5.2.3.3. Local versions. It sometimes happens that a function defined on the sphere $S^{n-1}$ has a poor global Lipschitz behaviour, while its restriction to a subset of large measure is much more regular. To take advantage of such a situation, we formulate a “local” version of Lévy’s lemma.
Corollary 5.35 (Lévy’s lemma, local version). Let $\Omega \subset S^{n-1}$ be a subset of measure larger than $3/4$. Let $f : (S^{n-1}, g) \to \mathbb{R}$ be a function such that the restriction of $f$ to $\Omega$ is $L$-Lipschitz. Then, for every $\varepsilon > 0$,
$\mathbf{P}(\{|f(x) - M_f| > \varepsilon\}) \le \mathbf{P}(S^{n-1} \setminus \Omega) + 2\exp(-n\varepsilon^2/4L^2)$,
where $M_f$ is the median of $f$.
One scenario under which the hypotheses of Corollary 5.35 may be satisfied is when we have an upper bound on some Sobolev norm of $f$ (a “global” parameter, which suggests that “restricted version of Lévy’s lemma” could have been better terminology). However, our applications of the Corollary will be rather straightforward and will not require any advanced notions.
Exercise 5.52. Prove Corollary 5.35, the local version of Lévy’s lemma.
5.2.3.4. Pushforward. The following elementary result is very useful for establishing the concentration phenomenon for many classical spaces. In a nutshell, it says that concentration results can be “pushed forward” by surjective contractions.
Proposition 5.36 (Contraction principle). Let $(X, \mu)$ and $(Y, \nu)$ be metric probability spaces. Assume that there exists a surjective contraction $\varphi : X \to Y$ which pushes forward $\mu$ to $\nu$ (i.e., $\nu(B) = \mu(\varphi^{-1}(B))$) and let $a \in (0, 1)$ and $\varepsilon > 0$. Then
(5.39) $\inf_{B \subset Y,\ \nu(B) \ge a} \nu(B_\varepsilon) \ge \inf_{A \subset X,\ \mu(A) \ge a} \mu(A_\varepsilon)$.
Similarly, for any $t > 0$,
(5.40) $\sup_{g : Y \to \mathbb{R},\ g\ \text{1-Lipschitz}} \nu(g - \mathbf{E}g > t) \le \sup_{f : X \to \mathbb{R},\ f\ \text{1-Lipschitz}} \mu(f - \mathbf{E}f > t)$.
Moreover, (5.40) holds if expectation is replaced by median on both sides.


Exercise 5.53. Prove Proposition 5.36, the contraction principle. State a more general version with $\varphi : X \to Y$ assumed to be $L$-Lipschitz rather than a contraction.
Exercise 5.54 (Concentration on the solid cube via Gaussian pushforward). Let $Y$ be the solid cube $[0, 1]^n$ endowed with the Lebesgue measure and the Euclidean metric inherited from $\mathbb{R}^n$. Use Proposition 5.36 to show that $Y$ verifies (5.21) with $(C, \lambda) = \big(\frac12, \pi\big)$ and (5.22) with $(C, \lambda) = (1, \pi)$.
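One plausible route to the constants in Exercise 5.54, which we only sketch and do not claim to be the intended solution, is to push the Gaussian measure forward by the coordinatewise map $x \mapsto (\Phi(x_1), \ldots, \Phi(x_n))$. Under the $L$-Lipschitz version of the contraction principle, an exponent $\lambda$ on the source space becomes $\lambda/L^2$ on the target, and here $L = \sup \Phi' = 1/\sqrt{2\pi}$. The Python sketch below checks this arithmetic; the function names and the grid are our choices.

```python
import math
from statistics import NormalDist


def lipschitz_constant_of_gaussian_cdf(grid: int = 2001, span: float = 8.0) -> float:
    """Maximum of the standard Gaussian density on a grid, i.e. sup of Phi'."""
    std = NormalDist()
    return max(std.pdf(-span + 2 * span * i / (grid - 1)) for i in range(grid))


def pushed_forward_exponent(lam: float, L: float) -> float:
    """Exponent obtained after pushing forward by an L-Lipschitz map."""
    return lam / L ** 2


L = lipschitz_constant_of_gaussian_cdf()
# Gaussian exponent 1/2 (Theorems 5.24 and 5.25 with a 1-Lipschitz function).
print(L, pushed_forward_exponent(0.5, L))
```

The constants $C$ in front of the exponential are unchanged by this operation, so the Gaussian pairs $(\frac12, \frac12)$ and $(1, \frac12)$ would turn into pairs with exponent $\pi$, in agreement with the values quoted in the exercise.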
5.2.3.5. Direct products. It is easy to see that the concentration phenomenon passes to direct products of metric probability spaces. Indeed, let $X$ and $Y$ be two such spaces that exhibit the concentration phenomenon and let $X \times Y$ be endowed with the product measure and some reasonable product metric, such as the $\ell_p$ product metric defined for $(x_1, y_1)$ and $(x_2, y_2)$ in $X \times Y$ as
(5.41) $d((x_1, y_1), (x_2, y_2)) = \big(d_X(x_1, x_2)^p + d_Y(y_1, y_2)^p\big)^{1/p}$,
the limit case $p = \infty$ being interpreted as a maximum. If $f$ is a 1-Lipschitz function on $X \times Y$, then $\varphi(x) = M_{f(x, \cdot)}$ is 1-Lipschitz on $X$ and hence concentrated around its median $M_\varphi$. Since, for each $x \in X$, $f(x, \cdot)$ is concentrated around $\varphi(x)$, it follows that $f$ is concentrated around $M_\varphi$. (See Exercise 5.55 for precise statements.) The above argument can be clearly iterated. Here is another elementary result involving product measures.

Proposition 5.37 (Concentration on product spaces, see Exercise 5.55). Let $(X_i, d_i, \mu_i)$, $1 \le i \le n$, be bounded metric probability spaces and denote $D_i = \mathrm{diam}\, X_i$. Let $X = X_1 \times \ldots \times X_n$ be endowed with the product measure $\mu$ and the $\ell_1$ product metric $d$. Then, for every 1-Lipschitz function $f : X \to \mathbb{R}$ and for any $t \ge 0$,
(5.42) $\mu(f \ge \mathbf{E}f + t) \le e^{-2t^2/D^2}$,
where $D = \big(\sum_{i=1}^n D_i^2\big)^{1/2}$.

Both approaches to products of metric probability spaces that are sketched above share an unsatisfactory feature: the constants deteriorate as the number of factors increases. In complete generality, this feature is unavoidable (see Section 5.2.5). However, in some natural settings (e.g., the Gaussian space) dimension-free results are possible.

Exercise 5.55 (Concentration on product spaces, a naive approach). For the purpose of this exercise the median of a random variable $F$ is defined as $M_F = \frac12\big(\sup\{t : \mathbf{P}(F \ge t) \ge 1/2\} + \inf\{t : \mathbf{P}(F \le t) \ge 1/2\}\big)$, but most other definitions would work if applied consistently and with sufficient care. Let $(X, d_1, \mu)$ and $(Y, d_2, \nu)$ be metric probability spaces. Consider the space $(X \times Y, d, \pi)$, where $\pi = \mu \otimes \nu$ and $d$ is any metric verifying
$d((x_1, y), (x_2, y)) = d_1(x_1, x_2)$ and $d((x, y_1), (x, y_2)) = d_2(y_1, y_2)$
for all $x, x_1, x_2 \in X$ and $y, y_1, y_2 \in Y$ and let $f : X \times Y \to \mathbb{R}$ be a 1-Lipschitz function with respect to $d$.
(i) Show that the function $\varphi(x) = M_{f(x, \cdot)}$ is 1-Lipschitz on $X$.
(ii) If $X$ and $Y$ exhibit the concentration phenomenon in the sense of (5.21) for some $C$ and $\lambda$, then $\pi(f > M_\varphi + t) \le 2Ce^{-\lambda t^2/4}$ for all $t > 0$, and similarly for $\pi(f < M_\varphi - t)$.
(iii) Show that $M_\varphi$ is a central value in the sense of Section 5.2.3.
(iv) Same as (ii) with (5.21) replaced by (5.22) and $M_\varphi$ by $\mathbf{E}f$.
Exercise 5.56 (Concentration on product spaces, Laplace transform method). The Laplace functional of a probability metric space $(X, d, \mu)$ is defined for $\lambda \in \mathbb{R}$ as $E_{(X,d,\mu)}(\lambda) = \sup \int e^{\lambda f}\, d\mu$, where the supremum is taken over all 1-Lipschitz functions $f : X \to \mathbb{R}$ with mean 0.
(i) Show that if $X$ has diameter $D$, then $E_{(X,d,\mu)}(\lambda) \le \exp(\lambda^2 D^2/8)$ (use Exercise 5.36).
(ii) Show that if $(X_1, d_1, \mu_1)$ and $(X_2, d_2, \mu_2)$ are two metric probability spaces, if $d$ denotes the $\ell_1$ product metric on $X_1 \times X_2$ as defined in (5.41), then
$E_{(X_1 \times X_2, d, \mu_1 \otimes \mu_2)}(\lambda) \le E_{(X_1,d_1,\mu_1)}(\lambda)\, E_{(X_2,d_2,\mu_2)}(\lambda)$.
(iii) Show that in the context of Proposition 5.37, we have
$E_{(X,d,\mu)}(\lambda) \le \exp(\lambda^2 D^2/8)$.
(iv) Prove Proposition 5.37 using Lemma 5.28.
Exercise 5.57 (Hoeffding’s inequality). Show that Proposition 5.37 implies Hoeffding’s inequality: if $X_1, \dots, X_n$ are independent random variables such that $X_i$ takes values in an interval of length $l_i$, then for any $t > 0$,
(5.43) $\mathbf{P}(S \ge \mathbf{E} S + t) \le e^{-2t^2/L^2}$,
where $S = X_1 + \dots + X_n$ and $L^2 = l_1^2 + \dots + l_n^2$.
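As a sanity check of (5.43), the following minimal Python sketch compares the Hoeffding bound with the exact upper tail of a sum of $n$ independent fair 0/1 coin flips (so $l_i = 1$ and $L^2 = n$). The function names `hoeffding_bound` and `binomial_upper_tail` are illustrative choices, not notation from the text.

```python
import math


def hoeffding_bound(lengths, t):
    """Right-hand side of (5.43): exp(-2 t^2 / L^2), with L^2 = sum of l_i^2."""
    L2 = sum(l * l for l in lengths)
    return math.exp(-2.0 * t * t / L2)


def binomial_upper_tail(n, t):
    """Exact P(S >= E S + t) for S a sum of n independent fair 0/1 coin flips."""
    threshold = n / 2 + t
    favourable = sum(math.comb(n, j) for j in range(n + 1) if j >= threshold)
    return favourable / 2 ** n


if __name__ == "__main__":
    n = 20
    for t in (1, 3, 5):
        exact = binomial_upper_tail(n, t)
        bound = hoeffding_bound([1.0] * n, t)
        print(f"t = {t}: exact tail {exact:.5f}, Hoeffding bound {bound:.5f}")
```

The exact tail is computed by counting, so the comparison involves no sampling error; it only illustrates the inequality for this one distribution.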
5.2.4. Geometric and analytic methods. Classical examples. In Sections 5.2.1 and 5.2.2 we sketched isoperimetric/concentration results on the Euclidean sphere and for the Gaussian measure. While these are admittedly very special situations, the fact of the matter is that, in high-dimensional settings, some form of concentration phenomenon is the rule rather than the exception.
5.2.4.1. Gromov’s comparison theorem. The first result asserts that isoperimetric and concentration inequalities hold under geometric assumptions which significantly generalize the spherical case. The invariant that can be related to sphere-like behavior is the Ricci curvature, which compares the rate of growth of volume under geodesic flow on the manifold with the similar rate in the Euclidean space. For example (see Figure 5.3), the circumference of a circle of geodesic radius $\theta$ ($< \pi$) on the sphere $S^2$ is $2\pi \sin\theta$, and hence the length of the arc of the circle corresponding to an angle $\alpha$ (measured on the plane tangent at the center of the circle) is $\alpha \sin\theta \approx \alpha\big(\theta - \frac{\theta^3}{6}\big) = \alpha\theta\big(1 - \frac{\theta^2}{6}\big)$ compared to $\alpha\theta$ for the Euclidean plane. (Here and in the next paragraph $\approx$ means equality up to higher order terms.)

Repeating this calculation mutatis mutandis for an $m$-dimensional sphere (in $\mathbb{R}^{m+1}$) of radius $R$ and a solid $m$-dimensional angle $\alpha$ we get $\alpha\big(R \sin\frac{\theta}{R}\big)^{m-1} \approx \alpha\big(\theta - \frac{\theta^3}{6R^2}\big)^{m-1} \approx \alpha\theta^{m-1}\big(1 - \frac{m-1}{R^2}\,\frac{\theta^2}{6}\big)$ compared to $\alpha\theta^{m-1}$ in the Euclidean setting (i.e., in $\mathbb{R}^m$). This is subsumed by saying that the Ricci curvature of $RS^m$, the $m$-dimensional sphere of radius $R$, at every point and in each direction is $\frac{m-1}{R^2}$.

The notion is generalized to an arbitrary point $p$ on a Riemannian manifold $X$ of dimension greater than or equal to 2 and to an arbitrary unit vector $u$ in the tangent space at $p$ by considering infinitesimal (solid) angles in the direction of $u$ and finding the coefficient of $-\frac{\theta^2}{6}$ in the corresponding expression for the volume on the geodesic sphere of radius $\theta$ centered at $p$; this coefficient is denoted by $\mathrm{Ric}_p(u)$. The minimum of $\mathrm{Ric}_p(u)$ over $p \in X$ and over directions $u$ is denoted by $c(X)$.
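The second-order expansion on the sphere of radius $R$ can be checked numerically. The sketch below (an illustration only; the function names and the sample values of $m$, $R$, $\theta$ are arbitrary assumptions) compares $\alpha (R \sin(\theta/R))^{m-1}$ with the model $\alpha\theta^{m-1}\big(1 - \frac{m-1}{R^2}\frac{\theta^2}{6}\big)$ and prints the relative error, which should shrink like $\theta^4$.

```python
import math


def sphere_volume_element(alpha, theta, m, R):
    """Exact expression alpha * (R sin(theta / R))^(m - 1) on the sphere R S^m."""
    return alpha * (R * math.sin(theta / R)) ** (m - 1)


def second_order_model(alpha, theta, m, R):
    """Euclidean value alpha * theta^(m-1), corrected by the Ricci term (m-1)/R^2."""
    ricci = (m - 1) / R ** 2
    return alpha * theta ** (m - 1) * (1.0 - ricci * theta ** 2 / 6.0)


def relative_error(theta, m=5, R=2.0, alpha=1.0):
    """Relative gap between the exact expression and the second-order model."""
    exact = sphere_volume_element(alpha, theta, m, R)
    return abs(exact - second_order_model(alpha, theta, m, R)) / exact


if __name__ == "__main__":
    for theta in (0.4, 0.2, 0.1):
        print(f"theta = {theta}: relative error {relative_error(theta):.2e}")
```

Halving $\theta$ should reduce the relative error by roughly a factor of 16, consistent with the neglected terms being of fourth order.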
[Figure 5.3. Volume growth on the sphere $S^2$ as a function of geodesic distance. A circle of geodesic radius $\theta$ has radius $\sin\theta$ in the ambient space; the arc corresponding to the angle $\alpha$ has length $\alpha \sin\theta \approx \alpha\theta\big(1 - \frac{\theta^2}{6}\big)$.]

Such straightforward calculation may be difficult to perform for more complicated manifolds. On a less elementary level, the Ricci curvature can be computed using the following formula expressed in the language of Riemannian geometry: whenever $(u_1, \dots, u_m)$ is an orthonormal basis in the tangent space at $p$ (thought of as a real inner product space), we have
(5.44) $\mathrm{Ric}_p(u_1) = \sum_{i=2}^m \sec(u_1, u_i)$,
where $\sec$ denotes the sectional curvature. This leads to an alternative explanation of the value of the Ricci curvature for the sphere, for other manifolds of constant sectional curvature such as the Euclidean space or the hyperbolic space, or for their quotients by discrete groups of symmetries (e.g., for tori or for the real projective space). In the case of Lie groups, sectional curvature can be expressed via Lie brackets. For examples of computations, see Exercises 5.58 and 5.59.

We are now ready to state the main result of this section. By $RS^m$ we denote the sphere of radius $R$ in $\mathbb{R}^{m+1}$.

Theorem 5.38 (Gromov’s comparison theorem, not proved here). Let $m \ge 2$ and let $X$ be an $m$-dimensional connected Riemannian manifold such that $c(X) \ge \frac{m-1}{R^2} = c(RS^m)$. Let $A \subset X$ and let $C \subset RS^m$ be a cap such that $\mu_X(A) = \mu_{RS^m}(C)$, where $\mu_X$ and $\mu_{RS^m}$ are normalized Riemannian volumes on, respectively, $X$ and $RS^m$. Then, for every $\varepsilon > 0$, $\mu_X(A_\varepsilon) \ge \mu_{RS^m}(C_\varepsilon)$.

It follows then (same proof as Corollary 5.17) that any 1-Lipschitz function $f : X \to \mathbb{R}$ with median $M_f$ satisfies, for any $t > 0$,
$\mu_X(\{f > M_f + t\}) \le \frac{1}{2} \exp\big(-(m+1)t^2/2R^2\big)$.
As it turns out, the hypotheses of Theorem 5.38 are verified for many (but not
all) manifolds that naturally appear in mathematics and that play a role in physics,
notably for most classical Lie groups and their homogeneous spaces, see Table 5.3.
Table 5.3. Optimal bounds on Ricci curvature for a selection of classical manifolds. We restrict our attention to manifolds for which that curvature is nonnegative, which in particular excludes the hyperbolic space and its quotients. All the bounds concerning specific objects can be derived via formula (5.44) involving the (more standard) sectional curvatures. This is straightforward for spaces for which the sectional curvatures are constant ($\mathbb{R}^n$, $S^{n-1}$, and $P(\mathbb{R}^n)$); the remaining cases are covered by Exercises 5.58 and 5.59. Note that the values for the projective spaces $P(V)$ and the corresponding $\mathrm{Gr}(1, V)$ do not coincide due to different normalization of the metric (an additional $\sqrt{2}$ factor in (B.10) when compared to (B.5)).

X                        metric                          c(X)                       comments
$\mathbb{R}^n$           Euclidean                       $0$
$S^{n-1}$                geodesic                        $n-2$                      $n \ge 2$
$SO(n)$                  standard (B.8)                  $\frac{n-2}{4}$            $n \ge 2$
$SU(n)$                  standard (B.8)                  $\frac{n}{2}$
$U(n)$                   standard (B.8)                  $0$
$\mathrm{Gr}(k, \mathbb{R}^n)$   quotient from $O(n)$ (B.10)     $\frac{n-2}{2}$            $1 \le k \le n-1$
$\mathrm{Gr}(k, \mathbb{C}^n)$   quotient from $U(n)$ (B.10)     $n$                        $1 \le k \le n-1$
$P(\mathbb{R}^n)$        Fubini–Study (B.5)              $n-2$                      $n \ge 2$
$P(\mathbb{C}^n)$        Fubini–Study (B.5)              $2n$                       $n \ge 2$
$X_1 \times X_2$         $\ell_2$ product metric (5.41)  $\min\{c(X_1), c(X_2)\}$

Exercise 5.58 (Ricci curvature of Grassmannians). For $\mathrm{Gr}(k, \mathbb{R}^n)$ or $\mathrm{Gr}(k, \mathbb{C}^n)$, the tangent space at any point can be identified with $M_{k,n-k}$. If $X, Y \in M_{k,n-k}$ are orthogonal, one can show (see Section 8.2.1 in [Pet06]) that
(5.45) $\sec(X, Y) = \frac{1}{4}\big(\|XY^\dagger - YX^\dagger\|_{HS}^2 + \|X^\dagger Y - Y^\dagger X\|_{HS}^2\big)$.
Use this formula and (5.44) to compute the corresponding values from Table 5.3. In some references we find the coefficient $\frac{1}{2}$ instead of $\frac{1}{4}$ because of a different normalization of the metric.

Exercise 5.59 (Ricci curvature of classical groups). For $G = SO(n)$, $SU(n)$ or $U(n)$, the tangent space at $I$ (or at any point) can be identified with the corresponding Lie algebra $\mathfrak{g}$ ($= \mathfrak{so}_n$, $\mathfrak{su}_n$ or $\mathfrak{u}_n$). If $X, Y \in \mathfrak{g}$ are orthonormal, one can show (see Exercise 2.19 in [Pet06]) that $\sec(X, Y) = \frac{1}{4}\|XY - YX\|_{HS}^2$. Use this formula and (5.44) to compute the corresponding values from Table 5.3.
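For $SO(n)$ the computation suggested in Exercise 5.59 can be carried out numerically. The sketch below is a rough illustration under one assumption that is not checked against Appendix B: the "standard" metric is taken to be the Hilbert–Schmidt inner product on $\mathfrak{so}_n$, for which the matrices $(e_i e_j^T - e_j e_i^T)/\sqrt{2}$, $i < j$, form an orthonormal basis. All function names are illustrative.

```python
import itertools
import math


def so_basis(n):
    """Hilbert-Schmidt orthonormal basis of so(n): (e_i e_j^T - e_j e_i^T)/sqrt(2), i < j."""
    basis = []
    for i, j in itertools.combinations(range(n), 2):
        M = [[0.0] * n for _ in range(n)]
        M[i][j] = 1.0 / math.sqrt(2.0)
        M[j][i] = -1.0 / math.sqrt(2.0)
        basis.append(M)
    return basis


def matmul(A, B):
    """Plain matrix product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sec(X, Y):
    """Sectional curvature (1/4) ||XY - YX||_HS^2 for orthonormal X, Y."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return 0.25 * sum((a - b) ** 2
                      for row_xy, row_yx in zip(XY, YX)
                      for a, b in zip(row_xy, row_yx))


def ricci_so(n):
    """Ricci curvature in the direction of the first basis vector, via (5.44)."""
    basis = so_basis(n)
    return sum(sec(basis[0], u) for u in basis[1:])


if __name__ == "__main__":
    for n in range(3, 7):
        print(f"n = {n}: Ric = {ricci_so(n):.4f}, (n - 2)/4 = {(n - 2) / 4:.4f}")
```

Under the stated assumption the computed value agrees with the entry $\frac{n-2}{4}$ for $SO(n)$ in Table 5.3; by homogeneity of the group the choice of the first basis vector as direction is immaterial.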
5.2.4.2. Log-Sobolev inequalities (LSI). The next technique that we present is of analytic nature. It is based on a class of inequalities which at first sight seem irrelevant to the subject at hand. Let $(X, \mu)$ be a measure space and let $f$ be a non-negative function on $X$. The (continuous Shannon) entropy is defined by
(5.46) $\mathrm{Ent}_\mu(f) := \int f \log f \, d\mu$
if $\int f \, d\mu = 1$, where we used the convention $0 \log 0 = 0$, and then extended to non-negative integrable functions by 1-homogeneity. An explicit formula that implements the extension is
(5.47) $\mathrm{Ent}_\mu(f) := \int f \log f \, d\mu - \int f \, d\mu \, \log\Big(\int f \, d\mu\Big)$.
By Jensen’s inequality, $\mathrm{Ent}_\mu(f) \ge 0$, with $+\infty$ being a possibility.


We now assume that $X$ is a Riemannian manifold and that $\mu$ is a Borel measure on $X$. We say that $(X, \mu)$ verifies a logarithmic Sobolev inequality with parameter $\alpha$ if for every (sufficiently smooth) function $f : X \to \mathbb{R}$ we have
(5.48) $\mathrm{Ent}_\mu(f^2) \le 2\alpha \int |\nabla f|^2 \, d\mu$.
The smallest constant $\alpha$ that works in (5.48) is called the log-Sobolev constant of $(X, \mu)$ and denoted by $LS(X, \mu)$.
The relevance of this circle of ideas to the concentration phenomenon is explained by the following result.

Theorem 5.39 (Herbst’s argument). Let $X$ be a Riemannian manifold and let $\mu$ be a Borel probability measure on $X$ such that $LS(X, \mu) \le \alpha$. Then every 1-Lipschitz function $F : X \to \mathbb{R}$ is integrable and satisfies, for every $t > 0$,
(5.49) $\mu\Big(F > \int F \, d\mu + t\Big) \le e^{-t^2/2\alpha}$.
Remark 5.40. The above theorem can be extended to the setting of general metric spaces, with essentially the same proof, once $|\nabla f|$ is properly defined. For example, we may use $|\nabla f|(x) = \limsup_{y \to x} \frac{|f(y) - f(x)|}{\mathrm{dist}(y, x)}$ if $X$ has no isolated points; discrete spaces may also be handled with some care. However, for clarity of the exposition, we will assume for the rest of this subsection that the underlying spaces are (connected) Riemannian manifolds.

Proof of Theorem 5.39. First, we may assume that $F$ is smooth and that $\int F \, d\mu = 0$; this may be achieved by replacing $F$ by an appropriate approximation and subtracting a constant. The strategy is to show that the (bilateral) Laplace transform of $F$ verifies
(5.50) $\int e^{\lambda F} \, d\mu \le e^{\alpha\lambda^2/2}$ for all $\lambda \in \mathbb{R}$,
which by Lemma 5.28 implies that $\mu(F > t) \le e^{-t^2/2\alpha}$, as needed. To establish (5.50), we introduce an auxiliary function $f = f_\lambda > 0$ defined via $f^2 = e^{\lambda F - \alpha\lambda^2/2}$. In other words, $f = e^{\lambda F/2 - \alpha\lambda^2/4}$ and it is readily checked that $\nabla f = \frac{\lambda}{2} f \nabla F$. Since $|\nabla F| \le 1$ (because $F$ is 1-Lipschitz), it follows that $|\nabla f|^2 \le \frac{\lambda^2}{4} f^2$. Consequently, by (5.48) (cf. (5.47)),
(5.51) $\mathrm{Ent}_\mu(f^2) = \int f^2 \Big(\lambda F - \frac{\alpha\lambda^2}{2}\Big) d\mu - \Big(\int f^2 \, d\mu\Big) \log\Big(\int f^2 \, d\mu\Big) \le \frac{\alpha\lambda^2}{2} \int f^2 \, d\mu$.
We now set $\phi(\lambda) = \int f^2 \, d\mu$ and note that differentiating under the integral sign gives
$\phi'(\lambda) = \int f^2 (F - \alpha\lambda) \, d\mu$.
This allows us to rewrite (5.51) as
$\lambda\phi'(\lambda) - \phi(\lambda) \log\big(\phi(\lambda)\big) \le 0$,
which, for $\lambda \ne 0$, is equivalent to
(5.52) $\frac{d}{d\lambda}\Big(\frac{\log(\phi(\lambda))}{\lambda}\Big) \le 0$.
On the other hand, given that $\phi(0) = 1$, l’Hôpital’s rule yields
(5.53) $\lim_{\lambda \to 0} \frac{\log(\phi(\lambda))}{\lambda} = \lim_{\lambda \to 0} \frac{\phi'(\lambda)}{\phi(\lambda)} = \frac{\phi'(0)}{\phi(0)} = \frac{\int F \, d\mu}{1} = 0$.
Combining (5.52) and (5.53) we conclude that $\log(\phi(\lambda))/\lambda \le 0$ for $\lambda > 0$ and $\log(\phi(\lambda))/\lambda \ge 0$ for $\lambda < 0$, which just means that $\phi(\lambda) \le 1$ for all $\lambda \in \mathbb{R}$. In other words, $\int e^{\lambda F - \alpha\lambda^2/2} \, d\mu \le 1$ for $\lambda \in \mathbb{R}$, which is just a restatement of (5.50) and concludes the argument. □
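To see (5.49) at work in the simplest case, take $F(x) = x$ on $(\mathbb{R}, \gamma_1)$: it is 1-Lipschitz with mean zero, and $LS(\mathbb{R}, \gamma_1) \le 1$ by Proposition 5.42, so Theorem 5.39 predicts $\gamma_1(F > t) \le e^{-t^2/2}$. The Python sketch below (function names are illustrative) compares this with the exact Gaussian tail.

```python
import math


def gaussian_upper_tail(t):
    """Exact P(Z > t) for Z ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def herbst_bound(t, alpha=1.0):
    """Right-hand side of (5.49): exp(-t^2 / (2 alpha))."""
    return math.exp(-t * t / (2.0 * alpha))


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 3.0):
        print(f"t = {t}: exact tail {gaussian_upper_tail(t):.3e}, "
              f"bound {herbst_bound(t):.3e}")
```

The bound captures the correct exponential rate; the exact tail is smaller by a factor that is at least 2.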

Apart from the median being replaced by the expected value (which is largely a matter of convenience or elegance, see Proposition 5.29 in Section 5.2.3), the assertion of Theorem 5.39 closely resembles (5.26) and (5.33), which quantified the concentration phenomenon for Lipschitz functions in the spherical and Gaussian settings. However, its usefulness depends on availability of spaces $(X, \mu)$ verifying logarithmic Sobolev inequalities. The next few results ensure that the supply is indeed quite ample. For easy reference, the spaces and estimates on their log-Sobolev constants are cataloged in Table 5.4.

Proposition 5.41 (not proved here). Let $X$ be an $m$-dimensional Riemannian manifold such that $c(X) > 0$ and let $\mu$ be the normalized Riemannian volume. Then $LS(X, \mu) \le \frac{m-1}{m\,c(X)}$.

Proposition 5.42 (not proved here). Let $\mu$ be a measure on $\mathbb{R}^n$ whose density with respect to the Lebesgue measure is of the form $e^{-U}$, where $U$ verifies $\mathrm{Hess}(U) \ge \beta I$ for some $\beta > 0$. Then $LS(\mathbb{R}^n, \mu) \le \beta^{-1}$. In particular, $LS(\mathbb{R}^n, \gamma_n) \le 1$ and $LS(\mathbb{C}^n, \gamma_n^{\mathbb{C}}) \le \frac{1}{2}$.

Proposition 5.43 (not proved here, but see Exercise 5.61). We have $LS(S^1, \sigma) = 1$ and $LS([0, 1], \mathrm{vol}_1) = \pi^{-2}$.

Proposition 5.44 (Tensorization property of LSI, not proved here). Given $(X_i, \mu_i)$, $i = 1, \dots, k$, let $X = X_1 \times \dots \times X_k$ be endowed with the $\ell_2$ product metric as defined in (5.41) and the product measure $\mu = \mu_1 \otimes \dots \otimes \mu_k$. Then $LS(X, \mu) = \max_{1 \le i \le k} LS(X_i, \mu_i)$.

Remark 5.45 (Poincaré’s inequality). Another related famous functional inequality is the Poincaré inequality, which reads as follows: for every smooth function $f : X \to \mathbb{R}$
(5.54) $\mathrm{Var}_\mu f \le \alpha \int |\nabla f|^2 \, d\mu$,
where $\mathrm{Var}_\mu f$ denotes the quantity $\int f^2 \, d\mu - \big(\int f \, d\mu\big)^2$. The smallest $\alpha$ is called the Poincaré constant of $(X, \mu)$ and denoted $P(X, \mu)$. Inequality (5.54) is implied by the LSI (5.48) (with the same constant $\alpha$); it implies sub-exponential instead of subgaussian concentration. A list of Poincaré constants for common spaces can be found in Table 5.4. An example of a probability measure satisfying the Poincaré inequality but not the LSI is the (symmetric) exponential distribution on $\mathbb{R}$.
Remark 5.46 (Contraction principle for LSI and Poincaré’s inequality). If
φ : pX, µq Ñ pY, νq is a surjective contraction which pushes forward µ onto ν, then
LSpY, νq ď LSpX, µq and PpY, νq ď PpX, µq. This can be proved as in Exercise 5.53
and is especially transparent if we define |∇f | as in Remark 5.40.

Table 5.4. Bounds on log-Sobolev and Poincaré constants for a selection of classical manifolds. We use the same metrics as in Table 5.3. Except as indicated, the estimates on log-Sobolev constants follow from estimates on the Ricci curvature (see Proposition 5.41). Most of the time we use the bound $LS(X, \mu) < c(X)^{-1}$; the more precise expressions involving the dimension of $X$ lead to slightly better but often cumbersome formulas. The upper bounds on the Poincaré constants of Grassmann manifolds follow from Remark 5.46. For more comments and references about Poincaré constants, see Notes and Remarks.

X or (X, μ)                              LS(X, μ)                      P(X, μ)                     Comments
$\big([a, b], \frac{\mathrm{vol}_1}{b-a}\big)$   $\frac{(b-a)^2}{\pi^2}$       $\frac{(b-a)^2}{\pi^2}$     Prop. 5.43
$S^{n-1}$                                $\frac{1}{n-1}$               $\frac{1}{n-1}$             Prop. 5.43 for $S^1$
$P(\mathbb{R}^n)$                        $\le \frac{1}{n-1}$           $\frac{1}{2n}$
$P(\mathbb{C}^n)$                        $< \frac{1}{2n}$              $\frac{1}{4n}$
$(\mathbb{R}^n, \gamma_n)$               $1$                           $1$                         Exercise 5.60
$SO(n)$                                  $< \frac{4}{n-2}$             $\frac{2}{n-1}$
$SU(n)$                                  $< \frac{2}{n}$               $\frac{n}{n^2-1}$
$U(n)$                                   $\le \frac{6}{n}$             $\frac{1}{n}$               [MM13]
$\mathrm{Gr}(k, \mathbb{R}^n)$           $< \frac{2}{n-2}$             $\le \frac{2}{n-1}$         $1 \le k \le n-1$
$\mathrm{Gr}(k, \mathbb{C}^n)$           $< \frac{1}{n}$               $\le \frac{1}{n}$           $1 \le k \le n-1$
$(X \times Y, \mu_X \otimes \mu_Y)$      $\max\{LS(X), LS(Y)\}$        $\max\{P(X), P(Y)\}$        $\ell_2$ product metric

Exercise 5.60 (Log-Sobolev constant for the Gaussian space). Show that $LS(\mathbb{R}^n, \gamma_n) \ge 1$ (we actually have equality, see Proposition 5.42).

Exercise 5.61 (Log-Sobolev constants for segments and circles). (i) Use the contraction principle from Remark 5.46 to show that $LS([0, 1], \mathrm{vol}_1) \le \pi^{-2} LS(S^1, \sigma)$ and $P([0, 1], \mathrm{vol}_1) \le \pi^{-2} P(S^1, \sigma)$. (ii) Verify that $P(S^1, \sigma) = 1$. (iii) Verify that $P([0, 1], \mathrm{vol}_1) \ge \pi^{-2}$ (see Notes and Remarks for the reasons why there is actually an equality).
5.2.4.3. Hypercontractivity, Gaussian polynomials. We give a brief introduction to the concept of hypercontractivity and use it to derive a concentration inequality for Gaussian polynomials.
We work on the probability space $(\mathbb{R}^n, \gamma_n)$. We define the Ornstein–Uhlenbeck semigroup of operators $(P_t)_{t \ge 0}$ as follows. For $f : \mathbb{R}^n \to \mathbb{R}$ a bounded measurable function, and $x \in \mathbb{R}^n$, let
(5.55) $(P_t f)(x) = \mathbf{E} f\big(e^{-t} x + \sqrt{1 - e^{-2t}}\, G\big)$,
where $G$ is a standard Gaussian vector in $\mathbb{R}^n$. These operators satisfy the semigroup property $P_s P_t = P_{s+t}$. Moreover it is easily checked (Exercise 5.62) that for every $p \ge 1$ and $t \ge 0$,
$\|P_t f\|_{L_p(\gamma_n)} \le \|f\|_{L_p(\gamma_n)}$,
and therefore $P_t$ extends to a bounded (contractive) operator on $L_p(\gamma_n)$. Remarkably, a stronger statement is true: provided $p > 1$ and $t > 0$, $P_t$ is a contraction from $L_p(\gamma_n)$ to $L_q(\gamma_n)$ for some $q = q(t) > p$. This phenomenon is called hypercontractivity.

Proposition 5.47 (not proved here, but see Exercise 5.63). Let $1 \le p \le q < \infty$ and $t > 0$ such that $q \le 1 + e^{2t}(p - 1)$. Then
$\|P_t f\|_{L_q(\gamma_n)} \le \|f\|_{L_p(\gamma_n)}$.

The eigenvectors of $P_t$ are the Hermite polynomials. In the one-dimensional case, denote by $(h_k)_{k \in \mathbb{N}}$ the sequence of polynomials obtained by orthonormalizing the sequence $(1, x, x^2, \dots)$ in the space $H_1 := L_2(\mathbb{R}, \gamma_1)$. (In this context, we exceptionally mean $\mathbb{N} = \{0, 1, 2, 3, \dots\}$.) Given a multi-index $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}^n$, let $h_\alpha$ be the multivariate polynomial
(5.56) $h_\alpha(x_1, \dots, x_n) = h_{\alpha_1}(x_1) \cdots h_{\alpha_n}(x_n)$.
The family $(h_\alpha)_{\alpha \in \mathbb{N}^n}$ is an orthonormal basis in $H_n := L_2(\mathbb{R}^n, \gamma_n)$, and we have
(5.57) $P_t h_\alpha = e^{-t|\alpha|} h_\alpha$,
where $|\alpha| = \sum_{i=1}^n \alpha_i$ is the weight of the multi-index $\alpha$, or the total degree of the polynomial $h_\alpha$. Note that formula (5.57) allows us to define $P_t Q$ for any polynomial $Q$ even when $t$ is negative.
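As a quick illustration of (5.57), here is the one-dimensional case of degree 2 worked out directly from the definition (5.55). The normalization $h_2(x) = (x^2 - 1)/\sqrt{2}$ is the one produced by orthonormalizing $(1, x, x^2)$ in $L_2(\mathbb{R}, \gamma_1)$; the computation only uses $\mathbf{E}\, G = 0$ and $\mathbf{E}\, G^2 = 1$.

```latex
\begin{align*}
(P_t h_2)(x)
  &= \frac{1}{\sqrt{2}}\,\mathbf{E}\Big[\big(e^{-t}x + \sqrt{1 - e^{-2t}}\,G\big)^2 - 1\Big] \\
  &= \frac{1}{\sqrt{2}}\Big(e^{-2t}x^2 + 2e^{-t}\sqrt{1 - e^{-2t}}\,x\,\mathbf{E}\,G
       + (1 - e^{-2t})\,\mathbf{E}\,G^2 - 1\Big) \\
  &= \frac{1}{\sqrt{2}}\,e^{-2t}\,(x^2 - 1) \;=\; e^{-2t}\,h_2(x).
\end{align*}
```

The right-hand side still makes sense for $t < 0$, which is the sense in which $P_t Q$ is defined for negative $t$.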

Proposition 5.48. Let $Q$ be a polynomial in $n$ variables of (total) degree at most $k$. Then, for every $q \ge 2$,
$\|Q\|_{L_q(\gamma_n)} \le (q - 1)^{k/2} \|Q\|_{L_2(\gamma_n)}$.

Proof. For any $t \ge 0$, we have $P_t P_{-t} Q = Q$ (see the remark following (5.57)). Choosing $t > 0$ such that $q - 1 = e^{2t}$, we may apply Proposition 5.47 to conclude that $\|Q\|_{L_q(\gamma_n)} \le \|P_{-t} Q\|_{L_2(\gamma_n)}$. We may write the decomposition of $Q$ in the basis of Hermite polynomials
$Q = \sum_{|\alpha| \le k} c_\alpha h_\alpha$
for some coefficients $(c_\alpha)$. It follows that $\|Q\|_{L_2(\gamma_n)}^2 = \sum c_\alpha^2$, while
$\|P_{-t} Q\|_{L_2(\gamma_n)}^2 = \sum_{|\alpha| \le k} e^{2t|\alpha|} c_\alpha^2 \le e^{2tk} \|Q\|_{L_2(\gamma_n)}^2$,
whence the result follows, since $e^{tk} = (q - 1)^{k/2}$. □

Corollary 5.49 (Concentration inequality for Gaussian polynomials). Let $Z_1, \dots, Z_n$ be independent $N(0, 1)$ variables and let $X = Q(Z_1, \dots, Z_n)$, where $Q$ is a polynomial of (total) degree at most $k$. Then, for any $t \ge (2e)^{k/2}$,
$\mathbf{P}\big(|X - \mathbf{E} X| \ge t \sqrt{\mathrm{Var}\, X}\big) \le \exp\Big(-\frac{k}{2e}\, t^{2/k}\Big)$.

Proof. There is no loss of generality in assuming that $Z_1, \dots, Z_n$ are defined as the coordinate functions on $(\mathbb{R}^n, \gamma_n)$, so that Proposition 5.48 applies. We may assume $\mathbf{E} X = 0$, $\mathrm{Var}\, X = 1$ and write by Markov’s inequality, for any $q \ge 2$,
$\mathbf{P}(|X| \ge t) \le t^{-q}\, \mathbf{E}|X|^q \le t^{-q} (q - 1)^{kq/2} \le (q^{k/2}/t)^q$,
where we used Proposition 5.48. The choice $q = t^{2/k}/e$ (which is larger than or equal to 2 provided $t \ge (2e)^{k/2}$) gives $q^{k/2}/t = e^{-k/2}$ and hence yields the result. □

Remark 5.50. The phenomenon of hypercontractivity is not specific to the Gaussian case and is essentially equivalent to a log-Sobolev inequality (see Theorem 5.2.3 in [BGL14]). Similar concentration results are true for polynomials in binary random variables (see Theorem 9.21 in [O’D14]) and for polynomials on the sphere (cf. [Mon12]). Here is a precise statement of the latter. If $Q$ is a polynomial with total degree at most $k$ in $n_1 + \dots + n_d$ variables and $X = (X_1, \dots, X_d)$ with $X_i$ independent and uniformly distributed on $S^{n_i - 1}$, then for every $q \ge 2$, $\|Q(X)\|_{L_q} \le (q - 1)^{k/2} \|Q(X)\|_{L_2}$. (This is slightly more general than Corollary 12 in [Mon12] which assumes that $n_1 = \dots = n_d$ and that the partial degrees in each variable are equal.) The argument is similar to the Gaussian case, using spherical harmonics instead of Hermite polynomials. Concentration estimates similar to Corollary 5.49 follow.

Exercise 5.62 (Ornstein–Uhlenbeck semigroup is contractive). Show that $P_t$ is a contraction on $L_p(\gamma_n)$ for any $t \ge 0$ and $p \ge 1$.

Exercise 5.63 (Sharpness of the hypercontractive inequality). When $n = 1$, compute $P_t f_\lambda$ when $f_\lambda(x) = e^{\lambda x}$. Conclude that Proposition 5.47 is sharp in the following sense: when $q > 1 + e^{2t}(p - 1)$, there is no constant $C$ such that the inequality $\|P_t f\|_{L_q(\gamma_1)} \le C \|f\|_{L_p(\gamma_1)}$ holds.


5.2.5. Some discrete settings. All the specific instances of concentration we identified thus far involved manifolds. However, the phenomenon also occurs in the discrete case. We will exemplify it (and the issues that may arise) on the fundamental example of the Boolean cube $\{0, 1\}^n$, or $\{-1, 1\}^n$, endowed with the normalized counting measure $\mu$ and the normalized Hamming distance $d_H(x, y) := \frac{1}{n} \operatorname{card}\{i : x_i \ne y_i\}$, which up to normalization coincides with the $\ell_1$ metric in the ambient space $\mathbb{R}^n$. (This setting was already studied in Section 5.1.3; other product measures, or metrics induced by $\ell_p$-norms for other $p$ are also frequently considered, more about that later.)
A nearly optimal concentration result for the Boolean cube follows already from Proposition 5.37. However, we can do better: the exact solution to the isoperimetric problem on the cube is known. To describe it, we introduce a total order $<$ on $\{0, 1\}^n$ (called the simplicial order) as follows: for $x = (x_i)$ and $y = (y_i)$ in $\{0, 1\}^n$, declare that $x < y$ if either $x_1 + \dots + x_n < y_1 + \dots + y_n$ or $x_1 + \dots + x_n = y_1 + \dots + y_n$ and $x$ precedes $y$ in the lexicographic order. Then the initial segments for this order are isoperimetric sets. As opposed to the Gaussian and spherical case, the extremal sets are not unique in any reasonable sense (see Exercise 5.66).
Theorem 5.51 (Harper’s isoperimetric inequality, not proved here). For any
integer N with 1 ď N ă 2n , let A Ă t0, 1un be the set of N smallest elements
with respect to the simplicial order. Then A has the smallest ε-enlargements (for
all ε ą 0) among all sets of the same cardinality. The set A verifies
(5.58) Bpx, k{2n q Ă A Ă Bpx, pk ` 1q{2n q

for some k P t0, . . . , n ´ 1u.
If we define the boundary of $A$ as $\partial A := \{y \in \{0, 1\}^n : \mathrm{dist}(y, A) = 1/n\}$, the sets from Theorem 5.51 also have the “smallest boundary” among subsets of $\{0, 1\}^n$ of the same measure. In this language, the condition (5.58) says that $A$ consists of a ball and a part of its boundary. If $N = \sum_{j=0}^{k} \binom{n}{j}$ for some $k$, the situation becomes simple: the optimal sets are balls, and so are their enlargements.
For example, if $n = 2m + 1$ is odd, an example of an optimal set of measure $\frac{1}{2}$ is
$A = \{y \in \{0, 1\}^n : Y \le m\}$,
where $Y = \sum_{j=1}^n y_j$. The enlargements of $A$ are then clearly of the form $A_{s/n} = \{Y \le m + s\}$ and, consequently,
(5.59) $\mu(A_{s/n}) = \frac{\sum_{j=0}^{m+s} \binom{n}{j}}{2^n} = 1 - \frac{\sum_{j > m+s} \binom{n}{j}}{2^n} \ge 1 - e^{-2s^2/n}$,

where the inequality follows from Hoeffding’s inequality (5.43). A similar analysis can be performed when $n$ is even (see Exercise 5.64 for details). To summarize, we have

Corollary 5.52. If $A \subset \{0, 1\}^n$ with $\mu(A) \ge \frac{1}{2}$, $s \in \mathbb{N}$ and $\varepsilon = s/n$, then $\mu(A_\varepsilon) \ge 1 - e^{-2n\varepsilon^2}$. Consequently, if $f : \{0, 1\}^n \to \mathbb{R}$ is a 1-Lipschitz function and $M$ is its median, then $\mu(f > M + \varepsilon) \le e^{-2n\varepsilon^2}$.
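For odd $n = 2m + 1$ the measure of the enlargement of the half-cube $A = \{Y \le m\}$ is a ratio of binomial sums, so the inequality $\mu(A_{s/n}) \ge 1 - e^{-2s^2/n}$ can be verified exactly for small $n$. The following Python sketch does this; the function names are illustrative and the ranges of $m$ and $s$ are arbitrary.

```python
import math


def enlargement_measure(m, s):
    """Exact mu(A_{s/n}) for A = {Y <= m} in {0,1}^n, n = 2m + 1, i.e. mu(Y <= m + s)."""
    n = 2 * m + 1
    top = min(m + s, n)
    return sum(math.comb(n, j) for j in range(top + 1)) / 2 ** n


def concentration_lower_bound(m, s):
    """Lower bound 1 - exp(-2 s^2 / n) from Corollary 5.52, with eps = s/n."""
    n = 2 * m + 1
    return 1.0 - math.exp(-2.0 * s * s / n)


if __name__ == "__main__":
    m = 10  # n = 21
    for s in range(0, 6):
        print(f"s = {s}: mu(A_eps) = {enlargement_measure(m, s):.5f} "
              f">= {concentration_lower_bound(m, s):.5f}")
```

At $s = 0$ the set $A$ itself has measure exactly $\frac{1}{2}$ by the symmetry $Y \mapsto n - Y$, and the bound is trivial.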

Remark 5.53. Some authors assert that the bound $\mu(A_\varepsilon) \ge 1 - e^{-2n\varepsilon^2}$ (for $A$ satisfying $\mu(A) \ge \frac{1}{2}$) holds for all $\varepsilon > 0$. However, this may be false, but only if $n = 1$ or $2$ and only for certain values of $\varepsilon \in (0, 1/n)$, see Exercise 5.65.

The setting of Corollary 5.52 is a special case of that of Proposition 5.37. (The differences include the mean being replaced by the median, and the numerical constants being better in the former, which is not surprising since it is a more specialized result.) The Corollary is an elegant and sharp result, but it exhibits the following unsatisfactory feature: if we use the standard Euclidean metric to define the 1-Lipschitz property of $f$ or the expansions $A_t$, the exponential term in the estimates becomes $e^{-2t^2/n}$. This should be compared to the dimension-free (and differently scaled) term $\frac{1}{2} e^{-t^2/2}$ in Theorem 5.24, the Gaussian isoperimetric inequality. However, there is a fix to this difficulty due to Talagrand: if the function $f$ is convex, its restriction to $\{0, 1\}^n$ exhibits dimension-free subgaussian concentration. We have
Theorem 5.54 (Talagrand’s convex concentration inequality for the Boolean cube, not proved here). Let $A$ be a non-empty subset of $\{0, 1\}^n \subset \mathbb{R}^n$ and set $\phi_A(x) := \mathrm{dist}(x, \operatorname{conv} A)$, where the distance is calculated with respect to the Euclidean metric. Then
(5.60) $\mathbf{E}\, e^{\frac{1}{2}\phi_A^2} \le 1/\mu(A)$
and so $\mu(\phi_A > t) \le e^{-t^2/2}/\mu(A)$ for $t > 0$. Consequently, if $f : [0, 1]^n \to \mathbb{R}$ is a convex (or concave) 1-Lipschitz function and $M$ is its median with respect to $\mu$, then $\mu(f > M + t) \le 2e^{-t^2/2}$ for $t > 0$.
In the statement of Theorem 5.54 we tacitly assume that $\mu$ is a measure on $\mathbb{R}^n$ supported on $\{0, 1\}^n$. The second assertion of the Theorem follows from (5.60) by Markov’s inequality. Some finer issues related to the derivation of the last assertion are addressed in Exercise 5.67. See also Exercise 5.68.
Theorem 5.54 turned out to be very useful (for example in the context of random matrices) and has been generalized in various ways. Here is one possible statement.

Theorem 5.55 (not proved here). Let $V_1, V_2, \dots, V_N$ be finite-dimensional normed spaces and let $V = \bigoplus_{j=1}^N V_j$ be their sum in the $\ell_q$-sense (for some $q \ge 2$). For $j = 1, 2, \dots, N$, let $\mu_j$ be a measure on $V_j$ supported on a set of diameter at most 1 and let $\mu = \bigotimes_{j=1}^N \mu_j$. Further, assume that $F : V \to \mathbb{R}$ is 1-Lipschitz and quasiconvex (i.e., $F^{-1}\big((-\infty, a]\big)$ is convex for all $a \in \mathbb{R}$) or quasiconcave. Then
(5.61) $\mu(F > M + t) \le 2e^{-\frac{1}{4} t^q}$ for all $t > 0$,
where $M$ is the median of $F$ with respect to $\mu$.

We conclude this section with a result that is the counterpart of Theorem 5.54 with the median replaced by the mean, whose degree of generality is intermediate between those of Theorem 5.54 and Theorem 5.55.

Theorem 5.56 (Convex concentration inequality for the mean, not proved here). Let $\mu = \mu_1 \otimes \dots \otimes \mu_n$ be a product measure on $[0, 1]^n \subset \mathbb{R}^n$ and let $f : [0, 1]^n \to \mathbb{R}$ be a function which is 1-Lipschitz with respect to the Euclidean distance and convex with respect to each variable. Then, for any $t \ge 0$,
(5.62) $\mu(f > \mathbf{E} f + t) \le e^{-t^2/2}$.
While, by Remark 5.12 (which was based on the very general results from Section 5.2.3.2), statements about concentration around the median formally imply similar statements about the mean, we state Theorem 5.56 separately since it combines good constants with a different set of hypotheses.


Exercise 5.64 (Concentration on even-dimensional Boolean cube). If $n = 2m$ is even, an example of a set $A \subset \{0, 1\}^n$ with $\mu(A) = \frac{1}{2}$ that is optimal (in the sense of Theorem 5.51) is $A = \big\{\sum_{j=1}^n y_j < m\big\} \cup \big\{\sum_{j=1}^n y_j = m \text{ and } y_1 = 1\big\}$. Show that also in this case $\mu(A_{s/n}) \ge 1 - e^{-2s^2/n}$ for $s \in \mathbb{N}$.

Exercise 5.65. Show that the bound $\mu(A_\varepsilon) \ge 1 - e^{-2n\varepsilon^2}$ from Corollary 5.52 may fail for some $\varepsilon > 0$ if $n = 1$ or $2$, but that it always holds if $n > 2$ or if $\varepsilon \ge 1/n$.

Exercise 5.66 (Non-uniqueness in Harper’s theorem). Give an example of a value $N$ and two sets of $N$ elements in $\{0, 1\}^4$ with smallest $\varepsilon$-enlargements (for all values of $\varepsilon$) among sets with $N$ elements, which are distinct up to symmetries of the hypercube. Note: it appears to be unknown whether uniqueness can be assured by insisting that both $A$ and its complement are isoperimetric sets for all sizes of enlargement.

Exercise 5.67 (Talagrand’s concentration inequality for concave functions). Derive the bound $\mu(f > M + t) \le 2e^{-t^2/2}$ for concave $f$ in Theorem 5.54 (or, equivalently, $\mu(f < M - t) \le 2e^{-t^2/2}$ for convex $f$) from the inequalities preceding it.

Exercise 5.68 (Existence of convex Lipschitz extensions). Let $K \subset \mathbb{R}^n$ be a convex set and let $f : K \to \mathbb{R}$ be a convex 1-Lipschitz function. Then $f$ admits a convex 1-Lipschitz extension to $\mathbb{R}^n$. Consequently, in Theorem 5.54 it doesn’t matter whether we assume $f$ to be convex and 1-Lipschitz on $\mathbb{R}^n$ or just on $[0, 1]^n$.

Exercise 5.69 (No dimension-free subgaussian bound in absence of convexity). Here is an example showing that convexity is crucial in Theorem 5.54. Define $f : \{-1, 1\}^n \to \mathbb{R}$ by $f(x_1, \dots, x_n) = \max(0, x_1 + \dots + x_n)^{1/2}$. Show that $f$ has median 0 and is $\frac{1}{\sqrt{2}}$-Lipschitz with respect to the Euclidean metric, while $\mu\big(f > cn^{1/4}\big) \ge c$ for some absolute constant $c > 0$.

5.2.6. Deviation inequalities for sums of independent random variables. In this section we gather some simple but useful facts about deviation inequalities for sums of independent mean zero random variables. We mostly focus on two families of random variables: subgaussian and subexponential variables.
In a probabilistic setting, the $L_p$-norm (for $p \ge 1$) of a random variable $X$ is $\|X\|_p = (\mathbf{E}|X|^p)^{1/p}$. As a preliminary step, consider two prototypical examples: let $Z$ be an $N(0, 1)$ random variable and $T$ be a symmetric exponential variable with parameter 1 (i.e., $\mathbf{P}(T > t) = \mathbf{P}(-T > t) = \frac{1}{2} e^{-t}$ for $t > 0$). A simple computation (cf. (A.1)) shows that
(5.63) $\|Z\|_p = \frac{\sqrt{2}}{\pi^{1/2p}}\, \Gamma\Big(\frac{p+1}{2}\Big)^{1/p} \sim \sqrt{\frac{p}{e}}$,
(5.64) $\|T\|_p = \Gamma(p + 1)^{1/p} \sim \frac{p}{e}$
as $p$ tends to infinity.
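The closed forms (5.63) and (5.64) and their asymptotics are easy to evaluate with the log-gamma function. The following Python sketch (function names are illustrative; the grid of values of $p$ is arbitrary) prints the ratios $\|Z\|_p / \sqrt{p/e}$ and $\|T\|_p / (p/e)$, which should approach 1, and evaluates $p^{-1/2}\|Z\|_p$ on a grid, where the largest value is attained at $p = 1$.

```python
import math


def norm_Z(p):
    """||Z||_p for Z ~ N(0,1), from (5.63): sqrt(2) * pi^(-1/(2p)) * Gamma((p+1)/2)^(1/p)."""
    log_norm = (math.lgamma((p + 1.0) / 2.0) - 0.5 * math.log(math.pi)) / p
    return math.sqrt(2.0) * math.exp(log_norm)


def norm_T(p):
    """||T||_p for the symmetric exponential variable, from (5.64): Gamma(p+1)^(1/p)."""
    return math.exp(math.lgamma(p + 1.0) / p)


if __name__ == "__main__":
    for p in (1, 2, 10, 100, 1000):
        ratio_Z = norm_Z(p) / math.sqrt(p / math.e)
        ratio_T = norm_T(p) / (p / math.e)
        print(f"p = {p}: ||Z||_p / sqrt(p/e) = {ratio_Z:.4f}, "
              f"||T||_p / (p/e) = {ratio_T:.4f}")
    grid_sup = max(norm_Z(p) / math.sqrt(p) for p in range(1, 51))
    print(f"max of p^(-1/2) ||Z||_p over p = 1..50: {grid_sup:.6f} "
          f"(sqrt(2/pi) = {math.sqrt(2 / math.pi):.6f})")
```

Working with `math.lgamma` rather than `math.gamma` avoids overflow for large $p$.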

The growth of the $L_p$-norms motivates the following definitions: a random variable $X$ is said to be subgaussian (or $\psi_2$) when
(5.65) $\|X\|_{\psi_2} := \sup_{p \ge 1} p^{-1/2} \|X\|_p < \infty$.
This terminology is consistent with that introduced in the preamble to Section 5.2 and based on the tail behavior (cf. (5.21), (5.22); see Exercise 5.70 and Lemma 5.57 below). Similarly, $X$ is said to be subexponential (or $\psi_1$) when
(5.66) $\|X\|_{\psi_1} := \sup_{p \ge 2} \frac{\|X\|_p}{\|T\|_p} < \infty$.
The reader may be familiar with the arguably less ad hoc forms of $\psi_r$ conditions, based on either the rate of growth of the (bilateral) Laplace transform or the appropriate Orlicz norms, or on the tail behavior of the type
$\mathbf{P}(|X| > t) \le Ce^{-\lambda t^r}$ for $t \ge 0$
(cf. (5.21) and (5.22)). There is no need to be alarmed, though: while not identical, all these approaches lead to quantities that are equivalent up to universal constants. The definitions (5.65)–(5.66) were chosen out of convenience in view of the sample applications we present. See Notes and Remarks for more details and references.
It follows from (5.63) and (5.64) that $\|T\|_{\psi_1} = 1$, $\|Z\|_{\psi_2} = \sqrt{2/\pi}$ and that $\|\cdot\|_{\psi_1} \le \|\cdot\|_{\psi_2}$ (see Exercise 5.75). We have obviously $\|\cdot\|_{\psi_2} \le \|\cdot\|_\infty$ and $\|\cdot\|_{\psi_1} \le \|\cdot\|_\infty$, so the present discussion also applies to bounded variables. Another important example of subgaussian variables is obtained by taking the inner product with a fixed vector of a randomly chosen unit vector in $\mathbb{R}^d$ or $\mathbb{C}^d$. This has to be compared with Poincaré’s lemma (Theorem 5.22) which says that the Gaussian measure appears at the limit $d \to \infty$.

Lemma 5.57. If $X$ is uniformly distributed on $S^{d-1}$ (resp., $S_{\mathbb{C}^d}$), then for every $u \in \mathbb{R}^d$ (resp., $u \in \mathbb{C}^d$), we have $\|\langle X, u \rangle\|_{\psi_2} \le |u|/\sqrt{d}$.
Proof. We may assume by homogeneity that $|u| = 1$. Let $G$ be a standard Gaussian vector in $\mathbb{R}^d$. The variable uniformly distributed on $S^{d-1}$ can be then represented as $X = G/|G|$. Moreover, $|G|$ is independent of $X$ and hence, for $p \ge 1$,
$\|\langle G, u \rangle\|_p = \||G|\|_p\, \|\langle X, u \rangle\|_p$.
We have $\||G|\|_p \ge \||G|\|_1 = \kappa_d$ (see Section 4.3.3). Since $\langle G, u \rangle$ has distribution $N(0, 1)$, we know from (5.63) that $\|\langle G, u \rangle\|_{\psi_2} = \sqrt{2/\pi} = \kappa_1$. Therefore, using Proposition A.1(ii), we obtain $\|\langle X, u \rangle\|_{\psi_2} \le \frac{\kappa_1}{\kappa_d} \le \frac{1}{\sqrt{d}}$. The complex case is similar. □

We also note that the square of a subgaussian variable is subexponential, as


follows easily from the definitions. We now consider the case of a sum of either
ly.

subgaussian or subexponential mean zero random variables. If the random vari-


ables are bounded, we can apply Hoeffding’s inequality (5.43). It turns our that
on

essentially the same result holds for subgaussian variables.


Proposition 5.58 (see Exercise 5.73). Let $X_1, \dots, X_n$ be independent subgaussian
real random variables with mean zero, and $S = X_1 + \cdots + X_n$. Define $K > 0$
by $K^2 = \|X_1\|_{\psi_2}^2 + \cdots + \|X_n\|_{\psi_2}^2$. Then for every $t > 0$,
\[ \mathbf{P}(|S| > t) \le 2 \exp\left( -\frac{t^2}{8eK^2} \right). \]
The proof actually yields a better bound $2\exp\big(-\frac{t^2}{2eK^2}\big)$ when $(X_i)$ are symmetric
random variables (i.e., such that $X_i$ and $-X_i$ have the same distribution for any
fixed $i$).
In the case of $\psi_1$ variables, the situation is slightly more complicated since
two tails enter the picture: subgaussian tails for moderate deviations (which are
reminiscent of the central limit phenomenon) and subexponential tails for large
deviations (which come from the tails of individual variables).
Proposition 5.59 (Bernstein’s inequalities, see Exercise 5.76). Let $X_1, \dots, X_n$
be independent real random variables with mean zero, and assume that $\|X_i\|_{\psi_1} \le K$
for every index $i$. Then, for every vector $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ and every $t \ge 0$,
\[ \mathbf{P}\left( \left| \sum_{i=1}^n a_i X_i \right| > t \right) \le 2 \exp\left( -\min\left( \frac{t^2}{8K^2\|a\|_2^2}, \frac{t}{4K\|a\|_\infty} \right) \right). \]
NOTES AND REMARKS 141

Remark 5.60. Propositions 5.58 and 5.59 readily generalize to the complex
case (with possibly different numerical constants).
Exercise 5.70 (Lipschitz function on a Gaussian space is subgaussian). Let
$G$ be a standard Gaussian vector on $\mathbb{R}^n$ and $f : \mathbb{R}^n \to \mathbb{R}$ a 1-Lipschitz function
such that $f(G)$ has mean zero. Deduce from the results of Section 5.2.2 that
$\|f(G)\|_{\psi_2} \le C$ for some absolute constant $C$. (Except for the value of the constant
$C$, this is a generalization of Lemma 5.57.)
Exercise 5.71 (Khintchine inequalities). Let $X = \sum_{i=1}^n \varepsilon_i a_i$, where $a_1, \dots, a_n$
are real numbers and $(\varepsilon_i)$ is a sequence of independent random variables with
$\mathbf{P}(\varepsilon_i = 1) = \mathbf{P}(\varepsilon_i = -1) = 1/2$. Show that, for any $p \ge 1$,
\[ A_p \|X\|_{L_2} \le \|X\|_{L_p} \le B_p \|X\|_{L_2} \]
where $A_p > 0$ and $B_p$ are constants depending only on $p$. Show that $B_p = O(\sqrt{p})$
as $p \to \infty$.
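For small $n$, the ratio $\|X\|_{L_p}/\|X\|_{L_2}$ can be computed exactly by enumerating all $2^n$ sign patterns, which gives a feel for the constants. The sketch below is only an illustration (the function names and the sample coefficient vectors are assumptions made here). For $a = (1,1)$ and $p = 1$ the ratio equals $1/\sqrt{2}$, the value of $A_1$ quoted in Notes and Remarks.

```python
import itertools
import math


def rademacher_lp_norm(coeffs, p):
    """Exact L_p norm of X = sum_i eps_i * a_i, enumerating all sign patterns."""
    n = len(coeffs)
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        value = sum(s * a for s, a in zip(signs, coeffs))
        total += abs(value) ** p
    return (total / 2 ** n) ** (1.0 / p)


def khintchine_ratio(coeffs, p):
    """Ratio ||X||_{L_p} / ||X||_{L_2} for the Rademacher sum with given coefficients."""
    return rademacher_lp_norm(coeffs, p) / rademacher_lp_norm(coeffs, 2.0)


print(khintchine_ratio([1.0, 1.0], 1.0))       # two equal coefficients, p = 1
print(khintchine_ratio([3.0, 1.0, 2.0], 1.0))  # an arbitrary example, p = 1
print(khintchine_ratio([1.0] * 12, 8.0))       # p > 2: the ratio exceeds 1
```

The enumeration costs $2^n$ evaluations, so this is meant for small $n$ only.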

Exercise 5.72 (Khintchine–Kahane inequalities). Khintchine inequalities have
a vector-valued generalization which is due to Kahane: If $x_1, \dots, x_n$ belong to some
normed space $Y$ and $X'$ denotes the random variable $\big\| \sum_{i=1}^n \varepsilon_i x_i \big\|_Y$, then
\[ A'_p \|X'\|_{L_2} \le \|X'\|_{L_p} \le B'_p \|X'\|_{L_2} \]
where $A'_p > 0$ and $B'_p$ are constants depending only on $p$. Prove this. Moreover, we
have $A_1 = A'_1 = 1/\sqrt{2}$ and $B'_p = \Theta(\sqrt{p})$ as $p \to \infty$.
Exercise 5.73. Prove Proposition 5.58 by following the outline given below.
(i) If $X$ is symmetric, show that $\mathbf{E} \exp(\lambda X) \le \exp\big(\frac{e}{2} \|X\|_{\psi_2}^2 \lambda^2\big)$ for any $\lambda > 0$.
(ii) Let $Y$ be an independent copy of a mean zero random variable $X$. Show that
$\mathbf{E} \exp(\lambda X) \le \mathbf{E} \exp(\lambda(X - Y))$. Using this symmetrization trick, deduce from (i)
that the inequality $\mathbf{E} \exp(\lambda X) \le \exp(2e\|X\|_{\psi_2}^2 \lambda^2)$ holds for any mean zero random
variable $X$.
(iii) Deduce Proposition 5.58 using Lemma 5.28.

Exercise 5.74 (Linear combinations of subgaussian random variables are
subgaussian). Show the following variant of Proposition 5.58: if $X_1, \dots, X_n$ are
independent and mean zero, then $\|X_1 + \cdots + X_n\|_{\psi_2} \le C\big(\|X_1\|_{\psi_2}^2 + \cdots + \|X_n\|_{\psi_2}^2\big)^{1/2}$ for
some absolute constant $C$.
Exercise 5.75. Verify that $\|Z\|_{\psi_2} = \sqrt{2/\pi}$ and that, for any variable $X$,
$\|X\|_{\psi_1} \le \|X\|_{\psi_2}$.
Exercise 5.76 (Bernstein’s inequalities). (i) Show that if $\mathbf{E}X = 0$ and $\|X\|_{\psi_1} \le 1$,
then $\mathbf{E} \exp(\lambda X) \le 1 + 2\lambda^2 \le \exp(2\lambda^2)$ for $|\lambda| < 1/2$ (cf. Lemma 5.28).
(ii) Under the hypotheses of Proposition 5.59, assuming $K = 1$ and denoting $S =
a_1 X_1 + \cdots + a_n X_n$, prove that $\mathbf{E} \exp(\lambda S) \le \exp\big(2\lambda^2 \sum a_i^2\big)$ for $|\lambda| \le 1/(2\|a\|_\infty)$.
(iii) Prove Proposition 5.59.

Notes and Remarks


Section 5.1. An encyclopedic reference for sphere packings is the book [CS99].
Other valuable and historically significant references are [Rog64, Bör04, FT97].
Packing and covering on the Euclidean sphere and the discrete cube.
To complement Proposition 5.1, it has been proved in [BGK+01] that for $0 \le t \le
\arccos\sqrt{2/n}$, we have $V(t) \ge (6\sqrt{n}\cos t)^{-1}(\sin t)^{n-1}$ (similar estimates appear in
[Bör04], Lemma 6.8.6). For some values of $n$, $t$ (roughly for $t > 1.14$ and for large
$n$), this is better than the lower bound from (5.4), and similarly superior to the
improved bound from Exercise 5.4 if $t > 1.221$.
The random covering argument from Proposition 5.4 is due to Rogers [Rog57,
Rog63]. The factor $Cn \log n$ from Corollary 5.5 is usually referred to as the density
of the covering, even though calling it “the overlap” or “the redundancy” would seem
more logical. Both Rogers’s original argument and the one presented here
allow achieving $C = 1$ at the expense of additional lower order terms (see Exercise
5.8 and its hint). Recent advances by Dumer [Dum07] improve the bound on the
density to $(\frac12 + o(1)) n \log n$. The paper [Dum07] also establishes a density bound
$\frac12 n \log n + 2n \log\log n + 5n$, valid for all $\varepsilon \in (0,1)$ and all $n \ge 4$. It should be noted,
however, that the latter result deals with a slightly easier problem, covering the
sphere $S^{n-1} \subset \mathbb{R}^n$ by balls whose centers are not required to belong to $S^{n-1}$ (i.e.,
with the parameter $N'$ from Exercise 5.1). Finally, at the price of increasing the
constant $C$, the result from Corollary 5.5 can be strengthened as follows: for any
dimension $n$ and angle $\varepsilon$, there is a covering of $S^{n-1}$ by caps of radius $\varepsilon$ such that
any point belongs to at most $400 n \log n$ caps [BW03].
Since the sphere looks locally like a Euclidean space, as the radii of the caps
tend to 0, the packing/covering problems for $S^{n-1}$ converge to the corresponding
problems for $\mathbb{R}^{n-1}$. (The original random covering argument of Rogers [Rog57]
considered an even more general question, economical coverings of $\mathbb{R}^n$ by translates
of an arbitrary convex body, the spherical variant being an afterthought, and
led to an upper bound of $n \log n + n \log\log n + 5n$ for the appropriately defined
asymptotic density.) In that setting, a lower bound on density of optimal coverings
by Euclidean balls is $\Omega(n)$ [CFR59] and this estimate can be transferred back to
$S^{n-1}$ if the radius is small enough; see Example 6.3 in [BW03] for an argument
that works if $\varepsilon \le \arcsin(1/\sqrt{n})$.


References for the results mentioned about packing are [Ran55] (Rankin) and
[KL78] (Kabatjanskiĭ–Levenšteĭn); we refer to [CS99] for more information (see
also [BN06a]). Again, when the radius of the cap tends to 0, the problem becomes
the classical sphere packing problem in $\mathbb{R}^n$. In this context, a classical result due to
Minkowski–Hlawka shows the existence of lattice packings of Euclidean balls (or
actually, of any symmetric convex body) in $\mathbb{R}^n$ which cover a proportion $1/2^{n-1}$ of the
space (a.k.a. packing density). Remarkably, this result has been only marginally
improved in the past century [Rog47, DR47, Bal92b] and is exponentially far from
the Kabatjanskiĭ–Levenšteĭn upper bound, which is approximately of order $0.66^n$, for
the proportion covered by a (not necessarily lattice) packing (see [Gru07] for more
on this topic).
Covering and particularly packing in the Hamming cube is of fundamental
importance in coding theory; see, e.g., [Rot06, CHLL97]. The case of (very
small) balls of radius $1/n$ in $\{0, \dots, q-1\}^n$ is treated in [KP88].
The Gilbert–Varshamov bound has been improved in the q-ary cube for certain
large values of q in [TVZ82], using a link with modular curves.
Packing and covering for convex bodies. For early references on metric
entropy of convex bodies see [CS90], [Pis89b].
The arguments from [Bar14] imply the following improvement on the volumetric
bound from Corollary 5.10: for $\varepsilon \in (0,1)$, any symmetric convex body in
$\mathbb{R}^n$ is $(1+\varepsilon)$-close in Banach–Mazur distance to a polytope with $(C/\sqrt{\varepsilon})^n$ vertices.
(This is sharp: consider the case of the sphere.) To the best of our knowledge, it is
not known whether an analogous statement holds for not-necessarily symmetric bodies
and the affine version (4.2) of the Banach–Mazur distance. Similar questions can
be considered for large $\varepsilon$, or even $\varepsilon$ growing with the dimension. In the case of the
sphere, this is essentially the problem considered in Exercise 5.13. Again, [Bar14]
contains good estimates in the general case. However, the bounds from [Bar14]
deteriorate as the asymmetry of the body (defined, for example, as the minimal
distance $d_{BM}$ to a symmetric body) increases. Estimates that are superior for some
ranges of parameters can be found in [Sza].
Let us also mention an important open problem, known as the duality conjecture:
do there exist absolute constants $c, C > 0$ such that for every two symmetric
convex bodies $K, L \subset \mathbb{R}^n$ we have
\[ \log N(L^\circ, K^\circ) \le C \log N(K, cL)\,? \tag{5.67} \]
This was proved when $K$ or $L$ is the Euclidean ball [AMS04] and extended to
the case when a bound on the $K$-convexity constant (as defined in Section 7.1.2)
is present in [AMSTJ04]. Another possible generalization to the setting of
non-symmetric convex bodies is more tricky; in that case, even the proper formulation
of (5.67) is not entirely clear.
A deep fact about covering numbers is the following ([Mil86], see also the
discussion in [Pis89b]): there is an absolute constant $C$ such that, for every symmetric
convex body $K \subset \mathbb{R}^n$ there is a 0-symmetric ellipsoid $\mathcal{E}$ such that
\[ \max\big( N(K, \mathcal{E}), N(\mathcal{E}, K) \big) \le C^n . \tag{5.68} \]
Note that since metric entropy duality (5.67) is known to hold when one of the
bodies is an ellipsoid, it follows then that similar bounds automatically hold also
for $N(K^\circ, \mathcal{E}^\circ)$ and $N(\mathcal{E}^\circ, K^\circ)$. (In the original definitions, all four quantities were
included explicitly or implicitly.) Such an ellipsoid $\mathcal{E}$ is called an $M$-ellipsoid for
$K$, and $K$ is said to be in the $M$-position when $B_2^n$ is an $M$-ellipsoid for $K$. The
$M$-ellipsoids are discussed in detail in [AAGM15].
Metric entropy of classical manifolds. Theorem 5.11 is from [Sza82],
which covers the case of all metrics induced by unitarily invariant norms (see
also [Sza83, Sza98] and [Paj99]). Examples of packings in some Grassmannians
(mostly low-dimensional), some of them optimal, can be found in [CHS96, SS98].
More recent references, motivated by information transmission issues and
concentrated on different asymptotics ($k$ fixed and $n$ tending to infinity), are [BN02,
BN05, BN06b]. It appears that the theoretical computer science community is
not aware that questions of that nature were considered in AGA already in the 1980s.
Section 5.2. Classical general references about concentration of measure are
[Led01] and [Sch03]. We particularly recommend the recent monograph [BLM13].
For a presentation directed towards applications to data science, see [Ver].
Isoperimetry and concentration. A geometry-oriented reference about
isoperimetric inequalities is [BZ88]. The paternity of the isoperimetric inequal-
ity on the sphere (Theorem 5.13) is usually attributed to Lévy [Lév22, Lév51]
although the arguments he presented were not fully rigorous; [Sch48] is usually
cited as the first rigorous proof. Remarkably, the functional version (Lévy’s lemma,
in the language of our Corollary 5.17) appears explicitly in [Lév22] (see p. 279)
and is therefore almost one century old!
A self-contained proof of the isoperimetric inequality on $S^{n-1}$, based on the
concept of spherical symmetrization, appears in [FLM77]. Another symmetrization
procedure (the two-point symmetrization) is applied in [Ben84]. The simple
proof of the non-sharp inequality from Proposition 5.15 is based on [AdRBV98].
Proposition 5.20 is from [JS].
The Gaussian isoperimetric inequality was proved independently by Borell
[Bor75b] and Sudakov–Tsirelson [SC74]. For a proof of Poincaré’s lemma
(Theorem 5.22) going beyond the weak convergence version from Exercise 5.29, we refer
to [DF87] (which also advocates that the statement was first formulated by Borel
and not by Poincaré). See also [Led96] and references therein. For a direct proof
of concentration of measure on Gauss space, see [Pis86].
Ehrhard’s inequality (5.31) was proved in [Ehr83] for convex sets, then
extended in [Lat96] to the case where only one of the sets is convex, with the
general case being treated in [Bor03]. A priori, deriving an isoperimetric inequality
such as (5.29) requires validity of (5.31) for an arbitrary Borel set and a ball;
the paper [Ehr83], however, contains a direct application of the technique to prove
(5.29). A general reference for this circle of ideas is [Lat02].
The concept of central values was formalized and applied in the context of QIT
in [ASW11], which also contains versions of Corollaries 5.32 and 5.35. However,
instances of the arguments can be found in [Has09] and in AGA literature dating
to (at least) the 1980s.
Proposition 5.34 appears in [Dmi90, Kwa94, Fer97]. Exercise 5.48 appears
as Proposition 1.7 in [Led01]. Proposition 5.37 is Corollary 1.17 from [Led01].
There are various generalizations of Hoeffding’s inequality appearing in Exercise
5.57, notably due to Azuma [Azu67] and McDiarmid [McD89] in the context of
martingales.
Geometric and analytical methods. General references for Section 5.2.4
are [MS86, Sch03, DS01, GM00, BLM13, BGL14, GZ03].
Gromov’s comparison theorem (Theorem 5.38) appeared first in the preprint
[Gro80]. A proof can be found in an appendix in [MS86]. A new proof and
an extension to non-Riemannian spaces were proposed recently in [CM15]. While
the theorem is sharp as stated, there is a reason to suspect that a more precise
result should be available: the proof proceeds via a local/variational argument
and the globally normalized volume appears only a posteriori. A more satisfactory
variant appears in [Mil15]. In addition to the curvature, it takes into account the
actual diameter of the manifold in question, which may be strictly smaller than
the bound following indirectly from the curvature. However, since the results in
[Mil15] necessarily involve model manifolds more complicated than spheres, their
statements are somewhat technical.
The case of manifolds of dimension 1 is a little special. First, while the definition
of Ricci curvature in dimension 1 needs to be properly construed, the only sensible
value is 0 since every such manifold looks locally like a segment. Accordingly,
Proposition 5.41 is then vacuously true. Next, the solution to the isoperimetric
problem in $S^1$ (resp., in $\mathbb{R}$) is very simple: among sets of any (positive, but not
full) measure, the boundary is the smallest if it consists of exactly two points.
Consequently, the solutions, both for the “smallest boundary” and the “smallest
enlargement” problems, are arcs (resp., segments). However, finer analytic statements
(including but not limited to LSI) are interesting and highly nontrivial already in
dimension 1. For example, in view of Proposition 5.44, the validity of (5.48) for
the 1-dimensional Gaussian measure implies the same inequality in any dimension
(with the same constant $\alpha$, which, in view of Proposition 5.42, can be taken to be
1, which is optimal). Indeed, even statements about spaces consisting of only two
points can be deep, as for example in the elementary proof of the Gaussian
isoperimetric inequality presented in [Bob97]. We will return to the same theme further
when reporting on developments directly related to LSI and hypercontractivity.
Log-Sobolev inequalities (LSI) were introduced in a seminal paper by Gross
[Gro75]. Again, the case of manifolds of dimension 1 (segments, circles) is a little
special; see [GMW14] for an elementary overview of this aspect of the subject and
for references. The link with concentration of measure (the Herbst argument)
originates in an unpublished letter from Herbst to Gross. The connection between LSI,
Ricci curvature, and the Hessian of the density was put forward in [BÉ85, Bak94].
For a comprehensive treatment of functional inequalities (including complete
references), see [BGL14]. Another fruitful approach is the connection between LSI and
the quadratic transportation cost inequalities; see Chapter 6 in [Led01].
As exemplified in Table 5.4, the values of the Poincaré constants can often
be computed exactly. Indeed, the Poincaré inequality (5.54) can be rewritten as
$\mathrm{Var}_\mu f \le \alpha \int (-\Delta f) f \, d\mu$, where $\Delta$ is the Laplace–Beltrami operator on $L_2(X, \mu)$.
It follows that the optimal $\alpha$ is equal to the reciprocal of the “spectral gap,” i.e.,
the smallest nonzero eigenvalue of $-\Delta$. In some examples the eigenfunctions of the
Laplace–Beltrami operator can be explicitly described: for the Gauss space they
are the Hermite polynomials, for the sphere they are the spherical harmonics (see
the elementary [See66], or [BGM71] which covers also the case of the projective
spaces). On $S^{n-1}$, equality in (5.54) is achieved for functions of the form $x \mapsto \langle x, y\rangle$
with $y \in \mathbb{R}^n$. For Lie groups there is a connection with the spectrum of the Casimir
operator and representations of the associated Lie algebra (see Proposition 10.6 in
[Hal15]), which allows one to derive the entire spectrum of $-\Delta$. The case of $SO(n)$ and
$SU(n)$ appears in [SC94] (for $U(n)$, see [Voi91]). Note that in these examples there
is equality in (5.54) when $f$ is a function of the form $M \mapsto \mathrm{Tr}(AM)$ for $A \in M_n$. For
a complete list of semisimple Lie algebras, see [Rot86]. The spectrum of Grassmann
manifolds is considered in [Tsu81, EC04, TK04, Hal07], which allows one, in principle,
to retrieve the value of the Poincaré constant for specific dimensions if needed.
Hypercontractivity for the Ornstein–Uhlenbeck semigroup (Proposition 5.47)
was first established by Nelson [Nel73]. The connection with log-Sobolev
inequalities was put forward by Gross [Gro75].
In many situations, the Gaussian case can be treated as a limit case from
the case of the hypercube via the central limit theorem. By the tensorization
property (Proposition 5.44), this amounts ultimately to verifying statements about
the two-point space $\{-1, 1\}$ (see [Gro75] for a proof of the Gaussian LSI along
these lines). The hypercontractivity inequality on the discrete cube is known as the
Bonami–Beckner inequality [Bon70, Bec75]. Some variants of Proposition 5.48
appear in [Jan97]. For a more sophisticated technology giving sharp estimates
on the moments of Gaussian polynomials (or Gaussian chaoses) see [Lat06]. The
statement about concentration of polynomials on products of spheres appearing in
Remark 5.50 follows from the proof of Corollary 12 in [Mon12].
Discrete settings. A reference focusing on the case of the hypercube is
[O’D14] (it contains in particular the versions of Proposition 5.48 and Corollary
5.49 for the hypercube alluded to in Remark 5.50). In addition to [O’D14], general
references for Section 5.2.5 are [Mat02, McD98]. The main statement of Theorem
5.51 was proved in [Har66] and rediscovered in [Kat75]. A short proof may be
found in [FF81]; we also recommend the reference [Lea91]. Theorem 5.51 deals
with vertex-isoperimetry. If we consider instead edge-isoperimetry (minimizing the
number of edges joining $A$ to $A^c$), the optimal sets are no longer Hamming balls
but subcubes.
Theorem 5.54 is taken from [Tal88]. (Note that [Tal88] states the result for the
cube $\{-1, 1\}^n$ and so the coefficient in the exponent in the estimate corresponding
to (5.60) is there $\frac18$.) Theorem 5.55 appears in [JS91] and [Mec04]. The latter
paper addresses general unconditional direct sums and not only $\ell_q$-sums; see also
[Mec03]. Similar results, but with quite different proofs, were presented in [Mau91]
and [Dem97]. The most abstract (and most flexible) statements are arguably in
[Tal95, Tal96b, Tal96a]. The arguments addressing settings more general than
that of Theorem 5.54 usually lead to a coefficient $\frac14$ in the exponent as in (5.61),
except for [Tal95], which includes a statement (Theorem 4.2.4) featuring coefficient
$\frac12$, but at the cost of introducing additional factors of lower order and restricting
the range of $t$. A clean proof of Theorem 5.56 (which also has coefficient $\frac12$ in the
exponent) can be found in [BLM13]; the argument is attributed to [Led97] and
the result itself to [Tal96b].
Deviation inequalities. Some references for Section 5.2.6 are [Ver12] and
[CGLP12] (the latter also treats the case of intermediate growth between
subgaussian and subexponential). As pointed out in the main text, there are several
possible forms of $\psi_r$ conditions and of definitions of the $\psi_r$-norms. The original
ones were (presumably) in terms of Orlicz/Young functions: given an increasing
convex function $\psi : \mathbb{R}_+ \to \mathbb{R}_+$ with $\psi(0) = 0$ and $\psi(x) \to \infty$ as $x \to \infty$, we may
define the $\psi$-norm of a random variable $X$ as (for example)
\[ \|X\|_\psi = \inf\{ c > 0 : \mathbf{E}\, \psi(|X|/c) \le \psi(1) \}. \]
If one considers $\psi_r(x) = \exp(x^r) - 1$ ($r \ge 1$), then, for $r = 1, 2$, one gets norms
which are equivalent (although not equal) to the ones defined in (5.66) and (5.65).
For precise statements and proofs, see Theorem 1.1.5 in [CGLP12], which also
covers the link to (the rate of growth of) the Laplace transforms mentioned in the
main text; cf. Lemma 5.28 and Exercise 5.76. Overall, Section 1.1 of [CGLP12]
is an excellent reference for $\psi_r$ conditions/norms, which are otherwise difficult to
extract from books/surveys on the more general Orlicz spaces.
For a historical account of Bernstein’s contributions, we refer to pp. 126–128
in [AAGM15]. For more precise results about moments of sums of independent
variables, see [Lat97]. For non-commutative analogues of these inequalities (i.e.,
for sums of random matrices), see [Tro12].
Finally, among other techniques to prove concentration of measure, we mention
the so-called martingale method which implies for example concentration on
permutation groups (see [Sch82, Mau79, MS86]): If we equip the symmetric
group $S_n$ with the uniform probability measure and the distance $d(\sigma, \tau) =
\frac1n \operatorname{card}\{ i : \sigma(i) \ne \tau(i) \}$, then any 1-Lipschitz function $f$ on $(S_n, d)$ satisfies
$\mathbf{P}(f \ge \mathbf{E}f + t) \le \exp(-nt^2/8)$ for any $t \ge 0$.
The best constants in Khintchine inequalities (see Exercise 5.71) have been
found in [Sza76] (who proved $A_1 = 1/\sqrt{2}$) and in [Haa81] (for $p > 1$). The
Khintchine–Kahane inequalities from Exercise 5.72 were first proved in [Kah85].
The correct asymptotic order of the constants as $p \to \infty$ was found in [Kwa76],
while the value $A'_1 = 1/\sqrt{2}$ is from [LO94]. A complete proof of the Khintchine–
Kahane inequalities can be found by consulting Theorem 3.5.2 of [AAGM15].

CHAPTER 6

Gaussian Processes and Random Matrices

This chapter is devoted to the development of probabilistic techniques which,
along with the concentration of measure from Chapter 5, constitute our most powerful
tools. Specifically, we will consider stochastic processes (mostly, but not exclusively,
Gaussian) and present deep results permitting their quantitative study. The key
insights are the link between suprema of Gaussian processes and the mean width of
convex bodies, and the application of comparison theorems for Gaussian processes to
the analysis of spectral behavior of random matrices.

6.1. Gaussian processes

This section deals with Gaussian processes (widely used in mathematical modeling
and in statistics) and presents several tools for estimating various parameters
related to such processes. A Gaussian process $X = (X_t)_{t \in T}$ is simply a family of
jointly Gaussian variables, normally with mean zero, defined on some probability
space $\Omega$, which may or may not be specified. See Appendix A for more on the
terminology and for basic and not-so-basic facts about Gaussian variables.
We especially focus on studying the supremum of Gaussian processes, e.g.,
computing (or estimating) $\mathbf{E} \sup\{X_t : t \in T\}$. In our context, suprema of Gaussian
processes appear when considering the Gaussian mean width of a convex body (and
this is essentially the general case, see Section 6.1.1) and therefore can be used to
estimate other geometric parameters such as volume. There are essentially three
levels of sophistication when investigating the supremum of a Gaussian process.
(i) Discretize the problem by using an $\varepsilon$-net and appealing to the union bound.
(ii) Use a recursive version of (i) by considering a whole hierarchy of $\varepsilon$-nets (for
example $\varepsilon = 2^{-k}$ for every integer $k$). This is called a “chaining argument.”
(iii) Use a further sophistication of (ii), where instead of using nets whose resolution
parameter is uniform across the index set, we allow more general partition schemes.
This is called the “generic chaining” or the “majorizing measure” approach.
A deep result due to Talagrand asserts that (iii) provides an estimate on the
supremum of any Gaussian process which is always sharp up to a multiplicative
constant. However, we mostly consider the situations (i) and (ii) since they are
much simpler and sufficient for our purposes.
We note for the record that without any assumptions on regularity of $X$, which
will be implicitly made in what follows, measurability issues and other complications
may in principle arise, particularly when $T$ is uncountable. For the benefit of a
non-specialist reader we sketch examples of possible pathologies in Exercise 6.1.
However, such potential difficulties are not relevant in our context and we will
henceforth largely ignore them. For example, in all the settings we are interested
in we will have enough regularity so that
\[ \mathbf{E} \sup\{X_t : t \in T\} = \sup_{F \subset T,\ F \text{ finite}} \mathbf{E} \max\{X_t : t \in F\}, \tag{6.1} \]
and other questions can similarly be reduced to considering instances of the problem
with finite index sets. As usual, the crucial point will be that the constants that
may appear in the statements do not depend on $X$ and, in particular, on the size
of $T$.

Exercise 6.1. Give examples of processes $(X_t)_{t \in T}$ such that, for every $t \in T$,
$X_t = 0$ a.s., but (a) $\mathbf{E} \sup\{X_t : t \in T\} = \infty$ (b) $\sup\{X_t : t \in T\}$ is not measurable.

6.1.1. Key example and basic estimates. We start with a simple, but
crucial, observation that if $G : \Omega \to \mathbb{R}^n$ is a standard Gaussian vector, then
$(\langle G, x\rangle)_{x \in \mathbb{R}^n}$ is a Gaussian process. Recalling the definition of the Gaussian mean
width of a (bounded nonempty) set $K \subset \mathbb{R}^n$, as introduced in Section 4.3.3,
\[ w_G(K) = \mathbf{E} \sup\{\langle G, x\rangle : x \in K\}, \tag{6.2} \]
we see that calculating $w_G(K)$ is equivalent to finding the expectation of the
supremum of a certain Gaussian process, a subprocess of $(\langle G, x\rangle)_{x \in \mathbb{R}^n}$.
This instance is actually, more or less, the general case. This follows by
combining two facts:
(i) the map $x \mapsto \langle G, x\rangle$ is an isometry from $(\mathbb{R}^n, |\cdot|)$ to $L_2(\Omega)$;
(ii) the joint distribution of $X = (X_t)_{t \in T}$ is uniquely determined by the covariances
$(\mathbf{E} X_s X_t)_{s,t \in T}$ and so all the stochastically relevant information about the process
is encoded in the geometry of $X$, considered as a subset of $L_2(\Omega)$.
Consequently, if $E$ is a Euclidean space and vectors $x_t \in E$ (for $t \in T$) are such that
$\langle x_s, x_t\rangle = \mathbf{E} X_s X_t$ for all $s, t \in T$, and if $G_E$ is a standard Gaussian vector on $E$,
then the Gaussian process $(\langle G_E, x_t\rangle)_{t \in T}$ is a faithful copy of $X$.
For a finite process $X = (X_k)_{1 \le k \le N}$ this is easily realized: we can choose
$E := \operatorname{span}\{X_k\} \subset L_2(\Omega)$ and $x_k = X_k$. We then have in particular
\[ \mathbf{E} \max_{1 \le k \le N} X_k = w_G(X) = w_G(K_X), \tag{6.3} \]
where $K_X := \operatorname{conv}\{X_k : 1 \le k \le N\}$ is a convex set in $E$. (This effectively covers
any situation where (6.1) applies.)
The above construction shows that the two (classes of) problems, namely
calculating (1) the mean width of a convex set and (2) the expectation of the supremum
of a Gaussian process, are essentially equivalent. This equivalence will turn out to
be very fruitful. Recall that if $0 \in K$, then $\sup\{\langle y, x\rangle : x \in K\} = \|y\|_{K^\circ}$ and so
$w_G(K) = \mathbf{E} \|G\|_{K^\circ}$. It may happen that the set $K_X$ does not contain 0, but this can
be remedied by considering instead $X' = (X_k - X_0)_{1 \le k \le N}$ for some $X_0 \in \operatorname{conv}\{X_k\}$.
We have then
\[ \mathbf{E} \max\{X_k : 1 \le k \le N\} = \mathbf{E} \max\{X_k - X_0 : 1 \le k \le N\}, \]
which is reminiscent of the fact that the mean width does not depend on the
choice of the origin. Note that if we select $X_0$ belonging to the relative interior of
$\operatorname{conv}\{X_k\}$, we will even be able to stay in the category of convex bodies with the
origin in the interior.
We next state a simple upper bound on the expectation of the supremum of a
Gaussian process.
Lemma 6.1. Let $(X_k)_{1 \le k \le N}$ be Gaussian random variables with mean zero and
variance bounded by 1. Then
\[ \mathbf{E} \max_{1 \le k \le N} X_k \le \sqrt{2 \log N}. \tag{6.4} \]
Moreover, if $(X_k)_{1 \le k \le N}$ are independent $N(0,1)$ random variables, then
\[ \mathbf{E} \max_{1 \le k \le N} X_k \ge (1 - o(1)) \sqrt{2 \log N}. \tag{6.5} \]
Proof. We use the following elementary computation: if $X$ has distribution
$N(0, \sigma^2)$ with $\sigma^2 \le 1$, then $\mathbf{E}\, e^{tX} = \exp(t^2\sigma^2/2) \le \exp(t^2/2)$ for any real $t$. For
$\beta > 0$ to be determined, we have (the second inequality being Jensen’s inequality)
\[ \mathbf{E} \max_{1 \le k \le N} X_k \le \frac{1}{\beta} \mathbf{E} \log \sum_{k=1}^N e^{\beta X_k} \le \frac{1}{\beta} \log \sum_{k=1}^N \mathbf{E}\, e^{\beta X_k} \le \frac{1}{\beta} \log\big( N \exp(\beta^2/2) \big), \]
and the optimal choice $\beta = \sqrt{2 \log N}$ yields (6.4).
This completes the proof of the first inequality. A slightly weaker, but more
general estimate, based on the simple (and not-so-optimal, see Appendix A.1) upper
bound
\[ \mathbf{P}(Z \ge t) \le \frac12 e^{-t^2/2} \quad \text{if } t \ge 0 \tag{6.6} \]
for the tail of a standard normal variable $Z$ (see Exercise A.1) is given in Lemma
6.16. We relegate the proof of the second inequality (based on a lower bound for
the tail of $Z$) to Exercise 6.2, which also gives an explicit expression for the $o(1)$
quantity. □
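As a quick numerical sanity check of (6.4) and (6.5), one can estimate $\mathbf{E} \max_k X_k$ for independent $N(0,1)$ variables by simulation and compare it with $\sqrt{2 \log N}$. The sketch below is illustrative only: the function name, the values of $N$, the number of trials and the seed are arbitrary choices made here.

```python
import math
import random


def mean_max_of_gaussians(n_vars, trials=2000, seed=0):
    """Monte Carlo estimate of E max(X_1, ..., X_N) for independent N(0, 1) variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(n_vars))
    return total / trials


results = {}
for n_vars in (10, 100, 1000):
    estimate = mean_max_of_gaussians(n_vars)
    bound = math.sqrt(2.0 * math.log(n_vars))  # right-hand side of (6.4)
    results[n_vars] = (estimate, bound)
    print(n_vars, round(estimate, 3), round(bound, 3))
```

In such runs the estimate stays below the bound, and the ratio of the two creeps towards 1 only slowly as $N$ grows, in line with the $o(1)$ term in (6.5).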
We note that the estimate from (6.4) also holds for the expected maximum of
the absolute values of Gaussian variables.
Lemma 6.2 (see Exercise 6.3). Let $N \ge 2$ and let $(X_k)_{1 \le k \le N}$ be jointly
Gaussian random variables with variance bounded by 1. Then
\[ \mathbf{E} \max_{1 \le k \le N} |X_k| \le \sqrt{2 \log N}. \]
When $N \ge 4$, the inequality holds for any Gaussian random variables (that is, not
necessarily jointly Gaussian).
As an application, we have a bound on the volume of a polytope, given its
number of vertices.
Proposition 6.3. Let $K \subset \mathbb{R}^n$ be a polytope with (no more than) $N$ vertices
and whose outradius is at most 1. Then
\[ \operatorname{vrad} K \le \kappa_n^{-1} \sqrt{2 \log N} \sim \sqrt{\frac{2 \log N}{n}}, \]
where $\kappa_n$ is defined by (A.8).
Proof. Let $x_1, \dots, x_N$ be the vertices of $K$. Without loss of generality we
may assume that $K \subset B_2^n$. We can now apply the first part of Lemma 6.1 with
$X_k = \langle G, x_k\rangle$ to obtain
\[ \mathbf{E} \max_{1 \le k \le N} \langle G, x_k\rangle \le \sqrt{2 \log N}. \]
Since, for any $y \in \mathbb{R}^n$, $\sup\{\langle y, x\rangle : x \in K\} = \max_{1 \le k \le N} \langle y, x_k\rangle$, the above
bound is (cf. (6.2)) equivalent to $w_G(K) \le \sqrt{2 \log N}$. It remains to appeal to the
relation (4.32) between the Gaussian mean width and the usual mean width, and
to Urysohn’s inequality (4.34). □
Remark 6.4 (Sharp bound on volume of polytopes with few vertices). The
bound in Proposition 6.3 can be improved to $O\big(\sqrt{\log(N/n)/n}\big)$. This improvement
is meaningful only when $N$ is not much larger than $n$. For example, if $K = B_1^n$
(the unit ball of $\ell_1^n$), then $K = \operatorname{conv}\{\pm e_1, \dots, \pm e_n\}$, where $(e_k)_{k=1}^n$ is the standard
unit vector basis in $\mathbb{R}^n$. Consequently, Proposition 6.3 used with $N = 2n$ leads
to the bound $\operatorname{vrad} B_1^n = O\big(\sqrt{\log(n)/n}\big)$, while the correct value (cf. Table 4.1) is
$O(1/\sqrt{n})$. Some of these issues are explored in Exercise 6.4.
Remark 6.5 (Conjectured extremal property of the regular simplex). It is
conjectured that the polytope with $N$ vertices and outradius 1 that has the largest
Gaussian mean width is the regular simplex inscribed in the unit ball. This is
known (and easy) for $N \le 3$. By the argument used in the proof of Proposition
6.3, this is equivalent to characterizing the instances giving the extremal value of
$\mathbf{E} \max_{1 \le k \le N} X_k$ in the context of Lemma 6.1 (with $(X_k)_{1 \le k \le N}$ jointly Gaussian).

Exercise 6.2. Show that, in the context of the second part of Lemma 6.1, we
have
\[ \mathbf{E} \max_{1 \le k \le N} X_k \ge \sqrt{2 \log N} - O\left( \frac{\log \log N}{\sqrt{\log N}} \right) \]
by using the lower bound from (A.4).


Exercise 6.3. Prove Lemma 6.2 for $N \ge 4$ as follows: if $Z$ is an $N(0, 1)$
random variable, then
\[ \mathbf{E} \max_{1 \le k \le N} |X_k| \le T + 2N \int_T^\infty \mathbf{P}(Z > t)\, dt = T + \frac{2N}{\sqrt{2\pi}} e^{-T^2/2} - 2NT\, \mathbf{P}(Z > T) \]
and check numerically that the choice $T = \sqrt{2 \log N - 3/2}$ gives the needed
inequality. Note that this proof does not use the hypothesis that the variables are
jointly Gaussian. For 2 or 3 jointly Gaussian variables, use Proposition 6.9 to
identify extremal configurations.
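The numerical check requested in Exercise 6.3 can be sketched as follows. The code evaluates the right-hand side of the displayed bound at $T = \sqrt{2 \log N - 3/2}$ and compares it with $\sqrt{2 \log N}$; the helper names are illustrative, and the Gaussian tail is computed from the standard library function `math.erfc`.

```python
import math

def gaussian_tail(t):
    """P(Z > t) for a standard normal variable Z."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def truncation_bound(n_vars):
    """T + (2N / sqrt(2 pi)) exp(-T^2/2) - 2 N T P(Z > T) at T = sqrt(2 log N - 3/2)."""
    t = math.sqrt(2.0 * math.log(n_vars) - 1.5)
    return (t
            + 2.0 * n_vars / math.sqrt(2.0 * math.pi) * math.exp(-0.5 * t * t)
            - 2.0 * n_vars * t * gaussian_tail(t))

for n_vars in (3, 4, 5, 10, 100, 10**6):
    target = math.sqrt(2.0 * math.log(n_vars))
    value = truncation_bound(n_vars)
    print(f"N={n_vars:8d}  bound={value:.4f}  sqrt(2 log N)={target:.4f}  ok={value <= target}")
```

With this choice of $T$ the comparison fails for $N = 3$ and holds from $N = 4$ on, which is consistent with the range stated in the exercise.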


Exercise 6.4 (Volume of polytopes with very few vertices). Show that if, in
the notation of Proposition 6.3, $N = O(n)$, then $\operatorname{vrad} K = O(1/\sqrt{n})$, which yields
the better bound stated in Remark 6.4 for that range of $N$.

Exercise 6.5 (Volume of symmetric polytopes with few vertices). Show that
if $K \subset B_2^n$ is a symmetric polytope with $N$ vertices, the conclusion of Proposition
6.3 can be slightly improved to the inequality $\operatorname{vrad}(K) \le \sqrt{2 \log N}/\sqrt{n}$.
Exercise 6.6 (Mean widths of standard sets). Prove the estimates involving
mean width from Table 4.1.
6.1.2. Comparison inequalities for Gaussian processes. The following
fundamental inequality is known as Slepian’s lemma. It expresses the fact that
strengthening correlations of a Gaussian process decreases the supremum.

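Proposition 6.6 (Slepian’s lemma, not proved here). Let $(X_k)_{1 \le k \le N}$ and
$(Y_k)_{1 \le k \le N}$ be Gaussian processes, and assume that
\[ \mathbf{E}\big[(X_k - X_j)^2\big] \le \mathbf{E}\big[(Y_k - Y_j)^2\big] \]
for every $1 \le j, k \le N$. Then
\[ \mathbf{E} \sup_{1 \le k \le N} X_k \le \mathbf{E} \sup_{1 \le k \le N} Y_k. \tag{6.7} \]
Moreover, if also $\mathbf{E} X_k^2 = \mathbf{E} Y_k^2$ for all $k$, then for any $\lambda_1, \dots, \lambda_N \in \mathbb{R}$
\[ \mathbf{P}(X_k \ge \lambda_k \text{ for some } k) \le \mathbf{P}(Y_k \ge \lambda_k \text{ for some } k). \tag{6.8} \]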
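The simplest instance of (6.7) is $N = 2$. For centered unit-variance jointly Gaussian $X_1, X_2$ with correlation $\rho$ one has $\mathbf{E}(X_1 - X_2)^2 = 2(1 - \rho)$ and, since $\max(a, b) = \frac{a+b}{2} + \frac{|a-b|}{2}$, also $\mathbf{E} \max(X_1, X_2) = \sqrt{(1-\rho)/\pi}$, which decreases as the correlation grows. The sketch below compares this closed form with a simulation; the function names, sample size and seed are illustrative choices.

```python
import math
import random

def expected_max_pair(rho):
    """Closed form for E max(X1, X2), standard normals with correlation rho."""
    return math.sqrt((1.0 - rho) / math.pi)

def monte_carlo_max_pair(rho, trials=40000, seed=1):
    """Simulation of the same quantity, with X2 = rho*Z1 + sqrt(1 - rho^2)*Z2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1, x2 = z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        total += max(x1, x2)
    return total / trials

for rho in (-0.5, 0.0, 0.5, 0.9):
    print(f"rho={rho:+.1f}  closed form={expected_max_pair(rho):.4f}  "
          f"simulation={monte_carlo_max_pair(rho):.4f}")
```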

Slepian’s lemma can be re-formulated in geometric language: contractions
decrease the mean width. More precisely, if $T \subset \mathbb{R}^n$ and if $\phi : T \to \mathbb{R}^m$ is a contraction
(with respect to the Euclidean distance, not necessarily linear), then
\[ w_G\big(\operatorname{conv}(\phi(T))\big) = w_G(\phi(T)) \le w_G(T) = w_G(\operatorname{conv}(T)). \tag{6.9} \]
If $m = n$, we can immediately deduce from (4.32) that also $w(\phi(T)) \le w(T)$. This
property seems intuitively obvious, but we know a simple proof only if $\phi$ is linear
(or affine, see Exercise 4.46).
Slepian’s lemma admits a number of variants and generalizations; the following
one has been quite useful. In particular, it leads to elegant proofs of various
statements about random matrices (see Section 6.2) and versions of Dvoretzky’s
theorem (Section 7.2).

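Proposition 6.7 (Gordon’s lemma, not proved here). Let $(X_t)_{t \in T}$ and $(Y_t)_{t \in T}$
be Gaussian processes. Assume further that $T = \bigcup_{s \in S} T_s$ and that
(i) $\|X_t - X_{t'}\|_2 \le \|Y_t - Y_{t'}\|_2$ if $t \in T_s$, $t' \in T_{s'}$ with $s \ne s'$,
(ii) $\|X_t - X_{t'}\|_2 \ge \|Y_t - Y_{t'}\|_2$ if $t, t' \in T_s$ for some $s$.
Then
\[ \mathbf{E} \max_{s \in S} \min_{t \in T_s} X_t \le \mathbf{E} \max_{s \in S} \min_{t \in T_s} Y_t. \]
Moreover, if also $\mathbf{E} X_t^2 = \mathbf{E} Y_t^2$ for all $t \in T$, then for any choice of real numbers
$(\lambda_t)_{t \in T}$,
\[ \mathbf{P}\Big( \bigcup_{s \in S} \bigcap_{t \in T_s} \{X_t \ge \lambda_t\} \Big) \le \mathbf{P}\Big( \bigcup_{s \in S} \bigcap_{t \in T_s} \{Y_t \ge \lambda_t\} \Big). \]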

Remark 6.8. (1) When all $T_s$ are singletons, Gordon’s lemma reduces to
the Slepian version. Accordingly, Proposition 6.7 is sometimes referred to as the
Slepian–Gordon lemma. (2) Replacing $X_t$, $Y_t$ with $-X_t$, $-Y_t$ we get analogous
statements for min max in place of max min, and similarly for Slepian’s lemma and
for the statements about probabilities. (3) Further generalizations to min and max
applied alternately more than twice are possible.
Another fundamental comparison inequality is the Khatri–Šidák lemma.
Proposition 6.9 (Khatri–Šidák, see Exercise 6.9). Consider two Gaussian
processes $(X_k)_{1 \le k \le N}$ and $(Y_k)_{1 \le k \le N}$, and assume that
(1) for every $1 \le k \le N$, $\mathbf{E} X_k^2 = \mathbf{E} Y_k^2$,
(2) the random variables $(Y_k)_{1 \le k \le N}$ are independent.
Then,
\[ \mathbf{E} \sup_{1 \le k \le N} |X_k| \le \mathbf{E} \sup_{1 \le k \le N} |Y_k|. \tag{6.10} \]
Moreover, for any $t_1, \dots, t_N \ge 0$,
\[ \mathbf{P}(|X_k| \ge t_k \text{ for some } k) \le \mathbf{P}(|Y_k| \ge t_k \text{ for some } k) \tag{6.11} \]
or equivalently
\[ \mathbf{P}(|X_k| \le t_k \text{ for all } k) \ge \mathbf{P}(|Y_k| \le t_k \text{ for all } k) = \prod_{k=1}^N \mathbf{P}(|Y_k| \le t_k). \tag{6.12} \]
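Inequality (6.12) can be illustrated for $N = 2$ by simulation. In the sketch below (an illustration under stated assumptions, not a proof) $X_1, X_2$ are standard normals with correlation $\rho$, the independent benchmark is computed exactly from $\mathbf{P}(|Z| \le t) = \operatorname{erf}(t/\sqrt{2})$, and the function names, sample size and seed are arbitrary.

```python
import math
import random

def band_probability_independent(t1, t2):
    """P(|Y1| <= t1) * P(|Y2| <= t2) for independent standard normals."""
    return math.erf(t1 / math.sqrt(2.0)) * math.erf(t2 / math.sqrt(2.0))

def band_probability_correlated(rho, t1, t2, trials=40000, seed=2):
    """Monte Carlo estimate of P(|X1| <= t1, |X2| <= t2) when corr(X1, X2) = rho."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1, x2 = z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        if abs(x1) <= t1 and abs(x2) <= t2:
            hits += 1
    return hits / trials

for rho in (0.0, 0.4, 0.8):
    print(f"rho={rho:.1f}  correlated={band_probability_correlated(rho, 1.0, 1.0):.4f}  "
          f"independent={band_probability_independent(1.0, 1.0):.4f}")
```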

Similarly to Slepian’s lemma, both (6.10) and (6.12) have nice geometric
interpretations. Consider $n$ bands in $\mathbb{R}^n$ of the form $B_i = \{x \in \mathbb{R}^n : |\langle x, u_i \rangle| \le a_i\}$
where $u_1, \dots, u_n \in S^{n-1}$ are unit vectors and $a_1, \dots, a_n$ are positive numbers. Then,
the mean width of $B_1^\circ \cap \dots \cap B_n^\circ$ is minimal when the directions of the bands (i.e.,
the normal vectors $u_i$) are pairwise orthogonal. Similarly, the (Gaussian) measure
of the intersection of the bands is minimal if the bands are orthogonal.
A remarkable statement that generalizes (6.12) and that has been a
long-standing open problem is the Gaussian correlation conjecture. It was answered
affirmatively very recently by Royen, who proved the following inequality: given
0-symmetric convex sets $K, L \subset \mathbb{R}^n$ and a centered Gaussian measure $\mathbf{P}$ on $\mathbb{R}^n$,
we have
\[ \mathbf{P}(K \cap L) \ge \mathbf{P}(K)\, \mathbf{P}(L). \tag{6.13} \]

Exercise 6.7 (Comparison of tails implies comparison of expectations). Deduce
the first part (6.7) of Slepian’s lemma from the second part (6.8). To get rid of
the “equal variance” assumption, approximate the space by a sphere of large radius.

Exercise 6.8. Show that it is enough to verify (6.13) when P is the standard
Gaussian measure.

Exercise 6.9 (Proof of the Khatri–Šidák inequality). Prove the correlation
conjecture (6.13) in the special case where $L$ is a band by using the fact that the
Gaussian measure is log-concave and therefore satisfies (4.28). Then deduce the
Khatri–Šidák inequality (Proposition 6.9).

6.1.3. Sudakov and dual Sudakov inequalities. Given a Gaussian process
$X = (X_t)_{t \in T}$ we may identify $X$ with a subset of the Hilbert space $L_2(\Omega)$ (cf. (6.3)
and the comments in the paragraph containing it). Since the joint distribution of
$(X_t)_{t \in T}$ is uniquely determined by the covariances $(\mathbf{E} X_s X_t)_{s,t \in T}$, it follows that all
the stochastically relevant information about the process is encoded in the geometry
of $X$. As it turns out, the value of the expected supremum of $X$ is intimately related
to the behavior of covering numbers $N(X, \varepsilon)$. The first result in this direction is
the Sudakov inequality.
Proposition 6.10 (Sudakov minoration). Let $X = (X_t)_{t \in T}$ be a Gaussian
process. Then,
\[ c \sup_{\varepsilon > 0} \varepsilon \sqrt{\log N(X, \varepsilon)} \le \mathbf{E} \sup_{t \in T} X_t \tag{6.14} \]
for some absolute constant $c > 0$.

Proof. By (5.1), we may equivalently work with the packing number $P(X, \varepsilon)$.
Let $\varepsilon > 0$ and let $S \subset T$ be a subset which is $\varepsilon$-separated in the $L_2$-norm, that is,
verifying $\|X_s - X_t\|_2 \ge \varepsilon$ whenever $s, t \in S$ and $s \ne t$. Let $(Y_s)_{s \in S}$ be a Gaussian
process such that $Y_s$ are independent $N(0, \varepsilon^2/2)$ random variables. By construction,
we have
\[ \|Y_s - Y_t\|_2 = \varepsilon \le \|X_s - X_t\|_2 \]
for any $s, t \in S$ with $s \ne t$. Accordingly, by Slepian’s lemma and Lemma 6.1, we
can conclude that
\[ \varepsilon \sqrt{\log(\operatorname{card} S)} \sim \mathbf{E} \sup_{s \in S} Y_s \le \mathbf{E} \sup_{s \in S} X_s \le \mathbf{E} \sup_{t \in T} X_t, \]
as needed. □
In view of the comments in Section 6.1.1 (cf. (6.2), (6.3)), Sudakov’s inequality
(6.14) is really a statement about Gaussian mean widths of subsets of a Hilbert
space. Since $w_G(K) \sim \sqrt{n}\, w(K)$ for $K \subset \mathbb{R}^n$ (see Section 4.3.3), the inequality
(6.14) may be restated as follows: for every bounded set (or, equivalently, for every
convex body) $K \subset \mathbb{R}^n$ we have
\[ \log N(K, \varepsilon B_2^n) \lesssim w_G(K)^2/\varepsilon^2 \sim n\, w(K)^2/\varepsilon^2. \tag{6.15} \]
In general, Sudakov’s inequality is not tight (see Exercise 6.11). However,
in combination with the equally simple-minded bound (6.5) (applied at the
appropriate “level of resolution”), it often leads to surprisingly precise estimates for
$\mathbf{E} \sup_{t \in T} X_t$. We will elaborate on this point in the next section, in which we prove
the companion bound, Dudley’s inequality (Proposition 6.13).


When information about the mean width of $K$ is available, (6.15) can be
used to upper-bound covering/packing numbers of $K$. As a rule of thumb, this
yields a reasonable estimate when $\log N(K, \varepsilon) = O(n)$. For smaller $\varepsilon$, i.e., when
$\log N(K, \varepsilon) \gg n$, the volumetric approach from Lemma 5.8 is generally more precise.
We exemplify these phenomena in Exercise 6.12.


A dual version of the Sudakov inequality also holds.

Proposition 6.11 (Dual Sudakov minoration). For any bounded set $K \subset \mathbb{R}^n$,
we have
\[ \log N(B_2^n, K^\circ, \varepsilon) = \log N(B_2^n, \varepsilon K^\circ) \lesssim w_G(K)^2/\varepsilon^2 \sim n\, w(K)^2/\varepsilon^2. \tag{6.16} \]

Modulo minor issues related primarily to possible lack of symmetry, Proposition
6.11 follows from Proposition 6.10, and vice versa, by the (known) Euclidean case
of the duality conjecture of covering numbers (5.67). However, there is a simple
self-contained argument.

Proof of Proposition 6.11. First, we may assume that $K$ is a convex body
since replacing $K$ with its closed convex hull and passing to a subspace (if $K$ wasn’t
of full dimension) doesn’t change any of the quantities involved. Next, we may
assume that 0 is an interior point of $K$ since otherwise $K^\circ$ contains a half-space
and the left-hand side is 0. Further, we may assume that $K$ is symmetric: replacing
$K$ by $K - K$ increases both sides, and the right-hand side changes precisely
by a factor of 4. The last “trivial” reduction is a rescaling. Since $\log N(B_2^n, \varepsilon K^\circ) =
\log N(rB_2^n, r\varepsilon K^\circ)$, using $r = \frac{4 w_G(K)}{\varepsilon}$ we reduce the problem to the following: If
$L \subset \mathbb{R}^n$ is a symmetric convex body with $w_G(L) = 1$, then $\log N(rB_2^n, 4L^\circ) \lesssim r^2$
for $r > 0$.

As in the previous argument, it is handier to argue via packings. Let
$x_1, x_2, \dots, x_N \in rB_2^n$ be such that the sets $x_i + 2L^\circ$ are disjoint and let $\gamma_n$ be the standard
Gaussian measure on $\mathbb{R}^n$. The remainder of the proof depends on two simple
observations.
(a) Since $1 = w_G(L) = \int \|x\|_{L^\circ}\, d\gamma_n$, it follows by Markov’s inequality that
$\gamma_n(2L^\circ) \ge \frac{1}{2}$.
(b) Since $|x_i| \le r$ for all $i \le N$, the measure of each translate $x_i + 2L^\circ$ cannot
be “too small” and since the translates are disjoint, there cannot be too many of
them.
Here are details of the calculation behind the second observation. First, by
symmetry of $L^\circ$,
\[ \gamma_n(x_i + 2L^\circ) = \frac{\gamma_n(x_i + 2L^\circ) + \gamma_n(-x_i + 2L^\circ)}{2} = \int_{2L^\circ} \frac{\phi(x + x_i) + \phi(x - x_i)}{2}\, dx, \]
where $\phi(x) = (2\pi)^{-n/2} e^{-|x|^2/2}$ is the density of $\gamma_n$. Next, by convexity of the
exponential function and by the parallelogram identity
\begin{align*}
\frac{\phi(x + x_i) + \phi(x - x_i)}{2} &= (2\pi)^{-n/2}\, \frac{e^{-|x + x_i|^2/2} + e^{-|x - x_i|^2/2}}{2} \\
&\ge (2\pi)^{-n/2} e^{-(|x + x_i|^2 + |x - x_i|^2)/4} \\
&= (2\pi)^{-n/2} e^{-(|x|^2 + |x_i|^2)/2} \\
&= e^{-|x_i|^2/2} \phi(x) \\
&\ge e^{-r^2/2} \phi(x).
\end{align*}
Inserting this estimate into the preceding formula we get
\[ \gamma_n(x_i + 2L^\circ) \ge e^{-r^2/2} \gamma_n(2L^\circ) \ge \frac{1}{2} e^{-r^2/2} \]
and so $N \le 2e^{r^2/2}$. This is exactly what we needed, except in the case when $r$ is
small, which can be handled separately by an elementary argument showing that
the left-hand side of (6.16) is then 0; see Exercise 6.13. □
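The pointwise inequality used in the proof, namely that the average of the values of the Gaussian density at $x + x_i$ and $x - x_i$ is at least $e^{-|x_i|^2/2}$ times its value at $x$, can be spot-checked numerically. The sketch below tests it at random points of $\mathbb{R}^3$; the dimension, sampling ranges and function names are arbitrary illustrative choices.

```python
import math
import random

def gaussian_density(x):
    """Density of the standard Gaussian measure at the point x (a list of floats)."""
    n = len(x)
    return (2.0 * math.pi) ** (-0.5 * n) * math.exp(-0.5 * sum(c * c for c in x))

def symmetrized_density_gap(x, shift):
    """(phi(x+a) + phi(x-a))/2 - exp(-|a|^2/2) * phi(x); nonnegative up to rounding."""
    plus = [c + a for c, a in zip(x, shift)]
    minus = [c - a for c, a in zip(x, shift)]
    average = 0.5 * (gaussian_density(plus) + gaussian_density(minus))
    return average - math.exp(-0.5 * sum(a * a for a in shift)) * gaussian_density(x)

rng = random.Random(3)
gaps = [symmetrized_density_gap([rng.uniform(-3.0, 3.0) for _ in range(3)],
                                [rng.uniform(-2.0, 2.0) for _ in range(3)])
        for _ in range(1000)]
print("smallest gap over 1000 random points:", min(gaps))
```

The gap equals $e^{-|x_i|^2/2}\phi(x)\,(\cosh\langle x, x_i\rangle - 1)$, so it vanishes exactly when $x$ is orthogonal to $x_i$.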


Remark 6.12. In the setting of observation (a) in the proof above, a stronger
statement is actually true: if $w_G(L) = 1$, then $\gamma_n(L^\circ) \ge \frac{1}{2}$, see Exercise 6.14.

Exercise 6.10 (Optimal constant in Sudakov’s inequality). Show that the
optimal constant in (6.14) is $c = (2\pi \log 2)^{-1/2} > 0.479$.

Exercise 6.11 (The gap in Sudakov’s inequality). Show that the gap in
Sudakov’s inequality, i.e., the ratio between $w_G(K)$ and $\sup_{\varepsilon > 0} \varepsilon \sqrt{\log N(K, \varepsilon)}$, can
be arbitrarily large. For example, let $(d_j)_{j=1}^n$ be a “sufficiently fast” increasing
sequence of positive integers and consider $K = K_1 \times K_2 \times \dots \times K_n$, where $K_j$ is a
Euclidean sphere of dimension $d_j$ and radius $1/\sqrt{d_j}$.
Exercise 6.12 (Metric entropy of $B_1^n$). Let $K = n^{1/2} B_1^n$. It is known (see
Theorem 1 in [Sch84]) that then
\[ \log N(K, \varepsilon) \simeq \begin{cases} n\, \dfrac{\log(2\varepsilon)}{\varepsilon^2} & \text{if } 1 \le \varepsilon \le \frac{1}{2} n^{1/2}, \\[1ex] n \log(2/\varepsilon) & \text{if } 0 < \varepsilon \le 1. \end{cases} \tag{6.17} \]
Compare the performance/facility of application of (6.15) to that of Lemma 5.8
when estimating $\log N(K, \varepsilon)$.

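Exercise 6.13 (Gaussian measure and the inradius). Let $\gamma_n$ be the standard
Gaussian measure on $\mathbb{R}^n$. Show that if a symmetric convex body $K \subset \mathbb{R}^n$
satisfies $\gamma_n(K) \ge \gamma_1([-r, r])$, then $K \supset rB_2^n$. In particular, if $\gamma_n(K) \ge .683$,
then $N(B_2^n, K) = 1$. Conclude (via Markov’s inequality, as in observation (a) of the
proof of Proposition 6.11) that the left-hand side of (6.16) is 0 whenever
$w_G(K)/\varepsilon \le .317$.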
Exercise 6.14 (Gaussian measure and the mean width). Show that if a
symmetric convex body $L \subset \mathbb{R}^n$ satisfies $w_G(L) \le 1$, then $\gamma_n(L^\circ) \ge \frac{1}{2}$.
Exercise 6.15 (Metric entropy of $B_\infty^n$). Use one of the Sudakov inequalities to
show that, for every $0 < \varepsilon < 1$, $N(B_2^n, B_\infty^n, \varepsilon)$ grows (at most) polynomially with
the dimension $n$.
It is actually known (see Theorem 1 in [Sch84]) that
\[ \log N(B_2^n, B_\infty^n, \varepsilon) \simeq \begin{cases} \dfrac{\log(2n\varepsilon^2)}{\varepsilon^2} & \text{if } n^{-1/2} \le \varepsilon \le 1/2, \\[1ex] n \log \dfrac{2}{n\varepsilon^2} & \text{if } 0 < \varepsilon \le n^{-1/2}. \end{cases} \tag{6.18} \]
The similarity of the estimates (6.17) and (6.18) is not a coincidence; see (5.67).
(Note that (6.17) could have been equivalently stated with $\log(2\varepsilon^2)$ and $\log(2/\varepsilon^2)$
instead of $\log(2\varepsilon)$ and $\log(2/\varepsilon)$, making the similarity even more apparent.)
6.1.4. Dudley’s inequality and the generic chaining. The preceding
section presented lower bounds for expected suprema of a Gaussian process in terms of
the related covering/packing numbers. In this section we will present similar upper
bounds in a slightly more general setting.

Let $(S, \rho)$ be a compact metric space and let $(X_s)_{s \in S}$ be a family of random
variables (a stochastic process indexed by $S$). We say that $(X_s)$ is centered if
$\mathbf{E} X_s = 0$ for all $s \in S$, and that it is subgaussian if, for all $s, t \in S$ with $s \ne t$ and
for all $\lambda > 0$,
\[ \mathbf{P}(X_s - X_t > \lambda) \le A \exp\left( -\alpha \frac{\lambda^2}{\rho(s, t)^2} \right), \tag{6.19} \]
where $A, \alpha$ are positive parameters (independent of $\lambda, s, t$). The motivation for the
terminology is that if the process is Gaussian, then (6.19) holds with $A = \alpha = \frac{1}{2}$ and
with respect to the metric $\rho(s, t) = \|X_s - X_t\|_2$, and the bound is then essentially
tight (see Exercise A.1).

Proposition 6.13 (Dudley’s inequality). If $(X_s)_{s \in S}$ is centered and satisfies
(6.19) with $A \ge \frac{1}{2}$, then
\[ \mathbf{E} \sup_{s \in S} X_s \le 6\alpha^{-1/2} \int_0^{R/2} \sqrt{1 + 2 \log\big(A^{1/2} N(S, \eta)\big)}\, d\eta, \tag{6.20} \]
where $R$ is the radius of $S$.
Corollary 6.14. If $(X_s)_{s \in S}$ satisfies (6.19) with $A \ge \frac{1}{2}$, but is
not-necessarily-centered, then
\[ \mathbf{E} \sup_{s \in S} X_s \le \sup_{s \in S} \mathbf{E} X_s + B \quad \text{and} \quad \mathbf{E} \sup_{s \in S} |X_s| \le \sup_{s \in S} \mathbf{E} |X_s| + B, \tag{6.21} \]
where $B$ is the quantity on the right-hand side of (6.20).
The first bound in the Corollary follows immediately by considering $X_s' =
X_s - \mathbf{E} X_s$, and the second by noticing that if $(X_s)$ verifies (6.19), then so does
$(|X_s|)$.

Remark 6.15. (1) Most formulations of Dudley’s inequality involve the
expression $\int \sqrt{\log N(S, \eta)}\, d\eta$. In that case, the integrand is 0 if $\eta$ is larger than the
radius of $S$, and so one may as well integrate over $[0, \infty)$. In our formulation, the
integrand is never 0; this is the price we are paying for having good dependence
of the bound on $A$ and, to a lesser extent, for Lemma 6.16 being stated for
not-necessarily-centered variables.
(2) Some applications require majorizing the expected value of $\sup_{s,t} |X_s - X_t| =
\sup_{s,t} (X_s - X_t)$; the proof below yields then (in the notation of Corollary 6.14) the
bound $2B$, without having to assume that $(X_s)$ is centered.
(3) When comparing Dudley’s inequality to Sudakov’s inequality (6.14), we notice
that the former involves the $L_1$-norm of the function $\phi(\eta) = \sqrt{\log N(S, \eta)}$, while
the latter the weak $L_1$-quasinorm (see [Gra14] for the definition). This explains
why the two bounds are often of the same order and even if they are not, their ratio
depends rather weakly on the dimension and other parameters.

Proof of Dudley’s inequality. Observe first that both sides of the
inequality change in the same way if we rescale the process and/or the metric (i.e.,
replace $(X_t)$ by $(aX_t)$ and/or $\rho$ by $b\rho$ for some $a, b > 0$) and appropriately adjust
the parameter $\alpha$. Accordingly, we may assume that both $\alpha$ and the radius of $S$ are
equal to 1. For every integer $k \ge 0$, let $\mathcal{N}_k$ be a $2^{-k}$-net of minimal cardinality for
$(S, \rho)$. By hypothesis, the net $\mathcal{N}_0$ consists of a single element $s_0$. For every $k$ and
for every $s \in S$, denote by $\pi_k(s)$ an element of $\mathcal{N}_k$ satisfying $\rho(s, \pi_k(s)) \le 2^{-k}$. The
chaining equation reads for every $s \in S$
\[ X_s = X_{s_0} + \sum_{k \ge 0} \big( X_{\pi_{k+1}(s)} - X_{\pi_k(s)} \big). \tag{6.22} \]
It follows that
\[ \sup_{s \in S} X_s \le X_{s_0} + \sum_{k \ge 0} \sup_{s \in S} \big( X_{\pi_{k+1}(s)} - X_{\pi_k(s)} \big) \le X_{s_0} + \sum_{k \ge 0} \sup_{u, u'} (X_u - X_{u'}), \tag{6.23} \]
where the last supremum is taken over couples $(u, u') \in \mathcal{N}_{k+1} \times \mathcal{N}_k$ satisfying
$\rho(u, u') \le 2^{-k} + 2^{-(k+1)} = 3 \cdot 2^{-(k+1)}$. Since $\mathbf{E} X_{s_0} = 0$, it remains to bound the
expectation of each term in the sum, using the following fact.


Lemma 6.16. If $A \ge \frac{1}{2}$, $\beta > 0$ and if $Y_1, \dots, Y_N$ are random variables satisfying
$\mathbf{P}(Y_i > t) \le A \exp(-t^2/\beta^2)$ for all $t \ge 0$, then
\[ \mathbf{E} \max_{1 \le i \le N} Y_i \le \beta \sqrt{1 + \log(AN)}. \tag{6.24} \]

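To bound $\mathbf{E} \sup (X_u - X_{u'})$, we apply the above Lemma with $\beta = 3 \cdot 2^{-(k+1)}$
and $N = \operatorname{card}(\mathcal{N}_k) \cdot \operatorname{card}(\mathcal{N}_{k+1}) \le N(S, 2^{-(k+1)})^2$. This gives
\[ \mathbf{E} \sup_{s \in S} X_s \le 3 \sum_{k \ge 1} 2^{-k} \sqrt{1 + 2 \log\big(A^{1/2} N(S, 2^{-k})\big)}. \tag{6.25} \]
The result follows now by majorizing the last series with an integral. □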
Proof of Lemma 6.16. We may assume that $\beta = 1$ by working with $Y_i/\beta$
and that the variables $Y_i$ are non-negative by working with the positive parts $Y_i^+$.
If $N \ge 2$, then $AN \ge 1$ and so
\begin{align*}
\mathbf{E} \max_i Y_i &= \int_0^\infty \mathbf{P}(\max_i Y_i \ge t)\, dt \\
&\le \sqrt{\log(AN)} + AN \int_{\sqrt{\log(AN)}}^\infty \exp(-t^2)\, dt \\
&\le \sqrt{\log(AN) + 1}.
\end{align*}
The first inequality is the union bound; the second one is the upper bound in
Komatu’s inequality (A.4) which can be rewritten as $\int_u^\infty e^{-t^2}\, dt \le (\sqrt{u^2 + 1} - u) e^{-u^2}$
(valid for $u \ge -0.3893$ and applied with $u = \sqrt{\log(AN)}$, for which $e^{-u^2} = 1/(AN)$).
If $N = 1$, the inequality is trivial if the variable has mean 0 and can be checked
directly otherwise; see Exercise 6.17, which also treats in detail the case of small
$A$. □
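Both ingredients of this proof can be checked numerically. The sketch below uses the identity $\int_u^\infty e^{-t^2}\, dt = \frac{\sqrt{\pi}}{2} \operatorname{erfc}(u)$ to test the rewritten form of Komatu’s inequality on a grid, and then compares the two sides of the final estimate for a few values of $AN \ge 1$. The function names and the grid are illustrative choices.

```python
import math

def gaussian_type_tail(u):
    """Integral of exp(-t^2) over [u, infinity), via the complementary error function."""
    return 0.5 * math.sqrt(math.pi) * math.erfc(u)

def komatu_upper(u):
    """Upper bound (sqrt(u^2 + 1) - u) * exp(-u^2) from the rewritten Komatu inequality."""
    return (math.sqrt(u * u + 1.0) - u) * math.exp(-u * u)

def lemma_bound_pieces(a_times_n):
    """Return (sqrt(log AN) + AN * tail, sqrt(1 + log AN)); requires AN >= 1."""
    u = math.sqrt(math.log(a_times_n))
    return u + a_times_n * gaussian_type_tail(u), math.sqrt(1.0 + math.log(a_times_n))

grid = [-0.38 + 0.01 * k for k in range(539)]  # u from -0.38 to 5.0
print("Komatu bound holds on grid:",
      all(gaussian_type_tail(u) <= komatu_upper(u) for u in grid))
for a_times_n in (1.0, 2.0, 10.0, 1e6):
    lhs, rhs = lemma_bound_pieces(a_times_n)
    print(f"AN={a_times_n:10.1f}  union-bound estimate={lhs:.4f}  sqrt(1+log AN)={rhs:.4f}")
```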

Although Dudley’s inequality is not sharp in general (see Exercises 6.19 and
6.20, which exhibit two different reasons for a possible gap), it does become sharp
when sufficiently many symmetries are present; such a situation is referred to as the
stationary case in probability literature. Here is a statement demonstrating this
principle expressed in the language of convex sets and their Gaussian mean widths.
Proposition 6.17 (not proved here). Let $K \subset \mathbb{R}^n$ be a nonempty compact
convex set and let $F$ be the set of extreme points of $K$. If the isometry group of $K$
acts transitively on $F$, then
\[ w_G(K) = w_G(F) \simeq \int_0^{\operatorname{outrad}(F)} \sqrt{1 + \log N(F, \eta)}\, d\eta. \]
In the most general situation, the chaining argument used in the proof of
Proposition 6.13 (which is based on a decomposition along consecutive “levels of
resolution”) can be improved by using a generic version of the chaining.

Theorem 6.18 (Generic chaining, not proved here). Let $(X_t)_{t \in T}$ be a centered
subgaussian process and let $\rho$ be the distance on $T$ defined by $\rho(s, t) = \|X_s - X_t\|_{L_2}$.
Let $(T_k)_{k \in \mathbb{N}}$ be an increasing family of subsets of $T$ such that $\operatorname{card}(T_0) = 1$ and
$\operatorname{card}(T_k) \le 2^{2^k}$ for $k \ge 1$. Then
\[ \mathbf{E} \sup_{t \in T} X_t \le C \sup_{s \in T} \sum_{k=0}^\infty 2^{k/2} \rho(s, T_k) \tag{6.26} \]
for some absolute constant $C$. Conversely, if the process $(X_t)_{t \in T}$ is Gaussian, this
bound is always sharp in the following sense: if $\gamma_2(T)$ denotes the infimum of
$\sup_{s \in T} \sum_{k=0}^\infty 2^{k/2} \rho(s, T_k)$ over all such families $(T_k)$, then we have
\[ \mathbf{E} \sup_{t \in T} X_t \ge c\, \gamma_2(T) \]
for some absolute constant $c$.


To grasp the difference between Dudley’s integral and the generic chaining
bound, it is useful to rephrase the former in the language of Theorem 6.18. One
checks (see Exercise 6.22) that, for any compact metric space $(T, \rho)$,
\[ \int_0^{\operatorname{diam} T} \sqrt{\log N(T, \eta)}\, d\eta \simeq \inf_{(T_k)} \sum_{k=0}^\infty 2^{k/2} \sup_{s \in T} \rho(s, T_k), \tag{6.27} \]
where the infimum is taken over families $(T_k)$ as in Theorem 6.18. Note that the
right-hand sides of (6.26) and (6.27) differ in the relative position of the summation
and the supremum.

Exercise 6.16 (The constant in Dudley’s inequality). Show that the constant
6 in Dudley’s inequality (6.20) can be improved to $3 + 2\sqrt{2} \approx 5.83$ if we repeat the
proof with $\mathcal{N}_k$ being a $\theta^k$-net, and optimize over $\theta \in (0, 1)$.
Exercise 6.17. The argument in the proof of Lemma 6.16 works if $AN \ge 1$.
Show that when $AN < 1$, then the optimal majorant is $\frac{\sqrt{\pi}}{2} \beta AN$ and check that,
consequently, the bound from Lemma 6.16 holds whenever $AN \ge 0.4236$.
Exercise 6.18 (Median of the maximum of a subgaussian process). Show that
under the hypotheses of Lemma 6.16 the median of $\max_i Y_i$ is at most $\beta \sqrt{\log(2AN)}$.
Exercise 6.19 (The gap in Dudley’s inequality). Let $(Z_k)_{k=1}^n$ be an i.i.d.
sequence of $N(0, 1)$ variables and let $X_k = Z_k/\sqrt{1 + \log k}$. Check that $\mathbf{E} \max_k X_k < 3$
for any $n \in \mathbb{N}$, but that the integral on the right-hand side of (6.20) is $\Theta(\log \log n)$.
Exercise 6.20 (The gap in Dudley’s inequality via $B_1^n$). Let $K = B_1^n$. Show
that $\int_0^1 \sqrt{\log N(K, \eta)}\, d\eta \simeq (\log n)^{3/2}$ while $w_G(K) \sim \sqrt{2 \log n}$. Interpret this
discrepancy as a gap in Dudley’s inequality.
Exercise 6.21 (Law of the iterated logarithm via Dudley’s inequality). Here is
a rough version of the law of the iterated logarithm. Let $(Z_i)_{1 \le i \le n}$ be independent
$N(0, 1)$ random variables and consider the Gaussian process $X = (X_k)_{1 \le k \le n}$ defined
by $X_k = \frac{1}{\sqrt{k}} (Z_1 + \dots + Z_k)$. Estimate the covering numbers of $X$ and conclude
that $\mathbf{E} \max\{X_k : 1 \le k \le n\} = \Theta(\log \log n)$.
ot
Exercise 6.22 (Dudley integral as a chaining bound). Prove (6.27).
Exercise 6.23 (Generic chaining improves on Dudley’s inequality). Show that
the processes from Exercise 6.19 can be shown to be uniformly bounded via generic
chaining.

6.2. Random matrices

Random matrix theory (RMT) studies spectral properties of large-dimensional
matrices generated by some random procedure. We present in this chapter a very
small selection of results from RMT, which will be useful to analyze random
constructions of interest in QIT. In particular, while we focus mostly on the Gaussian
setting, most of the limit theorems are valid for a much wider class of random
matrices; this principle is known as universality. We study primarily (but not
exclusively) matrices with complex entries since these are the most relevant to QIT.
In contrast, much of the original motivation for RMT research came from statistics,
the setting in which the real case is more usual.

For $A \in M_n^{\mathrm{sa}}$, we denote by $(\lambda_i(A))_{1 \le i \le n}$ or simply $(\lambda_i)_{1 \le i \le n}$ the eigenvalues of
$A$, listed with multiplicities and arranged so that
\[ \lambda_1(A) \ge \lambda_2(A) \ge \dots \ge \lambda_n(A). \tag{6.28} \]
The empirical spectral distribution of $A$, denoted by $\mu_{\mathrm{sp}}(A)$, is the probability
measure obtained as the uniform measure over the spectrum of $A$. More formally
\[ \mu_{\mathrm{sp}}(A) = \frac{1}{n} \sum_{i=1}^n \delta_{\lambda_i(A)}, \tag{6.29} \]
which is clearly independent of the order of eigenvalues. Obviously, if the matrix
$A$ is random, the corresponding empirical spectral distribution is also random. We
are interested in giving a description of the typical shape of this random measure.

6.2.1. $\infty$-Wasserstein distance. At least two kinds of RMT limit theorems
are relevant for quantum information theory: fine information about the extreme
eigenvalues (or about the operator norm) and large-scale information about the
entire spectrum. These two possible perspectives are known in RMT as “local”
vs. “global” regimes. In order to encompass both aspects, we find it convenient to
introduce the $\infty$-Wasserstein distance between probability measures on $\mathbb{R}$.
Definition 6.19. Let $\mu_1, \mu_2$ be probability measures on $\mathbb{R}$. The $\infty$-Wasserstein
distance is defined as
\[ d_\infty(\mu_1, \mu_2) := \inf \|X_1 - X_2\|_{L_\infty}, \tag{6.30} \]
with infimum over all couples $(X_1, X_2)$ of random variables with (marginal) laws $\mu_1$
and $\mu_2$, defined on a common probability space. Similarly, if $Y_1, Y_2$ are real random
variables, we will mean by $d_\infty(Y_1, Y_2)$ the $\infty$-Wasserstein distance between the laws
of $Y_1$ and $Y_2$.

The definition of $\infty$-Wasserstein distance immediately extends to probability
measures on a metric space $(E, d)$ if we interpret in (6.30) the quantity $\|X_1 - X_2\|_{L_\infty}$
as the smallest $\Delta$ such that $\mathbf{P}(d(X_1, X_2) \le \Delta) = 1$. Similarly, replacing the
$L_\infty$-norm by the $L_p$-norm leads to the $p$-Wasserstein distance $d_p$, with the “finite $p$”
case (and particularly $p = 1, 2$) being much more intensively studied than $p = \infty$.
The metric $d_1$ is also known, particularly in the computer science community, as
the Earth Mover’s distance.
We note the following inequality (cf. Exercise 6.24): whenever $f : \mathbb{R} \to \mathbb{R}$ is an
$L$-Lipschitz function and $X, Y$ are random variables, then
\[ |\mathbf{E} f(X) - \mathbf{E} f(Y)| \le L\, d_\infty(X, Y). \tag{6.31} \]

The $\infty$-Wasserstein distance can be computed from cumulative distribution
functions: if $F_X(t) = \mathbf{P}(X \le t)$, then
\[ d_\infty(X, Y) = \inf\{\varepsilon > 0 : F_X(t - \varepsilon) \le F_Y(t) \le F_X(t + \varepsilon) \text{ for all } t \in \mathbb{R}\}. \tag{6.32} \]
Note the similarity with the definition of Lévy distance $d_L$, which metrizes the weak
convergence
\[ d_L(X, Y) = \inf\{\varepsilon > 0 : F_X(t - \varepsilon) - \varepsilon \le F_Y(t) \le F_X(t + \varepsilon) + \varepsilon \text{ for all } t \in \mathbb{R}\}. \]
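For empirical measures the $\infty$-Wasserstein distance is easy to compute. The sketch below relies on a standard fact that is not stated in the text and is therefore an assumption here: for two uniform measures on $n$ real atoms each, the monotone coupling (matching the $i$-th smallest atom of one with the $i$-th smallest atom of the other) attains the infimum in (6.30). The function name is illustrative.

```python
def wasserstein_inf(xs, ys):
    """d_infinity between the uniform measures on two equal-size lists of reals.

    Assumes that the monotone (sorted) coupling is optimal, which is a standard
    fact for measures on the real line.
    """
    if len(xs) != len(ys):
        raise ValueError("both samples must contain the same number of atoms")
    return max(abs(x - y) for x, y in zip(sorted(xs), sorted(ys)))

print(wasserstein_inf([0.0, 1.0], [1.5, 0.5]))             # atoms shifted by 1/2
print(wasserstein_inf([3.0, 1.0, 2.0], [1.0, 2.0, 3.0]))   # same measure, different order
```

Applied to the spectra of two Hermitian matrices of the same size, this is the quantity $d_\infty(\mu_{\mathrm{sp}}(A), \mu_{\mathrm{sp}}(B))$ appearing in Exercise 6.30.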


The following lemma is elementary, but it will be crucial for our purposes.
Lemma 6.20. Let $Z$ be a random variable distributed according to a measure
$\nu_Z$, with support equal to some bounded interval $[a, b]$. If $(Y_n)$ is a sequence of
random variables, the following are equivalent:
(1) $d_\infty(Y_n, Z) \to 0$,
(2) $Y_n \to Z$ weakly and $\sup Y_n \to b$, $\inf Y_n \to a$.
By inf and sup we really mean here essential inf and sup. Note that the
hypothesis on the support is vital: the equivalence fails if the support is not connected
(see Exercise 6.29).
Proof. Since $d_L \le d_\infty$, convergence in $\infty$-Wasserstein distance implies weak
convergence. Moreover we have $|\sup Y_n - \sup Z| \le d_\infty(Y_n, Z)$ and similarly for the
infima, and therefore (1) implies (2).
Conversely, assume (2). Given $\varepsilon > 0$, choose $a = x_0 < x_1 < \dots < x_r = b$ such
that $x_{j+1} - x_j < \varepsilon$ and such that, for $0 < j < r$, $x_j$ is a continuity point of $F_Z$
(such points are dense in $\mathbb{R}$). The hypothesis on the support of $\nu_Z$ implies that
$F_Z$ is strictly increasing on $[a, b]$, so that there exists $\alpha > 0$ with the property that
$F_Z(x_j) \ge F_Z(x_{j-1}) + \alpha$ for $0 < j \le r$. For $n$ large enough, we have $\inf Y_n > a - \varepsilon$,
$\sup Y_n < b + \varepsilon$ and $|F_{Y_n}(x_j) - F_Z(x_j)| < \alpha$ for any $0 < j < r$ (using the fact that
$F_Z$ is continuous at $x_j$). These conditions imply that for any real number $t$,
\[ F_Z(t - 2\varepsilon) \le F_{Y_n}(t) \le F_Z(t + 2\varepsilon) \]
and therefore $d_\infty(Y_n, Z) \le 2\varepsilon$. □
Remark 6.21. The proof of Lemma 6.20 gives actually the following: a
neighbourhood basis around $\nu_Z$ for the topology induced by $d_\infty$ is given by $(V_\varepsilon)_{\varepsilon > 0}$,
where $V_\varepsilon$ is the set of probability measures $\mu$ satisfying the condition
\[ \max\big( d_L(\mu, \nu_Z),\ |\sup \mu - \sup \nu_Z|,\ |\inf \mu - \inf \nu_Z| \big) < \varepsilon, \]
where by $\inf \nu$ and $\sup \nu$ we denote the infimum and supremum of the support of a
measure $\nu$.
Exercise 6.24 ($\infty$-Wasserstein distance and Lipschitz functions). Show the
stronger version of (6.31): If $f : \mathbb{R} \to \mathbb{R}$ is an $L$-Lipschitz function, then
$|\mathbf{E} f(X) - \mathbf{E} f(Y)| \le L\, d_1(X, Y)$.
Exercise 6.25. Show that if $f : \mathbb{R} \to \mathbb{R}_+$ is an $L$-Lipschitz function and
$d_\infty(X, Y) \le \varepsilon$, then $\mathbf{E} f(Y) \ge \mathbf{E} g(X)$, where $g = (f - L\varepsilon)^+$.
Exercise 6.26 ($\infty$-Wasserstein distance via cumulative distribution functions).
Prove the alternate formula (6.32) for the $\infty$-Wasserstein distance.
Exercise 6.27 ($\infty$-Wasserstein distance and weak convergence). Show directly
that $d_\infty(Y_n, Z) \to 0$ implies the weak convergence $Y_n \to Z$, i.e., the convergence
$\mathbf{E} f(Y_n) \to \mathbf{E} f(Z)$ for any bounded continuous function $f : \mathbb{R} \to \mathbb{R}$.
Exercise 6.28. Show that under the hypotheses of Lemma 6.20, $d_\infty(Y_n, Z) \to 0$
implies the convergence $\mathbf{E} f(Y_n) \to \mathbf{E} f(Z)$ for any continuous function $f : \mathbb{R} \to \mathbb{R}$
(bounded or not). Show, by example, that this may be false when $Z$ is unbounded.
Exercise 6.29. Give an example showing that connectedness is important in
Lemma 6.20.
Exercise 6.30. Show that if $A, B \in M_n^{\mathrm{sa}}$, then $d_\infty(\mu_{\mathrm{sp}}(A), \mu_{\mathrm{sp}}(B)) \le \|A - B\|_{\mathrm{op}}$.

6.2.2. The Gaussian Unitary Ensemble.

6.2.2.1. Definition of GUE. Recall that the space $M_n^{\mathrm{sa}}$ of complex Hermitian
$n \times n$ matrices can be considered as a real Euclidean space when equipped with
the Hilbert–Schmidt inner product. We denote by $\mathrm{GUE}(n)$ (Gaussian Unitary
Ensemble) the distribution of the standard Gaussian vector in $M_n^{\mathrm{sa}}$ (see Appendix
A). When a random matrix $A$ has distribution $\mathrm{GUE}(n)$, we say simply that $A$ is a
$\mathrm{GUE}(n)$ matrix. Here are some other equivalent descriptions of $\mathrm{GUE}(n)$ matrices
(see Exercise 6.31).
(1) The density of $\mathrm{GUE}(n)$ is $c_n e^{-\operatorname{Tr} X^2/2}$ for $X \in M_n^{\mathrm{sa}}$, where $c_n$ is the
appropriate normalization constant.
(2) Let $C \in M_n$ be a random matrix with independent $N_{\mathbb{C}}(0, 1)$ entries (see
Appendix A). Then the matrix $A = (C + C^\dagger)/\sqrt{2}$ is a $\mathrm{GUE}(n)$ matrix.
(3) $A = (a_{ij}) \in M_n^{\mathrm{sa}}$ is a $\mathrm{GUE}(n)$ matrix if and only if the random variables
$(a_{ij})_{1 \le i \le j \le n}$ are independent, the random variable $a_{ij}$ having distribution
$N_{\mathbb{C}}(0, 1)$ when $i \ne j$ and $N_{\mathbb{R}}(0, 1)$ when $i = j$.
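Description (2) translates directly into a sampling procedure. The sketch below is a minimal illustration using only the standard library; it assumes the convention that an $N_{\mathbb{C}}(0, 1)$ variable has independent real and imaginary parts, each of variance $1/2$ (the text defers the definition to Appendix A), and the function name, matrix size and seed are arbitrary.

```python
import math
import random

def sample_gue(n, rng):
    """Sample A = (C + C^dagger)/sqrt(2), where C has i.i.d. N_C(0,1) entries."""
    s = math.sqrt(0.5)  # assumed convention: real and imaginary parts are N(0, 1/2)
    c = [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]
         for _ in range(n)]
    return [[(c[i][j] + c[j][i].conjugate()) / math.sqrt(2.0) for j in range(n)]
            for i in range(n)]

n = 50
rng = random.Random(4)
a = sample_gue(n, rng)

# For a standard Gaussian vector in a real space of dimension n^2, the squared
# Hilbert-Schmidt norm has expectation n^2, so this ratio should be near 1.
hs_norm_sq = sum(abs(a[i][j]) ** 2 for i in range(n) for j in range(n))
print("normalized squared Hilbert-Schmidt norm:", hs_norm_sq / n ** 2)
```

The diagonal entries produced this way are real with variance 1 and the off-diagonal entries are complex with $\mathbf{E}|a_{ij}|^2 = 1$, in agreement with description (3).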
The GUE has the property of unitary invariance: if $A \in M_n$ is a $\mathrm{GUE}(n)$
matrix, then, for any fixed $U \in \mathsf{U}(n)$, the random matrix $U A U^\dagger$ is also a $\mathrm{GUE}(n)$
matrix.
Although it plays almost no role in our approach, an important feature of
natural unitarily invariant models is that there are explicit formulas for the density
of eigenvalues (see also Exercise 6.32).
Proposition 6.22 (Ginibre formula, not proved here). Let $A$ be a $\mathrm{GUE}(n)$
matrix, and $\lambda(A) = (\lambda_i)_{1 \le i \le n}$ be the spectrum of $A$, arranged in the non-increasing
order. Then the density of the random vector $\lambda(A)$ is given by
\[ c_n\, \mathbf{1}_{\{\lambda_1 \ge \dots \ge \lambda_n\}}\, e^{-\frac{1}{2} \sum_{i=1}^n \lambda_i^2} \prod_{1 \le i < j \le n} (\lambda_i - \lambda_j)^2, \]
where $c_n$ is the appropriate normalization constant.
The real-valued companion to the GUE is the Gaussian Orthogonal Ensemble or
GOE, which corresponds to the standard Gaussian vector in the space of self-adjoint
real matrices (up to normalization, see Section 6.2.4). The Gaussian Symplectic
Ensemble (GSE) similarly corresponds to the standard Gaussian vector in the space
of quaternionic Hermitian matrices.
For some arguments, it is important to introduce what we call the $\mathrm{GUE}_0(n)$
ensemble, which is the GUE ensemble conditioned to have trace zero. In other words,
$G_0$ is a $\mathrm{GUE}_0(n)$ matrix if it has the distribution of a standard Gaussian vector
in the hyperplane $H \subset M_n^{\mathrm{sa}}$ of trace zero Hermitian matrices. If $A$ is a $\mathrm{GUE}(n)$
matrix, then $A_0 := A - \frac{\operatorname{Tr} A}{n} I$ is a $\mathrm{GUE}_0(n)$ matrix. Note that the coefficient $\frac{\operatorname{Tr} A}{n}$
has distribution $N(0, 1/n)$ and is independent of $A_0$.


Exercise 6.31. Show that (1), (2) and (3) provide equivalent definitions of
$\mathrm{GUE}(n)$.
Exercise 6.32 (Characterization of $\mathrm{GUE}(n)$). Show that $\mathrm{GUE}(n)$ is the only
unitarily invariant distribution on $M_n^{\mathrm{sa}}$ for which the formula from Proposition 6.22
holds.

6.2.2.2. Limit theorems. The probability distribution that is the non-commutative analogue of the Gaussian distribution is the semicircular distribution (or semicircle law). The standard semicircular distribution $\mu_{\mathrm{SC}}$ is the probability distribution on $\mathbb{R}$ with support $[-2, 2]$ and with density
$$\frac{1}{2\pi}\sqrt{4 - x^2} \tag{6.33}$$
with respect to the Lebesgue measure. The even moments of the semicircular distribution are the Catalan numbers: for a nonnegative integer $p$, we have
$$\int_{-2}^{2} x^{2p} \, d\mu_{\mathrm{SC}}(x) = \frac{1}{p+1}\binom{2p}{p}. \tag{6.34}$$
In particular the variance equals 1. If $X$ is a random variable with distribution $\mu_{\mathrm{SC}}$, then for any $m \in \mathbb{R}$ and $\sigma \ge 0$, we denote by $\mu_{\mathrm{SC}(m,\sigma^2)}$ the distribution of $m + \sigma X$, called the semicircular distribution with mean $m$ and variance $\sigma^2$.
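The moment formula (6.34) can be checked numerically. The following is a minimal sketch, assuming only the density (6.33); the function names and the trapezoidal quadrature are illustrative choices, not part of the text.

```python
import math
import numpy as np

def semicircle_density(x):
    """Density (6.33) of the standard semicircular law, zero outside [-2, 2]."""
    return np.sqrt(np.clip(4.0 - x**2, 0.0, None)) / (2.0 * math.pi)

def semicircle_moment(k, num_points=200_001):
    """k-th moment of the semicircular law by the trapezoidal rule on [-2, 2]."""
    x = np.linspace(-2.0, 2.0, num_points)
    y = x**k * semicircle_density(x)
    dx = x[1] - x[0]
    return float(np.sum(0.5 * (y[:-1] + y[1:])) * dx)

def catalan(p):
    """p-th Catalan number, the right-hand side of (6.34)."""
    return math.comb(2 * p, p) // (p + 1)

# Even moments should match Catalan numbers; odd moments vanish by symmetry.
even_moments = [semicircle_moment(2 * p) for p in range(5)]
catalan_numbers = [catalan(p) for p in range(5)]
```

The case $p = 1$ recovers the statement that the variance equals 1.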


[Figure 6.1: histogram of the eigenvalues of $A_n/\sqrt{n}$; horizontal axis from $-2$ to $2$.]

Figure 6.1. The empirical eigenvalue distribution of a $\mathrm{GUE}(n)$ matrix $A_n$ for $n = 10000$ approaches the semicircular distribution.

The semicircular distribution appears as the limit spectral distribution of GUE
random matrices (see Figure 6.1).
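The picture behind Figure 6.1 can be reproduced with a short simulation. This is a sketch under the assumption that $\mathrm{GUE}(n)$ is normalized as the standard Gaussian vector in the space of Hermitian matrices (real $N(0,1)$ diagonal entries, off-diagonal entries with $\mathbf{E}|a_{ij}|^2 = 1$); the helper names and the choice $n = 300$ are illustrative.

```python
import numpy as np

def sample_gue(n, rng):
    """Sample a GUE(n) matrix: a standard Gaussian vector in the space of
    n x n Hermitian matrices equipped with the Hilbert-Schmidt inner product."""
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
    # Hermitian part, scaled so that diagonal entries are N(0, 1) and
    # off-diagonal entries satisfy E|a_ij|^2 = 1.
    return (g + g.conj().T) / np.sqrt(2.0)

def rescaled_spectrum(n, seed=0):
    """Eigenvalues of n^{-1/2} A_n for a single GUE(n) sample."""
    rng = np.random.default_rng(seed)
    return np.linalg.eigvalsh(sample_gue(n, rng) / np.sqrt(n))

eigs = rescaled_spectrum(300)
second_moment = float(np.mean(eigs**2))  # semicircle value: 1
fourth_moment = float(np.mean(eigs**4))  # semicircle value: 2
largest = float(np.max(np.abs(eigs)))    # expected to be close to 2
```

A histogram of `eigs` should resemble the semicircle density on $[-2, 2]$; the low moments give a quick quantitative comparison.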

Theorem 6.23 (Convergence of GUE spectrum towards the semicircular distribution, not proved here). For each $n$, let $A_n$ be a $\mathrm{GUE}(n)$ or $\mathrm{GUE}_0(n)$ matrix. After normalization, the sequence of empirical spectral distributions $(\mu_{\mathrm{sp}}(A_n))$ converges towards the semicircular distribution (with respect to the $\infty$-Wasserstein distance) in the following sense: for any $\varepsilon > 0$,
$$\lim_{n \to \infty} \mathbf{P}\big(d_\infty(\mu_{\mathrm{sp}}(n^{-1/2} A_n), \mu_{\mathrm{SC}}) > \varepsilon\big) = 0.$$

Using Lemma 6.20 (see also Remark 6.21), one checks that Theorem 6.23 brings together two facts, usually presented (and proved) independently in the RMT literature:
(1) The fact that the sequence $(\mu_{\mathrm{sp}}(n^{-1/2} A_n))$ of random empirical measures converges (weakly, in probability) towards the semicircle law, a result going back to Wigner.
(2) The convergence (in probability) of the largest and smallest eigenvalues of $n^{-1/2} A_n$ towards $\pm 2$. This requires a different and finer analysis, which we sketch in what follows.

Since $\mathrm{GUE}(n)$ is the standard Gaussian vector in $\mathsf{M}_n^{\mathrm{sa}}$, and by the duality between Schatten norms (see Proposition 1.17), the quantity $\mathbf{E}\,\|A_n\|$ is exactly the Gaussian mean width of $S_1^{n,\mathrm{sa}}$, the self-adjoint part of the unit ball for the trace norm. Although the order of magnitude of $\mathbf{E}\,\|A_n\|$ can be readily deduced from general principles (see Exercise 6.33), the derivation of the precise constant 2 requires more specialized arguments. However, once an appropriate bound such as (6.37) below is established, concentration of $\|A_n\|$ around its expectation is provided by Theorem 5.24 and gives the following estimates.
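Proposition 6.24. Let $A_n$ be a $\mathrm{GUE}(n)$ or $\mathrm{GUE}_0(n)$ matrix. Then, for any $\varepsilon > 0$,
$$\mathbf{P}\big(\lambda_1(n^{-1/2} A_n) \ge 2 + \varepsilon\big) \le \mathbf{P}\big(\|n^{-1/2} A_n\|_\infty \ge 2 + \varepsilon\big) \le \frac{1}{2}\exp\Big(-\frac{n\varepsilon^2}{2}\Big). \tag{6.35}$$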
Proof. Since $\|\cdot\|_\infty \le \|\cdot\|_{\mathrm{HS}}$, the function $\|\cdot\|_\infty$ is a 1-Lipschitz function. By Theorem 5.24 (recall that $\mathrm{GUE}(n)$ is the standard Gaussian vector in the space $\mathsf{M}_n^{\mathrm{sa}}$, and similarly for $\mathrm{GUE}_0(n)$ and the hyperplane of trace zero matrices), it follows that
$$\mathbf{P}\big(\|A_n\|_\infty \ge M + t\big) \le \frac{1}{2}\exp(-t^2/2), \tag{6.36}$$
where $M$ is the median of the random variable $\|A_n\|_\infty$. We claim that $M < 2\sqrt{n}$. This follows from two facts. First, we have the inequality
$$\mathbf{E}\,\|A_n\|_\infty < 2\sqrt{n} - 0.6\,n^{-1/6} < 2\sqrt{n}, \tag{6.37}$$
which was derived in Appendix F in [Sza05] (note that this inequality extends to the case of $\mathrm{GUE}_0(n)$ via Jensen's inequality). Second, it follows from Proposition 5.34 that the median of the random variable $\|A_n\|_\infty$ is smaller than its mean. Once we know that $M \le 2\sqrt{n}$, (6.35) follows by setting $t = \varepsilon\sqrt{n}$ and appealing to (6.36).
An alternative proof is to use directly (6.37) in combination with Theorem 5.25, but we opted for the argument above since, in our approach, concentration around the median is more elementary than that around the mean. □

Similar estimates also hold for the GOE. For example, if $A_n$ is a $\mathrm{GOE}(n)$ matrix, we have $\mathbf{E}\,\lambda_1(A_n) \le 2\sqrt{n}$ (see Exercise 6.48) and therefore
$$\mathbf{P}\big(\lambda_1(n^{-1/2} A_n) \ge 2 + \varepsilon\big) \le \frac{1}{2}\exp\Big(-\frac{n\varepsilon^2}{2}\Big).$$
We next note that if $A \in \mathsf{M}_n^{\mathrm{sa}}$, then $\|A\|_\infty = \max\{\lambda_1(A), -\lambda_n(A)\}$, and that, by symmetry of $\mathrm{GOE}(n)$, the distribution of $-\lambda_n(A_n)$ is the same as that of $\lambda_1(A_n)$. Combining these observations with the bound above yields
$$\mathbf{P}\big(\|n^{-1/2} A_n\|_\infty \ge 2 + \varepsilon\big) \le \exp\Big(-\frac{n\varepsilon^2}{2}\Big). \tag{6.38}$$

The bound from Proposition 6.24 can be improved for small values of $\varepsilon$ (the Tracy–Widom effect).
Proposition 6.25 (not proved here). Let $A_n$ be a $\mathrm{GUE}(n)$ or a $\mathrm{GOE}(n)$ matrix. Then for any $\varepsilon \in (0, 1)$,
$$\mathbf{P}\big(\lambda_1(n^{-1/2} A_n) \ge 2 + \varepsilon\big) \le C\exp(-c\,n\,\varepsilon^{3/2})$$
and
$$\mathbf{P}\big(\lambda_1(n^{-1/2} A_n) \le 2 - \varepsilon\big) \le C\exp(-c\,n^2\varepsilon^3),$$
for some absolute constants $C, c > 0$.


The main result of this section, Theorem 6.23, is formulated as an asymptotic statement. One can ask for a more quantitative version, or for a fixed-dimension bound.
Problem 6.26. If $A_n$ is a $\mathrm{GUE}(n)$, a $\mathrm{GUE}_0(n)$, or a $\mathrm{GOE}(n)$ matrix, what is the rate of convergence in $d_\infty(\mu_{\mathrm{sp}}(n^{-1/2} A_n), \mu_{\mathrm{SC}}) \to 0$? Proposition 6.25 suggests that the answer may be $\Theta(n^{-2/3})$. The convergence cannot be faster than $n^{-2/3}$ due to the Tracy–Widom effect; see Notes and Remarks. The same question can be asked about the Wishart matrices considered in the next section.
Exercise 6.33 (An elementary proof of boundedness of $\mathrm{GUE}(n)$). Using a net argument, show that if $A_n$ is a $\mathrm{GUE}(n)$ matrix, then $\|A_n\|_\infty \le C\sqrt{n}$ with large probability, where $C > 2$ is some universal constant.

Exercise 6.34. Show that the $\mathrm{GUE}(n)$ version of Theorem 6.23 implies the $\mathrm{GUE}_0(n)$ version.
6.2.3. Wishart matrices.
6.2.3.1. Definition of the Wishart ensemble. Let $n, s$ be nonzero integers. Let $B \in \mathsf{M}_{n,s}$ be a random matrix with independent $N_{\mathbb{C}}(0,1)$ entries. The random matrix $W = BB^\dagger \in \mathsf{M}_n^{\mathrm{sa}}$ is called a (complex) Wishart matrix and its distribution is denoted by $\mathrm{Wishart}(n,s)$. We often say simply that $W$ is a $\mathrm{Wishart}(n,s)$ matrix. The eigenvalues of $W$ are the squares of the singular values of $B$, so that statements about the spectrum of Wishart matrices are equivalent to statements about singular values of a random (rectangular) Gaussian matrix.

Here is an equivalent description: let $(G_1, \ldots, G_s)$ be $s$ independent copies of a standard complex Gaussian vector in the space $\mathbb{C}^n$. Then the matrix
$$W = \sum_{i=1}^{s} |G_i\rangle\langle G_i| \tag{6.39}$$
has distribution $\mathrm{Wishart}(n,s)$.
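The two descriptions of the Wishart ensemble, and the link between its eigenvalues and the singular values of $B$, are easy to confirm on a sample. The sketch below assumes that an $N_{\mathbb{C}}(0,1)$ variable has independent real and imaginary parts of variance $1/2$; the helper name `complex_gaussian` and the sizes are illustrative.

```python
import numpy as np

def complex_gaussian(shape, rng):
    """Array of independent N_C(0,1) entries (E|g|^2 = 1)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

rng = np.random.default_rng(0)
n, s = 4, 7
B = complex_gaussian((n, s), rng)

# First description: W = B B^dagger.
W = B @ B.conj().T

# Second description (6.39): sum of rank-one operators |G_i><G_i|,
# where G_i is the i-th column of B.
W_sum = sum(np.outer(B[:, i], B[:, i].conj()) for i in range(s))

# Eigenvalues of W versus squared singular values of B.
eig_W = np.sort(np.linalg.eigvalsh(W))
sv_sq = np.sort(np.linalg.svd(B, compute_uv=False) ** 2)

# B^dagger B is s x s and shares the nonzero eigenvalues of W.
eig_W2 = np.sort(np.linalg.eigvalsh(B.conj().T @ B))
```

Here `eig_W2` has $s - n$ additional eigenvalues that vanish up to rounding, in line with the remark that $BB^\dagger$ and $B^\dagger B$ share the same non-zero eigenvalues.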

The rank of a $\mathrm{Wishart}(n,s)$ matrix is almost surely equal to $\min(n,s)$. In the following we often assume that $s \ge n$, i.e., that the Wishart matrices are almost surely positive definite. This is not really a restriction since the case $s < n$ can be covered by the following observation: if $B \in \mathsf{M}_{n,s}$ is a random matrix with independent $N_{\mathbb{C}}(0,1)$ entries, then $W_1 = BB^\dagger$ is a $\mathrm{Wishart}(n,s)$ matrix while $W_2 = B^\dagger B$ is a $\mathrm{Wishart}(s,n)$ matrix (because the $N_{\mathbb{C}}(0,1)$ distribution is invariant under complex conjugation), and the matrices $W_1$ and $W_2$ share the same non-zero eigenvalues.
One can also consider the real version of Wishart matrices by starting with $G_1, \ldots, G_s$ that are standard Gaussian vectors on $\mathbb{R}^n$ rather than on $\mathbb{C}^n$ (see Section 6.2.4). This is the setup which has a long history due to the fact that it is frequently encountered in statistics.
6.2.3.2. Limit theorems. What does the spectrum of large Wishart matrices look like? Before answering this question, it might be useful to have in mind the following elementary result from probability theory, which can be considered as the commutative analogue of a $\mathrm{Wishart}(n,s)$ matrix (think of $p$ as $1/n$).
Let $X$ be a random variable following a binomial distribution of parameters $s \in \mathbb{N}$ and $p \in (0,1)$ (this means that $X$ has the same distribution as the sum of $s$ independent Bernoulli random variables taking values 1 with probability $p$ and 0 with probability $1-p$). We then have
Fact 6.27 (easy). When $s$ tends to infinity and $p$ tends to 0, then
(i) If $\alpha = \lim sp$ exists in $(0, \infty)$, then $X$ converges (weakly) towards a Poisson distribution of parameter $\alpha$.
(ii) If $\lim sp = \infty$, then $(X - sp)/\sqrt{sp}$ converges (weakly) towards a standard Gaussian distribution.
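Part (i) of Fact 6.27 can be illustrated by comparing probability mass functions directly. The following sketch fixes $sp = \alpha$ and computes a truncated total variation distance between the binomial and Poisson laws; the truncation level `kmax` and the function names are illustrative assumptions.

```python
import math

def binomial_pmf(k, s, p):
    """P(X = k) for X binomial with parameters s and p."""
    return math.comb(s, k) * p**k * (1.0 - p) ** (s - k)

def poisson_pmf(k, alpha):
    """P(Y = k) for Y Poisson with parameter alpha."""
    return math.exp(-alpha) * alpha**k / math.factorial(k)

def truncated_tv_distance(s, alpha, kmax=40):
    """Total variation distance between Binomial(s, alpha/s) and Poisson(alpha),
    restricted to k <= kmax (the tail is negligible for moderate alpha)."""
    p = alpha / s
    return 0.5 * sum(
        abs(binomial_pmf(k, s, p) - poisson_pmf(k, alpha)) for k in range(kmax + 1)
    )

# With alpha = sp held fixed, the distance shrinks as s grows.
distances = [truncated_tv_distance(s, 2.0) for s in (10, 100, 1000)]
```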
In the non-commutative context, we replace independent Bernoulli variables by free Bernoulli variables. The resulting limit laws are the so-called free Poisson distribution and, again, the semicircular distribution given by (6.33). Free probability theory is beyond the scope of this book (see Section 6.2.5 for a brief introduction) and so, rather than defining freeness, we will explain the heuristics relating it to RMT.
In non-commutative probability theory, a Bernoulli variable with parameter $p = \frac{1}{n}$ can be represented as a random rank 1 projection on $\mathbb{C}^n$ (i.e., uniformly distributed on $\mathrm{Gr}(n,1)$; more generally, we may consider a random rank $p \dim \mathcal{H}$ projection on $\mathcal{H}$). According to a fundamental paradigm of free probability, freeness is realized as a large dimension limit of independent matrix ensembles. Accordingly, the RMT model to consider is
$$X = \sum_{i=1}^{s} |\psi_i\rangle\langle\psi_i|, \tag{6.40}$$
where the vectors $\psi_i$ are i.i.d. and uniformly distributed on the sphere in $\mathbb{C}^n$ and $n, s \to \infty$. Since, for large $n$, the standard Gaussian vector on $\mathbb{C}^n$ is close to being uniformly distributed on the sphere of radius $n^{1/2}$ (see Corollary 5.27), it follows that $X$ is close to the appropriately rescaled Wishart random matrix given by (6.39) (see Exercise 6.37). Consequently, the limiting behavior that is the non-commutative analogue of Fact 6.27 can be retrieved from the results on spectral properties of $\mathrm{Wishart}(n,s)$ as $n, s \to \infty$. Such results have been known for quite a while, even if the full extent of the analogy and the identification of the limit laws as the free analogues of the Poisson and normal distributions had to await the development of the language of free probability.
To make the limit results for Wishart matrices more tangible, we need to describe explicitly what the free Poisson distributions are. They originally appeared in RMT as Marčenko–Pastur distributions. First, for $\lambda > 0$, we let $x_\pm = (1 \pm \sqrt{\lambda})^2$ and define a function supported on $[x_-, x_+]$ by
$$f_\lambda(x) = \frac{\sqrt{(x - x_-)(x_+ - x)}}{2\pi x}\,\mathbf{1}_{[x_-, x_+]}(x).$$
The Marčenko–Pastur (a.k.a. free Poisson) distribution with parameter $\lambda$, denoted $\mu_{\mathrm{MP}(\lambda)}$, is then defined by
$$\mu_{\mathrm{MP}(\lambda)} = (1 - \lambda)_+ \,\delta_0 + f_\lambda \, dx, \tag{6.41}$$
where $\delta_0$ denotes a Dirac mass at 0 and $f\,dx$ is the measure whose density (with respect to the Lebesgue measure) is $f$.
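As a sanity check on (6.41), one can integrate $f_\lambda$ numerically and confirm that the total mass is 1 and that the mean and variance both equal $\lambda$ (the content of Exercise 6.35). This is a sketch with illustrative function names, using a midpoint rule so that the endpoints of the support are never evaluated.

```python
import math
import numpy as np

def mp_edges(lam):
    """Endpoints x_- and x_+ of the support of f_lambda."""
    return (1.0 - math.sqrt(lam)) ** 2, (1.0 + math.sqrt(lam)) ** 2

def mp_density(x, lam):
    """Continuous part f_lambda of the Marchenko-Pastur law, for x > 0."""
    x_minus, x_plus = mp_edges(lam)
    inside = np.clip((x - x_minus) * (x_plus - x), 0.0, None)
    return np.sqrt(inside) / (2.0 * math.pi * x)

def mp_integral(k, lam, num_cells=400_000):
    """Integral of x^k f_lambda(x) over [x_-, x_+] by the midpoint rule."""
    x_minus, x_plus = mp_edges(lam)
    h = (x_plus - x_minus) / num_cells
    x = x_minus + (np.arange(num_cells) + 0.5) * h
    return float(np.sum(x**k * mp_density(x, lam)) * h)

def mp_summary(lam):
    """Total mass (including the atom at 0), mean and variance of mu_MP(lambda)."""
    atom = max(1.0 - lam, 0.0)          # the atom at 0 does not affect moments
    mass = atom + mp_integral(0, lam)
    mean = mp_integral(1, lam)
    variance = mp_integral(2, lam) - mean**2
    return mass, mean, variance

summaries = {lam: mp_summary(lam) for lam in (0.5, 2.0, 4.0)}
```

For $\lambda = 0.5$ the continuous part carries only half of the mass, and the atom $(1-\lambda)_+\delta_0$ supplies the rest.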
[Figure 6.2: plots of the density $f_\lambda(x)$ for $\lambda = 1$ (support $[0, 4]$) and $\lambda = 2$ (support $[x_-, x_+]$).]

Figure 6.2. Marčenko–Pastur densities for $\lambda = 1$ and $\lambda = 2$.

Theorem 6.28 (not proved here). Consider a sequence of indices $(n,s)$ which tend to infinity in such a way that $\lambda = \lim s/n \in [1, \infty)$ exists. For each $(n,s)$, let $W_{n,s}$ be a $\mathrm{Wishart}(n,s)$ matrix. After renormalization, the sequence of random empirical spectral distributions $(\mu_{\mathrm{sp}}(W_{n,s}))$ converges in probability towards the Marčenko–Pastur distribution $\mu_{\mathrm{MP}(\lambda)}$ with respect to the $\infty$-Wasserstein distance: for any $\varepsilon > 0$,
$$\lim_{(n,s) \to \infty} \mathbf{P}\big(d_\infty(\mu_{\mathrm{sp}}(n^{-1} W_{n,s}), \mu_{\mathrm{MP}(\lambda)}) > \varepsilon\big) = 0.$$
The alternative normalization $s^{-1} W_{n,s} = (s^{-1/2}B)(s^{-1/2}B)^\dagger$ converges towards a rescaled Marčenko–Pastur distribution with support $[(1 - 1/\sqrt{\lambda})^2, (1 + 1/\sqrt{\lambda})^2]$. For large $\lambda$, this shows that the matrix $s^{-1/2}B$ is an almost isometric embedding from $\mathbb{C}^n$ into $\mathbb{C}^s$, all singular values being close to 1.

As explained earlier, a similar result follows formally in the case $\lambda \in (0,1)$. However, some care is needed in the formulation, since the atomic part in the Marčenko–Pastur distribution is supported outside of the continuous part, and this lack of connectedness may prevent convergence with respect to the $\infty$-Wasserstein distance (cf. Lemma 6.20 and Exercises 6.29 and 6.36).
In the case where the ratio $s/n$ tends to infinity, the limiting Marčenko–Pastur distribution degenerates into a semicircular distribution, in the same way that a Poisson distribution with a large parameter is almost Gaussian.

Theorem 6.29 (not proved here). Consider a sequence of indices $(n,s)$ which both tend to infinity in such a way that $\lim s/n = \infty$. For each $(n,s)$, let $W_{n,s}$ be a $\mathrm{Wishart}(n,s)$ matrix. After renormalization and recentering, the sequence of empirical spectral distributions $(\mu_{\mathrm{sp}}(W_{n,s}))$ converges in probability towards the semicircular distribution $\mu_{\mathrm{SC}}$ with respect to the $\infty$-Wasserstein distance, in the following sense: for any $\varepsilon > 0$,
$$\lim_{n \to \infty} \mathbf{P}\big(d_\infty(\mu_{\mathrm{sp}}(A_{n,s}), \mu_{\mathrm{SC}}) > \varepsilon\big) = 0,$$
where $A_{n,s}$ stands for $\frac{1}{\sqrt{ns}}(W_{n,s} - s\,\mathrm{I})$.
As in Theorem 6.23, the assertion of convergence in $\infty$-Wasserstein distance in Theorems 6.28 and 6.29 subsumes both global convergence of the spectrum towards the limit distribution, and convergence of the extreme eigenvalues towards the edges of the limit distribution (see Proposition 6.33).
Our last limit theorem deals with partial transposition of Wishart matrices. As we shall see, the partial transposition dramatically changes the limit behavior. Note that the distributions $\mathrm{MP}(\lambda)$ and $\mathrm{SC}(\lambda, \lambda)$ which appear in Theorems 6.28 and 6.30 have the same mean and the same variance (see Exercise 6.35). This was to be expected since the partial transposition preserves both the trace and the Hilbert–Schmidt norm.
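The invariance of the trace and of the Hilbert–Schmidt norm under partial transposition can be verified on a small example. The sketch below assumes the convention that the transpose acts on the second tensor factor and that $\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ is indexed in row-major order; the function name is illustrative. It also shows, on a maximally entangled state, that positivity is not preserved.

```python
import numpy as np

def partial_transpose(m, d1, d2):
    """Partial transpose (on the second factor) of an operator on C^{d1} (x) C^{d2},
    given as a (d1*d2) x (d1*d2) matrix with row-major tensor indexing."""
    t = m.reshape(d1, d2, d1, d2)            # indices (i, j, k, l)
    return t.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)  # swap j and l

rng = np.random.default_rng(0)
d, s = 3, 5
B = (rng.standard_normal((d * d, s)) + 1j * rng.standard_normal((d * d, s))) / np.sqrt(2.0)
W = B @ B.conj().T                            # a Wishart(d^2, s) sample
W_gamma = partial_transpose(W, d, d)

# A maximally entangled state on C^2 (x) C^2: its partial transpose has a
# negative eigenvalue, although trace and Hilbert-Schmidt norm are unchanged.
bell = np.zeros(4)
bell[0] = bell[3] = 1.0 / np.sqrt(2.0)
bell_gamma = partial_transpose(np.outer(bell, bell), 2, 2)
bell_min_eig = float(np.min(np.linalg.eigvalsh(bell_gamma)))
```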
Theorem 6.30 (not proved here). Consider a sequence of indices $(d,s)$ which tend to infinity in such a way that $\lambda = \lim s/d^2 \in (0, \infty)$ exists. For each $d, s$, let $W_{d^2,s}$ be a $\mathrm{Wishart}(d^2, s)$ random matrix (considered as an operator on $\mathbb{C}^d \otimes \mathbb{C}^d$) and $W_{d^2,s}^\Gamma$ its partial transpose. Then, for any $\varepsilon > 0$,
$$\lim_{(d,s) \to \infty} \mathbf{P}\big(d_\infty(\mu_{\mathrm{sp}}(d^{-2} W_{d^2,s}^\Gamma), \mu_{\mathrm{SC}(\lambda,\lambda)}) > \varepsilon\big) = 0,$$
where $\mu_{\mathrm{SC}(\lambda,\lambda)}$ denotes the semicircular distribution with mean $\lambda$ and variance $\lambda$.
Exercise 6.35. Verify that (6.41) does indeed define a probability distribution both for $\lambda \ge 1$ and for $0 < \lambda < 1$, and that the expected value and the variance of the corresponding random variable are both equal to $\lambda$.

Exercise 6.36. Check that $f_\lambda(\lambda x) = f_{1/\lambda}(x)$. Use this to deduce from Theorem 6.28 that the weak convergence of $\mu_{\mathrm{sp}}(\frac{1}{n} W_{n,s})$ towards $\mu_{\mathrm{MP}(\lambda)}$ holds for any $\lambda > 0$.
Exercise 6.37 (Spherical variant of Wishart ensemble). Deduce from Theorem 6.28 the following variant: if $X_{n,s}$ is defined as in (6.40) and $n, s$ tend to infinity with $\lim s/n = \lambda$, then $\mu_{\mathrm{sp}}(X_{n,s})$ converge towards $\mathrm{MP}(\lambda)$ (in probability, in $\infty$-Wasserstein distance).

Exercise 6.38 (The quartercircular distribution). Check that if $X$ has a standard semicircular distribution, then $X^2$ has a $\mathrm{MP}(1)$ distribution. In what sense can we say that the singular value distribution of a large random (non-Hermitian) square matrix $B$ with independent $N_{\mathbb{C}}(0,1)$ entries is given by a quartercircular distribution?
Exercise 6.39 (Free Poisson variables in the large $\lambda$ limit). (a) Show that if $X_\lambda$ has a $\mathrm{MP}(\lambda)$ distribution, then $(X_\lambda - \lambda)/\sqrt{\lambda}$ converges to the standard semicircular distribution with respect to the $\infty$-Wasserstein distance as $\lambda \to \infty$.
(b) Find a gap in the following argument, which purports to show that part (a) in combination with Theorem 6.28 implies Theorem 6.29.
By Theorem 6.28, the empirical spectral distribution of $W_{n,s}/n$ is approximately $X_\lambda$ (in the sense of the $\infty$-Wasserstein distance) if $s/n \approx \lambda$ and $n, s$ are large. Consequently $(X_\lambda - \lambda)/\sqrt{\lambda} \approx (X_\lambda - s/n)/\sqrt{s/n}$ is approximately the empirical spectral distribution of $(W_{n,s}/n - s/n)/\sqrt{s/n} = (W_{n,s} - s)/\sqrt{sn}$, which is exactly the assertion of Theorem 6.29.


6.2.3.3. Concentration of spectrum. In view of Theorem 6.28, it is natural to expect that the spectrum of a typical $\mathrm{Wishart}(n,s)$ matrix lies close to the interval $[(\sqrt{s} - \sqrt{n})^2, (\sqrt{s} + \sqrt{n})^2]$ (for $s \ge n$), or equivalently that all singular values of an $n \times s$ matrix with i.i.d. $N_{\mathbb{C}}(0,1)$ entries lie close to the interval $[\sqrt{s} - \sqrt{n}, \sqrt{s} + \sqrt{n}]$. A first result in this direction is a precise bound (without any multiplicative constants or error terms) for the expected largest singular value, i.e., the operator norm.
Proposition 6.31. Let $B$ be an $n \times s$ random matrix with independent $N_{\mathbb{C}}(0,1)$ entries. Then
$$\mathbf{E}\,\|B\|_{\mathrm{op}} \le \sqrt{n} + \sqrt{s}.$$
Proposition 6.31 will be deduced from its analogue for real Wishart matrices, which requires methods specific to that setting. Accordingly, we postpone its proof until Section 6.2.4.2.
In view of Proposition 6.31, it is natural to ask the following question, the answer to which is known to be affirmative in the real case (see Corollary 6.38). Recall that $s_n(B)$ denotes the smallest singular value of $B$.
Problem 6.32. Let $s \ge n$, and let $B$ be an $n \times s$ random matrix with independent $N_{\mathbb{C}}(0,1)$ entries. Do we have the inequality $\mathbf{E}\, s_n(B) \ge \sqrt{s} - \sqrt{n}$?
We now state a concentration result for the spectrum of Wishart matrices.
Proposition 6.33. Let $B$ be a random $n \times s$ matrix with independent $N_{\mathbb{C}}(0,1)$ entries. For every $t > 0$,
$$\mathbf{P}\big(\|B\|_{\mathrm{op}} \ge \sqrt{n} + \sqrt{s} + t\big) \le \frac{1}{2}\exp(-t^2). \tag{6.42}$$

? a
If s ą n, then for every t ą 4 2 log n{p s{n ´ 1q,
? ?
P sn pBq ď s ´ n ´ t ď expp´t2 {4q,
` ˘
(6.43)
where C and c denote absolute constants.
The above result is closely related to Proposition 6.24 and shares many of the ramifications of the latter. For example, while we know from the general theory of Gaussian concentration that the quantities in question are concentrated around some value, identifying that value requires a separate argument and may be hard. In particular, a positive answer to Problem 6.32 would imply the validity of (6.43) for all $t \ge 0$ and with the bound $\exp(-t^2)$.

Proof. The functions $\|\cdot\|_{\mathrm{op}}$ and $s_n$ are 1-Lipschitz with respect to the Hilbert–Schmidt norm on $\mathsf{M}_{n,s}$. Let $M$ be the median of $\|B\|_{\mathrm{op}}$. By combining Propositions 6.31 and 5.34, it follows that $M \le \sqrt{n} + \sqrt{s}$, and we deduce (6.42) by using the values from Table 5.2.
Let M 1 be the median of sn pBq. We claim that
? ?
? ? 2 s ` n log 2n

1
(6.44) M ě s´ n´ ? ? .
s´ n
? a
As before, using the values from Table 5.2, we get for t ą 4 2 log n{p s{n ´ 1q
? ? 1
Ppsn pBq ď s ´ n ´ tq ď Ppsn pBq ď M 1 ´ t{2q ď expp´t2 {4q.
We may obtain (6.44) as a consequence of the following inequality valid for any $t > 0$
$$\frac{1}{2}\exp(-tM'^2) \le \mathbf{E}\operatorname{Tr}\exp(-tBB^\dagger) \le n\exp\big(-(\sqrt{s} - \sqrt{n})^2 t + (s+n)t^2\big). \tag{6.45}$$
(The second inequality in (6.45) is not at all immediate to prove; it appears as Lemma 7.2 in [HT03].)


? We then use the optimal choice t “ ps ` nq logp2nq and
? ?
the inequality a ´ b ě a ´ b{ a (valid for a ě b). 
se

6.2.3.4. Random induced states. Wishart matrices are of interest in quantum theory since they lead to a very natural model of random quantum states. One possible way to generate a random state on $\mathbb{C}^n$ is to take independent unit vectors $(\psi_i)_{1 \le i \le s}$ distributed uniformly on the sphere and to consider the average of corresponding pure states, i.e.,
$$\rho = \frac{1}{s}\sum_{i=1}^{s} |\psi_i\rangle\langle\psi_i|.$$
This is exactly (6.40) up to normalization. However a closely related and often better model is to consider the partial trace of a Haar-distributed pure state on $\mathbb{C}^n \otimes \mathbb{C}^s$. We call states obtained that way random induced states.
Let us denote by $\mu_{n,s}$ the distribution of the induced state $\operatorname{Tr}_{\mathbb{C}^s} |\psi\rangle\langle\psi|$ when $\psi$ is uniformly distributed on the unit sphere in $\mathbb{C}^n \otimes \mathbb{C}^s$. The measure $\mu_{n,s}$ is a probability measure on the set $\mathrm{D}(\mathbb{C}^n)$ of states on $\mathbb{C}^n$. As the following simple fact shows, this measure is just a renormalization of $\mathrm{Wishart}(n,s)$.
Proposition 6.34 (Wishart matrices as induced states). Let $W$ be a random matrix with distribution $\mathrm{Wishart}(n,s)$. Then $\frac{W}{\operatorname{Tr} W}$ has distribution $\mu_{n,s}$ and is independent of $\operatorname{Tr} W$.
Proof. The Proposition follows from the combination of two facts. First, if $G$ is a standard Gaussian vector in any given Euclidean or Hilbert space $V$ (in our case $V = \mathbb{C}^n \otimes \mathbb{C}^s$), then the vector $\frac{G}{|G|}$ is uniformly distributed on the unit sphere of $V$ and is independent of $|G|$. Second, when we identify a tensor $\psi \in \mathbb{C}^n \otimes \mathbb{C}^s$ with a matrix $A \in \mathsf{M}_{n,s}$, we have (see Section 0.8)
$$\operatorname{Tr}_{\mathbb{C}^s} |\psi\rangle\langle\psi| = AA^\dagger. \qquad \square$$
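The two facts used in the proof translate directly into two ways of sampling an induced state, which agree sample by sample. The following is a minimal sketch; the function names are illustrative, and the identification of $\mathbb{C}^n \otimes \mathbb{C}^s$ with $n \times s$ matrices in row-major order is an assumption of the code.

```python
import numpy as np

def complex_gaussian(shape, rng):
    """Array of independent N_C(0,1) entries."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

def induced_state_from_wishart(g):
    """rho = W / Tr W with W = G G^dagger, as in Proposition 6.34."""
    w = g @ g.conj().T
    return w / np.trace(w).real

def induced_state_from_partial_trace(g):
    """Normalize G to a unit vector psi in C^n (x) C^s, then trace out C^s."""
    n, s = g.shape
    psi = (g / np.linalg.norm(g)).reshape(n * s)
    projector = np.outer(psi, psi.conj()).reshape(n, s, n, s)
    return np.einsum("ajbj->ab", projector)   # partial trace over the second factor

rng = np.random.default_rng(0)
G = complex_gaussian((3, 5), rng)
rho_wishart = induced_state_from_wishart(G)
rho_partial = induced_state_from_partial_trace(G)
```

Both outputs are density matrices on $\mathbb{C}^3$: Hermitian, positive semidefinite, and of trace 1.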
The normalization factor $\operatorname{Tr} W$ is very strongly concentrated around the value $ns$ (see Exercise 6.40). Therefore, it can be virtually treated as a constant when translating the results for Wishart matrices into the language of induced states. We have the following (recall that $\mu_{\mathrm{sp}}(A)$ is the empirical spectral distribution of a self-adjoint matrix $A$, see (6.29)).

Theorem 6.35. Given integers $n, s$, let $\rho_{n,s}$ be a random induced state with distribution $\mu_{n,s}$.
(i) If $n$ is fixed and $s$ tends to infinity, then $\sqrt{n(n-1)s}\,\big(\rho_{n,s} - \frac{\mathrm{I}}{n}\big)$ converges in distribution towards a $\mathrm{GUE}_0(n)$ matrix.
(ii) If $n$ tends to infinity and $\lim s/n = \lambda \in (0, \infty)$, then $\mu_{\mathrm{sp}}(s\rho_{n,s})$ converges weakly in probability towards $\mu_{\mathrm{MP}(\lambda)}$. Moreover, if $\lambda \ge 1$, then the convergence also holds in $\infty$-Wasserstein distance.
(iii) If both $n$ and $s/n$ tend to infinity, then $\mu_{\mathrm{sp}}\big(\sqrt{ns}\,(\rho_{n,s} - \mathrm{I}/n)\big)$ converges in probability in $\infty$-Wasserstein distance towards $\mu_{\mathrm{SC}}$.
ot
Recall that the empirical spectral distribution of a rescaled $\mathrm{GUE}_0$ matrix is almost semicircular (see Theorem 6.23), so that (i) and (iii) are indeed consistent. To deduce (ii) from Theorem 6.28 and (iii) from Theorem 6.29, use Proposition 6.34 and the bounds from Exercise 6.40. The statement (i) is more elementary (see Exercise 6.41).
Similarly, Proposition 6.33 can be restated as a result about the spectrum of random induced states or as a result about Schmidt coefficients of random pure states. We single it out as a separate statement since it will be used several times. Alternatively, a weaker statement follows from an elementary net argument (see Exercise 6.43).
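Proposition 6.36. For $n \le s$, let $\psi$ be a random vector uniformly distributed on the unit sphere of $\mathbb{C}^n \otimes \mathbb{C}^s$ and let $\lambda_1(\psi) \ge \cdots \ge \lambda_n(\psi)$ be its Schmidt coefficients. Then, for any $\varepsilon > 0$,
$$\mathbf{P}\Big(\lambda_1(\psi) \ge \frac{1}{\sqrt{n}} + \frac{1+\varepsilon}{\sqrt{s}}\Big) \le \exp(-n\varepsilon^2) \tag{6.46}$$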
? ?
and, for any ε ě C s log n{p ns ´ nq,
ˆ ˙
1 1`ε
P λn pψq ď ? ´ ? ď expp´cnε2 q,
n s
where c and C are absolute constants.
Proposition 6.36 can be deduced from Proposition 6.33 or proved in the same way using concentration of measure on the sphere (cf. Exercise 6.42). We also note that Proposition 6.36 can be equivalently restated using matrices instead of tensors: if $M \in \mathsf{M}_{n,s}$ is uniformly distributed on the Hilbert–Schmidt sphere, then with large probability all its singular values belong to the interval $\Big[\frac{1}{\sqrt{n}} - \frac{1+\varepsilon}{\sqrt{s}}, \frac{1}{\sqrt{n}} + \frac{1+\varepsilon}{\sqrt{s}}\Big]$.

When $s \ge n$, the probability measure $\mu_{n,s}$ has a density with respect to the Lebesgue measure on $\mathrm{D}(\mathbb{C}^n)$ which has a simple form
$$\frac{d\mu_{n,s}}{d\operatorname{vol}}(\rho) = \frac{1}{Z_{n,s}}(\det\rho)^{s-n}, \tag{6.47}$$
where $Z_{n,s}$ is a normalization factor. Note that formula (6.47) allows one to define the measure $\mu_{n,s}$ (in particular) for every real $s \ge n$, while the partial trace construction makes sense only for integer values of $s$. The explicit formula (6.47) will not be used in this book.
In the important special case where $s = n$, the density of the measure $\mu_{n,n}$ is constant: a random state distributed according to $\mu_{n,n}$ is distributed with respect to the uniform (Lebesgue) measure on $\mathrm{D}(\mathbb{C}^n)$. This can be seen as a non-commutative version of the following classical fact: if $\psi = (\psi_1, \ldots, \psi_n)$ is uniformly distributed on the unit sphere in $\mathbb{C}^n$, the vector $(|\psi_1|^2, \ldots, |\psi_n|^2)$ is uniformly distributed on the $(n-1)$-dimensional simplex.

Exercise 6.40 (Trace of a Wishart matrix). Let $W$ be a $\mathrm{Wishart}(n,s)$ matrix. Check that $2\operatorname{Tr} W$ has distribution $\chi^2(2ns)$ and deduce from Exercise 5.37 that for any $t > 0$
$$\mathbf{P}\big(|\operatorname{Tr} W - ns| > tns\big) \le 2\exp\Big(-\frac{nst^2}{2 + 4t/3}\Big).$$
Exercise 6.41. Use the multivariate central limit theorem to prove part (i) from Theorem 6.35.
Exercise 6.42 (Mean of the largest Schmidt coefficient). Let $\psi$ be a random vector uniformly distributed on $S_{\mathbb{C}^n \otimes \mathbb{C}^s}$. Deduce from (6.49) that $\mathbf{E}\,\lambda_1(\psi) \le \frac{\kappa_{2n} + \kappa_{2s}}{\kappa_{2ns}} \le \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{s}}$. Then prove (6.46).

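Exercise 6.43 (Elementary bounds on the spectrum of random induced states). Let $\rho$ be a random induced state with distribution $\mu_{n,s}$, i.e., $\rho = \operatorname{Tr}_{\mathbb{C}^s} |\psi\rangle\langle\psi|$ with $\psi$ uniformly distributed on $S_{\mathbb{C}^n \otimes \mathbb{C}^s}$.
(i) For any $y \in S_{\mathbb{C}^n}$, show that the function $f$ defined on $S_{\mathbb{C}^n \otimes \mathbb{C}^s}$ by $f(\psi) = \sqrt{\langle y|\rho|y\rangle}$ is 1-Lipschitz and that $\mathbf{E}\, f^2 = 1/n$. Conclude from Exercise 5.46 that, for any $t > 0$, $\mathbf{P}(|f - 1/\sqrt{n}| > t) \le (1 + e)\exp(-nst^2)$.
(ii) Let $\mathcal{N}$ be a $\delta$-net in $S_{\mathbb{C}^n}$ for $\delta < 1/2$. Denote $\Delta = \rho - \mathrm{I}/n$ and show that
$$\|\Delta\|_\infty \le \frac{1}{1 - 2\delta}\sup_{y \in \mathcal{N}} |\langle y|\Delta|y\rangle|.$$
(iii) Let $s \ge n$. Conclude that $\|\Delta\|_\infty \le C/\sqrt{ns}$ with high probability for some constant $C$.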

Exercise 6.44 (The limit distribution of the partial transpose). Let $\nu$ be the law of $XY$, where $X$ and $Y$ are independent random variables following the standard semicircular distribution. Let $\psi \in S_{\mathbb{C}^d \otimes \mathbb{C}^d}$ be a uniformly distributed random vector, and $A = d\,|\psi\rangle\langle\psi|^\Gamma$. (The partial transposition $\Gamma$ was defined in Section 2.2.6.) Show that, when $d$ tends to infinity, $\mu_{\mathrm{sp}}(A)$ converges in probability, in $\infty$-Wasserstein distance, towards $\nu$.
Exercise 6.45 (Low moments of Wishart matrices and expected purity of random induced states).
(i) Let $G$ be an $n \times s$ random matrix with independent $N_{\mathbb{C}}(0,1)$ entries. Show that $\mathbf{E}\operatorname{Tr}(GG^\dagger GG^\dagger) = n^2 s + s^2 n$ and that $\mathbf{E}(\operatorname{Tr} GG^\dagger)^2 = ns(ns+1)$.
(ii) Let $\rho$ be a random induced state with distribution $\mu_{n,s}$. Show that $\mathbf{E}\operatorname{Tr}\rho^2 = \frac{n+s}{ns+1}$.

6.2.4. Real RMT models and Chevet–Gordon inequalities. We consider now variants of the random matrix models introduced before, where the entries are real instead of complex. All the theorems stated for the GUE and for complex Wishart matrices carry over mutatis mutandis to the real case. One important modification that is worth pointing out is that in the density formula from Proposition 6.22 the factors $\lambda_i - \lambda_j$ are not squared, which makes certain arguments harder. However, the formulas in question play almost no role in our approach. On the other hand, some other tools, most notably the analysis via Gaussian processes, are more adapted to the real setting.
The Gaussian Orthogonal Ensemble (GOE) is the real version of the GUE. A random matrix $A$ has the $\mathrm{GOE}(n)$ distribution if the random variables $(a_{ij})_{1 \le i \le j \le n}$ are independent, with $a_{ii}$ having the $N(0,2)$ distribution and $a_{ij}$ (for $i \ne j$) having the $N(0,1)$ distribution. This normalization is chosen so that the distribution is invariant under conjugacy by an orthogonal matrix. Note also that $A/\sqrt{2}$ is a standard Gaussian vector in the space $\mathsf{M}_n^{\mathrm{sa}}$.
Real Wishart matrices are then defined exactly as their complex analogues: if $B$ is an $n \times s$ random matrix with independent $N(0,1)$ entries, the distribution of $W = BB^\dagger$ is denoted by $\mathrm{Wishart}_{\mathbb{R}}(n,s)$.
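The normalization of $\mathrm{GOE}(n)$ described above can be realized by symmetrizing a square Gaussian matrix. The sketch below is one possible construction (the function name and the sample size are illustrative); it also checks empirically the entry variances and the location of the largest eigenvalue.

```python
import numpy as np

def sample_goe(n, rng):
    """Sample a GOE(n) matrix: symmetric, with N(0,2) diagonal entries and
    N(0,1) off-diagonal entries, independent for i <= j."""
    g = rng.standard_normal((n, n))
    # (g_ij + g_ji)/sqrt(2) has variance 1 off the diagonal and 2 on it.
    return (g + g.T) / np.sqrt(2.0)

rng = np.random.default_rng(0)
n = 200
A = sample_goe(n, rng)

diag_var = float(np.var(np.diag(A)))                      # close to 2
offdiag_var = float(np.var(A[np.triu_indices(n, k=1)]))   # close to 1
top_eig = float(np.linalg.eigvalsh(A)[-1] / np.sqrt(n))   # close to 2
```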
In both settings, an argument based on Gordon's lemma (Proposition 6.7) allows for concise proofs of precise inequalities. This scheme actually allows obtaining sharp bounds on the norm of a random matrix as an operator between any two real normed spaces. The basic ingredient is a contraction property of the tensor product map which holds only in the real case (Exercise 6.47).

6.2.4.1. Chevet–Gordon inequalities.
Proposition 6.37 (Chevet–Gordon inequalities). Let $B \in \mathsf{M}_{n,s}$ be a random matrix with independent $N(0,1)$ entries. Let $K \subset \mathbb{R}^s$ and $L \subset S^{n-1}$ be compact sets, and $r_K \ge 0$ such that $K \subset r_K B_2^s$. Then
$$w_G(K) - r_K w_G(L) \le \mathbf{E}\min_{u \in L}\max_{t \in K}\langle Bt, u\rangle \le \mathbf{E}\max_{u \in L}\max_{t \in K}\langle Bt, u\rangle \le w_G(K) + r_K w_G(L).$$
Note that the upper bound in Proposition 6.37 is always sharp up to a factor of 2 (see Exercise 6.46).


Proof. Let $G$ be a standard Gaussian vector in $\mathbb{R}^s \oplus \mathbb{R}^n$. We are going to compare the following Gaussian processes indexed by $(t,u) \in K \times L$,
$$X_{t,u} = \langle Bt, u\rangle, \qquad Y_{t,u} = \langle G, t \oplus r_K u\rangle.$$
One checks (see Exercise 6.47(ii)) that for $(t,u), (t',u')$ in $K \times L$
$$\mathbf{E}(X_{t,u} - X_{t',u'})^2 \le \mathbf{E}(Y_{t,u} - Y_{t',u'})^2. \tag{6.48}$$
We may now apply Slepian's lemma (Proposition 6.6; as usual, the fact that the supremum is presently taken over an infinite set can be circumvented by considering all finite subfamilies, see (6.1)) to conclude that
$$\mathbf{E}\max_{(t,u) \in K \times L} X_{t,u} \le \mathbf{E}\max_{(t,u) \in K \times L} Y_{t,u} = w_G(K) + r_K w_G(L).$$
To prove the other inequality we use the Slepian–Gordon lemma (Proposition 6.7, in the min max version, see Remark 6.8) with the partition $K \times L = \bigcup_{u \in L} T_u$, where $T_u = K \times \{u\}$. The hypotheses are satisfied since there is equality in (6.48) when $u = u'$. Consequently,
$$\mathbf{E}\min_{u \in L}\max_{t \in K} X_{t,u} \ge \mathbf{E}\min_{u \in L}\max_{t \in K} Y_{t,u} = w_G(K) - r_K w_G(L). \qquad \square$$

As a corollary we obtain sharp bounds on the extreme singular values of a rectangular Gaussian matrix, or equivalently on the extreme eigenvalues of a Wishart matrix. These bounds match the support of the Marčenko–Pastur distribution from Theorem 6.28. It is then routine to derive concentration estimates.
Corollary 6.38. Let $n \le s$, let $B \in \mathsf{M}_{n,s}$ be a random matrix with independent $N(0,1)$ entries, and denote by $s_n(B)$ its smallest singular value. Then
$$\sqrt{s} - \sqrt{n} \le \kappa_s - \kappa_n \le \mathbf{E}\, s_n(B) \le \mathbf{E}\,\|B\|_{\mathrm{op}} \le \kappa_s + \kappa_n \le \sqrt{s} + \sqrt{n}.$$
Consequently, for any $t \ge 0$,
$$\mathbf{P}\big(\|B\|_{\mathrm{op}} \ge \sqrt{s} + \sqrt{n} + t\big) \le \frac{1}{2}\exp(-t^2/2),$$
$$\mathbf{P}\big(s_n(B) \le \sqrt{s} - \sqrt{n} - t\big) \le \exp(-t^2/2).$$
Proof. We apply Proposition 6.37 with $K = S^{s-1}$ and $L = S^{n-1}$. Note that $w_G(K) = \kappa_s$ and $w_G(L) = \kappa_n$. The leftmost and rightmost inequalities follow from Proposition A.1 (iv) and (i). The concentration estimates are proved as in Proposition 6.33. □
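The outer inequalities of Corollary 6.38 involve only the constants $\kappa_n$ and can be checked numerically. The sketch below assumes that $\kappa_n$ denotes the expected Euclidean norm of a standard Gaussian vector in $\mathbb{R}^n$, that is, $\kappa_n = \sqrt{2}\,\Gamma((n+1)/2)/\Gamma(n/2)$; this closed form and the function name are assumptions of the example rather than statements taken from the text.

```python
import math

def kappa(n):
    """Mean of the chi(n) distribution: E|G| for a standard Gaussian G in R^n
    (assumed here to coincide with the constant kappa_n of the text)."""
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2.0) - math.lgamma(n / 2.0))

def corollary_chain(n, s):
    """The four deterministic quantities appearing in Corollary 6.38, in order."""
    return (
        math.sqrt(s) - math.sqrt(n),   # lower bound sqrt(s) - sqrt(n)
        kappa(s) - kappa(n),           # kappa_s - kappa_n
        kappa(s) + kappa(n),           # kappa_s + kappa_n
        math.sqrt(s) + math.sqrt(n),   # upper bound sqrt(s) + sqrt(n)
    )

chain = corollary_chain(10, 40)
```

For example, `kappa(1)` equals $\sqrt{2/\pi}$, the mean of $|g|$ for a standard real Gaussian variable $g$, and `kappa(n)` is slightly smaller than $\sqrt{n}$ for every $n$.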
ly.

Exercise 6.46 (Sharpness of Chevet's inequality). In the notation of Proposition 6.37, show that
$$\mathbf{E}\max_{u \in L}\max_{t \in K}\langle Bt, u\rangle \ge \max\big(w_G(K), \operatorname{outrad}(K)\,w_G(L)\big).$$
Exercise 6.47 (The contractions underlying the Gordon–Chevet inequality). Let $m, n$ be integers.
(i) If $\delta : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^m \times \mathbb{R}^n$ is defined by $\delta(x,y) = (|y|x, |x|y)$, show that for any $(x,y)$ and $(x',y')$ in $\mathbb{R}^m \times \mathbb{R}^n$,
$$|x \otimes y - x' \otimes y'| \le |\delta(x,y) - \delta(x',y')|.$$
(ii) Fix $r > 0$ and consider the map $\delta_r : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^m \times \mathbb{R}^n$ defined by $\delta_r(x,y) = (rx, |x|y)$. Show that for any $(x,y)$ and $(x',y')$ in $\mathbb{R}^m \times rB_2^n$,
$$|x \otimes y - x' \otimes y'| \le |\delta_r(x,y) - \delta_r(x',y')|.$$
(iii) Show that the analogues of (i) and (ii) fail in the complex setting.
Exercise 6.48 (Sharp bounds on the largest eigenvalue of $\mathrm{GOE}(n)$ and $\mathrm{GUE}(n)$ matrices). Let $A$ be a $\mathrm{GOE}(n)$ or $\mathrm{GUE}(n)$ random matrix. By arguing along the lines of the proofs of Proposition 6.37 and Corollary 6.38, show that $\mathbf{E}\,\lambda_1(A) \le 2\sqrt{n}$.
Exercise 6.49 (Mean width of the projective tensor product). Let $K \subset \mathbb{R}^m$ and $L \subset \mathbb{R}^n$ be convex bodies. Assume that $K \subset r_K B_2^m$ and $L \subset r_L B_2^n$. Prove that $w_G(K \mathbin{\widehat{\otimes}} L) \le w_G(K)\,r_L + w_G(L)\,r_K$ and $w(K \mathbin{\widehat{\otimes}} L) \le w(K)\frac{r_L}{\sqrt{n}} + w(L)\frac{r_K}{\sqrt{m}}$.

6.2.4.2. A coupling argument. We prove here Proposition 6.31. Let $B$ be an $n \times s$ random matrix with independent $N_{\mathbb{C}}(0,1)$ entries and $A$ be a $2n \times 2s$ random matrix with independent $N(0,1)$ entries. We show that
$$\mathbf{E}\,\|B\|_{\mathrm{op}} \le \frac{1}{\sqrt{2}}\,\mathbf{E}\,\|A\|_{\mathrm{op}} \le \frac{\kappa_{2n} + \kappa_{2s}}{\sqrt{2}} \le \sqrt{n} + \sqrt{s}. \tag{6.49}$$
We use representations of $A$ and $B$ via $\chi$-distributed random variables. If $G$ is a standard Gaussian vector in $\mathbb{R}^n$, the distribution of $|G|$ is denoted by $\chi(n)$ (the square of a $\chi(n)$-distributed variable has distribution $\chi^2(n)$). If $G$ is a standard Gaussian vector in $\mathbb{C}^n$, then $\sqrt{2}\,|G|$ has distribution $\chi(2n)$.
Lemma 6.39 (see Exercise 6.50). Let $n \le s$ and $A$ be an $n \times s$ random matrix with independent $N(0,1)$ entries. There exist random matrices $U \in \mathsf{O}(n)$ and $V \in \mathsf{O}(s)$, such that, denoting $R = UAV$,
(i) The random variables $\{r_{i,j} : 1 \le i \le n, 1 \le j \le s\}$ are independent,
(ii) For $1 \le i \le n$, $r_{i,i}$ has distribution $\chi(s + 1 - i)$,
(iii) For $2 \le i \le n$, $r_{i,i-1}$ has distribution $\chi(n + 1 - i)$,
(iv) Other entries of $R$ are almost surely zero.
Lemma 6.40 (see Exercise 6.50). Let $n \le s$ and $B$ be an $n \times s$ random matrix with independent $N_{\mathbb{C}}(0,1)$ entries. There exist random matrices $U' \in \mathsf{U}(n)$ and $V' \in \mathsf{U}(s)$, such that, denoting $S = \sqrt{2}\,U'BV'$,
(i) The random variables $\{s_{i,j} : 1 \le i \le n, 1 \le j \le s\}$ are independent,
(ii) For $1 \le i \le n$, $s_{i,i}$ has distribution $\chi(2s + 2 - 2i)$,
(iii) For $2 \le i \le n$, $s_{i,i-1}$ has distribution $\chi(2n + 2 - 2i)$,
(iv) Other entries of $S$ are almost surely zero.


ly.

We apply Lemmas 6.39 (with dimensions $2n \times 2s$ instead of $n \times s$) and 6.40
to the matrices $A$ and $B$ appearing in (6.49). Since $2s + 2 - 2i \le 2s + 1 - i$ for
$1 \le i \le n$ and $2n + 2 - 2i \le 2n + 1 - i$ for $2 \le i \le n$, the initial matrices $A$
and $B$ can be coupled (i.e., both defined on a single probability space) in such a
way that, almost surely, $s_{ij} \le r_{ij}$ for any $1 \le i \le n$ and $1 \le j \le s$. Since $R$
and $S$ have nonnegative entries, this implies that (almost surely) $\|S\|_{op} \le \|R\|_{op}$. Since
$\|A\|_{op} = \|R\|_{op}$ and $\|B\|_{op} = \frac{1}{\sqrt{2}} \|S\|_{op}$, it follows that
$\mathbf{E}\,\|B\|_{op} \le \frac{1}{\sqrt{2}}\, \mathbf{E}\,\|A\|_{op}$. The
remaining inequalities in (6.49) are proved in Corollary 6.38.



Problem 6.41. Does there exist an argument along similar lines (i.e., using
Slepian’s lemma and coupling) that yields inequalities in the spirit of (6.49), but
involving GUE and GOE matrices (say,
$\mathbf{E}\,\|B\|_{op} \le \frac{1}{\sqrt{2}}\, \mathbf{E}\,\|A\|_{op} \le 2\sqrt{n}$, with $B$
being a GUE($n$) matrix and $A$ being a GOE($2n$) matrix)?

Exercise 6.50 (Representation of Wishart matrices via $\chi$-distributed variables).
Prove Lemmas 6.39 and 6.40. Show also that the matrices $U$, $V$ and $R$ can
be chosen to be independent, with $U$, $V$ Haar-distributed (same for $U'$, $V'$ and $S$).
Exercise 6.51 (Neat bounds on the norms of Wishart matrices). Let $A$ be a
random $n \times s$ matrix with independent $N(0,1)$ entries with $n \le s$.
(i) Show that $\mathbf{E}\,\|A\| \ge \|M\|$ where $M = (m_{i,j})$ is the $n \times s$ matrix such that
$m_{i,i} = \kappa_{s+1-i}$ for $1 \le i \le n$ and $m_{i,i-1} = \kappa_{n+1-i}$ for $2 \le i \le n$ (other entries being
zero).
(ii) Conclude that $\mathbf{E}\,\|A\| \ge \bigl(\sqrt{n-k} + \sqrt{s-k}\bigr)\sqrt{1 - 1/k}$ for any $1 \le k \le n$. Show
that this inequality also holds when $A$ is defined using $N_{\mathbb{C}}(0,1)$ variables.

6.2.4.3. The escape phenomenon. Another consequence of the Chevet–Gordon
inequalities is the fact that a subset of the sphere which is small (when measured
using mean width) typically does not intersect a subspace of large dimension: a
generic subspace “escapes” from any small set. This is made very precise in the
following proposition.
Proposition 6.42. Let $L \subset S^{n-1}$ be a closed subset, let $k \in \{1, \dots, n-1\}$ be such that
$\kappa_{n-k} > w_G(L)$, and let $E \subset \mathbb{R}^n$ be a random $k$-dimensional subspace. Then
$$\mathbf{P}(E \cap L \neq \emptyset) \le \exp\bigl(-(\kappa_{n-k} - w_G(L))^2/2\bigr).$$
Proposition 6.42 will give a direct proof of the low-$M^*$ estimate (Theorem
7.45).
Proof of Proposition 6.42. Let $s = n - k$, let $B$ be an $n \times s$ random matrix
with i.i.d. $N(0,1)$ entries, and let $E = \ker B$. One checks that $E$ is distributed
according to the Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$. (This follows from the characterization of
the Haar measure as the only probability measure invariant under the action of $\mathsf{O}(n)$.)
Moreover, since $L$ is closed, the condition $E \cap L = \emptyset$ is equivalent to $\min_{x \in L} |Bx| > 0$.
We apply the Chevet–Gordon inequalities (Proposition 6.37) with $K = S^{s-1}$ to
conclude that
$$\mathbf{E} \min_{x \in L} |Bx| \ge \kappa_s - w_G(L).$$
Since the function $g : B \mapsto \min_{x \in L} |Bx|$ is 1-Lipschitz with respect to the
Hilbert–Schmidt distance, we may apply Gaussian concentration of measure (see Table 5.2)
to conclude that
$$\mathbf{P}(E \cap L \neq \emptyset) = \mathbf{P}(g(B) = 0) \le \mathbf{P}\bigl(g(B) \le \mathbf{E}\, g(B) - (\kappa_s - w_G(L))\bigr) \le \exp\bigl(-(\kappa_s - w_G(L))^2/2\bigr). \qquad \square$$
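To get a feel for the size of the bound in Proposition 6.42, the following sketch evaluates $\exp(-(\kappa_{n-k} - w_G(L))^2/2)$ for given $n$, $k$ and a given value of $w_G(L)$. It assumes that $\kappa_m$ is the mean of the $\chi(m)$ distribution; the function names and the sample values are hypothetical and chosen only for illustration.

```python
import math


def kappa(m: int) -> float:
    """Mean of the chi(m) distribution (assumed meaning of kappa_m)."""
    return math.sqrt(2.0) * math.exp(math.lgamma((m + 1) / 2) - math.lgamma(m / 2))


def escape_bound(n: int, k: int, w_gauss: float) -> float:
    """Upper bound on P(E meets L) from Proposition 6.42.

    n       : ambient dimension
    k       : dimension of the random subspace E, with 1 <= k <= n - 1
    w_gauss : Gaussian mean width w_G(L) of the closed set L on the sphere
    The proposition only applies when w_G(L) is smaller than kappa_{n-k}.
    """
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    gap = kappa(n - k) - w_gauss
    if gap <= 0:
        raise ValueError("hypothesis w_G(L) < kappa_{n-k} is violated")
    return math.exp(-gap * gap / 2.0)


if __name__ == "__main__":
    # Illustrative values: codimension 500 in dimension 1000, w_G(L) = 10.
    print(escape_bound(1000, 500, 10.0))
```

Since $\kappa_{n-k}$ is close to $\sqrt{n-k}$, the bound is already tiny once $\sqrt{n-k}$ exceeds $w_G(L)$ by a few units.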


6.2.5. A quick initiation to free probability. We now mention briefly
deeper results about high-dimensional random matrices that touch upon the
connection with free probability. A rigorous introduction to free probability is beyond
the scope of this book, so we instead illustrate, on an example, the kind of
conclusions that can be derived from the general theory.
Free probability describes limit objects towards which large-dimensional
random matrices converge. Here is a typical statement about polynomials in
independent GUE matrices.



Theorem 6.43 (not proved here). Let $P$ be a non-commutative self-adjoint
polynomial in $N$ variables. For every $n$, let $A_n^{(1)}, \dots, A_n^{(N)}$ be $N$ independent
random matrices with GUE($n$) distribution, and let
$X_n = P\bigl(A_n^{(1)}/\sqrt{n}, \dots, A_n^{(N)}/\sqrt{n}\bigr)$.
Then, as $n \to \infty$, the empirical spectral distributions $(\mu_{sp}(X_n))$ converge weakly,
in probability, towards the distribution of $P(a_1, \dots, a_N)$, where $a_1, \dots, a_N$ are free
semicircular variables. Moreover, $\|X_n\|_\infty$ converges in probability towards the value
$\|P(a_1, \dots, a_N)\|$.
Let us explain the meaning of the concepts and notions that appear in
Theorem 6.43. First, a polynomial $P$ is self-adjoint if $P(M_1, \dots, M_N) \in \mathsf{M}_n^{sa}$
whenever $M_1, \dots, M_N \in \mathsf{M}_n^{sa}$; an example is $P(x_1, x_2) = x_1 x_2 x_1$. Second, a family of
$N$ “free semicircular random variables” can be concretely realized as follows: let
$\mathcal{F} = \bigoplus_{k \in \mathbb{N}} (\mathbb{C}^N)^{\otimes k}$ be the Fock space over $\mathbb{C}^N$, with the usual convention that
$(\mathbb{C}^N)^{\otimes 0}$ is a one-dimensional space spanned by a unit vector $\Omega$. Let $|1\rangle, \dots, |N\rangle$ be
the canonical basis of $\mathbb{C}^N$, and let $h_1, \dots, h_N \in B(\mathcal{F})$ be the corresponding creation
operators, defined by $h_i(x) = |i\rangle \otimes x \in (\mathbb{C}^N)^{\otimes (k+1)}$ for every $x \in (\mathbb{C}^N)^{\otimes k}$. Set
$a_i := h_i + h_i^\dagger$; then the operators $a_1, \dots, a_N$ are an example of “free semicircular
variables.” The quantity $\|P(a_1, \dots, a_N)\|$ appearing in Theorem 6.43 is simply the
operator norm, and the distribution of a self-adjoint operator $Y \in B(\mathcal{H})$ is defined
as the unique probability measure $\mu$ on $\mathbb{R}$ such that, for every bounded continuous
function $f : \mathbb{R} \to \mathbb{R}$,
$$\langle \Omega | f(Y) | \Omega \rangle = \int_{\mathbb{R}} f \, d\mu$$
(it is enough to consider the case where $f$ is a polynomial). The unfamiliar reader
is invited to check that this formalism is consistent with Theorem 6.23 (see Exercise
6.52).
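The vacuum moments $\langle \Omega | a^m | \Omega \rangle$ can be computed by hand in the simplest case $N = 1$, where the Fock space is spanned by vectors $e_0 = \Omega, e_1, e_2, \dots$, the creation operator sends $e_k$ to $e_{k+1}$, and its adjoint sends $e_k$ to $e_{k-1}$ (and $e_0$ to $0$). The sketch below, with hypothetical function names, applies $a = h + h^\dagger$ repeatedly to $\Omega$ using exact integer arithmetic and compares the even moments with the Catalan numbers, which are the even moments of the semicircle law supported on $[-2, 2]$ (assumed here to be the normalization intended in the text).

```python
import math


def vacuum_moments(max_power: int) -> list:
    """Return [<Omega| a^m |Omega> for m = 0..max_power] for a = h + h^dagger, N = 1.

    A vector of the (truncated) Fock space is stored as a list of integer
    coefficients on e_0, e_1, ...; the truncation level is chosen so that
    it never affects the first max_power applications of a.
    """
    dim = max_power + 2
    v = [0] * dim
    v[0] = 1  # the vacuum vector Omega = e_0
    moments = [1]
    for _ in range(max_power):
        w = [0] * dim
        for k, coeff in enumerate(v):
            if coeff == 0:
                continue
            if k + 1 < dim:
                w[k + 1] += coeff  # creation: e_k -> e_{k+1}
            if k >= 1:
                w[k - 1] += coeff  # annihilation: e_k -> e_{k-1}
        v = w
        moments.append(v[0])  # coefficient on Omega
    return moments


def catalan(m: int) -> int:
    """m-th Catalan number, the 2m-th moment of the semicircle law on [-2, 2]."""
    return math.comb(2 * m, m) // (m + 1)


if __name__ == "__main__":
    print(vacuum_moments(8))
    print([catalan(m) for m in range(5)])
```

Odd moments vanish and even moments count Dyck paths, which is the combinatorial reason behind the semicircular distribution.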

The phenomenon behind Theorem 6.43 is called “asymptotic freeness of random
matrices” and is not limited to the case of GUE matrices (see Notes and Remarks
for more references). Here is another example involving unitary matrices. The “free
additive convolution” is a binary operation (denoted by $\boxplus$ and not defined here) on
probability measures (say, with compact support) on $\mathbb{R}$.
Theorem 6.44 (not proved here). Let $\mu$ and $\nu$ be two compactly supported
probability measures on $\mathbb{R}$. For every $n$, let $A_n, B_n \in \mathsf{M}_n^{sa}$ be real (resp., complex)
self-adjoint matrices such that the sequences of empirical measures $(\mu_{sp}(A_n))$ and
$(\mu_{sp}(B_n))$ converge weakly towards $\mu$ and $\nu$ as $n \to \infty$. Let $U_n$ be a Haar-distributed
random orthogonal (resp., unitary) matrix. Then (weakly, in probability)
$$\lim_{n \to \infty} \mu_{sp}(A_n + U_n B_n U_n^\dagger) = \mu \boxplus \nu.$$
The usefulness of Theorem 6.44 comes from the fact that in many situations
the free additive convolution of probability measures can be computed using the
so-called R-transform, a non-commutative analogue of the Fourier transform (see
for example Lecture 12 in [NS06]). Here is an example of conclusions that can be
derived from Theorem 6.44.


Corollary 6.45 (not proved here). Fix $0 \le t \le 1/2$. For any $n$, let $E_n$ and
$F_n$ be subspaces which are independent and Haar-distributed on the Grassmann
manifold $\mathrm{Gr}(\lfloor tn \rfloor, \mathbb{C}^n)$ and denote by $A_n = P_{E_n} + P_{F_n}$ the sum of the corresponding
projectors. Then the sequence of empirical measures $(\mu_{sp}(A_n))$ converges weakly in
probability towards the deterministic measure
$$(1 - 2t)\,\delta_0 + \frac{\sqrt{4t(1-t) - (x-1)^2}}{\pi x (2 - x)}\, \mathbf{1}_{\left[1 - 2\sqrt{t(1-t)},\, 1 + 2\sqrt{t(1-t)}\right]}(x)\, dx. \tag{6.50}$$
Moreover, the sequence $(\|A_n\|_\infty)$ converges in probability towards $1 + 2\sqrt{t(1-t)}$.
An analogous statement for $t \ge 1/2$ follows by applying Corollary 6.45 to $E_n^\perp$ and $F_n^\perp$. The
measure defined in (6.50) is the free additive convolution of the measure $(1-t)\delta_0 + t\delta_1$
with itself (for $t = 1/2$ we recover the arcsine distribution).
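As a sanity check on the formula (6.50), one can verify numerically that the absolutely continuous part has total mass $2t$ (so that, together with the atom $(1-2t)\delta_0$, the measure is a probability measure) and that the mean equals $2t$, the normalized trace of a sum of two projectors of relative rank $t$. The sketch below uses a plain midpoint rule; the function names and the step count are illustrative choices, and the check is restricted to $0 < t < 1/2$, where the density is bounded.

```python
import math


def density_6_50(x: float, t: float) -> float:
    """Density of the absolutely continuous part of the measure (6.50)."""
    r = 4.0 * t * (1.0 - t) - (x - 1.0) ** 2
    if r <= 0.0:
        return 0.0  # outside the support
    return math.sqrt(r) / (math.pi * x * (2.0 - x))


def continuous_mass_and_mean(t: float, steps: int = 20000) -> tuple:
    """Midpoint-rule approximation of (integral of density, integral of x * density)."""
    a = 2.0 * math.sqrt(t * (1.0 - t))
    lo, hi = 1.0 - a, 1.0 + a
    h = (hi - lo) / steps
    mass = 0.0
    mean = 0.0
    for j in range(steps):
        x = lo + (j + 0.5) * h
        d = density_6_50(x, t)
        mass += d * h
        mean += x * d * h
    return mass, mean


if __name__ == "__main__":
    for t in (0.1, 0.25, 0.4):
        mass, mean = continuous_mass_and_mean(t)
        print(f"t={t}: mass={mass:.5f} (expected {2 * t}), mean={mean:.5f}")
```

The atom at $0$ contributes nothing to the mean, so both integrals should be close to $2t$.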
Exercise 6.52 (Semicircular variables via creation operators). Show that the
distribution of the operators ai defined on the Fock space in the paragraph following
Theorem 6.43 is indeed the semicircular distribution.

Notes and Remarks


Section 6.1. The elegant proof of Lemma 6.1 is due to Talagrand (see [Tal11]).
Denote by $u_n$ the expectation of the maximum of $n$ independent $N(0,1)$ variables.
For small $n$, explicit formulas are known: $u_1 = 0$, $u_2 = 1/\sqrt{\pi}$, $u_3 = 3/(2\sqrt{\pi})$,
$u_4 = \frac{3}{\sqrt{\pi}} \bigl( \frac{1}{2} + \frac{1}{\pi} \arcsin(\frac{1}{3}) \bigr)$ and
$u_5 = \frac{5}{2\sqrt{\pi}} \bigl( \frac{1}{2} + \frac{3}{\pi} \arcsin(\frac{1}{3}) \bigr)$ (numbers are from
the website [@2]). Moreover, an asymptotic expansion of $u_n$ can be obtained from
the convergence of the maximum of independent Gaussian samples to the Gumbel
distribution (see [LLR83], Theorem 1.5.3 and also [Pic68] to justify convergence
in expectation)
$$u_n = \sqrt{2 \log n} - \frac{\log \log n}{2\sqrt{2 \log n}} + O\left( \frac{1}{\sqrt{\log n}} \right).$$
Inequalities in a similar spirit, but also for fixed $n$, appear in [DLS14].
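The closed-form values of $u_n$ for $n \le 5$ can be cross-checked against the integral representation $u_n = \int_{\mathbb{R}} x \, n\, \varphi(x)\, \Phi(x)^{n-1}\, dx$, where $\varphi$ and $\Phi$ are the standard normal density and distribution function (the integrand is the density of the maximum). The sketch below is a minimal numerical check with a midpoint rule; the integration window and step count are illustrative choices, not taken from the text.

```python
import math


def expected_max(n: int, lo: float = -10.0, hi: float = 10.0, steps: int = 20000) -> float:
    """Approximate E max(g_1, ..., g_n) for i.i.d. N(0,1) variables by a midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        x = lo + (j + 0.5) * h
        phi = math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)  # normal density
        cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))         # normal distribution function
        total += x * n * phi * cdf ** (n - 1) * h
    return total


# Closed forms quoted for small n.
ASIN = math.asin(1.0 / 3.0)
CLOSED_FORMS = {
    1: 0.0,
    2: 1.0 / math.sqrt(math.pi),
    3: 3.0 / (2.0 * math.sqrt(math.pi)),
    4: 3.0 / math.sqrt(math.pi) * (0.5 + ASIN / math.pi),
    5: 5.0 / (2.0 * math.sqrt(math.pi)) * (0.5 + 3.0 * ASIN / math.pi),
}

if __name__ == "__main__":
    for n, exact in CLOSED_FORMS.items():
        print(f"n={n}: closed form {exact:.6f}, quadrature {expected_max(n):.6f}")
```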
References for the result from Remark 6.4 are [Glu88], [CP88] and [BF88].
The conjecture from Remark 6.5 appears in [HS05].
The second part of Proposition 6.6 was originally proved by Slepian [Sle62] and
is usually referred to as Slepian’s lemma. The first assertion, which follows from
the second one, is sometimes called the Sudakov–Fernique inequality and appears
in [Fer75]. Several proofs of Proposition 6.6 are available in addition to the original
one; see, e.g., Kahane [Kah86] and Gromov [Gro87].
We also mention a well-known open problem related to Slepian’s lemma which
is known as the Kneser–Poulsen conjecture. Suppose that $x_1, \dots, x_N$ and $y_1, \dots, y_N$
are points in $\mathbb{R}^d$ with the property that $|x_i - x_j| \le |y_i - y_j|$ for any $1 \le i, j \le N$. The
conjecture asks whether for all radii $r_1, \dots, r_N > 0$, we have
$\mathrm{vol}\bigl(\bigcup B(x_i, r_i)\bigr) \le \mathrm{vol}\bigl(\bigcup B(y_i, r_i)\bigr)$. Under the same hypotheses, a sister conjecture is whether the
inequality $\mathrm{vol}\bigl(\bigcap B(x_i, r_i)\bigr) \ge \mathrm{vol}\bigl(\bigcap B(y_i, r_i)\bigr)$ holds. Similar questions can be asked
for the spherical, hyperbolic and projective spaces. Note that, in the spherical
case, the two conjectures are equivalent since the complement of a cap is also a cap.
Also, since all Riemannian manifolds are asymptotically flat as distances go to 0,
the Euclidean case (in any particular dimension) would be a formal consequence of
a positive answer in any other setting. The answers were shown to be affirmative
when $N \le d + 1$ in the spherical setting (see [Gro87]) and when $N \le d + 3$ or when
$d = 2$ (for arbitrary $N$) in the Euclidean setting (see [BC02]). Both conjectures are
known to be true for spherical caps of angle $\pi/2$, see [Bez08], which also surveys
partial results and specific open problems in the hyperbolic setting. In the setting
of projective spaces the question about unions appears to have a negative answer,
as indicated by counterexamples in section 4 of [Šid68], which show that a full
two-sided analogue of Slepian’s lemma (in the spirit of Proposition 6.9) does not
hold.
Proposition 6.9 was proved independently by Khatri [Kha67] and Šidák [Šid67]
(see also [Šid68, Glu88]). The Gaussian correlation conjecture was proved by
Royen in [Roy14]. A more accessible and more detailed exposition can be found
in [LM15], to which we also refer for more background and references.
The Sudakov minoration (Proposition 6.10) appears in [Sud71]. The dual Su-
dakov inequality (Proposition 6.11) is due to Pajor–Tomczak-Jaegermann [PTJ85].
The proof presented here is due to Talagrand (see [LT91]). Some refinements of
both inequalities appear in [MTJ87].

Dudley’s inequality (Proposition 6.13) goes back to [Dud67] and was gener-
alized to the subgaussian setting in [JM78]. A version of Proposition 6.17 in the
language of stationary Gaussian processes can be found in [Fer97]. The first part
of Theorem 6.18 is due to Fernique [Fer75] and the second part (which is much
harder) is due to Talagrand [Tal87] (a later paper [Tal01] contains a more transpar-
ent exposition). For more information about the “generic chaining” principle (which
is a reincarnation of the “majorizing measures”), we refer to the books [Tal05] and
[Tal14] by Talagrand, the latter one being more accessible.

Section 6.2. Two recent and excellent references about RMT are [AGZ10]
and [Tao12], and we direct the reader to them for the background, further
information, and bibliography. In particular, a huge branch of RMT which is not
considered here revolves around the universality principle and aims at extending
convergence results to models with fewer symmetries and/or with weaker
integrability properties. Random matrices drawn from classical compact groups are the topic
of the forthcoming monograph [Mec].
In the context of empirical measures, the $\infty$-Wasserstein distance was
introduced in [ASY14]. The $\infty$-Wasserstein distance is much less popular than its
“finite $p$” cousins; for example, in [Vil09] it appears only in the bibliographical
notes to the entire chapter devoted to the topic. However, it has a few interesting
applications, see for example [McC06]. We refer to [Vil09] for a thorough
discussion of why the terminology “Wasserstein distance” is as highly questionable as
it is predominant. For a proof that the Lévy distance metrizes weak convergence,
see Section 4.3 in [Gal95]. Knowing that the weak convergence is metrizable gives
unambiguous meaning to statements asserting that a sequence of random measures
“converges weakly in probability,” which are ubiquitous in RMT. A long list of
conditions equivalent to weak convergence is known as the Portmanteau lemma and
can be found, along with many other facts about convergence of probability measures,
in [Bil99].
Wigner’s theorem about convergence to the semicircle distribution originates
from [Wig55, Wig58] and has been extended and strengthened in various
directions, notably to matrices with independent (but not necessarily Gaussian) entries
(see, e.g., references in [Tao12]).


The “small deviation” inequalities from Proposition 6.25 are from [Aub05,
Led03, LR10]. The perhaps surprising normalization is sharp and reflects the
fact that fluctuations of large random matrices are asymptotically smaller than
the upper bound given by the Gaussian concentration. For example, the
quantity $\lambda_1(\mathrm{GUE}(n)) - 2\sqrt{n}$ is of order $n^{-1/6}$ (as opposed to $O(1)$ following from the
Gaussian isoperimetric inequality), and it converges, after normalization, to the
Tracy–Widom distribution [TW94] (resp., GOE($n$), [TW96]).
The Marčenko–Pastur distribution appearing in Theorem 6.28 was introduced
in [MP67], where the weak convergence was proved. Convergence of extreme
eigenvalues was obtained in [Gem80, Sil85]. A reference for Theorem 6.29 is
[BY88]. Theorem 6.30 about partial transposition appears in [Aub12] (see also
[BN13, FŚ13] for a slightly different setting). We also refer to [CN16] for a survey
of RMT techniques in quantum information theory.
Proposition 6.31 seems new, but it is likely that—similarly as its special case,
(6.37)—it can be derived from the subtle inequalities contained in [HT03, HT05].

The proof is, to the best of our knowledge, new; however, Lemma 6.39 appears in
[Sil85].
The formula (6.47) has been derived in [ŻS01] (and probably independently in
many other sources). Proposition 6.37 is from [Gor85] and improves on [Che78].
The argument leading to Corollary 6.38 is taken from [DS01]. Proposition 6.42 is
from [Gor88].
Free probability. The very interesting and fruitful link between free
probability and large random matrices mentioned in Section 6.2.5 goes back to [Voi91].
The monograph [NS06] gives an accessible and comprehensive approach to the
subject with an emphasis on its combinatorial aspects. A highly readable exposition of
many aspects of the subject relevant to quantum information theory can be found
in [HP00].
The weak convergence in Theorem 6.43 was proved by Voiculescu. The
extension to the convergence of the operator norm is a difficult result which was derived
later by Haagerup–Thorbjørnsen [HT05].
Free additive convolution was introduced by Voiculescu in [Voi85] and the
statement of Theorem 6.44 is from [Voi90]. The convergence of the operator
norms required for the last part of Corollary 6.45 was supplied recently in [CM14].
A formula for the sum of more than two projectors can also be derived, see [FN15].
Finally, we mention that some concentration estimates for polynomials in
random matrices can be found in [MS12].
CHAPTER 7

Some Tools from Asymptotic Geometric Analysis

This chapter contains a selection of results from asymptotic geometric analysis
which we believe to be of interest to quantum information theory. The most famous
of them is arguably Dvoretzky’s theorem which asserts that, roughly speaking, every
convex body of sufficiently large dimension admits sections which are arbitrarily
close to Euclidean balls. There are actually several variations on this statement
and they are studied in detail in Section 7.2. We also introduce the $\ell$-position of
convex bodies and use it to deduce the $MM^*$-estimate, an important result that
allows appealing to duality when studying mean widths.

7.1. $\ell$-position, K-convexity and the $MM^*$-estimate


7.1.1. $\ell$-norm and $\ell$-position. Let $K \subset \mathbb{R}^n$ be a convex body containing 0
in the interior. For $T \in \mathsf{M}_n$, we define the quantity $\ell_K(T)$ as
$$\ell_K(T) = \mathbf{E}\, \|T(G)\|_K,$$
where $G$ denotes a standard Gaussian vector in $\mathbb{R}^n$. If there is no ambiguity about
the underlying convex body, we write $\ell$ instead of $\ell_K$. The following proposition
collects elementary properties of this concept.
Proposition 7.1 (see Exercise 7.1). If $K \subset \mathbb{R}^n$ is a convex body containing 0
in the interior, then
(i) $\ell_K(\cdot)$ is a norm on $\mathsf{M}_n$,
(ii) $\ell_K$ obeys the ideal property: for $S, T \in \mathsf{M}_n$, we have $\ell_K(ST) \le \ell_K(S)\, \|T\|_{op}$,
(iii) $\ell_K(\mathrm{I}) = w_G(K^\circ) = \kappa_n\, w(K^\circ) \sim \sqrt{n}\, w(K^\circ)$,
(iv) for $T \in \mathsf{GL}(n, \mathbb{R})$, $\ell_K(T) = \ell_{T^{-1}K}(\mathrm{I})$,
(v) if $P_E$ denotes the orthogonal projection on a subspace $E \subset \mathbb{R}^n$, then
$$\ell_K(P_E) = w_G((K \cap E)^\circ) = w_G(P_E K^\circ),$$
where by $(K \cap E)^\circ$ we mean the polar of $K \cap E$ inside $E$.


We now introduce the concept of $\ell$-position via the following lemma.

Lemma 7.2. For any convex body $K \subset \mathbb{R}^n$ containing 0 in the interior, there
is a unique $T_0 \in \mathrm{PSD}$ that is a solution to the maximization problem
$$\max\{\det T : T \in \mathrm{PSD}(\mathbb{R}^n),\ \ell_K(T) \le 1\}. \tag{7.1}$$
If $T_0$ is a multiple of the identity, we say that $K$ is in the $\ell$-position.
Proof of Lemma 7.2. The maximum is attained by compactness and is
obviously strictly positive. Assume that $T_0, T_1 \in \mathrm{PSD}(\mathbb{R}^n)$ are both solutions of the
maximization problem. If $T_0 \neq T_1$, it would follow that $T = (T_0 + T_1)/2$ verifies,
on the one hand, $\ell(T) \le 1$ and, on the other hand,
$\det T > (\det T_0)^{1/2} (\det T_1)^{1/2} = \det T_0$ (by strict log-concavity of $\det$ over PSD,
see Exercise 1.42), a contradiction. $\square$
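Note that the $\ell$-position of a convex body is unique up to homotheties and
rotations. It follows from Proposition 4.8 that convex bodies with enough symmetries
are automatically in the $\ell$-position.
Lemma 7.3. Let $K$ be a convex body in the $\ell$-position. Then
$w_G(K^\circ)\, \mathrm{Tr}(A) \le n\, \ell_K(A)$ for any $A \in \mathsf{M}_n$.
Proof. We may assume $A \in \mathrm{PSD}$. Indeed, any $B \in \mathsf{M}_n$ can be written as $AO$
for $A \in \mathrm{PSD}$ and $O \in \mathsf{O}(n)$, and we have $\ell_K(A) = \ell_K(B)$ by rotational invariance
of the Gaussian measure, while $\mathrm{Tr}\, B \le \|B\|_1 = \mathrm{Tr}\, A$.
Since $K$ is in $\ell$-position, the solution of the variational problem (7.1) is $\lambda \mathrm{I}$ with
$\lambda = \ell_K(\mathrm{I})^{-1} = w_G(K^\circ)^{-1}$. Consider $A \in \mathrm{PSD}$ and $\varepsilon > 0$ small enough such that
$\mathrm{I} + \varepsilon A \in \mathrm{PSD}$. Let $B = (\ell_K(\mathrm{I} + \varepsilon A))^{-1} (\mathrm{I} + \varepsilon A)$. Since $\ell_K(B) = 1$ it follows that
$\det(B) \le \det(\lambda \mathrm{I}) = \lambda^n$. Consequently, using the triangle inequality,
$$(\det(\mathrm{I} + \varepsilon A))^{1/n} \le \lambda\, \ell_K(\mathrm{I} + \varepsilon A) \le 1 + \varepsilon \lambda\, \ell_K(A).$$
Since $\det(\mathrm{I} + \varepsilon A)^{1/n} = 1 + \frac{1}{n} \varepsilon\, \mathrm{Tr}(A) + o(\varepsilon)$ as $\varepsilon$ goes to 0, the result follows. $\square$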
Remark 7.4. Before proceeding, let us point out that the more common
definition of the $\ell$-norm (and of the $\ell$-position) is via the second moment, namely
$(\mathbf{E}\, \|T(G)\|_K^2)^{1/2}$. Using the second moment leads to nicer duality relations, but we
prefer to use the first moment to make the connection to the mean width more
transparent. The next proposition shows that the two quantities are equivalent;
however, they are neither equal nor proportional, and so the corresponding two
maximization problems lead to two slightly different notions of $\ell$-position.
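Proposition 7.5 (not proved here). For any symmetric convex body $K \subset \mathbb{R}^n$
and for any linear operator $T : \mathbb{R}^n \to \mathbb{R}^n$, we have
$$\mathbf{E}\, \|T(G)\|_K \le \bigl(\mathbf{E}\, \|T(G)\|_K^2\bigr)^{1/2} \le \sqrt{\frac{\pi}{2}}\; \mathbf{E}\, \|T(G)\|_K.$$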
Exercise 7.1. Prove the properties of the $\ell$-norm listed in Proposition 7.1.
Exercise 7.2 (The left ideal property). In the setting of Proposition 7.1, is it
true that $\ell_K(ST) \le \|S\|\, \ell_K(T)$?
7.1.2. K-convexity and the $MM^*$-estimate. Consider the Hilbert space
$H_k := L_2(\mathbb{R}^k, \gamma_k)$. It is useful to write the norm of an element $f \in H_k$ as
$$\bigl(\mathbf{E}\, |f(G)|^2\bigr)^{1/2},$$
where $G$ is a standard Gaussian vector in $\mathbb{R}^k$. Recall that the Hermite polynomials
$(h_\alpha)_{\alpha \in \mathbb{N}^k}$ defined in (5.56) form an orthonormal basis in $H_k$. For an integer $d \ge 0$,
denote by $R_d : H_k \to H_k$ the orthogonal projection onto the subspace of
homogeneous polynomials of total degree $d$: $R_d(h_\alpha) = h_\alpha$ if $|\alpha| = d$ and $R_d(h_\alpha) = 0$
if $|\alpha| \neq d$. For $f \in H_k$, $R_0(f)$ is a constant function and $R_1(f)$ has the form
$x \mapsto \langle x, a \rangle$ for some $a \in \mathbb{R}^k$.
Given $n \in \mathbb{N}$, let $H_{k,n}$ be the space of Borel functions $\Theta = (f_1, \dots, f_n) : \mathbb{R}^k \to \mathbb{R}^n$
such that $f_i \in H_k$ for each $i$. The space $H_{k,n}$ is a Hilbert space for the inner
product
$$\langle\langle \Theta, \Theta' \rangle\rangle := \mathbf{E}\, \langle \Theta(G), \Theta'(G) \rangle \tag{7.2}$$
and can be identified with the Hilbert space tensor product $H_k \otimes \mathbb{R}^n$. (This is the
canonical identification of the space of $\mathcal{H}$-valued $L_2$ functions on $\Omega$ with $L_2(\Omega) \otimes \mathcal{H}$;
if $\dim \mathcal{H}$ is finite, no completion of the latter is needed.) The projections $R_d$ induce
extensions $\tilde{R}_d := R_d \otimes \mathrm{I}_{\mathbb{R}^n} : H_{k,n} \to H_{k,n}$. More concretely, for $\Theta \in H_{k,n}$, we have
$\tilde{R}_d(\Theta) := (R_d f_1, \dots, R_d f_n)$. Similarly as for $n = 1$, the function $\tilde{R}_1(\Theta) : \mathbb{R}^k \to \mathbb{R}^n$
is linear, i.e., it has the form $x \mapsto Ax$ for some $A \in \mathsf{M}_{k,n}$ (depending on $\Theta$), and
the operator $\tilde{R}_1$ is the orthogonal projection onto the subspace of $H_{k,n}$ formed by
such linear functions.

Let $K$ be a convex body in $\mathbb{R}^n$ containing 0 in the interior. For $\Theta \in H_{k,n}$,
define
$$|||\Theta|||_K = \bigl(\mathbf{E}\, \|\Theta(G)\|_K^2\bigr)^{1/2} \tag{7.3}$$
(this quantity is a norm when $K$ is symmetric; again, we have here $X$-valued
$L_2$ functions on $(\mathbb{R}^k, \gamma_k)$, where $X = (\mathbb{R}^n, \|\cdot\|_K)$). It is easily checked that, for
$\Theta \in H_{k,n}$,
$$|||\Theta|||_K = \sup\{\langle\langle \Theta, \Xi \rangle\rangle : \Xi \in H_{k,n},\ |||\Xi|||_{K^\circ} \le 1\}. \tag{7.4}$$
The K-convexity constant of the convex body $K$, denoted by $\mathsf{K}(K)$, is the smallest
constant $C$ such that the inequality
$$|||\tilde{R}_1(\Theta)|||_K \le C\, |||\Theta|||_K \tag{7.5}$$
holds for every $k$ and for all $\Theta \in H_{k,n}$. It is not hard to show that $\mathsf{K}(K)$ is finite (see
Exercise 7.3). Moreover, rather surprisingly, $\mathsf{K}(\cdot)$ is often uniformly bounded for
large classes of bodies (for example, for balls in all commutative or non-commutative
$\ell_p$ spaces for a fixed $p \in (1, \infty)$). For general symmetric convex bodies, the sharp
estimate $\mathsf{K}(K) = O(\log n)$ appears in Corollary 7.9.
We now connect the K-convexity constant with mean width estimates.
Proposition 7.6. Let $K \subset \mathbb{R}^n$ be a convex body containing 0 in the interior
which is in the $\ell$-position. Then $w_G(K)\, w_G(K^\circ) \le n\, \mathsf{K}(K)$.
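Proof. To each $x \in \mathbb{R}^n$ associate $\Theta(x) \in K$ with the property that
$\langle x, \Theta(x) \rangle = \|x\|_{K^\circ}$; we can also ensure that the map $\Theta$ is Borel (see Exercise 1.12), so that
$\Theta \in H_{n,n}$. Since $\Theta$ takes values in $K$, it follows that $|||\Theta|||_K \le 1$. (Actually, since
$x \neq 0$ implies $\Theta(x) \in \partial K$, we necessarily have $|||\Theta|||_K = 1$.) We have
$$w_G(K) = \mathbf{E}\, \|G\|_{K^\circ} = \mathbf{E}\, \langle G, \Theta(G) \rangle = \langle\langle \mathrm{I}_{\mathbb{R}^n}, \Theta \rangle\rangle.$$
Given that $\tilde{R}_1$ is an orthogonal projection onto a subspace containing $\mathrm{I}_{\mathbb{R}^n}$, we have
$\langle\langle \mathrm{I}_{\mathbb{R}^n}, \Theta \rangle\rangle = \langle\langle \mathrm{I}_{\mathbb{R}^n}, \tilde{R}_1(\Theta) \rangle\rangle$. Recalling that $\tilde{R}_1(\Theta)$ has the form $x \mapsto Ax$ for some
$A \in \mathsf{M}_n$, we can write
$$w_G(K) = \mathbf{E}\, \langle G, AG \rangle.$$
Since an elementary computation shows that $\mathbf{E}\, \langle G, AG \rangle = \mathrm{Tr}\, A$, a straightforward
application of Lemma 7.3 yields
$$w_G(K)\, w_G(K^\circ) \le n\, \ell_K(A). \tag{7.6}$$
It remains to unscramble the meaning of the quantity $\ell_K(A)$. We have
$$\ell_K(A) = \mathbf{E}\, \|A(G)\|_K \le \bigl(\mathbf{E}\, \|A(G)\|_K^2\bigr)^{1/2} = |||A|||_K = |||\tilde{R}_1(\Theta)|||_K \le \mathsf{K}(K)\, |||\Theta|||_K \le \mathsf{K}(K),$$
as needed, the only significant step in the above chain of equalities/inequalities
being the application of (7.5), the definition of the K-convexity constant. $\square$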
An upper bound on the K-convexity constant, whose importance cannot be
overstated, is the following result due to Pisier.
Theorem 7.7. There is a universal constant $C$ such that
$$\mathsf{K}(K) \le C\bigl(1 + \log d(K, B_2^n)\bigr)$$
for any $n \in \mathbb{N}$ and any symmetric convex body $K \subset \mathbb{R}^n$.
Remark 7.8. If $K \subset \mathbb{R}^n$ is unconditional, the bound in Theorem 7.7 can be
improved to $C\bigl(1 + \log d(K, B_2^n)\bigr)^{1/2}$.

Before proving Theorem 7.7, we derive some of its consequences. First, since
$d(K, B_2^n) \le \sqrt{n}$ for every symmetric convex body $K \subset \mathbb{R}^n$ (see Exercise 4.20;
actually, the weaker result from Exercise 4.2 would suffice), we have
Corollary 7.9. There is a universal constant $C$ such that $\mathsf{K}(K) \le C \log n$
for any symmetric convex body $K \subset \mathbb{R}^n$, $n \ge 2$.
Combined with Proposition 7.6, this implies the following result known in
asymptotic geometric analysis as the “$MM^*$-estimate.”
Theorem 7.10 (The $MM^*$-estimate). Let $n \ge 2$ and let $K \subset \mathbb{R}^n$ be a
symmetric convex body which is in the $\ell$-position. Then
$$1 \le w(K)\, w(K^\circ) \le C \log n. \tag{7.7}$$


We point out that the lower bound $w(K)\, w(K^\circ) \ge 1$ is elementary (see Exercise
4.37). As a corollary, we obtain the fact that, in the $\ell$-position, the Urysohn
inequality (4.34) is sharp up to a logarithmic factor.
Corollary 7.11 (Reverse Urysohn’s inequality). Let $n \ge 2$ and let $K \subset \mathbb{R}^n$ be
a symmetric convex body. Then there exists a linear transformation $T \in \mathsf{GL}(n, \mathbb{R})$
such that
$$w(T(K)) \le C \log n \; \mathrm{vrad}(T(K)).$$
Moreover, $T$ can be chosen to commute with the group of isometries of $K$. In
particular, if $K$ has enough symmetries, one may take $T = \mathrm{I}$.
Note that since both $w(T(K))$ and $\mathrm{vrad}(T(K))$ are 1-homogeneous in $T$, one
may require in Corollary 7.11 that $T \in \mathsf{SL}(\mathbb{R}^n)$, in which case $\mathrm{vrad}(T(K)) =
\mathrm{vrad}(K)$.
For the proof of Theorem 7.7 we need two auxiliary lemmas, the first of which
requires recalling some notation. Fix $k \ge 1$ and let $(P_t)_{t \ge 0}$ be the
Ornstein–Uhlenbeck semigroup introduced in (5.55). Then each $P_t$ is a contraction on $H_k$
(Exercise 5.62). Moreover, the operator $P_t$ extends to an operator $\tilde{P}_t$ on $H_{k,n}$ by
the formula
$$\tilde{P}_t(f_1, \dots, f_n) = (P_t f_1, \dots, P_t f_n)$$
(or, more abstractly, $\tilde{P}_t = P_t \otimes \mathrm{I}_{\mathbb{R}^n}$) and this extension is also a contraction with
respect to any “reasonable functional norm.”
Lemma 7.12. For any $\Theta \in H_{k,n}$ and for any convex body $K \subset \mathbb{R}^n$ containing
0 in the interior, we have $|||\tilde{P}_t \Theta|||_K \le |||\Theta|||_K$.
Proof. For $x \in \mathbb{R}^k$, denote $g(x) = \|\Theta(x)\|_K$, so that $|||\Theta|||_K = \|g\|_{H_k}$. Then,
for any $z \in K^\circ$ and any $x \in \mathbb{R}^k$, we have $\langle \Theta(x), z \rangle \le g(x)$. Since $P_t$ preserves
positivity (this is clear from (5.55)), it follows that
$$\langle (\tilde{P}_t \Theta)(x), z \rangle = P_t\bigl(\langle \Theta(\cdot), z \rangle\bigr)(x) \le (P_t g)(x).$$
Taking supremum over $z \in K^\circ$ yields $\|(\tilde{P}_t \Theta)(x)\|_K \le (P_t g)(x)$. Squaring and
integrating against $\gamma_k$ we obtain (cf. (7.3))
$$|||\tilde{P}_t \Theta|||_K \le \|P_t g\|_{H_k} \le \|g\|_{H_k} = |||\Theta|||_K,$$
the second inequality following from $P_t$ being a contraction on $H_k$ (see Exercise
5.62). $\square$

The second lemma that we need for the proof of Theorem 7.7 is the following.
Lemma 7.13 (see Exercise 7.6). Let $p$ be a polynomial such that (1) $|p(x)| \le 1$
for any $x \in [-1, 1]$ and (2) for some $\lambda \ge e$, $|p(z)| \le \lambda$ for any complex number $z$
with $|z| \le 1$. Then $|p'(0)| \le \frac{4e}{\pi} \log \lambda$.

Proof of Theorem 7.7. Fix $k \ge 1$ and let $\lambda = d(K, B_2^n)$. Since the
K-convexity constant is linearly invariant (see Exercise 7.3), we may assume that
$B_2^n \subset K \subset \lambda B_2^n$ and therefore
$$|||\cdot|||_K \le |||\cdot|||_{B_2^n} \le \lambda\, |||\cdot|||_K. \tag{7.8}$$
Further, since $\mathsf{K}(K) \le d(K, B_2^n)$ (again, by Exercise 7.3, or directly from (7.8)), we
may assume that $\lambda \ge e$. Note that the Hilbert space norm on $H_{k,n}$ corresponding
to the inner product (7.2) is exactly $|||\cdot|||_{B_2^n}$.
Our objective is to show that if $\Theta \in H_{k,n}$ satisfies $|||\Theta|||_K \le 1$, then
$|||\tilde{R}_1(\Theta)|||_K \le \frac{4e}{\pi} \log \lambda$. (This will imply the theorem with $C = \frac{4e}{\pi}$.) By density, we may assume
that $\Theta$ is a polynomial; denote by $m$ its degree. Consider the $H_{k,n}$-valued
polynomial defined for $z \in \mathbb{C}$ by
$$\pi(z) = \sum_{j=0}^{m} z^j \tilde{R}_j(\Theta).$$
For $|z| \le 1$, we have
$$|||\pi(z)|||_K \le |||\pi(z)|||_{B_2^n} \le |||\Theta|||_{B_2^n} \le \lambda,$$
where the middle inequality uses the Pythagorean theorem. If $x = \exp(-t) > 0$,
then $\pi(x) = \tilde{P}_t \Theta$ (by (5.57)) and therefore $|||\pi(x)|||_K \le 1$ by Lemma 7.12. Similarly, if
$y = -\exp(-t)$ is negative, then $\pi(y) = \tilde{P}_t \Psi$ where $\Psi$ is defined by $\Psi(x) = -\Theta(-x)$. Because
$K$ is symmetric, we have $|||\Psi|||_K = |||\Theta|||_K$ and therefore $|||\pi(y)|||_K \le 1$. For any
$\Xi \in H_{k,n}$ with $|||\Xi|||_{K^\circ} \le 1$, the polynomial $p(z) = \langle\langle \pi(z), \Xi \rangle\rangle$ satisfies the hypotheses of
Lemma 7.13. It follows that $|p'(0)| = |\langle\langle \tilde{R}_1(\Theta), \Xi \rangle\rangle| \le \frac{4e}{\pi} \log \lambda$ and the conclusion
follows from the duality formula (7.4). $\square$
Remark 7.14. An alternative definition of K-convexity is obtained if we
replace the Gauss space by the discrete cube. Given a function $f : \{-1, 1\}^k \to \mathbb{R}^n$,
consider its decomposition $f = \sum_{A \subset \{1, \dots, k\}} w_A x_A$, where $w_A$ is the Walsh function
$(\varepsilon_1, \dots, \varepsilon_k) \mapsto \prod_{i \in A} \varepsilon_i$. Define then $Rf := \sum_{i=1}^{k} w_{\{i\}} x_{\{i\}}$, the orthogonal projection
onto the space of linear functions (the Rademacher projection). Given a convex
body $K$ in $\mathbb{R}^n$, let $\mathsf{K}'(K)$ be the smallest constant $C$ such that the inequality
$$\bigl(\mathbf{E}\, \|Rf(\varepsilon)\|_K^2\bigr)^{1/2} \le C \bigl(\mathbf{E}\, \|f(\varepsilon)\|_K^2\bigr)^{1/2}$$
holds for every $k$ and every $f : \{-1, 1\}^k \to \mathbb{R}^n$, where $\varepsilon$ is uniformly distributed on
$\{-1, 1\}^k$. It can be shown (see Section 6.6 in [AAGM15] for a detailed argument)
that for any symmetric convex body $K$,
$$\frac{2}{\pi}\, \mathsf{K}'(K) \le \mathsf{K}(K) \le \mathsf{K}'(K).$$
This definition allows for a derivation of the estimate from Theorem 7.7 that
parallels the one presented above, with the Hermite polynomials being replaced by
the Walsh functions, and Lemma 7.13 replaced by a careful application of
Bernstein’s inequality: If $p$ is a polynomial of degree at most $m$ such that $|p(x)| \le 1$ for
$x \in [-1, 1]$, then $|p'(0)| \le m$.

ut
Exercise 7.3 (A rough bound for the K-convexity constant). (i) Show that
KpB2n q “ 1. (ii) Show that if K, L are symmetric convex bodies in Rn , then

rib
?
KpKq ď dBM pK, LqKpLq. (iii) Conclude that KpKq ď n for symmetric convex
bodies K Ă Rn .

ist
Exercise 7.4 (K-convexity and duality). Show that KpKq “ KpK ˝ q for every
convex body K containing 0 in the interior.

rd
Exercise 7.5 (The K-convexity constant for B1n and for the cube). Let N “ 2k
and write the canonical basis of RN as peε qεPt´1,1uk . Define a map Θ P Hk,N
by Θpxq “ eε if the signs of the coordinates
?
fo
of x P Rk match the sequence ε P
t´1, 1u . Show that ~R̃1 pΘq~B1N ě c k for some c ą 0 and conclude that KpB1n q “
k
ot
? n
Ωp log nq “ KpB8 q.
N

Exercise 7.6 (A Bernstein-like inequality). Prove Lemma 7.13 by using the


conformal transformation z ÞÑ tanhpπz{4q mapping the strip S “ tz : | Im z| ă 1u
ly.

onto the open unit disk; reformulate the question as an inequality about holomor-
phic functions on S and use the three-lines lemma.
on

7.2. Sections of convex bodies


se

7.2.1. Dvoretzky’s theorem for Lipschitz functions. We start with the


simple but crucial observation that concentration of measure for Lipschitz functions
lu

defined on the unit sphere (Corollary 5.17) implies that such functions are actually
almost constant on a typical (randomly chosen) subspace of large dimension.
na

Throughout this section, whenever we consider a “random” k-dimensional sub-


space E Ă Rn (resp., E Ă Cn ), it is tacitly assumed that E is distributed uniformly
so

with respect to the Haar measure (as defined in Appendix B.4) on the Grassmann
manifold Grpk, Rn q (resp., Grpk, Cn q), for example by setting E “ U pRk q (resp.,
r

E “ U pCk q) where U is Haar-distributed on Opnq or SOpnq (resp., on Upnq or


Pe

SUpnq) and Rk Ă Rn (resp., Ck Ă Cn ) is the canonical inclusion.


It will be convenient to use the following concept: given a function $f : X \to \mathbb{R}$, the oscillation of $f$ around the value $\mu$ on a subset $A \subset X$ is defined as
$$\mathrm{osc}(f, A, \mu) = \sup_{x \in A} |f(x) - \mu|.$$

In the following we consider the space $S^{n-1} \subset \mathbb{R}^n$ equipped with the geodesic metric $g$. The objective is to show that, for a Lipschitz function $f : S^{n-1} \to \mathbb{R}$ and a random $k$-dimensional subspace $E \subset \mathbb{R}^n$, the oscillation of $f$ around a central value on the subsphere $S_E := S^{n-1} \cap E$ is small (and similarly for $S_{\mathbb{C}^n} \subset \mathbb{C}^n$). We first present a straightforward $\varepsilon$-net argument, which easily gives a result that is only slightly worse than Theorem 7.15 below. We focus on the real case, but the same argument applies in the complex setting. Note, however, that the latter does not follow formally from the former: while $\mathbb{C}^n$, $S_{\mathbb{C}^n}$ can be identified with $\mathbb{R}^{2n}$, $S^{2n-1}$ as metric spaces, not every $2k$-dimensional $\mathbb{R}$-linear subspace of $\mathbb{R}^{2n}$ corresponds to a $k$-dimensional $\mathbb{C}$-linear subspace of $\mathbb{C}^n$.
Let $f : (S^{n-1}, g) \to \mathbb{R}$ be a 1-Lipschitz function, let $\mu_f$ be a central value for $f$, and let $E = U(\mathbb{R}^k)$ be a random $k$-dimensional subspace of $\mathbb{R}^n$, with $U$ Haar-distributed on $\mathsf{O}(n)$. Let $\varepsilon \in (0, 1)$ and let $\mathcal{N}$ be an $\varepsilon$-net in $(S^{k-1}, g)$. First, since the function $f \circ U$ is 1-Lipschitz, we have
$$\mathrm{osc}(f \circ U, S^{k-1}, \mu_f) \le \varepsilon + \mathrm{osc}(f \circ U, \mathcal{N}, \mu_f).$$
We know from Corollary 5.32 that for any $x \in \mathcal{N}$,
$$\mathbf{P}(|f(U(x)) - \mu_f| > \varepsilon) \le 2 \exp(-n\varepsilon^2/4).$$
By the union bound, it follows that
$$\mathbf{P}(\mathrm{osc}(f \circ U, \mathcal{N}, \mu_f) > \varepsilon) \le \mathrm{card}(\mathcal{N}) \cdot 2 \exp(-n\varepsilon^2/4). \tag{7.9}$$
By Lemma 5.3, we may choose $\mathcal{N}$ with $\mathrm{card}\, \mathcal{N} \le (\pi/\varepsilon)^k$, so that the bound from (7.9) is substantially smaller than 1 provided $k \le c_1 n\varepsilon^2 / \log(1/\varepsilon)$. In that case we have $\mathrm{osc}(f, S_E, \mu_f) \le 2\varepsilon$ with high probability. We will slightly improve the dependence on $\varepsilon$ in Theorem 7.15 below; this improvement turns out to be crucial for some applications.
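To get a feeling for the size of the dimension allowed by this net argument, the right-hand side of (7.9) can be evaluated numerically with the cardinality bound $(\pi/\varepsilon)^k$. The following is only an illustrative sketch: the function names and the target failure probability `delta` are assumptions made for the example, not notation from the text.

```python
import math


def union_bound(n: int, k: int, eps: float) -> float:
    """Right-hand side of (7.9) with card(N) <= (pi/eps)^k, computed via logarithms."""
    log_bound = k * math.log(math.pi / eps) + math.log(2.0) - n * eps ** 2 / 4.0
    # Cap the exponent so that math.exp cannot overflow; any value >= 1 is vacuous anyway.
    return math.exp(min(log_bound, 700.0))


def largest_k(n: int, eps: float, delta: float = 0.01) -> int:
    """Largest k for which the union bound stays below delta (0 if there is none)."""
    k = 0
    while union_bound(n, k + 1, eps) < delta:
        k += 1
    return k


if __name__ == "__main__":
    for n in (10 ** 4, 10 ** 5, 10 ** 6):
        print(n, largest_k(n, eps=0.1))
```

The admissible $k$ grows linearly in $n$ for fixed $\varepsilon$, in line with the condition $k \le c_1 n\varepsilon^2/\log(1/\varepsilon)$.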
A function $f : S_{\mathbb{C}^n} \to \mathbb{R}$ is said to be circled if it satisfies $f(e^{i\theta} x) = f(x)$ for every $x \in S_{\mathbb{C}^n}$ and $\theta \in \mathbb{R}$. Circled functions are the complex counterpart of even functions.

Theorem 7.15 (Dvoretzky–Milman theorem for Lipschitz functions). There are constants $c, c' > 0$ such that the following holds. Let $f : S_{\mathbb{C}^n} \to \mathbb{R}$ be a 1-Lipschitz circled function, $\mu_f$ be a central value for $f$ (with respect to the uniform measure) and $0 < \varepsilon < 1$. Assume that $k \le cn\varepsilon^2$, and let $E \subset \mathbb{C}^n$ be a random $k$-dimensional subspace. Then, with probability larger than $1 - \exp(-c' n\varepsilon^2)$,
$$\mathrm{osc}(f, S_E, \mu_f) \le \varepsilon.$$
The same conclusion holds for any 1-Lipschitz function $f : S^{n-1} \to \mathbb{R}$ and a random subspace $E \subset \mathbb{R}^n$. In both cases the dimension changes to $cn\varepsilon^2/L^2$ if the function $f$ is $L$-Lipschitz.
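The real case of the theorem can be explored with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not part of the proof: the test function $f(x) = \|x\|_1/\sqrt{n}$ (which is 1-Lipschitz on the sphere), the value $\sqrt{2/\pi}$ used as an approximate central value, and all function names are choices made for the example. Sampling finitely many points of $S_E$ only gives a lower estimate of the true oscillation.

```python
import math
import random


def random_orthonormal_frame(n, k, rng):
    """Return k orthonormal vectors in R^n (Gram-Schmidt applied to Gaussian vectors).

    Their span is a Haar-distributed k-dimensional subspace of R^n.
    """
    frame = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in frame:
            dot = sum(a * b for a, b in zip(u, v))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        frame.append([a / norm for a in v])
    return frame


def f(x):
    """f(x) = ||x||_1 / sqrt(n), a 1-Lipschitz function on the Euclidean unit sphere."""
    return sum(abs(a) for a in x) / math.sqrt(len(x))


def sampled_oscillation(n, k, mu, samples, rng):
    """Lower estimate of osc(f, S_E, mu) from random points of the subsphere S_E."""
    frame = random_orthonormal_frame(n, k, rng)
    worst = 0.0
    for _ in range(samples):
        coeffs = [rng.gauss(0.0, 1.0) for _ in range(k)]
        r = math.sqrt(sum(c * c for c in coeffs))
        # Unit vector of E with uniformly distributed direction.
        x = [sum(c * u[i] for c, u in zip(coeffs, frame)) / r for i in range(n)]
        worst = max(worst, abs(f(x) - mu))
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    mu = math.sqrt(2.0 / math.pi)  # approximate central value of f for large n
    for k in (2, 8, 32):
        print(k, round(sampled_oscillation(400, k, mu, 200, rng), 3))
```

In such experiments the sampled oscillation tends to grow with $k$ at fixed $n$, which is the qualitative content of the condition $k \le cn\varepsilon^2$.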
Remark 7.16. The proof given below gives for example the value $c = 1/400$, which is certainly far from optimal. (The argument actually works provided $k + 1 \le n\varepsilon^2/200$.) While the bound can undoubtedly be improved, the use of Dudley's inequality inevitably results in poor constants. In the real case, the use of Slepian–Gordon inequalities gives a constant of order $1/6$ (see Exercise 7.7) and even better when the function $f$ is the restriction of a norm (see Remark 7.23). It would be desirable to come up with a complex version of that argument, the difficulty being that the inequalities from Exercise 6.47 do not carry over to the complex case.
Proof of Theorem 7.15. We consider the complex case and note that the same argument applies in the real setting. We may also assume that $\mu_f = 0$ (otherwise consider $f - \mu_f$).
Let $E = U(\mathbb{C}^k)$, with $U \in \mathsf{SU}(n)$ a random Haar-distributed unitary matrix (we could equivalently use the Haar measure on $\mathsf{U}(n)$, but this would lead to worse constants in (7.10) below, see Table 5.2). Consider the function $F : \mathsf{SU}(n) \to \mathbb{R}$ defined by
$$F(U) = \sup_{S_E} |f| = \sup_{x \in S_{\mathbb{C}^k}} |f(U(x))|.$$
For $U, V \in \mathsf{SU}(n)$ and $x \in S_{\mathbb{C}^k}$, we have (see Exercise B.5 for the last inequality)
$$|f(Ux) - f(Vx)| \le |Ux - Vx| \le \|U - V\|_{op} \le \|U - V\|_{HS} \le g_2(U, V),$$
where $g_2$ denotes the geodesic distance on $\mathsf{SU}(n)$, defined in (B.8). It follows that $F$ is 1-Lipschitz on $(\mathsf{SU}(n), g_2)$. Using concentration of measure (see Table 5.2) then gives, for any $t > 0$,
$$\mathbf{P}(F \ge \mathbf{E} F + t) \le \exp(-nt^2/4). \tag{7.10}$$
The remaining part of the proof consists in bounding $\mathbf{E} F$. We will rely on the following lemma.

Lemma 7.17. Let $f : S_{\mathbb{C}^n} \to \mathbb{R}$ be a 1-Lipschitz circled function and $U \in \mathsf{SU}(n)$ be a Haar-distributed random unitary matrix. Then for any $x, y \in S_{\mathbb{C}^n}$ with $x \ne y$ and for any $\lambda > 0$,
$$\mathbf{P}(f(Ux) - f(Uy) > \lambda) \le \exp\left( - \frac{(n-1)\lambda^2}{2|x - y|^2} \right).$$
Proof. Fix $x, y \in S_{\mathbb{C}^n}$. Since $f$ is circled (and $U$ is $\mathbb{C}$-linear), we may replace $y$ by $e^{i\theta} y$ and choose $\theta$ so that $\langle x | y \rangle$ is real nonnegative; note that this choice of $\theta$ minimizes $|x - y|$ and ensures that $x + y$ and $x - y$ are orthogonal. Set $z = \frac{x+y}{2}$ and $w = \frac{x-y}{2}$, then $x = z + w$ and $y = z - w$. Further, set $\beta = |w| = \frac{1}{2}|x - y|$ (we may assume that $\beta \ne 0$) and $w' = \beta^{-1} w$. Then, conditionally on $u = U(z)$, $U(w')$ is distributed uniformly on the sphere $S_{u^\perp} := S_{\mathbb{C}^n} \cap u^\perp$. Since $U(x) = u + \beta U(w')$ and $U(y) = u - \beta U(w')$, it follows that the conditional (on $u = U(z)$) distribution of $f(Ux) - f(Uy)$ is the same as that of $f_u : S_{u^\perp} \to \mathbb{R}$ defined by
$$f_u(v) = f(u + \beta v) - f(u - \beta v).$$
As is readily seen, $f_u$ is $2\beta$-Lipschitz and its mean is 0. From Lévy's lemma (Corollary 5.32) applied to $f_u$ and to the $(2n - 3)$-dimensional sphere $S_{u^\perp}$, we deduce that, conditionally on $u = U(z)$,
$$\mathbf{P}(f(Ux) - f(Uy) > \lambda) \le \exp\left(-(2n - 2)\lambda^2 / 4|x - y|^2\right),$$
and hence the same inequality holds also without the conditioning. □
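The phase normalization at the start of the proof is easy to check numerically. The sketch below (with hypothetical helper names, and with the inner product taken conjugate-linear in the first argument as an assumption about the convention) rotates $y$ so that $\langle x | y \rangle$ becomes real nonnegative and then verifies that $z = (x+y)/2$ and $w = (x-y)/2$ are orthogonal.

```python
import cmath
import math
import random


def inner(x, y):
    """Hermitian inner product <x|y>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


def random_unit_vector(n, rng):
    """Uniformly distributed point of the unit sphere of C^n."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in v))
    return [a / norm for a in v]


def align_phase(x, y):
    """Return e^{i theta} y with theta chosen so that <x | e^{i theta} y> >= 0."""
    ip = inner(x, y)
    if ip == 0:
        return list(y)
    phase = ip.conjugate() / abs(ip)  # e^{i theta}
    return [phase * b for b in y]


def dist(x, y):
    """Euclidean distance |x - y| in C^n."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, y)))


if __name__ == "__main__":
    rng = random.Random(0)
    x, y = random_unit_vector(5, rng), random_unit_vector(5, rng)
    y0 = align_phase(x, y)
    z = [(a + b) / 2 for a, b in zip(x, y0)]
    w = [(a - b) / 2 for a, b in zip(x, y0)]
    print(abs(inner(z, w)))  # numerically zero
    # The aligned phase minimizes |x - e^{i phi} y| over phi.
    print(all(dist(x, y0) <= dist(x, [cmath.exp(1j * p) * b for b in y]) + 1e-12
              for p in (0.0, 0.5, 1.0, 2.0, 3.0)))
```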
We now return to the proof of Theorem 7.15. Lemma 7.17 asserts that the process $(X_s)_{s \in S_{\mathbb{C}^k}}$ defined by $X_s = f(Us)$ is subgaussian (a notion defined in (6.19)) with constants $A = 1$ and $\alpha = (n - 1)/2$. We apply Dudley's inequality in the form given in Corollary 6.14 to obtain
$$\mathbf{E} \sup_{s \in S_{\mathbb{C}^k}} X_s \le \sup_{s \in S_{\mathbb{C}^k}} \mathbf{E} X_s + \frac{6\sqrt{2}}{\sqrt{n-1}} \int_0^{1/2} \sqrt{1 + 2 \log N(S_{\mathbb{C}^k}, |\cdot|, \eta)} \, d\eta. \tag{7.11}$$
For any $s \in S_{\mathbb{C}^k}$, $\mathbf{E} X_s$ is equal to the mean of $f$. Since 0 is a central value for $f$, it follows from Corollary 5.32 that $\mathbf{E} X_s \le \sqrt{2 \log 2}/\sqrt{2n}$. We know from Lemma 5.3 that $N(S_{\mathbb{C}^k}, |\cdot|, \eta) \le (2/\eta)^{2k}$. Using the bound $\sqrt{1 + t} \le 1 + \sqrt{t}$ gives
$$\mathbf{E} F = \mathbf{E} \sup_{s \in S_{\mathbb{C}^k}} X_s \le \frac{\sqrt{\log 2}}{\sqrt{n}} + \frac{3\sqrt{2}}{\sqrt{n-1}} + \frac{12\sqrt{2k}}{\sqrt{n-1}} \int_0^{1/2} \sqrt{\log(2/\eta)} \, d\eta.$$
The numerical value $\int_0^{1/2} \sqrt{\log(2/\eta)} \, d\eta \le 0.759$ leads to
$$\mathbf{E} F = \mathbf{E} \sup_{s \in S_{\mathbb{C}^k}} X_s \le \frac{5.08 + 12.89\sqrt{k}}{\sqrt{n-1}}.$$
This quantity is smaller than $\varepsilon/2$ provided $k \le cn\varepsilon^2$ for some constant $c$, and the conclusion follows by applying (7.10) for $t = \varepsilon/2$.
To obtain the constant $c = 1/400$, one checks the inequality $5.08 + 12.89\sqrt{k} \le \sqrt{200(k + 1) - 1}$. It follows that $\mathbf{E} F \le \varepsilon$ provided $\sqrt{200(k + 1) - 1} \le \varepsilon\sqrt{n - 1}$, or (since $\varepsilon < 1$) when $k + 1 \le n\varepsilon^2/200$. Since we may assume that $n\varepsilon^2 \ge 400$ (otherwise there is nothing to prove), this inequality is implied by the condition $k \le n\varepsilon^2/400$. □
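The numerical constants at the end of the proof can be double-checked with a few lines of code. The sketch below is a sanity check only, with function names chosen for the example. It evaluates the integral through the substitution $\eta = 2e^{-u}$, which gives $2\big(\sqrt{a}\,e^{-a} + \tfrac{\sqrt{\pi}}{2}\,\mathrm{erfc}(\sqrt{a})\big)$ with $a = \log 4$, compares it with a midpoint rule, and tests the elementary inequality $5.08 + 12.89\sqrt{k} \le \sqrt{200(k+1)-1}$ over a range of $k$.

```python
import math


def dudley_integral_closed_form() -> float:
    """int_0^{1/2} sqrt(log(2/eta)) d(eta), via eta = 2 exp(-u) and integration by parts."""
    a = math.log(4.0)
    return 2.0 * (math.sqrt(a) * math.exp(-a)
                  + 0.5 * math.sqrt(math.pi) * math.erfc(math.sqrt(a)))


def dudley_integral_midpoint(steps: int = 200_000) -> float:
    """The same integral by the midpoint rule (the singularity at 0 is integrable)."""
    h = 0.5 / steps
    return h * sum(math.sqrt(math.log(2.0 / ((i + 0.5) * h))) for i in range(steps))


def constant_check(k_max: int = 10_000) -> bool:
    """Check 5.08 + 12.89 sqrt(k) <= sqrt(200 (k + 1) - 1) for 1 <= k <= k_max."""
    return all(5.08 + 12.89 * math.sqrt(k) <= math.sqrt(200 * (k + 1) - 1)
               for k in range(1, k_max + 1))


if __name__ == "__main__":
    value = dudley_integral_closed_form()
    print(value, dudley_integral_midpoint())
    print(math.sqrt(math.log(2.0)) + 3.0 * math.sqrt(2.0))  # should not exceed 5.08
    print(12.0 * math.sqrt(2.0) * value)                    # should not exceed 12.89
    print(constant_check())
```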
Exercise 7.7 (An alternative argument for Theorem 7.15 in the real case). Let $f : S^{n-1} \to \mathbb{R}$ be a 1-Lipschitz function. Denote by $M_f$ the median of $f$, and consider $T = \{f = M_f\}$. Let $\varepsilon > 0$ be such that $n\varepsilon^2 \ge 12$, and $k$ an integer such that $k + 1 \le \frac{1}{6}\varepsilon^2 n$.
(i) For $\alpha \in (0, \pi/2)$, let $T_\alpha = \{x \in S^{n-1} : \mathrm{dist}(x, T) \le \alpha\}$, where distance refers to the geodesic metric. Show that $\sigma(S^{n-1} \setminus T_\alpha) \le \exp(-n\alpha^2/2)$. We now set $\alpha = \sqrt{2 \log 2}/\sqrt{n}$, so that $\sigma(T_\alpha) \ge 1/2$.
(ii) Show that if $B \subset S^{n-1}$ satisfies $\sigma(B) \ge 1/2$, then $w(S^{n-1} \setminus B_\beta) \le \frac{1 + \cos\beta}{2}$.
(iii) Let $A = S^{n-1} \setminus T_\varepsilon$. Check that the assumptions on $n, k, \varepsilon$ imply the inequality $\frac{1 + \cos(\varepsilon - \alpha)}{2} \le \sqrt{1 - (k + 1)/n}$, and conclude from (ii) that $w_G(A) < \kappa_{n-k}$.
(iv) Using Proposition 6.42, conclude that with positive probability, a random $k$-dimensional subspace $E \subset \mathbb{R}^n$ does not intersect $A$, i.e., satisfies $E \cap S^{n-1} \subset T_\varepsilon$, and thus $\mathrm{osc}(f, S_E, M_f) \le \varepsilon$.


Exercise 7.8 (Removing the circledness assumption in Theorem 7.15). Show that the following holds for some constants $C, c > 0$. Let $f : S_{\mathbb{C}^n} \to \mathbb{R}$ be a 1-Lipschitz function with mean $\mu_f$ and $\varepsilon \ge C\sqrt{\log n}/\sqrt{n}$. Then for $k \le c\varepsilon^2 n$, a random $k$-dimensional subspace $E \subset \mathbb{C}^n$ satisfies $\mathrm{osc}(f, S_E, \mu_f) \le \varepsilon$ with high probability. (Start by introducing the auxiliary circled functions $g(x) = \frac{1}{2\pi} \int_0^{2\pi} f(e^{i\theta} x) \, d\theta$ and $h(x) = \max\{|f(e^{i\theta} x) - f(x)| : \theta \in [0, 2\pi]\}$.)
We do not know whether the assumption $\varepsilon \ge C\sqrt{\log n}/\sqrt{n}$ can be dropped.


7.2.2. The Dvoretzky dimension. A convex body $K$ is said to be $C$-Euclidean if $d_{BM}(K, B_2^{\dim K}) \le C$ (the Banach–Mazur distance $d_{BM}$ and the geometric distance $d_g$ were defined in (4.2) and (4.1)). It is customary to separate the situation where $C$ is controlled, but possibly large (the "isomorphic" theory), from the situation where $C = 1 + \varepsilon$ with $\varepsilon \ll 1$ or at least "sufficiently small" (the "almost isometric" theory). Still another aspect is when $\varepsilon = 0$ (the "isometric" theory), which is quite different in nature and hardly mentioned in this book (with the exception of Section 11.1).
The goal of this section, and of the following ones, is to give upper and lower bounds on the maximal possible dimension of a subspace $E \subset \mathbb{R}^n$ such that $K \cap E$ is $C$-Euclidean (when $K$ is symmetric, we restrict ourselves to subspaces through the origin, so that the results can be translated in terms of subspaces of normed spaces). It is remarkable that, up to an absolute multiplicative constant, this maximal dimension can be computed via a simple formula, which we now introduce under the name of the Dvoretzky dimension.
Definition 7.18 (Dvoretzky dimension). Let $K$ be either a convex body in $\mathbb{R}^n$ containing 0 in the interior, or a circled convex body in $\mathbb{C}^n$. The Dvoretzky dimension of $K$ is defined as
$$k_*(K) = \left( w(K^\circ) \, \mathrm{inrad}(K) \right)^2 n.$$
If $\|\cdot\|$ is a norm on $\mathbb{R}^n$ (or $\mathbb{C}^n$), the Dvoretzky dimension of $X = (\mathbb{R}^n, \|\cdot\|)$ is defined as the Dvoretzky dimension of its unit ball, or equivalently as
$$k_*(X) = (M/b)^2 n,$$
where $b$ is the smallest number such that $\|\cdot\| \le b|\cdot|$ and $M = \mathbf{E} \|X\|$, where $X$ is a random variable uniformly distributed on $S^{n-1}$.
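For concrete norms the quantity $(M/b)^2 n$ is easy to estimate by simulation. The following sketch does this for the $\ell_p$ norms on $\mathbb{R}^n$; it is an illustration only, the function names and sample sizes are arbitrary choices, and the value of $b$ uses the standard comparison between the $\ell_p$ and Euclidean norms ($b = n^{1/p - 1/2}$ for $p < 2$ and $b = 1$ for $p \ge 2$).

```python
import math
import random


def lp_norm(x, p):
    """The l_p norm of x, with p = math.inf giving the sup norm."""
    if math.isinf(p):
        return max(abs(a) for a in x)
    return sum(abs(a) ** p for a in x) ** (1.0 / p)


def dvoretzky_dimension_lp(n, p, samples=500, seed=0):
    """Monte Carlo estimate of k_* = (M/b)^2 n for the l_p norm on R^n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        r = math.sqrt(sum(a * a for a in g))
        total += lp_norm(g, p) / r  # norm of the uniform point g / |g| of the sphere
    mean_norm = total / samples                   # estimate of M
    b = n ** (1.0 / p - 0.5) if p < 2 else 1.0    # maximum of the norm on the sphere
    return (mean_norm / b) ** 2 * n


if __name__ == "__main__":
    n = 200
    for p in (1, 2, 4, math.inf):
        print(p, round(dvoretzky_dimension_lp(n, p), 1))
```

For $p = 2$ the estimate equals $n$ up to rounding, for $p = 1$ it is a fixed proportion of $n$ (about $2n/\pi$), and for $p = \infty$ it is only of logarithmic size.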
We note that $b$ corresponds to the maximum value of $\|\cdot\|$ over the Euclidean sphere, while $M$ is the average value. Hence we always have $M \le b$, thus $k_* \le n$. Note also the inequality $k_*(K) \le d_g(K, L)\, k_*(L)$ for a pair of convex bodies $K, L$. We should think of $k_*$ as a quantity meaningful only up to an (absolute) multiplicative constant. Likewise, in order not to obscure the arguments, we will sometimes pretend in what follows that $k_*$ and similar expressions are integers.
The Dvoretzky dimension of a convex body $K \subset \mathbb{R}^n$ depends on the choice of the underlying Euclidean structure. The remarkable fact is that the following two quantities are equivalent up to multiplicative universal constants (see Exercise 7.10 and Theorem 7.19):
(i) The supremum of $k_*(K)$ over all Euclidean structures on $\mathbb{R}^n$.
(ii) The largest $k$ such that $K \cap E$ is 2-Euclidean for some $E \in \mathrm{Gr}(k, \mathbb{R}^n)$.

The usefulness of this concept comes from the fact that, for standard norms, the Dvoretzky dimension is usually easily computed. We illustrate this in the case of $\ell_p$ spaces and Schatten norms in Section 7.2.4. However, the following Theorem 7.19 is also of interest when applied to abstract norms. For example, it implies the celebrated fact that any high-dimensional convex body has sections which are arbitrarily close to a Euclidean ball (see Corollary 7.40).

Figure 7.1. Low-dimensional illustration of Dvoretzky's theorem: the regular hexagon appears as a section of $B_1^3$ and $B_\infty^3$.
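The hexagonal sections mentioned in the caption of Figure 7.1 can be verified directly. The sketch below assumes the cutting plane is $\{x + y + z = 0\}$ (the caption does not specify the plane; this is the natural symmetric choice). The section of the cube $B_\infty^3$ then has the six permutations of $(1, -1, 0)$ as vertices, and the section of $B_1^3$ has the same points scaled by $1/2$. The code checks that these points have equal Euclidean norm and are equally spaced in angle.

```python
import itertools
import math


def hexagon_vertices(scale=1.0):
    """The six permutations of (1, -1, 0), multiplied by scale."""
    return sorted({tuple(scale * c for c in perm)
                   for perm in itertools.permutations((1.0, -1.0, 0.0))})


def planar_angles(points):
    """Sorted angles of the points in an orthonormal basis of the plane x + y + z = 0."""
    e1 = (1 / math.sqrt(2), -1 / math.sqrt(2), 0.0)
    e2 = (1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6))
    angles = []
    for p in points:
        a = sum(pi * ei for pi, ei in zip(p, e1))
        b = sum(pi * ei for pi, ei in zip(p, e2))
        angles.append(math.atan2(b, a))
    return sorted(angles)


def angle_gaps(points):
    """Gaps between consecutive angles, including the wrap-around gap."""
    ang = planar_angles(points)
    gaps = [ang[i + 1] - ang[i] for i in range(len(ang) - 1)]
    gaps.append(ang[0] + 2 * math.pi - ang[-1])
    return gaps


if __name__ == "__main__":
    cube_section = hexagon_vertices(1.0)   # vertices lie on the boundary of the cube
    l1_section = hexagon_vertices(0.5)     # vertices lie on the boundary of B_1^3
    print(cube_section)
    print([round(math.degrees(g), 6) for g in angle_gaps(cube_section)])
    print([sum(abs(c) for c in v) for v in l1_section])
```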

Theorem 7.19 (Tangible Dvoretzky–Milman theorem). There are absolute constants $c, c' > 0$ such that the following holds. Let $K$ be either a convex body in $\mathbb{R}^n$ containing 0 in the interior, or a circled convex body in $\mathbb{C}^n$. Let $M$ and $k_* = k_*(K)$ (the Dvoretzky dimension of $K$) be as in Definition 7.18. Fix $0 < \varepsilon \le 1$, and let $k = c\varepsilon^2 k_*$. Then a random $k$-dimensional subspace $E$ satisfies the following: with probability larger than $1 - \exp(-c' \varepsilon^2 k_*)$, we have
$$\forall x \in E, \quad (1 - \varepsilon) M |x| \le \|x\|_K \le (1 + \varepsilon) M |x| \tag{7.12}$$
and consequently
$$d_g(K \cap E, B_2^E) \le \frac{1 + \varepsilon}{1 - \varepsilon},$$
where $d_g$ denotes the geometric distance as defined in (4.1) and $B_2^E = B_2^n \cap E$.

Proof. This is a straightforward consequence of Theorem 7.15, applied to the function $f(x) = \|x\|_K$, which is a $b$-Lipschitz function (and, moreover, is circled in the complex case). Indeed, provided $k \le cn(\varepsilon M/b)^2$, we obtain with probability larger than $1 - \exp(-c' (\varepsilon M/b)^2 n)$ that $\mathrm{osc}(\|\cdot\|_K, S_E, M) \le \varepsilon M$, which is equivalent to (7.12). □

Remark 7.20. A simple $\varepsilon$-net argument combined with a little trick (see Exercise 7.11) gives a version of the complex case of Theorem 7.19 with a slightly worse dependence on $\varepsilon$, but without the assumption that $K$ is circled.

Remark 7.21 (about the dependence on $\varepsilon$). We now comment on the sharpness of Theorem 7.19. First, the isomorphic version (for macroscopic $\varepsilon$) is always sharp: the dimension of generic 2-Euclidean sections can never exceed $k_*(K)$ (see Exercise 7.12). Second, one can construct norms for which the dependence on $\varepsilon$ is sharp (see Exercise 7.20). However, for some natural and interesting instances the dependence on $\varepsilon$ can be improved (we will see a very important example in Chapter 8, connected to the additivity conjecture; see Remark 8.21).

Remark 7.22. If $K$ is $A$-Euclidean, then $k_*(K) \ge n/A^2$. Consequently, by Theorem 7.19, for any fixed $\varepsilon > 0$, $K$ admits sections of proportional dimension which are $(1 + \varepsilon)$-Euclidean. Therefore any result about isomorphically Euclidean sections implies a counterpart about almost isometric sections, the dimension of the section being affected only by a multiplicative $\Omega(\varepsilon^2)$ constant.

Remark 7.23. In the real case, the conclusion of Theorem 7.19 also holds for a gauge (i.e., without the symmetry assumption on $K$). Moreover, a derivation from the Chevet–Gordon inequalities allows for a more direct proof and gives a better constant. For any $k \le k_*$, we show the existence of a $k$-dimensional subspace $E \subset \mathbb{R}^n$ such that
$$d_{BM}(K \cap E, B_2^k) \le \frac{1 + \sqrt{k/k_*}}{1 - \sqrt{k/k_*}}. \tag{7.13}$$
To achieve this, we consider a random matrix $B \in \mathsf{M}_{k,n}$, which we interpret as an operator from $\mathbb{R}^k$ to $\mathbb{R}^n$. By the Chevet–Gordon inequalities (Proposition 6.37),
$$w_G(K^\circ) - b\kappa_k \le \mathbf{E} \min_{x \in S^{k-1}} \max_{y \in K^\circ} \langle Bx, y \rangle \le \mathbf{E} \max_{x \in S^{k-1}} \max_{y \in K^\circ} \langle Bx, y \rangle \le w_G(K^\circ) + b\kappa_k.$$
Using the inequality $\kappa_k/\sqrt{k} \le \kappa_n/\sqrt{n}$ (Proposition A.1), we are led to
$$\kappa_n \left( M - b\sqrt{\frac{k}{n}} \right) \le \mathbf{E} \min_{x \in S^{k-1}} \max_{y \in K^\circ} \langle Bx, y \rangle \le \mathbf{E} \max_{x \in S^{k-1}} \max_{y \in K^\circ} \langle Bx, y \rangle \le \kappa_n \left( M + b\sqrt{\frac{k}{n}} \right)$$
and the existence of a subspace $E = B(\mathbb{R}^k)$ satisfying (7.13) follows.

Due to the duality between sections and projections of convex bodies (see (1.12) and (1.13)), Theorem 7.19 admits a dual formulation via projections onto subspaces.

Corollary 7.24. Let $K$ be a convex body in $\mathbb{R}^n$, and $\varepsilon > 0$. Provided $k \le c\varepsilon^2 k_*(K^\circ)$, a random $k$-dimensional subspace $E$ satisfies with large probability
$$(1 - \varepsilon) w(K) B_2^E \subset P_E K \subset (1 + \varepsilon) w(K) B_2^E.$$

Remark 7.25 (Geometric interpretation of the $MM^*$-estimate). Let $K \subset \mathbb{R}^n$ be a symmetric convex body and let $k \le c\varepsilon^2 \min(k_*(K), k_*(K^\circ))$. We know then from Theorem 7.19 and Corollary 7.24 that for a random subspace $E \in \mathrm{Gr}(k, \mathbb{R}^n)$, the section $K \cap E$ is $(1 + \varepsilon)$-close to a Euclidean ball of radius $w(K^\circ)^{-1}$ while the projection $P_E K$ is $(1 + \varepsilon)$-close to a Euclidean ball of radius $w(K)$; the ratio of these radii is the quantity $w(K) w(K^\circ)$ which appears in Theorem 7.10. In particular, if $K$ is in the $\ell$-position, the radius of a typical $k$-dimensional projection only exceeds the radius of a typical $k$-dimensional section by a $\log(n)$ factor. However, it is not clear whether the $\ell$-position is always compatible with the conditions $k_*(K) \gg 1$ and $k_*(K^\circ) \gg 1$ (see Problem 7.26).

Problem 7.26. Does there exist, for every symmetric convex body $K \subset \mathbb{R}^n$, a subspace $E$ of dimension $c \log n$ such that
$$r_1 B_2^E \subset K \cap E \subset 2 r_1 B_2^E \quad \text{and} \quad r_2 B_2^E \subset P_E K \subset 2 r_2 B_2^E,$$
with $r_2/r_1 = O(\log n)$? Without the constraint on the radii this follows from classical facts, see Exercise 7.27.


Exercise 7.9 (Dvoretzky dimension and duality). Let $K \subset \mathbb{R}^n$ be a convex body such that $d_g(K, B_2^n) \le A$. Show that $k_*(K) k_*(K^\circ) \ge n^2/A^2$. In particular, if $K$ is symmetric and is in John or Löwner position, then $k_*(K) k_*(K^\circ) \ge n$.

Exercise 7.10. Let $K \subset \mathbb{R}^n$ be a symmetric convex body, and suppose that $d_{BM}(K \cap E, B_E) < A$ for some $k$-dimensional subspace $E$, where $B_E = B_2^n \cap E$. Show that there is a linear transformation $T \in \mathsf{GL}(n, \mathbb{R})$ such that $k_*(T(K)) \ge (k - 1)/A^2$.
Exercise 7.11 (Almost spherical sections discretized). (i) Let $\mathcal{N}$ be a $\delta$-net in $(S_{\mathbb{C}^n}, |\cdot|)$, and $\|\cdot\|$ a norm on $\mathbb{C}^n$ such that
$$\forall x \in \mathcal{N}, \quad 1 - \alpha \le \|x\| \le 1 + \beta.$$
Show that
$$\forall x \in S_{\mathbb{C}^n}, \quad 1 - \alpha - \frac{\delta(1 + \beta)}{1 - \delta} \le \|x\| \le \frac{1 + \beta}{1 - \delta}. \tag{7.14}$$
(ii) Use (i) to show that, when $k \le c\varepsilon^2 \log^{-1}(1/\varepsilon) k_*(K)$, the conclusion from Theorem 7.15 can be derived via the elementary net argument that led to (7.9).

Exercise 7.12 (Sharpness of the Dvoretzky–Milman theorem for random subspaces). Consider a norm $\|\cdot\|$ on $\mathbb{R}^n$, and let $M, b, k_*$ be as in Definition 7.18. The goal of this exercise is to show that, in Theorem 7.19, the value $k_*$ is always sharp for macroscopic values of $\varepsilon$ (say, $\varepsilon = 1$ so that the lower bound in (7.12) is vacuous). Assume that $k$ is an integer with the following property: with probability larger than $1 - 1/n$, a random $k$-dimensional subspace $E$ satisfies
$$\forall x \in E, \quad \|x\| \le 2M|x|. \tag{7.15}$$
(i) Show that there is an orthogonal decomposition of $\mathbb{R}^n$ as the direct sum of $\lceil n/k \rceil$ subspaces, each of them satisfying (7.15).
(ii) Show that for every $x \in \mathbb{R}^n$, $\|x\| \le 2M\sqrt{\lceil n/k \rceil}\,|x|$.
(iii) Conclude that $k \le C(M/b)^2 n$ for some absolute constant $C$.

7.2.3. The Figiel–Lindenstrauss–Milman inequality. In this section we will derive, as a consequence of Theorem 7.19, a useful inequality due to Figiel–Lindenstrauss–Milman which can be interpreted as follows: complexity (of any convex body) must lie somewhere.
Fix a convex body $K \subset \mathbb{R}^n$ containing the origin in the interior. Define the verticial dimension of $K$ as
$$\dim_V(K) = \log \inf\{N : \text{there is a polytope } P \text{ with } N \text{ vertices s.t. } K \subset P \subset 4K\}$$
and the facial dimension of $K$ as
$$\dim_F(K) = \log \inf\{N : \text{there is a polytope } Q \text{ with } N \text{ facets s.t. } K \subset Q \subset 4K\}.$$
The number 4 plays no special role in these definitions; all the results below are only affected in the values of the constants if 4 is replaced by another number larger than 1 (see Exercise 7.15). The basic properties of these concepts are gathered in Proposition 7.27.

Proposition 7.27. Let $K \subset \mathbb{R}^n$ be a convex body containing the origin in the interior. Then
(i) for any $T \in \mathsf{GL}(n, \mathbb{R})$, we have $\dim_V(TK) = \dim_V(K)$ and $\dim_F(TK) = \dim_F(K)$,
(ii) we have $\dim_V(K^\circ) = \dim_F(K)$ and $\dim_F(K^\circ) = \dim_V(K)$,
(iii) for any subspace $E \subset \mathbb{R}^n$, we have $\dim_F(K \cap E) \le \dim_F K$ and $\dim_V(P_E K) \le \dim_V K$,
(iv) if $K$ has centroid at the origin, then $\dim_V(K) \le Cn$ and $\dim_F(K) \le Cn$ for some absolute constant $C$.

We note that the verticial and facial dimensions are linearly invariant but not affinely invariant (see Exercise 7.13).

Proof. (i) is obvious and (ii) follows from the fact that polarity exchanges vertices and facets for polytopes (see (1.16)). The two dual inequalities in (iii) hold since projections do not increase the number of vertices of polytopes, while sections do not increase the number of facets. For (iv), see Exercises 5.18 and 5.19. □
Define also the asphericity of a convex body $K \subset \mathbb{R}^n$ as
$$a(K) = \inf\left\{ \frac{R}{r} : \text{there is a 0-symmetric ellipsoid } \mathcal{E} \text{ with } r\mathcal{E} \subset K \subset R\mathcal{E} \right\}. \tag{7.16}$$
We have $a(K) = d_{BM}(K, B_2^n)$ if $K$ is centrally symmetric. The following lemma gives a simple connection between asphericity and verticial (resp., facial) dimension. It is an immediate consequence of Proposition 5.6.
Lemma 7.28. Let $K \subset \mathbb{R}^n$ be a convex body containing the origin in the interior. Then
$$\dim_V(K)\, a(K)^2 \ge \frac{n - 1}{32}, \qquad \dim_F(K)\, a(K)^2 \ge \frac{n - 1}{32}.$$

When combined with Dvoretzky's theorem, the inequalities from Lemma 7.28 give a much sharper result.

Theorem 7.29 (Figiel–Lindenstrauss–Milman inequality). For any convex body $K \subset \mathbb{R}^n$ containing the origin in the interior we have
$$\dim_F(K) \dim_V(K)\, a(K)^2 \ge cn^2 \tag{7.17}$$
where $c > 0$ is an absolute constant.
Proof. We may assume that $rB_2^n \subset K \subset RB_2^n$ with $R/r = a(K)$. Let $M = \mathbf{E} \|X\|_K$ and $M^* = \mathbf{E} \|X\|_{K^\circ}$ where $X$ is a random vector uniformly distributed on the unit sphere.
We apply Theorem 7.19 to $K$ for $\varepsilon = 1/2$ (say). This yields a subspace $E \subset \mathbb{R}^n$ of dimension $c(rM)^2 n$ such that
$$\frac{M}{2} B_2^E \subset K \cap E \subset \frac{3M}{2} B_2^E.$$
It follows (using Proposition 7.27(iii) and Lemma 7.28) that $\dim_F(K) \ge \dim_F(K \cap E) \ge c(rM)^2 n$ for an absolute constant $c > 0$. We apply the same argument to $K^\circ$ (note that $R^{-1} B_2^n \subset K^\circ$) and obtain that $\dim_F(K^\circ) \ge c(M^*/R)^2 n$. Since $\dim_V(K) = \dim_F(K^\circ)$, it follows that
$$\dim_F(K) \dim_V(K) \ge c^2 n^2 (M M^*)^2 (r/R)^2 \ge c^2 n^2 / a(K)^2$$
as needed, where we used the fact (see Exercise 4.37) that $M M^* \ge 1$. □


A consequence is a remarkable combinatorial result about symmetric polytopes. Indeed, we know from Exercise 4.20 that $a(K) \le \sqrt{n}$ for any symmetric convex body $K \subset \mathbb{R}^n$.
Corollary 7.30. Let $P \subset \mathbb{R}^n$ be a symmetric polytope with $n_1$ vertices and $n_2$ faces. Then
$$(\log n_1)(\log n_2) \ge cn.$$
The conclusion of the Corollary fails dramatically for non-symmetric polytopes (consider the simplex).
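The contrast between symmetric polytopes and the simplex can be made concrete with the standard vertex and facet counts: the cube $[-1,1]^n$ has $2^n$ vertices and $2n$ facets, the cross-polytope has $2n$ vertices and $2^n$ facets, and the simplex has $n + 1$ of each. The sketch below (function names are illustrative) computes the ratio $(\log n_1)(\log n_2)/n$ in each case, working with logarithms so that $2^n$ never has to be formed.

```python
import math


def ratio_cube(n: int) -> float:
    """(log n1)(log n2)/n for the cube: n1 = 2^n vertices, n2 = 2n facets."""
    return (n * math.log(2.0)) * math.log(2.0 * n) / n


def ratio_cross_polytope(n: int) -> float:
    """Same quantity for the cross-polytope: n1 = 2n vertices, n2 = 2^n facets."""
    return math.log(2.0 * n) * (n * math.log(2.0)) / n


def ratio_simplex(n: int) -> float:
    """Same quantity for the (non-symmetric) simplex: n1 = n2 = n + 1."""
    return math.log(n + 1.0) ** 2 / n


if __name__ == "__main__":
    for n in (10, 1000, 10 ** 6):
        print(n, ratio_cube(n), ratio_cross_polytope(n), ratio_simplex(n))
```

For the cube and the cross-polytope the ratio stays bounded below (it even grows like $\log n$), while for the simplex it tends to 0.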
Exercise 7.13 (The position of the origin matters). Give examples of planar convex bodies containing the origin in the interior, whose verticial or facial dimension is arbitrarily high.
Exercise 7.14. Give examples of symmetric polytopes in $\mathbb{R}^n$ with $\exp(o(n))$ vertices and $\exp(o(n))$ facets.

Exercise 7.15 (Isomorphic facial and verticial dimension). Let $K \subset \mathbb{R}^n$ be a convex body containing the origin in the interior. For $A \ge 1$, define $\dim_F(K, A)$ (resp., $\dim_V(K, A)$) as $\log N$, where $N$ is the minimal number of facets (resp., vertices) of a polytope $P$ such that $K \subset P \subset AK$. Show that, for any $A, B \ge 1$,
$$A^2 \dim_F(K, A) \cdot B^2 \dim_V(K, B) \cdot a(K)^2 \ge cn^2$$
where $c > 0$ is an absolute constant.
7.2.4. The Dvoretzky dimension of standard spaces. In this section we compute the Dvoretzky dimension for the unit balls with respect to the most standard norms: the commutative and non-commutative $p$-norms. Unless specified otherwise, the statements refer to both the real and the complex case.
7.2.4.1. $\ell_p$ norms. Let $B_p^n$ denote the unit ball (in either $\mathbb{R}^n$ or $\mathbb{C}^n$) for the norm $\|\cdot\|_p$, where $p \in [1, \infty]$. We also define the conjugate exponent $q \in [1, \infty]$ by the relation $p^{-1} + q^{-1} = 1$. Recall that $(B_p^n)^\circ = B_q^n$.
Theorem 7.31. The Dvoretzky dimension of $B_p^n$ is of the following order
$$k_*(B_p^n) \simeq \begin{cases} n & \text{if } 1 \le p \le 2, \\ p\, n^{2/p} & \text{if } 2 \le p \le \log n, \\ \log n & \text{if } \log n \le p \le \infty. \end{cases}$$
Remark 7.32. We emphasize that the constants implicit in the relations "$\simeq$" do not depend on $p$ (in addition to not depending on $n$). The proof actually shows that, for fixed $p$ and as $n$ tends to $\infty$, $w(B_q^n) \sim n^{1/p - 1/2} \|g\|_{L_p}$, where $g$ is a standard $N(0, 1)$ (real or complex, accordingly) Gaussian random variable ("$\sim$" is uniform in $p$ on bounded sets, but not globally). A closed expression for $\|g\|_{L_p}$ is given in (5.63).

Proof. We treat the real case, the complex case being similar. Let $q \in [1, \infty]$ be such that $1/p + 1/q = 1$. By Definition 7.18, we have
$$k_*(B_p^n) = n\, \mathrm{inrad}(B_p^n)^2\, w(B_q^n)^2.$$
Accordingly, the Theorem will follow from the estimates
$$\mathrm{inrad}(B_p^n) = \begin{cases} n^{1/2 - 1/p} & \text{if } 1 \le p \le 2, \\ 1 & \text{if } 2 \le p \le \infty, \end{cases}$$
and
$$\mathbf{E} \|x\|_p = w(B_q^n) \simeq \begin{cases} \sqrt{p}\, n^{1/p - 1/2} & \text{if } 1 \le p \le \log n, \\ \sqrt{\log n}/\sqrt{n} & \text{if } \log n \le p \le \infty, \end{cases} \tag{7.18}$$
where $x$ is a random vector uniformly distributed on $S^{n-1}$.
The value of $\mathrm{inrad}(B_p^n)$ is a reformulation of the corresponding inequality (1.4) between $\ell_p$-norms. We estimate the mean width by introducing a standard Gaussian vector $G = (g_1, \ldots, g_n)$ in $\mathbb{R}^n$, so that $w(B_q^n) = \kappa_n^{-1} \mathbf{E} \|G\|_p$, where $\kappa_n \sim n^{1/2}$ was defined in (A.8). Consider also the random variable $X = \|G\|_p$. What is easy to compute is the $p$th moment (or the $L_p$-norm) of $X$:
$$\|X\|_{L_p} = (\mathbf{E} X^p)^{1/p} = (n\, \mathbf{E} |g_1|^p)^{1/p} \sim \sqrt{p/e}\; n^{1/p} \tag{7.19}$$
(see (5.63) for the last relation). Since we are interested in the value of $\mathbf{E} X$, we will use Gaussian concentration to relate the expectation of $X$ to its $p$th moment.
Consider first the case $p \ge 2$, then $\|\cdot\|_p$ is 1-Lipschitz (with respect to the Euclidean metric) and so by Proposition 5.34 and Theorem 5.24
$$\mathbf{P}(X - \mathbf{E} X > t) \le \mathbf{P}(X - M > t) \le \mathbf{P}(g_1 > t) \quad \text{for all } t > 0,$$
where $M$ is the median of $X$. In particular, we have
$$\mathbf{E} \left( (X - \mathbf{E} X)^+ \right)^p = \int_0^\infty p t^{p-1}\, \mathbf{P}(X - \mathbf{E} X > t) \, dt \le \mathbf{E} (g_1^+)^p \le \left( \frac{p}{e} \right)^{p/2}$$
(see (A.1) or (5.63)) and so
$$\|(X - \mathbf{E} X)^+\|_{L_p} \le \sqrt{p/e}. \tag{7.20}$$
Since $\|X\|_{L_p} - \|(X - \mathbf{E} X)^+\|_{L_p} \le \mathbf{E} X \le \|X\|_{L_p}$, it follows from (7.19) and (7.20) that $w((B_p^n)^\circ) = \Theta(\sqrt{p}\, n^{1/p - 1/2})$ whenever $2 \le p \le \log n$. For $\log n \le p \le \infty$, we have $\|\cdot\|_\infty \le \|\cdot\|_p \le e\|\cdot\|_\infty$, so that it suffices to prove the second part of (7.18) for $p = \infty$. This is exactly (modulo the relation between the spherical and Gaussian means) the content of Lemma 6.1, which asserts that, in the present notation, $\mathbf{E} \|G\|_\infty \sim \sqrt{2 \log n}$.
If $1 \le p < 2$, $\|\cdot\|_p$ is $n^{1/p - 1/2}$-Lipschitz and an argument along the same lines yields
$$\|(X - \mathbf{E} X)^+\|_{L_p} \le n^{1/p - 1/2} \sqrt{p/e}. \tag{7.21}$$
Combining this with (7.19) shows that $\mathbf{E} X = \Theta(n^{1/p})$ for $1 \le p < 2$, whence (7.18) for that range of $p$ readily follows. □
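The asymptotic relation used in (7.19), namely $(\mathbf{E}|g_1|^p)^{1/p} \sim \sqrt{p/e}$ for large $p$, can be checked against the classical closed form $\mathbf{E}|g|^p = 2^{p/2}\,\Gamma\big(\tfrac{p+1}{2}\big)/\sqrt{\pi}$ for a real standard Gaussian variable. The sketch below is a numerical illustration with function names chosen for the example; whether this closed form matches the normalization of (5.63) exactly is an assumption.

```python
import math


def gaussian_abs_moment_root(p: float) -> float:
    """(E|g|^p)^{1/p} for a real N(0,1) variable, computed through logarithms."""
    log_moment = (0.5 * p * math.log(2.0)
                  + math.lgamma(0.5 * (p + 1.0))
                  - 0.5 * math.log(math.pi))
    return math.exp(log_moment / p)


def ratio_to_asymptotic(p: float) -> float:
    """Ratio of the exact L_p norm of g to the asymptotic value sqrt(p/e)."""
    return gaussian_abs_moment_root(p) / math.sqrt(p / math.e)


if __name__ == "__main__":
    for p in (2, 4, 10, 100, 1000):
        print(p, gaussian_abs_moment_root(p), ratio_to_asymptotic(p))
```

The ratio decreases to 1 as $p$ grows and stays within a bounded factor for all $p \ge 2$, which is what the estimate (7.18) requires in the range $2 \le p \le \log n$.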
While the above argument relies heavily on tools specific to the Gaussian case, most of its elements can be carried over to a much more general setting. An example of a more robust calculation is given in Exercise 7.17.

Remark 7.33 (Sharpness of Theorem 7.31). It can be shown that the estimates for the dimension of nearly Euclidean subspaces implied by Theorem 7.31 are sharp in the following sense: for $2 < p < \infty$, if some $k$-dimensional subspace $E \subset \mathbb{R}^n$ is such that $d_{BM}(B_p^n \cap E, B_2^k) \le 2$, then $k \le Cpn^{2/p}$, where $C$ is an absolute constant (see Exercise 7.19).

Remark 7.34 (Euclidean sections of $\ell_\infty^n$). The case of $\ell_\infty^n$ deserves a special mention since almost Euclidean subspaces of $\ell_\infty^n$ are closely related to $\varepsilon$-nets in the unit Euclidean sphere. It is easily checked (see Exercise 7.18) that the following two statements are equivalent:
(i) There is a $k$-dimensional subspace $E \subset \mathbb{R}^n$ such that $d_{BM}(B_2^k, B_\infty^n \cap E) \le 1 + \varepsilon$.
(ii) There exist $n$ points $x_1, \ldots, x_n$ in $S^{k-1}$ such that
$$(1 + \varepsilon)^{-1} B_2^k \subset \mathrm{conv}\{\pm x_i : 1 \le i \le n\}. \tag{7.22}$$
Moreover, (7.22) is also equivalent to $(\pm x_i)$ being a $\theta$-net in $(S^{k-1}, g)$ with $\cos\theta = (1 + \varepsilon)^{-1}$ (Exercise 5.7). Since the smallest cardinality of such a net is essentially of order $V(\theta)^{-1}$ (Corollary 5.5), it follows from Proposition 5.1 that the largest dimension of a $(1 + \varepsilon)$-Euclidean subspace in $\ell_\infty^n$ is $\Theta(\log(n)/\log(1/\varepsilon))$.
The Dvoretzky dimension of $\ell_1^n$ is of order $n$, and consequently (by Theorem 7.19), for $0 < \varepsilon < 1$, a typical subspace of dimension $c\varepsilon^2 n$ of $\ell_1^n$ is $(1 + \varepsilon)$-close to the Euclidean space. Remarkably, this phenomenon persists even when the dimension of the subspace approaches $n$. We have

Theorem 7.35 (see Section 7.2.6.2). Let $0 < \alpha < 1$. With large probability, a typical $\lfloor (1 - \alpha)n \rfloor$-dimensional subspace $E \subset \mathbb{R}^n$ has the property that for every $x \in E$,
$$A(\alpha)^{-1} \sqrt{n}\, |x| \le \|x\|_1 \le \sqrt{n}\, |x|, \tag{7.23}$$
where $A(\alpha)$ is a constant depending only on $\alpha$.
For a more general result in this direction, see Theorem 7.42.

Remark 7.36. (i) The optimal dependence as $\alpha$ tends to 0 in Theorem 7.35 is $A(\alpha) = \Theta\left( (\log(2/\alpha)/\alpha)^{1/2} \right)$. The upper bound will be shown in Section 7.2.6.2, where the Theorem is proved; see Exercise 7.16 for a slightly weaker lower bound. An alternative approach to Theorem 7.35 (with a simpler proof, but worse dependence on $\alpha$) is via Theorem 7.42. (ii) In the context of Theorem 7.35, the parameter $A$ is often called the distortion (of the $\ell_1$-norm over $E$). However, an alternative (and arguably better, see Section 7.2.7) definition of the distortion of a subspace $E \subset \mathbb{R}^n$ is the ratio between the maximum and the minimum of the function $\|x\|_1$ over $S^{n-1} \cap E$. This is because for $A < \sqrt{\pi/2}$ the inequality $\|x\|_1 \ge A^{-1} \sqrt{n}\, |x|$ may hold for all $x \in E$ only if $\dim E$ is small (depending on $A$) [SW].
Exercise 7.16 (Simple lower bound on $\ell_1$ distortion). Let $a \in (0, 1]$ and let $E \subset \mathbb{R}^n$ be a subspace such that the inequality $\|x\|_1 \ge a\sqrt{n}\, |x|$ is satisfied for all $x \in E$. Show that the codimension of $E$ is at least $a^2 n - 1$. (This is elementary.) Conclude that the optimal $A(\alpha)$ in Theorem 7.35 satisfies $A(\alpha) = \Omega(1/\sqrt{\alpha})$.

Exercise 7.17 ($p$-norms of subgaussian vectors). Let $(Y_1, \ldots, Y_n)$ be independent random variables satisfying $\|Y_i\|_{\psi_2} \le A$ for some $A \ge 0$. Denote $Y = (Y_1, \ldots, Y_n)$. Show that $\mathbf{E} \|Y\|_p \le A\sqrt{p}\, n^{1/p}$ for $1 \le p < +\infty$ and that $\mathbf{E} \|Y\|_\infty \le CA\sqrt{\log n}$.

Exercise 7.18 (Optimal almost spherical sections of the cube). Show the equivalence (i) $\iff$ (ii) in Remark 7.34.

Exercise 7.19 (Sharpness of Dvoretzky–Milman theorem for $B_p^n$). We show here that for $2 < p < \infty$, if some $k$-dimensional subspace $E \subset \mathbb{R}^n$ is such that $d_{BM}(B_p^n \cap E, B_2^k) \le 2$, then $k \le Cpn^{2/p}$, where $C$ is an absolute constant. (i) Prove that $k_*(T(B_p^n)) \le Cpn^{2/p}$ for any $T \in \mathsf{GL}(n, \mathbb{R})$. (ii) Conclude using Exercise 7.10.

Exercise 7.20 (Isomorphic vs. almost isometric Euclidean subspaces). Given $0 < \varepsilon < 1$ and $n$ large enough, here is an example of a norm $\|\cdot\|$ on $\mathbb{R}^n$ such that $|\cdot| \le \|\cdot\| \le 2|\cdot|$, while if $E \subset \mathbb{R}^n$ is a subspace such that (for some $A$)
$$\forall x \in E, \quad A|x| \le \|x\| \le (1 + \varepsilon) A |x| \tag{7.24}$$
then necessarily $\dim E = O(\varepsilon^2 n)$. We define the norm as $\|\cdot\| = |\cdot| + \|\cdot\|_p$, where $p \in [2, 4]$ is given by the relation $n^{1/p - 1/2} = 2\varepsilon$ (this is possible for $n$ large enough). Suppose that a subspace $E$ satisfies (7.24).
(i) Show that $A \ge 1 + \varepsilon/2$ and that $\varepsilon A \le 4(A - 1)$.
(ii) Show that for every $x \in E$, we have $B|x| \le \|x\|_p \le 5B|x|$ with $B = A - 1$.
(iii) Using the result of Exercise 7.19, conclude that $\dim E \le C\varepsilon^2 n$ for some absolute constant $C$.

Exercise 7.21. Fix integers $m, n \ge 1$ and consider the convex body obtained as the $\ell_1$-sum of $m$ copies of $B_2^n$
$$K = \{(x_1, \ldots, x_m) \in (\mathbb{R}^n)^m : |x_1| + \cdots + |x_m| \le 1\}.$$
Show that $k_*(K) \ge cnm$ for some absolute constant $c > 0$.
Exercise 7.22. Show that for every $\varepsilon > 0$, there is a polytope $P$ with at most $\exp(Cn/\varepsilon^2)$ vertices and at most $\exp(Cn/\varepsilon^2)$ facets, such that $(1 - \varepsilon) B_2^n \subset P \subset B_2^n$.

7.2.4.2. Schatten norms. We now consider the Schatten $p$-norms, for $p \in [1, \infty]$. Recall that $S_p^{m,n}$ is the corresponding unit ball in the space of (real or complex) $m \times n$ matrices, and $S_p^{n,sa}$ is its analogue for the space of (real or complex) self-adjoint $n \times n$ matrices. Also recall (see Corollary 1.18) that $(S_p^{m,n})^\circ = S_q^{m,n}$ and $(S_p^{n,sa})^\circ = S_q^{n,sa}$, where $q \in [1, \infty]$ is defined by $p^{-1} + q^{-1} = 1$.

Theorem 7.37 (Dvoretzky dimension for Schatten norms). Consider two integers $m \le n$, and $p \in [1, \infty]$. The Dvoretzky dimension of $S_p^{m,n}$ satisfies
$$k_*(S_p^{m,n}) \simeq \begin{cases} mn & \text{if } 1 \le p \le 2, \\ m^{2/p} n & \text{if } 2 \le p \le \infty. \end{cases}$$
Moreover, in the case $m = n$, the same estimates are true for $k_*(S_p^{n,sa})$.
Remark 7.38. We emphasize again that the constants implicit in the $\simeq$ notation are absolute and do not depend on $p, m, n$. Moreover, the proof allows one to describe the precise asymptotic behavior of $k_*(S_p^{m,n})$ and $k_*(S_p^{n,sa})$ (i.e., relations "$\sim$" in place of "$\simeq$," with reasonably explicit constants), see Exercise 7.23.
ly.

Proof. We focus primarily on the real case, the complex case being similar.
Let $q \in [1,\infty]$ be such that $1/p + 1/q = 1$. We have (see Definition 7.18)
$$k_*(S_p^{m,n}) = nm \, \mathrm{inrad}(S_p^{m,n})^2 \, w(S_q^{m,n})^2.$$
Accordingly, the Theorem will follow from the estimates
$$\mathrm{inrad}(S_p^{m,n}) = \begin{cases} m^{1/2-1/p} & \text{if } 1 \le p \le 2, \\ 1 & \text{if } 2 \le p \le \infty \end{cases}$$
and
(7.25)  $\mathbf{E}\,\|A\|_p = w(S_q^{m,n}) = \Theta(m^{1/p-1/2})$,
where A is a random matrix uniformly distributed on the Hilbert–Schmidt unit
sphere in $M_{m,n}$. The inradius is the same as in the commutative case: we are just
comparing the $\ell_p$-norm and the $\ell_2$-norm of the sequence of singular values of a
matrix (see (1.29); the comparison is formalized in (1.31)). In turn, (7.25) will be
obtained by combining well-known properties of random matrices with the relation
(A.7) between the spherical and the Gaussian mean. To that end, we note first
that once we show the following one-sided bounds for the extreme values of p
$$\text{(i)}\ \ \mathbf{E}\,\|A\|_\infty \lesssim m^{-1/2} \quad\text{and}\quad \text{(ii)}\ \ \mathbf{E}\,\|A\|_1 \gtrsim m^{1/2},$$
the remaining cases will follow by appealing again to the inequalities (1.31) relating
different Schatten p-norms
$$m^{1/p-1}\, \|\cdot\|_1 \le \|\cdot\|_p \le m^{1/p}\, \|\cdot\|_\infty .$$
7.2. SECTIONS OF CONVEX BODIES 199

Next, we know from the duality $(S_1^{m,n})^\circ = S_\infty^{m,n}$ and from Exercise 4.37 that
$$w(S_1^{m,n})\, w(S_\infty^{m,n}) \ge 1,$$
so that (ii) follows from (i) with $c = 1/C$. Finally, to justify (i), introduce a standard
Gaussian vector B in $M_{m,n}$, so that $\mathbf{E}\,\|A\|_\infty = \kappa_{mn}^{-1}\, \mathbf{E}\,\|B\|_\infty$ (in the complex case
replace $\kappa_{mn}$ by $\kappa^{\mathbb{C}}_{mn}$). Note that the random matrix $W = BB^\dagger$ is a Wishart matrix,
allowing us to use the results from Section 6.2.3. We know (see Proposition 6.31 for
the complex case and Corollary 6.38 for the real case) that $\mathbf{E}\,\|B\|_\infty \le \sqrt{m} + \sqrt{n}$.
Since $\kappa_{mn} \sim \sqrt{mn}$, this shows (i), completes the proof of (7.25) and, consequently,
of the part of the Theorem concerning $k_*(S_p^{m,n})$.
The self-adjoint version can be treated exactly the same way, using estimates
on the norm of GOE/GUE matrices (Proposition 6.24); recall that the GOE (resp.,
GUE) is essentially the standard Gaussian vector in the space of real symmetric
(resp., complex self-adjoint) matrices. □
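The norm comparison used in the proof can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `schatten_norm` and `check_norm_comparison` are illustrative and do not come from the text. It samples a matrix on the Hilbert–Schmidt unit sphere by normalizing a Gaussian matrix and evaluates both sides of $m^{1/p-1}\|\cdot\|_1 \le \|\cdot\|_p \le m^{1/p}\|\cdot\|_\infty$; the last printed column rescales $\|A\|_p$ by $m^{1/2-1/p}$, which by (7.25) should stay of order 1.

```python
import numpy as np

def schatten_norm(a, p):
    """Schatten p-norm: the l_p norm of the sequence of singular values of a."""
    s = np.linalg.svd(a, compute_uv=False)
    if np.isinf(p):
        return float(s.max())
    return float((s ** p).sum() ** (1.0 / p))

def check_norm_comparison(m, n, p, rng):
    """Return (lower, ||A||_p, upper) for one random A on the Hilbert-Schmidt sphere.

    lower = m^(1/p - 1) ||A||_1 and upper = m^(1/p) ||A||_inf, assuming m <= n.
    """
    b = rng.standard_normal((m, n))
    a = b / np.linalg.norm(b)  # Frobenius (Hilbert-Schmidt) norm equal to 1
    lower = m ** (1.0 / p - 1.0) * schatten_norm(a, 1)
    upper = m ** (1.0 / p) * schatten_norm(a, np.inf)
    return lower, schatten_norm(a, p), upper

rng = np.random.default_rng(0)
m, n = 20, 50
for p in (1, 1.5, 2, 4, 10):
    lower, value, upper = check_norm_comparison(m, n, p, rng)
    # Last column: ||A||_p rescaled by m^(1/2 - 1/p), expected to be of order 1.
    print(p, lower, value, upper, value * m ** (0.5 - 1.0 / p))
```

A single sample is of course no substitute for the expectation in (7.25); averaging over many samples would give a closer match.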

Exercise 7.23 (Sharp bounds for mean widths of Schatten balls). We consider
either the real or the complex case. (i) Fix $p \in [1,\infty]$ and let n, s tend to infinity
in such a way that $\lim \frac{s}{n} = \lambda \in [1,\infty)$. Show that the quantity $\mathbf{E}\,\|A\|_p = w(S_q^{n,s})$
appearing in (7.25) is equivalent to $\alpha_p \lambda^{-1/2} n^{1/p-1/2}$, where $\alpha_p$ is defined by
$\alpha_p^p = \int |x|^{p/2} \, d\mu_{MP(\lambda)}(x)$ for $1 \le p < \infty$, and $\alpha_\infty = 1 + \sqrt{\lambda}$. (One can check that the
product $\alpha_p \lambda^{-1/2}$ is bounded away from 0 and $+\infty$.) (ii) Fix $p \in [1,\infty]$. Show that,
as n tends to infinity, the quantity $w(S_q^{n,sa})$ is equivalent to $\beta_p n^{1/p-1/2}$, where $\beta_p$
is defined by $\beta_p^p = \int_{-2}^{2} |x|^p \, d\mu_{SC}(x)$ for $1 \le p < \infty$, and $\beta_\infty = 2$.

Exercise 7.24 (Uniformly bounded volume ratio for Schatten balls, $1 \le p \le 2$).
Using the (reverse) Santaló inequality, show that for $m \le n$ and $1 \le p \le \infty$,
$$c\, m^{1/2-1/p} \le \mathrm{vrad}(S_p^{m,n}) \le w(S_p^{m,n}) \le C\, m^{1/2-1/p}.$$
Deduce that the convex bodies $S_p^{m,n}$ have a (uniformly) bounded volume ratio if
$1 \le p \le 2$. (See Section 7.2.6.1 for the definition.)
Exercise 7.25 (Sharpness of Dvoretzky–Milman theorem for Schatten spaces).
Let $m \le n$ be integers, let $p \in [1,\infty]$, and suppose that $E \subset M_{m,n}$ is a k-dimensional
subspace such that $d_{BM}(E \cap S_p^{m,n}, B_2^k) \le 2$. The goal of this exercise is to show
that
(7.26)  $k \le C k_*(S_p^{m,n})$,
where C is an absolute constant. This shows that, for isomorphically Euclidean
sections, the Dvoretzky dimension gives a sharp bound. (Note, however, the hypothesis
$d_{BM}(E \cap S_p^{m,n}, B_2^k) \le 1 + \varepsilon$ does not imply that $k \le C\varepsilon^2 k_*(S_p^{m,n})$; exploiting this
“room for improvement” will be crucial in Chapter 8, see Remark 8.21.) Note that
(7.26) holds trivially when $1 \le p \le 2$.
(i) Show that there is a constant $C_0$ and a polytope P with at most $C_0^{m+n}$ vertices
such that $P \subset S_1^{m,n} \subset 2P$.
(ii) Using (i) and Remark 7.34, show that (7.26) holds when $p = \infty$.
(iii) Assume now that $2 \le p < \infty$, and suppose that $d_{BM}(E \cap S_p^{m,n}, B_2^k) \le 2$.
Show that $k_*(E \cap S_\infty^{m,n}) \ge ck/n^{2/p}$, and (using the previous question) that
$k \le C k_*(S_p^{m,n})$.

7.2.5. Dvoretzky’s theorem for general convex bodies. A famous consequence
of the Dvoretzky–Milman theorem is the fact that any convex body of
sufficiently large dimension admits almost Euclidean sections (see Corollary 7.40
below). It is based on the fact that n-dimensional convex bodies, which are in John
position, have Dvoretzky dimensions that are $\Omega(\log n)$.
Proposition 7.39. Let $K \subset \mathbb{R}^n$ be a convex body in John position. Then
the Dvoretzky dimension of K satisfies $k_*(K) \ge c \log n$ for some absolute constant
$c > 0$.

Corollary 7.40 (Dvoretzky’s theorem). There is a constant $c > 0$ such that
the following holds. Let K be a symmetric convex body in $\mathbb{R}^n$ (for some $n \in \mathbb{N}$) and
let $\varepsilon > 0$. Then there exists a subspace $E \subset \mathbb{R}^n$ of dimension at least $c\varepsilon^2 \log n$ such
that
$$d_g(K \cap E, B_2^E) \le 1 + \varepsilon,$$
where $B_2^E$ is the Euclidean unit ball in E.
If K is a non-symmetric convex body, the same conclusion holds for some
k-dimensional affine subspace E and the corresponding notion of distance.
Proof of Corollary 7.40. If K is in John position, the conclusion follows
immediately from Proposition 7.39 and Theorem 7.19. For a general convex body
$K \subset \mathbb{R}^n$, we know from Proposition 4.7 that there is a linear map T such that TK is
in John position. Therefore there exists a subspace E with dimension $c\varepsilon^2 \log n$ such
that $d_g(T(K) \cap E, B_2^E) \le 1 + \varepsilon$. It follows that there is an ellipsoid $\mathcal{E} \subset T^{-1}(E)$
such that
$$\mathcal{E} \subset K \cap E \subset (1 + \varepsilon)\mathcal{E}.$$
We now use the result from Exercise 1.25 to conclude that $\mathcal{E}$ can be replaced
by a multiple of the Euclidean ball if we replace E by a subspace $F \subset E$ with
$\dim F = [\tfrac{1}{2} \dim E]$.
Finally, the same argument works for non-symmetric convex bodies, except
that the subspace E is affine. □


The key estimate needed for the proof of Proposition 7.39 is the following
lemma, known as the Dvoretzky–Rogers lemma.
Lemma 7.41 (Dvoretzky–Rogers lemma). Let $K \subset \mathbb{R}^n$ be a convex body which
is in John position. Then there exists an orthonormal basis $(x_k)_{1 \le k \le n}$ such that,
for any $1 \le k \le n$,
$$\|x_k\|_K \ge \sqrt{k/n}.$$
Proof. The Lemma is a consequence of the following claim: under the hypotheses
of the Lemma, any m-dimensional subspace $F \subset \mathbb{R}^n$ contains a vector x
with $|x| = 1$ and $\|x\|_K \ge \sqrt{m/n}$. Indeed, we construct successively $x_n, \dots, x_1$ and
obtain $x_k$ by applying the claim to the subspace orthogonal to $\{x_i : i > k\}$.
To prove the claim, consider a resolution of identity $(c_i, x_i)$ given by Proposition
4.7. Recall that $x_i \in \partial K \cap B_2^n$ are contact points; in particular it follows that K
is contained in each half-space $\{\langle \cdot, x_i \rangle \le 1\}$, or that $\|\cdot\|_K \ge \langle \cdot, x_i \rangle$. Given an
m-dimensional subspace $F \subset \mathbb{R}^n$, we have
$$P_F = \sum_i c_i\, P_F\, |x_i\rangle\langle x_i|.$$
Taking the trace gives $m = \sum_i c_i |P_F x_i|^2$. Since $\sum_i c_i = n$, there exists an index j
with $|P_F x_j| \ge \sqrt{m/n}$. Let $x = P_F x_j / |P_F x_j|$. We have
$$\|x\|_K \ge \langle x, x_j \rangle = |P_F x_j| \ge \sqrt{m/n}.$$ □
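The trace argument can be illustrated on the cube $B_\infty^n$, which is in John position with contact points $\pm e_i$ and weights $c_i = 1$. The sketch below assumes NumPy; the function name `projected_contact_norms` and the choice of a random subspace are illustrative only. It checks that $\sum_i |P_F e_i|^2 = m$ and builds the unit vector $x = P_F e_j / |P_F e_j|$ from the claim.

```python
import numpy as np

def projected_contact_norms(n, m, rng):
    """Return (q, norms): an orthonormal basis q of a random m-dimensional
    subspace F of R^n, and the norms |P_F e_i| of the projected basis vectors."""
    q, _ = np.linalg.qr(rng.standard_normal((n, m)))  # columns span F
    # P_F = q q^T, hence |P_F e_i| is the Euclidean norm of the i-th row of q.
    return q, np.linalg.norm(q, axis=1)

rng = np.random.default_rng(0)
n, m = 40, 10
q, norms = projected_contact_norms(n, m, rng)

# Trace identity from the proof (with c_i = 1 for the cube).
print(np.sum(norms ** 2), m)

# Pick the index j with the largest projection and form x = P_F e_j / |P_F e_j|.
j = int(np.argmax(norms))
x = q @ q[j] / norms[j]
sup_norm = np.abs(x).max()  # the gauge of the cube is the sup norm
print(np.linalg.norm(x), sup_norm, np.sqrt(m / n))
```

In this run the sup norm of x is at least $|P_F e_j|$, which in turn is at least $\sqrt{m/n}$, as the claim predicts.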
We can now complete the proof of Proposition 7.39.
Proof of Proposition 7.39. Let K be a convex body in John position, and
let X be a random vector uniformly distributed on $S^{n-1}$. Since $\mathrm{inrad}(K) = 1$, it
suffices to prove in view of Definition 7.18 that
$$\mathbf{E}\,\|X\|_K \ge c \sqrt{\log n / n}$$
for some constant c.
We know from Lemma 7.41 that there exists an orthonormal family of $n/4$
vectors $(x_i)$ with $\|x_i\|_K \ge 1/2$. In particular, we have $\|\cdot\|_K \ge \frac{1}{2} \max\{\langle \cdot, x_i \rangle : 1 \le
i \le n/4\}$. Consequently, if G denotes a standard Gaussian vector in $\mathbb{R}^n$, then
$$\mathbf{E}\,\|X\|_K = \frac{1}{\kappa_n}\, \mathbf{E}\,\|G\|_K \ge \frac{1}{2\kappa_n}\, \mathbf{E} \max\{\langle G, x_i \rangle : 1 \le i \le n/4\}.$$
The random variables $\langle G, x_i \rangle$ are i.i.d. standard normal variables, and therefore the
expectation of their maximum is of order $\sqrt{\log n}$ by Lemma 6.1, as needed. □
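The last step relies on the growth of the expected maximum of independent standard normal variables. A small Monte Carlo sketch, using only the Python standard library, compares an empirical mean with $\sqrt{2 \log N}$; the function name `mean_max_gaussian` and the sample sizes are illustrative assumptions, and the comparison with $\sqrt{2\log N}$ is the standard upper bound rather than a statement quoted from Lemma 6.1.

```python
import math
import random

def mean_max_gaussian(count, trials, seed=0):
    """Monte Carlo estimate of E max(g_1, ..., g_count) for i.i.d. N(0,1) variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(count))
    return total / trials

for count in (16, 256, 4096):
    estimate = mean_max_gaussian(count, trials=200)
    # The second printed value, sqrt(2 log N), is an upper bound on the expectation.
    print(count, estimate, math.sqrt(2.0 * math.log(count)))
```

The estimates grow with the number of variables and stay within a constant factor of $\sqrt{2 \log N}$, which is all that the proof uses.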
Exercise 7.26 (Complex version of Dvoretzky’s theorem). Check that Corollary
7.40 remains valid for a circled convex body $K \subset \mathbb{C}^n$.
Exercise 7.27 (Simultaneous spherical sections for a set and its polar). (i)
Show that the following holds for some constant $c > 0$: for every symmetric convex
body $K \subset \mathbb{R}^n$ there is a k-dimensional subspace $E \subset \mathbb{R}^n$ with $k = c \log n$ such
that both $K \cap E$ and $P_E K$ (or, equivalently, $K^\circ \cap E$) are 2-Euclidean. (ii) Can we
choose a position of K such that the conclusion is valid for most subspaces E?
7.2.6. Related results.
7.2.6.1. Volume ratio. Define the volume ratio of a convex body $K \subset \mathbb{R}^n$ as
$$\mathrm{vr}(K) = \left( \frac{\mathrm{vol}(K)}{\mathrm{vol}(\mathrm{John}(K))} \right)^{1/n}.$$
The quantity $\mathrm{vr}(K)$ is an affine invariant. Consequently, if $K = B_X$, it makes
sense to denote $\mathrm{vr}(X) = \mathrm{vr}(K)$. Examples of convex bodies with “bounded volume
ratio” (i.e., bounded by a dimension-independent constant) include $B_p^n$, $S_p^{m,n}$ and
$S_p^{n,sa}$ for $1 \le p \le 2$. For $B_p^n$, this is a consequence of the computations from
Section 4.3.3 (Table 4.1, Exercises 4.39 and 4.40). For the Schatten spaces, the
boundedness follows from the proof of Theorem 7.37 (see also Exercise 7.24). The
following theorem asserts that bodies (resp., spaces) with bounded volume ratio
always have nearly Euclidean sections (resp., subspaces) of proportional dimension,
for arbitrary proportion $\alpha \in (0,1)$.
Theorem 7.42 (not proved here). Let $K \subset \mathbb{R}^n$ be a convex body in John position
and denote $A = \mathrm{vr}(K)$. Let $E \subset \mathbb{R}^n$ be a random k-dimensional subspace. Then,
with probability larger than $1 - e^{-n}$,
$$B_2^E \subset K \cap E \subset (CA)^{\frac{n}{n-k}} B_2^E,$$
where C is an absolute constant.
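To make the notion of bounded volume ratio concrete, here is a minimal sketch that evaluates $\mathrm{vr}(B_1^n)$ and $\mathrm{vr}(B_\infty^n)$ from closed formulas. It relies on standard facts not restated in this section: $\mathrm{vol}(B_1^n) = 2^n/n!$, $\mathrm{vol}(B_2^n) = \pi^{n/2}/\Gamma(n/2+1)$, the John ellipsoid of $B_1^n$ is $n^{-1/2}B_2^n$, and that of the cube is $B_2^n$. The function names are illustrative.

```python
import math

def log_vol_euclidean_ball(n):
    """log vol(B_2^n) = (n/2) log(pi) - log Gamma(n/2 + 1)."""
    return 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1.0)

def volume_ratio_cross_polytope(n):
    """vr(B_1^n), with vol(B_1^n) = 2^n / n! and John ellipsoid n^(-1/2) B_2^n."""
    log_vol = n * math.log(2.0) - math.lgamma(n + 1.0)
    log_john = -0.5 * n * math.log(n) + log_vol_euclidean_ball(n)
    return math.exp((log_vol - log_john) / n)

def volume_ratio_cube(n):
    """vr(B_inf^n), with vol(B_inf^n) = 2^n and John ellipsoid B_2^n."""
    return math.exp((n * math.log(2.0) - log_vol_euclidean_ball(n)) / n)

for n in (1, 10, 100, 1000):
    print(n, volume_ratio_cross_polytope(n), volume_ratio_cube(n))
```

The first column stays bounded (it approaches $\sqrt{2e/\pi}$), while the second grows like $\sqrt{n}$, so the cube does not have bounded volume ratio.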

In general, the bounded volume ratio property is inherited by subspaces only if
the dimension of the space and that of the subspace are comparable (Exercise 7.28).
However, all subspaces of the classical and non-commutative $L_p$-spaces alluded to
above do have uniformly bounded volume ratio. This is due to the fact that they
possess the so-called cotype 2 property, which is clearly inherited by subspaces and
which is known to imply the bounded volume ratio (see Notes and Remarks).
An important instance of Theorem 7.42 is the following striking fact.
Corollary 7.43 (see Exercise 7.29). Let $n = 2k$; there exist $E_1, E_2 \subset \mathbb{R}^n$ with
$\dim E_1 = \dim E_2 = k$ and $E_1 \perp E_2$ such that
(7.27)  $c|x| \le n^{-1/2} \|x\|_1 \le |x|$  for $x \in E_i$, $i = 1, 2$,
where $c > 0$ is a universal constant. Similarly, if $n = 3k$, there exist mutually
orthogonal k-dimensional subspaces $E_1, E_2, E_3$ such that the bounds from (7.27)
hold for $x \in E_i + E_j$, for any $\{i, j\} \subset \{1, 2, 3\}$.
The property expressed by (7.27) is usually referred to as the Kashin decomposition
of $\ell_1^n$. Another statement closely related to Theorem 7.42 is the following.
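The following sketch, assuming NumPy, samples the quantity $n^{-1/2}\|x\|_1$ from (7.27) on unit vectors of a random subspace $E_1$ and of its orthogonal complement $E_2$. The function name `sampled_l1_ratios` and the sampling scheme are illustrative. Random sampling only produces an upper estimate of the true minimum over each subspace, so this illustrates the statement and is not a certificate that (7.27) holds.

```python
import numpy as np

def sampled_l1_ratios(k, samples, rng):
    """Sample n^(-1/2) ||x||_1 over unit vectors x of two random orthogonal
    k-dimensional subspaces E_1, E_2 of R^n, n = 2k. Returns two arrays."""
    n = 2 * k
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal matrix
    ratios = []
    for basis in (q[:, :k], q[:, k:]):  # orthonormal bases of E_1 and E_2
        coeffs = rng.standard_normal((k, samples))
        coeffs /= np.linalg.norm(coeffs, axis=0)  # unit coefficient vectors
        x = basis @ coeffs                         # unit vectors of R^n inside E_i
        ratios.append(np.abs(x).sum(axis=0) / np.sqrt(n))
    return ratios

rng = np.random.default_rng(0)
ratios_e1, ratios_e2 = sampled_l1_ratios(k=100, samples=500, rng=rng)
print(ratios_e1.min(), ratios_e1.max())
print(ratios_e2.min(), ratios_e2.max())
```

The upper bound 1 is the Cauchy–Schwarz inequality; the interesting part of (7.27) is that the sampled values stay away from 0 on both subspaces simultaneously.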

Theorem 7.44 (not proved here). Let $K \subset \mathbb{R}^n$ be a convex body in John position
and denote $A = \mathrm{vr}(K)$. There is an orthogonal transformation $U \in O(n)$ such that
$K \cap UK \subset 8A^2 B_2^n$.
Exercise 7.28 (Volume ratio of subspaces). (a) Let $K \subset \mathbb{R}^n$ be a symmetric
convex body and $E \subset \mathbb{R}^n$ be a k-dimensional subspace. Show that $\mathrm{vr}(K \cap E) \le
(C\, \mathrm{vr}(K))^{n/k}$. (b) Give examples of symmetric convex bodies $K \subset \mathbb{R}^n$ and
subspaces $E \subset \mathbb{R}^n$ such that the ratio $\mathrm{vr}(K \cap E)/\mathrm{vr}(K)$ is arbitrarily large.
Exercise 7.29 (Kashin decomposition via volume ratio). (i) Derive Corollary
7.43 from Theorem 7.35. (ii) Show that the assertion of Corollary 7.43 holds for
spaces X with uniform bound on their volume ratios (i.e., with constant c depending
only on $\mathrm{vr}(X)$).
Exercise 7.30 (A dual Kashin decomposition). Show that, for any $n \in \mathbb{N}$, there
is an orthogonal transformation $U \in O(n)$ such that $c\sqrt{n}\, B_2^n \subset \mathrm{conv}(B_\infty^n, U B_\infty^n)$,
where $c > 0$ is an absolute constant. Note that we always have $\mathrm{conv}(B_\infty^n, U B_\infty^n) \subset
\sqrt{n}\, B_2^n$.
7.2.6.2. The low-$M^*$ estimate and the proof of Theorem 7.35. Let $K \subset \mathbb{R}^n$ be a
symmetric convex body. The argument from Exercise 7.12 shows that sections of K
of dimension larger than $k_*(K)$ cannot be isomorphically Euclidean. Remarkably,
“one half” of the estimates (7.12) persists: an avatar of the lower bound remains
valid for subspaces of proportional dimension.
Theorem 7.45 (Low-$M^*$ estimate). Let K be either a convex body in $\mathbb{R}^n$
containing 0 in the interior or a circled convex body in $\mathbb{C}^n$, and $M^* = w(K)$. Let
$0 < \alpha < 1$ and $k = n(1 - \alpha)$. Then, with probability larger than $1 - \exp(-c\alpha n)$, a
random k-dimensional subspace E satisfies
(7.28)  $\forall x \in E, \quad \dfrac{c\sqrt{\alpha}}{M^*}\, |x| \le \|x\|_K$,
where $c > 0$ is an absolute constant.

If we denote (as in Theorem 7.19) $w(K^\circ)$ by M, we recall that $MM^* \ge 1$ (see
Exercise 4.37), so that the lower bound in (7.28) is always worse than the lower
bound in (7.12). However, when a good upper bound on the product $MM^*$ is
present (which is always the case for some choice of the Euclidean structure, see
Theorem 7.10), both estimates become comparable.

Proof. We give a proof (valid only in the real case) based on Proposition 6.42.
Consider $L = S^{n-1} \cap tK$ for $t > 0$ to be chosen later. We have $w_G(L) \le w_G(tK) =
t\, w_G(K) = t \kappa_n M^*$. We now choose t such that $t \kappa_n M^* = \frac{1}{2} \kappa_{n-k}$; this implies
$t \ge c\sqrt{\alpha}/M^*$ for some $c > 0$ because $\kappa_m \sim \sqrt{m}$. Proposition 6.42 implies then
that, with high probability, a random subspace $E \in \mathrm{Gr}(k, \mathbb{R}^n)$ does not intersect L.
This is equivalent to the fact that the inequality $\|\cdot\|_K > t|\cdot|$ holds on E. □

Proof of Theorem 7.35. We argue as in the proof of Theorem 7.45 specified
to $K = B_1^n$; the only modification comes in upper-bounding $w_G(L)$. Denote $\tilde{L} =
B_2^n \cap tB_1^n$ ($t \in [1, \sqrt{n}]$ to be chosen later), then clearly
$$w_G(L) \le w_G(\tilde{L}).$$
(We actually have equality since $\tilde{L} = \mathrm{conv}\, L$; this is a fairly easy consequence of the
fact that no extreme point of $tB_1^n$ lies inside $B_2^n$.) Next, $\tilde{L}^\circ = \mathrm{conv}\left(B_2^n \cup t^{-1} B_\infty^n\right)$
by (1.15) and so we have
$$w_G(\tilde{L}^\circ) \ge w_G(t^{-1} B_\infty^n) = t^{-1} \sqrt{\frac{2}{\pi}}\; n,$$
see Table 4.1 and Exercise 6.6 for the equality. Given that $\tilde{L}$ is permutationally
symmetric, it has enough symmetries and hence it is in the $\ell$-position (see Section
4.2.2 and particularly Proposition 4.8). Accordingly, Proposition 7.6 applies and
shows that
$$w_G(\tilde{L})\, w_G(\tilde{L}^\circ) \le n K(\tilde{L}).$$
Further, since $\tilde{L}$ is unconditional, it follows from Remark 7.8 that
$$K(\tilde{L}) \le C \left(1 + \log d(\tilde{L}, B_2^n)\right)^{1/2} = C \left(1 + \log(\sqrt{n}/t)\right)^{1/2}.$$
Combining the above inequalities yields
$$w_G(L) \le w_G(\tilde{L}) \le C \sqrt{\frac{\pi}{2}}\; t \left(1 + \log(\sqrt{n}/t)\right)^{1/2}.$$
As in the proof of Theorem 7.45, we now choose t so that
$$2C \sqrt{\frac{\pi}{2}}\; t \left(1 + \log(\sqrt{n}/t)\right)^{1/2} = \kappa_{n-k} \sim \sqrt{\alpha n},$$
which can be rewritten as $g(\lambda) \sim c\alpha^{-1/2}$, where $g(x) = x(1 + \log x)^{-1/2}$, $\lambda = \sqrt{n}/t$
and $c = \sqrt{2\pi}\, C$. Solving for $\lambda$ we obtain $\lambda \simeq \alpha^{-1/2} \left(\log(2/\alpha)\right)^{1/2}$, whence $t =
\sqrt{n}/\lambda \simeq \left(\alpha/\log(2/\alpha)\right)^{1/2} \sqrt{n}$, as needed. (We are using here the fact that if $\beta \in \mathbb{R}$ is
fixed and if $y = g(x) := x(1 + \log x)^\beta$, then the inverse function, which is defined
for sufficiently large y, satisfies $g^{-1}(y) \sim y(1 + \log y)^{-\beta}$ as $y \to \infty$.) □
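The asymptotic inversion used at the end of the proof can be checked numerically. The sketch below, using only the standard library, inverts $g(x) = x(1+\log x)^{\beta}$ by bisection on a logarithmic scale and compares the result with $y(1+\log y)^{-\beta}$ for $\beta = -1/2$. The function names and the bracketing interval are illustrative assumptions; the bisection presumes that g is increasing on the bracket, which holds on $[1, \infty)$ when $\beta \ge -1$.

```python
import math

def g(x, beta):
    """g(x) = x (1 + log x)^beta, defined for x >= 1."""
    return x * (1.0 + math.log(x)) ** beta

def g_inverse(y, beta, lo=1.0, hi=1e30):
    """Solve g(x) = y on [lo, hi] by bisection on a logarithmic scale.

    Assumes g is increasing on [lo, hi] and g(lo) <= y <= g(hi).
    """
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if g(mid, beta) < y:
            lo = mid
        else:
            hi = mid
    return lo

def inversion_ratio(y, beta):
    """Ratio between the exact inverse and the approximation y (1 + log y)^(-beta)."""
    return g_inverse(y, beta) / (y * (1.0 + math.log(y)) ** (-beta))

for y in (1e3, 1e6, 1e12, 1e24):
    print(y, inversion_ratio(y, beta=-0.5))
```

The printed ratios decrease towards 1 as y grows, although the convergence is only logarithmic in y.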

7.2.6.3. The quotient of a subspace theorem. It follows from Corollary 7.40 that
any convex body $K \subset \mathbb{R}^n$ admits isomorphically Euclidean sections of dimension
$\Omega(\log n)$. Dually, any convex body admits orthogonal projections of the same
dimension which are isomorphically Euclidean. The bound $\Omega(\log n)$ cannot be
improved, as shown by the case of the cube (for sections) or of the $\ell_1^n$ ball (for
projections). However, it turns out that combining both operations leads to a surprising
phenomenon: every convex body admits a projection of a section of proportional
dimension which is isomorphically Euclidean.

Theorem 7.46 (Quotient of a subspace theorem, not proved here). Given a
symmetric convex body $K \subset \mathbb{R}^n$ and $\alpha \in (0,1)$, there exist subspaces $E \subset F \subset \mathbb{R}^n$
with $\dim E \ge (1 - \alpha)n$ such that
$$d_{BM}(P_E(K \cap F), B_2^{\dim E}) \le C\alpha^{-1} \log(C\alpha^{-1}).$$
We note that an “almost isometric” version of the quotient of a subspace theorem
follows then by appealing to Remark 7.22.

Exercise 7.31 (Quotient of a subspace = subspace of a quotient). Show that
given a decomposition $\mathbb{R}^n = E \oplus F \oplus G$ into orthogonal subspaces, we have, for
$K \subset \mathbb{R}^n$
$$(P_{E \oplus F} K) \cap E = P_E(K \cap (E \oplus G)).$$
Conclude that the class of sections of projections of K coincides with the class of
projections of sections of K.
Exercise 7.32 (Combining quotient and subspace operations is necessary).
Give an example of a family of convex bodies of growing dimension which has
neither sections nor projections of proportional dimension which are isomorphically
Euclidean. Therefore, in general, combining both operations is really needed for
Theorem 7.46 to be valid.
Exercise 7.33 (Quotient of a subspace implies reverse Santaló). We show here
how to derive the reverse Santaló inequality from the quotient of a subspace theorem
(Theorem 7.46). Let $K \subset \mathbb{R}^n$ be a symmetric convex body, and $\mathbb{R}^n = E_1 \oplus E_2 \oplus E_3$
be an orthogonal decomposition. Let $n_i = \dim E_i$. Denote $K_1 = P_{E_1}(K \cap (E_1 \oplus
E_2))$, $K_2 = K \cap E_2$, and $K_3 = P_{E_3} K$; these are convex bodies in, respectively,
$E_1$, $E_2$, and $E_3$.
(i) Check by applying Lemma 4.20 twice that $\mathrm{vol}(K) \ge \frac{1}{4^n} \mathrm{vol}(K_1) \mathrm{vol}(K_2) \mathrm{vol}(K_3)$
and $\mathrm{vol}(K^\circ) \ge \frac{1}{4^n} \mathrm{vol}(K_1^\circ) \mathrm{vol}(K_2^\circ) \mathrm{vol}(K_3^\circ)$.
(ii) Given a convex body $L \subset \mathbb{R}^k$, define $\alpha(L) = \mathrm{vrad}(L)\, \mathrm{vrad}(L^\circ)$. Show that, for
some constant c, $\alpha(K)^n \ge c^n \alpha(K_1)^{n_1} \alpha(K_2)^{n_2} \alpha(K_3)^{n_3}$.
(iii) By Theorem 7.46, we may assume that $n_1 = n/2$, and that $K_1$ is A-Euclidean
for some absolute constant A. Show that $\alpha(K_1) \ge A^{-1}$. If $\beta_N$ denotes the infimum
of $\alpha(K)$ over all symmetric convex bodies of dimension at most N, conclude that
$\beta_N \ge c^2/A$.
7.2.6.4. Approximation of zonoids by zonotopes. We first state a reformulation
of Dvoretzky’s theorem for $\ell_1^n$.
Theorem 7.47 (see Exercise 7.34). For any $n \in \mathbb{N}$, $\varepsilon > 0$, there exists an
integer $N \le Cn/\varepsilon^2$ and vectors $x_1, \dots, x_N \in \mathbb{R}^n$ such that $Z \subset B_2^n \subset (1 + \varepsilon)Z$,
where Z denotes the zonotope
(7.29)  $Z = [-x_1, x_1] + \cdots + [-x_N, x_N]$.
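In the plane the statement can be made completely explicit. The sketch below, using only the standard library, takes N equally spaced segments, uses the support function $h_Z(u) = \sum_j |\langle x_j, u\rangle|$ of the zonotope (7.29), and measures how well Z approximates the unit disc. The function names are illustrative, and the normalization relies on the elementary identity $\max_\varphi \sum_{j<N} |\cos(\varphi - j\pi/N)| = 1/\sin(\pi/(2N))$, which is an assumption of this sketch rather than a statement from the text.

```python
import math

def zonotope_support(generators, u):
    """Support function of Z = sum_j [-x_j, x_j]: h_Z(u) = sum_j |<x_j, u>|."""
    return sum(abs(x * u[0] + y * u[1]) for x, y in generators)

def disc_zonotope(num_segments):
    """Generators of N equally spaced segments, scaled so that Z lies in the unit disc."""
    scale = math.sin(math.pi / (2 * num_segments))  # 1 / max of the unscaled support function
    return [(scale * math.cos(math.pi * j / num_segments),
             scale * math.sin(math.pi * j / num_segments))
            for j in range(num_segments)]

def approximation_constants(generators, grid=3600):
    """Return (max h_Z, 1 / min h_Z) over a grid of unit directions.

    Z is inside the disc iff max h_Z <= 1, and the disc is inside (1 + eps) Z
    with 1 + eps = 1 / min h_Z.
    """
    values = [zonotope_support(generators, (math.cos(t), math.sin(t)))
              for t in (2.0 * math.pi * i / grid for i in range(grid))]
    return max(values), 1.0 / min(values)

for num_segments in (4, 16, 64):
    print(num_segments, approximation_constants(disc_zonotope(num_segments)))
```

In dimension 2 the error decays like $1/N^2$, which is much better than the general bound $N \le Cn/\varepsilon^2$; the theorem is about the dependence on the dimension n.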

It is natural to ask whether a version of Theorem 7.47 holds when the Euclidean
ball is replaced by an arbitrary zonoid. The best result in this direction is the
following.
Theorem 7.48 (not proved here). For any 0-symmetric zonoid $Y \subset \mathbb{R}^n$ and
$\varepsilon > 0$, there exists an integer $N \le Cn \log(n)/\varepsilon^2$ and vectors $x_1, \dots, x_N \in \mathbb{R}^n$ such
that $Z \subset Y \subset (1 + \varepsilon)Z$, where Z denotes the zonotope (7.29). Moreover, we can
ensure that $\mathrm{supp}\, \mu_Z \subset \mathrm{supp}\, \mu_Y$, where the measures $\mu_Y$, $\mu_Z$ are defined in (4.8).
Exercise 7.34 (Approximating balls by zonotopes via Dvoretzky’s theorem).
Prove Theorem 7.47 using the fact that the Dvoretzky dimension of $B_1^n$ is of order
n (Theorem 7.31).
7.2.6.5. The Johnson–Lindenstrauss lemma.
Theorem 7.49 (Johnson–Lindenstrauss lemma). Let A be a finite subset of
$\mathbb{R}^n$, $m = \mathrm{card}\, A$, and $\varepsilon \in (0,1)$. If $k \ge 4\varepsilon^{-2} \log m$, there exists a linear map
$f : \mathbb{R}^n \to \mathbb{R}^k$ such that, for every $x, y \in A$,
(7.30)  $(1 - \varepsilon)|x - y| \le |f(x) - f(y)| \le (1 + \varepsilon)|x - y|$.
Proof. We show that a random choice for f satisfies (7.30) with high probability.
Let $B : \mathbb{R}^n \to \mathbb{R}^k$ be a random matrix with i.i.d. $N(0,1)$ entries. For
every unit vector $u \in \mathbb{R}^n$, $Bu$ is a standard Gaussian vector in $\mathbb{R}^k$, and the random
variable $|Bu|^2$ follows the $\chi^2(k)$ distribution. Denoting by $M_k^2$ the median of the
$\chi^2(k)$ distribution, it follows from Theorem 5.24 that for any $t > 0$,
$$\mathbf{P}\left( \big| |Bu| - M_k \big| > t \right) \le \exp(-t^2/2).$$
Define f as $\frac{1}{M_k} B$. Given $x, y \in A$, we apply the above inequality for $u =
(x - y)/|x - y|$ and $t = \varepsilon M_k$ to obtain
$$\mathbf{P}\left( \big| |f(x) - f(y)| - |x - y| \big| > \varepsilon |x - y| \right) \le \exp(-\varepsilon^2 M_k^2/2).$$
We now take the union bound over the $\binom{m}{2} \le m^2/2$ pairs of points from A. It
follows that (7.30) is satisfied whenever $\frac{m^2}{2} \cdot \exp(-\varepsilon^2 M_k^2/2) < 1$, i.e., $2 \log m <
\varepsilon^2 M_k^2/2 + \log 2$. Since $M_k^2 \ge k - 2/3$ (see Exercise 5.34), this condition is satisfied
provided $k \ge 4\varepsilon^{-2} \log m$. □
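The random construction from the proof is easy to try out. The sketch below assumes NumPy; the names `jl_embed` and `distortion_range` are illustrative. One deliberate deviation from the proof is labeled in the code: the Gaussian matrix is normalized by $\sqrt{k}$ instead of the median $M_k$ (the two differ only slightly, since $M_k^2 \ge k - 2/3$).

```python
import math
import numpy as np

def jl_embed(points, eps, rng):
    """Map the rows of `points` from R^n to R^k, k = ceil(4 eps^-2 log m),
    with a random Gaussian matrix."""
    m, n = points.shape
    k = math.ceil(4.0 * math.log(m) / eps ** 2)
    b = rng.standard_normal((k, n))
    # Assumption of this sketch: normalize by sqrt(k) rather than by the median M_k.
    return points @ b.T / math.sqrt(k)

def distortion_range(points, images):
    """Smallest and largest ratio |f(x) - f(y)| / |x - y| over pairs of distinct rows."""
    ratios = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            ratios.append(np.linalg.norm(images[i] - images[j])
                          / np.linalg.norm(points[i] - points[j]))
    return min(ratios), max(ratios)

rng = np.random.default_rng(0)
points = rng.standard_normal((40, 500))  # m = 40 points in R^500
images = jl_embed(points, eps=0.5, rng=rng)
lowest, highest = distortion_range(points, images)
print(images.shape, lowest, highest)
```

With these parameters the target dimension is 60, and the observed distortions are typically well inside the interval $(1-\varepsilon, 1+\varepsilon)$, reflecting the slack in the union bound.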


7.2.7. Constructivity. A general feature of the proofs of most of the theorems
in this chapter is a heavy use of the probabilistic method. For example,
the existence of a subspace satisfying the conclusion of Dvoretzky’s theorem or its
variants is proved by selecting it at random according to the unitarily invariant
measure on the corresponding Grassmannian (after a suitable Euclidean structure
has been chosen) or by using random matrices. Random constructions benefit from
the blessing of dimensionality, as opposed to the curse of dimensionality, which
renders an exhaustive search (and many deterministic algorithms) infeasible.
However, for theoretical and practical reasons, existence results are often
unsatisfactory. For example, to write a computer code implementing an error-correcting
algorithm one needs a specific encoding matrix. This leads to the class of problems
asking for explicit versions of, or pseudo-random models for, objects whose
constructions involve probabilistic arguments. By “explicit” we mean here an algorithm
whose complexity is manageable (say, with running time being polynomial
in the dimension). Individual constructions are often “more explicit” than that:
they may involve, e.g., closed formulas. An alternative to an explicit solution may
be a guarantee that we can efficiently check whether a randomly generated object
actually does the job.
When the initial setting is completely abstract, it seems unrealistic to expect
any meaningful statement. We therefore mostly concentrate on standard convex
bodies. Here is an example of a satisfactory result.
Theorem 7.50 (Explicit quotient of a subspace theorem for the simplex, not
proved here). Given $n \in \mathbb{N}$, there exists a set $S \subset \mathbb{R}^n$ which is an explicit affine
image of an explicit section of the 5n-dimensional simplex and which verifies
$$B_2^n \subset S \subset CB_2^n.$$
Moreover, C can be replaced by $1 + \varepsilon$ for $\varepsilon \in (0,1)$, if we use a simplex of dimension
$\ge C_1 n \log(2/\varepsilon)$.

Another result, for which substantial effort has been devoted to derandomization,
is Dvoretzky’s theorem for $B_1^n$ (or $\ell_1^n$). Recall that the ($\ell_1$-)distortion of
a subspace $E \subset \mathbb{R}^n$ is the ratio between the maximum and the minimum of the
function $\|x\|_1$ over $S^{n-1} \cap E$. We already showed, via the probabilistic method, the
existence of subspaces of proportional dimension with arbitrarily small distortion
(Theorem 7.31) and the existence of subspaces of arbitrarily large proportional
dimension with bounded distortion (Theorem 7.42). The randomness relied on the
Haar measure on the Grassmann manifold, which requires an infinite amount of random
bits to be exactly simulated. However, a careful look at the arguments shows
that the same conclusion can be derived using only $O(n^2)$ random bits.
A natural step towards explicit examples is randomness reduction: can we
match, or approach, the optimal dimension and distortion bounds using fewer random
bits? We point out that constructions using $O(\log n)$ random bits are very close to
being explicit, since we can then perform an exhaustive search among the polynomially
many possible bit strings. However, it is not clear whether the distortion of a given
subspace can be efficiently estimated; the following seems to be unknown.
Problem 7.51. Is the problem of calculating (or approximating well enough)
the $\ell_1$-distortion of a general subspace $E \subset \mathbb{R}^n$ NP-hard?


The best results known to the authors and directed towards constructing explicit
subspaces of $\ell_1^n$ (going in several different directions) are gathered in Table
7.1. One result that “doesn’t fit” in the table is the following.
Theorem 7.52 (not proved here). Given $n \in \mathbb{N}$, $p \in (1,2)$ and $\eta \in (0,1)$, there
is an explicitly defined subspace $E \subset \mathbb{R}^n$ of dimension $(1 - \eta)n$ such that
(7.31)  $d_g(B_1^n \cap E, B_p^n \cap E) \le (1/\eta)^{O\left((2-p)^{-1}\right)}$.
In the language of this section, (7.31) gives a bound on the distortion of the $\ell_1$-norm
on the sphere of $\ell_p^n$ intersected with E.
In a different direction, we state a result which derandomizes Dvoretzky’s theorem
(Corollary 7.40) simultaneously for a wide class of convex bodies.
Theorem 7.53 (not proved here). Given $n \in \mathbb{N}$ and $\varepsilon \in (0,1)$, there is an
explicitly defined subspace $E \subset \mathbb{R}^n$ of dimension $k = c \log n / \log(1/\varepsilon)$ such that the
following holds. If $K \subset \mathbb{R}^n$ is a convex body invariant under the isometry group of
the cube (i.e., permutation of coordinates and sign flips) then
$$d_g(K \cap E, B_2^E) \le 1 + \varepsilon.$$

Table 7.1. The best known results for constructing almost-Euclidean
sections of $B_1^n$. The parameters $\varepsilon, \eta, \gamma \in (0,1)$ are assumed
to be constants, although we explicitly point out when the
dependence on them is subsumed by the big-Oh or the little-oh
notation.

Reference        Distortion                            Subspace dimension                                Randomness
[Ind07]          $1+\varepsilon$                       $n^{1-o_\varepsilon(1)}$                          explicit
[GLR10]          $(\log n)^{O_\eta(\log\log\log n)}$   $(1-\eta)n$                                       explicit
[Ind00]          $1+\varepsilon$                       $\Omega(\varepsilon^2/\log(1/\varepsilon))\,n$    $O(n \log^2 n)$
[AAM06, LS08]    $O_\eta(1)$                           $(1-\eta)n$                                       $O(n)$
[GLW08]          $2^{O_\eta(1/\gamma)}$                $(1-\eta)n$                                       $O(n^\gamma)$
[IS10]           $1+\varepsilon$                       $(\gamma\varepsilon)^{O(1/\gamma)}\,n$            $O(n^\gamma)$

Notes and Remarks
A recent and comprehensive reference for the material presented in this chapter
(and much more) is [AAGM15]. Older standard and valuable references include
[MS86, Pis89b, TJ89, Ver].

Section 7.1. Proposition 7.5 is a special case of Corollary 3 in [LO99]. If we
do not insist on obtaining the optimal constant $\sqrt{\pi/2}$, the result is more elementary:
it is an instance of the Gaussian version of the Khintchine–Kahane inequality,
Exercise 5.72, whose proof carries over to the present context (modulo replacing
an application of Theorem 5.23 with that of Theorem 5.51) and extends to
non-symmetric convex bodies (see, e.g., [BLPS99], Lemma 3.3).


The K-convexity constant is more frequently defined in the literature for a
normed space Y and corresponds, in our notation, to $K(B_Y)$.
Proposition 7.6 is due to Figiel and Tomczak-Jaegermann [FTJ79] (where the
$\ell$-norm is also introduced) whereas Theorem 7.7 and the bound stated in Remark
7.8 are due to Pisier (see [Pis80, Pis81]). The proof of Theorem 7.7 that is
presented here is based on Lemma 7.13, which is from [Mau03]. The bound on
the K-convexity constant from Theorem 7.7 is sharp: there is an example due to
Bourgain [Bou84] of a symmetric convex body $K \subset \mathbb{R}^n$ (for an arbitrarily large n)
with $K(K) = \Omega(\log n)$; this example is presented in detail in [AAGM15, Section
6.7]. Besides unconditional bodies, the improved bound $K(K) = O(\sqrt{\log n})$ holds
if $K \subset \mathbb{R}^n$ is, for example, a zonoid (see [Pis80]; or Theorem IV.5 in [LQ04] for a
detailed proof).

It is unknown if the $MM^*$-estimate is sharp, i.e., whether $\log n$ can be replaced
by a smaller function in Theorem 7.10. The pair $(B_1^n, B_\infty^n)$ gives an example of a
(sequence of) symmetric convex bodies, for which $w(K)w(K^\circ) = \Theta(\sqrt{\log n})$, and
one may conjecture that the $MM^*$-estimate holds (for symmetric bodies) with a
bound $O(\sqrt{\log n})$. In the non-symmetric case, the n-dimensional simplex $\Delta$ is an
example with $w(\Delta)w(\Delta^\circ) \simeq \log n$. While it is conceivable that the $MM^*$-estimate
holds with a bound that is polynomial in $\log n$ also for non-symmetric bodies, the
known general upper bounds in that setting are much weaker [BLPS99, Rud00].
This question is related to the problem of determining the diameter of the Banach–
Mazur compactum of not-necessarily-symmetric convex bodies.

Section 7.2. The history around Dvoretzky’s theorem starts with a conjecture
by Grothendieck [Gro53b]: does every n-dimensional normed space contain a
$k(\varepsilon, n)$-dimensional subspace which is $(1 + \varepsilon)$-Euclidean, for some function $k(\varepsilon, n)$
tending to infinity with n? This was shown affirmatively by Dvoretzky [Dvo61],
and later refined by [Mil71] using crucially concentration of measure. Other early
proofs include [Sza74] and [Fig76].
Theorem 7.15 with the dependence on $\varepsilon$ as stated appears in [Gor88] in the
real case (see Exercise 7.7). The proof via Lemma 7.17 is from [Sch89] and it was
noticed in [ASW11] that it carries over to the complex case.
When asking about the dependence on $\varepsilon$ in Dvoretzky’s theorem, it is important
to keep in mind that there are two different questions, depending on whether we ask
if $(1 + \varepsilon)$-Euclidean subspaces either (i) exist or (ii) have measure $1 - o(1)$ in the
Grassmann manifold equipped with the standard Haar measure.
For example, one may ask: given $\varepsilon > 0$ and k, for which values of n can we
guarantee that every n-dimensional symmetric convex body has a k-dimensional
section which is $(1 + \varepsilon)$-Euclidean? If we believe that the worst case is the cube, it
is natural to conjecture that this holds for $n \ge C(k)\varepsilon^{-(k-1)/2}$. This conjecture is
confirmed for $k = 2$ (see [Mil88]). For $k > 2$ the problem is wide open and a good
dependence would follow from a positive answer to a weak version of the Knaster
problem, see [KS03]. In a related direction, the random version of the Dvoretzky
theorem for the cube has been studied in [Sch07, Tik14] and the dependence on
$\varepsilon$ in Theorem 7.19 for $K = B_\infty^n$ is $c(\varepsilon) = \Theta(\varepsilon/\ln(1/\varepsilon))$.
Most of the material from Sections 7.2.2 through 7.2.4 is based on the very
influential paper [FLM77]. The concepts of the verticial and facial dimensions of
a convex body were formally defined in [AS17].
Exercise 7.12 about the sharpness of the Dvoretzky dimension is an observation
due to Milman–Schechtman [MS97] (see [HW16] for a sharper statement). The
paper [MS97] also introduces global versions of Dvoretzky’s theorem, of which
here is a sample: for any symmetric convex body $K \subset \mathbb{R}^n$, there is an integer
$t \le C\varepsilon^{-2} n/k_*(K)$ and $U_1, \dots, U_t \in O(n)$ such that the Minkowski sum $U_1(K) +
\cdots + U_t(K)$ is $(1 + \varepsilon)$-Euclidean. For other similar results, see [AAGM15]. The
result from Exercise 7.19 appears in [BDG+77] (for another proof, see [AAGM15,
Theorem 5.4.3]). The construction from Exercise 7.20 is due to Figiel.
The estimate from Exercise 7.21 is relevant to [FHS13]. Theorem 7.35 is
from [Kaš77]; the correct order of magnitude of the distortion constant $A(\alpha)$ was
determined in [GG84]; the proof of the upper bound presented in Section 7.2.6.2
follows [PTJ90]. We also refer to [FR13, Chapter 10] for a detailed presentation
focusing on applications to compressed sensing.
The Dvoretzky–Rogers lemma was first proved in [DR50]. The proof presented
comes from [Peł80]. It has been realized since [BS88] that actually a stronger
property holds: There is a function $f : (0,1] \to [1,\infty)$ such that, for any
n-dimensional normed space X there exist $m \ge (1 - \delta)n$ and operators $\alpha : \mathbb{R}^m \to X$,
$\beta : X \to \mathbb{R}^m$ verifying $\beta \circ \alpha = I$ and $\|\alpha : \ell_1^m \to X\| \cdot \|\beta : X \to \ell_2^m\| \le f(\delta)$. The
above is often referred to as a proportional Dvoretzky–Rogers factorization. It is
known that $f(\delta) = O(\delta^{-1})$ and $f(\delta) = \Omega(\delta^{-1/2})$ [Gia96, Rud97]. Variants for
nonsymmetric bodies were also shown, see [You14]. For more information and
references see the website [@3].

Regarding Proposition 7.39, it has been proved in [Bal89, Bal91] that the cube (resp., the simplex) has the smallest mean width among all symmetric (resp., not necessarily symmetric) convex bodies in John position.

The relevance of the concept of volume ratio to Dvoretzky-like theorems was realized in [Sza78, ST80], which were inspired by the important work [Kaš77] that in particular established the existence of the Kashin decomposition of $\ell_1^n$ (see Corollary 7.43). This concept is related to the notion of cotype 2. Let $(\varepsilon_n)$ be a sequence of independent random variables such that $\mathbf{P}(\varepsilon_i = 1) = \mathbf{P}(\varepsilon_i = -1) = 1/2$. The cotype 2 constant of a normed space $X$ is the smallest number $C_2(X)$ such that, for all vectors $x_1, \dots, x_n \in X$, we have
\[ \sum_{i=1}^n \|x_i\|^2 \leq C_2(X) \, \mathbf{E} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|^2 . \]
The estimate $\mathrm{vr}(X) = O(C_2(X) \log C_2(X))$ connecting volume ratio and cotype 2 was proved in [BM87, MP86] (see [Mil87] for a simpler proof and [DS85] for an earlier argument yielding Kashin's decompositions under cotype 2 assumptions). Any bound on the cotype 2 constant is obviously inherited by subspaces. For more information about the type and cotype theory, see [Mau03]. The formulation of Theorem 7.44 appears in [Bal97].
The low-$M^*$ estimate (Theorem 7.45) was proved originally by Milman with a worse dependence on $\alpha$; the proof we present is due to Gordon [Gor88]. Another proof giving the correct dependence, and valid also in the complex setting, is due to Pajor and Tomczak–Jaegermann [PT86]. See [AAGM15] for a presentation of several different proofs. We also point out that in some cases the upper bound in the Dvoretzky–Milman theorem (Theorem 7.19) holds for dimensions larger than the Dvoretzky dimension: see [KV07].


The quotient of a subspace theorem is due to Milman [Mil85]. The simple argument to deduce the reverse Santaló inequality sketched in Exercise 7.33 is due to Pisier. Another related result due to Milman [Mil86] is the reverse Brunn–Minkowski inequality, which asserts the following: for any symmetric convex body $B \subset \mathbb{R}^n$ there is a volume-preserving linear map $T_B \in SL(n, \mathbb{R})$ such that, if $K, L$ are symmetric convex bodies, then
\[ \mathrm{vol}\big(T_K(K) + T_L(L)\big)^{1/n} \leq C \left( \mathrm{vol}(K)^{1/n} + \mathrm{vol}(L)^{1/n} \right). \tag{7.32} \]
There is a close link with the $M$-ellipsoid and $M$-position introduced in (5.68), since (7.32) is easily seen to hold when $T_K(K)$ and $T_L(L)$ admit multiples of Euclidean balls as $M$-ellipsoids.

The results from AGA are classically presented in the real setting, but typically
remain valid for complex spaces (or circled convex bodies) as well. This is the case
for Theorems 7.42, 7.45 and 7.46. Often the proofs can be translated verbatim,
with the notable exception of the Chevet–Gordon inequalities, for which no complex
analogue is known. We also note that Pisier [Pis89a] obtained a proof of (7.32)
via interpolation which works primarily in the complex setting (see Chapter 7 in
[Pis89b]).
The theme of the approximation of zonoids by zonotopes with few summands
attracted attention in the late 80’s. The best result (Theorem 7.48) is due to
Talagrand [Tal90] and improves on [Sch87, BLM89]. It is an open question
whether Theorem 7.48 holds without the factor $\log n$, i.e., with $N \leq C(\varepsilon) n$.

The Johnson–Lindenstrauss lemma appeared in [JL84]. It was announced in [LN16] that the dependence on $\varepsilon$ in the version presented here is optimal.

Theorem 7.50 is due to Ben-Tal and Nemirovski [BTN01b]; see also [KTJ09]. Problem 7.51 appears to be folklore. The analogous question for a vaguely similar restricted isometry property (RIP), important in the theory of compressed sensing, was answered in the affirmative, see [BDMS13]. Table 7.1 comes from [IS10].

Theorem 7.52 is a special case of a result from [Kar11], which deals with the distortion of the $\ell_r$-norm on the sphere of $\ell_p^n$ for any $0 < r < p < 2$. Theorem 7.53 is from [Fre14], which also contains a version of the theorem for convex bodies that are only assumed to be invariant under permutation of coordinates.

Part 3

The Meeting: AGA and QIT

CHAPTER 8

Entanglement of Pure States in High Dimensions

Throughout this chapter, we consider a multipartite Hilbert space
\[ \mathcal{H} = \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_k} \]
and study the entanglement of pure states on $\mathcal{H}$. We will always assume that $k \geq 2$ and that $d_1, \dots, d_k \geq 2$.
We identify pure states on $\mathcal{H}$ with elements of $\mathsf{P}(\mathcal{H})$, the projective space on $\mathcal{H}$. The set of product vectors forms the Segre variety $\mathrm{Seg} \subset \mathsf{P}(\mathcal{H})$ (see (B.6) in Appendix B.2). A simple remark, on which we will elaborate, is that most pure states are entangled. Indeed, since the variety $\mathrm{Seg} \subset \mathsf{P}(\mathcal{H})$ has lower dimension and measure zero, it follows that a randomly chosen (in any reasonable sense) pure state in $\mathcal{H}$ is almost surely entangled.

A problem which turns out to be fundamental to several constructions in QIT is to show the existence of large-dimensional subspaces of $\mathcal{H}$ in which every unit vector corresponds to an entangled pure state. There are several variations on this question. We may consider the qualitative version of the problem, where we require the subspace simply to contain no nonzero product vector (see Theorem 8.1). Alternatively, we may insist that the subspace contains only very entangled vectors, once it is specified how to quantify entanglement; for pure states this may be done via the von Neumann or Rényi entropy of the partial trace.
The versions of Dvoretzky's theorem that were discussed in Section 7.2 are obviously relevant to such questions, since they show the existence of large subspaces on which a given function is almost constant. This approach allows us to give a complete presentation of Hastings's counterexample to the additivity problem (Section 8.4.4).

Much of our exposition will be focused on detailed study of the bipartite case $\mathcal{H} = \mathbb{C}^k \otimes \mathbb{C}^d$ (we will always assume that $k \leq d$). One reason for such emphasis is the fact that subspaces of a bipartite Hilbert space can provide a convenient description of quantum channels through the Stinespring representation, as we explain in Section 8.2.2. Fine aspects of pure state entanglement in multipartite systems are dealt with in the last part of the chapter (Section 8.5).

8.1. Entangled subspaces: qualitative approach


Let $\mathcal{H} = \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$. A fundamental qualitative question we may ask about entangled subspaces is: "What is the maximal dimension of a subspace of $\mathcal{H}$ in which every unit vector corresponds to an entangled pure state?" The answer to this question is $(d_1 - 1)(d_2 - 1)$, as shown by the following theorem, which also settles the multipartite case.

Theorem 8.1. Let $\mathcal{H} = \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_k}$, and let $n_0 = d_1 \cdots d_k - (d_1 + \cdots + d_k) + k - 1$. Then
(1) If $m > n_0$, then any $m$-dimensional subspace of $\mathcal{H}$ contains a (nonzero) product vector.
(2) If $m \leq n_0$, a generic $m$-dimensional subspace of $\mathcal{H}$ contains no (nonzero) product vector.
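
The threshold $n_0$ of Theorem 8.1 is easy to tabulate. The following minimal sketch (the function name `max_entangled_dim` is ours, not the book's) evaluates the formula and illustrates that in the bipartite case it reduces to $(d_1 - 1)(d_2 - 1)$.

```python
from math import prod

def max_entangled_dim(dims):
    """Threshold n0 = d1*...*dk - (d1+...+dk) + k - 1 from Theorem 8.1."""
    if len(dims) < 2 or any(d < 2 for d in dims):
        raise ValueError("need k >= 2 factors, each of dimension >= 2")
    return prod(dims) - sum(dims) + len(dims) - 1

print(max_entangled_dim([2, 2]))     # two qubits
print(max_entangled_dim([3, 4]))     # bipartite case: (3 - 1) * (4 - 1)
print(max_entangled_dim([2, 2, 2]))  # three qubits
```

For two qubits the value is 1: a generic line in $\mathbb{C}^2 \otimes \mathbb{C}^2$ avoids product vectors, while every 2-dimensional subspace contains one.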
Proof. We only give an argument for the second part of the theorem (the first assertion can be proved via the projective dimension theorem from algebraic geometry). The proof is based on dimension counting, and we find it instructive to give a "probabilistic" version of dimension counting, which naturally fits in the general framework of this book. For simplicity, we only consider the case $\mathcal{H} = \mathbb{C}^d \otimes \mathbb{C}^d$ (so that $n_0 = (d-1)^2$), the general case being similar.

We work in the projective space $\mathsf{P}(\mathcal{H})$, which we equip with the distance given by (B.5). The ball of center $\psi$ and radius $r$ is denoted by $B(\psi, r)$. We use bounds on the size of $\varepsilon$-nets in $\mathsf{P}(\mathcal{H})$ and the measure of $\varepsilon$-balls from Theorem 5.11 (and Exercise 5.25; the more elementary results from Section 5.1.2 would actually suffice, cf. Exercise 5.10 and (5.2)). In this proof, as opposed to most material in this book, the dependence of constants on the dimension is allowed, and we will denote by $C, C'$, etc. positive constants which may depend on $d$ and $m$, but are independent of the parameter $\varepsilon$.

Let $F$ be a random $m$-dimensional subspace of $\mathcal{H}$, chosen with respect to the Haar measure on the Grassmann manifold. More concretely, we may realize $F$ as $F = U(F_0)$, where $F_0$ is any fixed $m$-dimensional subspace, and $U$ is a Haar-distributed unitary matrix. Denote also by $\mathrm{Seg} \subset \mathsf{P}(\mathcal{H})$ the set of product vectors (the Segre variety).
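We are going to show that the event $\mathrm{Seg} \cap F = \emptyset$ has probability 1. Given $\varepsilon > 0$, let $\mathcal{M}_\varepsilon$ be an $\varepsilon$-net inside the projective space $\mathsf{P}(F_0)$ with $\mathrm{card}(\mathcal{M}_\varepsilon) \leq (C'/\varepsilon)^{2m-2}$. Next, let $\mathcal{N}_\varepsilon$ be an $\varepsilon$-net inside $\mathsf{P}(\mathbb{C}^d)$ with $\mathrm{card}(\mathcal{N}_\varepsilon) \leq (C'/\varepsilon)^{2d-2}$. One checks that $\mathcal{N}_\varepsilon^{\otimes 2} := \{x \otimes y : x, y \in \mathcal{N}_\varepsilon\}$ is a $2\varepsilon$-net inside $\mathrm{Seg}$. We use the union bound in the following way:
\begin{align*}
\mathbf{P}(\mathrm{Seg} \cap F \neq \emptyset) &\leq \mathbf{P}\Bigg( \bigcup_{\varphi \in \mathcal{N}_\varepsilon^{\otimes 2}} B(\varphi, 2\varepsilon) \cap U\Bigg( \bigcup_{\psi \in \mathcal{M}_\varepsilon} B(\psi, \varepsilon) \Bigg) \neq \emptyset \Bigg) \\
&\leq \sum_{\varphi \in \mathcal{N}_\varepsilon^{\otimes 2},\, \psi \in \mathcal{M}_\varepsilon} \mathbf{P}\big( B(\varphi, 2\varepsilon) \cap U(B(\psi, \varepsilon)) \neq \emptyset \big) \\
&\leq \sum_{\varphi \in \mathcal{N}_\varepsilon^{\otimes 2},\, \psi \in \mathcal{M}_\varepsilon} \mathbf{P}\big( d(\varphi, U\psi) < 3\varepsilon \big).
\end{align*}
The quantity $\mathbf{P}(d(\varphi, U\psi) < 3\varepsilon)$ does not depend on the particular points $\varphi, \psi \in \mathsf{P}(\mathcal{H})$, and is equal to the normalized measure of a ball of radius $3\varepsilon$ in $\mathsf{P}(\mathcal{H})$, which is bounded from above by $(C''\varepsilon)^{2d^2-2}$ (or see Exercise 5.11 for the exact value). Consequently,
\begin{align*}
\mathbf{P}(\mathrm{Seg} \cap U(F_0) \neq \emptyset) &\leq \mathrm{card}(\mathcal{N}_\varepsilon^{\otimes 2}) \, \mathrm{card}(\mathcal{M}_\varepsilon) \, (C''\varepsilon)^{2d^2-2} \\
&\leq C \varepsilon^{2d^2 - 2 - (2m-2) - 2(2d-2)} .
\end{align*}
Provided $m \leq (d-1)^2$, the last quantity tends to 0 as $\varepsilon$ tends to 0. This shows that the event $\{F \text{ intersects } \mathrm{Seg}\}$ has probability 0, so that almost surely $F$ contains no nonzero product vector. □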
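
The dimension count at the end of the proof can be checked mechanically. The sketch below (helper name `union_bound_exponent` is ours) evaluates the exponent of $\varepsilon$ in the final bound for the case $\mathcal{H} = \mathbb{C}^d \otimes \mathbb{C}^d$; it simplifies to $2((d-1)^2 - m) + 2$, which is positive exactly when $m \leq (d-1)^2$.

```python
def union_bound_exponent(d, m):
    """Exponent of eps in the last bound of the proof of Theorem 8.1 (case C^d (x) C^d)."""
    ball_measure = 2 * d * d - 2        # measure of a ball of radius 3*eps in P(H)
    net_subspace = 2 * m - 2            # log-size of the net in P(F0)
    net_segre = 2 * (2 * d - 2)         # log-size of the product net inside Seg
    return ball_measure - net_subspace - net_segre

for d in (2, 3, 5):
    n0 = (d - 1) ** 2
    # At m = n0 the exponent is still positive; at m = n0 + 1 it drops to zero.
    print(d, union_bound_exponent(d, n0), union_bound_exponent(d, n0 + 1))
```

When the exponent is zero the bound no longer tends to 0, consistent with part (1) of the theorem.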

Exercise 8.1 (Universal entanglers). Show that whenever $d \geq 4$, a generic unitary matrix $U \in \mathsf{U}(d^2)$ has the property that for every product unit vector $\psi \in \mathbb{C}^d \otimes \mathbb{C}^d$, $U|\psi\rangle\langle\psi|U^\dagger$ is entangled.

8.2. Entropies of entanglement and additivity questions


8.2.1. Quantifying entanglement for pure states. The most common way to quantify the entanglement of a bipartite pure state is to use the entropy of entanglement (for operational meanings of the entropy of entanglement, we refer to Notes and Remarks).

Let $\psi \in \mathbb{C}^k \otimes \mathbb{C}^d$ be a unit vector. The entropy of entanglement of $\psi$, denoted by $E(\psi)$, is defined as the von Neumann entropy of the reduced matrix $\rho = \operatorname{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi|$:
\[ E(\psi) = S(\rho) = -\operatorname{Tr} \rho \log \rho. \tag{8.1} \]
Both parties play a symmetric role since the two reduced matrices $\operatorname{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi|$ and $\operatorname{Tr}_{\mathbb{C}^k} |\psi\rangle\langle\psi|$ have the same von Neumann entropy (in the matrix formalism, a consequence of the fact that $MM^\dagger$ and $M^\dagger M$ have the same nonzero eigenvalues for $M \in M_{k,d}$). If $\psi = \sum \lambda_i \varphi_i \otimes \chi_i$ is a Schmidt decomposition of $\psi$, then
\[ E(\psi) = -\sum \lambda_i^2 \log \lambda_i^2 = -2 \sum \lambda_i^2 \log \lambda_i. \tag{8.2} \]
For any $p \in [0, \infty]$, we introduce the $p$-entropy of entanglement, defined as
\[ E_p(\psi) = S_p(\rho), \tag{8.3} \]
where $\rho = \operatorname{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi|$ and $S_p$ is the $p$-Rényi entropy introduced in Section 1.3.3. Recall that the case $p = 1$ corresponds to the von Neumann entropy, i.e., $E_1(\psi) = E(\psi)$ (as given by (8.1)). The limit cases $p = 0$ and $p = \infty$ should be interpreted as $E_0(\psi) = \log \operatorname{rank}(\psi)$ and $E_\infty(\psi) = -2 \log \lambda_1$, where $\operatorname{rank} \psi$ is the Schmidt rank of $\psi$ and $\lambda_1$ its largest Schmidt coefficient.
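
The definitions (8.1)–(8.3) can be made concrete once the Schmidt coefficients are known. The sketch below is illustrative only: the function name is ours, and we assume the usual form $S_p(\rho) = \frac{1}{1-p} \log \operatorname{Tr} \rho^p$ of the Rényi entropy for $p \notin \{0, 1, \infty\}$.

```python
from math import inf, isclose, log

def renyi_entanglement(schmidt, p):
    """E_p(psi) computed from the Schmidt coefficients lambda_i of a unit vector psi."""
    probs = [lam * lam for lam in schmidt if lam > 0]  # nonzero eigenvalues of the reduced matrix
    if not isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("Schmidt coefficients must satisfy sum(lambda_i^2) = 1")
    if p == 0:
        return log(len(probs))                  # log of the Schmidt rank
    if p == inf:
        return -log(max(probs))                 # equals -2 log(lambda_1)
    if p == 1:
        return -sum(q * log(q) for q in probs)  # von Neumann case, formula (8.2)
    return log(sum(q ** p for q in probs)) / (1 - p)

maximally_entangled = [3 ** -0.5] * 3           # k = 3, all Schmidt coefficients equal
for p in (0, 0.5, 1, 2, inf):
    print(p, renyi_entanglement(maximally_entangled, p), log(3))
```

For a maximally entangled vector every $E_p$ equals $\log k$, while a product vector (a single Schmidt coefficient equal to 1) gives $E_p = 0$ for all $p$.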
Rényi entropies for $p > 1$ are easier to manipulate since they are closely related to Schatten norms. If we identify a vector $\psi \in \mathbb{C}^k \otimes \mathbb{C}^d$ with a matrix $M \in M_{k,d}$ as explained in Section 0.8, we obtain (see (2.12))
\[ \|\rho\|_p = \|M\|_{2p}^2 \tag{8.4} \]
and therefore
\[ E_p(\psi) = \frac{p}{1-p} \log \|\rho\|_p = \frac{2p}{1-p} \log \|M\|_{2p}. \tag{8.5} \]
Throughout this chapter we assume that $k \leq d$, and therefore (for any $p \in [0, \infty]$) the $p$-entropy of entanglement varies between 0 and $\log k$. Moreover, a pure state $\psi$ satisfies $E_p(\psi) = 0$ if and only if it is a product vector and, for $p > 0$, satisfies $E_p(\psi) = \log k$ if and only if it is a maximally entangled vector.


These definitions make sense only in the bipartite case, as they rely on the Schmidt decomposition of a bipartite pure state, which has no canonical analogue for the multipartite case. The limit case $p = \infty$ is different: $E_\infty$ depends only on the largest Schmidt coefficient, which can be defined in a multipartite system as the maximal modulus of inner product (or the maximal overlap) with a product vector (cf. (2.13)). We elaborate on this in Section 8.5.

One of the goals of this chapter is to find subspaces $\mathcal{W} \subset \mathbb{C}^k \otimes \mathbb{C}^d$ which are very entangled, in the sense that the quantity $E(\psi)$ (or $E_p(\psi)$) has a uniform lower bound over all unit vectors $\psi \in \mathcal{W}$.

8.2.2. Channels as subspaces. A crucial insight that allows us to relate the analysis of quantum channels to high-dimensional convex geometry is the observation that there is an essentially one-to-one correspondence between channels and linear subspaces of composite Hilbert spaces. Specifically, let $\mathcal{W}$ be a subspace of $\mathbb{C}^k \otimes \mathbb{C}^d$ of dimension $m$. Then $\Phi : B(\mathcal{W}) \to M_k$ defined by $\Phi(\rho) = \operatorname{Tr}_{\mathbb{C}^d}(\rho)$ is a quantum channel. Alternatively, and perhaps more properly, we could identify $\mathcal{W}$ with $\mathbb{C}^m$ via an isometry $V : \mathbb{C}^m \to \mathbb{C}^k \otimes \mathbb{C}^d$ whose range is $\mathcal{W}$ and define, for $\rho \in M_m$, the corresponding channel $\Phi : M_m \to M_k$ by
\[ \Phi(\rho) = \operatorname{Tr}_{\mathbb{C}^d}(V \rho V^\dagger). \tag{8.6} \]
There is no restriction in considering quantum channels of the form (8.6): by the Stinespring representation theorem (Theorem 2.24), any quantum channel $\Phi : M_m \to M_k$ can be represented via (8.6) for some subspace $\mathcal{W} \subset \mathbb{C}^k \otimes \mathbb{C}^d$, with $d = km$.

It is now easy to define a natural family of random quantum channels. They will be associated, via the above scheme, to random $m$-dimensional subspaces $\mathcal{W}$ of $\mathbb{C}^k \otimes \mathbb{C}^d$, distributed according to the Haar measure on the corresponding Grassmann manifold (for some fixed positive integers $m, d, k$ that will be specified later). Note that most interesting parameters of a channel defined by (8.6) depend only on the subspace $\mathcal{W} = V(\mathbb{C}^m)$ and not on a particular choice of the isometry $V$ (see, e.g., Lemma 8.2). In this sense, the language of "random $m$-dimensional subspaces of $\mathbb{C}^k \otimes \mathbb{C}^d$" is equivalent to that of "random isometries from $\mathbb{C}^m$ to $\mathbb{C}^k \otimes \mathbb{C}^d$," with the corresponding mathematical objects being, respectively, the closely related Grassmann manifolds and Stiefel manifolds (see Appendix B.4).
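
To see formula (8.6) in action, here is a small self-contained sketch in plain Python. The representation of $V$ as a list of columns, the index convention `a*d + b` for $|a\rangle \otimes |b\rangle$, and the deterministic toy isometry are all our own choices for illustration; nothing here is random.

```python
from math import sqrt

def channel_from_isometry(V, k, d):
    """Return the map rho -> Tr_{C^d}(V rho V^dagger), as in (8.6).

    V is given by its m columns; each column lists the k*d amplitudes of a
    vector in C^k (x) C^d, with |a> (x) |b> stored at index a*d + b.
    """
    m = len(V)

    def phi(rho):
        out = [[0j] * k for _ in range(k)]
        for a in range(k):
            for a2 in range(k):
                for b in range(d):  # partial trace: sum over the C^d index
                    for i in range(m):
                        for j in range(m):
                            out[a][a2] += (V[i][a * d + b] * rho[i][j]
                                           * complex(V[j][a2 * d + b]).conjugate())
        return out

    return phi

# Toy example with m = k = d = 2: W is spanned by |00> and (|01> + |10>)/sqrt(2).
s = 1 / sqrt(2)
V = [[1, 0, 0, 0], [0, s, s, 0]]
phi = channel_from_isometry(V, k=2, d=2)

print(phi([[1, 0], [0, 0]]))  # input |0><0|: output is the pure state |0><0|
print(phi([[0, 0], [0, 1]]))  # input |1><1|: output is (numerically) I/2
```

In this example the subspace $\mathcal{W}$ contains the product vector $|00\rangle$, so the channel has a pure output; the subspaces of interest in this chapter are those where no such vector exists.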


8.2.3. Minimal output entropy and additivity problems. Given a quantum channel $\Phi : M_m \to M_k$, we define its minimum output entropy as
\[ S^{\min}(\Phi) = S_1^{\min}(\Phi) = \min_{\rho \in D(\mathbb{C}^m)} S(\Phi(\rho)), \tag{8.7} \]
as well as the $p$-entropy variant for $p \geq 0$,
\[ S_p^{\min}(\Phi) = \min_{\rho \in D(\mathbb{C}^m)} S_p(\Phi(\rho)). \]
The following lemma shows that, for channels defined via (8.6), the minimum output entropy depends only on the range of the isometry $V$.

Lemma 8.2. Let $\Phi : M_m \to M_k$ be a random channel, obtained by (8.6) from a Haar-distributed isometry $V : \mathbb{C}^m \to \mathbb{C}^k \otimes \mathbb{C}^d$. Then, for any $0 \leq p \leq \infty$,
\[ S_p^{\min}(\Phi) = \min_{\psi \in \mathcal{W},\, |\psi| = 1} E_p(\psi), \]
where $\mathcal{W} \subset \mathbb{C}^k \otimes \mathbb{C}^d$ is the range of $V$.

Proof. Since the function $S_p$ is concave (see Section 1.3.3), the minimum is achieved on a pure state (pure states are extreme points of $D(\mathbb{C}^m)$). Consequently,
\[ S_p^{\min}(\Phi) = \min_{\varphi \in S_{\mathbb{C}^m}} S_p(\Phi(|\varphi\rangle\langle\varphi|)) = \min_{\psi \in \mathcal{W} :\, |\psi| = 1} S_p(\operatorname{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi|) \]
and the result follows. □



For some time, an important open problem in quantum information theory was to decide whether the quantity $S^{\min}$ is additive, i.e., whether every pair $(\Phi, \Psi)$ of quantum channels satisfies
\[ S^{\min}(\Phi \otimes \Psi) \overset{?}{=} S^{\min}(\Phi) + S^{\min}(\Psi). \tag{8.8} \]
The problem admits several equivalent formulations with operational meaning, notably whether entangled inputs can increase the capacity of a quantum channel to transmit classical information. (Note that the inequality "$\leq$" in (8.8) always holds and is easy, see Exercise 8.2.)

A similar question can be asked for the quantities $S_p^{\min}$, the motivation being that a positive answer to the $p > 1$ question would have implied a positive answer to the (arguably more important) $p = 1$ problem. However, it turns out that none of these equalities holds in general, at least for sufficiently large dimensions.

Theorem 8.3. For any $p \geq 1$, there exist quantum channels $\Phi, \Psi$ such that
\[ S_p^{\min}(\Phi \otimes \Psi) < S_p^{\min}(\Phi) + S_p^{\min}(\Psi). \tag{8.9} \]
Theorem 8.3 will be a consequence of Proposition 8.6 (for $p > 1$) and Proposition 8.24 (for $p = 1$).
Exercise 8.2 ($S_p^{\min}$ is always subadditive). Show that the inequality $S_p^{\min}(\Phi \otimes \Psi) \leq S_p^{\min}(\Phi) + S_p^{\min}(\Psi)$ is satisfied for any channels $\Phi, \Psi$ and any $p \geq 0$.

Exercise 8.3 (Reduction of the additivity problem to the case $\Phi = \Psi$). A trick based on direct sums (as defined in (2.42)) allows a reduction to the case $\Phi = \Psi$ in questions such as (8.8).
(i) Given quantum channels $\Phi, \Psi$, show that $S_p^{\min}(\Phi \oplus \Psi) = \min(S_p^{\min}(\Phi), S_p^{\min}(\Psi))$.
(ii) Assume that there is a pair of channels $\Phi, \Psi$ such that (8.9) holds for some $p$. Deduce formally the existence of a channel $\Xi$ such that $S_p^{\min}(\Xi \otimes \Xi) < 2 S_p^{\min}(\Xi)$.
8.2.4. On the $1 \to p$ norm of quantum channels. The $p > 1$ version of the additivity problem has a nice functional-analytic interpretation. If $p > 1$ and $\rho$ is a state, then $S_p(\rho) = \frac{p}{1-p} \log \|\rho\|_p$, and so the study of $S_p^{\min}(\Phi)$ is replaced by that of $\max\{\|\Phi(\rho)\|_p : \rho \in D(\mathbb{C}^m)\}$, or the maximum output $p$-norm. The latter quantity equals $\|\Phi\|_{1 \to p}$, i.e., the norm of $\Phi$ as an operator from $(M_m^{\mathrm{sa}}, \|\cdot\|_1)$ to $(M_k^{\mathrm{sa}}, \|\cdot\|_p)$. Therefore (8.9) is equivalent to
\[ \|\Phi \otimes \Psi\|_{1 \to p} > \|\Phi\|_{1 \to p} \|\Psi\|_{1 \to p}. \tag{8.10} \]
A remarkable fact is that for completely positive maps (and even for 2-positive maps), the norm $\|\cdot\|_{1 \to p}$ is unchanged if we drop the self-adjointness constraint.

Proposition 8.4. Let $\Phi : M_m \to M_k$ be a 2-positive map, and $p \geq 1$. Then
\[ \sup_{X \in M_m,\, \|X\|_1 = 1} \|\Phi(X)\|_p = \sup_{X \in M_m^{\mathrm{sa}},\, \|X\|_1 = 1} \|\Phi(X)\|_p . \tag{8.11} \]
We first show the following fact.

Lemma 8.5. If $A, B, C \in M_k$ are such that the block matrix $M = \begin{bmatrix} A & B \\ B^\dagger & C \end{bmatrix}$ is positive semi-definite, then for every $p \geq 1$, $\|B\|_p^2 \leq \|A\|_p \|C\|_p$.

Proof. From the singular value decomposition, there exist unitary matrices $U, V \in \mathsf{U}(k)$ such that $U B V^\dagger$ is a diagonal matrix with nonnegative diagonal entries. Denote $W = U \oplus V \in \mathsf{U}(2k)$. We have
\[ W M W^\dagger = \begin{bmatrix} U A U^\dagger & U B V^\dagger \\ V B^\dagger U^\dagger & V C V^\dagger \end{bmatrix}. \]
Since the Schatten norms are invariant under multiplication by unitaries, this shows that to prove the lemma it is enough to treat the case when the matrix $B$ is diagonal with nonnegative entries, which we consider now.

We first note that $b_{ii}^2 \leq a_{ii} c_{ii}$, which follows from the matrix $\begin{bmatrix} a_{ii} & b_{ii} \\ b_{ii} & c_{ii} \end{bmatrix}$ being positive as a submatrix of $M$. Consequently, we have
\[ \|B\|_p^p = \sum_{i=1}^k b_{ii}^p \leq \sum_{i=1}^k a_{ii}^{p/2} c_{ii}^{p/2} \leq \Bigg( \sum_{i=1}^k a_{ii}^p \Bigg)^{1/2} \Bigg( \sum_{i=1}^k c_{ii}^p \Bigg)^{1/2} \leq \|A\|_p^{p/2} \|C\|_p^{p/2}, \]
where the last inequality uses the fact that the diagonal is majorized by the spectrum (Lemma 1.14). □

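Proof of Proposition 8.4. For $\varphi, \psi \in S_{\mathbb{C}^m}$, consider $u = \varphi \otimes |1\rangle + \psi \otimes |2\rangle \in \mathbb{C}^m \otimes \mathbb{C}^2$. By direct calculation
\[ (\Phi \otimes \mathrm{Id}_{M_2})(|u\rangle\langle u|) = \begin{bmatrix} \Phi(|\varphi\rangle\langle\varphi|) & \Phi(|\psi\rangle\langle\varphi|) \\ \Phi(|\varphi\rangle\langle\psi|) & \Phi(|\psi\rangle\langle\psi|) \end{bmatrix}. \]
Since $\Phi$ is 2-positive, the resulting matrix is block-positive and thus, by Lemma 8.5,
\[ \|\Phi(|\psi\rangle\langle\varphi|)\|_p^2 \leq \|\Phi(|\psi\rangle\langle\psi|)\|_p \, \|\Phi(|\varphi\rangle\langle\varphi|)\|_p . \]
Taking the supremum over unit vectors gives the required result (recall that the extreme points of $S_1^d$ and $S_1^{d,\mathrm{sa}}$ are rank 1 operators). □

Exercise 8.4 (The equality (8.11) does not always hold). Define $\Phi : M_2 \to M_2$ by $\Phi(X) = X - \operatorname{Tr}(X) \frac{I}{2}$. Show that for $p > 1$, $\Phi$ fails to satisfy the equality (8.11). Known examples where (8.11) fails for $p = 1$ are more complicated, see [Wat05].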

8.3. Concentration of $E_p$ for $p > 1$ and applications

8.3.1. Counterexamples to the multiplicativity problem. We first consider the case of the $p$-entropy of entanglement with $p > 1$, and show that the Dvoretzky theorem can be used to produce counterexamples to the multiplicativity problem as announced in Theorem 8.3.

Proposition 8.6. There is a constant $c$ such that the following holds. Let $p > 1$, and let $\Phi : M_m \to M_k$ be a random channel, obtained by (8.6) from a Haar-distributed isometry $V : \mathbb{C}^m \to \mathbb{C}^k \otimes \mathbb{C}^d$. Denote by $\Psi = \overline{\Phi}$ the channel obtained from $\overline{V}$, the complex conjugate of $V$. Assume that $k = d$ and that $m = c d^{1+1/p}$. Then, for $d$ large enough, with high probability,
\[ \|\Phi \otimes \Psi\|_{1 \to p} > \|\Phi\|_{1 \to p} \|\Psi\|_{1 \to p}. \tag{8.12} \]
Proof. Denote by $\mathcal{W} \subset M_d$ the range of $V$ (we may consider $\mathcal{W}$ as a subspace of $M_d$ after we identify tensors and matrices). From (8.4) and Lemma 8.2, we have
\[ \|\Phi\|_{1 \to p} = \max_{A \in \mathcal{W} :\, \|A\|_{HS} = 1} \|A\|_{2p}^2 . \tag{8.13} \]
We remark that $\|\Phi\|_{1 \to p} = \|\Psi\|_{1 \to p}$ since the Schatten norms are invariant under complex conjugation. We now appeal to Dvoretzky's theorem for the Schatten norm $\|\cdot\|_q$ with $q = 2p$. Provided that $m \leq c d^{1+2/q}$ for an appropriate universal constant $c > 0$, it follows from Theorem 7.37 that, with large probability,
\[ d^{1/q - 1/2} \|A\|_{HS} \leq \|A\|_q \leq C d^{1/q - 1/2} \|A\|_{HS} \]
for all $A \in \mathcal{W}$. We have therefore, by (8.13),
\[ d^{1/p - 1} \leq \|\Phi\|_{1 \to p} = \|\Psi\|_{1 \to p} \leq \big( C d^{1/q - 1/2} \big)^2 = C^2 d^{1/p - 1}. \tag{8.14} \]
The reason for choosing $\overline{\Phi}$ as a second channel is that the channel $\Phi \otimes \overline{\Phi}$ necessarily has at least one output with at least one large eigenvalue, as shown by the following lemma.

Lemma 8.7. Let $\Phi : M_m \to M_k$ be a quantum channel obtained from an isometry $V : \mathbb{C}^m \to \mathbb{C}^k \otimes \mathbb{C}^d$, as in (8.6). Denote by $\psi \in \mathbb{C}^m \otimes \mathbb{C}^m$ the maximally entangled state
\[ \psi = \frac{1}{\sqrt{m}} \big( |1\rangle \otimes |1\rangle + \cdots + |m\rangle \otimes |m\rangle \big). \]
Then
\[ \big\| (\Phi \otimes \overline{\Phi})(|\psi\rangle\langle\psi|) \big\|_\infty \geq \frac{m}{dk} \]
and consequently, for any $p > 1$,
\[ \|\Phi \otimes \overline{\Phi}\|_{1 \to p} \geq \|\Phi \otimes \overline{\Phi}\|_{1 \to \infty} \geq \frac{m}{dk}. \]
In our setting, $d = k$ and $m = c d^{1+1/p}$, so we obtain from Lemma 8.7 the lower bound $\|\Phi \otimes \overline{\Phi}\|_{1 \to p} = \Omega(d^{1/p - 1})$. Since we have, by (8.14),
\[ \|\Phi\|_{1 \to p} \|\overline{\Phi}\|_{1 \to p} = \|\Phi\|_{1 \to p}^2 = \Theta(d^{2(1/p - 1)}), \]
we conclude that the inequality (8.12) holds for $d$ large enough (a priori depending on $p > 1$). □

Remark 8.8. The proof shows that, for any fixed $p > 1$, both the multiplicative violation in (8.10) and the additive violation in (8.9) tend to infinity as the dimension of the problem increases (at the rates $\Omega(d^{1-1/p})$ and $\Omega(\log d)$ respectively).
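
The rate in Remark 8.8 comes from comparing two powers of $d$. The sketch below makes that comparison numerically; the values `c = 1.0` and `C = 1.0` are placeholders for the unspecified universal constants of Proposition 8.6 and (8.14), so only the growth in $d$ is meaningful, not the actual numbers.

```python
def violation_ratio(d, p, c=1.0, C=1.0):
    """Ratio of the lower bound for the tensor product to the upper bound for the product of norms."""
    if p <= 1:
        raise ValueError("the argument requires p > 1")
    lower_tensor = c * d ** (1 / p - 1)               # m/(dk) with k = d and m = c d^(1+1/p)
    upper_product = (C ** 2 * d ** (1 / p - 1)) ** 2  # square of the upper bound in (8.14)
    return lower_tensor / upper_product

for d in (10, 100, 1000, 10000):
    print(d, violation_ratio(d, p=2))  # grows like d^(1 - 1/p) = sqrt(d) when p = 2
```

With genuine constants the ratio is $(c/C^4)\, d^{1-1/p}$, which exceeds 1 only once $d$ is large enough; this is the "for $d$ large enough" in the statement of the proposition.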
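Proof of Lemma 8.7. We work in the matrix formalism. Identify the range of $V$ with an $m$-dimensional subspace $\mathcal{W} \subset M_{k,d}$. Let $(A_1, \dots, A_m)$ be the orthonormal basis in $\mathcal{W}$ (with respect to the Hilbert–Schmidt inner product) obtained as the image under $V$ of the canonical basis in $\mathbb{C}^m$, and
\[ M = \frac{1}{\sqrt{m}} \sum_{i=1}^m A_i \otimes \overline{A_i} \in \mathcal{W} \otimes \overline{\mathcal{W}}. \]
The conclusion of the lemma is equivalent to the inequality $\|M\|_\infty \geq \sqrt{m/kd}$. Let $(\varphi_j)_{1 \leq j \leq k}$ and $(\psi_{j'})_{1 \leq j' \leq d}$ be orthonormal bases in $\mathbb{C}^k$ and $\mathbb{C}^d$, respectively. We consider the maximally entangled states
\[ \varphi = \frac{1}{\sqrt{k}} \sum_{j=1}^k \varphi_j \otimes \overline{\varphi_j}, \qquad \psi = \frac{1}{\sqrt{d}} \sum_{j'=1}^d \psi_{j'} \otimes \overline{\psi_{j'}} \]
and compute
\begin{align*}
\|M\|_\infty &\geq \big| \langle \psi | M | \varphi \rangle \big| \\
&= \frac{1}{\sqrt{mkd}} \sum_{i=1}^m \sum_{j=1}^k \sum_{j'=1}^d \big| \langle \psi_{j'} \otimes \overline{\psi_{j'}} | A_i \otimes \overline{A_i} | \varphi_j \otimes \overline{\varphi_j} \rangle \big| \\
&= \frac{1}{\sqrt{mkd}} \sum_{i=1}^m \sum_{j=1}^k \sum_{j'=1}^d \big| \langle \psi_{j'} | A_i | \varphi_j \rangle \big|^2 \\
&= \frac{\sqrt{m}}{\sqrt{kd}},
\end{align*}
where we used the fact that $\|X\|_{HS}^2 = \sum_{j,j'} \big| \langle \psi_{j'} | X | \varphi_j \rangle \big|^2$. □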

Exercise 8.5 (Non-random counterexamples for $p > 2$). Let $\mathcal{W} \subset M_d$ be the subspace of anti-symmetric matrices, i.e., such that $A^T = -A$.
(i) Show that for any $A \in \mathcal{W}$, $\|A\|_\infty \leq \frac{1}{\sqrt{2}} \|A\|_{HS}$.
(ii) Let $\Phi$ be the quantum channel constructed from $\mathcal{W}$ as in (8.6) and fix $p > 2$. Using Lemma 8.7, show that the pair $(\Phi, \overline{\Phi})$ is an example for which (8.10) holds for $d$ large enough.

8.3.2. Almost randomizing channels. A variant of the construction used in the proof of Proposition 8.6 for $p = +\infty$ gives the following: a channel $\Phi : M_d \to M_d$ constructed from a generic random embedding $V : \mathbb{C}^d \to \mathbb{C}^d \otimes \mathbb{C}^N$ with $N = O(d)$ has the property that $\|\Phi(\rho)\|_{op} \leq C/d$ for any state $\rho \in D(\mathbb{C}^d)$. In other words, all output states have small eigenvalues. It is natural to ask whether similar lower bounds on the eigenvalues of output states can also be achieved; showing that this is indeed the case is the content of this section. Recall also (see Section 2.3.3) that the dimension $N$ of the environment in the Stinespring representation is an upper bound on the Kraus rank of $\Phi$.

Let $0 < \varepsilon < 1$. A quantum channel $\Phi : M_d \to M_d$ is said to be $\varepsilon$-randomizing if for all states $\rho \in D(\mathbb{C}^d)$
\[ \|\Phi(\rho) - \rho_*\|_{op} \leq \varepsilon/d. \]
Recall that $\rho_* = I/d$ denotes the maximally mixed state. These channels can be thought of as approximations of the completely randomizing channel $R$, which is defined by the property $R(\rho) = \rho_*$ for any $\rho \in D(\mathbb{C}^d)$. The completely randomizing channel has Kraus rank equal to $d^2$ (see Exercise 8.6). On the other hand, it turns out that there exist $\varepsilon$-randomizing channels with a substantially smaller Kraus rank, as shown by the following theorem. The dependence on $d$ is optimal since any $\varepsilon$-randomizing channel has Kraus rank at least $d$, which is due to the fact that rank one states must be mapped to full rank states.

Theorem 8.9. Let $(U_i)_{1 \leq i \leq N}$ be independent random matrices Haar-distributed on the unitary group $\mathsf{U}(d)$. Let $\Phi : M_d \to M_d$ be the quantum channel defined by
\[ \Phi(\rho) = \frac{1}{N} \sum_{i=1}^N U_i \rho U_i^\dagger . \]
Assume that $0 < \varepsilon < 1$ and $N \geq Cd/\varepsilon^2$. Then the channel $\Phi$ is $\varepsilon$-randomizing with high probability.

The proof of Theorem 8.9 is based on two lemmas.

Lemma 8.10. Let $\rho$ and $\sigma$ be pure states on $\mathbb{C}^d$ and let $(U_i)_{1 \leq i \leq N}$ be independent Haar-distributed random unitary matrices. Then, for every $0 < \delta < 1$,
\[ \mathbf{P}\Bigg( \Bigg| \frac{1}{N} \sum_{i=1}^N \operatorname{Tr}(U_i \rho U_i^\dagger \sigma) - \frac{1}{d} \Bigg| \geq \frac{\delta}{d} \Bigg) \leq 2 \exp(-c \delta^2 N). \]
Proof. Write $\rho = |\varphi\rangle\langle\varphi|$ and $\sigma = |\psi\rangle\langle\psi|$. Denote $X_i = d \operatorname{Tr}(U_i \rho U_i^\dagger \sigma) = \big| \sqrt{d} \langle \psi | U_i | \varphi \rangle \big|^2$. We know from Lemma 5.57 that this variable is subexponential (as the square of a subgaussian variable) and satisfies $\|X_i\|_{\psi_1} \leq C$. The conclusion now follows directly from Bernstein's inequalities (Proposition 5.59). □

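Lemma 8.11. Let $\Delta : M_d^{\mathrm{sa}} \to M_d^{\mathrm{sa}}$ be a linear map. Let $A$ be the quantity
\[ A = \sup_{\rho \in D(\mathbb{C}^d)} \|\Delta(\rho)\|_{op} = \sup_{\rho, \sigma \in D(\mathbb{C}^d)} |\operatorname{Tr} \sigma \Delta(\rho)| . \]
Let $0 < \delta < 1/4$ and let $\mathcal{N}$ be a $\delta$-net in $(S_{\mathbb{C}^d}, |\cdot|)$. Then $A \leq (1 - 4\delta)^{-1} B$, where
\[ B = \sup_{\varphi, \psi \in \mathcal{N}} \big| \operatorname{Tr} |\psi\rangle\langle\psi| \Delta(|\varphi\rangle\langle\varphi|) \big| . \]
Proof of Lemma 8.11. First note that for any $X, Y \in M_d^{\mathrm{sa}}$, we have
\[ |\operatorname{Tr} Y \Delta(X)| \leq A \|X\|_1 \|Y\|_1 . \tag{8.15} \]
By a convexity argument, the supremum in $A$ can be restricted to pure states. Given unit vectors $\varphi, \psi \in S_{\mathbb{C}^d}$, let $\varphi_0, \psi_0 \in \mathcal{N}$ be such that $|\varphi - \varphi_0| \leq \delta$ and $|\psi - \psi_0| \leq \delta$. Given $\chi \in S_{\mathbb{C}^d}$, we write $P_\chi$ for $|\chi\rangle\langle\chi|$. We have
\[ \|P_\varphi - P_{\varphi_0}\|_1 \leq \|P_\varphi - |\varphi\rangle\langle\varphi_0|\|_1 + \||\varphi\rangle\langle\varphi_0| - P_{\varphi_0}\|_1 \leq 2\delta \]
and similarly $\|P_\psi - P_{\psi_0}\|_1 \leq 2\delta$ (this simple bound is not optimal). We now write
\[ |\operatorname{Tr} P_\psi \Delta(P_\varphi)| \leq |\operatorname{Tr} (P_\psi - P_{\psi_0}) \Delta(P_\varphi)| + |\operatorname{Tr} P_{\psi_0} \Delta(P_\varphi - P_{\varphi_0})| + |\operatorname{Tr} P_{\psi_0} \Delta(P_{\varphi_0})| . \]
Using (8.15) twice and taking the supremum over $\varphi, \psi$ gives $A \leq 2\delta A + 2\delta A + B$, hence the result. □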
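Proof of Theorem 8.9. Fix a $\frac{1}{8}$-net $\mathcal{N} \subset (S_{\mathbb{C}^d}, |\cdot|)$ with $\operatorname{card} \mathcal{N} \leq 16^{2d}$, as provided by Lemma 5.3. Let $\Delta = R - \Phi$ and let $A, B$ be as in Lemma 8.11. Here $A$ and $B$ are random quantities and it follows from Lemma 8.11 that
\[ \mathbf{P}\Big( A \geq \frac{\varepsilon}{d} \Big) \leq \mathbf{P}\Big( B \geq \frac{\varepsilon}{2d} \Big). \]
Using the union bound and Lemma 8.10, we get
\[ \mathbf{P}\Big( B \geq \frac{\varepsilon}{2d} \Big) \leq 16^{4d} \cdot 2 \exp(-c \varepsilon^2 N / 4). \]
This is less than 1 if $N \geq Cd/\varepsilon^2$, for some constant $C$. □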
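
The last step of the proof amounts to solving $16^{4d} \cdot 2 \exp(-c\varepsilon^2 N/4) < 1$ for $N$. The sketch below does this explicitly; the default `c = 1.0` is a placeholder for the unspecified constant of Lemma 8.10, so the output illustrates the scaling $N \approx d/\varepsilon^2$ rather than an actual threshold.

```python
from math import floor, log

def log_failure_bound(d, eps, N, c=1.0):
    """Logarithm of the union-bound estimate 16**(4d) * 2 * exp(-c * eps**2 * N / 4)."""
    return 4 * d * log(16) + log(2) - c * eps ** 2 * N / 4

def min_unitaries(d, eps, c=1.0):
    """Smallest integer N for which the union-bound estimate is below 1."""
    if not 0 < eps < 1:
        raise ValueError("need 0 < eps < 1")
    threshold = 4 * (4 * d * log(16) + log(2)) / (c * eps ** 2)
    return floor(threshold) + 1

for d in (10, 20, 40):
    print(d, min_unitaries(d, eps=0.5))  # roughly doubles with d
```

Working with logarithms avoids overflow in $16^{4d}$ and makes the linear dependence on $d$ and the $\varepsilon^{-2}$ dependence visible.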
Exercise 8.6 (Kraus decomposition of the completely randomizing channel).
(i) Show that the Kraus rank of the completely randomizing channel $R$ is $d^2$.
(ii) Let $\omega = \exp(2i\pi/d)$ and let $A, B$ be the unitary operators defined by their action on the canonical basis by
\[ A|j\rangle = |j + 1 \bmod d\rangle, \qquad B|j\rangle = \omega^j |j\rangle. \tag{8.16} \]
Show that the operators $(B^j A^k)_{1 \leq j,k \leq d}$ give a Kraus decomposition of $R$. These operators are sometimes called the Heisenberg–Weyl operators.
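
As a numerical sanity check for part (ii), and not a substitute for the proof, the following sketch averages $K \rho K^\dagger$ over the $d^2$ operators $K = B^j A^k$ of (8.16). The factor $1/d^2$ in the average corresponds to using the normalized operators $\frac{1}{d} B^j A^k$ as Kraus operators; this normalization is our reading of the exercise.

```python
import cmath

def heisenberg_weyl_twirl(rho):
    """Average of K rho K^dagger over the d^2 operators K = B^j A^k of (8.16)."""
    d = len(rho)
    omega = cmath.exp(2j * cmath.pi / d)
    out = [[0j] * d for _ in range(d)]
    for j in range(d):      # exponents can be taken mod d, since A^d = B^d = I
        for k in range(d):
            for s in range(d):
                for t in range(d):
                    # B^j A^k |s> = omega^(j (s + k)) |s + k mod d>
                    r, c = (s + k) % d, (t + k) % d
                    phase = omega ** (j * r) * (omega ** (j * c)).conjugate()
                    out[r][c] += phase * rho[s][t] / d ** 2
    return out

plus = [[0.5, 0.5], [0.5, 0.5]]     # the pure state |+><+| on C^2
print(heisenberg_weyl_twirl(plus))  # numerically close to I/2
```

The off-diagonal entries cancel because the sum over $j$ of $\omega^{j(r-c)}$ vanishes unless $r = c$.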

8.4. Concentration of von Neumann entropy and applications


8.4.1. The basic concentration argument. We now consider the von Neumann entropy (instead of the $p$-Rényi entropy) as the invariant quantifying entanglement. Since the von Neumann entropy is not naturally associated with a norm, we are going to use the version of Dvoretzky's theorem for Lipschitz functions (Theorem 7.15). The relevant function is the entropy of entanglement $\psi \mapsto E(\psi)$, defined (via (8.1)) on the unit sphere in $\mathbb{C}^k \otimes \mathbb{C}^d$. As usual in such situations, we need two pieces of information: the Lipschitz constant of $E(\cdot)$ and a central value. They are provided by the next two lemmas.

Lemma 8.12. The Lipschitz constant of the function $\psi \mapsto E(\psi)$, defined on $(S_{\mathbb{C}^k \otimes \mathbb{C}^d}, |\cdot|)$, is bounded from above by $C \log k$ for some absolute constant $C$.

This is clearly optimal up to the value of the constant $C$, since the function $E$ maps $S_{\mathbb{C}^k \otimes \mathbb{C}^d}$ (which has diameter $\pi$, or $\pi/2$ if we consider $E$ as a function on $\mathsf{P}(\mathbb{C}^k \otimes \mathbb{C}^d)$) onto the segment $[0, \log k]$. (Remember that in this chapter we always assume $k \leq d$.) Note that, in view of (B.1), it doesn't matter (apart from the value of the constant) whether we use the geodesic distance or the extrinsic distance. For a discussion of the optimal values of the constants see Exercise 8.7.
Proof. We first check the commutative case by considering the function $f : S^{k-1} \to [0, \log k]$ defined by
\[ f(x) = -\sum x_i^2 \log(x_i^2), \tag{8.17} \]
i.e., the Shannon entropy of the probability distribution $(x_i^2) \in \Delta_k$. In the terminology of (8.2), this is equivalent to restricting attention to vectors $\psi$ whose Schmidt decompositions use fixed sequences $(\varphi_i)$, $(\chi_i)$. One computes
\[ |\nabla f(x)|^2 = 4 \sum_{i=1}^k x_i^2 (1 + \log(x_i^2))^2 \leq C \log^2 k, \tag{8.18} \]
where the last inequality can be obtained by observing that the function $t \mapsto t(1 + \log t)^2$ is concave on $[0, e^{-2}]$, and so the quantity $|\nabla f(x)|$ increases when we replace the coordinates of $x$ smaller than $e^{-1}$ by their $\ell_2$ average. It follows that if $L$ is the Lipschitz constant of $f$ with respect to the geodesic distance on $S^{k-1}$, then $L \leq C^{1/2} \log k$. Our objective is to show that the same constant works for the function $\psi \mapsto E(\psi)$.
To that end, we will consider an auxiliary function which is defined as follows. Let $(u_i)_{1 \leq i \leq k}$ be an orthonormal basis of $\mathbb{C}^k$. If $\psi \in S_{\mathbb{C}^k \otimes \mathbb{C}^d}$, set $\rho = \operatorname{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi|$ and let
\[ \tilde{f}(\psi) = -\sum_{i=1}^k \langle u_i | \rho | u_i \rangle \log(\langle u_i | \rho | u_i \rangle). \tag{8.19} \]
In other words, $\tilde{f}(\psi)$ is the entropy of the diagonal part of $\rho$, calculated in the basis $(u_i)$. An important property of $\tilde{f}$ is that $\tilde{f}(\psi) = S(\rho)$ if $(u_i)$ is a basis which diagonalizes $\rho$ (which is obvious from the definitions) and $\tilde{f}(\psi) \geq S(\rho)$ in general (which is a consequence of concavity of $S$ and is the content of Exercise 1.50). Next, one verifies that $\langle u_i | \rho | u_i \rangle = |P_i \psi|^2$, where $P_i$ is the orthogonal projection onto the subspace $u_i \otimes \mathbb{C}^d \subset \mathbb{C}^k \otimes \mathbb{C}^d$, so that $\tilde{f}(\psi) = f(|P_1 \psi|, \dots, |P_k \psi|)$ with $f$ as in (8.17). Since the map $\psi \mapsto (|P_1 \psi|, \dots, |P_k \psi|)$ is a contraction, it follows that the Lipschitz constant of $\tilde{f}$ (with respect to $g$, the geodesic distance on $S_{\mathbb{C}^k \otimes \mathbb{C}^d}$) is at most $L$.

We now return to the original question. Let $\psi_1, \psi_2 \in S_{\mathbb{C}^k \otimes \mathbb{C}^d}$; set $\rho_j = \operatorname{Tr}_{\mathbb{C}^d} |\psi_j\rangle\langle\psi_j|$ for $j = 1, 2$ and let $\tilde{f}$ be defined by (8.19) using a basis $(u_i)$ which diagonalizes $\rho_2$. Then
\[ E(\psi_1) - E(\psi_2) = S(\rho_1) - S(\rho_2) = S(\rho_1) - \tilde{f}(\psi_2) \leq \tilde{f}(\psi_1) - \tilde{f}(\psi_2) \leq L \, g(\psi_1, \psi_2). \]
Since the roles of $\psi_1$ and $\psi_2$ can be reversed, it follows that the Lipschitz constant of $E$ with respect to $g$ is at most $L$ (and hence exactly $L$), as claimed. □
Lemma 8.13 (not proved here, but see Remark 8.14). For $k \leq d$, the expectation of the function $\psi \mapsto E(\psi)$ (with respect to the uniform measure on the unit sphere in $\mathbb{C}^k \otimes \mathbb{C}^d$) satisfies
\[ \mathbf{E}\, E(\psi) = \Bigg( \sum_{j=d+1}^{kd} \frac{1}{j} \Bigg) - \frac{k-1}{2d} \geq \log k - \frac{1}{2} \frac{k}{d} . \tag{8.20} \]
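
Both sides of (8.20) are elementary to evaluate. The sketch below (function names are ours) computes the exact mean from the harmonic-type sum and compares it with the lower bound $\log k - k/(2d)$ on a few pairs $(k, d)$ with $k \leq d$.

```python
from math import log

def mean_entanglement_entropy(k, d):
    """Left-hand side of (8.20): sum_{j=d+1}^{kd} 1/j - (k - 1)/(2d)."""
    return sum(1.0 / j for j in range(d + 1, k * d + 1)) - (k - 1) / (2 * d)

def entropy_lower_bound(k, d):
    """Right-hand side of (8.20): log k - k/(2d)."""
    return log(k) - k / (2 * d)

for k, d in [(2, 2), (4, 4), (4, 64), (32, 32)]:
    print(k, d, mean_entanglement_entropy(k, d), entropy_lower_bound(k, d), log(k))
```

When $d$ is much larger than $k$, both quantities are close to the maximal value $\log k$, which is the regime exploited in the rest of the section.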

Remark 8.14 (An easy bound on the entropy of entanglement). An inequality slightly weaker than (8.20) follows readily from Proposition 6.36 (or Exercise 6.43, which is even more elementary). First, with large probability, all Schmidt coefficients of $\psi$ belong to the interval
\[ \Big[ \frac{1}{\sqrt{k}} - \frac{C}{\sqrt{d}}, \; \frac{1}{\sqrt{k}} + \frac{C}{\sqrt{d}} \Big] \]
for some constant $C$. It follows that all the eigenvalues of $\operatorname{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi|$ then lie in an interval $\big[ \frac{1-\varepsilon}{k}, \frac{1+\varepsilon}{k} \big]$ for some $\varepsilon = O\big( \sqrt{k/d} \big)$, and Lemma 1.20 yields the bound $E(\psi) = S(\rho) \geq \log k - C' k/d$. (The use of Lemma 1.20 requires $\varepsilon \leq 1$; for larger $\varepsilon$ we may use the simpler bound $S(\rho) \geq S_\infty(\rho) = -\log \|\rho\|_\infty$.)
An immediate consequence of Dvoretzky's theorem (in the form from Theorem 7.15) is now:

Theorem 8.15. Let $\varepsilon > 0$ and $m \leq c \varepsilon^2 kd / \log^2 k$. Then most $m$-dimensional subspaces $\mathcal{W} \subset \mathbb{C}^k \otimes \mathbb{C}^d$ have the property that any unit vector $x \in \mathcal{W}$ satisfies
\[ E(x) \geq \log k - \frac{1}{2} \frac{k}{d} - \varepsilon. \]
In some cases the result given by Theorem 8.15 can be improved. In particular, in order to obtain violations for the additivity of $S^{\min}$ we will need to produce "extremely entangled subspaces," in which every state has entropy $\log(k) - o(1)$ (see Section 8.4.3).

In the opposite direction, Exercise 8.9 shows an upper bound on the minimal entropy inside any subspace of given dimension.
Exercise 8.7 (Sharp bounds for the Lipschitz constant of $E$). In the notation of Lemma 8.12, assume $k \leq d$ and let $L = L_k$ be the Lipschitz constant of the function $\psi \mapsto E(\psi)$, calculated with respect to the geodesic distance on $S_{\mathbb{C}^k \otimes \mathbb{C}^d}$ (or on $\mathsf{P}(\mathbb{C}^k \otimes \mathbb{C}^d)$). Show that $L_k \sim \log k$.

Exercise 8.8. Show that any $s$-dimensional subspace $F \subset \mathbb{C}^n$ contains a unit vector $x$ satisfying $\|x\|_\infty \geq \sqrt{s/n}$.

Exercise 8.9 (An upper bound on the minimal entropy for general subspaces). Let $\mathcal{W} \subset \mathbb{C}^k \otimes \mathbb{C}^d$ be a subspace of dimension $\alpha kd$, with $\alpha \geq 1/k$.
(i) Using the previous exercise, show that $\mathcal{W}$ contains a unit vector $\psi$ satisfying $E(\psi) \leq h(\alpha) + (1 - \alpha) \log(k - 1)$, where $h(t) = -t \log t - (1-t) \log(1-t) \leq \log 2$ is the binary entropy function.
(ii) Conclude that if $\lambda \geq 1$ and $E(\psi) \geq \log k - \lambda/k$ for all $\psi \in \mathcal{W}$, then $\dim \mathcal{W} = O\big( \lambda d / (1 + \log \lambda) \big)$.
8.4.2. Entangled subspaces of small codimension. The argument from the previous section gives nothing for subspaces of dimension $cdk$ or larger: if $\varepsilon = \log d$, the conclusion of Theorem 8.15 does not even imply nonnegativity of $E(x)$. However, in view of Theorem 8.1, it seems plausible to quantify entanglement on subspaces of larger dimension. This can be achieved provided we use a suitable measure of entanglement.

One possibility is to use the $p$-Rényi entropy for $p = 1/2$. Recall from (8.5) that if we identify a unit vector $x \in \mathbb{C}^k \otimes \mathbb{C}^d$ with $A \in M_{k,d}$, then

$E_{1/2}(x) = 2 \log \|A\|_1,$

and our problem becomes a question about the behavior of $\|\cdot\|_1$ vs. $\|\cdot\|_2$ on subspaces of $M_{k,d}$.
Theorem 8.16. Let $k \le d$, and $W \subset \mathbb{C}^k \otimes \mathbb{C}^d$ be a random subspace of dimension $m$. The following holds with large probability: for every unit vector $x \in W$,

$E_{1/2}(x) \ge \log(k - m/d) - C.$

The conclusion of Theorem 8.16 yields nontrivial quantitative information for subspaces of codimension larger than $C_1 d$, for some constant $C_1$. This compares well with Theorem 8.1, which asserts that subspaces of codimension smaller than $d + k - 1$ are never fully entangled.

Proof. We identify $\mathbb{C}^k \otimes \mathbb{C}^d$ with $M_{k,d}$, and apply the low $M^*$-estimate (Theorem 7.45) to the norm $\|\cdot\|_1$. One needs the value of $M^* := \mathbf{E}\,\|X\|_{\mathrm{op}}$, where $X$ is uniformly distributed on the Hilbert–Schmidt sphere in $M_{k,d}$. The inequality $M^* \le C/\sqrt{k}$ follows from Proposition 6.36. Denoting $\alpha = 1 - m/kd$, we conclude that for every $A \in W$,

$\|A\|_1 \ge c\,\sqrt{k}\,\sqrt{\alpha}\,\|A\|_{HS},$

and therefore, for every unit vector $x \in W$ (now seen as a subspace of $\mathbb{C}^k \otimes \mathbb{C}^d$),

$E_{1/2}(x) = 2\log\|A\|_1 \ge \log(k - m/d) - C.$ □

8.4.3. Extremely entangled subspaces. In a different direction, we might look for subspaces of not-so-large dimension, but with near-maximal entropy of entanglement, say $\log k - o(1)$ for example. In view of Lemma 8.13, this requires $k = o(d)$. For simplicity, we will focus on the case $d = k^2$. This choice of dimensions allows us to produce an example of a pair of channels violating the additivity relation (8.8), although the method is applicable to a wider range of parameters.

Proposition 8.17. There are absolute constants $c, C$ such that the following holds. Let $k$ be an integer and set $d = k^2$, $m = ck^2$. With large probability, a random $m$-dimensional subspace $W \subset \mathbb{C}^k \otimes \mathbb{C}^d$ has the property that any unit vector $\psi \in W$ satisfies

$E(\psi) \ge \log k - \frac{C}{k}.$

Remark 8.18. Proposition 8.17 is optimal in the following sense. First, we cannot hope for larger values of $E(\psi)$ on a random subspace since (by Lemma 8.13) the global average value is precisely of order $\log k - \frac{C}{k}$. Second, subspaces of dimension larger than $Ck^2$ cannot have this property, as shown by Exercise 8.9 (ii).

We start by relating the entropy of very mixed states to their Hilbert–Schmidt distance to the maximally mixed state $\rho_*$ (cf. Lemma 1.20, which leads to a slightly stronger conclusion under a stronger hypothesis).

Lemma 8.19. If $\rho$ is any state on $\mathbb{C}^k$, then

$S(\rho) \ge \log k - k\,\|\rho - \rho_*\|_{HS}^2.$

Proof. The following inequality compares the entropy with its second order approximation: for every $x, t \in [0,1]$,

(8.21)   $-x\log x \ \ge\ -t\log t - (1+\log t)(x-t) - \frac{1}{t}(x-t)^2.$

To check inequality (8.21), notice that it can be rewritten as $\log(y) \le y - 1$ with $y = x/t$. Given a state $\rho \in \mathrm{D}(\mathbb{C}^k)$ with eigenvalues $(p_i)_{1 \le i \le k}$, we apply (8.21) with $x = p_i$ and $t = 1/k$. Summing over $i$, we obtain the announced inequality. □
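Since both sides of Lemma 8.19 depend only on the eigenvalues of $\rho$, the inequality can be tested on random probability vectors. The sketch below is an illustration only; the helper names (`entropy`, `lemma_8_19_bound`, `random_spectrum`) and the way spectra are sampled (normalized exponential variables) are assumptions made for this example.

```python
import math
import random

def entropy(p):
    """Von Neumann entropy of a state with eigenvalues p (convention 0 log 0 = 0)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def lemma_8_19_bound(p):
    """Right-hand side of Lemma 8.19: log k - k * ||rho - rho_*||_HS^2."""
    k = len(p)
    return math.log(k) - k * sum((x - 1.0 / k) ** 2 for x in p)

def random_spectrum(k, rng):
    """A random probability vector of length k (hypothetical sampling scheme)."""
    weights = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    rng = random.Random(0)
    p = random_spectrum(5, rng)
    print(entropy(p), ">=", lemma_8_19_bound(p))
```

The bound is tight at the maximally mixed state and becomes weak (even negative) for spectra far from uniform, which matches its role as a statement about "very mixed" states.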
It will be more convenient to work with a random matrix $M \in M_{k,d}$ of Hilbert–Schmidt norm 1, rather than with a random unit vector $\psi \in \mathbb{C}^k \otimes \mathbb{C}^d$ (both approaches are equivalent, see Section 0.8). Also recall that when a vector $\psi$ is identified with a matrix $M$, we have $\mathrm{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi| = MM^\dagger$, see (2.12).

Here is a proposition which (via Lemma 8.19) immediately implies Proposition 8.17.

Proposition 8.20. There are absolute constants $c, C$ such that the following holds. Let $k$ be an integer, $d = k^2$, $m = ck^2$ and let $S_{HS}$ be the Hilbert–Schmidt sphere in $M_{k,d}$. Consider the function $g : S_{HS} \to \mathbb{R}$ defined by

$g(M) = \Big\| MM^\dagger - \frac{\mathrm{I}}{k} \Big\|_{HS}.$

With large probability, a random $m$-dimensional subspace $W \subset M_{k,d}$ has the property that

(8.22)   $\sup_{M \in S_{HS} \cap W} g(M) \le C/k.$

Remark 8.21. We wish to point out that while Proposition 8.20 will be derived from Dvoretzky for Lipschitz functions, it can be rephrased in the language of the standard Dvoretzky’s theorem. Indeed, its assertion says that for every $M \in W$ with $\|M\|_{HS} = 1$ we have

(8.23)   $\frac{C^2}{k^2} \ \ge\ \Big\| MM^\dagger - \frac{\mathrm{I}}{k} \Big\|_{HS}^2 = \mathrm{Tr}\,|M|^4 - \frac{2}{k}\,\mathrm{Tr}\, MM^\dagger + \frac{1}{k^2}\,\mathrm{Tr}\,\mathrm{I} = \mathrm{Tr}\,|M|^4 - \frac{1}{k} \ \ge\ 0.$

Consequently,

(8.24)   $k^{-1/4}\|M\|_{HS} \ \le\ \|M\|_4 \ \le\ k^{-1/4}\Big(1 + \frac{C^2}{k}\Big)^{1/4}\|M\|_{HS} \ \le\ k^{-1/4}\Big(1 + \frac{C^2}{4k}\Big)\|M\|_{HS}$

for all $M \in W$. In other words, $W$ is $(1+\delta)$-Euclidean, with $\delta = \frac{C^2}{4k}$, when considered as a subspace of the Schatten normed space $\big(M_{k,d}, \|\cdot\|_4\big)$. On the other hand, the Dvoretzky dimension of $\big(M_{k,d}, \|\cdot\|_4\big)$ equals $k^{1/2}d$ (see Theorem 7.37) and therefore the general theory (such as Theorem 7.19) gives only $\delta = O(k^{-1/4})$ for $m$-dimensional subspaces. Although the Dvoretzky dimension is sharp for the size of isomorphically Euclidean subspaces (in the sense exemplified in Exercises 7.12 and 7.25), (8.24) supplies an instance where it can be beaten for almost isometrically Euclidean subspaces.
Before embarking on the proof of Proposition 8.20 we offer some preliminary remarks. We know from Proposition 6.36 (the elementary argument from Exercise 6.43 would actually be sufficient) that all singular values of a typical $M \in S_{HS}$ belong to the interval

(8.25)   $\Big[ \frac{1}{\sqrt{k}} - \frac{C}{\sqrt{d}},\ \frac{1}{\sqrt{k}} + \frac{C}{\sqrt{d}} \Big].$

It follows that $\|MM^\dagger - \mathrm{I}/k\|_\infty = O(k^{-3/2})$ and thus the median $M_g$ of $g$ satisfies $M_g \le C/k$. We next estimate the Lipschitz constant of $g$. The inequality

$\|MM^\dagger - NN^\dagger\|_{HS} \le \|M(M^\dagger - N^\dagger) + (M - N)N^\dagger\|_{HS} \le (\|M\|_{\mathrm{op}} + \|N\|_{\mathrm{op}})\,\|M - N\|_{HS}$

has the following immediate consequence.

Lemma 8.22. Let $\Omega_t = \{M \in S_{HS} : \|M\|_{\mathrm{op}} \le t\}$ for some $t \ge 0$. The function defined on $\Omega_t$ by $M \mapsto MM^\dagger$ is $2t$-Lipschitz with respect to the Hilbert–Schmidt norm.
In particular, the function $g$ is 2-Lipschitz on $\Omega_1 = S_{HS}$. However, a direct application of Theorem 7.15 yields only a bound of order $1/\sqrt{k}$ in (8.22). (This calculation parallels the one from Remark 8.21 that was expressed in the alternative language of the Dvoretzky dimension.) The trick is to apply concentration of measure twice: to the function $g$ itself, and to the function $f : M \mapsto \|M\|_{\mathrm{op}}$, which is used to control the Lipschitz constant of $g$.

The function $f$ is 1-Lipschitz on $S_{HS}$. By (8.25), its median equals $1/\sqrt{k} + O(1/k)$; in particular it is bounded by $2/\sqrt{k}$ for $k$ large enough. Consequently, Lévy’s lemma (Corollary 5.17) implies that

(8.26)   $\mathbf{P}\big( f(M) \ge 3/\sqrt{k} \big) \le \frac{1}{2}\exp(-k^2).$

Similarly, an application of the standard Dvoretzky’s theorem (Theorem 7.19) to the norm $\|\cdot\|_\infty$ with $\varepsilon = 1/\sqrt{k}$ (note that the dimension of the ambient space is $n = kd$ and that the Dvoretzky dimension is of order $d$, see Theorem 7.37) shows that a random $ck^2$-dimensional subspace $W$ satisfies $S_{HS} \cap W \subset \Omega_{3/\sqrt{k}}$ with high probability.
Starting from this point, we will present two possible paths to complete the
proof of Proposition 8.20. The first argument uses twice the general Dvoretzky
theorem for Lipschitz functions (Theorem 7.15) with the optimal dependence on
ε. The second argument is based on a trick due to Fukuda making the overall
argument more elementary. In terms of the hierarchy discussed at the beginning of
Section 6.1, the first proof we give uses principles from level (ii), namely the Dudley
inequality, whereas the second argument uses a single ε-net, staying at level (i).
Proof #1 of Proposition 8.20. We know from Lemma 8.22 that the function $g$ is $2t$-Lipschitz on $\Omega_t$. Let $\tilde{g}$ be a $2t$-Lipschitz extension of $g|_{\Omega}$ to $S_{HS}$. Note that, in any metric space $X$, it is possible to extend any $L$-Lipschitz function $h$ defined on a subset $Y$ without increasing the Lipschitz constant; use, e.g., the formula

$\tilde{h}(x) = \inf_{y \in Y} \big[ h(y) + L\,\mathrm{dist}(x,y) \big].$

This formula also guarantees that the extended function $\tilde{g}$ is circled. Since $\tilde{g} = g$ on most of $S_{HS}$, the median of $g$ (resp., $\tilde{g}$) is a central value of $\tilde{g}$ (resp., $g$). We apply Theorem 7.19 to $\tilde{g}$ with $\varepsilon = 1/k$, $\mu = M_g$ and $L = 2t = 6k^{-1/2}$ to get

$\sup_{S_{HS} \cap W} |\tilde{g} - \mu| \le 1/k$

on a random subspace $W \subset M_{k,d}$ of dimension $m = c_0 \cdot kd \cdot \big(k^{-1}/(6k^{-1/2})\big)^2 = cd$. We then have

$\sup_{S_{HS} \cap W} \tilde{g} \le \mu + \frac{1}{k} \le \frac{C'}{k}.$

If $S_{HS} \cap W \subset \Omega$ (which, as noticed before, holds with large probability), $g$ and $\tilde{g}$ coincide on $S_{HS} \cap W$ and therefore $g \le C'/k$ on $S_{HS} \cap W$, proving (8.22). □
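The Lipschitz extension formula used in Proof #1 can be illustrated on a finite subset of the real line. This is a toy sketch under stated assumptions: the function name `lipschitz_extension`, the sample points and the sample values are all hypothetical, and the metric is the usual distance on $\mathbb{R}$ rather than the geodesic or Hilbert–Schmidt distance on $S_{HS}$.

```python
def lipschitz_extension(points, values, L, dist):
    """Return x -> min_y [h(y) + L * dist(x, y)] for h given on a finite set Y.

    points: the finite set Y; values: h(y) for y in Y (assumed L-Lipschitz on Y).
    """
    def extended(x):
        return min(v + L * dist(x, y) for y, v in zip(points, values))
    return extended

if __name__ == "__main__":
    Y = [0.0, 1.0, 3.0]
    h = [0.0, 0.5, -0.2]          # 1-Lipschitz on Y
    h_ext = lipschitz_extension(Y, h, L=1.0, dist=lambda a, b: abs(a - b))
    print([round(h_ext(x), 3) for x in (-1.0, 0.0, 0.5, 1.0, 2.0, 3.0, 4.0)])
```

On an infinite set $Y$ the minimum becomes an infimum, as in the displayed formula; the finite case is enough to see that the extension agrees with $h$ on $Y$ and keeps the constant $L$.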
Proof #2 of Proposition 8.20. We use the following lemma, which allows us to discretize the supremum in (8.22).

Lemma 8.23. Let $\mathcal{N}$ be an $\varepsilon$-net in $(S_{HS} \cap W, |\cdot|)$ with $\varepsilon < \sqrt{2} - 1$. Then

$\sup_{M \in S_{HS} \cap W} g(M) \ \le\ \frac{1}{1 - \varepsilon^2 - 2\varepsilon}\ \sup_{M \in \mathcal{N}} g(M).$

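Proof of Lemma 8.23. Let $M \in S_{HS} \cap W$. There exists $M_0 \in \mathcal{N}$ such that $\delta := \|M - M_0\|_{HS} \le \varepsilon$. We write $M = M_0 + \delta N$ with $N \in S_{HS}$, and consider also $A = M_0 + N$ and $B = M_0 - N$ (note that the operators $N$, $A$ and $B$ all belong to $W$). One checks that $\|A\|_{HS}^2 = 2 - \delta$ and $\|B\|_{HS}^2 = 2 + \delta$. We then set

$\Delta := MM^\dagger - M_0M_0^\dagger = \frac{\delta}{2}\big( AA^\dagger - BB^\dagger + 2\delta NN^\dagger \big),$

and the triangle inequality implies

$\|\Delta\|_{HS} \le \frac{\delta}{2}\Big( \|AA^\dagger - (2-\delta)\rho_*\|_{HS} + \|BB^\dagger - (2+\delta)\rho_*\|_{HS} + \|2\delta NN^\dagger - 2\delta\rho_*\|_{HS} \Big).$

We can thus estimate

$g(M) \le g(M_0) + \|MM^\dagger - M_0M_0^\dagger\|_{HS}$
$\quad \le g(M_0) + \frac{\delta}{2}\Big( (2-\delta)\,g(A/\|A\|_{HS}) + (2+\delta)\,g(B/\|B\|_{HS}) + 2\delta\,g(N) \Big)$
$\quad \le g(M_0) + (2\delta + \delta^2) \sup_{X \in S_{HS} \cap W} g(X)$
$\quad \le g(M_0) + (2\varepsilon + \varepsilon^2) \sup_{X \in S_{HS} \cap W} g(X),$

and taking the supremum over $M \in S_{HS} \cap W$ gives the result. □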
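The algebraic identities used in the proof of Lemma 8.23 can be checked numerically. The sketch below is a sanity check under simplifying assumptions: it works with real $k \times d$ matrices (so the adjoint is the transpose), the helper names are hypothetical, and the perturbation size 0.2 is arbitrary.

```python
import math
import random

def hs_norm(X):
    """Hilbert-Schmidt (Frobenius) norm of a real matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in X for x in row))

def combine(a, X, b, Y):
    """Entrywise a*X + b*Y."""
    return [[a * x + b * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def gram(X):
    """X X^T, the real analogue of M M^dagger."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in X] for r1 in X]

def check_decomposition(k=3, d=5, seed=0):
    rng = random.Random(seed)
    rand = lambda: [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]
    M0 = rand()
    M0 = combine(1 / hs_norm(M0), M0, 0.0, M0)       # unit vector M0
    M = combine(1.0, M0, 0.2, rand())
    M = combine(1 / hs_norm(M), M, 0.0, M)           # nearby unit vector M
    diff = combine(1.0, M, -1.0, M0)
    delta = hs_norm(diff)
    N = combine(1 / delta, diff, 0.0, diff)          # M = M0 + delta * N
    A = combine(1.0, M0, 1.0, N)
    B = combine(1.0, M0, -1.0, N)
    lhs = combine(1.0, gram(M), -1.0, gram(M0))      # Delta
    inner = combine(1.0, combine(1.0, gram(A), -1.0, gram(B)), 2 * delta, gram(N))
    rhs = combine(delta / 2, inner, 0.0, inner)
    err = hs_norm(combine(1.0, lhs, -1.0, rhs))
    return delta, hs_norm(A) ** 2, hs_norm(B) ** 2, err

if __name__ == "__main__":
    print(check_decomposition())
```

The same computation shows where the constant in Lemma 8.23 comes from: with $\varepsilon = 1/3$ one gets $1/(1 - \varepsilon^2 - 2\varepsilon) = 9/2$.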


We now return to the proof of the Proposition. The random subspace is realized as $W = V(\mathbb{C}^m)$ where $V : \mathbb{C}^m \to M_{k,d}$ is a Haar-distributed isometry. If $\mathcal{M}$ is an $\varepsilon$-net in $(S_{\mathbb{C}^m}, |\cdot|)$, then $\mathcal{N} = V(\mathcal{M})$ is an $\varepsilon$-net in $(S_{HS} \cap W, |\cdot|)$. Let us choose (for example) $\varepsilon = 1/3$; by Lemma 5.3, we can ensure that $\mathrm{card}\,\mathcal{N} \le 36^m$.

We apply the “local Lévy lemma” (Corollary 5.35) to the function $g$ with the subset $\Omega = \Omega_{3/\sqrt{k}} \subset S_{HS}$ and $\varepsilon = 1/k$. The function $g|_\Omega$ is $6/\sqrt{k}$-Lipschitz, and therefore, using (8.26),

$\mathbf{P}(\{g > M_g + 1/k\}) \le \mathbf{P}(S_{HS} \setminus \Omega) + 2\exp(-d/36) \le C\exp(-cd).$

Using the union bound and Lemma 8.23, this gives

$\mathbf{P}\Big( \sup_{M \in S_{HS} \cap W} g(M) \ge \frac{9}{2}\,(M_g + 1/k) \Big) \le 36^m\, C \exp(-cd),$

and this quantity is (much) smaller than 1 provided $m \le c_1 d$, for sufficiently small $c_1 > 0$. Since $M_g = O(1/k)$, this concludes the proof. □
8.4.4. Counterexamples to the additivity problem. Using Proposition 8.17 and the approach used in Proposition 8.6 for the $p$-Rényi entropy, we can show the following.

Proposition 8.24. There is a constant $c$ such that the following holds. Let $d = k^2$, $m = ck^2$ and $\Phi : M_m \to M_k$ be a random channel, obtained by (8.6) from a Haar-distributed isometry $V : \mathbb{C}^m \to \mathbb{C}^k \otimes \mathbb{C}^d$. Set $\Psi = \bar{\Phi}$, the channel obtained from $\bar{V}$, the complex conjugate of $V$. If $k$ is large enough, then with large probability,

$S_{\min}(\Phi \otimes \Psi) < S_{\min}(\Phi) + S_{\min}(\Psi).$

Proof. Denote by $W \subset \mathbb{C}^k \otimes \mathbb{C}^d$ the range of $V$. From Lemma 8.2, we have

$S_{\min}(\Phi) = \min_{\psi \in W,\ |\psi| = 1} E(\psi).$

Note that $S_{\min}(\Phi) = S_{\min}(\Psi)$. From Proposition 8.17, we have with large probability

$S_{\min}(\Phi) \ge \log k - \frac{C}{k}.$

On the other hand, we know from Lemma 8.7 that applying $\Phi \otimes \bar{\Phi}$ to the maximally entangled state yields an output state with an eigenvalue greater than or equal to $\frac{\dim W}{\dim M_{k,d}} = \frac{m}{kd} = \frac{c}{k}$. Then, a simple argument using just concavity of $S$ (see Proposition 1.19) reduces the problem to calculating the entropy of the state with one eigenvalue equal to $\frac{c}{k}$ and all the remaining ones identical, which yields

$S_{\min}(\Phi \otimes \Psi) \le 2\log k - \frac{c\log k}{k} + \frac{1}{k}.$

We have therefore $S_{\min}(\Phi \otimes \Psi) < S_{\min}(\Phi) + S_{\min}(\Psi)$ provided $k$ is large enough. □
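The last estimate in the proof can be checked by computing the entropy of the extremal spectrum directly. The sketch below is illustrative only: the function names are hypothetical, the state is assumed to live on a space of dimension $k^2$ with one eigenvalue $c/k$ and the remaining $k^2 - 1$ eigenvalues equal, and the values of $c$ tried (with $0 < c \le 1$) are arbitrary since the text does not specify the constant.

```python
import math

def two_level_entropy(k: int, c: float) -> float:
    """Entropy of a spectrum with one eigenvalue c/k and k^2 - 1 equal eigenvalues."""
    top = c / k
    rest = (1.0 - top) / (k * k - 1)
    return -top * math.log(top) - (k * k - 1) * rest * math.log(rest)

def stated_upper_bound(k: int, c: float) -> float:
    """The bound 2 log k - c log(k)/k + 1/k from the proof of Proposition 8.24."""
    return 2 * math.log(k) - c * math.log(k) / k + 1.0 / k

if __name__ == "__main__":
    for k in (10, 100, 1000):
        print(k, two_level_entropy(k, 0.5), stated_upper_bound(k, 0.5), 2 * math.log(k))
```

Comparing with $2(\log k - C/k)$ shows why the violation needs $k$ large: the gain $c\log k/k$ must exceed $(1 + 2C)/k$, that is, $c\log k > 1 + 2C$.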

8.5. Entangled pure states in multipartite systems


8.5.1. Geometric measure of entanglement. The definition of the $p$-entropy of entanglement relies on the Schmidt decomposition, which is specific to the bipartite case. However, the case $p = \infty$ is different since its definition only involves the largest Schmidt coefficient, and this quantity can be defined in a multipartite setting as the square of the maximal overlap with a product vector. In the multipartite setting, the corresponding “$\infty$-entropy of entanglement” has been introduced in the QIT literature via the geometric measure of entanglement.

Let $H = H_1 \otimes \cdots \otimes H_k$ be a multipartite real or complex Hilbert space. Given a unit vector $\psi \in H$, the geometric measure of entanglement of $\psi$ is defined as

(8.27)   $g(\psi) = \max\big\{ |\langle \psi, \psi_1 \otimes \cdots \otimes \psi_k \rangle| \ :\ \psi_i \text{ unit vector in } H_i,\ 1 \le i \le k \big\}$

(cf. (2.13)) and the $\infty$-entropy of entanglement is

(8.28)   $E_\infty(\psi) = -2\log g(\psi).$

We always have $E_\infty(\psi) \ge 0$, and $E_\infty(\psi)$ is equal to 0 if and only if $\psi$ is a product vector. Therefore, it makes sense to call unit vectors $\psi$ which maximize $E_\infty(\psi)$ “maximally entangled” vectors. In the bipartite case $\mathbb{C}^d \otimes \mathbb{C}^d$, one recovers the usual notion of a maximally entangled state (see Section 2.2.4). However, in the multipartite case it seems hard to describe the maximally entangled vectors. The problem has an immediate geometric reformulation.
Proposition 8.25 (easy). Let $H = H_1 \otimes \cdots \otimes H_k$. The following numbers are equal:
(i) The minimal value of $g(\psi)$ over all unit vectors $\psi \in H$.
(ii) The inradius of $B_{H_1} \hat{\otimes} \cdots \hat{\otimes} B_{H_k}$, where $B_{H_i}$ denotes the unit ball in $H_i$.
(iii) The largest constant $c$ such that any $k$-linear map $\phi : H_1 \times \cdots \times H_k \to \mathbb{C}$ satisfies
$c\,|||\phi||| \le \max\{ |\phi(x_1, \ldots, x_k)| \ :\ |x_1| \le 1, \ldots, |x_k| \le 1 \},$
where $|||\cdot|||$ denotes the norm
$|||\phi|||^2 = \sum_{x_1 \in \mathcal{B}_1} \cdots \sum_{x_k \in \mathcal{B}_k} |\phi(x_1, \ldots, x_k)|^2$
with $\mathcal{B}_i$ an orthonormal basis in $H_i$ (the value of $|||\cdot|||$ does not depend on the choice of the bases).

Denote by $g_{\min}(H)$ the common value of the numbers appearing in Proposition 8.25. There is a simple lower bound on $g_{\min}(H)$.

Lemma 8.26. If $H = \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_k}$ or $H = \mathbb{R}^{d_1} \otimes \cdots \otimes \mathbb{R}^{d_k}$ with $d_1 \le \cdots \le d_k$, then $g_{\min}(H) \ge 1/\sqrt{d_1 \cdots d_{k-1}}$. Equivalently, for every unit vector $\psi \in H$,

$E_\infty(\psi) \le \log(d_1) + \cdots + \log(d_{k-1}).$

Proof of Lemma 8.26. The same argument works for the real case and the complex case; we prove the Lemma by induction on $k$. For $k = 2$, we have

$g_{\min}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}) = \frac{1}{\min(\sqrt{d_1}, \sqrt{d_2})},$

which is a restatement of the inequalities between the trace norm and the Hilbert–Schmidt norm on the space of $d_1 \times d_2$ matrices. For the induction step, we use the bound (which is again the $k = 2$ case)

$g_{\min}(\mathbb{C}^{d_1} \otimes H) \ge \frac{1}{\sqrt{d_1}}\, g_{\min}(H).$ □
8.5.2. The case of many qubits. We will now focus, for simplicity, on the particular case of $k$ qubits, i.e., $d_1 = d_2 = \cdots = d_k = 2$ in the complex case. In this section it is convenient to define entropy via logarithm to the base 2 and so we will exceptionally use $E^{(2)}_\infty(\psi) := -2\log_2 g(\psi)$ (cf. (8.28)). In this notation, the conclusion of Lemma 8.26 can be rewritten as follows: for any pure state $\psi \in (\mathbb{C}^2)^{\otimes k}$, we have $E^{(2)}_\infty(\psi) \le k - 1$. The following seems to be unknown.

Problem 8.27. Does there exist a constant $C$, and for each $k$ a unit vector $\psi \in (\mathbb{C}^2)^{\otimes k}$, such that

$E^{(2)}_\infty(\psi) \ge k - C\ ?$

The next proposition shows that random states are typically very entangled, but not entangled enough to give a positive answer to Problem 8.27.

Proposition 8.28. There exist absolute constants $c, C$ such that a uniformly distributed random unit vector $\psi \in (\mathbb{C}^2)^{\otimes k}$ satisfies with high probability

$c\,\frac{\sqrt{k\log k}}{2^{k/2}} \ \le\ g(\psi) \ \le\ C\,\frac{\sqrt{k\log k}}{2^{k/2}}.$

The conclusion of Proposition 8.28 can be equivalently rewritten as

$k - \log(k) - \log\log(k) - C' \ \le\ E^{(2)}_\infty(\psi) \ \le\ k - \log(k) - \log\log(k) + C'.$
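The passage between the two forms of the estimate can be written out explicitly. This is only a sketch: the symbol $a$ stands for either of the constants $c$, $C$ from Proposition 8.28, and the outer logarithms are taken to base 2, which is an assumption about the intended convention (changing the base of the inner logarithm only alters the additive constant).

```latex
\begin{align*}
g(\psi) = a\,\frac{\sqrt{k\log k}}{2^{k/2}}
\quad\Longrightarrow\quad
E^{(2)}_\infty(\psi) &= -2\log_2 g(\psi) \\
&= k - \log_2(k\log k) - 2\log_2 a \\
&= k - \log_2 k - \log_2\log k - 2\log_2 a .
\end{align*}
```

Since $E^{(2)}_\infty$ is decreasing in $g$, the upper bound on $g$ (with $a = C$) gives the lower bound on the entropy, and the lower bound on $g$ (with $a = c$) gives the upper bound; one may take $C' = 2\max(|\log_2 c|, |\log_2 C|)$.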
Proof of Proposition 8.28. The average of $g$ over the unit sphere is exactly the mean width of $K = (B_{\mathbb{C}^2})^{\hat{\otimes} k}$ (we think of $(\mathbb{C}^2)^{\otimes k}$ as a $2^{k+1}$-dimensional real space). The concentration of the functional $g$ around its mean follows from Lévy’s lemma (see Table 5.2). Indeed, since $K$ is contained in the unit ball, the functional $g = w(K, \cdot)$ is 1-Lipschitz and therefore

$\mathbf{P}(|g(\psi) - w(K)| > t) \le 2\exp(-2^k t^2).$

It remains to show that $w(K) = \Theta(\sqrt{k\log k}\; 2^{-k/2})$, or equivalently that $w_G(K) = \Theta(\sqrt{k\log k})$. The upper bound follows from a standard $\varepsilon$-net argument: let $\mathcal{N}$ be an $\varepsilon$-net in $(S_{\mathbb{C}^2}, |\cdot|)$ with $\mathrm{card}\,\mathcal{N} \le (2/\varepsilon)^4$ (see Lemma 5.3). From Exercise 5.7 (the weaker result from Lemma 5.9 would be enough here), it follows that $\mathrm{conv}\,\mathcal{N} \supset (1 - \varepsilon^2/2)\,B_{\mathbb{C}^2}$. Consequently, denoting by $\mathcal{N}^{\otimes k}$ the set

$\mathcal{N}^{\otimes k} = \{\psi_1 \otimes \cdots \otimes \psi_k \ :\ \psi_i \in \mathcal{N} \text{ for } 1 \le i \le k\},$

we have

$\mathrm{conv}(\mathcal{N}^{\otimes k}) \supset (1 - \varepsilon^2/2)^k K.$

Using Lemma 6.1, we conclude that

$w_G(\mathrm{conv}(\mathcal{N}^{\otimes k})) \le \sqrt{2\log \mathrm{card}(\mathcal{N}^{\otimes k})} \le \sqrt{8k\log(2/\varepsilon)}.$

Choosing $\varepsilon = 1/\sqrt{k}$ gives the upper bound $w_G(K) = O(\sqrt{k\log k})$.
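The arithmetic behind the choice $\varepsilon = 1/\sqrt{k}$ can be spelled out as follows. The sketch uses only the inclusion $\mathrm{conv}(\mathcal{N}^{\otimes k}) \supset (1 - \varepsilon^2/2)^k K$, the cardinality bound on the net, and Bernoulli's inequality $(1-x)^k \ge 1 - kx$; the explicit factor 2 is a byproduct of this particular way of organizing the computation and is not claimed in the text.

```latex
\begin{align*}
w_G(K) &\le \Big(1-\frac{\varepsilon^2}{2}\Big)^{-k}\, w_G\big(\mathrm{conv}(\mathcal{N}^{\otimes k})\big)
 \le \Big(1-\frac{\varepsilon^2}{2}\Big)^{-k}\sqrt{8k\log(2/\varepsilon)}, \\
\varepsilon = \frac{1}{\sqrt{k}}:\qquad
\Big(1-\frac{1}{2k}\Big)^{k} &\ge 1 - \frac{1}{2} = \frac{1}{2}, \qquad
\log(2/\varepsilon) = \log 2 + \tfrac{1}{2}\log k, \\
w_G(K) &\le 2\sqrt{8k\big(\log 2 + \tfrac{1}{2}\log k\big)} = O\big(\sqrt{k\log k}\big) \qquad (k \ge 2).
\end{align*}
```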


To show that this argument is sharp, we are going to construct large separated sets in $K$. Start with a set $\mathcal{M} = \{x_1, \ldots, x_N\}$ which is $1/\sqrt{k}$-separated in the projective space over $\mathbb{C}^2$, with $N = \mathrm{card}(\mathcal{M}) \ge ck$. (The estimate on the size of separated sets in $\mathsf{P}(\mathbb{C}^2)$ is an elementary special case of Theorem 5.11 or Exercise 5.10; note that $\mathsf{P}(\mathbb{C}^2)$ identifies with the Bloch sphere, a 2-dimensional Euclidean sphere of radius 1/2, if we use the metric (B.5).) This means that for $i \ne j$, we have $|\langle x_i, x_j \rangle| \le 1 - 1/2k$.

We claim that a large subset of $\mathcal{M}^{\otimes k}$ is separated. To construct it, introduce $Q = \{1, \ldots, N\}^k$, equipped with the normalized Hamming metric, defined for $\alpha, \beta \in Q$ by

$d(\alpha, \beta) = \frac{1}{k}\,\mathrm{card}\{i \ :\ \alpha_i \ne \beta_i\}.$

To each element $\alpha = (\alpha_1, \ldots, \alpha_k) \in Q$ we associate the vector

$x_\alpha = x_{\alpha_1} \otimes \cdots \otimes x_{\alpha_k} \in K.$

When $\alpha, \beta \in Q$ are such that $d(\alpha, \beta) \ge 1/10$ (i.e., $\alpha$ and $\beta$ differ in at least $k/10$ coordinates), we have

$|\langle x_\alpha, x_\beta \rangle| = \prod_{j=1}^{k} |\langle x_{\alpha_j}, x_{\beta_j} \rangle| \le (1 - 1/2k)^{k/10} \le c$

for some constant $c < 1$. We then have $|x_\alpha - x_\beta| \ge c' := \sqrt{2 - 2c} > 0$. If we start from a subset $Q' \subset Q$ which is $1/10$-separated, the set $\{x_\alpha : \alpha \in Q'\}$ is $c'$-separated in $(\mathbb{C}^2)^{\otimes k}$. By the Sudakov inequality (Proposition 6.10), we have then

$w_G(K) \ge c\sqrt{\log \mathrm{card}\, Q'}.$

It remains to give a lower bound on the size of $Q'$. Using the inequality (5.17) from Chapter 5 (which was obtained by the greedy packing algorithm), we obtain $\mathrm{card}\, Q' \ge N^{k(1 - H_N(1/5))} \ge N^{c''k}$ for some constant $c'' > 0$. It follows that $w_G(K) \ge c\sqrt{k\log k}$. □
8.5.3. Multipartite entanglement in real Hilbert spaces. It turns out that in the real case, Lemma 8.26 is surprisingly sharp, so that the real version of Problem 8.27 has a positive answer with $C = 1$. The construction from Proposition 8.29 seems to be specific to the real case. For variants related to Clifford algebras, see Exercise 8.10.

Proposition 8.29. For any integer $k \ge 1$, we have

$g_{\min}((\mathbb{R}^2)^{\otimes k}) = 2^{-(k-1)/2}.$

Proof of Proposition 8.29. The inequality $g_{\min}((\mathbb{R}^2)^{\otimes k}) \ge 2^{-(k-1)/2}$ is a consequence of Lemma 8.26. Using Proposition 8.25(iii), the converse inequality will follow provided we show the existence of a $k$-linear form $\phi : (\mathbb{R}^2)^k \to \mathbb{R}$ such that $|\phi(x_1, \ldots, x_k)| \le 1$ for unit vectors $x_1, \ldots, x_k$, and $|||\phi||| = 2^{(k-1)/2}$. Let $\theta : \mathbb{R}^2 \to \mathbb{C}$ be the canonical isomorphism. It is easily verified that

$\phi : (x_1, \ldots, x_k) \mapsto \mathrm{Re} \prod_{i=1}^{k} \theta(x_i)$

(where $\prod$ means complex multiplication) satisfies the desired conclusion. □
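The two properties of the form $\phi$ claimed in the proof of Proposition 8.29 can be verified by direct computation for small $k$. The following sketch is for illustration; the function names `phi` and `triple_norm_squared` are hypothetical, and $\theta(a, b) = a + ib$ is taken as the canonical isomorphism.

```python
import itertools
import math
import random

def phi(vectors):
    """Real part of the complex product of theta(x_i), with theta(a, b) = a + ib."""
    prod = complex(1.0, 0.0)
    for a, b in vectors:
        prod *= complex(a, b)
    return prod.real

def triple_norm_squared(k):
    """Sum of phi(e_{i_1}, ..., e_{i_k})^2 over all k-tuples of standard basis vectors of R^2."""
    basis = [(1.0, 0.0), (0.0, 1.0)]
    return sum(phi(choice) ** 2 for choice in itertools.product(basis, repeat=k))

if __name__ == "__main__":
    for k in range(1, 7):
        print(k, triple_norm_squared(k), 2 ** (k - 1))
    rng = random.Random(0)
    angles = [rng.uniform(0, 2 * math.pi) for _ in range(5)]
    print(abs(phi([(math.cos(t), math.sin(t)) for t in angles])) <= 1.0)
```

The value $2^{k-1}$ arises because a product of basis vectors maps to $i^r$, where $r$ is the number of factors equal to $(0,1)$, and this has real part $\pm 1$ exactly when $r$ is even. On unit vectors the product has modulus 1, so $|\phi| \le 1$.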

Exercise 8.10 (Clifford matrices and multipartite maximally entangled states). Given $d \ge 2$, let $N$ be such that $M_N(\mathbb{R})$ contains a $d$-dimensional subspace $E$ in which every matrix is a multiple of an isometry (the smallest possible $N$ is described in Theorem 11.4). Show that

(8.29)   $g_{\min}((\mathbb{R}^d)^{\otimes k}) \le \frac{\sqrt{N}}{d^{k/2}}.$

When $d \in \{2, 4, 8\}$, one can achieve $N = d$ and the upper bound (8.29) matches the lower bound from Lemma 8.26.

Notes and Remarks


Section 8.1. Theorem 8.1 was proved in [Wal02, Par04, WS08]. The statement from Exercise 8.1 is taken from [CDJ+08].

Section 8.2. There are multiple operational motivations to use the von Neumann entropy when defining the entropy of entanglement in (8.1). Given a bipartite state $\rho$, there are several ways to quantify how much entanglement it contains. Two approaches that are in some sense extremal and dual to each other are the entanglement of distillation (the rate at which one can LOCC-transform copies of $\rho$ into Bell states, see also Chapter 12) and the entanglement cost (the rate at which one can LOCC-transform Bell states into copies of $\rho$). For a general survey on entanglement measures we refer to [PV07]. If we restrict ourselves to pure states as we do in this chapter, all these entanglement measures coincide with the entropy of entanglement (see Chapter 12.5.2 in [NC00]).

The “additivity conjecture” (8.8) has been a major open problem in QIT, particularly since work by Shor [Sho04], who showed that the additivity of the minimum output von Neumann entropy was equivalent to the additivity of several other quantities, including the capacity of quantum channels to carry classical information and the entanglement of formation (defined later in Section 10.3.1). For example, the entire ICM 2006 talk by A. Holevo [Hol06] was devoted to this circle of ideas. A positive answer would have greatly simplified the theory, leading to a “single letter” formula for the aforementioned capacity, see, e.g., [Hol06]. However, the answer to the conjecture was shown to be negative by Hastings [Has09].
Exercise 8.3 is based on [FW07].

Proposition 8.4 was proved in [Wat05, Aud09, Sza10]. We follow here the argument from [Sza10].

Our presentation in this chapter barely scratches the surface of the topic of quantum channel capacities. In the quantum context, there are many notions of capacity (see, e.g., [Wil17]) and each of them leads to its own class of mathematical questions. For a recent overview of applications of operator space theory to the problem of estimating quantum capacity (i.e., the capacity to carry quantum information) see [LJL15].


Section 8.3. The question of the multiplicativity of $\|\cdot\|_{1 \to p}$ (8.10) has been considered in [WH02] and solved in [HW08]. The presentation in the text is based on [ASW10], where the connection to Dvoretzky’s theorem was noticed. It is also known that $\|\cdot\|_{1 \to p}$ is not multiplicative for $p$ close to 0 [CHL+08], but part of the range $0 \le p < 1$ is not covered by any approach. The explicit example from Exercise 8.5 comes from [GHP10].

Modulo the optimal dependence on the dimension, Theorem 8.9 concerning $\varepsilon$-randomizing channels has been proved in [HLSW04]; the parasitic logarithmic factor has been removed in [Aub09]. A step towards derandomization has also been made in [Aub09], where it was shown that the unitaries in question can be sampled from any Kraus decomposition of the completely randomizing channel.
Section 8.4. Lemma 8.12 appears in [HLW06] with the value $C = \sqrt{8}/\log 2$. The argument leading to a better constant ($C_k \sim 1$) in Lemma 8.12 that is sketched in Exercise 8.7 was an unpublished byproduct of the work on [ASW11]. For various aspects of continuity of the von Neumann entropy, see [Win16].

The exact formula (8.20) from Lemma 8.13 has been conjectured in [Pag93] and proved in [FK94, SR95, Sen96]. Having the precise form (as opposed to the weaker version stated in Remark 8.14) results in better constants in Theorem 10.16 in Section 10.3.1.

Theorem 8.16 appears to be new.


After Hastings’s counterexample to the additivity conjecture [Has09] appeared, several papers tried to simplify and extend the original approach, including [BH10, FKM10, FK10, ASW11, Fuk14]. We follow mostly [ASW11]; Lemma 8.23 and the second proof of Proposition 8.20 are from [Fuk14].

A completely different strategy was used in a series of papers initiated by Collins–Nechita [CN10, CN11] via free probability; it allows one to derive results which are more precise in some regimes. Here is a sample theorem from [BCN12, CFN15]. Fix an integer $k$ and $t \in (0,1)$. There is a deterministic convex set $K_{k,t} \subset \mathrm{D}(\mathbb{C}^k)$ with the following property: if $\Phi : M_m \to M_k$ is a quantum channel obtained from a random embedding $V : \mathbb{C}^m \to \mathbb{C}^k \otimes \mathbb{C}^d$ with $m = tkd$, then, almost surely as $d \to \infty$, the set $\Phi(\mathrm{D}(\mathbb{C}^m))$ converges to $K_{k,t}$. This allows one, at least in principle, to answer any question about minimal output entropies in this range of parameters. It was subsequently shown in [BCN16] that generic channels violating additivity can be obtained by following this strategy if and only if $k \ge 183$. Moreover, the defect of non-additivity, i.e., the difference between the two sides of (8.9), is generically almost $\log 2$ for large $k$ (or 1 bit if we use $\log_2$ to define entropy). This improves on the preceding arguments (including the one presented in the text), which showed a violation that was minuscule. Still, in contrast with the Hayden–Winter example [HW08] (cf. Remark 8.8), the demonstrated violation does not go to infinity as the dimensions increase. A drawback of the free probability-based method is that the results are valid only when the environment dimension $d$ goes to infinity, and obtaining explicit values of $d$ for which these asymptotic phenomena hold requires extra analysis, which is not supplied in [BCN16]. For more information on this approach we refer to the survey [CN16]. Still another approach, due to Collins [Col16] and perhaps more conceptual, relies on the Haagerup inequality about the norms of convolutions on the free group.

In the opposite direction, it is proved in [Mon13] that random quantum channels satisfy a weak form of multiplicativity.
se

Section 8.5. The geometric measure of entanglement was considered under a different terminology in [Shi95, BL01]; see also [WG03]. Lemma 8.26 is well-known and appears for example in [AS06, JHK+08, Arv09].

We could not locate Problem 8.27 in the literature although it seems a very natural question. It is known that $E_\infty(\psi) < k - 1$ for any unit vector $\psi \in (\mathbb{C}^2)^{\otimes k}$ whenever $k \ge 3$ (see [JHK+08]). The fact that random states are very entangled (the upper bound from Proposition 8.28) has been noticed and used in [GFE09, BMW09].

The argument behind Proposition 8.29 and Exercise 8.10 has been communi-
cated to us by Mikael de la Salle (see also Theorem 3.3 in [Hil07a]). The ? pa-
3 b4
pers [Hil06, Hil07a]? compute also the exact values gmin ppR q q “ 1{ 7 and
gmin ppR3 qb4 q “ 1{ 21.
CHAPTER 9

Geometry of the Set of Mixed States

Let $H = H_1 \otimes \cdots \otimes H_k$ be a multipartite Hilbert space. We are interested in the geometry of the set of separable states on $H$, and related questions. To simplify the exposition we are going to focus on two specific cases: the bipartite case $H = \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ (we may restrict ourselves to the balanced case $d_1 = d_2 = d$ in order to keep notation simple) and the case of $k$ qubits $H = (\mathbb{C}^2)^{\otimes k}$. However, essentially all the methods carry over to the general case, except that the formulas may sometimes become not very elegant (see, for example, Theorem 9.12). The sets $\mathrm{D} = \mathrm{D}(H)$, $\mathrm{Sep} = \mathrm{Sep}(H)$ and $\mathrm{PPT} = \mathrm{PPT}(H)$ were defined in Chapter 2. Recall that $\mathrm{Sep} \subset \mathrm{PPT} \subset \mathrm{D}$. One of the main goals of this chapter is to produce a table (Table 9.1) which contains radii estimates for these sets, similar to Table 4.1 for the classical examples of convex bodies. The following table (Table 9.2) matches estimates from Table 9.1 to the corresponding theorems in the text.

Table 9.1. Radii estimates for sets of quantum states. In each row $n$ denotes the dimension of the corresponding Hilbert space. The first column reads as $\mathrm{D} = \mathrm{D}(\mathbb{C}^n)$, $\mathrm{Sep} = \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$, $\mathrm{PPT} = \mathrm{PPT}(\mathbb{C}^d \otimes \mathbb{C}^d)$ and $\mathrm{Sep}' = \mathrm{Sep}((\mathbb{C}^2)^{\otimes k})$. The notation $\Theta^*$ indicates a two-sided estimate up to multiplicative factors polynomial in $\log n$. References to precise statements can be found in Table 9.2. Quantities in each row are non-decreasing from left to right, see Exercise 4.51, Proposition 2.5 and Proposition 2.18. (This gives in particular non-matching two-sided bounds for the missing entry in the last row.)

K    | n   | inrad(K)                | w(K°)^{-1}              | vrad(K)                  | w(K)             | outrad(K)
D    | n   | $1/\sqrt{n(n-1)}$       | $\sim 1/(2\sqrt{n})$    | $\sim e^{-1/4}/\sqrt{n}$ | $\sim 2/\sqrt{n}$ | $\sqrt{(n-1)/n}$
Sep  | d^2 | $1/\sqrt{n(n-1)}$       | $\Theta^*(n^{-3/4})$    | $\Theta(n^{-3/4})$       | $\Theta(n^{-3/4})$ | $\sqrt{(n-1)/n}$
PPT  | d^2 | $1/\sqrt{n(n-1)}$       | $\Theta(n^{-1/2})$      | $\Theta(n^{-1/2})$       | $\Theta(n^{-1/2})$ | $\sqrt{(n-1)/n}$
Sep' | 2^k | $\Theta(n^{-1.292...})$ | ??                      | $\Theta^*(n^{-1.094...})$ | $\Theta^*(n^{-1})$ | $\sqrt{(n-1)/n}$

We next clarify the statements about the radii appearing in Table 9.1. They are
all computed with respect to the Hilbert–Schmidt Euclidean structure. Both inradii
and outradii are computed for Hilbert–Schmidt balls centered at the maximally
mixed state ρ˚ . This choice of a center is optimal: one may argue that the optimal
center can be chosen to be invariant under isometries of the convex set, and this
property characterizes ρ˚ (see Propositions 2.5 and 2.18, cf. Exercise 4.51 and its
hint). Statements referred to as trivial in Table 9.2 follow from (2.7).

Table 9.2. References for proofs of the results from Table 9.1.

K                                 | inrad(K)     | w(K°)^{-1}   | vrad(K), w(K) | outrad(K)
$\mathrm{D}(\mathbb{C}^n)$        | trivial      | use (1.26)   | Theorem 9.1   | trivial
$\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$ | Theorem 9.15 | Theorem 9.6  | Theorem 9.3   | trivial
$\mathrm{PPT}(\mathbb{C}^d \otimes \mathbb{C}^d)$ | trivial      | Theorem 9.13 | Theorem 9.13  | trivial
$\mathrm{Sep}((\mathbb{C}^2)^{\otimes k})$        | Theorem 9.21 | unknown      | Theorem 9.11  | trivial

Some arguments require us to consider the affine space $H_1$ of trace one Hermitian matrices as a vector space with $\rho_*$ as the origin. In order to emphasize this point of view we use a specialized notation: if $\rho \in H_1$ and $t \in \mathbb{R}$, then we write

(9.1)   $t \bullet \rho := t\rho + (1-t)\rho_*.$

If $K \subset H_1$, we denote $t \bullet K = \{t \bullet x : x \in K\}$. A similar caveat applies to polarity calculated inside the space $H_1$.

It is a remarkable fact that, despite sharing the same inradii and outradii, the sets Sep and D behave so differently with respect to volume radius. In particular, the proportion of states on $\mathbb{C}^d \otimes \mathbb{C}^d$ which are separable, when measured in terms of volume, is extremely small: of order $\exp(-cd^4\log d)$. We will return to such considerations in Chapter 10.

9.1. Volume and mean width estimates

In this section, we prove the volume radius and mean width estimates from Table 9.1. In particular, we compute (up to a logarithmic factor) the mean width of $\mathrm{Sep}^\circ$ (Theorem 9.6), which will play a crucial role in Chapter 10.


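9.1.1. Symmetrization. We heavily use the symmetrization operations defined in Section 4.1.2. Writing $K_{\mathrm{sym}} = \mathrm{conv}(K \cup -K)$ for the symmetrization of a set $K$ of states, recall that, on a multipartite Hilbert space $H = H_1 \otimes \cdots \otimes H_k$, we have

$\mathrm{Sep}_{\mathrm{sym}}(H) = \mathrm{D}_{\mathrm{sym}}(H_1) \hat{\otimes} \cdots \hat{\otimes} \mathrm{D}_{\mathrm{sym}}(H_k),$

and that $\mathrm{D}_{\mathrm{sym}}(H_i)$ is the unit ball for the space $(B^{sa}(H_i), \|\cdot\|_1)$.

The Rogers–Shephard inequality (Theorem 4.22) controls how much the volume changes after symmetrization. In our context (i.e., $H = \mathbb{C}^d \otimes \mathbb{C}^d \leftrightarrow \mathbb{C}^n$, $\dim \mathrm{D} = \dim \mathrm{Sep} = n^2 - 1$), it implies the inequalities

(9.2)   $\frac{2}{\sqrt{n}}\, \mathrm{vol}_{n^2-1}(\mathrm{D}) \ \le\ \mathrm{vol}_{n^2}(\mathrm{D}_{\mathrm{sym}}) \ \le\ \frac{2^{n^2}}{n^{5/2}}\, \mathrm{vol}_{n^2-1}(\mathrm{D}),$

(9.3)   $\frac{2}{\sqrt{n}}\, \mathrm{vol}_{n^2-1}(\mathrm{Sep}) \ \le\ \mathrm{vol}_{n^2}(\mathrm{Sep}_{\mathrm{sym}}) \ \le\ \frac{2^{n^2}}{n^{5/2}}\, \mathrm{vol}_{n^2-1}(\mathrm{Sep}).$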
9.1.2. The set of all quantum states.
Theorem 9.1. Let $\mathrm{D} = \mathrm{D}(\mathbb{C}^n)$ be the set of states on $\mathbb{C}^n$. The volume of D equals

(9.4)  $\mathrm{vol}(\mathrm{D}) = \sqrt{n}\,(2\pi)^{n(n-1)/2}\,\frac{\prod_{j=1}^{n} \Gamma(j)}{\Gamma(n^2)}$,

and satisfies the two-sided estimates

(9.5)  $\frac{1}{2\sqrt{n}} \le \mathrm{vrad}(\mathrm{D}) \le \frac{1}{\sqrt{n}}$.

The mean width of D satisfies the asymptotic estimate $w(\mathrm{D}) \sim 2/\sqrt{n}$ when $n \to \infty$. Moreover, the upper bound $w(\mathrm{D}) \le 2/\sqrt{n}$ holds for every dimension $n$.
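As a numerical sanity check of (9.4) and (9.5), the sketch below evaluates the volume formula in logarithmic form and compares the resulting volume radius with the two-sided bound. It assumes the usual definition of the volume radius (the radius of the Euclidean ball of the same volume, here in dimension $n^2 - 1$); the function names are illustrative.

```python
import math


def log_vol_states(n):
    """log of vol(D(C^n)) according to formula (9.4)."""
    return (0.5 * math.log(n)
            + 0.5 * n * (n - 1) * math.log(2 * math.pi)
            + sum(math.lgamma(j) for j in range(1, n + 1))
            - math.lgamma(n * n))


def log_vol_ball(m):
    """log of the volume of the Euclidean unit ball in R^m."""
    return 0.5 * m * math.log(math.pi) - math.lgamma(0.5 * m + 1)


def vrad_states(n):
    """Volume radius of D(C^n), a convex body of dimension m = n^2 - 1."""
    m = n * n - 1
    return math.exp((log_vol_states(n) - log_vol_ball(m)) / m)


for n in (2, 3, 5, 10, 50):
    # By (9.5), the rescaled quantity vrad * sqrt(n) should lie in [1/2, 1].
    print(n, vrad_states(n) * math.sqrt(n))
```

For n = 2 the set D is the Bloch ball of Hilbert–Schmidt radius $1/\sqrt{2}$, so the upper bound in (9.5) is attained there.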
Proof. We do not derive the exact value (9.4). From there, a tedious but routine calculation based on the Stirling formula gives the asymptotic behavior of $\mathrm{vrad}(\mathrm{D})$ in Table 9.1.
Alternatively, we present a “soft” way to prove (9.5). First, we know from the Santaló inequality (Theorem 4.17) that $\mathrm{vrad}(\mathrm{D})\,\mathrm{vrad}(\mathrm{D}^\circ) \le 1$. On the other hand, $\mathrm{D}^\circ = (-n) \bullet \mathrm{D}$ (see (1.26), recall that polarity is with respect to $\rho_*$), so that $\mathrm{vrad}(\mathrm{D}^\circ) = n\,\mathrm{vrad}(\mathrm{D})$. This gives the upper bound in (9.5).
For the lower bound, consider the symmetrization $\mathrm{D}_{\mathrm{sym}} = S_1^{n,\mathrm{sa}}$, the unit ball with respect to the trace norm. Since $\|\cdot\|_1 \le \sqrt{n}\,\|\cdot\|_{HS}$, the inradius of $\mathrm{D}_{\mathrm{sym}}$ equals $1/\sqrt{n}$ and therefore $\mathrm{vrad}(\mathrm{D}_{\mathrm{sym}}) \ge 1/\sqrt{n}$. We may now appeal to the Rogers–Shephard inequality (9.2) to obtain the lower bound $\mathrm{vrad}(\mathrm{D}) \ge \frac{1}{2\sqrt{n}}$ (this requires some numerical verification since the convex bodies D and $\mathrm{D}_{\mathrm{sym}}$ live in different dimensions, leading to different powers in the definition of the volume radii).
We now compute the Gaussian mean width of D. If $A_n$ is a $\mathrm{GUE}_0(n)$ random matrix, then

(9.6)  $w_G(\mathrm{D}) = \mathbf{E} \sup_{\rho \in \mathrm{D}} \mathrm{Tr}(A_n \rho) = \mathbf{E} \sup_{\psi \in H,\ |\psi| = 1} \mathrm{Tr}(A_n |\psi\rangle\langle\psi|) = \mathbf{E}\,\lambda_1(A_n)$

since $\mathrm{Tr}(B|\psi\rangle\langle\psi|) = \langle\psi|B|\psi\rangle$. Given that $w(\mathrm{D}) = \kappa_{n^2-1}^{-1} w_G(\mathrm{D})$, the asymptotic estimate follows from the facts that $\kappa_{n^2-1} \sim n$ and $\mathbf{E}\,\lambda_1(A_n) \sim 2\sqrt{n}$ (Theorem 6.23). To show that the inequality $w(\mathrm{D}) \le 2/\sqrt{n}$ holds in every dimension, we use the refined bounds from Proposition A.1(i) and from (6.37). □


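It is possible to give a more direct proof of the upper bound $w(\mathrm{D}) = O(1/\sqrt{n})$ using a discretization lemma, which we state for future reference (see Exercise 9.1).

Lemma 9.2. Let $H = \mathbb{C}^d$, and $\mathcal{N}$ be an $\alpha$-net in $(S_{\mathbb{C}^d}, g)$, with $\alpha < \pi/4$. Then

(9.7)  $\cos(2\alpha)\,\mathrm{D}_{\mathrm{sym}} \subset \mathrm{conv}\{\pm|\psi\rangle\langle\psi| : \psi \in \mathcal{N}\} \subset \mathrm{D}_{\mathrm{sym}}$.

Equivalently, $\mathcal{N}$ is an $\varepsilon$-net in $(S_{\mathbb{C}^d}, |\cdot|)$ for $\varepsilon = 2\sin(\alpha/2)$, and $\cos(2\alpha) = 1 - 2\varepsilon^2 + \varepsilon^4/2$.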
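The identity relating $\cos(2\alpha)$ to $\varepsilon$ follows from two standard trigonometric formulas; a short derivation, included here as a worked step and not taken from the text, reads:

```latex
\begin{align*}
\varepsilon^2 &= 4\sin^2(\alpha/2) = 2(1 - \cos\alpha)
  \quad\Longrightarrow\quad \cos\alpha = 1 - \tfrac{\varepsilon^2}{2}, \\
\cos(2\alpha) &= 2\cos^2\alpha - 1
  = 2\Bigl(1 - \tfrac{\varepsilon^2}{2}\Bigr)^{2} - 1
  = 1 - 2\varepsilon^2 + \tfrac{\varepsilon^4}{2}.
\end{align*}
```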
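Proof of Lemma 9.2. Set $P = \mathrm{conv}\{\pm|\psi\rangle\langle\psi| : \psi \in \mathcal{N}\}$. The inclusion $P \subset \mathrm{D}_{\mathrm{sym}}$ is trivial. Let us check the other inclusion through the corresponding dual (polar) norms

  $\|A\|_{(\mathrm{D}_{\mathrm{sym}})^\circ} = \max_{\varphi \in S_{\mathbb{C}^d}} |\langle\varphi|A|\varphi\rangle| = \|A\|_{op}$,

  $\|A\|_{P^\circ} = \max_{\psi \in \mathcal{N}} |\langle\psi|A|\psi\rangle|$.

We need to show that $\|A\|_{P^\circ} \ge \cos(2\alpha)\|A\|_{op}$ for every $A \in M_d^{\mathrm{sa}}$. We may assume by homogeneity and symmetry that $\|A\|_{op}$ and the largest eigenvalue of $A$ are both equal to 1. Let $\varphi \in \mathbb{C}^d$ be a unit vector such that $A\varphi = \varphi$. Choose $\psi \in \mathcal{N}$ verifying $g(\varphi, \psi) \le \alpha$. By adjusting the phase of $\varphi$ (i.e., replacing $\varphi$ with an appropriate element of $[\varphi]$), we may write $\psi = \cos(\beta)\varphi + \sin(\beta)\chi$ for a unit vector $\chi \perp \varphi$, and $0 \le \beta \le \alpha$. We have then (since $\langle\varphi|A|\chi\rangle = 0$ and $\langle\chi|A|\chi\rangle \ge -1$)

  $\langle\psi|A|\psi\rangle = \cos^2(\beta)\langle\varphi|A|\varphi\rangle + \sin^2(\beta)\langle\chi|A|\chi\rangle \ge \cos^2\beta - \sin^2\beta = \cos(2\beta) \ge \cos(2\alpha)$. □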

Exercise 9.1 (An easy upper bound on the mean width of D). Using Lemma 9.2, give an alternate proof of the relation

  $w(\mathrm{D}(\mathbb{C}^n)) = O(1/\sqrt{n})$.

Exercise 9.2. Show that Lemma 9.2 is sharp on $\mathbb{C}^2$, i.e., that $\cos(2\alpha)$ cannot be replaced by a larger number in (9.7).

9.1.3. The set of separable states (the bipartite case).

Theorem 9.3. If $\mathrm{Sep} = \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$, we have the two-sided estimates

(9.8)  $\frac{1}{6d^{3/2}} \le \mathrm{vrad}(\mathrm{Sep}) \le w(\mathrm{Sep}) \le \frac{4}{d^{3/2}}$.

The inequality $\mathrm{vrad}(\mathrm{Sep}) \le w(\mathrm{Sep})$ is the Urysohn inequality (Proposition 4.15). We first give an elementary argument showing that $w(\mathrm{Sep}) = O(d^{-3/2})$, and then prove separately the more precise bounds from (9.8).
Proof that $w(\mathrm{Sep}) = O(d^{-3/2})$. We proceed through a net argument. It is easier to work with the Gaussian mean width, and therefore we prove the equivalent statement $w_G(\mathrm{Sep}) = O(\sqrt{d})$. Since $w_G(\mathrm{Sep}) \le w_G(\mathrm{Sep}_{\mathrm{sym}})$, it is enough to give an upper bound on $w_G(\mathrm{Sep}_{\mathrm{sym}})$. Let $P$ be the polytope given by Lemma 9.4 below. Then

  $w_G(\mathrm{Sep}_{\mathrm{sym}}) \le 2 w_G(P) \le 2\sqrt{2\log(C^d)} = O(\sqrt{d})$,

where we used Proposition 6.3 (note that vertices of $P$ have Hilbert–Schmidt norm 1). □
Lemma 9.4. There is a constant $C > 0$ such that for every dimension $d$, there is a family $\mathcal{N}$ of product pure states on $H = \mathbb{C}^d \otimes \mathbb{C}^d$, with $\mathrm{card}\,\mathcal{N} \le C^d$ and such that, if we denote by $P$ the polytope $\mathrm{conv}\{\pm|\varphi \otimes \psi\rangle\langle\varphi \otimes \psi| : \varphi \otimes \psi \in \mathcal{N}\}$, we have

  $\frac{1}{2}\,\mathrm{Sep}_{\mathrm{sym}} \subset P \subset \mathrm{Sep}_{\mathrm{sym}}$.

The constant 1/2 appearing in Lemma 9.4 could be replaced by $1 - \varepsilon$ for any $\varepsilon > 0$, affecting only the value of $C$. Interestingly, the analogous statement for Sep (i.e., without symmetrization) is false, see Proposition 9.31.

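Proof. Let $\mathcal{M}$ be an $\alpha$-net in $(S_{\mathbb{C}^d}, g)$ and $P_0 = \mathrm{conv}\{\pm|\psi\rangle\langle\psi| : \psi \in \mathcal{M}\}$. We write $\mathrm{D}_{\mathrm{sym}}$ for $\mathrm{D}_{\mathrm{sym}}(\mathbb{C}^d)$ and $\mathrm{Sep}_{\mathrm{sym}}$ for $\mathrm{Sep}_{\mathrm{sym}}(\mathbb{C}^d \otimes \mathbb{C}^d)$. We know from Lemma 9.2 that

  $\cos(2\alpha)\,\mathrm{D}_{\mathrm{sym}} \subset P_0 \subset \mathrm{D}_{\mathrm{sym}}$.

Since $\mathrm{Sep}_{\mathrm{sym}} = \mathrm{D}_{\mathrm{sym}} \mathbin{\hat\otimes} \mathrm{D}_{\mathrm{sym}}$, it follows that

(9.9)  $\cos^2(2\alpha)\,\mathrm{Sep}_{\mathrm{sym}} \subset P_0 \mathbin{\hat\otimes} P_0 \subset \mathrm{Sep}_{\mathrm{sym}}$.

It remains to choose $\alpha = \pi/8$, so that $\cos^2(2\alpha) = 1/2$. We choose $\mathcal{N}$ to be the set $\{\varphi \otimes \psi : \varphi, \psi \in \mathcal{M}\}$, so that $P = P_0 \mathbin{\hat\otimes} P_0$. We bound the cardinality of $\mathcal{M}$ using Lemma 5.3, yielding $\mathrm{card}\,\mathcal{N} \le C^d$ for some absolute constant $C$. □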

Proof that $w(\mathrm{Sep}) \le 4d^{-3/2}$. We have $\mathrm{Sep} = \mathrm{D} \mathbin{\hat\otimes} \mathrm{D}$, where D means $\mathrm{D}(\mathbb{C}^d)$. We use the Chevet–Gordon inequality in the form of Exercise 6.49 to obtain that $w_G(\mathrm{Sep}) \le 2 w_G(\mathrm{D})$. The bound $w(\mathrm{D}) \le 2/\sqrt{d}$ from Theorem 9.1 implies only $w(\mathrm{Sep}) \le (4 + o(1))d^{-3/2}$. However, using the refined bound (6.37) (cf. the proof of Theorem 9.1), we can obtain

  $w(\mathrm{Sep}) = \frac{1}{\kappa_{d^4-1}}\, w_G(\mathrm{Sep}) \le \frac{4\sqrt{d} - 1.2\,d^{-1/6}}{\sqrt{d^4 - 1}} \le 4d^{-3/2}$. □

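Proof that $\mathrm{vrad}(\mathrm{Sep}) \ge \frac{1}{6} d^{-3/2}$. We first give a lower bound on $\mathrm{vrad}(\mathrm{Sep}_{\mathrm{sym}})$ by estimating from below the inradius of $\mathrm{Sep}_{\mathrm{sym}}$. We are going to compare $\mathrm{Sep}_{\mathrm{sym}}$ with a simpler convex body which we now define. Let $K \subset B(H)$ be the convex hull of rank one product operators (not necessarily self-adjoint!)

  $K := \mathrm{conv}\{|x_1 \otimes x_2\rangle\langle y_1 \otimes y_2| : x_1, y_1, x_2, y_2 \in B_{\mathbb{C}^d}\}$.

The convex body $K$ is most naturally seen as $S_1^d \mathbin{\hat\otimes} S_1^d$; it can also be identified with $(B_{\mathbb{C}^d})^{\hat\otimes 4}$ up to identification with dual space. The next lemma (whose proof we postpone for a moment) relates $K$ to $\mathrm{Sep}_{\mathrm{sym}}$.

Lemma 9.5. Let $H = \mathbb{C}^d \otimes \mathbb{C}^d$. Let $\pi : B(H) \to B^{\mathrm{sa}}(H)$ be the projection onto the self-adjoint part, $\pi(A) := \frac{1}{2}(A + A^\dagger)$. Then

  $\mathrm{Sep}_{\mathrm{sym}} \subset \pi(K) \subset 3\,\mathrm{Sep}_{\mathrm{sym}}$.

Lemma 9.5 implies that $\mathrm{inrad}(\mathrm{Sep}_{\mathrm{sym}}) \ge \frac{1}{3}\,\mathrm{inrad}(K)$. We also know from Lemma 8.26 that

  $\mathrm{inrad}(K) = \mathrm{inrad}\bigl((B_{\mathbb{C}^d})^{\hat\otimes 4}\bigr) \ge \frac{1}{d^{3/2}}$.

Therefore,

  $\mathrm{vrad}(\mathrm{Sep}_{\mathrm{sym}}) \ge \mathrm{inrad}(\mathrm{Sep}_{\mathrm{sym}}) \ge \frac{1}{3d^{3/2}}$.

We conclude using (9.3) that $\mathrm{vrad}(\mathrm{Sep}) \ge \frac{1}{6d^{3/2}}$. (As in the proof of Theorem 9.1, this requires a somewhat tedious verification due to the fact that Sep and $\mathrm{Sep}_{\mathrm{sym}}$ live in different dimensions.) □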

Proof of Lemma 9.5. The factor 3 appears as an upper bound on the geometric distance between the sets $\mathrm{D}_{\mathrm{sym}}$ and $\mathrm{Sep}_{\mathrm{sym}}$ corresponding to 2 qubits, i.e., the smallest positive number $\lambda$ such that $\mathrm{D}_{\mathrm{sym}}(\mathbb{C}^2 \otimes \mathbb{C}^2) \subset \lambda\,\mathrm{Sep}_{\mathrm{sym}}(\mathbb{C}^2 \otimes \mathbb{C}^2)$. The upper bound $\lambda \le 3$ follows from Proposition 9.17, or by noting that any state $\rho$ can be decomposed as

  $\rho = 2 \cdot \frac{\rho + \mathrm{I}/2}{3} - \frac{\mathrm{I} - \rho}{3}$,

where both $(\rho + \mathrm{I}/2)/3$ and $(\mathrm{I} - \rho)/3$ are separable states; separability can be checked, e.g., using the Peres criterion (see Theorem 2.15).
It is enough to show that extreme points of $\pi(K)$ are contained in $3\,\mathrm{Sep}_{\mathrm{sym}}$. Any extreme point $A$ of $\pi(K)$ can be written as

  $A = \frac{1}{2}\bigl(|x_1 \otimes x_2\rangle\langle y_1 \otimes y_2| + |y_1 \otimes y_2\rangle\langle x_1 \otimes x_2|\bigr)$.

It may appear at first sight that the above representation shows that $A$ is separable. However, while the two terms in the parentheses are indeed product operators, they are not self-adjoint and we can only conclude that $A \in \mathrm{D}_{\mathrm{sym}}(H)$ (as a self-adjoint operator whose trace norm is $\le 1$).
Let $H_i$ be the 2-dimensional subspace of $\mathbb{C}^d$ spanned by $x_i$ and $y_i$ (if the vectors are proportional, add any vector to get a 2-dimensional space) and let $H' := H_1 \otimes H_2$. Then $A$ can be considered as an operator on $H'$; more precisely, as an element of $\mathrm{D}_{\mathrm{sym}}(H')$ (and, conversely, any operator acting on $H'$ can be canonically lifted to one acting on $H$). Since $A$ belongs to $\mathrm{D}_{\mathrm{sym}}(H')$, it also belongs to $3\,\mathrm{Sep}_{\mathrm{sym}}(H')$, and thus to $3\,\mathrm{Sep}_{\mathrm{sym}}(H)$. □

9.1.4. The set of block-positive matrices. Let $\mathrm{Sep} = \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. In Theorem 9.3 we computed the order of magnitude of the mean width of Sep. We now focus on the dual quantity: the mean gauge of Sep, or the mean width of $\mathrm{Sep}^\circ$ (recall that polarity is taken with the maximally mixed state $\rho_* = \mathrm{I}/d^2$ being the origin).

Theorem 9.6. Let Sep be the set of separable states on $\mathbb{C}^d \otimes \mathbb{C}^d$. Then for some absolute constants $c, C$,

  $c\,d^{3/2} \le \mathrm{vrad}(\mathrm{Sep}^\circ) \le C d^{3/2}$,
  $c\,d^{3/2} \le w(\mathrm{Sep}^\circ) \le C d^{3/2} \log(d)$.

Since the cone $\mathcal{BP}$ of block-positive operators is dual to the cone $\mathcal{SEP}$ of separable operators (see Section 2.4), we obtain the following corollary.

Corollary 9.7. Let BP be the set of trace one block-positive operators on $\mathbb{C}^d \otimes \mathbb{C}^d$. Then, for some absolute constants $c, C$,

  $c\,d^{-1/2} \le \mathrm{vrad}(\mathrm{BP}) \le C d^{-1/2}$,
  $c\,d^{-1/2} \le w(\mathrm{BP}) \le C d^{-1/2} \log(d)$.

Proof. Since $\mathrm{BP} = -d^{-2}\,\mathrm{Sep}^\circ$ (see (2.47)), the derivation of Corollary 9.7 from Theorem 9.6 is immediate. □

The Santaló and reverse Santaló inequalities (Theorem 4.17) allow us to estimate $\mathrm{vrad}(\mathrm{Sep}^\circ)$ directly from $\mathrm{vrad}(\mathrm{Sep})$, so the first part of Theorem 9.6 follows from Theorem 9.3. However, the analogous result for the mean width, the $MM^*$-estimate (Theorem 7.10), is more demanding. Since we already know that $w(\mathrm{Sep}) = \Theta(d^{-3/2})$ (again from Theorem 9.3), the conclusion of Theorem 9.6 follows after we prove the $MM^*$-estimate (7.7) for the pair $(\mathrm{Sep}, \mathrm{Sep}^\circ)$, i.e.,

(9.10)  $w(\mathrm{Sep})\,w(\mathrm{Sep}^\circ) = O(\log d)$.

Recall that the lower bound $w(\mathrm{Sep})\,w(\mathrm{Sep}^\circ) \ge 1$ is elementary and holds for any pair of polar bodies (see Exercise 4.37). However, (9.10) does not follow immediately from the general theory: Theorem 7.10 is known to hold only for symmetric convex bodies which are in a specific position (the $\ell$-position). In our situation Sep is not symmetric and there is no reason to think that it is in the $\ell$-position.
The first step towards proving Theorem 9.6 is to introduce the following symmetrization of Sep

  $\mathrm{Sep}_\cap = -\mathrm{Sep} \cap \mathrm{Sep}$,

where $-\mathrm{Sep} = (-1) \bullet \mathrm{Sep}$, see (9.1). We check that the relevant geometric parameters are essentially unchanged by this symmetrization procedure.

Proposition 9.8. The convex bodies Sep and $\mathrm{Sep}_\cap$ have comparable volume radius, mean width and dual mean width, as shown by the following formulas, where $\mathrm{Sep}_\cap^\circ$ means $(\mathrm{Sep}_\cap)^\circ$:

(9.11)  $w(\mathrm{Sep}^\circ) \le w(\mathrm{Sep}_\cap^\circ) \le 2 w(\mathrm{Sep}^\circ)$,
(9.12)  $\frac{1}{2}\,\mathrm{vrad}(\mathrm{Sep}) \le \mathrm{vrad}(\mathrm{Sep}_\cap) \le \mathrm{vrad}(\mathrm{Sep})$,
(9.13)  $w(\mathrm{Sep}) \simeq w(\mathrm{Sep}_\cap) \simeq d^{-3/2}$.

Moreover, Sep and $\mathrm{Sep}_\cap$ have the same inradius, equal to $(d^2(d^2 - 1))^{-1/2}$. However, the outradius of $\mathrm{Sep}_\cap$ is bounded by $1/d$, while the outradius of Sep is of order 1.

Proof. We have, for any self-adjoint $A$ with zero trace,

  $\|\rho_* + A\|_{\mathrm{Sep}_\cap} = \max(\|\rho_* + A\|_{\mathrm{Sep}}, \|\rho_* - A\|_{\mathrm{Sep}}) \le \|\rho_* + A\|_{\mathrm{Sep}} + \|\rho_* - A\|_{\mathrm{Sep}}$.

When averaging $A$ over the Hilbert–Schmidt sphere, using the fact that $A$ and $-A$ have the same distribution, we obtain (9.11). Inequalities (9.12) follow from Proposition 4.18. For (9.13), we already know (cf. Theorem 9.3) that

  $\mathrm{vrad}(\mathrm{Sep}) \simeq w(\mathrm{Sep}) \simeq d^{-3/2}$.

We therefore have the following chain of inequalities: the first is trivial, the third is (9.12) and the last is Urysohn's inequality (Proposition 4.15)

  $w(\mathrm{Sep}_\cap) \le w(\mathrm{Sep}) \simeq \mathrm{vrad}(\mathrm{Sep}) \simeq \mathrm{vrad}(\mathrm{Sep}_\cap) \le w(\mathrm{Sep}_\cap)$.

Therefore all these quantities are comparable, and (9.13) follows.
The statement about the inradius is trivial. On the other hand, any matrix $A$ such that $\rho_* + A \in \mathrm{Sep}$ satisfies $A \ge -\mathrm{I}/d^2$. Consequently, any $A$ such that $\rho_* + A \in \mathrm{Sep}_\cap$ satisfies $-\mathrm{I}/d^2 \le A \le \mathrm{I}/d^2$, or $\|A\|_\infty \le 1/d^2$. It follows that the outradius of $\mathrm{Sep}_\cap$, which is measured with respect to the Hilbert–Schmidt norm, is bounded by $1/d$. □
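The last step of the proof compares the operator norm with the Hilbert–Schmidt norm on a space of dimension $d^2$. Spelled out (a worked step added for convenience, using only the standard inequality $\|A\|_{HS} \le \sqrt{\dim}\,\|A\|_\infty$):

```latex
\[
\|A\|_{HS} \;\le\; \sqrt{\dim(\mathbb{C}^d \otimes \mathbb{C}^d)}\;\|A\|_\infty
\;=\; d\,\|A\|_\infty \;\le\; d \cdot \frac{1}{d^{2}} \;=\; \frac{1}{d}.
\]
```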
We are now going to prove that the $MM^*$-estimate holds for $\mathrm{Sep}_\cap$.

Proposition 9.9. There is an absolute constant $C$ such that

(9.14)  $d^4 \sim \kappa_{d^4-1}^2 \le w_G(\mathrm{Sep}_\cap)\,w_G(\mathrm{Sep}_\cap^\circ) \le C d^4 \log d$.

It is now easy to deduce Theorem 9.6. Indeed, using the relation (4.32) between spherical and Gaussian widths, Proposition 9.9 implies that $w(\mathrm{Sep}_\cap)\,w(\mathrm{Sep}_\cap^\circ) = O(\log d)$, and (9.10) follows from (9.11) and (9.13) of Proposition 9.8.
Proof of Proposition 9.9. Denote $K = \mathrm{Sep}_\cap - \rho_*$, so that $K$ is a symmetric convex body in the space $H_0$ of self-adjoint trace zero operators on $\mathbb{C}^d \otimes \mathbb{C}^d$. The lower bound in (9.14) is a reformulation of the inequality $w(K)\,w(K^\circ) \ge 1$, which is elementary (see Exercise 4.37). Using the $\ell$-norms introduced in Section 7.1.1 (especially Proposition 7.1(iii)), we may reformulate (9.14) as

(9.15)  $d^4 \lesssim \ell_K(\mathrm{I}_{H_0})\,\ell_{K^\circ}(\mathrm{I}_{H_0}) \le C d^4 \log d$.

To prove the upper bound in (9.15), let $T : H_0 \to H_0$ be a linear map such that $TK$ is in the $\ell$-position. We will take advantage of the symmetries of $K$. The set Sep (hence also $K$) is invariant under local unitaries, and the decomposition of $H_0$ into irreducible subspaces is (see Lemma 2.19) $H_0 = E \oplus F_1 \oplus F_2$, where

  $E = \mathrm{span}\{\sigma_1 \otimes \sigma_2 : \mathrm{Tr}\,\sigma_1 = \mathrm{Tr}\,\sigma_2 = 0\}$,
  $F_1 = \mathrm{span}\{\sigma_1 \otimes \mathrm{I} : \mathrm{Tr}\,\sigma_1 = 0\}$,
  $F_2 = \mathrm{span}\{\mathrm{I} \otimes \sigma_2 : \mathrm{Tr}\,\sigma_2 = 0\}$.

By Proposition 4.8, we may assume that $T = \alpha P_E + \lambda_1 P_{F_1} + \lambda_2 P_{F_2}$ for some positive numbers $\alpha, \lambda_1, \lambda_2$. We may also assume $\alpha = 1$ without loss of generality. The ideal property of the $\ell$-norm (Proposition 7.1(ii)) implies that

  $\ell_K(P_E) = \ell_K(T P_E) \le \ell_K(T)$,

and similarly for $\ell_{K^\circ}(P_E)$. By the $MM^*$-estimate (Theorem 7.10), we know that

  $\ell_K(T)\,\ell_{K^\circ}(T^{-1}) = O(d^4 \log d)$.

Noting that $T^{-1} = P_E + \lambda_1^{-1} P_{F_1} + \lambda_2^{-1} P_{F_2}$, it follows that

(9.16)  $\ell_K(P_E)\,\ell_{K^\circ}(P_E) = O(d^4 \log d)$.

The $\ell$-norms of the projections $P_{F_1}, P_{F_2}$ can be upper-bounded in a rather straightforward fashion, mostly due to the fact that their ranks are relatively small. We have

Lemma 9.10. Let $F = F_1 \oplus F_2$. Then $\ell_K(P_F) = O(d^3)$ and $\ell_{K^\circ}(P_F) = O(1)$.

We now postpone the proof of Lemma 9.10 and show how it allows us to complete the proof of Proposition 9.9. To that end, we compare the estimates from Lemma 9.10 with the bounds $\ell_{K^\circ}(\mathrm{I}_{H_0}) \simeq \sqrt{d}$ (a reformulation of Theorem 9.3) and $\ell_K(\mathrm{I}_{H_0}) \gtrsim d^{7/2}$ (which follows from the already proved lower bound in (9.15)). For $L = K$ or $L = K^\circ$, we have therefore $\ell_L(P_F) \le \frac{1}{2}\ell_L(\mathrm{I}_{H_0})$ for $d$ large enough. Using the triangle inequality $\ell_L(\mathrm{I}_{H_0}) \le \ell_L(P_E) + \ell_L(P_F)$, it follows that $\ell_L(\mathrm{I}_{H_0}) \le 2\ell_L(P_E)$ for $d$ large enough. Combined with (9.16), this gives the upper bound in (9.15), as needed. □

Proof of Lemma 9.10. We have $\dim F = 2(d^2 - 1)$. We use Proposition 7.1(v) and the estimates on the inradius and the outradius of $\mathrm{Sep}_\cap$ from Proposition 9.8 to deduce the following inequalities (recall that $\kappa_n$ is of order $\sqrt{n}$, see Proposition A.1)

  $\ell_K(P_F) = w_G((K \cap F)^\circ) \le d^2 \kappa_{\dim F} \lesssim d^3$,
  $\ell_{K^\circ}(P_F) = w_G(P_F K) \le d^{-1} \kappa_{\dim F} \lesssim 1$. □

9.1.5. The set of separable states (multipartite case). We first note that an iteration of the arguments from the bipartite case (Theorem 9.3) can be used to show the following estimates (where the constants $c_k, C_k$ depend a priori on $k$), for $H = \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_k}$:

  $c_k\,\frac{\max(\sqrt{d_1}, \dots, \sqrt{d_k})}{d_1 \cdots d_k} \le \mathrm{vrad}(\mathrm{Sep}) \le w(\mathrm{Sep}) \le C_k\,\frac{\sqrt{d_1} + \cdots + \sqrt{d_k}}{d_1 \cdots d_k}$.

These estimates are reasonably sharp as long as $k$ remains bounded (few subsystems, each of them being possibly large), but deteriorate very quickly once $k$ grows. However, it is also possible to obtain fairly sharp bounds valid for large values of $k$ (many small subsystems). For simplicity, we first consider the case of $k$ qubits.

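Theorem 9.11. Let $k \ge 1$, $n = 2^k$, and $H = (\mathbb{C}^2)^{\otimes k}$. Then

(9.17)  $\frac{c\sqrt{\log n \,\log\log n}}{n} \le w(\mathrm{Sep}) \le \frac{C\sqrt{\log n \,\log\log n}}{n}$

and

(9.18)  $\frac{c}{n^{1+\alpha}} \le \mathrm{vrad}(\mathrm{Sep}) \le \frac{C\sqrt{\log n \,\log\log n}}{n^{1+\alpha}}$,

where $c, C$ are absolute constants, and $\alpha = \frac{1}{8}\log_2(27/16) \approx 0.094$.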

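Proof of Theorem 9.11. We write $\mathrm{D} = \mathrm{D}(\mathbb{C}^2)$ and $\mathrm{Sep} = \mathrm{Sep}(H)$. Since $\mathrm{Sep}_{\mathrm{sym}} = \mathrm{D}_{\mathrm{sym}}^{\hat\otimes k}$, it follows from Lemma 9.2 that, if $\mathcal{N}$ is an $\varepsilon$-net in $(S_{\mathbb{C}^2}, g)$, then

  $\cos(2\varepsilon)^k\,\mathrm{Sep}_{\mathrm{sym}} \subset P \subset \mathrm{Sep}_{\mathrm{sym}}$,

where

  $P := \mathrm{conv}\{\pm|\psi_1 \otimes \cdots \otimes \psi_k\rangle\langle\psi_1 \otimes \cdots \otimes \psi_k| : \psi_1, \dots, \psi_k \in \mathcal{N}\}$.

We choose $\varepsilon$ such that $\cos(2\varepsilon)^k = 1/2$, i.e., $\varepsilon \simeq 1/\sqrt{k}$. The polytope $P$ is contained in the Hilbert–Schmidt unit ball, and (using Lemma 5.3) can be chosen with at most $2(\mathrm{card}\,\mathcal{N})^k \le \exp(Ck \log k)$ vertices. The first idea would be to apply directly Proposition 6.3. This approach yields the bound

  $\mathrm{vrad}(\mathrm{Sep}) \le w(\mathrm{Sep}) \le w(\mathrm{Sep}_{\mathrm{sym}}) \le \frac{C\sqrt{\log n \,\log\log n}}{n}$,

which is the upper bound in (9.17). For the lower bound in (9.17), see Exercise 9.3.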

The reason for the extra factor $n^\alpha$ in (9.18) comes from the fact that the Hilbert–Schmidt Euclidean structure is not the most adapted to the present problem. When we apply Proposition 6.3 in the Euclidean structure induced by some ellipsoid $\mathcal{E}$, we actually obtain the following result: if $P$ is a polytope with $v$ vertices contained in an ellipsoid $\mathcal{E} \subset \mathbb{R}^N$, we have

(9.19)  $\left(\frac{\mathrm{vol}\,P}{\mathrm{vol}\,\mathcal{E}}\right)^{1/N} \le \sqrt{\frac{2\log v}{N}}$.

In this inequality, for a fixed polytope $P$, the best choice of ellipsoid is given by the Löwner ellipsoid of $P$. Accordingly, we are going to consider the Löwner ellipsoid associated to the set $\mathrm{Sep}_{\mathrm{sym}}$. By Lemma 4.9, we have

(9.20)  $\mathrm{Löw}(\mathrm{Sep}_{\mathrm{sym}}) = \mathrm{Löw}(\mathrm{D}_{\mathrm{sym}})^{\otimes_2 k}$.

The set $\mathrm{D}_{\mathrm{sym}}$ is a cylinder. To compute its Löwner ellipsoid, we use Lemma 4.3 with $n = 3$, $h = 1/\sqrt{2}$, $a = 0$ and $S = \mathrm{I}/\sqrt{2} \in M_2$. It follows that $\mathrm{Löw}(\mathrm{D}_{\mathrm{sym}}) = T(B_{HS})$, where $B_{HS}$ denotes the Hilbert–Schmidt unit ball in $M_2^{\mathrm{sa}}$ and $T$ is the matrix $\mathrm{diag}(\sqrt{2}, \sqrt{2/3}, \sqrt{2/3}, \sqrt{2/3})$ in the basis of Pauli matrices (2.3). Consequently,

  $\frac{\mathrm{vol}\,\mathrm{Löw}(\mathrm{D}_{\mathrm{sym}})}{\mathrm{vol}\,B_{HS}} = \det T = \sqrt{\frac{16}{27}}$

or, equivalently, $\mathrm{vrad}(\mathrm{Löw}(\mathrm{D}_{\mathrm{sym}})) = (16/27)^{1/8}$. From the formula

  $\mathrm{vrad}(\mathrm{Löw}(\mathrm{Sep}_{\mathrm{sym}})) = \mathrm{vrad}(\mathrm{Löw}(\mathrm{D}_{\mathrm{sym}}))^k$

(which follows from (9.20), see Exercise 4.32), we conclude that

  $\mathrm{vrad}\,\mathrm{Löw}(\mathrm{Sep}_{\mathrm{sym}}) = (16/27)^{k/8} = n^{-\alpha}$

with $\alpha = \frac{1}{8}\log_2(27/16)$. If we use the (inner product induced by the) Löwner ellipsoid of $\mathrm{Sep}_{\mathrm{sym}}$ as the reference Euclidean structure to apply (9.19), we obtain the upper bound

  $\mathrm{vrad}(\mathrm{Sep}_{\mathrm{sym}}) \le C\,\frac{\sqrt{k \log k}}{n}\,\mathrm{vrad}(\mathrm{Löw}(\mathrm{Sep}_{\mathrm{sym}})) = C\,\frac{\sqrt{\log n \,\log\log n}}{n^{1+\alpha}}$.
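The arithmetic behind the exponent $\alpha$ can be checked directly. The sketch below takes the diagonal matrix $T$ quoted in the proof as given, and uses the fact that the ambient space $M_2^{\mathrm{sa}}$ has real dimension 4, so that the volume radius of $T(B_{HS})$ is $(\det T)^{1/4}$; variable and function names are illustrative.

```python
import math

# Semi-axes of the Loewner ellipsoid of the symmetrized qubit state space,
# in the (normalized) Pauli basis, as quoted in the proof.
T_diag = [math.sqrt(2), math.sqrt(2 / 3), math.sqrt(2 / 3), math.sqrt(2 / 3)]

det_T = math.prod(T_diag)        # volume ratio vol Loew / vol B_HS
vrad_low = det_T ** (1 / 4)      # M_2^sa has real dimension 4
alpha = math.log2(27 / 16) / 8


def vrad_low_sep(k):
    """Volume radius of the Loewner ellipsoid for k qubits (product formula)."""
    return vrad_low ** k


print(det_T, math.sqrt(16 / 27))   # the two values should agree
print(alpha)                       # approximately 0.094
for k in (1, 5, 10):
    n = 2 ** k
    print(k, vrad_low_sep(k), n ** (-alpha))
```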
To show the lower bound in (9.18), we use the fact (see Exercise 4.20) that for every symmetric convex body $K \subset \mathbb{R}^N$, the inclusion $K \supset \frac{1}{\sqrt{N}}\,\mathrm{Löw}(K)$ holds. We apply this for $K = \mathrm{Sep}_{\mathrm{sym}}$ (so that $N = n^2$) to conclude that

  $\mathrm{vrad}(\mathrm{Sep}_{\mathrm{sym}}) \ge \frac{1}{n}\,\mathrm{vrad}(\mathrm{Löw}(\mathrm{Sep}_{\mathrm{sym}})) = \frac{1}{n^{1+\alpha}}$.

Finally, an application of the Rogers–Shephard inequality (9.3) shows that $\mathrm{vrad}(\mathrm{Sep})$ and $\mathrm{vrad}(\mathrm{Sep}_{\mathrm{sym}})$ are of the same order. □
A similar argument allows us to estimate the size of the set of separable states on $k$ “qudits”, i.e., on $(\mathbb{C}^d)^{\otimes k}$.

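Theorem 9.12 (see Exercise 9.5). Let $d \ge 2$, $k \ge 1$, $n = d^k$, and $H = (\mathbb{C}^d)^{\otimes k}$. Then

(9.21)  $\frac{c_d\sqrt{\log n \,\log\log n}}{n} \le w(\mathrm{Sep}) \le \frac{C_d\sqrt{\log n \,\log\log n}}{n}$

and

(9.22)  $\frac{c_d}{n^{1+\alpha_d}} \le \mathrm{vrad}(\mathrm{Sep}) \le \frac{C_d\sqrt{\log n \,\log\log n}}{n^{1+\alpha_d}}$,

where $\alpha_d = \frac{1}{2}\log_d\bigl(1 + \frac{1}{d}\bigr) - \frac{1}{2d^2}\log_d(d + 1)$.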
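A quick consistency check between the two theorems: for $d = 2$ the exponent $\alpha_d$ should reduce to the qubit exponent $\alpha = \frac{1}{8}\log_2(27/16)$ of Theorem 9.11. The sketch below evaluates both; the function name `alpha_d` is illustrative.

```python
import math


def alpha_d(d):
    """Exponent alpha_d from Theorem 9.12."""
    return 0.5 * math.log(1 + 1 / d, d) - math.log(d + 1, d) / (2 * d * d)


alpha_qubit = math.log2(27 / 16) / 8   # exponent from Theorem 9.11

print(alpha_d(2), alpha_qubit)         # the two values should agree
for d in (2, 3, 4, 10, 100):
    print(d, alpha_d(d))
```

The printed values decrease with $d$, which suggests that the correction to the exponent matters most for small local dimension.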

Exercise 9.3 (Lower bound on the mean width of Sep). Show that, for some constant $c > 0$, $\mathrm{Sep}\bigl((\mathbb{C}^2)^{\otimes k}\bigr)$ contains $k^{ck}$ elements which are $c$-separated with respect to the Hilbert–Schmidt distance. Then, use the Sudakov minoration (Proposition 6.10) to show the lower bound in (9.17).

Exercise 9.4 (Löwner ellipsoid and the Killing form). Check that the Löwner ellipsoid of $\mathrm{D}_{\mathrm{sym}}(\mathbb{C}^2)$ induces on $M_2^{\mathrm{sa}}$ the inner product

  $\langle u, v\rangle_L = \frac{3}{2}\,\mathrm{Tr}(uv) - \frac{1}{2}\,\mathrm{Tr}(u)\,\mathrm{Tr}(v)$.

Exercise 9.5 (The size of Sep for $k$ qudits). Complete the proof of Theorem 9.12.

9.1.6. The set of PPT states. We present estimates for the volume and mean width of PPT. For asymptotic versions improving some of the constants, see Exercise 9.6.

Theorem 9.13 (Volume and mean width of PPT). For $H = \mathbb{C}^d \otimes \mathbb{C}^d$, we have

  $\frac{1}{4d} \le w(\mathrm{PPT}^\circ)^{-1} \le \mathrm{vrad}(\mathrm{PPT}) \le w(\mathrm{PPT}) \le \frac{2}{d}$.

Proof. The upper bound on the mean width follows from the obvious inequality $w(\mathrm{PPT}) \le w(\mathrm{D})$ and from the bound $w(\mathrm{D}) \le 2/d$ (Theorem 9.1). To prove the lower bound, we use the dual Urysohn inequality (Proposition 4.16), where polarity is taken with respect to $\rho_*$

  $\mathrm{vrad}(\mathrm{PPT}) \ge \frac{1}{w(\mathrm{PPT}^\circ)}$.

If $\Gamma$ denotes the partial transposition on $H$, then $\mathrm{PPT} = \mathrm{D} \cap \Gamma(\mathrm{D})$ and therefore

(9.23)  $\mathrm{PPT}^\circ = \mathrm{conv}(\mathrm{D}^\circ \cup \Gamma(\mathrm{D})^\circ) \subset \mathrm{D}^\circ + \Gamma(\mathrm{D})^\circ$.

Geometrically, the transformation $\Gamma$ is an isometry with respect to the Hilbert–Schmidt norm (cf. Exercise 2.22; the argument we present actually works for any Hilbert–Schmidt isometry). Using the fact that $\mathrm{D}^\circ = -d^2\,\mathrm{D}$ and the upper bound from Theorem 9.1, we obtain

  $w(\mathrm{PPT}^\circ) \le w(\mathrm{D}^\circ) + w(\Gamma(\mathrm{D})^\circ) \le 2w(\mathrm{D}^\circ) = 2d^2 w(\mathrm{D}) \le 4d$. □

It follows from Theorem 9.13 that D and PPT have comparable volume radii, up to an absolute constant. An interesting question is whether this constant approaches 1 as the dimension increases.

Problem 9.14. Is there an absolute constant $c < 1$ such that, for every $d \ge 3$,

  $\mathrm{vrad}(\mathrm{PPT}(\mathbb{C}^d \otimes \mathbb{C}^d)) \le c\,\mathrm{vrad}(\mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d))$?

Exercise 9.6 (Sharper asymptotic bounds on the size of PPT). Prove that $w(\mathrm{PPT}^\circ) \le (2 + o(1))d$ and conclude that

  $w(\mathrm{PPT}) \ge \mathrm{vrad}(\mathrm{PPT}) \ge \frac{1 - o(1)}{2d}$.

Exercise 9.7 (Volume radius of PPT as a large deviation problem). Show that Problem 9.14 can be reformulated as follows: does there exist a constant $c > 0$ such that, if $B$ is a $d^2 \times d^2$ matrix with independent $N_{\mathbb{C}}(0, 1)$ entries, then

(9.24)  $\mathbf{P}\bigl((BB^\dagger)^\Gamma \text{ is positive}\bigr) \le \exp(-cd^4)$?

This recasts the problem as a large deviation estimate for some random matrix ensemble. Note that the same ensemble appears in Theorem 6.30, which asserts that it is asymptotically semicircular with appropriate parameters. It is worthwhile pointing out that bounds in the spirit of (9.24) hold for the GUE ensembles and Wishart ensembles, see [BAG97, HP98] and [AGZ10].

9.2. Distance estimates


In this section we gather known estimates for the geometric distance (defined in (4.1)) between D, Sep and the Hilbert–Schmidt ball $B_{HS}$. Computing the distance to the Hilbert–Schmidt ball is equivalent to computing the inradius and outradius. In particular, it follows from the results of Table 9.1 that for $H = \mathbb{C}^d \otimes \mathbb{C}^d$,

(9.25)  $d_g(\mathrm{D}, B_{HS}) = d_g(\mathrm{Sep}, B_{HS}) = d^2 - 1$.
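The value in (9.25) can be recovered from the radii quoted elsewhere in this chapter. The sketch below assumes that the geometric distance to a ball centered at $\rho_*$ is the ratio of outradius to inradius, and uses the outradius $\sqrt{(n-1)/n}$ and the inradius $1/\sqrt{n(n-1)}$ with $n = d^2$; the function names are illustrative.

```python
import math


def outradius(n):
    """Outradius of D(C^n), measured from rho_* in Hilbert-Schmidt norm."""
    return math.sqrt((n - 1) / n)


def inradius(n):
    """Inradius of D(C^n); in the bipartite case Sep has the same inradius."""
    return 1 / math.sqrt(n * (n - 1))


def distance_to_hs_ball(d):
    """Ratio outradius / inradius for states on C^d (x) C^d, where n = d^2."""
    n = d * d
    return outradius(n) / inradius(n)


for d in (2, 3, 10):
    print(d, distance_to_hs_ball(d), d * d - 1)
```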

9.2.1. The Gurvits–Barnum theorem. A remarkable fact, which is implicit in (9.25) above, is that, in the bipartite case, not only the outradii, but also the inradii of Sep and D are the same.

Theorem 9.15. Let $H = \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$, $n = d_1 d_2$ and $\rho$ be a state on $H$ such that

  $\left\|\rho - \frac{\mathrm{I}}{n}\right\|_{HS} \le \frac{1}{\sqrt{n(n-1)}}$.

Then $\rho$ is separable.

An elementary geometric argument shows that Theorem 9.15 is equivalent to the following statement: if $A \in B^{\mathrm{sa}}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$ satisfies $\|A\|_{HS} \le 1$, then $\mathrm{I} + A \in \mathcal{SEP}$.

Proof. Let $K \subset \mathrm{D}(H)$ be the set of states $\rho$ such that $\|\rho - \mathrm{I}/n\|_{HS} \le 1/\sqrt{n(n-1)}$ and $\mathcal{C} = \mathbb{R}_+ K$ be the cone generated by $K$. The assertion of Theorem 9.15 is equivalent to the cone inclusion $\mathcal{C} \subset \mathcal{SEP}$. By cone duality (see Section 1.2.1), this is also equivalent to $\mathcal{SEP}^* \subset \mathcal{C}^*$. Recall that $\mathcal{SEP}^*$ is the cone of block-positive operators, see (2.46).
Let $M \in B^{\mathrm{sa}}(H)$. One checks that

  $M \in \mathcal{C} \iff \|M\|_{HS} \le \frac{1}{\sqrt{n-1}}\,\mathrm{Tr}\,M$.

It follows (see Exercise 1.31) that

  $M \in \mathcal{C}^* \iff \|M\|_{HS} \le \mathrm{Tr}\,M$.

We thus reduced the proof of Theorem 9.15 to the following problem: for a block-positive matrix $M \in B^{\mathrm{sa}}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$, prove that $\mathrm{Tr}\,M^2 \le (\mathrm{Tr}\,M)^2$. We will need the following lemma.

Lemma 9.16. Let $M = \begin{pmatrix} A & B \\ B^\dagger & C \end{pmatrix}$ be a block-positive operator in $B^{\mathrm{sa}}(\mathbb{C}^{d_1} \otimes \mathbb{C}^2)$. Then

  $\|B\|_2^2 \le \|A\|_1 \|C\|_1$.

Assuming the Lemma, we can complete the proof of the Theorem. Denote by $M_{kl} \in B(\mathbb{C}^{d_1})$ the blocks of $M$. For $k, l \in \{1, \dots, d_2\}$, we then have

  $\|M_{kl}\|_2^2 \le \|M_{kk}\|_1 \|M_{ll}\|_1$

(if $k = l$ this is obvious; if $k \ne l$ this is the content of the Lemma). Noting that the diagonal blocks $M_{kk}$ are positive semi-definite and summing over $k, l$ gives the needed inequality $\|M\|_{HS}^2 \le (\mathrm{Tr}\,M)^2$. □

Proof of Lemma 9.16. Let $B = HU$ be the polar decomposition of $B$, with $H$ positive and $U$ unitary. We may choose an orthonormal basis in $\mathbb{C}^{d_1}$ which makes $U$ diagonal. From the inequalities $|B_{ii}|^2 \le A_{ii} C_{ii}$ and $|H_{ij}|^2 \le H_{ii} H_{jj}$, we get

  $|B_{ij}|^2 = |H_{ij}|^2 \le H_{ii} H_{jj} = |B_{ii} B_{jj}| \le \sqrt{A_{ii} C_{ii} A_{jj} C_{jj}} \le \frac{1}{2}(A_{ii} C_{jj} + A_{jj} C_{ii})$.

Summing over $i, j$ proves the Lemma. □

Exercise 9.8 (Another proof of the Gurvits–Barnum theorem). Here is an alternative argument for Theorem 9.15.
(i) Show that for any operators $A_{ij} \in M_{d_1}$,

  $\left\|\sum_{i,j=1}^{d_2} A_{ij} \otimes |i\rangle\langle j|\right\|_{op}^2 \le \sum_{i,j=1}^{d_2} \|A_{ij}\|_{op}^2$.

(ii) Use (i), Theorem 2.34 and Exercise 2.30 to give an alternate proof of Theorem 9.15.
9.2.2. Robustness in the bipartite case. We now compute the geometric distance between D and Sep in the bipartite case.

Proposition 9.17. Let $H = \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ for $d_1, d_2 \ge 2$, and denote $n = d_1 d_2$. We have

  $d_g(\mathrm{D}, \mathrm{Sep}) = d_g(\mathrm{D}, \mathrm{PPT}) = \frac{n}{2} + 1$.

An equivalent way to describe the geometric distance is to define the robustness of a state $\rho$ as follows (the notation $\bullet$ was defined in (9.1))

(9.26)  $R(\rho) = \inf\left\{ s \ge 0 : \frac{1}{1+s} \bullet \rho \in \mathrm{Sep} \right\}$.

Proposition 9.17 asserts that the maximal robustness of a state on $\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ equals $n/2$. Since $\mathrm{Sep} \subset \mathrm{PPT} \subset \mathrm{D}$, it suffices to prove that $d_g(\mathrm{D}, \mathrm{PPT}) \ge \frac{n}{2} + 1$ and $d_g(\mathrm{D}, \mathrm{Sep}) \le \frac{n}{2} + 1$.

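Proof that $d_g(\mathrm{D}, \mathrm{PPT}) \ge \frac{n}{2} + 1$. Let $\chi = \frac{1}{\sqrt{2}}(|1\rangle \otimes |1\rangle + |2\rangle \otimes |2\rangle) \in \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ and $\rho = |\chi\rangle\langle\chi|$. Now for $0 < t < 1$, consider the state $\rho_t = t \bullet \rho$. The non-zero eigenvalues of $\rho^\Gamma$ are $(1/2, 1/2, 1/2, -1/2)$. It follows that $\rho_t$ is not PPT whenever $-t/2 + (1 - t)/n < 0$, or equivalently $t > 2/(2 + n)$. Therefore $d_g(\mathrm{D}, \mathrm{PPT}) \ge \frac{n}{2} + 1$. □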
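The eigenvalue claim in the lower-bound argument can be verified by hand in the two-qubit case. The following sketch (standard library only; all function names are illustrative) builds $\rho = |\chi\rangle\langle\chi|$, takes the partial transpose on the second factor, and checks that $(\rho^\Gamma)^2 = \mathrm{I}/4$ with trace 1, which forces the spectrum $(1/2, 1/2, 1/2, -1/2)$. It also encodes the threshold $t = 2/(2+n)$ at which the smallest eigenvalue of $(t \bullet \rho)^\Gamma$ changes sign.

```python
import math


def partial_transpose_2x2(rho):
    """Partial transpose on the second qubit; |i>|j> has index 2*i + j."""
    out = [[0.0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    out[2 * i + j][2 * k + l] = rho[2 * i + l][2 * k + j]
    return out


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# rho = |chi><chi| with chi = (|00> + |11>)/sqrt(2).
chi = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
rho = [[x * y for y in chi] for x in chi]
rho_gamma = partial_transpose_2x2(rho)

# (rho^Gamma)^2 = I/4 forces eigenvalues +-1/2, and trace 1 then
# gives the multiplicities (1/2, 1/2, 1/2, -1/2).
square = matmul(rho_gamma, rho_gamma)
trace = sum(rho_gamma[i][i] for i in range(4))


def min_eig_after_mixing(t, n):
    """Smallest eigenvalue of (t . rho)^Gamma on a space of dimension n."""
    return -t / 2 + (1 - t) / n


def ppt_threshold(n):
    """Value of t above which t . rho fails to be PPT."""
    return 2 / (2 + n)
```

For two qubits ($n = 4$) the threshold is $t = 1/3$, matching $d_g = n/2 + 1 = 3$.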

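Proof that $d_g(\mathrm{D}, \mathrm{Sep}) \le \frac{n}{2} + 1$. We have to show that for any state $\rho$, the state $t_0 \bullet \rho$ is separable when $t_0 := 2/(2 + n)$. By convexity, we may assume that $\rho$ is a pure state $|\chi\rangle\langle\chi|$. Consider the Schmidt decomposition of $\chi$

  $\chi = \sum_{j=1}^{d} \lambda_j\,\varphi_j \otimes \psi_j$,

for some $d \le \min(d_1, d_2)$ and orthonormal bases $(\varphi_j)$ in $\mathbb{C}^{d_1}$ and $(\psi_j)$ in $\mathbb{C}^{d_2}$. Let $\theta = (\theta_1, \dots, \theta_d)$ be a $d$-tuple of complex numbers with modulus one. Consider the vectors

  $\varphi(\theta) = \sum_{j=1}^{d} \sqrt{\lambda_j}\,\theta_j \varphi_j \in \mathbb{C}^{d_1}$,

  $\psi(\theta) = \sum_{j=1}^{d} \sqrt{\lambda_j}\,\theta_j \psi_j \in \mathbb{C}^{d_2}$.

We compute $\mathbf{E}\,|\varphi(\theta) \otimes \psi(\bar\theta)\rangle\langle\varphi(\theta) \otimes \psi(\bar\theta)|$, where $\theta_1, \dots, \theta_d$ are independent and uniformly distributed on the unit circle, and $\bar\theta$ denotes the coordinatewise complex conjugate of $\theta$. The resulting operator $B$, which belongs to the separable cone $\mathcal{SEP}$ by construction, equals

  $B = \sum_{j,k,l,m=1}^{d} \sqrt{\lambda_j \lambda_k \lambda_l \lambda_m}\;\mathbf{E}\bigl[\theta_j \bar\theta_k \theta_l \bar\theta_m\bigr]\;|\varphi_j \otimes \psi_k\rangle\langle\varphi_m \otimes \psi_l|$.

The quantity $\mathbf{E}[\theta_j \bar\theta_k \theta_l \bar\theta_m]$ vanishes unless either (1) $j = k$ and $l = m$, or (2) $j = m$ and $k = l$. The non-vanishing terms can be gathered as $B = |\chi\rangle\langle\chi| + A$, where

  $A = \sum_{j \ne k} \lambda_j \lambda_k\,|\varphi_j \otimes \psi_k\rangle\langle\varphi_j \otimes \psi_k|$.

Denote $\alpha = \max\{\lambda_j \lambda_k : j \ne k\}$. It is easily checked that $\alpha\,\mathrm{I} - A \in \mathcal{SEP}$ since it can be written as a positive combination of the operators

  $\{|\varphi_j \otimes \psi_k\rangle\langle\varphi_j \otimes \psi_k| : 1 \le j \le d_1,\ 1 \le k \le d_2\}$.

Note that $\alpha \le \frac{1}{2}$ since $\lambda_j \lambda_k \le \frac{1}{2}(\lambda_j^2 + \lambda_k^2) \le \frac{1}{2}$. It follows that $\frac{1}{2}\mathrm{I} - A \in \mathcal{SEP}$, and therefore that

  $t_0 \bullet \rho = t_0\left(|\chi\rangle\langle\chi| + \frac{n}{2}\rho_*\right) = t_0\left(B - A + \frac{1}{2}\mathrm{I}\right)$

is a separable state, as needed. □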
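The vanishing pattern of the phase averages $\mathbf{E}[\theta_j \bar\theta_k \theta_l \bar\theta_m]$ used in the proof can be checked exhaustively. The sketch below makes one substitution that is not in the text: instead of phases uniform on the unit circle it averages over independent cube roots of unity. This gives the same mixed moments here, because the net exponent of each $\theta_i$ lies between $-2$ and $2$ and is divisible by 3 only when it is zero. Indices are 0-based and the function name is illustrative.

```python
import cmath
import math
from itertools import product


def phase_average(j, k, l, m, d, order=3):
    """Average of theta_j * conj(theta_k) * theta_l * conj(theta_m) over
    independent uniform order-th roots of unity theta_0, ..., theta_{d-1}."""
    roots = [cmath.exp(complex(0, 2 * math.pi * r / order))
             for r in range(order)]
    total = 0
    for theta in product(roots, repeat=d):
        total += (theta[j] * theta[k].conjugate()
                  * theta[l] * theta[m].conjugate())
    return total / order ** d


d = 3
for j, k, l, m in product(range(d), repeat=4):
    value = phase_average(j, k, l, m, d)
    survives = (j == k and l == m) or (j == m and k == l)
    # The average is 1 exactly in cases (1) and (2) of the proof, 0 otherwise.
    assert abs(value - (1 if survives else 0)) < 1e-9
```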
9.2.3. Distances involving the set of PPT states. We consider the case of a balanced bipartite Hilbert space $H = \mathbb{C}^d \otimes \mathbb{C}^d$. Another relevant quantity, not covered by Proposition 9.17, is the geometric distance between PPT and Sep. This quantity is of interest since it quantifies the degree to which PPT is a poor substitute for separability in large dimensions. However, even the order of magnitude of the distance seems unknown. Actually, we are not aware of any upper bound improving substantially on the obvious estimate $d_g(\mathrm{Sep}, \mathrm{PPT}) \le d_g(\mathrm{Sep}, \mathrm{D})$.

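Proposition 9.18. Let $H = \mathbb{C}^d \otimes \mathbb{C}^d$. We have

  $\frac{\sqrt{d}}{16} \le d_g(\mathrm{Sep}, \mathrm{PPT})$.

Proof. We use the lower bound on the distance that comes from volume comparison

  $d_g(\mathrm{Sep}, \mathrm{PPT}) \ge \frac{\mathrm{vrad}\,\mathrm{PPT}}{\mathrm{vrad}\,\mathrm{Sep}}$,

together with the lower bound $\mathrm{vrad}\,\mathrm{PPT} \ge \frac{1}{4d}$ (Theorem 9.13) and the upper bound $\mathrm{vrad}(\mathrm{Sep}) \le 4d^{-3/2}$ (Theorem 9.3). □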
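For the record, the arithmetic combining the two bounds (a worked step added for convenience) is:

```latex
\[
d_g(\mathrm{Sep}, \mathrm{PPT})
\;\ge\; \frac{\operatorname{vrad}\mathrm{PPT}}{\operatorname{vrad}\mathrm{Sep}}
\;\ge\; \frac{1/(4d)}{4\,d^{-3/2}}
\;=\; \frac{d^{3/2}}{16\,d}
\;=\; \frac{\sqrt{d}}{16}.
\]
```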

Proposition 9.18 asserts that there are PPT states that are far from the set of separable states. Another way of quantifying this phenomenon is as follows. Given a state $\rho$ on $\mathbb{C}^d \otimes \mathbb{C}^d$, we introduce

  $d_{\mathrm{Sep}}(\rho) = \min_{\sigma \in \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)} \|\rho - \sigma\|_1$.

Theorem 9.19 (not proved here). For every $\varepsilon > 0$, for $d$ large enough, there is a PPT state $\rho$ on $\mathbb{C}^d \otimes \mathbb{C}^d$ such that $d_{\mathrm{Sep}}(\rho) \ge 2 - \varepsilon$.

The proof of Theorem 9.19 involves tricks that are beyond the scope of this book. However, we present an argument showing that a weaker lower bound on the distance to separable states (1/4 instead of 2) is achieved in a generic direction.

Proposition 9.20. Let $S$ denote the unit sphere in the space of trace zero Hermitian operators on $\mathbb{C}^d \otimes \mathbb{C}^d$. For most directions $u \in S$, there exists a PPT state $\rho$ such that

  $d_{\mathrm{Sep}}(\rho) \ge \|u\|_\infty^{-1} \min_{\sigma \in \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)} \mathrm{Tr}((\rho - \sigma)u) \ge \frac{1}{4} - o(1)$.

Proof. We consider the support functions $w(\mathrm{PPT}, \cdot)$ and $w(\mathrm{Sep}, \cdot)$, as defined in (4.29). Since the outradii of PPT and Sep are less than 1, these functions are 1-Lipschitz on $S$. Note also that the average of these functions on $S$ is exactly the mean width of the corresponding set. Using the values from Table 5.2, we conclude that, for $K = \mathrm{PPT}$ or $K = \mathrm{Sep}$ and for $\varepsilon > 0$,

  $\mathbf{P}(|w(K, \cdot) - w(K)| > \varepsilon) \le 2\exp(-\varepsilon^2 (d^4 - 1)/2)$.

We next use the bounds $w(\mathrm{PPT}) \ge (\frac{1}{2} - o(1))d^{-1}$ (Exercise 9.6) and $w(\mathrm{Sep}) \le 4d^{-3/2}$ (Theorem 9.3) to conclude that, for most directions $u \in S$, we have

(9.27)  $w(\mathrm{PPT}, u) \ge \left(\frac{1}{2} - o(1)\right) d^{-1}, \qquad w(\mathrm{Sep}, u) \le 5d^{-3/2}$.

Moreover (see Proposition 6.24), most directions $u$ also satisfy

(9.28)  $\|u\|_\infty \le (2 + o(1))d^{-1}$.

Choose $u \in S$ satisfying both (9.27) and (9.28), and let $\rho \in \mathrm{PPT}$ be such that $\mathrm{Tr}(\rho u) = w(\mathrm{PPT}, u)$. We then have

  $\min_{\sigma \in \mathrm{Sep}} \mathrm{Tr}((\rho - \sigma)u) = w(\mathrm{PPT}, u) - w(\mathrm{Sep}, u) \ge \left(\frac{1}{2} - o(1)\right) d^{-1}$.

Using the inequality $\mathrm{Tr}((\rho - \sigma)u) \le \|u\|_\infty \|\rho - \sigma\|_1 \le (2 + o(1))d^{-1}\|\rho - \sigma\|_1$, we obtain

  $d_{\mathrm{Sep}}(\rho) \ge \frac{1}{4} - o(1)$. □

Any improvement on the lower bound (9.27) for the mean width of PPT would improve the lower bound in Proposition 9.20.

9.2.4. Distance estimates in the multipartite case. We now focus on the case of $k$ qubits, i.e., the Hilbert space $H = (\mathbb{C}^2)^{\otimes k}$. Recall that the inradius of Sep is witnessed by balls centered at $\rho_*$ (see Proposition 2.18 and the discussion in the preamble to the present chapter). The inradius of Sep is known up to a universal (not too large) multiplicative constant.

Theorem 9.21 (not proved here, but see Exercise 9.9). For $H = (\mathbb{C}^2)^{\otimes k}$, we have

  $\sqrt{54/17} \times 6^{-k/2} \le \mathrm{inrad}(\mathrm{Sep}) \le 2 \times 6^{-k/2}$.

We next turn to the problem of estimating the geometric distance between D and Sep in the case of many qubits, for which even the asymptotic order is not known.

Proposition 9.22 (Robustness for many qubits). For $H = (\mathbb{C}^2)^{\otimes k}$, we have

  $2^{k-1} + 1 \le d_g(\mathrm{Sep}, \mathrm{D}) \le (\sqrt{6})^k$.

Proof. The upper bound is fairly straightforward: it follows by comparing the two sets with the Hilbert–Schmidt ball. Specifically, we use the elementary inequality $d_g(\mathrm{Sep}, \mathrm{D}) \le \mathrm{outrad}(\mathrm{D})/\mathrm{inrad}(\mathrm{Sep})$, combined with Theorem 9.21 and with the obvious fact that $\mathrm{outrad}(\mathrm{D}(\mathbb{C}^n)) = \sqrt{(n-1)/n}$.
To prove the lower bound, consider any decomposition $(\mathbb{C}^2)^{\otimes k} = A \otimes B$, where $A = (\mathbb{C}^2)^{\otimes j}$ and $B = (\mathbb{C}^2)^{\otimes (k-j)}$ for some $0 < j < k$. A separable state on $(\mathbb{C}^2)^{\otimes k}$ is also separable along the $A : B$ cut, and therefore

  $d_g(\mathrm{D}((\mathbb{C}^2)^{\otimes k}), \mathrm{Sep}((\mathbb{C}^2)^{\otimes k})) \ge d_g(\mathrm{D}(A \otimes B), \mathrm{Sep}(A \otimes B)) = 2^{k-1} + 1$,

where the last equality comes from Proposition 9.17. □
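To get a feeling for the gap between the two bounds of Proposition 9.22, the sketch below tabulates them for a few values of $k$, together with the per-qubit exponent of their ratio. The function name is illustrative; only the two stated bounds are used.

```python
import math


def robustness_bounds(k):
    """Bounds of Proposition 9.22 on d_g(Sep, D) for k qubits (k >= 2)."""
    lower = 2 ** (k - 1) + 1      # from a bipartite cut, via Proposition 9.17
    upper = math.sqrt(6) ** k     # from comparison with the Hilbert-Schmidt ball
    return lower, upper


for k in (2, 4, 8, 16):
    lo, hi = robustness_bounds(k)
    # The last column is log2(upper / lower) / k, the per-qubit gap exponent.
    print(k, lo, hi, math.log2(hi / lo) / k)
```

The ratio of the two bounds grows like $(\sqrt{6}/2)^k$, so the true exponential rate of $d_g(\mathrm{Sep}, \mathrm{D})$ is pinned down only to within the interval $[1, \frac{1}{2}\log_2 6]$ per qubit (in base 2).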

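Exercise 9.9 (A bound on the inradius of Sep on $k$ qubits via mean width). Let $P : M_2^{\mathrm{sa}} \to M_2^{\mathrm{sa}}$ be the orthogonal projection onto the hyperplane of trace zero matrices, and let $\Pi = P^{\otimes k}$.
(i) Check that $\Pi(\mathrm{Sep}_{\mathrm{sym}}((\mathbb{C}^2)^{\otimes k})) = (P(\mathrm{D}_{\mathrm{sym}}(\mathbb{C}^2)))^{\hat\otimes k}$.
(ii) Show that

  $\mathrm{inrad}\bigl(\mathrm{Sep}((\mathbb{C}^2)^{\otimes k})\bigr) \le \mathrm{inrad}\Bigl(\bigl(2^{-1/2} B_2^3\bigr)^{\hat\otimes k}\Bigr) = O(\sqrt{k \log k} \cdot 6^{-k/2})$.
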
9.3. The super-picture: classes of maps
Up to now, we focused on determining volumes and other geometric parameters
ot
for various classes of states. Due to the Choi–Jamiołkowski isomorphism (see Sec-
tion 2.3.1), these results can be translated into statements about the corresponding
N

classes of quantum maps, or superoperators. However, there are some fine points
that need to be addressed for such translation to be rigorous.
ly.

To exemplify the fine points, consider the cone CP “ CP pMm , Mn q of com-


pletely positive maps Φ : Mm Ñ Mn , which can be identified via the Choi isomor-
on

phism Φ ÞÑ CpΦq (see Section 2.4, especially Table 2.2) with the positive semi-
definite cone PSDpCn b Cm q. So far, so good. However, if we restrict our at-
se

tention to the subset of trace-preserving maps Φ (denoted by CPTP ), the set of


the corresponding Choi matrices CpΦq forms a proper subset of the rescaled set
lu

of states mDpCn b Cm q. This is due to the fact that the trace-preserving condi-
tion Tr Φpρq “ Tr ρ (for ρ P Mm ) translates into TrCn CpΦq “ ICm (which implies
na

Tr CpΦq “ m, whence the rescaling factor m), which represents m2 independent


(real linear) scalar constraints. On the other hand, membership in mDpCn b Cm q
so

is represented by just one scalar constraint Trp¨q “ m (in addition to the positive
semi-definiteness constraint common to both settings).
r

In other words, if we denote by H Ă B sa pCn b Cm q the affine subspace


Pe

tTrCn p¨q “ ICm u, then the rescaled set of states K “ mDpCn b Cm q is a base of the
positive semi-definite cone, which is an m2 n2 ´ 1-dimensional convex set, while the
set of Choi matrices corresponding to completely positive trace-preserving maps is
K X H, a section of that base of relative codimension m2 ´ 1, i.e., a convex set of
dimension m2 n2 ´ m2 .
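The difference between the single constraint $\mathrm{Tr}(\cdot) = m$ and the $m^2$ constraints $\mathrm{Tr}_{\mathbb{C}^n}(\cdot) = I_{\mathbb{C}^m}$ can be seen on a toy example. The following is a minimal pure-Python sketch, assuming the unnormalized convention $C(\Phi) = \sum_{i,j} \Phi(E_{ij}) \otimes E_{ij}$ (the convention of Section 2.4 may differ by ordering of the factors); the helper names and the two example maps are ours and serve only as illustration.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def choi(kraus, m):
    """Choi matrix sum_{i,j} Phi(E_ij) (x) E_ij of Phi(X) = sum_a K_a X K_a^*.

    Each Kraus operator is an n x m matrix; the result acts on C^n (x) C^m,
    with the index pair (a, i) stored at position a * m + i.
    """
    n = len(kraus[0])
    c = [[0j] * (n * m) for _ in range(n * m)]
    for i in range(m):
        for j in range(m):
            e_ij = [[1.0 if (r, s) == (i, j) else 0.0 for s in range(m)]
                    for r in range(m)]
            for k_op in kraus:
                phi = matmul(matmul(k_op, e_ij), dagger(k_op))
                for a in range(n):
                    for b in range(n):
                        c[a * m + i][b * m + j] += phi[a][b]
    return c

def trace_out_output(c, n, m):
    """Partial trace over the first factor C^n; returns an m x m matrix."""
    return [[sum(c[a * m + i][a * m + j] for a in range(n))
             for j in range(m)] for i in range(m)]

def trace(c):
    return sum(c[i][i] for i in range(len(c)))

gamma = 0.3
# Amplitude damping: a completely positive trace-preserving map on M_2.
damping = [[[1.0, 0.0], [0.0, math.sqrt(1 - gamma)]],
           [[0.0, math.sqrt(gamma)], [0.0, 0.0]]]
# One Kraus operator with Tr(K^* K) = 2: completely positive, not trace-preserving.
skewed = [[[math.sqrt(1.5), 0.0], [0.0, math.sqrt(0.5)]]]

for name, kraus in (("damping", damping), ("skewed", skewed)):
    c = choi(kraus, 2)
    print(name, "Tr C =", trace(c).real, "partial trace =", trace_out_output(c, 2, 2))
```

Both Choi matrices have trace $m = 2$, so both lie in $m\mathrm{D}$, but only the first has partial trace equal to the identity; the second lies in $K$ but not in the section $K \cap H$.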
The problem of relating the size of a convex set to that of its (central) sections
is in general nontrivial, and two-sided bounds are only possible if the set is isotropic
(in the technical sense defined in Section 4.4; see especially Proposition 4.26). The
set D of all states actually is isotropic (see Proposition 4.25). While not all natural
sets of states have this property, they are all sufficiently balanced so that the more
robust Proposition 4.28 leads to reasonable estimates. For notational simplicity, we
restrict ourselves to superoperators $\Phi : M_d \to M_d$ in the following theorem.
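Theorem 9.23. Let $\mathbf{C} = \mathbf{C}(M_d, M_d)$ be one of the cones of superoperators
appearing in Table 9.3, and $\mathbf{C}_{TP} := \{\Phi \in \mathbf{C} : \Phi \text{ is trace-preserving}\}$. Denote
also by $\mathcal{C} = \{C(\Phi) : \Phi \in \mathbf{C}\}$ the corresponding cone of Choi matrices and by
$\mathcal{C}^b = \{A \in \mathcal{C} : \mathrm{Tr}\, A = 1\}$ its base. Then, as $d \to \infty$,
$$\mathrm{vrad}\big(\mathbf{C}_{TP}\big) \sim d\, \mathrm{vrad}\big(\mathcal{C}^b\big). \tag{9.29}$$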

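Table 9.3. Each cone $\mathbf{C}$ of superoperators is a nondegenerate
cone in $B(M_d^{sa}, M_d^{sa})$ and the subset $\mathbf{C}_{TP}$ of trace-preserving elements
is a convex set of dimension $d^4 - d^2$. The cone $\mathcal{C} \subset
B^{sa}(\mathbb{C}^d \otimes \mathbb{C}^d)$ is the image of $\mathbf{C}$ under the map $\Phi \mapsto C(\Phi)$, see
Section 2.4.

Cone of superoperators $\mathbf{C}$        Cone $\mathcal{C}$                  Base $\mathcal{C}^b$                $\mathrm{vrad}(\mathbf{C}_{TP})$
Positivity-preserving $\mathbf{P}$         $\mathcal{BP}$                      BP                                  $\Theta(\sqrt{d})$
Decomposable $\mathbf{DEC}$                co-$\mathcal{PSD}$ + $\mathcal{PSD}$   $\mathrm{conv}(\mathrm{D} \cup \Gamma(\mathrm{D}))$   $\Theta(1)$
Completely positive $\mathbf{CP}$          $\mathcal{PSD}$                     D                                   $\sim e^{-1/4}$
PPT-inducing $\mathbf{PPT}$                $\mathcal{PPT}$                     PPT                                 $\Theta(1)$
Entanglement breaking $\mathbf{EB}$        $\mathcal{SEP}$                     Sep                                 $\Theta(1/\sqrt{d})$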

Proof of Theorem 9.23. Denote $K = d\,\mathcal{C}^b$ and $n = \dim K = d^4 - 1$. Since
K is invariant under local unitaries, it follows (see Proposition 2.18) that the centroid
of K equals $I/d$.
As explained earlier, $\mathbf{C}_{TP}$ identifies with a section of K (through the centroid)
of codimension $k = d^2 - 1$. It follows from Proposition 4.28 that
$$R^{-\theta}\, b(n,k) \le \frac{\mathrm{vrad}(\mathbf{C}_{TP})^{1-\theta}}{\mathrm{vrad}(K)} \le r^{-\theta}\, b(n,k) \binom{n}{k}^{1/n}, \tag{9.30}$$
where $\theta = \frac{k}{n} = \frac{d^2-1}{d^4-1} < \frac{1}{d^2}$ and $r$, $R$ denote respectively the inradius and outradius
of K. The constants $b(n,k)$ were defined in (4.51); in our setting the bounds (4.55)
can be sharpened (see Exercise 9.10) to
$$b(n,k) = 1 - O\Big(\frac{\log d}{d^2}\Big), \qquad b(n,k)\binom{n}{k}^{1/n} = 1 + O\Big(\frac{\log d}{d^2}\Big). \tag{9.31}$$
Since all the cones we consider have the property that $\mathrm{Sep} \subset \mathcal{C}^b \subset \mathrm{BP} =
-d^2\,\mathrm{Sep}^\circ$, we know from Table 9.1 that $r = 1/\sqrt{d^2-1}$ and $R = \sqrt{d^2-1}$, so
$r^{-\theta} = R^{\theta} = 1 + O\big(\frac{\log d}{d^2}\big)$. Combining (9.30) and (9.31) yields $\mathrm{vrad}(\mathbf{C}_{TP})^{1-\theta} \sim d\,\mathrm{vrad}\big(\mathcal{C}^b\big)$,
and it remains to again notice that since $\theta$ is small, the exponent $1 - \theta$ does not
make much of a difference (this uses very weakly the estimates on the volume radii
from Table 9.1, or just rough bounds given by $r$ and $R$). □
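The two elementary facts used in the proof, namely $\theta = (d^2-1)/(d^4-1) = 1/(d^2+1) < 1/d^2$ and $R^\theta = 1 + O(\log d/d^2)$, are easy to check numerically. The sketch below is an illustration only; the function name `section_parameters` is ours.

```python
from fractions import Fraction
import math

def section_parameters(d):
    """Return theta = k/n and the correction factor R^theta = r^(-theta)."""
    n = d ** 4 - 1                      # dimension of K = d * C^b
    k = d ** 2 - 1                      # codimension of the trace-preserving section
    theta = Fraction(k, n)              # simplifies to 1 / (d^2 + 1)
    big_r = math.sqrt(d ** 2 - 1)       # outradius R of K; the inradius is r = 1 / R
    correction = big_r ** float(theta)  # R^theta, equal to r^(-theta)
    return theta, correction

for d in (2, 3, 10, 100):
    theta, correction = section_parameters(d)
    # compare R^theta - 1 with log(d) / d^2
    print(d, theta, correction - 1, math.log(d) / d ** 2)
```

For large $d$ the two printed quantities agree to leading order, in line with $R^\theta = \exp\big(\tfrac{\log(d^2-1)}{2(d^2+1)}\big)$.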
The same argument leads to non-asymptotic bounds (i.e., stated for a fixed
dimension) and to bounds for maps from $M_m$ to $M_n$. We also state a version of
Theorem 9.23 for the mean width. As we shall see in Chapter 10, the latter may
also be of independent importance.
Proposition 9.24. In the same notation as in Theorem 9.23, we have
$$w\big(\mathbf{C}_{TP}\big) \le \big(1 + d^{-2}\big)\, d\, w\big(\mathcal{C}^b\big). \tag{9.32}$$
Proof. This is a consequence of the following inequality: for an n-dimensional
convex body K and a k-codimensional affine subspace H, we have
$$w(K \cap H) \le \sqrt{\frac{n}{n-k-1}}\; w(K). \tag{9.33}$$
Inequality (9.33) follows from the link (4.32) between the Gaussian mean width and
the standard mean width, from the fact that the Gaussian mean width of a subset
does not exceed that of the entire set, and from the inequality $\sqrt{n-1} \le \kappa_n \le \sqrt{n}$
(Proposition A.1(i)). □
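To pass from (9.33) to (9.32) one takes $K = d\,\mathcal{C}^b$, $n = d^4 - 1$ and $k = d^2 - 1$, and needs $\sqrt{n/(n-k-1)} \le 1 + d^{-2}$. The following short check of this numerical inequality is a sketch (the function name `section_factor` is ours):

```python
import math

def section_factor(d):
    """The constant sqrt(n / (n - k - 1)) of (9.33) for n = d^4 - 1, k = d^2 - 1."""
    n = d ** 4 - 1
    k = d ** 2 - 1
    return math.sqrt(n / (n - k - 1))

for d in (2, 3, 5, 10, 50):
    # the factor behaves like 1 + 1/(2 d^2), below the bound 1 + 1/d^2
    print(d, section_factor(d), 1 + d ** -2)
```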

Deriving meaningful lower bounds for $w(K \cap H)$ in terms of $w(K)$ in a general
setting (such as Proposition 4.28 for the volume radius) is not that easy. However,
when K is one of the sets $\mathcal{C}^b$ from Table 9.3, nontrivial lower bounds for the mean
width follow from the estimates on the volume radii contained in the Table and
from Urysohn’s inequality.
Exercise 9.10. Prove the bounds (9.31).
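Exercise 9.11 (Cones of channels are not self-dual). Let $\mathcal{H} = \mathbb{C}^m \otimes \mathbb{C}^n$.
(i) Consider the affine subspace $H = \{A \in B(\mathcal{H}) : \mathrm{Tr}_{\mathbb{C}^n} A = \frac{I}{m}\}$. Show that
$\mathrm{D} \cap H \subsetneq P_H \mathrm{D}$ and $\mathrm{Sep} \cap H \subsetneq P_H \mathrm{Sep}$.
(ii) Conclude in particular that $(\mathrm{D} \cap H)^\circ \neq -mn(\mathrm{D} \cap H)$: the self-duality of D is
destroyed by the partial trace condition.
(iii) Consider the affine subspace $F = \{\frac{I}{m} \otimes \sigma : \sigma \in M_n^{sa},\ \mathrm{Tr}\,\sigma = 1\} \subsetneq H$. Show
that $\mathrm{D} \cap F = P_F \mathrm{D} = \mathrm{Sep} \cap F = P_F \mathrm{Sep}$.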


9.4. Approximation by polytopes
The proofs of volume and mean width estimates given in Section 9.1 proceed
through a symmetrization argument, by showing that the symmetrized sets (the
symmetrizations of D or of Sep) are close, with respect to the geometric distance,
to a polytope with not-too-many vertices. It is natural to wonder whether a similar
effect can be achieved without symmetrization. This problem is also of independent
interest since approximating convex sets by polytopes with not-too-many vertices,
or with not-too-many faces, has important algorithmic implications, and is a much
studied question in computational geometry. It is convenient to formulate the results
using the notion of verticial and facial dimensions introduced in Section 7.2.3. For
an easy overview and reference, we list the results in Table 9.4; the proofs can be
found in the next two sections.
9.4.1. Approximating the set of all quantum states. We first show that
it is possible to approximate D by a polytope whose number of vertices is exponential
in the dimension of the underlying Hilbert space. Recall that the notation
$t \bullet K$ was defined in (9.1).
Proposition 9.25. For every $\varepsilon \in (0,1)$, there is a constant $C(\varepsilon)$ such that the
following holds: for every dimension $d \ge 2$, there exists a family $\mathcal{N} = (\varphi_i)_{1 \le i \le N}$ of
unit vectors in $\mathbb{C}^d$, with $N \le \exp(C(\varepsilon) d)$, such that
$$(1 - \varepsilon) \bullet \mathrm{D}(\mathbb{C}^d) \subset \mathrm{conv}\{|\varphi_i\rangle\langle\varphi_i| : \varphi_i \in \mathcal{N}\}. \tag{9.34}$$
Table 9.4. Verticial and facial dimensions of the set of states
$\mathrm{D}(\mathbb{C}^m)$ and of the set of separable states $\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. They
are proved respectively in Sections 9.4.1 (Corollary 9.26) and 9.4.2
(Proposition 9.31 and Corollary 9.32). We also include the values of
asphericities $a(\mathrm{D})$ and $a(\mathrm{Sep})$ (see Exercises 9.12 and 9.14), which
are important in some applications and derivations. For all these
notions, the maximally mixed state $\rho_*$ plays the role of the origin.

K                                              dimension    $a(K)$       $\dim_V(K)$           $\dim_F(K)$
$\mathrm{D}(\mathbb{C}^m)$                     $m^2 - 1$    $m - 1$      $\Theta(m)$           $\Theta(m)$
$\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$   $d^4 - 1$    $d^2 - 1$    $\Theta(d \log d)$    $\Omega(d^3/\log d)$
The result from Proposition 9.25 can be rephrased as estimates on the verticial
(or facial) dimension of $\mathrm{D}(\mathbb{C}^d)$.
Corollary 9.26. There are absolute constants $c$, $C$ such that, for any $d \ge 2$,
$$cd \le \dim_V(\mathrm{D}(\mathbb{C}^d)) = \dim_F(\mathrm{D}(\mathbb{C}^d)) \le Cd.$$
Proof. Since $\mathrm{D}^\circ = (-d) \bullet \mathrm{D}$, the facial and verticial dimensions are equal.
The upper bound follows from Proposition 9.25. Using the value $a(\mathrm{D}) = d - 1$ (see
Table 9.1 and Exercise 9.12), one can deduce the lower bound from Theorem 7.29.
Alternatively, an elementary argument is sketched in Exercise 9.13. □

It may seem reasonable to expect that choosing $\mathcal{N}$ as a $\delta$-net in $S_{\mathbb{C}^d}$ (for some
$\delta$ depending only on $\varepsilon$) would be enough for the conclusion of Proposition 9.25 to
hold. This is the case for the symmetrization of D (see Lemma 9.2). However, this
approach fails for D itself. Indeed, given $\delta$, for d large enough, a $\delta$-net $\mathcal{N}$ may have
the property that for some fixed unit vector $\psi$, we have $|\langle\varphi_i, \psi\rangle| > 1/\sqrt{d}$ for every
$\varphi_i \in \mathcal{N}$. It follows that $\langle\psi|\rho|\psi\rangle > 1/d$ for every $\rho \in \mathrm{conv}\{|\varphi_i\rangle\langle\varphi_i|\}$. However, this
inequality fails for $\rho = \rho_*$, which shows that even the maximally mixed state does
not belong to the convex hull of the net! Elements of the net may somehow conspire
towards the direction $\psi$.
Yet, this approach can be salvaged if we use a balanced $\delta$-net to avoid such
conspiracies. The idea is to use, instead of an arbitrary net, a family of random
points independently and uniformly distributed on the unit sphere, and to show
that these points satisfy the conclusion of Proposition 9.25 with high probability.
This is reminiscent of the random covering argument used in Proposition 5.4.
We start with a lemma which gives a rough bound on the number of unit vectors
that are needed.
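Lemma 9.27. Let $\mathcal{M}$ be a $\delta$-net in $(S_{\mathbb{C}^d}, |\cdot|)$. Then
$$(1 - 2d\delta) \bullet \mathrm{D}(\mathbb{C}^d) \subset \mathrm{conv}\{|\psi_i\rangle\langle\psi_i| : \psi_i \in \mathcal{M}\} \subset \mathrm{D}(\mathbb{C}^d). \tag{9.35}$$
The reader will notice that the proof given below can be fine-tuned to yield a
slightly better (but more complicated) factor $\big(1 - 2(d-1)\delta\big)$ in (9.35).
Proof. We have to show that, for any trace zero Hermitian matrix A,
$$\lambda_1(A) = \sup_{\psi \in S_{\mathbb{C}^d}} \langle\psi|A|\psi\rangle \le (1 - 2\delta d)^{-1} \sup_{\psi_i \in \mathcal{M}} \langle\psi_i|A|\psi_i\rangle.$$
Since A has zero trace, we have $\|A\|_\infty \le d\lambda_1(A)$. Given $\psi \in S_{\mathbb{C}^d}$, there is $\psi_i \in \mathcal{M}$
with $|\psi - \psi_i| \le \delta$. By the triangle inequality, we have
$$\langle\psi|A|\psi\rangle \le \delta\|A\|_\infty + \langle\psi|A|\psi_i\rangle \tag{9.36}$$
$$\le 2\delta\|A\|_\infty + \langle\psi_i|A|\psi_i\rangle \tag{9.37}$$
$$\le 2\delta d\lambda_1(A) + \langle\psi_i|A|\psi_i\rangle. \tag{9.38}$$
Taking supremum over $\psi$, we get $\lambda_1(A) \le 2\delta d\, \lambda_1(A) + \sup\{\langle\psi_i|A|\psi_i\rangle : \psi_i \in \mathcal{M}\}$
and the result follows. □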

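Lemma 9.27 is not enough to directly imply Proposition 9.25, but it can be
“bootstrapped” to yield the needed estimate.
Proof of Proposition 9.25. The conclusion (9.34) can be equivalently
reformulated as follows: For any self-adjoint trace zero matrix A we have
$$\lambda_1(A) = \sup_{\psi \in S_{\mathbb{C}^d}} \langle\psi|A|\psi\rangle \le \frac{1}{1-\varepsilon} \sup_{\varphi_i \in \mathcal{N}} \langle\varphi_i|A|\varphi_i\rangle. \tag{9.39}$$
Let $\mathcal{M}$ be a $\frac{\varepsilon}{4d}$-net in $(S_{\mathbb{C}^d}, |\cdot|)$. By Lemma 5.3, we may enforce $\mathrm{card}\,\mathcal{M} \le (8d/\varepsilon)^{2d}$.
By Lemma 9.27, we have
$$\sup_{\psi \in S_{\mathbb{C}^d}} \langle\psi|A|\psi\rangle \le \frac{1}{1 - \varepsilon/2} \sup_{\psi \in \mathcal{M}} \langle\psi|A|\psi\rangle. \tag{9.40}$$
Set $\eta = \sqrt{\varepsilon/8}$. For $\psi \in S_{\mathbb{C}^d}$, denote by $C(\psi, \eta) \subset S_{\mathbb{C}^d}$ the cap with center $\psi$ and
radius $\eta$ with respect to the geodesic distance. By symmetry, there is a number $\alpha$
(depending on d and $\varepsilon$) such that
$$\frac{1}{\sigma(C(\psi,\eta))} \int_{C(\psi,\eta)} |\varphi\rangle\langle\varphi| \, d\sigma(\varphi) = (1 - \alpha) \bullet |\psi\rangle\langle\psi|. \tag{9.41}$$
Taking (Hilbert–Schmidt) inner product with $|\psi\rangle\langle\psi|$, we obtain
$$1 - \alpha + \frac{\alpha}{d} = \frac{1}{\sigma(C(\psi,\eta))} \int_{C(\psi,\eta)} |\langle\psi, \varphi\rangle|^2 \, d\sigma(\varphi) \ge \cos^2 \eta \ge 1 - \eta^2$$
so that
$$\alpha \le \frac{d}{d-1}\, \eta^2 \le \varepsilon/4. \tag{9.42}$$
Denote $L := \sigma(C(\psi,\eta))^{-1}$ and let $\mathcal{N} = \{\varphi_i : 1 \le i \le 2L^3\}$ be a family of
$N = \lceil 2L^3 \rceil$ independent random vectors uniformly distributed on $S_{\mathbb{C}^d}$. (In order not to
obscure the argument, we will pretend in what follows that $2L^3$ is an integer and
so $N = 2L^3$.) We will rely on the following lemma.
Lemma 9.28. Let $S_\infty^d = \{\Delta \in M_d : \|\Delta\|_{op} \le 1\}$ be the unit ball for the operator
norm. For $\psi \in S_{\mathbb{C}^d}$ and $t \ge 0$, the event
$$E_{\psi,t} = \Big\{ (\varphi_i) : (1 - \alpha) \bullet |\psi\rangle\langle\psi| \in t S_\infty^d + \mathrm{conv}\{|\varphi_i\rangle\langle\varphi_i| : 1 \le i \le 2L^3\} \Big\}$$
satisfies
$$1 - \mathbf{P}(E_{\psi,t}) \le \exp(-L) + 2d \exp\big(-t^2 L^2/8\big).$$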
We apply Lemma 9.28 with $t = \varepsilon/(8d)$. When the event $E_{\psi,t}$ holds, we have
$$(1 - \alpha)\langle\psi|A|\psi\rangle \le t\|A\|_1 + \sup_{\varphi_i \in \mathcal{N}} \langle\varphi_i|A|\varphi_i\rangle. \tag{9.43}$$
If the events $E_{\psi,t}$ hold simultaneously for every $\psi \in \mathcal{M}$, we can conclude from
(9.40) and (9.43) that
$$(1 - \varepsilon/2)(1 - \alpha)\lambda_1(A) \le t\|A\|_1 + \sup_{\varphi_i \in \mathcal{N}} \langle\varphi_i|A|\varphi_i\rangle. \tag{9.44}$$
Since A has zero trace, we have $\|A\|_1 \le 2d\lambda_1(A)$, and (9.44) combined with (9.42)
implies that
$$(1 - \varepsilon)\lambda_1(A) \le \big((1 - \varepsilon/2)(1 - \alpha) - 2td\big) \lambda_1(A) \le \sup_{\varphi_i \in \mathcal{N}} \langle\varphi_i|A|\varphi_i\rangle,$$
yielding (9.39). The Proposition will follow once we show that the events $E_{\psi,t}$ hold
simultaneously for every $\psi \in \mathcal{M}$ with positive probability. To that end, we use
Lemma 9.28 and the union bound
$$\mathbf{P}\Big(\bigcap_{\psi \in \mathcal{M}} E_{\psi,t}\Big) \ge 1 - \sum_{\psi \in \mathcal{M}} \big(1 - \mathbf{P}(E_{\psi,t})\big) \tag{9.45}$$
$$\ge 1 - \Big(\frac{8d}{\varepsilon}\Big)^{2d} \Big(\exp(-L) + 2d \exp\big(-\varepsilon^2 d^{-2} L^2/512\big)\Big). \tag{9.46}$$
We know from Proposition 5.1 that $\exp(c_1(\varepsilon)d) \le L \le \exp(C_1(\varepsilon)d)$ for some constants
$c_1(\varepsilon)$, $C_1(\varepsilon)$ depending only on $\varepsilon$. It follows that the quantity in (9.45)–(9.46)
is positive for d large enough (depending on $\varepsilon$), yielding a family of $2L^3 \le
2\exp(3C_1(\varepsilon)d)$ vectors satisfying the conclusion of Proposition 9.25. Small values
of d are taken care of by adjusting the constant $C(\varepsilon)$ if necessary. □
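The choices $\eta = \sqrt{\varepsilon/8}$ and $t = \varepsilon/(8d)$ are made so that $\alpha \le \varepsilon/4$ and $(1-\varepsilon/2)(1-\alpha) - 2td \ge 1 - \varepsilon$; indeed, with $\alpha = \varepsilon/4$ the left-hand side equals $1 - \varepsilon + \varepsilon^2/8$. The sketch below checks this bookkeeping numerically and also prints the logarithm of the net bound $(8d/\varepsilon)^{2d}$ from Lemma 9.27, which grows like $d \log d$ rather than linearly in $d$. The function name `bookkeeping` is ours.

```python
import math

def bookkeeping(eps, d):
    """Constants from the proof of Proposition 9.25 for given eps and d >= 2."""
    eta = math.sqrt(eps / 8)
    alpha_max = d / (d - 1) * eta ** 2        # right-hand side of (9.42)
    t = eps / (8 * d)                         # value used in Lemma 9.28
    factor = (1 - eps / 2) * (1 - alpha_max) - 2 * t * d
    log_net = 2 * d * math.log(8 * d / eps)   # log of the bound on card M
    return alpha_max, factor, log_net

for d in (2, 10, 100):
    alpha_max, factor, log_net = bookkeeping(0.5, d)
    # factor should be at least 1 - eps = 0.5; log_net / d grows with d
    print(d, alpha_max, factor, log_net / d)
```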
Proof of Lemma 9.28. Let $M_\psi = \mathrm{card}(\mathcal{N} \cap C(\psi,\eta))$. The random variable
$M_\psi$ follows the binomial distribution $B(N,p)$ for $N = 2L^3$ and $p = 1/L$. It follows
from Hoeffding’s inequality (5.43) that
$$\mathbf{P}\Big(B(N,p) \le \frac{Np}{2}\Big) \le \exp\Big(-\frac{p^2 N}{2}\Big).$$
Specialized to our situation, this yields
$$\mathbf{P}\big(M_\psi \le L^2\big) \le \exp(-L). \tag{9.47}$$
Moreover, conditionally on the value of $M_\psi$, the points from $\mathcal{N} \cap C(\psi,\eta)$ have
the same distribution as $(\varphi_k)_{1 \le k \le M_\psi}$, where $(\varphi_k)$ are independent and uniformly
distributed inside $C(\psi,\eta)$. The random matrices
$$X_k = |\varphi_k\rangle\langle\varphi_k| - \mathbf{E}\,|\varphi_1\rangle\langle\varphi_1| = |\varphi_k\rangle\langle\varphi_k| - (1-\alpha) \bullet |\psi\rangle\langle\psi|$$
are independent mean zero matrices. We now use the matrix Hoeffding inequality
(see, e.g., Theorem 1.3 in [Tro12]) to conclude that for any $t \ge 0$,
$$\mathbf{P}\Big(\Big\|\frac{1}{M_\psi}\sum_{k=1}^{M_\psi} X_k\Big\|_\infty \ge t\Big) \le 2d \exp(-M_\psi t^2/8) \tag{9.48}$$
(the factor 2 appears because we want to control the operator norm rather than
the largest eigenvalue). Define a random matrix $\Delta$ by the relation
$$\frac{1}{M_\psi}\sum_{k=1}^{M_\psi} |\varphi_k\rangle\langle\varphi_k| + \Delta = (1-\alpha) \bullet |\psi\rangle\langle\psi|.$$
The bound (9.48) translates into $\mathbf{P}(\|\Delta\|_\infty \ge t) \le 2d\exp(-M_\psi t^2/8)$. If we remove
the conditioning on $M_\psi$ and take (9.47) into account, we are led to
$$\mathbf{P}(\|\Delta\|_\infty \ge t) \le \exp(-L) + 2d\exp\big(-L^2 t^2/8\big),$$
which is exactly the content of Lemma 9.28. □
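The specialization leading to (9.47) uses $Np/2 = L^2$ and $p^2N/2 = L$ for $N = 2L^3$, $p = 1/L$. As a sanity check, one can compare the exact binomial lower tail with the Hoeffding bound $e^{-L}$ for a few small integer values of $L$. These values are purely illustrative (in the proof, $L$ is exponentially large in $d$ and need not be an integer), and the function name `binom_lower_tail` is ours.

```python
import math

def binom_lower_tail(n_trials, p, m):
    """Exact probability that a Binomial(n_trials, p) variable is at most m."""
    return sum(math.comb(n_trials, j) * p ** j * (1 - p) ** (n_trials - j)
               for j in range(m + 1))

for big_l in (2, 3, 4, 5):
    n_trials = 2 * big_l ** 3   # N = 2 L^3
    p = 1 / big_l               # p = 1 / L, so N p / 2 = L^2
    tail = binom_lower_tail(n_trials, p, big_l ** 2)
    # exact tail versus the Hoeffding bound exp(-L) of (9.47)
    print(big_l, tail, math.exp(-big_l))
```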

Exercise 9.12 (Asphericity of D). By comparing the values of the inradius
and the outradius of $\mathrm{D}(\mathbb{C}^m)$ from Table 9.1, we see that the asphericity of $\mathrm{D}(\mathbb{C}^m)$
is at most $m - 1$. Prove that it actually equals $m - 1$.
Exercise 9.13 (An elementary bound for verticial dimension of D). Let P be
a polytope such that $\frac{1}{4} \bullet \mathrm{D}(\mathbb{C}^d) \subset P \subset \mathrm{D}(\mathbb{C}^d)$. Use Proposition 6.3 to prove that P
has at least $\exp(cd)$ vertices for some $c > 0$.
9.4.2. Approximating the set of separable states. For simplicity, we only
consider the case $\mathcal{H} = \mathbb{C}^d \otimes \mathbb{C}^d$ and denote $\mathrm{Sep} = \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. As in the case
of D, a simple net argument (Lemma 9.29) shows that the verticial dimension of
Sep is $O(d \log d)$. However there is no analogue of the random construction used in
Proposition 9.25: this upper bound is sharp (see Proposition 9.31). Here are the
precise statements and the proofs.
Lemma 9.29. Let $\mathrm{Sep} = \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. If $\mathcal{N}$ is an $\frac{\varepsilon}{4d^2}$-net in $(S_{\mathbb{C}^d}, |\cdot|)$, then
$$(1 - \varepsilon) \bullet \mathrm{Sep} \subset \mathrm{conv}\{|\psi_\alpha \otimes \psi_\beta\rangle\langle\psi_\alpha \otimes \psi_\beta| : \psi_\alpha, \psi_\beta \in \mathcal{N}\}.$$
In particular, $\dim_V(\mathrm{Sep}) \le Cd\log d$ for some constant C.


Proof. We have to show that for any trace zero Hermitian matrix A, we have
$$W := \sup_{\psi,\varphi \in S_{\mathbb{C}^d}} \langle\psi \otimes \varphi|A|\psi \otimes \varphi\rangle \le (1-\varepsilon)^{-1} \sup_{\psi_\alpha,\psi_\beta \in \mathcal{N}} \langle\psi_\alpha \otimes \psi_\beta|A|\psi_\alpha \otimes \psi_\beta\rangle.$$
First, note using Theorem 9.15 that
$$W \ge \frac{1}{d^2}\|A\|_2 \ge \frac{1}{d^2}\|A\|_\infty.$$
Let $\delta = \varepsilon/(4d^2)$. Given $\varphi, \psi \in S_{\mathbb{C}^d}$, there are $\psi_\alpha, \psi_\beta \in \mathcal{N}$ with $|\varphi - \psi_\alpha| \le \delta$ and
$|\psi - \psi_\beta| \le \delta$. Using the triangle inequality as in (9.36)–(9.37), we have
$$\langle\varphi \otimes \psi|A|\varphi \otimes \psi\rangle \le 4\delta\|A\|_\infty + \langle\psi_\alpha \otimes \psi_\beta|A|\psi_\alpha \otimes \psi_\beta\rangle \le \varepsilon W + \langle\psi_\alpha \otimes \psi_\beta|A|\psi_\alpha \otimes \psi_\beta\rangle.$$
Taking supremum over $\psi, \varphi$ gives the result. The estimate on the verticial dimension
follows from Lemma 5.3. □
Remark 9.30. A closer examination of the above proof shows that the bound
$O(d\log d)$ on the verticial dimension allows for an approximation more precise than
the default “up to factor 4” implicit in the definitions from Section 7.2.3. For
example, the argument gives that (in the notation from Exercise 7.15) $\dim_V(\mathrm{Sep}, 1 +
d^{-\kappa}) \le Cd\log d$, where the constant C depends on $\kappa > 0$.

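We will show next that the upper bound obtained in Lemma 9.29 is sharp. This
is in contrast with the case of the symmetrized set of separable states, whose
verticial dimension is of order d (see Lemma 9.4).
Proposition 9.31. Let $\mathrm{Sep} = \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. Then $\dim_V(\mathrm{Sep}) \ge cd\log d$ for
some constant $c > 0$.
Proof. Let P be a polytope with N vertices such that $\frac{1}{4} \bullet \mathrm{Sep} \subset P \subset \mathrm{Sep}$.
By Carathéodory’s theorem, we may write each vertex of P as a combination of $d^4$
extreme points of Sep (which are pure product states, i.e., of the form $|\psi \otimes \varphi\rangle\langle\psi \otimes \varphi|$
for unit vectors $\psi, \varphi \in \mathbb{C}^d$). We obtain therefore a polytope Q which is the convex
hull of $N' \le Nd^4$ pure product states, and such that $\frac{1}{4} \bullet \mathrm{Sep} \subset Q \subset \mathrm{Sep}$. Let
$(|\psi_i \otimes \varphi_i\rangle\langle\psi_i \otimes \varphi_i|)_{1 \le i \le N'}$ be the vertices of Q. Fix $\chi \in S_{\mathbb{C}^d}$ arbitrarily. For any
$\varphi \in S_{\mathbb{C}^d}$, let $\alpha = \max\{|\langle\varphi, \varphi_i\rangle|^2 : 1 \le i \le N'\}$. Consider the linear form
$$g(\rho) = \mathrm{Tr}\big[\rho\,\big(|\chi\rangle\langle\chi| \otimes (\alpha I_{\mathbb{C}^d} - |\varphi\rangle\langle\varphi|)\big)\big].$$
For any $1 \le i \le N'$ we have
$$g(|\psi_i \otimes \varphi_i\rangle\langle\psi_i \otimes \varphi_i|) = |\langle\chi, \psi_i\rangle|^2 (\alpha - |\langle\varphi, \varphi_i\rangle|^2) \ge 0$$
and therefore g is nonnegative on Q. Since $Q \supset \frac{1}{4} \bullet \mathrm{Sep}$, we have
$$0 \le g\Big(\frac{1}{4} \bullet |\chi \otimes \varphi\rangle\langle\chi \otimes \varphi|\Big) = \frac{1}{4}\, g(|\chi \otimes \varphi\rangle\langle\chi \otimes \varphi|) + \frac{3}{4}\, g(\rho_*)$$
$$= \frac{1}{4}(\alpha - 1) + \frac{3}{4} \times \Big(\frac{1}{d}\alpha - \frac{1}{d^2}\Big)$$
$$= \alpha\Big(\frac{1}{4} + \frac{3}{4d}\Big) - \Big(\frac{1}{4} + \frac{3}{4d^2}\Big).$$
It follows that
$$\alpha \ge \frac{1 + \frac{3}{d^2}}{1 + \frac{3}{d}} \ge 1 - \frac{3}{d}.$$
In other words, we proved that for every $\varphi \in S_{\mathbb{C}^d}$ there is an index $i \in \{1, \ldots, N'\}$
such that $|\langle\varphi, \varphi_i\rangle|^2 \ge 1 - 3/d$. This means that $(\varphi_i)_{1 \le i \le N'}$ is a $(C/\sqrt{d})$-net in
the projective space $\mathsf{P}(\mathbb{C}^d)$ equipped with the quotient metric from $(S_{\mathbb{C}^d}, |\cdot|)$. By
Theorem 5.11 (or Exercise 5.10), this implies that $N' \ge (c'\sqrt{d})^{2(d-1)}$, and therefore
$\log N \ge c\, d\log d$ for some constant $c > 0$. □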
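The algebra in the proof above can be verified in exact arithmetic. The sketch below (function names are ours) evaluates $g$ on $\frac14 \bullet |\chi\otimes\varphi\rangle\langle\chi\otimes\varphi| = \frac14 |\chi\otimes\varphi\rangle\langle\chi\otimes\varphi| + \frac34 \rho_*$, using $g(|\chi\otimes\varphi\rangle\langle\chi\otimes\varphi|) = \alpha - 1$ and $g(\rho_*) = \alpha/d - 1/d^2$, and checks that the resulting threshold for $\alpha$ is at least $1 - 3/d$.

```python
from fractions import Fraction

def g_on_shrunk_state(alpha, d):
    """Value of g at (1/4) * pure product state + (3/4) * maximally mixed state."""
    g_pure = alpha - 1                          # g on the pure product state
    g_mixed = alpha / d - Fraction(1, d * d)    # g on rho_*
    return Fraction(1, 4) * g_pure + Fraction(3, 4) * g_mixed

def alpha_threshold(d):
    """Smallest alpha with g >= 0, i.e. (1 + 3/d^2) / (1 + 3/d)."""
    return (1 + Fraction(3, d * d)) / (1 + Fraction(3, d))

for d in (2, 3, 10, 100):
    a0 = alpha_threshold(d)
    # threshold, value of g at the threshold, and the weaker bound 1 - 3/d
    print(d, a0, g_on_shrunk_state(a0, d), 1 - Fraction(3, d))
```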
We conclude this section by stating an estimate on the facial dimension of Sep.
Corollary 9.32. Let $\mathrm{Sep} = \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. Then
$$cd^3/\log d \le \dim_F(\mathrm{Sep}) \le Cd^4 \tag{9.49}$$
for some absolute constants $C, c > 0$.
Proof. We use the Figiel–Lindenstrauss–Milman inequality (Theorem 7.29).
Recall that the dimension of Sep equals $d^4 - 1$. The asphericity of Sep is bounded
from above by the ratio $\mathrm{outrad}(\mathrm{Sep})/\mathrm{inrad}(\mathrm{Sep})$ (see Table 9.1 for the values of the
radii; as indicated in Table 9.4, there is actually equality, see Exercise 9.14). Since
the value of this ratio is $d^2 - 1$, it follows that
$$\dim_F(\mathrm{Sep})\,\dim_V(\mathrm{Sep}) \ge cd^4. \tag{9.50}$$
The lower bound in (9.49) is immediate from (9.50) and Lemma 9.29, while the
upper bound follows from Proposition 7.27(iv). □
Problem 9.33. Is it true that $\dim_F(\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)) = \Theta(d^4)$?
Exercise 9.14 (Asphericity of Sep). Prove that the asphericity of $\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$
equals $d^2 - 1$.
9.4.3. Exponentially many entanglement witnesses are necessary. We
conclude this chapter by showing that Dvoretzky’s theorem (applied via the Figiel–
Lindenstrauss–Milman inequality (7.17)) implies that the set of separable states is
complex in the following sense: super-exponentially many entanglement witnesses
are necessary to approximate it within a constant factor.
In this section we write D, PSD and Sep for $\mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$, $\mathcal{PSD}(\mathbb{C}^d \otimes \mathbb{C}^d)$ and
$\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. We denote by $\mathbf{P}(\mathbb{C}^d)$ the cone of positivity-preserving operators from
$M_d$ to $M_d$. Recall the statement of Theorem 2.34: a state $\rho \in \mathrm{D}$ is entangled if and
only if there exists an entanglement witness, i.e., $\Phi \in \mathbf{P}(\mathbb{C}^d)$ such that $(\Phi \otimes \mathrm{Id})(\rho)$
is not positive. In other words
$$\mathrm{Sep} = \bigcap_{\Phi \in \mathbf{P}(\mathbb{C}^d)} \{\rho \in \mathrm{D} : (\Phi \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}\}. \tag{9.51}$$
It is natural to wonder whether the intersection in (9.51) can be taken over a
smaller subfamily. For $d = 2$, two superoperators suffice, namely Id and T; this is
the content of Størmer’s theorem. It is known that for $d \ge 3$ an infinite family is
needed. If we consider instead the isomorphic version of the problem, the following
theorem shows that super-exponentially (in the dimension of the underlying Hilbert
space) many witnesses are necessary.
Theorem 9.34. There is a constant $c > 0$ such that the following holds: if
$\Phi_1, \ldots, \Phi_N \in \mathbf{P}(\mathbb{C}^d)$ are such that
$$\bigcap_{i=1}^{N} \{\rho \in \mathrm{D} : (\Phi_i \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}\} \subset 2 \bullet \mathrm{Sep}, \tag{9.52}$$
then $N + 1 \ge \exp(cd^3/\log(d))$.
The following variant of the above theorem also holds.
Theorem 9.35 (see Exercise 9.16). There are universal constants $c_0, c > 0$
such that the following holds: if $\Phi_1, \ldots, \Phi_N \in \mathbf{P}(\mathbb{C}^d)$ are such that
$$\bigcap_{i=1}^{N} \{\rho \in \mathrm{D} : (\Phi_i \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}\} \subset \frac{c_0\sqrt{d}}{\log d} \bullet \mathrm{Sep}, \tag{9.53}$$
then $N + 1 \ge \exp(cd^2\log d)$.
In other words, even being able to detect very robust entanglement requires
super-exponentially many witnesses. It would be of some interest to determine the
maximal robustness level (defined in (9.26)) at which this phenomenon still persists.
Note that, by Proposition 9.17, $\mathrm{D} \subset \big(1 + \frac{d^2}{2}\big) \bullet \mathrm{Sep}$ for states on $\mathbb{C}^d \otimes \mathbb{C}^d$, so the
question is nontrivial only if a threshold for the robustness level is smaller than $\frac{d^2}{2}$.
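Proof of Theorem 9.34. Without loss of generality, we may assume that
each superoperator $\Phi_i$ is unital (see Exercise 9.15). We use the following lemma.
Lemma 9.36. Let $\Phi \in \mathbf{P}(\mathbb{C}^d)$ be unital. Then for any $\rho \in \mathrm{D}$,
$$0 \le \mathrm{Tr}\,[(\Phi \otimes \mathrm{Id})\rho] \le d.$$
Proof. Since linear forms achieve their extrema on extreme points of convex
compact sets, we may assume that $\rho = |\psi\rangle\langle\psi|$ is pure. Let $\psi = \sum \lambda_i e_i \otimes f_i$ be the
Schmidt decomposition of $\psi$. We compute
$$\mathrm{Tr}\,[(\Phi \otimes \mathrm{Id})\rho] = \sum_{i=1}^{d} \lambda_i^2\, \mathrm{Tr}\,\Phi(|e_i\rangle\langle e_i|) \le d,$$
where the last inequality follows from the facts that $\sum \lambda_i^2 = 1$ and $\Phi(|e_i\rangle\langle e_i|) \le
\Phi(I) = I$. The lower bound holds because each $\Phi(|e_i\rangle\langle e_i|)$ is a positive operator. □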

Let $\varepsilon = 1/(1 + d)$. Let P be a polytope with at most $\exp(C_0 d^2\log(d))$ facets
such that
$$(1 - \varepsilon) \bullet \mathrm{D} \subset P \subset \mathrm{D}. \tag{9.54}$$
The existence of P is guaranteed by Lemma 9.27, by the relation $\mathrm{D}^\circ = (-d^2) \bullet \mathrm{D}$
and by the fact that facets of P are in bijection with vertices of $P^\circ$ (see Section
1.1.5). Introduce the convex body
$$K_i = \{\rho \in \mathrm{D} : (\Phi_i \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}\} = \mathrm{D} \cap (\Phi_i \otimes \mathrm{Id})^{-1}(\mathcal{PSD})$$
(note that $\mathrm{Sep} \subset K_i$) and the polyhedral cone
$$C_i := \big\{A \in B^{sa}(\mathbb{C}^d \otimes \mathbb{C}^d) : (\Phi_i \otimes \mathrm{Id})(A) \in \mathbb{R}_+ P\big\}. \tag{9.55}$$
We claim that
$$\frac{1}{2} \bullet K_i \subset P \cap C_i \subset K_i. \tag{9.56}$$
Before proving the claim, let us first show how it implies the Theorem. Combining
(9.56) and (9.52) we obtain
$$\frac{1}{2} \bullet \mathrm{Sep} \subset \bigcap_{i=1}^{N} \Big(\frac{1}{2} \bullet K_i\Big) \subset \bigcap_{i=1}^{N} (P \cap C_i) = P \cap \bigcap_{i=1}^{N} C_i \subset \bigcap_{i=1}^{N} K_i \subset 2 \bullet \mathrm{Sep}.$$
The polytope $R = P \cap \bigcap_{1 \le i \le N} C_i$ has at most $f := (N + 1)\exp(C_0 d^2\log d)$ facets.
Consequently, by the definition of the facial dimension (see Section 7.2.3), we must
have $\log f \ge \dim_F(\mathrm{Sep})$ and so
$$\log(N + 1) + C_0 d^2\log d \ge \dim_F(\mathrm{Sep}).$$
Since we know from Corollary 9.32 that $\dim_F(\mathrm{Sep}) = \Omega(d^3/\log d)$, it follows that
$\log(N + 1) \ge cd^3/\log d$ for d large enough. Since small values of d can be taken
into account by adjusting the constant c if necessary, this implies the Theorem.
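It remains to prove the claimed inclusions (9.56). The second inclusion is immediate
from the definitions and from (9.54). For the first inclusion, it is clearly enough
to show that $\frac{1}{2} \bullet K_i \subset C_i$. To that end, let $\rho \in K_i$ and denote $t = \mathrm{Tr}\,[(\Phi_i \otimes \mathrm{Id})\rho] \ge 0$.
We now consider two cases. First, if $t = 0$, then (since $(\Phi_i \otimes \mathrm{Id})(\rho)$ is a positive
operator) we must have $(\Phi_i \otimes \mathrm{Id})(\rho) = 0$. Hence trivially $\rho \in C_i$ and, a fortiori,
$\frac{1}{2} \bullet \rho \in C_i$. If $t > 0$, we note that $t^{-1}(\Phi_i \otimes \mathrm{Id})(\rho) \in \mathrm{D}$ and that, by Lemma 9.36, we
have $t \le d$, and therefore $\frac{t}{1+t} = 1 - \frac{1}{1+t} \le 1 - \frac{1}{1+d} = 1 - \varepsilon$. It thus follows from
(9.54) that
$$\frac{t}{1+t} \bullet t^{-1}(\Phi_i \otimes \mathrm{Id})(\rho) \in \frac{t}{1+t} \bullet \mathrm{D} \subset (1 - \varepsilon) \bullet \mathrm{D} \subset P.$$
It remains to notice that
$$\frac{t}{1+t} \bullet t^{-1}(\Phi_i \otimes \mathrm{Id})(\rho) = \frac{(\Phi_i \otimes \mathrm{Id})(\rho) + \rho_*}{1+t} = \frac{2}{1+t}\,(\Phi_i \otimes \mathrm{Id})\Big(\frac{\rho + \rho_*}{2}\Big),$$
which means that we showed that $(\Phi_i \otimes \mathrm{Id})\big(\frac{1}{2} \bullet \rho\big) \in \frac{1+t}{2} P$. In particular (cf.
(9.55)), $\frac{1}{2} \bullet \rho \in C_i$, as needed. □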

Exercise 9.15 (Unital witnesses suffice). Let $\Phi \in \mathbf{P}(\mathbb{C}^d)$. Show that there is
a unital map $\Psi \in \mathbf{P}(\mathbb{C}^d)$ with the property that, for any $\rho \in \mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$,
$$(\Phi \otimes \mathrm{Id})(\rho) \in \mathcal{PSD} \iff (\Psi \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}.$$
Exercise 9.16 (Detecting very robust entanglement is also hard). (i) Show
that, in the notation of Exercise 7.15, we have $\dim_F(\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d), A) \ge c\,d^3 A^{-2}/\log d$
for every $A > 1$, where $c > 0$ is an absolute constant. (ii) Prove Theorem 9.35.
Notes and Remarks
Section 9.1. The exact formula (9.4) for the volume of D appears in [ŻS03].
The question of computing exactly the volume of Sep was asked in [ŻHSL98] and
seems challenging already in the bipartite case. A conjecture by Slater [Sla12],
strongly supported by numerical evidence, is that for $\mathcal{H} = \mathbb{C}^2 \otimes \mathbb{C}^2$, one has
$\mathrm{vol}(\mathrm{Sep})/\mathrm{vol}(\mathrm{D}) = 8/33$.
Theorems 9.3, 9.12 and 9.13 are from [AS06]; Theorem 9.11 appeared earlier
in [Sza05]. Theorem 9.6 and its corollary about block-positive matrices are from
[ASY14], and will be crucial in Chapter 10. The same question for multipartite
Hilbert spaces or unbalanced bipartite Hilbert spaces was also studied in [ASY14];
an extra ingredient needed is the fact that $P_F \mathrm{Sep} = \mathrm{Sep} \cap F$ for certain subspaces
F, see Exercise 9.11(iii).
Volume and mean width estimates for the hierarchies of states introduced in
Section 2.2.5 are also known. For $1 \le k \le d$, denote by $\mathrm{Ent}_k$ the set of k-entangled
states in $\mathbb{C}^d \otimes \mathbb{C}^d$. It is proved in [SWŻ11] that
$$\frac{c\,k^{1/2}}{d^{3/2}} \le \mathrm{vrad}(\mathrm{Ent}_k) \le w(\mathrm{Ent}_k) \le \frac{C\,k^{1/2}}{d^{3/2}}, \tag{9.57}$$
which is of course compatible with the extreme cases $\mathrm{Ent}_1 = \mathrm{Sep}$ and $\mathrm{Ent}_d = \mathrm{D}$.
Similarly, if $\mathrm{Ext}_k$ denotes the set of k-extendible states on $\mathbb{C}^d \otimes \mathbb{C}^d$, it is proved
in [Lan16] that for each fixed k, as $d \to \infty$,
$$w(\mathrm{Ext}_k) \sim \frac{2}{d\sqrt{k}}. \tag{9.58}$$
Note that $\mathrm{D} = \mathrm{Ext}_1$ and $\mathrm{Sep} = \bigcap\{\mathrm{Ext}_k : k \ge 1\}$. However the implicit dependence
on k in (9.58) does not allow one to recover Theorem 9.3 as $k \to \infty$.
Section 9.2. Theorem 9.15 was proved in [GB02]. The proof we present is
due to Hans-Jürgen Sommers and appears in [Som09]; the equivalence between
Theorem 9.15 and the inequality $\mathrm{Tr}(M^2) \le (\mathrm{Tr}\, M)^2$ for a block-positive matrix M
has been noted in [SWŻ08]. The alternative argument from Exercise 9.8 is from
[Wat].
Proposition 9.17 (in the language of robustness) has been proved by Vidal and
Tarrach [VT99]. Proposition 9.18 is from [Jen13]. The result from Theorem
9.19 is due to Beigi and Shor [BS10] and relies on the quantum de Finetti theorem.
Another argument, yielding better quantitative estimates, was presented in
[BHH+14] and was based on the concept of private states. Proposition 9.20 is also
from [BHH+14].
Both inequalities from Theorem 9.21 are due to Hildebrand ([Hil06] for the
lower bound and [Hil07a] for the upper bound), improving on previous results by
Gurvits and Barnum [GB03, GB05] (the lower bound) and [AS06] (the upper
bound, cf. the proof of Proposition 9.22).
The question of determining the exact order of $d_g(\mathrm{Sep}, \mathrm{D})$ for many qubits
(cf. Proposition 9.22) deserves attention since it can be connected to feasibility
of nuclear magnetic resonance (NMR) quantum information protocols (see, e.g.,
[GB05]).
Section 9.3. Theorem 9.23 was derived in [SWŻ08], to which we refer for
precise estimates for the constants implicit in the $\Theta(\cdot)$ notation from Table 9.3.
Another class of superoperators for which volume estimates are known is the
class of k-positive maps. Indeed, this class is essentially dual to the class of
k-entangled operators (see Exercise 2.48). It was proved in [SWŻ11], as a consequence
of (9.57), that if $\mathbf{P}_{k,TP}$ denotes the set of k-positive trace-preserving maps
from $M_d$ to itself, then
$$c\sqrt{k/d} \le \mathrm{vrad}(\mathbf{P}_{k,TP}) \le C\sqrt{k/d}.$$
Section 9.4. The results from this section are from [AS17]. The fact that for
$d \ge 3$ the intersection in (9.51) cannot be restricted to a finite subfamily has been
proved in [Sko16] and is based on [HK11].
CHAPTER 10

Random Quantum States

The main goal of this chapter is to prove the following result. Consider a system
of N identical particles (e.g., N qubits) in a random pure state. For some $k \le N/2$,
let A and B be two subsystems, each consisting of k particles. There exists a
threshold function $k_0(N)$ which satisfies $k_0(N) \sim N/5$ as $N \to \infty$ and such that
the following holds. If $k < k_0(N)$, then with high probability the two subsystems
A and B share entanglement. Conversely, if $k > k_0(N)$, then with high probability
the two subsystems A and B do not share entanglement.
If the Hilbert space associated to a single particle is $\mathbb{C}^q$ (e.g., $q = 2$ for qubits),
the dimension of the system $A \otimes B$ equals $q^{2k}$ and the state $\rho$ describing the $A \otimes B$
subsystem is obtained as a partial trace over an environment of dimension $q^{N-2k}$
(the remaining $N - 2k$ particles). If the global system is in a random and uniformly
distributed pure state, the state $\rho$ is a random induced state as introduced in Section
6.2.3.4, where its distribution was denoted by $\mu_{q^{2k}, q^{N-2k}}$. The central result of the
chapter (Theorem 10.12) answers the question whether a random induced state on
$\mathbb{C}^d \otimes \mathbb{C}^d$ with distribution $\mu_{d^2,s}$ is separable or entangled. It relies on the volume
and mean width estimates from Chapter 9.
Section 10.3 contains results about other thresholds for random induced states:
for the PPT vs. non-PPT dichotomy (Theorem 10.17) and for the value of the
entanglement of formation being close to maximal or close to minimal (Theorem
10.16).
10.1. Miscellaneous tools
The first sections of this chapter contain an intermediate result (a quantitative
central limit theorem) about approximation of random induced states by Gaussian
matrices (Proposition 10.6). As a tool, we present some majorization inequalities
in Section 10.1.1.
10.1.1. Majorization inequalities. Majorization was introduced in Section
1.3.1. We first state a technical result that ascertains that “flat” vectors (i.e., vectors
with a large $\ell_1$-norm and small $\ell_\infty$-norm) majorize many other vectors. Since we
need to consider homotheties, it is natural to work in $\mathbb{R}^{n,0}$, the hyperplane of $\mathbb{R}^n$
consisting of vectors whose coordinates add up to 0.
Lemma 10.1. Let $x, y \in \mathbb{R}^{n,0}$. Assume that $\|y\|_\infty \le 1$ and $\|y\|_1 \ge \alpha n$ for some
$\alpha \in (0, 1]$. Then
$$x \prec (2/\alpha - 1)\|x\|_\infty\, y. \tag{10.1}$$
Proof of Lemma 10.1. By homogeneity, it is enough to verify that the condition
$\|x\|_\infty \le 1$ implies $x \prec (2/\alpha - 1)y$. Moreover, it is enough to check this for
x being an extreme point of the set $A := \{x \in \mathbb{R}^{n,0} : \|x\|_\infty \le 1\}$, since the set
$\{x \in \mathbb{R}^{n,0} : x \prec z\}$ is convex for any $z \in \mathbb{R}^{n,0}$.
Extreme points of A are of the following form: $\lfloor n/2 \rfloor$ coordinates are equal to
1 and $\lfloor n/2 \rfloor$ coordinates equal to $-1$. In the case of odd n there is one remaining
coordinate, which is necessarily equal to 0. It is thus enough to verify that if x is
of that form, and if y satisfies $\|y\|_\infty \le 1$ and $\|y\|_1 = \alpha n$, then $x \prec (2/\alpha - 1)y$. This
is shown by establishing that an average of permutations of y is a multiple of x.
First, average separately the positive and the negative coordinates of y to obtain
a vector $y'$ whose coordinates take only two values, one positive and one negative.
Since the $\ell_1$-norm of the positive and the negative part of $y'$ is equal and amounts
to $\alpha n/2$, the support of each part must be at least $\alpha n/2$ and at most $(1 - \alpha/2)n$,
and the absolute value of each coordinate at least $\alpha/(2 - \alpha)$.
Assume now that n is even. Next, select a set of $n/2$ equal coordinates (positive
or negative, depending on which part has larger support) and average the remaining
ones. The obtained vector is a multiple of an extreme point, as needed. If n is odd,
select $\lfloor n/2 \rfloor$ equal coordinates (from the dominant sign) and average the remaining
ones to produce one zero and $\lfloor n/2 \rfloor$ equal coordinates. The resulting vector is also
a multiple of an extreme point. □
A simpler but less precise version of Lemma 10.1 can be obtained without any
hypothesis on $\|y\|_\infty$.

Lemma 10.2. Let $x, y \in \mathbb{R}^{n,0}$ with $y \neq 0$. Then
(10.2)    $$x \prec \frac{2n\|x\|_\infty}{\|y\|_1}\, y.$$
Proof. By homogeneity, we may assume that $\|y\|_\infty = 1$ and the result follows
from Lemma 10.1. □


As a consequence, we obtain the fact that if two vectors from $\mathbb{R}^{n,0}$ are flat and
close to each other, one is majorized by a small perturbation of the other one.

Proposition 10.3. Let $x, y \in \mathbb{R}^{n,0}$. Assume that $\|x - y\|_\infty \le \varepsilon$ and $\|y\|_1 \ge \alpha n$
for some $\alpha > 0$. Then
    $$x \prec \left(1 + \frac{2\varepsilon}{\alpha}\right) y.$$
Proof. We use the following elementary property of majorization: if $x_1 \prec \lambda_1 y$
and $x_2 \prec \lambda_2 y$ for some positive $\lambda_1, \lambda_2$, then $x_1 + x_2 \prec (\lambda_1 + \lambda_2) y$. We apply this
fact with $x_1 = y$, $\lambda_1 = 1$ and $x_2 = x - y$. Lemma 10.2 shows that we can choose
$\lambda_2 = 2\varepsilon/\alpha$, and the Proposition follows. □

Exercise 10.1. Provide an alternative proof of Lemma 10.2 by using directly
the definition of majorization.
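
The majorization statements above lend themselves to a quick numerical sanity check. The following is a minimal sketch, not part of the original text: it assumes the usual characterization of majorization for vectors with equal coordinate sums (partial sums of the decreasing rearrangement), and the helper names `is_majorized`, `lemma_10_2_holds` and `proposition_10_3_holds` are hypothetical. It tests Lemma 10.2 and Proposition 10.3 on random sum-zero vectors, taking $\varepsilon = \|x - y\|_\infty$ and $\alpha = \|y\|_1/n$.

```python
import random

def is_majorized(x, y, tol=1e-9):
    """Return True if x is majorized by y (x ≺ y).

    Assumes the standard characterization: equal coordinate sums, and every
    partial sum of the decreasing rearrangement of x is at most the
    corresponding partial sum for y (up to a small numerical tolerance).
    """
    if len(x) != len(y) or abs(sum(x) - sum(y)) > tol:
        return False
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    partial_x = partial_y = 0.0
    for a, b in zip(xs, ys):
        partial_x += a
        partial_y += b
        if partial_x > partial_y + tol:
            return False
    return True

def random_sum_zero_vector(n, rng):
    """Random vector in the hyperplane of vectors whose coordinates add up to 0."""
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mean = sum(v) / n
    return [t - mean for t in v]

def lemma_10_2_holds(x, y):
    """Check x ≺ (2 n ||x||_inf / ||y||_1) y."""
    n = len(x)
    factor = 2 * n * max(abs(t) for t in x) / sum(abs(t) for t in y)
    return is_majorized(x, [factor * t for t in y])

def proposition_10_3_holds(x, y):
    """Check x ≺ (1 + 2 eps / alpha) y with eps = ||x - y||_inf, alpha = ||y||_1 / n."""
    n = len(x)
    eps = max(abs(a - b) for a, b in zip(x, y))
    alpha = sum(abs(t) for t in y) / n
    return is_majorized(x, [(1 + 2 * eps / alpha) * t for t in y])

rng = random.Random(0)
pairs = [(random_sum_zero_vector(8, rng), random_sum_zero_vector(8, rng))
         for _ in range(500)]
all_ok = all(lemma_10_2_holds(x, y) and proposition_10_3_holds(x, y)
             for x, y in pairs)
print(all_ok)
```

Such a check does not replace the proofs, of course, but it is a convenient way to catch a misread constant.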
10.1.2. Spectra and norms of unitarily invariant random matrices. A
lot of information about a self-adjoint matrix can be retrieved from its spectrum; for
example, all unitarily invariant norms can be computed if one knows the eigenvalues
(see Section 1.3.2). In contrast, computing the values of other norms or gauges (e.g.,
the gauge associated to the set of separable states) usually requires some knowledge
about the eigenvectors.
However, if the matrix is random and if its distribution is unitarily invariant,
it is possible to circumvent this difficulty. Heuristically, the principle we are going
to establish and use is as follows: if $A$ and $B$ are two unitarily invariant random
matrices with similar spectra, then, for any norm or gauge $\|\cdot\|$, the typical values
of $\|A\|$ and of $\|B\|$ are comparable.
It is convenient to work in the hyperplane $M_n^{\mathrm{sa},0}$ of self-adjoint complex $n \times n$
matrices with trace zero. One says that an $M_n^{\mathrm{sa},0}$-valued random variable $A$ is
unitarily invariant if, for any $U \in \mathsf{U}(n)$, the random matrices $A$ and $UAU^\dagger$ have
the same distribution. Recall also that $\mu_{SC}$ is the standard semicircular distribution,
that $\mu_{sp}(A)$ is the empirical spectral distribution of a self-adjoint matrix $A$, and
that $d_\infty$ denotes the $\infty$-Wasserstein distance. All these concepts were introduced
in Section 6.2.

Proposition 10.4. Let $A$ and $B$ be two $M_n^{\mathrm{sa},0}$-valued random variables which
are unitarily invariant and satisfy the following conditions
(10.3)    $$\mathbf{P}\big(d_\infty(\mu_{sp}(A), \mu_{SC}) \le \varepsilon\big) \ge 1 - p \quad\text{and}\quad \mathbf{E}\, d_\infty(\mu_{sp}(A), \mu_{SC}) \le \varepsilon$$
for some $\varepsilon, p \in (0, 1)$, and similarly for $B$. Then, for any convex body $K \subset M_n^{\mathrm{sa},0}$
containing the origin in its interior,
    $$\frac{1-p}{1 + C\varepsilon}\, \mathbf{E}\|A\|_K \le \mathbf{E}\|B\|_K \le \frac{1 + C\varepsilon}{1-p}\, \mathbf{E}\|A\|_K$$
for some absolute constant $C$.
Proof of Proposition 10.4. Note that possible relations between $A$ and
$B$ (such as independence) are irrelevant in the present situation. Consider the
following function on $\mathbb{R}^{n,0}$ (recall that $\mathbb{R}^{n,0}$ denotes the hyperplane of vectors of
sum zero in $\mathbb{R}^n$)
    $$\varphi(x) = \mathbf{E}\, \|U \operatorname{Diag}(x) U^\dagger\|_K,$$
where $U \in \mathsf{U}(n)$ denotes a Haar-distributed random unitary matrix (independent of
everything else) and $\operatorname{Diag}(x)$ is the diagonal matrix whose $(i,i)$-th entry is $x_i$. Unitary
invariance implies that
(10.4)    $$\mathbf{E}\|A\|_K = \mathbf{E}\, \varphi(\operatorname{spec}(A))$$
and similarly for $B$ (see Exercise 10.2). Let $E$ be the event $\{d_\infty(\mu_{sp}(B), \mu_{SC}) \le \varepsilon\}$.
Assume for the moment that $E$ holds; we then have (see Exercise 6.25)
    $$\|B\|_1 = n \int |x| \, d\mu_{sp}(B)(x) \ge n \int_{-2}^{2} (|x| - \varepsilon)_+ \, d\mu_{SC}(x) \ge n \int_{-2}^{2} (|x| - 1)_+ \, d\mu_{SC}(x) = \alpha n,$$
$\alpha \approx 0.16$ being a numerical constant. Applying Proposition 10.3 to the vectors
$\operatorname{spec}(A)$ and $\operatorname{spec}(B)$, we conclude that (with $C = 2/\alpha$)
    $$\operatorname{spec}(A) \prec \big(1 + C\, d_\infty(\mu_{sp}(A), \mu_{sp}(B))\big) \operatorname{spec}(B).$$
Since $\varphi$ is convex and permutationally invariant, it follows that
    $$\varphi(\operatorname{spec}(A)) \le \big(1 + C\, d_\infty(\mu_{sp}(A), \mu_{sp}(B))\big)\, \varphi(\operatorname{spec}(B)).$$
Using the fact that $d_\infty(\mu_{sp}(A), \mu_{sp}(B)) \le \varepsilon + d_\infty(\mu_{sp}(A), \mu_{SC})$ and taking expectation
over $A$ yields
    $$\mathbf{E}\, \varphi(\operatorname{spec}(A)) \le (1 + 2C\varepsilon)\, \varphi(\operatorname{spec}(B)).$$
Recall that the above inequality is true conditionally on $E$. Consequently,
    $$\mathbf{E}\, \varphi(\operatorname{spec}(B)) \ge \mathbf{E}\, \varphi(\operatorname{spec}(B)) \mathbf{1}_E \ge (1 + 2C\varepsilon)^{-1}\, \mathbf{P}(E)\, \mathbf{E}\, \varphi(\operatorname{spec}(A)).$$
In view of (10.4) and since $\mathbf{P}(E) \ge 1 - p$ by hypothesis, this shows that
    $$\mathbf{E}\|A\|_K \le \frac{1 + 2C\varepsilon}{1-p}\, \mathbf{E}\|B\|_K.$$
The other inequality follows by symmetry. □
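
The numerical constant $\alpha \approx 0.16$ used in the proof can be recovered directly. The sketch below is an illustration rather than part of the original argument: it assumes that the standard semicircular law has density $\sqrt{4 - x^2}/(2\pi)$ on $[-2, 2]$ (consistent with the integration range in the proof), and compares a midpoint-rule approximation of $\int_{-2}^{2} (|x| - 1)_+ \, d\mu_{SC}(x)$ with the closed form $(3\sqrt{3}/2 - 2\pi/3)/\pi$ obtained by elementary integration. The function names are hypothetical.

```python
import math

def semicircle_density(x):
    """Density of the standard semicircular law, assumed supported on [-2, 2]."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)

def alpha_numeric(steps=200_000):
    """Midpoint-rule approximation of the integral of (|x| - 1)_+ against mu_SC."""
    h = 4.0 / steps
    total = 0.0
    for i in range(steps):
        x = -2.0 + (i + 0.5) * h
        total += max(abs(x) - 1.0, 0.0) * semicircle_density(x) * h
    return total

# Closed form: (1/pi) * integral over [1, 2] of (x - 1) * sqrt(4 - x^2) dx,
# using the antiderivatives -(4 - x^2)^(3/2) / 3 and
# (x / 2) * sqrt(4 - x^2) + 2 * asin(x / 2).
alpha_exact = (3.0 * math.sqrt(3.0) / 2.0 - 2.0 * math.pi / 3.0) / math.pi

print(alpha_numeric(), alpha_exact)
```

With this value of $\alpha$, the constant $C = 2/\alpha$ appearing in the proof is roughly 12.5.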

If $\varepsilon$ is large (2 or larger), the hypothesis $d_\infty(\mu_{sp}(A), \mu_{SC}) \le \varepsilon$ does not prevent
$A$ from being identically zero. However, an isomorphic version of Proposition 10.4
can be similarly obtained under the hypothesis that the spectra of $A$ and $B$ are
reasonably flat.

Proposition 10.5 (see Exercise 10.3). Let $A$ and $B$ be two $M_n^{\mathrm{sa},0}$-valued random
variables which are unitarily invariant. Assume that
(10.5)    $$\mathbf{P}(\|A\|_1 \ge c_1 n) \ge 1 - p \quad\text{and}\quad \mathbf{E}\|A\|_\infty \le C_2,$$
and similarly for $B$. Then, for any convex body $K \subset M_n^{\mathrm{sa},0}$ containing the origin in
the interior,
    $$C^{-1}\, \mathbf{E}\|A\|_K \le \mathbf{E}\|B\|_K \le C\, \mathbf{E}\|A\|_K$$
with $C = (1 - p)^{-1}(2C_2/c_1)$.
Exercise 10.2 (Retrieving unitarily invariant distributions from the spectrum).
Let $A$ be an $M_n^{\mathrm{sa},0}$-valued random variable which is unitarily invariant. Recall
that $\operatorname{Diag}(\operatorname{spec}(A))$ is the diagonal matrix whose diagonal entries are the
eigenvalues of $A$ arranged in the non-increasing order. Let $U \in \mathsf{U}(n)$ be a
Haar-distributed random unitary matrix independent of $A$. Show that the random matrix
$U \operatorname{Diag}(\operatorname{spec}(A)) U^\dagger$ has the same distribution as $A$.

Exercise 10.3 (All flat unitarily invariant distributions look alike). Prove
Proposition 10.5.

10.1.3. Gaussian approximation to induced states. We are going to investigate
typical properties of random induced states, in the large dimension regime.
Their spectral properties were discussed in Section 6.2.3, and are described either
by the Marčenko–Pastur distribution (when $s$ is proportional to $n$) or by the
semicircular distribution (when $s \gg n$).
However, we are also interested in properties that cannot be inferred from the
spectrum (the main example being separability vs. entanglement on a bipartite
system). In this context, it is useful to compare induced states with their Gaussian
approximation. Indeed, the Gaussian model allows one to connect with tools from
convex geometry, such as the mean width.
It is convenient to work in the hyperplane $M_n^{\mathrm{sa},0}$ and to consider the shifted
operators $\rho - I/n$, which we compare with a GUE$_0$ random matrix (see Section
6.2.2). The following proposition compares the expected value of any norm (or
gauge) computed for both models.
Proposition 10.6. Given integers $n, s$, denote by $\rho_{n,s}$ a random induced state
on $\mathbb{C}^n$ with distribution $\mu_{n,s}$, and by $G_n$ an $n \times n$ GUE$_0$ random matrix. Let $C_{n,s}$ be
the smallest constant such that the following holds: for any convex body $K \subset M_n^{\mathrm{sa},0}$
containing 0 in the interior,
(10.6)    $$C_{n,s}^{-1}\, \mathbf{E}\left\|\frac{G_n}{n\sqrt{s}}\right\|_K \le \mathbf{E}\left\|\rho_{n,s} - \frac{I}{n}\right\|_K \le C_{n,s}\, \mathbf{E}\left\|\frac{G_n}{n\sqrt{s}}\right\|_K.$$
Then
(i) For any sequences $(n_k)$ and $(s_k)$ such that $\lim_{k\to\infty} n_k = \lim_{k\to\infty} s_k/n_k = \infty$, we
have $\lim_{k\to\infty} C_{n_k,s_k} = 1$.
(ii) For any $a > 0$, we have $\sup\{C_{n,s} : s \ge an\} < \infty$.

Remark 10.7. We emphasize that the quantity $\mathbf{E}\|G_n\|_K$ appearing in (10.6) is
exactly the Gaussian mean width of the polar set $K^\circ$. Indeed, the standard Gaussian
vector in the space $M_n^{\mathrm{sa},0}$ (equipped with the Hilbert–Schmidt scalar product, as
always) is exactly a GUE$_0$ random matrix. In view of (4.32), we could have equivalently
formulated Proposition 10.6 using the usual mean width: if $\tilde{C}_{n,s}$ denotes the
smallest constant such that the inequalities
(10.7)    $$\tilde{C}_{n,s}^{-1}\, \frac{w(K^\circ)}{\sqrt{s}} \le \mathbf{E}\left\|\rho_{n,s} - \frac{I}{n}\right\|_K \le \tilde{C}_{n,s}\, \frac{w(K^\circ)}{\sqrt{s}}$$
are true for every convex body $K$ containing 0 in the interior, then the conclusions of
Proposition 10.6 hold for $\tilde{C}_{n,s}$ instead of $C_{n,s}$.
Proof. It is easy to check that (10.6) holds for some $C_{n,s} < +\infty$ if $n$ and $s$
are fixed (see Exercise 10.4). Moreover, we know from Theorem 6.35(i) that, for
every fixed $n$,
(10.8)    $$\sup\{C_{n,s} : s \in \mathbb{N}\} < +\infty.$$
(i) Assume that $n = n_k$ and $s = s_k$, with $n_k$ and $s_k/n_k$ both tending to infinity, and
denote $A_k = \sqrt{ns}\,(\rho_{n,s} - I/n)$ and $B_k = G_n/\sqrt{n}$. Consider the random variables
$X_k = d_\infty(\mu_{sp}(A_k), \mu_{SC})$ and $Y_k = d_\infty(\mu_{sp}(B_k), \mu_{SC})$. We know from Theorem 6.23
and Theorem 6.35(iii) that $X_k$ and $Y_k$ converge to zero in probability. We also
claim that $\lim \mathbf{E} X_k = \lim \mathbf{E} Y_k = 0$; this follows from the fact that $X_k \le 2 + \|A_k\|$,
$Y_k \le 2 + \|B_k\|$ and from Proposition 6.24 and Proposition 6.33. Part (i) now follows
from Proposition 10.4.
(ii) Let $A_k$ and $B_k$ be as before, but now we only assume that $s_k \ge a n_k$ for some
$a > 0$. We argue by contradiction: suppose that $C_{n_k,s_k}$ tends to infinity. We
know from (10.8) that the sequence $(n_k)$ cannot be bounded, so we may assume
$\lim_k n_k = +\infty$. Similarly, using part (i), we may assume that $s_k/n_k$ is bounded,
and therefore (by passing to a subsequence) that $\lim s_k/n_k = \lambda \in [a, \infty)$. We
know from Theorem 6.35(ii) and Theorem 6.23 that $\mu_{sp}(A_k)$ and $\mu_{sp}(B_k)$ converge
in probability towards a nontrivial deterministic limit, and therefore satisfy the
hypotheses of Proposition 10.5 for some constants $p, c_1, C_2$. □

Exercise 10.4. Let $X$ and $Y$ be two $\mathbb{R}^n$-valued random vectors with the property
that, for any $\theta \in S^{n-1}$, we have $0 < \mathbf{E}\,|\langle X, \theta\rangle| < +\infty$ and $0 < \mathbf{E}\,|\langle Y, \theta\rangle| < +\infty$.
Show that there exists a constant $C$ (depending on $n, X, Y$) such that, for any
convex body $K$ containing the origin in the interior, we have $\mathbf{E}\|X\|_K \le C\, \mathbf{E}\|Y\|_K$.
10.1.4. Concentration for gauges of induced states. We present a concentration
result valid for any gauge evaluated on random induced states.

Proposition 10.8. Let $s \ge n$, let $K \subset \mathrm{D}(\mathbb{C}^n)$ be a convex body with inradius
$r$, and let $\rho$ be a random state with distribution $\mu_{n,s}$. Let $M$ be the median of
$\|\rho - I/n\|_{K_0}$, with $K_0 = K - I/n$. Then, for every $\eta > 0$,
    $$\mathbf{P}\left( \left| \left\|\rho - \frac{I}{n}\right\|_{K_0} - M \right| \ge \eta \right) \le \exp(-s) + 2\exp(-n^2 s r^2 \eta^2/72).$$
Proof of Proposition 10.8. We know that $\rho$ has the same distribution as
$AA^\dagger$, where $A$ is an $n \times s$ matrix uniformly distributed on the Hilbert–Schmidt
sphere $S_{HS}$. Consider the function $f : S_{HS} \to \mathbb{R}$ defined by
(10.9)    $$f(A) = \left\|AA^\dagger - \frac{I}{n}\right\|_{K_0}.$$
For every $t > 0$, denote by $\Omega_t$ the subset $\Omega_t = \{A \in S_{HS} : \|A\|_\infty \le t\}$. The function
$f$ is the composition of several operations:
(a) the map $A \mapsto \|A\|_{K_0}$, which is $1/r$-Lipschitz with respect to the Hilbert–Schmidt
norm,
(b) the map $A \mapsto A - I/n$, which is an isometry for the Hilbert–Schmidt norm,
(c) the map $A \mapsto AA^\dagger$, which is $2t$-Lipschitz on $\Omega_t$ (see Lemma 8.22).
It follows that the Lipschitz constant of the restriction of $f$ to $\Omega_t$ is bounded by
$2t/r$. We now apply the local version of Lévy’s lemma (Corollary 5.35) and obtain
that, for every $\eta > 0$,
    $$\mathbf{P}(|f - M| \ge \eta) \le \mathbf{P}(S_{HS} \setminus \Omega_t) + 2\exp\big(-n s r^2 \eta^2/(8t^2)\big).$$
If we choose $t = 3/\sqrt{n}$, then $\mathbf{P}(S_{HS} \setminus \Omega_t) \le \exp(-s)$ (apply Proposition 6.36 with
$\varepsilon = \sqrt{s/n}$) and the result follows. □

Remark 10.9. Taking $t = 1$ in the argument above, one obtains that the global
Lipschitz constant of $f$ is bounded by $2/r$. This implies (see Proposition 5.29) that
any two central values for $f$ differ by at most $C/(r\sqrt{ns})$.

10.2. Separability of random states


Assume now that we work in a bipartite Hilbert space, and for simplicity consider
the case of $\mathbb{C}^d \otimes \mathbb{C}^d$ where both parties play a symmetric role. Throughout
this section we write $\mathrm{Sep}$ for $\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$ and consider random induced states on
$\mathbb{C}^d \otimes \mathbb{C}^d$ with distribution $\mu_{d^2,s}$.

10.2.1. Almost sure entanglement for low-dimensional environments.
Since the maximally mixed state lies in the interior of the set of separable states, and
since the measures $\mu_{d^2,s}$ converge weakly towards the Dirac mass at the maximally
mixed state (see Section 6.2.3.4), it follows that $\mu_{d^2,s}(\mathrm{Sep})$ tends to 1 when $s$ tends
to infinity ($d$ being fixed). Conversely, the following result shows that random
induced states are entangled with probability one when $s \le (d-1)^2$.

Proposition 10.10. Let $d, s$ be integers with $s \le (d-1)^2$. Then $\mu_{d^2,s}(\mathrm{Sep}) = 0$.
Proof. Let $S \subset \mathbb{C}^d \otimes \mathbb{C}^d$ be the range of $\rho$. The random subspace $S$ is
Haar-distributed on the Grassmann manifold $\mathrm{Gr}(s, \mathbb{C}^d \otimes \mathbb{C}^d)$. We use the following
simple fact which is an immediate consequence of the definition of separability: if
$\rho$ is separable, then $S$ is spanned by product vectors. The Proposition now follows
from Theorem 8.1: when $s \le (d-1)^2$, $S$ almost surely contains no nonzero product
vector. □
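
A heuristic way to see where the bound $(d-1)^2$ comes from is a dimension count. The sketch below relies on a standard fact from algebraic geometry that is not taken from the text (Theorem 8.1 remains the rigorous statement): up to scalars, product vectors form a set of complex dimension $2(d-1)$ inside a projective space of dimension $d^2 - 1$, and a generic subspace of dimension $s$ (projective dimension $s - 1$) is expected to miss it when $(s - 1) + 2(d - 1) < d^2 - 1$. The function name is hypothetical.

```python
def max_dim_without_product_vectors(d):
    """Largest s satisfying the dimension count (s - 1) + 2(d - 1) < d^2 - 1."""
    ambient = d * d - 1   # dimension of the projective space of C^d (x) C^d
    segre = 2 * (d - 1)   # dimension of the set of product vectors, up to scalars
    s = 0
    # Increase s as long as the next value still satisfies the strict inequality.
    while s + segre < ambient:
        s += 1
    return s

for d in range(2, 7):
    print(d, max_dim_without_product_vectors(d), (d - 1) ** 2)
```

The count reproduces the range $s \le (d-1)^2$ of Proposition 10.10 for every $d$ tried.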
Problem 10.11. For which values of $d, s$ do we have $\mu_{d^2,s}(\mathrm{Sep}) = 0$?
Exercise 10.5. Let $d, s$ be integers with $s \ge d^2$. Show that $0 < \mu_{d^2,s}(\mathrm{Sep}) < 1$.
Exercise 10.6. Let $d, s$ be integers such that $\mu_{d^2,s}(\mathrm{Sep}) > 0$. Show that
$\mu_{d^2,t}(\mathrm{Sep}) > 0$ for every $t \ge s$. (Cf. Problem 10.14.)
10.2.2. The threshold theorem. From the two extreme cases, $s \le (d-1)^2$
and $s = \infty$, we may infer that induced states are more likely to be separable when
the environment has larger dimension. As it turns out, a phase transition takes
place (at least when $d$ is sufficiently large): the generic behavior of $\rho$ “flips” to the
opposite one when $s$ changes from being a little smaller than a certain threshold
dimension $s_0$ to being larger than $s_0$. More precisely, we have the following theorem.

Theorem 10.12. Define a function $s_0(d)$ as $s_0(d) = w(\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)^\circ)^2$. This
function satisfies
(10.10)    $$c\,d^3 \le s_0(d) \le C d^3 \log^2 d$$
for some constants $c, C$ and is the threshold between separability and entanglement
in the following sense. If $\rho$ is a random state on $\mathbb{C}^d \otimes \mathbb{C}^d$ induced by the environment
$\mathbb{C}^s$, then, for any $\varepsilon > 0$,
(i) if $s \le (1 - \varepsilon)s_0(d)$, we have
(10.11)    $$\mathbf{P}(\rho \text{ is entangled}) \ge 1 - 2\exp(-c(\varepsilon)d^3),$$
(ii) if $s \ge (1 + \varepsilon)s_0(d)$, we have
(10.12)    $$\mathbf{P}(\rho \text{ is separable}) \ge 1 - 2\exp(-c(\varepsilon)s),$$
where $c(\varepsilon)$ is a constant depending only on $\varepsilon$.

As a corollary, we recover the result mentioned in the preamble of the chapter:
given $N$ identical particles in a generic pure state, if we assign $k$ of them to Alice
and $k$ of them to Bob, their shared state suddenly jumps from typically separable
to typically entangled when $k$ increases past a certain threshold value $k_N \sim N/5$
(a small $k$ corresponds to a large environment dimension $s = 2^{N-2k}$ relative to
$d = 2^k$). We state the result for qubits only, but both the statement and the proof
easily generalize to $D$-level particles for $D > 2$.

Corollary 10.13 (see Exercise 10.8). Given an integer $N$, there is $k_N \sim N/5$
with the following property. For some integer $k \le N/2$, decompose $\mathcal{H} = (\mathbb{C}^2)^{\otimes N}$ as
$A \otimes B \otimes E$ with $A = B = (\mathbb{C}^2)^{\otimes k}$ and $E = (\mathbb{C}^2)^{\otimes (N-2k)}$, and consider a unit vector
$\psi \in \mathcal{H}$ chosen uniformly at random. Let $\rho = \operatorname{Tr}_E |\psi\rangle\langle\psi|$ be the induced state on
$A \otimes B$. Then
(1) for $k < k_N$, $\mathbf{P}(\rho \text{ is separable}) \ge 1 - 2\exp(-\alpha N)$,
(2) for $k > k_N$, $\mathbf{P}(\rho \text{ is entangled}) \ge 1 - 2\exp(-\alpha N)$,
where $\alpha > 1$ is a constant independent of $N$.
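
The value $N/5$ can be read off from Theorem 10.12 by comparing exponents. In the setting of the corollary, $d = 2^k$ and $s = 2^{N-2k}$, so the condition $s \ge d^3$ becomes $N - 2k \ge 3k$, that is, $k \le N/5$: a small $k$ means a large environment, hence a typically separable state. The sketch below is only a heuristic illustration, with the constants in (10.10) dropped and $d^3 k^2$ used as a stand-in for the upper bound $d^3 \log^2 d$; the function `crossover_k` and its parameter `log_power` are hypothetical names.

```python
import math

def crossover_k(N, log_power=0):
    """Largest k <= N/2 such that s >= d^3 * k^log_power, with d = 2^k, s = 2^(N - 2k).

    log_power = 0 compares s with d^3 (lower bound in (10.10), constants dropped);
    log_power = 2 compares s with d^3 * k^2, a stand-in for d^3 * log^2(d).
    """
    best = 0
    for k in range(1, N // 2 + 1):
        log2_s = N - 2 * k
        log2_threshold = 3 * k + log_power * math.log2(k)
        if log2_s >= log2_threshold:
            best = k
    return best

for N in (50, 100, 1000):
    print(N, crossover_k(N), crossover_k(N, log_power=2))
```

The logarithmic factor shifts the crossover only by a lower-order term, which is why it does not affect the asymptotic relation $k_N \sim N/5$.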
Proof of Theorem 10.12. The inequalities (10.10) are a direct consequence
of Theorem 9.6.
We next present a detailed proof of part (ii). Let $\rho_{d^2,s}$ be a random state
with distribution $\mu_{d^2,s}$. Denote $\mathrm{Sep}_0 = \mathrm{Sep} - I/d^2$. Consider also the function
$f(\rho) = \left\|\rho - \frac{I}{d^2}\right\|_{\mathrm{Sep}_0}$ and the quantity $E_{d,s} := \mathbf{E} f(\rho_{d^2,s})$.
Fix $\varepsilon > 0$, and let $s, d$ be such that $s \ge (1 + \varepsilon)s_0(d)$. Appealing to Proposition
10.6 (in the version given in Remark 10.7, with $n = d^2$ and $K = \mathrm{Sep}_0$), we obtain
(10.13)    $$E_{d,s} \le \tilde{C}_{n,s}\, \frac{w(K^\circ)}{\sqrt{s}} \le \frac{\tilde{C}_{n,s}}{\sqrt{1 + \varepsilon}},$$
where $\tilde{C}_{n,s}$ is the constant appearing in (10.7). The constants $\tilde{C}_{n,s}$ tend to 1 as $d$
and $s$ tend to infinity under the constraint $s \ge (1 + \varepsilon)s_0(d)$.
Let $M_{d,s}$ be the median of $f(\rho_{d^2,s})$. We know from Proposition 10.8 (the
inradius of $\mathrm{Sep}$ being $\Theta(1/d^2)$, see Table 9.1) that
(10.14)    $$\mathbf{P}\big(f(\rho_{d^2,s}) > M_{d,s} + \eta\big) \le \exp(-s) + 2\exp(-cs\eta^2).$$
Remark 10.9 implies that $|M_{d,s} - E_{d,s}| \le Cd/\sqrt{s}$. It follows then from (10.13)
that there is an $\eta > 0$ (depending only on $\varepsilon$) with the property that $M_{d,s} + \eta \le 1$
for all $d$ large enough and $s \ge (1 + \varepsilon)s_0(d)$. The inequality (10.12) follows now
from (10.14) and from the obvious remark that a state $\rho$ is entangled if and only if
$f(\rho) > 1$. Small values of $d$ can be taken into account by adjusting the constants if
necessary. Note that the argument yields a priori a bound $C' \exp(-c'(\varepsilon)s)$, possibly
with $C' > 2$, but the bound (10.12) follows then with $c(\varepsilon) = c'(\varepsilon)/\log_2 C'$.
The proof of part (i) goes along similar lines, particularly if we do not care
about the exact power of $d$ appearing in the exponent of the probability bound
in (10.11); this is because Proposition 10.8 yields an estimate parallel to (10.14)
for $\mathbf{P}\big(f(\rho_{d^2,s}) < M_{d,s} - \eta\big)$. There are some fine points which emerge when $s$ is
relatively small, but they can be handled using inequalities from Exercise 10.7; see
[ASY14] for details. See also Remark 10.15. □
The fine points in the proof of part (i) of Theorem 10.12 would disappear if the
answer to the following natural problem were positive (cf. Exercise 10.6).

Problem 10.14 (As environment increases, entanglement decreases). Fix an
integer $d \ge 2$. Is it true that the function $s \mapsto \mu_{d^2,s}(\mathrm{Sep})$ is non-decreasing?

Remark 10.15. An alternative and simpler argument to prove part (i) of Theorem
10.12 is sketched in Exercise 10.9. That argument also has the advantage that
it produces explicitly an entanglement witness certifying that the induced state
is entangled. However, the argument works only in the range $s \le cd^3$ for some
constant $c > 0$; while this does not cover the entire range, it handles the case of
relatively small $s$ that does not readily follow from Proposition 10.8.

Exercise 10.7 (Partial results on monotonicity of entanglement). Set
$\pi_{d,s} := \mu_{d^2,s}(\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d))$.
(i) Show that the function $d \mapsto \pi_{d,s}$ is non-increasing for any integer $s \ge 1$.
(ii) Show the inequality $\pi_{2d,s} \le \pi_{d,4s}$.

Exercise 10.8 (Proof of the $N/5$ threshold result). Prove Corollary 10.13 by
combining Theorem 10.12 (applied with $\varepsilon = 1/2$) and Exercise 10.7.

Exercise 10.9 (The induced state is its own witness). Let $\rho$ be a random state
on $\mathbb{C}^d \otimes \mathbb{C}^d$ with distribution $\mu_{d^2,s}$, and $W = \rho - I/d^2$.
(i) Show that $\operatorname{Tr}(W\rho)$ is of order $1/s$ with high probability.
(ii) Show that for any unit vector $x \in \mathbb{C}^d \otimes \mathbb{C}^d$ and $0 < \eta < 1$, we have
    $$\mathbf{P}\left( \big|\langle x|W|x\rangle\big| > \frac{\eta}{d^2} \right) \le C\exp(-cs\eta^2).$$
(iii) Conclude that with high probability, $\sup\{\operatorname{Tr}(\sigma W) : \sigma \in \mathrm{Sep}\} \le C d^{-3/2} s^{-1/2}$.
(iv) Conclude that in the regime $s \le cd^3$, with high probability, $W$ witnesses the
fact that $\rho$ is entangled.


10.3. Other thresholds


10.3.1. Entanglement of formation. Theorem 10.12 settles the “entanglement
vs. separability” dichotomy for random induced states. In the generic entanglement
regime, we could be more precise and ask about quantitative estimates:
how strongly is a random state entangled?
To address the above question we need a method to quantify the amount of
entanglement present in a quantum state. The approach from the preceding section
allows us to use the value of the gauge $\left\|\rho - I/d^2\right\|_{\mathrm{Sep}_0}$ as a measure of the strength of
entanglement. In this section we will work with invariants that are more “native”
to quantum information theory.
For a pure state $\psi$, the entropy of entanglement $E(\psi)$ was introduced in (8.1).
A possible way to extend this definition to mixed states is to use a “convex roof”
construction. For a state $\rho$ on $\mathbb{C}^d \otimes \mathbb{C}^d$, define its entanglement of formation $E_F(\rho)$
as
(10.15)    $$E_F(\rho) = \inf\Big\{ \sum p_i E(\psi_i) \ : \ \rho = \sum p_i |\psi_i\rangle\langle\psi_i| \Big\},$$
the infimum being taken over all decompositions of $\rho$ as convex combinations of
pure states. Equivalently, the entanglement of formation is the smallest convex
function which coincides with the entropy of entanglement on pure states.
Entanglement of pure states was studied in Chapter 8. In particular, for a
random pure state $\psi$ (which corresponds to the case $s = 1$), we typically have
$E_F(|\psi\rangle\langle\psi|) = E(\psi) = \log d - \frac{1}{2} + o(1)$; see Lemma 8.13. Here is a statement
describing a “behavior shift” which takes place as $s$ increases.

Theorem 10.16 (Entanglement of formation for random induced states). Let
$\rho$ be a random state on $\mathbb{C}^d \otimes \mathbb{C}^d$ with distribution $\mu_{d^2,s}$.
(1) If $s \le cd^2/\log^2 d$, then with high probability $E_F(\rho) \ge \log(d) - 1$.
(2) If $0 < \varepsilon < 1$ and $s \ge C\varepsilon^{-2} d^2 \log^2 d$, then with high probability $E_F(\rho) \le \varepsilon$.

Proof. Assume $s \le d^2$. If $S$ denotes the range of $\rho$, then $S$ is a random
Haar-distributed $s$-dimensional subspace of $\mathbb{C}^d \otimes \mathbb{C}^d$. We use the following relaxation
    $$E_F(\rho) \ge \inf\{E(\psi) : \psi \in S\}.$$
We then conclude using Theorem 8.15 that, with high probability, $E_F(\rho) \ge \log(d) - 1$
provided $s \le cd^2/\log^2 d$.
For the second part, denote by $a$ the smallest eigenvalue of $\rho$ and consider the
convex combination
    $$\rho = (\rho - a\,I) + a\,I = (1 - d^2 a)\,\sigma + d^2 a\, \frac{I}{d^2}$$
for some state $\sigma$. Using the convexity of $E_F$ and the obvious facts that $E_F(\sigma) \le \log d$
and $E_F(I/d^2) = 0$, we obtain $E_F(\rho) \le (1 - d^2 a) \log d$. However, we know
from Proposition 6.36 (or Exercise 6.43) that $a \ge \frac{1}{d^2} - \frac{C}{d\sqrt{s}}$ with large probability.
It follows that as long as $s \ge C^2 \varepsilon^{-2} d^2 \log^2 d$, then
    $$E_F(\rho) \le \frac{Cd \log(d)}{\sqrt{s}} \le \varepsilon.$$ □
Exercise 10.10. Check that $E_F(\rho) = 0$ if and only if $\rho$ is separable.
10.3.2. Threshold for PPT. The machinery developed in this chapter can
be applied to any property instead of separability and allows one to reduce the
estimation of threshold dimensions to the estimation of a geometric quantity (the mean
width of the polar set).
One natural example is the PPT property. Since $\mathrm{PPT} = \mathrm{D} \cap \Gamma(\mathrm{D})$, where $\Gamma$ is
the partial transpose, it follows easily (arguing as in the first part of the proof of
Proposition 9.8) that $w(\mathrm{PPT}_0^\circ) \le 2w(\mathrm{D}_0^\circ) \simeq d$. The threshold $s_1$ appearing in this
approach satisfies then
    $$s_1(d) = w(\mathrm{PPT}_0^\circ)^2 = \Theta(d^2).$$
However, we know that the spectrum of large-dimensional partially transposed
random states is described by a non-centered semicircular distribution (see Theorem
6.30). A more precise estimation of the threshold follows (note that the distribution
$\mathrm{SC}(\lambda, \lambda)$ appearing in Theorem 6.30 has support $[\lambda - 2\sqrt{\lambda}, \lambda + 2\sqrt{\lambda}]$, which is
included in $[0, +\infty)$ if and only if $\lambda \ge 4$).

Theorem 10.17 (Threshold for the PPT property). Define $s_1(d) = 4d^2$. Let $\rho$
be a random state on $\mathbb{C}^d \otimes \mathbb{C}^d$ with distribution $\mu_{d^2,s}$. Then, for any $\varepsilon > 0$,
(i) if $s \le (1 - \varepsilon)s_1(d)$, we have
    $$\mathbf{P}(\rho \text{ is PPT}) \le 2\exp(-c(\varepsilon)d^2),$$
(ii) if $s \ge (1 + \varepsilon)s_1(d)$, we have
    $$\mathbf{P}(\rho \text{ is PPT}) \ge 1 - 2\exp(-c(\varepsilon)s).$$
Here $c(\varepsilon)$ is a constant depending only on $\varepsilon$.
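
The constant 4 in $s_1(d) = 4d^2$ comes from the elementary observation that the left endpoint $\lambda - 2\sqrt{\lambda}$ of the support of $\mathrm{SC}(\lambda, \lambda)$ is nonnegative exactly when $\sqrt{\lambda} \ge 2$. The following sketch illustrates this; it assumes, as the value of $s_1(d)$ suggests, that $\lambda$ plays the role of the ratio $s/d^2$ (the precise normalization is that of Theorem 6.30, which is not reproduced here), and the function names are hypothetical.

```python
import math

def shifted_semicircle_support(lam):
    """Support [lam - 2 sqrt(lam), lam + 2 sqrt(lam)] of SC(lam, lam)."""
    radius = 2.0 * math.sqrt(lam)
    return lam - radius, lam + radius

def support_is_nonnegative(lam):
    """True when the whole support lies in [0, +infinity)."""
    return shifted_semicircle_support(lam)[0] >= 0.0

def ppt_threshold(d):
    """Threshold s_1(d) = 4 d^2 from Theorem 10.17."""
    return 4 * d * d

for lam in (1.0, 3.9, 4.0, 9.0):
    print(lam, shifted_semicircle_support(lam), support_is_nonnegative(lam))
```

Under the stated assumption, $\lambda \ge 4$ translates into $s \ge 4d^2$, matching the threshold in the theorem.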


The comparison between Theorems 10.12, 10.16 and 10.17 is instructive: if $s$
is sufficiently larger than $d^2$, but sufficiently smaller than $d^3$, random states are
typically PPT and entangled (in particular they cannot be distilled, see Chapter
12), but have an amount of entanglement extremely small when measured via the
entanglement of formation.

Exercise 10.11. Explain the presence of expressions of the form $\Omega_\varepsilon(d^2)$ and
$\Omega_\varepsilon(s)$ in the exponents in Theorem 10.17.

Notes and Remarks


Theorem 10.12, as well as the preliminary results from Section 10.1, are from
[ASY14]. A high-level non-technical overview can be found in [ASY12]. In particular,
the existence of a separability threshold around the value $s = d^3$ was proved
in [ASY14]; previously only the cases $s \le d^2$ or $s \ge d^4$ were covered (see, e.g.,
[HLW06]).
The answer to Problem 10.11 is known for qubits: we have $\mu_{4,2}(\mathrm{Sep}(\mathbb{C}^2 \otimes \mathbb{C}^2)) = 0$
and $\mu_{4,3}(\mathrm{Sep}(\mathbb{C}^2 \otimes \mathbb{C}^2)) > 0$. As explained in Section 7.1 of [ASY14], this follows
from results of [RW09] and [SBŻ06], respectively.
The entanglement of formation is only one of the many possible ways to quantify
entanglement of mixed states. However, other measures are harder to manipulate.
For a survey of the subject of entanglement measures see [PV07].
The threshold for the entanglement of formation (Theorem 10.16) is essentially
from [HLW06], and the threshold for the PPT property (Theorem 10.17) is from
[Aub12] (see also [ASY12]).
Other threshold functions have been computed or estimated: for the realignment
criterion [AN12], for the k-extendibility property [Lan16], and for still other
properties [CNY12, JLN14, JLN15] (including the absolute PPT property and
the reduction criterion).

CHAPTER 11

Bell Inequalities and the Grothendieck–Tsirelson Inequality

In this chapter we briefly sketch the connection (originally made by Tsirelson)
between the celebrated Bell inequalities from the quantum theory, and the equally
celebrated Grothendieck inequality from functional analysis. The presentation is
anything but comprehensive: it has been unequivocally established in the last dozen
or so years that the proper “mathematical home” of Bell inequalities is in the theories
of operator spaces and operator systems, which are beyond the scope of this book.
An excellent survey that addresses these topics in much greater detail is [PV16].

11.1. Isometrically Euclidean subspaces via Clifford algebras
In Section 7.2.4 we studied in detail the almost Euclidean subspaces of $M_n$,
i.e., subspaces on which a given Schatten $p$-norm is $(1 + \varepsilon)$-equivalent to the
Hilbert–Schmidt norm. For the purposes of the present chapter it is useful to focus on
the case of exactly or isometrically Euclidean subspaces, i.e., $\varepsilon = 0$.
We first note that for a rank one matrix, all Schatten $p$-norms are equal. It
follows that there are subspaces of dimension $n$ in $M_n$ (e.g., the space of all matrices
with zero coefficients outside the first row) in which the ratio $\|\cdot\|_{op}/\|\cdot\|_{HS}$ is constant
and equal to 1. However, such a subspace is not at the “correct level”: for subspaces
produced by Dvoretzky’s theorem – which are also of dimension $\Theta(n)$ – the same
ratio is $\Theta(1/\sqrt{n})$ (or, more precisely, $\sim 2/\sqrt{n}$, see Exercise 7.23).
A less trivial construction, based on Clifford algebras (or, in more elementary
terms, on Pauli matrices), gives isometrically Euclidean subspaces of $M_n$ (and even
of $M_n^{\mathrm{sa}}$), at the correct level, of dimension $\Theta(\log n)$, at least when $n$ is a power
of 2. It is a natural question whether it is possible to interpolate between that
construction and the subspaces given by Dvoretzky’s theorem (see Problem 11.27
in Notes and Remarks).

Lemma 11.1. For every $k \ge 2$, there is a $(2k - 2)$-dimensional subspace of the
space of $2^k \times 2^k$ real self-adjoint matrices in which every matrix is a multiple of an
orthogonal matrix.

This result is specific to subspaces over the real field: any 2-dimensional complex
subspace of complex matrices (or, more to the point, every 1-dimensional
complex affine subspace not containing 0) contains a singular matrix since the
polynomial $\lambda \mapsto \det(A + \lambda B)$ must vanish (see also Exercise 11.1). However, a
similar phenomenon holds for complex self-adjoint matrices.

Lemma 11.2. For every $k \ge 1$, there is a $2k$-dimensional (real vector) subspace
of the space of $2^k \times 2^k$ complex Hermitian matrices in which every matrix is a
multiple of a unitary matrix.

Lemma 11.2 implies immediately Lemma 11.1 since an $n \times n$ unitary matrix
can be considered as a $2n \times 2n$ orthogonal matrix when one disregards the complex
structure.

Proof of Lemma 11.2. Consider the following elements $U_1, \dots, U_{2k}$ of $M_2^{\otimes k}$,
defined for $1 \le i \le k$ by
    $$U_i = I^{\otimes (i-1)} \otimes \sigma_x \otimes \sigma_y^{\otimes (k-i)},$$
    $$U_{k+i} = I^{\otimes (i-1)} \otimes \sigma_z \otimes \sigma_y^{\otimes (k-i)},$$
where $\sigma_x, \sigma_y, \sigma_z$ are the Pauli matrices introduced in (2.2). It is easily checked (cf.
Exercise 2.4) that the operators $(U_i)_{1 \le i \le 2k}$ are self-adjoint and are anticommuting
reflections: $U_i^2 = I$ and $U_i U_j = -U_j U_i$ for $i \neq j$. It follows that for any $\xi \in \mathbb{R}^{2k}$, the
matrix $X = \xi_1 U_1 + \cdots + \xi_{2k} U_{2k}$ satisfies $XX^\dagger = |\xi|^2 I$ and therefore is a multiple
of a unitary matrix. □
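
The construction in the proof is easy to test numerically for small $k$. The sketch below is an illustration only: it uses plain nested lists rather than a linear algebra library, assumes the standard Pauli matrices for $\sigma_x, \sigma_y, \sigma_z$, and all function names are hypothetical. It builds $U_1, \dots, U_{2k}$ and checks that $XX^\dagger = |\xi|^2 I$ for a given coefficient vector $\xi$.

```python
# Standard Pauli matrices (assumed to agree with those referred to in the proof).
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def kron_all(factors):
    """Kronecker product of a list of matrices, taken from left to right."""
    out = [[1]]
    for f in factors:
        out = kron(out, f)
    return out

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def clifford_generators(k):
    """The matrices U_1, ..., U_2k of size 2^k from the proof of Lemma 11.2."""
    first, second = [], []
    for i in range(1, k + 1):
        prefix, suffix = [I2] * (i - 1), [SY] * (k - i)
        first.append(kron_all(prefix + [SX] + suffix))   # U_i
        second.append(kron_all(prefix + [SZ] + suffix))  # U_{k+i}
    return first + second

def is_multiple_of_unitary(k, xi, tol=1e-12):
    """Check that X = sum_i xi_i U_i satisfies X X^dagger = |xi|^2 I."""
    gens = clifford_generators(k)
    n = 2 ** k
    X = [[sum(c * g[i][j] for c, g in zip(xi, gens)) for j in range(n)]
         for i in range(n)]
    X_dagger = [[X[j][i].conjugate() for j in range(n)] for i in range(n)]
    product = matmul(X, X_dagger)
    norm_sq = sum(c * c for c in xi)
    return all(abs(product[i][j] - (norm_sq if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

print(is_multiple_of_unitary(3, [0.5, -1.0, 2.0, 0.25, 1.5, -0.75]))
```

For $k = 3$ this verifies the identity on a 6-dimensional real subspace of $8 \times 8$ Hermitian matrices.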

Remark 11.3. The subspaces in Lemmas 11.1 and 11.2 consist of trace zero
matrices.

The dimensions appearing in Lemma 11.1 are not optimal. Finding the minimal
possible dimension is related to the Radon–Hurwitz problem and involves more
advanced analysis of Clifford algebras.
Theorem 11.4 (not proved here). Given an integer $k \ge 1$, consider
(i) $\alpha(k)$, the minimal integer $n$ such that $M_n(\mathbb{R})$ contains a $k$-dimensional subspace
in which every matrix is a multiple of an orthogonal matrix.
(ii) $\beta(k)$, the minimal integer $n$ such that $M_n(\mathbb{R})$ contains a $k$-dimensional subspace
in which every nonzero matrix is invertible.
Then
    $$\alpha(k) = \beta(k) = \begin{cases} 2^{(k-2)/2} & \text{if } k = 0 \bmod 8, \\ 2^{(k-1)/2} & \text{if } k = 1 \text{ or } k = 7 \bmod 8, \\ 2^{k/2} & \text{if } k = 2 \text{ or } k = 4 \text{ or } k = 6 \bmod 8, \\ 2^{(k+1)/2} & \text{if } k = 3 \text{ or } k = 5 \bmod 8. \end{cases}$$
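
The formula for $\alpha(k)$ can be cross-checked against the classical Radon–Hurwitz numbers. The sketch below relies on a fact that is not stated in the text and should be treated as an assumption here: the Hurwitz–Radon theorem, according to which the largest dimension of a subspace of $M_n(\mathbb{R})$ consisting of multiples of orthogonal matrices is $\rho(n) = 8a + 2^b$, where $n = (\text{odd}) \cdot 2^{4a+b}$ with $0 \le b \le 3$. Under that assumption, $\alpha(k)$ is the smallest $n$ with $\rho(n) \ge k$. The function names are hypothetical, and the check says nothing about $\beta(k)$.

```python
def radon_hurwitz(n):
    """Radon-Hurwitz number rho(n) = 8a + 2^b for n = (odd) * 2^(4a + b), 0 <= b <= 3."""
    valuation = 0
    while n % 2 == 0:
        n //= 2
        valuation += 1
    a, b = divmod(valuation, 4)
    return 8 * a + 2 ** b

def alpha_formula(k):
    """Closed form for alpha(k) as stated in Theorem 11.4."""
    r = k % 8
    if r == 0:
        exponent = (k - 2) // 2
    elif r in (1, 7):
        exponent = (k - 1) // 2
    elif r in (2, 4, 6):
        exponent = k // 2
    else:  # r is 3 or 5
        exponent = (k + 1) // 2
    return 2 ** exponent

def alpha_by_search(k):
    """Smallest n with rho(n) >= k (assumes the Hurwitz-Radon theorem)."""
    n = 1
    while radon_hurwitz(n) < k:
        n += 1
    return n

for k in range(1, 17):
    print(k, alpha_formula(k), alpha_by_search(k))
```

The two columns agree for the values tested, which is also a useful guard against misreading the four cases of the formula.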
Exercise 11.1 (Isometrically Euclidean subspaces and parity of the dimension).
Show that $M_n(\mathbb{R})$ contains a 2-dimensional subspace in which every matrix
is a multiple of an orthogonal matrix if and only if $n$ is even.


11.2. Local vs. quantum correlations


Ever since the seminal 1935 paper [EPR35] by Einstein, Podolsky and Rosen it
has been apparent that quantum theory leads to predictions which are incompatible
with the classical understanding of physical reality. Specifically, the outcomes of
some experiments may be correlated in a way contradicting common sense (“spooky
action at a distance”). In this section we formalize the concept of correlations,
which will lead to the famous Bell inequalities discovered in [Bel64].

11.2.1. Correlation matrices. Let us start by defining what we mean by
correlation matrices in the classical and the quantum worlds. As we shall see,
comparing the two naturally involves the Grothendieck constant.
11.2. LOCAL VS. QUANTUM CORRELATIONS 277

Definition 11.5. A m ˆ n real matrix paij q is called a classical (or local)


correlation matrix if there exist random variables pXi q1ďiďm and pYj q1ďjďn defined
on a common probability space, satisfying |Xi | ď 1, |Yj | ď 1 (almost surely), and
such that, for any 1 ď i ď m, 1 ď j ď n,
(11.1) aij “ E Xi Yj .
We write LCm,n (or simply LC) for the set of m ˆ n local correlation matrices.
We emphasize that this notion does not coincide with the correlation or covari-

ion
ance matrices from statistics. In that context, covariance matrices are square and
positive semi-definite, corresponding to the scenario when pXi q “ pYi q and E Xi “ 0

ut
(see, e.g., Appendix A.2), while the correlation matrix of pXi q is the covariance ma-
trix of the standardized variables pX̃i q “ pXi {}X}2 q. When E Xi “ E Yj “ 0, (11.1)

rib
coincides with the somewhat less frequently used notion of cross-covariance.
The set LCm,n is a polytope with 2n`m´1 vertices (see Proposition 11.7) and

ist
appears in the literature under various names such as correlation polytope, Bell
polytope, local hidden variable polytope, local polytope. (The reader should be

rd
forewarned, though, that sometimes the same names are used for sets of the more
general objects, the so-called boxes, defined in Section 11.3.2.) The reasons for the
adjective “local” will become more clear later on. The facial structure of LCm,n is
fo
rather complicated (except in very low dimensions, see Exercises 11.4, 11.12, and
11.15).
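Definition 11.6. An $m \times n$ real matrix $(a_{ij})$ is called a quantum correlation matrix if there is a state $\rho \in \mathrm{D}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$ (for some $d_1, d_2$), self-adjoint operators $(X_i)_{1 \le i \le m}$ on $\mathbb{C}^{d_1}$ and $(Y_j)_{1 \le j \le n}$ on $\mathbb{C}^{d_2}$ satisfying $\|X_i\|_\infty \le 1$, $\|Y_j\|_\infty \le 1$, and such that, for any $1 \le i \le m$ and $1 \le j \le n$,
$$a_{ij} = \operatorname{Tr} \rho (X_i \otimes Y_j). \tag{11.2}$$
We write $\mathrm{QC}_{m,n}$ (or simply $\mathrm{QC}$) for the set of $m \times n$ quantum correlation matrices.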
It turns out that both sets LC and QC have simple descriptions.
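Proposition 11.7. The set $\mathrm{LC}_{m,n}$ can be alternatively described as
$$\mathrm{LC}_{m,n} = \operatorname{conv}\{(\xi_i \eta_j)_{1 \le i \le m,\, 1 \le j \le n} : \xi \in \{-1,1\}^m,\ \eta \in \{-1,1\}^n\} = B_\infty^m \mathbin{\hat{\otimes}} B_\infty^n.$$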
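As a small sanity check of the vertex count $2^{n+m-1}$ quoted earlier, one can enumerate the sign matrices $(\xi_i \eta_j)$ by brute force. The sketch below is only an illustration; the helper name `lc_vertices` is ours and does not come from the text. It relies on the fact that $(\xi, \eta)$ and $(-\xi, -\eta)$ produce the same matrix, and that each such sign matrix is a vertex of the cube $[-1,1]^{mn}$, hence an extreme point of the convex hull.

```python
from itertools import product

def lc_vertices(m, n):
    """Distinct sign matrices (xi_i * eta_j), each stored as a tuple of rows.

    Hypothetical helper: brute-force enumeration over all sign vectors.
    """
    vertices = set()
    for xi in product((-1, 1), repeat=m):
        for eta in product((-1, 1), repeat=n):
            vertices.add(tuple(tuple(x * e for e in eta) for x in xi))
    return vertices

# Compare the brute-force count with the formula 2^(m+n-1) for small sizes.
for m, n in [(2, 2), (2, 3), (3, 3)]:
    print(m, n, len(lc_vertices(m, n)), 2 ** (m + n - 1))
```

The enumeration is exponential in $m + n$, so it is meant for very small sizes only.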
Proposition 11.8. The set $\mathrm{QC}_{m,n}$ is convex and can be alternatively described as
$$\mathrm{QC}_{m,n} = \left\{ (\langle x_i, y_j \rangle)_{1 \le i \le m,\, 1 \le j \le n} : x_i, y_j \in \mathbb{R}^{\min(m,n)},\ |x_i| \le 1,\ |y_j| \le 1 \right\}.$$

It is obvious from the Propositions that $\mathrm{LC} \subset \mathrm{QC}$. (This can also be established
directly from the definitions, without appealing to the results of Section 11.1.)
The crucial point—which is simple, but not entirely trivial, and will be studied in
detail in the next section—is that this inclusion is strict. This is one mathematical
manifestation of the fact that the quantum description of reality is different from
the classical one. Correlation matrices that do not belong to LC will be called
nonclassical or nonlocal.
Proof of Proposition 11.7. We first prove the inclusion $\supset$. It is clear that given $\xi \in \{-1,1\}^m$ and $\eta \in \{-1,1\}^n$, we have $(\xi_i \eta_j) \in \mathrm{LC}_{m,n}$ (consider constant random variables taking values $\pm 1$), so it suffices to show that $\mathrm{LC}_{m,n}$ is convex.
278 11. BELL INEQUALITIES AND THE GROTHENDIECK–TSIRELSON INEQUALITY

If $a^{(1)}_{ij} = \mathbf{E}\, X^{(1)}_i Y^{(1)}_j$ and $a^{(2)}_{ij} = \mathbf{E}\, X^{(2)}_i Y^{(2)}_j$ are two classical correlation matrices (without loss of generality we may assume that all random variables are defined on the same probability space), define random variables $X_i = X^{(\alpha)}_i$ and $Y_j = Y^{(\alpha)}_j$, where $\alpha$ is an independent random index, equal to 1 with probability $p$ and equal to 2 with probability $1 - p$. Then $\mathbf{E}\, X_i Y_j = p\, a^{(1)}_{ij} + (1 - p)\, a^{(2)}_{ij}$ and this shows that $\mathrm{LC}_{m,n}$ is convex.
Conversely, note that any vector $X \in [-1,1]^d$ can be written as a convex combination of elements of $I_d := \{-1,1\}^d$
$$X = \sum_{\xi \in I_d} \lambda^d_\xi(X)\, \xi$$
with the functions $\lambda^d_\xi : [-1,1]^d \to [0,1]$ being measurable (or even continuous) and adding to 1. If $a \in \mathrm{LC}_{m,n}$ is a classical correlation matrix with $a_{ij} = \mathbf{E}\, X_i Y_j$, we may write (denoting $X = (X_1, \dots, X_m)$ and $Y = (Y_1, \dots, Y_n)$)
$$a_{ij} = \mathbf{E} \Big( \sum_{\xi \in I_m} \lambda^m_\xi(X)\, \xi_i \Big) \Big( \sum_{\eta \in I_n} \lambda^n_\eta(Y)\, \eta_j \Big) = \sum_{\xi \in I_m,\, \eta \in I_n} \mathbf{E}\big[ \lambda^m_\xi(X)\, \lambda^n_\eta(Y) \big]\, \xi_i \eta_j$$
which shows that $a \in \operatorname{conv}\{(\xi_i \eta_j)_{i,j} : \xi \in I_m,\ \eta \in I_n\}$. □


Proof of Proposition 11.8. Let us first prove the direct inclusion. Let $(a_{ij}) \in \mathrm{QC}_{m,n}$. There is a Hilbert space $H = \mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$, a state $\rho \in \mathrm{D}(H)$ and self-adjoint contractions $(X_i)$ and $(Y_j)$ such that $a_{ij} = \operatorname{Tr} \rho (X_i \otimes Y_j)$. We introduce the bilinear form on the space $B^{\mathrm{sa}}(H)$
$$\beta(S, T) = \operatorname{Re} \operatorname{Tr}(\rho S T).$$
This bilinear form is positive semi-definite (to check symmetry, use the fact that $\operatorname{Re} \operatorname{Tr} X = \operatorname{Re} \operatorname{Tr} X^\dagger$) and therefore, after possibly passing to a quotient, it makes $B^{\mathrm{sa}}(H)$ into a real Euclidean space. The conclusion follows since $a_{ij} = \beta(X_i \otimes I, I \otimes Y_j)$ while $\beta(X_i \otimes I, X_i \otimes I) \le 1$ and $\beta(I \otimes Y_j, I \otimes Y_j) \le 1$. To obtain the dimension $\min(m,n)$ as claimed, note that we may a posteriori project the vectors $(x_i)_{1 \le i \le m}$ onto $\operatorname{span}\{y_j : 1 \le j \le n\}$, or vice versa.


Conversely, let $(x_i)_{1 \le i \le m}$ and $(y_j)_{1 \le j \le n}$ be vectors of Euclidean norm at most 1 in $\mathbb{R}^{\min(m,n)}$. By Lemma 11.2, there exist $d \times d$ complex Hermitian matrices $A_i$, $B_j$ (for some $d$), with Hilbert–Schmidt norm at most 1 and such that $\operatorname{Tr} A_i B_j = \langle x_i, y_j \rangle$. Moreover, $A_i$, $B_j$ are multiples of unitaries. Set $X_i = d^{1/2} A_i$ and $Y_j = d^{1/2} B_j^T$; then $X_i$, $Y_j$ are unitaries and in particular $\|X_i\|_\infty \le 1$ and $\|Y_j\|_\infty \le 1$. Finally, if $\rho = |\psi\rangle\langle\psi|$, where $\psi \in \mathbb{C}^d \otimes \mathbb{C}^d$ is a maximally entangled vector, then we have
$$\operatorname{Tr} \rho\, X_i \otimes Y_j = \frac{1}{d} \operatorname{Tr} X_i Y_j^T = \operatorname{Tr} A_i B_j = \langle x_i, y_j \rangle,$$
where the first equality follows by direct calculation (see Exercise 2.12). □

Remark 11.9. As a by-product of the proof, we obtain the following extra information: Definition 11.6 is unchanged if we require the operators $X_i$, $Y_j$ to satisfy $X_i^2 = I$, $Y_j^2 = I$ and $\operatorname{Tr} X_i = \operatorname{Tr} Y_j = 0$ (cf. Remark 11.3). Moreover, the latter reduction can be performed in a “functorial” way which preserves many properties of $\rho$, see Exercise 11.7.

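Remark 11.10. Definitions 11.5 and 11.6 can be readily extended to the multipartite setting. One defines $\mathrm{LC}_{n_1,\dots,n_k} \subset \mathbb{R}^{n_1 \cdots n_k}$ as the set of arrays $(a_{i_1,\dots,i_k})$ of the form $a_{i_1,\dots,i_k} = \mathbf{E}\big[ X^{(1)}_{i_1} \cdots X^{(k)}_{i_k} \big]$ where all the $X^{(j)}_{i_j}$ are random variables with $|X^{(j)}_{i_j}| \le 1$ a.s., and $\mathrm{QC}_{n_1,\dots,n_k} \subset \mathbb{R}^{n_1 \cdots n_k}$ as the set of arrays $(a_{i_1,\dots,i_k})$ of the form $a_{i_1,\dots,i_k} = \operatorname{Tr}\big[ \rho\, (X^{(1)}_{i_1} \otimes \cdots \otimes X^{(k)}_{i_k}) \big]$ where all the $X^{(j)}_{i_j} \in B(H_j)$ are self-adjoint operators with $\|X^{(j)}_{i_j}\|_\infty \le 1$, and $\rho \in \mathrm{D}(H_1 \otimes \cdots \otimes H_k)$.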

Exercise 11.2 (Convexity of the set of quantum correlations). Show (directly
from the definition) that the set QC is convex.

Exercise 11.3 (Unit vectors suffice). Show that
$$\mathrm{QC}_{m,n} = \left\{ (\langle x_i, y_j \rangle)_{1 \le i \le m,\, 1 \le j \le n} : x_i, y_j \in \mathbb{R}^d,\ d \in \mathbb{N},\ |x_i| = 1,\ |y_j| = 1 \right\}.$$

Exercise 11.4 (The $2 \times 2$ local correlation polytope is an $\ell_1$-ball). Show that $\mathrm{LC}_{2,2}$, considered as a subset of $\mathbb{R}^4$, is congruent to $2 B_1^4$ (a ball of radius 2 in the $\ell_1$-norm).

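Exercise 11.5 (Local correlation polytope and the cut-norm). The cut-norm of a matrix $B \in M_{m,n}$ is defined as
$$\|B\|_{\mathrm{cut}} = \sup \left\{ \Big| \sum_{i \in I} \sum_{j \in J} b_{ij} \Big| : I \subset \{1, \dots, m\},\ J \subset \{1, \dots, n\} \right\}.$$
Show that $\|B\|_{\mathrm{cut}} \le \|B\|_{\mathrm{LC}^\circ_{m,n}} = \sup\{\operatorname{Tr} AB : A \in \mathrm{LC}_{m,n}\} \le 4 \|B\|_{\mathrm{cut}}$.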
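Before attempting the exercise it may help to compute both quantities by brute force for small matrices. The sketch below is an illustration only: the function names are ours, and the pairing $\operatorname{Tr} AB$ is read entrywise as $\sum_{i,j} a_{ij} b_{ij}$, so that, by Proposition 11.7, the supremum over $\mathrm{LC}_{m,n}$ is a maximum over sign vectors $\xi, \eta$.

```python
from itertools import product

def cut_norm(B):
    """Brute-force cut-norm: maximize |sum of b_ij over i in I, j in J|."""
    m, n = len(B), len(B[0])
    best = 0.0
    for I in product((0, 1), repeat=m):        # indicator vector of the row set I
        for J in product((0, 1), repeat=n):    # indicator vector of the column set J
            s = sum(B[i][j] for i in range(m) if I[i] for j in range(n) if J[j])
            best = max(best, abs(s))
    return best

def lc_polar_norm(B):
    """max over sign vectors xi, eta of sum_ij b_ij xi_i eta_j."""
    m, n = len(B), len(B[0])
    best = 0.0
    for xi in product((-1, 1), repeat=m):
        # For fixed xi the optimal eta_j is the sign of sum_i b_ij xi_i.
        val = sum(abs(sum(B[i][j] * xi[i] for i in range(m))) for j in range(n))
        best = max(best, val)
    return best

B = [[1, -1], [-1, 1]]
print(cut_norm(B), lc_polar_norm(B))
```

For this particular $B$ the two printed numbers differ by exactly the factor 4, so the constant in the right-hand inequality cannot be improved in general.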


Exercise 11.6 (Correlation polytopes and operator norms). Let $M \in M_{m,n}$ be a real matrix. Verify that $\|M\|_{\mathrm{LC}^\circ_{m,n}}$ equals $\|M : \ell_\infty^n \to \ell_1^m\|$. Similarly, $\|M\|_{\mathrm{QC}^\circ_{m,n}}$ equals $\|M : \ell_\infty^n(H) \to \ell_1^m(H)\|$, where $H$ is any real Hilbert space of dimension at least $\min\{m, n\}$.


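Exercise 11.7 (Trace zero measurements suffice). Let $(a_{ij})$ be a quantum correlation matrix defined by (11.2). Show that $(a_{ij})$ can be realized with a state $\tilde{\rho} = \rho \otimes \sigma \otimes \tau \in \mathrm{D}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2} \otimes \mathbb{C}^2 \otimes \mathbb{C}^2)$ (so that, in particular, $\tilde{\rho}$ is separable or PPT if $\rho$ was) and with operators $\tilde{X}_i \in B^{\mathrm{sa}}(\mathbb{C}^{d_1} \otimes \mathbb{C}^2)$, $\tilde{Y}_j \in B^{\mathrm{sa}}(\mathbb{C}^{d_2} \otimes \mathbb{C}^2)$ such that, in addition to $\|\tilde{X}_i\|_\infty \le 1$ and $\|\tilde{Y}_j\|_\infty \le 1$, we have also $\operatorname{Tr} \tilde{X}_i = \operatorname{Tr} \tilde{Y}_j = 0$ for all $i, j$. Moreover, it can be arranged that all $\tilde{X}_i$ and $\tilde{Y}_j$ are multiples of isometries if $X_i$, $Y_j$ were.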
Exercise 11.8 (Local correlation polytope on $k$ qubits is also an $\ell_1$-ball). Show that the set $\mathrm{LC}_{2,2,\dots,2} \subset \mathbb{R}^{2^k}$ (as defined in Remark 11.10) is a convex polytope with $2^{k+1}$ vertices and $2^{2^k}$ facets.
Exercise 11.9. Find the inradius and the outradius of the sets LC and QC.
Exercise 11.10. Show that the sets LC and QC have enough symmetries (in
the sense of Section 4.2.2).
11.2.2. Bell correlation inequalities and the Grothendieck constant.
In the context of correlation matrices, a Bell correlation inequality is a linear functional $\varphi : M_{m,n} \to \mathbb{R}$ with the property that $\varphi(A) \le 1$ for any classical correlation matrix $A \in \mathrm{LC}_{m,n}$. (We will discuss a more general setup in Section 11.3.) If we identify $M_{m,n}$ with its dual space, the set of Bell correlation inequalities becomes the polytope $\mathrm{LC}^\circ_{m,n}$ (the polar of $\mathrm{LC}_{m,n}$) and can be identified with $B_1^m \mathbin{\check{\otimes}} B_1^n$. Of particular interest are the extreme (or optimal) inequalities or, equivalently, the facets of $\mathrm{LC}_{m,n}$ (cf. Section 1.1.5).
A famous example of a Bell correlation inequality in the $2 \times 2$ case is the Clauser–Horne–Shimony–Holt or CHSH inequality $\varphi_{\mathrm{CHSH}}$, which is the linear functional $A \mapsto \frac{1}{2} \operatorname{Tr}(A M_{\mathrm{CHSH}})$, where
$$M_{\mathrm{CHSH}} = (m_{ij})_{i,j=1}^2 := \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}. \tag{11.3}$$
It is easily checked that $\frac{1}{2} M_{\mathrm{CHSH}} \in \mathrm{LC}^\circ_{2,2}$ since for any choice of $\xi, \eta \in \{-1,1\}^2$,
$$\xi_1 \eta_1 + \xi_1 \eta_2 + \xi_2 \eta_1 - \xi_2 \eta_2 \le 2. \tag{11.4}$$
Moreover, 8 of the 16 possible choices of $(\xi, \eta)$ saturate this bound.
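The bound (11.4) and the count of saturating choices can be confirmed by listing all 16 sign patterns. The following sketch (variable and function names are ours) does exactly that.

```python
from itertools import product

M_CHSH = [[1, 1], [1, -1]]

def chsh_sum(xi, eta):
    """Left-hand side of (11.4) for sign vectors xi, eta in {-1, 1}^2."""
    return sum(M_CHSH[i][j] * xi[i] * eta[j] for i in range(2) for j in range(2))

values = [chsh_sum(xi, eta)
          for xi in product((-1, 1), repeat=2)
          for eta in product((-1, 1), repeat=2)]

print(max(values))       # largest value of the left-hand side of (11.4)
print(values.count(2))   # number of pairs (xi, eta) attaining the value 2
```

The reason behind the count is visible from the identity $\xi_1(\eta_1 + \eta_2) + \xi_2(\eta_1 - \eta_2)$: exactly one of the two brackets equals $\pm 2$ and the other vanishes, so the sum is always $\pm 2$.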
Since, as we mentioned, the inclusion $\mathrm{LC}_{m,n} \subset \mathrm{QC}_{m,n}$ is strict (provided $m, n \ge 2$), it may happen that for a Bell correlation inequality $\varphi$ and a quantum correlation matrix $A \in \mathrm{QC}_{m,n}$, we have $\varphi(A) > 1$. In that case, we say that the Bell correlation inequality $\varphi$ is violated by $A$ and the quantity $\varphi(A)$ is called the violation or, more precisely, the quantum violation. This is, in particular, the case for the CHSH inequality. We have
Proposition 11.11 (CHSH violations, see Exercises 11.11–11.13). The maximal quantum violation of the CHSH inequality is $\sqrt{2}$, and no Bell correlation inequality for $2 \times 2$ correlation matrices yields a larger violation.
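That the value $\sqrt{2}$ is attained can be seen from the vector description of $\mathrm{QC}_{2,2}$ in Proposition 11.8. The sketch below uses one standard choice of unit vectors in $\mathbb{R}^2$ (our choice, not taken from the text) and evaluates $\varphi_{\mathrm{CHSH}}$ on the resulting Gram-type matrix; it does not address optimality, which is the content of Exercise 11.11.

```python
import math

M_CHSH = [[1, 1], [1, -1]]
s = 1 / math.sqrt(2)

# Unit vectors in R^2: x_1, x_2 orthonormal, y_1, y_2 obtained by a 45-degree rotation.
x = [(1.0, 0.0), (0.0, 1.0)]
y = [(s, s), (s, -s)]

def inner(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

# a_ij = <x_i, y_j> belongs to QC_{2,2} by Proposition 11.8.
a = [[inner(xi, yj) for yj in y] for xi in x]

# phi_CHSH(A) = (1/2) * sum_ij m_ij a_ij (M_CHSH is symmetric, so Tr(A M) is this sum).
phi = 0.5 * sum(M_CHSH[i][j] * a[i][j] for i in range(2) for j in range(2))
print(phi, math.sqrt(2))
```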
A remarkable fact is that violations of Bell correlation inequalities of arbitrary size cannot exceed a universal constant called the Grothendieck constant.
Theorem 11.12 (Grothendieck–Tsirelson, not proved here). There exists an absolute constant $K \ge 1$ such that, for any positive integers $m, n$, the following three equivalent conditions hold:
1° We have the inclusion
$$\mathrm{QC}_{m,n} \subset K\, \mathrm{LC}_{m,n}. \tag{11.5}$$
2° For any $m \times n$ real matrix $(m_{ij})$ and for any $\rho$, $X_i$, $Y_j$ verifying the conditions of Definition 11.6 we have
$$\sum_{i,j} m_{ij} \operatorname{Tr} \rho (X_i \otimes Y_j) \le K \max_{\xi \in \{-1,1\}^m,\, \eta \in \{-1,1\}^n} \sum_{i,j} m_{ij}\, \xi_i \eta_j. \tag{11.6}$$
3° For any $m \times n$ real matrix $(m_{ij})$ and for any (real) Hilbert space vectors $x_i$, $y_j$ with $|x_i| \le 1$, $|y_j| \le 1$ we have
$$\sum_{i,j} m_{ij} \langle x_i, y_j \rangle \le K \max_{\xi \in \{-1,1\}^m,\, \eta \in \{-1,1\}^n} \sum_{i,j} m_{ij}\, \xi_i \eta_j. \tag{11.7}$$

The traditional version of Grothendieck’s inequality is (11.7), the point being the existence of $K$ independent of $m$ and $n$ (not proved here). The equivalence of 3° with 2° (Tsirelson’s bound) is the content of Proposition 11.8. Finally, the equivalence 1° $\iff$ 2° is just duality combined with the “classical” Proposition 11.7.
The best constant $K$ such that (11.5)–(11.7) hold for any $m, n$ is called the (real) Grothendieck constant and denoted by $K_G$. The precise value of $K_G$ is not known; as of this writing, the best estimates are $1.6769 < K_G < \frac{\pi}{2 \ln(1 + \sqrt{2})} \approx 1.7822$. We also denote by $K_G^{(m,n)}$ the best constant in (11.5)–(11.7) for fixed $m, n$, and $K_G^{(n)} = K_G^{(n,n)}$. This should not be confused with the optimal constant in (11.7) under the restriction that $x_i$, $y_j$ live in an $n$-dimensional Hilbert space, which is denoted similarly by some authors. The values of all these and related “Grothendieck constants” are discussed in Exercises 11.13–11.17 and in Notes and Remarks.
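The numerical value of the upper estimate is easy to reproduce. The short computation below (variable names are ours) evaluates $\pi / (2 \ln(1 + \sqrt{2}))$ and places it next to the lower estimate quoted above and the $2 \times 2$ value $\sqrt{2}$ from Proposition 11.11.

```python
import math

# Upper estimate for K_G quoted in the text: pi / (2 ln(1 + sqrt 2)).
upper = math.pi / (2 * math.log(1 + math.sqrt(2)))

# Lower estimate quoted in the text, and the 2 x 2 value from Proposition 11.11.
lower = 1.6769
chsh = math.sqrt(2)

print(f"{chsh:.4f} < {lower:.4f} < K_G < {upper:.4f}")
```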

One sees immediately that the maximum on the right-hand side of (11.7) is the norm of the bilinear form
$$M = (m_{ij}) : \ell_\infty^m \times \ell_\infty^n \to \mathbb{R}. \tag{11.8}$$
Thus Proposition 11.7 is really an instance of the duality between the projective and injective tensor products (see Section 4.1.4 and particularly Exercise 4.18). Similarly, the maximum on the left-hand side of (11.7) is the norm of $M$ as a bilinear form on $\ell_\infty^m(H) \times \ell_\infty^n(H)$. In the setting of operator spaces, the latter quantity may be interpreted as the so-called completely bounded norm of the bilinear form (11.8) or, equivalently, the minimal tensor norm of $M$ in that category. In other words, the values of the Grothendieck constants and of the maximal violations of Bell correlation inequalities may be obtained by comparing two norms which naturally appear in the context of operator spaces. We will not go into the details of that theory (or even define precisely the concepts we mentioned above) since to do that at a reasonable level of diligence would require (at least) another chapter. Instead, we refer the interested reader to the excellent survey [PV16].
An important question, which has attracted a lot of attention over the last 20 or so years, is the characterization of states $\rho$ that may lead to nonlocal correlations. It is easy to see that if a state $\rho$ is separable, then any correlation matrix (11.2) belongs to the local polytope $\mathrm{LC}$. (A more general fact of this nature is discussed in Exercise 11.25.) In other words, entanglement is necessary (at least in the present context) for nonlocality. However, it is known to be insufficient [Wer89] and, with the goal of clarifying these issues, Peres asked in 1998 whether there is a link between locality and the PPT property. Various variants of the question have been answered, but the following most basic version is apparently still open (see also Remark 11.21 and Notes and Remarks on Section 11.3).
Problem 11.13 (Peres conjecture for correlation matrices). Can nonlocal correlations be obtained, in the sense of Definition 11.6, from a PPT state?
As we mentioned earlier, the facial structure of the polytope $\mathrm{LC}_{m,n}$ is, for large $m, n$, rather complicated. For example, we could not find in the literature an answer to the following simple question.
Problem 11.14 (How many Bell correlation inequalities are there?). How does the number of facets of $\mathrm{LC}_{n,n}$ grow with $n$? By general arguments (see Exercises 11.18 and 11.19) it follows that $\mathrm{LC}_{n,n}$ has at least $\exp(\Omega(n))$ facets. An upper bound of $n^{O(n^2)}$ facets can be derived from the theory of 0/1 polytopes, i.e., of polytopes which are the convex hull of a subset of $\{0,1\}^n$. (See [Zie00, BP01].)
Of course, an even more important problem is to characterize all facets/optimal Bell correlation inequalities modulo symmetries of $\mathrm{LC}_{n,n}$. However, most experts appear to think that, for large $n$, a satisfactory answer to such a question is unlikely.

Let us conclude this section with a result giving volume and mean width estimates for the sets of correlation matrices. We state them for classical correlations only, since similar estimates for quantum correlations follow formally via Theorem 11.12 (see, however, Problem 11.16).
Proposition 11.15. For $m, n \in \mathbb{N}$ we have
$$\Big( \frac{1}{\sqrt{2}} - o(1) \Big) \max(\sqrt{m}, \sqrt{n}) \le \operatorname{vrad}(\mathrm{LC}_{m,n}) \le w(\mathrm{LC}_{m,n}) \le \sqrt{\frac{2}{\pi}}\, (\sqrt{m} + \sqrt{n})\, \frac{\sqrt{mn}}{\kappa_{mn}}, \tag{11.9}$$
where $o(\cdot)$ indicates the behavior as $m, n \to \infty$. (Recall that the ratio $\sqrt{k}/\kappa_k$ decreases from $\sqrt{\pi/2}$ to 1 as $k$ increases from 1 to $\infty$, see Proposition A.1.)
Proof. The middle inequality is the Urysohn inequality (Proposition 4.15). To get the upper bound on the mean width, we use the Chevet–Gordon inequality (see Section 6.2.4.1) in the form from Exercise 6.49:
$$w_G(\mathrm{LC}_{m,n}) \le \sqrt{n}\, w_G(B_\infty^m) + \sqrt{m}\, w_G(B_\infty^n) = \sqrt{2/\pi}\, (m \sqrt{n} + n \sqrt{m}).$$
For the lower bound on the volume radius, we may assume $m \ge n$. We claim that (with the identification $M_{m,n} \leftrightarrow \mathbb{R}^{mn} \leftrightarrow (\mathbb{R}^n)^m$), we have
$$\frac{1}{\sqrt{2}} (B_2^n)^m \subset \mathrm{LC}_{m,n}. \tag{11.10}$$
Since the volume radius of $(B_2^n)^m$ is easy to calculate, namely
$$\operatorname{vrad}\big( (B_2^n)^m \big) = \frac{\operatorname{vol}(B_2^n)^{1/n}}{\operatorname{vol}(B_2^{mn})^{1/mn}} = \frac{\Gamma\big( \frac{mn}{2} + 1 \big)^{1/mn}}{\Gamma\big( \frac{n}{2} + 1 \big)^{1/n}}$$
by (B.3), the lower bound in (11.9) follows then readily from Stirling’s formula (as does an explicit nonasymptotic bound, should it be needed).


To establish (11.10), we note that, for $B \in M_{m,n}$,
$$\sup_{A \in \mathrm{LC}_{m,n}} \operatorname{Tr}(AB) = \sup_{\eta \in \{-1,1\}^n} \sum_{i=1}^m \Big| \sum_{j=1}^n b_{ij} \eta_j \Big| \ge \operatorname*{Ave}_{\eta \in \{-1,1\}^n} \sum_{i=1}^m \Big| \sum_{j=1}^n b_{ij} \eta_j \Big| \overset{(*)}{\ge} \frac{1}{\sqrt{2}} \sum_{i=1}^m \Big( \sum_{j=1}^n b_{ij}^2 \Big)^{1/2}, \tag{11.11}$$
where $(*)$ denotes an application of the optimal Khintchine inequality (Exercise 5.71, with the value $A_1 = 1/\sqrt{2}$ from [Sza76]). It remains to observe that the inequality between the first and the last term in (11.11) is an equivalent dual version of (11.10). □
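The dual form of the inclusion (11.10) says that the support function of $\mathrm{LC}_{m,n}$ at $B$ dominates $\frac{1}{\sqrt{2}} \sum_i \big( \sum_j b_{ij}^2 \big)^{1/2}$. The sketch below checks this numerically on a few small matrices. It is an illustration under our own conventions: the function names are hypothetical, and the pairing is taken entrywise, $\sum_{i,j} a_{ij} b_{ij}$, so that rows of $B$ correspond to the $m$ blocks of $(\mathbb{R}^n)^m$.

```python
import math
import random
from itertools import product

def lc_support(B):
    """sup of sum_ij a_ij b_ij over A in LC_{m,n}, via vertices xi_i * eta_j.

    For fixed eta the optimal xi_i is the sign of the i-th row sum against eta.
    """
    m, n = len(B), len(B[0])
    return max(
        sum(abs(sum(B[i][j] * eta[j] for j in range(n))) for i in range(m))
        for eta in product((-1, 1), repeat=n)
    )

def row_l2_sum(B):
    """Sum over rows of the Euclidean norm of the row."""
    return sum(math.sqrt(sum(b * b for b in row)) for row in B)

def satisfies_dual_bound(B, tol=1e-12):
    """Check lc_support(B) >= row_l2_sum(B) / sqrt(2), up to rounding."""
    return lc_support(B) >= row_l2_sum(B) / math.sqrt(2) - tol

rng = random.Random(0)  # fixed seed, for reproducibility only
samples = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)] for _ in range(50)]
print(all(satisfies_dual_bound(B) for B in samples))
```

A numerical check of this kind is of course no substitute for the Khintchine inequality; it only guards against misreading the indices.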
While Theorem 11.12 implies that $\operatorname{vrad}(\mathrm{LC}_{m,n}) \simeq \operatorname{vrad}(\mathrm{QC}_{m,n})$ uniformly in $m, n \in \mathbb{N}$, it is not clear how different the two volume radii can be. Here is a question whose flavor is similar to that of Problem 9.14.
Problem 11.16. Is there an absolute constant $c < 1$ such that, for every $n \ge 2$, $\operatorname{vrad}(\mathrm{LC}_{n,n}) \le c \operatorname{vrad}(\mathrm{QC}_{n,n})$? Even showing that the ratio $\operatorname{vol}(\mathrm{LC}) / \operatorname{vol}(\mathrm{QC})$ tends to 0 does not seem straightforward.
Exercise 11.11 (The CHSH bound). Show that $\sup\{\varphi_{\mathrm{CHSH}}(A) : A \in \mathrm{QC}_{2,2}\} = \sqrt{2}$.
Exercise 11.12 (CHSH is the only $2 \times 2$ Bell correlation inequality). By Exercise 11.4, the polytope $\mathrm{LC}_{2,2}$ has 16 facets. Show that the unit normals to these facets are (up to the sign) exactly the matrices that can be obtained by permuting the entries of either $\frac{1}{2} M_{\mathrm{CHSH}}$ or of $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$. Conclude that, up to the obvious symmetries, $\varphi_{\mathrm{CHSH}}$ is the only nontrivial $2 \times 2$ Bell correlation inequality.
Exercise 11.13 (The Grothendieck–Tsirelson bound). Show that the sequence $\big( K_G^{(n)} \big)_n$ increases to $K_G$ and that $K_G^{(2)} = \sqrt{2}$.
Exercise 11.14 (CHSH is the only $2 \times n$ Bell correlation inequality). Show that $K_G^{(2,n)} = \sqrt{2}$ for any $n \ge 2$.
Exercise 11.15 (CHSH is the only $3 \times 3$ Bell correlation inequality). Using the Matlab multi-parametric toolbox (or other software, or lots of time), it is routine to establish that $\mathrm{LC}_{3,3}$ has 90 facets. Using this information, show that, up to the obvious symmetries, $\varphi_{\mathrm{CHSH}}$ is the only nontrivial $3 \times 3$ Bell correlation inequality and deduce that $K_G^{(3)} = \sqrt{2}$.
Exercise 11.16. Show that $K_G^{(2)}$ coincides with the maximal ratio of $\|M : \ell_\infty^2(\mathbb{C}) \to \ell_1^2(\mathbb{C})\|$ and $\|M : \ell_\infty^2(\mathbb{R}) \to \ell_1^2(\mathbb{R})\|$, where $M$ varies over the set of real $2 \times 2$ matrices.

Exercise 11.17. Show that the complex Grothendieck constant (see (11.37) in Notes and Remarks for the definition) for $2 \times 2$ matrices equals 1.
Exercise 11.18 (Facial dimension of the local correlation polytope). Using Corollary 7.30, show that $\mathrm{LC}_{n,n}$ has $\exp(\Omega(n))$ facets. Moreover, for any fixed $\lambda > 1$, any polytope $P$ such that $P \subset \mathrm{LC}_{n,n} \subset \lambda P$ or $P \subset \mathrm{QC}_{n,n} \subset \lambda P$ has $\exp(\Omega(n))$ facets.
Exercise 11.19 (Facial dimension of the local correlation polytope, take #2). Combine Proposition 6.3, Theorem 4.17, Proposition 11.15 and Exercise 11.9 to show that $\mathrm{LC}_{n,n}$ has $\exp(\Omega(n))$ facets.



11.3. Boxes and games



This section outlines more general Bell inequalities described in the language
of boxes and games. It includes an explanation of how the original Grothendieck–
Bell setup fits into the broader framework, the CHSH inequality as a game, and
a presentation of several examples and special features such as no-signaling, PR-
boxes, and bounded or unbounded violations.
11.3.1. Bell inequalities as games. We start by rephrasing the CHSH inequality (11.4) as a game. The game involves two cooperating players, Alice and Bob, and a (fair but tough) referee. The players may use a strategy agreed upon in advance and may share some resources, but are not allowed to communicate during the game. At each round of the game, the referee provides Alice and Bob with inputs (or settings) $i$ and $j$, which can be 1 or 2, and each of them must respond with an output (respectively $\xi$ and $\eta$) which can be 1 or $-1$. Alice and Bob win if the product $\xi \eta$ equals $m_{ij}$, the $(i,j)$th entry of the CHSH matrix (11.3), and lose otherwise. The difficulty is that while Alice knows her setting $i \in \{1, 2\}$, she doesn’t know Bob’s setting $j$, and similarly with the roles reversed.

[Diagram: the Referee sends inputs $i$ and $j$ to Alice and Bob, who return outputs $\xi$ and $\eta$.]
Figure 11.1. Diagrammatic representation of a quantum game. Prior to the game, Alice and Bob can agree on some strategy which, in the quantum variant, may involve sharing a bipartite quantum state (as depicted by the wavy line). Once the game starts, they are no longer allowed to communicate. The referee sends privately input $i$ to Alice and input $j$ to Bob; Alice and Bob answer him privately with their outputs, respectively $\xi$ and $\eta$.

A deterministic strategy consists of two vectors $(\xi_i)_{i=1}^2, (\eta_j)_{j=1}^2 \in \{-1,1\}^2$ indicating players’ responses for all values of the inputs. If the amount won or lost in each round is 1, the winnings per round, averaged over all possible inputs $i, j$ (the value of the game), are
$$\frac{1}{4} \sum_{i,j} m_{ij}\, \xi_i \eta_j \le \frac{1}{2} \tag{11.12}$$
(this is the same as the bound of 2 from (11.4) after renormalization) and half of deterministic strategies saturate this bound. Consequently, the same bound holds, and is optimal, for random strategies involving choosing at each round a random pair of vectors $\big( \xi(\omega), \eta(\omega) \big)$ according to some distribution $p(\omega)$ (this requires shared randomness if the choices of Alice and Bob are not to be independent; such strategies, deterministic or random, are usually called local or classical). If we are interested instead in the probability of winning, the quantity to consider is the average of $\frac{1}{2} \big( 1 + m_{ij}\, \xi_i \eta_j \big)$, which yields a bound of $\frac{3}{4}$.
The reader may wonder whether the uniform distribution on the set of inputs
that is implicit in (11.12) is not rather arbitrary. However, it is not hard to verify
that such distribution faces the players with the toughest challenge. Similarly, there
is a random strategy that yields game value $\frac{1}{2}$ for any probability distribution on
the set of inputs that the referee may be using. (See Exercise 11.20.)
The quantum version of the CHSH game is very similar, except that rather than being deterministic or using shared randomness, the responses of Alice and Bob are based on measurements performed (locally on their respective sites $H_A$ and $H_B$) on a shared quantum state $\rho \in \mathrm{D}(H_A \otimes H_B)$. More precisely, for every setting $i$ of Alice (resp., $j$ of Bob) there is a pair of complementary projections $E_i^\xi$, $\xi = \xi_i \in \{-1, 1\}$, on $H_A$ (resp., $F_j^\eta$, $\eta = \eta_j \in \{-1, 1\}$, on $H_B$). If Alice receives from the referee the input $i$, she performs the projective measurement corresponding to $(E_i^\xi)_{\xi = \pm 1}$ and responds with the value of $\xi$ supplied by the outcome of the measurement, and similarly for Bob. According to the Born rule (3.8), if the referee provides Alice and Bob with inputs $(i, j)$, the probability of a pair of responses $(\xi, \eta)$ will be $\operatorname{Tr} \rho \big( E_i^\xi \otimes F_j^\eta \big)$. Consequently, for these inputs, the expected value of the CHSH game will be $m_{ij}$ (the corresponding entry of the payoff matrix $M_{\mathrm{CHSH}}$ from (11.3)) times
$$\sum_{\xi, \eta = \pm 1} \xi \eta \operatorname{Tr} \rho \big( E_i^\xi \otimes F_j^\eta \big) = \operatorname{Tr} \rho (X_i \otimes Y_j), \tag{11.13}$$
where $X_i = \sum_{\xi = \pm 1} \xi E_i^\xi = E_i^{+1} - E_i^{-1}$ and, similarly, $Y_j = F_j^{+1} - F_j^{-1}$. Averaging over all inputs $i$ and $j$, we obtain the value
$$\frac{1}{4} \sum_{i,j} m_{ij} \operatorname{Tr} \rho (X_i \otimes Y_j). \tag{11.14}$$

Comparing (11.14) with (11.12) and appealing to Proposition 11.11 we conclude that there exists a quantum game strategy (i.e., $\rho$, $\big( E_i^\xi \big)$, $\big( F_j^\eta \big)$) which yields the value of $\frac{\sqrt{2}}{2}$ (which is also optimal). This is substantially better than the value of $\frac{1}{2}$ that can be achieved with classical strategies (deterministic or random). Similarly, if we want to focus on the probability of winning the game, the quantum strategy yields $\frac{2 + \sqrt{2}}{4} \approx 0.8536$, which needs to be compared to the upper bound of $\frac{3}{4}$ for classical strategies that was calculated earlier. For a discussion of fine points of the optimality of this strategy see Exercise 11.21.
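To make the quantum strategy concrete, here is a minimal numerical sketch of one standard realization on two qubits. The specific observables (built from the Pauli matrices $Z$ and $X$) and the maximally entangled state are our choice for illustration and are not prescribed by the text; all helper names are hypothetical. Everything is real, so plain lists suffice.

```python
import math

M_CHSH = [[1, 1], [1, -1]]
s = 1 / math.sqrt(2)

Z = [[1, 0], [0, -1]]
X = [[0, 1], [1, 0]]

def combine(a, A, b, B):
    """Entrywise linear combination a*A + b*B of two 2 x 2 matrices."""
    return [[a * A[r][c] + b * B[r][c] for c in range(2)] for r in range(2)]

def kron(A, B):
    """Kronecker (tensor) product of two square matrices."""
    return [[A[i][j] * B[k][l] for j in range(len(A)) for l in range(len(B))]
            for i in range(len(A)) for k in range(len(B))]

def expectation(psi, O):
    """<psi| O |psi> for a real unit vector psi and a real matrix O."""
    d = len(psi)
    return sum(psi[a] * O[a][b] * psi[b] for a in range(d) for b in range(d))

# Alice's observables X_1, X_2 and Bob's observables Y_1, Y_2 (all with eigenvalues +-1).
alice = [Z, X]
bob = [combine(s, Z, s, X), combine(s, Z, -s, X)]

# Maximally entangled unit vector (|00> + |11>) / sqrt(2); rho = |psi><psi|.
psi = [s, 0.0, 0.0, s]

corr = [[expectation(psi, kron(Xi, Yj)) for Yj in bob] for Xi in alice]
value = sum(M_CHSH[i][j] * corr[i][j] for i in range(2) for j in range(2)) / 4
win_probability = (1 + value) / 2

print(value, win_probability)
```

The last line uses the fact that the probability of winning on inputs $(i,j)$ is $\frac{1}{2}(1 + m_{ij} \operatorname{Tr} \rho (X_i \otimes Y_j))$, exactly as in the classical computation.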
Exercise 11.20 (Optimality of the classical CHSH game strategies). (a) Show that if, in the CHSH game, the referee uses a non-uniform distribution on the set of inputs, then Alice and Bob have a deterministic strategy which gives a value strictly larger than $\frac{1}{2}$. (b) Describe all classical strategies of Alice and Bob that yield $\frac{1}{2}$ as the value of the CHSH game, irrespective of the probability distribution on the set of inputs used by the referee.
Exercise 11.21 (Optimality of the quantum CHSH game strategies). State and prove a quantum version of the preceding exercise.
11.3.2. Boxes and the nonsignaling principle. The scheme that we described above via the example of the CHSH game can be conceptualized and generalized using the language of boxes. A box is a family of joint probability distributions
$$P = \{ p(\cdot, \cdot \,|\, i, j) : 1 \le i \le m,\ 1 \le j \le n \}. \tag{11.15}$$
In the context of the two-player games described earlier, $p(\xi, \eta \,|\, i, j)$ is the probability that Alice and Bob respond with outputs $\xi, \eta$ when presented with inputs $i, j$. If the payoff corresponding to this scenario is $v(\xi, \eta, i, j)$, the (average) value of the game is
$$V = \frac{1}{mn} \sum_{\xi, \eta, i, j} p(\xi, \eta \,|\, i, j)\, v(\xi, \eta, i, j). \tag{11.16}$$

For the CHSH game (classical or quantum), we had
$$v(\xi, \eta, i, j) = m_{ij}\, \xi \eta \tag{11.17}$$
with $m_{ij}$’s given by (11.3) and $\xi, \eta$ taking values in $\{-1, 1\}$. In the general case, $\xi$ and $\eta$ are no longer binary and we will not require that they take the same number of values. We will assume throughout this section that $\xi \in \{1, \dots, k\}$ and $\eta \in \{1, \dots, l\}$. (In fact, in some scenarios it may even be natural to consider boxes with the number of possible outputs dependent on the particular input.)

While the payoff function $v$ can be a priori arbitrary, the probabilities implicit in the box $P$ reflect the players’ strategy and the resources available to them.
• Deterministic strategies (i.e., $\xi = f(i)$ and $\eta = g(j)$ for some functions $f$ and $g$) result in a deterministic box:
$$p(\xi, \eta \,|\, i, j) = \mathbf{1}_{\{\xi = f(i)\}}\, \mathbf{1}_{\{\eta = g(j)\}}. \tag{11.18}$$
• Random strategies result in product boxes:
$$p(\xi, \eta \,|\, i, j) = p(\xi \,|\, i)\, p(\eta \,|\, j), \tag{11.19}$$
where $p(\cdot \,|\, i) = p_A(\cdot \,|\, i)$ and $p(\cdot \,|\, j) = p_B(\cdot \,|\, j)$ are the (independent) marginals of the distribution $p(\cdot, \cdot \,|\, i, j)$.
• Random strategies with shared randomness result in local (or classical) boxes:
$$p(\xi, \eta \,|\, i, j) = \int_\Lambda p(\xi \,|\, i, \lambda)\, p(\eta \,|\, j, \lambda)\, d\mu(\lambda), \tag{11.20}$$
where $\lambda \in \Lambda$ is the (shared, knowingly or not) hidden variable, and $\mu$ a probability distribution on $\Lambda$.
• Quantum strategies result in quantum boxes:
$$p(\xi, \eta \,|\, i, j) = \operatorname{Tr} \rho \big( E_i^\xi \otimes F_j^\eta \big), \tag{11.21}$$
where $\rho$ is a quantum state shared by Alice and Bob and, for each $i$ (resp., $j$), $\big( E_i^\xi \big)_\xi$ (resp., $\big( F_j^\eta \big)_\eta$) is a POVM on Alice’s space $H_A$ (resp., Bob’s space $H_B$).
Let us denote the corresponding sets of boxes by $\mathrm{DB}$, $\mathrm{RB}$, $\mathrm{LB}$ and $\mathrm{QB}$. If there is a need to specify the dimensions involved, we use expressions such as $\mathrm{QB}_{k,l|m,n}$. Since the number of values taken by $\xi$ and $\eta$ is, respectively, $k$ and $l$, every box can be thought of as an element of $\mathbb{R}^{klmn}_+$ and we have
$$\mathrm{DB} \subset \mathrm{RB} \subset \mathrm{LB} \subset \mathrm{QB}. \tag{11.22}$$
The first inclusion is trivial and it is clear from the definition that $\mathrm{LB} = \operatorname{conv} \mathrm{RB}$; in particular $\mathrm{LB}$ is convex. (A moment of reflection, see Exercise 11.22, shows also that every product box is a mixture of deterministic boxes and so in fact $\mathrm{LB} = \operatorname{conv} \mathrm{DB}$.) The convexity of $\mathrm{QB}$ and the last inclusion in (11.22), which follows from it, are slightly less obvious (see Exercises 11.23, 11.25, 11.26 and Notes and Remarks for a discussion of these points and related issues). Except in trivial cases, the inclusion $\mathrm{LB} \subset \mathrm{QB}$ is strict; this follows, for example, from the fact that correlations can be retrieved from boxes (as in (11.16)–(11.17)) and from the inclusion $\mathrm{LC}_{m,n} \subset \mathrm{QC}_{m,n}$ being strict. Boxes that do not belong to $\mathrm{LB}$ are called nonclassical or nonlocal.
We next present a description of $\mathrm{LB}$ in the language of projective tensor products. First, consider the set of conditional marginal probability distributions
$$K_{k,m} := \{ p(\xi \,|\, i) : 1 \le \xi \le k,\ 1 \le i \le m \}, \tag{11.23}$$
i.e., of matrices $M = (m_{\xi,i}) \in M_{k,m}$ with nonnegative coefficients and columns summing to 1. Then $K_{k,m}$ is a convex compact set that canonically identifies with $\big( \Delta_{k-1} \big)^m \subset (\mathbb{R}^k)^m = \mathbb{R}^{km}$ and one sees (directly from the definitions) that
$$\mathrm{LB}_{k,l|m,n} = K_{k,m} \mathbin{\hat{\otimes}} K_{l,n}. \tag{11.24}$$
Due to the requirement that $p(\cdot, \cdot \,|\, i, j)$ be probability distributions, it is evident that the sets $\mathrm{DB}$, $\mathrm{RB}$, $\mathrm{LB}$ and $\mathrm{QB}$ are not full-dimensional in $\mathbb{R}^{klmn}$. The description (11.24) allows us to deduce that
$$\dim \mathrm{LB}_{k,l|m,n} = (\dim K_{k,m} + 1)(\dim K_{l,n} + 1) - 1 = mn(k-1)(l-1) + m(k-1) + n(l-1)$$
(see Exercise 11.27). The geometry of $\mathrm{QB}$ is not as transparent as that of $\mathrm{LB}$. To shed some light on it, let us consider a quantum box $P = \{ p(\xi, \eta \,|\, i, j) \} = \{ \operatorname{Tr} \rho \big( E_i^\xi \otimes F_j^\eta \big) \} \in \mathrm{QB}$ and, for given $i, j$, let us calculate the marginal density $p(\xi \,|\, i, j)$ of $p(\xi, \eta \,|\, i, j)$. We then obtain
$$p(\xi \,|\, i, j) = \sum_\eta p(\xi, \eta \,|\, i, j) = \sum_\eta \operatorname{Tr} \rho \big( E_i^\xi \otimes F_j^\eta \big) = \operatorname{Tr} \rho \big( E_i^\xi \otimes I_{H_B} \big) = \operatorname{Tr} \big( \rho_A E_i^\xi \big),$$
which doesn’t depend on $j$ (here $\rho_A = \operatorname{Tr}_{H_B} \rho$ is the partial trace, cf. (3.10)). Similarly, the marginal densities $p(\eta \,|\, i, j)$ do not depend on $i$. In other words, there exist distributions $p(\cdot \,|\, i) = p_A(\cdot \,|\, i)$, $i = 1, \dots, m$, and $p(\cdot \,|\, j) = p_B(\cdot \,|\, j)$, $j = 1, \dots, n$ such that, for every $i, j$, $p_A(\xi \,|\, i)$ and $p_B(\eta \,|\, j)$ are the marginals of $p(\xi, \eta \,|\, i, j)$, i.e.,
$$\sum_\eta p(\xi, \eta \,|\, i, j) = p_A(\xi \,|\, i) \quad \text{and} \quad \sum_\xi p(\xi, \eta \,|\, i, j) = p_B(\eta \,|\, j). \tag{11.25}$$

Let us reflect now on the operational significance of (11.25). If, for some $i$, the distributions $p(\xi \,|\, i, j)$ depended on $j$, then (by implementing the procedure determining her response $\xi$ to the input $i$ obtained from the referee) Alice would gain information about the input $j$ sent by the referee to Bob (complete information if the distributions $p(\cdot \,|\, i, j)$ were disjointly supported for distinct $j$, and some information if they were just different). This hypothetical event is usually interpreted as instant (or at least faster than light) signaling or communication and, consequently, the constraint (11.25) is usually referred to as the nonsignaling principle. (Actually, the arguably more appropriate interpretation may be that of precognition as nothing seems to forbid Alice from determining her response before (in the sense of being inside the past light cone) Bob determines his or, indeed, before Bob or even the referee knows the value of $j$. Note that while, in that case, Alice could in principle communicate her response to Bob, this has no effect on the statistics of her outputs.)
The set of boxes verifying (11.25) is called the nonsignaling polytope and we will denote it by $\mathrm{NSB}$. (It is indeed a polytope, being the intersection of an affine subspace of $\mathbb{R}^{klmn}$ with the cube $[0,1]^{klmn}$.) An analysis of the constraints shows that $\mathrm{LB}$ and $\mathrm{NSB}$ (and hence the intermediate set $\mathrm{QB}$) have the same dimension; see Exercises 11.28 and 11.29. For 2-output nonsignaling boxes (i.e., if $k = l = 2$, in which case one may assume that $\xi, \eta$ take values $\pm 1$) one can still define the corresponding correlation matrices by the formula that is (modulo different normalization) implicit in (11.16)–(11.17), namely
$$a_{ij} = \sum_{\xi, \eta = \pm 1} \xi \eta\, p(\xi, \eta \,|\, i, j). \tag{11.26}$$
We will denote the set of such matrices by $\mathrm{NSC}_{m,n}$.


Here is an important example of elements of $\mathrm{NSB}$, the so-called Popescu–Rohrlich boxes, or PR-boxes. Let $m = n = k = l = 2$ (a bipartite $2 \times 2$ system with binary outputs: $\xi, \eta = \pm 1$; $i, j = 1, 2$) and consider the box $P$ given by
$$p(\xi, \eta \,|\, i, j) = \begin{cases} \frac{1}{2} \mathbf{1}_{\{\xi \ne \eta\}} & \text{if } i = j = 2, \\ \frac{1}{2} \mathbf{1}_{\{\xi = \eta\}} & \text{otherwise.} \end{cases} \tag{11.27}$$
In other words, the joint distributions $p(\cdot, \cdot \,|\, i, j)$ are, respectively, either $\begin{bmatrix} 0 & \frac{1}{2} \\ \frac{1}{2} & 0 \end{bmatrix}$ or $\begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{bmatrix}$. Since all marginals $p_A(\cdot \,|\, i)$ and $p_B(\cdot \,|\, j)$ are identical, with probabilities of both outputs equal to $\frac{1}{2}$, it is immediately clear that $P$ is nonsignaling. It is also apparent that for each combination $(i, j)$ of inputs we have $\sum_{\xi, \eta = \pm 1} \xi \eta\, p(\xi, \eta \,|\, i, j) = m_{ij}$, where $(m_{ij})$ is given by (11.3). Accordingly, the value of the CHSH game (as given by (11.16)–(11.17)) is 1, as is the probability of winning. Since the analysis from Section 11.3.1 (based on Proposition 11.11) shows that the best value that can be achieved by a quantum strategy is $\frac{\sqrt{2}}{2}$, it follows that the PR-box cannot be realized as a quantum box via (11.21). This implies that the inclusion $\mathrm{QB} \subset \mathrm{NSB}$ is always proper.
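The three claims about the PR-box (nonsignaling marginals, correlations equal to $m_{ij}$, game value 1) can be verified mechanically with exact rational arithmetic. The sketch below follows (11.25)–(11.27); the function names and the dictionary encoding of $M_{\mathrm{CHSH}}$ are ours.

```python
from fractions import Fraction

OUTPUTS = (-1, 1)
INPUTS = (1, 2)
M_CHSH = {(1, 1): 1, (1, 2): 1, (2, 1): 1, (2, 2): -1}

def pr_box(xi, eta, i, j):
    """PR-box probabilities p(xi, eta | i, j) as in (11.27)."""
    half = Fraction(1, 2)
    if i == 2 and j == 2:
        return half if xi != eta else Fraction(0)
    return half if xi == eta else Fraction(0)

def marginal_alice(xi, i, j):
    return sum(pr_box(xi, eta, i, j) for eta in OUTPUTS)

def marginal_bob(eta, i, j):
    return sum(pr_box(xi, eta, i, j) for xi in OUTPUTS)

def is_nonsignaling():
    """Condition (11.25): Alice's marginal ignores j, Bob's marginal ignores i."""
    alice_ok = all(marginal_alice(xi, i, 1) == marginal_alice(xi, i, 2)
                   for xi in OUTPUTS for i in INPUTS)
    bob_ok = all(marginal_bob(eta, 1, j) == marginal_bob(eta, 2, j)
                 for eta in OUTPUTS for j in INPUTS)
    return alice_ok and bob_ok

def correlation(i, j):
    """Entry a_ij of the correlation matrix, formula (11.26)."""
    return sum(xi * eta * pr_box(xi, eta, i, j) for xi in OUTPUTS for eta in OUTPUTS)

# Value of the CHSH game, (11.16)-(11.17), with uniform inputs.
game_value = sum(M_CHSH[i, j] * correlation(i, j) for i in INPUTS for j in INPUTS) / 4

print(is_nonsignaling(), game_value)
```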
We will conclude this section by giving volume estimates for the sets of nonsignaling boxes $\mathrm{NSB}$ and sets of nonsignaling correlation matrices $\mathrm{NSC}_{m,n}$.
Proposition 11.17. For $k, l, m, n \in \mathbb{N}$ we have
$$\operatorname{vrad}(\mathrm{NSB}_{k,l|m,n}) = \Theta(\sqrt{mn}) \quad \text{and} \quad \operatorname{vrad}(\mathrm{NSC}_{m,n}) = \Omega(\sqrt{mn}). \tag{11.28}$$
Proof. Since $\mathrm{NSB} = [0,1]^{klmn} \cap H$, where $H \subset \mathbb{R}^{klmn}$ is the nonsignaling affine subspace (in the notation of Exercises 11.27–11.28, $H = V_{k,m} \mathbin{\hat{\otimes}} V_{l,n}$), the first relation follows almost immediately from Proposition 4.27. The only two additional points that need to be made are as follows. First, while $H$ doesn’t contain the center of the cube $[0,1]^{klmn}$, it does contain the point all of whose coordinates are $\frac{1}{4}$, the center of the cube $[0, \frac{1}{2}]^{klmn}$. Accordingly, $\operatorname{vol}_N(\mathrm{NSB}) \ge \big( \frac{1}{2} \big)^N$, where $N = \dim H = mn + m + n$ (by Exercise 11.27). Since, by the Brunn–Minkowski inequality, central sections are at least as large as (parallel) non-central sections, the upper bound from Proposition 4.27 works without change and yields $\operatorname{vol}_N(\mathrm{NSB}) \le 2^{(klmn - N)/2}$. The second point is that the dimension and the codimension of $H$ are of the same order, and so $\big( 2^{(klmn - N)/2} \big)^{1/N} = \Theta(1)$. It remains to combine the above estimates with the well-known asymptotic expression $\operatorname{vol}(B_2^N)^{1/N} \sim \sqrt{2 \pi e / N}$ (as $N \to \infty$, see Appendix B.1 and particularly Exercise B.1).
The second relation can be analyzed in a similar way. By definition, $\mathrm{NSC}_{m,n}$ is a linear image of $\mathrm{NSB}$, essentially a projection of a section of $[0,1]^{klmn}$. Since a projection of a section is larger than a section of a section, we get a lower bound. (The reason for “essentially” is that the vector $\xi \eta = (1, -1) \otimes (1, -1) \in \mathbb{R}^2 \otimes \mathbb{R}^2 \leftrightarrow \mathbb{R}^4$ is of norm 2 rather than 1.) □
Problem 11.18 (Volume radius and mean width of sets of boxes). In the
assertion of Proposition 11.17, can Ω be replaced by Θ? The argument given above
(combined with, say, Proposition 4.28) runs into complications if m and n are of
11.3. BOXES AND GAMES 289

very different orders. More generally, what are the asymptotic orders of the volume radii and mean widths of the sets of boxes LB, QB, NSB for arbitrary values of $k, l$? Some of the cases (e.g., LB, because of (11.24)) appear to be fairly straightforward consequences of the methods presented in this book, but some of the other ones seem to require further analysis.
Exercise 11.22. Show that every product box is a convex combination of deterministic boxes.
Exercise 11.23 (Convexity of the set of quantum boxes). Show that the set QB of quantum boxes is a convex subset of $\mathbb{R}^{klmn}_+$.
Exercise 11.24 (Pure states suffice). Show that in the definition of quantum boxes (11.21) we can require the state $\rho$ to be pure.
Exercise 11.25. Show that (i) $\mathrm{LB} \subset \mathrm{QB}$ and (ii) moreover, that every $P \in \mathrm{LB}$ can be realized as a quantum box (11.21) with $\rho$ separable.
Exercise 11.26. Show that if a quantum box $P$ can be written as $p(\xi,\eta|i,j) = \mathrm{Tr}(\rho(E_i^\xi \otimes F_j^\eta))$ with $\rho \in \mathrm{Sep}$, then $P \in \mathrm{LB}$.
Exercise 11.27 (The dimension of the set of local boxes). Show that $\dim \mathrm{LB} = mn(k-1)(l-1) + m(k-1) + n(l-1)$.
Exercise 11.28 (All sets of boxes have the same dimension). Show that $\dim \mathrm{QB} = \dim \mathrm{NSB} = \dim \mathrm{LB}$.
Exercise 11.29. Deduce the equality $\dim \mathrm{QB} = \dim \mathrm{LB}$ from the fact that $\dim \mathrm{D} = \dim \mathrm{Sep}$ (shown in Section 2.2.3).
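The dimension formula of Exercise 11.27 can be sanity-checked on small cases before attempting a proof. The sketch below (Python with NumPy) assumes that LB is the convex hull of the deterministic boxes $p(\xi,\eta|i,j) = \mathbf{1}_{\{\xi = x_i,\, \eta = y_j\}}$, so that its affine dimension equals the rank of the differences between deterministic boxes. The function names are hypothetical, and the check is a numerical experiment, not a solution of the exercise.

```python
import itertools
import numpy as np

def deterministic_boxes(k, l, m, n):
    """Yield every deterministic box as a flattened array of length k*l*m*n."""
    for x in itertools.product(range(k), repeat=m):       # Alice's answers
        for y in itertools.product(range(l), repeat=n):   # Bob's answers
            p = np.zeros((k, l, m, n))
            for i in range(m):
                for j in range(n):
                    p[x[i], y[j], i, j] = 1.0
            yield p.ravel()

def affine_dimension(points):
    """Dimension of the affine hull of a finite point set."""
    pts = np.array(list(points))
    return int(np.linalg.matrix_rank(pts[1:] - pts[0]))

def predicted_dimension(k, l, m, n):
    """The value claimed in Exercise 11.27."""
    return m * n * (k - 1) * (l - 1) + m * (k - 1) + n * (l - 1)
```

For instance, `affine_dimension(deterministic_boxes(2, 2, 2, 2))` can be compared with `predicted_dimension(2, 2, 2, 2)`. The enumeration has $k^m l^n$ terms, so only small parameters are practical.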

11.3.3. Bell violations. Consider a linear functional $V$ on $\mathbb{R}^{mnkl}$ (sometimes called a “Bell functional” or a “Bell expression”). It can be written as
(11.29) $V(P) = \sum_{\xi,\eta,i,j} p(\xi,\eta|i,j)\, v(\xi,\eta,i,j)$.
Except for the normalizing factor, which was removed to reduce the clutter, this is the same as the average value of a game defined in (11.16). The local (or classical) optimal value of $V$ is defined as
(11.30) $\omega_L(V) = \max\{|V(P)| : P \in \mathrm{LB}\}$.
(We will always tacitly assume that $\omega_L(V) > 0$, i.e., that $V \notin \mathrm{LB}^\perp = \mathrm{NSB}^\perp$.)
In this context, a Bell inequality is an inequality of the kind $|V(\cdot)| \leq \omega_L(V)$. If a (necessarily nonlocal) box $P$ satisfies $|V(P)| > \omega_L(V)$, one says that the Bell inequality is violated and the ratio $|V(P)|/\omega_L(V)$ is called the violation. Similarly, the quantum and nonsignaling optimal values of $V$ are defined as
(11.31) $\omega_Q(V) = \sup\{|V(P)| : P \in \mathrm{QB}\}$, $\quad \omega_{NS}(V) = \max\{|V(P)| : P \in \mathrm{NSB}\}$.
Finally, $\max_V \omega_Q(V)/\omega_L(V)$ is called the maximal quantum violation (for the particular values of $m, n, k, l$; more precisely, quantum-to-classical or quantum-to-local violation), and similarly for violations involving nonsignaling boxes. For example, the discussion following the definition (11.27) of PR-boxes shows that, for the CHSH game, nonsignaling-to-classical violations can be as large as 2 (see Exercise 11.33 and cf. Proposition 11.24). All these parameters have nice functional-analytic interpretations, see Exercise 11.31.
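For small scenarios the local value (11.30) can be computed by brute force, since the maximum of $|V|$ over LB is attained at a deterministic box. The following sketch (Python with NumPy) does this and evaluates the CHSH functional on the PR-box. The coefficient array layout `v[a, b, i, j]`, the CHSH payoff $v(\xi,\eta,i,j) = m_{ij}\xi\eta$ with the standard sign matrix, and all function names are assumptions made for illustration.

```python
import itertools
import numpy as np

SIGNS = np.array([1, -1])

def bell_value(v, p):
    """V(P) as in (11.29): sum of p(xi, eta | i, j) * v(xi, eta, i, j)."""
    return float(np.sum(v * p))

def local_value(v):
    """omega_L(V) of (11.30), maximizing |V| over all deterministic boxes."""
    k, l, m, n = v.shape
    best = 0.0
    for x in itertools.product(range(k), repeat=m):
        for y in itertools.product(range(l), repeat=n):
            value = sum(v[x[i], y[j], i, j] for i in range(m) for j in range(n))
            best = max(best, abs(float(value)))
    return best

# CHSH functional without the normalizing factor (assumed form of the payoff).
M = np.array([[1, 1], [1, -1]])
v_chsh = np.einsum("a,b,ij->abij", SIGNS, SIGNS, M).astype(float)

# PR-box of (11.27): outputs differ exactly when both inputs are the second one.
pr = np.zeros((2, 2, 2, 2))
for a, b, i, j in itertools.product(range(2), repeat=4):
    if (a != b) == (i == 1 and j == 1):
        pr[a, b, i, j] = 0.5

violation = bell_value(v_chsh, pr) / local_value(v_chsh)
```

With these conventions the ratio `violation` equals 2, matching the nonsignaling-to-classical violation quoted for the CHSH game.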
As in the case of the CHSH game, the reader may wonder whether the uniform distribution on the set of inputs implicit in the definition of $V(P)$, and hence indirectly in (11.30)–(11.31), is justified. While for some “balanced” Bell functionals it will be true that (as for the CHSH game, see Exercise 11.20) the von Neumann–Nash-type equilibrium indeed involves the uniform distribution, this will not be universally the case. However, there is a simple trick that allows one to sidestep this issue: a game with the distribution $\pi(i,j)$ on input settings and the payoff function $v(\xi,\eta,i,j)$ is equivalent to the game with the payoff function $mn\,\pi(i,j)\,v(\xi,\eta,i,j)$ and the uniform distribution. In other words, considering uniform distributions on sets of inputs covers all possible scenarios: it is just one of many essentially equivalent ways of parameterizing the set of all possible Bell functionals. However, in some situations a moment of reflection will be needed; since, for example, the optimal $\pi(i,j)$'s for the local, quantum and nonsignaling strategies may be different, one has to be sure that one does not compare “apples to oranges.”
As we will see later, measurement schemes involving boxes may lead to arbitrarily large violations. However, this is not the case for boxes with 2 outcomes (i.e., when $k = l = 2$). The reason is that sets of 2-outcome boxes are closely related to sets of correlations introduced in Section 11.2. This is particularly clear when one compares the set $\mathrm{LC}_{m,n}$ of classical/local correlations, which, by Proposition 11.7, identifies canonically with $B_\infty^m \mathbin{\hat{\otimes}} B_\infty^n \subset \mathbb{R}^m \otimes \mathbb{R}^n \leftrightarrow \mathbb{R}^{mn}$, and the corresponding set $\mathrm{LB}_{2,2|m,n}$ of local boxes, which, by (11.24), identifies with $K_{2,m} \mathbin{\hat{\otimes}} K_{2,n} = (\Delta_1)^m \mathbin{\hat{\otimes}} (\Delta_1)^n \subset \mathbb{R}^{2m} \otimes \mathbb{R}^{2n} \leftrightarrow \mathbb{R}^{4mn}$. In other words, $\mathrm{LC}_{m,n}$ is the projective tensor product of two 0-symmetric cubes, while $\mathrm{LB}_{2,2|m,n}$ is the projective tensor product of two similar cubes, but contained in spaces twice their dimension and centered at the point all of whose coordinates are $\frac{1}{2}$.

Proposition 11.19. If $k = l = 2$ then for any Bell expression $V$, we have $\omega_Q(V) \leq K_G\, \omega_L(V)$, where $K_G$ is the Grothendieck constant. If, additionally, $m = n = 2$, then $K_G$ can be replaced by $\sqrt{2}$.

Proof. Assume that the labels $\xi$ and $\eta$ belong to $\{-1,1\}$ rather than $\{1,2\}$. The maximum in $\omega_L(V)$ is achieved on an extreme point of LB, i.e., on a deterministic box that is of the form (cf. (11.18))
$p(\xi,\eta|i,j) = \mathbf{1}_{\{\xi = x_i,\, \eta = y_j\}} = \frac{1}{4}(1 + \xi x_i)(1 + \eta y_j)$
for some vectors $x \in \{-1,1\}^m$ and $y \in \{-1,1\}^n$. We can then write
$V(P) = \sum_{i=1}^{m} \sum_{j=1}^{n} \alpha_{i,j} x_i y_j + \sum_{i=1}^{m} \beta_i x_i + \sum_{j=1}^{n} \gamma_j y_j + \delta$
with $\alpha_{i,j} = \mathrm{Ave}[\xi\eta\, v(\xi,\eta,i,j)]$, $\beta_i = \mathrm{Ave}[\xi\, v(\xi,\eta,i,j)]$, $\gamma_j = \mathrm{Ave}[\eta\, v(\xi,\eta,i,j)]$ and $\delta = \mathrm{Ave}[v(\xi,\eta,i,j)]$. (In each formula, $\mathrm{Ave}$ is a shortcut for $\frac{1}{4}\sum$ over all indices among $i, j, \xi, \eta$ not appearing on the left of the equation.) We can gather all these quantities in a single $(m+1) \times (n+1)$ matrix by defining $\alpha_{i,n+1} = \beta_i$, $\alpha_{m+1,j} = \gamma_j$ and $\alpha_{m+1,n+1} = \delta$ and obtain
(11.32) $\omega_L(V) = |V(P)| = \max\left\{ \left| \sum_{i=1}^{m+1} \sum_{j=1}^{n+1} \alpha_{ij} a_{ij} \right| : (a_{ij}) \in \mathrm{LC}_{m+1,n+1} \right\}$.

Consider now a quantum box $P' \in \mathrm{QB}_{2,2|m,n}$, of the form $p'(\xi,\eta|i,j) = \mathrm{Tr}\, \rho(E_i^\xi \otimes F_j^\eta)$. Using the same notation as before and setting $X_i = E_i^1 - E_i^{-1}$ and $Y_j = F_j^1 - F_j^{-1}$ as in (11.13)–(11.14), we can write
$V(P') = \sum_{i=1}^{m} \sum_{j=1}^{n} \alpha_{i,j}\, \mathrm{Tr}\, \rho(X_i \otimes Y_j) + \sum_{i=1}^{m} \beta_i\, \mathrm{Tr}\, \rho(X_i \otimes I) + \sum_{j=1}^{n} \gamma_j\, \mathrm{Tr}\, \rho(I \otimes Y_j) + \delta = \sum_{i=1}^{m+1} \sum_{j=1}^{n+1} \alpha_{i,j}\, \mathrm{Tr}\, \rho(X_i \otimes Y_j)$,
where in the last sum we defined $X_{m+1} = I$ and $Y_{n+1} = I$. It now follows that
(11.33) $|V(P')| \leq \max\left\{ \left| \sum_{i=1}^{m+1} \sum_{j=1}^{n+1} \alpha_{ij} a_{ij} \right| : (a_{ij}) \in \mathrm{QC}_{m+1,n+1} \right\}$.
Since $P' \in \mathrm{QB}$ was arbitrary, the first statement of the Proposition follows by comparing (11.33) with (11.32) and appealing to Theorem 11.12. For the second statement, we note that if $m = n = 2$, then Theorem 11.12 will be used for $3 \times 3$ matrices and so $K_G$ may be replaced by $K_G^{(3)} = \sqrt{2}$, see Exercises 11.13–11.15. $\square$
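The reduction used in the proof (rewriting a 2-output Bell functional on a deterministic box as a bilinear form in a bordered $(m+1) \times (n+1)$ matrix) can be checked numerically. The sketch below, in Python with NumPy, draws a random payoff array and a random deterministic box and compares the direct value of $V(P)$ with the expanded and the bordered forms. The array layout, the random test data and the helper names are illustrative assumptions.

```python
import numpy as np

SIGNS = np.array([1.0, -1.0])        # labels xi, eta in {+1, -1}

def bell_coefficients(v):
    """alpha, beta, gamma, delta; Ave = (1/4) * sum over the missing indices."""
    alpha = np.einsum("a,b,abij->ij", SIGNS, SIGNS, v) / 4
    beta = np.einsum("a,abij->i", SIGNS, v) / 4
    gamma = np.einsum("b,abij->j", SIGNS, v) / 4
    delta = v.sum() / 4
    return alpha, beta, gamma, delta

def deterministic_box(x, y):
    """p[a, b, i, j] = 1 if SIGNS[a] == x[i] and SIGNS[b] == y[j], else 0."""
    px = (SIGNS[:, None] == x[None, :]).astype(float)
    py = (SIGNS[:, None] == y[None, :]).astype(float)
    return np.einsum("ai,bj->abij", px, py)

def check_decomposition(m, n, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.normal(size=(2, 2, m, n))
    x = rng.choice(SIGNS, size=m)
    y = rng.choice(SIGNS, size=n)
    alpha, beta, gamma, delta = bell_coefficients(v)
    direct = float(np.sum(v * deterministic_box(x, y)))
    expanded = float(x @ alpha @ y + beta @ x + gamma @ y + delta)
    # Bordered matrix: last column holds beta, last row holds gamma, corner delta.
    bordered_matrix = np.block([[alpha, beta[:, None]],
                                [gamma[None, :], np.array([[delta]])]])
    bordered = float(np.append(x, 1.0) @ bordered_matrix @ np.append(y, 1.0))
    return direct, expanded, bordered
```

The three returned numbers agree up to rounding, which is the identity behind (11.32); the extra coordinate equal to 1 plays the role of $X_{m+1} = I$ and $Y_{n+1} = I$ in the quantum case.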
Remark 11.20. The argument shows that the violations of bipartite $n$-input, 2-output boxes do not exceed $K_G^{(n+1)}$ (and similarly for “rectangular” boxes, i.e., $m \neq n$). Still, the matrices $(a_{ij})$ that appear in (11.33) have a special structure and so it is conceivable that the bound $K_G^{(n)}$ works, too. However, this is unlikely to be a matter of a formal algebraic reduction since, for example, the setting of 3-input, 2-output boxes leads to optimal Bell inequalities, which are not present in the context of $3 \times 3$ correlation matrices (the so-called $I_{3322}$ inequalities).

Remark 11.21. Since the proof of Proposition 11.19 translates violations for 2-output quantum boxes to quantum violations for correlation matrices associated with the same state $\rho$, it follows that Problem 11.13, i.e., the Peres conjecture for correlation matrices, is formally equivalent to the analogous problem for boxes. However, if we allow three outputs in one of the boxes, the answer is known: there is an example of a PPT state producing violations, even with $\dim \mathcal{H}_A = \dim \mathcal{H}_B = 3$.


As we mentioned earlier, measurement schemes involving more general boxes may lead to arbitrarily large violations. This may happen for two reasons: either the system is not bipartite (i.e., it involves three or more parties) or the outputs are not binary. These two situations are exemplified by the following pair of results. Recall that $\mathrm{LC}_{n_1,\ldots,n_k}$ and $\mathrm{QC}_{n_1,\ldots,n_k}$ are the $k$-partite generalizations of the sets of classical and quantum correlation matrices, see Remark 11.10 for details.
Proposition 11.22 (not proved here). Denote by $K_G^{(n_1,\ldots,n_k)}$ the best constant $K$ such that the inclusion $\mathrm{QC}_{n_1,\ldots,n_k} \subset K\, \mathrm{LC}_{n_1,\ldots,n_k}$ holds. Then $K_G^{(n,n,n)} = \Omega(n^{1/4} (\log n)^{-3/2})$ and $K_G^{(n_1,n_2,n_3)} \leq K_G \min\{n_1, n_2, n_3\}^{1/2}$.

Proposition 11.23 (not proved here). For any Bell functional $V$ we have $\omega_Q(V)/\omega_L(V) \leq K_G^{\mathbb{C}} \sqrt{kl}$ (independently of the values of $m, n$). On the other hand, if $l = 2$ and $m, n = 2^k$, then there exists $V$ such that $\omega_Q(V)/\omega_L(V) = \Omega\left(\sqrt{k}/\log^2 k\right)$.

Above $K_G^{\mathbb{C}}$ stands for the complex Grothendieck constant; see Notes and Remarks for a precise definition and for estimates. Both propositions can be understood as statements about comparing different norms on tensor products of operator spaces. (The identification (11.24) gives one hint why this may be the case.)
The existence of large nonsignaling-to-classical violations is much easier to establish. We have
Proposition 11.24. In the class of boxes with binary outputs (i.e., $k = l = 2$), the maximal nonsignaling-to-classical violation satisfies
$\max_V \omega_{NS}(V)/\omega_L(V) = \Omega\left(\min(\sqrt{m}, \sqrt{n})\right)$.
Moreover, the same bound holds for violations involving correlation matrices.
Proof. Combine Propositions 11.15 and 11.17. One way to take care of fine points is to use Urysohn's inequality to deduce that $w(\mathrm{NSC}_{m,n}) = \Omega(\sqrt{mn})$ and then compare it to the upper bound $w(\mathrm{LC}_{m,n}) = O(\sqrt{m} + \sqrt{n})$ from (11.9). This leads to a nonsignaling-to-classical violation (of correct order) of some Bell correlation inequality and shows the second (and hence the first) statement. $\square$
We conclude the section by introducing another concept which quantifies nonlocality and which is, in a sense, a generalization of the geometric distance between sets (of boxes). Given $P \in \mathrm{NSB}$ we define the local fraction (or classical fraction) of $P$ as
(11.34) $p_L = p_L(P) := \max\{t \in [0,1] : P \in t\,\mathrm{LB} + (1-t)\,\mathrm{NSB}\}$.
The quantity $p_{NL} := 1 - p_L$ is the nonlocal fraction. Similar parameters can be defined for other pairs in place of LB, NSB. For example, replacing in (11.34) LB by DB, the set of deterministic boxes (defined by (11.18)), leads to the notion of fraction of determinism.
Clearly $P \in \mathrm{LB}$ iff $p_L = 1$. Therefore, by the Hahn–Banach separation theorem, whenever $p_{NL} > 0$, there exists a Bell functional $V$ such that $V(P) > \omega_L(V)$ (i.e., $P$ violates some Bell inequality). However, the size of the violation cannot be immediately ascertained. What can be quantified, though, is the relationship between different types of violations. We have
Proposition 11.25. Let $P \in \mathrm{NSB}$ and let $V$ be a Bell functional. Then
(11.35) $\dfrac{V(P)}{\omega_L(V)} - 1 \leq p_{NL} \left( \dfrac{\omega_{NS}(V)}{\omega_L(V)} - 1 \right)$.
Proof. By definition, there is a local box $P'$ and a nonsignaling box $P''$ such that $P = p_L P' + p_{NL} P''$ and consequently
$V(P) = p_L V(P') + p_{NL} V(P'') \leq p_L\, \omega_L(V) + p_{NL}\, \omega_{NS}(V)$,
which is equivalent to the asserted inequality (11.35). $\square$
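Read from right to left, inequality (11.35) gives a lower bound on the nonlocal fraction of $P$ as soon as one Bell functional with $\omega_{NS}(V) > \omega_L(V)$ has been evaluated on $P$. The small Python helper below makes this explicit; its name and signature are hypothetical. The usage values are those of the unnormalized CHSH functional, for which $\omega_L = 2$ and $\omega_{NS} = 4$ (consistent with the violation 2 quoted from Exercise 11.33).

```python
def nonlocal_fraction_lower_bound(value, omega_local, omega_ns):
    """Lower bound on p_NL implied by (11.35).

    value        -- V(P) for the box under consideration
    omega_local  -- omega_L(V), assumed strictly positive
    omega_ns     -- omega_NS(V), assumed strictly larger than omega_L(V)
    """
    if not (omega_ns > omega_local > 0):
        raise ValueError("need omega_NS(V) > omega_L(V) > 0")
    bound = (value / omega_local - 1.0) / (omega_ns / omega_local - 1.0)
    # The bound is informative only when P violates the Bell inequality.
    return min(1.0, max(0.0, bound))

# Mixture of the PR-box (weight t) and the uniform box (weight 1 - t):
# the unnormalized CHSH functional takes the value 4 * t on it.
t = 0.75
bound_for_mixture = nonlocal_fraction_lower_bound(4 * t, 2.0, 4.0)
```

For the pure PR-box ($t = 1$) the bound equals 1, so its local fraction is 0; for $t \leq 1/2$ the bound is vacuous.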
The meaningful case in (11.35) is when $0 < p_{NL} < 1$. We can then conclude that while $P$ violates some Bell inequalities, the violation is always noticeably smaller than the nonsignaling violation $\omega_{NS}(V)/\omega_L(V)$, uniformly over all $V$ for which that ratio is strictly greater than 1.
An interesting and somewhat surprising setting when one has a nontrivial lower bound on the local fraction is for (bipartite) quantum boxes with $\min\{m,n\} = 2$. We have then
Theorem 11.26 (not proved here). Consider a two-player game setup with $n = 2$ (i.e., two input settings at Bob's site) and arbitrary (but fixed) $m, k, l$. Then
(11.36) $\inf\{p_L(P) : P \in \mathrm{QB}_{k,l|m,2}\} \geq c$,
where $c > 0$ is a constant that depends only on $k$ and $l$ (but neither on $m$ nor on the dimensions of the underlying Hilbert spaces). The same is true about the fraction of determinism.
Theorem 11.26, in combination with Exercise 11.33, provides an alternative argument that the PR-box cannot be realized as a quantum box. The same reasoning works for any bipartite setup with $\min\{m,n\} = 2$ and any box which yields the optimal nonsignaling value for any Bell functional $V$ such that $\omega_{NS}(V) > \omega_L(V)$ (so, while more involved and less sharp, the present argument is very general).
The assertion of Theorem 11.26 does not hold when both players have 3 or more settings. This is because in that case there exist the so-called pseudotelepathy quantum games, i.e., the games that can be won with probability 1 using quantum strategies, while no foolproof classical strategy is possible. Consequently, if $P$ is the corresponding quantum box and $V$ is the probability of winning, then $V(P) = 1 = \omega_{NS}(V)$, while $\omega_L(V) < 1$, and so it follows from (11.35) that $p_L = 1 - p_{NL} = 0$. An outline of one such game, the Mermin–Peres magic square game, is given in Exercise 11.35.
Exercise 11.30 (Linear vs. affine Bell inequalities). Show that definitions (11.30) and (11.31) yield the same value if we allow $V$ to vary over all affine functionals and not just over linear functionals.


Exercise 11.31 (Violations, symmetrizations, and the geometric distance).

Verify that ωL pV q “ }V }K ˝ , where K “ LB , the “cylindrical” symmetrization of


K, and similarly for QB and NSB. Deduce that the maximal quantum violation

equals dg pLB , QB q. (See (4.6) and (4.1) for definitions.)

Exercise 11.32 (Violations and widths). (i) Let $\delta = \max_u \dfrac{w(\mathrm{QB},u) - w(\mathrm{QB},-u)}{w(\mathrm{LB},u) - w(\mathrm{LB},-u)}$ be the maximal ratio of widths of QB and LB (see Section 4.3.3). Show that the maximal quantum violation is contained between $\delta$ and $2\delta - 1$. (ii) State and prove an analogous statement for NSB.
Note: It follows that the ratio of widths is an alternative measure of violation


equivalent (up to a factor of 2) to the one based on values. Observe that, by

Exercise 11.31, we would have equality—and not just equivalence—if we used LB ,


QB in place of LB, QB in the definition of δ.

Exercise 11.33 (Nonsignaling value of the CHSH game). Show that the nonsignaling value of the CHSH game is 2 and deduce that the maximal nonsignaling violation for $m = n = k = l = 2$ is 2.
Exercise 11.34 (Quantum box for the CHSH game). Give an explicit example of an ensemble $\left(\rho, (E_i^\xi), (F_j^\eta)\right)$ which induces, via (11.21), a quantum box giving the optimal violation of the CHSH game.


Exercise 11.35 (The magic square game).
(i) Verify that the self-adjoint operators on $\mathbb{C}^2 \otimes \mathbb{C}^2$ given in Table 11.1 have the following properties:
(a) the operators in each row commute and the same is true for each column;
(b) the composition of the entries in each row is $I$, while the composition of the entries in each column is $-I$.

Table 11.1. The magic square game.

  $\sigma_x \otimes I$           $I \otimes \sigma_x$           $\sigma_x \otimes \sigma_x$
  $-\sigma_x \otimes \sigma_z$   $-\sigma_z \otimes \sigma_x$   $\sigma_y \otimes \sigma_y$
  $I \otimes \sigma_z$           $\sigma_z \otimes I$           $\sigma_z \otimes \sigma_z$

(ii) Show that there is no $3 \times 3$ table consisting of numbers such that the product of the entries in each row is 1, while the product of the entries in each column is $-1$.
(iii) The Mermin–Peres magic square game is played as follows. The number of input settings is $m = n = 3$ and the outputs are strings of $\pm 1$ of length 3. An additional restriction is that the product of elements of Alice's string must be 1, while the product of elements of Bob's string must be $-1$ (so, in effect, $k = l = 4$). If the input settings communicated to Alice and Bob were $(i,j)$, Alice and Bob win if their output strings, placed respectively in the $i$th row and the $j$th column, coincide on the common $ij$-th entry, and lose otherwise. Show that
(a) there is no deterministic (and hence classical) winning strategy,
(b) the following is a winning quantum strategy. Alice and Bob share a 4-qubit quantum state $\varphi^+ \otimes \varphi^+$, where $\varphi^+ = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ is a Bell state, with the first qubit of each copy of $\varphi^+$ going to Alice and the second to Bob. Given input $i$, Alice measures her part of the state in a basis in which the (commuting) operators from the $i$th row are simultaneously diagonal, and answers the corresponding triple of eigenvalues. Given input $j$, Bob does the same thing using the $j$th column.
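Parts (i) and (ii) of Exercise 11.35 lend themselves to a quick machine check, which can be useful for catching sign errors when copying Table 11.1. The sketch below (Python with NumPy) builds the nine operators from the Pauli matrices, tests the commutation and product relations, and enumerates all $\pm 1$ tables for part (ii), assuming the entries there may be restricted to $\pm 1$. The variable and function names are illustrative, and the check does not replace the requested proofs.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

# Table 11.1, row by row.
TABLE = [
    [np.kron(SX, I2), np.kron(I2, SX), np.kron(SX, SX)],
    [-np.kron(SX, SZ), -np.kron(SZ, SX), np.kron(SY, SY)],
    [np.kron(I2, SZ), np.kron(SZ, I2), np.kron(SZ, SZ)],
]
ROWS = TABLE
COLUMNS = [list(col) for col in zip(*TABLE)]

def commute(a, b):
    return bool(np.allclose(a @ b, b @ a))

def lines_satisfy(lines, sign):
    """Each line must consist of commuting operators with product sign * I."""
    for ops in lines:
        if not all(commute(a, b) for a, b in itertools.combinations(ops, 2)):
            return False
        if not np.allclose(ops[0] @ ops[1] @ ops[2], sign * np.eye(4)):
            return False
    return True

def classical_table_exists():
    """Search all 3x3 tables of +1/-1 with row products 1 and column products -1."""
    for entries in itertools.product([1, -1], repeat=9):
        t = np.array(entries).reshape(3, 3)
        if np.all(t.prod(axis=1) == 1) and np.all(t.prod(axis=0) == -1):
            return True
    return False
```

Here `lines_satisfy(ROWS, 1)` and `lines_satisfy(COLUMNS, -1)` correspond to properties (a) and (b), while `classical_table_exists()` returning `False` reflects the parity obstruction behind (ii): the product of all nine entries would have to equal both $1$ and $-1$.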

Notes and Remarks


Section 11.1. The argument that the proper mathematical home of Bell inequalities is operator space theory was most explicitly put forward in [JPPG+10].
For a proof of Theorem 11.4, we refer the reader to [Por81] (Theorem 13.68) or [Kir76]. There is a huge gap between that Theorem and Lemma 11.1, which both yield subspaces of dimension $\Theta(\log n)$ and the optimal ratio $\|\cdot\|_{op}/\|\cdot\|_{HS} \equiv 1/\sqrt{n}$, and the subspaces given by Dvoretzky's theorem, which feature $\|\cdot\|_{op}/\|\cdot\|_{HS} \approx 2/\sqrt{n}$ and are of dimension $\Theta(n)$ (see Theorem 7.37; for sharpness, see Exercise 7.25). Accordingly, we suggest the following problem.



Problem 11.27. Given $\lambda \geq 1$, denote by $d(n,\lambda)$ the maximal dimension of a subspace $E \subset \mathsf{M}_n(\mathbb{R})$ such that, for any $M \in E$,
$\dfrac{1}{\sqrt{n}} \|M\|_{HS} \leq \|M\|_\infty \leq \dfrac{\lambda}{\sqrt{n}} \|M\|_{HS}$.
It follows from Lemma 11.1 that, for $\lambda > 1$, $d(n,\lambda) = \Omega(\log n)$. Is this sharp for $\lambda \in (1,2]$? (For $\lambda > 2$, we have $d(n,\lambda) = \Theta_\lambda(n)$.)
Note that while Lemma 11.1 addresses only the case when $n$ is a power of 2, one can readily deduce that (for $\lambda > 1$ and for arbitrary $n$) $d(n,\lambda) \geq 2\log_2 n - C(\lambda)$. (Consider $E = F \otimes I_m$ where $F \subset \mathsf{M}_{2^k}$ is the subspace from Lemma 11.1, for appropriate $k, m$ with $2^k m \leq n < 2^k(m+1)$.) Note, however, that $d(n,1) = 1$ if (and only if) $n$ is odd, see Exercise 11.1.
Section 11.2. Proposition 11.7 is probably folklore, and Proposition 11.8 is due to Tsirelson [Cir80, Tsi85, Tsi93]. Theorem 11.12 is known as Grothendieck's inequality [Gro53a] and its reformulation via correlation matrices is also due to Tsirelson. The paper [Gro53a] went largely unnoticed for 15 years until it was “brought to the mathematical mainstream” by Lindenstrauss and Pełczyński in [LP68]. In particular, the elementary formulation (11.7) comes from [LP68]. For a beautiful recent survey about Grothendieck's inequality, including historical background and far-reaching generalizations, see [Pis12a].

Concerning values of constants $K_G^{(m,n)}$ for specific $m, n$, we have $K_G^{(2,n)} = \sqrt{2}$ for all $n$ and $K_G^{(3,n)} = \sqrt{2}$ for all $n$ (the latter is stated without proof in [FR94] and attributed ultimately to Kemperman's interpretation of results of Garg [Gar83]; see also [BM08] on which Exercise 11.15 is based). The approach that was used to calculate $K_G^{(3)}$ in Exercise 11.15 can be in principle replicated for larger dimensions, but the computational complexity of the problem increases very fast. It was implemented in [Li] to show rigorously that $K_G^{(4)} = \sqrt{2}$ (there are two new Bell correlation inequalities that appear in the $4 \times 4$ context, but neither of them leads to a violation that is $\sqrt{2}$ or larger). Other values of $K_G^{(m,n)}$ seem to be unknown. Various aspects of this circle of ideas, including in particular the significance of the constant $\sqrt{2}$, are discussed in [For10] and [FR94]. The CHSH inequality was introduced in [CHSH69].

One may also define $K_G^{[n]}$ as the best constant such that (11.7) holds for every matrix $(a_{ij})$ of arbitrary size and all vectors $x_i, y_j \in \mathbb{R}^n$. An easy observation is that $K_G^{(n)} \leq K_G^{[n]}$. While $K_G^{[2]} = \sqrt{2}$ [Kri79], the value of $K_G^{[3]}$ seems unknown; see [BNV16, HQV+16] for recent lower and upper bounds.
The Grothendieck constant introduced in the text is the real Grothendieck constant. It has a complex counterpart defined as the smallest constant $K_G^{\mathbb{C}}$ such that for any complex matrix $(m_{ij})$ of arbitrary size $m \times n$ and any unit vectors $x_i, y_j$ in a complex Hilbert space, we have
(11.37) $\left| \sum_{i,j} m_{ij} \langle x_i, y_j \rangle \right| \leq K_G^{\mathbb{C}} \max_{\xi \in \mathbb{T}^m,\, \eta \in \mathbb{T}^n} \left| \sum_{i,j} m_{ij} \xi_i \eta_j \right|$,
where $\mathbb{T}$ denotes the set of complex numbers of unit modulus. The best estimates are $1.338\ldots < K_G^{\mathbb{C}} < 1.405\ldots$, which in particular imply $K_G^{\mathbb{C}} < K_G$ (see [Pis12a] for more information and references). Somewhat surprisingly, for $2 \times 2$ matrices the complex Grothendieck inequality holds with constant 1, see Exercise 11.17 (based on [BM08]). For larger dimensions the optimal values of the constants do not seem to be known.
The argument from Exercise 11.8 is from [WW01a]. The description of the extremal Bell correlation inequalities (extreme points of $\mathrm{LC}^\circ$, or equivalently faces of LC) has attracted a lot of attention, see the website [@4].
Section 11.3. For more information on quantum boxes and Bell inequalities we refer the readers to the surveys [PV16] and [BCP+14]. Older valuable references include [Pit89] and [WW01b].

Some authors reserve the term “value of the game” for payoff functions that are nonnegative. (Of course, any finite payoff function can be made nonnegative via an offset, but that makes a difference when we calculate the ratios of values for different strategies, as we do.) A 2-output game for which the payoff function is of the form (11.17) for some $(m_{ij})$ (or, perhaps, slightly more generally, $m_{ij}\xi\eta + n_{ij}$, which allows, in particular, talking about $\frac{1}{2}(\xi\eta + 1)$, the probability of winning the game) is called an XOR game. This is because when we think of the outputs as Boolean data $a, b \in \{0,1\}$, the value of the game depends only on the “exclusive or” value $a \oplus b$. XOR games can also be defined for more than two players; their study is essentially equivalent to that of correlation matrices. It should be noted that while for local correlation matrices and boxes the link to the projective tensor product works perfectly (as in Proposition 11.7 and (11.24)), the correspondence to operator space tensor products in the quantum setting is slightly less satisfactory once we leave the setting of XOR games. This is pointed out, e.g., in section IV.B of [PV16]: while we still can, with some work, come up with two-sided estimates, constants larger than 1 do appear. It would be very useful to come up with a natural construction (such as the use of cylindrical symmetrizations in Exercise 11.31) which allows one to bypass this complication.
It is known [AIIS04] that determining whether a box is local is NP-complete, even for the class of boxes with 2 outputs, and similarly for correlation matrices. This is established via a connection to the concept of the cut polytope associated to a graph $G = (V,E)$, which is a polytope in $\mathbb{R}^E$ defined as
$\mathrm{conv}\{(\delta_S(e))_{e \in E} : S \subset V\}$,
where $\delta_S(e) = 1$ if the edge $e$ has one endpoint in $S$ and one endpoint in $V \setminus S$, and 0 otherwise. It can be checked that $\mathrm{LC}_{m,n}$ is affinely equivalent to the cut polytope of the complete bipartite graph $K_{m,n}$ (cf. the comments on contextuality at the end of these notes) and that $\mathrm{LB}_{2,2|m,n}$ is affinely equivalent to the cut polytope of the complete tripartite graph $K_{m,n,1}$. For more information on cut polytopes we refer the reader to [DL97].

It is unknown whether the set QB of quantum boxes is closed. A closely related question is known as Tsirelson's problem and has to do with how quantum physics models locality: we may define a set $\mathrm{QB}'$ as the set of boxes $p(\xi,\eta|i,j)$ of the form
$p(\xi,\eta|i,j) = \langle \psi | \bar{E}_i^\xi \bar{F}_j^\eta | \psi \rangle$,
where $\psi$ is a unit vector in a Hilbert space $\mathcal{H}$, and, for every $i$ and $j$, $(\bar{E}_i^\xi)_\xi$ and $(\bar{F}_j^\eta)_\eta$ are POVMs on $\mathcal{H}$ which satisfy the commutation condition $\bar{E}_i^\xi \bar{F}_j^\eta = \bar{F}_j^\eta \bar{E}_i^\xi$ for any $i, j, \xi, \eta$. (It is crucial here to allow $\mathcal{H}$ to be infinite-dimensional.) To check that $\mathrm{QB} \subset \mathrm{QB}'$, simply take $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$, $\bar{E}_i^\xi = E_i^\xi \otimes I$ and $\bar{F}_j^\eta = I \otimes F_j^\eta$. A natural question is whether QB or its closure $\overline{\mathrm{QB}}$ are equal to $\mathrm{QB}'$ (the set $\mathrm{QB}'$ can be checked to be closed, see Proposition 3.4 in [Fri12]). It was proved in a series of papers [JNP+11, Fri12, Oza13] that the equality $\overline{\mathrm{QB}} = \mathrm{QB}'$ is equivalent to Connes' embedding problem on von Neumann algebras. On the other hand, Slofstra proved [Slo16] using techniques from group theory that $\mathrm{QB} \subsetneq \mathrm{QB}'$.
The $I_{3322}$ inequalities appeared in [Fro81] and the terminology was introduced in [CG04]. PR-boxes are usually credited to [PR94], where they were studied in some detail, but they make an appearance already in [KT85, Tsi85].

A concept in the spirit of local/nonlocal fraction was introduced in [BKP06]. Fraction of determinism appears in [JHH+15], which also contains Theorem 11.26 and its proof.
The Peres conjecture was stated in a somewhat vague form in [Per99]. A more rigorous mathematical formulation and interesting positive partial results can be found in a series of papers [WW00, WW01a, WW01b]. The example of a quantum box mentioned in Remark 11.21, based on a PPT state on $\mathbb{C}^3 \otimes \mathbb{C}^3$ and disproving the Peres conjecture for bipartite systems, was given in [VB14]. An earlier example in the multipartite setting was given in [Dür01]. See the discussion in [VB14] and in section III.A of [BCP+14] for more on the relationship between nonlocality and entanglement, and for many more references.

The fact that the multipartite analogue of Theorem 11.12 does not hold has been known for some time. In the present context, unboundedness of $K_G^{(n_1,n_2,n_3)}$ as $n_1, n_2, n_3$ tend to infinity was shown in [PGWP+08]. Quantitative estimates were obtained later in [Pis12b]. Proposition 11.22, with a slightly worse power of logarithm, appeared in [BV13]; the version stated here is from [PV16]. Proposition 11.23 is from [PY15]. See also [JP11]; more references can be found in [PV16].
The form of the Mermin–Peres magic square game given in Exercise 11.35 follows largely [Ara04]. Another (more explicit but less transparent) exposition can be found in [BBT05]. Other demonstrations of pseudotelepathy are based on versions of the Kochen–Specker theorem [KS67], which involves the concept of contextuality. Contextuality, or rather noncontextuality, is a generalization of locality. For example, a two-party scenario allows one to perform measurements indexed by pairs $\{(i,j)\}$, where $i$ and $j$ identify respectively local POVMs of Alice and Bob; this can be represented by a complete bipartite graph $K_{m,n}$. By contrast, the more general scenario permits a general hypergraph: the observables are still represented by vertices, with the hyperedges corresponding to their subsets that can be performed (simultaneously or sequentially) without mutually affecting the outcomes of other observables in the subset. See [BBT05] for more details and examples and [CSW14] for sophisticated links to graph theory.


CHAPTER 12

POVMs and the Distillability Problem

This last chapter consists of two parts which are linked by the central role played by the concept of POVMs, but are otherwise largely independent. The first part deals with the norms that are associated with POVMs and which are intimately related to zonoids. This connection allows us to derive a sparsification result for POVMs. The second part also uses the language of POVMs, but is focused on the distillability problem, a major unsolved problem in quantum information theory.

12.1. POVMs and zonoids
12.1.1. Quantum state discrimination. What happens when a quantum system in a state $\rho$ is measured with a POVM $\mathrm{M}$? We only focus on the case of a discrete POVM $\mathrm{M} = (M_i)_{1 \leq i \leq N}$ (continuous POVMs could then be treated by approximation).
We know from Born's rule (3.13) that the outcome $i$ is obtained with probability $\mathrm{Tr}(\rho M_i)$. This simple formula can be used to quantify the efficiency of a POVM to perform the task of state discrimination. State discrimination can be described as follows: a quantum system is prepared in an unknown state which is either $\rho$ or $\sigma$ (both hypotheses being a priori equally likely), and we have to guess the unknown state.
After measuring it with the POVM $\mathrm{M} = (M_i)_{1 \leq i \leq N}$, the outcome $i$ occurs with probability $p_i = \mathrm{Tr}(\rho M_i)$ if the unknown state is $\rho$ and with probability $q_i = \mathrm{Tr}(\sigma M_i)$ if the unknown state is $\sigma$. Consequently, the optimal strategy is as follows: when outcome $i$ is observed, guess $\rho$ if $p_i > q_i$ and guess $\sigma$ if $p_i < q_i$ (and use any rule if $p_i = q_i$). The probability of failure is then
$\mathbf{P}(\text{failure}) = \dfrac{1}{2} \sum_{i=1}^{N} \min(p_i, q_i) = \dfrac{1}{2} - \dfrac{1}{4} \sum_{i=1}^{N} |p_i - q_i|$.
2 i“1 2 4 i“1

It is convenient to introduce the distinguishability (semi-)norm $\|\cdot\|_{\mathrm{M}}$ defined for $\Delta \in B^{sa}(\mathcal{H})$ by
(12.1) $\|\Delta\|_{\mathrm{M}} = \sum_{i=1}^{N} |\mathrm{Tr}(\Delta M_i)|$.
Note that $\|\cdot\|_{\mathrm{M}}$ is a norm if and only if $\mathrm{span}\{M_i : 1 \leq i \leq N\} = B^{sa}(\mathcal{H})$, which requires in particular $N \geq (\dim \mathcal{H})^2$. Since $\mathbf{P}(\text{failure}) = \frac{1}{2} - \frac{1}{4}\|\rho - \sigma\|_{\mathrm{M}}$, this norm can be used to quantify the performance of POVMs for state discrimination.
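The relation between the failure probability and the distinguishability norm (12.1) is easy to test on a concrete example. The sketch below (Python with NumPy) evaluates both sides for a qubit measured in the computational basis. The choice of states and POVM, as well as the function names, are illustrative assumptions.

```python
import numpy as np

def outcome_probabilities(state, povm):
    """Born's rule: probability Tr(state M_i) of each outcome i."""
    return np.array([np.trace(state @ m).real for m in povm])

def distinguishability_norm(delta, povm):
    """The (semi-)norm (12.1): sum over i of |Tr(delta M_i)|."""
    return float(sum(abs(np.trace(delta @ m).real) for m in povm))

def failure_probability(rho, sigma, povm):
    """Optimal guessing rule: error probability (1/2) * sum_i min(p_i, q_i)."""
    p = outcome_probabilities(rho, povm)
    q = outcome_probabilities(sigma, povm)
    return 0.5 * float(np.minimum(p, q).sum())

def trace_norm(delta):
    """Sum of absolute eigenvalues of a self-adjoint matrix."""
    return float(np.abs(np.linalg.eigvalsh(delta)).sum())

# Example: rho = |0><0|, sigma = |+><+|, measured in the computational basis.
rho = np.array([[1.0, 0.0], [0.0, 0.0]])
sigma = np.array([[0.5, 0.5], [0.5, 0.5]])
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

failure = failure_probability(rho, sigma, povm)
norm_m = distinguishability_norm(rho - sigma, povm)
```

In this example `failure` equals $\frac{1}{2} - \frac{1}{4}\,$`norm_m`, and `norm_m` does not exceed `trace_norm(rho - sigma)`, in line with the first part of Exercise 12.1.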
Exercise 12.1 (The Helstrom bound). Show that, for any POVM $\mathrm{M}$, we have $\|\cdot\|_{\mathrm{M}} \leq \|\cdot\|_1$. Conversely, show that for any pair of states $\rho, \sigma \in \mathrm{D}(\mathcal{H})$ there is a POVM $\mathrm{M}$ such that $\|\rho - \sigma\|_{\mathrm{M}} = \|\rho - \sigma\|_1$. This gives operational meaning to the trace norm distance between quantum states; the optimal inequality $\mathbf{P}(\text{failure}) \geq \frac{1}{2} - \frac{1}{4}\|\rho - \sigma\|_1$ is known as the Helstrom bound for quantum hypothesis testing.

12.1.2. Zonotope associated to a POVM. Given a POVM $\mathrm{M}$, we denote by $B_{\mathrm{M}} = \{\|\cdot\|_{\mathrm{M}} \leq 1\}$ the unit ball for the distinguishability norm, and by $K_{\mathrm{M}} = (B_{\mathrm{M}})^\circ$ its polar, i.e.,
$K_{\mathrm{M}} = \{A \in B^{sa}(\mathcal{H}) : \mathrm{Tr}(AB) \leq 1 \text{ whenever } \|B\|_{\mathrm{M}} \leq 1\}$.
The set $K_{\mathrm{M}}$ is a compact convex set. Moreover, $K_{\mathrm{M}}$ has nonempty interior if and only if $\|\cdot\|_{\mathrm{M}}$ is a norm. It follows from the inequality $\|\cdot\|_{\mathrm{M}} \leq \|\cdot\|_1$ that $K_{\mathrm{M}}$ is always included in the unit ball for the operator norm.
The following proposition characterizes the convex sets that can be obtained by means of this construction.

Proposition 12.1. Let $K \subset B^{sa}(\mathcal{H})$ be a symmetric closed convex set. Then the following are equivalent.
(i) $K$ is a zonotope such that $K \subset \{\|\cdot\|_\infty \leq 1\}$ and $\pm I \in K$.
(ii) There exists a POVM $\mathrm{M}$ on $\mathcal{H}$ such that $K = K_{\mathrm{M}}$.
Zonotopes were defined in Section 4.1.3 and briefly discussed in Section 7.2.6.4; the insight implicit in the above Proposition permits us to relate the ideas and the techniques outlined in those sections to the task of state discrimination.
Proof of Proposition 12.1. For a POVM $\mathrm{M} = (M_i)_{1 \le i \le N}$, we claim that
(12.2) $K_{\mathrm{M}} = [-M_1, M_1] + \cdots + [-M_N, M_N].$
Indeed, denoting by $L$ the right-hand side of (12.2), we have for every $A \in B^{\mathrm{sa}}(\mathcal{H})$
$$\|A\|_{L^\circ} = \sup\{\operatorname{Tr}(AB) : B \in L\} = \sum_{i=1}^N |\operatorname{Tr}(A M_i)| = \|A\|_{K_{\mathrm{M}}^\circ},$$
so that $L = K_{\mathrm{M}}$. Conversely, suppose that $K$ is a zonotope as in (i). By definition, there are operators $(M_i)_{1 \le i \le N}$ such that
$$K = [-M_1, M_1] + \cdots + [-M_N, M_N].$$
The hypotheses imply that $\mathrm{I}$ is an extreme point of $K$. Any extreme point of $K$ has the form $\pm M_1 \pm \cdots \pm M_N$, and therefore by changing $M_i$ into $-M_i$ if necessary, we may assume that
$$\mathrm{I} = M_1 + \cdots + M_N.$$
For every $1 \le i \le N$, we have $\mathrm{I} - M_i \in K$ and thus $\|\mathrm{I} - M_i\|_\infty \le 1$. Therefore $M_i$ is positive, and $\mathrm{M} = (M_i)_{1 \le i \le N}$ is a POVM such that $K_{\mathrm{M}} = K$. □
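The identity used in the first half of the proof, namely that the support function of the zonotope $[-M_1, M_1] + \cdots + [-M_N, M_N]$ at $A$ equals $\sum_i |\operatorname{Tr}(A M_i)|$, can be checked by brute force on a small example. The sketch below assumes a hypothetical four-outcome qubit POVM and maximizes over the sign combinations $\pm M_1 \pm \cdots \pm M_N$, which include all vertices of the zonotope.

```python
import itertools
import numpy as np

def support_by_vertices(A, povm):
    """Maximum of Tr(A B) over all sign combinations B = sum_i eps_i M_i."""
    best = -np.inf
    for signs in itertools.product((-1.0, 1.0), repeat=len(povm)):
        B = sum(s * M for s, M in zip(signs, povm))
        best = max(best, np.trace(A @ B).real)
    return best

def distinguishability_norm(A, povm):
    """Return sum_i |Tr(A M_i)|, as in (12.1)."""
    return sum(abs(np.trace(A @ M)) for M in povm)

# Hypothetical POVM: half-weighted projectors onto |0>, |1>, |+>, |->.
kets = [np.array([1.0, 0.0]), np.array([0.0, 1.0]),
        np.array([1.0, 1.0]) / np.sqrt(2), np.array([1.0, -1.0]) / np.sqrt(2)]
povm = [0.5 * np.outer(k, k) for k in kets]
A = np.array([[0.3, 0.2], [0.2, -0.7]])   # an arbitrary self-adjoint test matrix
lhs = support_by_vertices(A, povm)
rhs = distinguishability_norm(A, povm)
```

The maximum of a linear functional over a sum of segments is attained by choosing each sign independently, which is why the enumeration reproduces the sum of absolute values.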

12.1.3. Sparsification of POVMs. We are going to show that POVMs can be sparsified, i.e., approximated by POVMs with few outcomes. The terminology “approximation” refers here to the associated distinguishability norms: a POVM $\mathrm{M}$ is considered to be $\varepsilon$-close to a POVM $\mathrm{M}'$ when their distinguishability norms satisfy inequalities of the form
(12.3) $(1 - \varepsilon)\|\cdot\|_{\mathrm{M}} \le \|\cdot\|_{\mathrm{M}'} \le (1 + \varepsilon)\|\cdot\|_{\mathrm{M}} .$
As an immediate consequence of Theorem 7.48 about approximation of zonoids by zonotopes, we obtain a result about sparsification of POVMs: given any POVM $\mathrm{M}$, we can produce a POVM $\mathrm{M}'$ with relatively few outcomes which performs almost as well as $\mathrm{M}$ for state discrimination.

Theorem 12.2. There is a constant $C$ such that the following holds: for every POVM $\mathrm{M} = (M_i)_{1 \le i \le N}$ on $\mathbb{C}^n$ and every $\varepsilon \in (0, 1)$, there exists another POVM $\mathrm{M}' = (M'_j)_{1 \le j \le N'}$ with $N' \le C n^2 \log n / \varepsilon^2$ outcomes such that
(12.4) $(1 - \varepsilon)\|\cdot\|_{\mathrm{M}} \le \|\cdot\|_{\mathrm{M}'} .$
Proof. Consider the convex set $K_{\mathrm{M}} \subset \mathsf{M}_n^{\mathrm{sa}}$, which is a zonoid by Proposition 12.1. By Theorem 7.48, there is a zonotope
$$Z = [-A_1, A_1] + \cdots + [-A_{N'}, A_{N'}]$$
with $A_i$ being positive operators, $N' \le C n^2 \log n / \varepsilon^2$, and such that $(1 - \varepsilon) K_{\mathrm{M}} \subset Z \subset K_{\mathrm{M}}$. (The positivity of $A_i$ follows from the last sentence in Theorem 7.48.) Define $A_0 = \mathrm{I} - (A_1 + \cdots + A_{N'})$. Note that $A_0$ is positive since $Z \subset K_{\mathrm{M}} \subset S_\infty^{n,\mathrm{sa}}$ (the unit ball for the operator norm). It follows that $\mathrm{M}' := (A_0, A_1, \ldots, A_{N'})$ is a POVM such that $K_{\mathrm{M}'} \supset Z \supset (1 - \varepsilon) K_{\mathrm{M}}$, and therefore $\|\cdot\|_{\mathrm{M}'} \ge (1 - \varepsilon)\|\cdot\|_{\mathrm{M}}$ as claimed. □

Remark 12.3. The one-sided inequality (12.4) in Theorem 12.2 is the meaningful half of (12.3) since we want the sparsified POVM to be not weaker than the initial one. However, it is natural to wonder whether one can insist on a two-sided inequality as in (12.3). This seems to require an extra argument.
12.2. The distillability problem
In this section we discuss the distillability problem, one of the most important open problems connected to entanglement.
Consider a bipartite Hilbert space $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ shared between two parties customarily called Alice and Bob. For any integer $n \ge 1$, the Hilbert space $\mathcal{H}^{\otimes n}$ is also considered as a bipartite Hilbert space by identifying it with $\mathcal{H}_A^{\otimes n} \otimes \mathcal{H}_B^{\otimes n}$. Whenever we mention separability, partial transpose, LOCC, ... for states or channels on $\mathcal{H}^{\otimes n}$, it is always understood as relative to the $A : B$ bipartition.

12.2.1. State manipulation via LOCC channels. Given bipartite states $\rho \in D(\mathcal{H}_A \otimes \mathcal{H}_B)$ and $\sigma \in D(\mathcal{H}'_A \otimes \mathcal{H}'_B)$, we write $\rho \rightsquigarrow \sigma$ if, for any $\varepsilon > 0$, there is an integer $n$ and an LOCC quantum channel $\Phi : B((\mathcal{H}_A \otimes \mathcal{H}_B)^{\otimes n}) \to B(\mathcal{H}'_A \otimes \mathcal{H}'_B)$ such that
$$\big\| \Phi(\rho^{\otimes n}) - \sigma \big\|_1 \le \varepsilon .$$
In words, this property is referred to as “$\sigma$ can be distilled from (multiple copies of) $\rho$.”
We are going to discuss this notion without giving a precise definition of LOCC quantum channels. We only need to know that the class of LOCC channels is stable under composition (which implies, together with the result from Exercise 2.31, that the relation $\rightsquigarrow$ is transitive), that (see Section 2.3.4.8)
$$\operatorname{conv}\{\text{product channels}\} \subset \{\text{LOCC channels}\} \subset \{\text{separable channels}\},$$
and that the local filtering operation is LOCC: given a state $\rho$ on $\mathcal{H}_A \otimes \mathcal{H}_B$, POVMs $(P_i)_{i \in I}$ on $\mathcal{H}_A$ and $(Q_j)_{j \in J}$ on $\mathcal{H}_B$, and $S \subset I \times J$, then (provided $\operatorname{Tr} M > 0$) $\rho \rightsquigarrow \frac{M}{\operatorname{Tr} M}$, where
$$M = \sum_{(i,j) \in S} (P_i \otimes Q_j)\, \rho\, (P_i \otimes Q_j).$$

The idea behind the last scheme is informally as follows: given $n$ copies of the state $\rho$, Alice and Bob can successively measure copies of $\rho$ locally using the POVMs $(P_i)$ and $(Q_j)$ until they obtain outcomes $i$ and $j$ such that $(i, j) \in S$, the post-measurement state being then $\frac{M}{\operatorname{Tr} M}$. (The protocol fails if none of the $n$ copies gives an outcome in $S$, but the probability of failure tends to zero as $n$ tends to infinity.) This is where classical communication (“CC” of LOCC) comes in: Alice and Bob need a mechanism for certifying that $(i, j) \in S$ and this generally cannot be accomplished by “local” means unless $S$ itself has a product structure.
The above hierarchy of channels parallels somewhat the hierarchy of boxes (see Section 11.3.2). For example, $\operatorname{conv}\{\text{product channels}\}$ can be thought of as “local operations with shared randomness.”
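The local filtering operation described above can be sketched numerically. The code follows the formula for $M$ exactly as stated in the text; the function name, the choice of projective measurements and the test state are assumptions made for illustration only.

```python
import numpy as np

def local_filter(rho, P, Q, S):
    """Return (M / Tr M, Tr M) with M = sum over (i, j) in S of (P_i x Q_j) rho (P_i x Q_j)."""
    M = np.zeros(rho.shape, dtype=complex)
    for i, j in S:
        K = np.kron(P[i], Q[j])
        M += K @ rho @ K
    weight = np.trace(M).real
    if weight <= 0:
        raise ValueError("Tr M = 0: the filter never accepts")
    return M / weight, weight

# Hypothetical example on two qubits: both parties measure in the computational
# basis and keep a copy only when their outcomes agree.
proj = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho = 0.5 * np.outer(phi_plus, phi_plus) + 0.5 * np.eye(4) / 4
sigma, weight = local_filter(rho, proj, proj, S=[(0, 0), (1, 1)])
```

For projective measurements, as in this example, $\operatorname{Tr} M$ is the probability that a single copy is accepted, so the chance that all $n$ copies are rejected is $(1 - \operatorname{Tr} M)^n$.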

Exercise 12.2 (Distillation preserves separability and PPT). If $\rho \rightsquigarrow \sigma$, show that $\sigma$ is separable (resp., PPT) whenever $\rho$ is separable (resp., PPT).
12.2.2. Distillable states. Recall the standard notation: the canonical basis of $\mathbb{C}^2$ is $(|0\rangle, |1\rangle)$ and we often drop the tensor product signs (for example, $|00\rangle$ should be understood as $|0\rangle \otimes |0\rangle$). Next, it is convenient to work with the family of Bell vectors $\{\varphi^+, \varphi^-, \psi^+, \psi^-\}$, which is the orthonormal basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$ consisting of maximally entangled vectors
$$\varphi^\pm = \frac{1}{\sqrt 2}\big(|00\rangle \pm |11\rangle\big) \quad \text{and} \quad \psi^\pm = \frac{1}{\sqrt 2}\big(|01\rangle \pm |10\rangle\big).$$
The corresponding states are called the Bell states. A bipartite state $\rho \in D(\mathcal{H})$ is said to be distillable if $\rho \rightsquigarrow |\psi^+\rangle\langle\psi^+|$. The motivation for this concept is that many quantum information protocols (e.g., quantum teleportation) use Bell states as a resource. Distillable states are exactly those which are useful for these protocols.
Note that the choice of the Bell vector $\psi^+$ in this definition is arbitrary: if $x, y$ are any two maximally entangled vectors on $\mathbb{C}^d \otimes \mathbb{C}^d$, then there exist $U, V \in \mathsf{U}(d)$ such that $y = (U \otimes V)x$. Since the channel $\rho \mapsto (U \otimes V)\rho(U \otimes V)^\dagger$ is LOCC (as a product channel), we have $|x\rangle\langle x| \rightsquigarrow |y\rangle\langle y|$. We use this fact repeatedly and refer to it as “conjugating with local unitaries.”
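As a concrete instance of conjugating with local unitaries, the sketch below builds the four Bell vectors and checks that Pauli operators acting on the second qubit map $\psi^+$ to each of the other three, up to a sign. The ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ of the computational basis and the variable names are assumptions of the example.

```python
import numpy as np

s = 1 / np.sqrt(2)
# Computational basis order: |00>, |01>, |10>, |11>
bell = {
    "phi+": s * np.array([1.0, 0.0, 0.0, 1.0]),
    "phi-": s * np.array([1.0, 0.0, 0.0, -1.0]),
    "psi+": s * np.array([0.0, 1.0, 1.0, 0.0]),
    "psi-": s * np.array([0.0, 1.0, -1.0, 0.0]),
}
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

# Gram matrix of the Bell vectors (should be the identity)
gram = np.array([[u @ v for v in bell.values()] for u in bell.values()])

# Images of psi+ under local unitaries of the form I (x) V
images = {
    "phi+": np.kron(I2, X) @ bell["psi+"],
    "psi-": np.kron(I2, Z) @ bell["psi+"],
    "phi-": np.kron(I2, X @ Z) @ bell["psi+"],
}
overlaps = {name: abs(bell[name] @ vec) for name, vec in images.items()}
```

An overlap of modulus 1 means the image equals the target Bell vector up to a phase, which does not affect the corresponding state.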


It is easy to check that PPT states are not distillable (see Exercise 12.2). The distillability problem asks whether the converse holds.

Problem 12.4 (Distillability problem). Is every non-PPT state distillable?

The answer to Problem 12.4 is commonly believed to be negative.

12.2.3. The case of two qubits.

Proposition 12.5. Every entangled state on $\mathbb{C}^2 \otimes \mathbb{C}^2$ is distillable.

Since in the $\mathbb{C}^2 \otimes \mathbb{C}^2$ setting “entangled” and “non-PPT” are equivalent by Theorem 2.15, Proposition 12.5 is indeed an instance of Problem 12.4. In the argument it will be convenient to use states that are diagonal in the basis of Bell vectors. For $a, b, c, d \ge 0$ such that $a + b + c + d = 1$, let us denote
$$\rho_{a,b,c,d} = a|\varphi^+\rangle\langle\varphi^+| + b|\varphi^-\rangle\langle\varphi^-| + c|\psi^+\rangle\langle\psi^+| + d|\psi^-\rangle\langle\psi^-| .$$
The heart of the protocol lies in the following two lemmas, whose proofs we postpone. To each state $\rho \in D(\mathbb{C}^2 \otimes \mathbb{C}^2)$, we associate the quantity $s(\rho) = \max\{\langle\chi|\rho|\chi\rangle\}$, where the maximum is taken over all maximally entangled vectors $\chi \in \mathbb{C}^2 \otimes \mathbb{C}^2$. Given that $\langle\chi|\rho|\chi\rangle$ is the square of the fidelity between $\rho$ and $|\chi\rangle\langle\chi|$ (cf. Exercise B.3), the functional $s(\cdot)$ measures proximity to the set of maximally entangled states. In particular, $\rho$ is distillable if and only if there exists a sequence $(\sigma_n)$ in $D(\mathbb{C}^2 \otimes \mathbb{C}^2)$ such that $s(\sigma_n) \to 1$ and that, for every $n$, $\rho \rightsquigarrow \sigma_n$.

Lemma 12.6. We have $\rho \rightsquigarrow \rho_{s(\rho),\, \frac{1 - s(\rho)}{3},\, \frac{1 - s(\rho)}{3},\, \frac{1 - s(\rho)}{3}}$.

Lemma 12.7. Given $a, b, c, d \ge 0$ with $a + b + c + d = 1$, denote $\alpha = (a^2 + b^2)/N$, $\beta = 2ab/N$, $\gamma = (c^2 + d^2)/N$ and $\delta = 2cd/N$, where $N = (a + b)^2 + (c + d)^2$. Then $\rho_{a,b,c,d} \rightsquigarrow \rho_{\alpha,\beta,\gamma,\delta}$.
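Proof of Proposition 12.5. Let $\rho \in D(\mathbb{C}^2 \otimes \mathbb{C}^2)$ be an entangled state. By Theorem 2.15, this means that $\rho$ is not PPT. Consequently, there exists a unit vector $x \in \mathbb{C}^2 \otimes \mathbb{C}^2$ such that $\langle x|\rho^\Gamma|x\rangle < 0$. Conjugating with local unitaries, we may assume that the Schmidt decomposition of $x$ is $\alpha|00\rangle + \beta|11\rangle$. Consider the operator $W = \alpha|0\rangle\langle 0| + \beta|1\rangle\langle 1|$, then $x = \sqrt 2\, (\mathrm{I} \otimes W)|\varphi^+\rangle$. By local filtering,
(12.5) $\rho \rightsquigarrow \sigma := \dfrac{(\mathrm{I} \otimes W)\rho(\mathrm{I} \otimes W)}{\operatorname{Tr}\,(\mathrm{I} \otimes W)\rho(\mathrm{I} \otimes W)}$
(note that $0 \le W \le \mathrm{I}$, so that $W$ can be one of the operators in a POVM) and one checks that $\langle\varphi^+|\sigma^\Gamma|\varphi^+\rangle < 0$. Using the formula $\operatorname{Tr}(A^\Gamma B) = \operatorname{Tr}(A B^\Gamma)$, we obtain
$$0 > \operatorname{Tr}\big(\sigma (|\varphi^+\rangle\langle\varphi^+|)^\Gamma\big) = \operatorname{Tr}\Big(\sigma \Big(\frac{\mathrm{I}}{2} - |\psi^-\rangle\langle\psi^-|\Big)\Big) = \frac12 - \langle\psi^-|\sigma|\psi^-\rangle$$
and therefore $s(\sigma) > 1/2$.
The problem is thus reduced to showing that any state $\sigma$ with $s(\sigma) > 1/2$ is distillable. By applying successively Lemmas 12.6 and 12.7, we obtain that $\sigma \rightsquigarrow \sigma'$ for some state $\sigma'$ such that $s(\sigma') \ge \phi(s(\sigma))$, where $\phi$ is the function
$$\phi(t) = \frac{t^2 + \frac19 (1 - t)^2}{\frac19 (1 + 2t)^2 + \frac19 (2 - 2t)^2} = \frac{1 - 2t + 10t^2}{5 - 4t + 8t^2} .$$
Since $\phi(t) > t$ for $t \in (1/2, 1)$, we have $\lim_{n \to \infty} \phi^n(s(\sigma)) = 1$. In other words, iterating the above procedure shows that $\sigma \rightsquigarrow \sigma''$, where $\sigma''$ is a state such that $s(\sigma'')$ is as close to 1 as we wish. It follows that $\sigma$ is distillable. □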
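The behaviour of the iteration is easy to observe numerically. The sketch below implements the weight map of Lemma 12.7 and the function $\phi$; the function names and the starting value 0.6 are illustrative assumptions.

```python
def recurrence_step(a, b, c, d):
    """Weights (alpha, beta, gamma, delta) of Lemma 12.7."""
    n = (a + b) ** 2 + (c + d) ** 2
    return (a * a + b * b) / n, 2 * a * b / n, (c * c + d * d) / n, 2 * c * d / n

def phi(t):
    """phi(t) = (1 - 2t + 10 t^2) / (5 - 4t + 8 t^2)."""
    return (1 - 2 * t + 10 * t * t) / (5 - 4 * t + 8 * t * t)

def iterate_phi(t, steps):
    """Return [t, phi(t), phi(phi(t)), ...] with `steps` applications of phi."""
    history = [t]
    for _ in range(steps):
        t = phi(t)
        history.append(t)
    return history

t0 = 0.6                                   # hypothetical starting value s(sigma) > 1/2
alpha = recurrence_step(t0, (1 - t0) / 3, (1 - t0) / 3, (1 - t0) / 3)[0]
history = iterate_phi(t0, 200)
```

Here `alpha` is the first weight produced by Lemma 12.7 when it is applied to the output of Lemma 12.6, which is how $\phi$ arises in the proof. The values $t = 1/2$ and $t = 1$ are fixed points of $\phi$, and starting strictly above $1/2$ the sequence increases towards 1.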



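Proof of Lemma 12.6. By conjugating with local unitaries, we may assume that $s(\rho) = \langle\psi^-|\rho|\psi^-\rangle$. The twirling channel $\Upsilon : B(\mathbb{C}^2 \otimes \mathbb{C}^2) \to B(\mathbb{C}^2 \otimes \mathbb{C}^2)$ is defined as $\Upsilon(\rho) = \mathbf{E}\,(U \otimes U)\rho(U \otimes U)^\dagger$ where $U \in \mathsf{U}(2)$ is Haar-distributed. This is an LOCC channel (it belongs to the convex hull of the set of product channels) and moreover (see Exercise 2.16)
$$\Upsilon(\rho) = s(\rho)|\psi^-\rangle\langle\psi^-| + \frac{1 - s(\rho)}{3}\big(|\varphi^+\rangle\langle\varphi^+| + |\varphi^-\rangle\langle\varphi^-| + |\psi^+\rangle\langle\psi^+|\big).$$
The result follows since $\psi^-$ can be transformed into $\varphi^+$ by local unitaries. □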
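One ingredient behind the formula for $\Upsilon(\rho)$ is that $\psi^-$ is invariant under $U \otimes U$ up to the phase $\det U$, so that twirling leaves $\langle\psi^-|\rho|\psi^-\rangle$ unchanged. The sketch below checks this on a random unitary; the sampler (QR factorization of a complex Gaussian matrix with a phase correction, a standard recipe) and its name are assumptions of the example, not taken from the text.

```python
import numpy as np

def random_unitary(n, rng):
    """Sample a unitary from the QR factorization of a complex Gaussian matrix,
    normalizing the phases of the diagonal of R."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, r = np.linalg.qr(g)
    d = np.diag(r)
    return q * (d / np.abs(d))             # multiply column j of q by the phase of r[j, j]

rng = np.random.default_rng(0)
psi_minus = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
U = random_unitary(2, rng)
image = np.kron(U, U) @ psi_minus          # (U x U) psi^-
expected = np.linalg.det(U) * psi_minus    # det(U) psi^-
```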
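Proof of Lemma 12.7. We write $\rho$ for $\rho_{a,b,c,d}$. It will be convenient to consider $\rho$ as a state on $\mathcal{H}_A \otimes \mathcal{H}_B$ and $\rho \otimes \rho$ as a state on $\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}'_A \otimes \mathcal{H}'_B$ (all the spaces $\mathcal{H}_A$, $\mathcal{H}_B$, $\mathcal{H}'_A$, $\mathcal{H}'_B$ being equal to $\mathbb{C}^2$). When an operator $X$ on $\mathbb{C}^2 \otimes \mathbb{C}^2$ is thought of as acting on $\mathcal{H}_A \otimes \mathcal{H}'_A$ (resp., $\mathcal{H}_B \otimes \mathcal{H}'_B$), we denote it by $X_A$ (resp., by $X_B$). The same convention will be used for superoperators $\Psi$ whose domain is $B(\mathbb{C}^2 \otimes \mathbb{C}^2)$.
Denote by $P = |00\rangle\langle 00| + |11\rangle\langle 11|$ and $Q = \mathrm{I} - P = |01\rangle\langle 01| + |10\rangle\langle 10|$ the complementary rank 2 projectors acting on the space $\mathbb{C}^2 \otimes \mathbb{C}^2$. Next, consider $\Pi = P_A \otimes P_B + Q_A \otimes Q_B$ as an operator acting on $\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}'_A \otimes \mathcal{H}'_B$. A simple computation shows that $\Pi$ is the orthogonal projection onto the subspace generated by the 8 vectors
$$\varphi^+ \otimes \varphi^+,\ \varphi^+ \otimes \varphi^-,\ \varphi^- \otimes \varphi^+,\ \varphi^- \otimes \varphi^-,\ \psi^+ \otimes \psi^+,\ \psi^+ \otimes \psi^-,\ \psi^- \otimes \psi^+,\ \psi^- \otimes \psi^- .$$
Consider also the quantum channel $\Psi : B(\mathbb{C}^2 \otimes \mathbb{C}^2) \to B(\mathbb{C}^2)$ given by $\Psi(\rho) = \operatorname{Tr}_2 U\rho U^\dagger$, where $\operatorname{Tr}_2$ denotes the partial trace over the second factor, and $U$ is the “CNOT” unitary transformation on $\mathbb{C}^2 \otimes \mathbb{C}^2$ defined by
$$U(|00\rangle) = |00\rangle, \quad U(|01\rangle) = |01\rangle, \quad U(|10\rangle) = |11\rangle, \quad U(|11\rangle) = |10\rangle .$$
A direct calculation shows that, for $\varepsilon, \eta = \pm$ and with the usual rules for sign multiplication,
$$(\Psi_A \otimes \Psi_B)\big(|\varphi^\varepsilon \otimes \varphi^\eta\rangle\langle\varphi^\varepsilon \otimes \varphi^\eta|\big) = |\varphi^{\varepsilon\eta}\rangle\langle\varphi^{\varepsilon\eta}|,$$
$$(\Psi_A \otimes \Psi_B)\big(|\psi^\varepsilon \otimes \psi^\eta\rangle\langle\psi^\varepsilon \otimes \psi^\eta|\big) = |\psi^{\varepsilon\eta}\rangle\langle\psi^{\varepsilon\eta}| .$$
(We emphasize that, in the above formulas, not all occurrences of the symbol $\otimes$ refer to the same bipartitions; for example in $\varphi^\varepsilon \otimes \varphi^\eta$ we have $\varphi^\varepsilon \in \mathcal{H}_A \otimes \mathcal{H}_B$ and $\varphi^\eta \in \mathcal{H}'_A \otimes \mathcal{H}'_B$.) It follows (using first local filtering, then the LOCC channel $\Psi_A \otimes \Psi_B$ and a tedious but straightforward computation) that
$$\rho \rightsquigarrow \frac{\Pi(\rho \otimes \rho)\Pi}{\operatorname{Tr}\, \Pi(\rho \otimes \rho)\Pi} \rightsquigarrow \rho_{\alpha,\beta,\gamma,\delta},$$
as asserted. □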
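The “tedious but straightforward computation” can be delegated to a short numerical check. The sketch below assumes the qubit ordering $(A, B, A', B')$ for $\rho \otimes \rho$ and writes $\Pi$ and the two CNOT gates directly in the computational basis; all function names and the sample weights are illustrative.

```python
import numpy as np

def bell_basis():
    s = 1 / np.sqrt(2)
    # order: phi+, phi-, psi+, psi- ; computational basis |00>, |01>, |10>, |11>
    return (s * np.array([1.0, 0.0, 0.0, 1.0]), s * np.array([1.0, 0.0, 0.0, -1.0]),
            s * np.array([0.0, 1.0, 1.0, 0.0]), s * np.array([0.0, 1.0, -1.0, 0.0]))

def bell_diagonal(a, b, c, d):
    return sum(w * np.outer(v, v) for w, v in zip((a, b, c, d), bell_basis()))

def one_round(a, b, c, d):
    """Filter rho (x) rho with Pi, apply CNOT_A (x) CNOT_B, trace out A' and B'."""
    rho2 = np.kron(bell_diagonal(a, b, c, d), bell_diagonal(a, b, c, d))
    Pi = np.zeros((16, 16))
    U = np.zeros((16, 16))
    for idx in range(16):
        qa, qb, qa2, qb2 = (idx >> 3) & 1, (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
        if (qa ^ qa2) == (qb ^ qb2):       # P_A (x) P_B + Q_A (x) Q_B
            Pi[idx, idx] = 1.0
        out = (qa << 3) | (qb << 2) | ((qa2 ^ qa) << 1) | (qb2 ^ qb)
        U[out, idx] = 1.0                  # CNOT with controls A, B and targets A', B'
    filtered = Pi @ rho2 @ Pi
    filtered = filtered / np.trace(filtered)
    rotated = U @ filtered @ U.T
    # indices of the reshaped tensor: (AB, A'B', AB, A'B'); trace over A'B'
    return np.einsum('ikjk->ij', rotated.reshape(4, 4, 4, 4))

def lemma_weights(a, b, c, d):
    n = (a + b) ** 2 + (c + d) ** 2
    return (a * a + b * b) / n, 2 * a * b / n, (c * c + d * d) / n, 2 * c * d / n

weights = (0.7, 0.1, 0.1, 0.1)             # hypothetical input state
result = one_round(*weights)
predicted = bell_diagonal(*lemma_weights(*weights))
```

Comparing `result` with `predicted` confirms, for these weights, that one round of the protocol produces $\rho_{\alpha,\beta,\gamma,\delta}$.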
12.2.4. Some reformulations of distillability. We start with a criterion for distillability.

Lemma 12.8. A state $\rho \in D(\mathcal{H}_A \otimes \mathcal{H}_B)$ is distillable if and only if there exists an integer $n$ and operators $A : \mathbb{C}^2 \to \mathcal{H}_A^{\otimes n}$, $B : \mathbb{C}^2 \to \mathcal{H}_B^{\otimes n}$ such that the operator $(A \otimes B)^\dagger \rho^{\otimes n} (A \otimes B)$ is non-PPT.

Proof. Assume that there exist $n$, $A$ and $B$ with the above properties. Then, by local filtering, we have $\rho \rightsquigarrow \sigma$, where
$$\sigma = \frac{(A \otimes B)^\dagger \rho^{\otimes n} (A \otimes B)}{\operatorname{Tr}\big((A \otimes B)^\dagger \rho^{\otimes n} (A \otimes B)\big)}$$
is a non-PPT state on $\mathbb{C}^2 \otimes \mathbb{C}^2$. By Proposition 12.5, $\sigma$ (and hence also $\rho$) is distillable.
Conversely, if $\rho$ is distillable, there exists, for some $n$, an LOCC channel $\Phi : B((\mathcal{H}_A)^{\otimes n} \otimes (\mathcal{H}_B)^{\otimes n}) \to B(\mathbb{C}^2 \otimes \mathbb{C}^2)$ such that $\Phi(\rho^{\otimes n})$ is non-PPT. Since $\Phi$ is separable, it has the form
$$\Phi(X) = \sum_i (A_i \otimes B_i)^\dagger X (A_i \otimes B_i)$$
and therefore at least some couple $(A_i, B_i)$ satisfies the desired conclusion. □
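The criterion of Lemma 12.8 rests on testing whether a two-qubit operator is PPT. A minimal sketch of such a test follows, assuming that the partial transpose acts on the second tensor factor and that operators on $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ are stored as matrices in the product basis; the helper names are hypothetical.

```python
import numpy as np

def partial_transpose(rho, dA, dB):
    """Transpose the second tensor factor of an operator on C^dA (x) C^dB."""
    t = rho.reshape(dA, dB, dA, dB)        # indices (a, b, a', b')
    return t.transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def is_ppt(rho, dA, dB, tol=1e-12):
    """True if the partial transpose has no eigenvalue below -tol."""
    return np.linalg.eigvalsh(partial_transpose(rho, dA, dB)).min() >= -tol

phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
bell_state = np.outer(phi_plus, phi_plus)  # entangled, hence non-PPT on two qubits
mixed = np.eye(4) / 4                      # maximally mixed state, PPT
```

In the setting of the lemma one would apply `is_ppt` to $(A \otimes B)^\dagger \rho^{\otimes n} (A \otimes B)$ for candidate operators $A$ and $B$.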
There is also a connection between distillability and 2-positivity. Fix an orthonormal basis $(e_i)$ of $\mathcal{H}_A$ and denote $\chi = \sum_i e_i \otimes e_i \in \mathcal{H}_A \otimes \mathcal{H}_A$. We recall (see Section 2.3.2) that the Choi matrix associated to a completely positive map $\Phi \in CP(\mathcal{H}_B, \mathcal{H}_A)$ is defined as $C(\Phi) = (\Phi \otimes \mathrm{Id}_{B(\mathcal{H}_A)})(|\chi\rangle\langle\chi|)$.

Proposition 12.9. Given a state $\rho \in D(\mathcal{H}_A \otimes \mathcal{H}_B)$, let $\Phi \in CP(\mathcal{H}_B, \mathcal{H}_A)$ be such that $\rho = C(\Phi)$. Denote by $T \in P(\mathcal{H}_A)$ the transposition map. Then the following are equivalent:
(1) $\rho$ is distillable,
(2) there exists an integer $n$ such that the map $(T\Phi)^{\otimes n}$ is not 2-positive.
Proof. We apply the result of Exercise 2.48 (for $k = 2$) to the superoperator $(T\Phi)^{\otimes n}$. We note that $C(T\Phi) = (T\Phi \otimes \mathrm{Id})(|\chi\rangle\langle\chi|) = (T \otimes \mathrm{Id})(\rho) = \rho^\Gamma$. It follows that $(T\Phi)^{\otimes n}$ is 2-positive iff the operator $(A \otimes B)^\dagger (\rho^\Gamma)^{\otimes n} (A \otimes B)$ is positive for any $A : \mathbb{C}^2 \to \mathcal{H}_A^{\otimes n}$ and $B : \mathbb{C}^2 \to \mathcal{H}_B^{\otimes n}$. This condition is also equivalent to the operator $(\bar{A} \otimes B)^\dagger \rho^{\otimes n} (\bar{A} \otimes B)$ being PPT, and the result is now immediate from Lemma 12.8. □

Problem 12.4 reduces therefore to the following.

Problem 12.10. Let $\Phi$ be a completely positive map such that $(T\Phi)^{\otimes n}$ is 2-positive for every $n$ (where $T$ denotes the transposition). Is $T\Phi$ necessarily completely positive?

A remarkable result is the fact that in order to solve Problem 12.4 it is enough to search among Werner states.
Proposition 12.11. Fix $d \ge 3$. The following are equivalent:
(i) Every non-PPT state on $\mathbb{C}^d \otimes \mathbb{C}^d$ is distillable,
(ii) Every entangled Werner state on $\mathbb{C}^d \otimes \mathbb{C}^d$ is distillable.
Proof. Since PPT Werner states are separable (see Proposition 2.16), (i) implies (ii). Conversely, let $\rho \in D(\mathbb{C}^d \otimes \mathbb{C}^d)$ be a non-PPT state. In other words, there is a unit vector $x \in \mathbb{C}^d \otimes \mathbb{C}^d$ such that $\langle x|\rho^\Gamma|x\rangle < 0$. By applying the same argument as in the proof of Proposition 12.5, we deduce that there is a state $\sigma \in D(\mathbb{C}^d \otimes \mathbb{C}^d)$ such that $\rho \rightsquigarrow \sigma$ and $\langle\psi|\sigma^\Gamma|\psi\rangle < 0$, where $\psi = \frac{1}{\sqrt d} \sum_{i=1}^d e_i \otimes e_i$ is a maximally entangled vector. Equivalently (cf. Exercise 2.20), $\operatorname{Tr}(F\sigma) < 0$, where $F$ is the flip operator on $\mathbb{C}^d \otimes \mathbb{C}^d$. Consider now $\Upsilon : B(\mathbb{C}^d \otimes \mathbb{C}^d) \to B(\mathbb{C}^d \otimes \mathbb{C}^d)$, the twirling quantum channel, defined by $\Upsilon(\rho) = \mathbf{E}\,(U \otimes U)\rho(U^\dagger \otimes U^\dagger)$ where $U$ is Haar-distributed on the unitary group. This channel is an LOCC channel and maps any state $\sigma$ to a Werner state $w = \Upsilon(\sigma)$ satisfying $\operatorname{Tr}(F\sigma) = \operatorname{Tr}(Fw)$ (see Exercise 2.16). It follows (see Proposition 2.16) that $\rho \rightsquigarrow w$ for some entangled Werner state $w$, so (ii) implies (i). □
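The equivalence between $\langle\psi|\sigma^\Gamma|\psi\rangle < 0$ and $\operatorname{Tr}(F\sigma) < 0$ comes from the identity $(|\psi\rangle\langle\psi|)^\Gamma = F/d$. The sketch below verifies it for $d = 3$ on a randomly generated state; the helper names and the way the random state is produced are assumptions of the example.

```python
import numpy as np

def partial_transpose(rho, d):
    """Transpose the second factor of an operator on C^d (x) C^d."""
    return rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)

def flip_operator(d):
    """F (x (x) y) = y (x) x on C^d (x) C^d."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0
    return F

d = 3
psi = np.eye(d).reshape(d * d) / np.sqrt(d)     # (1/sqrt d) sum_i e_i (x) e_i
F = flip_operator(d)

rng = np.random.default_rng(1)
g = rng.standard_normal((d * d, d * d)) + 1j * rng.standard_normal((d * d, d * d))
sigma = g @ g.conj().T
sigma = sigma / np.trace(sigma).real            # a random state on C^d (x) C^d

lhs = (psi @ partial_transpose(sigma, d) @ psi).real
rhs = np.trace(F @ sigma).real / d
```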
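A consequence of Lemma 12.8 and Proposition 12.11 is that Problem 12.4 can be reduced to the following question, where $\psi$ denotes a maximally entangled vector on $\mathbb{C}^k \otimes \mathbb{C}^k$: for every $k \ge 3$ and $\varepsilon > 0$, do there exist an integer $n$ and vectors $a, b, c, d \in (\mathbb{C}^k)^{\otimes n}$ such that
(12.6) $\Big\langle a \otimes b + c \otimes d \,\Big|\, \big(\mathrm{I} - (1 + \varepsilon)|\psi\rangle\langle\psi|\big)^{\otimes n} \,\Big|\, a \otimes b + c \otimes d \Big\rangle < 0 \ ?$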

Notes and Remarks

Section 12.1. The distinguishability norms associated to POVMs were introduced in [MWW09]. The observation behind Exercise 12.1 is due to Helstrom [Hel69] and Holevo [Hol73]. The connection between POVMs and zonoids (Proposition 12.1) was noticed in [AL15b], where Theorem 12.2 was proved (and where improvements for specific examples of POVMs are also discussed). Volume and mean width estimates for norms associated to a family of POVMs on a bipartite state can also be found in [AL15a].
Section 12.2. For a precise description of the class of LOCC transformations we refer to [HHHH09, Section XI] and [Wat, Chapter 6] (see also [CLM+14]).
A basic reference on the distillability problem is [HH01] (see also the survey [Cla06], and the website [@5]). The relationships between distillability, the PPT property, and teleportation have also been studied in [HHH98, LP99, HHH99].
The protocol described in Lemmas 12.6 and 12.7 appeared in [BBP+96] (see also [BDSW96]). Proposition 12.5 appears in [HHH97] and Proposition 12.11 is from [HH99].
Proposition 12.9, and the equivalence between Problem 12.4 and Problem 12.10, are from [DSS+00]. For numerical attempts to solve Problem 12.4 in its formulation (12.6), see [DSS+00, DCLB00].
There is a quantitative version of the distillability problem, which asks for the asymptotic rate of Bell states production via LOCC channels from many copies of a given state; the supremum of achievable rates is called the distillable entanglement. Entanglement that is not distillable is often referred to as bound entanglement, and the states that exhibit it are called bound entangled.
If one uses operations preserving PPT instead of LOCC, then every non-PPT state can be “distilled” [EVWW01]. Note that some care is needed when analyzing this issue because the class in question is not closed under tensoring.
APPENDIX A

Gaussian measures and Gaussian variables

This appendix serves as a brief general reference for Gaussian random variables, both scalar and vector-valued. It addresses terminology, basic properties, and various elementary but useful identities and inequalities. More specialized properties are included elsewhere in this book, most notably in Chapter 6.

A.1. Gaussian random variables
The standard Gaussian distribution $N(0, 1)$ is the probability measure on $\mathbb{R}$ (denoted by $\gamma_1$) with density $\frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\, dx$. The standard complex Gaussian distribution $N_{\mathbb{C}}(0, 1)$ is the probability measure on $\mathbb{C}$ with density $\frac{1}{\pi} \exp(-|z|^2)\, dz$. (Occasionally we will write $N_{\mathbb{R}}(0, 1)$ for $N(0, 1)$ to emphasize the distinction.) The word “standard” refers, in particular, to the unit variance normalization: if $Z$ has distribution either $N(0, 1)$ or $N_{\mathbb{C}}(0, 1)$, then $\mathbf{E}\,|Z|^2 = 1$. We note also that if $Z_1, Z_2$ are independent random variables with distribution $N(0, 1)$, then $\frac{1}{\sqrt 2}(Z_1 + iZ_2)$ has distribution $N_{\mathbb{C}}(0, 1)$.
If $X$ has $N(0, 1)$ distribution (resp., $N_{\mathbb{C}}(0, 1)$ distribution) and $\sigma \ge 0$, the distribution of $\sigma X$ is denoted by $N(0, \sigma^2)$ (resp., by $N_{\mathbb{C}}(0, \sigma^2)$).


The moments of the standard Gaussian distributions can be computed explicitly: if $Z$ has $N(0, 1)$ distribution, then, for any $p \ge 0$,
(A.1) $\mathbf{E}\,|Z|^p = \dfrac{2^{p/2}}{\sqrt\pi}\, \Gamma\Big(\dfrac{p + 1}{2}\Big) \ \overset{p \to \infty}{\sim}\ \sqrt 2\, \Big(\dfrac{p}{e}\Big)^{p/2} .$
Similarly, if $Z$ has $N_{\mathbb{C}}(0, 1)$ distribution, then for any $p \ge 0$,
(A.2) $\mathbf{E}\,|Z|^p = \Gamma\Big(\dfrac{p}{2} + 1\Big)$
(indeed, $|Z|^2$ follows an exponential distribution with parameter 1).
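The closed forms (A.1) and (A.2) are easy to evaluate with the standard library. The sketch below uses hypothetical function names and compares the formulas with a few classical values, as well as with the asymptotic equivalent in (A.1).

```python
import math

def real_gaussian_abs_moment(p):
    """E|Z|^p for Z ~ N(0,1), formula (A.1)."""
    return 2 ** (p / 2) / math.sqrt(math.pi) * math.gamma((p + 1) / 2)

def complex_gaussian_abs_moment(p):
    """E|Z|^p for Z ~ N_C(0,1), formula (A.2)."""
    return math.gamma(p / 2 + 1)

def asymptotic_real_moment(p):
    """Asymptotic equivalent sqrt(2) (p/e)^(p/2) from (A.1)."""
    return math.sqrt(2) * (p / math.e) ** (p / 2)

second = real_gaussian_abs_moment(2)       # variance of N(0,1)
fourth = real_gaussian_abs_moment(4)       # fourth moment of N(0,1)
ratio = real_gaussian_abs_moment(100) / asymptotic_real_moment(100)
```

The ratio at $p = 100$ is already close to 1, in line with the stated equivalence.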
We also need some fine estimates on the cumulative distribution function of a standard Gaussian variable, denoted by
(A.3) $\Phi(x) := \gamma_1\big((-\infty, x]\big).$
For large $x$, we have $1 - \Phi(x) \sim (2\pi)^{-1/2} x^{-1} \exp(-x^2/2)$. This is refined by the Komatu inequalities which assert that for every $x \ge 0$,
(A.4) $\dfrac{2}{x + \sqrt{x^2 + 4}} \le e^{x^2/2} \displaystyle\int_x^\infty e^{-t^2/2}\, dt \le \dfrac{2}{x + \sqrt{x^2 + 2}} .$
A further refinement is provided by the inequalities (where $x \ge 0$)
(A.5) $\dfrac{\pi}{(\pi - 1)x + \sqrt{x^2 + 2\pi}} \le e^{x^2/2} \displaystyle\int_x^\infty e^{-t^2/2}\, dt \le \dfrac{4}{3x + \sqrt{x^2 + 8}} .$
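The inequalities (A.4) and (A.5) can be checked numerically on a grid. The sketch below expresses the middle term through the complementary error function, using $\int_x^\infty e^{-t^2/2}\, dt = \sqrt{\pi/2}\, \operatorname{erfc}(x/\sqrt 2)$; the function names and the grid are assumptions of the example, and the range is kept moderate to avoid overflow in the exponential factor.

```python
import math

def mills_ratio(x):
    """e^{x^2/2} times the integral of e^{-t^2/2} over [x, infinity)."""
    return math.sqrt(math.pi / 2) * math.erfc(x / math.sqrt(2)) * math.exp(x * x / 2)

def komatu_bounds(x):
    """Lower and upper bounds of (A.4)."""
    return 2 / (x + math.sqrt(x * x + 4)), 2 / (x + math.sqrt(x * x + 2))

def refined_bounds(x):
    """Lower and upper bounds of (A.5)."""
    lower = math.pi / ((math.pi - 1) * x + math.sqrt(x * x + 2 * math.pi))
    upper = 4 / (3 * x + math.sqrt(x * x + 8))
    return lower, upper

grid = [0.1 * k for k in range(81)]        # x from 0 to 8
checks = []
for x in grid:
    m = mills_ratio(x)
    lo4, hi4 = komatu_bounds(x)
    lo5, hi5 = refined_bounds(x)
    # small tolerance: the lower bound in (A.5) is an equality at x = 0
    checks.append(lo4 <= m + 1e-12 and m <= hi4 + 1e-12
                  and lo5 <= m + 1e-12 and m <= hi5 + 1e-12)
```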


Exercise A.1 (A simple bound for the normal tail). Show the inequality (6.6): if $Z$ is a standard normal variable (i.e., distributed according to the $N(0, 1)$ law), then $\mathbf{P}(Z \ge t) = \frac12 \mathbf{P}(|Z| \ge t) \le \frac12 e^{-t^2/2}$ for $t \ge 0$. This bound motivates the definition of subgaussian processes, see (6.19) and subsequent comments.
Exercise A.2 (Komatu inequalities). Prove the Komatu inequalities (A.4) by arguing as follows:
(i) If $f_-(x)$, $f(x)$ and $f_+(x)$ denote respectively the left, middle and right member of the inequality to be proved, show that for $x \ge 0$ we have $f_-' \ge x f_- - 1$, $f' = x f - 1$ and $f_+' \le x f_+ - 1$.
(ii) Show (A.4). The same argument proves the upper bound in (A.5).

A.2. Gaussian vectors
A family of real-valued centered random variables $(X_i)$ is jointly Gaussian if any linear combination of the variables has distribution $N(0, \sigma^2)$ for some $\sigma$. A jointly Gaussian family is also called a Gaussian process (see Section 6.1). A crucial property of jointly Gaussian families, or Gaussian processes, is that the joint distribution of $(X_i)$ is uniquely determined by the covariance matrix $(a_{ij}) = (\mathbf{E}\, X_i X_j)$.
When $V$ is a real (resp., complex) finite-dimensional space equipped with a Euclidean (resp., Hilbertian) norm, we call the standard Gaussian vector in $V$ a $V$-valued random variable such that, in any orthonormal basis, its coordinates are independent standard real (resp., complex) Gaussian random variables. More concretely, the distribution of a standard Gaussian vector in $\mathbb{R}^n$ (denoted by $\gamma_n$) has density
$$\frac{1}{(2\pi)^{n/2}} \exp(-|x|^2/2)\, dx$$
whereas the distribution of a standard Gaussian vector in $\mathbb{C}^n$ (denoted by $\gamma_n^{\mathbb{C}}$) has density
(A.6) $\dfrac{1}{\pi^n} \exp(-|x|^2)\, dx .$
In all these cases the respective distribution will be referred to as the standard Gaussian measure on the corresponding space $V$. Note that if $\mathbb{C}^n$ is identified with $\mathbb{R}^{2n}$, the distributions $\gamma_n^{\mathbb{C}}$ and $\gamma_{2n}$ do not coincide: they differ by a scaling factor of $\sqrt 2$.
While we are mostly interested in standard Gaussian vectors and measures, the joint distribution of any jointly Gaussian sequence $X_1, X_2, \ldots, X_n$ is referred to as a Gaussian measure on $\mathbb{R}^n$. Sequences or measures that are not centered are also considered. However, this does not add a lot to generality: any such measure is a pushforward of the standard Gaussian measure via a linear (or affine, as appropriate) map.
Let $G$ be a standard Gaussian vector in $\mathbb{R}^n$. Rotational invariance of $\gamma_n$ implies that the random variable $\frac{G}{|G|}$ is uniformly distributed on the sphere $S^{n-1}$; moreover $|G|$ and $\frac{G}{|G|}$ are independent. This can be used to relate Gaussian averages and spherical averages. For any function $f : \mathbb{R}^n \to \mathbb{R}_+$ satisfying $f(tx) = t f(x)$ whenever $x \in \mathbb{R}^n$ and $t \ge 0$, we have
(A.7) $\displaystyle\int_{\mathbb{R}^n} f\, d\gamma_n = \mathbf{E}\, f(G) = \kappa_n \int_{S^{n-1}} f\, d\sigma,$
where $\sigma$ is the uniform measure on $S^{n-1}$ and $\kappa_n$ is the constant
(A.8) $\kappa_n := \mathbf{E}\,|G| = \dfrac{\sqrt 2\, \Gamma((n + 1)/2)}{\Gamma(n/2)} .$
In particular, (A.7) can be applied when $f$ is the gauge associated to a convex body. (See also Exercise A.6.)
The constant $\kappa_n$ appears in probability and statistics as the mean of $\chi(n)$, the chi distribution with $n$ degrees of freedom. (See Exercise 5.34 for bounds for the median, which is necessarily smaller than $\kappa_n$ by Proposition 5.34.) The first values are $\kappa_1 = \sqrt{2/\pi}$, $\kappa_2 = \sqrt{\pi/2}$, $\kappa_3 = 2\sqrt{2/\pi}$. Note also the formula $\kappa_n \kappa_{n+1} = n$. For large $n$, we have $\kappa_n \sim \sqrt n$. More precise estimates are gathered in the following proposition.

Proposition A.1 (see Exercises A.4 and A.5; (iv) and (v) are not proved here). Let $\kappa_n$ be the constant defined in (A.8). Then
(i) $\sqrt{n - 1} \le \kappa_n \le \sqrt n$,
(ii) the sequence $\kappa_n / \sqrt n$ is increasing,
(iii) $\sqrt{n - \frac12} \le \kappa_n \le \sqrt{n - \frac{n}{2n + 1}}$,
(iv) the sequence $\sqrt n - \kappa_n$ is non-increasing,
(v) as $n$ tends to infinity, we have $\kappa_n = \sqrt n\, \big(1 - 1/4n + 1/32n^2 + O(1/n^3)\big)$.
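The estimates of Proposition A.1 can be sanity-checked numerically. The sketch below evaluates (A.8) through the log-gamma function to avoid overflow for large $n$; the function name and the tested range are assumptions of the example.

```python
import math

def kappa(n):
    """kappa_n = sqrt(2) * Gamma((n+1)/2) / Gamma(n/2), computed via log-gamma."""
    return math.sqrt(2) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))

product_ok = all(abs(kappa(n) * kappa(n + 1) - n) < 1e-9 * n for n in range(1, 201))
part_i = all(math.sqrt(n - 1) <= kappa(n) <= math.sqrt(n) for n in range(1, 201))
part_ii = all(kappa(n) / math.sqrt(n) < kappa(n + 1) / math.sqrt(n + 1) for n in range(1, 201))
part_iii = all(math.sqrt(n - 0.5) <= kappa(n) <= math.sqrt(n - n / (2 * n + 1))
               for n in range(1, 201))
n = 1000
expansion_error = abs(kappa(n) - math.sqrt(n) * (1 - 1 / (4 * n) + 1 / (32 * n * n)))
```

These checks cover $n \le 200$ only and are no substitute for the proofs requested in Exercises A.4 and A.5.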
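The complex analogue of (A.7) is as follows: if $f : \mathbb{C}^n \to \mathbb{R}_+$ satisfies $f(tx) = t f(x)$ whenever $x \in \mathbb{C}^n$ and $t \ge 0$, we have
(A.9) $\displaystyle\int_{\mathbb{C}^n} f\, d\gamma_n^{\mathbb{C}} = \kappa_n^{\mathbb{C}} \int_{S_{\mathbb{C}^n}} f\, d\sigma$
with $\kappa_n^{\mathbb{C}} = \kappa_{2n} / \sqrt 2$.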
Exercise A.3. Let $n \ge 2$. Prove the following result sometimes known as the Herschel–Maxwell theorem: up to scaling, $\gamma_n$ is the only rotationally-invariant probability measure on $\mathbb{R}^n$ which is also a product measure.
Exercise A.4. Using the fact that the function $\log \Gamma$ is convex, show parts (i) and (ii) of Proposition A.1.
Exercise A.5. Prove part (iii) of Proposition A.1 by showing that the corresponding ratios are monotone along even and odd subsequences.
Exercise A.6. State and prove a variant of (A.7) for $\alpha$-homogeneous functions, i.e., verifying $f(tx) = t^\alpha f(x)$ for $x \in \mathbb{R}^n$ and $t > 0$.

Notes and Remarks

The Komatu inequalities (A.4) appeared in [Kom55]. The upper bound in (A.5) was proved in [Sam53, SW99] and the lower bound in [RW00]. A survey paper on related inequalities is [Due10]. Part (iii) from Proposition A.1 is from [Chu62]. Part (iv) follows for example from the refined inequality $\kappa_n \ge \sqrt{n - \frac12 + \frac{1}{8(n+1)}}$ from [Boy67]. Another derivation appears as Lemma C.4 in [FR13].
For many characterizations of the Gaussian measure in the spirit of Exercise A.3, see [Bry95].
APPENDIX B

Classical groups and manifolds

This appendix contains an overview of the classical groups and manifolds that appear in this book, and of the natural structures, such as metrics and measures, which they carry. Most of the facts included here have been known for 100 years or more, but the precise statements are often difficult to find in the literature, mostly because presentations of these topics usually focus on more general and more abstract settings. Again, more specialized features of these objects are studied elsewhere in this book, primarily in Chapter 5.

B.1. The unit sphere $S^{n-1}$ or $S_{\mathbb{C}^d}$
We denote by $S^{n-1} = \{x \in \mathbb{R}^n : |x| = 1\}$ the unit sphere in $\mathbb{R}^n$. There are two natural distances on the sphere: the (intrinsic) geodesic distance (“as the crow flies”) denoted by $g$ and the extrinsic distance (“as the mole burrows”), i.e., the restriction to $S^{n-1}$ of the Euclidean distance $|\cdot|$ on $\mathbb{R}^n$. Since they are related by the formula $|x - y| = 2 \sin(g(x, y)/2)$, statements about $|\cdot|$ have immediate translations involving $g$ and vice versa. Note also that, for any $x, y \in S^{n-1}$, we have
(B.1) $\dfrac{2}{\pi}\, g(x, y) \le |x - y| \le g(x, y) .$
We denote by $\sigma$ the uniform measure on $S^{n-1}$, normalized so that $\sigma(S^{n-1}) = 1$. We note for the record that the non-normalized $(n - 1)$-dimensional “surface area” of $S^{n-1}$ equals
(B.2) $\operatorname{vol}_{n-1}\big(S^{n-1}\big) = \dfrac{2\pi^{n/2}}{\Gamma\big(\frac n2\big)} .$
However, $\sigma$ can be also induced by the Lebesgue measure $\operatorname{vol}_n$ on $\mathbb{R}^n$ as follows: for any Borel set $A \subset S^{n-1}$,
$$\sigma(A) = \frac{\operatorname{vol}_n \{tx : t \in [0, 1],\ x \in A\}}{\operatorname{vol}_n B_2^n} .$$
We note for the record the formula for the volume of the unit ball $B_2^n$
(B.3) $\operatorname{vol}\big(B_2^n\big) = \dfrac{\pi^{n/2}}{\Gamma\big(\frac n2 + 1\big)} .$
If $G$ is a standard Gaussian vector on $\mathbb{R}^n$, then $G/|G|$ is distributed according to $\sigma$. This is an efficient procedure to simulate the uniform measure on the sphere.
We denote by $S_{\mathbb{C}^d}$ the unit sphere in $\mathbb{C}^d$. Since $\mathbb{C}^d$ identifies with $\mathbb{R}^{2d}$ as a real vector space, and $S_{\mathbb{C}^d}$ with $S^{2d-1}$ as a metric measure space, the preceding discussion is also valid for $S_{\mathbb{C}^d}$. Note also the formula, for $x, y \in S_{\mathbb{C}^d}$,
(B.4) $g(x, y) = \arccos \operatorname{Re}\, \langle x, y\rangle .$
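The simulation procedure $G/|G|$ and the relations between the two distances can be illustrated as follows. This is a sketch under stated assumptions: the function names, the dimension, the sample size and the random seed are all illustrative choices.

```python
import numpy as np

def uniform_sphere(n, size, rng):
    """Sample `size` points uniformly on S^{n-1} by normalizing standard Gaussian vectors."""
    g = rng.standard_normal((size, n))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def geodesic(x, y):
    """Geodesic distance arccos Re<x, y> between unit vectors, as in (B.4)."""
    return np.arccos(np.clip(np.real(np.vdot(x, y)), -1.0, 1.0))

rng = np.random.default_rng(0)
points = uniform_sphere(5, 100, rng)
pairs = list(zip(points[:-1], points[1:]))
chords = np.array([np.linalg.norm(x - y) for x, y in pairs])
arcs = np.array([geodesic(x, y) for x, y in pairs])
```

The arrays `chords` and `arcs` hold the extrinsic and geodesic distances of consecutive sample points, so the formula $|x - y| = 2\sin(g(x, y)/2)$ and the inequalities (B.1) can be compared entry by entry.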


Exercise B.1. Show that $\operatorname{vol}(B_2^n)^{1/n} \sim \sqrt{2\pi e/n}$ as $n$ tends to infinity, and that $\operatorname{vol}(B_2^n)^{1/n} \le \sqrt{2\pi e/n}$ for every $n \ge 1$.

B.2. The projective space
We denote by $\mathsf{P}(\mathbb{C}^d)$ the complex projective space on $\mathbb{C}^d$ (more commonly denoted by $\mathbb{CP}^{d-1}$), i.e., the quotient of $S_{\mathbb{C}^d}$ under the identification of unit vectors $\varphi, \psi$ which differ only by their phase; in other words, if $\varphi = e^{i\theta}\psi$ for some $\theta \in \mathbb{R}$. When $\psi \in S_{\mathbb{C}^d}$, we will occasionally denote by $[\psi]$ its class in $\mathsf{P}(\mathbb{C}^d)$. We equip $\mathsf{P}(\mathbb{C}^d)$ with the following metric (called Fubini–Study metric, or Bures metric)
(B.5) $d([\psi], [\chi]) = \arccos |\langle\psi, \chi\rangle| .$
The quantity $|\langle\psi, \chi\rangle|$ is called the overlap of the vectors $\psi$ and $\chi$ or, more properly, of $[\psi]$ and $[\chi]$.
We also introduce the Segré variety on the bipartite Hilbert space $\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$, defined as
(B.6) $\mathrm{Seg} = \{\varphi \otimes \psi : \varphi \in S_{\mathbb{C}^{d_1}},\ \psi \in S_{\mathbb{C}^{d_2}}\} .$
As defined in (B.6), $\mathrm{Seg}$ is a subset of the unit sphere $S_{\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}}$ with real dimension $2(d_1 + d_2) - 3$. Alternatively, one could define the Segré variety as a subset of the projective space $\mathsf{P}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$. In that case it has complex dimension $d_1 + d_2 - 2$.
The real projective space $\mathsf{P}(\mathbb{R}^m)$ is defined and endowed with metric mutatis mutandis starting from the sphere $S^{m-1}$. However, the real setting generally appears in quantum theory only as a toy model. Note that the more standard (and more general) definition of the projective space $\mathsf{P}(V)$ associated to a vector space $V$ over an arbitrary field $K$ is by identification of vectors $u, v \in V \setminus \{0\}$ such that $u = kv$ for some $k \in K \setminus \{0\}$. However, the equivalent approach starting from the sphere $S^{m-1}$ or $S_{\mathbb{C}^d}$ fits better the standard setup of quantum theory.
Exercise B.2. Check that the Fubini–Study metric is obtained as the quotient metric from the geodesic metric on the unit sphere.
Exercise B.3 (Bures vs. Fubini–Study, fidelity vs. overlap). The Bures metric is usually defined for not-necessarily-pure states $\sigma, \tau \in D$ by
(B.7) $d(\sigma, \tau) = \arccos F(\sigma, \tau),$
where $F(\sigma, \tau) = \operatorname{Tr} \sqrt{\sqrt\sigma\, \tau \sqrt\sigma}$ is the fidelity between $\sigma$ and $\tau$. (Note that some texts define fidelity as the square of this quantity.) (i) Verify that if $\tau = |\chi\rangle\langle\chi|$, then $F(\sigma, \tau) = \sqrt{\langle\chi|\sigma|\chi\rangle}$. (ii) Deduce that if $\sigma = |\psi\rangle\langle\psi|$, $\tau = |\chi\rangle\langle\chi|$, then (B.5) and (B.7) yield the same value (in other words, the Fubini–Study metric is the restriction of the Bures metric to pure states and similarly for the fidelity vs. the overlap). (iii) Verify that $d(\sigma, \tau)$ is indeed a metric.
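Parts (i) and (ii) of Exercise B.3 can be explored numerically. The sketch below computes the fidelity through spectral decompositions; the helper names, the dimension and the randomly generated states are assumptions made for the example. Since square roots of numerically zero eigenvalues lose precision, comparisons should use a tolerance of about $10^{-6}$.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite matrix via its spectral decomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def fidelity(sigma, tau):
    """F(sigma, tau) = Tr sqrt( sqrt(sigma) tau sqrt(sigma) )."""
    s = psd_sqrt(sigma)
    return np.trace(psd_sqrt(s @ tau @ s)).real

def bures_distance(sigma, tau):
    """arccos F(sigma, tau), as in (B.7)."""
    return np.arccos(min(1.0, fidelity(sigma, tau)))

rng = np.random.default_rng(2)
d = 3
g = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
sigma = g @ g.conj().T
sigma = sigma / np.trace(sigma).real                      # a random mixed state
chi = rng.standard_normal(d) + 1j * rng.standard_normal(d)
chi = chi / np.linalg.norm(chi)
psi = rng.standard_normal(d) + 1j * rng.standard_normal(d)
psi = psi / np.linalg.norm(psi)

tau = np.outer(chi, chi.conj())                           # |chi><chi|
pure = np.outer(psi, psi.conj())                          # |psi><psi|
mixed_vs_pure = fidelity(sigma, tau)
fubini_study = np.arccos(abs(np.vdot(psi, chi)))          # formula (B.5)
```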

B.3. The orthogonal and unitary groups $\mathsf{O}(n)$, $\mathsf{U}(n)$
We denote by $\mathsf{O}(n) = \{O \in \mathsf{M}_n(\mathbb{R}) : OO^\dagger = \mathrm{I}\}$ the orthogonal group and by $\mathsf{U}(n) = \{U \in \mathsf{M}_n(\mathbb{C}) : UU^\dagger = \mathrm{I}\}$ the unitary group. Their dimensions, as real Riemannian manifolds, are $\dim \mathsf{O}(n) = n(n - 1)/2$ and $\dim \mathsf{U}(n) = n^2$. We also recall the standard notation $\mathsf{SO}(n) = \{O \in \mathsf{O}(n) : \det(O) = 1\}$, $\mathsf{SU}(n) = \{U \in \mathsf{U}(n) : \det(U) = 1\}$, and $\mathsf{PSU}(n)$ for the quotient of $\mathsf{U}(n)$ under the relation $U \sim V \iff U = \lambda V$ for some $\lambda \in \mathbb{C}$. Note that $\mathsf{O}(n)$ is a disjoint union of two copies of $\mathsf{SO}(n)$, so all statements about $\mathsf{SO}(n)$ transfer mutatis mutandis to $\mathsf{O}(n)$. We also point out the classical isomorphism $\mathsf{PSU}(2) \leftrightarrow \mathsf{SO}(3)$ (see Exercise B.4).
In what follows $G$ will stand for either $\mathsf{SO}(n)$, $\mathsf{SU}(n)$ or $\mathsf{U}(n)$. There are many metric structures one may consider on $G$. Each norm $\|\cdot\|$ on $\mathsf{M}_n$ induces two distances on $G$: the extrinsic distance (simply $\|U - V\|$, for $U, V \in G$) and the geodesic distance (the length of a shortest path in $G$ joining $U$ to $V$, where length is measured with respect to $\|\cdot\|$). For $p \in [1, \infty]$, we will denote by $g_p$ the geodesic distance induced by the Schatten $p$-norm.
Among these choices we single out the standard Riemannian metric $g_2$, which can be expressed for $U, V \in G$ as
(B.8) $g_2(U, V) = \Big(\displaystyle\sum_{i=1}^n \theta_i^2\Big)^{1/2}$
where $e^{i\theta_1}, \ldots, e^{i\theta_n}$ are the eigenvalues of $U^{-1}V$, and $\theta_j \in [-\pi, \pi]$. (See Exercise B.5.)
Proposition B.1 (not proved here). Let 1 ď p ď 8. Let U, V P G, and

rd
A P Msa
n with }A}8 ď π such that exppiAq “ U
´1
V . Then the map t ÞÑ U exppitAq,
defined for t P r0, 1s, is a geodesic joining U to V for the distance gp . If }U ´V }8 ă
fo
2 and 1 ă p ă 8, this is the unique path of minimal length.
The above result is very well-known for p “ 2, but it is also valid, with the
ot
stated caveats, for other values of p. As a consequence of Proposition B.1, extrinsic
and geodesic distances are easy to calculate and they are comparable, see Exercise
N

B.5. Note that if }U ´ V }8 “ 2, then A (which necessarily verifies }A}8 “ π) is no


longer uniquely determined and neither are the geodesics. See also Exercise B.6.
We point out that while Proposition B.1 appears to be stated in the complex setting (i.e., $G = \mathrm{SU}(n)$ or $G = \mathrm{U}(n)$), it makes sense just as well when $G = \mathrm{SO}(n)$: the matrix $B = iA$ is then real skew-symmetric (see Exercise B.7). Moreover, it follows then that $\mathrm{SO}(n)$ is a geodesically convex submanifold of $\mathrm{U}(n)$, i.e., that the shortest curve in $\mathrm{U}(n)$ connecting any two points in $\mathrm{SO}(n)$ is entirely contained in $\mathrm{SO}(n)$ (or at least that there exists such a curve, if the shortest curve is not unique).
As compact groups, $\mathrm{O}(n)$ and $\mathrm{U}(n)$ carry a Haar measure: the unique probability measure which is invariant under right and/or left multiplication. The Haar measure can also be generated in a more concrete fashion. For example, start from a vector $x_1$ uniformly distributed on $S^{n-1}$ (resp., $S_{\mathbb{C}^n}$), and construct inductively a random orthonormal basis $(x_1, \dots, x_n)$ by choosing $x_k$ uniformly on the unit sphere in the subspace $\{x_1, \dots, x_{k-1}\}^{\perp}$. Then the random matrix with columns $(x_1, \dots, x_n)$ is Haar-distributed on $\mathrm{O}(n)$ (resp., $\mathrm{U}(n)$). A slightly different scheme is outlined in Exercise B.14.
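
The inductive construction described above can be sketched in a few lines of code for the real case. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `haar_orthogonal` is hypothetical, and the uniform point on the unit sphere of $\{x_1, \dots, x_{k-1}\}^{\perp}$ is obtained by projecting a standard Gaussian vector onto that subspace and normalizing (which relies on the rotation invariance of the Gaussian distribution).

```python
import math
import random


def haar_orthogonal(n, rng):
    """Return the columns x_1, ..., x_n of a random orthogonal matrix.

    Each x_k is (up to floating-point error) uniform on the unit sphere of
    the orthogonal complement of x_1, ..., x_{k-1}: a Gaussian vector is
    projected onto that complement and then normalized.
    """
    columns = []
    while len(columns) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the vectors already chosen.
        for c in columns:
            coefficient = sum(a * b for a, b in zip(v, c))
            v = [a - coefficient * b for a, b in zip(v, c)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:  # a zero projection has probability zero; retry if it occurs
            columns.append([a / norm for a in v])
    return columns


if __name__ == "__main__":
    cols = haar_orthogonal(4, random.Random(0))
    # The Gram matrix of the columns should be close to the identity.
    for x in cols:
        print([round(sum(a * b for a, b in zip(x, y)), 12) for y in cols])
```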


Exercise B.4 ($\mathrm{PSU}(2)$ and $\mathrm{SO}(3)$ are isomorphic). The group $\mathrm{U}(2)$ acts on the (real, 3-dimensional) hyperplane of trace zero matrices by the formula $X \mapsto UXU^{\dagger}$. This action preserves the Hilbert–Schmidt inner product. Check that this action induces an isomorphism between $\mathrm{PSU}(2)$ and $\mathrm{SO}(3)$.
Exercise B.5 (Equivalence of metrics on $G$). Let $1 \le p \le \infty$, let $G$ be either $\mathrm{SO}(n)$, $\mathrm{SU}(n)$ or $\mathrm{U}(n)$, and let $U, V \in G$.
(i) Denote by $e^{i\theta_1}, \dots, e^{i\theta_n}$ the (complex) eigenvalues of $U^{-1}V$, with $|\theta_j| \le \pi$. Show that
\[ g_p(U, V) = \|(\theta_1, \dots, \theta_n)\|_p, \]
where the norm on the right-hand side is the $p$-norm on $\mathbb{R}^n$.
(ii) Check that the geodesic and extrinsic metrics satisfy the inequalities
\[ \frac{2}{\pi}\, g_p(U, V) \le \|U - V\|_p \le g_p(U, V) \]
for any $U, V \in G$.
Exercise B.6 (All the geodesics in $G$). Show that it follows formally from Proposition B.1 that all paths of the form $t \mapsto We^{itA}$, $A \in M_n^{\mathrm{sa}}$, $W \in G$, and $t \in \mathbb{R}$ are geodesics in the sense that all their "sufficiently short" arcs are the shortest curves connecting their endpoints (unique if $1 < p < \infty$). Moreover, for $1 < p < \infty$ all such shortest curves are unique and hence all geodesics are of that form.
Exercise B.7 (Geodesic convexity of $\mathrm{SO}(n)$). Show that for any $U \in \mathrm{SO}(n)$ there is a real skew-symmetric matrix $B$ with $\|B\|_\infty \le \pi$ such that $U = e^{B}$ and $g_p(I, U) = \|B\|_p$. Conclude that $\mathrm{SO}(n)$ is a geodesically convex submanifold of $\mathrm{U}(n)$ with respect to any metric $g_p$.
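Exercise B.8 (Bi-Lipschitz estimates for the exponential map).
(i) Show that $\|\exp(iB) - \exp(iA)\|_{\mathrm{op}} \le \|B - A\|_{\mathrm{op}}$ for every $A, B \in M_n^{\mathrm{sa}}$.
(ii) Consider, for $\theta \in (0, \pi)$,
\[ L(\theta) = \inf\left\{ \frac{\|\exp(iB) - \exp(iA)\|_{\mathrm{op}}}{\|B - A\|_{\mathrm{op}}} \;:\; A, B \in M_n^{\mathrm{sa}},\ \|A\|_{\mathrm{op}} \le \theta,\ \|B\|_{\mathrm{op}} \le \theta,\ A \ne B \right\}. \]
Show that for $\theta \in (0, 2\pi/3)$ we have $L(\theta) \ge L(\theta/2)\,(1 - |1 - e^{i\theta/2}|)$. Conclude that (for example) $L(\pi/4) \ge 0.4$.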
Problem B.2. What is the precise value of $L(\theta)$ in Exercise B.8? We did not find an answer in the literature (but we did not look very hard). An easy upper bound is $\sin(\theta)/\theta$ (check).

B.4. The Grassmann manifolds $\mathrm{Gr}(k, \mathbb{R}^n)$, $\mathrm{Gr}(k, \mathbb{C}^n)$

Let $V$ be a finite-dimensional real or complex vector space. For $0 < k < \dim V$, we denote by $\mathrm{Gr}(k, V)$ the family of all $k$-dimensional subspaces of $V$. The set $\mathrm{Gr}(k, V)$ is called the Grassmann manifold or the Grassmannian. Since its properties effectively depend only on the dimension of $V$, in what follows we consider only the concrete situations $\mathrm{Gr}(k, \mathbb{R}^n)$ and $\mathrm{Gr}(k, \mathbb{C}^n)$. (See, however, Exercise B.15.) Further, since the map $E \leftrightarrow E^{\perp}$ is a bijection between $\mathrm{Gr}(k, \mathbb{R}^n)$ and $\mathrm{Gr}(n - k, \mathbb{R}^n)$ preserving all the structures we will be interested in (and similarly for $\mathbb{C}^n$), the reader may always concentrate on the cases when $k \le n/2$, which we will often tacitly assume.
Before discussing metrics on the Grassmann manifold we introduce the concept of principal angles. Given $E, F \in \mathrm{Gr}(k, \mathbb{R}^n)$ or $\mathrm{Gr}(k, \mathbb{C}^n)$, consider the singular value decomposition of the operator $P_E P_F$ (recall that $P_E$ denotes the orthogonal projection onto $E$), which we will write in the form given by (2.10)
\[ P_E P_F = \sum_{i=1}^{k} s_i\, |x_i\rangle\langle y_i| \tag{B.9} \]
with $s_i \in [0, 1]$, $x_1, \dots, x_k \in E$, and $y_1, \dots, y_k \in F$ (the latter inclusions are automatic for $x_i, y_i$ corresponding to coefficients $s_i \ne 0$ and can be arranged otherwise). The principal angles between $E$ and $F$ are the numbers $\theta_1, \dots, \theta_k \in [0, \pi/2]$ defined by $\cos\theta_i = s_i$. The unit vectors $x_1, \dots, x_k$ and $y_1, \dots, y_k$ are called principal vectors. It is easily checked that we have $\langle x_i, x_j\rangle = \langle x_i, y_j\rangle = \langle y_i, y_j\rangle = 0$ for $i \ne j$ and that $s_i = \langle x_i, y_i\rangle$; the equality means that $\theta_i$ is actually the angle between $x_i$ and $y_i$ and, at the same time, the angle between $\mathbb{R}x_i$ and $\mathbb{R}y_i$ (or, in the complex setting, the Fubini–Study distance between $[x_i]$ and $[y_i]$ given by (B.5)).
The principal angles quantify how close two subspaces are to each other. As we shall see, a natural Riemannian metric on Grassmann manifolds is as follows: if $E, F \in \mathrm{Gr}(k, \mathbb{R}^n)$ or $\mathrm{Gr}(k, \mathbb{C}^n)$, then
\[ d(E, F) = \sqrt{2}\, \Big( \sum_{i=1}^{k} \theta_i^2 \Big)^{1/2}, \tag{B.10} \]
where $\theta_1, \dots, \theta_k$ are the principal angles between $E$ and $F$. The reader may wonder why we included the factor $\sqrt{2}$, which may appear redundant, both geometrically and esthetically. Indeed, as noted above, the natural metric on the projective space (Fubini–Study in the complex case), which corresponds to the case $k = 1$ of the Grassmannian, does not have that factor. However, as we shall see, there are sound functorial reasons for using the normalization (B.10): it shows up in two canonical constructions of the Grassmann manifold.
Another very natural way to define the distance in terms of principal angles is
\[ d_\infty(E, F) = \max_{1 \le i \le k} \theta_i. \tag{B.11} \]
However, the metric $d_\infty$ is not Riemannian; an important (and obvious) inequality relating the two metrics is
\[ d_\infty(E, F) \le 2^{-1/2}\, d(E, F). \tag{B.12} \]
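
To make the definitions (B.9)–(B.12) concrete, here is a small sketch computing principal angles between two planes. It is an illustration under explicit assumptions: it handles only $k = 2$, it expects each plane to be given by an orthonormal pair of vectors, and it uses the standard fact (not stated above) that the nonzero singular values of $P_E P_F$ coincide with those of the $2 \times 2$ matrix $(\langle x_i, y_j\rangle)$. All function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def principal_angles_2d(x, y):
    """Principal angles between span(x[0], x[1]) and span(y[0], y[1]).

    Both pairs are assumed orthonormal. The cosines of the angles are the
    singular values of M = (<x_i, y_j>), i.e. the square roots of the
    eigenvalues of the symmetric 2x2 matrix M^T M (closed form below).
    """
    m = [[dot(xi, yj) for yj in y] for xi in x]
    a = m[0][0] ** 2 + m[1][0] ** 2          # (M^T M)_{00}
    c = m[0][1] ** 2 + m[1][1] ** 2          # (M^T M)_{11}
    b = m[0][0] * m[0][1] + m[1][0] * m[1][1]  # (M^T M)_{01}
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    eigenvalues = (mean + radius, mean - radius)
    # Clamp to [0, 1] to absorb rounding errors before taking acos.
    cosines = [math.sqrt(min(1.0, max(0.0, ev))) for ev in eigenvalues]
    return [math.acos(s) for s in cosines]


def riemannian_distance(angles):
    """The metric d of (B.10)."""
    return math.sqrt(2) * math.sqrt(sum(t * t for t in angles))


def max_angle_distance(angles):
    """The metric d_infinity of (B.11)."""
    return max(angles)


if __name__ == "__main__":
    phi = 0.3
    e_plane = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    f_plane = [(1.0, 0.0, 0.0), (0.0, math.cos(phi), math.sin(phi))]
    angles = principal_angles_2d(e_plane, f_plane)
    print(angles, riemannian_distance(angles), max_angle_distance(angles))
```

In this example the two planes share the line spanned by the first basis vector, so the principal angles are (numerically close to) $0$ and $\phi$.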
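Fix $0 < k < n$ and consider the canonical action of $\mathrm{O}(n)$ on $\mathrm{Gr}(k, \mathbb{R}^n)$. Let $\mathbb{R}^k \subset \mathbb{R}^n$ be the canonical inclusion, so that $\mathbb{R}^k \in \mathrm{Gr}(k, \mathbb{R}^n)$. We now note that the stabilizer subgroup of $\mathrm{O}(n)$ that fixes $\mathbb{R}^k$ consists of block-diagonal matrices of the form
\[ \begin{bmatrix} O_1 & 0 \\ 0 & O_2 \end{bmatrix}, \]
where $O_1 \in \mathrm{O}(k)$ and $O_2 \in \mathrm{O}(n - k)$, and so it can be naturally identified with $\mathrm{O}(k) \times \mathrm{O}(n - k)$. Since the action of $\mathrm{O}(n)$ on $\mathrm{Gr}(k, \mathbb{R}^n)$ is transitive, it follows that $\mathrm{Gr}(k, \mathbb{R}^n)$ is a homogeneous space for $\mathrm{O}(n)$ and can be identified with the quotient space $\mathrm{O}(n)/(\mathrm{O}(k) \times \mathrm{O}(n - k))$. It follows in particular that the dimension of $\mathrm{Gr}(k, \mathbb{R}^n)$ equals $\dim \mathrm{O}(n) - \big(\dim \mathrm{O}(k) + \dim \mathrm{O}(n - k)\big) = k(n - k)$.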
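
The dimension count can be spelled out as follows. This short computation assumes the standard value $\dim \mathrm{O}(m) = m(m-1)/2$ (the dimension of the space of $m \times m$ skew-symmetric matrices), which is not stated explicitly in the text above.

```latex
\begin{align*}
\dim \mathrm{O}(n) - \dim \mathrm{O}(k) - \dim \mathrm{O}(n-k)
  &= \frac{n(n-1)}{2} - \frac{k(k-1)}{2} - \frac{(n-k)(n-k-1)}{2} \\
  &= \frac{(n^2 - n) - (k^2 - k) - (n^2 - 2nk + k^2 - n + k)}{2} \\
  &= \frac{2nk - 2k^2}{2} = k(n-k).
\end{align*}
```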
For a more concrete description of this correspondence, consider the map $\mathrm{O}(n) \to \mathrm{Gr}(k, \mathbb{R}^n)$ that associates to an orthogonal matrix $O$ the subspace spanned by its first $k$ columns, i.e., $O\mathbb{R}^k$. The preimage of $E \in \mathrm{Gr}(k, \mathbb{R}^n)$ under this map, i.e., the set $\{O \in \mathrm{O}(n) : O(\mathbb{R}^k) = E\}$, is a (left) coset of $\mathrm{O}(k) \times \mathrm{O}(n - k)$. Similarly, $\mathrm{Gr}(k, \mathbb{C}^n)$ identifies with the quotient space $\mathrm{U}(n)/(\mathrm{U}(k) \times \mathrm{U}(n - k))$. Note that $\mathrm{Gr}(k, \mathbb{C}^n)$ is a complex manifold (of complex dimension $k(n - k)$), although $\mathrm{U}(n)$ is not. As pointed out earlier, $\mathrm{Gr}(1, V)$ identifies with the projective space $\mathrm{P}(V)$, except that the metric (B.10) differs from the Fubini–Study metric (B.5) by a factor of $\sqrt{2}$ when $V = \mathbb{C}^n$. We explain the reasons for this factor further below, particularly in the paragraph containing (B.14). (The same formulas and the same caveats apply to the case $V = \mathbb{R}^n$.) On the other hand, the metric $d_\infty$ defined by (B.11) coincides, for $k = 1$, with the Fubini–Study distance.
Whether we use the high-tech or simple-minded point of view, there is a canonical procedure that allows one to transfer metric structure(s) from $\mathrm{O}(n)$ to $\mathrm{Gr}(k, \mathbb{R}^n)$ (and from $\mathrm{U}(n)$ to $\mathrm{Gr}(k, \mathbb{C}^n)$). We will exemplify that procedure in the case of the (extrinsic) Schatten $p$-norm for $1 \le p \le \infty$. We set, for $E, F \in \mathrm{Gr}(k, \mathbb{R}^n)$,
\begin{align*} \tilde{h}_p(E, F) &:= \min\{\|U - V\|_p : U, V \in \mathrm{O}(n),\ U\mathbb{R}^k = E,\ V\mathbb{R}^k = F\} \\ &\;= \min\{\|W - I\|_p : W \in \mathrm{O}(n),\ WE = F\} \tag{B.13} \end{align*}
and similarly for $E, F \in \mathrm{Gr}(k, \mathbb{C}^n)$. The definition ":=" works mutatis mutandis for any quotient map on (or, equivalently, for any family of "cosets" in) a metric space, but the second equality requires that space to be a group with invariant metric (see also Exercise B.9).
The same scheme can be applied to the geodesic metric $g_p$ on $\mathrm{O}(n)$ or $\mathrm{U}(n)$. In particular, if $p = 2$, we obtain the standard Riemannian structure on $\mathrm{Gr}(k, \mathbb{R}^n)$ or $\mathrm{Gr}(k, \mathbb{C}^n)$ and the resulting metric is (B.10), while $p = \infty$ yields the metric $d_\infty$ from (B.11) (see Exercise B.12). Moreover, it doesn't matter whether we first define the geodesic metric and then pass to a quotient, or whether we reverse the order of these operations (see Exercise B.10).
It is instructive to specify the calculations implicit in the above paragraph to the simplest nontrivial setting, that of the real projective space $\mathrm{P}(\mathbb{R}^2)$, or $\mathrm{Gr}(1, \mathbb{R}^2)$. If the angle between two lines $E, F \subset \mathbb{R}^2$ (i.e., their Fubini–Study distance (B.5)) is $\theta \in (0, \pi/2]$, then the eigenvalues of the rotation $W$ mapping $E$ to $F$ are $e^{i\theta}$ and $e^{-i\theta}$ and so $W = e^{iA}$, where $A \in M_2^{\mathrm{sa}}$ has eigenvalues $\theta$ and $-\theta$ (cf. the calculation in the hint to Exercise B.7). It follows that the intrinsic Riemannian distance induced by $g_2$ and the quotient map $\mathrm{O}(2) \to \mathrm{Gr}(1, \mathbb{R}^2)$ verifies
\[ d(E, F) = g_2(W, I) = \|A\|_2 = (\theta^2 + \theta^2)^{1/2} = \sqrt{2}\,\theta, \tag{B.14} \]
which explains the factor $\sqrt{2}$ appearing in (B.10). Observe that the second equality in (B.14) is a straightforward application of Proposition B.1; the first one requires noting that if $R \in \mathrm{O}(2)$ is the reflection swapping $E$ and $F$, then (again by Proposition B.1) $g_2(R, I) = \pi > \sqrt{2}\,\pi/2 \ge g_2(W, I)$.
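
The two values of $g_2$ used in this argument are easy to confirm numerically from formula (B.8). The sketch below is illustrative only; the helper names are hypothetical, and the eigenvalues of a $2 \times 2$ matrix are obtained from its trace and determinant.

```python
import cmath
import math


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix via the characteristic polynomial."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)


def g2_from_identity(w):
    """g_2(I, W) for W in O(2), computed from (B.8)."""
    return math.sqrt(sum(cmath.phase(ev) ** 2 for ev in eigenvalues_2x2(w)))


def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


def reflection(alpha):
    """Reflection of the plane across the line at angle alpha."""
    return [[math.cos(2 * alpha), math.sin(2 * alpha)],
            [math.sin(2 * alpha), -math.cos(2 * alpha)]]


if __name__ == "__main__":
    theta = 0.7
    print(g2_from_identity(rotation(theta)), math.sqrt(2) * theta)
    print(g2_from_identity(reflection(theta / 2)), math.pi)
```

A reflection has eigenvalues $1$ and $-1$, hence angles $0$ and $\pi$, which is why its distance from the identity is $\pi$ regardless of the axis.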
And here is a slightly different approach to endowing a Grassmann manifold with a metric. The map $E \mapsto P_E$ allows one to consider (for example) $\mathrm{Gr}(k, \mathbb{R}^n)$ as a submanifold of $M_n^{\mathrm{sa}}$, so any norm on $M_n$ also induces two metrics (extrinsic vs. geodesic) on $\mathrm{Gr}(k, \mathbb{R}^n)$. As it turns out, the geodesic metric obtained from the Hilbert–Schmidt norm is again (B.10). For an analysis of this situation via principal angles, see Exercise B.13.
Finally, let us note that since $\mathrm{SO}(n)$ acts transitively on $\mathrm{Gr}(k, \mathbb{R}^n)$, the Grassmann manifold can be likewise represented as a quotient of $\mathrm{SO}(n)$, and similarly for $\mathrm{Gr}(k, \mathbb{C}^n)$ and $\mathrm{SU}(n)$, a point of view that can be occasionally useful (cf. the proof of Theorem 7.15). This circle of ideas is explored in Exercises B.16 and B.17.
Each Grassmann manifold carries a natural probability measure which can be constructed in two different but equivalent ways:
• as the normalized Riemannian volume induced by the metric (B.10);
• as the pushforward of the Haar measure on $\mathrm{O}(n)$ via the quotient map $\mathrm{O}(n) \to \mathrm{O}(n)/(\mathrm{O}(k) \times \mathrm{O}(n - k))$.
The latter construction can be described more tangibly as follows: fix $E \in \mathrm{Gr}(k, \mathbb{R}^n)$ and consider a Haar-distributed $O \in \mathrm{O}(n)$; then $O(E)$ is a random element in $\mathrm{Gr}(k, \mathbb{R}^n)$ whose distribution does not depend on the choice of $E$. Either way, we will call the resulting measure the standard Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$. The same construction (using $\mathrm{U}(n)$ instead of $\mathrm{O}(n)$) defines similarly the standard Haar measure on $\mathrm{Gr}(k, \mathbb{C}^n)$. For an even more concrete realization of the standard Haar measure, see Exercise B.14.
Since $\mathrm{O}(n)$ consists of morphisms of the corresponding space that preserve the inner product, the Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$ may be seen as depending on the choice of a Euclidean (i.e., inner product) structure on $\mathbb{R}^n$. Using another Euclidean structure on $\mathbb{R}^n$ leads to a different measure on $\mathrm{Gr}(k, \mathbb{R}^n)$, as Exercise B.15 illustrates. The same caveat applies to the complex case.
To complete the discussion of Grassmann manifolds, we will mention briefly their "cousins," Stiefel manifolds. For $1 \le k \le n$, denote by
\[ \mathrm{St}(k, \mathbb{R}^n) := \{(x_1, \dots, x_k) \in (\mathbb{R}^n)^k : \langle x_i, x_j\rangle = \delta_{i,j}\} \]
the set of $k$-tuples of orthonormal vectors in $\mathbb{R}^n$. We have the canonical equivalences $\mathrm{St}(k, \mathbb{R}^n) \leftrightarrow \mathrm{O}(n)/\mathrm{O}(n - k)$ and $\mathrm{Gr}(k, \mathbb{R}^n) \leftrightarrow \mathrm{St}(k, \mathbb{R}^n)/\mathrm{O}(k)$. The complex version is defined similarly; as for Grassmann manifolds, Stiefel manifolds naturally inherit metrics and a Haar measure from the orthogonal group.
For simplicity, Exercises B.9–B.15 are stated in the real case, but the statements are also valid in the complex case.
Exercise B.9 (Induced metrics on spaces of cosets). Fix $0 < k < n$, $1 \le p \le \infty$, and denote $H = \mathrm{O}(k) \times \mathrm{O}(n - k) \subset \mathrm{O}(n)$. Let $U_0H$ and $V_0H$ be two left cosets of $H$ and let $U_1 \in U_0H$. Show that $\min\{\|U_1 - V\|_p : V \in V_0H\} = \min\{\|U_0 - V\|_p : V \in V_0H\}$ and that a similar equality holds for the corresponding geodesic distance $g_p$.
Exercise B.10 (Geodesics in $\mathrm{Gr}(k, \mathbb{R}^n)$ and in $\mathrm{O}(n)$). Fix $0 < k < n$ and $1 < p < \infty$. Denote by $\tilde{g}_p$ the metric on $\mathrm{Gr}(k, \mathbb{R}^n)$ obtained as the quotient metric from the geodesic metric $g_p$ on $\mathrm{O}(n)$. Show that any geodesic in $\big(\mathrm{Gr}(k, \mathbb{R}^n), \tilde{g}_p\big)$ can be lifted to a geodesic in $\big(\mathrm{O}(n), g_p\big)$, which is of the form given by Exercise B.6 and on which the quotient map acts as an isometry. If $p = 1$ or $\infty$, any two points in $\mathrm{Gr}(k, \mathbb{R}^n)$ can be connected by a geodesic with this property.
Exercise B.11 ($\mathrm{Gr}(k, \mathbb{R}^n)$ vs. $\mathrm{Gr}(n - k, \mathbb{R}^n)$). Let $E, F \in \mathrm{Gr}(k, \mathbb{R}^n)$. Show that the nonzero principal angles between $E$ and $F$ coincide with the nonzero principal angles between $E^{\perp}$ and $F^{\perp}$.
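Exercise B.12 (Equivalence of metrics on $\mathrm{Gr}(k, \mathbb{R}^n)$). Let $\tilde{h}_p$ be the metric on $\mathrm{Gr}(k, \mathbb{R}^n)$ given by (B.13) and let $\tilde{g}_p$ be the geodesic metric defined in Exercise B.10. Show that for $E, F \in \mathrm{Gr}(k, \mathbb{R}^n)$
\[ \tilde{h}_p(E, F) = 2^{1/p}\, \big\|\big(2\sin(\theta_1/2), \dots, 2\sin(\theta_k/2)\big)\big\|_p, \qquad \tilde{g}_p(E, F) = 2^{1/p}\, \|(\theta_1, \dots, \theta_k)\|_p, \]
where $\|\cdot\|_p$ is the $\ell_p$-norm on $\mathbb{R}^k$ and $\theta_1, \dots, \theta_k$ are the principal angles between $E$ and $F$. Conclude that $\frac{2\sqrt{2}}{\pi}\, \tilde{g}_p \le \tilde{h}_p \le \tilde{g}_p$ and that $\tilde{g}_\infty$ coincides with the metric $d_\infty$ from (B.11).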
Exercise B.13 (Equivalence of metrics on $\mathrm{Gr}(k, \mathbb{R}^n)$, take #2). Show that the metric on $\mathrm{Gr}(k, \mathbb{R}^n)$ induced from the Schatten $p$-norm on $M_n$ via the embedding $E \mapsto P_E$ is equivalent to the metrics $\tilde{g}_p$ and $\tilde{h}_p$ from Exercise B.12. Show that the geodesic metric induced by it coincides with $\tilde{g}_p$.
Exercise B.14 (Simulating the Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$). For $1 \le k \le n$, let $(x_i)_{1 \le i \le k}$ be independent standard Gaussian vectors in $\mathbb{R}^n$. Show that the subspace $\mathrm{span}\{x_i : 1 \le i \le k\}$ is almost surely $k$-dimensional and distributed with respect to the standard Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$. Prove also that the same holds when $(x_i)$ are uniformly distributed on the unit sphere.
Exercise B.15 (About the choice of the Euclidean structure). Given a $k$-dimensional subspace $E \subset \mathbb{R}^n$, show that there is a sequence of Euclidean structures on $\mathbb{R}^n$ such that the corresponding Haar measures on $\mathrm{Gr}(k, \mathbb{R}^n)$ converge towards the Dirac mass at $E$.
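Exercise B.16. Does $\mathrm{Gr}(k, \mathbb{R}^n)$ identify with $\mathrm{SO}(n)/(\mathrm{SO}(k) \times \mathrm{SO}(n - k))$? Does $\mathrm{Gr}(k, \mathbb{C}^n)$ identify with $\mathrm{SU}(n)/(\mathrm{SU}(k) \times \mathrm{SU}(n - k))$?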
Exercise B.17 (Another representation of $\mathrm{Gr}(k, \mathbb{R}^n)$ as a coset space). Since the stabilizer of $\mathbb{R}^k$ under the canonical action of $\mathrm{SO}(n)$ on $\mathrm{Gr}(k, \mathbb{R}^n)$ is $H = \mathrm{SO}(n) \cap \big(\mathrm{O}(k) \times \mathrm{O}(n - k)\big)$ (and since the action is transitive), $\mathrm{Gr}(k, \mathbb{R}^n)$ can be likewise identified with $\mathrm{SO}(n)/H$. Are the metrics induced this way by the Schatten $p$-norms the same as the $\tilde{g}_p$'s and $\tilde{h}_p$'s? What about the analogous question for $\mathrm{Gr}(k, \mathbb{C}^n)$? Note that there are no subtleties as far as the induced probability measure is concerned: all reasonable constructions lead to the same object by uniqueness of the Haar measure.

B.5. The Lorentz group $\mathrm{O}(1, n - 1)$

Just as the orthogonal group $\mathrm{O}(n)$ preserves the Euclidean norm on $\mathbb{R}^n$, the Lorentz group $\mathrm{O}(1, n - 1)$ consists of linear transformations preserving the quadratic form $q(x) = x_0^2 - \sum_{k=1}^{n-1} x_k^2$, where $x = (x_0, x_1, \dots, x_{n-1}) \in \mathbb{R}^n$. Let $J$ be the diagonal matrix with diagonal entries $(1, -1, \dots, -1)$, i.e., the matrix inducing $q$ in the sense that $q(x) = \langle x|J|x\rangle$ for $x \in \mathbb{R}^n$. Then
\[ M \in \mathrm{O}(1, n - 1) \iff M^{T}JM = J. \tag{B.15} \]
This immediately shows that $M \in \mathrm{O}(1, n - 1)$ verifies $\det M = \pm 1$ and motivates the definition of the proper Lorentz group
\[ \mathrm{SO}(1, n - 1) := \{M \in \mathrm{O}(1, n - 1) : \det M = 1\}. \tag{B.16} \]
Let $L_n = \{x \in \mathbb{R}^n : x_0 \ge 0 \text{ and } q(x) \ge 0\}$ be the Lorentz cone. If $M \in \mathrm{O}(1, n - 1)$, then clearly $M\big(L_n \cup (-L_n)\big) = L_n \cup (-L_n)$ and so there are two possibilities: either $M(L_n) = L_n$ or $M(L_n) = -L_n$. Again, this motivates the definition of the orthochronous subgroup of the Lorentz group (the transformations that preserve the direction of time, identified with the coordinate $x_0$)
\[ \mathrm{O}^{+}(1, n - 1) := \{M \in \mathrm{O}(1, n - 1) : M(L_n) = L_n\} \tag{B.17} \]
and
\[ \mathrm{SO}^{+}(1, n - 1) := \mathrm{SO}(1, n - 1) \cap \mathrm{O}^{+}(1, n - 1), \tag{B.18} \]
the restricted Lorentz group. Actually, we will see later (Exercise C.2) that the condition $M(L_n) = L_n$ (i.e., $M$ being a linear automorphism of $L_n$) implies that $M$ is a positive multiple of an element of $\mathrm{O}^{+}(1, n - 1)$ and so an alternative definition of $\mathrm{O}^{+}(1, n - 1)$ is $\{M \in \mathrm{SL}(n, \mathbb{R}) : M(L_n) = L_n\}$. More generally, the structure of the cone of linear maps $M$ verifying $M(L_n) \subset L_n$ is studied in Appendix C.
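
The definitions (B.15)–(B.18) can be tested on the smallest case $n = 2$. The sketch below is an illustration with hypothetical helper names; it checks a hyperbolic rotation (the form appearing in Exercise B.18) and the map $-\mathrm{Id}$, and it decides orthochronicity by looking at the image of $e_0 = (1, 0)$, which lies in the interior of $L_2$.

```python
import math

# Matrix of the quadratic form q(x) = x_0^2 - x_1^2 (case n = 2).
J = [[1.0, 0.0], [0.0, -1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def boost(theta):
    """Hyperbolic rotation with parameter theta."""
    return [[math.cosh(theta), math.sinh(theta)],
            [math.sinh(theta), math.cosh(theta)]]


def in_lorentz_group(m, tol=1e-9):
    """Criterion (B.15): M^T J M = J, up to a numerical tolerance."""
    lhs = matmul(transpose(m), matmul(J, m))
    return all(abs(lhs[i][j] - J[i][j]) <= tol
               for i in range(2) for j in range(2))


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def is_orthochronous(m):
    """For M in O(1, 1): M maps L_2 to L_2 or to -L_2, and the sign of the
    time coordinate of M e_0 (the entry m[0][0]) tells which case occurs."""
    return m[0][0] > 0


if __name__ == "__main__":
    b = boost(0.8)
    minus_id = [[-1.0, 0.0], [0.0, -1.0]]
    for m in (b, minus_id):
        print(in_lorentz_group(m), det2(m), is_orthochronous(m))
```

For $n = 2$ the map $-\mathrm{Id}$ has determinant $1$ but sends $L_2$ to $-L_2$, so it belongs to $\mathrm{SO}(1, 1)$ without being orthochronous.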
The instance that is of most immediate physical significance is $n = 4$, with $\mathbb{R}^4$ being identified with the Minkowski spacetime of the theory of special relativity. Another special feature of the case $n = 4$ is that the Lorentz cone $L_4$ is isomorphic to the positive semi-definite cone $\mathrm{PSD}(\mathbb{C}^2)$ (see Section 1.2.1) and so its group of automorphisms can be identified with the group of automorphisms of the latter cone described in Proposition 2.29. In particular, the fact that a linear map $\Phi : M_d^{\mathrm{sa}} \to M_d^{\mathrm{sa}}$ satisfying $\Phi(\mathrm{PSD}) = \mathrm{PSD}$ is either completely positive or co-completely positive corresponds, for $d = 2$, to the dichotomy $\mathrm{O}^{+}(1, 3)$ vs. $\mathrm{O}(1, 3) \setminus \mathrm{O}^{+}(1, 3)$. When restricted to $\mathrm{SO}^{+}(1, 3)$, that identification induces an isomorphism of that group with $\mathrm{PSL}(2, \mathbb{C})$, or the so-called spinor map; see Exercise B.19.
Exercise B.18 (Examples of automorphisms of the Lorentz cone).
(a) Show that
\[ \mathrm{SO}^{+}(1, 1) = \left\{ \begin{bmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{bmatrix} : \theta \in \mathbb{R} \right\}. \]
(b) Deduce that if $c > 0$, then $\mathrm{SO}^{+}(1, 1)$ acts transitively on the (branch of the) hyperbola $\{(x_0, x_1) : x_0 > 0,\ x_0^2 - x_1^2 = c\}$.
Exercise B.19 (A spinor map). Let $\Psi(x) = x \cdot \sigma = X$ be the map from (2.4)–(2.5) implementing the isomorphism between the cones $L_4$ and $\mathrm{PSD}(\mathbb{C}^2)$.
(a) Show that if $V \in \mathrm{SL}(2, \mathbb{C})$, then $\Psi_V(x) = \Psi^{-1}\big(V\Psi(x)V^{\dagger}\big)$ is an automorphism of $L_4$ which belongs to $\mathrm{SO}^{+}(1, 3)$, and that every element of $\mathrm{SO}^{+}(1, 3)$ can be represented that way.
(b) Show that the map $\mathrm{SL}(2, \mathbb{C}) \ni V \mapsto \Psi_V \in \mathrm{SO}^{+}(1, 3)$ is a group homomorphism whose kernel is $\{\mathrm{Id}, -\mathrm{Id}\}$ and deduce that it induces a group isomorphism between $\mathrm{PSL}(2, \mathbb{C}) = \mathrm{SL}(2, \mathbb{C})/\{\mathrm{Id}, -\mathrm{Id}\}$ and $\mathrm{SO}^{+}(1, 3)$ (an example of the so-called "spinor map").

Notes and Remarks

Proposition B.1 appears in [Sza98]. For $p \in (1, \infty)$, $p \ne 2$, the assertion (but not the argument) is exactly the same as in the classical Riemannian case ($p = 2$). If $p \in (1, 2)$, the Riemannian proof can be tweaked as it fits in the framework of Finsler geometry [CS05]. For $p \in (2, \infty)$, the metric structure induced by the Schatten $p$-norm does not satisfy the usual hypotheses of Finsler geometry and so a more specialized argument is needed.
Proposition B.1 can be extended to other bi-unitarily invariant norms (i.e., norms defined via singular values, see Exercise 1.47).
For more information and alternative definitions for principal angles, see the book [GVL13].
APPENDIX C

Extreme maps between Lorentz cones and the S-lemma

The focus of this appendix is the Lorentz cone
\[ L_n = \Big\{ (x_0, x_1, \dots, x_{n-1}) : x_0 \ge 0,\ \sum_{k=1}^{n-1} x_k^2 \le x_0^2 \Big\} \subset \mathbb{R}^n \tag{C.1} \]
and particularly the cone $P(L_n) := \{\Phi : \Phi(L_n) \subset L_n\}$ of linear maps that preserve it. We have the following.
Proposition C.1. Let $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ be a linear map which generates an extreme ray of $P(L_n)$. Then either $\Phi$ is an automorphism of $L_n$ or $\Phi$ is of rank one, in which case $\Phi = |u\rangle\langle v|$ for some $u, v \in \partial L_n \setminus \{0\}$. If $n > 2$, the converse implication also holds.
In view of the isomorphism between the cones $\mathrm{PSD}(\mathbb{C}^2)$ and $L_4$ (see (2.4)), Proposition 2.38, which characterizes extreme rays of the cone of positivity-preserving linear maps on $M_2^{\mathrm{sa}}$, is really a special case of Proposition C.1. Note that every element of $\partial\mathrm{PSD}(\mathbb{C}^2) \setminus \{0\}$ is of the form $|\varphi\rangle\langle\varphi|$, $\varphi \in \mathbb{C}^2 \setminus \{0\}$, so $|\varphi\rangle\langle\varphi|$ and $|\psi\rangle\langle\psi|$ of Proposition 2.38 play the same role as $u, v$ in Proposition C.1. However, the true reason why they appear in the statements is that they generate extreme rays respectively in $\mathrm{PSD}(\mathbb{C}^2)$ and $L_n$ (cf. Corollary 1.10). The following simple observation completely characterizes extreme rays generated by rank one maps in a very general setting (we only need the "only if" part, which is easy).
Lemma C.2 (see Exercise C.1). Let $C \subset \mathbb{R}^n$ be a nondegenerate cone and let $P(C)$ be the cone of linear maps preserving $C$. A rank one map $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ generates an extreme ray of $P(C)$ iff it is of the form $\Phi = |u\rangle\langle v|$, with $u$ and $v$ generating extreme rays of respectively $C$ and $C^{*}$.
While not as simple as the case of rank one maps, the structure of the set of automorphisms of $L_n$ is very well understood: they are of the form $t\Phi$, where $t > 0$ and $\Phi \in \mathrm{O}^{+}(1, n - 1)$ (see Appendix B.5), the orthochronous subgroup of the Lorentz group $\mathrm{O}(1, n - 1)$ of transformations preserving the quadratic form $q(x) = x_0^2 - \sum_{k=1}^{n-1} x_k^2$. This follows easily from the so-called S-lemma, a well-known fact from control theory and quadratic/semi-definite programming. (This and similar issues are explored in Exercises C.2–C.3.) The same lemma underlies the proof of Proposition C.1. We first state the simplest version of the Lemma.
Lemma C.3 (S-lemma). Let $M, N$ be $n \times n$ symmetric real matrices. The following two statements are equivalent:
(i) $\{x \in \mathbb{R}^n : \langle x|M|x\rangle \ge 0\} \cup \{x \in \mathbb{R}^n : \langle x|N|x\rangle \ge 0\} = \mathbb{R}^n$,
(ii) there exists $t \in [0, 1]$ such that the matrix $(1 - t)M + tN$ is positive semi-definite.
We apply the S-lemma in the following form, which is an easy consequence of Lemma C.3 applied with $M = F$ and $N = -G$.
Lemma C.4 (S-lemma reformulated). Let $F, G$ be $n \times n$ symmetric real matrices. Assume that there is an $\bar{x} \in \mathbb{R}^n$ such that $\langle\bar{x}|G|\bar{x}\rangle > 0$. Then the following two statements about such $F, G$ are equivalent:
(i) if $x \in \mathbb{R}^n$ verifies $\langle x|G|x\rangle \ge 0$, then $\langle x|F|x\rangle \ge 0$,
(ii) there exists $\mu \ge 0$ such that $F - \mu G$ is positive semi-definite.
We postpone the proof of the Lemma until the end of this appendix and show how it implies the Proposition.
Proof of Proposition C.1. In view of Lemma C.2, we may assume that $\operatorname{rank}\Phi \ge 2$. Let $J$ be the diagonal matrix with diagonal entries $(1, -1, \dots, -1)$, i.e., the matrix inducing $q$ in the sense that $q(x) = \langle x|J|x\rangle$ for $x \in \mathbb{R}^n$. The map $\Phi$ preserving $L_n$ (and hence $-L_n$) means that the hypothesis (i) of Lemma C.4 is satisfied with $G = J$ and $F = \Phi^{*}J\Phi$. Since clearly $-J$ is not positive definite, it follows that there is $\mu \ge 0$ and a positive semi-definite operator $Q$ such that
\[ \Phi^{*}J\Phi = \mu J + Q. \tag{C.2} \]
We now notice that since $\operatorname{rank}\Phi \ge 2$, there is $y = \Phi x \ne 0$ such that $y_0 = 0$. In particular, $\langle x|\Phi^{*}J\Phi|x\rangle = \langle y|J|y\rangle < 0$. Given that $\langle x|Q|x\rangle \ge 0$, it follows that $\mu$ cannot be $0$. Next, if $Q = 0$, (C.2) means precisely that $\mu^{-1/2}\Phi \in \mathrm{O}(1, n - 1)$ and so $\Phi$ is an automorphism of $L_n$.
To complete the argument, we will show that if $Q \ne 0$, then there is a rank one operator $\Delta$ such that $\Phi \pm \Delta \in P(L_n)$. Since $\Phi$ and $\Delta$ have different ranks, they are not proportional. Hence $\Phi + \Delta$ and $\Phi - \Delta$ do not belong to the ray generated by $\Phi$, which implies that the ray is not extreme.
Let $|v\rangle\langle v|$, $v \ne 0$, be one of the terms appearing in the spectral decomposition of $Q$; then $Q = Q' + |v\rangle\langle v|$, where $Q'$ is positive semi-definite. Next, let $u \in \mathbb{R}^n \setminus \{0\}$ be such that $\Phi^{*}Ju = \delta v$, where $\delta$ is either $1$ or $0$. Such $u$ exists: if $\Phi^{*}$ is invertible, then $u = J(\Phi^{*})^{-1}v$ satisfies $\Phi^{*}Ju = v$, while in the opposite case the nullspace of $\Phi^{*}J$ is nontrivial. We will show that, for some $\varepsilon > 0$,
\[ \Phi + s|u\rangle\langle v| \in P(L_n) \quad \text{if } |s| \le \varepsilon, \tag{C.3} \]
thus supplying the needed $\Delta = \varepsilon|u\rangle\langle v|$. We have, by (C.2) and by the choice of $u$,
\begin{align*} (\Phi + s|u\rangle\langle v|)^{*} J (\Phi + s|u\rangle\langle v|) &= \mu J + Q + 2s\delta\,|v\rangle\langle v| + s^2\,|v\rangle\langle u|J|u\rangle\langle v| \\ &= \mu J + Q' + \big(1 + 2s\delta + s^2\langle u|J|u\rangle\big)\,|v\rangle\langle v|. \tag{C.4} \end{align*}
Since clearly $1 + 2s\delta + s^2\langle u|J|u\rangle \ge 0$ if $|s|$ is sufficiently small, it follows that, for such $s$, $(\Phi + s|u\rangle\langle v|)^{*} J (\Phi + s|u\rangle\langle v|) - \mu J$ is positive semi-definite. Thus we can deduce from the easy part of Lemma C.4 that $\Phi + s|u\rangle\langle v| \in P(L_n)$, as needed. (To be precise, we need to exclude the possibility that $\Phi + s|u\rangle\langle v| \in -P(L_n)$, but this is simple.)
For the converse implication, Lemma C.2 takes care of the rank one maps, so we just need to show that every automorphism $\Phi$ of $L_n$ generates an extreme ray of $P(L_n)$ if $n > 2$. To that end, notice that the map $\Psi \mapsto \Phi \circ \Psi$ is a linear automorphism of the cone $P(L_n)$ sending $\mathrm{Id}$ to $\Phi$. Since linear maps preserve faces and their character, the ray $\mathbb{R}_{+}\Phi$ is extreme iff $\mathbb{R}_{+}\mathrm{Id}$ is extreme. This means that either all automorphisms of $L_n$ generate extreme rays of $P(L_n)$, or none of them does, and we just have to exclude the latter possibility.
Indeed, suppose that all extreme rays of $P(L_n)$ are generated by rank one maps. It then follows in particular (see Section 1.2.2) that $\mathrm{Id} = \sum_{i=1}^{N} |u_i\rangle\langle v_i|$ for some $u_i, v_i \in \partial L_n$. Since $u, v \in L_n$ implies that $\operatorname{Tr}\big(J|u\rangle\langle v|\big) = \langle v|J|u\rangle \ge 0$, we obtain
\[ -1 \ge 2 - n = \operatorname{Tr} J = \operatorname{Tr}\Big( J \sum_{i=1}^{N} |u_i\rangle\langle v_i| \Big) = \sum_{i=1}^{N} \operatorname{Tr}\big( J|u_i\rangle\langle v_i| \big) \ge 0, \]
which yields a desired contradiction. (See Exercise C.4 for the discussion of the case $n = 2$.) □
Proof of Lemma C.3. To show that (i) $\Rightarrow$ (ii), we argue by contradiction. Denote $M_t = (1 - t)M + tN$ and assume that, for every $t \in [0, 1]$, the smallest eigenvalue $\lambda_t$ of $M_t$ is strictly negative. For $t \in [0, 1]$, let
\[ \Lambda_t := \{x \in S^{n-1} : M_t x = \lambda_t x\}. \]
Note that $t \mapsto \lambda_t$ is continuous and $t \mapsto \Lambda_t$ is upper semicontinuous, i.e., $t_n \to t$, $x_n \in \Lambda_{t_n}$ and $x_n \to x$ imply $x \in \Lambda_t$, and of course all $\Lambda_t \ne \emptyset$.
Consider the sets $A = \{x \in \mathbb{R}^n : \langle x|M|x\rangle \ge 0\}$ and $B = \{x \in \mathbb{R}^n : \langle x|N|x\rangle \ge 0\}$. We have $A \cup B = \mathbb{R}^n$ by hypothesis. Since $M_0 = M$, it follows that $\Lambda_0 \cap A = \emptyset$ and so $\Lambda_0 \subset B$. Similarly, $\Lambda_1 \subset A$. Set
\[ \tau = \sup\{t \in [0, 1] : \Lambda_t \cap B \ne \emptyset\}. \]
We now note that $\Lambda_\tau \cap B \ne \emptyset$; this is immediate if $\tau = 0$ and follows from upper semicontinuity of $t \mapsto \Lambda_t$ if $\tau > 0$. For essentially the same reasons, $\Lambda_\tau \cap A \ne \emptyset$.
We now claim that $\Lambda_\tau \cap A \cap B \ne \emptyset$. This is clear if the eigenvalue $\lambda_\tau$ is simple (note that all three sets, $\Lambda_\tau$, $A$ and $B$, are symmetric by definition). On the other hand, if the multiplicity of $\lambda_\tau$ equals $k > 1$, then $\Lambda_\tau$ is a $(k - 1)$-dimensional sphere and hence is connected. Consequently, the closed nonempty sets $\Lambda_\tau \cap A$ and $\Lambda_\tau \cap B$, the union of which is $\Lambda_\tau$, must have a nonempty intersection.
To conclude the argument, choose $x \in \Lambda_\tau \cap A \cap B$. Then, since $x \in \Lambda_\tau$,
\[ \langle x|M_\tau|x\rangle = \lambda_\tau < 0. \]
On the other hand, since $x \in A \cap B$,
\[ \langle x|M_\tau|x\rangle = (1 - \tau)\langle x|M|x\rangle + \tau\langle x|N|x\rangle \ge 0, \]
a contradiction. □

Exercise C.1. Prove Lemma C.2.
Exercise C.2. Use the S-lemma to show that every linear automorphism of $L_n$ is of the form $t\Phi$, where $t > 0$ and $\Phi \in \mathrm{O}^{+}(1, n - 1)$. In other words, there exists $t > 0$ such that $\langle x|\Phi^{*}J\Phi|x\rangle = t^2\langle x|J|x\rangle$ for all $x \in \mathbb{R}^n$.
Exercise C.3. Show that maps of the form $t\Phi$, where $t > 0$ and $\Phi \in \mathrm{SO}^{+}(1, n - 1)$, act transitively on the interior of $L_n$.
Exercise C.4. Show that all extreme rays of the cone $P(L_2)$ consist of maps of rank one.

Notes and Remarks

The fact that statements similar to Proposition C.1 imply Størmer's theorem was apparently folklore for some time; it appears explicitly in [MO15]. Proposition C.1 was proved in [LS75] and then rediscovered (apparently independently) in [Hil05], where its relevance to the entanglement theory was also noted. The subsequent paper [Hil07b] by the same author contains a stronger result, a complete classification of elements of $P(L_n)$. Our proof of Proposition C.1 follows roughly that of [Hil05], but is substantially simpler. In turn, the argument from [Hil07b] was similar to, but simpler than [LS75]; all proofs seem to use either a variant of the S-lemma (Lemma C.4) or closely related facts. The papers [Hil05, Hil07b] actually characterize (for any $m, n \ge 2$) extreme rays of maps that satisfy $\Phi(L_m) \subset L_n$, but this slightly more general fact is easy to derive from Proposition C.1 combined with (for example) Exercise C.3.

ist
rd
fo
N ot
ly.
on
se
lu
na
so
r
Pe
APPENDIX D

Polarity and the Santaló point via duality of cones

The goal of this appendix is to explore the dependence of polarity on translation, which is otherwise not very transparent, by exploiting the duality of cones. We believe that this approach deserves to be better known. Besides recovering the characterization of the Santaló point of a convex body, we are able to easily explain other somewhat mysterious facts such as, for example, the polar of an ellipsoid with respect to any interior point being also an ellipsoid.
We start with a reformulation of Lemma 1.6 from Section 1.2.1 in a manner not appealing to the concept of scalar product. Let $V$ be a real vector space and $V^{*}$ its dual. To make the analogy with Lemma 1.6 more apparent, we will write $\langle x^{*}, x\rangle$ for the evaluation $x^{*}(x)$ whenever $x \in V$ and $x^{*} \in V^{*}$. If $C \subset V$ is a closed convex cone, the dual cone $C^{*} \subset V^{*}$ is now defined by (cf. (1.18))
\[ C^{*} := \{x \in V^{*} : \forall\, y \in C \ \ \langle x, y\rangle \ge 0\}. \tag{D.1} \]
We then have
Lemma D.1. Let $e \in C$ and $e^{*} \in C^{*}$ be such that $\langle e^{*}, e\rangle = 1$. Let $H_e := \{y \in V^{*} : \langle y, e\rangle = 1\}$, $H_{e^{*}} := \{x \in V : \langle e^{*}, x\rangle = 1\}$, and let $C^{b} = C \cap H_{e^{*}}$ and $(C^{*})^{b} = C^{*} \cap H_e$ be the corresponding bases of $C$ and $C^{*}$. Then
\[ (C^{*})^{b} = \{y \in H_e : \forall\, x \in C^{b} \ \ \langle -(y - e^{*}), x - e\rangle \le 1\}. \tag{D.2} \]
In other words, if we think of $H_{e^{*}}$ as a vector space with the origin at $e$, of $H_e$ as a vector space with the origin at $e^{*}$ and as a dual of $H_{e^{*}}$, and of $C^{b}$ and $(C^{*})^{b}$ as their respective subsets, then $(C^{*})^{b} = -(C^{b})^{\circ}$.
The proof of Lemma D.1 fully parallels that of Lemma 1.6 and so we relegate it to Exercise D.1.
The formula in (D.2) suggests a definition of polarity in the affine context that is a tad different from the one usually used. Namely, if $K$ and $L$ are (say, closed and convex) subsets of two affine spaces whose underlying vector spaces are dual to each other, and if $a \in K$ and $b \in L$, then $L$ is a polar of $K$ with respect to the pair $(a, b)$ if $L - b = (K - a)^{\circ}$ (in the sense indicated in Section 1.2.1).
This definition, and Lemma D.1, allow for a nice way to visualize polars of translates of a convex body. Indeed, let $K$ be an $n$-dimensional closed convex set. We can represent $K$ as a subset of $H = \{(x_0, x_1, \dots, x_n) : x_0 = 1\} \subset \mathbb{R}^{n+1}$ and consider the cone $C \subset \mathbb{R}^{n+1}$ generated by it (i.e., $C = \overline{\mathbb{R}_{+}K}$, with the closure not needed if $K$ is compact). Then $K$ is the base of $C$ with respect to $e_0 = (1, 0, \dots, 0)$ and automatically $e_0 \in C^{*}$. Now, if $a \in K$, then Equation (D.2) shows that $(K - a)^{\circ}$ can be identified (up to a reflection with respect to $e_0$) with the base of $C^{*}$ corresponding to $a$, that is, with the section $C^{*} \cap \{y \in \mathbb{R}^{n+1} : \langle y, a\rangle = 1\}$ of the cone $C^{*}$. This point of view is pictured in Figure D.1, where $C$ and $C^{*}$ are separately represented in two copies of $\mathbb{R}^{n+1}$ with $e$ and $e^{*}$ being two copies of $e_0$. (Note that while necessarily $e_0 \in C^{*}$, it is a priori possible that $e_0 \notin C$.)

[Figure D.1: two panels. Left: the cone $C$ with base $K$ in the hyperplane $H_{e^{*}}$, containing the points $e$ and $a$. Right: the dual cone $C^{*}$ with the sections $-(K - e)^{\circ}$ in $H_e$ (through $e^{*}$) and $-(K - a)^{\circ}$ in $H_a$.]
Figure D.1. If $K$ is a base of $C$, then the polars of $K$ with respect to different points (defined in the way implicit in Lemma D.1) correspond to different sections of the cone $C^{*}$. It is possible to superimpose the two pictures and even to assume that $e = e^{*}$, but that obscures the dependence of $(K - a)^{\circ}$ on $a$.

fo
Such an approach has a number of nice immediate consequences, for example the
fact that the polar of a not-necessarily-centered ellipsoid is an ellipsoid as long as
0 is an interior point (see Exercise D.3). Note, however, that we cannot directly
compare (say, the volumes of) $(K - a)^\circ$ for different values of $a$ since they do not live
in the same hyperplane of $\mathbb{R}^{n+1}$. However, a simple trick permits such comparisons
(cf. the comments following Theorem 4.17 in Section 4.3.4).

Proposition D.2. Let $K \subset \mathbb{R}^n$ be a convex body. Then there exists a unique
interior point $s \in K$ such that $(K - s)^\circ$ has centroid at 0. Moreover, if $a \neq s$, then
the volume of $(K - a)^\circ$ is strictly larger than the volume of $(K - s)^\circ$.

The point s appearing in the statement of Proposition D.2 is called the Santaló
point of K.
Proof. We start with the construction outlined in the paragraph preceding
the statement of the Proposition. Note that since $K$ is a convex body (hence $n$-dimensional
and compact), the cones $C$ and $C^*$ are both nondegenerate and $e_0$ is
an interior point of $C^*$ (see Lemma 1.7 and Exercise 1.32). We now consider the
following auxiliary optimization problem: among the solid cones of the form
$$T_a = \{x \in C^* : \langle x, a\rangle \le 1\},$$
where $a$ varies over the interior of $K$, find one for which $\mathrm{vol}_{n+1}(T_a)$ is the smallest.
Note that the restrictions on $a$ ensure that each $T_a$ is indeed a (bounded) solid cone
with the base $\{x \in C^* : \langle x, a\rangle = 1\} =: B_a$ (this happens whenever $a$ belongs to the
interior of $C$) and that $e_0$ belongs to $B_a$ (this happens whenever $a \in C \cap H$). The
sets $T_a$ and $B_a$ are pictured in the first drawing in Figure D.2.
It is easy to see that $\inf_a \mathrm{vol}_{n+1}(T_a) > 0$, and that both the diameter and the
volume of $T_a$ tend to $+\infty$ as $a \to \partial K$. Since $\mathrm{vol}_{n+1}(T_a)$ is a continuous function of
$a$, this implies that the infimum is attained. On the other hand, if $a \mapsto \mathrm{vol}_{n+1}(T_a)$
has a local extremum at $s$, then an elementary variational argument shows that
$e_0$ is the centroid of $B_s$ (see Exercise D.4), which, according to Lemma D.1, is
affinely equivalent to $(K - s)^\circ$, the polar of $K - s$ inside $H$. More precisely, one sees
directly from (D.2) that, for every $a$, $(K - a)^\circ$ is (up to a reflection with respect to
$e_0$) the orthogonal projection of $B_a$ onto $H$, as pictured in the second drawing in
Figure D.2.

Figure D.2. The first drawing illustrates the calculation (D.3) of
the volume of the solid cone $T_a$. The second drawing illustrates
the calculation (D.4) of the volume of $(K - a)^\circ$, the polar of $K - a$
constructed inside $H$. The minus sign in front of $(K - a)^\circ$ indicates
a reflection inside $H$ with respect to $e_0$.

N

Now comes a simple but crucial observation (illustrated in the two drawings of
Figure D.2). On the one hand,
ly.

1 1
(D.3) voln`1 pTa q “ voln pBa q ˆ
n`1 |a|
on

1
because |a| equals the cosine of the angle between a and e0 (denoted by α), and
hence is the same as the height of the cone Ta . On the other hand, since pK ´ aq˝ ,
se

the polar of K ´ a constructed inside H, is a reflection of PH pBa q, and since the


lu

angle α between Ba and H is the same as between a and e0 , it follows that


1
(D.4) voln ppK ´ aq˝ q “ voln pBa q ˆ .
na

|a|
This shows that voln`1 pTa q and voln ppK´aq˝ q differ only by a factor independent of
so

a, and so they achieve their minima simultaneously. This concludes the argument,
except for the uniqueness part (which is easy, see Exercise D.2). 
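
The one-dimensional case gives a quick sanity check of Proposition D.2. The sketch below is only an illustration under an assumption not made in the text: it takes $K = [p, q] \subset \mathbb{R}$, for which $(K - a)^\circ = [-1/(a - p),\, 1/(q - a)]$ when $p < a < q$. All function names are hypothetical. A grid search over interior points should then locate the midpoint of $K$ as the point where the polar is shortest and centered at 0.

```python
def polar_interval(p, q, a):
    """Polar of K - a for K = [p, q] and an interior point a, as (lo, hi)."""
    if not p < a < q:
        raise ValueError("a must be an interior point of K = [p, q]")
    # K - a = [p - a, q - a]; y is in the polar iff y * x <= 1 on K - a.
    return (-1.0 / (a - p), 1.0 / (q - a))


def polar_length(p, q, a):
    """One-dimensional volume of (K - a)^polar."""
    lo, hi = polar_interval(p, q, a)
    return hi - lo


def polar_centroid(p, q, a):
    """Centroid (midpoint) of (K - a)^polar."""
    lo, hi = polar_interval(p, q, a)
    return 0.5 * (lo + hi)


def argmin_on_grid(p, q, steps=10000):
    """Grid search for the interior point minimising the polar length."""
    best_a, best_len = None, float("inf")
    for i in range(1, steps):
        a = p + (q - p) * i / steps
        length = polar_length(p, q, a)
        if length < best_len:
            best_a, best_len = a, length
    return best_a, best_len
```

For instance, `argmin_on_grid(-1.0, 3.0)` returns a point close to the midpoint 1, where `polar_centroid` vanishes, in line with the characterization of the Santaló point.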
Exercise D.1. Prove Lemma D.1.
Exercise D.2. Let $K \subset \mathbb{R}^n$ be a convex body with 0 in the interior and such
that $K^\circ$ has centroid at the origin. Then, for any point $a \neq 0$ in the interior of $K$,
the centroid of $(K - a)^\circ$ is not 0.
Exercise D.3. This exercise supplies “soft” proofs of the facts derived previously
via tedious calculations in Exercise 1.26. Let $\mathcal{E} \subset \mathbb{R}^n$ be an ellipsoid.
(i) Show that if $\mathcal{E}$ contains 0 in its interior, then $\mathcal{E}^\circ$ is also an ellipsoid.
(ii) Show that if $0 \in \partial\mathcal{E}$, then $\mathcal{E}^\circ$ is an elliptic paraboloid.
(iii) Show that, among translates of $\mathcal{E}$, the volume of the polar is minimal iff the
translate is 0-symmetric. Give a proof that does not use the uniqueness part of
Proposition D.2.
Exercise D.4. Show that if (in the notation from the proof of Proposition
D.2) the function $a \mapsto \mathrm{vol}_{n+1}(T_a)$ has a local extremum at $b \in K$, then $e_0$ is the
centroid of $B_b$.

APPENDIX E

Hints to exercises

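Exercise 0.2. We may write $|x_1 \otimes x_2 + y_1 \otimes y_2\rangle\langle x_1 \otimes x_2 + y_1 \otimes y_2|$ as
$$|x_1\rangle\langle x_1| \otimes |x_2\rangle\langle x_2| + |y_1\rangle\langle y_1| \otimes |y_2\rangle\langle y_2|
+ \frac{1}{4}\sum_{k=0}^{3} (-1)^k\, |x_1 + i^k y_1\rangle\langle x_1 + i^k y_1| \otimes |x_2 + i^k y_2\rangle\langle x_2 + i^k y_2|.$$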
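
The decomposition in the hint to Exercise 0.2 can be checked numerically. The following is a minimal sketch in plain Python; the helper names are hypothetical, vectors are lists of complex numbers, and operators are nested lists. It assumes the usual convention that $|u\rangle\langle u|$ has entries $u_a \overline{u_b}$ and returns the largest entrywise discrepancy between the two sides.

```python
import random


def kron_vec(u, v):
    """Tensor product of two vectors in the product basis."""
    return [a * b for a in u for b in v]


def proj(u):
    """Rank-one operator |u><u| (not normalised)."""
    return [[a * b.conjugate() for b in u] for a in u]


def kron_mat(A, B):
    """Kronecker product of two matrices, matching kron_vec's ordering."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]


def add(A, B, c=1):
    """Entrywise A + c * B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def identity_gap(x1, x2, y1, y2):
    """Largest entrywise difference between the two sides of the identity."""
    psi = [a + b for a, b in zip(kron_vec(x1, x2), kron_vec(y1, y2))]
    lhs = proj(psi)
    rhs = add(kron_mat(proj(x1), proj(x2)), kron_mat(proj(y1), proj(y2)))
    for k in range(4):
        w = 1j ** k
        u1 = [a + w * b for a, b in zip(x1, y1)]
        u2 = [a + w * b for a, b in zip(x2, y2)]
        rhs = add(rhs, kron_mat(proj(u1), proj(u2)), c=(-1) ** k / 4)
    return max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))


def random_vector(n, rng):
    """Random complex vector with Gaussian entries."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
```

On random inputs the gap should be at the level of floating-point rounding, which is consistent with the four terms of the sum cancelling everything except the two cross terms.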

Chapter 1

Exercise 1.1. If $(x_i)$ are affinely dependent, then $\sum \mu_i x_i = 0$ for some (not
identically zero) real numbers $(\mu_i)$ adding to zero. Then $x = \sum (\lambda_i + \varepsilon\mu_i) x_i$ and for
a well-chosen $\varepsilon$ this is a strictly shorter convex decomposition.
Exercise 1.2. By Carathéodory’s Theorem 1.2, $\operatorname{conv} A$ is a continuous image of
$\Delta_n \times A^{n+1}$.
Exercise 1.3. By the Hahn–Banach theorem, any boundary point of a convex
body $K$ admits a supporting hyperplane, whose intersection with $K$ is an exposed
face.
Exercise 1.4. If $y \in L \setminus \{x\}$, then $x$ is an interior point of some segment $[y, z]$ with
$z \in L$.
Exercise 1.5. (a) and (b) follow fairly directly from the definitions. (c) Consider
a supporting hyperplane to $K$ at a point in the relative interior of the face and
apply Proposition 1.4 to the functional defining that hyperplane. (d) For the first
assertion, take $K$ to be the Minkowski sum of a disk and a segment (see Figure
E.1), or of a disk and a square. For the second assertion, appeal to part (c).
(e) Let $L$ be the Minkowski sum of an $n$-dimensional cube and $B_2^n$. Consider a
hyperplane supporting $K$ which is parallel to one of the facets of the cube, and
let $F$ be the corresponding exposed facet. Show that $F$ is a translate of that facet
(hence an $(n - 1)$-dimensional cube) and consider any $k$-dimensional face of $F$. (f)
For sufficiency, use part (a) and part (b). For necessity, use part (c) to argue by
induction with respect to the dimension. (g) Assume $K$ is full-dimensional. If a
supporting hyperplane does not isolate a point, then the boundary of $K$ contains
a segment.

Figure E.1. An example of an extreme point which is not exposed

Exercise 1.6. Proceed by induction with respect to $\dim K$. The base cases (dimension
0 or 1) are simple. For the inductive step, let $x \in K$ and assume first
that $x$ belongs to the relative interior of $K$. Next, note that every convex body
admits at least one extreme point (for example, the smallest element with respect
to the lexicographic order) and let $y$ be one such point. There is a (unique) point
$z \in \partial K$ (the relative boundary) such that $x$ belongs to the segment $[z, y]$. Let $H$
be a supporting hyperplane for $K$ which contains $z$. We may apply the induction
hypothesis to $K \cap H$ and produce a decomposition of $z$ as a convex combination of
extreme points of $K \cap H$ (hence of $K$, by Exercise 1.5(b)). Finally, if $x \in \partial K$, we
may perform the dimension reduction immediately.
Exercise 1.7. For necessity, appeal to the spectral theorem. For sufficiency, use
the following fact: If $\rho_1, \rho_2$ are positive operators and $\rho = \rho_1 + \rho_2$, then the range
of $\rho$ contains the ranges of $\rho_1$ and $\rho_2$. Alternatively, note that either all rank one
projections $|\psi\rangle\langle\psi|$ are extreme or none of them is, and appeal to the Krein–Milman
Theorem 1.3. (See also Section 2.1.3.)
Exercise 1.8. If $K = \operatorname{conv}\{x_1, \dots, x_N\}$ and $F$ is a face of $K$, then $F = \operatorname{conv}\{x_i : x_i \in F\}$.
Exercise 1.9. Prove that if $F$ is an exposed face of a polytope $P$, and $G$ an exposed
face of $F$, then $G$ is an exposed face of $P$. Then use Exercise 1.5(f).
Exercise 1.10. The extreme points of $B_1^n$ are the $n$ vectors from the canonical
basis in $\mathbb{R}^n$ and their opposites. The extreme points of $B_\infty^n$ are elements of $\{-1, 1\}^n$.
For $1 < p < +\infty$, any boundary point is extreme (to show this, use the fact that
the function $x \mapsto |x|^p$ is strictly convex; another “high level” argument is given in
Exercise 1.11(iv)).
Exercise 1.11. (i) We may assume $0 < b \le a$. Check that
$$\frac{d}{dt}\bigl(\alpha(t)a^p + \alpha(1/t)b^p\bigr) = (p - 1)\bigl(a^p - b^p/t^p\bigr)\bigl((1 + t)^{p-2} - |1 - t|^{p-2}\bigr)$$
so that the maximum is achieved for $t = b/a \le 1$. (ii) Use (i) and the inequality
$\sum_{i=1}^n \sup_{t>0}\{\cdots\} \ge \sup_{t>0} \sum_{i=1}^n \{\cdots\}$. For $p \ge 2$, the proof goes along the same
lines except that the supremum in the variational formula is replaced by an infimum.
(iii) To deduce (1.6) from (1.5), use the following inequalities
$$\bigl(1 + (p - 1)t^2\bigr)^{p/2} \le 1 + \frac{p(p - 1)}{2}\,t^2 \le \frac{(1 + t)^p + (1 - t)^p}{2}, \tag{E.1}$$
valid for $t \in [-1, 1]$ and applied with $t = \|y\|_p / \|x\|_p$ (we may assume that $\|y\|_p \le \|x\|_p$).
The second inequality in (E.1) can be proved by a Taylor expansion of the
right-hand side. (iv) For $p \ge 2$, use (ii). For $p \le 2$, use (iii) applied to the pair
$(x + y, x - y)$.
Exercise 1.12. We may assume that $K$ contains the origin in its interior. One
possibility is to define $\Theta(x) = \Theta_K(x)$ as the (unique) element of minimal Euclidean
norm in the set (denoted $F$) of points where $\langle \cdot, x\rangle$ is maximal on $K$. To see that this
choice is Borel, define a sequence $(K_m)$ of convex bodies approximating $K$ from the
inside by the relation $\|\cdot\|_{K_m} = \|\cdot\|_K + \frac{1}{m}|\cdot|$. One checks that (i) for each $x \in \mathbb{R}^n \setminus \{0\}$,
the linear form $\langle \cdot, x\rangle$ achieves its maximum on $K_m$ at a unique point, denoted $\varphi_m(x)$,
(ii) for each $m$, the map $\varphi_m$ is continuous, and (iii) the sequence $(\varphi_m)$ converges
pointwise to $\Theta_K$. To see the last point, write $\varphi_m(x) = (1 + |x_m|/m)^{-1} x_m$ for some
$x_m \in \partial K$. If $y \in F$, then (by definition of $\varphi_m$) we have
$$(1 + |y|/m)^{-1}\langle y, x\rangle \le (1 + |x_m|/m)^{-1}\langle x_m, x\rangle \le (1 + |x_m|/m)^{-1}\langle y, x\rangle,$$
which implies that $|\varphi_m(x)| < |x_m| \le |y|$. Deduce that $(\varphi_m(x))$ must converge to
the point of minimal Euclidean norm in $F$.
Exercise 1.13. If $h, h' : \mathbb{R}^n \to \mathbb{R}_+$ are positively homogeneous, then $\{h \le 1\} = \{h' \le 1\}$
implies $h = h'$. What may fail here is that $\sup_{x \in A} \langle x, \cdot\rangle$ may be negative.
Exercise 1.14. Use the fact that $K \subset RB_2^n \iff R^{-1}B_2^n \subset K^\circ$.
Exercise 1.15. $(K^\circ)^\circ$ is a closed convex set containing both $K$ and 0, so one
inclusion is clear. For the other inclusion, argue by contradiction using the Hahn–Banach
separation theorem.
Exercise 1.17. If $K$ does not contain 0 in the interior, then $\|\cdot\|_K$ takes the value
$+\infty$, which forbids the application of the Hahn–Banach theorem. For an illustration of
the importance of the assumptions consider $K = L^\circ$, where $L = \{(x, y) \in \mathbb{R}^2 : (2 - y)(2 - x) \ge 1,\ x < 2\}$.
Exercise 1.18. (1.14) is simple and (1.15) can be deduced from it using the bipolar
theorem. The example $K = -L = \{(0, 0)\} \cup (-\infty, 0) \times [-1, 1]$ shows that closedness is
needed. The example $K = \{0, 2\}$, $L = [-1, 1]$ shows that convexity is needed. The
example $K = [1, 2]$, $L = [3, 4]$ shows that containing the origin is needed. Finally,
taking $K^\circ = (-\infty, 0) \times \{0\}$ and $L^\circ = \{(x, y) : x - 1 > 0,\ (x - 1)(y - 1) \ge 1\}$
shows that taking the closure is needed (it is clearly not needed if $K^\circ$ and $L^\circ$ are
both compact).
Exercise 1.19. Let $K = \operatorname{conv}\{V\} \subset \mathbb{R}^n$ containing 0 in the interior with $V$ finite.
For any extreme point $x \in K^\circ$ there is a subset $U \subset V$ such that $\operatorname{span} U = \mathbb{R}^n$ and
$x$ is the (unique) vector satisfying $\langle x, u\rangle = 1$ for every $u \in U$. It follows that $K^\circ$
has only finitely many extreme points.
Exercise 1.20. The hypotheses ensure that every supporting hyperplane to $K$ is of
the form $H_y = \{x : \langle y, x\rangle = 1\}$ for some $y \in \partial K^\circ$, so $\nu_K(F) \neq \emptyset$. To establish that
$\nu_K(F)$ is an exposed face, show that if $x_0$ belongs to the relative interior of $F$ and
$H = \{y : \langle y, x_0\rangle = 1\}$, then $\nu_K(F) = H \cap K^\circ$ (use the fact that for any $x \in F$ there
exist $t \in (0, 1)$ and $x_1 \in F$ such that $x_0 = (1 - t)x_1 + tx$). To establish injectivity,
show that if $F_1, F_2$ are exposed faces of $K$ with $F_2 \not\subset F_1$ and $F_1 = H_{y_0} \cap K$, then
$y_0 \in \nu_K(F_1) \setminus \nu_K(F_2)$. With regard to the last property, $F \subset \nu_{K^\circ}\bigl(\nu_K(F)\bigr)$ is easy;
if we had a strict inclusion, injectivity and order reversing would imply the strict
inclusion $\nu_K\bigl(\nu_{K^\circ}\bigl(\nu_K(F)\bigr)\bigr) \subsetneq \nu_K(F)$, which is a contradiction since we just noted
that the reverse inclusion always holds.
Exercise 1.21. Show that the interior of $K$ is disjoint from $F_y$ (always) and that,
under our hypotheses, $F_y \neq \emptyset$. Deduce that $H_y = \{x : \langle y, x\rangle = 1\}$ is a supporting
hyperplane and $F_y$ an exposed face. For the second statement, if $F$ is a maximal
exposed face of $K$, show that $F$ coincides with $F_y$, where $y$ is an extreme point of
$\nu_K(F)$ (appeal to the Krein–Milman theorem and use maximality).
The same argument works in the general case with the caveat that, for some $y$, the
set $F_y$ (as defined by (1.17)) may be empty.
Exercise 1.23. The polars of the examples from Exercise 1.5(d) will work.
Exercise 1.24. Consider $K = \bigl(B_1^2 \cap \{x \le 0\}\bigr) \cup \bigl(B_2^2 \cap \{x \ge 0\}\bigr)$, where $x$ is the
first coordinate in $\mathbb{R}^2$.
Exercise 1.25. If $a_1 \le \cdots \le a_{2n-1}$ denote the principal semi-axes of $\mathcal{E}$, produce
an $n$-dimensional section which is a Euclidean ball of radius $a_n$ by pairing each
small semi-axis ($a_k$ for $k < n$) with a large semi-axis ($a_k$ for $k > n$).
Exercise 1.28. If $e \in C^*$ is such that the functional $\langle e, \cdot\rangle$ doesn’t vanish identically
on $C$, then it doesn’t vanish identically on the relative interior of $C$ and so, by
Proposition 1.4 (applied with $K = \mathbb{R}_+$ and $F = \{0\}$), $\langle e, \cdot\rangle$ is strictly positive on
the relative interior of $C$. Show that this implies that the relative interior of $C$
is contained in $\mathbb{R}_+ C^b$ and deduce the assertion. For an example where closure is
needed, take $C = \mathbb{R}_+^2$ and $e = (1, 0)$.
Exercise 1.29. By the bipolar theorem, $C^* = C^\perp \iff C = (C^\perp)^* = \operatorname{span}(C)$, so
whenever $C$ is not a linear subspace, any vector $e \in C^* \setminus C^\perp$ induces a base.
Exercise 1.30. Try $C_1 = \{(x, y, z) \in \mathbb{R}^3 : x \ge 0,\ y \ge 0,\ z \ge 0,\ xy \ge z^2\}$ and
$C_2 = \mathbb{R}_- \times \{0\} \times \{0\}$.
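Exercise 1.31. We may assume after rotation that $y = te_0$ with $t > 1$. Note that
$C_{\sqrt{2}e_0}$ is the Lorentz cone $\mathcal{L}_n$ and is therefore self-dual. For the general case, define
a linear map $T_\lambda$ by $T_\lambda y = \lambda y$ and $T_\lambda x = x$ for $x \perp y$. For $\lambda = \sqrt{t^2 - 1}$, we have
$C_{te_0} = T_\lambda \mathcal{L}_n$ and therefore
$$C_{te_0}^* = (T_\lambda^{-1})^T \mathcal{L}_n = T_{1/\lambda}\mathcal{L}_n = C_{\sqrt{1 + 1/\lambda^2}\,e_0} = C_{ue_0} \quad\text{for } u = t/\sqrt{t^2 - 1}.$$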
Exercise 1.32. Prove for example that (a) $\Rightarrow$ (c) $\Rightarrow$ (e) $\Rightarrow$ (f) $\Rightarrow$ (b) $\Rightarrow$ (g) $\Rightarrow$
(d) $\Rightarrow$ (a). The first implication is straightforward, the next two are Corollary 1.8.
Other implications are simple. If $C$ has a compact base, Lemma 1.6 implies that $C^*$
has an $(n - 1)$-dimensional base, so $\dim C^* = n$. If $C$ contains a line $L$, then $C^* \subset L^\perp$
and $\operatorname{span} C^* \subset L^\perp$.
Exercise 1.33. Let $V$ be a maximal vector subspace contained in $C$, then use
Exercise 1.32.
Exercise 1.34. If $x \in C$, then the map $\varphi(z) = \langle z, x\rangle$ verifies $\varphi(C^*) \subset \mathbb{R}_+$ and $\{0\}$
is a face of $\mathbb{R}_+$.
Exercise 1.35. Show that if $F'$ is a face of $C$ then $\mathbb{R}_+ F' = F'$. Deduce that
$F' \mapsto F' \cap C^b$ is the inverse to the correspondence defined in the Proposition.
Exercise 1.36. Let $y \in C_1 \cap C_2$ define the common isolating hyperplane. The proof
of the implication (a) $\Rightarrow$ (d) from Exercise 1.32 shows then that the corresponding
bases of $C_1^*$ and $C_2^*$ are compact and hence so is their convex hull, which generates
$C_1^* + C_2^*$ (cf. (1.23)).
Exercise 1.37. In (ii)–(iv) this is easy; in (v) consider $t$ tending to $+\infty$ and to
$-\infty$.
Exercise 1.38. The “if” direction is easy. For “only if,” use induction on $n$. Let
$x, y \in \mathbb{R}^n$ such that $x \prec_w y$. Assume for notational simplicity that $x = x^\downarrow$, $y = y^\downarrow$.
Let $\delta = \min\{y_1 + \cdots + y_k - (x_1 + \cdots + x_k) : 1 \le k \le n\}$ ($\ge 0$) with the minimum
achieved for $k = k_0$. Show that $(x_1 + \delta, x_2, \dots, x_{k_0}) \prec (y_1, \dots, y_{k_0})$ and (if $k_0 < n$)
apply the induction hypothesis to the vectors $(x_{k_0+1}, \dots, x_n)$ and $(y_{k_0+1}, \dots, y_n)$.
Exercise 1.39. The statement and the proof are the same (simply replace every-
where “unitary” by “orthogonal”).
Exercises 1.40 and 1.41. Apply Proposition 1.15.
Exercise 1.42. We may check strict concavity on lines; for $A$ positive definite and
$B \neq 0$ self-adjoint, we have $\log\det(A + tB) = \log\det(A) + \sum_i \log(1 + t\lambda_i)$ where $(\lambda_i)$
are the eigenvalues of $A^{-1/2}BA^{-1/2}$, which is strictly concave wherever it is defined.
Alternatively, use Klein’s lemma and analyze the proof for equality conditions.
Exercise 1.43. This follows from the fact that, for any $X \in \mathsf{M}_n$, $\operatorname{diag} X = \operatorname{Ave}_v D_v X D_v$,
where $v$ varies over $\{-1, 1\}^n$ endowed with normalized counting measure
and $D_v$ denotes the diagonal matrix made from the coordinates of a vector $v$.
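
The averaging identity used in the hint to Exercise 1.43 is easy to confirm by brute force for small $n$. The sketch below is illustrative only, with a hypothetical function name: it enumerates all $2^n$ sign vectors with `itertools.product`, so it is meant for small matrices given as nested lists.

```python
from itertools import product


def sign_average(X):
    """Average of D_v X D_v over all sign vectors v in {-1, 1}^n."""
    n = len(X)
    acc = [[0.0] * n for _ in range(n)]
    for v in product((-1, 1), repeat=n):
        for i in range(n):
            for j in range(n):
                # (D_v X D_v)_{ij} = v_i * X_{ij} * v_j
                acc[i][j] += v[i] * X[i][j] * v[j]
    return [[entry / 2 ** n for entry in row] for row in acc]
```

Each off-diagonal entry is multiplied by $v_i v_j$, which averages to zero, while diagonal entries are multiplied by $v_i^2 = 1$; this is why only the diagonal survives.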
Exercise 1.44. Extreme points of $S_1^{m,n}$ are of the form $|x\rangle\langle y|$, where $x$ and $y$ are
unit vectors. Similarly, extreme points of $S_1^{m,sa}$ are of the form $|x\rangle\langle x|$. Extreme
points of $S_\infty^{m,n}$ are (if, say, $m \ge n$) the isometric embeddings of $\mathbb{R}^n$ into $\mathbb{R}^m$, in
particular, for $m = n$, orthogonal matrices (resp., $\mathbb{C}^n$ into $\mathbb{C}^m$, unitary matrices).
Extreme points of $S_\infty^{m,sa}$ are reflections and have $m + 1$ connected components
(eigenvalues are $\pm 1$), each of which can be identified with the Grassmann manifold
$\mathrm{Gr}(k, \mathbb{R}^m)$ for the appropriate $k \in \{0, 1, \dots, m\}$.
Exercise 1.45. If $X \in K = S_\infty^n$ (real or complex case), let $X = U\Sigma V^\dagger$ be the singular value
decomposition with $U, V \in \mathsf{O}(n)$ (or $\mathsf{U}(n)$ in the complex case) and $\Sigma$ a diagonal
matrix with diagonal entries belonging to $[0, 1]$. Consider the diagonal of $\Sigma$ as an
element of $B_\infty^n$ and apply Exercise 1.10 and Carathéodory’s theorem in $\mathbb{R}^n$. Other
instances of $K$ are handled in a similar way.
Applying Carathéodory’s theorem directly leads to a convex combination of $m + 1$
extreme points, where $m = \Theta(n^2)$ is the (real) dimension of the corresponding
space of matrices.
on

Exercise 1.46. The set S12,sa is a cylinder (whose base is the real version of the
2,sa
Bloch ball) and the set S8 is a double-cone over a disk (see Figure E.2).
se

reflections |1ih1| − |0ih0|



lu

−|0ih0| |1ih1|
• •
na

−I I
• •
so

• •
−|1ih1| |0ih0|
r


Pe

|0ih0| − |1ih1|

Figure E.2. Schatten unit balls in 2 ˆ 2 real self-adjoint matrices

Exercise 1.47. The delicate point is the triangle inequality. For $M, N \in \mathsf{M}_{m,n}$,
consider $\tilde{M}, \tilde{N} \in \mathsf{M}^{sa}_{m+n}$ as in Lemma 1.13. By mimicking the proof of Proposition
1.15, we obtain $\operatorname{spec}(\tilde{M} + \tilde{N}) \prec \operatorname{spec}(\tilde{M}) + \operatorname{spec}(\tilde{N})$, and therefore $s(M + N) \prec_w s(M) + s(N)$.
Using the result from Exercise 1.38, this implies that $\|s(M + N)\| \le \|s(M) + s(N)\| \le \|s(M)\| + \|s(N)\|$.
For the second statement (and, say, $m = n$)
consider the restriction of the norm to diagonal matrices with real entries.
Exercise 1.48. Mimic the explanation of the equality in (1.34), or use the bipolar
theorem.
Exercise 1.50. Use Exercise 1.43. For the second statement, analyze the proofs
for equality conditions.
Exercise 1.51. Since $S_p(\sigma) = H_p\bigl(\operatorname{spec}(\sigma)\bigr)$, it is enough to settle the commutative
case. Calculate the derivative $\frac{d}{dp}H_p(q)$ and show that it equals $-\sum_i r_i \log(r_i/q_i)$
for some classical state $r = (r_i)$ (depending on $p$). The quantity $\sum_i r_i \log(r_i/q_i)$ is
called the Kullback–Leibler divergence (or relative entropy) between $r$ and $q$ and
is always nonnegative by concavity of the logarithm.
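
As a small numerical companion to the last remark, the sketch below evaluates the Kullback–Leibler divergence $\sum_i r_i \log(r_i/q_i)$ for classical states given as lists of probabilities. The function name is hypothetical, and the conventions $0 \log 0 = 0$ and "$+\infty$ when $r$ is not supported inside the support of $q$" are standard choices that are assumed here rather than taken from the text.

```python
import math


def kl_divergence(r, q):
    """Kullback-Leibler divergence sum_i r_i log(r_i / q_i) in nats."""
    if len(r) != len(q):
        raise ValueError("r and q must have the same length")
    # If r puts mass where q has none, the divergence is infinite.
    if any(ri > 0 and qi == 0 for ri, qi in zip(r, q)):
        return math.inf
    # Terms with r_i = 0 contribute 0 by the convention 0 log 0 = 0.
    return sum(ri * math.log(ri / qi) for ri, qi in zip(r, q) if ri > 0)
```

The value is 0 when $r = q$ and strictly positive otherwise, which is the sign information the hint relies on.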

Chapter 2

Exercise 2.1. Boundary states are states having 0 in their spectrum.
Exercise 2.2. Prove the statement by induction on $d$. Use the intermediate value
theorem to show that the operator $\rho - \frac{1}{d}|\psi\rangle\langle\psi|$ is on the boundary of the PSD cone
for some unit vector $\psi$.
Exercise 2.3. Use (2.6).
Exercise 2.4. (i) Each $\sigma_a$ is a self-adjoint isometry, so its eigenvalues are $\pm 1$.
The assertion also follows formally from Exercise 2.3. (ii) It is enough to verify
directly just one of the rules; the remaining ones follow then via simple algebra by
repeatedly using (i).
Exercise 2.5. Hyperplanes in $H_1$ are described by the equation $\operatorname{Tr}(A\,\cdot\,) = t$ for
some $A \in \mathsf{M}_d^{sa}$ which is not a multiple of the identity (and which can be assumed
to be of trace 0) and some $t \in \mathbb{R}$. For such $A \in \mathsf{M}_d^{sa}$, we first note that
$$\max_{\rho \in \mathrm{D}(\mathbb{C}^d)} \operatorname{Tr}(A\rho) = \lambda_1(A),$$
so the value $t = \lambda_1$ corresponds to supporting hyperplanes (here $\lambda_1(A)$ denotes
the largest eigenvalue of $A$). Let $E$ be the eigenspace of $A$ corresponding to the
eigenvalue $\lambda_1(A)$. Given $\rho \in \mathrm{D}(\mathbb{C}^d)$, the condition $\operatorname{Tr}(A\rho) = \lambda_1(A)$ is equivalent to
$\rho$ having its range in $E$, and the result follows.

Exercise 2.6. This is an immediate consequence of the fact that $\mathrm{D}(\mathbb{C}^2)$ is linearly
isometric to the unit ball of $\mathbb{R}^3$ (see Section 2.1.2).
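Exercise 2.7. For $U \in \mathsf{U}(d)$, denote by $\Phi_U$ the map $\rho \mapsto U\rho U^\dagger$ and by $\Psi_U$ the
map $\rho \mapsto U\rho^T U^\dagger$. Check that the relations $\Phi_U \circ \Phi_V = \Phi_{UV}$, $\Phi_U \circ \Psi_V = \Psi_{UV}$,
$\Psi_U \circ \Phi_V = \Psi_{U\bar{V}}$ and $\Psi_U \circ \Psi_V = \Phi_{U\bar{V}}$ hold for any $U, V \in \mathsf{U}(d)$.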
Exercise 2.8. The statement is that isometries of the real projective space $\mathsf{P}(\mathbb{R}^n)$
are of the form $[\psi] \mapsto [O\psi]$ for some $O \in \mathsf{O}(n)$. This can be proved by induction
on $n$ since the set of points at largest distance from $[\psi]$ identifies with $\mathsf{P}(\psi^\perp)$.
Exercise 2.9. The hypothesis implies that the matrix of $\rho$ in any orthonormal
basis has real entries. Since this property remains true when one multiplies each
basis element by a complex number with modulus 1, it follows that the matrix of
$\rho$ in any orthonormal basis is diagonal, and therefore $\rho = \rho_*$.
Exercise 2.10. For (i), work in the affine hyperplane of trace one self-adjoint
operators, whose real dimension is $d^4 - 1$. For (ii), let $\mathrm{Seg} \subset S_{\mathbb{C}^d \otimes \mathbb{C}^d}$ be the set of
product unit vectors (see (B.6)) and consider the map $\Psi : \Delta_{k-1} \times \mathrm{Seg}^k \to \mathrm{Sep}$ defined
as $\Psi(\lambda, \psi_1, \dots, \psi_k) = \sum \lambda_i |\psi_i\rangle\langle\psi_i|$. Then prove that a necessary condition for
the surjectivity of $\Psi$ is that $\dim(\Delta_{k-1}) + k \dim(\mathrm{Seg}) \ge \dim(\mathrm{Sep})$ for an appropriate
notion of dimension. One possible notion is the covering dimension of a compact
metric space $(X, d)$ defined as $\dim(X) = \liminf_{\varepsilon \to 0} \log N(X, \varepsilon) / \log(1/\varepsilon)$ where
$N(X, \varepsilon)$ is the covering number defined in Section 5.1. We have $\dim(\Delta_{k-1}) = k - 1$,
$\dim \mathrm{Seg} = 4d - 3$ and $\dim \mathrm{Sep} = d^4 - 1$.
Exercise 2.11. Consider first the case $d_1 = d_2 = 2$ and let $E = \operatorname{span}(|00\rangle, |11\rangle) \subset \mathbb{C}^2 \otimes \mathbb{C}^2$.
Since the only product vectors contained in $E$ are $|00\rangle$ and $|11\rangle$, it follows
that the intersection of $\mathrm{Sep}(\mathbb{C}^2 \otimes \mathbb{C}^2)$ with the hyperplane $\{A : \operatorname{Tr}(AE) = 1\}$ is
the set of states of the form $\lambda|00\rangle\langle 00| + (1 - \lambda)|11\rangle\langle 11|$ for $\lambda \in [0, 1]$. This set is a
1-dimensional face. Deduce the case of arbitrary $d_1, d_2$.
Exercise 2.12. Expand all the objects with respect to the canonical bases, i.e.,
$|i\rangle$, $|i\rangle \otimes |j\rangle$, $|i\rangle\langle j|$ etc., as appropriate.
Exercise 2.13. Verify that $\operatorname{dist}(\psi, \mathrm{Seg})^2 = 2 - 2\lambda_1(\psi)$ and note that $\lambda_1(\psi)$ is
minimal when $\psi$ is a maximally entangled vector.
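Exercise 2.14. The statement about the antisymmetric space follows from the
relation $\mathrm{Asym}_d = \{\psi - F(\psi) : \psi \in \mathbb{C}^d \otimes \mathbb{C}^d\}$. For the symmetric space, what is
clear is that $\mathrm{Sym}_d = \{\psi + F(\psi) : \psi \in \mathbb{C}^d \otimes \mathbb{C}^d\} = \operatorname{span}\{x \otimes y + y \otimes x\}$; then use
the polarization formula $x \otimes y + y \otimes x = \frac{1}{2}(x + y)^{\otimes 2} - \frac{1}{2}(x - y)^{\otimes 2}$.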
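
The polarization formula in the hint to Exercise 2.14 can be verified coordinatewise. The following minimal sketch uses hypothetical helper names and represents vectors as lists of (possibly complex) numbers and tensors in the product basis.

```python
def tensor(u, v):
    """Tensor product u (x) v in the product basis."""
    return [a * b for a in u for b in v]


def polarization_gap(x, y):
    """Largest coordinate difference between the two sides of the formula."""
    lhs = [s + t for s, t in zip(tensor(x, y), tensor(y, x))]
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    rhs = [0.5 * s - 0.5 * t
           for s, t in zip(tensor(plus, plus), tensor(minus, minus))]
    return max(abs(l - r) for l, r in zip(lhs, rhs))
```

Since no conjugation is involved, the identity is bilinear in $(x, y)$ and holds for complex vectors as well, which is what makes it usable for showing that $\mathrm{Sym}_d$ is spanned by vectors of the form $z \otimes z$.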
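Exercise 2.15. (i) Write $P_E \otimes P_E$ as
$$\frac{1}{4}\Bigl[(P_E + P_{E^\perp})^{\otimes 2} + (P_E + iP_{E^\perp})^{\otimes 2} + (P_E - P_{E^\perp})^{\otimes 2} + (P_E - iP_{E^\perp})^{\otimes 2}\Bigr].$$
(ii) By Exercise 2.14, there are unit vectors $x, y \in \mathbb{C}^d$ such that $\langle\varphi, x \otimes x\rangle \neq 0$ and
$\langle\psi, y \otimes y\rangle \neq 0$. Let $W \in \mathsf{U}(d)$ be such that $x = Wy$. By (i), $|y \otimes y\rangle\langle y \otimes y| \in \mathcal{A}$,
and $V = (W \otimes W)(|y \otimes y\rangle\langle y \otimes y|)$ satisfies the desired conclusion. (iii) There are
vectors $\chi = x \otimes y - y \otimes x$ and $\chi' = x' \otimes y' - y' \otimes x'$ (with $x, y, x', y' \in \mathbb{C}^d$) such that
$\langle\psi, \chi\rangle \neq 0$ and $\langle\chi', \varphi\rangle \neq 0$. Denote $E = \operatorname{span}\{x, y\}$ and $E' = \operatorname{span}\{x', y'\}$ and show
that necessarily $\dim E = \dim E' = 2$. Let $W \in \mathsf{U}(d)$ be such that $E' = WE$ and use
$V = (W \otimes W)(P_E \otimes P_E)$. As before, $V \in \mathcal{A}$ by (i). To verify that $\langle\varphi|V|\psi\rangle \neq 0$ use
the fact that $(P_E \otimes P_E)\varphi, \chi$ are all collinear (since $\dim \mathrm{Asym}_2 = 1$) and nonzero,
and similarly for $V\varphi, (W \otimes W)\chi, \chi'$. (iv) First, by Exercise 2.14, both $\mathrm{Sym}_d$ and
$\mathrm{Asym}_d$ are invariant under the $U \otimes U$ action of $\mathsf{U}(d)$ and hence $\mathcal{A}$-invariant. To
show that they are $\mathcal{A}$-irreducible (and hence “$U \otimes U$-irreducible”), prove and use
the following.
A semigroup $\mathcal{A} \subset B(\mathcal{H})$ acts irreducibly on $\mathcal{H}$ if and only if for any $\varphi, \psi \in \mathcal{H} \setminus \{0\}$
there exists $V \in \mathcal{A}$ such that $\langle\varphi|V|\psi\rangle \neq 0$.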


Exercise 2.16. (i) Apply Proposition 2.9 to eigenspaces of $\rho$. (ii) Use (i) and
the fact that $VU$ is Haar-distributed for any fixed $V \in \mathsf{U}(d)$. (iii) Apply (ii) to
$\rho = |x \otimes x\rangle\langle x \otimes x|$, where $x$ is a fixed unit vector in $\mathbb{C}^d$.
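Exercise 2.17. Convexity is easy. If $\rho = \sum \lambda_i \sigma_i \otimes \tau_i$ is separable (with $\lambda_i > 0$,
$\sigma_i \in \mathrm{D}(\mathcal{H}_1)$ and $\tau_i \in \mathrm{D}(\mathcal{H}_2)$), then $\sum \lambda_i \sigma_i \otimes \tau_i^{\otimes l}$ is an $l$-extension of $\rho$. If $\rho_k$ is a
$k$-extension of $\rho$ and $l < k$, taking partial trace over $k - l$ copies of $\mathcal{H}_2$ gives an
$l$-extension.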
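Exercise 2.18. (i) Write $\rho = \sum \lambda_i |\chi_i\rangle\langle\chi_i|$ for $\lambda_i > 0$ and unit vectors $\chi_i \in \mathcal{H}_1 \otimes \mathcal{H}_2$.
Necessarily $\operatorname{Tr}_{\mathcal{H}_2} |\chi_i\rangle\langle\chi_i| = |\psi\rangle\langle\psi|$ for all $i$, and by considering the
Schmidt decomposition of $\chi_i$, one sees that $\chi_i = \psi \otimes \varphi_i$ for some $\varphi_i \in \mathcal{H}_2$, hence
the result. (ii) Let $\rho \in \mathrm{D}(\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_2)$ be a 2-extension of $|\psi\rangle\langle\psi|$. By (i), $\rho$ has
the form $|\psi\rangle\langle\psi| \otimes \sigma$ for some $\sigma \in \mathrm{D}(\mathcal{H}_2)$. Taking partial trace over the first copy of
$\mathcal{H}_2$ shows that $|\psi\rangle\langle\psi|$ is a product state.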
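Exercise 2.19. If $\psi = \sum_{i=1}^d \lambda_i e_i \otimes f_i$ is the Schmidt decomposition, show that
$$|\psi\rangle\langle\psi|^\Gamma = \sum_{i,j=1}^{d} \lambda_i\lambda_j\, |e_j \otimes f_i\rangle\langle e_i \otimes f_j|$$
and that its spectrum is $\{\lambda_i^2 : 1 \le i \le d\} \cup \{\pm\lambda_i\lambda_j : 1 \le i < j \le d\}$.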
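
The spectrum claimed in the hint to Exercise 2.19 can be confirmed by exhibiting eigenvectors explicitly. The sketch below is an illustration under simplifying assumptions not made in the text: both Schmidt bases are taken to be the canonical basis, the Schmidt coefficients are real, and the function names are hypothetical. It builds the matrix of $|\psi\rangle\langle\psi|^\Gamma$ and checks that $e_i \otimes e_i$ and $(e_i \otimes e_j \pm e_j \otimes e_i)/\sqrt{2}$ are eigenvectors with eigenvalues $\lambda_i^2$ and $\pm\lambda_i\lambda_j$.

```python
import math


def partial_transpose_of_pure_state(lams):
    """Matrix of |psi><psi|^Gamma for psi = sum_i lams[i] e_i (x) e_i."""
    d = len(lams)
    M = [[0.0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            # Coefficient of |e_j (x) e_i><e_i (x) e_j|.
            M[j * d + i][i * d + j] = lams[i] * lams[j]
    return M


def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def candidate_eigenpairs(lams):
    """The d^2 orthonormal candidate eigenvectors with their eigenvalues."""
    d = len(lams)
    pairs = []
    for i in range(d):
        v = [0.0] * (d * d)
        v[i * d + i] = 1.0
        pairs.append((lams[i] ** 2, v))
    for i in range(d):
        for j in range(i + 1, d):
            for sign in (1.0, -1.0):
                v = [0.0] * (d * d)
                v[i * d + j] = 1.0 / math.sqrt(2)
                v[j * d + i] = sign / math.sqrt(2)
                pairs.append((sign * lams[i] * lams[j], v))
    return pairs


def max_residual(lams):
    """Largest entry of M v - mu v over all candidate eigenpairs."""
    M = partial_transpose_of_pure_state(lams)
    return max(abs(a - mu * b)
               for mu, v in candidate_eigenpairs(lams)
               for a, b in zip(matvec(M, v), v))
```

Because the $d + d(d-1) = d^2$ candidate vectors are orthonormal, a vanishing residual accounts for the whole spectrum; in particular a negative eigenvalue appears as soon as two Schmidt coefficients are nonzero.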

Exercise 2.21. What are the operators $\rho$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$, for which we can be sure
that $\rho^\Gamma = (V \otimes I)^\dagger \rho (V \otimes I)$? (Note that $V$ depends on $X$.)
Exercise 2.22. Note that $\Gamma^2 = \mathrm{Id}$. Take $E = \{A \in B^{sa}(\mathcal{H}_1 \otimes \mathcal{H}_2) : A^\Gamma = A\}$.
Exercise 2.23. $\rho_\beta^\Gamma = \frac{\beta}{d}F + (1 - \beta)\frac{I}{d^2}$, and therefore $\rho_\beta$ is PPT if and only if $\beta \le \frac{1}{d+1}$.
It follows that $\rho_\beta$ is entangled for $\beta > \frac{1}{d+1}$. Next, verify that $\rho_\beta^\Gamma = w_\lambda$, where $w_\lambda$ is
the Werner state (2.21) with $\lambda = (\beta(d^2 - 1) + d + 1)/2d$. For $-\frac{1}{d^2 - 1} \le \beta \le \frac{1}{d+1}$, we
have $\frac{1}{2} \le \lambda \le 1$, so $w_\lambda$ is separable by Proposition 2.16. Since the partial transpose
of a separable state is a separable state, the result follows.
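Exercise 2.24. (i) For $\psi \in S_{\mathbb{C}^{d_1}}$ and $\varphi \in S_{\mathbb{C}^{d_2}}$, we have $|\psi \otimes \varphi\rangle\langle\psi \otimes \varphi|^R = |\psi \otimes \psi\rangle\langle\varphi \otimes \varphi|$;
in particular $\bigl\||\psi \otimes \varphi\rangle\langle\psi \otimes \varphi|^R\bigr\|_1 = 1$. Using the triangle inequality
for $\|\cdot\|_1$, it follows that $\|\rho^R\|_1 \le 1$ for any separable state $\rho$. (ii) Let $\rho = |\chi\rangle\langle\chi|$ for
$\chi \in S_{\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}}$. Consider a Schmidt decomposition $\chi = \sum \lambda_i \psi_i \otimes \varphi_i$. We have
$$\rho^R = \sum_{i,j} \lambda_i\lambda_j\, |\psi_i \otimes \psi_j\rangle\langle\varphi_i \otimes \varphi_j|.$$
Since the families $(\psi_i \otimes \psi_j)_{i,j}$ and $(\varphi_i \otimes \varphi_j)_{i,j}$ consist of orthonormal vectors, it
follows that $\|\rho^R\|_1 = \sum_{i,j} \lambda_i\lambda_j = \bigl(\sum \lambda_i\bigr)^2$, so $\|\rho^R\|_1 > 1$ unless $\rho$ is separable.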


Exercise 2.25. Use the self-dual property (see Section 1.2.1 for more on this) of
the positive semi-definite cone of operators: $A \ge 0$ if and only if $\operatorname{Tr}(AB) \ge 0$ for
all $B \ge 0$.
Exercise 2.26. $\Phi \otimes \Psi = (\Phi \otimes \mathrm{Id}) \circ (\mathrm{Id} \otimes \Psi)$.
Exercise 2.27. Write the Choi matrix of $\Phi$ as the difference of two positive
operators.
Exercise 2.28. When $\dim \mathcal{H}_1 \le \dim \mathcal{H}_2$, this follows from the proof of Theorem
2.21. Otherwise, consider $\Phi^*$ to switch the roles of $\mathcal{H}_1$ and $\mathcal{H}_2$, and use Exercise
2.25 and the fact that $\Phi$ is completely positive if and only if $\Phi^*$ is completely
positive.
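Exercise 2.29. To show that the map $\Phi$ is not $(k + 1)$-positive, consider the input
operator $|\psi\rangle\langle\psi|$ for $\psi = \sum_{i=1}^{k+1} |i\rangle \otimes |i\rangle \in \mathbb{C}^n \otimes \mathbb{C}^{k+1}$. To establish $k$-positivity of $\Phi$,
write any $\psi \in \mathbb{C}^n \otimes \mathbb{C}^k$ as $\sum_{i=1}^{k} \chi_i \otimes \varphi_i$ with $(\varphi_i)$ an orthonormal basis in $\mathbb{C}^k$, and
argue that
$$(\Phi \otimes \mathrm{Id}_{\mathsf{M}_k})(|\psi\rangle\langle\psi|) \ge \sum_{i<j} |\chi_i \otimes \varphi_i - \chi_j \otimes \varphi_j\rangle\langle\chi_i \otimes \varphi_i - \chi_j \otimes \varphi_j| \ge 0.$$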
Exercise 2.30. The unit ball in $\bigl(\mathsf{M}_d^{sa}, \|\cdot\|_\infty\bigr)$ is an “order interval” $\{\tau : -I \le \tau \le I\}$
(where $\sigma \le \tau$ means that $\tau - \sigma$ is positive semi-definite) and positive maps are
exactly those that preserve this order.
Exercise 2.31. Use the preceding exercise and duality. Alternatively, use the
fact that any $\tau \in \mathsf{M}_m^{sa}$ can be written as $\tau_1 - \tau_2$ with $\tau_1, \tau_2$ positive and $\|\tau\|_1 = \|\tau_1\|_1 + \|\tau_2\|_1$.
Exercise 2.32. (i) The “only if” part follows from the preceding exercise (note
that $\Phi \otimes \mathrm{Id}$ is trace-preserving if $\Phi$ is). In the opposite direction, if $\sigma$ is positive
and $\Phi(\sigma)$ is not, then $\|\sigma\|_1 = \operatorname{Tr}\sigma = \operatorname{Tr}\Phi(\sigma) < \|\Phi(\sigma)\|_1$. This takes care of $k = 1$,
and the general case follows formally. (ii) The norm equals 2; note that the norm
is necessarily attained on a pure state and use Exercise 2.19. Essentially the same
argument gives $k$ for the norm of $\Phi \otimes \mathrm{Id}$ on $\bigl(B^{sa}(\mathbb{C}^m \otimes \mathbb{C}^k), \|\cdot\|_1\bigr)$. (iii) Use part
(ii) and duality.
Exercise 2.33. The case $\operatorname{rank}\rho = n$ follows from Proposition 1.4 (the trace-preserving
hypothesis is not needed). In the general case, argue by contradiction:
let $E = \operatorname{range}(\sigma)$, $E' = \operatorname{range}(\Phi(\sigma))$, and assume that $r := \dim E > r' := \dim E'$.
Next, use Propositions 2.1 and 1.4 to infer that $\Phi(\mathrm{D}(E)) \subset \mathrm{D}(E')$ and note that
$r = \operatorname{Tr} P_E = \operatorname{Tr}\Phi(P_E) \le r'\|\Phi(P_E)\|_\infty$, hence $\|\Phi(P_E)\|_\infty \ge \frac{r}{r'} > 1 = \|P_E\|_\infty$.
Conclude by appealing to Exercise 2.30.
Exercise 2.34. (i) A channel is an affine map from the Bloch ball to itself; such a
map is necessarily a contraction and preserves the center if and only if the channel
is unital. We are allowed to compose the channel with maps $X \mapsto UXU^\dagger$ for
$U \in \mathsf{U}(2)$, which correspond to rotations of the Bloch ball. This yields the desired
form with $|a|, |b|, |c|$ being the singular values of the contraction. We may have
$a, b, c$ negative since we are only allowed proper rotations (from $\mathsf{SO}(3)$). (ii) follows
from Theorem 2.21 after we compute explicitly the Choi matrix. For (iii), note
that the inequalities for $(a, b, c)$ obtained in part (ii) describe a tetrahedron whose
vertices are $(1, 1, 1)$ (corresponding to the identity channel) and permutations of
$(1, -1, -1)$ (corresponding to conjugations with Pauli matrices).
Exercise 2.35. Apply Carathéodory’s theorem in the space of unital and trace-
preserving superoperators.
Exercise 2.37. Check that $R(X) = \mathbf{E}\, UXU^\dagger$ with $U$ Haar-distributed (see also
Exercise 8.6), and that $D(X) = \mathbf{E}\, VXV^\dagger$ where $V$ is uniformly distributed among
diagonal matrices with $\pm 1$ on the diagonal.
Exercise 2.38. The condition is that $\operatorname{Tr} M_i = 1$, which implies in particular
$N = \dim \mathcal{H}$. One checks then directly that $\Phi^*(\rho) = \sum M_i \langle i|\rho|i\rangle$.
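Exercise 2.39. The implication (i) $\Rightarrow$ (ii) is immediate from (2.32). Assuming (ii),
write $C(\Phi) = \sum |x_i \otimes y_i\rangle\langle x_i \otimes y_i|$ for $x_i \in \mathcal{H}^{out}$, $y_i \in \mathcal{H}^{in}$. Repeating the proof of
Theorem 2.21 with this decomposition instead of (2.34) gives (iii). Finally, assuming
(iii), there are positive semi-definite operators $A_i \in B(\mathcal{H}^{out})$ and $B_i \in B(\mathcal{H}^{in})$ such
that $\Phi(Y) = \sum_i \operatorname{Tr}(B_i Y)A_i$ for any $Y \in B^{sa}(\mathcal{H}^{in})$. Consequently, for any $d$ and
any positive operator $X \in B(\mathcal{H}^{in} \otimes \mathbb{C}^d)$,
$$(\Phi \otimes \mathrm{Id}_{\mathsf{M}_d})(X) = \sum_i A_i \otimes \operatorname{Tr}_{\mathcal{H}^{in}}\Bigl[\bigl(B_i^{1/2} \otimes I\bigr) X \bigl(B_i^{1/2} \otimes I\bigr)\Bigr]$$
belongs to the separable cone, hence $\Phi$ is entanglement-breaking.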
Exercise 2.40. If $\Phi$ is entanglement-breaking, write $\Phi \otimes \Psi = (\mathrm{Id} \otimes \Psi) \circ (\Phi \otimes \mathrm{Id})$
and use the fact that the product superoperator $\mathrm{Id} \otimes \Psi$ maps the separable cone to
the separable cone.
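Exercise 2.41. By (2.32), $C(\Phi)^\Gamma$ is positive whenever $\Phi$ is PPT-inducing. Conversely,
assume that $C(\Phi)^\Gamma$ is positive. It is enough to show that $(\Phi \otimes \mathrm{Id}_{\mathsf{M}_d})(\rho)$ has
positive partial transpose for every pure state $\rho = |\psi\rangle\langle\psi|$, with $\psi \in \mathcal{H}^{in} \otimes \mathbb{C}^d$. Denoting
$\chi = \sum e_i \otimes e_i \in \mathcal{H}^{in} \otimes \mathcal{H}^{in}$, we may write $\psi = (I \otimes B)\chi$ for some $B \in B(\mathcal{H}^{in}, \mathbb{C}^d)$.
It follows that
$$(\Phi \otimes \mathrm{Id})(\rho) = (\Phi \otimes \mathrm{Id})\bigl[(I \otimes B)|\chi\rangle\langle\chi|(I \otimes B^\dagger)\bigr] = (I \otimes B)\,C(\Phi)\,(I \otimes B^\dagger)$$
has positive partial transpose.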
Exercise 2.42. Prove that $A, B \ge 0$ implies $A \odot B \ge 0$ ($A \odot B$ is a submatrix of
$A \otimes B$). Use also the fact that $\Theta_A \otimes \mathrm{Id}_{\mathsf{M}_k} = \Theta_{A \otimes J}$ where $J$ is the matrix with all
entries equal to 1.
Exercise 2.43. Observe that if $a \in \mathbb{C}^n$ and $D = D_a$ (i.e., the diagonal matrix with
$D_{ii} = a_i$), then $DXD^\dagger = \Theta_{|a\rangle\langle a|}(X)$ for any $X \in \mathsf{M}_d$.
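Exercise 2.44. The map $\Phi$ is completely positive, and trace-preserving because $\sum A_i^\dagger A_i = I_{\mathbb{C}^2 \otimes \mathbb{C}^2}$. It is also obvious from the definition that $\Phi$ is a separable channel. Assume now that $\Phi$ can be written as a convex combination of product channels of the form $\sum \lambda_j \Psi_j \otimes \Xi_j$ with $\lambda_j > 0$ and $\sum \lambda_j = 1$. The pure product states $|0\rangle\langle 0| \otimes |0\rangle\langle 0|$ and $|1\rangle\langle 1| \otimes |1\rangle\langle 1|$ are mapped to themselves under $\Phi$. It follows that for every $j$, $\Psi_j(|0\rangle\langle 0|) = \Xi_j(|0\rangle\langle 0|) = |0\rangle\langle 0|$ and $\Psi_j(|1\rangle\langle 1|) = \Xi_j(|1\rangle\langle 1|) = |1\rangle\langle 1|$. This leads to a contradiction since $\Phi(|0\rangle\langle 0| \otimes |1\rangle\langle 1|) = |0\rangle\langle 0| \otimes |0\rangle\langle 0|$.
Exercise 2.45. If $\{A_i^{(1)}\}$ are Kraus operators for $\Phi_1$ and $\{A_j^{(2)}\}$ are Kraus operators for $\Phi_2$, the family $\{A_i^{(1)} \otimes I\} \cup \{I \otimes A_j^{(2)}\}$ are Kraus operators for $\Phi_1 \oplus \Phi_2$.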
Exercise 2.46. Prove that $\Phi$ is co-completely positive iff $T \circ \Phi$ is completely positive iff $\Phi \circ T$ is completely positive, where $T$ denotes the transposition superoperator.
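Exercise 2.47. Prove that, for $\Phi \in B(M_m^{\mathrm{sa}}, M_n^{\mathrm{sa}})$ and $\Psi \in B(M_n^{\mathrm{sa}}, M_m^{\mathrm{sa}})$, we have
\[ \operatorname{Tr}(\Psi \circ \Phi) = \operatorname{Tr}\big( C(\Phi)\, F\, C(\Psi)\, F^* \big), \]
where $F : \mathbb{C}^m \otimes \mathbb{C}^n \to \mathbb{C}^m \otimes \mathbb{C}^n$ is the flip, and use self-duality of $\mathrm{PSD}$.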
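Exercise 2.48. A superoperator $\Phi : M_m^{\mathrm{sa}} \to M_n^{\mathrm{sa}}$ is $k$-positive if $\Phi \otimes \mathrm{Id}_{M_k}$ is positive, i.e., if $\langle y|(\Phi \otimes \mathrm{Id})(|x\rangle\langle x|)|y\rangle \geq 0$ for any $x \in \mathbb{C}^m \otimes \mathbb{C}^k$ and $y \in \mathbb{C}^n \otimes \mathbb{C}^k$. Writing $x = \sum x_i \otimes |i\rangle$ and $y = \sum y_i \otimes |i\rangle$ for $x_i \in \mathbb{C}^m$ and $y_i \in \mathbb{C}^n$, this condition becomes
\[ \sum_{i,j=1}^{k} \langle y_i | \Phi(|x_i\rangle\langle x_j|) | y_j \rangle \geq 0. \tag{E.2} \]
If we expand $x_i$ as $x_i = \sum_l \langle e_l, x_i \rangle e_l$, where $(e_l)$ is the basis in $\mathbb{C}^m$ used in the definition of the Choi matrix, (E.2) becomes
\[ \sum_{i,j=1}^{k} \langle y_i \otimes x_i | C(\Phi) | y_j \otimes x_j \rangle \geq 0, \]
which is equivalent to $\langle \psi | C(\Phi) | \psi \rangle \geq 0$ for any $\psi \in \mathbb{C}^n \otimes \mathbb{C}^m$ with Schmidt rank at most $k$. This shows that (1) is equivalent to (2). The equivalence between (2) and (3) follows from the fact that a vector $x \in \mathbb{C}^m \otimes \mathbb{C}^n$ has Schmidt rank at most $k$ iff it can be written as $x = (A \otimes B)y$, where $y \in \mathbb{C}^k \otimes \mathbb{C}^k$, $A \in M_{k,m}$ and $B \in M_{k,n}$.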
Exercise 2.49. The “only if” part follows from Proposition 2.29. For the “if” part, argue first that if $\Phi$ is rank-preserving, then it maps the interior of $\mathrm{PSD}$ into itself. Next, if there was a positive definite operator $\tau$ that was not in the image of the interior of $\mathrm{PSD}$ under $\Phi$, then some point of the segment connecting $\tau$ and $\Phi(I)$ would be of the form $\Phi(\sigma)$ for some $\sigma \in \partial\mathrm{PSD}$, in particular $\operatorname{rank} \sigma < n = \operatorname{rank} \Phi(\sigma)$. Infer that $\Phi$ is a bijection of the interior of $\mathrm{PSD}$ onto itself and conclude that it is an automorphism of $\mathrm{PSD}$.
Exercise 2.50. Start by showing that if $\Phi$ belongs to the interior of $\mathbf{P}$, then $\Phi(\mathrm{D}) \cap \partial\mathrm{PSD} = \emptyset$. Next, consider $\lambda_n$ (the smallest eigenvalue) as a function on $\Phi(\mathrm{D})$.
Exercise 2.51. The condition is equivalent to $\langle \Phi(\rho), \sigma \rangle_{\mathrm{HS}} \geq \delta \operatorname{Tr} \rho \operatorname{Tr} \sigma$ for all $\rho, \sigma \in \mathrm{PSD}$.
Exercise 2.52. (a) In the language of the second proof of Theorem 2.36, if $\|R\|_\infty = 1$, then $R(S^2) \cap S^2$ consists of at least 2 points. On the other hand, there are nontrivial ellipsoids contained in $B_2^3$ that intersect $S^2$ only at one point. For a concrete example, consider $\rho \mapsto \frac12 (\rho + |0\rangle\langle 0|)$ for a state $\rho \in \mathrm{D}(\mathbb{C}^2)$. (b) Any unitary channel (even the identity channel!) will do.
Exercise 2.53. Same example as in Exercise 2.52 (a).
Exercise 2.54. If $\rho \in \mathrm{D}(\mathbb{C}^m \otimes \mathbb{C}^n)$ is an entangled state, let $\Phi$ be given by Theorem 2.34. Note that for $\varepsilon > 0$ small enough, the map $\Phi' : X \mapsto \Phi(X) + \varepsilon (\operatorname{Tr} X)\, I$ also satisfies the conclusions of Theorem 2.34. Finally consider $\Psi : X \mapsto \Phi'(I)^{-1/2}\, \Phi'(X)\, \Phi'(I)^{-1/2}$.
Exercise 2.55. (i) Let $A = \Phi^*(I)$; then $\operatorname{Tr} \Phi(\rho) = \operatorname{Tr}(A\rho)$ for all $\rho \in M_m^{\mathrm{sa}}$. (ii) We may assume that $A = \Phi^*(I)$ is positive definite and satisfies $\Phi^*(I) \leq I$. (iii) Set $B = (I - A)^{1/2}$, define $\tilde\Phi : M_m^{\mathrm{sa}} \to M_{m+n}^{\mathrm{sa}}$ by $\tilde\Phi(\rho) = B\rho B \oplus \Phi(\rho)$ and verify that $\tilde\Phi$ is trace-preserving. (iv) Verify that $\tilde\Phi(\rho) \geq 0$ if and only if $\Phi(\rho) \geq 0$, and that the same is true for any extensions $\Phi \otimes \mathrm{Id}_{\mathcal{K}}$ and $\tilde\Phi \otimes \mathrm{Id}_{\mathcal{K}}$. (v) Deduce from (iv) that $\tilde\Phi$ preserves positivity and that it detects entanglement of a state $\rho$ (in the sense of Theorem 2.34) iff $\Phi$ does.
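Exercise 2.56. Let $\sigma \in \partial\mathbf{P} \setminus \mathrm{PSD}$, and $\tau \in \partial\mathbf{P}$ such that $E(\sigma) \subset E(\tau)$. We may apply Lemma C.4 with $F = -\tau$ and $G = -\sigma$ and conclude that $\mu\sigma - \tau \in \mathrm{PSD}$ for some $\mu \geq 0$ (in fact, $\mu > 0$). Since we may write $\mu\sigma = (\mu\sigma - \tau) + \tau$, the assumption that $\sigma$ lies on an extreme ray forces $\tau$ to be proportional to $\sigma$.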

Chapter 4
Exercise 4.1. Write $B = K \cap H = L \cap H$ and take a unit vector $u \perp H$. After applying to $K$ and $L$ linear transformations fixing $H$, we may assume that $u \in K$ and that $\max\{\langle u, x \rangle : x \in K\} = 1$, and similarly for $L$. Under these hypotheses, we have
\[ \operatorname{conv}\{B, \pm u\} \subset K \subset 2B \times [-u, u] \subset 3 \operatorname{conv}\{B, \pm u\} \]
(same for $L$) which gives the result with $C = 9$. The result appears in [Las08] and we are not aware of a known better value for $C$.
Exercise 4.2. The inclusion $K \subset -n\Delta$ is shown by a variational argument. To prove the other inclusion we show that every point $\notin (n+1)\Delta$ would form, together with one of the faces of $\Delta$, a simplex of volume larger than that of $\Delta$.
Exercise 4.3. Let $(K_k)$ be a sequence of convex bodies in $\mathbb{R}^n$. By Exercise 4.2, we may assume (applying invertible affine transformations if necessary) that $\Delta_n \subset K_k \subset (n+1)\Delta_n$. Then apply Ascoli’s theorem to extract from $(\|\cdot\|_{K_k})$ a subsequence converging uniformly on $S^{n-1}$.
Exercise 4.4. We have $\frac12 (K - K) \subset K_\cup \subset K - K$ and $K - K$ does not depend on the choice of the origin.
Exercise 4.5. We know from (1.14) that pK q˝ “ K ˝ X p´K ˝ q. Then check that
$x \in K^\circ \iff x - \frac{e}{|e|^2} \in -C^*$.
Exercise 4.6. $\mu_K = \sum_{i=1}^{p} |u_i|\, \delta_{u_i/|u_i|}$ (replace $\delta_x$ by $\frac12 (\delta_x + \delta_{-x})$ to obtain an even measure).
Exercise 4.7. If $P$ is a polygon whose edges are the segments $\pm S_1, \dots, \pm S_k$, then $P$ is a translate of $S_1 + \cdots + S_k$. (This can also be checked by induction.) The result for zonoids follows by approximation.
Exercise 4.8. Prove that every face of a zonotope is a zonotope.

Exercise 4.9. No. For every partition of $S^{n-1}$ as $A_1 \cup A_2$, the convex bodies $K_1, K_2$ defined for $x \in \mathbb{R}^n$ by
\[ \|x\|_{K_i} = \int_{A_i} |\langle x, \theta \rangle| \, d\sigma(\theta) \]
are such that $K_1 + K_2$ is a multiple of a Euclidean ball.

Exercise 4.10. For the first statement, use Exercise 1.2. For the second statement, try $K = [0, 1] \subset \mathbb{R}$ and $K' = \{(x, y) \in \mathbb{R}^2 : x \geq 0,\ y \geq 0,\ xy \geq 1\}$ (cf. the hint to Exercise 1.30).
Exercise 4.11. Straightforward from the definitions.
Exercise 4.12. Start by noticing that if $\{x_i\} \subset K$ is a basis of $V$ and $\{x'_j\}$ is a basis of $V'$, then $\{\pm x_i \otimes x'_j\} \subset K \mathbin{\hat\otimes} V'$.
Exercise 4.13. Let $d = \dim(V_1 \mathbin{\hat\otimes} V_2)$. If $0 \in V_1$ and $0 \in V_2$, then $V_1 \mathbin{\hat\otimes} V_2 = V_1 \otimes V_2$ and $d = \dim V_1 \dim V_2$. If, say, $0 \in V_1$ and $0 \notin V_2$, then $d = \dim V_1 (\dim V_2 + 1)$. If $0 \notin V_1$ and $0 \notin V_2$, then $d = (\dim V_1 + 1)(\dim V_2 + 1) - 1$. The first is easy; the second follows, e.g., from Exercise 4.12. For the third, consider first the case when $V_j = e_{n_j} + \mathbb{R}^{n_j - 1}$ and then appeal to Exercise 4.11 (or, alternatively, use the approach from the paragraph following (4.13)).
Exercise 4.14. The part that is not obvious is that $\operatorname{conv}\{x \otimes x' : x \in C,\ x' \in C'\}$ is closed. Consider first the case when $C, C'$ are pointed and hence admit (by Exercise 1.32) compact bases, which allows appealing to Exercise 4.10. Next, use Exercise 1.33 and Exercise 4.12.
Exercise 4.15. Use Exercise 4.10 and then (to show full-dimensionality) the polarization formula
\[ \frac12 \big( (x + y) \otimes (x' + y') + (x - y) \otimes (x' - y') \big) = x \otimes x' + y \otimes y'. \]
To show full-dimensionality in the affine setting use the same ideas as in Exercises 4.12 and 4.13 to establish that the relative interior of $K_1 \mathbin{\hat\otimes} K_2$ is nonempty.
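The polarization formula from the hint to Exercise 4.15 can be checked mechanically in coordinates. The following is a minimal sketch, assuming elementary tensors are stored as outer-product matrices; the helper names (`tensor`, `combine`, `lhs`, `rhs`) and the sample vectors are illustrative choices, not part of the text.

```python
from fractions import Fraction

def tensor(u, v):
    """Elementary tensor of u and v, stored as the outer-product matrix."""
    return [[a * b for b in v] for a in u]

def combine(alpha, M, beta, N):
    """Entrywise linear combination alpha*M + beta*N of two matrices."""
    return [[alpha * m + beta * n for m, n in zip(rm, rn)]
            for rm, rn in zip(M, N)]

def lhs(x, y, xp, yp):
    """(1/2) [ (x+y) tensor (x'+y') + (x-y) tensor (x'-y') ]."""
    plus = tensor([a + b for a, b in zip(x, y)],
                  [a + b for a, b in zip(xp, yp)])
    minus = tensor([a - b for a, b in zip(x, y)],
                   [a - b for a, b in zip(xp, yp)])
    half = Fraction(1, 2)  # exact arithmetic, so equality can be tested with ==
    return combine(half, plus, half, minus)

def rhs(x, y, xp, yp):
    """x tensor x' + y tensor y'."""
    return combine(1, tensor(x, xp), 1, tensor(y, yp))

# Arbitrary sample vectors (hypothetical data) in R^3 and R^2.
x, y = [1, 2, -1], [0, 3, 5]
xp, yp = [2, -4], [7, 1]
print(lhs(x, y, xp, yp) == rhs(x, y, xp, yp))
```

The cross terms $x \otimes y'$ and $y \otimes x'$ appear with opposite signs in the two summands and cancel, which is all the identity says.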
Exercise 4.16. (i) The unit ball in $\ell_1^k(X)$, where $X$ is the normed space whose unit ball is $K$. ($\ell_1^k(X)$ is the space $X^k$ equipped with the norm $\|(x_1, \dots, x_k)\| = \|x_1\|_K + \cdots + \|x_k\|_K$.) (ii) Use the formula $\operatorname{vol}(L) = \frac{1}{n!} \int_{\mathbb{R}^n} \exp(-\|x\|_L) \, dx$, valid for any symmetric convex body $L \subset \mathbb{R}^n$.
Exercise 4.17. It is clear that any extreme point must be of the claimed form. Conversely, given extreme points $x \in K$, $x' \in K'$, let $\varphi$ and $\varphi'$ be supporting functionals, i.e., $\varphi \leq 1$ on $K$ with $\varphi(x) = 1$ (and similarly for $\varphi'$). Given a decomposition $x \otimes x' = \sum \lambda_i x_i \otimes x'_i$, show that we may assume that $\varphi(x_i) = \varphi'(x'_i) = 1$. Now if $x_i \neq x$ for some $i$, consider a linear functional $\psi$ such that $\psi(x) > \sup\{\psi(x_i) : x_i \neq x\}$, and obtain a contradiction by computing $(\psi \otimes \varphi')(x \otimes x')$.
Exercise 4.18. Straightforward from the definitions.
Exercise 4.19. Calculate $\operatorname{Tr}(T I_n)$ by using (4.16).
Exercise 4.20. If $(x_i, c_i)$ is a resolution of identity, then for any $x \in \mathbb{R}^n$, we have $\sum c_i \langle x, x_i \rangle^2 = |x|^2$ and $\sum c_i = n$, thus $\max_i |\langle x, x_i \rangle| \geq |x|/\sqrt{n}$.
Exercise 4.21. If $(x_i, c_i)$ is an unbiased resolution of identity, then for any $x \in \mathbb{R}^n$, we have $\sum c_i \langle x, x_i \rangle^2 = |x|^2$, $\sum c_i \langle x, x_i \rangle = 0$, $\langle x, x_i \rangle \geq -|x|$ and $\sum c_i = n$. All this together implies $\max_i \langle x, x_i \rangle \geq |x|/n$.
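As a sanity check on the bound in Exercise 4.21, one can test it on a small explicit example. The sketch below assumes the planar configuration of three unit vectors at mutual angles of 120 degrees with weights $c_i = 2/3$ (a choice made here for illustration, not taken from the text); it verifies numerically that this is an unbiased resolution of identity and that $\max_i \langle x, x_i \rangle \geq |x|/n$ with $n = 2$, sampling unit vectors $x$ on a grid.

```python
import math

n = 2
# Three unit vectors at mutual angles of 120 degrees (illustrative choice).
angles = [math.pi / 2 + 2 * math.pi * k / 3 for k in range(3)]
xs = [(math.cos(a), math.sin(a)) for a in angles]
cs = [2 / 3] * 3  # weights; they sum to n = 2

def second_moment():
    """Matrix sum_i c_i x_i x_i^T; equals the identity for a resolution of identity."""
    return [[sum(c * x[r] * x[s] for c, x in zip(cs, xs)) for s in range(n)]
            for r in range(n)]

def barycenter():
    """Vector sum_i c_i x_i; vanishes when the resolution is unbiased."""
    return [sum(c * x[r] for c, x in zip(cs, xs)) for r in range(n)]

def worst_case_max(samples=3600):
    """Minimum, over sampled unit vectors x, of max_i <x, x_i>."""
    best = float("inf")
    for j in range(samples):
        phi = 2 * math.pi * j / samples
        x = (math.cos(phi), math.sin(phi))
        best = min(best, max(x[0] * xi[0] + x[1] * xi[1] for xi in xs))
    return best

print(second_moment(), barycenter(), worst_case_max())
```

In this example the sampled minimum is attained midway between two of the vectors, so the bound $|x|/n$ cannot be improved in dimension 2.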
Exercise 4.22. Use Carathéodory’s theorem.

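Exercise 4.23. We have $\operatorname{John}(B_\infty^n) = B_2^n$ and $\operatorname{Löw}(B_\infty^n) = \sqrt{n}\, B_2^n$. If $\mathcal{E} \subset B_\infty^n \subset \alpha\mathcal{E}$ for some ellipsoid $\mathcal{E}$, the extremal volume property implies $\alpha \geq \sqrt{n}$.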

Exercise 4.24. $G'_{\mathrm{unc}}$ consists of all diagonal matrices. If $\Delta$ is a diagonal matrix and $P$ a permutation matrix, what are $\Delta P$ and $P\Delta$?

Exercise 4.25. (i) $B_p^n$ is permutationally symmetric. (ii) Isometries of $S_p^{m,n}$ include maps $X \mapsto UXV$ for $U, V$ orthogonal/unitary matrices; it follows that $S_p^n$ has enough symmetries. (iii) Any isometry of $S_p^{n,\mathrm{sa}}$ preserves $\mathbb{R}\, I$ (indeed, $\pm n^{-1/p} I$ can be characterized as isolated points in the set of elements of largest (for $p > 2$) or smallest (for $p < 2$) Hilbert–Schmidt norm in $\partial S_p^{n,\mathrm{sa}}$) so $S_p^{n,\mathrm{sa}}$ does not have enough symmetries. (iv) Isometries include $X \mapsto \pm U X U^\dagger$ for $U$ orthogonal/unitary, there are enough symmetries. (v) Isometries of the regular simplex are obtained from permutations of its vertices; it has enough symmetries. (vi) See Theorem 2.3, $\mathrm{D}(\mathbb{C}^d)$ has enough symmetries.
Exercise 4.26. (i) Choosing for $K$ a regular $p$-gon fails since the isometry group is a dihedral group. However it is possible to slightly modify $K$ to obtain the required isometry group, for example $K = \operatorname{conv}(\{R^k(1, 0) : 0 \leq k \leq p - 1\} \cup \{R^k(1, \varepsilon) : 0 \leq k \leq p - 1\})$ for $\varepsilon > 0$ small enough. (ii) Consider $L = K \mathbin{\hat\otimes} B_1^n$ where $K$ is the convex body from (i). We claim that isometries of $L$ have the form $\sum_{i=1}^{n} U_i \otimes |e_{\sigma(i)}\rangle\langle e_i|$ for some $\sigma \in \mathfrak{S}_n$ and $U_1, \dots, U_n \in \operatorname{Iso}(K)$, $(e_i)$ being the canonical basis of $\mathbb{R}^n$. Indeed an isometry of $L$ induces an isometry on the set $M = \{R^k(1, \varepsilon) \otimes e_i : 0 \leq k \leq p - 1,\ 1 \leq i \leq n\}$ (the set of points in $K$ farthest from the origin) and one checks that it must be of the announced form. It follows that $L$ does not have enough symmetries (since $U \otimes I$ commutes with $\operatorname{Iso}(L)$ for any $U \in \mathrm{SO}(2)$). On the other hand, there is no invariant subspace (if $x = \sum x_i \otimes e_i \in \mathbb{R}^2 \otimes \mathbb{R}^n$, one checks that for any $j$ the vector $x_j \otimes e_j$ belongs to $\operatorname{span}\{Vx : V \in \operatorname{Iso}(L)\}$, and therefore the orbit of $x$ spans $\mathbb{R}^2 \otimes \mathbb{R}^n$ whenever $x \neq 0$).
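Exercise 4.27. Isometries of $K \mathbin{\hat\otimes} L$ include the maps $A \otimes B$ for $A \in \operatorname{Iso}(K)$ and $B \in \operatorname{Iso}(L)$. We claim that a linear map $S \in B(\mathbb{R}^m \otimes \mathbb{R}^n)$ which commutes with all such maps is a multiple of identity; this follows from the fact that, for every $y, y' \in \mathbb{R}^n$, the map $S_{y,y'}$ defined by the relation $\langle S_{y,y'}(x), x' \rangle = \langle S(x \otimes y), x' \otimes y' \rangle$ (for $x, x' \in \mathbb{R}^m$) commutes with $\operatorname{Iso}(K)$, and similarly with the role of both factors exchanged.
Exercise 4.28. We have $\operatorname{John}(\sqrt{n}\, B_1^n) = B_2^n$. The John ellipsoid of $\sqrt{n}\, B_1^n \mathbin{\hat\otimes} \sqrt{n}\, B_1^n$ (which identifies with $n B_1^{n^2}$) is $\mathcal{E} = B_2^n \otimes_2 B_2^n$. The John ellipsoid of $B_2^n \mathbin{\hat\otimes} B_2^n$ (which identifies with $S_1^{n,n}$) is $\frac{1}{\sqrt{n}} \mathcal{E}$.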
Exercise 4.30. By “globally equivalent” we mean that the validity of all instances of (4.22) implies the validity of all instances of (4.21), and vice versa. To derive (4.21) from (4.22), appeal to the arithmetic mean/geometric mean inequality. To recover (4.22), apply (4.21) to $K/\operatorname{vol}(K)^{1/n}$ and $L/\operatorname{vol}(L)^{1/n}$ with $t = \operatorname{vol}(K)^{1/n} / (\operatorname{vol}(K)^{1/n} + \operatorname{vol}(L)^{1/n})$.
Exercise 4.31. Given convex bodies $K_1, K_2 \subset \mathbb{R}^n$, consider $K = \operatorname{conv}(K_1 \times \{0\}, K_2 \times \{1\}) \subset \mathbb{R}^{n+1}$. Then $K \cap (\mathbb{R}^n \times \{\lambda\})$ corresponds to $\lambda K_2 + (1 - \lambda) K_1$.
Exercise 4.32. Use the formula $\det(A \otimes B) = \det(A)^n \det(B)^m$ for $A \in M_m$ and $B \in M_n$.
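The determinant formula quoted in the hint to Exercise 4.32 is easy to confirm on a small case. The sketch below is an illustration only: it assumes the usual block (Kronecker) matrix for $A \otimes B$, uses exact rational arithmetic, and the helper names `det` and `kron` as well as the sample matrices are hypothetical.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over exact rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    size = len(A)
    d = Fraction(1)
    for c in range(size):
        # Find a pivot row for column c.
        p = next((r for r in range(c, size) if A[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d  # a row swap flips the sign
        d *= A[c][c]
        for r in range(c + 1, size):
            f = A[r][c] / A[c][c]
            for k in range(c, size):
                A[r][k] -= f * A[c][k]
    return d

def kron(A, B):
    """Kronecker product of an m x m matrix A with an n x n matrix B."""
    m, n = len(A), len(B)
    return [[A[i // n][j // n] * B[i % n][j % n] for j in range(m * n)]
            for i in range(m * n)]

A = [[2, 1], [1, 3]]                     # m = 2
B = [[1, 2, 0], [0, 1, 4], [5, 0, 1]]    # n = 3
m, n = len(A), len(B)
print(det(kron(A, B)) == det(A) ** n * det(B) ** m)
```

Note the exponents: $\det(A)$ is raised to the size of $B$ and vice versa, which is what makes the volume computation in the exercise work out.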
Exercise 4.34. If $f = \exp(\varphi)$ is the density of $\mu$, take $(1 + \varphi/s)_+^s$ as the density of $\mu_s$.
Exercise 4.35. To show that 1. implies 2., define
\[ K = \bigcup_{x \in \operatorname{supp} \mu} \{x\} \times f(x)^{1/s} L, \]
where $L$ is any convex body of volume 1 in $\mathbb{R}^s$. Conversely, apply Brunn–Minkowski inequality in $\mathbb{R}^s$ to deduce that the function $x \mapsto \operatorname{vol}_s((\{x\} \times \mathbb{R}^s) \cap K)^{1/s}$ is concave.
Exercise 4.36. If $\mu$ is log-concave, take $(\mu_s)$ as in Lemma 4.12 and show (4.28) for $\mu_s$ instead of $\mu$ by using Lemma 4.13 and (4.21) applied in $\mathbb{R}^{n+s}$. Conversely, apply (4.28) with $K$ and $L$ being balls of radius tending to 0 to prove that $\mu$ is log-concave. Note that the density $f$ satisfies $f(x) = \lim_{\varepsilon \to 0} \mu(B(x, \varepsilon)) / \operatorname{vol}(B(x, \varepsilon))$ for almost all $x \in \mathbb{R}^n$ (see Chapter 3, Theorem 1.4 in [SS05]).
Exercise 4.37. If the origin is not an interior point of $K$, then $w(K^\circ) = \infty$. Otherwise, for every $u \in S^{n-1}$ we have $1 \leq (w(K, u)\, w(K^\circ, u))^{1/2}$. Integrate over $u$ and use the Cauchy–Schwarz inequality.
Exercise 4.38. Inradii and outradii are easy to compute. To compute $\operatorname{vol}(B_1^n)$, note that it is the union of $2^n$ essentially disjoint simplices, each with volume $1/n!$.
Exercise 4.39. Show and use $B_\infty^n \subset n^{1/p} B_p^n \subset n B_1^n$ and $\big( \operatorname{vol}(n B_1^n) / \operatorname{vol}(B_\infty^n) \big)^{1/n} = n/(n!)^{1/n} \sim e$.
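The asymptotic claim in Exercise 4.39 can be watched numerically. A minimal sketch, assuming only the volume formula $\operatorname{vol}(B_1^n) = 2^n/n!$ from Exercise 4.38 and $\operatorname{vol}(B_\infty^n) = 2^n$; the function name `volume_ratio_root` is an illustrative choice, and `math.lgamma` is used to avoid overflowing $n!$.

```python
import math

def volume_ratio_root(n):
    """(vol(n B_1^n) / vol(B_inf^n))^(1/n) = n / (n!)^(1/n).

    Uses vol(n B_1^n) = n^n 2^n / n! and vol(B_inf^n) = 2^n,
    with log(n!) computed as lgamma(n + 1).
    """
    return n / math.exp(math.lgamma(n + 1) / n)

for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, volume_ratio_root(n), math.e - volume_ratio_root(n))
```

By Stirling's formula the ratio equals $e \cdot (2\pi n)^{-1/(2n)}(1 + o(1))$, so it stays below $e$ and the gap shrinks roughly like $\log(n)/n$.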
Exercise 4.40. Integrate in Cartesian and polar coordinates, and appeal to (4.26).
Exercise 4.41. Observe that if $x \neq y$, then $B(x, r) \cap B(y, r) \subset B(\frac{x+y}{2}, r')$ for some $r' < r$.
Exercise 4.42. Consider rectangles of height 1 and width $\varepsilon$ with $\varepsilon \to 0$.
Exercise 4.43. $K$ contains a segment $I$ with length $\ell = \operatorname{diam} K$, so $\kappa_n w(K) = w_G(K) \geq w_G(I) = \frac12 \kappa_1 \ell$. It is an interesting question whether we always have $w(K) \geq \frac{\kappa_1}{\kappa_n} \operatorname{outrad}(K)$. In other words, we are asking whether among all sets of given outradius $R$ the segment of length $2R$ has the minimal mean width, which doesn’t readily follow from the known results on sets for which (under certain constraints) the mean width is extremal (see, e.g., [Bal91, Sch99, Bar98]). The above question is equivalent to the following inequality (see Appendix A.2 for definitions): If $X_1, X_2, \dots, X_N$ are jointly Gaussian $N(0, 1)$-distributed random variables such that, for some positive scalars $t_1, t_2, \dots, t_N$ we have $\sum_k t_k X_k = 0$, then $\mathbf{E} \max_k X_k \geq \mathbf{E} |X_1|$.
Exercise 4.44. Show the inequality for symmetric polygons by induction on the number of edges, then use symmetrization $K \mapsto K - K$ and approximation.
Exercise 4.45. If $G$ is a standard Gaussian vector in $\mathbb{R}^n$, then $\mathbf{E}\, w(K, P_E G) \leq \mathbf{E}\, w(K, G)$ by Jensen’s inequality.
Exercise 4.46. We can assume $A$ linear. Use the classical fact ([Hal82], Problem 177) that any $A \in M_n$ with $\|A\|_\infty \leq 1$ can be written as the top left block of an orthogonal matrix $O \in M_{2n}$ to reduce to the case where $A$ is an orthogonal projection, which is covered by Exercise 4.45.
The same assertion holds in fact for any contraction (i.e., not necessarily affine), see Proposition 6.6 and the comments following it. However, this change of generality makes the result much more subtle.
Exercise 4.47. By translation invariance, assume $0 \in K \cap L$, so that the functionals $w(K, \cdot)$ and $w(L, \cdot)$ are nonnegative. Then $w(K \cup L, \cdot) = \max(w(K, \cdot), w(L, \cdot)) \leq w(K, \cdot) + w(L, \cdot)$.

Exercise 4.48. By modifying the proof of Proposition 4.16 show that
\[ \Big( \int_{S^{n-1}} \|\theta\|_K^p \, d\sigma(\theta) \Big)^{1/p} \operatorname{vrad}(K) \geq 1 \quad \text{for any } p > 0, \]
then let $p \to 0$. The inequality and the argument appear in Appendix A of [Sza05], but were likely known earlier.
Exercise 4.49. (i) When the measure $\mu$ is purely atomic with $N$ atoms, the result can be proved by induction on $N$, the case $N = 2$ being exactly the Brunn–Minkowski inequality (4.22). The continuous case can then be derived by approximation. Minkowski integrals of convex bodies are defined via their support functions, so that inequality (4.37) makes sense whenever the map $t \mapsto w(K_t, \theta)$ is measurable for any $\theta \in \mathbb{R}^n$. (ii) In that case, $\operatorname{vol}(K_t) = \operatorname{vol}(K)$ for any $t \in \mathrm{O}(n)$. By invariance of the Haar measure (see Appendix B.3), the convex body $L := \int_{\mathrm{O}(n)} t(K) \, d\mu(t)$ is necessarily a Euclidean ball centered at the origin. By computing the width of $L$ in a fixed direction, we obtain that $L$ is a Euclidean ball of radius $w(K)$, showing the result.
Exercise 4.50. We have $\operatorname{vrad}(K) \leq \operatorname{vrad}(K^\circ)^{-1} \leq w(K)$.
Exercise 4.51. In the symmetric case, combine the results from Proposition 4.15, Proposition 4.16 and Theorem 4.17. In the general case, sufficient conditions are that (i) 0 is the center of the largest Euclidean ball contained in $K$ and (ii) 0 is the centroid of $K$ (for Santaló’s inequality to hold). These conditions are both satisfied whenever 0 is the unique fixed point under $\operatorname{Iso}(K)$.
Exercise 4.53. By Fubini theorem, we have
\[ \operatorname{vol}(K) \leq \operatorname{vol}_F(P_F K) \max_{x \in F} \operatorname{vol}_E(K \cap (E + x)) \]
and the convexity and symmetry of $K$ imply that the maximum is achieved for $x = 0$.
Exercise 4.54. Apply Lemma 4.20 to the convex body $K \times K \subset \mathbb{R}^{2n}$ and to the pair of orthogonal subspaces $E = \{(x, x) : x \in \mathbb{R}^n\}$ and $F = \{(x, -x) : x \in \mathbb{R}^n\}$.
Exercise 4.55. The lower inequality follows from (4.21). For the upper inequality, assume $h = 1$ and apply Lemma 4.20 to the convex body
\[ L = \{(\lambda x, (1 - \lambda) y, \lambda) : x \in K,\ y \in K,\ \lambda \in [0, 1]\} \subset \mathbb{R}^{2n+1} \]
and to the pair of subspaces $E = \{(x, x, 1/2) : x \in \mathbb{R}^n\}$, $F = \{(x, -x, t) : x \in \mathbb{R}^n,\ t \in \mathbb{R}\}$. We have $\operatorname{vol}(L) = \frac{n!^2}{(2n+1)!} \operatorname{vol}(K)^2$, $\operatorname{vol}(K \cap E) = 2^{-n/2} \operatorname{vol}(K)$ and
volpPF Kq “ 2´n{2 volpK q.
Exercise 4.56. Apply Lemma 4.20 to the convex body L “ pK ˆ t1uq Ă Rn`1
with E the line generated by p0, . . . , 0, 1q.
Exercise 4.57. Use Theorem 4.21.
Exercise 4.58. Consider $f, g, h : \mathbb{R}^n \to [0, \infty]$ verifying (4.46). For $s \in \mathbb{R}$, define $f_s : \mathbb{R}^{n-1} \to [0, \infty]$ by $f_s(z) = f(s, z)$ and similarly for $g_s$, $h_s$. If $t = \lambda u + (1 - \lambda) v$, check that $h_t$, $f_u$, and $g_v$ verify the $(n-1)$-dimensional instance of (4.46). Deduce that $\tilde f(s) = \int_{\mathbb{R}^{n-1}} f_s(z) \, dz$ and similarly defined $\tilde g$, $\tilde h$ verify the 1-dimensional instance of (4.46), and conclude by appealing to that instance.
Exercise 4.59. (i) Let $\alpha = f(0)$. Write $\int x^2 f(x) \, dx = 2 \int_0^\infty 2t\, \mathbf{P}(Y \geq t) \, dt$ where $Y$ is a random variable with density $f$. Show that the log-concavity hypothesis implies that $\mathbf{P}(X \geq t) \leq \mathbf{P}(Y \geq t) \leq \mathbf{P}(Z \geq t)$, where $X$ is uniformly distributed on $[-1/2\alpha, 1/2\alpha]$ and $Z$ has a symmetric exponential distribution with density $\alpha \exp(-2\alpha |t|) \, dt$. (ii) Reduce to $\lambda = 1$ by considering $L = \lambda^{-1} K$. Assume that $H = u^\perp$ for a unit vector $u$. For $\lambda = 1$, the function
\[ f : t \mapsto (\operatorname{vol}_n K)^{-1} \operatorname{vol}_{n-1}\big( K \cap \{\langle \cdot, u \rangle = t\} \big) \]
satisfies the hypotheses from (i); log-concavity is given by Lemma 4.13 and Exercise 4.34.
Exercise 4.60. Use the inclusions $\frac{1}{\sqrt{2}} K \subset B_2^n \subset K$ for $K = B_2^k \times B_2^{n-k}$ and $L \subset B_2^n \subset \sqrt{2}\, L$ for $L = K^\circ = \operatorname{conv}\{B_2^k \times \{0\}, \{0\} \times B_2^{n-k}\}$, which correspond to equality cases/extreme cases of Lemmas 4.19 and 4.20.



Chapter 5
Exercise 5.1. To obtain some of the strict inequalities, consider $K = \{0, 1\} \subset [0, 1]$.
Exercise 5.2. This is an immediate consequence of Cauchy’s integral formula for surface area (see [Sch14], Chapter 5.3). Alternatively, an elementary argument can be given as follows: consider the map $\varphi : \mathbb{R}^n \to K$ which maps $x$ to the closest point to $x$ in $K$. It is easy to check that (i) $\varphi$ is a contraction (ii) $\varphi$ maps $\partial L$ onto $\partial K$. It follows that $\varphi$ decreases the surface area.
Exercise 5.3. Let $K = \operatorname{conv}(C(x, t))$. We have $\operatorname{outrad} K = \sin t$. Let $L$ be the $n$-dimensional half-ball with center $x$ and radius $\sin t$, such that $K \subset L$ (see Figure 5.2). Comparing the areas of $K$ and $L$ using Exercise 5.2 gives the result. To prove the second part, check the inequality $\cos u \leq e^{-u^2/2}$ for $|u| < \pi/2$ (take logarithm of both sides and then differentiate).
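The elementary inequality $\cos u \leq e^{-u^2/2}$ used in Exercise 5.3 can be spot-checked on a grid before proving it. The sketch below is a numerical illustration only (the grid size and the function name `gap` are arbitrary choices); the proof suggested in the hint amounts to noting that $\frac{d}{du}\big(\log\cos u + u^2/2\big) = u - \tan u \leq 0$ on $[0, \pi/2)$.

```python
import math

def gap(u):
    """exp(-u^2/2) - cos(u); the claim is that this is >= 0 for |u| < pi/2."""
    return math.exp(-u * u / 2) - math.cos(u)

# Sample the open interval (-pi/2, pi/2) on a uniform grid.
steps = 1000
grid = [k * (math.pi / 2) / steps for k in range(-steps + 1, steps)]
smallest = min(gap(u) for u in grid)
print(smallest)
```

Near $u = 0$ the gap behaves like $u^4/12$, so the inequality is tight only at the origin.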
Exercise 5.4. Use Exercise 5.2 with $L = B_2^n$ and $K = B_2^n \setminus \operatorname{conv}(C(x, t))$. This gives $\operatorname{area}(S^{n-1})\, V(t) \geq \sin(t)^{n-1} \operatorname{vol}_{n-1}(B_2^{n-1})$, which is equivalent to the lower bound in (5.4). To get the upper bound, compare the solid cap with the circumscribed solid cone whose base is the same as that of the cap. For the strengthened lower bound, consider an inscribed cone. See Figure E.3.
Figure E.3. Upper bound (dashed) and lower bounds (dotted) on the volume of a spherical cap.

Exercise 5.5. The problem is equivalent to showing that $r \mapsto r\, \frac{V'(r)}{V(r)}$ is nonincreasing. After some elementary manipulations, the inequality to verify becomes
\[ \frac{1}{r} \int_0^r \Big( \frac{\sin(ut)}{\sin(ur)} \Big)^{n-2} dt \leq \frac{1}{r} \int_0^r \Big( \frac{\sin t}{\sin r} \Big)^{n-2} dt \]
for $r \in (0, \pi)$ and $u \in (0, 1)$. It can then be checked that the inequality $\frac{\sin(ut)}{\sin(ur)} < \frac{\sin t}{\sin r}$ holds pointwise if $0 < t < r < \pi$. The argument actually shows strict concavity in the “nontrivial” range (i.e., when $e^t \leq \pi$) for $n > 2$.
For the second part, note that Proposition 5.2 implies the inequality $V(\lambda r)/V(r) \geq V(\lambda s)/V(s)$ for $\lambda > 1$ and $r \leq s$, and we recover (5.6) when $r$ tends to 0.
Exercise 5.6. If $n = 2$ and $\varepsilon$ is slightly smaller than 1, we need at least 4 arcs to cover $S^1$.
Exercise 5.7. Argue by contradiction using the Hahn–Banach separation theorem; this is mostly planar geometry.
Exercise 5.8. Use Proposition 5.4 with $\theta + \eta = \varepsilon$ and $\eta = \varepsilon/n$, and (5.6). This choice gives $C = e$, but (as in the original Rogers’s argument) optimizing over $\eta$ leads to $C = 1$, at the expense of additional lower order terms.
Exercise 5.9. We know from Exercise 5.7 that $N(S^{n-1}, g, \varepsilon) \geq n + 1$ for any $\varepsilon < \pi/2$.
Exercise 5.10. Write $\mathcal{N} = \{[\psi] : \psi \in \mathcal{N}_1\}$ for some set $\mathcal{N}_1 \subset S_{\mathbb{C}^d}$. Take now $\mathcal{T}$ to be an $\varepsilon$-net in the unit circle $\{\zeta \in \mathbb{C} : |\zeta| = 1\}$ and check that the set $\mathcal{N}_2 = \{\zeta\psi : \zeta \in \mathcal{T},\ \psi \in \mathcal{N}_1\}$ is a $2\varepsilon$-net in $(S_{\mathbb{C}^d}, g)$ (in fact even a $\sqrt{2}\,\varepsilon$-net). Note that $\operatorname{card} \mathcal{N}_2 = \operatorname{card} \mathcal{T} \operatorname{card} \mathcal{N}$. Since we can ensure that $\operatorname{card} \mathcal{T} = \lceil \pi/\varepsilon \rceil$, the result follows from the bound (5.7). For the upper bound, argue similarly that $P(S_{\mathbb{C}^d}, \varepsilon) \geq P(\mathrm{P}(\mathbb{C}^d), \varepsilon) \times P(S^1, \varepsilon)$, and then appeal to (5.1).
Exercise 5.11. Rough two-sided estimates of the form $(C\varepsilon)^{2d-2}$ follow from Exercise 5.10 and (5.2), but the precise value requires a careful integration. First show that the question is a special case (with $n = 2d$) of the problem of calculating the spherical volume of the $\varepsilon$-neighborhood (in the geodesic distance) of $S^1$ considered as a subset of $S^{n-1}$. Next, observe that the (non-normalized) volume of that neighborhood equals $\operatorname{vol}_1(S^1) \operatorname{vol}_{n-3}(S^{n-3}) \int_0^\varepsilon \cos t \, \sin^{n-3} t \, dt$. Conclude by evaluating the integral and using repeatedly the formula (B.2).
Exercise 5.12. (i) Expand $\|x_1 + \cdots + x_N\|^2$. (ii) Prove by induction on $n$ that at most $2n$ nonzero vectors in $\mathbb{R}^n$ can have pairwise nonpositive inner product.
Exercise 5.13. (i) Consider the Gram matrix $G = (\langle x_i, x_j \rangle)_{1 \leq i,j \leq N}$ and write $N = \|G\|_1 \leq \sqrt{n}\, \|G\|_2 \leq \sqrt{n}\, (N + N^2 t^2)^{1/2}$. (ii) Observe that the vectors $(x_i^{\otimes k})$ span a space (the symmetric subspace) of dimension at most $\binom{n+k-1}{k-1} \leq e^k (1 + n/k)^k$. Then choose $k = \log(n)/2\log(1/t)$ and apply (i) to the vectors $(x_i^{\otimes k})$. See Theorem 9.3 in [Alo03]. (iii) Consider a maximal set of points verifying the condition from the exercise with $t = 1/r$.

Exercise 5.14. This is even simpler than the case of the sphere. Let $(x_i)_{1 \leq i \leq N}$ be chosen uniformly and independently on $\{-1, 1\}^n$ and let $A = \bigcup_{i=1}^{N} B(x_i, \varepsilon)$. Then, by (5.13), $\mathbf{E} \operatorname{card} A^c \leq 2^n (1 - V(\varepsilon)/2^n)^N \leq \exp(n \log 2 - N 2^{n(H(\varepsilon) - 1)}/(n+1))$. This is less than 1 (and therefore the event $\{A = \{-1, 1\}^n\}$ has positive probability) provided $N > n(n+1) \log(2) \cdot 2^{n(1 - H(\varepsilon))}$. The matching lower bound on covering numbers is (5.2).

Exercise 5.15. We have $V_q(t) = \sum_{k=0}^{tn} \binom{n}{k} (q-1)^k \leq \sum_{k=0}^{n} \binom{n}{k} (q-1)^k \alpha^{k - tn} = q^{n H_q(t)}$ for $\alpha = t/((1-t)(q-1)) \leq 1$. For the lower bound, just keep the last term and write $q^{-n H_q(t)} V_q(t) \geq q^{-n H_q(t)} \binom{n}{tn} (q-1)^{tn} = \binom{n}{tn} t^{tn} (1-t)^{(1-t)n} \geq \frac{1}{n+1}$. The last inequality follows from the fact that $\binom{n}{k} t^k (1-t)^{n-k}$ is maximal for $k = tn$.
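The two-sided estimate from Exercise 5.15 can be compared with exact counts for small parameters. This is a sketch under stated assumptions: $H_q$ is taken to be the standard $q$-ary entropy $H_q(t) = t\log_q(q-1) - t\log_q t - (1-t)\log_q(1-t)$, $V_q(t)$ is read as the cardinality of a Hamming ball of radius $tn$ in $\{0, \dots, q-1\}^n$ (no normalization), and $t = k/n$ is restricted to $0 < t \leq 1 - 1/q$ so that $\alpha \leq 1$. The function names are illustrative.

```python
import math

def entropy_q(t, q):
    """q-ary entropy H_q(t) for 0 < t < 1 (assumed standard definition)."""
    return (t * math.log(q - 1, q)
            - t * math.log(t, q)
            - (1 - t) * math.log(1 - t, q))

def ball_size(n, k, q):
    """Number of words in {0,...,q-1}^n within Hamming distance k of a fixed word."""
    return sum(math.comb(n, j) * (q - 1) ** j for j in range(k + 1))

def bounds(n, k, q):
    """Return (lower, exact, upper) with upper = q^(n H_q(k/n)), lower = upper/(n+1)."""
    upper = q ** (n * entropy_q(k / n, q))
    return upper / (n + 1), ball_size(n, k, q), upper

for n, k, q in [(20, 5, 2), (12, 6, 2), (30, 10, 3)]:
    lower, exact, upper = bounds(n, k, q)
    print(n, k, q, lower <= exact <= upper)
```

The factor $n + 1$ between the two bounds comes from the fact that the largest of the $n + 1$ binomial probabilities is at least their average.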
Exercise 5.16. Consider $L = B_1^3$ and let $K$ be a facet of $L$, then $N'(K, L) = 1 < N(K, L)$. To obtain an example with $K$ symmetric, let $K$ be a rhombus made of two opposite faces of $L$. Then $N'(K, L) = 2$ but two central sections of $L$ (which are hexagons) cannot cover $K$. If we insist on having $K$ with nonempty interior, that example can be modified by slightly enlarging $K$ and $L$.
Exercise 5.17. If $x \in K$, then $K \cap (x + \varepsilon L) \subset x + (\varepsilon L \cap (K - K))$.
Exercise 5.18. If $K_\cap = K \cap (-K)$, then $N(K, K, \varepsilon) \leq N(K, K_\cap, \varepsilon) \leq (2 + 4/\varepsilon)^n$. The argument from Lemma 5.9 applies then mutatis mutandis.


Exercise 5.19. One may be tempted to say that the statement follows from Exercise 5.18 by duality, but this fails since $K$ has centroid at the origin iff $K^\circ$ has Santaló point at the origin. However, a variant of the preceding hint gives a correct argument: by Lemma 5.8 and Proposition 4.18, we have $N(\partial K, -K, \varepsilon) \leq N(\partial K, K_\cap, \varepsilon) \leq (2 + 4/\varepsilon)^n =: N$. Let now $x_1, \dots, x_N$ be points in $\partial K$ such that the sets $x_i - \varepsilon K$ cover $\partial K$. For each $i$, let $f_i$ be a linear form such that $f_i(x_i) = 1$ and $f_i \leq 1$ on $K$, and let $Q = \bigcap_{1 \leq i \leq N} \{f_i \leq 1\}$. Show that $y \in \partial K$ satisfies $f_i(y) \geq 1 - \varepsilon$ for some $i$, and conclude that $(1 - \varepsilon) Q \subset K$.


Exercise 5.20. Use relations from Section 1.1.4 to show that $(K_\cap)^\circ \subset K^\circ - K^\circ$ and, subsequently, Theorem 4.21 to deduce that $\operatorname{vrad}(K_\cap) < 4 \operatorname{vrad}(K^\circ)$. Next, use the hypothesis to conclude that $\operatorname{vrad}(K_\cap) > \frac{c}{4\kappa} \operatorname{vrad}(K)$, where $c$ is the constant from Theorem 4.17. Then argue as in Exercises 5.18 and 5.19.
Exercise 5.21. If $T$ is a linear map such that $F = T(B_2^n)$, then $B_2^n = T(F^\circ)$.
Exercise 5.22. (ii) Let $L = [0, \varepsilon/\sqrt{n}\,]^n$. The cubes $\{x + L : x \in \mathcal{N}\}$ have disjoint interiors and lie inside $B_2^n + L$, so $\operatorname{card} \mathcal{N} \leq \frac{\operatorname{vol}(B_2^n + L)}{\operatorname{vol} L} = (\varepsilon/\sqrt{n})^{-n} \operatorname{vol}(B_2^n + L)$. Then use Urysohn’s inequality to bound the volume radius of $B_2^n + L$.
Exercise 5.23. By the results of Exercise B.8, if $M = \mathrm{SO}(n)$ (resp., $M = \mathrm{U}(n)$, $M = \mathrm{SU}(n)$), then $N(\frac{\pi}{4} K, \|\cdot\|_\infty, 2.5\varepsilon) \leq N(M, \|\cdot\|_\infty, \varepsilon) \leq N(\pi K, \|\cdot\|_\infty, \varepsilon)$, where $K$ denotes the operator norm unit ball in the space of real skew-symmetric (resp., complex skew-Hermitian, complex skew-Hermitian with zero trace) matrices. The result follows then from Lemma 5.8.
Exercise 5.24. Let $\theta_1, \dots, \theta_k \in [0, \pi/2]$ be the principal angles between $E, F \in \operatorname{Gr}(k, \mathbb{R}^n)$, as defined in (B.9). By Exercise B.12, the distance between $E$ and $F$ equals $\|(\pm 2 \sin \theta_i/2)\|_p \leq \sqrt{2}\, (2k)^{1/p}$.
Exercise 5.25. Use (5.2). Modulo the values of the constants $C, c$, the estimates are reciprocals of the bounds from (5.20).
Exercise 5.26. First observe that the extremal cases are when $K$ is a cap and $L = K_\varepsilon^c$. Then prove, as a consequence of Proposition 5.2, that the function $t \mapsto V(t) V(\pi - t - \varepsilon)$ increases on the interval $[0, \frac{\pi}{2} - \frac{\varepsilon}{2}]$ and decreases on the interval $[\frac{\pi}{2} - \frac{\varepsilon}{2}, \pi - \varepsilon]$.
Exercise 5.27. Consider $f(x) = \operatorname{dist}(x, A)$.
Exercise 5.28. We may arrange by translation that $K$ and $L$ are inside $R B_2^n$, so that the functions $w(K, \cdot)$ and $w(L, \cdot)$ are $R$-Lipschitz on $S^{n-1}$. Then use the union bound and Lévy’s lemma.
Exercise 5.29. Realize the normalized uniform measure on $\sqrt{N} S^{N-1}$ as the distribution of $\alpha_N G_N$, where $G_N$ is a standard Gaussian vector in $\mathbb{R}^N$ and $\alpha_N = \sqrt{N}/|G_N|$, and use the law of large numbers to conclude that $\alpha_N$ tends almost surely to 1.
Exercise 5.30. By approximation, it is enough to show that $\gamma_n(A) > \gamma_1((-\infty, a])$ implies $\gamma_n(A_\varepsilon) > \gamma_1((-\infty, a + \varepsilon])$. Consider the orthogonal projection (restricted to the sphere) $\pi_{N,n} : \sqrt{N} S^{N-1} \to \mathbb{R}^n$. For $N$ large enough, we know from Theorem 5.22 that the set $T := \pi_{N,n}^{-1}(A)$ has larger measure than the cap $C := \pi_{N,1}^{-1}((-\infty, a])$. It follows that $\sigma(T_\varepsilon) \geq \sigma(C_\varepsilon)$. Finally observe that $\pi_{N,n}^{-1}(A_\varepsilon) \supset T_\varepsilon$ while $C_\varepsilon = \pi_{N,1}^{-1}((-\infty, a + \varepsilon_N])$ for some $\varepsilon_N$ tending to $\varepsilon$ as $N$ tends to infinity; this follows from the (geodesic) radius of $C$ being $\sqrt{N} \arccos\big( -\frac{a}{\sqrt{N}} \big)$ and a similar formula for the radius of $C_\varepsilon$.
Exercise 5.31. Show that $\log \Phi$ is concave by computing second derivative and appealing to (A.4). Alternatively, use Proposition 5.2 and Poincaré’s lemma, and the fact that the function $u \mapsto \log(\arccos u)$ is concave near 0.
Exercise 5.32. The nontrivial part is to show that, for fixed $n$ and $\delta > 0$, and for $r$ large enough, we have $\Phi^{-1}\big( \gamma_n(r B_2^n) \big) > (1 - \delta) r$ or, equivalently, $\gamma_n\big( (r B_2^n)^c \big) < \gamma_1\big( [(1 - \delta) r, \infty) \big)$. Now choose a finite set $T \subset S^{n-1}$ such that $(r B_2^n)^c \subset \{x \in \mathbb{R}^n : \max_{u \in T} \langle x, u \rangle > (1 - \delta/2) r\}$ and use the fact that $\gamma_1\big( [\theta r, \infty) \big) \gg \gamma_1\big( [r, \infty) \big)$ as $r \to +\infty$ (with $\theta \in (0, 1)$ fixed). The last fact follows, e.g., from (A.4).
Exercise 5.33. To recover Ehrhard’s inequality for convex bodies $A, B \subset \mathbb{R}^n$, consider the convex body $K = \operatorname{conv}\{(\{1\} \times A) \cup (\{0\} \times B)\} \subset \mathbb{R} \times \mathbb{R}^n$.
Exercise 5.34. $h(t) = C \exp\big( -(\frac{n}{2} - \frac13)(\log(t^3) - t^3) \big)$ for some constant $C$ (depending on $n$). The same argument shows that the median of the gamma distribution with parameter $p$ is greater than $p - \frac13$.
Exercise 5.35. Use the exponential Markov inequality and optimize over $s \geq 0$.
Exercise 5.36. We may assume $b - a = 1$. Write $X$ as the convex combination $X = (b - X) a + (X - a) b$, use Jensen’s inequality and the convexity of the exponential function to reduce to the inequality $b \exp(sa) - a \exp(sb) \leq \exp(s^2/8)$. The latter follows since $g'' \leq \frac14$, where $g(s) = \log\big( b \exp(sa) - a \exp(sb) \big)$.
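The scalar inequality to which Exercise 5.36 reduces can be probed numerically. The sketch below assumes, in addition to the normalization $b - a = 1$ from the hint, that $a \leq 0 \leq b$ (as is the case for a centered variable with values in $[a, b]$); this extra hypothesis, the parametrization $a = -p$, $b = 1 - p$, and the name `hoeffding_gap` are assumptions made here for illustration.

```python
import math

def hoeffding_gap(p, s):
    """exp(s^2/8) - (b exp(s a) - a exp(s b)) for a = -p, b = 1 - p.

    With 0 <= p <= 1 this is the slack in the inequality
    b exp(sa) - a exp(sb) <= exp(s^2/8), where b - a = 1 and a <= 0 <= b.
    """
    a, b = -p, 1.0 - p
    return math.exp(s * s / 8) - (b * math.exp(s * a) - a * math.exp(s * b))

# Scan a grid of (p, s) values and record the smallest slack found.
worst = min(hoeffding_gap(i / 50, j / 10)
            for i in range(51)
            for j in range(-100, 101))
print(worst)
```

The slack vanishes at $s = 0$, which matches $g(0) = g'(0) = 0$; the bound $g'' \leq \frac14$ then gives $g(s) \leq s^2/8$ by Taylor's formula.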
Exercise 5.37. Use the exponential Markov inequality for $\pm X$ with $s = \frac{\varepsilon}{2(1 \pm \varepsilon)}$, and the bound $1 + \varepsilon \leq \exp(\varepsilon - (\varepsilon^2 - \varepsilon^3)/2)$. Checking the last inequality is a tedious but elementary computation.
Exercise 5.38. Argue as in the proof of Lemma 6.16 and Exercise 6.17, then set $Y_0 = \lambda^{1/2} (Y - a)$.
Exercise 5.39. Easy.
Exercise 5.40. Reduce the problem to $\lambda = 1$, then choose $\delta = \sqrt{\log(\kappa A)}$.

Exercise 5.41. Assume $\lambda = 1$. For the median, check that if $0 \leq t \leq \sqrt{\log(2A)}$, then $2A^2 \exp(-t^2/2) \geq \frac12$. For the 3rd quartile, check that $3\sqrt{2}\, A^2 \exp(-t^2/2) \geq \frac34$ for $0 \leq t \leq \sqrt{\log(4A)}$, but that similar inequality holds with $3\sqrt{2}\, A^2$ replaced by $4A^2$ only if $A \geq 3^{2/3}/4$. For other quantiles, recalculate the bound on $|M - a|$ and then show analogous inequalities. The only verification that is not straightforward is establishing the bound $4A^2 \exp(-\lambda t^2/2)$ when $M$ is the mean under the original hypothesis $A \geq \frac12$ (i.e., without assuming that $A \geq e^{-1/3}$); it can be done by identifying a family of extremal c.d.f. of $Y$ that are of the form
\[ F(t) = \begin{cases} 0 & \text{if } t < 0, \\ 1 - p & \text{if } 0 \leq t \leq t_0, \\ 1 - A e^{-t^2} & \text{if } t \geq t_0, \end{cases} \]
where $t_0 = \sqrt{\log(A/p)}$ is the solution to $p = A e^{-t^2}$, and then using the calculation from the proof of Lemma 6.16 together with some numerics.
Exercise 5.42. The bound is $A (\kappa A)^{\alpha/(1-\alpha)} \exp(-\alpha\lambda t^2)$.
Exercise 5.43. To show (5.38), note that if $A = \{f \leq M\}$ and $B = \{f > M + \varepsilon\}$, then $\operatorname{dist}(A, B) \geq \varepsilon/L$. For the first assertion and $M = M_{1/4}$, the 1st quartile, consider $A = \{f \leq M_{1/4}\}$ and $B = \{f \geq M_f\}$, and similarly for the other quantiles.
Exercise 5.44. If we try to maximize $\mathbf{E} f$ among 1-Lipschitz functions with median 0, we may assume $f \geq 0$ (replace $f$ by its positive part $f^+$). Writing $\mathbf{E} f = \int_0^1 \sigma(\{f \geq t\}) \, dt$, we know from the solution of the isoperimetric problem that the extremal case is when $f_0(x) = \operatorname{dist}(x, A)$ for some half-sphere $A$ (the distance being the geodesic distance). And then $\mathbf{E} f_0 = \int_0^{\pi/2} V(t) \, dt \leq \sqrt{\pi/8n}$ from (5.5).
Exercise 5.45. It follows from the solution of the isoperimetric problem (and from the formula $\mathbf{E} f^2 = \int_0^\infty 2t\, \sigma(|f| \geq t) \, dt$) that among 1-Lipschitz functions with median 0, $\mathbf{E} f^2$ is maximal for $f_0(x) = \arcsin \langle x, u \rangle$ for some $u \in S^{n-1}$. For any Lipschitz function $f$, we have therefore $\operatorname{Var} f \leq \mathbf{E}(f - M_f)^2 \leq \operatorname{Var} f_0$. We compute
\[ \mathbf{E} f_0^2 = \int_0^{\pi/2} 2t\, \sigma(|f_0| > t) \, dt = 4 \int_0^{\pi/2} t\, V(\pi/2 - t) \, dt \leq \frac{2}{n}, \]
where we used (5.5). An example with variance $1/n$ is the function $x \mapsto \langle x, u \rangle$.
Note: The estimate $2/n$ is not quite sharp; it follows from Poincaré inequality (5.54) that the variance is at most $\frac{1}{n-1}$, since $n - 1$ is the first nontrivial eigenvalue of $-\Delta$ (the Laplacian) on $S^{n-1}$. Numerical evidence suggests that the optimal bound is of the form $\frac{1}{B}$, where $n - 1 + \frac{1}{3n} < B < n - 1 + \frac{1}{3(n-1)}$.
Exercise 5.46. Let $m = \mathbf{E} f$. It follows from Exercise 5.45 that $\operatorname{Var} f \leq 2/n$. Consequently, $m \leq q \leq \sqrt{m^2 + 2/n}$ and $q - m \leq \sqrt{2/n}$. In what follows, use the values from Table 5.2. The first inequality is then immediate since $m \leq q$. The second one is trivial if $t \leq \sqrt{2/n}$ or if $t > q$. If $t \in (\sqrt{2/n}, q]$ (which implies $t > q - m$), then
$$\mathbf{P}(f \leq q - t) = \mathbf{P}\big(f \leq m - (t - (q - m))\big) \leq e^{-n(t - (q - m))^2/2} \leq e \cdot e^{-nt^2/2}.$$
To derive the last inequality, use $t \leq q \leq \sqrt{m^2 + 2/n}$. A very ambitious reader may try to come up with a better estimate based on the sharper bound $\operatorname{Var} f \leq \frac{1}{n-1}$ (see the hint for Exercise 5.45).
Exercise 5.48. Apply the hypothesis to $f(x) = \min\{\operatorname{dist}(x, A), \varepsilon\}$.
Exercise 5.49. Consider $A = X \setminus B_\varepsilon$.
Exercise 5.50. The concavity of $g$ is a consequence of Ehrhard's inequality (Theorem 5.23). Since $g(M_f) = 0$, we conclude that the inequality $g(t) \leq \alpha(t - M_f)$ holds for some $\alpha > 0$ and every real $t$. This is equivalent to the statement $\gamma_n(\{f \leq t\}) \leq \mathbf{P}(Z \leq t)$ where $Z$ is an $N(M_f, \alpha^{-2})$ random variable. The conclusion follows since stochastic domination allows comparison of the expectations.
Exercise 5.51. The distribution of $(X_k)_{1 \leq k \leq N}$ is the image of $\gamma_N$ under an affine map.
Exercise 5.52. Consider the function $\tilde f$ defined on $S^{n-1}$ by $\tilde f(x) = \inf\{f(y) + L\,g(x, y) : y \in \Omega\}$. Show that $\tilde f$ is $L$-Lipschitz, coincides with $f$ on $\Omega$ and that $M_f$ is a central value for $\tilde f$. Then apply Corollary 5.32.
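The infimum formula above can be tried out on a finite sample. The following sketch assumes that $g$ is the geodesic distance on the sphere and that $\Omega$ is replaced by a finite set of points; the function names `geodesic` and `extend` are illustrative only.

```python
import numpy as np

def geodesic(x, y):
    """Geodesic (angular) distance between two unit vectors."""
    return float(np.arccos(np.clip(np.dot(x, y), -1.0, 1.0)))

def extend(points, values, L, x):
    """Evaluate the extension inf_y { f(y) + L * g(x, y) } over the sample points y."""
    return min(v + L * geodesic(x, y) for y, v in zip(points, values))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(20, 3))
    pts /= np.linalg.norm(pts, axis=1, keepdims=True)
    vals = pts[:, 0]  # first coordinate: 1-Lipschitz for the geodesic distance
    x = np.array([0.0, 0.0, 1.0])
    print(extend(pts, vals, 1.0, x))
```

Being an infimum of $L$-Lipschitz functions, the extension is $L$-Lipschitz, and it agrees with the data on the sample as soon as the data are $L$-Lipschitz there.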
Exercise 5.53. Use the fact that for $B \subset Y$ and $\varepsilon > 0$, $\varphi^{-1}(B_\varepsilon) \supset \varphi^{-1}(B)_\varepsilon$. The statement about the median is an immediate consequence of (5.39). For the statement about expectation, restrict the supremum on the right-hand side to functions of the form $g \circ \varphi$. If $\varphi$ is $L$-Lipschitz, one needs to replace $A_\varepsilon$ by $A_{\varepsilon/L}$ in (5.39), $\mu(f - \mathbf{E} f > t)$ by $\mu(f - \mathbf{E} f > t/L)$ in (5.40), and similarly for the median.

Exercise 5.54. If $n = 1$, the function $x \mapsto \Phi(x)$ (the c.d.f. of the $N(0, 1)$ distribution, see (A.3)) pushes forward $\gamma_1$ to the Lebesgue measure on $[0, 1]$ and is Lipschitz with constant $(2\pi)^{-1/2}$, which allows one to transfer the results on Gaussian concentration to $[0, 1]$. For general $n \in \mathbb{N}$, consider the surjection $\varphi : \mathbb{R}^n \to [0, 1]^n$ given by $\varphi\big((x_j)\big) = \big(\Phi(x_j)\big)$.

Exercise 5.55. (i) $x \mapsto \sup\{t : \nu(F(x, \cdot) \geq t) \geq 1/2\}$ is 1-Lipschitz, and similarly for the other term. (ii) If $f(x, y) > M_\varphi + t$, then either $\varphi(x) > M_\varphi + t/2$ or $f(x, y) > \varphi(x) + t/2$. (iii) $(\frac12)^2 = \frac14$. (iv) Argue as in (ii).


Exercise 5.56. (ii) For $f : X_1 \times X_2 \to \mathbb{R}$ 1-Lipschitz, $x_1 \in X_1$ and $x_2 \in X_2$, introduce the functions $f_{x_2}(x_1) = f(x_1, x_2)$ and $g(x_2) = \int f_{x_2}\,d\mu_1$, and show that they are 1-Lipschitz.

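Exercise 5.58. Let $\mathcal{B}$ be an orthonormal basis of $\mathsf{M}_{k,n-k}$ as a real space. We claim that for any $X \in \mathsf{M}_{k,n-k}$,
$$\frac12 \sum_{Y \in \mathcal{B}} \Big( \|XY^\dagger - YX^\dagger\|_{\mathrm{HS}}^2 + \|X^\dagger Y - Y^\dagger X\|_{\mathrm{HS}}^2 \Big) = \alpha \|X\|_{\mathrm{HS}}^2 \tag{E.3}$$
where $\alpha = n - 2$ in the real case and $\alpha = 2n$ in the complex case. The sum in (E.3) does not depend on the choice of the orthonormal basis since it is the trace of a quadratic form. Write the singular value decomposition of $X$ as $X = \sum s_j |e_j\rangle\langle f_j|$ where $(e_1, \ldots, e_k)$ and $(f_1, \ldots, f_{n-k})$ are orthonormal bases. For $1 \leq a \leq k$ and $1 \leq b \leq n - k$, set $E_{ab} := |e_a\rangle\langle f_b|$ and $F_{ab} := i|e_a\rangle\langle f_b|$ and consider the orthonormal basis of $\mathsf{M}_{k,n-k}$ formed by $(E_{ab})$ (in the real case) or $(E_{ab}) \cup (F_{ab})$ (in the complex case). We compute
$$\frac12 \|XE_{ab}^\dagger - E_{ab}X^\dagger\|_{\mathrm{HS}}^2 + \frac12 \|X^\dagger E_{ab} - E_{ab}^\dagger X\|_{\mathrm{HS}}^2 = \begin{cases} s_a^2 + s_b^2 & \text{if } a \neq b, \\ 0 & \text{if } a = b, \end{cases}$$
$$\frac12 \|XF_{ab}^\dagger - F_{ab}X^\dagger\|_{\mathrm{HS}}^2 + \frac12 \|X^\dagger F_{ab} - F_{ab}^\dagger X\|_{\mathrm{HS}}^2 = \begin{cases} s_a^2 + s_b^2 & \text{if } a \neq b, \\ 4s_a^2 & \text{if } a = b, \end{cases}$$
and (E.3) follows by summing over $a, b$. In the above formulas it is tacitly assumed that $s_j = 0$ for $j > \min\{k, n - k\}$.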
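Identity (E.3) is easy to sanity-check numerically. The sketch below uses the standard matrix units as the orthonormal basis (any basis gives the same sum, as noted in the hint); the function name `e3_left_side` is ours and the code is an illustration, not part of the original argument.

```python
import numpy as np

def e3_left_side(X, complex_case=False):
    """Sum over an orthonormal basis Y of
    (1/2) * (||X Y^+ - Y X^+||_HS^2 + ||X^+ Y - Y^+ X||_HS^2),
    using matrix units (and i times matrix units in the complex case)."""
    k, m = X.shape
    units = [1.0, 1j] if complex_case else [1.0]
    Xd = X.conj().T
    total = 0.0
    for a in range(k):
        for b in range(m):
            for u in units:
                Y = np.zeros((k, m), dtype=complex)
                Y[a, b] = u
                Yd = Y.conj().T
                # np.linalg.norm of a 2-D array is the Frobenius (Hilbert-Schmidt) norm
                total += 0.5 * (np.linalg.norm(X @ Yd - Y @ Xd) ** 2
                                + np.linalg.norm(Xd @ Y - Yd @ X) ** 2)
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    k, m = 2, 3
    n = k + m
    X = rng.normal(size=(k, m))
    print(e3_left_side(X), (n - 2) * np.linalg.norm(X) ** 2)
```

For a real $k \times (n-k)$ matrix the two printed numbers should agree, and in the complex case the right-hand side becomes $2n\|X\|_{\mathrm{HS}}^2$.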

Exercise 5.59. For $\mathsf{U}(n)$, note that $\mathfrak{u}_n$ (= the skew-Hermitian matrices) contains a central element $u_1 := i\,\mathrm{I}/\sqrt{n}$, and so it follows from (5.44) and (5.45) that $\operatorname{Ric}_{\mathrm{I}}(u_1) = 0$. In the case of $\mathsf{SO}(n)$, consider the orthonormal basis of $\mathfrak{so}_n$ of matrices of the form $S_{ij} = \frac{1}{\sqrt2}(|i\rangle\langle j| - |j\rangle\langle i|)$ and reduce to the case $X = S_{12}$. The argument for $\mathsf{SU}(n)$ is similar; note that $u_1 = i\,\mathrm{I}/\sqrt{n} \notin \mathfrak{su}_n$. For details of the computations for both $\mathsf{SO}(n)$ and $\mathsf{SU}(n)$, see Proposition E.15 in [AGZ10].
Exercise 5.60. Test the log-Sobolev inequality on the function $x \mapsto \exp(\lambda x)$ for some $\lambda \neq 0$. Alternatively, consider the function $F : x \mapsto x$ in (5.49) and let $t \to \infty$.
Exercise 5.61. (i) There is a contraction $\varphi : S^1 \to [0, \pi]$ which pushes forward $\sigma$ to the normalized Lebesgue measure. (ii) Consider the Fourier series of the function $f$ from (5.54). (iii) Consider $f(x) = \cos(\pi x)$.
Exercise 5.62. Use Jensen's inequality in the form $|P_t f|^p \leq P_t(|f|^p)$, and the relation $\int P_t g\,d\gamma_n = \int g\,d\gamma_n$ (justify!) applied for $g = |f|^p$. Note that the argument is much easier when $p = 2$, the contractivity following right away from (5.57).

Exercise 5.63. Use the fact that $\mathbf{E} \exp(\lambda Z) = \exp(\lambda^2/2)$ when $Z$ has an $N(0, 1)$ distribution. The result is $P_t f_\lambda = \exp(\lambda^2(1 - e^{-2t})/2)\, f_{\lambda e^{-t}}$. Since $\|f_\lambda\|_{L_p(\gamma_n)} = e^{p\lambda^2/2}$, the statement about sharpness follows by taking $\lambda \to \infty$.
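The closed form for $P_t f_\lambda$ can be compared with a direct quadrature. The sketch below assumes, as is standard but not restated in this hint, that $f_\lambda(x) = \exp(\lambda x)$ in dimension one and that the Ornstein-Uhlenbeck semigroup is given by Mehler's formula $P_t f(x) = \mathbf{E} f(e^{-t}x + \sqrt{1 - e^{-2t}}\,Z)$; the name `ou_semigroup` is ours.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite nodes and weights for the weight function exp(-x^2 / 2)
NODES, WEIGHTS = hermegauss(60)

def ou_semigroup(f, t, x):
    """Approximate P_t f(x) = E f(e^{-t} x + sqrt(1 - e^{-2t}) Z), Z ~ N(0, 1)."""
    rho = np.exp(-t)
    values = f(rho * x + np.sqrt(1 - rho**2) * NODES)
    return float(np.sum(WEIGHTS * values) / np.sqrt(2 * np.pi))

if __name__ == "__main__":
    lam, t, x = 0.7, 0.3, 0.4
    numeric = ou_semigroup(lambda y: np.exp(lam * y), t, x)
    closed = np.exp(lam**2 * (1 - np.exp(-2 * t)) / 2) * np.exp(lam * np.exp(-t) * x)
    print(numeric, closed)
```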
Exercise 5.64. Write $A_{s/n} = \{y \in \{-1, 1\}^n : (1 + \varepsilon)y_1 + y_2 + \cdots + y_n \leq m + s + \varepsilon\}$ for small $\varepsilon$, use Hoeffding's inequality (5.43) and take $\varepsilon \to 0$.


Exercise 5.65. If $\varepsilon < 1/n$, then $A_\varepsilon = A$, and so in that case we may have $\mu(A_\varepsilon) = \frac12$. Positive results follow from (5.59) and from Exercise 5.64.
Exercise 5.66. Try $N = 9$, and consider a Hamming ball of radius 1 plus any 4 of the 6 elements of its boundary.


so

Exercise 5.67. The second assertion of Theorem 5.54 can be restated as follows: If $K, L \subset \mathbb{R}^n$ satisfy $\operatorname{dist}(K, L) \geq t$ and one of $K, L$ is convex, then $\mu(K)\mu(L) \leq e^{-t^2/2}$.

Exercise 5.68. Consider the supremum of all 1-Lipschitz affine functions that are
smaller than f on K.
Exercise 5.69. We have for $y \in \{-1, 1\}^n$
$$f(y) = \frac{1}{\sqrt2} \inf\Big\{|y - z| : z \in \{-1, 1\}^n,\ \sum z_i \leq 0\Big\}$$
(this formula is valid for $n$ even and has to be slightly modified for $n$ odd), so $f$ is $\frac{1}{\sqrt2}$-Lipschitz. The bound on the probability follows from the central limit theorem.
Exercise 5.70. Write $\mathbf{E}|f(G)|^p = \int_0^\infty p\,t^{p-1}\,\mathbf{P}(|f(G)| > t)\,dt$ and use the Gaussian isoperimetric inequality in the form given in Theorem 5.24.
Exercise 5.71. First note that clearly $\|X\|_{L_2}^2 = \sum_i |a_i|^2$. Next, for $p \geq 2$, use Proposition 5.58 and the fact that $\|\varepsilon_i\|_{\psi_2} = 1$ (this gives $B_p = O(\sqrt{p})$, which is the correct order of magnitude). The case $p = 1$ (and hence $p \in (1, 2)$) follows then from the inequality $\mathbf{E}|X| \geq (\mathbf{E} X^2)^{3/2}/(\mathbf{E} X^4)^{1/2}$. An alternative approach is to appeal to Theorem 5.56 to upper-bound higher moments of $X$ (or to Theorem 5.54 and to the fact that, for any nonnegative variable $W$, we always have $\mathbf{E} W \geq \frac12 M_W$).
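For small $n$ the moments of $X = \sum_i a_i\varepsilon_i$ can be computed exactly by enumerating all sign patterns, which makes the inequality used for $p = 1$ easy to inspect. The helper below is a hypothetical illustration; the coefficient vector in the usage example is arbitrary.

```python
import itertools
import math

def rademacher_moments(a):
    """Exact E|X|, E X^2 and E X^4 for X = sum_i a_i * eps_i with independent signs eps_i."""
    m1 = m2 = m4 = 0.0
    for signs in itertools.product((-1, 1), repeat=len(a)):
        x = sum(s * ai for s, ai in zip(signs, a))
        m1 += abs(x)
        m2 += x**2
        m4 += x**4
    count = 2 ** len(a)
    return m1 / count, m2 / count, m4 / count

if __name__ == "__main__":
    a = [3.0, 1.0, 2.0, 0.5]
    m1, m2, m4 = rademacher_moments(a)
    print(m2, sum(ai**2 for ai in a))        # second moment vs. sum of squares
    print(m1, m2**1.5 / math.sqrt(m4))       # E|X| vs. the Hoelder-type lower bound
```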
Exercise 5.72. By change of variables, reduce the problem to comparing the moments of a norm (or a seminorm) $\|\cdot\|$ on $\mathbb{R}^n$ calculated with respect to the normalized counting measure $\mu$ on $\{-1, 1\}^n$. Next, follow the last strategy from the hint to Exercise 5.71 combined with Theorem 5.54. The only difference is that while previously we got "for free" the fact that the Lipschitz constant of the linear function $(t_i) \mapsto \sum_i a_i t_i$ was exactly the same as $\|\sum_i a_i\varepsilon_i\|_{L_2}$, this is no longer automatically true for the function $(t_i) \mapsto \|(t_i)\|$. However, the Lipschitz constant and the median of $\|\cdot\|$ can still be related: if $K = \{(t_i) : \|(t_i)\| \leq M_{\|\cdot\|}\}$, then the Euclidean inradius of $K$ cannot be too small. This follows from the scalar case: if the Euclidean inradius of $K$ was small, then $K$ would be contained in a narrow band $\{t : |\langle t, a\rangle| \leq 1\}$ and, consequently, the median of the function $(t_i) \mapsto |\sum_i a_i t_i|$ would be at most 1, much smaller than its $L_2$-norm (equal to $|a|$), contradicting the argument from the hint to Exercise 5.71.
Exercise 5.73. (i) We have $\mathbf{E} X^n = 0$ if $n$ is odd; compare both Taylor series using the inequality $k^k k!/(2k)! \leq (e/4)^k$. (ii) Use Jensen's inequality.
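The elementary inequality $k^k k!/(2k)! \leq (e/4)^k$ can be checked over a range of $k$ by working with logarithms, which avoids overflow in the factorials. The function name `log_ratio` is ours.

```python
import math

def log_ratio(k):
    """log( k^k * k! / (2k)! ) - log( (e/4)^k ); nonpositive exactly when the inequality holds."""
    lhs = k * math.log(k) + math.lgamma(k + 1) - math.lgamma(2 * k + 1)
    rhs = k * (1 - math.log(4))
    return lhs - rhs

if __name__ == "__main__":
    for k in (1, 2, 5, 50, 500):
        print(k, log_ratio(k))
```

By Stirling's formula the printed values approach $-\frac12\log 2$ as $k$ grows, so the inequality holds with a constant factor to spare.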
Exercise 5.74. Use the bound on the Laplace transform obtained in Exercise 5.73(ii) to upper-bound the moments.


Exercise 5.75. The equality $\|Z\|_{\psi_2} = \sqrt{2/\pi}$ is equivalent to $\|Z\|_p \leq \sqrt{p}\,\|Z\|_1$ for $p \geq 1$. Unless $p$ is small, this follows from Stirling's formula (on which the asymptotic formula (5.63) is based). For small $p$ one can verify the inequality numerically. The inequality $\|\cdot\|_{\psi_1} \leq \|\cdot\|_{\psi_2}$ follows similarly from $\|T\|_p \geq \sqrt{p}$ for $p \geq 2$. (This is a very simple-minded approach; we will be grateful to a reader who supplies a nice rigorous argument.)

Exercise 5.76. (iii) Choose $\lambda$ to be the minimum of $1/(2\|a\|_\infty)$ and $t/(4\sum a_i^2)$.

Chapter 6
Exercise 6.1. Let $T = [0, 1] = \Omega$ and let $f : [0, 1] \to \mathbb{R}_+$ be an arbitrary function. Define $X_t(t) = f(t)$ and $X_t(\omega) = 0$ if $\omega \neq t$.

Exercise 6.2. Define $t_N > 0$ by the formula $\exp(t_N^2/2) = N/\log^{3/2} N$ and check using (A.4) that $\mathbf{P}(M \leq t_N) = O(1/N^c)$ for some constant $c > 0$, where $M = \max\{X_k : 1 \leq k \leq N\}$. Conclude that $\mathbf{E} M \geq \sqrt{2\log N} - O\Big(\frac{\log\log N}{\sqrt{\log N}}\Big)$ (handle $\mathbf{E} M^+$ and $\mathbf{E} M^-$ separately). See [DLS14] for more precise bounds.
Exercise 6.3. The suggested inequality follows from the formula
$$\mathbf{E} Y = \int_0^\infty \mathbf{P}(Y > t)\,dt,$$
valid for any nonnegative random variable $Y$.
Exercise 6.4. By Carathéodory's theorem (Theorem 1.2), $K$ equals the union of a family of simplices, each of which has vertices of the form $x_{k_1}, \ldots, x_{k_n}, x_{k_{n+1}}$. Next, upper-bound the number of such simplices (simple combinatorics) and the volume of each simplex (Hadamard's inequality). Note: If $N \gg n$, however, this argument does not yield the logarithmic dependence on $N/n$ from Remark 6.4.
Exercise 6.5. Proceed as in Proposition 6.3 using Lemma 6.2 instead of Lemma 6.1. In the nontrivial range $2\log(2N) \leq n$, use the inequality $\kappa_n > \sqrt{n - 1}$.
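The inequality $\kappa_n > \sqrt{n-1}$ can be inspected numerically. The sketch below assumes that $\kappa_n$ denotes $\mathbf{E}|G|$ for a standard Gaussian vector $G$ in $\mathbb{R}^n$, so that $\kappa_n = \sqrt2\,\Gamma((n+1)/2)/\Gamma(n/2)$; this definition is not restated in the hint and is an assumption here, as is the name `kappa`.

```python
import math

def kappa(n):
    """E|G| for a standard Gaussian vector in R^n (assumed meaning of kappa_n),
    computed as sqrt(2) * Gamma((n+1)/2) / Gamma(n/2) via log-gamma for stability."""
    return math.sqrt(2) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))

if __name__ == "__main__":
    for n in (1, 2, 10, 100, 1000):
        print(n, math.sqrt(n - 1), kappa(n), math.sqrt(n))
```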
Exercise 6.6. We compute the Gaussian mean width $w_G(\cdot) = \kappa_n w(\cdot)$. We have by linearity of expectation that $w_G(B_\infty^n) = n\,\mathbf{E}|Z| = n\sqrt{2/\pi}$ where $Z$ is an $N(0, 1)$ random variable. It follows from Lemmas 6.1 and 6.2 that $(1 - o(1))\sqrt{2\log n} \leq w_G(\Delta_{n-1}) \leq w_G(B_1^n) \leq \sqrt{2\log n}$.
Exercise 6.7. With the assumption that $\mathbf{E} X_k^2 = \mathbf{E} Y_k^2$ this is immediate by integration (take $\lambda_k = t$ for all $k$). Without this assumption, let $Z$ be an $N(0, 1)$ random variable independent of $(X_k)$, $(Y_k)$. For $0 < t < 1$ and $R$ large enough, define new processes $(\bar X_k)$ and $(\bar Y_k)$ by $\bar X_k = tX_k + \alpha_k Z$ and $\bar Y_k = Y_k + \beta_k Z$, where the positive numbers $\alpha_k, \beta_k$ are adjusted so that $\mathbf{E} \bar X_k^2 = \mathbf{E} \bar Y_k^2 = R^2$. Check that for $R$ large enough, the second part of Slepian's lemma can be applied to $(\bar X_k)$ and $(\bar Y_k)$. Check also that $\mathbf{E} \sup \bar X_k = R + t\,\mathbf{E} \sup X_k + O(1/R)$ and similarly for $(\bar Y_k)$, so that letting $R \to \infty$ and then $t \to 1$ yields (6.7).
Exercise 6.8. Any centered Gaussian measure is the pushforward of the standard
Gaussian measure by a linear transformation.
Exercise 6.9. Without loss of generality, $L = \{(x_1, \ldots, x_n) \in \mathbb{R}^n : |x_1| \leq t\}$ for some $t > 0$. Define $f(s) = \gamma_{n-1}\big(\{y \in \mathbb{R}^{n-1} : (s, y) \in K\}\big)$, an even function of $s$. By (4.28), the function $\log f$ is concave, and therefore decreasing on $[0, +\infty)$. It follows (differentiate) that
$$\int_0^t f\,d\gamma_1 \geq 2\gamma_1([0, t]) \int_0^\infty f\,d\gamma_1,$$
which is equivalent to the statement $\gamma_n(K \cap L) \geq \gamma_n(L)\gamma_n(K)$.

We now prove Proposition 6.9. Without loss of generality we may assume that $X_k = \langle G, x_k\rangle$, where $G$ is a standard Gaussian vector in $\mathbb{R}^d$ (for some $d \leq N$), and $x_1, \ldots, x_N \in \mathbb{R}^d$. We apply (6.13) to $L = \{x \in \mathbb{R}^d : |\langle x, x_1\rangle| \leq t_1\}$ and $K = \{x \in \mathbb{R}^d : |\langle x, x_k\rangle| \leq t_k \text{ for } 2 \leq k \leq N\}$ in order to obtain
$$\mathbf{P}(|X_k| \leq t_k \text{ for } 1 \leq k \leq N) \geq \mathbf{P}(|X_1| \leq t_1)\,\mathbf{P}(|X_k| \leq t_k \text{ for } 2 \leq k \leq N).$$
The result follows by induction on $N$.


Exercise 6.10. Slepian's lemma (6.7) supplies candidates for the extremal configuration. The worst case is when $X$ contains only two elements.


Exercise 6.11. We have $w_G(K) = w_G(K_1) + \cdots + w_G(K_n) = \Theta(n)$ regardless of the choice of the sequence $(d_j)$. Now choose $d_j = 2^j$. Given $1 \leq j \leq k \leq n$, let $N_{j,k}$ be a minimal $2^{j - 3k/2}$-net in $K_j$ and $M_k = N_{k,1} \times \cdots \times N_{k,k} \times \{0\} \times \cdots \times \{0\}$. Check that $M_k$ is an $O(2^{-k/2})$-net of $K$ and that $\log \operatorname{card} M_k = O(2^k)$.
Exercise 6.12. The only case that is not straightforward is applying the volumetric bound (5.8) when $\varepsilon \gg 1$: the upper bound requires subtle estimates on $\operatorname{vol}(B_1^n + \delta B_2^n)$, where $\delta = \frac{\varepsilon}{2\sqrt{n}}$. However, it is not hard to see that, in any case, the upper bound is not tight: replace $B_1^n$ by the smaller set $B_2^n/\sqrt{n}$ and deduce that the ratio of volumes is greater than $e^{n/\varepsilon}$. The approach via Sudakov's inequality is much simpler and yields, for that range of $\varepsilon$, optimal or nearly optimal results.
Exercise 6.13. If $\operatorname{inrad}(K) \leq r$, then $K$ is contained in a symmetric band of width $2r$. For the second part, we use Markov's inequality to write $\gamma_n(\|\cdot\|_{K^\circ} > \varepsilon) \leq w(K)/\varepsilon \leq .317$, which implies $\gamma_n(\varepsilon K^\circ) \geq .683$ and therefore $N(B_2^n, \varepsilon K^\circ) = 1$.
Exercise 6.14. Apply Proposition 5.34 to the convex function $\|\cdot\|_{L^\circ}$.
Exercise 6.15. Since we are covering a Euclidean ball with translates of another
body, it is the dual Sudakov inequality (Proposition 6.11) that is relevant. Use the
value of the appropriate mean width from Table 4.1.
Exercise 6.16. The optimal $\theta$ equals $1 + \sqrt2$.
Exercise 6.17. The worst case is when the random variables $Y_i$ are non-negative and disjointly supported.
Exercise 6.18. Use the union bound to estimate $\mathbf{P}(\max_i Y_i > t)$ and argue as in the proof of Lemma 6.16.
Exercise 6.19. This is again similar to the proof of Lemma 6.16. First use (6.6) and the union bound to estimate $\mathbf{P}(\max_k X_k > t)$; this leads (as $n \to \infty$) to an expression involving the Riemann zeta function $\zeta(s)$. Then just use the fact that if $s \geq 2$, then $\zeta(s) \leq \zeta(2) = \frac{\pi^2}{6}$. The best bound for $\mathbf{E} \max_k X_k$ that can be obtained by this line of argument is about 1.724. On the other hand, it is not hard to see that $\mathbf{E} \max_k X_k > \sqrt2$ for $n$ large enough. The true value of $\mathbf{E} \sup_k X_k$ seems to be between 1.45 and 1.5. To get a lower bound on the Dudley integral, note that for $k \leq n$ the elements $X_1, \ldots, X_k$ are $\varepsilon$-separated with $\varepsilon = \sqrt{2/(1 + \log k)}$.
Exercise 6.20. Use Lemma 6.1 and (6.17).
Exercise 6.21. For $k \leq l \leq n$, we compute $\|X_k - X_l\|_2 = \sqrt{2 - 2\sqrt{k/l}}$. Since the family $(X_{2^j})_{j \leq \log n}$ is $\sqrt{2 - \sqrt2}$-separated, the lower bound follows from Sudakov's inequality. For the upper bound, use Dudley's inequality and the fact that, for $\alpha > 1$, the family $(X_{\lfloor \alpha^j \rfloor}) \cup (X_{\lceil \alpha^j \rceil})$ gives a $\sqrt{2 - 2/\sqrt{\alpha}}$-net with at most $2\log n/\log\alpha$ elements.
Exercise 6.22. Define a sequence $(a_k)$ by $a_k = \inf\{\eta > 0 : N(T, \eta) \leq 2^{2^k}\}$ for $k \geq 1$, $a_0$ being the radius of $T$. The right-hand side of (6.27) is exactly $\sum_{k=0}^\infty 2^{k/2} a_k$. To compare with the left-hand side, use the bound $2^{2^k} \leq N(T, \eta) \leq 2^{2^{k+1}}$ for $\eta \in [a_{k+1}, a_k]$, $k \geq 1$.
Exercise 6.23. Consider the sets $T_k = \{X_1, \ldots, X_{2^{2^k} - 1}, X_n\}$.
Exercise 6.25. If $\|X - Y\|_{L_\infty} \leq \varepsilon$, then $f(Y) \geq g(X)$.


so

Exercise 6.26. The direction that is not entirely straightforward is showing that $d_\infty(X, Y)$ does not exceed the infimum in (6.32). If $\tilde X = F_X^{-1} : (0, 1) \to \mathbb{R}$ (the inverse function) exists, then, when considered as a random variable with respect to the Lebesgue measure, its law is the same as that of $X$. With care, such $\tilde X$ can be defined also if $F_X$ is not strictly increasing and/or discontinuous. Given $X, Y$, what is $\|\tilde X - \tilde Y\|_\infty$? This argument shows also that the infimum in (6.30) is attained.
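For two empirical measures with the same number of equally weighted atoms, the quantile functions $\tilde X, \tilde Y$ are step functions, and $\|\tilde X - \tilde Y\|_\infty$ reduces to the largest gap between order statistics. The sketch below covers only this special case; the function name `d_infinity` is ours.

```python
import numpy as np

def d_infinity(xs, ys):
    """sup-distance between quantile functions of two equal-size samples,
    i.e. the largest gap between matched order statistics."""
    xs = np.sort(np.asarray(xs, dtype=float))
    ys = np.sort(np.asarray(ys, dtype=float))
    if xs.shape != ys.shape:
        raise ValueError("samples must have the same size")
    return float(np.max(np.abs(xs - ys)))

if __name__ == "__main__":
    # Two atoms at 0 and one at 1, versus one atom at 0 and two at 1.
    print(d_infinity([0, 0, 1], [0, 1, 1]))
    print(d_infinity([3, 1, 2], [2.5, 1.5, 3.5]))
```

The first example shows the phenomenon behind Exercise 6.29: moving a small amount of mass across a gap of length 1 costs distance 1, however small the mass.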
Exercise 6.27. Case 1° (the bounded case). If $\|Y_n\|_\infty \leq M$ for some finite $M$ and all $n$, then also $\|Z\|_\infty \leq M$. Now approximate $f$ on $[-M, M]$ by a Lipschitz function and apply (6.31). Case 2° (the general case). Let $\varepsilon > 0$ and choose $M$ so that $\mathbf{P}(|Z| > M) < \varepsilon$. Then, for all sufficiently large $n$, $\mathbf{P}(|Y_n| > M + 1) < \varepsilon$. Apply Case 1° to the $Y_n$'s and $Z$ truncated at the level $M + 1$, and then let $\varepsilon \to 0$. (The last step uses the hypothesis that $f$ is bounded.)
Exercise 6.28. See the hint to Exercise 6.27; note that under the present hypotheses Case 1° always holds. For an example, consider $Z$ with distribution $N(0, 1)$, $Y_n = Z + \frac1n$ and $f(x) = \frac{e^{x^2/2}}{1 + x^2}$.
Exercise 6.29. The measures $(\frac12 + \frac1n)\delta_0 + (\frac12 - \frac1n)\delta_1$ converge weakly but do not converge in $\infty$-Wasserstein distance, as $n$ tends to infinity.
Exercise 6.30. The function $A \mapsto \lambda_k(A)$ is 1-Lipschitz with respect to the operator norm. It is remarkable that a similar inequality (with an additional multiplicative constant $C < 3$ on the right-hand side) holds for normal matrices [BDM83, BDK89].
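The 1-Lipschitz property of the ordered eigenvalues (Weyl's inequality) can be observed on random self-adjoint matrices. The following sketch uses real symmetric matrices for simplicity; the function name `eigenvalue_gap` is ours.

```python
import numpy as np

def eigenvalue_gap(A, B):
    """max_k |lambda_k(A) - lambda_k(B)| for symmetric A, B, with eigenvalues sorted consistently."""
    return float(np.max(np.abs(np.linalg.eigvalsh(A) - np.linalg.eigvalsh(B))))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G, H = rng.normal(size=(6, 6)), rng.normal(size=(6, 6))
    A, B = (G + G.T) / 2, (H + H.T) / 2
    # ord=2 gives the operator (spectral) norm of a matrix
    print(eigenvalue_gap(A, B), np.linalg.norm(A - B, 2))
```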
Exercise 6.31. For (2), use the fact that the image of a standard Gaussian vector under the orthogonal projection onto a subspace is the standard Gaussian vector in that subspace.
Exercise 6.32. Show that if a random matrix $X \in \mathsf{M}_n^{\mathrm{sa}}$ is unitarily invariant, then $U \operatorname{Diag}(X) U^\dagger$ (where $U$ is a Haar-distributed random unitary matrix independent of $X$) has the same distribution as $X$.

rd
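Exercise 6.33. If $\mathcal{N}$ is an $\varepsilon$-net in $(S_{\mathbb{C}^n}, |\cdot|)$, show (argue as in the proof of Lemma 5.9) that for any $A \in \mathsf{M}_n$,
$$\|A\|_\infty = \sup_{x, y \in S_{\mathbb{C}^n}} |\langle x|A|y\rangle| \leq \frac{1}{1 - 2\varepsilon} \sup_{x, y \in \mathcal{N}} |\langle x|A|y\rangle|.$$
Then use Proposition 6.3.
Exercise 6.34. Use Exercise 6.30 with $A$ a $\mathrm{GUE}(n)$ matrix and $B = A - \frac{\operatorname{Tr} A}{n}\,\mathrm{I}$.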

Exercise 6.35. Show that the function $(z - x_+)(z - x_-)$ admits an analytic square root $g_\lambda : \mathbb{C} \setminus [x_-, x_+] \to \mathbb{C}$ such that $g_\lambda(x) > 0$ for $x \in (x_+, \infty)$, $g_\lambda(x) < 0$ for $x \in (-\infty, x_-)$, and $\lim_{y \to 0^\pm} g_\lambda(x + iy) = \pm i\sqrt{(x - x_-)(x_+ - x)}$ for $x \in [x_-, x_+]$. It follows that if $M := \int_{x_-}^{x_+} f_\lambda$, then $\int_\gamma \frac{g_\lambda(z)}{z}\,dz = 2iM$ for any closed path $\gamma$ which circles $[x_-, x_+]$ once in the clockwise direction, but does not wind around 0. To evaluate the path integral over $\gamma$ we choose $R > x_+$ and set $\Gamma(t) = Re^{it}$, $0 \leq t \leq 2\pi$, and note that
(i) $\int_\gamma \frac{g_\lambda(z)}{z}\,dz + \int_\Gamma \frac{g_\lambda(z)}{z}\,dz = 2\pi i\,g_\lambda(0)$ by the Cauchy integral formula, or by the residue theorem,
(ii) $\int_\Gamma \frac{g_\lambda(z)}{z}\,dz$ can be related to the constant term of the Laurent expansion of $g_\lambda$, which in turn can be found by subtracting the dominant (as $z \to \infty$) term $z$ and considering the limit of $g_\lambda(x) - x$ as $x \to +\infty$.

An alternative argument is to perform successive substitutions $x = y + (1 + \lambda)$, $y = 2\sqrt{\lambda}\,u$, $u = \cos t$, and to recognize the resulting expression as an integral involving the Poisson kernel $P_r(t)$ for $r = \sqrt{\lambda}$.

Either approach also allows one to find the expected value and the variance of $\mathrm{MP}(\lambda)$, the calculation being in both cases simpler than the one sketched above.
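The substitutions $x = (1 + \lambda) + 2\sqrt\lambda\cos t$ also give a convenient way to evaluate such integrals numerically. The sketch below assumes, for $\lambda \geq 1$, the density $f_\lambda(x) = \frac{1}{2\pi x}\sqrt{(x - x_-)(x_+ - x)}$ on $[x_-, x_+]$ with $x_\pm = (1 \pm \sqrt\lambda)^2$; this normalization is our assumption (the hint does not restate it), and the name `mp_moment` is ours.

```python
import math

def mp_moment(lam, power, grid=20000):
    """Midpoint-rule value of the integral of x^power * f_lam(x) over [x_-, x_+],
    after substituting x = (1 + lam) + 2 sqrt(lam) cos t with 0 < t < pi.

    Under the assumed density, f_lam(x) dx = 2 lam sin(t)^2 / (pi x) dt.
    """
    total = 0.0
    for j in range(grid):
        t = (j + 0.5) * math.pi / grid
        x = 1 + lam + 2 * math.sqrt(lam) * math.cos(t)
        total += x ** (power - 1) * math.sin(t) ** 2
    return total * (2 * lam / math.pi) * (math.pi / grid)

if __name__ == "__main__":
    lam = 2.0
    mass, mean, second = (mp_moment(lam, p) for p in (0, 1, 2))
    print(mass, mean, second - mean**2)  # total mass, expected value, variance
```

Under this normalization the total mass is 1 and both the expected value and the variance equal $\lambda$.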
Exercise 6.36. The equality is easily verified. To extend Theorem 6.28 to $\lambda < 1$, let $W_1$ and $W_2$ be as in the paragraph following (6.39), and note that for $s \geq n$, $\mu_{\mathrm{sp}}(W_2) = (1 - n/s)\delta_0 + (n/s)\,\mu_{\mathrm{sp}}(W_1)$.
Exercise 6.37. Couple $W$ and $X$, defined as in (6.39) and (6.40), by realizing $\psi_i$ as $G_i/|G_i|$. Using Exercise 6.30, it follows that $d_\infty(\mu_{\mathrm{sp}}(n^{-1}W), \mu_{\mathrm{sp}}(X)) \leq \sup\{|1 - n^{-1}|G_i|^2| : 1 \leq i \leq s\}$. This tends to zero in probability, by Corollary 5.27.
Exercise 6.38. By Theorem 6.28, the eigenvalue distribution of $BB^\dagger$ approaches a $\mathrm{MP}(1)$ distribution, therefore the singular value distribution of $B$ approaches the law of $\sqrt{X^2}$, where $X \sim \mu_{\mathrm{SC}}$.
Exercise 6.39. For (a), use Lemma 6.20. The argument from (b) does not justify changing the order of the limits. A separate question (to which the authors do not know the answer) is whether we do actually have uniform convergence of $W_{n,s}/n$ to $X_{s/n}$ in $\infty$-Wasserstein distance as $n, s \to \infty$.
Exercise 6.40. If $W = BB^\dagger$, then $2\operatorname{Tr} W = \sum_{ij} 2|\operatorname{Re} B_{ij}|^2 + 2|\operatorname{Im} B_{ij}|^2$ is the sum of $2ns$ squared independent $N(0, 1)$ variables.
Exercise 6.41. Let $\psi \in S_{\mathbb{C}^n}$ be uniformly distributed, $A = |\psi\rangle\langle\psi| - \mathrm{I}/n$, and $B$ be a $\mathrm{GUE}_0(n)$ random matrix. By symmetry, the covariances of $A$ and $B$ (considered as $\mathsf{M}_n^{\mathrm{sa}}$-valued random vectors) are proportional, i.e., there exists $\beta > 0$ such that $\mathbf{E} \operatorname{Tr}(AM)^2 = \beta^2\,\mathbf{E} \operatorname{Tr}(BM)^2$ for every $M \in \mathsf{M}_n^{\mathrm{sa}}$. We compute that $\beta = 1/\sqrt{n(n - 1)}$, and the result follows from the multivariate central limit theorem.
Exercise 6.42. Use Proposition 6.34 and Proposition A.1(ii).

rd
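Exercise 6.43. (i) This is more transparent if we think of $S_{\mathbb{C}^n \otimes \mathbb{C}^s}$ as the Hilbert–Schmidt sphere $S_{\mathrm{HS}} \subset \mathsf{M}_{n,s}$, and identify $\rho$ and $AA^\dagger$, with $A$ uniformly distributed on $S_{\mathrm{HS}}$. The function becomes $f(A) = |A^\dagger y|$ and is 1-Lipschitz for $\|\cdot\|_\infty$, hence for $\|\cdot\|_{\mathrm{HS}}$. To apply Exercise 5.46 we identify $S_{\mathrm{HS}}$ with $S^{2ns-1}$. (ii) Given $x \in S_{\mathbb{C}^n}$, let $y \in \mathcal{N}$ with $|x - y| \leq \delta$ and write
$$|\langle x|\Delta|x\rangle| \leq |\langle y|\Delta|y\rangle| + |\langle x - y|\Delta|y\rangle| + |\langle x|\Delta|x - y\rangle| \leq |\langle y|\Delta|y\rangle| + 2\delta\|\Delta\|_\infty,$$
then take the supremum over $x$. (iii) Choose for example $\delta = 1/4$. Using Lemma 5.3, the union bound and (ii), we have
$$\mathbf{P}\big(\|\Delta\| \geq 48/\sqrt{ns}\big) \leq 8^{2n}\,\mathbf{P}\big(|f^2 - 1/n| \geq 24/\sqrt{ns}\big) \leq 8^{2n}\,\mathbf{P}\big(|f - 1/\sqrt{n}| \geq 4/\sqrt{s}\big).$$
(The last inequality is valid whenever $s \geq n$.) By (i), it follows that $\mathbf{P}(\|\Delta\| \geq 48/\sqrt{ns}) \leq 64^n (1 + e) \exp(-16n)$, which tends to 0 as $n$ tends to infinity.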

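Exercise 6.44. Combine the results from Exercise 2.19 and Exercise 6.38.
Exercise 6.45. (i) Expand $\mathbf{E} \operatorname{Tr} GG^\dagger GG^\dagger = \sum_{i,k=1}^n \sum_{j,l=1}^s \mathbf{E}[G_{ij}\overline{G_{kj}}G_{kl}\overline{G_{il}}]$ and notice that by independence $\mathbf{E}[G_{ij}\overline{G_{kj}}G_{kl}\overline{G_{il}}] = \mathbf{1}_{\{i=k\}} + \mathbf{1}_{\{j=l\}}$ (using the value $\mathbf{E}|Z|^4 = 2$ for $Z \sim N_{\mathbb{C}}(0, 1)$). The second computation is similar. (ii) Write $GG^\dagger$ as the product of independent random variables $\frac{GG^\dagger}{\operatorname{Tr} GG^\dagger} \times \operatorname{Tr} GG^\dagger$ and use (i).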
so

Exercise 6.46. Notice that for fixed $t \in \mathbb{R}^n$, $\mathbf{E} \max_{u \in L} \langle Bt, u\rangle = |t|\,w_G(L)$, and similarly with $K$ and $L$ switched.
Exercise 6.47. The inequality from (i) can be rewritten as
$$\big(|x||y| - |x'||y'|\big)^2 + 2\big(|x||x'| - \langle x, x'\rangle\big)\big(|y||y'| - \langle y, y'\rangle\big) \geq 0.$$
Part (ii) is proved similarly. For (iii), this fails already in dimension 1: if $x = x' = 1$ and $y = y' = e^{i\varepsilon}$, then as $\varepsilon$ tends to zero, $|x \otimes x' - y \otimes y'| \sim 2\varepsilon$ while $|(x, x') - (y, y')| \sim \sqrt2\,\varepsilon$.
Exercise 6.48. If $A$ is a $\mathrm{GOE}(n)$ matrix and $G$ is a standard Gaussian vector in $\mathbb{R}^n$, consider the processes $X_t = \langle t, At\rangle = \operatorname{Tr}\big(A|t\rangle\langle t|\big)$ and $Y_t = \langle G, t\rangle$, both indexed by $t \in S^{n-1}$. Check that for $s, t \in S^{n-1}$,
$$\|X_s - X_t\|_{L_2}^2 = 2\big\||s\rangle\langle s| - |t\rangle\langle t|\big\|_2^2 \leq 4|s - t|^2 = 4\|Y_s - Y_t\|_{L_2}^2$$
and conclude by Slepian's lemma that $\mathbf{E} \lambda_1(A) \leq 2\kappa_n$. The reason for a factor 2 in the first equality is that $A$ is a standard Gaussian vector in the space $\mathsf{M}_n^{\mathrm{sa}}$ times $\sqrt2$. The argument for the inequality is a special case of that from Exercise 6.47, but using the bra-ket notation makes it easier to rewrite it when $A$ is a $\mathrm{GUE}(n)$ matrix, in which case we get $\mathbf{E} \lambda_1(A) \leq \sqrt2\,\kappa_{2n}$.
Exercise 6.49. Let $G_1, G_2, G_3$ be standard Gaussian vectors in $\mathbb{R}^m$, $\mathbb{R}^n$, $\mathbb{R}^m \otimes \mathbb{R}^n$ respectively. Compare the processes $X_{(t,u)} = \langle G_3, t \otimes u\rangle$ and $Y_{(t,u)} = r_L\langle G_1, t\rangle + r_K\langle G_2, u\rangle$ (indexed by $(t, u) \in K \times L$) via Slepian's lemma. To deduce the inequality for the usual mean width, use Proposition A.1(ii).
Exercise 6.50. Here is an outline of the complex case, the real case being similar. Proceed inductively as follows. Choose a (random) unitary matrix $V_0 \in \mathsf{U}(s)$ with the property that the matrix $BV_0$ has a zero entry at position $(1, j)$ for $j > 1$, while the $(1, 1)$-entry $\alpha$ is positive (note that $\alpha$ follows a $\chi(s)$ distribution). Then choose a (random) unitary matrix $U_0 \in \mathsf{U}(n)$ with the properties that $U_0|1\rangle = |1\rangle$ and that the matrix $U_0BV_0$ has a zero entry at position $(i, 1)$ for $i > 2$, while the $(2, 1)$-entry $\beta$ is positive (note that $\beta$ follows a $\chi(n - 1)$ distribution). Repeat the procedure with the $(n - 1) \times (s - 1)$ bottom right block of $U_0BV_0$, which has independent $N_{\mathbb{C}}(0, 1)$ entries and is independent of $\alpha, \beta$.
Once the Lemma is proved, the second part of the exercise follows formally from the facts that (a) $B$ has the same distribution as $WBX$ where $W \in \mathsf{U}(n)$ and $X \in \mathsf{U}(s)$ are Haar-distributed and independent of $B$ and (b) if $U$ is a random or deterministic unitary matrix and $W$ is Haar-distributed and independent of $U$, then $UW$ is Haar-distributed and independent of $U$.
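The inductive procedure can be mimicked numerically in the real case with Householder reflections. The sketch below only illustrates the structural part (alternating right and left rotations produce a lower bidiagonal matrix with nonnegative entries and the same singular values); it does not check the distributional claims, and the function names are ours.

```python
import numpy as np

def householder(x):
    """Symmetric orthogonal matrix H with H @ x = |x| * e_1."""
    v = np.array(x, dtype=float)
    v[0] -= np.linalg.norm(x)
    vv = v @ v
    if vv < 1e-30:          # x is already a nonnegative multiple of e_1
        return np.eye(len(v))
    return np.eye(len(v)) - 2.0 * np.outer(v, v) / vv

def lower_bidiagonalize(B):
    """Return U B V for orthogonal U, V chosen step by step as in the hint (real case)."""
    M = np.array(B, dtype=float)
    n, s = M.shape
    for i in range(min(n, s)):
        # right rotation on columns i..s-1: row i becomes (alpha, 0, ..., 0)
        M[:, i:] = M[:, i:] @ householder(M[i, i:])
        if i + 1 < n:
            # left rotation fixing rows 0..i: column i below the diagonal becomes (beta, 0, ..., 0)
            M[i + 1:, :] = householder(M[i + 1:, i]) @ M[i + 1:, :]
    return M

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.normal(size=(4, 6))
    print(np.round(lower_bidiagonalize(B), 3))
```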
N

Exercise 6.51. (i) Write $R = UAV$ as in Lemma 6.39, use Jensen's inequality to obtain $\mathbf{E}\|A\| \geq \|\mathbf{E} R\| = \|M\|$. (ii) Write $\|M\| \geq |Mx|/|x|$ where $x$ is the vector $(1, \ldots, 1, 0, \ldots, 0)$ with $k$ occurrences of "1", and use the lower bounds $\kappa_{s+1-i} \geq \sqrt{s - k}$ and $\kappa_{n+1-i} \geq \sqrt{n - k}$ for $2 \leq i \leq k$. The whole argument applies to the complex case (with $\kappa_j^{\mathbb{C}}$ in place of $\kappa_j$).

Exercise 6.52. It is enough to show that the relation $\langle\Omega|P(a_i + a_i^\dagger)|\Omega\rangle = \int P\,d\mu_{\mathrm{SC}}$ holds for every polynomial $P$. We reduce to the case $P(X) = X^n$ and check by expansion that $\langle\Omega|(a_i + a_i^\dagger)^n|\Omega\rangle$ is the number of Dyck paths of length $2n$, which is the $n$th Catalan number, and also $\int x^n\,d\mu_{\mathrm{SC}}(x)$, see (6.34).
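The three quantities in this hint (Dyck path counts, Catalan numbers, semicircular moments) can be compared numerically. The sketch below indexes Dyck paths by their half-length $m$ and compares with the even moments of order $2m$; it assumes the standard normalization of the semicircle law, with density $\frac{1}{2\pi}\sqrt{4 - x^2}$ on $[-2, 2]$, which is our assumption. All function names are ours.

```python
import math

def dyck_paths(m):
    """Number of Dyck paths with 2m steps, by dynamic programming over the current height."""
    counts = {0: 1}
    for _ in range(2 * m):
        nxt = {}
        for h, c in counts.items():
            nxt[h + 1] = nxt.get(h + 1, 0) + c      # up-step
            if h > 0:
                nxt[h - 1] = nxt.get(h - 1, 0) + c  # down-step, staying nonnegative
        counts = nxt
    return counts.get(0, 0)

def catalan(m):
    return math.comb(2 * m, m) // (m + 1)

def semicircle_moment(p, grid=20000):
    """Integral of x^p against the assumed semicircle density, via x = 2 cos t (midpoint rule)."""
    total = 0.0
    for j in range(grid):
        t = (j + 0.5) * math.pi / grid
        total += (2 * math.cos(t)) ** p * math.sin(t) ** 2
    return total * (2 / math.pi) * (math.pi / grid)

if __name__ == "__main__":
    for m in range(6):
        print(m, dyck_paths(m), catalan(m), semicircle_moment(2 * m))
```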
na

Chapter 7
Exercise 7.1. First, even if $K$ is not symmetric, $\ell_K(-T) = \ell_K(T)$ due to the symmetry of $G$. The only part that is not straightforward is (ii). By homogeneity we may assume $\|T\|_{\mathrm{op}} = 1$, and using (i) we may also assume that $T$ is an extreme point of $S_\infty^n$ (the operator norm unit ball). This means that $T \in \mathsf{O}(n)$ (see Exercise 1.44), and then it follows from the rotational invariance of the Gaussian measure that $\ell_K(ST) = \ell_K(S)$. Note also that the second inequality in (v) is (1.13).
Exercise 7.2. No. Choosing $T$ to be a rank 1 operator, and $S$ a rotation, one would get from Proposition 7.1(v) that all 1-dimensional projections of $K^\circ$ have the same length.
Exercise 7.3. (i) Note that $|||\cdot|||_{B_2^n}$ is the Euclidean norm associated to the inner product (7.2), and so $\mathrm{K}(B_2^n)$ is the norm of $\tilde R_1$ as an element of $B(H_{k,n})$, which equals 1 since it is an orthogonal projection. (ii) First prove that $\mathrm{K}(K) = \mathrm{K}(TK)$ for any $T \in \mathsf{GL}(n, \mathbb{R})$; this follows from the formulas $|||\Theta|||_{TK} = |||\Theta \circ T^{-1}|||_K$ and $\tilde R_1(\Theta \circ T^{-1}) = \tilde R_1(\Theta) \circ T^{-1}$ for $\Theta \in H_{k,n}$. Then show $\mathrm{K}(K) \leq d_g(K, B_2^n)$ using (i). (iii) Use Exercise 4.20.
Exercise 7.4. Use (7.4) and the fact that R̃1 is self-adjoint in Hk,n .
Exercise ? 7.5. Let f : Rk Ñ R be the indicator function of Rk` , and z be the vector
p1, . . . , 1q{ k. We compute, for x “ px1 , . . . , xk q P Rk ,
1
R1 f pxq “ rE f pGqxG, zys xx, zy “ ? 2´k px1 ` ¨ ¨ ¨ ` xk q.

ion

It follows that

ut
1 ÿ
R̃1 pΘqpx1 , . . . , xk q “ ? 2´k xx, εyeε .

rib
εPt´1,1uk
? a 1
?
Since E |xG, εy| “ k 2{π for any ε P t´1, 1uk , we obtain ~R̃1 pΘq~B1N “ π k,

ist
while ~Θ~B1N “ 1. For the last equality, appeal to Exercise 7.4.
Exercise 7.6. The version on $S$ is: if $f : S \to \mathbb{C}$ is a holomorphic function on $S$ such that $|f| \leq \lambda$ on $S$ and $|f| \leq 1$ on $\mathbb{R}$, then $|f'(0)| \leq e\log\lambda$. Reduce to the case $f(0) = 0$ by considering $z \mapsto (f(z) - f(-z))/2$. Use the three-lines lemma to conclude that $|f(z)| \leq \lambda^{|\operatorname{Im} z|}$. Write $f(z) = zg(z)$ and use the maximum principle (with $0 < t < 1$) to show that $|g(z)| \leq \lambda^t/t$ for $|\operatorname{Im} z| \leq t$. The optimal choice $t = 1/\log\lambda$ gives $|f'(0)| \leq e\log\lambda$.
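Exercise 7.7. (i) We have $T_\alpha = \{f \leq M_f\}_\alpha \cap \{f \geq M_f\}_\alpha$; use Corollary 5.14. (ii) For $x \in S^{n-1}$, use $w((B_\beta)^c, x) \leq \cos\beta$ when $x \in B$ and $w((B_\beta)^c, x) \leq 1$ otherwise. (iii) Check numerically that $\varepsilon - \alpha \geq \big(1 - \sqrt{\log(2)/6}\big)\varepsilon \geq 0.66\varepsilon$ and $\frac{1 + \cos 0.66\varepsilon}{2} \leq \sqrt{1 - \varepsilon^2/6}$ for $\varepsilon \in (0, 1)$. Apply (ii) with $B = T_\alpha$ and $\beta = \varepsilon - \alpha$ to get
$$w_G(A) = \kappa_n w(A) \leq \sqrt{n}\,\frac{1 + \cos\beta}{2} \leq \sqrt{n - n\varepsilon^2/6} \leq \sqrt{n - (k + 1)} \leq \kappa_{n-k}.$$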
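The numerical check suggested in part (iii) can be done on a grid. The sketch below tests the two elementary inequalities in the form $1 - \sqrt{\log(2)/6} \geq 0.66$ and $\frac{1 + \cos 0.66\varepsilon}{2} \leq \sqrt{1 - \varepsilon^2/6}$; the placement of the square roots is our reading of the formulas, so the code should be taken as a check of that reading rather than of the original text. The function names are ours.

```python
import math

def alpha_factor():
    """The constant 1 - sqrt(log(2) / 6), to be compared with 0.66."""
    return 1 - math.sqrt(math.log(2) / 6)

def cap_inequality_holds(eps):
    """Whether (1 + cos(0.66 * eps)) / 2 <= sqrt(1 - eps^2 / 6)."""
    return (1 + math.cos(0.66 * eps)) / 2 <= math.sqrt(1 - eps**2 / 6)

if __name__ == "__main__":
    print(alpha_factor())
    print(all(cap_inequality_holds(k / 1000) for k in range(1, 1001)))
```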
Exercise 7.8. Let $E$ be a random $c\varepsilon^2 n$-dimensional subspace. Since $g$ is 1-Lipschitz and circled with mean $\mu_f$, we can choose $c > 0$ such that $\operatorname{osc}(g, S_E, \mu_f) \leq \varepsilon/3$ with high probability, by Theorem 7.15. Moreover we can write $h(x) \leq \frac{2\pi}{n} + \max\{|f(e^{i2k\pi/n}x) - f(x)| : 1 \leq k \leq n\}$. Using the union bound and Lévy's lemma shows that $M_h = O(\sqrt{\log n}/\sqrt{n})$, where $M_h$ is a median of $h$. We can choose $C$ such that $M_h \leq \varepsilon/3$. Another application of Theorem 7.15 (the function $h$ is 2-Lipschitz and circled) gives that $\operatorname{osc}(h, S_E, M_h) \leq \varepsilon/3$ with high probability, for some choice of $c$. We conclude by using the triangle inequality in the form $\operatorname{osc}(f, S_E, \mu_f) \leq \operatorname{osc}(g, S_E, \mu_f) + M_h + \operatorname{osc}(h, S_E, M_h) \leq \varepsilon$.


Pe

Exercise 7.9. We have $\operatorname{inrad}(K)\operatorname{inrad}(K^\circ) \geq 1/A$ and $w(K)w(K^\circ) \geq 1$ (see Exercise 4.37). The second statement follows from Exercise 4.20.
Exercise 7.10. Without loss of generality we may assume that $\operatorname{inrad}(K \cap E) \geq 1$ and $\operatorname{outrad}(K \cap E) < A$. For $x \in E$ and $y \in E^\perp$, define $T_\lambda(x + y) = x + \lambda y$. As $\lambda$ tends to $+\infty$, the inradius of $T_\lambda K$ tends to 1 and $w_G((T_\lambda K)^\circ)$ tends to $w_G((K \cap E)^\circ \cap E) > A^{-1}\kappa_k$. Therefore, for $\lambda$ large enough, one has $k_*(T_\lambda K) \geq (\kappa_k/\kappa_n)^2 n/A^2 \geq (k - 1)A^{-2}$. Compare also with Exercise B.15.
Exercise 7.11. (i) Let $A$ be the maximum of $\|\cdot\|$ on $S^{n-1}$. Prove that $A \leq 1 + \beta + \delta A$, yielding the upper bound in (7.14); the lower bound follows. (ii) Adjust the values of $\delta, \alpha, \beta$ such that (7.14) implies $1 - \varepsilon \leq \|x\| \leq 1 + \varepsilon$ for any $x \in S^{n-1}$; then use Lemma 5.3 and the union bound.
Exercise 7.12. (i) Let $\mathbb{R}^n = \bigoplus E_i$ be a decomposition of $\mathbb{R}^n$ as the direct sum of $N = \lceil n/k \rceil$ subspaces, with $\dim E_i \leq k$, and $O \in \mathsf{O}(n)$ Haar-distributed. Using the union bound, show that the decomposition $\mathbb{R}^n = \bigoplus O(E_i)$ has the desired property with positive probability. (ii) If $x_i$ is the projection of $x$ onto the $i$-th subspace in a decomposition from (i), write $\|x\| \leq \sum \|x_i\| \leq 2M\sum |x_i| \leq 2M\sqrt{N}\,|x|$. (iii) Use (ii) and the fact that $\|x\| = b|x|$ for some $x \neq 0$.

ion
Exercise 7.13. Let $K_r \subset \mathbb{R}^2$ be a disk of radius 1 centered at $(r, 0)$. Then $\lim_{r \to 1} \dim_V(K_r) = \infty$, or otherwise one would find a polytope $P$ with $K_1 \subset P \subset 4K_1$, which is not possible.
Exercise 7.14. The $n^2$-dimensional convex body $B_1^n \times \cdots \times B_1^n$ has $(2n)^n$ vertices and $n2^n$ facets.
Exercise 7.15. Mimic the proof of Theorem 7.29, replacing the use of Lemma 7.28 by the inequalities $\dim_F(K, A)\,a(K)^2 \geq (n - 1)/(2A^2)$, and $\dim_V(K, B)\,a(K)^2 \geq (n - 1)/(2B^2)$.

rd
Exercise 7.16. If the codimension of $E$ is $k$, then $E$ nontrivially intersects $\mathbb{R}^{k+1}$ (seen as a subspace of $\mathbb{R}^n$), on which $\|\cdot\|_1 \leq \sqrt{k + 1}\,|\cdot|$.
Exercise 7.17. For $p < \infty$, mimic the proof of Theorem 7.31. For $p = \infty$, use Lemma 6.16.
ot
Exercise 7.18. (i) is equivalent to the existence of a linear map $A : \mathbb{R}^k \to \mathbb{R}^n$ such that $(1 + \varepsilon)^{-1}|x| \leq \|A(x)\|_\infty \leq |x|$ for any $x \in S^{k-1}$. The map $A$ has the form $x \mapsto (\langle x, x_1\rangle, \ldots, \langle x, x_n\rangle)$ for $x_1, \ldots, x_n \in \mathbb{R}^k$. We have $|x_i| \leq 1$ and may assume $|x_i| = 1$ by replacing $x_i$ with $x_i/|x_i|$. On the other hand, since $K \subset L$ is equivalent to the inequality $w(K, \cdot) \leq w(L, \cdot)$ between widths, the inclusion (7.22) means precisely that $(1 + \varepsilon)^{-1}|x| \leq \|A(x)\|_\infty$ for $x \in \mathbb{R}^k$, hence the equivalence.
on

Exercise 7.19. (i) Denote $S = (T^{-1})^*$. We have $w_G((TB_p^n)^\circ) = \mathbf{E}\|T^{-1}G\|_p$, where $G$ is a standard Gaussian vector in $\mathbb{R}^n$. The $i$th component of the random vector $T^{-1}G$ has variance $\sigma_i^2 = |Se_i|^2$ and therefore $\mathbf{E}\|T^{-1}G\|_p \leq \big(\mathbf{E}\|T^{-1}G\|_p^p\big)^{1/p} = m_p\big(\sum \sigma_i^p\big)^{1/p} \leq n^{1/p} m_p \max \sigma_i$, where $m_p$ denotes the $L_p$-norm of an $N(0, 1)$ Gaussian variable. On the other hand,
$$\operatorname{inrad}(T(B_p^n)) \leq \operatorname{inrad}(T(B_\infty^n)) = \operatorname{outrad}(SB_1^n)^{-1} = (\max \sigma_i)^{-1}.$$
It follows (cf. (A.1)) that $k_*(TB_p^n) \leq Cpn^{2/p}$.
so

Exercise 7.20. (i) Since $\|\cdot\| \geq (1 + 2\varepsilon)|\cdot|$ we have $(1 + 2\varepsilon) \leq A(1 + \varepsilon)$, so $A \geq 1 + \varepsilon/2$. Similarly $\|\cdot\| \leq 2|\cdot|$ implies $A \leq 2$ and therefore $A \geq 1 + \varepsilon A/4$. To get (ii), subtract $|x|$ from (7.24).
Exercise 7.21. We have $\operatorname{inrad}(K) = 1/\sqrt{m}$ and $\mathbf{E}\|G\|_K = m\kappa_n$ if $G$ is a standard Gaussian vector in $\mathbb{R}^{mn}$.
Exercise 7.22. By Theorem 7.31, there is a subspace $E \subset \mathbb{R}^N$ of dimension $n = cN\varepsilon^2$ such that $P := E \cap B_1^N$ is $(1 + \varepsilon)$-Euclidean. The polytope $P$ has at most $2^N$ facets (since taking sections of polytopes cannot increase the number of facets). The polytope $P$ also has at most $3^N$ vertices, since every vertex of $P$ is the intersection of $E$ with some face of $B_1^N$, and $B_1^N$ has $3^N$ faces.
Exercise 7.23. (i) Let $B : \Omega \to \mathsf{M}_{n,s}$ be a standard Gaussian vector and let $W = W_{n,s} := BB^\dagger$ be the corresponding Wishart matrix. Consider first $p \in [1, \infty)$. As in the proof of Theorem 7.37, the problem is reduced to showing that $\mathbf{E}\|B\|_p = \mathbf{E}\big(\operatorname{Tr} W^{p/2}\big)^{1/p} \sim \alpha_p n^{1/p + 1/2}$ or, equivalently, that
$$\mathbf{E}\Big(n^{-1}\operatorname{Tr}(n^{-1}W)^{p/2}\Big)^{1/p} = \mathbf{E}\Big(\int |x|^{p/2}\,d\mu_{\mathrm{sp}}(n^{-1}W)\Big)^{1/p} \sim \alpha_p.$$
(Above and in what follows all expected values $\mathbf{E}$ are calculated on the probability space $\Omega$, and all integrals are over $\mathbb{R}$, often with respect to empirical spectral measures depending on $\omega \in \Omega$.) Recalling that $\alpha_p = \big(\int |x|^{p/2}\,d\mu_{\mathrm{MP}(\lambda)}\big)^{1/p}$, we see that we need to exploit the convergence $\mu_{\mathrm{sp}}(n^{-1}W) \to \mu_{\mathrm{MP}(\lambda)}$ explained in Section 6.2.3.2. However, there are a few technical points that need to be resolved. First, it is not enough to work with the weak convergence of measures since (by definition) $\nu_n \to \nu$ weakly iff $\int f\,d\nu_n \to \int f\,d\nu$ for every bounded continuous function, and $f(x) = |x|^{p/2}$ is not bounded. To address this problem, appeal to $\infty$-Wasserstein convergence and argue as in Exercise 6.28 (i.e., using Theorem 6.28 and Lemma 6.20) to conclude that $n^{-1}\operatorname{Tr}(n^{-1}W)^{p/2} \to \int |x|^{p/2}\,d\mu_{\mathrm{MP}(\lambda)} = \alpha_p^p$ in probability, and similarly after raising all quantities to the power $1/p$.

Next, as every student of real analysis knows, the convergence $X_n \to Y$ in probability does not generally imply convergence in mean $\mathbf{E}\,X_n \to \mathbf{E}\,Y$: one only knows from Fatou's lemma that $\liminf_n \mathbf{E}\,X_n \ge \mathbf{E}\,Y$. However, we do have convergence in mean under some tightness assumptions, for example when the second moments $\mathbf{E}\,X_n^2$ are uniformly bounded. (Prove this if it sounds unfamiliar.) In our setting, we have
\[ X_n = \bigl(n^{-1}\operatorname{Tr}(n^{-1}W)^{p/2}\bigr)^{1/p} \le \|n^{-1}W\|_\infty^{1/2} = \|n^{-1/2}B\|_\infty. \]
To conclude, verify that Proposition 6.33 (or Corollary 6.38 in the real case) implies $\mathbf{E}\,\|n^{-1/2}B\|_\infty^2 \lesssim \lambda$. This is a simple instance of upper-bounding $L_p$-norms in the presence of $\psi_2$ estimates explained in Section 5.2.6; actually it easily follows that $\mathbf{E}\,\|n^{-1/2}B\|_\infty^2 \sim (1+\sqrt{\lambda})^2$.
The case $p = \infty$ is easier since the quantities in question are more tangible; it follows from Proposition 6.31 (or Corollary 6.38) and Exercise 6.51. Note that the lower bound also follows formally from the case $p < \infty$ by using the facts that $\|\cdot\|_\infty \ge n^{-1/p}\|\cdot\|_p$ and $\lim_{p\to\infty}\alpha_p = 1+\sqrt{\lambda}$, while the upper bound is implicit in the last calculation above.
(ii) Argue in a similar way by using the analogous results from Section 6.2.2 concerning GUE/GOE matrices.
Exercise 7.24. The bounds on the mean width appear in (7.25). The bounds on the volume radius follow from the inequalities $\operatorname{vrad}(S_p^{m,n}) \le w(S_p^{m,n})$ (Urysohn's inequality) and $\operatorname{vrad}(S_p^{m,n})\operatorname{vrad}(S_q^{m,n}) \ge c$ (the inverse Santaló inequality). The constants $C, c$ are independent of $p \in [1,2]$ (in addition to being dimension independent).
Exercise 7.25. (i) Let $M$ and $N$ be $\varepsilon/4$-nets in $(S^{m-1}, |\cdot|)$ and $(S^{n-1}, |\cdot|)$ respectively, and take $P = \operatorname{conv}\{|x\rangle\langle y| : x \in M, y \in N\}$ (cf. the proof of Lemma 9.2). Use Lemma 5.3 to upper-bound the size of the nets. (ii) If $d_{BM}(E \cap S_\infty^{m,n}, B_2^k) \le 2$, (i) implies that $d_{BM}(E \cap P^\circ, B_2^k) \le 4$. Since $E \cap P^\circ$ is a 4-Euclidean polytope with $C_0^{m+n}$ faces, Remark 7.34 implies that $k = O(n)$, as needed. (iii) If $d_{BM}(E \cap S_p^{m,n}, B_2^k) \le 2$, then by (1.31) $d_{BM}(E \cap S_\infty^{m,n}, B_2^k) \le 2m^{1/p}$. By Remark 7.22, $k_*(E \cap S_\infty^{m,n}) \ge km^{-2/p}/4$. This implies (Theorem 7.19) that $S_\infty^{m,n}$ has a 2-Euclidean section of dimension $ckm^{-2/p}$, hence we conclude from (ii) that $k \le Cnm^{2/p}$.
Exercise 7.26. Identifying $\mathbb{C}^n$ with $\mathbb{R}^{2n}$, the ellipsoid $\operatorname{John}(K)$ is circled (as a consequence of its uniqueness, it inherits all the symmetries from $K$), and therefore we may reduce to the case where $K$ is in John position. It suffices to check that Lemma 7.41 transfers verbatim to the complex case.
Exercise 7.27. (i) By the result from Exercise 7.9, we have either $k_*(K) \ge \sqrt{n}$ or $k_*(K^\circ) \ge \sqrt{n}$. Assuming the latter without loss of generality, it follows from Corollary 7.24 that there exists a subspace $F$ of dimension $c\sqrt{n}$ such that $P_F K$ is 2-Euclidean. Conclude by applying Corollary 7.40 to $K \cap F$. (ii) Yes, since we can choose a position for which the Haar measure on $\operatorname{Gr}(k, \mathbb{R}^n)$ concentrates near $E$, see Exercise B.15.

Exercise 7.28. (a) Without loss of generality, one may assume that $\operatorname{John}(K) = B_2^n$. Set $A := \operatorname{vrad}(K) = (\operatorname{vol}(K)/\operatorname{vol}(B_2^n))^{1/n}$. From Lemma 5.8, we obtain that $N(K, B_2^n) \le \operatorname{vol}(K + B_2^n)/\operatorname{vol}(B_2^n) \le \operatorname{vol}(2K)/\operatorname{vol}(B_2^n) \le (2A)^n$. It follows that $K \cap E$ is covered by $(2A)^n$ translates of $B_2^n \cap E$, hence $\operatorname{vol}(K \cap E) \le (2A)^n \operatorname{vol}(B_2^n \cap E)$, which is the claimed estimate. (b) Consider $K = B_\infty^n \times B_2^N$ and check that $\operatorname{vrad}(K)$ is bounded by an absolute constant whenever $N \ge Cn\log n$, whereas $\operatorname{vrad}(B_\infty^n) = \Theta(\sqrt{n})$.
Exercise 7.29. The arguments in parts (i) and (ii) of the Exercise are identical, the key observation being that the intersection of two (or three) events with large probability also has large probability. For the first statement, use the fact that if $E$ is Haar-distributed on $\operatorname{Gr}(k, \mathbb{R}^{2k})$, so is $E^\perp$. For the second statement, fix an orthogonal decomposition $\mathbb{R}^{3k} = F_1 \oplus F_2 \oplus F_3$ and consider $E_i = O(F_i)$ for $O \in \mathsf{O}(3k)$ Haar-distributed.
Exercise 7.30. Follows from Theorem 7.44 by duality.
Exercise 7.31. Both sets equal $E \cap (K + G)$.
Exercise 7.32. Use the example from Exercise 7.14, and Proposition 5.6.
Exercise 7.33. (i) It follows from Lemma 4.20 that $\operatorname{vol}(K) \ge 2^{-n}\operatorname{vol}(K \cap (E_1 \oplus E_2))\operatorname{vol}(K_3)$ and that $\operatorname{vol}(K \cap (E_1 \oplus E_2)) \ge 2^{-n}\operatorname{vol}(K_1)\operatorname{vol}(K_2)$. To obtain inequalities for $K^\circ$, proceed similarly using (1.12) and (1.13). (ii) Use (4.55). (iii) Follows easily from part (ii).
Exercise 7.34. Apply Corollary 7.24 with $K = B_\infty^n$. Using the fact that $k_*(B_1^n) = \Omega(n)$, it follows that there exists a subspace $E \subset \mathbb{R}^n$ of dimension $\Omega(n\varepsilon^2)$ such that $P_E B_\infty^n$ is $(1+\varepsilon)$-Euclidean. Then note that $P_E B_\infty^n$ can be written as the Minkowski sum of $n$ segments. Observe that an isomorphic version of the statement follows from Exercise 7.30.

Chapter 8

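Exercise 8.1. Show that $\mathbf{P}(\operatorname{Seg} \cap\, U(\operatorname{Seg}) \ne \emptyset) = 0$ by arguing as in the proof of Theorem 8.1.
Exercise 8.2. This follows by restricting the minimum to product states, since $S_p(\rho \otimes \sigma) = S_p(\rho) + S_p(\sigma)$.
Exercise 8.3. (i) Use concavity of $S_p$. (ii) Define $\widetilde{\Phi}$ and $\widetilde{\Psi}$ as $\widetilde{\Phi}(\rho) = \Phi(\rho) \otimes \tau$ and $\widetilde{\Psi}(\rho) = \sigma \otimes \Psi(\rho)$, where $\sigma$ (resp., $\tau$) is a state minimizing the output entropy of $\Phi$ (resp., $\Psi$). Let $\Xi = \widetilde{\Phi} \oplus \widetilde{\Psi}$; then $S_p^{\min}(\Xi^{\otimes 2}) = S_p^{\min}(\widetilde{\Phi} \otimes \widetilde{\Psi}) < S_p^{\min}(\widetilde{\Phi}) + S_p^{\min}(\widetilde{\Psi}) = 2S_p^{\min}(\Xi)$.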
Exercise 8.4. The right-hand side of (8.11) is achieved on extreme points, i.e., $\pm|\psi\rangle\langle\psi|$ for $\psi \in S_{\mathbb{C}^2}$. An immediate computation shows that $\|\Phi(\pm|\psi\rangle\langle\psi|)\|_p = 2^{1/p}/2$. On the other hand, if $\psi \perp \varphi$, then $\|\Phi(|\psi\rangle\langle\varphi|)\|_p = \||\psi\rangle\langle\varphi|\|_p = 1 > 2^{1/p}/2$.
Exercise 8.5. (i) Show that nonzero eigenvalues of anti-symmetric matrices come in pairs. (ii) Argue as in the proof of Proposition 8.6, using (i) instead of Lemma 8.7.
Exercise 8.6. (i) The Choi matrix of $R$ is $C(R) = \frac{1}{d}\,\mathrm{I}_{\mathbb{C}^d \otimes \mathbb{C}^d}$. (ii) Use direct computation, or argue that $\mathcal{G} = \{B^jA^k : 1 \le j \le d,\ 1 \le k \le d\}$ is a group (with the counting measure as Haar measure) which generates $M_d$ as a vector space and therefore the argument used in the proof of Proposition 2.18 yields $\frac{1}{d^2}\sum_{G \in \mathcal{G}} GXG^\dagger = \operatorname{Tr} X\,\frac{\mathrm{I}}{d}$.
Exercise 8.7. The idea is to follow carefully the proof of Lemma 8.12 to come up with an exact calculation instead of an estimate.
Recall that the argument shows that $L_k$, the Lipschitz constant of the function $\psi \mapsto E(\psi)$, is the same as that of the function $f$ from (8.17) and, in particular, independent of $d$ (as long as $d \ge k$, which we assume). Next, compute $w$, the tangent (to $S^{k-1}$) component of the gradient of $f$; the supremum of the Euclidean norm of $w$ will be equal to $L_k$. By direct calculation, show that $|w|^2 = 4F$ with
\[ \text{(E.4)} \qquad F = \sum_j p_j \bigl(\log(1/p_j)\bigr)^2 - H^2, \]
where $p_j = x_j^2$ (in the notation of (8.17)) and $H := \sum_j p_j \log(1/p_j)$. To find the maximum of $F$ over the set $\{(p_j) : p_j \ge 0,\ \sum_j p_j = 1\}$, use Lagrange multipliers and deduce that the extremal sequences $(p_j)$ take only two values, namely such that $\log(1/p_j) = (1+H) \pm \alpha$, for some $\alpha > 0$. By analyzing the constraints, show that the maximum of the objective function $F$ equals $\alpha^2 - 1$, and that it is achieved when the smaller value of $p_j$ is repeated $k-1$ times, in which case $\alpha = \alpha_k$ is the positive root of the equation
\[ \text{(E.5)} \qquad e^{2\alpha}(\alpha - 1)/(\alpha + 1) = k - 1, \]
which implies that $\alpha_k \sim \frac{1}{2}\log k$ (as $k \to \infty$). Since the argument shows that $L_k = 2\sqrt{\alpha_k^2 - 1}$, deduce the conclusion.
For any given value of $k$, $L_k$ can be found numerically by solving equation (E.5); numerical evidence suggests that $\log k \le L_k \le \log(2k)$ for all $k \ge 2$.
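The numerical solution of (E.5) mentioned above can be sketched as follows. This is only an illustration under our own choices: the function names `lhs_e5`, `solve_alpha` and `lipschitz_constant` are hypothetical, and plain bisection is used because the left-hand side of (E.5) is increasing for $\alpha \ge 1$; nothing here is taken from the book beyond the two displayed formulas.

```python
import math


def lhs_e5(alpha):
    """Left-hand side of (E.5): e^{2 alpha} (alpha - 1) / (alpha + 1)."""
    return math.exp(2.0 * alpha) * (alpha - 1.0) / (alpha + 1.0)


def solve_alpha(k, tol=1e-12):
    """Root alpha_k > 1 of (E.5), found by bisection.

    lhs_e5 vanishes at alpha = 1 and is increasing on [1, infinity),
    so the root is unique there.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    lo, hi = 1.0, 2.0
    while lhs_e5(hi) < k - 1:  # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs_e5(mid) < k - 1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lipschitz_constant(k):
    """L_k = 2 sqrt(alpha_k^2 - 1), as stated in the hint."""
    alpha = solve_alpha(k)
    return 2.0 * math.sqrt(alpha * alpha - 1.0)


if __name__ == "__main__":
    # Print log k, L_k and log(2k) side by side for a few values of k.
    for k in (2, 3, 10, 100, 1000):
        print(k, math.log(k), lipschitz_constant(k), math.log(2 * k))
```

Running the script prints the three quantities for a few values of $k$, which makes it easy to compare $L_k$ with the conjectured bounds $\log k$ and $\log(2k)$.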


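Exercise 8.8. Fix an orthonormal basis $(\varphi_j)_{1 \le j \le s}$ of $F$ and define $y \in \mathbb{R}^n$ by $y_i = \bigl(\sum_{j=1}^s |\langle \varphi_j, i\rangle|^2\bigr)^{1/2}$. Then check that
\[ \sup_{x \in F :\, |x| = 1} \|x\|_\infty = \|y\|_\infty \ge \frac{1}{\sqrt{n}}\,|y| = \sqrt{\frac{s}{n}}. \]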
Exercise 8.9. (i) Use Exercise 8.8 to show that $W$ contains a unit vector $\psi$ with largest Schmidt coefficient greater than $\sqrt{\alpha}$. This uses the identification of bipartite states with matrices (see Section 2.2.2) and the fact that the operator norm of a matrix is at least as large as the absolute value of the largest matrix element. (The latter seems very rough, but works; it is conceivable that refining the argument at this point could lead to closing the gap between the lower and the upper bound on the dimension of “very entangled subspaces,” at least for some ranges of the parameters.) Then appeal to concavity of entropy to show that under this constraint the von Neumann entropy is maximized when the spectrum of $\rho = \operatorname{Tr}_{\mathbb{C}^m}|\psi\rangle\langle\psi|$ is $(\alpha, (1-\alpha)/(k-1), \dots, (1-\alpha)/(k-1))$. (ii) This is a tedious but straightforward consequence of part (i); use the fact that if $y = \phi(x) = \Theta\bigl(x(1+\log x)\bigr)$ for $x \ge 1$, then $\phi^{-1}(y) = \Theta\bigl(y/(1+\log y)\bigr)$.
Exercise 8.10. Let $\theta : (\mathbb{R}^d, |\cdot|) \to (E, \|\cdot\|_{HS})$ be an isometry. For $i, j \in \{1, \dots, N\}$, consider the linear form $\phi_{i,j} : (\mathbb{R}^d)^k \to \mathbb{R}$ defined by
\[ \phi_{i,j}(x_1, \dots, x_k) = \langle i|\theta(x_1) \cdots \theta(x_k)|j\rangle. \]
Show that $|\phi_{i,j}(x_1, \dots, x_k)| \le N^{-k/2}|x_1| \cdots |x_k|$ and that $\sum_{i,j=1}^N |||\phi_{i,j}|||^2 = \frac{d^k}{N^{k-1}}$, so that $|||\phi_{i,j}||| \ge d^{k/2}/N^{(k+1)/2}$ for some indices $(i,j)$. Then use Proposition 8.25(iii).

Chapter 9

Exercise 9.1. Use Proposition 6.3.
Exercise 9.2. Consider $A = |0\rangle\langle 0| - |1\rangle\langle 1|$ and $\mathcal{N} = \{\psi \in S_{\mathbb{C}^2} : \|\psi\|_\infty \le \cos\alpha\}$. Then $\mathcal{N}$ is an $\alpha$-net in $(S_{\mathbb{C}^d}, g)$ and $|\langle\psi|A|\psi\rangle| \le \cos(2\alpha)$ for any $\psi \in \mathcal{N}$.
Exercise 9.3. For the first part, mimic the argument used in the proof of Proposition 8.28. The second part is straightforward, see (6.15).
Exercise 9.4. This is a reformulation of the statement from Lemma 4.3.
Exercise 9.5. The statement about the mean width is proved similarly as in the qubit case. For the lower bound, one may notice that since $\operatorname{Sep}((\mathbb{C}^2)^{\otimes k})$ is a section of $\operatorname{Sep}((\mathbb{C}^d)^{\otimes k})$, its Gaussian mean width is smaller. For the volume, to be able to generalize the argument from the proof of Theorem 9.11 one needs to find $\operatorname{Löw}(\mathrm{D}(\mathbb{C}^d))$. To that end, use Proposition 4.8 to show that it has the form $\mathcal{E}_{a,b} = (aP + bQ)B_{HS}$, where $P$ is the projection onto the hyperplane of trace zero matrices and $Q = \mathrm{I} - P$. Check that $\mathrm{D}(\mathbb{C}^d) \subset \mathcal{E}_{a,b} \iff a^{-2}(1 - 1/d) + b^{-2}/d \le 1$. Minimizing $\operatorname{vol}\mathcal{E}_{a,b} = a^{d^2-1}\,b\,\operatorname{vol}(B_{HS})$ under this constraint gives $a = \sqrt{d/(d+1)}$ and $b = \sqrt{d}$.
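The final minimization can be checked with a Lagrange multiplier. The substitution $u = a^{-2}$, $v = b^{-2}$ and the multiplier $\mu$ below are our own notation, not the book's; minimizing $a^{d^2-1}b$ is the same as maximizing $(d^2-1)\log u + \log v$, and the constraint is active at the optimum since the objective increases in $u$ and $v$.

```latex
\[
\begin{aligned}
&\text{maximize } (d^2-1)\log u + \log v
  \quad\text{subject to}\quad u\Bigl(1-\frac{1}{d}\Bigr) + \frac{v}{d} = 1, \\
&\frac{d^2-1}{u} = \mu\Bigl(1-\frac{1}{d}\Bigr), \qquad \frac{1}{v} = \frac{\mu}{d}, \\
&1 = u\Bigl(1-\frac{1}{d}\Bigr) + \frac{v}{d}
   = \frac{d^2-1}{\mu} + \frac{1}{\mu} = \frac{d^2}{\mu}
   \;\Longrightarrow\; \mu = d^2, \\
&u = \frac{d^2-1}{d^2(1-1/d)} = \frac{d+1}{d}, \qquad v = \frac{1}{d},
   \qquad\text{hence}\quad a = u^{-1/2} = \sqrt{\frac{d}{d+1}}, \quad b = v^{-1/2} = \sqrt{d}.
\end{aligned}
\]
```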
Exercise 9.6. For the bound on $w(\mathrm{PPT}^\circ)$, argue as in the proof of Theorem 9.13, but in the last step use Exercise 5.28. In the displayed formula, the first inequality is Urysohn's inequality. For the second one, use the bound on $w(\mathrm{PPT}^\circ)$ and appeal to the dual Urysohn inequality (Proposition 4.16).
Exercise 9.7. This follows from the fact that the measure $\mu_{d^2,d^2}$ on $\mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$ is proportional to the Lebesgue measure (see the discussion following (6.47)), and from Proposition 6.34.
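Exercise 9.8. (i) The operator whose norm has to be estimated is $A = \sum_i B_i$ with $B_i = \sum_j A_{ij} \otimes |i\rangle\langle j|$. Since $B_{i_1}^\dagger B_{i_2} = 0$ for $i_1 \ne i_2$, it follows that
\[ \|A\|^2 = \|A^\dagger A\| = \Bigl\|\sum_i B_i^\dagger B_i\Bigr\| \le \sum_i \|B_i^\dagger B_i\|. \]
Next, using $B_iB_i^\dagger = \bigl(\sum_j A_{ij}A_{ij}^\dagger\bigr) \otimes |i\rangle\langle i|$, conclude that $\|B_i^\dagger B_i\| = \|B_iB_i^\dagger\| = \bigl\|\sum_j A_{ij}A_{ij}^\dagger\bigr\| \le \sum_j \|A_{ij}A_{ij}^\dagger\| = \sum_j \|A_{ij}\|^2$, as needed. (ii) It is enough to prove (see the comment following Theorem 9.15) that every matrix $A \in B^{\mathrm{sa}}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$ with $\|A\|_{HS} \le 1$ satisfies $\mathrm{I} + A \in \mathrm{SEP}$. By Theorem 2.34 and Remark 2.35, it suffices to show that for every unital positive map $\Phi : M_{d_1} \to M_{d_2}$, we have $\mathrm{I} + (\Phi \otimes \mathrm{Id})(A) \ge 0$. Writing $A$ as $\sum A_{ij} \otimes |i\rangle\langle j|$, we have
\[ \|(\Phi \otimes \mathrm{Id})(A)\|_\infty^2 = \Bigl\|\sum \Phi(A_{i,j}) \otimes |i\rangle\langle j|\Bigr\|_\infty^2 \le \sum \|\Phi(A_{i,j})\|_\infty^2 \le \sum \|A_{i,j}\|_\infty^2 \le 1, \]
from which the result follows. In the chain of inequalities we used successively (i), Exercise 2.30, and $\|\cdot\|_{op} \le \|\cdot\|_{HS}$.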
Exercise 9.9. Note that $P(\mathrm{D}(\mathbb{C}^2))$ is the shifted Bloch ball (a 3-dimensional real Euclidean ball with radius $1/\sqrt{2}$). For the last inequality, argue as in the proof of Proposition 8.28.
Exercise 9.10. Use Stirling’s formula.
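Exercise 9.11. (i) The fact that the projection contains the section is a general obvious fact. For $\rho = |1\rangle\langle 1| \otimes |1\rangle\langle 1|$, we have $P_H\rho = \frac{1}{mn}\,\mathrm{I}_{\mathbb{C}^m \otimes \mathbb{C}^n} + |1\rangle\langle 1| \otimes \bigl(|1\rangle\langle 1| - \frac{1}{n}\,\mathrm{I}_{\mathbb{C}^n}\bigr)$, which is not positive. (ii) Use (1.13). (iii) We have $P_F\rho = \frac{\mathrm{I}}{m} \otimes \operatorname{Tr}_{\mathbb{C}^m}\rho$ for every state $\rho$.
Exercise 9.12. Use the fact that D has enough symmetries. The argument suggested in the hint to Exercise 9.14 also works.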
Exercise 9.13. We have $\operatorname{vrad}(P) \ge \frac{1}{4}\operatorname{vrad}(\mathrm{D}(\mathbb{C}^d)) \ge c/\sqrt{d}$ (see Table 9.1). If $P$ has $N$ vertices, Proposition 6.3 implies that $\operatorname{vrad}(P) = O(\sqrt{\log N}/d)$, and the result follows. We point out that the result can also be proved by arguing as in the proof of Proposition 9.31.
Exercise 9.14. Argue that the smallest $\lambda > 0$ such that $(-1) \bullet \operatorname{Sep} \subset \lambda \bullet \operatorname{Sep}$ equals $d^2 - 1$ by considering a pure product state.

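Exercise 9.15. Denote by $E \subsetneq \mathbb{C}^d$ the range of $\Phi(\mathrm{I})$ (which is a positive operator) and consider $\widetilde{\Phi} : X \mapsto \Phi(X) + P_{E^\perp}XP_{E^\perp}$. The map $\widetilde{\Phi}$ is clearly positivity-preserving and has the property that, for any state $\rho$ on $\mathbb{C}^d \otimes \mathbb{C}^d$, we have
\[ (\Phi \otimes \mathrm{Id})(\rho) \in \mathrm{PSD} \iff (\widetilde{\Phi} \otimes \mathrm{Id})(\rho) \in \mathrm{PSD}. \]
(The key point in inferring the latter is that positivity of $\Phi$ implies then that, for any $X \in M_d$, the range of $\Phi(X)$ is contained in $E$.) Finally, define $\Psi : X \mapsto \widetilde{\Phi}(\mathrm{I})^{-1/2}\,\widetilde{\Phi}(X)\,\widetilde{\Phi}(\mathrm{I})^{-1/2}$.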
Exercise 9.16. (i) is an immediate consequence of Exercise 7.15 applied with $B = 4$. For (ii), proceed exactly as in the proof of Theorem 9.34.

Chapter 10

Exercise 10.1. Check that $\frac{1}{\|x\|_\infty}\sum_{i=1}^k x_i^\downarrow \le \min(k, n-k) \le \frac{2n}{\|y\|_1}\sum_{i=1}^k y_i^\downarrow$.

Exercise 10.2. First remark that the unitary invariance of $A$ implies that $A$ and $VAV^\dagger$ have the same distribution when $V$ is a random unitary matrix independent of $A$ ($V$ is not assumed to be Haar-distributed). Now, let $W$ be a (random) unitary matrix such that $\operatorname{Diag}(\operatorname{spec} A) = WAW^\dagger$ ($W$ is not unique but can be chosen as a measurable function of $A$). If $U$ is a Haar-distributed random matrix independent of $A$, then $UW$ is independent of $A$ (this follows from the translation-invariance of the Haar measure). Finally, we may apply the initial remark with $V = UW$.
Exercise 10.3. Argue as in the proof of Proposition 10.4 using the event $E = \{\|B\|_1 \ge c_1n\}$, and Lemma 10.2.
Exercise 10.4. Consider the random vector $Z$ defined by $\mathbf{P}(Z = e_i) = \mathbf{P}(Z = -e_i) = \frac{1}{2n}$ where $(e_i)$ is the canonical basis of $\mathbb{R}^n$. We show separately that (i) $\mathbf{E}\,\|X\|_K \le C_1\,\mathbf{E}\,\|Z\|_K$ and (ii) $\mathbf{E}\,\|Z\|_K \le C_2\,\mathbf{E}\,\|Y\|_K$. Writing $X$ as a positive combination of $\pm e_i$ gives (i) with $C_1 = 2n\,\mathbf{E}\,\|X\|_1$. For (ii), denote $\mathcal{A} = \{\mathbf{E}[Y\mathbf{1}_A] : A \text{ measurable}\}$. Note that for any $x \in \operatorname{conv}\mathcal{A}$, we have $\|x\|_K \le \mathbf{E}\,\|Y\|_K$. The convex hull of $\mathcal{A}$ has nonempty interior (otherwise $Y$ would be almost surely contained in some hyperplane) and therefore contains the $2n$ vectors $\pm\varepsilon e_i$ ($1 \le i \le n$) for some $\varepsilon > 0$. It follows that $\varepsilon\,\mathbf{E}\,\|Z\|_K \le \mathbf{E}\,\|Y\|_K$.
Exercise 10.5. This is obvious since $\mu_{d^2,s}$ has a density with respect to the Lebesgue measure.
Exercise 10.6. Consider the set $L_s = \{(\psi_1, \dots, \psi_s) \in (\mathbb{C}^d \otimes \mathbb{C}^d)^s : \sum_{i=1}^s |\psi_i\rangle\langle\psi_i| \in \mathrm{SEP}\}$. Since SEP is convex, we have the following fact: if $(\psi_1, \dots, \psi_{s-1}, \varphi) \in L_s$ and $(\psi_1, \dots, \psi_{s-1}, \chi) \in L_s$, then $(\psi_1, \dots, \psi_{s-1}, \frac{1}{\sqrt{2}}\varphi, \frac{1}{\sqrt{2}}\chi) \in L_{s+1}$. It follows that whenever $(\psi_1, \dots, \psi_s)$ is a Lebesgue point for $L_s$, then $(\psi_1, \dots, \psi_{s-1}, \frac{1}{\sqrt{2}}\psi_s, \frac{1}{\sqrt{2}}\psi_s)$ is a Lebesgue point for $L_{s+1}$. (A point $x$ is a Lebesgue point for $A \subset \mathbb{R}^n$ if the ratio $\operatorname{vol}(A \cap B(x,\varepsilon))/\operatorname{vol}B(x,\varepsilon)$ goes to 1 as $\varepsilon$ goes to 0.) The result follows from the fact that almost every point of $L_s$ is a Lebesgue point (see Chapter 3, Corollary 1.5 in [SS05]).
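Exercise 10.7. Realize $\rho$ as $\operatorname{Tr}_{\mathbb{C}^s}|\psi\rangle\langle\psi|$ for $\psi$ uniformly distributed on $S_{\mathbb{C}^d \otimes \mathbb{C}^d \otimes \mathbb{C}^s}$. (i) For $d_1 \le d_2$, identify $\mathbb{C}^{d_1}$ as a subspace of $\mathbb{C}^{d_2}$ and let $P : \mathbb{C}^{d_2} \to \mathbb{C}^{d_1}$ be the orthogonal projection. Show that the map
\[ \rho \mapsto \frac{(P \otimes P)\rho(P \otimes P)}{\operatorname{Tr}(P \otimes P)\rho(P \otimes P)} \]
pushes forward $\mu_{d_2^2,s}$ onto $\mu_{d_1^2,s}$, and preserves separability. (ii) Identify $\mathbb{C}^{2d}$ with $\mathbb{C}^2 \otimes \mathbb{C}^d$. Let $\Phi : B(\mathbb{C}^{2d} \otimes \mathbb{C}^{2d}) \to B(\mathbb{C}^d \otimes \mathbb{C}^d)$ be the partial trace over $\mathbb{C}^2 \otimes \mathbb{C}^2$. Then $\Phi$ pushes forward $\mu_{4d^2,s}$ onto $\mu_{d^2,4s}$, and preserves separability.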
Exercise 10.8. We use the same notation as in Exercise 10.7. Theorem 10.12 applied for $\varepsilon = 1/2$ gives, with $\delta := 2\exp(-\alpha N)$ for some $\alpha > 1$:
‚ if $k, N$ are such that $2^{N-2k} \le \frac{1}{2}s_0(2^k)$, then $\pi_{2^k,2^{N-2k}} \le \delta$,
‚ if $k, N$ are such that $2^{N-2k} \ge \frac{3}{2}s_0(2^k)$, then $\pi_{2^k,2^{N-2k}} \ge 1 - \delta$.
Set $p_k := \pi_{2^k,2^{N-2k}}$. Exercise 10.7(ii) implies that $(p_k)$ is non-increasing. Define $k_N$ as the smallest $k$ such that $p_k < 1 - \delta$. It is clear from the estimates (10.10) that $k_N \sim N/5$. Our definition of $k_N$ implies that $2^{N-2k_N} < \frac{3}{2}s_0(2^{k_N})$, and therefore $2^{N-2k_N-2} \le \frac{1}{2}s_0(2^{k_N})$, so that $\pi_{2^{k_N},2^{N-2k_N-2}} \le \delta$. By Exercise 10.7(i), this implies that $p_{k_N+1} \le \delta$ and the Corollary follows.
Exercise 10.9. (i) We have $\operatorname{Tr}(\rho^2) = \frac{1}{d^2} + \operatorname{Tr}(W\rho)$. The value of $\mathbf{E}\operatorname{Tr}(\rho^2)$ was computed in Exercise 6.45. To obtain concentration use the fact that $\operatorname{Tr}\rho^2$ is related to the Schatten 4-norm of $M$ when $\rho = MM^\dagger$ with $M$ distributed on the Hilbert–Schmidt sphere in $M_{d^2,s}$. (ii) Let $\Pi$ be the orthogonal projection onto the subspace $\mathbb{C}x \otimes \mathbb{C}^s$. The function $|\Pi\psi| = \sqrt{\langle x|\rho|x\rangle}$ is 1-Lipschitz as a function of $\psi$ and satisfies $\mathbf{E}\,|\Pi\psi|^2 = 1/d^2$; use Exercise 5.46. (iii) Use Lemma 9.4 and the union bound.
Exercise 10.10. It follows from (the proof of) Carathéodory’s theorem (see Exercise 1.1) that the infimum in (10.15) can be restricted to convex combinations of length at most $d^4$. Then use a compactness argument.
Exercise 10.11. The inradius of PPT is the same as that of Sep (see Table 9.1), so the argument that led to (10.14) carries over to the present setting. For the bound in (i), the relevant range of $s$ is $\Theta(d^2)$.

Chapter 11

Exercise 11.1. If $n \ge 3$ is odd, argue as in the comment following Lemma 11.1. If $n = 2k$, identify $\mathbb{R}^n$ with $\mathbb{R}^2 \otimes \mathbb{R}^k$ and consider $E = F \otimes \mathrm{I}_k$, where $F \subset M_2(\mathbb{R})$ is the subspace spanned by the two real Pauli matrices.
Exercise 11.2. This can be seen directly from the definition. Alternatively, we may use the description from Proposition 11.8. Let $\lambda \in (0,1)$ and $(a_{ij}), (a'_{ij}) \in \mathrm{QC}_{m,n}$. We have $a_{ij} = \langle x_i, y_j\rangle$ and $a'_{ij} = \langle x'_i, y'_j\rangle$. Defining $\tilde{x}_i = \sqrt{\lambda}\,x_i \oplus \sqrt{1-\lambda}\,x'_i$ and $\tilde{y}_j = \sqrt{\lambda}\,y_j \oplus \sqrt{1-\lambda}\,y'_j$ leads to $\lambda a_{ij} + (1-\lambda)a'_{ij} = \langle\tilde{x}_i, \tilde{y}_j\rangle$. We then argue as in the end of Proposition 11.8 to ensure that vectors live in $\mathbb{R}^{\min(m,n)}$.
Exercise 11.3. For vectors $x_i, y_j$ of norm at most 1, the unit vectors $x'_i = x_i + \sqrt{1 - |x_i|^2}\,u$ and $y'_j = y_j + \sqrt{1 - |y_j|^2}\,v$ satisfy $\langle x_i, y_j\rangle = \langle x'_i, y'_j\rangle$ provided $u, v$ are unit vectors in $\{y_j : 1 \le j \le n\}^\perp \cap \{x_i : 1 \le i \le m\}^\perp$ such that $u \perp v$.
Exercise 11.4. When considered as elements of $\mathbb{R}^4$, the 8 distinct matrices $A_{\xi,\eta} = (\xi_i\eta_j)_{i,j=1}^2$ are either opposite or orthogonal. A less explicit argument goes as follows: use Proposition 11.7, the fact that $B_\infty^2$ is congruent to $\sqrt{2}B_1^2$, and that $B_1^m \mathbin{\hat{\otimes}} B_1^n$ identifies with $B_1^{mn}$ (cf. Exercise 11.8).
Exercise 11.5. Given $\xi \in \{-1,1\}^m$ and $\eta \in \{-1,1\}^n$, let $I = \{i : \xi_i = 1\}$ and $J = \{j : \eta_j = 1\}$ and split the overall sum $\sum b_{ij}\xi_i\eta_j$ into 4 sums according to whether $i \in I$ or not, $j \in J$ or not; then use the triangle inequality.
Exercise 11.6. For the first statement, note that $\{-1,1\}^k$ is exactly the set of extreme points of $B_\infty^k = (B_1^k)^\circ$. The second statement is even more straightforward from Proposition 11.8: $\{(x_i)_{i=1}^k : x_i \in H,\ |x_i| \le 1\}$ is exactly the unit ball of $\ell_\infty^k(H) = \bigl(\ell_1^k(H)\bigr)^*$.
Exercise 11.7. Choose $\sigma = \tau = \sigma_z$, $\tilde{X}_i = X_i \otimes |0\rangle\langle 0|$, and $\tilde{Y}_j = Y_j \otimes |0\rangle\langle 0|$.
Exercise 11.8. First observe that Proposition 11.7 generalizes to the present context, with the same proof: $\mathrm{LC}_{2,\dots,2}$ identifies with $(B_\infty^2)^{\hat{\otimes}k}$. Next use the facts that $B_\infty^2$ is congruent to $\sqrt{2}B_1^2$, and that $(B_1^2)^{\hat{\otimes}k}$ identifies with $B_1^{2^k}$. It follows that $\mathrm{LC}_{2,\dots,2}$ is congruent to $2^{k/2}B_1^{2^k}$, a polytope with $2^{k+1}$ vertices and $2^{2^k}$ facets.
Exercise 11.9. The answers are most conveniently deduced from the characterizations given by Propositions 11.7 and 11.8. The outradius is in both cases easily seen to be $\sqrt{mn}$. It is a little more delicate to establish that the inradii are 1. For the lower bound on the inradius of $\mathrm{LC}_{m,n} = B_\infty^m \mathbin{\hat{\otimes}} B_\infty^n$, note that it is in Löwner position by Lemma 4.9 and then appeal to Exercise 4.20. For the remaining conclusions, use $\mathrm{LC}_{m,n} \subset \mathrm{QC}_{m,n} \subset B_\infty^{mn}$.
Exercise 11.10. Since $\mathrm{LC}_{m,n} = B_\infty^m \mathbin{\hat{\otimes}} B_\infty^n$, this follows from Exercise 4.27 and the fact that a cube has enough symmetries (Exercise 4.25). More concretely, symmetries of LC are generated by permutations and sign flips of rows and columns. Since these operations are also symmetries for QC, it follows that QC has likewise enough symmetries.
Exercise 11.11. Taking into account Remark 11.9, it is enough to check that for all self-adjoint operators $X_1, X_2, Y_1, Y_2$ with $X_1^2 = X_2^2 = \mathrm{I}$ and $Y_1^2 = Y_2^2 = \mathrm{I}$, we have $\operatorname{Tr}\rho B \le 2\sqrt{2}$, where $B = X_1 \otimes Y_1 + X_1 \otimes Y_2 + X_2 \otimes Y_1 - X_2 \otimes Y_2$. To that end, show that $B^2 = 4\,\mathrm{I} - (X_1X_2 - X_2X_1) \otimes (Y_1Y_2 - Y_2Y_1)$ and conclude that $\|B^2\|_{op} \le 8$. For an example giving violation $\sqrt{2}$, appeal to Proposition 11.8 and consider the case where $x_1, y_1, x_2, y_2 \in \mathbb{R}^2$ are unit vectors separated by successive $45^\circ$ angles.
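The violation can be checked numerically with a few lines of arithmetic. The sketch below assumes the unnormalized CHSH expression $a_{11} + a_{12} + a_{21} - a_{22}$ (matching the operator $B$ above) and one particular labeling of the four directions, $y_2, x_1, y_1, x_2$ at $-45^\circ, 0^\circ, 45^\circ, 90^\circ$; both the labeling and the helper names are our own choices for illustration.

```python
import math
from itertools import product


def unit(deg):
    """Unit vector in R^2 at the given angle (in degrees)."""
    t = math.radians(deg)
    return (math.cos(t), math.sin(t))


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def chsh(a):
    """Unnormalized CHSH expression a11 + a12 + a21 - a22 of a 2x2 matrix."""
    return a[0][0] + a[0][1] + a[1][0] - a[1][1]


def classical_max():
    """Maximum of the CHSH expression over the matrices (xi_i * eta_j)."""
    best = -math.inf
    for xi in product((-1, 1), repeat=2):
        for eta in product((-1, 1), repeat=2):
            a = [[xi[i] * eta[j] for j in range(2)] for i in range(2)]
            best = max(best, chsh(a))
    return best


def quantum_example():
    """CHSH value of a_ij = <x_i, y_j> for four directions 45 degrees apart."""
    x = [unit(0), unit(90)]
    y = [unit(45), unit(-45)]
    a = [[dot(x[i], y[j]) for j in range(2)] for i in range(2)]
    return chsh(a)


if __name__ == "__main__":
    # Ratio of the two values is the violation sqrt(2).
    print(classical_max(), quantum_example(), quantum_example() / classical_max())
```

With this labeling, three of the inner products equal $\cos 45^\circ$ and the fourth equals $\cos 135^\circ$, which is why the signed sum reaches $4\cos 45^\circ$.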
Here is an alternative argument which allows one to arrive at an example without guessing. First, observe that
\[ \sup\{\varphi_{\mathrm{CHSH}}(A) : A \in \mathrm{QC}_{2,2}\} = \frac{1}{2}\sup\{|y_1 + y_2| + |y_1 - y_2| : y_j \in H,\ |y_j| \le 1\} \]
(cf. Exercise 11.6). Next, note that for such $y_1, y_2$,
\[ |y_1 + y_2| + |y_1 - y_2| \le \sqrt{2}\,\bigl(|y_1 + y_2|^2 + |y_1 - y_2|^2\bigr)^{1/2} = 2\bigl(|y_1|^2 + |y_2|^2\bigr)^{1/2} \le 2\sqrt{2} \]
and verify when equalities occur.
Exercise 11.12. By Exercise 11.4 and its hint, every normal to a facet is proportional to the sum of four vertices of that facet, which in turn are of the form $A_{\xi,\eta}$. All such sums can then be listed and classified: there are 8 that exhibit the CHSH pattern and another 8 with only one non-zero entry. Alternatively, one may notice that every such sum is a matrix of Hilbert–Schmidt norm 4, whose entries are even integers that sum up to $\pm 4$. Finally, the functionals corresponding to matrices with only one non-zero entry cannot distinguish between classical and quantum correlations.
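The listing and classification described in this hint is small enough to carry out by brute force. The following sketch flattens each $2 \times 2$ matrix $A_{\xi,\eta}$ to a vector in $\mathbb{R}^4$, picks one vertex from each opposite pair, and forms all signed sums; the function names are hypothetical and the representation is our own choice.

```python
from itertools import product


def vertices():
    """The distinct matrices (xi_i * eta_j), flattened as (a11, a12, a21, a22)."""
    verts = set()
    for xi in product((-1, 1), repeat=2):
        for eta in product((-1, 1), repeat=2):
            verts.add(tuple(xi[i] * eta[j] for i in range(2) for j in range(2)))
    return sorted(verts)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def facet_normal_candidates():
    """All sums of four vertices, one (with a sign) from each opposite pair."""
    half = [v for v in vertices() if v[0] == 1]  # one vertex per +/- pair
    sums = set()
    for signs in product((-1, 1), repeat=4):
        sums.add(tuple(sum(s * v[c] for s, v in zip(signs, half)) for c in range(4)))
    return sorted(sums)


def classify(normals):
    """Split the sums into single-entry ones and CHSH-like ones (all entries +-2)."""
    single = [n for n in normals if sum(1 for e in n if e != 0) == 1]
    chsh_like = [n for n in normals if all(abs(e) == 2 for e in n)]
    return single, chsh_like


if __name__ == "__main__":
    single, chsh_like = classify(facet_normal_candidates())
    print(len(vertices()), len(single), len(chsh_like))
```

The two alternative descriptions in the hint (Hilbert–Schmidt norm 4, even entries summing to $\pm 4$) can be verified on the same list.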
Exercise 11.13. If $m > n$, the set $\mathrm{LC}_{n,n}$ can be seen in a canonical way as a section of $\mathrm{LC}_{m,n}$, which in turn is a section of $\mathrm{LC}_{m,m}$, and similarly for $\mathrm{QC}_{n,n}$, $\mathrm{QC}_{m,n}$ and $\mathrm{QC}_{m,m}$. The fact that $K_G^{(2)} \ge \sqrt{2}$ follows from Exercise 11.11, and the opposite inequality by combining Exercises 11.11 and 11.12.
Exercise 11.14. We have $\mathrm{LC}_{2,n} = B_\infty^2 \mathbin{\hat{\otimes}} B_\infty^n$. Since $B_\infty^2$ is congruent to $\sqrt{2}B_1^2$, it follows that $\frac{1}{\sqrt{2}}\mathrm{LC}_{2,n}$ is congruent to $B_1^2 \mathbin{\hat{\otimes}} B_\infty^n$, which identifies with $B_\infty^n \oplus_1 B_\infty^n := \operatorname{conv}\bigl(\{(x,0) : x \in B_\infty^n\} \cup \{(0,x) : x \in B_\infty^n\}\bigr)$. The facets of $B_\infty^n \oplus_1 B_\infty^n$ are of the form $\operatorname{conv}(F \times \{0\}, \{0\} \times G)$, where $F, G$ are facets of $B_\infty^n$. (This can be easily seen by identifying $(B_\infty^n \oplus_1 B_\infty^n)^\circ$ with $B_1^n \times B_1^n$.) It follows that $\mathrm{LC}_{2,n}$ has $(2n)^2$ facets: $4n$ facets express the fact that each entry of a correlation matrix belongs to $[-1,1]$, and $8\binom{n}{2} = 4n^2 - 4n$ are equivalent to the CHSH inequality.
Exercise 11.15. Fix $1 \le i, j \le 3$ and denote $E \subset \mathbb{R}^3 \times \mathbb{R}^3$ the subspace of matrices for which the $i$th row and the $j$th column are zero. It is clear from the definition that $P_E\,\mathrm{LC}_{3,3} = \mathrm{LC}_{2,2}$, where $E$ is identified with $\mathbb{R}^2 \times \mathbb{R}^2$. It follows that whenever $\{\phi(\cdot) \le 1\}$ is a facet-defining inequality for $\mathrm{LC}_{2,2}$, then $\{\phi(P_E(\cdot)) \le 1\}$ is a facet-defining inequality for $\mathrm{LC}_{3,3}$. A careful counting (cf. Exercise 11.12) shows that this construction produces 18 facets of the kind $\pm a_{ij} \le 1$ and $9 \times 8 = 72$ facets defined by inequalities equivalent to CHSH up to symmetries. The information that $\mathrm{LC}_{3,3}$ has 90 facets implies that $\mathrm{LC}_{3,3}$ is the intersection of the half-spaces associated to these $18 + 72 = 90$ facets. Since $P_E\,\mathrm{QC}_{3,3} = \mathrm{QC}_{2,2} \subset \sqrt{2}\,\mathrm{LC}_{2,2}$, it follows that $\mathrm{QC}_{3,3} \subset \sqrt{2}\,\mathrm{LC}_{3,3}$.
Exercise 11.16. If $M = (m_{ij})$, then
\[ \|M : \ell_\infty^2(\mathbb{C}) \to \ell_1^2(\mathbb{C})\| = \max\Bigl\{\sum_{i=1}^m \Bigl|\sum_{j=1}^n m_{ij}z_j\Bigr| : z_j \in \mathbb{C},\ |z_j| \le 1,\ j = 1, \dots, n\Bigr\}. \]
Since, as a real normed space, $(\mathbb{C}, |\cdot|)$ coincides with $(\mathbb{R}^2, |\cdot|)$, it remains to appeal to Exercise 11.6. (Note that we are concerned here with the case $m = n = 2$, but a similar argument works if $\min\{m,n\} = 2$.)
Exercise 11.17. Let $a, b, c, d \in \mathbb{C}$ and let $\phi : \mathbb{C} \to \mathbb{R}_+$ be defined by $\phi(z) = |az + b| + |cz + d|$. Then $\phi$ is convex and, in particular, its maximal value over the (closed) unit disk is attained on its boundary $\mathbb{T}$. Next, note that for $\eta_1, \eta_2 \in \mathbb{T}$ we have
\[ |a\eta_1 + b\eta_2| + |c\eta_1 + d\eta_2| = \phi(\eta_1\overline{\eta_2}) \]
and, similarly, for $y_1, y_2 \in \mathbb{C}^2$ with $|y_1| = |y_2| = 1$
\[ |ay_1 + by_2| + |cy_1 + dy_2| = \phi(\langle y_1|y_2\rangle). \]
By the first observation, the maxima of these two expressions (over $\eta_1, \eta_2 \in \mathbb{T}$ and over unit vectors $y_1, y_2 \in \mathbb{C}^2$ respectively) coincide and it remains to notice that these maxima represent the expressions on the two sides of the inequality (11.37) for $[m_{ij}] = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$.
Exercise 11.18. The polytope $\mathrm{LC}_{n,n}$ is a symmetric polytope with $2^{2n-1}$ vertices and dimension $n^2$ (see Proposition 11.7), so the result follows. For the “moreover” part, combine Exercise 7.15 and, if needed, Theorem 11.12. Note that, from general principles (see Exercise 4.20), $a(\mathrm{LC}_{n,n}) \le n$ and $a(\mathrm{QC}_{n,n}) \le n$ (in fact we have equality by Exercise 11.9).
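Exercise 11.19. Via the Santaló inequality and its reverse, Proposition 11.15 implies that $\operatorname{vrad}(\mathrm{LC}_{n,n}^\circ) = \Theta(1/\sqrt{n})$. Since $\operatorname{outrad}(\mathrm{LC}_{n,n}^\circ) = \operatorname{inrad}(\mathrm{LC}_{n,n}) = 1$ (see Exercise 11.9), Proposition 6.3 implies that $\mathrm{LC}_{n,n}^\circ$ has $\exp(\Omega(n))$ vertices, or equivalently that $\mathrm{LC}_{n,n}$ has $\exp(\Omega(n))$ facets.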
Exercise 11.20. (a) The value of the game is $\sum_{i,j}\pi(i,j)m_{ij}\xi_i\eta_j$, where $(\pi(i,j))$ is the distribution on the set of inputs. If $\pi(i_0,j_0) < \frac{1}{4}$, choose $\xi, \eta$ so that $(\xi_i\eta_j)$ agrees with $(m_{ij})$ except for the $(i_0,j_0)$th entry. (b) First, replacing $(\xi,\eta)$ by $(-\xi,-\eta)$ does not change the outcome, so for each such pair of strategies only the sum of their probabilities matters. Next, there are four pairs of that kind that saturate (11.4) and (11.12), with each pair leading to a mismatch in exactly one of the four entries of the $2 \times 2$ matrices $(\xi_i\eta_j)$ and $(m_{ij})$. If one of these four pairs entered into the random strategy with a weight strictly larger than $\frac{1}{4}$, the referee could use as the setting $(i,j)$ the index of the corresponding mismatched entry.
The combination of (a) and (b) describes the von Neumann–Nash-type equilibrium for the CHSH game.
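The counting behind part (b) can be confirmed by enumerating the 16 deterministic strategies. The sketch below assumes that $(m_{ij})$ is the usual CHSH sign pattern with a single $-1$ in the $(2,2)$ entry and that the value is normalized as in part (a); the names `M`, `mismatches`, `strategies_by_mismatch_count` and `value` are ours, introduced only for illustration.

```python
from itertools import product

# Assumed CHSH sign pattern (m_ij): three +1 entries and one -1 entry.
M = ((1, 1), (1, -1))


def mismatches(xi, eta):
    """Entries (i, j) where xi_i * eta_j differs from m_ij."""
    return [(i, j) for i in range(2) for j in range(2) if xi[i] * eta[j] != M[i][j]]


def strategies_by_mismatch_count():
    """Group the 16 deterministic strategies by their number of mismatched entries."""
    table = {}
    for xi in product((-1, 1), repeat=2):
        for eta in product((-1, 1), repeat=2):
            table.setdefault(len(mismatches(xi, eta)), []).append((xi, eta))
    return table


def value(xi, eta, pi):
    """Value sum_ij pi(i,j) m_ij xi_i eta_j of a deterministic strategy."""
    return sum(pi[i][j] * M[i][j] * xi[i] * eta[j] for i in range(2) for j in range(2))


if __name__ == "__main__":
    uniform = ((0.25, 0.25), (0.25, 0.25))
    table = strategies_by_mismatch_count()
    for xi, eta in table[1]:
        print(xi, eta, mismatches(xi, eta), value(xi, eta, uniform))
```

Since the product of the four entries of $(\xi_i\eta_j)$ is $+1$ while that of $(m_{ij})$ is $-1$, the number of mismatches is always odd, so a perfect classical strategy is impossible.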


Exercise 11.21. Alice and Bob have a quantum strategy which gives the value of at least $\frac{\sqrt{2}}{2}$ independently of the distribution $(\pi(i,j))$ on the set of inputs; moreover, if that distribution is not uniform, they have a quantum strategy yielding a value strictly larger than $\frac{\sqrt{2}}{2}$. For the universal strategy, use the same $x_i, y_j$ as those implicit in the hint to Exercise 11.11; it follows from the argument there that, when expressed in terms of $x_i, y_j$, such strategy is unique up to isometries of the Hilbert space in question. If $(\pi(i,j))$ is not uniform, then either $|\pi(1,1)y_1 + \pi(1,2)y_2| + |\pi(2,1)y_1 - \pi(2,2)y_2|$ or $|\pi(1,1)x_1 + \pi(2,1)x_2| + |\pi(1,2)x_1 - \pi(2,2)x_2|$ is strictly larger than $\frac{\sqrt{2}}{2}$.
Exercise 11.22. Extreme points of the set $K_{k,m}$ defined in (11.23) are deterministic distributions that are of the form $p(\xi|i) = \delta_{\xi,f(i)}$ for some function $f$. It follows from the Krein–Milman theorem that any conditional probability distribution is a convex combination of deterministic distributions. Since $\mathrm{LB} = K_{k,m} \mathbin{\hat{\otimes}} K_{l,n}$, the result follows.
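Exercise 11.23. Consider $\lambda \in (0,1)$ and two boxes $P, \bar{P} \in \mathrm{QB}$. Represent $P = \{p(\xi,\eta|i,j)\}$ as $\{\operatorname{Tr}\rho(E_i^\xi \otimes F_j^\eta)\}$ and $\bar{P} = \{\bar{p}(\xi,\eta|i,j)\}$ as $\{\operatorname{Tr}\bar{\rho}(\bar{E}_i^\xi \otimes \bar{F}_j^\eta)\}$, where the operators $E_i^\xi, F_j^\eta, \bar{E}_i^\xi, \bar{F}_j^\eta$ act respectively on the Hilbert spaces $\mathcal{H}_A, \mathcal{H}_B, \bar{\mathcal{H}}_A, \bar{\mathcal{H}}_B$. Verify that
\[ \lambda p(\xi,\eta|i,j) + (1-\lambda)\bar{p}(\xi,\eta|i,j) = \operatorname{Tr}\Bigl(\sigma\bigl((E_i^\xi \oplus \bar{E}_i^\xi) \otimes (F_j^\eta \oplus \bar{F}_j^\eta)\bigr)\Bigr), \]
where $\sigma = \lambda\rho \oplus (1-\lambda)\bar{\rho}$ is a state acting on the diagonal subspace $\mathcal{H}_A \otimes \mathcal{H}_B \oplus \bar{\mathcal{H}}_A \otimes \bar{\mathcal{H}}_B \subset (\mathcal{H}_A \oplus \bar{\mathcal{H}}_A) \otimes (\mathcal{H}_B \oplus \bar{\mathcal{H}}_B)$.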
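Exercise 11.24. Replace $\rho$ by its appropriate purification (see Section 3.4), i.e., represent the state $\rho$ on $\mathcal{H}_A \otimes \mathcal{H}_B$ as $\rho = \operatorname{Tr}_{\mathcal{H}_C}|\psi\rangle\langle\psi|$ for some $\psi \in \mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C$. Then write $\operatorname{Tr}\rho(E_i^\xi \otimes F_j^\eta) = \langle\psi|E_i^\xi \otimes \overline{F}_j^\eta|\psi\rangle$, where $\overline{F}_j^\eta = F_j^\eta \otimes \mathrm{I}_{\mathcal{H}_C}$.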
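Exercise 11.25. (i) By Exercise 11.23, it is enough to show that $\mathrm{RB} \subset \mathrm{QB}$, which is easy. Note that a product box $P \in \mathrm{RB}$ can be represented in a trivial way: take $\mathcal{H}_A = \mathcal{H}_B = \mathbb{C}$, $\rho = \mathrm{I}_{\mathbb{C} \otimes \mathbb{C}}$ and $E_i^\xi = p(\xi|i)\,\mathrm{I}_{\mathbb{C}}$, $F_j^\eta = p(\eta|j)\,\mathrm{I}_{\mathbb{C}}$. (ii) Consider a local box of the form (11.20). By Carathéodory’s theorem, we may assume that the index set $\Lambda$ is finite. To obtain a representation as a quantum box with a separable state, consider $\mathcal{H}_A = \mathcal{H}_B = \mathbb{C}^\Lambda$ and let $(|\lambda\rangle)_{\lambda \in \Lambda}$ be the canonical basis in $\mathbb{C}^\Lambda$. Define $\rho = \sum_\lambda \mu(\lambda)|\lambda\rangle\langle\lambda| \otimes |\lambda\rangle\langle\lambda|$, $E_i^\xi = \sum_\lambda p(\xi|i,\lambda)|\lambda\rangle\langle\lambda|$ and $F_j^\eta = \sum_\lambda p(\eta|j,\lambda)|\lambda\rangle\langle\lambda|$. One checks then that the representation (11.21) holds. Note that this construction is essentially the argument used in Exercise 11.23 to prove convexity of QB, specialized to the present (simpler) setting.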
Exercise 11.26. Since LB is convex it suffices to prove the result when $\rho$ is a product state, in which case it is almost immediate.
Exercise 11.27. Use (11.24) in combination with Exercises 4.13 and 4.15. Note that the affine space $V_{k,m}$ generated by $K_{k,m}$ does not contain 0 and similarly for $K_{l,n}$.
Exercise 11.28. If $p_A(\cdot|i) \in K_{k,m}$ and $p_B(\cdot|j) \in K_{l,n}$, the dimension of the set of boxes $P = \{p(\xi,\eta|i,j)\}$ verifying (11.25) for inputs $i, j$ and for that particular choice of $p_A, p_B$ is $(k-1)(l-1)$. Consequently, $\dim\mathrm{NSB} \le mn(k-1)(l-1) + \dim K_{k,m} + \dim K_{l,n}$, which coincides with the value of $\dim\mathrm{LB}$ calculated in Exercise 11.27. Since $\mathrm{LB} \subset \mathrm{QB} \subset \mathrm{NSB}$, all dimensions must be the same. They are all convex bodies in the affine space $V_{k,m} \mathbin{\hat{\otimes}} V_{l,n}$ analyzed in Exercise 11.27.
Exercise 11.29. Let $P = \{\operatorname{Tr}\rho(E_i^\xi \otimes F_j^\eta)\} \in \mathrm{QB}$ and $\bar{P} = \{\operatorname{Tr}\rho_*(E_i^\xi \otimes F_j^\eta)\}$, where $\rho_*$ is the maximally mixed state. Since $\rho_*$ is an interior point of Sep, it follows from Exercise 11.26 that the intersection of the segment $[\bar{P}, P]$ with LB is a segment of nonzero length, in particular $P$ belongs to the affine subspace generated by LB. Since $P \in \mathrm{QB}$ was arbitrary, we conclude that QB is contained in that subspace and, in particular, $\dim\mathrm{QB} \le \dim\mathrm{LB}$. (The converse inequality is trivial.)
Exercise 11.30. If $H \subset \mathbb{R}^N$ is an affine subspace not containing 0 and if $V$ is an affine functional on $\mathbb{R}^N$, then there exists $v \in \mathbb{R}^N$ such that $\langle v, x\rangle = V(x)$ for $x \in H$.

Exercise 11.31. The first part is straightforward from the definitions. For the second part, note that we cannot have $\mathrm{LB} \subset b\,\mathrm{QB}$ if $|b| < 1$, and then appeal to the first part.
Exercise 11.32. (i) By Exercise 11.30, we can use affine functionals to exhibit violations. Given such functional $V$, the largest violation among functionals of the form $V_s = s + V$ (where $s \in \mathbb{R}$) occurs when $V_s(\mathrm{LB})$ is an interval of the form $[-a, a]$. Hence if $V$ yields the maximal quantum violation, then
\[ [-a, a] = V(\mathrm{LB}) \subset V(\mathrm{QB}) \subset [-a\,\omega_Q(V), a\,\omega_Q(V)] \]
and the last two intervals share (at least) one of the endpoints. In particular, the ratio of the lengths of the intervals $V(\mathrm{QB})$ and $V(\mathrm{LB})$ is between $(1 + \omega_Q(V))/2$ and $\omega_Q(V)$. (ii) Replace everywhere QB by NSB.
Exercise 11.33. First, the PR-box yields value 4 (in the normalization given by
(11.29)). In the opposite direction, use the fact that, for each $i, j$, $p(\xi, \eta|i, j)$ is a
joint density to deduce that $\bigl|\sum_{\xi,\eta} p(\xi, \eta|i, j)\bigr| \le 1$. The second statement follows
then from Exercise 11.15 and the proof of Proposition 11.19.
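
The value 4 can be checked by direct enumeration. The sketch below assumes the standard CHSH functional (a minus sign on the input pair (1, 1)) with outputs in {+1, -1} and inputs in {0, 1}; the normalization (11.29) is not reproduced here, so the function names and the sign convention are illustrative assumptions.

```python
from itertools import product

def pr_box(xi, eta, i, j):
    """PR-box: outputs xi, eta in {+1, -1}, inputs i, j in {0, 1}.
    Outputs are uniformly random subject to xi * eta == (-1) ** (i * j)."""
    return 0.5 if xi * eta == (-1) ** (i * j) else 0.0

def chsh_value(box):
    """Assumed CHSH functional: sum of the four correlators,
    with a minus sign on the input pair (1, 1)."""
    total = 0.0
    for i, j in product((0, 1), repeat=2):
        corr = sum(xi * eta * box(xi, eta, i, j)
                   for xi, eta in product((1, -1), repeat=2))
        total += (-1) ** (i * j) * corr
    return total

def best_local_value():
    """Maximum of the same functional over deterministic local strategies."""
    best = float("-inf")
    for a0, a1, b0, b1 in product((1, -1), repeat=4):
        a, b = (a0, a1), (b0, b1)
        value = sum((-1) ** (i * j) * a[i] * b[j]
                    for i, j in product((0, 1), repeat=2))
        best = max(best, value)
    return best

print(chsh_value(pr_box))   # value attained by the PR-box
print(best_local_value())   # classical (local) maximum
```

Under these assumptions the PR-box attains the algebraic maximum of the functional, while deterministic local strategies stop at 2.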

Exercise 11.34. Reverse engineer the proof of Proposition 11.8 starting from the
configuration $x_1, y_1, x_2, y_2 \in \mathbb{R}^2$ from the hint to Exercise 11.11. This leads (for
example) to $\rho$ being the maximally entangled state on $\mathbb{C}^2 \otimes \mathbb{C}^2$, the isometries
$X_1 = \sigma_x$, $X_2 = \sigma_z$ (the Pauli matrices), $Y_1 = 2^{-1/2}(\sigma_x + \sigma_z)$, $Y_2 = 2^{-1/2}(\sigma_x - \sigma_z)$
and, finally, to the POVMs consisting of spectral projections of the $X_i$'s and $Y_j$'s (as in
the formulas following (11.13)). The last step is somewhat tedious, but instructive.
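
The observables named in this hint can be checked numerically. The following sketch uses plain nested lists for the matrices and assumes the standard CHSH sign pattern (minus sign on the pair $X_2, Y_2$); the helper names `kron`, `combine` and `expectation` are introduced here for illustration only.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def combine(a, b, ca, cb):
    """Linear combination ca * a + cb * b of two 2x2 matrices."""
    return [[ca * a[r][c] + cb * b[r][c] for c in range(2)] for r in range(2)]

def expectation(op, psi):
    """<psi| op |psi> for a real vector psi and a real matrix op."""
    n = len(psi)
    return sum(psi[r] * op[r][c] * psi[c] for r in range(n) for c in range(n))

s = 1 / math.sqrt(2)
sigma_x = [[0.0, 1.0], [1.0, 0.0]]
sigma_z = [[1.0, 0.0], [0.0, -1.0]]

X = [sigma_x, sigma_z]                       # Alice: X_1, X_2
Y = [combine(sigma_x, sigma_z, s, s),        # Bob: Y_1 = (sigma_x + sigma_z)/sqrt(2)
     combine(sigma_x, sigma_z, s, -s)]       # Bob: Y_2 = (sigma_x - sigma_z)/sqrt(2)
phi_plus = [s, 0.0, 0.0, s]                  # (|00> + |11>)/sqrt(2)

corr = [[expectation(kron(X[i], Y[j]), phi_plus) for j in range(2)]
        for i in range(2)]
chsh = corr[0][0] + corr[0][1] + corr[1][0] - corr[1][1]
print(chsh, 2 * math.sqrt(2))
```

With this sign convention the four correlators are $\pm 2^{-1/2}$ and the combination equals $2\sqrt{2}$.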

Exercise 11.35. (i) The composition rules for Pauli matrices are in Exercise 2.4.
(ii) Multiply all the numbers in the matrix. (iii)(a) Use part (ii); it follows that
the probability of winning under any classical strategy is at most 8/9. (b) First,
the product of the elements of Alice's output string must be an eigenvalue of the
composition of the corresponding operators, and similarly for Bob, and therefore
by (i)(b) their answers are valid. Next, we can compute (as in Section 3.8) the joint
probability distribution of outcomes when Alice and Bob measure a single shared
$\varphi^+$ in the eigenbasis of a Pauli matrix: for $\sigma_x$ and $\sigma_z$ both outcomes are always
equal, and for $\sigma_y$ both outcomes are always different. It follows that for each of the
entries in Table 11.1, the outcomes of Alice's and Bob's measurements on $\varphi^+ \otimes \varphi^+$
always coincide.
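
The classical bound 8/9 in part (iii)(a) can be confirmed by brute force. The sketch below assumes the usual rules of the magic square game, which may differ in detail from the conventions of the exercise: Alice fills a row with signs of product +1, Bob fills a column with signs of product -1, and they win when they agree on the shared cell. Deterministic strategies suffice because shared randomness only produces convex mixtures of them.

```python
from itertools import product

def triples(parity):
    """All strings in {+1, -1}^3 whose product equals `parity`."""
    return [t for t in product((1, -1), repeat=3) if t[0] * t[1] * t[2] == parity]

ROW_ANSWERS = triples(+1)   # Alice's admissible rows (assumed convention)
COL_ANSWERS = triples(-1)   # Bob's admissible columns (assumed convention)

def wins(alice, bob):
    """Number of input pairs (row r, column c) on which the shared cell agrees.
    alice[r] is Alice's answer for row r, bob[c] is Bob's answer for column c."""
    return sum(alice[r][c] == bob[c][r] for r in range(3) for c in range(3))

best = max(wins(alice, bob)
           for alice in product(ROW_ANSWERS, repeat=3)
           for bob in product(COL_ANSWERS, repeat=3))
print(best, "winning input pairs out of 9")
```

The search covers all 4096 pairs of deterministic strategies; no pair wins on all 9 inputs, which is the parity obstruction from part (ii).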

Chapter 12

Exercise 12.1. For the first part, use the triangle inequality. For the second part,
consider the POVM $(P, I - P)$ where $P$ is the projection onto the range of $(\rho - \sigma)_+$.
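
For a pair of qubit states with real entries this POVM can be written in closed form. The sketch below is only an illustration under that assumption (and $\rho \ne \sigma$); the function names are hypothetical. It checks that $\operatorname{Tr} P(\rho - \sigma)$ equals half of the trace norm of $\rho - \sigma$.

```python
import math

def positive_part_projection(rho, sigma):
    """For real symmetric 2x2 density matrices rho != sigma, return (lam, P):
    the positive eigenvalue of rho - sigma and the projection onto the
    corresponding eigenvector, i.e. onto the range of (rho - sigma)_+."""
    a = rho[0][0] - sigma[0][0]
    b = rho[0][1] - sigma[0][1]
    # rho - sigma = [[a, b], [b, -a]] is traceless, with eigenvalues +lam and -lam.
    lam = math.hypot(a, b)
    p = [[0.5 * (1 + a / lam), 0.5 * b / lam],
         [0.5 * b / lam, 0.5 * (1 - a / lam)]]
    return lam, p

def trace_of_product(m, n):
    """Tr(m n) for 2x2 matrices."""
    return sum(m[r][c] * n[c][r] for r in range(2) for c in range(2))

rho = [[0.75, 0.25], [0.25, 0.25]]
sigma = [[0.5, 0.0], [0.0, 0.5]]
delta = [[rho[r][c] - sigma[r][c] for c in range(2)] for r in range(2)]

lam, p = positive_part_projection(rho, sigma)
trace_norm = 2 * lam                  # sum of |eigenvalues| of rho - sigma
gap = trace_of_product(p, delta)      # Tr P (rho - sigma)
print(gap, trace_norm / 2)
```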

Exercise 12.2. Show that separability and PPT properties are preserved under
the action of a separable channel.

Appendix A

Exercise A.1. Use simple calculus (differentiation) for small t and (for example)
the upper Komatu inequality (A.4) for large t.
Exercise A.2. (i) is elementary calculus. (ii) Let $\delta(x)$ be either $f_+(x) - f(x)$ or
$f(x) - f_-(x)$. We have $\delta'(x) \le x\delta(x)$. Since $\delta(0) \ge 0$, and $\delta$ vanishes at $+\infty$, the
result follows (otherwise consider a local minimum of $\delta$).

Exercise A.3. Let $\mu$ be such a probability measure. The Fourier transform
$\hat{\mu} : u \mapsto \int_{\mathbb{R}^n} \exp(i\langle x, u \rangle)\, d\mu(x)$ satisfies $\hat{\mu}(u + v) = \hat{\mu}(u)\hat{\mu}(v)$ for $u \perp v$. Moreover,
$\hat{\mu}$ is radial. If $f(t)$ denotes the value of $\hat{\mu}$ on the sphere of radius $\sqrt{t}$, we have
$f(t + u) = f(t)f(u)$. Since $f$ is continuous and real-valued ($\mu$ is even by assumption),
this implies $f(t) = \exp(-t\sigma^2/2)$ for some $\sigma \ge 0$, and therefore $\mu$ is a Gaussian
measure.
Exercise A.4. We have $\kappa_n = \mathbf{E}|G| \le (\mathbf{E}|G|^2)^{1/2} = \sqrt{n}$. For the lower bound, use
the functional equation $\Gamma(t + 1) = t\Gamma(t)$. Note also that $\kappa_{n+1}/\kappa_n = n/\kappa_n^2$.
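
Both relations are easy to check numerically. The sketch below assumes the closed form $\kappa_n = \sqrt{2}\,\Gamma((n+1)/2)/\Gamma(n/2)$ (the case $\alpha = 1$ of the formula in the hint to Exercise A.6) and evaluates it through the log-gamma function to avoid overflow.

```python
import math

def kappa(n):
    """kappa_n = E|G| for a standard Gaussian vector G in R^n, computed as
    sqrt(2) * Gamma((n + 1) / 2) / Gamma(n / 2) via log-gamma."""
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))

for n in (1, 2, 5, 50, 500):
    k = kappa(n)
    # Columns: n, kappa_n, sqrt(n), kappa_n * kappa_{n+1} (which should equal n).
    print(n, k, math.sqrt(n), k * kappa(n + 1))
```

The last column restates the identity $\kappa_{n+1}/\kappa_n = n/\kappa_n^2$ in the form $\kappa_n\kappa_{n+1} = n$, which is the functional equation of $\Gamma$ in disguise.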

Exercise A.5. If $\alpha_n = \sqrt{n - 1/2}$ and $\beta_n = \sqrt{n - \frac{n}{2n+1}}$, it is elementary to
check that the sequences $\kappa_{2n}/\alpha_{2n}$, $\kappa_{2n+1}/\alpha_{2n+1}$, $\beta_{2n}/\kappa_{2n}$ and $\beta_{2n+1}/\kappa_{2n+1}$ are
non-increasing. The result follows since all these sequences converge to 1.
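
A quick numerical sanity check of the resulting two-sided bound $\alpha_n \le \kappa_n \le \beta_n$ and of the monotonicity claims, again assuming $\kappa_n = \sqrt{2}\,\Gamma((n+1)/2)/\Gamma(n/2)$; the helper names and the tolerance are illustrative choices.

```python
import math

def kappa(n):
    """kappa_n = sqrt(2) * Gamma((n + 1) / 2) / Gamma(n / 2), via log-gamma."""
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))

def alpha(n):
    return math.sqrt(n - 0.5)

def beta(n):
    return math.sqrt(n - n / (2 * n + 1))

def non_increasing(seq, tol=1e-12):
    """True if each term is at most the previous one (up to rounding)."""
    return all(later <= earlier + tol for earlier, later in zip(seq, seq[1:]))

evens = range(2, 100, 2)
odds = range(1, 100, 2)
checks = {
    "kappa/alpha, even n": non_increasing([kappa(n) / alpha(n) for n in evens]),
    "kappa/alpha, odd n": non_increasing([kappa(n) / alpha(n) for n in odds]),
    "beta/kappa, even n": non_increasing([beta(n) / kappa(n) for n in evens]),
    "beta/kappa, odd n": non_increasing([beta(n) / kappa(n) for n in odds]),
}
print(checks)
print(all(alpha(n) <= kappa(n) <= beta(n) for n in range(1, 200)))
```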

Exercise A.6. Express $\int_{\mathbb{R}^n} f \, d\gamma_n$ in polar coordinates. The factor is
$\kappa_n(\alpha) = \mathbf{E}|G|^\alpha = 2^{\alpha/2}\,\Gamma((n + \alpha)/2)/\Gamma(n/2)$ and, under some minimal regularity assumptions
on $f$, the formula is valid for $\alpha > -n$.

Appendix B

Exercise B.1. Use Stirling’s formula and the bound $n! \ge (n/e)^n$.
Exercise B.2. This is immediate from (B.4).
Exercise B.3. (i)–(ii) Easy. (iii) Use the non-commutative Hölder inequality and
the fact that $A^\dagger A$ and $AA^\dagger$ (and hence $\sqrt{A^\dagger A}$ and $\sqrt{AA^\dagger}$) have the same non-zero
eigenvalues.

Exercise B.4. The argument is essentially included in the proof of Theorem 2.3.
Exercise B.5. (i) follows from Proposition B.1 and from the formula $\int_0^1 \|\gamma'(t)\| \, dt$
for the length of an absolutely continuous curve $\gamma : [0, 1] \to G$. (ii) The singular
numbers of $U - V$ are the same as those of $I - U^{-1}V$ and hence (in the notation of
part (i)) equal $|1 - e^{i\theta_j}| = 2\sin(\theta_j/2)$.
Exercise B.6. By rescaling the parameter $t$ we can achieve $\|A\|_\infty \le \pi$. Next,
if $s, t \in \mathbb{R}$ with $|s - t| < 1$, apply Proposition B.1 with $U = e^{isA}$ and $V = e^{itA}$.
Finally, use the fact that the map $X \mapsto WX$ is an isometry with respect to any
Schatten $p$-norm.
The last assertion follows from the uniqueness part of Proposition B.1.
Note that allowing right (or two-sided) cosets does not increase generality since
$e^{itA}W = We^{itW^\dagger AW}$.
Exercise B.7. Use the formula
$\exp\begin{pmatrix} 0 & -\theta \\ \theta & 0 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$
and the fact that $\mathbb{R}^n$ can be decomposed as an orthogonal direct sum of subspaces of dimension
at most 2 that are invariant for $U$. For the last equality apply Exercise B.5(i).
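
The exponential formula for the 2×2 rotation generator can be verified with a truncated power series. This is a minimal sketch; the helper names, the sample angle and the number of series terms are arbitrary choices.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def mat_exp(m, terms=40):
    """Matrix exponential of a 2x2 matrix via the truncated series sum m^k / k!."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = mat_mul(term, m)
        term = [[x / k for x in row] for row in term]   # now term == m^k / k!
        result = [[result[r][c] + term[r][c] for c in range(2)] for r in range(2)]
    return result

theta = 0.7
generator = [[0.0, -theta], [theta, 0.0]]
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
approx = mat_exp(generator)
error = max(abs(approx[r][c] - rotation[r][c]) for r in range(2) for c in range(2))
print(error)
```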
Exercise B.8. (i) For an integer $n$, write
$e^{iB} - e^{iA} = \sum_{k=0}^{n-1} e^{ikA/n}\,(e^{iB/n} - e^{iA/n})\,e^{i(n-1-k)B/n}$.
It follows that $\|e^{iB} - e^{iA}\| \le n\|e^{iB/n} - e^{iA/n}\|$. Conclude by
using the bound $\|e^X - e^Y\| \le \|X - Y\| \exp(\max(\|X\|, \|Y\|))$, which follows from the
series expansion, and take $n \to \infty$. Alternatively, consider $\varphi(t) = e^{i(1-t)B}e^{itA}$ and
show that $\varphi'(t) = ie^{i(1-t)B}(A - B)e^{itA}$. These arguments work for any unitarily
invariant norm. (ii) The functional inequality for $L(\cdot)$ follows from
$e^{iB} - e^{iA} = 2(e^{iB/2} - e^{iA/2}) + (e^{iB/2} - I)(e^{iB/2} - e^{iA/2}) + (e^{iB/2} - e^{iA/2})(e^{iA/2} - I)$.
Iterating (and using the simple fact that $L(\theta)$ tends to 1 as $\theta$ goes to 0) gives that
$L(\theta) \ge \prod_{k=1}^{\infty} (1 - |1 - e^{i\theta/2^k}|) = \prod_{k=2}^{\infty} (1 - 2\sin(\theta/2^k))$, which is easy to estimate
numerically.
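
A possible numerical evaluation of the product is sketched below. The truncation level and the sample values of $\theta$ are arbitrary; the two functions compute the two forms of the product appearing in the hint so that they can be compared.

```python
import cmath
import math

def product_sine_form(theta, terms=60):
    """Truncation of prod_{k >= 2} (1 - 2 sin(theta / 2^k))."""
    value = 1.0
    for k in range(2, terms + 2):
        value *= 1.0 - 2.0 * math.sin(theta / 2 ** k)
    return value

def product_chord_form(theta, terms=60):
    """Truncation of prod_{k >= 1} (1 - |1 - exp(i theta / 2^k)|)."""
    value = 1.0
    for k in range(1, terms + 1):
        value *= 1.0 - abs(1.0 - cmath.exp(1j * theta / 2 ** k))
    return value

for theta in (0.1, 0.5, 1.0):
    print(theta, product_sine_form(theta), product_chord_form(theta))
```

The factors approach 1 geometrically fast, so a few dozen terms already give the infinite product to machine precision.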
Exercise B.9. Let $V \in V_0H$; then $V = V_0h$ and $U_1 = U_0h'$ for some $h, h' \in H$.
Now note that $\|U_0 - V\|_p = \|U_0 - V_0h\|_p = \|U_0h' - V_0hh'\|_p = \|U_1 - V_0hh'\|_p$ and
$V_0hh' \in V_0H$; taking the infimum over $V$ shows one inequality and the other follows by
symmetry. Similarly, if $[0, 1] \ni t \mapsto U_0e^{itA}$ is a geodesic connecting $U_0$ to $V \in V_0H$,
then $t \mapsto U_0e^{itA}h' = U_1e^{ith'^\dagger Ah'}$ is a curve of the same length connecting $U_1$ to
$Vh' \in V_0H$.
Exercise B.10. In the notation from Exercise B.9 and its hint, we may assume
that $g_p(U_0, V_0)$ equals the distance between $U_0H$ and $V_0H$ (in the sense of $g_p$; note
that the distance is attained, for example by compactness). Next, consider the
geodesic connecting $U_0$ to $V_0$, whose length is equal to that distance, and deduce
from Exercise B.9 that the quotient map $\mathrm{O}(n) \to \mathrm{O}(n)/H$ is an isometry when
restricted to that geodesic.
Exercise B.11. In the notation of (B.9) let $E_i = \operatorname{span}\{x_i, y_i\}$. The subspaces
$E_1, \dots, E_k$ are pairwise orthogonal and invariant under $P_E$ and $P_F$; they are
2-dimensional for $s_i < 1$ (which is equivalent to $x_i \ne y_i$) and 1-dimensional otherwise.
We now note that the eigenvalues of $|x_i\rangle\langle x_i| - |y_i\rangle\langle y_i|$ are $\pm\sin\theta_i$; since
$P_E = \sum_i |x_i\rangle\langle x_i|$ and $P_F = \sum_i |y_i\rangle\langle y_i|$, the principal angles $\theta_i \ne 0$ can be retrieved from
the eigenvalues of $P_E - P_F$. It remains to use the relation $P_E - P_F = P_{F^\perp} - P_{E^\perp}$.
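
The eigenvalue claim for a single pair of unit vectors at angle $\theta$ can be checked in coordinates. The sketch below places $x$ on the first axis and $y$ at angle $\theta$ from it (an assumption made only for convenience) and uses the closed-form eigenvalues of a real symmetric 2×2 matrix.

```python
import math

def projector_difference_eigenvalues(theta):
    """Eigenvalues of |x><x| - |y><y| for unit vectors x, y in R^2 at angle theta."""
    x = (1.0, 0.0)
    y = (math.cos(theta), math.sin(theta))
    # Entries of the real symmetric matrix D = |x><x| - |y><y|.
    d00 = x[0] * x[0] - y[0] * y[0]
    d01 = x[0] * x[1] - y[0] * y[1]
    d11 = x[1] * x[1] - y[1] * y[1]
    # Symmetric 2x2 eigenvalues: mean -/+ sqrt(((d00 - d11) / 2)^2 + d01^2).
    mean = 0.5 * (d00 + d11)
    radius = math.hypot(0.5 * (d00 - d11), d01)
    return mean - radius, mean + radius

for theta in (0.1, 0.7, 1.3):
    print(theta, projector_difference_eigenvalues(theta), math.sin(theta))
```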
Exercise B.12. Show first the inequalities “$\le$”. In the notation of (B.9) and of
the hint to Exercise B.11, define $W_0 \in \mathrm{O}(n)$ to be a rotation on each 2-dimensional
space $E_j$ such that $W_0x_j = y_j$ (i.e., a rotation by $\theta_j$) and to be the identity on the
orthogonal complement of the union of such $E_j$'s. The nonzero singular values of
$W_0 - I$ are then $|e^{i\theta_j} - 1| = 2\sin(\theta_j/2)$, each repeated twice, which combined with
(B.13) shows the needed upper bound on $\tilde{h}_p(E, F)$. For an upper bound on the
geodesic distance $\tilde{g}_p(E, F)$, consider a family $W(t)$, $t \in [0, 1]$, where $W(t)$ acts as a
rotation by $t\theta_j$ on $E_j$, and calculate the length of the path $t \mapsto W(t)$ with respect
to the Schatten $p$-norm. (Alternatively, you may note that $W(t) = e^{itA}$ for the
appropriate $A \in M_n^{\mathrm{sa}}$ and refer to the calculation from Exercise B.5.)
For the opposite inequality, show the following claim: If $W \in \mathrm{O}(n)$ verifies $WE = F$,
then the singular values of $W - I$ dominate those of $W_0 - I$, in the sense that
$s_j(W_0 - I) \le s_j(W - I)$ for $1 \le j \le n$. The lower bound on $\tilde{h}_p(E, F)$ follows
then immediately from (B.13); to get the lower bound on $\tilde{g}_p(E, F)$, observe that,
by Exercise B.10, the optimal geodesic is of the form $W(t) = U_0e^{itA}$, $t \in [0, 1]$, and
that its length (which is $\|A\|_p$ by Exercise B.5(i)) depends in a straightforward
way on the singular values of $W(1) - I$.
To show the claim, fix $s \in (0, 1)$ and let $\theta \in (0, \pi/2)$ be such that $s = \cos\theta$.
Next, consider $F_s = \operatorname{span}\{y_j : s_j \le s\}$. If $y \in F_s$ is a unit vector, show that
$|y - x| \ge |e^{i\theta} - 1|$ for all $x \in S^{n-1} \cap E$ and deduce that whenever $WE = F$, then
there are at least $\dim F_s$ singular values of $(W - I)|_E$ that are at least $|e^{i\theta} - 1|$.
Since, by Exercise B.11, the same is true for $(W - I)|_{E^\perp}$, the claim follows.
Exercise B.13. It follows from the argument sketched in the hint to Exercise
B.11 that the (nonzero) eigenvalues of $P_E - P_F$ are $\pm\sin\theta_i$ and so $\|P_E - P_F\|_p =
2^{1/p}\|(\sin\theta_1, \dots, \sin\theta_k)\|_p$. Comparing with the formulas from Exercise B.12 gives
$\frac{2}{\pi}\,\tilde{g}_p(E, F) \le \|P_E - P_F\|_p \le \tilde{g}_p(E, F)$ and $\frac{1}{\sqrt{2}}\,\tilde{h}_p(E, F) \le \|P_E - P_F\|_p \le \tilde{h}_p(E, F)$.
Finally, since $\|P_E - P_F\|_p$ and $\tilde{g}_p$ differ only in terms of the 3rd order and higher,
they both induce the same geodesic distance.
Exercise B.14. Let $(x_1, \dots, x_n)$ be independent standard Gaussian vectors in $\mathbb{R}^n$. Since the
set of singular matrices has measure 0 in $M_n$, these vectors are almost surely linearly
independent. Moreover, the orthogonal matrix obtained by applying the Gram–Schmidt
procedure to the matrix with columns $x_1, \dots, x_n$ is Haar-distributed on
$\mathrm{O}(n)$. It follows that the subspace $\operatorname{span}(x_1, \dots, x_k)$ is Haar-distributed on $\mathrm{Gr}(k, \mathbb{R}^n)$.
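
The construction in this hint translates directly into a sampling procedure. The sketch below uses only the standard library; the function names, the dimension and the seed are illustrative assumptions. It returns the orthonormalized vectors, the first $k$ of which span the random $k$-dimensional subspace.

```python
import math
import random

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            coeff = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis

def random_orthonormal_basis(n, rng):
    """Apply Gram-Schmidt to n independent standard Gaussian vectors in R^n."""
    gaussians = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    return gram_schmidt(gaussians)

rng = random.Random(0)
n, k = 5, 2
basis = random_orthonormal_basis(n, rng)
subspace = basis[:k]   # orthonormal basis of a random k-dimensional subspace

# Gram matrix of the output; it should be close to the identity.
gram = [[sum(a * b for a, b in zip(u, v)) for v in basis] for u in basis]
defect = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
             for i in range(n) for j in range(n))
print(defect)
```

The code only checks orthonormality; the Haar property itself rests on the rotation invariance of the Gaussian distribution, as the hint indicates.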
Exercise B.15. Let $g_1, \dots, g_k$ (resp., $h_1, \dots, h_k$) be independent standard Gaussian
vectors in $E$ (resp., in $E^\perp$). For every $\varepsilon > 0$, the random subspace $\operatorname{span}\{g_i + \varepsilon h_i :
1 \le i \le k\}$ is distributed with respect to some Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$ and
converges to $E$ almost surely as $\varepsilon$ goes to 0.
Exercise B.16. The answer to both questions is “no.” The reason is that $\mathrm{SO}(k) \times
\mathrm{SO}(n - k)$ is a proper subgroup of the stabilizer of $\mathbb{R}^k$ under the canonical action of
$\mathrm{SO}(n)$ on $\mathrm{Gr}(k, \mathbb{R}^n)$, and similarly in the complex case. In the complex case, even
the dimensions do not add up: we have
$\dim \mathrm{SU}(n) - \bigl(\dim \mathrm{SU}(k) + \dim \mathrm{SU}(n - k)\bigr) = 2k(n - k) + 1 > 2k(n - k) = \dim \mathrm{Gr}(k, \mathbb{C}^n)$
(note that these are real dimensions).
A more complete answer is that $\mathrm{SO}(n)/(\mathrm{SO}(k) \times \mathrm{SO}(n - k))$ identifies with
the set of oriented $k$-dimensional subspaces of $\mathbb{R}^n$ and is, in a canonical way, a
two-fold cover of $\mathrm{Gr}(k, \mathbb{R}^n)$. (A particular example of this phenomenon is $S^{n-1} =
\mathrm{SO}(n)/\mathrm{SO}(n - 1)$ being a two-fold cover of $\mathrm{Gr}(1, \mathbb{R}^n) = \mathrm{P}(\mathbb{R}^n)$.) Similarly, the set
$\mathrm{SU}(n)/(\mathrm{SU}(k) \times \mathrm{SU}(n - k))$ identifies, in a way, with the set of “signed” $k$-dimensional
subspaces of $\mathbb{C}^n$. See also Exercise B.17.

Exercise B.17. There are two fine points: first, the cosets of $H$ are subsets of the
cosets of $\mathrm{O}(k) \times \mathrm{O}(n - k)$ and so the distances (extrinsic or geodesic) between the
former may be larger than between the latter. Next, geodesics connecting cosets
(as in Exercise B.10) may a priori turn out to be longer if we insist that they
are entirely contained in $(\mathrm{SO}(n), g_p)$ (as opposed to the larger space $(\mathrm{O}(n), g_p)$).
Similar issues arise in the complex case.
To address these concerns, check that $W$ and $W(t)$ suggested in the hint to Exercise
B.12 are minimizers that work simultaneously for $\mathrm{O}(n)$ and for $\mathrm{SO}(n)$ (resp., for
$\mathrm{U}(n)$ and for $\mathrm{SU}(n)$).


Exercise B.19. (a) (i) By Proposition 2.29, automorphisms of $L_4$ are maps of the
form $x \mapsto \Psi^{-1}\bigl(V\Psi(x)V^\dagger\bigr)$ or $x \mapsto \Psi^{-1}\bigl(V\Psi(x)^TV^\dagger\bigr)$. (ii) Use (2.6) to show that
automorphisms from (i) preserve $q$ and hence belong to $\mathrm{O}^+(1, 3)$ iff $|\det V| = 1$. (iii)
Check that if $V = I$, then the two maps from (i) belong respectively to $\mathrm{SO}^+(1, 3)$
and $\mathrm{O}^+(1, 3) \setminus \mathrm{SO}^+(1, 3)$, and appeal to connectedness of $\mathrm{SL}(2, \mathbb{C})$.
(b) $V \mapsto \Psi_V$ being a homomorphism is straightforward; for the part about the
kernel note that if $x = \Psi^{-1}(|\xi\rangle\langle\xi|)$ for some $\xi \in \mathbb{C}^2$, then $\Phi_V(x) = x$ can be
rewritten as $V|\xi\rangle\langle\xi|V^\dagger = |\xi\rangle\langle\xi|$; this means that $\Psi_V = I_{\mathbb{R}^4}$ implies that every
$\xi \in \mathbb{C}^2 \setminus \{0\}$ is an eigenvector of $V$, which is only possible if $V$ is a multiple of $I$.

Appendix C

Exercise C.1. Start by showing that $\Phi = |u\rangle\langle v| \ne 0$ belongs to $P(C)$ iff $u \in C$
and $v \in C^*$ (or $u \in -C$, $v \in -C^*$); necessity easily follows. For sufficiency, start
by observing that if $u$ and $v$ generate extreme rays in the respective cones and if,
for some $\Delta$, $\Phi \pm \Delta \in P(C)$, then $\Delta(C) \subset \mathbb{R}_+u$ and $\Delta^*(C^*) \subset \mathbb{R}_+v$. To show that
(for example) $\Delta(x) \in \mathbb{R}_+u$ for $x \in C$, consider separately the cases $\Phi(x) = 0$ and
$\Phi(x) \ne 0$.
Exercise C.2. If $\Psi$ is the automorphism in question, then $\Psi^*J\Psi = \mu J + Q$ and
similarly $(\Psi^{-1})^*J(\Psi^{-1}) = \nu J + Q'$ with $\mu, \nu \ge 0$ and $Q, Q'$ positive semi-definite
(justify). Next, show that this implies that $(1 - \mu\nu)J = \nu Q + \Psi^*Q'\Psi$, which is only
possible if $\mu\nu = 1$ and $Q = Q' = 0$.
Exercise C.3. If $n = 4$, this follows from Corollary 2.30, modulo identifying completely
positive automorphisms of $\mathrm{PSD}(\mathbb{C}^2)$ with $\mathrm{SO}^+(1, 3)$ (see the hint to Exercise
B.19(a)). Deduce the conclusion for $n > 4$: when looking for an automorphism $\Psi$
such that $\Psi(u) = v$, consider any 4-dimensional subspace $E \subset \mathbb{R}^n$ containing $e_0$, $u$
and $v$, and define $\Psi$ separately on $E$ and $E^\perp$.
A similar line of argument allows one to derive the full statement ($n \ge 2$) from Exercise
B.18.
Exercise C.4. “Reverse engineer” the failure of the proof of the converse implication
from Proposition C.1 when $n = 2$. Alternatively, notice that $L_2$ is
isomorphic to the positive quadrant $\mathbb{R}^2_+$ and that the structure of the cone $P(\mathbb{R}^n_+)$
is particularly simple.

Appendix D
Exercise D.1. One fine point is in verifying that the bases are nontrivial and that
they generate the respective cones, but this is assured by the hypothesis $\langle e^*, e \rangle = 1$
(cf. Exercise 1.28).

Exercise D.2. Start by noticing that $\|x\|_{(K-a)^\circ} \le \|x\|_{K^\circ}$ if $\langle x, a \rangle \ge 0$ while
$\|x\|_{(K-a)^\circ} \ge \|x\|_{K^\circ}$ if $\langle x, a \rangle \le 0$ (this may be more obvious if instead of the gauges
we consider the support functions $w(\cdot, \cdot)$, see Section 4.3.3), and that for some $x$
(e.g., $x = \pm a$) the inequalities are strict. Deduce that $K^\circ \cap H^+ \subsetneq (K - a)^\circ \cap H^+$,
where $H^+ = \{x \in \mathbb{R}^n : \langle x, a \rangle \ge 0\}$, with the reverse inclusion for the other
halfspace, and show that this implies $\int_{(K-a)^\circ} \langle x, b \rangle \, dx > 0$.
Exercise D.3. (i) and (ii) By linear invariance, we may assume that $E$ is a translate
of $B_2^n$. Identify it with the base of the Lorentz cone $L_{n+1}$ and apply Lemma
D.1. (iii) This is immediate if we use the full force of Proposition D.2. For a
proof that does not use the uniqueness part, note that if $K$ is centrally symmetric
and has centroid at the origin, then it is 0-symmetric. Apply this observation to
$K = (E - a)^\circ$ and use the bipolar theorem.

Exercise D.4. Let $u$ be such that the segment $[b - u, b + u]$ lies in the interior of
$K$. We now consider $a = a(t) := b + tu$ for $t \in [-1, 1]$ and the corresponding solid
cones $T_a$. The first-order variation as $t \to 0$ is, for some constant $C(b, u) > 0$,
$\operatorname{vol}_{n+1}(T_a) = \operatorname{vol}_{n+1}(T_b) + C(b, u)\, t \int_{B_b} \langle u, x - e_0 \rangle \, dx + o(|t|)$.
If $b$ is a local extremum, it follows that $\int_{B_b} x \, dx = e_0$.
Bibliography

[AAGM15] Shiri Artstein-Avidan, Apostolos Giannopoulos, and Vitali D. Milman, Asymptotic
geometric analysis. Part I, Mathematical Surveys and Monographs, vol. 202, American
Mathematical Society, Providence, RI, 2015. 87, 105, 143, 146, 147, 186, 207,
208, 209
[AAKM04] S. Artstein-Avidan, B. Klartag, and V. Milman, The Santaló point of a function,
and a functional form of the Santaló inequality, Mathematika 51 (2004), no. 1-2,
33–48 (2005). 105
[AAM06] S. Artstein-Avidan and V. D. Milman, Logarithmic reduction of the level of randomness
in some probabilistic geometric constructions, J. Funct. Anal. 235 (2006),
no. 1, 297–329. 207
[AAS15] Shiri Artstein-Avidan and Boaz A. Slomka, A note on Santaló inequality for the
polarity transform and its reverse, Proc. Amer. Math. Soc. 143 (2015), no. 4,
1693–1704. 105
[AdRBV98] Juan Arias-de Reyna, Keith Ball, and Rafael Villa, Concentration of the distance in
finite-dimensional normed spaces, Mathematika 45 (1998), no. 2, 245–252. 144
[AGMJV16] David Alonso-Gutiérrez, Bernardo González Merino, C. Hugo Jiménez, and Rafael
Villa, Rogers–Shephard inequality for log-concave functions, Journal of Functional
Analysis 271 (2016), no. 11, 3269–3299. 105
[AGZ10] Greg W. Anderson, Alice Guionnet, and Ofer Zeitouni, An introduction to random
matrices, Cambridge Studies in Advanced Mathematics, vol. 118, Cambridge
University Press, Cambridge, 2010. 179, 245, 350
[AIIS04] David Avis, Hiroshi Imai, Tsuyoshi Ito, and Yuuya Sasaki, Deriving tight Bell inequalities
for 2 parties with many 2-valued observables from facets of cut polytopes,
arXiv preprint quant-ph/0404014 (2004). 296
[AJR15] Srinivasan Arunachalam, Nathaniel Johnston, and Vincent Russo, Is absolute separability
determined by the partial transpose?, Quantum Inf. Comput. 15 (2015),
no. 7 & 8, 694–720. 64


[AL15a] Guillaume Aubrun and Cécilia Lancien, Locally restricted measurements on a multipartite
quantum system: data hiding is generic, Quantum Inf. Comput. 15 (2015),
no. 5-6, 513–540. 306
[AL15b] Guillaume Aubrun and Cécilia Lancien, Zonoids and sparsification of quantum measurements,
Positivity (2015), 1–23 (English). 305
[Alo03] Noga Alon, Problems and results in extremal combinatorics. I, Discrete Math. 273
(2003), no. 1-3, 31–53, EuroComb’01 (Barcelona). 346
[AMS04] S. Artstein, V. Milman, and S. J. Szarek, Duality of metric entropy, Ann. of Math.
(2) 159 (2004), no. 3, 1313–1328. 143
[AMSTJ04] S. Artstein, V. Milman, S. Szarek, and N. Tomczak-Jaegermann, On convexified
packing and entropy duality, Geom. Funct. Anal. 14 (2004), no. 5, 1134–1141. 143
[AN12] Guillaume Aubrun and Ion Nechita, Realigning random states, J. Math. Phys. 53
(2012), no. 10, 102210, 16. 274
[Ara04] P. K. Aravind, Quantum mysteries revisited again, Amer. J. Phys. 72 (2004), no. 10,
1303–1307. 297
[Arv09] William Arveson, Maximal vectors in Hilbert space and quantum entanglement,
Journal of Functional Analysis 256 (2009), no. 5, 1476–1510. 233
[AS06] Guillaume Aubrun and Stanisław J Szarek, Tensor products of convex sets and the
volume of separable states on n qudits, Physical Review A 73 (2006), no. 2, 022109.
104, 233, 260, 261


[AS10] Erik Alfsen and Fred Shultz, Unique decompositions, faces, and automorphisms of
separable states, J. Math. Phys. 51 (2010), no. 5, 052201, 13. 63
[AS15] Guillaume Aubrun and Stanisław Szarek, Two proofs of Størmer’s theorem, arXiv
preprint arXiv:1512.03293 (2015). 64
[AS17] Guillaume Aubrun and Stanislaw Szarek, Dvoretzky’s Theorem and the Complexity
of Entanglement Detection, Discrete Analysis, to appear (2017). 208, 261
[ASW10] Guillaume Aubrun, Stanisław Szarek, and Elisabeth Werner, Nonadditivity of Rényi
entropy and Dvoretzky’s theorem, J. Math. Phys. 51 (2010), no. 2, 022102, 7. 232
[ASW11] Guillaume Aubrun, Stanisław Szarek, and Elisabeth Werner, Hastings’s additivity
counterexample via Dvoretzky’s theorem, Comm.
Math. Phys. 305 (2011), no. 1, 85–97. 144, 208, 232, 233
[ASY12] Guillaume Aubrun, Stanisław J. Szarek, and Deping Ye, Phase transitions for random
states and a semicircle law for the partial transpose, Phys. Rev. A 85 (2012),
030302. 273
[ASY14] Guillaume Aubrun, Stanisław J. Szarek, and Deping Ye, Entanglement thresholds
for random induced states, Comm. Pure Appl. Math. 67 (2014), no. 1, 129–171. 64,
179, 260, 270, 273
[Aub05] Guillaume Aubrun, A sharp small deviation inequality for the largest eigenvalue of
a random matrix, Séminaire de Probabilités XXXVIII, Lecture Notes in Math., vol.
1857, Springer, Berlin, 2005, pp. 320–337. 179
[Aub09] Guillaume Aubrun, On almost randomizing channels with a short Kraus decomposition,
Comm. Math. Phys. 288 (2009), no. 3, 1103–1116. 232
[Aub12] Guillaume Aubrun, Partial transposition of random states and non-centered semicircular
distributions, Random Matrices Theory Appl. 1 (2012), no. 2, 1250001, 29. 179, 273
[Aud09] Koenraad MR Audenaert, A note on the $p \to q$ norms of 2-positive maps, Linear
Algebra and Its Applications 430 (2009), no. 4, 1436–1440. 232
[Azu67] Kazuoki Azuma, Weighted sums of certain dependent random variables, Tôhoku
Math. J. (2) 19 (1967), 357–367. 144
[BAG97] G. Ben Arous and A. Guionnet, Large deviations for Wigner’s law and Voiculescu’s
non-commutative entropy, Probab. Theory Related Fields 108 (1997), no. 4,
517–542. 245
[Bak94] Dominique Bakry, L’hypercontractivité et son utilisation en théorie des semigroupes,
Lectures on probability theory (Saint-Flour, 1992), Lecture Notes in Math., vol.
1581, Springer, Berlin, 1994, pp. 1–114. 145
[Bal86] KM Ball, Isometric problems in $\ell_p$ and sections of convex sets, Ph.D. thesis,
University of Cambridge, 1986. 105
[Bal89] Keith Ball, Volumes of sections of cubes and related problems, Geometric aspects of
functional analysis (1987–88), Lecture Notes in Math., vol. 1376, Springer, Berlin,
1989, pp. 251–260. 106, 209


[Bal91] Keith Ball, Volume ratios and a reverse isoperimetric inequality, J. London Math. Soc.
(2) 44 (1991), no. 2, 351–359. 209, 342
[Bal92a] Keith Ball, Ellipsoids of maximal volume in convex bodies, Geom. Dedicata 41 (1992),
no. 2, 241–250. 104
[Bal92b] Keith Ball, A lower bound for the optimal density of lattice packings, Internat. Math.
Res. Notices (1992), no. 10, 217–221. 142
[Bal97] Keith Ball, An elementary introduction to modern convex geometry, Flavors of geometry,
Math. Sci. Res. Inst. Publ., vol. 31, Cambridge Univ. Press, Cambridge, 1997,
pp. 1–58. 87, 104, 209
[Bar98] Franck Barthe, An extremal property of the mean width of the simplex, Math. Ann.
310 (1998), no. 4, 685–693. 342
[Bar02] Alexander Barvinok, A course in convexity, Graduate Studies in Mathematics,
vol. 54, American Mathematical Society, Providence, RI, 2002. 28
[Bar14] Alexander Barvinok, Thrifty approximations of convex bodies by polytopes., Int.
Math. Res. Not. 2014 (2014), no. 16, 4341–4356 (English). 142, 143
[BBP+96] Charles H Bennett, Gilles Brassard, Sandu Popescu, Benjamin Schumacher, John A
Smolin, and William K Wootters, Purification of noisy entanglement and faithful
teleportation via noisy channels, Physical Review Letters 76 (1996), no. 5, 722. 306
[BBT05] Gilles Brassard, Anne Broadbent, and Alain Tapp, Quantum pseudo-telepathy,
Found. Phys. 35 (2005), no. 11, 1877–1907. 297

[BC02] Károly Bezdek and Robert Connelly, Pushing disks apart—the Kneser-Poulsen con-
jecture in the plane, J. Reine Angew. Math. 553 (2002), 221–236. 178
[BCL94] Keith Ball, Eric A. Carlen, and Elliott H. Lieb, Sharp uniform convexity and smooth-
ness inequalities for trace norms, Invent. Math. 115 (1994), no. 3, 463–482. 29
[BCN12] Serban Belinschi, Benoît Collins, and Ion Nechita, Eigenvectors and eigenvalues in
a random subspace of a tensor product, Inventiones mathematicae 190 (2012), no. 3,
647–697. 233
[BCN16] Serban T. Belinschi, Benoît Collins, and Ion Nechita, Almost one bit violation for
the additivity of the minimum output entropy, Comm. Math. Phys. 341 (2016),
no. 3, 885–909. 233

[BCP+14] Nicolas Brunner, Daniel Cavalcanti, Stefano Pironio, Valerio Scarani, and Stephanie
Wehner, Bell nonlocality, Rev. Mod. Phys. 86 (2014), 419–478. 295, 297
[BDG+77] G. Bennett, L. E. Dor, V. Goodman, W. B. Johnson, and C. M. Newman, On
uncomplemented subspaces of $L_p$, $1 < p < 2$, Israel J. Math. 26 (1977), no. 2,
178–187. 208
[BDK89] Rajendra Bhatia, Chandler Davis, and Paul Koosis, An extremal problem in Fourier
analysis with applications to operator theory, J. Funct. Anal. 82 (1989), no. 1,
138–150. 354
[BDM83] Rajendra Bhatia, Chandler Davis, and Alan McIntosh, Perturbation of spectral subspaces
and solution of linear operator equations, Linear Algebra Appl. 52/53 (1983),
45–67. 354
[BDM+99] Charles H. Bennett, David P. DiVincenzo, Tal Mor, Peter W. Shor, John A. Smolin,
and Barbara M. Terhal, Unextendible product bases and bound entanglement, Phys.
Rev. Lett. 82 (1999), no. 26, part 1, 5385–5388. 63
[BDMS13] Afonso S. Bandeira, Edgar Dobriban, Dustin G. Mixon, and William F. Sawin,
Certifying the restricted isometry property is hard, IEEE Trans. Inform. Theory 59
(2013), no. 6, 3448–3450. 210
[BDSW96] Charles H. Bennett, David P. DiVincenzo, John A. Smolin, and William K. Wootters,
Mixed-state entanglement and quantum error correction, Phys. Rev. A 54 (1996),
3824–3851. 306
[BÉ85] D. Bakry and Michel Émery, Diffusions hypercontractives, Séminaire de probabilités,
XIX, 1983/84, Lecture Notes in Math., vol. 1123, Springer, Berlin, 1985,
pp. 177–206. 145
[Bec75] William Beckner, Inequalities in Fourier analysis, Ann. of Math. (2) 102 (1975),
no. 1, 159–182. 145
[Bel64] J. S. Bell, On the Einstein Podolsky Rosen paradox, Physics 1 (1964), 195–200. 276
[Ben84] Yoav Benyamini, Two-point symmetrization, the isoperimetric inequality on the
sphere and some applications, Texas functional analysis seminar 1983–1984 (Austin,
Tex.), Longhorn Notes, Univ. Texas Press, Austin, TX, 1984, pp. 53–76. 144
[Bez08] K. Bezdek, From the Kneser-Poulsen conjecture to ball-polyhedra, European J.
Combin. 29 (2008), no. 8, 1820–1830. 178
[BF88] I. Bárány and Z. Füredi, Approximation of the sphere by polytopes having few vertices,
Proc. Amer. Math. Soc. 102 (1988), no. 3, 651–659. 178
[BGK+01] Andreas Brieden, Peter Gritzmann, Ravindran Kannan, Victor Klee, László Lovász,
and Miklós Simonovits, Deterministic and randomized polynomial-time approximation
of radii, Mathematika 48 (2001), no. 1-2, 63–105 (2003). 141


[BGL14] Dominique Bakry, Ivan Gentil, and Michel Ledoux, Analysis and geometry of
Markov diffusion operators, Grundlehren der Mathematischen Wissenschaften [Fun-
damental Principles of Mathematical Sciences], vol. 348, Springer, Cham, 2014. 136,
144, 145
[BGM71] Marcel Berger, Paul Gauduchon, and Edmond Mazet, Le spectre d’une variété rie-
mannienne, Lecture Notes in Mathematics, Vol. 194, Springer-Verlag, Berlin-New
York, 1971. 145
[BGVV14] Silouanos Brazitikos, Apostolos Giannopoulos, Petros Valettas, and Beatrice-Helen
Vritsiou, Geometry of isotropic convex bodies, Mathematical Surveys and Mono-
graphs, vol. 196, American Mathematical Society, Providence, RI, 2014. 105

[BH10] Fernando G. S. L. Brandão and Michał Horodecki, On Hastings’ counterexamples to


the minimum output entropy additivity conjecture, Open Syst. Inf. Dyn. 17 (2010),
no. 1, 31–52. 233
[BH13] Fernando G. S. L. Brandão and Aram W. Harrow, Product-state approximations
to quantum ground states (extended abstract), STOC’13—Proceedings of the 2013
ACM Symposium on Theory of Computing, ACM, New York, 2013, pp. 871–880.
29
[Bha97] Rajendra Bhatia, Matrix analysis, Graduate Texts in Mathematics, vol. 169,
Springer-Verlag, New York, 1997. 29
[BHH+14] Piotr Badziąg, Karol Horodecki, Michał Horodecki, Justin Jenkinson, and
Stanisław J. Szarek, Bound entangled states with extremal properties, Phys. Rev. A 90
(2014), 012301. 261
[Bil99] Patrick Billingsley, Convergence of probability measures, second ed., Wiley Series in
Probability and Statistics: Probability and Statistics, John Wiley & Sons, Inc., New
York, 1999, A Wiley-Interscience Publication. 179
[BKP06] Jonathan Barrett, Adrian Kent, and Stefano Pironio, Maximally nonlocal and
monogamous quantum correlations, Phys. Rev. Lett. 97 (2006), 170409. 297
[BL75] H.J. Brascamp and E.H. Lieb, Some inequalities for Gaussian measures and the
long range order of one-dimensional plasma, pp. 1–14, Clarendon Press, Oxford,
1975. 105

[BL76] Herm Jan Brascamp and Elliott H. Lieb, On extensions of the Brunn-Minkowski
and Prékopa-Leindler theorems, including inequalities for log concave functions,
and with an application to the diffusion equation, J. Functional Analysis 22 (1976),
no. 4, 366–389. 105
[BL01] H Barnum and N Linden, Monotones and invariants for multi-particle quantum
states, Journal of Physics A: Mathematical and General 34 (2001), no. 35, 6787. 233
[BLM89] J. Bourgain, J. Lindenstrauss, and V. Milman, Approximation of zonoids by zonotopes,
Acta Math. 162 (1989), no. 1-2, 73–141 (English). 209
[BLM13] Stéphane Boucheron, Gábor Lugosi, and Pascal Massart, Concentration inequalities,
Oxford University Press, Oxford, 2013, A nonasymptotic theory of independence,
With a foreword by Michel Ledoux. 118, 119, 143, 144, 146
[BLPS99] Wojciech Banaszczyk, Alexander E. Litvak, Alain Pajor, and Stanislaw J. Szarek,
The flatness theorem for nonsymmetric convex bodies via the local theory of Banach
spaces, Math. Oper. Res. 24 (1999), no. 3, 728–750. 103, 207
[BM87] J. Bourgain and V. D. Milman, New volume ratio properties for convex symmetric
bodies in $\mathbb{R}^n$, Invent. Math. 88 (1987), no. 2, 319–340. 105, 209
[BM08] Bhaskar Bagchi and Gadadhar Misra, On Grothendieck constants, unpublished,
2008. 295
[BMW09] Michael J. Bremner, Caterina Mora, and Andreas Winter, Are random pure states
useful for quantum computation?, Phys. Rev. Lett. 102 (2009), no. 19, 190502, 4.
233
[BN02] Alexander Barg and Dmitry Yu. Nogin, Bounds on packings of spheres in the Grassmann
manifold, IEEE Trans. Inform. Theory 48 (2002), no. 9, 2450–2454. 143
[BN05] Alexander Barg and Dmitry Yu. Nogin, Correction to: “Bounds on packings of spheres
in the Grassmann manifold” [IEEE Trans. Inform. Theory 48 (2002), no. 9, 2450–2454;
mr1929456], IEEE Trans. Inform. Theory 51 (2005), no. 7, 2732. 143


[BN06a] A. M. Barg and D. Yu. Nogin, A spectral approach to linear programming bounds
for codes, Problemy Peredachi Informatsii 42 (2006), no. 2, 12–25. 142
[BN06b] Alexander Barg and Dmitry Nogin, A bound on Grassmannian codes, J. Combin.
Theory Ser. A 113 (2006), no. 8, 1629–1635. 143
[BN13] Teodor Banica and Ion Nechita, Asymptotic eigenvalue distributions of block-
transposed Wishart matrices, J. Theoret. Probab. 26 (2013), no. 3, 855–869. 179
[BNV16] S. Brierley, M. Navascués, and T. Vértesi, Convex separation from convex optimiza-
tion for large-scale problems, arXiv preprint 1609.05011 (2016). 295
[Bob97] S. G. Bobkov, An isoperimetric inequality on the discrete cube, and an elementary
proof of the isoperimetric inequality in Gauss space, Ann. Probab. 25 (1997), no. 1,
206–214. 145

[Bom90a] Jan Boman, Smoothness of sums of convex sets with real analytic boundaries, Math.
Scand. 66 (1990), no. 2, 225–230. 104
[Bom90b] Jan Boman, The sum of two plane convex $C^\infty$ sets is not always $C^5$, Math. Scand. 66
(1990), no. 2, 216–224. 104
[Bon70] Aline Bonami, Étude des coefficients de Fourier des fonctions de $L^p(G)$, Ann. Inst.
Fourier (Grenoble) 20 (1970), no. fasc. 2, 335–402 (1971). 145
[Bor75a] C. Borell, Convex set functions in d-space, Period. Math. Hungar. 6 (1975), no. 2,
111–136. 104
[Bor75b] Christer Borell, The Brunn-Minkowski inequality in Gauss space, Invent. Math. 30
(1975), no. 2, 207–216. 144

ion
[Bor03] , The Ehrhard inequality, C. R. Math. Acad. Sci. Paris 337 (2003), no. 10,
663–666. 144
[Bör04] Károly Böröczky, Jr., Finite packing and covering, Cambridge Tracts in Mathemat-

ut
ics, vol. 154, Cambridge University Press, Cambridge, 2004. 141, 142
[Bou84] J. Bourgain, On martingales transforms in finite-dimensional lattices with an ap-

rib
pendix on the K-convexity constant, Math. Nachr. 119 (1984), 41–53. 207
[Boy67] A. V. Boyd, Note on a paper by Uppuluri, Pacific J. Math. 22 (1967), 9–10. 309
[BP01] Imre Bárány and Attila Pór, On 0-1 polytopes with many facets, Adv. Math. 161
(2001), no. 2, 209–228. 281
[Bry95] Włodzimierz Bryc, The normal distribution, Lecture Notes in Statistics, vol. 100,
Springer-Verlag, New York, 1995, Characterizations with applications. 309
[BS] Andrew Blasius and Stanisław Szarek, Sharp two-sided bounds for the medians of
gamma and chi-squared distributions, in preparation. 124
[BS88] J. Bourgain and S. J. Szarek, The Banach-Mazur distance to the cube and the
Dvoretzky-Rogers factorization, Israel J. Math. 62 (1988), no. 2, 169–180. 208
[BS10] Salman Beigi and Peter W. Shor, Approximating the set of separable states using
the positive partial transpose test, J. Math. Phys. 51 (2010), no. 4, 042202, 10. 261
[BTN01a] Aharon Ben-Tal and Arkadi Nemirovski, Lectures on modern convex optimization,
MPS/SIAM Series on Optimization, Society for Industrial and Applied Mathematics
(SIAM), Philadelphia, PA; Mathematical Programming Society (MPS), Philadel-
phia, PA, 2001, Analysis, algorithms, and engineering applications. 29


[BTN01b] Aharon Ben-Tal and Arkadi Nemirovski, On polyhedral approximations of the second-order cone, Math. Oper. Res.
26 (2001), no. 2, 193–205. 210


[BV04] Stephen Boyd and Lieven Vandenberghe, Convex optimization, Cambridge Univer-
sity Press, Cambridge, 2004. 29
[BV13] Jop Briët and Thomas Vidick, Explicit lower and upper bounds on the entangled
value of multiplayer XOR games, Comm. Math. Phys. 321 (2013), no. 1, 181–207.
297
[BW03] Károly Böröczky, Jr. and Gergely Wintsche, Covering the sphere by equal spherical
balls, Discrete and computational geometry, Algorithms Combin., vol. 25, Springer,
Berlin, 2003, pp. 235–251. 142


[BY88] Z. D. Bai and Y. Q. Yin, Convergence to the semicircle law, Ann. Probab. 16 (1988),
no. 2, 863–875. 179


[BZ88] Yu. D. Burago and V. A. Zalgaller, Geometric inequalities, Grundlehren der Math-
ematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol.
285, Springer-Verlag, Berlin, 1988, Translated from the Russian by A. B. Sosinskiı̆,
Springer Series in Soviet Mathematics. 143
[BŻ06] Ingemar Bengtsson and Karol Życzkowski, Geometry of quantum states, Cambridge
University Press, Cambridge, 2006, An introduction to quantum entanglement. 63
[CÐ13] Lin Chen and Dragomir Ž. Ðoković, Dimensions, lengths, and separability in finite-
dimensional quantum systems, J. Math. Phys. 54 (2013), no. 2, 022201, 13. 63
[CDJ+08] Jianxin Chen, Runyao Duan, Zhengfeng Ji, Mingsheng Ying, and Jun Yu, Existence
of universal entangler, J. Math. Phys. 49 (2008), no. 1, 012103, 7. 231
[Ceı̆76] I. I. Ceı̆tlin, Extremal points of the unit ball of certain operator spaces, Mat. Zametki
20 (1976), no. 4, 521–527. 104
[CFG+16] Umut Caglar, Matthieu Fradelizi, Olivier Guédon, Joseph Lehec, Carsten Schütt,
and Elisabeth M. Werner, Functional versions of $L_p$-affine surface area and entropy
inequalities, Int. Math. Res. Not. IMRN (2016), no. 4, 1223–1250. 105
[CFN15] Benoît Collins, Motohisa Fukuda, and Ion Nechita, On the convergence of output
sets of quantum channels, J. Operator Theory 73 (2015), no. 2, 333–360. 233
[CFR59] H. S. M. Coxeter, L. Few, and C. A. Rogers, Covering space with equal spheres,
Mathematika 6 (1959), 147–157. 111, 142
[CG04] Daniel Collins and Nicolas Gisin, A relevant two qubit Bell inequality inequivalent
to the CHSH inequality, J. Phys. A 37 (2004), no. 5, 1775–1787. 296
[CGLP12] Djalil Chafaï, Olivier Guédon, Guillaume Lecué, and Alain Pajor, Interactions be-
tween compressed sensing random matrices and high dimensional geometry, Panora-
mas et Synthèses [Panoramas and Syntheses], vol. 37, Société Mathématique de
France, Paris, 2012. 146
[Cha67] G. D. Chakerian, Inequalities for the difference body of a convex body, Proc. Amer.
Math. Soc. 18 (1967), 879–884. 105
[Che78] S. Chevet, Séries de variables aléatoires gaussiennes à valeurs dans $E \hat{\otimes}_\varepsilon F$. Application
aux produits d’espaces de Wiener abstraits, Séminaire sur la Géométrie des
Espaces de Banach (1977–1978), École Polytech., Palaiseau, 1978, pp. Exp. No. 19,
15. 180
[CHL+08] Toby Cubitt, Aram W Harrow, Debbie Leung, Ashley Montanaro, and Andreas
Winter, Counterexamples to additivity of minimum output p-Rényi entropy for p
close to 0, Communications in Mathematical Physics 284 (2008), no. 1, 281–290.
232
[CHLL97] Gérard Cohen, Iiro Honkala, Simon Litsyn, and Antoine Lobstein, Covering codes,
North-Holland Mathematical Library, vol. 54, North-Holland Publishing Co., Ams-
terdam, 1997. 142
[Cho75a] Man Duen Choi, Completely positive linear maps on complex matrices, Linear
Algebra and Appl. 10 (1975), 285–290. 64
[Cho75b] Man-Duen Choi, Positive semidefinite biquadratic forms, Linear Algebra and its
Applications 12 (1975), no. 2, 95–100. 63
[CHS96] John H. Conway, Ronald H. Hardin, and Neil J. A. Sloane, Packing lines, planes,
etc.: packings in Grassmannian spaces, Experiment. Math. 5 (1996), no. 2, 139–159.
143
[CHSH69] John F. Clauser, Michael A. Horne, Abner Shimony, and Richard A. Holt, Proposed
experiment to test local hidden-variable theories, Phys. Rev. Lett. 23 (1969), 880–
884. 295
[Chu62] J. T. Chu, Mathematical Notes: A Modified Wallis Product and Some Applications,
Amer. Math. Monthly 69 (1962), no. 5, 402–404. 309
[Cir80] B. S. Cirel'son, Quantum generalizations of Bell’s inequality, Lett. Math. Phys. 4
(1980), no. 2, 93–100. 295
[CL06] Eric Carlen and Elliott H. Lieb, Some matrix rearrangement inequalities, Ann. Mat.
Pura Appl. (4) 185 (2006), no. suppl., S315–S324. 29
[Cla36] James A. Clarkson, Uniformly convex spaces, Trans. Amer. Math. Soc. 40 (1936),
no. 3, 396–414. 29
[Cla06] Lieven Clarisse, The distillability problem revisited, Quantum Inf. Comput. 6 (2006),
no. 6, 539–560. 306


[CLM+14] Eric Chitambar, Debbie Leung, Laura Mančinska, Maris Ozols, and Andreas Winter,
Everything you always wanted to know about LOCC (but were afraid to ask), Comm.
Math. Phys. 328 (2014), no. 1, 303–326. 306


[CM14] Benoît Collins and Camille Male, The strong asymptotic freeness of Haar and de-
terministic matrices, Ann. Sci. Éc. Norm. Supér. (4) 47 (2014), no. 1, 147–163.
180
[CM15] Fabio Cavalletti and Andrea Mondino, Sharp and rigid isoperimetric inequalities in
metric-measure spaces with lower Ricci curvature bounds, arXiv preprint 1502.06465
(2015). 144
[CN10] Benoît Collins and Ion Nechita, Random quantum channels I: graphical calculus and
the Bell state phenomenon, Comm. Math. Phys. 297 (2010), no. 2, 345–370. 233
[CN11] Benoît Collins and Ion Nechita, Random quantum channels II: entanglement of random subspaces, Rényi
entropy estimates and additivity problems, Adv. Math. 226 (2011), no. 2, 1181–1201.
233
[CN16] Benoît Collins and Ion Nechita, Random matrix techniques in quantum information
theory, Journal of Mathematical Physics 57 (2016), no. 1, 015215. 179, 233
[CNY12] Benoît Collins, Ion Nechita, and Deping Ye, The absolute positive partial transpose
property for random induced states, Random Matrices Theory Appl. 1 (2012), no. 3,
1250002, 22. 274
[Col06] Andrea Colesanti, Functional inequalities related to the Rogers-Shephard inequality,
Mathematika 53 (2006), no. 1, 81–101 (2007). 105
[Col16] Benoît Collins, Haagerup’s inequality and additivity violation of the Minimum Out-
put Entropy, arXiv preprint arXiv:1603.00577 (2016). 233
[CP88] Bernd Carl and Alain Pajor, Gel'fand numbers of operators with values in a Hilbert
space, Invent. Math. 94 (1988), no. 3, 479–504. 178
[CR86] Jeesen Chen and Herman Rubin, Bounds for the difference between median and
mean of gamma and Poisson distributions, Statist. Probab. Lett. 4 (1986), no. 6,
281–283. 124
[CS90] Bernd Carl and Irmtraud Stephani, Entropy, compactness and the approximation of
operators, Cambridge Tracts in Mathematics, vol. 98, Cambridge University Press,
Cambridge, 1990. 142
[CS99] J. H. Conway and N. J. A. Sloane, Sphere packings, lattices and groups, third ed.,
Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Math-
ematical Sciences], vol. 290, Springer-Verlag, New York, 1999, With additional con-
tributions by E. Bannai, R. E. Borcherds, J. Leech, S. P. Norton, A. M. Odlyzko,
R. A. Parker, L. Queen and B. B. Venkov. 141, 142
[CS05] Shiing-Shen Chern and Zhongmin Shen, Riemann-Finsler geometry, Nankai Tracts
in Mathematics, vol. 6, World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ,
2005. 319
[CSW14] Adán Cabello, Simone Severini, and Andreas Winter, Graph-theoretic approach to
quantum correlations, Phys. Rev. Lett. 112 (2014), 040401. 297
[CW03] Kai Chen and Ling-An Wu, A matrix realignment method for recognizing entangle-
ment, Quantum Inf. Comput. 3 (2003), no. 3, 193–202. 63
[Dav57] Chandler Davis, All convex invariant functions of hermitian matrices, Arch. Math.
8 (1957), 276–278. 29
[DCLB00] W. Dür, J. I. Cirac, M. Lewenstein, and D. Bruß, Distillability and partial transpo-
sition in bipartite systems, Phys. Rev. A 61 (2000), 062313. 306


[Dem97] Amir Dembo, Information inequalities and concentration of measure, Ann. Probab.
25 (1997), no. 2, 927–939. 146
[DF87] Persi Diaconis and David Freedman, A dozen de Finetti-style results in search of a
theory, Ann. Inst. H. Poincaré Probab. Statist. 23 (1987), no. 2, suppl., 397–423.
144
[DF93] Andreas Defant and Klaus Floret, Tensor norms and operator ideals, North-Holland
Mathematics Studies, vol. 176, North-Holland Publishing Co., Amsterdam, 1993.
103
[DL97] Michel Marie Deza and Monique Laurent, Geometry of cuts and metrics, Algorithms
and Combinatorics, vol. 15, Springer-Verlag, Berlin, 1997. 296


[DLS14] Anirban DasGupta, S. N. Lahiri, and Jordan Stoyanov, Sharp fixed n bounds and
asymptotic expansions for the mean and the median of a Gaussian sample maxi-
mum, and applications to the Donoho-Jin model, Stat. Methodol. 20 (2014), 40–62.
178, 351
[Dmi90] V. A. Dmitrovskiı̆, On the integrability of the maximum and the local properties
of Gaussian fields, Probability theory and mathematical statistics, Vol. I (Vilnius,
1989), “Mokslas”, Vilnius, 1990, pp. 271–284. 144
[DPS04] Andrew C. Doherty, Pablo A. Parrilo, and Federico M. Spedalieri, Complete family
of separability criteria, Phys. Rev. A 69 (2004), 022308. 63
[DR47] H. Davenport and C. A. Rogers, Hlawka’s theorem in the geometry of numbers,
Duke Math. J. 14 (1947), 367–375. 142
[DR50] A. Dvoretzky and C. A. Rogers, Absolute and unconditional convergence in normed
linear spaces, Proc. Nat. Acad. Sci. U. S. A. 36 (1950), 192–197. 208
[DS85] Stephen Dilworth and Stanisław Szarek, The cotype constant and an almost Eu-
clidean decomposition for finite-dimensional normed spaces, Israel J. Math. 52
(1985), no. 1-2, 82–96. 209
[DS01] Kenneth R. Davidson and Stanislaw J. Szarek, Local operator theory, random ma-
trices and Banach spaces, Handbook of the geometry of Banach spaces, Vol. I,
North-Holland, Amsterdam, 2001, pp. 317–366. 144, 180
[DSS+00] David P. DiVincenzo, Peter W. Shor, John A. Smolin, Barbara M. Terhal, and
Ashish V. Thapliyal, Evidence for bound entangled states with negative partial trans-
pose, Phys. Rev. A 61 (2000), 062312. 306
[Dud67] R. M. Dudley, The sizes of compact subsets of Hilbert space and continuity of Gauss-
ian processes, J. Functional Analysis 1 (1967), 290–330. 179
[Due10] Lutz Duembgen, Bounding standard Gaussian tail probabilities, Tech. report, Uni-
versity of Bern, Institute of Mathematical Statistics and Actuarial Science, 2010.
309
[Dum07] Ilya Dumer, Covering spheres with spheres, Discrete Comput. Geom. 38 (2007),
no. 4, 665–679. 110, 142
[Dür01] W. Dür, Multipartite bound entangled states that violate Bell’s inequality, Phys.
Rev. Lett. 87 (2001), 230402. 297
[Dvo61] Aryeh Dvoretzky, Some results on convex bodies and Banach spaces, Proc. Internat.
Sympos. Linear Spaces (Jerusalem, 1960), Jerusalem Academic Press, Jerusalem;
Pergamon, Oxford, 1961, pp. 123–160. 208
[EC04] Fida El Chami, Spectra of the Laplace operator on Grassmann manifolds, Int. J.
Pure Appl. Math. 12 (2004), no. 4, 395–418. 145
[Ehr83] Antoine Ehrhard, Symétrisation dans l’espace de Gauss, Math. Scand. 53 (1983),
no. 2, 281–301. 144
[EPR35] A. Einstein, B. Podolsky, and N. Rosen, Can quantum-mechanical description of
physical reality be considered complete?, Phys. Rev. 47 (1935), 777–780. 276
[ES70] P. Erdős and A. H. Stone, On the sum of two Borel sets, Proc. Amer. Math. Soc.
25 (1970), 304–306. 104
[EVWW01] Tilo Eggeling, Karl Gerd H. Vollbrecht, Reinhard F. Werner, and Michael M. Wolf,
Distillability via protocols respecting the positivity of partial transpose, Phys. Rev.
Lett. 87 (2001), 257902. 306
[Fer75] X. Fernique, Regularité des trajectoires des fonctions aléatoires gaussiennes, École
d’Été de Probabilités de Saint-Flour, IV-1974, Springer, Berlin, 1975, pp. 1–96.
Lecture Notes in Math., Vol. 480. 178, 179
[Fer97] Xavier Fernique, Fonctions aléatoires gaussiennes, vecteurs aléatoires gaussiens,
Université de Montréal, Centre de Recherches Mathématiques, Montreal, QC, 1997.
144, 179
[FF81] P. Frankl and Z. Füredi, A short proof for a theorem of Harper about Hamming-
spheres, Discrete Math. 34 (1981), no. 3, 311–313. 146
[FHS13] Omar Fawzi, Patrick Hayden, and Pranab Sen, From low-distortion norm embeddings
to explicit uncertainty relations and efficient information locking, J. ACM
60 (2013), no. 6, 61 (English). 208


[Fig76] T. Figiel, A short proof of Dvoretzky’s theorem on almost spherical sections of convex
bodies, Compos. Math. 33 (1976), 297–301 (English). 208
[FK94] S. K. Foong and S. Kanno, Proof of D. N. Page’s conjecture on: “Average en-
tropy of a subsystem” [Phys. Rev. Lett. 71 (1993), no. 9, 1291–1294; MR1232812
(94f:81007)], Phys. Rev. Lett. 72 (1994), no. 8, 1148–1151. 232
[FK10] Motohisa Fukuda and Christopher King, Entanglement of random subspaces via the
Hastings bound, J. Math. Phys. 51 (2010), no. 4, 042201, 19. 233
[FKM10] Motohisa Fukuda, Christopher King, and David K. Moser, Comments on Hastings’
additivity counterexamples, Comm. Math. Phys. 296 (2010), no. 1, 111–143. 233
[FLM77] T. Figiel, J. Lindenstrauss, and V. D. Milman, The dimension of almost spherical
sections of convex bodies, Acta Math. 139 (1977), no. 1-2, 53–94. 144, 208
[FLPS11] Shmuel Friedland, Chi-Kwong Li, Yiu-Tung Poon, and Nung-Sing Sze, The auto-
morphism group of separable states in quantum information theory, J. Math. Phys.
52 (2011), no. 4, 042203, 8. 63
[FN15] Motohisa Fukuda and Ion Nechita, Additivity rates and PPT property for random
quantum channels, Ann. Math. Blaise Pascal 22 (2015), no. 1, 1–72. 180
[Fol99] Gerald B. Folland, Real analysis, second ed., Pure and Applied Mathematics (New
York), John Wiley & Sons, Inc., New York, 1999, Modern techniques and their
applications, A Wiley-Interscience Publication. 15
[For10] Dominique Fortin, Hadamard’s matrices, Grothendieck’s constant, and root two,
Optimization and optimal control, Springer Optim. Appl., vol. 39, Springer, New
York, 2010, pp. 423–447. 295
[FR94] P. C. Fishburn and J. A. Reeds, Bell inequalities, Grothendieck’s constant, and root
two, SIAM J. Discrete Math. 7 (1994), no. 1, 48–56. 295
[FR13] Simon Foucart and Holger Rauhut, A mathematical introduction to compressive
sensing, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer, New
York, 2013. 208, 309
[Fra99] Matthieu Fradelizi, Hyperplane sections of convex bodies in isotropic position,
Beiträge Algebra Geom. 40 (1999), no. 1, 163–183. 103, 105
[Fre14] Daniel J. Fresen, Explicit Euclidean embeddings in permutation invariant normed
spaces, Adv. Math. 266 (2014), 1–16. 210
[Fri12] Tobias Fritz, Tsirelson’s problem and Kirchberg’s conjecture, Rev. Math. Phys. 24
(2012), no. 5, 1250012, 67. 296
[Fro81] M. Froissart, Constructive generalization of Bell’s inequalities, Nuovo Cimento B
(11) 64 (1981), no. 2, 241–251. 296
[FŚ13] Motohisa Fukuda and Piotr Śniady, Partial transpose of random quantum states:
exact formulas and meanders, J. Math. Phys. 54 (2013), no. 4, 042202, 23. 179
[FT97] Gábor Fejes Tóth, Packing and covering, Handbook of discrete and computational
geometry, CRC Press Ser. Discrete Math. Appl., CRC, Boca Raton, FL, 1997,
pp. 19–41. 141
[FTJ79] T. Figiel and Nicole Tomczak-Jaegermann, Projections onto Hilbertian subspaces of
Banach spaces, Israel J. Math. 33 (1979), no. 2, 155–171. 207


[Fuk14] Motohisa Fukuda, Revisiting additivity violation of quantum channels, Comm.
Math. Phys. 332 (2014), no. 2, 713–728. 233
[FW07] Motohisa Fukuda and Michael M. Wolf, Simplifying additivity problems using direct
sum constructions, J. Math. Phys. 48 (2007), no. 7, 072101, 7. 232
[Gal95] Janos Galambos, Advanced probability theory, vol. 10, CRC Press, 1995. 179
[Gar83] Anupam Garg, Detector error and Einstein-Podolsky-Rosen correlations, Phys. Rev.
D (3) 28 (1983), no. 4, 785–790. 295
[Gar02] R. J. Gardner, The Brunn-Minkowski inequality, Bull. Amer. Math. Soc. (N.S.) 39
(2002), no. 3, 355–405. 104, 105
[GB02] Leonid Gurvits and Howard Barnum, Largest separable balls around the maximally
mixed bipartite quantum state, Physical Review A 66 (2002), no. 6, 062311. 260
[GB03] Leonid Gurvits and Howard Barnum, Separable balls around the maximally mixed multipartite quantum states,
Physical Review A 68 (2003), no. 4, 042312. 261


[GB05] Leonid Gurvits and Howard Barnum, Better bound on the exponent of the radius of the multipartite separable
ball, Physical Review A 72 (2005), no. 3, 032322. 261


[Gem80] Stuart Geman, A limit theorem for the norm of random matrices, Ann. Probab. 8
(1980), no. 2, 252–261. 179
[GFE09] D Gross, ST Flammia, and J Eisert, Most quantum states are too entangled to
be useful as computational resources, Physical review letters 102 (2009), no. 19,
190501. 233
[GG71] D. J. H. Garling and Y. Gordon, Relations between some constants associated with
finite dimensional Banach spaces, Israel J. Math. 9 (1971), 346–361. 104
[GG84] A. Yu. Garnaev and E. D. Gluskin, The widths of a Euclidean ball, Dokl. Akad.
Nauk SSSR 277 (1984), no. 5, 1048–1052. 208
[GGHE08] O. Gittsovich, O. Gühne, P. Hyllus, and J. Eisert, Unifying several separability
conditions using the covariance matrix criterion, Phys. Rev. A 78 (2008), 052319.
64
[GHP10] Andrzej Grudka, Michał Horodecki, and Łukasz Pankowski, Constructive counterexamples
to the additivity of the minimum output Rényi entropy of quantum channels
for all $p > 2$, Journal of Physics A: Mathematical and Theoretical 43 (2010), no. 42,
425304. 232
[Gia96] A. A. Giannopoulos, A proportional Dvoretzky-Rogers factorization result, Proc.
Amer. Math. Soc. 124 (1996), no. 1, 233–241. 208
[GLMP04] Y. Gordon, A. E. Litvak, M. Meyer, and A. Pajor, John’s decomposition in the
general case and applications, J. Differential Geom. 68 (2004), no. 1, 99–119. 103
[GLR10] Venkatesan Guruswami, James R. Lee, and Alexander Razborov, Almost Euclidean
subspaces of $\ell_1^N$ via expander codes, Combinatorica 30 (2010), no. 1, 47–68. 207
[Glu81] E. D. Gluskin, The diameter of the Minkowski compactum is roughly equal to n,
Funktsional. Anal. i Prilozhen. 15 (1981), no. 1, 72–73. 103
[Glu88] E. D. Gluskin, Extremal properties of orthogonal parallelepipeds and their applications to
the geometry of Banach spaces, Mat. Sb. (N.S.) 136(178) (1988), no. 1, 85–96. 178
[GLW08] Venkatesan Guruswami, James R. Lee, and Avi Wigderson, Euclidean sections of
$\ell_1^N$ with sublinear randomness and error-correction over the reals, Approximation,
randomization and combinatorial optimization, Lecture Notes in Comput. Sci., vol.
5171, Springer, Berlin, 2008, pp. 444–454. 207
[GM00] A. A. Giannopoulos and V. D. Milman, Concentration property on probability spaces,
Adv. Math. 156 (2000), no. 1, 77–106. 144
[GMW14] Whan Ghang, Zane Martin, and Steven Waruhiu, The sharp log-Sobolev inequality
on a compact interval, Involve 7 (2014), no. 2, 181–186. 145
[Gor85] Yehoram Gordon, Some inequalities for Gaussian processes and applications, Israel
J. Math. 50 (1985), no. 4, 265–289. 180
[Gor88] Y. Gordon, On Milman’s inequality and random subspaces which escape through a
mesh in $\mathbb{R}^n$, Geometric aspects of functional analysis (1986/87), Lecture Notes in
Math., vol. 1317, Springer, Berlin, 1988, pp. 84–106. 180, 208, 209
[Gra14] Loukas Grafakos, Classical Fourier analysis, third ed., Graduate Texts in Mathe-
matics, vol. 249, Springer, New York, 2014. 158
[Gro53a] A. Grothendieck, Résumé de la théorie métrique des produits tensoriels topologiques,
Bol. Soc. Mat. São Paulo 8 (1953), 1–79. 295
[Gro53b] A. Grothendieck, Sur certaines classes de suites dans les espaces de Banach et le théorème
de Dvoretzky-Rogers, Bol. Soc. Mat. São Paulo 8 (1953), 81–110 (1956). 208
[Gro75] Leonard Gross, Logarithmic Sobolev inequalities, Amer. J. Math. 97 (1975), no. 4,
1061–1083. 145
[Gro80] Misha Gromov, Paul Levy’s isoperimetric inequality, preprint IHES (1980). 144
[Gro87] M. Gromov, Monotonicity of the volume of intersection of balls, Geometrical aspects
of functional analysis (1985/86), Lecture Notes in Math., vol. 1267, Springer, Berlin,
1987, pp. 1–4. 178
[Grü03] Branko Grünbaum, Convex polytopes, second ed., Graduate Texts in Mathematics,
vol. 221, Springer-Verlag, New York, 2003, Prepared and with a preface by Volker
Kaibel, Victor Klee and Günter M. Ziegler. 29
[Gru07] Peter M. Gruber, Convex and discrete geometry, Grundlehren der Mathematis-
chen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 336,
Springer, Berlin, 2007. 142


[Gur03] Leonid Gurvits, Classical deterministic complexity of Edmonds’ problem and quan-
tum entanglement, Proceedings of the thirty-fifth annual ACM symposium on The-
ory of computing, ACM, 2003, pp. 10–19. 63, 64


[GVL13] Gene H. Golub and Charles F. Van Loan, Matrix computations, fourth ed., Johns
Hopkins Studies in the Mathematical Sciences, Johns Hopkins University Press,
Baltimore, MD, 2013. 319
[GW93] Paul Goodey and Wolfgang Weil, Zonoids and generalisations, Handbook of convex
geometry, Vol. A, B, North-Holland, Amsterdam, 1993, pp. 1297–1326. 103
[GZ03] A. Guionnet and B. Zegarlinski, Lectures on logarithmic Sobolev inequalities, Sémi-
naire de Probabilités, XXXVI, Lecture Notes in Math., vol. 1801, Springer, Berlin,
2003, pp. 1–134. 144
[Haa81] Uffe Haagerup, The best constants in the Khintchine inequality, Studia Math. 70
(1981), no. 3, 231–283 (1982). 147
[Hal82] Paul Richard Halmos, A Hilbert space problem book, second ed., Graduate Texts
in Mathematics, vol. 19, Springer-Verlag, New York-Berlin, 1982, Encyclopedia of
Mathematics and its Applications, 17. 343
[Hal07] Majdi Ben Halima, Branching rules for unitary groups and spectra of invariant
differential operators on complex Grassmannians, J. Algebra 318 (2007), no. 2,
520–552. 145
[Hal15] Brian Hall, Lie groups, Lie algebras, and representations, second ed., Graduate
Texts in Mathematics, vol. 222, Springer, Cham, 2015, An elementary introduction.
145
[Han56] Olof Hanner, On the uniform convexity of $L^p$ and $l^p$, Ark. Mat. 3 (1956), 239–244.
29
[Har66] L. H. Harper, Optimal numberings and isoperimetric problems on graphs, J. Com-
binatorial Theory 1 (1966), 385–393. 146
[Har13] Aram W Harrow, The church of the symmetric subspace, arXiv preprint 1308.6595
(2013). 63
[Has09] Matthew B Hastings, Superadditivity of communication capacity using entangled
inputs, Nature Physics 5 (2009), no. 4, 255–257. 144, 232, 233
[Hel69] Carl W. Helstrom, Quantum detection and estimation theory, J. Statist. Phys. 1
(1969), 231–252. 305
[Hen80] Douglas Hensley, Slicing convex bodies—bounds for slice area in terms of the body’s
covariance, Proc. Amer. Math. Soc. 79 (1980), no. 4, 619–625. 105
[Hen12] Martin Henk, Löwner-John ellipsoids, Doc. Math. (2012), no. Extra volume: Opti-
mization stories, 95–106. 104
[HH99] Michał Horodecki and Paweł Horodecki, Reduction criterion of separability and limits
for a class of distillation protocols, Phys. Rev. A 59 (1999), 4206–4216. 306
[HH01] Paweł Horodecki and Ryszard Horodecki, Distillation and bound entanglement,
Quantum Inf. Comput. 1 (2001), no. 1, 45–75. 306
[HHH96] Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki, Separability of mixed
states: necessary and sufficient conditions, Physics Letters A 223 (1996), no. 1–2,
1–8. 63, 64
[HHH97] Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki, Inseparable two spin-$\frac{1}{2}$
density matrices can be distilled to a singlet form, Phys. Rev. Lett. 78 (1997),
574–577. 306
[HHH98] Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki, Mixed-State Entanglement and
Distillation: Is there a “Bound” Entanglement in Nature?, Phys. Rev. Lett. 80 (1998), 5239–5242. 306
[HHH99] Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki, General teleportation channel, singlet fraction, and quasidistillation, Phys.
Rev. A 60 (1999), 1888–1898. 306
[HHHH09] Ryszard Horodecki, Paweł Horodecki, Michał Horodecki, and Karol Horodecki,
Quantum entanglement, Rev. Modern Phys. 81 (2009), no. 2, 865–942. 63, 74,
306
[Hil05] Roland Hildebrand, Cones of ball-ball separable elements, arXiv preprint quant-
ph/0503194 (2005). 324
[Hil06] Roland Hildebrand, Separable balls around the maximally mixed state for a 3-qubit system,
arXiv preprint quant-ph/0601201 (2006). 233, 261
[Hil07a] Roland Hildebrand, Entangled states close to the maximally mixed state, Physical Review A 75
(2007), no. 6, 062330. 233, 261


[Hil07b] Roland Hildebrand, Positive maps of second-order cones, Linear Multilinear Algebra 55 (2007),
no. 6, 575–597. 324
[HK11] Kil-Chan Ha and Seung-Hyeok Kye, Entanglement witnesses arising from exposed
positive linear maps, Open Syst. Inf. Dyn. 18 (2011), no. 4, 323–337 (English). 261
[HLSW04] Patrick Hayden, Debbie Leung, Peter W Shor, and Andreas Winter, Randomizing
quantum states: Constructions and applications, Communications in Mathematical
Physics 250 (2004), no. 2, 371–391. 232
[HLW06] Patrick Hayden, Debbie W. Leung, and Andreas Winter, Aspects of generic entan-
glement, Comm. Math. Phys. 265 (2006), no. 1, 95–117. 232, 273
[HNW15] Aram W. Harrow, Anand Natarajan, and Xiaodi Wu, An improved semidefinite
programming hierarchy for testing entanglement, arXiv preprint arXiv:1506.08834
(2015). 29
[Hol73] A. S. Holevo, Statistical decision theory for quantum systems, J. Multivariate Anal.
3 (1973), 337–394. 305
[Hol06] Alexander S. Holevo, The additivity problem in quantum information theory, In-
ternational Congress of Mathematicians. Vol. III, Eur. Math. Soc., Zürich, 2006,
pp. 999–1018. 232
[Hol12] Alexander S. Holevo, Quantum systems, channels, information, De Gruyter Studies in
Mathematical Physics, vol. 16, De Gruyter, Berlin, 2012, A mathematical introduction.
63
[Hor97] Pawel Horodecki, Separability criterion and inseparable mixed states with positive
partial transposition, Physics Letters A 232 (1997), no. 5, 333–339. 63
[HP98] Fumio Hiai and Dénes Petz, Eigenvalue density of the Wishart matrix and large
deviations, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1 (1998), no. 4, 633–
646. 245
[HP00] Fumio Hiai and Dénes Petz, The semicircle law, free random variables and entropy, Mathematical
Surveys and Monographs, vol. 77, American Mathematical Society, Providence, RI,
2000. 180
[HQV+16] F. Hirsch, M.T. Quintino, T. Vértesi, M. Navascués, and N. Brunner, Better local
hidden variable models for two-qubit Werner states and an upper bound on the
Grothendieck constant $K_G(3)$, arXiv preprint 1609.06114 (2016). 295
[HS05] Daniel Hug and Rolf Schneider, Large typical cells in Poisson-Delaunay mosaics,
Rev. Roumaine Math. Pures Appl. 50 (2005), no. 5-6, 657–670. 178
[HSR03] Michael Horodecki, Peter W. Shor, and Mary Beth Ruskai, Entanglement breaking
channels, Rev. Math. Phys. 15 (2003), no. 6, 629–641. 64
[HT03] Uffe Haagerup and Steen Thorbjørnsen, Random matrices with complex Gaussian
entries, Expositiones Mathematicae 21 (2003), no. 4, 293–337. 170, 179
[HT05] Uffe Haagerup and Steen Thorbjørnsen, A new application of random matrices: $\mathrm{Ext}(C^*_{\mathrm{red}}(F_2))$ is not a group, Ann.
of Math. (2) 162 (2005), no. 2, 711–775. 179, 180
[Hun72] Walter Hunziker, A note on symmetry operations in quantum mechanics. 63


[HW08] Patrick Hayden and Andreas Winter, Counterexamples to the maximal p-norm multiplicativity
conjecture for all $p > 1$, Communications in Mathematical Physics 284
(2008), no. 1, 263–280. 232, 233


[HW16] Han Huang and Feng Wei, Upper bound for the Dvoretzky dimension in Milman-
Schechtman theorem, arXiv eprint 1612.03572 (2016). 208


[Ide13] Martin Idel, On the structure of positive maps, Master’s thesis, Technische Universität
München, 2013. 64
[Ide16] Martin Idel, A review of matrix scaling and Sinkhorn’s normal form for matrices and
positive maps, arXiv preprint 1609.06349 (2016). 64
[Ind00] Piotr Indyk, Dimensionality reduction techniques for proximity problems, Proceed-
ings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (San
Francisco, CA, 2000), ACM, New York, 2000, pp. 371–378. 207
[Ind07] Piotr Indyk, Uncertainty principles, extractors, and explicit embeddings of $l_2$ into $l_1$,
STOC’07—Proceedings of the 39th Annual ACM Symposium on Theory of Com-
puting, ACM, New York, 2007, pp. 615–620. 207


[IS10] Piotr Indyk and Stanislaw Szarek, Almost-Euclidean subspaces of $\ell_1^N$ via tensor
products: a simple approach to randomness reduction, Approximation, randomiza-
tion, and combinatorial optimization, Lecture Notes in Comput. Sci., vol. 6302,
Springer, Berlin, 2010, pp. 632–641. 207, 210
[Jam72] A. Jamiołkowski, Linear transformations which preserve trace and positive semidef-
initeness of operators, Rep. Mathematical Phys. 3 (1972), no. 4, 275–278. 64
[Jan97] Svante Janson, Gaussian Hilbert spaces, Cambridge Tracts in Mathematics, vol. 129,
Cambridge University Press, Cambridge, 1997. 145
[Jen13] Justin Jenkinson, Convex geometric connections to information theory, Ph.D. thesis,
Case Western Reserve University, 2013, http://rave.ohiolink.edu/etdc/view?acc_num=case1365179413. 109, 260
[JHH+15] P. Joshi, K. Horodecki, M. Horodecki, P. Horodecki, R. Horodecki, Ben Li, S. J.
Szarek, and T. Szarek, Bound on Bell inequalities by fraction of determinism and
reverse triangle inequality, Phys. Rev. A 92 (2015), 032329. 297
[JHK+08] Eylee Jung, Mi-Ra Hwang, Hungsoo Kim, Min-Soo Kim, DaeKil Park, Jin-Woo Son,
and Sayatnova Tamaryan, Reduced state uniquely defines the Groverian measure of
the original pure state, Phys. Rev. A 77 (2008), 062317. 233
[JL84] William B. Johnson and Joram Lindenstrauss, Extensions of Lipschitz mappings
into a Hilbert space, Contemp. Math. 26 (1984), 189–206 (English). 210
[JLN14] Maria Anastasia Jivulescu, Nicolae Lupa, and Ion Nechita, On the reduction crite-
rion for random quantum states, Journal of Mathematical Physics 55 (2014), no. 11,
–. 274
[JLN15] Maria Anastasia Jivulescu, Nicolae Lupa, and Ion Nechita, Thresholds for entanglement criteria in quantum information theory,
Quantum Inf. Comput. 15 (2015), no. 13-4, 1165–1184. 274
[JM78] Naresh C. Jain and Michael B. Marcus, Continuity of sub-Gaussian processes, Prob-
ability on Banach spaces, Adv. Probab. Related Topics, vol. 4, Dekker, New York,
1978, pp. 81–196. 179
[JNP+11] M. Junge, M. Navascues, C. Palazuelos, D. Perez-Garcia, V. B. Scholz, and R. F.
Werner, Connes embedding problem and Tsirelson’s problem, J. Math. Phys. 52
(2011), no. 1, 012102, 12. 296
[Joh48] Fritz John, Extremum problems with inequalities as subsidiary conditions, Studies
and Essays Presented to R. Courant on his 60th Birthday, January 8, 1948, Inter-
science Publishers, Inc., New York, N. Y., 1948, pp. 187–204. 104
[JP11] M. Junge and C. Palazuelos, Large violation of Bell inequalities with low entangle-
ment, Comm. Math. Phys. 306 (2011), no. 3, 695–746. 297
[JPPG+10] M. Junge, C. Palazuelos, D. Pérez-García, I. Villanueva, and M. M. Wolf, Operator
space theory: a natural framework for Bell inequalities, Phys. Rev. Lett. 104 (2010),
no. 17, 170405, 4. 294
[JS] Justin Jenkinson and Stanisław Szarek, Optimal constants in concentration inequalities
on the sphere, in preparation. 109, 144
[JS91] William B. Johnson and Gideon Schechtman, Remarks on Talagrand’s deviation
inequality for Rademacher functions, Functional analysis (Austin, TX, 1987/1989),
Lecture Notes in Math., vol. 1470, Springer, Berlin, 1991, pp. 72–77. 146
[Kad65] Richard V. Kadison, Transformations of states in operator theory and dynamics,
Topology 3 (1965), no. suppl. 2, 177–198. 63
[Kah85] Jean-Pierre Kahane, Some random series of functions, second ed., Cambridge Studies
in Advanced Mathematics, vol. 5, Cambridge University Press, Cambridge, 1985.
147
[Kah86] , Une inégalité du type de Slepian et Gordon sur les processus gaussiens,
Israel J. Math. 55 (1986), no. 1, 109–110. 178
[Kar11] Zohar Shay Karnin, Deterministic construction of a high dimensional $\ell_p$ section in
$\ell_1^n$ for any $p < 2$, Proceedings of the 43rd Annual ACM Symposium on Theory of
Computing, ACM, 2011, pp. 645–654. 210
[Kaš77] B. S. Kašin, The widths of certain finite-dimensional sets and classes of smooth
functions, Izv. Akad. Nauk SSSR Ser. Mat. 41 (1977), no. 2, 334–351, 478. 208, 209
[Kat75] G. O. H. Katona, The Hamming-sphere has minimum boundary, Studia Sci. Math.
Hungar. 10 (1975), no. 1-2, 131–140. 146
[KCKL00] B. Kraus, J. I. Cirac, S. Karnas, and M. Lewenstein, Separability in $2 \times N$ composite
quantum systems, Phys. Rev. A (3) 61 (2000), no. 6, 062302, 10. 65
[Kec95] Alexander S. Kechris, Classical descriptive set theory, Graduate Texts in Mathe-
matics, vol. 156, Springer-Verlag, New York, 1995. 104
[Kha67] C. G. Khatri, On certain inequalities for normal distributions and their applications
to simultaneous confidence bounds, Ann. Math. Statist. 38 (1967), 1853–1867. 178
[Kir76] A. A. Kirillov, Elements of the theory of representations, Springer-Verlag, Berlin-
New York, 1976, Translated from the Russian by Edwin Hewitt, Grundlehren der
Mathematischen Wissenschaften, Band 220. 294
[Kis87] Christer O. Kiselman, Smoothness of vector sums of plane convex sets, Math. Scand.
60 (1987), no. 2, 239–252. 104
[KL78] G. A. Kabatjanskiı̆ and V. I. Levenšteı̆n, Bounds for packings on the sphere and in
space, Problemy Peredači Informacii 14 (1978), no. 1, 3–25. 142
[KL09] Robert L. Kosut and Daniel A. Lidar, Quantum error correction via convex opti-
mization, Quantum Inf. Process. 8 (2009), no. 5, 443–459. 29
[Kla06] B. Klartag, On convex perturbations with a bounded isotropic constant, Geom.
Funct. Anal. 16 (2006), no. 6, 1274–1290. 105
[Kle32] O. Klein, Zur Berechnung von Potentialkurven für zweiatomige Moleküle mit Hilfe
von Spektraltermen, Zeitschrift für Physik 76 (1932), no. 3-4, 226–235 (German). 29
[KM05] B. Klartag and V. D. Milman, Geometry of log-concave functions and measures,
Geom. Dedicata 112 (2005), 169–182. 105
[KMP98] H. König, M. Meyer, and A. Pajor, The isotropy constants of the Schatten classes
are bounded, Math. Ann. 312 (1998), no. 4, 773–783. 105
[Kol05] Alexander Koldobsky, Fourier analysis in convex geometry, Mathematical Surveys
and Monographs, vol. 116, American Mathematical Society, Providence, RI, 2005.
106
[Kom55] Yûsaku Komatu, Elementary inequalities for Mills’ ratio, Rep. Statist. Appl. Res.
Un. Jap. Sci. Engrs. 4 (1955), 69–70. 309
[KP88] G. A. Kabatyanskiı̆ and V. I. Panchenko, Packings and coverings of the Hamming
space by unit balls, Dokl. Akad. Nauk SSSR 303 (1988), no. 3, 550–552. 142
[Kra71] K. Kraus, General state changes in quantum theory, Ann. Physics 64 (1971), 311–
335. 64
[Kra83] Karl Kraus, States, effects, and operations, Lecture Notes in Physics, vol. 190,
Springer-Verlag, Berlin, 1983, Fundamental notions of quantum theory, Lecture
notes edited by A. Böhm, J. D. Dollard and W. H. Wootters. 64
[Kri79] J.-L. Krivine, Constantes de Grothendieck et fonctions de type positif sur les sphères,
Adv. in Math. 31 (1979), no. 1, 16–30. 295
[KS67] Simon Kochen and E. P. Specker, The problem of hidden variables in quantum
mechanics, J. Math. Mech. 17 (1967), 59–87. 297
[KS03] Boris S. Kashin and Stanislaw J. Szarek, The Knaster problem and the geometry of
high-dimensional cubes, C. R. Math. Acad. Sci. Paris 336 (2003), no. 11, 931–936.
208
[KT85] Leonid A Khalfin and Boris S Tsirelson, Quantum and quasi-classical analogs of Bell
inequalities, Symposium on the foundations of modern physics, vol. 85, Singapore:
World Scientific, 1985, p. 441. 296
[KTJ09] Hermann König and Nicole Tomczak-Jaegermann, Projecting $\ell_\infty$ onto classical
spaces, Constr. Approx. 29 (2009), no. 2, 277–292. 210
[Kup92] Greg Kuperberg, A low-technology estimate in convex geometry, Internat. Math.
Res. Notices (1992), no. 9, 181–183. 105
[Kup08] , From the Mahler conjecture to Gauss linking integrals, Geom. Funct. Anal.
18 (2008), no. 3, 870–892. 105
[KV07] B. Klartag and R. Vershynin, Small ball probability and Dvoretzky’s Theorem., Isr.
J. Math. 157 (2007), 193–207 (English). 209
[KVSW09] Dmitry S. Kaliuzhnyi-Verbovetskyi, Ilya M. Spitkovsky, and Hugo J. Woerdeman,
Matrices with normal defect one, Oper. Matrices 3 (2009), no. 3, 401–438. 65
[Kwa76] S. Kwapień, A theorem on the Rademacher series with vector valued coefficients,
Probability in Banach spaces (Proc. First Internat. Conf., Oberwolfach, 1975),
Springer, Berlin, 1976, pp. 157–158. Lecture Notes in Math., Vol. 526. 147
[Kwa94] Stanisław Kwapień, A remark on the median and the expectation of convex functions
of Gaussian vectors, Probability in Banach spaces, 9 (Sandjberg, 1993), Progr.
Probab., vol. 35, Birkhäuser Boston, Boston, MA, 1994, pp. 271–272. 144
[Lan16] Cécilia Lancien, k-Extendibility of high-dimensional bipartite quantum states, Ran-
dom Matrices Theory Appl. 5 (2016), no. 3, 1650011, 58. 260, 274
[Las08] Marek Lassak, Banach-Mazur distance of central sections of a centrally symmetric
convex body, Beiträge Algebra Geom. 49 (2008), no. 1, 243–246. 339
[Lat96] Rafał Latała, A note on the Ehrhard inequality, Studia Math. 118 (1996), no. 2,
169–174. 144
[Lat97] , Estimation of moments of sums of independent real random variables, Ann.
Probab. 25 (1997), no. 3, 1502–1513. 146
[Lat02] R. Latała, On some inequalities for Gaussian measures, Proceedings of the Inter-
national Congress of Mathematicians, Vol. II (Beijing, 2002), Higher Ed. Press,
Beijing, 2002, pp. 813–822. 144
[Lat06] Rafał Latała, Estimates of moments and tails of Gaussian chaoses, Ann. Probab.
34 (2006), no. 6, 2315–2331. 145
[Lea91] Imre Leader, Discrete isoperimetric inequalities, Probabilistic combinatorics and its
applications (San Francisco, CA, 1991), Proc. Sympos. Appl. Math., vol. 44, Amer.
Math. Soc., Providence, RI, 1991, pp. 57–80. 146
[Led96] Michel Ledoux, Isoperimetry and Gaussian analysis, Lectures on probability the-
ory and statistics (Saint-Flour, 1994), Lecture Notes in Math., vol. 1648, Springer,
Berlin, 1996, pp. 165–294. 144
[Led01] , The concentration of measure phenomenon, Mathematical Surveys and
Monographs, vol. 89, American Mathematical Society, Providence, RI, 2001. 117,
119, 126, 143, 144, 145
[Led03] , A remark on hypercontractivity and tail inequalities for the largest eigenvalues
of random matrices, Séminaire de Probabilités XXXVII, Lecture Notes in
Math., vol. 1832, Springer, Berlin, 2003, pp. 360–369. 179
[Led97] , On Talagrand’s deviation inequalities for product measures, ESAIM
Probab. Statist. 1 (1995/97), 63–87 (electronic). 146
[Lei72] L. Leindler, On a certain converse of Hölder’s inequality. II, Acta Sci. Math.
(Szeged) 33 (1972), no. 3-4, 217–223. 105
[Lév22] Paul Lévy, Leçons d’analyse fonctionnelle, Gauthier–Villars, Paris, 1922. 143, 144
[Lév51] , Problèmes concrets d’analyse fonctionnelle. Avec un complément sur les
fonctionnelles analytiques par F. Pellegrino, Gauthier-Villars, Paris, 1951, 2d ed.
143
[Li] Ben Li, in preparation, Ph.D. thesis, Case Western Reserve University. 295
[LJL15] Gao Li, Marius Junge, and Nicholas LaRacuente, Capacity Bounds via Operator
Space Methods, arXiv preprint 1509.07294 (2015). 232
[LKCH00] M. Lewenstein, B. Kraus, J. I. Cirac, and P. Horodecki, Optimization of entanglement
witnesses, Phys. Rev. A 62 (2000), 052310. 64
[LLR83] M. R. Leadbetter, Georg Lindgren, and Holger Rootzén, Extremes and related properties
of random sequences and processes, Springer Series in Statistics, Springer-Verlag,
New York-Berlin, 1983. 178
[LM15] Rafał Latała and Dariusz Matlak, Royen’s proof of the Gaussian correlation inequality,
arXiv preprint 1512.08776 (2015). 178
[LMO06] Jon Magne Leinaas, Jan Myrheim, and Eirik Ovrum, Geometrical aspects of entan-
glement, Phys. Rev. A (3) 74 (2006), no. 1, 012313, 13. 65
[LN16] Kasper Green Larsen and Jelani Nelson, The Johnson-Lindenstrauss lemma is optimal
for linear dimensionality reduction, Proceedings of the 43rd International Colloquium
on Automata, Languages and Programming (ICALP 2016), 2016. 210
[LO94] Rafał Latała and Krzysztof Oleszkiewicz, On the best constant in the Khinchin-
Kahane inequality, Studia Math. 109 (1994), no. 1, 101–104. 147
[LO99] , Gaussian measures of dilatations of convex symmetric sets, Ann. Probab.
27 (1999), no. 4, 1922–1938. 207
[LP68] J. Lindenstrauss and A. Pełczyński, Absolutely summing operators in $L_p$-spaces and
their applications, Studia Math. 29 (1968), 275–326. 295
[LP99] N. Linden and S. Popescu, Bound entanglement and teleportation, Phys. Rev. A 59
(1999), 137–140. 306
[LQ04] Daniel Li and Hervé Queffélec, Introduction à l’étude des espaces de Banach, Cours
Spécialisés [Specialized Courses], vol. 12, Société Mathématique de France, Paris,
2004, Analyse et probabilités. [Analysis and probability theory]. 207
[LR10] Michel Ledoux and Brian Rider, Small deviations for beta ensembles, Electron. J.
Probab. 15 (2010), no. 41, 1319–1343. 179
[LS75] Raphael Loewy and Hans Schneider, Positive operators on the n-dimensional ice
cream cone, J. Math. Anal. Appl. 49 (1975), 375–392. 324
[LS93] L. J. Landau and R. F. Streater, On Birkhoff ’s theorem for doubly stochastic com-
pletely positive maps of matrix algebras, Linear Algebra Appl. 193 (1993), 107–127.
64
[LS08] Shachar Lovett and Sasha Sodin, Almost Euclidean sections of the $N$-dimensional
cross-polytope using $O(N)$ random bits, Commun. Contemp. Math. 10 (2008), no. 4,
477–489. 207
[LS13] M. S. Leifer and Robert W. Spekkens, Towards a formulation of quantum theory as
a causally neutral theory of Bayesian inference, Phys. Rev. A 88 (2013), 052130. 64
[LT91] Michel Ledoux and Michel Talagrand, Probability in Banach spaces, Ergebnisse der
Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas
(3)], vol. 23, Springer-Verlag, Berlin, 1991, Isoperimetry and processes. 178
[LT92] Chi-Kwong Li and Nam-Kiu Tsing, Linear preserver problems: a brief introduc-
tion and some special techniques, Linear Algebra Appl. 162/164 (1992), 217–235,
Directions in matrix theory (Auburn, AL, 1990). 64
[Mat02] Jiří Matoušek, Lectures on discrete geometry, Graduate Texts in Mathematics, vol.
212, Springer-Verlag, New York, 2002. 146
[Mau79] Bernard Maurey, Construction de suites symétriques, C. R. Acad. Sci. Paris Sér.
A-B 288 (1979), no. 14, A679–A681. 146
[Mau91] B. Maurey, Some deviation inequalities, Geom. Funct. Anal. 1 (1991), no. 2, 188–197. 146
[Mau03] Bernard Maurey, Type, cotype and K-convexity, Handbook of the geometry of Banach
spaces, Vol. 2, North-Holland, Amsterdam, 2003, pp. 1299–1332. 207, 209
[McC06] Robert J. McCann, Stable rotating binary stars and fluid in a tube, Houston J.
Math. 32 (2006), no. 2, 603–631. 179
[McD89] Colin McDiarmid, On the method of bounded differences, Surveys in combinatorics,
1989 (Norwich, 1989), London Math. Soc. Lecture Note Ser., vol. 141, Cambridge
Univ. Press, Cambridge, 1989, pp. 148–188. 144
[McD98] , Concentration, Probabilistic methods for algorithmic discrete mathematics,
Algorithms Combin., vol. 16, Springer, Berlin, 1998, pp. 195–248. 146
[Mec] Elizabeth Meckes, The random matrix theory of the classical compact groups, Cambridge
University Press, in preparation. 179
[Mec03] Mark W. Meckes, Random phenomena in finite-dimensional normed spaces, Ph.D.
thesis, Case Western Reserve University, 2003. 146
[Mec04] , Concentration of norms and eigenvalues of random matrices, J. Funct.
Anal. 211 (2004), no. 2, 508–524. 146
[Mer07] N. David Mermin, Quantum computer science, Cambridge University Press, Cambridge,
2007, An introduction. 75
[Mil71] V. D. Milman, A new proof of A. Dvoretzky’s theorem on cross-sections of convex
bodies, Funkcional. Anal. i Priložen. 5 (1971), no. 4, 28–37. 208
[Mil85] V.D. Milman, Random subspaces of proportional dimension of finite dimensional
normed spaces: Approach through the isoperimetric inequality., Banach spaces,
Proc. Conf., Columbia/Mo. 1984, Lect. Notes Math. 1166, 106-115 (1985)., 1985.
209
[Mil86] Vitali D. Milman, Inégalité de Brunn-Minkowski inverse et applications à la théorie
locale des espaces normés. (An inverse form of the Brunn-Minkowski inequality with
applications to local theory of normed spaces)., C. R. Acad. Sci., Paris, Sér. I 302
(1986), 25–28 (English). 143, 209
[Mil87] V. D. Milman, Some remarks on Urysohn’s inequality and volume ratio of cotype
2-spaces, Geometrical aspects of functional analysis (1985/86), Lecture Notes in
Math., vol. 1267, Springer, Berlin, 1987, pp. 75–81. 209
[Mil88] V.D. Milman, A few observations on the connections between local theory and some
other fields., Geometric aspects of functional analysis, Isr. Semin. 1986-87, Lect.
Notes Math. 1317, 283-289 (1988)., 1988. 208
[Mil15] Emanuel Milman, Sharp isoperimetric inequalities and model spaces for the curva-
ture-dimension-diameter condition, J. Eur. Math. Soc. (JEMS) 17 (2015), no. 5,
1041–1078. 144
[Min11] Hermann Minkowski, Gesammelte Abhandlungen von Hermann Minkowski. Unter
Mitwirkung von Andreas Speiser und Hermann Weyl, herausgegeben von David
Hilbert. Band I, II., Leipzig u. Berlin: B. G. Teubner. Erster Band. Mit einem
Bildnis Hermann Minkowskis und 6 Figuren im Text. xxxvi, 371 S.; Zweiter Band.
Mit einem Bildnis Hermann Minkowskis, 34 Figuren in Text und einer Doppeltafel.
iv, 466 S. gr. 8° (1911)., 1911. 29
[MM13] Elizabeth S. Meckes and Mark W. Meckes, Spectral measures of powers of random
matrices, Electron. Commun. Probab. 18 (2013), no. 78, 13. 134
[MO15] Marek Miller and Robert Olkiewicz, Topology of the cone of positive maps on qubit
systems, Journal of Physics A: Mathematical and Theoretical 48 (2015), no. 25,
255203. 64, 324
[MO16] Marek Miller and Robert Olkiewicz, Extremal positive maps on $M_3(\mathbb{C})$ and idempotent
matrices, Open Syst. Inf. Dyn. 23 (2016), no. 1, 1650001, 13. 65
[Mon12] Ashley Montanaro, Some applications of hypercontractive inequalities in quantum
information theory, J. Math. Phys. 53 (2012), no. 12, 122206, 15. 136, 146
[Mon13] , Weak multiplicativity for random quantum channels, Comm. Math. Phys.
319 (2013), no. 2, 535–555. 233
[MP67] V. A. Marčenko and L. A. Pastur, Distribution of eigenvalues in certain sets of
random matrices, Mat. Sb. (N.S.) 72 (114) (1967), 507–536. 179
[MP86] Vitali D. Milman and Gilles Pisier, Banach spaces with a weak cotype 2 property,
Israel J. Math. 54 (1986), no. 2, 139–158. 209
[MP00] V. D. Milman and A. Pajor, Entropy and asymptotic geometry of non-symmetric
convex bodies, Adv. Math. 152 (2000), no. 2, 314–335. 105
[MS86] Vitali D. Milman and Gideon Schechtman, Asymptotic theory of finite-dimensional
normed spaces, Lecture Notes in Mathematics, vol. 1200, Springer-Verlag, Berlin,
1986, With an appendix by M. Gromov. 144, 146, 207
[MS97] V. D. Milman and G. Schechtman, Global versus local asymptotic theories of finite-dimensional
normed spaces, Duke Math. J. 90 (1997), no. 1, 73–93. 208
[MS12] Mark W. Meckes and Stanisław J. Szarek, Concentration for noncommutative poly-
nomials in random matrices, Proc. Amer. Math. Soc. 140 (2012), no. 5, 1803–1813.
180
[MTJ87] V. D. Milman and N. Tomczak-Jaegermann, Sudakov type inequalities for convex
bodies in $\mathbb{R}^n$, Geometrical aspects of functional analysis (1985/86), Lecture Notes
in Math., vol. 1267, Springer, Berlin, 1987, pp. 113–121. 178
[MWW09] William Matthews, Stephanie Wehner, and Andreas Winter, Distinguishability of
quantum states under restricted families of measurements with an application to
quantum data hiding, Comm. Math. Phys. 291 (2009), no. 3, 813–843. 305
[Naz12] Fedor Nazarov, The Hörmander proof of the Bourgain-Milman theorem, Geometric
aspects of functional analysis, Lecture Notes in Math., vol. 2050, Springer, Heidelberg,
2012, pp. 335–343. 105
[NC00] Michael A. Nielsen and Isaac L. Chuang, Quantum computation and quantum in-
formation, Cambridge University Press, Cambridge, 2000. 63, 232
[Nel73] Edward Nelson, The free Markoff field, J. Functional Analysis 12 (1973), 211–227.
145
[Nem07] Arkadi Nemirovski, Advances in convex optimization: conic programming, International
Congress of Mathematicians. Vol. I, Eur. Math. Soc., Zürich, 2007, pp. 413–444. 29
[NS06] Alexandru Nica and Roland Speicher, Lectures on the combinatorics of free probability,
London Mathematical Society Lecture Note Series, vol. 335, Cambridge University
Press, Cambridge, 2006. 177, 180
[O’D14] Ryan O’Donnell, Analysis of Boolean functions, Cambridge University Press, New
York, 2014. 136, 146
[Oza13] Narutaka Ozawa, About the Connes embedding conjecture: algebraic approaches,
Jpn. J. Math. 8 (2013), no. 1, 147–183. 296
[Pag93] Don N. Page, Average entropy of a subsystem, Phys. Rev. Lett. 71 (1993), no. 9,
1291–1294. 232
[Paj99] Alain Pajor, Metric entropy of the Grassmann manifold, Convex geometric analysis
(Berkeley, CA, 1996), Math. Sci. Res. Inst. Publ., vol. 34, Cambridge Univ. Press,
Cambridge, 1999, pp. 181–188. 143
[Par04] K. R. Parthasarathy, On the maximal dimension of a completely entangled subspace
for finite level quantum systems, Proc. Indian Acad. Sci. Math. Sci. 114 (2004),
no. 4, 365–374. 231
[Peł80] Aleksander Pełczyński, Geometry of finite-dimensional Banach spaces and operator
ideals, Notes in Banach spaces, Univ. Texas Press, Austin, Tex., 1980, pp. 81–181.
208
[Per96] Asher Peres, Separability criterion for density matrices, Phys. Rev. Lett. 77 (1996),
1413–1415. 63
[Per99] , All the Bell inequalities, Found. Phys. 29 (1999), no. 4, 589–614, Invited
papers dedicated to Daniel Greenberger, Part II. 297
[Pet01] Dénes Petz, Entropy, von Neumann and the von Neumann entropy, John von Neu-
mann and the foundations of quantum physics (Budapest, 1999), Vienna Circ. Inst.
Yearb., vol. 8, Kluwer Acad. Publ., Dordrecht, 2001, pp. 83–96. 29
[Pet06] Peter Petersen, Riemannian geometry, second ed., Graduate Texts in Mathematics,
vol. 171, Springer, New York, 2006. 131
[PGWP+08] D. Pérez-García, M. M. Wolf, C. Palazuelos, I. Villanueva, and M. Junge, Unbounded
violation of tripartite Bell inequalities, Comm. Math. Phys. 279 (2008), no. 2, 455–486. 297
[Pic68] James Pickands, III, Moment convergence of sample extremes, Ann. Math. Statist.
39 (1968), 881–889. 178
[Pis80] G. Pisier, Un théorème sur les opérateurs linéaires entre espaces de Banach qui se
factorisent par un espace de Hilbert, Ann. Sci. École Norm. Sup. (4) 13 (1980),
no. 1, 23–43. 207
[Pis81] , Remarques sur un résultat non publié de B. Maurey, Seminar on Functional
Analysis, 1980–1981, École Polytech., Palaiseau, 1981, pp. Exp. No. V, 13. 207
[Pis86] Gilles Pisier, Probabilistic methods in the geometry of Banach spaces, Probability
and analysis (Varenna, 1985), Lecture Notes in Math., vol. 1206, Springer, Berlin,
1986, pp. 167–241. 144
[Pis89a] , A new approach to several results of V. Milman, J. Reine Angew. Math.
393 (1989), 115–131. 209
[Pis89b] , The volume of convex bodies and Banach space geometry, Cambridge Tracts
in Mathematics, vol. 94, Cambridge University Press, Cambridge, 1989. 142, 143,
207, 209
[Pis12a] , Grothendieck’s theorem, past and present, Bull. Amer. Math. Soc. (N.S.)
49 (2012), no. 2, 237–323. 295
[Pis12b] , Tripartite Bell inequality, random matrices and trilinear forms, arXiv
eprint 1203.2509 (2012). 297
[Pit89] Itamar Pitowsky, Quantum probability—quantum logic, Lecture Notes in Physics,
vol. 321, Springer-Verlag, Berlin, 1989. 295
[Por81] Ian R. Porteous, Topological geometry, second ed., Cambridge University Press,
Cambridge, 1981. 294
[PR94] Sandu Popescu and Daniel Rohrlich, Quantum nonlocality as an axiom, Found.
Phys. 24 (1994), no. 3, 379–385. 296
[Pré71] András Prékopa, Logarithmic concave measures with application to stochastic pro-
gramming, Acta Sci. Math. (Szeged) 32 (1971), 301–316. 105
[Pré73] , On logarithmic concave measures and functions, Acta Sci. Math. (Szeged)
34 (1973), 335–343. 105
[PT86] Alain Pajor and Nicole Tomczak-Jaegermann, Subspaces of small codimension of
finite-dimensional Banach spaces., Proc. Am. Math. Soc. 97 (1986), 637–642 (English).
209
[PTJ85] Alain Pajor and Nicole Tomczak-Jaegermann, Remarques sur les nombres d’entropie
d’un opérateur et de son transposé, C. R. Acad. Sci. Paris Sér. I Math. 301 (1985),
no. 15, 743–746. 178
[PTJ90] , Gel'fand numbers and Euclidean sections of large dimensions, Probability
in Banach spaces 6 (Sandbjerg, 1986), Progr. Probab., vol. 20, Birkhäuser Boston,
Boston, MA, 1990, pp. 252–264. 208
[PV07] Martin B. Plenio and Shashank Virmani, An introduction to entanglement measures,
Quantum Inf. Comput. 7 (2007), no. 1-2, 1–51. 232, 273
[PV16] Carlos Palazuelos and Thomas Vidick, Survey on nonlocal games and operator space
theory, Journal of Mathematical Physics 57 (2016), no. 1. 275, 281, 295, 296, 297
[PY15] C. Palazuelos and Z. Yin, Large bipartite Bell violations with dichotomic measure-
ments, Phys. Rev. A 92 (2015), 052313. 297
[Ran55] R. A. Rankin, The closest packing of spherical caps in n dimensions, Proc. Glasgow
Math. Assoc. 2 (1955), 139–144. 142
[Rei08] Michael Reimpell, Quantum information and convex optimization, Ph.D. thesis,
Technische Universität Braunschweig, 2008. 29
[Roc70] R. Tyrrell Rockafellar, Convex analysis, Princeton Mathematical Series, No. 28,
Princeton University Press, Princeton, N.J., 1970. 14, 28
[Rog47] C. A. Rogers, Existence theorems in the geometry of numbers, Ann. of Math. (2) 48
(1947), 994–1002. 142
[Rog57] , A note on coverings, Mathematika 4 (1957), 1–6. 142
[Rog63] , Covering a sphere with spheres, Mathematika 10 (1963), 157–164. 142
[Rog64] , Packing and covering, Cambridge Tracts in Mathematics and Mathematical
Physics, No. 54, Cambridge University Press, New York, 1964. 141
[Rot86] O. S. Rothaus, Hypercontractivity and the Bakry-Emery criterion for compact Lie
groups, J. Funct. Anal. 65 (1986), no. 3, 358–367. 145
[Rot06] Ron Roth, Introduction to coding theory, Cambridge University Press, 2006. 142
[Roy14] Thomas Royen, A simple proof of the Gaussian correlation conjecture extended to
some multivariate gamma distributions, Far East J. Theor. Stat. 48 (2014), no. 2,
139–145. 178
[RP11] Eleanor Rieffel and Wolfgang Polak, Quantum computing, Scientific and Engineering
Computation, MIT Press, Cambridge, MA, 2011, A gentle introduction. 75
[RS58] C. A. Rogers and G. C. Shephard, Convex bodies associated with a given convex
body, J. London Math. Soc. 33 (1958), 270–281. 105
[RSW02] Mary Beth Ruskai, Stanislaw Szarek, and Elisabeth Werner, An analysis of completely
positive trace-preserving maps on $M_2$, Linear Algebra Appl. 347 (2002),
159–187. 64
[Rud97] M. Rudelson, Contact points of convex bodies, Israel J. Math. 101 (1997), 93–124.
208
[Rud00] , Distances between non-symmetric convex bodies and the $MM^*$-estimate,
Positivity 4 (2000), no. 2, 161–178. 103, 207
[Rud05] Oliver Rudolph, Further results on the cross norm criterion for separability, Quan-
tum Inf. Process. 4 (2005), no. 3, 219–239. 63
[RW00] Mary Beth Ruskai and Elisabeth Werner, Study of a class of regularizations of
$1/|X|$ using Gaussian integrals, SIAM J. Math. Anal. 32 (2000), no. 2, 435–463
(electronic). 309
[RW09] Mary Beth Ruskai and Elisabeth M Werner, Bipartite states of low rank are almost
surely entangled, Journal of Physics A: Mathematical and Theoretical 42 (2009),
no. 9, 095303. 273
[RZ14] Dmitry Ryabogin and Artem Zvavitch, Analytic methods in convex geometry, Analytical
and probabilistic methods in the geometry of convex bodies, IMPAN Lect.
Notes, vol. 2, Polish Acad. Sci. Inst. Math., Warsaw, 2014, pp. 87–183. 105
[Sam53] M. R. Sampford, Some inequalities on Mill’s ratio and related functions, Ann. Math.
Statistics 24 (1953), 130–132. 309
[SBŻ06] Stanisław J Szarek, Ingemar Bengtsson, and Karol Życzkowski, On the structure of
the body of states with positive partial transpose, Journal of Physics A: Mathematical
and General 39 (2006), no. 5, L119. 273
[SC74] V. N. Sudakov and B. S. Cirel'son, Extremal properties of half-spaces for spherically
invariant measures, Zap. Naučn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI)
41 (1974), 14–24, 165, Problems in the theory of probability distributions, II. 144
[SC94] L. Saloff-Coste, Precise estimates on the rate at which certain diffusions tend to
equilibrium, Math. Z. 217 (1994), no. 4, 641–677. 145
[Sch48] Erhard Schmidt, Die Brunn-Minkowskische Ungleichung und ihr Spiegelbild sowie
die isoperimetrische Eigenschaft der Kugel in der euklidischen und nichteuklidischen
Geometrie. I, Math. Nachr. 1 (1948), 81–157. 143
[Sch50] Robert Schatten, A Theory of Cross-Spaces, Annals of Mathematics Studies, no. 26,
Princeton University Press, Princeton, N. J., 1950. 29
[Sch65] Hans Schneider, Positive operators and an inertia theorem, Numer. Math. 7 (1965),
11–17. 64
[Sch70] Robert Schatten, Norm ideals of completely continuous operators, Second print-
ing. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 27, Springer-Verlag,
Berlin-New York, 1970. 29
[Sch82] Gideon Schechtman, Lévy type inequality for a class of finite metric spaces, Martin-
gale theory in harmonic analysis and Banach spaces (Cleveland, Ohio, 1981), Lecture
Notes in Math., vol. 939, Springer, Berlin-New York, 1982, pp. 211–215. 146
[Sch84] Carsten Schütt, Entropy numbers of diagonal operators between symmetric Banach
spaces, J. Approx. Theory 40 (1984), no. 2, 121–128. 156, 157
[Sch87] Gideon Schechtman, More on embedding subspaces of $L_p$ in $\ell_r^n$., Compos. Math. 61
(1987), 159–169 (English). 209
[Sch89] Gideon Schechtman, A remark concerning the dependence on $\varepsilon$ in Dvoretzky’s theorem,
Geometric aspects of functional analysis (1987–88), Lecture Notes in Math.,
vol. 1376, Springer, Berlin, 1989, pp. 274–277. 208
[Sch99] Michael Schmuckenschläger, An extremal property of the regular simplex, Convex
geometric analysis (Berkeley, CA, 1996), Math. Sci. Res. Inst. Publ., vol. 34, Cambridge
Univ. Press, Cambridge, 1999, pp. 199–202. 342
[Sch03] Gideon Schechtman, Concentration results and applications, Handbook of the geometry
of Banach spaces, Vol. 2, North-Holland, Amsterdam, 2003, pp. 1603–1634.
143, 144
[Sch07] G. Schechtman, The random version of Dvoretzky’s theorem in $\ell_\infty^n$, Geometric aspects
of functional analysis, Lecture Notes in Math., vol. 1910, Springer, Berlin,
2007, pp. 265–270. 208
[Sch14] Rolf Schneider, Convex bodies: the Brunn-Minkowski theory, expanded ed., Encyclopedia
of Mathematics and its Applications, vol. 151, Cambridge University Press,
Cambridge, 2014. 103, 104, 344
[SCM16] Gniewomir Sarbicki, Dariusz Chruściński, and Marek Mozrzymas, Generalising
Wigner’s theorem, Journal of Physics A: Mathematical and Theoretical 49 (2016),
no. 30, 305302. 63
[See66] R. T. Seeley, Spherical harmonics, Amer. Math. Monthly 73 (1966), no. 4, part II,
115–121. 145
[Sen96] Siddhartha Sen, Average entropy of a quantum subsystem, Physical review letters
77 (1996), no. 1, 1. 232
[Sha48] C. E. Shannon, A mathematical theory of communication, Bell System Tech. J. 27
(1948), 379–423, 623–656. 29
[Sha08] R. Shankar, Principles of quantum mechanics, second ed., Springer, New York, 2008,
Corrected reprint of the second (1994) edition. 75
[Shi95] Abner Shimony, Degree of entanglement, Fundamental problems in quantum theory
(Baltimore, MD, 1994), Ann. New York Acad. Sci., vol. 755, New York Acad. Sci.,
New York, 1995, pp. 675–679. 233
[Sho04] Peter W. Shor, Equivalence of additivity questions in quantum information theory,
Comm. Math. Phys. 246 (2004), no. 3, 453–472. 232
[Šid67] Zbyněk Šidák, Rectangular confidence regions for the means of multivariate normal
distributions, J. Amer. Statist. Assoc. 62 (1967), 626–633. 178
[Šid68] , On multivariate normal probabilities of rectangles: Their dependence on
correlations, Ann. Math. Statist. 39 (1968), 1425–1434. 178
[Sil85] Jack W. Silverstein, The smallest eigenvalue of a large-dimensional Wishart matrix,
Ann. Probab. 13 (1985), no. 4, 1364–1368. 179, 180
[Sim76] Barry Simon, Quantum dynamics: from automorphism to Hamiltonian, Studies in
Mathematical Physics. Essays in Honor of Valentine Bargmann (1976), 327–349. 63
[Sin64] Richard Sinkhorn, A relationship between arbitrary positive matrices and doubly
stochastic matrices, Ann. Math. Statist. 35 (1964), no. 2, 876–879. 64
[Sko16] Łukasz Skowronek, There is no direct generalization of positive partial transpose
criterion to the three-by-three case, arXiv preprint 1605.05254 (2016). 261
[Sla12] Paul B Slater, Two-qubit separability probabilities: A concise formula, arXiv
preprint 1209.1613 (2012). 260
[Sle62] David Slepian, The one-sided barrier problem for Gaussian noise, Bell System Tech.
J. 41 (1962), 463–501. 178
[Slo16] William Slofstra, Tsirelson’s problem and an embedding theorem for groups arising
from non-local games, arXiv preprint arXiv:1606.03140 (2016). 296
[Som09] Hans-Jürgen Sommers, Mini-Workshop: Geometry of Quantum Entanglement,
Oberwolfach Rep. 6 (2009), no. 4, 2993–3031, Abstracts from the mini-workshop
held December 6–12, 2009, Organized by Andreas Buchleitner, Stanisław Szarek,
Elisabeth Werner and Karol Życzkowski, Oberwolfach Reports. Vol. 6, no. 4. 260
[Spi93] Jonathan E. Spingarn, An inequality for sections and projections of a convex set,
Proc. Amer. Math. Soc. 118 (1993), no. 4, 1219–1224. 105
[SR95] Jorge Sánchez-Ruiz, Simple proof of Page’s conjecture on the average entropy of a
subsystem, Physical Review E 52 (1995), no. 5, 5653. 232
[SS98] P. W. Shor and N. J. A. Sloane, A family of optimal packings in Grassmannian
manifolds, J. Algebraic Combin. 7 (1998), no. 2, 157–163. 143
[SS05] Elias M. Stein and Rami Shakarchi, Real analysis, Princeton Lectures in Analysis,
III, Princeton University Press, Princeton, NJ, 2005, Measure theory, integration,
and Hilbert spaces. 342, 364
[ST80] Stanislaw J. Szarek and Nicole Tomczak-Jaegermann, On nearly Euclidean decomposition
for some classes of Banach spaces., Compos. Math. 40 (1980), 367–385
(English). 209
[Sti55] W. Forrest Stinespring, Positive functions on $C^*$-algebras, Proc. Amer. Math. Soc.
6 (1955), 211–216. 64
[Stø63] Erling Størmer, Positive linear maps of operator algebras, Acta Math. 110 (1963),
233–278. 63, 64
[Stø13] , Positive linear maps of operator algebras, Springer Monographs in Mathematics,
Springer, Heidelberg, 2013. 65
[Stø16] , Positive maps which map the set of rank k projections onto itself, Positivity
(2016), 1–3. 63
[Sud71] V. N. Sudakov, Gaussian random processes, and measures of solid angles in Hilbert
space, Dokl. Akad. Nauk SSSR 197 (1971), 43–45. 178
[SV96] Stanislaw J. Szarek and Dan Voiculescu, Volumes of restricted Minkowski sums and
the free analogue of the entropy power inequality, Comm. Math. Phys. 178 (1996),
no. 3, 563–570. 104

[SV00] S. J. Szarek and D. Voiculescu, Shannon’s entropy power inequality via restricted
Minkowski sums, Geometric aspects of functional analysis, Lecture Notes in Math.,
vol. 1745, Springer, Berlin, 2000, pp. 257–262. 104
[Sve81] George Svetlichny, On the foundations of experimental statistical sciences,
Foundations of Physics 11 (1981), no. 9-10, 741–782 (English). 103

[SW] Stanisław Szarek and Paweł Wolff, Radii of Euclidean sections of $L_p$-balls, in
preparation. 197
[SW83] Rolf Schneider and Wolfgang Weil, Zonoids and related topics, Convexity and its
applications, Birkhäuser, Basel, 1983, pp. 296–317. 103
[SW99] Stanisław J. Szarek and Elisabeth Werner, A nonsymmetric correlation inequality
for Gaussian measure, J. Multivariate Anal. 68 (1999), no. 2, 193–211. 309
[Swe] Michael Swearingin, Ph.D. thesis, Case Western Reserve University, in preparation.
110

[SWŻ08] Stanisław J. Szarek, Elisabeth Werner, and Karol Życzkowski, Geometry of sets of
quantum maps: a generic positive map acting on a high-dimensional system is not
completely positive, J. Math. Phys. 49 (2008), no. 3, 032113, 21. 106, 260, 261
[SWŻ11] Stanisław J. Szarek, Elisabeth Werner, and Karol Życzkowski, How often is a random
quantum state k-entangled?, J. Phys. A 44 (2011), no. 4, 045303, 15. 260, 261
[Sza] Stanislaw Szarek, Coarse approximation of convex bodies by polytopes and the
complexity of Banach–Mazur compacta, in preparation. 143
[Sza74] Andrzej Szankowski, On Dvoretzky’s theorem on almost spherical sections of convex
bodies., Isr. J. Math. 17 (1974), 325–338 (English). 208
[Sza76] S. J. Szarek, On the best constants in the Khinchin inequality, Studia Math. 58
(1976), no. 2, 197–208. 147, 282
[Sza78] Stanislaw Jerzy Szarek, On Kashin’s almost Euclidean orthogonal decomposition of
$\ell^1_n$., Bull. Acad. Pol. Sci., Sér. Sci. Math. Astron. Phys. 26 (1978), 691–694 (English).
209
[Sza82] Stanisław J. Szarek, Nets of Grassmann manifold and orthogonal group, Proceedings
of research workshop on Banach space theory (Iowa City, Iowa, 1981), Univ. Iowa,
Iowa City, IA, 1982, pp. 169–185. 143
[Sza83] Stanisław J. Szarek, The finite-dimensional basis problem with an appendix on nets
of Grassmann manifolds, Acta Math. 151 (1983), no. 3-4, 153–179. 143
[Sza90] Stanisław J. Szarek, Spaces with large distance to $\ell_\infty^n$ and random matrices,
Amer. J. Math. 112 (1990), no. 6, 899–942. 103
[Sza98] Stanisław J. Szarek, Metric entropy of homogeneous spaces, Quantum probability (Gdańsk,
1997), Banach Center Publ., vol. 43, Polish Acad. Sci., Warsaw, 1998, pp. 395–410.
143, 319
[Sza05] Stanislaw J. Szarek, Volume of separable states is super-doubly-exponentially small
in the number of qubits, Phys. Rev. A (3) 72 (2005), no. 3, part A, 032304, 10. 104,
165, 260, 343
[Sza10] Stanislaw J Szarek, On norms of completely positive maps, Topics in Operator The-
ory, Springer, 2010, pp. 535–538. 232
[Tak08] Leon A. Takhtajan, Quantum mechanics for mathematicians, Graduate Studies in
Mathematics, vol. 95, American Mathematical Society, Providence, RI, 2008. 75
[Tal87] Michel Talagrand, Regularity of Gaussian processes, Acta Math. 159 (1987), no. 1-2,
99–149. 179
[Tal88] Michel Talagrand, An isoperimetric theorem on the cube and the Kintchine-Kahane
inequalities, Proc. Amer. Math. Soc. 104 (1988), no. 3, 905–909. 146
[Tal90] Michel Talagrand, Embedding subspaces of $L_1$ into $\ell^N_1$., Proc. Am. Math. Soc. 108
(1990), no. 2, 363–369 (English). 209
[Tal95] Michel Talagrand, Concentration of measure and isoperimetric inequalities in product
spaces, Inst. Hautes Études Sci. Publ. Math. (1995), no. 81, 73–205. 146
[Tal96a] Michel Talagrand, New concentration inequalities in product spaces, Invent. Math. 126
(1996), no. 3, 505–563. 146
[Tal96b] Michel Talagrand, A new look at independence, Ann. Probab. 24 (1996), no. 1, 1–34. 146
[Tal01] Michel Talagrand, Majorizing measures without measures, Ann. Probab. 29 (2001), no. 1,
411–417. 179
[Tal05] Michel Talagrand, The generic chaining, Springer Monographs in Mathematics,
Springer-Verlag, Berlin, 2005, Upper and lower bounds of stochastic processes. 179
[Tal11] Michel Talagrand, Mean field models for spin glasses. Volume I, Ergebnisse der Mathematik
und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results
in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in
Mathematics], vol. 54, Springer-Verlag, Berlin, 2011, Basic examples. 178
[Tal14] Michel Talagrand, Upper and lower bounds for stochastic processes, Ergebnisse der
Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics
[Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys
in Mathematics], vol. 60, Springer, Heidelberg, 2014, Modern methods and classical
problems. 179
[Tao12] Terence Tao, Topics in random matrix theory, Graduate Studies in Mathematics,
vol. 132, American Mathematical Society, Providence, RI, 2012. 179

[TH00] Barbara M. Terhal and Paweł Horodecki, Schmidt number for density matrices,
Phys. Rev. A 61 (2000), 040301. 63
[Tik14] Konstantin E. Tikhomirov, The randomized Dvoretzky’s theorem in $\ell_\infty^n$ and the
χ-distribution, Geometric aspects of functional analysis, Lecture Notes in Math., vol.
2116, Springer, Cham, 2014, pp. 455–463. 208

[TJ89] Nicole Tomczak-Jaegermann, Banach-Mazur distances and finite-dimensional operator
ideals, Pitman Monographs and Surveys in Pure and Applied Mathematics,
vol. 38, Longman Scientific & Technical, Harlow; copublished in the United States
with John Wiley & Sons, Inc., New York, 1989. 104, 207
[TK04] Gr. Tsagas and K. Kalogeridis, The spectrum of the Laplace operator for the manifold
$SO(2p+2q+1)/SO(2p) \times SO(2q+1)$, Conference “Applied Differential Geometry:
General Relativity”—Workshop “Global Analysis, Differential Geometry, Lie Algebras”,
BSG Proc., vol. 10, Geom. Balkan Press, Bucharest, 2004, pp. 188–196. 145
[Tom85] Jun Tomiyama, On the geometry of positive maps in matrix algebras. II, Linear
Algebra Appl. 69 (1985), 169–177. 64
[Tro12] Joel A Tropp, User-friendly tail bounds for sums of random matrices, Foundations
of Computational Mathematics 12 (2012), no. 4, 389–434. 146, 255

[Tsi85] B. S. Tsirelson, Quantum analogues of Bell’s inequalities. The case of two spatially
divided domains, Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI)
142 (1985), 174–194, 200, Problems of the theory of probability distributions, IX.
295, 296
[Tsi93] B. S. Tsirelson, Some results and problems on quantum Bell-type inequalities, Hadronic J.
Suppl. 8 (1993), no. 4, 329–345. 295
[Tsu81] Chiaki Tsukamoto, Spectra of Laplace-Beltrami operators on $SO(n+2)/SO(2) \times SO(n)$
and $Sp(n+1)/Sp(1) \times Sp(n)$, Osaka J. Math. 18 (1981), no. 2, 407–426. 145
[TVZ82] M. A. Tsfasman, S. G. Vlăduţ, and Th. Zink, Modular curves, Shimura curves,
and Goppa codes, better than Varshamov-Gilbert bound, Math. Nachr. 109 (1982),
21–28. 142
[TW94] Craig A. Tracy and Harold Widom, Level-spacing distributions and the Airy kernel,
Comm. Math. Phys. 159 (1994), no. 1, 151–174. 179

[TW96] Craig A. Tracy and Harold Widom, On orthogonal and symplectic matrix ensembles,
Comm. Math. Phys. 177 (1996), no. 3, 727–754. 179
[Vaa79] Jeffrey D. Vaaler, A geometric inequality with applications to linear forms, Pacific
J. Math. 83 (1979), no. 2, 543–553. 106

[VADM01] Frank Verstraete, Koenraad Audenaert, and Bart De Moor, Maximally entangled
mixed states of two qubits, Phys. Rev. A 64 (2001), 012316. 64
[VB14] Tamás Vértesi and Nicolas Brunner, Disproving the Peres conjecture by showing
Bell nonlocality from bound entanglement, Nat. Commun. 5 (2014), Article. 297
[VDD01] Frank Verstraete, Jeroen Dehaene, and Bart De Moor, Local filtering operations on
two qubits, Phys. Rev. A 64 (2001), 010101. 65
[Vem04] Santosh S. Vempala, The random projection method, DIMACS Series in Discrete
Mathematics and Theoretical Computer Science, 65, American Mathematical Society,
Providence, RI, 2004, With a foreword by Christos H. Papadimitriou. 124
[Ver] Roman Vershynin, High-Dimensional Probability. An Introduction with Applications
in Data Science, Cambridge University Press, in preparation. 143, 207
[Ver12] Roman Vershynin, Introduction to the non-asymptotic analysis of random matrices,
Compressed sensing, Cambridge Univ. Press, Cambridge, 2012, pp. 210–268. 146
[Vil09] Cédric Villani, Optimal transport, Grundlehren der Mathematischen Wissenschaften
[Fundamental Principles of Mathematical Sciences], vol. 338, Springer-Verlag, Berlin,
2009, Old and new. 179


[Voi85] Dan Voiculescu, Symmetries of some reduced free product $C^*$-algebras, Operator
algebras and their connections with topology and ergodic theory (Buşteni, 1983),
Lecture Notes in Math., vol. 1132, Springer, Berlin, 1985, pp. 556–588. 180
[Voi90] Dan Voiculescu, Circular and semicircular systems and free product factors, Operator
algebras, unitary representations, enveloping algebras, and invariant theory (Paris,
1989), Progr. Math., vol. 92, Birkhäuser Boston, Boston, MA, 1990, pp. 45–60. 180
[Voi91] Dan Voiculescu, Limit laws for random matrices and free products, Invent. Math. 104
(1991), no. 1, 201–220. 145, 180
[von27] John von Neumann, Thermodynamik quantenmechanischer Gesamtheiten., Nachr.
Ges. Wiss. Göttingen, Math.-Phys. Kl. 1927 (1927), 276–291 (German). 29
[von32] J. von Neumann, Mathematische Grundlagen der Quantenmechanik., 262 S. Berlin,
J. Springer. (Die Grundlehren der Mathematischen Wissenschaften in Einzeldarstellungen,
Bd. XXXVIII), 1932. 29


[VT99] Guifre Vidal and Rolf Tarrach, Robustness of entanglement, Physical Review A 59
(1999), no. 1, 141. 260
[VW01] K. G. H. Vollbrecht and R. F. Werner, Entanglement measures under symmetry,
Phys. Rev. A 64 (2001), 062307. 63
[Wal02] Nolan R. Wallach, An unentangled Gleason’s theorem, Quantum computation and
information (Washington, DC, 2000), Contemp. Math., vol. 305, Amer. Math. Soc.,
Providence, RI, 2002, pp. 291–298. 231
[Wat] John Watrous, Theory of quantum information, book in preparation, see
https://cs.uwaterloo.ca/~watrous/TQI/. 63, 64, 260, 306
[Wat05] John Watrous, Notes on super-operator norms induced by Schatten norms, Quantum
Information & Computation 5 (2005), no. 1, 58–68. 218, 232
[Wer89] Reinhard F. Werner, Quantum states with Einstein-Podolsky-Rosen correlations
admitting a hidden-variable model, Phys. Rev. A 40 (1989), 4277–4281. 63, 281
[WG03] Tzu-Chieh Wei and Paul M. Goldbart, Geometric measure of entanglement and
applications to bipartite and multipartite quantum states, Phys. Rev. A 68 (2003),
042307. 233
[WH02] R. F. Werner and A. S. Holevo, Counterexample to an additivity conjecture for
output purity of quantum channels, J. Math. Phys. 43 (2002), no. 9, 4353–4357,
Quantum information theory. 232
[Wig55] Eugene P. Wigner, Characteristic vectors of bordered matrices with infinite dimen-
sions, Ann. of Math. (2) 62 (1955), 548–564. 179

[Wig58] Eugene P. Wigner, On the distribution of the roots of certain symmetric matrices, Ann. of
Math. (2) 67 (1958), 325–327. 179
[Wig59] Eugene P. Wigner, Group theory: And its application to the quantum mechanics of atomic
spectra, Expanded and improved ed. Translated from the German by J. J. Griffin.
Pure and Applied Physics. Vol. 5, Academic Press, New York-London, 1959. 63
[Wil17] Mark M. Wilde, Quantum information theory, second ed., Cambridge University
Press, Cambridge, 2017. 29, 63, 232
[Win16] Andreas Winter, Tight uniform continuity bounds for quantum entropies: Conditional
entropy, relative entropy distance and energy constraints, Communications in
Mathematical Physics (2016), 1–23. 232
[Wor76] S.L. Woronowicz, Positive maps of low dimensional matrix algebras, Reports on
Mathematical Physics 10 (1976), no. 2, 165 – 183. 63
[WS08] Jonathan Walgate and Andrew James Scott, Generic local distinguishability and
completely entangled subspaces, Journal of Physics A: Mathematical and Theoretical
41 (2008), no. 37, 375305. 231
[WW00] R. F. Werner and M. M. Wolf, Bell’s inequalities for states with positive partial
transpose, Phys. Rev. A 61 (2000), 062102. 297

[WW01a] R. F. Werner and M. M. Wolf, All-multipartite Bell-correlation inequalities for two
dichotomic observables per site, Phys. Rev. A 64 (2001), 032112. 295, 297
[WW01b] Reinhard F. Werner and Michael M. Wolf, Bell inequalities and entanglement,
Quantum Inf. Comput. 1 (2001), no. 3, 1–25. 295, 297


[You14] Pierre Youssef, Restricted invertibility and the Banach-Mazur distance to the cube,
Mathematika 60 (2014), no. 1, 201–218. 208
[ŻHSL98] Karol Życzkowski, Paweł Horodecki, Anna Sanpera, and Maciej Lewenstein, Volume
of the set of separable states, Physical Review A 58 (1998), no. 2, 883. 260
[Zie00] Günter M. Ziegler, Lectures on 0/1-polytopes, Polytopes—combinatorics and computation
(Oberwolfach, 1997), DMV Sem., vol. 29, Birkhäuser, Basel, 2000, pp. 1–41.
281
[ŻS01] Karol Życzkowski and Hans-Jürgen Sommers, Induced measures in the space of
mixed quantum states, J. Phys. A 34 (2001), no. 35, 7111–7125, Quantum information
and computation. 180
[ŻS03] Karol Życzkowski and Hans-Jürgen Sommers, Hilbert-Schmidt volume of the set of mixed
quantum states, J. Phys. A 36 (2003), no. 39, 10115–10130. 260

Websites
[@1] http://www2.stetson.edu/~efriedma/packing.html 108
[@2] http://mathworld.wolfram.com/GumbelDistribution.html 178
[@3] http://www.encyclopediaofmath.org/index.php?title=Banach-Mazur_compactum&oldid=22053
(an article originated by A. A. Giannopoulos) 103, 208
[@4] http://qig.itp.uni-hannover.de/qiproblems/1 295
[@5] http://qig.itp.uni-hannover.de/qiproblems/2 306
APPENDIX F

Notation

We list below mathematical symbols that appear in the book, particularly
those that are subfield-specific or not generally accepted throughout mathematics,
or just potentially ambiguous. We grouped them by theme/subfield; since any such
classification is necessarily imperfect, it may sometimes be necessary to check more
than one category. Within each category, we tried, to the extent possible,
to arrange the symbols in alphabetical order. The numbers following each brief
description refer to the pages on which the corresponding symbol is defined, or at
least appears in context.

General notation
$\langle x|$, $|x\rangle$    Dirac bra-ket notation, 4
$\langle x|y\rangle$    scalar product, alternative notation to $\langle x, y\rangle$, 5
$|x\rangle\langle y|$    ket-bra, the rank one operator mapping $z$ to $\langle y, z\rangle \cdot x$, 5
$|\cdot|$    Euclidean or Hilbertian norm, or modulus of a scalar, 3
$|\alpha|$    weight of a multi-index $\alpha \in \mathbb{N}^n$, 135
$\mathbf{1}_A$    indicator function of a set $A$, 101
$\lesssim$, $\gtrsim$, $\simeq$    Landau notation (alternative form), 3
card $A$    cardinality of a set $A$, 111
log    natural logarithm, 27
$O(\cdot)$, $\Omega(\cdot)$, $\Theta(\cdot)$    Landau notation, 3
$o(\cdot)$, $\sim$, $\ll$    asymptotic notation, 3
$S_m$    group of permutations of $\{1, 2, \dots, m\}$, 27
vol, $\mathrm{vol}_n$, $\mathrm{vol}_E$    Lebesgue measure on $\mathbb{R}^n$, on the subspace $E$, 4

Convex geometry
$\|\cdot\|_K$    gauge of a convex body $K$, 11
$\|\cdot\|_p$    $p$-norm on $\mathbb{R}^n$, 12
$K^\circ$    polar of a set $K \subset \mathbb{R}^n$, 15
$C^*$    cone dual to a cone $C$, 19
$C^b$    base of a cone $C$, 19
$K_\cap$    intersection symmetrization of a convex body $K$, 80
$K_\cup$    union symmetrization of a convex body $K$, 80
K    cylindrical symmetrization of a convex body $K$, 81
$\langle\cdot,\cdot\rangle_E$    scalar product associated to an ellipsoid $E$, 18
$B_p^n$    unit ball of $\ell_p^n$, 12
$B_X$    unit ball of a normed space $X$, 11
conv $A$    convex hull of a set $A$, 12
$\Delta_n$    $n$-dimensional simplex, 12
$h_K(\cdot)$    support function of a convex body $K$, 94
inrad($K$)    inradius of a convex body $K$, 96
Iso($K$)    group of isometries preserving $K$, 89
John($K$)    John ellipsoid of a convex body $K$, 84
$\ell_p^n$    space $\mathbb{R}^n$ equipped with the $p$-norm, 12
$L_n$    Lorentz cone in $\mathbb{R}^n$, 19
Löw($K$)    Löwner ellipsoid of a convex body $K$, 84
$\nu_K$    map which implements duality of faces, 17
outrad($K$)    outradius of a convex body $K$, 96
$S^{n-1}$, $S_{\mathbb{C}^n}$, $S_H$    unit sphere in $\mathbb{R}^n$, in $\mathbb{C}^n$, in Hilbert space $H$, 4, 311
vrad($K$)    volume radius of a convex body $K$, 92
$w(K, \cdot)$    support function of a convex body $K$, 94
$w(K)$    mean width of a convex body $K$, 95
$w_G(K)$    Gaussian mean width of a convex body $K$, 95

Linear algebra
$x^\downarrow$    non-increasing rearrangement of a vector $x \in \mathbb{R}^n$, 22
$\prec$    majorization, 22
$\prec_w$    submajorization, 23
$A^\dagger$    adjoint of a matrix (or operator) $A$, 4
$|A|$    absolute value of $A$ (equals $(A^\dagger A)^{1/2}$), 23
$\|\cdot\|_p$    Schatten $p$-norm on matrices, 23
$\|\cdot\|_{HS}$    Hilbert–Schmidt norm (equals $\|\cdot\|_2$), 23
$\|\cdot\|_{op}$    operator norm (equals $\|\cdot\|_\infty$), 23
$[\psi]$    equivalence class of a unit vector $\psi$ in the projective space, 312
$B(H)$    bounded linear operators on a Hilbert space $H$, 4
$B^{sa}(H)$    bounded linear self-adjoint operators on a Hilbert space $H$, 4
$d(A)$    vector formed by diagonal entries of a matrix $A$, 24
diag $A$    matrix obtained from $A$ by setting non-diagonal entries to zero, 25
Diag($v$), $D_v$    for a vector $v = (v_i)$, the diagonal matrix whose $ii$-th entry is $v_i$, 265, 333
$E_{ij}$    operator $|e_i\rangle\langle f_j|$, where $(e_i)$, $(f_j)$ are specified bases, 47
Gr($k, \mathbb{R}^n$), Gr($k, \mathbb{C}^n$)    Grassmann manifolds, 314
$\overline{H}$    conjugate of a Hilbert space $H$, 4
$H_1$    often (but not always) the hyperplane of trace one matrices in $M_d^{sa}$, 31
I    identity matrix or identity operator, 8
Id    identity superoperator, 8
$J$    diagonal matrix with diagonal entries $(1, -1, \dots, -1)$, 318
$\lambda_j(A)$    eigenvalues of a matrix $A$, usually arranged in the nonincreasing order if $A$ is Hermitian, 160
$\lambda_j(\psi)$    Schmidt coefficients of a vector $\psi \in H_1 \otimes H_2$, 36
$M_{m,n}$    space of $m \times n$ (real or complex) matrices, 7
$M_n$    equals $M_{n,n}$, 7
$M_n^{sa}$    space of self-adjoint matrices (subspace of $M_n$), 7
$M_n^{sa,0}$    subspace of $M_n^{sa}$ consisting of trace zero matrices, 265
$O(n)$    orthogonal group, 312
$O(1, n-1)$    Lorentz group, 318
$O^+(1, n-1)$    orthochronous subgroup of the Lorentz group, 318
$P(C)$    cone of linear maps preserving the cone $C$, or preserving the order induced by the cone $C$, 321
$P_E$    orthogonal projection onto the subspace $E$, 5
PSD    cone of positive-semidefinite matrices, 19
PSU($n$)    projective special unitary group, 312
$q(\cdot)$    quadratic form of the Minkowski spacetime, 32, 318
$\mathbb{R}^{n,0}$    the hyperplane of $\mathbb{R}^n$ consisting of vectors whose coordinates add up to 0, 263
$s_j(A)$    singular values (arranged in non-increasing order) of a matrix $A$, 24
$s(A)$    the vector $(s_j(A))$ of singular values of a matrix $A$, 24
$S_{HS}$    unit sphere for the Hilbert–Schmidt norm $\|\cdot\|_{HS}$, 225
$S_p^{m,n}$    unit ball for $\|\cdot\|_p$ in $M_{m,n}$, 25
$S_p^{m,sa}$    unit ball for $\|\cdot\|_p$ in $M_n^{sa}$, 25
SVD    singular value decomposition, 36
$SO(n)$    special orthogonal group, 312
$SO(1, n-1)$    proper Lorentz group, 318
$SO^+(1, n-1)$    restricted Lorentz group, 318
spec($A$)    spectrum (arranged in non-increasing order) of a self-adjoint matrix $A$, 24
$SU(n)$    special unitary group, 312
$T$    transposition with respect to a specified basis, 41
$U(n)$    unitary group, 312

Probability
$\|\cdot\|_{\psi_1}$    subexponential norm, 139
$\|\cdot\|_{\psi_2}$    subgaussian norm, 139
$\boxplus$    free additive convolution, 177
$d_\infty$    $\infty$-Wasserstein distance, 161
$\mathbf{E} f$    expected value of the random variable $f$, also referred to as the mean, the expectation, or the first moment, 117
$\mathrm{Ent}_\mu(f)$    continuous entropy of $f$ (with respect to $\mu$), 132
$F_X(\cdot)$    cumulative distribution function of a random variable $X$, 161
$\Phi(\cdot)$    cumulative distribution function of an $N(0, 1)$ variable, 307
$G$    a standard Gaussian vector, 308
$\gamma_n$, $\gamma_n^{\mathbb{C}}$    standard Gaussian measure on $\mathbb{R}^n$, $\mathbb{C}^n$, 308
GUE($n$)    Gaussian Unitary Ensemble, 162
$\mathrm{GUE}_0(n)$    Gaussian Unitary Ensemble conditioned to have trace 0, 163
GOE    Gaussian Orthogonal Ensemble, 163
$H(p)$    Shannon entropy of a probability mass function $p$, 28
$\chi(n)$, $\chi^2(n)$    chi, chi-squared distribution with $n$ degrees of freedom, 175
i.i.d.    independent, identically distributed, 160
$\kappa_n$    expected Euclidean norm of a standard Gaussian vector in $\mathbb{R}^n$, 309
$\kappa_n^{\mathbb{C}}$    expected Euclidean norm of a standard Gaussian vector in $\mathbb{C}^n$, 309
$M_f$    median of the random variable $f$, 117
$\mu_{sp}(A)$    empirical spectral distribution of a self-adjoint matrix $A$, 160
$\mu_{MP(\lambda)}$    Marčenko–Pastur distribution with parameter $\lambda$, 167
$\mu_{SC}$    semicircular distribution, 163
osc($f, A, \mu$)    oscillation of $f$ around $\mu$ on the subset $A$, 186
$(P_t)$    Ornstein–Uhlenbeck semigroup, 135
Wishart($n, s$)    Wishart distribution with parameters $n$, $s$, 166

Geometry and asymptotic geometric analysis
$\otimes_2$    Euclidean/Hilbertian tensor product, 18
$\hat{\otimes}$    projective tensor product, 82
$\check{\otimes}$    injective tensor product, 83
$A_\varepsilon$    $\varepsilon$-enlargement of a set $A$, 117
$|||\cdot|||_K$    norm on $H_{k,n}$ associated to $K$, 183
$a(K)$    asphericity of a convex body $K$, 193
$c(X)$    minimum of Ricci curvatures of the manifold $X$, 129
$C(x, \theta)$    spherical cap of angle $\theta$ with center at $x$, 109
$d(X, Y)$    Banach–Mazur distance between normed spaces $X$ and $Y$, 103
$d_{BM}(K, L)$    Banach–Mazur distance between convex bodies $K$ and $L$, 79
$d_g(K, L)$    geometric distance between convex bodies $K$ and $L$, 79
diam    diameter of a set in a metric space, 116
$\dim_F(K)$    facial dimension of a convex body $K$, 193
$\dim_V(K)$    verticial dimension of a convex body $K$, 193
$g$    geodesic distance on the sphere, 311
$H_k$    the space $L_2(\mathbb{R}^k, \gamma_k)$, 182
$H_{k,n}$    $H_k \otimes \mathbb{R}^n$, or $\mathbb{R}^n$-valued functions on $(\mathbb{R}^k, \gamma_k)$, 182
$K_G$    (real) Grothendieck constant, 280
$K_G^{\mathbb{C}}$    complex Grothendieck constant, 295
$K_G^{(n)}$, $K_G^{(m,n)}$, $K_G^{[n]}$    other variants of Grothendieck constant, 281, 295
$k_*(K)$    Dvoretzky dimension of a convex body $K$, 190
$K(K)$    $K$-convexity constant of a convex body $K$, 183
$\ell_K$    $\ell$-norm associated to a convex body $K$, 181
$N(K, \varepsilon)$, $N(K, d, \varepsilon)$    covering number (metric space), 107
$N(K, L)$, $N(K, L, \varepsilon)$    covering number (convex bodies), 114
LS($X, \mu$)    logarithmic Sobolev constant of the space $X$, 132
LSI    logarithmic Sobolev inequality, 131
$P(K, \varepsilon)$, $P(K, d, \varepsilon)$    packing number, 107
$P(V)$    projective space associated to a vector space $V$, 312
$P(X, \mu)$    Poincaré constant of the space $X$, 133
$R$, $R_1$, $\tilde{R}_1$    Rademacher projection, 183, 185
$\mathrm{Ric}_p$    Ricci curvature at point $p$, 129
sec    sectional curvature, 130
$\sigma$    normalized Lebesgue measure on a Euclidean sphere, 4, 311
$V(\theta)$    measure of the spherical cap of angle $\theta$, 109
vr($K$)    volume ratio of a convex body $K$, 201

Quantum information theory


$\bullet$    homotheties with respect to the maximally mixed state, 236
$\rho^\Gamma$    partial transposition of $\rho$, 42
$\|\cdot\|_\diamond$    diamond norm, 51
$\|\cdot\|_{1 \to p}$    $1 \to p$ norm of a quantum channel, 217
$\|\cdot\|_M$    distinguishability norm associated to the POVM $M$, 299
$\rho \rightsquigarrow \sigma$    state $\sigma$ can be obtained from copies of $\rho$ by an LOCC protocol, 301
$\mathrm{Asym}_d$    antisymmetric subspace of $\mathbb{C}^d \otimes \mathbb{C}^d$, 39
BP    cone of block-positive operators, 56
$C(\Phi)$    Choi matrix of a superoperator $\Phi$, 48
co-PSD    cone of co-positive semidefinite operators, 56
CP    cone of completely positive superoperators, 49
$D(H)$    set of states on a Hilbert space $H$, 9, 31
DEC    cone of decomposable superoperators, 57
$E(\psi)$    entropy of entanglement of a pure state $\psi$, 215
$E_p(\psi)$    $p$-entropy of entanglement of a pure state $\psi$, 215, 229
$E_F(\rho)$    entanglement of formation of a state $\rho$, 272
EB    cone of entanglement-breaking superoperators, 57
$k$-Ext    set of $k$-extendible states, 41
$F$    flip operator, 39
$g(\psi)$    geometric measure of entanglement of a pure state $\psi$, 229
$g_{\min}(H)$    extremal geometric measure of entanglement for $H$, 229
$\Gamma$    partial transposition, 42
LB    set of local boxes, 286
LC    set of local correlations, 277
NSB    set of non-signaling boxes, 286
NSC    set of non-signaling correlations, 288
P    cone of positivity-preserving superoperators, 56
$p_L$, $p_{NL}$    local, non-local fractions, 292
$\varphi^+$, $\varphi^-$, $\psi^+$, $\psi^-$    Bell vectors, 70
$\Phi_V$    completely positive map $X \mapsto V X V^\dagger$, 58
$\pi_a$    antisymmetric state, 40
$\pi_s$    symmetric state, 40
PPT    set of states with positive partial transpose, 43
PPT    cone of PPT operators, 55
PPT    cone of PPT-inducing superoperators, 57
QB    set of quantum boxes, 286
QC    set of quantum correlations, 277
$R(\rho)$    robustness of a state $\rho$, 247
$\rho_*$    maximally mixed state, 32
$s_0(d)$    threshold for separability, 269
$S(\rho)$    von Neumann entropy of a state $\rho$, 27
$S_p(\rho)$    $p$-Rényi entropy of a state $\rho$, 28
$S^{\min}(\Phi)$, $S_p^{\min}(\Phi)$    minimum output ($p$-)entropy of a quantum channel $\Phi$, 216
Seg    Segré variety, 312
Sep($H$)    set of separable states on a multipartite Hilbert space $H$, 37
SEP    cone of separable operators, 38
$\sigma_x$, $\sigma_y$, $\sigma_z$    Pauli matrices, 32
$\mathrm{Sym}_d$    symmetric subspace of $\mathbb{C}^d \otimes \mathbb{C}^d$, 39
$\mathrm{Tr}_1\,\rho$, $\mathrm{Tr}_A\,\rho$    partial trace of $\rho$ with respect to subsystem 1 or $A$, 35
$w_\lambda$    Werner state with parameter $\lambda$, 40
$\omega_L(V)$    local value of a Bell expression $V$, 289
$\omega_{NS}(V)$    non-signaling value of a Bell expression $V$, 289
$\omega_Q(V)$    quantum value of a Bell expression $V$, 289

Index

In addition to pointing to definitions of concepts that appear
throughout this book, the index is designed to direct the reader
to fundamental or major results about such concepts and to other
facts which have, in the authors’ opinion, a reference value. This
includes sharp versions of well-known inequalities, proofs of standard
results that are new or not widely known, or tables listing
values of various geometric parameters for classical objects. The
index is not meant to be an exhaustive catalogue of all occurrences
of a given notion or phrase in the book.

absolutely separable state, 38
additivity problem, 216, 228
adjoint
  map, 49
  of an operator, 4
almost randomizing channels, 220
anti-unitary, 34
antisymmetric
  subspace, 39
asphericity, 193
asymmetry of a convex set, 143
asymptotic freeness, 177

Banach–Mazur
  compactum, 79
  distance, 79, 103
Bell correlation inequality, 279
Bell inequality for boxes, 289
Bell polytope, 277
Bell states, 39, 70, 302
  nonseparability, 44
Bell vectors, 70, 302
Bell violations, 280, 289
  arbitrarily large, 291
Bernstein’s inequalities, 140, 186
bipartite Hilbert space, 6
bipolar theorem, 15
bistochastic channel, 50
bistochastic matrix, 23
Blaschke–Santaló inequality, 98
blessing of dimensionality, 205
Bloch ball, 32
Bloch sphere, 32
block matrix, 8
block-positive matrix, 56
Bonami–Beckner inequality, 145
Boolean cube, 113
Borel selection theorem, 14
Born rule, 67
bound entanglement, 306
box, 285
bra-ket notation, 4
Brouwer’s fixed-point theorem, 60
Brunn–Minkowski inequality, 92
  restricted, 104
  reverse, 104, 209
Bures metric, 312
Busemann–Petty problem, 105

canonical, 4
Carathéodory’s theorem, 12
Catalan numbers, 163
central value, 124
chaining argument, 158
channel, 50
Chevet–Gordon inequalities, 173
chi distribution, 175
  mean, 309
  median, 124
Choi matrix, 48
Choi’s isomorphism, 48
Choi’s theorem, 49
CHSH game, 283
CHSH inequality, 280
circled
  body, 11
  function, 187
classical box, 286
classical correlation, 277
classical-quantum (c-q) channel, 53
Clifford algebras, 275
CNOT, 304
co-completely positive map, 57
co-positive semi-definite, 56
column vector, 5
completely depolarizing channel, 52
completely positive cone, 49, 56
  duality, 57
completely positive map, 49
  norm of, 217
completely randomizing channel, 52, 220
complexification, 6
computational basis, 5, 67
concentration of measure, 117
  on standard spaces, 118
  subgaussian, 117
cone, 18
  base duality, 20
  base of, 19
  dual, 19
  self-dual, 19
conjugate
  of a Hilbert space, 4
  of a matrix, 7
contact point, 87
contextuality, 297
contraction principle, 127, 134
convex body, 11
  C-Euclidean, 189
  polytopal approximation, 193, 194
convex hull, 12
convex roof, 272
Copenhagen interpretation, 67
correlation conjecture, 154
correlation polytope, 277
covering, 107
  density, 142
  number, 107, 114
creation operators, 177
curse of dimensionality, 205
cut polytope, 296

Davis convexity theorem, 24
decomposable map, 57
decomposable matrix, 56
density matrix, 9, 71
deterministic box, 286
deterministic strategy, 284, 286
diamond norm, 51
difference body, 81
direct sum of channels, 54
discrete cube, 113
distillability problem, 302
  and 2-positivity, 305
  and Werner states, 305
distillable state, 302
distinguishability, 299
$\ell_1$-distortion, 197
doubly stochastic channel, 50
dropping the complex structure, 7
Dudley’s inequality, 157
Dvoretzky dimension, 190
Dvoretzky’s theorem, 200
Dvoretzky–Milman theorem
  and the escape phenomenon, 189
  for $\ell_p^n$-spaces, 195
  for convex bodies, 191
  for Lipschitz functions, 187
  for projections, 192
  for Schatten spaces, 198
  isometric, 275
Dvoretzky–Rogers lemma, 200

Earth Mover’s distance, 161
Ehrhard symmetrization, 123
Ehrhard’s inequality, 122
ellipsoids, 18
  polars of, 18
  tensor product of, 18
empirical spectral distribution, 160
ε-enlargements, 117
enough symmetries, 89
entangled state, 37
k-entangled state, 41
entangled subspaces, 213
  extremely entangled, 224
  very entangled, 223
entanglement of formation, 272
entanglement witness, 60
entanglement-breaking channel, 53
entropy of entanglement, 215
p-entropy, 215
escape phenomenon, 176
explicit constructions, 205
exponential Markov inequality, 124
exposed face, 13
exposed point, 13
exposed ray, 22
k-extendible state, 41
extension of a map, 49
extreme point, 13
extreme ray, 22
extrinsic distance, 311

face, 13
facet, 13
facial dimension, 193
fidelity, 312
Figiel–Lindenstrauss–Milman inequality, 194
Finsler geometry, 319
flip operator, 39
Fock space, 176
fraction
  classical, 292
INDEX 407

local, 292 lemma, 124


nonlocal, 292 matrix inequality, 255
of determinism, 292 α-homogeneous functions, 309
free additive convolution, 177 Horodecki’s entanglement witness theorem,
free Poisson distribution, 167 61
free probability, 176 hypercontractivity, 135
Frobenius norm, 7 hyperplane conjecture, 101
Fubini–Study metric, 312
full cone, 21 injective tensor product, 83
inradius, 96

ion
gauge, 11 intrinsic distance, 311
Gaussian distribution, 307 irreducible, 89
tail estimates, 307 isoperimetric inequality

ut
Gaussian mean width, 95 Gaussian, 122
Gaussian processes, 149, 308 in Rn , 92

rib
and the mean width, 150 on the discrete cube, 137
comparison inequalities, 153 on the sphere, 119
stationary, 159 isotropic convex body, 101

ist
Gaussian Unitary Ensemble, 162 isotropic states, 39
generic chaining, 159

rd
geodesic distance, 311 Jamiołkowski isomorphism, 47
geodesically convex, 313 John ellipsoid, 84
geometric distance, 79 John position, 84
geometric measure of entanglement, 228
Ginibre formula, 163
fo
Johnson–Lindenstrauss lemma, 205
jointly Gaussian variables, 308
GOE, 163
ot
K-convexity constant, 183, 185, 207
large deviations, 165
bounds, 183, 184
Gordon’s lemma, 153
N

duality, 186
Grassmann manifold, 314
for B1n , the cube, 186
ε-nets, 116
Kadison’s theorem, 34
ly.

Gromov’s comparison theorem, 130


Kashin decomposition of `n1 , 202
Grothendieck constant, 280
complex, 295 Khatri–Šidák lemma, 153
on

other variants, 281, 295 Klein’s lemma, 26


Grothendieck’s inequality, 280 Knaster problem, 208
GUE, 162 Kneser–Poulsen conjecture, 178
se

convergence to semicircle law, 164 Kochen–Specker theorem, 297


eigenvalue distribution, 163 Komatu inequalities, 307
lu

large deviations, 164 Kraus decomposition, 49


norm, 165 Kraus rank, 49
Krein–Milman theorem, 13
na

small deviations, 165


GUE0 , 163 `-norm, 181
Gumbel distribution, 178
so

`-position, 181
Gurvits–Barnum theorem, 246 Löwner position, 84
r

Haar measure, 313 Lévy distance, 161


Pe

Hamming distance, 113 Laplace transform


Hanner’s inequalities, 14 bilateral, 132
Harper’s isoperimetric inequality, 137 method, 124
Hastings’s counterexample, 228 law of the iterated logarithm, 160
Heisenberg–Weyl operators, 221 Lévy’s lemma, 120
Helstrom bound, 300 for central values, 125
Herbst’s argument, 132 for the mean, 121
Hermitian matrix, 7 local version, 127
Herschel–Maxwell theorem, 309 linear programming bound, 112
hidden variable, 75, 286 L-Lipschitz function, 120
Hilbert–Schmidt norm/inner product, 7 extension, 227
Hoeffding’s local, 74
inequality, 129 box, 286

correlation, 277 of the projective space, 112


filtering, 301 of the sphere, 110, 111
polytope, 277 non-commutative Hölder inequality, 25
unitaries, 46 nondegenerate cone, 21
local strategy, 284 nonlocal
with shared randomness, 286 boxes, 286
LOCC channel, 54, 301 correlations, 277
log-concave measure, 93 fraction, 292
log-Sobolev nonsignaling
constants, 132, 134 box, polytope, correlation, 287

inequality, 132 principle, 287
tensorization property, 133 violations, 289, 292
Lorentz cone, 19

automorphisms, 323 operational, 29
Lorentz group, 318 Ornstein–Uhlenbeck semigroup, 134

proper, 318 orthochronous subgroup, 318
restricted, 318 orthogonal group, 312
Low $M^*$-estimate, 202 ε-nets, 116

Löwner ellipsoid, 84 geodesics, 313
$\ell_p$-norm, 12 oscillation, 186

$\ell_p$ product metric, 128 outer product, 5
outradius, 96
$M$-ellipsoid, 143 overlap, 37, 228, 229, 312
$M$-position, 143
magic square game, 293
packing, 107
density, 142
Mahler conjecture, 105
majorization, 22 number, 107
majorizing measure, 179 on the discrete cube, 113

Marčenko–Pastur distribution, 167 on the sphere, 112


maximally entangled, 39, 229 partial trace, 35, 70
partial transposition, 41

maximally mixed state, 20, 32


mean width, 95 Pauli matrices, 32
and Gaussian processes, 150 composition rules for, 33

of a union of sets, 121 Peres conjecture, 281, 291, 297


of classical bodies, 96 Peres–Horodecki criterion, 43
of QIT bodies, 235 permutationally symmetric

median, 117 basis, 90


of a $\chi^2(n)$ variable, 124 body, norm, space, 90

of a convex function, 126 phase of a vector in $\mathbb{C}^d$, 312


Mermin–Peres game, 293 Poincaré’s
constants, 133, 134

metric entropy, 107


of $\ell_p^n$-balls, 156, 157 inequality, 133
of classical manifolds, 116 lemma, 122

Milman–Pajor inequality, 98 pointed cone, 21


minimum output entropy, 216 polar

Minkowski compactum, 79 of a convex body, 15



Minkowski functional, 11 of a linear image, 15


Minkowski operations, 81 of a translate, 326
Minkowski–Hlawka theorem, 142 of sections, projections, 16
mixed state, 31, 69 of unions, intersections, 16
mixed-unitary channel, 52 polarity, 15
$MM^*$-estimate, 184, 207 in the complex setting, 27
multipartite Hilbert space, 6 polytope, 12
multiplicativity problem, 217, 218 positive cone, 56
duality, 57
ε-nets, 107 positive map, 49
of classical manifolds, 116 Sinkhorn’s normal form, 59
of product spaces, 114 n-positive map, 49
of the discrete cube, 113 positive orthant, 19

positive semi-definite cone, 19 resolution of identity, 87


automorphisms, 58 unbiased, 87
extreme rays, 22 Ricci curvature, 129
positivity-preserving map, 49 bounds, 131
POVM, 53, 74 robustness, 247
associated zonotope, 300 bipartite, 247
sparsification, 300 multipartite, 249
PPT cone, 55 Rogers–Shephard inequalities, 100
PPT criterion, 44 row vector, 5
PPT state, 43

PPT-inducing map, 54 S-lemma, 321
Prékopa–Leindler inequality, 101 and automorphisms of $L_n$, 323
precognition, 287 Santaló inequality, 98

principal angles, 314 reverse, 98
probabilistic method, 205 Santaló point, 326

projective measurement, 73 Schatten
projective space, 31, 68, 312 p-norms, 23
spaces, 24

nets, 112
volume of balls, 112 Schmidt coefficients, 36
projective tensor product, 82 and Courant–Fischer formulas, 37

projective unitary group, 312 Schmidt decomposition, 36
proper face, 13 Schmidt rank, 36
pseudotelepathy, 293, 297 Schur channel, 54
pure state, 31
separable, 37
sectional curvature, 130
Segré variety, 213, 312
purification, 71 self-adjoint
pushforward, 127 matrix, 7
operator, 4

q-c-q channel, 53 self-adjointness preserving map, 48


quantum box, 286 semicircle law, 163

quantum channel, 50, 72 semicircular distribution, 163


as a subspace, 216 separable cone, 38
quantum correlation, 277

duality, 57
quantum game, 284 separable map, 54
quantum map, 47 separable state, 37
quantum marginal, 70, 287 ε-separated set, 107

quantum operation, 47 set of PPT states, 43


quantum strategy, 285, 286

volume and mean width, 235, 245


quantum violations set of quantum states, 9, 31
for boxes, 289, 291 centroid, 35

for correlations, 280 facial structure, 33


quantum-classical (q-c) channel, 53 polytopal approximation, 253
quartercircular distribution, 169 symmetries, 34

qubit, 6 volume and mean width, 235, 236


quotient of a subspace theorem, 204 set of separable states, 37

$MM^*$-estimate for, 240



R-transform, 177
centroid, 46
random covering, 110
dimension, 37
random induced states, 170
extreme points, 37
convergence, 171
facial structure, 38
density, 172
inradius, 235
large deviations, 171
polytopal approximation, 253
random strategy, 284, 286
symmetries, 46
random subspace, 186
volume and mean width, 235, 244
randomness reduction, 206
Shannon entropy
realignment, 45
continuous, 131
regular simplex, 12
discrete, 28
Rényi entropies, 28
simplex, 12
monotonicity, 28

simplicial order, 136 uniform convexity, 14


singular value decomposition, 36 for Schatten p-norm, 29
singular values, 24, 36 unital map, 50
Slepian’s lemma, 153 unitarily invariant
Slepian–Gordon lemma, 153 function, 24
spherical cap, 109 norm, 27
volume, 109 random matrix, 265
Spingarn inequality, 99 unitary channel, 52
spinor map, 32, 319 unitary evolution, 71
standard Gaussian unitary group, 312

measure, 94, 308 ε-nets, 116
vector, 95, 308 geodesics, 313
star-shaped set, 114 universal entanglers, 215

state Urysohn’s inequality, 95
classical, 8 dual, 96

quantum, 8 reverse, 184
Steiner symmetrization, 93
verticial dimension, 193

Stiefel manifold, 317
Stinespring representation, 51, 72 volume of polytopes, 152
Stinespring theorem, 51 volume radius, 92

strictly convex set, 14 superadditivity, 93
Størmer’s theorem, 44 volume ratio, 201
proofs, 62 and Kashin decomposition, 202
subexponential variable ($\psi_1$), 139
subgaussian process, 157
von Neumann entropy, 27
Lipschitz constant, 222, 223
subgaussian variable ($\psi_2$), 139 $\infty$-Wasserstein distance, 161
submajorized, 23 wave function, 67

Sudakov minoration, 154 weak convergence


dual, 155
of measures, 179, 359
superoperator, 8, 47 of random variables, 162

support function, 94 Werner states, 40


supporting hyperplane, 13
separability, 45

symmetric basis, 90 Wigner’s semicircle law, 164


symmetric convex body, 11 Wigner’s theorem, 34
symmetric subspace, 39
Wishart matrix, 166

symmetrizations, 80 convergence to $\mathrm{MP}(\lambda)$, 167


Talagrand’s convex concentration convergence to semicircle law, 168

inequality, 137, 138 large deviations, 169


tensor product norm, 169, 174
partial transposition, 168

Hilbertian, 6
injective, 83 Woronowicz theorem, 44
projective, 82

XOR game, 296


threshold for entanglement, 269
threshold for PPT, 273 zonoids, 82

trace duality, 7 approximation by zonotopes, 205, 209



trace norm, 24 zonotopes, 82


trace-preserving map, 50 and POVMs, 300
Tracy–Widom
distribution, 179
effect, 165
transposition, 41
Tsirelson’s bound, 280
twirling channel, 40, 305

unconditional
basis, 90
body, norm, space, 90
direct sum, 146
