Abmb PDF
Guillaume Aubrun
Stanisław J. Szarek
2010 Mathematics Subject Classification. Primary 46Bxx, 52Axx, 81Pxx, 46B07,
46B09, 52C17, 60B20, 81P40
To Aurélie and Margaretmary
Contents
List of Tables xiii
List of Figures xv
Preface xvii
Part 1. Alice and Bob. Mathematical Aspects of Quantum
Information Theory 1
Chapter 0. Notation and Basic Concepts 3
0.1. Asymptotic and non-asymptotic notation 3
0.2. Euclidean and Hilbert spaces 3
0.3. Bra-ket notation 4
0.4. Tensor products 6
0.5. Complexification 6
0.6. Matrices vs. operators 7
0.7. Block matrices vs. operators on bipartite spaces 8
1.1.1. Gauges 11
1.1.2. First examples: $\ell_p$-balls, simplex, polytopes, and convex hulls 12
1.2. Cones 18
2.2.5. Entanglement hierarchies 41
2.2.6. Partial transposition 41
2.2.7. PPT states 43
2.2.8. Local unitaries and symmetries of Sep 46
2.3. Superoperators and quantum channels 47
2.3.1. The Choi and Jamiołkowski isomorphisms 47
2.3.2. Positive and completely positive maps 48
2.3.3. Quantum channels and Stinespring representation 50
2.3.4. Some examples of channels 52
2.4. Cones of QIT 55
2.4.1. Cones of operators 55
2.4.2. Cones of superoperators 56
2.4.3. Symmetries of the PSD cone 58
2.4.4. Entanglement witnesses 60
2.4.5. Proofs of Størmer’s theorem 61
Miscellany 77
4.4. Volume of central sections and the isotropic position 101
Notes and Remarks 103
Chapter 5. Metric Entropy and Concentration of Measure in Classical Spaces 107
5.1. Nets and packings 107
5.1.1. Definitions 107
5.1.2. Nets and packings on the Euclidean sphere 108
5.1.3. Nets and packings in the discrete cube 113
5.1.4. Metric entropy for convex bodies 114
5.1.5. Nets in Grassmann manifolds, orthogonal and unitary group 116
5.2. Concentration of measure 117
Chapter 8. Entanglement of Pure States in High Dimensions 213
8.1. Entangled subspaces: qualitative approach 213
8.2. Entropies of entanglement and additivity questions 215
8.2.1. Quantifying entanglement for pure states 215
8.2.2. Channels as subspaces 216
8.2.3. Minimal output entropy and additivity problems 216
8.2.4. On the $1 \to p$ norm of quantum channels 217
8.3. Concentration of $E_p$ for $p > 1$ and applications 218
8.3.1. Counterexamples to the multiplicativity problem 218
8.3.2. Almost randomizing channels 220
8.4. Concentration of von Neumann entropy and applications 222
8.4.1. The basic concentration argument 222
8.4.2. Entangled subspaces of small codimension 224
8.4.3. Extremely entangled subspaces 224
8.4.4. Counterexamples to the additivity problem 228
10.2.2. The threshold theorem 269
10.3. Other thresholds 272
10.3.1. Entanglement of formation 272
10.3.2. Threshold for PPT 273
Notes and Remarks 273
11.1. Isometrically Euclidean subspaces via Clifford algebras 275
11.2. Local vs. quantum correlations 276
11.2.1. Correlation matrices 276
11.2.2. Bell correlation inequalities and the Grothendieck constant 279
11.3. Boxes and games 283
11.3.1. Bell inequalities as games 283
11.3.2. Boxes and the nonsignaling principle 285
11.3.3. Bell violations 289
Appendix C. Extreme maps between Lorentz cones and the S-lemma 321
Notes and Remarks 324
Appendix D. Polarity and the Santaló point via duality of cones 325
Appendix E. Hints to exercises 329
Appendix. Bibliography 375
Websites 398
Appendix F. Notation 399
General notation 399
Convex geometry 399
Linear algebra 400
Probability 401
Geometry and asymptotic geometric analysis 402
Quantum information theory 403
Appendix. Index 405
List of Tables
2.1 Cones of operators and their duals 55
2.2 Cones of superoperators 57
3.1 Spooky action at a distance: outcome distribution for a 2-qubit
measurement experiment 75
4.1 Radii, volume radii, and widths for standard convex bodies in $\mathbb{R}^n$ 96
5.1 Covering numbers of classical manifolds 116
5.2 Constants and exponents in subgaussian concentration inequalities 118
5.3 Optimal bounds on Ricci curvature of classical manifolds 131
5.4 log-Sobolev and Poincaré constants for classical manifolds 134
9.1 Radii, volume radii, and widths for sets of quantum states 235
9.2 References for proofs of the results from Table 9.1 236
List of Figures
1.1 Gauge of a convex body 12
1.2 A polytope and its polar 17
1.3 A cone and its dual cone 20
2.1 The set of quantum states and the set of separable states 38
2.2 The set of PPT states 44
4.1 Symmetrizations of a convex body 80
4.2 An equilateral triangle in Löwner position 85
4.3 Width and half-width of a convex body 94
5.1 A net and a packing for an equilateral triangle 108
E.3 Sharper upper and lower bounds of the volume of spherical caps 345
Preface
The quest to build a quantum computer is arguably one of the major scien-
tific and technological challenges of the 21st century, and quantum information
theory (QIT) provides the mathematical framework for that quest. Over the last
dozen or so years, it has become clear that quantum information theory is closely
linked to geometric functional analysis (Banach space theory, operator spaces,
high-dimensional probability), a field also known as asymptotic geometric analy-
sis (AGA). In a nutshell, asymptotic geometric analysis investigates quantitative
properties of convex sets, or other geometric structures, and their approximate
symmetries as the dimension becomes large. This makes it especially relevant to
quantum theory, where systems consisting of just a few particles naturally lead to
models whose dimension is in the thousands, or even in billions.
While the idea for this book materialized after we independently taught grad-
uate courses directed primarily at students interested in functional analysis (at the
University Lyon 1 and at the University Pierre et Marie Curie-Paris 6 in the spring
of 2010), the final product goes well beyond enhanced lecture notes. The book is
aimed at multiple audiences connected through their interest in the interface of QIT
and AGA: at quantum information researchers who want to learn AGA or to apply
its tools; at mathematicians interested in learning QIT, or at least the part of QIT
that is relevant to functional analysis/convex geometry/random matrix theory and
related areas; and at beginning researchers in either field. We have tried to make
the book as user-friendly as possible, with numerous tables, explicit estimates, and
reasonable constants when possible, so as to make it a useful reference even for
notation and conventions with emphasis on those that are field-specific to AGA or
to physics and may therefore need to be clarified for readers that were educated in
the other culture. It should be read lightly and used later as a reference. Chapter 1
introduces basic notions from convexity theory that are used throughout the book,
notably duality of convex bodies or of convex cones and Schatten norms. Chapter 2
goes over a selection of mathematical concepts and elementary results that are rel-
evant to quantum theory. It is aimed primarily at newcomers to the area, but other
readers may find it useful to read it lightly and selectively to familiarize themselves
with the “spirit” of the book. Chapter 3 may be helpful to mathematicians with
limited background in physics; it shows why various mathematical concepts appear
in quantum theory. It could also help in understanding physicists talking about the
subject and in seeing the motivation behind their enquiries. The choice of topics
largely reflects the aspects of the field that we ourselves found not-immediately-
obvious when encountering them for the first time.
Chapters 4 through 7 include the background material from the widely under-
stood AGA that is either already established to be, directly or indirectly, relevant to
QIT, or that we consider to be worthwhile making available to the QIT community.
Even though most of this material can be found in existing books or surveys, many
items are difficult to locate in the literature and/or are not readily accessible to
outsiders. Here we have organized our exposition of AGA so that the applications
follow as seamlessly as possible. Our presentation of some aspects of the theory is
nonstandard. For example, we exploit the interplay between polarity and cone du-
ality (outlined in Chapter 1 and with a sample application in Appendix D) to give
novel and potentially useful insights. Chapters 4 (More convexity) and 5 (Metric
entropy and concentration of measure) can be read independently of each other,
but Chapters 6 and 7 depend on the preceding ones.
Chapters 8 through 12 discuss topics from the QIT proper, mostly via applica-
tion of tools from the prior chapters. These chapters can largely be read indepen-
dently of each other. For the most part, they present results previously published
in journal articles, often (but not always) by the authors and their collaborators,
most notably Cécilia Lancien, Elisabeth Werner, Deping Ye, Karol Życzkowski, and
The Horodecki Group. A few results are byproducts of the work on this book (e.g.,
those in Section 9.4). The book also contains several new proofs. Some of them
could arguably qualify as “proofs from The Book,” for example the first proof of
Størmer’s Theorem 2.36 (Section 2.4.5) or the derivation of the sharp upper bound
for the expected value of the norm of the complex Wishart matrix (Proposition
6.31).
Some statements are explicitly marked as “not proved here”; in that case the
references (to the original source and/or to a more accessible presentation) are
indicated in the “Notes and Remarks” section at the end of the chapter. Otherwise,
the proof can be found either in the main text or in the exercises. There are over
400 exercises that form an important part of the book. They are diverse and aim
at multiple audiences. Some are simple and elementary complements to the text,
while others allow the reader to explore more advanced topics at their own pace.
Still others explore details of the arguments that we judged to be too technical to
be included in the main text, but worthwhile to be outlined for those who may need
sharp versions of the results and/or to “reverse engineer” the proofs. All but the
of the reader wanting to use the book as a reference: a guide to notation and
semester. However, a graduate course centered on the main theme of the book—
the interface of QIT and AGA—can be easily designed around selected topics from
Chapters 4–7, followed by selected applications from Chapters 8–12. While we
assume at least a cursory familiarity with functional analysis (normed and inner
product spaces, and operators on them, duality, Hahn–Banach-type separation
theorems etc.), real analysis ($L_p$-spaces), and probability, deep results from these fields
appear only occasionally and—when they do—an attempt is made to “soften the
blow” by presenting some background via appropriately chosen exercises. Alterna-
tively, most chapters could serve as a core for an independent study course. Again,
this would be greatly facilitated by the numerous exercises and—mathematical ma-
turity being more critical than extensive knowledge—the text will be accessible to
sufficiently motivated advanced undergraduates.
Acknowledgements. This book has been written over several years; during this pe-
riod the project benefited greatly from the joint stays of the authors at the Isaac
Newton Institute in Cambridge, Mathematisches Forschungsinstitut Oberwolfach
(within the framework of its Research in Pairs program), and the Instituto de
Ciencias Matemáticas in Madrid. We are grateful to these institutions and their
staff for their support and hospitality. We are indebted to the many colleagues and
students who helped us bring this book into being, either by reading and comment-
ing on specific chapters, or by sharing with us their expertise and/or providing us
with references. We thank in particular Dominique Bakry, Andrew Blasius, Michał
Horodecki, Cécilia Lancien, Imre Leader, Ben Li, Harsh Mathur, Mark Meckes,
Emanuel Milman, Ion Nechita, and David Reeb. We also thank the anonymous referees
for many suggestions which helped to improve the quality of the text. We are espe-
cially grateful to Gaëlle Jardine for careful proofreading of parts of the manuscript.
We acknowledge Aurélie Garnier, who created the comic strip. Thanks are also
due to Sergei Gelfand of the American Mathematical Society’s Editorial Division,
who guided this project from the conception to its conclusion and whose advice
and prodding were invaluable. Finally, we would like to thank our families for their
Alice and Bob
Mathematical Aspects of Quantum Information
Theory
CHAPTER 0
0.1. Asymptotic and non-asymptotic notation
The letters $C, c, c', c_0, \ldots$ denote absolute numerical constants, independent of
the instance of the problem at hand. However, the actual values corresponding
to the same symbol may change from place to place. Such constants are always
assumed to be positive. Usually $C$ or $C'$ stands for a large (but finite) number,
while $c$ or $c_0$ denotes a small (but nonzero) number. If a constant is allowed to
depend on a parameter (say $n$, or $\varepsilon$), we use expressions such as $C_n$ or $c(\varepsilon)$.
When $A, B$ are quantities depending on the dimension (and/or perhaps on
some other parameters), the notation $A = O(B)$ means that there exists an absolute
constant $C > 0$ such that the inequality $A \le CB$ holds in every dimension.
Similarly, $A = \Omega(B)$ means that $B = O(A)$, and $A = \Theta(B)$ means both $A = O(B)$
and $B = O(A)$. We emphasize that these are non-asymptotic relations; they are
supposed to hold universally, in every instance of the problem, independently of
any other parameters that may be involved, and not just in the limit. We also write
$A \lesssim B$, $A \gtrsim B$ and $A \simeq B$ as alternative notation for $A = O(B)$, $A = \Omega(B)$ and
dimension tends to $\infty$ (or as some other relevant parameter tends to its limiting
value), and both $A = o(B)$ and $A \ll B$ mean that $A/B \to 0$. If we want to indicate
or emphasize that a dependence (of either kind) is not necessarily uniform in some
of the parameters, we may write, for example, $c(\alpha)$ or $A = O_\varepsilon(B)$ to identify the
parameter(s) on which the relation in question does or may depend, and similarly
for $A \sim_p B$ (asymptotic equivalence for fixed $p$). Note that if there is only one
parameter involved (say, the dimension $n$), then $A \sim B$ implies $A \simeq B$; however,
Throughout this book, virtually all the normed spaces we consider will be finite-
4 0. NOTATION AND BASIC CONCEPTS
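The dependence $A \mapsto A^\dagger$ is conjugate linear. A simple but important instance
of this operation is when $\mathcal{H}_1 = \mathbb{C}$: if we identify $\varphi \in \mathcal{H}$ with an operator $z \mapsto z\varphi$
belonging to $B(\mathbb{C}, \mathcal{H})$, then the adjoint of that operator is $\varphi^\dagger = \langle\varphi, \cdot\rangle \in B(\mathcal{H}, \mathbb{C}) = \mathcal{H}^*$.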
The notation $B(\cdot, \cdot)$ will be occasionally used for the corresponding concepts
in the category of normed (or just vector) spaces. Note that while $B$ stands for
“bounded,” in the finite-dimensional setting all linear operators are bounded and
so, if minimal care is exercised, this will not introduce ambiguity. On the other
hand, the notation $\dagger$ will be reserved for operators acting between Hilbert spaces; in
other contexts we will use the usual functional analytic notation $T^*$ for the adjoint
of a linear map $T$.
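If $\mathcal{H}$ is a complex Hilbert space, we denote by $\overline{\mathcal{H}}$ the Hilbert space which coincides
with $\mathcal{H}$ as far as the additive structure is concerned, but with multiplication
defined as $(\lambda, x) \mapsto \bar{\lambda}x$. Again, the identity map $\mathcal{H} \ni \psi \mapsto \psi \in \overline{\mathcal{H}}$ is $\mathbb{R}$-linear,
but not $\mathbb{C}$-linear. Still, the Hilbert spaces $\mathcal{H}$ and $\overline{\mathcal{H}}$ are isomorphic. Explicit iso-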
canonical when there is only one natural candidate for an isomorphism. In the
subtlety does not arise in the real case since the map $\psi \mapsto \psi^\dagger$ is $\mathbb{R}$-linear and so the
dual space $\mathcal{H}^* = B(\mathcal{H}, \mathbb{R})$ identifies canonically with $\mathcal{H}$.
space $\mathcal{H}$, and $S^{n-1} = S_{\mathbb{R}^n}$. We denote by vol the Lebesgue measure on a finite-
is a column vector (an $m \times 1$ matrix, which can also be identified with an operator
from $\mathbb{R}$ to $\mathbb{R}^m$); the transposition $x^T$ is a row vector, or a linear functional on $\mathbb{R}^m$;
$xy^T$ is the outer product of column vectors $x$ and $y$, while $x^T y$ is their inner (scalar)
product, defined if $x$ and $y$ have the same dimension.
The Dirac notation has a very similar structure, the differences being that it is
(at least a priori) coordinate-free, that the primary operation is $\dagger$ rather than $T$,
and that the identification of a given object as a vector or as a functional is intrinsic
in the notation. “Standard” vectors in $\mathcal{H}$ are written as $|\psi\rangle$ (a ket vector). The
same vector, but thought of as an element of $\mathcal{H}^* \leftrightarrow \overline{\mathcal{H}}$, is identified with $|\psi\rangle^\dagger$ and
written as $\langle\psi|$ (a bra vector). The bra-ket notation works seamlessly with standard
operations on Hilbert spaces. The action of a functional $\langle\psi|$ on a vector $|\chi\rangle$ is
$\langle\psi|\chi\rangle$, an alternative notation for the scalar product $\langle\psi, \chi\rangle$. If $A \in B(\mathcal{H})$ and
$\psi \in \mathcal{H}$, then we have $A|\psi\rangle = |A\psi\rangle$ and $\langle A\psi| = (A|\psi\rangle)^\dagger = \langle\psi|A^\dagger$. Consequently, the
quantity $\langle\psi'|A|\psi\rangle$ can be read as $\langle\psi', A\psi\rangle$ or as $\langle A^\dagger\psi', \psi\rangle$, the equality of which is
a restatement of the definition (0.1).
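As a quick numerical sanity check of this identity, the following Python sketch (not part of the original text) verifies $\langle\psi', A\psi\rangle = \langle A^\dagger\psi', \psi\rangle$ for one arbitrarily chosen matrix and pair of vectors. The helper names `inner`, `apply`, and `dagger` are illustrative assumptions, and the inner product is taken to be conjugate-linear in the first argument, following the physics convention.

```python
def inner(u, v):
    # Physics convention: conjugate-linear in the first argument.
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(A, v):
    # Matrix-vector product A v, with A given as a list of rows.
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def dagger(A):
    # Hermitian conjugate: entry (i, j) of the result is conj(A[j][i]).
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

A = [[1 + 2j, 3j], [2 - 1j, 4 + 0j]]   # arbitrary operator on C^2
psi = [1 + 1j, 2 - 1j]                 # plays the role of psi
phi = [0.5j, 1 - 2j]                   # plays the role of psi'

lhs = inner(phi, apply(A, psi))          # <psi', A psi>
rhs = inner(apply(dagger(A), phi), psi)  # <A^dagger psi', psi>
print(abs(lhs - rhs) < 1e-12)
```

Both readings of $\langle\psi'|A|\psi\rangle$ agree up to rounding, which is all the check is meant to illustrate.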
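Let $\mathcal{H}_1, \mathcal{H}_2$ be real or complex Hilbert spaces, and let $\psi_1, \psi_2$ be vectors in
$\mathcal{H}_1, \mathcal{H}_2$ respectively. Then the operator $|\psi_1\rangle\langle\psi_2| : \mathcal{H}_2 \to \mathcal{H}_1$ acts on $\chi \in \mathcal{H}_2$ as
follows:
$$|\chi\rangle \mapsto |\psi_1\rangle\langle\psi_2|\chi\rangle = \langle\psi_2|\chi\rangle\,|\psi_1\rangle$$
or, in the standard notation, $\chi \mapsto \langle\psi_2, \chi\rangle\psi_1$. This operator has rank one unless one
of the vectors $\psi_1, \psi_2$ is zero.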
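The action of $|\psi_1\rangle\langle\psi_2|$ can be illustrated in coordinates. The sketch below is only an illustration under stated assumptions: vectors are plain Python lists of complex numbers, and `ketbra`, `inner`, and `apply` are hypothetical helper names introduced here. It builds the matrix with entries $(\psi_1)_i \overline{(\psi_2)_j}$ and compares its action on $\chi$ with $\langle\psi_2, \chi\rangle\psi_1$.

```python
def inner(u, v):
    # Conjugate-linear in the first argument (physics convention).
    return sum(a.conjugate() * b for a, b in zip(u, v))

def ketbra(psi1, psi2):
    # Matrix of |psi1><psi2|: entry (i, j) equals psi1[i] * conj(psi2[j]).
    return [[a * b.conjugate() for b in psi2] for a in psi1]

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

psi1 = [1 + 0j, 1j, 2 + 0j]   # a vector in H1 = C^3
psi2 = [1j, 1 + 1j]           # a vector in H2 = C^2
chi = [2 + 0j, 1 - 1j]        # the input vector, in H2

image = apply(ketbra(psi1, psi2), chi)
expected = [inner(psi2, chi) * a for a in psi1]
print(all(abs(x - y) < 1e-12 for x, y in zip(image, expected)))
```

Every row of the matrix is a multiple of the same row vector, which is the coordinate form of the rank-one statement.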
In some mathematical circles, the operator $|\psi_1\rangle\langle\psi_2|$ is sometimes denoted $\psi_1 \otimes$
count the lattice structure.) However, sometimes we will employ the enumeration
$(|0\rangle, |1\rangle, \ldots, |d-1\rangle)$, particularly for $d = 2$, where we will follow the traditional
convention from computer science and use $(|0\rangle, |1\rangle)$. Either way, we will refer to
this basis as the computational basis. (As explained in Section 3.1, the designation
“computational basis” may have an operational meaning, but such subtleties
will be normally beyond the scope of our analysis.) Nevertheless, in some cases,
particularly in the real context, we will use the notation $e_1, e_2, \ldots, e_d$ that is more
which is often called a multipartite Hilbert space (or bipartite when $k = 2$). The
space $\mathcal{H}$ carries a natural Hilbert space structure given by the inner product defined
for product vectors by
$$\langle \psi_1 \otimes \cdots \otimes \psi_k, \chi_1 \otimes \cdots \otimes \chi_k \rangle = \prod_{i=1}^{k} \langle \psi_i, \chi_i \rangle$$
and extended to $\mathcal{H}$ by multilinearity. There are canonical identifications
$$B\Big(\bigotimes_{i=1}^{k} \mathcal{H}_i\Big) \longleftrightarrow \bigotimes_{i=1}^{k} B(\mathcal{H}_i),$$
where the tensor products are over the real or complex field, respectively. In the
complex case only, another canonical identification is
$$B^{\mathrm{sa}}\Big(\bigotimes_{i=1}^{k} \mathcal{H}_i\Big) \longleftrightarrow \bigotimes_{i=1}^{k} B^{\mathrm{sa}}(\mathcal{H}_i), \tag{0.3}$$
where the tensor products are over the complex field on the left-hand side and over
the real field on the right-hand side. Except in the trivial cases, the analogue of
(0.3) is false in the setting of real Hilbert spaces: e.g., $B^{\mathrm{sa}}(\mathbb{R}^2) \otimes B^{\mathrm{sa}}(\mathbb{R}^2)$ is a proper
in (0.2) to be 1-dimensional, such factors may be just dropped and so, when referring
to a multipartite Hilbert space, we will normally assume that all the factors are of
dimension at least 2.
We often work with concrete spaces such as $(\mathbb{C}^2)^{\otimes k}$, which corresponds to $k$
qubits. In that case the computational basis consists of the $2^k$ vectors of the
form $|i_1\rangle \otimes \cdots \otimes |i_k\rangle$, where $(i_1, \ldots, i_k) \in \{0, 1\}^k$. It is customary to drop the tensor
product sign: for example the computational basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$ consists of the 4
We also point out that tensor products commute with the operation of taking
dual, i.e., there is a canonical identification
0.5. Complexification
Let $V$ be a real vector space. The complexification of $V$ is the vector space
$V^{\mathbb{C}} = V \otimes \mathbb{C}$ (the tensor product is over the reals). Elements of $V^{\mathbb{C}}$ are of the form
$x \otimes 1 + y \otimes i$ (for $x, y \in V$), which we write $x + iy$ for short.
that for example $\mathbb{C}^n$ is treated as $\mathbb{R}^{2n}$. In the abstract setting, if the original complex
space was endowed with a scalar product $\langle\cdot, \cdot\rangle$, the corresponding real scalar product
is $\operatorname{Re}\langle\cdot, \cdot\rangle$. While this is frequently a useful point of view, particularly in geometric
considerations (see Section 1.1), some caution is needed as this operation is not as
sound functorially as complexification. For example, $\mathbb{C} \otimes_{\mathbb{C}} \mathbb{C} = \mathbb{C}$ identifies this way
with $\mathbb{R}^2$, even though $\mathbb{R}^2 \otimes_{\mathbb{R}} \mathbb{R}^2$ is 4-dimensional.
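To make the passage from $\mathbb{C}^n$ to $\mathbb{R}^{2n}$ concrete, here is a small Python sketch. The coordinate ordering used by the hypothetical helper `realify` (real parts first, then imaginary parts) is an assumption made for illustration; any fixed ordering would do. It checks that $\operatorname{Re}\langle x, y\rangle$ coincides with the standard dot product of the realified vectors.

```python
def inner(u, v):
    # Complex scalar product, conjugate-linear in the first argument.
    return sum(a.conjugate() * b for a, b in zip(u, v))

def realify(z):
    # (z_1, ..., z_n) -> (Re z_1, ..., Re z_n, Im z_1, ..., Im z_n) in R^{2n}.
    return [c.real for c in z] + [c.imag for c in z]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

x = [1 + 2j, 3 - 1j]
y = [0.5j, 2 + 2j]

real_part = inner(x, y).real
euclidean = dot(realify(x), realify(y))
print(abs(real_part - euclidean) < 1e-12)
```

The dimension count in the text is the same bookkeeping: the realification of the 1-dimensional complex space $\mathbb{C} \otimes_{\mathbb{C}} \mathbb{C}$ has real dimension 2, whereas $\mathbb{R}^2 \otimes_{\mathbb{R}} \mathbb{R}^2$ has dimension $2 \cdot 2 = 4$.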
0.6. Matrices vs. operators
We denote by $M_{m,n}$ the space of $m \times n$ matrices, either real or complex, and by
$M_n$ if $m = n$. The entries of a matrix $M \in M_{m,n}$ are denoted by $(m_{ij})_{1 \le i \le m,\, 1 \le j \le n}$.
We denote by $M^\dagger$ the Hermitian conjugate of $M$, i.e., $(m_{ij})^\dagger = (\overline{m_{ji}})$. We will
denote by $M_m^{\mathrm{sa}} := \{M \in M_m : M = M^\dagger\}$ the subspace of $M_m$ consisting of Hermitian
(or self-adjoint) matrices. For matrices with real entries, “self-adjoint” simply
means “symmetric.”
The preceding definitions ensure that the above notion of $\dagger$ is consistent with that
introduced in Section 0.2, and that the operator composition is consistent with
on/between any Hilbert spaces of the appropriate dimensions. However, such iden-
tification requires specifying bases in the spaces in question and, consequently, is
not canonical.
In the real case, $M_n$ is a vector space of dimension $n^2$, and $M_n^{\mathrm{sa}}$ is a subspace
then
$$\langle M, N \rangle = \operatorname{Tr} M^\dagger N. \tag{0.4}$$
(Recall that we use the “physics” convention for sesquilinear forms, as explained in
Section 0.3.) The Euclidean structure on $M_{m,n}$ induced by this inner product is
called the Hilbert–Schmidt Euclidean structure, and the corresponding norm is the
Hilbert–Schmidt norm $\|M\|_{\mathrm{HS}} = \sqrt{\operatorname{Tr} M^\dagger M}$. (In linear algebra the more commonly
used name is Frobenius.) Note that in the complex case the inner product will, in
general, not be real. However, if $M, N \in M_m^{\mathrm{sa}}$, then $\langle M, N \rangle = \operatorname{Tr} MN$ is real (even
if some of the entries of $M, N$ are complex).
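The following Python sketch illustrates (0.4) on small examples; the matrices and the helper names (`dagger`, `matmul`, `hs_inner`, `hs_norm`) are assumptions chosen for illustration only. It computes the Hilbert–Schmidt inner product of two Hermitian matrices with complex entries, confirms that the result is real, and contrasts this with a non-Hermitian pair.

```python
import math

def dagger(M):
    # Row j of the result lists conj(M[i][j]) over i.
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def hs_inner(M, N):
    # <M, N> = Tr(M^dagger N)
    prod = matmul(dagger(M), N)
    return sum(prod[k][k] for k in range(len(prod)))

def hs_norm(M):
    return math.sqrt(hs_inner(M, M).real)

# Two Hermitian matrices with some non-real entries.
M = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]
N = [[0 + 0j, 2j], [-2j, 3 + 0j]]
value = hs_inner(M, N)

# A non-Hermitian pair, for which the inner product is not real.
P = [[1j, 0j], [0j, 0j]]
Q = [[1 + 0j, 0j], [0j, 0j]]
other = hs_inner(P, Q)

print(value, hs_norm(M), other)
```

For this choice, `value` has zero imaginary part, while `other` is purely imaginary, matching the remark that realness is only guaranteed on $M_m^{\mathrm{sa}}$.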
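where, for each $i, j \in \{1, \ldots, m\}$, the matrix $M_{ij} \in M_n$ is defined as
$$M_{ij} = \begin{bmatrix}
(\langle i| \otimes \langle 1|) A (|j\rangle \otimes |1\rangle) & \cdots & (\langle i| \otimes \langle 1|) A (|j\rangle \otimes |n\rangle) \\
\vdots & & \vdots \\
(\langle i| \otimes \langle n|) A (|j\rangle \otimes |1\rangle) & \cdots & (\langle i| \otimes \langle n|) A (|j\rangle \otimes |n\rangle)
\end{bmatrix}. \tag{0.6}$$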
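Formula (0.6) can be read as an indexing rule. The sketch below is an illustration under assumptions that are not stated in the surrounding text: the operator $A$ on $\mathbb{C}^m \otimes \mathbb{C}^n$ is stored as an $mn \times mn$ matrix in the basis $|j\rangle \otimes |l\rangle$ ordered lexicographically, indices are 0-based, and `kron` and `block` are hypothetical helper names. For a product operator $X \otimes Y$, the block $M_{ij}$ should then equal $x_{ij} Y$.

```python
def kron(X, Y):
    # Kronecker product of square matrices X (m x m) and Y (n x n);
    # row index is i*n + k and column index is j*n + l.
    m, n = len(X), len(Y)
    return [[X[i][j] * Y[k][l] for j in range(m) for l in range(n)]
            for i in range(m) for k in range(n)]

def block(A, i, j, n):
    # Block M_ij of (0.6), with 0-based indices: entry (k, l) is
    # (<i| (x) <k|) A (|j> (x) |l>) = A[i*n + k][j*n + l].
    return [[A[i * n + k][j * n + l] for l in range(n)] for k in range(n)]

X = [[1, 2], [3, 4]]                     # acts on the first factor (m = 2)
Y = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]    # acts on the second factor (n = 3)
A = kron(X, Y)

M01 = block(A, 0, 1, 3)
print(M01)
```

With a different ordering of the product basis the roles of the two factors would be exchanged, so the ordering convention should be checked against the source before reusing this indexing.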
0.8. Operators vs. tensors
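Let $\mathcal{H}_1, \mathcal{H}_2$ be complex Hilbert spaces. The map $u \otimes v \mapsto |v\rangle\langle u|$ induces a
canonical identification between the spaces $\overline{\mathcal{H}_1} \otimes \mathcal{H}_2$ and $B(\mathcal{H}_1, \mathcal{H}_2)$. Recall from
Section 0.2 that $\overline{\mathcal{H}_1}$ identifies canonically with $\mathcal{H}_1^*$.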
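As explained in Section 0.2, the use of the complex conjugacy can be avoided
if we agree to work with specified bases. Fix bases $(e_i)_{i \in I}$ in $\mathcal{H}_1$ and $(f_j)_{j \in J}$ in
$\mathcal{H}_2$. Define a map $\mathrm{vec} : B(\mathcal{H}_1, \mathcal{H}_2) \to \mathcal{H}_2 \otimes \mathcal{H}_1$ as follows: for $i \in I$ and $j \in J$, set
$\mathrm{vec}(|f_j\rangle\langle e_i|) = f_j \otimes e_i$ and extend the definition by $\mathbb{C}$-linearity. In other words, for
$\psi_1 \in \mathcal{H}_1$, $\psi_2 \in \mathcal{H}_2$ we have $\mathrm{vec}\,|\psi_2\rangle\langle\psi_1| = \psi_2 \otimes \overline{\psi_1}$ where conjugacy is taken with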
operators and superoperators may seem rather arbitrary since, as we noted earlier,
$B(\mathcal{H})$ and $M_{m,n}$ carry a natural Hilbert space structure. However, it helps to
organize one’s thinking and is widely used in quantum information theory.
Accordingly, we use two different types of notation to denote the identity map:
the identity operator on a Hilbert space $\mathcal{H}$ is denoted by $I_{\mathcal{H}}$ (or $I_n$ if $\mathcal{H} = \mathbb{C}^n$ or
$\mathbb{R}^n$, or even simply $I$ if there is no ambiguity), while the identity superoperator on
$D(\mathcal{H})$ the set of states on $\mathcal{H}$ (the letter D stands for density matrix, which is an
alternative terminology for states). If $\mathcal{H} = \mathbb{C}^{n+1}$, the subset of $D(\mathcal{H})$ consisting of
diagonal operators identifies naturally with $\Delta_n$ (and similarly for operators diagonal
with respect to any fixed basis in any finite-dimensional Hilbert space).
In functional analysis, a state on a $C^*$-algebra is, by definition, a positive
linear functional of norm 1. This is consistent with the definitions of classical
and quantum states introduced above. Indeed, given a finite set $S$, states on the
commutative $C^*$-algebra $\mathbb{C}^S$ correspond to classical states on $S$. Similarly, given
a finite-dimensional complex Hilbert space $\mathcal{H}$, the states on the $C^*$-algebra $B(\mathcal{H})$
can be identified with elements of $D(\mathcal{H})$ via trace duality (0.4).
CHAPTER 1
In this chapter we present an overview of basic properties of convex sets and
convex cones. Unless stated explicitly otherwise, we shall assume that the base field
is R and that all the objects involved are finite-dimensional. However, notions for
complex spaces will be important and even indispensable in some settings. They
are typically introduced by repeating mutatis mutandis the definitions of their real
counterparts. At the same time, one can always consider them as real spaces by
ignoring the complex structure.
If $V$ is an $n$-dimensional vector space over $\mathbb{R}$, we will usually assume that $V$ is
identified with $\mathbb{R}^n$. This implies in particular that there is a distinguished Euclidean
structure (i.e., a scalar product) in $V$, so that $V$ is also identified with its dual $V^*$.
1.1. Normed spaces and convex sets
1.1.1. Gauges. We start with a simple proposition which characterizes the
subsets of $\mathbb{R}^n$ that can be the unit balls for some norm. A subset $K \subset \mathbb{R}^n$ is a
convex body if it is convex, compact, and has non-empty interior. We similarly
define convex bodies in linear (or affine) subspaces of $\mathbb{R}^n$. We will call $K$ symmetric
lighten the notation, we will use specialized symbols for various “common” spaces.)
The correspondence $X \mapsto B_X$ is the inverse of the correspondence $K \mapsto \|\cdot\|_K$.
In the complex case, the analogue of symmetry is circledness. A convex body
$K \subset \mathbb{C}^n$ is said to be circled if for every $\theta \in \mathbb{R}$ and $x \in K$ we have $e^{i\theta}x \in K$. Circled
convex bodies are exactly the unit balls of norms in $\mathbb{C}^n$.
Equation (1.1) will also be used to define the gauge of a not necessarily
symmetric convex set $K$. However, in order for the gauge to take only finite values
and to avoid other degeneracies, we will usually insist that $K$ contain the origin in
its interior and that $K$ be closed. We will still denote by $\|\cdot\|_K$ the gauge of such a
convex set, and we will still have the (essentially tautological) relation
$$K = \{x : \|x\|_K \le 1\}. \tag{1.2}$$
12 1. ELEMENTARY CONVEX ANALYSIS
Figure 1.1. Gauge of a convex body. (The figure marks the points $0$, $x$, and $x/\|x\|_K$.)
(Observe that if $K$ is closed, the infimum in (1.1) is always attained.) However, if
$K$ is not assumed to be symmetric, we should note that in general $\|x\|_K \ne \|{-x}\|_K$.
We point out that the correspondence between convex bodies and their gauges
is order-reversing: $K \subset L$ if and only if $\|\cdot\|_K \ge \|\cdot\|_L$. In the same vein, we have
$\|\cdot\|_{tK} = t^{-1}\|\cdot\|_K$ for $t > 0$.
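These properties of gauges are easy to observe numerically. The sketch below assumes, purely for illustration, that the convex body is a polytope described by inequalities, $K = \{x : \langle a_i, x\rangle \le 1 \text{ for all } i\}$ (a representation not introduced in the text at this point); for such $K$ the gauge is $\max(0, \max_i \langle a_i, x\rangle)$. The function name `gauge` and the triangle used are hypothetical choices.

```python
def gauge(normals, x):
    # Gauge of K = {x : <a, x> <= 1 for every a in normals}:
    # inf{t >= 0 : x in tK} = max(0, max_a <a, x>).
    return max(0.0, max(sum(a * c for a, c in zip(n, x)) for n in normals))

def scale_body(normals, t):
    # tK = {x : <a / t, x> <= 1 for every a in normals}, for t > 0.
    return [tuple(a / t for a in n) for n in normals]

# A non-symmetric triangle containing the origin in its interior,
# with vertices (1, 1), (1, -2), (-2, 1).
K = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]

x = (1.0, 1.0)
minus_x = (-1.0, -1.0)

g_x = gauge(K, x)              # x is a vertex of K
g_minus_x = gauge(K, minus_x)  # -x lies outside K
g_scaled = gauge(scale_body(K, 2.0), x)

print(g_x, g_minus_x, g_scaled)
```

Here the gauge distinguishes $x$ from $-x$ because $K$ is not symmetric, and doubling the body halves the gauge, in line with $\|\cdot\|_{tK} = t^{-1}\|\cdot\|_K$ and with the order-reversing property applied to $K \subset 2K$.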
1.1.2. First examples: $\ell_p$-balls, simplex, polytopes, and convex hulls.
For $1 \le p \le +\infty$, we denote by $\|\cdot\|_p$ the $\ell_p$-norm, defined for $x \in \mathbb{R}^n$ via
$$\|x\|_p = \Big(\sum_{k=1}^{n} |x_k|^p\Big)^{1/p}, \tag{1.3}$$
where the limit case $p = +\infty$ should be understood as $\|x\|_\infty = \max\{|x_k| : 1 \le k \le n\}$.
Recall also that $\|\cdot\|_2$ will be usually denoted by $|\cdot|$. The $\ell_p$-norms satisfy the
If $A \subset \mathbb{R}^n$, we denote by $\operatorname{conv} A$ the convex hull of $A$, i.e., the set of all convex
combinations of elements of $A$, which is also the smallest convex set containing
for some $m > n$.
$\mathbb{R}^{n+1}$. Note that $\Delta_n$ is a convex body in $H$, but only a convex subset of $\mathbb{R}^{n+1}$. The
simplex $\Delta_n$ corresponds to the set of classical states, i.e., probability measures on
$\{0, \ldots, n\}$.
Exercise 1.1 (Carathéodory’s theorem). Let $A \subset \mathbb{R}^n$, $x \in \operatorname{conv} A$ and consider
a decomposition $x = \sum_{i=1}^{N} \lambda_i x_i$ (where $(\lambda_i)$ is a convex combination and $x_i \in A$)
of minimal length $N$. Show that the points $(x_i)$ must be affinely independent, and
conclude that $N \le n + 1$.
Exercise 1.2. Let $A \subset \mathbb{R}^n$ be a compact set. Show that $\operatorname{conv} A$ is compact.
1.1.3. Extreme points, faces. Let $K \subset \mathbb{R}^n$ be a convex set. A point $x \in K$ is
said to be extreme if it cannot be written in a nontrivial way as a convex combination
of points of $K$, i.e., if the equality $x = ty + (1 - t)z$ for $t \in (0, 1)$ and $y, z \in K$ implies
that $x = y = z$. The following fundamental theorem asserts that, in a sense, all
information about a convex body is contained in its extreme points.
Theorem 1.3 (Krein–Milman theorem, see Exercise 1.6). Let $K \subset \mathbb{R}^n$ be a
convex body. Then $K$ is the convex hull of its extreme points.
Let $F, K$ be closed convex sets with $F \subset K$. Then $F$ is called a face of $K$
if every segment contained in $K$ whose (relative) interior intersects $F$ is entirely
contained in $F$. If $F \ne \emptyset$ and $F \ne K$, $F$ is said to be a proper face. Note that a
singleton $\{x\}$ is a face if and only if $x$ is an extreme point. If $F$ is a face of $K$ with
$\dim F = \dim K - 1$, then $F$ is called a facet.
A frequently encountered setting in convex or functional analysis is that of two
convex sets $K, L$ and a linear or affine map $u$ such that $u(L) \subset K$. For example, if
$X, Y$ are normed spaces, and $u : X \to Y$ a linear operator, then $u$ is a contraction
Proposition 1.4 (Affine maps preserve faces, see Exercise 1.4). Let $K, L$ be
closed convex sets, let $x$ be a point in the relative interior of $L$, and let $u : L \to K$
exists a vector $y \in \mathbb{R}^n$ such that the linear functional $\langle y, \cdot\rangle$ attains its maximum on
$K$ only at $x$. These notions are studied in Exercise 1.5.
Exercise 1.3. Show that the (relative) boundary of a closed convex set is a
union of exposed faces.
Exercise 1.4. Prove Proposition 1.4.
Exercise 1.5 (Extreme vs. exposed points, faces vs. exposed faces). Let $K \subset \mathbb{R}^n$
be a closed convex set.
(a) Show that every exposed face $F$ of a closed convex set $K$ is indeed a face of $K$,
which is necessarily proper (i.e., $F \ne K, \emptyset$).
(e) More generally, for $k \le n - 2$, give an example of a convex body $L \subset \mathbb{R}^n$ with
a $k$-dimensional face which is not exposed.
(f) Show that $F$ is a face of $K$ if and only if there exists a sequence $F = F_0 \subset F_1 \subset
\cdots \subset F_s = K$ such that $F_{i-1}$ is an exposed face of $F_i$ for $i = 1, \ldots, s$.
(g) If every point in the (relative) boundary of a convex set $K$ is extreme, $K$ is
called strictly convex. Show that, in that case, every point of the boundary is an
exposed point.
Exercise 1.6. Prove the Krein–Milman Theorem 1.3 by induction with respect
to n. (Start by showing that any convex body has at least one extreme point.)
Exercise 1.7. Show that the extreme points of the set of quantum states $D(\mathcal{H})$
are operators of the form $|\psi\rangle\langle\psi|$, where $\psi \in \mathcal{H}$ is a norm one vector (i.e., rank one
orthogonal projections).
Exercise 1.8. Show that every face of a polytope is a polytope.
Exercise 1.11 (Hanner’s inequalities and uniform convexity). The goal of this
exercise is to prove Hanner’s inequalities about the geometry of the $p$-norm, which
lead to precise quantitative statements about convexity and smoothness of balls in
$L_p$-spaces.
(i) Let $p \in (1, 2]$. For $t > 0$, set $\alpha(t) = (1 + t)^{p-1} + |1 - t|^{p-1}\operatorname{sign}(1 - t)$. Show that
$$\|x + y\|_p^p + \|x - y\|_p^p \ge (\|x\|_p + \|y\|_p)^p + \big|\|x\|_p - \|y\|_p\big|^p. \tag{1.5}$$
Show also that, for $p \in [2, \infty)$, (1.5) holds with $\le$ instead of $\ge$.
(iii) Let $p \in (1, 2]$. Prove also that for $x, y \in \mathbb{R}^n$,
$$\Big(\frac{\|x + y\|_p^p + \|x - y\|_p^p}{2}\Big)^{2/p} \ge \|x\|_p^2 + (p - 1)\|y\|_p^2. \tag{1.6}$$
(iv) Fix $p \in (1, \infty)$. Show that for any $\varepsilon > 0$ there exists $\delta > 0$ such that whenever
$x, y \in B_p^n$ satisfy $\|x - y\|_p \ge \varepsilon$, then $\big\|\frac{x + y}{2}\big\|_p \le 1 - \delta$. (This property of $B_p^n$ is a
quantitative version of strict convexity and is called uniform convexity.)
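Before attempting the proofs, it can be reassuring to test inequalities (1.5) and (1.6) numerically. The Python sketch below is only a sanity check on randomly sampled vectors, not a proof; the sample size, the dimension 4, the seed, and the function names `hanner_gap` and `two_convexity_gap` are arbitrary assumptions made for illustration.

```python
import random

def p_norm(v, p):
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def hanner_gap(x, y, p):
    # Left-hand side of (1.5) minus its right-hand side.
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    lhs = p_norm(plus, p) ** p + p_norm(minus, p) ** p
    nx, ny = p_norm(x, p), p_norm(y, p)
    return lhs - ((nx + ny) ** p + abs(nx - ny) ** p)

def two_convexity_gap(x, y, p):
    # Left-hand side of (1.6) minus its right-hand side.
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    mean = (p_norm(plus, p) ** p + p_norm(minus, p) ** p) / 2.0
    return mean ** (2.0 / p) - (p_norm(x, p) ** 2 + (p - 1.0) * p_norm(y, p) ** 2)

rng = random.Random(0)  # fixed seed so that the run is reproducible
samples = [([rng.uniform(-1, 1) for _ in range(4)],
            [rng.uniform(-1, 1) for _ in range(4)]) for _ in range(200)]

min_gap_15 = min(hanner_gap(x, y, 1.5) for x, y in samples)         # expect >= 0
max_gap_3 = max(hanner_gap(x, y, 3.0) for x, y in samples)          # expect <= 0
min_gap_16 = min(two_convexity_gap(x, y, 1.5) for x, y in samples)  # expect >= 0

print(min_gap_15 >= -1e-9, max_gap_3 <= 1e-9, min_gap_16 >= -1e-9)
```

The small tolerance only absorbs floating-point rounding; the sign pattern (nonnegative gap for $p = 1.5$, nonpositive for $p = 3$) reflects the reversal of (1.5) at $p = 2$.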
Exercise 1.12 (A Borel selection theorem). Let $K \subset \mathbb{R}^n$ be a convex body.
Show that there is a Borel map $\Theta : \mathbb{R}^n \to K$ with the property that for every
$x \in \mathbb{R}^n$ we have $\langle\Theta(x), x\rangle = \max\{\langle z, x\rangle : z \in K\}$.
1.1. NORMED SPACES AND CONVEX SETS 15
1.1.4. Polarity. This section and the next one will present elements of convex
analysis. Readers not familiar with the subject are encouraged to go over the
suggested exercises, which are generally simple and elementary, but often contain
facts not included in standard texts.
Since norms on $\mathbb{R}^n$ are in one-to-one correspondence with symmetric convex
bodies, the notion of duality between normed spaces induces a duality for convex
bodies, which is called polarity. Its explicit definition is as follows: if $A \subset \mathbb{R}^n$, the
polar of $A$ is
(1.7) $A^\circ := \{ y \in \mathbb{R}^n : \langle x, y \rangle \le 1 \text{ for all } x \in A \}.$
In particular (cf. (1.2) and Exercise 1.13)
(1.8) $\|y\|_{A^\circ} = \sup_{x \in A \cup \{0\}} \langle x, y \rangle.$
The key example is $A = B_X$ (the unit ball of $X$); we have then $A^\circ = B_{X^*}$, the
unit ball with respect to the dual norm, the duality being induced by the standard
Euclidean structure. For example, duality of $\ell_p$-norms translates into
(1.9) $(B_p^n)^\circ = B_q^n,$
where $1/p + 1/q = 1$.
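Relations (1.8) and (1.9) can be illustrated numerically: for a nonzero $y$, the supremum of $\langle x, y \rangle$ over $B_p^n$ equals $\|y\|_q$ and is attained at an explicit point. The sketch below assumes $1 < p < \infty$ and uses the standard Hölder extremizer $x_i = \mathrm{sign}(y_i)\,|y_i|^{q-1} / \|y\|_q^{q-1}$; the function names are illustrative, not from the text.

```python
import math


def p_norm(v, p):
    """The l_p norm of a finite sequence v."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def dot(x, y):
    """Standard Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))


def holder_extremizer(y, p):
    """Point x with ||x||_p = 1 and <x, y> = ||y||_q, where 1/p + 1/q = 1.

    Assumes 1 < p < infinity and y != 0.
    """
    q = p / (p - 1.0)
    scale = p_norm(y, q) ** (q - 1.0)
    return [math.copysign(abs(t) ** (q - 1.0), t) / scale for t in y]
```

For instance, with `y = [3.0, -4.0, 1.0]` and `p = 3.0` (so that $q = 3/2$), the vector `holder_extremizer(y, p)` lies on the boundary of $B_3^3$ and pairs with `y` to give `p_norm(y, 1.5)`, in agreement with (1.8).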
A larger important class of sets is that of convex bodies containing 0 in the
interior; it is stable under the operation of polarity. While most of the properties
of the operation $K \mapsto K^\circ$ listed below hold for more general sets, this last class
is sufficient for most applications (with the notable exception of cones, see Section
1.2).
Because of the inequality appearing in the definition (1.7), the concept of polarity
a priori makes sense only in the category of real Euclidean spaces. We exemplify
adjustments needed to make it work in the complex setting in Section 1.3.2, where
on how we identify the vector space $\mathbb{R}^n$ with its dual. One useful way to describe
(1.13) $(K \cap E)^\circ = P_E(K^\circ).$
Note that in the left-hand sides in (1.12) and (1.13), the polars are taken inside $E$,
equipped with the induced inner product.
Another pair of simple but useful relations involving polars is
(1.14) $(K \cup L)^\circ = K^\circ \cap L^\circ$
for any $K, L \subset \mathbb{R}^n$ and
(1.15) $(K \cap L)^\circ = \overline{\mathrm{conv}}(K^\circ \cup L^\circ)$
if $K, L$ are closed, convex and contain the origin.
Exercise 1.13. Find a gap in the following argument. Since $\|y\|_{A^\circ} \le 1$ iff
$y \in A^\circ$ iff $\sup_{x \in A} \langle x, y \rangle \le 1$, it follows by homogeneity that $\|y\|_{A^\circ} = \sup_{x \in A} \langle x, y \rangle$.
Exercise 1.14 (Stability properties of polarity). Show that $K \subset \mathbb{R}^n$ is bounded
iff $K^\circ$ contains 0 in the interior. Similarly, if $K$ is convex, then it contains 0 in its
applies reasonable conventions.) The bipolar theorem (1.11) is a special case of this
statement.
That extension must be of the form $(x, x') \mapsto \langle (x, x'), (y, y') \rangle$ for some $y' \in E^\perp$, and
the domination by $\|\cdot\|_K$ means that $(y, y') \in K^\circ$. In particular, $y \in P_E(K^\circ)$.
Find an error. Fix it and complete the proof of (1.13) (under the assumptions
stated there). Give an example of $K$ with 0 on the boundary such that (1.13) fails.
Exercise 1.18 (Polars of unions and intersections). Prove (1.14) and (1.15).
For the latter, show by examples that each of the hypotheses and the closure on
the right-hand side may be needed.
Exercise 1.19 (Polars of polytopes). Show that the polar of a polytope $K \subset
\mathbb{R}^n$ is a polytope if and only if $\dim K = n$ and 0 is an interior point of $K$.
Figure 1.2. A polytope $K$ and its polar $K^\circ$. The reader is encouraged
to visualize the bijection $\nu_K$ between vertices (resp. edges, facets)
of $K$ and facets (resp. edges, vertices) of $K^\circ$. The map $\nu_K$ is
vaguely related to the Gauss map from differential geometry.
If $K$ is a polytope, then the action of $\nu_K$ is very regular: every vertex is mapped
to a facet and vice versa, and, more generally, every $k$-dimensional face is mapped
to an $(n-k-1)$-dimensional face (see Figure 1.2).
The situation gets more complicated when dealing with general convex bodies:
if $F$ is a maximal face (necessarily exposed, see Exercise 1.5), then $\nu_K(F)$ is a
minimal exposed face (not necessarily a minimal face, and certainly not necessarily
an extreme point of $K^\circ$). However, it is still possible to retrieve all maximal faces
For $y \in \partial K^\circ$ we define
$y$, an extreme point of $K^\circ$, such that the face $F_y$ given by (1.17) is not maximal.
1.1.6. Ellipsoids. A convex body $K \subset \mathbb{R}^n$ is an ellipsoid if it is the image
of $B_2^n$ under an affine transformation. In particular, 0-symmetric ellipsoids are
exactly the unit balls of Euclidean norms on $\mathbb{R}^n$ (i.e., norms induced by an inner
product). Given a 0-symmetric ellipsoid $\mathcal{E} \subset \mathbb{R}^n$, we denote by $\langle \cdot, \cdot \rangle_{\mathcal{E}}$ the inner
product associated to $\mathcal{E}$. Note also that given a 0-symmetric ellipsoid $\mathcal{E}$, there is a
unique positive invertible matrix $T$ such that $\mathcal{E} = T(B_2^n)$.
As explained in Section 0.4, there is a canonical notion of tensor product within
the category of Euclidean spaces. Accordingly, given two 0-symmetric ellipsoids
$\mathcal{E} \subset \mathbb{R}^n$ and $\mathcal{E}' \subset \mathbb{R}^{n'}$, we denote by $\mathcal{E} \otimes_2 \mathcal{E}' \subset \mathbb{R}^n \otimes \mathbb{R}^{n'}$ the resulting ellipsoid,
which satisfies
$\langle x \otimes x', y \otimes y' \rangle_{\mathcal{E} \otimes_2 \mathcal{E}'} = \langle x, y \rangle_{\mathcal{E}} \, \langle x', y' \rangle_{\mathcal{E}'}$
for $x, y \in \mathbb{R}^n$ and $x', y' \in \mathbb{R}^{n'}$. An alternative presentation is to say that if $T$ (resp.,
$T'$) is a linear transformation on $\mathbb{R}^n$ (resp., on $\mathbb{R}^{n'}$) such that $\mathcal{E} = T(B_2^n)$ (resp.,
such that $\mathcal{E}' = T'(B_2^{n'})$), then
$\mathcal{E} \otimes_2 \mathcal{E}' = (T \otimes T')(B_2^{nn'}),$
where we identified $\mathbb{R}^n \otimes \mathbb{R}^{n'}$ with $\mathbb{R}^{nn'}$.
Exercise 1.25 (Spherical sections of ellipsoids). Show that any $(2n-1)$-
dimensional ellipsoid $\mathcal{E}$ admits an $n$-dimensional central section which is a
Euclidean ball.
given ellipsoid the volume of the polar is minimized iff the translate is 0-symmetric.
unit radius and center at $(a, 0)$, then $D_a^\circ$ is an ellipse with center at $\left(-\frac{a}{1-a^2}, 0\right)$
and principal semi-axes of length $\frac{1}{1-a^2}$ and $\frac{1}{\sqrt{1-a^2}}$. In particular the area of $D_a^\circ$ is
minimal iff $a = 0$.
(b) Infer similar statements for the $n$-dimensional Euclidean ball, and then deduce
the desired conclusion.
1.2. Cones
A nonempty closed convex subset $C$ of $\mathbb{R}^n$ (or of any real vector space) is called
a cone if whenever $x \in C$ and $t \ge 0$, then $tx \in C$. An equivalent definition: $C$ is a
closed set such that $x, x' \in C$ and $t, t' \ge 0$ imply $tx + t'x' \in C$. Examples of cones
include:
(1) the cone of elements of $\mathbb{R}^n$ with nonnegative coordinates (the positive orthant
$\mathbb{R}^n_+$),
(2) the Lorentz cone $L_n = \big\{ (x_0, x_1, \dots, x_{n-1}) : x_0 \ge 0, \ \sum_{k=1}^{n-1} x_k^2 \le x_0^2 \big\} \subset \mathbb{R}^n$ for
$n \ge 2$,
(3) the cone $\mathrm{PSD} = \mathrm{PSD}(\mathbb{C}^n) \subset M_n^{\mathrm{sa}}$ of complex positive semi-definite matrices.
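The three examples can be turned into simple membership tests. The sketch below is illustrative only: the function names and the tolerance parameter `tol` are assumptions, and the PSD test is restricted to $2 \times 2$ Hermitian matrices $\begin{pmatrix} a & b \\ \bar{b} & d \end{pmatrix}$, for which positive semi-definiteness is equivalent to nonnegative trace and determinant.

```python
def in_orthant(x, tol=1e-12):
    """Membership in the positive orthant: all coordinates nonnegative."""
    return all(t >= -tol for t in x)


def in_lorentz(x, tol=1e-12):
    """Membership in L_n for x = (x_0, ..., x_{n-1}):
    x_0 >= 0 and x_1^2 + ... + x_{n-1}^2 <= x_0^2."""
    return x[0] >= -tol and sum(t * t for t in x[1:]) <= x[0] ** 2 + tol


def in_psd_2x2(a, b, d, tol=1e-12):
    """Membership in PSD(C^2) for the Hermitian matrix [[a, b], [conj(b), d]]
    with real a, d: both eigenvalues are >= 0 iff trace >= 0 and det >= 0."""
    return a + d >= -tol and a * d - abs(b) ** 2 >= -tol


def combine(s, x, t, y):
    """The combination s*x + t*y appearing in the definition of a cone."""
    return [s * u + t * v for u, v in zip(x, y)]
```

Taking two points of $L_3$, say `[5, 3, 4]` and `[1, 0, 0]`, any combination `combine(s, x, t, y)` with `s, t >= 0` again passes `in_lorentz`, as the equivalent definition of a cone requires.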
As was the case with the polarity (see Section 1.1.4), the notion of the dual cone is
not canonical in the category of vector spaces since it appeals to the scalar product.
This can be again circumvented by considering $C^*$ as a subset of the vector space
that is dual to the one containing $C$. We will present some advantages of this point
of view in Appendix D, but will otherwise stick to the more familiar Euclidean
setting.
It is readily checked that the cones $\mathbb{R}^n_+$, $L_n$ and $\mathrm{PSD}$ defined in the preamble
to Section 1.2 have the remarkable property of being self-dual, i.e., verify $C^* = C$.
(For $C = \mathrm{PSD}$, extend the definition (1.18) mutatis mutandis to the setting of
arbitrary real inner product spaces and use trace duality (0.4).)
Not surprisingly, the notion of cone duality is strongly related to that of polarity.
First, a simple argument shows that if $C$ is a (closed convex) cone, then $C^* = -C^\circ$
and, therefore, by (1.11),
(1.19) $(C^*)^* = C.$
by (1.15). However, we also have another link to polarity of convex bodies, which is
less obvious. To point out that link, let us first define a base of a closed convex cone
$C \subset \mathbb{R}^n$ to be a closed convex set $K \subset C$ such that (1) the affine space generated by
$K$ does not contain the origin and (2) $K$ generates $C$, i.e., $C = \mathbb{R}_+ K$. An alternative
in which $e$ is the point closest to the origin. If $C \subset \mathbb{R}^n$ is a closed convex cone such
that $e \in C^* \setminus C^\perp$, the set $C^b$ defined as
(1.22) $C^b = C \cap H_e$
is then a base of $C$ (that is, $C$ is the smallest closed cone containing $C^b$, see Exercise
1.28). In particular, knowing $C^b$ allows one to reconstruct $C$.
As was to be expected, natural set-theoretic and algebraic operations on cones
induce analogous operations on bases of cones. Sometimes this is as trivial as
$(C_1 \cap C_2)^b = C_1^b \cap C_2^b$, or as simple as $(C_1 + C_2)^b = \mathrm{conv}(C_1^b \cup C_2^b)$. In fact, if we
want to stay in the class of closed cones, the more appropriate form of the latter
formula would be
(1.23) $\big(\overline{C_1 + C_2}\big)^b = \overline{\mathrm{conv}}(C_1^b \cup C_2^b)$
(see Exercise 1.30; however, such adjustments are not needed under some natural
nondegeneracy assumptions, which we will describe later in Section 1.2.2).
In other words, if we think of $H_e$ as a vector space with the origin at $e$, and of $C^b$
and $(C^*)^b$ as subsets of that vector space, then $(C^*)^b = -|e|^2 (C^b)^\circ$.
Figure 1.3. A cone and its dual cone. Up to a reflection, the
bases $C^b$ and $(C^*)^b$ are polar to each other with respect to $e$.
Proof. If $\langle x, e \rangle = \langle y, e \rangle = |e|^2$, then $\langle -(y - e), x - e \rangle = -\langle y, x \rangle + |e|^2$ and
so the condition from (1.24) can be restated as “$\forall x \in C^b$, $-\langle y, x \rangle + |e|^2 \le |e|^2$” or,
more simply, “$\forall x \in C^b$, $\langle y, x \rangle \ge 0$.” Since $C^b$ generates $C$ (see Exercise 1.28), the
latter condition is further equivalent to “$\langle y, x \rangle \ge 0$ for all $x \in C$,” i.e., to “$y \in C^*$,”
as required.
Here are two important classical examples where Lemma 1.6 applies.
given by the equation $x_0 + \dots + x_n = 1$. Then $\big(\mathbb{R}^{n+1}_+\big)^b = \Delta_n$, the set of classical
states. Since $\mathbb{R}^{n+1}_+$ is self-dual, it follows from Lemma 1.6 that
(2) The cone $\mathrm{PSD}(\mathbb{C}^n) \subset M_n^{\mathrm{sa}}$. Take $e = \mathrm{I}/n$ (the maximally mixed state), so
that $H_e$ is the hyperplane of trace one matrices. Then $\mathrm{PSD}^b = \mathrm{D}(\mathbb{C}^n)$, the set of
quantum states. Since $\mathrm{PSD}$ is self-dual, it follows from Lemma 1.6 that
(1.26) $\mathrm{D}(\mathbb{C}^n)^\circ = -n\,\mathrm{D}(\mathbb{C}^n).$
The bases of the Lorentz cones $L_n$ relative to the natural choice $e = e_0$ are
Euclidean balls, so applying Lemma 1.6 just tells us that the Lorentz cone is self-
dual (a property which is easy to verify directly). However, other choices of $e$
lead to nontrivial consequences, see Exercise D.3. Another simple but important
observation is that since $\mathrm{D}(\mathbb{C}^2)$ is a 3-dimensional Euclidean ball (the Bloch ball),
the cone $\mathrm{PSD}(\mathbb{C}^2)$ is isomorphic (or even isometric in the appropriate sense) to the
Lorentz cone $L_4$ (see Section 2.1.2).
Exercise 1.27. Let $K$ be a base of a closed convex cone $C$, and $H$ the affine
space generated by $K$. Show that $K = C \cap H$.
Exercise 1.28 (Bases generate cones). Show that if $e \in \mathbb{R}^n$ and a closed convex
cone $C \subset \mathbb{R}^n$ are such that $e \in C^* \setminus C^\perp$, and if $C^b$ is defined by (1.21) and (1.22),
then $\overline{\mathbb{R}_+ C^b} = C$. Give an example showing that the closure is needed.
Exercise 1.29 (Nontrivial cones admit bases). Let $C \subset \mathbb{R}^n$ be a closed convex
cone. Show that $C$ admits a base iff $C$ is not a linear subspace iff $C \neq -C$.
Exercise 1.30. Give an example of closed cones $C_1, C_2$ in $\mathbb{R}^3$ such that the
cone $C_1 + C_2$ is not closed.
Exercise 1.31 (Time dilation and the Lorentz cone). Consider the cone $C_y =
\{ x \in \mathbb{R}^n : |x| \le \langle x, y \rangle \}$ where $y \in \mathbb{R}^n$ satisfies $|y| > 1$. Show that $C_y^* = C_z$ for
$z = y / \sqrt{|y|^2 - 1}$.
1.2.2. Nondegenerate cones and facial structure. We will be mostly
dealing with (closed convex) cones $C \subset \mathbb{R}^n$ verifying (i) $C \cap (-C) = \{0\}$ and (ii)
$C - C = \mathbb{R}^n$; we will call such cones nondegenerate. The properties (i) and (ii) are
often referred to as $C$ being respectively pointed and full. They are dual to each
other, i.e., $C$ verifies (i) iff $C^*$ verifies (ii), and vice versa; the reader may explore
them further in Exercise 1.32. Here we note the following
Lemma 1.7. Let $C \subset \mathbb{R}^n$ be a closed convex cone. Then $y$ is an interior point
of $C^*$ iff $\langle y, x \rangle > 0$ for every $x \in C \setminus \{0\}$.
Since $\inf_{|u| < \varepsilon} \langle y + u, x \rangle = \langle y, x \rangle - \varepsilon |x|$, this is only possible if either $\langle y, x \rangle > 0$ or
$|x| = 0$. This proves the “only if” part (see also Exercise 1.34). For the “if” part, we
note that $B(y, \varepsilon) \subset C^*$ follows if (1.27) holds for $x$ in $C \cap S^{n-1} =: A$. This could be
ensured by choosing $\varepsilon = \inf_{x \in A} \langle y, x \rangle$, which is strictly positive since the continuous
function $\langle y, \cdot \rangle$ is pointwise positive on the compact set $A$.
Corollary 1.8. If $C$ is a closed convex cone which is pointed, then 0 is an
is compact. In fact, all the three properties stated in the Corollary are equivalent
(see Exercise 1.32).
We are now ready to state the main observation of this section. Once made, it
is fairly straightforward to show.
Proposition 1.9 (Faces of cones and faces of bases, see Exercise 1.35). Let
$C \subset \mathbb{R}^n$ be a closed convex cone with a compact base $C^b$. When we exclude the
exposed point 0 of $C$, there is a one-to-one correspondence between faces of $C^b$ and
those of $C$ given by $F \mapsto \mathbb{R}_+ F$. Moreover, this correspondence preserves the exposed
(or non-exposed) character of each face.
Exercise 1.32 (Full cones and pointed cones). Let $C \subset \mathbb{R}^n$ be a closed convex
cone, $C \neq \{0\}$. Show that the following conditions are equivalent:
(a) $C$ is pointed (i.e., $C \cap (-C) = \{0\}$),
(b) $C^*$ is full (i.e., $C^* - C^* = \mathbb{R}^n$),
(c) 0 is an exposed point of $C$,
(d) $C$ does not contain a line,
(e) $C$ admits a compact base,
(f) $\dim C^* = n$,
(g) $\mathrm{span}\, C^* = \mathbb{R}^n$.
Exercise 1.33 (Structure theorem for a general cone). If $C \subset \mathbb{R}^n$ is a closed
convex cone, then there exists a vector subspace $V \subset \mathbb{R}^n$ and a pointed cone
$C' \subset V^\perp$ such that $C = V + C'$ (a direct Minkowski sum).
Exercise 1.34. Deduce the “only if” part of Lemma 1.7 from Proposition 1.4.
Exercise 1.35. Prove Proposition 1.9 relating faces of cones to those of their
bases.
Exercise 1.36. Show that if the cones $C_1^*$, $C_2^*$ are pointed with the same
isolating hyperplane, then the closure on the right-hand side of (1.20) is not needed.
Definition 1.11. If $x, y \in \mathbb{R}^n$ with $\sum_{i=1}^n x_i = \sum_{i=1}^n y_i$, we say that $x$ is
majorized by $y$, and write $x \prec y$, if
(1.28) $\sum_{j=1}^k x_j^\downarrow \le \sum_{j=1}^k y_j^\downarrow$ for any $k \in \{1, 2, \dots, n\}$.
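Definition 1.11 translates directly into a short test. The sketch below assumes that $x^\downarrow$ denotes the non-increasing rearrangement of $x$ (the standard convention); the function name `is_majorized_by` and the tolerance `tol` are illustrative choices, not notation from the text.

```python
def is_majorized_by(x, y, tol=1e-12):
    """Return True if x is majorized by y in the sense of Definition 1.11.

    Both vectors must have the same length and the same coordinate sum;
    the partial sums of the non-increasing rearrangements are then
    compared as in (1.28).
    """
    if len(x) != len(y) or abs(sum(x) - sum(y)) > tol:
        return False
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    partial_x = partial_y = 0.0
    for a, b in zip(xs, ys):
        partial_x += a
        partial_y += b
        if partial_x > partial_y + tol:
            return False
    return True
```

As a usage example, the uniform vector `[1/3, 1/3, 1/3]` is majorized by `[1, 0, 0]` but not conversely, and `[0.5, 0.5, 0]` is not majorized by `[0.6, 0.3, 0.1]` because the second partial sum fails (1.28).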
(vi) For every $t \in \mathbb{R}$, we have $\sum_{i=1}^n (x_i - t)_+ \le \sum_{i=1}^n (y_i - t)_+$, where $x_+ =
\max(x, 0)$.
Sketch of the proof. Fix $y \in \mathbb{R}^n$, and consider the non-empty convex
compact set
$K_y = \{ x \in \mathbb{R}^n : x \prec y \}.$
It is easily checked that $x$ is an extreme point of $K_y$ if and only if $x^\downarrow = y^\downarrow$, and
it follows from the Krein–Milman theorem that (i) is equivalent to (ii). Similarly,
the classical Birkhoff theorem, which asserts that extreme points of the set of
bistochastic matrices are exactly permutation matrices, gives the equivalence of (ii)
and (iii). The implications (ii) $\Rightarrow$ (iv) $\Rightarrow$ (v) are obvious. We check that (v) and
(vi) are equivalent since $|x| = 2x_+ - x$ (using the fact that $\sum x_i = \sum y_i$). Finally,
for $t = y_k^\downarrow$, we compute
$\sum_{i=1}^n (y_i - t)_+ = \sum_{i=1}^k (y_i^\downarrow - t) = \sum_{i=1}^k y_i^\downarrow - kt$
Therefore, the inequality from (vi) implies that $\sum_{i=1}^k x_i^\downarrow \le \sum_{i=1}^k y_i^\downarrow$, hence $x \prec y$.
formally.
Exercise 1.38 (Submajorization). Given $x, y \in \mathbb{R}^n$, we say that $x$ is subma-
1.3.2. Schatten norms. Recall that the space $M_{m,n}$ of (real or complex) $m \times
n$ matrices carries a Euclidean structure given by the Hilbert–Schmidt inner product
(see Section 0.6). The Hilbert–Schmidt norm is a special case of the Schatten $p$-
norms, which are the non-commutative analogues of the $\ell_p$-norms. If $M \in M_{m,n}$,
define $|M| := (M^\dagger M)^{1/2}$, and for $1 \le p \le \infty$,
$\|M\|_p := \left( \mathrm{Tr}\, |M|^p \right)^{1/p}.$
Note that $\|\cdot\|_{\mathrm{HS}} = \|\cdot\|_2$. The case $p = \infty$ should be interpreted as the limit $p \to \infty$
of the above, and corresponds to the usual operator norm
$\|M\|_\infty = \|M\|_{\mathrm{op}} := \sup_{|x| \le 1} |Mx|.$
values of $M$ (i.e., the eigenvalues of $|M|$) arranged in the non-increasing order,
then for any $p$,
(1.29) $\|M\|_p = \|s(M)\|_p.$
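Formula (1.29) reduces the computation of a Schatten norm to an $\ell_p$-norm of the singular values. As a small illustration that avoids any linear algebra library, the sketch below treats a real $2 \times 2$ matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, whose singular values $s_1 \ge s_2$ are determined by $s_1^2 + s_2^2 = a^2 + b^2 + c^2 + d^2$ and $s_1 s_2 = |ad - bc|$. The function names are illustrative, and the restriction to the real $2 \times 2$ case is an assumption made for brevity.

```python
import math


def singular_values_2x2(a, b, c, d):
    """Singular values (s1 >= s2) of the real matrix [[a, b], [c, d]]."""
    h = a * a + b * b + c * c + d * d      # s1^2 + s2^2 (squared HS norm)
    det = abs(a * d - b * c)               # s1 * s2
    disc = math.sqrt(max(h * h - 4.0 * det * det, 0.0))
    s1 = math.sqrt((h + disc) / 2.0)
    s2 = math.sqrt(max((h - disc) / 2.0, 0.0))
    return s1, s2


def schatten_norm(singular_values, p):
    """The l_p norm of the singular values, as in (1.29); p may be math.inf."""
    if p == math.inf:
        return max(singular_values)
    return sum(s ** p for s in singular_values) ** (1.0 / p)
```

For the matrix with rows $(3, 0)$ and $(4, 5)$ the singular values are $3\sqrt{5}$ and $\sqrt{5}$, so the Schatten norms for $p = 1, 2, \infty$ are $4\sqrt{5}$, $\sqrt{50}$ and $3\sqrt{5}$ respectively; the value for $p = 2$ agrees with the Hilbert–Schmidt norm computed entrywise.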
The following lemma allows us to reduce the study of Schatten norms to the case
of self-adjoint matrices.
Lemma 1.13. Let $M \in M_{m,n}$, and $\tilde{M} \in M_{m+n}$ be the self-adjoint matrix defined
by
$\tilde{M} = \begin{pmatrix} 0 & M \\ M^\dagger & 0 \end{pmatrix}.$
Then we have $\|\tilde{M}\|_p = 2^{1/p} \|M\|_p$ for $1 \le p \le \infty$. Similarly, if $M, N \in M_{m,n}$, then
$\mathrm{Tr}\, \tilde{M} \tilde{N} = 2\, \mathrm{Re}\, \mathrm{Tr}\, M^\dagger N$.
Proof. For the first assertion, it suffices to notice that the eigenvalues of $\tilde{M}$
The next lemma shows how the concept of majorization relates to eigenval-
Proof. First, it is known from linear algebra that $\sum_i m_{ii} = \sum_i \lambda_i$, so
majorization is in principle possible. Write $M$ as $M = U \Lambda U^\dagger$, where $\Lambda$ is a diagonal
matrix whose entries are the eigenvalues of $M$, and $U$ is a unitary matrix. We then
have
$m_{ii} = \sum_j u_{ij} \lambda_j \overline{u_{ij}} = \sum_j |u_{ij}|^2 \lambda_j.$
Since the matrix with entries $|u_{ij}|^2$ is bistochastic, the assertion follows from
Proposition 1.12 (iii).
its off-diagonal elements to 0, the hypothesis on $f$ implies
$f(\lambda A + (1 - \lambda) B) \le \lambda f(\mathrm{diag}\, A) + (1 - \lambda) f(\mathrm{diag}\, B).$
Using Lemma 1.14 and Proposition 1.12(iv), it follows that $f(\mathrm{diag}\, A) \le f(A)$ and
$f(\mathrm{diag}\, B) \le f(B)$, showing (1.30).
An immediate consequence of the Davis convexity theorem is that the Schatten
$p$-norms satisfy the triangle inequality.
Proposition 1.16. For $1 \le p \le \infty$, if $M, N \in M_{m,n}$, we have
$\|M + N\|_p \le \|M\|_p + \|N\|_p.$
Proof. By the first assertion of Lemma 1.13, it is enough to consider the case
of $m = n$ and self-adjoint $M, N$. We now use Proposition 1.15 for the unitarily
invariant function $f(\cdot) = \|\cdot\|_p$. The restriction of $\|\cdot\|_p$ to the subspace of diagonal
matrices identifies with the usual (commutative) $\ell_p$-norm on $\mathbb{R}^n$, and hence, by
Proposition 1.15, the function $\|\cdot\|_p$ is convex on $M_m^{\mathrm{sa}}$. Since it is also positively
(1.32) $|\mathrm{Tr}\, MN| \le \|M\|_p \|N\|_q.$
As a consequence, the Schatten $p$-norm and $q$-norm are dual to each other. This
holds in all settings: for rectangular matrices (real or complex), for Hermitian
matrices, and for real symmetric matrices.
As in the case of $\ell_p^n$-spaces, the above duality relation can be equivalently
expressed in terms of polars. Denote by $S_p^{m,n}$ the unit ball associated to the Schatten
norm $\|\cdot\|_p$ on $M_{m,n}$ and $S_p^{m,\mathrm{sa}} := S_p^{m,m} \cap M_m^{\mathrm{sa}}$. (Again, there are two settings, real
and complex, and some care needs to be exercised as minor subtleties occasionally
arise.) We then have
Corollary 1.18. If $1 \le p, q \le \infty$ with $1/p + 1/q = 1$, then
(1.33) $S_q^{m,n} = \{ A \in M_{m,n} : |\langle X, A \rangle| \le 1 \text{ for all } X \in S_p^{m,n} \}$
(1.34) $\phantom{S_q^{m,n}} = \{ A \in M_{m,n} : \mathrm{Re}\, \langle X, A \rangle \le 1 \text{ for all } X \in S_p^{m,n} \}$
(1.35) $S_q^{m,\mathrm{sa}} = \big( S_p^{m,\mathrm{sa}} \big)^\circ,$
where $\langle \cdot, \cdot \rangle$ and $^\circ$ are meant in the sense of trace duality (0.4).
While (1.33) and (1.35) are simply straightforward reformulations of duality
relations from Proposition 1.17, the equality in (1.34) needs to be justified (only the
inclusion “$\subset$” is immediate). Given $A \in M_{m,n}$ and $X \in S_p^{m,n}$ such that $|\langle X, A \rangle| > 1$,
let $\xi = \frac{\langle X, A \rangle}{|\langle X, A \rangle|}$. Then, setting $X' = \bar{\xi} X$, we see that $X' \in S_p^{m,n}$, while $\mathrm{Re}\, \langle X', A \rangle =
|\langle X, A \rangle| > 1$, which yields the other inclusion “$\supset$” in (1.34). The expression in
(1.34) can be thought of as a definition of the polar $\big( S_p^{m,n} \big)^\circ$ by “dropping the
complex structure”; see Exercise 1.48 for the general principle. Another potential
complication is that, in the complex setting, the identification with the dual space
is anti-linear, see Section 0.2. Note that no issues of such nature arise in defining
the polar of $S_p^{m,\mathrm{sa}}$, as that set “lives” in a real inner product space irrespectively of
the setting.
Proof of Proposition 1.17. Consider first the Hermitian case. By unitary
invariance, we may assume that $M$ is diagonal. We then have
$|\mathrm{Tr}(MN)| = \Big| \sum_i m_{ii} n_{ii} \Big| \le \|(m_{ii})\|_p \|(n_{ii})\|_q \le \|M\|_p \|N\|_q,$
where we used the commutative Hölder inequality, Lemma 1.14, and Proposition
1.12 (iv).
In the general case, Lemma 1.13 and the Hermitian case of (1.32) shown above
imply that, for all $M, N \in M_{n,m}$,
$\mathrm{Re}\, \mathrm{Tr}\, M^\dagger N \le \|M\|_p \|N\|_q,$
and the same bound for $|\mathrm{Tr}(MN)|$ (or $|\mathrm{Tr}(M^\dagger N)|$) follows by the same trick as
the one used to establish equality in (1.34) (see the paragraph following Corollary
1.18).
easy part” involves establishing that for every $M$, there is $N \neq 0$ such that we
have equality in (1.32). In the Hermitian case, this follows readily by restricting
version of Proposition 1.15, i.e., for functions defined on the set of real symmetric
matrices.
Exercise 1.45 (Spectral theorem and SVD vs. Carathéodory’s theorem). Let
$K$ be one of $S_\infty^n$, $S_1^n$, $S_\infty^{n,\mathrm{sa}}$, $S_1^{n,\mathrm{sa}}$. Show that every element of $K$ can be written as a
convex combination of $n + 1$ extreme points of $K$. Compare this fact with what one
obtains by a direct application of the Carathéodory’s Theorem 1.2 in the respective
matrix space.
Exercise 1.46 (The real Schatten balls). In the real case, the space $M_2^{\mathrm{sa}}$ is
3-dimensional. Which familiar solids are $S_1^{2,\mathrm{sa}}$ and $S_\infty^{2,\mathrm{sa}}$?
Exercise 1.47 (Characterization of unitarily invariant norms). Let $m \le n$,
and $\|\cdot\|$ be a norm on $\mathbb{R}^m$ such that
$\|(\varepsilon_1 x_{\sigma(1)}, \dots, \varepsilon_m x_{\sigma(m)})\| = \|(x_1, \dots, x_m)\|$
for any $x \in \mathbb{R}^m$, $\varepsilon \in \{-1, 1\}^m$ and $\sigma \in \mathfrak{S}_m$. (We call such norms permutationally
symmetric.) Show that $M \mapsto \|s(M)\|$ is a norm on $M_{m,n}$ and that every norm
which is bi-unitarily invariant (i.e., verifying $\|UMV\| = \|M\|$ for $U \in \mathsf{U}(m)$ and
space and $K$ a closed convex subset, the polar of $K$ can be defined via $K^\circ :=
\{ y \in \mathcal{H} : \mathrm{Re}\, \langle x, y \rangle \le 1 \text{ for all } x \in K \}$, i.e., by dropping the complex structure, as
1.3.3. Von Neumann and Rényi entropies. Let $\mathrm{D}(\mathbb{C}^d)$ be the set of
quantum states on $\mathbb{C}^d$ (see Section 0.10) and $\sigma \in \mathrm{D}(\mathbb{C}^d)$. The von Neumann entropy of
$\sigma$ is defined as
Proposition 1.19. The von Neumann entropy $S$ satisfies the following
properties:
(i) it is a concave function from $\mathrm{D}(\mathbb{C}^d)$ onto $[0, \log d]$,
(ii) for $\sigma \in \mathrm{D}(\mathbb{C}^d)$, we have $S(\sigma) = 0$ if and only if $\sigma$ is pure (i.e., has rank 1),
(iii) for $\sigma \in \mathrm{D}(\mathbb{C}^d)$, we have $S(\sigma) = \log d$ if and only if $\sigma = \mathrm{I}/d$,
(iv) if $\sigma \in \mathrm{D}(\mathbb{C}^d)$ and $U \in \mathsf{U}(d)$, then $S(\sigma) = S(U \sigma U^\dagger)$,
(v) if $\sigma \in \mathrm{D}(\mathbb{C}^d)$ and $\tau \in \mathrm{D}(\mathbb{C}^n)$, then $S(\sigma \otimes \tau) = S(\sigma) + S(\tau)$.
Proof. All these properties are straightforward to show, except perhaps the
concavity which follows from the concavity of $x \mapsto -x \log x$, together with Klein’s
lemma (Exercise 1.40).
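Several items of Proposition 1.19 can be checked numerically on spectra. The sketch below assumes the standard definition $S(\sigma) = -\mathrm{Tr}\, \sigma \log \sigma$ (natural logarithm, with the convention $0 \log 0 = 0$), so that the entropy depends only on the eigenvalues, which is the content of (iv). It also uses the fact that the spectrum of $\sigma \otimes \tau$ consists of all products of eigenvalues. The function names are illustrative.

```python
import math


def entropy(spectrum):
    """-sum(l * log l) over the eigenvalues, with the convention 0 log 0 = 0."""
    return -sum(t * math.log(t) for t in spectrum if t > 0)


def tensor_spectrum(s, t):
    """Eigenvalues of a tensor product: all pairwise products."""
    return [a * b for a in s for b in t]


def mixture(weight, s, t):
    """Spectrum of weight*sigma + (1-weight)*tau for two states that are
    diagonal in the same basis, with diagonals s and t."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(s, t)]
```

With `s = [0.7, 0.3]` and `t = [0.2, 0.8]`, one can observe additivity (v) via `entropy(tensor_spectrum(s, t))`, the extreme values in (ii) and (iii) via `entropy([1, 0, 0])` and `entropy([0.25] * 4)`, and concavity (i) in the commuting case via `mixture`. Concavity for non-commuting states is not captured by this diagonal sketch.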
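The following lemma quantifies the fact that very mixed states have large
entropy.
Lemma 1.20. Let $\rho \in \mathrm{D}(\mathbb{C}^d)$ be a state with spectrum in the interval $\left[ \frac{1-\varepsilon}{d}, \frac{1+\varepsilon}{d} \right]$
for some $\varepsilon \in [0, 1]$. Then $S(\rho) \ge \log d - h(\varepsilon)$, where
$h(\varepsilon) = \frac{1+\varepsilon}{2} \log(1 + \varepsilon) + \frac{1-\varepsilon}{2} \log(1 - \varepsilon).$
Note that $h(\varepsilon) \sim \varepsilon^2/2$ as $\varepsilon$ goes to 0.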
Proof. Assume that $d$ is even and consider a state $\sigma \in \mathrm{D}(\mathbb{C}^d)$ with $d/2$
eigenvalues equal to $(1 + \varepsilon)/d$ and $d/2$ eigenvalues equal to $(1 - \varepsilon)/d$. One checks directly
from the definition of majorization that $\mathrm{spec}(\rho) \prec \mathrm{spec}(\sigma)$. It follows then from
Proposition 1.12 (iv) that
$S(\rho) \ge S(\sigma) = \log(d) - h(\varepsilon).$
If $d$ is odd, a similar argument applies where $\sigma$ has $(d-1)/2$ eigenvalues equal to
$(1 \pm \varepsilon)/d$ and one eigenvalue equal to $1/d$. One checks by direct computation that
$S(\sigma) > \log(d) - h(\varepsilon)$.
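The computations in this proof are easy to reproduce numerically. The sketch below assumes, as above, that the entropy is $-\sum_i \lambda_i \log \lambda_i$ over the spectrum; the helper names `h` and `extremal_spectrum` are illustrative. It builds the spectrum of the extremal state $\sigma$ from the proof for even or odd $d$.

```python
import math


def entropy(spectrum):
    """-sum(l * log l) over the eigenvalues, with the convention 0 log 0 = 0."""
    return -sum(t * math.log(t) for t in spectrum if t > 0)


def h(eps):
    """The function h from Lemma 1.20, with the convention 0 log 0 = 0."""
    def u_log_u(u):
        return u * math.log(u) if u > 0 else 0.0
    return 0.5 * (u_log_u(1.0 + eps) + u_log_u(1.0 - eps))


def extremal_spectrum(d, eps):
    """Spectrum of the state sigma used in the proof of Lemma 1.20."""
    half = d // 2
    spec = [(1.0 + eps) / d] * half + [(1.0 - eps) / d] * half
    if d % 2 == 1:
        spec.append(1.0 / d)  # the extra eigenvalue in the odd case
    return spec
```

For even $d$ the value `entropy(extremal_spectrum(d, eps))` coincides with `math.log(d) - h(eps)` up to rounding, while for odd $d$ and $\varepsilon > 0$ it is strictly larger. The ratio `h(eps) / (eps ** 2 / 2)` approaches 1 for small `eps`, in line with the asymptotics noted after the lemma.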
Remark 1.21. Note that while the entropy of (normalized) quantum states
(i.e., $\rho \in \mathrm{D}$) is of primary physical interest, the definition makes sense for, and most
properties generalize to $\rho \in \mathrm{PSD}$.
Let $\sigma$ be a state on $\mathbb{C}^d$, and $p \in (0, \infty)$. The $p$-Rényi entropy of $\sigma$ is
the von Neumann entropy, so that $S_1 = S$. Other limit cases are $p \to 0$, which gives
$S_0(\sigma) = \log \mathrm{rank}\, \sigma$, and $p \to \infty$, which gives $S_\infty(\sigma) = -\log \|\sigma\|_\infty$. When $p > 1$,
the Rényi entropy is connected to the Schatten $p$-norm by the formula $S_p(\sigma) =
\frac{p}{1-p} \log \|\sigma\|_p$. Just like the von Neumann entropy is a generalization of Shannon
satisfies properties (i)–(v) from Proposition 1.19. Note that (iii) fails for $p = 0$.
Exercise 1.50 (Entropy of the state vs. entropy of the diagonal). Show that,
for any $\rho \in \mathrm{D}$, $S(\mathrm{diag}\, \rho) \ge S(\rho)$, with equality only if $\rho$ is diagonal.
Exercise 1.51 (Monotonicity of Rényi entropies). Show that $S_p(\sigma)$ and $H_p(q)$
are non-increasing in $p$ for fixed $\sigma$, $q$.
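The monotonicity asserted in Exercise 1.51 can be observed numerically before proving it. The sketch below assumes the standard definition $S_p(\sigma) = \frac{1}{1-p} \log \mathrm{Tr}\, \sigma^p$ for $p \notin \{0, 1, \infty\}$, which agrees with the formula $\frac{p}{1-p} \log \|\sigma\|_p$ quoted above, and implements the limit cases $p = 0, 1, \infty$ as stated in the text. The function name is illustrative, and the state is represented by its spectrum.

```python
import math


def renyi_entropy(spectrum, p):
    """p-Renyi entropy of a state given by its spectrum (natural logarithm)."""
    lam = [t for t in spectrum if t > 0]
    if p == 0:
        return math.log(len(lam))                 # log rank
    if p == 1:
        return -sum(t * math.log(t) for t in lam)  # von Neumann entropy
    if p == math.inf:
        return -math.log(max(lam))                # -log of the operator norm
    return math.log(sum(t ** p for t in lam)) / (1.0 - p)
```

Evaluating `renyi_entropy([0.5, 0.3, 0.2], p)` along `p = 0, 0.5, 1, 2, 5, math.inf` gives a non-increasing sequence running from $\log 3$ down to $\log 2$. For the uniform spectrum all values coincide with $\log d$.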
ity (1.6) is the so-called “2-uniform convexity” of the $p$-norm for $p \in (1, 2]$. For
$p \ge 2$, the inequality is reversed (2-uniform smoothness); for $p = 1$, it degenerates
into the triangle inequality. One establishes similarly $p$-uniform convexity for
$p \in [2, \infty)$ and $p$-uniform smoothness for $p \in (1, 2]$.
It is natural to ask whether these inequalities remain valid for the Schatten
$p$-norm, i.e., when $x, y$ are matrices. This is known to be true for inequality (1.6)
when $1 \le p \le 2$ (and for its reversed form when $p \ge 2$). However, the stronger
Hanner inequality (1.5) for matrices has been proved only in the range $1 \le p \le 4/3$
(or, for the reversed inequality, in the range $p \ge 4$). For proofs and references, see
[BCL94, CL06].
Section 1.2. Lemma 1.6 seems to be a folklore result, but does not appear in
standard references for convexity (the best source we were pointed to after consulting
specialists was Exercise 6, §3.4 of [Grü03]). However, once stated, the Lemma
is straightforward to prove.
We refer the interested reader to the books [BV04] and [BTN01a], the survey
[Nem07], and, for sample links, to [Rei08, KL09, BH13, HNW15].
Davis convexity theorem appears in [Dav57]. Early references for Schatten norms
and its variants (quantum relative entropy, quantum mutual information) have
several operational interpretations, i.e., quantify the rate at which basic information
This chapter puts into mathematical perspective some basic concepts of
quantum information theory. (For a physically motivated approach, see Chapter 3.) We
discuss the geometry of the set of quantum states, the entanglement vs. separability
dichotomy, and introduce completely positive maps and quantum channels. All
these concepts will be extensively used in Chapters 8–12.
2.1. On the geometry of the set of quantum states
2.1.1. Pure and mixed states. In this section we take a closer look at the
set $\mathrm{D}(\mathcal{H})$ (or simply $\mathrm{D}$) of quantum states on a finite-dimensional complex Hilbert
space $\mathcal{H}$. By definition (see Section 0.10), we have
(2.1) $\mathrm{D}(\mathcal{H}) = \{ \rho \in B^{\mathrm{sa}}(\mathcal{H}) : \rho \ge 0, \ \mathrm{Tr}\, \rho = 1 \}.$
If $\mathcal{H} = \mathbb{C}^d$, the definition (2.1) simply says that $\mathrm{D}(\mathbb{C}^d)$ is the base of the positive
full cone.)
A state $\rho \in \mathrm{D}(\mathcal{H})$ is called pure if it has rank 1, i.e., if there is a unit vector
$\psi \in \mathcal{H}$ such that
$\rho = |\psi\rangle\langle\psi|.$
Note that $|\psi\rangle\langle\psi|$ is the orthogonal projection onto the (complex) line spanned by
sider the corresponding pure state $|\psi\rangle\langle\psi|$. We use the terminology of mixed states
when we want to emphasize that we consider the set of all states, not necessarily
pure.
Let $\psi, \chi$ be unit vectors in $\mathcal{H}$. Then the pure states $|\psi\rangle\langle\psi|$ and $|\chi\rangle\langle\chi|$ coincide
if and only if there is a complex number $\lambda$ with $|\lambda| = 1$ such that $\chi = \lambda \psi$. Therefore
the set of pure states identifies with $\mathsf{P}(\mathcal{H})$, the projective space on $\mathcal{H}$. (See Appendix
B.2; note that the space $\mathsf{P}(\mathbb{C}^d)$ is more commonly denoted by $\mathbb{CP}^{d-1}$.)
The set $\mathrm{D}(\mathcal{H})$ is a compact convex set, and it is easily checked that the extreme
points of $\mathrm{D}(\mathcal{H})$ are exactly the pure states (cf. Proposition 1.9 and Corollary 1.10).
It follows from general convexity theory (Krein–Milman and Carathéodory’s
theorems) that any state is a convex combination of at most $(\dim \mathcal{H})^2$ pure states.
However, using the spectral theorem instead tells us more: any state is a convex
combination of at most $\dim \mathcal{H}$ pure states $|\psi_i\rangle\langle\psi_i|$, where $(\psi_i)$ are pairwise
orthogonal unit vectors (cf. Exercise 1.45). A fundamental consequence is that whenever
we want to maximize a convex function (or minimize a concave function) over the
32 2. THE MATHEMATICS OF QUANTUM INFORMATION THEORY
set $\mathrm{D}(\mathcal{H})$, the extremum is achieved on a pure state, which significantly reduces the
dimension of the problem.
As opposed to pure states, which are extremal, the “most central” element in
$\mathrm{D}(\mathcal{H})$ is the state $\mathrm{I}/\dim \mathcal{H}$, which is called the maximally mixed state, and denoted
by $\rho_*$ when there is no ambiguity. We also note that the set of states on $\mathcal{H}$ which
are diagonal with respect to a given orthonormal basis $(e_i)_{i \in I}$ naturally identifies
with the set of classical states on $I$.
Exercise 2.1. Describe states which belong to the boundary of $\mathrm{D}(\mathcal{H})$.
Exercise 2.2 (Every state is an average of pure states). Show that every state
$\rho \in \mathrm{D}(\mathbb{C}^d)$ can be written as $\frac{1}{d} \big( |\psi_1\rangle\langle\psi_1| + \dots + |\psi_d\rangle\langle\psi_d| \big)$ for some unit vectors
$\psi_1, \dots, \psi_d$ in $\mathbb{C}^d$.
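2.1.2. The Bloch ball $\mathrm{D}(\mathbb{C}^2)$. The situation for $d = 2$ is very special. Let
$\rho \in M_2^{\mathrm{sa}}$, with $\mathrm{Tr}\, \rho = 1$. Then $\rho$ has two eigenvalues, which can be written as $1/2 - \lambda$
and $1/2 + \lambda$ for some $\lambda \in \mathbb{R}$. Moreover, $\rho \ge 0$ if and only if $|\lambda| \le 1/2$. On the other
hand, we have
$\|\rho - \rho_*\|_{\mathrm{HS}} = \sqrt{2}\, |\lambda|.$
Therefore, $\rho$ is a state if and only if $\|\rho - \rho_*\|_{\mathrm{HS}} \le 1/\sqrt{2}$. What we have proved
is that, inside the space of trace one self-adjoint operators, the set of states is a
Euclidean ball centered at $\rho_*$ and with radius $1/\sqrt{2}$. This ball is called the Bloch
ball and its boundary is called the Bloch sphere. Once we introduce the Pauli
matrices
(2.2) $\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$
(2.3) $\left( \frac{1}{\sqrt{2}} \mathrm{I}, \ \frac{1}{\sqrt{2}} \sigma_x, \ \frac{1}{\sqrt{2}} \sigma_y, \ \frac{1}{\sqrt{2}} \sigma_z \right).$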
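The family (2.3) can be checked to be orthonormal for the Hilbert–Schmidt inner product with a few lines of code. The sketch below is a minimal illustration, assuming the convention $\langle A, B \rangle = \mathrm{Tr}\, A^\dagger B$ and representing $2 \times 2$ matrices as nested Python lists; all helper names are illustrative.

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def hs_inner(a, b):
    """Hilbert-Schmidt inner product Tr(a^dagger b)."""
    prod = matmul(dagger(a), b)
    return prod[0][0] + prod[1][1]


def scaled(c, a):
    """Scalar multiple c * a."""
    return [[c * entry for entry in row] for row in a]


# The family (2.3): identity and Pauli matrices, each divided by sqrt(2).
BASIS = [scaled(1.0 / math.sqrt(2.0), m) for m in (I2, SX, SY, SZ)]


def gram():
    """Gram matrix of BASIS for the Hilbert-Schmidt inner product."""
    return [[hs_inner(a, b) for b in BASIS] for a in BASIS]
```

The Gram matrix returned by `gram()` is the $4 \times 4$ identity up to rounding. The same helpers also confirm identities such as $\sigma_x^2 = \mathrm{I}$ and $\sigma_x \sigma_y = i \sigma_z$ by comparing `matmul(SX, SY)` with `scaled(1j, SZ)`.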
A very useful consequence of $\mathrm{D}(\mathbb{C}^2)$ being a ball is the fact, mentioned already
in Section 1.2.1, that the cone $\mathrm{PSD}(\mathbb{C}^2)$ is isomorphic (or even isometric in the ap-
(2.4) $\mathbb{R}^4 \ni \mathbf{x} = (t, x, y, z) \mapsto \begin{pmatrix} t + z & x - iy \\ x + iy & t - z \end{pmatrix} = X \in M_2^{\mathrm{sa}}.$
The formula for $X$ can be rewritten in terms of the Pauli matrices (2.2) as
automorphisms of the cones $L_4$ and $\mathrm{PSD}(\mathbb{C}^2)$, and when proving Størmer’s theorem
in Section 2.4.5.
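The correspondence between $L_4$ and $\mathrm{PSD}(\mathbb{C}^2)$ under the map (2.4) can be seen concretely: the matrix $X$ has trace $2t$ and determinant $t^2 - x^2 - y^2 - z^2$, so $X \ge 0$ exactly when $t \ge 0$ and $x^2 + y^2 + z^2 \le t^2$. The sketch below is illustrative (the function names are not from the text) and tests positivity of a $2 \times 2$ Hermitian matrix through its trace and determinant.

```python
def to_matrix(t, x, y, z):
    """The map (2.4): (t, x, y, z) -> [[t + z, x - iy], [x + iy, t - z]]."""
    return [[t + z, complex(x, -y)], [complex(x, y), t - z]]


def in_lorentz_4(t, x, y, z):
    """Membership in the Lorentz cone L_4, with t as the 0-th coordinate."""
    return t >= 0 and x * x + y * y + z * z <= t * t


def is_psd_2x2(m):
    """A 2x2 Hermitian matrix is PSD iff its trace and determinant are >= 0."""
    trace = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    return trace >= 0 and det >= 0
```

For example, `(2, 1, 1, 1)` lies in $L_4$ and its image has determinant 1, while `(1, 1, 1, 0)` lies outside $L_4$ and its image has determinant $-1$. Boundary points such as `(1, 1, 0, 0)` are mapped to rank one matrices.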
When $d > 2$, the set $\mathrm{D}(\mathbb{C}^d)$ is no longer a ball, but rather the non-commutative
analogue of a simplex. Its symmetrization (see Section 4.1.2)
$\mathrm{conv}\big( \mathrm{D}(\mathbb{C}^d) \cup -\mathrm{D}(\mathbb{C}^d) \big) = \{ A \in M_d^{\mathrm{sa}} : \|A\|_1 \le 1 \},$
is $S_1^{d,\mathrm{sa}}$, the unit ball of the self-adjoint part of the 1-Schatten space (see Section
1.3.2).
One way to quantify the fact that the set $\mathrm{D}(\mathbb{C}^d)$ is different from a ball when
$d > 2$, is to compute the radii of its inscribed and circumscribed Hilbert–Schmidt
balls. The former equals $1/\sqrt{d(d-1)}$ while the latter is $\sqrt{(d-1)/d}$ (the same
values as for the set $\Delta_{d-1}$ of classical states on $\{1, \dots, d\}$, and for the same reasons).
In other words, if we denote by $B(\rho_*, r)$ the ball centered at $\rho_*$ and with Hilbert–
Schmidt radius $r$ inside the hyperplane $H_1 = \{ \mathrm{Tr}(\cdot) = 1 \} \subset M_d^{\mathrm{sa}}$, we have
(2.7) $B\left( \rho_*, \frac{1}{\sqrt{d(d-1)}} \right) \subset \mathrm{D}(\mathbb{C}^d) \subset B\left( \rho_*, \sqrt{\frac{d-1}{d}} \right)$
and these values, differing by the factor of $d - 1$, are the best possible.
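The two radii in (2.7) can be recovered from diagonal states alone, since the Hilbert–Schmidt distance between diagonal matrices is the Euclidean distance between their diagonals. The sketch below (with illustrative names) computes the distance from $\rho_*$ to a pure state and to the center of the opposite face, i.e., the maximally mixed state on the orthogonal complement. It only evaluates these two distances and does not by itself prove that they are the optimal radii.

```python
import math


def hs_distance_diag(p, q):
    """Hilbert-Schmidt distance between two diagonal matrices,
    given by their diagonals p and q."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def radii(d):
    """Return (inner, outer): distances from the maximally mixed state to
    the center of a maximal proper face and to a pure state, for d >= 2."""
    center = [1.0 / d] * d
    pure = [1.0] + [0.0] * (d - 1)
    face_center = [0.0] + [1.0 / (d - 1)] * (d - 1)
    return hs_distance_diag(center, face_center), hs_distance_diag(center, pure)
```

For $d = 2$ both distances equal $1/\sqrt{2}$, consistent with the Bloch ball; for $d = 3$ they are $1/\sqrt{6}$ and $\sqrt{2/3}$, whose ratio is $d - 1 = 2$.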
Exercise 2.3 (The Bloch sphere is a sphere). Show that the matrix $X$ given
by (2.5) has eigenvalues 1 and $-1$ if and only if $t = 0$ and $x^2 + y^2 + z^2 = 1$.
Exercise 2.4 (Composition rules for Pauli matrices). Verify the composition
rules for Pauli matrices. (i) $\sigma_a^2 = \mathrm{I}$ (ii) If $a, b, c$ are all different, then $\sigma_a \sigma_b = i \varepsilon \sigma_c$,
range is contained in $E$:
respond to the case $\dim E = 1$. In the direction opposed to a pure state $|x\rangle\langle x|$
lies a face which corresponds to all states with a range orthogonal to $x$; these are
Remark 2.2. All faces of $\mathrm{D}(\mathbb{C}^d)$ are exposed (as defined in Exercise 1.5) since
subspace and that $F$ contains an element $\rho$ such that $\mathrm{range}(\rho) = E$. We now claim
that $F = \mathrm{D}(E)$. The direct inclusion is obvious. Conversely, consider $\sigma \in \mathrm{D}(E)$. For
$\lambda > 0$ small enough the operator $\tau = \frac{1}{1-\lambda} (\rho - \lambda \sigma)$ is a state. Since $\rho = \lambda \sigma + (1 - \lambda) \tau$,
we conclude that the segment joining $\sigma$ and $\tau$ is contained in $F$; in particular
$\sigma \in F$.
Exercise 2.5. Show directly (i.e., without appealing to Proposition 2.1) that
any exposed face of $\mathrm{D}(\mathbb{C}^d)$ has the form $\mathrm{D}(E)$ for some subspace $E \subset \mathbb{C}^d$.
2.1.4. Symmetries. We now describe the symmetries of $\mathrm{D}(\mathbb{C}^d)$. This is
closely related to the famous theorem of Wigner that characterizes the isometries
of complex projective space as a metric space. Recall (see Appendix B.2) that $[\psi]$
denotes the equivalence class in $\mathsf{P}(\mathbb{C}^d)$ of a unit vector $\psi \in S_{\mathbb{C}^d}$.
Theorem 2.3 (Wigner’s theorem). Denote by $\mathsf{P}(\mathbb{C}^d)$ the projective space over
$\mathbb{C}^d$, equipped with the Fubini–Study metric (B.5). A map $f : \mathsf{P}(\mathbb{C}^d) \to \mathsf{P}(\mathbb{C}^d)$
is an isometry if and only if there is a map $U$ on $\mathbb{C}^d$ which is either unitary or
anti-unitary such that, for any unit vector $\psi$,
(2.9) $f([\psi]) = [U(\psi)].$
A map $U : \mathbb{C}^d \to \mathbb{C}^d$ is anti-unitary if it is the composition of a unitary map
with complex conjugation.
Proof. We outline the proof of Wigner’s theorem for $d = 2$. Since the projective space over $\mathbb{C}^2$ identifies with the Bloch sphere, its group of isometries is given by the orthogonal group $\mathsf{O}(3)$, and splits into direct isometries (rotations, or $\mathsf{SO}(3)$) and indirect isometries.
Let $f$ be a direct isometry of the Bloch ball. It has two opposite fixed points $[\varphi_1]$ and $[\varphi_2]$, with $\varphi_1 \perp \varphi_2$, and is a rotation of angle $\theta$ in the plane $\{[\tfrac{1}{\sqrt{2}}(\varphi_1 + e^{i\alpha}\varphi_2)] : \alpha \in \mathbb{R}\}$. One checks that (2.9) is satisfied when $U$ is given by $U(\varphi_1) = \varphi_1$ and $U(\varphi_2) = e^{i\theta}\varphi_2$. Note that $U$ is determined up to a global phase. In particular, if we insist on having $U \in \mathsf{SU}(2)$, we are led to the choice $U(\varphi_1) = e^{-i\theta/2}\varphi_1$ and $U(\varphi_2) = e^{i\theta/2}\varphi_2$ involving the half-angle. (We point out the isomorphism
induces on the Bloch ball the reflection $R$ in the plane $\{[\cos\theta\,\psi_1 + \sin\theta\,\psi_2] : \theta \in \mathbb{R}\}$. Since any indirect isometry of the Bloch ball is the composition of $R$ with a direct
When $P(\mathbb{C}^d)$ is identified with the set of pure states on $\mathbb{C}^d$, the isometries from Theorem 2.3 act as $\rho \mapsto U\rho U^\dagger$ or $\rho \mapsto U\rho^T U^\dagger$ for $U \in \mathsf{U}(d)$. Here $\rho^T$ denotes the transposition of a state $\rho$ with respect to a distinguished basis (since $\rho = \rho^\dagger$, $\rho^T$ is also the complex conjugate of $\rho$ with respect to that basis).
Theorem 2.4 (Kadison’s theorem). Affine maps preserving globally $D(\mathbb{C}^d)$ are of the form $\rho \mapsto U\rho U^\dagger$ or $\rho \mapsto U\rho^T U^\dagger$ for $U \in \mathsf{U}(d)$. In particular, they are isometries with respect to the Hilbert–Schmidt distance.
Proof. Let $\Phi$ be an affine map on $M_d^{\mathrm{sa}}$ such that $\Phi(D(\mathbb{C}^d)) = D(\mathbb{C}^d)$. Then $\Phi$ preserves the set of faces of $D(\mathbb{C}^d)$, which are described in Proposition 2.1. In
2.2. STATES ON MULTIPARTITE HILBERT SPACES 35
particular, $\Phi$ preserves the set of minimal faces, which identify with pure states. Therefore $\Phi$ induces a bijection on $P(\mathbb{C}^d)$. We claim that $\Phi$ is an isometry with respect to the Fubini–Study distance (B.5), which is equivalent to
\[ \mathrm{Tr}\bigl(\Phi(|\psi\rangle\langle\psi|) \cdot \Phi(|\varphi\rangle\langle\varphi|)\bigr) = |\langle\psi, \varphi\rangle|^2 \]
for $\psi, \varphi \in \mathbb{C}^d$. If $[\psi] = [\varphi]$, this is clear. Otherwise, let $M \subset \mathbb{C}^d$ be the 2-dimensional subspace generated by $\psi$ and $\varphi$. By Proposition 2.1, the set $D(M)$ canonically identifies with a (3-dimensional) face of $D(\mathbb{C}^d)$. Consequently, $\Phi(D(M))$ is also a face, which identifies with $D(M')$ for some 2-dimensional subspace $M' \subset \mathbb{C}^d$. Since $D(M)$ and $D(M')$ are Bloch balls, the map $\Phi$ restricted to $D(M)$ must be an isometry (affine maps preserving $S^2$ are isometries). We may now apply Wigner’s theorem: there is $U \in \mathsf{U}(d)$ such that either $\Phi(\rho) = U\rho U^\dagger$ whenever $\rho$ is a pure state, or $\Phi(\rho) = U\rho^T U^\dagger$ for all pure states $\rho$. Since $\Phi$ is affine, one of the two formulas is valid for all $\rho \in D(\mathbb{C}^d)$.
Although for $d > 2$ the set $D(\mathbb{C}^d)$ is not centrally symmetric, we may argue that the maximally mixed state $\rho_*$ plays the role of a center. In particular, we have
Proposition 2.5. Let $\rho \in D(\mathbb{C}^d)$ be a state which is fixed by all the isometries of $D(\mathbb{C}^d)$ (with respect to the Hilbert–Schmidt distance). Then $\rho = \rho_*$.
Proof. We have $U\rho U^\dagger = \rho$ for every unitary matrix $U$. Since $\mathsf{U}(d)$ spans $M_d$ as a vector space, $\rho$ commutes with any matrix, therefore it equals $\alpha I$ for some $\alpha \in \mathbb{C}$, and the trace constraint forces $\alpha = 1/d$.
4.2.2 (see Exercise 4.25). Another consequence of Kadison’s Theorem 2.4 is a characterization of affine automorphisms of the cone of positive semi-definite matrices,
Exercise 2.8. State and prove the real version of Wigner’s theorem.
2.2.2. Schmidt decomposition. We recall the singular value decomposition (SVD) for matrices: any real or complex matrix $A \in M_{k,d}$ can be decomposed as $A = U\Sigma V^\dagger$, where $U$ and $V$ are unitary matrices of sizes $k$ and $d$ respectively, and $\Sigma = (\Sigma_{ij}) \in M_{k,d}$ is a “rectangular diagonal” (i.e., such that $\Sigma_{ij} = 0$ whenever $i \ne j$) nonnegative matrix. Moreover, up to permutation, the “diagonal” elements of $\Sigma$ are uniquely determined by $A$ and are called the singular values of $A$. We often denote the singular values of $A$ by $s_1(A) \ge \cdots \ge s_{\min(k,d)}(A)$. The singular values of $A$ coincide with the eigenvalues of $(AA^\dagger)^{1/2}$ when $k \le d$, and with the eigenvalues of $(A^\dagger A)^{1/2}$ when $k \ge d$. Note that, in any case, $AA^\dagger$ and $A^\dagger A$ share the same nonzero eigenvalues.
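The facts just recalled can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the dimensions, the random seed and all variable names are illustrative choices of ours and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
k, d = 3, 5   # illustrative sizes with k <= d
A = rng.standard_normal((k, d)) + 1j * rng.standard_normal((k, d))

# Full SVD: U is k x k, Vh = V^dagger is d x d, s holds the min(k, d) singular values
# in non-increasing order.
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# Rebuild the "rectangular diagonal" matrix Sigma and check A = U Sigma V^dagger.
Sigma = np.zeros((k, d))
Sigma[:k, :k] = np.diag(s)   # valid here because k <= d
reconstruction_ok = np.allclose(U @ Sigma @ Vh, A)

# Since k <= d, the singular values are the eigenvalues of (A A^dagger)^{1/2}.
eig_AAh = np.linalg.eigvalsh(A @ A.conj().T)            # ascending order
sv_from_eigs = np.sqrt(np.clip(eig_AAh, 0, None))[::-1]  # descending order
eigs_match = np.allclose(sv_from_eigs, s)
```

Both flags should evaluate to true for any input matrix with $k \le d$; for $k > d$ one would use $A^\dagger A$ instead, as stated above.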
An equivalent presentation of the SVD is as follows: there exist orthonormal sequences $(u_i)$ (in $\mathbb{R}^k$ or $\mathbb{C}^k$, depending on the context) and $(v_i)$ (in $\mathbb{R}^d$ or $\mathbb{C}^d$), and a non-increasing sequence of nonnegative scalars $(s_i)$ such that
\[ A = \sum_i s_i |u_i\rangle\langle v_i|. \tag{2.10} \]
When translated into the language of tensors (see Section 0.4), the singular value
Then there exist nonnegative scalars $(\lambda_i)_{1 \le i \le d}$, and orthonormal vectors $(\chi_i)_{1 \le i \le d}$
\[ \psi = \sum_{i=1}^{d} \lambda_i \chi_i \otimes \varphi_i. \tag{2.11} \]
Note that $\lambda_1^2 + \cdots + \lambda_d^2 = |\psi|^2$. We may write $\lambda_i(\psi)$ instead of $\lambda_i$ to emphasize the dependence on $\psi$. The largest $r$ such that $\lambda_r(\psi) > 0$ is called the Schmidt rank of $\psi$. If $\psi \in \mathbb{C}^k \otimes \mathbb{C}^d$ is identified with a matrix $M \in M_{k,d}$ as in Section 0.8, then
\[ \mathrm{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi| = MM^\dagger. \tag{2.12} \]
Via this identification, Schmidt coefficients of $\psi$ coincide with singular values of $M$, and the Schmidt rank of $\psi$ coincides with the rank of $M$. States of Schmidt rank 1 are exactly product vectors. The largest and the smallest Schmidt coefficients of $\psi \in H_1 \otimes H_2$ are also given by the variational formulas
\[ \lambda_1(\psi) = \max\{|\langle\psi, \chi \otimes \varphi\rangle| : \chi \in H_1,\ \varphi \in H_2,\ |\chi| = |\varphi| = 1\}, \tag{2.13} \]
The above are fully analogous to the (special cases of) Courant–Fischer variational
formulas for singular values of a matrix.
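To make the link between the Schmidt decomposition and the SVD concrete, here is a small numerical sketch. It assumes NumPy and the identification $\psi = \sum_{a,b} M_{ab}\, e_a \otimes f_b$ between vectors and $k \times d$ matrices, which is our assumption and may differ by a complex conjugation from the convention of Section 0.8; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
k, d = 2, 3
# Random unit vector psi in C^k (x) C^d, stored as its k x d coefficient matrix M.
M = rng.standard_normal((k, d)) + 1j * rng.standard_normal((k, d))
M /= np.linalg.norm(M)
psi = M.reshape(k * d)

# Schmidt coefficients are the singular values of M.
U, schmidt, Vh = np.linalg.svd(M, full_matrices=False)
schmidt_rank = int(np.sum(schmidt > 1e-12))

# Rebuild psi = sum_i lambda_i chi_i (x) phi_i with chi_i = U[:, i], phi_i = Vh[i, :].
rebuilt = sum(schmidt[i] * np.kron(U[:, i], Vh[i, :]) for i in range(len(schmidt)))

# Identity (2.12): the partial trace over the second factor equals M M^dagger.
rho = np.outer(psi, psi.conj()).reshape(k, d, k, d)
reduced = np.einsum('abcb->ac', rho)
partial_trace_ok = np.allclose(reduced, M @ M.conj().T)

# Formula (2.13): the maximum is attained at the leading pair of singular vectors.
overlap = abs(np.vdot(np.kron(U[:, 0], Vh[0, :]), psi))
```

With these conventions, `rebuilt` should agree with `psi` and `overlap` with the largest Schmidt coefficient `schmidt[0]`.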
2.2.3. A fundamental dichotomy: separability vs. entanglement. We now introduce a fundamental concept: the dichotomy between separability and entanglement for quantum states. Let $H$ be a complex Hilbert space admitting a tensor decomposition
\[ H = H_1 \otimes \cdots \otimes H_k. \tag{2.15} \]
Recall that since 1-dimensional factors may be dropped, we may (and usually will) assume that all the factors are of dimension at least 2.
Definition 2.7. A pure state $\rho = |\chi\rangle\langle\chi|$ on $H$ is said to be pure separable if the unit vector $\chi$ is a product vector, i.e., if there exist unit vectors $\chi_1, \dots, \chi_k$ such that $\chi = \chi_1 \otimes \cdots \otimes \chi_k$. In that case,
\[ \rho = |\chi_1\rangle\langle\chi_1| \otimes \cdots \otimes |\chi_k\rangle\langle\chi_k|. \tag{2.16} \]
Extending the definition of separability to mixed states requires considering convex combinations (we study in detail the convex hull operation $A \mapsto \mathrm{conv}(A)$ in Section 1.1.2).
Definition 2.8. A mixed state $\rho$ on $H$ is said to be separable if it can
States which are not separable are called entangled. Since pure states are the extreme points even of the larger set $D(H)$ (Proposition 2.1), it follows that the pure separable states (i.e., those given by (2.16)) are exactly the extreme points of $\mathrm{Sep}(H)$. Since there are vectors that are not product vectors, the set $\mathrm{Sep}(H)$ is a
It is noteworthy that $\mathrm{Sep}(H)$ and $D(H)$ have the same dimension. This can
[Figure 2.1: schematic picture with labels "pure states", "D = conv{pure states}" and the point $\rho_* = I/d^2$.]
Figure 2.1. The sets of states ($D$) and of separable states ($\mathrm{Sep}$) on $\mathbb{C}^d \otimes \mathbb{C}^d$. Pure product states have measure zero inside the set of pure states; however both convex hulls have the same dimension. The picture does not respect convexity of $\mathrm{Sep}$, but it is supposed to reflect the relative rarity of separability.
A deeper result asserts that, in the bipartite case, not only do $\mathrm{Sep}$ and $D$ have the same dimension, they also have the same inradius. This may look surprising since $\mathrm{Sep}$ is defined as the convex hull of a very small subset of the set of extreme points of $D$. This remarkable fact was discovered by Gurvits and Barnum and will be proved later (see Theorem 9.15).
book we will focus primarily on the setting in which all partitions are fixed.
Although the extreme points of Sep are very easy to describe (as noted earlier, they are precisely the pure product states), there is no simple description of the facial structure of Sep available (compare with Proposition 2.1, which describes all
the faces of D). The complexity of the facial structure of Sep can be related to
the fact that deciding whether a state is separable is known to be, in the general
setting, NP-hard. This makes calculating some parameters of Sep highly nontrivial;
we will run into this problem in Chapter 9 (see, e.g., Theorem 9.6). Finally, in view
of the dual formulation of the problem of describing faces of a convex body (see
Section 1.1.5, and particularly Proposition 1.5), characterizing maximal faces of
Sep is essentially equivalent to describing extreme points of the object dual to Sep
(see (2.47)), which are well understood only for very small dimensions. (Appendix
C discusses closely related issues.)
2.2.4. Some examples of bipartite states. We now present some examples of states on $\mathbb{C}^d \otimes \mathbb{C}^d$ that are widely used in quantum information theory.
2.2.4.1. Maximally entangled states. A pure state on $\mathbb{C}^d \otimes \mathbb{C}^d$ is called maximally entangled if it has the form $\rho = |\psi\rangle\langle\psi|$ with
\[ \psi = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} e_i \otimes f_i, \tag{2.19} \]
where $(e_i)_{1 \le i \le d}$ and $(f_i)_{1 \le i \le d}$ are two orthonormal bases in $\mathbb{C}^d$. Such a vector $\psi$ is called a maximally entangled vector.
In the special case of $d = 2$, i.e., for systems formed of 2 qubits, the maximally entangled states are called Bell states. Many quantum information protocols, such as quantum teleportation, use Bell states as a fundamental resource.
If we identify vectors and matrices as explained in Section 0.8, the set of all
maximally entangled vectors on Cd b Cd (or, more precisely, on Cd b Cd ) identifies
the canonical basis $(|i\rangle)_{1 \le i \le d}$, and let $\rho = |\psi\rangle\langle\psi|$. Show that $\mathrm{Tr}\,\rho(X \otimes Y) = \frac{1}{d}\mathrm{Tr}(XY^T)$ for any $X, Y \in B(\mathbb{C}^d)$.
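The identity in this exercise is easy to test numerically before proving it. The sketch below assumes NumPy and takes $\psi$ to be the maximally entangled vector $d^{-1/2}\sum_i |i\rangle \otimes |i\rangle$ in the canonical basis; the dimension and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3
# Maximally entangled vector psi = d^{-1/2} sum_i |i> (x) |i>.
psi = np.eye(d).reshape(d * d) / np.sqrt(d)
rho = np.outer(psi, psi.conj())

# Two arbitrary (not necessarily Hermitian) operators on C^d.
X = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
Y = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))

lhs = np.trace(rho @ np.kron(X, Y))   # Tr rho (X (x) Y)
rhs = np.trace(X @ Y.T) / d           # (1/d) Tr(X Y^T), plain transpose without conjugation
```

The two numbers `lhs` and `rhs` should coincide up to rounding; note that the transpose on the right-hand side is taken in the same basis as the one defining $\psi$.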
Show that |ψyxψ| is maximally entangled if and only if distpψ, Segq is maximal. For
extensions to the multipartite case, see Section 8.5.
2.2.4.2. Isotropic states. Isotropic states are states which are a convex (or affine) combination of the maximally mixed state and a maximally entangled state. They have the form
\[ \rho_\beta = \beta|\psi\rangle\langle\psi| + (1 - \beta)\frac{I}{d^2}, \tag{2.20} \]
where $\psi$ is as in (2.19) and $-\frac{1}{d^2-1} \le \beta \le 1$.
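The range of $\beta$ in (2.20) is exactly the range for which $\rho_\beta$ is positive semi-definite. A minimal numerical sketch, assuming NumPy and taking $e_i = f_i = |i\rangle$ in (2.19) (the helper name `isotropic_state` is ours):

```python
import numpy as np

def isotropic_state(d, beta):
    """rho_beta = beta |psi><psi| + (1 - beta) I / d^2 with psi built from the canonical basis."""
    psi = np.eye(d).reshape(d * d) / np.sqrt(d)
    return beta * np.outer(psi, psi) + (1 - beta) * np.eye(d * d) / d**2

d = 3
beta_min = -1 / (d**2 - 1)

# At the lower endpoint the smallest eigenvalue vanishes; outside [beta_min, 1] it is negative.
min_eig_at_lower = np.linalg.eigvalsh(isotropic_state(d, beta_min)).min()
min_eig_below = np.linalg.eigvalsh(isotropic_state(d, beta_min - 0.05)).min()
min_eig_above_one = np.linalg.eigvalsh(isotropic_state(d, 1.05)).min()
```

The eigenvalues are $\beta + (1-\beta)/d^2$ (on $\psi$) and $(1-\beta)/d^2$ (on the orthogonal complement), which explains both endpoints.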
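2.2.4.3. Werner states. Consider the flip operator $F \in B^{\mathrm{sa}}(\mathbb{C}^d \otimes \mathbb{C}^d)$ defined on pure tensors by $F(x \otimes y) = y \otimes x$ and extended by linearity. Its eigenspaces are the symmetric subspace
\[ \mathrm{Sym}_d = \{\psi \in \mathbb{C}^d \otimes \mathbb{C}^d : F(\psi) = \psi\} \]
and the antisymmetric subspace
\[ \mathrm{Asym}_d = \{\psi \in \mathbb{C}^d \otimes \mathbb{C}^d : F(\psi) = -\psi\}. \]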
40 2. THE MATHEMATICS OF QUANTUM INFORMATION THEORY
and antisymmetric states are defined respectively as
\[ \pi_s = \frac{2}{d(d+1)} P_{\mathrm{Sym}_d} \quad\text{and}\quad \pi_a = \frac{2}{d(d-1)} P_{\mathrm{Asym}_d}. \]
For $\lambda \in [0, 1]$, consider the state $w_\lambda$ (called the Werner state) obtained as a convex combination of these two projectors
\[ w_\lambda = \lambda\pi_s + (1 - \lambda)\pi_a. \tag{2.21} \]
Another equivalent expression is
\[ w_\lambda = \frac{1}{d^2 - d\alpha}(I - \alpha F), \tag{2.22} \]
where
\[ \alpha = \frac{1 + d(1 - 2\lambda)}{1 + d - 2\lambda} \in [-1, 1]. \tag{2.23} \]
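The equivalence of (2.21) and (2.22) can be confirmed numerically. The sketch below assumes NumPy and uses the standard fact that, since $F$ is an involution, the projectors onto its eigenspaces are $P_{\mathrm{Sym}_d} = (I + F)/2$ and $P_{\mathrm{Asym}_d} = (I - F)/2$; the function names are ours.

```python
import numpy as np

def flip(d):
    """Matrix of the flip operator F |i>|j> = |j>|i> on C^d (x) C^d."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[j * d + i, i * d + j] = 1.0
    return F

def werner_from_projectors(d, lam):
    """Formula (2.21), with pi_s = (I + F)/(d(d+1)) and pi_a = (I - F)/(d(d-1))."""
    F, I = flip(d), np.eye(d * d)
    pi_s = (I + F) / (d * (d + 1))
    pi_a = (I - F) / (d * (d - 1))
    return lam * pi_s + (1 - lam) * pi_a

def werner_from_flip(d, lam):
    """Formula (2.22), with alpha given by (2.23)."""
    alpha = (1 + d * (1 - 2 * lam)) / (1 + d - 2 * lam)
    return (np.eye(d * d) - alpha * flip(d)) / (d**2 - d * alpha)

same = np.allclose(werner_from_projectors(3, 0.3), werner_from_flip(3, 0.3))
```

Comparing the coefficients of $I$ and $F$ in the two expressions gives the same conclusion by hand; the code is only a sanity check.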
When $d = 2$, the space $\mathrm{Asym}_2$ has dimension one, and Werner states are then a
(ii) Show that for all nonzero vectors $\varphi, \psi \in \mathrm{Sym}_d$, there is $V \in A$ such that $\langle\varphi|V|\psi\rangle \ne 0$.
(iii) Show that for all nonzero vectors $\varphi, \psi \in \mathrm{Asym}_d$, there is $V \in A$ such that $\langle\varphi|V|\psi\rangle \ne 0$.
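(ii) Show that if $U$ is chosen at random with respect to the Haar measure on $\mathsf{U}(\mathbb{C}^d)$, then for any $\rho \in D(\mathbb{C}^d \otimes \mathbb{C}^d)$, $\mathbf{E}\,(U \otimes U)\rho(U \otimes U)^\dagger = w_\lambda$ with $\lambda = \mathrm{Tr}(\rho P_{\mathrm{Sym}_d})$. (The map $\rho \mapsto \mathbf{E}\,(U \otimes U)\rho(U \otimes U)^\dagger$ is called the twirling channel.)
(iii) Show that if $\psi \in S_{\mathbb{C}^d}$ is chosen uniformly at random, then $\mathbf{E}\,|\psi \otimes \psi\rangle\langle\psi \otimes \psi| = \pi_s$.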
\[ \mathrm{Tr}_{\text{all but } i}\,\rho_k = \rho \]
for every $i \in \{1, \dots, k\}$. The state $\rho_k$ is called a $k$-extension of $\rho$. The main result regarding $k$-extendible states is the following theorem.
Theorem 2.10 (not proved here). A quantum state on $H_1 \otimes H_2$ is separable if and only if it is $k$-extendible for every $k \ge 2$.
The “only if” direction is easy (see Exercise 2.17), while the “if” direction relies on the quantum de Finetti theorem and is beyond the scope of this book.
Exercise 2.17. For $k \ge 2$, denote by $k$-Ext the set of $k$-extendible states on $H_1 \otimes H_2$. Show that $k$-Ext is convex and check the inclusions $\mathrm{Sep} \subset l\text{-Ext} \subset k\text{-Ext}$ for $k \le l$.
Exercise 2.18 (2-extendibility of pure states). (i) Let $\rho \in D(H_1 \otimes H_2)$ be a state such that $\mathrm{Tr}_{H_2}\rho = |\psi\rangle\langle\psi|$ for some $\psi \in H_1$. Show that $\rho = |\psi\rangle\langle\psi| \otimes \sigma$ for
\[ \sum_i \lambda_i |\psi_i\rangle\langle\psi_i| \]
where each unit vector $\psi_i \in H_1 \otimes H_2$ has Schmidt rank at most $k$. Note that
\[ \sum_{i,j} a_{ij} |e_i\rangle\langle e_j|. \]
Once the basis is fixed, it makes sense to consider the transposition $T : B(H) \to B(H)$ with respect to that basis, defined as
\[ T\Bigl(\sum_{i,j} a_{ij} |e_i\rangle\langle e_j|\Bigr) = \sum_{i,j} a_{ij} |e_j\rangle\langle e_i|. \]
We will sometimes use the alternative notation $A^T = T(A)$. Note that $T$ is not canonical and depends on the choice of the basis in $H$. The standard usage in linear algebra refers to the transposition with respect to the standard basis $(|j\rangle)_{j=1}^{\dim H}$.
We now define the partial transposition: if $H = H_1 \otimes H_2$ is a bipartite Hilbert space, and if $T$ denotes the transposition on $B(H_1)$ (with respect to a specified basis) and $\mathrm{Id}$ is the identity operation of $B(H_2)$, then the partial transposition (or partial transpose) is the operation
\[ \Gamma = T \otimes \mathrm{Id} : B(H_1 \otimes H_2) \to B(H_1 \otimes H_2). \]
The partial transposition of a state $\rho \in D(H_1 \otimes H_2)$ is denoted by $\rho^\Gamma = \Gamma(\rho)$. What we have defined is actually the partial transposition with respect to the first factor. The partial transposition with respect to the second factor is defined by switching the roles of $H_1$ and $H_2$.
Partial transposition applies nicely to states represented as block matrices (see Section 0.7): if $\rho \in D(H_1 \otimes H_2)$ corresponds to the block operator $(A_{ij})$, with $A_{ij} \in B(H_2)$, then $\rho^\Gamma$ corresponds to the block operator $(A_{ji})$. Similarly, partial transposition of $\rho$ with respect to the second factor corresponds to the block operator $(A_{ij}^T)$. We illustrate this by computing the partial transposition of the (maximally entangled) Bell state: if $\psi = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$, then (assuming transposition is taken with respect to the canonical basis of $\mathbb{C}^2$)
\[ |\psi\rangle\langle\psi| = \frac{1}{2}\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}, \qquad |\psi\rangle\langle\psi|^\Gamma = \frac{1}{2}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \tag{2.24} \]
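The computation (2.24) can be reproduced with a few lines of code. This is a minimal sketch assuming NumPy; the helper `partial_transpose` and its index conventions are ours, with operators on $\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$ stored as matrices in the product basis $|i\rangle \otimes |k\rangle \leftrightarrow$ index $i d_2 + k$.

```python
import numpy as np

def partial_transpose(rho, d1, d2, factor=0):
    """Transpose the chosen tensor factor (0 or 1) of an operator on C^d1 (x) C^d2."""
    t = rho.reshape(d1, d2, d1, d2)   # t[i, k, j, l] = <i k| rho |j l>
    # Swapping i <-> j transposes the first factor; swapping k <-> l the second.
    t = t.transpose(2, 1, 0, 3) if factor == 0 else t.transpose(0, 3, 2, 1)
    return t.reshape(d1 * d2, d1 * d2)

psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
rho = np.outer(psi, psi)
rho_gamma = partial_transpose(rho, 2, 2)

# Spectrum of the partial transpose: one eigenvalue is negative.
eigenvalues = np.linalg.eigvalsh(rho_gamma)
```

Here `rho_gamma` reproduces the right-hand matrix in (2.24), and its spectrum consists of $-1/2$ (once) and $1/2$ (three times), so the partial transpose of the Bell state is not positive.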
As for the usual transposition, the partial transposition depends on a choice of basis. However, we have the following result.
Proof. Let $(e_i)$ and $(e_i')$ be two orthonormal bases in $H_1$, and $T$ and $T'$ denote the transpositions with respect to each basis. Let $U$ be the unitary transformation such that $e_j' = U(e_j)$. We claim that, for every operator $X \in B(H_1)$,
\[ T'(X) = V^\dagger T(X) V, \tag{2.25} \]
which case $T'(X) = |e_j'\rangle\langle e_i'|$. On the other hand, since $X = U|e_i\rangle\langle e_j|U^\dagger$, we then have
as claimed. This shows that the partial transpositions with respect to the two bases are conjugated via the unitary transformation $V \otimes I$, and the claim follows since
Exercise 2.20 (Partial transpose and the flip operator). Let $\psi = \frac{1}{\sqrt{d}}\sum_{i=1}^{d} e_i \otimes e_i$ be a maximally entangled state on $\mathbb{C}^d \otimes \mathbb{C}^d$ and assume that partial transposition is computed with respect to the basis $(e_i)$. Show that $|\psi\rangle\langle\psi|^\Gamma = \frac{1}{d}F$ where $F : x \otimes y \mapsto y \otimes x$ is the flip operator.
Exercise 2.21. Find an error in the following argument that purports to mimic the proof of Proposition 2.11 to show that the partial transpose of any state is positive.
If $X \in B^{\mathrm{sa}}(H_1)$, then $T(X)$ (with respect to some fixed basis) has the same spectrum as $X$ and so there is a unitary operator $V$ such that $T(X) = V^\dagger XV$. This shows that the partial transpose with respect to the same basis is given by conjugation by the unitary transformation $V \otimes I$. Since such conjugation preserves spectra, it follows that the partial transpose of any state is positive.
2.2.7. PPT states.
Definition 2.12. A state $\rho \in D(H_1 \otimes H_2)$ is said to have a positive partial transpose (or to be PPT) if the operator $\rho^\Gamma$ is positive. We denote by $\mathrm{PPT}(H_1 \otimes H_2)$, or simply PPT, the set of PPT states (note that this set is convex).
Proposition 2.11 implies that the definition of PPT states is basis-independent. Similarly, we do not need to specify whether we apply the partial transposition to the first or the second factor; one passes from one to the other by applying the full transposition, which is a spectrum-preserving operation.
Let $\rho$ be a state on $H_1 \otimes H_2$. Since the partial transposition preserves the trace,
The map $\Gamma$ is a linear map which preserves the Hilbert–Schmidt norm, and therefore behaves as an isometry (see Exercise 2.22). This map is not a canonical object and depends on the choice of a basis. However, the intersection $D \cap \Gamma(D)$
The next proposition lies at the root of the relevance of the concept of PPT
states to quantum information theory.
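Proof. Since the set PPT is convex, it suffices to show that the extreme points of $\mathrm{Sep}(H_1 \otimes H_2)$ are PPT. The extreme points of $\mathrm{Sep}(H_1 \otimes H_2)$ are pure product states, i.e., states of the form
\[ \rho = |\psi_1 \otimes \psi_2\rangle\langle\psi_1 \otimes \psi_2| = |\psi_1\rangle\langle\psi_1| \otimes |\psi_2\rangle\langle\psi_2| \]
for unit vectors $\psi_1 \in H_1$, $\psi_2 \in H_2$. The partial transpose of such a state is
\[ \rho^\Gamma = |\psi_1\rangle\langle\psi_1|^T \otimes |\psi_2\rangle\langle\psi_2| = |\overline{\psi_1}\rangle\langle\overline{\psi_1}| \otimes |\psi_2\rangle\langle\psi_2|, \]
where $\overline{\psi_1}$ is the vector obtained by applying the complex conjugation to each coordinate of $\psi_1$. It follows that $\rho^\Gamma$ is positive, hence $\rho$ is PPT.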
[Figure 2.2: schematic picture with labels "PPT = D ∩ Γ(D)", "Sep" and "Γ(D)".]
Figure 2.2. An illustration of the inclusion $\mathrm{Sep} \subset \mathrm{PPT} = D \cap \Gamma(D)$. The inclusion is strict if and only if $\dim H_1 \dim H_2 > 6$, see Theorem 2.15. The set Sep is not a polytope, but the set of its extreme points is much “thinner” than those of $D$ and of PPT if the dimension is large.
The Peres–Horodecki criterion (or the PPT criterion) is shown in action in (2.24), where it certifies non-separability of the Bell state: the partial transpose $|\psi\rangle\langle\psi|^\Gamma$ is clearly non-positive. However, positivity of $\rho^\Gamma$ is, in general, only a
Proof. Let $\rho = |\psi\rangle\langle\psi|$ be a pure state, and let $\psi = \sum_i \lambda_i \chi_i \otimes \psi_i$ be a Schmidt decomposition. If we compute the partial transposition with respect to a basis
i,j
Suppose there exist two non-zero Schmidt coefficients (say, $\lambda_i$ and $\lambda_j$ with $i \ne j$). Then one checks from (2.28) that the restriction of $\rho^\Gamma$ to $\mathrm{span}\{\chi_i \otimes \psi_j, \chi_j \otimes \psi_i\}$ is not positive. It follows that $\rho$ is PPT if and only if only one Schmidt coefficient
Proposition 2.16 (Separability of Werner states). For $\lambda \in [0, 1]$, let $w_\lambda$ be the Werner state on $H = \mathbb{C}^d \otimes \mathbb{C}^d$ as defined in (2.21). The following are equivalent
(i) $w_\lambda$ is separable,
(ii) $w_\lambda$ is PPT,
(iii) $\mathrm{Tr}\, w_\lambda F \ge 0$,
(iv) $\lambda \ge 1/2$.
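The equivalence of (ii), (iii) and (iv) lends itself to a quick numerical experiment. The sketch below assumes NumPy, builds $w_\lambda$ from (2.21) using $P_{\mathrm{Sym}_d} = (I + F)/2$ and $P_{\mathrm{Asym}_d} = (I - F)/2$, and takes the partial transpose in the canonical basis; the function names, the dimension and the sample values of $\lambda$ are ours. It does not address separability itself, i.e., condition (i).

```python
import numpy as np

def werner(d, lam):
    """Return (w_lambda, F) on C^d (x) C^d, with w_lambda as in (2.21)."""
    I = np.eye(d * d)
    # Flip operator: permute the two row indices of the identity.
    F = I.reshape(d, d, d, d).transpose(1, 0, 2, 3).reshape(d * d, d * d)
    w = lam * (I + F) / (d * (d + 1)) + (1 - lam) * (I - F) / (d * (d - 1))
    return w, F

def min_eig_partial_transpose(w, d):
    """Smallest eigenvalue of the partial transpose with respect to the first factor."""
    wg = w.reshape(d, d, d, d).transpose(2, 1, 0, 3).reshape(d * d, d * d)
    return np.linalg.eigvalsh(wg).min()

d = 3
report = {}
for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    w, F = werner(d, lam)
    is_ppt = bool(min_eig_partial_transpose(w, d) >= -1e-12)   # small tolerance for rounding
    report[lam] = (is_ppt, float(np.trace(w @ F)))
```

For each sampled $\lambda$, the flag `is_ppt` should be true exactly when $\lambda \ge 1/2$, and the second entry should equal $2\lambda - 1$.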
Proof. The equivalence (iii) $\iff$ (iv) is a straightforward calculation (we have $\mathrm{Tr}\, w_\lambda F = 2\lambda - 1$). To show that (ii) $\iff$ (iv), we compute the partial transpose of Werner states in the form (2.22) to obtain (see also Exercise 2.20)
\[ w_\lambda^\Gamma = \frac{1}{d^2 - d\alpha}\bigl(I - \alpha d\,|x\rangle\langle x|\bigr), \]
where $x$ is the maximally entangled vector in the canonical basis $(|i\rangle)_{1 \le i \le d}$. It follows that $w_\lambda^\Gamma \ge 0 \iff \alpha \le 1/d \iff \lambda \ge 1/2$ (see (2.23) for the second equivalence). It remains to prove that (iv) implies (i); since Sep is convex, it is enough to establish that $w_1$ and $w_{1/2}$ are separable. The separability of $w_1 = \pi_s$ is clear from part (iii) of Exercise 2.16. To show that $w_{1/2}$ is separable, we proceed as follows. For $j \ne k$ and a complex number $\xi$ with modulus one, denote $v^\pm = |j\rangle \pm \xi|k\rangle$. Next, think of $\xi$ as a random variable uniformly distributed on the unit circle. The operator $\mathbf{E}\,|v^+\rangle\langle v^+| \otimes |v^-\rangle\langle v^-|$ belongs to the separable cone SEP. We compute
\[ \mathbf{E}\,|v^+ v^-\rangle\langle v^+ v^-| = |jj\rangle\langle jj| + |kk\rangle\langle kk| + |jk\rangle\langle jk| + |kj\rangle\langle kj| - |jk\rangle\langle kj| - |kj\rangle\langle jk|, \]
\[ A := 2d\sum_{j} |j\rangle\langle j| \otimes |j\rangle\langle j| + 2\sum_{j \ne k} |j\rangle\langle j| \otimes |k\rangle\langle k| - 2F \in \mathrm{SEP}. \]
where the first equality is just (2.22) (note that $\lambda = 1/2$ implies $\alpha = 1/d$ by (2.23)).
the PPT criterion, this is a necessary (but generally not sufficient) condition for
separability.
2.2.8. Local unitaries and symmetries of Sep. Let us state an analogue
of Kadison’s theorem (Theorem 2.4), which characterizes affine maps preserving
the set Sep. This can be seen as a motivation for the study of partial transposition.
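Theorem 2.17 (not proved here). Let $H = \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_k}$ be a multipartite Hilbert space. An affine map $\Phi : B^{\mathrm{sa}}(H) \to B^{\mathrm{sa}}(H)$ satisfies $\Phi(\mathrm{Sep}) = \mathrm{Sep}$ if and only if it can be written as the composition of maps of the following forms:
(i) local unitaries
\[ \rho \mapsto (U_1 \otimes \cdots \otimes U_k)\rho(U_1 \otimes \cdots \otimes U_k)^\dagger \]
for $U_i \in \mathsf{U}(d_i)$,
(ii) partial transpositions
\[ \rho_1 \otimes \cdots \otimes \rho_i \otimes \cdots \otimes \rho_k \mapsto \rho_1 \otimes \cdots \otimes \rho_i^T \otimes \cdots \otimes \rho_k, \]
for some $i \in \{1, \dots, k\}$,
(iii) swaps
\[ \rho_1 \otimes \cdots \otimes \rho_i \otimes \cdots \otimes \rho_j \otimes \cdots \otimes \rho_k \mapsto \rho_1 \otimes \cdots \otimes \rho_j \otimes \cdots \otimes \rho_i \otimes \cdots \otimes \rho_k, \]
for some $i < j$ such that $d_i = d_j$.
All these maps are also isometries with respect to the Hilbert–Schmidt distance.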
Although $\mathrm{Sep}(H)$ has a much smaller group of isometries than $D(H)$, the conclusion of Proposition 2.5 still holds for Sep: the only fixed point is $\rho_*$. This implies for example that $\rho_*$ is the centroid of Sep.
\[ A = \sum_i c_i A_1^{(i)} \otimes \cdots \otimes A_k^{(i)}, \]
where $A_j^{(i)} \in B^{\mathrm{sa}}(H_j)$. Let $U = U_1 \otimes \cdots \otimes U_k$, where $(U_j)$ are random unitary matrices, independent and Haar-distributed on the corresponding unitary groups. By the translation-invariance of the Haar measure (see Appendix B.3), the operator $\mathbf{E}\, U_j A_j^{(i)} U_j^\dagger$ commutes with any unitary operator on $H_j$ and therefore (by the preceding fact) equals $\alpha_{i,j} I_{H_j}$ for some $\alpha_{i,j} \in \mathbb{R}$. By independence, it follows that
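\[ \mathbf{E}\bigl(UAU^\dagger\bigr) = \sum_i c_i\, \mathbf{E}\, U_1 A_1^{(i)} U_1^\dagger \otimes \cdots \otimes U_k A_k^{(i)} U_k^\dagger = \sum_i c_i\, (\mathbf{E}\, U_1 A_1^{(i)} U_1^\dagger) \otimes \cdots \otimes (\mathbf{E}\, U_k A_k^{(i)} U_k^\dagger) \]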
2.3. SUPEROPERATORS AND QUANTUM CHANNELS 47
\[ = \Bigl(\sum_i c_i \prod_{j=1}^{k} \alpha_{i,j}\Bigr) I_H. \]
However, the group of local unitaries does not act irreducibly: there are non-
trivial invariant subspaces which are described by the following lemma.
Lemma 2.19 (not proved here). Let $H = \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_k}$ be a multipartite Hilbert space, and
\[ G = \{U_1 \otimes \cdots \otimes U_k : U_i \in \mathsf{U}(d_i)\} \]
be the group of local unitaries. For $1 \le i \le k$, write $M_{d_i}^{\mathrm{sa}} = V_i^1 \oplus V_i^2$, where $V_i^1$ denotes the hyperplane of trace zero Hermitian matrices, and $V_i^2 = \mathbb{R} I$.
A subspace $E \subset B^{\mathrm{sa}}(H)$ is invariant under $G$ if and only if it can be decomposed as a direct sum of subspaces of the form
\[ V_1^{\alpha_1} \otimes \cdots \otimes V_k^{\alpha_k} \]
for some choice $(\alpha_1, \dots, \alpha_k) \in \{1, 2\}^k$.
are quantum maps and quantum operations. The crucial observation is that with
any such map one can naturally associate usual operators acting on larger Hilbert
spaces.
$H_2$ denote complex (finite-dimensional) Hilbert spaces. Recall (see Sections 0.4 and 0.8) the canonical isomorphisms $(H_1 \otimes H_2)^* \leftrightarrow H_1^* \otimes H_2^*$ and
This isomorphism can be seen more concretely via trace duality: a map S P
It turns out that there is another related isomorphism, called the Choi isomorphism, which is often more useful. Once a basis in $H_1$ is fixed, the Choi isomorphism is the $\mathbb{C}$-linear bijective map
\[ C : B(B(H_1), B(H_2)) \longrightarrow B(H_2 \otimes H_1), \qquad \Phi \longmapsto \sum_{i,j} \Phi(E_{ij}) \otimes E_{ij}. \tag{2.31} \]
We call $C(\Phi)$ the Choi matrix of $\Phi$. Note that the Choi isomorphism is basis-dependent, whereas the Jamiołkowski isomorphism is not. The relation between the isomorphisms $J$ and $C$ is given by the partial transposition: if $\Gamma$ denotes the partial transposition on $H_2 \otimes H_1$ with respect to $H_1$, then $C = \Gamma \circ J$.
Here is a simple lemma which identifies the elements in $B(B(H_1), B(H_2))$ that correspond to rank 1 operators under the Choi isomorphism.
Lemma 2.20. Given $A, B \in B(H_1, H_2)$, consider the map $\Phi : B(H_1) \to B(H_2)$ defined by
\[ \Phi(X) = AXB^\dagger \]
for $X \in B(H_1)$. Then $C(\Phi) = |a\rangle\langle b|$, where $a = \mathrm{vec}(A)$ and $b = \mathrm{vec}(B)$ are the vectors in $H_2 \otimes H_1$ associated to the operators $A$ and $B$ (see Section 0.8). Note also that $A$ has rank 1 if and only if $a$ is a product vector.
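Proof. By $\mathbb{C}$-linearity it is enough to consider $A = |\psi\rangle\langle e_j|$ and $B = |\chi\rangle\langle e_i|$ for some $\psi, \chi \in H_2$ and some basis vectors $e_i, e_j \in H_1$. A simple computation (only the term $\Phi(E_{ji}) = |\psi\rangle\langle\chi|$ in (2.31) is nonzero) shows that then $C(\Phi) = |\psi\rangle\langle\chi| \otimes E_{ji}$, while $a = \psi \otimes e_j$ and $b = \chi \otimes e_i$, and the Lemma follows.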
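Lemma 2.20 can be tested numerically on random operators. The sketch below assumes NumPy and takes $\mathrm{vec}(A) = \sum_j (Ae_j) \otimes e_j \in H_2 \otimes H_1$, which corresponds to row-major flattening of the matrix of $A$; this convention is our assumption about Section 0.8, and the function `choi` is an illustrative helper.

```python
import numpy as np

def choi(phi, d1):
    """Choi matrix sum_{i,j} phi(E_ij) (x) E_ij of a linear map phi acting on d1 x d1 matrices."""
    total = 0
    for i in range(d1):
        for j in range(d1):
            E = np.zeros((d1, d1), dtype=complex)
            E[i, j] = 1.0                      # matrix unit E_ij = |e_i><e_j|
            total = total + np.kron(phi(E), E)
    return total

rng = np.random.default_rng(3)
d1, d2 = 2, 3
A = rng.standard_normal((d2, d1)) + 1j * rng.standard_normal((d2, d1))
B = rng.standard_normal((d2, d1)) + 1j * rng.standard_normal((d2, d1))

# Choi matrix of X -> A X B^dagger, an operator on H2 (x) H1.
C = choi(lambda X: A @ X @ B.conj().T, d1)

# Assumed convention: vec is row-major flattening, giving vectors in H2 (x) H1.
a, b = A.reshape(-1), B.reshape(-1)
rank_one_form = np.outer(a, b.conj())          # |a><b|
```

Under this convention `C` and `rank_one_form` should agree, and `C` has rank 1.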
respect to the bases $(E_{ij})_{1 \le i,j \le d_1}$ and $(E_{kl})_{1 \le k,l \le d_2}$ is given by the realigned Choi matrix $C(\Phi)^R$.
\[ C(\Phi) = \bigl(\Phi \otimes \mathrm{Id}_{B(H_1)}\bigr)(|\chi\rangle\langle\chi|), \tag{2.32} \]
where $\chi = \sum_i e_i \otimes e_i \in H_1 \otimes H_1$ is (a multiple of) a maximally entangled vector. (Recall that we fixed a basis $(e_i)$ in $H_1$ when defining the Choi isomorphism.) We also note that there is a one-to-one correspondence between
(a) self-adjointness-preserving $\mathbb{C}$-linear maps $\Phi : B(H_1) \to B(H_2)$ and
(b) $\mathbb{R}$-linear maps $\Psi : B^{\mathrm{sa}}(H_1) \to B^{\mathrm{sa}}(H_2)$.
The correspondence is straightforward: $\Psi$ is obtained from $\Phi$ by restriction, whereas $\Phi$ is obtained from $\Psi$ by complexification (see Section 0.5).
for any $X \in B(H_2)$ and $Y \in B(H_1)$. Note that $\Phi^*$ is automatically self-adjointness-preserving if $\Phi$ is.
The map $\Phi$ is said to be positivity preserving (shortened to positive when this does not lead to ambiguity) if the image of every positive operator is a positive operator. The map $\Phi$ is said to be $n$-positive if $\Phi \otimes \mathrm{Id} : B^{\mathrm{sa}}(H_1 \otimes \mathbb{C}^n) \to B^{\mathrm{sa}}(H_2 \otimes \mathbb{C}^n)$ is positive. (Note that $n$-positivity formally implies $k$-positivity for any $k < n$.) Finally, the map $\Phi$ is said to be completely positive if it is $n$-positive for every integer $n$. (However, only $n = \min(\dim H_1, \dim H_2)$ needs to be checked, see Exercise 2.28.) We denote by $CP(H_1, H_2)$ the set of completely positive maps from $B(H_1)$ to $B(H_2)$. It is immediate from the definition that $CP(H_1, H_2)$ is a convex cone; more about this aspect of the theory in Section 2.4.
The transposition is an example of a map which is positive but not 2-positive; this can be seen, e.g., from (2.24) in Section 2.2.6 or from Exercise 2.32. Here is an important structure theorem concerning completely positive maps.
(3) there exist finitely many operators $A_1, \dots, A_N \in B(H_1, H_2)$ such that, for any $X \in B(H_1)$,
\[ \Phi(X) = \sum_{i=1}^{N} A_i X A_i^\dagger. \tag{2.33} \]
The smallest integer $N$ such that a Kraus decomposition is possible is called the Kraus rank of $\Phi$. As will be clear from the proof, the Kraus rank of $\Phi$ is the same as the rank of $C(\Phi)$ in the usual (linear algebra) sense. In particular, it will follow that the Kraus rank of $\Phi : B(H_1) \to B(H_2)$ is at most $\dim H_1 \dim H_2$.
Proof. It is easily checked that (3) implies (1). The implication (1) $\Rightarrow$ (2) follows from the representation (2.32) of the Choi matrix. We now prove (2) $\Rightarrow$ (3). By the spectral theorem, there exist vectors $a_i \in H_2 \otimes H_1$ such that
\[ C(\Phi) = \sum_i |a_i\rangle\langle a_i|. \tag{2.34} \]
By Lemma 2.20, $|a_i\rangle\langle a_i|$ is the Choi matrix of the map $X \mapsto A_i X A_i^\dagger$, where $A_i \in B(H_1, H_2)$ is associated to $a_i$ via the relation $a_i = \mathrm{vec}(A_i)$. A representation of type (3) follows now from the linearity of the Choi isomorphism.
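The step (2) $\Rightarrow$ (3) of this proof is constructive and can be turned into a short procedure. The following sketch assumes NumPy and the row-major convention for $\mathrm{vec}$ (our assumption, so that a vector in $H_2 \otimes H_1$ reshapes to a $d_2 \times d_1$ matrix); the function name `kraus_from_choi` and the tolerance are ours.

```python
import numpy as np

def kraus_from_choi(C, d1, d2, tol=1e-12):
    """Kraus operators A_i (d2 x d1) from a positive semi-definite Choi matrix on H2 (x) H1."""
    vals, vecs = np.linalg.eigh(C)             # spectral decomposition C = sum val |vec><vec|
    ops = []
    for val, vec in zip(vals, vecs.T):
        if val > tol:                          # keep only the nonzero part of the spectrum
            ops.append(np.sqrt(val) * vec.reshape(d2, d1))   # a_i = sqrt(val) vec = vec(A_i)
    return ops

# Illustration: start from two arbitrary Kraus operators and build the Choi matrix via Lemma 2.20.
rng = np.random.default_rng(4)
d1, d2 = 2, 3
original = [rng.standard_normal((d2, d1)) + 1j * rng.standard_normal((d2, d1)) for _ in range(2)]
C = sum(np.outer(K.reshape(-1), K.reshape(-1).conj()) for K in original)

recovered = kraus_from_choi(C, d1, d2)

# The recovered operators differ from the original ones but define the same map.
X = rng.standard_normal((d1, d1)) + 1j * rng.standard_normal((d1, d1))
out_original = sum(K @ X @ K.conj().T for K in original)
out_recovered = sum(K @ X @ K.conj().T for K in recovered)
```

The number of recovered operators equals the rank of `C`, in line with the remark that the Kraus rank is the rank of the Choi matrix.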
more general setting in Section 2.4.
Exercise 2.25. Let $\Phi : B(H_1) \to B(H_2)$ be self-adjointness-preserving. Show that $\Phi^*$ is positive if and only if $\Phi$ is positive, and that for any $n$, $\Phi^*$ is $n$-positive if and only if $\Phi$ is $n$-positive.
Exercise 2.26. Show that if $\Phi$ and $\Psi$ are completely positive, so are $\Phi \otimes \Psi$ and $\Phi \circ \Psi$ (the composition, assuming it is defined).
Exercise 2.27. Show that any self-adjointness-preserving map $\Phi : B(H_1) \to B(H_2)$ is the difference of two completely positive maps.
Exercise 2.28. Show that the assertions of Theorem 2.21 are also equivalent to the fact that $\Phi$ is $n$-positive, with $n = \min(\dim H_1, \dim H_2)$.
Exercise 2.29. Let $k < n$ be integers. Show that the map $\Phi : M_n \to M_n$ defined by $\Phi(X) = k\,\mathrm{Tr}(X)\, I - X$ is $k$-positive but not $(k + 1)$-positive.
3.5.) A channel that is additionally unital (i.e., if both $\Phi$ and $\Phi^*$ are channels)
\[ \Phi(X) = \mathrm{Tr}_{H_3} VXV^\dagger. \tag{2.37} \]
Moreover, $\Phi$ is a quantum channel if and only if $V$ is an isometry. Conversely, for any isometric embedding $V$, the map $\Phi$ defined via (2.37) is a quantum channel.
The proof shows that the smallest possible dimension for $H_3$ equals the Kraus rank of $\Phi$; in particular we can require that $\dim(H_3) \le \dim(H_1)\dim(H_2)$.
Proof. Start from a Kraus decomposition (2.33) for $\Phi$. Set $H_3 := \mathbb{C}^N$, and let $(|i\rangle)_{1 \le i \le N}$ be its canonical basis. Define $V$ by the formula
\[ V|\psi\rangle = \sum_{i=1}^{N} A_i|\psi\rangle \otimes |i\rangle \quad \text{for } \psi \in H_1. \tag{2.38} \]
As in Remark 2.23, this follows by linearity from the special case $X = |\psi\rangle\langle\psi|$. This implies the identity (2.37). We also see from (2.38) that $V^\dagger V = \sum_{i=1}^{N} A_i^\dagger A_i$. By Remark 2.23 it follows that $\Phi$ is a quantum channel if and only if $V^\dagger V = I_{H_1}$, which
(2.39)
Proof. Let $V : H \to H \otimes H'$ be given by Theorem 2.24 (with $H' = H_3$). Choose any vector $\psi \in H'$. The map $\varphi \otimes \psi \mapsto V(\varphi)$ (defined on the subspace $H \otimes \psi \subset H \otimes H'$) is an isometry, and therefore can be extended to a unitary $U$ on $H \otimes H'$. One checks easily that (2.39) holds.
We mention in passing that a popular way to quantify how different two quantum channels are is the diamond norm. For a self-adjointness-preserving map $\Phi : B(H_1) \to B(H_2)$, define
\[ \|\Phi\|_\diamond = \sup_{k \in \mathbb{N}}\ \sup_{\rho \in D(H_1 \otimes \mathbb{C}^k)} \|(\Phi \otimes I_{B(\mathbb{C}^k)})(\rho)\|_1. \]
with respect to the trace norm $\|\cdot\|_1$. (ii) Let $T : M_n \to M_n$ be the transposition map. Calculate the norm of $T \otimes \mathrm{Id}$ considered as a map on $\bigl(B^{\mathrm{sa}}(\mathbb{C}^m \otimes \mathbb{C}^2), \|\cdot\|_1\bigr)$ and give an example of an operator on which that norm is attained. (iii) Same question for the operator norm $\|\cdot\|_\infty$.
Exercise 2.33. Show that any positive, unital, and trace-preserving map $\Phi : M_n^{\mathrm{sa}} \to M_n^{\mathrm{sa}}$ is rank non-decreasing, i.e., $\mathrm{rank}\,\Phi(\rho) \ge \mathrm{rank}\,\rho$ for any $\rho \in D(\mathbb{C}^n)$.
classes and examples of quantum channels or, more generally, of superoperators.
(Sometimes it is convenient to drop the trace-preserving constraint.)
2.3.4.1. Unitary channels. Unitary channels are the completely positive isometries of the set of states identified in Theorem 2.4, i.e., the maps that are of the form $\rho \mapsto U\rho U^\dagger$ for some $U \in \mathsf{U}(d)$.
2.3.4.2. Mixed-unitary channels. A mixed-unitary channel $\Phi : B(\mathbb{C}^d) \to B(\mathbb{C}^d)$ is a channel which is a convex combination of unitary channels, i.e., is of the form
\[ \Phi(\rho) = \sum_{i=1}^{N} \lambda_i U_i \rho U_i^\dagger, \tag{2.40} \]
where $(\lambda_i)$ is a convex combination and $U_i \in \mathsf{U}(\mathbb{C}^d)$. Such channels are automatically unital. A remarkable fact is that the converse is true when $d = 2$.
Exercise 2.34 (Proof of Proposition 2.26). (i) Argue that it is enough to prove Proposition 2.26 for channels which are diagonal with respect to the basis of Pauli matrices (2.2).
is completely positive if and only if $(a + b)^2 \le (1 + c)^2$ and $(a - b)^2 \le (1 - c)^2$.
(iii) Rewrite the conditions from part (ii) as a system of four linear inequalities and
conclude the proof.
Exercise 2.35. Show that any mixed-unitary channel $\Phi : B(\mathbb{C}^d) \to B(\mathbb{C}^d)$ can be expressed as in (2.40) with $N \le d^4 - 2d^2 + 2$. Note that the argument from Exercise 2.34 gives $N \le 4$ (which is optimal) for $d = 2$.
2.3.4.3. Depolarizing and dephasing channels. The completely depolarizing (or completely randomizing) channel is the channel $R : B(\mathbb{C}^d) \to B(\mathbb{C}^d)$ defined as $R(X) = \mathrm{Tr}\,X\,\frac{I}{d}$. It maps every state to the maximally mixed state. The completely dephasing channel is the channel $D : B(\mathbb{C}^d) \to B(\mathbb{C}^d)$ that maps any operator to its diagonal part (with respect to a fixed basis).
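Both channels are one-liners on matrices. A minimal sketch, assuming NumPy and the canonical basis for the dephasing channel; the function names and the sample state are ours.

```python
import numpy as np

def completely_depolarizing(X):
    """R(X) = Tr(X) I / d."""
    d = X.shape[0]
    return np.trace(X) * np.eye(d) / d

def completely_dephasing(X):
    """D(X) = diagonal part of X in the canonical basis."""
    return np.diag(np.diag(X))

# A sample qubit state with nonzero off-diagonal entries.
rho = np.array([[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]])
depolarized = completely_depolarizing(rho)   # the maximally mixed state I/2
dephased = completely_dephasing(rho)         # diag(0.7, 0.3)
```

Both maps preserve the trace and fix the maximally mixed state, so they are unital channels.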
property that $\sum_i M_i = I$. Given a POVM, we can associate to it a quantum channel (sometimes called a quantum-classical or q-c channel) $\Phi : B(H) \to B(\mathbb{C}^N)$ defined as
\[ \Phi(\rho) = \sum_{i=1}^{N} |i\rangle\langle i|\, \mathrm{Tr}(M_i\rho). \tag{2.41} \]
The dual concept is the notion of a classical-quantum or c-q channel $\Psi : B(\mathbb{C}^N) \to B(H)$. This is a channel of the form
\[ \Psi(\rho) = \sum_{i=1}^{N} \rho_i \langle i|\rho|i\rangle, \]
of the form (2.41). Under what condition on $(M_i)$ is $\Phi$ unital? When this condition is satisfied, show that the dual map $\Phi^*$ is a c-q channel.
maps:
(i) $\Phi$ is entanglement-breaking,
(ii) the Choi matrix $C(\Phi)$ lies in the separable cone $\mathrm{SEP}(H^{\mathrm{out}} \otimes H^{\mathrm{in}})$,
(iii) there is a Kraus decomposition of $\Phi$ (2.33) where all the Kraus operators $A_i$ have rank 1.
2.3.4.7. Schur channels. Given matrices $A, B \in \mathsf{M}_d$, their Schur product $A \odot B$
is defined as the entrywise product: $(A \odot B)_{ij} = A_{ij} B_{ij}$. Given $A \in \mathsf{M}_d$, the map
$\Theta_A : \mathsf{M}_d \to \mathsf{M}_d$ defined as $\Theta_A(X) = A \odot X$ is called a Schur multiplier. When $A$
is positive with $A_{ii} = 1$ for all $i$, the map $\Theta_A$ is a quantum channel called a Schur
channel.
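As an illustration of the definition, the following sketch applies a Schur multiplier to a qubit state in plain Python. The matrix $A$ with off-diagonal entry $c = 0.5$ is a hypothetical example chosen by us; it is positive semi-definite with unit diagonal, so $\Theta_A$ should preserve the trace and damp the off-diagonal entries.

```python
def schur_multiplier(A, X):
    """Theta_A(X) = A (Schur product) X, i.e. the entrywise product."""
    d = len(A)
    return [[A[i][j] * X[i][j] for j in range(d)] for i in range(d)]

def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

# Hypothetical example: A = [[1, c], [c, 1]] is positive semi-definite
# whenever |c| <= 1, and has A_ii = 1.
c = 0.5
A = [[1.0, c], [c, 1.0]]

rho = [[0.5, 0.5], [0.5, 0.5]]   # the pure state |+><+|
out = schur_multiplier(A, rho)   # diagonal kept, coherences scaled by c

# For a 2x2 self-adjoint matrix, positivity amounts to
# nonnegative trace and nonnegative determinant.
determinant = out[0][0] * out[1][1] - out[0][1] * out[1][0]
```

Taking $c = 0$ in this sketch gives $A = I$ and recovers the completely dephasing channel, while $c = 1$ gives the identity channel.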
Exercise 2.42 (Positivity of Schur multipliers). Let $A \in \mathsf{M}_d$. Show that the
following are equivalent:
(i) $A$ is positive semi-definite,
(ii) $\Theta_A$ is positive,
(iii) $\Theta_A$ is completely positive.
Exercise 2.43 (Kraus decompositions of Schur channels). Let $\Phi : \mathsf{M}_d \to \mathsf{M}_d$
be a quantum channel. Show that $\Phi$ is a Schur channel if and only if it admits a
Kraus decomposition (2.33) where $A_i$ are diagonal operators.
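2.3.4.8. Separable and LOCC superoperators. We now assume that $\mathcal{H}^{\mathrm{in}}$ and
$\mathcal{H}^{\mathrm{out}}$ are bipartite spaces, say $\mathcal{H}^{\mathrm{in}} = \mathcal{H}_1^{\mathrm{in}} \otimes \mathcal{H}_2^{\mathrm{in}}$ and $\mathcal{H}^{\mathrm{out}} = \mathcal{H}_1^{\mathrm{out}} \otimes \mathcal{H}_2^{\mathrm{out}}$. A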
$A_i^{(2)} : \mathcal{H}_2^{\mathrm{in}} \to \mathcal{H}_2^{\mathrm{out}}$ such that for any $X \in B(\mathcal{H}^{\mathrm{in}})$,
N
i“1
A widely used class is the class of LOCC channels (LOCC standing for “Local
notions are not all equivalent, see Exercise 2.44.) More properties of this class will
In this section we will review some of the cones used commonly in quantum
information theory. We will distinguish between cones of operators and cones of
superoperators, and emphasize the distinction by using two different fonts: $\mathcal{C}$ denotes
a generic cone of operators and $\mathbf{C}$ a generic cone of superoperators.
2.4.1. Cones of operators. We start by describing some cones of operators
and by identifying their bases and their dual cones (Table 2.1). We work in a
Hilbert space $\mathcal{H}$ and the corresponding space $B^{sa}(\mathcal{H})$ of self-adjoint operators. The
vector $e$ chosen to define the base in (1.22) is the maximally mixed state. Here
and in what follows, we assume that separability and the PPT property are defined
with respect to a fixed bipartition $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$. However, most considerations
extend to multipartite variants and settings allowing flexibility in the choice of the
partition. In order to lighten the notation, we often write $\mathcal{PSD}$ and $\mathcal{SEP}$ instead of
$\mathcal{PSD}(\mathcal{H})$ and $\mathcal{SEP}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ unless this may cause ambiguity.
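Table 2.1. List of cones of operators. All cones live in $B^{sa}(\mathcal{H})$,

Cone of operators   Notation                                    Base                                                        Dual cone
Block-positive      $\mathcal{BP}$                              $\mathrm{BP}$                                               $\mathcal{SEP}$
Decomposable        $co\text{-}\mathcal{PSD} + \mathcal{PSD}$   $\operatorname{conv}(\mathrm{D} \cup \Gamma(\mathrm{D}))$   $\mathcal{PPT}$
Separable           $\mathcal{SEP}$                             $\mathrm{Sep}$                                              $\mathcal{BP}$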
In the same way that $\mathcal{PSD}$ is associated with its base $\mathrm{D}$, the set of separable
states $\mathrm{Sep}$ gives rise to the separable cone $\mathcal{SEP}$, and the set $\mathrm{PPT}$ of states with
positive partial transpose leads to the $\mathcal{PPT}$ cone. Another example is the cone
of $k$-entangled matrices (cf. Section 2.2.5). In general, whenever a definition of a
set of matrices involves linear matrix inequalities and a trace constraint, dropping
that constraint gives us a cone. When the original set of matrices is compact, the
resulting cone is pointed, with the hyperplane of trace zero matrices isolating 0 as
an exposed point (cf. Corollary 1.8). All the cones cataloged in this section have
this property and are in fact nondegenerate.
strictly larger than PSD and so its base contains matrices that are not states.
To conclude the review of the standard cones, we will identify the cone $\mathcal{SEP}^*$.
To that end, it is convenient to think of operators on a composite Hilbert space
$\mathbb{C}^m \otimes \mathbb{C}^n$ as block matrices $M = (M_{jk})_{j,k=1}^{m}$, where $M_{jk} \in \mathsf{M}_n$ (see Section 0.7).
Since the extreme rays of $\mathcal{SEP}$ are generated by pure separable states $|\xi \otimes \eta\rangle\langle\xi \otimes \eta|$
(see Section 2.2.3), we have
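(2.45) $M \in \mathcal{SEP}^* \iff \forall \xi \in \mathbb{C}^m,\ \forall \eta \in \mathbb{C}^n,\ \operatorname{Tr}\bigl(M\, |\xi \otimes \eta\rangle\langle\xi \otimes \eta|\bigr) \ge 0$
(2.46) $\iff \forall \xi \in \mathbb{C}^m,\ \sum_{j,k=1}^{m} \overline{\xi_j}\, \xi_k M_{jk} \in \mathcal{PSD}(\mathbb{C}^n).$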
The condition in (2.46) is usually referred to as $M = (M_{jk})$ being block-positive.
(We note that the definition treats $m$ and $n$ symmetrically, even though this is not
apparent in (2.46).) In other words, the dual to the cone of separable matrices is
that of block-positive matrices, denoted by $\mathcal{BP}$. As a consequence, the polar of $\mathrm{Sep}$
where $\mathrm{BP}$ denotes the set of block-positive matrices with unit trace and the minus
sign stands for the point reflection with respect to the appropriately normalized
identity matrix.
from $B^{sa}(\mathcal{H})$ to $B^{sa}(\mathcal{K})$ and denote the corresponding cones as $\mathbf{C}(\mathcal{H}, \mathcal{K})$, or as $\mathbf{C}(\mathcal{H})$
when $\mathcal{H} = \mathcal{K}$, or simply as $\mathbf{C}$ when there is no ambiguity. The cones we consider
most frequently are gathered in Table 2.2. (See Exercise 2.48 for a discussion of
identification and duality relations for $k$-positive superoperators and $k$-entangled
states.)
In the language of cones, a positivity-preserving superoperator $\Phi : B^{sa}(\mathcal{H}) \to
B^{sa}(\mathcal{K})$ may be defined via the condition $\Phi\bigl(\mathcal{PSD}(\mathcal{H})\bigr) \subset \mathcal{PSD}(\mathcal{K})$. It is readily
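Cone of superoperators    $\mathbf{C}$     $\mathcal{C}$                               $\mathcal{C}^*$                             $\mathbf{C}^*$
Positivity-preserving     $\mathbf{P}$     $\mathcal{BP}$                              $\mathcal{SEP}$                             $\mathbf{EB}$
Decomposable              $\mathbf{DEC}$   $co\text{-}\mathcal{PSD} + \mathcal{PSD}$   $\mathcal{PPT}$                             $\mathbf{PPT}$
Completely positive       $\mathbf{CP}$    $\mathcal{PSD}$                             $\mathcal{PSD}$                             $\mathbf{CP}$
PPT-inducing              $\mathbf{PPT}$   $\mathcal{PPT}$                             $co\text{-}\mathcal{PSD} + \mathcal{PSD}$   $\mathbf{DEC}$
Entanglement-breaking     $\mathbf{EB}$    $\mathcal{SEP}$                             $\mathcal{BP}$                              $\mathbf{P}$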
$\mathcal{PSD}(\mathbb{C}^n \otimes \mathbb{C}^m)$. This means that, with proper identifications (see Exercise 2.47),
the cone $\mathbf{CP}$ is self-dual. Choi’s correspondence $\Phi \mapsto C(\Phi)$ similarly relates the
cone $\mathbf{EB}(\mathbb{C}^m, \mathbb{C}^n)$ of entanglement-breaking maps from $\mathsf{M}_m^{sa}$ to $\mathsf{M}_n^{sa}$ to $\mathcal{SEP}(\mathbb{C}^n \otimes
\mathbb{C}^m)$, as well as the cone $\mathbf{PPT}(\mathbb{C}^m, \mathbb{C}^n)$ of PPT-inducing maps to $\mathcal{PPT}(\mathbb{C}^n \otimes \mathbb{C}^m)$.
A map $\Phi : \mathsf{M}_m^{sa} \to \mathsf{M}_n^{sa}$ is said to be co-completely positive if $C(\Phi) \in co\text{-}\mathcal{PSD}$.
Similarly, one says that $\Phi$ is decomposable if it can be represented as a sum of
a completely positive map and a co-completely positive map. It follows that the
correspondence $\Phi \mapsto C(\Phi)$ relates the cone $\mathbf{DEC}(\mathbb{C}^n, \mathbb{C}^m)$ of decomposable maps
to the cone of decomposable matrices.
$\iff \Phi \in \mathbf{P}$,
which is the claimed identification. The first equivalence is simply (2.45)–(2.46) for
the choice $M = C(\Phi)$, whereas the second one reflects the fact that the property
of “preserving positivity” needs to be checked only on the extreme rays of the
$\mathcal{PSD}$ cone, i.e., on operators of the form $|\xi\rangle\langle\xi|$. (See Section 1.2.2 and particularly
Corollary 1.10.)
cataloged in the present section. The argument is based on the following two simple
observations: first, since affine automorphisms preserve facial structure, and since 0
is the only extreme point of all the cones considered above, any affine automorphism
must be linear. Next, if $\Phi : \mathsf{M}_m^{sa} \to \mathsf{M}_n^{sa}$ is such that $A = \Phi(I)$ is positive definite,
then $\Psi$ defined by $\Psi(\rho) = A^{-1/2} \Phi(\rho) A^{-1/2}$ is unital, and its adjoint, $\Psi^*$, is
trace-preserving (see (2.36)). This often allows one to reduce the analysis of general maps to
that of unital or trace-preserving maps. As an example of such a reduction we will
prove the following statement.
Proposition 2.29 (Characterization of automorphisms of the PSD cone). Let
$\Phi : \mathsf{M}_n^{sa} \to \mathsf{M}_n^{sa}$ be an affine map which satisfies $\Phi(\mathcal{PSD}(\mathbb{C}^n)) = \mathcal{PSD}(\mathbb{C}^n)$. Then $\Phi$
is a linear automorphism of $\mathcal{PSD}(\mathbb{C}^n)$ and is of one of two possible forms: $\Phi(\rho) =
V \rho V^\dagger$ or $\Phi(\rho) = V \rho^T V^\dagger$, for some $V \in \mathsf{GL}(n, \mathbb{C})$. In the first case $\Phi$ is completely
positive, whereas in the second case $\Phi$ is co-completely positive.
Proof. Since $\operatorname{rank} \Phi \ge \dim \mathcal{PSD}(\mathbb{C}^n) = \dim \mathsf{M}_n^{sa}$, it follows that $\Phi$ is
surjective and hence injective, so it is indeed an automorphism of $\mathcal{PSD}(\mathbb{C}^n)$ (and,
consequently, so is $\Phi^{-1}$). By the earlier remark, $\Phi$ must be linear. Since the
adjoint of a positive map is positive (see Section 2.3.2), it follows that $\Phi^*$ and
$(\Phi^*)^{-1} = (\Phi^{-1})^*$ are positive. Hence they are both automorphisms of $\mathcal{PSD}(\mathbb{C}^n)$.
cal considerations, but can also be deduced from Proposition 1.4: if $A = \Phi^*(I)$ lay
on the boundary of $\mathcal{PSD}(\mathbb{C}^n)$, we would have $A \in F$ for some face of $\mathcal{PSD}(\mathbb{C}^n)$,
For future reference, we state here a slightly more general form of the principle
that is implicit in the proof of Proposition 2.29.
Lemma 2.31. If $\Phi : \mathsf{M}_m^{sa} \to \mathsf{M}_n^{sa}$ is a positivity-preserving linear map such
that $A = \Phi(I)$ is positive definite, then $\tilde\Phi$ defined by $\tilde\Phi(\rho) = A^{-1/2} \Phi(\rho) A^{-1/2}$ is
unital and positivity-preserving. Similarly, if $\Psi$ is a positivity-preserving linear
map such that $\Psi(\rho) \ne 0$ for $\rho \in \mathcal{PSD}(\mathbb{C}^m) \setminus \{0\}$, then $\tilde\Psi(\rho) = \Psi(B^{-1/2} \rho B^{-1/2})$ is
trace-preserving and positivity-preserving, where $B = \Psi^*(I)$ (necessarily positive
definite).
We emphasize that the map $\Phi$ in Lemma 2.31 is not assumed to be an
automorphism of the PSD cone (as was the case in Proposition 2.29), only
positivity-preserving. Moreover, we also allow the dimensions in the domain and in the range
to be different. Finally, recall that, by Lemma 1.7, the properties “$\Phi(I)$ is positive
definite” and “$\Psi(\rho) \ne 0$ for $\rho \in \mathcal{PSD}(\mathbb{C}^m) \setminus \{0\}$” are dual to each other.
In view of the above result, it is natural to wonder when a positivity-preserving
map is equivalent, in the sense of Lemma 2.31, to a map which is both unital and
trace-preserving. (Of course if the dimensions in the domain and in the range are
different, this is only possible if we use the normalized trace or, alternatively, if we
ask that the maximally mixed state be mapped to the maximally mixed state.) It
turns out that this can be ensured if just a little more regularity is assumed. (See
Exercise 2.52 for examples exploring the necessity of the stronger hypothesis.) We
have
Proposition 2.32 (Sinkhorn’s normal form for positive maps). Let $\Phi : \mathsf{M}_m^{sa} \to
\mathsf{M}_n^{sa}$ be a linear map which belongs to the interior of $\mathbf{P}$, the cone of
positivity-preserving maps. Then there exist positive operators $A \in \mathcal{PSD}(\mathbb{C}^n)$ and $B \in
\mathcal{PSD}(\mathbb{C}^m)$ such that the map $\tilde\Phi(\rho) = A \Phi(B \rho B) A$ is trace-preserving and maps the
maximally mixed state to the maximally mixed state (and is necessarily
positivity-preserving).
Proof. Let us first focus on the case $m = n$. Given positive definite $A$, $B$, let
Accordingly, by (2.36),
(2.49)
Solving the last equation in (2.49) for $B^2$ and substituting it in (2.48) we are led
to a system of equations
(2.50) $B^2 = \Phi^*(A^2)^{-1}$ and $\Bigl(\Phi\bigl(\Phi^*(A^2)^{-1}\bigr)\Bigr)^{-1} = A^2.$
The second equation in (2.50) says that $S = A^2$ is a fixed point of the function
(2.51) $S \mapsto f(S) := \Bigl(\Phi\bigl(\Phi^*(S)^{-1}\bigr)\Bigr)^{-1}.$
Conversely, if $S$ is a positive definite fixed point of $f$, then $A = S^{1/2}$ and $B =
\Phi^*(A^2)^{-1/2}$ (i.e., $B$ defined so that the first equation in (2.50) holds) satisfy (2.48)
and (2.49) and yield $\tilde\Phi$ that is unital and trace-preserving. (The hypothesis “$\Phi$
belongs to the interior of $\mathbf{P}$” guarantees that all the inverses and negative powers
above make sense, and that $f$ is well-defined and continuous on $\mathcal{PSD} \setminus \{0\}$, see
Exercises 2.50 and 2.51.)
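For readers who wish to see the intermediate step written out, the two normalization conditions can be reconstructed directly from the definition $\tilde\Phi(\rho) = A\Phi(B\rho B)A$. The sketch below assumes that the conditions referred to as (2.48) and (2.49) are, respectively, unitality of $\tilde\Phi$ and trace preservation of $\tilde\Phi$, the latter expressed through the adjoint $\tilde\Phi^*(\sigma) = B\Phi^*(A\sigma A)B$; this is our reading of the argument and not a quotation of those displays.

```latex
\begin{align*}
\tilde\Phi(I) = I
  &\iff A\,\Phi(B^2)\,A = I
   \iff \Phi(B^2) = A^{-2}, \\
\tilde\Phi^*(I) = I
  &\iff B\,\Phi^*(A^2)\,B = I
   \iff \Phi^*(A^2) = B^{-2}
   \iff B^2 = \Phi^*(A^2)^{-1}, \\
\Phi\bigl(\Phi^*(A^2)^{-1}\bigr) = A^{-2}
  &\iff \Bigl(\Phi\bigl(\Phi^*(A^2)^{-1}\bigr)\Bigr)^{-1} = A^2 .
\end{align*}
```

The last line is obtained by inserting the expression for $B^2$ from the second line into the first, which is exactly the fixed-point condition for $S = A^2$.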
To find a fixed point of $f$ we want to use Brouwer’s fixed-point theorem, which
requires a (continuous) function that is a self-map of a compact convex set. One
way to arrive at such a setting is to consider $f_1 : \mathrm{D}(\mathbb{C}^n) \to \mathrm{D}(\mathbb{C}^n)$ defined by
(2.52) $f_1(\sigma) = \frac{f(\sigma)}{\operatorname{Tr} f(\sigma)}.$
It then follows that there is $\sigma_0 \in \mathrm{D}(\mathbb{C}^n)$ such that $f_1(\sigma_0) = \sigma_0$ and hence $f(\sigma_0) =
t\sigma_0$, where $t = \operatorname{Tr} f(\sigma_0) > 0$. The final step is to note that if we choose, as before,
$A = \sigma_0^{1/2}$ and $B = \Phi^*(A^2)^{-1/2}$, then the corresponding $\tilde\Phi$ is trace-preserving and
satisfies $\tilde\Phi(I) = t^{-1} I$. If $m = n$, this is only possible if $t = 1$. In other words, $\sigma_0$ is a
fixed point of $f$ that we needed in order to conclude the argument. In the general
case, the same argument yields $t = n/m$, which translates to $\tilde\Phi(I/m) = I/n$, again
as needed.
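Exercise 2.49. Show that $\Phi \in \mathbf{P}(\mathbb{C}^n)$ is an automorphism of $\mathcal{PSD}(\mathbb{C}^n)$ if and
only if it is rank-preserving.
Exercise 2.50 (Descriptions of the interior of the positive cone). Show that
$\Phi$ belongs to the interior of $\mathbf{P}(\mathbb{C}^n)$ iff $\Phi$ maps $\mathcal{PSD}(\mathbb{C}^n) \setminus \{0\}$ to the interior of
$\mathcal{PSD}(\mathbb{C}^n)$ iff there exists $\delta > 0$ such that $\Phi(\rho) \ge \delta (\operatorname{Tr} \rho)\, I$ for all $\rho \in \mathcal{PSD}$.
Exercise 2.51 (Interior of the positive cone is self-dual). Show that $\Phi$ verifies
$\Phi(\rho) \ge \delta (\operatorname{Tr} \rho)\, I$ (for all $\rho \in \mathcal{PSD}$) iff $\Phi^*$ does.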
definite, but $\Phi$ is not equivalent (in the sense of Proposition 2.32) to a unital,
trace-preserving map, and (b) $\Psi$ is unital and trace-preserving, but $\Psi \in \partial\mathbf{P}$.
Exercise 2.53 (Rank nondecreasing and Sinkhorn’s normal form). Give an
example of a map $\Phi \in \mathbf{P}(\mathbb{C}^2, \mathbb{C}^2)$ which is rank nondecreasing (i.e., verifies $\operatorname{rank} \Phi(\rho) \ge
\operatorname{rank} \rho$ for any $\rho \in \mathrm{D}(\mathbb{C}^2)$), but which does not satisfy the conclusion of Proposition
2.32.
fications of the dual cone $\mathcal{SEP}^*$ as $\mathcal{BP}$ (see Table 2.1 in Section 2.4), and of the
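uct state and $\Phi$ is positivity-preserving, then $(\Phi \otimes \mathrm{Id})\rho = \Phi(\tau) \otimes \tau'$, which is clearly
positive; the case of convex combinations of product states easily follows. To show
necessity, let $\Psi : \mathsf{M}_n^{sa} \to \mathsf{M}_m^{sa}$ be the positivity-preserving map given by Proposition
2.33. If $\chi \in \mathbb{C}^n \otimes \mathbb{C}^n$ is the maximally entangled vector as in (2.32), then
$0 > \operatorname{Tr}(C(\Psi)\rho) = \langle C(\Psi), \rho\rangle_{HS} = \langle (\Psi \otimes \mathrm{Id}_{\mathsf{M}_n^{sa}}) |\chi\rangle\langle\chi|, \rho\rangle_{HS}$
$= \langle |\chi\rangle\langle\chi|, (\Psi^* \otimes \mathrm{Id}_{\mathsf{M}_n^{sa}})\rho\rangle_{HS} = \langle\chi|(\Psi^* \otimes \mathrm{Id}_{\mathsf{M}_n^{sa}})\rho|\chi\rangle,$
which implies that $\bigl(\Psi^* \otimes \mathrm{Id}_{\mathsf{M}_n^{sa}}\bigr)\rho$ is not positive. Given that $\Psi^*$ is
positivity-preserving if and only if $\Psi$ is (see Section 2.3.2), the choice of $\Phi = \Psi^*$ works as
needed.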
Remark 2.35. It follows from general considerations that the entanglement
witnesses $\sigma$, $\Phi$ may be required to satisfy various additional properties. First, one
may include a normalizing condition such as $\operatorname{Tr} \sigma = 1$ or $\operatorname{Tr} \Phi(I) = 1$, which reduces
the search for a witness to a convex compact set. Next, since linear functions
(restricted to compact sets) attain extreme values on extreme points, one may
insist that $\sigma$ or $\Phi$ belong to an extreme ray of the respective cone (or even, by a
density argument, to an exposed ray; cf. Exercise 1.5). Finally, another acceptable
$\Phi(I)$ may be assumed to be positive definite, in which case Lemma 2.31 applies.
The case of the trace-preserving restriction is slightly more involved and requires
increasing the dimension of the range of $\Phi$. We relegate the details of the arguments
to Exercises 2.54 and 2.55.
Exercise 2.54 (Unital witnesses suffice). Show that in Theorem 2.34 one can
one can require that $\Phi$ be trace-preserving, at the cost of allowing the range of $\Phi$
to be $\mathsf{M}_{m+n}^{sa}$.
turn follows easily from very classical facts. The second proof handles first the maps
generating extreme rays of $\mathbf{P}(\mathbb{C}^2)$, and concludes via the Krein–Milman theorem.
Here are the details.
Proof # 1 of Theorem 2.36. The crucial observation is that it suffices to
show that the interior of $\mathbf{P}(\mathbb{C}^2)$ is contained in $\mathbf{DEC}(\mathbb{C}^2)$. The needed inclusion
$\mathbf{P}(\mathbb{C}^2) \subset \mathbf{DEC}(\mathbb{C}^2)$ follows then from both cones being closed, and being the
closures of their interiors.
To that end, suppose that $\Phi$ belongs to the interior of $\mathbf{P}(\mathbb{C}^2)$. Proposition
2.32 implies then that there exist positive operators $A, B \in \mathsf{M}_2^{sa}$ and a
positivity-preserving, unital and trace-preserving map $\tilde\Phi : \mathsf{M}_2^{sa} \to \mathsf{M}_2^{sa}$ such that $\Phi(\rho) =
A^{-1} \tilde\Phi\bigl(B^{-1} \rho B^{-1}\bigr) A^{-1}$ for all $\rho \in \mathsf{M}_2^{sa}$. In other words, $\Phi = \Phi_{A^{-1}} \circ \tilde\Phi \circ \Phi_{B^{-1}}$, where
$\Phi_M(\rho) := M \rho M^\dagger$. Since every $\Phi_M$ is completely positive, the composition rules for
completely positive and co-completely positive maps (see Exercises 2.26 and 2.46)
show that the problem reduces to establishing decomposability of $\tilde\Phi$.
that preserves the center, it may be thought of as a linear map $R \in B(\mathbb{R}^3)$ with
$\|R\|_\infty \le 1$. Such maps are convex combinations of elements of $\mathsf{O}(3)$ (cf. Exercises
1.44 and 1.45), which in turn correspond to maps of the form (i) $\rho \mapsto U \rho U^\dagger$ or (ii)
$\rho \mapsto U \rho^T U^\dagger$ for some $U \in \mathsf{U}(2)$ (depending on whether the said element of $\mathsf{O}(3)$
belongs to $\mathsf{SO}(3)$ or not). This is a very special and elementary case of Kadison’s
Theorem 2.4, and was explained in the proof of Wigner’s Theorem 2.3 (see also
Exercise B.4 for the isomorphism $\mathsf{PSU}(2) \leftrightarrow \mathsf{SO}(3)$). It remains to recall that the
maps of form (i) are completely positive and those of form (ii) are co-completely
positive.
Remark 2.37. The above argument, when combined with the result from Ex-
2.21; actually, since $A$ is itself of rank one, it follows that $C(\Phi)$ is in fact separable
and hence that $\Phi$ is entanglement-breaking, see Lemmas 2.20 and 2.27).
Notes and Remarks
Classical references for the mathematical aspects of quantum information the-
ory are [NC00, Hol12, Wil17]. We also recommend [Wat].
Section 2.1. A general reference for the geometry of quantum states is the
book [BŻ06]. Wigner’s theorem appears in [Wig59] and Kadison’s theorem in
[Kad65] in a broader context. Elementary proofs can be found in [Hun72, Sim76]
and recent generalizations in [SCM16, Stø16].
Section 2.2. The definition of separability for mixed states was introduced in
[Wer89]. The NP-hardness of deciding whether a state is separable was shown in
[Gur03]. The argument sketched in Exercise 2.10 about the number of product
vectors needed to represent any separable state is from [CÐ13].
Werner states were introduced in [VW01], where the question of their separa-
bility (Proposition 2.16) is also discussed.
Theorem 2.10 was proved in [DPS04]. For more information about k-ex-
tendibility and the symmetric subspace (also in the multipartite setting) we refer to
the survey [Har13]. An early reference for k-entangled states is [TH00]. See Notes
and Remarks on Chapter 9 for quantitative results about the hierarchies defined in
Section 2.2.5.
The observation that non-PPT states are entangled (Peres–Horodecki criterion,
Størmer [Stø63] and Woronowicz [Wor76]. See Notes and Remarks on Section 2.4
for more information.
examples (in higher dimensions) are presented, e.g., in [BDM+99]. A geometric
Section 2.3. The Jamiołkowski isomorphism can be traced to [Jam72]. Choi’s
and Jamiołkowski’s isomorphisms are seldom distinguished in the literature; a dis-
cussion of the difference between the two appears in [LS13].
Choi’s Theorem 2.21 as stated was proved in [Cho75a], which also contains
a description of extreme completely positive unital maps. Closely related state-
ments (including variants of Stinespring’s Theorem 2.24) varying by the level of
abstractness were arrived at (largely) independently by various authors, see, e.g.,
[Sti55, Kra71, Kra83].
Proposition 2.26 is from [LS93] and the argument from Exercise 2.34 is based on
more general results from [RSW02] which give various descriptions of all quantum
channels between qubits and of extreme points of the set of such channels.
For elementary properties of the diamond norm, see Section 3.3.4 in [Wat]
(where it is studied under the name completely bounded trace norm). Entanglement-
breaking channels were studied in detail in [HSR03].
The example from Exercise 2.29 is from [Tom85]. Exercise 2.44 is from [Wat],
to which we also refer for a discussion of the class of LOCC channels.
The result from Proposition 2.32 and its derivation from Brouwer’s fixed-point
theorem appear in [Ide13, Ide16, AS15]. A similar statement (proved via an
iterative construction) appeared in [Gur03] for positive maps Φ which are “rank
non-decreasing” (however, not all such maps satisfy the conclusion of Proposition
2.32, see Exercise 2.53). The validity of Proposition 2.32 for completely positive
maps is simpler and well known, see for example [GGHE08] and its references.
The original Sinkhorn’s theorem (for matrices, or for maps preserving the positive
orthant in Rn ) goes back to [Sin64]; see [Ide16] for an extensive survey of related
topics.
Theorem 2.34 is from [HHH96]. The concept of optimal entanglement witness
which appears in Exercise 2.56 was investigated in [LKCH00].
Størmer’s Theorem 2.36 was initially proved in [Stø63]; the original formulation
involved the second of the two statements. The first proof presented here seems
to be new and was a byproduct of the work on this book [AS15]. The scheme
behind the second proof was apparently folklore for some time; it was documented
in [MO15]. The novelty of its current presentation, if any, consists in streamlining
of the proof of Proposition 2.38. (For more background information on Proposition
2.38, see Appendix C.) Other proofs (of either of the two versions given in Theorem
2.36) appeared in [KCKL00, VDD01, LMO06, KVSW09, Stø13]. A recent
study of positivity-preserving maps on M3 can be found in [MO16]. While [MO16]
is focused on the unital trace-preserving case, it is likely that (particularly when
combined with our Proposition 2.32) it may provide a clear picture of the more
general setting. In particular, it may lead to a simple and transparent proof of the
$\mathbb{C}^2 \otimes \mathbb{C}^3$ case of Theorem 2.15 (Woronowicz’s Theorem).
CHAPTER 3
This section is addressed primarily to mathematicians who are new to quantum
information theory. Its purpose is to indicate why various mathematical concepts
enter the theory, and to give an idea of their physical meaning or interpretation.
We make no attempt at being comprehensive; our attention is restricted to the
constructs that play a central role in this book and that we ourselves have found
(and still find) puzzling, such as mixed states and completely positive maps. In
any case, neither of the authors being a physicist, the scope (and the depth) of the
presentation will necessarily be limited.
This section is designed to be essentially independent of the rest of the book.
The only “non-mainstream” technical device that is indispensable for following it
is the Dirac bra-ket notation (see Section 0.3). The discussion will be occasionally
informal in order for the readers to acquaint themselves with concepts that are
presented more rigorously elsewhere in the book.
on time is governed by some evolution equation (for example, the Schrödinger
equation) and is necessarily unitary: given $t > 0$, there is a unitary operator $U_t$ such
that if the state of the particle at time 0 is described by $\psi_0$ (a priori unknown),
tion. Other physical properties of the particle are exhibited similarly. In particular,
if a given physical quantity is discrete, then there is an orthonormal sequence (or
basis) $(u_j)$, indexed by possible values of the quantity in question, such that the
probability of obtaining the $j$th value during measurement is $|\langle\psi, u_j\rangle|^2$. This is the
simplest case of the so-called Born rule. In a way, the actual values of the physical
quantity are of secondary importance and one simply says that “a measurement was
performed in the basis $(u_j)$” or that “$(u_j)$ is the computational basis” for this
particular measuring/experimental setup. (We will briefly discuss other, more general
measurement schemes in Section 3.6.)
It should be emphasized that it is possible for measurement results to be
deterministic. If the basis $(u_j)$ is such that $\psi = u_{j_0}$ for some $j_0$, then measuring $\psi$
in the basis $(u_j)$ will yield the $j_0$th outcome with probability 1. For the same reason,
two states $\psi$ and $\varphi$ are in principle perfectly distinguishable if (and only if) they
are orthogonal; one then “merely” needs to arrange a measurement in a basis that
contains both $\psi$ and $\varphi$.
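The Born rule and the remark on deterministic outcomes can be checked numerically. The sketch below, in plain Python, uses names of our own choosing (`inner`, `born_probabilities`) and takes the inner product to be conjugate-linear in the first argument; since only the modulus enters, the other convention gives the same probabilities.

```python
import math

def inner(phi, psi):
    """Inner product <phi, psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def born_probabilities(psi, basis):
    """Probabilities |<psi, u_j>|^2 of the outcomes j for a measurement in `basis`."""
    return [abs(inner(psi, u)) ** 2 for u in basis]

s = 1 / math.sqrt(2)
computational = [[1 + 0j, 0j], [0j, 1 + 0j]]        # |0>, |1>
hadamard = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]    # |+>, |->

plus = [s + 0j, s + 0j]                             # psi = |+>

# In the computational basis both outcomes are equally likely.
p_comp = born_probabilities(plus, computational)
# In a basis containing psi itself the outcome is deterministic.
p_had = born_probabilities(plus, hadamard)
```

The same state thus yields a uniformly random result in one basis and a certain result in another, which is the phenomenon described above.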
example the spin of an electron or the polarization of a photon.
Next, it is apparent from the discussion in Section 3.1 that no measurement
can distinguish between the wave functions $\psi$ and $\omega\psi$, where $\omega \in \mathbb{C}$ with $|\omega| = 1$,
and so the “true” state space is the complex projective space $\mathsf{P}(\mathbb{C}^d)$ (or $\mathbb{CP}^{d-1}$) for
$d$-level systems. Another mathematical scheme that conveniently disregards scalar
factors is to consider not a unit vector $\psi \in \mathbb{C}^d$, but the orthogonal projection onto
$\mathbb{C}\psi$ or, in the language of matrices, the outer product $\rho = |\psi\rangle\langle\psi| \in \mathsf{M}_d$. In that
language, when a measurement is performed in some basis $(u_j)$, the probability of
the $j$th outcome is
(3.1) $|\langle\psi, u_j\rangle|^2 = \langle u_j, \psi\rangle \langle\psi, u_j\rangle = \langle u_j|\rho|u_j\rangle = \operatorname{Tr}\bigl(\rho\, |u_j\rangle\langle u_j|\bigr).$
3.3. Composite systems and quantum marginals; mixed states
This section gives motivation to the definition of (mixed) quantum states which
appeared in Section 2.1.1.
the state spaces of the components (subsystems, particles, ...) are Hilbert spaces
$\mathcal{H}_1, \ldots, \mathcal{H}_m$, the state space of the composite system is the tensor product $\mathcal{K} =$
only to the H part. (This may be the case when H describes the state inside an
and let us try to figure out the H-marginal of ψ, i.e., the state on H, measurements
of which “within H” are consistent with hypothetical measurements of the complete
state ψ.
If $\psi = \xi \otimes \eta$ (a product vector), the result is as expected: the $\mathcal{H}$-marginal of $\psi$ is
$\xi$. To check this, we note that if we measure $\xi$ in some basis $(u_j)$ of $\mathcal{H}$, we obtain the
$j$th outcome with probability $p_j = |\langle\xi, u_j\rangle|^2$. For a different point of view, suppose
that we have access to the entire system and that we perform a measurement in the
basis $(u_j \otimes v_k)_{j,k}$, where $(v_k)$ is some basis of $\mathcal{E}$. The probability of obtaining the
$(j, k)$th outcome is then $q_{jk} = |\langle\xi \otimes \eta, u_j \otimes v_k\rangle|^2 = |\langle\xi, u_j\rangle|^2 \cdot |\langle\eta, v_k\rangle|^2$. Summing
over $k$, we again find that the probability of the $j$th outcome on the first component
is $|\langle\xi, u_j\rangle|^2 = p_j$. This is simply a verification that the probability distribution $(p_j)$
is the (first) marginal of $(q_{jk})$ and that, moreover, product vectors lead to product
distributions, or to independent random variables. Another way to express this
marginal probability is $p_j = \operatorname{Tr} \rho P_{u_j}$, where $\rho = |\psi\rangle\langle\psi|$, and where $P_u = |u\rangle\langle u| \otimes I_{\mathcal{E}}$
is the orthogonal projection onto the subspace $u \otimes \mathcal{E}$ of $\mathcal{H} \otimes \mathcal{E}$. This calculation
makes perfect sense even if $\psi$ is not a product vector, and it makes clear that $p_j$
does not depend on, say, the choice of the basis of $\mathcal{E}$.
Consider now $\psi \in \mathcal{H} \otimes \mathcal{E}$, which is not a product vector. Let
(3.2) $\psi = \sum_{i=1}^{r} a_i\, \xi_i \otimes \eta_i$
be its Schmidt decomposition (see Section 2.2.2), necessarily with $r \ge 2$. Since
the $\mathcal{H}$-marginal of $\xi_i \otimes \eta_i$ is $\xi_i$, it is tempting to guess that the $\mathcal{H}$-marginal of $\psi$ is
$\sum_{i=1}^{r} a_i \xi_i$. However, one should immediately become suspicious: for any choice of
(complex) signs $\omega_i$, the vector $\sum_{i=1}^{r} a_i \omega_i \xi_i$ is an equally valid candidate, and while
the state remains unchanged if you multiply a vector by a complex number $\omega$ with
$|\omega| = 1$, it may change radically if you multiply different (non-zero) components
by different numbers. A more careful analysis is needed, and it turns out that the
proper language to describe marginals is that of matrices. In the notation of the
preceding paragraph we have
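$p_j = \operatorname{Tr}\bigl(|\psi\rangle\langle\psi| P_{u_j}\bigr) = \operatorname{Tr}\Bigl[\Bigl(\sum_{i,l=1}^{r} a_i \bar{a}_l\, |\xi_i\rangle\langle\xi_l| \otimes |\eta_i\rangle\langle\eta_l|\Bigr)\bigl(|u_j\rangle\langle u_j| \otimes I_{\mathcal{E}}\bigr)\Bigr]$
$= \sum_{i,l=1}^{r} a_i \bar{a}_l \operatorname{Tr}\bigl[\bigl(|\xi_i\rangle\langle\xi_l|\bigr)\bigl(|u_j\rangle\langle u_j|\bigr)\bigr] \operatorname{Tr}\bigl(|\eta_i\rangle\langle\eta_l|\bigr)$
$= \operatorname{Tr}\Bigl[\Bigl(\sum_{i=1}^{r} |a_i|^2 |\xi_i\rangle\langle\xi_i|\Bigr) |u_j\rangle\langle u_j|\Bigr]$
(3.3) $= \langle u_j| \Bigl(\sum_{i=1}^{r} |a_i|^2 |\xi_i\rangle\langle\xi_i|\Bigr) |u_j\rangle.$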
In other words, the probability that a measurement performed in a basis $(u_j)$ yields
(3.4) $\rho_{\mathcal{H}} = \sum_{i=1}^{r} |a_i|^2 |\xi_i\rangle\langle\xi_i|.$
So the mixed state $\rho_{\mathcal{H}}$ fits the role of the $\mathcal{H}$-marginal of the “global” state $\rho = \rho_{\mathcal{HE}} =
|\psi\rangle\langle\psi|$. Therefore, while in principle the state of a quantum system is described
measurement in a global basis, and we therefore have to rely on mixed states for
modeling such systems. To use the Platonic analogy, a mixed state is “the shadow
on the wall” of our cave, comprising all the features of the “idea” (or “form”) ψ that
are accessible to our perception.
A more heuristic explanation of the formula for the marginal is that from the
perspective of $\mathcal{H}$ the state of our system is $\xi_i$ with probability $p_i = |a_i|^2$, and so we
need to compute the weighted average of probabilities corresponding to $\rho = |\xi_i\rangle\langle\xi_i|$.
Since the expression $\operatorname{Tr}\bigl(\rho\, |u\rangle\langle u|\bigr)$ is linear in $\rho$, the average can be performed inside
the trace, whence the formula for $\rho_{\mathcal{H}}$. We encourage readers who are not used to
the bra-ket formalism to work out the details of several variants of this calculation
outlined in Exercise 3.1.
The key features of the marginal $\rho_{\mathcal{H}}$ are that it is canonical (for example, it
does not depend on the basis $(u_j)$ of $\mathcal{H}$ in which the measurement is performed)
and that it encodes all the information that can be obtained about the global state
by measurements inside $\mathcal{H}$. In particular, if $\rho_{\mathcal{H}}$ is truly mixed (i.e., not pure, with
$r \ge 2$ in (3.2) or in (3.4)), then there are no measurements inside $\mathcal{H}$ that are
deterministic.
A simple but spectacular demonstration of this phenomenon is provided by the Bell states
on $\mathbb{C}^2 \otimes \mathbb{C}^2$: $\rho = |\psi\rangle\langle\psi|$ with $\psi$ being (for example) one of the four Bell vectors
$\varphi^{\pm} = \frac{1}{\sqrt{2}} (|00\rangle \pm |11\rangle), \quad \psi^{\pm} = \frac{1}{\sqrt{2}} (|01\rangle \pm |10\rangle),$
where $|0\rangle, |1\rangle$ is the canonical basis of $\mathbb{C}^2$ (recall that $|00\rangle$ stands for $|0\rangle \otimes |0\rangle$). It
is easily seen that in each case the marginal of $\rho$ on either $\mathbb{C}^2$ factor is $(|0\rangle\langle 0| +
|1\rangle\langle 1|)/2 = I/2$. Consequently, when measuring in any basis $(u_1, u_2)$ (of, say, the
first factor), each of the two outcomes occurs with probability $1/2$, and so the
results of such measurements, in and of themselves, tell us nothing. In particular,
they cannot help us distinguish between $\varphi^+$, $\varphi^-$, $\psi^+$, $\psi^-$, even though a global
measurement performed in the basis consisting of these four vectors would tell
them apart perfectly.
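The claim about the marginals of the Bell states is easy to verify by direct computation. The following plain Python sketch uses our own helper names (`projector`, `partial_trace_second`) and orders the product basis as $|00\rangle, |01\rangle, |10\rangle, |11\rangle$, which is an assumption about indexing rather than something fixed by the text.

```python
import math

def projector(psi):
    """Rank-one operator |psi><psi| as a nested list."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def partial_trace_second(rho, d1, d2):
    """Trace out the second factor of an operator on C^d1 (x) C^d2.

    The row/column index j*d2 + k corresponds to the basis vector |j> (x) |k>.
    """
    return [[sum(rho[j * d2 + k][l * d2 + k] for k in range(d2))
             for l in range(d1)]
            for j in range(d1)]

s = 1 / math.sqrt(2)
bell_vectors = {
    "phi+": [s, 0, 0, s],
    "phi-": [s, 0, 0, -s],
    "psi+": [0, s, s, 0],
    "psi-": [0, s, -s, 0],
}

# Marginal on the first C^2 factor for each of the four Bell states.
marginals = {name: partial_trace_second(projector(v), 2, 2)
             for name, v in bell_vectors.items()}
```

Up to floating-point rounding, all four entries of `marginals` equal $I/2$, so no measurement on the first factor alone can tell the four vectors apart.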
Exercise 3.1. Perform alternative calculations of the probabilities from (3.3)
according to the following outline. Consider a product basis $(u_j \otimes v_k)_{j,k}$ of $\mathcal{H} \otimes \mathcal{E}$.
If $\rho = |\psi\rangle\langle\psi|$ with $\psi = \sum_{i=1}^{r} a_i \xi_i \otimes \eta_i$, the probability of the $(j, k)$th outcome will
be, by (3.1),
$q_{jk} = \Bigl|\Bigl\langle \sum_{i=1}^{r} a_i \xi_i \otimes \eta_i,\ u_j \otimes v_k \Bigr\rangle\Bigr|^2 = \operatorname{Tr} \rho\, \bigl(|u_j\rangle\langle u_j| \otimes |v_k\rangle\langle v_k|\bigr).$
Finally, retrieve $p_j = \sum_k q_{jk}$ by expanding either the second or the third expression
in the above.
concept of partial trace, which is defined as follows (see also Section 2.2.1). First,
for any operator (self-adjoint or not) on a composite Hilbert space $\mathcal{H} \otimes \mathcal{E}$ which is
Next, we extend this operation to all operators by linearity (which is possible be-
variables $X$, $Y$ with joint density $f(x, y)$, the marginal density of $X$ is obtained by
integrating $f$ with respect to $y$.
Another point which needs to be clarified is that the set of mixed states on $\mathcal{H}$
that may be obtained as $\mathcal{H}$-marginals of pure states on composite systems $\mathcal{H} \otimes \mathcal{E}$
(for some auxiliary space $\mathcal{E}$) is exactly the set $\mathrm{D}(\mathcal{H})$ of positive semi-definite trace
one operators (usually referred to as density matrices, particularly if $\mathcal{H} = \mathbb{C}^d$). This
is the consequence of the following computation: if $\rho \in \mathrm{D}(\mathcal{H})$, and $\rho = \sum_i \lambda_i |\xi_i\rangle\langle\xi_i|$
is its spectral decomposition, then choosing $\mathcal{E} = \mathcal{H}$ and $\psi = \sum_i \sqrt{\lambda_i}\, \xi_i \otimes \xi_i$ ensures
that $\operatorname{Tr}_{\mathcal{E}}(|\psi\rangle\langle\psi|) = \rho$. We say that $|\psi\rangle\langle\psi|$ (or simply $\psi$) is a purification of $\rho$.
Clearly, the Schmidt rank of $\psi$ (always) equals $\operatorname{rank} \rho =: r$. Moreover, the minimal
dimension of $\mathcal{E}$ for which a purification of $\rho$ exists in $\mathcal{H} \otimes \mathcal{E}$ is also equal to $r$. Even
though this construction is abstract, it is canonical in the following sense: if $\rho$ is
a physical state on $\mathcal{H}$ that is the $\mathcal{H}$-marginal of a physical pure state $\psi \in \mathcal{H} \otimes \mathcal{E}$
(where $\mathcal{E}$ is the environment relative to $\mathcal{H}$), then we must have $\psi = \sum_{i=1}^{r} \sqrt{\lambda_i}\, \xi_i \otimes \eta_i$
for some basis $(\eta_i)$ of $\mathcal{E}$. (The only catch is that $(\eta_i)$ may not be the most natural
basis of $\mathcal{E}$.)
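The purification construction can be illustrated in the simplest situation, where $\rho$ is already diagonal in the canonical basis, so that the $\xi_i$ are the basis vectors and the $\lambda_i$ are the diagonal entries. The sketch below is plain Python with helper names of our own choosing, and the spectrum used is a hypothetical example.

```python
import math

def purify_diagonal(eigenvalues):
    """psi = sum_i sqrt(lambda_i) e_i (x) e_i in C^d (x) C^d, as a flat list of length d*d."""
    d = len(eigenvalues)
    psi = [0.0] * (d * d)
    for i, lam in enumerate(eigenvalues):
        psi[i * d + i] = math.sqrt(lam)
    return psi

def marginal_on_first_factor(psi, d):
    """Tr_E |psi><psi| for psi in C^d (x) C^d (real amplitudes suffice here)."""
    return [[sum(psi[j * d + k] * psi[l * d + k] for k in range(d))
             for l in range(d)]
            for j in range(d)]

eigenvalues = [0.5, 0.3, 0.2, 0.0]     # hypothetical spectrum of a rank-3 state on C^4
psi = purify_diagonal(eigenvalues)
rho = marginal_on_first_factor(psi, 4)

# psi is already written in Schmidt form, so its Schmidt rank is the
# number of nonzero coefficients.
schmidt_rank = sum(1 for amplitude in psi if amplitude != 0.0)
```

Here `rho` reproduces the diagonal state with the given spectrum (up to rounding), and `schmidt_rank` equals the rank of $\rho$, in agreement with the statement above; an environment of dimension 3 would therefore already suffice for this example.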
3.5. Unitary evolution and quantum operations; the completely
positive maps
As mentioned earlier, the evolution of a quantum system is unitary, i.e., if
$t_0 < t_1$, then there is a unitary operator $U$ such that if the state of the system at
time $t_0$ (the initial state) is described by a vector $\psi$ (which is a priori general and/or
unknown), then its state at time $t_1$ (the terminal state) will be $U\psi$. ($U$ depends
on the physical laws governing the evolution, and we may be able to control some
of its parameters, but it is independent of $\psi$.) If we switch to the language of
from the identity $V \operatorname{Tr}_{\mathcal{E}}(\rho) V^\dagger = \operatorname{Tr}_{\mathcal{E}}(U \rho U^\dagger)$, valid for $U = V \otimes W$ and for any
matrix $\rho$.)
The situation becomes more complicated in a case where the evolution of the
subsystem $\mathcal{H}$ and the environment $\mathcal{E}$ are not decoupled, i.e., where $U$ is not a
product of two unitaries. Even if the initial state of the system is a product vector
$\psi = \xi \otimes \eta$, there is no reason why the terminal state $U\psi$, which can a priori be
arbitrary, should be of that form. In other words, even if the initial $\mathcal{H}$-marginal
$\sigma = |\xi\rangle\langle\xi|$ is pure, the terminal marginal may be mixed. In particular, the evolution
of the marginal is not necessarily unitary. Moreover, for fixed $\xi$, different values
of the initial $\mathcal{E}$-marginal $\eta$ may result in radically different values of the terminal
$\mathcal{H}$-marginal.
However, this is neither surprising nor fatal. First, if there is interaction between
our subsystem $\mathcal{H}$ and the environment $\mathcal{E}$, it is to be expected that the terminal
state of $\mathcal{H}$ possibly depends on the state of $\mathcal{E}$. Second, while we may not know what
the initial state of $\mathcal{E}$ is, we can simply think of it as an external parameter affecting
the evolution of our subsystem $\mathcal{H}$, which is the only one we can manipulate, control
and measure.
We now want to come up with a formula that generalizes the unitary evolution
$\rho \mapsto U \rho U^\dagger$ or, more precisely, that is the “shadow on the wall of our cave” of the
unitary evolution. Let us start again with the global initial state being a product
vector $\psi = \xi \otimes \eta$; the terminal state is then represented by the vector $U(\xi \otimes \eta)$.
Since $\eta$ is assumed to be fixed, we can omit the dependence on $\eta$ in the description
and simply talk about an (a priori arbitrary) isometry $\xi \mapsto V\xi \in \mathcal{H} \otimes \mathcal{E}$. (Of course,
since by definition $V\xi = U(\xi \otimes \eta)$, $V$ does implicitly depend on $\eta$.) In the language
of density matrices, the evolution of the $\mathcal{H}$-marginal is then given by
(3.5) $\sigma \mapsto \operatorname{Tr}_{\mathcal{E}} V \sigma V^\dagger$,
where $\sigma = |\xi\rangle\langle\xi|$ is the initial marginal (cf. Theorem 2.24).
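If we want to give a description of the evolution that is intrinsic to $\mathcal{H}$, we may
proceed as follows. Let $(v_i)$ be an orthonormal basis of $\mathcal{E}$. The isometry $V$ can be
represented as $V\xi = \sum_i (A_i \xi) \otimes v_i$ for some operators $A_i \in B(\mathcal{H})$. Consequently,
$V \sigma V^\dagger = \sum_{i,j} |A_i \xi\rangle\langle A_j \xi| \otimes |v_i\rangle\langle v_j| = \sum_{i,j} \big(A_i |\xi\rangle\langle\xi| A_j^\dagger\big) \otimes |v_i\rangle\langle v_j|$
and further,
$\operatorname{Tr}_{\mathcal{E}} V \sigma V^\dagger = \sum_{i,j} \big(A_i |\xi\rangle\langle\xi| A_j^\dagger\big) \operatorname{Tr} |v_i\rangle\langle v_j| = \sum_i A_i |\xi\rangle\langle\xi| A_i^\dagger.$
(3.6) $\sigma \mapsto \sum_i A_i \sigma A_i^\dagger.$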
Given that for self-adjoint operators $A, B \in B(\mathcal{H})$ the condition $\langle\xi|A|\xi\rangle = \langle\xi|B|\xi\rangle$
for all $\xi \in \mathcal{H}$ implies $A = B$, it follows that $V$ being an isometry is equivalent to
(3.7) $\sum_i A_i^\dagger A_i = I_{\mathcal{H}}$,
which in turn (see Remark 2.23) is equivalent to the map given by (3.6) being
trace-preserving. This should not come as a surprise, since we want the evolution
equation to map density matrices to density matrices, which for linear evolutions
is equivalent to preserving the trace.
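To make the passage from $U$ to the operators $A_i$ concrete, here is a minimal sketch on a toy example of our own choosing (not taken from the text): $\mathcal{H} = \mathcal{E} = \mathbb{C}^2$, $U$ the CNOT gate with control in $\mathcal{H}$, and $\eta = |0\rangle$. All entries are real, so transposition stands in for the adjoint; the helper names are hypothetical.

```python
# Assumed toy setup: H = E = C^2, U = CNOT (control H, target E), eta = |0>.
# The basis of H (x) E is ordered as |h, e> -> index 2*h + e; all entries are real.
U = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]
eta = [1, 0]

def kraus_operators(U, eta, dim_h=2, dim_e=2):
    """A_i[h2][h1] = sum_e U[(h2, i), (h1, e)] * eta[e], read off from V xi = U(xi (x) eta)."""
    return [[[sum(U[dim_e * h2 + i][dim_e * h1 + e] * eta[e] for e in range(dim_e))
              for h1 in range(dim_h)] for h2 in range(dim_h)]
            for i in range(dim_e)]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = kraus_operators(U, eta)

# Condition (3.7): sum_i A_i^dagger A_i = I_H (transpose suffices for real matrices).
gram = add(matmul(transpose(A[0]), A[0]), matmul(transpose(A[1]), A[1]))

# Formula (3.6) applied to the pure state sigma = |+><+|.
sigma = [[0.5, 0.5], [0.5, 0.5]]
out = add(matmul(matmul(A[0], sigma), transpose(A[0])),
          matmul(matmul(A[1], sigma), transpose(A[1])))
```

In this example the two operators are the diagonal projections $|0\rangle\langle 0|$ and $|1\rangle\langle 1|$, so the pure input $|+\rangle\langle +|$ is sent to a mixed state, in line with the discussion above.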
To summarize, under the hypothesis of unitary evolution of the global system
$\mathcal{H} \otimes \mathcal{E}$, the relationship $\sigma \mapsto \Phi(\sigma)$ between the initial state $\sigma$ of subsystem $\mathcal{H}$
(the initial $\mathcal{H}$-marginal) and its terminal state $\Phi(\sigma)$ is described by a completely
positive trace-preserving map (CPTP) $\Phi$ acting on $B(\mathcal{H})$. CPTP maps are also
called quantum channels.
3.6. OTHER MEASUREMENT SCHEMES 73
to a given global unitary evolution induced by $U$ and a given $\mathcal{E}$-marginal $\tau \in B(\mathcal{E})$.
However, while knowing $\mathcal{H}$- and $\mathcal{E}$-marginals of a pure state tells us a lot about
the structure of that state, it still leaves a lot of uncertainty. For example, $\mathcal{H}$- and
$\mathcal{E}$-marginals of all four Bell states $\varphi^+, \varphi^-, \psi^+, \psi^-$ on $\mathbb{C}^2 \otimes \mathbb{C}^2$ are identical: they are
maximally mixed states $\frac{1}{2} I_{\mathbb{C}^2}$. On the other hand, in the absence of some strong
restrictions on the form of the global unitary evolution $U$, there is no reason to
expect the $\mathcal{H}$-marginals of $U\varphi^+, U\varphi^-, U\psi^+, U\psi^-$ to be the same. (In fact, various
quantum algorithms exploit the fact that those marginals may be quite different.)
In other words, such a map $\Phi$ cannot be consistently defined.
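The claim about the marginals of the Bell states is easy to confirm by direct computation. The sketch below assumes the usual convention $\varphi^\pm = \frac{1}{\sqrt{2}}(|00\rangle \pm |11\rangle)$, $\psi^\pm = \frac{1}{\sqrt{2}}(|01\rangle \pm |10\rangle)$; the function names are ours.

```python
import math

s = 1 / math.sqrt(2)
# Amplitudes psi[h][e] of |h>|e> for the four Bell states on C^2 (x) C^2.
bell_states = {
    "phi+": [[s, 0], [0, s]],
    "phi-": [[s, 0], [0, -s]],
    "psi+": [[0, s], [s, 0]],
    "psi-": [[0, s], [-s, 0]],
}

def marginal_H(psi):
    """Tr_E |psi><psi| for real amplitudes psi[h][e]."""
    return [[sum(psi[h][e] * psi[k][e] for e in range(2)) for k in range(2)]
            for h in range(2)]

def marginal_E(psi):
    """Tr_H |psi><psi| for real amplitudes psi[h][e]."""
    return [[sum(psi[h][e] * psi[h][f] for h in range(2)) for f in range(2)]
            for e in range(2)]

marginals = {name: (marginal_H(psi), marginal_E(psi))
             for name, psi in bell_states.items()}
```

All eight matrices in `marginals` agree with $\frac{1}{2} I_{\mathbb{C}^2}$ up to rounding, so the marginals alone cannot distinguish the four global states.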
In physics texts this characterization, and specifically the postulate of complete
positivity, is usually arrived at in a somewhat different way. First, it is noted that
a quantum evolution map (or a quantum operation) $\Phi : B(\mathcal{H}) \to B(\mathcal{H})$ should
map density matrices to density matrices. Under the assumption of linearity, this
is equivalent to $\Phi$ being positive and trace-preserving (see Section 2.3.2). Second,
when $\Phi$ is coupled with an identity map on the environment $\mathcal{E}$, then the resulting
formal).
some basis $(u_j)$ of the entire space, or of the space corresponding to the accessible
subsystem, with the probability of the $j$th outcome being either $|\langle\psi, u_j\rangle|^2$ or
$\langle u_j|\rho|u_j\rangle = \operatorname{Tr}\big(\rho |u_j\rangle\langle u_j|\big)$ (depending on whether the state of the system is pure
or mixed). A slightly more general scheme is that of a projective measurement,
(3.8)
However, this is barely more general: we can think of the instrument as being
related to a basis $(u_j)$, but as providing only a coarse-grained view, where some of
the basis elements $u_j$ are merged into one projection $P_i$.
A more substantive generalization is derived from basis/projective measurements
in a similar way that CPTP maps were derived from unitary operations.
Suppose that a projective measurement $(P_i)$ on $\mathcal{H} \otimes \mathcal{E}$ (rank one or not) is performed
and consider the effects of applying it to a product state $\psi = \xi \otimes \eta$. The
probability of the $i$th outcome is then
(3.9) $p_i = \langle\psi|P_i|\psi\rangle = \operatorname{Tr}\big(|\psi\rangle\langle\psi| P_i\big) = \operatorname{Tr}\big((|\xi\rangle\langle\xi| \otimes |\eta\rangle\langle\eta|) P_i\big) = \operatorname{Tr}\big(|\xi\rangle\langle\xi| \operatorname{Tr}_{\mathcal{E}}\big((I \otimes |\eta\rangle\langle\eta|) P_i\big)\big).$
In the last equality we used the identity
(3.10) $\operatorname{Tr}\big((\tau \otimes I) X\big) = \operatorname{Tr}\big(\tau \operatorname{Tr}_{\mathcal{E}} X\big),$
which is easily verified if $X$ is a product operator and follows by linearity for
arbitrary $X$. In other words, there are operators $(M_i)$ on $\mathcal{H}$ such that
(3.11) $p_i = \operatorname{Tr}(|\xi\rangle\langle\xi| M_i).$
Varying $\xi$ and using the fact that $\sum_i P_i = I_{\mathcal{H} \otimes \mathcal{E}}$ we deduce that
(3.12) $\sum_i M_i = I_{\mathcal{H}}$
and that $M_i$ is positive for each $i$. Even though Born’s rule (3.11) was derived for
a pure state $\rho = |\xi\rangle\langle\xi|$, it extends by linearity to a general (possibly mixed)
state $\rho$ on $\mathcal{H}$ via the formula
(3.13) $p_i = \operatorname{Tr}(\rho M_i).$
A system $(M_i)$ of positive operators verifying the condition (3.12) is called a positive operator-valued
measure (POVM) and the associated measurement scheme a POVM measurement.
The reason for invoking the term “measure” is that there are also continuous variants,
namely operator-valued measures integrating to identity.
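The derivation of a POVM from a projective measurement on the larger space can be traced on a small case. In the sketch below (an illustration under our own assumptions, not an example from the text) the projective measurement on $\mathbb{C}^2 \otimes \mathbb{C}^2$ is the rank-one measurement in the Bell basis and the environment is in the state $\eta = |0\rangle$; `povm_element` is a hypothetical helper that evaluates $\operatorname{Tr}_{\mathcal{E}}\big((I \otimes |\eta\rangle\langle\eta|) P_i\big)$ entrywise for real matrices.

```python
import math

s = 1 / math.sqrt(2)
# Assumed toy setup: the four Bell vectors on H (x) E = C^2 (x) C^2,
# indexed as |h, e> -> 2*h + e, define a rank-one projective measurement.
bell_vectors = [
    [s, 0, 0, s],    # phi+
    [s, 0, 0, -s],   # phi-
    [0, s, s, 0],    # psi+
    [0, s, -s, 0],   # psi-
]
eta = [1.0, 0.0]     # fixed (real) state of the environment

def projector(v):
    """Rank-one projection |v><v| for a real unit vector v."""
    return [[a * b for b in v] for a in v]

def povm_element(P, eta):
    """M[h][k] = Tr_E((I (x) |eta><eta|) P), written out for real entries."""
    return [[sum(eta[e] * eta[f] * P[2 * h + f][2 * k + e]
                 for e in range(2) for f in range(2))
             for k in range(2)] for h in range(2)]

povm = [povm_element(projector(v), eta) for v in bell_vectors]

# Condition (3.12): the operators M_i add up to the identity on H.
total = [[sum(M[h][k] for M in povm) for k in range(2)] for h in range(2)]

def probability(xi, M):
    """Born's rule (3.11) for a real unit vector xi: p = <xi|M|xi>."""
    return sum(xi[h] * M[h][k] * xi[k] for h in range(2) for k in range(2))

probs = [probability([1.0, 0.0], M) for M in povm]
```

Here the four operators $M_i$ are multiples of rank-one projections rather than projections, which shows that the resulting scheme on $\mathcal{H}$ is not a projective measurement.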
3.7. Local operations
This short section aims at explaining the meaning of the word “local,” which is
the Hilbert space of Bob’s system. The usual assumption is that Alice and Bob are
surrogates for two distant experimentalists who share a quantum system H.
In this context, operations that can be performed “privately” by Alice and Bob
are called local operations. For example, local unitaries on H are unitary operators
quantum channels.
A related concept is the class of LOCC operations, which are obtained by
combining Local Operations with Classical Communication between Alice and Bob.
particles are in a Bell quantum state $\psi^+ = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$ (on the Hilbert space
$\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B = \mathbb{C}^2 \otimes \mathbb{C}^2$). As described in Section 3.3, independently of the
choice of measurement bases in $\mathcal{H}_A$ and $\mathcal{H}_B$, both outcomes of Alice’s (resp., Bob’s)
measurement will be equally likely. However, some combinations of the outcomes
are more likely than others. For example, suppose that each of them performs
the measurement in their computational basis $(|0\rangle, |1\rangle)$, which, in the terminology
of Section 3.7, corresponds to a local POVM with $(M_i) = (N_j) = (|0\rangle\langle 0|, |1\rangle\langle 1|)$.
Table 3.1 shows the resulting joint probability distribution.
Table 3.1. Joint probability distribution of Alice’s and Bob’s
measurement outcomes.
                        Bob
                 $|0\rangle$    $|1\rangle$
Alice  $|0\rangle$      0       1/2
       $|1\rangle$     1/2       0
Note that Alice’s and Bob’s outcomes are always different. This is not immediately fatal as it may just be
the case that (perhaps because of some conservation law in their interaction in the
past) the two particles are in opposite states; we just don’t know which. However,
on further reflection, this indicates that either the description of the reality given
by $\psi^+$ is incomplete, with some other hidden variable controlling the outcomes of
based on very similar principles, which lead to effects that cannot be explained by
a hidden variable model, and to phenomena such as pseudotelepathy or quantum
teleportation. We will briefly explore some of these examples later on, mostly in
Chapter 11.
Exercise 3.2. Verify the details of the calculation of probabilities in Table 3.1.
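As a numerical sanity check on Table 3.1 (not a substitute for the exercise), the joint distribution can be read off from the amplitudes of $\psi^+ = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$. The variable names below are our own.

```python
import math

s = 1 / math.sqrt(2)
# Amplitudes of psi^+ = (|01> + |10>)/sqrt(2); psi_plus[a][b] multiplies |a>|b>,
# where a is Alice's outcome and b is Bob's.
psi_plus = [[0.0, s], [s, 0.0]]

# Joint distribution p(a, b) = |<ab|psi^+>|^2 for computational-basis measurements.
joint = [[psi_plus[a][b] ** 2 for b in range(2)] for a in range(2)]

# Marginal distributions of Alice's and Bob's individual outcomes.
alice = [sum(joint[a]) for a in range(2)]
bob = [sum(joint[a][b] for a in range(2)) for b in range(2)]
```

The diagonal entries of `joint` vanish (the outcomes always differ), while each of `alice` and `bob` is uniform up to rounding.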
There are many books which present quantum mechanics for specific audiences.
In addition to the references given at the end of Chapter 2, we point out [Mer07]
Banach and his Spaces
Asymptotic Geometric Analysis Miscellany
CHAPTER 4
More Convexity
This chapter focuses on concepts, invariants and operations related to
finite-dimensional convex bodies. The primary objectives are to be able to describe,
tell apart, and measure the size of such bodies. While some of the results are
relatively new, they all have roots in classical convex geometry and, most notably,
in the work of Hermann Minkowski in the late 19th and early 20th century. Other,
more modern aspects of the theory of convex bodies will be addressed in Chapters
5 and 7.
4.1. Basic notions and operations
4.1.1. Distances between convex sets. A natural way to quantify how
different two subsets of a metric space are is the Hausdorff distance. When we
consider convex bodies $K, L \subset \mathbb{R}^n$ containing the origin in their interiors, and
identified when related by a homothetic transformation, a more relevant notion is
Equivalently,
}x}K }x}L
most frequently encountered in the literature, we can restrict the infimum in (4.2)
4.1.2. Symmetrization. If $K \subset \mathbb{R}^n$ is a non-symmetric convex body containing
0, there are several symmetric convex bodies that can be associated with
$K$ (see Figure 4.1). Such symmetrization operations are useful because symmetric
convex bodies are often easier to deal with, whereas the symmetrized set still
“remembers” many features of $K$.
[Figure 4.1 graphic not reproduced; surviving panel labels: $K$, $-K$, $K_\cup$, $K_\cap$, $(K - K)/2$.]
Figure 4.1. A convex body $K \subset \mathbb{R}^2$ (top left) and its four kinds
states) and for the set of quantum states (see Section 0.10). In this situation
still another symmetrization is useful. If $H \subset \mathbb{R}^{n+1}$ is an affine hyperplane not
containing 0, and $K$ is a convex body in $H$ (so that $K$ is $n$-dimensional), one may
consider
(4.6) $K_{⌭} = \operatorname{conv}(K \cup (-K)).$
The symbol ⌭ depicts a cylinder. This is motivated by the observation that
when $K$ is a Euclidean disk, the resulting body $K_{⌭}$ is a cylinder. It coincides with
what is commonly called a generalized cylinder if $K$ is centrally symmetric.
The set $K_{⌭}$ is an $(n+1)$-dimensional convex body, so while formula (4.6) is
identical to (4.3), we distinguish the two operations since they will be applied in
different contexts (for a description of $(K_{⌭})^\circ$, see Exercise 4.5). For example, if
$K = \Delta_n$ is the regular simplex defined as the convex hull of the canonical basis in
$\mathbb{R}^{n+1}$, the convex body obtained after symmetrization is $(\Delta_n)_{⌭} = B_1^{n+1}$.
All these symmetrizations turn a non-symmetric convex body into a centrally
symmetric convex body. The word “symmetrization” is also used to describe operations
for which the output has some other symmetry properties. One example
Exercise 4.4 (Origin shifting and symmetrization). Show that for any convex
body $K \subset \mathbb{R}^n$ and $a, b \in K$,
$d_{BM}\big((K - a)_\cup, (K - b)_\cup\big) \le 4.$
$(K_{⌭})^\circ = \Big(\frac{e}{|e|^2} - C^*\Big) \cap \Big(-\frac{e}{|e|^2} + C^*\Big).$
If we write $x \le y$ when $y - x \in C^*$, this is the “interval” $\{x \in \mathbb{R}^n : -e/|e|^2 \le x \le
e/|e|^2\}$ in the order induced by $C^*$.
4.1.3. Zonotopes and zonoids. A crucial notion in convex geometry is that
of Minkowski operations on sets. If $A, B \subset \mathbb{R}^n$ and $t \in \mathbb{R}$, we set
(4.7) $A + B := \{x + y : x \in A,\ y \in B\}, \qquad tA := \{tx : x \in A\}.$
The definition of the Minkowski sum extends to the case of finitely many convex
bodies.
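For finite point sets the Minkowski operations can be computed directly; since the sum of two polytopes is the convex hull of the pairwise sums of their vertices, this already covers sums of segments. The following is a minimal sketch with function names of our own choosing.

```python
from itertools import product

def minkowski_sum(A, B):
    """A + B = {x + y : x in A, y in B} for finite sets of points (tuples)."""
    return {tuple(a + b for a, b in zip(x, y)) for x, y in product(A, B)}

def scale(t, A):
    """tA = {t x : x in A}."""
    return {tuple(t * a for a in x) for x in A}

# Endpoints of two segments in R^2; the Minkowski sum of the segments is the
# convex hull of the four points computed below (here, a square).
seg1 = {(-1, 0), (1, 0)}
seg2 = {(0, -1), (0, 1)}
corners = minkowski_sum(seg1, seg2)
doubled = scale(2, seg1)
```

Adding further segments in the same way produces the vertex candidates of a centrally symmetric polygon.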
zonoids) is invariant under affine transformations, so we could alternatively use the
Banach–Mazur distance instead of the Hausdorff distance.
Observe that zonotopes and zonoids are automatically centrally symmetric. We
will usually assume that the center of symmetry is at the origin. Here is a useful
characterization of zonoids as polars of unit balls of subspaces of $L_1$.
Proposition 4.1 (not proved here). Let $K \subset \mathbb{R}^n$ be a symmetric convex body.
The following are equivalent.
(i) $K$ is a zonoid.
(ii) There is a positive Borel measure $\mu_K$ on $S^{n-1}$ such that, for any $x \in \mathbb{R}^n$,
(4.8) $\|x\|_{K^\circ} = \int_{S^{n-1}} |\langle x, \theta\rangle| \, d\mu_K(\theta).$
We emphasize that $\mu_K$ is not assumed to be a probability measure.
It follows in particular that every ellipsoid is a zonoid (use $\mu_K = \sigma$ in (4.8), then
affine equivalence). Note also that, for a given zonoid $K \subset \mathbb{R}^n$, the Borel measure
Exercise 4.7 (Planar zonotopes and zonoids). Show that every centrally symmetric
polygon is a zonotope, and that any centrally symmetric convex body
$K \subset \mathbb{R}^2$ is a zonoid.
Exercise 4.8 (Octahedron is not a zonotope). Show that $B_1^3$ is not a zonotope.
$\mathbb{R}^n$ and $\mathbb{R}^{n'}$ respectively, their projective tensor product is the closed convex set
$K \hat\otimes K'$ in $\mathbb{R}^n \otimes \mathbb{R}^{n'} \leftrightarrow \mathbb{R}^{nn'}$ defined as follows
(4.9) $K \hat\otimes K' = \overline{\operatorname{conv}}\{x \otimes x' : x \in K,\ x' \in K'\}.$
This terminology is motivated by the fact that when $K$ and $K'$ are unit balls
with respect to some norms, the set $K \hat\otimes K'$ is the unit ball of the corresponding projective
tensor product norm on $\mathbb{R}^n \otimes \mathbb{R}^{n'}$. Recall that given two finite-dimensional
normed spaces $(V, \|\cdot\|)$ and $(V', \|\cdot\|)$, their projective tensor product (denoted by
$V \hat\otimes V'$) is the space $V \otimes V'$ equipped with the norm
$\|z\|_\wedge = \inf\Big\{\sum \|x_i\| \, \|y_i\| : z = \sum x_i \otimes y_i\Big\}.$
It is easily checked that $B_1^m \hat\otimes B_1^n$ identifies with $B_1^{mn}$ when the space $\mathbb{R}^m \otimes \mathbb{R}^n$
is identified with $\mathbb{R}^{mn}$ (see also Exercise 4.16), and that $B_2^m \hat\otimes B_2^n$ identifies with
$S_1^{m,n}$ when $\mathbb{R}^m \otimes \mathbb{R}^n$ is identified with $M_{m,n}$.
There is a dual notion to the projective tensor product, which is called the
injective tensor product. It can be defined via polarity: if $K \subset \mathbb{R}^n$ and $K' \subset \mathbb{R}^{n'}$
are convex bodies containing 0 in the interior, their injective tensor product is the
convex body $K \check\otimes K'$ in $\mathbb{R}^n \otimes \mathbb{R}^{n'} \leftrightarrow \mathbb{R}^{nn'}$ defined as follows
(4.10) $K \check\otimes K' = \big(K^\circ \hat\otimes (K')^\circ\big)^\circ.$
This definition does not depend on the particular choice of Euclidean structures on
$\mathbb{R}^n$ and $\mathbb{R}^{n'}$, provided one considers the Euclidean structure on $\mathbb{R}^n \otimes \mathbb{R}^{n'}$ obtained
as their Hilbertian tensor product.
The relevance of the above notions to the information-theoretic context (quantum
or classical) is evident. The set of separable states is the projective tensor product
of the sets of states on factor spaces. More precisely, if $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$, then
(4.11) $\operatorname{Sep}(\mathcal{H}) = D(\mathcal{H}_1) \hat\otimes D(\mathcal{H}_2).$
(These objects were defined in Section 2.2.) Similarly, for classical states, the
projective tensor product $\Delta_{m-1} \hat\otimes \Delta_{n-1}$ identifies with $\Delta_{mn-1}$.
The definition of $K \hat\otimes K'$ (similarly to other definitions and comments of this
section) immediately generalizes to tensor products of any finite number of factors.
However, for the sake of transparency we shall concentrate in this section on the
case of two convex bodies. We also point out that the definition (4.9) makes sense
and
(4.13) $K_{⌭} \hat\otimes K'_{⌭} = (K \hat\otimes K')_{⌭}.$
To check that (4.13) makes sense, we note that if $K$ (resp., $K'$) is a convex body in
the affine hyperplane $H_e \subset \mathbb{R}^n$ (resp., $H_{e'} \subset \mathbb{R}^{n'}$) defined as in (1.21), then $K \hat\otimes K'$
is a convex body in the affine hyperplane $H_{e \otimes e'} \subset \mathbb{R}^n \otimes \mathbb{R}^{n'}$ (cf. Exercises 4.13 and
4.15).
A specific situation where (4.13) holds, which will be fundamental in Chapter
9, is when $K$ is the set of quantum states on a Hilbert space. Since $D(\mathbb{C}^d)_{⌭} = S_1^{d,\mathrm{sa}}$,
it follows that
(4.14) $\operatorname{Sep}(\mathbb{C}^d \otimes \mathbb{C}^{d'})_{⌭} = S_1^{d,\mathrm{sa}} \hat\otimes S_1^{d',\mathrm{sa}}.$
To put it in words, the symmetrization of the set of separable states is canonically
identified with the projective tensor product of two copies of the self-adjoint part of
the unit ball for the trace norm and, consequently, is the unit ball in the projective
tensor product norm of (the self-adjoint parts of) two 1-Schatten spaces.
Exercise 4.10 (Projective tensor product and compactness). Show that if
$K, K'$ are compact convex sets, then $\operatorname{conv}\{x \otimes x' : x \in K,\ x' \in K'\}$ is compact and
hence equal to $K \hat\otimes K'$. Give an example of closed convex sets $K, K'$ such that the
set $\operatorname{conv}\{x \otimes x' : x \in K,\ x' \in K'\}$ is not closed.
Exercise 4.11 (Linear invariance of projective tensor product). Let $K_i \subset \mathbb{R}^{n_i}$
and let $T_i : \mathbb{R}^{n_i} \to \mathbb{R}^{m_i}$ be linear maps, $i = 1, 2$. Show that $(T_1 \otimes T_2)(K_1 \hat\otimes K_2) =
(T_1 K_1) \hat\otimes (T_2 K_2)$.
Exercise 4.12 (Projective tensor product with a linear subspace). Let $K \subset \mathbb{R}^n$
be a closed convex set, let $V = \operatorname{span} K$, and let $V' \subset \mathbb{R}^{n'}$ be a vector subspace.
Show that $K \hat\otimes V' = V \hat\otimes V' = V \otimes V'$.
Exercise 4.13 (Projective tensor product of affine subspaces). Let $V_i \subset \mathbb{R}^{n_i}$
be affine subspaces for $i = 1, 2$. Show that $V_1 \hat\otimes V_2$ is an affine subspace of $\mathbb{R}^{n_1} \otimes \mathbb{R}^{n_2}$
and find its dimension.
Exercise 4.14 (Projective tensor product of cones). Show that if $C$ and $C'$ are
closed convex cones, then the set $\operatorname{conv}\{x \otimes x' : x \in C,\ x' \in C'\}$ is a closed convex
cone and in particular equals $C \hat\otimes C'$.
Exercise 4.15 (Projective tensor products of bodies are bodies). Show that if
$K_i \subset \mathbb{R}^{n_i}$ are convex bodies, then $K_1 \hat\otimes K_2$ is a convex body in $\mathbb{R}^{n_1} \otimes \mathbb{R}^{n_2}$. Similarly,
if each $K_i$ is a convex body in an affine subspace $V_i \subset \mathbb{R}^{n_i}$, then $K_1 \hat\otimes K_2$ is a convex
body in $V_1 \hat\otimes V_2$.
Exercise 4.16 (Projective tensor product with $B_1^n$). Let $K$ be a symmetric
convex body in $\mathbb{R}^m$. (i) What is then $B_1^k \hat\otimes K$? (ii) Show that
$\operatorname{vol}(B_1^k \hat\otimes K) = \frac{(m!)^k}{(km)!} \operatorname{vol}(K)^k.$
the set of elements $x \otimes x'$, where $x$ is an extreme point of $K$ and $x'$ is an extreme
point of $K'$. Show that this may be false if either $K$ or $K'$ is not symmetric.
such that $|F(x, x')| \le \|x\| \cdot \|x'\|$ for all $x, x'$ (i.e., with the unit ball in the space of
bilinear maps).
(i) there is a unique ellipsoid $\mathscr{E} \subset \mathbb{R}^n$ with maximal volume under the constraint
$\mathscr{E} \subset K$ and
(ii) there is a unique ellipsoid $\mathscr{F} \subset \mathbb{R}^n$ with minimal volume under the constraint
$\mathscr{F} \supset K$.
The ellipsoid $\mathscr{E}$ appearing in (i) is called the John ellipsoid of $K$ and denoted
by $\operatorname{John}(K)$. The ellipsoid $\mathscr{F}$ appearing in (ii) is called the Löwner ellipsoid of $K$
and denoted by $\operatorname{Löw}(K)$. By a compactness argument, the existence of an ellipsoid
of maximal/minimal volume is clear in (i) and (ii). Note also that these ellipsoids
are affine invariants: for any affine map $T$, we have $\operatorname{John}(TK) = T \operatorname{John}(K)$ and
$\operatorname{Löw}(TK) = T \operatorname{Löw}(K)$. We say that $K$ is in John position if $\operatorname{John}(K) = B_2^n$, and
that $K$ is in Löwner position if $\operatorname{Löw}(K) = B_2^n$.
4.2. JOHN AND LÖWNER ELLIPSOIDS 85
than $\operatorname{vol}(\mathscr{E})$, a contradiction. If $S' \ne I$, then $K$ contains the ellipsoid $T(B_2^n) + y$
with $T = \frac{I + S'}{2}$ and $y = \frac{x + x'}{2}$. Since $\det T > 1$ (see Exercise 1.42), this ellipsoid is
of a volume greater than $\operatorname{vol}(\mathscr{E})$, also a contradiction.
The uniqueness in (ii) follows by duality when $K$ is centrally symmetric. Indeed,
the minimization problem in (ii) can be restricted in that case to 0-symmetric
ellipsoids (by essentially the same argument as in the case of $S' = I$ above). Since
for a 0-symmetric ellipsoid $\mathscr{F}$ we have $\operatorname{vol}(\mathscr{F}) \operatorname{vol}(\mathscr{F}^\circ) = \operatorname{vol}(B_2^n)^2$ by (1.10), and
since $K \subset \mathscr{F} \iff \mathscr{F}^\circ \subset K^\circ$, the uniqueness follows, together with the relation
$\operatorname{Löw}(K) = \operatorname{John}(K^\circ)^\circ$.
[Figure graphic not reproduced; surviving labels: $B_2^2$, $K$, $K^\circ$ and the points $0$, $x$, $y$, $z$.]
in Definition 4.5.
The uniqueness in (ii) in the general case is not obvious at this point; we
postpone its justification until after Proposition 4.4.
We will now present a general trick that makes it possible to reduce the search
for the Löwner ellipsoid of not-necessarily-symmetric bodies to the symmetric
case. To that end, fix $h > 0$ and consider the affine hyperplane
$H := \{(h, x) : x \in \mathbb{R}^n\} \subset \mathbb{R}^{n+1}.$
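$T = \begin{bmatrix} \sqrt{n+1}\, h & 0 \\ \sqrt{n+1}\, a & \sqrt{1 + \frac{1}{n}}\, S \end{bmatrix}.$
In particular,
(4.15) $\operatorname{vol}(\operatorname{Löw}(\mathscr{E}_{⌭})) = c_n h \operatorname{vol}(\mathscr{E})$
for some constant $c_n$ depending only on $n$.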
Proof. Consider first the special case (denoted by $\mathscr{E}_0$) where $S = I$, $h = 1$,
and $a = 0$. It follows from the uniqueness (which has already been fully proved
in the symmetric case) that $\operatorname{Löw}((\mathscr{E}_0)_{⌭})$ inherits all the symmetries of $(\mathscr{E}_0)_{⌭}$ and
therefore has the form $T_0(B_2^{n+1})$, where $T_0$ is a diagonal matrix with coefficients
$(\alpha, \beta, \dots, \beta)$, with $\alpha, \beta > 0$ to be determined. Since $(\mathscr{E}_0)_{⌭} \subset T_0(B_2^{n+1})$ if and only
if $\frac{1}{\alpha^2} + \frac{1}{\beta^2} \le 1$ and $\operatorname{vol}(T_0(B_2^{n+1})) = \alpha \beta^n \operatorname{vol}(B_2^{n+1})$, the minimization problem
yields the values $\alpha = \sqrt{n+1}$, $\beta = \sqrt{1 + 1/n}$, as needed.
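The minimization invoked here can be carried out by hand. The following worked computation is a sketch; the substitution $u = \alpha^{-2}$, $v = \beta^{-2}$ is our own notation and does not appear in the text.

```latex
% Minimize \alpha\beta^n subject to 1/\alpha^2 + 1/\beta^2 \le 1.
% Substitution (our notation): u = \alpha^{-2}, v = \beta^{-2}.
\begin{align*}
&\alpha\beta^{n} = u^{-1/2} v^{-n/2} \text{ is decreasing in } u \text{ and in } v,
  \text{ so the constraint is active: } u + v = 1, \\
&\text{and minimizing } \alpha\beta^{n} \text{ amounts to maximizing } \log u + n \log (1 - u)
  \text{ over } 0 < u < 1; \\
&\frac{d}{du}\Big(\log u + n \log(1-u)\Big) = \frac{1}{u} - \frac{n}{1-u} = 0
  \iff u = \frac{1}{n+1}, \quad v = \frac{n}{n+1}; \\
&\alpha = u^{-1/2} = \sqrt{n+1}, \qquad \beta = v^{-1/2} = \sqrt{\frac{n+1}{n}} = \sqrt{1 + \frac{1}{n}}.
\end{align*}
```

Since $\log u + n \log(1-u)$ is strictly concave on $(0,1)$, the critical point is the unique maximizer.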
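For the general case, note that $\mathscr{E}_{⌭} = A(\mathscr{E}_0)_{⌭}$, where
$A = \begin{bmatrix} h & 0 \\ a & S \end{bmatrix} \in M_{n+1}.$
Since $\operatorname{Löw}(\mathscr{E}_{⌭}) = \operatorname{Löw}(A(\mathscr{E}_0)_{⌭}) = A \operatorname{Löw}((\mathscr{E}_0)_{⌭})$ by invariance, it follows that $T =
A T_0$ as claimed. The relation (4.15) follows by expressing $\det T$ in terms of $\det S$.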
If $K$ is a convex body in $\mathbb{R}^n$ and all points $x_i$ belong to $\partial K$, we will say that
$(x_i, c_i)_{i \in I}$ is associated to $K$. Note that if, additionally, $K \subset B_2^n$ or $B_2^n \subset K$ (which
will usually be the case), then all points $x_i$ are contact points of $K$ and the unit
sphere, i.e., such that $\|x_i\|_K = \|x_i\|_{K^\circ} = |x_i|$.
Taking trace of both sides in condition (4.16), we see that necessarily $\sum c_i = n$.
More generally, if $T \in B(\mathbb{R}^n)$, then
(4.18) $\operatorname{Tr} T = \sum_i c_i \langle T x_i, x_i \rangle$
(see Exercise 4.19). Note also that condition (4.17) is redundant for symmetric
convex bodies, since one can always enforce it by replacing every couple $(c_i, x_i)$ in
the decomposition by two couples $(\frac{1}{2} c_i, x_i)$ and $(\frac{1}{2} c_i, -x_i)$.
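A small numerical illustration may help here. We assume, as the surrounding argument indicates, that condition (4.16) reads $\sum_i c_i |x_i\rangle\langle x_i| = I$ and that condition (4.17) reads $\sum_i c_i x_i = 0$. The example (three unit vectors in $\mathbb{R}^2$ at mutual angles of 120 degrees, each with weight 2/3) and all names below are our own choices.

```python
import math

# Assumed example: three unit vectors at angles 90, 210, 330 degrees in R^2,
# each with weight c_i = 2/3 (so that the weights sum to n = 2).
angles = [math.pi / 2 + 2 * math.pi * k / 3 for k in range(3)]
points = [(math.cos(t), math.sin(t)) for t in angles]
weights = [2 / 3] * 3

# sum_i c_i |x_i><x_i| (should be the identity) and sum_i c_i x_i (should vanish).
identity = [[sum(c * x[r] * x[s] for c, x in zip(weights, points)) for s in range(2)]
            for r in range(2)]
barycenter = [sum(c * x[r] for c, x in zip(weights, points)) for r in range(2)]

def weighted_quadratic_sum(T):
    """Right-hand side of (4.18): sum_i c_i <T x_i, x_i>."""
    total = 0.0
    for c, x in zip(weights, points):
        Tx = [sum(T[r][s] * x[s] for s in range(2)) for r in range(2)]
        total += c * sum(Tx[r] * x[r] for r in range(2))
    return total

# An arbitrary (non-symmetric) matrix with trace 5.
T = [[1.0, 2.0], [3.0, 4.0]]
rhs = weighted_quadratic_sum(T)
```

Up to rounding, `identity` is the $2 \times 2$ identity matrix, `barycenter` vanishes, and `rhs` equals $\operatorname{Tr} T$, as (4.18) predicts.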
The following pair of propositions characterizes John and Löwner positions via
resolutions of identity. The presentations of these results that are easily available in
the literature focus on the class of symmetric bodies and we will assume henceforth
that they are both known to be true in that setting (for a reference, see Theorem
2.1.15 in [AAGM15] or Theorem 3.1 in [Bal97]). It is also easy to see that in
the symmetric case the two statements are formally equivalent by duality (i.e., by
passing to polars).
$\widetilde{K} = \Big\{ \Big( \tfrac{1}{\sqrt{n+1}}, \sqrt{\tfrac{n}{n+1}}\, x \Big) : x \in K \Big\} \subset \mathbb{R}^{n+1}.$
It follows from Lemma 4.3 that $B_2^{n+1} = \operatorname{Löw}\big((\widetilde{B_2^n})_{⌭}\big)$. In view of Proposition 4.4,
we have the equivalence
$K$ is in Löwner position $\iff$ $\widetilde{K}_{⌭}$ is in Löwner position.
Consequently, our task is reduced to showing that $K$ has an unbiased resolution of
identity (in $\mathbb{R}^n$) if and only if $\widetilde{K}_{⌭}$ has a resolution of identity (in $\mathbb{R}^{n+1}$). To that
end, let $e_0 = (1, 0, \dots, 0) \in \mathbb{R}^{n+1}$ and let $(x_i, c_i)$ be a resolution of identity for $\widetilde{K}_{⌭}$.
The points $x_i$ are extreme points of $\widetilde{K}_{⌭}$, and since we have freedom to replace $x_i$
by $-x_i$, we may assume that each $x_i$ has the form $x_i = \big( \tfrac{1}{\sqrt{n+1}}, \sqrt{\tfrac{n}{n+1}}\, y_i \big)$ with
$y_i \in K \cap S^{n-1}$. Setting $z = \sum c_i y_i$, we have
$I_{n+1} = \sum_i c_i |x_i\rangle\langle x_i|$
$= \sum_i c_i \Big| \tfrac{1}{\sqrt{n+1}} e_0 + \sqrt{\tfrac{n}{n+1}}\, (0, y_i) \Big\rangle \Big\langle \tfrac{1}{\sqrt{n+1}} e_0 + \sqrt{\tfrac{n}{n+1}}\, (0, y_i) \Big|$
$= |e_0\rangle\langle e_0| + \frac{\sqrt{n}}{n+1} \big( |e_0\rangle\langle (0, z)| + |(0, z)\rangle\langle e_0| \big) + \frac{n}{n+1} \sum_i c_i |(0, y_i)\rangle\langle (0, y_i)|,$
where in the last equality we used the fact that $\sum_i c_i = n + 1$. By applying this
operator equality to the vector $e_0$, we obtain $z = 0$. Thus the middle term in the last
line above vanishes, which easily implies that $\big( y_i, \frac{n}{n+1} c_i \big)$ is an unbiased resolution
of identity for $K$. The reverse argument simply retraces the above calculation
backwards; the reader is encouraged to verify the details. (Note that $z = 0$ then
follows from the hypothesis.)
Proof of Proposition 4.7. Assume that $K$ is in John position. We claim
that $K^\circ$ is in Löwner position. To check this, let $\mathscr{E}$ be an ellipsoid containing
$K^\circ$. We then have $\mathscr{E}^\circ \subset K$. We know from Exercise 1.26 (or from Exercise
D.3, which outlines a simpler but less elementary proof) that $\mathscr{E}^\circ$ is an ellipsoid
and that $\operatorname{vol}(\mathscr{E}) \operatorname{vol}(\mathscr{E}^\circ) \ge \operatorname{vol}(B_2^n)^2$, with equality iff $\mathscr{E}$ is 0-symmetric. Since
$\operatorname{vol}(\mathscr{E}^\circ) \le \operatorname{vol}(B_2^n)$ by definition of the John ellipsoid, it follows that $\operatorname{vol}(\mathscr{E}) \ge$
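shows that
$n = \sum_i c_i \ge \sum_i c_i \langle S x_i + a, x_i \rangle = \sum_i c_i \langle S x_i, x_i \rangle = \operatorname{Tr} S,$
the last equality following from (4.18). The AM/GM inequality now implies that
$\det S \le 1$, and hence that $\operatorname{vol}(\mathscr{E}) \le \operatorname{vol}(B_2^n)$. Since $\mathscr{E} \subset K$ was arbitrary, this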
pactum which are essentially sharp in the symmetric case only (see Exercises 4.20–
4.21, and Notes and Remarks for further comments).
Exercise 4.19. Prove identity (4.18).
Exercise 4.20 (The diameter of Banach–Mazur compactum). Let $K \subset B_2^n$
(resp., $K \supset B_2^n$) be a symmetric convex body and assume that there exists a
resolution of identity associated to $K$. Show that $K \supset \frac{1}{\sqrt{n}} B_2^n$ (resp., $K \subset \sqrt{n} B_2^n$)
and so, in particular, $d_g(B_2^n, K) \le \sqrt{n}$. Conclude that any pair $K, L$ of symmetric
convex bodies in $\mathbb{R}^n$ satisfies $d_{BM}(K, L) \le n$.
2 2 if K is
symmetric.
Exercise 4.23 (The radius of Banach–Mazur compactum). Show that the first
estimate from Exercise 4.20 is optimal by verifying that $d_{BM}(B_2^n, B_\infty^n) = \sqrt{n}$.
4.2.2. Convex bodies with enough symmetries. In this section we describe
a class of convex bodies “with enough symmetries,” which in particular admit
a unique Euclidean structure compatible with those symmetries. These properties
force the John and Löwner ellipsoids (or any other ellipsoids “functorially associated”
with such bodies) to be balls with respect to that Euclidean structure.
Let $K \subset \mathbb{R}^n$ be a convex body. We consider symmetries of $K$, i.e., invertible
affine maps $T : \mathbb{R}^n \to \mathbb{R}^n$ such that $T(K) = K$. We start by making two observations.
First, such maps necessarily fix the centroid of $K$. If the centroid is at the
origin (which may be assumed by translating $K$), the set of symmetries becomes
a subgroup of $GL(n, \mathbb{R})$. Second, since this subgroup is compact, it must preserve
a scalar product (consider any scalar product and average it with respect to the
Haar measure on the group of symmetries). Equivalently, by replacing $K$ with a
linear image we may ensure that all symmetries of $K$ are (Euclidean) isometries;
in virtually all applications this property will be automatically satisfied. This is
tacitly assumed in what follows, although the definitions and the proposition can
be easily rephrased to make sense and/or hold without that assumption.
We therefore consider $K \subset \mathbb{R}^n$ a convex body with centroid at the origin. An
The isometries of $K$ form a subgroup of $O(n)$, which will be called the isometry
group of $K$ and denoted by $\operatorname{Iso}(K)$. This definition extends mutatis mutandis to
convex bodies $K \subset \mathbb{C}^n$; in that case $\operatorname{Iso}(K)$ is a subgroup of $U(n)$.
$SO = OS$ for every $O \in G$.
There is a closely related notion (and possibly a source of confusion): one says
that $\operatorname{Iso}(K)$ acts irreducibly if any $\operatorname{Iso}(K)$-invariant subspace is either $\{0\}$ or $\mathbb{R}^n$ (or
$\mathbb{C}^n$ in the complex case; a subspace $E$ is $G$-invariant if $O(E) = E$ for any $O \in G$).
One checks that $\operatorname{Iso}(K)$ acts irreducibly if and only if $\operatorname{Iso}(K)'$ contains no nontrivial
orthogonal projection, and also if and only if $\operatorname{Iso}(K)' \cap M_n^{\mathrm{sa}} = \mathbb{R} I$; this idea is also
used in Proposition 4.8.
It is immediate that when $K$ has enough symmetries, $\operatorname{Iso}(K)$ acts irreducibly.
In the complex case, the reverse implication also holds (this is the content of Schur’s
lemma) and both notions are equivalent. In the real case, the notions are different
(see Exercise 4.26).
In particular, when $\operatorname{Iso}(K)$ acts irreducibly, $\mathscr{E}$ is a Euclidean ball.
Proof. Let $T$ be the unique positive matrix such that $\mathscr{E} = T(B_2^n)$. For every
$O \in \operatorname{Iso}(K)$, we have $\mathscr{E} = O(\mathscr{E}) = O T O^\dagger(B_2^n)$, thus $O T O^\dagger = T$. Write $T = \sum_i \lambda_i P_i$,
where the $\lambda_i > 0$ are distinct positive numbers and the $P_i$ are pairwise orthogonal projectors.
From the relation $O T O^\dagger = T$ we deduce that, for every $i$, we have $O P_i O^\dagger = P_i$ for
all $O \in \operatorname{Iso}(K)$, and therefore that the range of $P_i$ is invariant under $\operatorname{Iso}(K)$.
We conclude this section with two examples of groups of symmetries of $\mathbb{R}^n$ (or
$\mathbb{C}^n$) which play an important role in geometric functional analysis:
(4.19) $G_{\mathrm{unc}} := \{(x_1, \dots, x_n) \mapsto (\varepsilon_1 x_1, \dots, \varepsilon_n x_n) : |\varepsilon_j| = 1\}$
(4.20) $G_{\mathrm{sym}} := \{(x_1, \dots, x_n) \mapsto (\varepsilon_1 x_{\pi(1)}, \dots, \varepsilon_n x_{\pi(n)}) : |\varepsilon_j| = 1\},$
where $\varepsilon_1, \dots, \varepsilon_n$ are scalars and $\pi \in \mathfrak{S}_n$, the group of permutations. A convex
body $K$ (resp., the norm or the space, for which $K$ is the unit ball) is called
unconditional (with respect to the standard basis) if $\operatorname{Iso}(K) \supset G_{\mathrm{unc}}$ and, similarly,
permutationally symmetric if $\operatorname{Iso}(K) \supset G_{\mathrm{sym}}$. Bodies of the second kind have
enough symmetries, but bodies of the first kind not necessarily; see Exercise 4.24.
(In functional analysis, the standard terminology for the latter is “symmetric,” but
we prefer to avoid the confusion with the notion of being centrally symmetric.)
More generally, one may consider bodies (or norms) that are unconditional (resp.,
permutationally symmetric) with respect to some other basis $(u_j)$, i.e., invariant
under maps of the form $u_j \mapsto \varepsilon_j u_j$, $j = 1, \dots, n$ (resp., $u_j \mapsto \varepsilon_j u_{\pi(j)}$, $j = 1, \dots, n$).
symmetries. Give an example of an unconditional body which does not have enough
symmetries.
Exercise 4.26 (Enough symmetries vs. irreducible action). (i) Let $R \in SO(2)$
be the rotation of angle $2\pi/p$ for an integer $p \ge 3$. Construct a convex body $K \subset \mathbb{R}^2$
whose isometry group is exactly $\{R^k : 0 \le k \le p - 1\}$. Show that $K$ does not
have enough symmetries although $\operatorname{Iso}(K)$ acts irreducibly.
(ii) For any $n$, give an example of a convex body $L \subset \mathbb{R}^{2n}$ without enough symmetries
although $\operatorname{Iso}(L)$ acts irreducibly.
Exercise 4.27 (Projective tensor product and enough symmetries). Let $K \subset
\mathbb{R}^m$ and $L \subset \mathbb{R}^n$ be convex bodies with enough symmetries. Show that $K \hat\otimes L$ has
enough symmetries.
4.2.3. Ellipsoids and tensor products. It turns out that Löwner ellipsoids
behave well with respect to the projective tensor product, as the following lemma
shows. Note that the analogous statement does not hold for the John ellipsoid (see
Exercise 4.28).
Lemma 4.9. Let $K \subset \mathbb{R}^n$ and $K' \subset \mathbb{R}^{n'}$ be two convex bodies and assume that
the ellipsoids $\operatorname{Löw}(K)$ and $\operatorname{Löw}(K')$ are 0-symmetric. Then the Löwner ellipsoid
of their projective tensor product is the Hilbertian tensor product of the respective
Löwner ellipsoids.
In terms of scalar products, for every $x, y$ in $\mathbb{R}^n$ and $x', y'$ in $\mathbb{R}^{n'}$, we have
$\langle x \otimes x', y \otimes y' \rangle_{\operatorname{Löw}(K \hat\otimes K')} = \langle x, y \rangle_{\operatorname{Löw}(K)} \langle x', y' \rangle_{\operatorname{Löw}(K')}.$
Proof. First suppose that $\operatorname{Löw}(K) = B_2^n$ and $\operatorname{Löw}(K') = B_2^{n'}$. By Proposition
4.6, there exist unbiased resolutions of identity for $K$ and $K'$, respectively $(x_i, c_i)$
and $(x'_j, c'_j)$. We easily check that $K \hat\otimes K' \subset B_2^{nn'} = B_2^n \otimes_2 B_2^{n'}$. We may verify
ci x i b
i j i j
$\sum_i \sum_j c_i c'_j |x_i \otimes x'_j\rangle\langle x_i \otimes x'_j| = \Big(\sum_i c_i |x_i\rangle\langle x_i|\Big) \otimes \Big(\sum_j c'_j |x'_j\rangle\langle x'_j|\Big) = I.$
It follows from Proposition 4.6 that $\operatorname{Löw}(K \hat\otimes K') = B_2^{nn'}$. For the general case, let $T$
and $T'$ be linear maps such that $T \operatorname{Löw}(K) = B_2^n$ and $T' \operatorname{Löw}(K') = B_2^{n'}$. Using the
elementary identities $\operatorname{Löw}(TK) = T \operatorname{Löw}(K)$ and $(T \otimes T')(K \hat\otimes K') = (TK) \hat\otimes (T'K')$,
the result follows from the previous special case.
Exercise 4.28 (Projective tensor product and the John ellipsoid). Compare
$\operatorname{John}(K \hat\otimes L)$ and $\operatorname{John}(K) \hat\otimes \operatorname{John}(L)$ when $K = L = B_2^n$ and when $K = L = \sqrt{n} B_1^n$.
in $\mathbb{R}^n$: among sets of given volume, the balls have the smallest surface area. If
$K \subset \mathbb{R}^n$ is sufficiently regular, the surface area can be defined as the first-order
variation of the volume of the “enlarged” set $K + \varepsilon B_2^n$ when $\varepsilon$ goes to 0:
(4.23) $\operatorname{area}(K) := \lim_{\varepsilon \to 0} \frac{\operatorname{vol}(K + \varepsilon B_2^n) - \operatorname{vol}(K)}{\varepsilon}.$
Note that for a general subset $K \subset \mathbb{R}^n$, some care is needed in defining area since
the limit in (4.23) may not exist or may not coincide with other notions of surface
area. However, such problems do not arise for convex sets.
A convenient formulation of the isoperimetric inequality uses the concept of
volume radius. Given a bounded measurable $K \subset \mathbb{R}^n$, its volume radius $\operatorname{vrad}(K)$
is defined as
(4.24) $\operatorname{vrad}(K) := \Big( \frac{\operatorname{vol}(K)}{\operatorname{vol}(B_2^n)} \Big)^{1/n}.$
In words, the volume radius of $K$ is the radius of the Euclidean ball which has the
same volume as $K$. A standard computation shows that
(4.25) $\operatorname{vol}(B_2^n) = \frac{\pi^{n/2}}{\Gamma\big(\frac{n}{2} + 1\big)}.$
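Formulas (4.24) and (4.25) translate directly into a few lines of code. The sketch below uses only the standard library; the function names are ours, and the cube $[-1, 1]^3$ is an arbitrary test case.

```python
import math

def vol_unit_ball(n):
    """Volume of B_2^n via (4.25): pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def vrad(volume, n):
    """Volume radius (4.24) of a set in R^n with the given volume."""
    return (volume / vol_unit_ball(n)) ** (1 / n)

# The cube [-1, 1]^3 has volume 2^3; its volume radius lies between its
# inradius 1 and its circumradius sqrt(3).
cube_vrad_3 = vrad(2.0 ** 3, 3)

# A Euclidean ball of radius 3 in R^5 has volume radius 3.
ball_vrad_5 = vrad(vol_unit_ball(5) * 3 ** 5, 5)
```

As a check, `vol_unit_ball(2)` and `vol_unit_ball(3)` reproduce the familiar values $\pi$ and $4\pi/3$.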
(4.26) $\operatorname{vrad}(K) = \Big( \int_{S^{n-1}} \|\theta\|_K^{-n} \, d\sigma(\theta) \Big)^{1/n}.$
Here is the statement of the isoperimetric inequality in $\mathbb{R}^n$ employing the notion
of volume radius.
Exercise 4.29 (Superadditivity of the volume radius). Show that the Brunn–
Minkowski inequality can be restated as $\operatorname{vrad}(K + L) \ge \operatorname{vrad}(K) + \operatorname{vrad}(L)$.
Exercise 4.30 (Superadditivity and log-concavity). Show that the inequalities
(4.21) and (4.22) are formally globally equivalent.
Exercise 4.31 (Steiner-like symmetrizations). Show that the following statement
is equivalent to the Brunn–Minkowski inequality for convex bodies. Let
$K \subset \mathbb{R}^n$ be a convex body and $E \subset \mathbb{R}^n$ a $k$-dimensional subspace with $0 < k < n$.
Define a set $L \subset E \times E^\perp$ by the following (where $x \in E$, $y \in E^\perp$)
$(x, y) \in L \iff |x| \le \operatorname{vrad}(K \cap (E + y))$
where the volume radius is measured in $E + y$. Then $L$ is convex. (When $E$ is a
hyperplane, the map $K \mapsto L$ defined above is called Steiner symmetrization.)
Exercise 4.32. Let $\mathscr{E} \subset \mathbb{R}^m$ and $\mathscr{F} \subset \mathbb{R}^n$ be two 0-symmetric ellipsoids. Show
the formula $\operatorname{vrad}(\mathscr{E} \otimes_2 \mathscr{F}) = \operatorname{vrad}(\mathscr{E}) \operatorname{vrad}(\mathscr{F})$.
4.3.2. log-concave measures. Closely related to the Brunn–Minkowski inequality is the concept of a log-concave measure. In our setting, log-concave measures appear as (limits of) marginals of uniform measures on convex sets.
Let $\mu$ be a measure on $\mathbb{R}^n$ with density $f$ with respect to the Lebesgue measure. We say that $\mu$ is log-concave if $\log f$ is a concave function. Similarly, given $\alpha > 0$, we say that $\mu$ is $\alpha$-concave if the function $f^\alpha$ is concave when restricted to the support of $\mu$. We now state basic facts about log- and $\alpha$-concave measures and
(2) There is a closed convex set $K \subset \mathbb{R}^n \times \mathbb{R}^s$ such that $\mu$ is the marginal over $\mathbb{R}^s$ of the Lebesgue measure restricted to $K$, i.e., such that, for any Borel set $B \subset \mathbb{R}^n$,
Exercise 4.35 ($\alpha$-concavity and marginals). Deduce Lemma 4.13 from the Brunn–Minkowski inequality (4.22) applied in $\mathbb{R}^s$.
Exercise 4.36 (Characterization of log-concave measures). Deduce Proposition 4.14 from Lemmas 4.12 and 4.13.
4.3.3. Mean width and the Urysohn inequality. Given a nonempty and bounded set $K \subset \mathbb{R}^n$ and a vector $u \in \mathbb{R}^n$, we define the quantity
$$w(K, u) := \sup_{x \in K} \langle u, x \rangle. \tag{4.29}$$
In the particular case when $K$ is a convex body containing 0 in the interior, we have $w(K, u) = \|u\|_{K^\circ}$ (see (1.8)). If $|u| = 1$, then $w(K, u)$ is called the support function of $K$ in direction $u$. (An alternative notation for the support function, widely used in convex geometry, is $h_K(u)$.) Geometrically, $w(K, u)$ is then the distance from the origin to the hyperplane tangent to $K$ in the direction $u$ (that is, with $u$ being normal to the hyperplane, and outer to $K$). In particular $w(K, u) + w(K, -u)$ is the width of the smallest strip in direction orthogonal to $u$ which contains $K$ (see Figure 4.3).
[Figure 4.3: a convex body $K$, the origin 0, a direction $u$, and the two distances $w(K, u)$ and $w(K, -u)$.]
From the geometric point of view, it might have been more accurate to call $w(K)$ the mean half-width (or, as some authors do, to include an additional factor 2 in the definition; observe that $w(K)$ is half of the average of $w(K, u) + w(K, -u)$). However, we opted for simplicity. Note that, under our convention, one has $w(B_2^n) = 1$, and that if $K$ is a convex body which contains the origin in the interior, then
$$w(K) = \int_{S^{n-1}} \|u\|_{K^\circ} \, d\sigma(u).$$
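To get a feeling for the mean width, one may estimate it by Monte Carlo integration over the sphere. The sketch below is only an illustration under our own naming: it uses the elementary facts that $w(B_\infty^n, u) = \|u\|_1$ and $w(B_1^n, u) = \|u\|_\infty$, samples $u$ uniformly on $S^{n-1}$ by normalizing a Gaussian vector, and compares the cube with the closed form $n \, \Gamma(n/2) / (\sqrt{\pi} \, \Gamma((n+1)/2))$ obtained from $\mathbf{E}|u_1|$.

```python
import math
import random

def random_unit_vector(n, rng):
    """Uniform point on S^(n-1), obtained by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]

def support_ball(u):
    """w(B_2^n, u) = |u|."""
    return math.sqrt(sum(x * x for x in u))

def support_cube(u):
    """w(B_inf^n, u) = ||u||_1."""
    return sum(abs(x) for x in u)

def support_cross_polytope(u):
    """w(B_1^n, u) = ||u||_inf."""
    return max(abs(x) for x in u)

def mean_width(support, n, samples=4000, seed=0):
    """Monte Carlo estimate of the integral of w(K, u) over the sphere."""
    rng = random.Random(seed)
    total = sum(support(random_unit_vector(n, rng)) for _ in range(samples))
    return total / samples

def exact_mean_width_cube(n):
    """n * E|u_1| for u uniform on the sphere."""
    ratio = math.exp(math.lgamma(n / 2.0) - math.lgamma((n + 1) / 2.0))
    return n * ratio / math.sqrt(math.pi)

n = 20
print("cube:", mean_width(support_cube, n), exact_mean_width_cube(n))
print("cross-polytope:", mean_width(support_cross_polytope, n))
```

The estimate for $B_1^n$ always lies between the inradius $1/\sqrt{n}$ and the outradius 1, since $1/\sqrt{n} \le \|u\|_\infty \le 1$ on the sphere.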
It is often convenient to consider the Gaussian variant of the mean width. Let $G$ be a standard Gaussian vector in $\mathbb{R}^n$, i.e., an $\mathbb{R}^n$-valued random variable whose coordinates in any orthonormal basis are independent and follow the $N(0, 1)$ distribution (see Appendix A). For any nonempty bounded set $K \subset \mathbb{R}^n$, we define the Gaussian mean width of $K$ as
$$w_G(K) := \mathbf{E}\, w(K, G) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \sup_{x \in K} \langle u, x \rangle \exp(-|u|^2/2) \, du. \tag{4.31}$$
Using (A.7), one checks that
and similarly for $w_G$. In the special case when $L$ is a singleton, this shows that the mean width (Gaussian or not) is translation-invariant.
An advantage of the Gaussian mean width is that it does not depend on the ambient dimension. Indeed, suppose that $K$ is a bounded subset in a subspace
For a longer chain of inequalities which includes also dual quantities, see Exercise 4.51. It is instructive to compare in Table 4.1 the values of these quantities for the most standard examples of convex bodies. For a derivation, see Exercises 4.38 and 6.6 (we postpone the nontrivial mean width computations to Chapter 6, where they fit more naturally).
Table 4.1. Radii for standard convex bodies in $\mathbb{R}^n$. Quantities in each row are non-decreasing from left to right, see (4.35) and Exercise 4.51. The simplex $K$ is normalized to be a regular simplex inscribed in the Euclidean ball of radius $\sqrt{n}$ centered at the origin. This normalization is appealing since it has the property that $K^\circ = -K$. When compared to the simplex $\Delta_n$ as defined in Section 1.1.2, $K$ is congruent to $\sqrt{n+1}\, \Delta_n$.

$K$ | $\operatorname{inrad}(K)$ | $w(K^\circ)^{-1}$ | $\operatorname{vrad}(K)$ | $w(K)$ | $\operatorname{outrad}(K)$
$B_2^n$ | $1$ | $1$ | $1$ | $1$ | $1$
$B_1^n$ | $1/\sqrt{n}$ | $\sim \sqrt{\pi/2n}$ | $\sim \sqrt{2e/\pi n}$ | $\sim \sqrt{2 \log n}/\sqrt{n}$ | $1$
$B_\infty^n$ | $1$ | $\sim \sqrt{n}/\sqrt{2 \log n}$ | $\sim \sqrt{2n/\pi e}$ | $\sim \sqrt{2n/\pi}$ | $\sqrt{n}$
simplex | $1/\sqrt{n}$ | $\sim 1/\sqrt{2 \log n}$ | $\sim \sqrt{e/2\pi}$ | $\sim \sqrt{2 \log n}$ | $\sqrt{n}$
We check in Table 4.1 that for all these basic examples of convex bodies, the volume radius and the mean width are of comparable order of magnitude, at least up to a logarithmic factor. This cannot be true for general convex bodies (see Exercise 4.42), but a convex body such that $\operatorname{vrad}(K)$ is much smaller than $w(K)$ has to be strongly “non-isotropic,” cf. Corollary 7.11.
The Urysohn inequality has a “dual” version, which is actually easier to prove since it depends only on the Hölder inequality.
Proposition 4.16. For every convex body $K \subset \mathbb{R}^n$ containing the origin in its interior, we have
Exercise 4.37 (The mean width of the polar). Let $K \subset \mathbb{R}^n$ be a convex body. Show that $w(K) w(K^\circ) \ge 1$.
Exercise 4.38. Derive the estimates about inradius, volume radius and outradius in Table 4.1. For the mean width, see Exercise 6.6.
Exercise 4.39 (Rough bounds on volume radius of $B_p^n$). Use the inequalities (1.4) between $\ell_p$-norms and the information on volume radii from Table 4.1 (or direct calculations) to conclude that $\operatorname{vrad}(B_p^n) \simeq n^{1/2 - 1/p}$ for $1 \le p \le \infty$.
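Exercise 4.40 (Volume of $B_p^n$). Let $1 \le p \le \infty$. By calculating $\int_{\mathbb{R}^n} e^{-\|x\|_p^p} \, dx$ in two different ways, show that $\operatorname{vol}(B_p^n) = \left(2\Gamma(1 + \frac{1}{p})\right)^n / \Gamma(1 + \frac{n}{p})$. Deduce that, for large $n$, $\operatorname{vrad}(B_p^n) \sim \frac{2\Gamma(1 + \frac{1}{p}) (pe)^{1/p}}{\sqrt{2\pi e}} \, n^{1/2 - 1/p}$ (the normalization by $\sqrt{2\pi e}$ comes from $\operatorname{vol}(B_2^n)^{1/n} \sim \sqrt{2\pi e/n}$ and agrees with the values in Table 4.1).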
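The formula of Exercise 4.40 is easy to test numerically. The following sketch (illustrative names, finite $p$ only in the asymptotic part) evaluates $\operatorname{vol}(B_p^n)$ through logarithms of Gamma functions, checks it on the classical cases $p = 1, 2$, and compares the volume radius with its leading-order approximation.

```python
import math

def log_vol_bp(n, p):
    """log vol(B_p^n) = n log(2 Gamma(1 + 1/p)) - log Gamma(1 + n/p)."""
    if math.isinf(p):
        return n * math.log(2.0)  # the cube [-1, 1]^n
    return n * (math.log(2.0) + math.lgamma(1.0 + 1.0 / p)) - math.lgamma(1.0 + n / p)

def vrad_bp(n, p):
    """Volume radius of B_p^n, normalizing by vol(B_2^n)."""
    return math.exp((log_vol_bp(n, p) - log_vol_bp(n, 2.0)) / n)

def vrad_bp_asymptotic(n, p):
    """Leading term 2 Gamma(1+1/p) (pe)^(1/p) n^(1/2-1/p) / sqrt(2 pi e), finite p."""
    numerator = 2.0 * math.gamma(1.0 + 1.0 / p) * (p * math.e) ** (1.0 / p)
    return numerator * n ** (0.5 - 1.0 / p) / math.sqrt(2.0 * math.pi * math.e)

print(math.exp(log_vol_bp(3, 1.0)))   # octahedron: 2^3 / 3!
print(math.exp(log_vol_bp(3, 2.0)))   # Euclidean ball: 4 pi / 3
for p in (1.0, 4.0):
    print(p, vrad_bp(2000, p), vrad_bp_asymptotic(2000, p))
```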
Exercise 4.41 (Uniqueness of outradius witness). Show that there is a unique Euclidean ball of minimal radius containing a given set $K \subset \mathbb{R}^n$.
Exercise 4.42 (The gap in Urysohn’s inequality). Give examples of convex bodies $K \subset \mathbb{R}^2$ such that the ratio $w(K)/\operatorname{vrad}(K)$ is arbitrarily large.
Exercise 4.43 (The mean width and the diameter). Show that for a convex body $K \subset \mathbb{R}^n$, $w(K) \ge \frac{1}{2} \frac{\kappa_1}{\kappa_n} \operatorname{diam} K$.
Exercise 4.44 (The mean width and the perimeter). For a convex body $K \subset \mathbb{R}^2$, show that $w(K)$ is equal to $(2\pi)^{-1}$ times the perimeter of $K$. For convex planar sets, the Urysohn inequality is therefore equivalent to the isoperimetric inequality.
wG pKq.
of the inequality from Proposition 4.16: $\exp\left(\int_{S^{n-1}} \log \|\theta\|_K \, d\sigma(\theta)\right) \ge \operatorname{vrad}(K)^{-1}$. In other words, the “geometric mean” of $\|\cdot\|_K$ is at least as large as $\operatorname{vrad}(K)^{-1}$, while inequality (4.36) asserts the same only about the “arithmetic mean” $w(K^\circ) = \int_{S^{n-1}} \|\theta\|_K \, d\sigma(\theta)$.
Exercise 4.49 (A proof of Urysohn’s inequality). (i) Explain in which sense the following generalization of the Brunn–Minkowski inequality holds and prove it: if $(\Omega, \mathcal{F}, \mu)$ is a measure space and $K_t \subset \mathbb{R}^n$ a convex body depending in a measurable way on a parameter $t \in \Omega$, then
$$\int_\Omega \operatorname{vol}(K_t)^{1/n} \, d\mu(t) \le \left(\operatorname{vol}\left(\int_\Omega K_t \, d\mu(t)\right)\right)^{1/n}. \tag{4.37}$$
(ii) Fix a convex body $K \subset \mathbb{R}^n$. By choosing $(\Omega, \mu)$ to be the orthogonal group $O(n)$ equipped with the Haar measure, and $K_t = t(K)$ for $t \in O(n)$, prove (4.34).
4.3.4. The Santaló and the reverse Santaló inequalities. When dealing with convex bodies, it is often convenient to consider the dual picture, involving the polar bodies. It turns out that the volume is especially well behaved with respect to the polar operation. This is the content of the Santaló and reverse Santaló inequalities.
Theorem 4.17 (Santaló and reverse Santaló inequalities, not proved here, but see Exercise 7.33). There is a constant $c > 0$ such that the following holds: for any $n \in \mathbb{N}$ and for any symmetric convex body $K \subset \mathbb{R}^n$, we have
$$c \le \operatorname{vrad}(K) \operatorname{vrad}(K^\circ) \le 1. \tag{4.38}$$
For a non-symmetric convex body $K \subset \mathbb{R}^n$, the product $\operatorname{vrad}(K) \operatorname{vrad}(K^\circ)$ may be arbitrarily large (and even infinite, if 0 belongs to the boundary of $K$). The correct version of the theorem in that context is as follows: any convex body $K \subset \mathbb{R}^n$ can be translated so that (4.38) holds. Moreover, it is known (see Proposition D.2 in Appendix D) that among the translates of $K$, the minimum of the volume of the polar (and hence of the product of the volume radii) occurs when the polar has centroid at 0. Such a point is unique and called the Santaló point of $K$.
The upper bound in (4.38) is also known as the Blaschke–Santaló inequality and can be proved through a symmetrization procedure. Note that a 0-symmetric ellipsoid $\mathcal{E} \subset \mathbb{R}^n$ satisfies $\operatorname{vrad}(\mathcal{E}) \operatorname{vrad}(\mathcal{E}^\circ) = 1$ and no other bodies saturate the upper bound. Concerning the lower bound, the best constants to date are $c = 1/2$ in the symmetric case and $c = 1/4$ in the general case (cf. Exercise 4.57).
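For a concrete instance of (4.38), one may take the polar pair $(B_1^n, B_\infty^n)$, whose volumes are $2^n/n!$ and $2^n$. The short sketch below (function names are ours) evaluates the product of the two volume radii; by the asymptotics of Table 4.1 it should approach $2/\pi \approx 0.637$, comfortably between the constant $1/2$ quoted above and 1.

```python
import math

def log_vol_ball(n):
    """log vol(B_2^n) = (n/2) log(pi) - log Gamma(n/2 + 1)."""
    return 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1.0)

def vrad_product_cube_pair(n):
    """vrad(B_1^n) * vrad(B_inf^n), with vol(B_1^n) = 2^n / n! and vol(B_inf^n) = 2^n."""
    log_vol_b1 = n * math.log(2.0) - math.lgamma(n + 1.0)
    log_vol_binf = n * math.log(2.0)
    return math.exp((log_vol_b1 + log_vol_binf - 2.0 * log_vol_ball(n)) / n)

for n in (1, 2, 10, 100, 500):
    print(n, vrad_product_cube_pair(n))
print("limit 2/pi =", 2.0 / math.pi)
```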
Exercise 4.50 (Santaló implies Urysohn). Using the Santaló inequality, deduce the Urysohn inequality (4.34) from its dual version (Proposition 4.16).
marks).
Recall that $K_\cap = (-K) \cap K$. The factor $2^{-n}$ may appear small, but remember that it is the $n$-th root of the volume that is the relevant quantity. In particular, in terms of volume radii, the conclusion of the second part of Proposition 4.18 simply becomes $\operatorname{vrad}(K_\cap) \ge \frac{1}{2} \operatorname{vrad}(K)$. It is natural to conjecture that among convex bodies of fixed volume with centroid at the origin, the volume of $K_\cap$ is minimized when $K$ is a simplex. This would lead to a constant $(2/e + o(1))^n$ instead of $2^{-n}$ in (4.39).
To prove Proposition 4.18, we use the following lemma (which is much simpler to prove for symmetric convex bodies, see Exercise 4.53).
Lemma 4.19 (Spingarn inequality). Let $K \subset \mathbb{R}^n$ be a convex body with centroid at the origin. If $E \subset \mathbb{R}^n$ is a (vector) subspace and $F = E^\perp$, we have the inequality
$$\operatorname{vol}(K) \le \operatorname{vol}_E(K \cap E) \operatorname{vol}_F(P_F K).$$
Recall that $\operatorname{vol}_H$ refers to the Lebesgue measure on an affine subspace $H \subset \mathbb{R}^n$.
Proof of Lemma 4.19. Define a function $\Phi : P_F K \to \mathbb{R}_+$ by
$$\Phi(x) = \operatorname{vol}_{E+x}(K \cap (E + x))^{1/k},$$
where $k = \dim E$. The Brunn–Minkowski inequality (4.22) implies that the function $\Phi$ is concave (see Exercise 4.31). Since concave functions can be realized as minima of affine functions, there exists a $y \in F$ such that for any $x \in P_F K$,
$$\Phi(x) \le \langle x, y \rangle + \Phi(0). \tag{4.40}$$
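By Fubini’s theorem and the Hölder inequality,
$$\operatorname{vol}(K) = \int_{P_F K} \Phi(x)^k \, dx \le \operatorname{vol}_F(P_F K)^{\frac{1}{k+1}} \left(\int_{P_F K} \Phi(x)^{k+1} \, dx\right)^{k/(k+1)}. \tag{4.41}$$
Next, by (4.40),
$$\int_{P_F K} \Phi(x)^{k+1} \, dx \le \int_{P_F K} \Phi(x)^k \left(\langle x, y \rangle + \Phi(0)\right) dx. \tag{4.42}$$
Since 0 is the centroid of $K$, we have $\int_{P_F K} \Phi(x)^k \langle x, y \rangle \, dx = 0$. Consequently, combining (4.41) and (4.42) we are led to
$$\operatorname{vol}(K) \le \operatorname{vol}_F(P_F K)^{\frac{1}{k+1}} \, \Phi(0)^{\frac{k}{k+1}} \operatorname{vol}(K)^{\frac{k}{k+1}}.$$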
concave and vanishes on the boundary of $P_F K$, therefore, for any $x \in P_F K$,
$$\Phi(x) \ge \Phi(0)\left(1 - \|x\|_{P_F K}\right).$$
It follows that
$$\operatorname{vol}(K) = \int_{P_F K} \Phi(x)^k \, dx \ge \operatorname{vol}_E(K \cap E) \int_{P_F K} \left(1 - \|x\|_{P_F K}\right)^k dx$$
and the last integral reduces to a Beta integral and equals $\binom{n}{k}^{-1} \operatorname{vol}_F(P_F K)$.
Lemma 4.20 implies a series of inequalities, all due to Rogers and Shephard, stating that the simplex is the convex body for which the volume increase is the largest after symmetrization. Their proofs are relegated to exercises.
Theorem 4.21 (see Exercise 4.54). If $K \subset \mathbb{R}^n$ is a convex body,
$$\operatorname{vol}(K) \le \operatorname{vol}((K - K)/2) \le 2^{-n} \binom{2n}{n} \operatorname{vol}(K). \tag{4.43}$$
As a consequence
2n
$$\operatorname{vol}(K_\cup) \le 2^n \operatorname{vol}(K).$$
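The two extreme cases of (4.43) can be seen in the plane with a few lines of code. The sketch below (our own helper names, integer coordinates so that the arithmetic is exact) builds the difference body $K - K$ of a polygon as the convex hull of pairwise differences of vertices and compares areas: for a triangle the ratio $\operatorname{vol}(K - K)/\operatorname{vol}(K)$ equals $\binom{4}{2} = 6$, while for a symmetric body such as a square it equals $2^2 = 4$.

```python
from math import comb

def cross(o, a, b):
    """Cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def area(polygon):
    """Shoelace formula for a polygon given by its vertices in order."""
    m = len(polygon)
    twice = sum(polygon[i][0] * polygon[(i + 1) % m][1]
                - polygon[(i + 1) % m][0] * polygon[i][1] for i in range(m))
    return abs(twice) / 2.0

def difference_body(vertices):
    """K - K for the convex polygon K = conv(vertices)."""
    return convex_hull([(a[0] - b[0], a[1] - b[1]) for a in vertices for b in vertices])

triangle = [(0, 0), (4, 0), (1, 3)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
for body in (triangle, square):
    print(area(difference_body(body)) / area(body), "upper bound:", comb(4, 2))
```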
(4.46)
for all $x, y \in \mathbb{R}^n$. Then
$$\int_{\mathbb{R}^n} h(x) \, dx \ge \left(\int_{\mathbb{R}^n} f(x) \, dx\right)^{\lambda} \left(\int_{\mathbb{R}^n} g(x) \, dx\right)^{1-\lambda}. \tag{4.47}$$
The Brunn–Minkowski inequality in the form (4.21) follows immediately from Theorem 4.24 applied with $f = \mathbf{1}_K$, $g = \mathbf{1}_L$, and $h = \mathbf{1}_{\lambda K + (1-\lambda)L}$ (the indicator functions of $K$, $L$, and $\lambda K + (1 - \lambda)L$). See Notes and Remarks for pointers to other functional inequalities.
Exercise 4.58. Using induction on the dimension, derive the general Prékopa–Leindler inequality from the case $n = 1$.
4.4. Volume of central sections and the isotropic position
Let $K \subset \mathbb{R}^n$ be a convex body with centroid at the origin. The inertia matrix of $K$ is defined as
$$I_K = \frac{1}{\operatorname{vol} K} \int_K |x\rangle\langle x| \, dx.$$
this position is unique in the following sense: if both $K$ and $TK$ are isotropic for some $T \in GL(n, \mathbb{R})$, then $T$ is a multiple of an orthogonal matrix. In particular, we
Isotropic convex bodies have the remarkable property that all their central
centroid at the origin, and assume that $I_K = \lambda^2 I$ for some $\lambda > 0$. Then, for any linear hyperplane $H \subset \mathbb{R}^n$,
$$c \, \frac{\operatorname{vol}_n(K)}{\lambda} \le \operatorname{vol}_{n-1}(K \cap H) \le C \, \frac{\operatorname{vol}_n(K)}{\lambda}, \tag{4.48}$$
where $c = \frac{1}{2\sqrt{3}}$ and $C = \frac{1}{\sqrt{2}}$.
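A toy check of (4.48) can be done by hand in dimension 2, where a “hyperplane” is a line through the origin. For the unit square $K = [-1/2, 1/2]^2$ one has $\operatorname{vol}_2(K) = 1$ and $I_K = \frac{1}{12} I$, so $\lambda = 1/\sqrt{12}$, and the central chord in direction $(\cos\varphi, \sin\varphi)$ has length $1/\max(|\cos\varphi|, |\sin\varphi|)$. The sketch below (our own illustration, not taken from the text) verifies that all such chords lie within the bounds of (4.48) and that the lower bound is attained by the coordinate directions.

```python
import math

C_LOWER = 1.0 / (2.0 * math.sqrt(3.0))   # the constant c in (4.48)
C_UPPER = 1.0 / math.sqrt(2.0)           # the constant C in (4.48)

def central_chord_length_unit_square(phi):
    """Length of the chord of [-1/2, 1/2]^2 through 0 in direction (cos phi, sin phi)."""
    return 1.0 / max(abs(math.cos(phi)), abs(math.sin(phi)))

volume = 1.0
lam = math.sqrt(1.0 / 12.0)   # since the integral of x^2 over [-1/2, 1/2] is 1/12

lengths = [central_chord_length_unit_square(2.0 * math.pi * j / 360.0) for j in range(360)]
print("lower bound:", C_LOWER * volume / lam)
print("chord lengths range over:", min(lengths), max(lengths))
print("upper bound:", C_UPPER * volume / lam)
```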
A very important open problem is how the two parameters $\lambda$ and $\operatorname{vol}_n(K)$ appearing in (4.48) are related. The hyperplane conjecture postulates that, for every convex body $K$ with $\operatorname{vol}_n(K) = 1$ and $I_K = \lambda^2 I$, we have $\lambda \le C_0$ for an absolute constant $C_0$; see Notes and Remarks for more background on this conjecture.
For some special bodies much more precise estimates are available.
volume radius of sections through its centroid. (The reader who wonders why such relationships may be relevant in the context of this book may check Section 9.3.)
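Proposition 4.28. Let $K$ be an $n$-dimensional convex body with centroid at $a$, and let $H$ be a $k$-codimensional affine subspace passing through $a$. Denote $\theta = k/n$ and let $r$ and $R$ be the inradius and outradius of $K$ with respect to $a$. Then
$$R^{-\theta} \, b(n, k) \le \frac{\operatorname{vrad}(K \cap H)^{1-\theta}}{\operatorname{vrad}(K)} \le r^{-\theta} \, b(n, k) \binom{n}{k}^{1/n}, \tag{4.50}$$
where
$$b(n, k) := \left(\frac{\operatorname{vol}_n(B_2^n)}{\operatorname{vol}_k(B_2^k) \operatorname{vol}_{n-k}(B_2^{n-k})}\right)^{1/n} = \left(\frac{\Gamma(\frac{k}{2} + 1)\,\Gamma(\frac{n-k}{2} + 1)}{\Gamma(\frac{n}{2} + 1)}\right)^{1/n}. \tag{4.51}$$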
Proof. We may assume that $a = 0$ (otherwise consider $K - a$). By hypothesis, we have then
where $B_2^n$ is the $n$-dimensional unit Euclidean ball. For a subspace $E$, denote by $P_E$ the orthogonal projection onto $E$. Then, by Lemma 4.19,
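$$\operatorname{vol}_n(K) \le \operatorname{vol}_s(K \cap H) \operatorname{vol}_k(P_{H^\perp} K),$$
where $s = n - k$ is the dimension of $H$. Since $P_{H^\perp} K$ is contained in a $k$-dimensional Euclidean ball of radius $R$, this can be rewritten in terms of volume radii as
$$\operatorname{vrad}(K)^n \le \operatorname{vrad}(K \cap H)^s \, R^k \, \frac{\operatorname{vol}_s(B_2^s) \operatorname{vol}_k(B_2^k)}{\operatorname{vol}_n(B_2^n)},$$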
which is the first inequality in (4.50). For the second inequality, we note that by Lemma 4.20, which does not even require that $H$ passes through the centroid of $K$,
$$\operatorname{vol}_n(K) \ge \binom{n}{k}^{-1} \operatorname{vol}_s(K \cap H) \operatorname{vol}_k(P_{H^\perp} K). \tag{4.54}$$
As earlier, this can be rewritten in terms of volume radii as
$$\operatorname{vrad}(K)^n \ge \binom{n}{k}^{-1} \operatorname{vrad}(K \cap H)^s \, r^k \, \frac{\operatorname{vol}_s(B_2^s) \operatorname{vol}_k(B_2^k)}{\operatorname{vol}_n(B_2^n)},$$
which is the second inequality in (4.50).
Remark 4.29. Although the argument that led to bounds (4.50) looks rough, we note that we always have (see Exercise 4.60)
$$\frac{1}{\sqrt{2}} < b(n, k) < 1 < b(n, k) \binom{n}{k}^{1/n} < \sqrt{2}. \tag{4.55}$$
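The bounds (4.55) are easy to explore numerically from the Gamma-function expression in (4.51). The following sketch (function names are ours) evaluates $b(n, k)$ via `math.lgamma` and checks the chain of inequalities for small dimensions, with $1 \le k \le n - 1$; it is of course a sanity check and not a proof.

```python
import math

def b(n, k):
    """b(n, k) from (4.51), computed through logarithms of Gamma functions."""
    log_b = (math.lgamma(k / 2.0 + 1.0)
             + math.lgamma((n - k) / 2.0 + 1.0)
             - math.lgamma(n / 2.0 + 1.0)) / n
    return math.exp(log_b)

def satisfies_bounds(n, k):
    """Check 1/sqrt(2) < b < 1 < b * binom(n, k)^(1/n) < sqrt(2)."""
    low = b(n, k)
    high = low * math.comb(n, k) ** (1.0 / n)
    return 1.0 / math.sqrt(2.0) < low < 1.0 < high < math.sqrt(2.0)

for n, k in ((2, 1), (10, 5), (40, 20), (40, 1)):
    print(n, k, b(n, k), b(n, k) * math.comb(n, k) ** (1.0 / n))
```

The proportional case $k = n/2$ is the one where both $b(n, k)$ and $b(n, k)\binom{n}{k}^{1/n}$ come closest to the extreme values $1/\sqrt{2}$ and $\sqrt{2}$.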
Exercise 4.59 (Isotropic position and central sections). (i) Let $f : \mathbb{R} \to \mathbb{R}_+$ be an even function such that $\log f$ is concave and $\int f(x) \, dx = 1$. Show that $\frac{1}{12 f(0)^2} \le \int x^2 f(x) \, dx \le \frac{1}{2 f(0)^2}$. (This conclusion also holds if the assumption “$f$ is even” is replaced by “$\int x f(x) \, dx = 0$,” but the proof is more involved, see [Fra99].)
(ii) Use (i) to prove Proposition 4.26.
Exercise 4.60. Prove the bounds (4.55).
Notes and Remarks
A comprehensive reference for geometry and for convex bodies focusing on the issues related to the Brunn–Minkowski inequality is the book [Sch14].
Section 4.1. The Banach–Mazur distance is most frequently defined in the category of normed spaces with
$$d(X, Y) := \inf\{\|T\| \cdot \|T^{-1}\| : T : X \to Y \text{ an isomorphism}\}.$$
This corresponds to definition (4.2) with $K$, $L$ being 0-symmetric (and, consequently, $a = b = 0$).
The question of computing the diameter of (various versions of) the Banach–Mazur compactum has attracted a lot of attention. It follows from Exercise 4.20 that the diameter is at most $n$. In an important and short paper [Glu81], Gluskin showed that this estimate is asymptotically sharp via the probabilistic method. A variant of his argument shows that if we denote by $K_n$, $K'_n$ two randomly and independently chosen $n$-dimensional sections of the $3n$-dimensional cube, then with large probability $d_{BM}(K_n, K'_n) \gtrsim n$. Remarkably, no explicit example of a pair of convex bodies more than $C\sqrt{n}$ apart is known. It is proved in [Sza90] that $d_{BM}(K_n, B_\infty^n) \gtrsim \sqrt{n} \log n$ for some randomly constructed $K_n$.
In the non-symmetric case, the order of growth of the diameter of the Banach–
an upper bound of $Cn^{4/3} \log^C n$ was shown in [Rud00], which improves on the
Section 4.2. John’s theorem was first proved (in a slightly different form) in [Joh48]. We refer to [Bal97] for a modern proof (arguments already appeared in [Bal92a]) and to [Hen12] for historical aspects. The reduction of the general setting to the symmetric case presented here (Proposition 4.4, and the proofs of Propositions 4.6 and 4.7) appears to be new.
The concept of convex bodies with “enough symmetries” was defined in [GG71]; see also Chapter 16 in [TJ89].
The affinity between projective tensor products and Löwner ellipsoids (Lemma 4.9) was noted in [Sza05, AS06].
Section 4.3. The Brunn–Minkowski inequality (4.22) was first proved in dimensions 2 and 3 by Brunn and extended by Minkowski to higher dimensions. The equality case is known: when $K$, $L$ are convex bodies and $0 < \lambda < 1$, the inequality (4.21) is an equality if and only if $K$ and $L$ are homothetic. The equality case was extended by Lusternik to the general case and is essentially the same up to null sets; for precise statements, and for a panorama of inequalities connected to the isoperimetric inequalities, we refer to the survey [Gar02]. Far-reaching generalizations of the
The two sides of the inequality (4.22) can be very different; for example, if $K$ and $L$ are perpendicular segments in $\mathbb{R}^2$ (hence of volume 0), $K + L$ is a rectangle, and this behavior can be approximated in the category of convex bodies by replacing segments with narrow rectangles. It is therefore surprising that (4.22) can be reversed, up to a universal constant (see (7.32) in Notes and Remarks on Section 7.2). A vaguely similar reverse of the Urysohn inequality (4.34) can be found
(e.g., $\operatorname{vol}_{2n}(\Theta) \ge c \operatorname{vol}_n(K) \operatorname{vol}_n(L)$ for an appropriate universal constant $c \in (0, 1)$),
first proof of the lower bound is due to Bourgain and Milman [BM87]. Other, quite different, proofs were given later by Kuperberg [Kup08] (which gives the values of $c$ quoted in the text) and Nazarov [Naz12] (we recommend the notes [RZ14] for a detailed presentation of Nazarov’s argument). However, no elementary proof is known (a simple argument giving a lower bound $\operatorname{vrad}(K) \operatorname{vrad}(K^\circ) \gtrsim 1/\log n$ appears in [Kup92]).
It is conjectured that the product $\operatorname{vrad}(K) \operatorname{vrad}(K^\circ)$ in (4.38) is minimized for the pair $(B_1^n, B_\infty^n)$ (and for the family of Hanner polytopes, defined as the smallest class of polytopes containing $[-1, 1]$ and stable under the operations $K \mapsto K^\circ$ and $(K, L) \mapsto K \times L$; cf. Exercise 4.52) and, in the non-symmetric case, for $K = \Delta_n$ (the minimum being then conjectured to be unique). This is the content of the so-called Mahler conjecture.
Several inequalities for which the Euclidean ball is the extremal case, such as the isoperimetric inequality, the Urysohn inequality and the Santaló inequality (the upper bound in (4.38)), can be proved using symmetrizations. For example one may consider the Steiner symmetrizations as defined in Exercise 4.31. A useful result is then the fact that, given any convex body $K \subset \mathbb{R}^n$, there is a choice of successive Steiner symmetrizations that converge to a Euclidean ball of radius $\operatorname{vrad}(K)$ (see, e.g., Theorem 1.1.16 in [AAGM15] for a sketch of proof).
Proposition 4.18 appears in [MP00] and Lemma 4.19 in [Spi93]. Lemma 4.20 is from [RS58]; a simpler proof can be found in [Cha67].
Theorem 4.24 was shown in [Lei72] and [Pré71, Pré73], see also [BL75, BL76]. A complete compact proof can be found in [AAGM15] or [Gar02], the latter of which also sketches historical background and contains many further references.
(see also [AAKM04]), and of its reverse [KM05]; see also [AAS15] and [CFG+16] for more recent contributions and references.
Functional versions of Rogers–Shephard inequalities were considered starting
Section 4.4. A very complete reference about the geometry of convex bodies in isotropic position (including the most recent developments) is the book [BGVV14]. Proposition 4.26 was proved by Hensley [Hen80] for symmetric convex bodies and the symmetry assumption was removed in [Fra99].
The hyperplane conjecture (also known as the “slicing problem”) asserts that any convex body of volume 1 in $\mathbb{R}^n$ admits a hyperplane section of volume larger
hyperplane $H$ containing the origin, can we conclude that $\operatorname{vol}_n(K) \le \operatorname{vol}_n(L)$? It is known that the answer is affirmative when $n \le 4$ and negative when $n \ge 5$ (see [Kol05] for references).
Proposition 4.27 is due to Vaaler ([Vaa79], the lower bound) and Ball ([Bal89], the upper bound).
Proposition 4.28 is from [SWŻ08]. It is instructive to compare Propositions 4.26 and 4.28. The first one gives very precise estimates for volumes of hyperplane sections in the isotropic position, while the second one deals with sections of proportional (or subproportional) codimension, but only at the level of the volume radius, that is, after raising the volumes to the power of 1 over the dimension.
CHAPTER 5
Metric Entropy and Concentration of Measure in Classical Spaces
This chapter presents two fundamental concepts which will be applied in later chapters: the metric entropy (a.k.a. packing and covering) and the concentration of measure. Their conjunction leads to the Dvoretzky theorem, which will be presented in Chapter 7.
5.1. Nets and packings
We will introduce now the complementary concepts of covering numbers (also called metric entropy) and packing numbers, which quantify the complexity of a given compact metric set. It will turn out that these parameters are closely related to the volume and the mean width considered in the preceding chapter.
We first analyze the special but fundamental cases of the sphere and the discrete cube. We subsequently discuss classical groups and manifolds, and general convex bodies.
cardinality of an $\varepsilon$-net in $K$.
A subset $P \subset K$ is called $\varepsilon$-separated if any pair $(x, y)$ of distinct elements from $P$ satisfies $d(x, y) > \varepsilon$. This property implies that the balls of radius $\varepsilon/2$ centered at elements of $P$ are disjoint (a configuration usually referred to as packing, whence the usage of the letter $P$; see Figure 5.1), and in most contexts the two properties are essentially equivalent. We denote by $P(K, \varepsilon)$ or $P(K, d, \varepsilon)$ the largest
$N(K, d, \varepsilon)$, and its various generalizations, is also often referred to as the metric entropy of $(K, d)$.
For any compact metric space $K$, the following two relations between nets and packings are fundamental. First, if $\mathcal{P}$ is a $2\varepsilon$-separated set and $\mathcal{N}$ is an $\varepsilon$-net, then the open balls of radius $\varepsilon$ centered at elements from $\mathcal{N}$ cover $K$, and each ball contains at most one element of $\mathcal{P}$. Second, an $\varepsilon$-separated set which is maximal (with respect to inclusion) is an $\varepsilon$-net (the reader not familiar with this circle of ideas is encouraged to check these elementary facts). It follows that we have the inequalities
$$P(K, 2\varepsilon) \le N(K, \varepsilon) \le P(K, \varepsilon). \tag{5.1}$$
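The second fact behind (5.1) is constructive: scanning the points of a finite set and keeping each point that is more than $\varepsilon$ away from those already kept yields a maximal $\varepsilon$-separated set, hence an $\varepsilon$-net. The sketch below illustrates this on a random finite subset of the unit square. The helper names are ours, and we assume the convention that an $\varepsilon$-net leaves every point within distance at most $\varepsilon$ of the net (the definition itself is not reproduced above).

```python
import math
import random

def greedy_separated(points, eps, dist=math.dist):
    """Greedily build a maximal eps-separated subset of a finite set of points."""
    chosen = []
    for p in points:
        if all(dist(p, q) > eps for q in chosen):
            chosen.append(p)
    return chosen

def is_separated(subset, eps, dist=math.dist):
    """True if all pairs of distinct elements are more than eps apart."""
    return all(dist(p, q) > eps for i, p in enumerate(subset) for q in subset[:i])

def is_net(subset, points, eps, dist=math.dist):
    """True if every point lies within distance eps of some element of subset."""
    return all(any(dist(p, q) <= eps for q in subset) for p in points)

rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(500)]
eps = 0.2
net = greedy_separated(points, eps)
coarse_packing = greedy_separated(points, 2 * eps)
print(len(coarse_packing), "<=", len(net))
```

The printed comparison reflects the left inequality in (5.1): a $2\varepsilon$-separated set can never have more elements than an $\varepsilon$-net.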
Figure 5.1. A net (left) and a packing (right) for an equilateral triangle (with the Euclidean metric in $\mathbb{R}^2$). For optimal packings or covering with few “classical” convex bodies in the plane (squares, circles or triangles), see the website [@1].
Packings and coverings have been extensively studied, particularly for “standard” metric spaces. In various applications it is useful to know that there exist “large” packings and/or “small” nets, and often to be able to exhibit them in a constructive manner. By (5.1), both notions are equivalent whenever the resolution parameter $\varepsilon$ is specified only up to a multiplicative constant. On the other hand, for some applications, such as coding theory, very precise results are in high demand.
In many situations the isometry group of $K$ acts transitively and preserves a natural probability measure $\mu$. In particular, all balls of radius $\varepsilon$ have then the same measure, denoted by $V(\varepsilon)$, and we have the simple inequalities
$$\frac{1}{V(\varepsilon)} \le N(K, \varepsilon) \le P(K, \varepsilon) \le \frac{1}{V(\varepsilon/2)}. \tag{5.2}$$
Exercise 5.1. Here, we introduce variations on the definitions and check their equivalence. Let $M$ be a metric space and $K$ a compact subset. Denote by $N'(K, \varepsilon)$
$P'(K, \varepsilon)$ be the largest cardinality of a family of disjoint open balls of radius $\varepsilon/2$ with centers in $K$. Check the inequalities
sphere. The first point of business will be a discussion of volumes of spherical caps, which enter the subject via (5.2).
5.1.2.1. Estimates on volumes of spherical caps. Given $x_0 \in S^{n-1}$, let $C(x_0, \varepsilon)$ be the cap of center $x_0$ and geodesic radius $\varepsilon$, and denote $V(\varepsilon) = \sigma(C(x_0, \varepsilon))$ ($\varepsilon \in [0, \pi]$ is tacitly assumed). We have
$$V(\varepsilon) = \frac{\int_0^\varepsilon \sin^{n-2}\theta \, d\theta}{\int_0^\pi \sin^{n-2}\theta \, d\theta}. \tag{5.3}$$
The denominator at the right-hand side of (5.3) (Wallis integral) equals $\sqrt{2\pi}/\kappa_{n-1}$. Note that $V(\pi - \varepsilon) = 1 - V(\varepsilon)$, in particular $V(\pi/2) = 1/2$. For fixed $0 < \varepsilon < \pi/2$, $V(\varepsilon)$ tends to 0 exponentially fast in the dimension: one has $V(\varepsilon)^{1/n} \sim \sin(\varepsilon)$. The following proposition gives elementary but reasonably precise bounds. The first one is sharp when the radius is small, and the second one for a radius slightly smaller than $\pi/2$.
Proposition 5.1. If $0 \le t \le \pi/2$, then $V(t) \le \frac{1}{2} \sin^{n-1}(t)$. More precisely
$$(\sqrt{2\pi}\, \kappa_n)^{-1} (\sin t)^{n-1} \le V(t) \le (\sqrt{2\pi}\, \kappa_n \cos t)^{-1} (\sin t)^{n-1}, \tag{5.4}$$
where $\kappa_n \sim \sqrt{n}$ is given by (A.8). Moreover, if $n > 2$, then
$$V(\pi/2 - t) \le \frac{1}{2} \exp(-nt^2/2). \tag{5.5}$$
Figure 5.2. Proof that $V(t) \le \frac{1}{2} \sin^{n-1}(t)$. The surface area of $C(x, t)$ (bold) does not exceed the surface area of a half-sphere of
A proof of (5.4) is sketched in Exercise 5.4. It is based on the fact that, for convex sets, surface area is monotone with respect to inclusion (Exercise 5.2). The inequality (5.5) is from [Jen13] (see also [JS]); a version with $n - 1$ instead of $n$ in the exponent is proved in Exercise 5.3.
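The bounds of Proposition 5.1 can be compared with a direct numerical evaluation of (5.3). The sketch below integrates $\sin^{n-2}$ with a composite Simpson rule; all names are ours. We assume, as is consistent with the Wallis integral identity quoted above, that the constant of (A.8) is $\kappa_n = \mathbf{E}|G| = \sqrt{2}\, \Gamma(\frac{n+1}{2}) / \Gamma(\frac{n}{2})$ for a standard Gaussian vector $G$ in $\mathbb{R}^n$. A small relative slack absorbs the quadrature error.

```python
import math

def kappa(n):
    """Assumed value of kappa_n: sqrt(2) * Gamma((n+1)/2) / Gamma(n/2)."""
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2.0) - math.lgamma(n / 2.0))

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with m (even) subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def cap_measure(n, t):
    """V(t) from (5.3): normalized measure of a cap of geodesic radius t."""
    f = lambda theta: math.sin(theta) ** (n - 2)
    return simpson(f, 0.0, t) / simpson(f, 0.0, math.pi)

def check_proposition_5_1(n, t, slack=1e-6):
    """Check (5.4), V(t) <= sin^(n-1)(t)/2 and (5.5) for 0 < t < pi/2, n > 2."""
    v = cap_measure(n, t)
    s = math.sin(t) ** (n - 1)
    c = math.sqrt(2.0 * math.pi) * kappa(n)
    lower = s / c
    upper = s / (c * math.cos(t))
    gaussian = 0.5 * math.exp(-n * (math.pi / 2.0 - t) ** 2 / 2.0)  # (5.5) at pi/2 - t
    tol = 1.0 + slack
    return (lower <= v * tol and v <= upper * tol
            and v <= 0.5 * s * tol and v <= gaussian * tol)

for n in (3, 10, 50):
    print(n, [check_proposition_5_1(n, t) for t in (0.2, 0.5, 1.0, 1.4)])
```

For $n = 3$ the cap measure has the closed form $V(t) = (1 - \cos t)/2$, which gives an independent check of the quadrature.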
The following fact is only marginally used in what follows, but we include it since we did not encounter it in the convexity/functional analysis literature.
Proposition 5.2 (Concavity properties of $V(\cdot)$, see Exercise 5.5). If $V(r)$ is the measure of a spherical cap of radius $r$, then the function $t \mapsto \log V(e^t)$ is concave. A fortiori, the function $r \mapsto \log V(r)$ is strictly concave on $[0, \pi]$.
if $K \subset L$ are convex bodies, then $\operatorname{area}(K) \le \operatorname{area}(L)$.
Exercise 5.3. Using Exercise 5.2, show that for $t \in [0, \pi/2]$, we have $V(t) \le \frac{1}{2} \sin^{n-1}(t)$. Conclude that
$$V(\pi/2 - t) \le \frac{1}{2} (\cos t)^{n-1} \le \frac{1}{2} \exp(-(n - 1)t^2/2).$$
This is only slightly weaker than the bound (5.5) and sharper than the estimates typically cited in the literature.
Exercise 5.4 (Sharp bounds for volumes of caps). Using Exercise 5.2, show the inequalities (5.4). Then strengthen the lower bound to $(\sqrt{2\pi}\, \kappa_n \cos(t/2))^{-1} \sin^{n-1} t$.
Exercise 5.5 (Concavity properties of $V(\cdot)$). Prove Proposition 5.2 and derive the inequality (5.6).
5.1.2.2. Nets in the sphere. If $\varepsilon \in [\pi/2, \pi)$, we clearly have $N(S^{n-1}, g, \varepsilon) = 2$. The interesting case is when $\varepsilon \in (0, \pi/2)$. In that range, the proportion $V(\varepsilon)$ of the sphere covered by a cap of geodesic radius $\varepsilon$ decays exponentially with $n$. It follows that the cardinality of $\varepsilon$-nets also grows exponentially fast. For example, the first estimate from Proposition 5.1 implies that, for $\varepsilon \in (0, \pi/2)$,
$$N(S^{n-1}, g, \varepsilon) \ge V(\varepsilon)^{-1} \ge \frac{2}{\sin^{n-1} \varepsilon}. \tag{5.7}$$
A basic and extremely useful bound for $\varepsilon$-nets (formulated in the extrinsic distance) is the following
The standard and often quoted volumetric argument (which is a special case of Lemma 5.8 below) gives a slightly worse bound $(1 + 2/\varepsilon)^n$. The improved bound $(2/\varepsilon)^n$ can be achieved by a finer analysis combining a version (based on [Dum07]) of Proposition 5.4 below with the use of explicit nets in lower dimensions, see [Swe]. We also note that there exist simple explicit $\varepsilon$-nets in $S^{n-1}$ with cardinality at most $(C/\varepsilon)^n$ (see Exercise 5.22).
To discuss finer results it is more convenient to switch to the geodesic distance. We know from the volume argument (5.2) that $N(S^{n-1}, g, \varepsilon) \ge V(\varepsilon)^{-1}$. It turns out that this trivial estimate is remarkably sharp: an almost-matching upper estimate is provided by an elegant random covering argument due to Rogers.
Proposition 5.4 (Random covering bound). For every $0 < \eta < \theta$, we have
$$N(S^{n-1}, g, \theta + \eta) \le \left\lceil \frac{1}{V(\theta)} \log \frac{V(\theta)}{V(\eta)} \right\rceil + \frac{1}{V(\theta)}.$$
Proof. Let $N = \left\lceil \frac{1}{V(\theta)} \log\left(V(\theta)/V(\eta)\right) \right\rceil$. Choose $(x_i)_{1 \le i \le N}$ randomly, independently according to $\sigma$, and denote $A = \bigcup \{C(x_i, \theta) : 1 \le i \le N\}$. The expected proportion of the sphere missed by $A$ can be computed using the Fubini–Tonelli theorem
$$\mathbf{E}\, \sigma(S^{n-1} \setminus A) = (1 - V(\theta))^N \le \exp(-N V(\theta)) \le \frac{V(\eta)}{V(\theta)}. \tag{5.8}$$
In particular, there exist $(x_i)$ such that $\sigma(S^{n-1} \setminus A) \le V(\eta)/V(\theta)$. Let $\{C(y_j, \eta) : 1 \le j \le M\}$ be a maximal family of disjoint balls of radius $\eta$ contained in $S^{n-1} \setminus A$. It follows from (5.8) that $M \le 1/V(\theta)$. By construction, $S^{n-1}$ is covered by the family
$$\{B(x_i, \theta + \eta) : 1 \le i \le N\} \cup \{B(y_j, 2\eta) : 1 \le j \le M\}.$$
Corollary 5.5 (Neat random covering bound, see Exercise 5.8). For every $0 < \varepsilon < \pi/2$, we have
$$N(S^{n-1}, g, \varepsilon) \le C n \log n \, V(\varepsilon)^{-1} \tag{5.9}$$
for some absolute constant $C$.
It follows from (5.7), (5.9) and (5.4) that, for a fixed $\varepsilon \in (0, \pi/2)$, we have
$$\lim_{n \to \infty} \frac{1}{n} \log N(S^{n-1}, g, \varepsilon) = -\log(\sin \varepsilon). \tag{5.10}$$
We note for future reference the following fact.
Proposition 5.6. Let $P \subset \mathbb{R}^n$ be a polytope such that $d_{BM}(P, B_2^n) \le \lambda$. Then $P$ has at least $2 \exp((n - 1)/(2\lambda^2))$ vertices and at least $2 \exp((n - 1)/(2\lambda^2))$ facets.
Proof. Consider first the statement about vertices. Without loss of generality we may assume that $\lambda^{-1} B_2^n \subset P \subset B_2^n$, and that the vertices of $P$ are unit vectors. Let $V$ be the set of vertices of $P$. The hypothesis is equivalent to saying that $V$ is a $\theta$-net in $(S^{n-1}, g)$ for $\cos \theta = 1/\lambda$ (see Exercise 5.7). Using (5.7), it follows that $\operatorname{card} V \ge 2 (\sin \theta)^{-(n-1)} \ge 2 \exp((n - 1)/(2\lambda^2))$, where we used the inequality $\sin \arccos t \le \exp(-t^2/2)$ for $0 \le t \le 1$. Since $d_{BM}(P, B_2^n) = d_{BM}(P^\circ, B_2^n)$, and since vertices of $P^\circ$ are in bijection with facets of $P$, the statement about facets follows.
We also point out that it is possible to approximate the sphere by polytopes with at most exponentially many vertices and, simultaneously, at most exponentially
Exercise 5.10 (Nets in the projective space). Prove the following result, which will be useful in Sections 8.1 and 9.4. Let $\varepsilon \in (0, \pi/2)$. If $\mathcal{N}$ is an $\varepsilon$-net in the projective space $\mathbf{P}(\mathbb{C}^d)$ (equipped with the Fubini-Study metric (B.5)), then $\operatorname{card} \mathcal{N} \ge (c/\varepsilon)^{2d-2}$ for some absolute positive constant $c$. In the opposite direction, there exists an $\varepsilon$-net of cardinality not exceeding $(C/\varepsilon)^{2d-2}$.
Exercise 5.11 (Volume of balls in $\mathbf{P}(\mathbb{C}^d)$). Consider the projective space $\mathbf{P}(\mathbb{C}^d)$ equipped with the Fubini-Study metric (B.5) and the invariant probability measure. If $\varepsilon \in (0, \pi/2]$, then the measure of any ball of radius $\varepsilon$ in $\mathbf{P}(\mathbb{C}^d)$ is $\sin^{2d-2} \varepsilon$.
5.1.2.3. Packing on the sphere. Recall that P pS n´1 , g, εq is the maximal num-
ut
ber of disjoint caps of geodesic radius ε{2. The exact value is known for π{2 ď ε ă π
(we have P pS n´1 , g, π{2q “ 2n, see Exercise 5.12) and so we restrict our discussion
rib
to the range 0 ă ε ă π{2.
Packing problems are usually harder than covering problems. For example, as opposed to (5.10), the exponential rate at which packing numbers increase, i.e., the value of
$$p(\varepsilon) = \limsup_{n \to \infty} \frac{1}{n} \log P(S^{n-1}, g, \varepsilon)$$
is not known for $\varepsilon \in (0, \pi/2)$. We know from (5.2) that $V(\varepsilon)^{-1} \leq P(S^{n-1}, g, \varepsilon) \leq V(\varepsilon/2)^{-1}$, and therefore
$$-\log \sin(\varepsilon) \leq p(\varepsilon) \leq -\log \sin(\varepsilon/2). \tag{5.11}$$
the lower bound $p(\varepsilon) \geq -\log \sin \varepsilon$ has never been improved: nobody knows how to
$$p(\varepsilon) \leq -\log\big(\sqrt{2}\, \sin(\varepsilon/2)\big)$$
which matches the lower bound from (5.11) as $\varepsilon$ increases to $\pi/2$. For small $\varepsilon$, further improvements due to Kabatjanskiĭ–Levenšteĭn are based on the so-called
(iii) Deduce that, for $r \geq 2$, there is a polytope $P$ with at most $(Cr)^{Cn/r^2}$ vertices such that $d_g(P, B_2^n) \leq r$.
5.1.3. Nets and packings in the discrete cube. Although the discussion from the previous sections dealt specifically with spheres, some ideas carry over directly to other settings. As an illustration we consider the case of the discrete cube $\{0,1\}^n$ (a.k.a. Boolean cube) equipped with the normalized Hamming distance
$$d_H(x, y) = \frac{1}{n} \operatorname{card}\{i : x_i \neq y_i\}. \tag{5.12}$$
We denote by $V(t)$ the volume (i.e., the cardinality) of a ball of radius $t \in (0, 1)$. We have
$$V(t) = \operatorname{card}\{y \in \{0,1\}^n : d_H(x, y) \leq t\} = \sum_{k=0}^{\lfloor tn \rfloor} \binom{n}{k}.$$
The quantity $V(t)$ is governed by the binary entropy function $H$ defined for $x \in (0, 1)$ by $H(x) = -x \log_2 x - (1 - x) \log_2 (1 - x)$. For $t \leq 1/2$ such that $tn$ is an integer, we have (see Exercise 5.15)
$$\frac{1}{n+1}\, 2^{nH(t)} \leq V(t) \leq 2^{nH(t)}. \tag{5.13}$$
Related estimates will be used when discussing concentration of measure, see (5.59).
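The two-sided estimate (5.13) is easy to test numerically. The following is a minimal sketch, not part of the original text; the function names are ours, and the comparison is done on a $\log_2$ scale to avoid large floating-point numbers.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy H(x) in bits, extended by H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def hamming_ball_volume(n: int, m: int) -> int:
    """Number of points of {0,1}^n within m coordinate flips of a fixed point,
    i.e. V(t) for the normalized radius t = m / n."""
    return sum(math.comb(n, k) for k in range(m + 1))


def entropy_bounds_hold(n: int, m: int, tol: float = 1e-9) -> bool:
    """Check (5.13) on a log2 scale for t = m / n <= 1/2."""
    log_volume = math.log2(hamming_ball_volume(n, m))
    upper = n * binary_entropy(m / n)
    lower = upper - math.log2(n + 1)
    return lower - tol <= log_volume <= upper + tol


if __name__ == "__main__":
    # Example: n = 20 and t = 1/4, so that tn = 5 is an integer.
    print(hamming_ball_volume(20, 5), entropy_bounds_hold(20, 5))
```

Passing the integer $m = tn$ rather than $t$ itself avoids rounding issues in $\lfloor tn \rfloor$.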
As in the case of the sphere, the covering problem is simpler than the packing problem (at least in some asymptotic regimes). In particular (see Exercise 5.14), a
$$\lim_{n \to \infty} \frac{1}{n} \log_2 N(\{0,1\}^n, d_H, \varepsilon) = 1 - H(\varepsilon). \tag{5.14}$$
On the other hand, the corresponding limit for packing is unknown; we only get from (5.2) the asymptotic bounds
$$1 - H(\varepsilon) \leq \limsup_{n \to \infty} \frac{1}{n} \log_2 P(\{0,1\}^n, d_H, \varepsilon) \leq 1 - H(\varepsilon/2) \tag{5.15}$$
for $0 < \varepsilon < 1/2$. As in the case of the sphere, the lower bound from (5.15) (known in this context as the Gilbert–Varshamov bound) has not been improved, while the upper bound has been subject to various enhancements.
For the $q$-ary version of the cube, i.e., the space $\{0, \ldots, q-1\}^n$ (also equipped with normalized Hamming distance), the entropy function has to be replaced by
Proposition 5.7. Let $(K, d)$ be a metric space such that $P(K, d, \varepsilon) \geq q$. Given integer $n \in \mathbb{N}$, equip $K^n$ with the distance
$$d_n((x_1, \ldots, x_n), (y_1, \ldots, y_n)) = d(x_1, y_1) + \cdots + d(x_n, y_n).$$
Then, for $t \in (0, 1 - 1/q)$,
$$P(K^n, d_n, t\varepsilon n) \geq P(\{0, \ldots, q-1\}^n, d_H, t) \geq \frac{q^n}{V_q(t)} \geq q^{n(1 - H_q(t))}. \tag{5.17}$$
Exercise 5.14 (Efficient random nets of the Boolean cube). Show (5.14) by adapting the random covering argument from Proposition 5.4.
Exercise 5.15 (Volume of balls in the $q$-ary discrete cube). Show (5.16) (which specialized to $q = 2$ gives (5.13)).
5.1.4. Metric entropy for convex bodies. If the metric space $(M, d)$ is actually a normed space with a unit ball $B$, we write $N(K, B, \varepsilon)$ or $N(K, \varepsilon B)$ instead of $N(K, d, \varepsilon)$. It is possible to come up with an alternative definition which does not refer to the norm, by saying that $N(K, B, \varepsilon)$ is the minimum number $N$ such that there exist $x_1, \ldots, x_N$ in $K$ with
$$K \subset \bigcup_{i=1}^{N} (x_i + \varepsilon B). \tag{5.18}$$
This alternative definition does not require the set $B$ to be symmetric, or even convex, or to have nonempty interior, even though that is usually the case. In
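$$\left(\frac{1}{\varepsilon}\right)^{n} \frac{\operatorname{vol}(K)}{\operatorname{vol}(L)} \leq N(K, L, \varepsilon) \leq \left(\frac{2}{\varepsilon}\right)^{n} \frac{\operatorname{vol}(K + \frac{\varepsilon}{2} L)}{\operatorname{vol}(L)}. \tag{5.19}$$
Proof. If $(x_i)$ is an $\varepsilon$-net in $K$ with respect to $\|\cdot\|_L$, then the union of the sets $x_i + \varepsilon L$ contains $K$, and the left-hand side inequality in (5.19) follows from volume comparison. Consider now a family $(x_i)$ of $N$ elements of $K$ which is $\varepsilon$-separated for $\|\cdot\|_L$. This means that the sets $x_i + \frac{\varepsilon}{2} L$ have disjoint interiors. Since they are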
close, in the Banach–Mazur distance, to a polytope with at most $(1 + 2/\varepsilon)^n$ vertices.
For an extension of Lemma 5.9 and Corollary 5.10 to not-necessarily-symmetric convex bodies, see Exercises 5.18–5.20. Note that the dependence on $\varepsilon$ in Corollary 5.10 is not sharp (see Notes and Remarks). For the special case $K = B_2^n$, the conclusion of Lemma 5.9 can be easily improved to $\operatorname{conv} \mathcal{N} \supset (1 - \varepsilon^2/2) K$, see Exercise 5.7.
Exercise 5.16 (Covering with balls whose centers lie outside of the set). For convex bodies $K, L$ in $\mathbb{R}^n$, let $N'(K, L)$ be the smallest number $N$ such that there exist $x_1, \ldots, x_N$ in $\mathbb{R}^n$ with $K \subset \bigcup_{1 \leq i \leq N} (x_i + L)$ (the difference with $N(K, L)$ is that $x_i$ are not required to belong to $K$). Give an example with $L$ symmetric for which $N'(K, L) < N(K, L)$. Can we have such an example with also $K$ symmetric?
fo
Exercise 5.17 (A regularizing trick). Let K, L be convex bodies in Rn , with
ot
0 P L. Show that N pK, εLq “ N pK, pK ´ Kq X εLq.
N
arguing as in the proof of Lemma 5.9, conclude that there exists a polytope $P$ with at most $(2 + 4/\varepsilon)^n$ vertices such that $(1 - \varepsilon) K \subset P \subset K$.
it). If $\varepsilon \in (0, 1)$, then $K$ can be approximated up to $\varepsilon$ (in the sense of Exercises 5.18 and 5.19) by a polytope $P$ with at most $(C\kappa/\varepsilon)^n$ vertices (resp., facets).
of $\pi/2$. Accordingly, the behavior of covering numbers in all such situations can be subsumed in the following single statement.
Theorem 5.11 (not proved here, but see Exercise 5.23). Let $M$ be either $\mathrm{SO}(n)$, $\mathrm{U}(n)$, $\mathrm{SU}(n)$, $\mathrm{Gr}(k, \mathbb{R}^n)$ or $\mathrm{Gr}(k, \mathbb{C}^n)$, equipped with a metric generated by the Schatten norm $\|\cdot\|_p$ for some $1 \leq p \leq \infty$. Then for any $\varepsilon \in (0, \operatorname{diam} M]$,
$$\left(\frac{c \operatorname{diam} M}{\varepsilon}\right)^{\dim M} \leq N(M, \varepsilon) \leq \left(\frac{C \operatorname{diam} M}{\varepsilon}\right)^{\dim M}, \tag{5.20}$$
where $C, c > 0$ are universal constants (independent of $n$, $k$, $p$ and $\varepsilon$), $\dim M$ is the real dimension of $M$, and $\operatorname{diam} M$ the diameter of $M$ with respect to the corresponding metric.
For easy reference, we list in Table 5.1 some of the values of the parameters (dimensions, diameters) that appear in (5.20).
Table 5.1. Real dimensions and diameters from the bounds (5.20) for covering numbers of a selection of classical manifolds. The distances used on $\mathrm{SO}(n)$ and $\mathrm{U}(n)$ are the extrinsic metrics obtained from the Schatten $p$-norm on $M_n$, and the distances on Grassmann
$M$ | $\dim M$ | $\operatorname{diam} M$ |
$\mathrm{U}(n)$ | $n^2$ | $2 n^{1/p}$ |
$\mathrm{Gr}(k, \mathbb{R}^n)$ | $k(n-k)$ | $2^{1/2} (2k)^{1/p}$ | $k \leq n/2$
$\mathrm{Gr}(k, \mathbb{C}^n)$ | $2k(n-k)$ | $2^{1/2} (2k)^{1/p}$ | $k \leq n/2$
Exercise 5.23 (Metric entropy of classical groups and manifolds). Prove Theorem 5.11 for $M = \mathrm{U}(n)$, $M = \mathrm{SU}(n)$ or $M = \mathrm{SO}(n)$ and for $p = \infty$, by appealing to Lipschitz properties of the exponential map with matrix argument (Exercise B.8).
Exercise 5.24. Derive the formula for diameter of $\mathrm{Gr}(k, \mathbb{R}^n)$ in Table 5.1.
Exercise 5.25 (Volume of balls in classical groups and manifolds). Let $M$ be either $\mathrm{SO}(n)$, $\mathrm{U}(n)$ or $\mathrm{Gr}(k, \mathbb{R}^n)$, equipped with a metric as in Theorem 5.11. Denoting by $\sigma$ the Haar probability measure on $M$, deduce from Theorem 5.11 a two-sided estimate for $\sigma(B(x, \varepsilon))$, where $B(x, \varepsilon)$ denotes the ball of radius $\varepsilon$ centered at $x \in M$.
5.2. CONCENTRATION OF MEASURE 117
we define
$$A_\varepsilon = \{x \in X : \operatorname{dist}(x, A) \leq \varepsilon\}.$$
The two viewpoints are roughly equivalent since the “surface area” relative to $\mu$ can be retrieved (when that makes sense) as the first-order variation of $\mu(A_\varepsilon)$ when $\varepsilon$ goes to 0, cf. (4.23) and, conversely, the growth of the function $\varepsilon \mapsto \mu(A_\varepsilon)$ on the macroscopic scale can be recovered from the knowledge of its derivative. However, the enlargement-based approach seems simpler (a more flexible definition) and is often more fruitful since some otherwise useful bounds on $\mu(A_\varepsilon)$ may be meaningless for small $\varepsilon$, and/or may be available in absence of any clue with regard to the nature of extremal sets.
Lower bounds for $\mu(A_\varepsilon)$ can be rephrased as deviation inequalities for Lipschitz functions. This leads, in some settings, to a remarkable phenomenon: every Lipschitz function concentrates strongly around some “central value.” Statements to such and similar effect will be the focus of our presentation. Specifically, we will
and
$$\mu(f > \mathbf{E} f + t) \leq C e^{-\lambda t^2}, \tag{5.22}$$
to be valid for any real-valued 1-Lipschitz function on $X$ and all $t > 0$, where $M_f$ and $\mathbf{E} f$ are the median and the expected value of $f$ calculated with respect to $\mu$.
and $\mathbf{P}(X \leq M) \geq 1/2$.) Clearly, (5.21) and (5.22) formally imply then similar two-sided estimates for $\mu(|f - M_f| > t)$ and $\mu(|f - \mathbf{E} f| > t)$ with $C$ replaced by $2C$.
we list in Table 5.2 the constants and the exponents that appear in subgaussian concentration inequalities for a selection of classical objects.
Remark 5.12. We point out that if a function $f$ is such that one of the inequalities (5.21) or (5.22) holds (for all $t > 0$) with constants $C, \lambda$, then the other inequality similarly holds (for the same function) with some other constants. For example, if (5.22) holds with $C \geq \frac{1}{2}$ and $\lambda$, then (5.21) holds with $2C^2$ and $\lambda/2$; if (5.21) holds with $C \geq e^{-1/3} \approx 0.717$ and $\lambda$, then (5.22) holds with $eC^2$ and $\lambda/2$ (see Proposition 5.29 and Remarks 5.30, 5.31). Sharper results of this nature (i.e., with better dependence on $C, \lambda$) can sometimes be obtained if we assume that (5.21) (or (5.22)) holds for all real-valued 1-Lipschitz functions on $X$; some questions in that spirit are considered in [Led01] (see, e.g., Exercise 5.48).
Table 5.2. Constants and exponents in subgaussian concentration inequalities for a selection of classical objects. When applicable, the reference measure is the canonical invariant measure on the object in question. We made an effort to come up with reasonable values of constants/exponents, and some of them are optimal. Unless indicated otherwise, the metric used for manifolds is the Riemannian geodesic distance. $d_H$ stands for the normalized Hamming distance (5.12). References: (a) Theorem 5.24. (b) Log-Sobolev inequality (LSI), see Table 5.4. (c) Corollary 5.17. (d) Proposition 5.20; what follows from the LSI is $\lambda = \frac{n-1}{2}$. (e) Ricci curvature, see Table 5.3. (f) Remark 5.12. (g) Corollary 5.52. (h) LSI on the discrete cube, see Theorem 5.1 and Exercise 5.5 in [BLM13]. (i) Theorem 5.54; convex or concave functions only. (j) The constant in the exponent is $\frac{1}{8}$ and not $\frac{1}{2}$ due to rescaling ($\{-1, 1\}$ vs. $\{0, 1\}$). (k) Theorem 5.56; convex functions only. (l) Theorem 5.38. (m) Theorem 5.39. (n) $(C, \lambda) = (2, \frac{1}{4})$ if $n = 2$; Remark 5.12. (o) Exercise 5.54. (p) Remark 5.19. (q) If we use instead the non-Riemannian metric (B.11), the parameter $\lambda$ needs to be multiplied by 2 in view of (B.12). (r) Remark 5.53.
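Object | $C, \lambda$ in (5.21), median | $C, \lambda$ in (5.22), mean | Comments
Gauss space $(\mathbb{R}^n, |\cdot|, \gamma_n)$ | $\frac{1}{2}, \frac{1}{2}$ (a) | $1, \frac{1}{2}$ (b) |
Gauss space $(\mathbb{C}^n, |\cdot|, \gamma_n^{\mathbb{C}})$ | $\frac{1}{2}, 1$ (a) | $1, 1$ (b) |
$(S^{n-1}, g)$ or $(S^{n-1}, |\cdot|)$ | $\frac{1}{2}, \frac{n}{2}$ (c) | $1, \frac{n}{2}$ (d) | $n > 2$ for $(S^{n-1}, g)$ (p)
$\mathrm{SO}(n)$ | $\frac{1}{2}, \frac{n-1}{8}$ (e) | $1, \frac{n-1}{8}$ (b) | metric (B.8)
$\mathrm{SU}(n)$ | $\frac{1}{2}, \frac{n}{4}$ (e) | $1, \frac{n}{4}$ (b) | metric (B.8)
$\mathrm{U}(n)$ | $2, \frac{n}{24}$ (f) | $1, \frac{n}{12}$ (b) | metric (B.8)
$\mathrm{Gr}(k, \mathbb{R}^n)$ | $\frac{1}{2}, \frac{n-2}{4}$ (e) | $1, \frac{n-2}{4}$ (b) | metric (B.10) (q)
$\mathrm{Gr}(k, \mathbb{C}^n)$ | $\frac{1}{2}, \frac{n}{2}$ (e) | $1, \frac{n}{2}$ (b) | metric (B.10) (q)
$(\{-1, 1\}^n, d_H)$ | $1, 2n$ (g) | $1, 2n$ (h) | $n \geq 3$ (r)
$(\{-1, 1\}^n, |\cdot|)$ | $2, \frac{1}{8}$ (i)(j) | $1, \frac{1}{8}$ (k)(j) | appropriate convexity hypotheses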
1 c
5. METRIC ENTROPY AND CONCENTRATION OF MEASURE
manifestations of the concentration phenomenon is beyond the scope of this work; we refer the interested reader to the monographs [Led01, BLM13] and/or to other sources listed in Notes and Remarks. Here we restrict our attention to highlighting several central techniques and, subsequently, to going over examples that appear to be of relevance to the quantum theory.
5.2.1. A prime example: concentration on the sphere. The settings of the Euclidean sphere and of the projective space are directly relevant to quantum information theory since the latter identifies canonically with the set of pure states. In the language of enlargements, the isoperimetric inequality on the sphere can be stated as follows.
Theorem 5.13 (Spherical isoperimetric inequality, not proved here). Equip the unit sphere $S^{n-1} \subset \mathbb{R}^n$ with the geodesic distance $g$ and the uniform probability measure $\sigma$. If $A \subset S^{n-1}$ and if $C \subset S^{n-1}$ is a spherical cap such that $\sigma(A) = \sigma(C)$, then, for any $\varepsilon > 0$,
Recall that the spherical cap with center $x \in S^{n-1}$ and radius $\varepsilon$ is the set
$$C(x, \varepsilon) = \{y \in S^{n-1} : g(x, y) \leq \varepsilon\}.$$
Note that the class of spherical caps is stable under enlargements and that we have
tance inherited from the ambient Euclidean space (see Appendix B.1), Theorem 5.13 is valid also for the latter. However, it is traditionally stated for the geodesic distance. Also, the formula (5.24) for $C(x, \varepsilon)_\delta$ stated above would be more complicated if we used $|\cdot|$ to define caps.
The usefulness of Theorem 5.13 comes from the fact that there are explicit integral formulas and sharp bounds for the measure of spherical caps, which were explored in Section 5.1.2. However, while in the study of packing and covering small caps seemed most interesting, in the present context of concentration the radii close to $\pi/2$ are most relevant. This is because arguably the most useful instance of Theorem 5.13 is $\sigma(A) = \frac{1}{2}$, in which case the radius of the corresponding cap $C$ is $\pi/2$ and the radius of its $\varepsilon$-enlargement, $C_\varepsilon$, is $\pi/2 + \varepsilon$. Taking into account the bound (5.5) leads then to
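Corollary 5.14. If $n > 2$ and if $A \subset S^{n-1}$ with $\sigma(A) \geq \frac{1}{2}$ and $\varepsilon > 0$, then
$$\sigma(A_\varepsilon) \geq \sigma\Big(C\Big(x, \frac{\pi}{2} + \varepsilon\Big)\Big) \geq 1 - \frac{1}{2} e^{-n\varepsilon^2/2}. \tag{5.25}$$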
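The second inequality in (5.25) can be checked numerically. The sketch below is ours and rests on one assumption not restated in this passage: the normalized measure of a cap of geodesic radius $r$ on $S^{n-1}$ equals $\int_0^r \sin^{n-2}\theta \, d\theta \big/ \int_0^\pi \sin^{n-2}\theta \, d\theta$. The integrals are approximated by the composite Simpson rule.

```python
import math


def simpson(func, a: float, b: float, steps: int = 2000) -> float:
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def cap_measure(n: int, radius: float) -> float:
    """Normalized measure of a cap of geodesic radius `radius` on S^{n-1}, n >= 3."""
    def density(theta: float) -> float:
        return math.sin(theta) ** (n - 2)
    return simpson(density, 0.0, radius) / simpson(density, 0.0, math.pi)


def levy_lower_bound(n: int, eps: float) -> float:
    """Right-hand side of (5.25)."""
    return 1.0 - 0.5 * math.exp(-n * eps * eps / 2.0)


if __name__ == "__main__":
    for n in (3, 10, 50):
        eps = 0.3
        print(n, cap_measure(n, math.pi / 2 + eps), levy_lower_bound(n, eps))
```

For $n = 3$ the cap measure has the closed form $(1 + \sin\varepsilon)/2$, which gives an independent check of the quadrature.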
It thus remains to prove the first statement.
Define $K' \subset B_2^n$ via $K' := \{tx : x \in K,\ t \in [0, 1]\}$ and similarly for $L'$. Then $\operatorname{vol}(K') = \sigma(K) \operatorname{vol}(B_2^n)$ and $\operatorname{vol}(L') = \sigma(L) \operatorname{vol}(B_2^n)$. Consequently, by the Brunn–Minkowski inequality in the form (4.21),
$$\operatorname{vol}\Big(\frac{K' + L'}{2}\Big) \geq \sqrt{\operatorname{vol}(K') \operatorname{vol}(L')} = \sqrt{\sigma(K)\sigma(L)}\, \operatorname{vol}(B_2^n).$$
On the other hand, if $x, y \in S^{n-1}$ and the angle between $x$ and $y$ is at least $\varepsilon$, then $|(x + y)/2| \leq \cos(\varepsilon/2)$. If $\varepsilon \leq \pi/2$ (and so $\langle x, y \rangle \geq 0$), a simple calculation shows that the same is true if we replace $x$ and $y$ by $x' = sx$ and $y' = ty$, where $s, t \in [0, 1]$ (in fact this is even true if $\varepsilon \leq 2\pi/3$). This means that we have then $\frac{K' + L'}{2} \subset \cos(\varepsilon/2) B_2^n$ and so $\sqrt{\sigma(K)\sigma(L)} \leq (\cos(\varepsilon/2))^n$. It remains to appeal to the (subtle but elementary) inequality $\cos u \leq e^{-u^2/2}$ (see Exercise 5.3).
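The two elementary facts used at the end of the proof can be illustrated numerically: the midpoint of two unit vectors at angle $\theta$ has norm $\cos(\theta/2)$, and $\cos(\varepsilon/2)^{2n} \leq e^{-n\varepsilon^2/4}$. The following sketch is ours (function names are hypothetical) and works in the plane spanned by the two vectors.

```python
import math


def midpoint_norm(angle: float) -> float:
    """|(x + y)/2| for unit vectors x, y in the plane at the given angle."""
    x = (1.0, 0.0)
    y = (math.cos(angle), math.sin(angle))
    return math.hypot((x[0] + y[0]) / 2.0, (x[1] + y[1]) / 2.0)


def product_bounds(n: int, eps: float) -> tuple:
    """Return (cos(eps/2)^(2n), exp(-n eps^2 / 4)): the two successive upper
    bounds on sigma(K) * sigma(L) obtained at the end of the proof."""
    return math.cos(eps / 2.0) ** (2 * n), math.exp(-n * eps * eps / 4.0)


if __name__ == "__main__":
    print(midpoint_norm(1.0), math.cos(0.5))
    print(product_bounds(10, 1.0))
```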
Remark 5.16. (1) Proposition 5.15 holds actually for the entire nontrivial range of $\varepsilon$, which is $[0, \pi]$; this follows a posteriori from the estimate in Lévy’s lemma (see Exercise 5.26). The above proof fails for large $\varepsilon$; however, only the range $[0, \pi/2]$ is relevant to the second statement and to Corollary 5.14: if $\mu(K) \geq 1/2$,
$-L$) caps with $\operatorname{dist}(K, L) = 2\varepsilon$, we conclude from the Proposition that $\mu(K) \leq e^{-n\varepsilon^2/2}$. This compares fairly well with the bound $\frac{1}{2} e^{-n\varepsilon^2/2}$ implicit in (5.25).
and therefore
Remark 5.19. In Corollaries 5.14 and 5.17 we have to assume that $n > 2$ because the bound (5.5) is not valid in the entire nontrivial range $0 \leq t \leq \pi/2$. If $n = 2$, one needs to replace the function $\frac{1}{2} e^{-nt^2/2}$ by $\max\{\frac{1}{2} - \frac{t}{\pi}, 0\}$. However, no modifications are needed if the enlargements or the Lipschitz constants are calculated with respect to the ambient space metric, or if only small values of $\varepsilon$ or $t$ are of interest, say, $\varepsilon \leq 1$ or $t \leq L$.
Concentration around the median follows naturally from the isoperimetric inequality. As we mentioned in Remark 5.12, this implies formally concentration around the expectation with altered constants. In some situations, it is possible to obtain good constants with extra work.
Proposition 5.20 (Lévy’s lemma for the mean, not proved here). Let $n > 2$. If $f : (S^{n-1}, g) \to \mathbb{R}$ is a 1-Lipschitz function, then for any $t > 0$,
$$\sigma(f > \mathbf{E} f + t) \leq \exp(-nt^2/2). \tag{5.28}$$
Exercise 5.26 (Proposition 5.15 holds for the full range of $\varepsilon$). Show that it follows a posteriori from Theorem 5.13 and the bound (5.5) that, for $n > 2$, in the notation and under the hypotheses of Proposition 5.15, we have $\sigma(K)\,\sigma(L) \leq \frac{1}{4} e^{-n\varepsilon^2/4}$. For $n = 2$, the optimal inequality is $\sigma(K)\,\sigma(L) \leq \frac{1}{4}\big(1 - \frac{\varepsilon}{\pi}\big)^2$ (cf. Remark 5.19).
Exercise 5.27 (Concentration implies isoperimetry). Show that, for a metric probability space $(X, \mu)$, concentration implies isoperimetry in the following sense: if $\mu(f > M_f + t) \leq \alpha$ for any 1-Lipschitz function $f$, then $\mu(A_t) \geq 1 - \alpha$ for any $A \subset X$ with $\mu(A) = \frac{1}{2}$.
Exercise 5.28 (A finer bound for the mean width of a union). Let $K, L$ be two bounded sets in $\mathbb{R}^n$, and $R$ the outradius of $K \cup L$. Show that $w(\operatorname{conv}(K \cup L)) \leq \max(w(K), w(L)) + \sqrt{\frac{2\pi}{n}}\, R$.
5.2.2. Gaussian concentration. Another classical setting where isoperimetry and concentration have been widely studied is the Gaussian space $(\mathbb{R}^n, |\cdot|, \gamma_n)$, where $\gamma_n$ is the standard Gaussian measure on $\mathbb{R}^n$ (see Appendix A.2 for the notation, basic properties and relevant facts). It turns out that the extremal sets for the isoperimetric problem are then half-spaces, and since their enlargements are also half-spaces, the solution to the problem can be expressed simply in terms of the cumulative distribution function of an $N(0, 1)$ variable, i.e., in terms of $\Phi(x) := \gamma_1((-\infty, x])$. We have
inally derived from the spherical isoperimetric inequality (Theorem 5.13) via the following classical fact.
Theorem 5.22 (Poincaré’s lemma, see Exercise 5.29). For $n, N \in \mathbb{N}$ with $N \geq n$, we consider $\mathbb{R}^n$ to be a subspace of $\mathbb{R}^N$. Next, fix $n$ and let $\nu_N$ be the pushforward to $\mathbb{R}^n$, via the orthogonal projection, of the normalized uniform measure on $\sqrt{N} S^{N-1}$. Then, as $N \to \infty$, $(\nu_N)$ converges to $\gamma_n$, the standard Gaussian measure on $\mathbb{R}^n$.
The convergence in Theorem 5.22 holds in a very strong sense, e.g., in total variation, or in uniform convergence of densities.
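For $n = 1$ the convergence in total variation can be observed directly. The sketch below is ours and assumes the classical formula for the density of one coordinate of a uniform point on $\sqrt{N} S^{N-1}$, namely $\frac{\Gamma(N/2)}{\sqrt{\pi N}\, \Gamma((N-1)/2)} (1 - x^2/N)^{(N-3)/2}$ for $|x| < \sqrt{N}$; the total variation distance to $\gamma_1$ is approximated by a midpoint rule.

```python
import math


def sphere_marginal_density(x: float, N: int) -> float:
    """Density at x of one coordinate of a uniform point on sqrt(N) * S^{N-1}, N >= 4."""
    if x * x >= N:
        return 0.0
    log_const = (math.lgamma(N / 2) - math.lgamma((N - 1) / 2)
                 - 0.5 * math.log(math.pi * N))
    return math.exp(log_const + 0.5 * (N - 3) * math.log1p(-x * x / N))


def gaussian_density(x: float) -> float:
    """Density of the standard Gaussian measure on the real line."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def total_variation(N: int, half_width: float = 12.0, steps: int = 12000) -> float:
    """Approximate (1/2) * integral of |density of nu_N - density of gamma_1|
    over [-half_width, half_width] by the midpoint rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += abs(sphere_marginal_density(x, N) - gaussian_density(x))
    return 0.5 * total * h


if __name__ == "__main__":
    for N in (5, 50, 500):
        print(N, total_variation(N))
```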
Another derivation of the Gaussian isoperimetric inequality is based on the following analogue of the Brunn–Minkowski inequality in the Gaussian setting.
Theorem 5.23 (Ehrhard’s inequality, not proved here). Let $A, B$ be Borel subsets of $\mathbb{R}^n$ and let $\lambda \in [0, 1]$. Then
$$\Phi^{-1}(\gamma_n((1 - \lambda)A + \lambda B)) \geq (1 - \lambda)\Phi^{-1}(\gamma_n(A)) + \lambda \Phi^{-1}(\gamma_n(B)). \tag{5.31}$$
Ehrhard’s inequality is stronger than log-concavity of the Gaussian measure (Section 4.3.2), see Exercise 5.31. Assuming Ehrhard’s inequality, the derivation of the Gaussian isoperimetric inequality goes as follows. Fix $A$, $\varepsilon$ and let $\lambda \in (0, 1)$.
We now let $\lambda \to 0^+$. The first term on the right-hand side of (5.32) converges clearly to $\Phi^{-1}(\gamma_n(A))$, while the second term converges to $\varepsilon$ (this is a little harder, but elementary, see Exercise 5.32), and so we proved the Gaussian isoperimetric inequality in the form (5.30).
The next theorem follows from Theorem 5.21 according to the general scheme indicated in Remark 5.18, with the explicit exponential bound being a consequence of Exercise A.1.
the general case and is probably not that hard to settle. Similarly, is it true that $\sigma(|f - \mathbf{E} f| > t) \leq \exp(-nt^2/2)$ if $f : (S^{n-1}, g) \to \mathbb{R}$ is a 1-Lipschitz function (and $n > 2$; see Remark 5.19 for comments on peculiarities of the case $n = 2$)?
An example of a function for which Theorem 5.24 is meaningful is the Euclidean norm, which is trivially 1-Lipschitz. This gives the following (see also Exercise 5.37).
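Corollary 5.27. Let $G$ be a standard Gaussian vector in $\mathbb{R}^n$. Then, for any $t > 0$,
$$\mathbf{P}\big(|G| \geq \sqrt{n} + t\big) \leq \frac{1}{2} e^{-t^2/2} \quad \text{and} \quad \mathbf{P}\Big(|G| \leq \sqrt{n - \frac{2}{3}} - t\Big) \leq \frac{1}{2} e^{-t^2/2}.$$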
The distribution of $|G|^2$ is commonly known as $\chi^2(n)$, the chi-squared distribution with $n$ degrees of freedom. Denoting by $m_n$ the median of $|G|$, what is required to deduce Corollary 5.27 from Theorem 5.24 are the inequalities $\sqrt{n - \frac{2}{3}} \leq m_n \leq \sqrt{n}$.
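The bracketing of the median $m_n$ is equivalent to $\mathbf{P}(|G|^2 \leq n - \frac{2}{3}) \leq \frac{1}{2} \leq \mathbf{P}(|G|^2 \leq n)$, which can be tested numerically for given $n$. The following is a rough sketch of ours, restricted to $n \geq 2$ so that the $\chi^2(n)$ density stays bounded near $0$; the distribution function is approximated by the composite Simpson rule.

```python
import math


def chi2_pdf(x: float, n: int) -> float:
    """Density of the chi-squared distribution with n >= 2 degrees of freedom."""
    if x < 0.0:
        return 0.0
    if x == 0.0:
        return 0.5 if n == 2 else 0.0
    log_pdf = ((n / 2 - 1) * math.log(x) - x / 2
               - (n / 2) * math.log(2) - math.lgamma(n / 2))
    return math.exp(log_pdf)


def chi2_cdf(x: float, n: int, steps: int = 4000) -> float:
    """P(|G|^2 <= x) by the composite Simpson rule on [0, x] (`steps` even)."""
    h = x / steps
    total = chi2_pdf(0.0, n) + chi2_pdf(x, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, n)
    return total * h / 3.0


def median_is_bracketed(n: int) -> bool:
    """True if n - 2/3 <= median of chi^2(n) <= n,
    i.e. sqrt(n - 2/3) <= m_n <= sqrt(n)."""
    return chi2_cdf(n - 2.0 / 3.0, n) <= 0.5 <= chi2_cdf(n, n)


if __name__ == "__main__":
    for n in (2, 3, 5, 10, 50):
        print(n, chi2_cdf(n - 2.0 / 3.0, n), chi2_cdf(n, n))
```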
the Gaussian isoperimetric inequality (5.29) from the Poincaré lemma (Theorem
5.22) and the spherical isoperimetric inequality (Theorem 5.13).
$$\lim_{r \to +\infty} \frac{[\ldots]}{r} = 1.$$
Exercise 5.33 (Ehrhard-like (a-)symmetrization). Show that the following statement is equivalent to the validity of Ehrhard’s inequality for convex bodies. Let $K \subset \mathbb{R}^n$ be a convex body and let $E \subset \mathbb{R}^n$ be a $k$-dimensional subspace with $0 < k < n$. Identify $E$ and $E^\perp$ with, respectively, $\mathbb{R}^k$ and $\mathbb{R}^{n-k}$ and define a set $L \subset \mathbb{R}^{k+1}$ by
$$(x, s) \in L \iff s \leq \Phi^{-1}\big(\gamma_{n-k}(\{y \in E^\perp : (x, y) \in K\})\big),$$
where $x \in E$, $s \in \mathbb{R}$. Then $L$ is convex.
In the case when $E = u^\perp$ is a hyperplane (i.e., $k = n - 1$) the transformation $K \mapsto L$ is called Ehrhard (a-)symmetrization in direction $u$.
of largely elementary facts related to the concentration phenomenon. It supplies a set of tools allowing for flexible applications of concentration results. As a rule, the facts are well known to experts in the area and are included here for future reference. Proofs are relegated to exercises.
5.2.3.1. Laplace transform. We mostly restrict ourselves to settings where concentration exhibits a subgaussian behaviour as in (5.21) or (5.22). Such behaviour can be proved via estimating the bilateral Laplace transform, using the exponential Markov inequality $\mathbf{P}(X > t) \leq e^{-st}\, \mathbf{E} \exp(sX)$ for $s > 0$.
Lemma 5.28 (Laplace transform method). Let $X$ be a random variable such that $\mathbf{E} \exp(sX) \leq A \exp(\beta s^2)$ for every $s \in \mathbb{R}$. Then, for every $t > 0$,
$$\max(\mathbf{P}(X > t), \mathbf{P}(-X > t)) \leq A \exp(-t^2/4\beta).$$
Exercise 5.35. Prove Lemma 5.28 about the Laplace transform method.
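The mechanism behind Lemma 5.28 is the optimization over $s$ of the exponential Markov bound $e^{-st} A e^{\beta s^2}$, whose minimum is attained at $s = t/(2\beta)$. The sketch below (ours; the function names are hypothetical) evaluates the bound and compares the optimized value with $A \exp(-t^2/4\beta)$.

```python
import math


def exponential_markov_bound(t: float, s: float, A: float, beta: float) -> float:
    """e^{-st} * A e^{beta s^2}: an upper bound on P(X > t) valid for any s > 0."""
    return A * math.exp(beta * s * s - s * t)


def optimized_bound(t: float, A: float, beta: float) -> float:
    """The bound above evaluated at its minimizer s = t / (2 beta)."""
    return exponential_markov_bound(t, t / (2.0 * beta), A, beta)


if __name__ == "__main__":
    t, A, beta = 3.0, 2.0, 0.5
    print(optimized_bound(t, A, beta), A * math.exp(-t * t / (4.0 * beta)))
```

Applying the same computation to $-X$ gives the bound on $\mathbf{P}(-X > t)$.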
Exercise 5.36. Prove Hoeffding’s lemma: if $X$ is a mean zero random variable taking values in an interval $[a, b]$, then $\mathbf{E} \exp(sX) \leq \exp\big(\frac{1}{8} s^2 (b - a)^2\big)$ for any $s \in \mathbb{R}$.
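As a sanity check (not a proof), Hoeffding’s lemma can be tested on mean-zero two-point distributions, for which the Laplace transform is explicit. The sketch below is ours; the choice of two-point variables is an assumption made for illustration.

```python
import math


def two_point_mgf(s: float, a: float, b: float) -> float:
    """E exp(sX) for the mean-zero variable X taking the values a < 0 < b."""
    p_b = -a / (b - a)  # P(X = b), chosen so that E X = 0
    p_a = b / (b - a)   # P(X = a)
    return p_a * math.exp(s * a) + p_b * math.exp(s * b)


def hoeffding_mgf_bound(s: float, a: float, b: float) -> float:
    """Right-hand side of Hoeffding's lemma."""
    return math.exp(s * s * (b - a) ** 2 / 8.0)


if __name__ == "__main__":
    # Symmetric case a = -1, b = 1: the claim reads cosh(s) <= exp(s^2 / 2).
    print(two_point_mgf(1.5, -1.0, 1.0), hoeffding_mgf_bound(1.5, -1.0, 1.0))
```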
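$2s)^{-n/2}$ for any $s < 1/2$. Conclude that $\mathbf{P}(X \geq (1 + \varepsilon)n) \leq \big((1 + \varepsilon) \exp(-\varepsilon)\big)^{n/2}$ for any $\varepsilon > 0$ and that $\mathbf{P}(X \leq (1 - \varepsilon)n) \leq \big((1 - \varepsilon) \exp(\varepsilon)\big)^{n/2}$ for $\varepsilon \in (0, 1]$. (We know from Cramér’s large deviations theorem that these bounds are sharp.) Conclude that
$$\mathbf{P}(|X - n| \geq \varepsilon n) \leq 2 \exp\Big(-\frac{n\varepsilon^2}{4 + 8\varepsilon/3}\Big). \tag{5.35}$$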
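The passage from the two one-sided bounds to (5.35) amounts to checking that each of them is dominated by $\exp\big(-\frac{n\varepsilon^2}{4 + 8\varepsilon/3}\big)$. The following sketch of ours (function names hypothetical) compares the three expressions on a grid; it illustrates the inequality and does not replace the argument asked for in the exercise.

```python
import math


def upper_tail_chernoff(n: int, eps: float) -> float:
    """((1 + eps) e^{-eps})^{n/2}, the bound on P(X >= (1 + eps) n)."""
    return ((1.0 + eps) * math.exp(-eps)) ** (n / 2.0)


def lower_tail_chernoff(n: int, eps: float) -> float:
    """((1 - eps) e^{eps})^{n/2}, the bound on P(X <= (1 - eps) n), 0 < eps <= 1."""
    return ((1.0 - eps) * math.exp(eps)) ** (n / 2.0)


def bernstein_type_bound(n: int, eps: float) -> float:
    """One-sided version of the right-hand side of (5.35), without the factor 2."""
    return math.exp(-n * eps * eps / (4.0 + 8.0 * eps / 3.0))


if __name__ == "__main__":
    n, eps = 50, 0.5
    print(upper_tail_chernoff(n, eps), lower_tail_chernoff(n, eps),
          bernstein_type_bound(n, eps))
```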
some value, we can a posteriori infer that it also concentrates around the mean or the median, or any other particular quantile. This can be formalized by the concept
value of $Y$ if $M$ is either the mean of $Y$, or any number between the 1st and the 3rd quartile of $Y$ (i.e., if $\min\{\mathbf{P}(Y \geq M), \mathbf{P}(Y \leq M)\} \geq \frac{1}{4}$; this happens in particular if $M$ is the median of $Y$). The numbers $\frac{1}{4}$ and $\frac{3}{4}$ play no special role and can be changed to other numbers from $(0, 1)$ at the cost of deteriorating (or improving) the constants in the statements that follow (see, e.g., Remark 5.31).
Proposition 5.29 (see Exercises 5.38–5.40). Let $Y$ be a real random variable and let $M$ be any central value for $Y$. Let $a \in \mathbb{R}$ and let constants $A \geq \frac{1}{2}$, $\lambda > 0$ be such that, for any $t > 0$,
$$\max\{\mathbf{P}(Y > a + t), \mathbf{P}(Y < a - t)\} \leq A \exp(-\lambda t^2). \tag{5.36}$$
Then $|M - a| \leq \sqrt{\log(4A)}\, \lambda^{-1/2}$. Consequently, for any $t \geq \sqrt{\log(4A)}\, \lambda^{-1/2}$,
$$\max\{\mathbf{P}(Y > M + t), \mathbf{P}(Y < M - t)\} \leq 4A^2 \exp(-\lambda t^2/2). \tag{5.37}$$
Remark 5.30 (Improvements to Proposition 5.29). The expressions $\sqrt{\log(4A)}$ and $4A^2$ in the assertion of Proposition 5.29 can be replaced by $\sqrt{\log(\kappa A)}$ and $\kappa A^2$, where $\kappa = 2$ when $M$ is the median of $Y$ and $\kappa = e$ when $M$ is the expectation of $Y$; see Exercises 5.38, 5.39 and 5.40.
Remark 5.31 (On the necessity of restrictions on $t$ in Proposition 5.29). We point out that the bound on the first (resp., the second) probability appearing in (5.37) is valid under the formally weaker restriction $t > (M - a)_+$ (resp., $t > (M - a)_-$). The restriction $t \geq \sqrt{\log(4A)}\, \lambda^{-1/2}$, while annoying, cannot be completely avoided if we want to keep full generality because the hypothesis (5.36) does not necessarily supply any information about the probabilities appearing in the assertion if $t$ is small. However, this is only a minor inconvenience since for such $t$ the upper bound in (5.37) is never small and often holds for trivial reasons. In particular, (5.37) holds for all $t > 0$ if $M$ is the mean or any quantile between the 27th and 73rd percentile, or if $A \geq 3^{2/3}/4 \approx 0.52$, and always if we replace the factor $4A^2$ by $3\sqrt{2} A^2$. If $M$ is the median, we can go even further: no restrictions on $t$ are needed even if we replace $4A^2$ by $2A^2$ on the right-hand side of (5.37); if $M$ is the mean, similar improvement (i.e., $eA^2$ on the right-hand side) is possible when $A \geq e^{-1/3} \approx 0.717$ (these last observations were used in Remark 5.12).
ot
Corollary 5.32 (Lévy’s lemma for central values). Let f : pS n´1 , gq Ñ R be
N
´ nε2 ¯
(5.38) Ppf ě M ` εq ď exp ´ .
4L2
on
We sketch proofs and give more precise bounds and/or variations on the above results in Exercises 5.38–5.48. Note that while (5.38) follows from Proposition 5.29 and Corollary 5.17 for $n > 2$ and for $\varepsilon$ not-too-small, a separate argument is needed to cover the remaining cases (cf. Remark 5.31). We also point out that while Proposition 5.29 is meant to give reasonably good estimates valid in the most general setting when concentration is present, better bounds are available in specific instances. For example, Corollary 5.32 can be improved when $M$ is the mean (see Table 5.2 and Exercise 5.44), and similarly in the Gaussian case.
The heuristics behind Corollary 5.32 is as follows: if we know that all sets of measure at least $\frac{1}{2}$ have large enlargements, then approximately the same is true for all sets of measure at least $\frac{1}{4}$. Actually, almost the same is true for much smaller sets; here is a sample result.
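Proposition 5.33 (see Exercise 5.49). Let $(X, d, \mu)$ be a metric probability space and let $\varepsilon > 0$. Suppose that any set $A \subset X$ with $\mu(A) \geq \frac{1}{2}$ verifies $\mu(A_\varepsilon) \geq 1 - Ce^{-\lambda\varepsilon^2}$. Then $\mu(B_{2\varepsilon}) \geq 1 - Ce^{-\lambda\varepsilon^2}$ for any set $B \subset X$ with $\mu(B) \geq Ce^{-\lambda\varepsilon^2}$.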
A common feature of concentration inequalities presented up to now is that in order to translate them to concrete bounds for concrete functions, we need to calculate, or at least reasonably estimate, the medians or expected values, or similar parameters of the functions under consideration. A selection of tools, some of them quite sharp, to handle expected values will be described in Section 6.1. The preceding three results tell us that it doesn’t really matter which central value we employ, as long as we are willing to pay a small penalty in the form of an additional multiplicative constant in the exponent and in front of the exponential. The following observation shows that, in the Gaussian context, sometimes no penalty is needed at all.
Proposition 5.34 (see Exercise 5.50). Let $f : \mathbb{R}^n \to \mathbb{R}$ be a convex function. Denote by $M_f$ (resp., $\mathbf{E} f$) the median (resp., the expectation) of $f$ with respect to the standard Gaussian measure $\gamma_n$. Then $M_f \leq \mathbf{E} f$.
Exercise 5.38. Show that a random variable $Y_0$ such that $\mathbf{P}(Y_0 > t) \leq A \exp(-t^2)$ for $t > 0$ must verify $\mathbf{E}\, Y_0 \leq \mathbf{E}\, Y_0^+ \leq \min\{A\sqrt{\pi}/2, \sqrt{1 + \log_+ A}\}$. Deduce the first assertion of Proposition 5.29 and the corresponding improvement from Remark 5.30 if $M$ is the mean of $Y$.
Exercise 5.39. Show that if $Y_0$ is a random variable such that $\mathbf{P}(Y_0 > t) \leq A \exp(-t^2)$ for $t > 0$ and if $M_{3/4}$ is its 3rd quartile, then $M_{3/4} \leq \sqrt{\log_+(4A)}$. Deduce the first assertion of Proposition 5.29 if $M$ is between the 1st and the 3rd quartile of $Y$, and the strengthening from Remark 5.30: $|M - a| \leq \sqrt{\log_+(2A)}\, \lambda^{-1/2}$ if $M$ is the median of $Y$.
Exercise 5.40. Prove the inequality $e^{-s^2} \leq e^{\delta^2} e^{-(s + \delta)^2/2}$ for $s, \delta \in \mathbb{R}$. Use it and the last two exercises to show the second assertion of Proposition 5.29, and its strengthenings stated in Remark 5.30 when $M$ is the median or the mean of $Y$.
Exercise 5.41. Verify the assertions in the last two sentences of Remark 5.31.
Exercise 5.42. Given $\alpha \in (0, 1)$, prove a version of (5.37) with the right-hand side of the form $B \exp(-\alpha\lambda t^2)$, where $B$ depends only on $A$ and $\alpha$ (and on $\kappa$ from Remark 5.30, if applicable).
Exercise 5.43 (Lévy’s lemma for central values). Let $n > 2$. Use Exercise 5.26 to derive Corollary 5.32 for any quantile between the 1st and the 3rd quartile.
Exercise 5.44 (The median and the mean on the sphere). Let $f$ be a 1-Lipschitz function on $(S^{n-1}, g)$ with $n > 2$. Show that the median and the mean of $f$ differ at most by $\sqrt{\pi/8n}$ and describe the extremal function.
positive function on $(S^{n-1}, g)$ with $n > 1$. Set $q = (\mathbf{E} f^2)^{1/2}$. Show that for any $t > 0$, $\mathbf{P}(f \geq q + t) \leq \exp(-nt^2/2)$ and $\mathbf{P}(f \leq q - t) \leq e \exp(-nt^2/2)$.
Exercise 5.47 (The case of $S^1$). Using directly the solution to the isoperimetric problem on $S^1$, show that Corollary 5.32 holds also for $n = 2$.
Exercise 5.48. Let $(X, d, \mu)$ be a metric probability space and let $\alpha : [0, \infty) \to [0, \infty)$ be such that $\mu(f \geq \mathbf{E} f + t) \leq \alpha(t)$ for any bounded 1-Lipschitz function $f : X \to \mathbb{R}$ and for all $t > 0$. Then, for any such function $f$ and for any $t > 0$, $\mu(f \geq M_f + t) \leq \alpha(t/2)$. Equivalently, $\mu(A_\varepsilon) \geq 1 - \alpha(\varepsilon/2)$ for any $A \subset X$ with $\mu(A) \geq 1/2$ and any $\varepsilon > 0$. The preceding argument can be iterated, see (1.18) in [Led01].
Exercise 5.49. Prove Proposition 5.33 about enlargements of fairly small sets.
Exercise 5.50 (Median vs. mean for convex functions of Gaussian variables). Prove Proposition 5.34 by showing first that the function $g : t \mapsto \Phi^{-1}(\gamma_n(\{f \leq t\}))$ is concave.
Exercise 5.51. Show that the following statement is a consequence of Proposition 5.34. If $(X_1, \ldots, X_N)$ are jointly Gaussian random variables and $f : \mathbb{R}^N \to \mathbb{R}$ is a convex function, then the median of the random variable $f(X_1, \ldots, X_N)$ does not exceed its expectation.
5.2.3.3. Local versions. It sometimes happens that a function defined on the sphere $S^{n-1}$ has a poor global Lipschitz behaviour, while its restriction to a subset of large measure is much more regular. To take advantage of such a situation, we formulate a “local” version of Lévy’s lemma.
Corollary 5.35 (Lévy’s lemma, local version). Let $\Omega \subset S^{n-1}$ be a subset of measure larger than $3/4$. Let $f : (S^{n-1}, g) \to \mathbb{R}$ be a function such that the restriction of $f$ to $\Omega$ is $L$-Lipschitz. Then, for every $\varepsilon > 0$,
$$\mathbf{P}(\{|f(x) - M_f| > \varepsilon\}) \leq \mathbf{P}(S^{n-1} \setminus \Omega) + 2 \exp(-n\varepsilon^2/4L^2),$$
where $M_f$ is the median of $f$.
One scenario under which the hypotheses of Corollary 5.35 may be satisfied is when we have an upper bound on some Sobolev norm of $f$ (a “global” parameter, which suggests that “restricted version of Lévy’s lemma” could have been better
Exercise 5.52. Prove Corollary 5.35, the local version of Lévy’s lemma.
5.2.3.4. Pushforward. The following elementary result is very useful for establishing the concentration phenomenon for many classical spaces. In a nutshell, it says that concentration results can be “pushed forward” by surjective contractions.
Then
$$\inf \nu(B_\varepsilon) \geq \inf \mu(A_\varepsilon). \tag{5.39}$$
on $X \times Y$, then $\varphi(x) = M_{f(x, \cdot)}$ is 1-Lipschitz on $X$ and hence concentrated around its median $M_\varphi$. Since, for each $x \in X$, $f(x, \cdot)$ is concentrated around $\varphi(x)$, it follows that $f$ is concentrated around $M_\varphi$. (See Exercise 5.55 for precise statements.) The above argument can be clearly iterated. Here is another elementary result involving product measures.
Proposition 5.37 (Concentration on product spaces, see Exercise 5.55). Let $(X_i, d_i, \mu_i)$, $1 \leq i \leq n$, be bounded metric probability spaces and denote $D_i = \operatorname{diam} X_i$. Let $X = X_1 \times \cdots \times X_n$ be endowed with the product measure $\mu$ and the $\ell_1$ product metric $d$. Then, for every 1-Lipschitz function $f : X \to \mathbb{R}$ and for any $t \geq 0$,
$$\mu(f \geq \mathbf{E} f + t) \leq e^{-2t^2/D^2}, \tag{5.42}$$
where $D = \big(\sum_{i=1}^{n} D_i^2\big)^{1/2}$.
5.2.5). However, in some natural settings (e.g., the Gaussian space) dimension-free results are possible.
$\frac{1}{2}\big(\sup\{t : \mathbf{P}(F \geq t) \geq 1/2\} + \inf\{t : \mathbf{P}(F \leq t) \geq 1/2\}\big)$, but most other definitions would work if applied consistently and with sufficient care. Let $(X, d_1, \mu)$ and
(ii) If $X$ and $Y$ exhibit the concentration phenomenon in the sense of (5.21) for some $C$ and $\lambda$, then $\pi(f > M_\varphi + t) \leq 2Ce^{-\lambda t^2/4}$ for all $t > 0$, and similarly for $\pi(f < M_\varphi - t)$.
(iii) Show that $M_\varphi$ is a central value in the sense of Section 5.2.3.
(iv) Same as (ii) with (5.21) replaced by (5.22) and $M_\varphi$ by $\mathbf{E} f$.
Exercise 5.56 (Concentration on product spaces, Laplace transform method). The Laplace functional of a probability metric space $(X, d, \mu)$ is defined for $\lambda \in \mathbb{R}$ as $E_{(X,d,\mu)}(\lambda) = \sup \int e^{\lambda f}\, d\mu$, where the supremum is taken over all 1-Lipschitz functions $f : X \to \mathbb{R}$ with mean 0.
(i) Show that if $X$ has diameter $D$, then $E_{(X,d,\mu)}(\lambda) \leq \exp(\lambda^2 D^2/8)$ (use Exercise 5.36).
(ii) Show that if $(X_1, d_1, \mu_1)$ and $(X_2, d_2, \mu_2)$ are two metric probability spaces, if $d$ denotes the $\ell_1$ product metric on $X_1 \times X_2$ as defined in (5.41), then
$$E_{(X_1 \times X_2, d, \mu_1 \otimes \mu_2)}(\lambda) \leq E_{(X_1, d_1, \mu_1)}(\lambda)\, E_{(X_2, d_2, \mu_2)}(\lambda).$$
(iii) Show that in the context of Proposition 5.37, we have
$$E_{(X,d,\mu)}(\lambda) \leq \exp(\lambda^2 D^2/8).$$
(iv) Prove Proposition 5.37 using Lemma 5.28.
Exercise 5.57 (Hoeffding’s inequality). Show that Proposition 5.37 implies Hoeffding’s inequality: if $X_1, \dots, X_n$ are independent random variables such that $X_i$ takes values in an interval of length $l_i$, then for any $t > 0$,
(5.43) $\mathbf{P}(S \ge \mathbf{E}S + t) \le e^{-2t^2/L^2}$,
where $S = X_1 + \dots + X_n$ and $L^2 = l_1^2 + \dots + l_n^2$.
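As a sanity check of (5.43), the following minimal sketch compares the exact upper tail of a sum of fair 0/1 coin flips with the Hoeffding bound. The choice of Bernoulli(1/2) summands (so that every $l_i = 1$ and $L^2 = n$) and the function names are illustrative assumptions, not part of the text.

```python
from math import comb, exp

def binomial_upper_tail(n: int, k: int) -> float:
    """Exact P(S >= k) for S a sum of n independent fair 0/1 variables."""
    if k <= 0:
        return 1.0
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

def hoeffding_bound(n: int, t: float) -> float:
    """Right-hand side of (5.43) when every l_i = 1, so that L^2 = n."""
    return exp(-2 * t * t / n)

def hoeffding_holds(n: int) -> bool:
    """Check the bound for every attainable deviation t = k - n/2 > 0."""
    mean = n / 2
    for k in range(n + 1):
        t = k - mean
        if t > 0 and binomial_upper_tail(n, k) > hoeffding_bound(n, t):
            return False
    return True
```

Calling `hoeffding_holds(n)` for a range of small `n` exercises the inequality on every lattice point of the tail; it is a numerical illustration only and does not replace the proof requested in the exercise.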
5.2.4. Geometric and analytic methods. Classical examples. In Sec-
tions 5.2.1 and 5.2.2 we sketched isoperimetric/concentration results on the Eu-
clidean sphere and for the Gaussian measure. While these are admittedly very
special situations, the fact of the matter is that, in high-dimensional settings, some
form of concentration phenomenon is the rule rather than the exception.
5.2.4.1. Gromov’s comparison theorem. The first result asserts that isoperimet-
ric and concentration inequalities hold under geometric assumptions which signifi-
cantly generalize the spherical case. The invariant that can be related to sphere-like
behavior is the Ricci curvature, which describes the rate of growth of volume under
geodesic flow on the manifold with the similar rate in the Euclidean space. For
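example (see Figure 5.3), the circumference of a circle of geodesic radius $\theta$ ($< \pi$) on the sphere $S^2$ is $2\pi \sin\theta$, and hence the length of the arc of the circle corresponding to an angle $\alpha$ (measured on the plane tangent at the center of the circle) is $\alpha \sin\theta \approx \alpha\left(\theta - \frac{\theta^3}{6}\right) = \alpha\theta\left(1 - \frac{\theta^2}{6}\right)$
$\alpha\left(\theta - \frac{\theta^3}{6R^2}\right)^{m-1} \approx \alpha\theta^{m-1}\left(1 - \frac{m-1}{R^2}\,\frac{\theta^2}{6}\right)$ compared to $\alpha\theta^{m-1}$ in the Euclidean setting (i.e., in $\mathbb{R}^m$). This is subsumed by saying that the Ricci curvature of $RS^m$, the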
[Figure 5.3: a circle of geodesic radius $\theta$ on $S^2$; its radius in the ambient space is $\sin\theta$, and an angle $\alpha$ cuts out a geodesic arc of length $\alpha \sin\theta \approx \alpha\theta\left(1 - \frac{\theta^2}{6}\right)$.]
Figure 5.3. Volume growth on the sphere $S^2$ as a function of geodesic distance.
i“2
where sec denotes the sectional curvature. This leads to an alternative explanation
of the value of the Ricci curvature for the sphere, for other manifolds of constant
sectional curvature such as the Euclidean space or the hyperbolic space, or for their
quotients by discrete groups of symmetries (e.g., for tori or for the real projective
space). In the case of Lie groups, sectional curvature can be expressed via Lie
It follows then (same proof as Corollary 5.17) that any 1-Lipschitz function $f : X \to \mathbb{R}$ with median $M_f$ satisfies, for any $t > 0$,
$\mu_X(\{f > M_f + t\}) \le \frac12 \exp(-(m+1)t^2/2R^2)$.
As it turns out, the hypotheses of Theorem 5.38 are verified for many (but not
all) manifolds that naturally appear in mathematics and that play a role in physics,
notably for most classical Lie groups and their homogeneous spaces, see Table 5.3.
Exercise 5.58 (Ricci curvature of Grassmannians). For $\mathrm{Gr}(k, \mathbb{R}^n)$ or $\mathrm{Gr}(k, \mathbb{C}^n)$, the tangent space at any point can be identified with $M_{k,n-k}$. If $X, Y \in M_{k,n-k}$ are
and 5.59. Note that the values for the projective spaces $\mathrm{P}(V)$ and the corresponding $\mathrm{Gr}(1, V)$ do not coincide due to different normalization of the metric (an additional $\sqrt2$ factor in (B.10) when compared to (B.5)).
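X                            metric                          c(X)         comments
$\mathbb{R}^n$               Euclidean                       $0$
$S^{n-1}$                    geodesic                        $n-2$        $n \ge 2$
$\mathrm{SO}(n)$             standard (B.8)                  $(n-2)/4$    $n \ge 2$
$\mathrm{SU}(n)$             standard (B.8)                  $n/2$
$\mathrm{U}(n)$              standard (B.8)                  $0$
$\mathrm{Gr}(k,\mathbb{R}^n)$  quotient from $\mathrm{O}(n)$ (B.10)   $(n-2)/2$    $1 \le k \le n-1$
$\mathrm{Gr}(k,\mathbb{C}^n)$  quotient from $\mathrm{U}(n)$ (B.10)   $n$          $1 \le k \le n-1$
$\mathrm{P}(\mathbb{R}^n)$   Fubini–Study (B.5)              $n-2$        $n \ge 2$
$\mathrm{P}(\mathbb{C}^n)$   Fubini–Study (B.5)              $2n$         $n \ge 2$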
(5.45) $\sec(X, Y) = \frac14\left(\|XY^\dagger - YX^\dagger\|_{HS}^2 + \|X^\dagger Y - Y^\dagger X\|_{HS}^2\right)$.
Use this formula and (5.44) to compute the corresponding values from Table 5.3. In some references we find the coefficient $\frac12$ instead of $\frac14$ because of a different
if $\int f\,d\mu = 1$, where we used the convention $0 \log 0 = 0$, and then extended to non-negative integrable functions by 1-homogeneity. An explicit formula that implements the extension is
(5.47) $\mathrm{Ent}_\mu(f) := \int f \log f\,d\mu - \int f\,d\mu\,\log\left(\int f\,d\mu\right)$.
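The 1-homogeneity built into (5.47) is easy to see on a finite probability space. The sketch below is only an illustration: the three-point space, the weights `mu`, and the function `f` are made-up sample data, and `entropy` is a hypothetical helper name.

```python
from math import log

def entropy(f, mu):
    """Ent_mu(f) as in (5.47) on a finite probability space, with 0 log 0 = 0."""
    def xlogx(x):
        return x * log(x) if x > 0 else 0.0
    mean = sum(fi * mi for fi, mi in zip(f, mu))
    return sum(xlogx(fi) * mi for fi, mi in zip(f, mu)) - xlogx(mean)

# Illustrative data: a probability vector and a non-negative function on 3 points.
mu = [0.2, 0.3, 0.5]
f = [1.0, 4.0, 0.0]

ent_f = entropy(f, mu)
ent_3f = entropy([3 * x for x in f], mu)      # should equal 3 * ent_f
ent_const = entropy([2.0, 2.0, 2.0], mu)      # constants have zero entropy
```

Scaling `f` by a positive constant scales the entropy by the same constant, which is exactly what makes the extension from the normalized case $\int f\,d\mu = 1$ consistent.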
on $X$. We say that $(X, \mu)$ verifies a logarithmic Sobolev inequality with parameter $\alpha$ if for every (sufficiently smooth) function $f : X \to \mathbb{R}$ we have
(5.48) $\mathrm{Ent}_\mu(f^2) \le 2\alpha \int |\nabla f|^2\,d\mu$.
The smallest constant $\alpha$ that works in (5.48) is called the log-Sobolev constant of $(X, \mu)$ and denoted by $\mathrm{LS}(X, \mu)$.
The relevance of this circle of ideas to the concentration phenomenon is ex-
plained by the following result.
Theorem 5.39 (Herbst’s argument). Let $X$ be a Riemannian manifold and let $\mu$ be a Borel probability measure on $X$ such that $\mathrm{LS}(X, \mu) \le \alpha$. Then every 1-Lipschitz function $F : X \to \mathbb{R}$ is integrable and satisfies, for every $t > 0$,
(5.49) $\mu\left(F > \int F\,d\mu + t\right) \le e^{-t^2/2\alpha}$.
Remark 5.40. The above Theorem can be extended to the setting of general metric spaces, with essentially the same proof, once $|\nabla f|$ is properly defined. For example, we may use $|\nabla f|(x) = \limsup_{y \to x} \frac{|f(y) - f(x)|}{\operatorname{dist}(y, x)}$
exposition, we will assume for the rest of this subsection that the underlying spaces
are (connected) Riemannian manifolds.
Proof of Theorem 5.39. First, we may assume that $F$ is smooth and that $\int F\,d\mu = 0$; this may be achieved by replacing $F$ by an appropriate approximation and subtracting a constant. The strategy is to show that the (bilateral) Laplace transform of $F$ verifies
(5.50) $\int e^{\lambda F}\,d\mu \le e^{\alpha\lambda^2/2}$ for all $\lambda \in \mathbb{R}$,
which by Lemma 5.28 implies that $\mu(F > t) \le e^{-t^2/2\alpha}$, as needed. To establish (5.50), we introduce an auxiliary function $f = f_\lambda > 0$ defined via $f^2 = e^{\lambda F - \alpha\lambda^2/2}$. In other words, $f = e^{\lambda F/2 - \alpha\lambda^2/4}$ and it is readily checked that $\nabla f = \frac{\lambda}{2} f \nabla F$. Since $|\nabla F| \le 1$ (because $F$ is 1-Lipschitz), it follows that $|\nabla f|^2 \le \frac{\lambda^2}{4} f^2$. Consequently, by (5.48) (cf. (5.47)),
(5.51) $\mathrm{Ent}_\mu(f^2) = \int f^2\left(\lambda F - \frac{\alpha\lambda^2}{2}\right)d\mu - \int f^2\,d\mu\,\log\left(\int f^2\,d\mu\right) \le \frac{\alpha\lambda^2}{2}\int f^2\,d\mu$.
We now set $\varphi(\lambda) = \int f^2\,d\mu$ and note that differentiating under the integral sign gives
$\varphi'(\lambda) = \int f^2 (F - \alpha\lambda)\,d\mu$.
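(5.53) $\lim_{\lambda \to 0} \frac{\log\varphi(\lambda)}{\lambda} = \lim_{\lambda \to 0} \frac{\varphi'(\lambda)}{\varphi(\lambda)} = \frac{\varphi'(0)}{\varphi(0)} = \frac{0}{1} = 0$.
Combining (5.52) and (5.53) we conclude that $(\log\varphi(\lambda))/\lambda \le 0$ for $\lambda > 0$ and $(\log\varphi(\lambda))/\lambda \ge 0$ for $\lambda < 0$, which just means that $\varphi(\lambda) \le 1$ for all $\lambda \in \mathbb{R}$. In other words, $\int e^{\lambda F - \alpha\lambda^2/2}\,d\mu \le 1$ for $\lambda \in \mathbb{R}$, which is just a restatement of (5.50) and concludes the argument.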
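The conclusion $\varphi(\lambda) \le 1$ can be checked numerically in a case where everything is explicit. The sketch below assumes the one-dimensional Gaussian space with $\alpha = 1$ (the text states that the log-Sobolev constant of the Gaussian space equals 1) and the 1-Lipschitz function $F(x) = |x| - \mathbf{E}|Z|$; this choice of $F$ and the function names are illustrative assumptions.

```python
from math import erf, exp, pi, sqrt

def std_normal_cdf(x: float) -> float:
    """Distribution function of N(0,1)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def normalized_laplace_transform(lam: float) -> float:
    """phi(lam) = E exp(lam*F - lam^2/2) for F = |Z| - E|Z|, Z ~ N(0,1).

    Uses the closed form E exp(lam*|Z|) = 2 * exp(lam^2/2) * Phi(lam),
    so the factor exp(lam^2/2) cancels against exp(-lam^2/2).
    """
    mean_abs = sqrt(2.0 / pi)  # E|Z|
    return 2.0 * std_normal_cdf(lam) * exp(-lam * mean_abs)

# phi evaluated on a grid of lambda values in [-5, 5].
values = {k / 4: normalized_laplace_transform(k / 4) for k in range(-20, 21)}
```

On this grid every value stays at or below 1, with equality only at $\lambda = 0$, in line with (5.50) for $\alpha = 1$.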
Apart from the median being replaced by the expected value (which is largely a matter of convenience or elegance, see Proposition 5.29 in Section 5.2.3), the assertion of Theorem 5.39 closely resembles (5.26) and (5.33), which quantified the concentration phenomenon for Lipschitz functions in the spherical and Gaussian settings. However, its usefulness depends on the availability of spaces $(X, \mu)$ verifying logarithmic Sobolev inequalities. The next few results ensure that the supply is indeed quite ample. For easy reference, the spaces and estimates on their log-Sobolev constants are cataloged in Table 5.4.
$\mathrm{LS}(X, \mu) \le \frac{m-1}{m\,c(X)}$.
$\mathrm{LS}(\mathbb{C}^n, \gamma_n^{\mathbb{C}}) \le \frac12$.
Proposition 5.43 (not proved here, but see Exercise 5.61). We have
$\mathrm{LS}(S^1, \sigma) = 1$ and $\mathrm{LS}([0,1], \mathrm{vol}_1) = \pi^{-2}$.
Table 5.4. Bounds on log-Sobolev and Poincaré constants for a selection of classical manifolds. We use the same metrics as in Table 5.3. Except as indicated, the estimates on log-Sobolev constants follow from estimates on the Ricci curvature (see Proposition 5.41). Most of the time we use the bound $\mathrm{LS}(X, \mu) < c(X)^{-1}$; the more precise expressions involving the dimension of $X$ lead to slightly better but often cumbersome formulas. The upper bounds on the Poincaré constants of Grassmann manifolds follow from Remark 5.46. For more comments and references about Poincaré constants, see Notes and Remarks.
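X or (X, µ)                                      LS(X, µ)                 P(X, µ)                  Comments
$\left([a,b], \frac{\mathrm{vol}_1}{b-a}\right)$ $\frac{(b-a)^2}{\pi^2}$  $\frac{(b-a)^2}{\pi^2}$  Prop. 5.43
$S^{n-1}$                                        $\frac{1}{n-1}$          $\frac{1}{n-1}$          Prop. 5.43 for $S^1$
$\mathrm{P}(\mathbb{R}^n)$                       $\le \frac{1}{n-1}$      $\frac{1}{2n}$
$\mathrm{P}(\mathbb{C}^n)$                       $< \frac{1}{2n}$         $\frac{1}{4n}$
$\mathrm{SO}(n)$                                 $< \frac{4}{n-2}$        $\dots/(n-1)$
$\mathrm{SU}(n)$                                 $< \frac{2}{n}$          $\frac{n}{n^2-1}$
$\mathrm{U}(n)$                                  $\le \frac{6}{n}$        $\frac{1}{n}$            [MM13]
$\mathrm{Gr}(k, \mathbb{R}^n)$                   $< \frac{2}{n-2}$        $\le \frac{2}{n-1}$      $1 \le k \le n-1$
$\mathrm{Gr}(k, \mathbb{C}^n)$                   $< \frac{1}{n}$          $\le \frac{1}{n}$        $1 \le k \le n-1$
$(X \times Y, \mu_X \otimes \mu_Y)$              $\max\{\mathrm{LS}(X), \mathrm{LS}(Y)\}$   $\max\{\mathrm{P}(X), \mathrm{P}(Y)\}$   $\ell_2$ product metric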
Exercise 5.60 (Log-Sobolev constant for the Gaussian space). Show that $\mathrm{LS}(\mathbb{R}^n, \gamma_n) \ge 1$ (we have actually equality, see Proposition 5.42).
Exercise 5.61 (Log-Sobolev constants for segments and circles). (i) Use the contraction principle from Remark 5.46 to show that $\mathrm{LS}([0,1], \mathrm{vol}_1) \le \pi^{-2}\,\mathrm{LS}(S^1, \sigma)$ and $\mathrm{P}([0,1], \mathrm{vol}_1) \le \pi^{-2}\,\mathrm{P}(S^1, \sigma)$. (ii) Verify that $\mathrm{P}(S^1, \sigma) = 1$. (iii) Verify that $\mathrm{P}([0,1], \mathrm{vol}_1) \ge \pi^{-2}$ (see Notes and Remarks for the reasons why there is actually an equality).
5.2.4.3. Hypercontractivity, Gaussian polynomials. We give a brief introduction to the concept of hypercontractivity and use it to derive a concentration inequality for Gaussian polynomials.
We work on the probability space $(\mathbb{R}^n, \gamma_n)$. We define the Ornstein–Uhlenbeck semigroup of operators $(P_t)_{t \ge 0}$ as follows. For $f : \mathbb{R}^n \to \mathbb{R}$ a bounded measurable
and therefore $P_t$ extends to a bounded (contractive) operator on $L_p(\gamma_n)$. Remarkably, a stronger statement is true: provided $p > 1$ and $t > 0$, $P_t$ is a contraction from $L_p(\gamma_n)$ to $L_q(\gamma_n)$ for some $q = q(t) > p$. This phenomenon is called hypercontractivity.
Proposition 5.47 (not proved here, but see Exercise 5.63). Let $1 \le p \le q < \infty$ and $t > 0$ such that $q \le 1 + e^{2t}(p - 1)$. Then
$\|P_t f\|_{L_q(\gamma_n)} \le \|f\|_{L_p(\gamma_n)}$.
The eigenvectors of $P_t$ are the Hermite polynomials. In the one-dimensional case, denote by $(h_k)_{k \in \mathbb{N}}$ the sequence of polynomials obtained by orthonormalizing the sequence $(1, x, x^2, \dots)$ in the space $H_1 := L_2(\mathbb{R}, \gamma_1)$. (In this context, we exceptionally mean $\mathbb{N} = \{0, 1, 2, 3, \dots\}$.) Given a multi-index $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}^n$, let $h_\alpha$ be the multivariate polynomial
(5.56) $h_\alpha(x_1, \dots, x_n) = h_{\alpha_1}(x_1) \cdots h_{\alpha_n}(x_n)$.
(5.57) $P_t h_\alpha = e^{-t|\alpha|} h_\alpha$,
where $|\alpha| = \sum_{i=1}^n \alpha_i$ is the weight of the multi-index $\alpha$, or the total degree of the polynomial $h_\alpha$. Note that formula (5.57) allows one to define $P_t Q$ for any polynomial $Q$ even when $t$ is negative.
Proof. For any $t \ge 0$, we have $P_t P_{-t} Q = Q$ (see the remark following (5.57)).
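that $\|Q\|_{L_q(\gamma_n)} \le \|P_{-t} Q\|_{L_2(\gamma_n)}$. We may write the decomposition of $Q$ in the basis of Hermite polynomials
$Q = \sum_{|\alpha| \le k} c_\alpha h_\alpha$
for some coefficients $(c_\alpha)$. It follows that $\|Q\|^2_{L_2(\gamma_n)} = \sum c_\alpha^2$, while
$\|P_{-t} Q\|^2_{L_2(\gamma_n)} = \sum_{|\alpha| \le k} e^{2t|\alpha|} c_\alpha^2 \le e^{2tk} \|Q\|^2_{L_2(\gamma_n)}$,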
assume $\mathbf{E}X = 0$, $\operatorname{Var} X = 1$ and write by Markov’s inequality, for any $q \ge 2$,
$\mathbf{P}(|X| \ge t) \le t^{-q}\,\mathbf{E}|X|^q \le t^{-q}(q-1)^{kq/2} \le (q^{k/2}/t)^q$
where we used Proposition 5.48. The choice $q = t^{2/k}/e$ (which is larger than 2 provided $t \ge (2e)^{k/2}$) yields the result.
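The effect of the choice $q = t^{2/k}/e$ can be made explicit: substituting it gives $q^{k/2}/t = e^{-k/2}$, so the last quantity in the chain equals $\exp\left(-\frac{k}{2e}\,t^{2/k}\right)$. The short sketch below only verifies this arithmetic numerically; the function names and the sample values of $k$ and $t$ are illustrative assumptions.

```python
from math import e, exp

def markov_bound(t: float, k: int, q: float) -> float:
    """The quantity (q^{k/2} / t)^q from the last step of the displayed chain."""
    return (q ** (k / 2) / t) ** q

def optimized_bound(t: float, k: int) -> float:
    """Closed form obtained by substituting q = t^{2/k} / e."""
    return exp(-(k / (2 * e)) * t ** (2 / k))

def relative_gap(t: float, k: int) -> float:
    """Relative difference between the two expressions at q = t^{2/k} / e."""
    q = t ** (2 / k) / e
    return abs(markov_bound(t, k, q) - optimized_bound(t, k)) / optimized_bound(t, k)

# Sample points with t >= (2e)^{k/2}, so that q >= 2 as required.
samples = [(c * (2 * e) ** (k / 2), k) for k in (1, 2, 3, 4) for c in (1.0, 1.5, 3.0)]
```

For every sample the two expressions agree up to floating-point error, which confirms that the substitution produces a tail of order $\exp(-c_k t^{2/k})$ with $c_k = k/(2e)$.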
Remark 5.50. The phenomenon of hypercontractivity is not specific to the Gaussian case and is essentially equivalent to a log-Sobolev inequality (see Theorem 5.2.3 in [BGL14]). Similar concentration results are true for polynomials in binary random variables (see Theorem 9.21 in [O’D14]) and for polynomials on the sphere (cf. [Mon12]). Here is a precise statement of the latter. If $Q$ is a polynomial with total degree at most $k$ in $n_1 + \dots + n_d$ variables and $X = (X_1, \dots, X_d)$ with $X_i$ independent and uniformly distributed on $S^{n_i - 1}$, then for every $q \ge 2$, $\|Q(X)\|_{L_q} \le (q-1)^{k/2} \|Q(X)\|_{L_2}$. (This is slightly more general than Corollary 12 in [Mon12] which assumes that $n_1 = \dots = n_d$ and that the partial degrees in each variable are equal.) The argument is similar to the Gaussian case, using spherical harmonics instead of Hermite polynomials. Concentration estimates similar to Corollary 5.49 follow.
compute $P_t f_\lambda$ when $f_\lambda(x) = e^{\lambda x}$. Conclude that Proposition 5.47 is sharp in the following sense: when $q > 1 + e^{2t}(p - 1)$, there is no constant $C$ such that the
in the discrete case. We will exemplify it (and the issues that may arise) on the fundamental example of the Boolean cube $\{0,1\}^n$, or $\{-1,1\}^n$, endowed with the
$\frac1n \operatorname{card}\{i : x_i \ne y_i\}$, which up to normalization coincides with the $\ell_1$ metric in the ambient space $\mathbb{R}^n$. (This setting was already studied in Section 5.1.3; other product measures, or metrics induced by $\ell_p$-norms for other $p$ are also frequently considered, more about that later.)
A nearly optimal concentration result for the Boolean cube follows already from Proposition 5.37. However, we can do better: the exact solution to the isoperimetric problem on the cube is known. To describe it, we introduce a total order $<$ on $\{0,1\}^n$ (called the simplicial order) as follows: for $x = (x_i)$ and $y = (y_i)$ in $\{0,1\}^n$, declare that $x < y$ if either $x_1 + \dots + x_n < y_1 + \dots + y_n$ or $x_1 + \dots + x_n = y_1 + \dots + y_n$ and $x$ precedes $y$ in the lexicographic order. Then the initial segments for this order are isoperimetric sets. As opposed to the Gaussian and spherical cases, the extremal sets are not unique in any reasonable sense (see Exercise 5.66).
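For very small cubes the optimality of initial segments can be confirmed by exhaustive search. The sketch below is a brute-force illustration, not the proof of Harper's theorem. It rests on one assumption that should be stated loudly: within a fixed Hamming weight, the vector with a 1 in the first coordinate where two vectors differ is taken to come first (the text does not spell out which lexicographic convention is meant). All function names are hypothetical.

```python
from itertools import product

def simplicial_key(x):
    """Sort key: Hamming weight first; ties broken so that a 1 in the first
    differing coordinate comes earlier (assumed lexicographic convention)."""
    return (sum(x), tuple(-xi for xi in x))

def initial_segment_enlargement_sizes(n):
    """sizes[N] = number of points within Hamming distance 1 of the first N
    elements of {0,1}^n in the simplicial order."""
    order = sorted(product((0, 1), repeat=n), key=simplicial_key)
    sizes = [0]
    for count in range(1, len(order) + 1):
        enlarged = set(order[:count])
        for x in order[:count]:
            for i in range(n):
                enlarged.add(x[:i] + (1 - x[i],) + x[i + 1:])
        sizes.append(len(enlarged))
    return sizes

def min_enlargement_sizes(n):
    """best[N] = smallest enlargement size over ALL subsets of size N (brute force)."""
    num_vertices = 1 << n
    # ball[v] is a bitmask of v together with its n neighbours.
    ball = [(1 << v) | sum(1 << (v ^ (1 << i)) for i in range(n))
            for v in range(num_vertices)]
    best = [num_vertices] * (num_vertices + 1)
    best[0] = 0
    enlarged = [0] * (1 << num_vertices)
    for s in range(1, 1 << num_vertices):
        low = (s & -s).bit_length() - 1          # lowest vertex contained in s
        enlarged[s] = enlarged[s & (s - 1)] | ball[low]
        card = bin(s).count("1")
        best[card] = min(best[card], bin(enlarged[s]).count("1"))
    return best
```

Comparing the two lists for `n = 3` and `n = 4` checks enlargements by $1/n$ only; the search over all $2^{2^n}$ subsets makes larger `n` impractical.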
Theorem 5.51 (Harper’s isoperimetric inequality, not proved here). For any integer $N$ with $1 \le N < 2^n$, let $A \subset \{0,1\}^n$ be the set of $N$ smallest elements with respect to the simplicial order. Then $A$ has the smallest $\varepsilon$-enlargements (for all $\varepsilon > 0$) among all sets of the same cardinality. The set $A$ verifies
(5.58) $B(x, k/2n) \subset A \subset B(x, (k+1)/2n)$
for some $k \in \{0, \dots, n-1\}$.
If we define the boundary of $A$ as $\partial A := \{y \in \{0,1\}^n : \operatorname{dist}(y, A) = 1/n\}$, the sets from Theorem 5.51 also have the “smallest boundary” among subsets of $\{0,1\}^n$ of the same measure. In this language, the condition (5.58) says that $A$ consists of a ball and a part of its boundary. If $N = \sum_{j=0}^k \binom{n}{j}$ for some $k$, the situation becomes simple: the optimal sets are balls, and so are their enlargements.
For example, if $n = 2m + 1$ is odd, an example of an optimal set of measure $\frac12$ is
$A = \{y \in \{0,1\}^n : Y \le m\}$,
where $Y = \sum_{j=1}^n y_j$. The enlargements of $A$ are then clearly of the form $A_{s/n} = \{Y \le m + s\}$ and, consequently,
(5.59) $\mu(A_{s/n}) = \frac{\sum_{j=0}^{m+s} \binom{n}{j}}{2^n} = 1 - \frac{\sum_{j > m+s} \binom{n}{j}}{2^n} \ge 1 - e^{-2s^2/n}$,
where the inequality follows from Hoeffding’s inequality (5.43). A similar analysis can be performed when $n$ is even (see Exercise 5.64 for details). To summarize, we have
Remark 5.53. Some authors assert that the bound $\mu(A_\varepsilon) \ge 1 - e^{-2n\varepsilon^2}$ (for $A$ satisfying $\mu(A) \ge \frac12$) holds for all $\varepsilon > 0$. However, this may be false, but only if $n = 1$ or $2$ and only for certain values of $\varepsilon \in (0, 1/n)$, see Exercise 5.65.
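The role of $n = 1, 2$ in this remark can be seen from a one-line computation. For $\varepsilon < 1/n$ the enlargement adds no new points of the cube, so a set with $\mu(A) = \frac12$ still has $\mu(A_\varepsilon) = \frac12$, and the bound fails exactly when $1 - e^{-2n\varepsilon^2} > \frac12$, that is, when $\varepsilon > \sqrt{\log 2/(2n)}$. The sketch below (with a hypothetical helper name) checks for which $n$ such an $\varepsilon$ exists below $1/n$; it covers only the range $\varepsilon \in (0, 1/n)$ and does not address the claim for $\varepsilon \ge 1/n$.

```python
from math import log, sqrt

def bound_can_fail_below_one_over_n(n: int) -> bool:
    """True if some eps in (0, 1/n) makes 1 - exp(-2*n*eps^2) exceed 1/2.

    The right-hand side exceeds 1/2 iff eps > sqrt(log(2) / (2n)), so a
    failing eps exists in (0, 1/n) iff that threshold is below 1/n.
    """
    threshold = sqrt(log(2) / (2 * n))
    return threshold < 1 / n

failing_dimensions = [n for n in range(1, 50) if bound_can_fail_below_one_over_n(n)]
```

The condition reduces to $n < 2/\log 2 \approx 2.885$, which singles out $n = 1$ and $n = 2$.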
(The differences include the mean being replaced by the median, and the numerical constants being better in the former, which is not surprising since it is a more specialized result.) The Corollary is an elegant and sharp result, but it exhibits the following unsatisfactory feature: if we use the standard Euclidean metric to define the 1-Lipschitz property of $f$ or the expansions $A_t$, the exponential term in the estimates becomes $e^{-2t^2/n}$. This should be compared to the dimension-free (and differently scaled) term $\frac12 e^{-t^2/2}$ in Theorem 5.24, the Gaussian isoperimetric inequality. However, there is a fix to this difficulty due to Talagrand: if the function $f$ is convex, its restriction to $\{0,1\}^n$ exhibits dimension-free subgaussian concentration. We have
Theorem 5.54 (Talagrand’s convex concentration inequality for the Boolean cube, not proved here). Let $A$ be a non-empty subset of $\{0,1\}^n \subset \mathbb{R}^n$ and set $\varphi_A(x) := \operatorname{dist}(x, \operatorname{conv} A)$, where the distance is calculated with respect to the Euclidean metric. Then
(5.60) $\mathbf{E}\, e^{\frac12 \varphi_A^2} \le 1/\mu(A)$
and so $\mu(\varphi_A > t) \le e^{-t^2/2}/\mu(A)$ for $t > 0$. Consequently, if $f : [0,1]^n \to \mathbb{R}$ is a convex (or concave) 1-Lipschitz function and $M$ is its median with respect to $\mu$, then $\mu(f > M + t) \le 2e^{-t^2/2}$ for $t > 0$.
In the statement of Theorem 5.54 we tacitly assume that $\mu$ is a measure on $\mathbb{R}^n$ supported on $\{0,1\}^n$. The second assertion of the Theorem follows from (5.60) by Markov’s inequality. Some finer issues related to the derivation of the last assertion are addressed in Exercise 5.67. See also Exercise 5.68.
Theorem 5.54 turned out to be very useful (for example in the context of random matrices) and has been generalized in various ways. Here is one possible statement.
Theorem 5.55 (not proved here). Let $V_1, V_2, \dots, V_N$ be finite-dimensional normed spaces and let $V = \bigoplus_{j=1}^N V_j$ be their sum in the $\ell_q$-sense (for some $q \ge 2$). For $j = 1, 2, \dots, N$, let $\mu_j$ be a measure on $V_j$ supported on a set of diameter at most 1 and let $\mu = \bigotimes_{j=1}^N \mu_j$. Further, assume that $F : V \to \mathbb{R}$ is 1-Lipschitz and quasiconvex (i.e., $F^{-1}((-\infty, a])$ is convex for all $a \in \mathbb{R}$) or quasiconcave. Then
(5.61) $\mu(F > M + t) \le 2e^{-\frac14 t^q}$ for all $t > 0$,
where $M$ is the median of $F$ with respect to $\mu$.
We conclude this section with a result that is the counterpart of Theorem 5.54
with the median replaced by the mean, whose degree of generality is intermediate
Theorem 5.56 (Convex concentration inequality for the mean, not proved here). Let $\mu = \mu_1 \otimes \dots \otimes \mu_k$ be a product measure on $[0,1]^n \subset \mathbb{R}^n$ and let $f : [0,1]^n \to \mathbb{R}$ be a function which is 1-Lipschitz with respect to the Euclidean distance
(5.62) $\mu(f > \mathbf{E}f + t) \le e^{-t^2/2}$.
While, by Remark 5.12 (which was based on the very general results from
Section 5.2.3.2), statements about concentration around the median formally im-
ply similar statements about the mean, we state Theorem 5.56 separately since it
is even, an example of a set $A \subset \{0,1\}^n$ with $\mu(A) = \frac12$ that is optimal in the sense of Theorem 5.51 is $A = \left\{\sum_{j=1}^n y_j < m\right\} \cup \left\{\sum_{j=1}^n y_j = m \text{ and } y_1 = 1\right\}$. Show that also in this case $\mu(A_{s/n}) \ge 1 - e^{-2s^2/n}$ for $s \in \mathbb{N}$.
Exercise 5.65. Show that the bound $\mu(A_\varepsilon) \ge 1 - e^{-2n\varepsilon^2}$ from Corollary 5.52 may fail for some $\varepsilon > 0$ if $n = 1$ or $2$, but that it always holds if $n > 2$ or if $\varepsilon \ge 1/n$.
Exercise 5.66 (Non-uniqueness in Harper’s theorem). Give an example of a value $N$ and two sets of $N$ elements in $\{0,1\}^4$ with smallest $\varepsilon$-enlargements (for all values of $\varepsilon$) among sets with $N$ elements, which are distinct up to symmetries of the hypercube. Note: it appears to be unknown whether uniqueness can be assured by insisting that both $A$ and its complement are isoperimetric sets for all sizes of enlargement.
Exercise 5.67 (Talagrand’s concentration inequality for concave functions). Derive the bound $\mu(f > M + t) \le 2e^{-t^2/2}$ for concave $f$ in Theorem 5.54 (or, equivalently, $\mu(f < M - t) \le 2e^{-t^2/2}$ for convex $f$) from the inequalities preceding it.
Exercise 5.68 (Existence of convex Lipschitz extensions). Let $K \subset \mathbb{R}^n$ be a convex set and let $f : K \to \mathbb{R}$ be a convex 1-Lipschitz function. Then $f$ admits a convex 1-Lipschitz extension to $\mathbb{R}^n$. Consequently, in Theorem 5.54 it doesn’t matter whether we assume $f$ to be convex and 1-Lipschitz on $\mathbb{R}^n$ or just on $[0,1]^n$.
Exercise 5.69 (No dimension-free subgaussian bound in absence of convexity). Here is an example showing that convexity is crucial in Theorem 5.54. Define $f : \{-1,1\}^n \to \mathbb{R}$ by $f(x_1, \dots, x_n) = \max(0, x_1 + \dots + x_n)^{1/2}$. Show that $f$ has median 0 and is $\frac{1}{\sqrt2}$-Lipschitz with respect to the Euclidean metric, while $\mu\left(f > cn^{1/4}\right) \ge c$ for some absolute constant $c > 0$.
5.2.6. Deviation inequalities for sums of independent random variables. In this section we gather some simple but useful facts about deviation inequalities for sums of independent mean zero random variables. We mostly focus on two families of random variables: subgaussian and subexponential variables.
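In a probabilistic setting, the $L_p$-norm (for $p \ge 1$) of a random variable $X$ is $\|X\|_p = (\mathbf{E}|X|^p)^{1/p}$.
(5.63) $\|Z\|_p = \frac{\sqrt2}{\pi^{1/2p}}\,\Gamma\left(\frac{p+1}{2}\right)^{1/p} \sim \sqrt{\frac{p}{e}}$,
(5.64) $\|T\|_p = \Gamma(p+1)^{1/p} \sim \frac{p}{e}$
as $p$ tends to infinity.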
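The formulas (5.63) and (5.64) and their asymptotics are easy to evaluate numerically. The sketch below assumes, as the surrounding text suggests, that $Z$ is a standard $N(0,1)$ variable and $T$ an exponential variable with parameter 1 (so that $\mathbf{E}\,T^p = \Gamma(p+1)$); the function names are illustrative.

```python
from math import e, exp, lgamma, pi, sqrt

def gaussian_lp_norm(p: float) -> float:
    """||Z||_p for Z ~ N(0,1) via (5.63); lgamma avoids overflow for large p."""
    return sqrt(2.0) * pi ** (-1.0 / (2.0 * p)) * exp(lgamma((p + 1.0) / 2.0) / p)

def exponential_lp_norm(p: float) -> float:
    """||T||_p for T exponential with mean 1 via (5.64)."""
    return exp(lgamma(p + 1.0) / p)

# Ratios to the asymptotic expressions sqrt(p/e) and p/e at a large value of p.
p_large = 1.0e4
gaussian_ratio = gaussian_lp_norm(p_large) / sqrt(p_large / e)
exponential_ratio = exponential_lp_norm(p_large) / (p_large / e)
```

The small cases recover familiar values ($\|Z\|_1 = \sqrt{2/\pi}$, $\|Z\|_2 = 1$, $\|T\|_1 = 1$), and both ratios are already close to 1 at $p = 10^4$.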
This terminology is consistent with that introduced in the preamble to Section 5.2 and based on the tail behavior (cf. (5.21), (5.22); see Exercise 5.70 and Lemma 5.57 below). Similarly, $X$ is said to be subexponential (or $\psi_1$) when
(5.66) $\|X\|_{\psi_1} := \sup_{p \ge 2} \frac{\|X\|_p}{\|T\|_p} < \infty$.
The reader may be familiar with the arguably less ad hoc forms of $\psi_r$ conditions, based on either the rate of growth of the (bilateral) Laplace transform or the appropriate Orlicz norms, or on the tail behavior of the type
$\mathbf{P}(|X| > t) \le Ce^{-\lambda t^r}$ for $t \ge 0$
(cf. (5.21) and (5.22)). There is no need to be alarmed, though: while not identical, all these approaches lead to quantities that are equivalent up to universal constants. The definitions (5.65)–(5.66) were chosen out of convenience in view of the sample applications we present. See Notes and Remarks for more details and references.
It follows from (5.63) and (5.64) that $\|T\|_{\psi_1} = 1$, $\|Z\|_{\psi_2} = \sqrt{2/\pi}$ and that $\|\cdot\|_{\psi_1} \le \|\cdot\|_{\psi_2}$ (see Exercise 5.75). We have obviously $\|\cdot\|_{\psi_2} \le \|\cdot\|_\infty$ and $\|\cdot\|_{\psi_1} \le \|\cdot\|_\infty$, so the present discussion also applies to bounded variables. Another important example of subgaussian variables is obtained by taking the inner product with a fixed vector of a randomly chosen unit vector in $\mathbb{R}^d$ or $\mathbb{C}^d$. This has to be compared with Poincaré’s lemma (Theorem 5.22) which says that the Gaussian measure appears at the limit $d \to \infty$.
Lemma 5.57. If $X$ is uniformly distributed on $S^{d-1}$ (resp., $S_{\mathbb{C}^d}$), then for every $u \in \mathbb{R}^d$ (resp., $u \in \mathbb{C}^d$), we have $\|\langle X, u\rangle\|_{\psi_2} \le |u|/\sqrt{d}$.
Proof. We may assume by homogeneity that $|u| = 1$. Let $G$ be a standard Gaussian vector in $\mathbb{R}^d$. The variable uniformly distributed on $S^{d-1}$ can then be represented as $X = G/|G|$. Moreover, $|G|$ is independent of $X$ and hence, for $p \ge 1$,
$\|\langle G, u\rangle\|_p = \||G|\|_p\, \|\langle X, u\rangle\|_p$.
We have $\||G|\|_p \ge \||G|\|_1 = \kappa_d$ (see Section 4.3.3). Since $\langle G, u\rangle$ has distribution $N(0,1)$, we know from (5.63) that $\|\langle G, u\rangle\|_{\psi_2} = \sqrt{2/\pi} = \kappa_1$. Therefore, using Proposition A.1(ii), we obtain $\|\langle X, u\rangle\|_{\psi_2} \le \frac{\kappa_1}{\kappa_d} \le \frac{1}{\sqrt d}$. The complex case is similar.
$\mathbf{P}(|S| > t) \le 2\exp\left(-\frac{t^2}{8eK^2}\right)$.
The proof actually yields a better bound $2\exp\left(-\frac{t^2}{2eK^2}\right)$ when $(X_i)$ are symmetric random variables (i.e., such that $X_i$ and $-X_i$ have the same distribution for any fixed $i$).
Remark 5.60. Propositions 5.58 and 5.59 readily generalize to the complex case (with possibly different numerical constants).
Exercise 5.70 (Lipschitz function on a Gaussian space is subgaussian). Let $G$ be a standard Gaussian vector on $\mathbb{R}^n$ and $f : \mathbb{R}^n \to \mathbb{R}$ a 1-Lipschitz function such that $f(G)$ has mean zero. Deduce from the results of Section 5.2.2 that $\|f(G)\|_{\psi_2} \le C$ for some absolute constant $C$. (Except for the value of the constant $C$, this is a generalization of Lemma 5.57.)
Exercise 5.71 (Khintchine inequalities). Let $X = \sum_{i=1}^n \varepsilon_i a_i$, where $a_1, \dots, a_n$ are real numbers and $(\varepsilon_i)$ is a sequence of independent random variables with $\mathbf{P}(\varepsilon_i = 1) = \mathbf{P}(\varepsilon_i = -1) = 1/2$. Show that, for any $p \ge 1$,
$A_p \|X\|_{L_2} \le \|X\|_{L_p} \le B_p \|X\|_{L_2}$
where $A_p > 0$ and $B_p$ are constants depending only on $p$. Show that $B_p = O(\sqrt p)$ as $p \to \infty$.
Exercise 5.72 (Khintchine–Kahane inequalities). Khintchine inequalities have a vector-valued generalization which is due to Kahane: If $x_1, \dots, x_n$ belong to some normed space $Y$ and $X'$ denotes the random variable $\left\|\sum_{i=1}^n \varepsilon_i x_i\right\|_Y$, then
$A'_p \|X'\|_{L_2} \le \|X'\|_{L_p} \le B'_p \|X'\|_{L_2}$
where $A'_p > 0$ and $B'_p$ are constants depending only on $p$. Prove this. Moreover, we have $A_1 = A'_1 = 1/\sqrt2$ and $B'_p = \Theta(\sqrt p)$ as $p \to \infty$.
Exercise 5.73. Prove Proposition 5.58 by following the outline given below.
(i) If $X$ is symmetric, show that $\mathbf{E}\exp(\lambda X) \le \exp\left(\frac{e}{2}\|X\|_{\psi_2}^2 \lambda^2\right)$ for any $\lambda > 0$.
(ii) Let $Y$ be an independent copy of a mean zero random variable $X$. Show that $\mathbf{E}\exp(\lambda X) \le \mathbf{E}\exp(\lambda(X - Y))$. Using this symmetrization trick, deduce from (i) that the inequality $\mathbf{E}\exp(\lambda X) \le \exp(2e\|X\|_{\psi_2}^2 \lambda^2)$ holds for any mean zero random variable $X$.
(iii) Deduce Proposition 5.58 using Lemma 5.28.
$\|X\|_{\psi_1} \le \|X\|_{\psi_2}$.
1, then $\mathbf{E}\exp(\lambda X) \le 1 + 2\lambda^2 \le \exp(2\lambda^2)$ for $|\lambda| < 1/2$ (cf. Lemma 5.28).
(ii) Under the hypotheses of Proposition 5.59, assuming $K = 1$ and denoting $S = a_1 X_1 + \dots + a_n X_n$, prove that $\mathbf{E}\exp(\lambda S) \le \exp\left(2\lambda^2 \sum a_i^2\right)$ for $|\lambda| \le 1/(2\|a\|_\infty)$.
(iii) Prove Proposition 5.59.
$\arccos\sqrt{2/n}$, we have $V(t) \ge (6\sqrt{n}\cos t)^{-1}(\sin t)^{n-1}$ (similar estimates appear in [Bör04], Lemma 6.8.6). For some values of $n$, $t$ (roughly for $t > 1.14$ and for large $n$), this is better than the lower bound from (5.4), and similarly superior to the improved bound from Exercise 5.4 if $t > 1.221$.
The random covering argument from Proposition 5.4 is due to Rogers [Rog57, Rog63]. The factor $Cn \log n$ from Corollary 5.5 is usually referred to as the density of the covering, even though calling it “the overlap” or “the redundancy” would seem more logical. Both Rogers’s original argument and the one presented here allow achieving $C = 1$ at the expense of additional lower order terms (see Exercise 5.8 and its hint). Recent advances by Dumer [Dum07] improve the bound on the density to $\left(\frac12 + o(1)\right) n \log n$. The paper [Dum07] also establishes a density bound $\frac12 n \log n + 2n \log\log n + 5n$, valid for all $\varepsilon \in (0,1)$ and all $n \ge 4$. It should be noted, however, that the latter result deals with a slightly easier problem, covering the sphere $S^{n-1} \subset \mathbb{R}^n$ by balls whose centers are not required to belong to $S^{n-1}$ (i.e., with the parameter $N'$ from Exercise 5.1). Finally, at the price of increasing the constant $C$, the result from Corollary 5.5 can be strengthened as follows: for any dimension $n$ and angle $\varepsilon$, there is a covering of $S^{n-1}$ by caps of radius $\varepsilon$ such that any point belongs to at most $400 n \log n$ caps [BW03].
Since the sphere looks locally like a Euclidean space, as the radii of the caps tend to 0, the packing/covering problems for $S^{n-1}$ converge to the corresponding problems for $\mathbb{R}^{n-1}$. (The original random covering argument of Rogers [Rog57] considered an even more general question, economical coverings of $\mathbb{R}^n$ by translates of an arbitrary convex body, the spherical variant being an afterthought, and led to an upper bound of $n \log n + n \log\log n + 5n$ for the appropriately defined asymptotic density.) In that setting, a lower bound on density of optimal coverings by Euclidean balls is $\Omega(n)$ [CFR59] and this estimate can be transferred back to $S^{n-1}$ if the radius is small enough; see Example 6.3 in [BW03] for an argument
also [BN06a]). Again, when the radius of the cap tends to 0, the problem becomes the classical sphere packing problem in $\mathbb{R}^n$. In this context, a classical result due to Minkowski–Hlawka shows the existence of lattice packings of Euclidean balls (or actually, of any symmetric convex body) in $\mathbb{R}^n$ which cover a proportion $1/2^{n-1}$ of the space (a.k.a. packing density). Remarkably, this result has been only marginally improved in the past century [Rog47, DR47, Bal92b] and is exponentially far from the Kabatjanskiĭ–Levenšteĭn upper bound (which is approximately of order $0.66^n$) for the proportion covered by a (not necessarily lattice) packing (see [Gru07] for more on this topic).
Covering and particularly packing in the Hamming cube is of fundamental importance in coding theory, see, e.g., [Rot06, CHLL97]. The case of (very small) balls of radius $1/n$ in $\{0, \dots, q-1\}^n$ is treated in [KP88].
The Gilbert–Varshamov bound has been improved in the q-ary cube for certain
large values of q in [TVZ82], using a link with modular curves.
Packing and covering for convex bodies. For early references on metric
entropy of convex bodies see [CS90], [Pis89b].
The arguments from [Bar14] imply the following improvement on the volumetric bound from Corollary 5.10: for $\varepsilon \in (0,1)$, any symmetric convex body in $\mathbb{R}^n$ is $(1+\varepsilon)$-close in Banach–Mazur distance to a polytope with $(C/\sqrt{\varepsilon})^n$ vertices. (This is sharp: consider the case of the sphere.) To the best of our knowledge, it is not known whether an analogous statement holds for not-necessarily symmetric bodies and the affine version (4.2) of the Banach–Mazur distance. Similar questions can be considered for large $\varepsilon$, or even $\varepsilon$ growing with the dimension. In the case of the sphere, this is essentially the problem considered in Exercise 5.13. Again, [Bar14] contains good estimates in the general case. However, the bounds from [Bar14] deteriorate as the asymmetry of the body (defined, for example, as the minimal distance $d_{BM}$ to a symmetric body) increases. Estimates that are superior for some ranges of parameters can be found in [Sza].
Let us also mention an important open problem, known as the duality conjecture: do there exist absolute constants $c, C > 0$ such that for every two symmetric convex bodies $K, L \subset \mathbb{R}^n$ we have
(5.67) $\log N(L^\circ, K^\circ) \le C \log N(K, cL)$?
This was proved when $K$ or $L$ is the Euclidean ball [AMS04] and extended to the case when a bound on the $K$-convexity constant (as defined in Section 7.1.2) is present in [AMSTJ04]. Another possible generalization to the setting of non-symmetric convex bodies is more tricky; in that case, even the proper formulation of (5.67) is not entirely clear.
A deep fact about covering numbers is the following ([Mil86], see also the discussion in [Pis89b]): there is an absolute constant $C$ such that, for every symmetric convex body $K \subset \mathbb{R}^n$ there is a 0-symmetric ellipsoid $\mathcal{E}$ such that
bodies is an ellipsoid, it follows then that similar bounds automatically hold also for $N(K^\circ, \mathcal{E}^\circ)$ and $N(\mathcal{E}^\circ, K^\circ)$. (In the original definitions, all four quantities were
not aware that questions of that nature were considered in AGA already in 1980s.
Section 5.2. Classical general references about concentration of measure are
[Led01] and [Sch03]. We particularly recommend the recent monograph [BLM13].
For a presentation directed towards applications to data science, see [Ver].
Isoperimetry and concentration. A geometry-oriented reference about
isoperimetric inequalities is [BZ88]. The paternity of the isoperimetric inequal-
ity on the sphere (Theorem 5.13) is usually attributed to Lévy [Lév22, Lév51]
although the arguments he presented were not fully rigorous; [Sch48] is usually
cited as the first rigorous proof. Remarkably, the functional version (Lévy’s lemma,
in the language of our Corollary 5.17) appears explicitly in [Lév22] (see p. 279)
and is therefore almost one century old!
A self-contained proof of the isoperimetric inequality on $S^{n-1}$, based on the
concept of spherical symmetrization, appears in [FLM77]. Another symmetriza-
tion procedure (the two-point symmetrization) is applied in [Ben84]. The simple
proof of the non-sharp inequality from Proposition 5.15 is based on [AdRBV98].
Proposition 5.20 is from [JS].
The Gaussian isoperimetric inequality was proved independently by Borell
[Bor75b] and Sudakov–Tsirelson [SC74]. For a proof of Poincaré’s lemma (Theorem 5.22) going beyond the weak convergence version from Exercise 5.29, we refer
to [DF87] (which also advocates that the statement was first formulated by Borel
and not by Poincaré). See also [Led96] and references therein. For a direct proof
of concentration of measure on Gauss space, see [Pis86].
Ehrhard’s inequality (5.31) was proved in [Ehr83] for convex sets, then extended in [Lat96] to the case where only one of the sets is convex, with the
general case being treated in [Bor03]. A priori, deriving an isoperimetric inequal-
ity such as (5.29) requires validity of (5.31) for an arbitrary Borel set and a ball;
the paper [Ehr83], however, contains a direct application of the technique to prove
(5.29). A general reference for this circle of ideas is [Lat02].
The concept of central values was formalized and applied in the context of QIT
in [ASW11], which also contains versions of Corollaries 5.32 and 5.35. However,
instances of the arguments can be found in [Has09] and in AGA literature dating
to (at least) 1980s.
N
martingales.
Geometric and analytical methods. General references for Section 5.2.4
the theorem is sharp as stated, there is a reason to suspect that a more precise
result should be available: the proof proceeds via a local/variational argument
and the globally normalized volume appears only a posteriori. A more satisfactory
variant appears in [Mil15]. In addition to the curvature, it takes into account the
actual diameter of the manifold in question, which may be strictly smaller than
the bound following indirectly from the curvature. However, since the results in
[Mil15] necessarily involve model manifolds more complicated than spheres, their
statements are somewhat technical.
The case of manifolds of dimension 1 is a little special. First, while the definition
of Ricci curvature in dimension 1 needs to be properly construed, the only sensi-
ble value is 0 since every such manifold looks locally like a segment. Accordingly,
Proposition 5.41 is then vacuously true. Next, the solution to the isoperimetric
problem in $S^1$ (resp., in $\mathbb{R}$) is very simple: among sets of any (positive, but not
full) measure, the boundary is the smallest if it consists of exactly two points.
Consequently, the solutions, both for the “smallest boundary” and the “smallest en-
largement” problems, are arcs (resp., segments). However, finer analytic statements
(including but not limited to LSI) are interesting and highly nontrivial already in
dimension 1. For example, in view of Proposition 5.44, the validity of (5.48) for
the 1-dimensional Gaussian measure implies the same inequality in any dimension
(with the same constant α, which, in view of Proposition 5.42, can be taken to be
1, which is optimal). Indeed, even statements about spaces consisting of only two
points can be deep as for example in the elementary proof of the Gaussian isoperi-
metric inequality presented in [Bob97]. We will return to the same theme further
when reporting on developments directly related to LSI and hypercontractivity.
Log-Sobolev inequalities (LSI) were introduced in a seminal paper by Gross
[Gro75]. Again, the case of manifolds of dimension 1 (segments, circles) is a little
special; see [GMW14] for an elementary overview of this aspect of the subject and
for references. The link with concentration of measure (the Herbst argument) orig-
inates in an unpublished letter from Herbst to Gross. The connection between LSI,
Ricci curvature, and the Hessian of the density was put forward in [BÉ85, Bak94].
For a comprehensive treatment of functional inequalities (including complete refer-
ences), see [BGL14]. Another fruitful approach is the connection between LSI and
the quadratic transportation cost inequalities; see Chapter 6 in [Led01].
As exemplified in Table 5.4, the values of the Poincaré constants can often
be computed exactly. Indeed, the Poincaré inequality (5.54) can be rewritten as
$\operatorname{Var}_\mu f \le \alpha \int (-\Delta f) f \, d\mu$, where $\Delta$ is the Laplace–Beltrami operator on $L^2(X, \mu)$.
It follows that the optimal $\alpha$ is equal to the reciprocal of the “spectral gap,” i.e.,
the smallest nonzero eigenvalue of $-\Delta$. In some examples the eigenfunctions of the
Laplace–Beltrami operator can be explicitly described: for the Gauss space they
are the Hermite polynomials, for the sphere they are the spherical harmonics (see
the elementary [See66], or [BGM71] which covers also the case of the projective
spaces). On $S^{n-1}$, equality in (5.54) is achieved for functions of the form $x \mapsto \langle x, y \rangle$
with $y \in \mathbb{R}^n$. For Lie groups there is a connection with the spectrum of the Casimir
operator and representations of the associated Lie algebra (see Proposition 10.6 in
[Hal15]), which allows one to derive the entire spectrum of $-\Delta$. The case of $\mathrm{SO}(n)$ and
$\mathrm{SU}(n)$ appears in [SC94] (for $\mathrm{U}(n)$, see [Voi91]). Note that in these examples there
is equality in (5.54) when $f$ is a function of the form $M \mapsto \operatorname{Tr}(AM)$ for $A \in \mathsf{M}_n$. For
a complete list of semisimple Lie algebras, see [Rot86]. The spectrum of Grassmann
has been first established by Nelson [Nel73]. The connection with log-Sobolev
with vertex-isoperimetry. If we consider instead edge-isoperimetry (minimizing the
number of edges joining $A$ to $A^c$), the optimal sets are no longer Hamming balls
but subcubes.
Theorem 5.54 is taken from [Tal88]. (Note that [Tal88] states the result for the
cube $\{-1, 1\}^n$ and so the coefficient in the exponent in the estimate corresponding
to (5.60) is there $\frac{1}{8}$.) Theorem 5.55 appears in [JS91] and [Mec04]. The latter
paper addresses general unconditional direct sums and not only $\ell_q$-sums; see also
[Mec03]. Similar results, but with quite different proofs, were presented in [Mau91]
and [Dem97]. The most abstract (and most flexible) statements are arguably in
[Tal95, Tal96b, Tal96a]. The arguments addressing settings more general than
that of Theorem 5.54 usually led to a coefficient $\frac{1}{4}$ in the exponent as in (5.61),
except for [Tal95], which includes a statement (Theorem 4.2.4) featuring coefficient
$\frac{1}{2}$, but at the cost of introducing additional factors of lower order and restricting
the range of $t$. A clean proof of Theorem 5.56 (which also has coefficient $\frac{1}{2}$ in the
exponent) can be found in [BLM13]; the argument is attributed to [Led97] and
Deviation inequalities. Some references for Section 5.2.6 are [Ver12] and
[CGLP12] (the latter treats also the case of intermediate growth between sub-
gaussian and subexponential). As pointed out in the main text, there are several
possible forms of $\psi_r$ conditions and of definitions of the $\psi_r$-norms. The original
ones were (presumably) in terms of Orlicz/Young functions: given an increasing
If one considers $\psi_r(x) = \exp(x^r) - 1$ ($r \ge 1$), then, for $r = 1, 2$, one gets norms
which are equivalent (although not equal) to the ones defined in (5.66) and (5.65).
For precise statements and proofs, see Theorem 1.1.5 in [CGLP12], which also
covers the link to (the rate of growth of) the Laplace transforms mentioned in the
main text; cf. Lemma 5.28 and Exercise 5.76. Overall, Section 1.1 of [CGLP12]
is an excellent reference for $\psi_r$ conditions/norms, which are otherwise difficult to
extract from books/surveys on the more general Orlicz spaces.
For a historical account of Bernstein’s contributions, we refer to pp. 126–128
in [AAGM15]. For more precise results about moments of sums of independent
variables, see [Lat97]. For non-commutative analogues of these inequalities (i.e.,
for sums of random matrices), see [Tro12].
Finally, among other techniques to prove concentration of measure, we men-
tion the so-called martingale method which implies for example concentration on
permutation groups (see [Sch82, Mau79, MS86]): If we equip the symmetric
group $S_n$ with the uniform probability measure and the distance
$d(\sigma, \tau) = \frac{1}{n} \operatorname{card}\{i : \sigma(i) \ne \tau(i)\}$, then any 1-Lipschitz function $f$ on $(S_n, d)$ satisfies
$\mathbf{P}(f \ge \mathbf{E} f + t) \le \exp(-nt^2/8)$ for any $t \ge 0$.
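As a small numerical illustration of this bound, the sketch below simulates uniform random permutations and uses the fraction of fixed points as the Lipschitz function. The choice of function, the helper names, and the parameters $n = 40$, $t = 0.2$ are assumptions made here for illustration only; the function is 1-Lipschitz because two permutations differing in $m$ positions have fixed-point counts differing by at most $m$.

```python
import math
import random


def fixed_point_fraction(perm):
    """f(sigma) = (1/n) * card{i : sigma(i) = i}, a 1-Lipschitz function for d."""
    n = len(perm)
    return sum(1 for i, v in enumerate(perm) if v == i) / n


def empirical_upper_tail(n, t, trials, seed=0):
    """Estimate P(f >= E f + t) under the uniform measure on S_n."""
    rng = random.Random(seed)
    mean = 1.0 / n  # a uniform permutation has one fixed point on average
    hits = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        if fixed_point_fraction(perm) >= mean + t:
            hits += 1
    return hits / trials


n, t = 40, 0.2
tail = empirical_upper_tail(n, t, trials=2000)
bound = math.exp(-n * t * t / 8)
print("empirical tail:", tail, "bound:", bound)
```

For this particular function the empirical tail is far below the bound, which is expected: the inequality is uniform over all 1-Lipschitz functions.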
The best constants in Khintchine inequalities (see Exercise 5.72) have been
found in [Sza76] (who proved $A_1 = 1/\sqrt{2}$) and in [Haa81] (for $p > 1$). The
Khintchine–Kahane inequalities from Exercise 5.72 were first proved in [Kah85].
The correct asymptotic order of the constants as $p \to \infty$ was found in [Kwa76],
while the value $A'_1 = 1/\sqrt{2}$ is from [LO94]. A complete proof of the Khintchine–
Kahane inequalities can be found by consulting Theorem 3.5.2 of [AAGM15].
CHAPTER 6
Gaussian Processes and Random Matrices
This chapter is devoted to the development of probabilistic techniques which,
along with the concentration of measure from Chapter 5, constitute our most powerful
tools. Specifically, we will consider stochastic processes (mostly, but not exclusively,
Gaussian) and present deep results permitting their quantitative study. The key
insights are the link between suprema of Gaussian processes and the mean width of
convex bodies, and the application of comparison theorems for Gaussian processes to the
analysis of spectral behavior of random matrices.
6.1. Gaussian processes
This section deals with Gaussian processes (widely used in mathematical mod-
eling and in statistics) and presents several tools for estimating various parameters
related to such processes. A Gaussian process $X = (X_t)_{t \in T}$ is simply a family of
jointly Gaussian variables, normally with mean zero, defined on some probability
space $\Omega$, which may or may not be specified. See Appendix A for more on the
terminology and for basic and not-so-basic facts about Gaussian variables.
processes appear when considering the Gaussian mean width of a convex body (and
this is essentially the general case, see Section 6.1.1) and therefore can be used to
estimate other geometric parameters such as volume. There are essentially three
(i) Discretize the problem by using an $\varepsilon$-net and appealing to the union bound.
(ii) Use a recursive version of (i) by considering a whole hierarchy of $\varepsilon$-nets (for
example $\varepsilon = 2^{-k}$ for every integer $k$). This is called a “chaining argument.”
(iii) Use a further sophistication of (ii), where instead of using nets whose resolution
parameter is uniform across the index set, we allow more general partition schemes.
This is called the “generic chaining” or the “majorizing measure” approach.
A deep result due to Talagrand asserts that (iii) provides an estimate on the
supremum of any Gaussian process which is always sharp up to a multiplicative
constant. However, we mostly consider the situations (i) and (ii) since they are
much simpler and sufficient for our purposes.
We note for the record that without any assumptions on regularity of X, which
will be implicitly made in what follows, measurability issues and other complications
may in principle arise, particularly when T is uncountable. For the benefit of a
non-specialist reader we sketch examples of possible pathologies in Exercise 6.1.
However, such potential difficulties are not relevant in our context and we will
henceforth largely ignore them. For example, in all the settings we are interested
and other questions can similarly be reduced to considering instances of the problem
with finite index sets. As usual, the crucial point will be that the constants that
may appear in the statements do not depend on X and, in particular, on the size
of T .
Exercise 6.1. Give examples of processes $(X_t)_{t \in T}$ such that, for every $t \in T$,
$X_t = 0$ a.s., but (a) $\mathbf{E} \sup\{X_t : t \in T\} = \infty$, (b) $\sup\{X_t : t \in T\}$ is not measurable.
6.1.1. Key example and basic estimates. We start with a simple but
crucial observation that if $G : \Omega \to \mathbb{R}^n$ is a standard Gaussian vector, then
$(\langle G, x \rangle)_{x \in \mathbb{R}^n}$ is a Gaussian process. Recalling the definition of the Gaussian mean
width of a (bounded nonempty) set $K \subset \mathbb{R}^n$, as introduced in Section 4.3.3,
$$w_G(K) = \mathbf{E} \sup\{\langle G, x \rangle : x \in K\}, \tag{6.2}$$
we see that calculating $w_G(K)$ is equivalent to finding the expectation of the supre-
mum of a certain Gaussian process, a subprocess of $(\langle G, x \rangle)_{x \in \mathbb{R}^n}$.
This instance is actually, more or less, the general case. This follows by com-
bining two facts:
(i) the map $x \mapsto \langle G, x \rangle$ is an isometry from $(\mathbb{R}^n, |\cdot|)$ to $L^2(\Omega)$
(ii) the joint distribution of $X = (X_t)_{t \in T}$ is uniquely determined by the covariances
$(\mathbf{E} X_s X_t)_{s,t \in T}$ and so all the stochastically relevant information about the process
For a finite process $X = (X_k)_{1 \le k \le N}$ this is easily realized: we can choose
$E := \operatorname{span}\{X_k\} \subset L^2(\Omega)$ and $x_k = X_k$. We then have in particular
The above construction shows that the two (classes of) problems, namely calcu-
lating (1) the mean width of a convex set and (2) the expectation of the supremum
of a Gaussian process, are essentially equivalent. This equivalence will turn out to
be very fruitful. Recall that if $0 \in K$, then $\sup\{\langle y, x \rangle : x \in K\} = \|y\|_{K^\circ}$ and so
$w_G(K) = \mathbf{E} \|G\|_{K^\circ}$. It may happen that the set $K_X$ does not contain 0, but this can
Lemma 6.1. Let $(X_k)_{1 \le k \le N}$ be Gaussian random variables with mean zero and
variance bounded by 1. Then
$$\mathbf{E} \max_{1 \le k \le N} X_k \le \sqrt{2 \log N}. \tag{6.4}$$
Proof. We use the following elementary computation: if $X$ has distribution
$N(0, \sigma^2)$ with $\sigma^2 \le 1$, then $\mathbf{E}\, e^{tX} = \exp(t^2 \sigma^2/2) \le \exp(t^2/2)$ for any real $t$. For
$\beta > 0$ to be determined, we have (the second inequality being Jensen’s inequality)
$$\mathbf{E} \max_{1 \le k \le N} X_k \le \frac{1}{\beta}\, \mathbf{E} \log \sum_{k=1}^{N} e^{\beta X_k} \le \frac{1}{\beta} \log \mathbf{E} \sum_{k=1}^{N} e^{\beta X_k} \le \frac{1}{\beta} \log\bigl(N \exp(\beta^2/2)\bigr),$$
and the optimal choice $\beta = \sqrt{2 \log N}$ yields (6.4).
This completes the proof of the first inequality. A slightly weaker, but more
general estimate, based on the simple (and not-so-optimal, see Appendix A.1) upper
bound
$$\mathbf{P}(Z \ge t) \le \frac{1}{2} e^{-t^2/2} \quad \text{if } t \ge 0 \tag{6.6}$$
for the tail of a standard normal variable $Z$ (see Exercise A.1) is given in Lemma
6.16. We relegate the proof of the second inequality (based on a lower bound for
the tail of $Z$) to Exercise 6.2, which also gives an explicit expression for the $o(1)$
quantity.
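The upper bound (6.4) can be compared with simulated values. The following is a minimal Monte Carlo sketch, assuming independent standard normal variables (one admissible case of Lemma 6.1); the function name, the sample sizes, and the seed are choices made here for illustration.

```python
import math
import random


def mean_max_gaussian(N, trials, seed=0):
    """Monte Carlo estimate of E max(X_1, ..., X_N) for i.i.d. N(0, 1) variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(N))
    return total / trials


for N in (2, 10, 100, 1000):
    estimate = mean_max_gaussian(N, trials=500)
    bound = math.sqrt(2 * math.log(N))  # right-hand side of (6.4)
    print(N, round(estimate, 3), round(bound, 3))
```

In such runs the estimate stays below $\sqrt{2 \log N}$, and the ratio of the two slowly approaches 1 as $N$ grows.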
We note that the estimate from (6.4) also holds for the expected maximum of
the absolute values of Gaussian variables.
Lemma 6.2 (see Exercise 6.3). Let $N \ge 2$ and let $(X_k)_{1 \le k \le N}$ be jointly Gauss-
ian random variables with variance bounded by 1. Then
$$\mathbf{E} \max_{1 \le k \le N} |X_k| \le \sqrt{2 \log N}.$$
When $N \ge 4$, the inequality holds for any Gaussian random variables (that is, not
necessarily jointly Gaussian).
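For $N \ge 4$ the bound can be checked numerically via a truncation argument: for any $T \ge 0$, the union bound gives $\mathbf{E} \max_k |X_k| \le T + 2N \int_T^\infty \mathbf{P}(Z > t)\, dt$, and the integral has a closed form. The sketch below evaluates this majorant with the threshold $T = \sqrt{2 \log N - 3/2}$; the function names and the tested range of $N$ are assumptions made here.

```python
import math


def gaussian_tail(t):
    """P(Z > t) for a standard normal variable Z."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def truncation_bound(N, T):
    """T + 2N * int_T^inf P(Z > t) dt, using int_T^inf P(Z > t) dt = phi(T) - T * P(Z > T)."""
    density = math.exp(-T * T / 2.0) / math.sqrt(2.0 * math.pi)
    return T + 2 * N * density - 2 * N * T * gaussian_tail(T)


def check(N):
    """True if the truncation bound with T = sqrt(2 log N - 3/2) is at most sqrt(2 log N)."""
    T = math.sqrt(2 * math.log(N) - 1.5)
    return truncation_bound(N, T) <= math.sqrt(2 * math.log(N))


print(all(check(N) for N in range(4, 2001)))
print(check(3))
```

The check for $N = 3$ is included to show that this particular choice of $T$ does not cover the smallest values of $N$, which is consistent with the restriction $N \ge 4$ in the second part of the lemma.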
number of vertices.
(the unit ball of $\ell_1^n$), then $K = \operatorname{conv}\{\pm e_1, \dots, \pm e_n\}$, where $(e_k)_{k=1}^n$ is the standard
unit vector basis in $\mathbb{R}^n$. Consequently, Proposition 6.3 used with $N = 2n$ leads
to the bound $\operatorname{vrad} B_1^n = O\bigl(\sqrt{\log(n)/n}\bigr)$, while the correct value (cf. Table 4.1) is
$O(1/\sqrt{n})$. Some of these issues are explored in Exercise 6.4.
Remark 6.5 (Conjectured extremal property of the regular simplex). It is
conjectured that the polytope with $N$ vertices and outradius 1 that has the largest
Gaussian mean width is the regular simplex inscribed in the unit ball. This is
known (and easy) for $N \le 3$. By the argument used in the proof of Proposition
6.3, this is equivalent to characterizing the instances giving the extremal value of
$\mathbf{E} \max_{1 \le k \le N} X_k$ in the context of Lemma 6.1 (with $(X_k)_{1 \le k \le N}$ jointly Gaussian).
Exercise 6.2. Show that, in the context of the second part of Lemma 6.1, we
have
$$\mathbf{E} \max_{1 \le k \le N} X_k \ge \sqrt{2 \log N} - O\left(\frac{\log \log N}{\sqrt{\log N}}\right).$$
$$\mathbf{E} \max_{1 \le k \le N} |X_k| \le T + 2N \int_T^\infty \mathbf{P}(Z > t)\, dt = T + \frac{2N}{\sqrt{2\pi}} e^{-T^2/2} - 2NT\, \mathbf{P}(Z > T)$$
and check numerically that the choice $T = \sqrt{2 \log N - 3/2}$ gives the needed in-
equality. Note that this proof does not use the hypothesis that the variables are
jointly Gaussian. For 2 or 3 jointly Gaussian variables, use Proposition 6.9 to
the notation of Proposition 6.3, $N = O(n)$, then $\operatorname{vrad} K = O(1/\sqrt{n})$, which yields
the better bound stated in Remark 6.4 for that range of $N$.
Exercise 6.5 (Volume of symmetric polytopes with few vertices). Show that
if $K \subset B_2^n$ is a symmetric polytope with $N$ vertices, the conclusion of Proposition
6.3 can be slightly improved to the inequality $\operatorname{vrad}(K) \le \sqrt{2 \log N}/\sqrt{n}$.
Exercise 6.6 (Mean widths of standard sets). Prove the estimates involving
mean width from Table 4.1.
6.1.2. Comparison inequalities for Gaussian processes. The following
fundamental inequality is known as Slepian’s lemma. It expresses the fact that
strengthening correlations of a Gaussian process decreases the supremum.
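Proposition 6.6 (Slepian’s lemma, not proved here). Let $(X_k)_{1 \le k \le N}$ and
$(Y_k)_{1 \le k \le N}$ be Gaussian processes, and assume that
$$\mathbf{E} (X_k - X_j)^2 \le \mathbf{E} (Y_k - Y_j)^2$$
for all $j, k$. Then
$$\mathbf{E} \sup_{1 \le k \le N} X_k \le \mathbf{E} \sup_{1 \le k \le N} Y_k. \tag{6.7}$$
Moreover, if also $\mathbf{E}[X_k^2] = \mathbf{E}[Y_k^2]$ for all $k$, then for any $\lambda_1, \dots, \lambda_N \in \mathbb{R}$
$$\mathbf{P}(X_k \ge \lambda_k \text{ for some } k) \le \mathbf{P}(Y_k \ge \lambda_k \text{ for some } k). \tag{6.8}$$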
Slepian’s lemma can be re-formulated in geometric language: contractions de-
crease the mean width. More precisely, if $T \subset \mathbb{R}^n$ and if $\phi : T \to \mathbb{R}^m$ is a contraction
(with respect to the Euclidean distance, not necessarily linear), then
$$w_G\bigl(\operatorname{conv}(\phi(T))\bigr) = w_G(\phi(T)) \le w_G(T) = w_G(\operatorname{conv}(T)). \tag{6.9}$$
If $m = n$, we can immediately deduce from (4.32) that also $w(\phi(T)) \le w(T)$. This
property seems intuitively obvious, but we know a simple proof only if $\phi$ is linear
(or affine, see Exercise 4.46).
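The principle that strengthening correlations decreases the expected supremum can be observed in a simulation. The sketch below compares independent standard normal variables $Y_k$ with equicorrelated variables $X_k = \sqrt{\rho}\, Z_0 + \sqrt{1 - \rho}\, Z_k$, for which $\mathbf{E}(X_k - X_j)^2 = 2(1 - \rho) \le 2 = \mathbf{E}(Y_k - Y_j)^2$. The equicorrelated model, the function names, and the parameters are illustrative assumptions, not taken from the text.

```python
import math
import random


def mean_max(sample_vector, trials, seed):
    """Monte Carlo estimate of E max_k V_k, where sample_vector(rng) draws (V_1, ..., V_N)."""
    rng = random.Random(seed)
    return sum(max(sample_vector(rng)) for _ in range(trials)) / trials


def equicorrelated(N, rho):
    """Sampler for X_k = sqrt(rho) * Z_0 + sqrt(1 - rho) * Z_k (unit variance, correlation rho)."""
    def sample(rng):
        shared = rng.gauss(0.0, 1.0)
        return [math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
                for _ in range(N)]
    return sample


def independent(N):
    """Sampler for independent N(0, 1) variables Y_1, ..., Y_N."""
    def sample(rng):
        return [rng.gauss(0.0, 1.0) for _ in range(N)]
    return sample


N, rho = 20, 0.5
e_max_x = mean_max(equicorrelated(N, rho), trials=4000, seed=1)
e_max_y = mean_max(independent(N), trials=4000, seed=2)
print("correlated:", round(e_max_x, 3), "independent:", round(e_max_y, 3))
```

In this model the shared component $Z_0$ does not affect which index attains the maximum and has mean zero, so the expected maximum of the $X_k$ equals $\sqrt{1 - \rho}$ times that of the $Y_k$; the simulation should reflect this ratio.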
Slepian’s lemma admits a number of variants and generalizations; the follow-
ing one has been quite useful. In particular, it leads to elegant proofs of various
statements about random matrices (see Section 6.2) and versions of Dvoretzky’s
theorem (Section 7.2).
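Proposition 6.7 (Gordon’s lemma, not proved here). Let $(X_t)_{t \in T}$ and $(Y_t)_{t \in T}$
be Gaussian processes. Assume further that $T = \bigcup_{s \in S} T_s$ and that
Then
$$\mathbf{E} \max_{s \in S} \min_{t \in T_s} X_t \le \mathbf{E} \max_{s \in S} \min_{t \in T_s} Y_t.$$
Moreover, if also $\mathbf{E} X_t^2 = \mathbf{E} Y_t^2$ for all $t \in T$, then for any choice of real numbers
$(\lambda_t)_{t \in T}$,
$$\mathbf{P}\Bigl(\bigcup_{s \in S} \bigcap_{t \in T_s} \{X_t \ge \lambda_t\}\Bigr) \le \mathbf{P}\Bigl(\bigcup_{s \in S} \bigcap_{t \in T_s} \{Y_t \ge \lambda_t\}\Bigr).$$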
Remark 6.8. (1) When all $T_s$ are singletons, Gordon’s lemma reduces to
the Slepian version. Accordingly, Proposition 6.7 is sometimes referred to as the
Slepian–Gordon lemma. (2) Replacing $X_t, Y_t$ with $-X_t, -Y_t$ we get analogous state-
ments for min max in place of max min, and similarly for Slepian’s lemma and
for the statements about probabilities. (3) Further generalizations to min and max
applied alternately more than twice are possible.
Another fundamental comparison inequality is the Khatri–Šidák lemma.
Proposition 6.9 (Khatri–Šidák, see Exercise 6.9). Consider two Gaussian
processes $(X_k)_{1 \le k \le N}$ and $(Y_k)_{1 \le k \le N}$, and assume that
(1) for every $1 \le k \le N$, $\mathbf{E} X_k^2 = \mathbf{E} Y_k^2$,
(2) the random variables $(Y_k)_{1 \le k \le N}$ are independent.
Then,
$$\mathbf{E} \sup_{1 \le k \le N} |X_k| \le \mathbf{E} \sup_{1 \le k \le N} |Y_k|. \tag{6.10}$$
$$\mathbf{P}(|X_k| \le t_k \text{ for all } k) \ge \mathbf{P}(|Y_k| \le t_k \text{ for all } k) = \prod_{k=1}^{N} \mathbf{P}(|Y_k| \le t_k). \tag{6.12}$$
Similarly to Slepian’s lemma, both (6.10) and (6.12) have nice geometric inter-
pretations. Consider $n$ bands in $\mathbb{R}^n$ of the form $B_i = \{x \in \mathbb{R}^n : |\langle x, u_i \rangle| \le a_i\}$
where $u_1, \dots, u_n \in S^{n-1}$ are unit vectors and $a_1, \dots, a_n$ are positive numbers. Then,
the mean width of $B_1^\circ \cap \dots \cap B_n^\circ$ is minimal when the directions of the bands (i.e.,
the normal vectors $u_i$) are pairwise orthogonal. Similarly, the (Gaussian) measure
of the intersection of the bands is minimal if the bands are orthogonal.
A remarkable statement that generalizes (6.12) and that has been a long-
standing open problem is the Gaussian correlation conjecture. It was answered
affirmatively very recently by Royen, who proved the following inequality: given
0-symmetric convex sets $K, L \subset \mathbb{R}^n$ and a centered Gaussian measure $\mathbf{P}$ on $\mathbb{R}^n$,
we have
$$\mathbf{P}(K \cap L) \ge \mathbf{P}(K)\,\mathbf{P}(L). \tag{6.13}$$
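The simplest instance of (6.13), which is also the case $N = 2$ of (6.12), takes $K$ and $L$ to be two symmetric strips in the plane. The sketch below estimates both sides by Monte Carlo for a centered Gaussian vector $(X_1, X_2)$ with unit variances and correlation $\rho$. The strips, the function name, and the parameter values are assumptions chosen here for illustration.

```python
import math
import random


def strip_probabilities(rho, t1, t2, trials, seed=0):
    """Estimate P(K and L) and P(K) * P(L) for K = {|x1| <= t1}, L = {|x2| <= t2}."""
    rng = random.Random(seed)
    in_k = in_l = in_both = 0
    for _ in range(trials):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        x1 = g1
        x2 = rho * g1 + math.sqrt(1.0 - rho * rho) * g2  # correlation rho with x1
        a = abs(x1) <= t1
        b = abs(x2) <= t2
        in_k += a
        in_l += b
        in_both += (a and b)
    return in_both / trials, (in_k / trials) * (in_l / trials)


joint, product = strip_probabilities(rho=0.8, t1=1.0, t2=1.0, trials=20000)
print("P(K and L):", round(joint, 3), "P(K)P(L):", round(product, 3))
```

For $\rho = 0$ the two quantities agree up to sampling error, while for strongly correlated coordinates the joint probability is visibly larger than the product.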
the “equal variance” assumption, approximate the space by a sphere of large radius.
Exercise 6.8. Show that it is enough to verify (6.13) when P is the standard
Gaussian measure.
Gaussian measure is log-concave and therefore satisfies (4.28). Then deduce the
Khatri–Šidák inequality (Proposition 6.9).
$X = (X_t)_{t \in T}$ we may identify $X$ with a subset of the Hilbert space $L^2(\Omega)$ (cf. (6.3)
and the comments in the paragraph containing it). Since the joint distribution of
$(X_t)_{t \in T}$ is uniquely determined by the covariances $(\mathbf{E} X_s X_t)_{s,t \in T}$, it follows that all
the stochastically relevant information about the process is encoded in the geometry
of $X$. As it turns out, the value of the expected supremum of $X$ is intimately related
to the behavior of covering numbers $N(X, \varepsilon)$. The first result in this direction is
the Sudakov inequality.
Proposition 6.10 (Sudakov minoration). Let $X = (X_t)_{t \in T}$ be a Gaussian
process. Then,
$$c \sup_{\varepsilon > 0} \varepsilon \sqrt{\log N(X, \varepsilon)} \le \mathbf{E} \sup_{t \in T} X_t \tag{6.14}$$
for some absolute constant $c > 0$.
Proof. By (5.1), we may equivalently work with the packing number $P(X, \varepsilon)$.
Let $\varepsilon > 0$ and let $S \subset T$ be a subset which is $\varepsilon$-separated in the $L^2$-norm, that is,
verifying $\|X_s - X_t\|_2 \ge \varepsilon$ whenever $s, t \in S$ and $s \ne t$. Let $(Y_s)_{s \in S}$ be a Gaussian
process such that $Y_s$ are independent $N(0, \varepsilon^2/2)$ random variables. By construction,
we have
$$\|Y_s - Y_t\|_2 = \varepsilon \le \|X_s - X_t\|_2$$
for any $s, t \in S$ with $s \ne t$. Accordingly, by Slepian’s lemma and Lemma 6.1, we
can conclude that
$$\varepsilon \sqrt{\log(\operatorname{card} S)} \sim \mathbf{E} \sup_{s \in S} Y_s \le \mathbf{E} \sup_{s \in S} X_s \le \mathbf{E} \sup_{t \in T} X_t,$$
as needed.
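The two facts about $(Y_s)$ used in the proof can each be verified in one line. The computation below assumes only the independence of the $Y_s$ and the standard asymptotics $\mathbf{E} \max_{k \le N} Z_k \sim \sqrt{2 \log N}$ for independent standard normal variables $Z_k$; the shorthand $N = \operatorname{card} S$ is introduced here for convenience.

```latex
\begin{align*}
\|Y_s - Y_t\|_2^2
  &= \mathbf{E}\, Y_s^2 + \mathbf{E}\, Y_t^2 - 2\, \mathbf{E}\, Y_s Y_t
   = \frac{\varepsilon^2}{2} + \frac{\varepsilon^2}{2} - 0 = \varepsilon^2, \\
\mathbf{E} \sup_{s \in S} Y_s
  &= \frac{\varepsilon}{\sqrt{2}}\, \mathbf{E} \max_{1 \le k \le N} Z_k
   \sim \frac{\varepsilon}{\sqrt{2}} \sqrt{2 \log N} = \varepsilon \sqrt{\log N}.
\end{align*}
```

The variance $\varepsilon^2/2$ is thus exactly the normalization that makes the independent process $\varepsilon$-separated while keeping its expected supremum of order $\varepsilon \sqrt{\log N}$.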
In view of the comments in Section 6.1.1 (cf. (6.2), (6.3)), Sudakov’s inequality
(6.14) is really a statement about Gaussian mean widths of subsets of a Hilbert
space. Since $w_G(K) \sim \sqrt{n}\, w(K)$ for $K \subset \mathbb{R}^n$ (see Section 4.3.3), the inequality
(6.14) may be restated as follows: for every bounded set (or, equivalently, for every
convex body) $K \subset \mathbb{R}^n$ we have
$$\log N(K, \varepsilon B_2^n) \lesssim w_G(K)^2/\varepsilon^2 \sim n\, w(K)^2/\varepsilon^2. \tag{6.15}$$
In general, Sudakov’s inequality is not tight (see Exercise 6.11). However,
in combination with the equally simple-minded bound (6.5) (applied at the ap-
propriate “level of resolution”), it often leads to surprisingly precise estimates for
$\mathbf{E} \sup_{t \in T} X_t$. We will elaborate on this point in the next section, in which we prove
yields a reasonable estimate when $\log N(K, \varepsilon) = O(n)$. For smaller $\varepsilon$, i.e., when
$\log N(K, \varepsilon) \gg n$, the volumetric approach from Lemma 5.8 is generally more precise.
we have
$$\log N(B_2^n, K^\circ, \varepsilon) = \log N(B_2^n, \varepsilon K^\circ) \lesssim w_G(K)^2/\varepsilon^2 \sim n\, w(K)^2/\varepsilon^2. \tag{6.16}$$
6.11 follows from Proposition 6.10, and vice versa, by the (known) Euclidean case
of the duality conjecture of covering numbers (5.67). However, there is a simple
r
self-contained argument.
Pe
them.
Here are details of the calculation behind the second observation. First, by sym-
metry of $L^\circ$,
$$\gamma_n(x_i + 2L^\circ) = \frac{\gamma_n(x_i + 2L^\circ) + \gamma_n(-x_i + 2L^\circ)}{2} = \int_{2L^\circ} \frac{\phi(x + x_i) + \phi(x - x_i)}{2}\, dx,$$
where $\phi(x) = (2\pi)^{-n/2} e^{-|x|^2/2}$ is the density of $\gamma_n$. Next, by convexity of the
exponential function and by the parallelogram identity
$$\begin{aligned}
\frac{\phi(x + x_i) + \phi(x - x_i)}{2} &= (2\pi)^{-n/2}\, \frac{e^{-|x + x_i|^2/2} + e^{-|x - x_i|^2/2}}{2} \\
&\ge (2\pi)^{-n/2} e^{-(|x + x_i|^2 + |x - x_i|^2)/4} \\
&= (2\pi)^{-n/2} e^{-(|x|^2 + |x_i|^2)/2} \\
&= e^{-|x_i|^2/2} \phi(x) \\
&\ge e^{-r^2/2} \phi(x).
\end{aligned}$$
Inserting this estimate into the preceding formula we get
$$\gamma_n(x_i + 2L^\circ) \ge e^{-r^2/2} \gamma_n(2L^\circ) \ge \frac{1}{2} e^{-r^2/2}$$
and so $N \le 2e^{r^2/2}$. This is exactly what we needed, except in the case when $r$ is
small, which can be handled separately by an elementary argument showing that
Remark 6.12. In the setting of observation (a) in the proof above, a stronger
statement is actually true: if $w_G(L) = 1$, then $\gamma_n(L^\circ) \ge \frac{1}{2}$, see Exercise 6.14.
Exercise 6.11 (The gap in Sudakov’s inequality). Show that the gap in Su-
dakov’s inequality, i.e., the ratio between $w_G(K)$ and $\sup_{\varepsilon > 0} \varepsilon \sqrt{\log N(K, \varepsilon)}$, can
be arbitrarily large. For example, let $(d_j)_{j=1}^n$ be a “sufficiently fast” increasing se-
quence of positive integers and consider $K = K_1 \times K_2 \times \dots \times K_n$, where $K_j$ is a
Euclidean sphere of dimension $d_j$ and radius $1/\sqrt{d_j}$.
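Exercise 6.12 (Metric entropy of $B_1^n$). Let $K = n^{1/2} B_1^n$. It is known (see
Theorem 1 in [Sch84]) that then
$$\log N(K, \varepsilon) \simeq \begin{cases} n\, \dfrac{\log(2\varepsilon)}{\varepsilon^2} & \text{if } 1 \le \varepsilon \le \frac{1}{2} n^{1/2}, \\[1ex] n \log(2/\varepsilon) & \text{if } 0 < \varepsilon \le 1. \end{cases} \tag{6.17}$$
Compare the performance/facility of application of (6.15) to that of Lemma 5.8
when estimating $\log N(K, \varepsilon)$.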
Exercise 6.13 (Gaussian measure and the inradius). Let $\gamma_n$ be the stan-
dard Gaussian measure on $\mathbb{R}^n$. Show that if a symmetric convex body $K \subset \mathbb{R}^n$
satisfies $\gamma_n(K) \ge \gamma_1([-r, r])$, then $K \supset rB_2^n$. In particular, if $\gamma_n(K) \ge .683$,
then $N(B_2^n, K) = 1$. Conclude that the left-hand side of (6.16) is 0 whenever
$w(K)/\varepsilon \le .317$.
Exercise 6.14 (Gaussian measure and the mean width). Show that if a sym-
metric convex body $L \subset \mathbb{R}^n$ satisfies $w_G(L) \le 1$, then $\gamma_n(L^\circ) \ge \frac{1}{2}$.
Exercise 6.15 (Metric entropy of $B_\infty^n$). Use one of the Sudakov inequalities to
show that, for every $0 < \varepsilon < 1$, $N(B_2^n, B_\infty^n, \varepsilon)$ grows (at most) polynomially with
the dimension $n$.
It is actually known (see Theorem 1 in [Sch84]) that
$$\log N(B_2^n, B_\infty^n, \varepsilon) \simeq \begin{cases} \dfrac{\log(2n\varepsilon^2)}{\varepsilon^2} & \text{if } n^{-1/2} \le \varepsilon \le 1/2, \\[1ex] n \log \dfrac{2}{n\varepsilon^2} & \text{if } 0 < \varepsilon \le n^{-1/2}. \end{cases} \tag{6.18}$$
The similarity of the estimates (6.17) and (6.18) is not a coincidence; see (5.67).
(Note that (6.17) could have been equivalently stated with $\log(2\varepsilon^2)$ and $\log(2/\varepsilon^2)$
instead of $\log(2\varepsilon)$ and $\log(2/\varepsilon)$, making the similarity even more apparent.)
6.1.4. Dudley’s inequality and the generic chaining. The preceding sec-
tion presented lower bounds for expected suprema of a Gaussian process in terms of
the related covering/packing numbers. In this section we will present similar upper
bounds in a slightly more general setting.
Let $(S, \rho)$ be a compact metric space and let $(X_s)_{s \in S}$ be a family of random
variables (a stochastic process indexed by $S$). We say that $(X_s)$ is centered if
$\mathbf{E} X_s = 0$ for all $s \in S$, and that it is subgaussian if, for all $s, t \in S$ with $s \ne t$ and
for all $\lambda > 0$,
$$\mathbf{P}(X_s - X_t > \lambda) \le A \exp\left(-\alpha \frac{\lambda^2}{\rho(s, t)^2}\right), \tag{6.19}$$
where $A, \alpha$ are positive parameters (independent of $\lambda, s, t$). The motivation for the
terminology is that if the process is Gaussian, then (6.19) holds with $A = \alpha = \frac{1}{2}$ and
with respect to the metric $\rho(s, t) = \|X_s - X_t\|_2$, and the bound is then essentially
tight (see Exercise A.1).
ż R{2 b
` ˘
r
sPS 0
where R is the radius of S.
Corollary 6.14. If $(X_s)_{s \in S}$ satisfies (6.19) with $A \ge \frac{1}{2}$, but is not-necessarily-
centered, then
$$\mathbf{E} \sup_{s \in S} X_s \le \sup_{s \in S} \mathbf{E} X_s + B \quad \text{and} \quad \mathbf{E} \sup_{s \in S} |X_s| \le \sup_{s \in S} \mathbf{E} |X_s| + B \tag{6.21}$$
where $B$ is the quantity on the right-hand side of (6.20).
The first bound in the Corollary follows immediately by considering $X'_s =
X_s - \mathbf{E} X_s$, and the second by noticing that if $(X_s)$ verifies (6.19), then so does
$(|X_s|)$.
Remark 6.15. (1) Most formulations of Dudley’s inequality involve the ex-
pression $\int \sqrt{\log N(S, \eta)}\, d\eta$. In that case, the integrand is 0 if $\eta$ is larger than the
radius of $S$, and so one may as well integrate over $[0, \infty)$. In our formulation, the
integrand is never 0; this is the price we are paying for having good dependence
of the bound on $A$ and, to a lesser extent, for Lemma 6.16 being stated for not-
necessarily-centered variables.
(2) Some applications require majorizing the expected value of $\sup_{s,t} |X_s - X_t| =
\sup_{s,t} (X_s - X_t)$; the proof below then yields (in the notation of Corollary 6.14) the
bound $2B$, without having to assume that $(X_s)$ is centered.
(3) When comparing Dudley’s inequality to Sudakov’s inequality (6.14), we notice
that the former involves the $L^1$-norm of the function $\phi(\eta) = \sqrt{\log N(S, \eta)}$, while
the latter the weak $L^1$-quasinorm (see [Gra14] for the definition). This explains
why the two bounds are often of the same order and even if they are not, their ratio
depends rather weakly on the dimension and other parameters.
Proof of Dudley’s inequality. Observe first that both sides of the in-
equality change in the same way if we rescale the process and/or the metric (i.e.,
replace $(X_t)$ by $(aX_t)$ and/or $\rho$ by $b\rho$ for some $a, b > 0$) and appropriately adjust
the parameter $\alpha$. Accordingly, we may assume that both $\alpha$ and the radius of $S$ are
equal to 1. For every integer $k \ge 0$, let $\mathcal{N}_k$ be a $2^{-k}$-net of minimal cardinality for
$(S, \rho)$. By hypothesis, the net $\mathcal{N}_0$ consists of a single element $s_0$. For every $k$ and
for every $s \in S$, denote by $\pi_k(s)$ an element of $\mathcal{N}_k$ satisfying $\rho(s, \pi_k(s)) \le 2^{-k}$. The
chaining equation reads for every $s \in S$
$$X_s = X_{s_0} + \sum_{k \ge 0} \bigl(X_{\pi_{k+1}(s)} - X_{\pi_k(s)}\bigr). \tag{6.22}$$
It follows that
$$\sup_{s \in S} X_s \le X_{s_0} + \sum_{k \ge 0} \sup_{s \in S} \bigl(X_{\pi_{k+1}(s)} - X_{\pi_k(s)}\bigr) \le X_{s_0} + \sum_{k \ge 0} \sup_{u, u'} (X_u - X_{u'}), \tag{6.23}$$
where the last supremum is taken over couples $(u, u') \in \mathcal{N}_{k+1} \times \mathcal{N}_k$ satisfying
1ďiďN
The result follows now by majorizing the last series with an integral.
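To make the chaining construction concrete, the sketch below builds nets at the dyadic scales $2^{-k}$ (times the radius) for a finite point set in the plane and evaluates a sum of the form $\sum_k 2^{-k} \sqrt{\log \operatorname{card} \mathcal{N}_k}$, the kind of series that an entropy integral majorizes. Several simplifications are assumptions made here: the nets are produced greedily and need not have minimal cardinality, the point cloud is arbitrary, and the constants and the exact integrand of (6.20) are not reproduced.

```python
import math
import random


def greedy_net(points, eps):
    """Greedy eps-net: every point of `points` lies within eps of some net element."""
    net = []
    for p in points:
        if all(math.dist(p, q) > eps for q in net):
            net.append(p)
    return net


def dyadic_entropy_sum(points, radius, levels):
    """Sum over k < levels of (radius * 2^-k) * sqrt(log card(net at scale radius * 2^-k))."""
    total = 0.0
    for k in range(levels):
        eps = radius * 2.0 ** (-k)
        total += eps * math.sqrt(math.log(len(greedy_net(points, eps))))
    return total


rng = random.Random(0)
points = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(300)]
# Radius measured from the first point, so the coarsest greedy net is the singleton {points[0]}.
radius = max(math.dist(points[0], p) for p in points)
print(len(greedy_net(points, radius)), dyadic_entropy_sum(points, radius, levels=8))
```

The coarsest net consists of a single point, matching the role of $s_0$ in the proof, and each finer level contributes a term weighted by its scale.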
Proof of Lemma 6.16. We may assume that $\beta = 1$ by working with $Y_i/\beta$
and that the variables $Y_i$ are non-negative by working with the positive parts $Y_i^+$.
If $N \ge 2$, then $AN \ge 1$ and so
$$\begin{aligned}
\mathbf{E} \max_i Y_i &= \int_0^\infty \mathbf{P}(\max_i Y_i \ge t)\, dt \\
&\le \sqrt{\log(AN)} + AN \int_{\sqrt{\log(AN)}}^\infty \exp(-t^2)\, dt \\
&\le \sqrt{\log(AN) + 1}.
\end{aligned}$$
The first inequality is the union bound; the second one is the upper bound in
Komatu’s inequality (A.4) which can be rewritten as $\int_u^\infty e^{-t^2}\, dt \le (\sqrt{u^2 + 1} - u)e^{-u^2}$
(valid for $u \ge -0.3893$ and applied with $u = \sqrt{\log(AN)}$).
If $N = 1$, the inequality is trivial if the variable has mean 0 and can be checked
directly otherwise; see Exercise 6.17, which also treats in detail the case of small
$A$.
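The rewritten form of Komatu’s inequality and the resulting bound can be checked numerically. The sketch below uses the identity $\int_u^\infty e^{-t^2}\, dt = \frac{\sqrt{\pi}}{2} \operatorname{erfc}(u)$; the function names and the grids are choices made here, and the right-hand side is evaluated in the algebraically equivalent form $e^{-u^2}/(\sqrt{u^2 + 1} + u)$ to avoid cancellation.

```python
import math


def gaussian_integral_tail(u):
    """Integral of exp(-t^2) over [u, infinity), via the complementary error function."""
    return 0.5 * math.sqrt(math.pi) * math.erfc(u)


def komatu_upper(u):
    """(sqrt(u^2 + 1) - u) * exp(-u^2), written as exp(-u^2) / (sqrt(u^2 + 1) + u)."""
    return math.exp(-u * u) / (math.sqrt(u * u + 1.0) + u)


def union_bound_majorant(an):
    """sqrt(log(AN)) + AN * int_{sqrt(log(AN))}^inf exp(-t^2) dt, for AN >= 1."""
    u = math.sqrt(math.log(an))
    return u + an * gaussian_integral_tail(u)


grid = [k / 100.0 for k in range(0, 601)]
print(all(gaussian_integral_tail(u) <= komatu_upper(u) for u in grid))
print(all(union_bound_majorant(an) <= math.sqrt(math.log(an) + 1.0)
          for an in (1.0, 1.5, 2.0, 10.0, 1e3, 1e6)))
```

Evaluating both sides at a point such as $u = -0.5$ shows that the inequality does fail for sufficiently negative $u$, in line with the stated range of validity.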
Although Dudley’s inequality is not sharp in general (see Exercises 6.19 and
6.20, which exhibit two different reasons for a possible gap), it does become sharp
when sufficiently many symmetries are present; such a situation is referred to as the
stationary case in probability literature. Here is a statement demonstrating this
principle expressed in the language of convex sets and their Gaussian mean widths.
Proposition 6.17 (not proved here). Let $K \subset \mathbb{R}^n$ be a nonempty compact
convex set and let $F$ be the set of extreme points of $K$. If the isometry group of $K$
acts transitively on $F$, then
$$w_G(K) = w_G(F) \simeq \int_0^{\operatorname{outrad}(F)} \sqrt{1 + \log N(F, \eta)}\, d\eta.$$
In the most general situation, the chaining argument used in the proof of Propo-
Theorem 6.18 (Generic chaining, not proved here). Let $(X_t)_{t \in T}$ be a centered
subgaussian process and let $\rho$ be the distance on $T$ defined by $\rho(s, t) = \|X_s - X_t\|_{L^2}$.
Let $(T_k)_{k \in \mathbb{N}}$ be an increasing family of subsets of $T$ such that $\operatorname{card}(T_0) = 1$ and
$\operatorname{card}(T_k) \le 2^{2^k}$ for $k \ge 1$. Then
$$\mathbf{E} \sup_{t \in T} X_t \le C \sup_{s \in T} \sum_{k=0}^{\infty} 2^{k/2} \rho(s, T_k) \tag{6.26}$$
for some absolute constant $C$. Conversely, if the process $(X_t)_{t \in T}$ is Gaussian, this
$$\mathbf{E} \sup_{t \in T} X_t \ge c\, \gamma_2(T)$$
Exercise 6.16 (The constant in Dudley’s inequality). Show that the constant
6 in Dudley’s inequality (6.20) can be improved to $3 + 2\sqrt{2} \approx 5.83$ if we repeat the
proof with $\mathcal{N}_k$ being a $\theta^k$-net, and optimize over $\theta \in (0, 1)$.
Exercise 6.17. The argument in the proof of Lemma 6.16 works if $AN \ge 1$.
Show that when $AN < 1$, then the optimal majorant is $\frac{\sqrt{\pi}}{2} \beta AN$ and check that,
consequently, the bound from Lemma 6.16 holds whenever $AN \ge 0.4236$.
Exercise 6.18 (Median of the maximum of a subgaussian process). Show that
under the hypotheses of Lemma 6.16 the median of $\max_i Y_i$ is at most $\beta \sqrt{\log(2AN)}$.
Exercise 6.19 (The gap in Dudley’s inequality). Let $(Z_k)_{k=1}^n$ be an i.i.d. se-
quence of $N(0, 1)$ variables and let $X_k = Z_k/\sqrt{1 + \log k}$. Check that $\mathbf{E} \max_k X_k < 3$
for any $n \in \mathbb{N}$, but that the integral on the right-hand side of (6.20) is $\Theta(\log \log n)$.
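The boundedness of $\mathbf{E} \max_k X_k$ in Exercise 6.19 can be observed in a simulation before attempting a proof. The sketch below is only a Monte Carlo estimate for a few values of $n$; the function name, the sample sizes, and the seed are assumptions made here, and the simulation does not replace the argument asked for in the exercise.

```python
import math
import random


def mean_max_rescaled(n, trials, seed=0):
    """Monte Carlo estimate of E max_{1 <= k <= n} Z_k / sqrt(1 + log k), Z_k i.i.d. N(0, 1)."""
    rng = random.Random(seed)
    scales = [1.0 / math.sqrt(1.0 + math.log(k)) for k in range(1, n + 1)]
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) * s for s in scales)
    return total / trials


for n in (10, 100, 1000):
    print(n, round(mean_max_rescaled(n, trials=300), 3))
```

The estimates grow very slowly with $n$ and remain well below 3, whereas an entropy integral of order $\log \log n$ is unbounded.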
Exercise 6.20 (The gap in Dudley’s inequality via $B_1^n$). Let $K = B_1^n$. Show
that $\int_0^1 \sqrt{\log N(K, \eta)}\, d\eta \simeq (\log n)^{3/2}$ while $w_G(K) \sim \sqrt{2 \log n}$. Interpret this dis-
crepancy as a gap in Dudley’s inequality.
Exercise 6.21 (Law of the iterated logarithm via Dudley’s inequality). Here is
a rough version of the law of the iterated logarithm. Let $(Z_i)_{1 \le i \le n}$ be independent
$N(0, 1)$ random variables and consider the Gaussian process $X = (X_k)_{1 \le k \le n}$ defined
by $X_k = \frac{1}{\sqrt{k}} (Z_1 + \dots + Z_k)$. Estimate the covering numbers of $X$ and conclude
that $\mathbf{E} \max\{X_k : 1 \le k \le n\} = \Theta(\log \log n)$.
Exercise 6.22 (Dudley integral as a chaining bound). Prove (6.27).
chaining.
small selection of results from RMT, which will be useful to analyze random con-
matrices; this principle is known as universality. We study primarily (but not ex-
clusively) matrices with complex entries since these are the most relevant to QIT.
In contrast, much of the original motivation for RMT research came from statistics,
the setting in which the real case is more usual.
For $A \in \mathsf{M}_n^{\mathrm{sa}}$, we denote by $(\lambda_i(A))_{1 \le i \le n}$ or simply $(\lambda_i)_{1 \le i \le n}$ the eigenvalues of
(6.30) d8 pµ1 , µ2 q :“ inf }X1 ´ X2 }L8 ,
ut
with infimum over all couples pX1 , X2 q of random variables with (marginal) laws µ1
and µ2 , defined on a common probability space. Similarly, if Y1 , Y2 are real random
rib
variables, we will mean by d8 pY1 , Y2 q the 8-Wasserstein distance between the laws
of Y1 and Y2 .
ist
The definition of 8-Wasserstein distance immediately extends to probability
measures on a metric space pE, dq if we interpret in (6.30) the quantity }X1 ´X2 }L8
rd
as the smallest ∆ such that PpdpX1 , X2 q ď ∆q “ 1. Similarly, replacing the L8 -
norm by the Lp -norm leads to the p-Wasserstein distance dp , with the “finite p”
fo
case (and particularly p “ 1, 2) being much more intensively studied than p “ 8.
The metric d1 is also known, particularly in the computer science community, as
the Earth Mover’s distance.
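For two empirical measures on the real line with the same number of equally weighted atoms, the infimum in (6.30) is attained by the monotone coupling that pairs order statistics. This is a standard fact about one-dimensional optimal transport that is assumed here, not proved in this section. Under that assumption, the following sketch computes $d_\infty$ and $d_p$; the function names are illustrative only.

```python
def wasserstein_inf(xs, ys):
    """d_infinity between the empirical measures of xs and ys (equal sizes).

    Assumes the monotone (sorted) coupling is optimal, which holds on R.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return max(abs(a - b) for a, b in zip(sorted(xs), sorted(ys)))


def wasserstein_p(xs, ys, p=1):
    """d_p between the same two empirical measures, via the same coupling."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    total = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (total / len(xs)) ** (1.0 / p)


xs = [0, 1, 2]
ys = [4, 0.5, 1]
print(wasserstein_inf(xs, ys), wasserstein_p(xs, ys, 1))
```

In this example a single far-away atom dominates $d_\infty$, while $d_1$ averages the displacement over all atoms, which illustrates why $d_\infty$ controls the edges of the support.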
We note the following inequality (cf. Exercise 6.24): whenever f : R Ñ R is an
convergence
νZ , with support equal to some bounded interval ra, bs. If pYn q is a sequence of
random variables, the following are equivalent:
(1) $d_\infty(Y_n, Z) \to 0$,
(2) $Y_n \to Z$ weakly and $\sup Y_n \to b$, $\inf Y_n \to a$.
By inf and sup we really mean here essential inf and sup. Note that the hy-
pothesis on the support is vital: the equivalence fails if the support is not connected
(see Exercise 6.29).
Proof. Since dL ď d8 , convergence in 8-Wasserstein distance implies weak
convergence. Moreover we have | sup Yn ´ sup Z| ď d8 pYn , Zq and similarly for the
infima, and therefore (1) implies (2).
Conversely, assume (2). Given ε ą 0, choose a “ x0 ă x1 ă ¨ ¨ ¨ ă xr “ b such
that xj`1 ´ xj ă ε and such that, for 0 ă j ă r, xj is a continuity point of FZ
162 6. GAUSSIAN PROCESSES AND RANDOM MATRICES
(such points are dense in R). The hypothesis on the support of νZ implies that
FZ is strictly increasing on ra, bs, so that there exists α ą 0 with the property that
FZ pxj q ě FZ pxj´1 q ` α for 0 ă j ď r. For n large enough, we have inf Yn ą a ´ ε,
sup Yn ă b ` ε and |FYn pxj q ´ FZ pxj q| ă α for any 0 ă j ă r (using the fact that
FZ is continuous at xj ). These conditions imply that for any real number t,
FZ pt ´ 2εq ď FYn ptq ď FZ pt ` 2εq
and therefore d8 pYn , Zq ď 2ε.
Remark 6.21. The proof of Lemma 6.20 actually gives the following: a neighbourhood basis around $\nu_Z$ for the topology induced by $d_\infty$ is given by $(V_\varepsilon)_{\varepsilon>0}$, where $V_\varepsilon$ is the set of probability measures $\mu$ satisfying the condition
$\max\big( d_L(\mu, \nu_Z),\ |\sup\mu - \sup\nu_Z|,\ |\inf\mu - \inf\nu_Z| \big) < \varepsilon$,
where by $\inf\nu$ and $\sup\nu$ we denote the infimum and supremum of the support of a measure $\nu$.
Exercise 6.24 (∞-Wasserstein distance and Lipschitz functions). Show the stronger version of (6.31): if $f : \mathbb{R} \to \mathbb{R}$ is an L-Lipschitz function, then $|E f(X) - E f(Y)| \le L\, d_1(X, Y)$.
Exercise 6.25. Show that if $f : \mathbb{R} \to \mathbb{R}_+$ is an L-Lipschitz function and $d_\infty(X, Y) \le \varepsilon$, then $E f(Y) \ge E g(X)$, where $g = (f - L\varepsilon)_+$.
Exercise 6.26 (8-Wasserstein distance via cumulative distribution functions).
Exercise 6.28. Show that under the hypotheses of Lemma 6.20, $d_\infty(Y_n, Z) \to 0$ implies the convergence $E f(Y_n) \to E f(Z)$ for any continuous function $f : \mathbb{R} \to \mathbb{R}$ (bounded or not). Show, by example, that this may be false when Z is unbounded.
of eigenvalues (see also Exercise 6.32).
Proposition 6.22 (Ginibre formula, not proved here). Let A be a GUE(n) matrix, and $\lambda(A) = (\lambda_i)_{1\le i\le n}$ be the spectrum of A, arranged in non-increasing order. Then the density of the random vector $\lambda(A)$ is given by
$c_n\, \mathbf{1}_{\{\lambda_1 \ge \cdots \ge \lambda_n\}}\, e^{-\frac{1}{2}\sum_{i=1}^n \lambda_i^2} \prod_{1\le i<j\le n} (\lambda_i - \lambda_j)^2$,
The real-valued companion to the GUE is the Gaussian Orthogonal Ensemble or GOE, which corresponds to the standard Gaussian vector in the space of self-adjoint real matrices (up to normalization, see Section 6.2.4). The Gaussian Symplectic Ensemble (GSE) similarly corresponds to the standard Gaussian vector in the space of quaternionic Hermitian matrices.
For some arguments, it is important to introduce what we call the $\mathrm{GUE}_0(n)$ ensemble, which is the GUE ensemble conditioned to have trace zero. In other words, $G_0$ is a $\mathrm{GUE}_0(n)$ matrix if it has the distribution of a standard Gaussian vector
GUEpnq.
(6.33) $\frac{1}{2\pi}\sqrt{4 - x^2}$
with respect to the Lebesgue measure. The even moments of the semicircular distribution are the Catalan numbers: for a nonnegative integer p, we have
(6.34) $\int_{-2}^{2} x^{2p}\, d\mu_{SC}(x) = \frac{1}{p+1}\binom{2p}{p}$.
In particular the variance equals 1. If X is a random variable with distribution $\mu_{SC}$, then for any $m \in \mathbb{R}$ and $\sigma \ge 0$, we denote by $\mu_{SC(m,\sigma^2)}$ the distribution of $m + \sigma X$, called the semicircular distribution with mean m and variance $\sigma^2$.
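As a numerical sanity check of (6.34), the moments of the density (6.33) can be computed by quadrature. The sketch below substitutes $x = 2\sin\theta$, which turns $d\mu_{SC}$ into $(2/\pi)\cos^2\theta\, d\theta$ on $[-\pi/2, \pi/2]$, and applies the midpoint rule; the function names and the number of quadrature steps are arbitrary choices.

```python
import math


def semicircle_moment(k, steps=20000):
    """k-th moment of the standard semicircular law, by the midpoint rule.

    Uses x = 2 sin(theta), under which (1/(2 pi)) sqrt(4 - x^2) dx
    becomes (2/pi) cos(theta)^2 d(theta).
    """
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = -math.pi / 2 + (j + 0.5) * h
        total += (2 * math.sin(theta)) ** k * (2 / math.pi) * math.cos(theta) ** 2
    return total * h


def catalan(p):
    """p-th Catalan number, the right-hand side of (6.34)."""
    return math.comb(2 * p, p) // (p + 1)


for p in range(6):
    print(p, round(semicircle_moment(2 * p), 6), catalan(p))
```

The case p = 1 recovers the statement that the variance equals 1, and odd moments vanish by symmetry.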
Figure 6.1. The empirical eigenvalue distribution of a GUE(n) matrix $A_n$ for n = 10000 approaches the semicircular distribution. (Horizontal axis: eigenvalues of $A_n/\sqrt{n}$, from −2 to 2.)
The semicircular distribution appears as the limit spectral distribution of GUE
random matrices (see Figure 6.1).
Theorem 6.23 (Convergence of GUE spectrum towards the semicircular distribution, not proved here). For each n, let $A_n$ be a GUE(n) or $\mathrm{GUE}_0(n)$ matrix. After normalization, the sequence of empirical spectral distributions $(\mu_{sp}(A_n))$ converges towards the semicircular distribution (with respect to the ∞-Wasserstein distance) in the following sense: for any $\varepsilon > 0$,
$\lim_{n\to\infty} P\big( d_\infty(\mu_{sp}(n^{-1/2}A_n), \mu_{SC}) > \varepsilon \big) = 0$.
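The following is a minimal simulation sketch, not a proof, of what the theorem predicts at the level of moments: for a sampled GUE(n) matrix, the normalized traces $\frac1n \mathrm{Tr}(n^{-1/2}A_n)^2$ and $\frac1n \mathrm{Tr}(n^{-1/2}A_n)^4$ should be close to the Catalan numbers 1 and 2 from (6.34). It assumes the normalization in which GUE(n) is the standard Gaussian vector in $M_n^{\mathrm{sa}}$ (real $N(0,1)$ diagonal entries, off-diagonal entries with $E|a_{ij}|^2 = 1$); the helper names, the size n = 60, and the seed are arbitrary.

```python
import math
import random


def sample_gue(n, rng):
    """Sample a GUE(n) matrix as a list of rows of complex numbers."""
    a = [[0j] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = complex(rng.gauss(0, 1), 0)
        for j in range(i + 1, n):
            z = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
            a[i][j] = z
            a[j][i] = z.conjugate()
    return a


def normalized_trace_moments(a):
    """Return ((1/n) Tr M^2, (1/n) Tr M^4) for M = A / sqrt(n), A self-adjoint."""
    n = len(a)
    # Tr A^2 is the squared Hilbert-Schmidt norm of A.
    m2 = sum(abs(a[i][j]) ** 2 for i in range(n) for j in range(n)) / n ** 2
    # Tr A^4 is the squared Hilbert-Schmidt norm of A^2.
    sq = [[sum(a[i][k] * a[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    m4 = sum(abs(sq[i][j]) ** 2 for i in range(n) for j in range(n)) / n ** 3
    return m2, m4


rng = random.Random(0)
m2, m4 = normalized_trace_moments(sample_gue(60, rng))
print(m2, m4)  # expected to be near 1 and 2 respectively
```

Matching low moments is weaker than convergence in ∞-Wasserstein distance, which also requires control of the extreme eigenvalues.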
Using Lemma 6.20 (see also Remark 6.21), one checks that Theorem 6.23 brings
together two facts, usually presented (and proved) independently in the RMT lit-
erature:
(1) The fact that the sequence pµsp pn´1{2 An qq of random empirical measures
converges (weakly, in probability) towards the semicircle law, a result
n´1{2 An towards ˘2. This requires a different and finer analysis, which
we sketch in what follows.
Since GUE(n) is the standard Gaussian vector in $M_n^{\mathrm{sa}}$, and by the duality between Schatten norms (see Proposition 1.17), the quantity $E\|A_n\|$ is exactly the Gaussian mean width of $S_1^{n,\mathrm{sa}}$, the self-adjoint part of the unit ball for the trace norm. Although the order of magnitude of $E\|A_n\|$ can be readily deduced from general principles (see Exercise 6.33), the derivation of the precise constant 2 requires more specialized arguments. However, once an appropriate bound such as (6.37) below is established, concentration of $\|A_n\|$ around its expectation is provided by Theorem 5.24 and gives the following estimates.
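Proposition 6.24. Let $A_n$ be a GUE(n) or $\mathrm{GUE}_0(n)$ matrix. Then, for any $\varepsilon > 0$,
(6.35) $P\big( \lambda_1(n^{-1/2}A_n) \ge 2 + \varepsilon \big) \le P\big( \|n^{-1/2}A_n\|_\infty \ge 2 + \varepsilon \big) \le \frac{1}{2}\exp\big( -\frac{n\varepsilon^2}{2} \big)$.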
Proof. Since } ¨ }8 ď } ¨ }HS , the function } ¨ }8 is a 1-Lipschitz function. By
Theorem 5.24 (recall that GUEpnq is the standard Gaussian vector in the space Msa n,
6.2. RANDOM MATRICES 165
and similarly for $\mathrm{GUE}_0(n)$ and the hyperplane of trace zero matrices), it follows that
(6.36) $P\big( \|A_n\|_\infty \ge M + t \big) \le \frac{1}{2}\exp(-t^2/2)$,
where M is the median of the random variable $\|A_n\|_\infty$. We claim that $M < 2\sqrt{n}$. This follows from two facts. First, we have the inequality
(6.37) $E\|A_n\|_\infty < 2\sqrt{n} - 0.6\, n^{-1/6} < 2\sqrt{n}$,
which was derived in Appendix F in [Sza05] (note that this inequality extends to the case of $\mathrm{GUE}_0(n)$ via Jensen’s inequality). Second, it follows from Proposition 5.34 that the median of the random variable $\|A_n\|_\infty$ is smaller than its mean. Once we know that $M \le 2\sqrt{n}$, (6.35) follows by setting $t = \varepsilon\sqrt{n}$ and appealing to (6.36).
An alternative proof is to use (6.37) directly in combination with Theorem 5.25, but we opted for the argument above since, in our approach, concentration around the median is more elementary than that around the mean.
Similar estimates also hold for the GOE. For example, if $A_n$ is a GOE(n) matrix, we have $E\lambda_1(A_n) \le 2\sqrt{n}$ (see Exercise 6.48) and therefore
$P\big( \lambda_1(n^{-1/2}A_n) \ge 2 + \varepsilon \big) \le \frac{1}{2}\exp\big( -\frac{n\varepsilon^2}{2} \big)$.
We next note that if $A \in M_n^{\mathrm{sa}}$, then $\|A\|_\infty = \max\{\lambda_1(A), -\lambda_n(A)\}$, and that, by symmetry of GOE(n), the distribution of $-\lambda_n(A_n)$ is the same as that of $\lambda_1(A_n)$. Combining these observations with the bound above yields
(6.38) $P\big( \|n^{-1/2}A_n\|_\infty \ge 2 + \varepsilon \big) \le \exp\big( -\frac{n\varepsilon^2}{2} \big)$.
The bound from Proposition 6.24 can be improved for small values of $\varepsilon$ (the Tracy–Widom effect).
Proposition 6.25 (not proved here). Let $A_n$ be a GUE(n) or a GOE(n) matrix. Then for any $\varepsilon \in (0, 1)$,
$P\big( \lambda_1(n^{-1/2}A_n) \ge 2 + \varepsilon \big) \le C\exp(-cn\varepsilon^{3/2})$
and
$P\big( \lambda_1(n^{-1/2}A_n) \le 2 - \varepsilon \big) \le C\exp(-cn^2\varepsilon^3)$,
statement. One can ask for a more quantitative version, or for a fixed–dimension
bound.
Problem 6.26. If $A_n$ is a GUE(n), a $\mathrm{GUE}_0(n)$, or a GOE(n) matrix, what is the rate of convergence in $d_\infty(\mu_{sp}(n^{-1/2}A_n), \mu_{SC}) \to 0$? Proposition 6.25 suggests that the answer may be $\Theta(n^{-2/3})$. The convergence cannot be faster than $n^{-2/3}$ due to the Tracy–Widom effect; see Notes and Remarks. The same question can be asked about the Wishart matrices considered in the next section.
Exercise 6.33 (An elementary proof of boundedness of GUE(n)). Using a net argument, show that if $A_n$ is a GUE(n) matrix, then $\|A_n\|_\infty \le C\sqrt{n}$ with large probability, where C > 2 is some universal constant.
Exercise 6.34. Show that the GUE(n) version of Theorem 6.23 implies the $\mathrm{GUE}_0(n)$ version.
6.2.3. Wishart matrices.
6.2.3.1. Definition of the Wishart ensemble. Let n, s be nonzero integers. Let $B \in M_{n,s}$ be a random matrix with independent $N_{\mathbb{C}}(0, 1)$ entries. The random matrix $W = BB^\dagger \in M_n^{\mathrm{sa}}$ is called a (complex) Wishart matrix and its distribution is denoted by Wishart(n, s). We often say simply that W is a Wishart(n, s) matrix. The eigenvalues of W are the squares of the singular values of B, so that statements about the spectrum of Wishart matrices are equivalent to statements about singular values of a random (rectangular) Gaussian matrix.
Here is an equivalent description: let $(G_1, \dots, G_s)$ be s independent copies of a standard complex Gaussian vector in the space $\mathbb{C}^n$. Then the matrix
(6.39) $W = \sum_{i=1}^{s} |G_i\rangle\langle G_i|$
has distribution Wishart(n, s).
The rank of a Wishart(n, s) matrix is almost surely equal to min(n, s). In the following we often assume that $s \ge n$, i.e., that the Wishart matrices are almost surely positive definite. This is not really a restriction since the case s < n can be covered by the following observation: if $B \in M_{n,s}$ is a random matrix with independent $N_{\mathbb{C}}(0, 1)$ entries, then $W_1 = BB^\dagger$ is a Wishart(n, s) matrix while $W_2 = B^\dagger B$ is a Wishart(s, n) matrix (because the $N_{\mathbb{C}}(0, 1)$ distribution is invariant under complex conjugation), and the matrices $W_1$ and $W_2$ share the same non-zero eigenvalues.
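The two descriptions of the Wishart ensemble and the relation between $W_1$ and $W_2$ can be checked on a sample. The sketch below, with hypothetical helper names, builds B with $N_{\mathbb{C}}(0,1)$ entries (assumed normalized so that $E|b_{ij}|^2 = 1$), verifies that $BB^\dagger$ equals the sum of outer products of the columns of B as in (6.39), and compares $\mathrm{Tr}\, W_1^k$ with $\mathrm{Tr}\, W_2^k$, which must agree because the non-zero eigenvalues coincide.

```python
import math
import random


def complex_gaussian_matrix(n, s, rng):
    """n x s matrix with independent N_C(0,1) entries (E|z|^2 = 1 assumed)."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
             for _ in range(s)] for _ in range(n)]


def dagger(m):
    """Conjugate transpose."""
    return [[m[i][j].conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace_of_power(m, k):
    """Tr(m^k) for a square matrix m and an integer k >= 1."""
    p = m
    for _ in range(k - 1):
        p = matmul(p, m)
    return sum(p[i][i] for i in range(len(m)))


rng = random.Random(3)
n, s = 2, 4
b = complex_gaussian_matrix(n, s, rng)
w1 = matmul(b, dagger(b))   # n x n, Wishart(n, s)
w2 = matmul(dagger(b), b)   # s x s, Wishart(s, n), rank at most n

# (6.39): sum over the s columns G_i of B of the outer products |G_i><G_i|.
w_cols = [[sum(b[j][i] * b[k][i].conjugate() for i in range(s))
           for k in range(n)] for j in range(n)]
```

Equality of all power traces is equivalent to equality of the non-zero spectra, so this avoids the need for an eigenvalue solver.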
One can also consider the real version of Wishart matrices by starting with
encountered in statistics.
6.2.3.2. Limit theorems. What does the spectrum of large Wishart matrices
look like? Before answering this question, it might be useful to have in mind the
following elementary result from probability theory, which can be considered as the
s P N and p P p0, 1q (this means that X has the same distribution as the sum of s
independent Bernoulli random variables taking values 1 with probability p and 0
(i) If $\alpha = \lim sp$ exists in $(0, \infty)$, then X converges (weakly) towards a Poisson distribution of parameter $\alpha$.
(ii) If $\lim sp = \infty$, then $(X - sp)/\sqrt{sp}$ converges (weakly) towards a standard Gaussian distribution.
In the non-commutative context, we replace independent Bernoulli variables by
free Bernoulli variables. The resulting limit laws are the so-called free Poisson dis-
tribution and, again, the semi-circular distribution given by (6.33). Free probability
theory is beyond the scope of this book (see Section 6.2.5 for a brief introduction)
and so, rather than defining freeness, we will explain the heuristics relating it to
RMT.
i“1
where the vectors $\psi_i$ are i.i.d. and uniformly distributed on the sphere in $\mathbb{C}^n$ and $n, s \to \infty$. Since, for large n, the standard Gaussian vector on $\mathbb{C}^n$ is close to being uniformly distributed on the sphere of radius $n^{1/2}$ (see Corollary 5.27), it follows that X is close to the appropriately rescaled Wishart random matrix given by (6.39) (see Exercise 6.37). Consequently, the limiting behavior that is the non-commutative analogue of Fact 6.27 can be retrieved from the results on spectral properties of Wishart(n, s) as $n, s \to \infty$. Such results have been known for quite a while, even if the full extent of the analogy and the identification of the limit laws as the free analogues of the Poisson and normal distributions had to await the development of the language of free probability.
To make the limit results for Wishart matrices more tangible, we need to describe explicitly what the free Poisson distributions are. They originally appeared in RMT as Marčenko–Pastur distributions. First, for $\lambda > 0$, we let $x_\pm = (1 \pm \sqrt{\lambda})^2$ and define a function supported on $[x_-, x_+]$ by
$f_\lambda(x) = \frac{\sqrt{(x - x_-)(x_+ - x)}}{2\pi x}\, \mathbf{1}_{[x_-, x_+]}(x)$.
where δ0 denotes a Dirac mass at 0 and f dx is the measure whose density (with
respect to the Lebesgue measure) is f .
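The density $f_\lambda$ can be explored numerically. The sketch below integrates $x^k f_\lambda(x)$ after the substitution $x = x_- + 4\sqrt{\lambda}\,\sin^2\theta$, which removes the square-root singularities at the endpoints. It only handles the absolutely continuous part $f_\lambda\,dx$; the claim in the comments that this part has total mass $\min(1, \lambda)$ is a standard property of the Marčenko–Pastur law stated here as an assumption, and the function name is illustrative.

```python
import math


def mp_moment(lam, k, steps=20000):
    """Integral of x^k f_lam(x) dx over [x_-, x_+], by the midpoint rule.

    With x = x_- + w sin(theta)^2 and w = x_+ - x_- = 4 sqrt(lam), one gets
    f_lam(x) dx = w^2 sin^2(theta) cos^2(theta) / (pi x) d(theta).
    """
    x_minus = (1 - math.sqrt(lam)) ** 2
    width = 4 * math.sqrt(lam)
    h = (math.pi / 2) / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h
        s2 = math.sin(theta) ** 2
        x = x_minus + width * s2
        total += x ** k * width ** 2 * s2 * math.cos(theta) ** 2 / (math.pi * x)
    return total * h


for lam in (1.0, 2.0):
    mass = mp_moment(lam, 0)
    mean = mp_moment(lam, 1)
    variance = mp_moment(lam, 2) - mean ** 2
    print(lam, round(mass, 6), round(mean, 6), round(variance, 6))

# For lam < 1 the continuous part carries mass lam (assumed standard fact);
# the remaining mass sits in an atom at 0.
print(round(mp_moment(0.5, 0), 6))
```

For λ = 1 and λ = 2 the output is consistent with a probability density whose mean and variance both equal λ, as asserted in Exercise 6.35.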
[Figure: graphs of the Marčenko–Pastur density $f_\lambda(x)$ for λ = 1 (supported on [0, 4]) and for λ = 2 (supported on $[x_-, x_+]$).]
Theorem 6.28 (not proved here). Consider a sequence of indices (n, s) which tend to infinity in such a way that $\lambda = \lim s/n \in [1, \infty)$ exists. For each (n, s), let $W_{n,s}$ be a Wishart(n, s) matrix. After renormalization, the sequence of random empirical spectral distributions $(\mu_{sp}(W_{n,s}))$ converges in probability towards
As explained earlier, a similar result follows formally in the case $\lambda \in (0, 1)$. However, some care is needed in the formulation, since the atomic part in the Marčenko–Pastur distribution is supported outside of the continuous part, and this lack of connectedness may prevent convergence with respect to the ∞-Wasserstein distance (cf. Lemma 6.20 and Exercises 6.29 and 6.36).
In the case where the ratio s/n tends to infinity, the limiting Marčenko–Pastur distribution degenerates into a semicircular distribution, in the same way that a Poisson distribution with a large parameter is almost Gaussian.
Theorem 6.29 (not proved here). Consider a sequence of indices (n, s) which both tend to infinity in such a way that $\lim s/n = \infty$. For each (n, s), let $W_{n,s}$ be a Wishart(n, s) matrix. After renormalization and recentering, the sequence of empirical spectral distributions $(\mu_{sp}(W_{n,s}))$ converges in probability towards the semicircular distribution $\mu_{SC}$ with respect to the ∞-Wasserstein distance, in the following sense: for any $\varepsilon > 0$,
Our last limit theorem deals with partial transposition of Wishart matrices. As we shall see, the partial transposition dramatically changes the limit behavior. Note that the distributions MP(λ) and SC(λ, λ) which appear in Theorems 6.28 and 6.30 have the same mean and the same variance (see Exercise 6.35). This was to be expected since the partial transposition preserves both the trace and the Hilbert–Schmidt norm.
Theorem 6.30 (not proved here). Consider a sequence of indices (d, s) which tend to infinity in such a way that $\lambda = \lim s/d^2 \in (0, \infty)$ exists. For each d, s, let $W_{d^2,s}$ be a Wishart($d^2$, s) random matrix (considered as an operator on $\mathbb{C}^d \otimes \mathbb{C}^d$) and $W^\Gamma_{d^2,s}$ its partial transpose. Then, for any $\varepsilon > 0$,
$\lim_{(d,s)\to\infty} P\big( d_\infty(\mu_{sp}(d^{-2} W^\Gamma_{d^2,s}), \mu_{SC(\lambda,\lambda)}) > \varepsilon \big) = 0$,
where $\mu_{SC(\lambda,\lambda)}$ denotes the semicircular distribution with mean λ and variance λ.
Exercise 6.35. Verify that (6.41) does indeed define a probability distribution both for $\lambda \ge 1$ and for $0 < \lambda < 1$, and that the expected value and the variance of the corresponding random variable are both equal to λ.
Exercise 6.36. Check that $f_\lambda(\lambda x) = f_{1/\lambda}(x)$. Use this to deduce from Theorem 6.28 that the weak convergence of $\mu_{sp}(\frac{1}{n} W_{n,s})$ towards $\mu_{MP(\lambda)}$ holds for any $\lambda > 0$.
Exercise 6.37 (Spherical variant of Wishart ensemble). Deduce from Theorem 6.28 the following variant: if $X_{n,s}$ is defined as in (6.40) and n, s tend to infinity with $\lim s/n = \lambda$, then $\mu_{sp}(X_{n,s})$ converges towards MP(λ) (in probability, in ∞-Wasserstein distance).
Exercise 6.38 (The quartercircular distribution). Check that if X has a standard semicircular distribution, then $X^2$ has a MP(1) distribution. In what sense can we say that the singular value distribution of a large random (non-Hermitian) square matrix B with independent $N_{\mathbb{C}}(0, 1)$ entries is given by a quartercircular distribution?
Exercise 6.39 (Free Poisson variables in the large λ limit). (a) Show that if $X_\lambda$ has a MP(λ) distribution, then $(X_\lambda - \lambda)/\sqrt{\lambda}$ converges to the standard semicircular distribution with respect to the ∞-Wasserstein distance as $\lambda \to \infty$.
(b) Find a gap in the following argument, which purports to show that part (a) in combination with Theorem 6.28 implies Theorem 6.29.
By Theorem 6.28, the empirical spectral distribution of $W_{n,s}/n$ is approximately $X_\lambda$ (in the sense of the ∞-Wasserstein distance) if $s/n \approx \lambda$ and n, s are large. Consequently $(X_\lambda - \lambda)/\sqrt{\lambda} \approx (X_\lambda - s/n)/\sqrt{s/n}$ is approximately the empirical spectral distribution of $(W_{n,s}/n - s/n)/\sqrt{s/n} = (W_{n,s} - s)/\sqrt{sn}$, which is exactly
expect that the spectrum of a typical Wishart(n, s) matrix lies close to the interval $[(\sqrt{s} - \sqrt{n})^2, (\sqrt{s} + \sqrt{n})^2]$ (for $s \ge n$), or equivalently that all singular values of an n × s matrix with i.i.d. $N_{\mathbb{C}}(0, 1)$ entries lie close to the interval $[\sqrt{s} - \sqrt{n}, \sqrt{s} + \sqrt{n}]$. A first result in this direction is a precise bound (without any multiplicative constants or error terms) for the expected largest singular value, i.e., the operator norm.
Proposition 6.31. Let B be an n × s random matrix with independent $N_{\mathbb{C}}(0, 1)$ entries. Then
$E\|B\|_{op} \le \sqrt{n} + \sqrt{s}$.
Proposition 6.31 will be deduced from its analogue for real Wishart matrices,
which requires methods specific to that setting. Accordingly, we postpone its proof
until Section 6.2.4.2.
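A small Monte Carlo experiment can illustrate, though of course not prove, the bound of Proposition 6.31. The sketch below estimates $\|B\|_{op}$ by power iteration on $B^\dagger B$; since the returned value is $|Bv|$ for a unit vector v, it never exceeds the true operator norm. The dimensions, the number of trials, the iteration count, and the helper names are all arbitrary illustrative choices, and $N_{\mathbb{C}}(0,1)$ is assumed normalized so that $E|b_{ij}|^2 = 1$.

```python
import math
import random


def complex_gaussian_matrix(n, s, rng):
    """n x s matrix with independent N_C(0,1) entries (E|z|^2 = 1 assumed)."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
             for _ in range(s)] for _ in range(n)]


def operator_norm(b, iters=60):
    """Approximate the largest singular value of b by power iteration.

    Returns |b v| for a unit vector v, hence a lower estimate of the norm.
    """
    n, s = len(b), len(b[0])
    v = [complex(1 / math.sqrt(s), 0)] * s
    for _ in range(iters):
        u = [sum(b[i][j] * v[j] for j in range(s)) for i in range(n)]
        w = [sum(b[i][j].conjugate() * u[i] for i in range(n))
             for j in range(s)]
        norm = math.sqrt(sum(abs(z) ** 2 for z in w))
        v = [z / norm for z in w]
    u = [sum(b[i][j] * v[j] for j in range(s)) for i in range(n)]
    return math.sqrt(sum(abs(z) ** 2 for z in u))


rng = random.Random(4)
n, s, trials = 3, 5, 100
mean_norm = sum(operator_norm(complex_gaussian_matrix(n, s, rng))
                for _ in range(trials)) / trials
bound = math.sqrt(n) + math.sqrt(s)
print(mean_norm, bound)
```

For such small dimensions the empirical mean sits well below $\sqrt{n} + \sqrt{s}$; the bound becomes asymptotically tight only as n and s grow.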
answer to which is known to be affirmative in the real case (see Corollary 6.38).
Recall that sn pBq denotes the smallest singular value of B.
Problem 6.32. Let $s \ge n$, and let B be an n × s random matrix with independent $N_{\mathbb{C}}(0, 1)$ entries. Do we have the inequality $E\, s_n(B) \ge \sqrt{s} - \sqrt{n}$?
We now state a concentration result for the spectrum of Wishart matrices.
Proposition 6.33. Let B be a random n × s matrix with independent $N_{\mathbb{C}}(0, 1)$ entries. For every t > 0,
(6.42) $P\big( \|B\|_{op} \ge \sqrt{n} + \sqrt{s} + t \big) \le \frac{1}{2}\exp(-t^2)$.
If s > n, then for every $t > 4\sqrt{2\log n}/(\sqrt{s/n} - 1)$,
(6.43) $P\big( s_n(B) \le \sqrt{s} - \sqrt{n} - t \big) \le \exp(-t^2/4)$,
where C and c denote absolute constants.
The above result is closely related to Proposition 6.24 and shares many of the ramifications of the latter. For example, while we know from the general theory of Gaussian concentration that the quantities in question are concentrated around some value, identifying that value requires a separate argument and may be hard. In particular, a positive answer to Problem 6.32 would imply the validity of (6.43) for all $t \ge 0$ and with the bound $\exp(-t^2)$.
Proof. The functions $\|\cdot\|_{op}$ and $s_n$ are 1-Lipschitz with respect to the Hilbert–Schmidt norm on $M_{n,s}$. Let M be the median of $\|B\|_{op}$. By combining Propositions 6.31 and 5.34, it follows that $M \le \sqrt{n} + \sqrt{s}$, and we deduce (6.42) by using the values from Table 5.2.
Let $M'$ be the median of $s_n(B)$. We claim that
(6.44) $M' \ge \sqrt{s} - \sqrt{n} - \frac{2\sqrt{s+n}\,\sqrt{\log 2n}}{\sqrt{s} - \sqrt{n}}$.
As before, using the values from Table 5.2, we get for $t > 4\sqrt{2\log n}/(\sqrt{s/n} - 1)$
$P\big( s_n(B) \le \sqrt{s} - \sqrt{n} - t \big) \le P\big( s_n(B) \le M' - t/2 \big) \le \frac{1}{2}\exp(-t^2/4)$.
We may obtain (6.44) as a consequence of the following inequality valid for any t > 0
(6.45) $\frac{1}{2}\exp(-tM'^2) \le E\, \mathrm{Tr} \exp(-tBB^\dagger) \le n \exp\big( -(\sqrt{s} - \sqrt{n})^2 t + (s+n)t^2 \big)$.
(The second inequality in (6.45) is not at all immediate to prove; it appears as
theory since they lead to a very natural model of random quantum states. One possible way to generate a random state on $\mathbb{C}^n$ is to take independent unit vectors $(\psi_i)_{1\le i\le s}$ distributed uniformly on the sphere and to consider the average of
$\rho = \frac{1}{s}\sum_{i=1}^{s} |\psi_i\rangle\langle\psi_i|$.
Proof. The Proposition follows from the combination of two facts. First, if G is a standard Gaussian vector in any given Euclidean or Hilbert space V (in our case $V = \mathbb{C}^n \otimes \mathbb{C}^s$), then the vector $G/|G|$ is uniformly distributed on the unit sphere of V and is independent of |G|. Second, when we identify a tensor $\psi \in \mathbb{C}^n \otimes \mathbb{C}^s$ with a matrix $A \in M_{n,s}$, we have (see Section 0.8)
$\mathrm{Tr}_{\mathbb{C}^s} |\psi\rangle\langle\psi| = AA^\dagger$.
The normalization factor Tr W is very strongly concentrated around the value ns (see Exercise 6.40). Therefore, it can be virtually treated as a constant when translating the results for Wishart matrices into the language of induced states. We have the following (recall that $\mu_{sp}(A)$ is the empirical spectral distribution of a self-adjoint matrix A, see (6.29)).
Theorem 6.35. Given integers n, s, let ρn,s be a random induced state with
distribution µn,s .
a
(i) If n is fixed and s tends to infinity, then npn ´ 1qs pρn,s ´ nI q converges in
distribution towards a GUE0 pnq matrix.
(ii) If n tends to infinity and $\lim s/n = \lambda \in (0, \infty)$, then $\mu_{sp}(s\rho_{n,s})$ converges weakly in probability towards $\mu_{MP(\lambda)}$. Moreover, if $\lambda \ge 1$, then the convergence also holds in ∞-Wasserstein distance.
(iii) If both n and s/n tend to infinity, then $\mu_{sp}(\sqrt{ns}\,(\rho_{n,s} - I/n))$ converges in probability in ∞-Wasserstein distance towards $\mu_{SC}$.
Recall that the empirical spectral distribution of a rescaled $\mathrm{GUE}_0$ matrix is almost semicircular (see Theorem 6.23), so that (i) and (iii) are indeed consistent. To deduce (ii) from Theorem 6.28 and (iii) from Theorem 6.29, use Proposition 6.34 and the bounds from Exercise 6.40. The statement (i) is more elementary (see Exercise 6.41).
tively, a weaker statement follows from an elementary net argument (see Exercise
6.43).
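Proposition 6.36. For $n \le s$, let ψ be a random vector uniformly distributed on the unit sphere of $\mathbb{C}^n \otimes \mathbb{C}^s$ and let $\lambda_1(\psi) \ge \cdots \ge \lambda_n(\psi)$ be its Schmidt coefficients. Then, for any $\varepsilon > 0$,
(6.46) $P\Big( \lambda_1(\psi) \ge \frac{1}{\sqrt{n}} + \frac{1+\varepsilon}{\sqrt{s}} \Big) \le \exp(-n\varepsilon^2)$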
? ?
and, for any ε ě C s log n{p ns ´ nq,
$P\Big( \lambda_n(\psi) \le \frac{1}{\sqrt{n}} - \frac{1+\varepsilon}{\sqrt{s}} \Big) \le \exp(-cn\varepsilon^2)$,
where c and C are absolute constants.
Proposition 6.36 can be deduced from Proposition 6.33 or proved in the same way using concentration of measure on the sphere (cf. Exercise 6.42). We also note that Proposition 6.36 can be equivalently restated using matrices instead of tensors: if $M \in M_{n,s}$ is uniformly distributed on the Hilbert–Schmidt sphere, then with large probability all its singular values belong to the interval $\big[ \frac{1}{\sqrt{n}} - \frac{1+\varepsilon}{\sqrt{s}},\ \frac{1}{\sqrt{n}} + \frac{1+\varepsilon}{\sqrt{s}} \big]$.
When $s \ge n$, the probability measure $\mu_{n,s}$ has a density with respect to the Lebesgue measure on $D(\mathbb{C}^n)$ which has a simple form
(6.47) $\frac{d\mu_{n,s}}{d\,\mathrm{vol}}(\rho) = \frac{1}{Z_{n,s}} (\det\rho)^{s-n}$,
where $Z_{n,s}$ is a normalization factor. Note that formula (6.47) allows one to define the measure $\mu_{n,s}$ (in particular) for every real $s \ge n$, while the partial trace construction makes sense only for integer values of s. The explicit formula (6.47) will not be used in this book.
In the important special case where s = n, the density of the measure $\mu_{n,n}$ is constant: a random state distributed according to $\mu_{n,n}$ is distributed with respect to the uniform (Lebesgue) measure on $D(\mathbb{C}^n)$. This can be seen as a non-commutative version of the following classical fact: if $\psi = (\psi_1, \dots, \psi_n)$ is uniformly distributed on the unit sphere in $\mathbb{C}^n$, the vector $(|\psi_1|^2, \dots, |\psi_n|^2)$ is uniformly distributed on the (n − 1)-dimensional simplex.
Exercise 6.40 (Trace of a Wishart matrix). Let W be a Wishart(n, s) matrix. Check that 2 Tr W has distribution $\chi^2(2ns)$ and deduce from Exercise 5.37 that for any t > 0
$P\big( |\mathrm{Tr}\, W - ns| > tns \big) \le 2\exp\Big( -\frac{nst^2}{2 + 4t/3} \Big)$.
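The concentration of Tr W around ns can be observed empirically. The sketch below is a simulation, not a solution of the exercise: it samples $\mathrm{Tr}\, W = \sum_{i,j} |b_{ij}|^2$ directly, assuming $N_{\mathbb{C}}(0,1)$ entries with $E|b_{ij}|^2 = 1$, and compares the empirical tail frequency with the bound displayed above. The parameter values and the names are illustrative.

```python
import math
import random


def wishart_trace(n, s, rng):
    """Sample Tr(B B^dagger) = sum of |b_ij|^2 over the n*s entries of B.

    Each |b_ij|^2 equals (x^2 + y^2)/2 with x, y independent N(0,1).
    """
    return sum((rng.gauss(0, 1) ** 2 + rng.gauss(0, 1) ** 2) / 2
               for _ in range(n * s))


def tail_bound(n, s, t):
    """Right-hand side of the inequality in Exercise 6.40."""
    return 2 * math.exp(-n * s * t * t / (2 + 4 * t / 3))


rng = random.Random(1)
n, s, t = 3, 4, 0.5
samples = [wishart_trace(n, s, rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
freq = sum(abs(x - n * s) > t * n * s for x in samples) / len(samples)
print(mean, freq, tail_bound(n, s, t))
```

Even for ns = 12 the empirical tail frequency is far below the bound, and the bound itself decays exponentially in ns for fixed t.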
Exercise 6.41. Use the multivariate central limit theorem to prove part (i)
from Theorem 6.35.
κ2ns
Let ρ be a random induced state with distribution $\mu_{n,s}$, i.e., $\rho = \mathrm{Tr}_{\mathbb{C}^s} |\psi\rangle\langle\psi|$ with ψ uniformly distributed on $S_{\mathbb{C}^n \otimes \mathbb{C}^s}$.
(i) For any $y \in S_{\mathbb{C}^n}$, show that the function f defined on $S_{\mathbb{C}^n \otimes \mathbb{C}^s}$ by $f(\psi) = \sqrt{\langle y|\rho|y\rangle}$ is 1-Lipschitz and that $E f^2 = 1/n$. Conclude from Exercise 5.46 that, for any t > 0, $P(|f - 1/\sqrt{n}| > t) \le (1 + e)\exp(-nst^2)$.
(ii) Let N be a δ-net in $S_{\mathbb{C}^n}$ for δ < 1/2. Denote $\Delta = \rho - I/n$ and show that
$\|\Delta\|_\infty \le \frac{1}{1 - 2\delta} \sup_{y \in N} |\langle y|\Delta|y\rangle|$.
(iii) Let $s \ge n$. Conclude that $\|\Delta\|_\infty \le C/\sqrt{ns}$ with high probability for some constant C.
Exercise 6.44 (The limit distribution of the partial transpose). Let ν be the law of XY, where X and Y are independent random variables following the standard semicircular distribution. Let $\psi \in S_{\mathbb{C}^d \otimes \mathbb{C}^d}$ be a uniformly distributed random vector, and $A = d\,|\psi\rangle\langle\psi|^\Gamma$. (The partial transposition Γ was defined in Section 2.2.6.) Show that, when d tends to infinity, $\mu_{sp}(A)$ converges in probability, in ∞-Wasserstein distance, towards ν.
Exercise 6.45 (Low moments of Wishart matrices and expected purity of random induced states).
(i) Let G be an n × s random matrix with independent $N_{\mathbb{C}}(0, 1)$ entries. Show that $E\, \mathrm{Tr}(GG^\dagger GG^\dagger) = n^2 s + s^2 n$ and that $E(\mathrm{Tr}\, GG^\dagger)^2 = ns(ns + 1)$.
(ii) Let ρ be a random induced state with distribution $\mu_{n,s}$. Show that $E\, \mathrm{Tr}\, \rho^2 = \frac{n+s}{ns+1}$.
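The formula for the expected purity can be checked by simulation before attempting a proof. The sketch below relies on the representation $\rho = W/\mathrm{Tr}\, W$ with $W = GG^\dagger$ (the content of Proposition 6.34 as used earlier in this section); the overall scale of the Gaussian entries cancels in this ratio, so no normalization is applied. The sample size, the seed, and the function names are arbitrary.

```python
import random


def random_purity(n, s, rng):
    """Tr(rho^2) for rho = W / Tr(W), W = G G^dagger, G an n x s Gaussian matrix."""
    g = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(s)]
         for _ in range(n)]
    w = [[sum(g[i][k] * g[j][k].conjugate() for k in range(s))
          for j in range(n)] for i in range(n)]
    tr = sum(w[i][i].real for i in range(n))
    # W is self-adjoint, so Tr(W^2) is the sum of |w_ij|^2.
    tr_sq = sum(abs(w[i][j]) ** 2 for i in range(n) for j in range(n))
    return tr_sq / tr ** 2


def expected_purity(n, s):
    """Closed form from Exercise 6.45 (ii)."""
    return (n + s) / (n * s + 1)


rng = random.Random(2)
n, s, trials = 2, 3, 4000
estimate = sum(random_purity(n, s, rng) for _ in range(trials)) / trials
print(estimate, expected_purity(n, s))
```

The closed form is symmetric in n and s, as it must be, since the two reduced states of a bipartite pure state have the same purity.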
arguments harder. However, the formulas in question play almost no role in our
approach. On the other hand, some other tools—most notably the analysis via
Gaussian processes—are more adapted to the real setting.
The Gaussian Orthogonal Ensemble (GOE) is the real version of the GUE. A random matrix A has the GOE(n) distribution if the random variables $(a_{ij})_{1\le i\le j\le n}$ are independent, with $a_{ii}$ having the N(0, 2) distribution and $a_{ij}$ (for $i \ne j$) having the N(0, 1) distribution. This normalization is chosen so that the distribution is invariant under conjugacy by an orthogonal matrix. Note also that $A/\sqrt{2}$ is a standard Gaussian vector in the space $M_n^{\mathrm{sa}}$.
Real Wishart matrices are then defined exactly as their complex analogues: if B is an n × s random matrix with independent N(0, 1) entries, the distribution of $W = BB^\dagger$ is denoted by $\mathrm{Wishart}_{\mathbb{R}}(n, s)$.
In both settings, an argument based on Gordon’s lemma (Proposition 6.7) allows for concise proofs of precise inequalities. This scheme actually allows obtaining sharp bounds on the norm of a random matrix as an operator between any two real normed spaces. The basic ingredient is a contraction property of the tensor product map which holds only in the real case (Exercise 6.47).
Note that the upper bound in Proposition 6.37 is always sharp up to a factor
matrix. These bounds match the support of the Marčenko–Pastur distribution from
Theorem 6.28. It is then routine to derive concentration estimates.
Corollary 6.38. Let $n \le s$, let $B \in M_{n,s}$ be a random matrix with independent N(0, 1) entries, and denote by $s_n(B)$ its smallest singular value. Then
$\sqrt{s} - \sqrt{n} \le \kappa_s - \kappa_n \le E\, s_n(B) \le E\|B\|_{op} \le \kappa_s + \kappa_n \le \sqrt{s} + \sqrt{n}$.
Consequently, for any $t \ge 0$,
$P\big( \|B\|_{op} \ge \sqrt{s} + \sqrt{n} + t \big) \le \frac{1}{2}\exp(-t^2/2)$,
$P\big( s_n(B) \le \sqrt{s} - \sqrt{n} - t \big) \le \exp(-t^2/2)$.
Proof. We apply Proposition 6.37 with $K = S^{s-1}$ and $L = S^{n-1}$. Note that $w_G(K) = \kappa_s$ and $w_G(L) = \kappa_n$. The leftmost and rightmost inequalities follow from Proposition A.1 (iv) and (i). The concentration estimates are proved as in Proposition 6.33.
Let m, n be integers.
prx, |x|yq. Show that for any px, yq and px1 , y 1 q in Rm ˆ rB2n ,
(iii) Show that the analogues of (i) and (ii) fail in the complex setting.
Exercise 6.48 (Sharp bounds on the largest eigenvalue of GOE(n) and GUE(n) matrices). Let A be a GOE(n) or GUE(n) random matrix. By arguing along the lines of the proofs of Proposition 6.37 and Corollary 6.38, show that $E\lambda_1(A) \le 2\sqrt{n}$.
Exercise 6.49 (Mean width of the projective tensor product). Let $K \subset \mathbb{R}^m$ and $L \subset \mathbb{R}^n$ be convex bodies. Assume that $K \subset r_K B_2^m$ and $L \subset r_L B_2^n$. Prove that $w_G(K \hat\otimes L) \le w_G(K) r_L + w_G(L) r_K$ and $w(K \hat\otimes L) \le w(K)\frac{r_L}{\sqrt{n}} + w(L)\frac{r_K}{\sqrt{m}}$.
Gaussian vector in $\mathbb{C}^n$, then $\sqrt{2}\,|G|$ has distribution χ(2n).
Lemma 6.39 (see Exercise 6.50). Let $n \le s$ and A be an n × s random matrix with independent N(0, 1) entries. There exist random matrices $U \in O(n)$ and $V \in O(s)$, such that, denoting R = UAV,
(i) The random variables $\{r_{i,j} : 1 \le i \le n,\ 1 \le j \le s\}$ are independent,
(ii) For $1 \le i \le n$, $r_{i,i}$ has distribution χ(s + 1 − i),
(iii) For $2 \le i \le n$, $r_{i,i-1}$ has distribution χ(n + 1 − i),
(iv) Other entries of R are almost surely zero.
Lemma 6.40 (see Exercise 6.50). Let $n \le s$ and B be an n × s random matrix with independent $N_{\mathbb{C}}(0, 1)$ entries. There exist random matrices $U' \in U(n)$ and $V' \in U(s)$, such that, denoting $S = \sqrt{2}\, U'BV'$,
(i) The random variables $\{s_{i,j} : 1 \le i \le n,\ 1 \le j \le s\}$ are independent,
(ii) For $1 \le i \le n$, $s_{i,i}$ has distribution χ(2s + 2 − 2i),
(iii) For $2 \le i \le n$, $s_{i,i-1}$ has distribution χ(2n + 2 − 2i),
and S have positive entries, this implies that (almost surely) $\|S\|_{op} \le \|R\|_{op}$. Since $\|A\|_{op} = \|R\|_{op}$ and $\|B\|_{op} = \frac{1}{\sqrt{2}}\|S\|_{op}$, it follows that $E\|B\|_{op} \le \frac{1}{\sqrt{2}} E\|A\|_{op}$. The
Problem 6.41. Does there exist an argument along similar lines (i.e., using Slepian’s lemma and coupling) that yields inequalities in the spirit of (6.49), but involving GUE and GOE matrices (say, $E\|B\|_{op} \le \frac{1}{\sqrt{2}} E\|A\|_{op} \le 2\sqrt{n}$, with B being a GUE(n) matrix and A being a GOE(2n) matrix)?
$P(E \cap L \ne \emptyset) \le \exp\big( -(\kappa_{n-k} - w_G(L))^2/2 \big)$.
Proposition 6.42 will give a direct proof of the low-$M^*$ estimate (Theorem 7.45).
Proof of Proposition 6.42. Let s = n − k, let B be an n × s random matrix with i.i.d. N(0, 1) entries, and let E = ker B. One checks that E is distributed according to the Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$. (This follows from the characterization of the Haar measure as the only measure invariant under the action of O(n).) Moreover, since L is closed, the condition $E \cap L = \emptyset$ is equivalent to $\min_{x\in L} |Bx| > 0$. We apply the Chevet–Gordon inequalities (Proposition 6.37) with $K = S^{s-1}$ to conclude that
$E \min_{x\in L} |Bx| \ge \kappa_s - w_G(L)$.
Since the function $g : B \mapsto \min_{x\in L} |Bx|$ is 1-Lipschitz with respect to the Hilbert–Schmidt distance, we may apply Gaussian concentration of measure (see Table 5.2) to conclude that
$P(E \cap L \ne \emptyset) = P(g(B) = 0)$
$\le P\big( g(B) \le E\, g(B) - (\kappa_s - w_G(L)) \big)$
deeper results about high-dimensional random matrices that touch upon the connection with free probability. A rigorous introduction to free probability is beyond the scope of this book, so we instead illustrate, on an example, the kind of conclusions that can be derived from the general theory.
polynomial in N variables. For every n, let $A_n^{(1)}, \dots, A_n^{(N)}$ be N independent random matrices with GUE(n) distribution, and let $X_n = P(A_n^{(1)}/\sqrt{n}, \dots, A_n^{(N)}/\sqrt{n})$. Then, as $n \to \infty$, the empirical spectral distributions $(\mu_{sp}(X_n))$ converge weakly, in probability, towards the distribution of $P(a_1, \dots, a_N)$, where $a_1, \dots, a_N$ are free semicircular variables. Moreover, $\|X_n\|_\infty$ converges in probability towards the value $\|P(a_1, \dots, a_N)\|$.
Let us explain the meaning of the concepts and notions that appear in Theorem 6.43. First, a polynomial $P$ is self-adjoint if $P(M_1, \dots, M_N) \in M_n^{sa}$ whenever $M_1, \dots, M_N \in M_n^{sa}$; an example is $P(x_1, x_2) = x_1 x_2 x_1$. Second, a family of $N$ “free semicircular random variables” can be concretely realized as follows: let
\[
\mathcal{F} = \bigoplus_{k \in \mathbb{N}} (\mathbb{C}^N)^{\otimes k}
\]
be the Fock space over $\mathbb{C}^N$, with the usual convention that $(\mathbb{C}^N)^{\otimes 0}$ is a one-dimensional space spanned by a unit vector $\Omega$. Let $|1\rangle, \dots, |N\rangle$ be the canonical basis of $\mathbb{C}^N$, and let $h_1, \dots, h_N \in B(\mathcal{F})$ be the corresponding creation operators, defined by $h_i(x) = |i\rangle \otimes x \in (\mathbb{C}^N)^{\otimes (k+1)}$ for every $x \in (\mathbb{C}^N)^{\otimes k}$. Set $a_i := h_i + h_i^\dagger$; then the operators $a_1, \dots, a_N$ are an example of “free semicircular variables.” The quantity $\|P(a_1, \dots, a_N)\|$ appearing in Theorem 6.43 is simply the operator norm, and the distribution of a self-adjoint operator $Y \in B(\mathcal{H})$ is defined as the unique probability measure $\mu$ on $\mathbb{R}$ such that, for every bounded continuous function $f : \mathbb{R} \to \mathbb{R}$,
\[
\langle \Omega | f(Y) | \Omega \rangle = \int_{\mathbb{R}} f \, d\mu
\]
(it is enough to consider the case where $f$ is a polynomial). The unfamiliar reader is invited to check that this formalism is consistent with Theorem 6.23 (see Exercise 6.52).
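The Fock space construction can be tested by hand on small cases. The following is a minimal sketch, not taken from the text: vectors of the Fock space are stored as dictionaries mapping words (tuples of letters in $\{1, \dots, N\}$) to integer coefficients, the empty word plays the role of $\Omega$, and the helper names `create`, `annihilate`, `apply_a` and `vacuum_moment` are our own. It computes vacuum moments $\langle \Omega | a_{i_1} \cdots a_{i_m} | \Omega \rangle$ exactly. The even moments of a single $a_i$ come out as the Catalan numbers 1, 2, 5, 14, which are the even moments of the semicircular distribution on $[-2, 2]$, while the alternating mixed moment $\langle \Omega | a_1 a_2 a_1 a_2 | \Omega \rangle$ vanishes.

```python
from collections import defaultdict

def create(i, vec):
    """Creation operator h_i: prepend the letter i to every word."""
    out = defaultdict(int)
    for word, coeff in vec.items():
        out[(i,) + word] += coeff
    return dict(out)

def annihilate(i, vec):
    """Adjoint of h_i: strip a leading letter i, kill every other word."""
    out = defaultdict(int)
    for word, coeff in vec.items():
        if word and word[0] == i:
            out[word[1:]] += coeff
    return dict(out)

def apply_a(i, vec):
    """Apply a_i = h_i + h_i^dagger."""
    out = defaultdict(int)
    for part in (create(i, vec), annihilate(i, vec)):
        for word, coeff in part.items():
            out[word] += coeff
    return dict(out)

def vacuum_moment(indices):
    """Return <Omega| a_{i_1} ... a_{i_m} |Omega> for indices = (i_1, ..., i_m)."""
    vec = {(): 1}                      # the vacuum vector Omega
    for i in reversed(indices):        # the rightmost operator acts first
        vec = apply_a(i, vec)
    return vec.get((), 0)

# Moments of orders 2, 4, 6, 8 of a_1 in the vacuum state
even_moments = [vacuum_moment((1,) * (2 * m)) for m in range(1, 5)]
print(even_moments)
# An alternating mixed moment and a non-crossing one
print(vacuum_moment((1, 2, 1, 2)), vacuum_moment((1, 1, 2, 2)))
```

Working with exact integer coefficients avoids any truncation of the Fock space: a product of $m$ operators applied to $\Omega$ only involves words of length at most $m$.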
The phenomenon behind Theorem 6.43 is called “asymptotic freeness of random matrices” and is not limited to the case of GUE matrices (see Notes and Remarks for more references). Here is another example involving unitary matrices. The “free additive convolution” is a binary operation (denoted by $\boxplus$ and not defined here) on probability measures (say, with compact support) on $\mathbb{R}$.
Theorem 6.44 (not proved here). Let $\mu$ and $\nu$ be two compactly supported probability measures on $\mathbb{R}$. For every $n$, let $A_n, B_n \in M_n^{sa}$ be real (resp., complex) self-adjoint matrices such that the sequences of empirical measures $(\mu_{sp}(A_n))$ and
The usefulness of Theorem 6.44 comes from the fact that in many situations the free additive convolution of probability measures can be computed using the so-called R-transform, a non-commutative analogue of the Fourier transform (see
projectors. Then the sequence of empirical measures $(\mu_{sp}(A_n))$ converges weakly in probability towards the deterministic measure
\[
(1 - 2t)\,\delta_0 + \frac{\sqrt{4t(1-t) - (x-1)^2}}{\pi x (2 - x)}\, \mathbf{1}_{[1 - 2\sqrt{t(1-t)},\, 1 + 2\sqrt{t(1-t)}]}(x)\, dx. \tag{6.50}
\]
Moreover, the sequence $(\|A_n\|_\infty)$ converges in probability towards $1 + 2\sqrt{t(1-t)}$.
An analogous statement for $t \ge 1/2$ follows by applying Corollary 6.45 to $E_n^\perp$ and $F_n^\perp$. The measure defined in (6.50) is the free additive convolution of the measure $(1-t)\delta_0 + t\delta_1$ with itself (for $t = 1/2$ we recover the arcsine distribution).
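As a sanity check on (6.50), one can verify numerically that the atom at 0 and the absolutely continuous part add up to total mass 1. The sketch below assumes that the density in (6.50) reads $\sqrt{4t(1-t) - (x-1)^2} / (\pi x (2-x))$ on the interval $[1 - 2\sqrt{t(1-t)}, 1 + 2\sqrt{t(1-t)}]$; the function name `continuous_mass` and the number of quadrature steps are our own choices.

```python
import math

def continuous_mass(t, steps=20000):
    """Midpoint-rule integral of the density in (6.50) over its support.

    Substituting x = 1 + r*sin(theta) with r = 2*sqrt(t(1-t)) removes the
    square-root behaviour at the endpoints: the integrand becomes
    r^2 cos^2(theta) / (pi * (1 - r^2 sin^2(theta))) on (-pi/2, pi/2).
    """
    r2 = 4.0 * t * (1.0 - t)
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = -math.pi / 2 + (j + 0.5) * h
        s, c = math.sin(theta), math.cos(theta)
        total += r2 * c * c / (math.pi * (1.0 - r2 * s * s))
    return total * h

for t in (0.1, 0.25, 0.4, 0.5):
    atom = 1.0 - 2.0 * t               # weight of delta_0 in (6.50)
    print(t, atom + continuous_mass(t))
```

For $t = 1/2$ the atom disappears and the integrand is the constant $1/\pi$, which is the arcsine case mentioned in the text.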
Exercise 6.52 (Semicircular variables via creation operators). Show that the distribution of the operators $a_i$ defined on the Fock space in the paragraph following Theorem 6.43 is indeed the semicircular distribution.
distribution (see [LLR83], Theorem 1.5.3 and also [Pic68] to justify convergence in expectation)
\[
u_n = \sqrt{2 \log n} - \frac{\log \log n}{2\sqrt{2 \log n}} + O\Bigl( \frac{1}{\sqrt{\log n}} \Bigr).
\]
Inequalities in a similar spirit, but also for fixed n, appear in [DLS14].
References for the result from Remark 6.4 are [Glu88], [CP88] and [BF88].
The conjecture from Remark 6.5 appears in [HS05].
The second part of Proposition 6.6 was originally proved by Slepian [Sle62] and
is usually referred to as Slepian’s lemma. The first assertion, which follows from
the second one, is sometimes called the Sudakov–Fernique inequality and appears
in [Fer75]. Several proofs of Proposition 6.6 are available in addition to the original
one; see, e.g., Kahane [Kah86] and Gromov [Gro87].
We also mention a well-known open problem related to Slepian’s lemma which is known as the Kneser–Poulsen conjecture. Suppose that $x_1, \dots, x_N$ and $y_1, \dots, y_N$ are points in $\mathbb{R}^d$ with the property that $|x_i - x_j| \le |y_i - y_j|$ for any $1 \le i, j \le N$. The conjecture asks whether for all radii $r_1, \dots, r_N > 0$, we have $\mathrm{vol}(\bigcup B(x_i, r_i)) \le \mathrm{vol}(\bigcup B(y_i, r_i))$. Under the same hypotheses, a sister conjecture is whether the inequality $\mathrm{vol}(\bigcap B(x_i, r_i)) \ge \mathrm{vol}(\bigcap B(y_i, r_i))$ holds. Similar questions can be asked for the spherical, hyperbolic and projective spaces. Note that, in the spherical case, the two conjectures are equivalent since the complement of a cap is also a cap. Also, since all Riemannian manifolds are asymptotically flat as distances go to 0, the Euclidean case (in any particular dimension) would be a formal consequence of a positive answer in any other setting. The answers were shown to be affirmative
known to be true for spherical caps of angle $\pi/2$, see [Bez08], which also surveys partial results and specific open problems in the hyperbolic setting. In the setting of projective spaces the question about unions appears to have a negative answer, as indicated by counterexamples in section 4 of [Šid68], which show that a full two-sided analogue of Slepian’s lemma (in the spirit of Proposition 6.9) does not hold.
Proposition 6.9 was proved independently by Khatri [Kha67] and Šidák [Šid67]
(see also [Šid68, Glu88]). The Gaussian correlation conjecture was proved by
Royen in [Roy14]. A more accessible and more detailed exposition can be found
in [LM15], to which we also refer for more background and references.
The Sudakov minoration (Proposition 6.10) appears in [Sud71]. The dual Su-
dakov inequality (Proposition 6.11) is due to Pajor–Tomczak-Jaegermann [PTJ85].
The proof presented here is due to Talagrand (see [LT91]). Some refinements of
both inequalities appear in [MTJ87].
Dudley’s inequality (Proposition 6.13) goes back to [Dud67] and was gener-
alized to the subgaussian setting in [JM78]. A version of Proposition 6.17 in the
language of stationary Gaussian processes can be found in [Fer97]. The first part
of Theorem 6.18 is due to Fernique [Fer75] and the second part (which is much
harder) is due to Talagrand [Tal87] (a later paper [Tal01] contains a more transpar-
ent exposition). For more information about the “generic chaining” principle (which
is a reincarnation of the “majorizing measures”), we refer to the books [Tal05] and
[Tal14] by Talagrand, the latter one being more accessible.
Section 6.2. Two recent and excellent references about RMT are [AGZ10] and [Tao12], and we direct the reader to them for the background, further information, and bibliography. In particular, a huge branch of RMT which is not considered here revolves around the universality principle and aims at extending convergence results to models with fewer symmetries and/or with weaker integrability properties. Random matrices drawn from classical compact groups are the topic of the forthcoming monograph [Mec].
In the context of empirical measures, the $\infty$-Wasserstein distance was introduced in [ASY14]. The $\infty$-Wasserstein distance is much less popular than its “finite p” cousins; for example, in [Vil09] it appears only in the bibliographical notes to the entire chapter devoted to the topic. However, it has a few interesting applications, see for example [McC06]. We refer to [Vil09] for a thorough discussion of why the terminology “Wasserstein distance” is as highly questionable as it is predominant. For a proof that the Lévy distance metrizes weak convergence, see Section 4.3 in [Gal95]. Knowing that the weak convergence is metrizable gives unambiguous meaning to statements asserting that a sequence of random measures “converges weakly in probability,” which are ubiquitous in RMT. A long list of con-
can be found, along with many other facts about convergence of probability measures, in [Bil99].
Wigner’s theorem about convergence to the semicircle distribution originates from [Wig55, Wig58] and has been extended and strengthened in various directions, notably to matrices with independent (but not necessarily Gaussian) entries
Led03, LR10]. The perhaps surprising normalization is sharp and reflects the fact that fluctuations of large random matrices are asymptotically smaller than the upper bound given by the Gaussian concentration. For example, the quantity $\lambda_1(\mathrm{GUE}(n)) - 2\sqrt{n}$ is of order $n^{-1/6}$ (as opposed to $O(1)$ following from the
The proof is, to the best of our knowledge, new; however, Lemma 6.39 appears in
[Sil85].
The formula (6.47) has been derived in [ŻS01] (and probably independently in
many other sources). Proposition 6.37 is from [Gor85] and improves on [Che78].
The argument leading to Corollary 6.38 is taken from [DS01]. Proposition 6.42 is
from [Gor88].
Free probability. The very interesting and fruitful link between free probability and large random matrices mentioned in Section 6.2.5 goes back to [Voi91]. The monograph [NS06] gives an accessible and comprehensive approach to the subject with an emphasis on its combinatorial aspects. A highly readable exposition of many aspects of the subject relevant to quantum information theory can be found in [HP00].
The weak convergence in Theorem 6.43 was proved by Voiculescu. The extension to the convergence of the operator norm is a difficult result which was derived later by Haagerup–Thorbjørnsen [HT05].
Free additive convolution was introduced by Voiculescu in [Voi85] and the statement of Theorem 6.44 is from [Voi90]. The convergence of the operator norms required for the last part of Corollary 6.45 was supplied recently in [CM14]. A formula for the sum of more than two projectors can also be derived, see [FN15].
Finally, we mention that some concentration estimates for polynomials in random matrices can be found in [MS12].
CHAPTER 7
This chapter contains a selection of results from asymptotic geometric analysis which we believe to be of interest to quantum information theory. The most famous of them is arguably Dvoretzky’s theorem which asserts that, roughly speaking, every convex body of sufficiently large dimension admits sections which are arbitrarily close to Euclidean balls. There are actually several variations on this statement and they are studied in detail in Section 7.2. We also introduce the $\ell$-position of convex bodies and use it to deduce the $MM^*$-estimate, an important result that allows appealing to duality when studying mean widths.
\[
\ell_K(T) = \mathbf{E}\, \|T(G)\|_K ,
\]
where $G$ denotes a standard Gaussian vector in $\mathbb{R}^n$. If there is no ambiguity about
182 7. SOME TOOLS FROM ASYMPTOTIC GEOMETRIC ANALYSIS
$\det T_0$ (by strict log-concavity of $\det$ over $\mathrm{PSD}$, see Exercise 1.42), a contradiction.
Note that the $\ell$-position of a convex body is unique up to homotheties and rotations. It follows from Proposition 4.8 that convex bodies with enough symmetries are automatically in the $\ell$-position.
Lemma 7.3. Let $K$ be a convex body in the $\ell$-position. Then $w_G(K^\circ)\, \mathrm{Tr}(A) \le n\, \ell_K(A)$ for any $A \in M_n$.
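Proof. We may assume $A \in \mathrm{PSD}$. Indeed, any $B \in M_n$ can be written as $AO$ for $A \in \mathrm{PSD}$ and $O \in O(n)$, and we have $\ell_K(A) = \ell_K(B)$ by rotational invariance of the Gaussian measure, while $\mathrm{Tr}\, B \le \|B\|_1 = \mathrm{Tr}\, A$.
Since $K$ is in $\ell$-position, the solution of the variational problem (7.1) is $\lambda I$ with $\lambda = \ell_K(I)^{-1} = w_G(K^\circ)^{-1}$. Consider $A \in \mathrm{PSD}$ and $\varepsilon > 0$ small enough such that $I + \varepsilon A \in \mathrm{PSD}$. Let $B = (\ell_K(I + \varepsilon A))^{-1} (I + \varepsilon A)$. Since $\ell_K(B) = 1$ it follows that $\det(B) \le \det(\lambda I) = \lambda^n$. Consequently, using the triangle inequality,
\[
(\det(I + \varepsilon A))^{1/n} \le \lambda\, \ell_K(I + \varepsilon A) \le 1 + \varepsilon \lambda\, \ell_K(A).
\]
Since $\det(I + \varepsilon A)^{1/n} = 1 + \frac{1}{n} \varepsilon\, \mathrm{Tr}(A) + o(\varepsilon)$ as $\varepsilon$ goes to 0, the result follows.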
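The first-order expansion used in the last step depends only on the spectrum of $A$, since $\det(I + \varepsilon A) = \prod_i (1 + \varepsilon \lambda_i)$ for a positive semidefinite $A$ with eigenvalues $\lambda_i$. The sketch below, with a hypothetical spectrum chosen purely for illustration and a helper name `det_root` of our own, compares $\det(I + \varepsilon A)^{1/n}$ with $1 + \varepsilon\, \mathrm{Tr}(A)/n$ and prints the gap divided by $\varepsilon^2$, which should stay bounded as $\varepsilon \to 0$.

```python
import math

def det_root(eigs, eps):
    """det(I + eps*A)^(1/n) for a PSD matrix A with eigenvalues `eigs`."""
    n = len(eigs)
    log_det = sum(math.log1p(eps * lam) for lam in eigs)
    return math.exp(log_det / n)

eigs = [0.0, 0.5, 1.0, 3.0, 7.5]       # hypothetical spectrum of A (our choice)
mean_trace = sum(eigs) / len(eigs)     # Tr(A) / n

for eps in (1e-1, 1e-2, 1e-3):
    exact = det_root(eigs, eps)
    linear = 1.0 + eps * mean_trace
    # (linear - exact) / eps^2 should stay bounded as eps -> 0
    print(eps, exact, linear, (linear - exact) / eps**2)
```

The gap is nonnegative by the arithmetic-geometric mean inequality, and to leading order it equals $\varepsilon^2$ times half the variance of the eigenvalues.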
Remark 7.4. Before proceeding, let us point out that the more common definition of the $\ell$-norm (and of the $\ell$-position) is via the second moment, namely $(\mathbf{E}\, \|T(G)\|_K^2)^{1/2}$. Using the second moment leads to nicer duality relations, but we prefer to use the first moment to make the connection to the mean width more transparent. The next proposition shows that the two quantities are equivalent; however, they are neither equal nor proportional, and so the corresponding two maximization problems lead to two slightly different notions of $\ell$-position.
Proposition 7.5 (not proved here). For any symmetric convex body $K \subset \mathbb{R}^n$
Exercise 7.1. Prove the properties of the $\ell$-norm listed in Proposition 7.1.
Exercise 7.2 (The left ideal property). In the setting of Proposition 7.1, is it true that $\ell_K(ST) \le \|S\|\, \ell_K(T)$?
E |f pGq|2
`
,
product
\[
\langle\langle \Theta, \Theta' \rangle\rangle := \mathbf{E}\, \langle \Theta(G), \Theta'(G) \rangle \tag{7.2}
\]
7.1. `-POSITION, K-CONVEXITY AND THE M M ˚ -ESTIMATE 183
and can be identified with the Hilbert space tensor product $H_k \otimes \mathbb{R}^n$. (This is the canonical identification of the space of $H$-valued $L_2$ functions on $\Omega$ with $L_2(\Omega) \otimes H$; if $\dim H < \infty$, no completion of the latter is needed.) The projections $R_d$ induce extensions $\tilde{R}_d := R_d \otimes I_{\mathbb{R}^n} : H_{k,n} \to H_{k,n}$. More concretely, for $\Theta \in H_{k,n}$, we have $\tilde{R}_d(\Theta) := (R_d f_1, \dots, R_d f_n)$. Similarly as for $n = 1$, the function $\tilde{R}_1(\Theta) : \mathbb{R}^k \to \mathbb{R}^n$ is linear, i.e., it has the form $x \mapsto Ax$ for some $A \in M_{k,n}$ (depending on $\Theta$), and the operator $\tilde{R}_1$ is the orthogonal projection onto the subspace of $H_{k,n}$ formed by such linear functions.
Let $K$ be a convex body in $\mathbb{R}^n$ containing 0 in the interior. For $\Theta \in H_{k,n}$, define
\[
|\!|\!|\Theta|\!|\!|_K = \bigl( \mathbf{E}\, \|\Theta(G)\|_K^2 \bigr)^{1/2} \tag{7.3}
\]
(this quantity is a norm when $K$ is symmetric; again, we have here $X$-valued $L_2$ functions on $(\mathbb{R}^k, \gamma_k)$, where $X = (\mathbb{R}^n, \|\cdot\|_K)$). It is easily checked that, for $\Theta \in H_{k,n}$,
\[
|\!|\!|\Theta|\!|\!|_K = \sup \{ \langle\langle \Theta, \Xi \rangle\rangle : \Xi \in H_{k,n},\ |\!|\!|\Xi|\!|\!|_{K^\circ} \le 1 \}. \tag{7.4}
\]
The K-convexity constant of the convex body $K$, denoted by $\mathrm{K}(K)$, is the smallest constant $C$ such that the inequality
\[
|\!|\!|\tilde{R}_1(\Theta)|\!|\!|_K \le C\, |\!|\!|\Theta|\!|\!|_K \tag{7.5}
\]
holds for every $k$ and for all $\Theta \in H_{k,n}$. It is not hard to show that $\mathrm{K}(K) < \infty$ (see Exercise 7.3). Moreover, rather surprisingly, $\mathrm{K}(\cdot)$ is often uniformly bounded for large classes of bodies (for example, for balls in all commutative or non-commutative $\ell_p$ spaces for a fixed $p \in (1, \infty)$). For general symmetric convex bodies, the sharp
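Proof. To each $x \in \mathbb{R}^n$ associate $\Theta(x) \in K$ with the property that $\langle x, \Theta(x) \rangle = \|x\|_{K^\circ}$; we can also ensure that the map $\Theta$ is Borel (see Exercise 1.12), so that
Given that $\tilde{R}_1$ is an orthogonal projection onto a subspace containing $I_{\mathbb{R}^n}$, we have $\langle\langle I_{\mathbb{R}^n}, \Theta \rangle\rangle = \langle\langle I_{\mathbb{R}^n}, \tilde{R}_1(\Theta) \rangle\rangle$. Recalling that $\tilde{R}_1(\Theta)$ has the form $x \mapsto Ax$ for some $A \in M_n$, we can write
\[
w_G(K) = \mathbf{E}\, \langle G, AG \rangle.
\]
Since an elementary computation shows that $\mathbf{E}\, \langle G, AG \rangle = \mathrm{Tr}\, A$, a straightforward application of Lemma 7.3 yields
\[
w_G(K)\, w_G(K^\circ) \le n\, \ell_K(A). \tag{7.6}
\]
It remains to unscramble the meaning of the quantity $\ell_K(A)$. We have
\[
\ell_K(A) = \mathbf{E}\, \|A(G)\|_K \le \bigl( \mathbf{E}\, \|A(G)\|_K^2 \bigr)^{1/2} = |\!|\!|A|\!|\!|_K = |\!|\!|\tilde{R}_1(\Theta)|\!|\!|_K \le \mathrm{K}(K)\, |\!|\!|\Theta|\!|\!|_K \le \mathrm{K}(K),
\]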
Remark 7.8. If $K \subset \mathbb{R}^n$ is unconditional, the bound in Theorem 7.7 can be improved to $C \bigl( 1 + \log d(K, B_2^n) \bigr)^{1/2}$.
Before proving Theorem 7.7, we derive some of its consequences. First, since $d(K, B_2^n) \le \sqrt{n}$ for every symmetric convex body $K \subset \mathbb{R}^n$ (see Exercise 4.20; actually, the weaker result from Exercise 4.2 would suffice), we have
Corollary 7.9. There is a universal constant $C$ such that $\mathrm{K}(K) \le C \log n$ for any symmetric convex body $K \subset \mathbb{R}^n$, $n \ge 2$.
Combined with Proposition 7.6, this implies the following result known in asymptotic geometric analysis as the “$MM^*$-estimate.”
Theorem 7.10 (The $MM^*$-estimate). Let $n \ge 2$ and let $K \subset \mathbb{R}^n$ be a symmetric convex body which is in the $\ell$-position. Then
4.37). As a corollary, we obtain the fact that, in the $\ell$-position, the Urysohn inequality (4.34) is sharp up to a logarithmic factor.
such that
\[
w(T(K)) \le C \log n \, \mathrm{vrad}(T(K)).
\]
Note that since both $w(T(K))$ and $\mathrm{vrad}(T(K))$ are 1-homogeneous in $T$, one may require in Corollary 7.11 that $T \in SL(\mathbb{R}^n)$, in which case $\mathrm{vrad}(T(K)) = \mathrm{vrad}(K)$.
For the proof of Theorem 7.7 we need two auxiliary lemmas, the first of which requires recalling some notation. Fix $k \ge 1$ and let $(P_t)_{t \ge 0}$ be the Ornstein–Uhlenbeck semigroup introduced in (5.55). Then each $P_t$ is a contraction on $H_k$ (Exercise 5.62). Moreover, the operator $P_t$ extends to an operator $\tilde{P}_t$ on $H_{k,n}$ by the formula
\[
\tilde{P}_t(f_1, \dots, f_n) = (P_t f_1, \dots, P_t f_n)
\]
(or, more abstractly, $\tilde{P}_t = P_t \otimes I_{\mathbb{R}^n}$) and this extension is also a contraction with respect to any “reasonable functional norm.”
Lemma 7.12. For any $\Theta \in H_{k,n}$ and for any convex body $K \subset \mathbb{R}^n$ containing 0 in the interior, we have $|\!|\!|\tilde{P}_t \Theta|\!|\!|_K \le |\!|\!|\Theta|\!|\!|_K$.
the second inequality following from $P_t$ being a contraction on $H_k$ (see Exercise 5.62).
The second lemma that we need for the proof of Theorem 7.7 is the following.
Lemma 7.13 (see Exercise 7.6). Let $p$ be a polynomial such that (1) $|p(x)| \le 1$ for any $x \in [-1, 1]$ and (2) for some $\lambda \ge e$, $|p(z)| \le \lambda$ for any complex number $z$ with $|z| \le 1$. Then $|p'(0)| \le \frac{4e}{\pi} \log \lambda$.
Proof of Theorem 7.7. Fix $k \ge 1$ and let $\lambda = d(K, B_2^n)$. Since the K-convexity constant is linearly invariant (see Exercise 7.3), we may assume that $B_2^n \subset K \subset \lambda B_2^n$ and therefore
\[
|\!|\!|\cdot|\!|\!|_K \le |\!|\!|\cdot|\!|\!|_{B_2^n} \le \lambda\, |\!|\!|\cdot|\!|\!|_K . \tag{7.8}
\]
Further, since $\mathrm{K}(K) \le d(K, B_2^n)$ (again, by Exercise 7.3, or directly from (7.8)), we may assume that $\lambda \ge e$. Note that the Hilbert space norm on $H_{k,n}$ corresponding
j“1
For $|z| \le 1$, we have
$H_{k,n}$ with $|\!|\!|\Xi|\!|\!|_{K^\circ} \le 1$, the polynomial $p(z) = \langle\langle \pi(z), \Xi \rangle\rangle$ satisfies the hypotheses of
holds for every $k$ and every $f : \{-1, 1\}^k \to \mathbb{R}^n$, where $\varepsilon$ is uniformly distributed on $\{-1, 1\}^k$. It can be shown (see Section 6.6 in [AAGM15] for a detailed argument) that for any symmetric convex body $K$,
\[
\frac{2}{\pi}\, \mathrm{K}^1(K) \le \mathrm{K}(K) \le \mathrm{K}^1(K).
\]
This definition allows for a derivation of the estimate from Theorem 7.7 that parallels the one presented above, with the Hermite polynomials being replaced by the Walsh functions, and Lemma 7.13 replaced by a careful application of Bernstein’s inequality: If $p$ is a polynomial of degree at most $m$ such that $|p(x)| \le 1$ for $x \in [-1, 1]$, then $|p'(0)| \le m$.
Exercise 7.3 (A rough bound for the K-convexity constant). (i) Show that $\mathrm{K}(B_2^n) = 1$. (ii) Show that if $K, L$ are symmetric convex bodies in $\mathbb{R}^n$, then $\mathrm{K}(K) \le d_{BM}(K, L)\, \mathrm{K}(L)$. (iii) Conclude that $\mathrm{K}(K) \le \sqrt{n}$ for symmetric convex bodies $K \subset \mathbb{R}^n$.
Exercise 7.4 (K-convexity and duality). Show that $\mathrm{K}(K) = \mathrm{K}(K^\circ)$ for every convex body $K$ containing 0 in the interior.
Exercise 7.5 (The K-convexity constant for $B_1^n$ and for the cube). Let $N = 2^k$ and write the canonical basis of $\mathbb{R}^N$ as $(e_\varepsilon)_{\varepsilon \in \{-1,1\}^k}$. Define a map $\Theta \in H_{k,N}$ by $\Theta(x) = e_\varepsilon$ if the signs of the coordinates of $x \in \mathbb{R}^k$ match the sequence $\varepsilon \in \{-1, 1\}^k$. Show that $|\!|\!|\tilde{R}_1(\Theta)|\!|\!|_{B_1^N} \ge c\sqrt{k}$ for some $c > 0$ and conclude that $\mathrm{K}(B_1^n) = \Omega(\sqrt{\log n}) = \mathrm{K}(B_\infty^n)$.
onto the open unit disk; reformulate the question as an inequality about holomorphic functions on $S$ and use the three-lines lemma.
defined on the unit sphere (Corollary 5.17) implies that such functions are actually almost constant on a typical (randomly chosen) subspace of large dimension.
with respect to the Haar measure (as defined in Appendix B.4) on the Grassmann manifold $\mathrm{Gr}(k, \mathbb{R}^n)$ (resp., $\mathrm{Gr}(k, \mathbb{C}^n)$), for example by setting $E = U(\mathbb{R}^k)$ (resp.,
In the following we consider the space $S^{n-1} \subset \mathbb{R}^n$ equipped with the geodesic metric $g$. The objective is to show that, for a Lipschitz function $f : S^{n-1} \to \mathbb{R}$ and a random $k$-dimensional subspace $E \subset \mathbb{R}^n$, the oscillation of $f$ around a central value on the subsphere $S_E := S^{n-1} \cap E$ is small (and similarly for $S_{\mathbb{C}^n} \subset \mathbb{C}^n$). We first present a straightforward $\varepsilon$-net argument, which easily gives a result that is only
7.2. SECTIONS OF CONVEX BODIES 187
slightly worse than Theorem 7.15 below. We focus on the real case, but the same argument applies in the complex setting. Note, however, that the latter does not follow formally from the former: while $\mathbb{C}^n$, $S_{\mathbb{C}^n}$ can be identified with $\mathbb{R}^{2n}$, $S^{2n-1}$ as metric spaces, not every $2k$-dimensional $\mathbb{R}$-linear subspace of $\mathbb{R}^{2n}$ corresponds to a $k$-dimensional $\mathbb{C}$-linear subspace of $\mathbb{C}^n$.
Let $f : (S^{n-1}, g) \to \mathbb{R}$ be a 1-Lipschitz function, let $\mu_f$ be a central value for $f$, and let $E = U(\mathbb{R}^k)$ be a random $k$-dimensional subspace of $\mathbb{R}^n$, with $U$ Haar-distributed on $O(n)$. Let $\varepsilon \in (0, 1)$ and let $\mathcal{N}$ be an $\varepsilon$-net in $(S^{k-1}, g)$. First, since the function $f \circ U$ is 1-Lipschitz, we have
\[
\mathrm{osc}(f \circ U, S^{k-1}, \mu_f) \le \varepsilon + \mathrm{osc}(f \circ U, \mathcal{N}, \mu_f).
\]
We know from Corollary 5.32 that for any $x \in \mathcal{N}$,
\[
\mathbf{P}\bigl( |f(U(x)) - \mu_f| > \varepsilon \bigr) \le 2 \exp(-n\varepsilon^2/4).
\]
By the union bound, it follows that
\[
\mathbf{P}\bigl( \mathrm{osc}(f \circ U, \mathcal{N}, \mu_f) > \varepsilon \bigr) \le \mathrm{card}(\mathcal{N}) \cdot 2 \exp(-n\varepsilon^2/4). \tag{7.9}
\]
By Lemma 5.3, we may choose $\mathcal{N}$ with $\mathrm{card}\, \mathcal{N} \le (\pi/\varepsilon)^k$, so that the bound from (7.9) is substantially smaller than 1 provided $k \le c' n\varepsilon^2 / \log(1/\varepsilon)$. In that case we have $\mathrm{osc}(f, S_E, \mu_f) \le 2\varepsilon$ with high probability. We will slightly improve the dependence on $\varepsilon$ in Theorem 7.15 below; this improvement turns out to be crucial for some applications.
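To get a feeling for the dimensions allowed by this net argument, one can evaluate the right-hand side of (7.9) directly. The sketch below is only an illustration: the function names, the failure threshold 0.01 and the sample values of $n$ and $\varepsilon$ are our own choices, and the net cardinality is replaced by its upper bound $(\pi/\varepsilon)^k$.

```python
import math

def union_bound(n, k, eps):
    """Right-hand side of (7.9) with card(N) <= (pi/eps)^k, capped at 1."""
    log_bound = k * math.log(math.pi / eps) + math.log(2.0) - n * eps**2 / 4.0
    return math.exp(min(log_bound, 0.0))

def max_dimension(n, eps, failure=0.01):
    """Largest k for which the bound (7.9) is at most `failure` (0 if none)."""
    budget = n * eps**2 / 4.0 - math.log(2.0 / failure)
    return max(0, math.floor(budget / math.log(math.pi / eps)))

n = 10**6
for eps in (0.5, 0.1, 0.02):
    k = max_dimension(n, eps)
    # last column: the scale n * eps^2 / log(1/eps) appearing in the text
    print(eps, k, union_bound(n, k, eps), n * eps**2 / math.log(1.0 / eps))
```

The admissible dimension is proportional to $n\varepsilon^2/\log(\pi/\varepsilon)$, which exhibits the logarithmic loss in $\varepsilon$ compared with the bound $k \le cn\varepsilon^2$.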
A function $f : S_{\mathbb{C}^n} \to \mathbb{R}$ is said to be circled if it satisfies $f(e^{i\theta} x) = f(x)$ for every $x \in S_{\mathbb{C}^n}$ and $\theta \in \mathbb{R}$. Circled functions are the complex counterpart of even functions.
Lipschitz circled function, $\mu_f$ be a central value for $f$ (with respect to the uniform measure) and $0 < \varepsilon < 1$. Assume that $k \le cn\varepsilon^2$, and let $E \subset \mathbb{C}^n$ be a random $k$-dimensional subspace. Then, with probability larger than $1 - \exp(-c' n\varepsilon^2)$
\[
\mathrm{osc}(f, S_E, \mu_f) \le \varepsilon.
\]
The same conclusion holds for any 1-Lipschitz function $f : S^{n-1} \to \mathbb{R}$ and a random subspace $E \subset \mathbb{R}^n$. In both cases the dimension changes to $cn\varepsilon^2/L^2$ if the function $f$ is $L$-Lipschitz.
Remark 7.16. The proof given below gives for example the value $c = 1/400$, which is certainly far from optimal. (The argument actually works provided $k + 1 \le n\varepsilon^2/200$.) While the bound can be undoubtedly improved, the use of Dudley’s inequality inevitably results in poor constants. In the real case, the use of Slepian–Gordon inequalities gives a constant of order $1/6$ (see Exercise 7.7) and even better when the function $f$ is the restriction of a norm (see Remark 7.23). It would be desirable to come up with a complex version of that argument, the difficulty being that the inequalities from Exercise 6.47 do not carry over to the complex case.
Proof of Theorem 7.15. We consider the complex case and note that the same argument applies in the real setting. We may also assume that $\mu_f = 0$ (otherwise consider $f - \mu_f$).
Let $E = U(\mathbb{C}^k)$, with $U \in SU(n)$ a random Haar-distributed unitary matrix (we could use equivalently the Haar measure on $U(n)$, but this would lead to worse constants in (7.10) below, see Table 5.2). Consider the function $F : SU(n) \to \mathbb{R}$ defined by
\[
F(U) = \sup_{S_E} |f| = \sup_{x \in S_{\mathbb{C}^k}} |f(U(x))|.
\]
For $U, V \in SU(n)$ and $x \in S_{\mathbb{C}^k}$, we have (see Exercise B.5 for the last inequality)
\[
|f(Ux) - f(Vx)| \le |Ux - Vx| \le \|U - V\|_{op} \le \|U - V\|_{HS} \le g_2(U, V)
\]
where $g_2$ denotes the geodesic distance on $SU(n)$, defined in (B.8). It follows that $F$ is 1-Lipschitz on $(SU(n), g_2)$. Using concentration of measure (see Table 5.2) gives then, for any $t > 0$,
\[
\mathbf{P}(F \ge \mathbf{E} F + t) \le \exp(-nt^2/4). \tag{7.10}
\]
The remaining part of the proof consists in bounding $\mathbf{E} F$. We will rely on the following lemma.
Lemma 7.17. Let $f : S_{\mathbb{C}^n} \to \mathbb{R}$ be a 1-Lipschitz circled function and $U \in SU(n)$ be a Haar-distributed random unitary matrix. Then for any $x, y \in S_{\mathbb{C}^n}$ with $x \ne y$ and for any $\lambda > 0$,
pn ´ 1qλ2
ˆ ˙
Ppf pU xq ´ f pU yq ą λq ď exp ´ ,
$y$ by $e^{i\theta} y$ and choose $\theta$ so that $\langle x | y \rangle$ is real nonnegative; note that this choice of $\theta$ minimizes $|x - y|$ and ensures that $x + y$ and $x - y$ are orthogonal. Set $z = \frac{x+y}{2}$ and $w = \frac{x-y}{2}$, then $x = z + w$ and $y = z - w$. Further, set $\beta = |w| = \frac{1}{2}|x - y|$ (we may assume that $\beta \ne 0$) and $w' = \beta^{-1} w$. Then, conditionally on $u = U(z)$, $U(w')$
As is readily seen, $f_u$ is $2\beta$-Lipschitz and its mean is 0. From Lévy’s lemma (Corollary 5.32) applied to $f_u$ and to the $(2n-3)$-dimensional sphere $S_{u^\perp}$, we deduce
and hence the same inequality holds also without the conditioning.
We now return to the proof of Theorem 7.15. Lemma 7.17 asserts that the process $(X_s)_{s \in S_{\mathbb{C}^k}}$ defined by $X_s = f(Us)$ is subgaussian (a notion defined in (6.19)) with constants $A = 1$ and $\alpha = (n-1)/2$. We apply Dudley’s inequality in the form given in Corollary 6.14 to obtain
\[
\mathbf{E} \sup_{s \in S_{\mathbb{C}^k}} X_s \le \sup_{s \in S_{\mathbb{C}^k}} \mathbf{E} X_s + \frac{6\sqrt{2}}{\sqrt{n-1}} \int_0^{1/2} \sqrt{1 + 2 \log N(S_{\mathbb{C}^k}, |\cdot|, \eta)}\, d\eta. \tag{7.11}
\]
For any $s \in S$, $\mathbf{E} X_s$ is equal to the mean of $f$. Since 0 is a central value for $f$, it follows from Corollary 5.32 that $\mathbf{E} X_s \le \sqrt{2 \log 2}/\sqrt{2n}$. We know from Lemma 5.3 that $N(S_{\mathbb{C}^k}, |\cdot|, \eta) \le (2/\eta)^{2k}$. Using the bound $\sqrt{1 + t} \le 1 + \sqrt{t}$ gives
\[
\mathbf{E} F = \mathbf{E} \sup_{s \in S_{\mathbb{C}^k}} X_s \le \frac{\sqrt{\log 2}}{\sqrt{n}} + \frac{3\sqrt{2}}{\sqrt{n-1}} + \frac{12\sqrt{2k}}{\sqrt{n-1}} \int_0^{1/2} \sqrt{\log(2/\eta)}\, d\eta.
\]
The numerical value $\int_0^{1/2} \sqrt{\log(2/\eta)}\, d\eta \le 0.759$ leads to
\[
\mathbf{E} F = \mathbf{E} \sup_{s \in S} X_s \le \frac{5.08 + 12.89\sqrt{k}}{\sqrt{n-1}}.
\]
This quantity is smaller than $\varepsilon/2$ provided $k \le cn\varepsilon^2$ for some constant $c$, and the conclusion follows by applying (7.10) for $t = \varepsilon/2$.
To obtain the constant $c = 1/400$, one checks the inequality $5.08 + 12.89\sqrt{k} \le \sqrt{200(k+1) - 1}$. It follows that $\mathbf{E} F \le \varepsilon$ provided $\sqrt{200(k+1) - 1} \le \varepsilon\sqrt{n-1}$, or (since $\varepsilon < 1$) when $k + 1 \le n\varepsilon^2/200$. Since we may assume that $n\varepsilon^2 \ge 400$ (otherwise there is nothing to prove), this inequality is implied by the condition $k \le n\varepsilon^2/400$.
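The numerical constants in this proof are easy to recheck. The sketch below makes two assumptions that should be read as ours: the integral is evaluated through the closed form $\int_0^{1/2} \sqrt{\log(2/\eta)}\, d\eta = 2\int_{\log 4}^{\infty} \sqrt{u}\, e^{-u}\, du$ (obtained with the substitution $u = \log(2/\eta)$ and one integration by parts), and the final inequality is read as $5.08 + 12.89\sqrt{k} \le \sqrt{200(k+1) - 1}$, with the subtraction under the square root. The variable and function names are hypothetical.

```python
import math

# Closed form: int_0^{1/2} sqrt(log(2/eta)) d eta = 2 * int_{log 4}^inf sqrt(u) e^{-u} du,
# and int_a^inf sqrt(u) e^{-u} du = sqrt(a) e^{-a} + (sqrt(pi)/2) erfc(sqrt(a)).
a = math.log(4.0)
integral = 2.0 * (math.sqrt(a) * math.exp(-a)
                  + 0.5 * math.sqrt(math.pi) * math.erfc(math.sqrt(a)))

# Independent cross-check by the midpoint rule on (0, 1/2)
steps = 200000
h = 0.5 / steps
midpoint = h * sum(math.sqrt(math.log(2.0 / ((j + 0.5) * h))) for j in range(steps))

const_term = math.sqrt(math.log(2.0)) + 3.0 * math.sqrt(2.0)   # should be at most 5.08
slope = 12.0 * math.sqrt(2.0) * integral                       # should be at most 12.89

def final_inequality_holds(k):
    """5.08 + 12.89*sqrt(k) <= sqrt(200(k+1) - 1), in our reading of the text."""
    return 5.08 + 12.89 * math.sqrt(k) <= math.sqrt(200.0 * (k + 1) - 1.0)

print(integral, midpoint, const_term, slope)
print(all(final_inequality_holds(k) for k in range(1, 100001)))
```

The constant term uses $1/\sqrt{n} \le 1/\sqrt{n-1}$ to put all three contributions over the common denominator $\sqrt{n-1}$.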
Exercise 7.7 (An alternative argument for Theorem 7.15 in the real case). Let $f : S^{n-1} \to \mathbb{R}$ be a 1-Lipschitz function. Denote by $M_f$ the median of $f$, and consider $T = \{f = M_f\}$. Let $\varepsilon > 0$ such that $n\varepsilon^2 \ge 12$, and $k$ an integer such that $k + 1 \le \frac{1}{6} \varepsilon^2 n$.
(i) For $\alpha \in (0, \pi/2)$, let $T_\alpha = \{x \in S^{n-1} : \mathrm{dist}(x, T) \le \alpha\}$, where distance refers to the geodesic metric. Show that $\sigma(S^{n-1} \setminus T_\alpha) \le \exp(-n\alpha^2/2)$. We now set $\alpha = \sqrt{2 \log 2}/\sqrt{n}$, so that $\sigma(T_\alpha) \ge 1/2$.
(ii) Show that if $B \subset S^{n-1}$ satisfies $\sigma(B) \ge 1/2$, then $w(S^{n-1} \setminus B_\beta) \le \frac{1 + \cos\beta}{2}$.
(iii) Let $A = S^{n-1} \setminus T_\varepsilon$. Check that the assumptions on $n, k, \varepsilon$ imply the inequality $\frac{1 + \cos(\varepsilon - \alpha)}{2} \le \sqrt{1 - (k+1)/n}$, and conclude from (ii) that $w_G(A) < \kappa_{n-k}$.
(iv) Using Proposition 6.42, conclude that with positive probability, a random k-
0
$h(x) = \max\{ |f(e^{i\theta} x) - f(x)| : \theta \in [0, 2\pi] \}$.)
clidean if $d_{BM}(K, B_2^{\dim K}) \le C$ (the Banach–Mazur distance $d_{BM}$ and the geometric distance $d_g$ were defined in (4.2) and (4.1)). It is customary to separate the situation where $C$ is controlled, but possibly large (the “isomorphic” theory), from the situation where $C = 1 + \varepsilon$ with $\varepsilon \ll 1$ or at least “sufficiently small” (the “almost isometric” theory). Still another aspect is when $\varepsilon = 0$ (the “isometric” theory), which is quite different in nature and hardly mentioned in this book (with the exception of Section 11.1).
The goal of this section, and of the following ones, is to give upper and lower bounds on the maximal possible dimension of a subspace $E \subset \mathbb{R}^n$ such that $K \cap E$ is $C$-Euclidean (when $K$ is symmetric, we restrict ourselves to subspaces through the origin, so that the results can be translated in terms of subspaces of normed
If $\|\cdot\|$ is a norm on $\mathbb{R}^n$ (or $\mathbb{C}^n$), the Dvoretzky dimension of $X = (\mathbb{R}^n, \|\cdot\|)$ is defined as the Dvoretzky dimension of its unit ball, or equivalently as
\[
k_*(X) = (M/b)^2 n,
\]
where $b$ is the smallest number such that $\|\cdot\| \le b|\cdot|$ and $M = \mathbf{E}\, \|X\|$, where $X$ is a random variable uniformly distributed on $S^{n-1}$.
We note that $b$ corresponds to the maximum value of $\|\cdot\|$ over the Euclidean sphere, while $M$ is the average value. Hence we always have $M \le b$, thus $k_* \le n$. Note also the inequality $k_*(K) \le d_g(K, L)^2\, k_*(L)$ for a pair of convex bodies $K, L$ (the ratio $M/b$ changes by at most a factor $d_g(K, L)$ when passing from $L$ to $K$). We should think of $k_*$ as a quantity meaningful only up to an (absolute) multiplicative constant. Likewise, in order not to obscure the arguments, we will sometimes pretend in what follows that $k_*$ and similar expressions are integers.
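The definition $k_*(X) = (M/b)^2 n$ lends itself to a quick Monte Carlo experiment. The following sketch is illustrative only: the function names, the sample size, the seed and the dimension $n = 200$ are our own choices. It estimates $M$ by averaging the norm over random points of $S^{n-1}$ and uses the exact values $b = \sqrt{n}$ for the $\ell_1$ norm and $b = 1$ for the $\ell_\infty$ norm. One should see a Dvoretzky dimension proportional to $n$ (close to $2n/\pi$) in the first case and of order $\log n$ in the second.

```python
import math
import random

def random_sphere_point(n, rng):
    """Uniform point on S^{n-1}, obtained by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]

def dvoretzky_dimension(norm, b, n, samples=2000, seed=0):
    """Monte Carlo estimate of k_*(X) = (M/b)^2 n for X = (R^n, norm)."""
    rng = random.Random(seed)
    mean = sum(norm(random_sphere_point(n, rng)) for _ in range(samples)) / samples
    return (mean / b) ** 2 * n

def l1(x):
    return sum(abs(t) for t in x)

def linf(x):
    return max(abs(t) for t in x)

n = 200
# b = sqrt(n) for the l_1 norm and b = 1 for the l_infinity norm (maximum over the sphere)
print("l_1      :", dvoretzky_dimension(l1, math.sqrt(n), n), "vs 2n/pi =", 2 * n / math.pi)
print("l_infty  :", dvoretzky_dimension(linf, 1.0, n), "vs log n =", math.log(n))
```

Because the norm is concentrated around its mean on the sphere, a few thousand samples already give a stable estimate of $M$.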
The Dvoretzky dimension of a convex body $K \subset \mathbb{R}^n$ depends on the choice of the underlying Euclidean structure. The remarkable fact is that the following two quantities are equivalent up to multiplicative universal constants (see Exercise 7.10
the Dvoretzky dimension is usually easily computed. We illustrate this in the case of $\ell_p$ spaces and Schatten norms in Section 7.2.4. However, the following Theorem 7.19 is also of interest when applied to abstract norms. For example, it implies the celebrated fact that any high-dimensional convex body has sections which are
and consequently
\[
d_g(K \cap E, B_2^E) \le \frac{1 + \varepsilon}{1 - \varepsilon},
\]
where $d_g$ denotes the geometric distance as defined in (4.1) and $B_2^E = B_2^n \cap E$.
Proof. This is a straightforward consequence of Theorem 7.15, applied to the function $f(x) = \|x\|_K$, which is a $b$-Lipschitz function (and, moreover, is circled in the complex case). Indeed, provided $k \le cn(\varepsilon M/b)^2$, we obtain with probability larger than $1 - \exp(-c'(\varepsilon M)^2 n)$ that $\mathrm{osc}(\|\cdot\|_K, S_E, M) \le \varepsilon M$, which is equivalent to (7.12).
Remark 7.20. A simple $\varepsilon$-net argument combined with a little trick (see Exercise 7.11) gives a version of the complex case of Theorem 7.19 with a slightly worse dependence on $\varepsilon$, but without the assumption that $K$ is circled.
Remark 7.21 (about the dependence on $\varepsilon$). We now comment about the sharpness of Theorem 7.19. First, the isomorphic version (for macroscopic $\varepsilon$) is always sharp: the dimension of generic 2-Euclidean sections can never exceed $k_*(K)$ (see Exercise 7.12). Second, one can construct norms for which the dependence on $\varepsilon$ is sharp (see Exercise 7.20). However, for some natural and interesting instances the dependence on $\varepsilon$ can be improved (we will see a very important example in Chapter 8, connected to the additivity conjecture; see Remark 8.21).
Remark 7.23. In the real case, the conclusion of Theorem 7.19 also holds for a gauge (i.e., without the symmetry assumption on $K$). Moreover, a derivation from the Chevet–Gordon inequalities allows for a more direct proof and gives a better
Using the inequality $\kappa_k/\sqrt{k} \le \kappa_n/\sqrt{n}$ (Proposition A.1), we are led to
\[
\kappa_n \Bigl( M - b\sqrt{\frac{k}{n}} \Bigr) \le \mathbf{E} \min_{x \in S^{k-1}} \max_{y \in K^\circ} \langle Bx, y \rangle \le \mathbf{E} \max_{x \in S^{k-1}} \max_{y \in K^\circ} \langle Bx, y \rangle \le \kappa_n \Bigl( M + b\sqrt{\frac{k}{n}} \Bigr)
\]
and the existence of a subspace $E = B(\mathbb{R}^k)$ satisfying (7.13) follows.
Due to the duality between sections and projections of convex bodies (see (1.12) and (1.13)), Theorem 7.19 admits a dual formulation via projections onto subspaces.
Corollary 7.24. Let $K$ be a convex body in $\mathbb{R}^n$, and $\varepsilon > 0$. Provided $k \le c\varepsilon^2 k_*(K^\circ)$, a random $k$-dimensional subspace $E$ satisfies with large probability
\[
(1 - \varepsilon)\, w(K)\, B_2^E \subset P_E K \subset (1 + \varepsilon)\, w(K)\, B_2^E .
\]
Remark 7.25 (Geometric interpretation of the $MM^*$-estimate). Let $K \subset \mathbb{R}^n$ be a symmetric convex body and let $k \le c\varepsilon^2 \min(k_*(K), k_*(K^\circ))$. We know then from Theorem 7.19 and Corollary 7.24 that for a random subspace $E \in \mathrm{Gr}(k, \mathbb{R}^n)$, the section $K \cap E$ is $(1+\varepsilon)$-close to a Euclidean ball of radius $w(K^\circ)^{-1}$ while the projection $P_E K$ is $(1+\varepsilon)$-close to a Euclidean ball of radius $w(K)$; the ratio of these radii is the quantity $w(K) w(K^\circ)$ which appears in Theorem 7.10. In particular, if $K$ is in the $\ell$-position, the radius of a typical $k$-dimensional projection only exceeds the radius of a typical $k$-dimensional section by a $\log(n)$ factor. However it is not clear whether the $\ell$-position is always compatible with the conditions $k_*(K) \gg 1$ and $k_*(K^\circ) \gg 1$ (see Problem 7.26).
Problem 7.26. Does there exist, for every symmetric convex body $K \subset \mathbb{R}^n$, a subspace $E$ of dimension $c \log n$ such that
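$(k-1)/A^2$.
Exercise 7.11 (Almost spherical sections discretized). (i) Let $\mathcal{N}$ be a $\delta$-net in $(S_{\mathbb{C}^n}, |\cdot|)$, and $\|\cdot\|$ a norm on $\mathbb{C}^n$ such that
\[
\forall x \in \mathcal{N}, \quad 1 - \alpha \le \|x\| \le 1 + \beta.
\]
Show that
\[
\forall x \in S_{\mathbb{C}^n}, \quad 1 - \alpha - \frac{\delta(1 + \beta)}{1 - \delta} \le \|x\| \le \frac{1 + \beta}{1 - \delta}. \tag{7.14}
\]
(ii) Use (i) to show that, when $k \le c\varepsilon^2 \log^{-1}(1/\varepsilon)\, k_*(K)$, the conclusion from Theorem 7.15 can be derived via the elementary net argument that led to (7.9).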
(i) Show that there is an orthogonal decomposition of $\mathbb{R}^n$ as the direct sum of $\lceil n/k \rceil$ subspaces, each of them satisfying (7.15).
(ii) Show that for every $x \in \mathbb{R}^n$, $\|x\| \le 2M \sqrt{\lceil n/k \rceil}\, |x|$.
(iii) Conclude that $k \le C(M/b)^2 n$ for some absolute constant $C$.
7.2.3. The Figiel–Lindenstrauss–Milman inequality. In this section we will derive, as a consequence of Theorem 7.19, a useful inequality due to Figiel–Lindenstrauss–Milman which can be interpreted as follows: complexity (of any convex body) must lie somewhere.
Fix a convex body $K \subset \mathbb{R}^n$ containing the origin in the interior. Define the verticial dimension of $K$ as
\[
\dim_V(K) = \log \inf \{ N : \text{there is a polytope } P \text{ with } N \text{ vertices s.t. } K \subset P \subset 4K \}
\]
and the facial dimension of $K$ as
\[
\dim_F(K) = \log \inf \{ N : \text{there is a polytope } Q \text{ with } N \text{ facets s.t. } K \subset Q \subset 4K \}.
\]
The number 4 plays no special role in these definitions; all the results below are only affected in the values of the constants if 4 is replaced by another number larger than 1 (see Exercise 7.15). The basic properties of these concepts are gathered in Proposition 7.27.
(i) for any $T \in GL(n, \mathbb{R})$, we have $\dim_V(TK) = \dim_V(K)$ and $\dim_F(TK) = \dim_F(K)$,
$\dim_V K$,
(iv) if $K$ has centroid at the origin, then $\dim_V(K) \le Cn$ and $\dim_F(K) \le Cn$ for
We note that the verticial and facial dimensions are linearly invariant but not
We have $a(K) = d_{BM}(K, B_2^n)$ if $K$ is centrally symmetric. The following lemma
gives a simple connection between asphericity and verticial (resp., facial) dimension.
It is an immediate consequence of Proposition 5.6.
Lemma 7.28. Let $K \subset \mathbb{R}^n$ be a convex body containing the origin in the interior.
Then
\[ \dim_V(K)\, a(K)^2 \ge \frac{n-1}{32}, \qquad \dim_F(K)\, a(K)^2 \ge \frac{n-1}{32}. \]
When combined with Dvoretzky’s theorem, the inequalities from Lemma 7.28
give a much sharper result.
Theorem 7.29 (Figiel–Lindenstrauss–Milman inequality). For any convex body
$K \subset \mathbb{R}^n$ containing the origin in the interior we have
\[ \dim_F(K)\, \dim_V(K)\, a(K)^2 \ge c n^2 \tag{7.17} \]
where $c > 0$ is an absolute constant.
Proof. We may assume that $rB_2^n \subset K \subset RB_2^n$ with $R/r = a(K)$. Let
$M = \mathbf{E}\|X\|_K$ and $M^* = \mathbf{E}\|X\|_{K^\circ}$, where $X$ is a random vector uniformly
distributed on the unit sphere.
We apply Theorem 7.19 to $K$ for $\varepsilon = 1/2$ (say). This yields a subspace $E \subset \mathbb{R}^n$
of dimension $c(rM)^2 n$ such that
\[ \frac{M}{2} B_2^E \subset K \cap E \subset \frac{3M}{2} B_2^E. \]
It follows (using Proposition 7.27(iii) and Lemma 7.28) that
$\dim_F(K) \ge \dim_F(K \cap E) \ge c(rM)^2 n$ for an absolute constant $c > 0$. We apply the
same argument to $K^\circ$ (note that $R^{-1}B_2^n \subset K^\circ$) and obtain that
$\dim_F(K^\circ) \ge c(M^*/R)^2 n$. Since $\dim_V(K) = \dim_F(K^\circ)$, it follows that
$K \subset \mathbb{R}^n$.
Corollary 7.30. Let $P \subset \mathbb{R}^n$ be a symmetric polytope with $n_1$ vertices and
$n_2$ faces. Then
we compute the Dvoretzky dimension for the unit balls with respect to the most
standard norms: the commutative and non-commutative p-norms. Unless specified
otherwise, the statements refer to both the real and the complex case.
7.2.4.1. $\ell_p$ norms. Let $B_p^n$ denote the unit ball (in either $\mathbb{R}^n$ or $\mathbb{C}^n$) for the
norm $\|\cdot\|_p$, where $p \in [1, \infty]$. We also define the conjugate exponent $q \in [1, \infty]$ by
the relation $p^{-1} + q^{-1} = 1$. Recall that $(B_p^n)^\circ = B_q^n$.
Theorem 7.31. The Dvoretzky dimension of $B_p^n$ is of the following order:
\[ k_*(B_p^n) \simeq \begin{cases} n & \text{if } 1 \le p \le 2, \\ p\, n^{2/p} & \text{if } 2 \le p \le \log n, \\ \log n & \text{if } \log n \le p \le \infty. \end{cases} \]
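To make the three regimes of Theorem 7.31 concrete, here is a minimal Python sketch that evaluates the stated order of magnitude. The function name `dvoretzky_order_lp` is hypothetical, and every absolute constant hidden in "$\simeq$" is set to 1, which is an assumption made purely for illustration; the output is an order of magnitude, not the value of $k_*(B_p^n)$.

```python
import math

def dvoretzky_order_lp(n: int, p: float) -> float:
    """Order of magnitude of k_*(B_p^n) as stated in Theorem 7.31.

    All absolute constants are set to 1 (illustrative normalization only).
    """
    if n < 3:
        raise ValueError("need n >= 3 so that log n > 1")
    if p < 1:
        raise ValueError("p must lie in [1, inf]")
    log_n = math.log(n)
    if p <= 2:
        return float(n)            # regime 1 <= p <= 2
    if p <= log_n:
        return p * n ** (2.0 / p)  # regime 2 <= p <= log n
    return log_n                   # regime log n <= p <= inf

n = 10**6
for p in (1.0, 2.0, 4.0, math.log(n), math.inf):
    print(p, dvoretzky_order_lp(n, p))
```

At the transition $p = \log n$ the middle formula gives $e^2 \log n$, which agrees with the third regime up to a constant, as the theorem requires.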
Remark 7.32. We emphasize that the constants implicit in the relations "$\simeq$"
do not depend on $p$ (in addition to not depending on $n$). The proof actually shows
that, for fixed $p$ and as $n$ tends to $\infty$, $w(B_q^n) \sim n^{1/p-1/2}\|g\|_{L_p}$, where $g$ is a standard
(5.63).
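Proof. We treat the real case, the complex case being similar. Let $q \in [1, \infty]$
be such that $1/p + 1/q = 1$. By Definition 7.18, we have
\[ k_*(B_p^n) = n \,\mathrm{inrad}(B_p^n)^2\, w(B_q^n)^2. \]
\[ \mathrm{inrad}(B_p^n) = \begin{cases} n^{1/2-1/p} & \text{if } 1 \le p \le 2, \\ 1 & \text{if } 2 \le p \le \infty, \end{cases} \]
and
\[ \mathbf{E}\|x\|_p = w(B_q^n) \simeq \begin{cases} \sqrt{p}\, n^{1/p-1/2} & \text{if } 1 \le p \le \log n, \\ \sqrt{\log n}/\sqrt{n} & \text{if } \log n \le p \le \infty. \end{cases} \tag{7.18} \]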
Consider first the case $p \ge 2$. Then $\|\cdot\|_p$ is 1-Lipschitz (with respect to the Euclidean
metric) and so by Proposition 5.34 and Theorem 5.24
\[ \mathbf{P}(X - \mathbf{E}X > t) \le \mathbf{P}(X - M > t) \le \mathbf{P}(g_1 > t) \quad \text{for all } t > 0, \]
where $M$ is the median of $X$. In particular, we have
\[ \mathbf{E}\big((X - \mathbf{E}X)_+^p\big) = \int_0^\infty p\, t^{p-1}\, \mathbf{P}(X - \mathbf{E}X > t)\, dt \le \mathbf{E}(g_1^+)^p \le \Big(\frac{p}{e}\Big)^{p/2} \]
(see (A.1) or (5.63)) and so
\[ \|(X - \mathbf{E}X)_+\|_{L_p} \le \sqrt{p/e}. \tag{7.20} \]
Since $\|X\|_{L_p} - \|(X - \mathbf{E}X)_+\|_{L_p} \le \mathbf{E}X \le \|X\|_{L_p}$, it follows from (7.19) and (7.20)
that $w((B_p^n)^\circ) = \Theta(\sqrt{p}\, n^{1/p-1/2})$ whenever $2 \le p \le \log n$. For $\log n \le p \le \infty$, we
have $\|\cdot\|_\infty \le \|\cdot\|_p \le e\|\cdot\|_\infty$, so that it suffices to prove the second part of (7.18) for
$p = \infty$. This is exactly (modulo the relation between the spherical and Gaussian
means) the content of Lemma 6.1, which asserts that, in the present notation,
$\mathbf{E}\|G\|_\infty \sim \sqrt{2\log n}$.
If $1 \le p < 2$, $\|\cdot\|_p$ is $n^{1/p-1/2}$-Lipschitz and an argument along the same lines
yields
\[ \|(X - \mathbf{E}X)_+\|_{L_p} \le n^{1/p-1/2}\sqrt{p/e}. \tag{7.21} \]
Combining this with (7.19) shows that $\mathbf{E}X = \Theta(n^{1/p})$ for $1 \le p < 2$, whence (7.18)
for that range of $p$ readily follows.
While the above argument relies heavily on tools specific to the Gaussian case,
most of its elements can be carried over to a much more general setting. An example
for the dimension of nearly Euclidean subspaces implied by Theorem 7.31 are sharp
in the following sense: for $2 < p < \infty$, if some $k$-dimensional subspace $E \subset \mathbb{R}^n$ is
such that $d_{BM}(B_p^n \cap E, B_2^k) \le 2$, then $k \le C p\, n^{2/p}$, where $C$ is an absolute constant
(see Exercise 7.19).
Remark 7.34 (Euclidean sections of $\ell_\infty^n$). The case of $\ell_\infty^n$ deserves a special
mention since almost Euclidean subspaces of $\ell_\infty^n$ are closely related to $\varepsilon$-nets in the
unit Euclidean sphere. It is easily checked (see Exercise 7.18) that the following
(ii) There exist $n$ points $x_1, \dots, x_n$ in $S^{k-1}$ such that
Remark 7.36. (i) The optimal dependence as $\alpha$ tends to 0 in Theorem 7.35
is $A(\alpha) = \Theta\big((\log(2/\alpha)/\alpha)^{1/2}\big)$. The upper bound will be shown in Section 7.2.6.2,
where the Theorem is proved; see Exercise 7.16 for a slightly weaker lower bound.
An alternative approach to Theorem 7.35 (with a simpler proof, but worse dependence
on $\alpha$) is via Theorem 7.42. (ii) In the context of Theorem 7.35, the parameter
$A$ is often called the distortion (of the $\ell_1$-norm over $E$). However, an alternative
(and arguably better, see Section 7.2.7) definition of the distortion of a subspace
$E \subset \mathbb{R}^n$ is the ratio between the maximum and the minimum of the function $\|x\|_1$
over $S^{n-1} \cap E$. This is because for $A < \sqrt{\pi/2}$ the inequality $\|x\|_1 \ge A^{-1}\sqrt{n}\,|x|$ may
hold for all $x \in E$ only if $\dim E$ is small (depending on $A$) [SW].
Exercise 7.16 (Simple lower bound on $\ell_1$ distortion). Let $a \in (0, 1]$ and let
$E \subset \mathbb{R}^n$ be a subspace such that the inequality $\|x\|_1 \ge a\sqrt{n}\,|x|$ is satisfied for all
$x \in E$. Show that the codimension of $E$ is at least $a^2 n - 1$. (This is elementary.)
Conclude that the optimal $A(\alpha)$ in Theorem 7.35 satisfies $A(\alpha) = \Omega(1/\sqrt{\alpha})$.
$(Y_1, \dots, Y_n)$. Show that $\mathbf{E}\|Y\|_p \le A\sqrt{p}\, n^{1/p}$ for $1 \le p < +\infty$ and that
$\mathbf{E}\|Y\|_\infty \le CA\sqrt{\log n}$.
Exercise 7.18 (Optimal almost spherical sections of the cube). Show the
equivalence (i) $\iff$ (ii) in Remark 7.34.
Prove that $k_*(T(B_p^n)) \le C p\, n^{2/p}$ for any $T \in GL(n, \mathbb{R})$. (ii) Conclude using Exercise
7.10.
Exercise 7.21. Fix integers $m, n \ge 1$ and consider the convex body obtained
as the $\ell_1$-sum of $m$ copies of $B_2^n$:
\[ K = \{(x_1, \dots, x_m) \in (\mathbb{R}^n)^m : |x_1| + \cdots + |x_m| \le 1\}. \]
Show that $k_*(K) \ge cnm$ for some absolute constant $c > 0$.
Exercise 7.22. Show that for every $\varepsilon > 0$, there is a polytope $P$ with at most
$\exp(Cn/\varepsilon^2)$ vertices and at most $\exp(Cn/\varepsilon^2)$ facets, such that $(1-\varepsilon)B_2^n \subset P \subset B_2^n$.
7.2.4.2. Schatten norms. We now consider the Schatten $p$-norms, for $p \in [1, \infty]$.
Recall that $S_p^{m,n}$ is the corresponding unit ball in the space of (real or complex)
$m \times n$ matrices, and $S_p^{n,sa}$ is its analogue for the space of (real or complex)
self-adjoint $n \times n$ matrices. Also recall (see Corollary 1.18) that $(S_p^{m,n})^\circ = S_q^{m,n}$ and
$(S_p^{n,sa})^\circ = S_q^{n,sa}$, where $q \in [1, \infty]$ is defined by $p^{-1} + q^{-1} = 1$.
Theorem 7.37 (Dvoretzky dimension for Schatten norms). Consider two integers
$m \le n$, and $p \in [1, \infty]$. The Dvoretzky dimension of $S_p^{m,n}$ satisfies
\[ k_*(S_p^{m,n}) \simeq \begin{cases} mn & \text{if } 1 \le p \le 2, \\ m^{2/p}\, n & \text{if } 2 \le p \le \infty. \end{cases} \]
Moreover, in the case $m = n$, the same estimates are true for $k_*(S_p^{n,sa})$.
Remark 7.38. We emphasize again that the constants implicit in the $\simeq$ notation
are absolute and do not depend on $p, m, n$. Moreover, the proof allows one to
describe the precise asymptotic behavior of $k_*(S_p^{m,n})$ and $k_*(S_p^{n,sa})$ (i.e., relations
"$\sim$" in place of "$\simeq$," with reasonably explicit constants), see Exercise 7.23.
Proof. We focus primarily on the real case, the complex case being similar.
Let $q \in [1, \infty]$ be such that $1/p + 1/q = 1$. We have (see Definition 7.18)
\[ \mathrm{inrad}(S_p^{m,n}) = \begin{cases} m^{1/2-1/p} & \text{if } 1 \le p \le 2, \\ 1 & \text{if } 2 \le p \le \infty \end{cases} \]
and
\[ \mathbf{E}\|A\|_p = w(S_q^{m,n}) = \Theta(m^{1/p-1/2}), \tag{7.25} \]
sphere in $M_{m,n}$. The inradius is the same as in the commutative case: we are just
comparing the $\ell_p$-norm and the $\ell_2$-norm of the sequence of singular values of a
matrix (see (1.29); the comparison is formalized in (1.31)). In turn, (7.25) will be
obtained by combining well-known properties of random matrices with the relation
(A.7) between the spherical and the Gaussian mean. To that end, we note first
that once we show the following one-sided bounds for the extreme values of $p$
\[ \text{(i)}\ \ \mathbf{E}\|A\|_\infty \lesssim m^{-1/2} \quad \text{and} \quad \text{(ii)}\ \ \mathbf{E}\|A\|_1 \gtrsim m^{1/2}, \]
the remaining cases will follow by appealing again to the inequalities (1.31) relating
different Schatten $p$-norms
\[ m^{1/p-1}\|\cdot\|_1 \le \|\cdot\|_p \le m^{1/p}\|\cdot\|_\infty. \]
Next, we know from the duality $(S_1^{m,n})^\circ = S_\infty^{m,n}$ and from Exercise 4.37 that
\[ w(S_1^{m,n})\, w(S_\infty^{m,n}) \ge 1, \]
so that (ii) follows from (i) with $c = 1/C$. Finally, to justify (i), introduce a standard
Gaussian vector $B$ in $M_{m,n}$, so that $\mathbf{E}\|A\|_\infty = \kappa_{mn}^{-1}\, \mathbf{E}\|B\|_\infty$ (in the complex case
replace $\kappa_{mn}$ by $\kappa_{mn}^{\mathbb{C}}$). Note that the random matrix $W = BB^\dagger$ is a Wishart matrix,
which allows us to use the results from Section 6.2.3. We know (see Proposition 6.31 for
the complex case and Corollary 6.38 for the real case) that $\mathbf{E}\|B\|_\infty \le \sqrt{m} + \sqrt{n}$.
Since $\kappa_{mn} \sim \sqrt{mn}$, this shows (i), completes the proof of (7.25) and, consequently,
of the part of the Theorem concerning $k_*(S_p^{m,n})$.
The self-adjoint version can be treated exactly the same way, using estimates
on the norm of GOE/GUE matrices (Proposition 6.24); recall that the GOE (resp.,
GUE) is essentially the standard Gaussian vector in the space of real symmetric
(resp., complex self-adjoint) matrices.
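The bookkeeping in the proof above can be checked mechanically: multiplying the dimension $mn$ by the squared inradius and by the squared mean width from (7.25) should reproduce the two regimes of Theorem 7.37. The following Python sketch does this with all implicit constants set to 1 (an assumption for illustration); the function names are hypothetical and not taken from the text.

```python
import math

def schatten_inradius(m: int, p: float) -> float:
    """inrad(S_p^{m,n}) as stated in the proof of Theorem 7.37."""
    return m ** (0.5 - 1.0 / p) if p <= 2 else 1.0

def schatten_mean_width_order(m: int, p: float) -> float:
    """Order of w(S_q^{m,n}) from (7.25), with the Theta-constant set to 1."""
    return m ** (1.0 / p - 0.5)

def dvoretzky_order_from_proof(m: int, n: int, p: float) -> float:
    """(dimension) * inradius^2 * (mean width of the polar)^2, following the
    pattern of Definition 7.18 as it is used in the proof."""
    return m * n * schatten_inradius(m, p) ** 2 * schatten_mean_width_order(m, p) ** 2

def theorem_7_37_order(m: int, n: int, p: float) -> float:
    """The order stated in Theorem 7.37 (constants set to 1)."""
    return float(m * n) if p <= 2 else m ** (2.0 / p) * n

for p in (1.0, 1.5, 2.0, 3.0, 10.0, math.inf):
    print(p, dvoretzky_order_from_proof(20, 50, p), theorem_7_37_order(20, 50, p))
```

For $p \le 2$ the factors $m^{1-2/p}$ and $m^{2/p-1}$ cancel, leaving $mn$; for $p \ge 2$ only the mean width contributes, leaving $m^{2/p} n$.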
Exercise 7.23 (Sharp bounds for mean widths of Schatten balls). We consider
either the real or the complex case. (i) Fix $p \in [1, \infty]$ and let $n, s$ tend to infinity
in such a way that $\lim \frac{s}{n} = \lambda \in [1, \infty)$. Show that the quantity $\mathbf{E}\|A\|_p = w(S_q^{n,s})$
appearing in (7.25) is equivalent to $\alpha_p \lambda^{-1/2} n^{1/p-1/2}$, where $\alpha_p$ is defined by
$\alpha_p^p = \int |x|^{p/2}\, d\mu_{MP(\lambda)}(x)$ for $1 \le p < \infty$, and $\alpha_\infty = 1 + \sqrt{\lambda}$. (One can check that the
product $\alpha_p \lambda^{-1/2}$ is bounded away from 0 and $+\infty$.) (ii) Fix $p \in [1, \infty]$. Show that,
as $n$ tends to infinity, the quantity $w(S_q^{n,sa})$ is equivalent to $\beta_p n^{1/p-1/2}$, where $\beta_p$
is defined by $\beta_p^p = \int_{-2}^{2} |x|^p\, d\mu_{SC}(x)$ for $1 \le p < \infty$, and $\beta_\infty = 2$.
Exercise 7.24 (Uniformly bounded volume ratio for Schatten balls, 1 ď p ď 2).
Deduce that the convex bodies $S_p^{m,n}$ have a (uniformly) bounded volume ratio if
$1 \le p \le 2$. (See Section 7.2.6.1 for the definition.)
Let $m \le n$ be integers, let $p \in [1, \infty]$, and suppose that $E \subset M_{m,n}$ is a $k$-dimensional
subspace such that $d_{BM}(E \cap S_p^{m,n}, B_2^k) \le 2$. The goal of this exercise is to show
that
\[ k \le C k_*(S_p^{m,n}), \tag{7.26} \]
where $C$ is an absolute constant. This shows that, for isomorphically Euclidean sections,
the Dvoretzky dimension gives a sharp bound. (Note, however, that the hypothesis
$d_{BM}(E \cap S_p^{m,n}, B_2^k) \le 1 + \varepsilon$ does not imply that $k \le C\varepsilon^2 k_*(S_p^{m,n})$; exploiting this
"room for improvement" will be crucial in Chapter 8, see Remark 8.21.) Note that
(7.26) holds trivially when $1 \le p \le 2$.
(i) Show that there is a constant $C_0$ and a polytope $P$ with at most $C_0^{m+n}$ vertices
such that $P \subset S_1^{m,n} \subset 2P$.
(ii) Using (i) and Remark 7.34, show that (7.26) holds when $p = \infty$.
(iii) Assume now that $2 \le p < \infty$, and suppose that $d_{BM}(E \cap S_p^{m,n}, B_2^k) \le 2$.
Show that $k_*(E \cap S_\infty^{m,n}) \ge ck/n^{2/p}$, and (using the previous question) that
$k \le C k_*(S_p^{m,n})$.
200 7. SOME TOOLS FROM ASYMPTOTIC GEOMETRIC ANALYSIS
Corollary 7.40 (Dvoretzky's theorem). There is a constant $c > 0$ such that
the following holds. Let $K$ be a symmetric convex body in $\mathbb{R}^n$ (for some $n \in \mathbb{N}$) and
let $\varepsilon > 0$. Then there exists a subspace $E \subset \mathbb{R}^n$ of dimension at least $c\varepsilon^2 \log n$ such
that
\[ d_g(K \cap E, B_2^E) \le 1 + \varepsilon, \]
where $B_2^E$ is the Euclidean unit ball in $E$.
If $K$ is a non-symmetric convex body, the same conclusion holds for some
$k$-dimensional affine subspace $E$ and the corresponding notion of the distance.
Proof of Corollary 7.40. If $K$ is in John position, the conclusion follows
immediately from Proposition 7.39 and Theorem 7.19. For a general convex body
$K \subset \mathbb{R}^n$, we know from Proposition 4.7 that there is a linear map $T$ such that $TK$ is
in John position. Therefore there exists a subspace $E$ with dimension $c\varepsilon^2 \log n$ such
that $d_g(T(K) \cap E, B_2^E) \le 1 + \varepsilon$. It follows that there is an ellipsoid $\mathcal{E} \subset T^{-1}(E)$
such that
\[ \mathcal{E} \subset K \cap E \subset (1 + \varepsilon)\mathcal{E}. \]
We now use the result from Exercise 1.25 to conclude that $\mathcal{E}$ can be replaced
by a multiple of the Euclidean ball if we replace $E$ by a subspace $F \subset E$ with
The key estimate needed for the proof of Proposition 7.39 is the following
lemma, known as the Dvoretzky–Rogers lemma.
for any $1 \le k \le n$,
\[ \|x_k\|_K \ge \sqrt{k/n}. \]
Proof. The Lemma is a consequence of the following claim: under the hypotheses
of the Lemma, any $m$-dimensional subspace $F \subset \mathbb{R}^n$ contains a vector $x$
with $|x| = 1$ and $\|x\|_K \ge \sqrt{m/n}$. Indeed, we construct successively $x_n, \dots, x_1$ and
obtain $x_k$ by applying the claim to the subspace orthogonal to $\{x_i : i > k\}$.
To prove the claim, consider a resolution of identity $(c_i, x_i)$ given by Proposition
4.7. Recall that $x_i \in \partial K \cap B_2^n$ are contact points; in particular it follows that $K$
is contained in each half-space $\{\langle \cdot, x_i \rangle \le 1\}$, or that $\|\cdot\|_K \ge \langle \cdot, x_i \rangle$. Given an
$m$-dimensional subspace $F \subset \mathbb{R}^n$, we have
\[ P_F = \sum c_i\, P_F |x_i\rangle\langle x_i|. \]
Taking the trace gives $m = \sum c_i |P_F x_i|^2$. Since $\sum c_i = n$, there exists an index $j$
with $|P_F x_j| \ge \sqrt{m/n}$. Let $x = P_F x_j / |P_F x_j|$. We have
\[ \|x\|_K \ge \langle x, x_j \rangle = |P_F x_j| \ge \sqrt{m/n}. \]
We can now complete the proof of Proposition 7.39.
Proof of Proposition 7.39. Let $K$ be a convex body in John position, and
let $X$ be a random vector uniformly distributed on $S^{n-1}$. Since $\mathrm{inrad}(K) = 1$, it
suffices to prove in view of Definition 7.18 that
\[ \mathbf{E}\|X\|_K \ge c\sqrt{\log n / n} \]
for some constant $c$.
We know from Lemma 7.41 that there exists an orthonormal family of $n/4$
vectors $(x_i)$ with $\|x_i\|_K \ge 1/2$. In particular, we have $\|\cdot\|_K \ge \frac{1}{2}\max\{\langle \cdot, x_i \rangle : 1 \le i \le n/4\}$.
Consequently, if $G$ denotes a standard Gaussian vector in $\mathbb{R}^n$, then
\[ \mathbf{E}\|X\|_K = \frac{1}{\kappa_n}\, \mathbf{E}\|G\|_K \ge \frac{1}{2\kappa_n}\, \mathbf{E}\max\{\langle G, x_i \rangle : 1 \le i \le n/4\}. \]
The random variables $\langle G, x_i \rangle$ are i.i.d. standard normal variables, and therefore the
expectation of their maximum is of order $\sqrt{\log n}$ by Lemma 6.1, as needed.
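The last step of the proof rests on the fact that the expected maximum of $m$ independent standard normal variables grows like $\sqrt{2\log m}$. A small Monte Carlo sketch in Python illustrates this; the function name, the sample sizes and the seed are arbitrary choices made here for illustration, and the comparison is only approximate because the convergence to $\sqrt{2\log m}$ is slow.

```python
import math
import random

def mean_max_gaussian(m: int, trials: int, rng: random.Random) -> float:
    """Monte Carlo estimate of E max(g_1, ..., g_m) for i.i.d. N(0,1) variables."""
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(m))
    return total / trials

rng = random.Random(12345)  # fixed seed so that runs are reproducible
m = 1000
estimate = mean_max_gaussian(m, 200, rng)
ratio = estimate / math.sqrt(2.0 * math.log(m))
print(f"estimated E max = {estimate:.3f}, ratio to sqrt(2 log m) = {ratio:.3f}")
```

For moderate $m$ the ratio stays somewhat below 1, which is consistent with an asymptotic equivalence that is approached from below.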
Exercise 7.26 (Complex version of Dvoretzky's theorem). Check that Corollary
7.40 remains valid for a circled convex body $K \subset \mathbb{C}^n$.
Exercise 7.27 (Simultaneous spherical sections for a set and its polar). (i)
Show that the following holds for some constant $c > 0$: for every symmetric convex
body $K \subset \mathbb{R}^n$ there is a $k$-dimensional subspace $E \subset \mathbb{R}^n$ with $k = c \log n$ such
\[ \mathrm{vr}(K) = \left( \frac{\mathrm{vol}(K)}{\mathrm{vol}(\mathrm{John}(K))} \right)^{1/n}. \]
The quantity $\mathrm{vr}(K)$ is an affine invariant. Consequently, if $K = B_X$, it makes
sense to denote $\mathrm{vr}(X) = \mathrm{vr}(K)$. Examples of convex bodies with "bounded volume
ratio" (i.e., bounded by a dimension-independent constant) include $B_p^n$, $S_p^{m,n}$ and
Section 4.3.3 (Table 4.1, Exercises 4.39 and 4.40). For the Schatten spaces, the
boundedness follows from the proof of Theorem 7.37 (see also Exercise 7.24). The
following theorem asserts that bodies (resp., spaces) with bounded volume ratio
always have nearly Euclidean sections (resp., subspaces) of proportional dimension,
for arbitrary proportion $\alpha \in (0, 1)$.
Theorem 7.42 (not proved here). Let $K \subset \mathbb{R}^n$ be a convex body in John position
and denote $A = \mathrm{vr}(K)$. Let $E \subset \mathbb{R}^n$ be a random $k$-dimensional subspace. Then,
with probability larger than $1 - e^{-n}$,
\[ B_2^E \subset K \cap E \subset (CA)^{\frac{n}{n-k}} B_2^E, \]
where $C$ is an absolute constant.
$\dim E_1 = \dim E_2 = k$ and $E_1 \perp E_2$ such that
\[ c|x| \le n^{-1/2}\|x\|_1 \le |x| \quad \text{for } x \in E_i,\ i = 1, 2, \tag{7.27} \]
where $c > 0$ is a universal constant. Similarly, if $n = 3k$, there exist mutually
orthogonal $k$-dimensional subspaces $E_1, E_2, E_3$ such that the bounds from (7.27)
hold for $x \in E_i + E_j$, for any $\{i, j\} \subset \{1, 2, 3\}$.
The property expressed by (7.27) is usually referred to as the Kashin decomposition
of $\ell_1^n$. Another statement closely related to Theorem 7.42 is the following.
Theorem 7.44 (not proved here). Let $K \subset \mathbb{R}^n$ be a convex body in John position
and denote $A = \mathrm{vr}(K)$. There is an orthogonal transformation $U \in O(n)$ such that
\[ K \cap UK \subset 8A^2 B_2^n. \]
Exercise 7.28 (Volume ratio of subspaces). (a) Let $K \subset \mathbb{R}^n$ be a symmetric
convex body and $E \subset \mathbb{R}^n$ be a $k$-dimensional subspace. Show that
$\mathrm{vr}(K \cap E) \le (C\, \mathrm{vr}(K))^{n/k}$. (b) Give examples of symmetric convex bodies $K \subset \mathbb{R}^n$ and
subspaces $E \subset \mathbb{R}^n$ such that the ratio $\mathrm{vr}(K \cap E)/\mathrm{vr}(K)$ is arbitrarily large.
Exercise 7.29 (Kashin decomposition via volume ratio). (i) Derive Corollary
7.43 from Theorem 7.35. (ii) Show that the assertion of Corollary 7.43 holds for
spaces X with uniform bound on their volume ratios (i.e., with constant c depending
only on vrpXq).
Exercise 7.30 (A dual Kashin decomposition). Show that, for any $n \in \mathbb{N}$, there
$\sqrt{n}\, B_2^n$.
7.2.6.2. The low-$M^*$ estimate and the proof of Theorem 7.35. Let $K \subset \mathbb{R}^n$ be a
symmetric convex body. The argument from Exercise 7.12 shows that sections of $K$
"one half" of the estimates (7.12) persists: an avatar of the lower bound remains
valid for subspaces of proportional dimension.
Theorem 7.45 (Low-$M^*$ estimate). Let $K$ be either a convex body in $\mathbb{R}^n$ containing
0 in the interior or a circled convex body in $\mathbb{C}^n$, and $M^* = w(K)$. Let
$0 < \alpha < 1$ and $k = n(1 - \alpha)$. Then, with probability larger than $1 - \exp(-c\alpha n)$, a
random $k$-dimensional subspace $E$ satisfies
\[ \forall x \in E, \quad \frac{c\sqrt{\alpha}}{M^*}\, |x| \le \|x\|_K \tag{7.28} \]
where $c > 0$ is an absolute constant.
Proof. We give a proof (valid only in the real case) based on Proposition 6.42.
Consider $L = S^{n-1} \cap tK$ for $t > 0$ to be chosen later. We have
$w_G(L) \le w_G(tK) = t\, w_G(K) = t\kappa_n M^*$. We now choose $t$ such that $t\kappa_n M^* = \frac{1}{2}\kappa_{n-k}$; this implies
$t \ge c\sqrt{\alpha}/M^*$ for some $c > 0$ because $\kappa_m \sim \sqrt{m}$. Proposition 6.42 then implies
that, with high probability, a random subspace $E \in \mathrm{Gr}(k, \mathbb{R}^n)$ does not intersect $L$.
This is equivalent to the fact that the inequality $\|\cdot\|_K > t|\cdot|$ holds on $E$.
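The choice of $t$ in this proof can be evaluated numerically. The sketch below assumes that $\kappa_m$ denotes the expected Euclidean norm of a standard Gaussian vector in $\mathbb{R}^m$, for which the closed form $\sqrt{2}\,\Gamma((m+1)/2)/\Gamma(m/2)$ is classical; this identification of $\kappa_m$ and the helper names `kappa` and `low_m_star_threshold` are assumptions of the example, not statements from the text.

```python
import math

def kappa(m: int) -> float:
    """E|G| for a standard Gaussian vector G in R^m (assumed meaning of kappa_m):
    sqrt(2) * Gamma((m + 1) / 2) / Gamma(m / 2), computed via log-gamma."""
    return math.sqrt(2.0) * math.exp(math.lgamma((m + 1) / 2.0) - math.lgamma(m / 2.0))

def low_m_star_threshold(n: int, k: int, m_star: float) -> float:
    """The value t solving t * kappa(n) * M* = kappa(n - k) / 2."""
    return kappa(n - k) / (2.0 * kappa(n) * m_star)

n, alpha = 1000, 0.25
k = int(n * (1 - alpha))          # k = n(1 - alpha), so n - k = alpha * n
t = low_m_star_threshold(n, k, m_star=1.0)
normalized = t / math.sqrt(alpha)  # should be close to 1/2 since kappa_m ~ sqrt(m)
print(f"t = {t:.4f}, t / sqrt(alpha) = {normalized:.4f}")
```

With $M^* = 1$ the normalized value is close to $1/2$, which is the constant one expects from $\kappa_{\alpha n}/(2\kappa_n) \approx \sqrt{\alpha}/2$.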
Proof of Theorem 7.35. We argue as in the proof of Theorem 7.45 specified
to $K = B_1^n$; the only modification comes in upper-bounding $w_G(L)$. Denote
$\tilde{L} = B_2^n \cap tB_1^n$ ($t \in [1, \sqrt{n}]$ to be chosen later), then clearly
\[ w_G(L) \le w_G(\tilde{L}). \]
(We actually have equality since $\tilde{L} = \mathrm{conv}\, L$; this is a fairly easy consequence of the
fact that no extreme point of $tB_1^n$ lies inside $B_2^n$.) Next, $\tilde{L}^\circ = \mathrm{conv}\big(B_2^n \cup t^{-1}B_\infty^n\big)$
see Table 4.1 and Exercise 6.6 for the equality. Given that $\tilde{L}$ is permutationally
symmetric, it has enough symmetries and hence it is in the $\ell$-position (see Section
4.2.2 and particularly Proposition 4.8). Accordingly, Proposition 7.6 applies and
shows that
\[ w_G(\tilde{L})\, w_G(\tilde{L}^\circ) \le n K(\tilde{L}). \]
\[ K(\tilde{L}) \le C\big(1 + \log d(\tilde{L}, B_2^n)\big)^{1/2} = C\big(1 + \log(\sqrt{n}/t)\big)^{1/2}. \]
Combining the above inequalities yields
\[ w_G(L) \le w_G(\tilde{L}) \le C\sqrt{\frac{\pi}{2}}\; t\, \big(1 + \log(\sqrt{n}/t)\big)^{1/2}. \]
\[ 2C\sqrt{\frac{\pi}{2}}\; t\, \big(1 + \log(\sqrt{n}/t)\big)^{1/2} = \kappa_{n-k} \sim \sqrt{\alpha n}, \]
which can be rewritten as $g(\lambda) \sim c\alpha^{-1/2}$, where $g(x) = x(1 + \log x)^{-1/2}$, $\lambda = \sqrt{n}/t$
and $c = \sqrt{2\pi}\, C$. Solving for $\lambda$ we obtain $\lambda \simeq \alpha^{-1/2}\big(\log(2/\alpha)\big)^{1/2}$, whence
$t = \sqrt{n}/\lambda \simeq (\alpha/\log(2/\alpha))^{1/2}\sqrt{n}$, as needed. (We are using here the fact that if $\beta \in \mathbb{R}$ is
fixed and if $y = g(x) := x(1 + \log x)^\beta$, then the inverse function, which is defined
for sufficiently large $y$, satisfies $g^{-1}(y) \sim y(1 + \log y)^{-\beta}$ as $y \to \infty$.)
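The asymptotic inversion used in the last sentence, $g^{-1}(y) \sim y(1+\log y)^{-\beta}$ for $g(x) = x(1+\log x)^{\beta}$, can be checked numerically. The Python sketch below inverts $g$ by bisection on a geometric scale and compares with the asymptotic formula; the function names and the starting point of the search are choices made for this illustration.

```python
import math

def g(x: float, beta: float) -> float:
    """g(x) = x * (1 + log x)^beta, considered for x >= 1."""
    return x * (1.0 + math.log(x)) ** beta

def g_inverse(y: float, beta: float, iters: int = 200) -> float:
    """Solve g(x) = y by bisection; g is increasing on [exp(max(0, -beta)), inf)."""
    lo = math.exp(max(0.0, -beta))
    hi = lo
    while g(hi, beta) < y:       # bracket the solution by doubling
        hi *= 2.0
    for _ in range(iters):       # bisection on a geometric scale
        mid = math.sqrt(lo * hi)
        if g(mid, beta) < y:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def asymptotic_inverse(y: float, beta: float) -> float:
    """The claimed equivalent y * (1 + log y)^(-beta)."""
    return y * (1.0 + math.log(y)) ** (-beta)

beta = -0.5                      # the exponent occurring in the proof of Theorem 7.35
r_small = g_inverse(1e6, beta) / asymptotic_inverse(1e6, beta)
r_large = g_inverse(1e100, beta) / asymptotic_inverse(1e100, beta)
print(r_small, r_large)
```

The ratio tends to 1 only at a doubly logarithmic rate, which is why the proof states the conclusion with "$\simeq$" rather than with explicit constants.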
7.2.6.3. The quotient of a subspace theorem. It follows from Corollary 7.40 that
any convex body $K \subset \mathbb{R}^n$ admits isomorphically Euclidean sections of dimension
$\Omega(\log n)$. Dually, any convex body admits orthogonal projections of the same dimension
which are isomorphically Euclidean. The bound $\Omega(\log n)$ cannot be improved,
as shown by the case of the cube (for sections) or of the $\ell_1^n$ ball (for projections).
However, it turns out that combining both operations leads to a surprising
phenomenon: every convex body admits a projection of a section of proportional
dimension which is isomorphically Euclidean.
Theorem 7.46 (Quotient of a subspace theorem, not proved here). Given a
symmetric convex body $K \subset \mathbb{R}^n$ and $\alpha \in (0, 1)$, there exist subspaces $E \subset F \subset \mathbb{R}^n$
with $\dim E \ge (1 - \alpha)n$ such that
\[ d_{BM}\big(P_E(K \cap F), B_2^{\dim E}\big) \le C\alpha^{-1}\log(C\alpha^{-1}). \]
We note that an "almost isometric" version of the quotient of a subspace theorem
follows then by appealing to Remark 7.22.
Exercise 7.31 (Quotient of a subspace = subspace of a quotient). Show that
given a decomposition $\mathbb{R}^n = E \oplus F \oplus G$ into orthogonal subspaces, we have, for
$K \subset \mathbb{R}^n$,
\[ (P_{E \oplus F} K) \cap E = P_E(K \cap (E \oplus G)). \]
Conclude that the class of sections of projections of $K$ coincides with the class of
projections of sections of $K$.
Exercise 7.32 (Combining quotient and subspace operations is necessary).
(i) Check by applying Lemma 4.20 twice that $\mathrm{vol}(K) \ge \frac{1}{4^n}\mathrm{vol}(K_1)\mathrm{vol}(K_2)\mathrm{vol}(K_3)$
and $\mathrm{vol}(K^\circ) \ge \frac{1}{4^n}\mathrm{vol}(K_1^\circ)\mathrm{vol}(K_2^\circ)\mathrm{vol}(K_3^\circ)$.
(ii) Given a convex body $L \subset \mathbb{R}^k$, define $\alpha(L) = \mathrm{vrad}(L)\,\mathrm{vrad}(L^\circ)$. Show that, for
some constant $c$, $\alpha(K)^n \ge c^n \alpha(K_1)^{n_1}\alpha(K_2)^{n_2}\alpha(K_3)^{n_3}$.
(iii) By Theorem 7.46, we may assume that $n_1 = n/2$, and that $K_1$ is $A$-Euclidean
for some absolute constant $A$. Show that $\alpha(K_1) \ge A^{-1}$. If $\beta_N$ denotes the infimum
of $\alpha(K)$ over all symmetric convex bodies of dimension at most $N$, conclude that
$\beta_N \ge c^2/A$.
7.2.6.4. Approximation of zonoids by zonotopes. We first state a reformulation
of Dvoretzky's theorem for $\ell_1^n$.
Theorem 7.47 (see Exercise 7.34). For any $n \in \mathbb{N}$, $\varepsilon > 0$, there exists an
integer $N \le Cn/\varepsilon^2$ and vectors $x_1, \dots, x_N \in \mathbb{R}^n$ such that $Z \subset B_2^n \subset (1 + \varepsilon)Z$,
where $Z$ denotes the zonotope
\[ Z = [-x_1, x_1] + \cdots + [-x_N, x_N]. \tag{7.29} \]
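The inclusion $Z \subset B_2^n \subset (1+\varepsilon)Z$ is conveniently tested through support functions: for the zonotope (7.29) one has $h_Z(u) = \sum_i |\langle x_i, u \rangle|$, and after rescaling $Z$ so that $\max_{|u|=1} h_Z = 1$, the smallest admissible $1+\varepsilon$ is the ratio $\max h_Z / \min h_Z$ over the unit sphere. The Python sketch below evaluates this ratio in the plane for $N$ equally spaced generators. The planar setting and this specific choice of generators are assumptions made for illustration only; they are not the random construction behind Theorem 7.47.

```python
import math

def zonotope_support(generators, u):
    """Support function of Z = sum_i [-x_i, x_i]: h_Z(u) = sum_i |<x_i, u>|."""
    return sum(abs(x[0] * u[0] + x[1] * u[1]) for x in generators)

def planar_generators(N: int):
    """N unit vectors at angles i*pi/N (hypothetical, illustrative choice)."""
    return [(math.cos(math.pi * i / N), math.sin(math.pi * i / N)) for i in range(N)]

def sandwich_ratio(N: int, refine: int = 8) -> float:
    """max h_Z / min h_Z over a grid of directions in [0, pi) that contains
    every multiple of pi/(2N), where the extrema of h_Z are attained."""
    gens = planar_generators(N)
    steps = 2 * N * refine
    values = [
        zonotope_support(gens, (math.cos(math.pi * j / steps), math.sin(math.pi * j / steps)))
        for j in range(steps)
    ]
    return max(values) / min(values)

for N in (2, 5, 16):
    print(N, sandwich_ratio(N), 1.0 / math.cos(math.pi / (2 * N)))
```

In this planar example $Z$ is a regular $2N$-gon, so the ratio equals $1/\cos(\pi/(2N))$, the quotient of its circumradius by its inradius.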
It is natural to ask whether a version of Theorem 7.47 holds when the Euclidean
ball is replaced by an arbitrary zonoid. The best result in this direction is the
following.
Theorem 7.48 (not proved here). For any 0-symmetric zonoid $Y \subset \mathbb{R}^n$ and
$\varepsilon > 0$, there exists an integer $N \le Cn\log(n)/\varepsilon^2$ and vectors $x_1, \dots, x_N \in \mathbb{R}^n$ such
that $Z \subset Y \subset (1 + \varepsilon)Z$, where $Z$ denotes the zonotope (7.29). Moreover, we can
ensure that $\mathrm{supp}\, \mu_Z \subset \mathrm{supp}\, \mu_Y$, where the measures $\mu_Y$, $\mu_Z$ are defined in (4.8).
Exercise 7.34 (Approximating balls by zonotopes via Dvoretzky's theorem).
Prove Theorem 7.47 using the fact that the Dvoretzky dimension of $B_1^n$ is of order
$n$ (Theorem 7.31).
7.2.6.5. The Johnson–Lindenstrauss lemma.
Theorem 7.49 (Johnson–Lindenstrauss lemma). Let $A$ be a finite subset of
$\mathbb{R}^n$, $m = \mathrm{card}\, A$, and $\varepsilon \in (0, 1)$. If $k \ge 4\varepsilon^{-2}\log m$, there exists a linear map
$f : \mathbb{R}^n \to \mathbb{R}^k$ such that, for every $x, y \in A$,
\[ (1 - \varepsilon)|x - y| \le |f(x) - f(y)| \le (1 + \varepsilon)|x - y|. \tag{7.30} \]
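A quick numerical experiment illustrates (7.30). The Python sketch below draws a matrix with i.i.d. $N(0,1)$ entries, rescales it by $1/\sqrt{k}$, and measures the worst relative distortion over all pairs of points. The normalization by $\sqrt{k}$ (instead of a median-based normalization), the test points, and the seed are assumptions of this example; the run shows one random outcome, not a proof of the lemma.

```python
import math
import random

def jl_project(points, k, rng):
    """Apply a k x n matrix with i.i.d. N(0,1) entries, scaled by 1/sqrt(k)."""
    n = len(points[0])
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)
    return [[scale * sum(b * x for b, x in zip(row, p)) for row in matrix] for p in points]

def dist(u, v):
    """Euclidean distance between two vectors given as lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def max_distortion(points, images):
    """Largest value of | |f(x) - f(y)| / |x - y| - 1 | over all pairs."""
    worst = 0.0
    for i in range(len(points)):
        for j in range(i):
            ratio = dist(images[i], images[j]) / dist(points[i], points[j])
            worst = max(worst, abs(ratio - 1.0))
    return worst

rng = random.Random(0)                       # fixed seed for reproducibility
n, m, eps = 200, 20, 0.5
k = math.ceil(4.0 * math.log(m) / eps ** 2)  # k >= 4 eps^-2 log m
points = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
images = jl_project(points, k, rng)
worst = max_distortion(points, images)
print(f"k = {k}, worst relative distortion = {worst:.3f}")
```

Note that the target dimension $k$ depends only on the number of points $m$ and on $\varepsilon$, not on the ambient dimension $n$.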
Proof. We show that a random choice for $f$ satisfies (7.30) with high probability.
Let $B : \mathbb{R}^n \to \mathbb{R}^k$ be a random matrix with i.i.d. $N(0, 1)$ entries. For
every unit vector $u \in \mathbb{R}^n$, $Bu$ is a standard Gaussian vector in $\mathbb{R}^k$, and the random
variable $|Bu|^2$ follows the $\chi^2(k)$ distribution. Denoting by $M_k^2$ the median of the
$\chi^2(k)$ distribution, it follows from Theorem 5.24 that for any $t > 0$,
\[ \mathbf{P}\big( \big| |Bu| - M_k \big| > t \big) \le \exp(-t^2/2). \]
\[ \mathbf{P}\big( \big| |f(x) - f(y)| - |x - y| \big| > \varepsilon|x - y| \big) \le \exp(-\varepsilon^2 M_k^2/2). \]
$\varepsilon^2 M_k^2/2 + \log 2$. Since $M_k^2 \ge k - 2/3$ (see Exercise 5.34), this condition is satisfied
orems in this chapter is a heavy use of the probabilistic method. For example,
the existence of a subspace satisfying the conclusion of Dvoretzky’s theorem or its
has been chosen) or by using random matrices. Random constructions benefit from
the blessing of dimensionality, as opposed to the curse of dimensionality, which
renders an exhaustive search (and many deterministic algorithms) nonfeasible.
However, for theoretical and practical reasons, existence results are often unsat-
isfactory. For example, to write a computer code implementing an error-correcting
algorithm one needs a specific encoding matrix. This leads to the class of prob-
lems asking for explicit versions of, or pseudo-random models for objects whose
constructions involve probabilistic arguments. By “explicit” we mean here an algo-
rithm, whose complexity is manageable (say, with running time being polynomial
in the dimension). Individual constructions are often “more explicit” than that,
they may involve, e.g., closed formulas. An alternative to an explicit solution may
\[ B_2^n \subset S \subset CB_2^n. \]
Moreover, $C$ can be replaced by $1 + \varepsilon$ for $\varepsilon \in (0, 1)$, if we use a simplex of dimension
$\ge C_1 n \log(2/\varepsilon)$.
Another result for which substantial effort has been devoted to derandomization
is Dvoretzky's theorem for $B_1^n$ (or $\ell_1^n$). Recall that the ($\ell_1$-)distortion of
a subspace $E \subset \mathbb{R}^n$ is the ratio between the maximum and the minimum of the
function $\|x\|_1$ over $S^{n-1} \cap E$. We already showed, via the probabilistic method, the
existence of subspaces of proportional dimension with arbitrarily small distortion
(Theorem 7.31) and the existence of subspaces of arbitrarily large proportional dimension
with bounded distortion (Theorem 7.42). The randomness relied on the
Haar measure on the Grassmann manifold, which requires an infinite amount of random
bits to be exactly simulated. However, a careful look at the arguments shows
that the same conclusion can be derived using only $O(n^2)$ random bits.
A natural step towards explicit examples is randomness reduction: can we
match, or approach, the optimal dimension and distortion bounds using fewer random
bits? We point out that constructions using $O(\log n)$ random bits are very close to
being explicit, since we can then perform an exhaustive search among the polynomially
many possible bit strings. However, it is not clear whether the distortion of a given
The best results known to the authors and directed towards constructing explicit
subspaces of $\ell_1^n$ (going in several different directions) are gathered in Table
7.1. One result that "doesn't fit" in the table is the following.
Theorem 7.52 (not proved here). Given $n \in \mathbb{N}$, $p \in (1, 2)$ and $\eta \in (0, 1)$, there
\[ d_g(B_1^n \cap E, B_p^n \cap E) \le (1/\eta)^{O\left((2-p)^{-1}\right)}. \tag{7.31} \]
In the language of this section, (7.31) gives a bound on the distortion of the $\ell_1$-norm
on the sphere of $\ell_p^n$ intersected with $E$.
In a different direction, we state a result which derandomizes Dvoretzky’s the-
orem (Corollary 7.40) simultaneously for a wide class of convex bodies.
Theorem 7.53 (not proved here). Given $n \in \mathbb{N}$ and $\varepsilon \in (0, 1)$, there is an
explicitly defined subspace $E \subset \mathbb{R}^n$ of dimension $k = c\log n/\log(1/\varepsilon)$ such that the
following holds. If $K \subset \mathbb{R}^n$ is a convex body invariant under the isometry group of
the cube (i.e., permutation of coordinates and sign flips) then
\[ d_g(K \cap E, B_2^E) \le 1 + \varepsilon. \]
NOTES AND REMARKS 207
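Reference        Distortion             Dimension                                          Randomness
[GLR10]                                 $(1-\eta)n$                                        explicit
[Ind00]          $1+\varepsilon$        $\Omega(\varepsilon^2/\log(1/\varepsilon))\,n$     $O(n\log^2 n)$
[AAM06, LS08]    $O_\eta(1)$            $(1-\eta)n$                                        $O(n)$
[GLW08]          $2^{O_\eta(1/\gamma)}$ $(1-\eta)n$                                        $O(n^\gamma)$
[IS10]           $1+\varepsilon$        $(\gamma\varepsilon)^{O(1/\gamma)}\,n$             $O(n^\gamma)$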
Notes and Remarks
A recent and comprehensive reference for the material presented in this chapter
(and much more) is [AAGM15]. Older standard and valuable references include
[MS86, Pis89b, TJ89, Ver].
Exercise 5.72, whose proof carries over to the present context (modulo replacing
an application of Theorem 5.23 with that of Theorem 5.51) and extends to non-
7.8 are due to Pisier (see [Pis80, Pis81]). The proof of Theorem 7.7 that is
presented here is based on Lemma 7.13, which is from [Mau03]. The bound on
the K-convexity constant from Theorem 7.7 is sharp: there is an example due to
Bourgain [Bou84] of a symmetric convex body $K \subset \mathbb{R}^n$ (for an arbitrarily large $n$)
with $K(K) = \Omega(\log n)$; this example is presented in detail in [AAGM15, Section
6.7]. Besides unconditional bodies, the improved bound $K(K) = O(\sqrt{\log n})$ holds
if $K \subset \mathbb{R}^n$ is, for example, a zonoid (see [Pis80]; or Theorem IV.5 in [LQ04] for a
detailed proof).
Section 7.2. The history around Dvoretzky's theorem starts with a conjecture
by Grothendieck [Gro53b]: does every $n$-dimensional normed space contain a
$k(\varepsilon, n)$-dimensional subspace which is $(1 + \varepsilon)$-Euclidean, for some function $k(\varepsilon, n)$
tending to infinity with $n$? This was answered affirmatively by Dvoretzky [Dvo61],
and later refined by [Mil71] using crucially concentration of measure. Other early
proofs include [Sza74] and [Fig76].
Theorem 7.15 with the dependence on ε as stated appears in [Gor88] in the
real case (see Exercise 7.7). The proof via Lemma 7.17 is from [Sch89] and it was
noticed in [ASW11] that it carries over to the complex case.
When asking about the dependence on $\varepsilon$ in Dvoretzky's theorem, it is important
to keep in mind that there are two different questions, depending on whether we ask
if $(1 + \varepsilon)$-Euclidean subspaces either (i) exist or (ii) have measure $1 - o(1)$ in the
Grassmann manifold equipped with the standard Haar measure.
For example, one may ask: given $\varepsilon > 0$ and $k$, for which values of $n$ can we
guarantee that every $n$-dimensional symmetric convex body has a $k$-dimensional
section which is $(1 + \varepsilon)$-Euclidean? If we believe that the worst case is the cube, it
is natural to conjecture that this holds for $n \ge C(k)\varepsilon^{-(k-1)/2}$. This conjecture is
confirmed for $k = 2$ (see [Mil88]). For $k > 2$ the problem is wide open and a good
dependence would follow from a positive answer to a weak version of the Knaster
problem, see [KS03]. In a related direction, the random version of the Dvoretzky
theorem for the cube has been studied in [Sch07, Tik14] and the dependence on
$\varepsilon$ in Theorem 7.19 for $K = B_\infty^n$ is $c(\varepsilon) = \Theta(\varepsilon/\ln(1/\varepsilon))$.
Most of the material from Sections 7.2.2 through 7.2.4 is based on the very
influential paper [FLM77]. The concepts of the verticial and facial dimensions of
a convex body were formally defined in [AS17].
result from Exercise 7.19 appears in [BDG+77] (for another proof, see [AAGM15,
Theorem 5.4.3]). The construction from Exercise 7.20 is due to Figiel.
The estimate from Exercise 7.21 is relevant to [FHS13]. Theorem 7.35 is
from [Kaš77]; the correct order of magnitude of the distortion constant $A(\alpha)$ was
determined in [GG84]; the proof of the upper bound presented in Section 7.2.6.2
follows [PTJ90]. We also refer to [FR13, Chapter 10] for a detailed presentation
The Dvoretzky–Rogers lemma was first proved in [DR50]. The proof presented
comes from [Peł80]. It has been realized since [BS88] that actually a stronger
property holds: there is a function $f : (0, 1] \to [1, \infty)$ such that, for any
$n$-dimensional normed space $X$, there exist $m \ge (1 - \delta)n$ and operators $\alpha : \mathbb{R}^m \to X$,
$\beta : X \to \mathbb{R}^m$ verifying $\beta \circ \alpha = I$ and $\|\alpha : \ell_1^m \to X\| \cdot \|\beta : X \to \ell_2^m\| \le f(\delta)$. The
above is often referred to as a proportional Dvoretzky–Rogers factorization. It is
known that $f(\delta) = O(\delta^{-1})$ and $f(\delta) = \Omega(\delta^{-1/2})$ [Gia96, Rud97]. Variants for
nonsymmetric bodies were also shown, see [You14]. For more information and
references see the website [@3].
Regarding Proposition 7.39, it has been proved in [Bal89, Bal91] that the
cube (resp., the simplex) has the smallest mean width among all symmetric (resp.,
non-necessarily symmetric) convex bodies in John position.
The relevance of the concept of volume ratio to Dvoretzky-like theorems was
realized in [Sza78, ST80], which were inspired by the important work [Kaš77]
that in particular established the existence of the Kashin decomposition of $\ell_1^n$ (see
Corollary 7.43). This concept is related to the notion of cotype 2. Let $(\varepsilon_n)$ be a
sequence of independent variables such that $\mathbf{P}(\varepsilon_i = 1) = \mathbf{P}(\varepsilon_i = -1) = 1/2$. The
cotype 2 constant of a normed space $X$ is the smallest number $C_2(X)$ such that,
for all vectors $x_1, \dots, x_n \in X$, we have
\[ \sum_{i=1}^n \|x_i\|^2 \le C_2(X)\, \mathbf{E}\Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|^2. \]
The estimate $\mathrm{vr}(X) = O(C_2(X)\log C_2(X))$ connecting volume ratio and cotype 2
was proved in [BM87, MP86] (see [Mil87] for a simpler proof and [DS85] for an
earlier argument yielding Kashin's decompositions under cotype 2 assumptions).
Any bound on the cotype 2 constant is obviously inherited by subspaces. For more
information about the type and cotype theory, see [Mau03]. The formulation of
Theorem 7.44 appears in [Bal97].
The low-M ˚ estimate (Theorem 7.45) was proved originally by Milman with a
worse dependence on α; the proof we present is due to Gordon [Gor88]. Another
ot
proof giving the correct dependence, and valid also in the complex setting, is due
to Pajor and Tomczak–Jaegermann [PT86]. See [AAGM15] for a presentation of
N
several different proofs. We also point that in some cases the upper bound in the
Dvoretzky–Milman theorem (Theorem 7.19) holds for dimensions larger than the
ly.
argument to deduce the reverse Santaló inequality sketched in Exercise 7.33 is due
to Pisier. Another related result due to Milman [Mil86] is the reverse
Brunn–Minkowski inequality, which asserts the following: for any symmetric convex body
$B \subset \mathbb{R}^n$ there is a volume-preserving linear map $T_B \in \mathrm{SL}(n, \mathbb{R})$ such that, if $K, L$
There is a close link with the $M$-ellipsoid and $M$-position introduced in (5.68), since
(7.32) is easily seen to hold when $T_K(K)$ and $T_L(L)$ admit multiples of Euclidean
balls as $M$-ellipsoids.
The results from AGA are classically presented in the real setting, but typically
remain valid for complex spaces (or circled convex bodies) as well. This is the case
for Theorems 7.42, 7.45 and 7.46. Often the proofs can be translated verbatim,
with the notable exception of the Chevet–Gordon inequalities, for which no complex
analogue is known. We also note that Pisier [Pis89a] obtained a proof of (7.32)
via interpolation which works primarily in the complex setting (see Chapter 7 in
[Pis89b]).
The theme of the approximation of zonoids by zonotopes with few summands
attracted attention in the late 80’s. The best result (Theorem 7.48) is due to
Talagrand [Tal90] and improves on [Sch87, BLM89]. It is an open question
whether Theorem 7.48 holds without the factor $\log n$, i.e., with $N \leq C(\varepsilon) n$.
is from [Fre14], which contains also a version of the Theorem for convex bodies
that are only assumed to be invariant under permutation of coordinates.
Part 3
The Meeting: AGA and QIT
CHAPTER 8
Entanglement of Pure States in High Dimensions
Throughout this chapter, we consider a multipartite Hilbert space
$$\mathcal{H} = \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_k}$$
and study the entanglement of pure states on $\mathcal{H}$. We will always assume that $k \geq 2$
and that $d_1, \dots, d_k \geq 2$.
We identify pure states on $\mathcal{H}$ with elements of $\mathrm{P}(\mathcal{H})$, the projective space on
$\mathcal{H}$. The set of product vectors forms the Segré variety $\mathrm{Seg} \subset \mathrm{P}(\mathcal{H})$ (see (B.6) in
Appendix B.2). A simple remark, on which we will elaborate, is that most pure
states are entangled. Indeed, since the variety $\mathrm{Seg} \subset \mathrm{P}(\mathcal{H})$ has lower dimension
and measure zero, it follows that a randomly chosen (in any reasonable sense)
pure state in $\mathcal{H}$ is almost surely entangled.
A problem which turns out to be fundamental to several constructions in QIT
is to show the existence of large-dimensional subspaces of $\mathcal{H}$, in which every unit
vector corresponds to an entangled pure state. There are several variations on
this question. We may consider the qualitative version of the problem, where we
require the subspace simply to contain no nonzero product vector (see Theorem
8.1). Alternatively, we may insist that the subspace contains only very entangled
vectors, once it is specified how to quantify entanglement; for pure states this may
be done via the von Neumann or Rényi entropy of the partial trace.
The versions of Dvoretzky’s theorem that were discussed in Section 7.2 are
obviously relevant to such questions, since they show the existence of large subspaces
Much of our exposition will be focused on detailed study of the bipartite case
$\mathcal{H} = \mathbb{C}^k \otimes \mathbb{C}^d$ (we will always assume that $k \leq d$). One reason for such emphasis
is the fact that subspaces of a bipartite Hilbert space can provide a convenient
description of quantum channels through the Stinespring representation, as we explain
are dealt with in the last part of the chapter (Section 8.5).
general framework of this book. For simplicity, we only consider the case
$\mathcal{H} = \mathbb{C}^d \otimes \mathbb{C}^d$ (so that $n_0 = (d-1)^2$), the general case being similar.
We work in the projective space $\mathrm{P}(\mathcal{H})$, which we equip with the distance given
by (B.5). The ball of center $\psi$ and radius $r$ is denoted by $B(\psi, r)$. We use bounds
on the size of $\varepsilon$-nets in $\mathrm{P}(\mathcal{H})$ and the measure of $\varepsilon$-balls from Theorem 5.11 (and
Exercise 5.25; the more elementary results from Section 5.1.2 would actually suffice,
cf. Exercise 5.10 and (5.2)). In this proof, as opposed to most material in this book,
the dependence of constants on the dimension is allowed, and we will denote by
$C$, $C'$, etc. positive constants which may depend on $d$ and $m$, but are independent
of the parameter $\varepsilon$.
Let $F$ be a random $m$-dimensional subspace of $\mathcal{H}$, chosen with respect to the
Haar measure on the Grassmann manifold. More concretely, we may realize $F$
as $F = U(F_0)$, where $F_0$ is any fixed $m$-dimensional subspace, and $U$ is a
Haar-distributed unitary matrix. Denote also by $\mathrm{Seg} \subset \mathrm{P}(\mathcal{H})$ the set of product vectors (the
Segré variety).
We are going to show that the event $\mathrm{Seg} \cap F = \emptyset$ has probability 1. Given
$\varepsilon > 0$, let $\mathcal{M}_\varepsilon$ be an $\varepsilon$-net inside the projective space $\mathrm{P}(F_0)$ with $\mathrm{card}(\mathcal{M}_\varepsilon) \leq$
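$$\begin{aligned}
\mathbf{P}(\mathrm{Seg} \cap F \neq \emptyset) &\leq \mathbf{P}\Big( \bigcup_{\varphi \in \mathcal{N}_\varepsilon^{\otimes 2}} B(\varphi, 2\varepsilon) \cap U \bigcup_{\psi \in \mathcal{M}_\varepsilon} B(\psi, \varepsilon) \neq \emptyset \Big) \\
&\leq \sum_{\varphi \in \mathcal{N}_\varepsilon^{\otimes 2},\, \psi \in \mathcal{M}_\varepsilon} \mathbf{P}\big( B(\varphi, 2\varepsilon) \cap U(B(\psi, \varepsilon)) \neq \emptyset \big) \\
&\leq \sum_{\varphi \in \mathcal{N}_\varepsilon^{\otimes 2},\, \psi \in \mathcal{M}_\varepsilon} \mathbf{P}\big( d(\varphi, U\psi) < 3\varepsilon \big).
\end{aligned}$$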
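The quantity $\mathbf{P}(d(\varphi, U\psi) < 3\varepsilon)$ does not depend on the particular points $\varphi, \psi \in \mathrm{P}(\mathcal{H})$,
and is equal to the normalized measure of a ball of radius $3\varepsilon$ in $\mathrm{P}(\mathcal{H})$, which
is bounded from above by $(C''\varepsilon)^{2d^2-2}$ (or see Exercise 5.11 for the exact value).
Consequently,
$$\mathbf{P}(\mathrm{Seg} \cap U(F_0) \neq \emptyset) \leq \mathrm{card}(\mathcal{N}_\varepsilon^{\otimes 2})\, \mathrm{card}(\mathcal{M}_\varepsilon)\, (C''\varepsilon)^{2d^2-2} \leq C \varepsilon^{2d^2-2-(2m-2)-2(2d-2)} .$$
Provided $m \leq (d-1)^2$, the last quantity tends to 0 as $\varepsilon$ tends to 0. This shows
that the event $\{F \text{ intersects } \mathrm{Seg}\}$ has probability 0, so that $F$ contains no nonzero
product vector.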
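The exponent bookkeeping that closes the proof can be checked mechanically. The following Python sketch is only an illustration: the function name `exponent` is hypothetical, and it assumes (as the exponents in the last display suggest) that the nets of $\mathrm{P}(F_0)$ and of the Segré variety have cardinalities of order $\varepsilon^{-(2m-2)}$ and $\varepsilon^{-2(2d-2)}$, i.e., $\varepsilon$ raised to minus their real dimensions.

```python
def exponent(d: int, m: int) -> int:
    """Exponent of epsilon in the final bound, for H = C^d (x) C^d.

    Assumption (not spelled out in the surviving text): each net has
    cardinality of order epsilon^(-real dimension) of the set it covers.
    """
    ball = 2 * d * d - 2          # real dimension of P(H), with dim H = d^2
    net_subspace = 2 * m - 2      # real dimension of P(F_0), with dim F_0 = m
    net_segre = 2 * (2 * d - 2)   # real dimension of Seg = P(C^d) x P(C^d)
    return ball - net_subspace - net_segre


# The bound tends to 0 with epsilon exactly when the exponent is positive.
for d in range(2, 6):
    n0 = (d - 1) ** 2
    print(d, n0, exponent(d, n0), exponent(d, n0 + 1))
```

Algebraically the exponent equals $2\big((d-1)^2 + 1 - m\big)$, so it is positive precisely when $m \leq (d-1)^2$, which is the condition used in the proof.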
8.2. Entropies of entanglement and additivity questions
Notes and Remarks).
Let $\psi \in \mathbb{C}^k \otimes \mathbb{C}^d$ be a unit vector. The entropy of entanglement of $\psi$, denoted by
$E(\psi)$, is defined as the von Neumann entropy of the reduced matrix $\rho = \mathrm{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi|$:
(8.1)   $E(\psi) = S(\rho) = -\mathrm{Tr}\, \rho \log \rho .$
Both parties play a symmetric role since the two reduced matrices $\mathrm{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi|$
and $\mathrm{Tr}_{\mathbb{C}^k} |\psi\rangle\langle\psi|$ have the same von Neumann entropy (in the matrix formalism, a
consequence of the fact that $M M^\dagger$ and $M^\dagger M$ have the same nonzero eigenvalues
for $M \in \mathsf{M}_{k,d}$). If $\psi = \sum \lambda_i \varphi_i \otimes \chi_i$ is a Schmidt decomposition of $\psi$, then
(8.2)   $E(\psi) = -\sum \lambda_i^2 \log \lambda_i^2 = -2 \sum \lambda_i^2 \log \lambda_i .$
For any $p \in [0, \infty]$, we introduce the $p$-entropy of entanglement, defined as
(8.3)   $E_p(\psi) = S_p(\rho),$
where $\rho = \mathrm{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi|$ and $S_p$ is the $p$-Rényi entropy introduced in Section 1.3.3.
Recall that the case $p = 1$ corresponds to the von Neumann entropy, i.e., $E_1(\psi) = E(\psi)$
(as given by (8.1)). The limit cases $p = 0$ and $p = \infty$ should be interpreted
as $E_0(\psi) = \log \mathrm{rank}(\psi)$ and $E_\infty(\psi) = -2 \log \lambda_1$, where $\mathrm{rank}\, \psi$ is the Schmidt
rank of $\psi$ and $\lambda_1$ its largest Schmidt coefficient.
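As an illustration of (8.2) and (8.3), the $p$-entropies of entanglement can be computed directly from the Schmidt coefficients. The sketch below is not taken from the text: the function name is hypothetical, and it assumes the standard definition $S_p(\rho) = \frac{1}{1-p} \log \mathrm{Tr}\, \rho^p$ of the Rényi entropy, which is presumably the one used in Section 1.3.3.

```python
import math


def renyi_entanglement_entropy(schmidt, p):
    """E_p(psi) from the Schmidt coefficients of a unit vector psi.

    The eigenvalues of the reduced matrix are the squared Schmidt
    coefficients. Assumes S_p(rho) = log(Tr rho^p) / (1 - p) for p != 1.
    """
    probs = [s * s for s in schmidt if s > 0]
    total = sum(probs)
    probs = [q / total for q in probs]        # guard against rounding drift
    if p == 0:
        return math.log(len(probs))           # log of the Schmidt rank
    if p == 1:
        return -sum(q * math.log(q) for q in probs)   # formula (8.2)
    if p == math.inf:
        return -math.log(max(probs))          # equals -2 log(lambda_1)
    return math.log(sum(q ** p for q in probs)) / (1 - p)


# A product vector has all entropies equal to 0; a maximally entangled
# vector in C^4 (x) C^d has all entropies equal to log 4.
print(renyi_entanglement_entropy([1.0, 0.0], 1))
print(renyi_entanglement_entropy([0.5] * 4, 2), math.log(4))
```

For a fixed vector the values are nonincreasing in $p$, which can be observed by evaluating the function on, e.g., the Schmidt coefficients $(0.8, 0.6)$.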
Rényi entropies for $p > 1$ are easier to manipulate since they are closely related
to Schatten norms. If we identify a vector $\psi \in \mathbb{C}^k \otimes \mathbb{C}^d$ with a matrix $M \in \mathsf{M}_{k,d}$ as
explained in Section 0.8, we obtain (see (2.12))
and therefore
p 2p
Throughout this chapter we assume that $k \leq d$, and therefore (for any $p \in [0, \infty]$) the
$p$-entropy of entanglement varies between $0$ and $\log k$. Moreover, a pure state $\psi$
satisfies $E_p(\psi) = 0$ if and only if it is a product vector, and satisfies $E_p(\psi) = \log k$
(8.6)   $\Phi(\rho) = \mathrm{Tr}_{\mathbb{C}^d} (V \rho V^\dagger).$
There is no restriction in considering quantum channels of the form (8.6):
by the Stinespring representation theorem (Theorem 2.24), any quantum channel
$\Phi : \mathsf{M}_m \to \mathsf{M}_k$ can be represented via (8.6) for some subspace $W \subset \mathbb{C}^k \otimes \mathbb{C}^d$, with
$d = km$.
It is now easy to define a natural family of random quantum channels. They will
be associated, via the above scheme, to random $m$-dimensional subspaces $W$ of
$\mathbb{C}^k \otimes \mathbb{C}^d$, distributed according to the Haar measure on the corresponding Grassmann
manifold (for some fixed positive integers $m, d, k$ that will be specified later). Note
that most interesting parameters of a channel defined by (8.6) depend only on
the subspace $W = V(\mathbb{C}^m)$ and not on a particular choice of the isometry $V$ (see,
e.g., Lemma 8.2). In this sense, the language of “random $m$-dimensional subspaces
of $\mathbb{C}^k \otimes \mathbb{C}^d$” is equivalent to that of “random isometries from $\mathbb{C}^m$ to $\mathbb{C}^k \otimes \mathbb{C}^d$,”
with the corresponding mathematical objects being, respectively, the closely related
ρPDpCm q
The following lemma shows that, for channels defined via (8.6), the minimum output
For some time, an important open problem in quantum information theory was
to decide whether the quantity $S^{\min}$ is additive, i.e., whether every pair $(\Phi, \Psi)$ of
quantum channels satisfies
(8.8)   $S^{\min}(\Phi \otimes \Psi) \overset{?}{=} S^{\min}(\Phi) + S^{\min}(\Psi).$
The problem admits several equivalent formulations with operational meaning,
notably whether entangled inputs can increase the capacity of a quantum channel
to transmit classical information. (Note that the inequality “$\leq$” in (8.8) always
holds and is easy, see Exercise 8.2.)
A similar question can be asked for the quantities $S_p^{\min}$, the motivation being
that a positive answer to the $p > 1$ question would have implied a positive answer
to the (arguably more important) $p = 1$ problem. However, it turns out that all
these equalities fail, at least for sufficiently large dimensions.
Theorem 8.3. For any $p \geq 1$, there exist quantum channels $\Phi, \Psi$ such that
(8.9)   $S_p^{\min}(\Phi \otimes \Psi) < S_p^{\min}(\Phi) + S_p^{\min}(\Psi).$
Theorem 8.3 will be a consequence of Proposition 8.6 (for $p > 1$) and
Proposition 8.24 (for $p = 1$).
Exercise 8.2 ($S_p^{\min}$ is always subadditive). Show that the inequality
$S_p^{\min}(\Phi \otimes \Psi) \leq S_p^{\min}(\Phi) + S_p^{\min}(\Psi)$ is satisfied for any channels $\Phi, \Psi$ and any $p \geq 0$.
Exercise 8.3 (Reduction of the additivity problem to the case $\Phi = \Psi$). A trick
based on direct sums (as defined in (2.42)) allows a reduction to the case $\Phi = \Psi$ in
questions such as (8.8).
(i) Given quantum channels $\Phi, \Psi$, show that $S_p^{\min}(\Phi \oplus \Psi) = \min(S_p^{\min}(\Phi), S_p^{\min}(\Psi))$.
(ii) Assume that there is a pair of channels $\Phi, \Psi$ such that (8.9) holds for some $p$.
Deduce formally the existence of a channel $\Xi$ such that $S_p^{\min}(\Xi \otimes \Xi) < 2 S_p^{\min}(\Xi)$.
8.2.4. On the $1 \to p$ norm of quantum channels. The $p > 1$ version of the
$\max\{\|\Phi(\rho)\|_p : \rho \in \mathrm{D}(\mathbb{C}^m)\}$, or the maximum output $p$-norm. The latter quantity
equals $\|\Phi\|_{1 \to p}$, i.e., the norm of $\Phi$ as an operator from $(\mathsf{M}_m^{sa}, \|\cdot\|_1)$ to $(\mathsf{M}_k^{sa}, \|\cdot\|_p)$.
(8.10)
A remarkable fact is that for completely positive maps (and even for 2-positive
Proof. From the singular value decomposition, there exist unitary matrices
$U, V \in \mathsf{U}(k)$ such that $U B V^\dagger$ is a diagonal matrix with nonnegative diagonal entries.
Denote $W = U \oplus V \in \mathsf{U}(2k)$. We have
$$W M W^\dagger = \begin{bmatrix} U A U^\dagger & U B V^\dagger \\ V B^\dagger U^\dagger & V C V^\dagger \end{bmatrix} .$$
Since the Schatten norms are invariant under multiplication by unitaries, this shows
that to prove the Lemma it is enough to treat the case when the matrix $B$ is diagonal
with nonnegative entries, which we consider now.
We first note that $b_{ii}^2 \leq a_{ii} c_{ii}$, which follows from the matrix
$\begin{bmatrix} a_{ii} & b_{ii} \\ b_{ii} & c_{ii} \end{bmatrix}$ being
positive as a submatrix of $M$. Consequently, we have
$$\|B\|_p^p = \sum_{i=1}^k b_{ii}^p \leq \sum_{i=1}^k a_{ii}^{p/2} c_{ii}^{p/2} \leq \Big( \sum_{i=1}^k a_{ii}^p \Big)^{1/2} \Big( \sum_{i=1}^k c_{ii}^p \Big)^{1/2} \leq \|A\|_p^{p/2} \|C\|_p^{p/2} ,$$
where the last inequality uses the fact that the diagonal is majorized by the
spectrum (Lemma 1.14).
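Proof of Proposition 8.4. For $\varphi, \psi \in S_{\mathbb{C}^m}$, consider
$u = \varphi \otimes |1\rangle + \psi \otimes |2\rangle \in \mathbb{C}^m \otimes \mathbb{C}^2$. By direct calculation
$$(\Phi \otimes \mathrm{Id}_{\mathsf{M}_2})(|u\rangle\langle u|) = \begin{bmatrix} \Phi(|\varphi\rangle\langle\varphi|) & \Phi(|\psi\rangle\langle\varphi|) \\ \Phi(|\varphi\rangle\langle\psi|) & \Phi(|\psi\rangle\langle\psi|) \end{bmatrix} .$$
Since $\Phi$ is 2-positive, the resulting matrix is block-positive and thus, by Lemma
8.5,
$$\|\Phi(|\psi\rangle\langle\varphi|)\|_p^2 \leq \|\Phi(|\psi\rangle\langle\psi|)\|_p\, \|\Phi(|\varphi\rangle\langle\varphi|)\|_p .$$
Taking the supremum over unit vectors gives the required result (recall that extreme
points of $S_1^d$ and $S_1^{d,sa}$ are rank 1 operators).
Exercise 8.4 (The equality (8.11) does not always hold). Define $\Phi : \mathsf{M}_2 \to \mathsf{M}_2$
by $\Phi(X) = X - \mathrm{Tr}(X) \frac{\mathrm{I}}{2}$. Show that for $p > 1$, $\Phi$ fails to satisfy the equality (8.11).
Known examples where (8.11) fails for $p = 1$ are more complicated, see [Wat05].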
Proposition 8.6. There is a constant $c$ such that the following holds. Let
$p > 1$, and $\Phi : \mathsf{M}_m \to \mathsf{M}_k$ be a random channel, obtained by (8.6) from a
Haar-distributed isometry $V : \mathbb{C}^m \to \mathbb{C}^k \otimes \mathbb{C}^d$. Denote $\Psi = \bar{\Phi}$, the channel obtained from
$\bar{V}$, the complex conjugate of $V$. Assume that $k = d$ and that $m = c d^{1+1/p}$. Then,
for $d$ large enough, with high probability,
(8.12)   $\|\Phi \otimes \Psi\|_{1 \to p} > \|\Phi\|_{1 \to p} \|\Psi\|_{1 \to p} .$
Proof. Denote by $W \subset \mathsf{M}_d$ the range of $V$ (we may consider $W$ as a subspace
of $\mathsf{M}_d$ after we identify tensors and matrices). From (8.4) and Lemma 8.2, we have
(8.13)   $\|\Phi\|_{1 \to p} = \max_{A \in W :\, \|A\|_{HS} = 1} \|A\|_{2p}^2 .$
We remark that $\|\Phi\|_{1 \to p} = \|\Psi\|_{1 \to p}$ since the Schatten norms are invariant under
complex conjugation. We now appeal to Dvoretzky’s theorem for the Schatten
norm $\|\cdot\|_q$ with $q = 2p$. Provided that $m \leq c d^{1+2/q}$ for an appropriate universal
constant $c > 0$, it follows from Theorem 7.37 that, with large probability,
$$d^{1/q-1/2} \|A\|_{HS} \leq \|A\|_q \leq C d^{1/q-1/2} \|A\|_{HS}$$
for all $A \in W$. We have therefore, by (8.13),
(8.14)   $d^{1/p-1} \leq \|\Phi\|_{1 \to p} = \|\Psi\|_{1 \to p} \leq \big( C d^{1/q-1/2} \big)^2 = C^2 d^{1/p-1} .$
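The reason for choosing $\bar{\Phi}$ as a second channel is that the channel $\Phi \otimes \bar{\Phi}$ necessarily
has at least one output with at least one large eigenvalue, as shown by the following
lemma.
Lemma 8.7. Let $\Phi : \mathsf{M}_m \to \mathsf{M}_k$ be a quantum channel obtained from an
isometry $V : \mathbb{C}^m \to \mathbb{C}^k \otimes \mathbb{C}^d$, as in (8.6). Denote by $\psi \in \mathbb{C}^m \otimes \mathbb{C}^m$ the maximally
entangled state
$$\psi = \frac{1}{\sqrt{m}} \big( |1\rangle \otimes |1\rangle + \cdots + |m\rangle \otimes |m\rangle \big) .$$
Then
$$\big\| (\Phi \otimes \bar{\Phi})(|\psi\rangle\langle\psi|) \big\|_\infty \geq \frac{m}{dk}$$
$$\|\Phi \otimes \bar{\Phi}\|_{1 \to p} \geq \|\Phi \otimes \bar{\Phi}\|_{1 \to \infty} \geq \frac{m}{dk}$$
In our setting, $d = k$ and $m = c d^{1+1/p}$, so we obtain from Lemma 8.7 the lower
bound $\|\Phi \otimes \bar{\Phi}\|_{1 \to p} = \Omega(d^{1/p-1})$. Since we have, by (8.14),
we conclude that the inequality (8.12) holds for $d$ large enough (a priori depending
on $p > 1$).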
Remark 8.8. The proof shows that, for any fixed $p > 1$, both the multiplicative
violation in (8.10) and the additive violation in (8.9) tend to infinity as the
dimension of the problem increases (at the rates $\Omega(d^{1-1/p})$ and $\Omega(\log d)$ respectively).
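Proof of Lemma 8.7. We work in the matrix formalism. Identify the range
$$M = \frac{1}{\sqrt{m}} \sum_{i=1}^m A_i \otimes \bar{A_i} \in W \otimes \bar{W} .$$
The conclusion of the Lemma is equivalent to the inequality $\|M\|_\infty \geq \sqrt{m/kd}$.
Let $(\varphi_j)_{1 \leq j \leq k}$ and $(\psi_{j'})_{1 \leq j' \leq d}$ be orthonormal bases in $\mathbb{C}^k$ and $\mathbb{C}^d$, respectively.
We consider the maximally entangled states
$$\varphi = \frac{1}{\sqrt{k}} \sum_{j=1}^k \varphi_j \otimes \bar{\varphi_j} , \qquad \psi = \frac{1}{\sqrt{d}} \sum_{j'=1}^d \psi_{j'} \otimes \bar{\psi_{j'}}$$
and compute
$$\begin{aligned}
\|M\|_\infty \geq \big| \langle \psi | M | \varphi \rangle \big|
&= \frac{1}{\sqrt{mkd}} \sum_{i=1}^m \sum_{j=1}^k \sum_{j'=1}^d \big| \langle \psi_{j'} \otimes \bar{\psi_{j'}} | A_i \otimes \bar{A_i} | \varphi_j \otimes \bar{\varphi_j} \rangle \big| \\
&= \frac{1}{\sqrt{mkd}} \sum_{i=1}^m \sum_{j=1}^k \sum_{j'=1}^d \big| \langle \psi_{j'} | A_i | \varphi_j \rangle \big|^2 \\
&= \frac{\sqrt{m}}{\sqrt{kd}} ,
\end{aligned}$$
where we used the fact that $\|X\|_{HS}^2 = \sum_{j,j'} \big| \langle \psi_{j'} | X | \varphi_j \rangle \big|^2$.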
Exercise 8.5 (Non-random counterexamples for $p > 2$). Let $W \subset \mathsf{M}_d$ be the
subspace of anti-symmetric matrices, i.e., such that $A^T = -A$.
(i) Show that for any $A \in W$, $\|A\|_\infty \leq \frac{1}{\sqrt{2}} \|A\|_{HS}$.
(ii) Let $\Phi$ be the quantum channel constructed from $W$ as in (8.6) and fix $p > 2$.
Using Lemma 8.7, show that the pair $(\Phi, \bar{\Phi})$ is an example for which (8.10) holds
for $d$ large enough.
8.3.2. Almost randomizing channels. A variant of the construction used in
the proof of Proposition 8.6 for $p = +\infty$ gives the following: a channel $\Phi : \mathsf{M}_d \to \mathsf{M}_d$
constructed from a generic random embedding $V : \mathbb{C}^d \to \mathbb{C}^d \otimes \mathbb{C}^N$ with $N = O(d)$
has the property that $\|\Phi(\rho)\|_{op} \leq C/d$ for any state $\rho \in \mathrm{D}(\mathbb{C}^d)$. In other words,
all output states have small eigenvalues. It is natural to ask whether similar lower
bounds on the eigenvalues of output states can also be achieved; showing that this
is indeed the case is the content of this section. Recall also (see Section 2.3.3) that
the dimension $N$ of the environment in the Stinespring representation is an upper
Recall that $\rho_* = \mathrm{I}/d$ denotes the maximally mixed state. These channels can
be thought of as approximations of the completely randomizing channel $R$, which is
defined by the property $R(\rho) = \rho_*$ for any $\rho \in \mathrm{D}(\mathbb{C}^d)$. The completely randomizing
channel has Kraus rank equal to $d^2$ (see Exercise 8.6). On the other hand,
it turns out that there exist $\varepsilon$-randomizing channels with a substantially smaller
Kraus rank, as shown by the following theorem. The dependence on $d$ is optimal
since any $\varepsilon$-randomizing channel has Kraus rank at least $d$, which is due to the fact
that rank one states must be mapped to full rank states.
Lemma 8.10. Let $\rho$ and $\sigma$ be pure states on $\mathbb{C}^d$ and let $(U_i)_{1 \leq i \leq N}$ be independent
Haar-distributed random unitary matrices. Then, for every $0 < \delta < 1$,
$$\mathbf{P}\left( \left| \frac{1}{N} \sum_{i=1}^N \mathrm{Tr}(U_i \rho U_i^\dagger \sigma) - \frac{1}{d} \right| \geq \frac{\delta}{d} \right) \leq 2 \exp(-c \delta^2 N) .$$
the square of a subgaussian variable) and satisfies $\|X_i\|_{\psi_1} \leq C$. The conclusion
follows now directly from Bernstein’s inequalities (Proposition 5.59).
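Lemma 8.11. Let $\Delta : \mathsf{M}_d^{sa} \to \mathsf{M}_d^{sa}$ be a linear map. Let $A$ be the quantity
$$A = \sup_{\rho \in \mathrm{D}(\mathbb{C}^d)} \|\Delta(\rho)\|_{op} = \sup_{\rho, \sigma \in \mathrm{D}(\mathbb{C}^d)} |\mathrm{Tr}\, \sigma \Delta(\rho)|$$
$$B = \sup_{\varphi, \psi \in \mathcal{N}} \big| \mathrm{Tr}\, |\psi\rangle\langle\psi| \Delta(|\varphi\rangle\langle\varphi|) \big| .$$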
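and similarly $\|P_\psi - P_{\psi_0}\|_1 \leq 2\delta$ (this simple bound is not optimal). We now write
$$|\mathrm{Tr}\, P_\psi \Delta(P_\varphi)| \leq |\mathrm{Tr} (P_\psi - P_{\psi_0}) \Delta(P_\varphi)| + |\mathrm{Tr}\, P_{\psi_0} \Delta(P_\varphi - P_{\varphi_0})| + |\mathrm{Tr}\, P_{\psi_0} \Delta(P_{\varphi_0})| .$$
Using (8.15) twice and taking the supremum over $\varphi, \psi$ gives $A \leq 2\delta A + 2\delta A + B$, hence
the result.
$$\mathbf{P}\Big( A \geq \frac{\varepsilon}{d} \Big) \leq \mathbf{P}\Big( B \geq \frac{\varepsilon}{2d} \Big) .$$
$$\mathbf{P}\Big( B \geq \frac{\varepsilon}{2d} \Big) \leq 16^{4d} \cdot 2 \exp(-c \varepsilon^2 N / 4) .$$
This is less than 1 if $N \geq C d / \varepsilon^2$, for some constant $C$.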
Exercise 8.6 (Kraus decomposition of the completely randomizing channel).
(i) Show that the Kraus rank of the completely randomizing channel $R$ is $d^2$.
(ii) Let $\omega = \exp(2i\pi/d)$ and $A, B$ be the unitary operators defined by their action
on the canonical basis by
(8.16)   $A|j\rangle = |j + 1 \bmod d\rangle , \qquad B|j\rangle = \omega^j |j\rangle .$
Show that the operators $(B^j A^k)_{1 \leq j,k \leq d}$ give a Kraus decomposition of $R$. These
operators are sometimes called the Heisenberg–Weyl operators.
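Part (ii) can be sanity-checked numerically before attempting a proof. The sketch below is a pure-Python illustration rather than a solution: all helper names are hypothetical, and it assumes the normalization in which the Kraus operators are $\frac{1}{d} B^j A^k$ (the unnormalized unitaries $B^j A^k$ sum to $d \cdot \mathrm{Tr}(\rho)\, \mathrm{I}$).

```python
import cmath


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(X):
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]


def matpow(X, e):
    n = len(X)
    R = [[complex(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        R = matmul(R, X)
    return R


def weyl_average(rho):
    """Return (1/d^2) * sum_{j,k} W rho W^dagger with W = B^j A^k."""
    d = len(rho)
    omega = cmath.exp(2j * cmath.pi / d)
    # A|c> = |c+1 mod d> (cyclic shift), B|r> = omega^r |r> (diagonal phases)
    A = [[complex(r == (c + 1) % d) for c in range(d)] for r in range(d)]
    B = [[omega ** r if r == c else 0j for c in range(d)] for r in range(d)]
    out = [[0j] * d for _ in range(d)]
    for j in range(1, d + 1):
        for k in range(1, d + 1):
            W = matmul(matpow(B, j), matpow(A, k))
            term = matmul(matmul(W, rho), dagger(W))
            for r in range(d):
                for c in range(d):
                    out[r][c] += term[r][c] / d ** 2
    return out


# A pure state on C^3: psi = (|0> + i|1>) / sqrt(2).
rho = [[0.5, -0.5j, 0], [0.5j, 0.5, 0], [0, 0, 0]]
for row in weyl_average(rho):
    print([round(abs(z), 12) for z in row])
```

The printed matrix should be close to $\mathrm{I}/3$, the maximally mixed state, consistent with the claim of the exercise under the stated normalization.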
provided by the next two lemmas.
Lemma 8.12. The Lipschitz constant of the function $\psi \mapsto E(\psi)$, defined on
$(S_{\mathbb{C}^k \otimes \mathbb{C}^d}, |\cdot|)$, is bounded from above by $C \log k$ for some absolute constant $C$.
This is clearly optimal up to the value of the constant $C$, since the function
$E$ maps $S_{\mathbb{C}^k \otimes \mathbb{C}^d}$ (which has diameter $\pi$, or $\pi/2$ if we consider $E$ as a function on
$\mathrm{P}(\mathbb{C}^k \otimes \mathbb{C}^d)$) onto the segment $[0, \log k]$. (Remember that in this chapter we always
assume $k \leq d$.) Note that, in view of (B.1), it does not matter, apart from the value
of the constant, whether we use the geodesic distance or the extrinsic distance. For
a discussion of the optimal values of the constants see Exercise 8.7.
Proof. We first check the commutative case by considering the function
$f : S^{k-1} \to [0, \log k]$ defined by
(8.17)   $f(x) = -\sum x_i^2 \log(x_i^2) ,$
i.e., the Shannon entropy of the probability distribution $(x_i^2) \in \Delta_k$. In the
terminology of (8.2), this is equivalent to restricting attention to vectors $\psi$ whose Schmidt
(8.18)   $|\nabla f(x)|^2 = 4 \sum_{i=1}^k x_i^2 (1 + \log(x_i^2))^2 \leq C \log^2 k ,$
where the last inequality can be obtained by observing that the function
$t \mapsto t(1 + \log t)^2$ is concave on $[0, e^{-2}]$, and so the quantity $|\nabla f(x)|$ increases when we
replace the coordinates of $x$ smaller than $e^{-1}$ by their $\ell_2$ average. It follows that
if $L$ is the Lipschitz constant of $f$ with respect to the geodesic distance on $S^{k-1}$,
then $L \leq C^{1/2} \log k$. Our objective is to show that the same constant works for
the function $\psi \mapsto E(\psi)$.
and let
(8.19)   $\tilde{f}(\psi) = -\sum_{i=1}^k \langle u_i | \rho | u_i \rangle \log(\langle u_i | \rho | u_i \rangle) .$
In other words, $\tilde{f}(\psi)$ is the entropy of the diagonal part of $\rho$, calculated in the
basis $(u_i)$. An important property of $\tilde{f}$ is that $\tilde{f}(\psi) = S(\rho)$ if $(u_i)$ is a basis which
diagonalizes $\rho$ (which is obvious from the definitions) and $\tilde{f}(\psi) \geq S(\rho)$ in general
(which is a consequence of concavity of $S$ and is the content of Exercise 1.50). Next,
one verifies that $\langle u_i | \rho | u_i \rangle = |P_i \psi|^2$, where $P_i$ is the orthogonal projection onto the
subspace $u_i \otimes \mathbb{C}^d \subset \mathbb{C}^k \otimes \mathbb{C}^d$. Since the map $\psi \mapsto (|P_1 \psi|, \dots, |P_k \psi|)$ is a contraction,
it follows that the Lipschitz constant of $\tilde{f}$ (with respect to $g$, the geodesic distance
on $S_{\mathbb{C}^k \otimes \mathbb{C}^d}$) is at most $L$.
We now return to the original question. Let $\psi_1, \psi_2 \in S_{\mathbb{C}^k \otimes \mathbb{C}^d}$; set
$\rho_j = \mathrm{Tr}_{\mathbb{C}^d} |\psi_j\rangle\langle\psi_j|$ for $j = 1, 2$ and let $\tilde{f}$ be defined by (8.19) using a basis $(u_i)$ which diagonalizes
$\rho_2$. Then
$$E(\psi_1) - E(\psi_2) = S(\rho_1) - S(\rho_2) = S(\rho_1) - \tilde{f}(\psi_2) \leq \tilde{f}(\psi_1) - \tilde{f}(\psi_2) \leq L\, g(\psi_1, \psi_2) .$$
Since the roles of $\psi_1$ and $\psi_2$ can be reversed, it follows that the Lipschitz constant
of $E$ with respect to $g$ is at most $L$ (and hence exactly $L$), as claimed.
Lemma 8.13 (not proved here, but see Remark 8.14). For $k \leq d$, the expectation
of the function $\psi \mapsto E(\psi)$ (with respect to the uniform measure on the unit sphere
in $\mathbb{C}^k \otimes \mathbb{C}^d$) satisfies
(8.20)   $\mathbf{E}\, E(\psi) = \Big( \sum_{j=d+1}^{kd} \frac{1}{j} \Big) - \frac{k-1}{2d} \geq \log k - \frac{1}{2} \frac{k}{d} .$
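Formula (8.20) is explicit enough to evaluate directly. The following Python sketch (function names are hypothetical, chosen for illustration) computes the exact mean value and compares it with the lower bound $\log k - k/(2d)$ and with the trivial upper bound $\log k$.

```python
import math


def mean_entanglement_entropy(k: int, d: int) -> float:
    """Exact value of E[E(psi)] as given by (8.20), for k <= d."""
    harmonic_tail = sum(1.0 / j for j in range(d + 1, k * d + 1))
    return harmonic_tail - (k - 1) / (2 * d)


def lower_bound(k: int, d: int) -> float:
    """Right-hand side of (8.20)."""
    return math.log(k) - k / (2 * d)


for k, d in [(2, 2), (4, 16), (8, 64)]:
    print(k, d, lower_bound(k, d), mean_entanglement_entropy(k, d), math.log(k))
```

In the regime $d = k^2$ considered later in the chapter, the gap between the mean entropy and $\log k$ is of order $1/k$.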
ity slightly weaker than (8.20) follows readily from Proposition 6.36 (or Exercise
6.43, which is even more elementary). First, with large probability, all Schmidt
coefficients of $\psi$ belong to the interval
$$\left[ \frac{1}{\sqrt{k}} - \frac{C}{\sqrt{d}} ,\ \frac{1}{\sqrt{k}} + \frac{C}{\sqrt{d}} \right]$$
$E(\psi) = S(\rho) \geq \log k - C' k/d$. (The use of Lemma 1.20 requires $\varepsilon \leq 1$; for larger $\varepsilon$
we may use the simpler bound $S(\rho) \geq S_\infty(\rho) = -\log \|\rho\|_\infty$.)
Theorem 8.15. Let $\varepsilon > 0$ and $m \leq c \varepsilon^2 kd / \log^2 k$. Then most $m$-dimensional
2d
In some cases the result given by Theorem 8.15 can be improved. In particular,
in order to obtain violations for the additivity of $S^{\min}$ we will need to produce
“extremely entangled subspaces,” in which every state has entropy $\log(k) - o(1)$
Exercise 8.9 (An upper bound on the minimal entropy for general subspaces).
Let $W \subset \mathbb{C}^k \otimes \mathbb{C}^d$ be a subspace of dimension $\alpha kd$, with $\alpha \geq 1/k$. (i) Using
the previous exercise, show that $W$ contains a unit vector $\psi$ satisfying
$E(\psi) \leq h(\alpha) + (1 - \alpha) \log(k - 1)$, where $h(t) = -t \log t - (1 - t) \log(1 - t) \leq \log 2$ is the
binary entropy function. (ii) Conclude that if $\lambda \geq 1$ and $E(\psi) \geq \log k - \lambda/k$ for all
$\psi \in W$, then $\dim W = O\big( \lambda d / (1 + \log \lambda) \big)$.
8.4.2. Entangled subspaces of small codimension. The argument from
the previous section gives nothing for subspaces of dimension $cdk$ or larger: if
$\varepsilon = \log d$, the conclusion of Theorem 8.15 does not even imply nonnegativity of
$E(x)$. However, in view of Theorem 8.1, it seems plausible to quantify entanglement
on subspaces of larger dimension. This can be achieved provided we use a suitable
measure of entanglement.
One possibility is to use the $p$-Rényi entropy for $p = 1/2$. Recall from (8.5)
that if we identify a unit vector $x \in \mathbb{C}^k \otimes \mathbb{C}^d$ with $A \in \mathsf{M}_{k,d}$, then
$$E_{1/2}(x) = 2 \log \|A\|_1 ,$$
and our problem becomes a question about the behavior of $\|\cdot\|_1$ vs. $\|\cdot\|_2$ on subspaces
of $\mathsf{M}_{k,d}$.
Theorem 8.16. Let $k \leq d$, and $W \subset \mathbb{C}^k \otimes \mathbb{C}^d$ be a random subspace of
dimension $m$. The following holds with large probability: for every unit vector $x \in W$,
$$E_{1/2}(x) \geq \log(k - m/d) - C .$$
Proof. We identify $\mathbb{C}^k \otimes \mathbb{C}^d$ with $\mathsf{M}_{k,d}$, and apply the low $M^*$-estimate
(Theorem 7.45) to the norm $\|\cdot\|_1$. One needs the value of $M^* := \mathbf{E} \|X\|_{op}$, where $X$
is uniformly distributed on the Hilbert–Schmidt sphere in $\mathsf{M}_{k,d}$. The inequality
for every $A \in W$,
$$\|A\|_1 \geq c \sqrt{k} \sqrt{\alpha}\, \|A\|_{HS} ,$$
entanglement, say $\log k - o(1)$ for example. In view of Lemma 8.13, this requires
$k = o(d)$. For simplicity, we will focus on the case $d = k^2$. This choice of dimensions
allows us to produce an example of a pair of channels violating the additivity
relation (8.8), although the method is applicable to a wider range of parameters.
Proposition 8.17. There are absolute constants $c, C$ such that the following
holds. Let $k$ be an integer and set $d = k^2$, $m = ck^2$. With large probability,
a random $m$-dimensional subspace $W \subset \mathbb{C}^k \otimes \mathbb{C}^d$ has the property that any unit
vector $\psi \in W$ satisfies
$$E(\psi) \geq \log k - \frac{C}{k} .$$
Lemma 8.19. If $\rho$ is any state on $\mathbb{C}^k$, then
$$S(\rho) \geq \log k - k \|\rho - \rho_*\|_{HS}^2 .$$
Proof. The following inequality compares the entropy with its second order
approximation: for every $x, t \in [0, 1]$,
(8.21)   $-x \log x \geq -t \log t - (1 + \log t)(x - t) - \frac{1}{t} (x - t)^2 .$
To check inequality (8.21), notice that it can be rewritten as $\log(y) \leq y - 1$ with
$y = x/t$. Given a state $\rho \in \mathrm{D}(\mathbb{C}^k)$ with eigenvalues $(p_i)_{1 \leq i \leq k}$, we apply (8.21) with
$x = p_i$ and $t = 1/k$. Summing over $i$, we obtain the announced inequality.
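Since both sides of Lemma 8.19 depend only on the spectrum of $\rho$, the inequality is easy to test numerically. The sketch below is an illustration only (the function names and the way random spectra are sampled are assumptions made here); it uses $\|\rho - \rho_*\|_{HS}^2 = \sum_i (p_i - 1/k)^2$.

```python
import math
import random


def entropy(p):
    """Shannon entropy of a probability vector (natural logarithm)."""
    return -sum(x * math.log(x) for x in p if x > 0)


def lemma_8_19_gap(p):
    """S(rho) - (log k - k * ||rho - rho_*||_HS^2) for a state with spectrum p."""
    k = len(p)
    hs_sq = sum((x - 1.0 / k) ** 2 for x in p)
    return entropy(p) - (math.log(k) - k * hs_sq)


def random_spectrum(k, rng):
    """A random probability vector of length k (illustrative sampling only)."""
    w = [rng.random() for _ in range(k)]
    s = sum(w)
    return [x / s for x in w]


rng = random.Random(0)
worst = min(lemma_8_19_gap(random_spectrum(k, rng))
            for k in range(2, 11) for _ in range(200))
print(worst)                              # nonnegative, as the lemma predicts
print(lemma_8_19_gap([0.25] * 4))         # essentially 0 at the maximally mixed state
```

The gap vanishes at the maximally mixed state, which reflects the fact that the lemma is a second order estimate around $\rho_*$.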
It will be more convenient to work with a random matrix $M \in \mathsf{M}_{k,d}$ of
Hilbert–Schmidt norm 1, rather than with a random unit vector $\psi \in \mathbb{C}^k \otimes \mathbb{C}^d$ (both
approaches are equivalent, see Section 0.8). Also recall that when a vector $\psi$ is
identified with a matrix $M$, we have $\mathrm{Tr}_{\mathbb{C}^d} |\psi\rangle\langle\psi| = M M^\dagger$, see (2.12).
Proposition 8.20. There are absolute constants $c, C$ such that the following
holds. Let $k$ be an integer, $d = k^2$, $m = ck^2$ and let $S_{HS}$ be the Hilbert–Schmidt
$$g(M) = \Big\| M M^\dagger - \frac{\mathrm{I}}{k} \Big\|_{HS} .$$
With large probability, a random $m$-dimensional subspace $W \subset \mathsf{M}_{k,d}$ has the
property that
Remark 8.21. We wish to point out that while Proposition 8.20 will be derived
from Dvoretzky for Lipschitz functions, it can be rephrased in the language of the
standard Dvoretzky’s theorem. Indeed, its assertion says that for every $M \in W$
with $\|M\|_{HS} = 1$ we have
(8.23)   $\frac{C^2}{k^2} \geq \Big\| M M^\dagger - \frac{\mathrm{I}}{k} \Big\|_{HS}^2 = \mathrm{Tr}\, |M|^4 - \frac{2\, \mathrm{Tr}\, M M^\dagger}{k} + \frac{\mathrm{Tr}\, \mathrm{I}}{k^2} = \mathrm{Tr}\, |M|^4 - \frac{1}{k} \geq 0 .$
Consequently,
(8.24)   $k^{-1/4} \|M\|_{HS} \leq \|M\|_4 \leq k^{-1/4} \Big( 1 + \frac{C^2}{k} \Big)^{1/4} \|M\|_{HS} \leq k^{-1/4} \Big( 1 + \frac{C^2}{4k} \Big) \|M\|_{HS}$
for all $M \in W$. In other words, $W$ is $(1 + \delta)$-Euclidean, with $\delta = \frac{C^2}{4k}$, when
considered as a subspace of the Schatten normed space $\big( \mathsf{M}_{k,d}, \|\cdot\|_4 \big)$. On the other
hand, the Dvoretzky dimension of $\big( \mathsf{M}_{k,d}, \|\cdot\|_4 \big)$ equals $k^{1/2} d$ (see Theorem 7.37) and
therefore the general theory (such as Theorem 7.19) gives only $\delta = O(k^{-1/4})$ for
$m$-dimensional subspaces. Although the Dvoretzky dimension is sharp for the size of
isomorphically Euclidean subspaces (in the sense exemplified in Exercises 7.12 and
7.25), (8.24) supplies an instance where it can be beaten for almost isometrically
Euclidean subspaces.
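The algebraic identity in (8.23), namely $g(M)^2 = \mathrm{Tr}\, |M|^4 - 1/k$ for $\|M\|_{HS} = 1$, can be confirmed on a random matrix. The following pure-Python sketch is illustrative only; the helper names are hypothetical, and the random matrix is produced by normalizing a matrix with independent complex Gaussian entries.

```python
import random


def gram(M):
    """Return M M^dagger for a k x d matrix M given as a list of rows."""
    return [[sum(x * y.conjugate() for x, y in zip(ra, rb)) for rb in M]
            for ra in M]


def random_unit_matrix(k, d, rng):
    """A k x d complex matrix with Hilbert-Schmidt norm 1."""
    M = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
         for _ in range(k)]
    hs = sum(abs(x) ** 2 for row in M for x in row) ** 0.5
    return [[x / hs for x in row] for row in M]


def g_squared(M):
    """|| M M^dagger - I/k ||_HS^2, i.e. g(M)^2."""
    k = len(M)
    G = gram(M)
    return sum(abs(G[a][b] - (1.0 / k if a == b else 0.0)) ** 2
               for a in range(k) for b in range(k))


def trace_fourth_power(M):
    """Tr |M|^4 = Tr (M M^dagger)^2 = sum of |entries of M M^dagger|^2."""
    G = gram(M)
    k = len(M)
    return sum(abs(G[a][b]) ** 2 for a in range(k) for b in range(k))


rng = random.Random(1)
M = random_unit_matrix(3, 9, rng)          # k = 3, d = k^2
print(g_squared(M), trace_fourth_power(M) - 1.0 / 3)
```

The two printed numbers should agree up to rounding, and the second is nonnegative, matching the last inequality in (8.23).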
Before embarking on the proof of Proposition 8.20 we offer some preliminary
remarks. We know from Proposition 6.36 (the elementary argument from Exercise
6.43 would actually be sufficient) that all singular values of a typical $M \in S_{HS}$
belong to the interval
(8.25)   $\left[ \frac{1}{\sqrt{k}} - \frac{C}{\sqrt{d}} ,\ \frac{1}{\sqrt{k}} + \frac{C}{\sqrt{d}} \right] .$
It follows that $\|M M^\dagger - \mathrm{I}/k\|_\infty = O(k^{-3/2})$ and thus the median $M_g$ of $g$ satisfies
$M_g \leq C/k$. We next estimate the Lipschitz constant of $g$. The inequality
$$\|M M^\dagger - N N^\dagger\|_{HS} \leq \|M (M^\dagger - N^\dagger) + (M - N) N^\dagger\|_{HS} \leq (\|M\|_{op} + \|N\|_{op}) \|M - N\|_{HS}$$
has the following immediate consequence.
Lemma 8.22. Let $\Omega_t = \{ M \in S_{HS} : \|M\|_{op} \leq t \}$ for some $t \geq 0$. The function
defined on $\Omega_t$ by $M \mapsto M M^\dagger$ is $2t$-Lipschitz with respect to the Hilbert–Schmidt
norm.
In particular, the function $g$ is 2-Lipschitz on $\Omega_1 = S_{HS}$. However, a direct
(8.26)   $\mathbf{P}\big( f(M) \geq 3/\sqrt{k} \big) \leq \frac{1}{2} \exp(-k^2) .$
$n = kd$ and that the Dvoretzky dimension is of order $d$, see Theorem 7.37) shows
probability.
Starting from this point, we will present two possible paths to complete the
proof of Proposition 8.20. The first argument uses twice the general Dvoretzky
theorem for Lipschitz functions (Theorem 7.15) with the optimal dependence on
ε. The second argument is based on a trick due to Fukuda making the overall
argument more elementary. In terms of the hierarchy discussed at the beginning of
Section 6.1, the first proof we give uses principles from level (ii), namely the Dudley
inequality, whereas the second argument uses a single ε-net, staying at level (i).
Proof #1 of Proposition 8.20. We know from Lemma 8.22 that the
function $g$ is $2t$-Lipschitz on $\Omega_t$. Let $\tilde{g}$ be a $2t$-Lipschitz extension of $g|_\Omega$ to $S_{HS}$. Note
$$\sup_{S_{HS} \cap W} |\tilde{g} - \mu| \leq 1/k$$
on a random subspace $W \subset \mathsf{M}_{k,d}$ of dimension $m = c_0 \cdot kd \cdot \big( k^{-1} / (6 k^{-1/2}) \big)^2 = cd$.
We then have
$$\sup_{S_{HS} \cap W} \tilde{g} \leq \mu + \frac{1}{k} \leq \frac{C'}{k} .$$
If $S_{HS} \cap W \subset \Omega$ (which, as noticed before, holds with large probability), $g$ and
$\tilde{g}$ coincide on $S_{HS} \cap W$ and therefore $g \leq C'/k$ on $S_{HS} \cap W$, proving (8.22).
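Proof #2 of Proposition 8.20. We use the following lemma, which allows
us to discretize the supremum in (8.22).
Lemma 8.23. Let $\mathcal{N}$ be an $\varepsilon$-net in $(S_{HS} \cap W, |\cdot|)$ with $\varepsilon < \sqrt{2} - 1$. Then
$$\sup_{M \in S_{HS} \cap W} g(M) \leq \frac{1}{1 - \varepsilon^2 - 2\varepsilon} \sup_{M \in \mathcal{N}} g(M)$$
$$\Delta := M M^\dagger - M_0 M_0^\dagger = \frac{\delta}{2} \big( A A^\dagger - B B^\dagger + 2\delta N N^\dagger \big) ,$$
$$\|\Delta\|_{HS} \leq \frac{\delta}{2} \big( \|A A^\dagger - (2 - \delta) \rho_*\|_{HS} + \|B B^\dagger - (2 + \delta) \rho_*\|_{HS} + \|2\delta N N^\dagger - 2\delta \rho_*\|_{HS} \big) .$$
$$\leq g(M_0) + \frac{\delta}{2} \big( (2 - \delta)\, g(A/\|A\|_{HS}) + (2 + \delta)\, g(B/\|B\|_{HS}) + 2\delta\, g(N) \big)$$
and this quantity is (much) smaller than 1 provided $m \leq c_1 d$, for sufficiently small
$c_1 > 0$. Since $M_g = O(1/k)$, this concludes the proof.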
8.4.4. Counterexamples to the additivity problem. Using Proposition
8.17 and the approach used in Proposition 8.6 for the $p$-Rényi entropy, we can show
the following.
Proposition 8.24. There is a constant $c$ such that the following holds. Let
$d = k^2$, $m = ck^2$ and $\Phi : \mathsf{M}_m \to \mathsf{M}_k$ be a random channel, obtained by (8.6)
from a Haar-distributed isometry $V : \mathbb{C}^m \to \mathbb{C}^k \otimes \mathbb{C}^d$. Set $\Psi = \bar{\Phi}$, the channel
obtained from $\bar{V}$, the complex conjugate of $V$. If $k$ is large enough, then with large
probability,
$$S^{\min}(\Phi \otimes \Psi) < S^{\min}(\Phi) + S^{\min}(\Psi) .$$
Proof. Denote by $W \subset \mathbb{C}^k \otimes \mathbb{C}^d$ the range of $V$. From Lemma 8.2, we have
Note that $S^{\min}(\Phi) = S^{\min}(\Psi)$. From Proposition 8.17, we have with large
probability
entangled state yields an output state with an eigenvalue greater than or equal
to $\frac{\dim W}{\dim \mathsf{M}_{k,d}} = \frac{m}{kd} = \frac{c}{k}$. Then, a simple argument using just concavity of $S$ (see
Proposition 1.19) reduces the problem to calculating the entropy of the state with
one eigenvalue equal to $\frac{c}{k}$ and all the remaining ones identical, which yields
$$S^{\min}(\Phi \otimes \Psi) \leq 2 \log k - \frac{c \log k}{k} + \frac{1}{k} .$$
We have therefore $S^{\min}(\Phi \otimes \Psi) < S^{\min}(\Phi) + S^{\min}(\Psi)$ provided $k$ is large enough.
(8.27)   $g(\psi) = \max \big\{ |\langle \psi, \psi_1 \otimes \cdots \otimes \psi_k \rangle| : \psi_i \text{ unit vector in } \mathcal{H}_i,\ 1 \leq i \leq k \big\}$
the usual notion of a maximally entangled state (see Section 2.2.4). However, in
the multipartite case it seems hard to describe the maximally entangled vectors.
The problem has an immediate geometric reformulation.
Proposition 8.25 (easy). Let $\mathcal{H} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_k$. The following numbers are
equal.
(i) The minimal value of $g(\psi)$ over all unit vectors $\psi \in \mathcal{H}$.
(ii) The inradius of $B_{\mathcal{H}_1} \hat{\otimes} \cdots \hat{\otimes} B_{\mathcal{H}_k}$, where $B_{\mathcal{H}_i}$ denotes the unit ball in $\mathcal{H}_i$.
(iii) The largest constant $c$ such that any $k$-linear map $\phi : \mathcal{H}_1 \times \cdots \times \mathcal{H}_k \to \mathbb{C}$
satisfies
$$c\, |||\phi||| \leq \max \{ |\phi(x_1, \dots, x_k)| : |x_1| \leq 1, \dots, |x_k| \leq 1 \} ,$$
where $||| \cdot |||$ denotes the norm
$$|||\phi|||^2 = \sum_{x_1 \in \mathcal{B}_1} \cdots \sum_{x_k \in \mathcal{B}_k} |\phi(x_1, \dots, x_k)|^2$$
with $\mathcal{B}_i$ an orthonormal basis in $\mathcal{H}_i$ (the value of $||| \cdot |||$ does not depend on the
choice of the bases).
Denote by $g_{\min}(\mathcal{H})$ the common value of the numbers appearing in Propo-
Proof of Lemma 8.26. The same argument works for the real case and the
$\min(d_1, d_2)$
which is a restatement of the inequalities between the trace norm and the
Hilbert–Schmidt norm on the space of $d_1 \times d_2$ matrices. For the induction step, we use the
bound (which is again the $k = 2$ case)
$$g_{\min}(\mathbb{C}^{d_1} \otimes \mathcal{H}) \geq \frac{1}{\sqrt{d_1}}\, g_{\min}(\mathcal{H}) .$$
8.5.2. The case of many qubits. We will now focus, for simplicity, on the
particular case of $k$ qubits, i.e., $d_1 = d_2 = \cdots = d_k = 2$ in the complex case.
In this section it is convenient to define entropy via logarithm to the base
2 and so we will exceptionally use $E_\infty^{(2)}(\psi) := -2 \log_2 g(\psi)$ (cf. (8.28)). In this
notation, the conclusion of Lemma 8.26 can be rewritten as follows: for any pure
state $\psi \in (\mathbb{C}^2)^{\otimes k}$, we have $E_\infty^{(2)}(\psi) \leq k - 1$. The following seems to be unknown.
Problem 8.27. Does there exist a constant $C$, and for each $k$ a unit vector
$\psi \in (\mathbb{C}^2)^{\otimes k}$, such that
$$E_\infty^{(2)}(\psi) \geq k - C\ ?$$
The next proposition shows that random states are typically very entangled,
but not entangled enough to give a positive answer to Problem 8.27.
Proposition 8.28. There exist absolute constants c, C such that a uniformly
distributed random unit vector ψ P pC2 qbk satisfies with high probability
ion
? ?
k log k k log k
c ď gpψq ď C .
2k{2 2k{2
ut
The conclusion of Proposition 8.28 can be equivalently rewritten as
rib
p2q
k ´ logpkq ´ log logpkq ´ C 1 ď E8 pψq ď k ´ logpkq ´ log logpkq ` C 1 .
Proof of Proposition 8.28. The average of $g$ over the unit sphere is exactly the mean width of $K=(B_{\mathbb{C}^2})^{\hat\otimes k}$ (we think of $(\mathbb{C}^2)^{\otimes k}$ as a $2^{k+1}$-dimensional real space). The concentration of the functional $g$ around its mean follows from Lévy's lemma (see Table 5.2). Indeed, since $K$ is contained in the unit ball, the functional $g=w(K,\cdot)$ is 1-Lipschitz and therefore
\[ \mathbf{P}(|g(\psi)-w(K)|>t)\le 2\exp(-2^k t^2). \]
It remains to show that $w(K)=\Theta(\sqrt{k\log k}\,2^{-k/2})$, or equivalently that $w_G(K)=\Theta(\sqrt{k\log k})$. The upper bound follows from a standard $\varepsilon$-net argument: let $\mathcal{N}$ be an $\varepsilon$-net in $(S_{\mathbb{C}^2},|\cdot|)$ with $\mathrm{card}\,\mathcal{N}\le(2/\varepsilon)^4$ (see Lemma 5.3). From Exercise 5.7 (the weaker result from Lemma 5.9 would be enough here), it follows that we have
\[ \mathrm{conv}(\mathcal{N}^{\otimes k}) \supset (1-\varepsilon^2/2)^k K. \]
\[ w_G(\mathrm{conv}(\mathcal{N}^{\otimes k})) \le \sqrt{2\log\mathrm{card}(\mathcal{N}^{\otimes k})} \le \sqrt{8k\log(2/\varepsilon)}. \]
5.10; note that $\mathsf{P}(\mathbb{C}^2)$ identifies with the Bloch sphere, a 2-dimensional Euclidean sphere of radius $1/2$, if we use the metric (B.5).) This means that for $i\ne j$, we have $|\langle x_i,x_j\rangle|\le 1-1/2k$.
We claim that a large subset of $\mathcal{M}^{\otimes k}$ is separated. To construct it, introduce $Q=\{1,\dots,N\}^k$, equipped with the normalized Hamming metric, defined for $\alpha,\beta\in Q$ by
\[ d(\alpha,\beta)=\frac1k\,\mathrm{card}\{i : \alpha_i\ne\beta_i\}. \]
To each element $\alpha=(\alpha_1,\dots,\alpha_k)\in Q$ we associate the vector
\[ x_\alpha = x_{\alpha_1}\otimes\cdots\otimes x_{\alpha_k}\in K. \]
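The normalized Hamming metric, and the way it controls overlaps of product vectors, can be illustrated by a minimal Python sketch. The function names and the parameter `s` (an upper bound on the single-site overlap between distinct points of the net) are hypothetical and introduced only for illustration. The overlap bound rests on the fact that the inner product of two product vectors is the product of the site-wise inner products, so matching sites contribute a factor of 1.

```python
def hamming(alpha, beta):
    """Normalized Hamming distance between two index tuples of the same length k."""
    if len(alpha) != len(beta):
        raise ValueError("tuples must have the same length")
    k = len(alpha)
    return sum(1 for a, b in zip(alpha, beta) if a != b) / k


def overlap_bound(alpha, beta, s):
    """Upper bound on |<x_alpha, x_beta>| for product vectors.

    Assumption: distinct single-site vectors have overlap at most s (0 <= s <= 1).
    Each differing site then contributes a factor at most s.
    """
    differing = sum(1 for a, b in zip(alpha, beta) if a != b)
    return s ** differing
```

Tuples that are far apart in the Hamming metric therefore index product vectors with small overlap, which is what the separation argument needs.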
NOTES AND REMARKS 231
\[ w_G(K)\ge c\sqrt{\log\mathrm{card}\,Q}. \]
It remains to give a lower bound on the size of $Q$. Using the inequality (5.17) from Chapter 5 (which was obtained by the greedy packing algorithm), we obtain $\mathrm{card}\,Q\ge N^{k(1-H_N(1/5))}\ge N^{c_2k}$ for some constant $c_2>0$. It follows that $w_G(K)\ge c\sqrt{k\log k}$.
8.5.3. Multipartite entanglement in real Hilbert spaces. It turns out that in the real case, Lemma 8.26 is surprisingly sharp, so that the real version of Problem 8.27 has a positive answer with $C=1$. The construction from Proposition 8.29 seems to be specific to the real case. For variants related to Clifford algebras, see Exercise 8.10.
Proposition 8.29. For any integer $k\ge1$, we have
\[ g_{\min}((\mathbb{R}^2)^{\otimes k}) = 2^{-(k-1)/2}. \]
will follow provided we show the existence of a $k$-linear form $\phi:(\mathbb{R}^2)^k\to\mathbb{R}$ such that $|\phi(x_1,\dots,x_k)|\le1$ for unit vectors $x_1,\dots,x_k$, and $|||\phi|||=2^{(k-1)/2}$. Let
\[ \phi:(x_1,\dots,x_k)\mapsto\mathrm{Re}\prod_{i=1}^k\theta(x_i) \]
(where $\prod$ means complex multiplication) satisfies the desired conclusion.
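The two properties required of the form $\phi$ can be checked numerically with a short Python sketch. It assumes that $\theta$ is the standard identification of $\mathbb{R}^2$ with $\mathbb{C}$, $(a,b)\mapsto a+ib$; the definition of $\theta$ is not visible in the surrounding text, so this choice is an assumption. Under it, $|\theta(x)|=|x|$ gives $|\phi|\le1$ on unit vectors, and summing $\phi^2$ over the canonical basis gives $|||\phi|||^2$.

```python
from itertools import product


def theta(x):
    """Assumed identification R^2 -> C: (a, b) -> a + i*b."""
    return complex(x[0], x[1])


def phi(*xs):
    """Real part of the complex product of theta(x_1), ..., theta(x_k)."""
    z = complex(1.0, 0.0)
    for x in xs:
        z *= theta(x)
    return z.real


def norm_sq(k):
    """|||phi|||^2: sum of phi^2 over all k-tuples of canonical basis vectors."""
    basis = [(1.0, 0.0), (0.0, 1.0)]
    return sum(phi(*xs) ** 2 for xs in product(basis, repeat=k))
```

A basis tuple contributes 1 exactly when it contains an even number of copies of the second basis vector, which happens for half of the $2^k$ tuples, in agreement with $|||\phi|||=2^{(k-1)/2}$.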
\[ g_{\min}((\mathbb{R}^d)^{\otimes k}) \le \frac{\sqrt{N}}{d^{k/2}}. \tag{8.29} \]
When $d\in\{2,4,8\}$, one can achieve $N=d$ and the upper bound (8.29) matches the lower bound from Lemma 8.26.
Section 8.2. There are multiple operational motivations to use the von Neu-
mann entropy when defining the entropy of entanglement in (8.1). Given a bipartite
state ρ, there are several ways to quantify how much entanglement it contains. Two
approaches that are in some sense extremal and dual to each other are the entan-
glement of distillation (the rate at which one can LOCC-transform copies of ρ into
Bell states, see also Chapter 12) and the entanglement cost (the rate at which one
can LOCC-transform Bell states into copies of $\rho$). For a general survey on entanglement measures we refer to [PV07]. If we restrict ourselves to pure states as we do in this chapter, all these entanglement measures coincide with the entropy of entanglement (see Chapter 12.5.2 in [NC00]).
The “additivity conjecture” (8.8) has been a major open problem in QIT, particularly since work by Shor [Sho04], who showed that the additivity of the minimum output von Neumann entropy was equivalent to the additivity of several other quantities, including the capacity of quantum channels to carry classical information and the entanglement of formation (defined later in Section 10.3.1). For example, the entire ICM 2006 talk by A. Holevo [Hol06] was devoted to this circle of ideas. A positive answer would have greatly simplified the theory, leading to a “single letter” formula for the aforementioned capacity, see, e.g., [Hol06]. However, the answer to the conjecture was shown to be negative by Hastings [Has09].
Exercise 8.3 is based on [FW07].
Proposition 8.4 was proved in [Wat05, Aud09, Sza10]. We follow here the argument from [Sza10].
Our presentation in this chapter barely scratches the surface of the topic of quantum channel capacities. In the quantum context, there are many notions of capacity (see, e.g., [Wil17]) and each of them leads to its own class of mathemat-
considered in [WH02] and solved in [HW08]. The presentation in the text is based on [ASW10], where the connection to Dvoretzky's theorem was noticed. It is also known that $\|\cdot\|_{1\to p}$ is not multiplicative for $p$ close to 0 [CHL+08], but part of the range $0\le p<1$ is not covered by any approach. The explicit example from
been made in [Aub09], where it was shown that the unitaries in question can be sampled from any Kraus decomposition of the completely randomizing channel.
Section 8.4. Lemma 8.12 appears in [HLW06] with the value $C=\sqrt{8}/\log 2$. The argument leading to a better constant ($C_k\sim1$) in Lemma 8.12 that is sketched in Exercise 8.7 was an unpublished byproduct of the work on [ASW11]. For various aspects of continuity of the von Neumann entropy, see [Win16].
The exact formula (8.20) from Lemma 8.13 has been conjectured in [Pag93] and proved in [FK94, SR95, Sen96]. Having the precise form (as opposed to the weaker version stated in Remark 8.14) results in better constants in Theorem 10.16 in Section 10.3.1.
CFN15]. Fix an integer $k$ and $t\in(0,1)$. There is a deterministic convex set $K_{k,t}\subset\mathrm{D}(\mathbb{C}^k)$ with the following property: if $\Phi:M_m\to M_k$ is a quantum channel obtained from a random embedding $V:\mathbb{C}^m\to\mathbb{C}^k\otimes\mathbb{C}^d$ with $m=tkd$, then, almost surely as $d\to\infty$, the set $\Phi(\mathrm{D}(\mathbb{C}^m))$ converges to $K_{k,t}$. This makes it possible, at least in principle, to answer any question about minimal output entropies in this range of parameters. It was subsequently shown in [BCN16] that generic channels violating additivity can be obtained by following this strategy if and only if $k\ge183$. Moreover, the defect of non-additivity, i.e., the difference between the two sides of (8.9), is generically almost $\log 2$ for large $k$ (or 1 bit if we use $\log_2$ to define entropy). This improves on the preceding arguments, including the one presented in the text, which showed a violation that was minuscule. Still, in contrast with the Hayden–Winter example [HW08] (cf. Remark 8.8), the demonstrated violation does not go to infinity as the dimensions increase. A drawback of the free probability-based method is that the results are valid only when the environment dimension $d$ goes to infinity, and obtaining explicit values of $d$ for which these asymptotic phenomena hold requires extra analysis, which is not supplied in [BCN16]. For more information on this approach we refer to the survey [CN16]. Still another approach, due to Collins [Col16] and perhaps more conceptual, relies on the Haagerup inequality about the norms of convolutions on the free group.
natural question. It is known that $E_\infty^{(2)}(\psi)<k-1$ for any unit vector $\psi\in(\mathbb{C}^2)^{\otimes k}$ whenever $k\ge3$ (see [JHK+08]). The fact that random states are very entangled (the upper bound from Proposition 8.28) has been noticed and used in [GFE09, BMW09].
The argument behind Proposition 8.29 and Exercise 8.10 has been communi-
cated to us by Mikael de la Salle (see also Theorem 3.3 in [Hil07a]). The ? pa-
3 b4
pers [Hil06, Hil07a]? compute also the exact values gmin ppR q q “ 1{ 7 and
gmin ppR3 qb4 q “ 1{ 21.
CHAPTER 9
Let $\mathcal{H}=\mathcal{H}_1\otimes\cdots\otimes\mathcal{H}_k$ be a multipartite Hilbert space. We are interested in the geometry of the set of separable states on $\mathcal{H}$, and related questions. To simplify the exposition we are going to focus on two specific cases: the bipartite case $\mathcal{H}=\mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2}$ (we may restrict ourselves to the balanced case $d_1=d_2=d$ in order to keep notation simple) and the case of $k$ qubits $\mathcal{H}=(\mathbb{C}^2)^{\otimes k}$. However, essentially all the methods carry over to the general case, except that the formulas may sometimes become not very elegant (see, for example, Theorem 9.12). The sets $\mathrm{D}=\mathrm{D}(\mathcal{H})$, $\mathrm{Sep}=\mathrm{Sep}(\mathcal{H})$ and $\mathrm{PPT}=\mathrm{PPT}(\mathcal{H})$ were defined in Chapter 2. Recall that $\mathrm{Sep}\subset\mathrm{PPT}\subset\mathrm{D}$. One of the main goals of this chapter is to produce a table (Table 9.1) which contains radii estimates for these sets, similar to Table 4.1 for the classical examples of convex bodies. The following table (Table 9.2) matches estimates from Table 9.1 to the corresponding theorems in the text.
ot
Table 9.1. Radii estimates for sets of quantum states. In each
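[Table body not recoverable. Surviving entries of the row for Sep, with $n=d^2$: $\Theta^*(n^{-3/4})$, $\Theta(n^{-3/4})$, $\Theta(n^{-3/4})$, $\frac{1}{\sqrt{n(n-1)}}$, $\sqrt{\frac{n-1}{n}}$.]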
We next clarify the statements about the radii appearing in Table 9.1. They are
all computed with respect to the Hilbert–Schmidt Euclidean structure. Both inradii
and outradii are computed for Hilbert–Schmidt balls centered at the maximally
mixed state ρ˚ . This choice of a center is optimal: one may argue that the optimal
center can be chosen to be invariant under isometries of the convex set, and this
property characterizes ρ˚ (see Propositions 2.5 and 2.18, cf. Exercise 4.51 and its
hint). Statements referred to as trivial in Table 9.2 follow from (2.7).
236 9. GEOMETRY OF THE SET OF MIXED STATES
Table 9.2. References for proofs of the results from Table 9.1.
Some arguments require considering the affine space $H_1$ of trace one Hermitian matrices as a vector space with $\rho_*$ as the origin. In order to emphasize this point of view we use a specialized notation: if $\rho\in H_1$ and $t\in\mathbb{R}$, then we write
\[ t\bullet\rho := t\rho+(1-t)\rho_*. \tag{9.1} \]
If $K\subset H_1$, we denote $t\bullet K=\{t\bullet x : x\in K\}$. A similar caveat applies to polarity calculated inside the space $H_1$.
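The dilation (9.1) about the maximally mixed state is easy to experiment with. Below is a minimal Python sketch using exact rational arithmetic on matrices stored as nested lists; the helper names (`maximally_mixed`, `bullet`, `trace`) are hypothetical, and only real rational entries are handled, which is enough to illustrate the definition.

```python
from fractions import Fraction


def maximally_mixed(n):
    """The maximally mixed state I/n as an n x n matrix of Fractions."""
    return [[Fraction(int(i == j), n) for j in range(n)] for i in range(n)]


def bullet(t, rho):
    """t . rho := t*rho + (1 - t)*rho_star, where rho_star = I/n."""
    n = len(rho)
    rho_star = maximally_mixed(n)
    t = Fraction(t)
    return [[t * rho[i][j] + (1 - t) * rho_star[i][j] for j in range(n)]
            for i in range(n)]


def trace(m):
    """Trace of a square matrix given as nested lists."""
    return sum(m[i][i] for i in range(len(m)))
```

The operation preserves the trace for every real $t$ (including negative $t$), fixes $\rho_*$, and reduces to the identity map at $t=1$.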
It is a remarkable fact that, despite sharing the same inradii and outradii, the sets Sep and D behave so differently with respect to volume radius. In particular, the proportion of states on $\mathbb{C}^d\otimes\mathbb{C}^d$ which are separable, when measured in terms of volume, is extremely small: of order $\exp(-cd^4\log d)$. We will return to such considerations in Chapter 10.
9.1. Volume and mean width estimates
In this section, we prove the volume radius and mean width estimates from Table 9.1. In particular, we compute (up to a logarithmic factor) the mean width of $\mathrm{Sep}^\circ$ (Theorem 9.6), which will play a crucial role in Chapter 10.
we have
SeppHq “ DpH1 q b p DpHk q ,
p ¨¨¨ b
and that DpHi q is the unit ball for the space pB sa pHi q, } ¨ }1 q.
The Rogers–Shephard inequality (Theorem 4.22) controls how much the volume
2
2 2n
2
2 2n
(9.3) ? voln2 ´1 pSepq ď voln2 pSep q ď 5{2 voln2 ´1 pSepq.
n n
9.1.2. The set of all quantum states.
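Theorem 9.1. Let $\mathrm{D}=\mathrm{D}(\mathbb{C}^n)$ be the set of states on $\mathbb{C}^n$. The volume of $\mathrm{D}$ equals
\[ \mathrm{vol}(\mathrm{D}) = \sqrt{n}\,(2\pi)^{n(n-1)/2}\,\frac{\prod_{j=1}^n\Gamma(j)}{\Gamma(n^2)}, \tag{9.4} \]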
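As a sanity check on formula (9.4), the following Python sketch evaluates it and compares the case $n=2$ with a direct computation. For one qubit the set of states is the Bloch ball, which has radius $1/\sqrt2$ in the Hilbert–Schmidt metric, so its volume should be $\frac43\pi\,2^{-3/2}$. The function name `vol_states` is hypothetical.

```python
import math


def vol_states(n):
    """Volume of the set of states on C^n according to formula (9.4)."""
    prod_gamma = 1.0
    for j in range(1, n + 1):
        prod_gamma *= math.gamma(j)
    return (math.sqrt(n) * (2 * math.pi) ** (n * (n - 1) / 2)
            * prod_gamma / math.gamma(n * n))


# Direct computation for n = 2: a 3-dimensional ball of radius 1/sqrt(2).
bloch_ball_volume = (4.0 / 3.0) * math.pi * (1 / math.sqrt(2)) ** 3
```

The two values agree for $n=2$, which gives an independent check of the normalization in the formula.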
9.1. VOLUME AND MEAN WIDTH ESTIMATES 237
of $\mathrm{vrad}(\mathrm{D})$ in Table 9.1.
Alternatively, we present a “soft” way to prove (9.5). First, we know from the Santaló inequality (Theorem 4.17) that $\mathrm{vrad}(\mathrm{D})\,\mathrm{vrad}(\mathrm{D}^\circ)\le1$. On the other hand, $\mathrm{D}^\circ=(-n)\bullet\mathrm{D}$ (see (1.26), recall that polarity is with respect to $\rho_*$). This gives the upper bound in (9.5).
For the lower bound, consider the symmetrization of $\mathrm{D}$, which is $S_1^{n,\mathrm{sa}}$, the unit ball with respect to the trace norm. Since $\|\cdot\|_1\le\sqrt{n}\,\|\cdot\|_{\mathrm{HS}}$, the inradius of this symmetrization equals $1/\sqrt{n}$ and therefore its volume radius is at least $1/\sqrt{n}$. We may now appeal to the Rogers–Shephard inequality (9.2) to obtain the lower bound $\mathrm{vrad}(\mathrm{D})\ge\frac{1}{2\sqrt{n}}$ (this requires some numerical verification since the convex body $\mathrm{D}$ and its symmetrization live in different dimensions, leading to different powers in the definition of the volume radii).
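We now compute the Gaussian mean width of $\mathrm{D}$. If $A_n$ is a $\mathrm{GUE}_0(n)$ random matrix, then
\[ w_G(\mathrm{D}) = \mathbf{E}\sup_{\rho\in\mathrm{D}}\mathrm{Tr}(A_n\rho) = \mathbf{E}\sup_{\psi\in\mathcal{H},\,|\psi|=1}\mathrm{Tr}(A_n|\psi\rangle\langle\psi|) = \mathbf{E}\,\lambda_1(A_n) \tag{9.6} \]
since $\mathrm{Tr}(B|\psi\rangle\langle\psi|)=\langle\psi|B|\psi\rangle$. Given that $w(\mathrm{D})=\kappa_{n^2-1}^{-1}w_G(\mathrm{D})$, the asymptotic estimate follows from the facts that $\kappa_{n^2-1}\sim n$ and $\mathbf{E}\,\lambda_1(A_n)\sim2\sqrt{n}$ (Theorem 6.23). To show that the inequality $w(\mathrm{D})\le2/\sqrt{n}$ holds in every dimension, we use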
using a discretization lemma, which we state for future reference (see Exercise 9.1).
Lemma 9.2. Let $\mathcal{H}=\mathbb{C}^d$, and let $\mathcal{N}$ be an $\alpha$-net in $(S_{\mathbb{C}^d},g)$, with $\alpha<\pi/4$. Then
P Ă D is trivial. Let us check the other inclusion through the corresponding dual
(polar) norms
}A}pD q˝ “ max |xϕ|A|ϕy| “ }A}op ,
ϕPSCd
We need to show that $\|A\|_{P^\circ}\ge\cos(2\alpha)\|A\|_{\mathrm{op}}$ for every $A\in M_d^{\mathrm{sa}}$. We may assume by homogeneity and symmetry that $\|A\|_{\mathrm{op}}$ and the largest eigenvalue of $A$ are both equal to 1. Let $\varphi\in\mathbb{C}^d$ be a unit vector such that $A\varphi=\varphi$. Choose $\psi\in\mathcal{N}$ verifying $g(\varphi,\psi)\le\alpha$. By adjusting the phase of $\varphi$ (i.e., replacing $\varphi$ with an appropriate element of $[\varphi]$), we may write $\psi=\cos(\beta)\varphi+\sin(\beta)\chi$ for a unit vector $\chi\perp\varphi$, and $0\le\beta\le\alpha$. We then have (since $\langle\varphi|A|\chi\rangle=0$ and $\langle\chi|A|\chi\rangle\ge-1$)
\[ \langle\psi|A|\psi\rangle = \cos^2(\beta)\langle\varphi|A|\varphi\rangle+\sin^2(\beta)\langle\chi|A|\chi\rangle \ge \cos^2\beta-\sin^2\beta = \cos(2\beta)\ge\cos(2\alpha). \]
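The last chain of inequalities can be illustrated numerically. The Python sketch below evaluates the quadratic form for $\psi=\cos(\beta)\varphi+\sin(\beta)\chi$ under the normalizations used in the proof ($A\varphi=\varphi$, $\langle\varphi|A|\chi\rangle=0$); the function name and the parameter `a_chi`, standing for $\langle\chi|A|\chi\rangle$, are hypothetical. The extremal value `a_chi = -1` corresponds for instance to $A=\mathrm{diag}(1,-1)$ on $\mathbb{C}^2$, where the bound $\cos(2\beta)$ is attained.

```python
import math


def quad_form(beta, a_chi=-1.0):
    """<psi|A|psi> for psi = cos(beta)*phi + sin(beta)*chi.

    Assumes A*phi = phi, <phi|A|chi> = 0 and <chi|A|chi> = a_chi >= -1.
    """
    return math.cos(beta) ** 2 + math.sin(beta) ** 2 * a_chi
```

For any admissible `a_chi` the value is at least $\cos(2\beta)$, with equality exactly in the extremal case.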
Exercise 9.1 (An easy upper bound on the mean width of D). Using Lemma 9.2, give an alternate proof of the relation
\[ w(\mathrm{D}(\mathbb{C}^n)) = O(1/\sqrt{n}). \]
Exercise 9.2. Show that Lemma 9.2 is sharp on $\mathbb{C}^2$, i.e., that $\cos(2\alpha)$ cannot be replaced by a larger number in (9.7).
9.1.3. The set of separable states (the bipartite case).
Theorem 9.3. If $\mathrm{Sep}=\mathrm{Sep}(\mathbb{C}^d\otimes\mathbb{C}^d)$, we have the two-sided estimates
\[ \frac{1}{6d^{3/2}} \le \mathrm{vrad}(\mathrm{Sep}) \le w(\mathrm{Sep}) \le \frac{4}{d^{3/2}}. \tag{9.8} \]
The inequality $\mathrm{vrad}(\mathrm{Sep})\le w(\mathrm{Sep})$ is the Urysohn inequality (Proposition 4.15). We first give an elementary argument showing that $w(\mathrm{Sep})=O(d^{-3/2})$, and then prove separately the more precise bounds from (9.8).
Proof that $w(\mathrm{Sep})=O(d^{-3/2})$. We proceed through a net argument. It is easier to work with the Gaussian mean width, and therefore we prove the equivalent statement $w_G(\mathrm{Sep})=O(\sqrt{d})$. Since the Gaussian mean width of Sep is at most that of its symmetrization, it is enough to give an upper bound on the latter. Let $P$ be the polytope given by Lemma 9.4 below. Then
1).
Lemma 9.4. There is a constant $C>0$ such that for every dimension $d$, there
The constant $1/2$ appearing in Lemma 9.4 could be replaced by $1-\varepsilon$ for any $\varepsilon>0$, affecting only the value of $C$. Interestingly, the analogous statement for Sep
Proof that $\mathrm{vrad}(\mathrm{Sep})\ge\frac16d^{-3/2}$. We first give a lower bound on the volume radius of the symmetrization of Sep by estimating its inradius from below. We are going to compare this symmetrization with a simpler convex body which we now define. Let $K\subset B(\mathcal{H})$ be the convex hull of rank one product operators (not necessarily self-adjoint!)
\[ K := \mathrm{conv}\{|x_1\otimes x_2\rangle\langle y_1\otimes y_2| : x_1,y_1,x_2,y_2\in B_{\mathbb{C}^d}\}. \]
The convex body $K$ is most naturally seen as $S_1^d\hat\otimes S_1^d$; it can also be identified with $(B_{\mathbb{C}^d})^{\hat\otimes4}$ up to identification with dual space. The next lemma (the proof we
Lemma 9.5. Let $\mathcal{H}=\mathbb{C}^d\otimes\mathbb{C}^d$. Let $\pi:B(\mathcal{H})\to B^{\mathrm{sa}}(\mathcal{H})$ be the projection onto the self-adjoint part, $\pi(A):=\frac12(A+A^\dagger)$. Then
Sep Ă πpKq Ă 3 Sep .
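Lemma 9.5 implies that the inradius of the symmetrization of Sep is at least $\frac13\,\mathrm{inrad}(K)$. We also know from Lemma 8.26 that
\[ \mathrm{inrad}(K) = \mathrm{inrad}\big((B_{\mathbb{C}^d})^{\hat\otimes4}\big) \ge \frac{1}{d^{3/2}}. \]
Therefore, the volume radius of the symmetrization of Sep, which is at least its inradius, is at least $\frac{1}{3d^{3/2}}$. We conclude using (9.3) that $\mathrm{vrad}(\mathrm{Sep})\ge\frac{1}{6d^{3/2}}$. (As in the proof of Theorem 9.1, this requires a somewhat tedious verification due to the fact that Sep and its symmetrization live in different dimensions.)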
Proof of Lemma 9.5. The factor 3 appears as an upper bound on the geo-
metric distance between the sets D and Sep corresponding to 2 qubits, i.e., the
decomposed as
\[ \rho = 2\,\underbrace{\frac{\rho+\mathrm{I}/2}{3}}_{\text{separable}} - \underbrace{\frac{\mathrm{I}-\rho}{3}}_{\text{separable}}, \]
where separability can be checked, e.g., using the Peres criterion (see Theorem 2.15).
It is enough to show that extreme points of πpKq are contained in 3Sep . Any
extreme point $A$ of $\pi(K)$ can be written as
\[ A = \frac12\big(|x_1\otimes x_2\rangle\langle y_1\otimes y_2| + |y_1\otimes y_2\rangle\langle x_1\otimes x_2|\big). \]
It may appear at first sight that the above representation shows that $A$ is
separable. However, while the two terms in the parentheses are indeed product
operators, they are not self-adjoint and we can only conclude that A P DpHq (as
a self-adjoint operator whose trace norm is ď 1).
Let Hi be the 2-dimensional subspace of Cd spanned by xi and yi (if the
vectors are proportional, add any vector to get a 2-dimensional space) and let
H1 :“ H1 b H2 . Then A can be considered as an operator on H1 ; more precisely, as
an element of DpH1 q (and, conversely, any operator acting on H1 can be canonically
lifted to one acting on H). Since A belongs to DpH1 q , it also belongs to 3SeppH1 q ,
and thus to 3SeppHq .
9.1.4. The set of block-positive matrices. Let $\mathrm{Sep}=\mathrm{Sep}(\mathbb{C}^d\otimes\mathbb{C}^d)$. In Theorem 9.3 we computed the order of magnitude of the mean width of Sep. We now focus on the dual quantity: the mean gauge of Sep, or the mean width of $\mathrm{Sep}^\circ$ (recall that polarity is taken with maximally mixed state $\rho_*=\mathrm{I}/d^2$ being the origin).
Theorem 9.6. Let Sep be the set of separable states on $\mathbb{C}^d\otimes\mathbb{C}^d$. Then for some absolute constants $c, C$,
\[ cd^{3/2} \le \mathrm{vrad}(\mathrm{Sep}^\circ) \le Cd^{3/2}, \]
\[ cd^{3/2} \le w(\mathrm{Sep}^\circ) \le Cd^{3/2}\log(d). \]
Since the cone $\mathcal{BP}$ of block-positive operators is dual to the cone $\mathcal{SEP}$ of separable operators (see Section 2.4), we obtain the following corollary.
Corollary 9.7. Let BP be the set of trace one block-positive operators on
Proof. Since $\mathrm{BP}=-d^{-2}\,\mathrm{Sep}^\circ$ (see (2.47)), the derivation of Corollary 9.7 from Theorem 9.6 is immediate.
The Santaló and reverse Santaló inequalities (Theorem 4.17) allow one to estimate $\mathrm{vrad}(\mathrm{Sep}^\circ)$ directly from $\mathrm{vrad}(\mathrm{Sep})$, so the first part of Theorem 9.6 follows from Theorem 9.3. However the analogous result for the mean width, the
after we prove the $MM^*$-estimate (7.7) for the pair $(\mathrm{Sep},\mathrm{Sep}^\circ)$, i.e.,
\[ w(\mathrm{Sep})\,w(\mathrm{Sep}^\circ) = O(\log d). \tag{9.10} \]
Recall that the lower bound $w(\mathrm{Sep})\,w(\mathrm{Sep}^\circ)\ge1$ is elementary and holds for any pair of polar bodies (see Exercise 4.37). However, (9.10) does not follow immediately from the general theory: Theorem 7.10 is known to hold only for symmetric convex bodies which are in a specific position (the $\ell$-position). In our situation Sep is not symmetric and there is no reason to think that it is in the $\ell$-position.
The first step towards proving Theorem 9.6 is to introduce the following symmetrization of Sep
\[ \mathrm{Sep}_\cap = -\mathrm{Sep}\cap\mathrm{Sep}, \]
where $-\mathrm{Sep}=(-1)\bullet\mathrm{Sep}$, see (9.1). We check that the relevant geometric parameters are essentially unchanged by this symmetrization procedure.
Proposition 9.8. The convex bodies Sep and $\mathrm{Sep}_\cap$ have comparable volume radius, mean width and dual mean width, as shown by the following formulas, where $\mathrm{Sep}_\cap^\circ$ means $(\mathrm{Sep}_\cap)^\circ$:
\[ w(\mathrm{Sep}^\circ) \le w(\mathrm{Sep}_\cap^\circ) \le 2w(\mathrm{Sep}^\circ), \tag{9.11} \]
\[ \tfrac12\,\mathrm{vrad}(\mathrm{Sep}) \le \mathrm{vrad}(\mathrm{Sep}_\cap) \le \mathrm{vrad}(\mathrm{Sep}), \tag{9.12} \]
\[ w(\mathrm{Sep}) \simeq w(\mathrm{Sep}_\cap) \simeq d^{-3/2}. \tag{9.13} \]
Moreover, Sep and $\mathrm{Sep}_\cap$ have the same inradius, equal to $(d^2(d^2-1))^{-1/2}$. However, the outradius of $\mathrm{Sep}_\cap$ is bounded by $1/d$, while the outradius of Sep is of order 1.
Proof. We have, for any self-adjoint $A$ with zero trace,
\[ \|\rho_*+A\|_{\mathrm{Sep}_\cap} = \max(\|\rho_*+A\|_{\mathrm{Sep}},\|\rho_*-A\|_{\mathrm{Sep}}) \le \|\rho_*+A\|_{\mathrm{Sep}}+\|\rho_*-A\|_{\mathrm{Sep}}. \]
When averaging $A$ over the Hilbert–Schmidt sphere, using the fact that $A$ and $-A$ have the same distribution, we obtain (9.11). Inequalities (9.12) follow from Proposition 4.18. For (9.13), we already know (cf. Theorem 9.3) that
\[ \mathrm{vrad}(\mathrm{Sep}) \simeq w(\mathrm{Sep}) \simeq d^{-3/2}. \]
We therefore have the following chain of inequalities: the first is trivial, the third is (9.12) and the last is Urysohn's inequality (Proposition 4.15)
\[ w(\mathrm{Sep}_\cap) \le w(\mathrm{Sep}) \simeq \mathrm{vrad}(\mathrm{Sep}) \simeq \mathrm{vrad}(\mathrm{Sep}_\cap) \le w(\mathrm{Sep}_\cap). \]
is bounded by 1{d.
We are now going to prove that the $MM^*$-estimate holds for $\mathrm{Sep}_\cap$.
It is now easy to deduce Theorem 9.6. Indeed, using the relation (4.32) between spherical and Gaussian widths, Proposition 9.9 implies that $w(\mathrm{Sep}_\cap)\,w(\mathrm{Sep}_\cap^\circ)=O(\log d)$, and (9.10) follows from (i) and (iii) of Proposition 9.8.
Proof of Proposition 9.9. Denote K “ SepX ´ ρ˚ , so that K is a symmet-
\[ F_1 = \mathrm{span}\{\sigma_1\otimes\mathrm{I} : \mathrm{Tr}\,\sigma_1=0\}, \]
\[ F_2 = \mathrm{span}\{\mathrm{I}\otimes\sigma_2 : \mathrm{Tr}\,\sigma_2=0\}. \]
By Proposition 4.8, we may assume that $T=\alpha P_E+\lambda_1P_{F_1}+\lambda_2P_{F_2}$ for some positive numbers $\alpha,\lambda_1,\lambda_2$. We may also assume $\alpha=1$ without loss of generality. The ideal property of the $\ell$-norm (Proposition 7.1(ii)) implies that
\[ \ell_K(P_E) = \ell_K(TP_E) \le \ell_K(T), \]
and similarly for $\ell_{K^\circ}(P_E)$. By the $MM^*$-estimate (Theorem 7.10), we know that
\[ \ell_K(T)\,\ell_{K^\circ}(T^{-1}) = O(d^4\log d). \]
Noting that $T^{-1}=P_E+\lambda_1^{-1}P_{F_1}+\lambda_2^{-1}P_{F_2}$, it follows that
\[ \ell_K(P_E)\,\ell_{K^\circ}(P_E) = O(d^4\log d). \tag{9.16} \]
The $\ell$-norms of the projections $P_{F_1}, P_{F_2}$ can be upper-bounded in a rather straightforward fashion, mostly due to the fact that their ranks are relatively small. We have
Lemma 9.10. Let $F=F_1\oplus F_2$. Then $\ell_K(P_F)=O(d^3)$ and $\ell_{K^\circ}(P_F)=O(1)$.
We now postpone the proof of Lemma 9.10 and show how it allows us to complete the proof of Proposition 9.9. To that end, we compare the estimates from Lemma 9.10 with the bounds $\ell_{K^\circ}(\mathrm{I}_{H_0})\simeq\sqrt{d}$ (a reformulation of Theorem 9.3) and $\ell_K(\mathrm{I}_{H_0})\gtrsim d^{7/2}$ (which follows from the already proved lower bound in (9.15)).
$2\ell_L(P_E)$ for $d$ large enough. Combined with (9.16), this gives the upper bound in (9.15), as needed.
9.8 to deduce the following inequalities (recall that κn is of order n, see Proposition
A.1)
used to show the following estimates (where the constants ck , Ck depend a priori
Proof of Theorem 9.11. We write D “ DpC2 q and Sep “ SeppHq. Since
Sep “ Dbk , it follows from Lemma 9.2 that, if N is an ε-net in pSC2 , gq, then
p
cosp2εqk Sep Ă P Ă Sep ,
where
\[ P := \mathrm{conv}\{\pm|\psi_1\otimes\cdots\otimes\psi_k\rangle\langle\psi_1\otimes\cdots\otimes\psi_k| : \psi_1,\dots,\psi_k\in\mathcal{N}\}. \]
We choose $\varepsilon$ such that $\cos(2\varepsilon)^k=1/2$, i.e., $\varepsilon\simeq1/\sqrt{k}$. The polytope $P$ is contained in the Hilbert–Schmidt unit ball, and (using Lemma 5.3) can be chosen with at most $2(\mathrm{card}\,\mathcal{N})^k\le\exp(Ck\log k)$ vertices. The first idea would be to apply Proposition 6.3 directly. This approach yields the bound
The extra factor $n^\alpha$ in (9.18) comes from the fact that the Hilbert–Schmidt Euclidean structure is not the best adapted to the present problem. When we apply Proposition 6.3 in the Euclidean structure induced by some ellipsoid $\mathcal{E}$, we actually obtain the following result: if $P$ is a polytope with $v$ vertices
ď
vol E N
In this inequality, for a fixed polytope P , the best choice of ellipsoid is given
by the Löwner ellipsoid of P . Accordingly, we are going to consider the Löwner
ellipsoid associated to the set Sep . By Lemma 4.9, we have
The set D is?a cylinder. To compute? its Löwner ellipsoid, we use Lemma 4.3 with
where $B_{\mathrm{HS}}$ denotes the Hilbert–Schmidt unit ball in $M_2^{\mathrm{sa}}$ and $T$ is the matrix $\mathrm{diag}(\sqrt2,\sqrt{2/3},\sqrt{2/3},\sqrt{2/3})$ in the basis of Pauli matrices (2.3). Consequently,
vol LöwpD q 16
“ det T “
vol BHS 27
or, equivalently, vradpLöwpD qq “ p16{27q1{8 . From the formula
vradpLöwpSep qq “ vradpLöwpD qqk
(which follows from (9.20), see Exercise 4.32), we conclude that
vrad LöwpSep q “ p16{27qk{8 “ n´α
with $\alpha=\frac18\log_2(27/16)$. If we use the (inner product induced by the) Löwner
ellipsoid of Sep as the reference Euclidean structure to apply (9.19), we obtain
the upper bound
? ?
k log k log n log log n
vradpSep q ď C vradpLöwpSep qq “ C .
n n1`α
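The numerical constants in this argument can be double-checked with a few lines of Python. The sketch below recomputes $\det T$ for $T=\mathrm{diag}(\sqrt2,\sqrt{2/3},\sqrt{2/3},\sqrt{2/3})$, the resulting volume radius $(\det T)^{1/4}$ in dimension 4, and the exponent $\alpha=\frac18\log_2(27/16)$, which is roughly $0.094$. The variable and function names are hypothetical.

```python
import math

# Diagonal entries of T in the Pauli basis, as stated in the text.
t_diag = [math.sqrt(2), math.sqrt(2 / 3), math.sqrt(2 / 3), math.sqrt(2 / 3)]

det_t = math.prod(t_diag)            # volume ratio of the ellipsoid to B_HS
vrad_loewner = det_t ** (1 / 4)      # volume radius in dimension 4
alpha = math.log2(27 / 16) / 8


def vrad_loewner_k(k):
    """Volume radius for k qubits, using the product formula vrad^k."""
    return vrad_loewner ** k
```

With $n=2^k$, the identity $(16/27)^{k/8}=n^{-\alpha}$ is just a change of base in the exponent, and the sketch confirms it numerically.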
To show the lower bound in (9.18), we use the fact (see Exercise 4.20) that for every symmetric convex body $K\subset\mathbb{R}^N$, the inclusion $K\supset\frac{1}{\sqrt{N}}\mathrm{Löw}(K)$ holds. We
apply this for K “ Sep (so that N “ n2 ) to conclude that
1 1
vradpSep q ě vradpLöwpSep qq “ 1`α .
n n
Finally, an application of the Rogers–Shephard inequality (9.3) shows that
vradpSepq and vradpSep q are of the same order.
A similar argument allows one to estimate the size of the set of separable states on $k$ “qudits”, i.e., on $(\mathbb{C}^d)^{\otimes k}$.
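Theorem 9.12 (see Exercise 9.5). Let $d\ge2$, $k\ge1$, $n=d^k$, and $\mathcal{H}=(\mathbb{C}^d)^{\otimes k}$. Then
\[ \frac{c_d\sqrt{\log n\,\log\log n}}{n} \le w(\mathrm{Sep}) \le \frac{C_d\sqrt{\log n\,\log\log n}}{n} \tag{9.21} \]
and
\[ \frac{c_d}{n^{1+\alpha_d}} \le \mathrm{vrad}(\mathrm{Sep}) \le \frac{C_d\sqrt{\log n\,\log\log n}}{n^{1+\alpha_d}}, \tag{9.22} \]
where $\alpha_d=\frac12\log_d\big(1+\frac1d\big)-\frac{1}{2d^2}\log_d(d+1)$.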
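A quick consistency check on the exponent $\alpha_d$: for $d=2$ it should reduce to the qubit exponent $\alpha=\frac18\log_2(27/16)$ obtained earlier. Indeed, $\frac12\log_2\frac32-\frac18\log_23=\frac38\log_23-\frac12=\frac18\log_2\frac{27}{16}$. The following Python sketch (with the hypothetical function name `alpha_d`) evaluates the formula numerically.

```python
import math


def alpha_d(d):
    """Exponent alpha_d = (1/2) log_d(1 + 1/d) - (1/(2 d^2)) log_d(d + 1)."""
    return 0.5 * math.log(1 + 1 / d, d) - math.log(d + 1, d) / (2 * d * d)


# Qubit exponent from the k-qubit estimate, for comparison.
alpha_qubit = math.log2(27 / 16) / 8
```

The agreement at $d=2$ supports the reading of the formula for $\alpha_d$ given above.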
Exercise 9.3 (Lower bound on the mean width of Sep). Show that, for some constant $c>0$, $\mathrm{Sep}((\mathbb{C}^2)^{\otimes k})$ contains $k^{ck}$ elements which are $c$-separated with respect to the Hilbert–Schmidt distance. Then, use the Sudakov minoration (Propo-
Exercise 9.4 (Löwner ellipsoid and the Killing form). Check that the Löwner ellipsoid of $\mathrm{D}(\mathbb{C}^2)$ induces on $M_2^{\mathrm{sa}}$ the inner product
\[ \langle u,v\rangle_L = \frac32\,\mathrm{Tr}(uv)-\frac12\,\mathrm{Tr}(u)\,\mathrm{Tr}(v). \]
Exercise 9.5 (The size of Sep for k qudits). Complete the proof of Theorem 9.12.
9.1.6. The set of PPT states. We present estimates for the volume and
mean width of PPT. For asymptotic versions improving some of the constants, see
Exercise 9.6.
Theorem 9.13 (Volume and mean width of PPT). For $\mathcal{H}=\mathbb{C}^d\otimes\mathbb{C}^d$, we have
\[ \frac{1}{4d} \le w(\mathrm{PPT}^\circ)^{-1} \le \mathrm{vrad}(\mathrm{PPT}) \le w(\mathrm{PPT}) \le \frac2d. \]
Proof. The upper bound on the mean width follows from the obvious inequality $w(\mathrm{PPT})\le w(\mathrm{D})$ and from the bound $w(\mathrm{D})\le2/d$ (Theorem 9.1). To prove the
9.2. DISTANCE ESTIMATES 245
lower bound, we use the dual Urysohn inequality (Proposition 4.16), where polarity is taken with respect to $\rho_*$:
\[ \mathrm{vrad}(\mathrm{PPT}) \ge \frac{1}{w(\mathrm{PPT}^\circ)}. \]
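If $\Gamma$ denotes the partial transposition on $\mathcal{H}$, then $\mathrm{PPT}=\mathrm{D}\cap\Gamma(\mathrm{D})$ and therefore
\[ \mathrm{PPT}^\circ = \mathrm{conv}(\mathrm{D}^\circ\cup\Gamma(\mathrm{D})^\circ) \subset \mathrm{D}^\circ+\Gamma(\mathrm{D})^\circ. \tag{9.23} \]
Geometrically, the transformation $\Gamma$ is an isometry with respect to the Hilbert–Schmidt norm (cf. Exercise 2.22; the argument we present actually works for any Hilbert–Schmidt isometry). Using the fact that $\mathrm{D}^\circ=-d^2\mathrm{D}$ and the upper bound from Theorem 9.1, we obtain
\[ w(\mathrm{PPT}^\circ) \le w(\mathrm{D}^\circ)+w(\Gamma(\mathrm{D})^\circ) \le 2w(\mathrm{D}^\circ) = 2d^2w(\mathrm{D}) \le 4d. \]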
It follows from Theorem 9.13 that D and PPT have comparable volume radii, up to an absolute constant. An interesting question is whether this constant approaches 1 as the dimension increases.
Problem 9.14. Is there an absolute constant $c<1$ such that, for every $d\ge3$,
\[ \mathrm{vrad}(\mathrm{PPT}(\mathbb{C}^d\otimes\mathbb{C}^d)) \le c\,\mathrm{vrad}(\mathrm{D}(\mathbb{C}^d\otimes\mathbb{C}^d))\,? \]
Exercise 9.6 (Sharper asymptotic bounds on the size of PPT). Prove that
2d
Exercise 9.7 (Volume radius of PPT as a large deviation problem). Show that Problem 9.14 can be reformulated as follows: does there exist a constant $c>0$ such that, if $B$ is a $d^2\times d^2$ matrix with independent $N_{\mathbb{C}}(0,1)$ entries, then
This recasts the problem as a large deviation estimate for some random matrix ensemble. Note that the same ensemble appears in Theorem 6.30, which asserts
An elementary geometric argument shows that Theorem 9.15 is equivalent to the following statement: if $A\in B^{\mathrm{sa}}(\mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2})$ satisfies $\|A\|_{\mathrm{HS}}\le1$, then $\mathrm{I}+A\in\mathcal{SEP}$.
Proof. Let $K\subset\mathrm{D}(\mathcal{H})$ be the set of states $\rho$ such that $\|\rho-\mathrm{I}/n\|_{\mathrm{HS}}\le1/\sqrt{n(n-1)}$ and $\mathcal{C}=\mathbb{R}_+K$ be the cone generated by $K$. The assertion of Theorem 9.15 is equivalent to the cone inclusion $\mathcal{C}\subset\mathcal{SEP}$. By cone duality (see Section 1.2.1), this is also equivalent to $\mathcal{SEP}^*\subset\mathcal{C}^*$. Recall that $\mathcal{SEP}^*$ is the cone of block-positive operators, see (2.46).
Let $M\in B^{\mathrm{sa}}(\mathcal{H})$. One checks that
\[ M\in\mathcal{C} \iff \|M\|_{\mathrm{HS}}\le\frac{1}{\sqrt{n-1}}\,\mathrm{Tr}\,M. \]
It follows (see Exercise 1.31) that
\[ M\in\mathcal{C}^* \iff \|M\|_{\mathrm{HS}}\le\mathrm{Tr}\,M. \]
We thus reduced the proof of Theorem 9.15 to the following problem: for a block-
C2 q. Then
(if $k=l$ this is obvious; if $k\ne l$ this is the content of the Lemma). Noting that the diagonal blocks $M_{kk}$ are positive semi-definite and summing over $k,l$ gives the needed inequality $\|M\|_{\mathrm{HS}}^2\le(\mathrm{Tr}\,M)^2$.
(ii) Use (i), Theorem 2.34 and Exercise 2.30 to give an alternate proof of Theorem 9.15.
9.2.2. Robustness in the bipartite case. We now compute the geometric distance between D and Sep in the bipartite case.
Proposition 9.17. Let $\mathcal{H}=\mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2}$ for $d_1,d_2\ge2$, and denote $n=d_1d_2$. We have
\[ d_g(\mathrm{D},\mathrm{Sep}) = d_g(\mathrm{D},\mathrm{PPT}) = \frac n2+1. \]
An equivalent way to describe the geometric distance is to define the robustness of a state $\rho$ as follows (the notation $\bullet$ was defined in (9.1))
\[ R(\rho) = \inf\Big\{s\ge0 : \frac{1}{1+s}\bullet\rho\in\mathrm{Sep}\Big\}. \tag{9.26} \]
Proposition 9.17 asserts that the maximal robustness of a state on $\mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2}$ equals $n/2$. Since $\mathrm{Sep}\subset\mathrm{PPT}\subset\mathrm{D}$, it suffices to prove that $d_g(\mathrm{D},\mathrm{PPT})\ge\frac n2+1$ and $d_g(\mathrm{D},\mathrm{Sep})\le\frac n2+1$.
eigenvalues of $\rho^\Gamma$ are $(1/2,1/2,1/2,-1/2)$. It follows that $\rho_t$ is not PPT whenever $-t/2+(1-t)/n<0$, or equivalently $t>2/(2+n)$. Therefore $d_g(\mathrm{D},\mathrm{PPT})\ge\frac n2+1$.
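The threshold computation above is elementary and can be verified in exact arithmetic. The Python sketch below evaluates the quantity $-t/2+(1-t)/n$ and the critical value $t_0=2/(2+n)$; the function names are hypothetical. The reciprocal $1/t_0=n/2+1$ is the claimed lower bound on the geometric distance, and for two qubits ($n=4$) it equals 3.

```python
from fractions import Fraction


def smallest_eigenvalue_expression(t, n):
    """The quantity -t/2 + (1 - t)/n whose sign decides the PPT property above."""
    t = Fraction(t)
    return -t / 2 + (1 - t) / n


def ppt_threshold(n):
    """Critical value t0 = 2/(2 + n) at which the expression vanishes."""
    return Fraction(2, n + 2)
```

The expression is strictly decreasing in $t$, so it is negative exactly for $t>t_0$.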
Proof that $d_g(\mathrm{D},\mathrm{Sep})\le\frac n2+1$. We have to show that for any state $\rho$, the
\[ \chi = \sum_{j=1}^d\lambda_j\,\varphi_j\otimes\psi_j, \]
for some $d\le\min(d_1,d_2)$ and orthonormal bases $(\varphi_j)$ in $\mathbb{C}^{d_1}$ and $(\psi_j)$ in $\mathbb{C}^{d_2}$.
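conjugate of $\theta$. The resulting operator $B$, which belongs to the separable cone $\mathcal{SEP}$ by construction, equals
\[ B = \sum_{j,k,l,m=1}^d\sqrt{\lambda_j\lambda_k\lambda_l\lambda_m}\;\mathbf{E}\big[\theta_j\bar\theta_k\theta_l\bar\theta_m\big]\,|\varphi_j\otimes\psi_k\rangle\langle\varphi_m\otimes\psi_l|. \]
The quantity $\mathbf{E}[\theta_j\bar\theta_k\theta_l\bar\theta_m]$ vanishes unless either (1) $j=k$ and $l=m$, or (2) $j=m$ and $k=l$. The non-vanishing terms can be gathered as $B=|\chi\rangle\langle\chi|+A$, where
\[ A = \sum_{j\ne k}\lambda_j\lambda_k\,|\varphi_j\otimes\psi_k\rangle\langle\varphi_j\otimes\psi_k|. \]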
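can be written as a positive combination of the operators
\[ \{|\varphi_j\otimes\psi_k\rangle\langle\varphi_j\otimes\psi_k| : 1\le j\le d_1,\ 1\le k\le d_2\}. \]
Note that $\alpha\le\frac12$ since $\lambda_j\lambda_k\le\frac12(\lambda_j^2+\lambda_k^2)\le\frac12$. It follows that $\frac12\mathrm{I}-A\in\mathcal{SEP}$, and therefore that
\[ t_0\bullet\rho = t_0\Big(|\chi\rangle\langle\chi|+\frac n2\rho_*\Big) = t_0\Big(B-A+\frac12\mathrm{I}\Big) \]
is a separable state, as needed.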
9.2.3. Distances involving the set of PPT states. We consider the case of a balanced bipartite Hilbert space $\mathcal{H}=\mathbb{C}^d\otimes\mathbb{C}^d$. Another relevant quantity, not covered by Proposition 9.17, is the geometric distance between PPT and Sep. This quantity is of interest since it quantifies the degree to which PPT is a poor substitute for separability in large dimensions. However, even the order of magnitude of the distance seems unknown. Actually, we are not aware of any upper bound improving substantially on the obvious estimate $d_g(\mathrm{Sep},\mathrm{PPT})\le d_g(\mathrm{Sep},\mathrm{D})$.
16
Proof. We use the lower bound on the distance that comes from volume comparison
\[ d_g(\mathrm{Sep},\mathrm{PPT}) \ge \frac{\mathrm{vrad}\,\mathrm{PPT}}{\mathrm{vrad}\,\mathrm{Sep}}, \]
together with the lower bound $\mathrm{vrad}\,\mathrm{PPT}\ge\frac{1}{4d}$ (Theorem 9.13) and the upper bound $\mathrm{vrad}(\mathrm{Sep})\le4d^{-3/2}$ (Theorem 9.3).
Proposition 9.18 asserts that there are PPT states that are far from the set of separable states. Another way of quantifying this phenomenon is as follows. Given a state $\rho$ on $\mathbb{C}^d\otimes\mathbb{C}^d$, we introduce
\[ d_{\mathrm{Sep}}(\rho) = \min_{\sigma\in\mathrm{Sep}(\mathbb{C}^d\otimes\mathbb{C}^d)}\|\rho-\sigma\|_1. \]
Theorem 9.19 (not proved here). For every $\varepsilon>0$, for $d$ large enough, there is a PPT state $\rho$ on $\mathbb{C}^d\otimes\mathbb{C}^d$ such that $d_{\mathrm{Sep}}(\rho)\ge2-\varepsilon$.
The proof of Theorem 9.19 involves tricks that are beyond the scope of this book. However, we present an argument showing that a weaker lower bound on the distance to separable states ($1/4$ instead of 2) is achieved in a generic direction.
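Proposition 9.20. Let $S$ denote the unit sphere in the space of trace zero Hermitian operators on $\mathbb{C}^d\otimes\mathbb{C}^d$. For most directions $u\in S$, there exists a PPT state $\rho$ such that
\[ d_{\mathrm{Sep}}(\rho) \ge \|u\|_\infty^{-1}\min_{\sigma\in\mathrm{Sep}(\mathbb{C}^d\otimes\mathbb{C}^d)}\mathrm{Tr}((\rho-\sigma)u) \ge \frac14-o(1). \]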
Proof. We consider the support functions $w(\mathrm{PPT}, \cdot)$ and $w(\mathrm{Sep}, \cdot)$, as defined in (4.29). Since the outradii of PPT and Sep are less than 1, these functions are 1-Lipschitz on $S$. Note also that the average of these functions on $S$ is exactly the mean width of the corresponding set. Using the values from Table 5.2, we conclude that, for $K = \mathrm{PPT}$ or $K = \mathrm{Sep}$ and for $\varepsilon > 0$,
\[ \mathbf{P}(|w(K, \cdot) - w(K)| > \varepsilon) \le 2\exp(-\varepsilon^2 (d^4 - 1)/2). \]
We next use the bounds $w(\mathrm{PPT}) \ge (\frac{1}{2} - o(1))d^{-1}$ (Exercise 9.6) and $w(\mathrm{Sep}) \le 4d^{-3/2}$ (Theorem 9.3) to conclude that, for most directions $u \in S$, we have
\[ w(\mathrm{PPT}, u) \ge \Big(\frac{1}{2} - o(1)\Big) d^{-1}, \qquad w(\mathrm{Sep}, u) \le 5d^{-3/2}. \tag{9.27} \]
Moreover (see Proposition 6.24), most directions $u$ also satisfy
\[ \|u\|_\infty \le (2 + o(1))d^{-1}. \tag{9.28} \]
Choose $u \in S$ satisfying both (9.27) and (9.28), and let $\rho \in \mathrm{PPT}$ be such that $\mathrm{Tr}(\rho u) = w(\mathrm{PPT}, u)$. We then have
\[ \min_{\sigma \in \mathrm{Sep}} \mathrm{Tr}((\rho - \sigma)u) = w(\mathrm{PPT}, u) - w(\mathrm{Sep}, u) \ge \Big(\frac{1}{2} - o(1)\Big) d^{-1}. \]
Dividing by $\|u\|_\infty$ and using (9.28), we obtain
\[ d_{\mathrm{Sep}}(\rho) \ge \frac{1}{4} - o(1). \]
Any improvement on the lower bound (9.27) for the mean width of PPT would
improve the lower bound in Proposition 9.20.
case of k qubits, i.e., the Hilbert space H “ pC2 qbk . Recall that the inradius of Sep
is witnessed by balls centered at ρ˚ (see Proposition 2.18 and the discussion in the
Theorem 9.21 (not proved here, but see Exercise 9.9). For $\mathcal{H} = (\mathbb{C}^2)^{\otimes k}$, we have
\[ \sqrt{54/17} \times 6^{-k/2} \le \mathrm{inrad}(\mathrm{Sep}) \le 2 \times 6^{-k/2}. \]
We next turn to the problem of estimating the geometric distance between D
and Sep in the case of many qubits, for which even the asymptotic order is not
known.
Proposition 9.22 (Robustness for many qubits). For $\mathcal{H} = (\mathbb{C}^2)^{\otimes k}$, we have
\[ 2^{k-1} + 1 \le d_g(\mathrm{Sep}, \mathrm{D}) \le (\sqrt{6})^k. \]
where the last equality comes from Proposition 9.17.
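Exercise 9.9 (A bound on the inradius of Sep on k qubits via mean width). Let $P : \mathrm{M}_2^{sa} \to \mathrm{M}_2^{sa}$ be the orthogonal projection onto the hyperplane of trace zero matrices, and let $\Pi = P^{\otimes k}$.
(i) Check that $\Pi(\mathrm{Sep}((\mathbb{C}^2)^{\otimes k})) = (P(\mathrm{D}(\mathbb{C}^2)))^{\hat{\otimes} k}$.
(ii) Show that
\[ \mathrm{inrad}\big(\mathrm{Sep}((\mathbb{C}^2)^{\otimes k})\big) \le \mathrm{inrad}\Big(\big(2^{-1/2} B_2^3\big)^{\hat{\otimes} k}\Big) = O\big(\sqrt{k \log k} \cdot 6^{-k/2}\big). \]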
9.3. The super-picture: classes of maps
Up to now, we focused on determining volumes and other geometric parameters for various classes of states. Due to the Choi–Jamiołkowski isomorphism (see Section 2.3.1), these results can be translated into statements about the corresponding classes of quantum maps, or superoperators. However, there are some fine points that need to be addressed for such translation to be rigorous.
phism $\Phi \mapsto C(\Phi)$ (see Section 2.4, especially Table 2.2) with the positive semi-definite cone $\mathcal{PSD}(\mathbb{C}^n \otimes \mathbb{C}^m)$. So far, so good. However, if we restrict our at-
of states $m\mathrm{D}(\mathbb{C}^n \otimes \mathbb{C}^m)$. This is due to the fact that the trace-preserving condition $\mathrm{Tr}\,\Phi(\rho) = \mathrm{Tr}\,\rho$ (for $\rho \in \mathrm{M}_m$) translates into $\mathrm{Tr}_{\mathbb{C}^n} C(\Phi) = \mathrm{I}_{\mathbb{C}^m}$ (which implies
is represented by just one scalar constraint $\mathrm{Tr}(\cdot) = m$ (in addition to the positive semi-definiteness constraint common to both settings).
$\{\mathrm{Tr}_{\mathbb{C}^n}(\cdot) = \mathrm{I}_{\mathbb{C}^m}\}$, then the rescaled set of states $K = m\mathrm{D}(\mathbb{C}^n \otimes \mathbb{C}^m)$ is a base of the positive semi-definite cone, which is an $(m^2 n^2 - 1)$-dimensional convex set, while the set of Choi matrices corresponding to completely positive trace-preserving maps is $K \cap H$, a section of that base of relative codimension $m^2 - 1$, i.e., a convex set of dimension $m^2 n^2 - m^2$.
The problem of relating the size of a convex set to that of its (central) sections
is in general nontrivial, and two-sided bounds are only possible if the set is isotropic
(in the technical sense defined in Section 4.4; see especially Proposition 4.26). The
set D of all states actually is isotropic (see Proposition 4.25). While not all natural
sets of states have this property, they are all sufficiently balanced so that the more
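Table 9.3. Each cone $\mathbf{C}$ of superoperators is a nondegenerate cone in $B(\mathrm{M}_d^{sa}, \mathrm{M}_d^{sa})$ and the subset $\mathbf{C}_{TP}$ of trace-preserving elements is a convex set of dimension $d^4 - d^2$. The cone $\mathcal{C} \subset B^{sa}(\mathbb{C}^d \otimes \mathbb{C}^d)$ is the image of $\mathbf{C}$ under the map $\Phi \mapsto C(\Phi)$, see Section 2.4.

Cone of superoperators $\mathbf{C}$       Cone $\mathcal{C}$                      Base $\mathcal{C}^b$                             $\mathrm{vrad}(\mathbf{C}_{TP})$
Positivity-preserving  $\mathbf{P}$       $\mathcal{BP}$                          $\mathrm{BP}$                                    $\Theta(\sqrt{d})$
Decomposable  $\mathbf{DEC}$              $\mathrm{co}\text{-}\mathcal{PSD} + \mathcal{PSD}$    $\mathrm{conv}(\mathrm{D} \cup \Gamma(\mathrm{D}))$    $\Theta(1)$
Completely positive  $\mathbf{CP}$        $\mathcal{PSD}$                         $\mathrm{D}$                                     $\sim e^{-1/4}$
PPT-inducing  $\mathbf{PPT}$              $\mathcal{PPT}$                         $\mathrm{PPT}$                                   $\Theta(1)$
Entanglement breaking  $\mathbf{EB}$      $\mathcal{SEP}$                         $\mathrm{Sep}$                                   $\Theta(1/\sqrt{d})$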
where $\theta = \frac{k}{n} = \frac{d^2 - 1}{d^4 - 1} < \frac{1}{d^2}$ and $r$, $R$ denote respectively the inradius and outradius of $K$. The constants $b(n, k)$ were defined in (4.51); in our setting the bounds (4.55)
ˆ ˙ ˆ ˙
log d n log d
(9.31) bpn, kq “ 1 ´ O , bpn, kq “ 1 ` O .
d2 k d2
Since all the cones we consider have the property that $\mathrm{Sep} \subset \mathcal{C}^b \subset \mathrm{BP} = -d^2\,\mathrm{Sep}^\circ$, we know from Table 9.1 that $r = 1/\sqrt{d^2 - 1}$ and $R = \sqrt{d^2 - 1}$, so $r^{-\theta} = R^{\theta} = 1 + O\big(\frac{\log d}{d^2}\big)$. Combining (9.30) and (9.31) yields $\mathrm{vrad}(\mathbf{C}_{TP}) \sim \big(d\,\mathrm{vrad}\,\mathcal{C}^b\big)^{1-\theta}$, and it remains to again notice that since $\theta$ is small, the exponent $1 - \theta$ does not make much of a difference (this uses very weakly the estimates on the volume radii from Table 9.1, or just rough bounds given by $r$ and $R$).
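The size of the correction terms in this argument can be checked numerically. The following is a minimal sketch, assuming only the values quoted above ($\theta = (d^2-1)/(d^4-1)$ and $R = \sqrt{d^2-1}$); the function names `theta` and `radius_factor` are illustrative and do not come from the text.

```python
import math

def theta(d):
    """theta = k/n with k = d^2 - 1 and n = d^4 - 1, as in the text."""
    return (d**2 - 1) / (d**4 - 1)

def radius_factor(d):
    """R^theta for R = sqrt(d^2 - 1); by symmetry this also equals r^(-theta)."""
    return math.sqrt(d**2 - 1) ** theta(d)

for d in (2, 10, 100):
    t = theta(d)
    # (d^2 - 1)/(d^4 - 1) simplifies to 1/(d^2 + 1), which is why theta < 1/d^2.
    # The last two columns compare R^theta - 1 with log(d)/d^2.
    print(d, t, 1 / (d**2 + 1), radius_factor(d) - 1, math.log(d) / d**2)
```

For large $d$ the last two printed columns are of the same order, which is the content of the estimate $R^\theta = 1 + O(\log d / d^2)$.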
The same argument leads to non-asymptotic bounds (i.e., stated for a fixed dimension) and to bounds for maps from $\mathrm{M}_m$ to $\mathrm{M}_n$. We also state a version of Theorem 9.23 for the mean width. As we shall see in Chapter 10, the latter may also be of independent importance.
the standard mean width, from the fact that the Gaussian mean width of a subset does not exceed that of the entire set, and from the inequality $\sqrt{n - 1} \le \kappa_n \le \sqrt{n}$ (Proposition A.1(i)).
Deriving meaningful lower bounds for $w(K \cap H)$ in terms of $w(K)$ in a general setting (such as Proposition 4.28 for the volume radius) is not that easy. However, when $K$ is one of the sets $\mathcal{C}^b$ from Table 9.3, nontrivial lower bounds for the mean width follow from the estimates on the volume radii contained in the Table and from Urysohn’s inequality.
Exercise 9.10. Prove the bounds (9.31).
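Exercise 9.11 (Cones of channels are not self-dual). Let $\mathcal{H} = \mathbb{C}^m \otimes \mathbb{C}^n$.
(i) Consider the affine subspace $H = \{A \in B(\mathcal{H}) : \mathrm{Tr}_{\mathbb{C}^n} A = \mathrm{I}/m\}$. Show that $\mathrm{D} \cap H \subsetneq P_H \mathrm{D}$ and $\mathrm{Sep} \cap H \subsetneq P_H \mathrm{Sep}$.
(ii) Conclude in particular that $(\mathrm{D} \cap H)^\circ \neq -mn(\mathrm{D} \cap H)$: the self-duality of $\mathrm{D}$ is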
or Sep ) are close, with respect to the geometric distance, to a polytope with not-
and reference, we list the results in Table 9.4; the proofs can be found in the next
two sections.
9.4.1. Approximating the set of all quantum states. We first show that it is possible to approximate $\mathrm{D}$ by a polytope whose number of vertices is exponential in the dimension of the underlying Hilbert space. Recall that the notation $t \bullet K$ was defined in (9.1).
Proposition 9.25. For every $\varepsilon \in (0, 1)$, there is a constant $C(\varepsilon)$ such that the following holds: for every dimension $d \ge 2$, there exists a family $\mathcal{N} = (\varphi_i)_{1 \le i \le N}$ of unit vectors in $\mathbb{C}^d$, with $N \le \exp(C(\varepsilon)d)$, such that
\[ (1 - \varepsilon) \bullet \mathrm{D}(\mathbb{C}^d) \subset \mathrm{conv}\{|\varphi_i\rangle\langle\varphi_i| : \varphi_i \in \mathcal{N}\}. \tag{9.34} \]
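$K$                                             dimension    $a(K)$      $\dim_V(K)$            $\dim_F(K)$
$\mathrm{D}(\mathbb{C}^m)$                      $m^2 - 1$    $m - 1$     $\Theta(m)$            $\Theta(m)$
$\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$    $d^4 - 1$    $d^2 - 1$   $\Theta(d \log d)$     $\Omega(d^3 / \log d)$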
The result from Proposition 9.25 can be rephrased as estimates on the verticial (or facial) dimension of $\mathrm{D}(\mathbb{C}^d)$.
Corollary 9.26. There are absolute constants $c$, $C$ such that, for any $d \ge 2$,
\[ cd \le \dim_V(\mathrm{D}(\mathbb{C}^d)) = \dim_F(\mathrm{D}(\mathbb{C}^d)) \le Cd. \]
Proof. Since $\mathrm{D}^\circ = (-d) \bullet \mathrm{D}$, the facial and verticial dimensions are equal. The upper bound follows from Proposition 9.25. Using the value $a(\mathrm{D}) = d - 1$ (see Table 9.1 and Exercise 9.12), one can deduce the lower bound from Theorem 7.29. Alternatively, an elementary argument is sketched in Exercise 9.13.
It may seem reasonable to expect that choosing $\mathcal{N}$ as a $\delta$-net in $S_{\mathbb{C}^d}$ (for some $\delta$ depending only on $\varepsilon$) would be enough for the conclusion of Proposition 9.25 to hold. This is the case for the symmetrized set of states (see Lemma 9.2). However, this approach fails for $\mathrm{D}$ itself. Indeed, given $\delta$, for $d$ large enough, a $\delta$-net $\mathcal{N}$ may have the property that for some fixed unit vector $\psi$, we have $|\langle\varphi_i, \psi\rangle| > 1/\sqrt{d}$ for every $\varphi_i \in \mathcal{N}$. It follows that $\langle\psi|\rho|\psi\rangle > 1/d$ for every $\rho \in \mathrm{conv}\{|\varphi_i\rangle\langle\varphi_i|\}$. However, this inequality fails for $\rho = \rho_*$, which shows that even the maximally mixed state does not belong to the convex hull of the net! Elements of the net may somehow conspire towards the direction $\psi$.
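The obstruction just described is easy to reproduce numerically. The sketch below is only an illustration under loud assumptions: the family it builds is a toy family obtained by rejection sampling (it is not claimed to be a $\delta$-net), $\psi$ is taken to be the first basis vector, and all function names are hypothetical.

```python
import math
import random

def random_unit_vector(d, rng):
    """Uniformly distributed unit vector in C^d (normalised complex Gaussian)."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
    norm = math.sqrt(sum(abs(z) ** 2 for z in v))
    return [z / norm for z in v]

def biased_family(d, size, rng):
    """Toy family (NOT a net): keep only vectors with |<phi, psi>|^2 > 1/d, psi = e_0."""
    family = []
    while len(family) < size:
        phi = random_unit_vector(d, rng)
        if abs(phi[0]) ** 2 > 1 / d:
            family.append(phi)
    return family

def expectation_in_psi(family, weights):
    """<psi|rho|psi> for rho = sum_i w_i |phi_i><phi_i| and psi = e_0."""
    return sum(w * abs(phi[0]) ** 2 for w, phi in zip(weights, family))

rng = random.Random(0)
d = 8
family = biased_family(d, 50, rng)
raw = [rng.random() for _ in family]
weights = [r / sum(raw) for r in raw]      # random convex weights
value = expectation_in_psi(family, weights)
# The maximally mixed state gives exactly 1/d, so it cannot equal this mixture.
print(value, 1 / d)
```

Whatever convex weights are chosen, the mixture has expectation strictly above $1/d$ in the direction $\psi$, whereas $\langle\psi|\rho_*|\psi\rangle = 1/d$.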
Yet, this approach can be salvaged if we use a balanced δ-net to avoid such
that these points satisfy the conclusion of Proposition 9.25 with high probability.
This is reminiscent of the random covering argument used in Proposition 5.4.
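We start with a lemma which gives a rough bound on the number of unit vectors that are needed.
Lemma 9.27. Let $\mathcal{M}$ be a $\delta$-net in $(S_{\mathbb{C}^d}, |\cdot|)$. Then
\[ (1 - 2d\delta) \bullet \mathrm{D}(\mathbb{C}^d) \subset \mathrm{conv}\{|\psi_i\rangle\langle\psi_i| : \psi_i \in \mathcal{M}\} \subset \mathrm{D}(\mathbb{C}^d). \tag{9.35} \]
The reader will notice that the proof given below can be fine-tuned to yield a slightly better (but more complicated) factor $\big(1 - 2(d - 1)\delta\big)$ in (9.35).
Proof. We have to show that, for any trace zero Hermitian matrix $A$,
\[ \lambda_1(A) = \sup_{\psi \in S_{\mathbb{C}^d}} \langle\psi|A|\psi\rangle \le (1 - 2\delta d)^{-1} \sup_{\psi_i \in \mathcal{M}} \langle\psi_i|A|\psi_i\rangle. \]
Since $A$ has zero trace, we have $\|A\|_\infty \le d\lambda_1(A)$. Given $\psi \in S_{\mathbb{C}^d}$, there is $\psi_i \in \mathcal{M}$ with $|\psi - \psi_i| \le \delta$. By the triangle inequality, we have
\begin{align}
\langle\psi|A|\psi\rangle &\le \delta\|A\|_\infty + \langle\psi|A|\psi_i\rangle \tag{9.36} \\
&\le 2\delta\|A\|_\infty + \langle\psi_i|A|\psi_i\rangle \tag{9.37} \\
&\le 2\delta d\lambda_1(A) + \langle\psi_i|A|\psi_i\rangle. \tag{9.38}
\end{align}
Taking supremum over $\psi$, we get $\lambda_1(A) \le 2\delta d\,\lambda_1(A) + \sup\{\langle\psi_i|A|\psi_i\rangle : \psi_i \in \mathcal{M}\}$ and the result follows.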
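The two elementary inequalities used in this proof can be sanity-checked on random instances. The sketch below makes a simplifying assumption that is not in the text: $A$ is taken diagonal, so that $\lambda_1(A)$ and $\|A\|_\infty$ can be read off without an eigenvalue solver (the inequalities themselves are basis-independent). Function names are illustrative.

```python
import math
import random

def normalize(v):
    norm = math.sqrt(sum(abs(z) ** 2 for z in v))
    return [z / norm for z in v]

def random_unit_vector(d, rng):
    return normalize([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)])

def quad_form(diag, psi):
    # <psi|A|psi> for the diagonal matrix A = Diag(diag)
    return sum(a * abs(z) ** 2 for a, z in zip(diag, psi))

def check_once(d, delta, rng):
    diag = [rng.gauss(0, 1) for _ in range(d)]
    mean = sum(diag) / d
    diag = [a - mean for a in diag]           # enforce Tr A = 0
    lam1 = max(diag)                          # largest eigenvalue
    op_norm = max(abs(a) for a in diag)       # operator norm
    psi = random_unit_vector(d, rng)
    noise = random_unit_vector(d, rng)
    # A step of length delta/2 followed by renormalisation moves psi by at most delta.
    psi_i = normalize([p + 0.5 * delta * q for p, q in zip(psi, noise)])
    dist = math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(psi, psi_i)))
    tol = 1e-12
    return (dist <= delta + tol
            and op_norm <= d * lam1 + tol     # consequence of Tr A = 0
            and quad_form(diag, psi)          # combined form of (9.36)-(9.37)
                <= 2 * delta * op_norm + quad_form(diag, psi_i) + tol)

rng = random.Random(1)
print(all(check_once(5, 0.1, rng) for _ in range(200)))
```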
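Lemma 9.27 is not enough to directly imply Proposition 9.25, but it can be “bootstrapped” to yield the needed estimate.
formulated as follows: For any self-adjoint trace zero matrix $A$ we have
\[ \lambda_1(A) = \sup_{\psi \in S_{\mathbb{C}^d}} \langle\psi|A|\psi\rangle \le \frac{1}{1 - \varepsilon} \sup_{\varphi_i \in \mathcal{N}} \langle\varphi_i|A|\varphi_i\rangle. \tag{9.39} \]
Let $\mathcal{M}$ be an $\frac{\varepsilon}{4d}$-net in $(S_{\mathbb{C}^d}, |\cdot|)$. By Lemma 5.3, we may enforce $\mathrm{card}\,\mathcal{M} \le (8d/\varepsilon)^{2d}$. By Lemma 9.27, we have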
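\[ \frac{1}{\sigma(C(\psi, \eta))} \int_{C(\psi, \eta)} |\varphi\rangle\langle\varphi| \, d\sigma(\varphi) = (1 - \alpha) \bullet |\psi\rangle\langle\psi|. \tag{9.41} \]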
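\[ 1 - \alpha + \frac{\alpha}{d} = \frac{1}{\sigma(C(\psi, \eta))} \int_{C(\psi, \eta)} |\langle\psi, \varphi\rangle|^2 \, d\sigma(\varphi) \ge \cos^2 \eta \ge 1 - \eta^2 \]
so that
\[ \alpha \le \frac{d}{d - 1}\,\eta^2 \le \varepsilon/4. \tag{9.42} \]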
obscure the argument, we will pretend in what follows that $2L^3$ is an integer and so $N = 2L^3$.) We will rely on the following lemma.
Lemma 9.28. Let $S_\infty^d = \{\Delta \in \mathrm{M}_d : \|\Delta\|_{op} \le 1\}$ be the unit ball for the operator norm. For $\psi \in S_{\mathbb{C}^d}$ and $t \ge 0$, the event
\[ E_{\psi,t} = \Big\{ (\varphi_i) : (1 - \alpha) \bullet |\psi\rangle\langle\psi| \in tS_\infty^d + \mathrm{conv}\{|\varphi_i\rangle\langle\varphi_i| : 1 \le i \le 2L^3\} \Big\} \]
satisfies
\[ 1 - \mathbf{P}(E_{\psi,t}) \le \exp(-L) + 2d\exp\big(-t^2 L^2/8\big). \]
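We apply Lemma 9.28 with $t = \varepsilon/(8d)$. When the event $E_{\psi,t}$ holds, we have
\[ (1 - \alpha)\langle\psi|A|\psi\rangle \le t\|A\|_1 + \sup_{\varphi_i \in \mathcal{N}} \langle\varphi_i|A|\varphi_i\rangle. \tag{9.43} \]
If the events $E_{\psi,t}$ hold simultaneously for every $\psi \in \mathcal{M}$, we can conclude from (9.40) and (9.43) that
\[ (1 - \varepsilon/2)(1 - \alpha)\lambda_1(A) \le t\|A\|_1 + \sup_{\varphi_i \in \mathcal{N}} \langle\varphi_i|A|\varphi_i\rangle. \tag{9.44} \]
Since $A$ has zero trace, we have $\|A\|_1 \le 2d\lambda_1(A)$, and (9.44) combined with (9.42) implies that
\[ (1 - \varepsilon)\lambda_1(A) \le \big((1 - \varepsilon/2)(1 - \alpha) - 2td\big)\lambda_1(A) \le \sup_{\varphi_i \in \mathcal{N}} \langle\varphi_i|A|\varphi_i\rangle, \]
yielding (9.39). The Proposition will follow once we show that the events $E_{\psi,t}$ hold simultaneously for every $\psi \in \mathcal{M}$ with positive probability. To that end, we use Lemma 9.28 and the union bound
\begin{align}
\mathbf{P}\Big(\bigcap_{\psi \in \mathcal{M}} E_{\psi,t}\Big) &\ge 1 - \sum_{\psi \in \mathcal{M}} (1 - \mathbf{P}(E_{\psi,t})) \tag{9.45} \\
&\ge 1 - \Big(\frac{8d}{\varepsilon}\Big)^{2d} \Big(\exp(-L) + 2d\exp\big(-\varepsilon^2 d^{-2} L^2/512\big)\Big). \tag{9.46}
\end{align}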
We know from Proposition 5.1 that $\exp(c_1(\varepsilon)d) \le L \le \exp(C_1(\varepsilon)d)$ for some constants $c_1(\varepsilon)$, $C_1(\varepsilon)$ depending only on $\varepsilon$. It follows that the quantity in (9.45)–(9.46) is positive for $d$ large enough (depending on $\varepsilon$), yielding a family of $2L^3 \le 2\exp(3C_1(\varepsilon)d)$ vectors satisfying the conclusion of Proposition 9.25. Small values of $d$ are taken care of by adjusting the constant $C(\varepsilon)$ if necessary.
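To get a feeling for how large $L$ must be before the bound (9.46) becomes nontrivial, one can evaluate the subtracted term in logarithmic scale. This is a minimal sketch: the parameter values are arbitrary test inputs rather than the actual $L$ provided by Proposition 5.1, and the function name is illustrative.

```python
import math

def log_failure_bound(d, eps, L):
    """Logarithm of (8d/eps)^(2d) * (exp(-L) + 2d * exp(-eps^2 L^2 / (512 d^2))),

    i.e. of the quantity subtracted from 1 in (9.46).
    """
    log_card = 2 * d * math.log(8 * d / eps)          # log of the bound on card M
    a = -L                                            # log of exp(-L)
    b = math.log(2 * d) - eps**2 * L**2 / (512 * d**2)
    m = max(a, b)
    # log(exp(a) + exp(b)), computed stably
    log_sum = m + math.log(math.exp(a - m) + math.exp(b - m))
    return log_card + log_sum

# A negative value means that the right-hand side of (9.46) is positive.
print(log_failure_bound(10, 0.5, 1e4))
print(log_failure_bound(10, 0.5, 10.0))
```

Since $L$ grows exponentially in $d$ while the logarithm of the cardinality bound grows only like $d \log d$, the first regime is the relevant one for large $d$.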
Proof of Lemma 9.28. Let $M_\psi = \mathrm{card}(\mathcal{N} \cap C(\psi, \eta))$. The random variable $M_\psi$ follows the binomial distribution $B(N, p)$ for $N = 2L^3$ and $p = 1/L$. It follows
\[ \mathbf{P}\Big(B(N, p) \le \frac{Np}{2}\Big) \le \exp\Big(-\frac{p^2 N}{2}\Big). \tag{9.47} \]
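The tail estimate (9.47) can be compared with the exact binomial tail for small values of $L$. The sketch below is an illustration only; note that with $N = 2L^3$ and $p = 1/L$ the threshold $Np/2$ equals $L^2$ and the right-hand side equals $\exp(-L)$. Function names are illustrative.

```python
import math

def binomial_lower_tail(N, p, threshold):
    """Exact value of P(B(N, p) <= threshold) for an integer threshold."""
    return sum(math.comb(N, j) * p**j * (1 - p) ** (N - j)
               for j in range(threshold + 1))

def tail_bound(N, p):
    """Right-hand side of (9.47)."""
    return math.exp(-p**2 * N / 2)

for L in (2, 3, 4, 5):
    N, p = 2 * L**3, 1 / L
    # Threshold N p / 2 = L^2, written as an integer to avoid rounding issues.
    print(L, binomial_lower_tail(N, p, L**2), tail_bound(N, p))
```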
Moreover, conditionally on the value of $M_\psi$, the points from $\mathcal{N} \cap C(\psi, \eta)$ have the same distribution as $(\varphi_k)_{1 \le k \le M_\psi}$, where $(\varphi_k)$ are independent and uniformly distributed inside $C(\psi, \eta)$. The random matrices
\[ X_k = |\varphi_k\rangle\langle\varphi_k| - \mathbf{E}\,|\varphi_1\rangle\langle\varphi_1| = |\varphi_k\rangle\langle\varphi_k| - (1 - \alpha) \bullet |\psi\rangle\langle\psi| \]
are independent mean zero matrices. We now use the matrix Hoeffding inequality (see, e.g., Theorem 1.3 in [Tro12]) to conclude that for any $t \ge 0$,
\[ \mathbf{P}\Big( \Big\| \frac{1}{M_\psi} \sum_{k=1}^{M_\psi} X_k \Big\|_\infty \ge t \Big) \le 2d\exp(-M_\psi t^2/8) \tag{9.48} \]
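(the factor 2 appears because we want to control the operator norm rather than the largest eigenvalue). Define a random matrix $\Delta$ by the relation
\[ \frac{1}{M_\psi} \sum_{k=1}^{M_\psi} |\varphi_k\rangle\langle\varphi_k| + \Delta = (1 - \alpha) \bullet |\psi\rangle\langle\psi|. \]
\[ \mathbf{P}(\|\Delta\|_\infty \ge t) \le \exp(-L) + 2d\exp\big(-L^2 t^2/8\big) \]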
Exercise 9.12 (Asphericity of D). By comparing the values of the inradius and the outradius of $\mathrm{D}(\mathbb{C}^m)$ from Table 9.1, we see that the asphericity of $\mathrm{D}(\mathbb{C}^m)$ is at most $m - 1$. Prove that it actually equals $m - 1$.
Exercise 9.13 (An elementary bound for verticial dimension of D). Let $P$ be a polytope such that $\frac{1}{4} \bullet \mathrm{D}(\mathbb{C}^d) \subset P \subset \mathrm{D}(\mathbb{C}^d)$. Use Proposition 6.3 to prove that $P$ has at least $\exp(cd)$ vertices for some $c > 0$.
9.4.2. Approximating the set of separable states. For simplicity, we only consider the case $\mathcal{H} = \mathbb{C}^d \otimes \mathbb{C}^d$ and denote $\mathrm{Sep} = \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. As in the case of $\mathrm{D}$, a simple net argument (Lemma 9.29) shows that the verticial dimension of Sep is $O(d \log d)$. However there is no analogue of the random construction used in Proposition 9.25: this upper bound is sharp (see Proposition 9.31). Here are the
4d2 -net
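\[ (1 - \varepsilon) \bullet \mathrm{Sep} \subset \mathrm{conv}\{|\psi_\alpha \otimes \psi_\beta\rangle\langle\psi_\alpha \otimes \psi_\beta| : \psi_\alpha, \psi_\beta \in \mathcal{N}\}. \]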
ψ,ϕPSCd ψα ,ψβ PN
\[ W \ge \frac{1}{d^2}\|A\|_2 \ge \frac{1}{d^2}\|A\|_\infty. \]
We will show next that the upper bound obtained in Lemma 9.29 is sharp. This is in contrast with the case of the symmetrized version of Sep, whose verticial dimension is of order $d$ (see Lemma 9.4).
Proposition 9.31. Let $\mathrm{Sep} = \mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. Then $\dim_V(\mathrm{Sep}) \ge cd \log d$ for some constant $c > 0$.
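Proof. Let $P$ be a polytope with $N$ vertices such that $\frac{1}{4} \bullet \mathrm{Sep} \subset P \subset \mathrm{Sep}$. By Carathéodory’s theorem, we may write each vertex of $P$ as a combination of $d^4$ extreme points of Sep (which are pure product states, i.e., of the form $|\psi \otimes \varphi\rangle\langle\psi \otimes \varphi|$ for unit vectors $\psi, \varphi \in \mathbb{C}^d$). We obtain therefore a polytope $Q$ which is the convex hull of $N' \le Nd^4$ pure product states, and such that $\frac{1}{4} \bullet \mathrm{Sep} \subset Q \subset \mathrm{Sep}$. Let $(|\psi_i \otimes \varphi_i\rangle\langle\psi_i \otimes \varphi_i|)_{1 \le i \le N'}$ be the vertices of $Q$. Fix $\chi \in S_{\mathbb{C}^d}$ arbitrarily. For any $\varphi \in S_{\mathbb{C}^d}$, let $\alpha = \max\{|\langle\varphi, \varphi_i\rangle|^2 : 1 \le i \le N'\}$. Consider the linear form
\[ g(\rho) = \mathrm{Tr}\big[\rho\,\big(|\chi\rangle\langle\chi| \otimes (\alpha\,\mathrm{I}_{\mathbb{C}^d} - |\varphi\rangle\langle\varphi|)\big)\big]. \]
For any $1 \le i \le N'$ we have
\[ g(|\psi_i \otimes \varphi_i\rangle\langle\psi_i \otimes \varphi_i|) = |\langle\chi, \psi_i\rangle|^2 (\alpha - |\langle\varphi, \varphi_i\rangle|^2) \ge 0 \]
and therefore $g$ is nonnegative on $Q$. Since $Q \supset \frac{1}{4} \bullet \mathrm{Sep}$, we have
\begin{align*}
0 \le g\Big(\frac{1}{4} \bullet |\chi \otimes \varphi\rangle\langle\chi \otimes \varphi|\Big) &= \frac{1}{4}\,g(|\chi \otimes \varphi\rangle\langle\chi \otimes \varphi|) + \frac{3}{4}\,g(\rho_*) \\
&= \frac{1}{4}(\alpha - 1) + \frac{3}{4} \times \frac{1}{d}\Big(\alpha - \frac{1}{d}\Big) \\
&= \alpha\Big(\frac{1}{4} + \frac{3}{4d}\Big) - \Big(\frac{1}{4} + \frac{3}{4d^2}\Big).
\end{align*}
It follows that
\[ \alpha \ge \frac{1 + \frac{3}{d^2}}{1 + \frac{3}{d}} \ge 1 - \frac{3}{d}. \]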
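The algebra leading to the lower bound on $\alpha$ can be verified in exact rational arithmetic. This is a small sketch with illustrative function names; it only checks the last three displayed lines, taking the values $g(|\chi \otimes \varphi\rangle\langle\chi \otimes \varphi|) = \alpha - 1$ and $g(\rho_*) = \frac{1}{d}(\alpha - \frac{1}{d})$ from the text.

```python
from fractions import Fraction

def g_of_shrunk_state(alpha, d):
    """(1/4)(alpha - 1) + (3/4)(1/d)(alpha - 1/d), the value computed in the proof."""
    return (Fraction(1, 4) * (alpha - 1)
            + Fraction(3, 4) * Fraction(1, d) * (alpha - Fraction(1, d)))

def alpha_threshold(d):
    """Smallest alpha for which the expression above is nonnegative."""
    return (1 + Fraction(3, d**2)) / (1 + Fraction(3, d))

for d in (2, 3, 10, 100):
    thr = alpha_threshold(d)
    # The expression vanishes exactly at the threshold, and the threshold
    # dominates the simpler quantity 1 - 3/d.
    print(d, thr, g_of_shrunk_state(thr, d), thr >= 1 - Fraction(3, d))
```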
In other words, we proved that for every $\varphi \in S_{\mathbb{C}^d}$ there is an index $i \in \{1, \dots, N'\}$ such that $|\langle\varphi, \varphi_i\rangle|^2 \ge 1 - 3/d$. This means that $(\varphi_i)_{1 \le i \le N'}$ is a $(C/\sqrt{d})$-net in
are necessary to approximate it within a constant factor.
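In this section we write $\mathrm{D}$, $\mathcal{PSD}$ and Sep for $\mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$, $\mathcal{PSD}(\mathbb{C}^d \otimes \mathbb{C}^d)$ and $\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)$. We denote by $\mathbf{P}(\mathbb{C}^d)$ the cone of positivity-preserving operators from $\mathrm{M}_d$ to $\mathrm{M}_d$. Recall the statement of Theorem 2.34: a state $\rho \in \mathrm{D}$ is entangled if and only if there exists an entanglement witness, i.e., $\Phi \in \mathbf{P}(\mathbb{C}^d)$ such that $(\Phi \otimes \mathrm{Id})(\rho)$ is not positive. In other words
\[ \mathrm{Sep} = \bigcap_{\Phi \in \mathbf{P}(\mathbb{C}^d)} \{\rho \in \mathrm{D} : (\Phi \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}\}. \tag{9.51} \]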
It is natural to wonder whether the intersection in (9.51) can be taken over a smaller subfamily. For $d = 2$, two superoperators suffice, namely $\mathrm{Id}$ and $T$; this is the content of Størmer’s theorem. It is known that for $d \ge 3$ an infinite family is needed. If we consider instead the isomorphic version of the problem, the following theorem shows that super-exponentially (in the dimension of the underlying Hilbert space) many witnesses are necessary.
\[ \bigcap_{i=1}^{N} \{\rho \in \mathrm{D} : (\Phi_i \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}\} \subset 2 \bullet \mathrm{Sep}, \tag{9.52} \]
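\[ \bigcap_{i=1}^{N} \{\rho \in \mathrm{D} : (\Phi_i \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}\} \subset \frac{c_0\sqrt{d}}{\log d} \bullet \mathrm{Sep}, \tag{9.53} \]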
In other words, even being able to detect very robust entanglement requires super-exponentially many witnesses. It would be of some interest to determine the maximal robustness level (defined in (9.26)) at which this phenomenon still persists. Note that, by Proposition 9.17, $\mathrm{D} \subset \big(1 + \frac{d^2}{2}\big) \bullet \mathrm{Sep}$ for states on $\mathbb{C}^d \otimes \mathbb{C}^d$, so the question is nontrivial only if a threshold for the robustness level is smaller than $\frac{d^2}{2}$.
Proof of Theorem 9.34. Without loss of generality, we may assume that each superoperator $\Phi_i$ is unital (see Exercise 9.15). We use the following lemma.
Lemma 9.36. Let $\Phi \in \mathbf{P}(\mathbb{C}^d)$ be unital. Then for any $\rho \in \mathrm{D}$,
\[ 0 \le \mathrm{Tr}\,[(\Phi \otimes \mathrm{Id})\rho] \le d. \]
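Let $\varepsilon = 1/(1 + d)$. Let $P$ be a polytope with at most $\exp(C_0 d^2 \log(d))$ facets such that
\[ (1 - \varepsilon) \bullet \mathrm{D} \subset P \subset \mathrm{D}. \tag{9.54} \]
The existence of $P$ is guaranteed by Lemma 9.27, by the relation $\mathrm{D}^\circ = (-d^2) \bullet \mathrm{D}$ and by the fact that facets of $P$ are in bijection with vertices of $P^\circ$ (see Section 1.1.5). Introduce the convex body
\[ K_i = \{\rho \in \mathrm{D} : (\Phi_i \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}\} = \mathrm{D} \cap (\Phi_i \otimes \mathrm{Id})^{-1}(\mathcal{PSD}) \]
(note that $\mathrm{Sep} \subset K_i$) and the polyhedral cone
\[ \mathcal{C}_i := \big\{A \in B^{sa}(\mathbb{C}^d \otimes \mathbb{C}^d) : (\Phi_i \otimes \mathrm{Id})(A) \in \mathbb{R}_+ P\big\}. \tag{9.55} \]
We claim that
\[ \frac{1}{2} \bullet K_i \subset P \cap \mathcal{C}_i \subset K_i. \tag{9.56} \]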
Before proving the claim, let us first show how it implies the Theorem. Combining
\[ \frac{1}{2} \bullet \mathrm{Sep} \subset \frac{1}{2} \bullet \bigcap_{i=1}^{N} K_i \subset \bigcap_{i=1}^{N} (P \cap \mathcal{C}_i) = P \cap \bigcap_{i=1}^{N} \mathcal{C}_i \subset \bigcap_{i=1}^{N} K_i \subset 2 \bullet \mathrm{Sep}. \]
Since we know from Corollary 9.32 that $\dim_F(\mathrm{Sep}) = \Omega(d^3/\log d)$, it follows that $\log(N + 1) \ge cd^3/\log d$ for $d$ large enough. Since small values of $d$ can be taken into account by adjusting the constant $c$ if necessary, this implies the Theorem.
It remains to prove the claimed inclusions (9.56). The second inclusion is immediate from the definitions and from (9.54). For the first inclusion, it is clearly enough to show that $\frac{1}{2} \bullet K_i \subset \mathcal{C}_i$. To that end, let $\rho \in K_i$ and denote $t = \mathrm{Tr}\,[(\Phi_i \otimes \mathrm{Id})\rho] \ge 0$. We now consider two cases. First, if $t = 0$, then (since $(\Phi_i \otimes \mathrm{Id})(\rho)$ is a positive operator) we must have $(\Phi_i \otimes \mathrm{Id})(\rho) = 0$. Hence trivially $\rho \in \mathcal{C}_i$ and, a fortiori, $\frac{1}{2} \bullet \rho \in \mathcal{C}_i$. If $t > 0$, we note that $t^{-1}(\Phi_i \otimes \mathrm{Id})(\rho) \in \mathrm{D}$ and that, by Lemma 9.36, we have $t \le d$, and therefore $\frac{t}{1+t} = 1 - \frac{1}{1+t} \le 1 - \frac{1}{1+d} = 1 - \varepsilon$. It thus follows from (9.54) that
\[ \frac{t}{1+t} \bullet t^{-1}(\Phi_i \otimes \mathrm{Id})(\rho) \in \frac{t}{1+t} \bullet \mathrm{D} \subset (1 - \varepsilon) \bullet \mathrm{D} \subset P. \]
Exercise 9.15 (Unital witnesses suffice). Let $\Phi \in \mathbf{P}(\mathbb{C}^d)$. Show that there is a unital map $\Psi \in \mathbf{P}(\mathbb{C}^d)$ with the property that, for any $\rho \in \mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$,
\[ (\Phi \otimes \mathrm{Id})(\rho) \in \mathcal{PSD} \iff (\Psi \otimes \mathrm{Id})(\rho) \in \mathcal{PSD}. \]
Exercise 9.16 (Detecting very robust entanglement is also hard). (i) Show that, in the notation of Exercise 7.15, we have $\dim_F(\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d), A) \ge c\,d^3 A^{-2}/\log d$ for every $A > 1$, where $c > 0$ is an absolute constant. (ii) Prove Theorem 9.35.
Notes and Remarks
Section 9.1. The exact formula (9.4) for the volume of $\mathrm{D}$ appears in [ŻS03]. The question of computing exactly the volume of Sep was asked in [ŻHSL98] and seems challenging already in the bipartite case. A conjecture by Slater [Sla12], strongly supported by numerical evidence, is that for $\mathcal{H} = \mathbb{C}^2 \otimes \mathbb{C}^2$, one has $\mathrm{vol}(\mathrm{Sep})/\mathrm{vol}(\mathrm{D}) = 8/33$.
Theorems 9.3, 9.12 and 9.13 are from [AS06]; Theorem 9.11 appeared earlier in [Sza05]. Theorem 9.6 and its corollary about block-positive matrices are from [ASY14], and will be crucial in Chapter 10. The same question for multipartite Hilbert spaces or unbalanced bipartite Hilbert spaces was also studied in [ASY14]; an extra ingredient needed is the fact that $P_F\,\mathrm{Sep} = \mathrm{Sep} \cap F$ for certain subspaces $F$, see Exercise 9.11(iii).
Volume and mean width estimates for the hierarchies of states introduced in Section 2.2.5 are also known. For $1 \le k \le d$, denote by $\mathrm{Ent}_k$ the set of $k$-entangled states in $\mathbb{C}^d \otimes \mathbb{C}^d$. It is proved in [SWŻ11] that
\[ \frac{ck^{1/2}}{d^{3/2}} \le \mathrm{vrad}(\mathrm{Ent}_k) \le w(\mathrm{Ent}_k) \le \frac{Ck^{1/2}}{d^{3/2}}, \tag{9.57} \]
which is of course compatible with the extreme cases $\mathrm{Ent}_1 = \mathrm{Sep}$ and $\mathrm{Ent}_d = \mathrm{D}$.
\[ w(\mathrm{Ext}_k) \sim \frac{2}{d\sqrt{k}}. \tag{9.58} \]
Note that $\mathrm{D} = \mathrm{Ext}_1$ and $\mathrm{Sep} = \bigcap\{\mathrm{Ext}_k : k \ge 1\}$. However the implicit dependence on $k$ in (9.58) does not allow one to recover Theorem 9.3 as $k \to \infty$.
Section 9.2. Theorem 9.15 was proved in [GB02]. The proof we present is due to Hans-Jürgen Sommers and appears in [Som09]; the equivalence between Theorem 9.15 and the inequality $\mathrm{Tr}(M^2) \le (\mathrm{Tr}\,M)^2$ for a block-positive matrix $M$ has been noted in [SWŻ08]. The alternative argument from Exercise 9.8 is from [Wat].
Proposition 9.17 (in the language of robustness) has been proved by Vidal and Tarrach [VT99]. Proposition 9.18 is from [Jen13]. The result from Theorem 9.19 is due to Beigi and Shor [BS10] and relies on the quantum de Finetti theorem. Another argument, yielding better quantitative estimates, was presented in [BHH+14] and was based on the concept of private states. Proposition 9.20 is also from [BHH+14].
Both inequalities from Theorem 9.21 are due to Hildebrand ([Hil06] for the lower bound and [Hil07a] for the upper bound), improving on previous results by Gurvits and Barnum [GB03, GB05] (the lower bound) and [AS06] (the upper bound, cf. the proof of Proposition 9.22).
The question of determining the exact order of $d_g(\mathrm{Sep}, \mathrm{D})$ for many qubits (cf. Proposition 9.22) deserves attention since it can be connected to feasibility of nuclear magnetic resonance (NMR) quantum information protocols (see, e.g., [GB05]).
Section 9.3. Theorem 9.23 was derived in [SWŻ08], to which we refer for precise estimates for the constants implicit in the $\Theta(\cdot)$ notation from Table 9.3.
Another class of superoperators for which volume estimates are known is the class of $k$-positive maps. Indeed, this class is essentially dual to the class of $k$-entangled operators (see Exercise 2.48). It was proved in [SWŻ11], as a consequence of (9.57), that if $\mathbf{P}_{k,TP}$ denotes the set of $k$-positive trace-preserving maps from $\mathrm{M}_d$ to itself, then
\[ c\sqrt{k/d} \le \mathrm{vrad}(\mathbf{P}_{k,TP}) \le C\sqrt{k/d}. \]
Section 9.4. The results from this section are from [AS17]. The fact that for
The main goal of this chapter is to prove the following result. Consider a system of $N$ identical particles (e.g., $N$ qubits) in a random pure state. For some $k \le N/2$, let $A$ and $B$ be two subsystems, each consisting of $k$ particles. There exists a threshold function $k_0(N)$ which satisfies $k_0(N) \sim N/5$ as $N \to \infty$ and such that the following holds. If $k < k_0(N)$, then with high probability the two subsystems $A$ and $B$ do not share entanglement. Conversely, if $k > k_0(N)$, then with high probability the two subsystems $A$ and $B$ share entanglement.
If the Hilbert space associated to a single particle is $\mathbb{C}^q$ (e.g., $q = 2$ for qubits), the dimension of the system $A \otimes B$ equals $q^{2k}$ and the state $\rho$ describing the $A \otimes B$ subsystem is obtained as a partial trace over an environment of dimension $q^{N-2k}$ (the remaining $N - 2k$ particles). If the global system is in a random and uniformly distributed pure state, the state $\rho$ is a random induced state as introduced in Section 6.2.3.4, where its distribution was denoted by $\mu_{q^{2k}, q^{N-2k}}$. The central result of the chapter (Theorem 10.12) answers the question whether a random induced state on $\mathbb{C}^d \otimes \mathbb{C}^d$ with distribution $\mu_{d^2, s}$ is separable or entangled. It relies on the volume and mean width estimates from Chapter 9.
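The value $N/5$ can be traced back to a comparison of exponents. The following is a heuristic sketch only: it assumes that the threshold environment dimension in Theorem 10.12 is of order $d^3$ up to logarithmic factors (cf. (10.10)) and ignores those factors as well as all constants. The function names are illustrative.

```python
def subsystem_dimensions(N, k, q=2):
    """Return (d, s): d = q^k is the dimension of A (and of B),
    s = q^(N - 2k) is the dimension of the environment."""
    if not 0 <= 2 * k <= N:
        raise ValueError("need 0 <= 2k <= N")
    return q**k, q ** (N - 2 * k)

def exponent_gap(N, k):
    """log_q(s / d^3) = (N - 2k) - 3k.

    Positive values mean that the environment is larger than d^3,
    negative values that it is smaller (heuristic comparison only).
    """
    return N - 5 * k

for k in (1, 2, 3, 4):
    d, s = subsystem_dimensions(10, k)
    print(k, d, s, d**3, exponent_gap(10, k))
```

With $N = 10$ the two sides balance exactly at $k = 2 = N/5$, where $s = d^3 = 64$.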
Section 10.3 contains results about other thresholds for random induced states:
for the PPT vs. non-PPT dichotomy (Theorem 10.17) and for the value of the
1.3.1. We first state a technical result that ascertains that “flat” vectors (i.e., vectors with a large $\ell_1$-norm and small $\ell_\infty$-norm) majorize many other vectors. Since we need to consider homotheties, it is natural to work in $\mathbb{R}^{n,0}$, the hyperplane of $\mathbb{R}^n$ consisting of vectors whose coordinates add up to 0.
Lemma 10.1. Let $x, y \in \mathbb{R}^{n,0}$. Assume that $\|y\|_\infty \le 1$ and $\|y\|_1 \ge \alpha n$ for some $\alpha \in (0, 1]$. Then
\[ x \prec (2/\alpha - 1)\|x\|_\infty\,y. \tag{10.1} \]
Proof of Lemma 10.1. By homogeneity, it is enough to verify that the condition $\|x\|_\infty \le 1$ implies $x \prec (2/\alpha - 1)y$. Moreover, it is enough to check this for
$x$ being an extreme point of the set $A := \{x \in \mathbb{R}^{n,0} : \|x\|_\infty \le 1\}$, since the set $\{x \in \mathbb{R}^{n,0} : x \prec z\}$ is convex for any $z \in \mathbb{R}^{n,0}$.
Extreme points of $A$ are of the following form: $\lfloor n/2 \rfloor$ coordinates are equal to 1 and $\lfloor n/2 \rfloor$ coordinates are equal to $-1$. In the case of odd $n$ there is one remaining coordinate, which is necessarily equal to 0. It is thus enough to verify that if $x$ is of that form, and if $y$ satisfies $\|y\|_\infty \le 1$ and $\|y\|_1 = \alpha n$, then $x \prec (2/\alpha - 1)y$. This is shown by establishing that an average of permutations of $y$ is a multiple of $x$. First, average separately the positive and the negative coordinates of $y$ to obtain a vector $y'$ whose coordinates take only two values, one positive and one negative. Since the $\ell_1$-norm of the positive and the negative part of $y'$ is equal and amounts to $\alpha n/2$, the support of each part must be at least $\alpha n/2$ and at most $(1 - \alpha/2)n$, and the absolute value of each coordinate at least $\alpha/(2 - \alpha)$.
Assume now that $n$ is even. Next, select a set of $n/2$ equal coordinates (positive or negative, depending on which part has larger support) and average the remaining ones. The obtained vector is a multiple of an extreme point, as needed. If $n$ is odd, select $\lfloor n/2 \rfloor$ equal coordinates (from the dominant sign) and average the remaining ones to produce one zero and $\lfloor n/2 \rfloor$ equal coordinates. The resulting vector is also a multiple of an extreme point.
A simpler but less precise version of Lemma 10.1 can be obtained without any hypothesis on $\|y\|_\infty$.
Lemma 10.2. Let $x, y \in \mathbb{R}^{n,0}$ with $y \neq 0$. Then
\[ x \prec \frac{2n\|x\|_\infty}{\|y\|_1}\,y. \tag{10.2} \]
Proof. By homogeneity, we may assume that $\|y\|_\infty = 1$ and the result follows
As a consequence, we obtain the fact that if two vectors from $\mathbb{R}^{n,0}$ are flat and close to each other, one is majorized by a small perturbation of the other one.
\[ x \prec \Big(1 + \frac{2\varepsilon}{\alpha}\Big) y. \]
to establish and use is as follows: if $A$ and $B$ are two unitarily invariant random matrices with similar spectra, then, for any norm or gauge $\|\cdot\|$, the typical values of $\|A\|$ and of $\|B\|$ are comparable.
It is convenient to work in the hyperplane $\mathrm{M}_n^{sa,0}$ of self-adjoint complex $n \times n$ matrices with trace zero. One says that an $\mathrm{M}_n^{sa,0}$-valued random variable $A$ is unitarily invariant if, for any $U \in \mathrm{U}(n)$, the random matrices $A$ and $UAU^\dagger$ have the same distribution. Recall also that $\mu_{SC}$ is the standard semicircular distribution, that $\mu_{sp}(A)$ is the empirical spectral distribution of a self-adjoint matrix $A$, and that $d_\infty$ denotes the $\infty$-Wasserstein distance. All these concepts were introduced in Section 6.2.
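Proposition 10.4. Let $A$ and $B$ be two $\mathrm{M}_n^{sa,0}$-valued random variables which are unitarily invariant and satisfy the following conditions
\[ \mathbf{P}(d_\infty(\mu_{sp}(A), \mu_{SC}) \le \varepsilon) \ge 1 - p \quad \text{and} \quad \mathbf{E}\,d_\infty(\mu_{sp}(A), \mu_{SC}) \le \varepsilon \tag{10.3} \]
for some $\varepsilon, p \in (0, 1)$, and similarly for $B$. Then, for any convex body $K \subset \mathrm{M}_n^{sa,0}$ containing the origin in its interior,
\[ \frac{1 - p}{1 + C\varepsilon}\,\mathbf{E}\|A\|_K \le \mathbf{E}\|B\|_K \le \frac{1 + C\varepsilon}{1 - p}\,\mathbf{E}\|A\|_K \]
for some absolute constant $C$.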
Proof of Proposition 10.4. Note that possible relations between $A$ and $B$ (such as independence) are irrelevant in the present situation. Consider the following function on $\mathbb{R}^{n,0}$ (recall that $\mathbb{R}^{n,0}$ denotes the hyperplane of vectors of sum zero in $\mathbb{R}^n$)
\[ \phi(x) = \mathbf{E}\,\|U\,\mathrm{Diag}(x)\,U^\dagger\|_K, \]
everything else) and $\mathrm{Diag}(x)$ is the diagonal matrix whose $i$-th diagonal entry is $x_i$. Unitary invariance implies that
Assume for the moment that $E$ holds. We then have (see Exercise 6.25)
\[ \|B\|_1 = n \int |x| \, d\mu_{sp}(B)(x) \ge n \int_{-2}^{2} (|x| - \varepsilon)_+ \, d\mu_{SC}(x) \ge n \int_{-2}^{2} (|x| - 1)_+ \, d\mu_{SC}(x) = \alpha n, \]
If $\varepsilon$ is large (2 or larger), the hypothesis $d_\infty(\mu_{sp}(A), \mu_{SC}) \le \varepsilon$ does not prevent $A$ from being identically zero. However, an isomorphic version of Proposition 10.4 can be similarly obtained under the hypothesis that the spectra of $A$ and $B$ are reasonably flat.
Proposition 10.5 (see Exercise 10.3). Let $A$ and $B$ be two $\mathrm{M}_n^{sa,0}$-valued random variables which are unitarily invariant. Assume that
\[ \mathbf{P}(\|A\|_1 \ge c_1 n) \ge 1 - p \quad \text{and} \quad \mathbf{E}\,\|A\|_\infty \le C_2, \tag{10.5} \]
and similarly for $B$. Then, for any convex body $K \subset \mathrm{M}_n^{sa,0}$ containing the origin in the interior,
\[ C^{-1}\,\mathbf{E}\|A\|_K \le \mathbf{E}\|B\|_K \le C\,\mathbf{E}\|A\|_K \]
with $C = (1 - p)^{-1}(2C_2/c_1)$.
Exercise 10.2 (Retrieving unitarily invariant distributions from the spec-
distributed random unitary matrix independent of A. Show that the random matrix
Proposition 10.5.
However, we are also interested in properties that cannot be inferred from the
system). In this context, it is useful to compare induced states with their Gaussian approximation. Indeed, the Gaussian model allows one to connect with tools from convex geometry, such as the mean width.
It is convenient to work in the hyperplane $\mathrm{M}_n^{sa,0}$ and to consider the shifted operators $\rho - \mathrm{I}/n$, which we compare with a $\mathrm{GUE}_0$ random matrix (see Section 6.2.2). The following proposition compares the expected value of any norm (or gauge) computed for both models.
Proposition 10.6. Given integers $n$, $s$, denote by $\rho_{n,s}$ a random induced state on $\mathbb{C}^n$ with distribution $\mu_{n,s}$, and by $G_n$ an $n \times n$ $\mathrm{GUE}_0$ random matrix. Let $C_{n,s}$ be the smallest constant such that the following holds: for any convex body $K \subset \mathrm{M}_n^{sa,0}$ containing 0 in the interior,
\[ C_{n,s}^{-1}\,\mathbf{E}\Big\|\frac{G_n}{n\sqrt{s}}\Big\|_K \le \mathbf{E}\Big\|\rho_{n,s} - \frac{\mathrm{I}}{n}\Big\|_K \le C_{n,s}\,\mathbf{E}\Big\|\frac{G_n}{n\sqrt{s}}\Big\|_K. \tag{10.6} \]
Then
(i) For any sequences $(n_k)$ and $(s_k)$ such that $\lim_{k \to \infty} n_k = \lim_{k \to \infty} s_k/n_k = \infty$, we have $\lim_{k \to \infty} C_{n_k, s_k} = 1$.
(ii) For any $a > 0$, we have $\sup\{C_{n,s} : s \ge an\} < \infty$.
Remark 10.7. We emphasize that the quantity $\mathbf{E}\|G_n\|_K$ appearing in (10.6) is exactly the Gaussian mean width of the polar set $K^\circ$. Indeed, the standard Gaussian vector in the space $\mathrm{M}_n^{sa,0}$ (equipped with the Hilbert–Schmidt scalar product, as always) is exactly a $\mathrm{GUE}_0$ random matrix. In view of (4.32), we could have equivalently formulated Proposition 10.6 using the usual mean width: if $\tilde{C}_{n,s}$ denotes the smallest constant such that the inequalities
\[ \tilde{C}_{n,s}^{-1}\,\frac{w(K^\circ)}{\sqrt{s}} \le \mathbf{E}\Big\|\rho_{n,s} - \frac{\mathrm{I}}{n}\Big\|_K \le \tilde{C}_{n,s}\,\frac{w(K^\circ)}{\sqrt{s}}, \tag{10.7} \]
are true for every convex body containing 0 in the interior, then the conclusions of Proposition 10.6 hold for $\tilde{C}_{n,s}$ instead of $C_{n,s}$.
Proof. It is easy to check that (10.6) holds for some $C_{n,s} < +\infty$ if $n$ and $s$ are fixed (see Exercise 10.4). Moreover, we know from Theorem 6.35(i) that, for every fixed $n$,
$Y_k \le 2 + \|B_k\|$ and from Proposition 6.24 and Proposition 6.33. Part (i) follows now from Proposition 10.4.
(ii) Let $A_k$ and $B_k$ be as before, but now we only assume that $s_k \ge an_k$ for some $a > 0$. We argue by contradiction: suppose that $C_{n_k, s_k}$ tends to infinity. We know from (10.8) that the sequence $(n_k)$ cannot be bounded, so we may assume $\lim_k n_k = +\infty$. Similarly, using part (i), we may assume that $s_k/n_k$ is bounded, and therefore (by passing to a subsequence) that $\lim s_k/n_k = \lambda \in [a, \infty)$. We know from Theorem 6.35(ii) and Theorem 6.23 that $\mu_{sp}(A_k)$ and $\mu_{sp}(B_k)$ converge in probability towards a nontrivial deterministic limit, and therefore satisfy the hypotheses of Proposition 10.5 for some constants $p$, $c_1$, $C_2$.
Exercise 10.4. Let $X$ and $Y$ be two $\mathbb{R}^n$-valued random vectors with the property that, for any $\theta \in S^{n-1}$, we have $0 < \mathbf{E}\,|\langle X, \theta\rangle| < +\infty$ and $0 < \mathbf{E}\,|\langle Y, \theta\rangle| < +\infty$. Show that there exists a constant $C$ (depending on $n$, $X$, $Y$) such that, for any convex body $K$ containing the origin in the interior, we have $\mathbf{E}\|X\|_K \le C\,\mathbf{E}\|Y\|_K$.
Proof of Proposition 10.8. We know that $\rho$ has the same distribution as $AA^\dagger$, where $A$ is an $n \times s$ matrix uniformly distributed on the Hilbert–Schmidt sphere $S_{HS}$. Consider the function $f : S_{HS} \to \mathbb{R}$ defined by
\[ f(A) = \Big\|AA^\dagger - \frac{\mathrm{I}}{n}\Big\|_{K_0}. \tag{10.9} \]
For every $t > 0$, denote by $\Omega_t$ the subset $\Omega_t = \{A \in S_{HS} : \|A\|_\infty \le t\}$. The function $f$ is the composition of several operations:
(a) the map $A \mapsto \|A\|_{K_0}$, which is $1/r$-Lipschitz with respect to the Hilbert–Schmidt norm,
(b) the map $A \mapsto A - \mathrm{I}/n$, which is an isometry for the Hilbert–Schmidt norm,
(c) the map $A \mapsto AA^\dagger$, which is $2t$-Lipschitz on $\Omega_t$ (see Lemma 8.22).
It follows that the Lipschitz constant of the restriction of $f$ to $\Omega_t$ is bounded by $2t/r$. We now apply the local version of Lévy’s lemma (Corollary 5.35) and obtain that, for every $\eta > 0$,
Remark 10.9. Taking $t = 1$ in the argument above, one obtains that the global Lipschitz constant of $f$ is bounded by $2/r$. This implies (see Proposition 5.29) that
sider the case of Cd b Cd where both parties play a symmetric role. Throughout
this section we write Sep for SeppCd b Cd q and consider random induced states on
Since the maximally mixed state lies in the interior of the set of separable states, and
since the measures µd2 ,s converge weakly towards the Dirac mass at the maximally
mixed state (see Section 6.2.3.4), it follows that µd2 ,s pSepq tends to 1 when s tends
to infinity (d being fixed). Conversely, the following result shows that random
induced states are entangled with probability one when s ď pd ´ 1q2 .
Proposition 10.10. Let d, s be integers with s ď pd´1q2 . Then µd2 ,s pSepq “ 0.
Proof. Let S Ă Cd b Cd be the range of ρ. The random subspace S is
Haar-distributed on the Grassmann manifold Grps, Cd b Cd q. We use the following
simple fact which is an immediate consequence of the definition of separability: if
10.2. SEPARABILITY OF RANDOM STATES 269
µd2 ,t pSepq ą 0 for every t ě s. (Cf. Problem 10.14.)
10.2.2. The threshold theorem. From the two extreme cases, s ď pd ´ 1q2
and s “ 8, we may infer that induced states are more likely to be separable when
the environment has larger dimension. As it turns out, a phase transition takes
place (at least when d is sufficiently large): the generic behavior of ρ “flips” to the
opposite one when s changes from being a little smaller than a certain threshold
dimension s0 to being larger than s0 . More precisely, we have the following theorem.
Theorem 10.12. Define a function $s_0(d)$ as $s_0(d) = w(\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d)^\circ)^2$. This function satisfies
(10.10)  $c d^3 \le s_0(d) \le C d^3 \log^2 d$
for some constants $c, C$ and is the threshold between separability and entanglement in the following sense. If $\rho$ is a random state on $\mathbb{C}^d \otimes \mathbb{C}^d$ induced by the environment $\mathbb{C}^s$, then, for any $\varepsilon > 0$,
with the following property. For some integer k ď N {2, decompose H “ pC2 qbN as
A b B b E with A “ B “ pC2 qbk and E “ pC2 qbpN ´2kq , and consider a unit vector
ψ P H chosen uniformly at random. Let ρ “ TrE |ψyxψ| be the induced state on
A b B. Then
(1) for k ă kN , Ppρ is entangledq ě 1 ´ 2 expp´αN q,
(2) for k ą kN , Ppρ is separableq ě 1 ´ 2 expp´αN q,
where α ą 1 is a constant independent of N .
Proof of Theorem 10.12. The inequalities (10.10) are a direct consequence
of Theorem 9.6.
We next present a detailed proof of part (ii). Let $\rho_{d^2,s}$ be a random state with distribution $\mu_{d^2,s}$. Denote $\mathrm{Sep}_0 = \mathrm{Sep} - I/d^2$. Consider also the function $f(\rho) = \bigl\|\rho - \frac{I}{d^2}\bigr\|_{\mathrm{Sep}_0}$ and the quantity $E_{d,s} := \mathbf{E}\, f(\rho_{d^2,s})$.
Fix $\varepsilon > 0$, and let $s, d$ be such that $s \ge (1+\varepsilon)s_0(d)$. Appealing to Proposition 10.6 (in the version given in Remark 10.7), we obtain
(10.13)  $E_{d,s} \le \tilde{C}_{n,s}\, \frac{w(K^\circ)}{\sqrt{s}} \le \frac{\tilde{C}_{n,s}}{\sqrt{1+\varepsilon}},$
where $\tilde{C}_{n,s}$ is the constant appearing in (10.7). The constants $\tilde{C}_{n,s}$ tend to 1 as $d$ and $s$ tend to infinity under the constraint $s \ge (1+\varepsilon)s_0(d)$.
Let $M_{d,s}$ be the median of $f(\rho_{d^2,s})$. We know from Proposition 10.8 (the inradius of Sep being $\Theta(1/d^2)$, see Table 9.1) that
(10.14)  $\mathbf{P}\bigl(f(\rho_{d^2,s}) > M_{d,s} + \eta\bigr) \le \exp(-s) + 2\exp(-cs\eta^2).$
Remark 10.9 implies that $|M_{d,s} - E_{d,s}| \le Cd/\sqrt{s}$. It follows then from (10.13)
that there is an η ą 0 (depending only on ε) with the property that Md,s ` η ď 1
for all d large enough and s ě p1 ` εqs0 pdq. The inequality (10.12) follows now
from (10.14) and from the obvious remark that a state ρ is entangled if and only if
f pρq ą 1. Small values of d can be taken into account by adjusting the constants if
necessary. Note that the argument yields a priori a bound C 1 expp´c1 pεqsq, possibly
with C 1 ą 2, but the bound (10.12) follows then with cpεq “ c1 pεq{ log2 C 1 .
The proof of part (i) goes along similar lines, particularly if we do not care about the exact power of $d$ appearing in the exponent of the probability bound in (10.11); this is because Proposition 10.8 yields an estimate parallel to (10.14) for $\mathbf{P}\bigl(f(\rho_{d^2,s}) < M_{d,s} - \eta\bigr)$. There are some fine points which emerge when $s$ is relatively small, but they can be handled using inequalities from Exercise 10.7; see [ASY14] for details. See also Remark 10.15.
The fine points in the proof of part (i) of Theorem 10.12 would disappear if the
answer to the following natural problem was positive (cf. Exercise 10.6).
rem 10.12 is sketched in Exercise 10.9. That argument also has the advantage that
it produces explicitly an entanglement witness certifying that the induced state
is entangled. However, the argument works only in the range s ď cd3 for some
constant c ą 0; while this does not cover the entire range, it handles the case of
relatively small s that does not readily follow from Proposition 10.8.
Exercise 10.7 (Partial results on monotonicity of entanglement). Set $\pi_{d,s} := \mu_{d^2,s}(\mathrm{Sep}(\mathbb{C}^d \otimes \mathbb{C}^d))$.
(i) Show that the function d ÞÑ πd,s is non-increasing for any integer s ě 1.
(ii) Show the inequality π2d,s ď πd,4s .
Exercise 10.8 (Proof of the N {5 threshold result). Prove Corollary 10.13 by
combining Theorem 10.12 (applied with ε “ 1{2) and Exercise 10.7.
Exercise 10.9 (The induced state is its own witness). Let ρ be a random state
on Cd b Cd with distribution µd2 ,s , and W “ ρ ´ I {d2 .
entanglement. In this section we will work with invariants that are more “native”
to quantum information theory.
For a pure state ψ, the entropy of entanglement Epψq was introduced in (8.1).
A possible way to extend this definition to mixed states is to use a “convex roof”
construction. For a state $\rho$ on $\mathbb{C}^d \otimes \mathbb{C}^d$, define its entanglement of formation $E_F(\rho)$ as
(10.15)  $E_F(\rho) = \inf\Bigl\{ \sum_i p_i E(\psi_i) \;:\; \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i| \Bigr\},$
the infimum being taken over all decompositions of ρ as convex combinations of
pure states. Equivalently, the entanglement of formation is the smallest convex
function which coincides with the entropy of entanglement on pure states.
Entanglement of pure states was studied in Chapter 8. In particular, for a
random pure state $\psi$ (which corresponds to the case $s = 1$), we typically have $E_F(|\psi\rangle\langle\psi|) = E(\psi) = \log d - \frac{1}{2} + o(1)$; see Lemma 8.13. Here is a statement
convex combination
$$\rho = (\rho - a I) + a I = (1 - d^2 a)\,\sigma + d^2 a\,\frac{I}{d^2}$$
for some state $\sigma$. Using the convexity of $E_F$ and the obvious facts that $E_F(\sigma) \le \log d$ and $E_F(I/d^2) = 0$, we obtain $E_F(\rho) \le (1 - d^2 a)\log d$. However, we know from Proposition 6.36 (or Exercise 6.43) that $a \ge \frac{1}{d^2} - \frac{C}{d\sqrt{s}}$ with large probability. It follows that as long as $s \ge C^2\varepsilon^{-2} d^2 \log^2 d$, then
$$E_F(\rho) \le \frac{C d \log(d)}{\sqrt{s}} \le \varepsilon.$$
Exercise 10.10. Check that EF pρq “ 0 if and only if ρ is separable.
NOTES AND REMARKS 273
10.3.2. Threshold for PPT. The machinery developed in this chapter can be applied to any property instead of separability and allows one to reduce the estimation of threshold dimensions to the estimation of a geometric quantity (the mean width of the polar set).
One natural example is the PPT property. Since $\mathrm{PPT} = D \cap \Gamma(D)$, where $\Gamma$ is the partial transpose, it follows easily (arguing as in the first part of the proof of Proposition 9.8) that $w(\mathrm{PPT}_0^\circ) \le 2w(D_0^\circ) \simeq d$. The threshold $s_1$ appearing in this approach satisfies then
$$s_1(d) = w(\mathrm{PPT}_0^\circ)^2 = \Theta(d^2).$$
However, we know that the spectrum of large-dimensional partially transposed random states is described by a non-centered semicircular distribution (see Theorem 6.30). A more precise estimation of the threshold follows (note that the distribution $\mathrm{SC}(\lambda, \lambda)$ appearing in Theorem 6.30 has support $[\lambda - 2\sqrt{\lambda}, \lambda + 2\sqrt{\lambda}]$, which is included in $[0, +\infty)$ if and only if $\lambda \ge 4$).
Theorem 10.17 (Threshold for the PPT property). Define $s_1(d) = 4d^2$. Let $\rho$ be a random state on $\mathbb{C}^d \otimes \mathbb{C}^d$ with distribution $\mu_{d^2,s}$. Then
(i) if $s \le (1-\varepsilon)s_1(d)$, we have
$$\mathbf{P}(\rho \text{ is PPT}) \le 2\exp(-c(\varepsilon)d^2),$$
(ii) if $s \ge (1+\varepsilon)s_1(d)$, we have
$$\mathbf{P}(\rho \text{ is PPT}) \ge 1 - 2\exp(-c(\varepsilon)s).$$
is sufficiently larger than d2 , but sufficiently smaller than d3 , random states are
typically PPT and entangled (in particular they cannot be distilled, see Chapter
12), but have an amount of entanglement extremely small when measured via the
entanglement of formation.
Exercise 10.11. Explain the presence of expressions of the form Ωε pd2 q and
Theorem 10.12, as well as the preliminary results from Section 10.1, are from
[HLW06]).
The answer to Problem 10.11 is known for qubits: we have µ4,2 pSeppC2 bC2 qq “
0 and µ4,3 pSeppC2 b C2 qq ą 0. As explained in section 7.1 of [ASY14], this follows
from results of [RW09] and [SBŻ06], respectively.
The entanglement of formation is only one of the many possible ways to quantify
entanglement of mixed states. However, other measures are harder to manipulate.
For a survey of the subject of entanglement measures see [PV07].
The threshold for the entanglement of formation (Theorem 10.16) is essentially
from [HLW06], and the threshold for the PPT property (Theorem 10.17) is from
[Aub12] (see also [ASY12]).
Other threshold functions have been computed or estimated: for the realignment criterion [AN12], for the k-extendibility property [Lan16], and for still other
properties [CNY12, JLN14, JLN15] (including the absolute PPT property and
the reduction criterion).
CHAPTER 11
Bell Inequalities and the Grothendieck–Tsirelson Inequality
In this chapter we briefly sketch the connection (originally made by Tsirelson)
between the celebrated Bell inequalities from the quantum theory, and the equally
celebrated Grothendieck inequality from functional analysis. The presentation is
anything but comprehensive: it has been unequivocally established in the last dozen
or so years that the proper “mathematical home” of Bell inequalities is in the theories
of operator spaces and operator systems, which are beyond the scope of this book.
An excellent survey that addresses these topics in much greater detail is [PV16].
11.1. Isometrically Euclidean subspaces via Clifford algebras
In Section 7.2.4 we studied in detail the almost Euclidean subspaces of Mn ,
i.e., on which a given Schatten p-norm is p1 ` εq-equivalent to the Hilbert–Schmidt
norm. For the purposes of the present chapter it is useful to focus on the case of
follows that there are subspaces of dimension n in Mn (e.g., the space of all matrices
with zero coefficients outside the first row) in which the ratio }¨}op {}¨}HS is constant
and equal to 1. However, such a subspace is not at the “correct level”: for subspaces produced by Dvoretzky’s theorem – which are also of dimension $\Theta(n)$ – the same ratio is $\Theta(1/\sqrt{n})$ (or, more precisely, $\sim 2/\sqrt{n}$, see Exercise 7.23).
$$U_i = I^{\otimes(i-1)} \otimes \sigma_x \otimes \sigma_y^{\otimes(k-i)},$$
where $\sigma_x, \sigma_y, \sigma_z$ are the Pauli matrices introduced in (2.2). It is easily checked (cf. Exercise 2.4) that the operators $(U_i)_{1 \le i \le 2k}$ are self-adjoint and are anticommuting reflections: $U_i^2 = I$ and $U_iU_j = -U_jU_i$ for $i \ne j$. It follows that for any $\xi \in \mathbb{R}^{2k}$, the matrix $X = \xi_1U_1 + \cdots + \xi_{2k}U_{2k}$ satisfies $XX^\dagger = |\xi|^2 I$ and therefore is a multiple of a unitary matrix.
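The anticommutation relations can be checked numerically for small k. The sketch below is only an illustration: the excerpt shows the formula for one half of the family, so the second half is an assumption here, taken to be $U_{k+i} = I^{\otimes(i-1)} \otimes \sigma_z \otimes \sigma_y^{\otimes(k-i)}$. The helper names (`kron`, `clifford_family`, `combo`) are hypothetical.

```python
from functools import reduce

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def clifford_family(k):
    """2k matrices of size 2^k. The sigma_z half is an ASSUMED completion."""
    mats = []
    for middle in (SX, SZ):
        for i in range(1, k + 1):
            factors = [I2] * (i - 1) + [middle] + [SY] * (k - i)
            mats.append(reduce(kron, factors))
    return mats

def combo(xi, mats):
    """The matrix X = xi_1 U_1 + ... + xi_2k U_2k."""
    n = len(mats[0])
    return [[sum(x * m[i][j] for x, m in zip(xi, mats)) for j in range(n)]
            for i in range(n)]

def is_close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(n) for j in range(n))

# Example with k = 2: four 4x4 matrices; X X^dagger should be |xi|^2 I = 30 I.
family = clifford_family(2)
x_mat = combo([1, 2, 3, 4], family)
x_xdag = matmul(x_mat, dagger(x_mat))
```

With these choices every pair of distinct matrices differs by an anticommuting Pauli factor in exactly one tensor position, which is why the cross terms in $XX^\dagger$ cancel.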
Remark 11.3. The subspaces in Lemmas 11.1 and 11.2 consist of trace zero matrices.
The dimensions appearing in Lemma 11.1 are not optimal. Finding the minimal possible dimension is related to the Radon–Hurwitz problem and involves more advanced analysis of Clifford algebras.
Theorem 11.4 (not proved here). Given an integer $k \ge 1$, consider
(i) $\alpha(k)$, the minimal integer $n$ such that $M_n(\mathbb{R})$ contains a $k$-dimensional subspace in which every matrix is a multiple of an orthogonal matrix.
(ii) $\beta(k)$, the minimal integer $n$ such that $M_n(\mathbb{R})$ contains a $k$-dimensional subspace in which every nonzero matrix is invertible.
Then
$$\alpha(k) = \beta(k) = \begin{cases} 2^{(k-2)/2} & \text{if } k = 0 \bmod 8, \\ 2^{(k-1)/2} & \text{if } k = 1 \text{ or } k = 7 \bmod 8, \\ 2^{k/2} & \text{if } k = 2 \text{ or } k = 4 \text{ or } k = 6 \bmod 8, \\ 2^{(k+1)/2} & \text{if } k = 3 \text{ or } k = 5 \bmod 8. \end{cases}$$
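As a sanity check on the case table, one can compare it with the classical Radon–Hurwitz number $\rho(n)$: writing $n = 2^{4a+b}\cdot(\text{odd})$ with $0 \le b \le 3$, one has $\rho(n) = 8a + 2^b$, and a $k$-dimensional subspace of multiples of orthogonal matrices exists in $M_n(\mathbb{R})$ exactly when $k \le \rho(n)$. That characterization is a standard fact not stated in the text above, and the function names below are hypothetical; this is a small sketch, not part of the theorem.

```python
def alpha_from_table(k):
    """alpha(k) read off the four-case table of Theorem 11.4 (k >= 1)."""
    r = k % 8
    if r == 0:
        twice_exponent = k - 2
    elif r in (1, 7):
        twice_exponent = k - 1
    elif r in (2, 4, 6):
        twice_exponent = k
    else:  # r in (3, 5)
        twice_exponent = k + 1
    return 2 ** (twice_exponent // 2)

def radon_hurwitz(n):
    """Radon-Hurwitz number: n = 2^(4a+b) * odd, 0 <= b <= 3, rho = 8a + 2^b."""
    power_of_two = 0
    while n % 2 == 0:
        n //= 2
        power_of_two += 1
    a, b = divmod(power_of_two, 4)
    return 8 * a + 2 ** b

def alpha_from_radon_hurwitz(k):
    """Smallest n with rho(n) >= k, found by direct search."""
    n = 1
    while radon_hurwitz(n) < k:
        n += 1
    return n

# Tabulate both descriptions for k = 1, ..., 16.
table = [(k, alpha_from_table(k), alpha_from_radon_hurwitz(k)) for k in range(1, 17)]
```

The two columns of `table` agree for every k in this range; for instance both give 8 for k = 5, 6, 7, 8.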
Ever since the seminal 1935 paper [EPR35] by Einstein, Podolsky and Rosen it
has been apparent that quantum theory leads to predictions which are incompatible
with the classical understanding of physical reality. Specifically, the outcomes of
some experiment may be correlated in a way contradicting common sense (“spooky
action at a distance”). In this section we formalize the concept of correlations,
which will lead to the famous Bell inequalities discovered in [Bel64].
ance matrices from statistics. In that context, covariance matrices are square and
positive semi-definite, corresponding to the scenario when pXi q “ pYi q and E Xi “ 0
(see, e.g., Appendix A.2), while the correlation matrix of pXi q is the covariance ma-
trix of the standardized variables pX̃i q “ pXi {}X}2 q. When E Xi “ E Yj “ 0, (11.1)
coincides with the somewhat less frequently used notion of cross-covariance.
The set $\mathrm{LC}_{m,n}$ is a polytope with $2^{n+m-1}$ vertices (see Proposition 11.7) and
appears in the literature under various names such as correlation polytope, Bell
polytope, local hidden variable polytope, local polytope. (The reader should be
forewarned, though, that sometimes the same names are used for sets of the more
general objects, the so-called boxes, defined in Section 11.3.2.) The reasons for the
adjective “local” will become more clear later on. The facial structure of LCm,n is
rather complicated (except in very low dimensions, see Exercises 11.4, 11.12, and
11.15).
Definition 11.6. An $m \times n$ real matrix $(a_{ij})$ is called a quantum correlation
We write QCm,n (or simply QC) for the set of m ˆ n quantum correlation matrices.
It turns out that both sets LC and QC have simple descriptions.
Proposition 11.8. The set $\mathrm{QC}_{m,n}$ is convex and can be alternatively described as
$$\mathrm{QC}_{m,n} = \bigl\{ (\langle x_i, y_j\rangle)_{1\le i\le m,\,1\le j\le n} \;:\; x_i, y_j \in \mathbb{R}^{\min(m,n)},\ |x_i| \le 1,\ |y_j| \le 1 \bigr\}.$$
It is obvious from the Propositions that LC Ă QC. (This can also be established
directly from the definitions, without appealing to the results of Section 11.1.)
The crucial point—which is simple, but not entirely trivial, and will be studied in
detail in the next section—is that this inclusion is strict. This is one mathematical
manifestation of the fact that the quantum description of reality is different from
the classical one. Correlation matrices that do not belong to LC will be called
nonclassical or nonlocal.
Proof of Proposition 11.7. We first prove the inclusion Ą. It is clear that
given ξ P t´1, 1um and η P t´1, 1un , we have pξi ηj q P LCm,n (consider constant
random variables taking values ˘1), so it suffices to show that LCm,n is convex.
278 11. BELL INEQUALITIES AND THE GROTHENDIECK–TSIRELSON INEQUALITY
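$$X = \sum_{\xi \in I_d} \lambda_\xi^d(X)\,\xi$$
with the functions $\lambda_\xi^d : [-1,1]^d \to [0,1]$ being measurable (or even continuous) and adding to 1. If $a \in \mathrm{LC}_{m,n}$ is a classical correlation matrix with $a_{ij} = \mathbf{E}\, X_iY_j$, we may write (denoting $X = (X_1, \dots, X_m)$ and $Y = (Y_1, \dots, Y_n)$)
$$a_{ij} = \mathbf{E}\Bigl[\Bigl(\sum_{\xi \in I_m} \lambda_\xi^m(X)\,\xi_i\Bigr)\Bigl(\sum_{\eta \in I_n} \lambda_\eta^n(Y)\,\eta_j\Bigr)\Bigr] = \sum_{\xi \in I_m,\,\eta \in I_n} \mathbf{E}\bigl[\lambda_\xi^m(X)\lambda_\eta^n(Y)\bigr]\,\xi_i\eta_j$$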
βpS, T q “ Re TrpρST q.
This bilinear form is positive semi-definite (to check symmetry, use the fact that
Re Tr X “ Re Tr X : ) and therefore, after possibly passing to a quotient, it makes
B sa pHq into a real Euclidean space. The conclusion follows since aij “ βpXi b
I, I bYj q while βpXi bI, Xi bIq ď 1 and βpI bYj , I bYj q ď 1. To obtain the dimension
minpm, nq as claimed, note that we may a posteriori project the vectors pxi q1ďiďm
Conversely, let $(x_i)_{1\le i\le m}$ and $(y_j)_{1\le j\le n}$ be vectors of Euclidean norm at most 1 in $\mathbb{R}^{\min(m,n)}$. By Lemma 11.2, there exist $d \times d$ complex Hermitian matrices $A_i, B_j$ (for some $d$), with Hilbert–Schmidt norm at most 1 and such that $\operatorname{Tr} A_iB_j = \langle x_i, y_j\rangle$. Moreover, $A_i, B_j$ are multiples of unitaries. Set $X_i = d^{1/2}A_i$ and $Y_j = d^{1/2}B_j^T$; then $X_i, Y_j$ are unitaries and in particular $\|X_i\|_\infty \le 1$ and $\|Y_j\|_\infty \le 1$. Finally, if $\rho = |\psi\rangle\langle\psi|$, where $\psi \in \mathbb{C}^d \otimes \mathbb{C}^d$ is a maximally entangled vector, then we have
$$\operatorname{Tr} \rho\, X_i \otimes Y_j = \frac{1}{d}\operatorname{Tr} X_iY_j^T = \operatorname{Tr} A_iB_j = \langle x_i, y_j\rangle,$$
where the first equality follows by direct calculation (see Exercise 2.12).
Remark 11.10. Definitions 11.5 and 11.6 can be readily extended to the multipartite setting. One defines $\mathrm{LC}_{n_1,\dots,n_k} \subset \mathbb{R}^{n_1\cdots n_k}$ as the set of arrays $(a_{i_1,\dots,i_k})$ of the form $a_{i_1,\dots,i_k} = \mathbf{E}\bigl[X_{i_1}^{(1)} \cdots X_{i_k}^{(k)}\bigr]$ where all the $X_{i_j}^{(j)}$ are random variables with $|X_{i_j}^{(j)}| \le 1$ a.s., and $\mathrm{QC}_{n_1,\dots,n_k} \subset \mathbb{R}^{n_1\cdots n_k}$ as the set of arrays $(a_{i_1,\dots,i_k})$ of the form $a_{i_1,\dots,i_k} = \operatorname{Tr}\bigl[\rho\,(X_{i_1}^{(1)} \otimes \cdots \otimes X_{i_k}^{(k)})\bigr]$ where all the $X_{i_j}^{(j)} \in B(H_j)$ are self-adjoint operators with $\|X_{i_j}^{(j)}\|_\infty \le 1$, and $\rho \in D(H_1 \otimes \cdots \otimes H_k)$.
Exercise 11.2 (Convexity of the set of quantum correlations). Show (directly
from the definition) that the set QC is convex.
Exercise 11.3 (Unit vectors suffice). Show that
$$\mathrm{QC}_{m,n} = \bigl\{ (\langle x_i, y_j\rangle)_{1\le i\le m,\,1\le j\le n} \;:\; x_i, y_j \in \mathbb{R}^d,\ d \in \mathbb{N},\ |x_i| = 1,\ |y_j| = 1 \bigr\}.$$
$\mathrm{LC}_{2,2}$, considered as a subset of $\mathbb{R}^4$, is congruent to $2B_1^4$ (a ball of radius 2 in the $\ell_1$-norm).
Exercise 11.5 (Local correlation polytope and the cut-norm). The cut-norm of a matrix $B \in M_{m,n}$ is defined as
$$\|B\|_{\mathrm{cut}} = \sup\Bigl\{ \Bigl|\sum_{i\in I}\sum_{j\in J} b_{ij}\Bigr| \;:\; I \subset \{1,\dots,m\},\ J \subset \{1,\dots,n\} \Bigr\}.$$
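For small matrices the cut-norm can be evaluated directly from this definition. The sketch below is a brute-force illustration (the function name `cut_norm` is hypothetical): it enumerates the row subsets $I$ and, for each one, picks the best column subset $J$ by collecting either all positive or all negative restricted column sums. Its cost grows like $2^m$, so it is meant only for experimentation.

```python
from itertools import product

def cut_norm(b):
    """Brute-force cut-norm of a real matrix given as a list of rows."""
    m, n = len(b), len(b[0])
    best = 0.0  # the empty sets I, J give the value 0
    for rows in product((0, 1), repeat=m):
        # Column sums restricted to the selected row set I.
        col = [sum(b[i][j] for i in range(m) if rows[i]) for j in range(n)]
        # For fixed I, the optimal J takes all positive (or all negative) columns.
        positive_part = sum(c for c in col if c > 0)
        negative_part = -sum(c for c in col if c < 0)
        best = max(best, positive_part, negative_part)
    return best

# Example: the CHSH matrix from (11.3) has cut-norm 2 (take I = {1}, J = {1, 2}).
chsh_cut = cut_norm([[1, 1], [1, -1]])
```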
correlation matrix defined by (11.2). Show that $(a_{ij})$ can be realized with a state
that, in addition to }X̃i }8 ď 1 and }Ỹj }8 ď 1, we have also Tr X̃i “ Tr Ỹj “ 0 for
all i, j. Moreover, it can be arranged that all X̃i and Ỹj are multiples of isometries
if Xi , Yj were.
Exercise 11.8 (Local correlation polytope on k qubits is also an $\ell_1$-ball). Show that the set $\mathrm{LC}_{2,2,\dots,2} \subset \mathbb{R}^{2^k}$ (as defined in Remark 11.10) is a convex polytope with $2^{k+1}$ vertices and $2^{2^k}$ facets.
Exercise 11.9. Find the inradius and the outradius of the sets LC and QC.
Exercise 11.10. Show that the sets LC and QC have enough symmetries (in
the sense of Section 4.2.2).
11.2.2. Bell correlation inequalities and the Grothendieck constant.
In the context of correlation matrices, a Bell correlation inequality is a linear func-
tional ϕ : Mm,n Ñ R with the property that ϕpAq ď 1 for any classical correlation
matrix A P LCm,n . (We will discuss a more general setup in Section 11.3.) If we
identify $M_{m,n}$ with its dual space, the set of Bell correlation inequalities becomes the polytope $\mathrm{LC}_{m,n}^\circ$ (the polar of $\mathrm{LC}_{m,n}$) and can be identified with $B_1^m \check{\otimes} B_1^n$. Of
particular interest are the extreme (or optimal) inequalities or, equivalently, the
facets of LCm,n (cf. Section 1.1.5).
A famous example of a Bell correlation inequality in the $2 \times 2$ case is the Clauser–Horne–Shimony–Holt or CHSH inequality $\varphi_{\mathrm{CHSH}}$, which is the linear functional $A \mapsto \frac{1}{2}\operatorname{Tr}(AM_{\mathrm{CHSH}})$, where
(11.3)  $M_{\mathrm{CHSH}} = (m_{ij})_{i,j=1}^2 := \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$
It is easily checked that $\frac{1}{2}M_{\mathrm{CHSH}} \in \mathrm{LC}_{2,2}^\circ$ since for any choice of $\xi, \eta \in \{-1,1\}^2$,
(11.4)  $\xi_1\eta_1 + \xi_1\eta_2 + \xi_2\eta_1 - \xi_2\eta_2 \le 2.$
Moreover, 8 of the 16 possible choices of $(\xi, \eta)$ saturate this bound.
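Both claims, the bound (11.4) and the count of saturating sign patterns, can be confirmed by enumerating all 16 pairs. The following is a minimal sketch with hypothetical variable names; indices are 0-based in the code and 1-based in the text.

```python
from itertools import product

M_CHSH = [[1, 1], [1, -1]]

def chsh_sum(xi, eta):
    """Left-hand side of (11.4) for sign vectors xi, eta in {-1, 1}^2."""
    return sum(M_CHSH[i][j] * xi[i] * eta[j] for i in range(2) for j in range(2))

values = [chsh_sum(xi, eta)
          for xi in product((-1, 1), repeat=2)
          for eta in product((-1, 1), repeat=2)]

max_value = max(values)                    # the classical bound in (11.4)
num_saturating = values.count(max_value)   # how many pairs attain it
```

The expression only takes the values $\pm 2$, and flipping the sign of $\xi$ flips the sign of the sum, which explains why exactly half of the pairs attain the maximum.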
Since, as we mentioned, the inclusion $\mathrm{LC}_{m,n} \subset \mathrm{QC}_{m,n}$ is strict (provided $m, n \ge 2$) it may happen that for a Bell correlation inequality $\varphi$ and a quantum correlation matrix $A \in \mathrm{QC}_{m,n}$, we have $\varphi(A) > 1$. In that case, we say that the Bell correlation inequality $\varphi$ is violated by $A$ and the quantity $\varphi(A)$ is called the violation or, more precisely, the quantum violation. This is, in particular, the case for the CHSH inequality. We have
Proposition 11.11 (CHSH violations, see Exercises 11.11–11.13). The maximal quantum violation of the CHSH inequality is $\sqrt{2}$, and no Bell correlation inequality for $2 \times 2$ correlation matrices yields a larger violation.
absolute constant K ě 1 such that, for any positive integers m, n, the following
three equivalent conditions hold:
1˝ We have the inclusion
˝
` ˘
2 For any m ˆ n real matrix mij and for any ρ, Xi , Yj verifying the conditions
m n
ξPt´1,1u ,ηPt´1,1u
i,j i,j
˝
` ˘
3 For any m ˆ n real matrix mij and for any (real) Hilbert space vectors xi , yj
known; as of this writing, the best estimates are $1.6769 < K_G < \frac{\pi}{2\ln(1+\sqrt{2})} \approx 1.7822$. We also denote by $K_G^{(m,n)}$ the best constant in (11.5)–(11.7) for fixed $m, n$, and $K_G^{(n)} = K_G^{(n,n)}$. This should not be confused with the optimal constant
in (11.7) under the restriction that xi , yj live in an n-dimensional Hilbert space,
which is denoted similarly by some authors. The values of all these and related
“Grothendieck constants” are discussed in Exercises 11.13–11.17 and in Notes and
Remarks.
One sees immediately that the maximum on the right-hand side of (11.7) is the norm of the bilinear form
(11.8)  $M = (m_{ij}) : \ell_\infty^m \times \ell_\infty^n \to \mathbb{R}.$
Thus Proposition 11.7 is really an instance of the duality between the projective and injective tensor products (see Section 4.1.4 and particularly Exercise 4.18). Similarly, the maximum on the left-hand side of (11.7) is the norm of $M$ as a bilinear form on $\ell_\infty^m(H) \times \ell_\infty^n(H)$. In the setting of operator spaces, the latter quantity may be interpreted as the so-called completely bounded norm of the bilinear form (11.8) or, equivalently, the minimal tensor norm of $M$ in that category. In other words, the values of the Grothendieck constants and of the maximal violations of Bell correlation inequalities may be obtained by comparing two norms which naturally appear in the context of operator spaces. We will not go into the details of that theory (or even define precisely the concepts we mentioned above) since to do that at a reasonable level of diligence would require (at least) another chapter. Instead, we refer the interested reader to the excellent survey [PV16].
An important question, which has attracted lots of attention over the last 20 or
so years, is the characterization of states ρ that may lead to nonlocal correlations.
It is easy to see that if a state ρ is separable, then any correlation matrix (11.2)
belongs to the local polytope LC. (A more general fact of this nature is discussed in
with the goal of clarifying these issues, Peres asked in 1998 whether there is a link
between locality and the PPT property. Various variants of the question have been
answered, but the following most basic version is apparently still open (see also
Remark 11.21 and Notes and Remarks on Section 11.3).
Problem 11.13 (Peres conjecture for correlation matrices). Can nonlocal cor-
relations be obtained, in the sense of Definition 11.6, from a PPT state?
As we mentioned earlier, the facial structure of the polytope LCm,n is, for large
m, n, rather complicated. For example, we could not find in the literature an answer
Let us conclude this section with a result giving volume and mean width esti-
mates for the sets of correlation matrices. We state them for classical correlations
only, since similar estimates for quantum correlations follow formally via Theorem
11.12 (see, however, Problem 11.16).
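Proposition 11.15. For $m, n \in \mathbb{N}$ we have
(11.9)  $\Bigl(\frac{1}{\sqrt{2}} - o(1)\Bigr)\max(\sqrt{m}, \sqrt{n}) \le \mathrm{vrad}(\mathrm{LC}_{m,n}) \le w(\mathrm{LC}_{m,n}) \le \sqrt{\frac{2}{\pi}}\,(\sqrt{m} + \sqrt{n})\,\frac{\sqrt{mn}}{\kappa_{mn}},$
where $o(\cdot)$ indicates the behavior as $m, n \to \infty$. (Recall that the ratio $\sqrt{k}/\kappa_k$ decreases from $\sqrt{\pi/2}$ to 1 as $k$ increases from 1 to $\infty$, see Proposition A.1.)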
Proof. The middle inequality is the Urysohn inequality (Proposition 4.15). To get the upper bound on the mean width, we use the Chevet–Gordon inequality (see Section 6.2.4.1) in the form from Exercise 6.49:
$$w_G(\mathrm{LC}_{m,n}) \le \sqrt{n}\, w_G(B_\infty^m) + \sqrt{m}\, w_G(B_\infty^n) = \sqrt{2/\pi}\,(m\sqrt{n} + n\sqrt{m}).$$
For the lower bound on the volume radius, we may assume $m \ge n$. We claim that (with the identification $M_{m,n} \leftrightarrow \mathbb{R}^{mn} \leftrightarrow (\mathbb{R}^n)^m$), we have
(11.10)  $\frac{1}{\sqrt{2}}\,(B_2^n)^m \subset \mathrm{LC}_{m,n}.$
Since the volume radius of $(B_2^n)^m$ is easy to calculate, namely
$$\mathrm{vrad}\bigl((B_2^n)^m\bigr) = \frac{\mathrm{vol}(B_2^n)^{1/n}}{\mathrm{vol}(B_2^{mn})^{1/mn}} = \frac{\Gamma\bigl(\frac{mn}{2}+1\bigr)^{1/mn}}{\Gamma\bigl(\frac{n}{2}+1\bigr)^{1/n}}$$
by (B.3), the lower bound in (11.9) follows then readily from Stirling’s formula (as
ÿ ˇ
(11.11) sup TrpABq “ sup bij ξi ˇ
ˇ ˇ
ˇ
APLCm,n ξPt´1,1um i“1 ˇj“1
ˇ
ˇ ˇ
ÿm ˇÿ n ˇ
Ave m bij ξi ˇ
ˇ ˇ
ě
ˇ
ξPt´1,1u ˇ ˇ
i“1 j“1
˜ ¸1{2
m n
p˚q 1 ÿ ÿ 2
ě ? bij ,
2 i“1 j“1
Even showing that the ratio volpLCq{ volpQCq tends to 0 does not seem straightfor-
ward.
Exercise 11.11 (The CHSH bound). Show that $\sup\{\varphi_{\mathrm{CHSH}}(A) : A \in \mathrm{QC}_{2,2}\} = \sqrt{2}$.
Exercise 11.12 (CHSH is the only 2 × 2 Bell correlation inequality). By Exercise 11.4, the polytope $\mathrm{LC}_{2,2}$ has 16 facets. Show that the unit normals to these facets are (up to the sign) exactly the matrices that can be obtained by permuting the entries of either $\frac{1}{2}M_{\mathrm{CHSH}}$ or of $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$. Conclude that, up to the obvious symmetries, $\varphi_{\mathrm{CHSH}}$ is the only nontrivial 2 × 2 Bell correlation inequality.
Exercise 11.13 (The Grothendieck–Tsirelson bound). Show that the sequence $\bigl(K_G^{(n)}\bigr)_n$ increases to $K_G$ and that $K_G^{(2)} = \sqrt{2}$.
Exercise 11.14 (CHSH is the only 2 × n Bell correlation inequality). Show that $K_G^{(2,n)} = \sqrt{2}$ for any $n \ge 2$.
Exercise 11.15 (CHSH is the only 3 × 3 Bell correlation inequality). Using the Matlab multi-parametric toolbox (or other software, or lots of time), it is routine to establish that $\mathrm{LC}_{3,3}$ has 90 facets. Using this information, show that, up to the obvious symmetries, $\varphi_{\mathrm{CHSH}}$ is the only nontrivial 3 × 3 Bell correlation inequality and deduce that $K_G^{(3)} = \sqrt{2}$.
Exercise 11.17. Show that the complex Grothendieck constant (see (11.37)
Corollary 7.30, show that LCn,n has exppΩpnqq facets. Moreover, for any fixed
λ ą 1, any polytope P such that P Ă LCn,n Ă λP or P Ă QCn,n Ă λP has
exppΩpnqq facets.
Exercise 11.19 (Facial dimension of the local correlation polytope, take #2).
Combine Proposition 6.3, Theorem 4.17, Proposition 11.15 and Exercise 11.9 to
This section outlines more general Bell inequalities described in the language
of boxes and games. It includes an explanation of how the original Grothendieck–
Bell setup fits into the broader framework, the CHSH inequality as a game, and
a presentation of several examples and special features such as no-signaling, PR-
boxes, and bounded or unbounded violations.
11.3.1. Bell inequalities as games. We start by rephrasing the CHSH in-
equality (11.4) as a game. The game involves two cooperating players, Alice and
Bob, and a—fair but tough—referee. The players may use a strategy agreed upon
in advance and may share some resources, but are not allowed to communicate dur-
ing the game. At each round of the game, the referee provides Alice and Bob with
Figure 11.1. Diagrammatic representation of a quantum game. Prior to the game, Alice and Bob can agree on some strategy which, in the quantum variant, may involve sharing a bipartite quantum state (as depicted by the wavy line). Once the game starts, they are no longer allowed to communicate. The referee sends privately input i to Alice and input j to Bob; Alice and Bob answer him privately with their outputs, respectively ξ and η.
inputs (or settings) i and j, which can be 1 or 2, and each of them must respond
with an output (respectively ξ and η) which can be 1 or ´1. Alice and Bob win if
the product ξη equals mij , the pi, jqth entry of the CHSH matrix (11.3), and lose
otherwise. The difficulty is that while Alice knows her setting i P t1, 2u, she doesn’t
know Bob’s setting j, and similarly with the roles reversed.
` ˘2 ` ˘2
in each round is 1, the winnings per round, averaged over all possible inputs $i, j$ (the value of the game), are
(11.12)  $\frac{1}{4}\sum_{i,j} m_{ij}\,\xi_i\eta_j \le \frac{1}{2}$
(this is the same as the bound of 2 from (11.4) after renormalization) and half
random pair of vectors ξpωq, ηpωq according to some distribution ppωq (this re-
quires shared randomness if the choices of Alice and Bob are not to be independent;
a shared quantum state $\rho \in D(H_A \otimes H_B)$. More precisely, for every setting $i$ of Alice (resp., $j$ of Bob) there is a pair of complementary projections $E_i^\xi$, $\xi = \xi_i \in \{-1,1\}$, on $H_A$ (resp., $F_j^\eta$, $\eta = \eta_j \in \{-1,1\}$, on $H_B$). If Alice receives from the referee the input $i$, she performs the projective measurement corresponding to $(E_i^\xi)_{\xi=\pm 1}$ and responds with the value of $\xi$ supplied by the outcome of the measurement, and similarly for Bob. According to the Born rule (3.8), if the referee provides Alice and Bob with inputs $(i, j)$, the probability of a pair of responses $(\xi, \eta)$ will be $\operatorname{Tr} \rho\,(E_i^\xi \otimes F_j^\eta)$. Consequently, for these inputs, the expected value of the CHSH
game will be mij (the corresponding entry of the payoff matrix MCHSH from (11.3))
times
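(11.13)  $\sum_{\xi,\eta=\pm 1} \xi\eta\, \operatorname{Tr} \rho\,(E_i^\xi \otimes F_j^\eta) = \operatorname{Tr} \rho\,(X_i \otimes Y_j),$
where $X_i = \sum_{\xi=\pm 1} \xi E_i^\xi = E_i^{+1} - E_i^{-1}$ and, similarly, $Y_j = F_j^{+1} - F_j^{-1}$. Averaging over all inputs $i$ and $j$, we obtain the value
(11.14)  $\frac{1}{4}\sum_{i,j} m_{ij}\, \operatorname{Tr} \rho\,(X_i \otimes Y_j).$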
if we want to focus on the probability of winning the game, the quantum strategy yields $\frac{2+\sqrt{2}}{4} \approx 0.8536$, which needs to be compared to the upper bound of $\frac{3}{4}$ for classical strategies that was calculated earlier. For a discussion of fine points of the optimality of this strategy see Exercise 11.21.
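The figure 0.8536 can be reproduced with a short computation. The sketch below makes several assumptions that the surviving text does not spell out: the shared state is the maximally entangled vector on $\mathbb{C}^2 \otimes \mathbb{C}^2$, so that $\operatorname{Tr}\rho\,(X_i \otimes Y_j) = \frac{1}{2}\operatorname{Tr} X_iY_j^T$ as in the proof of Proposition 11.8; the observables are the usual textbook choice $X_1 = \sigma_z$, $X_2 = \sigma_x$, $Y_{1,2} = (\sigma_z \pm \sigma_x)/\sqrt{2}$; and the payoff is $+1$ for a win and $-1$ for a loss, so that the winning probability is $(1 + V)/2$ for a game of value $V$.

```python
import math

M_CHSH = [[1, 1], [1, -1]]
SZ = [[1.0, 0.0], [0.0, -1.0]]
SX = [[0.0, 1.0], [1.0, 0.0]]

def lin(a, p, b, q):
    """The 2x2 matrix a*p + b*q."""
    return [[a * p[i][j] + b * q[i][j] for j in range(2)] for i in range(2)]

def trace_prod_transpose(p, q):
    """Tr(P Q^T) for real 2x2 matrices, i.e. the entrywise inner product."""
    return sum(p[i][j] * q[i][j] for i in range(2) for j in range(2))

s = 1 / math.sqrt(2)
alice = [SZ, SX]                                # assumed observables X_1, X_2
bob = [lin(s, SZ, s, SX), lin(s, SZ, -s, SX)]   # assumed observables Y_1, Y_2

# Correlations a_ij = (1/d) Tr(X_i Y_j^T) with d = 2 (maximally entangled state).
corr = [[trace_prod_transpose(alice[i], bob[j]) / 2 for j in range(2)]
        for i in range(2)]

# Value (11.14) of the game and the corresponding winning probability.
value = sum(M_CHSH[i][j] * corr[i][j] for i in range(2) for j in range(2)) / 4
win_probability = (1 + value) / 2
```

Under these assumptions `value` equals $\sqrt{2}/2$ and `win_probability` equals $(2+\sqrt{2})/4$, to be compared with the classical values $\frac{1}{2}$ and $\frac{3}{4}$.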
Exercise 11.20 (Optimality of the classical CHSH game strategies). (a) Show that if, in the CHSH game, the referee uses a non-uniform distribution on the set of inputs, then Alice and Bob have a deterministic strategy which gives a value strictly larger than $\frac{1}{2}$. (b) Describe all classical strategies of Alice and Bob that yield $\frac{1}{2}$ as the value of the CHSH game, irrespective of the probability distribution on the set of inputs used by the referee.
11.3.2. Boxes and the nonsignaling principle. The scheme that we described above via the example of the CHSH game can be conceptualized and generalized using the language of boxes. A box is a family of joint probability distributions
(11.15)  $P = \{p(\cdot,\cdot\,|\,i,j) : 1 \le i \le m,\ 1 \le j \le n\}.$
In the context of the two-player games described earlier, ppξ, η|i, jq is the probability
that Alice and Bob respond with outputs ξ, η when presented with inputs i, j. If
the payoff corresponding to this scenario is vpξ, η, i, jq, the (average) value of the
game is
(11.16)  $V = \frac{1}{mn}\sum_{\xi,\eta,i,j} p(\xi,\eta\,|\,i,j)\,v(\xi,\eta,i,j).$
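Formula (11.16) translates directly into code once a box is represented as a function of $(\xi, \eta, i, j)$. In the sketch below the payoff for the CHSH game is assumed to be $+1$ for a win and $-1$ for a loss; the deterministic box and the Popescu–Rohrlich (PR) box are standard examples, the latter being defined here from general knowledge rather than from this excerpt. All names are hypothetical and indices are 0-based.

```python
from itertools import product

OUTPUTS = (-1, 1)
M_CHSH = [[1, 1], [1, -1]]

def game_value(p, v, m, n, outputs=OUTPUTS):
    """V = (1/mn) * sum over xi, eta, i, j of p(xi,eta|i,j) * v(xi,eta,i,j)."""
    total = 0.0
    for i, j in product(range(m), range(n)):
        for xi, eta in product(outputs, repeat=2):
            total += p(xi, eta, i, j) * v(xi, eta, i, j)
    return total / (m * n)

def chsh_payoff(xi, eta, i, j):
    """Assumed payoff: +1 if xi * eta matches m_ij, otherwise -1."""
    return 1.0 if xi * eta == M_CHSH[i][j] else -1.0

def deterministic_box(f, g):
    """Box in which Alice always answers f[i] and Bob always answers g[j]."""
    return lambda xi, eta, i, j: 1.0 if (xi == f[i] and eta == g[j]) else 0.0

def pr_box(xi, eta, i, j):
    """PR box: uniform over the two output pairs with xi * eta = m_ij."""
    return 0.5 if xi * eta == M_CHSH[i][j] else 0.0

classical_value = game_value(deterministic_box((1, 1), (1, 1)), chsh_payoff, 2, 2)
pr_value = game_value(pr_box, chsh_payoff, 2, 2)
```

The constant strategy wins on three of the four input pairs and so reaches the classical value $\frac{1}{2}$ from (11.12), while the PR box wins on every input pair.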
While the payoff function v can be a priori arbitrary, the probabilities implicit
in the box P reflect the players’ strategy and the resources available to them.
• Deterministic strategies (i.e., $\xi = f(i)$ and $\eta = g(j)$ for some functions $f$ and $g$) result in a deterministic box:
(11.18)  $p(\xi,\eta\,|\,i,j) = \mathbf{1}_{\{\xi = f(i)\}}\,\mathbf{1}_{\{\eta = g(j)\}}.$
• Random strategies result in product boxes:
(11.19)  $p(\xi,\eta\,|\,i,j) = p(\xi\,|\,i)\,p(\eta\,|\,j),$
where $p(\cdot\,|\,i) = p_A(\cdot\,|\,i)$ and $p(\cdot\,|\,j) = p_B(\cdot\,|\,j)$ are the (independent) marginals of the distribution $p(\cdot,\cdot\,|\,i,j)$.
• Random strategies with shared randomness result in local (or classical) boxes:
distribution on Λ.
‚ Quantum strategies result in quantum boxes:
where $\rho$ is a quantum state shared by Alice and Bob and, for each $i$ (resp., $j$), $(E_i^\xi)_\xi$ (resp., $(F_j^\eta)_\eta$) is a POVM on Alice’s space $H_A$ (resp., Bob’s space $H_B$).
Let us denote the corresponding sets of boxes by DB, RB, LB and QB. If there
Since the number of values taken by $\xi$ and $\eta$ is, respectively, $k$ and $l$, every box can be thought of as an element of $\mathbb{R}_+^{klmn}$ and we have
(11.22)  $\mathrm{DB} \subset \mathrm{RB} \subset \mathrm{LB} \subset \mathrm{QB}.$
The first inclusion is trivial and it is clear from the definition that LB “ conv RB;
also that every product box is a mixture of deterministic boxes and so in fact
$\mathrm{LB} = \operatorname{conv}\mathrm{DB}$.) The convexity of QB and the last inclusion in (11.22), which
follows from it, are slightly less obvious (see Exercises 11.23, 11.25, 11.26 and
Notes and Remarks for a discussion of these points and related issues). Except in
trivial cases, the inclusion $\mathrm{LB} \subset \mathrm{QB}$ is strict; this follows, for example, from the fact
that correlations can be retrieved from boxes (as in (11.16)–(11.17)) and from the
inclusion $\mathrm{LC}_{m,n} \subset \mathrm{QC}_{m,n}$ being strict. Boxes that do not belong to LB are called
nonclassical or nonlocal.
We next present a description of LB in the language of projective tensor products.
First, consider the set of conditional marginal probability distributions
\[ K_{k,m} := \{ p(\xi|i) : 1 \le \xi \le k,\ 1 \le i \le m \}, \tag{11.23} \]
11.3. BOXES AND GAMES 287
\[ \dim \mathrm{LB}_{k,l|m,n} = (\dim K_{k,m} + 1)(\dim K_{l,n} + 1) - 1 = mn(k-1)(l-1) + m(k-1) + n(l-1) \]
(see Exercise 11.27). The geometry of QB is not as transparent as that of LB.
To shed some light on it, let us consider a quantum box $P = \{p(\xi,\eta|i,j)\} =
\{\operatorname{Tr}\rho\,(E_i^\xi \otimes F_j^\eta)\} \in \mathrm{QB}$ and, for given $i,j$, let us calculate the marginal density
$p(\xi|i,j)$ of $p(\xi,\eta|i,j)$. We then obtain
\[ p(\xi|i,j) = \sum_{\eta} p(\xi,\eta|i,j) = \sum_{\eta} \operatorname{Tr}\rho\,\bigl(E_i^\xi \otimes F_j^\eta\bigr) = \operatorname{Tr}\rho\,\bigl(E_i^\xi \otimes I_{\mathcal{H}_B}\bigr) = \operatorname{Tr}\bigl(\rho_A E_i^\xi\bigr), \]
which doesn't depend on $j$ (here $\rho_A = \operatorname{Tr}_{\mathcal{H}_B}\rho$ is the partial trace, cf. (3.10)).
Similarly, the marginal densities $p(\eta|i,j)$ do not depend on $i$. In other words, there
exist distributions $p(\cdot|i) = p_A(\cdot|i)$, $i = 1,\ldots,m$, and $p(\cdot|j) = p_B(\cdot|j)$, $j = 1,\ldots,n$,
such that, for every $i,j$, $p_A(\xi|i)$ and $p_B(\eta|j)$ are the marginals of $p(\xi,\eta|i,j)$, i.e.,
\[ \sum_{\eta} p(\xi,\eta|i,j) = p_A(\xi|i) \quad\text{and}\quad \sum_{\xi} p(\xi,\eta|i,j) = p_B(\eta|j). \tag{11.25} \]
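The conditions (11.25) are easy to test numerically. The following minimal sketch assumes that a box is stored as a Python dictionary keyed by tuples (xi, eta, i, j); the function name `is_nonsignaling` and this data layout are illustrative choices, not notation from the text.

```python
from itertools import product

def is_nonsignaling(p, outs_a, outs_b, ins_a, ins_b, tol=1e-12):
    """Check (11.25): Alice's marginal must not depend on j, Bob's not on i."""
    for i, xi in product(ins_a, outs_a):
        marg = [sum(p[xi, eta, i, j] for eta in outs_b) for j in ins_b]
        if max(marg) - min(marg) > tol:
            return False
    for j, eta in product(ins_b, outs_b):
        marg = [sum(p[xi, eta, i, j] for xi in outs_a) for i in ins_a]
        if max(marg) - min(marg) > tol:
            return False
    return True

outs, ins = (-1, 1), (0, 1)

def g(j):
    """Hypothetical deterministic response function for Bob."""
    return 1 if j == 0 else -1

# Deterministic box as in (11.18), with f(i) = 1 for all i.
det_box = {(xi, eta, i, j): float(xi == 1 and eta == g(j))
           for xi, eta, i, j in product(outs, outs, ins, ins)}

# A signaling box: Alice's output reveals Bob's input j.
sig_box = {(xi, eta, i, j): float(xi == g(j) and eta == g(j))
           for xi, eta, i, j in product(outs, outs, ins, ins)}

print(is_nonsignaling(det_box, outs, outs, ins, ins))  # True
print(is_nonsignaling(sig_box, outs, outs, ins, ins))  # False
```

In the second box Alice's marginal changes with $j$, which is exactly the kind of dependence that (11.25) rules out.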
Let us reflect now on the operational significance of (11.25). If, for some i, the
information about the input $j$ sent by the referee to Bob (complete information if
the distributions $p(\cdot|i,j)$ were disjointly supported for distinct $j$, and some
information if they were just different). This hypothetical event is usually interpreted
as instant (or at least faster than light) signaling or communication and, conse-
tion as nothing seems to forbid Alice from determining her response before (in the
sense of being inside the past light cone) Bob determines his or, indeed, before Bob
or even the referee knows the value of $j$. Note that while, in that case, Alice could
in principle communicate her response to Bob, this has no effect on the statistics
of her outputs.)
The set of boxes verifying (11.25) is called the nonsignaling polytope and we
will denote it by NSB. (It is indeed a polytope, being the intersection of an affine
subspace of $\mathbb{R}^{klmn}$ with the cube $[0,1]^{klmn}$.) An analysis of the constraints shows
that LB and NSB (and hence the intermediate set QB) have the same dimension;
see Exercises 11.28 and 11.29. For 2-output nonsignaling boxes (i.e., if $k = l = 2$,
in which case one may assume that $\xi,\eta$ take values $\pm 1$) one can still define
the corresponding correlation matrices by the formula that is (modulo different
normalization) implicit in (11.16)–(11.17), namely
\[ a_{ij} = \sum_{\xi,\eta=\pm 1} \xi\,\eta\,p(\xi,\eta|i,j). \tag{11.26} \]
In other words, the joint distributions $p(\cdot,\cdot|i,j)$ are, respectively, either
$\begin{bmatrix} \frac12 & 0 \\ 0 & \frac12 \end{bmatrix}$ or $\begin{bmatrix} 0 & \frac12 \\ \frac12 & 0 \end{bmatrix}$.
Since all marginals $p_A(\cdot|i)$ and $p_B(\cdot|j)$ are identical, with probabilities of
both outputs equal to $\frac12$, it is immediately clear that $P$ is nonsignaling. It is also
apparent that for each combination $(i,j)$ of inputs we have $\sum_{\xi,\eta=\pm 1} \xi\eta\,p(\xi,\eta|i,j) = m_{ij}$,
where $(m_{ij})$ is given by (11.3). Accordingly, the value of the CHSH game (as
given by (11.16)–(11.17)) is 1, as is the probability of winning. Since the analysis
from Section 11.3.1 (based on Proposition 11.11) shows that the best value that
can be achieved by a quantum strategy is $\frac{\sqrt{2}}{2}$, it follows that the PR-box cannot be
realized as a quantum box via (11.21). This implies that the inclusion $\mathrm{QB} \subset \mathrm{NSB}$
is always proper.
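The computation just described can be replayed in a few lines. The sketch below is only illustrative: it assumes that the CHSH sign matrix of (11.3) is $\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ and that the payoff (11.17) has the form $v(\xi,\eta,i,j) = \xi\eta\,m_{ij}$ (neither formula is reproduced in this part of the text), and the dictionary layout and function names are hypothetical.

```python
from itertools import product
from math import sqrt

# Assumed CHSH sign matrix (m_ij); an assumption, see the lead-in above.
M = [[1, 1], [1, -1]]
outs, ins = (-1, 1), (0, 1)

# PR-box: outputs are uniformly random subject to xi * eta = m_ij.
pr_box = {(xi, eta, i, j): 0.5 if xi * eta == M[i][j] else 0.0
          for xi, eta, i, j in product(outs, outs, ins, ins)}

def correlation(p, i, j):
    """Entry a_ij of the correlation matrix, as in (11.26)."""
    return sum(xi * eta * p[xi, eta, i, j] for xi, eta in product(outs, outs))

def marginal_alice(p, xi, i, j):
    """Alice's marginal probability of output xi for the input pair (i, j)."""
    return sum(p[xi, eta, i, j] for eta in outs)

def game_value(p, payoff):
    """Average value (11.16) with uniformly distributed inputs."""
    total = sum(prob * payoff(*key) for key, prob in p.items())
    return total / (len(ins) * len(ins))

def chsh_payoff(xi, eta, i, j):
    """Assumed form of the payoff (11.17)."""
    return xi * eta * M[i][j]

value = game_value(pr_box, chsh_payoff)
print(value)                # 1.0
print(value > sqrt(2) / 2)  # True: above the quantum bound quoted in the text
```

Each correlation `correlation(pr_box, i, j)` equals the assumed `M[i][j]`, and every marginal equals 1/2, in agreement with the discussion above.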
We will conclude this section by giving volume estimates for the sets of nonsignaling
boxes NSB and sets of nonsignaling correlation matrices $\mathrm{NSC}_{m,n}$.
Proposition 11.17. For $k,l,m,n \in \mathbb{N}$ we have
\[ \operatorname{vrad}(\mathrm{NSB}_{k,l|m,n}) = \Theta(\sqrt{mn}) \quad\text{and}\quad \operatorname{vrad}(\mathrm{NSC}_{m,n}) = \Omega(\sqrt{mn}). \tag{11.28} \]
the first relation follows almost immediately from Proposition 4.27. The only two
additional points that need to be made are as follows. First, while $H$ doesn't
contain the center of the cube $[0,1]^{klmn}$, it does contain the point all whose
coordinates are $\frac14$, the center of the cube $[0,\frac12]^{klmn}$. Accordingly, $\operatorname{vol}_N(\mathrm{NSB}) \ge \bigl(\frac12\bigr)^N$,
sections, the upper bound from Proposition 4.27 works without change and yields
$\operatorname{vol}_N(\mathrm{NSB}) \le 2^{(klmn-N)/2}$. The second point is that the dimension and the
codimension of $H$ are of the same order, and so $\bigl(2^{(klmn-N)/2}\bigr)^{1/N} = \Theta(1)$. It
remains to combine the above estimates with the well-known asymptotic expression
$\operatorname{vol}(B_2^N)^{1/N} \sim \sqrt{2\pi e/N}$ (as $N \to \infty$, see Appendix B.1 and particularly Exercise
B.1).
The second relation can be analyzed in a similar way. By definition, $\mathrm{NSC}_{m,n}$
is a linear image of NSB, essentially a projection of a section of $[0,1]^{klmn}$. Since a
projection of a section is larger than a section of a section, we get a lower bound.
(The reason for “essentially” is that the vector $\xi\eta = (1,-1) \otimes (1,-1) \in \mathbb{R}^2 \otimes \mathbb{R}^2 \leftrightarrow \mathbb{R}^4$
is of norm 2 rather than 1.)
Problem 11.18 (Volume radius and mean width of sets of boxes). In the
assertion of Proposition 11.17, can $\Omega$ be replaced by $\Theta$? The argument given above
(combined with, say, Proposition 4.28) runs into complications if $m$ and $n$ are of
very different orders. More generally, what are the asymptotic orders of the volume
radii and mean widths of the sets of boxes LB, QB, NSB for arbitrary values
of $k, l$? Some of the cases (e.g., LB, because of (11.24)) appear to be fairly straightforward
consequences of the methods presented in this book, but some of the others seem to
require further analysis.
Exercise 11.22. Show that every product box is a convex combination of
deterministic boxes.
Exercise 11.23 (Convexity of the set of quantum boxes). Show that the set
QB of quantum boxes is a convex subset of $\mathbb{R}^{klmn}_+$.
Exercise 11.24 (Pure states suffice). Show that in the definition of quantum
boxes (11.21) we can require the state ρ to be pure.
Exercise 11.25. Show that (i) $\mathrm{LB} \subset \mathrm{QB}$ and (ii) moreover, that every $P \in \mathrm{LB}$
can be realized as a quantum box (11.21) with $\rho$ separable.
Exercise 11.26. Show that if a quantum box $P$ can be written as $p(\xi,\eta|i,j) =
\operatorname{Tr}(\rho\,(E_i^\xi \otimes F_j^\eta))$ with $\rho \in \mathrm{Sep}$, then $P \in \mathrm{LB}$.
Exercise 11.27 (The dimension of the set of local boxes). Show that $\dim \mathrm{LB} =
mn(k-1)(l-1) + m(k-1) + n(l-1)$.
Exercise 11.28 (All sets of boxes have the same dimension). Show that
$\dim \mathrm{QB} = \dim \mathrm{NSB} = \dim \mathrm{LB}$.
Exercise 11.29. Deduce the equality $\dim \mathrm{QB} = \dim \mathrm{LB}$ from the fact that
$\dim \mathrm{D} = \dim \mathrm{Sep}$ (shown in Section 2.2.3).
ξ,η,i,j
Except for the normalizing factor, which was removed to reduce the clutter, this is
the same as the average value of a game defined in (11.16). The local (or classical)
optimal value of P is defined as
inequality is violated and the ratio $|V(P)|/\omega_L(V)$ is called the violation. Similarly,
the quantum and nonsignaling optimal values of $V$ are defined as
\[ \omega_Q(V) = \sup\{|V(P)| : P \in \mathrm{QB}\}, \qquad \omega_{NS}(V) = \max\{|V(P)| : P \in \mathrm{NSB}\}. \tag{11.31} \]
Finally, $\max_V \omega_Q(V)/\omega_L(V)$ is called the maximal quantum violation (for the
particular values of $m, n, k, l$; more precisely, quantum-to-classical or quantum-to-local
violation), and similarly for violations involving nonsignaling boxes. For example,
the discussion following the definition (11.27) of PR-boxes shows that, for the CHSH
game, nonsignaling-to-classical violations can be as large as 2 (see Exercise 11.33 and
cf. Proposition 11.24). All these parameters have nice functional-analytic
interpretations, see Exercise 11.31.
As in the case of the CHSH game, the reader may wonder whether the uniform
distribution on the set of inputs implicit in the definition of $V(P)$, and hence
indirectly in (11.30)–(11.31), is justified. While for some “balanced” Bell functionals
it will be true that (as for the CHSH game, see Exercise 11.20) the von Neumann–
Nash-type equilibrium indeed involves the uniform distribution, this will not be
universally the case. However, there is a simple trick that allows one to sidestep this
issue: a game with the distribution $\pi(i,j)$ on input settings and the payoff function
$v(\xi,\eta,i,j)$ is equivalent to the game with the payoff function $mn\,\pi(i,j)\,v(\xi,\eta,i,j)$
and the uniform distribution. In other words, considering uniform distributions
on sets of inputs covers all possible scenarios: it is just one of many essentially
equivalent ways of parameterizing the set of all possible Bell functionals. However,
in some situations a moment of reflection will be needed; since, for example, the
optimal $\pi(i,j)$'s for the local, quantum and nonsignaling strategies may be different,
one has to be sure that one does not compare “apples to oranges.”
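The reweighting trick can be checked on a small random instance. In the sketch below, the box, the input distribution `pi` and the payoff `v` are arbitrary illustrative choices, and the helper name `expected_payoff` is hypothetical.

```python
import random
from itertools import product
from math import isclose

def expected_payoff(p, v, pi, outs_a, outs_b, m, n):
    """Expected payoff when the inputs (i, j) are drawn from the distribution pi."""
    return sum(pi[i][j] * p[xi, eta, i, j] * v(xi, eta, i, j)
               for xi, eta, i, j in product(outs_a, outs_b, range(m), range(n)))

random.seed(0)
outs, m, n = (-1, 1), 2, 3

# A random box: for each input pair (i, j), a probability vector over (xi, eta).
p = {}
for i, j in product(range(m), range(n)):
    w = [random.random() for _ in range(4)]
    for (xi, eta), wk in zip(product(outs, outs), w):
        p[xi, eta, i, j] = wk / sum(w)

# A random input distribution pi and an arbitrary payoff function v.
raw = [[random.random() for _ in range(n)] for _ in range(m)]
total = sum(map(sum, raw))
pi = [[x / total for x in row] for row in raw]

def v(xi, eta, i, j):
    return xi * eta + 0.5 * i - j

def v_rescaled(xi, eta, i, j):
    """Payoff m*n*pi(i,j)*v, to be played with uniformly distributed inputs."""
    return m * n * pi[i][j] * v(xi, eta, i, j)

uniform = [[1 / (m * n)] * n for _ in range(m)]

lhs = expected_payoff(p, v, pi, outs, outs, m, n)
rhs = expected_payoff(p, v_rescaled, uniform, outs, outs, m, n)
print(isclose(lhs, rhs, abs_tol=1e-12))  # True
```

The two values agree for every box, which is why the uniform distribution on inputs entails no loss of generality at the level of Bell functionals.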
As we will see later, measurement schemes involving boxes may lead to arbitrarily
large violations. However, this is not the case for boxes with 2 outcomes
(i.e., when $k = l = 2$). The reason is that sets of 2-outcome boxes are closely
related to sets of correlations introduced in Section 11.2. This is particularly
clear when one compares the set $\mathrm{LC}_{m,n}$ of classical/local correlations, which, by
Proposition 11.7, identifies canonically with $B_\infty^m \mathbin{\hat\otimes} B_\infty^n \subset \mathbb{R}^m \otimes \mathbb{R}^n \leftrightarrow \mathbb{R}^{mn}$, and
the corresponding set $\mathrm{LB}_{2,2|m,n}$ of local boxes, which, by (11.24), identifies with
$K_{2,m} \mathbin{\hat\otimes} K_{2,n} = (\Delta_1)^m \mathbin{\hat\otimes} (\Delta_1)^n \subset \mathbb{R}^{2m} \otimes \mathbb{R}^{2n} \leftrightarrow \mathbb{R}^{4mn}$. In other words, $\mathrm{LC}_{m,n}$
is the projective tensor product of two 0-symmetric cubes, while $\mathrm{LB}_{2,2|m,n}$ is the
projective tensor product of two similar cubes, but contained in spaces twice their
dimension and centered at the point all whose coordinates are $\frac12$.
Proof. Assume that the labels $\xi$ and $\eta$ belong to $\{-1,1\}$ rather than $\{1,2\}$.
The maximum in $\omega_L(V)$ is achieved on an extreme point of LB, i.e., on a determin-
\[ V(P) = \sum_{i=1}^{m}\sum_{j=1}^{n} \alpha_{i,j}\,x_i y_j + \sum_{i=1}^{m} \beta_i x_i + \sum_{j=1}^{n} \gamma_j y_j + \delta \]
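Consider now a quantum box $P' \in \mathrm{QB}_{2,2|m,n}$, of the form $p'(\xi,\eta|i,j) = \operatorname{Tr}\rho\,(E_i^\xi \otimes
F_j^\eta)$. Using the same notation as before and setting $X_i = E_i^{1} - E_i^{-1}$ and $Y_j =
F_j^{1} - F_j^{-1}$ as in (11.13)–(11.14), we can write
\[ V(P') = \sum_{i=1}^{m}\sum_{j=1}^{n} \alpha_{i,j} \operatorname{Tr}\rho\,(X_i \otimes Y_j) + \sum_{i=1}^{m} \beta_i \operatorname{Tr}\rho\,(X_i \otimes I) + \sum_{j=1}^{n} \gamma_j \operatorname{Tr}\rho\,(I \otimes Y_j) + \delta
= \sum_{i=1}^{m+1}\sum_{j=1}^{n+1} \alpha_{i,j} \operatorname{Tr}\rho\,(X_i \otimes Y_j), \]
where in the last sum we defined $X_{m+1} = I$ and $Y_{n+1} = I$. It now follows that
\[ |V(P')| \le \max\Bigl\{ \Bigl| \sum_{i=1}^{m+1}\sum_{j=1}^{n+1} \alpha_{ij}\,a_{ij} \Bigr| : (a_{ij}) \in \mathrm{QC}_{m+1,n+1} \Bigr\}. \tag{11.33} \]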
Since $P' \in \mathrm{QB}$ was arbitrary, the first statement of the Proposition follows by
comparing (11.33) with (11.32) and appealing to Theorem 11.12. For the second
statement, we note that if $m = n = 2$, then Theorem 11.12 will be used for $3 \times 3$
matrices and so $K_G$ may be replaced by $K_G^{(3)} = \sqrt{2}$, see Exercises 11.13–11.15.
Remark 11.20. The argument shows that the violations of bipartite $n$-input,
2-output boxes do not exceed $K_G^{(n+1)}$ (and similarly for “rectangular” boxes, i.e.,
$m \ne n$). Still, the matrices $(a_{ij})$ that appear in (11.33) have a special structure
and so it is conceivable that the bound $K_G^{(n)}$ works, too. However, this is unlikely
Remark 11.21. Since the proof of Proposition 11.19 translates violations for
However, if we allow three outputs in one of the boxes, the answer is known: there is
may lead to arbitrarily large violations. This may happen for two reasons: either
the system is not bipartite (i.e., it involves three or more parties) or the outputs are
not binary. These two situations are exemplified by the following pair of results.
Recall that $\mathrm{LC}_{n_1,\ldots,n_k}$ and $\mathrm{QC}_{n_1,\ldots,n_k}$ are the $k$-partite generalizations of the sets of
classical and quantum correlation matrices, see Remark 11.10 for details.
Proposition 11.22 (not proved here). Denote by $K_G^{(n_1,\ldots,n_k)}$ the best constant
$K$ such that the inclusion $\mathrm{QC}_{n_1,\ldots,n_k} \subset K\,\mathrm{LC}_{n_1,\ldots,n_k}$ holds. Then
$K_G^{(n,n,n)} = \Omega\bigl(n^{1/4} (\log n)^{-3/2}\bigr)$ and $K_G^{(n_1,n_2,n_3)} \le K_G^{\mathbb{C}} \min\{n_1,n_2,n_3\}^{1/2}$.
Above $K_G^{\mathbb{C}}$ stands for the complex Grothendieck constant; see Notes and Remarks
for a precise definition and for estimates. Both propositions can be understood as
V
Moreover, the same bound holds for violations involving correlation matrices.
Proof. Combine Propositions 11.15 and 11.17. One way to take care of fine
points is to use Urysohn's inequality to deduce that $w(\mathrm{NSC}_{m,n}) = \Omega(\sqrt{mn})$ and
then compare it to the upper bound $w(\mathrm{LC}_{m,n}) = O(\sqrt{m} + \sqrt{n})$ from (11.9). This
leads to a nonsignaling-to-classical violation (of correct order) of some Bell correlation
inequality and shows the second (and hence the first) statement.
We conclude the section by introducing another concept which quantifies nonlocality
and which is, in a sense, a generalization of the geometric distance between
sets (of boxes). Given $P \in \mathrm{NSB}$ we define the local fraction (or classical fraction)
of $P$ as
\[ p_L = p_L(P) := \max\{ t \in [0,1] : P \in t\,\mathrm{LB} + (1-t)\,\mathrm{NSB} \}. \tag{11.34} \]
The quantity $p_{NL} := 1 - p_L$ is the nonlocal fraction. Similar parameters can be
defined for other pairs in place of LB, NSB. For example, replacing in (11.34) LB
by DB, the set of deterministic boxes (defined by (11.18)), leads to the notion of
fraction of determinism.
Clearly P P LB iff pL “ 1. Therefore, by the Hahn-Banach separation theorem,
\[ \frac{V(P)}{\omega_L(V)} - 1 \le p_{NL} \Bigl( \frac{\omega_{NS}(V)}{\omega_L(V)} - 1 \Bigr). \tag{11.35} \]
Theorem 11.26 (not proved here). Consider a two-player game setup with
$n = 2$ (i.e., two input settings at Bob's site) and arbitrary (but fixed) $m, k, l$. Then
\[ \inf\{ p_L(P) : P \in \mathrm{QB}_{k,l|m,2} \} \ge c, \tag{11.36} \]
where $c > 0$ is a constant that depends only on $k$ and $l$ (but not on $m$ nor on the
dimensions of the underlying Hilbert spaces). The same is true about the fraction
of determinism.
Theorem 11.26, in combination with Exercise 11.33, provides an alternative
argument that the PR-box cannot be realized as a quantum box. The same reasoning
works for any bipartite setup with $\min\{m,n\} = 2$ and any box which yields the
optimal nonsignaling value for any Bell functional $V$ such that $\omega_{NS}(V) > \omega_L(V)$
(so, while more involved and less sharp, the present argument is very general).
The assertion of Theorem 11.26 does not hold when both players have 3 or
more settings. This is because in that case there exist the so-called pseudotelepathy
quantum games, i.e., the games that can be won with probability 1 using quantum
strategies, while no foolproof classical strategy is possible. Consequently, if $P$ is
the corresponding quantum box and $V$ is the probability of winning, then $V(P) =
1 = \omega_{NS}(V)$, while $\omega_L(V) < 1$, and so it follows from (11.35) that $p_L = 1 - p_{NL} = 0$.
An outline of one such game, the Mermin–Peres magic square game, is given in
Exercise 11.35.
Exercise 11.30 (Linear vs. affine Bell inequalities). Show that definitions
(11.30) and (11.31) yield the same value if we allow V to vary over all affine func-
wpLB,uq´wpLB,´uq
be the maximal ratio of widths of QB and LB (see Section 4.3.3). Show that the
maximal quantum violation is contained between δ and 2δ ´ 1. (ii) State and prove
an analogous statement for NSB.
Exercise 11.33 (Nonsignaling value of the CHSH game). Show that the non-
signaling value of the CHSH game is 2 and deduce that the maximal nonsignaling
violation for m “ n “ k “ l “ 2 is 2.
Exercise 11.34 (Quantum box for the CHSH game). Give an explicit example
of an ensemble $\bigl(\rho, (E_i^\xi), (F_j^\eta)\bigr)$ which induces, via (11.21), a quantum box giving
(b) the composition of the entries in each row is I, while the composition of the
entries in each column is ´ I.
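Table 11.1. The magic square game.
\[ \begin{array}{ccc}
\sigma_x \otimes I & I \otimes \sigma_x & \sigma_x \otimes \sigma_x \\
-\sigma_x \otimes \sigma_z & -\sigma_z \otimes \sigma_x & \sigma_y \otimes \sigma_y \\
I \otimes \sigma_z & \sigma_z \otimes I & \sigma_z \otimes \sigma_z
\end{array} \]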
(ii) Show that there is no $3 \times 3$ table consisting of numbers such that the product
of the entries in each row is 1, while the product of the entries in each column is
$-1$.
(iii) The Mermin–Peres magic square game is played as follows. The number of
input settings is $m = n = 3$ and the outputs are strings of $\pm 1$ of length 3. An
additional restriction is that the product of elements of Alice's string must be 1,
while the product of elements of Bob's string must be $-1$ (so, in effect, $k = l = 4$).
If the input settings communicated to Alice and Bob were $(i,j)$, Alice and Bob win
if their output strings, placed respectively in the $i$th row and the $j$th column, coincide on
the common $ij$-th entry, and lose otherwise. Show that
(a) there is no deterministic (and hence classical) winning strategy,
(b) the following is a winning quantum strategy. Alice and Bob share a 4-qubit
quantum state $\varphi^+ \otimes \varphi^+$, where $\varphi^+ = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ is a Bell state with the first
qubit of each copy of $\varphi^+$ going to Alice and the second to Bob. Given input $i$, Alice
measures her part of the state in a basis in which the (commuting) operators from
the $i$th row are simultaneously diagonal, and answers the corresponding triple of
eigenvalues. Given input $j$, Bob does the same thing using the $j$th column.
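The algebraic facts behind Table 11.1 and part (ii) can be confirmed by a direct computation. The sketch below is a numerical check rather than a proof; it uses plain nested lists for the $4 \times 4$ matrices, and all helper names (`kron`, `matmul`, `triple_product`, and so on) are ad hoc.

```python
from itertools import product

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def equal(a, b):
    return all(abs(x - y) < 1e-12 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

TABLE = [
    [kron(SX, I2), kron(I2, SX), kron(SX, SX)],
    [scale(-1, kron(SX, SZ)), scale(-1, kron(SZ, SX)), kron(SY, SY)],
    [kron(I2, SZ), kron(SZ, I2), kron(SZ, SZ)],
]
I4 = kron(I2, I2)
COLUMNS = [[TABLE[r][c] for r in range(3)] for c in range(3)]

def triple_product(ops):
    return matmul(matmul(ops[0], ops[1]), ops[2])

def pairwise_commute(ops):
    return all(equal(matmul(x, y), matmul(y, x)) for x in ops for y in ops)

rows_ok = all(equal(triple_product(row), I4) for row in TABLE)
cols_ok = all(equal(triple_product(col), scale(-1, I4)) for col in COLUMNS)
commute_ok = all(pairwise_commute(line) for line in TABLE + COLUMNS)

def classical_table_exists():
    """Part (ii), by brute force over all 3x3 tables with entries +1 or -1."""
    for e in product((-1, 1), repeat=9):
        t = [e[0:3], e[3:6], e[6:9]]
        if (all(t[r][0] * t[r][1] * t[r][2] == 1 for r in range(3))
                and all(t[0][c] * t[1][c] * t[2][c] == -1 for c in range(3))):
            return True
    return False

print(rows_ok, cols_ok, commute_ok)  # True True True
print(classical_table_exists())      # False
```

The brute-force search only covers tables with entries $\pm 1$, which is the case relevant to part (iii); the parity argument asked for in (ii) handles arbitrary numbers.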
equalities belongs to the operator space theory was most explicitly put forward in
[JPPG+10].
For a proof of Theorem 11.4, we refer the reader to [Por81] (Theorem 13.68) or
[Kir76]. There is a huge gap between that Theorem and Lemma 11.1, which both
yield subspaces of dimension $\Theta(\log n)$ and the optimal ratio $\|\cdot\|_{op}/\|\cdot\|_{HS} \equiv 1/\sqrt{n}$, and
[LP68]. In particular, the elementary formulation (11.7) comes from [LP68]. For
a beautiful recent survey about Grothendieck’s inequality, including historical back-
ground and far-reaching generalizations, see [Pis12a].
Concerning values of constants $K_G^{(m,n)}$ for specific $m, n$, we have $K_G^{(2,n)} = \sqrt{2}$
for all $n$ and $K_G^{(3,n)} = \sqrt{2}$ for all $n$ (the latter is stated without proof in
[FR94] and attributed ultimately to Kemperman's interpretation of results of Garg
[Gar83]; see also [BM08] on which Exercise 11.15 is based). The approach that
was used to calculate $K_G^{(3)}$ in Exercise 11.15 can be in principle replicated for larger
dimensions, but the computational complexity of the problem increases very fast. It
was implemented in [Li] to show rigorously that $K_G^{(4)} = \sqrt{2}$ (there are two new Bell
correlation inequalities that appear in the $4 \times 4$ context, but neither of them leads
to a violation that is $\sqrt{2}$ or larger). Other values of $K_G^{(m,n)}$ seem to be unknown.
Various aspects of this circle of ideas, including in particular the significance of
the constant $\sqrt{2}$, are discussed in [For10] and [FR94]. The CHSH inequality was
introduced in [CHSH69].
One may also define $K_G^{[n]}$ as the best constant such that (11.7) holds for every
matrix $(a_{ij})$ of arbitrary size and all vectors $x_i, y_j \in \mathbb{R}^n$. An easy observation is
see [BNV16, HQV+16] for recent lower and upper bounds.
The Grothendieck constant introduced in the text is the real Grothendieck
constant. It has a complex counterpart defined as the smallest constant $K_G^{\mathbb{C}}$ such
that for any complex matrix $(m_{ij})$ of arbitrary size $m \times n$ and any unit vectors
$x_i, y_j$ in a complex Hilbert space, we have
\[ \Bigl| \sum_{i,j} m_{ij} \langle x_i, y_j \rangle \Bigr| \le K_G^{\mathbb{C}} \max_{\xi \in \mathbb{T}^m,\ \eta \in \mathbb{T}^n} \Bigl| \sum_{i,j} m_{ij}\,\xi_i \eta_j \Bigr| \tag{11.37} \]
where $\mathbb{T}$ denotes the set of complex numbers of unit modulus. The best estimates
are $1.338\ldots < K_G^{\mathbb{C}} < 1.405\ldots$, which in particular imply $K_G^{\mathbb{C}} < K_G$ (see [Pis12a] for
complex Grothendieck inequality holds with constant 1, see Exercise 11.17 (based
on [BM08]). For larger dimensions the optimal values of the constants do not seem
to be known.
The argument from Exercise 11.8 is from [WW01a]. The description of the
extremal Bell correlation inequalities (extreme points of $\mathrm{LC}^\circ$, or equivalently faces
of LC) has attracted a lot of attention, see the website [@4].
Section 11.3. For more information on quantum boxes and Bell inequalities
we refer the readers to the surveys [PV16] and [BCP+14]. Older valuable references
include [Pit89] and [WW01b].
Some authors reserve the term “value of the game” for payoff functions that are
nonnegative. (Of course, any finite payoff function can be made nonnegative via
an offset, but that makes a difference when we calculate the ratios of values for
different strategies, as we do.) A 2-output game for which the payoff function is of
the form (11.17) for some $(m_{ij})$ (or, perhaps, slightly more generally, $m_{ij}\,\xi\eta + n_{ij}$,
which allows, in particular, talking about $\frac12(\xi\eta + 1)$, the probability of winning the
game) is called an XOR game. This is because when we think of the outputs as
Boolean data $a, b \in \{0,1\}$, the value of the game depends only on the “exclusive
or” value $a \oplus b$. XOR games can also be defined for more than two players; their
study is essentially equivalent to that of correlation matrices. It should be noted
that while for local correlation matrices and boxes the link to the projective tensor
product works perfectly (as in Proposition 11.7 and (11.24)), the correspondence to
operator space tensor products in the quantum setting is slightly less satisfactory
once we leave the setting of XOR games. This is pointed out, e.g., in section IV.B
of [PV16]: while we still can, with some work, come up with two-sided estimates,
constants larger than 1 do appear. It would be very useful to come up with a
natural construction (such as the use of cylindrical symmetrizations in Exercise
11.31) which allows one to bypass this complication.
It is known [AIIS04] that determining whether a box is local is NP-complete,
even for the class of boxes with 2 outputs, and similarly for correlation matrices.
This is established via a connection to the concept of the cut polytope associated
to a graph $G = (V,E)$, which is a polytope in $\mathbb{R}^E$ defined as
\[ \operatorname{conv}\{ (\delta_S(e))_{e \in E} : S \subset V \}, \]
where $\delta_S(e) = 1$ if the edge $e$ has one endpoint in $S$ and one endpoint in $V \setminus S$, and
0 otherwise. It can be checked that $\mathrm{LC}_{m,n}$ is affinely equivalent to the cut polytope
of the complete bipartite graph $K_{m,n}$ (cf. the comments on contextuality at the
end of these notes) and that $\mathrm{LB}_{2,2|m,n}$ is affinely equivalent to the cut polytope of
the complete tripartite graph $K_{m,n,1}$. For more information on cut polytopes we
refer the reader to [DL97].
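To make the definition concrete, the cut vectors $(\delta_S(e))_{e \in E}$ of a small graph can be enumerated directly. The sketch below does this for the complete bipartite graph $K_{2,2}$; the vertex labels and the helper name `cut_vectors` are illustrative.

```python
from itertools import combinations

def cut_vectors(vertices, edges):
    """Return the set of vectors (delta_S(e))_{e in E} over all subsets S of V."""
    vectors = set()
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            s = set(subset)
            # delta_S(e) = 1 exactly when the edge e = (u, v) is cut by S.
            vectors.add(tuple(int((u in s) != (v in s)) for u, v in edges))
    return vectors

# Complete bipartite graph K_{2,2} with parts {a0, a1} and {b0, b1}.
V = ["a0", "a1", "b0", "b1"]
E = [(a, b) for a in ("a0", "a1") for b in ("b0", "b1")]

cuts = cut_vectors(V, E)
print(len(cuts))  # 8
```

Since $S$ and $V \setminus S$ define the same cut, the 16 subsets give 8 distinct vectors. This matches the number of sign matrices $(x_i y_j)$ with $x, y \in \{-1,1\}^2$, as one would expect from the affine equivalence with $\mathrm{LC}_{2,2}$ stated above.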
question is known as Tsirelson's problem and has to do with how quantum physics
models locality: we may define a set $\mathrm{QB}'$ as the set of boxes $p(\xi,\eta|i,j)$ of the form
where $\psi$ is a unit vector in a Hilbert space $\mathcal{H}$, and, for every $i$ and $j$, $(\bar E_i^\xi)_\xi$ and
$(\bar F_j^\eta)_\eta$ are POVMs on $\mathcal{H}$ which satisfy the commutation condition $\bar E_i^\xi \bar F_j^\eta = \bar F_j^\eta \bar E_i^\xi$
earlier example in the multipartite setting was given in [Dür01]. See the discussion
in [VB14] and in section III.A of [BCP+14] for more on the relationship between
nonlocality and entanglement, and for many more references.
The fact that the multipartite analogue of Theorem 11.12 does not hold has
been known for some time. In the present context, unboundedness of $K_G^{(n_1,n_2,n_3)}$
as $n_1, n_2, n_3$ tend to infinity was shown in [PGWP+08]. Quantitative estimates
were obtained later in [Pis12b]. Proposition 11.22, with a slightly worse power
of logarithm, appeared in [BV13]; the version stated here is from [PV16].
Proposition 11.23 is from [PY15]. See also [JP11]; more references can be found in
[PV16].
The form of the Mermin–Peres magic square game given in Exercise 11.35
follows largely [Ara04]. Another (more explicit but less transparent) exposition
can be found in [BBT05]. Other demonstrations of pseudotelepathy are based
on versions of the Kochen–Specker theorem [KS67] which involves the concept
of contextuality. Contextuality, or rather noncontextuality, is a generalization of
locality. For example, a two party scenario allows one to perform measurements indexed
by pairs $\{(i,j)\}$, where $i$ and $j$ identify respectively local POVMs of Alice and
This last chapter consists of two parts which are linked by the central role played
by the concept of POVMs, but are otherwise largely independent. The first part
deals with the norms that are associated with POVMs and which are intimately
related to zonoids. This connection allows us to derive a sparsification result for
POVMs. The second part also uses the language of POVMs, but is focused on the
distillability problem, a major unsolved problem in quantum information theory.
12.1. POVMs and zonoids
12.1.1. Quantum state discrimination. What happens when a quantum
system in a state $\rho$ is measured with a POVM $\mathrm{M}$? We only focus on the case of
a discrete POVM $\mathrm{M} = (M_i)_{1 \le i \le N}$ (continuous POVMs could then be treated by
approximation).
We know from Born's rule (3.13) that the outcome $i$ is obtained with probability
$\operatorname{Tr}(\rho M_i)$. This simple formula can be used to quantify the efficiency of a POVM to
perform the task of state discrimination. State discrimination can be described as
state.
After measuring it with the POVM $\mathrm{M} = (M_i)_{1 \le i \le N}$, the outcome $i$ occurs
with probability $p_i = \operatorname{Tr}(\rho M_i)$ if the unknown state is $\rho$ and with probability
\[ \mathbf{P}(\text{failure}) = \frac12 \sum_{i=1}^{N} \min(p_i, q_i) = \frac12 - \frac14 \sum_{i=1}^{N} |p_i - q_i| . \]
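The two expressions for the failure probability can be compared on a small example. The sketch below takes the outcome distributions `p` and `q` as given lists (the numbers are arbitrary) and relies on the identity $\min(a,b) = \frac12(a + b - |a - b|)$; the function names are illustrative.

```python
from math import isclose

def failure_via_min(p, q):
    """First expression: (1/2) * sum_i min(p_i, q_i)."""
    return 0.5 * sum(min(pi, qi) for pi, qi in zip(p, q))

def failure_via_l1(p, q):
    """Second expression: 1/2 - (1/4) * sum_i |p_i - q_i|."""
    return 0.5 - 0.25 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical outcome distributions p_i and q_i for a 3-outcome POVM.
p = [0.5, 0.3, 0.2]
q = [0.1, 0.3, 0.6]

print(isclose(failure_via_min(p, q), failure_via_l1(p, q)))  # True
print(failure_via_min(p, p))  # 0.5, identical distributions cannot be told apart
```

For distributions with disjoint supports both expressions vanish, so the two states are then perfectly distinguished by the measurement.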
for $\Delta \in B^{sa}(\mathcal{H})$ by
\[ \|\Delta\|_{\mathrm{M}} = \sum_{i=1}^{N} |\operatorname{Tr}(\Delta M_i)| . \tag{12.1} \]
300 12. POVMS AND THE DISTILLABILITY PROBLEM
trace norm distance between quantum states; the optimal inequality $\mathbf{P}(\text{failure}) \ge
\frac12 - \frac14 \|\rho - \sigma\|_1$ is known as the Helstrom bound for quantum hypothesis testing.
only if $\|\cdot\|_{\mathrm{M}}$ is a norm. It follows from the inequality $\|\cdot\|_{\mathrm{M}} \le \|\cdot\|_1$ that $K_{\mathrm{M}}$ is
always included in the unit ball for the operator norm.
The following proposition characterizes the convex sets that can be obtained
by means of this construction.
Proposition 12.1. Let $K \subset B^{sa}(\mathcal{H})$ be a symmetric closed convex set. Then
the following are equivalent.
(i) $K$ is a zonotope such that $K \subset \{\|\cdot\|_\infty \le 1\}$ and $\pm I \in K$.
(ii) There exists a POVM $\mathrm{M}$ on $\mathcal{H}$ such that $K = K_{\mathrm{M}}$.
Zonotopes were defined in Section 4.1.3 and briefly discussed in Section 7.2.6.4;
the insight implicit in the above Proposition permits us to relate the ideas and the
techniques outlined in those sections to the task of state discrimination.
Proof of Proposition 12.1. For a POVM $\mathrm{M} = (M_i)_{1 \le i \le N}$, we claim that
\[ K_{\mathrm{M}} = [-M_1, M_1] + \cdots + [-M_N, M_N]. \tag{12.2} \]
Indeed, denoting by $L$ the right-hand side of (12.2), we have for every $A \in B^{sa}(\mathcal{H})$
\[ \|A\|_{L^\circ} = \sup\{ \operatorname{Tr}(AB) : B \in L \} = \sum_{i=1}^{N} |\operatorname{Tr}(A M_i)| = \|A\|_{K_{\mathrm{M}}^\circ}, \]
so that $L = K_{\mathrm{M}}$. Conversely, suppose that $K$ is a zonotope as in (i). By definition,
there are operators $(M_i)_{1 \le i \le N}$ such that
\[ K = [-M_1, M_1] + \cdots + [-M_N, M_N]. \]
The hypotheses imply that $I$ is an extreme point of $K$. Any extreme point of $K$ has
the form $\pm M_1 \pm \cdots \pm M_N$, and therefore by changing $M_i$ into $-M_i$ if necessary, we
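Theorem 12.2. There is a constant $C$ such that the following holds: for every
POVM $\mathrm{M} = (M_i)_{1 \le i \le N}$ on $\mathbb{C}^n$ and every $\varepsilon \in (0,1)$, there exists another POVM
$\mathrm{M}' = (M'_j)_{1 \le j \le N'}$ with $N' \le C n^2 \log n / \varepsilon^2$ outcomes such that
\[ (1 - \varepsilon) \|\cdot\|_{\mathrm{M}} \le \|\cdot\|_{\mathrm{M}'} . \tag{12.4} \]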
Proof. Consider the convex set $K_{\mathrm{M}} \subset M_n^{sa}$, which is a zonoid by Proposition
12.1. By Theorem 7.48, there is a zonotope
\[ Z = [-A_1, A_1] + \cdots + [-A_{N'}, A_{N'}] \]
with $A_i$ being positive operators, $N' \le C n^2 \log n / \varepsilon^2$, and such that $(1 - \varepsilon) K_{\mathrm{M}} \subset
Z \subset K_{\mathrm{M}}$. (The positivity of $A_i$ follows from the last sentence in Theorem 7.48.)
Define $A_0 = I - (A_1 + \cdots + A_{N'})$. Note that $A_0$ is positive since $Z \subset K_{\mathrm{M}} \subset S_\infty^{n,sa}$
(the unit ball for the operator norm). It follows that $\mathrm{M}' := (A_0, A_1, \ldots, A_{N'})$ is a
POVM such that $K_{\mathrm{M}'} \supset Z \supset (1 - \varepsilon) K_{\mathrm{M}}$, and therefore $\|\cdot\|_{\mathrm{M}'} \ge (1 - \varepsilon) \|\cdot\|_{\mathrm{M}}$ as
claimed.
Remark 12.3. The one-sided inequality (12.4) in Theorem 12.2 is the meaningful
half of (12.3) since we want the sparsified POVM to be not weaker than the
initial one. However, it is natural to wonder whether one can insist on a two-sided
inequality as in (12.3). This seems to require an extra argument.
12.2. The distillability problem
In this section we discuss the distillability problem, one of the most important
customarily called Alice and Bob. For any integer $n \ge 1$, the Hilbert space $\mathcal{H}^{\otimes n}$
is also considered as a bipartite Hilbert space by identifying it with $\mathcal{H}_A^{\otimes n} \otimes \mathcal{H}_B^{\otimes n}$.
\[ \bigl\| \Phi(\rho^{\otimes n}) - \sigma \bigr\|_1 \le \varepsilon. \]
In words, this property is referred to as “$\sigma$ can be distilled from (multiple copies
of) $\rho$.”
We are going to discuss this notion without giving a precise definition of LOCC
quantum channels. We only need to know that the class of LOCC channels is stable
under composition (which implies, together with the result from Exercise 2.31, that
the relation $\rightsquigarrow$ is transitive), that (see Section 2.3.4.8)
\[ \operatorname{conv}\{\text{product channels}\} \subset \{\text{LOCC channels}\} \subset \{\text{separable channels}\}, \]
and that the local filtering operation is LOCC: given a state $\rho$ on $\mathcal{H}_A \otimes \mathcal{H}_B$, POVMs
$(P_i)_{i \in I}$ on $\mathcal{H}_A$ and $(Q_j)_{j \in J}$ on $\mathcal{H}_B$, and $S \subset I \times J$, then (provided $\operatorname{Tr} M > 0$)
$\rho \rightsquigarrow \frac{M}{\operatorname{Tr} M}$, where
\[ M = \sum_{(i,j) \in S} (P_i \otimes Q_j)\,\rho\,(P_i \otimes Q_j). \]
The idea behind the last scheme is informally as follows: given $n$ copies of
the state $\rho$, Alice and Bob can successively measure copies of $\rho$ locally using the
POVMs $(P_i)$ and $(Q_j)$ until they obtain outcomes $i$ and $j$ such that $(i,j) \in S$, the
post-measurement state being then $\frac{M}{\operatorname{Tr} M}$. (The protocol fails if none of the $n$ copies
gives an outcome in $S$, but the probability of failure tends to zero as $n$ tends to
infinity.) This is where classical communication (“CC” of LOCC) comes in: Alice
and Bob need a mechanism for certifying that $(i,j) \in S$ and this generally cannot
be accomplished by “local” means unless $S$ itself has a product structure.
The above hierarchy of channels parallels somewhat the hierarchy of boxes (see
Section 11.3.2). For example, $\operatorname{conv}\{\text{product channels}\}$ can be thought of as “local
operations with shared randomness.”
Exercise 12.2 (Distillation preserves separability and PPT). If $\rho \rightsquigarrow \sigma$, show
that $\sigma$ is separable (resp., PPT) whenever $\rho$ is separable (resp., PPT).
12.2.2. Distillable states. Recall the standard notation: the canonical basis
of $\mathbb{C}^2$ is $(|0\rangle, |1\rangle)$ and we often drop the tensor product signs (for example, $|00\rangle$
should be understood as $|0\rangle \otimes |0\rangle$). Next, it is convenient to work with the family of
Bell vectors $\{\varphi^+, \varphi^-, \psi^+, \psi^-\}$, which is the orthonormal basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$ consisting
of maximally entangled vectors
\[ \varphi^\pm = \frac{1}{\sqrt{2}} (|00\rangle \pm |11\rangle) \quad\text{and}\quad \psi^\pm = \frac{1}{\sqrt{2}} (|01\rangle \pm |10\rangle). \]
The corresponding states are called the Bell states. A bipartite state $\rho \in D(\mathcal{H})$ is
said to be distillable if $\rho \rightsquigarrow |\psi^+\rangle\langle\psi^+|$. The motivation for this concept is that many
quantum information protocols (e.g., quantum teleportation) use Bell states as a
resource. Distillable states are exactly those which are useful for these protocols.
Note that the choice of the Bell vector $\psi^+$ in this definition is arbitrary: if $x, y$
are any two maximally entangled vectors on $\mathbb{C}^d \otimes \mathbb{C}^d$, then there exist $U, V \in \mathsf{U}(d)$
Given that $\langle\chi|\rho|\chi\rangle$ is the square of the fidelity between $\rho$ and $|\chi\rangle\langle\chi|$ (cf. Exercise
B.3), the functional $s(\cdot)$ measures proximity to the set of maximally entangled
states. In particular, $\rho$ is distillable if and only if there exists a sequence $(\sigma_n)$ in
$D(\mathbb{C}^2 \otimes \mathbb{C}^2)$ such that $s(\sigma_n) \to 1$ and that, for every $n$, $\rho \rightsquigarrow \sigma_n$.
Lemma 12.6. We have $\rho \rightsquigarrow \rho_{s(\rho),\,\frac{1-s(\rho)}{3},\,\frac{1-s(\rho)}{3},\,\frac{1-s(\rho)}{3}}$.
\[ \rho_{a,b,c,d} \rightsquigarrow \rho_{\alpha,\beta,\gamma,\delta} . \]
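Proof of Proposition 12.5. Let $\rho \in D(\mathbb{C}^2 \otimes \mathbb{C}^2)$ be an entangled state. By
Theorem 2.15, this means that $\rho$ is not PPT. Consequently, there exists a unit
vector $x \in \mathbb{C}^2 \otimes \mathbb{C}^2$ such that $\langle x|\rho^\Gamma|x\rangle < 0$. Conjugating with local unitaries, we
may assume that the Schmidt decomposition of $x$ is $\alpha|00\rangle + \beta|11\rangle$. Consider the
operator $W = \alpha|0\rangle\langle 0| + \beta|1\rangle\langle 1|$, then $x = \sqrt{2}\,(I \otimes W)|\varphi^+\rangle$. By local filtering,
\[ \rho \rightsquigarrow \sigma := \frac{(I \otimes W)\,\rho\,(I \otimes W)}{\operatorname{Tr}\,(I \otimes W)\,\rho\,(I \otimes W)} \tag{12.5} \]
(note that $0 \le W \le I$, so that $W$ can be one of the operators in a POVM) and one
checks that $\langle\varphi^+|\sigma^\Gamma|\varphi^+\rangle < 0$. Using the formula $\operatorname{Tr}(A^\Gamma B) = \operatorname{Tr}(A B^\Gamma)$, we obtain
\[ 0 > \operatorname{Tr}\bigl( \sigma\,(|\varphi^+\rangle\langle\varphi^+|)^\Gamma \bigr) = \operatorname{Tr}\Bigl( \sigma \Bigl( \frac{I}{2} - |\psi^-\rangle\langle\psi^-| \Bigr) \Bigr) = \frac12 - \langle\psi^-|\sigma|\psi^-\rangle \]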
for some state $\sigma'$ such that $s(\sigma') \ge \phi(s(\sigma))$, where $\phi$ is the function
\[ \phi(t) = \frac{t^2 + \frac19 (1-t)^2}{\frac19 (1+2t)^2 + \frac19 (2-2t)^2} = \frac{1 - 2t + 10t^2}{5 - 4t + 8t^2} . \]
Since $\phi(t) > t$ for $t \in (1/2, 1)$, we have $\lim_{n\to\infty} \phi^n(s(\sigma)) = 1$. In other words,
iterating the above procedure shows that $\sigma \rightsquigarrow \sigma''$, where $\sigma''$ is a state such that
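$$\Upsilon(\rho) = s(\rho)\,|\psi^-\rangle\langle\psi^-| + \frac{1 - s(\rho)}{3}\Big(|\varphi^+\rangle\langle\varphi^+| + |\varphi^-\rangle\langle\varphi^-| + |\psi^+\rangle\langle\psi^+|\Big).$$
The result follows since $\psi^-$ can be transformed into $\varphi^+$ by local unitaries.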
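The role of the function $\phi(t) = (1 - 2t + 10t^2)/(5 - 4t + 8t^2)$ in the proof of Proposition 12.5 can be illustrated numerically. The sketch below iterates $\phi$ from a starting value in $(1/2, 1)$ and prints the resulting increasing sequence; the function names and the starting value 0.6 are choices made for this example, not part of the text.

```python
def phi(t):
    """One step of the recursion: phi(t) = (1 - 2t + 10t^2) / (5 - 4t + 8t^2)."""
    return (1 - 2 * t + 10 * t * t) / (5 - 4 * t + 8 * t * t)

def iterate(t0, steps):
    """Return [t0, phi(t0), phi(phi(t0)), ...] with `steps` applications of phi."""
    values = [t0]
    for _ in range(steps):
        values.append(phi(values[-1]))
    return values

if __name__ == "__main__":
    for n, t in enumerate(iterate(0.6, 30)):
        print(n, t)
```

Note that $\phi(t) - t = -8(t-1)(t-\tfrac12)(t-\tfrac14)$ divided by the positive denominator $5 - 4t + 8t^2$, which is why $1/2$ and $1$ are fixed points and the iterates increase strictly on $(1/2, 1)$.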
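Proof of Lemma 12.7. We write $\rho$ for $\rho_{a,b,c,d}$. It will be convenient to consider $\rho$ as a state on $\mathcal{H}_A \otimes \mathcal{H}_B$ and $\rho \otimes \rho$ as a state on $\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_{A'} \otimes \mathcal{H}_{B'}$ (all
the spaces $\mathcal{H}_A, \mathcal{H}_B, \mathcal{H}_{A'}, \mathcal{H}_{B'}$ being equal to $\mathbb{C}^2$). When an operator $X$ on $\mathbb{C}^2 \otimes \mathbb{C}^2$
is thought of as acting on $\mathcal{H}_A \otimes \mathcal{H}_{A'}$ (resp., $\mathcal{H}_B \otimes \mathcal{H}_{B'}$), we denote it by $X_A$ (resp.,
by $X_B$). The same convention will be used for superoperators $\Psi$ whose domain is
$B(\mathbb{C}^2 \otimes \mathbb{C}^2)$.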
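$\operatorname{Tr}_2\, U\rho U^\dagger$, where $\operatorname{Tr}_2$ denotes the partial trace over the second factor, and $U$ is the
“CNOT” unitary transformation on $\mathbb{C}^2 \otimes \mathbb{C}^2$ defined by
$$U(|00\rangle) = |00\rangle, \quad U(|01\rangle) = |01\rangle, \quad U(|10\rangle) = |11\rangle, \quad U(|11\rangle) = |10\rangle.$$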
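A direct calculation shows that, for $\varepsilon, \eta = \pm$ and with the usual rules for sign
multiplication,
$$(\Psi_A \otimes \Psi_B)\big(|\varphi^\varepsilon \otimes \varphi^\eta\rangle\langle\varphi^\varepsilon \otimes \varphi^\eta|\big) = |\varphi^{\varepsilon\eta}\rangle\langle\varphi^{\varepsilon\eta}|,$$
$$(\Psi_A \otimes \Psi_B)\big(|\psi^\varepsilon \otimes \psi^\eta\rangle\langle\psi^\varepsilon \otimes \psi^\eta|\big) = |\psi^{\varepsilon\eta}\rangle\langle\psi^{\varepsilon\eta}|.$$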
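(We emphasize that, in the above formulas, not all occurrences of the symbol $\otimes$
refer to the same bipartitions; for example in $\varphi^\varepsilon \otimes \varphi^\eta$ we have $\varphi^\varepsilon \in \mathcal{H}_A \otimes \mathcal{H}_B$
and $\varphi^\eta \in \mathcal{H}_{A'} \otimes \mathcal{H}_{B'}$.) It follows (using first local filtering, then the LOCC channel
$\Psi_A \otimes \Psi_B$ and a tedious but straightforward computation) that
$$\rho \rightsquigarrow \frac{\Pi(\rho \otimes \rho)\Pi}{\operatorname{Tr}\,\Pi(\rho \otimes \rho)\Pi} \rightsquigarrow \rho_{\alpha,\beta,\gamma,\delta},$$
as asserted.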
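The "direct calculation" for the action of $\Psi_A \otimes \Psi_B$ on products of Bell vectors, used in the proof of Lemma 12.7, can be checked by brute force. The sketch below assumes, following the surviving fragment of the definition, that $\Psi(\rho) = \operatorname{Tr}_2 U\rho U^\dagger$ with $U$ the CNOT given above, so that $\Psi_A$ acts on $\mathcal{H}_A \otimes \mathcal{H}_{A'}$ and traces out $\mathcal{H}_{A'}$ (similarly for $B$). All function names are hypothetical, and amplitudes are stored in dictionaries indexed by bit pairs.

```python
import math

R = 1 / math.sqrt(2)

def bell(kind, sign):
    """Amplitudes {(a, b): value} of phi^sign (kind='phi') or psi^sign (kind='psi')."""
    if kind == "phi":
        return {(0, 0): R, (1, 1): sign * R}
    return {(0, 1): R, (1, 0): sign * R}

def bilateral_cnot_then_trace(src, tgt):
    """Apply U_A (x) U_B to src (on A, B) tensor tgt (on A', B'), then trace out A', B'.

    U maps |a, a'> to |a, a' XOR a> (the unprimed factor is the control).
    Returns the reduced density matrix as {((a, b), (c, d)): value}.
    """
    out = {}
    for (a, b), x in src.items():
        for (a2, b2), y in tgt.items():
            key = (a, b, a2 ^ a, b2 ^ b)
            out[key] = out.get(key, 0.0) + x * y
    rho = {}
    for (a, b, a2, b2), x in out.items():
        for (c, d, c2, d2), y in out.items():
            if (a2, b2) == (c2, d2):
                k = ((a, b), (c, d))
                rho[k] = rho.get(k, 0.0) + x * y  # amplitudes are real here
    return rho

def projector(v):
    """Rank-one projector |v><v| for a real amplitude dictionary v."""
    return {(i, j): x * y for i, x in v.items() for j, y in v.items()}

def close(p, q, tol=1e-12):
    """Entrywise comparison of two sparse matrices stored as dictionaries."""
    return all(abs(p.get(k, 0.0) - q.get(k, 0.0)) < tol for k in set(p) | set(q))

if __name__ == "__main__":
    for kind in ("phi", "psi"):
        for eps in (1, -1):
            for eta in (1, -1):
                lhs = bilateral_cnot_then_trace(bell(kind, eps), bell(kind, eta))
                print(kind, eps, eta, close(lhs, projector(bell(kind, eps * eta))))
```

Under these assumptions every line of the output should report `True`, matching the two displayed identities with signs multiplied.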
12.2.4. Some reformulations of distillability. We start with a criterion
for distillability.
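Proof. Assume that there exist $n$, $A$ and $B$ with the above properties. Then,
$$\sigma = \frac{(A \otimes B)^\dagger \rho^{\otimes n} (A \otimes B)}{\operatorname{Tr}\big((A \otimes B)^\dagger \rho^{\otimes n} (A \otimes B)\big)}$$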
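$B\big((\mathcal{H}_A)^{\otimes n} \otimes (\mathcal{H}_B)^{\otimes n}\big) \to B(\mathbb{C}^2 \otimes \mathbb{C}^2)$ such that $\Phi(\rho^{\otimes n})$ is non-PPT. Since $\Phi$ is
separable, it has the form
$$\Phi(X) = \sum_i (A_i \otimes B_i)^\dagger X (A_i \otimes B_i)$$
and therefore at least some couple $(A_i, B_i)$ satisfies the desired conclusion.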
There is also a connection between distillability and 2-positivity. Fix an orthonormal basis $(e_i)$ of $\mathcal{H}_A$ and denote $\chi = \sum_i e_i \otimes e_i \in \mathcal{H}_A \otimes \mathcal{H}_A$. We recall
(see Section 2.3.2) that the Choi matrix associated to a completely positive map
$\Phi \in CP(\mathcal{H}_B, \mathcal{H}_A)$ is defined as $C(\Phi) = (\Phi \otimes \mathrm{Id}_{B(\mathcal{H}_A)})(|\chi\rangle\langle\chi|)$.
is positive for any $A : \mathbb{C}^2 \to \mathcal{H}_A^{\otimes n}$ and $B : \mathbb{C}^2 \to \mathcal{H}_B^{\otimes n}$. This condition is also
equivalent to the operator $(\bar{A} \otimes B)^\dagger \rho^{\otimes n} (\bar{A} \otimes B)$ being PPT, and the result is now
immediate from Lemma 12.8.
Problem 12.4 reduces therefore to the following.
Problem 12.10. Let $\Phi$ be a completely positive map such that $(T\Phi)^{\otimes n}$ is 2-positive for every $n$ (where $T$ denotes the transposition). Is $T\Phi$ necessarily completely positive?
A remarkable result is the fact that in order to solve Problem 12.4 it is enough
to search among Werner states.
Proposition 12.11. Fix $d \ge 3$. The following are equivalent:
(i) Every non-PPT state on $\mathbb{C}^d \otimes \mathbb{C}^d$ is distillable,
(ii) Every entangled Werner state on $\mathbb{C}^d \otimes \mathbb{C}^d$ is distillable.
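Proof. Since PPT Werner states are separable (see Proposition 2.16), (i) implies (ii). Conversely, let $\rho \in \mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$ be a non-PPT state. In other words,
$\sigma \in \mathrm{D}(\mathbb{C}^d \otimes \mathbb{C}^d)$ such that $\rho \rightsquigarrow \sigma$ and $\langle\psi|\sigma^\Gamma|\psi\rangle < 0$, where $\psi = \frac{1}{\sqrt{d}} \sum_{i=1}^d e_i \otimes e_i$ is
a maximally entangled vector. Equivalently (cf. Exercise 2.20), $\operatorname{Tr}(F\sigma) < 0$, where
$F$ is the flip operator on $\mathbb{C}^d \otimes \mathbb{C}^d$. Consider now $\Upsilon : B(\mathbb{C}^d \otimes \mathbb{C}^d) \to B(\mathbb{C}^d \otimes \mathbb{C}^d)$,
Exercise 2.16). It follows (see Proposition 2.16) that $\rho \rightsquigarrow w$ for some entangled
Werner state $w$, so (ii) implies (i).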
A consequence of Lemma 12.8 and Proposition 12.11 is that Problem 12.4 can
also [BDSW96]). Proposition 12.5 appears in [HHH97] and Proposition 12.11 is
from [HH99].
Proposition 12.9, and the equivalence between Problem 12.4 and Problem 12.10,
are from [DSS+00]. For numerical attempts to solve Problem 12.4 in its formulation (12.6), see [DSS+00, DCLB00].
There is a quantitative version of the distillability problem, which asks for the
asymptotic rate of Bell states production via LOCC channels from many copies of a
given state; the supremum of achievable rates is called the distillable entanglement.
Entanglement that is not distillable is often referred to as bound entanglement, and
the states that exhibit it are called bound entangled.
If one uses operations preserving PPT instead of LOCC, then every non-PPT
state can be “distilled” [EVWW01]. Note that some care is needed when analyzing
this issue because the class in question is not closed under tensoring.
APPENDIX A
Gaussian measures and Gaussian variables
This appendix serves as a brief general reference for Gaussian random variables,
both scalar and vector-valued. It addresses terminology, basic properties, and var-
ious elementary but useful identities and inequalities. More specialized properties
are included elsewhere in this book, most notably in Chapter 6.
A.1. Gaussian random variables
The standard Gaussian distribution $N(0,1)$ is the probability measure on $\mathbb{R}$
(denoted by $\gamma_1$) with density $\frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\,dx$. The standard complex Gaussian
distribution $N_{\mathbb{C}}(0,1)$ is the probability measure on $\mathbb{C}$ with density $\frac{1}{\pi} \exp(-|z|^2)\,dz$.
(Occasionally we will write $N_{\mathbb{R}}(0,1)$ for $N(0,1)$ to emphasize the distinction.) The
word “standard” refers, in particular, to the unit variance normalization: if $Z$ has
distribution either $N(0,1)$ or $N_{\mathbb{C}}(0,1)$, then $\mathbf{E}\,|Z|^2 = 1$. We note also that if $Z_1, Z_2$
are independent random variables with distribution $N(0,1)$, then $\frac{1}{\sqrt{2}}(Z_1 + iZ_2)$ has
(A.1) E |Z| “ ? „
π 2 e
2
(indeed, $|Z|^2$ follows an exponential distribution with parameter 1).
$$\Phi(x) := \gamma_1\big((-\infty, x]\big). \tag{A.3}$$
For large $x$, we have $1 - \Phi(x) \sim (2\pi)^{-1/2} x^{-1} \exp(-x^2/2)$. This is refined by the
Komatu inequalities which assert that for every $x \ge 0$,
$$\frac{2}{x + \sqrt{x^2 + 4}} \le e^{x^2/2} \int_x^\infty e^{-t^2/2}\,dt \le \frac{2}{x + \sqrt{x^2 + 2}}. \tag{A.4}$$
A further refinement is provided by the inequalities (where $x \ge 0$)
$$\frac{\pi}{(\pi - 1)x + \sqrt{x^2 + 2\pi}} \le e^{x^2/2} \int_x^\infty e^{-t^2/2}\,dt \le \frac{4}{3x + \sqrt{x^2 + 8}}. \tag{A.5}$$
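The two-sided bounds (A.4) and (A.5) are easy to sanity-check numerically. The sketch below evaluates the middle quantity through the identity $\int_x^\infty e^{-t^2/2}\,dt = \sqrt{\pi/2}\,\operatorname{erfc}(x/\sqrt{2})$ and compares it with both pairs of bounds at a few points; the function names and the sample points are choices made for this illustration.

```python
import math

def mills(x):
    """e^{x^2/2} * integral_x^infinity e^{-t^2/2} dt, computed via erfc."""
    return math.exp(x * x / 2) * math.sqrt(math.pi / 2) * math.erfc(x / math.sqrt(2))

def komatu_bounds(x):
    """Lower and upper bounds from (A.4)."""
    return 2 / (x + math.sqrt(x * x + 4)), 2 / (x + math.sqrt(x * x + 2))

def refined_bounds(x):
    """Lower and upper bounds from (A.5)."""
    lower = math.pi / ((math.pi - 1) * x + math.sqrt(x * x + 2 * math.pi))
    upper = 4 / (3 * x + math.sqrt(x * x + 8))
    return lower, upper

if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(x, komatu_bounds(x), mills(x), refined_bounds(x))
```

At $x = 0$ the lower bound in (A.5) holds with equality (both sides equal $\sqrt{\pi/2}$), so any numerical comparison there should allow a small floating-point tolerance.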
Exercise A.1 (A simple bound for the normal tail). Show the inequality (6.6):
if $Z$ is a standard normal variable (i.e., distributed according to the $N(0,1)$ law),
then $\mathbf{P}(Z \ge t) = \frac{1}{2}\mathbf{P}(|Z| \ge t) \le \frac{1}{2} e^{-t^2/2}$ for $t \ge 0$. This bound motivates the
definition of subgaussian processes, see (6.19) and subsequent comments.
Exercise A.2 (Komatu inequalities). Prove the Komatu inequalities (A.4) by
arguing as follows:
(i) If $f_-(x)$, $f(x)$ and $f_+(x)$ denote respectively the left, middle and right member of
the inequality to be proved, show that for $x \ge 0$ we have $f_-' \ge x f_- - 1$, $f' = x f - 1$
and $f_+' \le x f_+ - 1$.
(ii) Show (A.4). The same argument proves the upper bound in (A.5).
A.2. Gaussian vectors
A family of real-valued centered random variables $(X_i)$ is jointly Gaussian
if any linear combination of the variables has distribution $N(0, \sigma^2)$ for some $\sigma$.
A jointly Gaussian family is also called a Gaussian process (see Section 6.1). A
crucial property of jointly Gaussian families, or Gaussian processes, is that the
joint distribution of $(X_i)$ is uniquely determined by the covariance matrix $(a_{ij}) = (\mathbf{E}\, X_i X_j)$.
When $V$ is a real (resp., complex) finite-dimensional space equipped with a
Euclidean (resp., Hilbertian) norm, we call the standard Gaussian vector in $V$ a
$V$-valued random variable whose coordinates, in any orthonormal basis,
are independent standard real (resp., complex) Gaussian random variables. More concretely,
p2πqn{2
whereas the distribution of a standard Gaussian vector in $\mathbb{C}^n$ (denoted by $\gamma_n^{\mathbb{C}}$) has
density
$$\frac{1}{\pi^n} \exp(-|x|^2)\,dx. \tag{A.6}$$
In all these cases the respective distribution will be referred to as the standard
Gaussian measure on the corresponding space $V$. Note that if $\mathbb{C}^n$ is identified with
$\mathbb{R}^{2n}$, the distributions $\gamma_n^{\mathbb{C}}$ and $\gamma_{2n}$ do not coincide: they differ by a scaling factor of
$\sqrt{2}$.
While we are mostly interested in standard Gaussian vectors and measures,
are also considered. However, this does not add a lot to generality: any such
measure is a pushforward of the standard Gaussian measure via a linear (or affine,
as appropriate) map.
Let $G$ be a standard Gaussian vector in $\mathbb{R}^n$. Rotational invariance of $\gamma_n$ implies
that the random variable $\frac{G}{|G|}$ is uniformly distributed on the sphere $S^{n-1}$; moreover $|G|$
and $\frac{G}{|G|}$ are independent. This can be used to relate Gaussian averages and spherical
averages. For any function $f : \mathbb{R}^n \to \mathbb{R}_+$ satisfying $f(tx) = t f(x)$ whenever $x \in \mathbb{R}^n$
and $t \ge 0$, we have
$$\int_{\mathbb{R}^n} f \,d\gamma_n = \mathbf{E}\, f(G) = \kappa_n \int_{S^{n-1}} f \,d\sigma, \tag{A.7}$$
median, which is necessarily smaller than $\kappa_n$ by Proposition 5.34.) The first values
are $\kappa_1 = \sqrt{2/\pi}$, $\kappa_2 = \sqrt{\pi/2}$, $\kappa_3 = 2\sqrt{2/\pi}$. Note also the formula $\kappa_n \kappa_{n+1} = n$. For
large $n$, we have $\kappa_n \sim \sqrt{n}$. More precise estimates are gathered in the following
proposition.
Proposition A.1 (see Exercises A.4 and A.5; (iv) and (v) are not proved here).
Let $\kappa_n$ be the constant defined in (A.8). Then
(i) $\sqrt{n-1} \le \kappa_n \le \sqrt{n}$,
(ii) the sequence $\kappa_n/\sqrt{n}$ is increasing,
(iii) $\sqrt{n - \frac{1}{2}} \le \kappa_n \le \sqrt{n - \frac{n}{2n+1}}$,
(iv) the sequence $\sqrt{n} - \kappa_n$ is non-increasing,
(v) as $n$ tends to infinity, we have $\kappa_n = \sqrt{n}\,\big(1 - \frac{1}{4n} + \frac{1}{32n^2} + O(1/n^3)\big)$.
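The listed values of $\kappa_n$ and parts (i) to (iii) of Proposition A.1 can be checked numerically. The sketch below assumes the standard closed form $\kappa_n = \mathbf{E}\,|G| = \sqrt{2}\,\Gamma\big(\tfrac{n+1}{2}\big)/\Gamma\big(\tfrac{n}{2}\big)$ for a standard Gaussian vector $G$ in $\mathbb{R}^n$ (the definition (A.8) itself is not reproduced here, so this identification is an assumption); the function name `kappa` is hypothetical.

```python
import math

def kappa(n):
    """Assumed closed form: sqrt(2) * Gamma((n+1)/2) / Gamma(n/2), via log-gamma."""
    return math.sqrt(2) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))

def check_bounds(n):
    """Return True if (i) and (iii) of Proposition A.1 hold for this n."""
    k = kappa(n)
    part_i = math.sqrt(n - 1) <= k <= math.sqrt(n)
    part_iii = math.sqrt(n - 0.5) <= k <= math.sqrt(n - n / (2 * n + 1))
    return part_i and part_iii

if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100):
        print(n, kappa(n), kappa(n) / math.sqrt(n), check_bounds(n))
```

Working with `math.lgamma` rather than `math.gamma` avoids overflow for large $n$, which matters when comparing with the asymptotic expansion in (v).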
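The complex analogue of (A.7) is as follows: if $f : \mathbb{C}^n \to \mathbb{R}_+$ satisfies $f(tx) = t f(x)$ whenever $x \in \mathbb{C}^n$ and $t \ge 0$, we have
$$\int_{\mathbb{C}^n} f \,d\gamma_n^{\mathbb{C}} = \kappa_n^{\mathbb{C}} \int_{S_{\mathbb{C}^n}} f \,d\sigma \tag{A.9}$$
with $\kappa_n^{\mathbb{C}} = \kappa_{2n}/\sqrt{2}$.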
Exercise A.4. Using the fact that the function log Γ is convex, show parts (i)
Exercise A.6. State and prove a variant of (A.7) for $\alpha$-homogeneous functions,
i.e., verifying $f(tx) = t^\alpha f(x)$ for $x \in \mathbb{R}^n$ and $t > 0$.
APPENDIX B
Classical groups and manifolds
This appendix contains an overview of the classical groups and manifolds that
appear in this book, and of the natural structures, such as metrics and measures,
which they carry. Most of the facts included here have been known for 100 years
or more, but the precise statements are often difficult to find in the literature,
mostly because presentations of these topics usually focus on more general and
more abstract settings. Again, more specialized features of these objects are studied
elsewhere in this book, primarily in Chapter 5.
B.1. The unit sphere $S^{n-1}$ or $S_{\mathbb{C}^d}$
We denote by $S^{n-1} = \{x \in \mathbb{R}^n : |x| = 1\}$ the unit sphere in $\mathbb{R}^n$. There are
two natural distances on the sphere: the (intrinsic) geodesic distance (“as the crow
flies”) denoted by $g$ and the extrinsic distance (“as the mole burrows”), i.e., the
restriction to $S^{n-1}$ of the Euclidean distance $|\cdot|$ on $\mathbb{R}^n$. Since they are related
$$\frac{2}{\pi}\, g(x, y) \le |x - y| \le g(x, y). \tag{B.1}$$
We denote by $\sigma$ the uniform measure on $S^{n-1}$, normalized so that $\sigma(S^{n-1}) = 1$.
We note for the record that the non-normalized $(n-1)$-dimensional “surface area”
of $S^{n-1}$ equals
$$\operatorname{vol}_{n-1}\big(S^{n-1}\big) = \frac{2\pi^{n/2}}{\Gamma\big(\frac{n}{2}\big)}. \tag{B.2}$$
voln B2n
We note for the record the formula for the volume of the unit ball $B_2^n$
$$\operatorname{vol}\big(B_2^n\big) = \frac{\pi^{n/2}}{\Gamma\big(\frac{n}{2} + 1\big)}. \tag{B.3}$$
If $G$ is a standard Gaussian vector on $\mathbb{R}^n$, then $G/|G|$ is distributed according to
$\sigma$. This is an efficient procedure to simulate the uniform measure on the sphere.
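The simulation procedure just described (normalize a standard Gaussian vector) can be sketched with the Python standard library alone. The function name `uniform_on_sphere` and the optional `rng` argument are choices made for this illustration.

```python
import math
import random

def uniform_on_sphere(n, rng=random):
    """Sample a point of S^{n-1} by normalizing a standard Gaussian vector in R^n."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    # norm == 0 happens with probability zero; no special handling in this sketch.
    return [x / norm for x in g]

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(uniform_on_sphere(3, rng))
```

Passing a seeded `random.Random` instance makes the samples reproducible, which is convenient when estimating spherical averages by Monte Carlo.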
We denote by $S_{\mathbb{C}^d}$ the unit sphere in $\mathbb{C}^d$. Since $\mathbb{C}^d$ identifies with $\mathbb{R}^{2d}$ as a
real vector space, and $S_{\mathbb{C}^d}$ with $S^{2d-1}$ as a metric measure space, the preceding
discussion is also valid for $S_{\mathbb{C}^d}$. Note also the formula, for $x, y \in S_{\mathbb{C}^d}$,
$$g(x, y) = \arccos \operatorname{Re}\,\langle x, y\rangle. \tag{B.4}$$
Exercise B.1. Show that $\operatorname{vol}(B_2^n)^{1/n} \sim \sqrt{2\pi e/n}$ as $n$ tends to infinity, and
that $\operatorname{vol}(B_2^n)^{1/n} \le \sqrt{2\pi e/n}$ for every $n \ge 1$.
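Formulas (B.2) and (B.3), together with the claim of Exercise B.1, lend themselves to a quick numerical check. The sketch below works with logarithms (via `math.lgamma`) so that large dimensions do not underflow; all function names are hypothetical.

```python
import math

def log_ball_volume(n):
    """log vol(B_2^n) = (n/2) log(pi) - log Gamma(n/2 + 1), cf. (B.3)."""
    return (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)

def ball_volume(n):
    """Volume of the Euclidean unit ball in R^n."""
    return math.exp(log_ball_volume(n))

def sphere_area(n):
    """(n-1)-dimensional surface area of S^{n-1}, cf. (B.2)."""
    return 2 * math.exp((n / 2) * math.log(math.pi) - math.lgamma(n / 2))

def volume_radius_ratio(n):
    """vol(B_2^n)^{1/n} divided by sqrt(2 pi e / n)."""
    return math.exp(log_ball_volume(n) / n - 0.5 * math.log(2 * math.pi * math.e / n))

if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100, 10000):
        print(n, volume_radius_ratio(n))
```

The printed ratios stay below 1 and approach 1 slowly, in line with both assertions of Exercise B.1.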
noted by $\mathbb{CP}^{d-1}$), i.e., the quotient of $S_{\mathbb{C}^d}$ under the identification of unit vectors
$\varphi, \psi$ which differ only by their phase; in other words, if $\varphi = e^{i\theta}\psi$ for some $\theta \in \mathbb{R}$.
When $\psi \in S_{\mathbb{C}^d}$, we will occasionally denote by $[\psi]$ its class in $\mathbb{P}(\mathbb{C}^d)$. We equip
$\mathbb{P}(\mathbb{C}^d)$ with the following metric (called Fubini–Study metric, or Bures metric)
$$d([\psi], [\chi]) = \arccos |\langle\psi, \chi\rangle|. \tag{B.5}$$
The quantity $|\langle\psi, \chi\rangle|$ is called the overlap of the vectors $\psi$ and $\chi$ or, more properly,
of $[\psi]$ and $[\chi]$.
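The following sketch evaluates the Fubini–Study distance (B.5) and the geodesic distance (B.4) for unit vectors given as lists of complex numbers, and illustrates that (B.5) does not change when one vector is multiplied by a phase. The function names are hypothetical, and the inputs are assumed to be normalized.

```python
import cmath
import math

def inner(psi, chi):
    """Inner product <psi, chi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(psi, chi))

def fubini_study(psi, chi):
    """d([psi], [chi]) = arccos |<psi, chi>| for unit vectors, cf. (B.5)."""
    return math.acos(min(1.0, abs(inner(psi, chi))))

def geodesic(x, y):
    """g(x, y) = arccos Re <x, y> on the unit sphere, cf. (B.4)."""
    return math.acos(max(-1.0, min(1.0, inner(x, y).real)))

if __name__ == "__main__":
    psi = [1.0, 0.0]
    chi = [1 / math.sqrt(2), 1j / math.sqrt(2)]
    phase = cmath.exp(0.7j)
    print(fubini_study(psi, chi), fubini_study(psi, [phase * c for c in chi]))
    print(geodesic(psi, chi))
```

The clamping inside `acos` only guards against rounding errors pushing the argument slightly outside $[-1, 1]$.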
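We also introduce the Segre variety on the bipartite Hilbert space $\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$,
defined as
$$\mathrm{Seg} = \{\varphi \otimes \psi : \varphi \in S_{\mathbb{C}^{d_1}},\ \psi \in S_{\mathbb{C}^{d_2}}\}. \tag{B.6}$$
As defined in (B.6), $\mathrm{Seg}$ is a subset of the unit sphere $S_{\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}}$ with real dimension
$2(d_1 + d_2) - 3$. Alternatively, one could define the Segre variety as a subset of the
projective space $\mathbb{P}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$. In that case it has complex dimension $d_1 + d_2 - 2$.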
The real projective space $\mathbb{P}(\mathbb{R}^m)$ is defined and endowed with metric mutatis
mutandis starting from the sphere $S^{m-1}$. However, the real setting generally appears in quantum theory only as a toy model. Note that the more standard (and
more general) definition of the projective space $\mathbb{P}(V)$ associated to a vector space
$V$ over an arbitrary field $K$ is by identification of vectors $u, v \in V \setminus \{0\}$ such that
$u = kv$ for some $k \in K \setminus \{0\}$. However, the equivalent approach starting from the
sphere $S^{m-1}$ or $S_{\mathbb{C}^d}$ fits better the standard setup of quantum theory.
Exercise B.2. Check that the Fubini–Study metric is obtained as the quotient
metric from the geodesic metric on the unit sphere.
Exercise B.3 (Bures vs. Fubini–Study, fidelity vs. overlap). The Bures metric
where $F(\sigma, \tau) = \operatorname{Tr}\sqrt{\sqrt{\sigma}\,\tau\sqrt{\sigma}}$ is the fidelity between $\sigma$ and $\tau$. (Note that some
texts define fidelity as the square of this quantity.) (i) Verify that if $\tau = |\chi\rangle\langle\chi|$,
then $F(\sigma, \tau) = \sqrt{\langle\chi|\sigma|\chi\rangle}$. (ii) Deduce that if $\sigma = |\psi\rangle\langle\psi|$, $\tau = |\chi\rangle\langle\chi|$, then (B.5)
and (B.7) yield the same value (in other words, the Fubini–Study metric is the
restriction of the Bures metric to pure states and similarly for the fidelity vs. the
overlap). (iii) Verify that $d(\sigma, \tau)$ is indeed a metric.
copies of $SO(n)$, so all statements about $SO(n)$ transfer mutatis mutandis to $O(n)$.
We also point out the classical isomorphism $PSU(2) \leftrightarrow SO(3)$ (see Exercise B.4).
In what follows $G$ will stand for either $SO(n)$, $SU(n)$ or $U(n)$. There are many
metric structures one may consider on $G$. Each norm $\|\cdot\|$ on $M_n$ induces two
distances on $G$: the extrinsic distance (simply $\|U - V\|$, for $U, V \in G$) and the
geodesic distance (the length of a shortest path in $G$ joining $U$ to $V$, where length
is measured with respect to $\|\cdot\|$). For $p \in [1, \infty]$, we will denote by $g_p$ the geodesic
distance induced by the Schatten $p$-norm.
Among these choices we single out the standard Riemannian metric $g_2$, which
can be expressed for $U, V \in G$ as
$$g_2(U, V) = \Big(\sum_{i=1}^n \theta_i^2\Big)^{1/2} \tag{B.8}$$
where $e^{i\theta_1}, \ldots, e^{i\theta_n}$ are the eigenvalues of $U^{-1}V$, and $\theta_j \in [-\pi, \pi]$. (See Exercise
B.5.)
Proposition B.1 (not proved here). Let $1 \le p \le \infty$. Let $U, V \in G$, and
$A \in M_n^{sa}$ with $\|A\|_\infty \le \pi$ such that $\exp(iA) = U^{-1}V$. Then the map $t \mapsto U\exp(itA)$,
defined for $t \in [0, 1]$, is a geodesic joining $U$ to $V$ for the distance $g_p$. If $\|U - V\|_\infty < 2$ and $1 < p < \infty$, this is the unique path of minimal length.
The above result is very well-known for $p = 2$, but it is also valid, with the
stated caveats, for other values of $p$. As a consequence of Proposition B.1, extrinsic
and geodesic distances are easy to calculate and they are comparable, see Exercise
We point out that while Proposition B.1 appears to be stated in the complex
setting (i.e., $G = SU(n)$ or $G = U(n)$), it makes sense just as well when $G = SO(n)$:
$SO(n)$ (or at least that there exists such a curve, if the shortest curve is not unique).
As compact groups, $O(n)$ and $U(n)$ carry a Haar measure: the unique probability measure which is invariant under right and/or left multiplication. The Haar
measure can also be generated in a more concrete fashion. For example, start from
a vector $x_1$ uniformly distributed on $S^{n-1}$ (resp., $S_{\mathbb{C}^n}$), and construct inductively
are geodesics in the sense that all their “sufficiently short” arcs are the shortest
curves connecting their endpoints (unique if $1 < p < \infty$). Moreover, for $1 < p < \infty$
all such shortest curves are unique and hence all geodesics are of that form.
Exercise B.7 (Geodesical convexity of $SO(n)$). Show that for any $U \in SO(n)$
there is a real skew-symmetric matrix $B$ with $\|B\|_\infty \le \pi$ such that $U = e^B$ and
$g_p(I, U) = \|B\|_p$. Conclude that $SO(n)$ is a geodesically convex submanifold of $U(n)$
with respect to any metric $g_p$.
Exercise B.8 (Bi-Lipschitz estimates for the exponential map).
(i) Show that $\|\exp(iB) - \exp(iA)\|_{op} \le \|B - A\|_{op}$ for every $A, B \in M_n^{sa}$.
(ii) Consider, for $\theta \in (0, \pi)$,
$$L(\theta) = \inf\Big\{ \frac{\|\exp(iB) - \exp(iA)\|_{op}}{\|B - A\|_{op}} : A, B \in M_n^{sa},\ \|A\|_{op} \le \theta,\ \|B\|_{op} \le \theta,\ A \ne B \Big\}.$$
Show that for $\theta \in (0, 2\pi/3)$ we have $L(\theta) \ge L(\theta/2)\big(1 - |1 - e^{i\theta/2}|\big)$. Conclude that
find an answer in the literature (but we did not look very hard). An easy upper
bound is $\sin(\theta)/\theta$ (check).
set $\mathrm{Gr}(k, V)$ is called the Grassmann manifold or the Grassmannian. Since its
properties effectively depend only on the dimension of $V$, in what follows we consider
only the concrete situations $\mathrm{Gr}(k, \mathbb{R}^n)$ and $\mathrm{Gr}(k, \mathbb{C}^n)$. (See, however, Exercise B.15.)
Further, since the map $E \leftrightarrow E^\perp$ is a bijection between $\mathrm{Gr}(k, \mathbb{R}^n)$ and $\mathrm{Gr}(n-k, \mathbb{R}^n)$
preserving all the structures we will be interested in (and similarly for $\mathbb{C}^n$), the
reader may always concentrate on the cases when $k \le n/2$, which we will often
tacitly assume.
Before discussing metrics on the Grassmann manifold we introduce the concept
of principal angles. Given $E, F \in \mathrm{Gr}(k, \mathbb{R}^n)$ or $\mathrm{Gr}(k, \mathbb{C}^n)$, consider the singular
value decomposition of the operator $P_E P_F$ (recall that $P_E$ denotes the orthogonal
projection onto $E$), which we will write in the form given by (2.10)
$$P_E P_F = \sum_{i=1}^k s_i |x_i\rangle\langle y_i| \tag{B.9}$$
with $s_i \in [0, 1]$, $x_1, \ldots, x_k \in E$, and $y_1, \ldots, y_k \in F$ (the latter inclusions are automatic for $x_i, y_i$ corresponding to coefficients $s_i \ne 0$ and can be arranged otherwise).
The principal angles between $E$ and $F$ are the numbers $\theta_1, \ldots, \theta_k \in [0, \pi/2]$ defined
$$d(E, F) = \sqrt{2}\,\Big(\sum_{i=1}^k \theta_i^2\Big)^{1/2} \tag{B.10}$$
where $\theta_1, \ldots, \theta_k$ are the principal angles between $E$ and $F$. The reader may wonder
why we included the factor $\sqrt{2}$, which may appear redundant, both geometrically
and esthetically. Indeed, as noted above, the natural metric on the projective space
(Fubini–Study in the complex case), which corresponds to the case $k = 1$ of the
Grassmannian, does not have that factor. However, as we shall see, there are sound
functorial reasons for using the normalization (B.10): it shows up in two canonical
constructions of the Grassmann manifold.
Another very natural way to define the distance in terms of principal angles is
$$d_\infty(E, F) = \max_{1 \le i \le k} \theta_i. \tag{B.11}$$
However, the metric $d_\infty$ is not Riemannian; an important (and obvious) inequality
$$\begin{pmatrix} O_1 & 0 \\ 0 & O_2 \end{pmatrix},$$
where $O_1 \in O(k)$ and $O_2 \in O(n-k)$, and so it can be naturally identified with
$O(k) \times O(n-k)$. Since the action of $O(n)$ on $\mathrm{Gr}(k, \mathbb{R}^n)$ is transitive, it follows
that $\mathrm{Gr}(k, \mathbb{R}^n)$ is a homogeneous space for $O(n)$ and can be identified with the
quotient space $O(n)/(O(k) \times O(n-k))$. It follows in particular that the dimension
of $\mathrm{Gr}(k, \mathbb{R}^n)$ equals $\dim O(n) - \big(\dim O(k) + \dim O(n-k)\big) = k(n-k)$.
For a more concrete description of this correspondence, consider the map
by its first $k$ columns, i.e., $O\mathbb{R}^k$. The preimage of $E \in \mathrm{Gr}(k, \mathbb{R}^n)$ under this map,
i.e., the set $\{O \in O(n) : O(\mathbb{R}^k) = E\}$ is a (left) coset of $O(k) \times O(n-k)$. Similarly, $\mathrm{Gr}(k, \mathbb{C}^n)$ identifies with the quotient space $U(n)/(U(k) \times U(n-k))$. Note
that $\mathrm{Gr}(k, \mathbb{C}^n)$ is a complex manifold (of complex dimension $k(n-k)$), although
$U(n)$ is not. As pointed out earlier, $\mathrm{Gr}(1, V)$ identifies with the projective space
$\mathbb{P}(V)$, except that the metric (B.10) differs from the Fubini–Study metric (B.5) by
a factor of $\sqrt{2}$ when $V = \mathbb{C}^n$. We explain the reasons for this factor further below,
particularly in the paragraph containing (B.14). (The same formulas and the same
caveats apply to the case $V = \mathbb{R}^n$.) On the other hand, the metric $d_\infty$ defined by
(B.11) coincides, for $k = 1$, with the Fubini–Study distance.
any quotient map on (or, equivalently, for any family of “cosets” in) a metric space,
but the second equality requires that space to be a group with invariant metric (see
also Exercise B.9).
The same scheme can be applied to the geodesic metric $g_p$ on $O(n)$ or $U(n)$. In
particular, if $p = 2$, we obtain the standard Riemannian structure on $\mathrm{Gr}(k, \mathbb{R}^n)$ or
$\mathrm{Gr}(k, \mathbb{C}^n)$ and the resulting metric is (B.10), while $p = \infty$ yields the metric $d_\infty$ from
(B.11) (see Exercise B.12). Moreover, it doesn’t matter whether we first define the
geodesic metric and then pass to a quotient, or whether we reverse the order of
these operations (see Exercise B.10).
It is instructive to specify the calculations implicit in the above paragraph to
the simplest nontrivial setting, that of the real projective space $\mathbb{P}(\mathbb{R}^2)$, or $\mathrm{Gr}(1, \mathbb{R}^2)$.
If the angle between two lines $E, F \subset \mathbb{R}^2$ (i.e., their Fubini–Study distance (B.5))
is $\theta \in (0, \pi/2]$, then the eigenvalues of the rotation $W$ mapping $E$ to $F$ are $e^{i\theta}$ and
$e^{-i\theta}$ and so $W = e^{iA}$, where $A \in M_2^{sa}$ has eigenvalues $\theta$ and $-\theta$ (cf. the calculation in
the hint to Exercise B.7). It follows that the intrinsic Riemannian distance induced
which explains the factor $\sqrt{2}$ appearing in (B.10). Observe that the second equality
with a metric. The map $E \mapsto P_E$ allows one to consider (for example) $\mathrm{Gr}(k, \mathbb{R}^n)$
as a submanifold of $M_n^{sa}$, so any norm of $M_n$ also induces two metrics (extrinsic
vs. geodesic) on $\mathrm{Gr}(k, \mathbb{R}^n)$. As it turns out, the geodesic metric obtained from the
Hilbert–Schmidt norm is again (B.10). For an analysis of this situation via principal
angles, see Exercise B.13.
Finally, let us note that since $SO(n)$ acts transitively on $\mathrm{Gr}(k, \mathbb{R}^n)$, the Grassmann manifold can be likewise represented as a quotient of $SO(n)$, and similarly for
$\mathrm{Gr}(k, \mathbb{C}^n)$ and $SU(n)$, a point of view that can be occasionally useful (cf. the proof
of Theorem 7.15). This circle of ideas is explored in Exercises B.16 and B.17.
Each Grassmann manifold carries a natural probability measure which can be
constructed in two different but equivalent ways:
‚ as the normalized Riemannian volume induced by the metric (B.10),
‚ as the pushforward of the Haar measure on $O(n)$ via the quotient map $O(n) \to O(n)/(O(k) \times O(n-k))$.
The latter construction can be described more tangibly as follows: fix $E \in \mathrm{Gr}(k, \mathbb{R}^n)$
and consider a Haar-distributed $O \in O(n)$; then $O(E)$ is a random element in
$\mathrm{Gr}(k, \mathbb{R}^n)$ whose distribution does not depend on the choice of $E$. Either way,
we will call the resulting measure the standard Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$. The
same construction (using $U(n)$ instead of $O(n)$) defines similarly the standard Haar
measure on $\mathrm{Gr}(k, \mathbb{C}^n)$. For an even more concrete realization of the standard Haar
measure, see Exercise B.14.
Since $O(n)$ consists of morphisms of the corresponding space that preserve the
inner product, the Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$ may be seen as depending on the
choice of a Euclidean (i.e., inner product) structure on $\mathbb{R}^n$. Using another Euclidean
structure on $\mathbb{R}^n$ leads to a different measure on $\mathrm{Gr}(k, \mathbb{R}^n)$, as Exercise
B.15 illustrates. The same caveat applies to the complex case.
To complete the discussion of Grassmann manifolds, we will mention briefly
their “cousins,” Stiefel manifolds. For $1 \le k \le n$, denote by
$$\mathrm{St}(k, \mathbb{R}^n) := \{(x_1, \ldots, x_k) \in (\mathbb{R}^n)^k : \langle x_i, x_j\rangle = \delta_{i,j}\}$$
the set of $k$-tuples of orthonormal vectors in $\mathbb{R}^n$. We have the canonical equivalences
$\mathrm{St}(k, \mathbb{R}^n) \leftrightarrow O(n)/O(n-k)$ and $\mathrm{Gr}(k, \mathbb{R}^n) \leftrightarrow \mathrm{St}(k, \mathbb{R}^n)/O(k)$. The complex version
is defined similarly; as for Grassmann manifolds, Stiefel manifolds naturally inherit
metrics and a Haar measure from the orthogonal group.
For simplicity, Exercises B.9–B.15 are stated in the real case, but the statements
are also valid in the complex case.
Exercise B.9 (Induced metrics on spaces of cosets). Fix $0 < k < n$, $1 \le p \le \infty$,
and denote $H = O(k) \times O(n-k) \subset O(n)$. Let $U_0H$ and $V_0H$ be two left cosets of $H$ and
let $U_1 \in U_0H$. Show that $\min\{\|U_1 - V\|_p : V \in V_0H\} = \min\{\|U_0 - V\|_p : V \in V_0H\}$
and that a similar equality holds for the corresponding geodesic distance $g_p$.
Exercise B.10 (Geodesics in $\mathrm{Gr}(k, \mathbb{R}^n)$ and in $O(n)$). Fix $0 < k < n$ and
be lifted to a geodesic in $(O(n), g_p)$, which is of the form given by Exercise B.6 and
on which the quotient map acts as an isometry. If $p = 1$ or $\infty$, any two points in
$\mathrm{Gr}(k, \mathbb{R}^n)$ can be connected by a geodesic with this property.
Exercise B.11 ($\mathrm{Gr}(k, \mathbb{R}^n)$ vs. $\mathrm{Gr}(n-k, \mathbb{R}^n)$). Let $E, F \in \mathrm{Gr}(k, \mathbb{R}^n)$. Show that
the nonzero principal angles between $E$ and $F$ coincide with the nonzero principal
angles between $E^\perp$ and $F^\perp$.
Exercise B.12 (Equivalence of metrics on $\mathrm{Gr}(k, \mathbb{R}^n)$). Let $\tilde{h}_p$ be the metric on
$\mathrm{Gr}(k, \mathbb{R}^n)$ given by (B.13) and let $\tilde{g}_p$ be the geodesic metric defined in Exercise
B.10. Show that for $E, F \in \mathrm{Gr}(k, \mathbb{R}^n)$
$$\tilde{h}_p(E, F) = 2^{1/p}\,\big\|\big(2\sin(\theta_1/2), \ldots, 2\sin(\theta_k/2)\big)\big\|_p, \qquad \tilde{g}_p(E, F) = 2^{1/p}\,\|(\theta_1, \ldots, \theta_k)\|_p,$$
where $\|\cdot\|_p$ is the $\ell_p$-norm on $\mathbb{R}^k$ and $\theta_1, \ldots, \theta_k$ are the principal angles between $E$
and $F$. Conclude that $\frac{2\sqrt{2}}{\pi}\,\tilde{g}_p \le \tilde{h}_p \le \tilde{g}_p$ and that $\tilde{g}_\infty$ coincides with the metric $d_\infty$
from (B.11).
Exercise B.13 (Equivalence of metrics on $\mathrm{Gr}(k, \mathbb{R}^n)$, take #2). Show that the
metric on $\mathrm{Gr}(k, \mathbb{R}^n)$ induced from the Schatten $p$-norm on $M_n$ via the embedding
$E \mapsto P_E$ is equivalent to the metrics $\tilde{g}_p$ and $\tilde{h}_p$ from Exercise B.12. Show that the
geodesic metric induced by it coincides with $\tilde{g}_p$.
the Dirac mass at E.
Exercise B.16. Does $\mathrm{Gr}(k, \mathbb{R}^n)$ identify with $SO(n)/(SO(k) \times SO(n-k))$? Does
$\mathrm{Gr}(k, \mathbb{C}^n)$ identify with $SU(n)/(SU(k) \times SU(n-k))$?
Exercise B.17 (Another representation of $\mathrm{Gr}(k, \mathbb{R}^n)$ as a coset space). Since
the stabilizer of $\mathbb{R}^k$ under the canonical action of $SO(n)$ on $\mathrm{Gr}(k, \mathbb{R}^n)$ is $H = SO(n) \cap \big(O(k) \times O(n-k)\big)$ (and since the action is transitive), $\mathrm{Gr}(k, \mathbb{R}^n)$ can be likewise
identified with $SO(n)/H$. Are the metrics induced this way by the Schatten $p$-norms
the same as $\tilde{g}_p$’s and $\tilde{h}_p$’s? What about the analogous question for $\mathrm{Gr}(k, \mathbb{C}^n)$? Note
that there are no subtleties as far as the induced probability measure is concerned:
all reasonable constructions lead to the same object by uniqueness of the Haar
measure.
B.5. The Lorentz group $O(1, n-1)$
Just as the orthogonal group $O(n)$ preserves the Euclidean norm on $\mathbb{R}^n$, the
Lorentz group $O(1, n-1)$ consists of linear transformations preserving the quadratic
form $q(x) = x_0^2 - \sum_{k=1}^{n-1} x_k^2$, where $x = (x_0, x_1, \ldots, x_{n-1}) \in \mathbb{R}^n$. Let $J$ be the diagonal
matrix with diagonal entries $(1, -1, \ldots, -1)$, i.e., the matrix inducing $q$ in the sense
either $M(L_n) = L_n$ or $M(L_n) = -L_n$. Again, this motivates the definition of the
orthochronous subgroup of the Lorentz group (the transformations that preserve
$O(1,3) \setminus O^+(1,3)$. When restricted to $SO^+(1,3)$, that identification induces an isomorphism of that group with $PSL(2, \mathbb{C})$, or the so-called spinor map, see Exercise
B.19.
Exercise B.18 (Examples of automorphisms of the Lorentz cone).
(a) Show that $SO^+(1,1) = \Big\{ \begin{pmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{pmatrix} : \theta \in \mathbb{R} \Big\}$.
(b) Deduce that if $c > 0$, then $SO^+(1,1)$ acts transitively on the (branch of the)
hyperbola $\{(x_0, x_1) : x_0 > 0,\ x_0^2 - x_1^2 = c\}$.
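The matrices appearing in Exercise B.18 can be explored numerically. The sketch below builds the hyperbolic rotation with parameter $\theta$, checks that it preserves $q(x) = x_0^2 - x_1^2$, and constructs the element that moves one point of the hyperbola branch to another via rapidities. It is an illustration only, not a solution of the exercise; the function names are hypothetical.

```python
import math

def boost(theta):
    """Hyperbolic rotation [[cosh t, sinh t], [sinh t, cosh t]]."""
    c, s = math.cosh(theta), math.sinh(theta)
    return [[c, s], [s, c]]

def apply(m, x):
    """Matrix-vector product for a 2x2 matrix and a 2-vector."""
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]

def q(x):
    """Quadratic form q(x) = x_0^2 - x_1^2."""
    return x[0] ** 2 - x[1] ** 2

def rapidity(x):
    """Parameter t with x = sqrt(q(x)) * (cosh t, sinh t), for x_0 > |x_1|."""
    return math.atanh(x[1] / x[0])

def connecting_boost(x, y):
    """Boost mapping x to y when both lie on the same branch {x_0 > 0, q = c}."""
    return boost(rapidity(y) - rapidity(x))

if __name__ == "__main__":
    x, y = [2.0, 0.0], [2.5, 1.5]  # both satisfy q = 4 and x_0 > 0
    print(apply(connecting_boost(x, y), x))
```

The construction relies on the addition formulas for $\cosh$ and $\sinh$, which also show that composing two such matrices adds their parameters.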
Exercise B.19 (A spinor map). Let $\Psi(x) = x \cdot \sigma = X$ be the map from (2.4)–(2.5) implementing the isomorphism between the cones $L_4$ and $\mathrm{PSD}(\mathbb{C}^2)$.
(a) Show that if $V \in SL(2, \mathbb{C})$, then $\Psi_V(x) = \Psi^{-1}\big(V\Psi(x)V^\dagger\big)$ is an automorphism
of $L_4$ which belongs to $SO^+(1,3)$, and that every element of $SO^+(1,3)$ can be
represented that way.
(b) Show that the map $SL(2, \mathbb{C}) \ni V \mapsto \Psi_V \in SO^+(1,3)$ is a group homomorphism
whose kernel is $\{\mathrm{Id}, -\mathrm{Id}\}$ and deduce that it induces a group isomorphism between
$PSL(2, \mathbb{C}) = SL(2, \mathbb{C})/\{\mathrm{Id}, -\mathrm{Id}\}$ and $SO^+(1,3)$ (an example of the so-called “spinor
map”).
not the argument, is exactly the same as in the classical Riemannian case ($p = 2$).
If $p \in (1, 2)$, the Riemannian proof can be tweaked as it fits in the framework of
Finsler geometry [CS05]. For $p \in (2, \infty)$, the metric structure induced by the $p$-Schatten norm does not satisfy the usual hypotheses of Finsler geometry and so a
book [GVL13].
APPENDIX C
Extreme maps between Lorentz cones and the S-lemma
The focus of this appendix is the Lorentz cone
$$L_n = \Big\{ (x_0, x_1, \ldots, x_{n-1}) : x_0 \ge 0,\ \sum_{k=1}^{n-1} x_k^2 \le x_0^2 \Big\} \subset \mathbb{R}^n \tag{C.1}$$
and particularly the cone $P(L_n) := \{\Phi : \Phi(L_n) \subset L_n\}$ of linear maps that preserve
it. We have the following
Proposition C.1. Let $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ be a linear map which generates an
extreme ray of $P(L_n)$. Then either $\Phi$ is an automorphism of $L_n$ or $\Phi$ is of rank
one, in which case $\Phi = |u\rangle\langle v|$ for some $u, v \in \partial L_n \setminus \{0\}$. If $n > 2$, the converse
implication also holds.
In view of the isomorphism between the cones $\mathrm{PSD}(\mathbb{C}^2)$ and $L_4$ (see (2.4)),
element of $\partial\mathrm{PSD}(\mathbb{C}^2) \setminus \{0\}$ is of the form $|\varphi\rangle\langle\varphi|$, $\varphi \in \mathbb{C}^2 \setminus \{0\}$, so $|\varphi\rangle\langle\varphi|$ and $|\psi\rangle\langle\psi|$
of Proposition 2.38 play the same role as $u, v$ in Proposition C.1. However, the
true reason why they appear in the statements is that they generate extreme rays
respectively in $\mathrm{PSD}(\mathbb{C}^2)$ and $L_n$ (cf. Corollary 1.10). The following simple observation completely characterizes extreme rays generated by rank one maps in a very
general setting (we only need the “only if” part, which is easy).
Lemma C.2 (see Exercise C.1). Let $C \subset \mathbb{R}^n$ be a nondegenerate cone and let
$P(C)$ be the cone of linear maps preserving $C$. A rank one map $\Phi : \mathbb{R}^n \to \mathbb{R}^n$
generates an extreme ray of $P(C)$ iff it is of the form $\Phi = |u\rangle\langle v|$, with $u$ and $v$
generating extreme rays of respectively $C$ and $C^*$.
While not as simple as the case of rank one maps, the structure of the set
of automorphisms of Ln is very well understood: they are of the form tΦ, where
We postpone the proof of the Lemma until the end of this appendix and show
how it implies the Proposition.
Proof of Proposition C.1. In view of Lemma C.2, we may assume that
$\operatorname{rank} \Phi \ge 2$. Let $J$ be the diagonal matrix with diagonal entries $(1, -1, \ldots, -1)$,
i.e., the matrix inducing $q$ in the sense that $q(x) = \langle x|J|x\rangle$ for $x \in \mathbb{R}^n$. The map
$\Phi$ preserving $L_n$ (and hence $-L_n$) means that the hypothesis (i) of Lemma C.4 is
satisfied with $G = J$ and $F = \Phi^* J \Phi$. Since clearly $-J$ is not positive definite, it
follows that there is $\mu \ge 0$ and a positive semi-definite operator $Q$ such that
$$\Phi^* J \Phi = \mu J + Q. \tag{C.2}$$
We now notice that since $\operatorname{rank} \Phi \ge 2$, there is $y = \Phi x \ne 0$ such that $y_0 = 0$. In
particular, $\langle x|\Phi^* J \Phi|x\rangle = \langle y|J|y\rangle < 0$. Given that $\langle x|Q|x\rangle \ge 0$, it follows that $\mu$
cannot be 0. Next, if $Q = 0$, (C.2) means precisely that $\mu^{-1/2}\Phi \in O(1, n-1)$ and so
$\Phi$ is an automorphism of $L_n$.
To complete the argument, we will show that if $Q \ne 0$, then there is a rank one
operator $\Delta$ such that $\Phi \pm \Delta \in P(L_n)$. Since $\Phi$ and $\Delta$ have different ranks, they
are not proportional. Hence $\Phi + \Delta$ and $\Phi - \Delta$ do not belong to the ray generated
Indeed, suppose that all extreme rays of $P(L_n)$ are generated by rank one
maps. It then follows in particular (see Section 1.2.2) that $\mathrm{Id} = \sum_{i=1}^N |u_i\rangle\langle v_i|$ for
some $u_i, v_i \in \partial L_n$. Since $u, v \in L_n$ implies that $\operatorname{Tr}\big(J|u\rangle\langle v|\big) = \langle v|J|u\rangle \ge 0$, we
obtain
$$-1 \ge 2 - n = \operatorname{Tr} J = \operatorname{Tr}\Big(J \sum_{i=1}^N |u_i\rangle\langle v_i|\Big) = \sum_{i=1}^N \operatorname{Tr}\big(J|u_i\rangle\langle v_i|\big) \ge 0,$$
which yields a desired contradiction. (See Exercise C.4 for the discussion of the
case $n = 2$.)
Proof of Lemma C.3. To show that (i) $\Rightarrow$ (ii), we argue by contradiction. Denote $M_t = (1-t)M + tN$ and assume that, for every $t \in [0,1]$, the smallest eigenvalue $\lambda_t$ of $M_t$ is strictly negative. For $t \in [0,1]$, let
$$\Lambda_t := \{x \in S^{n-1} : M_t x = \lambda_t x\}.$$
Note that $t \mapsto \lambda_t$ is continuous and $t \mapsto \Lambda_t$ is upper semicontinuous, i.e., $t_n \to t$, $x_n \in \Lambda_{t_n}$ and $x_n \to x$ imply $x \in \Lambda_t$, and of course all $\Lambda_t \ne \emptyset$.
Consider the sets $A = \{x \in \mathbb{R}^n : \langle x|M|x\rangle \ge 0\}$ and $B = \{x \in \mathbb{R}^n : \langle x|N|x\rangle \ge 0\}$. We have $A \cup B = \mathbb{R}^n$ by hypothesis. Since $M_0 = M$, it follows that $\Lambda_0 \cap A = \emptyset$ and so $\Lambda_0 \subset B$. Similarly, $\Lambda_1 \subset A$. Set
$$\tau = \sup\{t \in [0,1] : \Lambda_t \cap B \ne \emptyset\}.$$
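We now note that $\Lambda_\tau \cap B \ne \emptyset$; this is immediate if $\tau = 0$ and follows from upper
simple (note that all three sets, $\Lambda_\tau$, $A$ and $B$, are symmetric by definition). On the other hand, if the multiplicity of $\lambda_\tau$ equals $k > 1$, then $\Lambda_\tau$ is a $(k-1)$-dimensional sphere and hence is connected. Consequently, the closed nonempty sets $\Lambda_\tau \cap A$ and $\Lambda_\tau \cap B$, the union of which is $\Lambda_\tau$, must have a nonempty intersection.
To conclude the argument, choose $x \in \Lambda_\tau \cap A \cap B \ne \emptyset$. Then, since $x \in A \cap B$, we have $\langle x|M_\tau|x\rangle = (1-\tau)\langle x|M|x\rangle + \tau\langle x|N|x\rangle \ge 0$. On the other hand, since $x \in \Lambda_\tau$,
$$\langle x|M_\tau|x\rangle = \lambda_\tau < 0,$$
a contradiction.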
Exercise C.2. Use the S-lemma to show that every linear automorphism of $L_n$ is of the form $t\Phi$, where $t > 0$ and $\Phi \in O^+(1,n-1)$. In other words, there exists $t > 0$ such that $\langle x|\Phi^* J \Phi|x\rangle = t^2\langle x|J|x\rangle$ for all $x \in \mathbb{R}^n$.
Exercise C.3. Show that maps of the form $t\Phi$, where $t > 0$ and $\Phi \in SO^+(1,n-1)$, act transitively on the interior of $L_n$.
Exercise C.4. Show that all extreme rays of the cone $P(L_2)$ consist of maps of rank one.
324 C. EXTREME MAPS BETWEEN LORENTZ CONES AND THE S-LEMMA
was similar to, but simpler than [LS75]; all proofs seem to use either a variant of the S-lemma (Lemma C.4) or closely related facts. The papers [Hil05, Hil07b] actually characterize (for any $m, n \ge 2$) extreme rays of maps that satisfy $\Phi(L_m) \subset L_n$, but this slightly more general fact is easy to derive from Proposition C.1 combined with (for example) Exercise C.3.
APPENDIX D
The goal of this appendix is to explore the dependence of polarity on translation, which is otherwise not very transparent, by exploiting the duality of cones. We believe that this approach deserves to be better known. Besides recovering the characterization of the Santaló point of a convex body, we are able to easily explain other somewhat mysterious facts such as, for example, the polar of an ellipsoid with respect to any interior point being also an ellipsoid.
We start with a reformulation of Lemma 1.6 from Section 1.2.1 in a manner not appealing to the concept of scalar product. Let $V$ be a real vector space and $V^*$ its dual. To make the analogy with Lemma 1.6 more apparent, we will write $\langle x^*, x\rangle$ for the evaluation $x^*(x)$ whenever $x \in V$ and $x^* \in V^*$. If $C \subset V$ is a closed convex cone, the dual cone $C^* \subset V^*$ is now defined by (cf. (1.18))
(D.1) $C^* := \{x \in V^* : \forall\, y \in C \ \ \langle x, y\rangle \ge 0\}.$
We then have
$(C^*)^b = C^* \cap H_e$ be the corresponding bases of $C$ and $C^*$. Then
a vector space with the origin at $e^*$ and as a dual of $H_{e^*}$, and of $C^b$ and $(C^*)^b$ as their respective subsets, then $(C^*)^b = -(C^b)^\circ$.
The proof of Lemma D.1 fully parallels that of Lemma 1.6 and so we relegate it to Exercise D.1.
The formula in (D.2) suggests a definition of polarity in the affine context that is slightly different from the one usually used. Namely, if $K$ and $L$ are (say, closed and convex) subsets of two affine spaces whose underlying vector spaces are dual
326 D. POLARITY AND THE SANTALÓ POINT VIA DUALITY OF CONES
[Figure D.1 (two drawings; surviving labels: $C$, $K$, $e$, $a$, $0$, $H_{e^*}$ and $C^*$, $-(K-e)^\circ$, $-(K-a)^\circ$, $e^*$, $0$, $H_e$, $H_a$).]
Figure D.1. If $K$ is a base of $C$, then the polars of $K$ with respect to different points (defined in the way implicit in Lemma D.1) correspond to different sections of the cone $C^*$. It is possible to superimpose the two pictures and even to assume that $e = e^*$, but that obscures the dependence of $(K-a)^\circ$ on $a$.
separately represented in two copies of $\mathbb{R}^{n+1}$ with $e$ and $e^*$ being two copies of $e_0$. (Note that while necessarily $e_0 \in C^*$, it is a priori possible that $e_0 \notin C$.)
Such an approach has a number of nice immediate consequences, for example the fact that the polar of a not-necessarily-centered ellipsoid is an ellipsoid as long as 0 is an interior point (see Exercise D.3). Note, however, that we cannot directly compare (say, the volumes of) $(K-a)^\circ$ for different values of $a$ since they do not live in the same hyperplane of $\mathbb{R}^{n+1}$. However, a simple trick permits such comparisons (cf. the comments following Theorem 4.17 in Section 4.3.4).
The point s appearing in the statement of Proposition D.2 is called the Santaló
point of K.
dimensional and compact), the cones $C$ and $C^*$ are both nondegenerate and $e_0$ is an interior point of $C^*$ (see Lemma 1.7 and Exercise 1.32). We now consider the following auxiliary optimization problem: among the solid cones of the form
$$T_a = \{x \in C^* : \langle x, a\rangle \le 1\},$$
where $a$ varies over the interior of $K$, find one for which $\mathrm{vol}_{n+1}(T_a)$ is the smallest. Note that the restrictions on $a$ ensure that each $T_a$ is indeed a (bounded) solid cone with the base $\{x \in C^* : \langle x, a\rangle = 1\} =: B_a$ (this happens whenever $a$ belongs to the interior of $C$) and that $e_0$ belongs to $B_a$ (this happens whenever $a \in C \cap H$). The sets $T_a$ and $B_a$ are pictured in the first drawing in Figure D.2.
It is easy to see that $\inf_a \mathrm{vol}_{n+1}(T_a) > 0$, and that both the diameter and the volume of $T_a$ tend to $+\infty$ as $a \to \partial K$. Since $\mathrm{vol}_{n+1}(T_a)$ is a continuous function of $a$, this implies that the infimum is attained. On the other hand, if $a \mapsto \mathrm{vol}_{n+1}(T_a)$ has a local extremum at $s$, then an elementary variational argument shows that $e_0$ is the centroid of $B_s$ (see Exercise D.4), which, according to Lemma D.1, is affinely equivalent to $(K-s)^\circ$, the polar of $K-s$ inside $H$. More precisely, one sees directly from (D.2) that, for every $a$, $(K-a)^\circ$ is (up to a reflection with respect to $e_0$) the orthogonal projection of $B_a$ onto $H$, as pictured in the second drawing in Figure D.2.
[Figure D.2 (two drawings; surviving labels: $H_a$, $C^*$, $\alpha$, solid cone $T_a$, $a$, $0$, $e_0$, $B_a$, $P_H B_a = -(K-a)^\circ$, $H$).]
Figure D.2. The first drawing illustrates the calculation (D.3) of the volume of the solid cone $T_a$. The second drawing illustrates the calculation (D.4) of the volume of $(K-a)^\circ$, the polar of $K-a$ constructed inside $H$. The minus sign in front of $(K-a)^\circ$ indicates a reflection inside $H$ with respect to $e_0$.
Now comes a simple but crucial observation (illustrated in the two drawings of Figure D.2). On the one hand,
(D.3) $\mathrm{vol}_{n+1}(T_a) = \frac{1}{n+1}\,\mathrm{vol}_n(B_a) \times \frac{1}{|a|}$
because $\frac{1}{|a|}$ equals the cosine of the angle between $a$ and $e_0$ (denoted by $\alpha$), and hence is the same as the height of the cone $T_a$. On the other hand, since $(K-a)^\circ$,
|a|
This shows that $\mathrm{vol}_{n+1}(T_a)$ and $\mathrm{vol}_n((K-a)^\circ)$ differ only by a factor independent of $a$, and so they achieve their minima simultaneously. This concludes the argument, except for the uniqueness part (which is easy, see Exercise D.2).
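The volume comparison can be written out in one chain. This is a sketch reconstructed from the surrounding argument rather than quoted from the source: it assumes that $H = \{x : \langle x, e_0\rangle = 1\}$, so that $a \in H$ gives $\cos\alpha = \langle a, e_0\rangle/|a| = 1/|a|$, and that orthogonal projection from the hyperplane containing $B_a$ onto $H$ scales $n$-dimensional volume by $\cos\alpha$.

```latex
\begin{align*}
\mathrm{vol}_{n+1}(T_a)
  &= \frac{1}{n+1}\,\mathrm{vol}_n(B_a)\cdot\frac{1}{|a|}
  && \text{(cone with base $B_a$ and height $1/|a|$)},\\
\mathrm{vol}_n\bigl((K-a)^\circ\bigr)
  &= \mathrm{vol}_n(P_H B_a)
   = \cos\alpha\cdot\mathrm{vol}_n(B_a)
   = \frac{\mathrm{vol}_n(B_a)}{|a|}
  && \text{(projection onto $H$)},\\
\mathrm{vol}_{n+1}(T_a)
  &= \frac{1}{n+1}\,\mathrm{vol}_n\bigl((K-a)^\circ\bigr).
\end{align*}
```

Under these assumptions the factor relating the two volumes is $1/(n+1)$, which does not depend on $a$.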
translate is 0-symmetric. Give a proof that does not use the uniqueness part of
Proposition D.2.
Exercise D.4. Show that if (in the notation from the proof of Proposition D.2) the function $a \mapsto \mathrm{vol}_{n+1}(T_a)$ has a local extremum at $b \in K$, then $e_0$ is the centroid of $B_b$.
APPENDIX E
Hints to exercises
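Exercise 0.2. We may write $|x_1 \otimes x_2 + y_1 \otimes y_2\rangle\langle x_1 \otimes x_2 + y_1 \otimes y_2|$ as
$$|x_1\rangle\langle x_1| \otimes |x_2\rangle\langle x_2| + |y_1\rangle\langle y_1| \otimes |y_2\rangle\langle y_2| + \frac{1}{4}\sum_{k=0}^{3} (-1)^k\, |x_1 + i^k y_1\rangle\langle x_1 + i^k y_1| \otimes |x_2 + i^k y_2\rangle\langle x_2 + i^k y_2|.$$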
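The identity in this hint can be checked numerically on concrete vectors. The sketch below uses plain Python lists of complex numbers; all function names are illustrative assumptions, and the bra is taken to be conjugate-linear, as in the bra-ket convention of Section 0.3.

```python
def kron_vec(a, b):
    """Coordinates of a (x) b, ordered as a_i * b_k at index i * len(b) + k."""
    return [x * y for x in a for y in b]


def outer(a, b):
    """Matrix of the operator |a><b| (the bra is conjugate-linear)."""
    return [[x * complex(y).conjugate() for y in b] for x in a]


def kron_mat(A, B):
    """Kronecker product of two matrices, with the same index ordering as kron_vec."""
    return [
        [A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
        for i in range(len(A))
        for k in range(len(B))
    ]


def mat_add(A, B, scale=1.0):
    """Return A + scale * B entrywise."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def lhs(x1, x2, y1, y2):
    """|x1 (x) x2 + y1 (x) y2><x1 (x) x2 + y1 (x) y2|."""
    v = [s + t for s, t in zip(kron_vec(x1, x2), kron_vec(y1, y2))]
    return outer(v, v)


def rhs(x1, x2, y1, y2):
    """Right-hand side of the identity: two product terms plus the sum over k."""
    total = mat_add(kron_mat(outer(x1, x1), outer(x2, x2)),
                    kron_mat(outer(y1, y1), outer(y2, y2)))
    for k in range(4):
        w = 1j ** k
        a = [s + w * t for s, t in zip(x1, y1)]
        b = [s + w * t for s, t in zip(x2, y2)]
        total = mat_add(total, kron_mat(outer(a, a), outer(b, b)),
                        scale=(-1) ** k / 4)
    return total


def max_abs_diff(A, B):
    """Largest entrywise discrepancy between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
```

Every term on the right-hand side is a signed product operator, which is the point of the decomposition.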
Chapter 1
∆n ˆ An`1 .
Exercise 1.3. By the Hahn–Banach theorem, any boundary point of a convex
face.
Exercise 1.4. If y P Lztxu, then x is an interior point of some segment ry, zs with
z P L.
Exercise 1.5. (a) and (b) follow fairly directly from the definitions. (c) Consider
E.1), or of a disk and a square. For the second assertion, appeal to part (c).
(e) Let L be the Minkowski sum of an n-dimensional cube and B2n . Consider a
330 E. HINTS TO EXERCISES
hyperplane supporting to K which is parallel to one of the facets of the cube, and
let F be the corresponding exposed facet. Show that F is a translate of that facet
(hence an pn ´ 1q-dimensional cube) and consider any k-dimensional face of F . (f)
For sufficiency, use part (a) and part (b). For necessity, use part (c) to argue by
induction with respect to the dimension. (g) Assume K is full-dimensional. If a
supporting hyperplane does not isolate a point, then the boundary of K contains
a segment.
Exercise 1.6. Proceed by induction with respect to $\dim K$. The base cases (dimension 0 or 1) are simple. For the inductive step, let $x \in K$ and assume first that $x$ belongs to the relative interior of $K$. Next, note that every convex body admits at least one extreme point (for example, the smallest element with respect to the lexicographic order) and let $y$ be one such point. There is a (unique) point $z \in \partial K$ (the relative boundary) such that $x$ belongs to the segment $[z, y]$. Let $H$ be a supporting hyperplane for $K$ which contains $z$. We may apply the induction hypothesis to $K \cap H$ and produce a decomposition of $z$ as a convex combination of extreme points of $K \cap H$ (hence of $K$, by Exercise 1.5(b)). Finally, if $x \in \partial K$, we may perform the dimension reduction immediately.
Exercise 1.7. For necessity, appeal to the spectral theorem. For sufficiency, use the following fact: If $\rho_1, \rho_2$ are positive operators and $\rho = \rho_1 + \rho_2$, then the range of $\rho$ contains the ranges of $\rho_1$ and $\rho_2$. Alternatively, note that either all rank one projections $|\psi\rangle\langle\psi|$ are extreme or none of them is, and appeal to the Krein-Milman Theorem 1.3. (See also Section 2.1.3.)
Exercise 1.10. The extreme points of $B_1^n$ are the $n$ vectors from the canonical basis in $\mathbb{R}^n$ and their opposites. The extreme points of $B_\infty^n$ are elements of $\{-1, 1\}^n$. For $1 < p < +\infty$, any boundary point is extreme (to show this, use the fact that the function $x \mapsto |x|^p$ is strictly convex; another “high level” argument is given in Exercise 1.11(iv)).
Exercise 1.11. (i) We may assume $0 < b \le a$. Check that
$$\frac{d}{dt}\bigl(\alpha(t)a^p + \alpha(1/t)b^p\bigr) = (p-1)\bigl(a^p - b^p/t^p\bigr)\bigl((1+t)^{p-2} - |1-t|^{p-2}\bigr)$$
so that the maximum is achieved for $t = b/a \le 1$. (ii) Use (i) and the inequality $\sum_{i=1}^n \sup_{t>0}\{\cdots\} \ge \sup_{t>0} \sum_{i=1}^n \{\cdots\}$. For $p \ge 2$, the proof goes along the same lines except that the supremum in the variational formula is replaced by an infimum. (iii) To deduce (1.6) from (1.5), use the following inequalities
(E.1) $\bigl(1 + (p-1)t^2\bigr)^{p/2} \le 1 + \frac{p(p-1)}{2}\,t^2 \le \frac{(1+t)^p + (1-t)^p}{2},$
valid for $t \in [-1, 1]$ and applied with $t = \|y\|_p/\|x\|_p$ (we may assume that $\|y\|_p \le \|x\|_p$). The second inequality in (E.1) can be proved by a Taylor expansion of the right-hand side. (iv) For $p \ge 2$, use (ii). For $p \le 2$, use (iii) applied to the pair $(x+y, x-y)$.
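The two-sided estimate (E.1) is easy to probe on a grid. The sketch below assumes that part (iii) concerns the range $1 < p \le 2$ (the range is not restated in this hint); the function names and the tolerance are illustrative choices.

```python
def e1_terms(p, t):
    """Return the left, middle and right expressions of (E.1) at exponent p and point t."""
    left = (1 + (p - 1) * t * t) ** (p / 2)
    middle = 1 + p * (p - 1) / 2 * t * t
    right = ((1 + t) ** p + (1 - t) ** p) / 2
    return left, middle, right


def check_e1(ps, steps=200, tol=1e-12):
    """Check left <= middle <= right on a uniform grid of t in [-1, 1] for each p in ps."""
    for p in ps:
        for j in range(-steps, steps + 1):
            t = j / steps
            left, middle, right = e1_terms(p, t)
            if not (left <= middle + tol and middle <= right + tol):
                return False
    return True
```

For $p > 2$ the first inequality reverses (the map $u \mapsto (1+u)^{p/2}$ becomes convex), so the check fails there, consistent with the case distinction in part (iv).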
Exercise 1.12. We may assume that $K$ contains the origin in its interior. One possibility is to define $\Theta(x) = \Theta_K(x)$ as the (unique) element of minimal Euclidean norm in the set (denoted $F$) of points where $\langle\cdot, x\rangle$ is maximal on $K$. To see that this choice is Borel, define a sequence $(K_m)$ of convex bodies approximating $K$ from the inside by the relation $\|\cdot\|_{K_m} = \|\cdot\|_K + \frac{1}{m}|\cdot|$. One checks that (i) for each $x \in \mathbb{R}^n \setminus \{0\}$, the linear form $\langle\cdot, x\rangle$ achieves its maximum on $K_m$ at a unique point, denoted $\phi_m(x)$, (ii) for each $m$, the map $\phi_m$ is continuous, and (iii) the sequence $(\phi_m)$ converges pointwise to $\Theta_K$. To see the last point, write $\phi_m(x) = (1 + |x_m|/m)^{-1} x_m$ for some $x_m \in \partial K$. If $y \in F$, then (by definition of $\phi_m$) we have
$$(1 + |y|/m)^{-1}\langle y, x\rangle \le (1 + |x_m|/m)^{-1}\langle x_m, x\rangle \le (1 + |x_m|/m)^{-1}\langle y, x\rangle,$$
which implies that $|\phi_m(x)| < |x_m| \le |y|$. Deduce that $(\phi_m(x))$ must converge to the point of minimal Euclidean norm in $F$.
Exercise 1.13. If $h, h' : \mathbb{R}^n \to \mathbb{R}_+$ are positively homogeneous, then $\{h \le 1\} = \{h' \le 1\}$ implies $h = h'$. What may fail here is that $\sup_{x \in A}\langle x, \cdot\rangle$ may be negative.
Exercise 1.14. Use the fact that $K \subset R B_2^n \iff R^{-1} B_2^n \subset K^\circ$.
Exercise 1.15. $(K^\circ)^\circ$ is a closed convex set containing both $K$ and 0, so one inclusion is clear. For the other inclusion, argue by contradiction using the Hahn–Banach separation theorem.
Exercise 1.17. If $K$ does not contain 0 in the interior, then $\|\cdot\|_K$ takes the value $+\infty$ which forbids the application of the Hahn–Banach theorem. For an illustration of the importance of the assumptions consider $K = L^\circ$, where $L = \{(x, y) \in \mathbb{R}^2 : (2-y)(2-x) \ge 1,\ x < 2\}$.
Exercise 1.18. (1.14) is simple and (1.15) can be deduced from it using the bipolar
shows that taking the closure is needed (it is clearly not needed if K ˝ and L˝ are
both compact).
Exercise 1.19. Let K “ convtV u Ă Rn containing 0 in the interior with V finite.
For any extreme point x P K ˝ there is a subset U Ă V such that span U “ Rn and
show that if $F_1, F_2$ are exposed faces of $K$ with $F_2 \not\subset F_1$ and $F_1 = H_{y_0} \cap K$, then $y_0 \in \nu_K(F_1) \setminus \nu_K(F_2)$. With regard to the last property, $F \subset \nu_{K^\circ}\nu_K(F)$ is easy; if we had a strict inclusion, injectivity and order reversing would imply the strict inclusion $\nu_K\bigl(\nu_{K^\circ}(\nu_K(F))\bigr) \subsetneq \nu_K(F)$, which is a contradiction since we just noted that the reverse inclusion always holds.
Exercise 1.21. Show that the interior of $K$ is disjoint from $F_y$ (always) and that, under our hypotheses, $F_y \ne \emptyset$. Deduce that $H_y = \{x : \langle y, x\rangle = 1\}$ is a supporting hyperplane and $F_y$ an exposed face. For the second statement, if $F$ is a maximal exposed face of $K$, show that $F$ coincides with $F_y$, where $y$ is an extreme point of $\nu_K(F)$ (appeal to the Krein-Milman theorem and use maximality).
The same argument works in the general case with the caveat that, for some $y$, the set $F_y$ (as defined by (1.17)) may be empty.
Exercise 1.23. The polars of the examples from Exercise 1.5(d) will work.
Exercise 1.24. Consider $K = \bigl(B_1^2 \cap \{x \le 0\}\bigr) \cup \bigl(B_2^2 \cap \{x \ge 0\}\bigr)$, where $x$ is the first coordinate in $\mathbb{R}^2$.
Exercise 1.25. If $a_1 \le \cdots \le a_{2n-1}$ denote the principal semi-axes of $\mathcal{E}$, produce an $n$-dimensional section which is a Euclidean ball of radius $a_n$ by pairing each small semi-axis ($a_k$ for $k < n$) with a large semi-axis ($a_k$ for $k > n$).
Exercise 1.28. If $e \in C^*$ is such that the functional $\langle e, \cdot\rangle$ doesn't vanish identically on $C$, then it doesn't vanish identically on the relative interior of $C$ and so, by Proposition 1.4 (applied with $K = \mathbb{R}_+$ and $F = \{0\}$), $\langle e, \cdot\rangle$ is strictly positive on the relative interior of $C$. Show that this implies that the relative interior of $C$ is contained in $\mathbb{R}_+ C^b$ and deduce the assertion. For an example where closure is needed, take $C = \mathbb{R}_+^2$ and $e = (1, 0)$.
Exercise 1.29. By the bipolar theorem, $C^* = C^\perp \iff C = (C^\perp)^* = \operatorname{span}(C)$, so whenever $C$ is not a linear subspace, any vector $e \in C^* \setminus C^\perp$ induces a base.
Exercise 1.30. Try $C_1 = \{(x, y, z) \in \mathbb{R}^3 : x \ge 0,\ y \ge 0,\ z \ge 0,\ xy \ge z^2\}$ and $C_2 = \mathbb{R}_- \times \{0\} \times \{0\}$.
Exercise 1.31. We may assume after rotation that $y = t e_0$ with $t > 1$. Note that $C_{\sqrt{2} e_0}$ is the Lorentz cone $L_n$ and is therefore self-dual. For the general case, define a linear map $T_\lambda$ by $T_\lambda y = \lambda y$ and $T_\lambda x = x$ for $x \perp y$. For $\lambda = \sqrt{t^2 - 1}$, we have $C_{te_0} = T_\lambda L_n$ and therefore $C_{te_0}^* = (T_\lambda^{-1})^T L_n = T_{1/\lambda} L_n = C_{\sqrt{1 + 1/\lambda^2}\, e_0} = C_{u e_0}$ for $u = t/\sqrt{t^2 - 1}$.
Exercise 1.32. Prove for example that (a) $\Rightarrow$ (c) $\Rightarrow$ (e) $\Rightarrow$ (f) $\Rightarrow$ (b) $\Rightarrow$ (g) $\Rightarrow$ (d) $\Rightarrow$ (a). The first implication is straightforward, the next two are Corollary 1.8. Other implications are simple. If $C$ has a compact base, Lemma 1.6 implies that $C^*$ has an $(n-1)$-dimensional base, so $\dim C^* = n$. If $C$ contains a line $L$, then $C^* \subset L^\perp$ and $\operatorname{span} C^* \subset L^\perp$.
Exercise 1.32.
Exercise 1.34. If $x \in C$, then the map $\phi(z) = \langle z, x\rangle$ verifies $\phi(C^*) \subset \mathbb{R}_+$ and $\{0\}$ is a face of $\mathbb{R}_+$.
Exercise 1.35. Show that if $F'$ is a face of $C$ then $\mathbb{R}_+ F' = F'$. Deduce that
Exercise 1.36. Let $y \in C_1 \cap C_2$ define the common isolating hyperplane. The proof of the implication (a) $\Rightarrow$ (d) from Exercise 1.32 shows then that the corresponding bases of $C_1^*$ and $C_2^*$ are compact and hence so is their convex hull, which generates $C_1^* + C_2^*$ (cf. (1.23)).
Exercise 1.37. In (ii)–(iv) this is easy; in (v) consider $t$ tending to $+\infty$ and to $-\infty$.
Exercise 1.38. The “if” direction is easy. For “only if,” use induction on $n$. Let $x, y \in \mathbb{R}^n$ such that $x \prec_w y$. Assume for notational simplicity that $x = x^\downarrow$, $y = y^\downarrow$. Let $\delta = \min\{y_1 + \cdots + y_k - (x_1 + \cdots + x_k) : 1 \le k \le n\}$ ($\ge 0$) with the minimum achieved for $k = k_0$. Show that $(x_1 + \delta, x_2, \dots, x_{k_0}) \prec (y_1, \dots, y_{k_0})$ and (if $k_0 < n$) apply the induction hypothesis to the vectors $(x_{k_0+1}, \dots, x_n)$ and $(y_{k_0+1}, \dots, y_n)$.
Exercise 1.39. The statement and the proof are the same (simply replace every-
where “unitary” by “orthogonal”).
Exercises 1.40 and 1.41. Apply Proposition 1.15.
Exercise 1.42. We may check strict concavity on lines; for $A$ positive definite and $B \ne 0$ self-adjoint, we have $\log\det(A + tB) = \log\det(A) + \sum_i \log(1 + t\lambda_i)$ where $(\lambda_i)$ are the eigenvalues of $A^{-1/2} B A^{-1/2}$, which is strictly concave wherever it is defined. Alternatively, use Klein’s lemma and analyze the proof for equality conditions.
Exercise 1.43. This follows from the fact that, for any $X \in M_n$, $\operatorname{diag} X = \operatorname{Ave}_v D_v X D_v$, where $v$ varies over $\{-1, 1\}^n$ endowed with normalized counting measure and $D_v$ denotes the diagonal matrix made from the coordinates of a vector $v$.
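The averaging identity can be seen at work on a small matrix. The following sketch in plain Python (the function name is an illustrative assumption) averages $D_v X D_v$ over all $2^n$ sign vectors; off-diagonal entries pick up the factor $v_i v_j$, which averages to zero.

```python
from itertools import product


def diagonal_by_sign_averaging(X):
    """Average D_v X D_v over all sign vectors v in {-1, 1}^n."""
    n = len(X)
    acc = [[0.0] * n for _ in range(n)]
    for v in product((-1, 1), repeat=n):
        for i in range(n):
            for j in range(n):
                # (D_v X D_v)_{ij} = v_i * X_{ij} * v_j
                acc[i][j] += v[i] * X[i][j] * v[j]
    count = 2 ** n
    return [[acc[i][j] / count for j in range(n)] for i in range(n)]
```

Since each $X \mapsto D_v X D_v$ is a conjugation by an orthogonal matrix, this exhibits $\operatorname{diag} X$ as a convex combination of matrices unitarily equivalent to $X$.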
Exercise 1.44. Extreme points of $S_1^{m,n}$ are of the form $|x\rangle\langle y|$, where $x$ and $y$ are unit vectors. Similarly, extreme points of $S_1^{m,sa}$ are of the form $\pm|x\rangle\langle x|$. Extreme points of $S_\infty^{m,n}$ are (if, say, $m \ge n$) the isometric embeddings of $\mathbb{R}^n$ into $\mathbb{R}^m$, in particular, for $m = n$, orthogonal matrices (resp., $\mathbb{C}^n$ into $\mathbb{C}^m$, unitary matrices). Extreme points of $S_\infty^{m,sa}$ are reflections; the set they form has $m + 1$ connected components (eigenvalues are $\pm 1$), each of which can be identified with the Grassmann manifold $\mathrm{Gr}(k, \mathbb{R}^m)$ for the appropriate $k \in \{0, 1, \dots, m\}$.
Exercise 1.45. If $X \in K = S_\infty^n$ (real or complex case), let $X = U \Sigma V^\dagger$ be the singular value decomposition with $U, V \in O(n)$ (or $U(n)$ in the complex case) and $\Sigma$ a diagonal matrix with diagonal entries belonging to $[0, 1]$. Consider the diagonal of $\Sigma$ as an element of $B_\infty^n$ and apply Exercise 1.10 and Carathéodory’s theorem in $\mathbb{R}^n$. Other
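Exercise 1.46. The set $S_1^{2,sa}$ is a cylinder (whose base is the real version of the Bloch ball) and the set $S_\infty^{2,sa}$ is a double-cone over a disk (see Figure E.2).
[Figure E.2 (surviving labels: $-|0\rangle\langle 0|$, $|1\rangle\langle 1|$, $-\mathrm{I}$, $\mathrm{I}$, $-|1\rangle\langle 1|$, $|0\rangle\langle 0|$, $|0\rangle\langle 0| - |1\rangle\langle 1|$).]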
Exercise 1.47. The delicate point is the triangle inequality. For $M, N \in M_{m,n}$, consider $\tilde{M}, \tilde{N} \in M^{sa}_{m+n}$ as in Lemma 1.13. By mimicking the proof of Proposition 1.15, we obtain $\operatorname{spec}(\tilde{M} + \tilde{N}) \prec \operatorname{spec}(\tilde{M}) + \operatorname{spec}(\tilde{N})$, and therefore $s(M + N) \prec_w s(M) + s(N)$. Using the result from Exercise 1.38, this implies that $\|s(M + N)\| \le \|s(M) + s(N)\| \le \|s(M)\| + \|s(N)\|$. For the second statement (and, say, $m = n$) consider the restriction of the norm to diagonal matrices with real entries.
Exercise 1.48. Mimic the explanation of the equality in (1.34), or use the bipolar
theorem.
Exercise 1.50. Use Exercise 1.43. For the second statement, analyze the proofs
for equality conditions.
Exercise 1.51. Since $S_p(\sigma) = H_p\bigl(\operatorname{spec}(\sigma)\bigr)$, it is enough to settle the commutative case. Calculate the derivative $\frac{d}{dp} H_p(q)$ and show that it equals $-\sum_i r_i \log(r_i/q_i)$ for some classical state $r = (r_i)$ (depending on $p$). The quantity $\sum_i r_i \log(r_i/q_i)$ is called the Kullback–Leibler divergence (or relative entropy) between $r$ and $q$ and is always nonnegative by concavity of the logarithm.
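The nonnegativity of the relative entropy is easy to observe numerically. This is a minimal sketch with natural logarithms; the function name is an illustrative assumption, and terms with $r_i = 0$ are dropped under the usual convention $0 \log 0 = 0$.

```python
import math


def kl_divergence(r, q):
    """Kullback-Leibler divergence sum_i r_i log(r_i / q_i) of probability vectors r, q.

    Assumes q_i > 0 wherever r_i > 0; terms with r_i == 0 contribute nothing.
    """
    return sum(ri * math.log(ri / qi) for ri, qi in zip(r, q) if ri > 0)
```

By Jensen's inequality for the concave logarithm, $-\sum_i r_i \log(q_i/r_i) \ge -\log \sum_i q_i \ge 0$, with equality when $r = q$.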
Chapter 2
Exercise 2.1. Boundary states are states having 0 in their spectrum.
Exercise 2.2. Prove the statement by induction on $d$. Use the intermediate value theorem to show that the operator $\rho - \frac{1}{d}|\psi\rangle\langle\psi|$ is on the boundary of the PSD cone for some unit vector $\psi$.
Exercise 2.3. Use (2.6).
Exercise 2.4. (i) Each σa is a self-adjoint isometry, so its eigenvalues are ˘1.
The assertion also follows formally from Exercise 2.3. (ii) It is enough to verify
directly just one of the rules; the remaining ones follow then via simple algebra by
repeatedly using (i).
Exercise 2.5. Hyperplanes in H1 are described by the equation TrpA ¨ q “ t for
some A P Msa d which is not a multiple of the identity (and which can be assumed
to be of trace 0) and some t P R. For such A P Msa d , we first note that
Exercise 2.6. This is an immediate consequence of the fact that DpC2 q is linearly
isometric to the unit ball of R3 (see Section 2.1.2).
are of the form rψs ÞÑ rOψs for some O P Opnq. This can be proved by induction
on n since the set of points at largest distance from rψs identifies with Ppψ K q.
Exercise 2.9. The hypothesis implies that the matrix of ρ in any orthonormal
basis has real entries. Since this property remains true when one multiplies each
basis element by a complex number with modulus 1, it follows that the matrix of
ρ in any orthonormal basis is diagonal, and therefore ρ “ ρ˚ .
Exercise 2.10. For (i), work in the affine hyperplane of trace one self-adjoint operators, whose real dimension is $d^4 - 1$. For (ii), let $\mathrm{Seg} \subset S_{\mathbb{C}^d \otimes \mathbb{C}^d}$ be the set of product unit vectors (see (B.6)) and consider the map $\Psi : \Delta_{k-1} \times \mathrm{Seg}^k \to \mathrm{Sep}$ defined as $\Psi(\lambda, \psi_1, \dots, \psi_k) = \sum \lambda_i |\psi_i\rangle\langle\psi_i|$. Then prove that a necessary condition for
the set of states of the form λ|00yx00| ` p1 ´ λq|11yx11| for λ P r0, 1s. This set is a
1-dimensional face. Deduce the case of arbitrary d1 , d2 .
Exercise 2.12. Expand all the objects with respect to the canonical bases, i.e.,
|iy, |iy b |jy, |iyxj| etc., as appropriate.
Exercise 2.13. Verify that $\operatorname{dist}(\psi, \mathrm{Seg})^2 = 2 - 2\lambda_1(\psi)$ and note that $\lambda_1(\psi)$ is minimal when $\psi$ is a maximally entangled vector.
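Exercise 2.14. The statement about the antisymmetric space follows from the relation $\mathrm{Asym}_d = \{\psi - F(\psi) : \psi \in \mathbb{C}^d \otimes \mathbb{C}^d\}$. For the symmetric space, what is clear is that $\mathrm{Sym}_d = \{\psi + F(\psi) : \psi \in \mathbb{C}^d \otimes \mathbb{C}^d\} = \operatorname{span}\{x \otimes y + y \otimes x\}$; then use the polarization formula $x \otimes y + y \otimes x = \frac{1}{2}(x + y)^{\otimes 2} - \frac{1}{2}(x - y)^{\otimes 2}$.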
Exercise 2.15. (i) Write $P_E \otimes P_E$ as
$$\frac{1}{4}\Bigl[(P_E + P_{E^\perp})^{\otimes 2} + (P_E + iP_{E^\perp})^{\otimes 2} + (P_E - P_{E^\perp})^{\otimes 2} + (P_E - iP_{E^\perp})^{\otimes 2}\Bigr].$$
(ii) By Exercise 2.14, there are unit vectors x, y P Cd such that xϕ, x b xy ‰ 0 and
that necessarily dim E “ dim E 1 “ 2. Let W P Updq be such that E 1 “ W E and use
V “ pW b W qpPE b PE q. As before, V P A by (i). To verify that xϕ|V |ψy ‰ 0 use
the fact that pPE b PE qϕ, χ are all collinear (since dim Asym2 “ 1) and nonzero,
and similarly for V ϕ, pW b W qχ, χ1 . (iv) First, by Exercise 2.14, both Symd and
Asymd are invariant under the U b U action of Updq and hence A -invariant. To
show that they are A -irreducible (and hence “U b U -irreducible”), prove and use
the following.
A semigroup A Ă BpHq acts irreducibly on H if and only if for any ϕ, ψ P Hzt0u
the fact that V U is Haar-distributed for any fixed V P Updq. (iii) Apply (ii) to
ρ “ |x b xyxx b x|, where x is a fixed unit vector in Cd .
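Exercise 2.17. Convexity is easy. If $\rho = \sum \lambda_i \sigma_i \otimes \tau_i$ is separable (with $\lambda_i > 0$, $\sigma_i \in D(\mathcal{H}_1)$ and $\tau_i \in D(\mathcal{H}_2)$), then $\sum \lambda_i \sigma_i \otimes \tau_i^{\otimes l}$ is an $l$-extension of $\rho$. If $\rho_k$ is a $k$-extension of $\rho$ and $l < k$, taking partial trace over $k - l$ copies of $\mathcal{H}_2$ gives an $l$-extension.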
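Exercise 2.18. (i) Write $\rho = \sum \lambda_i |\chi_i\rangle\langle\chi_i|$ for $\lambda_i > 0$ and unit vectors $\chi_i \in \mathcal{H}_1 \otimes \mathcal{H}_2$. Necessarily $\operatorname{Tr}_{\mathcal{H}_2} |\chi_i\rangle\langle\chi_i| = |\psi\rangle\langle\psi|$ for all $i$, and by considering the Schmidt decomposition of $\chi_i$, one sees that $\chi_i = \psi \otimes \varphi_i$ for some $\varphi_i \in \mathcal{H}_2$, hence the result. (ii) Let $\rho \in D(\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_2)$ be a 2-extension of $|\psi\rangle\langle\psi|$. By (i), $\rho$ has the form $|\psi\rangle\langle\psi| \otimes \sigma$ for some $\sigma \in D(\mathcal{H}_2)$. Taking partial trace over the first copy of $\mathcal{H}_2$ shows that $|\psi\rangle\langle\psi|$ is a product state.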
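Exercise 2.19. If $\psi = \sum_{i=1}^d \lambda_i e_i \otimes f_i$ is the Schmidt decomposition, show that
$$|\psi\rangle\langle\psi|^\Gamma = \sum_{i,j=1}^d \lambda_i \lambda_j\, |e_j \otimes f_i\rangle\langle e_i \otimes f_j|$$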
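The displayed formula can be checked on a small example. The sketch below makes simplifying assumptions that are not in the text: both Schmidt bases are the standard basis of $\mathbb{R}^d$, the transposition acts on the first tensor factor, and a vector $e_i \otimes e_k$ has index $i d + k$. All function names are illustrative.

```python
def schmidt_state_density(lams):
    """Density matrix |psi><psi| for psi = sum_a lams[a] e_a (x) e_a."""
    d = len(lams)
    rho = [[0.0] * (d * d) for _ in range(d * d)]
    for a in range(d):
        for b in range(d):
            rho[a * d + a][b * d + b] = lams[a] * lams[b]
    return rho


def partial_transpose_first(rho, d):
    """Transpose the first tensor factor: entry ((i,k),(j,l)) moves to ((j,k),(i,l))."""
    out = [[0.0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for k in range(d):
            for j in range(d):
                for l in range(d):
                    out[j * d + k][i * d + l] = rho[i * d + k][j * d + l]
    return out


def schmidt_formula(lams):
    """Right-hand side: sum_{i,j} lams[i] lams[j] |e_j (x) e_i><e_i (x) e_j|."""
    d = len(lams)
    out = [[0.0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            out[j * d + i][i * d + j] += lams[i] * lams[j]
    return out


def apply(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]
```

In this setting the partial transpose sends $e_i \otimes e_j$ to $\lambda_i \lambda_j\, e_j \otimes e_i$, so the antisymmetric vector $e_i \otimes e_j - e_j \otimes e_i$ is an eigenvector with the negative eigenvalue $-\lambda_i \lambda_j$ whenever both coefficients are nonzero.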
Exercise 2.21. What are the operators ρ on H1 b H2 , for which we can be sure
that ρΓ “ pV b Iq: ρpV b Iq? (Note that V depends on X.)
Exercise 2.22. Note that Γ2 “ Id. Take E “ tA P B sa pH1 b H2 q : AΓ “ Au.
Exercise 2.23. $\rho_\beta^\Gamma = \frac{\beta}{d} F + (1 - \beta)\frac{\mathrm{I}}{d^2}$, and therefore $\rho_\beta$ is PPT if and only if $\beta \le \frac{1}{d+1}$. It follows that $\rho_\beta$ is entangled for $\beta > \frac{1}{d+1}$. Next, verify that $\rho_\beta^\Gamma = w_\lambda$, where $w_\lambda$ is the Werner state (2.21) with $\lambda = (\beta(d^2 - 1) + d + 1)/2d$. For $-\frac{1}{d^2 - 1} \le \beta \le \frac{1}{d+1}$, we have $\frac{1}{2} \le \lambda \le 1$, so $w_\lambda$ is separable by Proposition 2.16. Since the partial transpose of a separable state is a separable state, the result follows.
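The endpoint arithmetic in this hint can be verified exactly with rational numbers. The sketch below uses only the formula for $\lambda$ stated above; the expression for the smallest eigenvalue of $\rho_\beta^\Gamma$ is an added assumption, valid for $\beta \ge 0$ and obtained from the fact that the flip $F$ has eigenvalues $\pm 1$. Function names are illustrative.

```python
from fractions import Fraction


def werner_parameter(beta, d):
    """lambda = (beta (d^2 - 1) + d + 1) / (2d), as in the hint."""
    return (beta * (d * d - 1) + d + 1) / (2 * d)


def min_eigenvalue_partial_transpose(beta, d):
    """Smallest eigenvalue of (beta/d) F + (1 - beta) I / d^2, assuming beta >= 0."""
    return (1 - beta) / (d * d) - beta / d
```

Passing `Fraction` values for `beta` keeps every intermediate result exact, so the thresholds $\beta = \frac{1}{d+1}$ and $\beta = -\frac{1}{d^2-1}$ can be compared with $\lambda = 1$ and $\lambda = \frac{1}{2}$ by equality rather than by tolerance.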
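Exercise 2.24. (i) For $\psi \in S_{\mathbb{C}^{d_1}}$ and $\varphi \in S_{\mathbb{C}^{d_2}}$, we have $|\psi \otimes \varphi\rangle\langle\psi \otimes \varphi|^R = |\psi \otimes \psi\rangle\langle\varphi \otimes \varphi|$; in particular $\bigl\||\psi \otimes \varphi\rangle\langle\psi \otimes \varphi|^R\bigr\|_1 = 1$. Using the triangle inequality for $\|\cdot\|_1$, it follows that $\|\rho^R\|_1 \le 1$ for any separable state $\rho$. (ii) Let $\rho = |\chi\rangle\langle\chi|$ for $\chi \in S_{\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}}$. Consider a Schmidt decomposition $\chi = \sum \lambda_i \psi_i \otimes \varphi_i$. We have
$$\rho^R = \sum_{i,j} \lambda_i \lambda_j\, |\psi_i \otimes \psi_j\rangle\langle\varphi_i \otimes \varphi_j|.$$
Since the families $(\psi_i \otimes \psi_j)_{i,j}$ and $(\varphi_i \otimes \varphi_j)_{i,j}$ consist of orthonormal vectors, it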
Exercise 2.27. Write the Choi matrix of Φ as the difference of two positive
operators.
Exercise 2.28. When dim H1 ď dim H2 , this follows from the proof of Theorem
2.21. Otherwise, consider Φ˚ to switch the roles of H1 and H2 , and use Exercise
2.25 and the fact that Φ is completely positive if and only if Φ˚ is completely
positive.
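Exercise 2.29. To show that the map $\Phi$ is not $(k+1)$-positive, consider the input operator $|\psi\rangle\langle\psi|$ for $\psi = \sum_{i=1}^{k+1} |i\rangle \otimes |i\rangle \in \mathbb{C}^n \otimes \mathbb{C}^{k+1}$. To establish $k$-positivity of $\Phi$, write any $\psi \in \mathbb{C}^n \otimes \mathbb{C}^k$ as $\sum_{i=1}^k \chi_i \otimes \varphi_i$ with $(\varphi_i)$ an orthonormal basis in $\mathbb{C}^k$, and argue that
$$(\Phi \otimes \mathrm{Id}_{M_k})(|\psi\rangle\langle\psi|) \ge \sum_{i<j} |\chi_i \otimes \varphi_i - \chi_j \otimes \varphi_j\rangle\langle\chi_i \otimes \varphi_i - \chi_j \otimes \varphi_j| \ge 0.$$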
Exercise 2.30. The unit ball in $\bigl(M_d^{sa}, \|\cdot\|_\infty\bigr)$ is an “order interval” $\{\tau : -\mathrm{I} \le \tau \le \mathrm{I}\}$ (where $\sigma \le \tau$ means that $\tau - \sigma$ is positive semi-definite) and positive maps are exactly those that preserve this order.
Exercise 2.31. Use the preceding exercise and duality. Alternatively, use the fact that any $\tau \in M_m^{sa}$ can be written as $\tau_1 - \tau_2$ with $\tau_1, \tau_2$ positive and $\|\tau\|_1 = \|\tau_1\|_1 + \|\tau_2\|_1$.
Exercise 2.32. (i) The “only if” part follows from the preceding exercise (note that $\Phi \otimes \mathrm{Id}$ is trace-preserving if $\Phi$ is). In the opposite direction, if $\sigma$ is positive and $\Phi(\sigma)$ is not, then $\|\sigma\|_1 = \operatorname{Tr}\sigma = \operatorname{Tr}\Phi(\sigma) < \|\Phi(\sigma)\|_1$. This takes care of $k = 1$, and the general case follows formally. (ii) The norm equals 2; note that the norm is necessarily attained on a pure state and use Exercise 2.19. Essentially the same argument gives $k$ for the norm of $\Phi \otimes \mathrm{Id}$ on $\bigl(B^{sa}(\mathbb{C}^m \otimes \mathbb{C}^k), \|\cdot\|_1\bigr)$. (iii) Use part (ii) and duality.
Exercise 2.33. The case $\operatorname{rank}\rho = n$ follows from Proposition 1.4 (the trace-preserving hypothesis is not needed). In the general case, argue by contradiction: let $E = \operatorname{range}(\sigma)$, $E' = \operatorname{range}(\Phi(\sigma))$, and assume that $r := \dim E > r' := \dim E'$. Next, use Propositions 2.1 and 1.4 to infer that $\Phi(D(E)) \subset D(E')$ and note that $r = \operatorname{Tr} P_E = \operatorname{Tr}\Phi(P_E) \le r'\|\Phi(P_E)\|_\infty$, hence $\|\Phi(P_E)\|_\infty \ge \frac{r}{r'} > 1 = \|P_E\|_\infty$. Conclude by appealing to Exercise 2.30.
Exercise 2.34. (i) A channel is an affine map from the Bloch ball to itself; such a map is necessarily a contraction and preserves the center if and only if the channel is unital. We are allowed to compose the channel with maps $X \mapsto U X U^\dagger$ for $U \in U(2)$, which correspond to rotations of the Bloch ball. This yields the desired form with $|a|, |b|, |c|$ being the singular values of the contraction. We may have $a, b, c$ negative since we are only allowed proper rotations (from $SO(3)$). (ii) follows from Theorem 2.21 after we compute explicitly the Choi matrix. For (iii), note that the inequalities for $(a, b, c)$ obtained in part (ii) describe a tetrahedron whose vertices are $(1, 1, 1)$ (corresponding to the identity channel) and permutations of
Exercise 2.35. Apply Carathéodory’s theorem in the space of unital and trace-
preserving superoperators.
Exercise 2.37. Check that RpXq “ E U XU : with U Haar-distributed (see also
write CpΦq “ |xi b yi yxxi b yi | for xi P Hout , yi P Hin . Repeating the proof of
Theorem 2.21 with this decomposition instead of (2.34) gives (iii). Finally, assuming
Exercise 2.42. Prove that $A, B \ge 0$ implies $A \odot B \ge 0$ ($A \odot B$ is a submatrix of $A \otimes B$). Use also the fact that $\Theta_A \otimes \mathrm{Id}_{M_k} = \Theta_{A \otimes J}$ where $J$ is the matrix with all entries equal to 1.
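The submatrix claim is concrete enough to check by hand or by machine: with rows and columns of $A \otimes B$ indexed by pairs $(i, k)$, the entries at the "diagonal" pairs $(i, i)$, $(j, j)$ are $A_{ij} B_{ij}$. A minimal sketch for square matrices of equal size, with illustrative function names:

```python
def kron(A, B):
    """Kronecker product; row (i, k) has index i * len(B) + k."""
    return [
        [A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
        for i in range(len(A))
        for k in range(len(B))
    ]


def schur_product(A, B):
    """Entrywise (Schur, Hadamard) product of two matrices of the same shape."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def diagonal_pair_submatrix(M, n):
    """Principal submatrix of an n^2 x n^2 matrix on the indices (i, i) = i * n + i."""
    idx = [i * n + i for i in range(n)]
    return [[M[r][c] for c in idx] for r in idx]
```

Since a principal submatrix of a positive semi-definite matrix is positive semi-definite, and $A \otimes B \ge 0$ when $A, B \ge 0$, this gives the positivity of $A \odot B$.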
Exercise 2.43. Observe that if $a \in \mathbb{C}^n$ and $D = D_a$ (i.e., the diagonal matrix with $D_{ii} = a_i$), then $D X D^\dagger = \Theta_{|a\rangle\langle a|}(X)$ for any $X \in M_d$.
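Exercise 2.44. The map $\Phi$ is completely positive, and trace-preserving because $\sum A_i^\dagger A_i = \mathrm{I}_{\mathbb{C}^2 \otimes \mathbb{C}^2}$. It is also obvious from the definition that $\Phi$ is a separable channel. Assume now that $\Phi$ can be written as a convex combination of product channels of the form $\sum \lambda_j \Psi_j \otimes \Xi_j$ with $\lambda_j > 0$ and $\sum \lambda_j = 1$. The pure product states $|0\rangle\langle 0| \otimes |0\rangle\langle 0|$ and $|1\rangle\langle 1| \otimes |1\rangle\langle 1|$ are mapped to themselves under $\Phi$. It follows that for every $j$, $\Psi_j(|0\rangle\langle 0|) = \Xi_j(|0\rangle\langle 0|) = |0\rangle\langle 0|$ and $\Psi_j(|1\rangle\langle 1|) = \Xi_j(|1\rangle\langle 1|) = |1\rangle\langle 1|$. This leads to a contradiction since $\Phi(|0\rangle\langle 0| \otimes |1\rangle\langle 1|) = |0\rangle\langle 0| \otimes |0\rangle\langle 0|$.
Exercise 2.45. If $\{A_i^{(1)}\}$ are Kraus operators for $\Phi_1$ and $\{A_j^{(2)}\}$ are Kraus operators for $\Phi_2$, the family $\{A_i^{(1)} \otimes \mathrm{I}\} \cup \{\mathrm{I} \otimes A_j^{(2)}\}$ are Kraus operators for $\Phi_1 \oplus \Phi_2$.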
(E.2) $\sum_{i,j=1}^{k} \langle y_i|\Phi(|x_i\rangle\langle x_j|)|y_j\rangle \ge 0.$
$$\sum_{i,j=1}^{k} \langle y_i \otimes x_i|C(\Phi)|y_j \otimes x_j\rangle \ge 0,$$
of the interior of PSD under Φ, then some point of the segment connecting τ and
ΦpIq would be of the form Φpσq for some σ P BPSD, in particular rank σ ă n “
rank Φpσq. Infer that Φ is a bijection of the interior of PSD onto itself and conclude
that it is an automorphism of PSD.
Exercise 2.50. Start by showing that if Φ belongs to the interior of P , then
ΦpDq X BPSD “ H. Next, consider λn (the smallest eigenvalue) as a function on
ΦpDq.
Exercise 2.51. The condition is equivalent to $\langle\Phi(\rho), \sigma\rangle_{HS} \ge \delta \operatorname{Tr}\rho \operatorname{Tr}\sigma$ for all $\rho, \sigma \in \mathrm{PSD}$.
Exercise 2.52. (a) In the language of the second proof of Theorem 2.36, if $\|R\|_\infty = 1$, then $R(S^2) \cap S^2$ consists of at least 2 points. On the other hand, there are nontrivial ellipsoids contained in $B_2^3$ that intersect $S^2$ only at one point. For a concrete example, consider $\rho \mapsto \frac{1}{2}(\rho + |0\rangle\langle 0|)$ for a state $\rho \in D(\mathbb{C}^2)$. (b) Any unitary channel (even the identity channel!) will do.
Exercise 2.53. Same example as in Exercise 2.52 (a).
Exercise 2.54. If $\rho \in D(\mathbb{C}^m \otimes \mathbb{C}^n)$ is an entangled state, let $\Phi$ be given by Theorem 2.34. Note that for $\varepsilon > 0$ small enough, the map $\Phi' : X \mapsto \Phi(X) + \varepsilon(\operatorname{Tr} X)\,\mathrm{I}$ also satisfies the conclusions of Theorem 2.34. Finally consider $\Psi : X \mapsto \Phi'(\mathrm{I})^{-1/2}\,\Phi'(X)\,\Phi'(\mathrm{I})^{-1/2}$.
Exercise 2.55. (i) Let $A = \Phi^*(\mathrm{I})$; then $\operatorname{Tr}\Phi(\rho) = \operatorname{Tr}(A\rho)$ for all $\rho \in M_m^{sa}$. (ii) We may assume that $A = \Phi^*(\mathrm{I})$ is positive definite and satisfies $\Phi^*(\mathrm{I}) \le \mathrm{I}$. (iii) Set $B = (\mathrm{I} - A)^{1/2}$, define $\tilde{\Phi} : M_m^{sa} \to M_{m+n}^{sa}$ by $\tilde{\Phi}(\rho) = B\rho B \oplus \Phi(\rho)$ and verify that $\tilde{\Phi}$ is trace-preserving. (iv) Verify that $\tilde{\Phi}(\rho) \ge 0$ if and only if $\Phi(\rho) \ge 0$, and that the same is true for any extensions $\Phi \otimes \mathrm{Id}_K$ and $\tilde{\Phi} \otimes \mathrm{Id}_K$. (v) Deduce from (iv) that $\tilde{\Phi}$ preserves positivity and that it detects entanglement of a state $\rho$ (in the sense of Theorem 2.34) iff $\Phi$ does.
Exercise 2.56. Let $\sigma \in \partial P \setminus \mathrm{PSD}$, and $\tau \in \partial P$ such that $E(\sigma) \subset E(\tau)$. We may apply Lemma C.4 with $F = -\tau$ and $G = -\sigma$ and conclude that $\mu\sigma - \tau \in \mathrm{PSD}$ for some $\mu \ge 0$ (in fact, $\mu > 0$). Since we may write $\mu\sigma = (\mu\sigma - \tau) + \tau$, the assumption that $\sigma$ lies on an extreme ray forces $\tau$ to be proportional to $\sigma$.
Chapter 4
we have
P is a translate of S1 ` ¨ ¨ ¨ ` Sk . (This can also be checked by induction.) The
result for zonoids follows by approximation.
Exercise 4.8. Prove that every face of a zonotope is a zonotope.
Exercise 4.9. No. For every partition of $S^{n-1}$ as $A_1 \cup A_2$, the convex bodies $K_1, K_2$ defined for $x \in \mathbb{R}^n$ by
$$\|x\|_{K_i} = \int_{A_i} |\langle x, \theta\rangle|\, d\sigma(\theta)$$
are such that $K_1 + K_2$ is a multiple of a Euclidean ball.
Exercise 4.10. For the first statement, use Exercise 1.2. For the second statement,
try K “ r0, 1s Ă R and K 1 “ tpx, yq P R2 : x ě 0, y ě 0, xy ě 1u (cf. the hint to
Exercise 1.30).
Exercise 4.11. Straightforward from the definitions.
Exercise 4.12. Start by noticing that if txi u Ă K is a basis of V and tx1j u is a
basis of V 1 , then t˘xi b p x1j u Ă K bp V 1.
the second follows, e.g., from Exercise 4.12. For the third, consider first the case
when Vj “ enj ` Rnj ´1 and then appeal to Exercise 4.11 (or, alternatively, use the
approach from the paragraph following (4.13)).
closed. Consider first the case when $C$, $C'$ are pointed and hence admit (by Exercise
1.32) compact bases, which allows appealing to Exercise 4.10. Next, use Exercise
1.33 and Exercise 4.12.
Exercise 4.15. Use Exercise 4.10 and then (to show full-dimensionality) the polarization formula
$$\frac{1}{2}\big( (x+y) \otimes (x'+y') + (x-y) \otimes (x'-y') \big) = x \otimes x' + y \otimes y'.$$
To show full-dimensionality in the affine setting use the same ideas as in Exercises 4.12 and 4.13 to establish that the relative interior of $K_1 \mathbin{\hat\otimes} K_2$ is nonempty.
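The polarization formula can be sanity-checked on integer vectors. The sketch below is only an illustration: it represents an elementary tensor as a matrix of products, and the helper names (`tensor`, `add`, `polarization_sides`) are ours, not the book's. Both sides are doubled so that the comparison stays in exact integer arithmetic.

```python
def tensor(u, v):
    """Elementary tensor u (x) v, stored as a matrix (list of rows)."""
    return [[ui * vj for vj in v] for ui in u]


def add(s, t):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rs, rt)] for rs, rt in zip(s, t)]


def polarization_sides(x, y, xp, yp):
    """Return (2 * left-hand side, 2 * right-hand side) of the polarization formula."""
    plus = tensor([a + b for a, b in zip(x, y)],
                  [a + b for a, b in zip(xp, yp)])
    minus = tensor([a - b for a, b in zip(x, y)],
                   [a - b for a, b in zip(xp, yp)])
    lhs2 = add(plus, minus)
    rhs = add(tensor(x, xp), tensor(y, yp))
    rhs2 = [[2 * e for e in row] for row in rhs]
    return lhs2, rhs2


# Arbitrary test vectors: x, y in R^2 and x', y' in R^3.
lhs2, rhs2 = polarization_sides([1, 2], [3, -1], [0, 5, 2], [4, 1, -3])
print(lhs2 == rhs2)
```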
Exercise 4.16. (i) The unit ball in $\ell_1^k(X)$, where $X$ is the normed space whose unit ball is $K$. ($\ell_1^k(X)$ is the space $X^k$ equipped with the norm $\|(x_1, \dots, x_k)\| = \|x_1\|_K + \cdots + \|x_k\|_K$.) (ii) Use the formula $\operatorname{vol}(L) = \frac{1}{n!} \int_{\mathbb{R}^n} \exp(-\|x\|_L)\, dx$, valid for any symmetric convex body $L \subset \mathbb{R}^n$.
Exercise 4.17. It is clear that any extreme point must be of the claimed form. Conversely, given extreme points $x \in K$, $x' \in K'$, let $\phi$ and $\phi'$ be supporting functionals, i.e., $\phi \le 1$ on $K$ with $\phi(x) = 1$ (and similarly for $\phi'$). Given a decomposition $x \otimes x' = \sum \lambda_i x_i \otimes x'_i$, show that we may assume that $\phi(x_i) = \phi'(x'_i) = 1$. Now if
we have $\sum c_i \langle x, x_i\rangle^2 = |x|^2$, $\sum c_i \langle x, x_i\rangle = 0$, $\langle x, x_i\rangle \ge -|x|$ and $\sum c_i = n$. All this together implies $\max_i \langle x, x_i\rangle \ge |x|/n$.
Exercise 4.22. Use Carathéodory’s theorem.
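Exercise 4.23. We have $\mathrm{John}(B_\infty^n) = B_2^n$ and $\mathrm{L\ddot{o}w}(B_\infty^n) = \sqrt{n}\, B_2^n$. If $\mathcal{E} \subset B_\infty^n \subset \alpha\mathcal{E}$ for some ellipsoid $\mathcal{E}$, the extremal volume property implies $\alpha \ge \sqrt{n}$.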
Exercise 4.24. $G'_{\mathrm{unc}}$ consists of all diagonal matrices. If $\Delta$ is a diagonal matrix and $P$ a permutation matrix, what are $\Delta P$ and $P\Delta$?
Exercise 4.25. (i) $B_p^n$ is permutationally symmetric. (ii) Isometries of $S_p^{m,n}$ include maps $X \mapsto UXV$ for $U, V$ orthogonal/unitary matrices; it follows that $S_p^n$ has enough symmetries. (iii) Any isometry of $S_p^{n,\mathrm{sa}}$ preserves $\mathbb{R}\,\mathrm{I}$ (indeed, $\pm n^{-1/p}\,\mathrm{I}$ can be characterized as isolated points in the set of elements of largest (for $p > 2$) or smallest (for $p < 2$) Hilbert–Schmidt norm in $\partial S_p^{n,\mathrm{sa}}$) so $S_p^{n,\mathrm{sa}}$ does not have enough symmetries. (iv) Isometries include $X \mapsto \pm UXU^\dagger$ for $U$ orthogonal/unitary; there are enough symmetries. (v) Isometries of the regular simplex are obtained from permutations of its vertices; it has enough symmetries. (vi) See Theorem 2.3; $D(\mathbb{C}^d)$ has enough symmetries.
Exercise 4.26. (i) Choosing for K a regular p-gon fails since the isometry group is
a dihedral group. However it is possible to slightly modify K to obtain the required
$\sum_{i=1}^{n} U_i \otimes |e_{\sigma(i)}\rangle\langle e_i|$ for some $\sigma \in \mathfrak{S}_n$ and $U_1, \dots, U_n \in \mathrm{Iso}(K)$, $(e_i)$ being the canonical basis of $\mathbb{R}^n$. Indeed an isometry of $L$ induces an isometry on the set
It follows that $L$ does not have enough symmetries (since $U \otimes \mathrm{I}$ commutes with $\mathrm{Iso}(L)$ for any $U \in SO(2)$). On the other hand, there is no invariant subspace (if
$x \ne 0$).
Exercise 4.27. Isometries of $K \mathbin{\hat\otimes} L$ include the maps $A \otimes B$ for $A \in \mathrm{Iso}(K)$ and $B \in \mathrm{Iso}(L)$. We claim that a linear map $S \in B(\mathbb{R}^m \otimes \mathbb{R}^n)$ which commutes with all such maps is a multiple of identity; this follows from the fact that, for every $y, y' \in \mathbb{R}^n$, the map $S_{y,y'}$ defined by the relation $\langle S_{y,y'}(x), x'\rangle = \langle S(x \otimes y), x' \otimes y'\rangle$ (for $x, x' \in \mathbb{R}^m$) commutes with $\mathrm{Iso}(K)$, and similarly with the role of both factors exchanged.
Exercise 4.28. We have $\mathrm{John}(\sqrt{n}\, B_1^n) = B_2^n$. The John ellipsoid of $\sqrt{n}\, B_1^n \mathbin{\hat\otimes} \sqrt{n}\, B_1^n$ (which identifies with $n B_1^{n^2}$) is $\mathcal{E} = B_2^n \otimes_2 B_2^n$. The John ellipsoid of $B_2^n \mathbin{\hat\otimes} B_2^n$ (which identifies with $S_1^{n,n}$) is $\frac{1}{\sqrt{n}}\mathcal{E}$.
Exercise 4.30. By “globally equivalent” we mean that the validity of all instances of (4.22) implies the validity of all instances of (4.21), and vice versa. To derive (4.21) from (4.22), appeal to the arithmetic mean/geometric mean inequality. To recover (4.22), apply (4.21) to $K/\operatorname{vol}(K)^{1/n}$ and $L/\operatorname{vol}(L)^{1/n}$ with $t = \operatorname{vol}(K)^{1/n} / (\operatorname{vol}(K)^{1/n} + \operatorname{vol}(L)^{1/n})$.
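Exercise 4.31. Given convex bodies $K_1, K_2 \subset \mathbb{R}^n$, consider $K = \operatorname{conv}(K_1 \times \{0\}, K_2 \times \{1\}) \subset \mathbb{R}^{n+1}$. Then $K \cap (\mathbb{R}^n \times \{\lambda\})$ corresponds to $\lambda K_2 + (1-\lambda) K_1$.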
Exercise 4.32. Use the formula $\det(A \otimes B) = \det(A)^n \det(B)^m$ for $A \in \mathsf{M}_m$ and $B \in \mathsf{M}_n$.
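The determinant formula for tensor (Kronecker) products is easy to test on a small example. The following sketch uses exact rational arithmetic; the helpers `kron` and `det` and the two test matrices are illustrative choices of ours, not part of the text.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product of an m x m matrix a and an n x n matrix b."""
    m, n = len(a), len(b)
    return [[a[i // n][j // n] * b[i % n][j % n] for j in range(m * n)]
            for i in range(m * n)]


def det(mat):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in mat]
    size, result = len(a), Fraction(1)
    for col in range(size):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result


A = [[2, 1], [1, 3]]                    # m = 2
B = [[1, 2, 0], [0, 1, 4], [5, 0, 1]]   # n = 3
lhs = det(kron(A, B))
rhs = det(A) ** 3 * det(B) ** 2         # det(A)^n * det(B)^m
print(lhs == rhs)
```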
Exercise 4.34. If $f = \exp(\varphi)$ is the density of $\mu$, take $(1 + \varphi/s)_+^s$ as the density of $\mu_s$.
Exercise 4.35. To show that 1. implies 2., define
$$K = \bigcup_{x \in \operatorname{supp}\mu} \{x\} \times f(x)^{1/s} L,$$
inequality in $\mathbb{R}^s$ to deduce that the function $x \mapsto \operatorname{vol}_s\big((\{x\} \times \mathbb{R}^s) \cap K\big)^{1/s}$ is concave.
Exercise 4.36. If $\mu$ is log-concave, take $(\mu_s)$ as in Lemma 4.12 and show (4.28) for $\mu_s$ instead of $\mu$ by using Lemma 4.13 and (4.21) applied in $\mathbb{R}^{n+s}$. Conversely, apply (4.28) with $K$ and $L$ being balls of radius tending to 0 to prove that $\mu$ is log-concave. Note that the density $f$ satisfies $f(x) = \lim_{\varepsilon \to 0} \mu(B(x,\varepsilon)) / \operatorname{vol}(B(x,\varepsilon))$ for almost all $x \in \mathbb{R}^n$ (see Chapter 3, Theorem 1.4 in [SS05]).
note that it is the union of $2^n$ essentially disjoint simplices, each with volume $1/n!$.
Exercise 4.39. Show and use $B_\infty^n \subset n^{1/p} B_p^n \subset n B_1^n$ and $\big(\operatorname{vol}(n B_1^n) / \operatorname{vol}(B_\infty^n)\big)^{1/n} = n/(n!)^{1/n} \sim e$.
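The limit $n/(n!)^{1/n} \to e$ can be observed numerically. This is a small sketch; the function name `ratio` is ours, and `math.lgamma` is used to evaluate $\log n!$ without overflow.

```python
import math


def ratio(n):
    """n / (n!)^(1/n), computed through log(n!) = lgamma(n + 1)."""
    return n / math.exp(math.lgamma(n + 1) / n)


for n in (1, 10, 100, 10**4, 10**6):
    print(n, ratio(n))
```

By Stirling's formula the values increase towards $e$, with a relative gap of order $\log(n)/n$.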
Exercise 4.40. Integrate in Cartesian and polar coordinates, and appeal to (4.26).
Exercise 4.41. Observe that if $x \ne y$, then $B(x,r) \cap B(y,r) \subset B\big(\frac{x+y}{2}, r'\big)$ for some $r' < r$.
Exercise 4.42. Consider rectangles of height 1 and width $\varepsilon$ with $\varepsilon \to 0$.
$w(K) \ge \frac{\kappa_1}{\kappa_n} \operatorname{outrad}(K)$. In other words, we are asking whether among all sets of given outradius $R$ the segment of length $2R$ has the minimal mean width, which doesn’t readily follow from the known results on sets for which, under certain constraints, the mean width is extremal (see, e.g., [Bal91, Sch99, Bar98]). The above question is equivalent to the following inequality (see Appendix A.2 for definitions): If $X_1, X_2, \dots, X_N$ are jointly Gaussian $N(0,1)$-distributed random variables such that, for some positive scalars $t_1, t_2, \dots, t_N$ we have $\sum_k t_k X_k = 0$, then $\mathbf{E} \max_k X_k \ge \mathbf{E}\,|X_1|$.
Exercise 4.44. Show the inequality for symmetric polygons by induction on the
number of edges, then use symmetrization $K \mapsto K - K$ and approximation.
makes the result much more subtle.
Exercise 4.47. By translation invariance, assume $0 \in K \cap L$, so that the functionals $w(K, \cdot)$ and $w(L, \cdot)$ are nonnegative. Then $w(K \cup L, \cdot) = \max(w(K, \cdot), w(L, \cdot)) \le w(K, \cdot) + w(L, \cdot)$.
Exercise 4.48. By modifying the proof of Proposition 4.16 show that
$$\Big( \int_{S^{n-1}} \|\theta\|_K^p \, d\sigma(\theta) \Big)^{1/p} \operatorname{vrad}(K) \ge 1 \quad \text{for any } p > 0,$$
then let $p \to 0$. The inequality and the argument appear in Appendix A of [Sza05], but were likely known earlier.
Exercise 4.49. (i) When the measure $\mu$ is purely atomic with $N$ atoms, the result can be proved by induction on $N$, the case $N = 2$ being exactly the Brunn–Minkowski inequality (4.22). The continuous case can then be derived by approximation. Minkowski integrals of convex bodies are defined via their support functions, so that inequality (4.37) makes sense whenever the map $t \mapsto w(K_t, \theta)$ is measurable for any $\theta \in \mathbb{R}^n$. (ii) In that case, $\operatorname{vol}(K_t) = \operatorname{vol}(K)$ for any $t \in O(n)$. By invariance of the Haar measure (see Appendix B.3), the convex body $L := \int_{O(n)} t(K)\, d\mu(t)$ is necessarily a Euclidean ball centered at the origin. By computing the width of $L$ in a fixed direction, we obtain that $L$ is a Euclidean ball of radius $w(K)$, showing the result.
Exercise 4.50. We have $\operatorname{vrad}(K) \le \operatorname{vrad}(K^\circ)^{-1} \le w(K)$.
Exercise 4.51. In the symmetric case, combine the results from Proposition 4.15, Proposition 4.16 and Theorem 4.17. In the general case, sufficient conditions are that (i) 0 is the center of the largest Euclidean ball contained in $K$ and (ii) 0 is the centroid of $K$ (for Santaló’s inequality to hold). These conditions are both satisfied whenever 0 is the unique fixed point under $\mathrm{Iso}(K)$.
and the convexity and symmetry of $K$ imply that the maximum is achieved for $x = 0$.
Exercise 4.54. Apply Lemma 4.20 to the convex body $K \times K \subset \mathbb{R}^{2n}$ and to the pair of orthogonal subspaces $E = \{(x, x) : x \in \mathbb{R}^n\}$ and $F = \{(x, -x) : x \in \mathbb{R}^n\}$.
Exercise 4.55. The lower inequality follows from (4.21). For the upper inequality, assume $h = 1$ and apply Lemma 4.20 to the convex body
$$L = \{(\lambda x, (1-\lambda) y, \lambda) : x \in K,\ y \in K,\ \lambda \in [0,1]\} \subset \mathbb{R}^{2n+1}$$
$f_s : \mathbb{R}^{n-1} \to [0, \infty]$ by $f_s(z) = f(s, z)$ and similarly for $g_s$, $h_s$. If $t = \lambda u + (1-\lambda) v$, check that $h_t$, $f_u$, and $g_v$ verify the $(n-1)$-dimensional instance of (4.46). Deduce that $\tilde f(s) = \int_{\mathbb{R}^{n-1}} f_s(z)\, dz$ and similarly defined $\tilde g$, $\tilde h$ verify the 1-dimensional instance of (4.46), and conclude by appealing to that instance.
Exercise 4.59. (i) Let $\alpha = f(0)$. Write $\int x^2 f(x)\, dx = 2 \int_0^\infty 2t\, \mathbf{P}(Y \ge t)\, dt$ where $Y$ is a random variable with density $f$. Show that the log-concavity hypothesis implies that $\mathbf{P}(X \ge t) \le \mathbf{P}(Y \ge t) \le \mathbf{P}(Z \ge t)$, where $X$ is uniformly distributed on $[-1/2\alpha, 1/2\alpha]$ and $Z$ has a symmetric exponential distribution with density $\alpha \exp(-2\alpha|t|)\, dt$. (ii) Reduce to $\lambda = 1$ by considering $L = \lambda^{-1} K$. Assume that $H = u^\perp$ for a unit vector $u$. For $\lambda = 1$, the function
$$f : t \mapsto (\operatorname{vol}_n K)^{-1} \operatorname{vol}_{n-1}\big(K \cap \{\langle \cdot, u\rangle = t\}\big)$$
satisfies the hypotheses from (i); log-concavity is given by Lemma 4.13 and Exercise 4.34.
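Exercise 4.60. Use the inclusions $\frac{1}{\sqrt{2}} K \subset B_2^n \subset K$ for $K = B_2^k \times B_2^{n-k}$ and $L \subset B_2^n \subset \sqrt{2}\, L$ for $L = K^\circ = \operatorname{conv}\{B_2^k \times \{0\}, \{0\} \times B_2^{n-k}\}$, which correspond to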
Chapter 5
can be given as follows: consider the map $\phi : \mathbb{R}^n \to K$ which maps $x$ to the closest point to $x$ in $K$. It is easy to check that (i) $\phi$ is a contraction (ii) $\phi$ maps $\partial L$ onto
Exercise 5.3. Let $K = \operatorname{conv}(C(x,t))$. We have $\operatorname{outrad} K = \sin t$. Let $L$ be the $n$-dimensional half-ball with center $x$ and radius $\sin t$, such that $K \subset L$ (see Figure 5.2). Comparing the areas of $K$ and $L$ using Exercise 5.2 gives the result. To prove the second part, check the inequality $\cos u \le e^{-u^2/2}$ for $|u| < \pi/2$ (take logarithm of both sides and then differentiate).
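The elementary inequality $\cos u \le e^{-u^2/2}$ can be checked on a grid before proving it. The sketch below also evaluates $u - \tan u$, the derivative of $\log\cos u + u^2/2$, whose nonpositivity on $[0, \pi/2)$ is what the "take logarithm and differentiate" step relies on. The function names and the grid size are arbitrary choices of ours; by symmetry only $u \ge 0$ is sampled.

```python
import math


def gap(u):
    """cos(u) - exp(-u^2/2); expected to be <= 0 for |u| < pi/2."""
    return math.cos(u) - math.exp(-u * u / 2)


def log_derivative(u):
    """Derivative of log(cos u) + u^2/2, i.e. u - tan(u)."""
    return u - math.tan(u)


# Sample 0 <= u < pi/2 (the endpoint pi/2 itself is excluded).
grid = [k * (math.pi / 2) / 1000 for k in range(1000)]
worst_gap = max(gap(u) for u in grid)
worst_slope = max(log_derivative(u) for u in grid)
print(worst_gap, worst_slope)
```

Both maxima are attained at $u = 0$, where equality holds.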
Exercise 5.4. Use Exercise 5.2 with $L = B_2^n$ and $K = B_2^n \setminus \operatorname{conv}(C(x,t))$. This gives $\operatorname{area}(S^{n-1})\, V(t) \ge \sin(t)^{n-1} \operatorname{vol}_{n-1}(B_2^{n-1})$, which is equivalent to the lower
bound in (5.4). To get the upper bound, compare the solid cap with the circum-
scribed solid cone whose base is the same as that of the cap. For the strengthened
lower bound, consider an inscribed cone. See Figure E.3.
Figure E.3. Upper bound (dashed) and lower bounds (dotted) on the volume of a spherical cap.
Exercise 5.5. The problem is equivalent to showing that $r \mapsto r \frac{V'(r)}{V(r)}$ is nonincreasing. After some elementary manipulations, the inequality to verify becomes
$$\frac{1}{r} \int_0^r \Big( \frac{\sin(ut)}{\sin(ur)} \Big)^{n-2} dt \le \frac{1}{r} \int_0^r \Big( \frac{\sin t}{\sin r} \Big)^{n-2} dt$$
for $r \in (0, \pi)$ and $u \in (0, 1)$. It can then be checked that the inequality $\frac{\sin(ut)}{\sin(ur)} < \frac{\sin t}{\sin r}$ holds pointwise if $0 < t < r < \pi$. The argument actually shows strict concavity in the “nontrivial” range (i.e., when $e^t \le \pi$) for $n > 2$.
For the second part, note that Proposition 5.2 implies the inequality $V(\lambda r)/V(r) \ge V(\lambda s)/V(s)$ for $\lambda > 1$ and $r \le s$, and we recover (5.6) when $r$ tends to 0.
Exercise 5.8. Use Proposition 5.4 with $\theta + \eta = \varepsilon$ and $\eta = \varepsilon/n$, and (5.6). This choice gives $C = e$, but (as in the original Rogers’s argument) optimizing over $\eta$ leads to $C = 1$, at the expense of additional lower order terms.
Exercise 5.9. We know from Exercise 5.7 that $N(S^{n-1}, g, \varepsilon) \ge n + 1$ for any $\varepsilon < \pi/2$.
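Exercise 5.10. Write $\mathcal{N} = \{[\psi] : \psi \in \mathcal{N}_1\}$ for some set $\mathcal{N}_1 \subset S_{\mathbb{C}^d}$. Take now $T$ to be an $\varepsilon$-net in the unit circle $\{\zeta \in \mathbb{C} : |\zeta| = 1\}$ and check that the set
Note that $\operatorname{card} \mathcal{N}_2 = \operatorname{card} T \operatorname{card} \mathcal{N}$. Since we can ensure that $\operatorname{card} T = \lceil \pi/\varepsilon \rceil$, the result follows from the bound (5.7). For the upper bound, argue similarly that $P(S_{\mathbb{C}^d}, \varepsilon) \ge P(\mathsf{P}(\mathbb{C}^d), \varepsilon) \times P(S^1, \varepsilon)$, and then appeal to (5.1).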
Exercise 5.11. Rough two-sided estimates of the form $(C\varepsilon)^{2d-2}$ follow from Exercise 5.10 and (5.2), but the precise value requires a careful integration. First show that the question is a special case (with $n = 2d$) of the problem of calculating the spherical volume of the $\varepsilon$-neighborhood (in the geodesic distance) of $S^1$ considered as a subset of $S^{n-1}$. Next, observe that the (non-normalized) volume of that neighborhood equals $\operatorname{vol}_1(S^1) \operatorname{vol}_{n-3}(S^{n-3}) \int_0^\varepsilon \cos t \, \sin^{n-3} t \, dt$. Conclude by evaluating the integral and using repeatedly the formula (B.2).
Then choose $k = \log(n) / 2\log(1/t)$ and apply (i) to the vectors $(x_i^{\otimes k})$. See Theorem 9.3 in [Alo03]. (iii) Consider a maximal set of points verifying the condition from the exercise with $t = 1/r$.
Exercise 5.14. This is even simpler than the case of the sphere. Let $(x_i)_{1 \le i \le N}$ be chosen uniformly and independently on $\{-1,1\}^n$ and let $A = \bigcup_{i=1}^N B(x_i, \varepsilon)$. Then, by (5.13), $\mathbf{E} \operatorname{card} A^c \le 2^n (1 - V(\varepsilon)/2^n)^N \le \exp\big(n \log 2 - N 2^{n(H(\varepsilon)-1)}/(n+1)\big)$. This is less than 1 (and therefore the event $\{A = \{-1,1\}^n\}$ has positive probability) provided $N > n(n+1) \log(2) \cdot 2^{n(1 - H(\varepsilon))}$. The matching lower bound on covering numbers is (5.2).
ist
řtn ` ˘ řn ` ˘
Exercise 5.15. We have Vq ptq “ k“0 nk pq ´ 1qk ď k“0 nk pq ´ 1qk αk´tn “
q nHq ptq for α “ t{pp1 ´ tqpq ´ 1qq ď
rd
` n1.˘ For thetnlower` nbound,
˘ tn justp1´tqn
keep the last term
1
and write q ´nHq ptq Vq ptq ě q ´nHq tn pq ´ 1q “ tn t p1 ´ tq ě n`1 . The
` n˘ k
last inequality follows from the fact that k t p1 ´ tqn´k is maximal for k “ tn.
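The two-sided bound $q^{nH_q(t)}/(n+1) \le V_q(t) \le q^{nH_q(t)}$ can be tested directly for small parameters. In the sketch below we assume the standard $q$-ary entropy $H_q(t) = t\log_q(q-1) - t\log_q t - (1-t)\log_q(1-t)$, which is the expression consistent with the identities in the hint; the function names and the tolerance are ours, and $tn$ is taken to be an integer radius with $t \le 1 - 1/q$.

```python
import math


def ball_volume(n, q, radius):
    """Number of words in {0,...,q-1}^n within Hamming distance `radius` of a fixed word."""
    return sum(math.comb(n, k) * (q - 1) ** k for k in range(radius + 1))


def entropy_q(t, q):
    """q-ary entropy H_q(t) for 0 <= t < 1 (convention 0 log 0 = 0)."""
    if t == 0:
        return 0.0
    return (t * math.log(q - 1) - t * math.log(t)
            - (1 - t) * math.log(1 - t)) / math.log(q)


def bounds_hold(n, q, radius):
    """Check q^{nH}/(n+1) <= V <= q^{nH} on a log scale, t = radius/n <= 1 - 1/q."""
    t = radius / n
    log_v = math.log(ball_volume(n, q, radius))
    log_upper = n * entropy_q(t, q) * math.log(q)
    tol = 1e-9  # guards against floating-point rounding only
    return log_upper - math.log(n + 1) - tol <= log_v <= log_upper + tol


print(all(bounds_hold(30, 2, r) for r in range(16)))   # t <= 1/2
print(all(bounds_hold(30, 3, r) for r in range(21)))   # t <= 2/3
```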
Exercise 5.16. Consider $L = B_1^3$ and let $K$ be a facet of $L$, then $N'(K, L) = 1 < N(K, L)$. To obtain an example with $K$ symmetric, let $K$ be a rhombus made of two opposite faces of $L$. Then $N'(K, L) = 2$ but two central sections of $L$ (which
Exercise 5.18 by duality, but this fails since $K$ has centroid at the origin iff $K^\circ$ has Santaló point at the origin. However, a variant of the preceding hint gives a correct argument: by Lemma 5.8 and Proposition 4.18, we have $N(\partial K, -K, \varepsilon) \le N(\partial K, K_\cap, \varepsilon) \le (2 + 4/\varepsilon)^n =: N$. Let now $x_1, \dots, x_N$ in $\partial K$ be such that the sets $x_i - \varepsilon K$ cover $\partial K$. For each $i$, let $f_i$ be a linear form such that $f_i(x_i) = 1$ and $f_i \le 1$ on $K$, and let $Q = \bigcap_{1 \le i \le N} \{f_i \le 1\}$. Show that $y \in \partial K$ satisfies
and, subsequently, Theorem 4.21 to deduce that $\operatorname{vrad}(K_\cap) < 4 \operatorname{vrad}(K^\circ)$. Next, use the hypothesis to conclude that $\operatorname{vrad}(K_\cap) > \frac{c}{4\kappa} \operatorname{vrad}(K)$, where $c$ is the constant from Theorem 4.17. Then argue as in Exercises 5.18 and 5.19.
Exercise 5.21. If $T$ is a linear map such that $F = T(B_2^n)$, then $B_2^n = T(F^\circ)$.
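Exercise 5.22. (ii) Let $L = [0, \varepsilon/\sqrt{n}\,]^n$. The cubes $\{x + L : x \in \mathcal{N}\}$ have disjoint interiors and lie inside $B_2^n + L$, so $\operatorname{card} \mathcal{N} \le \frac{\operatorname{vol}(B_2^n + L)}{\operatorname{vol} L} = (\varepsilon/\sqrt{n})^{-n} \operatorname{vol}(B_2^n + L)$. Then use Urysohn’s inequality to bound the volume radius of $B_2^n + L$.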
Exercise 5.23. By the results of Exercise B.8, if $M = SO(n)$ (resp., $M = U(n)$, $M = SU(n)$), then $N(\frac{\pi}{4} K, \|\cdot\|_\infty, 2.5\varepsilon) \le N(M, \|\cdot\|_\infty, \varepsilon) \le N(\pi K, \|\cdot\|_\infty, \varepsilon)$, where $K$ denotes the operator norm unit ball in the space of real skew-symmetric (resp.,
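$L = K_\varepsilon^c$. Then prove, as a consequence of Proposition 5.2, that the function $t \mapsto V(t)\, V(\pi - t - \varepsilon)$ increases on the interval $[0, \frac{\pi}{2} - \frac{\varepsilon}{2}]$ and decreases on the interval $[\frac{\pi}{2} - \frac{\varepsilon}{2}, \pi - \varepsilon]$.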
Exercise 5.27. Consider $f(x) = \operatorname{dist}(x, A)$.
Exercise 5.28. We may arrange by translation that $K$ and $L$ are inside $R B_2^n$, so that the functions $w(K, \cdot)$ and $w(L, \cdot)$ are $R$-Lipschitz on $S^{n-1}$. Then use the union bound and Lévy’s lemma.
Exercise 5.29. Realize the normalized uniform measure on $\sqrt{N} S^{N-1}$ as the distribution of $\alpha_N G_N$, where $G_N$ is a standard Gaussian vector in $\mathbb{R}^N$ and $\alpha_N = \sqrt{N}/|G_N|$, and use the law of large numbers to conclude that $\alpha_N$ tends almost surely to 1.
Exercise 5.30. By approximation, it is enough to show that $\gamma_n(A) > \gamma_1((-\infty, a])$ implies $\gamma_n(A_\varepsilon) > \gamma_1((-\infty, a + \varepsilon])$. Consider the orthogonal projection (restricted to the sphere) $\pi_{N,n} : \sqrt{N} S^{N-1} \to \mathbb{R}^n$. For $N$ large enough, we know from Theorem 5.22 that the set $T := \pi_{N,n}^{-1}(A)$ has larger measure than the cap $C := \pi_{N,1}^{-1}((-\infty, a])$. It follows that $\sigma(T_\varepsilon) \ge \sigma(C_\varepsilon)$. Finally observe that $\pi_{N,n}^{-1}(A_\varepsilon) \supset T_\varepsilon$ while $C_\varepsilon = \pi_{N,1}^{-1}((-\infty, a + \varepsilon_N])$ for some $\varepsilon_N$ tending to $\varepsilon$ as $N$ tends to infinity; this follows from the (geodesic) radius of $C$ being $\sqrt{N} \arccos\big(-\frac{a}{\sqrt{N}}\big)$ and a similar formula for the radius of $C_\varepsilon$.
Exercise 5.31. Show that $\log \Phi$ is concave by computing second derivative and appealing to (A.4). Alternatively, use Proposition 5.2 and Poincaré’s lemma, and
$r$ large enough, we have $\Phi^{-1}\big(\gamma_n(r B_2^n)\big) > (1 - \delta) r$ or, equivalently, $\gamma_n\big((r B_2^n)^c\big) < \gamma_1\big([(1 - \delta) r, \infty)\big)$. Now choose a finite set $T \subset S^{n-1}$ such that $(r B_2^n)^c \subset \{x \in \mathbb{R}^n : \max_{u \in T} \langle x, u\rangle > (1 - \delta/2) r\}$ and use the fact that $\gamma_1\big([\theta r, \infty)\big) \gg \gamma_1\big([r, \infty)\big)$ as $r \to +\infty$ (with $\theta \in (0,1)$ fixed). The last fact follows, e.g., from (A.4).
pending on $n$). The same argument shows that the median of the gamma distribution with parameter $p$ is greater than $p - \frac{1}{3}$.
Exercise 5.35. Use the exponential Markov inequality and optimize over $s \ge 0$.
Exercise 5.36. We may assume $b - a = 1$. Write $X$ as the convex combination $X = (b - X) a + (X - a) b$, use Jensen’s inequality and the convexity of the exponential function to reduce to the inequality $b \exp(sa) - a \exp(sb) \le \exp(s^2/8)$. The latter follows since $g'' \le \frac{1}{4}$, where $g(s) = \log\big(b \exp(sa) - a \exp(sb)\big)$.
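The reduced inequality $b\,e^{sa} - a\,e^{sb} \le e^{s^2/8}$ can be probed numerically. The sketch below assumes the usual setting of Hoeffding's lemma, namely $a \le 0 \le b$ with $b - a = 1$, so that the left-hand side is the moment generating function of a mean-zero two-point variable; this sign assumption, the function names, the grid and the relative tolerance are ours.

```python
import math


def two_point_mgf(a, s):
    """b*exp(s*a) - a*exp(s*b) with b = a + 1: E exp(sX) for the mean-zero X
    taking the value a with probability b and the value b with probability -a."""
    b = a + 1.0
    return b * math.exp(s * a) - a * math.exp(s * b)


def hoeffding_holds(a, s, rel_tol=1e-12):
    """Compare with exp(s^2/8), allowing for floating-point rounding."""
    return two_point_mgf(a, s) <= math.exp(s * s / 8) * (1 + rel_tol)


# a ranges over [-1, 0] and s over [-10, 10].
checks = [hoeffding_holds(-i / 50, j / 10)
          for i in range(51) for j in range(-100, 101)]
print(all(checks))
```

The comparison is tightest for $a = -1/2$ and small $s$, where the left-hand side equals $\cosh(s/2)$.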
Exercise 5.37. Use the exponential Markov inequality for $\pm X$ with $s = \frac{\varepsilon}{2(1 \pm \varepsilon)}$, and the bound $1 + \varepsilon \le \exp\big(\varepsilon - (\varepsilon^2 - \varepsilon^3)/2\big)$. Checking the last inequality is a tedious but elementary computation.
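Before doing the computation by hand, the bound $1 + \varepsilon \le \exp(\varepsilon - (\varepsilon^2 - \varepsilon^3)/2)$ can be checked on a grid. The range $0 \le \varepsilon \le 1$ used below is our assumption (the inequality fails for small negative $\varepsilon$), and the function name is illustrative.

```python
import math


def slack(eps):
    """exp(eps - (eps^2 - eps^3)/2) - (1 + eps); expected >= 0 on [0, 1]."""
    return math.exp(eps - (eps ** 2 - eps ** 3) / 2) - (1 + eps)


worst = min(slack(k / 1000) for k in range(1001))
print(worst)
```

On this range the bound follows from $\log(1+\varepsilon) \le \varepsilon - \varepsilon^2/2 + \varepsilon^3/3$.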
Exercise 5.38. Argue as in the proof of Lemma 6.16 and Exercise 6.17, then set $Y_0 = \lambda^{1/2} (Y - a)$.
Exercise 5.39. Easy.
Exercise 5.40. Reduce the problem to $\lambda = 1$, then choose $\delta = \sqrt{\log(\kappa A)}$.
Exercise 5.41. Assume $\lambda = 1$. For the median, check that if $0 \le t \le \sqrt{\log(2A)}$, then $2A^2 \exp(-t^2/2) \ge \frac{1}{2}$. For the 3rd quartile, check that $3\sqrt{2} A^2 \exp(-t^2/2) \ge \frac{3}{4}$ for $0 \le t \le \sqrt{\log(4A)}$, but that similar inequality holds with $3\sqrt{2} A^2$ replaced by $4A^2$ only if $A \ge 3^{2/3}/4$. For other quantiles, recalculate the bound on $|M - a|$ and then show analogous inequalities. The only verification that is not straightforward is establishing the bound $4A^2 \exp(-\lambda t^2/2)$ when $M$ is the mean under the original hypothesis $A \ge \frac{1}{2}$ (i.e., without assuming that $A \ge e^{-1/3}$); it can be done by identifying a family of extremal c.d.f. of $Y$ that are of the form
$$F(t) = \begin{cases} 0 & \text{if } t < 0 \\ 1 - p & \text{if } 0 \le t \le t_0, \\ 1 - A e^{-t^2} & \text{if } t \ge t_0 \end{cases}$$
where $t_0 = \sqrt{\log(A/p)}$ is the solution to $p = A e^{-t^2}$, and then using the calculation from the proof of Lemma 6.16 together with some numerics.
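The thresholds in the median and 3rd-quartile checks are pure arithmetic: the functions of $t$ involved are decreasing, so it suffices to evaluate them at the right endpoint of the range. The sketch below does this, reading the garbled constants as $2A^2$ and $3\sqrt{2}A^2$ (our reconstruction); function names are illustrative.

```python
import math


def median_value(A):
    """2 A^2 exp(-t^2/2) at t = sqrt(log(2A)), the worst case on [0, sqrt(log 2A)]."""
    t = math.sqrt(math.log(2 * A))
    return 2 * A ** 2 * math.exp(-t * t / 2)


def quartile_value(A, constant):
    """constant * A^2 * exp(-t^2/2) at t = sqrt(log(4A)), the worst case on the range."""
    t = math.sqrt(math.log(4 * A))
    return constant * A ** 2 * math.exp(-t * t / 2)


threshold = 3 ** (2 / 3) / 4   # about 0.52

print(median_value(0.5))                       # boundary case A = 1/2
print(quartile_value(0.5, 3 * math.sqrt(2)))   # boundary case A = 1/2
print(quartile_value(threshold, 4))            # constant 4 needs A >= threshold
```

In closed form the two endpoint values are $\sqrt{2}A^{3/2}$ and $\tfrac{c}{2}A^{3/2}$, which makes the roles of $A \ge \frac12$ and $A \ge 3^{2/3}/4$ visible.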
then $\operatorname{dist}(A, B) \ge \varepsilon/L$. For the first assertion and $M = M_{1/4}$, the 1st quartile, consider $A = \{f \le M_{1/4}\}$ and $B = \{f \ge M_f\}$, and similarly for the other quantiles.
case is when $f_0(x) = \operatorname{dist}(x, A)$ for some half-sphere $A$ (the distance being the geodesic distance). And then $\mathbf{E} f_0 = \int_0^{\pi/2} V(t)\, dt \le \sqrt{\pi/8n}$ from (5.5).
Exercise 5.45. It follows from the solution of the isoperimetric problem (and from the formula $\mathbf{E} f^2 = \int_0^\infty 2t\, \sigma(|f| \ge t)\, dt$) that among 1-Lipschitz functions with median 0, $\mathbf{E} f^2$ is maximal for $f_0(x) = \arcsin\langle x, u\rangle$ for some $u \in S^{n-1}$. For any
The second one is trivial if $t \le \sqrt{2/n}$ or if $t > q$. If $t \in (\sqrt{2/n}, q]$ (which implies $t > q - m$), then
$$\mathbf{P}(f \le q - t) = \mathbf{P}\big(f \le m - (t - (q - m))\big) \le e^{-n(t - (q - m))^2/2} \le e \cdot e^{-nt^2/2}.$$
To derive the last inequality, use $t \le q \le \sqrt{m^2 + 2/n}$. A very ambitious reader may try to come up with a better estimate based on the sharper bound $\operatorname{Var} f \le \frac{1}{n-1}$ (see the hint for Exercise 5.45).
Exercise 5.48. Apply the hypothesis to $f(x) = \min\{\operatorname{dist}(x, A), \varepsilon\}$.
Exercise 5.49. Consider $A = X \setminus B_\varepsilon$.
Exercise 5.50. The concavity of $g$ is a consequence of Ehrhard’s inequality (Theorem 5.23). Since $g(M_f) = 0$, we conclude that the inequality $g(t) \le \alpha(t - M_f)$ holds for some $\alpha > 0$ and every real $t$. This is equivalent to the statement $\gamma_n(\{f \le t\}) \le \mathbf{P}(Z \le t)$ where $Z$ is an $N(M_f, \alpha^{-2})$ random variable. The conclusion follows since stochastic domination allows comparison of the expectations.
Exercise 5.51. The distribution of $(X_k)_{1 \le k \le N}$ is the image of $\gamma_N$ under an affine map.
Exercise 5.52. Consider the function $\tilde f$ defined on $S^{n-1}$ by $\tilde f(x) = \inf\{f(y) + L g(x, y) : y \in \Omega\}$. Show that $\tilde f$ is $L$-Lipschitz, coincides with $f$ on $\Omega$ and that $M_f$ is a central value for $\tilde f$. Then apply Corollary 5.32.
Exercise 5.53. Use the fact that for $B \subset Y$ and $\varepsilon > 0$, $\phi^{-1}(B_\varepsilon) \supset \phi^{-1}(B)_\varepsilon$. The statement about median is an immediate consequence of (5.39). For the statement
Exercise 5.54. If $n = 1$, the function $x \mapsto \Phi(x)$ (the c.d.f. of the $N(0,1)$ distribution, see (A.3)) pushes forward $\gamma_1$ to the Lebesgue measure on $[0,1]$ and is Lipschitz with constant $(2\pi)^{-1/2}$, which allows one to transfer the results on Gaussian concentration to $[0,1]$. For general $n \in \mathbb{N}$, consider the surjection $\phi : \mathbb{R}^n \to [0,1]^n$
Exercise 5.55. (i) $x \mapsto \sup\{t : \nu(F(x, \cdot) \ge t) \ge 1/2\}$ is 1-Lipschitz, and similarly for the other term. (ii) If $f(x, y) > M_\phi + t$, then either $\phi(x) > M_\phi + t/2$ or
introduce the functions $f_{x_2}(x_1) = f(x_1, x_2)$ and $g(x_2) = \int f_{x_2}\, d\mu_1$, and show that they are 1-Lipschitz.
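basis of $\mathsf{M}_{k,n-k}$ formed by $(E_{ab})$ (in the real case) or $(E_{ab}) \cup (F_{ab})$ (in the complex case). We compute
$$\frac{1}{2} \|X E_{ab}^\dagger - E_{ab} X^\dagger\|_{HS}^2 + \frac{1}{2} \|X^\dagger E_{ab} - E_{ab}^\dagger X\|_{HS}^2 = \begin{cases} s_a^2 + s_b^2 & \text{if } a \ne b \\ 0 & \text{if } a = b \end{cases}$$
$$\frac{1}{2} \|X F_{ab}^\dagger - F_{ab} X^\dagger\|_{HS}^2 + \frac{1}{2} \|X^\dagger F_{ab} - F_{ab}^\dagger X\|_{HS}^2 = \begin{cases} s_a^2 + s_b^2 & \text{if } a \ne b \\ 4 s_a^2 & \text{if } a = b \end{cases}$$
and (E.3) follows by summing over $a, b$. In the above formulas it is tacitly assumed that $s_j = 0$ for $j > \min\{k, n-k\}$.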
Exercise 5.59. For $U(n)$, note that $\mathfrak{u}_n$ (= the skew-Hermitian matrices) contains a central element $u_1 := i\,\mathrm{I}/\sqrt{n}$, and so it follows from (5.44) and (5.45) that $\operatorname{Ric}_I(u_1) = 0$. In the case of $SO(n)$, consider the orthonormal basis of $\mathfrak{so}_n$ of matrices of the form $S_{ij} = \frac{1}{\sqrt{2}} (|i\rangle\langle j| - |j\rangle\langle i|)$ and reduce to the case $X = S_{12}$. The argument for $SU(n)$ is similar; note that $u_1 = i\,\mathrm{I}/\sqrt{n} \notin \mathfrak{su}_n$. For details of the computations for both $SO(n)$ and $SU(n)$, see Proposition E.15 in [AGZ10].
Exercise 5.60. Test the log-Sobolev inequality on the function $x \mapsto \exp(\lambda x)$ for some $\lambda \ne 0$. Alternatively, consider the function $F : x \mapsto x$ in (5.49) and let $t \to \infty$.
Exercise 5.61. (i) There is a contraction $\phi : S^1 \to [0, \pi]$ which pushes forward $\sigma$ to the normalized Lebesgue measure. (ii) Consider the Fourier series of the function $f$ from (5.54). (iii) Consider $f(x) = \cos(\pi x)$.
ot
p p
Exercise ş 5.62. Useş Jensen’s inequality in the form |P t f | ď Pt p|f | q, and the
N
p
relation Pt g dγn “ g dγn (justify!) applied for g “ |f | . Note that the argument
is much easier when p “ 2, the contractivity following right away from (5.57).
ly.
Exercise 5.63. Use the fact that $\mathbf{E} \exp(\lambda Z) = \exp(\lambda^2/2)$ when $Z$ has an $N(0,1)$ distribution. The result is $P_t f_\lambda = \exp\big(\lambda^2 (1 - e^{-2t})/2\big) f_{\lambda e^{-t}}$. Since $\|f_\lambda\|_{L_p(\gamma_n)} = e^{p\lambda^2/2}$, the statement about sharpness follows by taking $\lambda \to \infty$.
Exercise 5.64. Write $A_{s/n} = \{y \in \{-1,1\}^n : (1 + \varepsilon) y_1 + y_2 + \cdots + y_n \le m + s + \varepsilon\}$
$\mu(A_\varepsilon) = \frac{1}{2}$. Positive results follow from (5.59) and from Exercise 5.64.
Exercise 5.66. Try $N = 9$, and consider a Hamming ball of radius 1 plus any 4 of
Exercise 5.67. The second assertion of Theorem 5.54 can be restated as follows: If $K, L \subset \mathbb{R}^n$ satisfy $\operatorname{dist}(K, L) \ge t$ and one of $K$, $L$ is convex, then $\mu(K)\mu(L) \le e^{-t^2/2}$.
Exercise 5.68. Consider the supremum of all 1-Lipschitz affine functions that are
smaller than f on K.
Exercise 5.69. We have for $y \in \{-1,1\}^n$
$$f(y) = \frac{1}{\sqrt{2}} \inf\Big\{ |y - z| : z \in \{-1,1\}^n,\ \sum z_i \le 0 \Big\}$$
(this formula is valid for $n$ even and has to be slightly modified for $n$ odd), so $f$ is $\frac{1}{\sqrt{2}}$-Lipschitz. The bound on the probability follows from the central limit theorem.
Exercise 5.70. Write $\mathbf{E}\,|f(G)|^p = \int_0^\infty p t^{p-1}\, \mathbf{P}(|f(G)| > t)\, dt$ and use the Gaussian isoperimetric inequality in the form given in Theorem 5.24.
Exercise 5.71. First note that clearly $\|X\|_{L_2}^2 = \sum_i |a_i|^2$. Next, for $p \ge 2$, use Proposition 5.58 and the fact that $\|\varepsilon_i\|_{\psi_2} = 1$ (this gives $B_p = O(\sqrt{p})$ which is the correct order of magnitude). The case $p = 1$ (and hence $p \in (1, 2)$) follows then from the inequality $\mathbf{E}\,|X| \ge (\mathbf{E} X^2)^{3/2} / (\mathbf{E} X^4)^{1/2}$. An alternative approach is to appeal to Theorem 5.56 to upper-bound higher moments of $X$ (or to Theorem 5.54 and to the fact that, for any nonnegative variable $W$, we always have $\mathbf{E} W \ge \frac{1}{2} M_W$).
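For small $n$ the moments of a Rademacher sum $X = \sum_i a_i \varepsilon_i$ can be computed exactly by averaging over all $2^n$ sign patterns, which gives a direct check of $\mathbf{E}|X| \ge (\mathbf{E}X^2)^{3/2}/(\mathbf{E}X^4)^{1/2}$ (a consequence of Hölder's inequality). The helper names and the test coefficients below are ours.

```python
import itertools
import math


def rademacher_moments(a):
    """Exact (E|X|, E X^2, E X^4) for X = sum_i a_i eps_i, eps_i independent signs."""
    first = second = fourth = 0.0
    for signs in itertools.product((-1, 1), repeat=len(a)):
        x = sum(s * ai for s, ai in zip(signs, a))
        first += abs(x)
        second += x ** 2
        fourth += x ** 4
    total = 2 ** len(a)
    return first / total, second / total, fourth / total


def lower_bound_holds(a):
    """Check E|X| >= (E X^2)^{3/2} / (E X^4)^{1/2}, up to rounding."""
    m1, m2, m4 = rademacher_moments(a)
    return m1 >= m2 ** 1.5 / math.sqrt(m4) - 1e-12


print(rademacher_moments([1, 1, 1]))
print(lower_bound_holds([3, -1, 0.5, 2]))
```

The second moment returned is always $\sum_i a_i^2$, in line with the first sentence of the hint.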
Exercise 5.72. By change of variables, reduce the problem to comparing the moments of a norm (or a seminorm) $\|\cdot\|$ on $\mathbb{R}^n$ calculated with respect to the normalized counting measure $\mu$ on $\{-1,1\}^n$. Next, follow the last strategy from the hint to Exercise 5.71 combined with Theorem 5.54. The only difference is that while previously we got “for free” the fact that the Lipschitz constant of the linear function $(t_i) \mapsto \sum_i a_i t_i$ was exactly the same as $\|\sum_i a_i \varepsilon_i\|_{L_2}$, this is no longer automatically true for the function $(t_i) \mapsto \|(t_i)\|$. However, the Lipschitz constant and the median of $\|\cdot\|$ can still be related: if $K = \{(t_i) : \|(t_i)\| \le M_{\|\cdot\|}\}$, then the Euclidean inradius of $K$ cannot be too small. This follows from the scalar case: if the Euclidean inradius of $K$ was small, then $K$ would be contained in a narrow band $\{t : |\langle t, a\rangle| \le 1\}$ and, consequently, the median of the function $(t_i) \mapsto \big|\sum_i a_i t_i\big|$ would be at most 1, much smaller than its $L_2$-norm (equal to $|a|$), contradicting the argument from the hint to Exercise 5.71.
Exercise 5.73. (i) We have $\mathbf{E} X^n = 0$ if $n$ is odd; compare both Taylor series using the inequality $k^k k!/(2k)! \le (e/4)^k$. (ii) Use Jensen’s inequality.
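The combinatorial inequality $k^k k!/(2k)! \le (e/4)^k$ is convenient to test on a logarithmic scale, which avoids overflow for large $k$. The function names below are ours; `math.lgamma` supplies $\log k!$ and $\log (2k)!$.

```python
import math


def log_ratio(k):
    """log of k^k * k! / (2k)! for an integer k >= 1."""
    return k * math.log(k) + math.lgamma(k + 1) - math.lgamma(2 * k + 1)


def inequality_holds(k):
    """Check k^k k!/(2k)! <= (e/4)^k after taking logarithms."""
    return log_ratio(k) <= k * (1 - math.log(4))


print(all(inequality_holds(k) for k in range(1, 501)))
```

By Stirling's formula the ratio of the two sides tends to $1/\sqrt{2}$, so the inequality holds with room to spare.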
Exercise 5.74. Use the bound on the Laplace transform obtained in Exercise
for $p \ge 1$. Unless $p$ is small, this follows from Stirling’s formula (on which the asymptotic formula (5.63) is based). For small $p$ one can verify the inequality
Exercise 5.76. (iii) Choose $\lambda$ to be the minimum of $1/2\|a\|_\infty$ and $t/4\sum a_i^2$.
Chapter 6
Exercise 6.2. Define $t_N > 0$ by the formula $\exp(t_N^2/2) = N/\log^{3/2} N$ and check using (A.4) that $\mathbf{P}(M \le t_N) = O(1/N^c)$ for some constant $c > 0$, where $M = \max\{X_k : 1 \le k \le N\}$. Conclude that $\mathbf{E} M \ge \sqrt{2 \log N} - O\Big(\frac{\log\log N}{\sqrt{\log N}}\Big)$ (handle $\mathbf{E} M^+$ and $\mathbf{E} M^-$ separately). See [DLS14] for more precise bounds.
Exercise 6.3. The suggested inequality follows from the formula
$$\mathbf{E}\, Y = \int_0^\infty \mathbf{P}(Y > t)\, dt,$$
valid for any nonnegative random variable $Y$.
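For an integer-valued nonnegative variable the tail function $t \mapsto \mathbf{P}(Y > t)$ is constant on each interval $[k, k+1)$, so the integral reduces to the sum $\sum_{t \ge 0} \mathbf{P}(Y > t)$. The sketch below checks the formula in that special case with exact fractions; the example distribution and the function names are ours.

```python
from fractions import Fraction


def expectation(pmf):
    """E Y for a finite probability mass function {value: probability}."""
    return sum(Fraction(v) * p for v, p in pmf.items())


def tail_sum(pmf):
    """Sum over t = 0, 1, 2, ... of P(Y > t), for nonnegative integer values."""
    top = max(pmf)
    return sum(sum(p for v, p in pmf.items() if v > t) for t in range(top))


# Y takes the values 0, 1, 3 with probabilities 1/4, 1/4, 1/2.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 4), 3: Fraction(1, 2)}
print(expectation(pmf), tail_sum(pmf))
```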
Exercise 6.4. By Carathéodory’s theorem (Theorem 1.2), $K$ equals the union of a family of simplices, each of which has vertices of the form $x_{k_1}, \dots, x_{k_n}, x_{k_{n+1}}$. Next, upper-bound the number of such simplices (simple combinatorics) and the volume
Exercise 6.7. With the assumption that $\mathbf{E} X_k^2 = \mathbf{E} Y_k^2$ this is immediate by integration (take $\lambda_k = t$ for all $k$). Without this assumption, let $Z$ be an $N(0,1)$ random variable independent of $(X_k)$, $(Y_k)$. For $0 < t < 1$ and $R$ large enough, define new processes $(\bar X_k)$ and $(\bar Y_k)$ by $\bar X_k = t X_k + \alpha_k Z$ and $\bar Y_k = Y_k + \beta_k Z$, where the positive numbers $\alpha_k$, $\beta_k$ are adjusted so that $\mathbf{E} \bar X_k^2 = \mathbf{E} \bar Y_k^2 = R^2$. Check that for $R$ large enough, the second part of Slepian’s lemma can be applied to $(\bar X_k)$ and $(\bar Y_k)$. Check also that $\mathbf{E} \sup \bar X_k = R + t\, \mathbf{E} \sup X_k + O(1/R)$ and similarly for $(\bar Y_k)$, so that letting $R \to \infty$ and then $t \to 1$ yields (6.7).
Exercise 6.8. Any centered Gaussian measure is the pushforward of the standard
Gaussian measure by a linear transformation.
Exercise 6.9. Without loss of generality, $L = \{(x_1, \dots, x_n) \in \mathbb{R}^n : |x_1| \le t\}$ for some $t > 0$. Define $f(s) = \gamma_{n-1}\big(\{y \in \mathbb{R}^{n-1} : (s, y) \in K\}\big)$, an even function of $s$. By (4.28), the function $\log f$ is concave, and therefore decreasing on $[0, +\infty)$. It follows (differentiate) that
$$\int_0^t f \, d\gamma_1 \ge 2 \gamma_1([0, t]) \int_0^\infty f \, d\gamma_1,$$
We now prove Proposition 6.9. Without loss of generality we may assume that $X_k = \langle G, x_k\rangle$, where $G$ is a standard Gaussian vector in $\mathbb{R}^d$ (for some $d \le N$), and $x_1, \dots, x_N \in \mathbb{R}^d$. We apply (6.13) to $L = \{x \in \mathbb{R}^d : |\langle x, x_1\rangle| \le t_1\}$ and
Exercise 6.17. The worst case is when the random variables Yi are non-negative
and disjointly supported.
Exercise 6.18. Use the union bound to estimate $\mathbf{P}(\max_i Y_i > t)$ and argue as in the proof of Lemma 6.16.
Exercise 6.19. This is again similar to the proof of Lemma 6.16. First use (6.6) and the union bound to estimate $\mathbf{P}(\max_k X_k > t)$; this leads (as $n \to \infty$) to an expression involving Riemann zeta function $\zeta(s)$. Then just use the fact that if $s \ge 2$, then $\zeta(s) \le \zeta(2) = \frac{\pi^2}{6}$. The best bound for $\mathbf{E} \max_k X_k$ that can be obtained by this line of argument is about 1.724. On the other hand, it is not hard to see that $\mathbf{E} \max_k X_k > \sqrt{2}$ for $n$ large enough. The true value of $\mathbf{E} \sup_k X_k$ seems to be between 1.45 and 1.5. To get a lower bound on the Dudley integral, note that for $k \le n$ the elements $X_1, \dots, X_k$ are $\varepsilon$-separated with $\varepsilon = \sqrt{2/(1 + \log k)}$.
Exercise 6.20. Use Lemma 6.1 and (6.17).
Exercise 6.21. For $k \le l \le n$, we compute $\|X_k - X_l\|_2 = \sqrt{2 - 2\sqrt{k/l}}$. Since the family $(X_{2^j})_{j \le \log n}$ is $\sqrt{2 - \sqrt{2}}$-separated, the lower bound follows from Sudakov’s inequality. For the upper bound, use Dudley’s inequality and the fact that, for $\alpha > 1$, the family $(X_{\lfloor \alpha^j \rfloor}) \cup (X_{\lceil \alpha^j \rceil})$ gives a $\sqrt{2 - 2/\sqrt{\alpha}}$-net with at most $2 \log n / \log \alpha$ elements.
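The separation claim for the dyadic subfamily follows from the distance formula alone: for $k < l$ with $l \ge 2k$ one has $\sqrt{k/l} \le 1/\sqrt{2}$. A quick numerical confirmation is sketched below; it takes the formula $\|X_k - X_l\|_2 = \sqrt{2 - 2\sqrt{k/l}}$ from the hint as given, and the function names are ours.

```python
import math


def distance(k, l):
    """L2 distance sqrt(2 - 2*sqrt(k/l)) between X_k and X_l (order of k, l irrelevant)."""
    k, l = min(k, l), max(k, l)
    return math.sqrt(2 - 2 * math.sqrt(k / l))


def min_separation(indices):
    """Smallest pairwise distance within the subfamily (X_i), i in indices."""
    return min(distance(k, l)
               for i, k in enumerate(indices) for l in indices[i + 1:])


dyadic = [2 ** j for j in range(11)]          # indices 1, 2, 4, ..., 1024
print(min_separation(dyadic), math.sqrt(2 - math.sqrt(2)))
```

The minimum is attained by consecutive dyadic indices, where the bound is an equality.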
Exercise 6.22. Define a sequence $(a_k)$ by $a_k = \inf\{\eta > 0 : N(T, \eta) \le 2^{2^k}\}$ for $k \ge 1$, $a_0$ being the radius of $T$. The right-hand side of (6.27) is exactly $\sum_{k=0}^\infty 2^{k/2} a_k$. To compare with the left-hand side, use the bound $2^{2^k} \le N(T, \eta) \le 2^{2^{k+1}}$ for $\eta \in [a_{k+1}, a_k]$, $k \ge 1$.
Exercise 6.23. Consider the sets $T_k = \{X_1, \dots, X_{2^{2^k} - 1}, X_n\}$.
Exercise 6.26. The direction that is not entirely straightforward is showing that $d_\infty(X, Y)$ does not exceed the infimum in (6.32). If $\tilde X = F_X^{-1} : (0,1) \to \mathbb{R}$ (the inverse function) exists, then, when considered as a random variable with respect to the Lebesgue measure, its law is the same as that of $X$. With care, such $\tilde X$ can be defined also if $F_X$ is not strictly increasing and/or discontinuous. Given $X$, $Y$, what is $\|\tilde X - \tilde Y\|_\infty$? This argument shows also that the infimum in (6.30) is attained.
Exercise 6.27. Case $1^\circ$ (the bounded case). If $\|Y_n\|_\infty \le M$ for some finite $M$ and all $n$, then also $\|Z\|_\infty \le M$. Now approximate $f$ on $[-M, M]$ by a Lipschitz function and apply (6.31). Case $2^\circ$ (the general case). Let $\varepsilon > 0$ and choose $M$ so that $\mathbf{P}(|Z| > M) < \varepsilon$. Then, for all sufficiently large $n$, $\mathbf{P}(|Y_n| > M + 1) < \varepsilon$. Apply Case $1^\circ$ to $Y_n$’s and $Z$ truncated at the level $M + 1$, and then let $\varepsilon \to 0$. (The last step uses the hypothesis that $f$ is bounded.)
Exercise 6.28. See the hint to Exercise 6.27; note that under the present hypotheses Case $1^\circ$ always holds. For an example, consider $Z$ with distribution $N(0,1)$, $Y_n = Z + \frac{1}{n}$ and $f(x) = \frac{e^{x^2/2}}{1 + x^2}$.
Exercise 6.29. The measures $(\frac{1}{2} + \frac{1}{n}) \delta_0 + (\frac{1}{2} - \frac{1}{n}) \delta_1$ converge weakly but do not converge in $\infty$-Wasserstein distance, as $n$ tends to infinity.
Exercise 6.30. The function $A \mapsto \lambda_k(A)$ is 1-Lipschitz with respect to the operator norm. It is remarkable that a similar inequality (with an additional multiplicative constant $C < 3$ on the right hand side) holds for normal matrices [BDM83, BDK89].
Exercise 6.31. For (2), use the fact that the image of a standard Gaussian vector under the orthogonal projection onto a subspace is the standard Gaussian vector in that subspace.
Exercise 6.32. Show that if a random matrix $X \in \mathsf{M}_n^{\mathrm{sa}}$ is unitarily invariant, then $U \operatorname{Diag}(X) U^\dagger$ (where $U$ is a Haar-distributed random unitary matrix independent of $X$) has the same distribution as $X$.
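Exercise 6.33. If $\mathcal{N}$ is an $\varepsilon$-net in $(S_{\mathbb{C}^n}, |\cdot|)$, show (argue as in the proof of Lemma 5.9) that for any $A \in \mathsf{M}_n$,
$$\|A\|_\infty = \sup_{x, y \in S_{\mathbb{C}^n}} \big|\langle x|A|y\rangle\big| \le \frac{1}{1 - 2\varepsilon} \sup_{x, y \in \mathcal{N}} \big|\langle x|A|y\rangle\big|.$$
Then use Proposition 6.3.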
Exercise 6.34. Use Exercise 6.30 with $A$ a $\mathrm{GUE}(n)$ matrix and $B = A - \frac{\operatorname{Tr} A}{n}\, \mathrm{I}$.
Exercise 6.35. Show that the function $(z - x_+)(z - x_-)$ admits an analytic square root $g_\lambda : \mathbb{C} \setminus [x_-, x_+] \to \mathbb{C}$ such that $g_\lambda(x) > 0$ for $x \in (x_+, \infty)$, $g_\lambda(x) < 0$ for
It follows that if $M := \int_{x_-}^{x_+} f_\lambda$, then $\int_\gamma \frac{g_\lambda(z)}{z}\, dz = 2iM$ for any closed path $\gamma$ which circles $[x_-, x_+]$ once in the clockwise direction, but does not wind around 0. To evaluate the path integral over $\gamma$ we choose $R > x_+$ and set $\Gamma(t) = R e^{it}$, $0 \le t \le 2\pi$,
(i) $\int_\gamma \frac{g_\lambda(z)}{z}\, dz + \int_\Gamma \frac{g_\lambda(z)}{z}\, dz = 2\pi i\, g_\lambda(0)$ by the Cauchy integral formula, or by the residue theorem,
(ii) $\int_\Gamma \frac{g_\lambda(z)}{z}\, dz$ can be related to the constant term of the Laurent expansion of $g_\lambda$, which in turn can be found by subtracting the dominant (as $z \to \infty$) term $z$ and
sum of $2ns$ squared independent $N(0,1)$ variables.
Exercise 6.41. Let $\psi \in S_{\mathbb{C}^n}$ be uniformly distributed, $A = |\psi\rangle\langle\psi| - I/n$, and $B$
be a $\mathrm{GUE}_0(n)$ random matrix. By symmetry, the covariances of $A$ and $B$ (considered
as $M_n^{sa}$-valued random vectors) are proportional, i.e., there exists $\beta > 0$
such that $\mathbf{E} \operatorname{Tr}(AM)^2 = \beta^2\, \mathbf{E} \operatorname{Tr}(BM)^2$ for every $M \in M_n^{sa}$. We compute that
$\beta = 1/\sqrt{n(n-1)}$, and the result follows from the multivariate central limit theorem.
Exercise 6.42. Use Proposition 6.34 and Proposition A.1(ii).
Exercise 6.43. (i) This is more transparent if we think of $S_{\mathbb{C}^n \otimes \mathbb{C}^s}$ as the Hilbert–Schmidt
sphere $S_{HS} \subset M_{n,s}$, and identify $\rho$ and $AA^\dagger$, with $A$ uniformly distributed
on $S_{HS}$. The function becomes $f(A) = |A^\dagger y|$ and is 1-Lipschitz for $\|\cdot\|_\infty$, hence for
$\|\cdot\|_{HS}$. To apply Exercise 5.46 we identify $S_{HS}$ with $S^{2ns-1}$. (ii) Given $x \in S_{\mathbb{C}^n}$, let
$y \in \mathcal{N}$ with $|x - y| \le \delta$ and write
(The last inequality is valid whenever $s \ge n$.) By (i), it follows that
$\mathbf{P}(\|\Delta\| \ge 48/\sqrt{ns}) \le 64^n (1 + e) \exp(-16n)$, which tends to 0 as $n$ tends to infinity.
Exercise 6.44. Combine the results from Exercise 2.19 and Exercise 6.38.
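Exercise 6.45. (i) Expand $\mathbf{E} \operatorname{Tr} GG^\dagger GG^\dagger = \sum_{i,k=1}^n \sum_{j,l=1}^s \mathbf{E}[G_{ij} \overline{G_{kj}} G_{kl} \overline{G_{il}}]$ and
notice that by independence $\mathbf{E}[G_{ij} \overline{G_{kj}} G_{kl} \overline{G_{il}}] = 1_{\{i=k\}} + 1_{\{j=l\}}$ (using the value
$\mathbf{E} |Z|^4 = 2$ for $Z \sim N_{\mathbb{C}}(0, 1)$). The second computation is similar. (ii) Write $GG^\dagger$
as the product of independent random variables $\frac{GG^\dagger}{\operatorname{Tr} GG^\dagger} \times \operatorname{Tr} GG^\dagger$ and use (i).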
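Taking the identity in (i) for granted, the fourth moment reduces to a finite index count that can be checked by brute force. The sketch below (the function name is hypothetical) sums $1_{\{i=k\}} + 1_{\{j=l\}}$ over all indices, which yields the closed form $ns(n+s)$.

```python
from itertools import product


def fourth_moment_sum(n, s):
    """Sum of 1{i=k} + 1{j=l} over i, k in range(n) and j, l in range(s).

    By the identity quoted in part (i), this equals E Tr (G G^dagger)^2
    for an n x s matrix G with independent N_C(0, 1) entries.
    """
    total = 0
    for i, k in product(range(n), repeat=2):
        for j, l in product(range(s), repeat=2):
            total += (i == k) + (j == l)
    return total
```

The term $1_{\{i=k\}}$ contributes $ns^2$ and the term $1_{\{j=l\}}$ contributes $n^2 s$, which is where $ns(n+s)$ comes from.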
Exercise 6.46. Notice that for fixed $t \in \mathbb{R}^n$, $\mathbf{E} \max_{u \in L} \langle Bt, u \rangle = |t|\, w_G(L)$, and
and conclude by Slepian's lemma that $\mathbf{E} \lambda_1(A) \le 2\kappa_n$. The reason for a factor 2
in the first equality is that $A$ is a standard Gaussian vector in the space $M_n^{sa}$ times
$\sqrt{2}$. The argument for the inequality is a special case of that from Exercise 6.47,
but using the bra-ket notation makes it easier to rewrite it when $A$ is a $\mathrm{GUE}(n)$
matrix, in which case we get $\mathbf{E} \lambda_1(A) \le \sqrt{2}\, \kappa_{2n}$.
Exercise 6.49. Let $G_1$, $G_2$, $G_3$ be standard Gaussian vectors in $\mathbb{R}^m$, $\mathbb{R}^n$, $\mathbb{R}^m \otimes \mathbb{R}^n$
respectively. Compare the processes $X_{(t,u)} = \langle G_3, t \otimes u \rangle$ and $Y_{(t,u)} = r_L \langle G_1, t \rangle + r_K \langle G_2, u \rangle$
(indexed by $(t, u) \in K \times L$) via Slepian's lemma. To deduce the inequality
for the usual mean width, use Proposition A.1(ii).
Exercise 6.50. Here is an outline of the complex case, the real case being similar.
Proceed inductively as follows. Choose a (random) unitary matrix $V_0 \in U(s)$ with
the property that the matrix $BV_0$ has a zero entry at position $(1, j)$ for $j > 1$, while
the $(1, 1)$-entry $\alpha$ is positive (note that $\alpha$ follows a $\chi(s)$ distribution). Then choose
a (random) unitary matrix $U_0 \in U(n)$ with the properties that $U_0|1\rangle = |1\rangle$ and that
the matrix $U_0 B V_0$ has a zero entry at position $(i, 1)$ for $i > 1$, while the $(2, 1)$-entry
$\beta$ is positive (note that $\beta$ follows a $\chi(n - 1)$ distribution). Repeat the procedure
with the $(n - 1) \times (s - 1)$ bottom right block of $U_0 B V_0$, which has independent
$N_{\mathbb{C}}(0, 1)$ entries and is independent of $\alpha$, $\beta$.
Once the Lemma is proved, the second part of the exercise follows formally
from the facts that (a) $B$ has the same distribution as $WBX$ where $W \in U(n)$
and $X \in U(s)$ are Haar-distributed and independent of $B$ and (b) if $U$ is a random
or deterministic unitary matrix and $W$ is Haar-distributed and independent of $U$,
then $UW$ is Haar-distributed and independent of $U$.
Exercise 6.52. It is enough to show that the relation $\langle \Omega | P(a_i + a_i^\dagger) | \Omega \rangle = \int P \, d\mu_{SC}$
holds for every polynomial $P$. We reduce to the case $P(X) = X^n$ and check by
expansion that $\langle \Omega | (a_i + a_i^\dagger)^n | \Omega \rangle$ is the number of Dyck paths of length $2n$, which is
the $n$th Catalan number, and also $\int x^n \, d\mu_{SC}(x)$, see (6.34).
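The combinatorial claim behind Exercise 6.52 can be checked by machine for small exponents. The sketch below is only an illustration under a stated assumption: on the Fock space, $a_i^\dagger$ raises the level by one and $a_i$ lowers it by one (annihilating $\Omega$), so the vacuum expectation of $(a_i + a_i^\dagger)^m$ counts $\pm 1$ walks on $\{0, 1, 2, \ldots\}$ that start and end at 0. Function names are hypothetical.

```python
from math import comb


def vacuum_moment(m):
    """Number of +-1 walks of length m on {0, 1, 2, ...} from level 0 back to level 0."""
    levels = {0: 1}  # levels[h] = number of walks currently at height h
    for _ in range(m):
        nxt = {}
        for h, count in levels.items():
            # creation operator: one level up
            nxt[h + 1] = nxt.get(h + 1, 0) + count
            # annihilation operator: one level down, and it kills the vacuum
            if h > 0:
                nxt[h - 1] = nxt.get(h - 1, 0) + count
        levels = nxt
    return levels.get(0, 0)


def catalan(k):
    """k-th Catalan number, binom(2k, k) / (k + 1)."""
    return comb(2 * k, k) // (k + 1)
```

Walks of even length $2k$ returning to 0 are exactly the Dyck paths of length $2k$, and odd lengths give 0, matching the vanishing odd moments of a symmetric measure.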
Chapter 7
assume $\|T\|_{op} = 1$, and using (i) we may also assume that $T$ is an extreme point of
$S_\infty^n$ (the operator norm unit ball). This means that $T \in O(n)$ (see Exercise 1.44),
and then it follows from the rotational invariance of the Gaussian measure that
$\ell_K(ST) = \ell_K(S)$. Note also that the second inequality in (v) is (1.13).
Exercise 7.2. No. Choosing $T$ to be a rank 1 operator, and $S$ a rotation, one
would get from Proposition 7.1(v) that all 1-dimensional projections of $K^\circ$ have
the same length.
Exercise 7.3. (i) Note that $|||\cdot|||_{B_2^n}$ is the Euclidean norm associated to the inner
product (7.2), and so $K(B_2^n)$ is the norm of $\tilde{R}_1$ as an element of $B(H_{k,n})$, which
equals 1 since it is an orthogonal projection. (ii) First prove that $K(K) = K(TK)$
for any $T \in GL(n, \mathbb{R})$; this follows from the formulas $|||\Theta|||_{TK} = |||\Theta \circ T^{-1}|||_K$ and
$\tilde{R}_1(\Theta \circ T^{-1}) = \tilde{R}_1(\Theta) \circ T^{-1}$ for $\Theta \in H_{k,n}$. Then show $K(K) \le d_g(K, B_2^n)$ using
(i). (iii) Use Exercise 4.20.
Exercise 7.4. Use (7.4) and the fact that R̃1 is self-adjoint in Hk,n .
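Exercise 7.5. Let $f : \mathbb{R}^k \to \mathbb{R}$ be the indicator function of $\mathbb{R}^k_+$, and $z$ be the vector
$(1, \ldots, 1)/\sqrt{k}$. We compute, for $x = (x_1, \ldots, x_k) \in \mathbb{R}^k$,
\[ R_1 f(x) = \big[\mathbf{E} f(G) \langle G, z \rangle\big] \langle x, z \rangle = \frac{1}{\sqrt{2\pi}}\, 2^{-k} (x_1 + \cdots + x_k). \]
It follows that
\[ \tilde{R}_1(\Theta)(x_1, \ldots, x_k) = \frac{1}{\sqrt{2\pi}}\, 2^{-k} \sum_{\varepsilon \in \{-1,1\}^k} \langle x, \varepsilon \rangle e_\varepsilon. \]
Since $\mathbf{E} |\langle G, \varepsilon \rangle| = \sqrt{k} \sqrt{2/\pi}$ for any $\varepsilon \in \{-1, 1\}^k$, we obtain $|||\tilde{R}_1(\Theta)|||_{B_1^N} = \frac{1}{\pi} \sqrt{k}$,
while $|||\Theta|||_{B_1^N} = 1$. For the last equality, appeal to Exercise 7.4.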
Exercise 7.6. The version on $S$ is: if $f : S \to \mathbb{C}$ is a holomorphic function on
$S$ such that $|f| \le \lambda$ on $S$ and $|f| \le 1$ on $\mathbb{R}$, then $|f'(0)| \le e \log \lambda$. Reduce to the
case $f(0) = 0$ by considering $z \mapsto (f(z) - f(-z))/2$. Use the three-lines lemma to
conclude that $|f(z)| \le \lambda^{|\operatorname{Im} z|}$. Write $f(z) = z g(z)$ and use the maximum principle
(with $0 < t < 1$) to show that $|g(z)| \le \lambda^t / t$ for $|\operatorname{Im} z| \le t$. The optimal choice
$t = 1/\log \lambda$ gives $|f'(0)| \le e \log \lambda$.
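The last optimization step of Exercise 7.6 is elementary calculus, and the sketch below checks it numerically. It assumes $\lambda > e$ so that $t = 1/\log\lambda$ lies in $(0, 1)$; the function names and the grid size are illustrative choices, not part of the hint.

```python
import math


def bound(lam, t):
    """The bound lam**t / t obtained from the maximum principle on the strip |Im z| <= t."""
    return lam ** t / t


def best_t_on_grid(lam, steps=100000):
    """Grid minimizer of t -> lam**t / t over t in (0, 1)."""
    ts = [(i + 1) / steps for i in range(steps - 1)]
    return min(ts, key=lambda t: bound(lam, t))
```

For instance, with $\lambda = 50$ the grid minimizer agrees with $1/\log 50$ up to the grid resolution, and `bound(50, 1 / math.log(50))` equals $e \log 50$ up to rounding, since $\lambda^{1/\log\lambda} = e$.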
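Exercise 7.7. (i) We have $T_\alpha = \{f \le M_f\}_\alpha \cap \{f \ge M_f\}_\alpha$; use Corollary 5.14.
$\frac{1 + \cos 0.66\varepsilon}{2} \le \sqrt{1 - \varepsilon^2/6}$ for $\varepsilon \in (0, 1)$. Apply (ii) with $B = T_\alpha$ and $\beta = \varepsilon - \alpha$ to get
\[ w_G(A) = \kappa_n w(A) \le \sqrt{n}\, \frac{1 + \cos \beta}{2} \le \sqrt{n - n\varepsilon^2/6} \le \sqrt{n - (k + 1)} \le \kappa_{n-k}. \]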
Exercise 7.8. Let $E$ be a random $c\varepsilon^2 n$-dimensional subspace. Since $g$ is 1-Lipschitz
and circled with mean $\mu_f$, we can choose $c > 0$ such that $\operatorname{osc}(g, S_E, \mu_f) \le \varepsilon/3$
$h$ is 2-Lipschitz and circled) gives that $\operatorname{osc}(h, S_E, M_h) \le \varepsilon/3$ with high probability,
for some choice of $c$. We conclude by using the triangle inequality in the form
the values of δ, α, β such that (7.14) implies 1 ´ ε ď }x} ď 1 ` ε for any x P S n´1 ;
then use Lemma 5.3 and the union bound.
Exercise 7.12. (i) Let $\mathbb{R}^n = \bigoplus E_i$ be a decomposition of $\mathbb{R}^n$ as the direct sum of
$N = \lceil n/k \rceil$ subspaces, with $\dim E_i \le k$, and $O \in O(n)$ Haar-distributed. Using the
union bound, show that the decomposition $\mathbb{R}^n = \bigoplus O(E_i)$ has the desired property
with positive probability. (ii) If $x_i$ is the projection of $x$ onto the $i$-th subspace in
a decomposition from (i), write $\|x\| \le \sum \|x_i\| \le 2M \sum |x_i| \le 2M \sqrt{N} |x|$. (iii) Use
(ii) and the fact that $\|x\| = b|x|$ for some $x \ne 0$.
Exercise 7.13. Let $K_r \subset \mathbb{R}^2$ be a disk of radius 1 centered at $(r, 0)$. Then
$\lim_{r \to 1} \dim_V(K_r) = \infty$, or otherwise one would find a polytope $P$ with $K_1 \subset P \subset 4K_1$,
which is not possible.
Exercise 7.14. The $n^2$-dimensional convex body $B_1^n \times \cdots \times B_1^n$ has $(2n)^n$ vertices
and $n 2^n$ facets.
Exercise 7.15. Mimic the proof of Theorem 7.29, replacing the use of Lemma
7.28 by the inequalities $\dim_F(K, A)\, a(K)^2 \ge (n - 1)/2A^2$, and $\dim_V(K, B)\, a(K)^2 \ge (n - 1)/2B^2$.
Exercise 7.16. If the codimension of $E$ is $k$, then $E$ nontrivially intersects $\mathbb{R}^{k+1}$
(seen as a subspace of $\mathbb{R}^n$), on which $\|\cdot\|_1 \le \sqrt{k + 1}\, |\cdot|$.
Exercise 7.17. For $p < \infty$, mimic the proof of Theorem 7.31. For $p = \infty$, use
Lemma 6.16.
Exercise 7.18. (i) is equivalent to the existence of a linear map $A : \mathbb{R}^k \to \mathbb{R}^n$
such that $(1 + \varepsilon)^{-1} |x| \le \|A(x)\|_\infty \le |x|$ for any $x \in S^{k-1}$. The map $A$ has the
equivalent to the inequality $w(K, \cdot) \le w(L, \cdot)$ between widths, the inclusion (7.22)
means precisely that $(1 + \varepsilon)^{-1} |x| \le \|A(x)\|_\infty$ for $x \in \mathbb{R}^k$, hence the equivalence.
tor $T^{-1}G$ has variance $\sigma_i^2 = |Se_i|^2$ and therefore $\mathbf{E} \|T^{-1}G\|_p \le \big(\mathbf{E} \|T^{-1}G\|_p^p\big)^{1/p} =
m_p \big(\sum \sigma_i^p\big)^{1/p} \le n^{1/p} m_p \max \sigma_i$, where $m_p$ denotes the $L_p$-norm of an $N(0, 1)$ Gauss-
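As in the proof of Theorem 7.37, the problem is reduced to showing that
$\mathbf{E} \|B\|_p = \mathbf{E} \big(\operatorname{Tr} W^{p/2}\big)^{1/p} \sim \alpha_p n^{1/p + 1/2}$ or, equivalently, that
\[ \mathbf{E} \Big( n^{-1} \operatorname{Tr} (n^{-1} W)^{p/2} \Big)^{1/p} = \mathbf{E} \Big( \int |x|^{p/2} \, d\mu_{sp}(n^{-1} W) \Big)^{1/p} \sim \alpha_p. \]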
(Above and in what follows all expected values $\mathbf{E}$ are calculated on the probability
space $\Omega$, and all integrals are over $\mathbb{R}$, often with respect to empirical spectral
measures depending on $\omega \in \Omega$.) Recalling that $\alpha_p = \big(\int |x|^{p/2} \, d\mu_{MP(\lambda)}\big)^{1/p}$, we see
that we need to exploit the convergence $\mu_{sp}(n^{-1} W) \to \mu_{MP(\lambda)}$ explained in Section
6.2.3.2. However, there are a few technical points that need to be resolved. First,
it is not enough to work with the weak convergence of measures since (by definition)
$\nu_n \to \nu$ weakly iff $\int f \, d\nu_n \to \int f \, d\nu$ for every bounded continuous function, and
$f(x) = |x|^{p/2}$ is not bounded. To address this problem, appeal to $\infty$-Wasserstein
convergence and argue as in Exercise 6.28 (i.e., using Theorem 6.28 and Lemma
6.20) to conclude that $n^{-1} \operatorname{Tr} (n^{-1} W)^{p/2} \to \int |x|^{p/2} \, d\mu_{MP(\lambda)} = \alpha_p^p$ in probability,
and similarly after raising all quantities to the power $1/p$.
Next, as every student of real analysis knows, the convergence $X_n \to Y$ in
probability does not generally imply convergence in mean $\mathbf{E} X_n \to \mathbf{E} Y$: one only
knows from Fatou's lemma that $\liminf_n \mathbf{E} X_n \ge \mathbf{E} Y$. However, we do have convergence
in mean under some tightness assumptions, for example when the second
moments $\mathbf{E} X_n^2$ are uniformly bounded. (Prove this if it sounds unfamiliar.) In our
setting, we have
\[ X_n = \Big( n^{-1} \operatorname{Tr} (n^{-1} W)^{p/2} \Big)^{1/p} \le \|n^{-1} W\|_\infty^{1/2} = \|n^{-1/2} B\|_\infty. \]
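A standard counterexample makes the role of the second-moment bound concrete. The sketch below is an illustration chosen here, not taken from the text: $X_n = n \cdot 1_{\{U \le 1/n\}}$ with $U$ uniform on $[0, 1]$ converges to $Y = 0$ in probability, yet $\mathbf{E} X_n = 1$ for all $n$, and $\mathbf{E} X_n^2 = n$ is unbounded. Names are hypothetical; exact rational arithmetic avoids rounding.

```python
from fractions import Fraction


def law_of_x(n):
    """Law of X_n = n * 1{U <= 1/n}, U uniform on [0, 1], as (value, probability) pairs."""
    return [(0, 1 - Fraction(1, n)), (n, Fraction(1, n))]


def moment(law, k):
    """k-th moment of a finitely supported law."""
    return sum(p * v ** k for v, p in law)


def prob_exceeds(law, eps):
    """P(|X| > eps) for a finitely supported law."""
    return sum(p for v, p in law if abs(v) > eps)
```

Here `prob_exceeds(law_of_x(n), eps)` equals $1/n$ for any $0 < \varepsilon < n$, consistent with convergence in probability, while the first moment stays at 1. This is in line with Fatou's inequality $\liminf_n \mathbf{E} X_n \ge \mathbf{E} Y = 0$, which is strict in this example.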
To conclude, verify that Proposition 6.33 (or Corollary 6.38 in the real case) implies
$\mathbf{E} \|n^{-1/2} B\|_\infty^2 \lesssim \lambda$. This is a simple instance of upper-bounding $L_p$-norms in
presence of $\psi_2$ estimates explained in Section 5.2.6; actually it easily follows that
$\mathbf{E} \|n^{-1/2} B\|_\infty^2 \sim (1 + \sqrt{\lambda})^2$.
The case $p = \infty$ is easier since the quantities in question are more tangible;
it follows from Proposition 6.31 (or Corollary 6.38) and Exercise 6.51. Note that
the lower bound also follows formally from the case $p < \infty$ by using the facts that
$\|\cdot\|_\infty \ge n^{-1/p} \|\cdot\|_p$ and $\lim_{p \to \infty} \alpha_p = 1 + \sqrt{\lambda}$, while the upper bound is implicit in
the volume radius follow from the inequalities $\operatorname{vrad}(S_p^{m,n}) \le w(S_p^{m,n})$ (Urysohn's
has a 2-Euclidean section of dimension $ckm^{-2/p}$, hence we conclude from (ii) that
$k \le Cnm^{2/p}$.
Exercise 7.26. Identifying $\mathbb{C}^n$ with $\mathbb{R}^{2n}$, the ellipsoid $\mathrm{John}(K)$ is circled (as a
consequence of its uniqueness, it inherits all the symmetries from $K$), and therefore
we may reduce to the case where $K$ is in John position. It suffices to check that
Lemma 7.41 transfers verbatim to the complex case.
Exercise 7.27. (i) By the result from Exercise 7.9, we have either $k_*(K) \ge \sqrt{n}$
or $k_*(K^\circ) \ge \sqrt{n}$. Assuming the latter without loss of generality, it follows from
Corollary 7.24 that there exists a subspace $F$ of dimension $c\sqrt{n}$ such that $P_F K$ is
2-Euclidean. Conclude by applying Corollary 7.40 to $K \cap F$. (ii) Yes, since we can
choose a position for which the Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$ concentrates near $E$,
see Exercise B.15.
Exercise 7.28. (a) Without loss of generality, one may assume that $\mathrm{John}(K) = B_2^n$.
Set $A := \operatorname{vrad}(K) = (\operatorname{vol}(K)/\operatorname{vol}(B_2^n))^{1/n}$. From Lemma 5.8, we obtain that
$N(K, B_2^n) \le \operatorname{vol}(K + B_2^n)/\operatorname{vol}(B_2^n) \le \operatorname{vol}(2K)/\operatorname{vol}(B_2^n) \le (2A)^n$. It follows that
$K \cap E$ is covered by $(2A)^n$ translates of $B_2^n \cap E$, hence $\operatorname{vol}(K \cap E) \le (2A)^n \operatorname{vol}(B_2^n \cap E)$,
which is the claimed estimate. (b) Consider $K = B_\infty^n \times B_2^N$ and check that
$\operatorname{vrad}(K)$ is bounded by an absolute constant whenever $N \ge Cn \log n$, whereas
$\operatorname{vrad}(B_\infty^n) = \Theta(\sqrt{n})$.
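The order of growth of $\operatorname{vrad}(B_\infty^n)$ quoted in (b) can be checked numerically. The sketch below uses the classical formula $\operatorname{vol}(B_2^n) = \pi^{n/2}/\Gamma(n/2 + 1)$, so that $\operatorname{vrad}(B_\infty^n) = 2/\operatorname{vol}(B_2^n)^{1/n}$; by Stirling's formula the ratio $\operatorname{vrad}(B_\infty^n)/\sqrt{n}$ tends to $\sqrt{2/(\pi e)} \approx 0.484$. The function names are illustrative.

```python
import math


def log_vol_ball(n):
    """Logarithm of the volume of the n-dimensional Euclidean unit ball."""
    return 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1)


def vrad_cube(n):
    """Volume radius of the cube [-1, 1]^n, i.e. (2^n / vol(B_2^n))^(1/n)."""
    return math.exp(math.log(2.0) - log_vol_ball(n) / n)
```

Working with logarithms (via `math.lgamma`) keeps the computation stable even for very large $n$, where the volumes themselves underflow.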
Exercise 7.29. The arguments in parts (i) and (ii) of the Exercise are identical,
the key observation being that the intersection of two (or three) events with large
probability also has large probability. For the first statement, use the fact that
$O \in O(3k)$ Haar-distributed.
Exercise 7.30. Follows from Theorem 7.44 by duality.
Exercise 7.33. (i) It follows from Lemma 4.20 that $\operatorname{vol}(K) \ge 2^{-n} \operatorname{vol}(K \cap (E_1 \oplus E_2)) \operatorname{vol}(K_3)$
and that $\operatorname{vol}(K \cap (E_1 \oplus E_2)) \ge 2^{-n} \operatorname{vol}(K_1) \operatorname{vol}(K_2)$. To obtain inequalities
for $K^\circ$, proceed similarly using (1.12) and (1.13). (ii) Use (4.55). (iii)
Follows easily from part (ii).
Exercise 7.34. Apply Corollary 7.24 with $K = B_\infty^n$. Using the fact that $k_*(B_1^n) = \Omega(n)$,
it follows that there exists a subspace $E \subset \mathbb{R}^n$ of dimension $\Omega(n\varepsilon^2)$ such that
$P_E B_\infty^n$ is $(1 + \varepsilon)$-Euclidean. Then note that $P_E B_\infty^n$ can be written as the Minkowski
Chapter 8
Exercise 8.6. (i) The Choi matrix of $R$ is $C(R) = \frac{1}{d} I_{\mathbb{C}^d \otimes \mathbb{C}^d}$. (ii) Use direct
computation, or argue that $\mathcal{G} = \{B^j A^k : 1 \le j \le d,\ 1 \le k \le d\}$ is a group
(with the counting measure as Haar measure) which generates $M_d$ as a vector
space and therefore the argument used in the proof of Proposition 2.18 yields
$\frac{1}{d^2} \sum_{G \in \mathcal{G}} G X G^\dagger = \operatorname{Tr} X \, \frac{I}{d}$.
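The averaging identity in (ii) can be verified directly for a small dimension. The sketch below is written under an assumption not confirmed by this excerpt: that $A$ and $B$ are the cyclic shift and the diagonal "clock" matrix with entries $\omega^j$, $\omega = e^{2\pi i/d}$. All helper names are hypothetical, and matrices are plain nested lists to keep the example dependency-free.

```python
import cmath


def matmul(p, q):
    """Product of two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(p):
    """Conjugate transpose."""
    n = len(p)
    return [[complex(p[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matpow(p, e):
    """e-th power of a square matrix by repeated multiplication."""
    n = len(p)
    r = [[complex(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        r = matmul(r, p)
    return r


def twirl(x):
    """Average of G x G^dagger over G = B^j A^k, 1 <= j, k <= d (assumed shift and clock)."""
    d = len(x)
    w = cmath.exp(2j * cmath.pi / d)
    shift = [[complex((i - j) % d == 1) for j in range(d)] for i in range(d)]  # A|j> = |j+1 mod d>
    clock = [[w ** i if i == j else 0j for j in range(d)] for i in range(d)]    # B|j> = w^j |j>
    out = [[0j] * d for _ in range(d)]
    for j in range(1, d + 1):
        for k in range(1, d + 1):
            g = matmul(matpow(clock, j), matpow(shift, k))
            gxg = matmul(matmul(g, x), dagger(g))
            for r in range(d):
                for c in range(d):
                    out[r][c] += gxg[r][c] / d ** 2
    return out
```

Conjugating by powers of the clock matrix kills the off-diagonal entries on average, and conjugating by powers of the shift averages the diagonal, which is why only $\operatorname{Tr} X \cdot I/d$ survives.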
Exercise 8.7. The idea is to follow carefully the proof of Lemma 8.12 to come up
with an exact calculation instead of an estimate.
Recall that the argument shows that $L_k$, the Lipschitz constant of the function
$\psi \mapsto E(\psi)$, is the same as that of the function $f$ from (8.17) and, in particular,
independent of $d$ (as long as $d \ge k$, which we assume). Next, compute $w$, the
tangent (to $S^{k-1}$) component of the gradient of $f$; the supremum of the Euclidean
norm of $w$ will be equal to $L_k$. By direct calculation, show that $|w|^2 = 4F$ with
\[ \text{(E.4)} \qquad F = \sum_j p_j \big(\log(1/p_j)\big)^2 - H^2, \]
where $p_j = x_j^2$ (in the notation of (8.17)) and $H := \sum_j p_j \log(1/p_j)$. To find the
maximum of $F$ over the set $\{(p_j) : p_j \ge 0,\ \sum_j p_j = 1\}$, use Lagrange multipliers
and deduce that the extremal sequences $(p_j)$ take only two values, namely such that
$L_k = 2\sqrt{\alpha_k^2 - 1}$, deduce the conclusion.
For any given value of $k$, $L_k$ can be found numerically by solving equation (E.5);
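The identity $|w|^2 = 4F$ from Exercise 8.7 lends itself to a numerical spot check. The sketch below assumes that the function $f$ of (8.17), which is not reproduced in this excerpt, is the entropy $x \mapsto \sum_j x_j^2 \log(1/x_j^2)$ of the squared coordinates; it approximates the gradient by central differences and projects it onto the tangent space of the sphere. All names are hypothetical.

```python
import math


def entropy(x):
    """Assumed form of f: sum of x_j^2 * log(1 / x_j^2), extended to all of R^k."""
    return sum(-xj * xj * math.log(xj * xj) for xj in x if xj != 0.0)


def tangential_gradient(x, h=1e-6):
    """Central-difference gradient of `entropy`, minus its radial component at the unit vector x."""
    grad = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((entropy(xp) - entropy(xm)) / (2 * h))
    radial = sum(g * xj for g, xj in zip(grad, x))
    return [g - radial * xj for g, xj in zip(grad, x)]


def big_f(x):
    """F = sum_j p_j log(1/p_j)^2 - H^2 with p_j = x_j^2, as in (E.4)."""
    p = [xj * xj for xj in x]
    h_val = sum(pj * math.log(1 / pj) for pj in p)
    return sum(pj * math.log(1 / pj) ** 2 for pj in p) - h_val ** 2
```

Analytically, the $j$-th tangential component is $2x_j(\log(1/p_j) - H)$, so $|w|^2 = 4\sum_j p_j (\log(1/p_j) - H)^2$, which expands to $4F$.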
$y_i = \big(\sum_{j=1}^s |\langle \varphi_j, i \rangle|^2\big)^{1/2}$. Then check that
\[ \sup_{x \in F :\, |x| = 1} \|x\|_\infty = \|y\|_\infty \ge \frac{1}{\sqrt{n}} |y| = \sqrt{\frac{s}{n}}. \]
Exercise 8.9. (i) Use Exercise 8.8 to show that $W$ contains a unit vector $\psi$ with
largest Schmidt coefficient greater than $\sqrt{\alpha}$. This uses the identification of bipartite
states with matrices (see Section 2.2.2) and the fact that the operator norm of a
matrix is at least as large as the absolute value of the largest matrix element. (The
latter seems very rough, but works; it is conceivable that refining the argument at
this point could lead to closing the gap between the lower and the upper bound
on the dimension of “very entangled subspaces,” at least for some ranges of the
parameters.) Then appeal to concavity of entropy to show that under this constraint
the von Neumann entropy is maximized when the spectrum of $\rho = \operatorname{Tr}_{\mathbb{C}^m} |\psi\rangle\langle\psi|$ is
$(\alpha, (1 - \alpha)/(k - 1), \ldots, (1 - \alpha)/(k - 1))$. (ii) This is a tedious but straightforward
consequence of part (i); use the fact that if $y = \varphi(x) = \Theta\big(x(1 + \log x)\big)$ for $x \ge 1$,
then $\varphi^{-1}(y) = \Theta\big(y/(1 + \log y)\big)$.
Exercise 8.10. Let $\theta : (\mathbb{R}^d, |\cdot|) \to (E, \|\cdot\|_{HS})$ be an isometry. For $i, j \in \{1, \ldots, N\}$,
consider the linear form $\varphi_{i,j} : (\mathbb{R}^d)^k \to \mathbb{R}$ defined by
\[ \varphi_{i,j}(x_1, \ldots, x_k) = \langle i | \theta(x_1) \cdots \theta(x_k) | j \rangle. \]
Show that $\|\varphi_{i,j}(x_1, \ldots, x_k)\| \le N^{-k/2} |x_1| \cdots |x_k|$ and that $\sum_{i,j=1}^N |||\varphi_{i,j}|||^2 = \frac{d^k}{N^{k-1}}$,
so that $|||\varphi_{i,j}||| \ge d^{k/2}/N^{(k+1)/2}$ for some indices $(i, j)$. Then use Proposition
8.25(iii).
Chapter 9
Exercise 9.1. Use Proposition 6.3.
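Exercise 9.2. Consider $A = |0\rangle\langle 0| - |1\rangle\langle 1|$ and $\mathcal{N} = \{\psi \in S_{\mathbb{C}^2} : \|\psi\|_\infty \le \cos \alpha\}$.
Then $\mathcal{N}$ is an $\alpha$-net in $(S_{\mathbb{C}^d}, g)$ and $|\langle \psi | A | \psi \rangle| \le \cos(2\alpha)$ for any $\psi \in \mathcal{N}$.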
Exercise 9.3. For the first part, mimic the argument used in the proof of Proposition
8.28. The second part is straightforward, see (6.15).
Exercise 9.4. This is a reformulation of the statement from Lemma 4.3.
Exercise 9.5. The statement about the mean width is proved similarly as in the
qubit case. For the lower bound, one may notice that since SepppC2 qbk qq is a
section of SepppCd qbk q, its Gaussian mean width is smaller. For the volume, to
be able to generalize the argument from the proof of Theorem 9.11 one needs to
find LöwpDpCd q q. To that end, use Proposition 4.8 to show that it has the form
$E_{a,b} = (aP + bQ) B_{HS}$, where $P$ is the projection onto the hyperplane of trace zero
matrices and $Q = I - P$. Check that $D(\mathbb{C}^d) \subset E_{a,b} \iff a^{-2}(1 - 1/d) + b^{-2}/d \le 1$.
Minimizing $\operatorname{vol} E_{a,b} = a^{d^2 - 1}\, b \operatorname{vol}(B_{HS})$ under this constraint gives $a = \sqrt{d/(d + 1)}$
and $b = \sqrt{d}$.
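The minimization at the end of Exercise 9.5 can be sanity-checked numerically. The sketch below assumes that the constraint $a^{-2}(1 - 1/d) + b^{-2}/d \le 1$ is active at the optimum (otherwise the ellipsoid could be shrunk), eliminates $b$, and minimizes the logarithm of $a^{d^2 - 1} b$ over a grid in $a$. The function names, grid size, and search interval are illustrative choices.

```python
import math


def log_volume(a, d):
    """log(a^(d^2 - 1) * b), with b fixed by a^-2 (1 - 1/d) + b^-2 / d = 1 (assumed active)."""
    inv_b_sq = d * (1.0 - (1.0 - 1.0 / d) / a ** 2)
    b = 1.0 / math.sqrt(inv_b_sq)
    return (d * d - 1) * math.log(a) + math.log(b)


def minimize_on_grid(d, steps=200000):
    """Grid minimizer of log_volume over a in (sqrt(1 - 1/d), 1.5)."""
    lo = math.sqrt(1.0 - 1.0 / d)
    best_val, best_a = None, None
    for i in range(1, steps):
        a = lo + (1.5 - lo) * i / steps
        val = log_volume(a, d)
        if best_val is None or val < best_val:
            best_val, best_a = val, a
    return best_a
```

For $d = 2, 3, 5$ the grid minimizer agrees with $\sqrt{d/(d+1)}$ to within the grid resolution, and substituting that value of $a$ back into the constraint returns $b = \sqrt{d}$.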
Exercise 9.6. For the bound on $w(\mathrm{PPT}^\circ)$, argue as in the proof of Theorem 9.13,
but in the last step use Exercise 5.28. In the displayed formula, the first inequality
is Urysohn's inequality. For the second one, use the bound on $w(\mathrm{PPT}^\circ)$ and appeal
to the dual Urysohn inequality (Proposition 4.16).
Exercise 9.7. This follows from the fact that the measure $\mu_{d^2,d^2}$ on $D(\mathbb{C}^d \otimes \mathbb{C}^d)$
is proportional to the Lebesgue measure (see the discussion following (6.47)), and
Next, using $B_i B_i^\dagger = \big(\sum_j A_{ij} A_{ij}^\dagger\big) \otimes |i\rangle\langle i|$, conclude that $\|B_i^\dagger B_i\| = \|B_i B_i^\dagger\| =
\|\sum_j A_{ij} A_{ij}^\dagger\| \le \sum_j \|A_{ij} A_{ij}^\dagger\| = \sum_j \|A_{ij}\|^2$, as needed. (ii) It is enough to prove (see
the comment following Theorem 9.15) that every matrix $A \in B^{sa}(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$ with
$\|A\|_{HS} \le 1$ satisfies $I + A \in \mathrm{SEP}$. By Theorem 2.34 and Remark 2.35, it suffices to
Proposition 8.28.
Exercise 9.10. Use Stirling’s formula.
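Exercise 9.11. (i) The fact that the projection contains the section is a general
obvious fact. For $\rho = |1\rangle\langle 1| \otimes |1\rangle\langle 1|$, we have
$P_H \rho = \frac{1}{mn} I_{\mathbb{C}^m \otimes \mathbb{C}^n} + |1\rangle\langle 1| \otimes \big(|1\rangle\langle 1| - \frac{1}{n} I_{\mathbb{C}^n}\big)$,
which is not positive. (ii) Use (1.13). (iii) We have $P_F \rho = \frac{I}{m} \otimes \operatorname{Tr}_{\mathbb{C}^m} \rho$ for
every state $\rho$.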
Exercise 9.12. Use the fact that D has enough symmetries. The argument sug-
gested in the hint to Exercise 9.14 also works.
Exercise 9.13. We have $\operatorname{vrad}(P) \ge \frac14 \operatorname{vrad}(D(\mathbb{C}^d)) \ge c/\sqrt{d}$ (see Table 9.1). If
$P$ has $N$ vertices, Proposition 6.3 implies that $\operatorname{vrad}(P) = O(\sqrt{\log N}/d)$, and the
result follows. We point out that the result can also be proved by arguing as in the
proof of Proposition 9.31.
Exercise 9.14. Argue that the smallest λ ą 0 such that p´1q ‚ Sep Ă λ ‚ Sep
equals d2 ´ 1 by considering a pure product state.
preserving and has the property that, for any state $\rho$ on $\mathbb{C}^d \otimes \mathbb{C}^d$, we have
$\tilde{\Phi}(I)^{-1/2}\, \tilde{\Phi}(X)\, \tilde{\Phi}(I)^{-1/2}$.
Chapter 10
Exercise 10.1. Check that $\frac{1}{\|x\|_\infty} \sum_{i=1}^k x_i^\downarrow \le \min(k, n - k) \le \frac{2n}{\|y\|_1} \sum_{i=1}^k y_i^\downarrow$.
Exercise 10.2. First remark that the unitary invariance of A implies that A and
Exercise 10.6. Consider the set $L_s = \{(\psi_1, \ldots, \psi_s) \in (\mathbb{C}^d \otimes \mathbb{C}^d)^s : \sum_{i=1}^s |\psi_i\rangle\langle\psi_i| \in \mathrm{SEP}\}$.
Since SEP is convex, we have the following fact: if $(\psi_1, \ldots, \psi_{s-1}, \varphi) \in L_s$
and $(\psi_1, \ldots, \psi_{s-1}, \chi) \in L_s$, then $(\psi_1, \ldots, \psi_{s-1}, \frac{1}{\sqrt2} \varphi, \frac{1}{\sqrt2} \chi) \in L_{s+1}$. It follows that
whenever $(\psi_1, \ldots, \psi_s)$ is a Lebesgue point for $L_s$, then $(\psi_1, \ldots, \psi_{s-1}, \frac{1}{\sqrt2} \psi_s, \frac{1}{\sqrt2} \psi_s)$
is a Lebesgue point for $L_{s+1}$. (A point $x$ is a Lebesgue point for $A \subset \mathbb{R}^n$ if the ratio
$\operatorname{vol}(A \cap B(x, \varepsilon))/\operatorname{vol} B(x, \varepsilon)$ goes to 1 as $\varepsilon$ goes to 0.) The result follows from the
fact that almost every point of $L_s$ is a Lebesgue point (see Chapter 3, Corollary
1.5 in [SS05]).
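Exercise 10.7. Realize $\rho$ as $\operatorname{Tr}_{\mathbb{C}^s} |\psi\rangle\langle\psi|$ for $\psi$ uniformly distributed on $S_{\mathbb{C}^d \otimes \mathbb{C}^d \otimes \mathbb{C}^s}$.
(i) For $d_1 \le d_2$, identify $\mathbb{C}^{d_1}$ as a subspace of $\mathbb{C}^{d_2}$ and let $P : \mathbb{C}^{d_2} \to \mathbb{C}^{d_1}$ be the
orthogonal projection. Show that the map
\[ \rho \mapsto \frac{(P \otimes P) \rho (P \otimes P)}{\operatorname{Tr} (P \otimes P) \rho (P \otimes P)} \]
pushes forward $\mu_{d_2^2, s}$ onto $\mu_{d_1^2, s}$, and preserves separability. (ii) Identify $\mathbb{C}^{2d}$ with
$\mathbb{C}^2 \otimes \mathbb{C}^d$. Let $\Phi : B(\mathbb{C}^{2d} \otimes \mathbb{C}^{2d}) \to B(\mathbb{C}^d \otimes \mathbb{C}^d)$ be the partial trace over $\mathbb{C}^2 \otimes \mathbb{C}^2$.
Then $\Phi$ pushes forward $\mu_{4d^2, s}$ onto $\mu_{d^2, 4s}$, and preserves separability.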
Exercise 10.8. We use the same notation as in Exercise 10.7. Theorem 10.12
Set $p_k := \pi_{2^k, 2^{N-2k}}$. Exercise 10.7(ii) implies that $(p_k)$ is non-increasing. Define $k_N$
as the smallest $k$ such that $p_k < 1 - \delta$. It is clear from the estimates (10.10) that
$k_N \sim N/5$. Our definition of $k_N$ implies that $2^{N - 2k_N} < \frac32 s_0(2^{k_N})$, and therefore
$2^{N - 2k_N - 2} \le \frac12 s_0(2^{k_N})$, so that $\pi_{2^{k_N}, 2^{N - 2k_N - 2}} \le \delta$. By Exercise 10.7(i), this implies
Exercise 10.11. The inradius of PPT is the same as that of Sep (see Table 9.1), so
the argument that led to (10.14) carries over to the present setting. For the bound
in (i), the relevant range of s is Θpd2 q.
Chapter 11
Exercise 11.2. This can be seen directly from the definition. Alternatively, we may
use the description from Proposition 11.8. Let $\lambda \in (0, 1)$ and $(a_{ij}), (a'_{ij}) \in QC_{m,n}$.
We have $a_{ij} = \langle x_i, y_j \rangle$ and $a'_{ij} = \langle x'_i, y'_j \rangle$. Defining $\tilde{x}_i = \sqrt{\lambda}\, x_i \oplus \sqrt{1 - \lambda}\, x'_i$ and
$\tilde{y}_j = \sqrt{\lambda}\, y_j \oplus \sqrt{1 - \lambda}\, y'_j$ leads to $\lambda a_{ij} + (1 - \lambda) a'_{ij} = \langle \tilde{x}_i, \tilde{y}_j \rangle$. We then argue as in
the end of Proposition 11.8 to ensure that vectors live in $\mathbb{R}^{\min(m,n)}$.
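The direct-sum construction in Exercise 11.2 can be illustrated concretely, as a minimal sketch: vectors are plain lists, the direct sum is concatenation, and the helper names are hypothetical.

```python
import math


def inner(u, v):
    """Euclidean inner product of two lists of equal length."""
    return sum(a * b for a, b in zip(u, v))


def weighted_direct_sum(lam, u, u_prime):
    """Return sqrt(lam) u (+) sqrt(1 - lam) u', realized as a concatenated vector."""
    return [math.sqrt(lam) * a for a in u] + [math.sqrt(1 - lam) * a for a in u_prime]
```

For unit vectors the result is again a unit vector, since its squared norm is $\lambda + (1 - \lambda) = 1$, and the cross terms vanish because the two summands live in orthogonal coordinate blocks.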
Exercise 11.3. For vectors $x_i$, $y_j$ of norm at most 1, the unit vectors $x'_i = x_i + \sqrt{1 - |x_i|^2}\, u$
and $y'_j = y_j + \sqrt{1 - |y_j|^2}\, v$ satisfy $\langle x_i, y_j \rangle = \langle x'_i, y'_j \rangle$ provided $u$, $v$ are
unit vectors in $\{y_j : 1 \le j \le n\}^\perp \cap \{x_i : 1 \le i \le m\}^\perp$ such that $u \perp v$.
Exercise 11.4. When considered as elements of $\mathbb{R}^4$, the 8 distinct matrices $A_{\xi,\eta} =
(\xi_i \eta_j)_{i,j=1}^2$ are either opposite or orthogonal. A less explicit argument goes as
follows: use Proposition 11.7, the fact that $B_\infty^2$ is congruent to $\sqrt{2}\, B_1^2$, and that
$B_1^m \hat{\otimes} B_1^n$ identifies with $B_1^{mn}$ (cf. Exercise 11.8).
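The first assertion of Exercise 11.4 can be confirmed by direct enumeration. The short sketch below (helper names are illustrative) lists the matrices $(\xi_i \eta_j)$ for $\xi, \eta \in \{-1, 1\}^2$ as vectors in $\mathbb{R}^4$ and inspects their pairwise inner products.

```python
from itertools import product


def sign_matrices():
    """Distinct matrices (xi_i * eta_j), i, j = 1, 2, flattened to vectors in R^4."""
    mats = set()
    for xi in product((-1, 1), repeat=2):
        for eta in product((-1, 1), repeat=2):
            mats.add(tuple(x * e for x in xi for e in eta))
    return sorted(mats)


def inner(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))
```

The 16 pairs $(\xi, \eta)$ give only 8 distinct matrices because $(\xi, \eta)$ and $(-\xi, -\eta)$ produce the same one. Two distinct matrices have inner product $\langle \xi, \xi' \rangle \langle \eta, \eta' \rangle$, which is either $0$ or $-4$.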
Exercise 11.5. Given $\xi \in \{-1, 1\}^m$ and $\eta \in \{-1, 1\}^n$, let $I = \{i : \xi_i = 1\}$ and
Exercise 11.6. For the first statement, note that $\{-1, 1\}^k$ is exactly the set of
extreme points of $B_\infty^k = (B_1^k)^\circ$. The second statement is even more straightforward
from Proposition 11.8: $\{(x_i)_{i=1}^k : x_i \in H,\ |x_i| \le 1\}$ is exactly the unit ball of
$\ell_\infty^k(H) = \big(\ell_1^k(H)\big)^*$.
context, with the same proof: $LC_{2,\ldots,2}$ identifies with $(B_\infty^2)^{\hat{\otimes} k}$. Next use the facts
that $B_\infty^2$ is congruent to $\sqrt{2}\, B_1^2$, and that $(B_1^2)^{\hat{\otimes} k}$ identifies with $B_1^{2^k}$. It follows that
$LC_{2,\ldots,2}$ is congruent to $2^{k/2} B_1^{2^k}$, a polytope with $2^{k+1}$ vertices and $2^{2^k}$ facets.
Exercise 11.9. The answers are most conveniently deduced from the characterizations
given by Propositions 11.7 and 11.8. The outradius is in both cases easily seen
to be $\sqrt{mn}$. It is a little more delicate to establish that the inradii are 1. For the
lower bound on the inradius of $LC_{m,n} = B_\infty^m \hat{\otimes} B_\infty^n$, note that it is in Löwner position
by Lemma 4.9 and then appeal to Exercise 4.20. For the remaining conclusions,
use $LC_{m,n} \subset QC_{m,n} \subset B_\infty^{mn}$.
Exercise 11.10. Since $LC_{m,n} = B_\infty^m \hat{\otimes} B_\infty^n$, this follows from Exercise 4.27 and
the fact that a cube has enough symmetries (Exercise 4.25). More concretely,
symmetries of LC are generated by permutations and sign flips of rows and columns.
Since these operations are also symmetries for QC, it follows that QC has likewise
enough symmetries.
Exercise 11.11. Taking into account Remark 11.9, it is enough to check that for
all self-adjoint operators $X_1, X_2, Y_1, Y_2$ with $X_1^2 = X_2^2 = I$ and $Y_1^2 = Y_2^2 = I$,
we have $\operatorname{Tr} \rho B \le 2\sqrt{2}$, where $B = X_1 \otimes Y_1 + X_1 \otimes Y_2 + X_2 \otimes Y_1 - X_2 \otimes Y_2$. To
that end, show that $B^2 = 4 I - (X_1 X_2 - X_2 X_1) \otimes (Y_1 Y_2 - Y_2 Y_1)$ and conclude that
$\|B^2\|_{op} \le 8$. For an example giving violation $\sqrt{2}$, appeal to Proposition 11.8 and
consider the case where $x_1, y_1, x_2, y_2 \in \mathbb{R}^2$ are unit vectors separated by successive
45° angles.
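The planar example in Exercise 11.11 is easy to evaluate by hand or by machine. In the sketch below the specific assignment of angles ($y_2$ at 0°, $x_1$ at 45°, $y_1$ at 90°, $x_2$ at 135°) is an assumption made here so that the minus sign falls on the $\langle x_2, y_2 \rangle$ term, matching the operator $B$ above; other orderings correspond to equivalent forms of the inequality. The function names are hypothetical.

```python
import math
from itertools import product


def unit(deg):
    """Unit vector in the plane at the given angle (degrees)."""
    t = math.radians(deg)
    return (math.cos(t), math.sin(t))


def inner(u, v):
    """Euclidean inner product in R^2."""
    return u[0] * v[0] + u[1] * v[1]


def chsh(x1, x2, y1, y2):
    """The CHSH combination <x1,y1> + <x1,y2> + <x2,y1> - <x2,y2>."""
    return inner(x1, y1) + inner(x1, y2) + inner(x2, y1) - inner(x2, y2)


def classical_max():
    """Maximum of the same combination over deterministic +-1 assignments."""
    return max(a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2
               for a1, a2, b1, b2 in product((-1, 1), repeat=4))
```

With these angles three of the inner products equal $\cos 45°$ and the fourth equals $\cos 135°$, so the combination is $4 \cdot \frac{\sqrt2}{2} = 2\sqrt{2}$, against a classical maximum of 2; the ratio is the violation $\sqrt{2}$.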
Here is an alternative argument which allows one to arrive at an example without
guessing. First, observe that
\[ \sup\{\varphi_{CHSH}(A) : A \in QC_{2,2}\} = \frac12 \sup\{|y_1 + y_2| + |y_1 - y_2| : y_j \in H,\ |y_j| \le 1\} \]
(cf. Exercise 11.6). Next, note that for such $y_1$, $y_2$,
\[ |y_1 + y_2| + |y_1 - y_2| \le \sqrt{2} \big(|y_1 + y_2|^2 + |y_1 - y_2|^2\big)^{1/2} = 2 \big(|y_1|^2 + |y_2|^2\big)^{1/2} \le 2\sqrt{2} \]
and verify when equalities occur.
Exercise 11.12. By Exercise 11.4 and its hint, every normal to a facet is proportional
to the sum of four vertices of that facet, which in turn are of the form $A_{\xi,\eta}$.
All such sums can then be listed and classified: there are 8 that exhibit the CHSH
pattern and another 8 with only one non-zero entry. Alternatively, one may notice
that every such sum is a matrix of Hilbert–Schmidt norm 4, whose entries are even
integers that sum up to $\pm 4$. Finally, the functionals corresponding to matrices
with only one non-zero entry cannot distinguish between classical and quantum
correlations.
Exercise 11.13. If $m > n$, the set $LC_{n,n}$ can be seen in a canonical way as a section
of $LC_{m,n}$, which in turn is a section of $LC_{m,m}$, and similarly for $QC_{n,n}$, $QC_{m,n}$ and
$QC_{m,m}$. The fact that $K_G^{(2)} \ge \sqrt{2}$ follows from Exercise 11.11, and the opposite
form $\operatorname{conv}(F \times \{0\}, \{0\} \times G)$, where $F$, $G$ are facets of $B_\infty^n$. (This can be easily seen
by identifying $(B_\infty^n \oplus_1 B_\infty^n)^\circ$ with $B_1^n \times B_1^n$.) It follows that $LC_{2,n}$ has $(2n)^2$ facets:
$4n$ facets express the fact that each entry of a correlation matrix belongs to $[-1, 1]$,
and $8\binom{n}{2} = 4n^2 - 4n$ are equivalent to the CHSH inequality.
defining inequality for $LC_{3,3}$. A careful counting (cf. Exercise 11.12) shows that this
construction produces 18 facets of the kind $\pm a_{ij} \le 1$ and $9 \times 8 = 72$ facets defined
by inequalities equivalent to CHSH up to symmetries. The information that $LC_{3,3}$
has 90 facets implies that $LC_{3,3}$ is the intersection of the half-spaces associated to
these $18 + 72 = 90$ facets. Since $P_E\, QC_{3,3} = QC_{2,2} \subset \sqrt{2}\, LC_{2,2}$, it follows that
$QC_{3,3} \subset \sqrt{2}\, LC_{3,3}$.
Exercise 11.16. If $M = (m_{ij})$, then
\[ \|M : \ell_\infty^2(\mathbb{C}) \to \ell_1^2(\mathbb{C})\| = \max\Big\{ \sum_{i=1}^m \Big| \sum_{j=1}^n m_{ij} z_j \Big| : z_j \in \mathbb{C},\ |z_j| \le 1,\ j = 1, \ldots, n \Big\}. \]
Since, as a real normed space, $(\mathbb{C}, |\cdot|)$ coincides with $(\mathbb{R}^2, |\cdot|)$, it remains to appeal
to Exercise 11.6. (Note that we are concerned here with the case $m = n = 2$, but
a similar argument works if $\min\{m, n\} = 2$.)
Exercise 11.17. Let $a, b, c, d \in \mathbb{C}$ and let $\varphi : \mathbb{C} \to \mathbb{R}_+$ be defined by $\varphi(z) =
|az + b| + |cz + d|$. Then $\varphi$ is convex and, in particular, its maximal value over the
(closed) unit disk is attained on its boundary $\mathbb{T}$. Next, note that for $\eta_1, \eta_2 \in \mathbb{T}$ we
have
\[ |a\eta_1 + b\eta_2| + |c\eta_1 + d\eta_2| = \varphi(\eta_1 \bar{\eta}_2) \]
and, similarly, for $y_1, y_2 \in \mathbb{C}^2$ with $|y_1| = |y_2| = 1$
\[ |ay_1 + by_2| + |cy_1 + dy_2| = \varphi(\langle y_1 | y_2 \rangle). \]
By the first observation, the maxima of these two expressions (over $\eta_1, \eta_2 \in \mathbb{T}$ and
over unit vectors $y_1, y_2 \in \mathbb{C}^2$ respectively) coincide and it remains to notice that
these maxima represent the expressions on the two sides of the inequality (11.37)
for $[m_{ij}] = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$.
Exercise 11.18. The polytope $LC_{n,n}$ is a symmetric polytope with $2^{2n-1}$ vertices
and dimension $n^2$ (see Proposition 11.7), so the result follows. For the “moreover”
part, combine Exercise 7.15 and, if needed, Theorem 11.12. Note that, from general
principles (see Exercise 4.20), $a(LC_{n,n}) \le n$ and $a(QC_{n,n}) \le n$ (in fact we have
equality by Exercise 11.9).
Exercise 11.19. Via Santaló inequality and its reverse, Proposition 11.15 implies
that $\operatorname{vrad}(LC_{n,n}^\circ) = \Theta(1/\sqrt{n})$. Since $\operatorname{outrad}(LC_{n,n}^\circ) = \operatorname{inrad}(LC_{n,n}) = 1$ (see Exercise
11.9), Proposition 6.3 implies that $LC_{n,n}^\circ$ has $\exp(\Omega(n))$ vertices, or equivalently
that $LC_{n,n}$ has $\exp(\Omega(n))$ facets.
Exercise 11.20. (a) The value of the game is $\sum_{i,j} \pi(i, j) m_{ij} \xi_i \eta_j$, where $(\pi(i, j))$ is
the distribution on the set of inputs. If $\pi(i_0, j_0) < \frac14$, choose $\xi$, $\eta$ so that $(\xi_i \eta_j)$ agrees
with $(m_{ij})$ except for the $(i_0, j_0)$th entry. (b) First, replacing $(\xi, \eta)$ by $(-\xi, -\eta)$
does not change the outcome, so for each such pair of strategies only the sum of
their probabilities matters. Next, there are four pairs of that kind that saturate
(11.4) and (11.12), with each pair leading to a mismatch in exactly one of the four
entries of the $2 \times 2$ matrices $(\xi_i \eta_j)$ and $(m_{ij})$. If one of these four pairs entered into
the random strategy with a weight strictly larger than $\frac14$, the referee could use as
Exercise 11.21. Alice and Bob have a quantum strategy which gives the value of
at least $\frac{\sqrt2}{2}$ independently of the distribution $(\pi(i, j))$ on the set of inputs; moreover,
if that distribution is not uniform, they have a quantum strategy yielding a value
strictly larger than $\frac{\sqrt2}{2}$. For the universal strategy, use the same $x_i$, $y_j$ as those
implicit in the hint to Exercise 11.11; it follows from the argument there that, when
expressed in terms of $x_i$, $y_j$, such strategy is unique up to isometries of the Hilbert
space in question. If $(\pi(i, j))$ is not uniform, then either $|\pi(1, 1) y_1 + \pi(1, 2) y_2| +
|\pi(2, 1) y_1 - \pi(2, 2) y_2|$ or $|\pi(1, 1) x_1 + \pi(2, 1) x_2| + |\pi(1, 2) x_1 - \pi(2, 2) x_2|$ is strictly
larger than $\frac{\sqrt2}{2}$.
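Exercise 11.22. Extreme points of the set $K_{k,m}$ defined in (11.23) are deterministic
distributions that are of the form $p(\xi|i) = \delta_{\xi, f(i)}$ for some function $f$. It follows
\[ \lambda p(\xi, \eta | i, j) + (1 - \lambda) \bar{p}(\xi, \eta | i, j) = \operatorname{Tr} \Big( \sigma \big( (E_i^\xi \oplus \bar{E}_i^\xi) \otimes (F_j^\eta \oplus \bar{F}_j^\eta) \big) \Big), \]
$\bar{\mathcal{H}}_A \otimes \bar{\mathcal{H}}_B \subset (\mathcal{H}_A \oplus \bar{\mathcal{H}}_A) \otimes (\mathcal{H}_B \oplus \bar{\mathcal{H}}_B)$.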
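Exercise 11.24. Replace $\rho$ by its appropriate purification (see Section 3.4), i.e.,
represent $\rho \in \mathcal{H}_A \otimes \mathcal{H}_B$ as $\rho = \operatorname{Tr}_{\mathcal{H}_C} |\psi\rangle\langle\psi|$ for some $\psi \in \mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C$. Then
write $\operatorname{Tr} \rho (E_i^\xi \otimes F_j^\eta) = \langle \psi | E_i^\xi \otimes \bar{F}_j^\eta | \psi \rangle$, where $\bar{F}_j^\eta = F_j^\eta \otimes I_{\mathcal{H}_C}$.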
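Exercise 11.25. (i) By Exercise 11.23, it is enough to show that $RB \subset QB$, which
is easy. Note that a product box $P \in RB$ can be represented in a trivial way: take
$\mathcal{H}_A = \mathcal{H}_B = \mathbb{C}$, $\rho = I_{\mathbb{C} \otimes \mathbb{C}}$ and $E_i^\xi = p(\xi|i)\, I_{\mathbb{C}}$, $F_j^\eta = p(\eta|j)\, I_{\mathbb{C}}$. (ii) Consider a local
box of the form (11.20). By Carathéodory's theorem, we may assume that the index
set $\Lambda$ is finite. To obtain a representation as a quantum box with a separable state,
consider $\mathcal{H}_A = \mathcal{H}_B = \mathbb{C}^\Lambda$ and let $(|\lambda\rangle)_{\lambda \in \Lambda}$ be the canonical basis in $\mathbb{C}^\Lambda$. Define
$\rho = \sum_\lambda \mu(\lambda) |\lambda\rangle\langle\lambda| \otimes |\lambda\rangle\langle\lambda|$, $E_i^\xi = \sum_\lambda p(\xi|i, \lambda) |\lambda\rangle\langle\lambda|$ and $F_j^\eta = \sum_\lambda p(\eta|j, \lambda) |\lambda\rangle\langle\lambda|$.
One checks then that the representation (11.21) holds. Note that this construction
is essentially the argument used in Exercise 11.23 to prove convexity of QB, specified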
that the affine space $V_{k,m}$ generated by $K_{k,m}$ does not contain 0 and similarly for
$K_{l,n}$.
Exercise 11.28. If $p_A(\cdot|i) \in K_{k,m}$ and $p_B(\cdot|j) \in K_{l,n}$, the dimension of the set of
boxes $P = \{p(\xi, \eta | i, j)\}$ verifying (11.25) for inputs $i$, $j$ and for that particular choice
Since $LB \subset QB \subset NSB$, all dimensions must be the same. They are all convex
bodies in the affine space $V_{k,m} \hat{\otimes} V_{l,n}$ analyzed in Exercise 11.27.
Exercise 11.29. Let $P = \{\operatorname{Tr} \rho (E_i^\xi \otimes F_j^\eta)\} \in QB$ and $\bar{P} = \{\operatorname{Tr} \rho_* (E_i^\xi \otimes F_j^\eta)\}$, where
$\rho_*$ is the maximally mixed state. Since $\rho_*$ is an interior point of Sep, it follows from
Exercise 11.26 that the intersection of the segment $[\bar{P}, P]$ with LB is a segment of
nonzero length, in particular $P$ belongs to the affine subspace generated by LB.
Since $P \in QB$ was arbitrary, we conclude that QB is contained in that subspace
and, in particular, $\dim QB \le \dim LB$. (The converse inequality is trivial.)
Exercise 11.30. If H Ă RN is an affine subspace not containing 0 and if V is
an affine functional on RN , then there exists v P RN such that xv, xy “ V pxq for
x P H.
Exercise 11.31. The first part is straightforward from the definitions. For the
second part, note that we cannot have LB Ă bQB if |b| ă 1, and then appeal to
the first part.
Exercise 11.32. (i) By Exercise 11.30, we can use affine functionals to exhibit
violations. Given such functional $V$, the largest violation among functionals of the
form $V_s = s + V$ (where $s \in \mathbb{R}$) occurs when $V_s(LB)$ is an interval of the form
$[-a, a]$. Hence if $V$ yields the maximal quantum violation, then
\[ [-a, a] = V(LB) \subset V(QB) \subset [-a\omega_Q(V), a\omega_Q(V)] \]
and the last two intervals share (at least) one of the endpoints. In particular, the
ratio of the lengths of the intervals $V(QB)$ and $V(LB)$ is between $(1 + \omega_Q(V))/2$
and $\omega_Q(V)$. (ii) Replace everywhere QB by NSB.
Exercise 11.33. First, the PR-box yields value 4 (in the normalization given by
(11.29)). In the opposite direction, use the fact that, for each $i$, $j$, $p(\xi, \eta | i, j)$ is a
joint density to deduce that $\big| \sum_{\xi,\eta} p(\xi, \eta | i, j) \big| \le 1$. The second statement follows
then from Exercise 11.15 and the proof of Proposition 11.19.
Exercise 11.34. Reverse engineer the proof of Proposition 11.8 starting from the
configuration $x_1, y_1, x_2, y_2 \in \mathbb{R}^2$ from the hint to Exercise 11.11. This leads (for
example) to $\rho$ being the maximally entangled state on $\mathbb{C}^2 \otimes \mathbb{C}^2$, the isometries
$X_1 = \sigma_x$, $X_2 = \sigma_z$ (the Pauli matrices), $Y_1 = 2^{-1/2}(\sigma_x + \sigma_z)$, $Y_2 = 2^{-1/2}(\sigma_x - \sigma_z)$
and, finally, to the POVMs consisting of spectral projections of $X_i$'s and $Y_j$'s (as in
the formulas following (11.13)). The last step is somewhat tedious, but instructive.
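As a sanity check on this configuration, the following minimal Python sketch evaluates the four correlations $\langle\varphi^+|X_i \otimes Y_j|\varphi^+\rangle$ and combines them into a CHSH-type expression. The sign pattern (a minus sign on the $(2,2)$ term) and all helper names are assumptions made for illustration and are not taken from the text; under that convention the value should come out close to $2\sqrt{2}$.

```python
import math

# Pauli matrices sigma_x and sigma_z (both real 2x2 matrices).
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]


def combine(a, A, b, B):
    """Return the 2x2 matrix a*A + b*B."""
    return [[a * A[i][j] + b * B[i][j] for j in range(2)] for i in range(2)]


def kron(A, B):
    """Kronecker product of two 2x2 matrices, returned as a 4x4 matrix."""
    return [[A[i // 2][j // 2] * B[i % 2][j % 2] for j in range(4)]
            for i in range(4)]


def expectation(M, psi):
    """<psi| M |psi> for a real 4-vector psi and a real 4x4 matrix M."""
    return sum(psi[i] * M[i][j] * psi[j] for i in range(4) for j in range(4))


# Maximally entangled vector (|00> + |11>)/sqrt(2) in C^2 (x) C^2.
PHI_PLUS = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]

s = 2 ** -0.5
X = [SIGMA_X, SIGMA_Z]                      # Alice: X_1, X_2
Y = [combine(s, SIGMA_X, s, SIGMA_Z),       # Bob: Y_1
     combine(s, SIGMA_X, -s, SIGMA_Z)]      # Bob: Y_2

# corr[i][j] is the correlation of X_{i+1} and Y_{j+1} in the state phi^+.
corr = [[expectation(kron(X[i], Y[j]), PHI_PLUS) for j in range(2)]
        for i in range(2)]

# Assumed CHSH sign convention: minus sign on the (2, 2) term.
chsh_value = corr[0][0] + corr[0][1] + corr[1][0] - corr[1][1]
print(chsh_value, 2 * math.sqrt(2))
```

The sketch only evaluates correlations; passing to the POVMs made of spectral projections, as the hint asks, is left as in the exercise.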
Exercise 11.35. (i) The composition rules for Pauli matrices are in Exercise 2.4.
(ii) Multiply all the numbers in the matrix. (iii)(a) Use part (ii); it follows that
the probability of winning under any classical strategy is at most $8/9$. (b) First,
the product of the elements of Alice's output string must be an eigenvalue of the
composition of the corresponding operators, and similarly for Bob, and therefore
by (i)(b) their answers are valid. Next, we can compute (as in Section 3.8) the joint
probability distribution of outcomes when Alice and Bob measure a single shared
$\varphi^+$ in the eigenbasis of a Pauli matrix: for $\sigma_x$ and $\sigma_z$ both outcomes are always
equal, and for $\sigma_y$ both outcomes are always different. It follows that for each of the
Chapter 12
Exercise 12.1. For the first part, use the triangle inequality. For the second part,
consider the POVM $(P, I - P)$ where $P$ is the projection onto the range of $(\rho - \sigma)_+$.
Exercise 12.2. Show that separability and PPT properties are preserved under
the action of a separable channel.
Appendix A
Exercise A.1. Use simple calculus (differentiation) for small t and (for example)
the upper Komatu inequality (A.4) for large t.
Exercise A.2. (i) is elementary calculus. (ii) Let $\delta(x)$ be either $f_+(x) - f(x)$ or
$f(x) - f_-(x)$. We have $\delta'(x) \le x\delta(x)$. Since $\delta(0) \ge 0$, and $\delta$ vanishes at $+\infty$, the
result follows (otherwise consider a local minimum of $\delta$).
370 E. HINTS TO EXERCISES
Exercise A.3. Let $\mu$ be such a probability measure. The Fourier transform $\widehat{\mu}$:
$u \mapsto \int_{\mathbb{R}^n} \exp(i\langle x, u\rangle)\, d\mu(x)$ satisfies $\widehat{\mu}(u + v) = \widehat{\mu}(u)\widehat{\mu}(v)$ for $u \perp v$. Moreover
$\widehat{\mu}$ is radial. If $f(t)$ denotes the value of $\widehat{\mu}$ on the sphere of radius $t^{1/2}$, we have
$f(t + u) = f(t)f(u)$. Since $f$ is continuous and real-valued ($\mu$ is even by assumption),
this implies $f(t) = \exp(-t\sigma^2/2)$ for some $\sigma \ge 0$, and therefore $\mu$ is a Gaussian
measure.
Exercise A.4. We have $\kappa_n = \mathbf{E}|G| \le (\mathbf{E}|G|^2)^{1/2} = \sqrt{n}$. For the lower bound, use
the functional equation $\Gamma(t + 1) = t\Gamma(t)$. Note also that $\kappa_{n+1}/\kappa_n = n/\kappa_n^2$.
Exercise A.5. If $\alpha_n = \sqrt{n - 1/2}$ and $\beta_n = \sqrt{n - \frac{n}{2n+1}}$, it is elementary to
check that the sequences $\kappa_{2n}/\alpha_{2n}$, $\kappa_{2n+1}/\alpha_{2n+1}$, $\beta_{2n}/\kappa_{2n}$ and $\beta_{2n+1}/\kappa_{2n+1}$ are
non-increasing. The result follows since all these sequences converge to 1.
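The relations in the hints to Exercises A.4 and A.5 can be checked numerically. The sketch below assumes the closed form $\kappa_n = \sqrt{2}\,\Gamma((n+1)/2)/\Gamma(n/2)$, which is the case $\alpha = 1$ of the formula quoted in the hint to Exercise A.6, and it reads the monotone-ratio statement as implying $\alpha_n \le \kappa_n \le \beta_n$. The function names are illustrative only.

```python
import math


def kappa(n):
    """kappa_n = E|G| for a standard Gaussian vector G in R^n.

    Uses sqrt(2) * Gamma((n + 1) / 2) / Gamma(n / 2), evaluated through
    log-gamma to avoid overflow for large n.
    """
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2))


def alpha(n):
    """Lower comparison sequence sqrt(n - 1/2) from the hint to Exercise A.5."""
    return math.sqrt(n - 0.5)


def beta(n):
    """Upper comparison sequence sqrt(n - n/(2n + 1)) from the hint to Exercise A.5."""
    return math.sqrt(n - n / (2 * n + 1))


for n in (1, 2, 5, 50):
    # Columns: n, alpha_n, kappa_n, beta_n, sqrt(n), kappa_n * kappa_{n+1}
    print(n, alpha(n), kappa(n), beta(n), math.sqrt(n), kappa(n) * kappa(n + 1))
```

The last column illustrates the identity $\kappa_{n+1}\kappa_n = n$, which is the relation $\kappa_{n+1}/\kappa_n = n/\kappa_n^2$ rewritten.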
Exercise A.6. Express $\int_{\mathbb{R}^n} f \, d\gamma_n$ in polar coordinates. The factor is $\kappa_n(\alpha) =
\mathbf{E}|G|^{\alpha} = 2^{\alpha/2}\,\Gamma((n + \alpha)/2)/\Gamma(n/2)$ and, under some minimal regularity assumptions
on $f$, the formula is valid for $\alpha > -n$.
Appendix B
Exercise B.1. Use Stirling's formula and the bound $n! \ge (n/e)^n$.
Exercise B.2. This is immediate from (B.4).
Exercise B.3. (i)–(ii) Easy. (iii) Use the non-commutative Hölder inequality and
the fact that $A^{\dagger}A$ and $AA^{\dagger}$ (and hence $\sqrt{A^{\dagger}A}$ and $\sqrt{AA^{\dagger}}$) have the same non-zero
eigenvalues.
Exercise B.4. The argument is essentially included in the proof of Theorem 2.3.
Exercise B.5. (i) follows from Proposition B.1 and from the formula $\int_0^1 \|\gamma'(t)\|\, dt$
for the length of an absolutely continuous curve $\gamma : [0, 1] \to G$. (ii) The singular
numbers of $U - V$ are the same as those of $I - U^{-1}V$ and hence (in the notation of
part (i)) equal $|1 - e^{i\theta_j}| = 2\sin(\theta_j/2)$.
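The identity used in part (ii) can be verified directly. The computation below is a routine sketch and assumes the angles are normalized so that $\sin(\theta_j/2) \ge 0$ (for instance $\theta_j \in [0, \pi]$); otherwise an absolute value is needed on the right-hand side.

```latex
\begin{align*}
|1 - e^{i\theta}|^2 &= (1 - \cos\theta)^2 + \sin^2\theta
                     = 2 - 2\cos\theta
                     = 4\sin^2(\theta/2), \\
|1 - e^{i\theta}|   &= 2\,|\sin(\theta/2)| .
\end{align*}
```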
Exercise B.6. By rescaling the parameter $t$ we can achieve $\|A\|_{\infty} \le \pi$. Next,
Finally, use the fact that the map $X \mapsto WX$ is an isometry with respect to any
Schatten $p$-norm.
The last assertion follows from the uniqueness part of Proposition B.1.
Note that allowing right (or two-sided) cosets does not increase generality since
$e^{itA} W = W e^{itW^{\dagger}AW}$.
Exercise B.7. Use the formula $\exp\begin{pmatrix} 0 & -\theta \\ \theta & 0 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ and the fact
that $\mathbb{R}^n$ can be decomposed as an orthogonal direct sum of subspaces of dimension
at most 2 that are invariant for $U$. For the last equality apply Exercise B.5(i).
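The exponential formula for a planar rotation generator can be confirmed numerically. The following sketch sums the power series of the matrix exponential for a $2 \times 2$ matrix and compares the result with the rotation matrix; the helper names and the test angle are arbitrary choices made for illustration.

```python
import math


def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_exp(A, terms=60):
    """Matrix exponential of a 2x2 matrix via its truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]  # holds A^k / k! at step k
    for k in range(1, terms):
        power = mat_mul(power, A)
        power = [[entry / k for entry in row] for row in power]
        result = [[result[i][j] + power[i][j] for j in range(2)]
                  for i in range(2)]
    return result


def rotation(theta):
    """Rotation of the plane by the angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


theta = 0.7  # arbitrary test angle
generator = [[0.0, -theta], [theta, 0.0]]
exp_matrix = mat_exp(generator)
rot_matrix = rotation(theta)
max_error = max(abs(exp_matrix[i][j] - rot_matrix[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```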
Exercise B.8. (i) For an integer $n$, write
$$e^{iB} - e^{iA} = \sum_{k=0}^{n-1} e^{ikA/n}\,(e^{iB/n} - e^{iA/n})\,e^{i(n-1-k)B/n}.$$
It follows that $\|e^{iB} - e^{iA}\| \le n\|e^{iB/n} - e^{iA/n}\|$. Conclude by
using the bound $\|e^X - e^Y\| \le \|X - Y\| \exp(\max(\|X\|, \|Y\|))$ which follows from the
series expansion, and take $n \to \infty$. Alternatively, consider $\varphi(t) = e^{i(1-t)B}e^{itA}$ and
show that $\varphi'(t) = ie^{i(1-t)B}(A - B)e^{itA}$. These arguments work for any unitarily
invariant norm. (ii) The functional inequality for $L(\cdot)$ follows from
$$e^{iB} - e^{iA} = 2(e^{iB/2} - e^{iA/2}) + (e^{iB/2} - I)(e^{iB/2} - e^{iA/2}) + (e^{iB/2} - e^{iA/2})(e^{iA/2} - I).$$
Iterating (and using the simple fact that $L(\theta)$ tends to 1 as $\theta$ goes to 0) gives that
$L(\theta) \ge \prod_{k=1}^{\infty} (1 - |1 - e^{i\theta/2^k}|) = \prod_{k=2}^{\infty} (1 - 2\sin(\theta/2^k))$, which is easy to estimate
numerically.
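A numerical estimate of the infinite product can be sketched as follows. The truncation level and the sample values of $\theta$ are arbitrary choices, not values prescribed by the exercise; the two functions evaluate the two forms of the product given in the hint so that they can be compared.

```python
import cmath
import math


def product_via_modulus(theta, terms=60):
    """Truncation of prod_{k>=1} (1 - |1 - exp(i*theta/2^k)|)."""
    value = 1.0
    for k in range(1, terms + 1):
        value *= 1.0 - abs(1.0 - cmath.exp(1j * theta / 2 ** k))
    return value


def product_via_sine(theta, terms=60):
    """Truncation of prod_{k>=2} (1 - 2*sin(theta/2^k)), same number of factors."""
    value = 1.0
    for k in range(2, terms + 2):
        value *= 1.0 - 2.0 * math.sin(theta / 2 ** k)
    return value


for theta in (0.1, 0.5, 1.0):
    print(theta, product_via_modulus(theta), product_via_sine(theta))
```

Since the $k$-th factor differs from 1 by roughly $\theta/2^k$, sixty factors are far more than double precision can resolve, so the truncation error is negligible for moderate $\theta$.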
Exercise B.9. Let $V \in V_0 H$, then $V = V_0 h$ and $U_1 = U_0 h'$ for some $h, h' \in H$.
Now note that $\|U_0 - V\|_p = \|U_0 - V_0 h\|_p = \|U_0 h' - V_0 hh'\|_p = \|U_1 - V_0 hh'\|_p$ and
$V_0 hh' \in V_0 H$; taking infimum over $V$ shows one inequality and the other follows by
symmetry. Similarly, if $[0, 1] \ni t \mapsto U_0 e^{itA}$ is a geodesic connecting $U_0$ to $V \in V_0 H$,
then $t \mapsto U_0 e^{itA} h' = U_1 e^{ith'^{\dagger}Ah'}$ is a curve of the same length connecting $U_1$ to
$V h' \in V_0 H$.
Exercise B.10. In the notation from Exercise B.9 and its hint, we may assume
that $g_p(U_0, V_0)$ equals the distance between $U_0 H$ and $V_0 H$ (in the sense of $g_p$; note
that the distance is attained, for example by compactness). Next, consider the
geodesic connecting $U_0$ to $V_0$, whose length is equal to that distance, and deduce
from Exercise B.9 that the quotient map $O(n) \to O(n)/H$ is an isometry when
restricted to that geodesic.
Exercise B.11. In the notation of (B.9) let $E_i = \operatorname{span}\{x_i, y_i\}$. The subspaces
$E_1, \dots, E_k$ are pairwise orthogonal and invariant under $P_E$ and $P_F$; they are 2-dimensional
for $s_i < 1$ (which is equivalent to $x_i \ne y_i$) and 1-dimensional otherwise.
We now note that the eigenvalues of $|x_i\rangle\langle x_i| - |y_i\rangle\langle y_i|$ are $\pm\sin\theta_i$; since $P_E =
\sum_i |x_i\rangle\langle x_i|$ and $P_F = \sum_i |y_i\rangle\langle y_i|$, the principal angles $\theta_i \ne 0$ can be retrieved from
the eigenvalues of $P_E - P_F$. It remains to use the relation $P_E - P_F = P_{F^{\perp}} - P_{E^{\perp}}$.
Exercise B.12. Show first the inequalities “$\le$”. In the notation of (B.9) and of
orthogonal complement of the union of such $E_j$'s. The nonzero singular values of
$W_0 - I$ are then $|e^{i\theta_j} - 1| = 2\sin(\theta_j/2)$, each repeated twice, which combined with
(B.13) shows the needed upper bound on $\tilde{h}_p(E, F)$. For an upper bound on the
geodesic distance $\tilde{g}_p(E, F)$, consider a family $W(t)$, $t \in [0, 1]$, where $W(t)$ acts as a
rotation by $t\theta_j$ on $E_j$ and calculate the length of the path $t \mapsto W(t)$ with respect
to the Schatten $p$-norm. (Alternatively, you may note that $W(t) = e^{itA}$ for the
then immediately from (B.13); to get the lower bound on $\tilde{g}_p(E, F)$, observe that,
by Exercise B.10, the optimal geodesic is of the form $W(t) = U_0 e^{itA}$, $t \in [0, 1]$, and
Finally, since $\|P_E - P_F\|_p$ and $\tilde{g}_p$ differ only in terms of the 3rd order and higher,
they both induce the same geodesic distance.
Exercise B.14. Let $(x_1, \dots, x_n)$ be independent standard Gaussian vectors in $\mathbb{R}^n$. Since the
set of singular matrices has measure 0 in $M_n$, these vectors are almost surely linearly
independent. Moreover, the orthogonal matrix obtained by applying the Gram–Schmidt
procedure to the matrix with columns $x_1, \dots, x_n$ is Haar-distributed on
$O(n)$. It follows that the subspace $\operatorname{span}(x_1, \dots, x_k)$ is Haar-distributed on $\mathrm{Gr}(k, \mathbb{R}^n)$.
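The sampling procedure described in this hint can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names are hypothetical, the Gram–Schmidt step is the modified variant (chosen for numerical stability), and only the first $k$ vectors are generated since they already determine the random subspace.

```python
import math
import random


def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis


def random_orthonormal_frame(n, k, rng):
    """Orthonormal basis of span(x_1, ..., x_k) for independent standard Gaussian x_i in R^n."""
    vectors = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    return gram_schmidt(vectors)


rng = random.Random(12345)  # fixed seed only to make the run reproducible
frame = random_orthonormal_frame(5, 2, rng)
for vector in frame:
    print(vector)
```

The span of the returned vectors is the random $k$-dimensional subspace; taking $k = n$ gives the columns of a random orthogonal matrix.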
Exercise B.15. Let $g_1, \dots, g_k$ (resp., $h_1, \dots, h_k$) be independent standard Gaussian
vectors in $E$ (resp., in $E^{\perp}$). For every $\varepsilon > 0$, the random subspace
$\operatorname{span}\{g_i + \varepsilon h_i : 1 \le i \le k\}$ is distributed with respect to some Haar measure on $\mathrm{Gr}(k, \mathbb{R}^n)$ and
converges to $E$ almost surely as $\varepsilon$ goes to 0.
Exercise B.16. The answer to both questions is “no.” The reason is that
$SO(k) \times SO(n - k)$ is a proper subgroup of the stabilizer of $\mathbb{R}^k$ under the canonical action of
$SO(n)$ on $\mathrm{Gr}(k, \mathbb{R}^n)$, and similarly in the complex case. In the complex case, even
the dimensions do not add up: we have
$\dim SU(n) - \bigl(\dim SU(k) + \dim SU(n - k)\bigr) = 2k(n - k) + 1 > 2k(n - k) = \dim \mathrm{Gr}(k, \mathbb{C}^n)$
(note that these are real dimensions).
A more complete answer is that $SO(n)/(SO(k) \times SO(n - k))$ identifies with
the set of oriented $k$-dimensional subspaces of $\mathbb{R}^n$ and is, in a canonical way, a
two-fold cover of $\mathrm{Gr}(k, \mathbb{R}^n)$. (A particular example of this phenomenon is
$S^{n-1} = SO(n)/SO(n - 1)$ being a two-fold cover of $\mathrm{Gr}(1, \mathbb{R}^n) = P(\mathbb{R}^n)$.) Similarly, the set
$SU(n)/(SU(k) \times SU(n - k))$ identifies, in a way, with the set of “signed” $k$-dimensional
subspaces of $\mathbb{C}^n$. See also Exercise B.17.
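The dimension count in the complex case can be written out explicitly. The computation below is a short sketch that uses the standard fact $\dim SU(m) = m^2 - 1$ (real dimension), which is not stated in the hint itself.

```latex
\begin{align*}
\dim SU(n) - \bigl(\dim SU(k) + \dim SU(n-k)\bigr)
  &= (n^2 - 1) - (k^2 - 1) - \bigl((n-k)^2 - 1\bigr) \\
  &= 2nk - 2k^2 + 1 \\
  &= 2k(n-k) + 1 .
\end{align*}
```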
Exercise B.17. There are two fine points: first, the cosets of $H$ are subsets of the
cosets of $O(k) \times O(n - k)$ and so the distances (extrinsic or geodesic) between the
former may be larger than between the latter. Next, geodesics connecting cosets
(as in Exercise B.10) may a priori turn out to be longer if we insist that they
are entirely contained in $(SO(n), g_p)$ (as opposed to the larger space $(O(n), g_p)$).
Similar issues arise in the complex case.
To address these concerns, check that $W$ and $W(t)$ suggested in the hint to Exercise
B.12 are minimizers that work simultaneously for $O(n)$ and for $SO(n)$ (resp., for
automorphisms from (i) preserve $q$ and hence belong to $O^{+}(1, 3)$ iff $|\det V| = 1$. (iii)
Check that if $V = I$, then the two maps from (i) belong respectively to $SO^{+}(1, 3)$
and $O^{+}(1, 3) \setminus SO^{+}(1, 3)$, and appeal to connectedness of $SL(2, \mathbb{C})$.
kernel note that if $x = \Psi^{-1}(|\xi\rangle\langle\xi|)$ for some $\xi \in \mathbb{C}^2$, then $\Phi_V(x) = x$ can be
rewritten as $V|\xi\rangle\langle\xi|V^{\dagger} = |\xi\rangle\langle\xi|$; this means that $\Psi_V = I_{\mathbb{R}^4}$ implies that every
$\xi \in \mathbb{C}^2 \setminus \{0\}$ is an eigenvector of $V$, which is only possible if $V$ is a multiple of $I$.
Appendix C
(for example) $\Delta(x) \in \mathbb{R}_+\}$ for $x \in C$, consider separately the cases $\Phi(x) = 0$ and
$\Phi(x) \ne 0$.
Exercise C.2. If $\Psi$ is the automorphism in question, then $\Psi^* J \Psi = \mu J + Q$ and
similarly $(\Psi^{-1})^* J (\Psi^{-1}) = \nu J + Q'$ with $\mu, \nu \ge 0$ and $Q, Q'$ positive semi-definite
(justify). Next, show that this implies that $(1 - \mu\nu)J = \nu Q + \Psi^* Q' \Psi$, which is only
possible if $\mu\nu = 1$ and $Q = Q' = 0$.
Exercise C.3. If $n = 4$, this follows from Corollary 2.30, modulo identifying completely
positive automorphisms of $\mathrm{PSD}(\mathbb{C}^2)$ with $SO^{+}(1, 3)$ (see the hint to Exercise
B.19(a)). Deduce the conclusion for $n > 4$: when looking for an automorphism $\Psi$
such that $\Psi(u) = v$, consider any 4-dimensional subspace $E \subset \mathbb{R}^n$ containing $e_0$, $u$
and $v$, and define $\Psi$ separately on $E$ and $E^{\perp}$.
A similar line of argument allows one to derive the full statement ($n \ge 2$) from Exercise
B.18.
Exercise C.4. “Reverse engineer” the failure of the proof of the converse implication
from Proposition C.1 when $n = 2$. Alternatively, notice that $L_2$ is
isomorphic to the positive quadrant $\mathbb{R}^2_+$ and that the structure of the cone $P(\mathbb{R}^n_+)$
is particularly simple.
Appendix D
Exercise D.1. One fine point is in verifying that the bases are nontrivial and that
they generate the respective cones, but this is assured by the hypothesis $\langle e^*, e\rangle = 1$
(cf. Exercise 1.28).
we consider the support functions $w(\cdot, \cdot)$, see Section 4.3.3), and that for some $x$
(e.g., $x = \pm a$) the inequalities are strict. Deduce that $K^{\circ} \cap H^{+} \subsetneq (K - a)^{\circ} \cap H^{+}$,
where $H^{+} = \{x \in \mathbb{R}^n : \langle x, a\rangle \ge 0\}$, with the reverse inclusion for the other
halfspace, and show that this implies $\int_{(K-a)^{\circ}} \langle x, b\rangle \, dx > 0$.
Exercise D.3. (i) and (ii) By linear invariance, we may assume that $E$ is a translate
of $B_2^n$. Identify it with the base of the Lorentz cone $L_{n+1}$ and apply Lemma
D.1. (iii) This is immediate if we use the full force of Proposition D.2. For a
proof that does not use the uniqueness part, note that if $K$ is centrally symmetric
and has centroid at the origin, then it is 0-symmetric. Apply this observation to
$K = (E - a)^{\circ}$ and use the bipolar theorem.
Exercise D.4. Let $u$ be such that the segment $[b - u, b + u]$ lies in the interior of
$K$. We now consider $a = a(t) := b + tu$ for $t \in [-1, 1]$ and the corresponding solid
[AAGM15] Shiri Artstein-Avidan, Apostolos Giannopoulos, and Vitali D. Milman, Asymptotic
ut
geometric analysis. Part I, Mathematical Surveys and Monographs, vol. 202, Amer-
ican Mathematical Society, Providence, RI, 2015. 87, 105, 143, 146, 147, 186, 207,
rib
208, 209
[AAKM04] S. Artstein-Avidan, B. Klartag, and V. Milman, The Santaló point of a function,
and a functional form of the Santaló inequality, Mathematika 51 (2004), no. 1-2,
ist
33–48 (2005). 105
[AAM06] S. Artstein-Avidan and V. D. Milman, Logarithmic reduction of the level of ran-
domness in some probabilistic geometric constructions, J. Funct. Anal. 235 (2006),
rd
no. 1, 297–329. 207
[AAS15] Shiri Artstein-Avidan and Boaz A. Slomka, A note on Santaló inequality for the
polarity transform and its reverse, Proc. Amer. Math. Soc. 143 (2015), no. 4, 1693–
1704. 105 fo
[AdRBV98] Juan Arias-de Reyna, Keith Ball, and Rafael Villa, Concentration of the distance in
ot
finite-dimensional normed spaces, Mathematika 45 (1998), no. 2, 245–252. 144
[AGMJV16] David Alonso-Gutiérrez, Bernardo González Merino, C. Hugo Jiménez, and Rafael
N
[AIIS04] David Avis, Hiroshi Imai, Tsuyoshi Ito, and Yuuya Sasaki, Deriving tight Bell in-
equalities for 2 parties with many 2-valued observables from facets of cut polytopes,
arXiv preprint quant-ph/0404014 (2004). 296
se
[AJR15] Srinivasan Arunachalam, Nathaniel Johnston, and Vincent Russo, Is absolute sep-
arability determined by the partial transpose?, Quantum Inf. Comput. 15 (2015),
lu
[AMS04] S. Artstein, V. Milman, and S. J. Szarek, Duality of metric entropy, Ann. of Math.
(2) 159 (2004), no. 3, 1313–1328. 143
[AMSTJ04] S. Artstein, V. Milman, S. Szarek, and N. Tomczak-Jaegermann, On convexified
packing and entropy duality, Geom. Funct. Anal. 14 (2004), no. 5, 1134–1141. 143
[AN12] Guillaume Aubrun and Ion Nechita, Realigning random states, J. Math. Phys. 53
(2012), no. 10, 102210, 16. 274
[Ara04] P. K. Aravind, Quantum mysteries revisited again, Amer. J. Phys. 72 (2004), no. 10,
1303–1307. 297
[Arv09] William Arveson, Maximal vectors in Hilbert space and quantum entanglement,
Journal of Functional Analysis 256 (2009), no. 5, 1476–1510. 233
[AS06] Guillaume Aubrun and Stanisław J Szarek, Tensor products of convex sets and the
volume of separable states on n qudits, Physical Review A 73 (2006), no. 2, 022109.
104, 233, 260, 261
375
376 BIBLIOGRAPHY
[AS10] Erik Alfsen and Fred Shultz, Unique decompositions, faces, and automorphisms of
separable states, J. Math. Phys. 51 (2010), no. 5, 052201, 13. 63
[AS15] Guillaume Aubrun and Stanisł aw Szarek, Two proofs of Størmer’s theorem, arXiv
preprint arXiv:1512.03293 (2015). 64
[AS17] Guillaume Aubrun and Stanislaw Szarek, Dvoretzky’s Theorem and the Complexity
of Entanglement Detection, Discrete Analysis, to appear (2017). 208, 261
[ASW10] Guillaume Aubrun, Stanisław Szarek, and Elisabeth Werner, Nonadditivity of Rényi
entropy and Dvoretzky’s theorem, J. Math. Phys. 51 (2010), no. 2, 022102, 7. 232
[ASW11] , Hastings’s additivity counterexample via Dvoretzky’s theorem, Comm.
Math. Phys. 305 (2011), no. 1, 85–97. 144, 208, 232, 233
ion
[ASY12] Guillaume Aubrun, Stanisław J. Szarek, and Deping Ye, Phase transitions for ran-
dom states and a semicircle law for the partial transpose, Phys. Rev. A 85 (2012),
030302. 273
ut
[ASY14] Guillaume Aubrun, Stanisław J. Szarek, and Deping Ye, Entanglement thresholds
for random induced states, Comm. Pure Appl. Math. 67 (2014), no. 1, 129–171. 64,
rib
179, 260, 270, 273
[Aub05] Guillaume Aubrun, A sharp small deviation inequality for the largest eigenvalue of
ist
a random matrix, Séminaire de Probabilités XXXVIII, Lecture Notes in Math., vol.
1857, Springer, Berlin, 2005, pp. 320–337. 179
[Aub09] , On almost randomizing channels with a short Kraus decomposition, Comm.
rd
Math. Phys. 288 (2009), no. 3, 1103–1116. 232
[Aub12] , Partial transposition of random states and non-centered semicircular dis-
tributions, Random Matrices Theory Appl. 1 (2012), no. 2, 1250001, 29. 179, 273
[Aud09] fo
Koenraad MR Audenaert, A note on the p Ñ q norms of 2-positive maps, Linear
Algebra and Its Applications 430 (2009), no. 4, 1436–1440. 232
ot
[Azu67] Kazuoki Azuma, Weighted sums of certain dependent random variables, Tôhoku
Math. J. (2) 19 (1967), 357–367. 144
N
[BAG97] G. Ben Arous and A. Guionnet, Large deviations for Wigner’s law and Voiculescu’s
non-commutative entropy, Probab. Theory Related Fields 108 (1997), no. 4, 517–
542. 245
ly.
[Bal89] Keith Ball, Volumes of sections of cubes and related problems, Geometric aspects of
functional analysis (1987–88), Lecture Notes in Math., vol. 1376, Springer, Berlin,
lu
[Bal92b] , A lower bound for the optimal density of lattice packings, Internat. Math.
Res. Notices (1992), no. 10, 217–221. 142
r
etry, Math. Sci. Res. Inst. Publ., vol. 31, Cambridge Univ. Press, Cambridge, 1997,
pp. 1–58. 87, 104, 209
[Bar98] Franck Barthe, An extremal property of the mean width of the simplex, Math. Ann.
310 (1998), no. 4, 685–693. 342
[Bar02] Alexander Barvinok, A course in convexity, Graduate Studies in Mathematics,
vol. 54, American Mathematical Society, Providence, RI, 2002. 28
[Bar14] Alexander Barvinok, Thrifty approximations of convex bodies by polytopes., Int.
Math. Res. Not. 2014 (2014), no. 16, 4341–4356 (English). 142, 143
[BBP` 96] Charles H Bennett, Gilles Brassard, Sandu Popescu, Benjamin Schumacher, John A
Smolin, and William K Wootters, Purification of noisy entanglement and faithful
teleportation via noisy channels, Physical Review Letters 76 (1996), no. 5, 722. 306
[BBT05] Gilles Brassard, Anne Broadbent, and Alain Tapp, Quantum pseudo-telepathy,
Found. Phys. 35 (2005), no. 11, 1877–1907. 297
BIBLIOGRAPHY 377
[BC02] Károly Bezdek and Robert Connelly, Pushing disks apart—the Kneser-Poulsen con-
jecture in the plane, J. Reine Angew. Math. 553 (2002), 221–236. 178
[BCL94] Keith Ball, Eric A. Carlen, and Elliott H. Lieb, Sharp uniform convexity and smooth-
ness inequalities for trace norms, Invent. Math. 115 (1994), no. 3, 463–482. 29
[BCN12] Serban Belinschi, Benoît Collins, and Ion Nechita, Eigenvectors and eigenvalues in
a random subspace of a tensor product, Inventiones mathematicae 190 (2012), no. 3,
647–697. 233
[BCN16] Serban T. Belinschi, Benoît Collins, and Ion Nechita, Almost one bit violation for
the additivity of the minimum output entropy, Comm. Math. Phys. 341 (2016),
no. 3, 885–909. 233
ion
[BCP` 14] Nicolas Brunner, Daniel Cavalcanti, Stefano Pironio, Valerio Scarani, and Stephanie
Wehner, Bell nonlocality, Rev. Mod. Phys. 86 (2014), 419–478. 295, 297
[BDG` 77] G. Bennett, L. E. Dor, V. Goodman, W. B. Johnson, and C. M. Newman, On
ut
uncomplemented subspaces of Lp , 1 ă p ă 2, Israel J. Math. 26 (1977), no. 2,
178–187. 208
rib
[BDK89] Rajendra Bhatia, Chandler Davis, and Paul Koosis, An extremal problem in Fourier
analysis with applications to operator theory, J. Funct. Anal. 82 (1989), no. 1, 138–
ist
150. 354
[BDM83] Rajendra Bhatia, Chandler Davis, and Alan McIntosh, Perturbation of spectral sub-
spaces and solution of linear operator equations, Linear Algebra Appl. 52/53 (1983),
rd
45–67. 354
[BDM` 99] Charles H. Bennett, David P. DiVincenzo, Tal Mor, Peter W. Shor, John A. Smolin,
and Barbara M. Terhal, Unextendible product bases and bound entanglement, Phys.
[BDMS13]
fo
Rev. Lett. 82 (1999), no. 26, part 1, 5385–5388. 63
Afonso S. Bandeira, Edgar Dobriban, Dustin G. Mixon, and William F. Sawin,
ot
Certifying the restricted isometry property is hard, IEEE Trans. Inform. Theory 59
(2013), no. 6, 3448–3450. 210
N
[BDSW96] Charles H. Bennett, David P. DiVincenzo, John A. Smolin, and William K. Wootters,
Mixed-state entanglement and quantum error correction, Phys. Rev. A 54 (1996),
3824–3851. 306
ly.
206. 145
[Bec75] William Beckner, Inequalities in Fourier analysis, Ann. of Math. (2) 102 (1975),
no. 1, 159–182. 145
se
[Bel64] J. S. Bell, On the Einstein Podolsky Rosen paradox, Physics 1 (1964), 195–200. 276
[Ben84] Yoav Benyamini, Two-point symmetrization, the isoperimetric inequality on the
lu
sphere and some applications, Texas functional analysis seminar 1983–1984 (Austin,
Tex.), Longhorn Notes, Univ. Texas Press, Austin, TX, 1984, pp. 53–76. 144
[Bez08] K. Bezdek, From the Kneser-Poulsen conjecture to ball-polyhedra, European J. Com-
na
tices, Proc. Amer. Math. Soc. 102 (1988), no. 3, 651–659. 178
[BGK` 01] Andreas Brieden, Peter Gritzmann, Ravindran Kannan, Victor Klee, László Lovász,
r
ion
sław J. Szarek, Bound entangled states with extremal properties, Phys. Rev. A 90
(2014), 012301. 261
[Bil99] Patrick Billingsley, Convergence of probability measures, second ed., Wiley Series in
ut
Probability and Statistics: Probability and Statistics, John Wiley & Sons, Inc., New
York, 1999, A Wiley-Interscience Publication. 179
rib
[BKP06] Jonathan Barrett, Adrian Kent, and Stefano Pironio, Maximally nonlocal and
monogamous quantum correlations, Phys. Rev. Lett. 97 (2006), 170409. 297
ist
[BL75] H.J. Brascamp and E.H. Lieb, Some inequalities for Gaussian measures and the
long range order of one-dimensional plasma, pp. 1–14, Clarendon Press, Oxford,
1975. 105
rd
[BL76] Herm Jan Brascamp and Elliott H. Lieb, On extensions of the Brunn-Minkowski
and Prékopa-Leindler theorems, including inequalities for log concave functions,
and with an application to the diffusion equation, J. Functional Analysis 22 (1976),
[BL01]
no. 4, 366–389. 105 fo
H Barnum and N Linden, Monotones and invariants for multi-particle quantum
ot
states, Journal of Physics A: Mathematical and General 34 (2001), no. 35, 6787. 233
[BLM89] J. Bourgain, J. Lindenstrauss, and V. Milman, Approximation of zonoids by zono-
N
topes., Acta Math. 162 (1989), no. 1-2, 73–141 (English). 209
[BLM13] Stéphane Boucheron, Gábor Lugosi, and Pascal Massart, Concentration inequalities,
Oxford University Press, Oxford, 2013, A nonasymptotic theory of independence,
ly.
The flatness theorem for nonsymmetric convex bodies via the local theory of Banach
spaces, Math. Oper. Res. 24 (1999), no. 3, 728–750. 103, 207
[BM87] J. Bourgain and V. D. Milman, New volume ratio properties for convex symmetric
bodies in Rn , Invent. Math. 88 (1987), no. 2, 319–340. 105, 209
se
2008. 295
[BMW09] Michael J. Bremner, Caterina Mora, and Andreas Winter, Are random pure states
useful for quantum computation?, Phys. Rev. Lett. 102 (2009), no. 19, 190502, 4.
na
233
[BN02] Alexander Barg and Dmitry Yu. Nogin, Bounds on packings of spheres in the Grass-
so
mann manifold, IEEE Trans. Inform. Theory 48 (2002), no. 9, 2450–2454. 143
[BN05] , Correction to: “Bounds on packings of spheres in the Grassmann manifold”
r
[Bom90a] Jan Boman, Smoothness of sums of convex sets with real analytic boundaries, Math.
Scand. 66 (1990), no. 2, 225–230. 104
[Bom90b] , The sum of two plane convex C 8 sets is not always C 5 , Math. Scand. 66
(1990), no. 2, 216–224. 104
[Bon70] Aline Bonami, Étude des coefficients de Fourier des fonctions de Lp pGq, Ann. Inst.
Fourier (Grenoble) 20 (1970), no. fasc. 2, 335–402 (1971). 145
[Bor75a] C. Borell, Convex set functions in d-space, Period. Math. Hungar. 6 (1975), no. 2,
111–136. 104
[Bor75b] Christer Borell, The Brunn-Minkowski inequality in Gauss space, Invent. Math. 30
(1975), no. 2, 207–216. 144
ion
[Bor03] , The Ehrhard inequality, C. R. Math. Acad. Sci. Paris 337 (2003), no. 10,
663–666. 144
[Bör04] Károly Böröczky, Jr., Finite packing and covering, Cambridge Tracts in Mathemat-
ut
ics, vol. 154, Cambridge University Press, Cambridge, 2004. 141, 142
[Bou84] J. Bourgain, On martingales transforms in finite-dimensional lattices with an ap-
rib
pendix on the K-convexity constant, Math. Nachr. 119 (1984), 41–53. 207
[Boy67] A. V. Boyd, Note on a paper by Uppuluri, Pacific J. Math. 22 (1967), 9–10. 309
ist
[BP01] Imre Bárány and Attila Pór, On 0-1 polytopes with many facets, Adv. Math. 161
(2001), no. 2, 209–228. 281
[Bry95] Włodzimierz Bryc, The normal distribution, Lecture Notes in Statistics, vol. 100,
rd
Springer-Verlag, New York, 1995, Characterizations with applications. 309
[BS] Andrew Blasius and Stanisław Szarek, Sharp two-sided bounds for the medians of
gamma and chi-squared distributions, in preparation. 124
[BS88] fo
J. Bourgain and S. J. Szarek, The Banach-Mazur distance to the cube and the
Dvoretzky-Rogers factorization, Israel J. Math. 62 (1988), no. 2, 169–180. 208
ot
[BS10] Salman Beigi and Peter W. Shor, Approximating the set of separable states using
the positive partial transpose test, J. Math. Phys. 51 (2010), no. 4, 042202, 10. 261
N
[BTN01a] Aharon Ben-Tal and Arkadi Nemirovski, Lectures on modern convex optimization,
MPS/SIAM Series on Optimization, Society for Industrial and Applied Mathematics
(SIAM), Philadelphia, PA; Mathematical Programming Society (MPS), Philadel-
ly.
[BV13] Jop Briët and Thomas Vidick, Explicit lower and upper bounds on the entangled
value of multiplayer XOR games, Comm. Math. Phys. 321 (2013), no. 1, 181–207.
lu
297
[BW03] Károly Böröczky, Jr. and Gergely Wintsche, Covering the sphere by equal spherical
balls, Discrete and computational geometry, Algorithms Combin., vol. 25, Springer,
na
[CFN15] Benoît Collins, Motohisa Fukuda, and Ion Nechita, On the convergence of output
sets of quantum channels, J. Operator Theory 73 (2015), no. 2, 333–360. 233
[CFR59] H. S. M. Coxeter, L. Few, and C. A. Rogers, Covering space with equal spheres,
Mathematika 6 (1959), 147–157. 111, 142
[CG04] Daniel Collins and Nicolas Gisin, A relevant two qubit Bell inequality inequivalent
to the CHSH inequality, J. Phys. A 37 (2004), no. 5, 1775–1787. 296
[CGLP12] Djalil Chafaï, Olivier Guédon, Guillaume Lecué, and Alain Pajor, Interactions be-
tween compressed sensing random matrices and high dimensional geometry, Panora-
mas et Synthèses [Panoramas and Syntheses], vol. 37, Société Mathématique de
France, Paris, 2012. 146
ion
[Cha67] G. D. Chakerian, Inequalities for the difference body of a convex body, Proc. Amer.
Math. Soc. 18 (1967), 879–884. 105
[Che78] S. Chevet, Séries de variables aléatoires gaussiennes à valeurs dans E bp ε F . Appli-
ut
cation aux produits d’espaces de Wiener abstraits, Séminaire sur la Géométrie des
Espaces de Banach (1977–1978), École Polytech., Palaiseau, 1978, pp. Exp. No. 19,
rib
15. 180
[CHL` 08] Toby Cubitt, Aram W Harrow, Debbie Leung, Ashley Montanaro, and Andreas
ist
Winter, Counterexamples to additivity of minimum output p-renyi entropy for p
close to 0, Communications in Mathematical Physics 284 (2008), no. 1, 281–290.
232
rd
[CHLL97] Gérard Cohen, Iiro Honkala, Simon Litsyn, and Antoine Lobstein, Covering codes,
North-Holland Mathematical Library, vol. 54, North-Holland Publishing Co., Ams-
terdam, 1997. 142
[Cho75a] fo
Man Duen Choi, Completely positive linear maps on complex matrices, Linear Al-
gebra and Appl. 10 (1975), 285–290. 64
ot
[Cho75b] Man-Duen Choi, Positive semidefinite biquadratic forms, Linear Algebra and its
Applications 12 (1975), no. 2, 95–100. 63
N
[CHS96] John H. Conway, Ronald H. Hardin, and Neil J. A. Sloane, Packing lines, planes,
etc.: packings in Grassmannian spaces, Experiment. Math. 5 (1996), no. 2, 139–159.
143
ly.
[CHSH69] John F. Clauser, Michael A. Horne, Abner Shimony, and Richard A. Holt, Proposed
experiment to test local hidden-variable theories, Phys. Rev. Lett. 23 (1969), 880–
on
884. 295
[Chu62] J. T. Chu, Mathematical Notes: A Modified Wallis Product and Some Applications,
Amer. Math. Monthly 69 (1962), no. 5, 402–404. 309
B. S. Cirel1 son, Quantum generalizations of Bell’s inequality, Lett. Math. Phys. 4
se
[Cir80]
(1980), no. 2, 93–100. 295
lu
[CL06] Eric Carlen and Elliott H. Lieb, Some matrix rearrangement inequalities, Ann. Mat.
Pura Appl. (4) 185 (2006), no. suppl., S315–S324. 29
[Cla36] James A. Clarkson, Uniformly convex spaces, Trans. Amer. Math. Soc. 40 (1936),
na
no. 3, 396–414. 29
[Cla06] Lieven Clarisse, The distillability problem revisited, Quantum Inf. Comput. 6 (2006),
so
Everything you always wanted to know about LOCC (but were afraid to ask), Comm.
Pe
[CN16] Benoît Collins and Ion Nechita, Random matrix techniques in quantum information
theory, Journal of Mathematical Physics 57 (2016), no. 1, 015215. 179, 233
[CNY12] Benoit Collins, Ion Nechita, and Deping Ye, The absolute positive partial transpose
property for random induced states, Random Matrices Theory Appl. 1 (2012), no. 3,
1250002, 22. 274
[Col06] Andrea Colesanti, Functional inequalities related to the Rogers-Shephard inequality,
Mathematika 53 (2006), no. 1, 81–101 (2007). 105
[Col16] Benoît Collins, Haagerup’s inequality and additivity violation of the Minimum Out-
put Entropy, arXiv preprint arXiv:1603.00577 (2016). 233
[CP88] Bernd Carl and Alain Pajor, Gel1 fand numbers of operators with values in a Hilbert
ion
space, Invent. Math. 94 (1988), no. 3, 479–504. 178
[CR86] Jeesen Chen and Herman Rubin, Bounds for the difference between median and
mean of gamma and Poisson distributions, Statist. Probab. Lett. 4 (1986), no. 6,
ut
281–283. 124
[CS90] Bernd Carl and Irmtraud Stephani, Entropy, compactness and the approximation of
rib
operators, Cambridge Tracts in Mathematics, vol. 98, Cambridge University Press,
Cambridge, 1990. 142
ist
[CS99] J. H. Conway and N. J. A. Sloane, Sphere packings, lattices and groups, third ed.,
Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Math-
ematical Sciences], vol. 290, Springer-Verlag, New York, 1999, With additional con-
rd
tributions by E. Bannai, R. E. Borcherds, J. Leech, S. P. Norton, A. M. Odlyzko,
R. A. Parker, L. Queen and B. B. Venkov. 141, 142
[CS05] Shiing-Shen Chern and Zhongmin Shen, Riemann-Finsler geometry, Nankai Tracts
fo
in Mathematics, vol. 6, World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ,
2005. 319
ot
[CSW14] Adán Cabello, Simone Severini, and Andreas Winter, Graph-theoretic approach to
quantum correlations, Phys. Rev. Lett. 112 (2014), 040401. 297
N
[CW03] Kai Chen and Ling-An Wu, A matrix realignment method for recognizing entangle-
ment, Quantum Inf. Comput. 3 (2003), no. 3, 193–202. 63
[Dav57] Chandler Davis, All convex invariant functions of hermitian matrices, Arch. Math.
ly.
8 (1957), 276–278. 29
[DCLB00] W. Dür, J. I. Cirac, M. Lewenstein, and D. Bruß, Distillability and partial transpo-
on
[DF87] Persi Diaconis and David Freedman, A dozen de Finetti-style results in search of a
theory, Ann. Inst. H. Poincaré Probab. Statist. 23 (1987), no. 2, suppl., 397–423.
lu
144
[DF93] Andreas Defant and Klaus Floret, Tensor norms and operator ideals, North-Holland
Mathematics Studies, vol. 176, North-Holland Publishing Co., Amsterdam, 1993.
na
103
[DL97] Michel Marie Deza and Monique Laurent, Geometry of cuts and metrics, Algorithms
so
asymptotic expansions for the mean and the median of a Gaussian sample maxi-
Pe
mum, and applications to the Donoho-Jin model, Stat. Methodol. 20 (2014), 40–62.
178, 351
[Dmi90] V. A. Dmitrovskiı̆, On the integrability of the maximum and the local properties
of Gaussian fields, Probability theory and mathematical statistics, Vol. I (Vilnius,
1989), “Mokslas”, Vilnius, 1990, pp. 271–284. 144
[DPS04] Andrew C. Doherty, Pablo A. Parrilo, and Federico M. Spedalieri, Complete family
of separability criteria, Phys. Rev. A 69 (2004), 022308. 63
[DR47] H. Davenport and C. A. Rogers, Hlawka’s theorem in the geometry of numbers,
Duke Math. J. 14 (1947), 367–375. 142
[DR50] A. Dvoretzky and C. A. Rogers, Absolute and unconditional convergence in normed
linear spaces, Proc. Nat. Acad. Sci. U. S. A. 36 (1950), 192–197. 208
[DS85] Stephen Dilworth and Stanisław Szarek, The cotype constant and an almost Eu-
clidean decomposition for finite-dimensional normed spaces, Israel J. Math. 52
(1985), no. 1-2, 82–96. 209
[DS01] Kenneth R. Davidson and Stanislaw J. Szarek, Local operator theory, random ma-
trices and Banach spaces, Handbook of the geometry of Banach spaces, Vol. I,
North-Holland, Amsterdam, 2001, pp. 317–366. 144, 180
[DSS+00] David P. DiVincenzo, Peter W. Shor, John A. Smolin, Barbara M. Terhal, and
Ashish V. Thapliyal, Evidence for bound entangled states with negative partial trans-
pose, Phys. Rev. A 61 (2000), 062312. 306
[Dud67] R. M. Dudley, The sizes of compact subsets of Hilbert space and continuity of Gauss-
ian processes, J. Functional Analysis 1 (1967), 290–330. 179
[Due10] Lutz Duembgen, Bounding standard Gaussian tail probabilities, Tech. report, Uni-
versity of Bern, Institute of Mathematical Statistics and Actuarial Science, 2010.
309
[Dum07] Ilya Dumer, Covering spheres with spheres, Discrete Comput. Geom. 38 (2007),
no. 4, 665–679. 110, 142
[Dür01] W. Dür, Multipartite bound entangled states that violate Bell’s inequality, Phys.
Rev. Lett. 87 (2001), 230402. 297
[Dvo61] Aryeh Dvoretzky, Some results on convex bodies and Banach spaces, Proc. Internat.
Sympos. Linear Spaces (Jerusalem, 1960), Jerusalem Academic Press, Jerusalem;
Pergamon, Oxford, 1961, pp. 123–160. 208
[EC04] Fida El Chami, Spectra of the Laplace operator on Grassmann manifolds, Int. J.
Pure Appl. Math. 12 (2004), no. 4, 395–418. 145
[Ehr83] Antoine Ehrhard, Symétrisation dans l’espace de Gauss, Math. Scand. 53 (1983),
no. 2, 281–301. 144
[EPR35] A. Einstein, B. Podolsky, and N. Rosen, Can quantum-mechanical description of
physical reality be considered complete?, Phys. Rev. 47 (1935), 777–780. 276
[ES70] P. Erdős and A. H. Stone, On the sum of two Borel sets, Proc. Amer. Math. Soc.
25 (1970), 304–306. 104
[EVWW01] Tilo Eggeling, Karl Gerd H. Vollbrecht, Reinhard F. Werner, and Michael M. Wolf,
Distillability via protocols respecting the positivity of partial transpose, Phys. Rev.
Lett. 87 (2001), 257902. 306
[Fer75] X. Fernique, Regularité des trajectoires des fonctions aléatoires gaussiennes, École
d’Été de Probabilités de Saint-Flour, IV-1974, Springer, Berlin, 1975, pp. 1–96.
Lecture Notes in Math., Vol. 480. 178, 179
144, 179
[FF81] P. Frankl and Z. Füredi, A short proof for a theorem of Harper about Hamming-
spheres, Discrete Math. 34 (1981), no. 3, 311–313. 146
[FHS13] Omar Fawzi, Patrick Hayden, and Pranab Sen, From low-distortion norm embed-
dings to explicit uncertainty relations and efficient information locking., J. ACM
[FK94] S. K. Foong and S. Kanno, Proof of D. N. Page’s conjecture on: “Average en-
tropy of a subsystem” [Phys. Rev. Lett. 71 (1993), no. 9, 1291–1294; MR1232812
(94f:81007)], Phys. Rev. Lett. 72 (1994), no. 8, 1148–1151. 232
[FK10] Motohisa Fukuda and Christopher King, Entanglement of random subspaces via the
Hastings bound, J. Math. Phys. 51 (2010), no. 4, 042201, 19. 233
[FKM10] Motohisa Fukuda, Christopher King, and David K. Moser, Comments on Hastings’
additivity counterexamples, Comm. Math. Phys. 296 (2010), no. 1, 111–143. 233
[FLM77] T. Figiel, J. Lindenstrauss, and V. D. Milman, The dimension of almost spherical
sections of convex bodies, Acta Math. 139 (1977), no. 1-2, 53–94. 144, 208
[FLPS11] Shmuel Friedland, Chi-Kwong Li, Yiu-Tung Poon, and Nung-Sing Sze, The auto-
morphism group of separable states in quantum information theory, J. Math. Phys.
52 (2011), no. 4, 042203, 8. 63
[FN15] Motohisa Fukuda and Ion Nechita, Additivity rates and PPT property for random
quantum channels, Ann. Math. Blaise Pascal 22 (2015), no. 1, 1–72. 180
[Fol99] Gerald B. Folland, Real analysis, second ed., Pure and Applied Mathematics (New
York), John Wiley & Sons, Inc., New York, 1999, Modern techniques and their
applications, A Wiley-Interscience Publication. 15
[For10] Dominique Fortin, Hadamard’s matrices, Grothendieck’s constant, and root two,
Optimization and optimal control, Springer Optim. Appl., vol. 39, Springer, New
York, 2010, pp. 423–447. 295
[FR94] P. C. Fishburn and J. A. Reeds, Bell inequalities, Grothendieck’s constant, and root
two, SIAM J. Discrete Math. 7 (1994), no. 1, 48–56. 295
[FR13] Simon Foucart and Holger Rauhut, A mathematical introduction to compressive
sensing, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer, New
York, 2013. 208, 309
[Fra99] Matthieu Fradelizi, Hyperplane sections of convex bodies in isotropic position,
Beiträge Algebra Geom. 40 (1999), no. 1, 163–183. 103, 105
[Fre14] Daniel J. Fresen, Explicit Euclidean embeddings in permutation invariant normed
spaces, Adv. Math. 266 (2014), 1–16. 210
[Fri12] Tobias Fritz, Tsirelson’s problem and Kirchberg’s conjecture, Rev. Math. Phys. 24
(2012), no. 5, 1250012, 67. 296
[Fro81] M. Froissart, Constructive generalization of Bell’s inequalities, Nuovo Cimento B
(11) 64 (1981), no. 2, 241–251. 296
[FŚ13] Motohisa Fukuda and Piotr Śniady, Partial transpose of random quantum states:
exact formulas and meanders, J. Math. Phys. 54 (2013), no. 4, 042202, 23. 179
[FT97] Gábor Fejes Tóth, Packing and covering, Handbook of discrete and computational
geometry, CRC Press Ser. Discrete Math. Appl., CRC, Boca Raton, FL, 1997,
pp. 19–41. 141
[FTJ79] T. Figiel and Nicole Tomczak-Jaegermann, Projections onto Hilbertian subspaces of
[FW07] Motohisa Fukuda and Michael M. Wolf, Simplifying additivity problems using direct
sum constructions, J. Math. Phys. 48 (2007), no. 7, 072101, 7. 232
[Gal95] Janos Galambos, Advanced probability theory, vol. 10, CRC Press, 1995. 179
[Gar83] Anupam Garg, Detector error and Einstein-Podolsky-Rosen correlations, Phys. Rev.
D (3) 28 (1983), no. 4, 785–790. 295
[Gar02] R. J. Gardner, The Brunn-Minkowski inequality, Bull. Amer. Math. Soc. (N.S.) 39
(2002), no. 3, 355–405. 104, 105
[GB02] Leonid Gurvits and Howard Barnum, Largest separable balls around the maximally
mixed bipartite quantum state, Physical Review A 66 (2002), no. 6, 062311. 260
[GB03] Leonid Gurvits and Howard Barnum, Separable balls around the maximally mixed multipartite quantum states,
[GFE09] D Gross, ST Flammia, and J Eisert, Most quantum states are too entangled to
be useful as computational resources, Physical review letters 102 (2009), no. 19,
190501. 233
[GG71] D. J. H. Garling and Y. Gordon, Relations between some constants associated with
finite dimensional Banach spaces, Israel J. Math. 9 (1971), 346–361. 104
[GG84] A. Yu. Garnaev and E. D. Gluskin, The widths of a Euclidean ball, Dokl. Akad.
Nauk SSSR 277 (1984), no. 5, 1048–1052. 208
[GGHE08] O. Gittsovich, O. Gühne, P. Hyllus, and J. Eisert, Unifying several separability
conditions using the covariance matrix criterion, Phys. Rev. A 78 (2008), 052319.
64
[GHP10] Andrzej Grudka, Michał Horodecki, and Łukasz Pankowski, Constructive counterexamples
to the additivity of the minimum output Rényi entropy of quantum channels
for all $p > 2$, Journal of Physics A: Mathematical and Theoretical 43 (2010), no. 42,
425304. 232
[Gia96] A. A. Giannopoulos, A proportional Dvoretzky-Rogers factorization result, Proc.
Amer. Math. Soc. 124 (1996), no. 1, 233–241. 208
[GLMP04] Y. Gordon, A. E. Litvak, M. Meyer, and A. Pajor, John’s decomposition in the
general case and applications, J. Differential Geom. 68 (2004), no. 1, 99–119. 103
[GLR10] Venkatesan Guruswami, James R. Lee, and Alexander Razborov, Almost Euclidean
subspaces of $\ell_1^N$ via expander codes, Combinatorica 30 (2010), no. 1, 47–68. 207
[Glu81] E. D. Gluskin, The diameter of the Minkowski compactum is roughly equal to n,
Funktsional. Anal. i Prilozhen. 15 (1981), no. 1, 72–73. 103
[Glu88] E. D. Gluskin, Extremal properties of orthogonal parallelepipeds and their applications to
the geometry of Banach spaces, Mat. Sb. (N.S.) 136(178) (1988), no. 1, 85–96. 178
[GLW08] Venkatesan Guruswami, James R. Lee, and Avi Wigderson, Euclidean sections of
$\ell_1^N$ with sublinear randomness and error-correction over the reals, Approximation,
randomization and combinatorial optimization, Lecture Notes in Comput. Sci., vol.
5171, Springer, Berlin, 2008, pp. 444–454. 207
[GM00] A. A. Giannopoulos and V. D. Milman, Concentration property on probability spaces,
Adv. Math. 156 (2000), no. 1, 77–106. 144
[GMW14] Whan Ghang, Zane Martin, and Steven Waruhiu, The sharp log-Sobolev inequality
on a compact interval, Involve 7 (2014), no. 2, 181–186. 145
[Gor85] Yehoram Gordon, Some inequalities for Gaussian processes and applications, Israel
J. Math. 50 (1985), no. 4, 265–289. 180
[Gor88] Y. Gordon, On Milman’s inequality and random subspaces which escape through a
mesh in $\mathbb{R}^n$, Geometric aspects of functional analysis (1986/87), Lecture Notes in
Math., vol. 1317, Springer, Berlin, 1988, pp. 84–106. 180, 208, 209
[Gra14] Loukas Grafakos, Classical Fourier analysis, third ed., Graduate Texts in Mathe-
matics, vol. 249, Springer, New York, 2014. 158
de Dvoretzky-Rogers, Bol. Soc. Mat. São Paulo 8 (1953), 81–110 (1956). 208
[Gro75] Leonard Gross, Logarithmic Sobolev inequalities, Amer. J. Math. 97 (1975), no. 4,
1061–1083. 145
[Gro80] Misha Gromov, Paul Levy’s isoperimetric inequality, preprint IHES (1980). 144
[Gro87] M. Gromov, Monotonicity of the volume of intersection of balls, Geometrical aspects
of functional analysis (1985/86), Lecture Notes in Math., vol. 1267, Springer, Berlin,
1987, pp. 1–4. 178
[Grü03] Branko Grünbaum, Convex polytopes, second ed., Graduate Texts in Mathematics,
vol. 221, Springer-Verlag, New York, 2003, Prepared and with a preface by Volker
Kaibel, Victor Klee and Günter M. Ziegler. 29
[Gru07] Peter M. Gruber, Convex and discrete geometry, Grundlehren der Mathematis-
chen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 336,
[Hal82] Paul Richard Halmos, A Hilbert space problem book, second ed., Graduate Texts
in Mathematics, vol. 19, Springer-Verlag, New York-Berlin, 1982, Encyclopedia of
Mathematics and its Applications, 17. 343
[Hal07] Majdi Ben Halima, Branching rules for unitary groups and spectra of invariant
differential operators on complex Grassmannians, J. Algebra 318 (2007), no. 2,
520–552. 145
[Hal15] Brian Hall, Lie groups, Lie algebras, and representations, second ed., Graduate
Texts in Mathematics, vol. 222, Springer, Cham, 2015, An elementary introduction.
145
[Han56] Olof Hanner, On the uniform convexity of $L^p$ and $l^p$, Ark. Mat. 3 (1956), 239–244.
29
[Har66] L. H. Harper, Optimal numberings and isoperimetric problems on graphs, J. Com-
binatorial Theory 1 (1966), 385–393. 146
[Har13] Aram W Harrow, The church of the symmetric subspace, arXiv preprint 1308.6595
(2013). 63
[Has09] Matthew B Hastings, Superadditivity of communication capacity using entangled
inputs, Nature Physics 5 (2009), no. 4, 255–257. 144, 232, 233
[Hel69] Carl W. Helstrom, Quantum detection and estimation theory, J. Statist. Phys. 1
(1969), 231–252. 305
[Hen80] Douglas Hensley, Slicing convex bodies—bounds for slice area in terms of the body’s
covariance, Proc. Amer. Math. Soc. 79 (1980), no. 4, 619–625. 105
[Hen12] Martin Henk, Löwner-John ellipsoids, Doc. Math. (2012), no. Extra volume: Opti-
mization stories, 95–106. 104
[HH99] Michał Horodecki and Paweł Horodecki, Reduction criterion of separability and limits
for a class of distillation protocols, Phys. Rev. A 59 (1999), 4206–4216. 306
[HH01] Paweł Horodecki and Ryszard Horodecki, Distillation and bound entanglement,
Quantum Inf. Comput. 1 (2001), no. 1, 45–75. 306
[HHH96] Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki, Separability of mixed
states: necessary and sufficient conditions, Physics Letters A 223 (1996), no. 1–2,
1–8. 63, 64
[HHH97] Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki, Inseparable two spin-$\frac{1}{2}$
density matrices can be distilled to a singlet form, Phys. Rev. Lett. 78 (1997),
574–577. 306
[HHH98] Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki, Mixed-State Entanglement and Distillation: Is there a “Bound” Entanglement
in Nature?, Phys. Rev. Lett. 80 (1998), 5239–5242. 306
[HHHH09] Ryszard Horodecki, Paweł Horodecki, Michał Horodecki, and Karol Horodecki,
Quantum entanglement, Rev. Modern Phys. 81 (2009), no. 2, 865–942. 63, 74,
306
[Hil05] Roland Hildebrand, Cones of ball-ball separable elements, arXiv preprint quant-
ph/0503194 (2005). 324
[Hil06] Roland Hildebrand, Separable balls around the maximally mixed state for a 3-qubit system,
arXiv preprint quant-ph/0601201 (2006). 233, 261
[Hil07a] Roland Hildebrand, Entangled states close to the maximally mixed state, Physical Review A 75
[Hol73] A. S. Holevo, Statistical decision theory for quantum systems, J. Multivariate Anal.
3 (1973), 337–394. 305
[Hol06] Alexander S. Holevo, The additivity problem in quantum information theory, In-
ternational Congress of Mathematicians. Vol. III, Eur. Math. Soc., Zürich, 2006,
pp. 999–1018. 232
[Hol12] Alexander S. Holevo, Quantum systems, channels, information, De Gruyter Studies in Mathematical
Physics, vol. 16, De Gruyter, Berlin, 2012, A mathematical introduction.
63
[Hor97] Pawel Horodecki, Separability criterion and inseparable mixed states with positive
partial transposition, Physics Letters A 232 (1997), no. 5, 333 – 339. 63
[HP98] Fumio Hiai and Dénes Petz, Eigenvalue density of the Wishart matrix and large
deviations, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1 (1998), no. 4, 633–
646. 245
[HP00] Fumio Hiai and Dénes Petz, The semicircle law, free random variables and entropy, Mathematical Surveys
and Monographs, vol. 77, American Mathematical Society, Providence, RI,
2000. 180
[HQV+16] F. Hirsch, M.T. Quintino, T. Vértesi, M. Navascués, and N. Brunner, Better local
hidden variable models for two-qubit Werner states and an upper bound on the
Grothendieck constant $K_G(3)$, arXiv preprint 1609.06114 (2016). 295
[HS05] Daniel Hug and Rolf Schneider, Large typical cells in Poisson-Delaunay mosaics,
Rev. Roumaine Math. Pures Appl. 50 (2005), no. 5-6, 657–670. 178
[HSR03] Michael Horodecki, Peter W. Shor, and Mary Beth Ruskai, Entanglement breaking
channels, Rev. Math. Phys. 15 (2003), no. 6, 629–641. 64
[HT03] Uffe Haagerup and Steen Thorbjørnsen, Random matrices with complex gaussian
entries, Expositiones Mathematicae 21 (2003), no. 4, 293–337. 170, 179
[HT05] Uffe Haagerup and Steen Thorbjørnsen, A new application of random matrices: $\mathrm{Ext}(C^*_{\mathrm{red}}(F_2))$ is not a group, Ann.
of Math. (2) 162 (2005), no. 2, 711–775. 179, 180
[Ide16] , A review of matrix scaling and sinkhorn’s normal form for matrices and
positive maps, arXiv preprint 1609.06349 (2016). 64
[Ind00] Piotr Indyk, Dimensionality reduction techniques for proximity problems, Proceed-
ings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (San
Francisco, CA, 2000), ACM, New York, 2000, pp. 371–378. 207
tion, and combinatorial optimization, Lecture Notes in Comput. Sci., vol. 6302,
Springer, Berlin, 2010, pp. 632–641. 207, 210
[Jam72] A. Jamiołkowski, Linear transformations which preserve trace and positive semidef-
initeness of operators, Rep. Mathematical Phys. 3 (1972), no. 4, 275–278. 64
[Jan97] Svante Janson, Gaussian Hilbert spaces, Cambridge Tracts in Mathematics, vol. 129,
Cambridge University Press, Cambridge, 1997. 145
[Jen13] Justin Jenkinson, Convex geometric connections to information theory, Ph.D. thesis,
Case Western Reserve University, 2013, http://rave.ohiolink.edu/etdc/view?acc_num=case1365179413. 109, 260
[JHH+15] P. Joshi, K. Horodecki, M. Horodecki, P. Horodecki, R. Horodecki, Ben Li, S. J.
Szarek, and T. Szarek, Bound on Bell inequalities by fraction of determinism and
reverse triangle inequality, Phys. Rev. A 92 (2015), 032329. 297
[JHK+08] Eylee Jung, Mi-Ra Hwang, Hungsoo Kim, Min-Soo Kim, DaeKil Park, Jin-Woo Son,
and Sayatnova Tamaryan, Reduced state uniquely defines the groverian measure of
the original pure state, Phys. Rev. A 77 (2008), 062317. 233
[JL84] William B. Johnson and Joram Lindenstrauss, Extensions of Lipschitz mappings
into a Hilbert space., Contemp. Math. 26 (1984), 189–206 (English). 210
[JLN14] Maria Anastasia Jivulescu, Nicolae Lupa, and Ion Nechita, On the reduction crite-
rion for random quantum states, Journal of Mathematical Physics 55 (2014), no. 11,
–. 274
[JLN15] Maria Anastasia Jivulescu, Nicolae Lupa, and Ion Nechita, Thresholds for entanglement criteria in quantum information theory, Quantum
Inf. Comput. 15 (2015), no. 13-4, 1165–1184. 274
[JM78] Naresh C. Jain and Michael B. Marcus, Continuity of sub-Gaussian processes, Prob-
ability on Banach spaces, Adv. Probab. Related Topics, vol. 4, Dekker, New York,
1978, pp. 81–196. 179
[JNP+11] M. Junge, M. Navascues, C. Palazuelos, D. Perez-Garcia, V. B. Scholz, and R. F.
Werner, Connes embedding problem and Tsirelson’s problem, J. Math. Phys. 52
(2011), no. 1, 012102, 12. 296
[Joh48] Fritz John, Extremum problems with inequalities as subsidiary conditions, Studies
and Essays Presented to R. Courant on his 60th Birthday, January 8, 1948, Inter-
science Publishers, Inc., New York, N. Y., 1948, pp. 187–204. 104
[JP11] M. Junge and C. Palazuelos, Large violation of Bell inequalities with low entangle-
ment, Comm. Math. Phys. 306 (2011), no. 3, 695–746. 297
[JPPG+10] M. Junge, C. Palazuelos, D. Pérez-García, I. Villanueva, and M. M. Wolf, Operator
space theory: a natural framework for Bell inequalities, Phys. Rev. Lett. 104 (2010),
no. 17, 170405, 4. 294
[JS] Justin Jenkinson and Stanisław Szarek, Optimal constants in concentration inequalities
on the sphere, in preparation. 109, 144
[JS91] William B. Johnson and Gideon Schechtman, Remarks on Talagrand’s deviation
functions, Izv. Akad. Nauk SSSR Ser. Mat. 41 (1977), no. 2, 334–351, 478. 208, 209
[Kat75] G. O. H. Katona, The Hamming-sphere has minimum boundary, Studia Sci. Math.
[Kec95] Alexander S. Kechris, Classical descriptive set theory, Graduate Texts in Mathe-
matics, vol. 156, Springer-Verlag, New York, 1995. 104
[Kha67] C. G. Khatri, On certain inequalities for normal distributions and their applications
to simultaneous confidence bounds, Ann. Math. Statist. 38 (1967), 1853–1867. 178
[Kir76] A. A. Kirillov, Elements of the theory of representations, Springer-Verlag, Berlin-
New York, 1976, Translated from the Russian by Edwin Hewitt, Grundlehren der
Mathematischen Wissenschaften, Band 220. 294
[Kis87] Christer O. Kiselman, Smoothness of vector sums of plane convex sets, Math. Scand.
60 (1987), no. 2, 239–252. 104
[KL78] G. A. Kabatjanskiı̆ and V. I. Levenšteı̆n, Bounds for packings on the sphere and in
space, Problemy Peredači Informacii 14 (1978), no. 1, 3–25. 142
[KL09] Robert L. Kosut and Daniel A. Lidar, Quantum error correction via convex opti-
mization, Quantum Inf. Process. 8 (2009), no. 5, 443–459. 29
106
[Kom55] Yûsaku Komatu, Elementary inequalities for Mills’ ratio, Rep. Statist. Appl. Res.
Un. Jap. Sci. Engrs. 4 (1955), 69–70. 309
[KP88] G. A. Kabatyanskiı̆ and V. I. Panchenko, Packings and coverings of the Hamming
space by unit balls, Dokl. Akad. Nauk SSSR 303 (1988), no. 3, 550–552. 142
[Kra71] K. Kraus, General state changes in quantum theory, Ann. Physics 64 (1971), 311–
335. 64
[Kra83] Karl Kraus, States, effects, and operations, Lecture Notes in Physics, vol. 190,
Springer-Verlag, Berlin, 1983, Fundamental notions of quantum theory, Lecture
notes edited by A. Böhm, J. D. Dollard and W. H. Wootters. 64
[Kri79] J.-L. Krivine, Constantes de Grothendieck et fonctions de type positif sur les sphères,
Adv. in Math. 31 (1979), no. 1, 16–30. 295
[KS67] Simon Kochen and E. P. Specker, The problem of hidden variables in quantum
mechanics, J. Math. Mech. 17 (1967), 59–87. 297
[KS03] Boris S. Kashin and Stanislaw J. Szarek, The Knaster problem and the geometry of
high-dimensional cubes, C. R. Math. Acad. Sci. Paris 336 (2003), no. 11, 931–936.
208
[KT85] Leonid A Khalfin and Boris S Tsirelson, Quantum and quasi-classical analogs of Bell
inequalities, Symposium on the foundations of modern physics, vol. 85, Singapore:
World Scientific, 1985, p. 441. 296
[Kwa76] S. Kwapień, A theorem on the Rademacher series with vector valued coefficients,
Probability in Banach spaces (Proc. First Internat. Conf., Oberwolfach, 1975),
Springer, Berlin, 1976, pp. 157–158. Lecture Notes in Math., Vol. 526. 147
[Kwa94] Stanisław Kwapień, A remark on the median and the expectation of convex func-
Probab., vol. 35, Birkhäuser Boston, Boston, MA, 1994, pp. 271–272. 144
[Lan16] Cécilia Lancien, k-Extendibility of high-dimensional bipartite quantum states, Ran-
dom Matrices Theory Appl. 5 (2016), no. 3, 1650011, 58. 260, 274
[Las08] Marek Lassak, Banach-Mazur distance of central sections of a centrally symmetric
convex body, Beiträge Algebra Geom. 49 (2008), no. 1, 243–246. 339
[Lat96] Rafał Latała, A note on the Ehrhard inequality, Studia Math. 118 (1996), no. 2,
169–174. 144
[Lat97] Rafał Latała, Estimation of moments of sums of independent real random variables, Ann.
Probab. 25 (1997), no. 3, 1502–1513. 146
[Lat02] R. Latała, On some inequalities for Gaussian measures, Proceedings of the Inter-
national Congress of Mathematicians, Vol. II (Beijing, 2002), Higher Ed. Press,
Beijing, 2002, pp. 813–822. 144
[Lat06] Rafał Latała, Estimates of moments and tails of Gaussian chaoses, Ann. Probab.
34 (2006), no. 6, 2315–2331. 145
[Lea91] Imre Leader, Discrete isoperimetric inequalities, Probabilistic combinatorics and its
applications (San Francisco, CA, 1991), Proc. Sympos. Appl. Math., vol. 44, Amer.
Math. Soc., Providence, RI, 1991, pp. 57–80. 146
[Led96] Michel Ledoux, Isoperimetry and Gaussian analysis, Lectures on probability the-
ory and statistics (Saint-Flour, 1994), Lecture Notes in Math., vol. 1648, Springer,
Berlin, 1996, pp. 165–294. 144
[Led01] Michel Ledoux, The concentration of measure phenomenon, Mathematical Surveys and
Monographs, vol. 89, American Mathematical Society, Providence, RI, 2001. 117,
119, 126, 143, 144, 145
[Led03] Michel Ledoux, A remark on hypercontractivity and tail inequalities for the largest eigenvalues
of random matrices, Séminaire de Probabilités XXXVII, Lecture Notes in
Math., vol. 1832, Springer, Berlin, 2003, pp. 360–369. 179
[Led97] Michel Ledoux, On Talagrand’s deviation inequalities for product measures, ESAIM
Probab. Statist. 1 (1995/97), 63–87 (electronic). 146
[Lei72] L. Leindler, On a certain converse of Hölder’s inequality. II, Acta Sci. Math.
(Szeged) 33 (1972), no. 3-4, 217–223. 105
[Lév22] Paul Lévy, Leçons d’analyse fonctionnelle, Gauthier–Villars, Paris, 1922. 143, 144
[Lév51] Paul Lévy, Problèmes concrets d’analyse fonctionnelle. Avec un complément sur les
fonctionnelles analytiques par F. Pellegrino, Gauthier-Villars, Paris, 1951, 2d ed.
143
[Li] Ben Li, in preparation, Ph.D. thesis, Case Western Reserve University. 295
[LJL15] Gao Li, Marius Junge, and Nicholas LaRacuente, Capacity Bounds via Operator
Space Methods, arXiv preprint 1509.07294 (2015). 232
[LKCH00] M. Lewenstein, B. Kraus, J. I. Cirac, and P. Horodecki, Optimization of entangle-
ment witnesses, Phys. Rev. A 62 (2000), 052310. 64
[LLR83] M. R. Leadbetter, Georg Lindgren, and Holger Rootzén, Extremes and related prop-
erties of random sequences and processes, Springer Series in Statistics, Springer-
Verlag, New York-Berlin, 1983. 178
[LM15] Rafał Latała and Dariusz Matlak, Royen’s proof of the Gaussian correlation inequal-
ity, arXiv preprint 1512.08776 (2015). 178
[LMO06] Jon Magne Leinaas, Jan Myrheim, and Eirik Ovrum, Geometrical aspects of entan-
glement, Phys. Rev. A (3) 74 (2006), no. 1, 012313, 13. 65
[LN16] Kasper Green Larsen and Jelani Nelson, The Johnson-Lindenstrauss lemma is opti-
mal for linear dimensionality reduction, Proceedings of the 43rd International Col-
loquium on Automata, Languages and Programming (ICALP 2016), 2016. 210
[LO94] Rafał Latała and Krzysztof Oleszkiewicz, On the best constant in the Khinchin-
Kahane inequality, Studia Math. 109 (1994), no. 1, 101–104. 147
[LO99] Rafał Latała and Krzysztof Oleszkiewicz, Gaussian measures of dilatations of convex symmetric sets, Ann. Probab.
[LQ04] Daniel Li and Hervé Queffélec, Introduction à l’étude des espaces de Banach, Cours
Spécialisés [Specialized Courses], vol. 12, Société Mathématique de France, Paris,
2004, Analyse et probabilités. [Analysis and probability theory]. 207
[LR10] Michel Ledoux and Brian Rider, Small deviations for beta ensembles, Electron. J.
Probab. 15 (2010), no. 41, 1319–1343. 179
[LS75] Raphael Loewy and Hans Schneider, Positive operators on the n-dimensional ice
cream cone, J. Math. Anal. Appl. 49 (1975), 375–392. 324
[LS93] L. J. Landau and R. F. Streater, On Birkhoff ’s theorem for doubly stochastic com-
pletely positive maps of matrix algebras, Linear Algebra Appl. 193 (1993), 107–127.
64
[LS08] Shachar Lovett and Sasha Sodin, Almost Euclidean sections of the $N$-dimensional
cross-polytope using $O(N)$ random bits, Commun. Contemp. Math. 10 (2008), no. 4,
477–489. 207
[Mau79] Bernard Maurey, Construction de suites symétriques, C. R. Acad. Sci. Paris Sér.
A-B 288 (1979), no. 14, A679–A681. 146
[Mau91] B. Maurey, Some deviation inequalities, Geom. Funct. Anal. 1 (1991), no. 2, 188–
197. 146
[Mau03] Bernard Maurey, Type, cotype and K-convexity, Handbook of the geometry of Ba-
nach spaces, Vol. 2, North-Holland, Amsterdam, 2003, pp. 1299–1332. 207, 209
[McC06] Robert J. McCann, Stable rotating binary stars and fluid in a tube, Houston J.
Math. 32 (2006), no. 2, 603–631. 179
[McD89] Colin McDiarmid, On the method of bounded differences, Surveys in combinatorics,
1989 (Norwich, 1989), London Math. Soc. Lecture Note Ser., vol. 141, Cambridge
Univ. Press, Cambridge, 1989, pp. 148–188. 144
[McD98] Colin McDiarmid, Concentration, Probabilistic methods for algorithmic discrete mathematics,
Algorithms Combin., vol. 16, Springer, Berlin, 1998, pp. 195–248. 146
[Mec] Elizabeth Meckes, The random matrix theory of the classical compact groups, Cambridge
University Press, in preparation. 179
[Mec03] Mark W. Meckes, Random phenomena in finite-dimensional normed spaces, Ph.D.
thesis, Case Western Reserve University, 2003. 146
Proc. Conf., Columbia/Mo. 1984, Lect. Notes Math. 1166, 106-115 (1985)., 1985.
209
[Mil88] V.D. Milman, A few observations on the connections between local theory and some
other fields., Geometric aspects of functional analysis, Isr. Semin. 1986-87, Lect.
Notes Math. 1317, 283-289 (1988)., 1988. 208
[Mil15] Emanuel Milman, Sharp isoperimetric inequalities and model spaces for the curva-
ture-dimension-diameter condition, J. Eur. Math. Soc. (JEMS) 17 (2015), no. 5,
1041–1078. 144
[Min11] Hermann Minkowski, Gesammelte Abhandlungen von Hermann Minkowski. Unter
Mitwirkung von Andreas Speiser und Hermann Weyl, herausgegeben von David
Hilbert. Band I, II., Leipzig u. Berlin: B. G. Teubner. Erster Band. Mit einem
Bildnis Hermann Minkowskis und 6 Figuren im Text. xxxvi, 371 S.; Zweiter Band.
Mit einem Bildnis Hermann Minkowskis, 34 Figuren in Text und einer Doppeltafel.
iv, 466 S. gr. 8° (1911)., 1911. 29
[MM13] Elizabeth S. Meckes and Mark W. Meckes, Spectral measures of powers of random
matrices, Electron. Commun. Probab. 18 (2013), no. 78, 13. 134
[MO15] Marek Miller and Robert Olkiewicz, Topology of the cone of positive maps on qubit
systems, Journal of Physics A: Mathematical and Theoretical 48 (2015), no. 25,
255203. 64, 324
[MO16] Marek Miller and Robert Olkiewicz, Extremal positive maps on $M_3(\mathbb{C})$ and idempotent
matrices, Open Syst. Inf. Dyn. 23 (2016), no. 1, 1650001, 13. 65
[Mon12] Ashley Montanaro, Some applications of hypercontractive inequalities in quantum
information theory, J. Math. Phys. 53 (2012), no. 12, 122206, 15. 136, 146
[Mon13] Ashley Montanaro, Weak multiplicativity for random quantum channels, Comm. Math. Phys.
319 (2013), no. 2, 535–555. 233
[MP67] V. A. Marčenko and L. A. Pastur, Distribution of eigenvalues in certain sets of
random matrices, Mat. Sb. (N.S.) 72 (114) (1967), 507–536. 179
[MP86] Vitali D. Milman and Gilles Pisier, Banach spaces with a weak cotype 2 property,
Israel J. Math. 54 (1986), no. 2, 139–158. 209
[MP00] V. D. Milman and A. Pajor, Entropy and asymptotic geometry of non-symmetric
convex bodies, Adv. Math. 152 (2000), no. 2, 314–335. 105
[MS86] Vitali D. Milman and Gideon Schechtman, Asymptotic theory of finite-dimensional
normed spaces, Lecture Notes in Mathematics, vol. 1200, Springer-Verlag, Berlin,
1986, With an appendix by M. Gromov. 144, 146, 207
[MS97] V. D. Milman and G. Schechtman, Global versus local asymptotic theories of finite-
dimensional normed spaces, Duke Math. J. 90 (1997), no. 1, 73–93. 208
[MS12] Mark W. Meckes and Stanisław J. Szarek, Concentration for noncommutative poly-
nomials in random matrices, Proc. Amer. Math. Soc. 140 (2012), no. 5, 1803–1813.
180
[MTJ87] V. D. Milman and N. Tomczak-Jaegermann, Sudakov type inequalities for convex
bodies in $\mathbb{R}^n$, Geometrical aspects of functional analysis (1985/86), Lecture Notes
in Math., vol. 1267, Springer, Berlin, 1987, pp. 113–121. 178
[MWW09] William Matthews, Stephanie Wehner, and Andreas Winter, Distinguishability of
aspects of functional analysis, Lecture Notes in Math., vol. 2050, Springer, Heidel-
berg, 2012, pp. 335–343. 105
[NC00] Michael A. Nielsen and Isaac L. Chuang, Quantum computation and quantum in-
formation, Cambridge University Press, Cambridge, 2000. 63, 232
[Nel73] Edward Nelson, The free Markoff field, J. Functional Analysis 12 (1973), 211–227.
145
[Nem07] Arkadi Nemirovski, Advances in convex optimization: conic programming, Interna-
tional Congress of Mathematicians. Vol. I, Eur. Math. Soc., Zürich, 2007, pp. 413–
444. 29
[NS06] Alexandru Nica and Roland Speicher, Lectures on the combinatorics of free proba-
bility, London Mathematical Society Lecture Note Series, vol. 335, Cambridge Uni-
versity Press, Cambridge, 2006. 177, 180
[O’D14] Ryan O’Donnell, Analysis of Boolean functions, Cambridge University Press, New
York, 2014. 136, 146
[Oza13] Narutaka Ozawa, About the Connes embedding conjecture: algebraic approaches,
[Per96] Asher Peres, Separability criterion for density matrices, Phys. Rev. Lett. 77 (1996),
1413–1415. 63
[Per99] Asher Peres, All the Bell inequalities, Found. Phys. 29 (1999), no. 4, 589–614, Invited
papers dedicated to Daniel Greenberger, Part II. 297
[Pet01] Dénes Petz, Entropy, von Neumann and the von Neumann entropy, John von Neu-
mann and the foundations of quantum physics (Budapest, 1999), Vienna Circ. Inst.
Yearb., vol. 8, Kluwer Acad. Publ., Dordrecht, 2001, pp. 83–96. 29
[Pet06] Peter Petersen, Riemannian geometry, second ed., Graduate Texts in Mathematics,
vol. 171, Springer, New York, 2006. 131
[PGWP+08] D. Pérez-García, M. M. Wolf, C. Palazuelos, I. Villanueva, and M. Junge, Unbounded
violation of tripartite Bell inequalities, Comm. Math. Phys. 279 (2008), no. 2, 455–
486. 297
[Pic68] James Pickands, III, Moment convergence of sample extremes, Ann. Math. Statist.
39 (1968), 881–889. 178
[Pis80] G. Pisier, Un théorème sur les opérateurs linéaires entre espaces de Banach qui se
factorisent par un espace de Hilbert, Ann. Sci. École Norm. Sup. (4) 13 (1980),
no. 1, 23–43. 207
[Pis81] G. Pisier, Remarques sur un résultat non publié de B. Maurey, Seminar on Functional
Analysis, 1980–1981, École Polytech., Palaiseau, 1981, pp. Exp. No. V, 13. 207
[Pis86] Gilles Pisier, Probabilistic methods in the geometry of Banach spaces, Probability
and analysis (Varenna, 1985), Lecture Notes in Math., vol. 1206, Springer, Berlin,
1986, pp. 167–241. 144
[Pis89a] Gilles Pisier, A new approach to several results of V. Milman, J. Reine Angew. Math.
393 (1989), 115–131. 209
[Pis89b] Gilles Pisier, The volume of convex bodies and Banach space geometry, Cambridge Tracts
in Mathematics, vol. 94, Cambridge University Press, Cambridge, 1989. 142, 143,
207, 209
[Pis12a] Gilles Pisier, Grothendieck’s theorem, past and present, Bull. Amer. Math. Soc. (N.S.)
49 (2012), no. 2, 237–323. 295
[Pis12b] , Tripartite Bell inequality, random matrices and trilinear forms, arXiv
[PR94] Sandu Popescu and Daniel Rohrlich, Quantum nonlocality as an axiom, Found.
Phys. 24 (1994), no. 3, 379–385. 296
[Pré71] András Prékopa, Logarithmic concave measures with application to stochastic pro-
gramming, Acta Sci. Math. (Szeged) 32 (1971), 301–316. 105
[Pré73] , On logarithmic concave measures and functions, Acta Sci. Math. (Szeged)
finite-dimensional Banach spaces., Proc. Am. Math. Soc. 97 (1986), 637–642 (Eng-
lish). 209
[PTJ85] Alain Pajor and Nicole Tomczak-Jaegermann, Remarques sur les nombres d’entropie
d’un opérateur et de son transposé, C. R. Acad. Sci. Paris Sér. I Math. 301 (1985),
no. 15, 743–746. 178
[PTJ90] , Gel'fand numbers and Euclidean sections of large dimensions, Probability
in Banach spaces 6 (Sandbjerg, 1986), Progr. Probab., vol. 20, Birkhäuser Boston,
Boston, MA, 1990, pp. 252–264. 208
[PV07] Martin B. Plenio and Shashank Virmani, An introduction to entanglement measures,
Quantum Inf. Comput. 7 (2007), no. 1-2, 1–51. 232, 273
[PV16] Carlos Palazuelos and Thomas Vidick, Survey on nonlocal games and operator space
theory, Journal of Mathematical Physics 57 (2016), no. 1. 275, 281, 295, 296, 297
[PY15] C. Palazuelos and Z. Yin, Large bipartite Bell violations with dichotomic measure-
ments, Phys. Rev. A 92 (2015), 052313. 297
[Ran55] R. A. Rankin, The closest packing of spherical caps in n dimensions, Proc. Glasgow
Math. Assoc. 2 (1955), 139–144. 142
[Rei08] Michael Reimpell, Quantum information and convex optimization, Ph.D. thesis,
Technische Universität Braunschweig, 2008. 29
[Roc70] R. Tyrrell Rockafellar, Convex analysis, Princeton Mathematical Series, No. 28,
Princeton University Press, Princeton, N.J., 1970. 14, 28
[Rog47] C. A. Rogers, Existence theorems in the geometry of numbers, Ann. of Math. (2) 48
(1947), 994–1002. 142
[Rog57] , A note on coverings, Mathematika 4 (1957), 1–6. 142
[Rog63] , Covering a sphere with spheres, Mathematika 10 (1963), 157–164. 142
[Rog64] , Packing and covering, Cambridge Tracts in Mathematics and Mathematical
Physics, No. 54, Cambridge University Press, New York, 1964. 141
[Rot86] O. S. Rothaus, Hypercontractivity and the Bakry-Emery criterion for compact Lie
groups, J. Funct. Anal. 65 (1986), no. 3, 358–367. 145
[Rot06] Ron Roth, Introduction to coding theory, Cambridge University Press, 2006. 142
[Roy14] Thomas Royen, A simple proof of the Gaussian correlation conjecture extended to
some multivariate gamma distributions, Far East J. Theor. Stat. 48 (2014), no. 2,
139–145. 178
[RP11] Eleanor Rieffel and Wolfgang Polak, Quantum computing, Scientific and Engineering
Computation, MIT Press, Cambridge, MA, 2011, A gentle introduction. 75
[RS58] C. A. Rogers and G. C. Shephard, Convex bodies associated with a given convex
body, J. London Math. Soc. 33 (1958), 270–281. 105
[RSW02] Mary Beth Ruskai, Stanislaw Szarek, and Elisabeth Werner, An analysis of completely
positive trace-preserving maps on $M_2$, Linear Algebra Appl. 347 (2002),
159–187. 64
[Rud97] M. Rudelson, Contact points of convex bodies, Israel J. Math. 101 (1997), 93–124.
208
[Rud00] , Distances between non-symmetric convex bodies and the $MM^*$-estimate,
Positivity 4 (2000), no. 2, 161–178. 103, 207
[Rud05] Oliver Rudolph, Further results on the cross norm criterion for separability, Quan-
tum Inf. Process. 4 (2005), no. 3, 219–239. 63
[RW00] Mary Beth Ruskai and Elisabeth Werner, Study of a class of regularizations of
$1/|X|$ using Gaussian integrals, SIAM J. Math. Anal. 32 (2000), no. 2, 435–463
(electronic). 309
[RW09] Mary Beth Ruskai and Elisabeth M Werner, Bipartite states of low rank are almost
surely entangled, Journal of Physics A: Mathematical and Theoretical 42 (2009),
no. 9, 095303. 273
[RZ14] Dmitry Ryabogin and Artem Zvavitch, Analytic methods in convex geometry, An-
alytical and probabilistic methods in the geometry of convex bodies, IMPAN Lect.
Notes, vol. 2, Polish Acad. Sci. Inst. Math., Warsaw, 2014, pp. 87–183. 105
[Sam53] M. R. Sampford, Some inequalities on Mill’s ratio and related functions, Ann. Math.
Statistics 24 (1953), 130–132. 309
[SBŻ06] Stanisław J Szarek, Ingemar Bengtsson, and Karol Życzkowski, On the structure of
the body of states with positive partial transpose, Journal of Physics A: Mathematical
invariant measures, Zap. Naučn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI)
41 (1974), 14–24, 165, Problems in the theory of probability distributions, II. 144
[SC94] L. Saloff-Coste, Precise estimates on the rate at which certain diffusions tend to
equilibrium, Math. Z. 217 (1994), no. 4, 641–677. 145
[Sch48] Erhard Schmidt, Die Brunn-Minkowskische Ungleichung und ihr Spiegelbild sowie
die isoperimetrische Eigenschaft der Kugel in der euklidischen und nichteuklidischen
Geometrie. I, Math. Nachr. 1 (1948), 81–157. 143
[Sch50] Robert Schatten, A Theory of Cross-Spaces, Annals of Mathematics Studies, no. 26,
Princeton University Press, Princeton, N. J., 1950. 29
[Sch65] Hans Schneider, Positive operators and an inertia theorem, Numer. Math. 7 (1965),
11–17. 64
[Sch70] Robert Schatten, Norm ideals of completely continuous operators, Second print-
ing. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 27, Springer-Verlag,
Berlin-New York, 1970. 29
[Sch82] Gideon Schechtman, Lévy type inequality for a class of finite metric spaces, Martin-
gale theory in harmonic analysis and Banach spaces (Cleveland, Ohio, 1981), Lecture
Notes in Math., vol. 939, Springer, Berlin-New York, 1982, pp. 211–215. 146
[Sch84] Carsten Schütt, Entropy numbers of diagonal operators between symmetric Banach
spaces, J. Approx. Theory 40 (1984), no. 2, 121–128. 156, 157
[Sch87] Gideon Schechtman, More on embedding subspaces of $L_p$ in $l_r^n$., Compos. Math. 61
(1987), 159–169 (English). 209
[Sch89] Gideon Schechtman, A remark concerning the dependence on $\varepsilon$ in Dvoretzky's theorem,
Geometric aspects of functional analysis (1987–88), Lecture Notes in Math.,
vol. 1376, Springer, Berlin, 1989, pp. 274–277. 208
[Sch99] Michael Schmuckenschläger, An extremal property of the regular simplex, Convex
geometric analysis (Berkeley, CA, 1996), Math. Sci. Res. Inst. Publ., vol. 34, Cam-
bridge Univ. Press, Cambridge, 1999, pp. 199–202. 342
[Sch03] Gideon Schechtman, Concentration results and applications, Handbook of the ge-
ometry of Banach spaces, Vol. 2, North-Holland, Amsterdam, 2003, pp. 1603–1634.
143, 144
[Sch07] G. Schechtman, The random version of Dvoretzky's theorem in $\ell_\infty^n$, Geometric aspects
of functional analysis, Lecture Notes in Math., vol. 1910, Springer, Berlin,
2007, pp. 265–270. 208
[Sch14] Rolf Schneider, Convex bodies: the Brunn-Minkowski theory, expanded ed., Ency-
clopedia of Mathematics and its Applications, vol. 151, Cambridge University Press,
Cambridge, 2014. 103, 104, 344
[SCM16] Gniewomir Sarbicki, Dariusz Chruściński, and Marek Mozrzymas, Generalising
Wigner’s theorem, Journal of Physics A: Mathematical and Theoretical 49 (2016),
no. 30, 305302. 63
[See66] R. T. Seeley, Spherical harmonics, Amer. Math. Monthly 73 (1966), no. 4, part II,
115–121. 145
[Sen96] Siddhartha Sen, Average entropy of a quantum subsystem, Physical review letters
77 (1996), no. 1, 1. 232
[Sha48] C. E. Shannon, A mathematical theory of communication, Bell System Tech. J. 27
and Hilbert spaces. 342, 364
[ST80] Stanislaw J. Szarek and Nicole Tomczak-Jaegermann, On nearly Euclidean decom-
position for some classes of Banach spaces., Compos. Math. 40 (1980), 367–385
(English). 209
[Sti55] W. Forrest Stinespring, Positive functions on $C^*$-algebras, Proc. Amer. Math. Soc.
6 (1955), 211–216. 64
[Stø63] Erling Størmer, Positive linear maps of operator algebras, Acta Math. 110 (1963),
233–278. 63, 64
[Stø13] , Positive linear maps of operator algebras, Springer Monographs in Mathe-
matics, Springer, Heidelberg, 2013. 65
[Stø16] , Positive maps which map the set of rank k projections onto itself, Positivity
(2016), 1–3. 63
[Sud71] V. N. Sudakov, Gaussian random processes, and measures of solid angles in Hilbert
space, Dokl. Akad. Nauk SSSR 197 (1971), 43–45. 178
[SV96] Stanislaw J. Szarek and Dan Voiculescu, Volumes of restricted Minkowski sums and
the free analogue of the entropy power inequality, Comm. Math. Phys. 178 (1996),
no. 3, 563–570. 104
[SV00] S. J. Szarek and D. Voiculescu, Shannon’s entropy power inequality via restricted
Minkowski sums, Geometric aspects of functional analysis, Lecture Notes in Math.,
vol. 1745, Springer, Berlin, 2000, pp. 257–262. 104
[SW] Stanisław Szarek and Paweł Wolff, Radii of Euclidean sections of $L_p$-balls, in preparation. 197
[SW83] Rolf Schneider and Wolfgang Weil, Zonoids and related topics, Convexity and its
[SWŻ08] Stanisław J. Szarek, Elisabeth Werner, and Karol Życzkowski, Geometry of sets of
quantum maps: a generic positive map acting on a high-dimensional system is not
completely positive, J. Math. Phys. 49 (2008), no. 3, 032113, 21. 106, 260, 261
[SWŻ11] , How often is a random quantum state k-entangled?, J. Phys. A 44 (2011),
[Sza] Stanislaw Szarek, Coarse approximation of convex bodies by polytopes and the complexity
of Banach–Mazur compacta, in preparation. 143
[Sza74] Andrzej Szankowski, On Dvoretzky’s theorem on almost spherical sections of convex
bodies., Isr. J. Math. 17 (1974), 325–338 (English). 208
[Sza76] S. J. Szarek, On the best constants in the Khinchin inequality, Studia Math. 58
(1976), no. 2, 197–208. 147, 282
[Sza78] Stanislaw Jerzy Szarek, On Kashin’s almost Euclidean orthogonal decomposition of
$\ell_n^1$., Bull. Acad. Pol. Sci., Sér. Sci. Math. Astron. Phys. 26 (1978), 691–694 (English).
209
[Sza82] Stanisław J. Szarek, Nets of Grassmann manifold and orthogonal group, Proceedings
of research workshop on Banach space theory (Iowa City, Iowa, 1981), Univ. Iowa,
Iowa City, IA, 1982, pp. 169–185. 143
[Sza10] Stanislaw J Szarek, On norms of completely positive maps, Topics in Operator The-
ory, Springer, 2010, pp. 535–538. 232
[Tak08] Leon A. Takhtajan, Quantum mechanics for mathematicians, Graduate Studies in
Mathematics, vol. 95, American Mathematical Society, Providence, RI, 2008. 75
[Tal87] Michel Talagrand, Regularity of Gaussian processes, Acta Math. 159 (1987), no. 1-2,
99–149. 179
[Tal88] , An isoperimetric theorem on the cube and the Kintchine-Kahane inequali-
ties, Proc. Amer. Math. Soc. 104 (1988), no. 3, 905–909. 146
[Tal90] Michel Talagrand, Embedding subspaces of $L_1$ into $\ell_1^N$., Proc. Am. Math. Soc. 108
(1990), no. 2, 363–369 (English). 209
[Tal95] Michel Talagrand, Concentration of measure and isoperimetric inequalities in prod-
uct spaces, Inst. Hautes Études Sci. Publ. Math. (1995), no. 81, 73–205. 146
[Tal96a] , New concentration inequalities in product spaces, Invent. Math. 126 (1996),
no. 3, 505–563. 146
[Tal96b] , A new look at independence, Ann. Probab. 24 (1996), no. 1, 1–34. 146
[Tal01] , Majorizing measures without measures, Ann. Probab. 29 (2001), no. 1,
411–417. 179
[Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys
in Mathematics], vol. 60, Springer, Heidelberg, 2014, Modern methods and classical
problems. 179
[Tao12] Terence Tao, Topics in random matrix theory, Graduate Studies in Mathematics,
vol. 132, American Mathematical Society, Providence, RI, 2012. 179
[TH00] Barbara M. Terhal and Paweł Horodecki, Schmidt number for density matrices,
Phys. Rev. A 61 (2000), 040301. 63
n and the χ-
[Tsi85] B. S. Tsirelson, Quantum analogues of Bell’s inequalities. The case of two spatially
divided domains, Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI)
142 (1985), 174–194, 200, Problems of the theory of probability distributions, IX.
295, 296
[Tsi93] , Some results and problems on quantum Bell-type inequalities, Hadronic J.
Suppl. 8 (1993), no. 4, 329–345. 295
[Tsu81] Chiaki Tsukamoto, Spectra of Laplace-Beltrami operators on $SO(n+2)/SO(2) \times SO(n)$
and $Sp(n+1)/Sp(1) \times Sp(n)$, Osaka J. Math. 18 (1981), no. 2, 407–426. 145
[TVZ82] M. A. Tsfasman, S. G. Vlăduţ, and Th. Zink, Modular curves, Shimura curves,
and Goppa codes, better than Varshamov-Gilbert bound, Math. Nachr. 109 (1982),
21–28. 142
[TW94] Craig A. Tracy and Harold Widom, Level-spacing distributions and the Airy kernel,
Comm. Math. Phys. 159 (1994), no. 1, 151–174. 179
[TW96] , On orthogonal and symplectic matrix ensembles, Comm. Math. Phys. 177
(1996), no. 3, 727–754. 179
[Vaa79] Jeffrey D. Vaaler, A geometric inequality with applications to linear forms, Pacific
J. Math. 83 (1979), no. 2, 543–553. 106
[VADM01] Frank Verstraete, Koenraad Audenaert, and Bart De Moor, Maximally entangled
mixed states of two qubits, Phys. Rev. A 64 (2001), 012316. 64
[VB14] Tamás Vértesi and Nicolas Brunner, Disproving the Peres conjecture by showing
Bell nonlocality from bound entanglement, Nat. Commun. 5 (2014), Article. 297
[VDD01] Frank Verstraete, Jeroen Dehaene, and Bart DeMoor, Local filtering operations on
two qubits, Phys. Rev. A 64 (2001), 010101. 65
[Vem04] Santosh S. Vempala, The random projection method, DIMACS Series in Discrete
Mathematics and Theoretical Computer Science, 65, American Mathematical Society,
Providence, RI, 2004, With a foreword by Christos H. Papadimitriou. 124
[Ver] Roman Vershynin, High-Dimensional Probability. An Introduction with Applications
Lecture Notes in Math., vol. 1132, Springer, Berlin, 1985, pp. 556–588. 180
[Voi90] , Circular and semicircular systems and free product factors, Operator al-
[Wig58] , On the distribution of the roots of certain symmetric matrices, Ann. of
Math. (2) 67 (1958), 325–327. 179
[Wig59] , Group theory: And its application to the quantum mechanics of atomic
spectra, Expanded and improved ed. Translated from the German by J. J. Griffin.
Pure and Applied Physics. Vol. 5, Academic Press, New York-London, 1959. 63
[Wil17] Mark M. Wilde, Quantum information theory, second ed., Cambridge University
Press, Cambridge, 2017. 29, 63, 232
[Win16] Andreas Winter, Tight uniform continuity bounds for quantum entropies: Condi-
tional entropy, relative entropy distance and energy constraints, Communications in
Mathematical Physics (2016), 1–23. 232
[Wor76] S.L. Woronowicz, Positive maps of low dimensional matrix algebras, Reports on
Mathematical Physics 10 (1976), no. 2, 165 – 183. 63
[WS08] Jonathan Walgate and Andrew James Scott, Generic local distinguishability and
completely entangled subspaces, Journal of Physics A: Mathematical and Theoretical
41 (2008), no. 37, 375305. 231
[WW00] R. F. Werner and M. M. Wolf, Bell’s inequalities for states with positive partial
transpose, Phys. Rev. A 61 (2000), 062102. 297
281
[ŻS01] Karol Życzkowski and Hans-Jürgen Sommers, Induced measures in the space of
mixed quantum states, J. Phys. A 34 (2001), no. 35, 7111–7125, Quantum informa-
Websites
Notation
We list below mathematical symbols that appear in the book, particularly
those that are subfield-specific or not generally accepted throughout mathematics,
or just potentially ambiguous. We grouped them by theme/subfield; since any such
classification is necessarily imperfect, it may sometimes be necessary to check more
than one category. Within each category, we tried, to the extent possible,
to arrange the symbols in alphabetical order. The numbers following each brief
description refer to the pages on which the corresponding symbol is defined, or at
least appears in context.
General notation
$\langle x|$, $|x\rangle$ Dirac bra-ket notation, 4
$\langle x|y\rangle$ scalar product, alternative notation to $\langle x, y\rangle$, 5
$|x\rangle\langle y|$ ket-bra, the rank one operator mapping z to $\langle y, z\rangle \cdot x$, 5
Sm
vol, $\mathrm{vol}_n$, $\mathrm{vol}_E$ Lebesgue measure on $\mathbb{R}^n$, on the subspace E, 4
Convex geometry
$K^\circ$ polar of a set $K \subset \mathbb{R}^n$, 15
$\mathrm{outrad}(K)$ outradius of a convex body K, 96
$S^{n-1}$, $S_{\mathbb{C}^n}$, $S_H$ unit sphere in $\mathbb{R}^n$, in $\mathbb{C}^n$, in Hilbert space H, 4, 311
$\mathrm{vrad}(K)$ volume radius of a convex body K, 92
$w(K, \cdot)$ support function of a convex body K, 94
$w(K)$ mean width of a convex body K, 95
$w_G(K)$ Gaussian mean width of a convex body K, 95
Linear algebra
$x^\downarrow$ non-increasing rearrangement of a vector $x \in \mathbb{R}^n$, 22
$\prec$ majorization, 22
$\prec_w$ submajorization, 23
$A^\dagger$ adjoint of a matrix (or operator) A, 4
$|A|$ absolute value of A (equals $(A^\dagger A)^{1/2}$), 23
$\|\cdot\|_p$ Schatten p-norm on matrices, 23
$\|\cdot\|_{HS}$ Hilbert–Schmidt norm (equals $\|\cdot\|_2$), 23
312
$B(H)$ bounded linear operators on a Hilbert space H, 4
zero, 25
$\mathrm{Diag}(v)$, $D_v$ for a vector $v = (v_i)$, the diagonal matrix whose ii-th entry
is $v_i$, 265, 333
$E_{ij}$ operator $|e_i\rangle\langle f_j|$, where $(e_i)$, $(f_j)$ are specified bases, 47
in $M_d^{sa}$, 31
I identity matrix or identity operator, 8
Id identity superoperator, 8
J diagonal matrix with diagonal entries $(1, -1, \ldots, -1)$, 318
$\lambda_j(A)$ eigenvalues of a matrix A, usually arranged in the nonincreasing
order if A is Hermitian, 160
$\lambda_j(\psi)$ Schmidt coefficients of a vector $\psi \in H_1 \otimes H_2$, 36
$M_{m,n}$ space of $m \times n$ (real or complex) matrices, 7
$M_n$ equals $M_{n,n}$, 7
$M_n^{sa}$ space of self-adjoint matrices (subspace of $M_n$), 7
$M_n^{sa,0}$ subspace of $M_n^{sa}$ consisting of trace zero matrices, 265
$O(n)$ orthogonal group, 312
$O(1, n-1)$ Lorentz group, 318
$O^+(1, n-1)$ orthochronous subgroup of the Lorentz group, 318
$P(C)$ cone of linear maps preserving the cone C, or preserving the
order induced by the cone C, 321
$P_E$ orthogonal projection onto the subspace E, 5
PSD cone of positive-semidefinite matrices, 19
$PSU(n)$ projective special unitary group, 312
$q(\cdot)$ quadratic form of the Minkowski spacetime, 32, 318
$\mathbb{R}^{n,0}$ the hyperplane of $\mathbb{R}^n$ consisting of vectors whose coordinates
add up to 0, 263
$s_j(A)$ singular values (arranged in non-increasing order) of a matrix A, 24
$s(A)$ the vector $(s_j(A))$ of singular values of a matrix A, 24
$S_{HS}$ unit sphere for the Hilbert–Schmidt norm $\|\cdot\|_{HS}$, 225
$S_p^{m,n}$ unit ball for $\|\cdot\|_p$ in $M_{m,n}$, 25
$S_p^{m,sa}$ unit ball for $\|\cdot\|_p$ in $M_n^{sa}$, 25
SVD singular value decomposition, 36
$SO(n)$ special orthogonal group, 312
$SO(1, n-1)$ proper Lorentz group, 318
$SO^+(1, n-1)$ restricted Lorentz group, 318
$\mathrm{spec}(A)$ spectrum (arranged in non-increasing order) of a self-adjoint
matrix A, 24
$SU(n)$ special unitary group, 312
Probability
161
$\Phi(\cdot)$ cumulative distribution function of an $N(0, 1)$ variable, 307
G a standard Gaussian vector, 308
$\gamma_n$, $\gamma_n^{\mathbb{C}}$ standard Gaussian measure on $\mathbb{R}^n$, $\mathbb{C}^n$, 308
$\mathrm{GUE}(n)$ Gaussian Unitary Ensemble, 162
$\mathrm{GUE}_0(n)$ Gaussian Unitary Ensemble conditioned to have trace 0, 163
GOE Gaussian Orthogonal Ensemble, 163
$H(p)$ Shannon entropy of a probability mass function p, 28
$\chi(n)$, $\chi^2(n)$ chi, chi-squared distribution with n degrees of freedom, 175
i.i.d. independent, identically distributed, 160
$\mathrm{osc}(f, A, \mu)$ oscillation of f around µ on the subset A, 186
$(P_t)$ Ornstein–Uhlenbeck semigroup, 135
$\mathrm{Wishart}(n, s)$ Wishart distribution with parameters n, s, 166
Geometry and asymptotic geometric analysis
$\otimes_2$ Euclidean/Hilbertian tensor product, 18
$\hat{\otimes}$ projective tensor product, 82
$\check{\otimes}$ injective tensor product, 83
$A_\varepsilon$ ε-enlargement of a set A, 117
$|||\cdot|||_K$ norm on $H_{k,n}$ associated to K, 183
$a(K)$ asphericity of a convex body K, 193
$c(X)$ minimum of Ricci curvatures of the manifold X, 129
$C(x, \theta)$ spherical cap of angle θ with center at x, 109
$d(X, Y)$ Banach–Mazur distance between normed spaces X and Y, 103
$\|\cdot\|_M$ distinguishability norm associated to the POVM M, 299
$\rho \rightsquigarrow \sigma$ state σ can be obtained from copies of ρ by an LOCC protocol, 301
$\mathrm{Asym}_d$ antisymmetric subspace of $\mathbb{C}^d \otimes \mathbb{C}^d$, 39
BP cone of block-positive operators, 56
$C(\Phi)$ Choi matrix of a superoperator Φ, 48, 48
co-PSD cone of co-positive semidefinite operators, 56
CP cone of completely positive superoperators, 49
$D(H)$ set of states on a Hilbert space H, 9, 31
DEC cone of decomposable superoperators, 57
$E(\psi)$ entropy of entanglement of a pure state ψ, 215
$E_p(\psi)$ p-entropy of entanglement of a pure state ψ, 215, 229
$E_F(\rho)$ entanglement of formation of a state ρ, 272
EB cone of entanglement-breaking superoperators, 57
k-Ext set of k-extendible states, 41
F flip operator, 39
$g(\psi)$ geometric measure of entanglement of a pure state ψ, 229
$\varphi^+$, $\varphi^-$, $\psi^+$, $\psi^-$ Bell vectors, 70
$\Phi_V$ completely positive map $X \mapsto V X V^\dagger$, 58
$\pi_a$ antisymmetric state, 40
$\pi_s$ symmetric state, 40
$\omega_{NS}(V)$ non-signaling value of a Bell expression V, 289
$\omega_Q(V)$ quantum value of a Bell expression V, 289
Index
In addition to pointing to definitions of concepts that appear
throughout this book, the index is designed to direct the reader
to fundamental or major results about such concepts and to other
facts which have, in the authors' opinion, reference value. This
includes sharp versions of well-known inequalities, proofs of standard
results that are new or not widely known, and tables listing
values of various geometric parameters for classical objects. The
index is not meant to be an exhaustive catalogue of all occurrences
of a given notion or phrase in the book.
nonseparability, 44
channel, 50
    duality, 57
completely positive map, 49
    norm of, 217
completely randomizing channel, 52, 220
complexification, 6
computational basis, 5, 67
concentration of measure, 117
    on standard spaces, 118
    subgaussian, 117
cone, 18
    base duality, 20
    base of, 19
    dual, 19
    self-dual, 19
conjugate
    of a Hilbert space, 4
    of a matrix, 7
contact point, 87
contextuality, 297
contraction principle, 127, 134
convex body, 11
convex hull, 12
convex roof, 272
Copenhagen interpretation, 67
covering, 107
    density, 142
    number, 107, 114
creation operators, 177
curse of dimensionality, 205
decomposable map, 57
decomposable matrix, 56
density matrix, 9, 71
deterministic box, 286
deterministic strategy, 284, 286
diamond norm, 51
difference body, 81
direct sum of channels, 54
discrete cube, 113
distillability problem, 302
    and 2-positivity, 305
    and Werner states, 305
distillable state, 302
    for convex bodies, 191
    for Lipschitz functions, 187
    for projections, 192
    for Schatten spaces, 198
    isometric, 275
Dvoretzky–Rogers lemma, 200
Earth Mover’s distance, 161
Ehrhard symmetrization, 123
Ehrhard’s inequality, 122
ellipsoids, 18
    polars of, 18
    tensor product of, 18
empirical spectral distribution, 160
ε-enlargements, 117
enough symmetries, 89
entangled state, 37
k-entangled state, 41
entangled subspaces, 213
    extremely entangled, 224
entanglement-breaking channel, 53
entropy of entanglement, 215
p-entropy, 215
exponential Markov inequality, 124
exposed face, 13
exposed point, 13
exposed ray, 22
k-extendible state, 41
extreme ray, 22
extrinsic distance, 311
face, 13
facet, 13
facial dimension, 193
fidelity, 312
Figiel–Lindenstrauss–Milman inequality, 194
Finsler geometry, 319
flip operator, 39
Fock space, 176
fraction
    classical, 292
gauge, 11
Gaussian distribution, 307
    tail estimates, 307
Gaussian mean width, 95
Gaussian processes, 149, 308
    and the mean width, 150
    comparison inequalities, 153
    stationary, 159
Gaussian Unitary Ensemble, 162
generic chaining, 159
geodesic distance, 311
geodesically convex, 313
geometric distance, 79
geometric measure of entanglement, 228
Ginibre formula, 163
GOE, 163
    large deviations, 165
Gordon’s lemma, 153
Grassmann manifold, 314
    ε-nets, 116
Gurvits–Barnum theorem, 246
intrinsic distance, 311
irreducible, 89
isoperimetric inequality
    Gaussian, 122
    in $\mathbb{R}^n$, 92
    on the discrete cube, 137
    on the sphere, 119
isotropic convex body, 101
isotropic states, 39
Jamiołkowski isomorphism, 47
John ellipsoid, 84
John position, 84
Johnson–Lindenstrauss lemma, 205
jointly Gaussian variables, 308
K-convexity constant, 183, 185, 207
    bounds, 183, 184
    duality, 186
    for $B_1^n$, the cube, 186
Kadison’s theorem, 34
$\ell$-position, 181
Löwner position, 84
    inequality, 132
    tensorization property, 133
Lorentz cone, 19
    automorphisms, 323
Lorentz group, 318
    proper, 318
    restricted, 318
Low $M^*$-estimate, 202
Löwner ellipsoid, 84
$\ell_p$-norm, 12
$\ell_p$ product metric, 128
M-ellipsoid, 143
M-position, 143
magic square game, 293
Mahler conjecture, 105
majorization, 22
majorizing measure, 179
    principle, 287
    violations, 289, 292
operational, 29
Ornstein–Uhlenbeck semigroup, 134
orthochronous subgroup, 318
orthogonal group, 312
    ε-nets, 116
    geodesics, 313
oscillation, 186
outer product, 5
outradius, 96
overlap, 37, 228, 229, 312
packing, 107
    density, 142
    number, 107
    on the discrete cube, 113
PPT-inducing map, 54
Prékopa–Leindler inequality, 101
precognition, 287
principal angles, 314
probabilistic method, 205
projective measurement, 73
projective space, 31, 68, 312
    nets, 112
    volume of balls, 112
projective tensor product, 82
projective unitary group, 312
proper face, 13
pseudotelepathy, 293, 297
pure state, 31
    separable, 37
purification, 71
pushforward, 127
quantum game, 284
quantum map, 47
quantum marginal, 70, 287
R-transform, 177
random covering, 110
random induced states, 170
    convergence, 171
    density, 172
    large deviations, 171
random strategy, 284, 286
random subspace, 186
randomness reduction, 206
realignment, 45
regular simplex, 12
Rényi entropies, 28
    monotonicity, 28
S-lemma, 321
    and automorphisms of $\mathcal{L}_n$, 323
Santaló inequality, 98
    reverse, 98
Santaló point, 326
Schatten
    p-norms, 23
    spaces, 24
Schmidt coefficients, 36
    and Courant–Fischer formulas, 37
Schmidt decomposition, 36
Schmidt rank, 36
Schur channel, 54
sectional curvature, 130
Segré variety, 213, 312
self-adjoint
    matrix, 7
    operator, 4
    duality, 57
separable map, 54
separable state, 37
ε-separated set, 107
    centroid, 46
    dimension, 37
    extreme points, 37
    facial structure, 38
    inradius, 235
    polytopal approximation, 253
    symmetries, 46
    volume and mean width, 235, 244
Shannon entropy
    continuous, 131
    discrete, 28
simplex, 12
    measure, 94, 308
    vector, 95, 308
star-shaped set, 114
state
    classical, 8
    quantum, 8
Steiner symmetrization, 93
Stiefel manifold, 317
Stinespring representation, 51, 72
Stinespring theorem, 51
strictly convex set, 14
Størmer’s theorem, 44
    proofs, 62
subexponential variable ($\psi_1$), 139
subgaussian process, 157
subgaussian variable ($\psi_2$), 139
submajorized, 23
    Hilbertian, 6
    injective, 83
    projective, 82
unconditional
    basis, 90
    body, norm, space, 90
    direct sum, 146
    ε-nets, 116
    geodesics, 313
universal entanglers, 215
Urysohn’s inequality, 95
    dual, 96
    reverse, 184
verticial dimension, 193
volume of polytopes, 152
volume radius, 92
    superadditivity, 93
volume ratio, 201
    and Kashin decomposition, 202
von Neumann entropy, 27
    Lipschitz constant, 222, 223
∞-Wasserstein distance, 161
wave function, 67
Woronowicz theorem, 44