Notes of Trace Inequalities
Melchior Wirth
Trace inequalities for quantum entropies and the related concavity/convexity of trace functionals
play a fundamental role in quantum information theory. They are also very important in other
areas like mathematical physics and noncommutative analysis. Since Lieb’s groundbreaking result
resolving a conjecture of Wigner and Yanase, great progress has been made in the past half
century. We will give an introduction to several main results in this direction.
Background: basic linear algebra. References:
Contents
2 Complete positivity
4 Quantum States
9 Entanglement
Classically, the states of a physical system are modeled as points in a structured set like a
(smooth/Riemannian/Kähler/. . . ) manifold and the observables, that is, the measurable quan-
tities, as (smooth/continuous/. . . ) real-valued functions on the state space. So if the system is in
state x, the measurement outcome of observable f will be f (x).
More precisely, this is the setup of theories for one particle and the states represented by a
single point are called pure states. If we move to statistical physics, then (mixed) states are more
generally probability measures and the measurement outcome of observable f for a system in state
µ is ∫ f dµ. Pure states correspond to Dirac measures δ_x in this picture.
One common feature in both of these cases is that the state of a physical system completely
determines the measurement outcome of an observable. Early on in the development of quantum
mechanics it was recognized that this paradigm is incompatible with observations made at the
atomic level. At least that is the mainstream point of view; some people still try to develop
quantum physics in such a way that it is a deterministic theory. But we will not concern ourselves
with these approaches in these lectures.
Instead of a deterministic theory, in which the outcomes of a measurement are determined
by the state of the system, quantum theory is at its heart a probabilistic theory. The state of a
physical system only determines the probabilities of measurement outcomes and not their exact
value.
In quantum mechanics (of closed quantum systems), the state of a system is a unit vector ξ from
some Hilbert space H, which we take finite-dimensional for simplicity’s sake here, so you can think
of H = C^n if you want. An observable is modeled by a self-adjoint operator A on H. If we choose an
orthonormal basis of H, then A can be represented as a Hermitian matrix. By the spectral theorem, there exists
an orthonormal basis e_1, . . . , e_n of eigenvectors of A corresponding to the eigenvalues λ_1, . . . , λ_n
(counted with multiplicity). If the system is in the state ξ, the possible measurement outcomes for
the observable A are λ_1, . . . , λ_n with probabilities |⟨ξ, e_1⟩|^2, . . . , |⟨ξ, e_n⟩|^2, respectively.
Again, this is the setup for one particle and the corresponding quantum states are called pure
states. If we move to quantum statistical mechanics, the (mixed) states are represented by density
operators. A density operator ρ is a self-adjoint operator all of whose eigenvalues are positive
(i.e. non-negative) and whose trace is 1. The pure state ξ corresponds to the density operator
⟨ξ, · ⟩ξ. In fact, this is more of a half-truth. To get the full picture, we should speak about open
and closed quantum systems and how pure states are not sufficient to describe open systems. But
for the purpose of this introduction, this analogy is a good guiding principle to understand the
mathematics.
The possible measurement outcomes for the observable A in state ρ are still λ1 , . . . , λn with
probabilities ⟨e1 , ρe1 ⟩, . . . , ⟨en , ρen ⟩. In particular, the expected value of A in state ρ is Tr(Aρ).
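The measurement rule described above can be checked on a small example. The following is a minimal sketch in plain Python; the observable, its eigenvectors and the state vector are illustrative choices, not taken from the notes. It computes the outcome probabilities |⟨ξ, e_j⟩|^2, compares them with ⟨e_j, ρ e_j⟩ for ρ = ⟨ξ, · ⟩ξ, and confirms that the expected value agrees with Tr(Aρ).

```python
import math

def inner(x, y):
    # Standard inner product on C^n, conjugate-linear in the first argument.
    return sum(a.conjugate() * b for a, b in zip(x, y))

def matvec(M, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]

def density_operator(xi):
    # Matrix of the rank-one operator <xi, . > xi.
    return [[a * b.conjugate() for b in xi] for a in xi]

def trace_of_product(A, B):
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

# Illustrative observable (an assumption of this sketch) with known eigenbasis.
A = [[0.0, 1.0], [1.0, 0.0]]
eigenvalues = [1.0, -1.0]
s = 1 / math.sqrt(2)
eigenvectors = [[s, s], [s, -s]]

# Illustrative unit vector describing a pure state.
theta = math.pi / 6
xi = [math.cos(theta), math.sin(theta)]

# Probabilities |<xi, e_j>|^2 and the resulting expected value.
probs = [abs(inner(xi, e)) ** 2 for e in eigenvectors]
expectation_spectral = sum(lam * p for lam, p in zip(eigenvalues, probs))

# The same quantities computed from the density operator.
rho = density_operator(xi)
probs_from_rho = [inner(e, matvec(rho, e)) for e in eigenvectors]
expectation_trace = trace_of_product(A, rho)

print(probs, expectation_spectral, expectation_trace)
```

For this particular choice the expected value equals sin(2θ), so both computations should agree up to rounding.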
CHAPTER 1. INTRODUCTION AND NOTATIONS 4
These probabilities behave differently from the probabilities in Kolmogorov’s axiomatic setup usu-
ally studied in probability theory. In fact, this difference can be quantified in terms of so-called
“Bell inequalities”, which have been used to experimentally rule out a classical probabilistic inter-
pretation of quantum mechanics.
One of the striking differences between classical and quantum observables is that multiplication
of functions is commutative, while the multiplication of linear operators (or matrices, if you like)
is not. A physical consequence of this fact is that some observables cannot be measured at the
same time with arbitrary precision – a fact known as Heisenberg’s uncertainty principle.
One of the most important aspects for quantum information theory is how we describe compos-
ite systems. In classical systems, if we have two systems with (pure) state spaces X and Y , then
the pure state space of the composite system is X × Y . In other words, the state of the composite
system is described by the state of the two parts.
In quantum physics, if the states of two systems A and B are described by Hilbert spaces HA
and HB , then the Hilbert space for the composite system is their tensor product HA ⊗ HB . This
is different from classical physics in that a (pure) state of the composite system is in general not
described simply by a pair of pure states for A and B. Mathematically speaking, a unit vector
ξ ∈ HA ⊗ HB is not necessarily of the form ξA ⊗ ξB for unit vectors ξA ∈ HA , ξB ∈ HB . This leads
to the phenomenon of entanglement, which is at the heart of many interesting quantum effects,
both desirable and not.
So far, this is not much of a theory. We can describe the state of a system at a given time
and the possible measurement outcomes, but nothing has been said about how physical systems change
with time. As is customary in quantum information theory, we will not concern ourselves with the
continuous-time evolution of quantum systems, which is governed by the Schrödinger equation.
Instead we will content ourselves with describing possible changes a system can make.
It seems natural to describe the time evolution of a system as its state evolving in time while
the observables stay unchanged. Note however that the only thing one can really measure is the
outcome of an observable in a given state, not the observable itself or the state itself. Thus it
is just as valid to describe the time evolution of a physical system as the observables evolving in
time while the states stay unchanged. These two standpoints are called the Heisenberg (observables
evolve in time) and Schrödinger (states evolve in time) pictures.
So, if we want a linear map Φ : B(H) → B(H) to describe the change of a system in the
Schrödinger picture, it should map states to states, that is, density operators to density operators.
If we break this down into the two parts of the definition of density operators, this means it should
• map positive operators to positive operators (positive map),
• preserve the trace of operators (trace-preserving map).
These two requirements are certainly enough for Φ to map density operators to density operators.
But it is one of the interesting quirks of quantum information theory that these two conditions are
not enough to ensure that Φ describes the change of the states of a quantum system. To see this,
one has to look at composite systems.
If systems A and B are described by HA and HB and the states of A change according to
Φ while the states of B stay unchanged, the states of AB should change according to the map
Φ ⊗ idB . So does Φ ⊗ idB always map density matrices to density matrices? Surprisingly not. It
is clearly trace-preserving, but it may fail to be positive. Maps Φ with the property that Φ ⊗ idB
is positive for arbitrary systems B are called completely positive and will occur time and again
during this course.
Similarly, the maps describing the change of the observables of a system in the Heisenberg
picture are unital completely positive maps. Here a linear map Φ : B(H) → B(H) is called unital
if Φ(1) = 1.
Now how do quantum entropies and trace inequalities enter the picture? Entropy is a concept
that occurs in many different shapes and forms in physics, so much that it has been called a
metaphor of nature itself. One basic physical principle is that physical systems seek to maximize
their entropy (the second law of thermodynamics). This plays a particularly important role in
understanding the dissipative behavior of open quantum systems.
A closely related quantity is the relative entropy. States that maximize the (“absolute”) entropy
when the total energy of the system is fixed are equilibrium states, called Gibbs states in this
context. The relative entropy of a state with respect to a Gibbs state then measures the deviation
of this state from equilibrium. One way to express this is the following: If a state has relative
entropy d with respect to the Gibbs state, then it takes ∼ log(d/ε) measurements to distinguish
this state from equilibrium with precision ε.
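Mathematically, the quantum entropy S of a system in state ρ and the relative entropy D with
respect to a Gibbs state σ are expressed as
\[ S(\rho) = -\mathrm{Tr}(\rho \log \rho), \qquad D(\rho \,\|\, \sigma) = \mathrm{Tr}\bigl(\rho(\log \rho - \log \sigma)\bigr). \]
We will soon make sense of expressions like log ρ for (some) matrices ρ. Trace inequalities enter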
the picture to justify that these expressions have the expected physical properties. Maybe the
most prominent example is the data processing inequality, which states that the relative entropy
decreases when a quantum channel is applied to the state of the system. Mathematically, this is
reflected in certain convexity and monotonicity properties of the relative entropy.
Since this is a mathematics course after all, let us finish the first lecture by recalling some basic
linear algebra (or linear analysis, really).
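Definition 1.1. An inner product space (or Hilbert space) is a vector space H equipped with a
map ⟨ · , · ⟩ : H × H → C satisfying
(a) ⟨ξ, η + λζ⟩ = ⟨ξ, η⟩ + λ⟨ξ, ζ⟩ for all ξ, η, ζ ∈ H and λ ∈ C (linearity in the second argument),
(b) ⟨η, ξ⟩ is the complex conjugate of ⟨ξ, η⟩ for all ξ, η ∈ H,
(c) ⟨ξ, ξ⟩ > 0 for all ξ ∈ H \ {0}.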
Lemma 1.3. For every inner product ⟨ · , · ⟩ on Cn there exists a positive definite matrix A ∈ Mn (C)
such that ⟨ξ, η⟩ = ξ H Aη for all ξ, η ∈ Cn .
Proof. Let e_1, . . . , e_n be the standard basis of C^n and let A_{jk} = ⟨e_j, e_k⟩. For ξ, η ∈ C^n we have by
sesquilinearity
\[ \langle \xi, \eta \rangle = \Bigl\langle \sum_{j=1}^n \xi_j e_j, \sum_{k=1}^n \eta_k e_k \Bigr\rangle = \sum_{j,k=1}^n \overline{\xi_j}\, \eta_k \langle e_j, e_k \rangle = \sum_{j,k=1}^n \overline{\xi_j}\, A_{jk}\, \eta_k = \xi^H A \eta. \]
In particular,
\[ \xi^H A \xi = \langle \xi, \xi \rangle, \]
which implies that A is positive definite.
Lemma 1.4. If H is an inner product space, there exists n ∈ N and a linear isomorphism U : H →
Cn such that
(U ξ)H (U η) = ⟨ξ, η⟩
for all ξ, η ∈ H.
Proof. By linear algebra, there exists a linear isomorphism V : H → Cn for some n ∈ N. By the
previous lemma, there exists a positive definite matrix A ∈ Mn (C) such that ⟨ξ, η⟩ = (V ξ)H AV η
for all ξ, η ∈ H. Since A is positive definite, there exists an invertible matrix B ∈ Mn (C) such
that A = B H B (see Exercise 1.1). The map U = BV does the job.
In other words, there is essentially one inner product space of dimension n, namely C^n with
the inner product ⟨ξ, η⟩ = ξ^H η. From now on we will always consider C^n with this specific inner
product, also called the standard inner product. Notice that it has the nice property that the
standard basis satisfies ⟨e_j, e_k⟩ = δ_{jk}, in other words, it is an orthonormal basis.
Lemma 1.5. Let H, K be inner product spaces. For every linear map A : H → K there exists a
unique linear map A∗ : K → H such that
⟨Aξ, η⟩ = ⟨ξ, A∗ η⟩
for all ξ ∈ H, η ∈ K.
Proof. By the previous lemma we can assume without loss of generality H = Cm , K = Cn (with
the standard inner products). Then A∗ = AH does the job. Uniqueness is easy to see.
From now on we will write A∗ instead of AH because that’s what the cool kids do.
Definition 1.6. Let H be a Hilbert space. A linear map A : H → H is called self-adjoint if
A∗ = A.
If H = Cn , then a self-adjoint linear map A : H → H can be identified with a hermitian matrix
in Mn (C), and vice versa. We will use these two viewpoints interchangeably. The hermitian
matrices in Mn (C) are denoted by Mn (C)sa .
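Theorem 1.7 (Spectral theorem). Let H be a Hilbert space. Every self-adjoint A : H → H can
be written in the form
\[ A = \sum_{j=1}^n \lambda_j \langle \xi_j, \cdot\, \rangle \xi_j \]
with real numbers λ_1, . . . , λ_n and an orthonormal basis ξ_1, . . . , ξ_n of H.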
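Proof. We prove this by induction over the dimension of H. For dim(H) = 1 it is clear. Suppose we
have proven it for dim(H) = n and let dim(H) = n + 1.
By Lemma 1.4, we can assume that H = C^{n+1} ≅ R^{2n+2} with the standard inner product.
Consider the map
\[ f : \mathbb{C}^{n+1} \to \mathbb{C}, \quad \xi \mapsto \langle \xi, A\xi \rangle. \]
Since A is self-adjoint, the complex conjugate of f(ξ) is ⟨Aξ, ξ⟩ = ⟨ξ, Aξ⟩ = f(ξ). In other words, f is real-valued.
As f is continuous and the sphere S = {ξ ∈ C^{n+1} : ∥ξ∥^2 = 1} is compact, f attains its maximum on S. By the Lagrange multiplier
theorem, a maximizer ξ_1 of f on S must satisfy
\[ A\xi_1 = \lambda_1 \xi_1 \]
for some λ_1 ∈ R.
Let V be the linear span of ξ_1. If ξ ⊥ ξ_1, then
\[ \langle A\xi, \xi_1 \rangle = \langle \xi, A\xi_1 \rangle = \lambda_1 \langle \xi, \xi_1 \rangle = 0. \]
Thus A(V^⊥) ⊂ V^⊥. The subspace V^⊥ has dimension n. By induction hypothesis, there exist an
orthonormal basis ξ_2, . . . , ξ_{n+1} of V^⊥ and λ_2, . . . , λ_{n+1} ∈ R such that
\[ A\eta = \sum_{j=2}^{n+1} \lambda_j \langle \xi_j, \eta \rangle \xi_j \]
for η ∈ V^⊥.
Hence if ξ = ⟨ξ_1, ξ⟩ξ_1 + η with η ∈ V^⊥, then
\[ A\xi = \langle \xi_1, \xi \rangle A\xi_1 + \sum_{j=2}^{n+1} \lambda_j \langle \xi_j, \eta \rangle \xi_j = \sum_{j=1}^{n+1} \lambda_j \langle \xi_j, \xi \rangle \xi_j. \]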
Remark. The operators Pj = ⟨ξj , ·⟩ξj are projections, that is, Pj2 = Pj∗ = Pj . Moreover, the
orthogonality relation ⟨ξi , ξj ⟩ = 0 for i ̸= j implies that the projections Pi are orthogonal in the
sense that Pi Pj = 0 for i ̸= j, and the completeness of an orthonormal basis implies that the
projections Pi sum up to 1.
This means that every self-adjoint operator A can be written as
\[ A = \sum_{j=1}^m \lambda_j P_j \]
with real numbers λ_j and orthogonal projections P_j such that $\sum_{j=1}^m P_j = 1$. Such a representation
is called spectral decomposition of A. Usually one sums up the projections belonging to the same
eigenvalue.
If A has spectral decomposition $A = \sum_{j=1}^m \lambda_j P_j$, then the eigenvalues of A are exactly the
numbers λ_1, . . . , λ_m.
Definition 1.8. If A ∈ Mn (C) is self-adjoint with spectral decomposition $A = \sum_{j=1}^m \lambda_j P_j$ and
f : {λ_1, . . . , λ_m} → C, we define
\[ f(A) = \sum_{j=1}^m f(\lambda_j) P_j. \]
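To make the functional calculus concrete, here is a minimal sketch in plain Python. The matrix A = [[2, 1], [1, 2]] with eigenvalues 1 and 3 and the two spectral projections are illustrative choices of this sketch. It builds f(A) as the sum of f(λ_j) P_j for the square root and for the resolvent function f(λ) = (λ − µ)^{-1}, and checks the results against ordinary matrix multiplication.

```python
import math

def scale(c, M):
    return [[c * x for x in row] for row in M]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply_function(f, eigenvalues, projections):
    # f(A) = sum_j f(lambda_j) P_j for a given spectral decomposition.
    n = len(projections[0])
    result = [[0.0] * n for _ in range(n)]
    for lam, P in zip(eigenvalues, projections):
        result = add(result, scale(f(lam), P))
    return result

# Spectral decomposition of the illustrative matrix A = [[2, 1], [1, 2]].
eigenvalues = [1.0, 3.0]
projections = [
    [[0.5, -0.5], [-0.5, 0.5]],  # projection onto the span of (1, -1)
    [[0.5, 0.5], [0.5, 0.5]],    # projection onto the span of (1, 1)
]

A = apply_function(lambda t: t, eigenvalues, projections)

# Square root: sqrt(A) squared should give back A.
sqrt_A = apply_function(math.sqrt, eigenvalues, projections)
square = matmul(sqrt_A, sqrt_A)

# Resolvent: (A - mu) f(A) should be the identity for f(t) = 1 / (t - mu).
mu = 2.0
resolvent = apply_function(lambda t: 1.0 / (t - mu), eigenvalues, projections)
A_minus_mu = [[A[i][j] - (mu if i == j else 0.0) for j in range(2)] for i in range(2)]
product = matmul(A_minus_mu, resolvent)

print(A, square, product)
```

The value µ = 2 is chosen so that it is not an eigenvalue of A; for an eigenvalue the resolvent function would not be defined on the spectrum.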
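Lemma 1.9. Let A ∈ Mn (C) be self-adjoint with spectral decomposition $A = \sum_{j=1}^m \lambda_j P_j$.
(a) If f(λ) = λ^k for some integer k ≥ 0, then f(A) = A^k.
(b) If µ ∈ C \ {λ_1, . . . , λ_m} and f(λ) = (λ − µ)^{-1}, then A − µ is invertible and f(A) = (A − µ)^{-1}.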
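Proof. (a) We proceed by induction over k. For k = 0, the claim is true. Now assume it is true
for k and let us prove it for k + 1. We have
\[ A^{k+1} = A^k A = \Bigl( \sum_{i=1}^m \lambda_i^k P_i \Bigr) \Bigl( \sum_{j=1}^m \lambda_j P_j \Bigr) = \sum_{i,j=1}^m \lambda_i^k \lambda_j P_i P_j. \]
Since P_i P_j = 0 for i ≠ j and P_j^2 = P_j, the last term reduces to $\sum_{j=1}^m \lambda_j^{k+1} P_j$.
(b) Let f(λ) = (λ − µ)^{-1}. Since $\sum_{i=1}^m P_i = 1$, we have $A - \mu = \sum_{i=1}^m (\lambda_i - \mu) P_i$. Thus
\[ (A - \mu) f(A) = \Bigl( \sum_{i=1}^m (\lambda_i - \mu) P_i \Bigr) \Bigl( \sum_{j=1}^m (\lambda_j - \mu)^{-1} P_j \Bigr) = \sum_{i,j=1}^m (\lambda_i - \mu)(\lambda_j - \mu)^{-1} P_i P_j. \]
Again, we can use the orthogonality relation of the projections P_i to reduce the last term to
$\sum_{j=1}^m (\lambda_j - \mu)^{-1} (\lambda_j - \mu) P_j = \sum_{j=1}^m P_j = 1$. It follows from the uniqueness of (right) inverses
that f(A) = (A − µ)^{-1}.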
Exercises
Exercise 1.1. For a matrix A ∈ Mn (C), show that the following are equivalent:
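Proof. (i) =⇒ (ii): This implication uses the very practical polarization identity:
\[ \langle \xi, A\eta \rangle = \frac{1}{4} \sum_{k=0}^{3} (-i)^k \langle \xi + i^k \eta, A(\xi + i^k \eta) \rangle. \]
To prove it, you just have to sit down and expand the inner product. Not very pleasant, but it
works.
Since A is positive, all inner products on the right side are real. Thus
\[ \frac{1}{4} \sum_{k=0}^{3} (-i)^k \langle \xi + i^k \eta, A(\xi + i^k \eta) \rangle = \frac{1}{4} \sum_{k=0}^{3} (-i)^k \overline{\langle \xi + i^k \eta, A(\xi + i^k \eta) \rangle} = \frac{1}{4} \sum_{k=0}^{3} (-i)^k \langle A(\xi + i^k \eta), \xi + i^k \eta \rangle = \langle A\xi, \eta \rangle, \]
where the last step is the polarization identity applied to the sesquilinear form (ξ, η) ↦ ⟨Aξ, η⟩.
Hence, A = A∗.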
If ξ is an eigenvector of A to the eigenvalue λ, then ⟨ξ, Aξ⟩ = λ∥ξ∥2 . Hence λ ≥ 0 (resp. λ > 0)
if A is positive semi-definite (resp. positive definite).
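(ii) =⇒ (iii): Let λ_1, . . . , λ_n ∈ R_+ denote the eigenvalues of A. By the spectral theorem, there
exist orthogonal projections P_1, . . . , P_n such that
\[ A = \sum_{j=1}^n \lambda_j P_j. \]
Let $B = \sum_{j=1}^n \sqrt{\lambda_j}\, P_j = \sqrt{A}$. Then B = B∗ and B^2 = A. If the eigenvalues of A are strictly
positive, then B is invertible with inverse $B^{-1} = \sum_{j} \lambda_j^{-1/2} P_j$.
(iii) =⇒ (i): If A = B∗B, then
\[ \langle \xi, A\xi \rangle = \langle \xi, B^* B \xi \rangle = \langle B\xi, B\xi \rangle = \|B\xi\|^2 \ge 0 \]
for all ξ ∈ C^n.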
Exercise 1.2. Show that for every A ∈ Mn (C) there exist positive semidefinite matrices A1 , . . . , A4 ∈
Mn (C) such that A = A1 − A2 + i(A3 − A4 ).
Exercise 1.3. For matrices A, B ∈ Mn (C) define their Hadamard product A ◦ B as the matrix
with entries Aj,k Bj,k . Show that the Hadamard product of two positive semi-definite matrices is
again positive semi-definite.
Exercise 1.4. The absolute value |A| of a matrix A ∈ Mn (C) is defined as |A| = (A∗ A)1/2 .
(a) Show that for every A ∈ Mn (C) there exists a unitary matrix U ∈ Mn (C) such that A = U |A|.
(b) Let A, B ∈ Mn (C) be self-adjoint with eigenvalues λ1 ≤ · · · ≤ λn and µ1 ≤ · · · ≤ µn ,
respectively. Show that if λk ≤ µk for all k ∈ {1, . . . , n}, then there exists a unitary matrix
U ∈ Mn (C) such that A ≤ U ∗ BU .
(c) Show that for every A ∈ Mn (C) there exists a unitary matrix U ∈ Mn (C) such that (1/2)(A +
A∗)_+ ≤ U∗|A|U. (Hint: Use the minmax principle.)
(d) Show that for all A, B ∈ Mn (C) there exist unitary matrices U, V ∈ Mn (C) such that
|A + B| ≤ U ∗ |A|U + V ∗ |B|V .
(e) Show that there exist A, B ∈ Mn (C) such that |A + B| ̸≤ |A| + |B|.
Exercise 1.5. (a) Show that if φ : Mn (C) → C is a linear map such that φ(AB) = φ(BA) for
all A, B ∈ Mn (C), then φ = (φ(1)/n) Tr.
(b) Let H be an infinite-dimensional Hilbert space and B(H) the set of all bounded linear
operators on H. Show that if φ : B(H) → C is a linear map such that φ(AB) = φ(BA) for
all A, B ∈ B(H), then φ = 0 (harder).
Exercise 1.6 (Schur complement theorem). Let A ∈ Mn (C) be invertible, B ∈ Mn,m (C) and
C ∈ Mm (C). Show that the block matrix
\[ \begin{pmatrix} A & B \\ B^* & C \end{pmatrix} \]
Exercise 1.7. (a) Let V be a subspace of Mn (C) such that ABC ∈ V for all A, C ∈ Mn (C)
and B ∈ V . Show that V = {0} or V = Mn (C).
(b) Show that if Φ : Mm (C) → Mn (C) is a linear map such that Φ(AB) = Φ(A)Φ(B) for all
A, B ∈ Mm (C), then either Φ = 0 or Φ is injective.
Chapter 2
Complete positivity
Definition 2.1 (Tensor product of vector spaces/Hilbert spaces). For Hilbert spaces H and K
let Bil(H × K; C) be the vector space of all sesquilinear maps from H × K to C. For ξ ∈ H and
η ∈ K define
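ξ ⊗ η : Bil(H × K; C) → C, φ ↦ φ(ξ, η).
The tensor product H ⊗ K is the linear span of all elements ξ ⊗ η with ξ ∈ H and η ∈ K. It is a
Hilbert space when endowed with the inner product determined by
\[ \langle \xi_1 \otimes \eta_1, \xi_2 \otimes \eta_2 \rangle = \langle \xi_1, \xi_2 \rangle \langle \eta_1, \eta_2 \rangle. \]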
Remark. We already know that C^m ⊗ C^n must be isomorphic (as inner product space) to C^k for
some k ∈ N. It is not hard to see that the elementary tensors e_i ⊗ e_j with i ∈ {1, . . . , m} and
j ∈ {1, . . . , n} form an orthonormal basis of C^m ⊗ C^n. Thus C^m ⊗ C^n ≅ C^{mn} as inner product
spaces.
Definition 2.2 (Tensor product of matrices/maps). If Φ : H1 → K1 and Ψ : H2 → K2 are linear
maps, then their tensor product Φ ⊗ Ψ is the linear map from H1 ⊗ H2 to K1 ⊗ K2 , defined on
elementary tensors by
(Φ ⊗ Ψ)(ξ ⊗ η) = Φ(ξ) ⊗ Ψ(η).
The linear span of all elements Φ⊗Ψ with Φ ∈ Mm,k (C) and Ψ ∈ Mn,l (C) is denoted by Mm,k (C)⊗
Mn,l (C).
Remark. We will always identify elements of Mm (C) ⊗ Mn (C) with mn × mn matrices in the
following way: Let (Eij ) be the matrix units in Mn (C), that is, Eij is the matrix whose (i, j)-entry
is 1 and all other entries are 0. The matrix A ⊗ Eij is identified with the block matrix in Mmn (C)
with blocks of size m × m, where the block at position (i, j) is A and all other blocks are zero.
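Here is an example:
\[ A \otimes \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \mapsto \begin{pmatrix} 0 & A \\ 0 & 0 \end{pmatrix}. \]
Since the matrix units form a basis of Mn (C), this identification can be linearly extended to all of
Mm (C) ⊗ Mn (C). For example,
\[ A \otimes \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \begin{pmatrix} aA & bA \\ cA & dA \end{pmatrix}. \]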
In other words, the elementary tensor A ⊗ B is identified with the Kronecker product of A and B
(which is also denoted by A ⊗ B for this reason).
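The block identification described in this remark can be written out in a few lines. The sketch below, in plain Python, follows the convention stated above, namely that the (i, j) block of the matrix associated with A ⊗ B is B_ij A; the function name and the test matrices are choices of this sketch. It also checks the mixed-product rule (A ⊗ B)(C ⊗ D) = AC ⊗ BD on one example.

```python
def tensor_as_blocks(A, B):
    # Block matrix whose (i, j) block (of size m x m) is B[i][j] * A.
    m, n = len(A), len(B)
    size = m * n
    result = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            for r in range(m):
                for c in range(m):
                    result[i * m + r][j * m + c] = B[i][j] * A[r][c]
    return result

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
E12 = [[0, 1], [0, 0]]  # matrix unit with a 1 at position (1, 2)

# A ⊗ E12: the block at position (1, 2) is A, all other blocks vanish.
example = tensor_as_blocks(A, E12)

# Mixed-product rule on illustrative matrices.
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]
lhs = matmul(tensor_as_blocks(A, B), tensor_as_blocks(C, D))
rhs = tensor_as_blocks(matmul(A, C), matmul(B, D))
mixed_product_holds = lhs == rhs

print(example, mixed_product_holds)
```

Other references order the two factors the other way round when forming the block matrix; the two conventions differ only by a fixed permutation of the basis.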
Definition 2.3 (Completely positive maps and quantum channels). A linear map Φ : Mm (C) →
Mn (C) is called positive if it maps positive semi-definite matrices to positive semi-definite matrices.
For k ≥ 1, a linear map Φ : Mm (C) → Mn (C) is said to be k-positive if Φ⊗idk : Mm (C)⊗Mk (C) →
Mn (C) ⊗ Mk (C) is positive. It is said to be completely positive if it is k-positive for any k ≥ 1.
In general, characterizing k-positive maps from Mm (C) to Mn (C) is a hard task. The situation
is much better for completely positive maps. Let us start with a few (non-) examples.
Example 2.4. The transpose map T (A) = AT is positive but not 2-positive (exercise).
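A quick way to see the failure of 2-positivity for n = 2 is to apply T ⊗ id to the positive matrix formed by summing E_ij ⊗ E_ij over all i, j. The following plain Python sketch (helper names are choices of this sketch) computes the resulting matrix, which is the swap operator on C^2 ⊗ C^2, and evaluates its quadratic form on the antisymmetric vector e_1 ⊗ e_2 − e_2 ⊗ e_1.

```python
def matrix_unit(i, j, n):
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron(A, B):
    # (A ⊗ B)[a*n + b][c*n + d] = A[a][c] * B[b][d]
    m, n = len(A), len(B)
    return [[A[a][c] * B[b][d] for c in range(m) for d in range(n)]
            for a in range(m) for b in range(n)]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matvec(M, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]

def inner(x, y):
    return sum(a.conjugate() * b for a, b in zip(x, y))

n = 2
image = [[0] * (n * n) for _ in range(n * n)]
for i in range(n):
    for j in range(n):
        E = matrix_unit(i, j, n)
        # (T ⊗ id)(E_ij ⊗ E_ij) = E_ji ⊗ E_ij
        image = add(image, kron(transpose(E), E))

# Antisymmetric vector e_1 ⊗ e_2 - e_2 ⊗ e_1.
v = [0, 1, -1, 0]
quadratic_form = inner(v, matvec(image, v))
print(image, quadratic_form)
```

A negative value of the quadratic form shows that the image is not positive semi-definite, although the input matrix is.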
Example 2.5. The depolarizing channel Φ(A) = λA + (1 − λ)Tr(A)1 for λ ∈ [0, 1] is completely
positive.
Example 2.6. The following maps are completely positive:
1. If V ∈ Mm,n (C), then the map Φ : Mm (C) → Mn (C), A ↦ V∗AV is completely positive.
2. Every ∗-homomorphism π, that is, every linear map π : Mm (C) → Mn (C) such that π(AB) = π(A)π(B)
and π(A∗) = π(A)∗ for all A, B ∈ Mm (C). For a more concrete example, take
\[ \pi : M_n(\mathbb{C}) \to M_{2n}(\mathbb{C}), \quad A \mapsto \begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix}. \]
We will soon see that all completely positive maps can be constructed from these two examples.
Lemma 2.7. A linear map Φ : Mm (C) → Mn (C) is completely positive if and only if for every
N ∈ N, all A_1, . . . , A_N ∈ Mm (C) and ξ_1, . . . , ξ_N ∈ C^n we have
\[ \sum_{j,k=1}^{N} \langle \xi_j, \Phi(A_j^* A_k) \xi_k \rangle \ge 0. \]
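For maps of the form Φ(A) = V∗AV the criterion can be verified directly, because the double sum equals the squared norm of the vector obtained by summing A_k V ξ_k over k. The plain Python sketch below checks this identity numerically; the matrix V, the matrices A_1, A_2 and the vectors ξ_1, ξ_2 are arbitrary illustrative choices.

```python
def conj_transpose(M):
    return [[M[j][i].conjugate() for j in range(len(M))] for i in range(len(M[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(M, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]

def inner(x, y):
    # Conjugate-linear in the first argument.
    return sum(a.conjugate() * b for a, b in zip(x, y))

V = [[1, 1j], [0, 2]]

def phi(A):
    # Phi(A) = V* A V
    return matmul(conj_transpose(V), matmul(A, V))

As = [[[1, 2], [0, 1j]], [[0, 1], [1, 0]]]
xis = [[1, -1j], [2, 1]]
N = len(As)

# Left-hand side of the criterion: sum over j, k of <xi_j, Phi(A_j* A_k) xi_k>.
form = sum(
    inner(xis[j], matvec(phi(matmul(conj_transpose(As[j]), As[k])), xis[k]))
    for j in range(N) for k in range(N)
)

# Squared norm of w = sum_k A_k V xi_k.
w = [0, 0]
for k in range(N):
    term = matvec(As[k], matvec(V, xis[k]))
    w = [a + b for a, b in zip(w, term)]
norm_squared = inner(w, w)

print(form, norm_squared)
```

Since a squared norm is non-negative, this explains why the inequality holds for every choice of the A_k and ξ_k when Φ has this form.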
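with $A_i = \sum_{k=1}^{N} B_{ki}$. Moreover, any ξ ∈ C^n ⊗ C^N is of the form
\[ \xi = \sum_{j=1}^{N} \xi_j \otimes e_j \]
with ξ_j ∈ C^n. Since
\[ \Bigl\langle \xi, (\Phi \otimes \mathrm{id}_{M_N(\mathbb{C})}) \Bigl( \sum_{i,j=1}^{N} A_i^* A_j \otimes E_{ij} \Bigr) \xi \Bigr\rangle = \sum_{i,j=1}^{N} \langle \xi_i, \Phi(A_i^* A_j) \xi_j \rangle, \]
the map Φ ⊗ idMN (C) is positive if and only if the inequality from the lemma holds.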
Theorem 2.8 (Stinespring’s dilation theorem). Any completely positive map Φ : Mm (C) → Mn (C)
can be represented as
Φ(A) = V ∗ π(A)V
with V ∈ Mk,n (C) and a unital ∗-homomorphism π : Mm (C) → Mk (C) for some k ∈ N.
Proof. The proof is reminiscent of the GNS construction, and in fact, it can be understood as a
generalization of it. On Mm (C) ⊗ C^n define
\[ \Bigl\langle \sum_j A_j \otimes \xi_j, \sum_k B_k \otimes \eta_k \Bigr\rangle_K = \sum_{j,k} \langle \xi_j, \Phi(A_j^* B_k) \eta_k \rangle. \]
This map is clearly sesquilinear, and by the previous lemma it is also positive semi-definite. It may
fail to be non-degenerate, so we define K as the quotient of Mm (C) ⊗ C^n by the kernel of ⟨·, ·⟩_K.
We write A ⊗_K ξ for the image of A ⊗ ξ in K under the quotient map.
Then
\[ \Bigl\langle \sum_j A_j \otimes_K \xi_j, \sum_k B_k \otimes_K \eta_k \Bigr\rangle := \Bigl\langle \sum_j A_j \otimes \xi_j, \sum_k B_k \otimes \eta_k \Bigr\rangle_K \]
defines an inner product on K, making it a Hilbert space. Clearly, K is finite-dimensional, so that
K ≅ C^k for some k ∈ N.
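Define V : C^n → K, ξ ↦ 1 ⊗_K ξ and π : Mm (C) → B(K), π(A)(B ⊗_K ξ) = AB ⊗_K ξ. To show
that π(A) is well-defined, first note that A∗A ≤ λ1, where λ is the largest eigenvalue of A∗A. Thus
there exists C such that λ1 − A∗A = C∗C. Hence
\[ \lambda \sum_{j,k} \langle \xi_j, \Phi(B_j^* B_k) \xi_k \rangle - \sum_{j,k} \langle \xi_j, \Phi(B_j^* A^* A B_k) \xi_k \rangle = \sum_{j,k} \langle \xi_j, \Phi(B_j^* (\lambda 1 - A^* A) B_k) \xi_k \rangle = \sum_{j,k} \langle \xi_j, \Phi(B_j^* C^* C B_k) \xi_k \rangle \ge 0 \]
by the previous lemma. In particular, if $\sum_j B_j \otimes_K \xi_j = 0$, then $\sum_j A B_j \otimes_K \xi_j = 0$.
Let us compute the adjoint of V. For ξ, η ∈ C^n and A ∈ Mm (C) we have
\[ \langle V\xi, A \otimes_K \eta \rangle = \langle 1 \otimes_K \xi, A \otimes_K \eta \rangle = \langle \xi, \Phi(A) \eta \rangle, \]
so that V∗(A ⊗_K η) = Φ(A)η. Consequently, V∗π(A)Vξ = V∗(A ⊗_K ξ) = Φ(A)ξ.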
Remark. The construction in the proof is essentially forced upon us by the statement of the
Stinespring dilation theorem: Let us assume there is a (finite-dimensional) Hilbert space K, a
linear map V : Cn → K and a unital ∗-homomorphism π : Mm (C) → B(K) such that
Φ(A) = V ∗ π(A)V
for all A ∈ Mm (C).
Without loss of generality we may assume that elements of the form π(A)V ξ with A ∈ Mm (C)
and ξ ∈ C^n linearly span K. Since the map (A, ξ) ↦ π(A)V ξ is bilinear, there exists a surjection
q : Mm (C) ⊗ Cn → K such that q(A ⊗ ξ) = π(A)V ξ. In particular, K is a quotient of Mm (C) ⊗ Cn .
Moreover, the inner product on K must satisfy
⟨π(A)V ξ, π(B)V η⟩ = ⟨ξ, V ∗ π(A∗ B)V η⟩ = ⟨ξ, Φ(A∗ B)η⟩.
If we pull this back to Mm (C) ⊗ Cn via
⟨A ⊗ ξ, B ⊗ η⟩ := ⟨q(A ⊗ ξ), q(B ⊗ η)⟩,
one gets exactly the sesquilinear form from the proof.
For the following result recall the definition of the operator norm: If A is a linear operator
between the Hilbert spaces H and K, then
\[ \|A\| = \sup_{\xi \in H,\ \langle \xi, \xi \rangle \le 1} \langle A\xi, A\xi \rangle^{1/2}. \]
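Proof. Since C_Φ is positive, there exists B ∈ Mmn (C) such that B∗B = C_Φ. Let b_1, . . . , b_{mn} ∈ C^{mn}
be the row vectors of B, so that
\[ C_\Phi = B^* B = \sum_{l=1}^{mn} b_l^* b_l. \]
Further let
\[ J_l : \mathbb{C}^n \to \mathbb{C}^n \otimes \mathbb{C}^m, \quad \xi \mapsto \xi \otimes e_l. \]
Note that
\[ J_k^* C_\Phi J_l \xi = \sum_{i,j=1}^{m} J_k^* (\Phi(E_{ij}) \xi \otimes E_{ij} e_l) = \sum_{i,j=1}^{m} \delta_{jl} \delta_{ik} \Phi(E_{ij}) \xi = \Phi(E_{kl}) \xi. \]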
Proof. Since
\[ \Bigl( \sum_{i,j=1}^{m} E_{ij} \otimes E_{ij} \Bigr)^{*} \Bigl( \sum_{i,j=1}^{m} E_{ij} \otimes E_{ij} \Bigr) = m \sum_{i,j=1}^{m} E_{ij} \otimes E_{ij}, \]
the matrix $\sum_{i,j} E_{ij} \otimes E_{ij}$ is positive. As Φ is assumed to be m-positive, this implies that the Choi
matrix C_Φ is positive. Now the first claim follows from the previous lemma.
For the second claim observe that
\[ \mathrm{Tr}(\Phi(A)) = \sum_{j=1}^{k} \mathrm{Tr}(V_j^* A V_j) = \sum_{j=1}^{k} \mathrm{Tr}(A V_j V_j^*), \]
Theorem 2.12 (Choi’s criterion of completely positive maps). Let Φ : Mm (C) → Mn (C) be a
linear map. The following are equivalent:
(i) Φ is m-positive.
(ii) The Choi matrix
\[ C_\Phi := (\Phi \otimes \mathrm{id}_{M_m(\mathbb{C})}) \Bigl( \sum_{i,j=1}^{m} E_{ij} \otimes E_{ij} \Bigr) \in M_{mn}(\mathbb{C}) \]
is positive.
Theorem 2.13 (Uhlmann, Lindblad). For any quantum channel Φ : Mm (C) → Mn (C), there exist
N > 0 and a pure state δ ∈ M_N such that
\[ \Phi(A) \otimes \frac{1_N}{N} = \int U^* (A \otimes \delta) U \, dU, \]
where dU is the Haar measure on the unitary group.
Exercises
Exercise 2.1. Show that Ck ⊗ Cm has the following universal property: For every bilinear map
φ : Ck ×Cm → Cn there exists a unique linear map Φ : Ck ⊗Cm → Cn such that φ(x, y) = Φ(x⊗y).
Exercise 2.2. For each n ≥ 1, find maps that are n-positive but not (n + 1)-positive.
Exercise 2.3. Show that the maps from Example 2.6 are completely positive.
Exercise 2.4 (Completely positive maps are completely bounded). Recall that the operator norm
of A ∈ Mn (C) is the square root of the largest eigenvalue of A∗A. Accordingly, the norm of a
linear map Φ : Mm (C) → Mn (C) is
\[ \|\Phi\| = \sup_{A \in M_m(\mathbb{C}),\ \|A\| \le 1} \|\Phi(A)\|. \]
Show that if Φ is completely positive, then ∥Φ ⊗ idMk (C) ∥ ≤ ∥Φ∥ for all k ∈ N.
Exercise 2.5 (GNS construction for states). (a) Show that any positive linear functional φ : Mn (C) →
C is completely positive.
(b) A unital positive linear map φ : Mn (C) → C is called a state. Show that for every state φ
there exists a unital ∗-homomorphism π : Mn (C) → Mk (C) for some k ∈ N and a unit vector
ξ ∈ Ck such that
φ(A) = ⟨ξ, π(A)ξ⟩
for all A ∈ Mn (C).
Exercise 2.6. Show that every 2-positive map satisfies the Kadison–Schwarz inequality.
Exercise 2.7 (Hilbert modules I). Recall that a right Mn (C)-module is a vector space E together
with an associative and distributive product E × Mn (C) → E. A right Hilbert Mn (C)-module is
a right Mn (C)-module together with a sesquilinear map
(·|·) : E × E → Mn (C)
such that
(a) Show that Mm,n (C) with the usual right multiplication and (A|B) = A∗ B is a right Hilbert
Mn (C)-module.
(b) (maybe not so easy) Show that for every (finite-dimensional) right Hilbert Mn (C)-module E
there exists m ∈ N and a linear isomorphism α : E → Mm,n (C) such that
• α(ξA) = α(ξ)A for all ξ ∈ E, A ∈ Mn (C),
• α(ξ)∗ α(η) = (ξ|η) for all ξ, η ∈ E.
Exercise 2.8 (Hilbert modules II). Let E, F be (finite-dimensional) right Hilbert Mn (C)-modules.
A linear map T : E → F is called adjointable if there exists a linear map T ∗ : F → E such that
(T ξ|η) = (ξ|T ∗ η)
for all ξ ∈ E, η ∈ F . The set of all adjointable operators from E to F is denoted by L(E, F ).
T (ξA) = (T ξ)A
T (B) = AB
Exercise 2.9 (Hilbert modules III). A Hilbert Mm (C)-Mn (C)-module is a right Hilbert Mn (C)-
module E together with a unital ∗-homomorphism π : Mm (C) → L(E, E).
Φ(A) = V ∗ π(A)V
If Φ : Mm (C) → Mn (C) is a linear map, we denote its adjoint with respect to the Hilbert–
Schmidt inner product by Φ† .
Lemma 3.2. A matrix A ∈ Mn (C) is positive if and only if ⟨A, B⟩HS ≥ 0 for all B ∈ Mn (C)+ .
Remark. By the Riesz representation theorem, every linear map φ : Mn (C) → C is of the form
φ = Tr(B · ) for some B ∈ Mn (C). By the previous lemma, this map is positive if and only if
B ≥ 0.
The Hilbert–Schmidt adjoint connects the Heisenberg and Schrödinger picture, as the following
lemma shows.
Lemma 3.3. A linear map Φ : Mm (C) → Mn (C) is unital completely positive if and only if Φ† is
completely positive trace-preserving.
Proof. Since (Φ ⊗ idMk (C) )† = Φ† ⊗ idMk (C) , it suffices to show that Φ is unital positive if and only
if Φ† is positive trace-preserving.
As
⟨Φ(1), A⟩HS = ⟨1, Φ† (A)⟩HS = Tr(Φ† (A)),
we have Φ(1) = 1 if and only if Tr(Φ† (A)) = Tr(A) for all A ∈ Mn (C).
That Φ† is positive if and only if Φ is positive is an easy consequence of the previous lemma.
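The duality between the two pictures can be illustrated numerically. The sketch below, in plain Python, assumes the Hilbert–Schmidt inner product ⟨A, B⟩HS = Tr(A∗B) and takes Φ(A) to be the sum of K_j∗ A K_j over two matrices K_0, K_1 (an illustrative choice with the sum of K_j∗K_j equal to the identity). In that case Φ† should be the map B ↦ sum of K_j B K_j∗; the code checks the adjoint relation, that Φ is unital, and that Φ† preserves the trace.

```python
import math

def conj_transpose(M):
    return [[M[j][i].conjugate() for j in range(len(M))] for i in range(len(M[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def hs(A, B):
    # Hilbert-Schmidt inner product Tr(A* B) (assumed convention).
    return sum(A[i][j].conjugate() * B[i][j]
               for i in range(len(A)) for j in range(len(A[0])))

g = 0.3  # illustrative parameter
K = [
    [[1.0, 0.0], [0.0, math.sqrt(1 - g)]],
    [[0.0, math.sqrt(g)], [0.0, 0.0]],
]

def phi(A):
    # Heisenberg picture: Phi(A) = sum_j K_j* A K_j
    out = [[0, 0], [0, 0]]
    for Kj in K:
        out = add(out, matmul(conj_transpose(Kj), matmul(A, Kj)))
    return out

def phi_dagger(B):
    # Schroedinger picture: Phi^dagger(B) = sum_j K_j B K_j*
    out = [[0, 0], [0, 0]]
    for Kj in K:
        out = add(out, matmul(Kj, matmul(B, conj_transpose(Kj))))
    return out

A = [[1, 2j], [3, 4]]
B = [[0.5, 1 - 1j], [2, -1]]
identity = [[1, 0], [0, 1]]

lhs = hs(phi(A), B)
rhs = hs(A, phi_dagger(B))
unit_image = phi(identity)
trace_after = trace(phi_dagger(B))
print(lhs, rhs, unit_image, trace_after)
```

Replacing K by matrices whose sum of K_j∗K_j is not the identity should break both the unitality of Φ and the trace preservation of Φ† at the same time.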
CHAPTER 3. CONDITIONAL EXPECTATIONS AND PARTIAL TRACE 20
Definition 3.4. Let M be a subalgebra of Mn (C) that is closed under taking adjoints and contains
1. Let ιM : M → Mn (C) be the inclusion map. The conditional expectation EM onto M is ιM ◦ι†M .
Remark. It follows from (a) and (b) that EM is the orthogonal projection onto M (with respect
to the Hilbert–Schmidt inner product). This can be equivalently characterized by the following
properties:
(i) EM (B) ∈ M and ∥B − EM (B)∥ ≤ ∥B − A∥ for all A ∈ M with equality if and only if
A = EM (B).
(ii) EM (B) ∈ M and B − EM (B) ⊥ M.
Example 3.6 (Trace). If M = C1, then EM (A) = (1/n) Tr(A)1.
Remark. As remarked before, every positive linear map φ : Mn (C) → C is of the form φ =
Tr(B^{1/2} · B^{1/2}) for some B ∈ Mn (C)+ . Since A ↦ B^{1/2}AB^{1/2} is completely positive and Tr is
completely positive by the previous example, this implies that every positive map from Mn (C) to
C is completely positive.
Example 3.7 (Restriction to the diagonal). If M consists of all diagonal matrices in Mn (C), then
EM (A) = diag(A11 , . . . , Ann ).
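That restriction to the diagonal is an orthogonal projection can be checked by hand on a small matrix. The plain Python sketch below assumes the Hilbert–Schmidt inner product ⟨A, B⟩HS = Tr(A∗B); the matrices are illustrative. It verifies that A − EM (A) is orthogonal to a diagonal matrix and that applying EM twice changes nothing.

```python
def diagonal_part(A):
    # Conditional expectation onto the diagonal matrices.
    n = len(A)
    return [[A[i][j] if i == j else 0 for j in range(n)] for i in range(n)]

def hs(A, B):
    # Hilbert-Schmidt inner product Tr(A* B) (assumed convention).
    return sum(A[i][j].conjugate() * B[i][j]
               for i in range(len(A)) for j in range(len(A[0])))

A = [[1, 2j], [3, 4]]
D = [[5, 0], [0, -2j]]  # an arbitrary element of the diagonal subalgebra

E_A = diagonal_part(A)
residual = [[A[i][j] - E_A[i][j] for j in range(2)] for i in range(2)]

orthogonality = hs(D, residual)          # should vanish
idempotent = diagonal_part(E_A) == E_A   # E_M is a projection
print(E_A, orthogonality, idempotent)
```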
Example 3.8 (Hadamard product). Let M ⊂ Mn (C) ⊗ Mn (C) be the subalgebra formed by all
elements of the form
\[ \sum_{j,k=1}^{n} A_{jk} E_{jk} \otimes E_{jk} \]
Definition 3.9. The partial trace Tr1 : Mm (C) ⊗ Mn (C) → Mn (C) is the linear map given by
Tr1 (A ⊗ B) = Tr(A)B.
Similarly, the partial trace Tr2 : Mm (C) ⊗ Mn (C) → Mm (C) is the linear map given by
Tr2 (A ⊗ B) = Tr(B)A.
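On matrices, both partial traces amount to summing over one pair of indices. The following plain Python sketch fixes one particular index convention as an assumption, namely that the entry of A ⊗ B in row i·n + k and column j·n + l is A_ij B_kl, and checks the defining identities on an elementary tensor with m = 2 and n = 3.

```python
def kron(A, B):
    # Assumed convention: (A ⊗ B)[i*n + k][j*n + l] = A[i][j] * B[k][l]
    m, n = len(A), len(B)
    return [[A[i][j] * B[k][l] for j in range(m) for l in range(n)]
            for i in range(m) for k in range(n)]

def partial_trace_first(X, m, n):
    # Tr_1: sum over the index of the first tensor factor.
    return [[sum(X[i * n + k][i * n + l] for i in range(m)) for l in range(n)]
            for k in range(n)]

def partial_trace_second(X, m, n):
    # Tr_2: sum over the index of the second tensor factor.
    return [[sum(X[i * n + k][j * n + k] for k in range(n)) for j in range(m)]
            for i in range(m)]

A = [[1, 2], [3, 4]]                   # Tr(A) = 5
B = [[1, 0, 2], [0, 1, 0], [1, 1, 1]]  # Tr(B) = 3

X = kron(A, B)
tr1 = partial_trace_first(X, 2, 3)   # expected: Tr(A) * B
tr2 = partial_trace_second(X, 2, 3)  # expected: Tr(B) * A
print(tr1, tr2)
```

With the opposite ordering of the tensor factors in the block identification, the roles of the two index formulas are simply exchanged.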
Exercises
Exercise 3.1. Show again that the Hadamard product of two positive matrices is positive. (Hint:
Use Example 3.8.)
Exercise 3.2. Let Φ : Mm (C) → Mn (C) be a quantum channel. There exist a unitary U ∈ Mk
with k = mn^2 and a unit vector φ ∈ C^n ⊗ C^n such that
where TrE is the partial trace over the first two factors of Cm ⊗ Cn ⊗ Cn .
Exercise 3.3. Let Tr1 be the partial trace over the first tensor factor of C^m ⊗ C^n. Show that for
any ρ over C^m ⊗ C^n, we have
\[ \frac{1}{m} \otimes \mathrm{Tr}_1 \rho = \int (u \otimes 1) \rho (u^* \otimes 1) \, du, \]
where du denotes the normalized Haar measure on the unitary group over C^m. Or, we have
\[ \frac{1}{m} \otimes \mathrm{Tr}_1 \rho = \frac{1}{m^2} \sum_{j,k=1}^{m} (u_{jk} \otimes 1) \rho (u_{jk}^* \otimes 1), \]
where {u_{jk}}_{1≤j,k≤m} denotes the discrete Heisenberg–Weyl group over C^m, i.e.
\[ u_{jk} = \sum_{l=1}^{m} \eta^{kl} \, |j + l\rangle \langle l|, \qquad \eta := e^{2\pi i / m}. \]
Chapter 4
Quantum States
We begin with a piece of notation physicists are very fond of – the bra-ket notation. Let H be
a Hilbert space. Every vector ξ ∈ H gives rise to a linear map from C to H that maps λ ∈ C
to λξ ∈ H. This linear map is denoted by |ξ⟩. As C is a Hilbert space with the standard inner
product, we can take the adjoint of |ξ⟩, which is denoted by ⟨ξ|.
Not only does every vector ξ ∈ H give rise to a linear map from C to H, the converse is also
true: If Φ : C → H is linear, then Φ = |Φ(1)⟩. For this reason, we will not distinguish between
elements of H and linear maps from C to H. In particular, we identify elements of C with linear
maps from C to C (in other words, we treat 1 × 1 matrices as complex numbers).
With these identifications, the bra-ket notation works very nicely. For example, ⟨ξ| |η⟩ = ⟨ξ, η⟩
and |ξ⟩ ⟨η| is the linear map that sends ζ to ⟨η, ζ⟩ξ.
With this notation, we can write the spectral decomposition of a self-adjoint matrix A with
orthonormal basis (ξ_j) consisting of eigenvectors and associated eigenvalues (λ_j) as
\[ A = \sum_{j=1}^{m} \lambda_j \, |\xi_j\rangle \langle \xi_j|. \]
Definition 4.1. A density matrix or quantum state is a positive semi-definite matrix ρ ∈ Mn (C)
with Tr(ρ) = 1.
Since a density matrix ρ is positive semi-definite, it has an orthonormal basis (ξ_j) consisting
of eigenvectors and the associated eigenvalues (λ_j) are non-negative. Moreover, the condition
Tr(ρ) = 1 is equivalent to $\sum_j \lambda_j = 1$.
Thus, density matrices are exactly the matrices that can be expressed as
\[ \rho = \sum_{j=1}^{n} \lambda_j \, |\xi_j\rangle \langle \xi_j| \]
with ∥ξ_j∥ = 1 and λ_j ≥ 0, $\sum_j \lambda_j = 1$.
Definition 4.2. A quantum state of the form |ξ⟩ ⟨ξ| is called a pure state. Every other quantum
state is called a mixed state.
Equality holds only if λ_j^2 = λ_j for all j ∈ {1, . . . , n}, which means λ_j ∈ {0, 1}, which can
only happen for pure states.
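Assuming the inequality in question is the standard bound Tr(ρ^2) ≤ 1, which in terms of eigenvalues reads λ_1^2 + · · · + λ_n^2 ≤ λ_1 + · · · + λ_n = 1, the equality case can be illustrated with two qubit states. The plain Python sketch below (state choices are illustrative) computes Tr(ρ^2) for a pure state and for the state (1/2)I.

```python
import math

def outer(xi):
    # Matrix of |xi><xi|.
    return [[a * b.conjugate() for b in xi] for a in xi]

def purity(rho):
    # Tr(rho^2) computed entrywise.
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real

s = 1 / math.sqrt(2)
pure_state = outer([s, s * 1j])           # eigenvalues 1 and 0
mixed_state = [[0.5, 0.0], [0.0, 0.5]]    # eigenvalues 1/2 and 1/2

purity_pure = purity(pure_state)
purity_mixed = purity(mixed_state)
print(purity_pure, purity_mixed)
```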
Mixed states can be seen as “shadows” of pure states on a larger Hilbert space. This is the
first instance in this course of the paradigm known as “Church of the Larger Hilbert Space”.
Proposition 4.6. If ρ ∈ Mn (C) is a quantum state, then there exists a pure state σ ∈ Mn (C) ⊗
Mn (C) such that ρ = Tr1 (σ).
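Proof. Let ρ = ∑_{j=1}^n λj |ξj ⟩ ⟨ξj | be the spectral decomposition of ρ and let
ξ = ∑_{j=1}^n √λj ξj ⊗ ξj . Since ∑_j λj = 1, we have
\[ \langle \xi, \xi \rangle = \sum_{i,j=1}^{n} \sqrt{\lambda_i \lambda_j}\, \langle \xi_i, \xi_j \rangle^2 = \sum_{j=1}^{n} \lambda_j = 1. \]
Hence σ = |ξ⟩ ⟨ξ| is a pure state. Moreover,
\[ \mathrm{Tr}_1(\sigma) = \sum_{i,j=1}^{n} \sqrt{\lambda_i \lambda_j}\, \mathrm{Tr}(|\xi_i\rangle \langle\xi_j|)\, |\xi_i\rangle \langle\xi_j| = \sum_{j=1}^{n} \lambda_j \, |\xi_j\rangle \langle\xi_j| = \rho. \]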
Definition 4.7. If ρ ∈ Mn (C) is a quantum state, any pure state σ ∈ Mm (C) ⊗ Mn (C) such that
Tr1 (σ) = ρ is called a purification of ρ.
Remark. By the previous proposition, it is always possible to take m = n. But in general, one
can do much better. In the extreme case when ρ is already a pure state, for example, one can of
course take m = 1.
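The construction behind Proposition 4.6 can be traced in a small computation. The sketch below assumes, purely for simplicity, that ρ is diagonal in the standard basis, so that ξj = ej and all coefficients are real; the function names are illustrative.

```python
import math

def purification_vector(eigenvalues):
    """Coefficients xi[i][j] of xi = sum_j sqrt(lambda_j) e_j (x) e_j in the basis e_i (x) e_j."""
    n = len(eigenvalues)
    return [[math.sqrt(eigenvalues[i]) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def partial_trace_first(xi):
    """Tr_1 |xi><xi| for real coefficients: entry (j, l) equals sum_i xi[i][j] * xi[i][l]."""
    n = len(xi)
    return [[sum(xi[i][j] * xi[i][l] for i in range(n)) for l in range(n)]
            for j in range(n)]

eigenvalues = [0.5, 0.3, 0.2]            # spectrum of a diagonal density matrix
xi = purification_vector(eigenvalues)
norm_squared = sum(c * c for row in xi for c in row)
reduced = partial_trace_first(xi)        # should reproduce diag(eigenvalues)
print(norm_squared, reduced)
```

The vector ξ has norm one and tracing out the first factor returns the original diagonal state, in line with the proof above.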
Definition 4.8. A quantum channel (in the Schrödinger picture) is a completely positive trace-
preserving linear map from Mm (C) to Mn (C).
It is immediate from the definition that quantum channels map quantum states to quantum
states, which makes them suitable to model changes of the state of a physical system.
Exercises
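Let
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]
These matrices are called Pauli matrices. It can be useful to write σ0 for the 2 × 2 identity matrix.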
(a) Show that (σj )_{j=0}^3 is an orthonormal basis of the self-adjoint 2 × 2 matrices with the inner
product ½ ⟨ · , · ⟩_HS .
(b) Show that ρ ∈ M2 (C) is a quantum state if and only if there exists a ∈ R^3 with a1^2 + a2^2 + a3^2 ≤ 1
such that ρ = ½ (I + a1 σ1 + a2 σ2 + a3 σ3 ).
The equivalence in (b) establishes a bijection between quantum states in M2 (C) and the unit ball
of R3 . This graphical representation of qubit states is called the Bloch sphere.
Figure 4.1: Bloch sphere. Points denoted by |ξ⟩ correspond to the pure states |ξ⟩ ⟨ξ|. The ONB of
C2 is denoted by |0⟩, |1⟩.
The map a ↦ ½ (I + a1 σ1 + a2 σ2 + a3 σ3 ) is affine, that is, it preserves convex combinations. Thus
points on the surface of the unit ball correspond to pure states, while interior points correspond to
mixed states. The center of the unit ball corresponds to the state ½ I, which is called the maximally
mixed state.
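The correspondence between Bloch vectors and qubit states is easy to explore numerically. The following sketch uses only the formula ρ = ½ (I + a1 σ1 + a2 σ2 + a3 σ3 ) from the exercise; the function names are illustrative, and the eigenvalue formula (1 ± |a|)/2 is the standard one for 2 × 2 matrices of this form.

```python
import math

def bloch_to_state(a):
    """Return rho = (I + a1*sigma1 + a2*sigma2 + a3*sigma3) / 2 as a 2x2 list of rows."""
    a1, a2, a3 = a
    return [[(1 + a3) / 2, (a1 - 1j * a2) / 2],
            [(a1 + 1j * a2) / 2, (1 - a3) / 2]]

def bloch_eigenvalues(a):
    """Eigenvalues (1 - |a|)/2 and (1 + |a|)/2 of the state with Bloch vector a."""
    r = math.sqrt(sum(x * x for x in a))
    return ((1 - r) / 2, (1 + r) / 2)

def purity(rho):
    """Tr(rho^2) for a 2x2 matrix."""
    return sum((rho[i][k] * rho[k][i]).real for i in range(2) for k in range(2))

surface_point = (0.6, 0.0, 0.8)   # |a| = 1, so the state is pure
center = (0.0, 0.0, 0.0)          # maximally mixed state

print(bloch_eigenvalues(surface_point), purity(bloch_to_state(surface_point)))
print(bloch_eigenvalues(center), purity(bloch_to_state(center)))
```

Points with |a| = 1 give purity Tr(ρ^2) = 1, while the center gives Tr(ρ^2) = ½, the smallest possible value for a qubit.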
Chapter 5
Since projections are positive semi-definite, every PVM is a POVM. The measurement inter-
pretation for POVMs is the same as for PVMs: The probability to measure outcome i when the
system is in state ρ is given by Tr(Pi ρ).
This measurement, known as the Helstrom measurement, is used to optimally distinguish between the
states ρ1 and ρ2 .
CHAPTER 5. POVMS AND QUANTUM MEASUREMENTS 27
Example 5.4. For every d ∈ N the family ((1/d) · 1)_{i=1}^d is a POVM, although it is not very
interesting as a measurement: Tr((1/d) · ρ) = 1/d for every quantum state ρ, so it does not help to
distinguish between states of a system.
There is a one-to-one correspondence between POVMs on {1, . . . , n} with values in Mm (C) and
a special class of quantum channels from Mm (C) to Mn (C), called quantum-to-classical channels.
Example 5.6. The dephasing channel Φ : Mn (C) → Mn (C), A ↦ diag(A11 , . . . , Ann ) is a quantum-
to-classical channel.
More generally, if Ψ : Mm (C) → Mn (C) is any quantum channel and Φ : Mn (C) → Mn (C) is the
dephasing channel, then Φ ◦ Ψ is a quantum-to-classical channel. Vice versa, if Ψ : Mm (C) → Mn (C)
is a quantum-to-classical channel, then Φ ◦ Ψ = Ψ, hence all quantum-to-classical channels are of
this form.
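Proposition 5.7. (a) For every quantum-to-classical channel Φ : Mm (C) → Mn (C) there exists
a unique POVM (Pi )_{i=1}^n on {1, . . . , n} with values in Mm (C) such that
\[ \Phi(A) = \sum_{i=1}^{n} \mathrm{Tr}(P_i A)\, E_{ii} \]
for all A ∈ Mm (C).
(b) Conversely, for every POVM (Pi )_{i=1}^n with values in Mm (C), the map Φ defined by the formula in (a)
is a quantum-to-classical channel.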
Proof. (a) Let Pi = Φ† (Eii ). Since Φ is completely positive, so is Φ† , hence Pi is positive. Moreover,
since Φ is trace-preserving, Φ† is unital, which implies
\[ \sum_{i=1}^{n} P_i = \sum_{i=1}^{n} \Phi^\dagger(E_{ii}) = \Phi^\dagger(1) = 1. \]
Just like mixed states are “shadows” of pure states, POVMs are “shadows” of PVMs. This is
another instance of the church of the larger Hilbert space.
Theorem 5.8 (Naimark dilation theorem). If (Pi )i∈I is a POVM with values in Mm (C), then
there exists an isometry V : C^m → C^k and a PVM (Qi )i∈I with values in Mk (C) such that
\[ P_i = V^* Q_i V \]
for all i ∈ I.
Proof. Let Φ be the quantum-to-classical channel associated with (Pi )i∈I by the previous proposi-
tion. By the Stinespring dilation theorem, there exists V : Cm → Ck and a unital ∗-homomorphism
π : Mn (C) → Mk (C) such that Φ† (A) = V ∗ π(A)V for all A ∈ Mn (C). Since Φ is trace-preserving,
Φ† is unital, which implies V ∗ V = 1.
Let E denote the dephasing channel on Mk (C) and Ψ = E ◦ π † , which is a quantum-to-classical
channel. Moreover, since E † = E is the identity on diagonal matrices, the Hilbert–Schmidt adjoint
of Ψ acts as π on the diagonal matrices, which is a ∗-homomorphism. Thus the POVM given by
Qi = Ψ† (Eii ) is a PVM.
Finally,
V ∗ Qi V = V ∗ π(E † (Eii ))V = V ∗ π(Eii )V = Φ† (Eii ) = Pi .
Chapter 6
In this chapter we will see the first glimpse of a quantum entropy, namely the von Neumann
entropy. It is the trace of a matrix-valued function, which connects it with the other part of the
title of this course, the trace inequalities. More specifically, we will investigate monotonicity and
convexity properties of maps of the form A ↦ Tr(f (A)) in this chapter.
Lemma 6.1 (Peierls Inequality). If A ∈ Mn (C) is self-adjoint, f : R → R is convex and u1 , . . . , un
is an orthonormal basis of C^n , then
\[ \sum_{j=1}^{n} f(\langle u_j, A u_j \rangle) \le \mathrm{Tr} f(A), \]
CHAPTER 6. BASIC TRACE INEQUALITIES AND CONVEXITY/CONCAVITY RESULTS
Proof. Let us first consider the case when f is a polynomial. Otherwise replacing A by A + tB, it
suffices to prove differentiability at 0. We have
and thus
Tr((A + tB)^m ) = Tr(A^m ) + t Tr(mA^{m−1} B) + o(t).
Hence the statement holds when f is a monomial, and by linearity also when f is a polynomial.
Now let f ∈ C^1 (R) be arbitrary. For T > 0 we have
φk : R → R, t ↦ Tr(pk (A + tB)).
uniformly on [−T, T ].
Since T > 0 was arbitrary, the function φ is differentiable with
Remark. It was crucial in the proof that the trace is invariant under cyclic permutations. In
general, it is not true that d/dt |_{t=0} f (A + tB) = f ′ (A)B, as simple examples show. In the exercises
you will be asked to give a correct formula for this derivative in terms of the spectral decompositions
of A and B.
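One of the simple examples alluded to in the remark can be worked out by hand for f (x) = x^2 . Expanding (A + tB)^2 shows that the derivative at t = 0 is AB + BA, whereas f ′ (A)B = 2AB. The sketch below, with an arbitrarily chosen non-commuting pair A, B (an illustrative choice, not taken from the notes), confirms that the two matrices differ while their traces agree.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

A = [[1, 0], [0, 2]]
B = [[1, 1], [1, 0]]

# Exact derivative of t -> (A + tB)^2 at t = 0: the linear term of the expansion.
matrix_derivative = add(matmul(A, B), matmul(B, A))
# The naive guess f'(A) B with f'(x) = 2x.
naive_guess = [[2 * x for x in row] for row in matmul(A, B)]

print(matrix_derivative, naive_guess)
print(trace(matrix_derivative), trace(naive_guess))
```

The discrepancy disappears under the trace precisely because Tr(AB) = Tr(BA).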
Proof. First let f be monotone increasing and let A, B ∈ Mn (C) with A ≤ B. We have to show
that
Tr(f (A)) ≤ Tr(f (B)).
We can assume without loss of generality that f is continuously differentiable. By the previous
lemma we have
\begin{align*}
\mathrm{Tr}(f(B)) - \mathrm{Tr}(f(A)) &= \int_0^1 \frac{d}{dt} \mathrm{Tr}(f(A + t(B - A))) \, dt \\
&= \int_0^1 \mathrm{Tr}(f'(A + t(B - A))(B - A)) \, dt.
\end{align*}
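Now let f be convex, A, B ∈ Mn (C) be self-adjoint and λ ∈ [0, 1]. Let (uj ) be an orthonormal
basis of C^n consisting of eigenvectors of λA + (1 − λ)B. By convexity of f and Peierls inequality
we have
\begin{align*}
\mathrm{Tr}(f(\lambda A + (1-\lambda)B)) &= \sum_{j=1}^{n} \langle u_j, f(\lambda A + (1-\lambda)B) u_j \rangle \\
&= \sum_{j=1}^{n} f(\langle u_j, (\lambda A + (1-\lambda)B) u_j \rangle) \\
&\le \sum_{j=1}^{n} \big( \lambda f(\langle u_j, A u_j \rangle) + (1-\lambda) f(\langle u_j, B u_j \rangle) \big) \\
&\le \lambda \mathrm{Tr}(f(A)) + (1-\lambda) \mathrm{Tr}(f(B)).
\end{align*}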
Remark. That we can assume f to be continuously differentiable in the first part of the proof
will be justified in the exercises.
Corollary 6.4. For self-adjoint A ∈ Mn (C) and λ ∈ R let N (A, λ) be the number of eigenvalues
of A less than or equal to λ, counted with multiplicity. If A ≤ B, then N (A, λ) ≥ N (B, λ) for all
λ ∈ R.
is differentiable with
φ′ (0) = Tr(f ′ (B)(A − B)).
Moreover, by the previous theorem, φ is convex. Thus
Tr(f ′ (B)(A − B)) = φ′ (0) ≤ φ(1) − φ(0) = Tr(f (A)) − Tr(f (B)).
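Proof. Let
\[ \varphi : \mathbb{R}^n \to \mathbb{R}, \quad x \mapsto \log\Big( \sum_{k=1}^{n} e^{x_k} \Big). \]
where
\[ a_j = \frac{e^{x_j}}{\sum_{k=1}^{n} e^{x_k}}. \]
For any y ∈ R^n we have
\[ \sum_{j,k=1}^{n} \frac{\partial^2 \varphi(x)}{\partial x_j \partial x_k} y_j y_k = \sum_{j=1}^{n} a_j y_j^2 - \sum_{j,k=1}^{n} a_j a_k y_j y_k = \sum_{j=1}^{n} a_j y_j^2 - \Big( \sum_{j=1}^{n} a_j y_j \Big)^2 \ge 0 \]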
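\[ \mathrm{Tr}(H_\varepsilon \rho) - \log \mathrm{Tr}(e^{H_\varepsilon}) = \mathrm{Tr}(\rho \log(\rho + \varepsilon 1)) - \log(1 + \varepsilon n) \to \mathrm{Tr}(\rho \log \rho) \]
as ε ↘ 0. Thus
\[ \mathrm{Tr}(\rho \log \rho) \le \sup\{ \mathrm{Tr}(H\rho) - \log \mathrm{Tr}(e^{H}) \mid H \in M_n(\mathbb{C})_{sa} \}. \]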
In the previous theorem we encountered one of the central quantities in this course, the von
Neumann entropy. Moreover, another entropy quantity was hidden in the proof, namely the relative
entropy, which we will re-encounter later.
Definition 6.8. The von Neumann entropy S(ρ) of a quantum state ρ is defined as
\[ S(\rho) = -\mathrm{Tr}(\rho \log \rho). \]
Definition 6.9. Given a self-adjoint matrix H ∈ Mn (C) and β ∈ [−∞, ∞], the Gibbs state for
the Hamiltonian H at inverse temperature β is the density matrix ρβ,H given by
\[ \rho_{\beta,H} = \frac{1}{\mathrm{Tr}(e^{-\beta H})} e^{-\beta H} \]
if β ∈ R and ρ±∞,H = lim_{β→±∞} ρβ,H .
Gibbs states are the equilibrium states for systems with Hamiltonian H for fixed energy of the
system. Mathematically, this can be expressed as follows.
Theorem 6.10. Let H ∈ Mn (C) be a self-adjoint matrix with eigenvalues λ1 ≤ · · · ≤ λn . For
each E ∈ [λ1 , λn ] there exists β ∈ [−∞, ∞] such that E = Tr(Hρβ,H ) and
Proof. Let ρ be a quantum state with Tr(ρH) = E. Assume there exists a Gibbs state ρβ,H such
that Tr(ρβ,H H) = E (we will show this afterwards). By the duality formula for the quantum entropy,
To show that Tr(ρβ,H H) takes all values between λ1 and λn , we use the intermediate value
theorem. We have
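\begin{align*}
\frac{d}{d\beta} \mathrm{Tr}(\rho_{\beta,H} H) &= \frac{d}{d\beta} \frac{\mathrm{Tr}(e^{-\beta H} H)}{\mathrm{Tr}(e^{-\beta H})} \\
&= -\frac{\mathrm{Tr}(e^{-\beta H} H^2)}{\mathrm{Tr}(e^{-\beta H})} + \Big( \frac{\mathrm{Tr}(e^{-\beta H} H)}{\mathrm{Tr}(e^{-\beta H})} \Big)^2 \\
&= \mathrm{Tr}(\rho_{\beta,H} H)^2 - \mathrm{Tr}(\rho_{\beta,H} H^2).
\end{align*}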
we deduce
Hence Tr(ρβ,H H) takes all values between λ1 and λn for β ∈ [−∞, ∞].
Uniqueness of the Gibbs state with energy E follows from the strict monotonicity of β ↦
Tr(ρβ,H H) if H is not a multiple of the identity, while in the case H = E1 we have ρβ,H = (1/n) 1
independently of β.
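The monotonicity argument can be illustrated numerically. Since ρβ,H commutes with H, only the eigenvalues of H matter, so the sketch below works with a list of eigenvalues (an assumed example spectrum) and compares a finite-difference derivative of the energy with the variance formula derived in the proof. The function names are illustrative.

```python
import math

def gibbs_probabilities(eigenvalues, beta):
    """Eigenvalues of the Gibbs state when H has the given eigenvalues."""
    exponents = [-beta * e for e in eigenvalues]
    shift = max(exponents)                       # subtracted for numerical stability
    weights = [math.exp(x - shift) for x in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def energy(eigenvalues, beta):
    """Tr(rho_{beta,H} H)."""
    probs = gibbs_probabilities(eigenvalues, beta)
    return sum(p * e for p, e in zip(probs, eigenvalues))

def energy_variance(eigenvalues, beta):
    """Tr(rho_{beta,H} H^2) - Tr(rho_{beta,H} H)^2."""
    probs = gibbs_probabilities(eigenvalues, beta)
    mean = sum(p * e for p, e in zip(probs, eigenvalues))
    return sum(p * e * e for p, e in zip(probs, eigenvalues)) - mean ** 2

spectrum = [0.0, 1.0, 3.0]   # assumed eigenvalues of H
beta, h = 0.7, 1e-5
finite_difference = (energy(spectrum, beta + h) - energy(spectrum, beta - h)) / (2 * h)
print(finite_difference, -energy_variance(spectrum, beta))
print(energy(spectrum, -50.0), energy(spectrum, 50.0))
```

The derivative is minus the variance of H in the Gibbs state, hence negative unless H is a multiple of the identity, and the energy sweeps from λn (as β → −∞) down to λ1 (as β → ∞).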
Proof. If A = X ⊗ Y , then
For a density matrix ρ ∈ Mm (C)⊗Mn (C) we write ρ1 and ρ2 for Tr2 (ρ) and Tr1 (ρ), respectively.
Exercises
Exercise 6.1. Let f : R → R be continuously differentiable and let A, B ∈ Mn (C) be self-adjoint.
Show that the map
Φ : R → Mn (C), t ↦ f (A + tB)
is differentiable and express Φ′ (0) in terms of the spectral decomposition of A and B.
Exercise 6.2. Show that for every A ∈ Mn (C) the map t ↦ e^{tA} is differentiable with
d/dt e^{tA} = Ae^{tA} (Hint: Use that e^{tA} = ∑_{k=0}^∞ (t^k /k!) A^k ).
Exercise 6.3. Let λ1 , . . . , λn , µ1 , . . . , µn ∈ R with λ1 < λ2 < · · · < λn and µ1 ≤ µ2 ≤ · · · ≤ µn .
(a) Show that there exists a continuously differentiable increasing function f : R → R such
that f (λk ) = µk for all k ∈ {1, . . . , n}. Show that f can be chosen strictly increasing if
µ1 < · · · < µn .
(b) Let g : R → R be increasing and let A1 , . . . , Am ∈ Mn (C) be self-adjoint. Show that there
exists a continuously differentiable increasing function f : R → R such that f (Ak ) = g(Ak )
for all k ∈ {1, . . . , m}.
\[ \log \frac{\mathrm{Tr}\, e^{A+B}}{\mathrm{Tr}\, e^{A}} \ge \frac{\mathrm{Tr}(e^{A} B)}{\mathrm{Tr}\, e^{A}}. \]
Exercise 6.5. (a) Let ω : Mn (C) → C be a linear functional with ω(1) = 1. Show that ω(A) ≥ 0 for
all A ∈ Mn (C)+ if and only if ∥ω∥ = 1.
(b) Show that there exists a bijection
such that f (λρ+(1−λ)σ) = λf (ρ)+(1−λ)f (σ) for all ρ, σ ∈ Mn (C)+ with Tr(ρ) = Tr(σ) = 1
and λ ∈ [0, 1].
Chapter 7
CHAPTER 7. OPERATOR MONOTONICITY AND CONVEXITY 38
Remark. If S, T ∈ Mn (C) are invertible, then T S = T (ST )T −1 , that is, ST and T S are similar.
Thus they have the same eigenvalues. In particular, r(ST ) = r(T S) as used in the proof of the
previous theorem. In general σ(ST ) \ {0} = σ(T S) \ {0}, thus r(ST ) = r(T S).
Definition 7.5. A function f : (0, ∞) → R is said to be operator convex if for any n ∈ N, any
positive definite matrices A, B ∈ Mn (C) and any λ ∈ (0, 1) we have
\[ f(\lambda A + (1-\lambda)B) \le \lambda f(A) + (1-\lambda) f(B). \]
As with operator monotone functions, every operator convex (resp. operator concave) function
is convex (resp. concave), but the converse is not true.
Example 7.6. The square function f (x) = x2 is operator convex. In fact, for any positive semi-
definite A, B and λ ∈ (0, 1), we have
Example 7.7. The cube function f (x) = x^3 is not operator convex. In fact, if f is operator
convex, then we must have (since A + tB = (1 − t)A + t(A + B))
\[ (A + tB)^3 \le (1-t) A^3 + t (A + B)^3 \]
for any positive semi-definite A, B and t ∈ (0, 1). The above inequality can be reformulated as
\[ \frac{(A + tB)^3 - A^3}{t} \le (A + B)^3 - A^3. \]
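Letting t → 0^+ , we get
\[ B^3 + B^2 A + BAB + AB^2 \ge 0. \]
Now we choose
\[ A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \]
Then B = B^2 = B^3 = BAB and
\[ B^3 + B^2 A + BAB + AB^2 = \begin{pmatrix} 4 & 1 \\ 1 & 0 \end{pmatrix}, \]
which is not positive semi-definite since its determinant is −1.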
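The matrix computation in this example is short enough to verify directly. The sketch below recomputes B^3 + B^2 A + BAB + AB^2 for the matrices A and B chosen above and tests positivity through the determinant, using the elementary fact that a symmetric 2 × 2 matrix with negative determinant has a negative eigenvalue. The helper names are illustrative.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(*matrices):
    """Entrywise sum of several matrices of the same size."""
    n = len(matrices[0])
    return [[sum(M[i][j] for M in matrices) for j in range(n)] for i in range(n)]

A = [[1, 1], [1, 1]]
B = [[1, 0], [0, 0]]

B2 = matmul(B, B)
B3 = matmul(B2, B)
M = add(B3, matmul(B2, A), matmul(matmul(B, A), B), matmul(A, B2))
determinant = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(M, determinant)
```

A negative determinant means the two eigenvalues have opposite signs, so the inequality B^3 + B^2 A + BAB + AB^2 ≥ 0 fails.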
Proof. Let A, B ∈ Mn (C) be positive definite and let C = A^{−1/2} BA^{−1/2} . For λ ∈ [0, 1] we have
\[ (\lambda + (1-\lambda)x)^{-1} \le \lambda + (1-\lambda)x^{-1} \]
To show the operator convexity/concavity of x ↦ x^p for some (but not all, see the exercises)
other values of p, we will use the following integral representations.
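Lemma 7.9. For positive definite A ∈ Mn (C), the following integral formulas hold.
\begin{align*}
A^p &= \frac{\sin((p+1)\pi)}{\pi} \int_0^\infty t^p (t1 + A)^{-1} \, dt && \text{for } p \in (-1, 0), \\
A^p &= \frac{\sin(p\pi)}{\pi} \int_0^\infty t^p \big( t^{-1} 1 - (t1 + A)^{-1} \big) \, dt && \text{for } p \in (0, 1), \\
A^p &= \frac{\sin((p-1)\pi)}{\pi} \int_0^\infty t^{p-1} \big( t^{-1} A + t(t1 + A)^{-1} - 1 \big) \, dt && \text{for } p \in (1, 2).
\end{align*}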
Proof. Exercise.
Proof. From the integral identities in the previous lemma, it suffices to show that for any t > 0,
x ↦ −(t + x)^{−1} is operator monotone and operator concave.
Proof. Exercise.
Theorem 7.13. Let f be a (continuous) function that maps (0, ∞) into itself. Then the following
are equivalent:
(a) f is operator monotone;
(b) f is operator concave.
Both of them imply
Proof. We first show (b) =⇒ (c). Assume (b), then for any λ ∈ (0, 1) and any positive definite
A, B, we have
\[ f(\lambda A + (1-\lambda)B) \ge \lambda f(A) + (1-\lambda) f(B). \]
By the operator monotonicity and operator concavity of x ↦ −x^{−1} , we then have
So we have (c).
Now we prove the equivalence of (a) and (b). Assume (b). Then for any 0 ≤ A ≤ B, we will
show that f (A) ≤ f (B). For this note that for any λ ∈ (0, 1):
\[ \lambda B = \lambda A + (1-\lambda) \frac{\lambda}{1-\lambda} (B - A). \]
By operator concavity, we have
\[ f(\lambda B) \ge \lambda f(A) + (1-\lambda) f\Big( \frac{\lambda}{1-\lambda} (B - A) \Big). \]
Since f ≥ 0 and B − A ≥ 0, we get f (λB) ≥ λf (A) for any λ ∈ (0, 1). Letting λ → 1^− , we get by
continuity that f (B) ≥ f (A). So f is operator monotone and we have (a).
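Now assume (a). Let A, B ∈ Mn (C)++ and λ ∈ [0, 1]. Write 1n for the unit matrix in Mn (C).
Define the unitary matrix U ∈ M2n (C) by
\[ U = \begin{pmatrix} \lambda^{1/2} 1_n & -(1-\lambda)^{1/2} 1_n \\ (1-\lambda)^{1/2} 1_n & \lambda^{1/2} 1_n \end{pmatrix}. \]
Then
\[ U^* \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} U = \begin{pmatrix} \lambda A + (1-\lambda)B & \lambda^{1/2}(1-\lambda)^{1/2}(B - A) \\ \lambda^{1/2}(1-\lambda)^{1/2}(B - A) & (1-\lambda)A + \lambda B \end{pmatrix}. \]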
if µ ≥ ∥λA + (1 − λ)B∥.
By the Schur complement theorem,
\[ \begin{pmatrix} \varepsilon 1_n & D \\ D & \mu 1_n \end{pmatrix} \ge 0 \]
Hence λf (A) + (1 − λ)f (B) ≤ f (λA + (1 − λ)B + ε1n ). Letting ε ↘ 0 yields (by continuity of f )
λf (A) + (1 − λ)f (B) ≤ f (λA + (1 − λ)B) as desired.
Remark. With a little more work one can show that every operator monotone function is au-
tomatically continuous. From the proof, we see that we only need f ≥ 0 in deriving operator
monotonicity from operator concavity. It is not true if we don’t have f ≥ 0. For example,
f (x) = −x log x is operator concave, but it is not even scalar monotone.
Lemma 7.14 (Dilation of contractions). If A ∈ Mn (C) with A∗ A ≤ 1, then there exists m ≥ n
and a unitary U ∈ Mm (C) such that P U |Cn = A, where P : Cm → Cn is the projection onto the
first n coordinates.
Therefore
f (Φ(A)) = f (W ∗ π(A)W ) ≤ W ∗ π(f (A))W = Φ(f (A)).
On the other hand, if π is unital, then 1 = π(1) = V ∗ V , hence C = 0. Then
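\[ \frac{1}{2} U^* \begin{pmatrix} f(\pi(A)) & 0 \\ 0 & f(0)1 \end{pmatrix} U + \frac{1}{2} V^* \begin{pmatrix} f(\pi(A)) & 0 \\ 0 & f(0)1 \end{pmatrix} V = \begin{pmatrix} W^* f(\pi(A)) W & 0 \\ 0 & B f(A) B + f(0) W W^* \end{pmatrix}. \]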
Thus f (W ∗ π(A)W ) ≤ W ∗ f (π(A))W , and we conclude as before.
Corollary 7.16. If f : [0, ∞) → R is operator convex, then
\[ f\Big( \sum_{j=1}^{m} V_j^* A_j V_j \Big) \le \sum_{j=1}^{m} V_j^* f(A_j) V_j \]
for all A1 , . . . , Am ∈ Mn (C) and all V1 , . . . , Vm ∈ Mn,k (C) with ∑_{j=1}^m Vj∗ Vj = 1.
If additionally f (0) ≤ 0, the same inequality holds under the assumption ∑_{j=1}^m Vj∗ Vj ≤ 1.
Exercises
Exercise 7.1. For a continuously differentiable function f : (0, ∞) → R let
\[ Df : (0, \infty)^2 \to \mathbb{R}, \quad (\lambda, \mu) \mapsto \begin{cases} \dfrac{f(\lambda) - f(\mu)}{\lambda - \mu} & \text{if } \lambda \ne \mu, \\ f'(\lambda) & \text{if } \lambda = \mu. \end{cases} \]
Show that f is operator monotone if and only if for all n ∈ N and λ1 , . . . , λn > 0 the matrix
[Df (λj , λk )]j,k is positive semi-definite.
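To get a feeling for the criterion in Exercise 7.1, one can evaluate the matrix [Df (λj , λk )] for n = 2. The sketch below does this for f (x) = √x and f (x) = x^2 at the (arbitrarily chosen) points 1 and 4; the function names are illustrative. A single 2 × 2 test can only refute operator monotonicity, it cannot prove it.

```python
import math

def divided_difference(f, df, lam, mu):
    """Df(lam, mu) as defined in Exercise 7.1."""
    return df(lam) if lam == mu else (f(lam) - f(mu)) / (lam - mu)

def loewner_matrix(f, df, points):
    """Matrix [Df(lam_j, lam_k)] for the given points."""
    return [[divided_difference(f, df, lam, mu) for mu in points] for lam in points]

def is_psd_2x2(M, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its diagonal entries and determinant are >= 0."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return M[0][0] >= -tol and M[1][1] >= -tol and det >= -tol

points = [1.0, 4.0]
sqrt_matrix = loewner_matrix(math.sqrt, lambda x: 0.5 / math.sqrt(x), points)
square_matrix = loewner_matrix(lambda x: x * x, lambda x: 2 * x, points)

print(sqrt_matrix, is_psd_2x2(sqrt_matrix))
print(square_matrix, is_psd_2x2(square_matrix))
```

For x^2 the matrix has entries λj + λk and determinant −(λ1 − λ2 )^2 < 0, so the criterion shows that the square function is not operator monotone, while the square root passes this particular test.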
Exercise 7.2. Show that the set
E = {log f | f : (0, ∞) → (0, ∞) operator monotone}
is convex.
Exercise 7.3. Show the integral formulas from Lemma 7.9.
Exercise 7.4. Show that
1. f (x) = log x is operator concave and operator monotone;
2. f (x) = x log x is operator convex;
3. f (x) = (x − 1)/ log x is operator concave and operator monotone.
Hint: we have
\[ \log x = \int_0^\infty \Big( \frac{1}{t+1} - \frac{1}{t+x} \Big) dt, \qquad x \log x = \lim_{p \to 1^+} \frac{x^p - x}{p - 1}, \qquad \frac{x-1}{\log x} = \int_0^1 x^\alpha \, d\alpha. \]
Exercise 7.5. Show that x ↦ x^p is neither operator convex nor operator concave if p ∉ [−1, 2].
Exercise 7.6. Give an example of an operator concave function f : (0, 1) → R that is not operator
monotone.
Exercise 7.7. Show that every operator monotone function f : (0, ∞) → R is continuous.
Exercise 7.8. Show that if A ∈ Mn (C) and f : [0, ∞) → R, then Af (A∗ A) = f (AA∗ )A.
Exercise 7.9. Let f : [0, ∞) → R be a continuous function such that f (Φ(A)) ≤ Φ(f (A)) for all
n ∈ N, A ∈ Mn (C) and all unital completely positive maps Φ : Mn (C) → Mn (C). Show that f is
operator convex.
Chapter 8
Theorem 8.1 (Lieb’s concavity theorem and Ando’s convexity theorem). For K ∈ Mn (C) the
function
\[ M_n(\mathbb{C})_+ \times M_n(\mathbb{C})_+ \to \mathbb{C}, \quad (A, B) \mapsto \mathrm{Tr}(K^* A^p K B^{1-p}), \]
is jointly concave if p ∈ [0, 1] and jointly convex if p ∈ [−1, 0].
Remark 8.2. The parameters can be more general, as we shall see later. The convexity result is
named after Ando as he proved it in 1979, but this result is contained in another result of Lieb in
the same 1973 paper.
Corollary 8.3. The quantum relative entropy
Let us come back to Lieb’s concavity theorem. It now has a lot of proofs. We will give two here,
the first one being Lieb’s original proof using interpolation. For this, let us recall the three-line
lemma first.
Lemma 8.4. Let S := {z ∈ C : 0 < ℜz < 1} be the open strip and denote by S̄ its closure.
Suppose that f : S̄ → C is a bounded function such that
CHAPTER 8. LIEB’S CONCAVITY THEOREM 46
1. f is analytic in S;
2. f is continuous on S̄;
3. M_k := sup{|f (k + iy)| : y ∈ R} < ∞ for k = 0, 1.
Then for any θ ∈ [0, 1], we have |f (θ)| ≤ M_0^{1−θ} M_1^{θ} .
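Proof of Lieb’s concavity theorem. To prove the joint concavity of
\[ (A, B) \mapsto \mathrm{Tr}\, A^p K^* B^{1-p} K, \quad 0 \le p \le 1, \]
it suffices to prove the concavity of
\[ A \mapsto \mathrm{Tr}\, A^p K^* A^{1-p} K, \quad 0 \le p \le 1. \]
In fact, this is a doubling dimension trick:
\[ \mathrm{Tr}\, A^p K^* B^{1-p} K = \mathrm{Tr} \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}^{p} \begin{pmatrix} 0 & K^* \\ 0 & 0 \end{pmatrix} \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}^{1-p} \begin{pmatrix} 0 & 0 \\ K & 0 \end{pmatrix}. \]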
For any positive semi-definite A1 , A2 and λ ∈ (0, 1), put A := λA1 + (1 − λ)A2 . We need to show
that
\[ \lambda \mathrm{Tr}\, A_1^p K^* A_1^{1-p} K + (1-\lambda) \mathrm{Tr}\, A_2^p K^* A_2^{1-p} K \le \mathrm{Tr}\, A^p K^* A^{1-p} K. \]
By approximation, we may assume that A1 , A2 and A are all positive definite. Set M := A^{(1−p)/2} KA^{p/2} .
For k = 1, 2, consider the function
\[ f_k(z) := \mathrm{Tr}\, A_k^{z} A^{-z/2} M^* A^{-(1-z)/2} A_k^{1-z} A^{-(1-z)/2} M A^{-z/2}, \quad z \in \bar{S}. \]
Then what we need to show can be reformulated as
\[ \lambda f_1(p) + (1-\lambda) f_2(p) \le \mathrm{Tr}\, M^* M. \]
We claim that the function fk is uniformly bounded on S̄. In fact, denote Gk (z) := A^{−z/2} A_k^{z} A^{−z/2} ;
then we may write fk as fk (z) = TrM ∗ Gk (1 − z)M Gk (z). By Cauchy–Schwarz,
\[ |f_k(z)| \le \big( \mathrm{Tr}\, M M^* G_k(1-z) G_k(1-z)^* \big)^{1/2} \big( \mathrm{Tr}\, M^* M\, G_k(z) G_k(z)^* \big)^{1/2}. \]
For z = x + iy, we have
\[ \mathrm{Tr}\, M M^* G_k(1-z) G_k(1-z)^* \le \|A^{-1}\|^{2(1-x)} \|A_k\|^{2(1-x)}\, \mathrm{Tr}\, M^* M, \]
which is uniformly bounded. Similarly TrM ∗ M Gk (z)Gk (z)∗ is also uniformly bounded. This
proves the claim, so that we can apply the three-line lemma to f (z) := λf1 (z) + (1 −
λ)f2 (z). When ℜz = 0:
\[ f_k(iy) = \mathrm{Tr} \Big[ A_k^{iy/2} A^{-iy/2} M^* A^{(-1+iy)/2} A_k^{(1-iy)/2} \Big] \cdot \Big[ A_k^{(1-iy)/2} A^{(-1+iy)/2} M A^{-iy/2} A_k^{iy/2} \Big]. \]
By Cauchy–Schwarz:
\[ |f_k(iy)| \le \mathrm{Tr}\, M^* A^{(-1+iy)/2} A_k A^{-(1+iy)/2} M, \quad k = 1, 2. \]
So for any y ∈ R
\[ |f(iy)| \le \mathrm{Tr}\, M^* A^{(-1+iy)/2} (\lambda A_1 + (1-\lambda) A_2) A^{-(1+iy)/2} M = \mathrm{Tr}\, M^* M. \]
Similarly we can prove |f (1 + iy)| ≤ TrM ∗ M for all y ∈ R. This concludes the proof by the three-line
lemma.
Now we give another proof using perspective functions. We start with an even more gen-
eral convexity/concavity result, which reduces most of the other results in this chapter to easy
corollaries.
Theorem 8.5 (ENG perspective theorem). Let f : [0, ∞) → R be operator convex (resp. operator
concave) and g : (0, ∞) → (0, ∞) operator concave. Assume that f (0) ≤ 0 (resp. f (0) ≥ 0). Then
the map
Proof. We only prove the jointly convex case. The jointly concave case follows by replacing f by
−f .
Let A1 , A2 ∈ Mn (C)++ , B1 , B2 ∈ Mn (C)+ , λ ∈ [0, 1] and define A = λA1 + (1 − λ)A2 , B =
λB1 + (1 − λ)B2 . Let V1 = (λg(A1 ))^{1/2} g(A)^{−1/2} , V2 = ((1 − λ)g(A2 ))^{1/2} g(A)^{−1/2} . Since g is
operator concave, we have
is jointly concave if p, q ∈ [0, 1] and jointly convex if p ∈ [1, 2] and q ∈ [0, 1].
Proof. As x 7→ xp is operator concave for p ∈ [0, 1] and operator convex for p ∈ [1, 2], the result
follows immediately from the ENG perspective theorem.
We are now in the position to prove Lieb’s concavity theorem and Ando’s convexity theorem.
Proof of Theorem 8.1. Equip Mn (C) with the Hilbert–Schmidt inner product
and the joint convexity resp. joint concavity follows from the previous corollary.
is jointly concave when 0 < p < 1 and jointly convex when −1 < p < 0.
is jointly concave.
Proof. Since f (x) = x1/2 and g(x) = x are operator concave, and f (0) = 0, this result follows
directly from the ENG perspective theorem.
is jointly concave.
Theorem 8.10 (Arithmetic-geometric-harmonic mean inequality). For all positive definite ma-
trices A, B, we have
\[ M_{-1}(A, B) \le M_0(A, B) \le M_1(A, B). \]
which is equivalent to
\[ \Big( \frac{1 + A^{1/2} B^{-1} A^{1/2}}{2} \Big)^{-1} \le (A^{-1/2} B A^{-1/2})^{1/2}. \]
This is true by the scalar inequality ((1 + x)/2)^{−1} ≤ x^{−1/2} and the functional calculus.
The second inequality is
\[ A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} \le \frac{A + B}{2}, \]
which is equivalent to
\[ (A^{-1/2} B A^{-1/2})^{1/2} \le \frac{1 + A^{-1/2} B A^{-1/2}}{2}. \]
This follows from the scalar inequality √x ≤ (1 + x)/2 and the functional calculus.
\[ (A, B) \mapsto B^* A^{-1} B \]
is jointly convex.
Proof. Let A1 , A2 ∈ Mn (C)++ , B1 , B2 ∈ Mn (C) and t ∈ [0, 1]. By the Schur complement theorem,
\[ \begin{pmatrix} A_1 & B_1 \\ B_1^* & B_1^* A_1^{-1} B_1 \end{pmatrix} \ge 0, \]
and the same holds for A1 and B1 replaced by A2 and B2 , respectively. Thus
\[ \begin{pmatrix} tA_1 + (1-t)A_2 & tB_1 + (1-t)B_2 \\ tB_1^* + (1-t)B_2^* & tB_1^* A_1^{-1} B_1 + (1-t) B_2^* A_2^{-1} B_2 \end{pmatrix} \ge 0. \]
Applying the Schur complement theorem once more yields
\[ tB_1^* A_1^{-1} B_1 + (1-t) B_2^* A_2^{-1} B_2 \ge (tB_1^* + (1-t)B_2^*)(tA_1 + (1-t)A_2)^{-1}(tB_1 + (1-t)B_2). \]
Proof. For any y1 , y2 , suppose xi is such that f (xi , yi ) = maxx f (x, yi ) for i = 1, 2. Then for any
λ ∈ (0, 1), we have
is concave.
(exercise).
Then the desired concavity result follows from the joint convexity of quantum relative entropy
and the above lemma.
Using the above theorem, one can extend the Golden–Thompson inequality Tr e^{H+K} ≤ Tr e^{H} e^{K}
to three matrices. Note that Tr e^{H} e^{K} e^{L} is in general not even a real number.
where
\[ T_A(B) = \int_0^\infty \frac{1}{s + A} B \frac{1}{s + A} \, ds = \frac{d}{dt}\Big|_{t=0} \log(A + tB). \]
Proof. Let C ⊂ R^N be a convex set such that tx ∈ C for every t > 0, x ∈ C. If f : C → R is
concave and f (tx) = tf (x) for all t > 0, x ∈ C, then
\[ f(y) \le \lim_{t \to 0^+} \frac{f(x + ty) - f(x)}{t} \]
for any x, y ∈ C.
Indeed, for any t > 0
\[ f(x + ty) = (1+t) f\Big( \frac{x}{1+t} + \frac{ty}{1+t} \Big) \ge (1+t) \Big( \frac{f(x)}{1+t} + \frac{t f(y)}{1+t} \Big) = f(x) + t f(y). \]
Now we apply this result to C = Mn (C)++ and f (X) = Tr[e^{H+K+log X} ]. This function is concave
by the previous theorem and homogeneous of degree one. For X = e^{−K} and Y = e^{L} we get
\[ \mathrm{Tr}\, e^{H+K+L} \le \frac{d}{dt}\Big|_{t=0} \mathrm{Tr}\big[ e^{H+K+\log(e^{-K} + t e^{L})} \big] = \mathrm{Tr}\Big[ e^{H+K+\log(e^{-K})} \frac{d}{dt}\Big|_{t=0} \log(e^{-K} + t e^{L}) \Big], \]
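where the right hand side is exactly what we need once we prove
\[ \int_0^\infty \frac{1}{s + A} B \frac{1}{s + A} \, ds = \frac{d}{dt}\Big|_{t=0} \log(A + tB). \]
This identity follows from the integral formula
\[ \log A = \int_0^\infty \big( (s+1)^{-1} - (s+A)^{-1} \big) \, ds \]
as follows:
\begin{align*}
\frac{d}{dt}\Big|_{t=0} \log(X + tY) &= \frac{d}{dt}\Big|_{t=0} \int_0^\infty \big( (s+1)^{-1} - (s + X + tY)^{-1} \big) \, ds \\
&= -\int_0^\infty \frac{d}{dt}\Big|_{t=0} (s + X + tY)^{-1} \, ds \\
&= \int_0^\infty (s + X)^{-1} Y (s + X)^{-1} \, ds.
\end{align*}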
Exercises
Exercise 8.1. Suppose that p ≤ q. For K ∈ Mn (C) the function
is
1. jointly concave if 0 ≤ p ≤ q ≤ 1 such that p + q ≤ 1;
2. jointly convex if −1 ≤ p ≤ 0, 1 ≤ q ≤ 2 such that p + q ≥ 1.
Exercise 8.2. Prove the duality formula:
• Pt x → x, t → 0.
It has a generator L that is defined via
\[ L(x) := \lim_{t \to 0} \frac{x - P_t x}{t}. \]
Show that ρ ↦ ⟨L(ρ^p ), ρ^{1−p} ⟩ is convex when 0 ≤ p ≤ 1, and concave when −1 ≤ p ≤ 0.
Chapter 9
Entanglement
If two classical physical systems are described by the (finite) pure state spaces X and Y , then
the composite system is described by the pure state space X × Y . This means that the mixed states
of the composite system are probability densities on X × Y . For every ρ : X × Y → [0, 1] with
∑_{x,y} ρ(x, y) = 1 one has ρ = ∑_{x,y} ρ(x, y)1_{(x,y)} . In other words, every probability density on X × Y
is a convex combination of the Dirac densities 1_{(x,y)} with x ∈ X, y ∈ Y .
The situation is markedly different for quantum systems, where the phenomenon of entangle-
ment occurs, which is one of the key features of quantum information theory compared to classical
information theory.
Definition 9.1. A quantum state ρ ∈ Mm (C) ⊗ Mn (C) is called separable if there exist λ1 , . . . , λk ∈
[0, 1] with ∑_{j=1}^k λj = 1 and quantum states σ_1^{(1)} , . . . , σ_k^{(1)} ∈ Mm (C), σ_1^{(2)} , . . . , σ_k^{(2)} ∈ Mn (C) such
that
\[ \rho = \sum_{j=1}^{k} \lambda_j \, \sigma_j^{(1)} \otimes \sigma_j^{(2)}. \]
Since Φ is positive, the matrix Φ(σ_j^{(1)} ) is positive. Thus (Φ ⊗ idMk (C) )(ρ) ≥ 0.
Proof. Let Φ : Mm (C) → Mk (C) be a positive map that is not 2-positive. For example, we can
take k = m and Φ the transpose map. Then there exists a (necessarily non-zero) positive matrix
A ∈ Mm (C) ⊗ Mn (C) such that (Φ ⊗ idMn (C) )(A) is not positive. By the Horodecki criterion,
A/Tr(A) is an entangled state.
Example 9.4 (Werner states). Let m = n ≥ 2 and W = ∑_{i,j} Eij ⊗ Eji . As W^2 = 1, the matrix W
has eigenvalues ±1. Let P±1 be the orthogonal projection onto the eigenspace of W corresponding
to the eigenvalue ±1. More explicitly, P1 = ½ (1 ⊗ 1 + W ) and P−1 = ½ (1 ⊗ 1 − W ).
A basis of the range of P1 is given by (ei ⊗ ej + ej ⊗ ei )i≤j and a basis of the range of P−1 is
given by (ei ⊗ ej − ej ⊗ ei )i<j . Thus Tr(P1 ) = n(n + 1)/2 and Tr(P−1 ) = n(n − 1)/2.
A quantum state of the form
\[ \rho_\lambda = \frac{2\lambda}{n(n-1)} P_{-1} + \frac{2(1-\lambda)}{n(n+1)} P_1 \]
with λ ∈ [0, 1] is called a Werner state. Werner states are entangled for λ > ½, which is the case
we prove here.
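Let Φ : Mn (C) → Mn (C) be the transpose map, which is positive, but not completely positive.
We have
\[ (\Phi \otimes \mathrm{id}_{M_n(\mathbb{C})})(W) = \sum_{i,j} E_{ij} \otimes E_{ij} \]
and therefore
\begin{align*}
(\Phi \otimes \mathrm{id}_{M_n(\mathbb{C})})(P_1) &= \frac{1}{2}(1 \otimes 1) + \frac{1}{2} \sum_{i,j} E_{ij} \otimes E_{ij}, \\
(\Phi \otimes \mathrm{id}_{M_n(\mathbb{C})})(P_{-1}) &= \frac{1}{2}(1 \otimes 1) - \frac{1}{2} \sum_{i,j} E_{ij} \otimes E_{ij}.
\end{align*}
Let Q0 = (1/n) ∑_{i,j} Eij ⊗ Eij and Q1 = 1 ⊗ 1 − Q0 . From the previous identities we deduce
\begin{align*}
(\Phi \otimes \mathrm{id}_{M_n(\mathbb{C})})(\rho_\lambda) &= (\Phi \otimes \mathrm{id}_{M_n(\mathbb{C})}) \Big( \frac{2\lambda}{n(n-1)} P_{-1} + \frac{2(1-\lambda)}{n(n+1)} P_1 \Big) \\
&= \frac{2\lambda - 1 + n}{n(n^2 - 1)} (1 \otimes 1) - \frac{(2\lambda - 1)n + 1}{n(n^2 - 1)} \sum_{i,j} E_{ij} \otimes E_{ij} \\
&= \frac{1 - 2\lambda}{n} Q_0 + \Big( 1 - \frac{1 - 2\lambda}{n} \Big) \frac{Q_1}{n^2 - 1}.
\end{align*}
Here Q0 and Q1 are orthogonal projections with Q0 Q1 = 0. It follows that (Φ ⊗ id)(ρλ ) has
eigenvalues (1 − 2λ)/n and (1 − (1 − 2λ)/n)/(n^2 − 1). In particular, (Φ ⊗ id)(ρλ ) is not positive for
λ > ½.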
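The negative eigenvalue of the partially transposed Werner state can be confirmed by a direct computation. The sketch below stores ρλ as a four-index array with entries ⟨ei ⊗ ej , ρλ (ek ⊗ el )⟩, using that W (ek ⊗ el ) = el ⊗ ek , and applies the partial transpose to the vector ∑_a ea ⊗ ea , which spans the range of Q0 . The function names and the sample values n = 3, λ = 0.8 are illustrative choices.

```python
def werner_state(n, lam):
    """rho[i][j][k][l] = <e_i (x) e_j, rho_lambda (e_k (x) e_l)>."""
    c_anti = 2 * lam / (n * (n - 1))         # coefficient of P_{-1} = (1 - W) / 2
    c_sym = 2 * (1 - lam) / (n * (n + 1))    # coefficient of P_{1} = (1 + W) / 2
    c_id = (c_sym + c_anti) / 2              # resulting coefficient of the identity
    c_swap = (c_sym - c_anti) / 2            # resulting coefficient of W
    return [[[[c_id * (i == k and j == l) + c_swap * (i == l and j == k)
               for l in range(n)] for k in range(n)]
             for j in range(n)] for i in range(n)]

def partial_transpose_first(rho):
    """Transpose the first tensor factor: swap the indices i and k."""
    n = len(rho)
    return [[[[rho[k][j][i][l] for l in range(n)] for k in range(n)]
             for j in range(n)] for i in range(n)]

def apply_to_diagonal_vector(op):
    """Apply a four-index operator to v = sum_a e_a (x) e_a; returns out[i][j]."""
    n = len(op)
    return [[sum(op[i][j][a][a] for a in range(n)) for j in range(n)]
            for i in range(n)]

n, lam = 3, 0.8
rho = werner_state(n, lam)
total_trace = sum(rho[i][j][i][j] for i in range(n) for j in range(n))
image = apply_to_diagonal_vector(partial_transpose_first(rho))
predicted_eigenvalue = (1 - 2 * lam) / n
print(total_trace, image, predicted_eigenvalue)
```

The image is (1 − 2λ)/n times the input vector, so for λ > ½ the partial transpose has a negative eigenvalue, as claimed.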
Chapter 10
Definition 10.1 (Quantum relative entropy). For any quantum states ρ and σ, the quantum
relative entropy of ρ with respect to σ is
D(ρ||σ) := Tr(ρ(log ρ − log σ)).
Although the quantum relative entropy is not a distance, it still serves as a nice measure to
distinguish quantum states.
Lemma 10.2. If ρ, σ ∈ Mn (C) are quantum states, then D(ρ∥σ) ≥ 0 with equality if ρ = σ.
Proof. Let f (x) = x log x. Since f is convex, Klein’s inequality implies
0 ≤ Tr(f (ρ) − f (σ) − f ′ (σ)(ρ − σ)) = Tr(ρ log ρ − σ log σ − log σ(ρ − σ)) = Tr(ρ(log ρ − log σ)).
In fact, D(ρ||σ) = 0 if and only if ρ = σ. The converse
implication follows from the equality case in Klein’s inequality for strictly convex functions, which
we did not discuss. However, it is also an immediate consequence of the following result. For its
formulation, recall that the trace norm of a matrix A ∈ Mm (C) is defined as ∥A∥1 = Tr(|A|).
Theorem 10.3 (Pinsker’s inequality). For any quantum states ρ and σ, we have
\[ D(\rho\|\sigma) \ge \frac{1}{2} \|\rho - \sigma\|_1^2. \]
To prove this, we shall need the following monotonicity property, sometimes called data pro-
cessing inequality of quantum relative entropy.
Theorem 10.4 (Data processing inequality for quantum relative entropy). For any quantum states
ρ, σ ∈ Mm (C) and any quantum channel Λ : Mm (C) → Mn (C), we have
D(Λ(ρ)||Λ(σ)) ≤ D(ρ||σ).
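Proof. As noticed before,
\[ D(\rho\|\sigma) = \lim_{p \nearrow 1} \frac{1 - \mathrm{Tr}(\rho^p \sigma^{1-p})}{1 - p}. \]
Let fp (x) = x^p for p ∈ [0, 1] and
\[ L_\sigma : M_n(\mathbb{C}) \to M_n(\mathbb{C}), \; L_\sigma A = \sigma A, \qquad R_\rho : M_n(\mathbb{C}) \to M_n(\mathbb{C}), \; R_\rho A = A\rho. \]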
CHAPTER 10. DATA PROCESSING INEQUALITIES 56
Then
\[ \langle \sigma^{1/2}, f_p(L_\rho R_\sigma^{-1}) \sigma^{1/2} \rangle_{HS} = \mathrm{Tr}(\rho^p \sigma^{1-p}). \]
For convenience, write ∆_{ρ,σ} = L_ρ R_σ^{−1} and ∆_{Λ(ρ),Λ(σ)} = L_{Λ(ρ)} R_{Λ(σ)}^{−1} . Consider the map
D(Φ(p)∥Φ(q)) ≤ D(p∥q).
Proof. (a) If ρ = diag(p), σ = diag(q), then D(ρ∥σ) = D(p∥q). Moreover, let E : Mm (C) →
Cm , E(A) = (A11 , . . . , Amm ). Then the map Λ : Mm (C) → Mn (C), Λ(A) = diag(Φ(E(A))) is a
quantum channel. It follows from the data processing inequality for quantum entropies that
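Since x log x is convex, we have x log x ≥ x − 1. Moreover, if x ≤ 1, then all terms in the Taylor
expansion are positive. Thus
\[ x \log x \ge (x - 1) + \frac{1}{2} (1 - x)_+^2. \]
It follows that
\begin{align*}
D(p\|q) &= \sum_{j=1}^{n} \frac{p_j}{q_j} \log\Big( \frac{p_j}{q_j} \Big) q_j \\
&\ge \sum_{j=1}^{n} \Big( \frac{p_j}{q_j} - 1 \Big) q_j + \frac{1}{2} \sum_{j=1}^{n} \Big( 1 - \frac{p_j}{q_j} \Big)_+^2 q_j \\
&\ge \frac{1}{2} \Big( \sum_{j=1}^{n} \Big( 1 - \frac{p_j}{q_j} \Big)_+ q_j \Big)^2.
\end{align*}
Since ∑_j (1 − pj /qj )qj = 0, we have ∑_j (1 − pj /qj )+ qj = ∑_j (1 − pj /qj )− qj . Thus
\[ \|p - q\|_1 = \sum_{j=1}^{n} \Big| 1 - \frac{p_j}{q_j} \Big| q_j = 2 \sum_{j=1}^{n} \Big( 1 - \frac{p_j}{q_j} \Big)_+ q_j \le 2\sqrt{2}\, D(p\|q)^{1/2}. \]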
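The resulting bound D(p∥q) ≥ ⅛ ∥p − q∥1^2 , as well as the sharper constant ½ from Pinsker’s inequality, can be sanity-checked on concrete distributions. The sketch below assumes that all qj are strictly positive and uses the convention 0 log 0 = 0; the function names and the sample distributions are illustrative.

```python
import math

def relative_entropy(p, q):
    """Classical relative entropy D(p||q) = sum_j p_j log(p_j / q_j), assuming q_j > 0."""
    return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)

def l1_distance(p, q):
    """The norm ||p - q||_1."""
    return sum(abs(pj - qj) for pj, qj in zip(p, q))

pairs = [
    ([0.5, 0.5], [0.9, 0.1]),
    ([0.2, 0.3, 0.5], [0.4, 0.4, 0.2]),
    ([1.0, 0.0], [0.5, 0.5]),
]
for p, q in pairs:
    d = relative_entropy(p, q)
    dist = l1_distance(p, q)
    print(d, dist ** 2 / 8, dist ** 2 / 2)
```

In each case the relative entropy dominates both ∥p − q∥1^2 /8, which is what the argument above proves, and ∥p − q∥1^2 /2, which is the optimal constant.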
Proof. Let A = ∑_{k=1}^l λk Pk be the spectral decomposition of A. For an eigenvector ξ of Φ(A) with
∥ξ∥ = 1 define µk = ⟨ξ, Φ(Pk )ξ⟩ for 1 ≤ k ≤ l and µl+1 = 1 − ∑_{k=1}^l µk . Clearly, µk ≥ 0 for k ≤ l,
and since ∑_k Pk = 1 and Φ(1) ≤ 1, we also have µl+1 ≥ 0.
Using the convexity of f , we get
If Φ(1) = 1, then µl+1 = 0, and if f (0) ≤ 0, then µl+1 f (0) ≤ 0. In either case, we get
\[ \langle \xi, f(\Phi(A)) \xi \rangle \le \sum_{k=1}^{l} \mu_k f(\lambda_k) = \sum_{k} \langle \xi, \Phi(f(\lambda_k) P_k) \xi \rangle = \langle \xi, \Phi(f(A)) \xi \rangle. \]
Summing over an orthonormal eigenbasis for Φ(A), the desired inequality follows.
In the case when
For the following recall that a POVM is a family (Pi )_{i=1}^n of positive matrices with ∑_i Pi = 1
and that for every POVM (Pi )_{i=1}^n the map
\[ \Phi : M_m(\mathbb{C}) \to M_n(\mathbb{C}), \quad A \mapsto \sum_{i=1}^{n} \mathrm{Tr}(P_i A) E_{ii} \]
is a quantum channel.
Lemma 10.8. Let ρ, σ ∈ Mm (C) be two quantum states and Λ : Mm (C) → Mn (C) any quantum
channel.
(a) The trace distance is monotone under quantum channels:
\[ \|\Lambda(\rho) - \Lambda(\sigma)\|_1 \le \|\rho - \sigma\|_1. \]
(b) There exists a POVM (Pi )_{i=1}^n with associated quantum-to-classical channel Φ such that
\[ \|\rho - \sigma\|_1 = \|\Phi(\rho) - \Phi(\sigma)\|_1 = \sum_{i=1}^{n} |\mathrm{Tr}(P_i \rho) - \mathrm{Tr}(P_i \sigma)|. \]
Proof. (a) Let A = ρ − σ. Since f (x) = |x| is convex and f (0) = 0, we deduce from the previous
theorem
∥Λ(A)∥1 = Tr(|Λ(A)|) ≤ Tr(Λ(|A|)) = Tr(|A|) = ∥A∥1 .
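(b) Consider the spectral decomposition ρ − σ = ∑_j λj Qj . Then
\[ P_1 := \sum_{\lambda_j \ge 0} Q_j, \qquad P_2 := \sum_{\lambda_j < 0} Q_j \]
form a POVM, and the associated quantum-to-classical channel Φ satisfies
\[ \|\Phi(\rho) - \Phi(\sigma)\|_1 = \sum_{i=1}^{2} |\mathrm{Tr}(P_i \rho) - \mathrm{Tr}(P_i \sigma)| = \Big| \sum_{\lambda_j \ge 0} \lambda_j \Big| + \Big| \sum_{\lambda_j < 0} \lambda_j \Big| = \sum_{j} |\lambda_j| = \|\rho - \sigma\|_1. \]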
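For qubits, part (b) can be made completely explicit with Bloch vectors. The sketch below assumes two states with distinct Bloch vectors a and b (illustrative values), builds the projections P1 and P2 onto the positive and negative eigenspaces of ρ − σ from the unit vector (a − b)/|a − b|, and compares the classical distance of the outcome statistics with the trace norm, computed independently from the determinant of the traceless matrix ρ − σ.

```python
import math

def bloch_to_state(a):
    """Return (I + a1*sigma1 + a2*sigma2 + a3*sigma3) / 2 as a 2x2 list of rows."""
    a1, a2, a3 = a
    return [[(1 + a3) / 2, (a1 - 1j * a2) / 2],
            [(a1 + 1j * a2) / 2, (1 - a3) / 2]]

def trace_of_product(X, Y):
    """Tr(XY) for 2x2 matrices; real whenever X and Y are self-adjoint."""
    return sum(X[i][k] * Y[k][i] for i in range(2) for k in range(2)).real

def trace_norm_traceless(X):
    """Trace norm of a traceless self-adjoint 2x2 matrix, whose eigenvalues are +-sqrt(-det X)."""
    det = (X[0][0] * X[1][1] - X[0][1] * X[1][0]).real
    return 2 * math.sqrt(max(-det, 0.0))

a = (0.3, 0.2, 0.5)     # Bloch vector of rho (assumed example)
b = (-0.1, 0.4, 0.0)    # Bloch vector of sigma (assumed example)
rho, sigma = bloch_to_state(a), bloch_to_state(b)

difference = [x - y for x, y in zip(a, b)]
length = math.sqrt(sum(x * x for x in difference))
unit = [x / length for x in difference]
P1 = bloch_to_state(unit)                  # projection onto the positive eigenspace of rho - sigma
P2 = bloch_to_state([-x for x in unit])    # complementary projection, P1 + P2 = 1

classical_distance = sum(abs(trace_of_product(P, rho) - trace_of_product(P, sigma))
                         for P in (P1, P2))
X = [[rho[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]
print(classical_distance, trace_norm_traceless(X), length)
```

All three numbers coincide: for qubits the trace distance ∥ρ − σ∥1 is the Euclidean distance |a − b| of the Bloch vectors, and the two-outcome measurement (P1 , P2 ) attains it.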
Proof of Theorem 10.3. We only prove the weaker version with constant 1/8 instead of 1/2. Take Λ as
the quantum-to-classical channel in the previous lemma. Then from the monotonicity of quantum
relative entropy and classical Pinsker’s inequality:
\[ D(\rho\|\sigma) \ge D(\Lambda(\rho)\|\Lambda(\sigma)) \ge \frac{1}{8} \|\Lambda(\rho) - \Lambda(\sigma)\|_1^2 = \frac{1}{8} \|\rho - \sigma\|_1^2. \]
Recall that S(ρ) = −Tr(ρ log ρ) is the quantum entropy. For any bipartite state ρ over H1 ⊗H2 ,
we denote ρ1 = Tr2 (ρ) and ρ2 = Tr1 (ρ). We have seen the following subadditivity result of entropy
in the previous lectures:
S(ρ) ≤ S(ρ1 ) + S(ρ2 ),
which follows from the non-negativity of quantum relative entropy:
Theorem 10.9 (Strong subadditivity of the quantum entropy). If ρ ∈ Ml (C) ⊗ Mm (C) ⊗ Mn (C)
is a quantum state, then
S(ρ12 ) + S(ρ23 ) ≥ S(ρ123 ) + S(ρ2 ).
This inequality reduces to the subadditivity of the quantum entropy when m = 1.
Proof. Similar to the computations in the proof of the subadditivity of the quantum entropy, one
obtains
Definition 10.10. For any tripartite state ρ123 the conditional mutual information of 1 and 2
given 3, I(1, 2|3), is defined as
\[ I(1, 2|3) := S(\rho_{13}) + S(\rho_{23}) - S(\rho_{123}) - S(\rho_3). \]
Then SSA says I(1, 2|3) ≥ 0. For any bipartite state ρ12 the squashed entanglement of ρ12 is
defined as
\[ E_{sq}(\rho_{12}) := \frac{1}{2} \inf\{ I(1, 2|3) : \rho_{123} \text{ is any tripartite extension of } \rho_{12} \}. \]
The functional Esq provides a faithful measure of entanglement.
Theorem 10.11. A bipartite state ρ12 is separable if and only if Esq (ρ12 ) = 0.
So if Esq (ρ) > 0, then ρ is entangled. The following extended SSA provides a lower bound of
Esq :
Theorem 10.12 (Extended SSA). For any tripartite state ρ123 ∈ Ml (C) ⊗ Mm (C) ⊗ Mn (C) we
have
S(ρ13 ) + S(ρ23 ) − S(ρ123 ) − S(ρ3 ) ≥ 2 max{S(ρ1 ) − S(ρ12 ), S(ρ2 ) − S(ρ12 ), 0}.
As a corollary,
Esq (ρ12 ) ≥ max{S(ρ1 ) − S(ρ12 ), S(ρ2 ) − S(ρ12 ), 0}.
So if either of the conditional entropies S(ρ12 ) − S(ρ1 ) or S(ρ12 ) − S(ρ2 ) is strictly negative, then
Esq (ρ12 ) > 0 and thus ρ12 is entangled.
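To illustrate the entanglement criterion, the following sketch evaluates the lower bound max{S(ρ1) − S(ρ12), S(ρ2) − S(ρ12), 0} for a Bell state mixed with white noise. The family of states, the mixing parameter `p` and the function names are assumptions made for this example; NumPy is assumed to be available.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy with natural logarithm."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log(w)).sum())

def esq_lower_bound(p):
    """Lower bound on E_sq for p * |Phi><Phi| + (1 - p) * 1/4 on C^2 (x) C^2."""
    phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # Bell vector
    rho12 = p * np.outer(phi, phi) + (1 - p) * np.eye(4) / 4
    t = rho12.reshape(2, 2, 2, 2)
    rho1 = np.einsum('abcb->ac', t)    # trace out the second factor
    rho2 = np.einsum('abad->bd', t)    # trace out the first factor
    s12 = entropy(rho12)
    return max(entropy(rho1) - s12, entropy(rho2) - s12, 0.0)

for p in (1.0, 0.9, 0.5):
    print(p, esq_lower_bound(p))
```

For p = 1 the bound equals log 2, and for p = 0.9 it is still strictly positive, so these states are entangled. For p = 0.5 the bound is 0 and the criterion is inconclusive, since it is only a sufficient condition.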
To prove this theorem, we need a purification trick that is very useful. Let us come back to the
quantum entropy S(ρ). It is clear that S(ρ) = 0 iff ρ is pure. If ρ = ρ12 is a bipartite state, then
(exercise) S(ρ12 ) = 0 implies S(ρ1 ) = S(ρ2 ). The following theorem says that if S(ρ12 ) is small,
then S(ρ1 ) is close to S(ρ2 ):
Theorem 10.13. For any bipartite state ρ12 we have
\[
|S(\rho_1) - S(\rho_2)| \le S(\rho_{12}).
\]
For the proof, we recall the following purification result from Chapter 4: If ρ ∈ Mn (C) is a
quantum state, then there exists a unit vector ξ ∈ Cn ⊗Cn such that ρ = Tr1 (|ξ⟩ ⟨ξ|) = Tr2 (|ξ⟩ ⟨ξ|).
This is not quite the statement of Proposition 4.6, but can be deduced from its proof.
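One concrete way to produce such a vector ξ is via the spectral decomposition of ρ. The sketch below, which assumes NumPy and uses hypothetical helper names, takes ξ = Σi √λi vi ⊗ vi for an orthonormal eigenbasis (vi) of ρ with eigenvalues λi and checks that both partial traces return ρ. This is one possible choice and not necessarily the construction used in Chapter 4.

```python
import numpy as np

def random_state(n, rng):
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def purify(rho):
    """Unit vector xi in C^n (x) C^n with Tr_1 |xi><xi| = Tr_2 |xi><xi| = rho."""
    lam, v = np.linalg.eigh(rho)
    lam = np.clip(lam, 0.0, None)          # guard against tiny negative rounding errors
    n = rho.shape[0]
    xi = np.zeros(n * n, dtype=complex)
    for i in range(n):
        xi += np.sqrt(lam[i]) * np.kron(v[:, i], v[:, i])
    return xi

rng = np.random.default_rng(3)
n = 3
rho = random_state(n, rng)
xi = purify(rho)

M = xi.reshape(n, n)          # M[a, b] is the coefficient of e_a (x) e_b
tr2 = M @ M.conj().T          # partial trace over the second factor
tr1 = M.T @ M.conj()          # partial trace over the first factor
print(np.allclose(tr1, rho), np.allclose(tr2, rho))
```

Note that no complex conjugation appears in the second tensor factor; with this choice the two reduced states coincide with ρ itself rather than with its transpose or conjugate.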
Now we can prove Theorem 10.13 as follows:
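Proof of Theorem 10.13. Consider a purification ρ123 of ρ12 as described above. Then S(ρ12 ) = S(ρ3 ) and S(ρ1 ) = S(ρ23 ). By the subadditivity of the quantum entropy,
\[
S(\rho_1) = S(\rho_{23}) \le S(\rho_2) + S(\rho_3) = S(\rho_2) + S(\rho_{12}).
\]
Exchanging the roles of 1 and 2 gives S(ρ2 ) ≤ S(ρ1 ) + S(ρ12 ).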
Proof of Theorem 10.12. Consider any purification ρ1234 of ρ123 . Since ρ1234 is pure, S(ρ14 ) =
S(ρ23 ) and S(ρ124 ) = S(ρ3 ). Then
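\[
S(\rho_{12}) + S(\rho_{23}) - S(\rho_1) - S(\rho_3) = S(\rho_{12}) + S(\rho_{14}) - S(\rho_1) - S(\rho_{124}) \ge 0,
\]
where the inequality is strong subadditivity (Theorem 10.9) applied to the subsystems 2, 1, 4, with 1 as the middle system. So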
S(ρ12 ) + S(ρ23 ) ≥ S(ρ1 ) + S(ρ3 ).
Similarly we have
S(ρ13 ) + S(ρ23 ) ≥ S(ρ1 ) + S(ρ2 ).
Adding the above two inequalities, we get
S(ρ12 ) + S(ρ13 ) + 2S(ρ23 ) ≥ 2S(ρ1 ) + S(ρ2 ) + S(ρ3 ).
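This inequality involves only the subsystems 1, 2, 3 and therefore holds for every tripartite state. Applying it to the subsystems 1, 4, 3 of the purification ρ1234 , that is, with 4 in place of 2, we obtain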
S(ρ14 ) + S(ρ13 ) + 2S(ρ34 ) ≥ 2S(ρ1 ) + S(ρ4 ) + S(ρ3 ).
Since ρ1234 is pure, S(ρ14 ) = S(ρ23 ), S(ρ34 ) = S(ρ12 ) and S(ρ4 ) = S(ρ123 ). Hence,
S(ρ23 ) + S(ρ13 ) + 2S(ρ12 ) ≥ 2S(ρ1 ) + S(ρ123 ) + S(ρ3 ),
which is nothing but
S(ρ13 ) + S(ρ23 ) − S(ρ123 ) − S(ρ3 ) ≥ 2S(ρ1 ) − 2S(ρ12 ).
Similarly, we can derive a lower bound in terms of 2S(ρ2 ) − 2S(ρ12 ).
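The extended SSA inequality of Theorem 10.12 can also be probed numerically. The sketch below assumes NumPy; the helpers `entropy`, `reduce` and `random_low_rank_state` are hypothetical names. Low-rank states are sampled because for highly mixed states the right-hand side is typically zero and the inequality reduces to ordinary SSA.

```python
import numpy as np

def entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log(w)).sum())

def reduce(rho, dims, keep):
    """Partial trace keeping the subsystems in `keep` (0-based, sorted)."""
    k = len(dims)
    t = rho.reshape(list(dims) * 2)
    row = list(range(k))
    col = [k + i if i in keep else i for i in range(k)]
    out = [i for i in range(k) if i in keep] + [k + i for i in range(k) if i in keep]
    d = int(np.prod([dims[i] for i in keep]))
    return np.einsum(t, row + col, out).reshape(d, d)

def random_low_rank_state(n, rank, rng):
    g = rng.normal(size=(n, rank)) + 1j * rng.normal(size=(n, rank))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(4)
dims = (2, 2, 2)
essa_gaps = []
for trial in range(150):
    rho = random_low_rank_state(8, 1 + trial % 3, rng)   # ranks 1, 2, 3
    S = lambda keep: entropy(reduce(rho, dims, keep))
    lhs = S([0, 2]) + S([1, 2]) - entropy(rho) - S([2])
    rhs = 2 * max(S([0]) - S([0, 1]), S([1]) - S([0, 1]), 0.0)
    essa_gaps.append(lhs - rhs)

print(min(essa_gaps))   # should be non-negative up to rounding
```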
Exercises
Exercise 10.1. Prove the classical Pinsker inequality: if p, q are two probability densities over some finite set X , then
\[
D(p\|q) \ge \frac{1}{2}\|p - q\|_1^2 .
\]
You can use the monotonicity of classical relative entropy.
Exercise 10.2. Suppose that {Λx }x∈X is a POVM, i.e. a finite set of operators such that
\[
\Lambda_x \ge 0 \ \text{ for all } x \in \mathcal{X}, \quad \text{and} \quad \sum_{x} \Lambda_x = 1 .
\]
Show that this gives a quantum channel Λ such that for any quantum state ρ, Λ(ρ) is a classical
probability density over X satisfying Λ(ρ)(x) = Tr(ρΛx ).
Exercise 10.3. Show that for any quantum channel Λ and any matrix X, we have ∥Λ(X)∥p ≤
∥X∥p , 1 ≤ p ≤ ∞.
Exercise 10.4. Find an entangled state.
Exercise 10.5. Show that for any pure bipartite state ρ over H1 ⊗ H2 , we have S(ρ1 ) = S(ρ2 ).
Show also that any quantum state can be purified. That is, for any quantum state ρ over H, there
exists a pure state |ψ⟩ ⟨ψ| on H ⊗ H such that
ρ = Tr1 [|ψ⟩ ⟨ψ|] = Tr2 [|ψ⟩ ⟨ψ|].
Exercise 10.6. Show that the monotonicity of quantum relative entropy implies its joint convexity.
Chapter 11
Definition 11.1. A quantum Markov semigroup (QMS) on Mn (C) is a family (Pt )t≥0 of unital
completely positive maps on Mn (C) such that
• P0 = idMn (C) , Ps Pt = Ps+t for all s, t ≥ 0,
• Pt → P0 as t → 0.
Remark. By the Heisenberg–Schrödinger duality, if (Pt ) is a QMS, then Pt† is completely positive
trace-preserving for all t ≥ 0. In particular, Pt† maps quantum states to quantum states for all
t ≥ 0.
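Theorem 11.2 (Lindblad). If (Pt ) is a quantum Markov semigroup on Mn (C), then for each A ∈ Mn (C) the limit
\[
L(A) = \lim_{t \searrow 0} \frac{1}{t}\,(A - P_t(A))
\]
exists, and $P_t = e^{-tL}$ for all t ≥ 0. Moreover, for a linear map L : Mn (C) → Mn (C), the family $(e^{-tL})_{t \ge 0}$ is a quantum Markov semigroup if and only if there exist G ∈ Mn (C) and a completely positive map Φ : Mn (C) → Mn (C) such that Φ(1) = G + G∗ and L(A) = GA + AG∗ − Φ(A) for all A ∈ Mn (C).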
Proof of Lindblad’s theorem. Clearly, the set $V = \{A \in M_n(\mathbb{C}) \mid \lim_{t \searrow 0} \tfrac{1}{t}(A - P_t(A)) \text{ exists}\}$ is a subspace of Mn (C). For A ∈ Mn (C) and δ > 0 let
\[
A_\delta = \int_0^\delta P_t(A)\, dt .
\]
CHAPTER 11. SUPPLEMENT: QUANTUM MARKOV SEMIGROUPS AND
LOGARITHMIC SOBOLEV INEQUALITIES 63
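Since t ↦ Pt (A) is continuous, we have $\delta^{-1} A_\delta \to A$ as δ → 0. Moreover,
\[
\begin{aligned}
\frac{1}{t}\,(A_\delta - P_t(A_\delta)) &= \frac{1}{t}\left(\int_0^\delta P_s(A)\, ds - \int_0^\delta P_{s+t}(A)\, ds\right) \\
&= \frac{1}{t}\left(\int_0^\delta P_s(A)\, ds - \int_t^{t+\delta} P_s(A)\, ds\right) \\
&= \frac{1}{t}\int_0^t P_s(A)\, ds - \frac{1}{t}\int_\delta^{t+\delta} P_s(A)\, ds \\
&\to A - P_\delta(A)
\end{aligned}
\]
as t ↘ 0. Hence $A_\delta \in V$ for all δ > 0. Since subspaces of the finite-dimensional space Mn (C) are closed and $\delta^{-1} A_\delta \to A$, it follows that V = Mn (C), so that L is a well-defined linear map on Mn (C). By the semigroup property, t ↦ Pt (A) is then differentiable and solves the initial value problem
\[
\frac{d}{dt} P_t(A) = -L(P_t(A)), \qquad P_0(A) = A .
\]
A direct computation shows that $t \mapsto e^{-tL}(A)$ solves the same IVP. It follows from the uniqueness theorem for ordinary differential equations that $P_t(A) = e^{-tL}(A)$ for all t ≥ 0.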
For the second part we first assume that $(e^{-tL})$ is a quantum Markov semigroup. Let U (n) = {U ∈ Mn (C) | U ∗ U = U U ∗ = 1}. There exists a unique probability measure µ on U (n) such that
\[
\int_{U(n)} f(UVW)\, d\mu(V) = \int_{U(n)} f(V)\, d\mu(V)
\]
for all U, W ∈ U (n) and all continuous f : U (n) → C.
Let $G = \int_{U(n)} L(U^*)U\, d\mu(U)$ and Φ(A) = GA + AG∗ − L(A). If V ∈ Mn (C) is unitary, then
\[
\int_{U(n)} L(VU^*)U\, d\mu(U) = \int_{U(n)} L((UV^*)^*)\,UV^*\, d\mu(U)\, V = \int_{U(n)} L(U^*)U\, d\mu(U)\, V = GV .
\]
Moreover, L(1) = 0 because $e^{-tL}$ is unital for all t ≥ 0. Hence
\[
\Phi(1) = G1 + 1G^* - L(1) = G + G^* .
\]
Remark. The unique probability measure µ on U (n) that satisfies the invariance property above for all U, W ∈ U (n) and all continuous f : U (n) → C is called the (normalized) Haar measure on U (n). More generally, a Haar measure exists for any compact group (and any locally compact group if one drops the assumption that the measure be finite). In general, there is no explicit formula for it.
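Although there is no explicit formula, the Haar measure on U(n) can be sampled numerically. The sketch below uses a commonly used recipe that is not taken from these notes: take the QR decomposition of a matrix with independent complex Gaussian entries and fix the phases of the diagonal of R. NumPy is assumed, and `haar_unitary` is a hypothetical helper name. As a plausibility check, the code compares the Monte Carlo average of UAU∗ with Tr(A)/n · 1, which is the value of the Haar integral of UAU∗ by a standard invariance argument.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample a unitary from the Haar measure on U(n) via QR with phase correction."""
    z = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))      # rescale column j by the phase of r[j, j]

rng = np.random.default_rng(5)
n = 2
A = np.array([[1.0, 1.0j], [0.0, 0.0]])

samples = 5000
avg = np.zeros((n, n), dtype=complex)
for _ in range(samples):
    U = haar_unitary(n, rng)
    avg += U @ A @ U.conj().T
avg /= samples

expected = np.trace(A) / n * np.eye(n)
print(np.abs(avg - expected).max())   # small, of order 1/sqrt(samples)
```

Without the phase correction the distribution of `q` depends on the sign conventions of the QR routine and is in general not Haar distributed.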
Remark. By duality one obtains that $t \mapsto P_t^\dagger(A)$ is also differentiable for all A ∈ Mn (C) and
\[
\frac{d}{dt} P_t^\dagger(A) = -L^\dagger(P_t^\dagger(A)) .
\]
Definition 11.3. If (Pt )t≥0 is a quantum Markov semigroup on Mn (C), then the unique linear
map L : Mn (C) → Mn (C) such that e−tL = Pt for all t ≥ 0 is called the generator of (Pt ).
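Generators of the form L(A) = GA + AG∗ − Φ(A) with Φ completely positive and Φ(1) = G + G∗ can be written equivalently in terms of a Hermitian matrix H and Kraus operators Vj of Φ. The following sketch, assuming NumPy, builds such a generator from randomly chosen H and Vj (both are illustrative assumptions) and checks that the two expressions agree and that L(1) = 0.

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 3, 2

def rand_matrix(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

V = [rand_matrix(n) for _ in range(m)]      # Kraus operators of Phi
K = rand_matrix(n)
H = (K + K.conj().T) / 2                    # Hermitian part used as Hamiltonian

def Phi(A):
    """Completely positive map A -> sum_j V_j^* A V_j."""
    return sum(v.conj().T @ A @ v for v in V)

I = np.eye(n)
G = 0.5 * Phi(I) + 1j * H                   # then G + G^* = Phi(1)

def L_from_G(A):
    return G @ A + A @ G.conj().T - Phi(A)

def L_lindblad(A):
    """i[H, A] + sum_j (1/2 {V_j^* V_j, A} - V_j^* A V_j)."""
    out = 1j * (H @ A - A @ H)
    for v in V:
        vv = v.conj().T @ v
        out = out + 0.5 * (vv @ A + A @ vv) - v.conj().T @ A @ v
    return out

A = rand_matrix(n)
print(np.allclose(L_from_G(A), L_lindblad(A)), np.allclose(L_from_G(I), 0))
```

The sign convention follows the notes, where the semigroup is $e^{-tL}$; many references use the opposite sign for the generator.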
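Proof. By Lindblad’s theorem, there exist G ∈ Mn (C) and Φ : Mn (C) → Mn (C) completely positive such that Φ(1) = G + G∗ and L(A) = GA + AG∗ − Φ(A) for all A ∈ Mn (C). By Kraus’ theorem, there exist V1 , . . . , Vm ∈ Mn (C) such that $\Phi(A) = \sum_{j=1}^{m} V_j^* A V_j$ for all A ∈ Mn (C). Let
\[
H = \frac{1}{2i}\,(G - G^*) .
\]
Then we have $G = \tfrac{1}{2}\Phi(1) + iH$ and thus
\[
L(A) = i[H, A] + \sum_{j=1}^{m} \Bigl( \tfrac{1}{2}\bigl(V_j^* V_j A + A V_j^* V_j\bigr) - V_j^* A V_j \Bigr)
\]
for all A ∈ Mn (C).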
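Proof. (i) =⇒ (ii): Let $f(t) = e^{\alpha t} D(P_t^\dagger(\rho)\|\sigma)$. By (i), f (t) ≤ f (0) for all t ≥ 0. We have
\[
f'(t) = \alpha f(t) + e^{\alpha t}\, \frac{d}{dt} \mathrm{Tr}\bigl(P_t^\dagger(\rho)(\log P_t^\dagger(\rho) - \log \sigma)\bigr) .
\]
To compute $\frac{d}{dt} \mathrm{Tr}(P_t^\dagger(\rho) \log P_t^\dagger(\rho))$, we can use a similar argument as in the section on monotonicity of trace functionals to see that
\[
\frac{d}{dt} \mathrm{Tr}\bigl(P_t^\dagger(\rho) \log P_t^\dagger(\rho)\bigr) = -\mathrm{Tr}\bigl((\log P_t^\dagger(\rho) + 1)\, L^\dagger(P_t^\dagger(\rho))\bigr) = -\mathrm{Tr}\bigl(L^\dagger(P_t^\dagger(\rho)) \log P_t^\dagger(\rho)\bigr),
\]
where we used that
\[
\mathrm{Tr}\bigl(1\, L^\dagger(P_t^\dagger(\rho))\bigr) = \mathrm{Tr}\bigl(L(1) P_t^\dagger(\rho)\bigr) = 0 .
\]
Clearly, $\frac{d}{dt} \mathrm{Tr}(P_t^\dagger(\rho) \log \sigma) = -\mathrm{Tr}(L^\dagger(P_t^\dagger(\rho)) \log \sigma)$. Thus
\[
f'(t) = \alpha f(t) - e^{\alpha t}\, \mathrm{Tr}\bigl(L^\dagger(P_t^\dagger(\rho))(\log P_t^\dagger(\rho) - \log \sigma)\bigr) .
\]
In particular, since f (t) ≤ f (0) for all t ≥ 0,
\[
0 \ge f'(0) = \alpha D(\rho\|\sigma) - \mathrm{Tr}\bigl(L^\dagger(\rho)(\log \rho - \log \sigma)\bigr) .
\]
(ii) =⇒ (i): Again let $f(t) = e^{\alpha t} D(P_t^\dagger(\rho)\|\sigma)$. We have seen above that
\[
f'(t) = e^{\alpha t} \Bigl( \alpha D(P_t^\dagger(\rho)\|\sigma) - \mathrm{Tr}\bigl(L^\dagger(P_t^\dagger(\rho))(\log P_t^\dagger(\rho) - \log \sigma)\bigr) \Bigr) .
\]
By (ii), applied to the state $P_t^\dagger(\rho)$, we have f ′ (t) ≤ 0 for all t ≥ 0. Hence
\[
D(\rho\|\sigma) = f(0) \ge f(t) = e^{\alpha t} D(P_t^\dagger(\rho)\|\sigma) .
\]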
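The derivative formula used in this proof can be verified by finite differences. The sketch below assumes NumPy and uses the depolarizing semigroup of Example 11.6 as a test case, for which $P_t^\dagger(B) = e^{-t}B + (1 - e^{-t})\mathrm{Tr}(B)\sigma$ and $L^\dagger(B) = B - \mathrm{Tr}(B)\sigma$. The helper names, the time `t0` and the step `h` are assumptions of this illustration.

```python
import numpy as np

def random_state(n, rng):
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def logm_psd(a):
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def rel_ent(rho, sigma):
    return np.trace(rho @ (logm_psd(rho) - logm_psd(sigma))).real

rng = np.random.default_rng(7)
rho, sigma = random_state(3, rng), random_state(3, rng)

def P_dag(t, B):
    """Adjoint depolarizing semigroup acting on B."""
    return np.exp(-t) * B + (1 - np.exp(-t)) * np.trace(B) * sigma

def L_dag(B):
    """Adjoint generator L^dagger = id - E^dagger."""
    return B - np.trace(B) * sigma

def entropy_production(t):
    r = P_dag(t, rho)
    return np.trace(L_dag(r) @ (logm_psd(r) - logm_psd(sigma))).real

t0, h = 0.7, 1e-5
numeric = (rel_ent(P_dag(t0 + h, rho), sigma) - rel_ent(P_dag(t0 - h, rho), sigma)) / (2 * h)
analytic = -entropy_production(t0)
print(numeric, analytic)      # the two values should agree closely

# Condition (ii) with alpha = 1 along the trajectory of this semigroup.
ok = all(entropy_production(t) >= rel_ent(P_dag(t, rho), sigma) - 1e-12
         for t in np.linspace(0.0, 3.0, 31))
print(ok)
```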
Example 11.6 (Depolarizing semigroup). Let σ ∈ Mn (C) be a full-rank quantum state and E(A) = Tr(Aσ)1. Then the operators $P_t = e^{-t}\,\mathrm{id}_{M_n(\mathbb{C})} + (1 - e^{-t})E$, t ≥ 0, form a quantum Markov semigroup with generator L = id − E. This semigroup is called the (generalized) depolarizing semigroup.
Moreover, $P_t^\dagger(\sigma) = \sigma$ and
\[
D(P_t^\dagger(\rho)\|\sigma) \le e^{-t} D(\rho\|\sigma) .
\]
Indeed, Pt is unital completely positive as a convex combination of two unital completely positive maps. Moreover, since E² = E,
\[
\begin{aligned}
P_s(P_t(A)) &= P_s\bigl(e^{-t}A + (1 - e^{-t})E(A)\bigr) \\
&= e^{-t}\bigl(e^{-s}A + (1 - e^{-s})E(A)\bigr) + (1 - e^{-t})\bigl(e^{-s}E(A) + (1 - e^{-s})E^2(A)\bigr) \\
&= e^{-(s+t)}A + \bigl(e^{-t}(1 - e^{-s}) + (1 - e^{-t})\bigr)E(A) \\
&= e^{-(s+t)}A + (1 - e^{-(s+t)})E(A) \\
&= P_{s+t}(A) .
\end{aligned}
\]
The property P0 = id and the continuity of t ↦ Pt are clear. Note moreover that $P_t^\dagger(A) = e^{-t}A + (1 - e^{-t})\mathrm{Tr}(A)\sigma$.
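The properties of the depolarizing semigroup stated in this example can be checked numerically. The following sketch assumes NumPy and uses hypothetical helper names. It tests the semigroup law, unitality, the duality between Pt and its adjoint with respect to the trace pairing, the invariance of σ, and the exponential decay bound on a grid of times.

```python
import numpy as np

def random_state(n, rng):
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def logm_psd(a):
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def rel_ent(rho, sigma):
    return np.trace(rho @ (logm_psd(rho) - logm_psd(sigma))).real

rng = np.random.default_rng(8)
n = 3
rho, sigma = random_state(n, rng), random_state(n, rng)
I = np.eye(n)

def P(t, A):
    """P_t(A) = e^{-t} A + (1 - e^{-t}) Tr(A sigma) 1."""
    return np.exp(-t) * A + (1 - np.exp(-t)) * np.trace(A @ sigma) * I

def P_dag(t, B):
    """Adjoint: e^{-t} B + (1 - e^{-t}) Tr(B) sigma."""
    return np.exp(-t) * B + (1 - np.exp(-t)) * np.trace(B) * sigma

A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
s, t = 0.3, 1.1

semigroup_ok = np.allclose(P(s, P(t, A)), P(s + t, A))
unital_ok = np.allclose(P(t, I), I)
duality_ok = np.isclose(np.trace(P(t, A) @ rho), np.trace(A @ P_dag(t, rho)))
invariant_ok = np.allclose(P_dag(t, sigma), sigma)
decay_ok = all(rel_ent(P_dag(u, rho), sigma) <= np.exp(-u) * rel_ent(rho, sigma) + 1e-12
               for u in np.linspace(0.0, 4.0, 41))
print(semigroup_ok, unital_ok, duality_ok, invariant_ok, decay_ok)
```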
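To see the exponential decay of the relative entropy, recall that D is convex. Thus, for every quantum state ρ,
\[
D(P_t^\dagger(\rho)\|\sigma) = D\bigl(e^{-t}\rho + (1 - e^{-t})\sigma \,\big\|\, \sigma\bigr) \le e^{-t} D(\rho\|\sigma) + (1 - e^{-t}) D(\sigma\|\sigma) = e^{-t} D(\rho\|\sigma) .
\]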
Exercises
Exercise 11.1. Find a Lindblad form for the QMS from Example 11.6 in the case σ = 1/n.
Exercise 11.2. Let (Pt ) be a quantum Markov semigroup with generator L. Show that $\rho \mapsto \mathrm{Tr}(L(\rho^{p})\rho^{1-p})$ is convex when 0 ≤ p ≤ 1, and concave when −1 ≤ p ≤ 0.