
Trace Inequalities and Quantum Entropies

Melchior Wirth

Trace inequalities for quantum entropies and the related concavity/convexity of trace functionals
play a fundamental role in quantum information theory. They are also very important in other
areas such as mathematical physics and noncommutative analysis. Since Lieb's groundbreaking result
resolving a conjecture of Wigner and Yanase, great progress has been made over the past half
century. We will give an introduction to several of the main results in this direction.
Background: basic linear algebra. References:

1. Eric A. Carlen. Trace inequalities and quantum entropy: an introductory course.
2. Rajendra Bhatia. Matrix Analysis.
3. Rajendra Bhatia. Positive Definite Matrices.
4. Mark M. Wilde. From Classical to Quantum Shannon Theory.
5. Michael M. Wolf. Quantum Channels & Operations: Guided Tour.
6. Anna Vershynina, Eric A. Carlen and Elliott H. Lieb. Strong Subadditivity of Quantum Entropy.
7. Anna Vershynina, Eric A. Carlen and Elliott H. Lieb. Matrix and Operator Trace Inequalities.
8. Eric A. Carlen. On some convexity and monotonicity inequalities of Elliott Lieb.
9. John Watrous. The Theory of Quantum Information.

Contents

1 Introduction and Notations
2 Complete positivity
3 Conditional expectations and partial trace
4 Quantum States
5 POVMs and quantum measurements
6 Basic trace inequalities and convexity/concavity results
7 Operator monotonicity and convexity
8 Lieb's concavity theorem
9 Entanglement
10 Data processing inequalities
11 Supplement: Quantum Markov Semigroups and Logarithmic Sobolev Inequalities
Chapter 1

Introduction and Notations

Classically, the states of a physical system are modeled as points in a structured set like a
(smooth/Riemannian/Kähler/. . . ) manifold and the observables, that is, the measurable quan-
tities, as (smooth/continuous/. . . ) real-valued functions on the state space. So if the system is in
state x, the measurement outcome of observable f will be f (x).
More precisely, this is the setup of theories for one particle, and the states represented by a
single point are called pure states. If we move to statistical physics, then (mixed) states are more
generally probability measures, and the measurement outcome of observable f for a system in state
µ is ∫ f dµ. Pure states correspond to Dirac measures δ_x in this picture.
One common feature in both of these cases is that the state of a physical system completely
determines the measurement outcome of an observable. Early on in the development of quantum
mechanics it was recognized that this paradigm is incompatible with observations made at the
atomic level. At least that is the mainstream point of view; some people still try to develop
quantum physics in such a way that it is a deterministic theory. But we will not concern ourselves
with these approaches in these lectures.
Instead of a deterministic theory, in which the outcomes of a measurement are determined
by the state of the system, quantum theory is at its heart a probabilistic theory. The state of a
physical system only determines the probabilities of measurement outcomes and not their exact
value.
In quantum mechanics (of closed quantum systems), the state of a system is a unit vector ξ from
some Hilbert space H, which we take finite-dimensional for simplicity’s sake here, so you can think
of H = Cn if you want. An observable is modeled by a self-adjoint operator A on H. If we choose a
basis of H, then A can be represented as a Hermitian matrix. By the spectral theorem, there exists
an orthonormal basis e1 , . . . , en of eigenvectors of A corresponding to the eigenvalues λ1 , . . . , λn
(counted with multiplicity). If the system is in the state ξ, the possible measurement outcomes for
the observable A are λ1 , . . . , λn with probabilities |⟨ξ, e1 ⟩|2 , . . . , |⟨ξ, en ⟩|2 , respectively.
Again, this is the setup for one particle and the corresponding quantum states are called pure
states. If we move to quantum statistical mechanics, the (mixed) states are represented by density
operators. A density operator ρ is a self-adjoint operator such that all its eigenvalues are positive
(i.e. non-negative) and whose trace is 1. The pure state ξ corresponds to the density operator
⟨ξ, · ⟩ξ. In fact, this is more of a half-truth. To get the full picture, we should speak about open
and closed quantum systems and how pure states are not sufficient to describe open systems. But
for the purpose of this introduction, this analogy is a good guiding principle to understand the
mathematics.
The possible measurement outcomes for the observable A in state ρ are still λ1 , . . . , λn with
probabilities ⟨e1 , ρe1 ⟩, . . . , ⟨en , ρen ⟩. In particular, the expected value of A in state ρ is Tr(Aρ).
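As a quick numerical sanity check (ours, not part of the notes), the following NumPy sketch computes the outcome probabilities ⟨e_j, ρe_j⟩ and the expected value Tr(Aρ) for a randomly generated observable and density matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# random observable A = A* and random density matrix rho (positive, trace one)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = (X + X.conj().T) / 2
Y = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = Y @ Y.conj().T
rho /= np.trace(rho)

eigvals, eigvecs = np.linalg.eigh(A)          # columns are the eigenvectors e_j
probs = np.array([np.real(e.conj() @ rho @ e) for e in eigvecs.T])

assert np.isclose(probs.sum(), 1.0)                              # probabilities sum to 1
assert np.isclose(probs @ eigvals, np.real(np.trace(A @ rho)))   # expected value Tr(A rho)
```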


These probabilities behave differently from the probabilities in Kolmogorov’s axiomatic setup usu-
ally studied in probability theory. In fact, this difference can be quantified in terms of so-called
“Bell inequalities”, which have been used to experimentally rule out a classical probabilistic inter-
pretation of quantum mechanics.
One of the striking differences between classical and quantum observables is that multiplication
of functions is commutative, while the multiplication of linear operators (or matrices, if you like)
is not. A physical consequence of this fact is that some observables cannot be measured at the
same time with arbitrary precision – a fact known as Heisenberg’s uncertainty principle.
One of the most important aspects for quantum information theory is how we describe compos-
ite systems. In classical systems, if we have two systems with (pure) state spaces X and Y , then
the pure state space of the composite system is X × Y . In other words, the state of the composite
system is described by the state of the two parts.
In quantum physics, if the states of two systems A and B are described by Hilbert spaces HA
and HB , then the Hilbert space for the composite system is their tensor product HA ⊗ HB . This
is different from classical physics in that a (pure) state of the composite system is in general not
described simply by a pair of pure states for A and B. Mathematically speaking, a unit vector
ξ ∈ HA ⊗ HB is not necessarily of the form ξA ⊗ ξB for unit vectors ξA ∈ HA , ξB ∈ HB . This leads
to the phenomenon of entanglement, which is at the heart of many interesting quantum effects,
both desirable and not.
So far, this is not much of a theory. We can describe the state of a system at a given time
and the possible measurement outcomes, but nothing has been said about how physical systems change
with time. As is customary in quantum information theory, we will not concern ourselves with the
continuous-time evolution of quantum systems, which is governed by the Schrödinger equation.
Instead we will content ourselves with describing the possible changes a system can make.
It seems natural to describe the time evolution of a system as its state evolving in time while
the observables stay unchanged. Note however that the only thing one can really measure is the
outcome of an observable in a given state, not the observable itself or the state itself. Thus it
is just as valid to describe the time evolution of a physical system as the observables evolving in
time while the states stay unchanged. These two standpoints are called the Heisenberg picture
(observables evolve in time) and the Schrödinger picture (states evolve in time).
So, if we want a linear map Φ : B(H) → B(H) to describe the change of a system in the
Schrödinger picture, it should map states to states, that is, density operators to density operators.
If we break this down into the two parts of the definition of density operators, this means it should
• map positive operators to positive operators (positive map),
• preserve the trace of operators (trace-preserving map).
These two requirements are certainly enough for Φ to map density operators to density operators.
But it is one of the interesting quirks of quantum information theory that these two conditions are
not enough to ensure that Φ describes the change of the states of a quantum system. To see this,
one has to look at composite systems.
If systems A and B are described by HA and HB and the states of A change according to
Φ while the states of B stay unchanged, the states of AB should change according to the map
Φ ⊗ idB . So does Φ ⊗ idB always map density matrices to density matrices? Surprisingly not. It
is clearly trace-preserving, but it may fail to be positive. Maps Φ with the property that Φ ⊗ idB
is positive for arbitrary systems B are called completely positive and will occur time and again
during this course.
Similarly, the maps describing the change of the observables of a system in the Heisenberg
picture are unital completely positive maps. Here a linear map Φ : B(H) → B(H) is called unital
if Φ(1) = 1.

Now how do quantum entropies and trace inequalities enter the picture? Entropy is a concept
that occurs in many different shapes and forms in physics, so much that it has been called a
metaphor of nature itself. One basic physical principle is that physical systems seek to maximize
their entropy (the second law of thermodynamics). This plays a particularly important role in
understanding the dissipative behavior of open quantum systems.
A closely related quantity is the relative entropy. States that maximize the (“absolute”) entropy
when the total energy of the system is fixed are equilibrium states, called Gibbs states in this
context. The relative entropy of a state with respect to a Gibbs state then measures the deviation
of this state from equilibrium. One way to express this is the following: If a state has relative
entropy d with respect to the Gibbs state, then it takes ∼ log(d/ε) measurements to distinguish
this state from equilibrium with precision ε.
Mathematically, the quantum entropy S of a system in state ρ and the relative entropy D with
respect to a Gibbs state σ are expressed as

S(ρ) = −Tr(ρ log ρ),


D(ρ∥σ) = Tr(ρ(log ρ − log σ)).

We will soon make sense of expressions like log ρ for (some) matrices ρ. Trace inequalities enter
the picture to justify that these expressions have the expected physical properties. Maybe the
most prominent example is the data processing inequality, which states that the relative entropy
decreases when a quantum channel is applied to the state of the system. Mathematically, this is
reflected in certain convexity and monotonicity properties of the relative entropy.
Since this is a mathematics course after all, let us finish the first lecture by recalling some basic
linear algebra (or linear analysis, really).

Definition 1.1. An inner product space (or Hilbert space) is a vector space H equipped with a
map ⟨ · , · ⟩ : H × H → C satisfying

• ⟨ξ, αη + βζ⟩ = α⟨ξ, η⟩ + β⟨ξ, ζ⟩ for all ξ, η, ζ ∈ H, α, β ∈ C,


• ⟨η, ξ⟩ = \overline{⟨ξ, η⟩} for all ξ, η ∈ H,
• ⟨ξ, ξ⟩ ≥ 0 for all ξ ∈ H with equality if and only if ξ = 0.
Remark. We take all vector spaces to be finite-dimensional and over the complex numbers, unless
otherwise stated.
As we know, every (finite-dimensional complex) vector space is isomorphic to Cn for some
n ∈ N. To determine all inner products on Cn , we recall the notion of positive (semi-) definite
matrices.
Definition 1.2. A matrix A ∈ Mn (C) is called positive semidefinite (or simply positive) if ξ H Aξ ≥
0 for all ξ ∈ Cn . It is called positive definite if ξ H Aξ > 0 for all ξ ∈ Cn \ {0}. The set of all
positive semidefinite matrices in Mn (C) is denoted by Mn (C)+ and the subset of all positive definite
matrices by Mn (C)++ .
For A, B ∈ Mn (C) we write A ≤ B if B − A is positive semidefinite. In particular, A ≥ 0 means
that A is positive semidefinite.

Lemma 1.3. For every inner product ⟨ · , · ⟩ on Cn there exists a positive definite matrix A ∈ Mn (C)
such that ⟨ξ, η⟩ = ξ H Aη for all ξ, η ∈ Cn .

Proof. Let e_1, . . . , e_n be the standard basis of C^n and let A_{jk} = ⟨e_j, e_k⟩. For ξ, η ∈ C^n we have by
sesquilinearity

⟨ξ, η⟩ = ⟨∑_{j=1}^n ξ_j e_j, ∑_{k=1}^n η_k e_k⟩ = ∑_{j,k=1}^n \overline{ξ_j} η_k ⟨e_j, e_k⟩ = ∑_{j,k=1}^n \overline{ξ_j} A_{jk} η_k = ξ^H A η.

In particular,
ξ H Aξ = ⟨ξ, ξ⟩,
which implies that A is positive definite.

Lemma 1.4. If H is an inner product space, there exists n ∈ N and a linear isomorphism U : H →
Cn such that
(U ξ)H (U η) = ⟨ξ, η⟩
for all ξ, η ∈ H.

Proof. By linear algebra, there exists a linear isomorphism V : H → Cn for some n ∈ N. By the
previous lemma, there exists a positive definite matrix A ∈ Mn (C) such that ⟨ξ, η⟩ = (V ξ)H AV η
for all ξ, η ∈ H. Since A is positive definite, there exists an invertible matrix B ∈ Mn (C) such
that A = B H B (see Exercise 1.1). The map U = BV does the job.

In other words, there is essentially one inner product space of dimension n, namely Cn with
the inner product ⟨ξ, η⟩ = ξ^H η. From now on we will always consider Cn with this specific inner
product, also called the standard inner product. Notice that it has the nice property that the
standard basis satisfies ⟨ej , ek ⟩ = δjk , in other words, it is an orthonormal basis.
Lemma 1.5. Let H, K be inner product spaces. For every linear map A : H → K there exists a
unique linear map A∗ : K → H such that

⟨Aξ, η⟩ = ⟨ξ, A∗ η⟩

for all ξ ∈ H, η ∈ K.

Proof. By the previous lemma we can assume without loss of generality H = Cm , K = Cn (with
the standard inner products). Then A∗ = AH does the job. Uniqueness is easy to see.

From now on we will write A∗ instead of AH because that’s what the cool kids do.
Definition 1.6. Let H be a Hilbert space. A linear map A : H → H is called self-adjoint if
A∗ = A.
If H = Cn , then a self-adjoint linear map A : H → H can be identified with a hermitian matrix
in Mn (C), and vice versa. We will use these two viewpoints interchangeably. The hermitian
matrices in Mn (C) are denoted by Mn (C)sa .
Theorem 1.7 (Spectral theorem). Let H be a Hilbert space. Every self-adjoint A : H → H can
be written in the form
A = ∑_{j=1}^n λ_j ⟨ξ_j, · ⟩ ξ_j

with real numbers λj and an orthonormal basis ξ1 , . . . , ξn of H.



Proof. We prove this by induction over the dimension of H. For n = 1 it is clear. Suppose we
have proven it for dim(H) = n and let dim(H) = n + 1.
By Lemma 1.4, we can assume that H = C^{n+1} ≅ R^{2n+2} with the standard inner product.
Consider the map

f : C^{n+1} → C,    ξ ↦ ⟨ξ, Aξ⟩.

Since A is self-adjoint, we have \overline{f(ξ)} = \overline{⟨ξ, Aξ⟩} = ⟨Aξ, ξ⟩ = ⟨ξ, Aξ⟩ = f(ξ). In other words, f is real-valued.
Thus f attains its maximum on the sphere S = {ξ ∈ C^{n+1} : ∥ξ∥^2 = 1}. By the Lagrange multiplier
theorem, a maximizer ξ_1 of f on S must satisfy

2Aξ_1 = ∇_ξ⟨ξ, Aξ⟩|_{ξ=ξ_1} = λ_1 ∇_ξ(∥ξ∥^2 − 1)|_{ξ=ξ_1} = 2λ_1 ξ_1

for some λ_1 ∈ R.
If ξ ⊥ ξ_1, then

⟨Aξ, ξ_1⟩ = ⟨ξ, Aξ_1⟩ = λ_1⟨ξ, ξ_1⟩ = 0.

Thus A(V^⊥) ⊂ V^⊥, where V = Cξ_1. The subspace V^⊥ has dimension n. By the induction hypothesis, there exists an
orthonormal basis ξ_2, . . . , ξ_{n+1} of V^⊥ and real numbers λ_2, . . . , λ_{n+1} such that

Aη = ∑_{j=2}^{n+1} λ_j ⟨ξ_j, η⟩ ξ_j

for η ∈ V^⊥.
Hence if ξ = ⟨ξ_1, ξ⟩ξ_1 + η with η ∈ V^⊥, then

Aξ = ⟨ξ_1, ξ⟩Aξ_1 + ∑_{j=2}^{n+1} λ_j ⟨ξ_j, η⟩ξ_j = ∑_{j=1}^{n+1} λ_j ⟨ξ_j, ξ⟩ξ_j.

Remark. The operators Pj = ⟨ξj , ·⟩ξj are projections, that is, Pj2 = Pj∗ = Pj . Moreover, the
orthogonality relation ⟨ξi , ξj ⟩ = 0 for i ̸= j implies that the projections Pi are orthogonal in the
sense that Pi Pj = 0 for i ̸= j, and the completeness of an orthonormal basis implies that the
projections Pi sum up to 1.
This means that every self-adjoint operator A can be written as

A = ∑_{j=1}^m λ_j P_j

with real numbers λ_j and orthogonal projections P_j such that ∑_{j=1}^m P_j = 1. Such a representation
is called a spectral decomposition of A. Usually one sums up the projections belonging to the same
eigenvalue.
If A has spectral decomposition A = ∑_{j=1}^m λ_j P_j, then the eigenvalues of A are exactly the
numbers λ_1, . . . , λ_m.
Definition 1.8. If A ∈ Mn (C) is self-adjoint with spectral decomposition A = ∑_{j=1}^m λ_j P_j and
f : {λ_1, . . . , λ_m} → C, we define

f(A) = ∑_{j=1}^m f(λ_j) P_j.

This notation is consistent with the usual notations Ak and (A − µ)−1 .
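As a small numerical illustration (the helper name matrix_function is ours), the functional calculus of Definition 1.8 amounts to applying f to the eigenvalues in an eigenbasis; for Hermitian matrices it reproduces the usual matrix power and matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

def matrix_function(A, f):
    """Apply f to a Hermitian matrix A through its spectral decomposition."""
    eigvals, eigvecs = np.linalg.eigh(A)
    return eigvecs @ np.diag(f(eigvals)) @ eigvecs.conj().T

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (X + X.conj().T) / 2

assert np.allclose(matrix_function(A, lambda t: t**3), np.linalg.matrix_power(A, 3))
assert np.allclose(matrix_function(A, np.exp), expm(A))
```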



Lemma 1.9. Let A ∈ Mn (C) be self-adjoint with spectral decomposition A = ∑_{j=1}^m λ_j P_j.

(a) If f(λ) = λ^k, then f(A) = A^k.

(b) If µ ∉ {λ_1, . . . , λ_m} and f(λ) = (λ − µ)^{−1}, then f(A) = (A − µ)^{−1}.

Proof. (a) We proceed by induction over k. For k = 0, the claim is true. Now assume it is true
for k and let us prove it for k + 1. We have

A^{k+1} = A^k A = (∑_{i=1}^m λ_i^k P_i)(∑_{j=1}^m λ_j P_j) = ∑_{i,j=1}^m λ_i^k λ_j P_i P_j.

Since the maps P_i are orthogonal projections, we have P_i P_j = 0 if i ≠ j and P_i^2 = P_i. Thus
A^{k+1} = ∑_{j=1}^m λ_j^{k+1} P_j = f(A).
(b) Since µ ∉ {λ_1, . . . , λ_m}, the operator A − µ is injective, hence also surjective by dimension
considerations. This means that the inverse (A − µ)^{−1} exists. Moreover,

(A − µ)f(A) = (∑_{i=1}^m (λ_i − µ)P_i)(∑_{j=1}^m (λ_j − µ)^{−1} P_j) = ∑_{i,j=1}^m (λ_i − µ)(λ_j − µ)^{−1} P_i P_j.

Again, we can use the orthogonality relation of the projections P_i to reduce the last term to
∑_{j=1}^m (λ_j − µ)^{−1}(λ_j − µ) P_j = ∑_{j=1}^m P_j = 1. It follows from the uniqueness of (right) inverses
that f(A) = (A − µ)^{−1}.

Exercises
Exercise 1.1. For a matrix A ∈ Mn (C), show that the following are equivalent:

(i) A is positive semidefinite (positive definite).


(ii) A = A∗ and all eigenvalues of A are nonnegative (strictly positive).
(iii) A = B ∗ B for some (invertible) B ∈ Mn (C).

Proof. (i) =⇒ (ii): This implication uses the very practical polarization identity:

⟨ξ, Aη⟩ = (1/4) ∑_{k=0}^3 (−i)^k ⟨ξ + i^k η, A(ξ + i^k η)⟩.

To prove it, you just have to sit down and expand the inner product. Not very pleasant, but it
works.

Since A is positive, all summands on the right side are real. Thus

(1/4) ∑_{k=0}^3 (−i)^k ⟨ξ + i^k η, A(ξ + i^k η)⟩ = (1/4) ∑_{k=0}^3 (−i)^k \overline{⟨ξ + i^k η, A(ξ + i^k η)⟩}
                                                 = (1/4) ∑_{k=0}^3 (−i)^k ⟨A(ξ + i^k η), ξ + i^k η⟩.

Then we can apply the polarization identity again to get

(1/4) ∑_{k=0}^3 (−i)^k ⟨A(ξ + i^k η), ξ + i^k η⟩ = ⟨Aξ, η⟩ = ⟨ξ, A∗ η⟩.

Hence, A = A∗.
If ξ is an eigenvector of A to the eigenvalue λ, then ⟨ξ, Aξ⟩ = λ∥ξ∥2 . Hence λ ≥ 0 (resp. λ > 0)
if A is positive semi-definite (resp. positive definite).
(ii) =⇒ (iii): Let λ_1, . . . , λ_n ∈ R_+ denote the eigenvalues of A. By the spectral theorem, there
exist orthogonal projections P_1, . . . , P_n such that

A = ∑_{j=1}^n λ_j P_j.

Let B = ∑_{j=1}^n √λ_j P_j. Then B = B∗ and B^2 = A. If the eigenvalues of A are strictly
positive, then B is invertible with inverse B^{−1} = ∑_j λ_j^{−1/2} P_j.
(iii) =⇒ (i): If A = B ∗ B, then

⟨ξ, Aξ⟩ = ⟨ξ, B ∗ Bξ⟩ = ⟨Bξ, Bξ⟩ ≥ 0

for all ξ ∈ Cn . Thus A is positive semi-definite.


If B is invertible and ξ ̸= 0, then Bξ ̸= 0, hence ⟨Bξ, Bξ⟩ > 0.

Exercise 1.2. Show that for every A ∈ Mn (C) there exist positive semidefinite matrices A1 , . . . , A4 ∈
Mn (C) such that A = A1 − A2 + i(A3 − A4 ).

Exercise 1.3. For matrices A, B ∈ Mn (C) define their Hadamard product A ◦ B as the matrix
with entries Aj,k Bj,k . Show that the Hadamard product of two positive semi-definite matrices is
again positive semi-definite.

Exercise 1.4. The absolute value |A| of a matrix A ∈ Mn (C) is defined as |A| = (A∗ A)1/2 .

(a) Show that for every A ∈ Mn (C) there exists a unitary matrix U ∈ Mn (C) such that A = U |A|.
(b) Let A, B ∈ Mn (C) be self-adjoint with eigenvalues λ1 ≤ · · · ≤ λn and µ1 ≤ · · · ≤ µn ,
respectively. Show that if λk ≤ µk for all k ∈ {1, . . . , n}, then there exists a unitary matrix
U ∈ Mn (C) such that A ≤ U ∗ BU .
(c) Show that for every A ∈ Mn (C) there exists a unitary matrix U ∈ Mn (C) such that (1/2)(A +
A∗)_+ ≤ U∗|A|U. (Hint: Use the minmax principle.)
(d) Show that for all A, B ∈ Mn (C) there exist unitary matrices U, V ∈ Mn (C) such that
|A + B| ≤ U ∗ |A|U + V ∗ |B|V .

(e) Show that there exist A, B ∈ Mn (C) such that |A + B| ̸≤ |A| + |B|.
Exercise 1.5. (a) Show that if φ : Mn (C) → C is a linear map such that φ(AB) = φ(BA) for
all A, B ∈ Mn (C), then φ = (φ(1)/n) Tr.

(b) Let H be an infinite-dimensional Hilbert space and B(H) the set of all bounded linear
operators on H. Show that if φ : B(H) → C is a linear map such that φ(AB) = φ(BA) for
all A, B ∈ B(H), then φ = 0 (harder).

Exercise 1.6 (Schur complement theorem). Let A ∈ Mn (C) be invertible, B ∈ Mn,m (C) and
C ∈ Mm (C). Show that the block matrix

( A   B ; B∗   C )

is positive if and only if A ≥ 0, C ≥ 0 and C − B∗A^{−1}B ≥ 0.

Exercise 1.7. (a) Let V be a subspace of Mn (C) such that ABC ∈ V for all A, C ∈ Mn (C)
and B ∈ V . Show that V = {0} or V = Mn (C).
(b) Show that if Φ : Mm (C) → Mn (C) is a linear map such that Φ(AB) = Φ(A)Φ(B) for all
A, B ∈ Mm (C), then either Φ = 0 or Φ is injective.
Chapter 2

Complete positivity

Definition 2.1 (Tensor product of vector spaces/Hilbert spaces). For Hilbert spaces H and K
let Bil(H × K; C) be the vector space of all sesquilinear maps from H × K to C. For ξ ∈ H and
η ∈ K define
ξ ⊗ η : Bil(H × K; C) → C, φ 7→ φ(ξ, η).
The tensor product H ⊗ K is the linear span of all elements ξ ⊗ η with ξ ∈ H and η ∈ K. It is a
Hilbert space when endowed with the inner product

⟨ξ1 ⊗ η1 , ξ2 ⊗ η2 ⟩ = ⟨ξ1 , ξ2 ⟩⟨η1 , η2 ⟩.

Remark. We already know that Cm ⊗ Cn must be isomorphic (as an inner product space) to Ck for
some k ∈ N. It is not hard to see that the elementary tensors e_i ⊗ e_j with i ∈ {1, . . . , m} and
j ∈ {1, . . . , n} form an orthonormal basis of Cm ⊗ Cn. Thus Cm ⊗ Cn ≅ Cmn as inner product
spaces.
Definition 2.2 (Tensor product of matrices/maps). If Φ : H1 → K1 and Ψ : H2 → K2 are linear
maps, then their tensor product Φ ⊗ Ψ is the linear map from H1 ⊗ H2 to K1 ⊗ K2 , defined on
elementary tensors by
(Φ ⊗ Ψ)(ξ ⊗ η) = Φ(ξ) ⊗ Ψ(η).
The linear span of all elements Φ⊗Ψ with Φ ∈ Mm,k (C) and Ψ ∈ Mn,l (C) is denoted by Mm,k (C)⊗
Mn,l (C).
Remark. We will always identify elements of Mm (C) ⊗ Mn (C) with mn × mn matrices in the
following way: Let (Eij ) be the matrix units in Mn (C), that is, Eij is the matrix whose (i, j)-entry
is 1 and all other entries are 0. The matrix A ⊗ Eij is identified with the block matrix in Mmn (C)
with blocks of size m × m, where the block at position (i, j) is A and all other blocks are zero.
Here is an example:

A ⊗ ( 0  1 ; 0  0 )   ↦   ( 0  A ; 0  0 ).
Since the matrix units form a basis of Mn (C), this identification can be linearly extended to all of
Mm (C) ⊗ Mn (C). For example,

A ⊗ ( a  b ; c  d )   ↦   ( aA  bA ; cA  dA ).

In other words, the elementary tensor A ⊗ B is identified with the Kronecker product of A and B
(which is also denoted by A ⊗ B for this reason).
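A quick NumPy check of this identification (illustrative only): with the block convention above, where the first factor A sits inside the blocks, the elementary tensor A ⊗ B corresponds to np.kron(B, A).

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])          # plays the role of ( a b ; c d )

# ( aA bA ; cA dA ) built block by block, as in the displayed example
blocks = np.block([[B[0, 0] * A, B[0, 1] * A],
                   [B[1, 0] * A, B[1, 1] * A]])

assert np.array_equal(blocks, np.kron(B, A))
```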


Definition 2.3 (Completely positive maps and quantum channels). A linear map Φ : Mm (C) →
Mn (C) is called positive if it maps positive semi-definite matrices to positive semi-definite matrices.
For k ≥ 1, a linear map Φ : Mm (C) → Mn (C) is said to be k-positive if Φ ⊗ id_k : Mm (C) ⊗ Mk (C) →
Mn (C) ⊗ Mk (C) is positive. It is said to be completely positive if it is k-positive for every k ≥ 1.
In general, characterizing k-positive maps from Mm (C) to Mn (C) is a hard task. The situation
is much better for completely positive maps. Let us start with a few (non-) examples.
Example 2.4. The transpose map T (A) = AT is positive but not 2-positive (exercise).
Example 2.5. The depolarizing channel Φ(A) = λA + (1 − λ)Tr(A)1 for λ ∈ [0, 1] is completely
positive.
Example 2.6. The following maps are completely positive:
1. If V ∈ Mm,n (C), then the map Φ : Mm (C) → Mn (C), A 7→ V ∗ AV is completely positive.
2. Any ∗-homomorphism π, that is, a linear map π : Mm (C) → Mn (C) such that π(AB) = π(A)π(B)
and π(A∗) = π(A)∗ for all A, B ∈ Mm (C). For a more concrete example, take

π : Mn (C) → M2n (C),    A ↦ ( A  0 ; 0  A ).

We will soon see that all completely positive maps can be constructed from these two examples.
Lemma 2.7. A linear map Φ : Mm (C) → Mn (C) is completely positive if and only if for every
N ∈ N, all A1 , . . . , AN ∈ Mm (C) and ξ1 , . . . , ξN ∈ Cn we have
l
X
⟨ξj , Φ(A∗j Ak )ξk ⟩ ≥ 0.
j,k=1

Proof. First assume that Φ is completely positive. We have

∑_{j,k=1}^N ⟨ξ_j, Φ(A_j^∗ A_k)ξ_k⟩ = ∑_{i,j,k,l=1}^N ⟨ξ_i δ_{i,j}, Φ(A_j^∗ A_k)ξ_l δ_{k,l}⟩
 = ∑_{i,j,k,l=1}^N ⟨ξ_i, Φ(A_j^∗ A_k)ξ_l⟩⟨e_i, E_{jk} e_l⟩
 = ∑_{i,j,k,l=1}^N ⟨ξ_i ⊗ e_i, (Φ ⊗ id)(A_j^∗ A_k ⊗ E_{jk})(ξ_l ⊗ e_l)⟩
 = ⟨∑_{i=1}^N ξ_i ⊗ e_i, (Φ ⊗ id)((∑_{j=1}^N A_j ⊗ E_{1j})^∗ (∑_{k=1}^N A_k ⊗ E_{1k}))(∑_{l=1}^N ξ_l ⊗ e_l)⟩.

Since (∑_{j=1}^N A_j ⊗ E_{1j})^∗ (∑_{k=1}^N A_k ⊗ E_{1k}) is positive and Φ is completely positive, the last ex-
pression is nonnegative.
Conversely, any positive element of Mm (C) ⊗ M_N (C) is of the form

(∑_{i,j=1}^N B_{ij} ⊗ E_{ij})^∗ (∑_{i,j=1}^N B_{ij} ⊗ E_{ij}) = ∑_{i,j=1}^N A_i^∗ A_j ⊗ E_{ij}

with A_i = ∑_{k=1}^N B_{ki}. Moreover, any ξ ∈ C^n ⊗ C^N is of the form

ξ = ∑_{j=1}^N ξ_j ⊗ e_j

with ξ_j ∈ C^n. Since

⟨ξ, (Φ ⊗ id_{M_N(C)})(∑_{i,j=1}^N A_i^∗ A_j ⊗ E_{ij}) ξ⟩ = ∑_{i,j=1}^N ⟨ξ_i, Φ(A_i^∗ A_j)ξ_j⟩,
the map Φ ⊗ idMN (C) is positive if and only if the inequality from the lemma holds.

Theorem 2.8 (Stinespring’s dilation theorem). Any completely positive map Φ : Mm (C) → Mn (C)
can be represented as
Φ(A) = V ∗ π(A)V
with V ∈ Mk,n (C) and a unital ∗-homomorphism π : Mm (C) → Mk (C) for some k ∈ N.

Proof. The proof is reminiscent of the GNS construction, and in fact, it can be understood as a
generalization of it. On Mm (C) ⊗ Cn define

⟨∑_j A_j ⊗ ξ_j, ∑_k B_k ⊗ η_k⟩_K = ∑_{j,k} ⟨ξ_j, Φ(A_j^∗ B_k)η_k⟩.

This map is clearly sesquilinear, and by the previous lemma it is also positive semi-definite. It may
fail to be non-degenerate, so we define K as the quotient of Mm (C) ⊗ Cn by the kernel of ⟨·, ·⟩_K.
We write A ⊗_K ξ for the image of A ⊗ ξ in K under the quotient map.
Then

⟨∑_j A_j ⊗_K ξ_j, ∑_k B_k ⊗_K η_k⟩ := ⟨∑_j A_j ⊗ ξ_j, ∑_k B_k ⊗ η_k⟩_K

defines an inner product on K, making it a Hilbert space. Clearly, K is finite-dimensional, so that
K ≅ C^k for some k ∈ N.
Define V : Cn → K, ξ ↦ 1 ⊗_K ξ and π : Mm (C) → B(K), π(A)(B ⊗_K ξ) = AB ⊗_K ξ. To show
that π(A) is well-defined, first note that A∗A ≤ λ1, where λ is the largest eigenvalue of A∗A. Thus
there exists C such that λ1 − A∗A = C∗C. Hence

λ ∑_{j,k} ⟨ξ_j, Φ(B_j^∗ B_k)ξ_k⟩ − ∑_{j,k} ⟨ξ_j, Φ(B_j^∗ A^∗ A B_k)ξ_k⟩ = ∑_{j,k} ⟨ξ_j, Φ(B_j^∗(λ1 − A^∗A)B_k)ξ_k⟩
 = ∑_{j,k} ⟨ξ_j, Φ(B_j^∗ C^∗ C B_k)ξ_k⟩
 ≥ 0

by the previous lemma. In particular, if ∑_j B_j ⊗_K ξ_j = 0, then ∑_j A B_j ⊗_K ξ_j = 0.
Let us compute the adjoint of V . For ξ, η ∈ Cn and A ∈ Mm (C) we have

⟨ξ, V ∗ (A ⊗K η)⟩ = ⟨V ξ, A ⊗K η⟩ = ⟨1 ⊗K ξ, A ⊗K η⟩ = ⟨ξ, Φ(A)η⟩.

Hence V ∗ (A ⊗K η) = Φ(A)η and we conclude that

V ∗ π(A)V ξ = V ∗ π(A)(1 ⊗K ξ) = V ∗ (A ⊗K ξ) = Φ(A)ξ.



Remark. The construction in the proof is essentially forced upon us by the statement of the
Stinespring dilation theorem: Let us assume there is a (finite-dimensional) Hilbert space K, a
linear map V : Cn → K and a unital ∗-homomorphism π : Mm (C) → B(K) such that
Φ(A) = V ∗ π(A)V
for all A ∈ Mm (C).
Without loss of generality we may assume that elements of the form π(A)V ξ with A ∈ Mm (C)
and ξ ∈ Cn linearly span K. Since the map (A, ξ) 7→ π(A)V ξ is bilinear, there exists a surjection
q : Mm (C) ⊗ Cn → K such that q(A ⊗ ξ) = π(A)V ξ. In particular, K is a quotient of Mm (C) ⊗ Cn .
Moreover, the inner product on K must satisfy
⟨π(A)V ξ, π(B)V η⟩ = ⟨ξ, V ∗ π(A∗ B)V η⟩ = ⟨ξ, Φ(A∗ B)η⟩.
If we pull this back to Mm (C) ⊗ Cn via
⟨A ⊗ ξ, B ⊗ η⟩ := ⟨q(A ⊗ ξ), q(B ⊗ η)⟩,
one gets exactly the sesquilinear form from the proof.
For the following result recall the definition of the operator norm: If A is a linear operator
between the Hilbert spaces H and K, then
∥A∥ = sup{⟨Aξ, Aξ⟩^{1/2} : ξ ∈ H, ⟨ξ, ξ⟩ ≤ 1}.

Theorem 2.9 (Kadison–Schwarz inequality). If Φ : Mm (C) → Mn (C) is a completely positive


map, then we have
Φ(A)∗ Φ(A) ≤ ∥Φ(1)∥Φ(A∗ A).
Proof. By Stinespring’s dilation theorem there exists V ∈ Mk,n (C) and a unital ∗-homomorphism
π : Mm (C) → Mk (C) such that
Φ(A) = V ∗ π(A)V
for all A ∈ Mm (C). Thus
Φ(A)∗ Φ(A) = V ∗ π(A)∗ V V ∗ π(A)V
≤ ∥V V ∗ ∥V ∗ π(A)∗ π(A)V
= ∥V V ∗ ∥Φ(A∗ A).
Now it suffices to notice that
∥V V ∗ ∥ = ∥V ∗ V ∥ = ∥Φ(1)∥.
Remark. In fact, the Kadison–Schwarz inequality holds more generally for 2-positive maps (ex-
ercise).
Lemma 2.10. Let Φ : Mm (C) → Mn (C) be a linear map. If the Choi matrix

C_Φ := (Φ ⊗ id_{Mm(C)})(∑_{i,j=1}^m E_{ij} ⊗ E_{ij}) ∈ M_{mn}(C)

is positive, then there exist V_1, . . . , V_k ∈ Mm,n (C) such that

Φ(A) = ∑_{l=1}^k V_l^∗ A V_l.

Proof. Since C_Φ is positive, there exists B ∈ M_{mn}(C) such that B^∗B = C_Φ. Let b_1, . . . , b_{mn} ∈ C^{mn}
be the row vectors of B, so that

C_Φ = B^∗B = ∑_{l=1}^{mn} b_l^∗ b_l.

Further let

J_l : C^n → C^n ⊗ C^m,    ξ ↦ ξ ⊗ e_l.

Note that

J_k^∗ C_Φ J_l ξ = ∑_{i,j=1}^m J_k^∗ (Φ(E_{ij})ξ ⊗ E_{ij} e_l) = ∑_{i,j=1}^m δ_{jl} δ_{ik} Φ(E_{ij})ξ = Φ(E_{kl})ξ.

Let V_l be the matrix with rows b_l J_1, . . . , b_l J_m. A direct calculation shows

∑_{l=1}^{mn} V_l^∗ E_{ij} V_l = Φ(E_{ij}),

from which the claim follows by linearity.


Remark. As we have seen in the proof, one can always take k = mn in the previous lemma.
However, the minimal number of Vl may be smaller. For example, some rows of B in the proof
may be zero.
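As an illustrative sketch (ours, not the row-vector construction of the proof), one can also extract Kraus operators from a positive Choi matrix by an eigendecomposition. Here the Choi matrix is built as ∑_{ij} E_{ij} ⊗ Φ(E_{ij}), and the resulting operators K_l play the role of V_l^∗, i.e. Φ(A) = ∑_l K_l A K_l^∗.

```python
import numpy as np

def choi_matrix(phi, m):
    """Choi matrix of a map phi on M_m(C), as an (m*n) x (m*n) array."""
    n = phi(np.eye(m)).shape[0]
    C = np.zeros((m * n, m * n), dtype=complex)
    for i in range(m):
        for j in range(m):
            E = np.zeros((m, m)); E[i, j] = 1
            C += np.kron(E, phi(E))
    return C

def kraus_from_choi(C, m, n, tol=1e-10):
    """Kraus operators K_l with phi(A) = sum_l K_l A K_l^*."""
    vals, vecs = np.linalg.eigh(C)
    return [np.sqrt(v) * w.reshape(m, n).T for v, w in zip(vals, vecs.T) if v > tol]

# Example: the completely positive map A -> V* A V from Example 2.6.
m, n = 3, 2
rng = np.random.default_rng(2)
V = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))
phi = lambda A: V.conj().T @ A @ V

K = kraus_from_choi(choi_matrix(phi, m), m, n)
A = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
assert np.allclose(sum(Kl @ A @ Kl.conj().T for Kl in K), phi(A))
```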
Theorem 2.11 (Kraus, Choi). Any m-positive map Φ : Mm (C) → Mn (C) can be represented as

Φ(A) = ∑_{j=1}^k V_j^∗ A V_j.

Furthermore, Φ is unital if and only if

∑_{j=1}^k V_j^∗ V_j = 1

and trace-preserving if and only if

∑_{j=1}^k V_j V_j^∗ = 1.

Proof. Since

(∑_{i,j=1}^m E_{ij} ⊗ E_{ij})^∗ (∑_{i,j=1}^m E_{ij} ⊗ E_{ij}) = m ∑_{i,j=1}^m E_{ij} ⊗ E_{ij},

the matrix ∑_{i,j} E_{ij} ⊗ E_{ij} is positive. As Φ is assumed to be m-positive, this implies that the Choi
matrix C_Φ is positive. Now the first claim follows from the previous lemma.
For the second claim observe that

Tr(Φ(A)) = ∑_{j=1}^k Tr(V_j^∗ A V_j) = ∑_{j=1}^k Tr(A V_j V_j^∗),

so that Φ is trace-preserving if and only if ∑_j V_j V_j^∗ = 1. The claim for unital maps is immediate.

Theorem 2.12 (Choi’s criterion of completely positive maps). Let Φ : Mm (C) → Mn (C) be a
linear map. The following are equivalent:

(i) Φ is m-positive.
(ii) The Choi matrix

C_Φ := (Φ ⊗ id_{Mm(C)})(∑_{i,j=1}^m E_{ij} ⊗ E_{ij}) ∈ M_{mn}(C)

is positive, where E_{ij}, 1 ≤ i, j ≤ m, are the matrix units.


(iii) Φ is completely positive.

Proof. (i) =⇒ (ii) was shown in the proof of the previous theorem. (ii) =⇒ (iii): If the Choi matrix
is positive, then Φ(A) = ∑_l V_l^∗ A V_l by Lemma 2.10. It is easy to see that maps of this form are
completely positive. (iii) =⇒ (i) is obvious.
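A small numerical illustration of Choi's criterion (ours): the Choi matrix of the transpose map from Example 2.4 has a negative eigenvalue, while the depolarizing channel from Example 2.5 passes the test.

```python
import numpy as np

m, lam = 2, 0.5
C_T = np.zeros((m * m, m * m), dtype=complex)     # Choi matrix of the transpose map
C_dep = np.zeros((m * m, m * m), dtype=complex)   # Choi matrix of the depolarizing channel
for i in range(m):
    for j in range(m):
        E = np.zeros((m, m)); E[i, j] = 1
        C_T += np.kron(E, E.T)
        C_dep += np.kron(E, lam * E + (1 - lam) * np.trace(E) * np.eye(m))

print(np.linalg.eigvalsh(C_T))                    # contains -1: not completely positive
print(np.linalg.eigvalsh(C_dep))                  # all nonnegative: completely positive
```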

Theorem 2.13 (Uhlmann, Lindblad). For any quantum channel Φ : Mm (C) → Mn (C), there exist
N > 0 and a pure state δ ∈ MN such that
Φ(A) ⊗ (1_N / N) = ∫ U^∗ (A ⊗ δ) U dU,
where dU is the Haar measure on the unitary group.

Exercises
Exercise 2.1. Show that Ck ⊗ Cm has the following universal property: For every bilinear map
φ : Ck ×Cm → Cn there exists a unique linear map Φ : Ck ⊗Cm → Cn such that φ(x, y) = Φ(x⊗y).

Exercise 2.2. For each n ≥ 1, find maps that are n-positive but not (n + 1)-positive.

Exercise 2.3. Show that the maps from Example 2.6 are completely positive.
Exercise 2.4 (Completely positive maps are completely bounded). Recall that the operator norm
of A ∈ Mn (C) is the square root of the largest eigenvalue of A∗ A. Accordingly, the norm of a
linear map Φ : Mm (C) → Mn (C) is

∥Φ∥ = sup_{∥A∥=1} ∥Φ(A)∥.

Show that if Φ is completely positive, then ∥Φ ⊗ idMk (C) ∥ ≤ ∥Φ∥ for all k ∈ N.

Exercise 2.5 (GNS construction for states). (a) Show that any positive linear functional φ : Mn (C) →
C is completely positive.
(b) A unital positive linear map φ : Mn (C) → C is called a state. Show that for every state φ
there exists a unital ∗-homomorphism π : Mn (C) → Mk (C) for some k ∈ N and a unit vector
ξ ∈ Ck such that
φ(A) = ⟨ξ, π(A)ξ⟩
for all A ∈ Mn (C).

Exercise 2.6. Show that every 2-positive map satisfies the Kadison–Schwarz inequality.
Exercise 2.7 (Hilbert modules I). Recall that a right Mn (C)-module is a vector space E together
with an associative and distributive product E × Mn (C) → E. A right Hilbert Mn (C)-module is
a right Mn (C)-module together with a sesquilinear map

(·|·) : E × E → Mn (C)

such that

• (ξ|ηA) = (ξ|η)A for all ξ, η ∈ E, A ∈ Mn (C),


• (η|ξ) = (ξ|η)∗ for all ξ, η ∈ E,
• (ξ|ξ) is positive semidefinite for all ξ ∈ E,
• (ξ|ξ) = 0 if and only if ξ = 0.

(a) Show that Mm,n (C) with the usual right multiplication and (A|B) = A∗ B is a right Hilbert
Mn (C)-module.
(b) (maybe not so easy) Show that for every (finite-dimensional) right Hilbert Mn (C)-module E
there exists m ∈ N and a linear isomorphism α : E → Mm,n (C) such that
• α(ξA) = α(ξ)A for all ξ ∈ E, A ∈ Mn (C),
• α(ξ)∗ α(η) = (ξ|η) for all ξ, η ∈ E.

Exercise 2.8 (Hilbert modules II). Let E, F be (finite-dimensional) right Hilbert Mn (C)-modules.
A linear map T : E → F is called adjointable if there exists a linear map T ∗ : F → E such that

(T ξ|η) = (ξ|T ∗ η)

for all ξ ∈ E, η ∈ F . The set of all adjointable operators from E to F is denoted by L(E, F ).

(a) Show that a linear operator T : E → F is adjointable if and only if

T (ξA) = (T ξ)A

for all ξ ∈ E, A ∈ Mn (C).


Hint: One direction is easy. For the other one use the inner product ⟨ξ, η⟩ = tr((ξ|η)) on E
and F .
(b) Show that every adjointable map T : Mm,n (C) → Mk,n (C) is of the form

T (B) = AB

for some A ∈ Mk,m (C).

Exercise 2.9 (Hilbert modules III). A Hilbert Mm (C)-Mn (C)-module is a right Hilbert Mn (C)-
module E together with a unital ∗-homomorphism π : Mm (C) → L(E, E).

(a) Show that M_{km,n}(C) with

π_L : Mm (C) → L(M_{km,n}(C), M_{km,n}(C)),    π_L(A)B = diag(A, . . . , A) B    (k copies of A on the diagonal)

is a Hilbert Mm (C)-Mn (C)-module.


(b) Show that for every Hilbert Mm (C)-Mn (C)-module (E, π) there exists k ∈ N and a linear
isomorphism α : E → M_{km,n}(C) such that
• α(ξB) = α(ξ)B for all ξ ∈ E, B ∈ Mn (C),
• α(ξ)∗ α(η) = (ξ|η) for all ξ, η ∈ E,
• α ◦ π(A) = π_L(A) ◦ α for all A ∈ Mm (C).
(c) Show that for every completely positive map Φ : Mm (C) → Mn (C) there exists a Hilbert
Mm (C)-Mn (C)-module (E, π) and an adjointable operator V : Mn (C) → E such that

Φ(A) = V ∗ π(A)V

for all A ∈ Mm (C).


(d) Deduce Theorem 2.11 from (b) and (c).
Chapter 3

Conditional expectations and partial trace

Definition 3.1. The Hilbert–Schmidt inner product on Mn (C) is defined as

⟨·, ·⟩HS : Mn (C) × Mn (C) → C, (A, B) 7→ Tr(A∗ B).

If Φ : Mm (C) → Mn (C) is a linear map, we denote its adjoint with respect to the Hilbert–
Schmidt inner product by Φ† .
Lemma 3.2. A matrix A ∈ Mn (C) is positive if and only if ⟨A, B⟩HS ≥ 0 for all B ∈ Mn (C).

Proof. If A is positive, then

⟨A, B⟩HS = Tr(AB) = Tr(A1/2 BA1/2 ) ≥ 0

for all B ∈ Mn (C)+ .


For the converse let ξ ∈ Cn and Bξ η = ⟨ξ, η⟩ξ. Since Bξ∗ Bξ = Bξ2 = ∥ξ∥2 Bξ , the operator Bξ
is positive. Moreover,
⟨A, Bξ ⟩HS = Tr(A∗ Bξ ) = ⟨Aξ, ξ⟩.
Thus if ⟨A, B⟩HS ≥ 0 for all B ∈ Mn (C)+ , then ⟨Aξ, ξ⟩ ≥ 0, that is, A ≥ 0.

Remark. By the Riesz representation theorem, every linear map φ : Mn (C) → C is of the form
φ = Tr(B · ) for some B ∈ Mn (C). By the previous lemma, this map is positive if and only if
B ≥ 0.
The Hilbert–Schmidt adjoint connects the Heisenberg and Schrödinger picture, as the following
lemma shows.
Lemma 3.3. A linear map Φ : Mm (C) → Mn (C) is unital completely positive if and only if Φ† is
completely positive trace-preserving.

Proof. Since (Φ ⊗ idMk (C) )† = Φ† ⊗ idMk (C) , it suffices to show that Φ is unital positive if and only
if Φ† is positive trace-preserving.
As
⟨Φ(1), A⟩HS = ⟨1, Φ† (A)⟩HS = Tr(Φ† (A)),
we have Φ(1) = 1 if and only if Tr(Φ† (A)) = Tr(A) for all A ∈ Mn (C).
That Φ† is positive if and only if Φ is positive is an easy consequence of the previous lemma.
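A short numerical check of this duality (ours): for Φ(A) = ∑_l V_l^∗ A V_l, the Hilbert–Schmidt adjoint is Φ†(B) = ∑_l V_l B V_l^∗, and ⟨Φ(A), B⟩_HS = ⟨A, Φ†(B)⟩_HS.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 2
Vs = [rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n)) for _ in range(2)]

phi = lambda A: sum(V.conj().T @ A @ V for V in Vs)        # Schroedinger picture
phi_dag = lambda B: sum(V @ B @ V.conj().T for V in Vs)    # Heisenberg picture

A = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

hs = lambda X, Y: np.trace(X.conj().T @ Y)                 # Hilbert-Schmidt inner product
assert np.isclose(hs(phi(A), B), hs(A, phi_dag(B)))
```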


Definition 3.4. Let M be a subalgebra of Mn (C) that is closed under taking adjoints and contains
1. Let ιM : M → Mn (C) be the inclusion map. The conditional expectation EM onto M is ιM ◦ι†M .

Proposition 3.5. The conditional expectation EM has the following properties:


2
(a) It is idempotent, that is, EM = EM .

(b) It is symmetric with respect to the Hilbert–Schmidt inner product, that is, E_M^† = E_M.
(c) It is unital completely positive trace-preserving.
(d) It is a bimodule map, that is, EM (ABC) = AEM (B)C for all A, C ∈ M and B ∈ Mn (C).

Proof. (a) By definition, ιM is an isometry, so that ι†M ιM = idM and hence


E_M^2 = ι_M ι_M^† ι_M ι_M^† = ι_M ι_M^† = E_M.

(b) Clear from the definition.


(c) This follows from the previous lemma.
(d) For A, C ∈ M let LA B = AB, RC B = BC. Clearly, LA and RC commute with ιM . Taking
adjoints yields the claim.

Remark. It follows from (a) and (b) that EM is the orthogonal projection onto M (with respect
to the Hilbert–Schmidt inner product). This can be equivalently characterized by the following
properties:

(i) E_M(B) ∈ M and ∥B − E_M(B)∥ ≤ ∥B − A∥ for all A ∈ M, with equality if and only if
A = E_M(B).
(ii) EM (B) ∈ M and B − EM (B) ⊥ M.
Example 3.6 (Trace). If M = C1, then E_M(A) = (Tr(A)/n) 1.

Remark. As remarked before, every positive linear map φ : Mn (C) → C is of the form φ =
Tr(B 1/2 · B 1/2 ) for some B ∈ Mn (C)+ . Since A 7→ B 1/2 AB 1/2 is completely positive and Tr is
completely positive by the previous example, this implies that every positive map from Mn (C) to
C is completely positive.
Example 3.7 (Restriction to the diagonal). If M consists of all diagonal matrices in Mn (C), then
EM (A) = diag(A11 , . . . , Ann ).

Example 3.8 (Hadamard product). Let M ⊂ Mn (C) ⊗ Mn (C) be the subalgebra formed by all
elements of the form

∑_{j,k=1}^n A_{jk} E_{jk} ⊗ E_{jk}

for A ∈ Mn (C). Then

E_M(A ⊗ B) = ∑_{j,k=1}^n A_{jk} B_{jk} E_{jk} ⊗ E_{jk}.

Definition 3.9. The partial trace Tr1 : Mm (C) ⊗ Mn (C) → Mn (C) is the linear map given by

Tr1 (A ⊗ B) = Tr(A)B.

Likewise, the partial trace Tr2 : Mm (C) ⊗ Mn (C) → Mm (C) is given by

Tr2 (A ⊗ B) = Tr(B)A.

Example 3.10. Let M = Mm (C) ⊗ 1, N = 1 ⊗ Mn (C). The conditional expectations from
Mm (C) ⊗ Mn (C) onto M and N, respectively, are given by E_M(X) = (1/n) Tr_2(X) ⊗ 1 and
E_N(X) = (1/m) 1 ⊗ Tr_1(X). The factors 1/n and 1/m come from the fact that Tr(A ⊗ 1) = n Tr(A)
and Tr(1 ⊗ B) = m Tr(B).
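In coordinates, both partial traces can be computed by reshaping a matrix on C^m ⊗ C^n into a four-index tensor. The following sketch (ours, using the standard Kronecker convention X = kron(A, B) for A ⊗ B) checks the defining relations.

```python
import numpy as np

def partial_trace(X, m, n, which):
    """Tr_1 (trace out the first factor) or Tr_2 (trace out the second)."""
    T = X.reshape(m, n, m, n)            # entries X_{(i,a),(j,b)}
    if which == 1:
        return np.einsum('iaib->ab', T)  # Tr_1(A ⊗ B) = Tr(A) B
    return np.einsum('iaja->ij', T)      # Tr_2(A ⊗ B) = Tr(B) A

m, n = 2, 3
rng = np.random.default_rng(4)
A = rng.normal(size=(m, m))
B = rng.normal(size=(n, n))
X = np.kron(A, B)                        # elementary tensor A ⊗ B

assert np.allclose(partial_trace(X, m, n, 1), np.trace(A) * B)
assert np.allclose(partial_trace(X, m, n, 2), np.trace(B) * A)
```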

Exercises
Exercise 3.1. Show again that the Hadamard product of two positive matrices is positive. (Hint:
Use Example 3.8.)

Exercise 3.2. Let Φ : Mm (C) → Mn (C) be a quantum channel. Show that there exist a unitary
U ∈ Mk (C) with k = mn² and a unit vector φ ∈ Cn ⊗ Cn such that

Φ(ρ) = TrE [U (ρ ⊗ |φ⟩ ⟨φ|)U ∗ ],

where TrE is the partial trace over the first two factors of Cm ⊗ Cn ⊗ Cn .

Exercise 3.3. Let Tr1 be the partial trace over the first tensor factor of Cm ⊗ Cn . Show that for
any ρ over Cm ⊗ Cn , we have
(1/m) 1 ⊗ Tr_1 ρ = ∫ (u ⊗ 1) ρ (u^∗ ⊗ 1) du,

where du denotes the normalized Haar measure on the unitary group over C^m. Or, we have

(1/m) 1 ⊗ Tr_1 ρ = (1/m²) ∑_{j,k=1}^m (u_{jk} ⊗ 1) ρ (u_{jk}^∗ ⊗ 1),

where {u_{jk}}_{1≤j,k≤m} denotes the discrete Heisenberg–Weyl group over C^m, i.e.

u_{jk} = ∑_{l=1}^m η^{kl} |j + l⟩⟨l|,    η := e^{2πi/m}.
Chapter 4

Quantum States

We begin with a piece of notation physicists are very fond of – the bra-ket notation. Let H be
a Hilbert space. Every vector ξ ∈ H gives rise to a linear map from C to H that maps λ ∈ C
to λξ ∈ H. This linear map is denoted by |ξ⟩. As C is a Hilbert space with the standard inner
product, we can take the adjoint of |ξ⟩, which is denoted by ⟨ξ|.
Not only does every vector ξ ∈ H give rise to a linear map from C to H, the converse is also
true: If Φ : C → H is linear, then Φ = |Φ(1)⟩. For this reason, we will not distinguish between
elements of H and linear maps from C to H. In particular, we identify elements of C with linear
maps from C to C (in other words, we treat 1 × 1 matrices as complex numbers).
With these identifications, the bra-ket notation works very nicely. For example,

⟨ξ| 1 |η⟩ = ⟨ξ, η⟩

and |ξ⟩ ⟨η| is the linear map that sends ζ to ⟨η, ζ⟩ξ.
With this notation, we can write the spectral decomposition of a self-adjoint matrix A with
orthonormal basis (ξj ) consisting of eigenvectors and associated eigenvalues (λj ) as
A = ∑_j λ_j |ξ_j⟩ ⟨ξ_j|.

Definition 4.1. A density matrix or quantum state is a positive semi-definite matrix ρ ∈ Mn (C)
with Tr(ρ) = 1.

Since a density matrix ρ is positive semi-definite, it has an orthonormal basis (ξ_j) consisting
of eigenvectors, and the associated eigenvalues (λ_j) are non-negative. Moreover, the condition
Tr(ρ) = 1 is equivalent to ∑_j λ_j = 1.
Thus, density matrices are exactly the matrices that can be expressed as

ρ = ∑_{j=1}^n λ_j |ξ_j⟩ ⟨ξ_j|

with ∥ξ_j∥ = 1 and λ_j ≥ 0, ∑_j λ_j = 1.

Definition 4.2. A quantum state of the form |ξ⟩ ⟨ξ| is called a pure state. Every other quantum
state is called a mixed state.

From the previous discussion, we have the following result.


Lemma 4.3. Every quantum state is a convex combination of pure states.


There are several other ways to characterize pure states.
Proposition 4.4. For a quantum state ρ ∈ Mn (C), the following properties are equivalent:
(i) ρ is a pure state.
(ii) If σ ∈ Mn (C) is positive semi-definite and σ ≤ ρ, then there exists λ ≥ 0 such that σ = λρ.
(iii) If ρ1 , ρ2 ∈ Mn (C) are quantum states and µ ∈ (0, 1) such that ρ = µρ1 + (1 − µ)ρ2 , then
ρ1 = ρ2 = ρ.
Proof. (i) =⇒ (ii): Let ξ1 ∈ Cn such that ρ = |ξ1 ⟩ ⟨ξ1 | and complete it to an orthonormal basis
(ξj ) of Cn . Since σ ≤ ρ, if j ≥ 2, then
∥σ 1/2 ξj ∥2 = ⟨ξj , σξj ⟩ ≤ ⟨ξj , ρξj ⟩ = |⟨ξ1 , ξj ⟩|2 = 0
and hence also σξj = σ 1/2 σ 1/2 ξj = 0. By symmetry of σ, if j ≥ 2, then
⟨ξj , σξ1 ⟩ = ⟨σξj , ξ1 ⟩ = 0.
Thus σξ_1 = ⟨ξ_1, σξ_1⟩ξ_1. If ξ ∈ Cn is arbitrary, then

σξ = σ ∑_{j=1}^n ⟨ξ_j, ξ⟩ξ_j = ⟨ξ_1, ξ⟩⟨ξ_1, σξ_1⟩ξ_1 = ⟨ξ_1, σξ_1⟩ |ξ_1⟩ ⟨ξ_1| ξ.

Now the claim follows with λ = ⟨ξ1 , σξ1 ⟩.


(ii) =⇒ (iii): Since ρ1 , ρ2 are positive semi-definite, we have µρ1 , (1 − µ)ρ2 ≤ ρ. By (ii), there
exist λ1 , λ2 ≥ 0 such that µρ1 = λ1 ρ and (1 − µ)ρ2 = λ2 ρ. Taking the trace on both sides, we
obtain
µ = µTr(ρ1 ) = λ1 Tr(ρ) = λ1
and likewise 1 − µ = λ2 . Therefore ρ1 = ρ2 = ρ.
(iii) =⇒ (i): As ρ is a quantum state, there is an orthonormal basis (ξ_j) and λ_j ≥ 0 with
∑_j λ_j = 1 such that ρ = ∑_j λ_j |ξ_j⟩ ⟨ξ_j|. If ρ is not a pure state, there exist two indices j for
which λ_j ≠ 0. Without loss of generality we may assume λ_1, λ_2 ≠ 0. Let ρ_1 = |ξ_1⟩ ⟨ξ_1| and
ρ_2 = (1 − λ_1)^{−1} ∑_{j=2}^n λ_j |ξ_j⟩ ⟨ξ_j|. Clearly ρ_2 is positive semi-definite and

Tr(ρ_2) = (1 − λ_1)^{−1} ∑_{j=2}^n λ_j = (1 − λ_1)^{−1}(1 − λ_1) = 1.

Thus ρ_1, ρ_2 are density matrices and λ_1 ρ_1 + (1 − λ_1)ρ_2 = ρ. It follows from (iii) that ρ_1 = ρ_2 = ρ, a
contradiction. Thus ρ must be a pure state.
There is also a handy quantitative measure to decide if a state is pure or not.
Lemma 4.5. Every quantum state ρ satisfies Tr(ρ2 ) ≤ 1 with equality if and only if ρ is pure.
Proof. Let λ_1, . . . , λ_n be the eigenvalues of ρ, counted with multiplicity, and recall that λ_j ≥ 0,
∑_j λ_j = 1. In particular, 0 ≤ λ_j ≤ 1, which implies λ_j^2 ≤ λ_j. Thus

Tr(ρ^2) = ∑_{j=1}^n λ_j^2 ≤ ∑_{j=1}^n λ_j = 1.

Equality holds if and only if λ_j^2 = λ_j for all j ∈ {1, . . . , n}, that is, λ_j ∈ {0, 1}. Together with
∑_j λ_j = 1 this means that exactly one eigenvalue equals 1 and all others vanish, i.e. ρ is a pure state.
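A quick numerical illustration (ours): Tr(ρ²) equals 1 for a pure state and is strictly smaller for a genuinely mixed one.

```python
import numpy as np

xi = np.array([1.0, 1.0j]) / np.sqrt(2)
pure = np.outer(xi, xi.conj())          # |xi><xi|
mixed = 0.5 * np.eye(2)                 # maximally mixed state

print(np.trace(pure @ pure).real)       # 1.0
print(np.trace(mixed @ mixed).real)     # 0.5
```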

Mixed states can be seen as “shadows” of pure states on a larger Hilbert space. This is the
first instance in this course of the paradigm known as “Church of the Larger Hilbert Space”.

Proposition 4.6. If ρ ∈ Mn (C) is a quantum state, then there exists a pure state σ ∈ Mn (C) ⊗
Mn (C) such that ρ = Tr1 (σ).
Proof. Let ρ = ∑_{j=1}^n λ_j |ξ_j⟩ ⟨ξ_j| be the spectral decomposition of ρ and let ξ = ∑_{j=1}^n √λ_j ξ_j ⊗ ξ_j.
Since ∑_j λ_j = 1, we have

⟨ξ, ξ⟩ = ∑_{i,j=1}^n √λ_i √λ_j ⟨ξ_i, ξ_j⟩^2 = ∑_{j=1}^n λ_j = 1.

Thus σ = |ξ⟩ ⟨ξ| is a pure state in Mn (C) ⊗ Mn (C). Moreover,

Tr_1(σ) = Tr_1(∑_{i,j=1}^n √λ_i √λ_j |ξ_i⟩ ⟨ξ_j| ⊗ |ξ_i⟩ ⟨ξ_j|)
 = ∑_{i,j=1}^n √λ_i √λ_j Tr(|ξ_i⟩ ⟨ξ_j|) |ξ_i⟩ ⟨ξ_j|
 = ∑_{j=1}^n λ_j |ξ_j⟩ ⟨ξ_j|
 = ρ.

Definition 4.7. If ρ ∈ Mn (C) is a quantum state, any pure state σ ∈ Mm (C) ⊗ Mn (C) such that
Tr_1(σ) = ρ is called a purification of ρ.
Remark. By the previous proposition, it is always possible to take m = n. But in general, one
can do much better. In the extreme case when ρ is already a pure state, for example, one can of
course take m = 1.
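The construction in the proof of Proposition 4.6 is easy to carry out numerically; the following sketch (ours) builds ξ = ∑_j √λ_j ξ_j ⊗ ξ_j and traces out the first factor with the same reshaping trick as in the partial-trace sketch above.

```python
import numpy as np

def purify(rho):
    """Return a unit vector xi in C^n ⊗ C^n with Tr_1(|xi><xi|) = rho."""
    lam, vecs = np.linalg.eigh(rho)
    return sum(np.sqrt(max(l, 0.0)) * np.kron(v, v) for l, v in zip(lam, vecs.T))

rho = np.diag([0.5, 0.3, 0.2])
xi = purify(rho)
sigma = np.outer(xi, xi.conj())

T = sigma.reshape(3, 3, 3, 3)
assert np.allclose(np.einsum('iaib->ab', T), rho)   # Tr_1(sigma) = rho
```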
Definition 4.8. A quantum channel (in the Schrödinger picture) is a completely positive trace-
preserving linear map from Mm (C) to Mn (C).
It is immediate from the definition that quantum channels map quantum states to quantum
states, which makes them suitable for modeling changes of the state of a physical system.

Exercises
Let

σ_1 = ( 0  1 ; 1  0 ),    σ_2 = ( 0  −i ; i  0 ),    σ_3 = ( 1  0 ; 0  −1 ).

These matrices are called Pauli matrices. It can be useful to write σ0 for the 2 × 2 identity matrix.

(a) Show that (σ_j)_{j=0}^3 is an orthonormal basis of the self-adjoint 2 × 2 matrices with the inner
product (1/2)⟨ · , · ⟩_HS.
(b) Show that ρ ∈ M2 (C) is a quantum state if and only if there exists a ∈ R³ with a_1² + a_2² + a_3² ≤ 1
such that ρ = (1/2)(I + a_1 σ_1 + a_2 σ_2 + a_3 σ_3).

The equivalence in (b) establishes a bijection between quantum states in M2 (C) and the unit ball
of R3 . This graphical representation of qubit states is called the Bloch sphere.

Figure 4.1: Bloch sphere. Points denoted by |ξ⟩ correspond to the pure states |ξ⟩ ⟨ξ|. The ONB of
C2 is denoted by |0⟩, |1⟩.

The map a ↦ (1/2)(I + a_1 σ_1 + a_2 σ_2 + a_3 σ_3) is affine, that is, it preserves convex combinations. Thus
points on the surface of the unit ball correspond to pure states, while interior points correspond to
mixed states. The center of the unit ball corresponds to the state (1/2)I, which is called the maximally
mixed state.
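A small sketch (ours) of the parametrization from (b): a vector a in the closed unit ball of R³ yields the state (1/2)(I + a_1σ_1 + a_2σ_2 + a_3σ_3).

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def bloch_state(a):
    return 0.5 * (np.eye(2, dtype=complex) + sum(ai * s for ai, s in zip(a, sigma)))

rho = bloch_state([0.3, -0.4, 0.5])
print(np.trace(rho).real)            # 1.0
print(np.linalg.eigvalsh(rho))       # nonnegative, since |a| <= 1
print(np.trace(rho @ rho).real)      # purity (1 + |a|^2)/2; equals 1 exactly on the sphere
```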
Chapter 5

POVMs and quantum measurements

As discussed in the introduction, if A is a self-adjoint matrix with an orthonormal basis (ξj )


consisting of eigenvectors and corresponding eigenvalues (λj ), then the measurement outcome for
a quantum system in state ξ is λj with probability |⟨ξj , ξ⟩|2 . Another way to express this is
Tr(Pj |ξ⟩ ⟨ξ|) with Pj = |ξj ⟩ ⟨ξj |.
This only describes the measurement outcomes for pure states. Mixed states can be written as
convex combinations of pure states, and the measurement probabilities are affine in the sense that a
mixed state is a statistical mixture of pure states. This means that for arbitrary quantum states
ρ, the probability to measure the value λ_j is Tr(P_j ρ). When dealing with open quantum systems,
this class of measurements is however too restrictive. We want to allow for a more general class of
measurements, described by so-called POVMs.

Definition 5.1. Let I be a finite set. A projection-valued measure (PVM) on I with values in
Mn (C) is a family (P_i)_{i∈I} of projections in Mn (C) such that ∑_{i∈I} P_i = 1.
A positive-operator valued measure (POVM) on I with values in Mn (C) is a family (P_i)_{i∈I} of
positive semi-definite matrices in Mn (C) such that ∑_{i∈I} P_i = 1.

Since projections are positive semi-definite, every PVM is a POVM. The measurement inter-
pretation for POVMs is the same as for PVMs: The probability to measure outcome i when the
system is in state ρ is given by Tr(Pi ρ).

Example 5.2. If (ξ_i)_{i=1}^n is an orthonormal basis of C^n, then (|ξ_i⟩ ⟨ξ_i|)_{i=1}^n is a PVM. One can also
group them. For example, P_1 = |ξ_1⟩ ⟨ξ_1|, P_2 = ∑_{i=2}^n |ξ_i⟩ ⟨ξ_i| also forms a PVM.
Example 5.3. Let ρ_1, ρ_2 be quantum states and let ρ_1 − ρ_2 = ∑_{i=1}^n λ_i |ξ_i⟩ ⟨ξ_i| be the spectral
decomposition of ρ_1 − ρ_2. A PVM on {0, 1, 2} with values in Mn (C) is given by

P_0 = ∑_{i : λ_i = 0} |ξ_i⟩ ⟨ξ_i|,
P_1 = ∑_{i : λ_i > 0} |ξ_i⟩ ⟨ξ_i|,
P_2 = ∑_{i : λ_i < 0} |ξ_i⟩ ⟨ξ_i|.

This measurement, known as the Helstrom measurement, is used to optimally distinguish between the
states ρ_1 and ρ_2.
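A small numerical sketch (ours) of the Helstrom PVM and of the resulting outcome probabilities Tr(P_iρ):

```python
import numpy as np

def helstrom_pvm(rho1, rho2, tol=1e-12):
    lam, vecs = np.linalg.eigh(rho1 - rho2)
    P = [np.zeros_like(rho1) for _ in range(3)]          # P0, P1, P2
    for l, v in zip(lam, vecs.T):
        idx = 0 if abs(l) <= tol else (1 if l > 0 else 2)
        P[idx] = P[idx] + np.outer(v, v.conj())
    return P

rho1 = np.diag([0.7, 0.3])
rho2 = np.diag([0.2, 0.8])
P = helstrom_pvm(rho1, rho2)

assert np.allclose(sum(P), np.eye(2))
print([np.trace(Pi @ rho1).real for Pi in P])            # outcome probabilities under rho1
print([np.trace(Pi @ rho2).real for Pi in P])            # outcome probabilities under rho2
```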


Example 5.4. For every d ∈ N the family ((1/d) 1)_{i=1}^d is a POVM, although it is not very interesting
as a measurement: Tr((1/d) ρ) = 1/d for every quantum state ρ, so it does not help to distinguish
between states of a system.

There is a one-to-one correspondence between POVMs on {1, . . . , n} with values in Mm (C) and
a special class of quantum channels from Mm (C) to Mn (C), called quantum-to-classical channels.

Definition 5.5. A quantum channel Φ : Mm (C) → Mn (C) is called quantum-to-classical channel


if Φ(A) is diagonal for every A ∈ Mm (C).

Example 5.6. The dephasing channel Φ : Mn (C) → Mn (C), A 7→ diag(A11 , . . . , Ann ) is a quantum-
to-classical channel.
More generally, if Ψ : Mm (C) → Mn (C) is any quantum channel and Φ : Mn (C) → Mn (C) is the
dephasing channel, then Φ ◦ Ψ is a quantum-to-classical channel. Vice versa, if Ψ : Mm (C) → Mn (C)
is a quantum-to-classical channel, then Φ ◦ Ψ = Ψ, hence all quantum-to-classical channels are of
this form.

Proposition 5.7. (a) For every quantum-to-classical channel Φ : Mm (C) → Mn (C) there exists
a unique POVM on {1, . . . , n} with values in Mm (C) such that
Φ(A) = ∑_{i=1}^n Tr(P_i A) E_{ii}

for all A ∈ Mm (C).


Moreover, if the restriction of Φ† to the diagonal matrices is a ∗-homomorphism, then (Pi )ni=1
is a PVM.
(b) If (Pi )ni=1 is a POVM with values in Mm (C), then
Φ : Mm (C) → Mn (C),    A ↦ ∑_{i=1}^n Tr(P_i A) E_{ii}

is a quantum-to-classical channel.

Proof. (a) Let Pi = Φ† (Eii ). Since Φ is completely positive, so is Φ† , hence Pi is positive. Moreover,
since Φ is trace-preserving, Φ† is unital, which implies
∑_{i=1}^n P_i = ∑_{i=1}^n Φ†(E_{ii}) = Φ†(1) = 1.

Finally, since Φ(A) is diagonal,

∑_{i=1}^n Tr(P_i A) E_{ii} = ∑_{i=1}^n Tr(Φ†(E_{ii})A) E_{ii} = ∑_{i=1}^n Tr(E_{ii} Φ(A)) E_{ii} = Φ(A).

If the restriction of Φ† to the diagonal matrices is a ∗-homomorphism, then

(P_i^2)_{kl} = ∑_{j=1}^m (P_i)_{kj}(P_i)_{jl} = ∑_{j=1}^m Φ†(E_{ii})_{kj} Φ†(E_{ii})_{jl} = Φ†(E_{ii} E_{ii})_{kl} = Φ†(E_{ii})_{kl} = (P_i)_{kl}.

Thus (Pi )ni=1 is a PVM.


(b) Clearly Φ(A) is diagonal for all A ∈ Mm (C). Furthermore,

Tr(Φ(A)) = ∑_{i=1}^n Tr(P_i A) = Tr((∑_{i=1}^n P_i) A) = Tr(A),

which shows that Φ is trace-preserving.
To see that Φ is completely positive, it suffices to note that Φ is the sum of the completely
positive maps A ↦ Tr(P_i^{1/2} A P_i^{1/2}) E_{ii}.

Just like mixed states are “shadows” of pure states, POVMs are “shadows” of PVMs. This is
another instance of the church of the larger Hilbert space.

Theorem 5.8 (Naimark dilation theorem). If (Pi )i∈I is a POVM with values in Mm (C), then
there exists an isometry V : Cm → Ck and a PVM (Qi )i∈I with values in Mk (C) such that

Pi = V ∗ Qi V

for all i ∈ I.

Proof. Let Φ be the quantum-to-classical channel associated with (P_i)_{i∈I} by the previous proposi-
tion. By the Stinespring dilation theorem, there exist V : C^m → C^k and a unital ∗-homomorphism
π : Mn (C) → Mk (C) such that Φ†(A) = V∗π(A)V for all A ∈ Mn (C). Since Φ is trace-preserving,
Φ† is unital, which implies V∗V = 1.
Let E denote the dephasing channel on Mn (C) and Ψ = E ◦ π†, which is a quantum-to-classical
channel. Moreover, since E† = E is the identity on diagonal matrices, the Hilbert–Schmidt adjoint
of Ψ acts as π on the diagonal matrices, which is a ∗-homomorphism. Thus the POVM given by
Q_i = Ψ†(E_{ii}) is a PVM.
Finally,
V ∗ Qi V = V ∗ π(E † (Eii ))V = V ∗ π(Eii )V = Φ† (Eii ) = Pi .
Chapter 6

Basic trace inequalities and convexity/concavity results

In this chapter we will see the first glimpse of a quantum entropy, namely the von Neumann
entropy. It is the trace of a matrix-valued function, which connects it with the other part of the
title of this course, the trace inequalities. More specifically, we will investigate monotonicity and
convexity properties of maps of the form A 7→ Tr(f (A)) in this chapter.
Lemma 6.1 (Peierls Inequality). If A ∈ Mn (C) is self-adjoint, f : R → R is convex and u1 , . . . , un
is an orthonormal basis of Cn , then
∑_{j=1}^n f(⟨u_j, Au_j⟩) ≤ Tr f(A),

and equality holds if (uj ) consists of eigenvectors of A.

Proof. Let v1 , . . . , vn be an orthonormal basis of Cn consisting of eigenvectors of A and let


λ1 , . . . , λn be the corresponding eigenvalues. Since f is convex, we have
∑_{j=1}^n f(⟨u_j, Au_j⟩) = ∑_{j=1}^n f(∑_{k=1}^n λ_k |⟨v_k, u_j⟩|^2)
 ≤ ∑_{j,k=1}^n f(λ_k)|⟨v_k, u_j⟩|^2
 = ∑_{k=1}^n f(λ_k)
 = Tr f(A).

The equality case is easy to see.

Lemma 6.2. If A, B ∈ Mn (C) are self-adjoint and f : R → R is continuously differentiable, then


the function
φ : R → R, t 7→ Tr(f (A + tB))
is differentiable with
φ′ (t) = Tr(f ′ (A + tB)B).


Proof. Let us first consider the case when f is a polynomial. By replacing A with A + t_0 B if
necessary, it suffices to prove differentiability at t = 0. We have

(A + tB)m = Am + t(BAm−1 + ABAm−2 + · · · + Am−1 B) + o(t)

and thus
Tr((A + tB)m ) = Tr(Am ) + tTr(mAm−1 B) + o(t).
Hence the statement holds when f is a monomial, and by linearity also when f is a polynomial.
Now let f ∈ C 1 (R) be arbitrary. For T > 0 we have

∥A + tB∥ ≤ ∥A∥ + T ∥B∥

if |t| ≤ T . In particular, the spectrum of A + tB is contained in the interval IT = [−∥A∥ −


T ∥B∥, ∥A∥ + T ∥B∥].
By the Stone–Weierstraß theorem there exists a sequence of polynomials pk such that pk → f
and p′k → f ′ uniformly on IT . Let

φk : R → R, t 7→ Tr(pk (A + tB)).

The uniform convergence of (pk ) and (p′k ) implies φk → φ and

φ′k → Tr(f ′ (A + •B)B)

uniformly on [−T, T ].
Since T > 0 was arbitrary, the function φ is differentiable with

φ′ (t) = Tr(f ′ (A + tB)B).

Remark. It was crucial in the proof that the trace is invariant under cyclic permutations. In
general, it is not true that (d/dt)|_{t=0} f(A + tB) = f′(A)B, as simple examples show. In the exercises
you will be asked to give a correct formula for this derivative in terms of the spectral decompositions
of A and B.
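A numerical sanity check (ours) of Lemma 6.2 and of this remark, with f = exp: the trace formula holds, while the naive matrix-valued identity fails for non-commuting A and B.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(5)
X = rng.normal(size=(3, 3)); A = (X + X.T) / 2
Y = rng.normal(size=(3, 3)); B = (Y + Y.T) / 2

h = 1e-6
num_trace = (np.trace(expm(A + h * B)) - np.trace(expm(A - h * B))) / (2 * h)
print(num_trace, np.trace(expm(A) @ B))            # agree: d/dt Tr e^{A+tB} = Tr(e^A B)

num_matrix = (expm(A + h * B) - expm(A - h * B)) / (2 * h)
print(np.linalg.norm(num_matrix - expm(A) @ B))    # not small: f'(A)B is the wrong guess
```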

Theorem 6.3. Let f : R → R be a function. If f is monotone increasing, then so is A 7→ Trf (A)


on Mn (C)sa . If f is convex, then so is A 7→ Trf (A) on Mn (C)sa .

Proof. First let f be monotone increasing and let A, B ∈ Mn (C) with A ≤ B. We have to show
that
Tr(f (A)) ≤ Tr(f (B)).
We can assume without loss of generality that f is continuously differentiable. By the previous
lemma we have
Tr(f(B)) − Tr(f(A)) = ∫_0^1 (d/dt) Tr(f(A + t(B − A))) dt = ∫_0^1 Tr(f′(A + t(B − A))(B − A)) dt.

Since f is monotone increasing, f ′ (A + t(B − A)) ≥ 0. Moreover, B − A ≥ 0 by assumption. Thus


the integrand is non-negative, which implies Tr(f (A)) ≤ Tr(f (B)) as desired.

Now let f be convex, A, B ∈ Mn (C) be self-adjoint and λ ∈ [0, 1]. Let (uj ) be an orthonormal
basis of Cn consisting of eigenvectors of λA + (1 − λ)B. By convexity of f and Peierls inequality
we have
Tr(f(λA + (1 − λ)B)) = ∑_{j=1}^n ⟨u_j, f(λA + (1 − λ)B)u_j⟩
 = ∑_{j=1}^n f(⟨u_j, (λA + (1 − λ)B)u_j⟩)
 ≤ ∑_{j=1}^n (λ f(⟨u_j, Au_j⟩) + (1 − λ) f(⟨u_j, Bu_j⟩))
 ≤ λ Tr(f(A)) + (1 − λ) Tr(f(B)).

Remark. That we can assume f to be continuously differentiable in the first part of the proof
will be justified in the exercises.
Corollary 6.4. For self-adjoint A ∈ Mn (C) and λ ∈ R let N(A, λ) be the number of eigenvalues
of A less than or equal to λ, counted with multiplicity. If A ≤ B, then N(A, λ) ≥ N(B, λ) for all
λ ∈ R.

Proof. The function 1(−∞,λ] is decreasing and

N (A, λ) = Tr(1(−∞,λ] (A)).

Now the claim follows from the previous theorem.

Theorem 6.5 (Klein’s Inequality). Let f be a continuously differentiable convex function on R.


Then for any A, B ∈ Mn (C)sa , we have

Tr[f (A) − f (B) − f ′ (B)(A − B)] ≥ 0.

Proof. By Lemma 6.2 the function

φ : R → R, t 7→ Tr(f (B + t(A − B)))

is differentiable with
φ′ (0) = Tr(f ′ (B)(A − B)).
Moreover, by the previous theorem, φ is convex. Thus

Tr(f ′ (B)(A − B)) = φ′ (0) ≤ φ(1) − φ(0) = Tr(f (A)) − Tr(f (B)).
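
A numerical illustration of Klein's inequality (a sketch assuming numpy; helper names are ad hoc):

    import numpy as np

    def herm(n, seed):
        rng = np.random.default_rng(seed)
        X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        return (X + X.conj().T) / 2

    def funm_herm(A, f):
        w, V = np.linalg.eigh(A)
        return (V * f(w)) @ V.conj().T

    A, B = herm(6, 3), herm(6, 4)
    f, fp = lambda x: x**4, lambda x: 4 * x**3           # convex and continuously differentiable
    gap = np.trace(funm_herm(A, f) - funm_herm(B, f) - funm_herm(B, fp) @ (A - B)).real
    print(gap >= -1e-9)                                   # True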

Theorem 6.6 (Peierls–Bogoliubov Inequality). The function A 7→ log Tr exp(A) is convex on


Mn (C)sa .

Proof. Let

    φ : R^n → R,   φ(x) = \log\Big( \sum_{k=1}^n e^{x_k} \Big).

A direct computation shows

    \frac{∂^2 φ}{∂x_j ∂x_k} = a_j δ_{jk} − a_j a_k ,

where

    a_j = \frac{e^{x_j}}{\sum_{k=1}^n e^{x_k}} .

For any y ∈ R^n we have

    \sum_{j,k=1}^n \frac{∂^2 φ(x)}{∂x_j ∂x_k} y_j y_k = \sum_{j=1}^n a_j y_j^2 − \sum_{j,k=1}^n a_j a_k y_j y_k = \sum_{j=1}^n a_j y_j^2 − \Big( \sum_{j=1}^n a_j y_j \Big)^2 ≥ 0
by Jensen’s inequality. Thus φ is convex.


Let A, B ∈ Mn (C)sa , λ ∈ [0, 1] and let (uj ) be an orthonormal basis of Cn consisting of
eigenvectors of λA + (1 − λ)B. For xj = ⟨uj , Auj ⟩, yj = ⟨uj , Buj ⟩ we have

    φ(λx + (1 − λ)y) = \log\Big( \sum_{j=1}^n \exp(⟨u_j , (λA + (1 − λ)B)u_j⟩) \Big)
        = \log\Big( \sum_{j=1}^n ⟨u_j , \exp(λA + (1 − λ)B)u_j⟩ \Big)
        = \log Tr(\exp(λA + (1 − λ)B)).

On the other hand, since φ is convex,

    φ(λx + (1 − λ)y) ≤ λφ(x) + (1 − λ)φ(y)
        = λ \log\Big( \sum_{j=1}^n e^{⟨u_j ,Au_j⟩} \Big) + (1 − λ) \log\Big( \sum_{j=1}^n e^{⟨u_j ,Bu_j⟩} \Big)
        ≤ λ \log Tr(e^A) + (1 − λ) \log Tr(e^B),

where the last step follows from Peierls inequality.

Remark (Chandler Davis convexity theorem). More generally, if φ : Rn → R is a symmetric


convex function, then the map Φ that maps a self-adjoint matrix with eigenvalues λ1 , . . . , λn to
φ(λ1 , . . . , λn ) is convex (exercise).
Theorem 6.7 (Duality formula of the quantum entropy). If ρ ∈ Mn (C) is positive and Tr(ρ) = 1,
then
Tr(ρ log ρ) = sup{Tr(Hρ) − log Tr(eH ) | H ∈ Mn (C)sa }.
Remark. The function λ 7→ λ log λ can be continuously extended to a function f on [0, ∞) by
setting f (0) = 0. Here and in the following we understand Tr(ρ log ρ) as Tr(f (ρ)), so that it also
makes sense if ρ is not positive definite.
Proof. For H ∈ Mn (C)sa let σ = \frac{e^H}{Tr(e^H)}. Let f : [0, ∞) → R be the continuous extension of
λ \mapsto λ \log λ. This function is convex. By Klein's inequality,

    0 ≤ Tr(f(ρ) − f(σ) − f′(σ)(ρ − σ))
      = Tr(ρ \log ρ) − Tr(σ \log σ) − Tr((\log σ + 1)(ρ − σ))
      = Tr(ρ \log ρ) − Tr(ρ \log σ)
      = Tr(ρ \log ρ) − Tr(Hρ) + \log Tr(e^H).

Thus Tr(ρ log ρ) ≥ Tr(Hρ) − log Tr(eH ) for all H ∈ Mn (C)sa .


For the converse inequality let ε > 0 and Hε = log(ρ + ε1). Then

Tr(Hε ρ) − log Tr(eHε ) = Tr(ρ log(ρ + ε1)) − log(1 + εn) → Tr(ρ log ρ)

as ε ↘ 0. Thus
Tr(ρ log ρ) ≤ sup{Tr(Hρ) − log Tr(eH ) | H ∈ Mn (C)sa }.

In the previous theorem we encountered one of the central quantities in this course, the von
Neumann entropy. Moreover, another entropy quantity was hidden in the proof, namely the relative
entropy, which we will re-encounter later.
Definition 6.8. The von Neumann entropy S(ρ) of a quantum state ρ is defined as

S(ρ) = −Tr(ρ log ρ).

Definition 6.9. Given a self-adjoint matrix H ∈ Mn (C) and β ∈ [−∞, ∞], the Gibbs state for
the Hamiltonian H at inverse temperature β is the density matrix ρβ,H given by
    ρ_{β,H} = \frac{e^{−βH}}{Tr(e^{−βH})}
if β ∈ R and ρ±∞,H = limβ→±∞ ρβ,H .
Gibbs states are the equilibrium states for systems with Hamiltonian H for fixed energy of the
system. Mathematically, this can be expressed as follows.
Theorem 6.10. Let H ∈ Mn (C) be a self-adjoint matrix with eigenvalues λ1 ≤ · · · ≤ λn . For
each E ∈ [λ1 , λn ] there exists β ∈ [−∞, ∞] such that E = Tr(Hρβ,H ) and

S(ρβ,H ) = max{S(ρ) | ρ quantum state, Tr(ρH) = E}.

Moreover, the Gibbs state ρβ,H satisfying E = Tr(Hρβ,H ) is unique.

Proof. Let ρ be a quantum state with Tr(ρH) = E. Assume there exists a Gibbs state ρβ,H with β ∈ R such
that Tr(ρ_{β,H} H) = E (we will show this afterwards). By the duality formula for the quantum entropy,

S(ρ) = − sup{Tr(Aρ) − log Tr(eA ) | A ∈ Mn (C)sa }


≤ Tr(βHρ) + log Tr(e−βH )
= βE + log Tr(e−βH ).

On the other hand,

    S(ρ_{β,H}) = −Tr(ρ_{β,H} \log ρ_{β,H})
        = −\frac{1}{Tr(e^{−βH})} Tr\big( e^{−βH}(−βH − \log Tr(e^{−βH}) 1) \big)
        = β Tr(ρ_{β,H} H) + \log Tr(e^{−βH})
        = βE + \log Tr(e^{−βH}).

Thus S(ρ) ≤ S(ρβ,H ). The case β = ±∞ follows by taking limits.


Now let us turn to the existence of a Gibbs state with energy E. If H = E1, we can take any
β ∈ [−∞, ∞] to get ρ_{β,H} = \frac{1}{n} 1 with energy Tr(ρ_{β,H} H) = E. Let us assume in the following that H is
not a multiple of the identity.

To show that Tr(ρ_{β,H} H) takes all values between λ1 and λn , we use the intermediate value
theorem. We have

    \frac{d}{dβ} Tr(ρ_{β,H} H) = \frac{d}{dβ} \frac{Tr(e^{−βH} H)}{Tr(e^{−βH})}
        = −\frac{Tr(e^{−βH} H^2)}{Tr(e^{−βH})} + \Big( \frac{Tr(e^{−βH} H)}{Tr(e^{−βH})} \Big)^2
        = Tr(ρ_{β,H} H)^2 − Tr(ρ_{β,H} H^2).

By the Cauchy–Schwarz inequality,

Tr(ρβ,H H)2 = Tr(ρβ,H H1)2 ≤ Tr(ρβ,H H 2 )Tr(ρβ,H 12 ) = Tr(ρβ,H H 2 )

with equality if and only if H is a multiple of 1, which we ruled out.


Thus \frac{d}{dβ} Tr(ρ_{β,H} H) < 0, which implies that β \mapsto Tr(ρ_{β,H} H) is strictly decreasing. Moreover,
from

    Tr(ρ_{β,H} H) = \frac{1}{\sum_{j=1}^n e^{−βλ_j}} \sum_{j=1}^n λ_j e^{−βλ_j}

we deduce

    Tr(ρ_{∞,H} H) = \lim_{β→∞} Tr(ρ_{β,H} H) = λ_1 ,
    Tr(ρ_{−∞,H} H) = \lim_{β→−∞} Tr(ρ_{β,H} H) = λ_n .

Hence Tr(ρβ,H H) takes all values between λ1 and λn for β ∈ [−∞, ∞].
Uniqueness of the Gibbs state with energy E follows from the strict monotonicity of β \mapsto
Tr(ρ_{β,H} H) if H is not a multiple of the identity, while in the case H = E1 we have ρ_{β,H} = \frac{1}{n} 1
independently of β.

Lemma 6.11. For A ∈ Mm (C) ⊗ Mn (C) and B ∈ Mm (C) we have

Tr(Tr2 (A)B) = Tr(A(B ⊗ 1)).

Proof. If A = X ⊗ Y , then

    Tr(Tr2 (A)B) = Tr(Y ) Tr(XB) = Tr((XB) ⊗ Y ) = Tr((X ⊗ Y )(B ⊗ 1)).

The general case follows by linearity.

For a density matrix ρ ∈ Mm (C)⊗Mn (C) we write ρ1 and ρ2 for Tr2 (ρ) and Tr1 (ρ), respectively.

Proposition 6.12 (Subadditivity of quantum entropy). If ρ ∈ Mm (C) ⊗ Mn (C) is a quantum


state, then
S(ρ) ≤ S(ρ1 ) + S(ρ2 )
with equality if and only if ρ = ρ1 ⊗ ρ2 .

Proof. By the previous lemma,

    S(ρ1 ) = −Tr(ρ1 \log ρ1 ) = −Tr(ρ(\log ρ1 ⊗ 1)),

and similarly

    S(ρ2 ) = −Tr(ρ2 \log ρ2 ) = −Tr(ρ(1 ⊗ \log ρ2 )).

Thus

    S(ρ1 ) + S(ρ2 ) − S(ρ) = Tr(ρ(\log ρ − \log ρ1 ⊗ 1 − 1 ⊗ \log ρ2 )) = Tr(ρ(\log ρ − \log(ρ1 ⊗ ρ2 ))) ≥ 0

by Klein's inequality for f(x) = x \log x:

    0 ≤ Tr(f(ρ) − f(ρ1 ⊗ ρ2 ) − f′(ρ1 ⊗ ρ2 )(ρ − ρ1 ⊗ ρ2 )) = Tr(ρ(\log ρ − \log(ρ1 ⊗ ρ2 ))).
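
Subadditivity can be checked on random states; the following sketch (assuming numpy; `partial_trace` and `entropy` are ad hoc helpers) also illustrates how the partial traces ρ1 and ρ2 are computed in coordinates:

    import numpy as np

    def partial_trace(rho, m, n, keep):
        # rho acts on C^m ⊗ C^n; keep=1 gives Tr_2(rho), keep=2 gives Tr_1(rho)
        R = rho.reshape(m, n, m, n)
        return np.trace(R, axis1=1, axis2=3) if keep == 1 else np.trace(R, axis1=0, axis2=2)

    def entropy(rho):
        w = np.linalg.eigvalsh(rho)
        w = w[w > 1e-12]
        return float(-np.sum(w * np.log(w)))

    m = n = 3
    X = np.random.default_rng(0).normal(size=(m * n, m * n)) + 1j * np.random.default_rng(1).normal(size=(m * n, m * n))
    rho = X @ X.conj().T
    rho /= np.trace(rho).real                    # random full-rank state on C^3 ⊗ C^3
    S, S1, S2 = entropy(rho), entropy(partial_trace(rho, m, n, 1)), entropy(partial_trace(rho, m, n, 2))
    print(S <= S1 + S2 + 1e-10)                  # True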
Theorem 6.13 (Golden–Thompson inequality). If A, B ∈ Mn (C) are self-adjoint, then
Tr(eA+B ) ≤ Tr(eA eB ).
Proof. We first show by induction that

    |Tr(C_1 \cdots C_{2^k})| ≤ Tr(|C_1|^{2^k})^{2^{−k}} \cdots Tr(|C_{2^k}|^{2^k})^{2^{−k}}

for all k ∈ N and C_1 , . . . , C_{2^k} ∈ Mn (C). For k = 1 this follows from the Cauchy–Schwarz inequality:

    |Tr(C_1 C_2)| = |⟨C_1^∗ , C_2⟩_{HS}| ≤ Tr(C_1 C_1^∗)^{1/2} Tr(C_2^∗ C_2)^{1/2} = Tr(|C_1|^2)^{1/2} Tr(|C_2|^2)^{1/2}.

For the induction step, we have

    |Tr(C_1 \cdots C_{2^{k+1}})| ≤ Tr(|C_1 C_2|^{2^k})^{2^{−k}} \cdots Tr(|C_{2^{k+1}−1} C_{2^{k+1}}|^{2^k})^{2^{−k}}.

By cyclicity of the trace and a second application of the induction hypothesis,

    Tr(|C_1 C_2|^{2^k}) = Tr(C_2^∗ C_1^∗ C_1 C_2 \cdots C_2^∗ C_1^∗ C_1 C_2)
        = Tr((C_1^∗ C_1 C_2 C_2^∗)^{2^{k−1}})
        ≤ Tr(|C_1^∗ C_1|^{2^k})^{1/2} Tr(|C_2 C_2^∗|^{2^k})^{1/2}
        = Tr(|C_1|^{2^{k+1}})^{1/2} Tr(|C_2|^{2^{k+1}})^{1/2}.

Applying the same argument to the other factors, we get

    |Tr(C_1 \cdots C_{2^{k+1}})| ≤ Tr(|C_1|^{2^{k+1}})^{2^{−(k+1)}} \cdots Tr(|C_{2^{k+1}}|^{2^{k+1}})^{2^{−(k+1)}}

as desired.
Now take C_1 = \cdots = C_{2^k} = XY for self-adjoint X, Y ∈ Mn (C). By the previous step and the
cyclicity of the trace,

    Tr((XY)^{2^k}) ≤ Tr(|XY|^{2^k}) = Tr((Y X^2 Y)^{2^{k−1}}) = Tr((X^2 Y^2)^{2^{k−1}}).

By induction one obtains Tr((XY)^{2^k}) ≤ Tr(X^{2^k} Y^{2^k}).
If we take X = e^{2^{−k} A} , Y = e^{2^{−k} B} , then

    Tr\big( (e^{2^{−k} A} e^{2^{−k} B})^{2^k} \big) ≤ Tr(e^A e^B).
By the Lie–Trotter product formula, the left side converges to Tr(eA+B ) as k → ∞.
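
A quick numerical check of the Golden–Thompson inequality (a sketch assuming numpy; `expm_herm` is an ad hoc spectral exponential):

    import numpy as np

    rng = np.random.default_rng(7)

    def herm(n):
        X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        return (X + X.conj().T) / 2

    def expm_herm(A):
        w, V = np.linalg.eigh(A)
        return (V * np.exp(w)) @ V.conj().T

    A, B = herm(5), herm(5)
    lhs = np.trace(expm_herm(A + B)).real
    rhs = np.trace(expm_herm(A) @ expm_herm(B)).real
    print(lhs <= rhs + 1e-8)                     # True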

Exercises
Exercise 6.1. Let f : R → R be continuously differentiable and let A, B ∈ Mn (C) be self-adjoint.
Show that the map
Φ : R → Mn (C), t 7→ f (A + tB)
is differentiable and express Φ′ (0) in terms of the spectral decomposition of A and B.
Exercise 6.2. Show that for every A ∈ Mn (C) the map t \mapsto e^{tA} is differentiable with \frac{d}{dt} e^{tA} = A e^{tA}
(Hint: use that e^{tA} = \sum_{k=0}^∞ \frac{t^k}{k!} A^k ).
Exercise 6.3. Let λ1 , . . . , λn , µ1 , . . . , µn ∈ R with λ1 < λ2 < · · · < λn and µ1 ≤ µ2 ≤ · · · ≤ µn .

(a) Show that there exists a continuously differentiable increasing function f : R → R such
that f (λk ) = µk for all k ∈ {1, . . . , n}. Show that f can be chosen strictly increasing if
µ1 < · · · < µn .
(b) Let g : R → R be increasing and let A1 , . . . , Am ∈ Mn (C) be self-adjoint. Show that there
exists a continuously differentiable increasing function f : R → R such that f (Ak ) = g(Ak )
for all k ∈ {1, . . . , m}.

Exercise 6.4. For any A, B ∈ Mn (C)sa , show that

    \log\Big( \frac{Tr e^{A+B}}{Tr e^A} \Big) ≥ \frac{Tr(e^A B)}{Tr e^A} .

In particular, when Tr e^A = 1, we have

    \log Tr e^{A+B} ≥ Tr(e^A B).

Exercise 6.5. (a) Let ω : Mn (C) → C be a linear functional with ω(1) = 1. Show that ω(A) ≥ 0 for
all A ∈ Mn (C)+ if and only if ∥ω∥ = 1.
(b) Show that there exists a bijection

    f : {ρ ∈ Mn (C)+ | Tr(ρ) = 1} → {ω : Mn (C) → C | ω linear, ω(1) = ∥ω∥ = 1}

such that f(λρ + (1 − λ)σ) = λf(ρ) + (1 − λ)f(σ) for all ρ, σ ∈ Mn (C)+ with Tr(ρ) = Tr(σ) = 1
and λ ∈ [0, 1].
Chapter 7

Operator monotonicity and operator


concavity/convexity

Recall that for positive semi-definite square matrices A, B, we write A ≤ B if B − A ≥ 0. If A ≤ B,


then C ∗ AC ≤ C ∗ BC. In the following, we work with positive-definite matrices for simplicity.
Definition 7.1. A function f : (0, ∞) → R is said to be operator monotone if A ≤ B implies
f (A) ≤ f (B) for positive definite square matrices A, B of arbitrary size.
Clearly, every operator monotone function is (scalar) monotone, as can be seen by plugging in
1 × 1 matrices. The converse is not true, as we shall see in the exercises.
Example 7.2. For α ≥ 0 and β ∈ R the function x 7→ αx + β is operator monotone.
Beyond this rather trivial class of examples, it takes some work to come up with more interesting
operator monotone functions. We will get to know two (well, one plus one family) in the next
proposition. First, however, let us look at a monotone function which is not operator monotone.
Example 7.3. The function x \mapsto x^2 is not operator monotone. Indeed, the matrices

    A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},   B = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}

clearly satisfy A ≤ B, yet

    B^2 − A^2 = \begin{pmatrix} 5 & 3 \\ 3 & 2 \end{pmatrix} − \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix}

has determinant −1, so that it is not positive semidefinite.
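
The counterexample can be verified directly (a sketch assuming numpy):

    import numpy as np

    A = np.array([[1., 1.], [1., 1.]])
    B = np.array([[2., 1.], [1., 1.]])
    print(np.linalg.eigvalsh(B - A))             # both eigenvalues >= 0, so A <= B
    print(np.linalg.eigvalsh(B @ B - A @ A))     # one eigenvalue is negative, so A^2 <= B^2 fails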
Proposition 7.4. (a) The function x 7→ −x−1 is operator monotone.
(b) For α ∈ [0, 1] the function x 7→ xα is operator monotone.
Proof. (a) Recall that for C ≥ 0, C ≤ 1 iff ∥C∥ ≤ 1. Also, ∥X ∗ X∥ = ∥X∥2 = ∥X ∗ ∥2 =
∥XX ∗ ∥. If A, B ∈ Mn (C) are positive definite and A ≤ B, then B −1/2 AB −1/2 ≤ 1, hence
∥B −1/2 AB −1/2 ∥ ≤ 1. Thus
∥A1/2 B −1 A1/2 ∥ = ∥(B −1/2 A1/2 )∗ (B −1/2 A1/2 )∥
= ∥(B −1/2 A1/2 )(B −1/2 A1/2 )∗ ∥
= ∥B −1/2 AB −1/2 ∥
≤ 1,


which implies A^{1/2} B^{−1} A^{1/2} ≤ 1. Hence B^{−1} ≤ A^{−1} .


(b) Clearly this is true for α ∈ {0, 1}. Let E be the set of numbers α ∈ [0, 1] for which x 7→ xα
is operator monotone. We will show that E is convex.
Let α, β ∈ E and let A, B ∈ Mn (C)+ with A ≤ B. Since x 7→ xα is operator monotone, Aα ≤
B α , which implies B −α/2 Aα B −α/2 ≤ 1. Hence ∥Aα/2 B −α/2 ∥ ≤ 1. Similarly ∥B −β/2 Aβ/2 ∥ ≤
1.
Let r(S) = max{|λ| : λ ∈ σ(S)} for S ∈ Mn (C) and note that r(ST ) = r(T S) for all
invertible S, T ∈ Mn (C). We have

r(B −(α+β)/4 A(α+β)/2 B −(α+β)/4 )


= r(B (α−β)/4 B −(α+β)/4 A(α+β)/2 B −(α+β)/4 B −(α−β)/4 )
= r(B −β/2 A(α+β)/2 B −α/2 )
≤ ∥B −β/2 A(α+β)/2 B −α/2 ∥
≤ ∥B −β/2 Aβ/2 ∥∥Aα/2 B −α/2 ∥
≤ 1.

Therefore B −(α+β)/4 A(α+β)/2 B −(α+β)/4 ≤ 1, hence A(α+β)/2 ≤ B (α+β)/2 .


Thus (α + β)/2 ∈ E. Moreover, a continuity argument shows that E is closed. Thus E is
convex. Together with {0, 1} ⊂ E this implies E = [0, 1], as desired.

Remark. If S, T ∈ Mn (C) are invertible, then T S = T (ST )T −1 , that is, ST and T S are similar.
Thus they have the same eigenvalues. In particular, r(ST ) = r(T S) as used in the proof of the
previous theorem. In general σ(ST ) \ {0} = σ(T S) \ {0}, thus r(ST ) = r(T S).
Definition 7.5. A function f : (0, ∞) → R is said to be operator convex if for any n ∈ N, any
positive definite matrices A, B ∈ Mn (C) and any λ ∈ (0, 1) we have

f (λA + (1 − λ)B) ≤ λf (A) + (1 − λ)f (B).

We say f is operator concave if −f is operator convex.

As with operator monotone functions, every operator convex (resp. operator concave) function
is convex (resp. concave), but the converse is not true.

Example 7.6. The square function f (x) = x2 is operator convex. In fact, for any positive semi-
definite A, B and λ ∈ (0, 1), we have

λA2 + (1 − λ)B 2 − (λA + (1 − λ)B)2 = λ(1 − λ)(A − B)2 ≥ 0.

Example 7.7. The cube function f (x) = x3 is not operator convex. In fact, if f is operator
convex, then we must have (since A + tB = (1 − t)A + t(A + B))

f (A + tB) ≤ (1 − t)f (A) + tf (A + B),

for any positive semi-definite A, B and t ∈ (0, 1). The above inequality can be reformulated as

    \frac{(A + tB)^3 − A^3}{t} ≤ (A + B)^3 − A^3 .

Letting t → 0+ , we get
B 3 + B 2 A + BAB + AB 2 ≥ 0.
Now we choose

    A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},   B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.

Then B = B^2 = B^3 = BAB and

    B^3 + B^2 A + BAB + AB^2 = \begin{pmatrix} 4 & 1 \\ 1 & 0 \end{pmatrix},

which is not positive semi-definite. This leads to a contradiction.

Proposition 7.8. The function x 7→ x−1 is operator convex.

Proof. Let A, B ∈ Mn (C) be positive definite and let C = A−1/2 BA−1/2 . For λ ∈ [0, 1] we have

λA−1 + (1 − λ)B −1 − (λA + (1 − λ)B)−1


= A−1/2 (λ1 + (1 − λ)C −1 − (λ1 + (1 − λ)C)−1 )A−1/2

Since the real-valued function x 7→ x−1 is convex,

(λ + (1 − λ)x)−1 ≤ λ + (1 − λ)x−1

for any x ∈ R. Applying this to the eigenvalues of C implies

(λ1 + (1 − λ)C)−1 ≤ λ1 + (1 − λ)C −1 ,

which yields the desired inequality.

To show the operator convexity/concavity of x → 7 xp for some (but not all, see the exercises)
other values of p, we will use the following integral representations.
Lemma 7.9. For positive definite A ∈ Mn (C), the following integral formulas hold.

    A^p = \frac{\sin((p + 1)π)}{π} \int_0^∞ t^p (t1 + A)^{−1} \, dt                               for p ∈ (−1, 0),

    A^p = \frac{\sin(pπ)}{π} \int_0^∞ t^p \big( t^{−1} 1 − (t1 + A)^{−1} \big) \, dt               for p ∈ (0, 1),

    A^p = \frac{\sin((p − 1)π)}{π} \int_0^∞ t^{p−1} \big( t^{−1} A + t(t1 + A)^{−1} − 1 \big) \, dt   for p ∈ (1, 2).

Proof. Exercise.

With this lemma, one can prove directly that

Proposition 7.10. For the power functions fp (x) := xp ,

(a) when −1 ≤ p < 0, −fp is operator monotone and operator concave;


(b) when 0 ≤ p ≤ 1, fp is operator monotone and operator concave;
(c) when 1 ≤ p ≤ 2, fp is operator convex.

Proof. From the integral identities in the previous lemma, it suffices to show that for any t > 0,
x 7→ −(t + x)−1 is operator monotone and operator concave.

Proposition 7.11. We have the following


(a) f (x) = log x is operator concave and operator monotone;
(b) f (x) = x log x is operator convex.

Proof. Exercise.

Theorem 7.12 (Loewner’s Theorem). A function f : (0, ∞) → R is operator monotone if and


only if it is of the form

    f(x) = ax + b − \int_0^∞ \frac{1 − tx}{t + x} \, dµ(t),        (7.1)
where a ≥ 0, b ∈ R and µ is a positive finite measure on (0, ∞).

Proof. See the book of Barry Simon.

Theorem 7.13. Let f be a (continuous) function that maps (0, ∞) into itself. Then the following
are equivalent:
(a) f is operator monotone;
(b) f is operator concave.
Both of them imply

(c) f −1 is operator convex.

Proof. We first show (b) =⇒ (c). Assume (b). Then for any λ ∈ (0, 1) and any positive definite
A, B, we have

    f(λA + (1 − λ)B) ≥ λf(A) + (1 − λ)f(B).

By operator monotonicity and operator concavity of −x^{−1}, then

    f(λA + (1 − λ)B)^{−1} ≤ [λf(A) + (1 − λ)f(B)]^{−1} ≤ λf(A)^{−1} + (1 − λ)f(B)^{−1} .
So we have (c).
Now we prove the equivalence of (a) and (b). Assume (b). Then for any 0 ≤ A ≤ B, we will
show that f (A) ≤ f (B). For this note that for any λ ∈ (0, 1):

λ
λB = λA + (1 − λ) (B − A).
1−λ
By operator concavity, we have
 
λ
f (λB) ≥ λf (A) + (1 − λ)f (B − A) .
1−λ

Since f ≥ 0 and B − A ≥ 0, we get f (λB) ≥ λf (A) for any λ ∈ (0, 1). Letting λ → 1− , we get by
continuity that f (B) ≥ f (A). So f is operator monotone and we have (a).

Now assume (a). Let A, B ∈ Mn (C)++ and λ ∈ [0, 1]. Write 1n for the unit matrix in Mn (C).
Define the unitary matrix U ∈ M2n (C) by

    U = \begin{pmatrix} λ^{1/2} 1_n & −(1 − λ)^{1/2} 1_n \\ (1 − λ)^{1/2} 1_n & λ^{1/2} 1_n \end{pmatrix}.

A direct computation shows

    U^∗ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} U = \begin{pmatrix} λA + (1 − λ)B & λ^{1/2}(1 − λ)^{1/2}(B − A) \\ λ^{1/2}(1 − λ)^{1/2}(B − A) & (1 − λ)A + λB \end{pmatrix}.

Let D = −λ^{1/2}(1 − λ)^{1/2}(B − A) and note that for ε > 0 we have

    \begin{pmatrix} λA + (1 − λ)B + ε1_n & 0 \\ 0 & 2µ1_n \end{pmatrix} − U^∗ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} U = \begin{pmatrix} ε1_n & D \\ D & 2µ1_n − ((1 − λ)A + λB) \end{pmatrix} ≥ \begin{pmatrix} ε1_n & D \\ D & µ1_n \end{pmatrix}

if µ ≥ ∥(1 − λ)A + λB∥.
By the Schur complement theorem,

    \begin{pmatrix} ε1_n & D \\ D & µ1_n \end{pmatrix} ≥ 0

if µ ≥ ε^{−1} ∥D∥^2 . Thus

    \begin{pmatrix} λA + (1 − λ)B + ε1_n & 0 \\ 0 & 2µ1_n \end{pmatrix} ≥ U^∗ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} U

for µ sufficiently large.


Since U is unitary and f is operator monotone, we have

    \begin{pmatrix} λf(A) + (1 − λ)f(B) & λ^{1/2}(1 − λ)^{1/2}(f(B) − f(A)) \\ λ^{1/2}(1 − λ)^{1/2}(f(B) − f(A)) & (1 − λ)f(A) + λf(B) \end{pmatrix}
        = U^∗ \begin{pmatrix} f(A) & 0 \\ 0 & f(B) \end{pmatrix} U
        = f\Big( U^∗ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} U \Big)
        ≤ f\Big( \begin{pmatrix} λA + (1 − λ)B + ε1_n & 0 \\ 0 & 2µ1_n \end{pmatrix} \Big)
        = \begin{pmatrix} f(λA + (1 − λ)B + ε1_n) & 0 \\ 0 & f(2µ)1_n \end{pmatrix}.

Hence λf (A) + (1 − λ)f (B) ≤ f (λA + (1 − λ)B + ε1n ). Letting ε ↘ 0 yields (by continuity of f )
λf (A) + (1 − λ)f (B) ≤ f (λA + (1 − λ)B) as desired.

Remark. With a little more work one can show that every operator monotone function is au-
tomatically continuous. From the proof, we see that we only need f ≥ 0 in deriving operator
monotonicity from operator concavity. It is not true if we don’t have f ≥ 0. For example,
f (x) = −x log x is operator concave, but it is not even scalar monotone.
Lemma 7.14 (Dilation of contractions). If A ∈ Mn (C) with A∗ A ≤ 1, then there exists m ≥ n
and a unitary U ∈ Mm (C) such that P U |Cn = A, where P : Cm → Cn is the projection onto the
first n coordinates.

Proof. Let m = 2n, B = (1 − AA^∗)^{1/2} , C = (1 − A^∗A)^{1/2} and

    U = \begin{pmatrix} A & B \\ C & −A^∗ \end{pmatrix}.

We have

    U^∗ U = \begin{pmatrix} A^∗ & C \\ B & −A \end{pmatrix} \begin{pmatrix} A & B \\ C & −A^∗ \end{pmatrix} = \begin{pmatrix} A^∗A + C^2 & A^∗B − CA^∗ \\ BA − AC & B^2 + AA^∗ \end{pmatrix}.

By definition, A^∗A + C^2 = B^2 + AA^∗ = 1. Moreover, A^∗B = CA^∗ and BA = AC (exercise). Thus U is
unitary, and the property P U|_{C^n} = A is immediate from the definition of U .
Theorem 7.15 (Jensen’s inequality for operators). If f : [0, ∞) → R is operator convex, then
f (Φ(A)) ≤ Φ(f (A))
for all A ∈ Mm (C)+ and unital completely positive maps Φ : Mm (C) → Mn (C).
If additionally f (0) ≤ 0, then the same inequality holds for all contractive completely positive
maps Φ.
Proof. Let Φ : Mm (C) → Mn (C) be a contractive completely positive map. By Stinespring’s
theorem, there exist W ∈ Mk,n (C) and a unital ∗-homomorphism π : Mm (C) → Mk (C) such that
Φ(A) = W ∗ π(A)W.
Moreover, since Φ is contractive, W ∗ W = Φ(1) ≤ 1.
Let B = (1 − W W ∗ )1/2 , C = (1 − W ∗ W )1/2 and
     
    X = \begin{pmatrix} π(A) & 0 \\ 0 & 0 \end{pmatrix},   U = \begin{pmatrix} W & B \\ C & −W^∗ \end{pmatrix},   V = \begin{pmatrix} W & −B \\ C & W^∗ \end{pmatrix}.

By the previous lemma, U and V are unitary. Furthermore,

    U^∗ X U = \begin{pmatrix} W^∗π(A)W & W^∗π(A)B \\ Bπ(A)W & Bπ(A)B \end{pmatrix},   V^∗ X V = \begin{pmatrix} W^∗π(A)W & −W^∗π(A)B \\ −Bπ(A)W & Bπ(A)B \end{pmatrix}.

Since f is operator convex, we obtain

    \begin{pmatrix} f(W^∗π(A)W) & 0 \\ 0 & f(Bπ(A)B) \end{pmatrix} = f\Big( \begin{pmatrix} W^∗π(A)W & 0 \\ 0 & Bπ(A)B \end{pmatrix} \Big)
        = f\Big( \frac{1}{2} U^∗ X U + \frac{1}{2} V^∗ X V \Big)
        ≤ \frac{1}{2} U^∗ f(X) U + \frac{1}{2} V^∗ f(X) V
        = \frac{1}{2} U^∗ \begin{pmatrix} f(π(A)) & 0 \\ 0 & f(0)1 \end{pmatrix} U + \frac{1}{2} V^∗ \begin{pmatrix} f(π(A)) & 0 \\ 0 & f(0)1 \end{pmatrix} V.
If f(0) ≤ 0, then

    \frac{1}{2} U^∗ \begin{pmatrix} f(π(A)) & 0 \\ 0 & f(0)1 \end{pmatrix} U + \frac{1}{2} V^∗ \begin{pmatrix} f(π(A)) & 0 \\ 0 & f(0)1 \end{pmatrix} V
        ≤ \frac{1}{2} U^∗ \begin{pmatrix} π(f(A)) & 0 \\ 0 & 0 \end{pmatrix} U + \frac{1}{2} V^∗ \begin{pmatrix} π(f(A)) & 0 \\ 0 & 0 \end{pmatrix} V
        = \begin{pmatrix} W^∗ π(f(A)) W & 0 \\ 0 & B π(f(A)) B \end{pmatrix}.

Therefore
f (Φ(A)) = f (W ∗ π(A)W ) ≤ W ∗ π(f (A))W = Φ(f (A)).
On the other hand, if Φ is unital, then W^∗W = Φ(1) = 1, hence C = 0. Then

    \frac{1}{2} U^∗ \begin{pmatrix} f(π(A)) & 0 \\ 0 & f(0)1 \end{pmatrix} U + \frac{1}{2} V^∗ \begin{pmatrix} f(π(A)) & 0 \\ 0 & f(0)1 \end{pmatrix} V
        = \begin{pmatrix} W^∗ f(π(A)) W & 0 \\ 0 & B f(π(A)) B + f(0) W W^∗ \end{pmatrix}.
Thus f (W ∗ π(A)W ) ≤ W ∗ f (π(A))W , and we conclude as before.
Corollary 7.16. If f : [0, ∞) → R is operator convex, then

    f\Big( \sum_{j=1}^m V_j^∗ A_j V_j \Big) ≤ \sum_{j=1}^m V_j^∗ f(A_j) V_j

for all positive semi-definite A_1 , . . . , A_m ∈ Mn (C) and all V_1 , . . . , V_m ∈ M_{n,k}(C) with \sum_{j=1}^m V_j^∗ V_j = 1.
If additionally f(0) ≤ 0, the same inequality holds under the assumption \sum_{j=1}^m V_j^∗ V_j ≤ 1.

Exercises
Exercise 7.1. For a continuously differentiable function f : (0, ∞) → R let

    Df : (0, ∞)^2 → R,   Df(λ, µ) = \begin{cases} \frac{f(λ) − f(µ)}{λ − µ} & \text{if } λ ≠ µ, \\ f′(λ) & \text{if } λ = µ. \end{cases}

Show that f is operator monotone if and only if for all n ∈ N and λ_1 , . . . , λ_n > 0 the matrix
[Df(λ_j , λ_k)]_{j,k} is positive semi-definite.
Exercise 7.2. Show that the set
E = {log f | f : (0, ∞) → (0, ∞) operator monotone}
is convex.
Exercise 7.3. Show the integral formulas from Lemma 7.9.
Exercise 7.4. Show that
1. f (x) = log x is operator concave and operator monotone;
2. f (x) = x log x is operator convex.
3. f(x) = \frac{x − 1}{\log x} is operator concave and operator monotone.

Hint: we have

    \log x = \int_0^∞ \Big( \frac{1}{t + 1} − \frac{1}{t + x} \Big) dt,
    x \log x = \lim_{p→1^+} \frac{x^p − x}{p − 1},
    \frac{x − 1}{\log x} = \int_0^1 x^α \, dα.

Exercise 7.5. Show that x 7→ xp is neither operator convex nor operator concave if p ∈
/ [−1, 2].
Exercise 7.6. Give an example of an operator concave function f : (0, 1) → R that is not operator
monotone.
Exercise 7.7. Show that every operator monotone function f : (0, ∞) → R is continuous.

Exercise 7.8. Show that if A ∈ Mn (C) and f : [0, ∞) → R, then Af (A∗ A) = f (AA∗ )A.

Exercise 7.9. Let f : [0, ∞) → R be a continuous function such that f (Φ(A)) ≤ Φ(f (A)) for all
n ∈ N, A ∈ Mn (C) and all unital completely positive maps Φ : Mn (C) → Mn (C). Show that f is
operator convex.
Chapter 8

Lieb’s concavity theorem

In 1963, Wigner, Yanase and Dyson conjectured that


    S_p(ρ) = \frac{1}{2} Tr\big( [K, ρ^p][K, ρ^{1−p}] \big)
is concave in ρ, where K = K ∗ is arbitrary. The quantity −Sp (ρ) is sometimes called Wigner–
Yanase–Dyson skew information. It was resolved in 1973 by Lieb. Among many others, Lieb
proved the following result. We write Mn (C)+ for the positive n × n matrices and Mn (C)++ for
the positive definite n × n matrices.

Theorem 8.1 (Lieb’s concavity theorem and Ando’s convexity theorem). For K ∈ Mn (C) the
function
Mn (C)+ × Mn (C)+ → C, (A, B) 7→ Tr(K ∗ Ap KB 1−p ),
is jointly concave if p ∈ [0, 1] and jointly convex if p ∈ [−1, 0].
Remark 8.2. The parameters can be more general, as we shall see later. The convexity result is
named after Ando as he proved it in 1979, but this result is contained in another result of Lieb in
the same 1973 paper.
Corollary 8.3. The quantum relative entropy

D(ρ||σ) := Tr(ρ(log ρ − log σ)).

is jointly convex in density matrices ρ and σ.

Proof. It follows from Lieb's concavity theorem and

    D(ρ||σ) = \lim_{p→1^−} \frac{Tr(ρ^p σ^{1−p}) − 1}{p − 1} .

Indeed, by Lieb's theorem (ρ, σ) \mapsto Tr(ρ^p σ^{1−p}) is jointly concave for p ∈ [0, 1], so each difference
quotient above is jointly convex for p < 1, and joint convexity passes to the pointwise limit.
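
Joint concavity is easy to probe numerically at the midpoint of random pairs (a sketch assuming numpy; `powm` is an ad hoc matrix power via the spectral decomposition):

    import numpy as np

    rng = np.random.default_rng(1)

    def pos(n):
        X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        return X @ X.conj().T + 0.1 * np.eye(n)

    def powm(A, p):
        w, V = np.linalg.eigh(A)
        return (V * w**p) @ V.conj().T

    n, p = 4, 0.3
    K = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    A1, A2, B1, B2 = pos(n), pos(n), pos(n), pos(n)

    def F(A, B):
        return np.trace(K.conj().T @ powm(A, p) @ K @ powm(B, 1 - p)).real

    mid = F((A1 + A2) / 2, (B1 + B2) / 2)
    print(mid >= (F(A1, B1) + F(A2, B2)) / 2 - 1e-10)    # True: midpoint concavity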

Let us come back to Lieb’s concavity theorem. It now has a lot of proofs. We will give two here,
the first one being Lieb’s original proof using interpolation. For this, let us recall the three-line
lemma first.

Lemma 8.4. Let S := {z ∈ C : 0 < ℜz < 1} be the open strip and denote by S its closure.
Suppose that f : S → C is bounded function such that


1. f is analytic in S;
2. f is continuous on S;
3. sup{|f (k + iy)| : y ∈ R} := Mk < ∞, k = 0, 1.
Then for any θ ∈ [0, 1], we have |f (θ)| ≤ M01−θ M1θ .
Proof of Lieb’s concavity theorem. To prove the joint concavity of
(A, B) 7→ TrAp K ∗ B 1−p K, 0 ≤ p ≤ 1,
it suffices to prove the concavity of
A 7→ TrAp K ∗ A1−p K, 0 ≤ p ≤ 1.
In fact, this is a doubling dimension trick:
    Tr(A^p K^∗ B^{1−p} K) = Tr\Big( \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}^p \begin{pmatrix} 0 & K^∗ \\ 0 & 0 \end{pmatrix} \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}^{1−p} \begin{pmatrix} 0 & 0 \\ K & 0 \end{pmatrix} \Big).
For any positive semi-definite A1 , A2 and λ ∈ (0, 1), put A := λA1 + (1 − λ)A2 . We need to show
that
    λ Tr(A_1^p K^∗ A_1^{1−p} K) + (1 − λ) Tr(A_2^p K^∗ A_2^{1−p} K) ≤ Tr(A^p K^∗ A^{1−p} K).

By approximation, we may assume that A_1 , A_2 and A are all positive definite. Set M := A^{\frac{1−p}{2}} K A^{\frac{p}{2}} .
For k = 1, 2, consider the function

    f_k(z) := Tr\big( A_k^z A^{−\frac{z}{2}} M^∗ A^{−\frac{1−z}{2}} A_k^{1−z} A^{−\frac{1−z}{2}} M A^{−\frac{z}{2}} \big),   z ∈ \overline{S}.
Then what we need to show can be reformulated as
λf1 (p) + (1 − λ)f2 (p) ≤ TrM ∗ M.
We claim that the function f_k is uniformly bounded on \overline{S}. In fact, denote G_k(z) := A^{−\frac{z}{2}} A_k^z A^{−\frac{z}{2}}
and we may write f_k as f_k(z) = Tr(M^∗ G_k(1 − z) M G_k(z)). By Cauchy–Schwarz,

    |f_k(z)| ≤ \big( Tr M M^∗ G_k(1 − z) G_k(1 − z) \big)^{1/2} \big( Tr M M^∗ G_k(z) G_k(z) \big)^{1/2}.
For z = x + iy, we have
TrM M ∗ Gk (1 − z)Gk (1 − z) ≤ ∥A−1 ∥2x ∥Ak ∥2x TrM ∗ M,
which is uniformly bounded. Similarly TrM M ∗ Gk (z)Gk (z) is also uniformly bounded. Therefore,
we finish the proof of the claim, so that we can apply the three-line lemma to f(z) := λf_1(z) + (1 −
λ)f_2(z). When ℜz = 0:

    f_k(iy) = Tr\big( A_k^{\frac{iy}{2}} A^{−\frac{iy}{2}} M^∗ A^{\frac{−1+iy}{2}} A_k^{\frac{1−iy}{2}} \cdot A_k^{\frac{1−iy}{2}} A^{\frac{−1+iy}{2}} M A^{−\frac{iy}{2}} A_k^{\frac{iy}{2}} \big).

By Cauchy–Schwarz:

    |f_k(iy)| ≤ Tr\big( M^∗ A^{\frac{−1+iy}{2}} A_k A^{\frac{−1−iy}{2}} M \big),   k = 1, 2.

So for any y ∈ R

    |f(iy)| ≤ Tr\big( M^∗ A^{\frac{−1+iy}{2}} (λA_1 + (1 − λ)A_2) A^{\frac{−1−iy}{2}} M \big) = Tr(M^∗ M).
Similarly we can prove |f (1 + iy)| ≤ TrM ∗ M for all y ∈ R. This concludes the proof by three-line
lemma.

Now we give another proof using perspective functions. We start with an even more gen-
eral convexity/concavity result, which reduces most of the other results in this chapter to easy
corollaries.

Theorem 8.5 (ENG perspective theorem). Let f : [0, ∞) → R be operator convex (resp. operator
concave) and g : (0, ∞) → (0, ∞) operator concave. Assume that f (0) ≤ 0 (resp. f (0) ≥ 0). Then
the map

Mn (C)++ × Mn (C)+ → Mn (C),


(A, B) 7→ g(A)1/2 f (g(A)−1/2 Bg(A)−1/2 )g(A)1/2

is jointly convex (resp. concave).

Proof. We only prove the jointly convex case. The jointly concave case follows by replacing f by
−f .
Let A1 , A2 ∈ Mn (C)++ , B1 , B2 ∈ Mn (C)+ , λ ∈ [0, 1] and define A = λA1 + (1 − λ)A2 , B =
λB1 + (1 − λ)B2 . Let V1 = (λg(A1 ))1/2 g(A)−1/2 , V2 = ((1 − λ)g(A2 ))1/2 g(A)−1/2 . Since g is
operator concave, we have

V1∗ V1 + V2∗ V2 = g(A)−1/2 (λg(A1 ) + (1 − λ)g(A2 ))g(A)−1/2 ≤ 1.

The operator Jensen inequality implies

g(A)1/2 f (g(A)−1/2 Bg(A)−1/2 )g(A)1/2


= g(A)1/2 f (V1∗ g(A1 )−1/2 B1 g(A1 )−1/2 V1 + V2∗ g(A2 )−1/2 B2 g(A2 )−1/2 V2 )g(A)1/2
≤ g(A)1/2 V1∗ f (g(A1 )−1/2 B1 g(A1 )−1/2 )V1 g(A)1/2
+ g(A)1/2 V2∗ f (g(A2 )−1/2 B2 g(A2 )−1/2 )V2 g(A)1/2
= λg(A1 )1/2 f (g(A1 )−1/2 B1 g(A1 )−1/2 )g(A1 )1/2
+ (1 − λ)g(A2 )^{1/2} f (g(A2 )^{−1/2} B2 g(A2 )^{−1/2} )g(A2 )^{1/2} .

Corollary 8.6. The function

Λp,q : Mn (C)++ × Mn (C)+ → Mn (C)+ , (A, B) 7→ Aq/2 (A−q/2 BA−q/2 )p Aq/2

is jointly concave if p, q ∈ [0, 1] and jointly convex if p ∈ [1, 2] and q ∈ [0, 1].

Proof. As x 7→ xp is operator concave for p ∈ [0, 1] and operator convex for p ∈ [1, 2], the result
follows immediately from the ENG perspective theorem.

We are now in the position to prove Lieb’s concavity theorem and Ando’s convexity theorem.

Proof of Theorem 8.1. Equip Mn (C) with the Hilbert–Schmidt inner product

⟨·, ·⟩HS : Mn (C) × Mn (C) → C, (A, B) 7→ Tr(A∗ B),

making Mn (C) into a Hilbert space. For A, B ∈ Mn (C) define

LA , RB : Mn (C) → Mn (C), LA K = AK, RB K = KB.

Note that LA and RB commute.



With this notation we have

    Tr(K^∗ A^p K B^{1−p}) = ⟨K, L_A^p R_B^{1−p} K⟩_{HS}
        = ⟨K, L_A^{1/2} (L_A^{−1/2} R_B L_A^{−1/2})^{1−p} L_A^{1/2} K⟩_{HS}
        = ⟨K, Λ_{1−p,1}(L_A , R_B) K⟩_{HS},

and the joint concavity resp. joint convexity follows from the previous corollary, since A \mapsto L_A and
B \mapsto R_B are linear and map positive matrices to positive operators on the Hilbert–Schmidt space.

Theorem 8.7. The operator function

Mn (C)++ × Mn (C)+ → Mn (C) ⊗ Mn (C), (A, B) 7→ Ap ⊗ B 1−p

is jointly concave when 0 < p < 1 and jointly convex when −1 < p < 0.

Proof. For A, B ∈ Mn (C) let

    S_A = A ⊗ 1,   T_B = 1 ⊗ B.

Since S_A and T_B commute, we have

    A^p ⊗ B^{1−p} = S_A^p T_B^{1−p} = Λ_{1−p,1}(S_A , T_B),

and the joint concavity resp. convexity follows from the previous corollary.

Theorem 8.8. The geometric mean

M0 (A, B) := A1/2 (A−1/2 BA−1/2 )1/2 A1/2

is jointly concave.

Proof. Since f (x) = x1/2 and g(x) = x are operator concave, and f (0) = 0, this result follows
directly from the ENG perspective theorem.

Theorem 8.9. The harmonic mean

    M_{−1}(A, B) := \Big( \frac{A^{−1} + B^{−1}}{2} \Big)^{−1}

is jointly concave.

Proof. Let f(x) = (1 + x^{−1})^{−1} = \frac{x}{1+x} = 1 − \frac{1}{1+x} and g(x) = x. By Proposition 7.4 and
Theorem 7.13, f is operator monotone and operator concave. Clearly f(0) = 0. Since

    M_{−1}(A, B) = 2\big( A^{−1/2}(1 + A^{1/2} B^{−1} A^{1/2}) A^{−1/2} \big)^{−1}
        = 2 A^{1/2} f(A^{−1/2} B A^{−1/2}) A^{1/2},

the result follows from the ENG perspective theorem.

Recall that the arithmetic mean is given by


A+B
M1 (A, B) := .
2
As in the scalar case, we have

Theorem 8.10 (Arithmetic-geometric-harmonic mean inequality). For all positive definite ma-
trices A, B, we have
M−1 (A, B) ≤ M0 (A, B) ≤ M1 (A, B).

Proof. The first inequality is nothing but

    \Big( \frac{A^{−1} + B^{−1}}{2} \Big)^{−1} ≤ A^{1/2} (A^{−1/2} B A^{−1/2})^{1/2} A^{1/2},

which is equivalent to

    \Big( \frac{1 + A^{1/2} B^{−1} A^{1/2}}{2} \Big)^{−1} ≤ (A^{−1/2} B A^{−1/2})^{1/2}.

This is true by the scalar inequality \big( \frac{1+x}{2} \big)^{−1} ≤ x^{−1/2} and the functional calculus.
The second inequality is

    A^{1/2} (A^{−1/2} B A^{−1/2})^{1/2} A^{1/2} ≤ \frac{A + B}{2},

which is equivalent to

    (A^{−1/2} B A^{−1/2})^{1/2} ≤ \frac{1 + A^{−1/2} B A^{−1/2}}{2}.

This follows from the scalar inequality \sqrt{x} ≤ \frac{1+x}{2} and the functional calculus.
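
A numerical check of the arithmetic-geometric-harmonic mean inequality on random positive definite matrices (a sketch assuming numpy; `powm` is an ad hoc spectral matrix power):

    import numpy as np

    rng = np.random.default_rng(5)

    def pos(n):
        X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        return X @ X.conj().T + 0.1 * np.eye(n)

    def powm(A, p):
        w, V = np.linalg.eigh(A)
        return (V * w**p) @ V.conj().T

    A, B = pos(4), pos(4)
    Ah = powm(A, 0.5)
    geo = Ah @ powm(powm(A, -0.5) @ B @ powm(A, -0.5), 0.5) @ Ah
    har = 2 * np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))
    ari = (A + B) / 2
    print(np.linalg.eigvalsh(geo - har).min() >= -1e-9)   # M_{-1} <= M_0
    print(np.linalg.eigvalsh(ari - geo).min() >= -1e-9)   # M_0 <= M_1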

We end with the following

Theorem 8.11. The operator function

(A, B) 7→ B ∗ A−1 B

is jointly convex.

Proof. Let A1 , A2 ∈ Mn (C)++ , B1 , B2 ∈ Mn (C) and t ∈ [0, 1]. By the Schur complement theorem,
 
    \begin{pmatrix} A_1 & B_1 \\ B_1^∗ & B_1^∗ A_1^{−1} B_1 \end{pmatrix} ≥ 0,

and the same holds for A_1 and B_1 replaced by A_2 and B_2 , respectively. Thus

    \begin{pmatrix} tA_1 + (1 − t)A_2 & tB_1 + (1 − t)B_2 \\ tB_1^∗ + (1 − t)B_2^∗ & tB_1^∗ A_1^{−1} B_1 + (1 − t)B_2^∗ A_2^{−1} B_2 \end{pmatrix} ≥ 0.

Another application of the Schur complement theorem yields

    tB_1^∗ A_1^{−1} B_1 + (1 − t)B_2^∗ A_2^{−1} B_2 ≥ (tB_1^∗ + (1 − t)B_2^∗)(tA_1 + (1 − t)A_2)^{−1}(tB_1 + (1 − t)B_2).

Lemma 8.12. If f (·, ·) is jointly concave, then maxx f (x, y) is concave.



Proof. For any y1 , y2 , suppose xi is such that f (xi , yi ) = maxx f (x, yi ) for i = 1, 2. Then for any
λ ∈ (0, 1), we have

max f (x, λy1 + (1 − λ)y2 ) ≥ f (λx1 + (1 − λ)x2 , λy1 + (1 − λ)y2 )


x
≥ λf (x1 , y1 ) + (1 − λ)f (x2 , y2 )
= λ max f (x, y1 ) + (1 − λ) max f (x, y2 ).
x x

Theorem 8.13. For any self-adjoint H, the function

Mn (C)++ → R, A 7→ Tr exp(H + log A)

is concave.

Proof. We have the following duality formula:

    Tr \exp(H + \log A) = \max_{X≥0} \big[ Tr(XH) + Tr X − Tr X(\log X − \log A) \big]

(exercise).
Then the desired concavity result follows from the joint convexity of quantum relative entropy
and the above lemma.

Using the above theorem, one can extend the Golden–Thompson inequality TreH+K ≤ TreH eK
to three matrices. Note that TreH eK eL is in general not even a real number.

Proposition 8.14. For all self-adjoint matrices H, K, L we have

    Tr e^{H+K+L} ≤ Tr\big[ e^H T_{e^{−K}}(e^L) \big],

where

    T_A(B) = \int_0^∞ \frac{1}{s + A} B \frac{1}{s + A} \, ds = \frac{d}{dt}\Big|_{t=0} \log(A + tB).
Proof. Let C ⊂ R^N be a convex set such that tx ∈ C for every t > 0, x ∈ C. If f : C → R is
concave and f(tx) = tf(x) for all t > 0, x ∈ C, then

    f(y) ≤ \lim_{t→0^+} \frac{f(x + ty) − f(x)}{t}
for any x, y ∈ C.
Indeed, for any t > 0
   
    f(x + ty) = (1 + t) f\Big( \frac{x}{1+t} + \frac{ty}{1+t} \Big) ≥ (1 + t)\Big( \frac{f(x)}{1+t} + \frac{t f(y)}{1+t} \Big) = f(x) + t f(y).

Now we apply this result to C = Mn (C)++ and f (X) = Tr[eH+K+log X ]. This function is concave
by the previous theorem and homogeneous of degree one. For X = e−K and Y = eL we get
 
H+K+L d H+K+log(e−K +teL ) H+K+log(e−K ) d −K L
Tre ≤ |t=0 Tr[e ] = Tr e |t=0 log(e + te ) ,
dt dt

where the right hand side is exactly what we need once we prove
    \int_0^∞ \frac{1}{s + A} B \frac{1}{s + A} \, ds = \frac{d}{dt}\Big|_{t=0} \log(A + tB).

This identity follows from the integral formula

    \log A = \int_0^∞ \big( (s + 1)^{−1} − (s + A)^{−1} \big) \, ds

as follows:

    \frac{d}{dt}\Big|_{t=0} \log(X + tY) = \int_0^∞ \frac{d}{dt}\Big|_{t=0} \big( (s + 1)^{−1} − (s + X + tY)^{−1} \big) \, ds
        = −\int_0^∞ \frac{d}{dt}\Big|_{t=0} (s + X + tY)^{−1} \, ds
        = \int_0^∞ (s + X)^{−1} Y (s + X)^{−1} \, ds.

Exercises
Exercise 8.1. Suppose that p ≤ q. For K ∈ Mn (C) the function

Mn (C)+ × Mn (C)+ → C, (A, B) 7→ TrK ∗ Ap KB q ,

is
1. jointly concave if 0 ≤ p ≤ q ≤ 1 such that p + q ≤ 1;
2. jointly convex if −1 ≤ p ≤ 0, 1 ≤ q ≤ 2 such that p + q ≥ 1.
Exercise 8.2. Prove the duality formula:

    Tr \exp(H + \log A) = \max_{X≥0} \big[ Tr(XH) + Tr X − Tr X(\log X − \log A) \big].

Exercise 8.3. The parallel sum A : B of A, B ∈ Mn (C)+ is defined as

A : B = lim ((A + ε1)−1 + (B + ε1)−1 )−1 .


ε↘0

1. Show that the limit in the definition of A : B exists.


2. Show that
⟨ξ, (A−1 + B −1 )−1 ξ⟩ = inf{⟨η, Aη⟩ + ⟨ζ, Bζ⟩ | ξ = η + ζ}
for all ξ ∈ Cn .
3. Show that S ∗ (A : B)S ≤ (S ∗ AS) : (S ∗ BS) for all S ∈ Mn (C).
Exercise 8.4. A quantum Markov semigroup is a family (Pt )t≥0 of linear operators on Mn (C)
such that
• P0 = id, Ps Pt = Ps+t , s, t ≥ 0,
• Pt is unital completely positive trace preserving,

• Pt x → x, t → 0.
It has a generator L that is defined via
    L(x) := \lim_{t→0} \frac{x − P_t x}{t} .
Show that ρ 7→ ⟨L(ρp ), ρ1−p ⟩ is convex when 0 ≤ p ≤ 1, and concave when −1 ≤ p ≤ 0.
Chapter 9

Entanglement

If two classical physical systems are described by the (finite) pure state spaces X and Y , then
the composite system is described by the pure state space X × Y . This means that the mixed states
of the composite system are probability densities on X × Y . For every ρ : X × Y → [0, 1] with
\sum_{x,y} ρ(x, y) = 1 one has ρ = \sum_{x,y} ρ(x, y) 1_{(x,y)} . In other words, every probability density on X × Y
is a convex combination of the Dirac densities 1_{(x,y)} with x ∈ X, y ∈ Y .
The situation is markedly different for quantum systems, where the phenomenon of entangle-
ment occurs, which is one of the key features of quantum information theory compared to classical
information theory.
Definition 9.1. A quantum state ρ ∈ Mm (C) ⊗ Mn (C) is called separable if there exist λ_1 , . . . , λ_k ∈
[0, 1] with \sum_{j=1}^k λ_j = 1 and quantum states σ_1^{(1)} , . . . , σ_k^{(1)} ∈ Mm (C), σ_1^{(2)} , . . . , σ_k^{(2)} ∈ Mn (C) such
that

    ρ = \sum_{j=1}^k λ_j σ_j^{(1)} ⊗ σ_j^{(2)} .

Every quantum state that is not separable is called entangled.


Remark. The notion of separable and pure states applies to states of a composite system for a
given composition. For example, if we view Mm (C) ⊗ Mn (C) as Mmn (C) ⊗ M1 (C), then every
quantum state is naturally separable.
Examples of separable states are easy to come by – just take your favorite quantum states in
Mm (C) and Mn (C) and then form their tensor product and take convex combinations if you like.
What is less obvious is how to find entangled states (or if they exist at all). For this purpose, the
following criterion comes in handy.
Proposition 9.2 (Horodecki criterion). A quantum state ρ ∈ Mm (C) ⊗ Mn (C) is separable if and
only if for every k ∈ N and every positive map Φ : Mm (C) → Mk (C) the matrix (Φ ⊗ id_{Mn (C)})(ρ)
is positive.
Proof. We only prove the easier implication here.
If ρ is separable, then there exist λ_1 , . . . , λ_l ≥ 0 and quantum states σ_1^{(1)} , . . . , σ_l^{(1)} ∈ Mm (C),
σ_1^{(2)} , . . . , σ_l^{(2)} ∈ Mn (C) such that ρ = \sum_j λ_j σ_j^{(1)} ⊗ σ_j^{(2)} .
If Φ : Mm (C) → Mk (C) is positive, then

    (Φ ⊗ id_{Mn (C)})(ρ) = \sum_{j=1}^l λ_j Φ(σ_j^{(1)}) ⊗ σ_j^{(2)} .


Since Φ is positive, each matrix Φ(σ_j^{(1)}) is positive. Thus (Φ ⊗ id_{Mn (C)})(ρ) ≥ 0.

Corollary 9.3. Whenever m, n ≥ 2, there exist entangled states in Mm (C) ⊗ Mn (C).

Proof. Let Φ : Mm (C) → Mk (C) be a positive map that is not 2-positive. For example, we can
take k = m and Φ the transpose map. Then there exists a (necessarily non-zero) positive matrix
A ∈ Mm (C) ⊗ Mn (C) such that (Φ ⊗ idMn (C) )(A) is not positive. By the Horodecki criterion,
A/Tr(A) is an entangled state.

Example 9.4 (Werner states). Let m = n ≥ 2 and W = \sum_{i,j} E_{ij} ⊗ E_{ji} . As W^2 = 1, the matrix W
has eigenvalues ±1. Let P_{±1} be the orthogonal projection onto the eigenspace of W corresponding
to the eigenvalue ±1. More explicitly, P_1 = \frac{1}{2}(1 ⊗ 1 + W) and P_{−1} = \frac{1}{2}(1 ⊗ 1 − W).
A basis of the range of P_1 is given by (e_i ⊗ e_j + e_j ⊗ e_i)_{i≤j} and a basis of the range of P_{−1} is
given by (e_i ⊗ e_j − e_j ⊗ e_i)_{i<j} . Thus Tr(P_1) = \frac{n(n+1)}{2} and Tr(P_{−1}) = \frac{n(n−1)}{2} .
A quantum state of the form ρ_λ = \frac{2λ}{n(n−1)} P_{−1} + \frac{2(1−λ)}{n(n+1)} P_1 with λ ∈ [0, 1] is called a Werner
state. Werner states are entangled for λ > \frac{1}{2} , which is what we prove here.
Let Φ : Mm (C) → Mm (C) be the transpose map, which is positive, but not completely positive.
We have

    (Φ ⊗ id_{Mn (C)})(W) = \sum_{i,j} E_{ij} ⊗ E_{ij}

and therefore

    (Φ ⊗ id_{Mn (C)})(P_1) = \frac{1}{2}(1 ⊗ 1) + \frac{1}{2} \sum_{i,j} E_{ij} ⊗ E_{ij} ,
    (Φ ⊗ id_{Mn (C)})(P_{−1}) = \frac{1}{2}(1 ⊗ 1) − \frac{1}{2} \sum_{i,j} E_{ij} ⊗ E_{ij} .

Let Q_0 = \frac{1}{n} \sum_{i,j} E_{ij} ⊗ E_{ij} and Q_1 = 1 ⊗ 1 − Q_0 . From the previous identities we deduce

    (Φ ⊗ id_{Mn (C)})(ρ_λ) = (Φ ⊗ id_{Mn (C)})\Big( \frac{2λ}{n(n − 1)} P_{−1} + \frac{2(1 − λ)}{n(n + 1)} P_1 \Big)
        = \frac{2λ − 1 + n}{n(n^2 − 1)} (1 ⊗ 1) − \frac{(2λ − 1)n + 1}{n(n^2 − 1)} \sum_{i,j} E_{ij} ⊗ E_{ij}
        = \frac{1 − 2λ}{n} Q_0 + \Big( 1 − \frac{1 − 2λ}{n} \Big) \frac{Q_1}{n^2 − 1} .

Observe that Q_0 is self-adjoint and

    Q_0^2 = \frac{1}{n^2} \sum_{i,j,k,l} E_{ij} E_{kl} ⊗ E_{ij} E_{kl} = \frac{1}{n^2} \sum_k \sum_{i,l} E_{il} ⊗ E_{il} = Q_0 .

Hence Q_0 and Q_1 are orthogonal projections with Q_0 Q_1 = 0. It follows that (Φ ⊗ id_{Mn (C)})(ρ_λ) has
eigenvalues (1 − 2λ)/n and \big( 1 − \frac{1−2λ}{n} \big)/(n^2 − 1). In particular, (Φ ⊗ id_{Mn (C)})(ρ_λ) is not positive for
λ > \frac{1}{2} .
Chapter 10

Data processing inequalities

Definition 10.1 (Quantum relative entropy). For any quantum states ρ and σ, the quantum
relative entropy of ρ with respect to σ is
D(ρ||σ) := Tr(ρ(log ρ − log σ)).
Although the quantum relative entropy is not a distance, it still serves as a nice measure to
distinguish quantum states.
Lemma 10.2. If ρ, σ ∈ Mn (C) are quantum states, then D(ρ∥σ) ≥ 0 with equality if ρ = σ.
Proof. Let f (x) = x log x. Since f is convex, Klein’s inequality implies
0 ≤ Tr(f (ρ) − f (σ) − f ′ (σ)(ρ − σ)) = Tr(ρ log ρ − σ log σ − log σ(ρ − σ)) = Tr(ρ(log ρ − log σ)).
In fact, the quantum relative entropy vanishes D(ρ||σ) = 0 if and only if ρ = σ. The converse
implication follows from the equality case in Klein’s inequality for strictly convex functions, which
we did not discuss. However, it is also an immediate consequence of the following result. For its
formulation, recall that the trace norm of a matrix A ∈ Mm (C) is defined as ∥A∥1 = Tr(|A|).
Theorem 10.3 (Pinsker's inequality). For any quantum states ρ and σ, we have

    D(ρ||σ) ≥ \frac{1}{2} ∥ρ − σ∥_1^2 .
To prove this, we shall need the following monotonicity property, sometimes called data pro-
cessing inequality of quantum relative entropy.
Theorem 10.4 (Data processing inequality for quantum relative entropy). For any quantum states
ρ, σ ∈ Mm (C) and any quantum channel Λ : Mm (C) → Mn (C), we have
D(Λ(ρ)||Λ(σ)) ≤ D(ρ||σ).
Proof. As noticed before,

    D(ρ∥σ) = \lim_{p↗1} \frac{1 − Tr(ρ^p σ^{1−p})}{1 − p} .

Let f_p(x) = x^p for p ∈ [0, 1], and for a matrix X write L_X and R_X for left and right multiplication
by X, that is, L_X A = XA and R_X A = AX.


Then
    ⟨σ^{1/2} , f_p(L_ρ R_σ^{−1}) σ^{1/2}⟩_{HS} = Tr(ρ^p σ^{1−p}).

For convenience, write ∆_{ρ,σ} = L_ρ R_σ^{−1} and ∆_{Λ(ρ),Λ(σ)} = L_{Λ(ρ)} R_{Λ(σ)}^{−1} . Consider the map

V : Mn (C) → Mm (C), A 7→ Λ† (AΛ(σ)−1/2 )σ 1/2 .


Then V (Λ(σ)1/2 ) = Λ† (1)σ 1/2 = σ 1/2 , where we used the trace-preserving property of Λ. So we
may write
Tr(ρp σ 1−p ) = ⟨σ 1/2 , fp (∆ρ,σ )σ 1/2 ⟩HS = ⟨Λ(σ)1/2 , V † fp (∆ρ,σ )V (Λ(σ)1/2 )⟩HS .
Note that for any A ∈ Mn (C), we have by Kadison–Schwarz inequality,
⟨A, V † V (A)⟩HS = ⟨V (A), V (A)⟩HS
= ⟨Λ† (AΛ(σ)−1/2 )σ 1/2 , Λ† (AΛ(σ)−1/2 )σ 1/2 ⟩HS
≤ Tr[Λ† (Λ(σ)−1/2 A∗ AΛ(σ)−1/2 )σ]
= ⟨A, A⟩HS ,
and
⟨A, V † ∆ρ,σ V (A)⟩HS = ⟨Λ† (AΛ(σ)−1/2 )σ 1/2 , ∆ρ,σ Λ† (AΛ(σ)−1/2 )σ 1/2 ⟩HS
= ⟨Λ† (AΛ(σ)−1/2 ), ρΛ† (AΛ(σ)−1/2 )⟩HS
≤ Tr[Λ† (AΛ(σ)−1 A∗ )σ]
= Tr[AΛ(σ)−1 A∗ Λ(ρ)]
= ⟨A, ∆Λ(ρ),Λ(σ) (A)⟩HS .
So we have V † V ≤ 1 and V † ∆ρ,σ V ≤ ∆Λ(ρ),Λ(σ) . Then we get

V † fp (∆ρ,σ )V ≤ fp (V † ∆ρ,σ V ) ≤ fp (∆Λ(ρ),Λ(σ) ),


where in the first inequality we used operator Jensen’s inequality (fp is operator concave since it is
operator monotone), and in the second inequality we used the operator monotonicity. Therefore,
    D(ρ∥σ) = \lim_{p↗1} \frac{1 − Tr(ρ^p σ^{1−p})}{1 − p}
        = \lim_{p↗1} \frac{1 − ⟨Λ(σ)^{1/2} , V^† f_p(∆_{ρ,σ}) V Λ(σ)^{1/2}⟩_{HS}}{1 − p}
        ≥ \lim_{p↗1} \frac{1 − ⟨Λ(σ)^{1/2} , f_p(∆_{Λ(ρ),Λ(σ)}) Λ(σ)^{1/2}⟩_{HS}}{1 − p}
        = \lim_{p↗1} \frac{1 − Tr(Λ(ρ)^p Λ(σ)^{1−p})}{1 − p}
        = D(Λ(ρ)∥Λ(σ)).
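
The data processing inequality can be tested against a random quantum channel built from Kraus operators (a sketch assuming numpy; the isometry construction and the helper names are ad hoc):

    import numpy as np

    rng = np.random.default_rng(3)

    def rand_state(n):
        X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        rho = X @ X.conj().T
        return rho / np.trace(rho).real

    def logm_herm(A):
        w, V = np.linalg.eigh(A)
        return (V * np.log(w)) @ V.conj().T

    def relent(rho, sigma):
        return np.trace(rho @ (logm_herm(rho) - logm_herm(sigma))).real

    m, n, k = 4, 3, 5                            # channel M_4(C) -> M_3(C) with 5 Kraus operators
    G = rng.normal(size=(k * n, m)) + 1j * rng.normal(size=(k * n, m))
    Q, _ = np.linalg.qr(G)                       # isometry: Q^* Q = 1_m
    K = [Q[i * n:(i + 1) * n, :] for i in range(k)]   # sum_i K_i^* K_i = 1, so the channel is trace preserving

    def channel(rho):
        return sum(Ki @ rho @ Ki.conj().T for Ki in K)

    rho, sigma = rand_state(m), rand_state(m)
    print(relent(channel(rho), channel(sigma)) <= relent(rho, sigma) + 1e-10)   # True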
Definition 10.5. A vector p ∈ C^n is called a probability vector if p_j ≥ 0 for 1 ≤ j ≤ n and
\sum_{j=1}^n p_j = 1. A classical channel is a linear map Φ : C^m → C^n that maps probability vectors to
probability vectors.
If p, q ∈ C^n are probability vectors, the relative entropy of p with respect to q is defined as

    D(p∥q) = \sum_{j=1}^n p_j \log \frac{p_j}{q_j} .

Theorem 10.6. Let p, q ∈ Cm be probability vectors.


(a) If Φ : Cm → Cn is a classical channel, then

D(Φ(p)∥Φ(q)) ≤ D(p∥q).

(b) We have the following Pinsker-type inequality:

    D(p∥q) ≥ \frac{1}{8} ∥p − q∥_1^2 .

Proof. (a) If ρ = diag(p), σ = diag(q), then D(ρ∥σ) = D(p∥q). Moreover, let E : Mm (C) →
Cm , E(A) = (A11 , . . . , Amm ). Then the map Λ : Mm (C) → Mn (C), Λ(A) = diag(Φ(E(A))) is a
quantum channel. It follows from the data processing inequality for quantum entropies that

D(Φ(p)∥Φ(q)) = D(diag(Φ(E(ρ))), diag(Φ(E(σ)))) = D(Λ(ρ)∥Λ(σ)) ≤ D(ρ∥σ) = D(p∥q).


(b) As x \mapsto x \log x has second derivative \frac{1}{x}, the Taylor expansion at 1 gives

    x \log x = x − 1 + \sum_{k=0}^∞ \frac{(x − 1)^{k+2}}{(k + 2)!} \frac{d^k}{dx^k}\Big|_{x=1} \frac{1}{x} .

Since x \log x is convex, we have x \log x ≥ x − 1. Moreover, if x ≤ 1, then all terms in the Taylor
expansion are non-negative. Thus

    x \log x ≥ (x − 1) + \frac{1}{2} (1 − x)_+^2 .
It follows that

    D(p∥q) = \sum_{j=1}^n \frac{p_j}{q_j} \log\Big( \frac{p_j}{q_j} \Big) q_j
        ≥ \sum_{j=1}^n \Big( \frac{p_j}{q_j} − 1 \Big) q_j + \frac{1}{2} \sum_{j=1}^n \Big( 1 − \frac{p_j}{q_j} \Big)_+^2 q_j
        ≥ \frac{1}{2} \Big( \sum_{j=1}^n \Big( 1 − \frac{p_j}{q_j} \Big)_+ q_j \Big)^2 .

Since \sum_j (1 − p_j/q_j) q_j = 0, we have \sum_j (1 − p_j/q_j)_+ q_j = \sum_j (1 − p_j/q_j)_− q_j . Thus

    ∥p − q∥_1 = \sum_{j=1}^n \Big| 1 − \frac{p_j}{q_j} \Big| q_j = 2 \sum_{j=1}^n \Big( 1 − \frac{p_j}{q_j} \Big)_+ q_j ≤ 2 \big( 2 D(p∥q) \big)^{1/2} .

Remark. The constant \frac{1}{8} in Pinsker's inequality can be improved to \frac{1}{2}, but that requires a bit
more work or some slick probabilistic arguments.
Theorem 10.7. Let f : R → R be convex, A ∈ Mm (C) self-adjoint and Φ : Mm (C) → Mn (C) a
completely positive map with Φ(1) ≤ 1. If f (0) ≤ 0 or Φ(1) = 1, then

Tr(f (Φ(A))) ≤ Tr(Φ(f (A))).



Proof. Let A = \sum_{k=1}^l λ_k P_k be the spectral decomposition of A. For an eigenvector ξ of Φ(A) with
∥ξ∥ = 1 define µ_k = ⟨ξ, Φ(P_k)ξ⟩ for 1 ≤ k ≤ l and µ_{l+1} = 1 − \sum_{k=1}^l µ_k . Clearly, µ_k ≥ 0 for k ≤ l,
and since \sum_k P_k = 1 and Φ(1) ≤ 1, we also have µ_{l+1} ≥ 0.
Using the convexity of f , we get

⟨ξ, f (Φ(A))ξ⟩ = f (⟨ξ, Φ(A)ξ⟩)


!
X
=f λk ⟨ξ, Φ(Pk )ξ⟩
k
l
!
X
=f λk µk + 0 · µk+1
k=1
l
X
≤ µk f (λk ) + µl+1 f (0).
k=1

If Φ(1) = 1, then µl+1 = 0, and if f (0) ≤ 0, then µl+1 f (0) ≤ 0. In either case, we get
    ⟨ξ, f(Φ(A))ξ⟩ ≤ \sum_{k=1}^l µ_k f(λ_k) = \Big\langle ξ, Φ\Big( \sum_k f(λ_k) P_k \Big) ξ \Big\rangle = ⟨ξ, Φ(f(A))ξ⟩.

Summing over an orthonormal eigenbasis for Φ(A), the desired inequality follows.

For the following recall that a POVM is a family (P_i)_{i=1}^n of positive matrices with \sum_i P_i = 1
and that for every POVM (P_i)_{i=1}^n the map

    Φ : Mm (C) → Mn (C),   A \mapsto \sum_{i=1}^n Tr(P_i A) E_{ii}

is a quantum channel.
Lemma 10.8. Let ρ, σ ∈ Mm (C) be two quantum states and Λ : Mm (C) → Mn (C) any quantum
channel.
(a) The trace distance is monotone under quantum channels:

∥Λ(ρ) − Λ(σ)∥1 ≤ ∥ρ − σ∥1 .

(b) There exists a POVM (P_i)_{i=1}^n with associated quantum-to-classical channel Φ such that

    ∥ρ − σ∥_1 = ∥Φ(ρ) − Φ(σ)∥_1 = \sum_{i=1}^n |Tr(P_i ρ) − Tr(P_i σ)|.

Proof. (a) Let A = ρ − σ. Since f (x) = |x| is convex and f (0) = 0, we deduce from the previous
theorem
∥Λ(A)∥1 = Tr(|Λ(A)|) ≤ Tr(Λ(|A|)) = Tr(|A|) = ∥A∥1 .
(b) Consider the spectral decomposition of ρ − σ = \sum_j λ_j Q_j . Then

    P_1 := \sum_{λ_j ≥ 0} Q_j ,   P_2 := \sum_{λ_j < 0} Q_j

give a POVM P = (P_i)_{i=1}^2 . By definition,

    ∥Φ(ρ) − Φ(σ)∥_1 = \sum_{i=1}^2 |Tr(P_i ρ) − Tr(P_i σ)| = \sum_{λ_j ≥ 0} λ_j + \Big| \sum_{λ_j < 0} λ_j \Big| = \sum_j |λ_j| = ∥ρ − σ∥_1 .

Now we are ready to prove the quantum Pinsker’s inequality.

Proof of Theorem 10.3. We only prove the weaker version with constant \frac{1}{8} instead of \frac{1}{2}. Take Λ as
the quantum-to-classical channel associated to ρ − σ in the previous lemma. Then from the monotonicity
of quantum relative entropy and the classical Pinsker inequality:

    D(ρ∥σ) ≥ D(Λ(ρ)∥Λ(σ)) ≥ \frac{1}{8} ∥Λ(ρ) − Λ(σ)∥_1^2 = \frac{1}{8} ∥ρ − σ∥_1^2 .
Recall that S(ρ) = −Tr(ρ log ρ) is the quantum entropy. For any bipartite state ρ over H1 ⊗H2 ,
we denote ρ1 = Tr2 (ρ) and ρ2 = Tr1 (ρ). We have seen the following subadditivity result of entropy
in the previous lectures:
S(ρ) ≤ S(ρ1 ) + S(ρ2 ),
which follows from the non-negativity of quantum relative entropy:

S(ρ1 ) + S(ρ2 ) − S(ρ) = Trρ(log ρ − log(ρ1 ⊗ ρ2 )) = D(ρ||ρ1 ⊗ ρ2 ) ≥ 0.

Moreover, the equality S(ρ) = S(ρ1 ) + S(ρ2 ) holds iff ρ = ρ1 ⊗ ρ2 .


Actually, the quantum entropy satisfies the following strong subadditivity (SSA). For a multi-
partite state ρ ∈ M_{l_1}(C) ⊗ · · · ⊗ M_{l_n}(C) we use the notation ρ_{j_1 ... j_N} to denote the state
Tr_{k_1} . . . Tr_{k_M}(ρ), where k_1 , . . . , k_M are chosen such that {j_1 , . . . , j_N} ⊔ {k_1 , . . . , k_M} = {1, . . . , n}.
In particular, if ρ is a tripartite state, then ρ_{123} = ρ etc. Note that this notation depends on the
splitting of our quantum system into subsystems.

Theorem 10.9 (Strong subadditivity of the quantum entropy). If ρ ∈ Ml (C) ⊗ Mm (C) ⊗ Mn (C)
is a quantum state, then
S(ρ12 ) + S(ρ23 ) ≥ S(ρ123 ) + S(ρ2 ).
This inequality reduces to the subadditivity of the quantum entropy when m = 1.

Proof. Similar to the computations in the proof of the subadditivity of the quantum entropy, one
obtains

    D(ρ_{123} ∥ ρ_{12} ⊗ ρ_3) = Tr(ρ_{123}(\log ρ_{123} − \log ρ_{12} ⊗ 1 − 1 ⊗ \log ρ_3))
        = −S(ρ_{123}) − Tr(Tr_3(ρ_{123}) \log ρ_{12}) − Tr(Tr_{12}(ρ_{123}) \log ρ_3)
        = −S(ρ_{123}) − Tr(ρ_{12} \log ρ_{12}) − Tr(ρ_3 \log ρ_3)
        = S(ρ_{12}) + S(ρ_3) − S(ρ_{123}).

Now consider the quantum channel Λ = Tr1 . We have

Λ(ρ123 ) = ρ23 , Λ(ρ12 ⊗ ρ3 ) = ρ2 ⊗ ρ3 .

By a similar computation as above,

D(Λ(ρ123 )∥Λ(ρ12 ⊗ ρ3 )) = S(ρ2 ) + S(ρ3 ) − S(ρ23 ).



Thus it follows from the data processing inequality that

S(ρ12 ) + S(ρ3 ) − S(ρ123 ) = D(ρ123 ∥ρ12 ⊗ ρ3 )


≥ D(Λ(ρ123 )∥Λ(ρ12 ⊗ ρ3 ))
= S(ρ2 ) + S(ρ3 ) − S(ρ23 ).
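
Strong subadditivity can be tested on a random three-qubit state (a sketch assuming numpy; the einsum contractions implement the partial traces):

    import numpy as np

    rng = np.random.default_rng(11)
    X = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
    rho = X @ X.conj().T
    rho /= np.trace(rho).real                    # random state on C^2 ⊗ C^2 ⊗ C^2

    def S(r):
        w = np.linalg.eigvalsh(r)
        w = w[w > 1e-12]
        return float(-np.sum(w * np.log(w)))

    R = rho.reshape(2, 2, 2, 2, 2, 2)            # indices (i1, i2, i3, j1, j2, j3)
    rho12 = np.einsum('abcdec->abde', R).reshape(4, 4)   # trace out system 3
    rho23 = np.einsum('abcaef->bcef', R).reshape(4, 4)   # trace out system 1
    rho2  = np.einsum('abcaec->be', R)                   # trace out systems 1 and 3
    print(S(rho12) + S(rho23) >= S(rho) + S(rho2) - 1e-10)   # True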

Definition 10.10. For any tripartite state ρ123 the conditional mutual information of 1 and 2
given 3, I(1, 2|3), is defined as

I(1, 2|3) := S(ρ13 ) + S(ρ23 ) − S(ρ123 ) − S(ρ3 ).

Then SSA says I(1, 2|3) ≥ 0. For any bipartite state ρ12 the squashed entanglement of ρ12 is
defined as
    E_{sq}(ρ_{12}) := \frac{1}{2} \inf\{ I(1, 2|3) : ρ_{123} \text{ is any tripartite extension of } ρ_{12} \}.
The functional Esq provides a faithful measure of entanglement.

Theorem 10.11. A bipartite state ρ12 is separable if and only if Esq (ρ12 ) = 0.
So if Esq (ρ) > 0, then ρ is entangled. The following extended SSA provides a lower bound of
Esq :

Theorem 10.12 (Extended SSA). For any tripartite state ρ123 ∈ Ml (C) ⊗ Mm (C) ⊗ Mn (C) we
have
S(ρ13 ) + S(ρ23 ) − S(ρ123 ) − S(ρ3 ) ≥ 2 max{S(ρ1 ) − S(ρ12 ), S(ρ2 ) − S(ρ12 ), 0}.
As a corollary,
Esq (ρ12 ) ≥ max{S(ρ1 ) − S(ρ12 ), S(ρ2 ) − S(ρ12 ), 0}.
So if either of the conditional entropies S(ρ12 ) − S(ρ1 ) or S(ρ12 ) − S(ρ2 ) is strictly negative, then
Esq (ρ12 ) > 0 and thus ρ12 is entangled.

To prove this theorem, we need a purification trick that is very useful. Let us come back to the
quantum entropy S(ρ). It is clear that S(ρ) = 0 iff ρ is pure. If ρ = ρ12 is a bipartite state, then
(exercise) S(ρ12 ) = 0 implies S(ρ1 ) = S(ρ2 ). The following theorem says that if S(ρ12 ) is small,
then S(ρ1 ) is close to S(ρ2 ):
Theorem 10.13. For any bipartite state ρ12 we have

|S(ρ1 ) − S(ρ2 )| ≤ S(ρ12 ).

For the proof, we recall the following purification result from Chapter 4: If ρ ∈ Mn (C) is a
quantum state, then there exists a unit vector ξ ∈ Cn ⊗Cn such that ρ = Tr1 (|ξ⟩ ⟨ξ|) = Tr2 (|ξ⟩ ⟨ξ|).
This is not quite the statement of Proposition 4.6, but can be deduced from its proof.
Now we can prove Theorem 10.13 as follows:

Proof of Theorem 10.13. Consider a purification ρ123 of ρ12 as described above. Then S(ρ12 ) =
S(ρ3 ) and S(ρ1 ) = S(ρ23 ). By the subadditivity of the quantum entropy:

    S(ρ_1) = S(ρ_{23}) ≤ S(ρ_2) + S(ρ_3) = S(ρ_2) + S(ρ_{12}).

So S(ρ12 ) ≥ S(ρ1 ) − S(ρ2 ), and the proof is finished by symmetry.



Proof of Theorem 10.12. Consider any purification ρ1234 of ρ123 . Since ρ1234 is pure, S(ρ14 ) =
S(ρ23 ) and S(ρ124 ) = S(ρ3 ). Then
S(ρ12 ) + S(ρ23 ) − S(ρ1 ) − S(ρ3 ) = S(ρ12 ) + S(ρ14 ) − S(ρ1 ) − S(ρ124 ) ≥ 0.
So
S(ρ12 ) + S(ρ23 ) ≥ S(ρ1 ) + S(ρ3 ).
Similarly we have
S(ρ13 ) + S(ρ23 ) ≥ S(ρ1 ) + S(ρ2 ).
Adding the above two inequalities, we get
S(ρ12 ) + S(ρ13 ) + 2S(ρ23 ) ≥ 2S(ρ1 ) + S(ρ2 ) + S(ρ3 ).
This is independent of H4 . Consider again the purification ρ1234 of ρ123 , then the above argument
shows:
S(ρ14 ) + S(ρ13 ) + 2S(ρ34 ) ≥ 2S(ρ1 ) + S(ρ4 ) + S(ρ3 ).
Since ρ1234 is pure, S(ρ14 ) = S(ρ23 ), S(ρ34 ) = S(ρ12 ) and S(ρ4 ) = S(ρ123 ). Hence,
S(ρ23 ) + S(ρ13 ) + 2S(ρ12 ) ≥ 2S(ρ1 ) + S(ρ123 ) + S(ρ3 ),
which is nothing but
S(ρ13 ) + S(ρ23 ) − S(ρ123 ) − S(ρ3 ) ≥ 2S(ρ1 ) − 2S(ρ12 ).
Similarly, we can derive a lower bound in terms of 2S(ρ2 ) − 2S(ρ12 ).

Exercises
Exercise 10.1. Prove the classical Pinsker inequality (you can use the monotonicity of classical
relative entropy): for two probability densities p, q over some finite set X we have

    D(p||q) ≥ \frac{1}{2} ∥p − q∥_1^2 .
Exercise 10.2. Suppose that {Λ_x}_{x∈X} is a POVM, i.e. a finite set of operators such that

    Λ_x ≥ 0 for all x ∈ X,   and   \sum_x Λ_x = 1.

Show that this gives a quantum channel Λ such that for any quantum state ρ, Λ(ρ) is a classical
probability density over X satisfying Λ(ρ)(x) = Tr(ρΛx ).
Exercise 10.3. Show that for any quantum channel Λ and any matrix X, we have ∥Λ(X)∥p ≤
∥X∥p , 1 ≤ p ≤ ∞.
Exercise 10.4. Find an entangled state.
Exercise 10.5. Show that for any pure bipartite state ρ over H1 ⊗ H2 , we have S(ρ1 ) = S(ρ2 ).
Show also that any quantum state can be purified. That is, for any quantum state ρ over H, there
exists a pure state |ψ⟩ ⟨ψ| on H ⊗ H such that
ρ = Tr1 [|ψ⟩ ⟨ψ|] = Tr2 [|ψ⟩ ⟨ψ|].
Exercise 10.6. Show that the monotonicity of quantum relative entropy implies the joint con-
vexity.
Chapter 11

Supplement: Quantum Markov


Semigroups and Logarithmic Sobolev
Inequalities

Definition 11.1. A quantum Markov semigroup (QMS) on Mn (C) is a family (Pt )t≥0 of unital
completely positive maps on Mn (C) such that
• P0 = id_{Mn (C)} , P_s P_t = P_{s+t} for all s, t ≥ 0,
• Pt → P0 as t → 0.
Remark. By the Heisenberg–Schrödinger duality, if (Pt ) is a QMS, then Pt† is completely positive
trace-preserving for all t ≥ 0. In particular, Pt† maps quantum states to quantum states for all
t ≥ 0.
Theorem 11.2 (Lindblad). If (Pt ) is a quantum Markov semigroup on Mn (C), then for each
A ∈ Mn (C) the limit
    L(A) = \lim_{t↘0} \frac{1}{t} (A − P_t(A))

exists, L is a linear map from Mn (C) to itself and Pt = e−tL .


Moreover, if L : Mn (C) → Mn (C) is a linear map, then (e−tL ) is a quantum Markov semigroup
if and only if there exist G ∈ Mn (C) and a completely positive map Φ : Mn (C) → Mn (C) with
Φ(1) = G + G∗ such that
L(A) = GA + AG∗ − Φ(A)
for all A ∈ Mn (C).
Remark. For a linear map L : Mn (C) → Mn (C), the exponential e−tL is defined as

    e^{−tL} = \sum_{k=0}^∞ \frac{(−1)^k t^k}{k!} L^k .

Proof of Lindblad's theorem. Clearly, the set V = \{ A ∈ Mn (C) \mid \lim_{t↘0} \frac{1}{t}(A − P_t(A)) \text{ exists} \} is a
subspace of Mn (C). For A ∈ Mn (C) and δ > 0 let

    A_δ = \int_0^δ P_t(A) \, dt.

Since t 7→ Pt (A) is continuous, we have δ −1 Aδ → A as δ → 0. Moreover,
!
Z δ Z δ
1 1
(Aδ − Pt (Aδ )) = Ps (A) ds − Ps+t (A) ds
t t 0 0
!
Z δ Z t+δ
1
= Ps (A) ds − Ps (A) ds
t 0 t

1 t 1 t+δ
Z Z
= Ps (A) ds − Ps (A) ds
t 0 t δ
→ A − Pδ (A)

as t → 0. Thus Aδ ∈ V , and it follows that V is dense in Mn (C). Since Mn (C) is finite-dimensional,


we must have V = Mn (C).
Now let A ∈ Mn (C). We want to show that Pt (A) = e−tL (A). Note that
 
1 1
(Pt+h (A) − Pt (A)) = Pt (Ph (A) − A) → −Pt (L(A))
h h

as h → 0. In other words, the map t 7→ Pt (A) solves the initial-value problem


d
A(t) = −LA(t)
dt
A(0) = A.

A direct computation shows that t 7→ e−tL (A) solves the same IVP. It follows from the uniqueness
theorem for ordinary differential equations that Pt (A) = e−tL (A) for all t ≥ 0.
For the second part we first assume that (e−tL ) is a quantum Markov semigroup. Let U (n) =
{U ∈ Mn (C) | U ∗ U = U U ∗ = 1}. There exists a unique probability measure µ on U (n) such that
Z Z
f (U V W ) dµ(V ) = f (V ) dµ(V )
U (n) U (n)

for all U, W ∈
R U (n) and all continuous f : U (n) → C.
Let G = U (n) L(U ∗ )U dµ(U ) and Φ(A) = GA + AG∗ − L(A). If V ∈ Mn (C) is unitary, then
Z Z Z
L(V U ∗ )U dµ(U ) = L((U V ∗ )∗ )U V ∗ dµ(U )V = L(U ∗ )U µ(U )V = GV,
U (n) U (n) U (n)

where we used the invariance property of µ.


Since every element of Mn (C) is a linear combination of four unitary matrices, we conclude

    \int_{U(n)} L(A U^∗) U \, dµ(U) = GA

for all A ∈ Mn (C). Thus

    Φ(A^∗ A) = (GA^∗)A + A^∗(GA^∗)^∗ − L(A^∗ A)
        = \int_{U(n)} \big( L((UA)^∗) UA + (UA)^∗ L(UA) − L((UA)^∗(UA)) \big) \, dµ(U).
Note that
    L((UA)^∗) UA + (UA)^∗ L(UA) − L((UA)^∗(UA)) = \lim_{t→0} \frac{1}{t} \big( P_t((UA)^∗(UA)) − P_t(UA)^∗ P_t(UA) \big) ≥ 0

by the Kadison–Schwarz inequality. Thus Φ is positive.


If we replace L by L ⊗ idMk (C) , then G is replaced by G ⊗ 1k and Φ by Φ ⊗ idMk (C) . Since
e−t(L⊗id) = Pt ⊗id again satisfies the Kadison–Schwarz inequality, the argument from above implies
that Φ ⊗ idMk (C) is positive for all k ∈ N.
Finally,
    L(1) = \lim_{t→0} \frac{1}{t} (1 − P_t(1)) = 0.

Hence
Φ(1) = G1 + 1G∗ − L(1) = G + G∗ .

Remark. The unique probability measure µ on U (n) such that


Z Z
f (U V W ) dµ(V ) = f (V ) dµ(V )
U (n) U (n)

for all U, W ∈ U (n) and all continuous f : U (n) → C is called the (normalized) Haar measure on
U (n). More generally, a Haar measure exists for any compact group (and any locally compact
group if one drops the assumption that the measure be finite). In general, there is no explicit
formula for it.

Remark. By duality one obtains that t 7→ Pt† (A) is also differentiable for all A ∈ Mn (C) and
d † † †
dt Pt (A) = −L (Pt (A)).

Definition 11.3. If (Pt )t≥0 is a quantum Markov semigroup on Mn (C), then the unique linear
map L : Mn (C) → Mn (C) such that e−tL = Pt for all t ≥ 0 is called the generator of (Pt ).

Corollary 11.4 (Gorini–Kossakowski–Lindblad–Sudarshan). If (Pt ) is a quantum Markov semi-


group on Mn (C) with generator L, then there exist H ∈ Mn (C)sa and V1 , . . . , Vm ∈ Mn (C) such
that
    L(A) = i[H, A] + \sum_{j=1}^m \Big( \frac{1}{2} V_j^∗ V_j A + \frac{1}{2} A V_j^∗ V_j − V_j^∗ A V_j \Big)

for all A ∈ Mn (C).

Proof. By Lindblad’s theorem, there exist G ∈ Mn (C) and Φ : Mn (C) → Mn (C) completely pos-
itive such that Φ(1) = G + G∗ and L(A) = GA + AG∗ − Φ(A) Pm for all A ∈ Mn (C). By Kraus’
theorem, there exist V1 , . . . , Vm ∈ Mn (C) such that Φ(A) = j=1 Vj∗ AVj for all A ∈ Mn (C). Let
H = 2i 1
(G − G∗ ).
Then we have G = 12 Φ(1) + iH and thus

L(A) = GA + AG∗ − Φ(A)


1 1
= iHA − iAH + Φ(1)A + AΦ(1) − Φ(A)
2 2
m m
1X ∗ X
= i[H, A] + (Vj Vj A + AVj∗ Vj ) − Vj∗ AVj .
2 j=1 j=1
Remark. One calls an operator L of this form a Lindbladian, A \mapsto i[H, A] the conservative
(or Hamiltonian) part and A \mapsto \sum_{j=1}^m \big( \frac{1}{2} V_j^∗ V_j A + \frac{1}{2} A V_j^∗ V_j − V_j^∗ A V_j \big) the dissipative part of L.
The matrices V_j are called jump operators.
Theorem 11.5. Let (Pt ) be a quantum Markov semigroup on Mn (C), σ ∈ Mn (C) a full-rank
quantum state such that Pt† (σ) = σ for all t ≥ 0 and α ≥ 0. The following conditions are
equivalent:
(i) D(Pt† (ρ)∥σ) ≤ e−αt D(ρ∥σ) for all quantum states ρ ∈ Mn (C) and t ≥ 0,
(ii) αD(ρ∥σ) ≤ Tr(L† (ρ)(log ρ − log σ)) for all full-rank quantum states ρ ∈ Mn (C).

Proof. (i) =⇒ (ii): Let f(t) = e^{αt} D(P_t^†(ρ)∥σ). By (i), f(t) ≤ f(0) for all t ≥ 0. We have

    f′(t) = αf(t) + e^{αt} \frac{d}{dt} Tr\big( P_t^†(ρ)(\log P_t^†(ρ) − \log σ) \big).

To compute \frac{d}{dt} Tr(P_t^†(ρ) \log P_t^†(ρ)), we can use a similar argument as in the section on monotonicity
of trace functionals to see that

    \frac{d}{dt} Tr(P_t^†(ρ) \log P_t^†(ρ)) = −Tr\big( (\log P_t^†(ρ) + 1) L^†(P_t^†(ρ)) \big) = −Tr\big( L^†(P_t^†(ρ)) \log P_t^†(ρ) \big),

where we used that

    Tr(1 \, L^†(P_t^†(ρ))) = Tr(L(1) P_t^†(ρ)) = 0.

Clearly, \frac{d}{dt} Tr(P_t^†(ρ) \log σ) = −Tr(L^†(P_t^†(ρ)) \log σ). Thus

    f′(t) = αf(t) − e^{αt} Tr\big( L^†(P_t^†(ρ))(\log P_t^†(ρ) − \log σ) \big).

Since f(t) ≤ f(0) for all t ≥ 0, in particular

    0 ≥ f′(0) = αD(ρ∥σ) − Tr(L^†(ρ)(\log ρ − \log σ)).
(ii) =⇒ (i): Again let f (t) = eαt D(Pt† (ρ)∥σ). We have seen above that
f ′ (t) = eαt (αD(Pt† (ρ)∥σ) − Tr(L† (Pt† (ρ))(log Pt† (ρ) − log σ))).
By (ii), f ′ (t) ≤ 0 for all t ≥ 0. Hence
    D(ρ∥σ) = f(0) ≥ f(t) = e^{αt} D(P_t^†(ρ)∥σ).
Example 11.6 (Depolarizing semigroup). Let σ ∈ Mn (C) be a full-rank quantum state and
E(A) = Tr(Aσ)1. Then the operators Pt = e−t idMn (C) +(1−e−t )E, t ≥ 0, form a quantum Markov
semigroup with generator L = id − E. This semigroup is called the (generalized) depolarizing
semigroup.
Moreover, Pt† (σ) = σ and
D(Pt† (ρ)∥σ) ≤ e−t D(ρ∥σ).
Indeed, Pt is unital completely positive as convex combination of two unital completely positive
maps. Moreover,
Ps (Pt (A)) = Ps (e−t A + (1 − e−t )E(A))
= e−t (e−s A + (1 − e−s )E(A)) + (1 − e−t )(e−s E(A) + (1 − e−s )(E 2 (A)))
= e−(s+t) A + (e−t (1 − e−s ) + (1 − e−t ))E(A)
= e−(s+t) A + (1 − e−(s+t) )E(A)
= Ps+t (A).
The property P0 = id and the continuity of t 7→ Pt are clear. Note moreover that Pt† (A) =
e−t A + (1 − e−t )Tr(A)σ.
To see the exponential decay of the relative entropy, recall that D is jointly convex. Thus

    D(P_t^†(ρ)∥σ) = D(e^{−t}ρ + (1 − e^{−t})σ ∥ σ)
        ≤ e^{−t} D(ρ∥σ) + (1 − e^{−t}) D(σ∥σ)
        = e^{−t} D(ρ∥σ).
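
The exponential decay can also be observed numerically (a sketch assuming numpy; helper names are ad hoc):

    import numpy as np

    rng = np.random.default_rng(2)

    def rand_state(n):
        X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        r = X @ X.conj().T
        return r / np.trace(r).real

    def logm_herm(A):
        w, V = np.linalg.eigh(A)
        return (V * np.log(w)) @ V.conj().T

    def D(rho, sigma):
        return np.trace(rho @ (logm_herm(rho) - logm_herm(sigma))).real

    sigma, rho = rand_state(3), rand_state(3)
    for t in [0.0, 0.5, 1.0, 2.0]:
        rho_t = np.exp(-t) * rho + (1 - np.exp(-t)) * sigma      # P_t^dagger(rho)
        print(t, D(rho_t, sigma) <= np.exp(-t) * D(rho, sigma) + 1e-10)   # True for each t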

Exercises
Exercise 11.1. Find a Lindblad form for the QMS from Example 11.6 in the case σ = 1/n.

Exercise 11.2. Let (Pt ) be a quantum Markov semigroup with generator L. Show that ρ 7→
Tr(L(ρp )ρ1−p ) is convex when 0 ≤ p ≤ 1, and concave when −1 ≤ p ≤ 0.
