Lecture 9
Recall Shannon’s source coding theorem from Lecture 1. There, we considered a discrete
memoryless source of information, i.e., a sequence of independent and identically distributed
random variables on a finite alphabet, and showed that it can be compressed with compression
rates arbitrarily close to the Shannon entropy of the source. We will now discuss the
quantum analogue of this result.
with $K_n \in B(H)$. For any quantum state ρ ∈ D(H) we have
$$F(T, \rho) = \sqrt{\sum_{n} \left|\langle \rho, K_n \rangle_{HS}\right|^2}.$$
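As a minimal numerical sketch of this formula (not part of the original notes), the following Python snippet evaluates the channel fidelity from a list of Kraus operators. It assumes NumPy and the Hilbert-Schmidt inner product ⟨A, B⟩_HS = Tr[A†B]; the function name `channel_fidelity` and the two example channels are illustrative choices.

```python
import numpy as np

def channel_fidelity(kraus_ops, rho):
    """Channel fidelity sqrt(sum_n |<rho, K_n>_HS|^2) with <A, B>_HS = Tr[A^dagger B]."""
    overlaps = [np.trace(rho.conj().T @ K) for K in kraus_ops]
    return float(np.sqrt(sum(abs(o) ** 2 for o in overlaps)))

# Maximally mixed qubit state.
rho = np.eye(2) / 2

# Identity channel: a single Kraus operator.
identity_kraus = [np.eye(2)]

# Completely dephasing channel: Kraus operators |0><0| and |1><1|.
dephasing_kraus = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

f_id = channel_fidelity(identity_kraus, rho)
f_deph = channel_fidelity(dephasing_kraus, rho)
```

The dephasing channel leaves the maximally mixed state itself unchanged, yet its channel fidelity is 1/√2 rather than 1, because it destroys the entanglement between the system and a purifying reference.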
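for any σ ∈ D(H′) and any pure state |φ⟩⟨φ| ∈ D(H′). Applying this result, we find that
$$F(T, \rho) = \sqrt{\langle \psi_{EA} | (\mathrm{id}_E \otimes T)\left(|\psi_{EA}\rangle\langle\psi_{EA}|\right) | \psi_{EA} \rangle},$$
where $|\psi_{EA}\rangle = \mathrm{vec}(\sqrt{\rho})$. Finally, note that
$$\langle \psi_{EA} | (\mathrm{id}_E \otimes \mathrm{Ad}_K)\left(|\psi_{EA}\rangle\langle\psi_{EA}|\right) | \psi_{EA} \rangle = \left|\mathrm{vec}(\sqrt{\rho})^\dagger\, \mathrm{vec}(K\sqrt{\rho})\right|^2 = \left|\langle \rho, K \rangle_{HS}\right|^2,$$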
Definition 1.4 (Quantum compression schemes). Let H denote a complex Euclidean space
and ρ ∈ D(H) a quantum state. An (n, m, δ)-compression scheme for ρ is a pair of quantum
channels
$$E : B(H^{\otimes n}) \to B((\mathbb{C}^2)^{\otimes m}) \quad\text{and}\quad D : B((\mathbb{C}^2)^{\otimes m}) \to B(H^{\otimes n}),$$
such that
$$F(D \circ E, \rho^{\otimes n}) > 1 - \delta.$$
Note that we used the channel fidelity to quantify the final error of the compression
scheme, which takes entanglement with a reference system into account. Indeed, it would be
trivial (and not very interesting) to construct compression schemes otherwise, since
Definition 1.5 (Achievable compression rates). We call $R \in \mathbb{R}_+$ an achievable compression
rate for ρ ∈ D(H) if for every n ∈ ℕ there exists an $(n, m_n, \delta_n)$-compression scheme such
that
$$R = \lim_{n\to\infty} \frac{m_n}{n} \quad\text{and}\quad \lim_{n\to\infty} \delta_n = 0.$$
2 The von Neumann entropy and Schumacher compression
The von Neumann entropy is the proper quantum generalization of the Shannon entropy:
Definition 2.1 (von Neumann entropy). For a quantum state ρ ∈ D(H) we define the von
Neumann entropy as
H(ρ) = − Tr [ρ log(ρ)] ,
where the logarithm is taken in base 2.
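Since the definition only depends on the spectrum of ρ, the entropy can be computed from the eigenvalues. The following is a small sketch assuming NumPy; the function name `von_neumann_entropy` and the tolerance used to discard numerically zero eigenvalues are illustrative choices, not part of the notes.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """H(rho) = -Tr[rho log2 rho], computed from the eigenvalues of rho."""
    eigenvalues = np.linalg.eigvalsh(rho)
    # Use the convention 0 * log(0) = 0 by dropping (numerically) zero eigenvalues.
    eigenvalues = eigenvalues[eigenvalues > tol]
    return float(-np.sum(eigenvalues * np.log2(eigenvalues)))

# Maximally mixed qubit: one bit of entropy.
h_mixed = von_neumann_entropy(np.eye(2) / 2)

# Pure state |+><+|: zero entropy.
plus = np.array([[1.0], [1.0]]) / np.sqrt(2)
h_pure = von_neumann_entropy(plus @ plus.T)
```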
We will prove the following theorem:
Theorem 2.2 (Schumacher’s compression theorem). Let H denote a complex Euclidean
space and ρ ∈ D(H) a quantum state.
1. Any number R > H(ρ) is an achievable compression rate for (ρ⊗n )n∈N .
then we have $\lim_{k\to\infty} \delta_k = 1$, i.e., the channel fidelity of the compression schemes goes
to zero in the limit k → ∞.
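Proof. For brevity set d = dim(H). We start with the direct part: By the spectral theorem,
we have
$$\rho = \sum_{i=1}^{d} p_i\, |\psi_i\rangle\langle\psi_i|,$$
for a probability distribution p ∈ P({1, . . . , d}) and an orthonormal basis $\{|\psi_1\rangle, \ldots, |\psi_d\rangle\} \subset H$.
Note that H(ρ) = H(p), and recall the set of ε-typical strings $T_{n,\varepsilon}(p)$ of length n. For each
ε > 0 and n ∈ ℕ, we define an ε-typical projector $\Pi_{n,\varepsilon} \in \mathrm{Proj}(H^{\otimes n})$ by
$$\Pi_{n,\varepsilon} = \sum_{(i_1,\ldots,i_n)\in T_{n,\varepsilon}(p)} |\psi_{i_1}\rangle\langle\psi_{i_1}| \otimes \cdots \otimes |\psi_{i_n}\rangle\langle\psi_{i_n}|.$$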
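Expressing the basic properties of typical sequences (see Lecture 1) in terms of the projector
$\Pi_{n,\varepsilon}$ shows the following:
• For any n ∈ ℕ and any ε > 0 we have
$$2^{-n(H(\rho)+\varepsilon)}\, \Pi_{n,\varepsilon} < \Pi_{n,\varepsilon}\, \rho^{\otimes n}\, \Pi_{n,\varepsilon} < 2^{-n(H(\rho)-\varepsilon)}\, \Pi_{n,\varepsilon}. \qquad (1)$$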
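To get a feeling for these objects, the following sketch computes the rank of the typical projector (the number of typical strings) and its weight under ρ⊗n for a qubit state with eigenvalues (q, 1 − q). It assumes the definition of typicality suggested by the bound above, namely that a string is ε-typical if its probability lies strictly between 2^{−n(H(p)+ε)} and 2^{−n(H(p)−ε)}; the definition from Lecture 1 may differ in details. The function name `typical_set_stats` and the parameter values are illustrative.

```python
from math import comb, log2

def typical_set_stats(q, n, eps):
    """Size and total probability of the eps-typical set for p = (q, 1 - q).

    Assumed definition: a string x of length n is eps-typical if
    2^{-n(H(p)+eps)} < p(x) < 2^{-n(H(p)-eps)}.
    """
    entropy = -q * log2(q) - (1 - q) * log2(1 - q)
    size, weight = 0, 0.0
    for k in range(n + 1):  # k = number of positions carrying the first symbol
        log_prob = k * log2(q) + (n - k) * log2(1 - q)
        if -n * (entropy + eps) < log_prob < -n * (entropy - eps):
            size += comb(n, k)                      # strings with exactly k such positions
            weight += comb(n, k) * 2.0 ** log_prob  # their total probability
    return entropy, size, weight

entropy, size, weight = typical_set_stats(q=0.1, n=400, eps=0.1)
```

Here `size` equals the rank of the typical projector and `weight` equals the trace of the typical projector times ρ⊗n. The rank stays below 2^{n(H(ρ)+ε)}, so roughly n(H(ρ)+ε) qubits suffice to store the typical subspace, while the weight approaches 1 as n grows.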
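For ε > 0 and any n ∈ ℕ, we will now construct an $(n, \lceil n(H(\rho)+\varepsilon)\rceil, \delta_n)$-compression scheme
for ρ such that $\delta_n \to 0$ as n → ∞. This shows that H(ρ) + ε is an achievable rate. Consider
$m = \lceil n(H(\rho) + \varepsilon)\rceil$ and choose a distinct bit string $b(i_1, i_2, \ldots, i_n) \in \{0,1\}^m$ for every typical sequence
$(i_1, \ldots, i_n) \in T_{n,\varepsilon}(p)$; this is possible since the typical set contains at most $2^{n(H(\rho)+\varepsilon)} \leq 2^m$ sequences.
Let us denote by $S_{n,m} \subset (\mathbb{C}^2)^{\otimes m}$ the span of the orthonormal vectors
$|b(i_1, \ldots, i_n)\rangle$ for all $(i_1, \ldots, i_n) \in T_{n,\varepsilon}(p)$, and define an isometry $V_{n,\varepsilon} : S_{n,m} \to H^{\otimes n}$ by
$$V_{n,\varepsilon} = \sum_{(i_1,\ldots,i_n)\in T_{n,\varepsilon}(p)} \left(|\psi_{i_1}\rangle \otimes \cdots \otimes |\psi_{i_n}\rangle\right) \langle b(i_1, \ldots, i_n)|.$$
It is easy to see that
$$\Pi_{n,\varepsilon} = V_{n,\varepsilon} V_{n,\varepsilon}^\dagger.$$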
Next, we define two quantum channels:
where we extend $V_{n,\varepsilon}$ to all of $(\mathbb{C}^2)^{\otimes m}$ by setting it to zero on the basis vectors on which it is not
defined. Finally, we can compute that
for some completely positive map $F_{n,\varepsilon} : B(H^{\otimes n}) \to B(H^{\otimes n})$ that we do not need to know
exactly. Using Lemma 1.3 and (3), we find that
For the reverse direction, consider a sequence of (nk , mk , δk )-compression schemes for
ρ ∈ D(H) given by quantum channels
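Let $\{A_l^{(k)}\}_{l=1}^{L_k}$ denote the Kraus operators of the quantum channel $D_k \circ E_k$ and note that
$\mathrm{rk}(A_l^{(k)}) \leq 2^{m_k}$ by assumption. For each k and l we denote by $\Pi_l^{(k)}$ the projection onto the
image of $A_l^{(k)}$, so that $\mathrm{rk}(\Pi_l^{(k)}) \leq 2^{m_k}$ and $\Pi_l^{(k)} A_l^{(k)} = A_l^{(k)}$. By Lemma 1.3 we have
$$F(D_k \circ E_k, \rho^{\otimes n_k}) = \sqrt{\sum_{l=1}^{L_k} \left|\langle \rho^{\otimes n_k}, A_l^{(k)} \rangle_{HS}\right|^2}.$$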
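where we set
$$q_l^{(k)} = \mathrm{Tr}\left[\mathrm{Ad}_{A_l^{(k)}}\left(\rho^{\otimes n_k}\right)\right] \geq 0.$$
Since $D_k \circ E_k$ is a quantum channel, we find that
$$\sum_{l=1}^{L_k} q_l^{(k)} = \sum_{l=1}^{L_k} \mathrm{Tr}\left[\mathrm{Ad}_{A_l^{(k)}}\left(\rho^{\otimes n_k}\right)\right] = 1.$$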
whenever ker(σ) ⊆ ker(ρ). This expression has a simple, but useful, consequence. It allows
us to write the relative entropy as a limit of slightly simpler trace functionals:
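Lemma 3.2. For any ρ, σ ∈ D(H) we have
$$D(\rho\|\sigma) = \frac{1}{\ln(2)} \lim_{\varepsilon \searrow 0} \frac{1 - \mathrm{Tr}\left[\rho^{1-\varepsilon}\sigma^{\varepsilon}\right]}{\varepsilon}.$$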
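Proof. If ker(σ) ⊄ ker(ρ), then we have
$$\lim_{\varepsilon \searrow 0} \left(1 - \mathrm{Tr}\left[\rho^{1-\varepsilon}\sigma^{\varepsilon}\right]\right) = 1 - \mathrm{Tr}\left[\rho\left(1_H - \Pi_{\ker(\sigma)}\right)\right] = \mathrm{Tr}\left[\rho\, \Pi_{\ker(\sigma)}\right] > 0.$$
In this case, we conclude that the limit in the statement diverges as it should.
Assume now that ker(σ) ⊆ ker(ρ) and consider the spectral decompositions
$$\rho = \sum_{i} \lambda_i\, |v_i\rangle\langle v_i|, \qquad \sigma = \sum_{j} \mu_j\, |w_j\rangle\langle w_j|,$$
$$= \ln(2)\, D(\rho\|\sigma).$$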
This finishes the proof.
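The limit formula of Lemma 3.2 can be checked numerically on a small example. The sketch below assumes NumPy and the standard definition D(ρ‖σ) = Tr[ρ(log ρ − log σ)] with logarithms in base 2 for full-rank states; the helper names and the two qubit states are arbitrary illustrative choices.

```python
import numpy as np

def matrix_function(a, func):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    eigenvalues, vectors = np.linalg.eigh(a)
    return (vectors * func(eigenvalues)) @ vectors.conj().T

def relative_entropy(rho, sigma):
    """D(rho||sigma) = Tr[rho (log2 rho - log2 sigma)] for full-rank states."""
    diff = matrix_function(rho, np.log2) - matrix_function(sigma, np.log2)
    return float(np.real(np.trace(rho @ diff)))

def limit_quotient(rho, sigma, eps):
    """(1 - Tr[rho^(1-eps) sigma^eps]) / (eps ln 2), the quotient from Lemma 3.2."""
    overlap = np.trace(matrix_function(rho, lambda x: x ** (1 - eps))
                       @ matrix_function(sigma, lambda x: x ** eps))
    return float((1 - np.real(overlap)) / (eps * np.log(2)))

# Two non-commuting full-rank qubit states (chosen arbitrarily for illustration).
rho = np.array([[0.7, 0.2], [0.2, 0.3]])
sigma = np.array([[0.4, -0.1], [-0.1, 0.6]])

exact = relative_entropy(rho, sigma)
approx = limit_quotient(rho, sigma, eps=1e-6)
```

For small `eps` the quotient `approx` agrees with `exact` up to an error of order `eps`.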
The previous lemma is quite useful when proving the data-processing inequality for the
relative entropy.
for any ρ, σ ∈ D(H) and all quantum channels T : B(H) → B(H′). Taking the limit as in
Lemma 3.2 then yields the data-processing inequality of the relative entropy.
4.1 Monotonicity of certain Hilbert-Schmidt operators
Let H denote a complex Euclidean space. For positive operators A, B ∈ B(H)+ , we define
the linear maps $L_A : B(H) \to B(H)$ and $R_B : B(H) \to B(H)$ by
$$L_A(X) = AX \quad\text{and}\quad R_B(X) = XB.$$
We will now think of these maps as operators acting on the Hilbert-Schmidt inner product
space B(H). The following properties are easy to show:
• For any A, B ∈ B(H)+ the operators $L_A$ and $R_B$ are positive semidefinite, since, e.g.,
$L_A = L_A^*$ and $L_A = L_{\sqrt{A}} \circ L_{\sqrt{A}}$, and the same argument works for $R_B$.
Definition 4.1. For a complex Euclidean space H and a function f : (0, ∞) → (0, ∞), we
define the operator
$$G_f(A, B) = f\left(R_B \circ L_A^{-1}\right) \circ L_A,$$
for any pair of positive invertible operators A, B ∈ B(H)++ .
By the spectral decomposition of $R_B \circ L_A^{-1}$ in the basis $\{|v_i\rangle\langle w_j|\}_{i,j} \subset B(H)$ it is easy
to show that the operator $G_f(A, B)$ is positive semidefinite for any A, B ∈ B(H)++ . Let
us consider the operator $G_f(A, B)$ for the function $f(x) = x^{\alpha}$ where α ∈ (0, 1). Then, it is
easy to show that
$$G_{x^{\alpha}}(A, B) = R_{B^{\alpha}} \circ L_{A^{1-\alpha}}.$$
A consequence of this is the following lemma, whose proof is immediate:
To prove the data-processing inequality we will use the following integral representation
of the function $f(x) = x^{\alpha}$:
$$x^{\alpha} = \frac{\sin(\pi\alpha)}{\pi} \int_0^{\infty} \frac{x}{\lambda + x}\, \lambda^{\alpha-1}\, d\lambda.$$
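The integral representation x^α = (sin(πα)/π) ∫₀^∞ x/(λ+x) λ^{α−1} dλ for α ∈ (0, 1) can be verified numerically. The following sketch uses only the Python standard library; the substitution λ = e^t, the truncation width, and the step size are choices made here for illustration, and the function name `power_via_integral` is hypothetical.

```python
from math import exp, pi, sin

def power_via_integral(x, alpha, half_width=80.0, step=0.05):
    """Approximate x**alpha by (sin(pi*alpha)/pi) * int_0^inf x/(lam+x) * lam**(alpha-1) dlam.

    Substituting lam = exp(t) turns the integral into
    int_{-inf}^{inf} x * exp(alpha*t) / (exp(t) + x) dt, whose integrand decays
    exponentially in both directions, so a truncated trapezoidal rule suffices.
    """
    n_steps = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n_steps + 1):
        t = -half_width + k * step
        value = x * exp(alpha * t) / (exp(t) + x)
        # Trapezoidal rule: the two end points carry half weight.
        total += value if 0 < k < n_steps else 0.5 * value
    return sin(pi * alpha) / pi * step * total

approx = power_via_integral(2.0, 0.3)
exact = 2.0 ** 0.3
```

The point of the representation is that each integrand x/(λ+x) is a simple rational function of x, so an operator inequality proven for every λ > 0 carries over to x^α by integration against the positive weight λ^{α−1}.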
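$$\langle 1_H, G_{x^{\alpha}}(\rho, \sigma) 1_H \rangle_{HS} = \frac{\sin(\pi\alpha)}{\pi} \int_0^{\infty} \langle 1_H, G_{\frac{x}{\lambda+x}}(\rho, \sigma) 1_H \rangle_{HS}\, \lambda^{\alpha-1}\, d\lambda.$$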
Theorem 4.3. For invertible quantum states ρ, σ ∈ D(H) and any quantum channel T :
B(H) → B(H′) for which T(ρ) and T(σ) are invertible we have
$$G_{\frac{x}{\lambda+x}}(T(\rho), T(\sigma)) \geq T \circ G_{\frac{x}{\lambda+x}}(\rho, \sigma) \circ T^{*},$$
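Lemma 4.5. Let H denote a complex Euclidean space. For positive invertible operators
A, B ∈ B(H)++ and X ∈ B(H) we have
$$X A^{-1} X^{\dagger} \leq B^{-1} \quad\text{if and only if}\quad X^{\dagger} B X \leq A.$$
Proof. It is enough to show one direction of the equivalence, since the other direction follows by
applying the same argument to $X^{\dagger}$, $B^{-1}$ and $A^{-1}$ in place of X, A and B. If $X A^{-1} X^{\dagger} \leq B^{-1}$,
then we have $B^{1/2} X A^{-1} X^{\dagger} B^{1/2} \leq 1_H$. This final condition is equivalent to $\|A^{-1/2} X^{\dagger} B^{1/2}\|_{\infty} \leq 1$,
which implies $(A^{-1/2} X^{\dagger} B^{1/2})(B^{1/2} X A^{-1/2}) \leq 1_H$. Multiplying by $A^{1/2}$ on both sides
shows that $X^{\dagger} B X \leq A$.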
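Proof of Theorem 4.3. Fix λ > 0. By Lemma 4.5 it is sufficient to show that
$$G_{\frac{x}{\lambda+x}}(\rho, \sigma)^{-1} \geq T^{*} \circ G_{\frac{x}{\lambda+x}}(T(\rho), T(\sigma))^{-1} \circ T.$$
Note that
$$G_{\frac{x}{\lambda+x}}(\rho, \sigma)^{-1} = \left(\lambda + R_{\sigma} \circ L_{\rho}^{-1}\right) \circ R_{\sigma}^{-1} \circ L_{\rho} \circ L_{\rho}^{-1} = \lambda R_{\sigma}^{-1} + L_{\rho}^{-1},$$
and a similar expression holds for the operator $G_{\frac{x}{\lambda+x}}(T(\rho), T(\sigma))^{-1}$. With this we have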
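$$\langle X, G_{\frac{x}{\lambda+x}}(\rho, \sigma)^{-1}(X) \rangle_{HS} = \mathrm{Tr}\left[X^{\dagger}\left(\lambda R_{\sigma}^{-1} + L_{\rho}^{-1}\right)(X)\right] = \lambda\, \mathrm{Tr}\left[X \sigma^{-1} X^{\dagger}\right] + \mathrm{Tr}\left[X^{\dagger} \rho^{-1} X\right]$$
and
$$\langle X, T^{*} \circ G_{\frac{x}{\lambda+x}}(T(\rho), T(\sigma))^{-1} \circ T(X) \rangle_{HS} = \mathrm{Tr}\left[T(X)^{\dagger}\left(\lambda R_{T(\sigma)}^{-1} + L_{T(\rho)}^{-1}\right)(T(X))\right]$$
$$= \lambda\, \mathrm{Tr}\left[T(X)\, T(\sigma)^{-1}\, T(X)^{\dagger}\right] + \mathrm{Tr}\left[T(X)^{\dagger}\, T(\rho)^{-1}\, T(X)\right],$$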
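By Lemma 4.4, we have
$$\mathrm{Tr}\left[T(X)\, T(\sigma)^{-1}\, T(X)^{\dagger}\right] \leq \mathrm{Tr}\left[X \sigma^{-1} X^{\dagger}\right],$$
and
$$\mathrm{Tr}\left[T(X)^{\dagger}\, T(\rho)^{-1}\, T(X)\right] \leq \mathrm{Tr}\left[X^{\dagger} \rho^{-1} X\right],$$
for all invertible ρ, σ ∈ D(H) and all X ∈ B(H).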
4.2 Proving the data-processing inequality
Now, we are ready to prove the main result of this lecture:
Theorem 4.6 (Data-processing inequality). For any quantum channel T : B(H) → B(H′),
we have
$$D(T(\rho)\|T(\sigma)) \leq D(\rho\|\sigma),$$
for all quantum states ρ, σ ∈ D(H).
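Proof. Assume first that ρ, σ ∈ D(H) and T(ρ), T(σ) ∈ D(H′) are invertible. Then, we have
$$\mathrm{Tr}\left[\rho^{1-\alpha}\sigma^{\alpha}\right] = \langle 1_H, G_{x^{\alpha}}(\rho, \sigma) 1_H \rangle_{HS} = \frac{\sin(\pi\alpha)}{\pi} \int_0^{\infty} \langle 1_H, G_{\frac{x}{\lambda+x}}(\rho, \sigma) 1_H \rangle_{HS}\, \lambda^{\alpha-1}\, d\lambda,$$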
and, using that sin(πα) > 0 for all α ∈ (0, 1), we conclude that
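for any α ∈ (0, 1) and any ε, δ > 0. Using that the function $(\rho, \sigma) \mapsto \mathrm{Tr}\left[\rho^{1-\alpha}\sigma^{\alpha}\right]$ is
continuous, we find that
$$\mathrm{Tr}\left[T(\rho)^{1-\alpha}\, T(\sigma)^{\alpha}\right] = \lim_{\varepsilon,\delta \searrow 0} \mathrm{Tr}\left[T_{\delta}(\rho_{\varepsilon})^{1-\alpha}\, T_{\delta}(\sigma_{\varepsilon})^{\alpha}\right] \geq \lim_{\varepsilon \searrow 0} \mathrm{Tr}\left[\rho_{\varepsilon}^{1-\alpha}\, \sigma_{\varepsilon}^{\alpha}\right] = \mathrm{Tr}\left[\rho^{1-\alpha}\sigma^{\alpha}\right].$$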
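$$D(T(\rho)\|T(\sigma)) = \frac{1}{\ln(2)} \lim_{\varepsilon \searrow 0} \frac{1 - \mathrm{Tr}\left[T(\rho)^{1-\varepsilon}\, T(\sigma)^{\varepsilon}\right]}{\varepsilon}$$
$$\leq \frac{1}{\ln(2)} \lim_{\varepsilon \searrow 0} \frac{1 - \mathrm{Tr}\left[\rho^{1-\varepsilon}\sigma^{\varepsilon}\right]}{\varepsilon}$$
$$= D(\rho\|\sigma).$$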
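As a numerical sanity check of the data-processing inequality (an illustration, not a proof), the sketch below compares the relative entropy before and after a depolarizing channel on random full-rank states. It assumes NumPy and the standard definition D(ρ‖σ) = Tr[ρ(log ρ − log σ)] with logarithms in base 2; the depolarizing channel, the helper names, and the random seed are choices made here and do not appear in the notes.

```python
import numpy as np

def matrix_log2(a):
    """Base-2 logarithm of a positive definite Hermitian matrix."""
    eigenvalues, vectors = np.linalg.eigh(a)
    return (vectors * np.log2(eigenvalues)) @ vectors.conj().T

def relative_entropy(rho, sigma):
    """D(rho||sigma) = Tr[rho (log2 rho - log2 sigma)] for full-rank states."""
    return float(np.real(np.trace(rho @ (matrix_log2(rho) - matrix_log2(sigma)))))

def random_state(dim, rng):
    """Random full-rank density matrix obtained by normalizing G G^dagger."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def depolarize(rho, p):
    """Depolarizing channel T(X) = (1 - p) X + p Tr[X] 1/d, a quantum channel for 0 <= p <= 1."""
    d = rho.shape[0]
    return (1 - p) * rho + p * np.trace(rho) * np.eye(d) / d

rng = np.random.default_rng(0)
gaps = []
for _ in range(20):
    rho, sigma = random_state(3, rng), random_state(3, rng)
    before = relative_entropy(rho, sigma)
    after = relative_entropy(depolarize(rho, 0.4), depolarize(sigma, 0.4))
    gaps.append(before - after)  # should be nonnegative by the theorem
```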
4.3 Generalizing the data-processing inequality
If you carefully read the proof given in the previous sections, you might realize that we did
not use that the linear map T : B(H) → B(H′) was a quantum channel. In the way we
have stated it, we have only used that the map T : B(H) → B(H′) is a trace-preserving
2-positive map. By a slight modification of the proof, we can actually make it work for duals
of so-called unital Schwarz maps.
Definition 4.7. A linear map P : B(H) → B(H′) is called a unital Schwarz map if it is
unital and satisfies the Schwarz inequality
$$P(X)^{\dagger} P(X) \leq P(X^{\dagger} X),$$
for any X ∈ B(H).
Note that the Schwarz inequality for a unital map P : B(H) → B(H′) is equivalent to
$$\begin{pmatrix} 1_{H'} & P(X) \\ P(X)^{\dagger} & P(X^{\dagger} X) \end{pmatrix} \geq 0,$$
for any X ∈ B(H). The following lemma is therefore immediate:
Lemma 4.8. If T : B(H′) → B(H) is a quantum channel, then $T^{*}$ is a unital Schwarz map.
There are many examples of unital Schwarz maps that do not arise as adjoints of quantum
channels, or even of 2-positive maps. The most prominent example is the map
$P : B(\mathbb{C}^2) \to B(\mathbb{C}^2)$ given by
$$P(X) = \frac{1}{2}\, \mathrm{Tr}[X]\, \frac{1_{\mathbb{C}^2}}{2} + \frac{1}{2}\, X^{T}.$$
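The claimed properties of this example can be probed numerically. The sketch below assumes that the map reads P(X) = ½(Tr[X]·1/2 + X^T) on 2×2 matrices, and uses NumPy. It checks unitality, tests the Schwarz inequality on random samples (which is evidence, not a proof), and computes the Choi matrix. For maps on B(ℂ²), 2-positivity coincides with complete positivity, so a negative eigenvalue of the Choi matrix shows that the map is not 2-positive. The name `schwarz_map` is illustrative.

```python
import numpy as np

def schwarz_map(x):
    """P(X) = (1/2) * (Tr[X] * 1/2 + X^T) on 2x2 matrices (assumed form of the example)."""
    return 0.5 * (np.trace(x) * np.eye(2) / 2 + x.T)

# Unitality: P(1) = 1.
is_unital = np.allclose(schwarz_map(np.eye(2)), np.eye(2))

# Schwarz inequality P(X)^dagger P(X) <= P(X^dagger X), tested on random samples only.
rng = np.random.default_rng(1)
min_gap = np.inf
for _ in range(200):
    x = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    px = schwarz_map(x)
    gap = schwarz_map(x.conj().T @ x) - px.conj().T @ px
    min_gap = min(min_gap, np.linalg.eigvalsh(gap).min())

# Choi matrix sum_{ij} E_ij (x) P(E_ij); a negative eigenvalue rules out complete positivity.
choi = np.zeros((4, 4), dtype=complex)
for i in range(2):
    for j in range(2):
        e_ij = np.zeros((2, 2))
        e_ij[i, j] = 1.0
        choi += np.kron(e_ij, schwarz_map(e_ij))
min_choi_eigenvalue = np.linalg.eigvalsh(choi).min()
```

The Choi matrix equals one quarter of the identity plus one half of the swap operator, so its smallest eigenvalue is 1/4 − 1/2 = −1/4.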
It turns out that Theorem 4.3 also holds for linear maps T that are adjoints of unital Schwarz
maps. The proof is almost the same, but in the final lines of the proof we use the following
Theorem 4.9 (A tracial inequality). Let P : B(H) → B(H′) denote a unital Schwarz map.
For any C ∈ B(H′)+ and any X ∈ B(H′) satisfying $\ker(C) \subseteq \ker(X^{\dagger})$ we have
$$\mathrm{Tr}\left[P^{*}(X)^{\dagger}\, P^{*}(C)^{-1}\, P^{*}(X)\right] \leq \mathrm{Tr}\left[X^{\dagger} C^{-1} X\right],$$
The data-processing inequality for the quantum relative entropy therefore holds for all
trace-preserving maps that are adjoints of unital Schwarz maps. Recently, a different technique
for proving the data-processing inequality was discovered. Unfortunately, it goes beyond
the scope of this course, but for completeness we state the most general form of the
data-processing inequality:
Theorem 4.10 (General data-processing inequality). For any positive and trace-preserving
map P : B(H) → B(H′), we have
Whether this theorem can be proven using the techniques used above is an open question!