Lecture 9
Recall Shannon’s source coding theorem from Lecture 1. There, we considered a discrete
memoryless source of information, i.e., a sequence of independent and identically distributed
random variables on a finite alphabet, and showed that it can be compressed with compression
rates arbitrarily close to the Shannon entropy of the source. We will now discuss the
quantum analogue of this result.
with K_n ∈ B(H). For any quantum state ρ ∈ D(H) we have
\[
F(T, \rho) = \sqrt{\sum_{n=1}^{N} \left| \langle \rho, K_n \rangle_{HS} \right|^2}.
\]
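As a quick numerical illustration of this formula, the following sketch evaluates it for a qubit dephasing channel acting on a diagonal state. The choice of channel, the parameters `a` and `p`, and all helper names are assumptions made for this example and do not appear in the lecture; the Hilbert-Schmidt inner product is taken to be ⟨A, B⟩_HS = Tr[A† B].

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    """Trace of a square matrix given as nested lists."""
    return sum(A[i][i] for i in range(len(A)))

def channel_fidelity(kraus_ops, rho):
    """F(T, rho) = sqrt(sum_n |<rho, K_n>_HS|^2).

    rho is Hermitian, so <rho, K>_HS = Tr[rho^dagger K] = Tr[rho K].
    """
    return math.sqrt(sum(abs(trace(matmul(rho, K))) ** 2 for K in kraus_ops))

def dephasing_kraus(p):
    """Kraus operators sqrt(1-p) * I and sqrt(p) * Z (illustrative choice)."""
    s0, s1 = math.sqrt(1 - p), math.sqrt(p)
    return [[[s0, 0], [0, s0]], [[s1, 0], [0, -s1]]]

a, p = 0.8, 0.25
rho = [[a, 0], [0, 1 - a]]          # diagonal qubit state diag(a, 1 - a)
F = channel_fidelity(dephasing_kraus(p), rho)

# For this example the sum can be done by hand:
# F^2 = (1 - p) + p * (2a - 1)^2
F_closed = math.sqrt((1 - p) + p * (2 * a - 1) ** 2)
```

For p = 0 the channel is the identity and the formula returns 1, as expected of a fidelity.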
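for any σ ∈ D(H′) and any pure state |φ⟩⟨φ| ∈ D(H′). Applying this result, we find that
\[
F(T, \rho) = \sqrt{\langle \psi_{EA} | (\mathrm{id}_E \otimes T)\left( |\psi_{EA}\rangle\langle\psi_{EA}| \right) | \psi_{EA} \rangle},
\]
where |ψ_EA⟩ = vec(√ρ). Finally, note that
\[
\langle \psi_{EA} | (\mathrm{id}_E \otimes \mathrm{Ad}_K)\left( |\psi_{EA}\rangle\langle\psi_{EA}| \right) | \psi_{EA} \rangle = \left| \mathrm{vec}(\sqrt{\rho})^\dagger \, \mathrm{vec}(K\sqrt{\rho}) \right|^2 = \left| \langle \rho, K \rangle_{HS} \right|^2,
\]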
Definition 1.4 (Quantum compression schemes). Let H denote a complex Euclidean space
and ρ ∈ D(H) a quantum state. An (n, m, δ)-compression scheme for ρ is a pair of quantum
channels
\[
E : B(H^{\otimes n}) \to B((\mathbb{C}^2)^{\otimes m}) \quad\text{and}\quad D : B((\mathbb{C}^2)^{\otimes m}) \to B(H^{\otimes n}),
\]
such that
F (D ◦ E, ρ⊗n ) > 1 − δ.
Note that we used the channel fidelity to quantify the final error of the compression
scheme, which takes entanglement with a reference system into account. Indeed, it would be
trivial (and not very interesting) to construct compression schemes otherwise, since
Definition 1.5 (Achievable compression rates). We call R ∈ ℝ_+ an achievable compression
rate for ρ ∈ D(H) if for every n ∈ ℕ there exists an (n, m_n, δ_n)-compression scheme such
that
\[
R = \lim_{n\to\infty} \frac{m_n}{n} \quad\text{and}\quad \lim_{n\to\infty} \delta_n = 0.
\]
2 The von Neumann entropy and Schumacher compression
The von Neumann entropy is the proper quantum generalization of the Shannon entropy:
Definition 2.1 (von Neumann entropy). For a quantum state ρ ∈ D(H) we define the von
Neumann entropy as
H(ρ) = − Tr [ρ log(ρ)] ,
where the logarithm is taken in base 2.
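Since ρ is diagonal in its eigenbasis, H(ρ) is the Shannon entropy of its eigenvalues. The following sketch uses this for a qubit, assuming the state is given by a Bloch vector r, so that its eigenvalues are (1 ± |r|)/2. The function names and the Bloch-vector parametrization are choices made for this illustration only.

```python
import math

def entropy_from_eigenvalues(eigs):
    """H = -sum_i p_i log2(p_i), with the convention 0 * log(0) = 0."""
    return -sum(p * math.log2(p) for p in eigs if p > 0)

def qubit_entropy(rx, ry, rz):
    """von Neumann entropy of the qubit state (1 + r . sigma) / 2.

    Assumes |r| <= 1; the eigenvalues of the state are (1 + |r|)/2 and (1 - |r|)/2.
    """
    r = math.sqrt(rx * rx + ry * ry + rz * rz)
    return entropy_from_eigenvalues([(1 + r) / 2, (1 - r) / 2])

h_mixed = qubit_entropy(0.0, 0.0, 0.0)   # maximally mixed state
h_pure = qubit_entropy(0.0, 0.0, 1.0)    # pure state
```

The maximally mixed qubit has one bit of entropy and every pure state has entropy zero, which matches the classical intuition for a fair coin and a deterministic source.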
We will prove the following theorem:
Theorem 2.2 (Schumacher’s compression theorem). Let H denote a complex Euclidean
space and ρ ∈ D(H) a quantum state.
1. Any number R > H(ρ) is an achievable compression rate for (ρ^{⊗n})_{n∈ℕ}.
then we have lim_{k→∞} δ_k = 1, i.e., the channel fidelity of the compression schemes goes
to zero in the limit k → ∞.
Proof. For brevity set d = dim(H). We start with the direct part: By the spectral theorem,
we have
\[
\rho = \sum_{i=1}^{d} p_i \, |\psi_i\rangle\langle\psi_i|,
\]
for a probability distribution p ∈ P({1, . . . , d}) and an orthonormal basis {|ψ_1⟩, . . . , |ψ_d⟩} ⊂ H.
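Note that H(ρ) = H(p), and recall the set of ε-typical strings T_{n,ε}(p) of length n. For each
ε > 0 and n ∈ ℕ, we define an ε-typical projector Π_{n,ε} ∈ Proj(H^{⊗n}) by
\[
\Pi_{n,\varepsilon} = \sum_{(i_1,\ldots,i_n) \in T_{n,\varepsilon}(p)} |\psi_{i_1}\rangle\langle\psi_{i_1}| \otimes \cdots \otimes |\psi_{i_n}\rangle\langle\psi_{i_n}|.
\]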
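Expressing the basic properties of typical sequences (see Lecture 1) in terms of the projector
Π_{n,ε} shows the following:
• For any n ∈ ℕ and any ε > 0 we have
\[
2^{-n(H(\rho)+\varepsilon)} \, \Pi_{n,\varepsilon} < \Pi_{n,\varepsilon} \, \rho^{\otimes n} \, \Pi_{n,\varepsilon} < 2^{-n(H(\rho)-\varepsilon)} \, \Pi_{n,\varepsilon}. \tag{1}
\]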
For ε > 0 and any n ∈ ℕ, we will now construct an (n, ⌈n(H(ρ) + ε)⌉, δ_n)-compression scheme
for ρ such that δ_n → 0 as n → ∞. This shows that H(ρ) + ε is an achievable rate. Consider
m = ⌈n(H(ρ) + ε)⌉ and choose a distinct bit string b(i_1, i_2, . . . , i_n) ∈ {0, 1}^m for any typical sequence
(i_1, . . . , i_n) ∈ T_{n,ε}(p). Let us denote by S_{n,m} ⊂ (ℂ^2)^{⊗m} the span of the orthonormal vectors
|b(i_1, . . . , i_n)⟩ for all (i_1, . . . , i_n) ∈ T_{n,ε}(p), and define an isometry V_{n,ε} : S_{n,m} → H^{⊗n} by
\[
V_{n,\varepsilon} = \sum_{(i_1,\ldots,i_n) \in T_{n,\varepsilon}(p)} \left( |\psi_{i_1}\rangle \otimes \cdots \otimes |\psi_{i_n}\rangle \right) \langle b(i_1,\ldots,i_n)|.
\]
It is easy to see that
\[
\Pi_{n,\varepsilon} = V_{n,\varepsilon} V_{n,\varepsilon}^\dagger.
\]
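To get a feeling for the sizes involved, the following sketch counts typical strings for a qubit state and compares the count with the number 2^m of available bit strings. It assumes that a string is ε-typical when its probability lies strictly between 2^{-n(H+ε)} and 2^{-n(H-ε)}, which is the convention suggested by (1); the definition from Lecture 1 may differ in details. The function name and the example parameters are hypothetical.

```python
import math

def typical_subspace_stats(p0, n, eps):
    """Rank and weight of the eps-typical projector for a qubit state.

    Assumes rho has eigenvalues (p0, 1 - p0) with 0 < p0 < 1, and that a string
    is eps-typical when 2^{-n(H+eps)} < p(i_1)...p(i_n) < 2^{-n(H-eps)}.
    """
    p1 = 1 - p0
    H = -(p0 * math.log2(p0) + p1 * math.log2(p1))
    rank, weight = 0, 0.0
    for k in range(n + 1):  # k = number of positions carrying the second eigenvalue
        log_prob = (n - k) * math.log2(p0) + k * math.log2(p1)
        if -n * (H + eps) < log_prob < -n * (H - eps):
            count = math.comb(n, k)             # strings with exactly k such positions
            rank += count                       # adds to rk(Pi) = |T_{n,eps}(p)|
            weight += count * 2.0 ** log_prob   # adds to Tr[Pi rho^{(x) n}]
    m = math.ceil(n * (H + eps))                # number of qubits used by the scheme
    return H, rank, weight, m

H, rank, weight, m = typical_subspace_stats(p0=0.9, n=200, eps=0.1)
fits = rank <= 2 ** m   # enough bit strings to label every typical sequence
```

Every typical string has probability larger than 2^{-n(H+ε)} and the probabilities sum to at most one, so the count is always at most 2^m. This is what makes the choice of distinct bit strings b(i_1, . . . , i_n) possible.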
Next, we define two quantum channels:
and
where we extend V_{n,ε} to all of (ℂ^2)^{⊗m} by setting it to zero on the basis vectors it is not defined
on. Finally, we can compute that
for some completely positive map F_{n,ε} : B(H^{⊗n}) → B(H^{⊗n}) that we do not need to know
exactly. Using Lemma 1.3 and (3), we find that
For the reverse direction, consider a sequence of (n_k, m_k, δ_k)-compression schemes for
ρ ∈ D(H) given by quantum channels
Let {A_l^{(k)}}_{l=1}^{L_k} denote the Kraus operators of the quantum channel D_k ∘ E_k and note that
rk(A_l^{(k)}) ≤ 2^{m_k} by assumption. For each k and l we denote by Π_l^{(k)} the projection onto the
image of A_l^{(k)} such that rk(Π_l^{(k)}) ≤ 2^{m_k} and Π_l^{(k)} A_l^{(k)} = A_l^{(k)}. By Lemma 1.3 we have
\[
F(D_k \circ E_k, \rho^{\otimes n_k}) = \sqrt{\sum_{l=1}^{L_k} \left| \langle \rho^{\otimes n_k}, A_l^{(k)} \rangle_{HS} \right|^2}.
\]
where we set
\[
q_l^{(k)} = \mathrm{Tr}\left[ \mathrm{Ad}_{A_l^{(k)}}\left( \rho^{\otimes n_k} \right) \right] \geq 0.
\]
Since Dk ◦ Ek is a quantum channel, we find that
\[
\sum_{l=1}^{L_k} q_l^{(k)} = \sum_{l=1}^{L_k} \mathrm{Tr}\left[ \mathrm{Ad}_{A_l^{(k)}}\left( \rho^{\otimes n_k} \right) \right] = 1.
\]
whenever ker(σ) ⊆ ker(ρ). This expression has a simple but useful consequence: it allows us
to write the relative entropy as a limit of slightly simpler trace functionals:
Lemma 3.2. For any ρ, σ ∈ D(H) we have
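\[
D(\rho\|\sigma) = \frac{1}{\ln(2)} \lim_{\varepsilon \searrow 0} \frac{1 - \mathrm{Tr}\left[ \rho^{1-\varepsilon} \sigma^{\varepsilon} \right]}{\varepsilon}.
\]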
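Proof. If ker(σ) ⊈ ker(ρ), then we have
\[
\lim_{\varepsilon \searrow 0} \left( 1 - \mathrm{Tr}\left[ \rho^{1-\varepsilon} \sigma^{\varepsilon} \right] \right) = 1 - \mathrm{Tr}\left[ \rho \left( 1_H - \Pi_{\ker(\sigma)} \right) \right] = \mathrm{Tr}\left[ \rho \, \Pi_{\ker(\sigma)} \right] > 0.
\]
In this case, we conclude that the limit in the statement diverges as it should.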
Assume now that ker(σ) ⊆ ker(ρ) and consider the spectral decompositions
\[
\rho = \sum_{i=1}^{n} \lambda_i \, |v_i\rangle\langle v_i|,
\]
and
\[
\sigma = \sum_{j=1}^{m} \mu_j \, |w_j\rangle\langle w_j|,
\]
= ln(2) D(ρ‖σ).
This finishes the proof.
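The limit in Lemma 3.2 can be checked numerically in the commuting case, where ρ and σ are diagonal in the same basis and the relative entropy reduces to D(p‖q) = Σ_i p_i log_2(p_i / q_i). The following is a minimal sketch under that assumption; the function names and the example distributions are hypothetical, and q_i > 0 is required wherever p_i > 0.

```python
import math

def relative_entropy_bits(p, q):
    """D(p||q) = sum_i p_i log2(p_i / q_i) for commuting (diagonal) states."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def trace_functional_quotient(p, q, eps):
    """(1 - Tr[rho^{1-eps} sigma^{eps}]) / (eps * ln 2) for diagonal rho, sigma."""
    overlap = sum(pi ** (1 - eps) * qi ** eps for pi, qi in zip(p, q) if pi > 0)
    return (1 - overlap) / (eps * math.log(2))

p = [0.7, 0.2, 0.1]   # eigenvalues of rho
q = [0.3, 0.3, 0.4]   # eigenvalues of sigma, in the same eigenbasis
exact = relative_entropy_bits(p, q)
approximations = [trace_functional_quotient(p, q, eps) for eps in (1e-2, 1e-4, 1e-6)]
```

The quotients approach the exact value as ε decreases, in line with the statement of the lemma.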
The previous lemma is quite useful when proving the data-processing inequality for the
relative entropy.
for any ρ, σ ∈ D(H) and all quantum channels T : B(H) → B(H′). Taking the limit as in
Lemma 3.2 then yields the data-processing inequality of the relative entropy.
4.1 Monotonicity of certain Hilbert-Schmidt operators
Let H denote a complex Euclidean space. For positive operators A, B ∈ B(H)+ , we define
the linear maps LA : B(H) → B(H) and RB : B(H) → B(H) by
We will now think of these maps as operators acting on the Hilbert-Schmidt inner product
space B(H). The following properties are easy to show:
• For any A, B ∈ B(H)+ the operators L_A and R_B are positive semidefinite, since, e.g.,
L_A = L_A^* and L_A = L_{√A} ∘ L_{√A}, and the same argument works for R_B.
Definition 4.1. For a complex Euclidean space H and a function f : (0, ∞) → (0, ∞), we
define the operator
\[
G_f(A, B) = f\left( R_B \circ L_A^{-1} \right) \circ L_A,
\]
for any pair of positive invertible operators A, B ∈ B(H)++ .
By the spectral decomposition of R_B ∘ L_A^{-1} in the basis {|v_i⟩⟨w_j|}_{i,j} ⊂ B(H) it is easy
to show that the operator G_f(A, B) is positive semidefinite for any A, B ∈ B(H)++. Let
us consider the operator G_f(A, B) for the function f(x) = x^α where α ∈ (0, 1). Then, it is
easy to show that
\[
G_{x^\alpha}(A, B) = R_{B^\alpha} \circ L_{A^{1-\alpha}}.
\]
A consequence of this is the following lemma, whose proof is immediate:
To prove the data-processing inequality we will use the following integral representation
of the function f (x) = xα :
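\[
x^\alpha = \frac{\sin(\pi\alpha)}{\pi} \int_0^\infty \frac{x}{\lambda + x} \, \lambda^{\alpha-1} \, d\lambda.
\]
\[
\langle 1_H, G_{x^\alpha}(\rho, \sigma) 1_H \rangle_{HS} = \frac{\sin(\pi\alpha)}{\pi} \int_0^\infty \langle 1_H, G_{\frac{x}{\lambda+x}}(\rho, \sigma) 1_H \rangle_{HS} \, \lambda^{\alpha-1} \, d\lambda.
\]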
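The scalar integral representation of x^α can be verified numerically. The sketch below is one way to do this, assuming the substitution λ = e^t and a plain trapezoidal rule on a truncated interval; the function name, the cutoff `t_max`, and the number of steps are choices made for this illustration. The truncation error grows as α approaches 0 or 1, so the sketch is only meant for α well inside (0, 1).

```python
import math

def power_via_integral(x, alpha, t_max=80.0, steps=16000):
    """Approximate x^alpha by (sin(pi a)/pi) * int_0^inf x/(lam + x) * lam^(a-1) dlam.

    With lam = exp(t) we have lam^(a-1) dlam = exp(a t) dt, so the integrand
    becomes x * exp(a t) / (exp(t) + x) on the real line. The integral is
    truncated to [-t_max, t_max] and evaluated with the trapezoidal rule.
    """
    h = 2 * t_max / steps
    total = 0.0
    for k in range(steps + 1):
        t = -t_max + k * h
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * x * math.exp(alpha * t) / (math.exp(t) + x)
    return math.sin(math.pi * alpha) / math.pi * h * total

approx_sqrt2 = power_via_integral(2.0, 0.5)
exact_sqrt2 = 2.0 ** 0.5
```

The point of the representation is that each integrand x/(λ + x) is a much simpler function of x than x^α, so an operator inequality proven for every λ > 0 carries over to x^α by integration.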
Theorem 4.3. For invertible quantum states ρ, σ ∈ D(H) and any quantum channel T :
B(H) → B(H′) for which T(ρ) and T(σ) are invertible we have
\[
G_{\frac{x}{\lambda+x}}(T(\rho), T(\sigma)) \geq T \circ G_{\frac{x}{\lambda+x}}(\rho, \sigma) \circ T^*,
\]
Lemma 4.5. Let H denote a complex Euclidean space. For positive invertible operators
A, B ∈ B(H)++ and X ∈ B(H) we have
\[
X A^{-1} X^\dagger \leq B^{-1} \quad\text{if and only if}\quad X^\dagger B X \leq A.
\]
Proof. Obviously, it is enough to show one direction of the equivalence. If X A^{-1} X† ≤ B^{-1},
then we have B^{1/2} X A^{-1} X† B^{1/2} ≤ 1_H. This final condition is equivalent to ‖A^{-1/2} X† B^{1/2}‖_∞ ≤ 1,
which implies (A^{-1/2} X† B^{1/2})(B^{1/2} X A^{-1/2}) ≤ 1_H. Multiplying by A^{1/2} on both sides
shows that X† B X ≤ A.
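Proof of Theorem 4.3. Fix λ > 0. By Lemma 4.5 it is sufficient to show that
\[
G_{\frac{x}{\lambda+x}}(\rho, \sigma)^{-1} \geq T^* \circ G_{\frac{x}{\lambda+x}}(T(\rho), T(\sigma))^{-1} \circ T.
\]
Note that
\[
G_{\frac{x}{\lambda+x}}(\rho, \sigma)^{-1} = \left( \lambda + R_\sigma \circ L_\rho^{-1} \right) \circ R_\sigma^{-1} \circ L_\rho \circ L_\rho^{-1} = \lambda R_\sigma^{-1} + L_\rho^{-1},
\]
and a similar expression holds for the operator G_{x/(λ+x)}(T(ρ), T(σ))^{-1}. With this we have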
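\[
\langle X, G_{\frac{x}{\lambda+x}}(\rho, \sigma)^{-1}(X) \rangle_{HS} = \mathrm{Tr}\left[ X^\dagger \left( \lambda R_\sigma^{-1} + L_\rho^{-1} \right)(X) \right] = \lambda \, \mathrm{Tr}\left[ X \sigma^{-1} X^\dagger \right] + \mathrm{Tr}\left[ X^\dagger \rho^{-1} X \right]
\]
and
\[
\langle X, T^* \circ G_{\frac{x}{\lambda+x}}(T(\rho), T(\sigma))^{-1} \circ T(X) \rangle_{HS} = \mathrm{Tr}\left[ T(X)^\dagger \left( \lambda R_{T(\sigma)}^{-1} + L_{T(\rho)}^{-1} \right)(T(X)) \right] = \lambda \, \mathrm{Tr}\left[ T(X) T(\sigma)^{-1} T(X)^\dagger \right] + \mathrm{Tr}\left[ T(X)^\dagger T(\rho)^{-1} T(X) \right].
\]
By Lemma 4.4, we have
\[
\mathrm{Tr}\left[ T(X) T(\sigma)^{-1} T(X)^\dagger \right] \leq \mathrm{Tr}\left[ X \sigma^{-1} X^\dagger \right],
\]
and
\[
\mathrm{Tr}\left[ T(X)^\dagger T(\rho)^{-1} T(X) \right] \leq \mathrm{Tr}\left[ X^\dagger \rho^{-1} X \right],
\]
for all invertible ρ, σ ∈ D(H) and all X ∈ B(H).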
4.2 Proving the data-processing inequality
Now, we are ready to prove the main result of this lecture:
Theorem 4.6 (Data-processing inequality). For any quantum channel T : B(H) → B(H′),
we have
\[
D(T(\rho)\|T(\sigma)) \leq D(\rho\|\sigma),
\]
for all quantum states ρ, σ ∈ D(H).
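Proof. Assume first that ρ, σ ∈ D(H) and T(ρ), T(σ) ∈ D(H′) are invertible. Then, we have
\[
\mathrm{Tr}\left[ \rho^{1-\alpha} \sigma^{\alpha} \right] = \langle 1_H, G_{x^\alpha}(\rho, \sigma) 1_H \rangle_{HS} = \frac{\sin(\pi\alpha)}{\pi} \int_0^\infty \langle 1_H, G_{\frac{x}{\lambda+x}}(\rho, \sigma) 1_H \rangle_{HS} \, \lambda^{\alpha-1} \, d\lambda,
\]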
and, using that sin(πα) > 0 for all α ∈ (0, 1), we conclude that
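for any α ∈ (0, 1) and any ε, δ > 0. Using that the function (ρ, σ) ↦ Tr[ρ^{1−α} σ^α] is
\[
\mathrm{Tr}\left[ T(\rho)^{1-\alpha} T(\sigma)^{\alpha} \right] = \lim_{\varepsilon, \delta \searrow 0} \mathrm{Tr}\left[ T_\delta(\rho_\varepsilon)^{1-\alpha} T_\delta(\sigma_\varepsilon)^{\alpha} \right] \geq \lim_{\varepsilon \searrow 0} \mathrm{Tr}\left[ \rho_\varepsilon^{1-\alpha} \sigma_\varepsilon^{\alpha} \right] = \mathrm{Tr}\left[ \rho^{1-\alpha} \sigma^{\alpha} \right].
\]
\[
D(T(\rho)\|T(\sigma)) = \frac{1}{\ln(2)} \lim_{\varepsilon \searrow 0} \frac{1 - \mathrm{Tr}\left[ T(\rho)^{1-\varepsilon} T(\sigma)^{\varepsilon} \right]}{\varepsilon} \leq \frac{1}{\ln(2)} \lim_{\varepsilon \searrow 0} \frac{1 - \mathrm{Tr}\left[ \rho^{1-\varepsilon} \sigma^{\varepsilon} \right]}{\varepsilon} = D(\rho\|\sigma).
\]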
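The inequality is easy to observe numerically in the classical special case, where ρ and σ are diagonal in a fixed basis and the channel acts on the diagonal as a column-stochastic matrix. The following sketch covers only this special case; the function names, the matrix `W`, and the distributions are hypothetical examples, not objects from the lecture.

```python
import math

def relative_entropy_bits(p, q):
    """D(p||q) = sum_i p_i log2(p_i / q_i) for commuting (diagonal) states."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def apply_stochastic(W, p):
    """Output distribution (W p)_j = sum_i W[j][i] * p[i] for a column-stochastic W."""
    return [sum(W[j][i] * p[i] for i in range(len(p))) for j in range(len(W))]

p = [0.7, 0.2, 0.1]        # eigenvalues of rho
q = [0.3, 0.3, 0.4]        # eigenvalues of sigma, in the same eigenbasis
W = [[0.9, 0.2, 0.5],      # classical channel: every column sums to one
     [0.1, 0.8, 0.5]]

before = relative_entropy_bits(p, q)
after = relative_entropy_bits(apply_stochastic(W, p), apply_stochastic(W, q))
```

In this example the channel merges three outcomes into two, and the relative entropy after the channel is smaller than before: processing cannot make the two states easier to distinguish.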
4.3 Generalizing the data-processing inequality
If you carefully read the proof given in the previous sections, you might realize that we did
not use that the linear map T : B(H) → B(H′) was a quantum channel. In the way we
have stated it, we have only used that the map T : B(H) → B(H′) is a trace-preserving
2-positive map. By a slight modification of the proof, we can actually make it work for duals
of so-called unital Schwarz maps.
Definition 4.7. A linear map P : B(H) → B(H′) is called a unital Schwarz map if it is
unital and satisfies the Schwarz inequality
\[
P(X)^\dagger P(X) \leq P(X^\dagger X),
\]
for any X ∈ B(H).
Note that the Schwarz inequality for a unital map P : B(H) → B(H′) is equivalent to
\[
\begin{pmatrix} 1_{H'} & P(X) \\ P(X)^\dagger & P(X^\dagger X) \end{pmatrix} \geq 0,
\]
for any X ∈ B(H). The following lemma is therefore immediate:
Lemma 4.8. If T : B(H′) → B(H) is a quantum channel, then T^* is a unital Schwarz map.
There are many examples of unital Schwarz maps that do not arise as adjoints of quantum
channels, or even of 2-positive maps. The most prominent example is the map P : B(ℂ^2) →
B(ℂ^2) given by
\[
P(X) = \frac{1}{2} \, \mathrm{Tr}[X] \, \frac{1_{\mathbb{C}^2}}{2} + \frac{1}{2} X^T.
\]
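The two defining properties of this example can be checked numerically on a few test matrices. The sketch below assumes the form P(X) = Tr[X] 1/4 + X^T/2 for the map and tests positivity of 2×2 Hermitian matrices through their trace and determinant; all function names and test matrices are hypothetical. Passing such a check on samples is only a sanity test and does not prove the Schwarz inequality.

```python
def dagger(A):
    """Conjugate transpose of a 2x2 matrix given as nested lists."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def schwarz_example_map(X):
    """P(X) = Tr[X] * 1/4 + X^T / 2 on 2x2 matrices (assumed form of the example)."""
    tr = X[0][0] + X[1][1]
    return [[tr / 4 * (1 if i == j else 0) + X[j][i] / 2 for j in range(2)]
            for i in range(2)]

def schwarz_gap(X):
    """P(X^dagger X) - P(X)^dagger P(X); the Schwarz inequality says this is PSD."""
    PX = schwarz_example_map(X)
    lower = matmul(dagger(PX), PX)
    upper = schwarz_example_map(matmul(dagger(X), X))
    return [[upper[i][j] - lower[i][j] for j in range(2)] for i in range(2)]

def is_psd_2x2(M, tol=1e-12):
    """A Hermitian 2x2 matrix is PSD iff its trace and determinant are nonnegative."""
    tr = (M[0][0] + M[1][1]).real
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    return tr >= -tol and det >= -tol

identity = [[1, 0], [0, 1]]
samples = [
    [[0, 1], [0, 0]],
    [[1, 0], [0, 0]],
    [[1 + 2j, -0.5j], [0.3, 2 - 1j]],
]
is_unital = schwarz_example_map(identity) == identity
all_schwarz = all(is_psd_2x2(schwarz_gap(X)) for X in samples)
```

For X = |0⟩⟨1| the gap matrix has a zero eigenvalue, so the inequality is tight for this map on some inputs.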
It turns out that Theorem 4.3 also holds for linear maps T that are adjoints of unital Schwarz
maps. The proof is almost the same, but in the final lines of the proof we use the following
inequality:
Theorem 4.9 (A tracial inequality). Let P : B(H) → B(H′) denote a unital Schwarz map.
For any C ∈ B(H′)+ and any X ∈ B(H′) satisfying ker(C) ⊆ ker(X†) we have
\[
\mathrm{Tr}\left[ P^*(X)^\dagger P^*(C)^{-1} P^*(X) \right] \leq \mathrm{Tr}\left[ X^\dagger C^{-1} X \right],
\]
The data-processing inequality for the quantum relative entropy therefore holds for all
trace-preserving maps that are adjoints of unital Schwarz maps. Recently, a different technique
for proving the data-processing inequality was discovered. Unfortunately, it goes beyond
the scope of this course, but for completeness we state the most general form of the
data-processing inequality:
Theorem 4.10 (General data-processing inequality). For any positive and trace-preserving
map P : B(H) → B(H′), we have
Whether this theorem can be proven using the techniques used above is an open question!