Algebraic Approach To Quantum Theory

Contents

1 Introduction
2 States and effects
  2.1 Basic Quantum Mechanics
  2.2 Positive operators
  2.3 Generalized States
    2.3.1 Ensembles
    2.3.2 Tensor Product and Reduced States
    2.3.3 Density Operator/Matrix
    2.3.4 Example: Entanglement
    2.3.5 Example: Bloch Sphere
  2.4 Generalised propositions
  2.5 Abstract State/Effect Formalism
    2.5.1 Example: Quantum Theory
    2.5.2 Example: Classical Probability Theory
  2.6 Algebraic Formulation
    2.6.1 Example: Quantum Theory
    2.6.2 Example: Classical Probability Theory
    2.6.3 Example: C*-Algebra
  2.7 Structure of matrix algebras
  2.8 Hybrid quantum/classical systems
3 Channels
  3.1 Schrödinger Picture
    3.1.1 Example: Partial Transpose
    3.1.2 Example: Homomorphisms
    3.1.3 Example: Mixture of unitaries
    3.1.4 Example: Classical channels
  3.2 Heisenberg Picture
  3.3 Observables as channels with classical output
    3.3.1 Example: Observable associated with an effect
    3.3.2 Coarse-graining of observables
    3.3.3 Accessible observables
  3.4 Stinespring dilation
1 Introduction
This document is meant as a pedagogical introduction to the modern language used to talk
about quantum theories, especially in the field of quantum information. It assumes that the
reader has taken a first traditional course on quantum mechanics, and is familiar with the
concept of Hilbert space and elementary linear algebra.
As in the popular textbook on quantum information by Nielsen and Chuang [1], we introduce
the generalised concept of states (density matrices), observables (POVMs) and transformations (channels), but we also go further and characterise these structures from an algebraic
standpoint, which provides many useful technical tools, and clarity as to their generality. This
approach also makes it manifest that quantum theory is a direct generalisation of probability
theory, and provides a unifying formalism for both fields.
Although this algebraic approach dates back, in part, to John von Neumann, we are not aware of any presentation which focuses on finite-dimensional systems. This simplification allows us to give a self-contained presentation which avoids many of the technicalities inherent to the most general C*-algebraic approach, while being perfectly appropriate for the quantum information literature.
Normally, one goes on to describe the projection postulate, but we will see that it can actually be derived from the Born rule, once properly generalised.
Definition 2.2.
An operator $A$ is called positive, denoted $A \geq 0$, if for all $|\psi\rangle \in \mathcal{H}$, $\langle\psi|A|\psi\rangle \geq 0$. We also write $A \geq B$ if $A - B \geq 0$.

In finite dimension, this definition is equivalent to $A$ being self-adjoint and having non-negative eigenvalues, that is, $A^\dagger = A$ and $a_i \geq 0$ for all $i$. It is straightforward to see that such an operator satisfies the defining property. Let $\{|i\rangle\}$ denote the eigenbasis of $A$ and expand $|\psi\rangle = \sum_i c_i |i\rangle$:
$$\langle\psi|A|\psi\rangle = \sum_{i,j} c_i^* c_j\, a_j \langle i|j\rangle = \sum_i |c_i|^2 a_i \geq 0.$$
For the converse, consider $A$ with $\langle\psi|A|\psi\rangle \geq 0$ for all $|\psi\rangle \in \mathcal{H}$. To see that this $A$ is self-adjoint, observe that in general,
$$\langle\psi+\phi|A|\psi+\phi\rangle = \langle\psi|A|\psi\rangle + \langle\phi|A|\phi\rangle + \langle\psi|A|\phi\rangle + \langle\phi|A|\psi\rangle. \tag{2.1}$$
Since this is real by assumption, and also the first two terms on the right side of the equality are real, we must have
$$\langle\psi|A|\phi\rangle + \langle\phi|A|\psi\rangle = \overline{\langle\psi|A|\phi\rangle} + \overline{\langle\phi|A|\psi\rangle}$$
for all states $\psi$ and $\phi$, which implies $\operatorname{Im}\langle\psi|A|\phi\rangle = -\operatorname{Im}\langle\phi|A|\psi\rangle$. Replacing $\phi$ by $i\phi$, we also obtain $\operatorname{Re}\langle\psi|A|\phi\rangle = \operatorname{Re}\langle\phi|A|\psi\rangle$. It follows that $\langle\psi|A|\phi\rangle = \overline{\langle\phi|A|\psi\rangle}$, and hence $A$ is self-adjoint.
The positivity of the eigenvalues follows from the fact that eigenvalues can be considered as expectation values in the corresponding eigenvectors, i.e., $a_i = \langle i|A|i\rangle \geq 0$.
Another important point to notice is that for any operator $B$, we have $B^\dagger B \geq 0$, because for all $\psi$, $\langle\psi|B^\dagger B|\psi\rangle = \|B|\psi\rangle\|^2 \geq 0$. It is also true that any positive operator $A$ can be written in this form by using, for instance, $B = \sqrt{A}$, defined as having the same eigenvectors as $A$, but the square roots of the eigenvalues of $A$.
Also, we will use the fact that the binary relation $A \leq B$ if $B - A \geq 0$ defines a partial order on the set of all operators. This order is partial because not every pair of operators $A, B$ can be compared. For instance, consider the matrices
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$
It is easy to see that neither $A - B$ nor $B - A$ is positive, since they both have a negative eigenvalue.
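These properties are easy to verify numerically. The following NumPy sketch (the helper `is_positive` is our own illustration, not part of the text) checks positivity via the eigenvalues of a self-adjoint matrix, confirms that $B^\dagger B \geq 0$ for an arbitrary rectangular $B$, and shows that the two matrices above are incomparable:

```python
import numpy as np

def is_positive(A, tol=1e-12):
    """Check A >= 0: self-adjoint with non-negative eigenvalues."""
    if not np.allclose(A, A.conj().T):
        return False
    return bool(np.all(np.linalg.eigvalsh(A) >= -tol))

# B†B is always positive, even for a rectangular B
B = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])
assert is_positive(B.conj().T @ B)

# The matrices A and B of the example are incomparable:
# neither A - B nor B - A is positive
A = np.diag([1.0, 0.0])
Bm = np.diag([0.0, 1.0])
assert not is_positive(A - Bm) and not is_positive(Bm - A)
```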
An important set of positive operators are the projectors:

Definition 2.3.
An operator $P$ is a projector if it satisfies $P^\dagger P = P$ or, equivalently, $P^\dagger = P$ and $P^2 = P$.

It follows that the eigenvalues of a projector $P$ are either 0 or 1. Indeed, if $P|\psi\rangle = p|\psi\rangle$ for some eigenvalue $p$, then also $p|\psi\rangle = P|\psi\rangle = P^2|\psi\rangle = p^2|\psi\rangle$; it follows that $p^2 = p$, and hence $p = 0$ or $p = 1$. In particular, this implies that $P \geq 0$.
Such an operator $P$ projects all vectors orthogonally onto the subspace $P\mathcal{H} \equiv \{P|\psi\rangle : |\psi\rangle \in \mathcal{H}\} = \{|\psi\rangle \in \mathcal{H} : P|\psi\rangle = |\psi\rangle\}$. Given any normalised state $|\phi\rangle$, we can define the rank-one operator
$$P_\phi = |\phi\rangle\langle\phi|$$
as a projector on the subspace spanned by $|\phi\rangle$. It maps any vector $|\psi\rangle$ to $P_\phi|\psi\rangle = \langle\phi|\psi\rangle\,|\phi\rangle$.
This notation allows us to write any self-adjoint operator $A$ (in finite dimensions) as a sum of projectors onto its eigenspaces. Let $|i,k\rangle$ be the eigenvectors of $A$ with distinct eigenvalues $a_i$, i.e., such that $A|i,k\rangle = a_i|i,k\rangle$, where $k$ labels degenerate eigenvectors. We define the spectral projectors
$$P_i := \sum_k |i,k\rangle\langle i,k|,$$
which project onto the eigenspace $P_i\mathcal{H}$ associated with the eigenvalue $a_i$. (It is easy to check that a sum of orthogonal projectors is also a projector.) This allows us to write $A$ in terms of its spectral decomposition:
$$A = \sum_i a_i P_i.$$
The completeness of the eigenvectors implies that $\sum_i P_i = \mathbf{1}$, although one of the eigenvalues $a_i$ may be equal to zero, eliminating one of the terms from the sum above.
Observe that the probability of obtaining the value $a_i$ in a measurement of $A$ on state $|\psi\rangle$ can now be written simply as
$$p_i = \sum_k |\langle i,k|\psi\rangle|^2 = \langle\psi|P_i|\psi\rangle.$$
In particular, it depends solely on the spectral projector $P_i$ and on the state $|\psi\rangle$.
Note that this simple expression holds only provided that we make sure that the eigenvalues $a_i$ are distinct, and hence that the projectors $P_i$ are of maximal rank. Moreover, this makes the spectral decomposition of a self-adjoint operator unique.
Often, one is not interested in the individual outcome probabilities when measuring an observable, but just in the average of the measured value:

Definition 2.4.
The expectation value $\langle A\rangle$ of an observable $A$ with spectral decomposition $A = \sum_i a_i P_i$, measured on a quantum system in state $|\psi\rangle$, is given by
$$\langle A\rangle = \sum_i a_i p_i = \sum_i a_i \langle\psi|P_i|\psi\rangle = \langle\psi|A|\psi\rangle. \tag{2.2}$$

It is worth noting that the probabilities $p_i$ themselves are expectation values of the spectral projectors:
$$p_i = \langle P_i\rangle.$$
Therefore, any prediction of quantum theory is given by the expectation value of some self-adjoint operator.
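As a numerical illustration of the spectral projectors and the Born rule (our own NumPy sketch, using a deliberately degenerate observable), the following groups eigenvectors by eigenvalue, builds the $P_i$, and checks that the probabilities $p_i = \langle\psi|P_i|\psi\rangle$ sum to one and reproduce $\langle\psi|A|\psi\rangle$:

```python
import numpy as np

# Observable with a degenerate spectrum: eigenvalue 1 (twice) and -1
A = np.diag([1.0, 1.0, -1.0])

# Build spectral projectors, grouping equal eigenvalues together
vals, vecs = np.linalg.eigh(A)
projectors = {}
for a, v in zip(np.round(vals, 10), vecs.T):
    projectors.setdefault(a, np.zeros((3, 3)))
    projectors[a] += np.outer(v, v.conj())

psi = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)

# Born rule: p_i = <psi|P_i|psi>; the p_i sum to 1 since sum_i P_i = 1
probs = {a: float(psi.conj() @ P @ psi) for a, P in projectors.items()}
assert abs(sum(probs.values()) - 1) < 1e-12

# <A> = sum_i a_i p_i = <psi|A|psi>
expval = sum(a * p for a, p in probs.items())
assert abs(expval - psi.conj() @ A @ psi) < 1e-12
```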
2.3.1 Ensembles
Consider the case where the observer of a quantum system is uncertain about what its exact state is. A natural way to model this situation is to assign probabilities $p_i$ to quantum states $\psi_i$ according to his belief about the system. This defines the ensemble $\{(p_i, \psi_i)\}_{i=1}^n$.
The expectation value of the observable $A$ must then be the average, in terms of the classical probability distribution $i \mapsto p_i$, of the various quantum expectation values $\langle\psi_i|A|\psi_i\rangle$:
$$\langle A\rangle = \sum_i p_i \langle\psi_i|A|\psi_i\rangle. \tag{2.3}$$
This can be rewritten in a more compact form using the trace $\operatorname{Tr}$. Recall that the trace of a matrix is cyclic, i.e., $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$. Of course, the product $AB$ must be a square matrix, otherwise the trace is not defined. However, $A$ and $B$ themselves can be rectangular matrices. The best way of thinking about a ket $|\psi_i\rangle$ is as a matrix with just a single column, whereas a bra $\langle\psi_i|$ is a matrix with just a single row. Considering that the trace of a number (i.e., a one-by-one matrix) is that number itself, we obtain
$$\langle\psi_i|A|\psi_i\rangle = \operatorname{Tr}\big(|\psi_i\rangle\langle\psi_i|\,A\big). \tag{2.4}$$
We invite the reader to verify that this is correct. This allows us to rewrite the expectation value of $A$ as
$$\langle A\rangle = \sum_i p_i \langle\psi_i|A|\psi_i\rangle = \sum_i p_i \operatorname{Tr}\big(|\psi_i\rangle\langle\psi_i|\,A\big) = \operatorname{Tr}(\rho A), \tag{2.5}$$
where
$$\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i| \tag{2.6}$$
is usually called the density matrix, or density operator. Given that, as noted at the end of Section 2.2, all quantum predictions take the form of the expectation value of some self-adjoint operator, this means that in this scenario, the matrix $\rho$ is all that we need to know about the ensemble $\{(p_i, \psi_i)\}_{i=1}^n$ in order to compute predictions.
For instance, considering an observable in its spectral decomposition $A = \sum_j a_j P_j$, the probability $q_j$ of observing the outcome $a_j$ is also simply
$$q_j = \langle P_j\rangle = \operatorname{Tr}(\rho P_j). \tag{2.7}$$
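A small NumPy sketch (our own example ensemble, not from the text) confirms that the density matrix carries exactly the information needed: the ensemble average of $\langle\psi_i|A|\psi_i\rangle$ agrees with $\operatorname{Tr}(\rho A)$:

```python
import numpy as np

# Ensemble: |0> with probability 1/2, |+> with probability 1/2
p = [0.5, 0.5]
states = [np.array([1.0, 0.0]), np.array([1.0, 1.0]) / np.sqrt(2)]

# Density matrix rho = sum_i p_i |psi_i><psi_i|
rho = sum(pi * np.outer(s, s.conj()) for pi, s in zip(p, states))

# Ensemble average of <psi_i|A|psi_i> equals Tr(rho A) for any observable A
A = np.array([[0.0, 1.0], [1.0, 0.0]])   # Pauli X
ensemble_avg = sum(pi * (s.conj() @ A @ s) for pi, s in zip(p, states))
assert np.isclose(np.trace(rho @ A), ensemble_avg)
```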
It is important to note that many different ensembles give rise to the same density matrix $\rho$:

Theorem 2.1. Two ensembles $\{(p_i, |\psi_i\rangle)\}_{i=1}^n$, $\{(q_j, |\phi_j\rangle)\}_{j=1}^m$ are represented by the same density operator, i.e.,
$$\sum_i p_i |\psi_i\rangle\langle\psi_i| = \sum_j q_j |\phi_j\rangle\langle\phi_j|, \tag{2.8}$$
if and only if there is a unitary matrix $u_{ij}$ such that
$$\sqrt{p_i}\,|\psi_i\rangle = \sum_j u_{ij}\, \sqrt{q_j}\,|\phi_j\rangle. \tag{2.9}$$
Proof. The sufficiency of the condition can be verified by simple substitution and using the unitarity of the matrix $u_{ij}$, namely the fact that $\sum_i \bar u_{ik} u_{ij} = \delta_{kj}$. For the converse, we follow Ref. [1]. Let $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$. First, observe that we can always build an ensemble of mutually orthogonal states representing $\rho$ using its eigenvectors $|i\rangle$ and eigenvalues $\lambda_i$: $\rho = \sum_i \lambda_i |i\rangle\langle i|$. If we can prove the theorem in the case where one of the ensembles is composed of orthogonal states, then we obtain the desired unitary matrix by multiplying the unitary relating the vectors of the first ensemble to the orthogonal ones with that relating the orthogonal vectors to the vectors of the second ensemble. Let us define $|a_i\rangle := \sqrt{p_i}\,|\psi_i\rangle$ and $|b_i\rangle := \sqrt{q_i}\,|\phi_i\rangle$ for conciseness. We assume that the states $|a_i\rangle$ are all mutually orthogonal. In this case, observe that if $|\psi\rangle$ is orthogonal to all vectors $|a_i\rangle$, then
$$0 = \langle\psi|\rho|\psi\rangle = \sum_i |\langle b_i|\psi\rangle|^2,$$
which implies that $\langle b_i|\psi\rangle = 0$ for all $i$, and hence $|\psi\rangle$ is also orthogonal to all vectors $|b_i\rangle$. This means that the families $|a_i\rangle$ and $|b_i\rangle$ have the same span. Hence there are complex numbers $c_{ij}$ such that
$$|b_i\rangle = \sum_j c_{ij} |a_j\rangle.$$
Moreover, we have
$$\rho = \sum_k |a_k\rangle\langle a_k| = \sum_k |b_k\rangle\langle b_k| = \sum_{ij} \Big(\sum_k c_{ki}\bar c_{kj}\Big) |a_i\rangle\langle a_j|.$$
From the linear independence of the matrices $|a_i\rangle\langle a_j|$, we conclude that $\sum_k c_{ki}\bar c_{kj} = \delta_{ij}$, and hence the matrix $c_{ij}$ is unitary.
The uncertainty involved in a given ensemble can be measured by the Shannon entropy $S(p) = -\sum_i p_i \ln p_i$. For a given density matrix $\rho$, this entropy is not well defined, because it depends on the ensemble from which $\rho$ is constructed. However, it makes sense to consider the ensemble that corresponds to a minimal uncertainty. This defines the von Neumann entropy associated with a density matrix:
$$S(\rho) = -\operatorname{Tr}(\rho\ln\rho) = \min_{\rho = \sum_i p_i|\psi_i\rangle\langle\psi_i|} S(p), \tag{2.10}$$
where the minimum is over all possible ensembles $\{(p_i, \psi_i)\}_{i=1}^n$ such that $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$. In fact, one can show that this minimum is reached whenever the states $\psi_i$ are all orthogonal. In this case, the probabilities $p_i$ are simply the eigenvalues of $\rho$. Therefore, $S(\rho)$ is the Shannon entropy computed from the eigenvalues of $\rho$. We will come back to this special diagonalising ensemble in Section 2.3.3.
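Since the minimum is attained on the diagonalising ensemble, $S(\rho)$ can be computed directly from the eigenvalues of $\rho$. A short NumPy sketch (the helper name `von_neumann_entropy` is ours):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho ln rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # 0 ln 0 = 0 by convention
    return float(-np.sum(evals * np.log(evals)))

pure = np.diag([1.0, 0.0])                # pure state: zero entropy
mixed = np.eye(2) / 2                     # maximally mixed qubit: ln 2
assert np.isclose(von_neumann_entropy(pure), 0.0)
assert np.isclose(von_neumann_entropy(mixed), np.log(2))
```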
2.3.2 Tensor Product and Reduced States
Any vector $|\psi\rangle$ of the tensor product space $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$, with $\dim\mathcal{H}_A = n$ and $\dim\mathcal{H}_B = m$, can be expanded in a product basis as
$$|\psi\rangle = \sum_{i=1}^n \sum_{j=1}^m \psi_{ij}\, |i\rangle_A \otimes |j\rangle_B. \tag{2.13}$$
We can think of $\mathcal{H}_A$ and $\mathcal{H}_B$ as two independent parts of $\mathcal{H}$. In particular, given any vectors $\psi_A \in \mathcal{H}_A$ and $\psi_B \in \mathcal{H}_B$, we can construct the joint vector
$$|\psi_A\rangle \otimes |\psi_B\rangle = \sum_{i=1}^n \sum_{j=1}^m (\psi_A)_i (\psi_B)_j\, |i\rangle_A \otimes |j\rangle_B. \tag{2.14}$$
Similarly, operators $A$ on $\mathcal{H}_A$ and $B$ on $\mathcal{H}_B$ can be extended to the joint space:
$$(A \otimes \mathbf{1})(|i\rangle_A \otimes |j\rangle_B) = (A|i\rangle_A) \otimes |j\rangle_B, \tag{2.15}$$
$$(\mathbf{1} \otimes B)(|i\rangle_A \otimes |j\rangle_B) = |i\rangle_A \otimes (B|j\rangle_B). \tag{2.16}$$
Observe that the labels $A$ and $B$ on the kets are not really necessary, as we always take care to preserve the ordering of the tensor factors: system $A$ on the left of the tensor product and system $B$ on the right. Hence in the following we would write the above two equations as
$$(A \otimes \mathbf{1})(|i\rangle \otimes |j\rangle) = (A|i\rangle) \otimes |j\rangle, \tag{2.17}$$
$$(\mathbf{1} \otimes B)(|i\rangle \otimes |j\rangle) = |i\rangle \otimes (B|j\rangle). \tag{2.18}$$
An essential property of this representation of the operators $A$ and $B$ is that they commute:
$$(A \otimes \mathbf{1})(\mathbf{1} \otimes B) = (\mathbf{1} \otimes B)(A \otimes \mathbf{1}) = A \otimes B. \tag{2.19}$$
More generally, the tensor product of operators satisfies
$$(A \otimes B)(A' \otimes B') = (AA') \otimes (BB'), \tag{2.20}$$
$$(A + A') \otimes B = A \otimes B + A' \otimes B, \tag{2.21}$$
$$A \otimes (B + B') = A \otimes B + A \otimes B', \tag{2.22}$$
$$(\lambda A) \otimes B = A \otimes (\lambda B) = \lambda\,(A \otimes B), \tag{2.23}$$
for any operators $A, A', B, B'$ acting on the respective Hilbert spaces, and any $\lambda \in \mathbb{C}$.
The tensor product can be straightforwardly generalised to more than two tensor factors, and to non-square matrices, i.e., operators between two different Hilbert spaces. In particular, the tensor product of two kets is the same as that of two matrices with only one column (column vectors). We invite the reader to experiment with this concept. In particular, a good exercise is to understand the tensor product in terms of its action on matrix components. Here we just observe that the tensor product is associative:
$$(A \otimes B) \otimes C \equiv A \otimes (B \otimes C) \equiv A \otimes B \otimes C. \tag{2.24}$$
Also, a construction that we will often encounter is the tensor product of an operator and a ket, or a bra, whose action on states is defined as follows:
$$(A \otimes |\phi\rangle)|i\rangle = A|i\rangle \otimes |\phi\rangle, \tag{2.25}$$
$$(A \otimes \langle\phi|)(|i\rangle \otimes |j\rangle) = \langle\phi|j\rangle\, A|i\rangle. \tag{2.26}$$
We are now in a position to define the concept of reduced state. Suppose that, for some reason, we decide to only measure observables on $\mathcal{H}_A$, i.e., observables represented by operators of the form $A \otimes \mathbf{1}$, where $A$ is any operator on $\mathcal{H}_A$. Observe that, in particular, if the spectral decomposition of $A$ is $A = \sum_i a_i P_i$, then the spectral decomposition of $A \otimes \mathbf{1}$ is $A \otimes \mathbf{1} = \sum_i a_i P_i \otimes \mathbf{1}$. Indeed, it is easy to check that the $P_i \otimes \mathbf{1}$ are also projectors.
This implies that all the possible predictions are of the form of the expectation value of a self-adjoint operator having the shape $X \otimes \mathbf{1}$, which, on an arbitrary state $\psi \in \mathcal{H}_A \otimes \mathcal{H}_B$, is
$$\langle X \otimes \mathbf{1}\rangle = \langle\psi|X \otimes \mathbf{1}|\psi\rangle = \operatorname{Tr}\big((X \otimes \mathbf{1})|\psi\rangle\langle\psi|\big) = \sum_j \operatorname{Tr}\big((X \otimes |j\rangle\langle j|)\,|\psi\rangle\langle\psi|\big), \tag{2.27}$$
where, in the last step, we expanded the identity on system $B$ in terms of a basis with elements $|j\rangle$. Observing that
$$X \otimes |j\rangle\langle j| = (\mathbf{1} \otimes |j\rangle)\,X\,(\mathbf{1} \otimes \langle j|), \tag{2.28}$$
and using the cyclicity of the trace, we obtain
$$\langle X \otimes \mathbf{1}\rangle = \sum_j \operatorname{Tr}\big((\mathbf{1} \otimes \langle j|)|\psi\rangle\langle\psi|(\mathbf{1} \otimes |j\rangle)\,X\big) = \operatorname{Tr}(\rho X), \tag{2.29}$$
where we defined the operator
$$\rho = \sum_j (\mathbf{1} \otimes \langle j|)\,|\psi\rangle\langle\psi|\,(\mathbf{1} \otimes |j\rangle). \tag{2.30}$$
This operator $\rho$, which is also a density matrix, is an operator acting only on Hilbert space $\mathcal{H}_A$. Nonetheless, it contains all the information that we will ever need about the state $\psi \in \mathcal{H}_A \otimes \mathcal{H}_B$, provided we are restricted to only measure observables of system $A$.
It will be useful to define more generally the operation which maps $|\psi\rangle\langle\psi|$ to $\rho$. It is a linear map that we call the partial trace over system $B$, written $\operatorname{Tr}_B$, which acts on operators as follows:
$$\operatorname{Tr}_B(Z) = \sum_j (\mathbf{1} \otimes \langle j|)\,Z\,(\mathbf{1} \otimes |j\rangle). \tag{2.31}$$
Note that the full trace can then be recovered as
$$\operatorname{Tr} = \operatorname{Tr}_A \operatorname{Tr}_B, \tag{2.32}$$
where the implicit product used here is simply the composition of maps.
(2.32)
where the implicit product used here is simply the composition of maps.
The above reasoning can be carried likewise if we started with a density matrix AB on system
HA HB rather than simply the vector :
Definition 2.5.
Let AB be a density matrix on a bipartite system, then
A = TrB (AB )
is the reduced state on system A.
(2.33)
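In components, the partial trace simply sums over a matched pair of $B$-indices. A compact NumPy sketch (the function name `partial_trace_B` is our own), verified on a product state, where tracing out $B$ must return $\rho_A$ itself:

```python
import numpy as np

def partial_trace_B(Z, dim_A, dim_B):
    """Tr_B(Z) = sum_j (1 ⊗ <j|) Z (1 ⊗ |j>) on a dim_A * dim_B system."""
    Z = Z.reshape(dim_A, dim_B, dim_A, dim_B)
    return np.einsum('ijkj->ik', Z)   # sum over the two B indices

# Product state: tracing out B returns the A factor
rho_A = np.array([[0.7, 0.2], [0.2, 0.3]])
rho_B = np.eye(2) / 2
rho_AB = np.kron(rho_A, rho_B)
assert np.allclose(partial_trace_B(rho_AB, 2, 2), rho_A)
```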
2.3.3 Density Operator/Matrix
In the previous two sections, we have seen that it is useful to characterize the state of a system through an operator $\rho$, such that the expectation value of a self-adjoint operator $A$ can be obtained via $\operatorname{Tr}(\rho A)$. Now we take this more general concept of states to its full generality:

Definition 2.6.
A density operator (state) is an operator $\rho$ satisfying
i) $\rho \geq 0$,
ii) $\operatorname{Tr}\rho = 1$.
The expectation value of the observable represented by the self-adjoint operator $A$ with respect to the state $\rho$ is given by
$$\langle A\rangle = \operatorname{Tr}(\rho A). \tag{2.34}$$

These two conditions completely capture the concept of a density matrix that emerged in either of the two contexts studied in Sections 2.3.1 and 2.3.2. That is, not only do the density matrices emerging in these contexts always satisfy conditions (i) and (ii), but also any matrix satisfying these conditions can emerge in both contexts.
First, observe that these conditions are necessary and sufficient for the following interpretation: for any observable represented by a self-adjoint operator $A$ with spectral decomposition $A = \sum_i a_i P_i$, we want the numbers $p_i = \operatorname{Tr}(\rho P_i)$ to form a probability distribution, i.e., $p_i \geq 0$ and $\sum_i p_i = 1$. Indeed, this requirement implies in particular that for any vector $|\psi\rangle$ we must have $\operatorname{Tr}(\rho|\psi\rangle\langle\psi|) = \langle\psi|\rho|\psi\rangle \geq 0$, which simply means $\rho \geq 0$, namely Condition (i). Moreover, from the fact that $\sum_i P_i = \mathbf{1}$, $\sum_i p_i = 1$ directly implies $\operatorname{Tr}(\rho\mathbf{1}) = 1$, which is Condition (ii). The converse is similarly straightforward.
Moreover, any such density matrix can be obtained either as an ensemble or as a reduced state, which shows that this definition is not taking us away from the accepted framework of quantum mechanics. Indeed, suppose $\rho$ is any matrix satisfying conditions (i) and (ii). Let $|i\rangle$ be a complete set of eigenvectors for $\rho$ with eigenvalues $p_i$, $i = 1, \ldots, n$. The two conditions imply that $p_i \geq 0$ and $\sum_i p_i = 1$; therefore $\rho$ represents the ensemble $\{p_i, |i\rangle\}_{i=1}^n$ via its spectral decomposition $\rho = \sum_i p_i |i\rangle\langle i|$. This arbitrary density matrix can also represent a reduced state. Indeed, consider two copies of the Hilbert space on which it is defined, and the bipartite vector
$$|\psi\rangle = \sum_i \sqrt{p_i}\,|i\rangle \otimes |i\rangle.$$
One can check that $\rho$ is obtained by tracing out $|\psi\rangle\langle\psi|$ on either of the two systems. The vector $|\psi\rangle$ is generally called a purification of $\rho$.
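The purification construction can be checked numerically; the following NumPy sketch (our own illustration) builds $|\psi\rangle = \sum_i \sqrt{p_i}\,|i\rangle\otimes|i\rangle$ from the eigendecomposition of an arbitrary $\rho$ and verifies that tracing out the second system recovers $\rho$:

```python
import numpy as np

rho = np.array([[0.75, 0.25], [0.25, 0.25]])
p, U = np.linalg.eigh(rho)        # eigenvalues p_i, eigenvectors U[:, i]

# |psi> = sum_i sqrt(p_i) |i> ⊗ |i>, with |i> the eigenvectors of rho
psi = sum(np.sqrt(p[i]) * np.kron(U[:, i], U[:, i]) for i in range(2))

# Tracing out the second system recovers rho
rho_AB = np.outer(psi, psi.conj()).reshape(2, 2, 2, 2)
assert np.allclose(np.einsum('ijkj->ik', rho_AB), rho)
```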
Definition 2.7.
We call a state $\rho$ pure if it cannot be written as an ensemble of two or more distinct states (i.e., it is extremal in the convex set of density matrices). The following propositions are all equivalent (on finite-dimensional Hilbert spaces):
i) $\rho$ is pure
ii) $S(\rho) = 0$
iii) $\rho^2 = \rho$
iv) $\rho = |\psi\rangle\langle\psi|$ for some normalised vector $\psi$.

Proof. We show (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) $\Rightarrow$ (i). It directly follows from the extremality condition that $S(\rho) = 0$. If $\rho = \sum_i p_i |i\rangle\langle i|$ for orthogonal eigenstates $|i\rangle$, then we have $S(\rho) = -\sum_i p_i \ln p_i = 0$. But each term $-p_i \ln p_i$ is non-negative since $0 \leq p_i \leq 1$. Hence, for all $i$ we have $p_i \ln p_i = 0$, which is true either if $p_i = 0$ or $\ln p_i = 0$, i.e., $p_i = 1$. Hence $\rho$ is a projector: $\rho^2 = \rho$. Moreover, since $\sum_i p_i = 1$, only a single $p_i$ can be nonzero. It follows that $\rho = |i\rangle\langle i|$. Finally, such a state is extremal, because $|\psi\rangle\langle\psi| = p\rho_1 + (1-p)\rho_2$ implies, by multiplying both sides by $\langle\psi|$ on the left and $|\psi\rangle$ on the right, that $1 = p\langle\psi|\rho_1|\psi\rangle + (1-p)\langle\psi|\rho_2|\psi\rangle$, or $p(1 - \langle\psi|\rho_1|\psi\rangle) + (1-p)(1 - \langle\psi|\rho_2|\psi\rangle) = 0$. Since both terms are non-negative, we must have $\langle\psi|\rho_i|\psi\rangle = 1$, which implies that $\rho_1 = \rho_2 = |\psi\rangle\langle\psi|$.
We close this section with some examples:

2.3.4 Example: Entanglement
Consider the maximally entangled state on $\mathcal{H}_A \otimes \mathcal{H}_B$, with $\dim\mathcal{H}_A = \dim\mathcal{H}_B = n$, given by $|\psi\rangle = \frac{1}{\sqrt{n}}\sum_i |i\rangle_A \otimes |i\rangle_B$. Its density matrix is
$$\rho_{AB} = |\psi\rangle\langle\psi| = \frac{1}{n}\sum_{i,j} (|i\rangle_A \otimes |i\rangle_B)(\langle j|_A \otimes \langle j|_B) = \frac{1}{n}\sum_{i,j} |i\rangle\langle j|_A \otimes |i\rangle\langle j|_B. \tag{2.35}$$
The reduced state on system $A$ is
$$\rho_A = \sum_k (\mathbf{1} \otimes \langle k|)\,\rho_{AB}\,(\mathbf{1} \otimes |k\rangle) = \frac{1}{n}\sum_{k,i,j} |i\rangle\langle j|\,\underbrace{\langle k|i\rangle}_{\delta_{ki}}\,\underbrace{\langle j|k\rangle}_{\delta_{jk}} = \frac{1}{n}\sum_k |k\rangle\langle k| = \frac{\mathbf{1}}{n}. \tag{2.36}$$
The reduced state $\rho_A = \mathbf{1}/n$ is referred to as the maximally mixed state. Because it is invariant under any unitary transformation $U$, it is also the reduced state of the states of the form $(U \otimes \mathbf{1})|\psi\rangle$, which just amount to a different choice of basis on system $A$. Observe that, although the full state $\rho_{AB} = |\psi\rangle\langle\psi|$ is pure, and hence $S(|\psi\rangle\langle\psi|) = 0$, i.e., we possess maximal information about it, its part $\rho_A$ has maximal entropy: $S(\rho_A) = \ln n$, which means that we know absolutely nothing about the state of system $A$. This is completely contrary to classical systems, where knowing the state of the whole system implies also complete knowledge of its parts. Quantum states having this non-classical property are called entangled. Hence, we call any pure state $\rho_{AB} = |\psi\rangle\langle\psi|$ of the compound system entangled whenever its parts have non-zero entropy. Since the parts of $|\psi\rangle$ have maximal entropy, $|\psi\rangle$ is maximally entangled.
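This computation is easy to reproduce numerically; the following NumPy sketch (our own, for $n = 3$) builds the maximally entangled state, traces out $B$, and confirms that the reduced state is $\mathbf{1}/n$ with entropy $\ln n$:

```python
import numpy as np

n = 3
# Maximally entangled state |psi> = (1/sqrt(n)) sum_i |i>|i>
psi = np.zeros(n * n)
for i in range(n):
    e = np.zeros(n)
    e[i] = 1.0
    psi += np.kron(e, e)
psi /= np.sqrt(n)

rho_AB = np.outer(psi, psi).reshape(n, n, n, n)
rho_A = np.einsum('ijkj->ik', rho_AB)   # partial trace over B

# The reduced state is maximally mixed, with entropy ln n
assert np.allclose(rho_A, np.eye(n) / n)
evals = np.linalg.eigvalsh(rho_A)
assert np.isclose(-np.sum(evals * np.log(evals)), np.log(n))
```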
2.3.5 Example: Bloch Sphere
Consider a two-dimensional quantum system (a qubit), and the Pauli matrices
$$\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.38}$$
The set of Hermitian matrices with trace one is completely parametrised with three real numbers $r_1, r_2, r_3$ as:
$$\rho = \frac{1}{2}\sigma_0 + \frac{1}{2}(r_1\sigma_1 + r_2\sigma_2 + r_3\sigma_3) = \frac{1}{2}\begin{pmatrix} 1 + r_3 & r_1 - ir_2 \\ r_1 + ir_2 & 1 - r_3 \end{pmatrix}, \quad r_1, r_2, r_3 \in \mathbb{R}. \tag{2.39}$$
Moreover, for the operator $\rho$ to be positive, its eigenvalues $\lambda_1, \lambda_2$ need to be non-negative. Since we already guaranteed that they sum to one, they cannot both be negative. Therefore, the only extra condition required is that their product be non-negative, that is,
$$\det\rho = \lambda_1\lambda_2 = \frac{1}{4}\big(1 - r_3^2 - (r_1^2 - (ir_2)^2)\big) = \frac{1}{4}(1 - r_1^2 - r_2^2 - r_3^2) = \frac{1}{4}(1 - \|\vec r\|^2) \geq 0. \tag{2.40}$$
From this we can see that the set of possible states (density matrices) corresponds via the above parametrisation to the 3-dimensional ball characterised by $\|\vec r\| \leq 1$. This is manifestly a convex set whose extreme points, the pure states, lie on its boundary: the unit sphere.
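The Bloch-ball condition can be tested directly; this NumPy sketch (our own helper `bloch_state`) maps a vector $\vec r$ to the matrix of Equ. (2.39) and checks positivity inside the ball, purity on the sphere, and failure of positivity outside:

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def bloch_state(r):
    """rho = (1/2)(1 + r . sigma); a density matrix iff ||r|| <= 1."""
    return 0.5 * (np.eye(2) + sum(ri * si for ri, si in zip(r, sigma)))

inside = bloch_state([0.3, 0.4, 0.5])     # ||r|| < 1: valid mixed state
surface = bloch_state([0.0, 0.0, 1.0])    # ||r|| = 1: pure state
outside = bloch_state([1.0, 1.0, 0.0])    # ||r|| > 1: not positive

assert np.all(np.linalg.eigvalsh(inside) >= 0)
assert np.allclose(surface @ surface, surface)   # pure: rho^2 = rho
assert np.linalg.eigvalsh(outside).min() < 0
```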
2.4 Generalised propositions
Consider a proposition about a quantum system, represented by an observable with just two distinct eigenvalues $a$ and $b$:
$$A = aP + bQ, \tag{2.41}$$
where $P$ and $Q$ are two orthogonal projectors such that $P + Q = \mathbf{1}$, i.e., $P = \mathbf{1} - Q$. The particular values of $a$ and $b$ do not matter as long as they are distinct. In particular, we could have chosen $a = 1$ and $b = 0$, so that we simply have $A = P$. We can then interpret that, in the outcome of a measurement of $P$, obtaining the eigenvalue 1 signifies that the proposition $P$ is true, and obtaining the eigenvalue 0 signifies that it is false. The converse proposition, its negation, is then simply characterised by the observable $\bar P = \mathbf{1} - P$. A state $\rho$ of the system assigns probabilities to the truth values of these propositions:
$$\text{prob}(P \text{ is true}) = \operatorname{Tr}(\rho P), \tag{2.42}$$
$$\text{prob}(P \text{ is false}) = \operatorname{Tr}(\rho \bar P) = 1 - \operatorname{Tr}(\rho P). \tag{2.43}$$
Like in the previous section, where we introduced uncertainty about our knowledge of the state, one can also add uncertainty to our knowledge of the measurement device. One could imagine that, with probability $p$, our measurement device shows the wrong measurement outcome. This means that we must revise our probability (2.42) as follows:
$$\text{prob}(P \text{ is true}) = (1-p)\operatorname{Tr}(\rho P) + p\operatorname{Tr}(\rho\bar P) \tag{2.44}$$
$$= \operatorname{Tr}\big(\rho\,((1-p)P + p\bar P)\big) \tag{2.45}$$
$$= \operatorname{Tr}(\rho E), \tag{2.46}$$
where we defined the operator
$$E := (1-p)P + p\bar P. \tag{2.47}$$
We see that the operator $E$ is not a projector in general, but it is still positive, $E \geq 0$, and $\operatorname{Tr}(\rho E)$ directly yields the probability of the outcome of a measurement. We can therefore generalize the concept of proposition, allowing propositions to be represented by any operator satisfying $0 \leq E \leq \mathbf{1}$, rather than simply by projectors.
These operators are traditionally called effects. It is important to observe that, unlike for an observable, taking the spectral decomposition of an effect $E$ would have no physical meaning.
Also, we note that projectors are a special type of effect, which we call sharp, because they do not introduce extra uncertainty beyond that represented by the state.
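A quick NumPy sketch (our own numbers) illustrates the noisy effect: $E = (1-p)P + p(\mathbf{1}-P)$ is a valid effect but no longer a projector, and $\operatorname{Tr}(\rho E)$ gives the degraded probability:

```python
import numpy as np

# Sharp proposition P and its noisy version E = (1-p)P + p(1-P)
P = np.diag([1.0, 0.0])
p_err = 0.1
E = (1 - p_err) * P + p_err * (np.eye(2) - P)

# E is an effect (0 <= E <= 1) but no longer a projector
evals = np.linalg.eigvalsh(E)
assert np.all(evals >= 0) and np.all(evals <= 1)
assert not np.allclose(E @ E, E)

# Probability of "P is true" on the state rho = |0><0|
rho = np.diag([1.0, 0.0])
assert np.isclose(np.trace(rho @ E), 1 - p_err)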
Such a theory consists of a set of effects (propositions about the system which can be true or false) and a set of states (an assignment of a probability to each effect). This is essentially a Bayesian point of view on physics, where a state represents an observer's state of knowledge about a system. It may or may not represent an objective attribute of a physical system, depending on whether the observer is actually right, or whether such objective attributes even exist.
The most basic mathematical structure that one usually requires these sets of effects and states to possess is one that gives the ability to take convex combinations of objects. E.g., if $\rho_1$ and $\rho_2$ are two states, then one can form a new valid state as $p\rho_1 + (1-p)\rho_2$, where $p \in [0,1]$. This corresponds to the physical interpretation of a mixture: the observer is uncertain whether the state is $\rho_1$ or $\rho_2$, and attributes probability $p$ to it being $\rho_1$. The same can be done with effects, as in the previous section.
The following definition characterizes special points of the convex set of states:

Definition 2.8.
Let $\Gamma$ be the set of states. Then the element $\rho \in \Gamma$ is extremal, or pure, if $\rho$ does not admit a proper convex combination, i.e., if
$$\rho = p\rho_1 + (1-p)\rho_2 \quad\text{with}\quad 0 < p < 1 \tag{2.50}$$
implies $\rho_1 = \rho_2 = \rho$.
2.5.2 Example: Classical Probability Theory
A classical system with a finite set $\Omega$ of pure states is described by
$$\text{states:}\quad p = \{p_i\}_{i\in\Omega},\quad p_i \geq 0,\quad \sum_i p_i = 1,$$
$$\text{effects:}\quad E = \{e_i\}_{i\in\Omega},\quad 0 \leq e_i \leq 1,$$
where a state assigns to the effect $E$ the probability $p(E \text{ is true}) = \sum_i p_i e_i$.
The states are characterized by a probability distribution on the set $\Omega$. Extremal states are given by $p_i = \delta_{ij}$ for some $j \in \Omega$, and are hence one-to-one with elements of $\Omega$.
Normally, propositions are characterised as subsets $\omega \subseteq \Omega$: namely, those pure states for which the proposition is true. These correspond to the effects associating 1 with elements of $\omega$ and 0 with elements not in $\omega$. They are the extremal points of the full set of effects.
For example, let $\Omega = \mathbb{Z}$. If this system characterises a random variable called $X$, then the proposition "$X$ is positive" corresponds to the effect $E = \{e_i\}_{i\in\mathbb{Z}}$ with
$$e_i = \begin{cases} 1 & \text{if } i \geq 0, \\ 0 & \text{otherwise,} \end{cases} \tag{2.51}$$
or, more generally, the proposition "$X$ belongs to $\omega$" for some subset $\omega \subseteq \Omega$ corresponds to the effect
$$e_i = \begin{cases} 1 & \text{if } i \in \omega, \\ 0 & \text{otherwise.} \end{cases} \tag{2.52}$$
This classical state/effect pair can also be represented on a quantum system. Fixing an orthonormal basis $\{|i\rangle\}$, define
$$\rho = \sum_i p_i |i\rangle\langle i| \quad\text{and}\quad E = \sum_i e_i |i\rangle\langle i|.$$
Therefore, this amounts to a valid state/effect configuration of a quantum system, where the probability assigned to the effect is exactly the same as in the classical setting: $p(E \text{ is true}) = \operatorname{Tr}(\rho E) = \sum_i p_i e_i$.
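A NumPy sketch (our own small example on $\Omega = \{0,\ldots,4\}$) confirms that the diagonal embedding reproduces the classical probability:

```python
import numpy as np

# Classical state p and effect e, embedded as diagonal matrices
p = np.array([0.1, 0.2, 0.3, 0.25, 0.15])
e = np.array([1.0, 1.0, 0.0, 0.0, 1.0])   # proposition "i in {0, 1, 4}"

rho = np.diag(p)
E = np.diag(e)

# Quantum and classical probability assignments agree
assert np.isclose(np.trace(rho @ E), np.dot(p, e))
```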
This definition is such that the set of positive elements is a cone, namely a set closed under linear combinations with positive scalars. In turn, this makes the relation $a \leq b$, defined by $b - a \geq 0$, into a partial order.
It is not hard to see that for matrices, with $a^* := a^\dagger$, this condition is equivalent to Definition 2.2. For instance, if $A$ is positive as a matrix, we have seen that $A = \sum_i a_i |i\rangle\langle i|$ with $a_i \geq 0$, so that $A = B^\dagger B$ with $B = \sqrt{A}$.

Theorem 2.2. Any linear functional $f$ on a matrix algebra can be written as
$$f(A) = \operatorname{Tr}(RA) \tag{2.53}$$
for some matrix $R$.

Proof. Since $\mathbf{1} = \sum_i |i\rangle\langle i|$, we have $A = \sum_{ij} \langle i|A|j\rangle\, |i\rangle\langle j|$. Therefore, $f(A) = \sum_{i,j} \langle i|A|j\rangle\, f(|i\rangle\langle j|)$. If we define $R := \sum_{ij} f(|i\rangle\langle j|)\, |j\rangle\langle i|$, then $f(|i\rangle\langle j|) = \langle j|R|i\rangle$, and $f(A) = \sum_{i,j} \langle i|A|j\rangle\langle j|R|i\rangle = \operatorname{Tr}(RA)$.
Hence, the dual (that is, the set of linear functionals) of a finite-dimensional algebra is isomorphic (as a linear space) to the algebra. This fact does not translate to the infinite-dimensional setting.
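The construction of $R$ from $f$ in the proof above is concrete enough to run; this NumPy sketch (our own, with a randomly generated functional) builds $R$ from the values $f(|i\rangle\langle j|)$ and checks $f(A) = \operatorname{Tr}(RA)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
F = rng.standard_normal((n, n))   # f(|i><j|) stored as F[i, j]

def f(A):
    # An arbitrary linear functional, defined by linearity from the matrix units
    return sum(A[i, j] * F[i, j] for i in range(n) for j in range(n))

# R := sum_{ij} f(|i><j|) |j><i|, i.e., R = F transposed
R = F.T
A = rng.standard_normal((n, n))
assert np.isclose(f(A), np.trace(R @ A))
```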
The previously defined probability theories can in fact be defined in this way from $*$-algebras. For instance, a classical system with a finite set $\Omega$ of pure states corresponds to the commutative algebra of functions $f : \Omega \to \mathbb{C}$. Writing such a function as the diagonal matrix $\sum_x f(x)|x\rangle\langle x|$, where the vectors $|x\rangle$ form an orthonormal basis, we see that the matrix product and other operations are the ones above.
We will also need to consider classical systems with continuously many pure states. This can be formalised in several ways. Here, we will represent this situation by a measurable set $\Omega$, i.e., a set with a concept of volume, or measure, for most subsets. A typical example will be $\Omega = \mathbb{R}$. The algebra that we will consider,
$$L(\Omega) \equiv L^\infty(\Omega),$$
is the commutative algebra of bounded functions $f : \Omega \to \mathbb{C}$, i.e., such that there is a constant $C$ with $|f(x)| < C$ for all $x$ except possibly on a subset of measure zero. $L^\infty(\Omega)$ is the natural classical counterpart of $\mathcal{B}(\mathcal{H})$.
In addition to the $*$-algebra structure described above, one normally also requires a norm $\|\cdot\|$ to be defined (and finite) on every element of the algebra, which respects compatibility conditions with respect to the product and $*$ operations. If also $\|xy\| \leq \|x\|\|y\|$ for all $x, y \in \mathcal{A}$ (and $\mathcal{A}$ is complete in the norm), then $\mathcal{A}$ is a Banach algebra. If, moreover,
$$\|x^* x\| = \|x\|\|x^*\| \quad\text{for all } x \in \mathcal{A}, \tag{2.57}$$
then $\mathcal{A}$ is a C*-algebra, which is the most popular algebraic framework for quantum theory. We define this important concept here for the reader's benefit, but we will continue working with concrete operators on Hilbert spaces for simplicity in this document.
A typical example is given by the set of bounded linear operators $\mathcal{B}(\mathcal{H})$ on a Hilbert space $\mathcal{H}$, where the norm is the usual operator norm, whose square is given by
$$\|A\|^2 = \sup_{\psi\in\mathcal{H}} \frac{\langle\psi|A^\dagger A|\psi\rangle}{\langle\psi|\psi\rangle}.$$
We see that the requirement of a finite norm eliminates unbounded operators, such as most quantum-mechanical observables (position, momentum, energy). However, this is not a problem, since, as seen above, the interpretational content of these observables can be replaced by elementary effects (in this case projectors), which are always bounded. We will come back to this point later, when we need to talk about infinite-dimensional Hilbert spaces.
The above axioms can be justified by the fact that they precisely capture the nature of algebras of linear operators:

Theorem 2.3. Any C*-algebra is isomorphic to a norm-closed $*$-subalgebra of $\mathcal{B}(\mathcal{H})$ for some Hilbert space $\mathcal{H}$.

The proof is beyond the scope of this document. Instead, we will focus on concrete $*$-algebras of matrices.
However, contrary to the infinite-dimensional case, we will not need to make use of the norm for characterising these algebras.
The smallest such algebra that one can build is the algebra $\operatorname{alg}(A)$ generated by a single self-adjoint matrix $A$, i.e., all linear combinations of (non-zero) natural powers of $A$:
$$\operatorname{alg}(A) = \Big\{\sum_{n\geq 1} c_n A^n \;\Big|\; c_n \in \mathbb{C}\Big\}.$$
This algebra is commutative and unital, and can be characterised as follows:

Theorem 2.4. If $A$ is a self-adjoint matrix with spectral decomposition $A = \sum_i a_i P_i$ (where the $a_i \neq 0$ are the distinct non-zero eigenvalues of $A$, and $P_1, \ldots, P_m$ are orthogonal projectors), then
$$\operatorname{alg}(A) = \operatorname{span}(P_1, \ldots, P_m) = \Big\{\sum_{i=1}^m c_i P_i \;\Big|\; c_i \in \mathbb{C}\Big\},$$
namely, the $*$-algebra generated by $A$ is equal to the span of the spectral projectors of $A$. Explicitly,
$$P_j = \prod_{i\neq j} \frac{A - a_i \mathbf{1}_A}{a_j - a_i},$$
where $\mathbf{1}_A = A^0 = \sum_{i=1}^m P_i$ is the unit of the algebra.
It is clear that any power of $A$ is inside the space spanned by the projectors $P_i$, since $A^n = \sum_i a_i^n P_i$. What we have to show, however, is that the projectors $P_i$ can be written as polynomials in $A$. But first, we need to show that $\operatorname{alg}(A)$ contains the unit element $A^0$. Since the set of eigenvectors of $A$ is complete, we have $\prod_i (a_i \mathbf{1} - A)\,A = 0$. Indeed, this polynomial sends every eigenvector to zero. Multiplying by the generalised inverse of $A$, we obtain $\prod_i (a_i A^0 - A) = 0$. Since $a_i \neq 0$ for all $i$, we can simply solve for $A^0$. We can then obtain the projectors explicitly:
$$\prod_{i\neq j} \frac{A - a_i \mathbf{1}_A}{a_j - a_i} = \sum_k \Big(\prod_{i\neq j} \frac{a_k - a_i}{a_j - a_i}\Big) P_k = \sum_k \delta_{kj} P_k = P_j,$$
where we used $A = \sum_k a_k P_k$ and $\mathbf{1}_A = \sum_k P_k$, together with the fact that the product $\prod_{i\neq j} (a_k - a_i)/(a_j - a_i)$ vanishes for $k \neq j$ (the factor with $i = k$ is zero) and equals 1 for $k = j$.
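The explicit polynomial formula for the spectral projectors is easy to test numerically; this NumPy sketch (the helper `spectral_projector` is our own) evaluates $P_j = \prod_{i\neq j}(A - a_i\mathbf{1})/(a_j - a_i)$ on a matrix with a degenerate eigenvalue:

```python
import numpy as np

# Matrix with non-zero eigenvalues a_1 = 2 (twice) and a_2 = 5
A = np.diag([2.0, 2.0, 5.0])
a = [2.0, 5.0]

def spectral_projector(A, a, j):
    """P_j = prod_{i != j} (A - a_i 1) / (a_j - a_i)."""
    n = A.shape[0]
    P = np.eye(n)
    for i, ai in enumerate(a):
        if i != j:
            P = P @ (A - ai * np.eye(n)) / (a[j] - ai)
    return P

P1 = spectral_projector(A, a, 0)
P2 = spectral_projector(A, a, 1)
assert np.allclose(P1, np.diag([1.0, 1.0, 0.0]))
assert np.allclose(P2, np.diag([0.0, 0.0, 1.0]))
assert np.allclose(a[0] * P1 + a[1] * P2, A)   # spectral decomposition
```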
Theorem 2.5. Any $*$-algebra $\mathcal{A}$ of complex matrices is spanned by its projectors:
$$\mathcal{A} = \operatorname{span}\{P \in \mathcal{A} : P^2 = P = P^\dagger\}. \tag{2.58}$$

Proof. Given any matrix $B \in \mathcal{A}$, we can decompose it as $B = \operatorname{Re}(B) + i\operatorname{Im}(B)$, where
$$\operatorname{Re}(B) := \frac{B + B^\dagger}{2} \quad\text{and}\quad \operatorname{Im}(B) := \frac{B - B^\dagger}{2i}.$$
$\operatorname{Re}(B)$ and $\operatorname{Im}(B)$ are both self-adjoint by definition, and manifestly inside the same algebra $\mathcal{A}$. Moreover, Theorem 2.4 showed that the spectral projectors of these matrices are also within $\mathcal{A}$, since it must contain $\operatorname{alg}(\operatorname{Re}(B))$ and $\operatorname{alg}(\operatorname{Im}(B))$. Hence, if we write their spectral decompositions as $\operatorname{Re}(B) = \sum_i b_i P_i$ and $\operatorname{Im}(B) = \sum_j b'_j Q_j$, then we obtain
$$B = \sum_i b_i P_i + i \sum_j b'_j Q_j,$$
where $P_i, Q_j \in \mathcal{A}$. This proves Equ. 2.58. In order to build the unit element, let us write simply $T_i$, $i = 1, \ldots, n$, for the finite set of projectors including the $P_i$'s and $Q_i$'s that we just built. These matrices span the whole of $\mathcal{A}$, as we have just seen. From Theorem 2.4, we know that the projector $P := (\sum_i T_i)^0$ on the range of $\sum_i T_i$ is also inside $\mathcal{A}$, and satisfies $P(\sum_i T_i) = \sum_i T_i$, which implies $\sum_i (\mathbf{1} - P)T_i(\mathbf{1} - P) = 0$. But this is a sum of positive operators; therefore each one of them must be zero: $(\mathbf{1} - P)T_i(\mathbf{1} - P) = 0$ for all $i$, as well as their square roots: $T_i(\mathbf{1} - P) = (\mathbf{1} - P)T_i = 0$ (which can be proven by taking the expectation value of the previous expression in an arbitrary vector). It follows that $T_i P = P T_i = T_i$ for all $i$. Since the matrices $T_i$ span $\mathcal{A}$, this shows that $P \in \mathcal{A}$ is the unit of $\mathcal{A}$.
Note that Equ. 2.58 is also true in any $*$-algebra of operators on a Hilbert space, even an infinite-dimensional one, provided that the algebra is closed in the weak operator topology, i.e., that if a sequence of operators $A_n$ is such that $\langle\psi|A_n|\psi\rangle$ converges to $\langle\psi|A|\psi\rangle$ for all vectors $|\psi\rangle$, then $A$ is also inside the algebra. These algebras are special types of C*-algebras called von Neumann algebras. The span of the projectors must then also be closed in that topology. There are, however, many non-trivial C*-algebras which contain no projector besides the identity and zero elements.
The unital algebra $\operatorname{alg}(A)$ generated by a single self-adjoint matrix $A$ is the prototype of a commuting matrix algebra. Indeed, all commuting algebras are of this form:

Theorem 2.6. Any commuting $*$-algebra $\mathcal{A}$ of complex matrices is of the form
$$\mathcal{A} = \operatorname{alg}(A) = \operatorname{span}(P_1, \ldots, P_n),$$
where $A$ is a self-adjoint matrix, $P_1, \ldots, P_n$ are a complete set of orthogonal projectors, and $n$ is the dimension of $\mathcal{A}$ (as a linear space).
(2.59)
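As a concrete sanity check of Theorem 2.6, the following numpy sketch (not part of the notes; all names and the example matrix are illustrative) builds the spectral projectors of a self-adjoint matrix numerically and verifies that they are complete, mutually orthogonal, and recover A as Σ_i a_i P_i:

```python
import numpy as np

# Build a random self-adjoint matrix with a degenerate spectrum,
# then recover its spectral projectors P_i numerically.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
A = U @ np.diag([1.0, 1.0, 2.0, 3.0]) @ U.conj().T  # eigenvalue 1 is twofold degenerate

evals, evecs = np.linalg.eigh(A)
# Group eigenvectors by (numerically) equal eigenvalues.
unique_evals = np.unique(np.round(evals, 8))
projectors = []
for a in unique_evals:
    V = evecs[:, np.abs(evals - a) < 1e-8]
    projectors.append(V @ V.conj().T)

# Completeness and orthogonality of the spectral projectors.
assert np.allclose(sum(projectors), np.eye(4))
for i, P in enumerate(projectors):
    assert np.allclose(P @ P, P)           # each P_i is a projector
    assert np.allclose(P, P.conj().T)      # self-adjoint
    for j, Q in enumerate(projectors):
        if i != j:
            assert np.allclose(P @ Q, 0)   # mutually orthogonal

# A is recovered as sum_i a_i P_i, so alg(A) = span(P_1, ..., P_n).
recon = sum(a * P for a, P in zip(unique_evals, projectors))
assert np.allclose(recon, A)
```

Note that degenerate eigenvalues must be grouped (here via rounding) so that one projector per distinct eigenvalue is obtained, matching the n of the theorem.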
Hence the center Z(A) consists of those elements of A which commute with all other elements of A. Clearly, Z(A) is a *-algebra: if we take A, B ∈ Z(A), we see that also AB ∈ Z(A). Indeed, for all C ∈ A, ABC = ACB = CAB. Moreover, Z(A) is clearly commutative (abelian), i.e., for all A, B ∈ Z(A), [A, B] = 0.
The structure of the center Z(A) already gives us important information about the structure of a general matrix algebra A. In order to express the resulting property, however, we need to introduce the concept of direct sum for algebras. The algebraic direct sum can be defined concretely as follows: given two square matrices A and B, respectively of sizes n × n and m × m, we define the matrix A ⊕ B of size (n + m) × (n + m) as

    A ⊕ B = [ A  0 ]
            [ 0  B ],

where the 0's represent rectangular matrices of the appropriate sizes with only zero components. It should be clear how this generalises to the direct sum of more than two matrices.
This direct sum can also be defined algebraically on general abstract algebras A and B: A ⊕ B is the direct sum of A and B as linear spaces, equipped with the product defined by

    (A ⊕ B)(A′ ⊕ B′) := (AA′) ⊕ (BB′).

As block matrices, this is simply

    [ A  0 ] [ A′  0 ]   [ AA′  0  ]
    [ 0  B ] [ 0  B′ ] = [ 0   BB′ ].
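The defining product rule of the algebraic direct sum is easy to verify numerically; here is a minimal numpy sketch (the helper `direct_sum` is ours, not from the notes):

```python
import numpy as np

def direct_sum(A, B):
    """Block-diagonal direct sum of two square matrices."""
    n, m = A.shape[0], B.shape[0]
    out = np.zeros((n + m, n + m), dtype=complex)
    out[:n, :n] = A
    out[n:, n:] = B
    return out

rng = np.random.default_rng(1)
A, A2 = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
B, B2 = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))

# The defining property of the algebraic direct sum:
# (A ⊕ B)(A' ⊕ B') = (AA') ⊕ (BB').
lhs = direct_sum(A, B) @ direct_sum(A2, B2)
rhs = direct_sum(A @ A2, B @ B2)
assert np.allclose(lhs, rhs)
```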
With the help of this concept, we obtain the first important piece of information about general matrix algebras:

Lemma 2.1. For any *-algebra A of complex matrices, there is a unitary matrix U, and smaller matrix algebras A_i with Z(A_i) = C 1, such that

    U A U† = A_1 ⊕ ⋯ ⊕ A_n ⊕ 0.
Proof. Theorem 2.6 tells us that there exists a minimal set of orthogonal projectors {P_i}_{i=1}^n such that Z(A) = span(P_1, …, P_n). Note that the sum of all projectors is the identity of A, i.e., Σ_i P_i = 1_A. This matrix 1_A should not be confused with the identity matrix: it could in general be any arbitrary projector.
Due to the fact that the P_i commute with every element of the algebra, we obtain the following decomposition of any element A ∈ A:

    A = 1_A A = Σ_i P_i A = Σ_i P_i P_i A = Σ_i P_i A P_i ≃ A_1 ⊕ A_2 ⊕ ⋯ ⊕ A_n ⊕ 0,    (2.60)

for some smaller matrices A_i whose dimensions sum to the total dimension of the projector 1_A. The symbol ≃ means that the two matrices are equal up to a change of orthonormal basis.
In this basis, we have

    1_A ≃ 1_{d_1} ⊕ ⋯ ⊕ 1_{d_n} ⊕ 0_{d_{n+1}},

where 1_{d_i} denotes the identity matrix of dimension d_i = Tr P_i, and 0_{d_{n+1}} is the zero square matrix of dimension

    d_{n+1} := d − Σ_{i=1}^n d_i,

where d is the dimension of the original matrices in A, such as A. This is just the identity matrix except for a few trailing zeros on the diagonal (d_{n+1} of them).
Also, the smaller matrices A_i are defined as representing the non-trivial parts of the matrices P_i A P_i in the new basis:

    P_i A P_i ≃ 0_{d_1 + ⋯ + d_{i−1}} ⊕ A_i ⊕ 0_{d_{i+1} + ⋯ + d_{n+1}}.
Hence, in this special basis we have, in block matrix notation,

    A ≃ [ A_1              ]
        [      A_2         ]
        [           ⋱      ]
        [             A_n  ]
        [                0 ]  =  A_1 ⊕ A_2 ⊕ ⋯ ⊕ A_n ⊕ 0,

where the blank spots are filled with zeros, and 0 represents the zero square matrix of dimension d_{n+1}.
Because the projectors P_i only depend on the algebra A, not on the chosen element A, all elements of A have this particular form when expressed in that specific basis (but with different values of the small matrices A_i). In fact, it is easy to see that the allowed values of the matrices A_i must themselves form a *-algebra, which we call A_i, so that we obtain

    A ≃ A_1 ⊕ ⋯ ⊕ A_n ⊕ 0.
In order to completely elucidate the structure of A, we still need to understand the structures
of the smaller algebras Ai . What distinguishes them from a general matrix algebra like A is the
fact that their center is trivial, i.e., it consists of all multiples of the identity: Z(Ai ) = C1.
Indeed, if there were other elements in the center of Ai , there would be a finer decomposition
of the minimal set of projectors of the center. Such a matrix algebra with a trivial center is
called a factor.
We now show that such a matrix algebra factor always consists of all matrices of the form 1 ⊗ A in some basis:

Lemma 2.2. Let A be a *-algebra of d-dimensional complex matrices with trivial center (i.e., a factor): Z(A) = C 1_A. Then there is a unitary matrix U such that

    U A U† = 1_m ⊗ B(C^n),    (2.61)

with mn = d.
Proof. Our proof is adapted from Ref. [2]. Given any nonzero element A ∈ A, consider the set I_A := {XAY : X, Y ∈ A}. The set I_A is an ideal, i.e., for all B ∈ A and A′ ∈ I_A, A′B ∈ I_A and BA′ ∈ I_A. Let us show that I_A contains the identity matrix 1 (which implies I_A = A). It is clear that I_A is closed under multiplication and under the * operation, and hence forms a matrix algebra. From the construction of the unit following Equ. 2.58, we know that I_A contains a unit P ∈ I_A. This implies that for all B ∈ A, P(BP) = BP and (PB)P = PB, since BP, PB ∈ I_A. Hence PB = PBP = BP, so P ∈ Z(A). But since A is a factor, this implies P = 1 ∈ I_A.
In other words, for any A ∈ A, 1 = Σ_i X_i A Y_i for some X_i, Y_i ∈ A. Hence, for all A, B ∈ A, B = Σ_i B X_i A Y_i. If B ≠ 0, this implies that at least one of the terms in the sum must be nonzero: B X_i A Y_i ≠ 0, which implies B X_i A ≠ 0. We have therefore shown that for all nonzero A, B ∈ A there exists X ∈ A such that BXA ≠ 0. Although this may not seem very impressive, this is the essential property of A that we need.
Now let us consider a maximal commutative *-subalgebra C of A. Maximality means that if B ∈ A is such that [B, A] = 0 for all A ∈ C, then B ∈ C. Such a subalgebra exists because we can build it by progressively adding any such element and its adjoint B† to a given commutative subalgebra until none is left.
From theorem 2.6, we know that C = span(P_1, …, P_n), where the P_i are a complete family of orthogonal projectors. We have just shown above that for every pair i, j, there exists an X_ij ∈ A such that

    F_ij := P_i X_ij P_j ≠ 0.

Observe that F_ij† F_ij commutes with every element of C and is therefore contained in C due to its maximality. This means that it is a linear combination of the projectors P_k; but its product with P_k, k ≠ j, is zero, hence

    F_ij† F_ij = P_j X_ij† P_i X_ij P_j = λ_ij P_j

for some λ_ij > 0. In particular, this implies that all the P_i's project on spaces of the same dimension m, since F_ij† F_ij maps all vectors in P_j C^d via P_i C^d back to P_j C^d without sending any to zero. Therefore, we can decompose C^d into C^m ⊗ C^n with an orthonormal basis |j⟩ ⊗ |i⟩ such that

    P_i = Σ_{j=1}^m |j⟩⟨j| ⊗ |i⟩⟨i| = 1_m ⊗ |i⟩⟨i|.

This yields F_ij = Y_ij ⊗ |i⟩⟨j|, where Y_ij = (1 ⊗ ⟨i|) X_ij (1 ⊗ |j⟩). But then F_ij† F_ij = Y_ij† Y_ij ⊗ |j⟩⟨j| = λ_ij 1 ⊗ |j⟩⟨j|, which implies Y_ij† Y_ij = λ_ij 1, and hence (absorbing the resulting unitaries into the choice of basis)

    F_ij = √λ_ij · 1 ⊗ |i⟩⟨j|.
It only remains to show that the matrices F_ij span A. Consider any A ∈ A. By the same maximality argument as above, P_i M P_i commutes with all the P_k and hence lies in C, so that P_i M P_i ∝ P_i for any M ∈ A. In particular, P_i A P_j X_ij† P_i ∝ P_i. It follows that

    P_i A P_j = λ_ij⁻¹ P_i A P_j F_ij† F_ij = λ_ij⁻¹ (P_i A P_j X_ij† P_i) X_ij P_j ∝ P_i X_ij P_j = F_ij.

This implies that there are linear functionals f_ij : A → C such that, for all A ∈ A,

    P_i A P_j = f_ij(A) F_ij.

Recalling Σ_i P_i = 1, we conclude that A = Σ_ij P_i A P_j = Σ_ij f_ij(A) F_ij.
It is now straightforward to combine Lemmas 2.1 and 2.2 in order to completely elucidate the structure of general matrix algebras:

Theorem 2.7. For any *-subalgebra A of B(C^d), there is a unitary matrix U such that

    U A U† = [ ⊕_{i=1}^N 1_{m_i} ⊗ B(C^{n_i}) ] ⊕ 0,    (2.62)

where Σ_{i=1}^N n_i m_i + d_0 = d, with d_0 the dimension of the trailing zero block.
Equivalently, this means that the elements of A are precisely those matrices of the form

    A = U† (1_{m_1} ⊗ A_1 ⊕ ⋯ ⊕ 1_{m_N} ⊗ A_N ⊕ 0) U,    (2.63)

for arbitrary A_i ∈ B(C^{n_i}), while the elements of the center Z(A) are those of the form

    A = U† (λ_1 1_{m_1} ⊗ 1_{n_1} ⊕ ⋯ ⊕ λ_N 1_{m_N} ⊗ 1_{n_N} ⊕ 0) U,    (2.64)

with λ_i ∈ C. Below we will often use the shorthand

    A = ⊕_i 1 ⊗ A_i    (2.65)

for a generic element of A, according to the decomposition given by Theorem 2.7. For conciseness, in this notation we leave implicit the possible trailing zero block. Also, one must remember that this is block-diagonal only in a basis which may not be the canonical one.
It is easy to see that A† = A if and only if A_i† = A_i for each i. Moreover, by writing A in diagonal form, we immediately see that the eigenvalues of A all lie in the interval [0, 1] if and only if that is the case also for each A_i. Therefore, we conclude that the effects 0 ≤ A ≤ 1 of A are precisely the operators of the form ⊕_i 1 ⊗ A_i where 0 ≤ A_i ≤ 1 for each i.
From theorem 2.2, we know that the states can be represented by matrices R as the functionals A ↦ Tr(RA) on effects A. But since the set of effects is restricted, two matrices R and R′ may actually represent the same functional, i.e., Tr(RA) = Tr(R′A) for all A ∈ A.
Let us find a special matrix R giving a unique representation of the functional. Let P_1, …, P_N be the projectors spanning the center of A. We know that the elements of A satisfy A = Σ_i P_i A P_i. Therefore,

    Tr(RA) = Σ_i Tr(R P_i A P_i) = Σ_i Tr(P_i R P_i A),

so that R can be replaced by Σ_i P_i R P_i =: Σ_i R_i without changing the functional. Moreover, writing a generic effect as A = ⊕_i 1 ⊗ A_i, we have Tr(R_i (1 ⊗ A_i)) = Tr(σ_i A_i), where σ_i is the partial trace of R_i over the first (multiplicity) factor; writing σ_i = p_i ρ_i with Tr ρ_i = 1, we obtain that the same functional is uniquely represented by a matrix of the form

    R = ⊕_i p_i (1_{m_i}/m_i) ⊗ ρ_i,    (2.66)

where the p_i form a probability distribution and each ρ_i is a density matrix. This is a unique representation for a state of the theory defined by the algebra A.
We make two observations. The first is that R is also a density matrix on the quantum system defined by the full matrix algebra of which A is a subalgebra. Therefore, restricting the quantum effects to those of the subalgebra A is equivalent to restricting the density matrices to A as well. The second observation is that a state of the system defined by A is represented by a probability distribution p_1, …, p_N, hence a classical system, together with a set of quantum states ρ_i.
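The state/effect pairing for a general matrix algebra can be illustrated numerically. The following numpy sketch (illustrative only; the block sizes, probabilities and states are arbitrary choices, and the helpers are ours) builds a state of the form (2.66) and an effect ⊕_i 1 ⊗ A_i, and checks that their pairing reduces to the classical mixture Σ_i p_i Tr(ρ_i A_i):

```python
import numpy as np

def block_diag(blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    d = sum(b.shape[0] for b in blocks)
    out = np.zeros((d, d), dtype=complex)
    k = 0
    for b in blocks:
        s = b.shape[0]
        out[k:k+s, k:k+s] = b
        k += s
    return out

def dm(psi):
    """Density matrix of a (normalized) pure state vector."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

# Algebra ⊕_i 1_{m_i} ⊗ B(C^{n_i}) with multiplicities m and dimensions n = (2, 3).
m = (2, 1)
p = (0.3, 0.7)                         # probability distribution over the blocks
rhos = [dm([1, 1]), dm([1, 0, 1j])]    # one density matrix ρ_i per block

# State: R = ⊕_i p_i (1_{m_i}/m_i) ⊗ ρ_i   (cf. Equ. 2.66)
R = block_diag([p_i * np.kron(np.eye(m_i) / m_i, rho)
                for p_i, m_i, rho in zip(p, m, rhos)])
assert np.isclose(np.trace(R).real, 1.0)
assert np.all(np.linalg.eigvalsh(R) >= -1e-12)   # R is a density matrix

# Effect: A = ⊕_i 1_{m_i} ⊗ A_i with 0 ≤ A_i ≤ 1.
As = [np.diag([0.2, 0.9]), np.diag([0.5, 0.0, 1.0])]
A = block_diag([np.kron(np.eye(m_i), Ai) for m_i, Ai in zip(m, As)])

# The pairing reduces to a classical mixture of quantum expectation values.
expected = sum(p_i * np.trace(rho @ Ai).real for p_i, rho, Ai in zip(p, rhos, As))
assert np.isclose(np.trace(R @ A).real, expected)
```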
Importantly, there are canonical embeddings of a single algebra into the tensored space, which allow one to think of each algebra A_i as a subalgebra of A_1 ⊗ A_2, via the homomorphisms

    A_1 → A_1 ⊗ A_2,  A ↦ A ⊗ 1,    (2.67)
    A_2 → A_1 ⊗ A_2,  A ↦ 1 ⊗ A.    (2.68)
In the case of matrix algebras, which are *-subalgebras of full matrix algebras, the tensor product is precisely the same as the quantum one, which is also called the Kronecker product, e.g.,

    [ a_11 … a_1n ]       [ a_11 B … a_1n B ]
    [  ⋮   ⋱  ⋮  ] ⊗ B = [   ⋮    ⋱   ⋮   ]
    [ a_n1 … a_nn ]       [ a_n1 B … a_nn B ],

where a_ij are the components of a matrix, and B represents another matrix.
For instance, if we compose two finite-dimensional quantum systems A_1 = B(H_1) and A_2 = B(H_2), the tensor product is isomorphic to the bounded operators on the tensor product of the Hilbert spaces: A_1 ⊗ A_2 ≅ B(H_1 ⊗ H_2).
Recall that a classical system associated with the finite set of pure states (sample space) Ω corresponds to the commutative algebra of functions A = L(Ω) on Ω, namely the space of vectors with elements indexed by elements of Ω, equipped with the component-wise product. Taking the tensor product of two classical systems A_1 = L(Ω_1) and A_2 = L(Ω_2) results in the set of functions on the cartesian product of the sample spaces: A_1 ⊗ A_2 ≅ L(Ω_1 × Ω_2). Alternatively, one may also view these commutative algebras as sets of diagonal matrices, with diagonal elements indexed by Ω. The Kronecker product then gives the same result.
A natural hybrid system is then obtained by considering the composition of a classical system A_1 = L(Ω) and a quantum system A_2 = B(H). The easiest way to see what happens is, again, to represent L(Ω) as a set of diagonal matrices. Using the Kronecker product, one then finds that the tensor product yields a block-diagonal algebra

    L(Ω) ⊗ B(H) ≅ B(H) ⊕ ⋯ ⊕ B(H),

where each factor B(H) on the right-hand side of the equation is indexed by an element of Ω.
For instance, a completely uncorrelated state on A_1 ⊗ A_2, with classical part {p_i}_{i=1}^N and quantum part ρ, has the form

    diag(p_1, …, p_N) ⊗ ρ ≃ p_1 ρ ⊕ ⋯ ⊕ p_N ρ = Σ_i p_i |i⟩⟨i| ⊗ ρ,

where the orthogonal vector states |i⟩ just serve for the matrix representation of the classical state {p_i}_{i=1}^N as Σ_i p_i |i⟩⟨i|.
Similarly, a general correlated classical/quantum state has the form

    p_1 ρ_1 ⊕ ⋯ ⊕ p_N ρ_N = Σ_{i=1}^N p_i |i⟩⟨i| ⊗ ρ_i.

This is a special case of the general expression Equ. (2.66), where |i⟩⟨i| is simply the identity operator on a one-dimensional factor, and where each state ρ_i lives in a Hilbert space of the same dimension.
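A short numpy illustration of this block-diagonal structure (all matrices here are arbitrary examples chosen for the sketch):

```python
import numpy as np

# A classical algebra element f ∈ L(Ω), Ω = {1, 2}, as a diagonal matrix,
# tensored (Kronecker product) with a quantum operator B ∈ B(C^2):
f = np.diag([2.0, 5.0])
B = np.array([[0, 1], [1, 0]], dtype=float)

X = np.kron(f, B)
# The result is block diagonal: one copy f(ω)·B per classical point ω.
assert np.allclose(X[:2, :2], 2.0 * B)
assert np.allclose(X[2:, 2:], 5.0 * B)
assert np.allclose(X[:2, 2:], 0) and np.allclose(X[2:, :2], 0)

# A correlated classical/quantum state Σ_i p_i |i><i| ⊗ ρ_i is likewise
# block diagonal, with blocks p_i ρ_i.
p = [0.25, 0.75]
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho1 = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
state = sum(p_i * np.kron(np.diag(np.eye(2)[i]), rho)
            for i, (p_i, rho) in enumerate(zip(p, [rho0, rho1])))
assert np.allclose(state[:2, :2], 0.25 * rho0)
assert np.allclose(state[2:, 2:], 0.75 * rho1)
assert np.isclose(np.trace(state).real, 1.0)
```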
3 Channels
We want to identify the most general way of representing a transfer of information from a system represented by some algebra A_1 to another system represented by A_2. This can be done in two ways: either one maps states to states (Schrödinger picture), or effects to effects (Heisenberg picture). In infinite dimensions, the Heisenberg picture is more general. However, we begin with the Schrödinger picture because it is somewhat more intuitive.
Mathematically, a map preserving all the structure of A_1 is given by the following definition:

Definition 3.1 (*-homomorphism).
Let A_1 and A_2 be *-algebras. A map φ : A_1 → A_2 is referred to as a *-homomorphism (or *-algebra morphism) if it satisfies
a) φ is linear,
b) φ(AB) = φ(A)φ(B),
c) φ(A†) = φ(A)†.
However, these conditions are too strong for our purpose. Indeed, we only want to preserve
the state/effect structure. This structure does rely on the algebra product, but only through
the partial order defined via the concept of positivity. Indeed, recall that positivity is defined
using the product (Definition 2.11). But, as we will see, the requirement that a map preserves
this partial order is much weaker than requiring that it preserves the product structure.
namely, the image of ρ_i under E ought to be the state E(ρ_i) with probability p_i. Since the set of density matrices for the system represented by A_1 spans the whole of A_1, this map can be naturally extended to a linear map

    E : A_1 → A_2.

Moreover, for the image of any density matrix to also be a density matrix, the map E must send positive matrices to positive matrices (in which case we simply say that E is itself positive), and it must also preserve the trace of matrices:

    X ≥ 0 ⇒ E(X) ≥ 0,  and  Tr(E(X)) = Tr(X).
But there is an extra, more subtle consequence of the positivity condition. Namely, E should also respect all those conditions if we see the system as part of a larger one; that is, if we add any system B(H) on which E acts trivially. On the larger system, this is represented by the map (E ⊗ id) : A_1 ⊗ B(H) → A_2 ⊗ B(H) defined by

    (E ⊗ id)(X ⊗ Y) = E(X) ⊗ Y.
Classically, one would expect that the positivity of E implies that of E ⊗ id. Indeed, if the algebra A_1 is commutative, then a general operator in A_1 ⊗ B(H) is of the form

    X = Σ_i |i⟩⟨i| ⊗ X_i,

where X_i ∈ B(H). Since the eigenvalues of X are just those of the X_i's, it is easy to see that X ≥ 0 if and only if X_i ≥ 0 for all i. We then have

    (E ⊗ id)(X) = Σ_i E(|i⟩⟨i|) ⊗ X_i.

Since E(|i⟩⟨i|) and X_i are positive, so is the right-hand side of this equation. This shows that, when A_1 is commutative, and hence represents a classical system, the positivity of E implies that of E ⊗ id. However, the following example shows that this fails when A_1 is quantum:
Consider the transpose map E(X) = Xᵀ, which is positive, since transposition preserves eigenvalues. Applying E ⊗ 1 to the maximally entangled state |Ω⟩ = (1/√d) Σ_i |i⟩ ⊗ |i⟩, we find

    (E ⊗ 1)(|Ω⟩⟨Ω|) = (E ⊗ 1)( (1/d) Σ_ij |i⟩⟨j| ⊗ |i⟩⟨j| ) = (1/d) Σ_ij |j⟩⟨i| ⊗ |i⟩⟨j| =: σ.    (3.69)

But the operator σ is not positive. Indeed, consider the case d = 2, and the state |Ψ⟩ = (1/√2)(|01⟩ − |10⟩). A direct calculation shows that

    σ|Ψ⟩ = −(1/2)|Ψ⟩,

from which it follows that ⟨Ψ|σ|Ψ⟩ = −1/2 < 0.
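This failure of complete positivity is easy to reproduce numerically. The following numpy sketch (illustrative, with the conventions of (3.69) and d = 2) applies the transpose to the first factor of the maximally entangled state and exhibits the negative eigenvalue:

```python
import numpy as np

d = 2
# Maximally entangled state |Ω> = (1/√d) Σ_i |i>|i>.
Omega = np.eye(d).reshape(d * d) / np.sqrt(d)
rho = np.outer(Omega, Omega.conj())

# Transposition alone is positive: it preserves eigenvalues.
assert np.all(np.linalg.eigvalsh(rho.T) >= -1e-12)

# Apply transposition to the first factor only (partial transpose):
# σ[i,j,k,l] = ρ[k,j,i,l].
sigma = rho.reshape(d, d, d, d).transpose(2, 1, 0, 3).reshape(d * d, d * d)

evals = np.linalg.eigvalsh(sigma)
assert np.isclose(evals.min(), -0.5)   # negative eigenvalue: σ is not positive

# The eigenvector with negative eigenvalue is the singlet (|01> - |10>)/√2.
psi = np.array([0, 1, -1, 0]) / np.sqrt(2)
assert np.isclose(psi @ sigma @ psi, -0.5)
```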
We conclude that positivity is not a sufficient criterion for a map to fit into the state/effect
formalism of quantum systems. We therefore need to extend this notion to that of complete
positivity:
Definition 3.2 (Complete Positivity).
A linear map E : A_1 → A_2 is completely positive iff E ⊗ id_n : A_1 ⊗ B(C^n) → A_2 ⊗ B(C^n) is positive for all n, i.e.,

    ∀ n ∈ N : X ≥ 0 ⇒ (E ⊗ id_n)(X) ≥ 0.    (3.70)

This gives us all the tools to define a general channel in the Schrödinger picture:
Definition 3.3 (Channel).
A channel represents a general transfer of information from a system represented by the algebra A_1 to that represented by A_2. When these are matrix algebras, it can be represented by an arbitrary linear, completely positive, trace-preserving (CPTP) map E : A_1 → A_2, meant to be applied to states (Schrödinger picture).
In what follows we give some simple examples of channels.
    (E ⊗ id)(X) = Σ_i (E_i ⊗ 1) X (E_i ⊗ 1)†.

The right-hand side is manifestly positive whenever X ≥ 0, which completes the proof that E is a channel.
An example of a *-homomorphism is that induced by a unitary operator U (defined by the property U†U = UU† = 1), through

    E(ρ) = U ρ U†.

In fact, we will see below that when A_1 = A_2 is a full matrix algebra, then a *-homomorphism is necessarily of this form.
A classical channel maps a probability distribution {p_j}_{j=1}^N to the distribution

    q_i = Σ_j Λ_ij p_j,

where the matrix Λ has non-negative entries and satisfies Σ_i Λ_ij = 1 for each j, i.e., it is a stochastic matrix; this guarantees that {q_i} is again a probability distribution.
The quantity on the left-hand side of those equations has a physical interpretation: it is the probability that the state E(ρ) of Bob assigns to Bob's effect E. But we see here that it also has a different interpretation from the point of view of Alice: it is the probability that Alice's state ρ assigns to her effect E∗(E).
It is important to note that the adjoint map E∗ exchanges image and preimage, i.e., E∗ : B(H_B) → B(H_A). In this document, we refer to E as the channel in the Schrödinger picture, which acts on states, and to E∗ as the channel in the Heisenberg picture, as it acts on effects.
A familiar example is given by unitary channels. If E(ρ) = UρU†, then E∗(E) = U†EU. Indeed,

    Tr(E E(ρ)) = Tr(E UρU†) = Tr(U†EU ρ) = Tr(E∗(E) ρ).
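This Schrödinger/Heisenberg duality is easy to check numerically; a minimal numpy sketch (random unitary and state, arbitrary effect, all chosen for the example):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3

# Random unitary U (QR of a random complex matrix), state ρ, effect E.
U, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = M @ M.conj().T
rho /= np.trace(rho)                           # a density matrix
E = np.diag([0.1, 0.5, 0.9]).astype(complex)   # an effect: 0 ≤ E ≤ 1

schrodinger = np.trace(E @ (U @ rho @ U.conj().T))   # Tr(E E(ρ))
heisenberg = np.trace((U.conj().T @ E @ U) @ rho)    # Tr(E*(E) ρ)
assert np.isclose(schrodinger, heisenberg)
```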
We have already established the properties that the map E needs in order to be a channel, but what do those properties correspond to in terms of E∗? One could simply derive them directly from those of E. But, equivalently, one may also follow the same principle used to derive the properties of E, and demand that E∗ preserve convex combinations of effects, which have the same ignorance interpretation as convex combinations of states, and that it map effects to effects, even when acting on part of a larger system. The first condition implies that E∗ be linear. Since effects are defined in part by the fact that they are positive, the second condition implies that E∗ be completely positive.
Hence, both E and E∗ are linear completely positive maps. The only difference is that whereas E must be trace-preserving, E∗ must be unital, i.e., E∗(1) = 1. Indeed, for all operators X, we have

    Tr(E∗(1) X) = Tr(1 E(X)) = Tr(E(X)) = Tr(X) = Tr(1 X).

Since this holds for all X, this implies E∗(1) = 1.
This leads us to a more general definition of channel which holds in more general settings (including the abstract framework of C*-algebras):

Definition 3.4 (Channel, more general).
A channel transferring information from the system defined by the algebra A_1 to that defined by the algebra A_2 is represented by a linear, completely positive map

    E∗ : A_2 → A_1,

which is also unital, E∗(1) = 1, and meant to be applied to effects.

In order to avoid confusion, we will typically use the symbol E∗ when using the Heisenberg picture, even if there is no corresponding Schrödinger representation.
Again, it is important to note that in the Heisenberg representation, channels run backward in time. A consequence of this is that they must be composed in the opposite order from the Schrödinger representation, since

    (E_1 ∘ E_2)∗ = E_2∗ ∘ E_1∗.

The reason this representation is preferred is that for a general C*-algebra A, effects are elements of A, but states are just linear functionals on A. In fact, states can themselves be seen as channels from the system defined by the one-dimensional algebra C to that defined by A, which, in the Heisenberg picture, is given by a linear completely positive unital map ω : A → C. (This hints at the fact that we are secretly working in the category whose objects are C*-algebras, and whose morphisms are unital CP maps.) In this language, the effect of E∗ : A_2 → A_1 on the state ω : A_1 → C of the system A_1 is given by composition: ω is mapped to the new state ω′ : A_2 → C defined by

    ω′ = ω ∘ E∗.
Figure 3.1: An observable is a transfer of information from the system being observed to a
pointer recording the result of a measurement.
where we used the orthogonal vectors |true⟩ and |false⟩ to represent the two pure states of the classical apparatus. In the Heisenberg picture, it maps classical effects f : Ω → [0, 1] to

    X_E∗(f) = f(true) E + f(false)(1 − E).
When the system is quantum, A = B(H), we call a classical-valued channel a quantum-to-classical channel, or QC-channel. As an example we will determine the general structure of such QC-channels in the case where H is finite-dimensional and Ω finite. This allows us to represent the channel as the CPTP map

    X : B(H) → L(Ω).

From linearity of X we can infer that

    X(ρ) = {f_i(ρ)}_i

for some linear functionals f_i. Theorem 2.2 tells us that there exist A_i ∈ B(H) such that f_i(ρ) = Tr(A_i ρ). The positivity of the map X requires that Tr(A_i ρ) ≥ 0 for all ρ, which implies A_i ≥ 0 for all i. Moreover, the trace-preserving property of X says that, for every operator Y,

    Tr(X(Y)) = Σ_i Tr(A_i Y) = Tr(Y),  hence  Σ_i A_i = 1.

Hence, the fact that the QC-channel X is linear, trace-preserving and positive implies that it is represented by a set of operators A_i ≥ 0, i ∈ Ω, such that Σ_i A_i = 1, as

    X(ρ) = {Tr(A_i ρ)}_{i∈Ω} ≃ Σ_i Tr(A_i ρ) |i⟩⟨i|.    (3.71)
These operators A_i are in fact effects of the quantum system. Indeed, they automatically satisfy 0 ≤ A_i ≤ 1, and are used to define the probabilities Tr(A_i ρ) which make up the classical state X(ρ).
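A minimal numpy sketch of a QC-channel of the form (3.71), for a two-outcome unsharp observable on a qubit (the effects chosen here are arbitrary examples):

```python
import numpy as np

# A two-outcome POVM on a qubit: effects A_i ≥ 0 summing to the identity.
A0 = np.array([[0.8, 0.0], [0.0, 0.3]])
A1 = np.eye(2) - A0
assert np.all(np.linalg.eigvalsh(A0) >= 0) and np.all(np.linalg.eigvalsh(A1) >= 0)

def qc_channel(rho, effects):
    """X(ρ) = {Tr(A_i ρ)}_i : maps a quantum state to a classical distribution."""
    return np.array([np.trace(A @ rho).real for A in effects])

rho = np.array([[0.5, 0.5], [0.5, 0.5]])   # the pure state |+><+|
probs = qc_channel(rho, [A0, A1])
assert np.all(probs >= 0) and np.isclose(probs.sum(), 1.0)
```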
Let us show that the complete positivity of X does not impose any extra condition on these operators. Any operator W of the extended algebra B(H) ⊗ B(H′) can be written as W = Σ_j X_j ⊗ Y_j. The action of X ⊗ id on it is

    (X ⊗ id)(Σ_j X_j ⊗ Y_j) = Σ_ij Tr(A_i X_j) |i⟩⟨i| ⊗ Y_j
                            = Σ_ijk ⟨k| √A_i X_j √A_i |k⟩ |i⟩⟨i| ⊗ Y_j
                            = Σ_ijk (|i⟩⟨k| √A_i ⊗ 1)(X_j ⊗ Y_j)(√A_i |k⟩⟨i| ⊗ 1)
                            = Σ_ik E_ik W E_ik†,

where E_ik := |i⟩⟨k| √A_i ⊗ 1. The right-hand side is manifestly positive whenever W ≥ 0, which shows that X is completely positive.
A standard observable given by a self-adjoint operator A = Σ_α α P_α corresponds to the QC-channel

    X_A(ρ) = {Tr(P_α ρ)}_α ≃ Σ_α Tr(P_α ρ) |α⟩⟨α|,

where the orthogonal vectors |α⟩ are used to label the classical states of the measurement apparatus. The set of effects defining this observable are simply the spectral projectors P_α, and the eigenvalues α correspond to the classical pure states.
We see that the old notion of observable defined by a self-adjoint operator corresponds only to a special type of QC-channel, or POVM, namely one whose effects are orthogonal projectors. We say that it is a sharp, or projective, observable.

Definition 3.6.
We say that an observable represented by a QC-channel X is sharp or projective if it maps all sharp classical effects E² = E to sharp quantum effects: X∗(E)² = X∗(E).

Note that a complete set of projectors is automatically mutually orthogonal, which is why orthogonality is not part of this definition:
Lemma 3.1. If P_i are projectors such that Σ_i P_i ≤ 1, then they are mutually orthogonal: P_i P_j = 0 for i ≠ j.

Proof. For the proof, we just need to show that it holds for two projectors P and Q such that P + Q ≤ 1. Multiplying the last inequality by P from both sides, we obtain P + PQP ≤ P, which implies PQP ≤ 0. But PQP ≥ 0 by construction, so PQP = 0. This implies that for all states |ψ⟩, ⟨ψ|PQQP|ψ⟩ = 0. Therefore, ‖QP|ψ⟩‖ = 0 for all |ψ⟩, which implies QP = 0.
Despite this fact, the above concept of sharp observable is already more general than that associated with self-adjoint operators, in a way which may seem cosmetic at first, but which is important to realize. According to our definition, the set of values Ω can be any measurable set, such as, for instance, Ω = Rⁿ, something which would require n commuting self-adjoint operators to describe.
A coarse-graining of a sharp observable X by a function f : Ω → Ω′ on the outcomes yields the observable

    (f ∘ X)(ρ) = {Tr(E_λ ρ)}_{λ∈Ω′}  with  E_λ = Σ_{ω∈f⁻¹(λ)} P_ω.

Similarly, composing an observable X on system B with a channel E from A to B yields the map

    X ∘ E : B(H_A) → B(H_B) → L(Ω),

which is an observable on system A.
Hence, measuring X on B is perfectly equivalent, in terms of the classical information collected, to measuring X ∘ E on system A directly. This is just the Heisenberg picture: the channel represented in the Schrödinger picture by the CPTP map E sends the observable X to X ∘ E. Therefore, we ought to be able to write this action directly in terms of the unital CP map E∗. Indeed, this is how the effects defining X transform: if X(ρ) = {Tr(X_ω ρ)}_ω, we have

    (X ∘ E)(ρ) = {Tr(X_ω E(ρ))}_ω = {Tr(E∗(X_ω) ρ)}_ω.
Consider the channel

    E(ρ) = Σ_i Tr(P_i ρ) |i⟩⟨i|.    (3.72)

Its adjoint follows from Tr(X E(ρ)) = Σ_i Tr(P_i ρ) Tr(X |i⟩⟨i|) = Tr(ρ Σ_i ⟨i|X|i⟩ P_i), which yields

    E∗(X) = Σ_i ⟨i|X|i⟩ P_i.
Therefore, if we only have access to system B, the only effects of system A that we can indirectly observe are of the form Σ_i x_i P_i, where x_i ∈ [0, 1], which we recognise as the effects of the commutative algebra alg(P_1, …, P_n), which is isomorphic to the classical algebra L({1, …, n}). Therefore, from the point of view of an observer of system B, system A looks purely classical! This specific channel E is a simple prototype of the phenomenon of decoherence, by which a quantum system appears classical.
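A short numpy sketch of this decoherence channel and its adjoint (the state and observable below are random examples; the projectors P_i = |i⟩⟨i| are taken rank-one in the canonical basis for simplicity):

```python
import numpy as np

d = 3
rng = np.random.default_rng(3)
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = M @ M.conj().T
rho /= np.trace(rho)   # a random density matrix

# Decoherence channel E(ρ) = Σ_i <i|ρ|i> |i><i| in the canonical basis:
# it keeps the diagonal and discards all off-diagonal terms (coherences).
E_rho = np.diag(np.diag(rho))
assert np.isclose(np.trace(E_rho), 1.0)

# Heisenberg picture: E*(X) = Σ_i <i|X|i> P_i is diagonal too, so only
# effects from the commutative algebra span(P_1, ..., P_d) are accessible.
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
E_star_X = np.diag(np.diag(X))

# Duality: Tr(X E(ρ)) = Tr(E*(X) ρ).
assert np.isclose(np.trace(X @ E_rho), np.trace(E_star_X @ rho))
```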
This unitary is also called the controlled-NOT gate in the context of quantum computation, where it serves as a basic building block to construct more complex unitaries (see Ref. [1]). Here the system S can be thought of as a control bit, since if it starts in state |0⟩, then nothing happens to E, whereas, if it starts in state |1⟩, the unitary

    [ 0 1 ]
    [ 1 0 ]

is applied to E, which swaps states |0⟩ and |1⟩ (a NOT operation in classical logic). Of course, this interpretation of the map makes sense only in this particular basis. For instance, in the basis |+⟩ ∝ |0⟩ + |1⟩ and |−⟩ ∝ |0⟩ − |1⟩, the roles of control and target systems are exchanged.
But let us stay with the original basis, and ask what happens if it is not the control qubit S which starts in state |0⟩, but the target E instead. Fixing the input state of one of the systems amounts to considering the linear map

    V : H_S → H_S ⊗ H_E
        |ψ⟩ ↦ U(|ψ⟩ ⊗ |0⟩).

Explicitly, we find

    V = |0⟩⟨0|_S ⊗ |0⟩_E + |1⟩⟨1|_S ⊗ |1⟩_E = |00⟩⟨0| + |11⟩⟨1|.

Such a map is called an isometry: like a unitary map it satisfies V†V = 1, but it is not invertible, and VV† is not the identity (though it is always a projector). It simply embeds a small Hilbert space into a bigger one while preserving the orthogonality of vectors. In this example, it represents H_S as the subspace of H_S ⊗ H_E spanned by |00⟩ and |11⟩.
Moreover, if the initial state of S is either |0⟩ or |1⟩, the map V simply makes a copy of the classical bit encoding this information. Again, this interpretation does not work in a different basis, but it indicates that some information is being transmitted from S to E.
Finally, suppose that we only care about system S after this interaction, i.e., we promise to make further measurements on system S only. In other words, we discard E. In the density matrix formalism, this amounts to looking at the reduced state of system S after the interaction. If the original state of S was represented by ρ, then after the interaction, the state of the joint system is VρV†. Discarding system E yields the final state Tr_E(VρV†) of system S.
Altogether, this amounts to the map

    ρ ↦ Tr_E(VρV†),

which is a channel from S to S. Explicitly,

    Tr_E(VρV†) = Σ_i (1 ⊗ ⟨i|) VρV† (1 ⊗ |i⟩)
               = Σ_ijk ⟨i|j⟩⟨k|i⟩ |j⟩⟨j|ρ|k⟩⟨k|
               = Σ_i |i⟩⟨i|ρ|i⟩⟨i|
               = Σ_i Tr(ρ|i⟩⟨i|) |i⟩⟨i|.
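The whole construction can be checked numerically. The following numpy sketch (illustrative) builds V = |00⟩⟨0| + |11⟩⟨1|, verifies the isometry properties, and computes the partial trace over E, confirming that the coherences of the input are destroyed:

```python
import numpy as np

# The isometry V = |00><0| + |11><1| induced by the CNOT with target fixed to |0>.
V = np.zeros((4, 2))
V[0, 0] = 1.0   # |00><0|
V[3, 1] = 1.0   # |11><1|
assert np.allclose(V.T @ V, np.eye(2))               # V†V = 1 (isometry)
assert np.allclose((V @ V.T) @ (V @ V.T), V @ V.T)   # VV† is a projector

# Channel ρ ↦ Tr_E(V ρ V†): partial trace over the environment.
rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)   # |+><+|
big = (V @ rho @ V.conj().T).reshape(2, 2, 2, 2)   # indices (s, e, s', e')
out = np.einsum('aebe->ab', big)                   # sum over e = e'

# Off-diagonals are destroyed: the |+><+| coherence is gone.
assert np.allclose(out, np.diag([0.5, 0.5]))
assert np.isclose(np.trace(out).real, 1.0)
```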
We see that the unitary evolution transforms a natural type of limitation (access only to
a subsystem) to that of having access only to a commutative subalgebra of observables, as
explained in Section 3.3.3 for this type of channel. The fact that only classical information is
left in the system after this interaction is crucially linked to the fact that information about a
basis was copied between S and E by V , as will be shown in a later part of this lecture.
In fact, a central result in quantum information theory is that any channel can occur in this way. This is stated by the following theorem, which we have here specialised to the finite-dimensional case in order to make the proof more accessible:
Theorem 3.1 (Stinespring). A linear map E : B(H_A) → B(H_B) is completely positive if and only if there exists a Hilbert space H_E (environment) and a linear map V : H_A → H_B ⊗ H_E such that, for all states ρ,

    E(ρ) = Tr_E(VρV†),

or, equivalently,

    E∗(X) = V†(X ⊗ 1_E)V.
Proof. Consider the operator X_E := (E ⊗ id)(|Ω⟩⟨Ω|), where

    |Ω⟩ = (1/√d) Σ_i |i⟩ ⊗ |i⟩

and |i⟩ forms a basis of H_A. Clearly, X_E ≥ 0, since E is CP and |Ω⟩⟨Ω| ≥ 0. This object X_E is commonly referred to as the Choi matrix of E [3]. The good thing about the Choi matrix is that we can reconstruct the CP map E from it as follows:

    E(ρ) = d² (1₁ ⊗ ⟨Ω|₂₃)(E₁ ⊗ id₂ ⊗ id₃)(|Ω⟩⟨Ω|₁₂ ⊗ ρ₃)(1₁ ⊗ |Ω⟩₂₃)
         = d² (1₁ ⊗ ⟨Ω|₂₃)(X_E ⊗ ρ₃)(1₁ ⊗ |Ω⟩₂₃),    (3.73)

where we are now using three different copies of H_A, labelled by the subscripts 1, 2, 3 for clarity. This equation can be verified by direct expansion of the definition of the maximally entangled state |Ω⟩. This invertible map between CP maps and positive matrices is known as the Choi–Jamiołkowski isomorphism. The particular representation of the CP map that we are
looking for simply amounts to diagonalising the Choi matrix. Indeed, since X_E ≥ 0, we can write X_E in diagonal form as X_E = Σ_i λ_i |φ_i⟩⟨φ_i|, where λ_i ≥ 0 and ⟨φ_i|φ_j⟩ = δ_ij. Substituting this diagonal form in Equ. 3.73 yields the expression

    E(ρ) = Σ_i E_i ρ E_i†,

where the operators

    E_i = √(dλ_i) Σ_k ((1 ⊗ ⟨k|)|φ_i⟩)⟨k|
are called Kraus operators for the CP map. They are directly related to the operator V that
we are looking for. Indeed, if we now take H_E to be a Hilbert space of dimension equal to the rank of X_E, and {|i⟩} any orthonormal basis of H_E, then we can define

    V := Σ_i E_i ⊗ |i⟩_E.

We obtain

    Tr_E(VρV†) = Σ_i E_i ρ E_i† = E(ρ),

which is the form that we were looking for. Moreover, if E is also trace-preserving, then E∗ is unital, which means that E∗(1) = V†V = 1, i.e., V is an isometry.
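The Choi–Kraus construction of this proof can be run numerically. The following numpy sketch (illustrative; it uses the decoherence channel as the example, and assumes the Kraus normalization ⟨a|E_k|b⟩ = √(dλ_k)⟨ab|φ_k⟩) extracts Kraus operators from the spectral decomposition of the Choi matrix and verifies the reconstruction:

```python
import numpy as np

d = 2
# Example channel: the qubit decoherence map E(ρ) = diag(ρ).
def E(rho):
    return np.diag(np.diag(rho))

# Choi matrix X_E = (E ⊗ id)(|Ω><Ω|), with |Ω> = (1/√d) Σ_i |ii>.
X = np.zeros((d * d, d * d), dtype=complex)
for i in range(d):
    for j in range(d):
        Eij = np.zeros((d, d), dtype=complex)
        Eij[i, j] = 1.0
        X += np.kron(E(Eij), Eij) / d

evals, evecs = np.linalg.eigh(X)

# Kraus operators from the spectral decomposition of the Choi matrix.
kraus = [np.sqrt(d * lam) * evecs[:, k].reshape(d, d)
         for k, lam in enumerate(evals) if lam > 1e-12]

# Verify E(ρ) = Σ_k E_k ρ E_k† and trace preservation Σ_k E_k† E_k = 1.
rng = np.random.default_rng(4)
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = M @ M.conj().T
rho /= np.trace(rho)
recon = sum(K @ rho @ K.conj().T for K in kraus)
assert np.allclose(recon, E(rho))
assert np.allclose(sum(K.conj().T @ K for K in kraus), np.eye(d))
```

For this example the Choi matrix has rank 2, so the dilation requires only a two-dimensional environment.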
The isometry V from Theorem 3.1, together with the Hilbert space H_E, is called a Stinespring dilation of E. An important aspect of this dilation is that it is unique up to a partial isometry on H_E. A partial isometry W preserves the scalar product on its range, but can send some vectors to zero. It is characterised by the fact that W†W is a projector, which also implies that WW† is a projector.
Theorem 3.2 (Stinespring, uniqueness). Let H_E1 and H_E2 and V_1 : H_A → H_B ⊗ H_E1, V_2 : H_A → H_B ⊗ H_E2 be such that

    V_1†(X ⊗ 1_E1)V_1 = V_2†(X ⊗ 1_E2)V_2

for all X ∈ B(H_B). Then there is a partial isometry W : H_E1 → H_E2 such that

    V_2 = (1_B ⊗ W)V_1.
Proof. Consider the span K of all vectors of the form (X ⊗ 1_E1)V_1|ψ⟩, for arbitrary state |ψ⟩ and operator X. We pick a basis (X_i ⊗ 1_E1)V_1|ψ_i⟩, i = 1, …, n, of K, and define W̃ as mapping (X_i ⊗ 1_E1)V_1|ψ_i⟩ to (X_i ⊗ 1_E2)V_2|ψ_i⟩. If |χ⟩ belongs to the orthogonal complement of K, we define W̃|χ⟩ = 0. It follows from this definition that W̃(X ⊗ 1)V_1 = (X ⊗ 1)V_2 for all X, and in particular

    W̃ V_1 = V_2.

The operator W̃ is a partial isometry because, due to the fact that both dilations represent the same channel,

    ⟨φ|V_2†(X ⊗ 1)†(Y ⊗ 1)V_2|ψ⟩ = ⟨φ|V_2†(X†Y ⊗ 1)V_2|ψ⟩ = ⟨φ|V_1†(X†Y ⊗ 1)V_1|ψ⟩ = ⟨φ|V_1†(X ⊗ 1)†(Y ⊗ 1)V_1|ψ⟩,

which shows that W̃ preserves the scalar product on K. Moreover, this also shows that W̃(X ⊗ 1)|χ⟩ = (X ⊗ 1)W̃|χ⟩ for all |χ⟩ ∈ K. If instead |χ′⟩ belongs to the orthogonal complement K⊥, we have W̃|χ′⟩ = 0 by definition of W̃; but also, since (X† ⊗ 1) maps K into K, the vector (X ⊗ 1)|χ′⟩ is again orthogonal to K, so that W̃(X ⊗ 1)|χ′⟩ = 0 = (X ⊗ 1)W̃|χ′⟩. Therefore, W̃(X ⊗ 1)|χ′⟩ = (X ⊗ 1)W̃|χ′⟩ also holds for all |χ′⟩ ∈ K⊥. We conclude that

    W̃(X ⊗ 1) = (X ⊗ 1)W̃

for all X, which implies

    W̃ = 1 ⊗ W

for some operator W, which inherits its isometric property from W̃.
References
[1] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 2010.
[2] Masamichi Takesaki. Theory of operator algebras I, volume 2. Springer Science & Business
Media, 2002.
[3] Man-Duen Choi. Completely positive linear maps on complex matrices. Quantum
Computation and Quantum Information Theory: Reprint Volume with Introductory Notes
for ISI TMR Network School, 12-23 July 1999, Villa Gualino, Torino, Italy, 10:174, 2000.